Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix alpha in PorterDuffImageComposite #3411

Merged
merged 1 commit into from
Jun 4, 2024

Conversation

dennwc
Copy link
Contributor

@dennwc dennwc commented May 6, 2024

There were two bugs in PorterDuffImageComposite.

The first one is the fact that it uses the mask input directly as alpha, missing the conversion (1-a). The fix is similar to c16f574.

The second one is that all color composition formulas assume alpha premultiplied values, while the input is not premultiplied.

This change fixes both of these issue.

Copy link
Contributor

@shawnington shawnington left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code looks good, alpha premultiplication is required by spec for the non-blending PorterDuff functions.

However, I question the assumption that the input is a mask that needs alpha conversion as the inputs are labeled source_alpha and destination_alpha in the node.

That to me implies that the input should have already been converted to alpha before being passed into the node (especially as PorterDuff are alpha based functions).

There is probably merit in the assumption that most people are passing non alpha converted mask to the node and getting unexpected results, but that is an incorrect use issue.

@dennwc
Copy link
Contributor Author

dennwc commented May 8, 2024

@shawnington Thank you for checking it out!

However, I question the assumption that the input is a mask that needs alpha conversion as the inputs are labeled source_alpha and destination_alpha in the node.

Please see the commit c16f574 where @comfyanonymous fixed the same issue for two other nodes. I think the author who added these 3 nodes (JoinImageWithAlpha, SplitImageWithAlpha and PorterDuffImageComposite) only tested them with each other, but not with the other Comfy nodes. So their assumption was that "alpha" and "mask" are the same, so they used these names interchangeably.

The quick experiment that you can do to verify this is to load two transparent images and run PorterDuffImageComposite with either SRC_OVER or DST_OVER mode. The expected results is that one image should be rendered on top of the other image.

But in current version you will see a very strange result. If I'm not mistaken it will look something like this: opaque areas will become invisible, invisible ares will still be invisible and only half-transparent areas will be visible. This is clearly not how the composite should behave. Inverting mask, on other other hand, does the right thing.

@shawnington
Copy link
Contributor

shawnington commented May 8, 2024

@shawnington Thank you for checking it out!

However, I question the assumption that the input is a mask that needs alpha conversion as the inputs are labeled source_alpha and destination_alpha in the node.

Please see the commit c16f574 where @comfyanonymous fixed the same issue for two other nodes. I think the author who added these 3 nodes (JoinImageWithAlpha, SplitImageWithAlpha and PorterDuffImageComposite) only tested them with each other, but not with the other Comfy nodes. So their assumption was that "alpha" and "mask" are the same, so they used these names interchangeably.

The quick experiment that you can do to verify this is to load two transparent images and run PorterDuffImageComposite with either SRC_OVER or DST_OVER mode. The expected results is that one image should be rendered on top of the other image.

But in current version you will see a very strange result. If I'm not mistaken it will look something like this: opaque areas will become invisible, invisible ares will still be invisible and only half-transparent areas will be visible. This is clearly not how the composite should behave. Inverting mask, on other other hand, does the right thing.

Yes, the PorterDuffImageComposite node does need the alpha premultiplication fix you committed to work properly. I actually checked every function in the node to make sure the math needed the alpha pre-multiplication, and you are 100% correct that its required for proper function of the node, and the node does not currently function properly.

I also looked at those commits before I commented. The first commit to split_image_with_alpha is good, the intended output is alpha, so alpha conversion makes sense, and seems quite correct.

the commit to JoinImageWithAlpha is a bug that was introduced probably unintentionally and actually demonstrate the problem I am discussing in a very concrete way.

If you take the output of split_image_with_alpha and put it into JoinImageWithAlpha, you will not get expected behavior, because JoingImageWithAlpha will invert the alpha.

Alpha has a very specific meaning in terms of imaging, and yes its basically just the mask inverted, but it being in its inverted state is essential to the math working properly with a proper input. Anything specifying alpha as an input should not do alpha conversion on the alpha input. This simply inverts proper input.

As you pointed out, if you input a mask, the output is the opposite of what you expected, because the specified input is alpha. If you convert the mask to alpha before input into the node, you will get expected behavior, well not quite because of the properly identified alpha premultiplication bug, but closer.

I understand that people are confusing them and some nodes convert improperly and some dont. However with a library of this size, there should definitely be an effort to adhere to terminology, and associated behavior, just as you properly identified that PorterDuff does specify alpha pre-multiplication. Standard adherence makes everyone's life easier.

Something that specifies alpha as input, should not do alpha conversion.

Im perfectly fine with you changing the input to mask, and doing alpha conversion just as you proposed. I however feel strongly that if input is labeled as alpha, it should behave as expected when given alpha as input.

@dennwc
Copy link
Contributor Author

dennwc commented May 8, 2024

Im perfectly fine with you changing the input to mask, and doing alpha conversion just as you proposed. I however feel strongly that if input is labeled as alpha, it should behave as expected when given alpha as input.

Okay, I see. But renaming the field will likely break serialization for that node, right? I think having name inconsistency is slightly better than breaking existing workflows.

@shawnington
Copy link
Contributor

Im perfectly fine with you changing the input to mask, and doing alpha conversion just as you proposed. I however feel strongly that if input is labeled as alpha, it should behave as expected when given alpha as input.

Okay, I see. But renaming the field will likely break serialization for that node, right? I think having name inconsistency is slightly better than breaking existing workflows.

Yeah probably, which is why I would rather alpha be treated as alpha. I think we can agree its easier for other people that want to work with the nodes and with the code, if the indicated input type is treated as the indicated input type.

Thats my only issue with your commit. The output doesn't indicate alpha out even if the variable is called out_alpha, its labeled mask in the node, so your output conversion back to mask, 100% acceptable, it does whats expected.

The input is labeled alpha in the node though, so conversion is inappropriate.

Good job identifying the lack of premultiliplied alpha though, reviewing the code was one of the longest code reviews Ive ever done, checking every PorterDuff function, and all the blending mode math to make sure that premultiplied alpha was indeed correct.

Very solid commit, once again my only issue is not treating the input as the indicated type.

@comfyanonymous
Copy link
Owner

image
is this how it's supposed to behave?

@dennwc
Copy link
Contributor Author

dennwc commented May 23, 2024

Hmm, strange, maybe I'm missing clamp after mask multiplication. Will check 👍

@shawnington
Copy link
Contributor

Screenshot 2024-05-23 at 8 49 05 AM

The banding is also present in the output mask ( and Im presuming the input mask ).

out_image = torch.where(out_alpha > 0, out_image / out_alpha, torch.zeros_like(out_image)) 

is placing zeros into the image where the mask is full density.

The area that is not replaced with zeros indicates the alpha pre multiplication is not working properly for some reason, as that area should not be present in the out_image after the multiply function.

@dennwc
Copy link
Contributor Author

dennwc commented May 23, 2024

Maybe changing the condition to out_alpha > 1e-5 would help. @shawnington thank you for checking it too!

@shawnington
Copy link
Contributor

shawnington commented May 23, 2024

Maybe changing the condition to out_alpha > 1e-5 would help. @shawnington thank you for checking it too!

Im assuming it's just done to avoid a divide by zero error for out_image / out_alpha. Is that correct? If so, the error is not in this line. That line is just obfuscating what is happening.

Assuming the masked area is black or 0 for the alpha pre-multiplication. src_img = src_img * src_alpha should result in areas that have values of zero now where the src_alpha was zero.

out_image = src_image * dst_image should result in 0's in the area that is masked before undoing alpha premultiplication. But it appears to be getting multiplied as if the alpha premultiplication didn't happen at all.

Im not sure why, I just re-checked the math. It should be working correctly.

@shawnington
Copy link
Contributor

shawnington commented May 23, 2024

out_image = torch.where(out_alpha > 0, out_image / out_alpha, torch.zeros_like(out_image)) 

after reviewing again, shouldn't this actually be:

out_image = torch.where(out_alpha > 0, out_image / out_alpha, out_image) 

It's a divide by zero check, out_image / 0 != 0, shouldn't it just be getting the value from out_image, or am I missing something?

@dennwc
Copy link
Contributor Author

dennwc commented May 23, 2024

It's a divide by zero check, out_image / 0 != 0, shouldn't it just be getting the value from out_image, or am I missing something?

Yes, it's for div by zero check. This change was made two weeks ago, so I don't remember exactly why zeros are important. But I think I was getting some strange output without this 🤔

@dennwc dennwc force-pushed the fix_composite branch 2 times, most recently from 5ed59f6 to 299a668 Compare June 2, 2024 10:34
There were two bugs in PorterDuffImageComposite.

The first one is the fact that it uses the mask input directly as alpha, missing the conversion (`1-a`). The fix is similar to c16f574.

The second one is that all color composition formulas assume alpha premultiplied values, while the input is not premultiplied.

This change fixes both of these issue.
@dennwc
Copy link
Contributor Author

dennwc commented Jun 2, 2024

Indeed, switching from out_alpha > 0 to out_alpha > 1e-6 for the zero div check fixes the issue. Going below 1e-6 starts revealing these artifacts. I kept it at 1e-5 in the commit just in case. Should be good now.

@comfyanonymous comfyanonymous merged commit 20447e9 into comfyanonymous:master Jun 4, 2024
@MoonRide303
Copy link
Contributor

MoonRide303 commented Jun 6, 2024

TBH porter_duff_composite() was meant to work exclusively on alphas, as that's how this method was originally designed (#1577 + back in 1984 - no single mention of "mask" concept in the original paper, just working on alpha channel). So personally I'd rather keep any kind of alpha <-> mask conversions (side effect of ComfyUI design choices) out of this function (it should be like mathematically clean, not influenced by app-level issues).

@dennwc
Copy link
Contributor Author

dennwc commented Jun 6, 2024

Since pairing nodes were changed to work with alphas, I think it made sense to update the composite too. It just doesn't make sense to but InvertMask every time.

@MoonRide303
Copy link
Contributor

MoonRide303 commented Jun 6, 2024

@dennwc The porter_duff_composite() I mentioned earlier was implemented in such a way it could be for example moved to external graphics library (knowing nothing about ComfyUI working on inverted alpha channel as mask) - and now it's kinda "contaminated" with app-specific logic. That invertion could be made for example in more ComfyUI-specific PorterDuffImageComposite.composite(), instead.

But taking into account this file belongs to ComfyUI, and whole idea of using both alphas and mask as the same type is already pretty messy - not really a big deal. It just bugged me a bit from code aesthetics point of view 🙃.

@shawnington
Copy link
Contributor

@dennwc The porter_duff_composite() I mentioned earlier was implemented in such a way it could be for example moved to external graphics library (knowing nothing about ComfyUI working on inverted alpha channel as mask) - and now it's kinda "contaminated" with app-specific logic. That invertion could be made for example in more ComfyUI-specific PorterDuffImageComposite.composite(), instead.

But taking into account this file belongs to ComfyUI, and whole idea of using both alphas and mask as the same type is already pretty messy - not really a big deal. It just bugged me a bit from code aesthetics point of view 🙃.

I agree with you, Mask and Alpha are two different things, it's very messy to use the terms interchangeably in an image generation library.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants