-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[add pipeline]: A new Pixart-based inpainting pipeline. #7929
base: main
Are you sure you want to change the base?
Conversation
Could we see some results here to decide if it's worth adding as a core pipeline? Cc: @lawrence-cj |
cc @asomoza here too! |
prompt='Face of a yellow cat, high resolution, sitting on a park bench' |
@lawrence-cj could you review this too? |
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update. |
@@ -0,0 +1,1097 @@ | |||
# Copyright 2023 PixArt-Alpha Authors and The HuggingFace Team. All rights reserved. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# Copyright 2023 PixArt-Alpha Authors and The HuggingFace Team. All rights reserved. | |
# Copyright 2024 PixArt-Alpha Authors and The HuggingFace Team. All rights reserved. |
>>> img_url = "https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo.png" | ||
>>> mask_url = "https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo_mask.png" | ||
|
||
>>> init_image = download_image(img_url).resize((512, 512)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could replace this with diffusers.utils.load_image()
. Less lines of code. WDYT?
``` | ||
""" | ||
|
||
ASPECT_RATIO_1024_BIN = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we wanna move these constants to the init of pixart
as these are now access by two pipelines? WDYT?
Cc: @lawrence-cj WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It sounds reasonable for me.
def encode_prompt( | ||
self, | ||
prompt: Union[str, List[str]], | ||
do_classifier_free_guidance: bool = True, | ||
negative_prompt: str = "", | ||
num_images_per_prompt: int = 1, | ||
device: Optional[torch.device] = None, | ||
prompt_embeds: Optional[torch.FloatTensor] = None, | ||
negative_prompt_embeds: Optional[torch.FloatTensor] = None, | ||
prompt_attention_mask: Optional[torch.FloatTensor] = None, | ||
negative_prompt_attention_mask: Optional[torch.FloatTensor] = None, | ||
clean_caption: bool = False, | ||
**kwargs, | ||
): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should use the same method from the PixArtAlpha implementation and use a # Copied from ...
statement here.
extra_step_kwargs["generator"] = generator | ||
return extra_step_kwargs | ||
|
||
def check_inputs( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this copied from the text-to-image pipeline implementation, let's use the # Copied from ...
statement here. Applies here and elsewhere.
But here, I think the function should also validate the input image and the mask image. Here's a reference:
diffusers/src/diffusers/pipelines/stable_diffusion_xl/pipeline_stable_diffusion_xl_inpaint.py
Line 778 in 6c60e43
def check_inputs( |
def classify_height_width_bin(height: int, width: int, ratios: dict) -> Tuple[int, int]: | ||
"""Returns binned height and width.""" | ||
ar = float(height / width) | ||
closest_ratio = min(ratios.keys(), key=lambda ratio: abs(float(ratio) - ar)) | ||
default_hw = ratios[closest_ratio] | ||
return int(default_hw[0]), int(default_hw[1]) | ||
|
||
@staticmethod | ||
def resize_and_crop_tensor(samples: torch.Tensor, new_width: int, new_height: int) -> torch.Tensor: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# Copied from ...
statement missing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Recommits have been made in the appropriate format.@sayakpaul
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The pipeline implementation looks very nice to me, thank you so much!
I think what's pending for this PR to get merged is the following:
- Docs
- Test
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for adding this pipeline!
+1 on @sayakpaul 's comments about doc and tests
If `return_dict` is `True`, [`~pipelines.ImagePipelineOutput`] is returned, otherwise a `tuple` is | ||
returned where the first element is a list with the generated images | ||
""" | ||
if "mask_feature" in kwargs: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we don't need to deprecate for new pipeline
@eightmusic let's get the comments resolved and we would be happy to include in the new release we're cooking. |
Syncing changes from the main branch
|
Thank you! There are still open comments left to be addressed. Additionally, we need to have docs and tests for this to get merged. |
The previous comments should be resolved, is there anything else I need to do? |
Well, the tests are failing. We need to get them sorted. And then there are comments still open, e.g.: #7929 (review), #7929 (comment). And then we need to add tests for the pipeline and documentation. LMK if there's anything unclear. |
#7929 (review), #7929 (comment) problem has just been solved.'We need to get them sorted.' What do I need to do here. Are there any references to test and docs? I don't know what this is.@sayakpaul |
Quality tests can be fixed by running |
Sry to disturb you, but I found I can't push a commit to this branch, any suggestions? @sayakpaul |
I think you will need for submit a PR to https://github.com/eightmusic/diffusers/. Perhaps @eightmusic could add you as a collaborator to the repo? |
Agree. I can help adding the make style and test part. Is it possible to add me as a collaborator to your forked diffusers repo? @eightmusic |
of course. |
We add a pixart model-based inpaint method that works similarly to pipeline_stable_diffusion_inpaint. @sayakpaul @yiyixuxu @DN6