[Pipeline] Add LEDITS++ pipelines #6074
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
…fusion and LEditsPPPipelineStableDiffusionXL
Two questions:
- For `edit_warmup_steps`, `edit_guidance_scale`, `edit_threshold`, `reverse_editing_direction`, and `edit_cooldown_steps`: do we really need to allow all of these arguments to be lists? If so, let's make sure that when they are passed as scalars we convert them into lists of the expected length, so we no longer need these `if ... else` statements inside the noise edit loop.
- We should refactor the noise edit loop here. That means you will need to create a few utility functions or methods on the pipeline:
```python
for c, (noise_pred_edit_concept, ...) in enumerate(zip(noise_pred_edit_concepts, ...)):
    if i >= edit_warmup_steps_c and (edit_cooldown_steps_c is None or i < edit_cooldown_steps_c):
        ...
        if user_mask is not None:
            noise_guidance_edit_tmp = noise_guidance_edit_tmp * user_mask
        if use_cross_attn_mask:
            noise_guidance_edit_tmp, attn_mask = self.apply_cross_attn_mask(...)
            if use_intersect_mask:
                noise_guidance_edit_tmp = noise_guidance_edit_tmp * attn_mask
            else:
                noise_guidance_edit_tmp = self.apply_intersect_mask(noise_guidance_edit_tmp, attn_mask)
        else:
            noise_guidance_edit_tmp = make_up_a_name_here()
```
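The scalar-to-list conversion suggested above could look roughly like this. This is only a sketch; `to_per_concept_list` is a hypothetical helper name, not part of the pipeline:

```python
from typing import Union


def to_per_concept_list(value: Union[int, float, bool, list], num_concepts: int) -> list:
    # Broadcast a scalar argument to one entry per edit concept, so the
    # denoising loop can zip over it without isinstance checks.
    if isinstance(value, list):
        if len(value) != num_concepts:
            raise ValueError(f"Expected a list of length {num_concepts}, got {len(value)}.")
        return value
    return [value] * num_concepts


edit_guidance_scale = to_per_concept_list(5.0, num_concepts=3)
# edit_guidance_scale is now [5.0, 5.0, 5.0]
```

Lists of the right length pass through unchanged, so callers that already supply per-concept values are unaffected.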
```python
scheduler = DPMSolverMultistepScheduler.from_config(
    scheduler.config, algorithm_type="sde-dpmsolver++", solver_order=2
)
logger.warning(
```
Let's:
- throw an error or warning here
- update the docstring examples to explicitly use the DPM or DDIM scheduler:

```python
pipe = LEditsPPPipelineStableDiffusion.from_pretrained()
pipe.scheduler = DPMSolverMultistepScheduler.from_config(
    pipe.scheduler.config, algorithm_type="sde-dpmsolver++", solver_order=2
)
...
```
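A minimal sketch of such a scheduler check. The function name and the exact set of supported scheduler classes are assumptions for illustration, not something this PR pins down:

```python
import logging

logger = logging.getLogger(__name__)

# Assumption: these are the scheduler classes LEDITS++ inversion expects.
SUPPORTED_SCHEDULERS = ("DPMSolverMultistepScheduler", "DDIMScheduler")


def check_scheduler(scheduler) -> bool:
    """Warn when the attached scheduler is not one of the expected types."""
    name = scheduler.__class__.__name__
    if name not in SUPPORTED_SCHEDULERS:
        logger.warning(
            "Scheduler %s is untested with LEDITS++ inversion; consider "
            "DPMSolverMultistepScheduler or DDIMScheduler instead.",
            name,
        )
        return False
    return True
```

Returning a boolean keeps the check usable from both `__init__` (warn once) and `invert` (warn per call), whichever the authors prefer.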
```python
    clip_skip: Optional[int] = None,
    callback_on_step_end: Optional[Callable[[int, int, Dict], None]] = None,
    callback_on_step_end_tensor_inputs: List[str] = ["latents"],
    **kwargs,
```
```python
    **kwargs,
```

I don't think this is used.
```python
if use_intersect_mask:
    use_cross_attn_mask = True

if use_cross_attn_mask:
```
OK, got it now. Let's pass these two arguments to `check_inputs`:

```python
if use_intersect_mask and not use_cross_attn_mask:
    raise ValueError("...")
```

I would also be OK with updating the argument and throwing a warning here, if that's what you prefer.
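A sketch of how that validation might read inside `check_inputs`; the standalone function name here is illustrative:

```python
def validate_mask_flags(use_intersect_mask: bool, use_cross_attn_mask: bool) -> None:
    # The intersect mask is derived from the cross-attention mask, so
    # requesting the former without the latter is an inconsistent input.
    if use_intersect_mask and not use_cross_attn_mask:
        raise ValueError(
            "`use_intersect_mask=True` requires `use_cross_attn_mask=True`, "
            "since the intersect mask is computed from the cross-attention mask."
        )
```

Failing fast in `check_inputs` surfaces the misconfiguration before any denoising work is done.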
```python
latent_model_input = self.scheduler.scale_model_input(latent_model_input, t)

text_embed_input = text_embeddings
```
```python
text_embed_input = text_embeddings

# predict the noise residual
noise_pred = self.unet(latent_model_input, t, encoder_hidden_states=text_embed_input).sample
```
Suggested change:

```diff
- noise_pred = self.unet(latent_model_input, t, encoder_hidden_states=text_embed_input).sample
+ noise_pred = self.unet(latent_model_input, t, encoder_hidden_states=text_embeddings).sample
```
```python
if isinstance(edit_warmup_steps, list):
    edit_warmup_steps_c = edit_warmup_steps[c]
else:
    edit_warmup_steps_c = edit_warmup_steps
```
We can make sure `edit_warmup_steps` is a list and remove the `if`/`else` here:
```diff
- if isinstance(edit_warmup_steps, list):
-     edit_warmup_steps_c = edit_warmup_steps[c]
- else:
-     edit_warmup_steps_c = edit_warmup_steps
```
```python
if isinstance(edit_guidance_scale, list):
    edit_guidance_scale_c = edit_guidance_scale[c]
else:
    edit_guidance_scale_c = edit_guidance_scale

if isinstance(edit_threshold, list):
    edit_threshold_c = edit_threshold[c]
else:
    edit_threshold_c = edit_threshold
if isinstance(reverse_editing_direction, list):
    reverse_editing_direction_c = reverse_editing_direction[c]
else:
    reverse_editing_direction_c = reverse_editing_direction
```
```diff
- if isinstance(edit_guidance_scale, list):
-     edit_guidance_scale_c = edit_guidance_scale[c]
- else:
-     edit_guidance_scale_c = edit_guidance_scale
- if isinstance(edit_threshold, list):
-     edit_threshold_c = edit_threshold[c]
- else:
-     edit_threshold_c = edit_threshold
- if isinstance(reverse_editing_direction, list):
-     reverse_editing_direction_c = reverse_editing_direction[c]
- else:
-     reverse_editing_direction_c = reverse_editing_direction
```
```python
if isinstance(edit_cooldown_steps, list):
    edit_cooldown_steps_c = edit_cooldown_steps[c]
elif edit_cooldown_steps is None:
    edit_cooldown_steps_c = i + 1
else:
    edit_cooldown_steps_c = edit_cooldown_steps
```
```diff
- if isinstance(edit_cooldown_steps, list):
-     edit_cooldown_steps_c = edit_cooldown_steps[c]
- elif edit_cooldown_steps is None:
-     edit_cooldown_steps_c = i + 1
- else:
-     edit_cooldown_steps_c = edit_cooldown_steps
```
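Following the same normalization idea, the `None` cooldown case could be folded in up front. A sketch, with illustrative names; in the current code `None` effectively means "never cool down", since `i + 1` makes the per-step cutoff unreachable:

```python
def normalize_cooldown_steps(edit_cooldown_steps, num_concepts, num_inference_steps):
    # Broadcast a scalar to one entry per concept, and map None ("never cool
    # down") to a value past the final step so `i < cooldown_c` always holds.
    if not isinstance(edit_cooldown_steps, list):
        edit_cooldown_steps = [edit_cooldown_steps] * num_concepts
    return [num_inference_steps if c is None else c for c in edit_cooldown_steps]
```

With this done once before the loop, the per-step branch collapses to a plain comparison.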
```python
if self.sem_guidance is None:
    self.sem_guidance = torch.zeros((len(timesteps), *noise_pred_uncond.shape))

for c, noise_pred_edit_concept in enumerate(noise_pred_edit_concepts):
```
Let's try to refactor this loop to have the structure below:

```python
for c, (
    noise_pred_edit_concept,
    edit_warmup_steps_c,
    edit_guidance_scale_c,
    edit_threshold_c,
    reverse_editing_direction_c,
    edit_cooldown_steps_c,
) in enumerate(
    zip(
        noise_pred_edit_concepts,
        edit_warmup_steps,
        edit_guidance_scale,
        edit_threshold,
        reverse_editing_direction,
        edit_cooldown_steps,
    )
):
    if i >= edit_warmup_steps_c and (edit_cooldown_steps_c is None or i < edit_cooldown_steps_c):
        if reverse_editing_direction_c:
            noise_guidance_edit_tmp = noise_guidance_edit_tmp * -1
        noise_guidance_edit_tmp = noise_guidance_edit_tmp * edit_guidance_scale_c
        if user_mask is not None:
            noise_guidance_edit_tmp = noise_guidance_edit_tmp * user_mask
        if use_cross_attn_mask:
            noise_guidance_edit_tmp, attn_mask = self.apply_cross_attn_mask(...)
            if use_intersect_mask:
                noise_guidance_edit_tmp = noise_guidance_edit_tmp * attn_mask
            else:
                noise_guidance_edit_tmp = self.apply_intersect_mask(noise_guidance_edit_tmp, attn_mask)
        else:
            noise_guidance_edit_tmp = make_up_a_name_here()
```
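Once every per-concept argument is a list, the active-window test reduces to one flat loop. A toy illustration of that pattern, with plain floats standing in for noise tensors and all names illustrative:

```python
def apply_edit_guidance(noise_preds, warmups, cooldowns, scales, reverse_flags, i):
    # One entry per edit concept in every input list; zip pairs them up so
    # no isinstance branching is needed inside the loop.
    guidance = []
    for pred_c, warmup_c, cooldown_c, scale_c, reverse_c in zip(
        noise_preds, warmups, cooldowns, scales, reverse_flags
    ):
        if i >= warmup_c and (cooldown_c is None or i < cooldown_c):
            tmp = -pred_c if reverse_c else pred_c
            guidance.append(tmp * scale_c)
        else:
            guidance.append(0.0)  # concept inactive at this step
    return guidance
```

The same shape carries over to the tensor version: the body of the `if` becomes the masking and thresholding logic, and each concept's contribution is accumulated into the semantic guidance buffer.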
```python
if i >= edit_cooldown_steps_c:
    continue
```
Instead of the two `continue` statements, let's do:

```python
if i >= edit_warmup_steps_c and i < edit_cooldown_steps_c:
    ...
```
```python
@@ -236,6 +237,23 @@ def step_index(self):
        """
        return self._step_index

    @property
```
You didn't make these changes, no? Why are they showing up here?
That's strange, because this is not different from the current main branch in diffusers, so it isn't a change at all.
```
# Conflicts:
#   src/diffusers/schedulers/scheduling_dpmsolver_multistep.py
```
This looks very promising; a few comments to polish it for production...
Thanks for the feedback! @vladmandic @manuelbrack can we address them in the refactor PR too?
One more item:
What does this PR do?
We add a set of pipelines implementing the LEDITS++ image editing method, with implementations for StableDiffusion, SD-XL, and DeepFloyd-IF.
Additionally, we made some minor adjustments to the DPM-Solver scheduler to support image inversion.
There are still some obvious TODOs left that we would appreciate some help with:
@patrickvonplaten, @apolinario, @linoytsaban you should all be able to commit to our fork. It would be great if you could help @kathath and me out a bit 😄
Who can review?
@patrickvonplaten