
[Pipeline] Add LEDITS++ pipelines #6074

Merged — 56 commits merged into huggingface:main on Mar 13, 2024

Conversation

@manuelbrack (Contributor) commented Dec 6, 2023

What does this PR do?

We add a set of pipelines implementing the LEDITS++ image editing method, with implementations for Stable Diffusion, SDXL, and DeepFloyd IF.
Additionally, we made some minor adjustments to the DPM-Solver scheduler to support image inversion.
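
For reference, a rough usage sketch of the Stable Diffusion variant. The model id, image path, and argument values below are placeholders and may differ from the final API:

import torch
from diffusers import LEditsPPPipelineStableDiffusion
from diffusers.utils import load_image

# Model id and image path are placeholders.
pipe = LEditsPPPipelineStableDiffusion.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

image = load_image("path/to/input.png")

# 1) Invert the input image; the inversion state is stored on the pipeline.
pipe.invert(image=image, num_inversion_steps=50, skip=0.1)

# 2) Run the edit against the stored inversion.
edited = pipe(
    editing_prompt=["sunglasses"],
    edit_guidance_scale=5.0,
    edit_threshold=0.75,
    reverse_editing_direction=False,
).images[0]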

There are still some obvious TODOs left that we would appreciate some help with:

  • Finalize IF implementation
  • Adjust input checks to the new image-editing setting (e.g. fail gracefully if inference is called before inversion)
  • Add documentation
  • Write unit tests

@patrickvonplaten, @apolinario, @linoytsaban you should all be able to commit to our fork. It would be great if you could help @kathath and me out a bit 😄

Who can review?

@patrickvonplaten

@manuelbrack changed the title from "Add LEDITS++ pipelines [WIP]" to "[Pipeline] Add LEDITS++ pipelines" on Dec 6, 2023
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@patrickvonplaten patrickvonplaten requested a review from DN6 February 13, 2024 15:41
@patrickvonplaten (Contributor)

Can we give this another review? (cc @yiyixuxu and maybe @DN6)

@manuelbrack manuelbrack requested a review from yiyixuxu February 13, 2024 16:34
@yiyixuxu (Collaborator) left a comment

Two questions:

  1. For edit_warmup_steps, edit_guidance_scale, edit_threshold, reverse_editing_direction, edit_cooldown_steps: do we really need to allow all of these arguments to be lists? If so, let's make sure that if they are passed as an int we convert them into a list of the expected length, so we do not need these if ... else ... statements inside the noise edit loop (see the normalization sketch after the loop outline below).
  2. We should refactor the noise edit loop here. That means you will need to create a few utility functions or methods on the pipeline:
for c, (noise_pred_edit_concept, ...) in enumerate(zip(noise_pred_edit_concepts, ...)):
    if i >= edit_warmup_steps_c and (edit_cooldown_steps_c is None or i < edit_cooldown_steps_c):
        ...
        if user_mask is not None:
            noise_guidance_edit_tmp = noise_guidance_edit_tmp * user_mask

        if use_cross_attn_mask:
            noise_guidance_edit_tmp, attn_mask = self.apply_cross_attn_mask(...)

            if use_intersect_mask:
                noise_guidance_edit_tmp = self.apply_intersect_mask(noise_guidance_edit_tmp, attn_mask)
            else:
                noise_guidance_edit_tmp = noise_guidance_edit_tmp * attn_mask

        else:
            noise_guidance_edit_tmp = make_up_a_name_here()
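
A minimal sketch of the normalization suggested in point 1 (the helper name here is made up for illustration):

def _to_per_concept_list(value, num_concepts):
    # Broadcast a scalar setting to one value per editing concept.
    if isinstance(value, list):
        if len(value) != num_concepts:
            raise ValueError(f"Expected {num_concepts} values, got {len(value)}.")
        return value
    return [value] * num_concepts

# The same conversion would apply to edit_guidance_scale, edit_threshold, etc.
edit_warmup_steps = _to_per_concept_list(edit_warmup_steps, len(noise_pred_edit_concepts))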

Comment on lines +390 to +393
scheduler = DPMSolverMultistepScheduler.from_config(
    scheduler.config, algorithm_type="sde-dpmsolver++", solver_order=2
)
logger.warning(
@yiyixuxu (Collaborator) commented Feb 15, 2024

Let's:

  1. throw an error or a warning here
  2. update the docstring examples to explicitly use the DPM or DDIM scheduler

pipe = LEditsPPPipelineStableDiffusion.from_pretrained()
pipe.scheduler = DPMSolverMultistepScheduler.from_config(
    pipe.scheduler.config, algorithm_type="sde-dpmsolver++", solver_order=2
)
...

clip_skip: Optional[int] = None,
callback_on_step_end: Optional[Callable[[int, int, Dict], None]] = None,
callback_on_step_end_tensor_inputs: List[str] = ["latents"],
**kwargs,
Collaborator

Suggested change
- **kwargs,

I don't think this is used.
Comment on lines +908 to +911
if use_intersect_mask:
    use_cross_attn_mask = True

if use_cross_attn_mask:
@yiyixuxu (Collaborator) commented Feb 15, 2024

OK, got it now. Let's pass these two arguments to check_inputs:

if use_intersect_mask and not use_cross_attn_mask:
    raise ValueError("...")

I would also be OK with updating the argument and throwing a warning here instead, if that's what you prefer.


latent_model_input = self.scheduler.scale_model_input(latent_model_input, t)

text_embed_input = text_embeddings
Collaborator

Suggested change
- text_embed_input = text_embeddings

# predict the noise residual
noise_pred = self.unet(latent_model_input, t, encoder_hidden_states=text_embed_input).sample
Collaborator

Suggested change
- noise_pred = self.unet(latent_model_input, t, encoder_hidden_states=text_embed_input).sample
+ noise_pred = self.unet(latent_model_input, t, encoder_hidden_states=text_embeddings).sample

Comment on lines +975 to +978
if isinstance(edit_warmup_steps, list):
    edit_warmup_steps_c = edit_warmup_steps[c]
else:
    edit_warmup_steps_c = edit_warmup_steps
Collaborator

We can make sure edit_warmup_steps is a list and remove the if/else here.

Suggested change
- if isinstance(edit_warmup_steps, list):
-     edit_warmup_steps_c = edit_warmup_steps[c]
- else:
-     edit_warmup_steps_c = edit_warmup_steps

Comment on lines +982 to +994
if isinstance(edit_guidance_scale, list):
    edit_guidance_scale_c = edit_guidance_scale[c]
else:
    edit_guidance_scale_c = edit_guidance_scale

if isinstance(edit_threshold, list):
    edit_threshold_c = edit_threshold[c]
else:
    edit_threshold_c = edit_threshold

if isinstance(reverse_editing_direction, list):
    reverse_editing_direction_c = reverse_editing_direction[c]
else:
    reverse_editing_direction_c = reverse_editing_direction
Collaborator

Suggested change
- if isinstance(edit_guidance_scale, list):
-     edit_guidance_scale_c = edit_guidance_scale[c]
- else:
-     edit_guidance_scale_c = edit_guidance_scale
- if isinstance(edit_threshold, list):
-     edit_threshold_c = edit_threshold[c]
- else:
-     edit_threshold_c = edit_threshold
- if isinstance(reverse_editing_direction, list):
-     reverse_editing_direction_c = reverse_editing_direction[c]
- else:
-     reverse_editing_direction_c = reverse_editing_direction

Comment on lines +996 to +1001
if isinstance(edit_cooldown_steps, list):
    edit_cooldown_steps_c = edit_cooldown_steps[c]
elif edit_cooldown_steps is None:
    edit_cooldown_steps_c = i + 1
else:
    edit_cooldown_steps_c = edit_cooldown_steps
Collaborator

Suggested change
- if isinstance(edit_cooldown_steps, list):
-     edit_cooldown_steps_c = edit_cooldown_steps[c]
- elif edit_cooldown_steps is None:
-     edit_cooldown_steps_c = i + 1
- else:
-     edit_cooldown_steps_c = edit_cooldown_steps

if self.sem_guidance is None:
    self.sem_guidance = torch.zeros((len(timesteps), *noise_pred_uncond.shape))

for c, noise_pred_edit_concept in enumerate(noise_pred_edit_concepts):
Collaborator

Let's try to refactor this loop to have the structure below:

for c, (
    noise_pred_edit_concept, edit_warmup_steps_c, edit_guidance_scale_c,
    edit_threshold_c, reverse_editing_direction_c, edit_cooldown_steps_c,
) in enumerate(
    zip(
        noise_pred_edit_concepts, edit_warmup_steps, edit_guidance_scale,
        edit_threshold, reverse_editing_direction, edit_cooldown_steps,
    )
):
    if i >= edit_warmup_steps_c and (edit_cooldown_steps_c is None or i < edit_cooldown_steps_c):
        if reverse_editing_direction_c:
            noise_guidance_edit_tmp = noise_guidance_edit_tmp * -1

        noise_guidance_edit_tmp = noise_guidance_edit_tmp * edit_guidance_scale_c

        if user_mask is not None:
            noise_guidance_edit_tmp = noise_guidance_edit_tmp * user_mask

        if use_cross_attn_mask:
            noise_guidance_edit_tmp, attn_mask = self.apply_cross_attn_mask(...)

            if use_intersect_mask:
                noise_guidance_edit_tmp = self.apply_intersect_mask(noise_guidance_edit_tmp, attn_mask)
            else:
                noise_guidance_edit_tmp = noise_guidance_edit_tmp * attn_mask

        else:
            noise_guidance_edit_tmp = make_up_a_name_here()

Comment on lines +1003 to +1004
if i >= edit_cooldown_steps_c:
    continue
Collaborator

Instead of the two continue statements, let's do:

if i >= edit_warmup_steps_c and i < edit_cooldown_steps_c:
    ...

@@ -236,6 +237,23 @@ def step_index(self):
        """
        return self._step_index

    @property
Collaborator

You didn't make these changes, no?
Why are they showing up here?

@manuelbrack (Contributor, Author)

That's strange, because this is not different from the current main branch in diffusers, so it isn't a change at all.

@linoytsaban linoytsaban merged commit 00eca4b into huggingface:main Mar 13, 2024
15 checks passed
@vladmandic (Contributor) commented Mar 13, 2024

This looks very promising; a few comments to polish it for production:

  • Cannot make either the SD or SDXL pipeline work with model CPU offloading,
    as it doesn't pull text_encoder from CPU to GPU on invert().
  • LEditsPPPipelineStableDiffusionXL fails in invert().encode_image when running in fp16,
    since the VAE gets upcast by default, causing a float-vs-half runtime error.
    Workaround: pipe.vae.config.force_upcast = False (see the sketch after this list).
  • The pipelines do not work with a GPU torch.Generator(),
    since the tensor never gets moved to the same device as the generator-created noise:
    xts[idx] = self.scheduler.add_noise(x0, noise, torch.Tensor([t]))
  • The SDXL pipeline does not take width/height/clip_skip args like the SD pipeline does.
  • Neither pipeline sets pipe.num_timesteps, which makes it hard to check progress in callbacks, etc.
    Should be as simple as pipe.num_timesteps = len(pipe.inversion_steps).
  • guidance_rescale > 0 causes a runtime error, since noise_pred_edit_concepts is a tuple,
    not an ndarray, so .mean() does not work.
  • If source_guidance_scale is <= 1, it causes runtime errors instead of disabling guidance.
  • The math for edit_cooldown_steps seems strange, as it uses absolute values, not values relative to timesteps.
  • Generally, not sure why an explicit invert() is even needed;
    it could easily be combined into the main pipeline's __call__ method,
    especially since its output is discarded and it sets values on the pipeline directly.
  • There is a note in the docs (https://huggingface.co/docs/diffusers/main/en/api/pipelines/ledits_pp)
    that DPM++ 2M in diffusers is sub-optimal, but no info on why or example outputs?
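
For reference, a minimal sketch of the fp16 workaround from the second bullet; the model id is just an example and this is untested against the merged pipeline:

import torch
from diffusers import LEditsPPPipelineStableDiffusionXL

pipe = LEditsPPPipelineStableDiffusionXL.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")
# Keep the VAE in fp16 during invert() to avoid the float-vs-half mismatch.
pipe.vae.config.force_upcast = False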

@yiyixuxu (Collaborator)

thanks for the feedback! @vladmandic

@manuelbrack can we address them in the refactor PR too?

@manuelbrack mentioned this pull request on Mar 14, 2024
@vladmandic (Contributor)

one more item:

  • The pipeline fails in aggregate_attention on non-standard input image sizes, for example 872x512 or even a simple 560x560.
