[Refactor] FreeInit for AnimateDiff based pipelines #6874
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Thanks! Looks great to me since we're mostly back to how this was first added. Also, maybe AnimateDiffControlnetPipeline in community pipelines could benefit from these changes as well, wdyt? I will do some more testing with SVD and TextToVideoSynth to see if we can easily add it there as well sometime in the near future.
Thanks!
src/diffusers/pipelines/animatediff/pipeline_animatediff_video2video.py
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! I left some feedback :)
I'm ok with the nested loop here - it is pretty easy to understand now.
The one problem I can think of is if we want to incorporate other techniques that also involve nested loops, but we can handle that later when we encounter it.
```python
latents = self._denoise_loop(**denoise_args)

video = self._retrieve_video_frames(latents, output_type, return_dict)
num_free_init_iters = self._free_init_num_iters if self.free_init_enabled else 1
```
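For context, here's a minimal sketch of the nested-loop structure under discussion, assuming the helper names shown in the snippet above (`_denoise_loop`, `_retrieve_video_frames`) plus a hypothetical `_apply_free_init` helper; argument plumbing is elided:

```python
# Sketch of the control flow only, not the exact pipeline code.
# The outer loop runs exactly once when FreeInit is disabled, so the
# ordinary single-pass denoising path is preserved.
num_free_init_iters = self._free_init_num_iters if self.free_init_enabled else 1

for free_init_iter in range(num_free_init_iters):
    if self.free_init_enabled:
        # re-initialize latents and timesteps for this FreeInit iteration
        latents, timesteps = self._apply_free_init(latents, free_init_iter, num_inference_steps, device, dtype, generator)

    latents = self._denoise_loop(**denoise_args)

video = self._retrieve_video_frames(latents, output_type, return_dict)
```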
I can live with 4 extra lines 😬. Does it make sense to add a progress bar for the free_init loops?
I thought about it, but since we're always setting `self._free_init_num_iters` even if `free_init` isn't enabled, we would get an extra progress bar all the time. We could introduce additional checks to display the progress bar only when FreeInit is enabled, but it might not be worth it IMO.
```python
    self.temporal_stop_frequency,
)

def _apply_freq_filter(self, x: torch.Tensor, noise: torch.Tensor, low_pass_filter: torch.Tensor) -> torch.Tensor:
```
Same here - I don't think we need a method here.
I moved these functions into the FreeInitMixin. I don't think they really need to be accessed outside the Mixin.
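For reference, the frequency mixing that `_apply_freq_filter` performs looks roughly like this, a sketch based on the FreeInit technique (keep the low-frequency band of the diffused latents, take the high-frequency band from fresh noise); details may differ from the merged code:

```python
import torch
import torch.fft as fft

def apply_freq_filter(x: torch.Tensor, noise: torch.Tensor, low_pass_filter: torch.Tensor) -> torch.Tensor:
    # move latents and noise into frequency space over the (frames, height, width) dims
    x_freq = fft.fftshift(fft.fftn(x, dim=(-3, -2, -1)), dim=(-3, -2, -1))
    noise_freq = fft.fftshift(fft.fftn(noise, dim=(-3, -2, -1)), dim=(-3, -2, -1))

    # low frequencies come from the latents, high frequencies from fresh noise
    high_pass_filter = 1 - low_pass_filter
    mixed_freq = x_freq * low_pass_filter + noise_freq * high_pass_filter

    # back to the latent domain; keep the real part
    mixed_freq = fft.ifftshift(mixed_freq, dim=(-3, -2, -1))
    return fft.ifftn(mixed_freq, dim=(-3, -2, -1)).real
```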
```python
shape=latent_shape,
generator=self._free_init_generator,
device=device,
dtype=torch.float32,
```
Any reason we use float32 here instead of `dtype`?
This was copied over from the original codebase. I believe the filtering runs in float32 and then the results are cast back to the dtype that the latents were passed in as. Although, I haven't checked what the results look like when we run the filtering in fp16. I'll run a quick check, and if there's no difference between running the filtering in fp16 and fp32, we can just use `dtype`.
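For illustration, the fp32-filter-then-cast pattern described here might look like this (a sketch; `randn_tensor` is diffusers' noise helper, and the surrounding variable names are assumptions):

```python
import torch
from diffusers.utils.torch_utils import randn_tensor

# sample the re-initialization noise in float32 regardless of the pipeline dtype
latents_dtype = latents.dtype
noise = randn_tensor(shape=latent_shape, generator=self._free_init_generator, device=device, dtype=torch.float32)

# run the frequency filtering in float32 for numerical stability ...
filtered = self._apply_freq_filter(latents.to(torch.float32), noise, low_pass_filter)

# ... then cast back to the dtype the latents were passed in as (e.g. fp16)
latents = filtered.to(latents_dtype)
```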
Very nice refactor overall!
Thanks! Do we need a FreeInitTesterMixin? If it makes sense, we can put out an issue for the community to take up; but if it's just going to be added to 4 or 5 pipelines, it might not be needed.
@yiyixuxu I think we limit it to just the AnimateDiff based pipelines for now, until we see some usage/requests from the community to add it to other pipelines. We can add the testing mixin if this expands to >5 pipelines?

@DN6 Btw, does it make sense to also add it to AnimateDiff ControlNet (community)?

@a-r-r-o-w Yeah, sure thing 👍🏽. BTW @a-r-r-o-w, AnimateLCM checkpoints are available in diffusers format now. I tried running FreeInit with AnimateLCM and ended up with a bunch of noise in the final output. If you're up for it, would you like to run some tests here to pinpoint the issue? I suspect it has to do with the LCM Scheduler, but I don't have the time to really dig into it. LCM + FreeInit could be very nice for better quality videos.
```python
):
    if free_init_iteration == 0:
        self._free_init_initial_noise = latents.detach().clone()
        return latents, self.scheduler.timesteps
```
@DN6 This is incorrect and seems like a regression from the old implementation. I was trying to debug why AnimateLCM was failing to produce good results and stumbled upon this other issue (it does produce good results, btw, except when `use_fast_sampling == False`; setting it to True seems to give good results).

Copy the FreeInit code from here and execute it. You will see that the first iteration runs for 20 steps, the second iteration runs for 13 steps, and the third iteration runs for 20 steps. This is incorrect: when `use_fast_sampling=True`, it should be 7, 13 and 20, but we return here without the fast sampling check.
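To make the expected behavior concrete, here's a small sketch of the coarse-to-fine schedule described in the comment (the exact rounding used by the library is an assumption; `round` reproduces the 7/13/20 progression):

```python
num_inference_steps = 20
num_free_init_iters = 3

# with use_fast_sampling=True, each FreeInit iteration should run
# progressively more denoising steps, finishing at the full count
for i in range(num_free_init_iters):
    steps_this_iter = round(num_inference_steps * (i + 1) / num_free_init_iters)
    print(f"FreeInit iteration {i}: {steps_this_iter} steps")
# FreeInit iteration 0: 7 steps
# FreeInit iteration 1: 13 steps
# FreeInit iteration 2: 20 steps
```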
@DN6 Could I open a PR fixing this behavior since this has been merged already?
Hi @a-r-r-o-w, missed this. Yes, please feel free to open a PR.
After some testing, it seems like when one uses the code below:

```python
import torch
from diffusers import MotionAdapter, AnimateDiffPipeline, LCMScheduler
from diffusers.utils import export_to_gif

adapter = MotionAdapter.from_pretrained("wangfuyun/AnimateLCM")
model_id = "SG161222/Realistic_Vision_V5.1_noVAE"
pipe = AnimateDiffPipeline.from_pretrained(model_id, motion_adapter=adapter, torch_dtype=torch.float16).to("cuda")
pipe.scheduler = LCMScheduler.from_pretrained(
    model_id,
    subfolder="scheduler",
    beta_schedule="linear",
    clip_sample=False,
    timestep_spacing="linspace",
    steps_offset=1,
)
pipe.load_lora_weights("wangfuyun/AnimateLCM", weight_name="sd15_lora_beta.safetensors", adapter_name="lcm-lora")
pipe.set_adapters(["lcm-lora"], [0.8])
pipe.enable_vae_slicing()
pipe.enable_free_init(num_iters=3, method="gaussian", use_fast_sampling=True)

output = pipe(
    prompt="a panda playing a guitar, on a boat, in the ocean, high quality",
    negative_prompt="bad quality, worse quality",
    num_frames=16,
    guidance_scale=2.5,
    num_inference_steps=6,
    generator=torch.Generator("cpu").manual_seed(666),
)
frames = output.frames[0]
export_to_gif(frames, "animation.gif")
```
What does this PR do?
The current FreeInit implementation isn't ideal. Based on the discussions in PR #6644, this change allows reusing the FreeInit utils via a Mixin, so that it is easy to experiment with adding the feature to other video pipelines while avoiding the introduction of two separate functions to run the denoising loop. This preserves the existing pipeline denoising-loop semantics, so the video pipelines can be read and understood like any other pipeline in `diffusers`, and introducing FreeInit to a video pipeline does not involve adding a lot of boilerplate.

This PR refactors FreeInit for:
- AnimateDiff
- PIA

And adds it to:
- AnimateDiffVideoToVideo
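As a sketch of the Mixin pattern (the module path and the abbreviated base-class list are assumptions; the real pipelines inherit several other mixins too):

```python
import torch
from diffusers import DiffusionPipeline
from diffusers.pipelines.free_init_utils import FreeInitMixin  # path assumed per this PR

class MyVideoPipeline(DiffusionPipeline, FreeInitMixin):
    """A video pipeline opts into FreeInit by inheriting the mixin;
    its own denoising loop stays where it always was."""

    @torch.no_grad()
    def __call__(self, prompt, num_inference_steps=50, generator=None):
        # runs once when FreeInit is disabled, i.e. identical to the plain
        # pipeline; runs num_iters times after pipe.enable_free_init(...)
        num_free_init_iters = self._free_init_num_iters if self.free_init_enabled else 1
        for free_init_iter in range(num_free_init_iters):
            ...  # prepare latents, optionally re-initialize them via FreeInit, denoise
```

Users then toggle the feature at runtime with `pipe.enable_free_init(...)` / `pipe.disable_free_init()`, as in the reproduction script above.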
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.