[Pipeline] AnimateDiff SDXL #6721
Conversation
Link to Colab. Target usage:

```python
import torch

from diffusers import AnimateDiffSDXLPipeline, DDIMScheduler
from diffusers.models import MotionAdapter
from diffusers.utils import export_to_gif

adapter = MotionAdapter.from_pretrained(
    "a-r-r-o-w/animatediff-motion-adapter-sdxl-beta", torch_dtype=torch.float16
)

# model_id = "stabilityai/stable-diffusion-xl-base-1.0"
model_id = "stablediffusionapi/dynavision-xl-v0610"
# model_id = "Lykon/dreamshaper-xl-1-0"

# Also tried EulerDiscreteScheduler and DEISMultistepScheduler here
scheduler = DDIMScheduler.from_pretrained(
    model_id,
    subfolder="scheduler",
    clip_sample=False,
    timestep_spacing="linspace",
    beta_schedule="linear",
    steps_offset=1,
)

pipe = AnimateDiffSDXLPipeline.from_pretrained(
    model_id,
    motion_adapter=adapter,
    scheduler=scheduler,
    torch_dtype=torch.float16,
    variant="fp16",
).to("cuda")
pipe.enable_vae_slicing()
pipe.enable_vae_tiling()

result = pipe(
    prompt="a panda surfing in the ocean, realistic, hyperrealism, high quality",
    negative_prompt="low quality, worst quality",
    num_inference_steps=20,
    guidance_scale=8,
    width=1024,
    height=1024,
    num_frames=16,
)

export_to_gif(result.frames[0], "animation.gif")
```
Still experimenting and trying to figure out whether I missed something. Quality seems okay-ish, but I definitely need to look for better parameters/models. The SDXL checkpoint by guoyww is still a beta release, so maybe we could wait for the official release before considering a merge.
src/diffusers/pipelines/animatediff/pipeline_animatediff_sdxl.py
Looking good 👍🏽 Left a few comments.
It'd be great to support the SDXL version, but since there are no official checkpoints or a training script, it is harder for the community to experiment with and does not seem like a good feature to add to core diffusers yet. I will try putting together a motion adapter training script based on what I understand so far when I find the time, and try reaching out to the authors through other channels. cc @guoyww @limbo0000 @AnyiRao @wyhsirius. Thank you for your amazing work! Any suggestions and updates would be awesome ❤️
Just tested SD 1.5 models with the changes to UNetMotionModel. Everything seems to be working for both SD and SDXL, so I think this is very close to completion. @DN6 The SDXL motion adapter checkpoint in diffusers format is available on my account, with ~500 downloads so far. Will you be able to move it to the authors' account, or are we okay with this? I'll update the example code accordingly.
@a-r-r-o-w I think it's fine to have it on your account since you worked on the conversion.
@DN6 The checkpoints are actually better off in their original organizations. The model card can contain a note about Aryan's contributions, but they should ideally reside under the original org/author.
Cool, in that case we'll try to get in touch with the authors and move the checkpoints, @a-r-r-o-w. We can cite your work in the model card.
Sounds good to me. It was only a few line changes to the already existing motion adapter script and not much work so it doesn't really matter :)
All failing tests are fixed with the latest commit locally. It seems the above was handled differently for SDXL, so I've copied over the logic from the SDXL tests.
Any updates from them, @DN6?
Hi @a-r-r-o-w, yes, we'll move it. We can just transfer the checkpoint from your org to theirs. Is everything ready on your end? Could you prep the model card and add something along the lines of "Converted to Diffusers by @a-r-r-o-w"? You can link it to your preferred profile (GitHub or the Hub) so that your work is also attributed.
```diff
@@ -260,13 +280,26 @@ def __init__(
         if encoder_hid_dim_type is None:
             self.encoder_hid_proj = None

+        if addition_embed_type == "text_time":
+            self.add_time_proj = Timesteps(addition_time_embed_dim, True, 0)
+            self.add_embedding = TimestepEmbedding(projection_class_embeddings_input_dim, time_embed_dim)
```
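For context on the `add_time_proj` line above: `Timesteps` maps each scalar conditioning value (a timestep, or SDXL's size/crop micro-conditions) to a sinusoidal vector, which `add_embedding` then projects to the time-embedding width. Below is a rough, unbatched pure-Python sketch of that sinusoidal projection, assuming the standard formulation (the real module operates on tensors, and the function name here is illustrative):

```python
import math

def sinusoidal_embedding(value, dim, flip_sin_to_cos=True, freq_shift=0.0, max_period=10000.0):
    """Project a scalar into a sinusoidal embedding of even length `dim`."""
    half = dim // 2
    # Log-spaced frequencies from 1 down to roughly 1/max_period
    freqs = [math.exp(-math.log(max_period) * i / (half - freq_shift)) for i in range(half)]
    sin = [math.sin(value * f) for f in freqs]
    cos = [math.cos(value * f) for f in freqs]
    # With flip_sin_to_cos=True (as in Timesteps(..., True, 0) above),
    # the cosine half is concatenated before the sine half
    return cos + sin if flip_sin_to_cos else sin + cos

emb = sinusoidal_embedding(50.0, 256)
```

Each micro-condition value gets its own 256-dim vector of this form; the vectors are flattened together and fed through the `add_embedding` MLP.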
Could you also load `add_embedding` in `from_unet2d`? Something like:

```python
if hasattr(model, "add_embedding"):
    model.add_embedding.load_state_dict(unet.add_embedding.state_dict())
```
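The point of the `hasattr` guard is that the submodule only exists for `"text_time"` configs, so SD 1.5 checkpoints (which have no `add_embedding`) must keep working. A minimal stand-in sketch of that guarded-copy pattern, using toy classes instead of torch modules (the helper name `copy_optional_submodules` is hypothetical, not the actual diffusers API):

```python
class ToyModule:
    """Stand-in for an nn.Module with state_dict round-tripping."""
    def __init__(self, weight):
        self.weight = weight

    def state_dict(self):
        return {"weight": self.weight}

    def load_state_dict(self, sd):
        self.weight = sd["weight"]

class ToyUNet:
    pass

def copy_optional_submodules(src, dst, names=("add_embedding", "add_time_proj")):
    # Copy state only when both models actually have the optional submodule
    for name in names:
        if hasattr(src, name) and hasattr(dst, name):
            getattr(dst, name).load_state_dict(getattr(src, name).state_dict())

unet2d, motion = ToyUNet(), ToyUNet()
unet2d.add_embedding = ToyModule(1.0)
motion.add_embedding = ToyModule(0.0)
copy_optional_submodules(unet2d, motion)  # motion now carries the 2D UNet's weights
```

When either side lacks the attribute (the SD 1.5 case), the loop simply skips it and nothing breaks.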
If we look at the current unet_2d_condition.py modeling code, the team has refactored these changes out into separate helper functions. Since unet_motion_model.py is mostly a copy of that, we can adapt those changes here and thereby get all the functionality one would need. We can take it up in a future PR, in my opinion (also, I'm afraid I will not have time to test things thoroughly if we do it here).
@a-r-r-o-w Just opened a PR on your model repo to update the code snippet in the model card, and in the docs in this PR. I think we're ready to go once those changes are made. 👍🏽
Appreciate the patience with this @a-r-r-o-w. Good to merge 👍🏽
* update conversion script to handle motion adapter sdxl checkpoint
* add animatediff xl
* handle addition_embed_type
* fix output
* update
* add imports
* make fix-copies
* add decode latents
* update docstrings
* add animatediff sdxl to docs
* remove unnecessary lines
* update example
* add test
* revert conv_in conv_out kernel param
* remove unused param addition_embed_type_num_heads
* latest IPAdapter impl
* make fix-copies
* fix return
* add IPAdapterTesterMixin to tests
* fix return
* revert based on suggestion
* add freeinit
* fix test_to_dtype test
* use StableDiffusionMixin instead of different helper methods
* fix progress bar iterations
* apply suggestions from review
* hardcode flip_sin_to_cos and freq_shift
* make fix-copies
* fix ip adapter implementation
* fix last failing test
* make style
* Update docs/source/en/api/pipelines/animatediff.md (Co-authored-by: Dhruv Nair <[email protected]>)
* remove todo
* fix doc-builder errors

Co-authored-by: Dhruv Nair <[email protected]>
What does this PR do?
Fixes #6158.
Attempt at integrating https://github.com/guoyww/AnimateDiff/tree/sdxl.
Relevant discussion: #5928 (comment). Continuation of #6195.
Before submitting: see the documentation guidelines, and the tips on formatting docstrings.
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.
@DN6 @sayakpaul @guoyww