[Pipeline] AnimateDiff SDXL #6721
Conversation
Link to Colab. Target usage:

```python
import torch

from diffusers import AnimateDiffSDXLPipeline, DDIMScheduler
from diffusers.models import MotionAdapter
from diffusers.utils import export_to_gif

adapter = MotionAdapter.from_pretrained(
    "a-r-r-o-w/animatediff-motion-adapter-sdxl-beta", torch_dtype=torch.float16
)

# model_id = "stabilityai/stable-diffusion-xl-base-1.0"
model_id = "stablediffusionapi/dynavision-xl-v0610"
# model_id = "Lykon/dreamshaper-xl-1-0"

# Also tried EulerDiscreteScheduler and DEISMultistepScheduler here
scheduler = DDIMScheduler.from_pretrained(
    model_id,
    subfolder="scheduler",
    clip_sample=False,
    timestep_spacing="linspace",
    beta_schedule="linear",
    steps_offset=1,
)

pipe = AnimateDiffSDXLPipeline.from_pretrained(
    model_id,
    motion_adapter=adapter,
    scheduler=scheduler,
    torch_dtype=torch.float16,
    variant="fp16",
).to("cuda")
pipe.enable_vae_slicing()
pipe.enable_vae_tiling()

result = pipe(
    prompt="a panda surfing in the ocean, realistic, hyperrealism, high quality",
    negative_prompt="low quality, worst quality",
    num_inference_steps=20,
    guidance_scale=8,
    width=1024,
    height=1024,
    num_frames=16,
)

export_to_gif(result.frames[0], "animation.gif")
```
Still experimenting and trying to figure out whether I missed something. Quality seems okay-ish, but I definitely need to look for better parameters/models. The SDXL checkpoint by guoyww is still a beta release, so maybe we could wait for the official release before considering a merge.
src/diffusers/pipelines/animatediff/pipeline_animatediff_sdxl.py
Looking good 👍🏽 Left a few comments.
It'd be great to support the SDXL version, but since there are no official checkpoints or a training script, it is harder for the community to experiment with and does not seem like a good feature to add to core diffusers yet. I will try putting together a motion adapter training script based on what I understand so far when I find the time, and try reaching out to the authors through other channels. cc @guoyww @limbo0000 @AnyiRao @wyhsirius. Thank you for your amazing work! Any suggestions and updates would be awesome ❤️
Just tested SD 1.5 models with the changes to UNetMotionModel. Everything seems to be working for both SD and SDXL, so I think this is very close to completion. @DN6 The SDXL motion adapter checkpoint in diffusers format is available on my account, with ~500 downloads so far. Will you be able to move it to the authors' account, or are we okay with this? I'll update the example code accordingly.
@a-r-r-o-w I think it's fine to have it on your account since you worked on the conversion.
@DN6 The checkpoints are actually better off in their original organizations. The model card can contain a note about Aryan's contributions, but they should ideally reside under the original org/author.
Cool, in that case we'll try to get in touch with the authors and move the checkpoints, @a-r-r-o-w. We can cite your work in the model card.
Sounds good to me. It was only a few line changes to the already existing motion adapter script and not much work so it doesn't really matter :)
All failing tests are fixed with the latest commit locally. It seems the above was handled differently for SDXL, so I've copied over the logic from the SDXL tests.
Any updates from them, @DN6?
Hi @a-r-r-o-w, yes, we'll move it. We can just transfer the checkpoint from your org to theirs. Is everything ready on your end? Could you prep the model card and add something along the lines of "Converted to Diffusers by @a-r-r-o-w"? You can link it to your preferred profile (GitHub or the Hub) so that your work is also attributed.
```diff
@@ -260,13 +280,26 @@ def __init__(
         if encoder_hid_dim_type is None:
             self.encoder_hid_proj = None

+        if addition_embed_type == "text_time":
+            self.add_time_proj = Timesteps(addition_time_embed_dim, True, 0)
+            self.add_embedding = TimestepEmbedding(projection_class_embeddings_input_dim, time_embed_dim)
```
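For context on the `add_time_proj` line above: `Timesteps` maps each scalar conditioning value (a timestep, or SDXL's size/crop micro-conditions) to a sinusoidal vector, which `add_embedding` then projects to the time-embedding width. Below is a rough, unbatched pure-Python sketch of that sinusoidal projection, assuming the standard formulation (the real module operates on tensors, and the function name here is illustrative):

```python
import math

def sinusoidal_embedding(value, dim, flip_sin_to_cos=True, freq_shift=0.0, max_period=10000.0):
    """Project a scalar into a sinusoidal embedding of even length `dim`."""
    half = dim // 2
    # Log-spaced frequencies from 1 down to roughly 1/max_period
    freqs = [math.exp(-math.log(max_period) * i / (half - freq_shift)) for i in range(half)]
    sin = [math.sin(value * f) for f in freqs]
    cos = [math.cos(value * f) for f in freqs]
    # With flip_sin_to_cos=True (as in Timesteps(..., True, 0) above),
    # the cosine half is concatenated before the sine half
    return cos + sin if flip_sin_to_cos else sin + cos

emb = sinusoidal_embedding(50.0, 256)
```

Each micro-condition value gets its own 256-dim vector of this form; the vectors are flattened together and fed through the `add_embedding` MLP.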
Could you also load `add_embedding` in `from_unet2d`? Something like:

```python
if hasattr(model, "add_embedding"):
    model.add_embedding.load_state_dict(unet.add_embedding.state_dict())
```
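The point of the `hasattr` guard is that the submodule only exists for `"text_time"` configs, so SD 1.5 checkpoints (which have no `add_embedding`) must keep working. A minimal stand-in sketch of that guarded-copy pattern, using toy classes instead of torch modules (the helper name `copy_optional_submodules` is hypothetical, not the actual diffusers API):

```python
class ToyModule:
    """Stand-in for an nn.Module with state_dict round-tripping."""
    def __init__(self, weight):
        self.weight = weight

    def state_dict(self):
        return {"weight": self.weight}

    def load_state_dict(self, sd):
        self.weight = sd["weight"]

class ToyUNet:
    pass

def copy_optional_submodules(src, dst, names=("add_embedding", "add_time_proj")):
    # Copy state only when both models actually have the optional submodule
    for name in names:
        if hasattr(src, name) and hasattr(dst, name):
            getattr(dst, name).load_state_dict(getattr(src, name).state_dict())

unet2d, motion = ToyUNet(), ToyUNet()
unet2d.add_embedding = ToyModule(1.0)
motion.add_embedding = ToyModule(0.0)
copy_optional_submodules(unet2d, motion)  # motion now carries the 2D UNet's weights
```

When either side lacks the attribute (the SD 1.5 case), the loop simply skips it and nothing breaks.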
If we look at the current unet_2d_condition.py modeling code, the team has refactored these changes out into separate helper functions. Since unet_motion_model.py is mostly a copy of that, we can adapt those changes here and thereby get all the functionality one would need. We can take it up in a future PR, in my opinion (also, I'm afraid I will not have time to test things thoroughly if we do it here).
@a-r-r-o-w Just opened a PR on your model repo to update the code snippet in the model card, and in the docs in this PR. I think we're ready to go once those changes are made. 👍🏽
Appreciate the patience with this @a-r-r-o-w. Good to merge 👍🏽
* update conversion script to handle motion adapter sdxl checkpoint
* add animatediff xl
* handle addition_embed_type
* fix output
* update
* add imports
* make fix-copies
* add decode latents
* update docstrings
* add animatediff sdxl to docs
* remove unnecessary lines
* update example
* add test
* revert conv_in conv_out kernel param
* remove unused param addition_embed_type_num_heads
* latest IPAdapter impl
* make fix-copies
* fix return
* add IPAdapterTesterMixin to tests
* fix return
* revert based on suggestion
* add freeinit
* fix test_to_dtype test
* use StableDiffusionMixin instead of different helper methods
* fix progress bar iterations
* apply suggestions from review
* hardcode flip_sin_to_cos and freq_shift
* make fix-copies
* fix ip adapter implementation
* fix last failing test
* make style
* Update docs/source/en/api/pipelines/animatediff.md (Co-authored-by: Dhruv Nair <[email protected]>)
* remove todo
* fix doc-builder errors

Co-authored-by: Dhruv Nair <[email protected]>
What does this PR do?
Fixes #6158.
Attempt at integrating https://github.com/guoyww/AnimateDiff/tree/sdxl.
Relevant discussion: #5928 (comment). Continuation of #6195.
Before submitting: see the documentation guidelines, and the tips on formatting docstrings.
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.
@DN6 @sayakpaul @guoyww