
Add support for IPAdapterFull #5911

Merged: 8 commits merged into huggingface:main on Dec 7, 2023

Conversation

@fabiorigano (Contributor) commented Nov 23, 2023

What does this PR do?

Fixes #5886


Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.
@yiyixuxu

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

@fabiorigano (Contributor, Author) commented Nov 24, 2023

@yiyixuxu sorry to bother you, may I ask for feedback on this PR?

I saw that a newer PR is adding support for IP-Adapter Plus. Should I add it here too? I would only have to update the checkpoint-loading code (image embeddings are computed in the same way).

Thanks

Edit: I added support for IP-Adapter Plus by merging the two solutions (thanks @okotaku)

@fabiorigano fabiorigano changed the title Add support for IPAdapterFull Add support for IPAdapterPlus and IPAdapterFull Nov 24, 2023
@yiyixuxu (Collaborator) commented Nov 26, 2023

hey thanks! sorry I'm a little slow due to the Thanksgiving holidays here in the US.
Can we see some examples of using the face model with IPAdapterFull?

> I saw that a newer PR is adding support for IP-Adapter Plus. Should I add it here too? I would only have to update the checkpoint-loading code (image embeddings are computed in the same way).

Regarding this question: we should not do this under normal circumstances. We can make exceptions and take over another contributor's PR when we need the feature quite urgently, or when the PR becomes stale. In this case I would focus only on IPAdapterFull in this PR for now :)

@fabiorigano (Contributor, Author) commented

@yiyixuxu perfect, I have removed IP-Adapter Plus from my PR

Example n. 1: SG161222/Realistic_Vision_V4.0_noVAE + IP-Adapter full face + custom VAE and DDIM scheduler:

from diffusers import StableDiffusionPipeline, AutoencoderKL, DDIMScheduler
from transformers import CLIPVisionModelWithProjection
import torch
from diffusers.utils import load_image
base_model_path = "SG161222/Realistic_Vision_V4.0_noVAE"
vae_model_path = "stabilityai/sd-vae-ft-mse"
image_encoder = CLIPVisionModelWithProjection.from_pretrained(
    "h94/IP-Adapter",
    subfolder="models/image_encoder",
    torch_dtype=torch.float16,
).to("cuda")
noise_scheduler = DDIMScheduler(
    num_train_timesteps=1000,
    beta_start=0.00085,
    beta_end=0.012,
    beta_schedule="scaled_linear",
    clip_sample=False,
    set_alpha_to_one=False,
    steps_offset=1,
)
vae = AutoencoderKL.from_pretrained(vae_model_path).to(dtype=torch.float16)
pipeline = StableDiffusionPipeline.from_pretrained(
    base_model_path,
    torch_dtype=torch.float16,
    scheduler=noise_scheduler,  # pass the custom DDIM scheduler and VAE to the pipeline
    vae=vae,
    image_encoder=image_encoder,
    feature_extractor=None,
    safety_checker=None,
)
pipeline.to("cuda")
image = load_image("https://huggingface.co/datasets/YiYiXu/testing-images/resolve/main/ai_face2.png")
pipeline.load_ip_adapter("h94/IP-Adapter", subfolder="models", weight_name="ip-adapter-full-face_sd15.bin")
pipeline.set_ip_adapter_scale(0.7)
generator = torch.Generator(device="cpu").manual_seed(33)
images = pipeline(
    prompt="A photo of a girl wearing a black dress, holding red roses in hand, upper body, behind is the Eiffel Tower",
    ip_adapter_image=image,
    negative_prompt="monochrome, lowres, bad anatomy, worst quality, low quality", 
    num_inference_steps=50, num_images_per_prompt=2,
    generator=generator,
).images
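A small usage note (not part of the original example, assuming the standard pipeline output): the call returns a list of PIL images, so the outputs can be saved like this.

for i, im in enumerate(images):
    im.save(f"output_{i}.png")  # filenames are arbitrary / illustrative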

Input

ai_face2

Output

sg_1
sg_2

@fabiorigano (Contributor, Author) commented

Same initialization, different output width and height:

    prompt="A photo of a girl wearing a black dress, holding red roses in hand, upper body, behind is the Eiffel Tower",
    ip_adapter_image=image,
    negative_prompt="monochrome, lowres, bad anatomy, worst quality, low quality", 
    num_inference_steps=50, num_images_per_prompt=2, width=512, height=704, 
    generator=generator,
).images

Output

p0
p1

@fabiorigano fabiorigano changed the title Add support for IPAdapterPlus and IPAdapterFull Add support for IPAdapterFull Nov 27, 2023
@fabiorigano (Contributor, Author) commented

Example n. 2: runwayml/stable-diffusion-v1-5 + IP-Adapter full face + custom VAE and DDIM scheduler (width=512, height=704):

from diffusers import StableDiffusionPipeline, AutoencoderKL, DDIMScheduler
from transformers import CLIPVisionModelWithProjection
import torch
from diffusers.utils import load_image
base_model_path = "runwayml/stable-diffusion-v1-5"
vae_model_path = "stabilityai/sd-vae-ft-mse"
image_encoder = CLIPVisionModelWithProjection.from_pretrained(
    "h94/IP-Adapter", 
    subfolder="models/image_encoder",
    torch_dtype=torch.float16,
).to("cuda")
noise_scheduler = DDIMScheduler(
    num_train_timesteps=1000,
    beta_start=0.00085,
    beta_end=0.012,
    beta_schedule="scaled_linear",
    clip_sample=False,
    set_alpha_to_one=False,
    steps_offset=1,
)
vae = AutoencoderKL.from_pretrained(vae_model_path).to(dtype=torch.float16)
pipeline = StableDiffusionPipeline.from_pretrained(
    base_model_path,
    torch_dtype=torch.float16,
    scheduler=noise_scheduler,
    vae=vae,
    image_encoder=image_encoder,
    feature_extractor=None,
    safety_checker=None
)
pipeline.to("cuda")
image = load_image("https://huggingface.co/datasets/YiYiXu/testing-images/resolve/main/ai_face2.png")
pipeline.load_ip_adapter("h94/IP-Adapter", subfolder="models", weight_name="ip-adapter-full-face_sd15.bin")
pipeline.set_ip_adapter_scale(0.7)
generator = torch.Generator(device="cpu").manual_seed(33)
images = pipeline(
    prompt="A photo of a girl wearing a black dress, holding red roses in hand, upper body, behind is the Eiffel Tower",
    ip_adapter_image=image,
    negative_prompt="monochrome, lowres, bad anatomy, worst quality, low quality", 
    num_inference_steps=50, num_images_per_prompt=2, width=512, height=704,
    generator=generator,
).images

Output

p0
p1

@fabiorigano (Contributor, Author) commented

Example n. 3: runwayml/stable-diffusion-v1-5 + IP-Adapter full face with the default VAE and scheduler (width=512, height=704):

from diffusers import StableDiffusionPipeline
from transformers import CLIPVisionModelWithProjection
import torch
from diffusers.utils import load_image
base_model_path = "runwayml/stable-diffusion-v1-5"
image_encoder = CLIPVisionModelWithProjection.from_pretrained(
    "h94/IP-Adapter", 
    subfolder="models/image_encoder",
    torch_dtype=torch.float16,
).to("cuda")
pipeline = StableDiffusionPipeline.from_pretrained(
    base_model_path,
    torch_dtype=torch.float16,
    image_encoder=image_encoder,
    feature_extractor=None,
    safety_checker=None
)
pipeline.to("cuda")
image = load_image("https://huggingface.co/datasets/YiYiXu/testing-images/resolve/main/ai_face2.png")
pipeline.load_ip_adapter("h94/IP-Adapter", subfolder="models", weight_name="ip-adapter-full-face_sd15.bin")
pipeline.set_ip_adapter_scale(0.7)
generator = torch.Generator(device="cpu").manual_seed(33)
images = pipeline(
    prompt="A photo of a girl wearing a black dress, holding red roses in hand, upper body, behind is the Eiffel Tower",
    ip_adapter_image=image,
    negative_prompt="monochrome, lowres, bad anatomy, worst quality, low quality", 
    num_inference_steps=50, num_images_per_prompt=2, width=512, height=704,
    generator=generator,
).images

Output

p0
p1

@yiyixuxu yiyixuxu mentioned this pull request Nov 27, 2023
@yiyixuxu (Collaborator) commented

Awesome! Is the output in example n. 3 (with the default scheduler and VAE) expected?

@fabiorigano (Contributor, Author) commented

> Awesome! Is the output in example n. 3 (with the default scheduler and VAE) expected?

@yiyixuxu yes, the original IP-Adapter implementation with runwayml/stable-diffusion-v1-5 and its default DDIM scheduler and VAE gives the same outputs as example n. 3. Using the other checkpoint (SG161222/Realistic_Vision_V4.0_noVAE), the output images are "clean".

Output from original IP-Adapter (runwayml/stable-diffusion-v1-5, default scheduler and vae)

image

@yiyixuxu (Collaborator) commented Nov 28, 2023

Can you add an example of using the face model to the doc here: https://huggingface.co/docs/diffusers/main/en/using-diffusers/loading_adapters and make it clear that we should not use the default VAE and scheduler for face models?

cc @xiaohu2015: why do we need to use a different VAE and scheduler for face models?

@xiaohu2015 (Contributor) commented Nov 28, 2023

> Can you add an example of using the face model to the doc here: https://huggingface.co/docs/diffusers/main/en/using-diffusers/loading_adapters and make it clear that we should not use the default VAE and scheduler for face models?
>
> cc @xiaohu2015: why do we need to use a different VAE and scheduler for face models?

I only tested with the DDIM scheduler. Maybe the VAE is not so important; I think the noise scheduler matters more.

Update: I just tested some noise schedulers and found that some of them don't work well (or need more steps to produce normal images), while DDIM and EulerDiscreteScheduler work well. Besides, the VAE can be left at the default.
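A minimal sketch of swapping in one of the schedulers mentioned above (an illustrative addition, not from the thread; it reuses the model ID from the examples and the standard diffusers from_config pattern):

from diffusers import StableDiffusionPipeline, EulerDiscreteScheduler
import torch

pipeline = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
)
# Swap only the sampling algorithm, reusing the pipeline's existing scheduler config.
pipeline.scheduler = EulerDiscreteScheduler.from_config(pipeline.scheduler.config)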

@fabiorigano (Contributor, Author) commented Nov 28, 2023

> Can you add an example of using the face model to the doc here: https://huggingface.co/docs/diffusers/main/en/using-diffusers/loading_adapters and make it clear that we should not use the default VAE and scheduler for face models?

@yiyixuxu done!

How should I manage the output image (it is the first one in example n. 2)? Right now the link points to an image in this PR, but maybe it should be moved to huggingface/documentation-images. Thanks

@yiyixuxu (Collaborator) left a review comment

Thanks for working on this! I left a few comments; overall it's looking great and I can't wait to have this in diffusers.

We might have to coordinate with #5915 in terms of merging, and this may need a rebase.

Review threads: docs/source/en/using-diffusers/loading_adapters.md (3, outdated, resolved); src/diffusers/models/embeddings.py (1, resolved)
@fabiorigano (Contributor, Author) commented

Where can I upload the output image for documentation?

@fabiorigano (Contributor, Author) commented Nov 30, 2023

> Thanks for working on this! I left a few comments; overall it's looking great and I can't wait to have this in diffusers.
>
> We might have to coordinate with #5915 in terms of merging, and this may need a rebase.

@yiyixuxu How do you suggest we proceed with the merge? Thanks

Edit: I saw the other PR is ready to be merged, so I could rebase this one after that merge, if that is OK

@yiyixuxu (Collaborator) commented Dec 1, 2023

@fabiorigano sounds good! Let's merge that one first and finish up the face model here in this PR!

@yiyixuxu (Collaborator) commented Dec 1, 2023

> Where can I upload the output image for documentation?

cc @sayakpaul: how is it normally done? Do contributors send a PR to our repo that hosts documentation images?

@sayakpaul (Member) commented

Yes

@yiyixuxu (Collaborator) commented Dec 1, 2023

@fabiorigano for the documentation image, you can send a PR here: https://huggingface.co/datasets/huggingface/documentation-images/tree/main/diffusers

@fabiorigano (Contributor, Author) commented

> @fabiorigano for the documentation image, you can send a PR here: https://huggingface.co/datasets/huggingface/documentation-images/tree/main/diffusers

@yiyixuxu done: https://huggingface.co/datasets/huggingface/documentation-images/discussions/234

I accidentally opened another PR in the wrong directory; I can only close it tomorrow due to restrictions on new users

@fabiorigano (Contributor, Author) commented

@yiyixuxu @patrickvonplaten Rebased after the merge of #5915.

I still can't delete the first PR on huggingface/documentation-images.


@yiyixuxu (Collaborator) left a review comment Dec 4, 2023

looks good to merge to me :)

@yiyixuxu (Collaborator) commented Dec 5, 2023

@patrickvonplaten can we have a final review here? This is the last piece of IP-Adapter!

@patrickvonplaten (Contributor) left a review comment

Cool! Left some final nits. @yiyixuxu feel free to merge whenever.

@@ -675,6 +675,9 @@ def _load_ip_adapter_weights(self, state_dict):
if "proj.weight" in state_dict["image_proj"]:
# IP-Adapter
num_image_text_embeds = 4
elif "proj.3.weight" in state_dict["image_proj"]:
# IP-Adapter Full Face
num_image_text_embeds = 257
@patrickvonplaten (Contributor) commented

This number seems a bit arbitrary; can we extract it somehow from the shape of the weight?

@fabiorigano (Contributor, Author) replied

Hi, thanks for the review. Unfortunately I couldn't find a way to compute it from the shape, so I just added a comment as in the original implementation.
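For reference, a minimal standalone sketch of this dispatch (illustrative only, not the actual diffusers code; reading 257 as CLIP ViT-H's 256 patch tokens plus the class token is an assumption):

def infer_num_image_text_embeds(image_proj_state_dict):
    if "proj.weight" in image_proj_state_dict:
        # plain IP-Adapter: a single linear projection to 4 tokens
        return 4
    if "proj.3.weight" in image_proj_state_dict:
        # IP-Adapter Full Face: an MLP projection ("proj" is a Sequential,
        # so "proj.3" is its fourth submodule); 257 is assumed to be
        # CLIP ViT-H's 256 patch tokens plus the class token
        return 257
    raise ValueError("unrecognized IP-Adapter image projection weights")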

@yiyixuxu (Collaborator) commented Dec 7, 2023

will merge now but need to open a follow-up PR for:

  1. [Feature] Support IP-Adapter Plus #5915 (comment)
  2. Add support for IPAdapterFull #5911 (comment)

@fabiorigano (Contributor, Author) commented

> will merge now but need to open a follow-up PR for:
>
> 1. [Feature] Support IP-Adapter Plus #5915 (comment)
> 2. Add support for IPAdapterFull #5911 (comment)

@yiyixuxu I can open a new PR then, if that is OK

@yiyixuxu yiyixuxu merged commit b65928b into huggingface:main Dec 7, 2023
14 checks passed
@fabiorigano (Contributor, Author) commented

@yiyixuxu @patrickvonplaten thank you so much for merging my first-ever pull request! I appreciate your time and effort in reviewing and guiding the changes :)
I look forward to continuing to contribute to this community

@vladmandic (Contributor) commented

FYI, examples:

xyz_grid-0016-absolutereality_v1 ("beautiful woman wearing a gown in a city", image grid)

sayakpaul pushed a commit that referenced this pull request Dec 11, 2023: Add support for IPAdapterFull
donhardman pushed a commit to donhardman/diffusers that referenced this pull request Dec 18, 2023: Add support for IPAdapterFull
yoonseokjin pushed a commit to yoonseokjin/diffusers that referenced this pull request Dec 25, 2023: Add support for IPAdapterFull
sayakpaul added a commit that referenced this pull request Dec 26, 2023: add script to train LCM LoRA for SDXL (includes "Add support for IPAdapterFull (#5911)")
donhardman pushed a commit to donhardman/diffusers that referenced this pull request Dec 29, 2023: same LCM LoRA SDXL commit (includes "Add support for IPAdapterFull (huggingface#5911)")
antoine-scenario pushed a commit to antoine-scenario/diffusers that referenced this pull request Jan 2, 2024: same LCM LoRA SDXL commit (includes "Add support for IPAdapterFull (huggingface#5911)")
AmericanPresidentJimmyCarter pushed a commit to AmericanPresidentJimmyCarter/diffusers that referenced this pull request Apr 26, 2024: Add support for IPAdapterFull
AmericanPresidentJimmyCarter pushed a commit to AmericanPresidentJimmyCarter/diffusers that referenced this pull request Apr 26, 2024: same LCM LoRA SDXL commit (includes "Add support for IPAdapterFull (huggingface#5911)")
Successfully merging this pull request may close these issues:

[IP-Adapter] support face model (#5886)