
Add support for IPAdapterFull #5911

Merged: 8 commits merged into huggingface:main on Dec 7, 2023

Conversation

@fabiorigano (Contributor) commented Nov 23, 2023

What does this PR do?

Fixes #5886


Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.
@yiyixuxu

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

@fabiorigano (Contributor, Author) commented Nov 24, 2023

@yiyixuxu sorry to bother you, may I ask for feedback on this PR?

I saw that a newer PR is adding support for IP-Adapter Plus. Should I add it here too? I would only have to update the checkpoint-loading code (image embeddings are computed in the same way).

Thanks

Edit: I added support for IP-Adapter Plus by merging the two solutions (thanks @okotaku)

@fabiorigano fabiorigano changed the title Add support for IPAdapterFull Add support for IPAdapterPlus and IPAdapterFull Nov 24, 2023
@yiyixuxu (Collaborator) commented Nov 26, 2023

hey thanks! sorry I'm a little slow due to the Thanksgiving holidays here in the US.
Can we see some examples of using the face model with IPAdapterFull?

> I saw that a newer PR is adding support for IP-Adapter Plus. Should I add it here too? I would only have to update the checkpoint-loading code (image embeddings are computed in the same way).

Regarding this question: we should not do this under normal circumstances. We can make exceptions and take over another contributor's PR when we need the feature quite urgently, or when the PR becomes stale. In this case I would focus only on IPAdapterFull in this PR for now :)

@fabiorigano (Contributor, Author) commented

@yiyixuxu perfect, I have removed IP-Adapter Plus from my PR

Example n. 1: SG161222/Realistic_Vision_V4.0_noVAE + IP-Adapter full face + custom VAE and DDIM scheduler:

from diffusers import StableDiffusionPipeline, AutoencoderKL, DDIMScheduler
from transformers import CLIPVisionModelWithProjection
import torch
from diffusers.utils import load_image
base_model_path = "SG161222/Realistic_Vision_V4.0_noVAE"
vae_model_path = "stabilityai/sd-vae-ft-mse"
image_encoder = CLIPVisionModelWithProjection.from_pretrained(
    "h94/IP-Adapter",
    subfolder="models/image_encoder",
    torch_dtype=torch.float16,
).to("cuda")
noise_scheduler = DDIMScheduler(
    num_train_timesteps=1000,
    beta_start=0.00085,
    beta_end=0.012,
    beta_schedule="scaled_linear",
    clip_sample=False,
    set_alpha_to_one=False,
    steps_offset=1,
)
vae = AutoencoderKL.from_pretrained(vae_model_path).to(dtype=torch.float16)
pipeline = StableDiffusionPipeline.from_pretrained(
    base_model_path,
    torch_dtype=torch.float16,
    scheduler=noise_scheduler,  # pass the custom DDIM scheduler and VAE to the pipeline
    vae=vae,
    image_encoder=image_encoder,
    feature_extractor=None,
    safety_checker=None,
)
pipeline.to("cuda")
image = load_image("https://huggingface.co/datasets/YiYiXu/testing-images/resolve/main/ai_face2.png")
pipeline.load_ip_adapter("h94/IP-Adapter", subfolder="models", weight_name="ip-adapter-full-face_sd15.bin")
pipeline.set_ip_adapter_scale(0.7)
generator = torch.Generator(device="cpu").manual_seed(33)
images = pipeline(
    prompt="A photo of a girl wearing a black dress, holding red roses in hand, upper body, behind is the Eiffel Tower",
    ip_adapter_image=image,
    negative_prompt="monochrome, lowres, bad anatomy, worst quality, low quality", 
    num_inference_steps=50, num_images_per_prompt=2,
    generator=generator,
).images
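A small usage note (not part of the original example, assuming the standard pipeline output): the call returns a list of PIL images, so the outputs can be saved like this.

for i, im in enumerate(images):
    im.save(f"output_{i}.png")  # filenames are arbitrary / illustrative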

Input

ai_face2

Output

sg_1
sg_2

@fabiorigano (Contributor, Author) commented

Same initialization, different output width and height:

    prompt="A photo of a girl wearing a black dress, holding red roses in hand, upper body, behind is the Eiffel Tower",
    ip_adapter_image=image,
    negative_prompt="monochrome, lowres, bad anatomy, worst quality, low quality", 
    num_inference_steps=50, num_images_per_prompt=2, width=512, height=704, 
    generator=generator,
).images

Output

p0
p1

@fabiorigano fabiorigano changed the title Add support for IPAdapterPlus and IPAdapterFull Add support for IPAdapterFull Nov 27, 2023
@fabiorigano (Contributor, Author) commented

Example n. 2: runwayml/stable-diffusion-v1-5 + IP-Adapter full face + custom VAE and DDIM scheduler (width=512, height=704):

from diffusers import StableDiffusionPipeline, AutoencoderKL, DDIMScheduler
from transformers import CLIPVisionModelWithProjection
import torch
from diffusers.utils import load_image
base_model_path = "runwayml/stable-diffusion-v1-5"
vae_model_path = "stabilityai/sd-vae-ft-mse"
image_encoder = CLIPVisionModelWithProjection.from_pretrained(
    "h94/IP-Adapter", 
    subfolder="models/image_encoder",
    torch_dtype=torch.float16,
).to("cuda")
noise_scheduler = DDIMScheduler(
    num_train_timesteps=1000,
    beta_start=0.00085,
    beta_end=0.012,
    beta_schedule="scaled_linear",
    clip_sample=False,
    set_alpha_to_one=False,
    steps_offset=1,
)
vae = AutoencoderKL.from_pretrained(vae_model_path).to(dtype=torch.float16)
pipeline = StableDiffusionPipeline.from_pretrained(
    base_model_path,
    torch_dtype=torch.float16,
    scheduler=noise_scheduler,
    vae=vae,
    image_encoder=image_encoder,
    feature_extractor=None,
    safety_checker=None
)
pipeline.to("cuda")
image = load_image("https://huggingface.co/datasets/YiYiXu/testing-images/resolve/main/ai_face2.png")
pipeline.load_ip_adapter("h94/IP-Adapter", subfolder="models", weight_name="ip-adapter-full-face_sd15.bin")
pipeline.set_ip_adapter_scale(0.7)
generator = torch.Generator(device="cpu").manual_seed(33)
images = pipeline(
    prompt="A photo of a girl wearing a black dress, holding red roses in hand, upper body, behind is the Eiffel Tower",
    ip_adapter_image=image,
    negative_prompt="monochrome, lowres, bad anatomy, worst quality, low quality", 
    num_inference_steps=50, num_images_per_prompt=2, width=512, height=704,
    generator=generator,
).images

Output

p0
p1

@fabiorigano (Contributor, Author) commented

Example n. 3: runwayml/stable-diffusion-v1-5 + IP-Adapter full face with the default VAE and scheduler (width=512, height=704):

from diffusers import StableDiffusionPipeline
from transformers import CLIPVisionModelWithProjection
import torch
from diffusers.utils import load_image
base_model_path = "runwayml/stable-diffusion-v1-5"
image_encoder = CLIPVisionModelWithProjection.from_pretrained(
    "h94/IP-Adapter", 
    subfolder="models/image_encoder",
    torch_dtype=torch.float16,
).to("cuda")
pipeline = StableDiffusionPipeline.from_pretrained(
    base_model_path,
    torch_dtype=torch.float16,
    image_encoder=image_encoder,
    feature_extractor=None,
    safety_checker=None
)
pipeline.to("cuda")
image = load_image("https://huggingface.co/datasets/YiYiXu/testing-images/resolve/main/ai_face2.png")
pipeline.load_ip_adapter("h94/IP-Adapter", subfolder="models", weight_name="ip-adapter-full-face_sd15.bin")
pipeline.set_ip_adapter_scale(0.7)
generator = torch.Generator(device="cpu").manual_seed(33)
images = pipeline(
    prompt="A photo of a girl wearing a black dress, holding red roses in hand, upper body, behind is the Eiffel Tower",
    ip_adapter_image=image,
    negative_prompt="monochrome, lowres, bad anatomy, worst quality, low quality", 
    num_inference_steps=50, num_images_per_prompt=2, width=512, height=704,
    generator=generator,
).images

Output

p0
p1

@yiyixuxu yiyixuxu mentioned this pull request Nov 27, 2023
@yiyixuxu (Collaborator) commented

Awesome! Is the output in example n. 3 (with the default scheduler and VAE) expected?

@fabiorigano (Contributor, Author) commented

> Awesome! Is the output in example n. 3 (with the default scheduler and VAE) expected?

@yiyixuxu yes, the original IP-Adapter implementation with runwayml/stable-diffusion-v1-5 and its default DDIM scheduler and VAE gives the same outputs as example n. 3. Using the other checkpoint (SG161222/Realistic_Vision_V4.0_noVAE), the output images are "clean".

Output from original IP-Adapter (runwayml/stable-diffusion-v1-5, default scheduler and vae)

image

@yiyixuxu (Collaborator) commented Nov 28, 2023

Can you add an example of using the face model to the doc here: https://huggingface.co/docs/diffusers/main/en/using-diffusers/loading_adapters and make it clear that we should not use the default VAE and scheduler for face models?

cc @xiaohu2015: why do we need to use a different VAE and scheduler for face models?

@xiaohu2015 (Contributor) commented Nov 28, 2023

> Can you add an example of using the face model to the doc here: https://huggingface.co/docs/diffusers/main/en/using-diffusers/loading_adapters and make it clear that we should not use the default VAE and scheduler for face models?
>
> cc @xiaohu2015: why do we need to use a different VAE and scheduler for face models?

I only tested with the DDIM scheduler. Maybe the VAE is not so important; I think the noise scheduler matters more.

Update: I just tested some noise schedulers and found that some of them don't work well (or need more steps to produce normal images), while DDIM and EulerDiscreteScheduler work well. Besides, the VAE can be left at the default.
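A minimal sketch of swapping in one of the schedulers mentioned above (an illustrative addition, not from the thread; it reuses the model ID from the examples and the standard diffusers from_config pattern):

from diffusers import StableDiffusionPipeline, EulerDiscreteScheduler
import torch

pipeline = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
)
# Swap only the sampling algorithm, reusing the pipeline's existing scheduler config.
pipeline.scheduler = EulerDiscreteScheduler.from_config(pipeline.scheduler.config)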

@fabiorigano (Contributor, Author) commented Nov 28, 2023

> Can you add an example of using the face model to the doc here: https://huggingface.co/docs/diffusers/main/en/using-diffusers/loading_adapters and make it clear that we should not use the default VAE and scheduler for face models?

@yiyixuxu done!

How should I manage the output image (it is the first one in example n. 2)? Right now the link points to an image in this PR, but maybe it should be moved to huggingface/documentation-images. Thanks

@yiyixuxu (Collaborator) left a review comment

Thanks for working on this! I left a few comments; overall it's looking great and I can't wait to have this in diffusers.

We might have to coordinate with #5915 in terms of merging, and this may need a rebase.

Review threads: docs/source/en/using-diffusers/loading_adapters.md (3, outdated, resolved); src/diffusers/models/embeddings.py (1, resolved)
@fabiorigano (Contributor, Author) commented

Where can I upload the output image for documentation?

@fabiorigano (Contributor, Author) commented Nov 30, 2023

> Thanks for working on this! I left a few comments; overall it's looking great and I can't wait to have this in diffusers.
>
> We might have to coordinate with #5915 in terms of merging, and this may need a rebase.

@yiyixuxu How do you suggest we proceed with the merge? Thanks

Edit: I saw the other PR is ready to be merged, so I could rebase this one after that merge, if that is OK

@yiyixuxu (Collaborator) commented Dec 1, 2023

@fabiorigano sounds good! Let's merge that one first and finish up the face model here in this PR!

@yiyixuxu (Collaborator) commented Dec 1, 2023

> Where can I upload the output image for documentation?

cc @sayakpaul: how is it normally done? Do contributors send a PR to our repo that hosts documentation images?

@sayakpaul (Member) commented

Yes

@yiyixuxu (Collaborator) commented Dec 1, 2023

@fabiorigano for the documentation image, you can send a PR here: https://huggingface.co/datasets/huggingface/documentation-images/tree/main/diffusers

@fabiorigano (Contributor, Author) commented

> @fabiorigano for the documentation image, you can send a PR here: https://huggingface.co/datasets/huggingface/documentation-images/tree/main/diffusers

@yiyixuxu done: https://huggingface.co/datasets/huggingface/documentation-images/discussions/234

I accidentally opened another PR in the wrong directory; I can only close it tomorrow due to restrictions on new users

@fabiorigano (Contributor, Author) commented

@yiyixuxu @patrickvonplaten Rebased after the merge of #5915.

I still can't delete the first PR on huggingface/documentation-images.


@yiyixuxu (Collaborator) left a review comment Dec 4, 2023

looks good to merge to me :)

@yiyixuxu (Collaborator) commented Dec 5, 2023

@patrickvonplaten can we have a final review here? This is the last piece of IP-Adapter!

@patrickvonplaten (Contributor) left a review comment

Cool! Left some final nits. @yiyixuxu feel free to merge whenever.

@@ -675,6 +675,9 @@ def _load_ip_adapter_weights(self, state_dict):
if "proj.weight" in state_dict["image_proj"]:
# IP-Adapter
num_image_text_embeds = 4
elif "proj.3.weight" in state_dict["image_proj"]:
# IP-Adapter Full Face
num_image_text_embeds = 257
@patrickvonplaten (Contributor) commented

This number seems a bit arbitrary; can we extract it somehow from the shape of the weight?

@fabiorigano (Contributor, Author) replied

Hi, thanks for the review. Unfortunately I couldn't find a way to compute it from the shape, so I just added a comment as in the original implementation.
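For reference, a minimal standalone sketch of this dispatch (illustrative only, not the actual diffusers code; reading 257 as CLIP ViT-H's 256 patch tokens plus the class token is an assumption):

def infer_num_image_text_embeds(image_proj_state_dict):
    if "proj.weight" in image_proj_state_dict:
        # plain IP-Adapter: a single linear projection to 4 tokens
        return 4
    if "proj.3.weight" in image_proj_state_dict:
        # IP-Adapter Full Face: an MLP projection ("proj" is a Sequential,
        # so "proj.3" is its fourth submodule); 257 is assumed to be
        # CLIP ViT-H's 256 patch tokens plus the class token
        return 257
    raise ValueError("unrecognized IP-Adapter image projection weights")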

@yiyixuxu (Collaborator) commented Dec 7, 2023

will merge now but need to open a follow-up PR for:

  1. [Feature] Support IP-Adapter Plus #5915 (comment)
  2. Add support for IPAdapterFull #5911 (comment)

@fabiorigano (Contributor, Author) commented

> will merge now but need to open a follow-up PR for:
>
> 1. [Feature] Support IP-Adapter Plus #5915 (comment)
> 2. Add support for IPAdapterFull #5911 (comment)

@yiyixuxu I can open a new PR then, if that is OK

@yiyixuxu yiyixuxu merged commit b65928b into huggingface:main Dec 7, 2023
14 checks passed
@fabiorigano (Contributor, Author) commented

@yiyixuxu @patrickvonplaten thank you so much for merging my first-ever pull request! I appreciate your time and effort in reviewing and guiding the changes :)
I look forward to continuing to contribute to this community

@vladmandic (Contributor) commented

FYI, examples:

xyz_grid-0016-absolutereality_v1 ("beautiful woman wearing a gown in a city", image grid)

sayakpaul pushed a commit that referenced this pull request Dec 11, 2023: Add support for IPAdapterFull
donhardman pushed a commit to donhardman/diffusers that referenced this pull request Dec 18, 2023: Add support for IPAdapterFull
yoonseokjin pushed a commit to yoonseokjin/diffusers that referenced this pull request Dec 25, 2023: Add support for IPAdapterFull
sayakpaul added a commit that referenced this pull request Dec 26, 2023: add script to train LCM LoRA for SDXL (includes "Add support for IPAdapterFull (#5911)")
donhardman pushed a commit to donhardman/diffusers that referenced this pull request Dec 29, 2023: same LCM LoRA SDXL commit (includes "Add support for IPAdapterFull (huggingface#5911)")
antoine-scenario pushed a commit to antoine-scenario/diffusers that referenced this pull request Jan 2, 2024: same LCM LoRA SDXL commit (includes "Add support for IPAdapterFull (huggingface#5911)")
AmericanPresidentJimmyCarter pushed a commit to AmericanPresidentJimmyCarter/diffusers that referenced this pull request Apr 26, 2024: Add support for IPAdapterFull
AmericanPresidentJimmyCarter pushed a commit to AmericanPresidentJimmyCarter/diffusers that referenced this pull request Apr 26, 2024: same LCM LoRA SDXL commit (includes "Add support for IPAdapterFull (huggingface#5911)")
Successfully merging this pull request may close these issues:

[IP-Adapter] support face model (#5886)