[WIP][Community Pipeline] InstaFlow! One-Step Stable Diffusion with Rectified Flow #6057
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.
Let's fix the styling problems.
Also, have you tried the pipeline yourself? Have you experienced any significant speedup? Could you maybe report some results (possibly with the generated images)?
Will try adding the LoRA results they show, since they have better quality.
Gentle ping @ayushtues. |
Will share an update soon, was away on a trip for a while.
While I am not very familiar with benchmarking timings, at 32-bit precision on a V100 I was able to generate images in around 0.03 s (27 it/s in the tqdm log). The original paper uses an A100, which should be faster, so I am not sure why I am getting faster inference than the 0.1 s figure they report.

```python
from examples.community.instaflow_one_step import InstaFlowPipeline
import torch

pipe = InstaFlowPipeline.from_pretrained(
    "XCLIU/instaflow_0_9B_from_sd_1_5",
    torch_dtype=torch.float32,
    cache_dir="./cache",
    requires_safety_checker=False,
)
pipe.do_lora()
pipe.to("cuda")  # if a GPU is not available, comment this line out

prompt = "A hyper-realistic photo of a cute cat."
image_list = []
for i in range(10):
    images = pipe(prompt=prompt, num_inference_steps=1, guidance_scale=0.0).images
    image_list.append(images[0])

for i in range(10):
    image_list[i].save(f"./image{i}.png")
```
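Since the tqdm it/s figure includes host-side overhead and CUDA kernels launch asynchronously, per-image timings measured this way can be misleading. A minimal, framework-agnostic timing helper as a sketch (my own, not part of this PR; the `synchronize` hook stands in for `torch.cuda.synchronize` when benchmarking on a GPU):

```python
import time

def time_callable(fn, warmup=3, iters=10, synchronize=None):
    """Return average wall-clock seconds per call of fn.

    synchronize: optional callable that flushes pending async work
    (e.g. torch.cuda.synchronize on CUDA); pass None for CPU work.
    """
    for _ in range(warmup):  # warm caches / load kernels before timing
        fn()
    if synchronize is not None:
        synchronize()
    start = time.perf_counter()
    for _ in range(iters):
        fn()
    if synchronize is not None:
        synchronize()
    return (time.perf_counter() - start) / iters

# Toy usage with a cheap stand-in workload:
avg = time_callable(lambda: sum(range(10_000)), warmup=1, iters=5)
print(f"{avg:.6f} s/call")
```

For the pipeline above, `fn` could be `lambda: pipe(prompt=prompt, num_inference_steps=1, guidance_scale=0.0)` with `synchronize=torch.cuda.synchronize`.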
Nice, very cool. Feel free to reformat and then we can merge!
Fixed formatting, and added some documentation!
examples/community/README.md (outdated)

```python
pipe = DiffusionPipeline.from_pretrained("XCLIU/instaflow_0_9B_from_sd_1_5", torch_dtype=torch.float32, custom_pipeline="instaflow_one_step")
pipe.do_lora()  # use DreamBooth LoRA for better quality
```
Can't we do load_lora_weights() here? Let's try to use that instead.
One comment.
Hi @sayakpaul, the current code (
Is there a way to do this using
If the underlying LoRA format is new, we cannot assume
Good to merge for me whenever!
@sayakpaul since the text encoder and VAE are just copies of the DreamShaper model and the UNet follows a different strategy, I am not sure how to do this using just
The UNet doesn't have different pre-trained weights, right? So, if you subclass the UNet with
Small update here, I dug a bit into it, and the VAE & text encoder are the same between DreamShaper and InstaFlow, probably just the standard SD 1.5 ones, so we just need to figure out LoRA for the UNet. Will create the LoRA weights for the UNet, upload them to a separate location, and load from there using
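For intuition on what "creating the LoRA weights for the UNet" could look like, here is a toy sketch (my own illustration, not the approach used in this PR) of extracting rank-r LoRA factors from the difference between a fine-tuned weight matrix and its base via truncated SVD:

```python
import numpy as np

rng = np.random.default_rng(0)
base = rng.normal(size=(8, 8))       # base weight matrix (toy)
rank = 2
delta = rng.normal(size=(8, rank)) @ rng.normal(size=(rank, 8))
finetuned = base + delta             # fine-tuned weight; diff is exactly rank 2

# Truncated SVD of the weight difference yields LoRA-style factors
# B (down) and A (up) with finetuned ≈ base + B @ A:
U, S, Vt = np.linalg.svd(finetuned - base)
B = U[:, :rank] * S[:rank]           # shape (8, rank)
A = Vt[:rank, :]                     # shape (rank, 8)

print(np.allclose(base + B @ A, finetuned))  # True, since the diff is rank 2
```

A real checkpoint would need this per LoRA-targeted layer plus a choice of rank and scaling, which is why a dedicated export step is needed before the weights can be loaded as a LoRA.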
@sayakpaul is there a breaking change to
It doesn't seem to be working for a simple example either:

```python
import torch
from diffusers import DiffusionPipeline

pipeline = DiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    torch_dtype=torch.float32,
)
pipeline.unet.save_attn_procs("./", weight_name="pytorch_custom_diffusion_weights.bin")
```

Gives the error

Tried both the stable version of diffusers and building from source; colab here - https://colab.research.google.com/drive/1v9i09Mi9L93Cts0zywEj10t1rJVCuJmj?usp=sharing
Is there any new functionality for similar use cases, namely saving weights which we can load using
You can use
While I have yet to figure out how to use the DreamShaper model via LoRA (it doesn't seem to use a trivial LoRA format), the pipeline does seem to be able to use other LoRAs simply:

```python
import torch
from examples.community.instaflow_one_step import InstaFlowPipeline

pipe = InstaFlowPipeline.from_pretrained(
    "XCLIU/instaflow_0_9B_from_sd_1_5",
    torch_dtype=torch.float32,
    cache_dir="E:/InstaFlow/code/cache",
    requires_safety_checker=False,
)
pipe.load_lora_weights("artificialguybr/logo-redmond-1-5v-logo-lora-for-liberteredmond-sd-1-5")

prompt = "logo, A logo for a fitness app, dynamic running figure, energetic colors (red, orange) ),LogoRedAF ,"
generator = torch.Generator(device="cpu").manual_seed(0)
images = pipe(prompt=prompt, num_inference_steps=1, generator=generator, guidance_scale=0.0).images
images[0].save("./image1.png")
```
Nice. Let's include this in the README. And then ship!
Made the changes. One small thing: I haven't yet tested directly loading the custom pipeline using
How have you tested it then? Cc'ing @patrickvonplaten for better clarification here.
Good to merge for me!
Hi @sayakpaul, I tested out the pipeline locally using

To test it locally, we need to create a new folder with a

```python
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained("XCLIU/instaflow_0_9B_from_sd_1_5", custom_pipeline="./instaflow_pipe")
```

After the PR is merged, we can then do:

```python
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained("XCLIU/instaflow_0_9B_from_sd_1_5", custom_pipeline="instaflow_one_step")
```
Let me know if anything else is needed, otherwise we can merge the PR!
Once the CI is green, I will merge. Thanks for your contributions!
…ectified Flow (huggingface#6057)

* Add instaflow community pipeline
* Make styling fixes
* Add lora
* Fix formatting
* Add docs
* Update README.md
* Update README.md
* Remove do LORA
* Update readme
* Update README.md
* Update README.md

Co-authored-by: Sayak Paul <[email protected]>
Co-authored-by: Patrick von Platen <[email protected]>
Following #6037
Model/Pipeline/Scheduler description
InstaFlow is an ultra-fast, one-step image generator that achieves image quality close to Stable Diffusion while significantly reducing the demand for computational resources. This efficiency is made possible by the recent Rectified Flow technique, which trains probability flows with straight trajectories and hence inherently requires only a single step for fast inference.
InstaFlow has several advantages:
Ultra-Fast Inference: InstaFlow models are one-step generators that map noise directly to images, avoiding the multi-step sampling of diffusion models. On our machine with an A100 GPU, the inference time is around 0.1 second, saving ~90% of the inference time compared to the original Stable Diffusion.
High Quality: InstaFlow generates images with intricate details like Stable Diffusion, and has a similar FID on MS COCO 2014 to state-of-the-art text-to-image GANs such as StyleGAN-T.
Simple and Efficient Training: Training InstaFlow merely involves supervised learning. Leveraging pre-trained Stable Diffusion, it takes only 199 A100 GPU days to obtain InstaFlow-0.9B.
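The "straight trajectories" property is what makes one-step inference exact rather than approximate. A toy sketch (my own illustration, not code from the paper): along the straight interpolation x_t = (1 - t)·x0 + t·x1, the velocity dx_t/dt = x1 - x0 is constant, so a single Euler step of size 1 from the noise endpoint lands exactly on the data endpoint:

```python
import random

random.seed(0)
x0 = [random.gauss(0, 1) for _ in range(4)]  # "noise" endpoint
x1 = [random.gauss(0, 1) for _ in range(4)]  # "data" endpoint

def velocity(x_t, t):
    # On the straight path x_t = (1 - t) * x0 + t * x1 the velocity
    # dx_t/dt = x1 - x0 is constant, independent of x_t and t.
    return [b - a for a, b in zip(x0, x1)]

# One Euler step of size 1.0 starting from the noise endpoint:
one_step = [x + v for x, v in zip(x0, velocity(x0, 0.0))]

print(all(abs(a - b) < 1e-12 for a, b in zip(one_step, x1)))  # True
```

A curved probability flow, as in standard diffusion models, would accumulate discretization error with a step this large, which is why multi-step samplers are normally required.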
Open source status
Provide useful links for the implementation
Code & Weights - https://github.com/gnobitab/InstaFlow
HF Space - https://huggingface.co/spaces/XCLiu/InstaFlow
TO-DO
CC
@sayakpaul @gnobitab