[scheduler] support custom timesteps and sigmas #7817
Conversation
cc @asomoza here

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@yiyixuxu yeah, SDE looks good here; in fact, they all look better than the tests I did. Going to try with some photo-realistic prompts.

does this work for SVD?
```python
# test ays (svd - default scheduler + default ays steps)
import os

import torch
from diffusers import DiffusionPipeline
from diffusers.utils import load_image, export_to_video

model_id = "stabilityai/stable-video-diffusion-img2vid-xt"
pipe = DiffusionPipeline.from_pretrained(
    model_id, torch_dtype=torch.float16, variant="fp16"
)
pipe.enable_model_cpu_offload()

# Load the conditioning image
image = load_image("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/svd/rocket.png")
image = image.resize((1024, 576))

seed = 123
sampling_schedule10 = [700.00, 54.5, 15.886, 7.977, 4.248, 1.789, 0.981, 0.403, 0.173, 0.034, 0.002]

save_dir = "./test_ays_default_svd"
os.makedirs(save_dir, exist_ok=True)

test_call_params = {
    "default10": {
        "image": image,
        "decode_chunk_size": 8,
        "num_inference_steps": 10,
    },
    "ays10": {
        "image": image,
        "decode_chunk_size": 8,
        "num_inference_steps": 10,
        "sigmas": sampling_schedule10,
    },
}

for test_name, params in test_call_params.items():
    print(f"test_name: {test_name}")
    generator = torch.Generator(device="cuda").manual_seed(seed)
    print(f"default scheduler_config: {pipe.scheduler}")
    frames = pipe(**params, generator=generator).frames[0]
    print(f"timesteps: {pipe.scheduler.timesteps}")
    print(f"sigmas: {pipe.scheduler.sigmas}")
    export_to_video(frames, f"{save_dir}/{test_name}.mp4", fps=7)
    print(f"saved video to {save_dir}/{test_name}.mp4")
```
@asomoza I'm a little bit concerned about SDE - the paper released many good results with the DPM SDE variant, and these steps seem to be optimized for the SDE variants. Can you share the diffusers script you used for SDE? Another decision we need to make here is the default for the final sigma: from my brief experiments here (#7817 (comment)), I think `final_sigmas_type="zero"` also improves AYS results, so I think we should keep it as the default, which is different from the official one; let me know what you think on this too.
@yiyixuxu maybe it's that I had bad seed luck and found out about it; it also seems to happen more with the finetunes than with the base model. This is the code I use now with your latest commit:

```python
import torch
from diffusers import DPMSolverMultistepScheduler, StableDiffusionXLPipeline
from diffusers.schedulers import AysSchedules

sampling_schedule = AysSchedules["StableDiffusionXLTimesteps"]

pipe = StableDiffusionXLPipeline.from_pretrained(
    "SG161222/RealVisXL_V4.0",
    torch_dtype=torch.float16,
    variant="fp16",
).to("cuda")
pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config, algorithm_type="sde-dpmsolver++")

prompt = "A cinematic shot of a cute little rabbit wearing a jacket and doing a thumbs up"
generator = torch.Generator(device="cpu").manual_seed(2487854446)
image = pipe(
    prompt=prompt,
    negative_prompt="",
    guidance_scale=7.5,
    num_inference_steps=10,
    generator=generator,
    timesteps=sampling_schedule,
).images[0]
image.save("ays_test.png")
```

For me, with that seed, I get a bad result with that model; in comfyui I tested it with Juggernaut. I don't seem to get a bad result with the base model, but its quality is worse, so not many people use it.
I did get a couple of good results with that model (RealVis) and SDE after, but I get bad results more often. As for the final sigma, I agree with you on making `"zero"` the default.
diffusers/src/diffusers/schedulers/scheduling_dpmsolver_multistep.py Lines 363 to 371 in 9a23c53
timesteps are being overwritten here. Is this intended?

@okaris, thanks for catching this!

I believe that manually setting timesteps implies manual control over sigmas as well. It would be helpful if the scheduler could ignore the
@okaris |
…e error messages, etc
@dg845 feel free to give this a review if you have time! |
I don't have enough background knowledge to really judge the approach of passing sigmas explicitly, so I focused more on the code itself. This generally LGTM, I just had some smaller comments.
Moreover, would it make sense to add an entry to the docs? At least to me, just from reading the docstrings, I wouldn't really know what this new feature does.
@@ -170,6 +173,16 @@ def retrieve_timesteps(
        scheduler.set_timesteps(timesteps=timesteps, device=device, **kwargs)
        timesteps = scheduler.timesteps
        num_inference_steps = len(timesteps)
    elif sigmas is not None:
        accept_sigmas = "sigmas" in set(inspect.signature(scheduler.set_timesteps).parameters.keys())
I wonder if it would be better to add a method to the scheduler, like `scheduler._accepts_sigmas()` or `scheduler._check_sigmas(sigmas)`, instead of inspecting. Same argument for corresponding methods below.
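As a standalone sketch of the `inspect`-based check the diff uses (the two scheduler classes here are hypothetical stand-ins, not real diffusers schedulers):

```python
import inspect

# Hypothetical scheduler whose set_timesteps accepts custom sigmas.
class SchedulerWithSigmas:
    def set_timesteps(self, num_inference_steps=None, device=None, sigmas=None):
        self.sigmas = sigmas

# Hypothetical scheduler without sigmas support.
class SchedulerWithoutSigmas:
    def set_timesteps(self, num_inference_steps=None, device=None):
        pass

def accepts_sigmas(scheduler):
    # Same idea as the diff: look for a "sigmas" parameter on set_timesteps.
    return "sigmas" in inspect.signature(scheduler.set_timesteps).parameters

print(accepts_sigmas(SchedulerWithSigmas()))     # True
print(accepts_sigmas(SchedulerWithoutSigmas()))  # False
```

A `scheduler._accepts_sigmas()` method would move this capability check into each scheduler, at the cost of touching every scheduler class.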
@@ -189,6 +192,7 @@ def __init__(
        timestep_type: str = "discrete",  # can be "discrete" or "continuous"
        steps_offset: int = 0,
        rescale_betas_zero_snr: bool = False,
        final_sigmas_type: str = "zero",  # can be "zero" or "sigma_min"
Type could be changed to `Literal["zero", "sigma_min"]` to be more precise.
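For illustration, such an annotation could look like this (the helper function is hypothetical, just to show how `typing.get_args` can mirror the static check at runtime):

```python
from typing import Literal, get_args

FinalSigmasType = Literal["zero", "sigma_min"]

def set_final_sigmas_type(value: FinalSigmasType) -> str:
    # Static type checkers flag invalid literals; this runtime guard mirrors that.
    if value not in get_args(FinalSigmasType):
        raise ValueError(
            f"final_sigmas_type must be one of {get_args(FinalSigmasType)}, got {value!r}"
        )
    return value

print(set_final_sigmas_type("zero"))  # zero
```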
good idea! I would maybe ask the community to help us do this for the whole code base
            timesteps = (np.arange(self.config.num_train_timesteps, 0, -step_ratio)).round().copy().astype(np.float32)
            timesteps -= 1
        else:
            if timesteps is not None and sigmas is not None:
Does the next check not already encompass this one?
basically, these 3 checks together make sure that, among these 3 variables, one is not None and two are None. Is there a better way?
Oh sorry, I misread the code, it should be fine as is. Another way would be to add them, so: `if (num_inference_steps is not None) + (timesteps is not None) + (sigmas is not None) != 1`, but not sure if that's easier to read ;-)
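As a standalone sketch of that "exactly one of the three" validation (the helper name is hypothetical; the argument names mirror the PR):

```python
def check_exactly_one(num_inference_steps=None, timesteps=None, sigmas=None):
    # Exactly one of the three ways to specify the schedule must be given.
    # Booleans add up as 0/1, so the sum counts how many were provided.
    provided = (
        (num_inference_steps is not None)
        + (timesteps is not None)
        + (sigmas is not None)
    )
    if provided != 1:
        raise ValueError(
            "Exactly one of `num_inference_steps`, `timesteps`, or `sigmas` must be passed."
        )

check_exactly_one(num_inference_steps=10)   # ok
check_exactly_one(timesteps=[999, 500, 1])  # ok
```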
Co-authored-by: Benjamin Bossan <[email protected]>
@stevhliu, where should I add a short introduction to this custom timesteps/sigmas feature? Basically, this allows users to use AYS (#7817 (comment)) out of the box, but it's not limited to that: if there is a new sigmas or timesteps schedule that's not in diffusers yet, you can manually create it and pass it to pipelines as custom timesteps. This already exists for the LCM scheduler, so it's not a completely new feature, but now we are extending it to more schedulers and pipelines.
Maybe we can add a section called "Custom schedulers" to Load schedulers and models?
@@ -165,6 +165,62 @@ image

Most images look very similar and are comparable in quality. Again, it often comes down to your specific use case so a good approach is to run multiple different schedulers and compare the results.

### Custom Timestep Schedules
I added a section here @stevhliu
we should probably refactor later and, in the same place, also talk about `use_karras_sigmas` and `rescale_zero_terminal_snr` - most of the scheduler configs should just be kept the same as the default scheduler, but these are the configurations that the user can play around with
for now, feel free to give it a review :)
Sounds good, I'll refactor this later in a separate PR and create a doc for scheduler features (custom `timesteps` and `sigmas`, `rescale_zero_terminal_snr`, and `use_karras_sigmas`) 🙂
Co-authored-by: Steven Liu <[email protected]>
custom `timesteps` and `sigmas` for schedulers

The general logic is: when a custom `timesteps` is passed, all the scheduler configuration that is used to generate timesteps (e.g. `timestep_spacing`) will be ignored; sigmas will still be calculated based on the `timesteps` and relevant config attributes, e.g. `interpolation_type`, `final_sigmas_type`.

When `sigmas` is passed, all the scheduler config used to generate timesteps and sigmas is ignored; `timesteps` will be calculated based on `sigmas`.
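A simplified sketch of how a timestep can be derived from a custom sigma by interpolating in log-sigma space against the training schedule (the training sigma values below are made up, and this is only the idea behind Euler's conversion, not the actual diffusers code):

```python
import math

# Toy ascending training sigma schedule: index == train timestep,
# higher timestep == more noise (values are made up).
train_sigmas = [0.1, 0.5, 1.0, 2.0, 4.0, 8.0, 14.6]

def sigma_to_timestep(sigma, train_sigmas):
    # Find the two training sigmas bracketing `sigma` and interpolate
    # linearly in log-sigma space to get a (fractional) timestep.
    log_s = math.log(sigma)
    for t in range(len(train_sigmas) - 1):
        lo, hi = math.log(train_sigmas[t]), math.log(train_sigmas[t + 1])
        if lo <= log_s <= hi:
            w = (log_s - lo) / (hi - lo)
            return t + w
    # Out of range: clamp to the ends of the schedule.
    return 0.0 if log_s < math.log(train_sigmas[0]) else float(len(train_sigmas) - 1)

# A custom sigma schedule (high noise -> low noise) maps to
# decreasing fractional timesteps.
custom_sigmas = [10.0, 3.0, 0.7]
timesteps = [sigma_to_timestep(s, train_sigmas) for s in custom_sigmas]
```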
this PR:

- adds custom `timesteps` and `sigmas` support for more schedulers: `timesteps` for the DPM multi-step, DPM single-step, and Heun schedulers; `timesteps` or `sigmas` for Euler
- updates `retrieve_timesteps` so it accepts both `timesteps` and `sigmas` and will pass these values to the scheduler's `set_timesteps` method
- adds the `sigmas` argument to more pipelines (to all the pipelines that currently accept the `timesteps` argument)
- adds `timesteps`/`sigmas` tests for Euler, Heun, DPM multi and DPM single; the `retrieve_timesteps` methods are all copied from sd1.5, so I think these two pipeline tests are sufficient

supporting AYS

with this PR, you can now use AYS 10 steps (see #7760) with this script

This PR uses AYS as an example use case, but the goal is to support custom timesteps and sigmas in a very general way. Currently I only updated Euler and DPM, but we should extend this to other schedulers whenever it makes sense.
Notes on `final_sigmas_type`

in the scheduler's `set_timesteps` method, for N steps, we will come up with `N` timesteps and `N+1` sigmas. This is because, at each step, we need both the current sigma and the next sigma to calculate the derivative. We let the user decide how to set the final sigma value using the `final_sigmas_type` config: if it is `"sigma_min"`, the final sigma will be calculated based on the beta training schedule for the last timestep; if it is `"zero"`, the final sigma is `0`.
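A toy sketch of that N vs N+1 bookkeeping (the helper is hypothetical and the sigma values are made up; `"sigma_min"` here just appends a small constant rather than computing it from the beta training schedule as the real schedulers do):

```python
def build_sigmas(step_sigmas, final_sigmas_type="zero", sigma_min=0.03):
    # For N steps we need N+1 sigmas: step i uses sigma[i] and sigma[i+1].
    if final_sigmas_type == "zero":
        return step_sigmas + [0.0]
    elif final_sigmas_type == "sigma_min":
        return step_sigmas + [sigma_min]
    raise ValueError(f"unknown final_sigmas_type: {final_sigmas_type!r}")

steps = [14.6, 7.0, 3.0, 1.0]           # N = 4 per-step sigmas
sigmas = build_sigmas(steps, "zero")    # N + 1 = 5 values, last one 0.0
```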
I used the following logic for custom `timesteps` and `sigmas` regarding `final_sigmas_type`:

- when a custom `timesteps` value is passed, we check to make sure it has the same length as `num_inference_steps`, and the scheduler will decide the last sigma value based on `final_sigmas_type`
- when a custom `sigmas` value is passed, it has to have the same length as `num_inference_steps + 1` - i.e. the user has to decide what the last sigma is and include it in the custom `sigmas` values, so `final_sigmas_type` won't be relevant in this case

Another note is that the official AYS schedule corresponds to `final_sigmas_type="sigma_min"`, but in diffusers we set the default to `"zero"` since we think the generation quality is better with this config across all schedulers, especially when the number of steps is low. See more details in PR #6477. I ran brief tests in the section below to compare these two configurations.

Testing across different schedulers and config
I ran tests across different schedulers for 2 different configurations: `final_sigmas_type="sigma_min"` and `final_sigmas_type="zero"`.

Testing script

output

You can see the testing output for each scheduler in the sections below; each output is an image grid with 3 rows and 2 columns: the column on the left is `final_sigmas_type="sigma_min"` (this is how it's done in the official AYS diffusers example), the column on the right is `final_sigmas_type="zero"` (the diffusers default setting).

DPMPP_2M output
DPMPP_2M_SDE output
Euler
DPM SingleStep
Heun
(Heun does not have a `final_sigmas_type` argument, so there is just one column; the final sigma is always 0 for Heun)