
DDIM produces incorrect samples with SDXL (epsilon or v-prediction) #6068

Closed
bghira opened this issue Dec 6, 2023 · 40 comments
Assignees
Labels
bug Something isn't working scheduler

Comments

@bghira
Contributor

bghira commented Dec 6, 2023

Describe the bug

When generating images with SDXL and DDIM, there is some residual noise in the outputs.

This leads to a "smudgy" look, and in cases where fewer steps are used, DDIM and Euler diverge a lot more than they should because of the cumulative impact of not having the timesteps aligned properly.

In some brief tests, it looks like simply adding an extra timestep with a zero sigma to the end of the schedule resolves the problem.
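A minimal sketch of that workaround (assuming the scheduler exposes its sigma schedule as a 1-D tensor; `append_zero_sigma` is a hypothetical helper, not a diffusers API):

```python
import torch

def append_zero_sigma(sigmas: torch.Tensor) -> torch.Tensor:
    # Append a terminal sigma of exactly zero so the final step
    # denoises completely instead of leaving residual noise.
    return torch.cat([sigmas, sigmas.new_zeros(1)])

sigmas = torch.linspace(14.6, 0.03, 30)  # hypothetical 30-step schedule
patched = append_zero_sigma(sigmas)
print(patched.shape[0], float(patched[-1]))  # 31 0.0
```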

Reproduction

This script uses a modified Euler scheduler to create fully-denoised images:

import torch
from diffusers import StableDiffusionXLPipeline, EulerDiscreteScheduler

model_id = "ptx0/terminus-xl-gamma-training"
pipe = StableDiffusionXLPipeline.from_pretrained(model_id, add_watermarker=False, torch_dtype=torch.bfloat16).to("cuda")
generator = torch.Generator("cuda").manual_seed(420420420)

prompt = "the artful dodger, cool dog in sunglasses sitting on a recliner in the dark, with the white noise reflecting on his sunglasses"
num_inference_steps = 30
guidance_scale = 7.5
def rescale_zero_terminal_snr_sigmas(sigmas):
    sigmas = sigmas.flip(0)
    alphas_cumprod = 1 / ((sigmas * sigmas) + 1)
    alphas_bar_sqrt = alphas_cumprod.sqrt()

    # Store old values.
    alphas_bar_sqrt_0 = alphas_bar_sqrt[0].clone()
    alphas_bar_sqrt_T = alphas_bar_sqrt[-1].clone()

    # Shift so the last timestep is zero.
    alphas_bar_sqrt -= (alphas_bar_sqrt_T)

    # Scale so the first timestep is back to the old value.
    alphas_bar_sqrt *= alphas_bar_sqrt_0 / (alphas_bar_sqrt_0 - alphas_bar_sqrt_T)

    # Convert alphas_bar_sqrt to betas
    alphas_bar = alphas_bar_sqrt**2  # Revert sqrt
    alphas_bar[-1] = 4.8973451890853435e-08
    sigmas = ((1 - alphas_bar) / alphas_bar) ** 0.5
    return sigmas.flip(0)


zsnr = getattr(pipe.scheduler.config, 'rescale_betas_zero_snr', False)
pipe.scheduler = EulerDiscreteScheduler.from_config(pipe.scheduler.config)
if zsnr:
    tsbase = pipe.scheduler.set_timesteps
    def tspatch(*args, **kwargs):
        tsbase(*args, **kwargs)
        pipe.scheduler.sigmas = rescale_zero_terminal_snr_sigmas(pipe.scheduler.sigmas)
    pipe.scheduler.set_timesteps = tspatch

edited_image = pipe(
    prompt=prompt,
    num_inference_steps=num_inference_steps,
    guidance_scale=guidance_scale,
    generator=generator,
    guidance_rescale=0.7,
).images[0]
edited_image.save("edited_image.png")

It uses the Sigmas code ported by @Beinsezii in #6024
[image]

However, with vanilla DDIM, the results are far worse:

import torch
from diffusers import StableDiffusionXLPipeline

model_id = "ptx0/terminus-xl-gamma-training"
pipe = StableDiffusionXLPipeline.from_pretrained(model_id, add_watermarker=False, torch_dtype=torch.bfloat16).to("cuda")
generator = torch.Generator("cuda").manual_seed(420420420)

prompt = "the artful dodger, cool dog in sunglasses sitting on a recliner in the dark, with the white noise reflecting on his sunglasses"
num_inference_steps = 30
guidance_scale = 7.5
edited_image = pipe(
    prompt=prompt,
    num_inference_steps=num_inference_steps,
    guidance_scale=guidance_scale,
    generator=generator,
    guidance_rescale=0.7,
).images[0]
edited_image.save("edited_image.png")

[image]

Logs

No response

System Info

  • diffusers version: 0.21.4
  • Platform: Linux-5.19.0-45-generic-x86_64-with-glibc2.31
  • Python version: 3.9.16
  • PyTorch version (GPU?): 2.1.0+cu118 (True)
  • Huggingface_hub version: 0.16.4
  • Transformers version: 4.30.2
  • Accelerate version: 0.18.0
  • xFormers version: 0.0.22.post4+cu118
  • Using GPU in script?: A100-80G PCIe
  • Using distributed or parallel set-up in script?: FALSE

Who can help?

@patrickvonplaten @yiyixuxu

@bghira bghira added the bug Something isn't working label Dec 6, 2023
@bghira
Contributor Author

bghira commented Dec 6, 2023

For me, brighter images make it more noticeable.

DDIM

[image]

Euler (Patched)

[image]

@patrickvonplaten
Contributor

@yiyixuxu could you take a look here?

@yiyixuxu
Collaborator

hi @bghira
is this resolved by #6024?

@bghira
Contributor Author

bghira commented Dec 19, 2023

for euler yes, but that already appends an additional scheduler step

@Beinsezii
Contributor

Beinsezii commented Dec 19, 2023

I thought ddim had incorrect samples regardless of ZSNR or not. If the solution is to simply use euler and leave ddim broken then it may as well be deprecated.

The fact that euler needs an extra 0 sigma to avoid the residual noise issue and DPM has such options as euler_at_final leads me to believe there's a bigger problem with how the samplers are called, so either ddim and the rest all need Band-Aids or that off-by-one issue or whatever it is needs to be found.


This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@github-actions github-actions bot added the stale Issues that haven't received updates label Jan 15, 2024
@bghira
Contributor Author

bghira commented Jan 19, 2024

that stale bot is the worst! @patrickvonplaten it should probably just be removed from the project due to how many good issues just get juked.

@bghira
Contributor Author

bghira commented Jan 19, 2024

also kinda crazy this remains an issue for more than a month?

@patrickvonplaten
Contributor

Could it be that this issue is resolved with: #6477 ? cc @yiyixuxu can you check?

@patrickvonplaten patrickvonplaten removed the stale Issues that haven't received updates label Jan 19, 2024
@bghira
Contributor Author

bghira commented Jan 19, 2024

no, and that PR didn't fix the DPM multistep solver either; it still has residual noise. cc @AmericanPresidentJimmyCarter

@AmericanPresidentJimmyCarter
Contributor

AmericanPresidentJimmyCarter commented Jan 19, 2024

I am using the code from #6647 with DPMSolverMultistep, karras_timesteps, euler_at_final=True, and things like logos still have residual noise instead of outputting a flat colour as expected. Euler and DDIM do not seem to fix this

[image]

And here it is with DPMSolverMultistep, karras_timesteps, final_sigmas_type="denoise_to_zero"

[image]

@AmericanPresidentJimmyCarter
Contributor

AmericanPresidentJimmyCarter commented Jan 19, 2024

To easily reproduce the noise with solid colours, just prompt something like "Brooklyn pizza shop logo" and then open your fav image editor and crank brightness+contrast to see it clearly. Around the edges is probably jpg noise but I don't believe it all is.

[image]

[image: noise]

@bghira
Contributor Author

bghira commented Jan 19, 2024

the red dots are the "invisible" watermarker. #4014

@AmericanPresidentJimmyCarter
Contributor

AmericanPresidentJimmyCarter commented Jan 19, 2024

the red dots are the "invisible" watermarker. #4014

I am using:

class NoWatermark:
    def apply_watermark(self, img):
        return img
...
pipe.watermarker = NoWatermark

edit: Oh, I see.

class NoWatermark:
    def apply_watermark(self, img):
        return img
...
- pipe.watermarker = NoWatermark
+ pipe.watermark = NoWatermark

Now there is still some noise, but it is reduced.

[image]
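For reference, a self-contained version of that shim. It's usually assigned as an instance rather than the class so the method is bound (assumption: the SDXL pipeline calls `self.watermark.apply_watermark(images)` when a watermarker is set):

```python
class NoWatermark:
    """Stub watermarker that passes images through unchanged."""
    def apply_watermark(self, img, *args, **kwargs):
        return img

# Assign an *instance* (NoWatermark()), not the class, so that
# apply_watermark receives the image as its argument.
no_wm = NoWatermark()
out = no_wm.apply_watermark("fake-image-object")
print(out)  # fake-image-object
```

If the class itself is assigned, `apply_watermark(img)` binds `img` to `self` and raises a `TypeError` for the missing argument.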

@djdookie

I think I have a similar or related issue.

I created an image with diffusers and auto1111 with the same parameters, but got different images, with diffusers being worse quality (especially more noise).
Does anyone have an idea what could make that difference?

Relevant diffusers code with parameters:

pipe = StableDiffusionXLPipeline.from_single_file(".\models\Stable-diffusion\sdxl\sd_xl_base_1.0_0.9vae.safetensors", torch_dtype=torch.float16)
prompt = "concept art Amber Temple, snow, frigid air, snow-covered peaks of the mountains, dungeons and dragons style, dark atmosphere . digital artwork, illustrative, painterly, matte painting, highly detailed"
negative_prompt = "photo, photorealistic, realism, ugly"
pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)
image = pipe(prompt, negative_prompt=negative_prompt, guidance_scale=8, num_inference_steps=20, width=1024, height=1024, generator=torch.Generator(device='cuda').manual_seed(1337), use_karras_sigmas=True).images[0]

Auto1111 (DPM++ 2M Karras):
[image]
diffusers v0.25.1:
[image]

Slightly different results.
#6477 seemed like it would fix the issue, but didn't. #6295 didn't help either.

@patrickvonplaten
Contributor

@djdookie, I think you have a typo in your code snippet. Note that you should pass use_karras_sigmas=True to the from_config(...) call, not to the pipeline call. The code snippet should look as follows:

pipe = StableDiffusionXLPipeline.from_single_file(".\models\Stable-diffusion\sdxl\sd_xl_base_1.0_0.9vae.safetensors", torch_dtype=torch.float16)
prompt = "concept art Amber Temple, snow, frigid air, snow-covered peaks of the mountains, dungeons and dragons style, dark atmosphere . digital artwork, illustrative, painterly, matte painting, highly detailed"
negative_prompt = "photo, photorealistic, realism, ugly"
pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config, use_karras_sigmas=True)
image = pipe(prompt, negative_prompt=negative_prompt, guidance_scale=8, num_inference_steps=20, width=1024, height=1024, generator=torch.Generator(device='cuda').manual_seed(1337)).images[0]

see the diff:

- pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)
+ pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config, use_karras_sigmas=True)

@patrickvonplaten
Contributor

When I run the correct code in a colab, I'm getting good results: https://colab.research.google.com/drive/1IXZRZk6TYVG9uTDjocsUEfynfp5gyeoe?usp=sharing

(make sure to use current diffusers main here)

The resulting image looks very similar to A1111
[image]

@bghira
Contributor Author

bghira commented Jan 23, 2024

using karras sigmas is incompatible with zero-terminal SNR, no? i wouldn't say it looks very similar other than compositionally. the contrast is totally washed out

@djdookie

djdookie commented Jan 24, 2024

@patrickvonplaten Good finding. This indeed solved my issue. And I don't have washed out colors btw. Thank you so much!
Before:
[image]
After code correction:
[image]
Still slightly different than the A1111 image I posted earlier, but quality is good again and remaining noise is gone.

@bghira
Contributor Author

bghira commented Jan 24, 2024

that still has residual noise in the sky; you can see the splotchy colouring there. try retrieving a vector style image or any of the demo prompts from above.

@patrickvonplaten
Contributor

patrickvonplaten commented Jan 26, 2024

that still has residual noise in the sky; you can see the splotchy colouring there. try retrieving a vector style image or any of the demo prompts from above.

I don't see any splotchy colouring tbh, but maybe I'm also just getting old and my vision is weaker than it used to haha

@AmericanPresidentJimmyCarter
Contributor

using karras sigmas is incompatible with zero-terminal SNR, no? i wouldn't say it looks very similar other than compositionally. the contrast is totally washed out

Not in my experience

@Beinsezii
Contributor

I don't see any splotchy colouring tbh, but maybe I'm also just getting old and my vision is weaker than it used to haha

EulerDiscreteScheduler, DDIMScheduler...
[image]
...EulerDiscreteScheduler(use_karras_sigmas=True), DPMSolverMultistepScheduler(use_karras_sigmas=True)

  • positive: flat vector artwork of a kitten looking up at the night sky
  • negative: blurry
  • Model: ptx0/terminus-xl-gamma-training (V-PRED ZSNR)
  • seed: 1 (cpu f32)
  • guidance 8 + 0.7 rescale
  • 30 steps

Using diffusers master d4c7ab7 with my own app I think this is a fairly obvious demonstration that both DDIM and probably DPM have timestep issues. DPM doesn't have a ZSNR patch yet so it'll naturally have less contrast.

@bghira
Contributor Author

bghira commented Jan 28, 2024

that still has residual noise in the sky; you can see the splotchy colouring there. try retrieving a vector style image or any of the demo prompts from above.

I don't see any splotchy colouring tbh, but maybe I'm also just getting old and my vision is weaker than it used to haha

@patrickvonplaten i understand, it's something that you have to see quite a lot to really recognise it.

one oddity is that the same seed has the same splotchy pattern across every image. it's simply some deterministic noise being added/not removed completely

@yiyixuxu
Collaborator

@bghira
for ddim, do you want to open a PR with your fix so we can start from there?
This issue is getting a little bit confusing now since we are also talking about issues across many schedulers

In some brief tests, it looks like simply adding an extra timestep with a zero sigma to the end of the schedule resolves the problem.

@bghira
Contributor Author

bghira commented Jan 29, 2024

no, i haven't had great experiences opening PRs for this project for the last handful of months; they become stale and close automatically.

@yiyixuxu
Collaborator

Hi @bghira

I reopened this one #5969 - are there any other issues from your project that have been automatically closed? please let me know

I'm sorry that we let perfectly good issues go stale. This particular issue is a relatively low priority for me and I haven't been able to find time to work on it because
(1) DDIM is not a common choice for SDXL
(2) scheduler PRs are the most time-consuming

I should have been more upfront about this and should be more clear about the expectations. I'm sorry and I will do better next time. And please be a little bit patient with us in the meantime. Thanks

YiYi

@bghira
Contributor Author

bghira commented Jan 30, 2024

well since that time, a colleague has ported zero-terminal SNR to Euler; DDIM was the only choice until then. i don't personally need this fixed; i don't think ddim is very useful considering euler works basically the same. if you wanted to simply remove ddim, i would think that's fine

@spezialspezial
Contributor

the contrast is totally washed out

@bghira The washed out colors in SDXL are likely from this issue: #6753

@bghira
Contributor Author

bghira commented Jan 30, 2024

it could be for some cases, but in this one the user didn't move away from from_single_file and their results had less of a contrast issue

@yiyixuxu
Collaborator

@bghira

if you wanted to simply remove ddim i would think thats fine

DDIM is still used a lot, no? just not a popular choice with SDXL I think. maybe we can add a note in our doc?

@bghira
Contributor Author

bghira commented Jan 30, 2024

@bghira

if you wanted to simply remove ddim i would think thats fine

DDIM is still used a lot, no? just not a popular choice with SDXL I think. maybe we can add a note in our doc?

ideally it would be mapped to euler so that the behaviour remains the same for end users. ComfyUI just did this a few months back to reduce duplicate code maintenance overhead as well.

@Beinsezii
Contributor

Beinsezii commented Jan 30, 2024

ideally it would be mapped to euler so that the behaviour remains the same for end users. ComfyUI just did this a few months back to reduce duplicate code maintenance overhead as well.

FWIW he only removed the ddim sampler, there's still the "ddim_uniform" scheduler which gives a different sigma spread for what is now the Euler sampler.
Scheduler code is just different sigma spacings

@patrickvonplaten
Contributor

I think we should move this issue to a discussion. We're talking about the quality of different schedulers and scheduler variants such as ddim_uniform here.

@Beinsezii
Contributor

So doing some more in-depth research I actually feel there's a few issues going on simultaneously.

1.

The official SAI config for SDXL has set_alpha_to_one: False despite using EulerDiscreteScheduler

So if you inherit the config such as

pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config)

It'll inherit the set_alpha_to_one=False value which is normally enabled by default in Diffusers.

2.

The actual diffusers documentation incorrectly specifies a set_alpha_to_one value for Euler, DDPM

...And probably more, as the documentation for the steps_offset kwarg is blindly copy-pasted across all the schedulers that have the option, explicitly stating that

You can use a combination of offset=1 and set_alpha_to_one=False to make the last step use step 0 for the previous alpha product like in Stable Diffusion

However, a cursory glance shows that this basically only applies to DDIM and maybe a few others. Euler, DDPM, DPM Multistep, etc. all do not contain a set_alpha_to_one kwarg so the value is silently dropped until someone inherits the configuration later.

3.

Why recommend steps_offset=1 and set_alpha_to_one=False?

The documentation implies that steps_offset=1 and set_alpha_to_one=True are contradictory solutions to a final-timestep problem; however, on SDXL that doesn't seem to be the case at all. steps_offset=1 changes the image very slightly, and set_alpha_to_one=False just adds residual noise regardless of the offset, since the final cumprod is clamped down to the maximum index, which is something like 0.997 instead of 1.0.
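Back-of-the-envelope on that clamping: the final deterministic DDIM update is x_prev = sqrt(alpha_bar_prev) * x0_pred + sqrt(1 - alpha_bar_prev) * eps_pred, so any alpha_bar_prev below 1.0 leaves a fixed fraction of the predicted noise in the output (pure arithmetic, using the ~0.997 figure above):

```python
import math

# Final DDIM step (eta=0): x_prev = sqrt(a_prev)*x0 + sqrt(1 - a_prev)*eps.
# With set_alpha_to_one=True the previous alpha-bar is exactly 1.0;
# with set_alpha_to_one=False it is clamped to alphas_cumprod[0] (~0.997).
for alpha_bar_prev in (1.0, 0.997):
    noise_coeff = math.sqrt(1.0 - alpha_bar_prev)
    print(f"alpha_bar_prev={alpha_bar_prev}: noise coefficient {noise_coeff:.4f}")
```

That ~0.055 coefficient on eps_pred is a visible noise floor, which matches the "splotchy" residual noise reported above.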

Additionally, steps_offset=1 doesn't even apply to the trailing timestep spacing that most of the major UIs use nowadays, including ComfyUI which is more or less SAI's reference implementation based on their usage in Discord. With leading the difference is still so small I'm wondering what the point even is. Maybe it's only useful for the older SD pipelines?
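To illustrate why trailing spacing sidesteps the question entirely, here is a sketch paraphrasing the two spacing strategies (not the exact diffusers implementation): leading strides up from t=0 and never reaches the terminal trained timestep 999, while trailing is anchored at it.

```python
import numpy as np

num_train_timesteps, num_inference_steps = 1000, 30

# "leading": integer stride starting from t=0; even steps_offset=1 only
# shifts the whole schedule up by one (max becomes 958, still short of 999).
stride = num_train_timesteps // num_inference_steps
leading = (np.arange(num_inference_steps) * stride)[::-1]

# "trailing": fractional stride anchored at the final trained timestep.
trailing = np.rint(
    np.arange(num_train_timesteps, 0, -num_train_timesteps / num_inference_steps)
).astype(int) - 1

print(leading[0], trailing[0])  # 957 999
```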

The following figure of SDXL-Base images contains two rows: leading and trailing timesteps with the following columns

  1. DDIMScheduler(steps_offset=0, set_alpha_to_one=False)
  2. DDIMScheduler(steps_offset=1, set_alpha_to_one=False)
  3. DDIMScheduler(steps_offset=0, set_alpha_to_one=True)
  4. DDIMScheduler(steps_offset=1, set_alpha_to_one=True)
  5. EulerDiscreteScheduler(steps_offset=0)
  6. EulerDiscreteScheduler(steps_offset=1)

[image]

It's extremely obvious that the issue plaguing @bghira and myself is set_alpha_to_one=False being inherited from the XL base config. Additionally, I'm not convinced that steps_offset=1 is ever exclusively preferable as a replacement for set_alpha_to_one=True (or even at all, really), as it performs identically regardless of whether the final cumprod is overridden or not.

How did we get here?

My theory based on the above is that SAI initially opted to use DDIMScheduler for SDXL on Huggingface, as that's the scheduler in their paper. They set steps_offset=1 and set_alpha_to_one=False per the recommendation in the Diffusers documentation, as that's the setup "like in Stable Diffusion". After dealing with horribly noisy images, they changed the scheduler to EulerDiscreteScheduler in their config and left the rest as-is because it seemed to work (the documentation implies set_alpha_to_one=False should still be used on Euler, despite the kwarg not existing there). So now the set_alpha_to_one=False left over in their scheduler config pollutes downstream uses that inherit from the XL base config.

So, how do we fix it? To be honest, I'm not 100% sure.

Recommend timestep_spacing="trailing" instead of steps_offset=1, set_alpha_to_one=False?

Maybe @patrickvonplaten or @yiyixuxu have a good solution?

No matter what happens with set_alpha_to_one and steps_offset, the documentation on all the schedulers needs some refactoring. People shouldn't be setting kwargs that don't exist.

@yiyixuxu
Collaborator

I moved this to a discussion here so more people can participate #6931

@github-actions github-actions bot added the stale Issues that haven't received updates label Mar 6, 2024
@yiyixuxu yiyixuxu removed the stale Issues that haven't received updates label Mar 9, 2024
@github-actions github-actions bot added the stale Issues that haven't received updates label Apr 2, 2024
@Beinsezii
Contributor

Can probably close this now @bghira

I changed the docs in #7128 so unless one of the HF team is going to open a PR on xl-base to edit the config I don't think there's much else to do.

@github-actions github-actions bot removed the stale Issues that haven't received updates label Apr 3, 2024
@bghira
Contributor Author

bghira commented Apr 3, 2024

@apolinario can you do that? i'll close this one out. but we need the SAI configs updated

@bghira bghira closed this as completed Apr 3, 2024
Projects
None yet
Development

No branches or pull requests

7 participants