
DPMSolverMultistep add rescale_betas_zero_snr #7097

Merged: 7 commits into huggingface:main, Feb 27, 2024

Conversation

@Beinsezii (Contributor) commented:

Effectively identical to the implementation for EulerDiscrete in #6024

TL;DR:

  • rescale_zero_terminal_snr function copied from DDIM
  • Fudge the last alphas_cumprod entry from 0 to 2**-24 when ZSNR is enabled so it doesn't produce an infinite sigma (see the sketch after this list)
  • Upcast samples to torch.float32 for the duration of step(), because the ZSNR sigmas are substantially less numerically stable and the performance hit is negligible.
    • Also has the side effect of producing slightly better images overall
  • use_karras_sigmas=True still produces strange results, as it does in EulerDiscrete. Might be worth investigating separately?
  • Simple true/false unit test for the new config value
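
For context, here is a minimal sketch of the two schedule changes listed above: the DDIM-style rescale_zero_terminal_snr helper plus the terminal alphas_cumprod fudge. Names follow the usual diffusers conventions, but this is a sketch rather than the exact merged diff.

```python
import torch


def rescale_zero_terminal_snr(betas: torch.Tensor) -> torch.Tensor:
    """Rescale a beta schedule so the final timestep has zero terminal SNR (DDIM-style)."""
    alphas = 1.0 - betas
    alphas_cumprod = torch.cumprod(alphas, dim=0)
    alphas_bar_sqrt = alphas_cumprod.sqrt()

    # Remember the original first/last values.
    alphas_bar_sqrt_0 = alphas_bar_sqrt[0].clone()
    alphas_bar_sqrt_T = alphas_bar_sqrt[-1].clone()

    # Shift so the last timestep becomes exactly zero, then rescale so the
    # first timestep keeps its original value.
    alphas_bar_sqrt -= alphas_bar_sqrt_T
    alphas_bar_sqrt *= alphas_bar_sqrt_0 / (alphas_bar_sqrt_0 - alphas_bar_sqrt_T)

    # Convert back to betas.
    alphas_bar = alphas_bar_sqrt**2
    alphas = alphas_bar[1:] / alphas_bar[:-1]
    alphas = torch.cat([alphas_bar[0:1], alphas])
    return 1.0 - alphas


# In the scheduler __init__ (sketch): alphas_cumprod[-1] is exactly 0 after the
# rescale, which would make the last sigma infinite, so nudge it to the smallest
# positive fp16 subnormal instead:
#     self.alphas_cumprod[-1] = 2**-24
```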

Demo images: dpm_compare_labels (comparison image attached to the PR)

use_karras_sigmas also seems to generally produce bad/noisy results even without ZSNR, so I might not have the config set up correctly?

@yiyixuxu (Collaborator) left a comment:


oh thanks
left a question about dtype

@@ -880,6 +930,11 @@ def step(
         if self.step_index is None:
             self._init_step_index(timestep)

+        # store old dtype because model_output isn't always the same it seems
+        return_dtype = sample.dtype
@yiyixuxu (Collaborator):

do you mean model_output.dtype isn't always the same as sample.dtype before the upcast?

@Beinsezii (Contributor, Author) replied on Feb 26, 2024:

Turns out that was because model_output = self.convert_model_output(model_output, sample=sample) ends up creating a shadowed tensor cast to the sample's dtype. I moved the sample upcast after this call so return_dtype is no longer needed. Outputs are the same.
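
The resulting ordering inside step() looks roughly like this (a sketch under the assumption described above that convert_model_output() returns a tensor in sample's dtype; not the exact merged diff):

```python
import torch

def step(self, model_output, timestep, sample):  # abridged signature, sketch only
    if self.step_index is None:
        self._init_step_index(timestep)

    # convert_model_output() returns a tensor already cast to sample's dtype, so
    # the upcast can happen after this call and return_dtype is no longer needed.
    model_output = self.convert_model_output(model_output, sample=sample)
    sample = sample.to(torch.float32)

    # ... multistep update runs in float32 (fp16 inputs get promoted on the fly) ...

    # the final prev_sample is cast back to model_output.dtype before returning
```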

@yiyixuxu (Collaborator):

use_karras_sigmas=True still produces strange results as it does in EulerDiscrete. Might be worth investigating separately?

yes separate issue please:)

@yiyixuxu (Collaborator) left a comment:

thanks!

@yiyixuxu (Collaborator):

can you fix the quality test?

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@Beinsezii (Contributor, Author):

Alright, I added a single newline.

@yiyixuxu yiyixuxu merged commit 2e31a75 into huggingface:main Feb 27, 2024
13 checks passed
@yiyixuxu (Collaborator):

thank you!

@lachlan-nicholson (Contributor):

use_karras_sigmas=True still produces strange results as it does in EulerDiscrete. Might be worth investigating separately?

yes separate issue please:)

Have we created a separate issue for ZeroSNR + Karras?
At a glance the max lambda is very large and could be clipped similarly to #3314, potentially via a config.sigma_max.
This gives viable results but introduces a new parameter.
I'm left wondering what it means to apply both the ZeroSNR and Karras modifiers at the same time.
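
Purely as an illustration of the clipping idea above (sigma_max here is a hypothetical new knob, not an existing parameter of DPMSolverMultistepScheduler):

```python
import torch

def clamp_zsnr_sigmas(sigmas: torch.Tensor, sigma_max: float = 1000.0) -> torch.Tensor:
    # Cap the near-infinite terminal sigma that a zero-SNR schedule produces so a
    # subsequent Karras resampling sees a finite [sigma_min, sigma_max] range.
    # The cap itself is the "new parameter" trade-off mentioned above.
    return sigmas.clamp(max=sigma_max)
```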
