Add Timestep Sampling Function from SD3 Branch to SD #1668
Closed
Fused backward pass
Add "--disable_mmap_load_safetensors" parameter
Display name of error latent file
removed unnecessary `torch` import on line 115
Fix caption_separator missing in subset schema
Add caption_separator to output for subset
Accelerate: fix get_trainable_params in controlnet-llite training
Hyperparameter tracking
Make timesteps work in the standard way when Huber loss is used
New optimizer: AdEMAMix8bit and PagedAdEMAMix8bit
It looks like this PR is trying to merge all the changes from the sd3 branch into main. Please make any fixes within the sd3 branch.
Got it. I opened #1671 and am closing this one.
This PR introduces the `timestep_sampling` feature from the sd3 branch into the original SD model. The new timestep sampling options offer a more concentrated probability distribution than the default uniform sampling, which helps the model focus on specific aspects of learning. The new options are selected with the `--timestep_sampling` argument:

- `uniform`: Keeps the original uniform timestep sampling, which distributes the learning steps evenly.
- `shift`: By default, `--discrete_flow_shift` = 1.

Additional parameters:

- `--discrete_flow_shift`: Defaults to `1`. Timesteps are sampled from a random normal distribution. A rightward shift helps the model focus more on style, while a leftward shift aids in learning specific objects. The default value of `1` gives a balanced form.
- `--sigmoid_scale`: Adjusts the shape of the sigmoid function.

While `flux_shift` maintains the distortion effects of FLUX, its behavior may differ in SD due to differences in the nature of the model and training at fixed resolutions.
It is recommended to use the `--timestep_sampling sigmoid` option combined with `--soft_min_snr_gamma = 1` (by rockerBOO, #1068) for better results, as these settings seem to significantly improve model performance.
I suggest merging this PR along with the soft min SNR gamma PR.
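For reviewers, here is a minimal sketch of how the sampling modes described above could behave, written with only the Python standard library. It is not taken from this PR's diff: the function name `sample_timestep`, its parameters, the 1000-step range, and the shift formula `t * s / (1 + (s - 1) * t)` are assumptions for illustration. The `flux_shift` mode is left out.

```python
import math
import random


def sample_timestep(mode="uniform", sigmoid_scale=1.0, discrete_flow_shift=1.0,
                    num_train_timesteps=1000, rng=random):
    """Return one integer timestep in [0, num_train_timesteps - 1].

    Hypothetical helper for illustration only; not the sd-scripts API.
    """
    if mode == "uniform":
        # Every timestep is equally likely.
        t = rng.random()
    elif mode in ("sigmoid", "shift"):
        # Draw from a standard normal, scale it, and squash into (0, 1).
        # A larger sigmoid_scale spreads samples toward both ends of the range.
        z = rng.gauss(0.0, 1.0) * sigmoid_scale
        t = 1.0 / (1.0 + math.exp(-z))
        if mode == "shift":
            # Assumed shift formula: s > 1 moves mass toward higher timesteps,
            # s < 1 toward lower ones, and s == 1 leaves t unchanged.
            s = discrete_flow_shift
            t = (t * s) / (1.0 + (s - 1.0) * t)
    else:
        raise ValueError(f"unknown timestep sampling mode: {mode!r}")
    # Map the fraction t to a discrete timestep index.
    return min(int(t * num_train_timesteps), num_train_timesteps - 1)


if __name__ == "__main__":
    rng = random.Random(0)
    for mode in ("uniform", "sigmoid", "shift"):
        samples = [sample_timestep(mode, discrete_flow_shift=3.0, rng=rng)
                   for _ in range(10000)]
        print(mode, sum(samples) / len(samples))
```

In this sketch, `shift` with `discrete_flow_shift=1` reduces to `sigmoid`, which matches the description of `1` as the balanced default.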