Explaining `use-me.ps1`

IF YOU ARE UNSURE ABOUT WHAT A SETTING SHOULD BE SET TO, LEAVE IT AS IT IS BY DEFAULT.

Directories

$dataset_dir = ".\.dataset"
$output_dir = ".\.output"
$class_dir = ".\.class_img"
$base_model_dir = "C:\your\models\folder\here"
$prompts = ""; # Leave blank if you don't want sample images
$log_dir = "logs";

The locations of your folders on your PC. $class_dir and $prompts are optional; the rest are required.
If you have installed sd-scripts through our method, then all directories except $base_model_dir have been set for you already.

Note that these directories can point anywhere you want on your PC, even outside of sd-scripts. Just be sure that PowerShell is targeting the sd-scripts folder when you run use-me.ps1 (this can be ensured by keeping use-me.ps1 inside sd-scripts and running it directly), as shown below.
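
For example, from a pre-opened PowerShell window you can change into the sd-scripts folder first and run the script from there (the path here is a placeholder, not a default):

# Placeholder path; point this at wherever your sd-scripts folder actually lives.
cd C:\path\to\sd-scripts
.\use-me.ps1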

Base Model Config

$base_model     = "v1-5-pruned-emaonly.safetensors"
$clip_skip      = 1
$v_prediction   = $false
$sdxl           = $false

The properties of the base model you are finetuning from. Set each option to match your chosen model.
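
For instance, a setup for an SDXL base model might look like the following (the filename and values are illustrative; set them to whatever your particular model requires):

$base_model     = "sd_xl_base_1.0.safetensors"
$clip_skip      = 1
$v_prediction   = $false
$sdxl           = $true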

Training Config

This is where the majority of your adjustments will be made.

Output Settings

$lora_name      = "myFirstLoRA"
$version        = "prototype"
$comment        = "This is my first LoRA"
$save_amount    = 10 # 0 to disable

How the output LoRA files are named and saved.

  • $lora_name is where you set the name of your LoRA file. This is not your instance token, and has no influence on LoRA reliability.
  • $version sets the version name/number. Can be any text (e.g., v0.5, adamw8bit-with-8-net-dim). This is optional.
  • $comment embeds a text comment into the metadata to provide further information. This is optional.
  • $save_amount determines the number of saves to be made over the training run. Regular saves can be disabled by setting it to 0.
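
To illustrate the arithmetic (a sketch only; the actual save logic lives inside use-me.ps1), 10 saves over a 2000-step run would land roughly every 200 steps, assuming the interval is derived as total steps divided by save amount:

# Illustrative arithmetic only, assuming save interval = total steps / save amount.
$total_steps = 2000
$save_amount = 10
$save_every  = [math]::Floor($total_steps / $save_amount)   # 200 steps between saves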

Dataset Treatment

$base_res       = 512
$bucket_step    = 64
$flip_aug       = $true
$keep_tags      = 1

How your dataset is processed.

  • $base_res sets the target resolution for all dataset images (a total pixel count of $base_res squared). The default of 512 is an absolute minimum.
  • $bucket_step sets the number of pixels between bucketing resolutions (e.g., a base res of 768 with a bucket step of 64 means bucketing resolutions can be 704, 640, 576, etc., OR 832, 896, 960, etc.; see the sketch after this list).
  • $flip_aug randomly flips the images used in the training process to artificially increase dataset diversity. Should be turned off if training an intentionally asymmetrical concept (usually a character, potentially an object or pose).
  • $keep_tags locks N tags from the front of each caption file, stopping them from being shuffled. Intended for use with instance/class tokens.
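
Below is a rough sketch of the bucketing arithmetic from the $bucket_step example, stepping up and down from the base resolution (illustrative only; the real bucketing is handled by sd-scripts):

# Illustrative only: list a few bucket widths above and below a 768 base resolution.
$base_res    = 768
$bucket_step = 64
$buckets     = -3..3 | ForEach-Object { $base_res + ($_ * $bucket_step) }
$buckets     # 576, 640, 704, 768, 832, 896, 960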

Steps

$total_steps    = 2000
$batch_size     = 1
$grad_acc_step  = 1
$warmup         = 0.1 # 0 to disable
$warmup_type    = "percent" # "percent", "steps", "steps_batch"

Anything relevant to the (number of) steps taken by the training.

  • $total_steps is the total number of iterations to run on the LoRA training.
  • $batch_size averages the gradient updates of N steps. Higher values result in faster training, at the cost of VRAM.
  • $grad_acc_step accumulates N gradients and averages them, similar to batch size but without VRAM usage. Training is not faster with higher values.
  • $warmup sets the amount of warmup applied to the training; values less than 1 are treated as a percentage of steps, while integers greater than 1 are treated as a raw number of steps (see $warmup_type below).
  • $warmup_type determines how $warmup's value is treated. "percent" requires a decimal below 1.0. "steps" is a static step count, unaffected by batch size. "steps_batch" is a step count that scales down with batch size (see the sketch after this list).
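
As a sketch of how the warmup value might convert into a step count (an assumption about the implementation, shown only for the arithmetic):

# Illustrative only; the real conversion happens inside use-me.ps1.
$total_steps  = 2000
$warmup       = 0.1
$warmup_type  = "percent"
if ($warmup_type -eq "percent") { $warmup_steps = [math]::Round($total_steps * $warmup) }   # 200 warmup steps
# "steps" would take $warmup as a raw step count; "steps_batch" would divide that count by the batch size.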

Learning Rate

$unet_lr        = 1e-4
$text_enc_lr    = 5e-5
$scale_lr_batch = $true # Scale learning rate by batch size.
$lr_scheduler   = "cosine" # Recommended options are "cosine", "linear".

Regarding learning rates and their behaviour during training.

  • $unet_lr and $text_enc_lr affect the unet and text encoder learning rates respectively.
  • $scale_lr_batch multiplies the learning rates by batch size if set to true.
  • $lr_scheduler affects how the learning rates change across the training run. Cosine is a smooth "S" curve going from the set value above, all the way down to 0.
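
A small sketch of the two behaviours above: the batch-size scaling of the learning rate, and a cosine decay from the set value down to 0 (the formula is the generic cosine schedule, shown for illustration rather than taken from the script):

# Illustrative only.
$unet_lr        = 1e-4
$batch_size     = 4
$scale_lr_batch = $true
if ($scale_lr_batch) { $unet_lr = $unet_lr * $batch_size }   # 4e-4 with a batch size of 4

# Generic cosine decay at step $t of $total_steps: lr(t) = lr0 * 0.5 * (1 + cos(pi * t / total_steps))
$total_steps = 2000; $t = 1000
$lr_at_t = $unet_lr * 0.5 * (1 + [math]::Cos([math]::PI * $t / $total_steps))   # half the peak LR at the midpoint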

Network

$net_dim        = 16
$net_alpha      = 16
$optimiser      = "adamw" # Recommended options are "adamw", "adamw8bit", "dadaptadam".
$correct_alpha  = $false # Apply scaling to alpha, multiplying by sqrt($net_dim)

Settings for the LoRA network and its optimiser.

  • $net_dim sets the network dimensions/rank.
  • $net_alpha sets the network alpha.
  • $optimiser chooses the optimiser. Selecting DAdaptation (or one of its variants) or Prodigy will automatically set both learning rates to 1.0; you do not need to set them manually.
  • $correct_alpha applies a toggle to "fix" how alpha is scaled in LoRA training. Experimental option; has not been tested much.
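
As a hedged sketch of how these two values interact: LoRA conventionally scales its trained weights by alpha divided by dim, and $correct_alpha (as described above) multiplies alpha by sqrt($net_dim). This is illustrative arithmetic, not code from the script:

# Illustrative only.
$net_dim       = 16
$net_alpha     = 16
$correct_alpha = $false
$alpha = if ($correct_alpha) { $net_alpha * [math]::Sqrt($net_dim) } else { $net_alpha }
$scale = $alpha / $net_dim   # conventional LoRA weight scaling factor (1.0 when alpha equals dim)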

Performance

$grad_checkpt   = $false
$full_fp16      = $false
$fp8_base       = $false

For improving performance.

  • $grad_checkpt enables gradient checkpointing, a strong VRAM optimisation. Turn this on if encountering out of memory errors.
  • $full_fp16 is an additional VRAM-saving option. On its own it saves less than gradient checkpointing, but the saving becomes extremely strong when the two are combined.
  • $fp8_base converts the base model to fp8, saving additional VRAM.
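
If you are running out of memory, the strongest combination described above would look like this:

$grad_checkpt   = $true
$full_fp16      = $true
$fp8_base       = $false   # optionally $true for further VRAM savings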

Debugging

$warnings       = $true
$is_lr_free     = $false
$pause_at_end   = $true
$deactivate     = $true
$precision      = "fp16"
$class_2x_steps = $true

Technical stuff; do not touch unless you know what you're doing.

  • $warnings toggles whether use-me.ps1 will issue warnings about settings that appear to be misconfigured, whether intentionally or accidentally.
  • $is_lr_free is a manual toggle that applies all the arguments typically used for learning-rate-free optimisers such as DAdaptation and Prodigy, for cases where a particular optimiser is not accounted for (e.g., an unrecognised DAdaptation variant or a new LR-free optimiser).
  • $pause_at_end pauses the script when it finishes running. Running a PowerShell script via right click closes the PowerShell window as soon as the process finishes, which prevents you from reading any error messages. If you run the script from a pre-opened PowerShell window, this can be set to false.
  • $deactivate will deactivate the venv when the process finishes running.
  • $precision will dictate what floating point precision type is used by the training process. Options include fp16, fp32, and bf16 (RTX 30xx cards and higher only).
  • $class_2x_steps will double the number of steps if class images are detected as being in use. Can be toggled off if one wishes to set their own number of steps.
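
As a small sketch of the $class_2x_steps behaviour (the doubling itself is from the description above; where exactly the script applies it is an assumption):

# Illustrative only.
$total_steps         = 2000
$class_2x_steps      = $true
$class_images_in_use = $true   # hypothetical flag standing in for the script's class-image detection
if ($class_2x_steps -and $class_images_in_use) { $total_steps = $total_steps * 2 }   # 4000 steps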

Advanced

$weight_decay   = 0.01
$seed           = 0
$cap_dropout    = 0
$net_dropout    = 0.1
$scale_weight   = 0
$d_coef         = 1.0

Advanced config options that would not normally be altered under most circumstances. Ideally, use-me.ps1 would not need to make these configurable; however, there are rare and/or niche cases where these values may need adjustment.

  • $weight_decay [WIP].
  • $seed is an integer that affects the order in which images are put into training by the dataloader, alongside other things(?).
  • $cap_dropout is 'caption dropout', a decimal below 1.0 setting the fraction of randomly chosen steps that are trained with no caption at all.
  • $net_dropout is 'network dropout', a <1.0 decimal value [WIP].
  • $scale_weight is scale weight norms, a restriction on how high the LoRA's weights can be. Lower values have stronger effects (0 is disabled).
  • $d_coef [WIP].