Dockerized way to fine-tune the FLUX.1-dev model using the Dreambooth method, with memory-efficient and configurable training.
Dockerhub: kopyl/train-flux-kohya-sd-scripts
Simplified training run guide.
Throughout my ML career I have found that the best way to train diffusion models is with kohya's sd-scripts repo.
Some of the advantages I like most:
- Far fewer bugs than the training scripts in the huggingface/diffusers repo;
- The latest and greatest models to train;
- Amazing speed optimization, which I could not achieve myself by rewriting huggingface/diffusers' Flux training code;
- The architecture and the code are not very difficult to understand.
While sd-scripts is really nice to use, you have to spend a while setting up even a simple Flux Dreambooth training environment, which basically means:
- Installing everything the repo requires, sometimes even a newer version of Python. I also always had to install additional packages that are not among the ones required by the sd-scripts repo, like `torchvision` and `opencv`;
- Copying all the models into your project environment (and finding them on the internet if you don't happen to casually store 30 GB of data on your computer).
- Deploy a container (or run on your own machine);
- Open a shell inside the container with `docker exec -it {name} bash`;
- Upload photos of your subject to the `dataset` directory and, if needed, change the subject's identifier (like `sks woman`) in `dataset-config.toml`;
- Run the training script with `bash run-training.sh`;
- When the training is finished, you will find the trained Flux model (transformer type) in the `/output` directory of the container.
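The last two steps can also be driven from the host instead of an interactive shell. The sketch below only builds the `docker` CLI calls; the function name, the container name `flux-trainer` and the host directory `./flux-output` are hypothetical, and it assumes the container is already running and that `run-training.sh` is reachable from the container's default working directory.

```python
import shlex


def training_commands(container: str, local_output: str = "./flux-output") -> list[list[str]]:
    """Build the docker CLI calls for running the training and fetching the result."""
    return [
        # Run the training script inside the already running container.
        ["docker", "exec", container, "bash", "run-training.sh"],
        # Copy the container's /output directory (trained model) to the host.
        ["docker", "cp", f"{container}:/output", local_output],
    ]


if __name__ == "__main__":
    # "flux-trainer" is a placeholder container name.
    for cmd in training_commands("flux-trainer"):
        print(shlex.join(cmd))
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` once the printed commands look right for your setup.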
- Remove the `--apply_t5_attn_mask` parameter if you prefer speed. The attention mask slightly increases quality and slightly reduces training speed (in my measurements on a server with an NVIDIA H100: 2.53 s/it with the attention mask and 1.91 s/it without, i.e. about 32% more time per iteration). So far I think it's worth the time sacrifice. VRAM usage is about the same;
- Change the `--save_every_n_epochs` parameter;
- Change the `--max_train_epochs` parameter. For an optimal training process I recommend keeping it at 300;
- Change the `--sample_every_n_epochs` parameter;
- Change the `--learning_rate` parameter;
- Adjust the contents of the `sample_prompts.txt` file to fit the subject token. For example, if the class token is `sks man`, the prompts start with `a photo of sks man...`, and I'm training a model on a woman, then I'd need to change the `sks man` part in all prompts to `sks woman`. Also change the class token in the `dataset-config.toml` file. (prompt formatting syntax)
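Swapping the subject token by hand in every prompt is easy to get wrong, so here is a minimal sketch of doing it programmatically. The helper name `swap_subject_token` is hypothetical, and the sample prompts in the snippet are made up for illustration; only the `sks man` / `sks woman` tokens come from the text above.

```python
import re


def swap_subject_token(text: str, old: str = "sks man", new: str = "sks woman") -> str:
    """Replace every whole-phrase occurrence of the old subject token."""
    # The lookarounds keep "sks man" from matching inside e.g. "sks mannequin".
    pattern = re.compile(rf"(?<!\w){re.escape(old)}(?!\w)")
    # A function replacement avoids backslash interpretation in `new`.
    return pattern.sub(lambda _match: new, text)


# Illustrative prompt lines, not the real contents of sample_prompts.txt.
prompts = "a photo of sks man in a park\na portrait of sks man, studio light\n"
print(swap_subject_token(prompts))
```

To apply it to the real files, read `sample_prompts.txt` with `pathlib.Path.read_text()`, pass the text through the helper and write it back. The same call should work for `dataset-config.toml`, assuming the class token appears there as the same phrase.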
- Make sure you have the latest version of diffusers from the official repo;
- Import torch and the required diffusers classes:
import torch
from diffusers import FluxTransformer2DModel, FluxPipeline
- Load the transformer model you just trained:
transformer = FluxTransformer2DModel.from_single_file("finetuned-model.safetensors", torch_dtype=torch.bfloat16)
- Finally, load the base black-forest-labs/FLUX.1-dev pipeline, swapping its main component (the transformer) for the one you just trained:
pipe = FluxPipeline.from_pretrained("models/flux-dev-model", transformer=transformer, torch_dtype=torch.bfloat16)
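Put together, the inference steps might look like the sketch below. The function name `generate_image`, the output file name, the seed and the step and guidance values are illustrative assumptions, not settings from this repo; it also assumes a CUDA GPU with enough VRAM and the same file paths as in the snippets above.

```python
def generate_image(
    prompt: str,
    transformer_path: str = "finetuned-model.safetensors",
    base_model: str = "models/flux-dev-model",
    out_path: str = "sample.png",  # hypothetical output file name
    steps: int = 28,               # illustrative value, tune to taste
    guidance_scale: float = 3.5,   # illustrative value, tune to taste
    seed: int = 0,
) -> str:
    """Generate one image with the fine-tuned transformer and save it."""
    # Heavy imports stay inside the function so this file can be imported
    # on a machine without torch or a GPU.
    import torch
    from diffusers import FluxPipeline, FluxTransformer2DModel

    # Load the fine-tuned transformer from the single-file checkpoint.
    transformer = FluxTransformer2DModel.from_single_file(
        transformer_path, torch_dtype=torch.bfloat16
    )
    # Load the base pipeline and swap in the fine-tuned transformer.
    pipe = FluxPipeline.from_pretrained(
        base_model, transformer=transformer, torch_dtype=torch.bfloat16
    )
    pipe.to("cuda")  # assumption: a CUDA device is available

    # A seeded generator makes the result reproducible.
    generator = torch.Generator("cpu").manual_seed(seed)
    image = pipe(
        prompt,
        num_inference_steps=steps,
        guidance_scale=guidance_scale,
        generator=generator,
    ).images[0]
    image.save(out_path)
    return out_path
```

For example, `generate_image("a photo of sks woman in a park")` would write `sample.png` next to the script. Use the same subject token in the prompt that you trained with.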