
Caching latents and Text Encoder outputs with multiple GPUs #1690

Merged
kohya-ss merged 11 commits into sd3 from multi-gpu-caching on Oct 13, 2024

Conversation

kohya-ss
Owner

No description provided.

@FurkanGozukara

Awesome. Will this work automatically when multiple GPUs are used?

@kohya-ss
Owner Author

kohya-ss commented Oct 12, 2024

> Awesome. Will this work automatically when multiple GPUs are used?

Yes😀 In FLUX.1, the Text Encoder cache also takes time, so we've made it compatible with multiple GPUs. We'd appreciate it if you could test it.

Please note that --highvram is needed for faster caching.
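
Roughly, the idea is that each process started by `accelerate launch` caches its own share of the dataset. The sketch below is only an illustration of that scheme, not the code in this PR; `vae`, `image_batches`, and `cache_dir` are placeholders, and the encode call assumes a diffusers-style autoencoder.

```python
# Illustrative sketch: shard latent caching across GPUs with Hugging Face accelerate.
# Assumption: `vae` follows the diffusers AutoencoderKL interface; paths are placeholders.
import torch
from accelerate import Accelerator

def cache_latents_multi_gpu(vae, image_batches, cache_dir):
    accelerator = Accelerator()
    vae.to(accelerator.device)
    for i, batch in enumerate(image_batches):
        # Round-robin assignment: each process encodes every Nth batch,
        # so the dataset is covered once with no duplicated work.
        if i % accelerator.num_processes != accelerator.process_index:
            continue
        with torch.no_grad():
            latents = vae.encode(batch.to(accelerator.device)).latent_dist.sample()
        torch.save(latents.cpu(), f"{cache_dir}/latents_{i:06d}.pt")
    # Block until every GPU has finished writing its share of the cache.
    accelerator.wait_for_everyone()
```

Run under `accelerate launch --num_processes <number of GPUs>`, so each process receives its own `process_index`.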

kohya-ss changed the title Caching latents with multiple GPUs → Caching latents and Text Encoder outputs with multiple GPUs on Oct 12, 2024
@FurkanGozukara

@kohya-ss Those --highvram and --lowvram flags made zero impact in my previous tests. What do they actually do? I tested both for FLUX fine-tuning and FLUX LoRA training.

I can test FLUX LoRA multi-GPU caching. Fine-tuning still requires 80 GB GPUs; the fused backward pass is not working.

@kohya-ss
Owner Author

> @kohya-ss Those --highvram and --lowvram flags made zero impact in my previous tests. What do they actually do? I tested both for FLUX fine-tuning and FLUX LoRA training.

Currently, --highvram only affects caching of latents, and --lowvram only affects model loading. Training speed remains unchanged.
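
As a rough illustration of what the high-VRAM path buys during caching (this is just a sketch with a placeholder `vae`, not the actual sd-scripts code): keeping the encoder resident on the GPU avoids repeated CPU↔GPU transfers between batches.

```python
# Illustrative sketch only: the kind of behavior a high-VRAM switch enables
# during latent caching. `vae` is assumed to follow the diffusers interface.
import torch

def encode_for_cache(vae, batch, device, highvram: bool):
    vae.to(device)
    with torch.no_grad():
        latents = vae.encode(batch.to(device)).latent_dist.sample().cpu()
    if not highvram:
        # Low-VRAM path: give the memory back between batches (slower overall).
        vae.to("cpu")
        torch.cuda.empty_cache()
    return latents
```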

> I can test FLUX LoRA multi-GPU caching. Fine-tuning still requires 80 GB GPUs; the fused backward pass is not working.

I've done some more research recently, but so far I don't know of any way to improve memory usage with multi-GPU fine-tuning other than DeepSpeed or FSDP.
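
For reference, this is roughly what the FSDP route looks like with plain PyTorch rather than our scripts; `build_flux_model()` is a hypothetical placeholder for loading the transformer, and in practice one would drive this through accelerate's FSDP or DeepSpeed configuration.

```python
# Rough sketch of FSDP sharding with plain PyTorch (launch with torchrun).
# `build_flux_model()` is a hypothetical helper, not part of sd-scripts.
import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP
from torch.distributed.fsdp import ShardingStrategy

dist.init_process_group("nccl")
torch.cuda.set_device(dist.get_rank() % torch.cuda.device_count())

model = build_flux_model()  # hypothetical loader for the transformer to fine-tune
model = FSDP(
    model,
    sharding_strategy=ShardingStrategy.FULL_SHARD,  # shard params, grads, and optimizer state
    device_id=torch.cuda.current_device(),
)
# From here the usual training loop runs; each GPU only holds its own shard.
```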

@FurkanGozukara

@kohya-ss You are awesome, thank you so much!

@SoonNOON

I hope the full version of o1 will help find the right solution very soon.

@sdbds
Contributor

sdbds commented Oct 12, 2024

I hope it will be possible to push the cached data directly to HF, so that pulling the latents directly on a cloud platform doesn't take up caching time, and so that more disk space can be saved with large datasets.
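
Something like the following, using the huggingface_hub client (the repo id and cache paths are placeholders, not anything sd-scripts supports today): upload the cache once, then pull it on the cloud machine instead of re-encoding.

```python
# Sketch of the proposed workflow with huggingface_hub; repo id and paths are placeholders.
from huggingface_hub import HfApi, snapshot_download

api = HfApi()
# One-time upload from the machine that did the caching.
api.create_repo("your-name/flux-latent-cache", repo_type="dataset", exist_ok=True)
api.upload_folder(
    folder_path="cache/latents",              # local cache directory (placeholder)
    repo_id="your-name/flux-latent-cache",    # dataset repo (placeholder)
    repo_type="dataset",
)

# On the cloud platform: fetch the cache instead of running the VAE again.
snapshot_download(
    repo_id="your-name/flux-latent-cache",
    repo_type="dataset",
    local_dir="cache/latents",
)
```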

@kohya-ss
Owner Author

Certainly, how to handle large datasets is a big challenge. I don't have much experience working with large-scale datasets, but I think we should also consider options like WebDataset.

Also, since the relative cost of AE/VAE processing decreases during large-scale training, it may not be necessary to cache latents.
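
For the WebDataset idea, a minimal sketch of streaming tar shards so nothing needs to be cached to disk first; the shard URL pattern and sample keys below are placeholders.

```python
# Minimal sketch: stream training images from WebDataset tar shards and
# compute latents on the fly instead of reading them from a local cache.
# The shard URL pattern and keys ("jpg", "json") are placeholders.
import webdataset as wds

dataset = (
    wds.WebDataset("https://example.com/shards/train-{000000..000099}.tar")
    .decode("torchrgb")          # decode images straight to CHW float tensors
    .to_tuple("jpg", "json")     # yield (image, metadata) pairs per sample
)

for image, meta in dataset:
    # latents would be computed here each step instead of being loaded from disk
    ...
```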

kohya-ss marked this pull request as ready for review on October 13, 2024, 10:23
kohya-ss merged commit 1275e14 into sd3 on Oct 13, 2024
2 checks passed
kohya-ss deleted the multi-gpu-caching branch on October 13, 2024, 10:26