
SANA support #1807

Open
recoilme opened this issue Nov 26, 2024 · 13 comments

Comments

@recoilme

Could you please add minimal SANA support?
https://github.com/NVlabs/Sana

SANA's training implementation is far from optimal:
NVlabs/Sana#49

It is GPU-intensive and lacks caching, Multi Aspect Ratio bucketing, and Adafactor support.

We love you, Mr. Kohya!

@kohya-ss
Owner

Sana is very interesting. However, I am concerned that the weight license is CC BY-NC-SA 4.0.
If this applies not only to the model but also to the generated images, the use cases for the model will be limited.

It seems that a pull request for optimization is under development in the Sana repository, so I will wait for that first.

@CorradoF

Sana is very interesting. However, I am concerned that the weight license is CC BY-NC-SA 4.0. If this applies not only to the model but also to the generated images, the use cases for the model will be limited.

What's the difference from FLUX.1 dev? Isn't it research-only too?

@kohya-ss
Owner

What's the difference from FLUX.1 dev? Isn't it research-only too?

From my understanding, the output of FLUX.1 dev is not covered by the license of FLUX.1 dev.

https://github.com/black-forest-labs/flux/blob/main/model_licenses/LICENSE-FLUX1-dev

a. “Derivative” means any (i) modified version of the FLUX.1 [dev] Model (including but not limited to any customized or fine-tuned version thereof), (ii) work based on the FLUX.1 [dev] Model, or (iii) any other derivative work thereof. For the avoidance of doubt, Outputs are not considered Derivatives under this License.

@CorradoF

What's the difference from FLUX.1 dev? Isn't it research-only too?

From my understanding, the output of FLUX.1 dev is not covered by the license of FLUX.1 dev.

https://github.com/black-forest-labs/flux/blob/main/model_licenses/LICENSE-FLUX1-dev

a. “Derivative” means any (i) modified version of the FLUX.1 [dev] Model (including but not limited to any customized or fine-tuned version thereof), (ii) work based on the FLUX.1 [dev] Model, or (iii) any other derivative work thereof. For the avoidance of doubt, Outputs are not considered Derivatives under this License.

Interesting, thank you. I had always heard differently on Reddit; my bad for not reading it myself.

If NVIDIA doesn't change the license, eventually there will be a Pony-style Sana. Having things already prepared for that could be an advantage. Thank you anyway for the great work you've done so far.

@recoilme
Author

Sana is very interesting. However, I am concerned that the weight license is CC BY-NC-SA 4.0.

It looks like the training code is under Apache now, though I'm not sure:
NVlabs/Sana@335d445

@Bocchi-Chan2023

NVlabs/Sana#54
Code only, unfortunately.

@recoilme
Author

@Muinez added very good code for MAR (multi-aspect-ratio) bucketing and simplified local training:
https://github.com/Muinez/Sana

I added code for model load/save and fixed some small bugs:
https://github.com/recoilme/Sana

And here is an example of how to use it and train a small model from scratch in bf16:
https://github.com/recoilme/Sana/blob/main/TRAIN.md
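For anyone unfamiliar with MAR bucketing, the general idea can be sketched as follows. This is a minimal illustration of the technique, not the actual code from the linked repos, and `make_buckets` / `assign_bucket` are hypothetical names: images are grouped into fixed (width, height) buckets that share a roughly constant pixel budget, so each batch has a uniform shape without cropping everything to a square.

```python
import math

def make_buckets(base=1024, step=64, max_ratio=2.0):
    """Enumerate (w, h) buckets near base*base pixels, in multiples of step."""
    buckets = []
    w = step
    while w <= base * max_ratio:
        # Pick the height that keeps the pixel count close to base*base.
        h = round(base * base / w / step) * step
        if h > 0 and max(w / h, h / w) <= max_ratio:
            buckets.append((w, h))
        w += step
    return sorted(set(buckets))

def assign_bucket(width, height, buckets):
    """Pick the bucket whose log-aspect-ratio is closest to the image's."""
    ar = math.log(width / height)
    return min(buckets, key=lambda b: abs(math.log(b[0] / b[1]) - ar))

buckets = make_buckets()
print(assign_bucket(1920, 1080, buckets))  # -> (1344, 768) with these defaults
```

Batches are then drawn from one bucket at a time, so tensors stack without padding.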

@Bocchi-Chan2023

@Muinez added very good code for MAR bucketing and simplified local training: https://github.com/Muinez/Sana

I added code for model load/save and fixed some small bugs: https://github.com/recoilme/Sana

And here is an example of how to use it and train a small model from scratch in bf16: https://github.com/recoilme/Sana/blob/main/TRAIN.md

Is it possible to improve the way the dataset cache is generated? When I tried the official code, it used all the system RAM and froze my PC.

@recoilme
Author

recoilme commented Nov 30, 2024

Is it possible to improve the way the dataset cache is generated?

The official code doesn't have a cache.
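For anyone hitting the RAM issue described above, one common workaround is to cache encoded latents on disk instead of accumulating them all in memory. A minimal sketch, where `encode` is a stand-in for a real (expensive) VAE encode call, not Sana's actual API:

```python
import hashlib
import os
import pickle
import tempfile

def encode(image_path):
    # Placeholder for an expensive VAE encode; returns fake "latents".
    return [len(image_path), hash(image_path) % 997]

class DiskLatentCache:
    """Encode each image at most once; keep results on disk, not in RAM."""

    def __init__(self, cache_dir):
        self.cache_dir = cache_dir
        os.makedirs(cache_dir, exist_ok=True)

    def _path(self, key):
        digest = hashlib.sha256(key.encode()).hexdigest()
        return os.path.join(self.cache_dir, digest + ".pkl")

    def get(self, image_path):
        path = self._path(image_path)
        if os.path.exists(path):            # cache hit: load from disk
            with open(path, "rb") as f:
                return pickle.load(f)
        latents = encode(image_path)        # cache miss: encode once
        with open(path, "wb") as f:
            pickle.dump(latents, f)
        return latents

cache = DiskLatentCache(tempfile.mkdtemp())
a = cache.get("img_001.png")   # encodes and writes to disk
b = cache.get("img_001.png")   # loads from disk, no re-encode
```

With something like this, peak memory stays bounded by one batch of latents rather than the whole dataset, and the VAE can be dropped from the GPU entirely after the caching pass.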

@Bocchi-Chan2023

Is it possible to improve the way the dataset cache is generated?

The official code doesn't have a cache.

Oh, maybe I'm misunderstanding something.
It started using several times more memory than Sigma fine-tuning, so I feel something is wrong.

@recoilme
Author

recoilme commented Dec 1, 2024

Current status:
0.6B on an A40 (48 GB GPU) with batch size 256 at 512-1024 resolution
1.6B with batch size 16 at 512-2048 resolution

It looks like training from scratch in bf16 is succeeding:
https://wandb.ai/recoilme/potato/runs/8?nw=nwuserrecoilme

Note that this is not the official training code.
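As an aside on why from-scratch training in bf16 tends to work without loss scaling: bfloat16 keeps float32's 8-bit exponent and drops the mantissa to 7 explicit bits, so it covers the same numeric range with less precision. The truncation can be illustrated in pure Python:

```python
import struct

def to_bf16(x):
    """Round a float32 value to the nearest bfloat16 (round-to-nearest-even)."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Standard trick: add 0x7FFF plus the lowest kept bit, then drop 16 bits.
    rounded = (bits + 0x7FFF + ((bits >> 16) & 1)) & 0xFFFF0000
    return struct.unpack("<f", struct.pack("<I", rounded))[0]

print(to_bf16(3.14159))  # -> 3.140625: only ~2-3 decimal digits survive
print(to_bf16(1e30))     # huge magnitudes still fit (same exponent range as fp32)
```

Because the exponent range matches float32, gradients rarely underflow to zero the way they do in fp16, which is what makes plain bf16 training viable.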

@Manni1000

But I have seen someone say that it runs with 24 GB of VRAM if you use VAE caching. Is this true or false?

@recoilme
Author

recoilme commented Dec 2, 2024

But I have seen someone say that it runs with 24 GB of VRAM if you use VAE caching. Is this true or false?

True. I train the 0.6B model with batch size 256 and the 1.6B model with batch size 24 on a 48 GB GPU with a VAE cache.
