How to train new motion patterns with large-domain-gap datasets? #22

Open
sunnyHelen opened this issue Sep 3, 2024 · 4 comments

@sunnyHelen

Hi, thanks a lot for sharing the training code. If I need to learn new motion patterns from my own datasets, which are very different from normal realistic videos, how should I train the model? Do you have any insights? Is train.py used for fine-tuning the motion module, and is train_lora.py for training the LoRA module that AnimateDiff presents as the domain adapter LoRA? Should I first use train_lora.py to adapt to my datasets' domain and then use train.py to learn the new motion patterns?

If you can give me some guidance, I would really appreciate it.

Thanks a lot.

@tumurzakov
Owner

Hello

Everything depends on your dataset.

Read how Open-Sora and Stability AI prepare their datasets:

https://github.com/hpcaitech/Open-Sora/blob/main/docs/report_01.md
https://github.com/hpcaitech/Open-Sora/blob/main/docs/report_02.md
https://github.com/hpcaitech/Open-Sora/blob/main/docs/report_03.md
https://arxiv.org/pdf/2311.15127

Carefully read the sections on dataset preparation; it is a multistep process.

You must categorize your motions and write precise prompts.

I think it is better to train one concept per LoRA. Don't fine-tune the motion module itself, because it will forget what it already learned. My experiments show that you need about 100 epochs to learn a motion well. Use motion module v3; it has the best quality (for 24 frames). If you need more frames, you must train a motion module yourself, but that is very expensive. I trained 48- and 96-frame models; it took weeks of training and the quality was so-so, because my dataset is poor.
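To make "one concept per LoRA" and "about 100 epochs" concrete, here is a minimal sketch of how a dataset could be split by motion category and how many optimizer steps each LoRA run would need. The record layout, file paths, labels, and prompts are all hypothetical and only illustrate the bookkeeping; they are not taken from this repo.

```python
import math
from collections import defaultdict


def group_by_motion(records):
    """Group (video_path, motion_label, prompt) records by motion label."""
    groups = defaultdict(list)
    for video_path, motion_label, prompt in records:
        groups[motion_label].append({"video": video_path, "prompt": prompt})
    return dict(groups)


def total_steps(num_clips, epochs=100, batch_size=1):
    """Optimizer steps needed to show every clip `epochs` times."""
    return math.ceil(num_clips / batch_size) * epochs


# Hypothetical records: one precise prompt per clip, one label per motion.
records = [
    ("clips/0001.mp4", "spin", "a paper crane spinning clockwise, static camera"),
    ("clips/0002.mp4", "spin", "a coin spinning clockwise on a table, static camera"),
    ("clips/0003.mp4", "bounce", "a rubber ball bouncing twice, side view"),
]

# Each label becomes the training set of its own LoRA.
for label, clips in group_by_motion(records).items():
    print(label, len(clips), "clips,", total_steps(len(clips)), "steps")
```

Each group would then be trained as a separate LoRA, so a weak category can be retrained without touching the others.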

About training: I train with my framework, latentflow. The scripts in this repo are fairly obsolete, because diffusers now trains with peft. Take a look at the train script: I add LoRA to all attention layers, in both the UNet and the motion module.
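As a rough illustration of "LoRA on all attention layers", the sketch below filters module names down to the attention projections. The projection suffixes follow the usual diffusers attention naming (`to_q`, `to_k`, `to_v`, `to_out.0`), but the example module names are made up for illustration; the real ones should be listed from the loaded model with `named_modules()`. This is not the latentflow train script.

```python
# Assumed projection names; verify them against the actual model.
ATTENTION_PROJECTIONS = ("to_q", "to_k", "to_v", "to_out.0")


def lora_target_modules(module_names, projections=ATTENTION_PROJECTIONS):
    """Return the module names that end in one of the attention projections."""
    return [
        name
        for name in module_names
        if any(name == p or name.endswith("." + p) for p in projections)
    ]


# Illustrative names only; list the real ones with `model.named_modules()`.
example_names = [
    "down_blocks.0.attentions.0.transformer_blocks.0.attn1.to_q",
    "down_blocks.0.attentions.0.transformer_blocks.0.attn1.to_out.0",
    "down_blocks.0.attentions.0.transformer_blocks.0.ff.net.2",
    "down_blocks.0.motion_modules.0.transformer_blocks.0.attn1.to_k",
    "down_blocks.0.resnets.0.conv1",
]

# Keeps the three attention projections, drops the feed-forward and resnet layers.
targets = lora_target_modules(example_names)
print(targets)
```

A list like `targets` is the kind of value that peft's `LoraConfig` accepts through its `target_modules` argument; rank, alpha, and the rest of the training setup are left out here on purpose.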

@sunnyHelen
Author

Thanks a lot for your quick reply. I really appreciate it. Do you mean I should train separate LoRAs, one for domain adaptation and then one per motion pattern? I think I may need motions with 94 frames.

@tumurzakov
Owner

You must go step by step. First train one motion in a LoRA. I trained 96 frames at 512x288 resolution; it took nearly 24 GB. If you get reasonably good results, generate many samples and enlarge your dataset with them. After n rounds with LoRAs you will have a much bigger dataset and can train on it. Think of it as distillation.
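The step-by-step loop described above can be sketched as follows. `train_lora`, `generate`, and `keep` are hypothetical callables that stand in for the real training run, the sampling pipeline, and whatever quality filter is used (manual review or a score threshold); the stub implementations exist only so the sketch runs on its own.

```python
def bootstrap_dataset(seed_clips, train_lora, generate, keep,
                      rounds=3, samples_per_round=8):
    """Grow a dataset by training a LoRA, sampling from it, and keeping good samples."""
    dataset = list(seed_clips)
    for _ in range(rounds):
        lora = train_lora(dataset)                    # train on everything so far
        candidates = generate(lora, samples_per_round)
        dataset.extend(c for c in candidates if keep(c))
    return dataset


# Stubs for illustration only; replace them with real training and sampling.
def fake_train(dataset):
    return {"trained_on": len(dataset)}


def fake_generate(lora, n):
    return [{"id": f"gen-{lora['trained_on']}-{i}", "score": i / n} for i in range(n)]


def good_enough(clip):
    return clip["score"] >= 0.5


seed = [{"id": f"real-{i}", "score": 1.0} for i in range(4)]
grown = bootstrap_dataset(seed, fake_train, fake_generate, good_enough,
                          rounds=2, samples_per_round=4)
print(len(grown))
```

The filter matters more than the loop: if poor samples are kept, each round trains on its own mistakes, so it is worth being strict about what goes back into the dataset.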

@sunnyHelen
Author

Thanks a lot for your kind advice. I'll try.
