Wrong epoch count on training (Bug?) #71

Open
AfterHAL opened this issue Dec 5, 2024 · 1 comment
Comments

AfterHAL commented Dec 5, 2024

Hi.
I didn't pay attention to it before, but I am currently testing a training run based on the "Sana_600M_img512.yaml" config file with a dataset of 20k images (single 4090 GPU / batch 4), and the reported epoch after 80000 steps is still epoch 1:

2024-12-05 21:55:33 - [Sana] - INFO - Epoch: 1 | Global Step: 81780 | Local Step: 4780 // .......

(Also, the timestamp reported at the beginning of the log line does not match the actual system time.)
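
As a sanity check on the epoch figure reported above, here is a minimal sketch of the arithmetic. It is not taken from the Sana code base: `expected_epoch` is a hypothetical helper, and it assumes that one global step consumes exactly one batch of 4 images on a single GPU with no gradient accumulation, that the dataset holds exactly 20,000 images, and that epochs are numbered from 1.

```python
import math


def expected_epoch(global_step, dataset_size, batch_size, num_gpus=1, grad_accum=1):
    """Hypothetical helper: estimate the epoch a given global step falls in.

    Assumes every global step consumes batch_size * num_gpus * grad_accum
    samples and that epochs are counted from 1. The actual trainer may
    define its counters differently.
    """
    samples_per_step = batch_size * num_gpus * grad_accum
    # A final partial batch still counts as one step, hence the ceiling.
    steps_per_epoch = math.ceil(dataset_size / samples_per_step)
    epoch = global_step // steps_per_epoch + 1
    local_step = global_step % steps_per_epoch
    return epoch, local_step, steps_per_epoch


# Figures from the report: about 20k images, batch 4, global step 81780.
epoch, local_step, steps_per_epoch = expected_epoch(81780, 20_000, 4)
print(f"steps per epoch: {steps_per_epoch}")
print(f"expected epoch: {epoch}, expected local step: {local_step}")
```

Under these assumptions an epoch is 5000 steps, so global step 81780 would fall in epoch 17 rather than epoch 1. The logged local step (4780) also differs from the 1780 this simple model predicts, so the trainer's counters may be defined or reset differently, for example after resuming from a checkpoint.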

@Deng-Xian-Sheng

You should have 4 epochs, and then your second epoch is at 40000
