Problems when training mar model from scratch #76

Open
swasler opened this issue Dec 2, 2024 · 2 comments

swasler commented Dec 2, 2024

Hi TianHong,

I've been trying to train MAR from scratch recently, but I ran into an issue. If I train from scratch instead of using the pre-trained parameters you provided, the images generated during training (visualized with the online evaluation command) are completely black. I only trained for two epochs on the ImageNet dataset since it's so large, but I feel the output shouldn't be entirely black. There should at least be some noise in the images, right?

To reduce the computational cost, I set num_class=1 and trained on just one class of the ImageNet dataset. Even after training for dozens of epochs, the generated images are still completely black. However, if I use your pre-trained parameters (modified due to num_class=1), the training works fine and I get the expected results.

Did you encounter this issue when training? I’m a bit confused about why training from scratch leads to black images. Do you have any suggestions or things I should pay attention to?

Thank you for your answer!

LTH14 (Owner) commented Dec 2, 2024

A randomly initialized model produces NaN values if autocast is enabled during inference. If you want to see noise instead of black images, turn off autocast during inference: https://github.com/LTH14/mar/blob/main/engine_mar.py#L155
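One way to make that switch easy to flip is to wrap the autocast context in a small helper. This is only a minimal sketch, not the code in `engine_mar.py`: `inference_context`, `autocast_factory`, and `use_autocast` are hypothetical names introduced here for illustration, and the factory is assumed to be something like `torch.cuda.amp.autocast` in a PyTorch script.

```python
import contextlib


def inference_context(autocast_factory, use_autocast):
    """Return an autocast context if requested, otherwise a no-op context.

    autocast_factory: zero-argument callable that returns a context manager
    (assumed to be e.g. torch.cuda.amp.autocast in a PyTorch script).
    use_autocast: set to False to run sampling in full precision.
    """
    if use_autocast:
        return autocast_factory()
    # nullcontext does nothing on enter/exit, so the body runs unchanged.
    return contextlib.nullcontext()


# Hypothetical usage inside an evaluation loop (names are placeholders):
#
#   with inference_context(torch.cuda.amp.autocast, use_autocast=False):
#       images = generate(model)  # full-precision sampling
```

With `use_autocast=False` the sampling code runs in full precision, which is what you want while the model is still close to its random initialization.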

LTH14 (Owner) commented Dec 2, 2024

Since we use ema=0.9999 by default, you need at least 100k iterations to get a reasonable result. If you want to see results earlier, you can disable the EMA here: https://github.com/LTH14/mar/blob/main/engine_mar.py#L124, but doing so will harm the final performance.
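To get a feel for the 100k figure, here is a rough back-of-the-envelope sketch. It assumes the standard EMA update `ema = decay * ema + (1 - decay) * param`, applied once per iteration and starting from the initial weights; the repository's implementation may differ in details. Under that assumption, the share of the EMA still coming from the random initialization after `steps` updates is `decay ** steps`. The function name `initial_weight_share` is made up for this example.

```python
def initial_weight_share(decay: float, steps: int) -> float:
    """Fraction of the EMA still contributed by the initial (random) weights.

    Assumes ema = decay * ema + (1 - decay) * param, applied once per
    iteration, with the EMA initialized to the model's starting weights.
    """
    return decay ** steps


for steps in (1_000, 10_000, 100_000):
    share = initial_weight_share(0.9999, steps)
    print(f"{steps:>7} iterations: {share:.4%} of the EMA is still the initial weights")
```

With decay 0.9999 the initial weights still make up roughly 90% of the EMA after 1k iterations and about 37% after 10k, and only drop below 0.01% around 100k. That would explain why EMA samples from a short run look like output from an untrained model.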
