ViT training hparams #946
Replies: 2 comments
-
@kfir99 there are some past discussions and issues on this; you can search for them, as I'd have to do the same. The best source for training ViT-like models with timm train is the deit defaults: the args for deit (https://github.com/facebookresearch/deit) and timm are pretty much the same, and deit uses all the timm train components. You do have to remember to scale the LR, though, since deit does that for you whereas timm train does not. I'm moving this to discussions as I think it's more useful there. If anyone wants to help put together a definitive set of timm hparams for ViT (based on deit defaults), I'll add them to the docs, but I don't have time to do that and verify them at this point.
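For anyone unsure what "scale the LR" means in practice, here is a minimal sketch, assuming deit applies the usual linear scaling rule (base LR multiplied by the total batch size and divided by 512). The helper name `scale_lr` and the example values below are illustrative assumptions, not taken from this thread, so check them against the deit repo before relying on them.

```python
def scale_lr(base_lr: float, per_gpu_batch_size: int, world_size: int,
             reference_batch: int = 512) -> float:
    """Linearly scale a base learning rate by the total (global) batch size.

    Assumption: the reference batch size of 512 mirrors the rule deit is
    believed to apply internally; timm's train script uses the LR as given.
    """
    total_batch = per_gpu_batch_size * world_size
    return base_lr * total_batch / reference_batch


# Hypothetical setup: base LR 5e-4, 128 images per GPU, 8 GPUs.
# Total batch is 1024, i.e. twice the reference, so the LR doubles.
lr = scale_lr(5e-4, per_gpu_batch_size=128, world_size=8)
print(f"scaled lr: {lr:.6f}")
```

The resulting value is what you would pass to timm train yourself (via `--lr`), since timm train does not do this scaling for you.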
-
I'd be curious if anyone has followed up on this. I ran the following command:
-
Hi Ross,
thanks for creating and maintaining this wonderful repo!
I would like to ask for the hparams you used when training the ViT models, similar to what you have posted for CNN models (here: https://rwightman.github.io/pytorch-image-models/training_hparam_examples/).
It would help a lot when trying to finetune to a new dataset or just reproducing the results you achieved.
Thanks again,
Kfir