ViT training hparams #946

kfirgoldberg · 2021-10-26T15:47:41Z


Hi Ross,
thanks for creating and maintaining this wonderful repo!
I would like to ask for the hparams you used when training the ViT models, similar to what you have posted for CNN models (here: https://rwightman.github.io/pytorch-image-models/training_hparam_examples/).
It would help a lot when fine-tuning on a new dataset or just reproducing the results you achieved.

Thanks again,
Kfir

rwightman (Maintainer) · 2021-10-26T16:26:21Z

@kfir99 there are some past discussions and issues on this; you can search for them, as I'd have to do the same. The best source for training ViT-like models with timm train is the DeiT defaults: the args for DeiT (https://github.com/facebookresearch/deit) and timm are pretty much the same, and DeiT uses all the timm train components. You do have to remember to scale the LR yourself, though, since DeiT does that for you whereas timm train does not.
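For anyone following along, the LR scaling mentioned above can be sketched as below. As far as I recall, DeiT multiplies its base LR by the total batch size (per-GPU batch size times number of processes) divided by 512, with a default base LR of 5e-4; both numbers are assumptions here and are worth checking against the DeiT repo. The function name `scaled_lr` is hypothetical, not part of timm or DeiT.

```python
def scaled_lr(base_lr: float, batch_size_per_gpu: int, world_size: int = 1) -> float:
    """Linear LR scaling in the style of DeiT (assumed reference batch size: 512)."""
    return base_lr * batch_size_per_gpu * world_size / 512.0


# Assumed DeiT base LR of 5e-4; a single GPU with batch size 256.
lr_single_gpu = scaled_lr(5e-4, batch_size_per_gpu=256, world_size=1)

# The same base LR with 8 GPUs at 64 images each (total batch size 512).
lr_eight_gpus = scaled_lr(5e-4, batch_size_per_gpu=64, world_size=8)

print(lr_single_gpu, lr_eight_gpus)
```

With timm train the resulting value would be passed explicitly via `--lr`, since the script does not apply this scaling itself.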

I'm moving this to discussions as I think it's more useful there. If anyone wants to help put together a definitive set of timm hparams for ViT (based on DeiT defaults), I'll add them to the docs, but I don't have time to do that and verify them at this point.


jbohnslav · 2021-12-08T18:51:57Z


I'd be curious if anyone has followed up on this. I ran the following command:

    python train.py /media/jim/DATA_SSD/imagenet --batch-size 256 --amp -j 16 --log-wandb --model vit_small_patch16_224

and am plateauing at an accuracy of ~65%.
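One thing worth checking, per the maintainer's advice, is that the command above leaves the optimizer, LR, and augmentation at the train script's defaults rather than DeiT-style settings. Below is a rough sketch of assembling a DeiT-like argument list. The values are my recollection of the DeiT defaults and are unverified for timm, so treat them as a starting point, not a confirmed recipe; the helper `build_train_command` and the placeholder data path are hypothetical.

```python
import shlex

# Assumed DeiT-style defaults (unverified; compare against the DeiT repo).
DEIT_LIKE_ARGS = {
    "--opt": "adamw",
    "--weight-decay": "0.05",
    "--epochs": "300",
    "--sched": "cosine",
    "--warmup-epochs": "5",
    "--mixup": "0.8",
    "--cutmix": "1.0",
    "--smoothing": "0.1",
    "--drop-path": "0.1",
    "--aa": "rand-m9-mstd0.5-inc1",
    "--reprob": "0.25",
}


def build_train_command(data_dir: str, model: str, batch_size: int, lr: float) -> list:
    """Assemble a timm train.py invocation with DeiT-like settings."""
    cmd = [
        "python", "train.py", data_dir,
        "--model", model,
        "--batch-size", str(batch_size),
        "--lr", str(lr),  # already scaled by hand: base LR * total batch size / 512
        "--amp",
    ]
    for flag, value in DEIT_LIKE_ARGS.items():
        cmd += [flag, value]
    return cmd


# Single GPU, batch size 256: 5e-4 * 256 / 512 = 2.5e-4 (assumed DeiT scaling rule).
command = build_train_command("/path/to/imagenet", "vit_small_patch16_224", 256, 2.5e-4)
print(shlex.join(command))
```

Keeping the settings in one dictionary makes it easy to diff them against the DeiT argument defaults before launching a long run.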
