A2 (batch_size=1024) config from article #1122

Answered by rwightman
hankyul2 asked this question in General

@hankyul2 your 79.63 is sitting around the mean (79.68) for the A2 runs in the paper. The 79.8 was a better-than-average run (seed 0) among the official runs.

For my local runs (EDIT: I just looked it up, I had 4 A2 runs on my local machine) I hit 79.65, 79.68, 79.82, and 79.83.
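To put a single result in context against those four local runs, here is a rough sketch using only the Python standard library. The variable names are made up for illustration, and four runs is far too few for anything more than a sanity check.

```python
import statistics

# Top-1 accuracies of the four local A2 runs quoted above.
local_runs = [79.65, 79.68, 79.82, 79.83]

# The result being compared (from the question).
your_run = 79.63

mean_acc = statistics.mean(local_runs)
# Sample standard deviation (n - 1 in the denominator).
std_acc = statistics.stdev(local_runs)

# How far the new run sits from the local mean, in standard deviations.
z = (your_run - mean_acc) / std_acc

print(f"local mean = {mean_acc:.3f}, sample std = {std_acc:.3f}")
print(f"your run is {z:+.2f} std from the local mean")
```

A gap of roughly one standard deviation on a handful of seeds is well within ordinary run-to-run noise.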

You could try a different seed. I had a lucky affinity for '21' on some of my local runs, or you can use 0 like the paper numbers. It should start out with the same model weights as the paper runs (we checked that), but the rest of the random selections (augmentations, dataset sampling, etc.) will likely follow a different path, so results won't end up exactly the same.

I find that scaling LR by sqrt is better than …
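A minimal sketch of what square-root LR scaling looks like when the batch size changes. The linear rule is shown only as a common point of comparison (the sentence above is truncated, so it is not necessarily the comparison made there). `scale_lr` is a hypothetical helper, not a timm function, and the base LR and batch sizes below are placeholder values, not the A2 settings.

```python
import math


def scale_lr(base_lr: float, base_batch_size: int, batch_size: int, rule: str = "sqrt") -> float:
    """Rescale a learning rate tuned at base_batch_size for a new batch_size.

    Hypothetical helper for illustration only.
    """
    ratio = batch_size / base_batch_size
    if rule == "sqrt":
        # Square-root scaling: LR grows with the square root of the batch ratio.
        return base_lr * math.sqrt(ratio)
    if rule == "linear":
        # Linear scaling: LR grows proportionally to the batch ratio.
        return base_lr * ratio
    raise ValueError(f"unknown rule: {rule!r}")


# Placeholder values: an LR tuned at batch size 2048, reused at batch size 512.
base_lr, base_bs, new_bs = 1.0, 2048, 512
print("sqrt  :", scale_lr(base_lr, base_bs, new_bs, "sqrt"))
print("linear:", scale_lr(base_lr, base_bs, new_bs, "linear"))
```

When shrinking the batch, the sqrt rule reduces the LR less aggressively than the linear rule, and when growing the batch it increases the LR less aggressively.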

Answer selected by hankyul2