-
Hi, I have been experimenting with the A2 config from ResNet Strikes Back.

Question: I failed to reproduce the validation top-1 accuracy reported in the paper (79.8); mine is 79.416.

Experiment Setup: I used the command below.

Experiment Results: I experimented with different values of warmup_lr (1e-6 or 1e-4), batch_size (256 or 448), and bce_threshold (0.0 or 0.2).
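Since the swept bce_threshold values (0.0 vs 0.2) can be confusing, here is a minimal sketch of how a BCE target threshold is typically applied: soft targets (from mixup/cutmix or label smoothing) are binarized at the threshold before the BCE loss. This is an illustration of the mechanism only, not code from the original training command; the function name and defaults are hypothetical.

```python
from typing import Optional

import torch
import torch.nn.functional as F


def bce_with_target_threshold(logits: torch.Tensor,
                              soft_targets: torch.Tensor,
                              threshold: Optional[float] = None) -> torch.Tensor:
    """Sketch of BCE with an optional target threshold (hypothetical helper).

    Soft targets (e.g. produced by mixup/cutmix or label smoothing) are
    binarized at `threshold` before the BCE loss is computed, which is roughly
    what a bce_threshold knob controls.
    """
    if threshold is not None:
        soft_targets = (soft_targets > threshold).to(soft_targets.dtype)
    return F.binary_cross_entropy_with_logits(logits, soft_targets)


# Example: with threshold=0.2 a heavily smoothed target of 0.1 is pushed to 0,
# while with threshold=0.0 any non-zero soft target becomes 1.
logits = torch.randn(4, 1000)
targets = torch.full((4, 1000), 0.1)
loss = bce_with_target_threshold(logits, targets, threshold=0.2)
```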
-
@hankyul2 your 79.63 is sitting around the mean (79.68) for the A2 runs in the paper. The 79.8 was a better-than-average run (seed 0) for the official runs. For my local runs I believe I hit (edit: just looked up, I had 4 runs on my local machine for A2) 79.65, 79.68, 79.82, 79.83.

You could try a different seed; I had a lucky affinity to '21' for some of my local runs, or you can use 0 like the paper numbers. It should start out with the same model weights as the paper runs (we checked that), but the rest of the random selections for augmentations, dataset sampling, etc. will likely follow a different path, so results won't end up exactly the same.

I find that scaling LR by sqrt is better than linear for lamb and adamw, so …
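For reference, a minimal sketch of the two batch-size scaling rules mentioned above (linear vs. square-root). The base LR and base batch size below are placeholder assumptions for illustration, not values taken from this thread.

```python
import math

# Placeholder reference values for illustration only (not from this thread):
# a base LR that was tuned at some reference global batch size.
base_lr = 5e-3
base_batch = 2048


def linear_scaled_lr(batch_size: int) -> float:
    """Linear scaling rule: LR grows proportionally with batch size."""
    return base_lr * batch_size / base_batch


def sqrt_scaled_lr(batch_size: int) -> float:
    """Square-root scaling: LR grows with the square root of the batch-size ratio."""
    return base_lr * math.sqrt(batch_size / base_batch)


for bs in (256, 448, 2048):
    print(f"batch={bs:4d}  linear={linear_scaled_lr(bs):.2e}  sqrt={sqrt_scaled_lr(bs):.2e}")
```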
-
Hi @hankyul2, thanks for sharing your config and acc-step figure, it helps me a lot.