Learning rate adjustment when training with one GPU or two GPUs #80
Comments
You should set BASE_LR to 0.000025 instead of 0.0025.
When I first ran it, the learning rate was the default, i.e. 0.000025, and I got a lower AP than in the author's log, so I then changed the learning rate.
Maybe you need a smaller lr than 0.000025.
OK, thanks for your reply, I will try it now.
So, what changes did you make? I guess I need to set lr to 0.000025 * 1/8 and multiply the total number of iterations by 8? However, with that many iterations, the total training time becomes extremely long.
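The "1/8 of the lr, 8 times the iterations" guess in the last comment matches the linear scaling heuristic. Below is a minimal sketch of that arithmetic. The reference batch size of 16 and the 270000-iteration schedule are assumptions used for illustration and are not confirmed in this thread; `scale_schedule` is a hypothetical helper, not part of the repository. Linear scaling is a rule of thumb that is usually discussed for SGD, so with AdamW the result is only a starting point to tune from.

```python
def scale_schedule(base_lr, max_iter, ref_batch, new_batch):
    """Linear scaling heuristic: lr shrinks and iterations grow with the batch ratio."""
    ratio = new_batch / ref_batch
    scaled_lr = base_lr * ratio
    scaled_iter = round(max_iter / ratio)
    return scaled_lr, scaled_iter


# Assumed reference setup (hypothetical, not stated in the thread).
REF_BATCH = 16
REF_MAX_ITER = 270_000
DEFAULT_LR = 0.000025  # default BASE_LR quoted in the thread

if __name__ == "__main__":
    for batch in (4, 2):
        lr, iters = scale_schedule(DEFAULT_LR, REF_MAX_ITER, REF_BATCH, batch)
        print(f"IMS_PER_BATCH={batch}: BASE_LR={lr:.3e}, MAX_ITER={iters}")
```

Under these assumptions a batch of 2 gives one eighth of the default lr and eight times the iterations, which is exactly why the training time grows so much.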
Thanks for your code.
I trained it on two GPUs (Tesla P4), setting IMS_PER_BATCH to 4, by running:
python projects/SparseRCNN/train_net.py \
    --config-file projects/SparseRCNN/configs/sparsercnn.res50.100pro.3x.yaml \
    --num-gpus 2 SOLVER.IMS_PER_BATCH 4
At iteration 7319 the model was saved, and its AP was 3.915, which differs from the 11.440 in your log.
Following Detectron, I adjusted the learning rate to 0.0025 by running:
python projects/SparseRCNN/train_net.py \
    --config-file projects/SparseRCNN/configs/sparsercnn.res50.100pro.3x.yaml \
    --num-gpus 2 SOLVER.IMS_PER_BATCH 4 SOLVER.BASE_LR 0.0025
and then switched to one GPU by running:
python projects/SparseRCNN/train_net.py \
    --config-file projects/SparseRCNN/configs/sparsercnn.res50.100pro.3x.yaml \
    --num-gpus 1 SOLVER.IMS_PER_BATCH 2 SOLVER.BASE_LR 0.0025
but the result was worse.
Later I noticed that Detectron uses SGD while you use AdamW, so this 0.0025 may not be applicable.
So I would like to know whether I need to adjust some parameters, and if so, how?
Thanks.
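The three commands above differ only in the GPU count and the `KEY VALUE` overrides appended after the options. As a rough sketch, they could be assembled by a small helper so that adjusted values are passed consistently. `build_train_command` is a hypothetical helper, not part of the repository. The config path assumes the same `projects/` directory as the script, and the divisibility check is an assumed sanity check that the total batch splits evenly across GPUs. `SOLVER.MAX_ITER` is the detectron2 config key for the iteration count; any step milestones in the config would need to be scaled separately.

```python
import shlex

SCRIPT = "projects/SparseRCNN/train_net.py"
# Assumed path: same "projects/" directory as the script.
CONFIG = "projects/SparseRCNN/configs/sparsercnn.res50.100pro.3x.yaml"


def build_train_command(num_gpus, ims_per_batch, base_lr=None, max_iter=None):
    """Return a shell command string with optional SOLVER overrides appended."""
    if ims_per_batch % num_gpus != 0:
        raise ValueError("IMS_PER_BATCH should be divisible by the number of GPUs")
    cmd = [
        "python", SCRIPT,
        "--config-file", CONFIG,
        "--num-gpus", str(num_gpus),
        "SOLVER.IMS_PER_BATCH", str(ims_per_batch),
    ]
    if base_lr is not None:
        cmd += ["SOLVER.BASE_LR", repr(float(base_lr))]
    if max_iter is not None:
        cmd += ["SOLVER.MAX_ITER", str(int(max_iter))]
    return shlex.join(cmd)


if __name__ == "__main__":
    # Reproduces the single-GPU command from the post.
    print(build_train_command(1, 2, base_lr=0.0025))
```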