Learning rate for Imagenet #18

Shiweiliuiiiiiii · 2020-06-17T23:14:58Z

Hi Tim,

First, thank you for your code.

I noticed that, for multi-GPU runs on ImageNet, the code changes the default learning rate by multiplying 0.1 by the number of GPUs. Did you actually use this setting to get the performance reported in the paper? And would it result in better performance only for sparse training, or for dense training as well?

Many thanks

TimDettmers · 2020-08-26T20:21:56Z

Good catch, I was not aware of this behavior. I did not change the code any further and trained on 4 GPUs. I have not studied in detail how performance differs if this behavior is changed. It might have affected the results for both sparse and dense training, but since I do not have any data, I cannot say for sure what the effect is.

TimDettmers added the question (Further information is requested) and wontfix (This will not be worked on) labels on Oct 22, 2020
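
For readers following the thread, the behavior under discussion can be sketched as below. This is a minimal illustration, not the repository's actual code: the function name `scaled_learning_rate` is hypothetical, and only the base value of 0.1 and the 4-GPU setup come from the comments above.

```python
def scaled_learning_rate(base_lr: float, num_gpus: int) -> float:
    """Scale a base learning rate linearly with the number of GPUs.

    Hypothetical helper: it mirrors the behavior described in the thread
    (base learning rate multiplied by the GPU count), nothing more.
    """
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    return base_lr * num_gpus


# Base rate of 0.1 as mentioned in the question, 4 GPUs as mentioned in the reply.
lr_single_gpu = scaled_learning_rate(0.1, 1)
lr_four_gpus = scaled_learning_rate(0.1, 4)
print(f"1 GPU: {lr_single_gpu:.2f}, 4 GPUs: {lr_four_gpus:.2f}")
```

Under this rule, a run on 4 GPUs would use a learning rate four times larger than a single-GPU run, which is why results obtained on different GPU counts may not be directly comparable.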
