
About learning rate scheduler #13

Open
deeperlearner opened this issue Aug 2, 2024 · 0 comments
In section 4.5 of the paper:

All models are trained using an SGD optimizer with an initial learning rate of 1e−1 and batch size of
512. The learning rate is divided by 10 at 30k, 60k, 90k training iterations.
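
For reference, here is that schedule as I have set it up; a minimal sketch assuming PyTorch, where the model and the loss are placeholders for the actual network and the SRT loss:

```python
import torch
from torch.optim import SGD
from torch.optim.lr_scheduler import MultiStepLR

# Placeholder model; stands in for the actual network.
model = torch.nn.Linear(512, 10)

# Schedule quoted above: initial lr of 1e-1, divided by 10 at
# 30k, 60k, 90k training iterations. The milestones are iteration
# counts because scheduler.step() is called once per iteration,
# not once per epoch.
optimizer = SGD(model.parameters(), lr=1e-1)
scheduler = MultiStepLR(optimizer, milestones=[30_000, 60_000, 90_000], gamma=0.1)

for it in range(100_000):
    x = torch.randn(512, 512)          # dummy batch of size 512
    loss = model(x).pow(2).mean()      # dummy loss in place of the SRT loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    scheduler.step()                   # lr drops at the iteration milestones
```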

Since this paper is about losses, have there been any experiments on learning rate schedulers?
In my experiment, I am using the SRT loss, and the loss keeps dropping at a learning rate of 1e-1.
Any suggestions on when it is best to divide the learning rate by 10?
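
One alternative I have been considering (just a guess on my part, not something the paper tests) is to let the schedule react to the loss instead of using fixed milestones, e.g. with PyTorch's ReduceLROnPlateau. Continuing the sketch above, with a placeholder epoch length:

```python
from torch.optim.lr_scheduler import ReduceLROnPlateau

# Reuses `model` and `optimizer` from the sketch above.
# Divides the lr by 10 once the monitored loss has stopped
# improving for `patience` checks, instead of at hard-coded
# iteration milestones.
scheduler = ReduceLROnPlateau(optimizer, mode="min", factor=0.1, patience=5)

iters_per_epoch = 1000                  # placeholder epoch length
for epoch in range(120):
    epoch_loss = 0.0
    for _ in range(iters_per_epoch):
        x = torch.randn(512, 512)
        loss = model(x).pow(2).mean()   # dummy loss in place of the SRT loss
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        epoch_loss += loss.item()
    scheduler.step(epoch_loss / iters_per_epoch)  # step on the mean epoch loss
```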

Thank you!
