-
Looking into the Scheduler code, I have several questions about lr_noise.
-
@ayasyrev the random number for this needs to be separate from the Python and other (global) library random generators, and seeded consistently across all distributed workers in distributed training scenarios, so that all workers are using the same learning rate.
`t` is for time, as it could be epochs or steps; I have (as yet unrealized) plans to transition the schedulers to step-based for a variety of reasons, mostly fine-grained warmup (per step).
The noise idea had some tweaks/improvements to explore that I didn't end up doing, hence that extra arg floating around. I'd be curious to know if you find any improvements or find the feature useful. It's (as far as I know) unpublished and unique to `timm`.
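To make the seeding point concrete, here is a minimal sketch of the idea, not timm's actual implementation; the `apply_lr_noise` helper and its `noise_seed` / `noise_std` / `noise_pct` arguments are hypothetical names chosen to mirror the style of the scheduler args. The key detail is that a dedicated `torch.Generator` is seeded from `noise_seed + t` rather than the global RNG, so every distributed worker draws the same noise at the same `t` and the perturbed learning rates stay in sync.

```python
import torch

def apply_lr_noise(lr: float, t: int, noise_seed: int = 42,
                   noise_std: float = 1.0, noise_pct: float = 0.67) -> float:
    """Perturb a scheduled learning rate with deterministic, worker-consistent noise.

    A dedicated torch.Generator is seeded from (noise_seed + t) instead of the
    global Python/torch RNGs, so the draw neither disturbs other random state
    nor diverges across distributed workers for the same time index t.
    """
    g = torch.Generator()
    g.manual_seed(noise_seed + t)  # identical on every worker for a given t
    noise = torch.randn(1, generator=g).item() * noise_std
    # Limit the relative perturbation so the LR cannot explode or go negative.
    noise = max(min(noise, noise_pct), -noise_pct)
    return lr * (1.0 + noise)

# Example: the same base LR gets the same perturbation on every worker at each t,
# whether t counts epochs or optimizer updates.
base_lr = 0.1
for t in range(5):
    print(t, round(apply_lr_noise(base_lr, t), 5))
```

Keeping `t` generic like this is also what lets the same noise logic carry over if the schedulers move from epoch-based to step-based updates.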
-
Thanks for your answer. Right now I'm preparing notebooks for timmdocs. I started with the schedulers: documenting the noise options, the recent changes, and the new schedulers from the latest commits.