lr does denote the learning rate, that is correct. min_delta is used by the curriculum; it determines whether a curriculum step should be performed.
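As a rough illustration only (this is not the exact logic from this repository), a min_delta-based curriculum check typically compares the recent improvement of a monitored metric against the threshold: once the improvement falls below min_delta, the next curriculum step is triggered. A minimal sketch with hypothetical names:

```python
# Hypothetical sketch of a min_delta-based curriculum trigger.
# Not the actual implementation in this repository.
class CurriculumTrigger:
    def __init__(self, min_delta=1e-8):
        self.min_delta = min_delta
        self.best_loss = float('inf')

    def should_step(self, current_loss):
        # If the loss improved by at least min_delta, stay on the
        # current curriculum stage.
        if self.best_loss - current_loss > self.min_delta:
            self.best_loss = current_loss
            return False
        # Otherwise training has plateaued: move to the next stage.
        return True
```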
The odd-looking learning rate output is due to the way Chainer reports the learning rate of the Adam optimizer. The value does not actually drop; rather, it increases over the first iterations until it reaches the provided learning rate.
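For reference, the value Chainer reports as Adam's `lr` is the bias-corrected step size alpha * sqrt(1 - beta2^t) / (1 - beta1^t), which starts well below alpha and approaches it as the update count t grows. A small sketch of that formula, assuming the default beta values and one optimizer update per reported iteration:

```python
import math

def adam_reported_lr(alpha, t, beta1=0.9, beta2=0.999):
    # Bias-corrected step size that Chainer reports as Adam's "lr".
    fix1 = 1.0 - beta1 ** t
    fix2 = 1.0 - beta2 ** t
    return alpha * math.sqrt(fix2) / fix1

# With alpha = 1e-4 this reproduces the lr column in the log below:
print(adam_reported_lr(1e-4, 100))    # ~3.09e-05 (iteration 100)
print(adam_reported_lr(1e-4, 200))    # ~4.26e-05 (iteration 200)
print(adam_reported_lr(1e-4, 10000))  # ~1e-04, approaches alpha
```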
Hello @Bartzi!
In my training, I have set lr = 1e-4 and min_delta = 1e-8. Am I correct in assuming these are the learning rate and the decay, respectively?
Also, I print the values out at the start of training and they seem fine, but later on the learning rate drops quickly.
```
min_delta, decay rate: 1e-08 lr: 0.0001
/usr/local/lib/python3.6/dist-packages/chainer/training/updaters/multiprocess_parallel_updater.py:155: UserWarning: optimizer.eps is changed to 1e-08 by MultiprocessParallelUpdater for new batch size.
  format(optimizer.eps))
epoch  iteration  main/loss  main/accuracy  lr           fast_validation/main/loss  fast_validation/main/accuracy  validation/main/loss  validation/main/accuracy
1      100        2.49428    0              3.08566e-05  2.36821                    0
3      200        1.94748    0              4.25853e-05  2.23569                    0
     total [#########.........................................] 19.93%
this epoch [#################################################.] 98.60%
       249 iter, 3 epoch / 20 epochs
   0.48742 iters/sec. Estimated time to finish: 0:34:12.368322.
```
Could you explain how these parameters are related, and what might be affecting my learning rate so drastically?