lr does denote the learning rate, that is correct. min_delta is used by the curriculum; it determines whether a curriculum step should be performed.
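As a rough illustration only (this is not the exact logic from this repository), a min_delta-based curriculum check typically compares the recent improvement of a monitored metric against the threshold: once the improvement falls below min_delta, the next curriculum step is triggered. A minimal sketch with hypothetical names:

```python
# Hypothetical sketch of a min_delta-based curriculum trigger.
# Not the actual implementation in this repository.
class CurriculumTrigger:
    def __init__(self, min_delta=1e-8):
        self.min_delta = min_delta
        self.best_loss = float('inf')

    def should_step(self, current_loss):
        # If the loss improved by at least min_delta, stay on the
        # current curriculum stage.
        if self.best_loss - current_loss > self.min_delta:
            self.best_loss = current_loss
            return False
        # Otherwise training has plateaued: move to the next stage.
        return True
```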
The odd-looking learning rate output is due to the way Chainer reports the learning rate of the Adam optimizer. The value does not actually drop; rather, it increases over the first iterations until it reaches the provided learning rate.
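For reference, the value Chainer reports as Adam's `lr` is the bias-corrected step size alpha * sqrt(1 - beta2^t) / (1 - beta1^t), which starts well below alpha and approaches it as the update count t grows. A small sketch of that formula, assuming the default beta values and one optimizer update per reported iteration:

```python
import math

def adam_reported_lr(alpha, t, beta1=0.9, beta2=0.999):
    # Bias-corrected step size that Chainer reports as Adam's "lr".
    fix1 = 1.0 - beta1 ** t
    fix2 = 1.0 - beta2 ** t
    return alpha * math.sqrt(fix2) / fix1

# With alpha = 1e-4 this reproduces the lr column in the log below:
print(adam_reported_lr(1e-4, 100))    # ~3.09e-05 (iteration 100)
print(adam_reported_lr(1e-4, 200))    # ~4.26e-05 (iteration 200)
print(adam_reported_lr(1e-4, 10000))  # ~1e-04, approaches alpha
```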
Hello @Bartzi!
In my training, I have set lr = 1e-4 and min_delta = 1e-8. Am I correct in assuming these are the learning rate and the decay, respectively?
Also, I print the values out at the start of training and they seem fine, but later on the learning rate drops quickly.
```
min_delta, decay rate: 1e-08 lr: 0.0001
/usr/local/lib/python3.6/dist-packages/chainer/training/updaters/multiprocess_parallel_updater.py:155: UserWarning: optimizer.eps is changed to 1e-08 by MultiprocessParallelUpdater for new batch size.
  format(optimizer.eps))
epoch  iteration  main/loss  main/accuracy  lr           fast_validation/main/loss  fast_validation/main/accuracy  validation/main/loss  validation/main/accuracy
1      100        2.49428    0              3.08566e-05  2.36821                    0
3      200        1.94748    0              4.25853e-05  2.23569                    0
     total [#########.........................................] 19.93%
this epoch [#################################################.] 98.60%
       249 iter, 3 epoch / 20 epochs
   0.48742 iters/sec. Estimated time to finish: 0:34:12.368322.
```
Could you explain how these parameters are related, and what might be affecting my learning rate so drastically?