Understanding the Significance of Loss Change in Conservative Q-Learning Training #70
Closed
constellation39 started this conversation in General
Replies: 1 comment
I understand now that there is an error in the reward function I wrote.
Hello Cryolite,
Is the change in loss during Conservative Q-Learning (CQL) training a meaningful signal of how training is going?
In every training run I have tried, the CQL loss ends up increasing: it stays very stable until 1 million training steps, after which it starts to increase linearly.
Thank you.