Chap.4. softmax(dim=1) #32
Comments
After fixing this issue as described above, it turned out that the learning rate needs to be lower for the model to learn well.
Hi friends, with the help of Dr. Kiaei and his excellent reinforcement learning course I was able to correct this code. Many thanks to him. Enjoy it! The changes are:
2- Add squeeze and unsqueeze in some lines.
3- Edit the discount_rewards function so that it produces the returns G_1, G_2, ... (a sketch follows below).
4- In batch mode the model is run again. This extra forward pass can be removed as long as the weight update still happens; in this code I do run the model again in batch mode.

Download all of the corrected code from this link:
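For point 3, here is a minimal sketch of a discount_rewards function that produces the discounted returns G_1, G_2, ... (the gamma value and the normalization step are assumptions, not taken from the linked code):

```python
import torch

def discount_rewards(rewards, gamma=0.99):
    """Turn the per-step rewards of one episode into discounted returns G_t."""
    returns = torch.zeros(len(rewards))
    running = 0.0
    # Work backwards so that G_t = r_t + gamma * G_{t+1}
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    # Optional: rescale the returns to stabilize the policy-gradient update (assumed step)
    if returns.max() > 0:
        returns = returns / returns.max()
    return returns
```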
The code for the model is as below:
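The listing itself did not survive here; the following is a minimal sketch of the kind of policy network Chapter 4 builds for CartPole, assuming the usual layer sizes, with the problematic Softmax(dim=0) in place:

```python
import torch

l1, l2, l3 = 4, 150, 2  # assumed sizes: state dimension, hidden units, number of actions

model = torch.nn.Sequential(
    torch.nn.Linear(l1, l2),
    torch.nn.LeakyReLU(),
    torch.nn.Linear(l2, l3),
    torch.nn.Softmax(dim=0),  # only correct for a single 1-D state vector
)
```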
But the softmax operation with dim=0 is only correct when the input is a one-dimensional array. When you feed it a batch, the softmax is applied along the batch dimension, so each column of the output sums to 1 across the batch instead of each row summing to 1 over the actions. You can check this by printing pred_batch in Listing 4.8. One way to fix it is to modify the model to use dim=1:
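A sketch of that change, assuming the same model definition as above: apply the softmax over the action dimension so each row of a batched input is normalized on its own:

```python
model = torch.nn.Sequential(
    torch.nn.Linear(l1, l2),
    torch.nn.LeakyReLU(),
    torch.nn.Linear(l2, l3),
    torch.nn.Softmax(dim=1),  # normalize each row of a (batch, actions) input over the actions
)
```

As a quick check, with dim=0 each column of pred_batch sums to 1 across the batch, whereas with dim=1 each row sums to 1 over the two actions, which is what the policy needs.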
and do unsqueeze(0) and squeeze(0) for the computation of just one state vector (a sketch follows below).

I like this book very much since it gives some intuition for RL rather than just trying to present the theory ^^
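A sketch of that single-state call, building on the model above (the state variable and the action sampling are illustrative, not taken from the book's listing):

```python
state_t = torch.from_numpy(state).float()      # shape (4,)
pred = model(state_t.unsqueeze(0)).squeeze(0)  # (4,) -> (1, 4) -> (1, 2) -> (2,)
action = torch.multinomial(pred, 1).item()     # sample an action from the policy
```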