Where did the #15
Comments
Hi, @ChadMcintire
Thank you for responding. I am sorry, I have been trying to derive it myself, and I'm not asking you to derive anything. Which will simplify as follows: This is the equation I saw in the Stable Baselines3 code base. I like your code and want to understand it better. The equation in the code looks more like this: I'm unsure where the "sum" comes from, as well as the "n". In your reply, you say that you derived the formula from a log normal distribution. By std * I, do you mean the standard deviation times the identity matrix? I'm sure I'm missing something, I apologize.
Hi, please note that the distribution is a multivariate Gaussian and the variables are independent. To be precise, sigma is an N-dimensional diagonal matrix, whose diagonal elements are equal to So you need to calculate the probability of each dimension and multiply them together. Alternatively, calculate the log probabilities and sum them. Because there is no need to calculate the probability itself, I calculated it this way to reduce the computation.
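To make the "multiply the probabilities, or sum the log probabilities" point concrete, here is a minimal sketch in plain Python. The function name `diag_gaussian_log_prob` and the test values are hypothetical. The closed form is an assumption about what a function taking `(log_std, noise)` would compute when `noise = (x - mean) / std`; it is not copied from the repository.

```python
import math

def univariate_log_pdf(x, mu, std):
    # Log density of a 1-D normal distribution N(mu, std^2) at x.
    return -0.5 * ((x - mu) / std) ** 2 - math.log(std) - 0.5 * math.log(2.0 * math.pi)

def diag_gaussian_log_prob(log_stds, noises):
    # Hypothetical closed form (an assumption, not the repository's code):
    # sum over dimensions of (-0.5 * noise^2 - log_std), minus (n / 2) * log(2 * pi),
    # where noise_i = (x_i - mu_i) / std_i.
    n = len(log_stds)
    per_dim = sum(-0.5 * e * e - ls for ls, e in zip(log_stds, noises))
    return per_dim - 0.5 * n * math.log(2.0 * math.pi)

# Arbitrary illustrative values for a 3-dimensional action.
mus = [0.3, -1.2, 2.0]
log_stds = [-0.5, 0.1, 0.7]
noises = [0.8, -0.4, 1.5]

stds = [math.exp(ls) for ls in log_stds]
xs = [m + s * e for m, s, e in zip(mus, stds, noises)]  # x = mu + std * noise

# Route 1: sum the per-dimension log probabilities.
summed = sum(univariate_log_pdf(x, m, s) for x, m, s in zip(xs, mus, stds))

# Route 2: multiply the per-dimension probabilities.
product = math.prod(math.exp(univariate_log_pdf(x, m, s)) for x, m, s in zip(xs, mus, stds))

# Route 3: the closed form in terms of log_std and noise.
closed_form = diag_gaussian_log_prob(log_stds, noises)

print(summed, math.log(product), closed_form)
```

All three routes should agree up to floating-point error. Summing logs is usually preferred because the product of many small densities can underflow.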
So if I start with the multivariate Gaussian PDF. Taking the natural log of both sides To make it look more like the code: The code looks like this I'm so sorry, I don't understand from your comment how you go from expression 1 to the expression in your code. Since the last part of the expression is the same in both, how do you derive the first part to match your code? 1st part of ln pdf: vs. first part of the code.
Calculate the probability (or log probability) of each dimension independently and multiply (or add) them together, because these variables are independent.
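A sketch of that per-dimension argument written out, under assumed notation: $\Sigma = \mathrm{diag}(\sigma_1^2, \dots, \sigma_n^2)$, and $\epsilon_i = (x_i - \mu_i)/\sigma_i$ is the noise. These symbols are my own labels and may not match the ones used elsewhere in this thread.

```latex
\begin{aligned}
\ln p(x)
  &= \ln \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi}\,\sigma_i}
     \exp\!\left(-\frac{(x_i-\mu_i)^2}{2\sigma_i^2}\right) \\
  &= \sum_{i=1}^{n} \left( -\frac{1}{2}\,\epsilon_i^{2} - \ln \sigma_i \right)
     - \frac{n}{2}\,\ln(2\pi)
\end{aligned}
```

This would account for both the "sum" and the "n". For a diagonal $\Sigma$, $\ln\lvert\Sigma\rvert = 2\sum_i \ln\sigma_i$, so the $-\tfrac{1}{2}\ln\lvert\Sigma\rvert$ term of the multivariate log PDF becomes $-\sum_i \ln\sigma_i$, and the $-\tfrac{n}{2}\ln(2\pi)$ term is the normalizing constant counted once per dimension.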
I am curious where the calculate_gaussian_log_prob(log_std, noise) function in your utils.py came from. It doesn't look like the Stable Baselines or PyTorch log PDF of the normal distribution. So what is it, if you don't mind answering?