low KS p-values for small domains #1
Hello Anne, do the x values only cover a very small subset of the domain?
So the actual data represent only a section of one tail of the distribution
curve?
You could try the Anderson-Darling test, which is more focused on the tails
of the distribution whereas KS pays more attention to the center.
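To illustrate the comparison, here is a minimal sketch using SciPy directly (an assumption: it does not go through this project's own API). The sample is synthetic, and `scipy.stats.anderson` only supports a handful of families, so the normal family stands in for whatever distribution is actually being fitted. Keep in mind that the KS p-value is only approximate when the parameters were estimated from the same data.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical sample: heavy-tailed data that resembles a normal near the center.
sample = rng.standard_t(df=3, size=2000)

# Fit a normal distribution, then run the KS test against the fitted parameters.
loc, scale = stats.norm.fit(sample)
ks = stats.kstest(sample, "norm", args=(loc, scale))

# Anderson-Darling test against the normal family
# (location and scale are estimated internally).
ad = stats.anderson(sample, dist="norm")

print(f"KS statistic = {ks.statistic:.4f}, p-value = {ks.pvalue:.4g}")
print(f"AD statistic = {ad.statistic:.4f}")
for crit, sig in zip(ad.critical_values, ad.significance_level):
    print(f"  reject at the {sig}% level if the statistic exceeds {crit:.3f}")
```

The Anderson-Darling result is read by comparing the statistic with the tabulated critical values rather than through a single p-value.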
You could also consider fitting truncated distribution functions, in which
the x values cover only the part of the x-axis between the observed or
estimated minimum and maximum. That way the test does not look for a long
tail that may not be present in the actual data.
Truncated distribution - Wikipedia
<https://en.wikipedia.org/wiki/Truncated_distribution>
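A rough sketch of the truncation idea, assuming SciPy and a synthetic sample: the window `lo`, `hi` and the helper `truncated_cdf` are hypothetical names for illustration, and the base distribution parameters are taken as known instead of fitted. The truncated CDF is the base CDF rescaled to the window, F_T(x) = (F(x) - F(lo)) / (F(hi) - F(lo)).

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
lo, hi = 1.0, 2.0  # hypothetical observation window (one tail section)

# Draw from a standard normal but keep only the values inside the window,
# mimicking data that cover just one section of the distribution.
raw = rng.normal(size=200_000)
tail = raw[(raw >= lo) & (raw <= hi)][:500]

base = stats.norm(loc=0.0, scale=1.0)

def truncated_cdf(x):
    """CDF of `base` conditioned on lo <= X <= hi."""
    x = np.clip(x, lo, hi)
    return (base.cdf(x) - base.cdf(lo)) / (base.cdf(hi) - base.cdf(lo))

# KS test against the full distribution and against the truncated one.
ks_full = stats.kstest(tail, base.cdf)
ks_trunc = stats.kstest(tail, truncated_cdf)

print(f"untruncated: D = {ks_full.statistic:.3f}, p = {ks_full.pvalue:.3g}")
print(f"truncated:   D = {ks_trunc.statistic:.3f}, p = {ks_trunc.pvalue:.3g}")
```

Against the untruncated distribution the test sees all the probability mass outside the window as missing, so the statistic is large even though the shape inside the window matches.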
In the distribution equation, I'd use a location and scale parameter. The
test may find a better alignment with a shifted or scaled distribution.
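As a small illustration of why the location parameter matters, the sketch below (SciPy again, with a synthetic shifted exponential sample as an assumption) fits the same family twice: once with the location pinned at zero via `floc=0`, and once with location and scale both free.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

# Hypothetical data: an exponential shape that starts at x = 5, not at 0.
data = stats.expon.rvs(loc=5.0, scale=0.5, size=1000, random_state=rng)

# Fit with the location forced to zero (only the scale is estimated) ...
params_fixed = stats.expon.fit(data, floc=0)
# ... and with both location and scale free.
params_free = stats.expon.fit(data)

ks_fixed = stats.kstest(data, "expon", args=params_fixed)
ks_free = stats.kstest(data, "expon", args=params_free)

print(f"loc fixed at 0: params = {params_fixed}, D = {ks_fixed.statistic:.3f}")
print(f"loc free:       params = {params_free}, D = {ks_free.statistic:.3f}")
```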
If your x values do cover much of the domain and not only a tail section,
then I'd try multiplying the x values by a scale factor, e.g. 1 million, as
if you measured your x values not in kilometers but in millimeters, to see
if the test no longer struggles with numerical precision issues.
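One way to run that check, sketched with SciPy on synthetic data (the sample, the factor, and the helper `fit_and_test` are assumptions for illustration): fit and test the raw values, then do the same after rescaling. For a well-behaved fit the KS statistic should barely change, since only the units differ, so a large gap between the two runs would point to a numerical problem in the fit.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)

# Hypothetical measurements on a very small domain (order of 1e-6).
x = rng.normal(loc=3e-6, scale=5e-7, size=1000)
factor = 1e6  # e.g. switch units so that the values are of order 1

def fit_and_test(values):
    """Fit a normal distribution and return the KS result and the peak pdf value."""
    loc, scale = stats.norm.fit(values)
    result = stats.kstest(values, "norm", args=(loc, scale))
    peak_pdf = stats.norm.pdf(loc, loc, scale)
    return result, peak_pdf

ks_small, peak_small = fit_and_test(x)
ks_scaled, peak_scaled = fit_and_test(x * factor)

print(f"raw:    D = {ks_small.statistic:.6f}, peak pdf = {peak_small:.1f}")
print(f"scaled: D = {ks_scaled.statistic:.6f}, peak pdf = {peak_scaled:.4f}")
```

The pdf values shrink by the same factor the x values grow by, which is why a small domain produces pdf values with many digits in the first place.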
…On Thu, Oct 13, 2022 at 11:21 AM Anne Hulsey ***@***.***> wrote:
Hello. First, this tool is amazing, thank you so much for it.
I am trying to fit a distribution with a really small domain, which means
the pdf values have 6 digits. I think this is causing an issue with the
kstest because some of the fits look great visually but have quite small ks
values. Do you have any suggestions for this?
[image: image]
<https://user-images.githubusercontent.com/28743653/195500552-3bc58b94-8f5e-4ca6-a556-08f1331ed83f.png>
Thank you, this gave great insights into a variety of issues. (Btw, before seeing your response, I found the Bayesian Information Criterion, which I used and which seems to fit my purpose a bit better.) Your comments also helped me think about why I am looking for a fitted distribution in the first place. In my use case, it may actually be better to use a smoothed version of the empirical pdf, rather than finding a parametric fit.
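For anyone landing here later, a minimal sketch of both ideas with SciPy, on synthetic data. The helper `bic`, the candidate list, and the choice of a Gaussian kernel density estimate as the "smoothed empirical pdf" are assumptions for illustration, not necessarily what was used in this thread. BIC is computed as k ln(n) - 2 ln(L), where k counts all fitted parameters including location and scale.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
data = rng.normal(loc=0.0, scale=1.0, size=1000)  # hypothetical sample

def bic(dist, values):
    """Bayesian Information Criterion of the maximum-likelihood fit of `dist`."""
    params = dist.fit(values)
    log_likelihood = np.sum(dist.logpdf(values, *params))
    k = len(params)  # number of fitted parameters (shapes, loc, scale)
    return k * np.log(len(values)) - 2.0 * log_likelihood

# Compare a few candidate families; the lowest BIC is preferred.
candidates = ("norm", "expon", "logistic")
scores = {name: bic(getattr(stats, name), data) for name in candidates}
best = min(scores, key=scores.get)
print(scores, "-> best:", best)

# Non-parametric alternative: a smoothed empirical pdf via a Gaussian KDE.
kde = stats.gaussian_kde(data)
grid = np.linspace(data.min(), data.max(), 200)
smoothed_pdf = kde(grid)
```

Unlike a KS p-value, BIC only ranks the candidates against each other, so it does not say whether the best one fits well in absolute terms.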
Hello. First, this tool is amazing, thank you so much for it.
I am trying to fit a distribution with a really small domain, which means the pdf values have 6 digits. I think this is causing an issue with the kstest because some of the fits look great visually but have quite small ks values. Do you have any suggestions for this?