low KS p-values for small domains #1

annehulsey · 2022-10-13T04:21:07Z

Hello. First, this tool is amazing, thank you so much for it.

I am trying to fit a distribution with a really small domain, which means the pdf values have 6 digits. I think this is causing an issue with the kstest, because some of the fits look great visually but have quite small KS p-values. Do you have any suggestions for this?

[image: image] <https://user-images.githubusercontent.com/28743653/195500552-3bc58b94-8f5e-4ca6-a556-08f1331ed83f.png>

h3ik0th · 2022-10-13T05:33:37Z

Hello Anne, do the x values only cover a very small subset of the domain, so that the actual data represent only a section of one tail of the distribution curve?

You could try the Anderson-Darling test, which focuses more on the tails of the distribution, whereas KS pays more attention to the center.

You could also consider fitting truncated distribution functions, in which the x values only cover the x-axis between the observed or estimated minimum and maximum, so that the test does not look for a long tail that may not be present in the actual data. See Truncated distribution - Wikipedia <https://en.wikipedia.org/wiki/Truncated_distribution>.

In the distribution equation, I'd use a location and a scale parameter. The test may find a better alignment with a shifted or scaled distribution.

If your x values do cover much of the domain and not only a tail section, then I'd try multiplying the x values by a scale factor, e.g. 1 million, as if you measured your x values not in kilometers but in millimeters, to see whether the test no longer struggles with numerical precision issues.
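As a rough illustration of the rescaling check and of the KS versus Anderson-Darling comparison, here is a minimal sketch using only the Python standard library. The data are synthetic and hypothetical (a narrow normal sample around 2.5e-6), the helper names are made up for this example, and it computes only the test statistics, not p-values. In practice `scipy.stats.kstest` and `scipy.stats.anderson` would be the usual choices.

```python
import math
import random


def normal_cdf(x, loc, scale):
    """CDF of a normal distribution with the given location and scale."""
    return 0.5 * (1.0 + math.erf((x - loc) / (scale * math.sqrt(2.0))))


def fit_normal(data):
    """Maximum-likelihood location (mean) and scale (population std dev)."""
    n = len(data)
    loc = sum(data) / n
    scale = math.sqrt(sum((x - loc) ** 2 for x in data) / n)
    return loc, scale


def ks_statistic(data, cdf):
    """Largest gap between the empirical CDF and the fitted CDF."""
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


def ad_statistic(data, cdf, eps=1e-300):
    """Anderson-Darling A^2; the log terms weight the tails more heavily."""
    xs = sorted(data)
    n = len(xs)
    total = 0.0
    for i in range(n):
        f_low = max(cdf(xs[i]), eps)                 # F at the i-th smallest value
        f_high = max(1.0 - cdf(xs[n - 1 - i]), eps)  # 1 - F at the i-th largest value
        total += (2 * i + 1) * (math.log(f_low) + math.log(f_high))
    return -n - total / n


def both_statistics(data):
    """Fit location and scale, then return (KS D, AD A^2) for that fit."""
    loc, scale = fit_normal(data)
    cdf = lambda x: normal_cdf(x, loc, scale)
    return ks_statistic(data, cdf), ad_statistic(data, cdf)


random.seed(0)
# Hypothetical sample on a very narrow domain (assumption, not the real data).
raw = [random.gauss(2.5e-6, 4.0e-7) for _ in range(500)]
# Same sample expressed in units that are one million times smaller.
scaled = [x * 1e6 for x in raw]

ks_raw, ad_raw = both_statistics(raw)
ks_scaled, ad_scaled = both_statistics(scaled)
print(f"raw units:    KS D = {ks_raw:.6f}, AD A^2 = {ad_raw:.6f}")
print(f"scaled units: KS D = {ks_scaled:.6f}, AD A^2 = {ad_scaled:.6f}")
```

Both statistics depend only on the fitted CDF values, so when location and scale are refitted the two rows should agree closely. A noticeable difference on real data would point to a numerical precision problem somewhere in the fitting step rather than to a poor fit.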

annehulsey · 2022-10-13T20:16:15Z

Thank you, this gave great insights into a variety of issues. (Btw, before seeing your response, I found the Bayesian Information Criterion, which I used and seems to fit my purpose a bit better.)
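For anyone reading along, a minimal sketch of ranking candidate fits by BIC, using only the standard library. The sample is synthetic and hypothetical, the two candidates (normal and exponential) are chosen purely for illustration, and the function names are made up for this example.

```python
import math
import random


def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion: k * ln(n) - 2 * ln(L); lower is better."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


def normal_loglik(data):
    """Log-likelihood at the maximum-likelihood normal fit (2 parameters)."""
    n = len(data)
    mean = sum(data) / n
    var = sum((x - mean) ** 2 for x in data) / n
    return -0.5 * n * (math.log(2.0 * math.pi * var) + 1.0)


def expon_loglik(data):
    """Log-likelihood at the maximum-likelihood exponential fit (1 parameter).

    Assumes strictly positive data and a location fixed at zero.
    """
    n = len(data)
    mean = sum(data) / n
    return -n * (math.log(mean) + 1.0)


random.seed(1)
# Hypothetical narrow-domain sample (assumption, not the real data).
sample = [random.gauss(2.5e-6, 4.0e-7) for _ in range(500)]
n_obs = len(sample)

bic_normal = bic(normal_loglik(sample), n_params=2, n_obs=n_obs)
bic_expon = bic(expon_loglik(sample), n_params=1, n_obs=n_obs)

candidates = {"normal": bic_normal, "exponential": bic_expon}
best = min(candidates, key=candidates.get)
print(candidates, "-> preferred:", best)
```

Because the pdf values are large on such a small domain, the log-likelihoods are large and positive and the BIC values come out strongly negative. Only the difference between candidates matters for the ranking.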

Your comments also helped me think about why I am looking for a fitted distribution in the first place. In my use case, it may actually be better to use a smoothed version of the empirical pdf, rather than finding a parametric fit.
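One way to get such a smoothed empirical pdf is a Gaussian kernel density estimate. The sketch below is only an assumption about what "smoothed" could mean here: it uses the standard library, synthetic hypothetical data, and the normal-reference rule of thumb h = 1.06 * s * n^(-1/5) for the bandwidth. `scipy.stats.gaussian_kde` offers a ready-made version.

```python
import math
import random
import statistics


def gaussian_kde(data, bandwidth=None):
    """Return (pdf, bandwidth) for a Gaussian kernel density estimate."""
    n = len(data)
    if bandwidth is None:
        # Normal-reference rule of thumb for the bandwidth.
        bandwidth = 1.06 * statistics.stdev(data) * n ** (-0.2)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def pdf(x):
        # Average of Gaussian bumps centred on each observation.
        return norm * sum(
            math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in data
        )

    return pdf, bandwidth


random.seed(2)
# Hypothetical narrow-domain sample (assumption, not the real data).
sample = [random.gauss(2.5e-6, 4.0e-7) for _ in range(500)]

pdf, h = gaussian_kde(sample)

# Evaluate on a grid that extends a few bandwidths past the data range.
lo, hi = min(sample) - 5.0 * h, max(sample) + 5.0 * h
n_grid = 400
step = (hi - lo) / (n_grid - 1)
grid = [lo + i * step for i in range(n_grid)]
density = [pdf(x) for x in grid]

# Trapezoidal check that the smoothed pdf integrates to roughly one.
area = step * (sum(density) - 0.5 * (density[0] + density[-1]))
peak = max(density)
print(f"bandwidth = {h:.3e}, peak density = {peak:.3e}, area = {area:.4f}")
```

The peak density is very large because the domain is so narrow, which matches the many-digit pdf values mentioned at the start of the thread. The area check confirms the estimate is still a valid density.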
