Counting with floating point numbers #111

Open
andeElliott opened this issue Aug 13, 2018 · 1 comment


andeElliott commented Aug 13, 2018

In a test graph, I am getting the following locations vector in a dhist:

[-0.707107, -0.000000, -0.000000, 0.000000, 0.500000, 0.500000]

Notice that the final two points are the same (in fact they differ by about 10^(-16), but they are equal to machine precision), and the middle points are likely the same too (but I didn't check). There seem to be cases where the values are close enough to be merged and cases where they aren't (i.e. I can see places where two 0.5 values are placed in the same bin and others where they are not).
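To illustrate how this can happen, here is a minimal Python sketch (not the package's code). The helper `count_locations` is hypothetical, and the perturbed values are made up for illustration. It shows that two doubles which print identically at six decimal places can still be counted as separate locations when exact equality is used.

```python
from collections import Counter

def count_locations(values):
    """Count occurrences of each value using exact floating-point equality.

    Hypothetical helper for illustration only; it mimics a naive
    "unique values and their counts" step.
    """
    return sorted(Counter(values).items())

a = 0.5
b = 0.5 + 2**-53  # the next representable double above 0.5 (about 1.1e-16 away)

# Both values look identical when printed with six decimal places...
print(f"{a:.6f}", f"{b:.6f}")

# ...but exact comparison treats them as two distinct locations.
print(count_locations([a, b]))

# A tiny negative value is also displayed as a signed zero at this precision.
print(f"{-1e-17:.6f}")
```

A tiny negative value printing as `-0.000000` may be why the middle entries of the vector look like repeated zeros, although that was not checked in the original report.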

Note, this is very unlikely to make a large difference to the actual answer, as we would be adding one additional segment of width 10^(-16) and of height around 1/n; even if this happens n/2 times, the total is still small. But it would affect the speed of the algorithm, as merging the duplicates leaves fewer points to process.

I think we just need to add a binning step at the end of the counts-to-dhist conversion.
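One possible shape for such a binning step is sketched below in Python. This is an assumption about what the step could look like, not the package's implementation: the function name `merge_close_locations`, the tolerance `tol=1e-9`, and the tiny perturbations in the example vector are all hypothetical. Locations are sorted, and any location within `tol` of the start of the current group has its mass added to that group.

```python
def merge_close_locations(locations, masses, tol=1e-9):
    """Merge locations closer than `tol` and sum their masses.

    Hypothetical sketch: each group is represented by its smallest
    location, and a new group starts once a location is more than
    `tol` away from that representative.
    """
    merged_locations, merged_masses = [], []
    for location, mass in sorted(zip(locations, masses)):
        if merged_locations and abs(location - merged_locations[-1]) <= tol:
            merged_masses[-1] += mass  # same bin: accumulate the mass
        else:
            merged_locations.append(location)  # start a new bin
            merged_masses.append(mass)
    return merged_locations, merged_masses

# Illustrative vector shaped like the one in the report; the tiny offsets are invented.
locations = [-0.707107, -2e-17, -1e-17, 1e-17, 0.5, 0.5 + 2**-53]
masses = [1, 1, 1, 1, 1, 1]
print(merge_close_locations(locations, masses))
```

Comparing against the group's first location, rather than the previous point, keeps a long chain of slightly different values from being merged into one bin wider than `tol`.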

@andeElliott

So this is more of an issue than I realised. When dealing with a single bin with noise, the variance scaling can make this problem quite large (NetEMDs of 0.3 between nominally constant sequences). We should add a rounding step to address this.
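The amplification can be sketched in Python as follows. This is a simplified illustration, not the package's variance scaling: the function `scale_to_unit_variance`, the choice to leave a zero-variance sequence centred but unscaled, and the rounding to 9 decimal places are all assumptions. A nominally constant sequence with noise at the 10^(-16) level has a tiny but non-zero standard deviation, so dividing by it stretches the noise to order 1; rounding first removes the noise before it is scaled.

```python
import math

def scale_to_unit_variance(values):
    """Centre values and divide by their population standard deviation.

    Hypothetical sketch: a sequence with zero variance is returned
    centred but unscaled (an assumption, made to avoid dividing by zero).
    """
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if sd == 0.0:
        return [v - mean for v in values]
    return [(v - mean) / sd for v in values]

# Nominally constant sequence where one value is off by a single ulp.
noisy = [0.5, 0.5, 0.5, 0.5 + 2**-53]

scaled = scale_to_unit_variance(noisy)
print(max(scaled) - min(scaled))  # spread is of order 1 after scaling

# Rounding before scaling collapses the sequence back to a constant.
rounded = [round(v, 9) for v in noisy]
scaled_rounded = scale_to_unit_variance(rounded)
print(max(scaled_rounded) - min(scaled_rounded))
```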
