Counting with floating point numbers #111

andeElliott · 2018-08-13T17:08:32Z

In a test graph, I am getting the following locations vector in a dhist:

[-0.707107, -0.000000, -0.000000, 0.000000, 0.500000, 0.500000]

Notice that the two final points are the same (in fact they differ by about 10^(-16), but they are the same to machine precision), and the middle points are likely the same too (but I didn't check). There seem to be cases where the values are close enough to be merged and cases where they aren't (i.e. I can see places where two 0.5s are placed in the same bin and some where they are not).

Note, this is very unlikely to make a large difference to the actual answer, as we would be adding one additional segment of width 10^(-16) and of height around 1/n; even if this happens n/2 times, the total is still small. But it would affect the speed of the algorithm, since merging the duplicates leaves fewer points to process.

I think we just need to add a binning step to the end of counts to dhist
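A minimal sketch of what such a binning step could look like, written in Python purely for illustration. The function name `merge_close_locations`, the tolerance `tol`, and the example values are all assumptions, not part of the package: locations within `tol` of the first location in their group are merged and their masses summed.

```python
def merge_close_locations(locations, masses, tol=1e-9):
    """Merge locations that lie within `tol` of a group's first location.

    Hypothetical helper: `tol` is an assumed tolerance, not a package default.
    """
    merged_locs, merged_masses = [], []
    # Sort by location so near-duplicates become adjacent.
    for loc, mass in sorted(zip(locations, masses)):
        if merged_locs and abs(loc - merged_locs[-1]) <= tol:
            # Close enough to the current group's representative: add its mass.
            merged_masses[-1] += mass
        else:
            # Start a new group with this location as its representative.
            merged_locs.append(loc)
            merged_masses.append(mass)
    return merged_locs, merged_masses


# Illustrative data shaped like the vector above (the tiny offsets are made up).
locations = [-0.707107, -1e-17, -2e-17, 1e-17, 0.5, 0.5 + 1e-16]
masses = [1, 1, 1, 1, 1, 1]
locs, ms = merge_close_locations(locations, masses)
```

With this input the six locations collapse to three groups. Comparing against the group's first location (rather than the previous point) keeps a long chain of slightly increasing values from drifting into one bin.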

andeElliott · 2018-08-15T20:49:47Z

So this is more of an issue than I realised. When dealing with a single bin with noise, the variance scaling can make this problem quite large (NetEMDs of 0.3 between nominally constant sequences). We should add a rounding step to address this.
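A rough illustration of the rounding idea, again in Python and under assumptions: the helper name `counts_after_rounding` and the choice of 9 decimal digits are hypothetical. A nominally constant sequence with noise around 10^(-16) yields several distinct values when counted directly, but a single bin once the values are rounded first.

```python
from collections import Counter


def counts_after_rounding(values, digits=9):
    """Count occurrences after rounding to `digits` decimals (assumed precision)."""
    return Counter(round(v, digits) for v in values)


# A nominally constant sequence with machine-precision noise (made-up values).
noisy_constant = [0.5, 0.5 + 1e-16, 0.5 - 1e-16, 0.5]

raw = Counter(noisy_constant)                    # counts exact float values
rounded = counts_after_rounding(noisy_constant)  # counts rounded values

print(len(raw), len(rounded))
```

One caveat with plain rounding is that two values straddling a rounding boundary can still land in different bins, so a tolerance-based merge may be the safer choice.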

martintoreilly added the bug label Sep 19, 2018
