Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rev=true in searchsorted_interval #111

Merged
merged 3 commits into from
Oct 17, 2023
Merged

Conversation

aplavin
Copy link
Contributor

@aplavin aplavin commented Jul 30, 2022

this also simplifies findall

no new tests added: rely on the extensive findall testsuite

@codecov
Copy link

codecov bot commented Jul 30, 2022

Codecov Report

Patch coverage: 100.00% and project coverage change: +0.29% 🎉

Comparison is base (b3f0d6e) 98.81% compared to head (fa4cdbe) 99.10%.
Report is 23 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #111      +/-   ##
==========================================
+ Coverage   98.81%   99.10%   +0.29%     
==========================================
  Files           3        4       +1     
  Lines         253      224      -29     
==========================================
- Hits          250      222      -28     
+ Misses          3        2       -1     
Files Changed Coverage Δ
src/findall.jl 100.00% <100.00%> (ø)

... and 3 files with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Collaborator

@hyrodium hyrodium left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found some performance degression. Could you check the following result?

julia> versioninfo()
Julia Version 1.7.3
Commit 742b9abb4d (2022-05-06 12:58 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: AMD Ryzen 7 2700X Eight-Core Processor
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-12.0.1 (ORCJIT, znver1)

On the current master

julia> @benchmark findall((1..3), 2:8)
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
 Range (min  max):  0.020 ns  0.031 ns  ┊ GC (min  max): 0.00%  0.00%
 Time  (median):     0.020 ns             ┊ GC (median):    0.00%
 Time  (mean ± σ):   0.021 ns ± 0.004 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

  █                                                          
  █▁▁▁▁▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▄ ▂
  0.02 ns        Histogram: frequency by time       0.03 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

With this PR

julia> @benchmark findall((1..3), 2:8)
BenchmarkTools.Trial: 10000 samples with 870 evaluations.
 Range (min  max):  136.305 ns   2.080 μs  ┊ GC (min  max): 0.00%  92.90%
 Time  (median):     141.348 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   147.630 ns ± 89.578 ns  ┊ GC (mean ± σ):  2.95% ±  4.52%

   ▁▄▆████▇▆▅▃▂▁▁                               ▁        ▁ ▁   ▃
  ▇████████████████▆▆▆▅▅▅▄▅▄▅▅▄▅▄▅▅▁▄▃▅▅▅▄▁▅▆▆▇███▇▇▇████████▇ █
  136 ns        Histogram: log(frequency) by time       183 ns <

 Memory estimate: 128 bytes, allocs estimate: 4.

src/findall.jl Show resolved Hide resolved
@aplavin
Copy link
Contributor Author

aplavin commented Aug 2, 2022

Hm, that's... interesting.
Basically, the issue is that without lt = < Base searchsorted functions use isless, that doesn't consider +-0 to be equal:

julia> searchsorted([-1, -0.0, 0.0, 1], 0)
3:3
julia> searchsorted([-1, -0.0, 0.0, 1], 0; lt=<)
2:3

As IntervalSets use < for all comparisons, searchsorted_interval should be consistent with that. And indeed, with lt = < the results are always correct.
Performance for arrays stays the same with or without lt = <, but not for ranges. They special-case the default ordering (without lt = ...) and just mathematically compute the resulting indices at O(1). With lt = <, however, searchsorted functions fall back to the abstractarray implementations with O(log N) steps.

Not sure what's the best solution here.
To be consistent with interval in, searchsorted_interval has to use lt = < or do some postprocessing of searchsorted results. It wasn't correct for +-0 before this PR. At the same time, this change somewhat degrades its performance for ranges.
For example, I can roll back the old findall implementation, while keeping searchsorted_interval as in this PR. Then, findall would behave exactly as before, and searchsorted_interval would become (i) correct and (ii) somewhat slower for ranges.

@aplavin
Copy link
Contributor Author

aplavin commented Aug 15, 2022

The updated implementation is both correct (findall tests pass, and they do check for zeros of both signs), and keeps the performance for ranges (Base searchsorted funcs are called without lt=< argument).

@aplavin
Copy link
Contributor Author

aplavin commented Aug 19, 2022

This PR also fixes #100 btw.

@hyrodium
Copy link
Collaborator

This PR also fixes #100 btw.

Great! Could you add the following tests in test/findall.jl?

    @testset "zero-step ranges" begin
        r = range(1,1,10)
        @test findall(in(1..3),r) == 1:10
        @test findall(in(2..3),r) |> isempty
    end

Basically, the issue is that without lt = < Base searchsorted functions use isless, that doesn't consider +-0 to be equal:

Thanks! I didn't know that property of searchsorted.

and keeps the performance for ranges (Base searchsorted funcs are called without lt=< argument).

Hmm, the performance is now getting much better than the first benchmark, but there still be some performance regressions:

On the current master

julia> using IntervalSets, BenchmarkTools

julia> @benchmark findall((1..3), 2:8)
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
 Range (min  max):  0.020 ns  0.031 ns  ┊ GC (min  max): 0.00%  0.00%
 Time  (median):     0.020 ns             ┊ GC (median):    0.00%
 Time  (mean ± σ):   0.021 ns ± 0.004 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

  █                                                          
  █▁▁▁▁▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▄ ▂
  0.02 ns        Histogram: frequency by time       0.03 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark findall(($(1..3)), $(2:8))
BenchmarkTools.Trial: 10000 samples with 999 evaluations.
 Range (min  max):  11.222 ns  18.574 ns  ┊ GC (min  max): 0.00%  0.00%
 Time  (median):     11.332 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   11.383 ns ±  0.229 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

                   ▅▃█ ▄                                       
  ▂▂▁▂▁▂▁▂▂▂▂▁▄▁▇▁▃███▁█▁▂▁▂▂▁▁▁▁▁▁▁▁▂▁▃▁▅▂▂▆▁▅▁▃▁▂▂▂▂▁▂▁▅▁▄▃ ▃
  11.2 ns         Histogram: frequency by time        11.6 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

With this PR

julia> using IntervalSets, BenchmarkTools

julia> @benchmark findall((1..3), 2:8)
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
 Range (min  max):  1.442 ns  5.931 ns  ┊ GC (min  max): 0.00%  0.00%
 Time  (median):     1.472 ns             ┊ GC (median):    0.00%
 Time  (mean ± σ):   1.471 ns ± 0.097 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

   ▄  █                                                      
  ██▂▃█▃▅▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▂▁▁▁▁▁▂▁▂▁▁▁▁▁▂▂▁▂▂▁▂▂▁▂▂▁▁▂▁▂▂▁▂▂▂ ▂
  1.44 ns        Histogram: frequency by time       1.82 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark findall(($(1..3)), $(2:8))

BenchmarkTools.Trial: 10000 samples with 927 evaluations.
 Range (min  max):  112.177 ns   1.880 μs  ┊ GC (min  max): 0.00%  93.46%
 Time  (median):     119.698 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   125.097 ns ± 87.030 ns  ┊ GC (mean ± σ):  3.47% ±  4.66%

    ▂▄▅▅▅▇▇▇▇█▇▄▃▂▂▁                                  ▁        ▂
  ▆██████████████████▆▇▆▅▆▅▄▆▅▄▃▃▄▁▄▁▄▄▆▅▆▆▇▆▅▆▆▆▆▇███████▇▅▆▇ █
  112 ns        Histogram: log(frequency) by time       157 ns <

 Memory estimate: 128 bytes, allocs estimate: 4.

Sorry I have not taken the time to review. I have not yet checked the reasons for the performance degression.

@aplavin
Copy link
Contributor Author

aplavin commented Aug 21, 2022

Maybe, the time difference is some benchmarking artifact. At least the timings definitely looks suspicious, in both cases.

  1. findall(∈(1..3), 2:8) changing from 0.02 ns to 1.47 ns. The first timing is totally implausible, definitely an artifact. 1.5 ns is close to the fastest computation possible, even in principle.
  2. findall(∈($(1..3)), $(2:8)) being significantly slower than without dollars, for both IntervalSets versions. This is actually the first time I see $ making benchmark slower. But it occurs with very simple Base code as well: @btime 1:2:1000 gives 2 ns, @btime 1:$(2):1000 10 ns.
    Also, very surprising that allocations appear with dollars in benchmark. I don't understand this at all.

Running both (old and new) versions manually in a loop gets 1.5s for 10^9 iterations, so 1.5 ns - agrees with btime without dollars for the new version. No allocations, of course.

@aplavin
Copy link
Contributor Author

aplavin commented Aug 21, 2022

Added the zero-step tests.

@aplavin
Copy link
Contributor Author

aplavin commented Sep 19, 2022

Anything else remains to be done here?..

@hyrodium
Copy link
Collaborator

I checked the benchmarks again, and it seems there still be some performance degressions with findall.

Before this PR

julia> using IntervalSets, BenchmarkTools

julia> i = 1..3
1..3

julia> r = 2:8
2:8

julia> @benchmark findall(($i), $r)
BenchmarkTools.Trial: 10000 samples with 999 evaluations.
 Range (min  max):  11.332 ns  18.383 ns  ┊ GC (min  max): 0.00%  0.00%
 Time  (median):     11.443 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   11.496 ns ±  0.203 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

       ▁▃    ▃ ▂  ▇ █  ▂ ▂  ▁ ▁ ▂  ▂ ▆    ▅▂▅               ▃ ▂
  ▇▆▁▅▁██▁▃▁▁█▁█▁██▁█▁▆█▁█▁▆█▁█▁█▅▇█▁█▅█▇▁████▁▆▁▃▄▁▇▁▆▇▁▅▁▇█ █
  11.3 ns      Histogram: log(frequency) by time      11.7 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> r = 2.0:8.0
2.0:1.0:8.0

julia> @benchmark findall(($i), $r)
BenchmarkTools.Trial: 10000 samples with 980 evaluations.
 Range (min  max):  62.996 ns  186.444 ns  ┊ GC (min  max): 0.00%  0.00%
 Time  (median):     64.929 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   65.645 ns ±   5.061 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▁▆▇█▆▁▁▁▁▁                       ▂                           ▂
  ███████████▇▅▅▄▄▄▄▄▃▃▁▃▁▁▃▁▁▃▁▃▆▄█▁▄▃▁▃▄▄▁▁▄▄▃▆▅▄▅▃▃▄▃▅▁▅▄▅▅ █
  63 ns         Histogram: log(frequency) by time      99.6 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

After this PR

julia> using IntervalSets, BenchmarkTools
[ Info: Precompiling IntervalSets [8197267c-284f-5f27-9208-e0e47529a953]

julia> using IntervalSets, BenchmarkTools

julia> i = 1..3
1..3

julia> r = 2:8
2:8

julia> @benchmark findall(($i), $r)
BenchmarkTools.Trial: 10000 samples with 979 evaluations.
 Range (min  max):  64.330 ns   1.845 μs  ┊ GC (min  max): 0.00%  94.39%
 Time  (median):     71.279 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   74.554 ns ± 56.606 ns  ┊ GC (mean ± σ):  2.72% ±  3.43%

            ▂▆▇██▇▆▄▂▁▁▁▂▁                    ▁▂▂▁▁           ▂
  ▂▃▃▂▂▂▃▂▂▇███████████████▆▅▅▄▅▅▄▅▂▅▄▆▆▆▄▅▅▆███████▇▆▅▆▅▆▆██ █
  64.3 ns      Histogram: log(frequency) by time      92.8 ns <

 Memory estimate: 64 bytes, allocs estimate: 2.

julia> r = 2.0:8.0
2.0:1.0:8.0

julia> @benchmark findall(($i), $r)
BenchmarkTools.Trial: 10000 samples with 994 evaluations.
 Range (min  max):  32.466 ns  62.311 ns  ┊ GC (min  max): 0.00%  0.00%
 Time  (median):     32.909 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   33.192 ns ±  1.534 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▂▆█▄▆                                                     ▁ ▂
  █████▅▅▄▄▆▇▆▄▆▆▇▆▆▆▁▅▅▅▆▇▆▇▅▅▅▄▄▃▃▃▄▃▃▁▃▃▁▁▃▁▃▃▃▃▁▁▃▁▁▄▁▃▅█ █
  32.5 ns      Histogram: log(frequency) by time      42.6 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

If r is a range with integers, then the current master branch wins, but if r is a range with floats, then this PR wins.

findall(∈($(1..3)), $(2:8)) being significantly slower than without dollars, for both IntervalSets versions. This is actually the first time I see $ making benchmark slower. But it occurs with very simple Base code as well: @btime 1:2:1000 gives 2 ns, @btime 1:$(2):1000 10 ns.
Also, very surprising that allocations appear with dollars in benchmark. I don't understand this at all.

I also don't understand the behavior well, but I thought it would be better to use variables such as i and r to clarify the difference in the performances.
BenchmarkTools.jl documentation

@hyrodium
Copy link
Collaborator

Maybe, it seems better to split this PR into

  • Add rev keyword argument in searchsorted_interval function.
  • Simplify findall function.

to merge this PR quickly.

@hyrodium
Copy link
Collaborator

The searchsorted function take O(log(n)) time, but the original findall function take O(1) time.

Bedore this PR

julia> using IntervalSets, BenchmarkTools

julia> i = 1..3
1..3

julia> r = 2:8
2:8

julia> @benchmark findall(($i), $r)
BenchmarkTools.Trial: 10000 samples with 999 evaluations.
 Range (min  max):  11.332 ns  37.027 ns  ┊ GC (min  max): 0.00%  0.00%
 Time  (median):     11.442 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   11.741 ns ±  1.556 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

  █▅                  ▁                                     ▁ ▁
  ██▄▄▄▃▃▁▄▅▄▃▄▄▅▄▆▅▃██▅▄▁▅▃▁▁▄▃▁▁▃▄▁▃▃▁▃▁▄▃▁▃▁▁▁▃▁▁▁▁▃▁▁▃▁▆█ █
  11.3 ns      Histogram: log(frequency) by time      21.7 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> r = -2000000:80000000
-2000000:80000000

julia> @benchmark findall(($i), $r)  # Benchmark result remains mostly the same.
BenchmarkTools.Trial: 10000 samples with 999 evaluations.
 Range (min  max):  12.475 ns  22.756 ns  ┊ GC (min  max): 0.00%  0.00%
 Time  (median):     12.567 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   12.619 ns ±  0.280 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

   ▅ ▅▁       ▄ █▂▄  ▂   ▃ ▅            ▆▄       ▁ ▃          ▂
  ▇█▁██▁▅▆▇▁▁▇█▁███▅▁██▁▇█▁█▁▁▁▁▁▁▁▃▁▄█▁██▁▇▇▄▇▁▄█▁█▁▁▃▁▃▃▁▁▄ █
  12.5 ns      Histogram: log(frequency) by time      12.8 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

After this PR

julia> using IntervalSets, BenchmarkTools

julia> i = 1..3
1..3

julia> r = 2:8
2:8

julia> @benchmark findall(($i), $r)
BenchmarkTools.Trial: 10000 samples with 973 evaluations.
 Range (min  max):  72.635 ns   1.769 μs  ┊ GC (min  max): 0.00%  93.18%
 Time  (median):     74.240 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   77.692 ns ± 58.474 ns  ┊ GC (mean ± σ):  2.70% ±  3.43%

   ▄██▇▇▆▄▃▂▁▁▁▁▁▁                         ▁▁▁▁▁ ▁            ▂
  ▆███████████████▇▆▅▅▅▅▅▅▅▅▄▄▅▅▄▁▁▅▅▃▃▃▄█████████▇▇▇▆▆▅▆▆▇▇█ █
  72.6 ns      Histogram: log(frequency) by time      96.3 ns <

 Memory estimate: 64 bytes, allocs estimate: 2.

julia> r = -2000000:80000000
-2000000:80000000

julia> @benchmark findall(($i), $r)  # Takes much time than small r.
BenchmarkTools.Trial: 10000 samples with 967 evaluations.
 Range (min  max):  81.032 ns   2.237 μs  ┊ GC (min  max): 0.00%  95.78%
 Time  (median):     84.948 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   90.104 ns ± 88.516 ns  ┊ GC (mean ± σ):  4.38% ±  4.27%

   ▄▇▇█████▇▆▄▃▂▂▁                             ▁ ▁            ▃
  ▆█████████████████▇▆▇▆▆▇▇▆▅▆▃▁▅▅▁▁▃▃▁▃▅▁▅▆▆████████▇▆▇▇▇██▇ █
  81 ns        Histogram: log(frequency) by time       116 ns <

 Memory estimate: 96 bytes, allocs estimate: 4.

@aplavin
Copy link
Contributor Author

aplavin commented Sep 23, 2022

Maybe, it seems better to split this PR into
Add rev keyword argument in searchsorted_interval function.
Simplify findall function.
to merge this PR quickly.

"Quickly" is loooong gone already :)
And it makes sense to have a performant searchsorted_interval function, there's no real reason why it should be slower than the current findall. To my understanding, there is no slowdown currently in this PR - but if there is, performance should be fixed for both findall and searchsorted_interval.

The searchsorted function take O(log(n)) time, but the original findall function take O(1) time.

Not sure if I understand you here... Both searchsorted and searchsorted_interval definitely take O(1) time for ranges:

julia> @btime searchsorted_interval(r, i) setup=(r=-10:10; i=0..100)
  2.206 ns (0 allocations: 0 bytes)
11:21

julia> @btime searchsorted_interval(r, i) setup=(r=-10^9:10^9; i=0..100)
  2.206 ns (0 allocations: 0 bytes)
1000000001:1000000101

I checked the benchmarks again, and it seems there still be some performance degressions with findall.

As I said before, it does look like a BenchmarkTools artifact. For example, @btime findall(∈(i), r) setup=(i = 1..3; r = 2:8) takes 2 ns for me, either for ints or floats, either in current master or with this PR.

@hyrodium
Copy link
Collaborator

"Quickly" is loooong gone already :)

Sorry for that delay 😄

Not sure if I understand you here... Both searchsorted and searchsorted_interval definitely take O(1) time for ranges:

The searchsorted family functions use binary search, so it increases computing time with the length of r.

julia> r = -2:8
-2:8

julia> @benchmark searchsortedfirst($r,3.3,first($r),last($r),$o)
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
 Range (min  max):  4.919 ns  14.187 ns  ┊ GC (min  max): 0.00%  0.00%
 Time  (median):     4.959 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   4.963 ns ±  0.156 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

                                        █                     
  ▂▁▁▁▁▁▁▁▁▅▄▁▁▁▁▁▁▁▁▅▁▁▁▁▁▁▁▁▂▂▁▁▁▁▁▁▁▁█▁▁▁▁▁▁▁▁▇▆▁▁▁▁▁▁▁▁▃ ▂
  4.92 ns        Histogram: frequency by time        4.98 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> r = -20:80
-20:80

julia> @benchmark searchsortedfirst($r,3.3,first($r),last($r),$o)
BenchmarkTools.Trial: 10000 samples with 999 evaluations.
 Range (min  max):  10.721 ns  27.479 ns  ┊ GC (min  max): 0.00%  0.00%
 Time  (median):     10.801 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   10.849 ns ±  0.465 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

   ▆          █                                                
  ▃█▁▇▂▁▂▂▂▂▁▆█▁▃▂▂▂▁▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▂▁▁▂▂▄▁▃▃▁▁▁▂▂▁▃▃▁▄▂ ▂
  10.7 ns         Histogram: frequency by time        11.1 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> r = -200:800
-200:800

julia> @benchmark searchsortedfirst($r,3.3,first($r),last($r),$o)
BenchmarkTools.Trial: 10000 samples with 998 evaluations.
 Range (min  max):  17.026 ns  45.697 ns  ┊ GC (min  max): 0.00%  0.00%
 Time  (median):     17.348 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   17.412 ns ±  0.465 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▄ ▂    █ ▆▆▃▁▁                                              ▂
  █▁██▁▁▅███████▁▃▄▃▄▅▁▁▃▁▁▁▃▃▃▃▃▁▅▁▁▄▁▄▁▄▁▄▅▄▅▁▅▅▃▃▄▃▅▁▁▃▃▄▅ █
  17 ns        Histogram: log(frequency) by time      19.5 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> r = -2000:8000
-2000:8000

julia> @benchmark searchsortedfirst($r,3.3,first($r),last($r),$o)
BenchmarkTools.Trial: 10000 samples with 996 evaluations.
 Range (min  max):  24.282 ns  48.344 ns  ┊ GC (min  max): 0.00%  0.00%
 Time  (median):     24.474 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   24.548 ns ±  0.618 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

     █                                                         
  ▃▁▄█▂▅▂▂▂▂▂▁▁▂▂▂▂▁▂▁▁▂▂▁▁▂▂▁▁▂▂▂▂▂▂▁▂▂▂▁▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▁▂▂▂ ▂
  24.3 ns         Histogram: frequency by time        27.7 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

I'm not sure how the compiler or BenchmarkTools.jl optimizes in searchsorted_interval, but ideally, the performance should depend on the length of r.

@aplavin
Copy link
Contributor Author

aplavin commented Sep 24, 2022

The searchsorted family functions use binary search

searchsorted definitely doesn't use binary search for ranges by default, see around https://github.com/JuliaLang/julia/blob/24cb92d1e94367c5ef758545b69133ded4559763/base/sort.jl#L230. That's why I expect searchsorted_interval to be as performant as the current findall.

@hyrodium
Copy link
Collaborator

hyrodium commented Sep 24, 2022

searchsorted definitely doesn't use binary search for ranges by default

Oh, that was my mistake. Thanks for the clarification.

(This was already explained in this comment #111 (comment), but I overlooked it.)

Copy link
Collaborator

@hyrodium hyrodium left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delayed review and my imprecise comments. I think I now understand all of your points.

This PR fixes the searchsorted_interval function for the AbstractVector input with -0.0, but the change introduces a new bug with -0.0 in Inteval.

julia> using IntervalSets

julia> searchsorted_interval([-2., -0., 0., 2., 3.], 0.0..3.0)
2:5

julia> searchsorted_interval([-2., -0., 0., 2., 3.], -0.0..3.0)
3:5

We also have the following problem with Unitful, so it seems better to use lt = < with searchsorted-family functions. The performance problem can be avoided with specifying X::AbstractRange instead of specifying x:::AbstractFloat.

searchsorted_interval(X, i::Interval{:closed, :open}) = searchsortedfirst(X, leftendpoint(i)) :(searchsortedfirst(X, rightendpoint(i)) - 1)
searchsorted_interval(X, i::Interval{ :open, :closed}) = (searchsortedlast(X, leftendpoint(i)) + 1):searchsortedlast(X, rightendpoint(i))
searchsorted_interval(X, i::Interval{ :open, :open}) = (searchsortedlast(X, leftendpoint(i)) + 1):(searchsortedfirst(X, rightendpoint(i)) - 1)
function searchsorted_interval(X, i::Interval{L, R}; rev=false) where {L, R}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't we need support for rev::Nothing, just like searchsorted?

julia> searchsorted([1,2,3,3,4],3,rev=nothing)
3:4

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be fixed in another PR.

src/findall.jl Outdated Show resolved Hide resolved
Copy link
Collaborator

@hyrodium hyrodium left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aplavin Sorry for the late. I will merge this PR in a week.

@aplavin
Copy link
Contributor Author

aplavin commented Jul 31, 2023

Well I don't think you should in the current state :) I only needed this functionality in one place, just monkeypatched it there - so no hurry.

The current explicit comparison with zero here is a hack. I think searchsorted_interval should just use searchsorted(lt=<) as it did in one of the commits. The only issue was slower performance on ranges (O(log(N)) instead of O(1)), and it'll be fixed in JuliaLang/julia#50365.

@aplavin
Copy link
Contributor Author

aplavin commented Jul 31, 2023

Restored it to use lt=<. I guess what remains is to wait for that Julia PR, and then add

if VERSION < ...
   findall = old implementation
else
   findall = new implementation
end

To avoid the performance regression for ranges.

I don't really feel like crafting separate tests for all edge cases of searchsorted_interval(rev=false/true) - findall tests cover them already, and using searchsorted_interval in findall only makes sense.

@hyrodium
Copy link
Collaborator

hyrodium commented Aug 1, 2023

To avoid the performance regression for ranges.

I personally feel we can ignore performance degressions in old Julia versions (if the degression is not that much), so it would be okay to merge this PR without if VERSION < ... not to increase the maintenance cost.

I don't really feel like crafting separate tests for all edge cases of searchsorted_interval(rev=false/true) - findall tests cover them already, and using searchsorted_interval in findall only makes sense.

No problem with this direction!

I will merge this PR after JuliaLang/julia#50365 is merged.

@hyrodium hyrodium merged commit 92c033a into JuliaMath:master Oct 17, 2023
@hyrodium
Copy link
Collaborator

The PR JuliaLang/julia#50365 was merged, so I have merged this PR.

@aplavin
Thank you for the PR and for your patience while waiting for the review & merge!

@aplavin aplavin deleted the patch-1 branch October 18, 2023 03:35
@aplavin
Copy link
Contributor Author

aplavin commented Oct 18, 2023

Thanks @hyrodium! Would be nice to release it as well.
This PR also fixes a bug in interacting with Unitful:

julia> findall((0..1), (-0.1:0.1:0.1)u"°")
ERROR: BoundsError  # without this PR
2:3  # with this PR

Fundamentally, the issue is with Unitful (PainterQubits/Unitful.jl#686), but we get the fix effectively as a bonus here :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

findall throws error for zero-step range
2 participants