rev=true in searchsorted_interval #111
Conversation
Codecov Report
Patch coverage: …

Additional details and impacted files:

@@ Coverage Diff @@
##           master     #111      +/-   ##
==========================================
+ Coverage   98.81%   99.10%   +0.29%
==========================================
  Files           3        4       +1
  Lines         253      224      -29
==========================================
- Hits          250      222      -28
+ Misses          3        2       -1

☔ View full report in Codecov by Sentry.
I found some performance regression. Could you check the following result?
julia> versioninfo()
Julia Version 1.7.3
Commit 742b9abb4d (2022-05-06 12:58 UTC)
Platform Info:
OS: Linux (x86_64-pc-linux-gnu)
CPU: AMD Ryzen 7 2700X Eight-Core Processor
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-12.0.1 (ORCJIT, znver1)
On the current master
julia> @benchmark findall(∈(1..3), 2:8)
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
Range (min … max): 0.020 ns … 0.031 ns ┊ GC (min … max): 0.00% … 0.00%
Time (median): 0.020 ns ┊ GC (median): 0.00%
Time (mean ± σ): 0.021 ns ± 0.004 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
█
█▁▁▁▁▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▄ ▂
0.02 ns Histogram: frequency by time 0.03 ns <
Memory estimate: 0 bytes, allocs estimate: 0.
With this PR
julia> @benchmark findall(∈(1..3), 2:8)
BenchmarkTools.Trial: 10000 samples with 870 evaluations.
Range (min … max): 136.305 ns … 2.080 μs ┊ GC (min … max): 0.00% … 92.90%
Time (median): 141.348 ns ┊ GC (median): 0.00%
Time (mean ± σ): 147.630 ns ± 89.578 ns ┊ GC (mean ± σ): 2.95% ± 4.52%
▁▄▆████▇▆▅▃▂▁▁ ▁ ▁ ▁ ▃
▇████████████████▆▆▆▅▅▅▄▅▄▅▅▄▅▄▅▅▁▄▃▅▅▅▄▁▅▆▆▇███▇▇▇████████▇ █
136 ns Histogram: log(frequency) by time 183 ns <
Memory estimate: 128 bytes, allocs estimate: 4.
Hm, that's... interesting:

julia> searchsorted([-1, -0.0, 0.0, 1], 0)
3:3

julia> searchsorted([-1, -0.0, 0.0, 1], 0; lt=<)
2:3

Not sure what's the best solution here.
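The surprise above boils down to the difference between isless (the default searchsorted ordering, which distinguishes signed zeros) and < (which treats them as equal). A minimal, self-contained illustration in plain Base Julia:

```julia
# Signed zeros compare equal with ==, but isless orders -0.0 strictly
# before 0.0. searchsorted uses isless by default, so the two zeros
# occupy distinct positions; passing lt = < merges them.
@assert -0.0 == 0.0           # numerically equal
@assert isless(-0.0, 0.0)     # but isless separates them

@assert searchsorted([-1, -0.0, 0.0, 1], 0.0) == 3:3          # isless: only 0.0 matches
@assert searchsorted([-1, -0.0, 0.0, 1], 0.0; lt = <) == 2:3  # <: both zeros match
```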
Co-authored-by: hyrodium <[email protected]>
The updated implementation is both correct (findall tests pass, and they do check for zeros of both signs), and keeps the performance for ranges (Base …).
This PR also fixes #100 btw.
Great! Could you add the following tests in a @testset?

@testset "zero-step ranges" begin
    r = range(1,1,10)
    @test findall(in(1..3),r) == 1:10
    @test findall(in(2..3),r) |> isempty
end
Thanks! I didn't know that property of …
Hmm, the performance is now much better than in the first benchmark, but there still seem to be some performance regressions.

On the current master:

julia> using IntervalSets, BenchmarkTools
julia> @benchmark findall(∈(1..3), 2:8)
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
Range (min … max): 0.020 ns … 0.031 ns ┊ GC (min … max): 0.00% … 0.00%
Time (median): 0.020 ns ┊ GC (median): 0.00%
Time (mean ± σ): 0.021 ns ± 0.004 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
█
█▁▁▁▁▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▄ ▂
0.02 ns Histogram: frequency by time 0.03 ns <
Memory estimate: 0 bytes, allocs estimate: 0.
julia> @benchmark findall(∈($(1..3)), $(2:8))
BenchmarkTools.Trial: 10000 samples with 999 evaluations.
Range (min … max): 11.222 ns … 18.574 ns ┊ GC (min … max): 0.00% … 0.00%
Time (median): 11.332 ns ┊ GC (median): 0.00%
Time (mean ± σ): 11.383 ns ± 0.229 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
▅▃█ ▄
▂▂▁▂▁▂▁▂▂▂▂▁▄▁▇▁▃███▁█▁▂▁▂▂▁▁▁▁▁▁▁▁▂▁▃▁▅▂▂▆▁▅▁▃▁▂▂▂▂▁▂▁▅▁▄▃ ▃
11.2 ns Histogram: frequency by time 11.6 ns <
Memory estimate: 0 bytes, allocs estimate: 0.

With this PR:

julia> using IntervalSets, BenchmarkTools
julia> @benchmark findall(∈(1..3), 2:8)
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
Range (min … max): 1.442 ns … 5.931 ns ┊ GC (min … max): 0.00% … 0.00%
Time (median): 1.472 ns ┊ GC (median): 0.00%
Time (mean ± σ): 1.471 ns ± 0.097 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
▄ █
██▂▃█▃▅▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▂▁▁▁▁▁▂▁▂▁▁▁▁▁▂▂▁▂▂▁▂▂▁▂▂▁▁▂▁▂▂▁▂▂▂ ▂
1.44 ns Histogram: frequency by time 1.82 ns <
Memory estimate: 0 bytes, allocs estimate: 0.
julia> @benchmark findall(∈($(1..3)), $(2:8))
BenchmarkTools.Trial: 10000 samples with 927 evaluations.
Range (min … max): 112.177 ns … 1.880 μs ┊ GC (min … max): 0.00% … 93.46%
Time (median): 119.698 ns ┊ GC (median): 0.00%
Time (mean ± σ): 125.097 ns ± 87.030 ns ┊ GC (mean ± σ): 3.47% ± 4.66%
▂▄▅▅▅▇▇▇▇█▇▄▃▂▂▁ ▁ ▂
▆██████████████████▆▇▆▅▆▅▄▆▅▄▃▃▄▁▄▁▄▄▆▅▆▆▇▆▅▆▆▆▆▇███████▇▅▆▇ █
112 ns Histogram: log(frequency) by time 157 ns <
Memory estimate: 128 bytes, allocs estimate: 4.

Sorry, I have not taken the time to review. I have not yet checked the reasons for the performance regression.
Maybe the time difference is some benchmarking artifact; at least the timings definitely look suspicious, in both cases.
Running both (old and new) versions manually in a loop takes 1.5 s for 10^9 iterations, i.e. 1.5 ns per call, which agrees with …
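For context, sub-nanosecond timings like 0.02 ns (a small fraction of one clock cycle) are a telltale sign that the compiler constant-folded the whole call away. BenchmarkTools recommends interpolating arguments with $ so they are treated as runtime values. A hedged sketch, assuming IntervalSets and BenchmarkTools are installed:

```julia
using IntervalSets, BenchmarkTools

# Literal arguments are compile-time constants inside the benchmark body,
# so the entire findall call may be folded away, producing physically
# impossible ~0.02 ns timings:
@btime findall(∈(1..3), 2:8)

# Interpolating with $ hides the values from the compiler, so the actual
# runtime cost of the call is measured:
@btime findall(∈($(1..3)), $(2:8))
```

This is why the interpolated benchmarks later in the thread are the more trustworthy ones.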
Added the zero-step tests.
Does anything else remain to be done here?..
I checked the benchmarks again, and it seems there are still some performance regressions with …

Before this PR:

julia> using IntervalSets, BenchmarkTools
julia> i = 1..3
1..3
julia> r = 2:8
2:8
julia> @benchmark findall(∈($i), $r)
BenchmarkTools.Trial: 10000 samples with 999 evaluations.
Range (min … max): 11.332 ns … 18.383 ns ┊ GC (min … max): 0.00% … 0.00%
Time (median): 11.443 ns ┊ GC (median): 0.00%
Time (mean ± σ): 11.496 ns ± 0.203 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
▁▃ ▃ ▂ ▇ █ ▂ ▂ ▁ ▁ ▂ ▂ ▆ ▅▂▅ ▃ ▂
▇▆▁▅▁██▁▃▁▁█▁█▁██▁█▁▆█▁█▁▆█▁█▁█▅▇█▁█▅█▇▁████▁▆▁▃▄▁▇▁▆▇▁▅▁▇█ █
11.3 ns Histogram: log(frequency) by time 11.7 ns <
Memory estimate: 0 bytes, allocs estimate: 0.
julia> r = 2.0:8.0
2.0:1.0:8.0
julia> @benchmark findall(∈($i), $r)
BenchmarkTools.Trial: 10000 samples with 980 evaluations.
Range (min … max): 62.996 ns … 186.444 ns ┊ GC (min … max): 0.00% … 0.00%
Time (median): 64.929 ns ┊ GC (median): 0.00%
Time (mean ± σ): 65.645 ns ± 5.061 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
▁▆▇█▆▁▁▁▁▁ ▂ ▂
███████████▇▅▅▄▄▄▄▄▃▃▁▃▁▁▃▁▁▃▁▃▆▄█▁▄▃▁▃▄▄▁▁▄▄▃▆▅▄▅▃▃▄▃▅▁▅▄▅▅ █
63 ns Histogram: log(frequency) by time 99.6 ns <
Memory estimate: 0 bytes, allocs estimate: 0.

After this PR:

julia> using IntervalSets, BenchmarkTools
[ Info: Precompiling IntervalSets [8197267c-284f-5f27-9208-e0e47529a953]
julia> using IntervalSets, BenchmarkTools
julia> i = 1..3
1..3
julia> r = 2:8
2:8
julia> @benchmark findall(∈($i), $r)
BenchmarkTools.Trial: 10000 samples with 979 evaluations.
Range (min … max): 64.330 ns … 1.845 μs ┊ GC (min … max): 0.00% … 94.39%
Time (median): 71.279 ns ┊ GC (median): 0.00%
Time (mean ± σ): 74.554 ns ± 56.606 ns ┊ GC (mean ± σ): 2.72% ± 3.43%
▂▆▇██▇▆▄▂▁▁▁▂▁ ▁▂▂▁▁ ▂
▂▃▃▂▂▂▃▂▂▇███████████████▆▅▅▄▅▅▄▅▂▅▄▆▆▆▄▅▅▆███████▇▆▅▆▅▆▆██ █
64.3 ns Histogram: log(frequency) by time 92.8 ns <
Memory estimate: 64 bytes, allocs estimate: 2.
julia> r = 2.0:8.0
2.0:1.0:8.0
julia> @benchmark findall(∈($i), $r)
BenchmarkTools.Trial: 10000 samples with 994 evaluations.
Range (min … max): 32.466 ns … 62.311 ns ┊ GC (min … max): 0.00% … 0.00%
Time (median): 32.909 ns ┊ GC (median): 0.00%
Time (mean ± σ): 33.192 ns ± 1.534 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
▂▆█▄▆ ▁ ▂
█████▅▅▄▄▆▇▆▄▆▆▇▆▆▆▁▅▅▅▆▇▆▇▅▅▅▄▄▃▃▃▄▃▃▁▃▃▁▁▃▁▃▃▃▃▁▁▃▁▁▄▁▃▅█ █
32.5 ns Histogram: log(frequency) by time 42.6 ns <
Memory estimate: 0 bytes, allocs estimate: 0.

If …
I also don't understand the behavior well, but I thought it would be better to use variables such as …
Maybe it would be better to split this PR into … to merge this PR quickly.
The …

Before this PR:

julia> using IntervalSets, BenchmarkTools
julia> i = 1..3
1..3
julia> r = 2:8
2:8
julia> @benchmark findall(∈($i), $r)
BenchmarkTools.Trial: 10000 samples with 999 evaluations.
Range (min … max): 11.332 ns … 37.027 ns ┊ GC (min … max): 0.00% … 0.00%
Time (median): 11.442 ns ┊ GC (median): 0.00%
Time (mean ± σ): 11.741 ns ± 1.556 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
█▅ ▁ ▁ ▁
██▄▄▄▃▃▁▄▅▄▃▄▄▅▄▆▅▃██▅▄▁▅▃▁▁▄▃▁▁▃▄▁▃▃▁▃▁▄▃▁▃▁▁▁▃▁▁▁▁▃▁▁▃▁▆█ █
11.3 ns Histogram: log(frequency) by time 21.7 ns <
Memory estimate: 0 bytes, allocs estimate: 0.
julia> r = -2000000:80000000
-2000000:80000000
julia> @benchmark findall(∈($i), $r) # Benchmark result remains mostly the same.
BenchmarkTools.Trial: 10000 samples with 999 evaluations.
Range (min … max): 12.475 ns … 22.756 ns ┊ GC (min … max): 0.00% … 0.00%
Time (median): 12.567 ns ┊ GC (median): 0.00%
Time (mean ± σ): 12.619 ns ± 0.280 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
▅ ▅▁ ▄ █▂▄ ▂ ▃ ▅ ▆▄ ▁ ▃ ▂
▇█▁██▁▅▆▇▁▁▇█▁███▅▁██▁▇█▁█▁▁▁▁▁▁▁▃▁▄█▁██▁▇▇▄▇▁▄█▁█▁▁▃▁▃▃▁▁▄ █
12.5 ns Histogram: log(frequency) by time 12.8 ns <
Memory estimate: 0 bytes, allocs estimate: 0.

After this PR:

julia> using IntervalSets, BenchmarkTools
julia> i = 1..3
1..3
julia> r = 2:8
2:8
julia> @benchmark findall(∈($i), $r)
BenchmarkTools.Trial: 10000 samples with 973 evaluations.
Range (min … max): 72.635 ns … 1.769 μs ┊ GC (min … max): 0.00% … 93.18%
Time (median): 74.240 ns ┊ GC (median): 0.00%
Time (mean ± σ): 77.692 ns ± 58.474 ns ┊ GC (mean ± σ): 2.70% ± 3.43%
▄██▇▇▆▄▃▂▁▁▁▁▁▁ ▁▁▁▁▁ ▁ ▂
▆███████████████▇▆▅▅▅▅▅▅▅▅▄▄▅▅▄▁▁▅▅▃▃▃▄█████████▇▇▇▆▆▅▆▆▇▇█ █
72.6 ns Histogram: log(frequency) by time 96.3 ns <
Memory estimate: 64 bytes, allocs estimate: 2.
julia> r = -2000000:80000000
-2000000:80000000
julia> @benchmark findall(∈($i), $r) # Takes more time than with a small r.
BenchmarkTools.Trial: 10000 samples with 967 evaluations.
Range (min … max): 81.032 ns … 2.237 μs ┊ GC (min … max): 0.00% … 95.78%
Time (median): 84.948 ns ┊ GC (median): 0.00%
Time (mean ± σ): 90.104 ns ± 88.516 ns ┊ GC (mean ± σ): 4.38% ± 4.27%
▄▇▇█████▇▆▄▃▂▂▁ ▁ ▁ ▃
▆█████████████████▇▆▇▆▆▇▇▆▅▆▃▁▅▅▁▁▃▃▁▃▅▁▅▆▆████████▇▆▇▇▇██▇ █
81 ns Histogram: log(frequency) by time 116 ns <
Memory estimate: 96 bytes, allocs estimate: 4.
"Quickly" is loooong gone already :)
Not sure if I understand you here... Both searchsorted and searchsorted_interval definitely take O(1) time for ranges:

julia> @btime searchsorted_interval(r, i) setup=(r=-10:10; i=0..100)
2.206 ns (0 allocations: 0 bytes)
11:21
julia> @btime searchsorted_interval(r, i) setup=(r=-10^9:10^9; i=0..100)
2.206 ns (0 allocations: 0 bytes)
1000000001:1000000101
As I said before, it does look like a BenchmarkTools artifact. For example, …
Sorry for that delay 😄
The searchsorted-family functions use binary search, so computing time increases with the length of r:

julia> r = -2:8
-2:8
julia> @benchmark searchsortedfirst($r,3.3,first($r),last($r),$o)
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
Range (min … max): 4.919 ns … 14.187 ns ┊ GC (min … max): 0.00% … 0.00%
Time (median): 4.959 ns ┊ GC (median): 0.00%
Time (mean ± σ): 4.963 ns ± 0.156 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
█
▂▁▁▁▁▁▁▁▁▅▄▁▁▁▁▁▁▁▁▅▁▁▁▁▁▁▁▁▂▂▁▁▁▁▁▁▁▁█▁▁▁▁▁▁▁▁▇▆▁▁▁▁▁▁▁▁▃ ▂
4.92 ns Histogram: frequency by time 4.98 ns <
Memory estimate: 0 bytes, allocs estimate: 0.
julia> r = -20:80
-20:80
julia> @benchmark searchsortedfirst($r,3.3,first($r),last($r),$o)
BenchmarkTools.Trial: 10000 samples with 999 evaluations.
Range (min … max): 10.721 ns … 27.479 ns ┊ GC (min … max): 0.00% … 0.00%
Time (median): 10.801 ns ┊ GC (median): 0.00%
Time (mean ± σ): 10.849 ns ± 0.465 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
▆ █
▃█▁▇▂▁▂▂▂▂▁▆█▁▃▂▂▂▁▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▂▁▁▂▂▄▁▃▃▁▁▁▂▂▁▃▃▁▄▂ ▂
10.7 ns Histogram: frequency by time 11.1 ns <
Memory estimate: 0 bytes, allocs estimate: 0.
julia> r = -200:800
-200:800
julia> @benchmark searchsortedfirst($r,3.3,first($r),last($r),$o)
BenchmarkTools.Trial: 10000 samples with 998 evaluations.
Range (min … max): 17.026 ns … 45.697 ns ┊ GC (min … max): 0.00% … 0.00%
Time (median): 17.348 ns ┊ GC (median): 0.00%
Time (mean ± σ): 17.412 ns ± 0.465 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
▄ ▂ █ ▆▆▃▁▁ ▂
█▁██▁▁▅███████▁▃▄▃▄▅▁▁▃▁▁▁▃▃▃▃▃▁▅▁▁▄▁▄▁▄▁▄▅▄▅▁▅▅▃▃▄▃▅▁▁▃▃▄▅ █
17 ns Histogram: log(frequency) by time 19.5 ns <
Memory estimate: 0 bytes, allocs estimate: 0.
julia> r = -2000:8000
-2000:8000
julia> @benchmark searchsortedfirst($r,3.3,first($r),last($r),$o)
BenchmarkTools.Trial: 10000 samples with 996 evaluations.
Range (min … max): 24.282 ns … 48.344 ns ┊ GC (min … max): 0.00% … 0.00%
Time (median): 24.474 ns ┊ GC (median): 0.00%
Time (mean ± σ): 24.548 ns ± 0.618 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
█
▃▁▄█▂▅▂▂▂▂▂▁▁▂▂▂▂▁▂▁▁▂▂▁▁▂▂▁▁▂▂▂▂▂▂▁▂▂▂▁▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▁▂▂▂ ▂
24.3 ns Histogram: frequency by time 27.7 ns <
Memory estimate: 0 bytes, allocs estimate: 0.

I'm not sure how the compiler or BenchmarkTools.jl optimizes in …
Oh, that was my mistake. Thanks for the clarification. (This was already explained in this comment #111 (comment), but I overlooked it.)
Sorry for the delayed review and my imprecise comments. I think I now understand all of your points.
This PR fixes the searchsorted_interval function for AbstractVector input with -0.0, but the change introduces a new bug with -0.0 in Interval:
julia> using IntervalSets
julia> searchsorted_interval([-2., -0., 0., 2., 3.], 0.0..3.0)
2:5
julia> searchsorted_interval([-2., -0., 0., 2., 3.], -0.0..3.0)
3:5
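The asymmetry in the two results above comes from the endpoint itself being a signed zero: with the default isless ordering, -0.0 and 0.0 land at different positions in the vector. A self-contained illustration in plain Base:

```julia
v = [-2.0, -0.0, 0.0, 2.0, 3.0]

# Default isless ordering distinguishes the signed zeros, so the two
# interval endpoints 0.0 and -0.0 resolve to different start indices:
@assert searchsortedfirst(v, 0.0)  == 3
@assert searchsortedfirst(v, -0.0) == 2

# With lt = <, both zeros are treated as the same value:
@assert searchsortedfirst(v, 0.0;  lt = <) == 2
@assert searchsortedfirst(v, -0.0; lt = <) == 2
```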
We also have the following problem with Unitful, so it seems better to use lt = < with the searchsorted-family functions. The performance problem can be avoided by specializing on X::AbstractRange instead of on x::AbstractFloat.
searchsorted_interval(X, i::Interval{:closed, :open}) = searchsortedfirst(X, leftendpoint(i)):(searchsortedfirst(X, rightendpoint(i)) - 1)
searchsorted_interval(X, i::Interval{ :open, :closed}) = (searchsortedlast(X, leftendpoint(i)) + 1):searchsortedlast(X, rightendpoint(i))
searchsorted_interval(X, i::Interval{ :open, :open}) = (searchsortedlast(X, leftendpoint(i)) + 1):(searchsortedfirst(X, rightendpoint(i)) - 1)
function searchsorted_interval(X, i::Interval{L, R}; rev=false) where {L, R}
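For illustration, the closed/open endpoint logic in the methods above can be exercised without any package: the sketch below restates two of the variants with plain (lo, hi) endpoints instead of an Interval, using hypothetical helper names (si_closed_closed, si_open_open are not part of IntervalSets):

```julia
# Closed-closed: both endpoints included, mirroring
# searchsortedfirst(X, left):searchsortedlast(X, right).
si_closed_closed(X, lo, hi) = searchsortedfirst(X, lo):searchsortedlast(X, hi)

# Open-open: both endpoints excluded, mirroring
# (searchsortedlast(X, left) + 1):(searchsortedfirst(X, right) - 1).
si_open_open(X, lo, hi) = (searchsortedlast(X, lo) + 1):(searchsortedfirst(X, hi) - 1)

v = [1, 2, 3, 4, 5]
@assert si_closed_closed(v, 2, 4) == 2:4  # endpoints 2 and 4 included
@assert si_open_open(v, 2, 4)     == 3:3  # only 3 is strictly inside
```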
Don't we need support for rev::Nothing, just like searchsorted?
julia> searchsorted([1,2,3,3,4],3,rev=nothing)
3:4
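For reference, Base's searchsorted accepts rev = nothing (meaning the default forward order) as well as true/false, so matching that behavior would look like this in plain Base:

```julia
# rev = nothing and rev = false both mean the default forward ordering;
# rev = true expects reverse-sorted input.
@assert searchsorted([1, 2, 3, 3, 4], 3; rev=nothing) == 3:4
@assert searchsorted([1, 2, 3, 3, 4], 3; rev=false)   == 3:4
@assert searchsorted([4, 3, 3, 2, 1], 3; rev=true)    == 2:3
```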
This can be fixed in another PR.
@aplavin Sorry for the delay. I will merge this PR in a week.
Well, I don't think you should in the current state :) I only needed this functionality in one place and just monkeypatched it there, so no hurry. The current explicit comparison with zero here is a hack. I think …
Restored it to use … to avoid the performance regression for ranges. I don't really feel like crafting separate tests for all edge cases of …
I personally feel we can ignore performance regressions in old Julia versions (if the regression is not that large), so it would be okay to merge this PR without …
No problem with this direction! I will merge this PR after JuliaLang/julia#50365 is merged.
The PR JuliaLang/julia#50365 was merged, so I have merged this PR. @aplavin
Thanks @hyrodium! Would be nice to release it as well.

julia> findall(∈(0..1), (-0.1:0.1:0.1)u"°")
ERROR: BoundsError  # without this PR
2:3                 # with this PR

Fundamentally, the issue is with Unitful (PainterQubits/Unitful.jl#686), but we get the fix effectively as a bonus here :)
- this also simplifies findall
- no new tests added: rely on the extensive findall test suite