-
Notifications
You must be signed in to change notification settings - Fork 38
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add FromIter
for Ranges
#278
Conversation
faedeec
to
daf2409
Compare
daf2409
to
e2146fc
Compare
version-ranges/src/lib.rs
Outdated
#[test] | ||
fn from_iter_valid(segments in proptest::collection::vec(any::<(Bound<u32>, Bound<u32>)>(), ..30)) { | ||
// We check invariants in the method. | ||
Ranges::from_iter(segments.clone()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could strengthen this test by checking that the result is equivalent to iteratively calling union.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This caught a significant oversight!
// following: |------| | ||
// | ||
// final: |------| |------| |------| | ||
segments.insert(insertion_point, segment); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is doing insertion sort. Witch is O(n^2)
, because this insert
(or the remove
above) is O(n)
. Shoving all the segments into a heap should get us O(n*log(n))
time, but take O(n)
space. I don't know if it actually matters in practice.
If the point is to be practical I'm happy to merge this with a comment about alternative implementations. If the point of this is to implement the algorithmically best solution, we can do better than this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One solution would be collecting the segments into a sorted datastructure, then building a second ranges without any shifting around: We pay the price of two datastructure for no moving around. This solution works with only a single ranges, inserting or removing elements as we detect overlaps: If the elements are already in order, it's only pushes to a single smallvec (I chose not to reserve, since we could vastly over-reserve). As usual, I'd go with this implementation and change it if benchmarking with users shows that another algorithm would be better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add a comment on these two lines explaining that the call is O(n)
which makes the algorithm O(n^2)
but it should be good enough for now. That way somebody looking at a flame graph that's hot here has some clue that there is a fix we've already thought of.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a comment here and a more general explanation of the tradeoff above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two nits about comments.
version-ranges/src/lib.rs
Outdated
// We have three constraints we need to fulfil: | ||
// 1. The segments are sorted, from lowest to highest (through `Ord`): By sorting. | ||
// 2. Each segment contains at least one version (start < end): By `union`. | ||
// 3. There is at least one version between two segments: By `union`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"By union
" does not match the implementation. While I'm making nits about comments, let's make this clear.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good catch, i forgot to update the comment after changing the implementation.
// following: |------| | ||
// | ||
// final: |------| |------| |------| | ||
segments.insert(insertion_point, segment); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add a comment on these two lines explaining that the call is O(n)
which makes the algorithm O(n^2)
but it should be good enough for now. That way somebody looking at a flame graph that's hot here has some clue that there is a fix we've already thought of.
Add a method to construct ranges from an iterator of arbitrary segments. This allows to `.collect()` an iterator of tuples of bounds. This is more ergonomic than folding the previous ranges with the next segment each time, and also faster. Split out from #273 Closes astral-sh#33 Fixes #249
cd57a69
to
fe37b91
Compare
Add a method to construct ranges from an iterator of arbitrary segments. This allows to
.collect()
an iterator of tuples of bounds. This is more ergonomic than folding the previous ranges with the next segment each time, and also faster.Split out from #273
Closes astral-sh#33
Fixes #249