perf: more efficient intersection #157
I tried this branch on a pathological test case and it reduced the time from 10s to 3.6s!
That is wonderful to hear! As soon as this gets a review, let's get this merged!
This looks reasonable to me, but may benefit from a bit of documentation?
I rarely provide enough documentation or use good names for things. Always feel comfortable asking me to improve! Anything in particular you want me to clarify?
I think basically anywhere you're doing something unintuitive for performance reasons, as outlined in the bullets in your pull request summary, you'd benefit from adding a comment explaining what's going on, e.g.
Here's a small dissertation. Let me know if any of it made sense.
If people think of improvements to the documentation I added, PRs are welcome. So I'm going to merge.
I got nerd-sniped by astral-sh#6, which points out that intersection is slower than it needs to be when the basic operations on a version are expensive. Here's what I've come up with.
The ideas are:
- […] `next_if` and `eq` […] keep track of which iter to advance directly.
- […] `end` came from, and we know that for each iterator `start < end`, check for validity before determining `start`.
- […] `e > i` […]

[…] `dev` […] zanieb/#6 […] this […]
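To make the "keep track of which iter to advance directly" idea concrete, here is a minimal sketch. It is not the code from this PR: it assumes a simplified, hypothetical `Segment` type of half-open `[start, end)` pairs instead of the real bound types, and the function name `intersection` is only illustrative. The ends are compared once per step; that single result picks the candidate `end` and decides which iterator to advance, and values are cloned only when a segment is actually emitted.

```rust
use std::cmp::Ordering;

/// Hypothetical simplified segment: a half-open interval `[start, end)`
/// with `start < end`. The real type in the crate is richer than this.
type Segment<V> = (V, V);

/// Intersect two sorted lists of non-overlapping half-open segments.
fn intersection<V: Ord + Clone>(left: &[Segment<V>], right: &[Segment<V>]) -> Vec<Segment<V>> {
    let mut out = Vec::new();
    let mut l = left.iter();
    let mut r = right.iter();
    let mut cur_l = l.next();
    let mut cur_r = r.next();

    while let (Some((ls, le)), Some((rs, re))) = (cur_l, cur_r) {
        // One comparison of the two ends tells us which segment finishes
        // first, and therefore which iterator must be advanced afterwards.
        let end_order = le.cmp(re);
        let end = if end_order == Ordering::Greater { re } else { le };

        // The overlap starts at the larger of the two starts.
        let start = if ls >= rs { ls } else { rs };

        // Clone only when the candidate segment is non-empty.
        if start < end {
            out.push((start.clone(), end.clone()));
        }

        // Advance directly, reusing the comparison made above.
        match end_order {
            Ordering::Less => cur_l = l.next(),
            Ordering::Greater => cur_r = r.next(),
            Ordering::Equal => {
                cur_l = l.next();
                cur_r = r.next();
            }
        }
    }
    out
}
```

For example, intersecting `[(1, 5), (8, 12)]` with `[(3, 9), (10, 11)]` yields `[(3, 5), (8, 9), (10, 11)]`, using two comparisons and at most two clones per step.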
I tested this synthetically by making a version type that increments a global counter every time `eq`/`cmp`/`clone`/`drop` is called. This global counter goes up much less (~30%) after this PR.

Even on our normal benchmarks, where these operations are very cheap (a handful of CPU cycles), we see improvements (although only a few percent).
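A counting version type of this kind could look roughly like the sketch below. The names `CountingVersion`, `OPS`, `bump`, and `ops_so_far` are hypothetical and not taken from the PR or the benchmark code; the wrapper simply bumps a global atomic counter in each of the four operations.

```rust
use std::cmp::Ordering;
use std::sync::atomic::{AtomicUsize, Ordering as AtomicOrdering};

/// Global count of eq/cmp/clone/drop calls (hypothetical name).
static OPS: AtomicUsize = AtomicUsize::new(0);

fn bump() {
    OPS.fetch_add(1, AtomicOrdering::Relaxed);
}

/// Read the counter, e.g. before and after running an intersection.
fn ops_so_far() -> usize {
    OPS.load(AtomicOrdering::Relaxed)
}

/// A stand-in version type whose basic operations are instrumented.
#[derive(Debug)]
struct CountingVersion(u32);

impl PartialEq for CountingVersion {
    fn eq(&self, other: &Self) -> bool {
        bump();
        self.0 == other.0
    }
}

impl Eq for CountingVersion {}

impl PartialOrd for CountingVersion {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        // Delegates to `cmp`, so each ordering comparison counts once.
        Some(self.cmp(other))
    }
}

impl Ord for CountingVersion {
    fn cmp(&self, other: &Self) -> Ordering {
        bump();
        self.0.cmp(&other.0)
    }
}

impl Clone for CountingVersion {
    fn clone(&self) -> Self {
        bump();
        CountingVersion(self.0)
    }
}

impl Drop for CountingVersion {
    fn drop(&mut self) {
        bump();
    }
}
```

Reading `ops_so_far()` before and after the same workload on two branches gives a comparison that does not depend on how cheap the operations are on a particular machine.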
cc @konstin for real-world impact.
cc @baszalmstra mamba-org/resolvo#2 (comment) I don't know if these are improvements you'd be interested in including in your copy of this type.