-
Notifications
You must be signed in to change notification settings - Fork 38
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
perf: invalidate less contradicted_incompatibilities #170
Conversation
Just cherry picking 976d03e onto our tagged branch speeds my test up from a good 4s to 1.7s-1.8s! We'll have to rebase onto the main to measure with the others speedups, once that is done i'll make a proper hyperfine benchmark and report back. |
That sounds like a big win! Hopefully this can get a review and get merged soon. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So what this PR does if I've understood it correctly, is record the decision level for each contradicted incompatibility. Then when backtracking, instead of resetting all contradicted incompats, we keep those from decisions previous to the backtrack point.
Sounds logical and it's great that this improves real world use cases.
When i cherry-pick 976d03e onto f64c499 the speedup is lower but still there. For the
|
976d03e
to
7c98334
Compare
That makes sense. When #163 applies, as it does in your case, it entirely removes the old incompatibilities from consideration. This PR keeps track of the fact that the old incompatibilities can be skipped, which is much less useful if they're not being considered at all. I'm glad to see that despite the redundancy this PR pulls its weight. It is easy to imagine cases where #163 does not apply, where this PR could have a much larger impact. Closer to the one you originally measured. |
* Invalidate less contradicted_incompatibilities * update comments
* Invalidate less contradicted_incompatibilities * update comments
* Invalidate less contradicted_incompatibilities * update comments
This extends the work from #88. To quote:
Looking at our normal benchmarks this is still within the noise. Looking at
hobo
it is a 5% improvement. However it does reduce the number of times intersection is called, and our users are telling us that intersections in their code are more expensive than in our benchmarks.There are more efficient alternatives to calling
retain(|_, dl| *dl <= decision_level)
. (The insertion order is also sorted bydl
, so binary search and truncate is an option. Or ...) But I have not seen it in profiles so I don't think it is worth more complexity at this time.If this is a big win for our users then we should merge soon. Otherwise, this can stick around until other optimizations make the background noise smaller.