Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf: remove redundant scanning #175

Merged
merged 5 commits into from
Mar 11, 2024
Merged

perf: remove redundant scanning #175

merged 5 commits into from
Mar 11, 2024

Conversation

Eh2406
Copy link
Member

@Eh2406 Eh2406 commented Dec 29, 2023

While profiling the hobo example I noticed that unit_propagation was repeatedly analyzing the same package. This appears to happen when it is analyzing a package A while there is a dependency edge from A -> B and there is a satisfying assignment for B. It discovers that the edges almost satisfied and alters A to accommodate. I added in assert to the code and let the prop tests find me a small example, then I isolated it into its own test and minimized. Recorded in the first commit.

My first thought was only to add the package to the buffer if package_almost != current_package, while tests pass I don't think this is entirely correct. If we modify the current selection for current_package incompatibilities that were previously inconclusive may now be almost satisfied. So a follow-up pass is required, but only one. Not a full pass per dependency edge.

My next thought was to only add the package to the buffer if the buffer does not contain that package !self.unit_propagation_buffer.contains(&package_almost). This is a standard O(n^2) set. Based on profiling this is never hot code. And it would be easy enough to replace it with a HashSet or something if it becomes a problem.

The O(n^2) kept bugging me, even though it doesn't show up in the profile. It's a performance problem just sitting there waiting to happen. Furthermore in hobos case where this pattern is problematic is repeatedly and only adding the same package. It is sufficient to check the last item in the buffer. More complicated cases do not occur often enough to be worth the extra comparisons. So this PR suggests adding only if self.unit_propagation_buffer.last() != Some(&package_almost).

Any of the proposed changes make little to no difference to most benchmarks and reduce hobo from 1700s to 1100s (total criterion collection time).

I could not figure out how to make a robust and maintainable test out of the example prop tests generated for me. So added a commit removing it. I left it in the history if someone else can figure out what to do with it. Sorry for the messy commit history.

@Eh2406 Eh2406 force-pushed the jf/redundant_scanning branch from b49aa8c to d3c8106 Compare January 2, 2024 19:41
@mpizenberg
Copy link
Member

mpizenberg commented Jan 3, 2024

Interesting found.

Repeatedly adding the same package to the unit propagation buffer means there are many incompats with the same almost satisfied package.

In theory, the best way to reduce that would be to merge smartly these incompatibilities, so that only few of them show up.

Then the second best option is to have a set instead (or in addition) of a vec for that buffer, so what you did. (In Dart's doc, "changed" is a Set https://github.com/dart-lang/pub/blob/master/doc/solver.md#unit-propagation)

Would you mind checking/reporting on the type of incompatibilities that still show up with repeatedly the same package? Even after checking the buffer top, or even after checking a hashset?

@Eh2406
Copy link
Member Author

Eh2406 commented Jan 3, 2024

Would you mind checking/reporting on the type of incompatibilities that still show up with repeatedly the same package? Even after checking the buffer top, or even after checking a hashset?

That is an excellent idea! I started by adding a print line

                        if self.unit_propagation_buffer.contains(&package_almost) {
                            println!(
                                "{package_almost}, {}, {:?}",
                                self.unit_propagation_buffer.last() == Some(&package_almost),
                                current_incompat
                            );
                        }

on code without this PR. I ran only resolved Hobo 0.18. It printed 52409 lines of output. The first interesting insight is that there are only 248 lines where it was not the last item (the second element is false). There are 84 DerivedFrom lines, 177 NoVersions lines, and the rest are FromDependencyOf. NoVersions like all unitary packages can probably be handled more efficiently, but they're not the core of this problem. The thing that makes hobo hard is that it enables a lot of (several hundred) features of web-sys which in turn has a lot of versions. Each feature flag needs to check that it is enabled on the same version as the package it is associated with. Thus if a version has been selected for web-sys, there are N-1 Almost Satisfied dependencies of the form FromDependencyOf("web-sys/HtmlUListElement", 0.3.37..=0.3.37, "web-sys", 0.3.37..0.3.38). We should probably open a separate issue for ideas for smartly merging these "two versions must match" requirements. (And I am working on updating my code for generating ron files from crates.io, so I can test generating fewer "two versions must match" requirements in the first place.)

So having done that useful analysis, I think there is a recommended workflow for which this is the best optimization available.

@Eh2406 Eh2406 force-pushed the jf/redundant_scanning branch from d3c8106 to 4c0c646 Compare January 3, 2024 18:11
@Eh2406
Copy link
Member Author

Eh2406 commented Mar 11, 2024

@mpizenberg is this good to go?

@mpizenberg
Copy link
Member

Since it's an optimization trick I'd suggest adding a comment just above. Other than that all good.

@Eh2406 Eh2406 force-pushed the jf/redundant_scanning branch from 4c0c646 to feb6b55 Compare March 11, 2024 17:49
@Eh2406
Copy link
Member Author

Eh2406 commented Mar 11, 2024

Added a comment, I will merge when you give it a approving review.

Copy link
Member

@mpizenberg mpizenberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@Eh2406 Eh2406 added this pull request to the merge queue Mar 11, 2024
Merged via the queue into dev with commit fdd9cc9 Mar 11, 2024
5 checks passed
@Eh2406 Eh2406 deleted the jf/redundant_scanning branch March 11, 2024 18:21
Eh2406 added a commit to Eh2406/pubgrub that referenced this pull request Aug 22, 2024
Doing the stronger version of pubgrub-rs#175 because it turns out to be important to cargo crates.
Eh2406 added a commit to Eh2406/pubgrub that referenced this pull request Aug 22, 2024
Doing the stronger version of pubgrub-rs#175 because it turns out to be important to cargo crates.
github-merge-queue bot pushed a commit that referenced this pull request Aug 22, 2024
Doing the stronger version of #175 because it turns out to be important to cargo crates.
maciektr pushed a commit to software-mansion-labs/pubgrub that referenced this pull request Nov 4, 2024
Doing the stronger version of pubgrub-rs#175 because it turns out to be important to cargo crates.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants