Add bench for backtracking #281

x-hgg-x · 2024-11-08T18:51:53Z

Part of #274.

Add a new bench to follow backtracking performance.

Eh2406

I look forward to seeing these numbers improve!

benches/backtracking.rs

test-examples/backtracking.ron

konstin · 2024-11-10T12:35:23Z

Can we improve prioritization to avoid going into the pathological case? The following bad hack already gives a 1.5x speedup, could we avoid reassigning all the other packages after (bad) solana versions by improving prioritize?

    fn prioritize(
        &mut self,
        package: Package,
        range: VersionSet,
        arena: &PackageArena<Self::P>,
    ) -> Self::Priority {
        let mut x = ((range.count() as u64) << 32) + package.get() as u64;
        if !arena.pkg(package).unwrap().to_string().contains("solana") {
            x = x + 1 << 62
        }
        Reverse(x)
    }

hyperfine --runs 3 "./pubgrub-bench-base" "./pubgrub-bench-prio"
Benchmark 1: ./pubgrub-bench-base
  Time (mean ± σ):      4.013 s ±  0.003 s    [User: 3.907 s, System: 0.104 s]
  Range (min … max):    4.011 s …  4.017 s    3 runs
 
Benchmark 2: ./pubgrub-bench-prio
  Time (mean ± σ):      2.643 s ±  0.004 s    [User: 2.500 s, System: 0.142 s]
  Range (min … max):    2.639 s …  2.647 s    3 runs
 
Summary
  ./pubgrub-bench-prio ran
    1.52 ± 0.00 times faster than ./pubgrub-bench-base

Re Eh2406/pubgrub-crates-benchmark#6 (comment): In the backtracking itself, why don't we learn the incompatibilities, shouldn't we be able to memoize that only a certain range works and skip over it in the next attempt? For Eh2406/pubgrub-crates-benchmark#6 (comment), what's the representation for features and virtual packages, can we improve that? In uv, we found a backtracking bug that was fixed by improving the ordering and bounds around feature packages.

For adding large, generated files to the repository, we should add information on how they were generated and how they should be interpreted: As a non-cargo dev, I currently see only a large tree of numbers, and can't tell how they should be resolved or what it means if we're taking a certain path through the packages and versions.

x-hgg-x · 2024-11-10T12:55:29Z

Can we improve prioritization to avoid going into the pathological case?

Yes it should be possible, but we need to find a generic method without knowing the shape of the dependency tree.

Maybe we can compute the current tree depth and prioritize packages with smaller depths ?

Re Eh2406/pubgrub-crates-benchmark#6 (comment): In the backtracking itself, why don't we learn the incompatibilities, shouldn't we be able to memoize that only a certain range works and skip over it in the next attempt? For Eh2406/pubgrub-crates-benchmark#6 (comment), what's the representation for features and virtual packages, can we improve that?

These examples are using the current resolver of cargo, which doesn't use pubgrub yet.

For adding large, generated files to the repository, we should add information on how they were generated and how they should be interpreted.

I updated the benchmark which is now synthetic and not based on cargo anymore (see #281 (comment)).

konstin · 2024-11-10T13:12:23Z

Yes it should be possible, but we need to find a generic method without knowing the shape of the dependency tree.

Maybe we can compute the current tree depth and prioritize packages with smaller depths ?

There's a big space of possibilities here! For inspiration, here's uv's heuristic: https://github.com/astral-sh/uv/blob/14507a1793f12bf884786be49d716ca8f8322340/crates/uv-resolver/src/pubgrub/priority.rs. We pick more tightly constrained packages first. Another option is to prioritize package that caused conflicts earlier (by counting conflicts). I'm sure @Eh2406 has good ideas specifically for cargo, too.

konstin · 2024-11-12T16:09:10Z

Given that the benchmark is synthetic, can we generate it on-the-fly in the bench code? This would also make documenting it easier, together with how this was extracted from the real solana crate case. (Large generated files in the repo, especially when we have different types of syntaxes for benchmarks, tend to become a maintenance burden).

Generally, with more context on the change and more documentation on the feature, i could give you a more helpful review here.

x-hgg-x · 2024-11-14T17:20:08Z

I updated the PR: the bench data is now generated in the code.

benches/backtracking.rs

konstin · 2024-11-18T12:09:54Z

I updated the PR: the bench data is now generated in the code.

Thanks, that's much easier to follow!

Can you add some docstrings explaining the benchmarks functions? Both what package structure they build, and how this relates to the real world case we are modelling here, like telling they story around how we get from trying to fix the slowest cargo-pubgrub cases to this small function. As a maintainer, it's important to have context on the changes, so we can reason about future change to this code. Imagine someone with no cargo background sees changes in these benchmarks from their pubgrub changes, and tries to make sense of what they mean. Or the other way round, someone finds a way to speed up these benchmarks, can they be confident that this does indeed improve solana or have they accidentally just hacked the benchmark?

x-hgg-x · 2024-11-18T15:39:34Z

These new benchmarks are not specific to the cargo implementation nor a specific crate resolution: they describe general patterns which can cause backtracking, so they should be useful even for uv.

konstin · 2024-11-18T16:12:38Z

I think it's good to have such benchmarks (they are definitely way more realistic than sudoku dummies), but in general we should aim at benchmarks that represent real world use cases. While pubgrub is a SAT solver technically, packaging often follows very specific patterns and often there are only a few hard patterns per ecosystem, which we should try to capture in benchmarks. See e.g. this talk about the perf oriented js package manager oregene: https://youtu.be/eh3VME3opnE?si=vuUuRkJJVsoSOqWZ&t=323.

In uv, this was notably #135, other representative slow example these were boto3/botocore/urllib (backtracking over two packages, largely a prefetching problem plus also #135), apache-airflow (here package prioritization mattered a lot) and transformers with all extras (a huge dep tree). We optimized on bio-embeddings, too, which was a somewhat artificial case, but exercised a lot of the problems in resolving scientic/ML projects, including transformers. It allowed profiling for #177, which in turn unblocked astral-sh/uv#2452.

In cargo, solana is the exceptionally difficult case, which seems to follow a pattern we don't observe in uv: Even the pathological bio-embeddings benchmark now takes only 3s on my machine, and i don't know any real package that would even take 1s with cached metadata on my machine.

Again, I'm not trying to say that these benchmarks are bad (we should have diverse benchmark coverage!), but I want to explain why sharing context and adding documentation around benchmarks is important.

x-hgg-x · 2024-11-18T18:30:14Z

I added some docstrings to explain how these benchmarks were derived from a real cargo resolution.

Eh2406

These look like reasonable things for our project to keep an eye on.

Eh2406 requested changes Nov 8, 2024

View reviewed changes

benches/backtracking.rs Outdated Show resolved Hide resolved

benches/backtracking.rs Outdated Show resolved Hide resolved

test-examples/backtracking.ron Outdated Show resolved Hide resolved

x-hgg-x force-pushed the new-bench branch 3 times, most recently from bed4e71 to c6db4bc Compare November 10, 2024 08:59

x-hgg-x force-pushed the new-bench branch from c6db4bc to b9edc06 Compare November 12, 2024 13:10

x-hgg-x force-pushed the new-bench branch from b9edc06 to 39bd8b9 Compare November 14, 2024 17:13

x-hgg-x force-pushed the new-bench branch from 39bd8b9 to 19aa329 Compare November 14, 2024 18:03

Eh2406 reviewed Nov 14, 2024

View reviewed changes

benches/backtracking.rs Outdated Show resolved Hide resolved

Eh2406 reviewed Nov 14, 2024

View reviewed changes

benches/backtracking.rs Outdated Show resolved Hide resolved

x-hgg-x force-pushed the new-bench branch 7 times, most recently from cc1a9dc to 1d01141 Compare November 16, 2024 19:25

Add bench for backtracking

e8161fa

x-hgg-x force-pushed the new-bench branch from 1d01141 to e8161fa Compare November 18, 2024 18:23

Eh2406 approved these changes Nov 18, 2024

View reviewed changes

Eh2406 added this pull request to the merge queue Nov 18, 2024

Merged via the queue into pubgrub-rs:dev with commit 8a29d57 Nov 18, 2024
6 checks passed

x-hgg-x deleted the new-bench branch November 18, 2024 20:54

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add bench for backtracking #281

Add bench for backtracking #281

x-hgg-x commented Nov 8, 2024 •

edited

Loading

Eh2406 left a comment

konstin commented Nov 10, 2024

x-hgg-x commented Nov 10, 2024 •

edited

Loading

konstin commented Nov 10, 2024

konstin commented Nov 12, 2024 •

edited

Loading

x-hgg-x commented Nov 14, 2024

konstin commented Nov 18, 2024

x-hgg-x commented Nov 18, 2024 •

edited

Loading

konstin commented Nov 18, 2024

x-hgg-x commented Nov 18, 2024 •

edited

Loading

Eh2406 left a comment

Add bench for backtracking #281

Add bench for backtracking #281

Conversation

x-hgg-x commented Nov 8, 2024 • edited Loading

Eh2406 left a comment

Choose a reason for hiding this comment

konstin commented Nov 10, 2024

x-hgg-x commented Nov 10, 2024 • edited Loading

konstin commented Nov 10, 2024

konstin commented Nov 12, 2024 • edited Loading

x-hgg-x commented Nov 14, 2024

konstin commented Nov 18, 2024

x-hgg-x commented Nov 18, 2024 • edited Loading

konstin commented Nov 18, 2024

x-hgg-x commented Nov 18, 2024 • edited Loading

Eh2406 left a comment

Choose a reason for hiding this comment

x-hgg-x commented Nov 8, 2024 •

edited

Loading

x-hgg-x commented Nov 10, 2024 •

edited

Loading

konstin commented Nov 12, 2024 •

edited

Loading

x-hgg-x commented Nov 18, 2024 •

edited

Loading

x-hgg-x commented Nov 18, 2024 •

edited

Loading