Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Embarrassingly parallel TASO #149

Merged
merged 15 commits into from
Oct 5, 2023
Merged

feat: Embarrassingly parallel TASO #149

merged 15 commits into from
Oct 5, 2023

Conversation

aborgna-q
Copy link
Collaborator

@aborgna-q aborgna-q commented Sep 29, 2023

Use the CircuitChunks splitter for running TASO on different parts of the circuit.

  • Adds a --split-circ flag to the taso optimizer bin, to split the circuit before running instead of sharing the full circuit between threads.
  • Adds a node_cost parameter to the splitter to split circuits according to our cost functions.

I'm getting some really good results with this, we should test if they are correct :P
Optimizing barenco_tof_10.json (192 CX gates):

  • Single thread, 60s timeout:
    • min cost: 166 CX gates
    • peak memory usage: 8.48 GB
  • 2 threads (96-CX chunks), 60s timeout:
    • min cost: 141 CX gates
    • peak memory usage: 8.06 GB
  • 2 threads (96-CX chunks), 30s timeout followed by single thread for 30s:
    • min cost: 126 CX gates
    • peak memory usage: 8.01 GB

(these are noisy results, as I'm running them on a non-idle laptop)


Edit: Updated measurements

Optimizing barenco_tof_10.json (192 CX gates):

  • Single thread, 60s timeout:
    • min cost: 138 CX gates
    • peak memory usage: 3.20 GB
  • 2 threads (96-CX chunks), 60s timeout:
    • min cost: 126 CX gates
    • peak memory usage: 3.06 GB
  • 2 threads (96-CX chunks), 30s timeout followed by single thread for 30s:
    • min cost: 124 CX gates
    • peak memory usage: 3.05 GB

@lmondada
Copy link
Contributor

lmondada commented Sep 29, 2023

Give me a second, I'm testing equality on barenco_tof_10.json. I'm getting that they are not equivalent but I think it's a false positive.

Copy link
Contributor

@lmondada lmondada left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good! We just need to find that bug...

src/optimiser/taso/log.rs Outdated Show resolved Hide resolved
@aborgna-q
Copy link
Collaborator Author

aborgna-q commented Sep 29, 2023

Update: with #150 the tests now use ~2.5GB peak memory.

Copy link
Contributor

@lmondada lmondada left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm tempted to approve it, but let's wait until I have verified the failing circuits using other means.

We'll resolve this next week.

@lmondada
Copy link
Contributor

lmondada commented Oct 3, 2023

Can you run this one more time on barenco_tof_10 end-to-end and confirm it works? Then we can merge this in.

Thanks a lot!

@aborgna-q aborgna-q marked this pull request as draft October 4, 2023 08:54
aborgna-q added a commit that referenced this pull request Oct 4, 2023
Extracts the circuit cost definitions from `rewrite::strategy` into
`circuit::cost`, and adds methods `Circuit::nodes_cost` and
`Circuit::circuit_cost`.

With this we can use the cost logic for
`CircuitChunks::split_with_cost`.

Drive-by: There are some `CircuitChunks` QoL updates extracted from #149
that I didn't separate from this PR.
@aborgna-q
Copy link
Collaborator Author

This still doesn't support identity wires in the chunks, so it will errors out if an optimization produces one. I'm working on that on the side, but that can go in a separate PR.

The current impl works alright as long as we don't do that.

Here are the updated benchmarks, optimizing test_files/barenco_tof_10.json (192 CX gates):

  • Single thread, 60s timeout:
    • min cost: 138 CX gates
    • peak memory usage: 3.20 GB
  • 2 threads (96-CX chunks), 60s timeout:
    • min cost: 126 CX gates
    • peak memory usage: 3.06 GB
  • 2 threads (96-CX chunks), 30s timeout followed by single thread for 30s:
    • min cost: 124 CX gates
    • peak memory usage: 3.05 GB

@aborgna-q aborgna-q marked this pull request as ready for review October 5, 2023 12:54
@aborgna-q aborgna-q requested a review from lmondada October 5, 2023 12:54
Copy link
Contributor

@lmondada lmondada left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚀

@aborgna-q
Copy link
Collaborator Author

aborgna-q commented Oct 5, 2023

:shipit:

The empty-chunks panic is now fixed in #172

@aborgna-q aborgna-q merged commit 4174e14 into main Oct 5, 2023
7 checks passed
@aborgna-q aborgna-q deleted the feat/taso-split branch October 5, 2023 14:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants