
feat: Multithreading improvements #182

Merged (4 commits) Oct 10, 2023

Conversation

aborgna-q (Collaborator) commented Oct 10, 2023

The main feature of this PR is reducing the memory usage of the multithreaded priority channel: a shared Arc<RwLock<Option<P>>> tracks the maximum circuit cost currently in the queue, so the workers don't fill the channels with candidates that would be discarded anyway.

This bounds memory usage when multiple threads are involved. I can now run barenco_tof_10 on 10 threads with ~4GB of RAM, where it previously exceeded 10GB (and caused an OOM).

I also included multiple small changes extracted from #175:

  • CircuitCost::as_usize instead of proxy operations, so we can share cost limits as atomics and run other primitive operations directly.
  • Print the actual number of circuits processed in the log (not only the seen count, which includes elements still in the queue).
  • Replace eprintln!s with log calls.
  • Print the max queue capacity at the beginning of the execution.
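The shared-maximum-cost idea can be sketched roughly as follows. This is a minimal illustration of the technique, not the PR's actual API: `MaxCostGate`, `set`, and `worth_sending` are hypothetical names. Workers consult a shared bound on the worst cost currently held by the priority queue, and skip sending candidates that could not beat it.

```rust
use std::sync::{Arc, RwLock};

/// Hypothetical sketch: a shared bound on the worst cost in the queue.
/// Cloning shares the same underlying Arc, so workers and the queue
/// owner see the same value.
#[derive(Clone)]
pub struct MaxCostGate<P: Ord + Copy> {
    max_cost: Arc<RwLock<Option<P>>>,
}

impl<P: Ord + Copy> MaxCostGate<P> {
    pub fn new() -> Self {
        Self {
            max_cost: Arc::new(RwLock::new(None)),
        }
    }

    /// Called by the queue owner when the queue's maximum cost changes
    /// (e.g. after the queue fills up and drops its worst elements).
    pub fn set(&self, cost: Option<P>) {
        *self.max_cost.write().unwrap() = cost;
    }

    /// Called by workers before sending: only enqueue candidates that
    /// beat the current worst cost. `None` means no bound is active yet.
    pub fn worth_sending(&self, cost: P) -> bool {
        match *self.max_cost.read().unwrap() {
            Some(max) => cost < max,
            None => true,
        }
    }
}

fn main() {
    let gate: MaxCostGate<usize> = MaxCostGate::new();
    // No bound yet: everything is worth sending.
    assert!(gate.worth_sending(42));
    // Once a bound is set, costlier candidates are dropped at the worker.
    gate.set(Some(10));
    assert!(gate.worth_sending(5));
    assert!(!gate.worth_sending(12));
}
```

The point of the gate is that rejected candidates never enter the channel at all, which is what keeps the channel buffers (and hence memory) bounded.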

@aborgna-q aborgna-q requested a review from lmondada October 10, 2023 14:37
Comment on lines 262 to 269
self.circ_cnt += 1;
self.log
    .send(PriorityChannelLog::CircuitCount {
        processed_count: self.circ_cnt,
        seen_count: self.seen_hashes.len(),
        queue_length: self.pq.len(),
    })
    .unwrap();
Contributor:
Is it on purpose that you are sending a log at every circ_cnt increment?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This generally doesn't happen that frequently. I guess I could add a timeout.
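The timeout idea mentioned above could look roughly like this. It is a hypothetical sketch, not code from the PR; `ThrottledLogger` and `should_log` are illustrative names. The log message is only sent when a minimum interval has elapsed since the last one, regardless of how often the counter increments.

```rust
use std::time::{Duration, Instant};

/// Hypothetical sketch: rate-limit log sends by wall-clock time.
pub struct ThrottledLogger {
    last_sent: Option<Instant>,
    min_interval: Duration,
}

impl ThrottledLogger {
    pub fn new(min_interval: Duration) -> Self {
        Self {
            last_sent: None,
            min_interval,
        }
    }

    /// Returns true (and records the send time) if enough time has
    /// passed since the last emitted message; false otherwise.
    pub fn should_log(&mut self) -> bool {
        let now = Instant::now();
        match self.last_sent {
            Some(t) if now.duration_since(t) < self.min_interval => false,
            _ => {
                self.last_sent = Some(now);
                true
            }
        }
    }
}

fn main() {
    let mut logger = ThrottledLogger::new(Duration::from_secs(60));
    // First call always fires; an immediate second call is suppressed.
    assert!(logger.should_log());
    assert!(!logger.should_log());
}
```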

Comment on lines +20 to 24
pub struct Entry<C, P, H> {
    pub circ: C,
    pub cost: P,
    pub hash: H,
}
Contributor:
Could/should this type be the same as type Work in hugr_pchannel?
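One way the two types could be unified (a hypothetical sketch, not necessarily how the PR resolved it) is to keep the named-field `Entry` struct as the single definition and make `Work` a type alias of it, with `u64` filling the hash parameter:

```rust
/// The priority-channel entry: a circuit together with its cost and hash.
pub struct Entry<C, P, H> {
    pub circ: C,
    pub cost: P,
    pub hash: H,
}

/// Hypothetical: instead of a separate tuple `type Work<P> = (P, u64, Hugr)`,
/// reuse `Entry` so both modules share one definition. `C` would be `Hugr`
/// in the real crate; it stays generic here to keep the sketch self-contained.
pub type Work<C, P> = Entry<C, P, u64>;

fn main() {
    let w: Work<&str, usize> = Entry {
        circ: "circuit placeholder",
        cost: 3,
        hash: 7,
    };
    // Named fields replace positional tuple access (w.0, w.1, w.2).
    assert_eq!(w.cost, 3);
    assert_eq!(w.hash, 7);
}
```

A named struct also avoids the field-ordering bugs that positional tuples invite.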

    workqueue_len: Option<usize>,
    seen_hashes: usize,
) {
-   if circ_cnt % 1000 == 0 {
-       self.progress(format!("{circ_cnt} circuits..."));
+   if circuits_processed > self.last_circ_processed && circuits_processed % 1000 == 0 {
Contributor:
I think this can be if circuits_processed >= self.last_circ_processed + 1000
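Worth noting that the two conditions are not equivalent: the modulus form only fires when the count is an exact multiple of 1000, so it can skip a milestone if the count jumps past a multiple between checks, while the suggested form fires once at least 1000 new circuits have been processed since the last report. A small comparison sketch (function names are illustrative):

```rust
/// The PR's condition: only fires on exact multiples of 1000.
fn should_report_mod(processed: usize, last_reported: usize) -> bool {
    processed > last_reported && processed % 1000 == 0
}

/// The reviewer's suggestion: fires once 1000 new circuits accumulate,
/// regardless of alignment to a multiple of 1000.
fn should_report_delta(processed: usize, last_reported: usize) -> bool {
    processed >= last_reported + 1000
}

fn main() {
    // Both fire when the count lands exactly on a multiple.
    assert!(should_report_mod(2000, 1500));
    assert!(should_report_delta(2500, 1500));
    // The modulus form misses non-multiples even after 1000 new circuits.
    assert!(!should_report_mod(2500, 1500));
    // The delta form waits until a full 1000 have accumulated.
    assert!(!should_report_delta(2400, 1500));
}
```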

Comment on lines 14 to 16
/// A unit of work for a worker, consisting of a circuit to process, along
/// with its hash and cost.
pub type Work<P> = (P, u64, Hugr);
Contributor:
You've defined this twice.

@aborgna-q aborgna-q enabled auto-merge October 10, 2023 15:23
@aborgna-q aborgna-q added this pull request to the merge queue Oct 10, 2023
Merged via the queue into main with commit 6f65beb Oct 10, 2023
7 checks passed
@aborgna-q aborgna-q deleted the feat/taso-multithreding-improvements branch October 10, 2023 15:33