Explore whether cloning context can be made more efficient #380
Comments
For the malicious case, there's also some data in `ContextInner` that isn't static.

That's right, we may decide to continue cloning those, or take a similar approach and store them in a task-local scope:

```rust
MALICIOUS.scope((context, acc, r), async move {
    // all futures spawned inside this block will have access to malicious context
})
```
Another thought: fortunately the malicious validator accumulator is linear, so we can replace the current synchronized accumulator with separate per-thread accumulators.
Wouldn't that constrain all helpers to using the same thread allocation policy?
I don't think so, but maybe I'm missing something. The idea is that it doesn't matter in what order the terms are added, or via what combinations of intermediate subterms; all that matters is that by the end you've added all the same terms. I think we're already in a situation where the order of addition can differ across helpers depending on how futures get scheduled and who wins contention for the lock. The difference is that right now there is a total ordering on each helper, but if we switch to per-thread accumulators, then there won't be. (If contention for this lock is an issue, then we ought to be able to see that once we start profiling the malicious protocol. So I wouldn't propose actually working on this unless we confirm the theory.)
I thought that your proposal was to shard the checksum, meaning that you would have N different checksums. But you are only proposing to split the accumulator into several pieces, which you later join once all are done. That would work very neatly, I think, provided that you can be sure that you got everything.
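A minimal sketch of the split-then-join idea, using plain `std` threads and `u64` wrapping addition as a stand-in for the real field accumulator (the function name and the shard-count parameter are illustrative, not from the code base). Because addition is commutative and associative, each shard can accumulate its chunk independently and the partial sums can be folded together in any order at the end:

```rust
use std::thread;

/// Accumulate `terms` across `shards` worker threads, then join the
/// per-thread partial sums. Wrapping u64 addition stands in for the
/// protocol's field addition; both are commutative and associative,
/// so the interleaving of additions does not affect the final value.
fn accumulate_sharded(terms: Vec<u64>, shards: usize) -> u64 {
    assert!(shards > 0, "need at least one shard");
    let chunk = ((terms.len() + shards - 1) / shards).max(1);
    let partials: Vec<u64> = thread::scope(|s| {
        let handles: Vec<_> = terms
            .chunks(chunk)
            .map(|c| s.spawn(move || c.iter().fold(0u64, |acc, t| acc.wrapping_add(*t))))
            .collect();
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    });
    // The join step: fold per-thread accumulators once all threads are done.
    partials.into_iter().fold(0u64, |acc, p| acc.wrapping_add(p))
}
```

Any shard count gives the same result, which is the property that makes the cross-helper ordering differences harmless.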
There was also an idea floating around at that time to use atomics for counting terms; they also have this property of being eventually consistent.
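For illustration, a sketch of what atomic term counting could look like (the static name and function are hypothetical): relaxed increments from different threads interleave arbitrarily, but once every thread has finished, the count is exact.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::thread;

// Hypothetical counter for the number of terms added to the accumulator.
static TERMS_ADDED: AtomicU64 = AtomicU64::new(0);

/// Spawn `threads` workers that each record `per_thread` terms, then read
/// the total. Relaxed ordering is enough because we only need the final
/// count to be correct after all threads have joined, not a consistent
/// view mid-flight.
fn record_terms(per_thread: u64, threads: usize) -> u64 {
    thread::scope(|s| {
        for _ in 0..threads {
            s.spawn(move || {
                for _ in 0..per_thread {
                    TERMS_ADDED.fetch_add(1, Ordering::Relaxed);
                }
            });
        }
    });
    // All scoped threads have joined here, so the load observes every increment.
    TERMS_ADDED.load(Ordering::Relaxed)
}
```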
Just another point about how painful it is to deal with lifetimes inside implementations of
Our software design for IPA requires protocol contexts to be cheaply cloneable, because we clone them a lot.

Making step cloning inexpensive is tracked in #139; this issue focuses on improvements that can be made to protocol contexts, specifically to the `ContextInner` clone. It looks innocent (an atomic load/store on each clone), but when done often it may lead to false-sharing issues in CPU caches. There may be a better way to do this if we look at the structure of the `ContextInner` struct: there is a role bit and two references that never change no matter how many times we clone the context. There might be an opportunity to cache those values (not references) in one place. The root-level future that is created to run the protocol may be a good place to do that, and the Tokio crate provides some support for it: https://docs.rs/tokio/latest/tokio/task/struct.LocalKey.html.
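A sketch of the caching idea. The struct, field names, and helper functions below are hypothetical; in the async code base this would use `tokio::task_local!` so that `LocalKey::scope` on the root query future makes the values visible to all futures spawned inside it. The synchronous `std` `thread_local!` analogue is used here only to show the shape:

```rust
use std::cell::Cell;

/// The values that never change across context clones: the role bit and
/// ids standing in for the two stable references. All names are illustrative.
#[derive(Clone, Copy, Debug, PartialEq)]
struct CachedContext {
    role: u8,
    gateway_id: u64,
}

thread_local! {
    // With tokio this would be `tokio::task_local! { static CONTEXT: CachedContext; }`.
    static CONTEXT: Cell<Option<CachedContext>> = Cell::new(None);
}

/// Install the cached values for the duration of `f`, mirroring
/// `LocalKey::scope` on the root-level future.
fn scope<R>(ctx: CachedContext, f: impl FnOnce() -> R) -> R {
    CONTEXT.with(|c| c.set(Some(ctx)));
    let result = f();
    CONTEXT.with(|c| c.set(None));
    result
}

fn current_role() -> u8 {
    // Reading the cached value is a plain local load: no Arc clone and no
    // atomic refcount traffic, hence no false sharing on clone.
    CONTEXT.with(|c| c.get().expect("not inside a context scope").role)
}
```

The design point is that cloning a context inside the scope no longer needs to touch shared reference counts for these fields at all.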
10,000-foot view
This is how query processing may look.
This design does not require changing protocol code; everything is contained within the infra layer.
Complications
- The `LocalKey` implementation, however, does not depend on the Tokio runtime, so there may be no issues.
- `TestWorld::contexts()` function usage becomes problematic, but luckily most of our tests use the `Runner` interface, which can easily hide that complexity inside.