Explore whether cloning context can be made more efficient #380
Comments
For the malicious case, there's also some data in `ContextInner` that isn't static.

That's right, we may decide to continue cloning those, or take a similar approach and store them in a task-local scope:

```rust
MALICIOUS.scope((context, acc, r), async move {
    // all futures spawned inside this block will have access to malicious context
})
```
Another thought: fortunately the malicious validator accumulator is linear, so we can replace the current synchronized accumulator with separate per-thread accumulators.
Wouldn't that constrain all helpers to using the same thread allocation policy?
I don't think so, but maybe I'm missing something. The idea is that it doesn't matter in what order the terms are added, or via what combinations of intermediate subterms; all that matters is that by the end you've added all the same terms. I think we're already in a situation where the order of addition can differ across helpers depending on how futures get scheduled and who wins contention for the lock. The difference is that right now there is a total ordering on each helper, but if we switch to per-thread accumulators, then there won't be. (If contention for this lock is an issue, then we ought to be able to see that once we start profiling the malicious protocol. So I wouldn't propose actually working on this unless we confirm the theory.)
I thought that your proposal was to shard the checksum, meaning that you would have N different checksums. But you are only proposing to split the accumulator into several pieces, which you later join once all are done. That would work very neatly, I think, provided that you can be sure that you got everything.
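A minimal sketch of the split-then-join idea, using plain `std` threads and `u64` wrapping addition as a stand-in for the real field accumulator (the function name and the shard-count parameter are illustrative, not from the code base). Because addition is commutative and associative, each shard can accumulate its chunk independently and the partial sums can be folded together in any order at the end:

```rust
use std::thread;

/// Accumulate `terms` across `shards` worker threads, then join the
/// per-thread partial sums. Wrapping u64 addition stands in for the
/// protocol's field addition; both are commutative and associative,
/// so the interleaving of additions does not affect the final value.
fn accumulate_sharded(terms: Vec<u64>, shards: usize) -> u64 {
    assert!(shards > 0, "need at least one shard");
    let chunk = ((terms.len() + shards - 1) / shards).max(1);
    let partials: Vec<u64> = thread::scope(|s| {
        let handles: Vec<_> = terms
            .chunks(chunk)
            .map(|c| s.spawn(move || c.iter().fold(0u64, |acc, t| acc.wrapping_add(*t))))
            .collect();
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    });
    // The join step: fold per-thread accumulators once all threads are done.
    partials.into_iter().fold(0u64, |acc, p| acc.wrapping_add(p))
}
```

Any shard count gives the same result, which is the property that makes the cross-helper ordering differences harmless.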
There was also an idea floating around at that time to use atomics for counting terms; they also have this property of being eventually consistent.
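For illustration, a sketch of what atomic term counting could look like (the static name and function are hypothetical): relaxed increments from different threads interleave arbitrarily, but once every thread has finished, the count is exact.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::thread;

// Hypothetical counter for the number of terms added to the accumulator.
static TERMS_ADDED: AtomicU64 = AtomicU64::new(0);

/// Spawn `threads` workers that each record `per_thread` terms, then read
/// the total. Relaxed ordering is enough because we only need the final
/// count to be correct after all threads have joined, not a consistent
/// view mid-flight.
fn record_terms(per_thread: u64, threads: usize) -> u64 {
    thread::scope(|s| {
        for _ in 0..threads {
            s.spawn(move || {
                for _ in 0..per_thread {
                    TERMS_ADDED.fetch_add(1, Ordering::Relaxed);
                }
            });
        }
    });
    // All scoped threads have joined here, so the load observes every increment.
    TERMS_ADDED.load(Ordering::Relaxed)
}
```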
Just another point about how painful it is to deal with lifetimes inside implementations of
Our software design for IPA requires protocol contexts to be cheaply cloneable, because we clone them a lot.

Making step cloning inexpensive is tracked in #139; this issue focuses on improvements that can be made to protocol contexts, specifically to the `ContextInner` clone. It looks innocent (an atomic load/store on each clone), but when done often it may lead to false-sharing issues in CPU caches. There may be a better way to do this if we look at the structure of the `ContextInner` struct: there is a role bit and two references that never change no matter how many times we clone the context. There might be an opportunity to cache those values (not references) in one place. The root-level future that is created to run the protocol may be a good place to do that, and the Tokio crate provides some support for it: https://docs.rs/tokio/latest/tokio/task/struct.LocalKey.html.
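A sketch of the caching idea. The struct, field names, and helper functions below are hypothetical; in the async code base this would use `tokio::task_local!` so that `LocalKey::scope` on the root query future makes the values visible to all futures spawned inside it. The synchronous `std` `thread_local!` analogue is used here only to show the shape:

```rust
use std::cell::Cell;

/// The values that never change across context clones: the role bit and
/// ids standing in for the two stable references. All names are illustrative.
#[derive(Clone, Copy, Debug, PartialEq)]
struct CachedContext {
    role: u8,
    gateway_id: u64,
}

thread_local! {
    // With tokio this would be `tokio::task_local! { static CONTEXT: CachedContext; }`.
    static CONTEXT: Cell<Option<CachedContext>> = Cell::new(None);
}

/// Install the cached values for the duration of `f`, mirroring
/// `LocalKey::scope` on the root-level future.
fn scope<R>(ctx: CachedContext, f: impl FnOnce() -> R) -> R {
    CONTEXT.with(|c| c.set(Some(ctx)));
    let result = f();
    CONTEXT.with(|c| c.set(None));
    result
}

fn current_role() -> u8 {
    // Reading the cached value is a plain local load: no Arc clone and no
    // atomic refcount traffic, hence no false sharing on clone.
    CONTEXT.with(|c| c.get().expect("not inside a context scope").role)
}
```

The design point is that cloning a context inside the scope no longer needs to touch shared reference counts for these fields at all.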
10,000-foot view
This is how query processing may look.
This design does not require changing protocol code; everything is contained within the infra layer.
Complications
- The `LocalKey` implementation, however, does not depend on the Tokio runtime, so there may be no issues.
- `TestWorld::contexts()` function usage becomes problematic, but luckily most of our tests use the `Runner` interface, which can easily hide that complexity inside.