-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Timeout-based message expiration #212
Comments
Btw, if you'll see |
Some work was done in #259 but it seems like on |
On the aggregator side, we handle 3 messages:
|
Would adding the following validation to // Error handling is ignored for brevity
func (agg *Aggregator) validateMessageTimestamp(ctx context.Context, messageTimestamp uint64) error {
currentBlockNumber, _ := agg.httpClient.BlockNumber(ctx)
currentBlock, _ := agg.httpClient.BlockByNumber(ctx, big.NewInt(0).SetUint64(currentBlockNumber))
if messageTimestamp < (currentBlock.Time() - MESSAGE_AGGREGATION_TIMEOUT - MESSAGE_SUBMISSION_TIMEOUT) {
return errors.New("Message has expired")
}
return nil
}
Would changing the following to
be enough? |
Let me get up to speed on this, need to gather past thoughts! |
Okay, so my understanding on the aggregator is: |
What is the |
Unix timestamp, really. The point here is just establishing some baseline for when a message is old or not. Let me know if it makes sense, I may also be forgetting about something here. |
In both aggregator and operator we don't really have a timeout for processing messages.
For operators, it means when a message is added to the resend queue, it'll just stay there while it's not resent enough, even though it's a very old message. This is not interesting because (1) the interest in a state root update is immediate, so resending a really old message is essentially unnecessary info and can result in a lot of latency for the actual blocks to be processed when re-upping and (2) the resend queue just grows unbounded, which is never nice. Here, I propose we just check in the resend queue if a message is old enough to be dropped and drop it even if the aggregator connection is down. We could use a fancier queue with TTL (I'd prefer that) or just straight up manage this on
tryResendFromQueue
.For the aggregator, there's still the latency consideration - should take quite a while to get to the 'good' messages and, depending on the situation, aggregating an old message can affect a past checkpoint info. This should ideally be considered in slashing terms as well, but this is not desirable behavior. In that case, I propose we set a submission timeout and just ignore messages that are older than that when receiving them. It'd also be good to then adapt the checkpoint time range logic so it's from
lastCheckpointTimestamp + 1
tocurrentBlockTime - MESSAGE_AGGREGATION_TIMEOUT - MESSAGE_SUBMISSION_TIMEOUT
.The text was updated successfully, but these errors were encountered: