Investigate a possible reordering issue of KeyGenMessage #234

Closed
vkomenda opened this issue Sep 14, 2018 · 4 comments

vkomenda (Contributor) commented Sep 14, 2018

Assuming the sender queue (#226) is merged: if you delay KeyGenMessage::Part messages and deliver them in an epoch after their corresponding KeyGenMessage::Ack messages have been delivered by Dynamic Honey Badger, it seems that the DKG protocol may not terminate. This is highly speculative, however, as I'm looking at the log of my experiments where a faulty queue implementation occasionally dropped KeyGenMessage::Part messages altogether. I assume that those messages, if merely delayed, would simply be discarded by the recipient because of the obsolete message epoch. The trick here is that the protocol makes progress without Parts, on Acks alone.

Below is a filtered sample log of a non-terminating Dynamic Honey Badger run with 2 validator nodes and one observer node, in which Part messages are missing. Let's assume those were delayed until a later epoch, where the recipient discards them.

```
NodeId(0) Restarting DKG for Remove(NodeId(0)).
NodeId(2) Restarting DKG for Remove(NodeId(0)).
NodeId(1) Restarting DKG for Remove(NodeId(0)).
NodeId(1) -> NodeId(2): DynamicHoneyBadger(KeyGen(2, Ack(Ack(0, "<1 values>")), Signature(0c4a3d..0c4afc)))
NodeId(1) -> NodeId(0): DynamicHoneyBadger(KeyGen(2, Ack(Ack(0, "<1 values>")), Signature(0c4a3d..0c4afc)))
NodeId(0) DKG for Remove(NodeId(0)) complete!
NodeId(2) DKG for Remove(NodeId(0)) complete!
NodeId(1) DKG for Remove(NodeId(0)) complete!
NodeId(1) Restarting DKG for Add(NodeId(0), PublicKey(0da1f7..e68e6a)).
NodeId(2) Restarting DKG for Add(NodeId(0), PublicKey(0da1f7..e68e6a)).
NodeId(1) -> NodeId(2): DynamicHoneyBadger(KeyGen(5, Part(Part("<degree 0>", "<2 rows>")), Signature(0842fe..6f4c43)))
NodeId(1) -> NodeId(2): DynamicHoneyBadger(KeyGen(5, Ack(Ack(1, "<2 values>")), Signature(0ec96c..2592ff)))
NodeId(0) Restarting DKG for Add(NodeId(0), PublicKey(0da1f7..e68e6a)).
NodeId(1) -> NodeId(0): DynamicHoneyBadger(KeyGen(5, Part(Part("<degree 0>", "<2 rows>")), Signature(0842fe..6f4c43)))
NodeId(0) -> NodeId(2): DynamicHoneyBadger(KeyGen(5, Ack(Ack(1, "<2 values>")), Signature(069365..543bac)))
NodeId(1) -> NodeId(0): DynamicHoneyBadger(KeyGen(5, Ack(Ack(1, "<2 values>")), Signature(0ec96c..2592ff)))
NodeId(0) -> NodeId(1): DynamicHoneyBadger(KeyGen(5, Ack(Ack(1, "<2 values>")), Signature(069365..543bac)))
```

After this point there are no further DKG messages, and Dynamic Honey Badger epoch 5 does not terminate.
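To make the suspected failure mode concrete, here is a minimal Rust sketch of an epoch-gated message handler. All names (KeyGenMessage, Node, handle, current_epoch) are hypothetical illustrations, not the hbbft API; the point is only that a handler which drops messages from older epochs loses a delayed Part for good, while its Acks were delivered in time and counted:

```rust
// Hypothetical sketch of an epoch-gated handler; not the hbbft API.
#[derive(Debug)]
enum KeyGenMessage {
    Part,
    Ack,
}

struct Node {
    current_epoch: u64,
}

impl Node {
    /// Returns `Some` if the message is accepted, `None` if discarded.
    fn handle(&self, msg_epoch: u64, msg: KeyGenMessage) -> Option<KeyGenMessage> {
        if msg_epoch < self.current_epoch {
            // Obsolete epoch: a delayed `Part` is dropped here, even though
            // its `Ack`s were delivered in time and already counted.
            return None;
        }
        Some(msg)
    }
}

fn main() {
    let node = Node { current_epoch: 6 };
    // The Acks arrive in time; the Part arrives one epoch late.
    assert!(node.handle(6, KeyGenMessage::Ack).is_some());
    assert!(node.handle(5, KeyGenMessage::Part).is_none()); // silently lost
}
```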

afck (Collaborator) commented Sep 16, 2018

Are the KeyGen messages annotated with the DHB start_epoch or the HB epoch? I think the former would be correct: they belong to a whole key generation process spanning several HB epochs. Whether I receive a Part or Ack in HB epoch 5 or 6 doesn't matter: in either case I should try to get that message committed.

Only once start_epoch has changed again do they become obsolete: then DKG has either succeeded or restarted.
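A hedged sketch of that distinction, with purely illustrative names (KeyGenInstance and is_obsolete are not hbbft types): a key-gen message is judged against the start_epoch of the DKG instance it belongs to, so it stays valid across all the HB epochs that instance spans.

```rust
// Illustrative types, not taken from hbbft.
struct KeyGenInstance {
    /// The DHB epoch at which this key generation instance began.
    start_epoch: u64,
}

impl KeyGenInstance {
    /// A Part or Ack tagged with `msg_start_epoch` stays valid across all HB
    /// epochs of this instance; only a newer instance makes it obsolete.
    fn is_obsolete(&self, msg_start_epoch: u64) -> bool {
        msg_start_epoch < self.start_epoch
    }
}

fn main() {
    let dkg = KeyGenInstance { start_epoch: 5 };
    // Whether the message arrives in HB epoch 5 or 6 makes no difference:
    assert!(!dkg.is_obsolete(5));
    // Once DKG restarts with start_epoch 7, the old messages are obsolete:
    let restarted = KeyGenInstance { start_epoch: 7 };
    assert!(restarted.is_obsolete(5));
}
```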

vkomenda (Contributor, Author) commented:

KeyGen messages are annotated with a start_epoch.

In the log, "DKG for Remove" succeeds without any node sending Part messages, and only on the basis of the Ack messages. An attacker could have delayed those Part messages and sent them only after start_epoch changed. That way the attacker would use nothing but reordering, and all the messages would still be delivered, but their effect would be different.
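As an illustration of such a purely reordering adversary, here is a small sketch (the Msg type and adversarial_reorder function are hypothetical, not part of hbbft) that delivers every message but pushes all Parts behind the Acks, past the point where the recipient's start_epoch moves on:

```rust
// Hypothetical message type and reordering function; not hbbft code.
#[derive(Debug, Clone, PartialEq)]
enum Msg {
    Part(u64), // tagged with the start_epoch it belongs to
    Ack(u64),
}

/// Delivers every message, but moves all `Part`s to the back of the
/// schedule, behind the point where the recipient's start_epoch changes.
fn adversarial_reorder(msgs: Vec<Msg>) -> Vec<Msg> {
    let (parts, rest): (Vec<Msg>, Vec<Msg>) =
        msgs.into_iter().partition(|m| matches!(m, Msg::Part(_)));
    rest.into_iter().chain(parts).collect()
}

fn main() {
    let schedule = vec![Msg::Part(5), Msg::Ack(5), Msg::Ack(5)];
    // All messages are still delivered, only their order changes; the Part
    // now arrives after the recipient has moved past start_epoch 5 and is
    // dropped as obsolete.
    let reordered = adversarial_reorder(schedule);
    assert_eq!(reordered, vec![Msg::Ack(5), Msg::Ack(5), Msg::Part(5)]);
}
```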

afck (Collaborator) commented Sep 17, 2018

> only on the basis of the Ack messages.

Right, but that's only because node 1 committed its own Part message to a batch, without the help of any other node. In principle, that would always be possible for node removals: the key-gen messages are just an anti-censorship measure.
For node additions, however, they are necessary, because the joining node can't yet commit its own Parts and Acks by itself.

In either case, the attacker shouldn't be able to do any harm: if DKG has succeeded, that means that enough Parts and Acks were committed and are visible to all validators and observers in the same order now.
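A rough sketch of that mechanism, under the assumption that a validator's proposed contribution can carry its own key-gen messages (Contribution and propose below are illustrative, not the hbbft API): the Part rides inside the proposer's own batch contribution, so it gets committed even if no other node relays it.

```rust
// Illustrative types; `Contribution` and `propose` are not the hbbft API.
#[derive(Debug)]
#[allow(dead_code)]
enum KeyGenMessage {
    Part(String),
    Ack(String),
}

#[derive(Debug)]
struct Contribution {
    transactions: Vec<String>,
    /// Key-gen messages bundled into the proposer's own contribution get
    /// committed with the batch; no relaying by other nodes is required.
    key_gen_messages: Vec<KeyGenMessage>,
}

fn propose(own_part: KeyGenMessage) -> Contribution {
    Contribution {
        transactions: vec!["tx1".to_string()],
        key_gen_messages: vec![own_part],
    }
}

fn main() {
    // Node 1 commits its own Part as part of its epoch contribution; once
    // the batch is agreed, the Part is visible to all nodes in the same order.
    let contribution = propose(KeyGenMessage::Part("<degree 0>".to_string()));
    println!("{:?}", contribution);
}
```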

afck (Collaborator) commented Oct 31, 2018

I'm closing this, because I think the reason for the weird test log is just that validators can get their own Part messages committed, as part of their own contribution to an epoch, without even sending a SyncKeyGen message. In general, Acks are not even sent before the Part has appeared in a batch.

Please reopen if I'm missing something.

afck closed this as completed Oct 31, 2018