-
Notifications
You must be signed in to change notification settings - Fork 28
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
return the round of a unit with the unit data when the unit is finalized #376
return the round of a unit with the unit data when the unit is finalized #376
Conversation
Please make sure the following happened
|
e71bca8
to
5267b5b
Compare
Hi @joschisan |
Hi @ggawryal, sorry for the delay I got sidetracked and then this pr slipped my mind. When I assume you say use case do I assume correctly that you refer to the Fedimint project in general - if so there are actually no further changes besides this bugfix: #399. AlephBFT in its current form fits into our use case quite well, which is why I opted to upstream with the last change as well. So if you have questions on the integration into Fedimint that go beyond what I explained above we can always arrange a call on this (I lead the migration of the system from HBBFT to AlephBFT over the last year). On this pr, I would seem to me that using the round index to trigger a session switch is a very clean way to do it and it furthermore seems to be required every time you have a continuously running system. I understand that exposing an implementation detail like this seems tacky, however at least this seems to be a stable one at least as I assume you plan on keep the DAG structure to order items. If there is anything unclear with the session switch I described above we can also discuss this in a call anytime. |
We had quite a lot of discussion about this PR within the team and we came to the conclusion that we prefer a more minimalistic interface – using the internal implementation details is in general dangerous and switching between sessions should be done on a completely separate layer, using the ordered data. The round limit is only supposed to be there to avoid OOMs and clearly signal to devs that switching sessions is a necessary part of using ABFT. In fact, we would probably prefer to have an interface without the last addition of the unit creator to the output as well. That being said, we would be open to having a secondary, less stable, interface for People Who Are Sure They Know What They Are Doing™. This one would probably expose the full structure of the batches and underlying DAG, so return batches of |
0f8d418
to
73dc111
Compare
…und of a finalized unit
@timorleph I have implemented a version of a secondary interface. So far it only exposes creator and round, while unit_hash and parent_hashes could be returned this way as well. Please let me know if this implementation fits what you had in mind. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
Thanks for merging this so quickly! |
Fedimint uses Aleph BFT to order a stream of transactions with unpredictable gaps between while ordering a transaction is latency critical. Hence we need to switch sessions before the exponential slowdown starts and can not wait for a fixed number of transactions to achieve consensus on when to trigger the next session switch. Currently nodes submit batches of transactions which may be empty and we trigger a session switch every fixed number of (possibly empty) batches. In a byzantine fault tolerant setting with 3f+1 nodes we can assume that we order at least f batches in every round, however, if all nodes are correct and online we will likely order 3f+1 batches per round, hence we switch the session about three times faster than necessary to prevent running into the exponential slowdown. The times between session switches therefore depend on how many peers are offline / malicious.
If we were to return the round index of an ordered unit with the unit data we can simply trigger the session switch if the round index matches the offset for the exponential slowdown. This gives us more predictable session times, a cleaner way to trigger the session switch and allows us to increase session length by a factor of three without increasing our theoretical bound on RAM consumption per session. Since we opted not to overlay sessions and exchange signatures over the broadcast for simplicity we have a 3x latency spike every session switch - increasing session times 3x this way would therefore by quite useful to us.