stress,client: nonce chaos and failed execution options #604

Merged
merged 3 commits into from
Mar 15, 2024

Conversation

jchappelow
Member

This updates the stress tool with:

  • a "nonce chaos" (-nc) option that applies a random jitter to the correct nonce on roughly 1 in every nc transactions.
  • When using the create_post action, the tool now also creates action call transactions that intentionally fail execution because of an incorrect argument count or incorrect argument types. This ensures nonce and balance updates happen regardless of the execution outcome (tx code), and that the node is resilient to failures in user SQL queries as well as in the engine's handling of procedure call errors.
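For readers who want a concrete picture of the nonce chaos idea, here is a minimal sketch. It is not the code from this PR: `jitterNonce`, `newRNG`, and the `maxJitter` parameter are hypothetical names, and the choice of a nonzero offset clamped at zero is an assumption made for illustration.

```go
package main

import (
	"fmt"
	"math/rand"
)

// newRNG is a hypothetical helper that returns a seeded random source so
// runs are reproducible.
func newRNG(seed int64) *rand.Rand {
	return rand.New(rand.NewSource(seed))
}

// jitterNonce (hypothetical) returns the correct nonce most of the time.
// About once in every nc calls it returns the nonce shifted by a nonzero
// offset in [-maxJitter, maxJitter]. nc <= 0 disables the chaos.
func jitterNonce(rng *rand.Rand, nonce uint64, nc, maxJitter int) uint64 {
	if nc <= 0 || maxJitter <= 0 || rng.Intn(nc) != 0 {
		return nonce
	}
	// Draw from [-maxJitter, maxJitter-1], then shift the non-negative
	// half up by one so that zero is never chosen.
	offset := rng.Intn(2*maxJitter) - maxJitter
	if offset >= 0 {
		offset++
	}
	if offset < 0 {
		down := uint64(-offset)
		if down > nonce {
			return 0 // assumption: clamp rather than wrap around
		}
		return nonce - down
	}
	return nonce + uint64(offset)
}

func main() {
	rng := newRNG(42)
	correct := uint64(100)
	// With nc = 4, about a quarter of these should differ from 100.
	for i := 0; i < 10; i++ {
		fmt.Println(jitterNonce(rng, correct, 4, 3))
	}
}
```

Setting nc to 1 in this sketch perturbs every nonce, which is handy for checking that a node rejects bad nonces without getting stuck.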

This also fixes a bug in core/client.Client where the WithFee option was ignored. This is not critical, since kwil-cli does not use WithFee the way it uses the WithNonce option (set from --nonce), but I wanted the stress tool to do unexpected things with the tx.Body.Fee field and realized this was impossible.
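As a rough illustration of how an option like this can end up ignored, here is a sketch of the functional options pattern. It does not reproduce the actual core/client code: `txOptions`, `txBody`, `buildBody`, and `fee` are hypothetical, and only the option names WithNonce and WithFee come from the description above.

```go
package main

import (
	"fmt"
	"math/big"
)

// txOptions is a hypothetical container for per-transaction overrides.
type txOptions struct {
	nonce int64    // 0 means "not set" in this sketch
	fee   *big.Int // nil means "not set"
}

// TxOpt is a functional option that mutates txOptions.
type TxOpt func(*txOptions)

func WithNonce(n int64) TxOpt  { return func(o *txOptions) { o.nonce = n } }
func WithFee(f *big.Int) TxOpt { return func(o *txOptions) { o.fee = f } }

// txBody is a hypothetical stand-in for the nonce and fee fields of a tx body.
type txBody struct {
	Nonce uint64
	Fee   *big.Int
}

// fee is a small hypothetical helper for building fee values.
func fee(n int64) *big.Int { return big.NewInt(n) }

// buildBody starts from defaults (an estimated fee, the next nonce) and then
// applies any overrides. If the fee check below were missing, WithFee would
// be accepted by callers but silently have no effect.
func buildBody(estimatedFee *big.Int, nextNonce uint64, opts ...TxOpt) txBody {
	o := &txOptions{}
	for _, opt := range opts {
		opt(o)
	}
	body := txBody{Nonce: nextNonce, Fee: estimatedFee}
	if o.nonce > 0 {
		body.Nonce = uint64(o.nonce)
	}
	if o.fee != nil {
		body.Fee = o.fee
	}
	return body
}

func main() {
	// Override the estimated fee with zero, as a stress tool might.
	b := buildBody(fee(10), 5, WithFee(fee(0)))
	fmt.Println(b.Nonce, b.Fee)
}
```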

Putting this PR up now, despite more related work being in progress, because I am now diagnosing the cause of a consensus failure and a persistent app hash mismatch after restart. I will investigate to find the root cause and a fix. In short: on validator node A, FinalizeBlock returned an error because the postgres connection was down (I had closed my laptop lid), but sentry node B had gotten through FinalizeBlock for that same block without error before the pg connection died. Node B's FinalizeBlock had produced apphash X. After restarting, node A tried finalizing that block again and got apphash Y. I'm not sure why, since the transaction for that block was not committed (and not prepared either, according to the "0 orphaned" prepared transactions reported on restart). Investigating...

Collaborator

@brennanjl brennanjl left a comment

lgtm

@brennanjl brennanjl merged commit 89f2b36 into main Mar 15, 2024
1 check passed
@brennanjl brennanjl deleted the stress-update branch March 15, 2024 14:56
@jchappelow jchappelow added this to the v0.8.0 milestone May 2, 2024