-
Notifications
You must be signed in to change notification settings - Fork 109
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: update pending nonces when aborting a cctx through MsgAbortStuckCCTX #3230
base: develop
Are you sure you want to change the base?
Conversation
📝 WalkthroughWalkthroughThe pull request encompasses a series of changes aimed at enhancing the functionality and stability of the Changes
Assessment against linked issues
Possibly related PRs
Suggested labels
Suggested reviewers
Warning Rate limit exceeded@kingpinXD has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 7 minutes and 36 seconds before requesting another review. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
Documentation and Community
|
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## develop #3230 +/- ##
========================================
Coverage 62.00% 62.00%
========================================
Files 428 428
Lines 30469 30470 +1
========================================
+ Hits 18891 18892 +1
Misses 10738 10738
Partials 840 840
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Outside diff range and nitpick comments (14)
x/crosschain/keeper/msg_server_abort_stuck_cctx.go (2)
43-44
: Consider adding logging for this important state transition.While the status update implementation is correct, adding logging would help with debugging and monitoring of aborted transactions in production.
// update the status +k.Logger(ctx).Info("Aborting stuck CCTX", "index", cctx.Index) cctx.CctxStatus.UpdateStatusAndErrorMessages(types.CctxStatus_Aborted, AbortMessage, "") +k.Logger(ctx).Info("CCTX aborted successfully", "index", cctx.Index, "status", types.CctxStatus_Aborted)
Line range hint
27-29
: Enhance function documentation.Consider expanding the function documentation to describe:
- The side effects (status changes, nonce updates)
- The conditions under which a CCTX is considered "stuck"
- The implications of aborting a CCTX
// AbortStuckCCTX aborts a stuck CCTX // Authorized: admin policy group 2 +// Side effects: +// - Updates CCTX status to Aborted +// - Removes the transaction from pending nonces +// - Updates outbound transaction state +// A CCTX is considered stuck when it remains in a pending status without progress +// Aborting a CCTX is irreversible and should be used with caution func (k msgServer) AbortStuckCCTX(x/crosschain/keeper/outbound_tracker_test.go (1)
61-66
: Consider enhancing the negative test case.While testing for non-panicking behavior is good, consider adding assertions to verify the system state remains unchanged after attempting to remove a non-existent tracker.
Here's a suggested enhancement:
t.Run("Do nothing if tracker doesn't exist", func(t *testing.T) { keeper, ctx, _, _ := keepertest.CrosschainKeeper(t) + // Create some initial state + items := createNOutboundTracker(keeper, ctx, 1) + initialState := keeper.GetAllOutboundTracker(ctx) require.NotPanics(t, func() { keeper.RemoveOutboundTrackerFromStore(ctx, 1, 1) }) + + // Verify system state remains unchanged + finalState := keeper.GetAllOutboundTracker(ctx) + require.Equal(t, initialState, finalState) })x/crosschain/types/status.go (2)
90-92
: LGTM! Consider adding documentation.The implementation is correct and aligns with the state transition map. Consider adding documentation to explain what constitutes a terminal status and its significance in the CCTX lifecycle.
Add documentation above the method:
+// IsTerminalStatus returns true if the status represents a final state +// from which no further transitions are possible (Aborted, Reverted, or OutboundMined). func (c CctxStatus) IsTerminalStatus() bool {
94-96
: LGTM! Consider adding documentation and using constants.The implementation correctly identifies pending states. Consider documenting the method and using constants for better maintainability.
Consider refactoring to use constant sets for better maintainability:
var ( terminalStatuses = map[CctxStatus]bool{ CctxStatus_Aborted: true, CctxStatus_Reverted: true, CctxStatus_OutboundMined: true, } pendingStatuses = map[CctxStatus]bool{ CctxStatus_PendingInbound: true, CctxStatus_PendingOutbound: true, CctxStatus_PendingRevert: true, } ) func (c CctxStatus) IsTerminalStatus() bool { return terminalStatuses[c] } func (c CctxStatus) IsPendingStatus() bool { return pendingStatuses[c] }This approach:
- Makes it easier to maintain status categories
- Provides better performance for larger sets
- Makes the code more maintainable
x/crosschain/types/status_test.go (2)
178-197
: LGTM! Consider adding documentation for terminal status definition.The test implementation is clean and comprehensive, following Go testing best practices with table-driven tests. Consider adding a comment block explaining what constitutes a terminal status (OutboundMined, Reverted, Aborted) for better maintainability.
+// TestCctxStatus_IsTerminalStatus verifies the IsTerminalStatus method +// Terminal statuses represent final states (OutboundMined, Reverted, Aborted) +// from which no further transitions are possible func TestCctxStatus_IsTerminalStatus(t *testing.T) {
199-218
: LGTM! Consider adding documentation for pending status definition.The test implementation is clean and comprehensive, following Go testing best practices with table-driven tests. Consider adding a comment block explaining what constitutes a pending status (PendingInbound, PendingOutbound, PendingRevert) for better maintainability.
+// TestCctxStatus_IsPendingStatus verifies the IsPendingStatus method +// Pending statuses represent intermediate states (PendingInbound, PendingOutbound, PendingRevert) +// from which transitions to other states are expected func TestCctxStatus_IsPendingStatus(t *testing.T) {x/crosschain/keeper/msg_server_abort_stuck_cctx_test.go (3)
33-41
: Consider reducing test setup duplicationThe setup code for TssNonce and PendingNonces is duplicated across all test cases. Consider extracting this into a helper function to improve maintainability.
func setupCCTXWithPendingNonces(t *testing.T, k *keeper.Keeper, ctx sdk.Context, status crosschaintypes.CctxStatus, statusMessage string) *crosschaintypes.CrossChainTx { cctx := sample.CrossChainTx(t, "cctx_index") cctx.CctxStatus = &crosschaintypes.Status{ Status: status, StatusMessage: statusMessage, } cctx.GetCurrentOutboundParam().TssNonce = 1 k.SetCrossChainTx(ctx, *cctx) k.GetObserverKeeper().SetPendingNonces(ctx, observertypes.PendingNonces{ NonceLow: 0, NonceHigh: 10, ChainId: cctx.GetCurrentOutboundParam().ReceiverChainId, Tss: cctx.GetCurrentOutboundParam().TssPubkey, }) return cctx }Also applies to: 79-87, 129-137
57-61
: Enhance pending nonces verificationThe assertions for pending nonces state could be more comprehensive. Consider adding checks for:
- NonceHigh value remains unchanged
- ChainId and Tss values match the CCTX parameters
// Add these assertions after the existing ones require.Equal(t, int64(10), pendingNonces.NonceHigh) require.Equal(t, cctx.GetCurrentOutboundParam().ReceiverChainId, pendingNonces.ChainId) require.Equal(t, cctx.GetCurrentOutboundParam().TssPubkey, pendingNonces.Tss)Also applies to: 103-109, 153-157
Line range hint
18-224
: Consider adding edge case testsThe test suite has good coverage of main scenarios, but consider adding these edge cases:
- Test aborting when nonce equals NonceHigh
- Test handling of nonce overflow scenarios
Example test case:
t.Run("can handle nonce at upper bound", func(t *testing.T) { // Arrange k, ctx, _, _ := keepertest.CrosschainKeeperWithMocks(t, keepertest.CrosschainMockOptions{ UseAuthorityMock: true, }) // Setup CCTX with nonce = NonceHigh cctx := sample.CrossChainTx(t, "cctx_index") cctx.GetCurrentOutboundParam().TssNonce = 10 // ... rest of the setup // Assert appropriate handling })e2e/utils/zetacore.go (2)
Line range hint
69-72
: Consider documenting the implications of usingInTxHashToCctxData
.The TODO comment mentions replacing
InTxHashToCctxData
withInboundHashToCctxData
but doesn't explain the implications or potential risks of using the current approach. This information would be valuable for maintainers.
Line range hint
1-324
: Consider abstracting common waiting patterns.The file contains several functions with similar waiting patterns (
WaitCctxMinedByInboundHash
,WaitCCTXMinedByIndex
,WaitCctxByInboundHash
). Consider creating a generic waiting mechanism to reduce code duplication and improve maintainability.Example abstraction:
type WaitConfig struct { Timeout time.Duration RetryDelay time.Duration LogInterval int Query func(ctx context.Context) (interface{}, error) IsComplete func(result interface{}) bool } func WaitForCondition(ctx context.Context, config WaitConfig, logger infoLogger) (interface{}, error) { startTime := time.Now() for i := 0; ; i++ { if time.Since(startTime) > config.Timeout { return nil, errors.New("timeout waiting for condition") } result, err := config.Query(ctx) if err != nil { if i%config.LogInterval == 0 { logger.Info("Error querying condition: %s", err.Error()) } time.Sleep(config.RetryDelay) continue } if config.IsComplete(result) { return result, nil } time.Sleep(config.RetryDelay) } }changelog.md (2)
Line range hint
1-24
: Consider adding a table of contents for better navigation.The changelog is well-organized chronologically, but as it grows, navigation becomes challenging. Consider adding a table of contents at the top with links to each version section.
# CHANGELOG + +## Table of Contents + +- [Unreleased](#unreleased) +- [v23.0.0](#v2300) +- [v22.1.2](#v2212) +...
17-17
: Add more context to the fix description.The fix description "update pending nonces when aborting a cctx through MsgAbortStuckCCTX" could benefit from additional context about the problem it solves and its impact.
-* [3230](https://github.com/zeta-chain/node/pull/3230) - update pending nonces when aborting a cctx through MsgAbortStuckCCTX +* [3230](https://github.com/zeta-chain/node/pull/3230) - fix: update pending nonces when aborting a cctx through MsgAbortStuckCCTX to prevent nonce tracking inconsistencies
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
📒 Files selected for processing (8)
changelog.md
(1 hunks)e2e/utils/zetacore.go
(2 hunks)x/crosschain/keeper/cctx.go
(2 hunks)x/crosschain/keeper/msg_server_abort_stuck_cctx.go
(1 hunks)x/crosschain/keeper/msg_server_abort_stuck_cctx_test.go
(5 hunks)x/crosschain/keeper/outbound_tracker_test.go
(1 hunks)x/crosschain/types/status.go
(1 hunks)x/crosschain/types/status_test.go
(1 hunks)
🧰 Additional context used
📓 Path-based instructions (7)
e2e/utils/zetacore.go (1)
Pattern **/*.go
: Review the Go code, point out issues relative to principles of clean code, expressiveness, and performance.
x/crosschain/keeper/cctx.go (1)
Pattern **/*.go
: Review the Go code, point out issues relative to principles of clean code, expressiveness, and performance.
x/crosschain/keeper/msg_server_abort_stuck_cctx.go (1)
Pattern **/*.go
: Review the Go code, point out issues relative to principles of clean code, expressiveness, and performance.
x/crosschain/keeper/msg_server_abort_stuck_cctx_test.go (1)
Pattern **/*.go
: Review the Go code, point out issues relative to principles of clean code, expressiveness, and performance.
x/crosschain/keeper/outbound_tracker_test.go (1)
Pattern **/*.go
: Review the Go code, point out issues relative to principles of clean code, expressiveness, and performance.
x/crosschain/types/status.go (1)
Pattern **/*.go
: Review the Go code, point out issues relative to principles of clean code, expressiveness, and performance.
x/crosschain/types/status_test.go (1)
Pattern **/*.go
: Review the Go code, point out issues relative to principles of clean code, expressiveness, and performance.
🔇 Additional comments (8)
x/crosschain/keeper/msg_server_abort_stuck_cctx.go (1)
39-39
: Clean refactoring of status check logic!
The use of IsPendingStatus()
method encapsulates the status check logic, making the code more maintainable and less prone to errors.
x/crosschain/keeper/outbound_tracker_test.go (1)
45-59
: LGTM! Well-structured test for existing tracker removal.
The test case effectively verifies the removal of existing outbound trackers and follows Go testing best practices by using sub-tests.
x/crosschain/keeper/cctx.go (1)
Line range hint 37-50
: Implementation aligns with PR objectives
The code correctly handles nonce mapping for pending states and updates zeta accounting for aborted transactions. This implementation addresses the issue mentioned in #3157 regarding pending nonces during CCTX abortion.
Let's verify the status handling implementation:
✅ Verification successful
Nonce mapping implementation is consistent with status transitions
The verification confirms that:
- The status transitions between
PendingOutbound
andPendingRevert
are well-defined inx/crosschain/types/status.go
- The nonce mapping is consistently set only for these two states across the codebase
- The implementation aligns with the state transition rules and test cases validate the behavior
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify the consistency of status handling across the codebase
# Check for other places where CctxStatus_PendingOutbound and CctxStatus_PendingRevert are used
rg "CctxStatus_PendingOutbound|CctxStatus_PendingRevert" --type go
# Check for potential inconsistencies in nonce handling
ast-grep --pattern 'SetNonceToCctx($$$)'
Length of output: 18409
x/crosschain/keeper/msg_server_abort_stuck_cctx_test.go (1)
7-7
: LGTM: Clean import addition
The addition of the observertypes import is well-placed and necessary for the new pending nonces functionality.
e2e/utils/zetacore.go (3)
Line range hint 116-132
: LGTM! Well-structured status check refactoring.
The change from using a function to a method call (IsTerminalStatus()
) on the status type improves encapsulation and maintainability. The rate-limited logging is also well-implemented to prevent log spam.
Line range hint 173-185
: LGTM! Consistent status check implementation.
The status check implementation maintains consistency with the previous refactoring, using the same method call pattern and logging approach.
Line range hint 13-19
: Investigate increasing timeout requirements.
The timeout has been increased from 4 to 8 minutes, which might indicate underlying performance issues or race conditions. This warrants investigation as noted in issue #2690.
Run the following script to analyze the historical context of timeout-related changes:
✅ Verification successful
Let me gather more context about the timeout changes and their rationale.
Let me check the referenced issue and the commit history around this timeout change.
Timeout increase appears to be a deliberate performance optimization
The timeout increase from 4 to 8 minutes in DefaultCctxTimeout
is a justified change based on the following findings:
- The timeout is specifically used in end-to-end (e2e) testing scenarios, not production code
- The change is documented with a reference to issue
e2e
: cctx timeout need to be increased to avoid test failures #2690 - The timeout is consistently applied across multiple CCTX-related functions in the test utilities
- No recurring timeout-related test failures were found in the e2e test logs
The increased duration accommodates longer processing times during test scenarios without indicating underlying performance issues or race conditions.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Analyze timeout-related changes and their context
# Search for timeout-related changes in the codebase
rg -A 3 "DefaultCctxTimeout.*=.*time\.Minute"
# Search for timeout-related issues in error logs
rg -A 3 "timeout.*cctx not mined"
Length of output: 789
Script:
#!/bin/bash
# Search for timeout-related changes in test files and configurations
rg -B 2 -A 2 "time\.Minute.*timeout"
# Search for any timeout-related configurations in the codebase
rg -B 2 -A 2 "DefaultCctxTimeout|CCTX_TIMEOUT"
# Look for test failures or issues related to timeouts
rg -B 2 -A 2 "test.*failed.*timeout|error.*timeout"
Length of output: 2915
Script:
#!/bin/bash
# Check the git history for the timeout change
git log -p --grep="timeout" e2e/utils/zetacore.go
# Look for the referenced issue number in commit messages
git log --grep="#2690"
# Search for timeout-related test failures in the e2e directory
rg -B 3 -A 3 "failed.*timeout|error.*timeout" e2e/
Length of output: 399
changelog.md (1)
Line range hint 1-1500
: LGTM! The changelog follows good practices.
The changelog effectively documents:
- Breaking changes with clear explanations
- Features, fixes, and refactoring changes
- Links to relevant PRs and issues
- Version numbers following semantic versioning
// Save out outbound, we do not need to provide the tss-pubkey as NonceToCctx is not updated | ||
k.SaveOutbound(ctx, &cctx, "") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codebase verification
Nonce cleanup needs to be implemented in abort stuck CCTX handler
The SaveOutbound
implementation in msg_server_vote_outbound_tx.go
shows that it handles nonce cleanup by calling RemoveFromPendingNonces
, but this is specifically for the outbound voting flow. When aborting a stuck CCTX, we need explicit nonce cleanup to prevent nonce gaps.
- Add
RemoveFromPendingNonces
call beforeSaveOutbound
inmsg_server_abort_stuck_cctx.go
to properly clean up the pending nonce:
k.GetObserverKeeper().RemoveFromPendingNonces(ctx, cctx.GetCurrentOutboundParam().TssPubkey, cctx.GetCurrentOutboundParam().ReceiverChainId, int64(cctx.GetCurrentOutboundParam().TssNonce))
🔗 Analysis chain
Verify nonce removal from pending list.
The PR objectives mention using RemoveFromPendingNonces
, but it's not explicitly called here. Let's verify if SaveOutbound
handles this or if we need additional nonce cleanup.
Let me gather more information about the SaveOutbound
implementation to understand how it handles nonces.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for nonce handling in SaveOutbound implementation
ast-grep --pattern 'func (k Keeper) SaveOutbound($_, $_, $_) {
$$$
RemoveFromPendingNonces($$$)
$$$
}'
# Also check for any other nonce handling in the keeper
rg -A 5 "RemoveFromPendingNonces"
Length of output: 6219
Script:
#!/bin/bash
# Check SaveOutbound implementation
ast-grep --pattern 'func (k Keeper) SaveOutbound($_, $_, $_) {
$$$
}'
# Check if there are any other methods in the abort stuck cctx file that handle nonces
rg -A 5 "nonce" x/crosschain/keeper/msg_server_abort_stuck_cctx.go
Length of output: 1252
|
||
// 1. Set the Nonce to Cctx mapping | ||
// A new mapping between a nonce and a cctx index should be created only when we add a new outbound to an existing cctx. | ||
// When adding a new outbound , the only two conditions are | ||
// - The cctx is in CctxStatus_PendingOutbound , which means the first outbound has been added, and we need to set the nonce for that | ||
// - The cctx is in CctxStatus_PendingRevert , which means the second outbound has been added, and we need to set the nonce for that | ||
|
||
// 2. Set the cctx in the store | ||
|
||
// 3. set the mapping inboundHash -> cctxIndex | ||
// A new value is added to the mapping when a single inbound hash is connected to multiple cctx indexes | ||
|
||
// 4. update the zeta accounting | ||
// Zeta-accounting is updated aborted cctxs of cointtype zeta.When a cctx is aborted it means that `GetAbortedAmount` | ||
//of zeta is locked and cannot be used. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Consider splitting the function to follow Single Responsibility Principle
The function handles four distinct responsibilities as documented in the comments. Consider breaking this down into smaller, focused functions for better maintainability and testing.
Additionally, there are some formatting issues in the comments:
- Split the function into smaller functions:
func (k Keeper) SetCctxAndNonceToCctxAndInboundHashToCctx(ctx sdk.Context, cctx types.CrossChainTx, tssPubkey string) {
k.setNonceToCctxMapping(ctx, cctx, tssPubkey)
k.setCrossChainTx(ctx, cctx)
k.setInboundHashToCctx(ctx, cctx)
k.updateZetaAccounting(ctx, cctx)
}
- Fix the comment formatting:
-// 4. update the zeta accounting
-// Zeta-accounting is updated aborted cctxs of cointtype zeta.When a cctx is aborted it means that `GetAbortedAmount`
-//of zeta is locked and cannot be used.
+// 4. Update the zeta accounting
+// Zeta-accounting is updated for aborted CCTXs of cointype zeta. When a CCTX is aborted,
+// it means that `GetAbortedAmount` of zeta is locked and cannot be used.
Description
Closes #3157
The pr does the following
Calls
SaveOutbound
at the end of MsgAbortCCTX , which processes the outbound in the same way as a regular cctxAdd simple functions for the status check , in the types package
How Has This Been Tested?
Summary by CodeRabbit
Release Notes
New Features
Bug Fixes
Tests