fix: update pending nonces when aborting a cctx through MsgAbortStuckCCTX #3230

kingpinXD · 2024-11-28T00:17:40Z

Description

The pr does the following

Calls SaveOutbound at the end of MsgAbortCCTX , which processes the outbound in the same way as a regular cctx
Add simple functions for the status check , in the types package

func (c CctxStatus) IsTerminalStatus() bool {
	return c == CctxStatus_Aborted || c == CctxStatus_Reverted || c == CctxStatus_OutboundMined
}

func (c CctxStatus) IsPendingStatus() bool {
	return c == CctxStatus_PendingInbound || c == CctxStatus_PendingOutbound || c == CctxStatus_PendingRevert
}

How Has This Been Tested?

Summary by CodeRabbit

Release Notes

New Features
- Introduced methods to easily determine if a cross-chain transaction status is terminal or pending.
Bug Fixes
- Improved handling of unsupported transaction versions for Solana.
- Enhanced transaction management by updating pending nonces when aborting a transaction.
- Fixed retry behavior for inbound vote message validation failures.
Tests
- Added new tests for evaluating terminal and pending statuses of cross-chain transactions.
- Enhanced existing tests for aborting transactions to ensure accurate nonce handling and status message validation.

coderabbitai · 2024-11-28T00:17:48Z

📝 Walkthrough

Walkthrough

The pull request encompasses a series of changes aimed at enhancing the functionality and stability of the zetaclient and cross-chain transaction handling. Key modifications include the refactoring of transaction status checks, improvements in nonce management during transaction aborts, and updates to test cases for better coverage. The changelog reflects these changes alongside fixes for transaction handling issues in Solana and Bitcoin, as well as the introduction of new methods for status evaluation in the CctxStatus type.

Changes

File Path	Change Summary
changelog.md	Updated to reflect various modifications including tests, refactoring, and fixes; new fix for updating pending nonces.
e2e/utils/zetacore.go	Refactored CCTX status checking logic; removed `IsTerminalStatus` and integrated its logic into existing functions.
x/crosschain/keeper/cctx.go	Updated comments in `SetCctxAndNonceToCctxAndInboundHashToCctx` for clarity on mapping conditions.
x/crosschain/keeper/msg_server_abort_stuck_cctx.go	Refactored `AbortStuckCCTX` logic to use `IsPendingStatus()` and `UpdateStatusAndErrorMessages`.
x/crosschain/keeper/msg_server_abort_stuck_cctx_test.go	Updated tests for `AbortStuckCCTX` to manage pending nonces and refined assertions for status messages.
x/crosschain/keeper/outbound_tracker_test.go	Refactored `TestOutboundTrackerRemove` to include sub-tests for better coverage.
x/crosschain/types/status.go	Added methods `IsTerminalStatus` and `IsPendingStatus` to `CctxStatus` type for improved status evaluation.
x/crosschain/types/status_test.go	Added tests for new `IsTerminalStatus` and `IsPendingStatus` methods to validate their functionality.

Assessment against linked issues

Objective	Addressed	Explanation
Filter aborted CCTX from pending CCTX query (#3157)	❌	The changes do not address filtering logic for aborted CCTX.

Possibly related PRs

fix(zetaclient): add zetaclient evm outbound tx index by nonce to supplement outtx tracker #2735: Enhancements in outbound transaction tracking in the zetaclient are related to the handling of unsupported transaction versions for Solana.
feat: add Solana RPC status check and some refactor around RPC status check #2751: Introduces a new RPC status check for the Solana chain, aligning with improvements in transaction handling.
fix: abort cctx on dust amount in revert outbound (Bitcoin and Solana) #3149: Fixes related to aborting CCTX on dust amounts during revert transactions are directly connected to this PR.
fix: skip solana unsupported transaction version #3206: Updates to skip unsupported transaction versions for Solana are related to similar changes made in this PR.

Suggested labels

breaking:proto, no-changelog, V2_TESTS, CONSENSUS_BREAKING_ACK

Suggested reviewers

fbac
ws4charlie
lumtis
skosito
swift1337

Warning

Rate limit exceeded

@kingpinXD has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 7 minutes and 36 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between 08ff881 and 091ec82.

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

Review comments: Directly reply to a review comment made by CodeRabbit. Example:
- I pushed a fix in commit <commit_id>, please review it.
- Generate unit testing code for this file.
- Open a follow-up GitHub issue for this discussion.
Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
- @coderabbitai generate unit testing code for this file.
- @coderabbitai modularize this function.
PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
- @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
- @coderabbitai read src/utils.ts and generate unit testing code.
- @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
- @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

@coderabbitai pause to pause the reviews on a PR.
@coderabbitai resume to resume the paused reviews.
@coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
@coderabbitai full review to do a full review from scratch and review all the files again.
@coderabbitai summary to regenerate the summary of the PR.
@coderabbitai resolve resolve all the CodeRabbit review comments.
@coderabbitai configuration to show the current CodeRabbit configuration for the repository.
@coderabbitai help to get help.

Other keywords and placeholders

Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
Add @coderabbitai anywhere in the PR title to generate the title automatically.

Documentation and Community

Visit our Documentation for detailed information on how to use CodeRabbit.
Join our Discord Community to get help, request features, and share feedback.
Follow us on X/Twitter for updates and announcements.

codecov · 2024-11-28T00:24:10Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 62.00%. Comparing base (08ff881) to head (091ec82).

Additional details and impacted files

@@           Coverage Diff            @@
##           develop    #3230   +/-   ##
========================================
  Coverage    62.00%   62.00%           
========================================
  Files          428      428           
  Lines        30469    30470    +1     
========================================
+ Hits         18891    18892    +1     
  Misses       10738    10738           
  Partials       840      840

Files with missing lines	Coverage Δ
x/crosschain/keeper/cctx.go	`95.65% <100.00%> (+0.12%)`	⬆️
x/crosschain/keeper/msg_server_abort_stuck_cctx.go	`100.00% <100.00%> (ø)`
x/crosschain/types/status.go	`95.65% <100.00%> (+0.26%)`	⬆️

coderabbitai

Actionable comments posted: 2

🧹 Outside diff range and nitpick comments (14)

x/crosschain/keeper/msg_server_abort_stuck_cctx.go (2)
43-44: Consider adding logging for this important state transition.

While the status update implementation is correct, adding logging would help with debugging and monitoring of aborted transactions in production.
 // update the status
+k.Logger(ctx).Info("Aborting stuck CCTX", "index", cctx.Index)
 cctx.CctxStatus.UpdateStatusAndErrorMessages(types.CctxStatus_Aborted, AbortMessage, "")
+k.Logger(ctx).Info("CCTX aborted successfully", "index", cctx.Index, "status", types.CctxStatus_Aborted)
Line range hint 27-29: Enhance function documentation.

Consider expanding the function documentation to describe:

The side effects (status changes, nonce updates)

The conditions under which a CCTX is considered "stuck"

The implications of aborting a CCTX
 // AbortStuckCCTX aborts a stuck CCTX
 // Authorized: admin policy group 2
+// Side effects:
+// - Updates CCTX status to Aborted
+// - Removes the transaction from pending nonces
+// - Updates outbound transaction state
+// A CCTX is considered stuck when it remains in a pending status without progress
+// Aborting a CCTX is irreversible and should be used with caution
 func (k msgServer) AbortStuckCCTX(
x/crosschain/keeper/outbound_tracker_test.go (1)
61-66: Consider enhancing the negative test case.

While testing for non-panicking behavior is good, consider adding assertions to verify the system state remains unchanged after attempting to remove a non-existent tracker.

Here's a suggested enhancement:
 t.Run("Do nothing if tracker doesn't exist", func(t *testing.T) {
     keeper, ctx, _, _ := keepertest.CrosschainKeeper(t)
+    // Create some initial state
+    items := createNOutboundTracker(keeper, ctx, 1)
+    initialState := keeper.GetAllOutboundTracker(ctx)
     
     require.NotPanics(t, func() {
         keeper.RemoveOutboundTrackerFromStore(ctx, 1, 1)
     })
+    
+    // Verify system state remains unchanged
+    finalState := keeper.GetAllOutboundTracker(ctx)
+    require.Equal(t, initialState, finalState)
 })
x/crosschain/types/status.go (2)
90-92: LGTM! Consider adding documentation.

The implementation is correct and aligns with the state transition map. Consider adding documentation to explain what constitutes a terminal status and its significance in the CCTX lifecycle.

Add documentation above the method:
+// IsTerminalStatus returns true if the status represents a final state
+// from which no further transitions are possible (Aborted, Reverted, or OutboundMined).
 func (c CctxStatus) IsTerminalStatus() bool {
94-96: LGTM! Consider adding documentation and using constants.

The implementation correctly identifies pending states. Consider documenting the method and using constants for better maintainability.

Consider refactoring to use constant sets for better maintainability:
var (
    terminalStatuses = map[CctxStatus]bool{
        CctxStatus_Aborted:       true,
        CctxStatus_Reverted:      true,
        CctxStatus_OutboundMined: true,
    }
    
    pendingStatuses = map[CctxStatus]bool{
        CctxStatus_PendingInbound:  true,
        CctxStatus_PendingOutbound: true,
        CctxStatus_PendingRevert:   true,
    }
)

func (c CctxStatus) IsTerminalStatus() bool {
    return terminalStatuses[c]
}

func (c CctxStatus) IsPendingStatus() bool {
    return pendingStatuses[c]
}
This approach:

Makes it easier to maintain status categories

Provides better performance for larger sets

Makes the code more maintainable
x/crosschain/types/status_test.go (2)
178-197: LGTM! Consider adding documentation for terminal status definition.

The test implementation is clean and comprehensive, following Go testing best practices with table-driven tests. Consider adding a comment block explaining what constitutes a terminal status (OutboundMined, Reverted, Aborted) for better maintainability.
+// TestCctxStatus_IsTerminalStatus verifies the IsTerminalStatus method
+// Terminal statuses represent final states (OutboundMined, Reverted, Aborted)
+// from which no further transitions are possible
 func TestCctxStatus_IsTerminalStatus(t *testing.T) {
199-218: LGTM! Consider adding documentation for pending status definition.

The test implementation is clean and comprehensive, following Go testing best practices with table-driven tests. Consider adding a comment block explaining what constitutes a pending status (PendingInbound, PendingOutbound, PendingRevert) for better maintainability.
+// TestCctxStatus_IsPendingStatus verifies the IsPendingStatus method
+// Pending statuses represent intermediate states (PendingInbound, PendingOutbound, PendingRevert)
+// from which transitions to other states are expected
 func TestCctxStatus_IsPendingStatus(t *testing.T) {
x/crosschain/keeper/msg_server_abort_stuck_cctx_test.go (3)
33-41: Consider reducing test setup duplication

The setup code for TssNonce and PendingNonces is duplicated across all test cases. Consider extracting this into a helper function to improve maintainability.
func setupCCTXWithPendingNonces(t *testing.T, k *keeper.Keeper, ctx sdk.Context, status crosschaintypes.CctxStatus, statusMessage string) *crosschaintypes.CrossChainTx {
    cctx := sample.CrossChainTx(t, "cctx_index")
    cctx.CctxStatus = &crosschaintypes.Status{
        Status:        status,
        StatusMessage: statusMessage,
    }
    cctx.GetCurrentOutboundParam().TssNonce = 1

    k.SetCrossChainTx(ctx, *cctx)
    k.GetObserverKeeper().SetPendingNonces(ctx, observertypes.PendingNonces{
        NonceLow:  0,
        NonceHigh: 10,
        ChainId:   cctx.GetCurrentOutboundParam().ReceiverChainId,
        Tss:       cctx.GetCurrentOutboundParam().TssPubkey,
    })
    return cctx
}
Also applies to: 79-87, 129-137

57-61: Enhance pending nonces verification

The assertions for pending nonces state could be more comprehensive. Consider adding checks for:

NonceHigh value remains unchanged

ChainId and Tss values match the CCTX parameters
// Add these assertions after the existing ones
require.Equal(t, int64(10), pendingNonces.NonceHigh)
require.Equal(t, cctx.GetCurrentOutboundParam().ReceiverChainId, pendingNonces.ChainId)
require.Equal(t, cctx.GetCurrentOutboundParam().TssPubkey, pendingNonces.Tss)
Also applies to: 103-109, 153-157

Line range hint 18-224: Consider adding edge case tests

The test suite has good coverage of main scenarios, but consider adding these edge cases:

Test aborting when nonce equals NonceHigh

Test handling of nonce overflow scenarios

Example test case:
t.Run("can handle nonce at upper bound", func(t *testing.T) {
    // Arrange
    k, ctx, _, _ := keepertest.CrosschainKeeperWithMocks(t, keepertest.CrosschainMockOptions{
        UseAuthorityMock: true,
    })
    
    // Setup CCTX with nonce = NonceHigh
    cctx := sample.CrossChainTx(t, "cctx_index")
    cctx.GetCurrentOutboundParam().TssNonce = 10
    // ... rest of the setup
    
    // Assert appropriate handling
})
e2e/utils/zetacore.go (2)
Line range hint 69-72: Consider documenting the implications of using InTxHashToCctxData.

The TODO comment mentions replacing InTxHashToCctxData with InboundHashToCctxData but doesn't explain the implications or potential risks of using the current approach. This information would be valuable for maintainers.

Line range hint 1-324: Consider abstracting common waiting patterns.

The file contains several functions with similar waiting patterns (WaitCctxMinedByInboundHash, WaitCCTXMinedByIndex, WaitCctxByInboundHash). Consider creating a generic waiting mechanism to reduce code duplication and improve maintainability.

Example abstraction:
type WaitConfig struct {
    Timeout     time.Duration
    RetryDelay  time.Duration
    LogInterval int
    Query       func(ctx context.Context) (interface{}, error)
    IsComplete  func(result interface{}) bool
}

func WaitForCondition(ctx context.Context, config WaitConfig, logger infoLogger) (interface{}, error) {
    startTime := time.Now()
    for i := 0; ; i++ {
        if time.Since(startTime) > config.Timeout {
            return nil, errors.New("timeout waiting for condition")
        }

        result, err := config.Query(ctx)
        if err != nil {
            if i%config.LogInterval == 0 {
                logger.Info("Error querying condition: %s", err.Error())
            }
            time.Sleep(config.RetryDelay)
            continue
        }

        if config.IsComplete(result) {
            return result, nil
        }

        time.Sleep(config.RetryDelay)
    }
}
changelog.md (2)
Line range hint 1-24: Consider adding a table of contents for better navigation.

The changelog is well-organized chronologically, but as it grows, navigation becomes challenging. Consider adding a table of contents at the top with links to each version section.
 # CHANGELOG
+
+## Table of Contents
+
+- [Unreleased](#unreleased)
+- [v23.0.0](#v2300)
+- [v22.1.2](#v2212)
+...
17-17: Add more context to the fix description.

The fix description "update pending nonces when aborting a cctx through MsgAbortStuckCCTX" could benefit from additional context about the problem it solves and its impact.
-* [3230](https://github.com/zeta-chain/node/pull/3230) - update pending nonces when aborting a cctx through MsgAbortStuckCCTX
+* [3230](https://github.com/zeta-chain/node/pull/3230) - fix: update pending nonces when aborting a cctx through MsgAbortStuckCCTX to prevent nonce tracking inconsistencies

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 08ff881 and 091ec82.

📒 Files selected for processing (8)

changelog.md (1 hunks)
e2e/utils/zetacore.go (2 hunks)
x/crosschain/keeper/cctx.go (2 hunks)
x/crosschain/keeper/msg_server_abort_stuck_cctx.go (1 hunks)
x/crosschain/keeper/msg_server_abort_stuck_cctx_test.go (5 hunks)
x/crosschain/keeper/outbound_tracker_test.go (1 hunks)
x/crosschain/types/status.go (1 hunks)
x/crosschain/types/status_test.go (1 hunks)

🧰 Additional context used

📓 Path-based instructions (7)

e2e/utils/zetacore.go (1)