Improve Reliability of the CCIP Gateway #266

Open
alainncls opened this issue Dec 6, 2024 · 2 comments · May be fixed by #281
@alainncls
Collaborator

Description:

The CCIP Gateway does not currently handle RPC failures reliably, which can lead to downtime when a single RPC provider (e.g., Infura) is unavailable. To address this issue, implement one of the following solutions:

  1. Implement a fallback RPC URL at the gateway level:

    • Add support for multiple RPC endpoints directly in the gateway.
    • Automatically switch to a secondary RPC provider in case of a failure.
  2. Deploy a second gateway instance (recommended):

    • Use a separate RPC provider for the second gateway instance (e.g., the Linea internal RPC node).
    • Deploy a new LineaSparseProofVerifier referencing the 2 URLs.
    • Update the L1Resolver contract to reference the new LineaSparseProofVerifier address.
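
To make solution 1 more concrete, here is a minimal sketch of RPC fallback at the gateway level. It is not taken from the gateway code: `withRpcFallback`, `RpcCall` and the URLs in the usage note are hypothetical names chosen for illustration, and the sketch assumes that any thrown error from a provider means "try the next one".

```typescript
// Hypothetical helper (not part of the existing gateway): run `call` against
// each configured RPC URL in order and return the first successful result.
type RpcCall<T> = (rpcUrl: string) => Promise<T>;

function describeError(err: unknown): string {
  return err instanceof Error ? err.message : String(err);
}

async function withRpcFallback<T>(
  rpcUrls: string[],
  call: RpcCall<T>,
): Promise<T> {
  if (rpcUrls.length === 0) {
    throw new Error("no RPC URLs configured");
  }
  const failures: string[] = [];
  for (const url of rpcUrls) {
    try {
      // The first provider that answers wins; later ones are never contacted.
      return await call(url);
    } catch (err) {
      // Assumption: every error is treated as a provider failure.
      failures.push(`${url}: ${describeError(err)}`);
    }
  }
  throw new Error(`all RPC endpoints failed (${failures.join("; ")})`);
}
```

In the gateway, `call` would wrap whatever RPC request is currently sent to the single provider, and `rpcUrls` would come from configuration, for example a primary provider URL followed by a secondary one. Libraries such as ethers also ship a `FallbackProvider`; whether it fits the gateway's needs is not assessed here.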

Acceptance Criteria:

  • The CCIP Gateway should remain operational during an RPC provider failure.
  • The chosen solution must minimize impact on the current infrastructure while enhancing resilience.
  • For solution 2, proper redeployment of required contracts (e.g., L1Resolver) must be included.

Note:

Solution 2 is the recommended approach as it aligns with CCIP-Read best practices and ensures greater reliability.

@alainncls
Collaborator Author

Solution 1:

  • Small code modification
  • Potential additional cost for RPC provider?
  • Same devops chart with additional RPC config
  • No contract to deploy
    => Easy to implement & support

Solution 2:

  • No code modification, only configuration (DNS + deploy script)
  • Recommended by the CCIP standard
  • Requires 2 devops charts + 2 autoscaling instances
  • 2 contracts to redeploy
  • Requires Security Council sign off
  • Potential additional cost for RPC provider?
    => Easy to implement

Conclusion:

  • Solution 1 is lightweight and less expensive than solution 2
  • Solution 2 is ideal but more expensive and requires more devops config/work
  • Our recommendation: Solution 1, because in practice only the 1st gateway will be called 99% of the time (so the 2nd instance would sit idle).
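
The observation that the first gateway receives almost all calls follows from how a CCIP-Read client walks the URL list. The sketch below shows that behaviour under one reading of EIP-3668: URLs are tried in order, a 4xx response aborts the lookup, and a 5xx response moves on to the next URL. `queryGateways`, `GatewayFetch` and `GatewayResponse` are hypothetical names, the HTTP call is injected rather than real, and treating a network error like a 5xx is an assumption of this sketch.

```typescript
// Hypothetical types: a stand-in for the HTTP request a CCIP-Read client makes.
type GatewayResponse = { status: number; data?: string };
type GatewayFetch = (url: string) => Promise<GatewayResponse>;

async function queryGateways(
  urls: string[],
  fetchGateway: GatewayFetch,
): Promise<string> {
  for (const url of urls) {
    // Assumption: a network-level failure is handled like a server error.
    const res = await fetchGateway(url).catch(() => undefined);
    if (res === undefined) continue;

    if (res.status >= 200 && res.status < 300 && res.data !== undefined) {
      // Success: later gateways are never contacted.
      return res.data;
    }
    if (res.status >= 400 && res.status < 500) {
      // Client error: stop instead of trying further gateways.
      throw new Error(`gateway ${url} rejected the request (${res.status})`);
    }
    // 5xx or any other status: fall through to the next URL.
  }
  throw new Error("all gateways failed");
}
```

With two URLs configured, the second gateway is only reached when the first one is down or returns a server error, which is why it would be idle most of the time.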

@alainncls
Collaborator Author

ENS L1 makes the call to the contract and the call reverts, but that revert is interpreted by ENS L1, not by our code.
=> Having 2 gateways has no impact on our UI
