robust error handling #68

Open
bbortt opened this issue Sep 7, 2024 · 0 comments

bbortt commented Sep 7, 2024

I've already created the sequence diagram depicting the "switch" workflow. Now, let's analyze each step and consider what could go wrong:

Propeller requests active user from Vault:
If this fails, Propeller won't know which user is currently active. However, since Vault versions secrets, the current state is preserved. Propeller could retry or alert an admin.
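The "retry or alert an admin" idea could be sketched as a small helper with exponential backoff. This is only an illustration, not Propeller's actual code: `retry`, `attempts` and `base_delay` are names made up for this example, and the operation passed in stands for whatever call reads the active user from Vault.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry(
    operation: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `operation` until it succeeds or `attempts` calls have failed."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except Exception:
            if attempt == attempts:
                # Out of attempts: re-raise so the caller can alert an admin.
                raise
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

Because reading the active user changes nothing, retrying it is safe. The same helper would need more care around steps that write to PostgreSQL or Vault.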

Propeller generates a random password:
This is a local operation, so it's unlikely to fail. If it does, Propeller could retry or use a fallback method.
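For illustration, local password generation might look like the sketch below, which uses Python's standard `secrets` module. The function name, the default length and the alphanumeric-only alphabet are assumptions for this example, not something Propeller is known to do.

```python
import secrets
import string

# Assumption: letters and digits only, to avoid quoting issues in SQL
# statements and connection strings.
ALPHABET = string.ascii_letters + string.digits


def generate_password(length: int = 32) -> str:
    """Return a random password drawn from a cryptographically secure source."""
    if length < 16:
        raise ValueError("refusing to generate a password shorter than 16 characters")
    return "".join(secrets.choice(ALPHABET) for _ in range(length))
```

If the operating system's random source is unavailable, the call raises and the workflow can stop before anything has been changed in PostgreSQL or Vault.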

Propeller rotates user 2 password in PostgreSQL:
If this fails, the old password for user 2 remains active. Propeller should not proceed with updating Vault or switching the active user. It should retry or roll back any changes made.

Propeller updates user 2 password in Vault:
If this fails, there's a mismatch between PostgreSQL and Vault. Propeller should attempt to revert the PostgreSQL change to maintain consistency.
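The "revert the PostgreSQL change" behaviour is a compensating action, and a rough sketch of it follows. The `Database` and `SecretStore` interfaces, `RotationError` and `rotate_user` are hypothetical names introduced for this example. The sketch also assumes the previous password can still be read from Vault before it is overwritten.

```python
from typing import Protocol


class Database(Protocol):
    def set_password(self, user: str, password: str) -> None: ...


class SecretStore(Protocol):
    def read_password(self, user: str) -> str: ...
    def write_password(self, user: str, password: str) -> None: ...


class RotationError(Exception):
    """Raised when a rotation could not be completed."""


def rotate_user(db: Database, vault: SecretStore, user: str, new_password: str) -> None:
    # Remember the last known good password so the database can be reverted.
    old_password = vault.read_password(user)

    # If this raises, nothing has changed yet and there is nothing to undo.
    db.set_password(user, new_password)

    try:
        vault.write_password(user, new_password)
    except Exception as error:
        # Vault and PostgreSQL now disagree: compensate by restoring the old password.
        try:
            db.set_password(user, old_password)
        except Exception as revert_error:
            raise RotationError(
                f"{user}: Vault update and revert both failed, manual intervention needed"
            ) from revert_error
        raise RotationError(f"{user}: Vault update failed, PostgreSQL change reverted") from error
```

The worst case is that the revert fails too. The sketch reports that case separately because it cannot be recovered automatically.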

Propeller switches active user to user 2 in Vault:
If this fails, the application will continue using user 1. Propeller should not proceed with the ArgoCD rollout. It should consider reverting the password changes for user 2.

Propeller triggers rollout via ArgoCD:
If this fails, the application won't be updated to use the new credentials. Propeller should alert admins but keep the new credentials in place, as they're not yet in use.

Propeller polls ArgoCD for rollout status:
If polling fails or times out, Propeller can't confirm if the rollout was successful. It should alert admins but avoid rotating user 1's password until confirmation is received.
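The polling step could be sketched as a loop with a deadline that reports success only when the rollout is positively confirmed. `RolloutStatus`, `wait_for_rollout` and the default timings are assumptions for this example. `get_status` stands for whatever call queries ArgoCD, which is not modelled here.

```python
import time
from enum import Enum
from typing import Callable, Optional


class RolloutStatus(Enum):
    PROGRESSING = "progressing"
    HEALTHY = "healthy"
    DEGRADED = "degraded"


def wait_for_rollout(
    get_status: Callable[[], RolloutStatus],
    timeout: float = 300.0,
    interval: float = 5.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True only if the rollout is confirmed healthy before the deadline."""
    deadline = clock() + timeout
    while True:
        status: Optional[RolloutStatus]
        try:
            status = get_status()
        except Exception:
            # A failed poll is treated as "unknown"; keep trying until the deadline.
            status = None
        if status is RolloutStatus.HEALTHY:
            return True
        if status is RolloutStatus.DEGRADED:
            return False
        if clock() >= deadline:
            return False
        sleep(interval)
```

A `False` result covers both a failed rollout and a timeout. In either case the caller would alert admins and leave user 1's password untouched.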

Propeller rotates user 1 password in PostgreSQL:
If this fails, user 1's old password remains active. This isn't immediately problematic as the application should be using user 2, but it leaves user 1's old credentials active longer than intended.

Propeller updates user 1 password in Vault:
If this fails, there's a mismatch between PostgreSQL and Vault for user 1. Propeller should attempt to revert the PostgreSQL change to maintain consistency.

Throughout this process, the versioning of secrets in Vault provides a safety net. If any step fails, we can potentially roll back to the previous state by reverting to the last known good version of the secrets.
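To make the safety net concrete, here is a toy in-memory model of a versioned secret. It is not a Vault client. It assumes, as I understand Vault's KV version 2 rollback to work, that rolling back writes the old data as a new version rather than deleting history. The `VersionedSecret` class and the `active_user` field are made up for this example.

```python
from typing import Any, Dict, List, Optional


class VersionedSecret:
    """In-memory stand-in for a single versioned secret."""

    def __init__(self) -> None:
        self._versions: List[Dict[str, Any]] = []

    def write(self, data: Dict[str, Any]) -> int:
        """Store a new version and return its number (versions start at 1)."""
        self._versions.append(dict(data))
        return len(self._versions)

    def read(self, version: Optional[int] = None) -> Dict[str, Any]:
        """Return the given version, or the latest one if none is given."""
        index = len(self._versions) if version is None else version
        if not 1 <= index <= len(self._versions):
            raise KeyError(f"no such version: {version}")
        return dict(self._versions[index - 1])

    def rollback(self, version: int) -> int:
        """Restore an old version by writing its data as the newest version."""
        return self.write(self.read(version))


secret = VersionedSecret()
good = secret.write({"active_user": 1})
secret.write({"active_user": 2})  # the switch that later turns out to be bad
secret.rollback(good)  # history is kept; the latest version matches `good` again
```

Rolling back only restores what Vault holds. The PostgreSQL passwords still have to be brought back in line with the restored version.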
