Recovery documentation #5796

Open
kfox1111 opened this issue Jan 19, 2025 · 5 comments

Comments

@kfox1111
Contributor

If your server is down for too long, how do you recover?

The server fails to start with an error like:

ERRO[0000] Fatal run error                               error="invalid server X509-SVID: invalid X509-SVID: already expired as of 2024-12-23T16:48:19Z"
ERRO[0000] Server crashed                                error="invalid server X509-SVID: invalid X509-SVID: already expired as of 2024-12-23T16:48:19Z"
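
To see how long the server SVID has been expired, the timestamp can be pulled out of the log line. The following is a minimal sketch, not part of SPIRE: the helper names `parse_expiry` and `expired_for` are hypothetical, and the regular expression assumes the exact "already expired as of ..." wording shown in the log lines above, which other versions may phrase differently.

```python
import re
from datetime import datetime, timezone

# Pattern taken from the log lines quoted above (assumption: the wording is stable).
EXPIRED_RE = re.compile(
    r"already expired as of (\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z)"
)


def parse_expiry(log_line):
    """Return the expiry time named in the log line as an aware UTC datetime, or None."""
    match = EXPIRED_RE.search(log_line)
    if match is None:
        return None
    parsed = datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def expired_for(log_line, now):
    """Return how long before `now` the SVID expired, or None if no timestamp is found."""
    expiry = parse_expiry(log_line)
    if expiry is None:
        return None
    return now - expiry
```

Calling `expired_for(line, datetime.now(timezone.utc))` on one of the lines above returns a `timedelta` describing how long the server has been past its SVID expiry.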
@kfox1111
Contributor Author

Is removing keys.json on the server enough? It seems to start afterwards. Does anything need to be done with the SQL DB?

@kfox1111
Contributor Author

The agent won't connect in that case either...

ERRO[0030] Agent crashed                                 error="create attestation client: failed to dial dns:///localhost:8081: context deadline exceeded: connection error: desc = \"transport: authentication handshake failed: x509svid: could not get X509 bundle: x509bundle: no X.509 bundle found for trust domain: \\\"example.com\\\"\""

@kfox1111
Contributor Author

For the agent, the trust bundle seems to be in agent-data.json?

I deleted keys.json as part of trying to fix it, but I'm not sure that step is required, as it was an unrelated issue...

@sorindumitru
Contributor

sorindumitru commented Jan 19, 2025

ERRO[0000] Fatal run error error="invalid server X509-SVID: invalid X509-SVID: already expired as of 2024-12-23T16:48:19Z"

Was it stuck in a crash loop with this error? That sounds more like a bug. Do you happen to have the rest of the logs?

@kfox1111
Contributor Author

I think so, but it was long enough ago that I don't want to say for sure.

Basically, I had a spire-server that I left offline long enough that all of its CAs expired. It should be reproducible by starting up a new spire-server with an extremely short CA TTL, shutting it off for a little while until the CAs expire, and then starting it back up.
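
The timing condition behind that reproduction can be sketched as follows. This is only a model of the timeline, not SPIRE code: the function name `ca_expired_on_restart` and its parameters are hypothetical, and it looks at a single CA, whereas the scenario above requires every CA the server holds to be past its expiry.

```python
from datetime import datetime, timedelta, timezone


def ca_expired_on_restart(issued_at, ca_ttl, stopped_at, downtime):
    """Return True if a CA issued at `issued_at` with lifetime `ca_ttl`
    has already expired when the server restarts `downtime` after `stopped_at`."""
    not_after = issued_at + ca_ttl          # end of the CA's validity window
    restarted_at = stopped_at + downtime    # moment the server comes back up
    return restarted_at >= not_after


# Illustrative values only: a 5-minute CA, server stopped 1 minute after issuance.
issued = datetime(2025, 1, 1, tzinfo=timezone.utc)
stopped = issued + timedelta(minutes=1)
short_outage = ca_expired_on_restart(issued, timedelta(minutes=5), stopped, timedelta(minutes=2))
long_outage = ca_expired_on_restart(issued, timedelta(minutes=5), stopped, timedelta(minutes=10))
```

With these values, `short_outage` is `False` (the restart happens inside the validity window) and `long_outage` is `True`, which is the state the reproduction needs to reach.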
