Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

validate capi-auth-token on dqlite/remove #56

Conversation

HomayoonAlimohammadi
Copy link
Contributor

@HomayoonAlimohammadi HomayoonAlimohammadi commented Oct 9, 2024

Summary

This PR adds the capi-auth-token header validation for the dqlite/remove endpoint.
The /capi/etc/token file is supposed to get created by the Microk8s bootstrap provider (in the cloud-init) and will get populated with a token string which is used to authenticate the CAPI calls (the dqlite/remove call specifically) to the cluster-agent.

Also the removeEndpoint request field is changed to remove_endpoint to comply with the recent MAAS API guidelines.

Alternative solution

We could have a "create capi token" endpoint that creates the token file and returns the /path/to/token as well as the token itself. Right now we introduced some coupling between Microk8s CAPI and cluster-agent since they both should be aware that the token path is /capi/etc/token. Implementing this solution and improving on the current implementation might be considered later.

PR series

  1. Cluster agent token validation --> This one
  2. Microk8s bootstrap provider add token file and secret --> Add capi-auth-token file to control plane machines cluster-api-bootstrap-provider-microk8s#115
  3. Microk8s control plane provider use token for request --> Add capi-auth-token header to /dqlite/remove request cluster-api-control-plane-provider-microk8s#68

Copy link
Member

@berkayoz berkayoz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work, did a first pass

pkg/api/v2/remove.go Outdated Show resolved Hide resolved
cmd/cluster-agent.go Outdated Show resolved Hide resolved
Copy link
Contributor

@bschimke95 bschimke95 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good, some first comments.

pkg/api/v2/remove.go Outdated Show resolved Hide resolved
}

// RemoveFromDqlite implements the "POST /v2/dqlite/remove" endpoint and removes a node from the dqlite cluster.
func (a *API) RemoveFromDqlite(ctx context.Context, req RemoveFromDqliteRequest) (int, error) {
func (a *API) RemoveFromDqlite(ctx context.Context, req RemoveFromDqliteRequest, token string) (int, error) {
isValid, err := a.Snap.IsCAPIAuthTokenValid(token)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For me, the validation of the token is a separate middleware/access handler that could potential be reused for other endpoints (similar to what we do in k8s-snap)

Disclaimer: I don't know much about the microk8s cluster agent. If this matches the other code structure I'm fine with that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really like what you're saying, but IIUC the middlewares are not configurable per endpoint in cluster-agent: https://github.com/canonical/microk8s-cluster-agent/blob/main/pkg/server/server.go#L45-L46
All the v1 or v2 endpoints get registered with the same middleware.
We technically can have this capi-auth-token header check as a middleware, but since it's going to get applied for the other v2 endpoints as well (/image/import and /join) we need to make it optional, which is counter intuitive. WDYT?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My bad, thanks for pointing this out. Will do.

Copy link
Contributor Author

@HomayoonAlimohammadi HomayoonAlimohammadi Oct 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bschimke95 This is the best that I could come up with. TBH I feel like the previous implementation was more readable and understandable. Maybe I'm doing something wrong. WDYT?
2bcaeca

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm reverting this. Not really happy with the result...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, up to you. This was only a suggestion. If you think the current implementation is more straight-forward - go for it :)

Copy link
Contributor

@bschimke95 bschimke95 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

}

// RemoveFromDqlite implements the "POST /v2/dqlite/remove" endpoint and removes a node from the dqlite cluster.
func (a *API) RemoveFromDqlite(ctx context.Context, req RemoveFromDqliteRequest) (int, error) {
func (a *API) RemoveFromDqlite(ctx context.Context, req RemoveFromDqliteRequest, token string) (int, error) {
isValid, err := a.Snap.IsCAPIAuthTokenValid(token)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, up to you. This was only a suggestion. If you think the current implementation is more straight-forward - go for it :)

Copy link
Member

@berkayoz berkayoz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Although initially we had the os.GETENV("CAPI_PATH") that could override the default path. Did we forget to change the cluster agent command to use the WithCAPIPath option?

@HomayoonAlimohammadi
Copy link
Contributor Author

@HomayoonAlimohammadi
Copy link
Contributor Author

HomayoonAlimohammadi commented Oct 11, 2024

According to a conversation with @berkayoz, we decided to go with hard-coding that path for now and see if we need to make that configurable later on.

Note for future travelers

The path defined here should match the one in the "Microk8s CAPI bootstrap provider". That provider is responsible for populating this path with needed files and info (e.g. CAPI auth token).

@HomayoonAlimohammadi HomayoonAlimohammadi merged commit 5e30719 into dqlite-remove-endpoint Oct 11, 2024
3 checks passed
@HomayoonAlimohammadi HomayoonAlimohammadi deleted the KU-1719/validate-capi-auth-token-on-remove branch October 11, 2024 10:30
HomayoonAlimohammadi added a commit that referenced this pull request Oct 14, 2024
* Add /v2/dqlite/remove endpoint (#55)
* validate capi-auth-token on dqlite/remove (#56)
HomayoonAlimohammadi added a commit that referenced this pull request Oct 16, 2024
* Add /v2/dqlite/remove endpoint (#55)
* validate capi-auth-token on dqlite/remove (#56)
HomayoonAlimohammadi added a commit that referenced this pull request Oct 16, 2024
* Add /v2/dqlite/remove endpoint (#55)
* validate capi-auth-token on dqlite/remove (#56)
HomayoonAlimohammadi added a commit that referenced this pull request Oct 16, 2024
* Add /v2/dqlite/remove endpoint (#55)
* validate capi-auth-token on dqlite/remove (#56)
HomayoonAlimohammadi added a commit that referenced this pull request Oct 16, 2024
* Add /v2/dqlite/remove endpoint (#55)
* validate capi-auth-token on dqlite/remove (#56)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants