Unauthorized when opening the IDE #22353

Closed
l0rd opened this issue Jul 11, 2023 · 8 comments
Labels: area/dashboard, area/hosted-che, kind/bug (Outline of a bug - must adhere to the bug report template.), severity/P1 (Has a major impact to usage or development of the system.)

Comments

Contributor

l0rd commented Jul 11, 2023

Describe the bug

[Screenshot: 2023-07-11 12 14 57]

This happened on sandbox this morning

Che version

7.70@latest

Steps to reproduce

Open https://github.com/eclipse-che-demo-app/che-demo-app on sandbox

Expected behavior

No error

Runtime

OpenShift

Screenshots

No response

Installation method

OperatorHub

Environment

Linux

Eclipse Che Logs

No response

Additional context

No response

@l0rd l0rd added the kind/bug Outline of a bug - must adhere to the bug report template. label Jul 11, 2023
@che-bot che-bot added the status/need-triage An issue that needs to be prioritized by the curator responsible for the triage. See https://github. label Jul 11, 2023
@dkwon17 dkwon17 added severity/P1 Has a major impact to usage or development of the system. and removed status/need-triage An issue that needs to be prioritized by the curator responsible for the triage. See https://github. labels Jul 11, 2023
@ibuziuk ibuziuk moved this to 📅 Planned in Eclipse Che Team A Backlog Jul 12, 2023
Member

ibuziuk commented Jul 18, 2023

A similar 404 error was caught by the periodic tests: output.webm

@dkwon17 dkwon17 moved this from 📅 Planned to 🚧 In Progress in Eclipse Che Team A Backlog Jul 26, 2023
Contributor

dkwon17 commented Aug 6, 2023

I see similar issues (also similar to #22352) when refreshing the dashboard
image

This error is quite difficult to reproduce. On some days, I'm able to reproduce this issue very occasionally on the dogfooding cluster, but on other days, I cannot reproduce it at all (I've used a script to refresh the dashboard on dogfooding about 4000 times, and could not reproduce it). I find it much easier to reproduce this error on Dev Sandbox, more specifically the m2 cluster.
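
A refresh loop of this kind could be sketched as follows. This is only an illustration, not the script that was actually used: the function names are made up, the URL is whatever dashboard endpoint is being probed, and a real run against Che would also need to send the browser session cookie, which is omitted here.

```python
import urllib.error
import urllib.request
from collections import Counter
from typing import Callable


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code of a GET request to `url`."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx responses; the code is what we want to count.
        return err.code


def count_statuses(url: str, attempts: int,
                   fetch: Callable[[str], int] = fetch_status) -> Counter:
    """Request `url` `attempts` times and tally the status codes seen."""
    tally: Counter = Counter()
    for _ in range(attempts):
        tally[fetch(url)] += 1
    return tally
```

Calling `count_statuses(url, 4000)` and checking the tally for 401 or 403 entries gives a quick answer on whether the error showed up during the run.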

I've noticed that these types of errors often occur because some requests made by the dashboard occasionally return a 401 or 403 error.

403 error

Sometimes the request to get the user namespace results in a 403 error:

image

image

This causes other requests like events and pods to fail as well, because they both depend on the value of the user namespace.
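
The dependency can be pictured with a small model. This is a hedged sketch, not the dashboard's code: `load_dashboard_data`, the `get` callable, and the request paths are all hypothetical names chosen for illustration.

```python
from typing import Any, Callable, Dict


def load_dashboard_data(get: Callable[[str], Dict[str, Any]]) -> Dict[str, Any]:
    """Fetch the user namespace first, then the resources that live in it."""
    # If this request is rejected (for example with a 403), nothing below runs,
    # because the later paths cannot be built without the namespace name.
    namespace = get("namespace")["name"]
    return {
        "namespace": namespace,
        "events": get(f"namespace/{namespace}/events"),
        "pods": get(f"namespace/{namespace}/pods"),
    }
```

In this model, a single rejected namespace request is enough to leave events and pods unloaded as well.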

The response headers in this case differ from those returned when the namespace request returns 200:

image

The Gap-Auth and Gap-Upstream-Address headers seem to be set by the oauth-proxy container in the che-gateway pod:
https://github.com/openshift/oauth-proxy/blob/55e0cd172625b3419c698014fa233e54021af9a3/oauthproxy.go#L99
https://github.com/openshift/oauth-proxy/blob/55e0cd172625b3419c698014fa233e54021af9a3/oauthproxy.go#L830-L834

401 error

image

image

In this case, the namespace request succeeded but, for whatever reason, the events and pods requests failed with 401, which is strange because the devworkspace request succeeded.

As in the 403 case, the Gap-Auth and Gap-Upstream-Address headers are missing from the response headers.

Contributor

dkwon17 commented Aug 28, 2023

For the 401/403 error that sometimes appears when refreshing the dashboard, it seems to happen on these requests made by the dashboard:

| Request | Fulfilled By |
| --- | --- |
| namespace/provision | Che Server |
| namespace | Che Server |
| events | Che Dashboard backend |
| devworkspacetemplates | Che Dashboard backend |
| devworkspaces | Che Dashboard backend |
| pods | Che Dashboard backend |

For this specific example of the error:
image

In the screenshot, we see the following error code in the response:

FST_UNAUTHORIZED

which is coming from the dashboard backend here, meaning that the authorization header which contains the bearer token is missing.

It's strange that in the screenshot, we see that requests to fetch devworkspacetemplates, devworkspaces, and pods succeeded, but only fetching events failed.

For the events request to reach the dashboard backend api (and same with the che server api), the request needs to

  1. Go through the oauth-proxy container (which sets the bearer token in the request header)
  2. And then go through the gateway (traefik) container (which uses a custom middleware to set the authorization header with the token set by oauth-proxy) within the che-gateway pod.

Because of oauth-proxy and gateway, the bearer token should be available in the request's header.

But since the request to the dashboard backend ultimately ends up here, it can mean one of the following:

  • the oauth-proxy container did not set the access token in the request header
  • the custom traefik middleware was unable to set the authorization header with the access token set by the oauth-proxy container
  • the authorization header is somehow removed after going through traefik, but before reaching the dashboard backend server
  • maybe something else
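
The request path and the first three failure modes can be modelled in a few lines. This is a simplified sketch under stated assumptions, not the real components: the function names are invented, and `x-forwarded-access-token` is assumed to be the header in which oauth-proxy passes the token. Only the `FST_UNAUTHORIZED` code comes from the response shown above.

```python
from typing import Dict, Optional, Tuple

Headers = Dict[str, str]

# Assumed name of the header carrying the token set by oauth-proxy.
TOKEN_HEADER = "x-forwarded-access-token"


def oauth_proxy(headers: Headers, token: Optional[str]) -> Headers:
    """Step 1: attach the user's access token to the request, if there is one."""
    out = dict(headers)
    if token is not None:
        out[TOKEN_HEADER] = token
    return out


def gateway_middleware(headers: Headers) -> Headers:
    """Step 2: copy the forwarded token into the authorization header."""
    out = dict(headers)
    token = out.get(TOKEN_HEADER)
    if token:
        out["authorization"] = f"Bearer {token}"
    return out


def dashboard_backend(headers: Headers) -> Tuple[int, str]:
    """Reject any request that arrives without a bearer token."""
    if not headers.get("authorization", "").startswith("Bearer "):
        return 401, "FST_UNAUTHORIZED"
    return 200, "OK"
```

In this model, a missing token at step 1, a skipped step 2, and a header stripped after step 2 all produce the same `FST_UNAUTHORIZED` response, so the error code alone does not tell the three cases apart.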

Contributor

tolusha commented Aug 30, 2023

I am wondering if we can test the latest images [1] [2]; maybe that will fix our problem.
[1] https://quay.io/openshift/origin-oauth-proxy
[2] https://quay.io/openshift/origin-kube-rbac-proxy

Member

ibuziuk commented Sep 26, 2023

@dkwon17 @tolusha we should probably investigate how ArgoCD manages the OAuth. The UX flow is very similar to what we do with Eclipse Che

@RomanNikitenko RomanNikitenko changed the title from "Unauthorized when when opening the IDE" to "Unauthorized when opening the IDE" Sep 26, 2023

Mbd06b commented Sep 27, 2023

I personally encountered this unauthorized issue because I had multiple nodes in my cluster, and not all the nodes had my OIDC issuer arguments applied.

I edited the API Server Configuration in the configuration file for my Kubernetes API server (under /var/snap/microk8s/current/args/kube-apiserver for microk8s).

OIDC Options: Include the following lines (adjusting for your specific Keycloak configuration):

--oidc-issuer-url=https://<KEYCLOAK_DOMAIN>/auth/realms/<REALM_NAME>
--oidc-client-id=<CLIENT_ID>
--oidc-username-claim=email
--oidc-groups-claim=groups

While I ran this from the node I built the cluster from, I discovered that I needed to apply these --oidc args on each node and restart the node for the new configuration to apply. After that, my auth issues with login and the devworkspace were resolved.

sudo microk8s stop && sudo microk8s start
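
One way to spot a node that was missed is to check each node's kube-apiserver args file for the four flags listed above. The helper below is a hypothetical sketch: `missing_oidc_flags` is not part of microk8s or Che, and it only checks that the flags are present, not that their values are correct.

```python
from typing import List

# The flags from the configuration listed above.
EXPECTED_FLAGS = (
    "--oidc-issuer-url",
    "--oidc-client-id",
    "--oidc-username-claim",
    "--oidc-groups-claim",
)


def missing_oidc_flags(args_text: str) -> List[str]:
    """Return the expected OIDC flags that do not appear in the args file text."""
    present = {
        line.strip().split("=", 1)[0]
        for line in args_text.splitlines()
        if line.strip().startswith("--")
    }
    return [flag for flag in EXPECTED_FLAGS if flag not in present]
```

Running it on the contents of /var/snap/microk8s/current/args/kube-apiserver from every node should return an empty list for each one.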

Just for your consideration if there's a similar multi-node situation to consider.

Linking back to documentation:
#21378 MicroK8S OIDC Issue
#22392 Document how to install Eclipse Che on major k8s providers

Member

ibuziuk commented Oct 13, 2023

Retry logic has been added to the dashboard, and the issue does not seem to be reproducible on nightly. The plan is to promote 3.10 RC changes to staging cluster and verify it properly before closing the issue.
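
As a rough illustration of the idea, a retry on 401/403 could look like the sketch below. It is not the dashboard's actual implementation (the thread does not show it): the function name, the retry count, and the backoff values are assumptions.

```python
import time
from typing import Callable

# Status codes observed intermittently in this issue.
RETRYABLE = frozenset({401, 403})


def get_with_retry(request: Callable[[], int],
                   retries: int = 3,
                   delay_s: float = 0.5,
                   sleep: Callable[[float], None] = time.sleep) -> int:
    """Call `request` and repeat it while it returns 401 or 403.

    `request` performs one HTTP call and returns its status code. At most
    `retries` extra attempts are made, with the delay doubling each time.
    """
    status = request()
    for attempt in range(retries):
        if status not in RETRYABLE:
            break
        sleep(delay_s * (2 ** attempt))
        status = request()
    return status
```

A retry of this kind hides a transient failure from the user but does not explain why the authorization header was missing in the first place.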

@svor svor moved this from 🚧 In Progress to Ready for Review in Eclipse Che Team A Backlog Oct 18, 2023
Member

ibuziuk commented Nov 2, 2023

Closing the issue since we need to include it in the 7.76.0 release notes. @dkwon17 please, let me know if we need to proceed differently.
