Logs through Nomad UI show certificate issues (log-viewing itself works) #24269

Open
dmclf opened this issue Oct 22, 2024 · 2 comments
Labels
stage/accepted Confirmed, and intend to work on. No timeline committment though. theme/allocation API theme/ui type/bug

dmclf commented Oct 22, 2024

Nomad version

Nomad v1.9.0
BuildDate 2024-10-10T07:13:43Z
Revision 7ad36851ec02f875e0814775ecf1df0229f0a615

and
Nomad v1.8.3
BuildDate 2024-08-13T07:37:30Z
Revision 63b636e5cbaca312cf6ea63e040f445f05f00478

(but may not be limited to these)

Operating system and Environment details

Ubuntu 22.04.5 LTS

Nomad with

  • ACL enabled
  • TLS enabled on http + rpc
  • vault enabled
  • dev is single region
  • staging and production are federated multi-region.
  • a VIP providing a single entry point to each environment's Nomad UI (and to the deployed Nomad jobs/services)
    • Consul is used to generate the configuration, and Traefik then routes traffic accordingly. Together with Let's Encrypt this gives a smooth TLS experience in which developers, with the help of some CI/CD templating, can easily deploy applications independently.
    • i.e., with a DNS wildcard *.nomad-development.company.com you can easily deploy https://application.nomad-development.company.com (not fit for high-traffic/high-volume apps, but it works fine for low-traffic ones)

Issue

When looking at container logs in the Nomad UI, errors are reported due to certificate issues.

  • My UI sits behind a Traefik load balancer with SSL and Let's Encrypt certificates at, for example, https://nomad-development.company.com
  • However, when requesting logs, the requests for some reason go to: https://10.xx.yy.zz:4646/v1/client/fs/logs/8fb55989-139c-5256-f812-d79353993c6c?follow=true&offset=50000&origin=end&task=athena-cleaner&type=stdout
    The Nomad servers use a private CA, as per Nomad's recommendation:
    "This should be a private CA and not a public one like Let's Encrypt as any certificate signed by this CA will be allowed to communicate with the cluster"

As a result, these requests show certificate errors on end-user devices.

Note: log viewing still works, apparently because that call goes back to the Traefik SSL endpoint, for example, https://nomad-development.company.com/v1/client/fs/logs/c519c888-6c46-6d8e-2f0c-f5a17be8afc7?follow=true&offset=50000&origin=end&task=google-cadvisor&type=stderr
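
To make the difference between the two request URLs above concrete, here is a minimal Python sketch that checks whether a request stays on the UI's own origin or goes straight to a private address. The helper name `classify_log_request` and the IP `10.0.0.5` are hypothetical (the real address is redacted in this report), and nothing here is part of Nomad itself.

```python
import ipaddress
from urllib.parse import urlsplit


def classify_log_request(request_url: str, ui_origin: str) -> str:
    """Return 'via-proxy', 'direct-private-ip', or 'other' for a request URL.

    Hypothetical helper for illustration only; not part of Nomad.
    """
    req = urlsplit(request_url)
    ui = urlsplit(ui_origin)
    # Same scheme, host and port as the UI: the request goes through the proxy.
    if (req.scheme, req.hostname, req.port) == (ui.scheme, ui.hostname, ui.port):
        return "via-proxy"
    try:
        addr = ipaddress.ip_address(req.hostname or "")
    except ValueError:
        return "other"  # a different hostname rather than a bare IP
    return "direct-private-ip" if addr.is_private else "other"


if __name__ == "__main__":
    ui = "https://nomad-development.company.com"
    # Placeholder IP and allocation ID, standing in for the redacted values.
    print(classify_log_request("https://10.0.0.5:4646/v1/client/fs/logs/abc?follow=true", ui))
    # prints "direct-private-ip"
    print(classify_log_request(ui + "/v1/client/fs/logs/abc?follow=true", ui))
    # prints "via-proxy"
```

Running this kind of check over the request URLs in the browser's network tab would separate the calls that bypass Traefik from the ones that go through it.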

this flow works

  graph TD;
      A[Browser]-- OK -->B[https:nomad-development.company.com = VIP+Traefik+LetsEncrypt];
      B-- OK -->E[Nomad-Server-1:4646 -- Private CA];
      B-- OK -->F[Nomad-Server-2:4646 -- Private CA];
      B-- OK -->G[Nomad-Server-3:4646 -- Private CA];

But the UI seems to make a direct connection for the erroneous calls, and that fails:

  graph TD;
      A[Browser];
      A-- FAIL -->E[https.Nomad-Server-1:4646 -- Private CA];
      A-- FAIL -->F[https.Nomad-Server-2:4646 -- Private CA];
      A-- FAIL -->G[https.Nomad-Server-3:4646 -- Private CA];
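
The two flows above are consistent with a client that first tries the agent's own address and then falls back to the UI's origin. That reading is an assumption based on what this report observes, not something confirmed from the Nomad UI source. A minimal Python sketch of such a fallback, using a hypothetical `fetch_with_fallback` helper, a stubbed fetch function and a placeholder IP:

```python
import ssl
from typing import Callable, Iterable, List, Optional, Tuple


def fetch_with_fallback(
    urls: Iterable[str], fetch: Callable[[str], bytes]
) -> Tuple[Optional[str], Optional[bytes], List[Tuple[str, str]]]:
    """Try each URL in order; return the first success plus the failures seen."""
    failures: List[Tuple[str, str]] = []
    for url in urls:
        try:
            return url, fetch(url), failures
        except Exception as exc:  # e.g. a TLS verification error
            failures.append((url, type(exc).__name__))
    return None, None, failures


def fake_fetch(url: str) -> bytes:
    """Stub standing in for a browser request (assumption: no real network I/O)."""
    if url.startswith("https://10."):
        # The private-CA certificate is not trusted by the end-user device.
        raise ssl.SSLCertVerificationError("certificate verify failed")
    return b"log line\n"


if __name__ == "__main__":
    path = "/v1/client/fs/logs/abc?follow=true"
    candidates = [
        "https://10.0.0.5:4646" + path,                   # direct (placeholder IP)
        "https://nomad-development.company.com" + path,   # through Traefik
    ]
    used, body, failures = fetch_with_fallback(candidates, fake_fetch)
    print(used)
    print(failures)
```

In this sketch the logs still arrive through the second URL, but the first attempt leaves a certificate error behind, which matches the symptom described here: a console error while log viewing itself works.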

Reproduction steps

  • not specific, just noticed as
  • also happens on all environments,
    • nomad-development.company.com (v1.9.0 on servers, v1.8.3 on clients)
    • nomad-staging.company.com (v1.8.3)
    • nomad-production.company.com (v1.8.3)

Expected Result

  • no certificate errors

Actual Result

GET https://10.x.y.z:4646/v1/client/fs/logs/c519c888-6c46-6d8e-2f0c-f5a17be8afc7?follow=true&offset=50000&origin=end&task=google-cadvisor&type=stderr net::ERR_CERT_AUTHORITY_INVALID
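
The `net::ERR_CERT_AUTHORITY_INVALID` error above is what a client reports when it verifies against its default trust store and the server certificate chains to a private CA. The following Python sketch, using only the standard library, could reproduce that check from a workstation. The function names and the CA path in the usage note are hypothetical, and the host and port must be supplied by the reader since the real address is redacted here.

```python
import socket
import ssl
from typing import Optional


def make_context(ca_file: Optional[str] = None) -> ssl.SSLContext:
    """Build a verifying TLS context.

    With ca_file=None the system trust store is used (roughly what a browser
    does). With ca_file set, only the CA certificates in that file are trusted.
    """
    return ssl.create_default_context(cafile=ca_file)


def probe(host: str, port: int, context: ssl.SSLContext, timeout: float = 5.0) -> str:
    """Attempt a TLS handshake and report 'ok' or the verification failure."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host):
                return "ok"
    except ssl.SSLCertVerificationError as exc:
        # Covers both an untrusted issuer and a hostname/IP mismatch.
        return f"verify failed: {exc.verify_message}"
```

For example, `probe(agent_ip, 4646, make_context())` would be expected to report a verification failure for an agent using a private CA, while `probe(agent_ip, 4646, make_context("/path/to/nomad-ca.pem"))` (a hypothetical path) should only succeed if the agent certificate also lists that address in its subject alternative names.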

I am not sure whether some extra config is needed in such a case, one that is perhaps not readily available yet?

For example, the Consul and Vault links have a ui_url that can be set in the agent's ui block:

  ui {
    # Links shown in the Nomad UI for Consul and Vault
    consul {
      ui_url = "https://consul.nomad-development.company.com/ui"
    }

    vault {
      ui_url = "https://vault.nomad-development.company.com/ui"
    }
  }

Perhaps such a property is also needed in my case, where nomad-ui effectively sits behind a proxy?

(or some other config I may have overlooked? response rewriting is not exactly the direction I would prefer)

@philrenaud
Contributor

Hi @dmclf, thanks for raising this issue. There is no further agent config for this, the way there is for consul/vault ui_urls. I suspect you've already given https://developer.hashicorp.com/nomad/tutorials/manage-clusters/reverse-proxy-ui a look since you've arrived at a nice environment behind Traefik, but that guide doesn't have any certificate-specific advice anyway.

This is to say: first time I've heard of this particular issue, but not the first time I've seen issues raised around the proxied UI (for example). This could use some further investigation and I will try to set some time to dig in soon.

@dmclf
Author

dmclf commented Oct 22, 2024

hi @philrenaud , I guess example issue 6413 sounds a bit like my first version with Fabio, which worked fine, but indeed has its limitations.

I can elaborate more on the environment I set up, but that won't help this specific ticket
(though it may be nice to know how people set things up, or it may help others with similar setups)

@philrenaud philrenaud moved this from Needs Triage to Needs Roadmapping in Nomad - Community Issues Triage Oct 23, 2024
@jrasell jrasell added theme/ui theme/allocation API stage/accepted Confirmed, and intend to work on. No timeline committment though. labels Nov 6, 2024
Projects
Status: Needs Roadmapping
Status: Backlog