Traefik crashes when deployed with the cos bundle #413

Open
welpaolo opened this issue Oct 17, 2024 · 1 comment

Comments

@welpaolo

Bug Description

The traefik charm stays in waiting status with the message "waiting for service: 'traefik'" and never recovers.

To Reproduce

juju deploy cos-lite --trust

Environment

Bundle channel: latest/stable
OS: ubuntu-22.04
Environment: Github runner
Juju 3.4.2

Relevant log output

Juju status:

Model  Controller                Cloud/Region        Version    SLA          Timestamp
cos    github-pr-ff368-microk8s  microk8s/localhost  3.6-beta2  unsupported  14:14:28Z

App           Version  Status   Scale  Charm             Channel        Rev  Address         Exposed  Message
alertmanager  0.27.0   active       2  alertmanager-k8s  latest/stable  125  10.152.183.106  no       
avalanche              active       2  avalanche-k8s     latest/edge     42  10.152.183.135  no       
catalogue              active       1  catalogue-k8s     latest/stable   59  10.152.183.103  no       
grafana       9.5.3    active       1  grafana-k8s       latest/stable  117  10.152.183.238  no       
loki          2.9.6    active       1  loki-k8s          latest/stable  160  10.152.183.50   no       
prometheus    2.52.0   active       1  prometheus-k8s    latest/stable  209  10.152.183.47   no       
traefik                waiting      1  traefik-k8s       latest/stable  194  10.152.183.19   no       installing agent

Unit             Workload  Agent  Address      Ports  Message
alertmanager/0   active    idle   10.1.39.161         
alertmanager/1*  active    idle   10.1.39.159         
avalanche/0      active    idle   10.1.39.147         
avalanche/1*     active    idle   10.1.39.148         
catalogue/0*     active    idle   10.1.39.149         
grafana/0*       active    idle   10.1.39.164         
loki/0*          active    idle   10.1.39.163         
prometheus/0*    active    idle   10.1.39.162         
traefik/0*       waiting   idle   10.1.39.158         waiting for service: 'traefik'


Juju debug-log:

unit-traefik-0: 14:07:04 ERROR unit.traefik/0.juju-log traefik-route:10: Uncaught exception while in charm code:
Traceback (most recent call last):
  File "./src/charm.py", line 1199, in <module>
    main(TraefikIngressCharm, use_juju_for_storage=True)
  File "/var/lib/juju/agents/unit-traefik-0/charm/venv/ops/main.py", line 548, in main
    manager.run()
  File "/var/lib/juju/agents/unit-traefik-0/charm/venv/ops/main.py", line 527, in run
    self._emit()
  File "/var/lib/juju/agents/unit-traefik-0/charm/venv/ops/main.py", line 516, in _emit
    _emit_charm_event(self.charm, self.dispatcher.event_name)
  File "/var/lib/juju/agents/unit-traefik-0/charm/venv/ops/main.py", line 147, in _emit_charm_event
    event_to_emit.emit(*args, **kwargs)
  File "/var/lib/juju/agents/unit-traefik-0/charm/venv/ops/framework.py", line 348, in emit
    framework._emit(event)
  File "/var/lib/juju/agents/unit-traefik-0/charm/venv/ops/framework.py", line 860, in _emit
    self._reemit(event_path)
  File "/var/lib/juju/agents/unit-traefik-0/charm/venv/ops/framework.py", line 950, in _reemit
    custom_handler(event)
  File "/var/lib/juju/agents/unit-traefik-0/charm/lib/charms/tempo_k8s/v1/charm_tracing.py", line 548, in wrapped_function
    return callable(*args, **kwargs)  # type: ignore
  File "/var/lib/juju/agents/unit-traefik-0/charm/lib/charms/traefik_route_k8s/v0/traefik_route.py", line 225, in _on_relation_changed
    self.on.ready.emit(event.relation)
  File "/var/lib/juju/agents/unit-traefik-0/charm/venv/ops/framework.py", line 348, in emit
    framework._emit(event)
  File "/var/lib/juju/agents/unit-traefik-0/charm/venv/ops/framework.py", line 860, in _emit
    self._reemit(event_path)
  File "/var/lib/juju/agents/unit-traefik-0/charm/venv/ops/framework.py", line 950, in _reemit
    custom_handler(event)
  File "/var/lib/juju/agents/unit-traefik-0/charm/lib/charms/tempo_k8s/v1/charm_tracing.py", line 548, in wrapped_function
    return callable(*args, **kwargs)  # type: ignore
  File "./src/charm.py", line 697, in _handle_traefik_route_ready
    if self._static_config_changed:
  File "./src/charm.py", line 655, in _static_config_changed
    traefik_static_config = self.traefik.pull_static_config()
  File "/var/lib/juju/agents/unit-traefik-0/charm/src/traefik.py", line 594, in pull_static_config
    static_config_raw = self._container.pull(STATIC_CONFIG_PATH).read()
  File "/var/lib/juju/agents/unit-traefik-0/charm/venv/ops/model.py", line 2351, in pull
    return self._pebble.pull(str(path), encoding=encoding)
  File "/var/lib/juju/agents/unit-traefik-0/charm/venv/ops/pebble.py", line 2262, in pull
    response = self._request_raw('GET', '/v1/files', query, headers)
  File "/var/lib/juju/agents/unit-traefik-0/charm/venv/ops/pebble.py", line 1912, in _request_raw
    raise ConnectionError(
ops.pebble.ConnectionError: Could not connect to Pebble: socket not found at '/charm/containers/traefik/pebble.socket' (container restarted?)

Additional context

CI run in which the crash happened: https://github.com/canonical/spark-k8s-bundle/actions/runs/11331470997/job/31511606569

@welpaolo changed the title from "Traefik crashes when deployed in the cos bundle" to "Traefik crashes when deployed with the cos bundle" on Oct 17, 2024
@dstathis
Contributor

Looks like we probably need a can_connect guard.
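
For illustration, here is a rough sketch of what such a guard might look like around the call that fails in the traceback (pull_static_config). It is not the charm's actual fix. FakeContainer is a hypothetical stand-in for ops.Container so the snippet runs without Juju, the value of STATIC_CONFIG_PATH is a placeholder, and returning None when Pebble is unreachable is an assumed design choice.

```python
import io
from typing import Dict, Optional

# Placeholder value; the real constant lives in the charm's traefik.py.
STATIC_CONFIG_PATH = "/etc/traefik/traefik.yaml"


class FakeContainer:
    """Hypothetical stand-in for ops.Container, just enough for this sketch."""

    def __init__(self, reachable: bool, files: Optional[Dict[str, str]] = None):
        self._reachable = reachable
        self._files = files or {}

    def can_connect(self) -> bool:
        # ops.Container.can_connect() reports whether Pebble is reachable.
        return self._reachable

    def pull(self, path: str) -> io.StringIO:
        if not self._reachable:
            # The real call raises ops.pebble.ConnectionError; the builtin
            # ConnectionError is used here only to keep the sketch standalone.
            raise ConnectionError("Could not connect to Pebble")
        return io.StringIO(self._files[path])


def pull_static_config(container) -> Optional[str]:
    """Return the raw static config, or None if Pebble is not reachable."""
    if not container.can_connect():
        # Container restarting or not ready yet: skip instead of crashing.
        return None
    try:
        return container.pull(STATIC_CONFIG_PATH).read()
    except ConnectionError:
        # Pebble can still go away between the check and the pull.
        return None
```

With a guard like this, the handler could treat None as "nothing to compare yet" and defer or return early, relying on a later event (for example pebble-ready) to retry once the container is back.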

@dstathis added and removed the Checked label on Nov 11, 2024