Bug Description
Hello,
we hit an exception in the prometheus-k8s Juju charm after some period of time. The prometheus-k8s application cannot start and retries again and again with the same exception.
The exception stack trace (below) shows that it is a request from the prometheus-k8s container to the Kubernetes cluster to obtain PV/PVC capacity.
It looks like the charm couldn't make this API request because the k8s cluster certificate is not trusted.
We suspect that MicroK8s rotated the k8s API certificate, but we can't figure out how to resolve the issue.
The simplest solution would be a "skip TLS verify" option for all k8s requests, but the charm currently has no such configuration parameter.
Could you please tell us how to update the k8s API certificate for this charm, and how to avoid this problem in the future?
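For illustration, the failure mode can be reproduced outside the charm with a minimal stdlib-only sketch (our own demo, not charm code): a client that still trusts a stale CA rejects a server presenting a newer certificate, raising the same CERTIFICATE_VERIFY_FAILED error seen in the hook log. Certificate generation via `openssl` stands in for the MicroK8s rotation.

```python
import os
import socket
import ssl
import subprocess
import tempfile
import threading

def make_self_signed(dirpath: str, name: str):
    """Generate a throwaway self-signed cert/key pair with openssl."""
    key = os.path.join(dirpath, f"{name}.key")
    crt = os.path.join(dirpath, f"{name}.crt")
    subprocess.run(
        ["openssl", "req", "-x509", "-newkey", "rsa:2048", "-nodes",
         "-keyout", key, "-out", crt, "-days", "1", "-subj", "/CN=localhost",
         "-addext", "subjectAltName=DNS:localhost"],
        check=True, capture_output=True)
    return key, crt

tmp = tempfile.mkdtemp()
srv_key, srv_crt = make_self_signed(tmp, "server")  # the "rotated" apiserver cert
_, stale_ca = make_self_signed(tmp, "stale")        # the CA the client still trusts

# Minimal TLS server presenting the new certificate.
srv_ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
srv_ctx.load_cert_chain(srv_crt, srv_key)
listener = socket.create_server(("127.0.0.1", 0))
port = listener.getsockname()[1]

def serve_once():
    conn, _ = listener.accept()
    try:
        srv_ctx.wrap_socket(conn, server_side=True)
    except OSError:
        pass  # client aborts the handshake after failing verification
    finally:
        conn.close()

threading.Thread(target=serve_once, daemon=True).start()

# Client that still trusts only the stale CA -- analogous to the charm pod
# after a MicroK8s certificate rotation.
cli_ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
cli_ctx.load_verify_locations(stale_ca)

err = None
try:
    with socket.create_connection(("127.0.0.1", port), timeout=5) as sock:
        cli_ctx.wrap_socket(sock, server_hostname="localhost")
except ssl.SSLCertVerificationError as e:
    err = e

print(type(err).__name__, "-", err)  # CERTIFICATE_VERIFY_FAILED, as in the hook log
```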
To Reproduce
We currently don't know how to reproduce this issue. We suspect it happens when MicroK8s rotates its API certificates.
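One way to check the rotation hypothesis is to compare certificate validity dates with the time the unit started failing. A sketch, assuming a default MicroK8s snap install (the paths and the `charm` container name are assumptions; `openssl` must be present in the container for the last command):

```shell
# Validity window of the cluster CA and apiserver cert on the MicroK8s host
openssl x509 -noout -subject -dates -in /var/snap/microk8s/current/certs/ca.crt
openssl x509 -noout -subject -dates -in /var/snap/microk8s/current/certs/server.crt

# CA actually mounted into the charm pod (may be stale after a rotation)
microk8s.kubectl -n cos exec prometheus-0 -c charm -- \
  openssl x509 -noout -dates -in /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
```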
Environment
root@jjclient1:~# juju status --relations
Model Controller Cloud/Region Version SLA Timestamp
cos controller-openstack k8s-microk8s/localhost 3.4.5 unsupported 19:25:42Z
SAAS Status Store URL
ceph-mon active controller-openstack se-moshkarin/ceph-ldc-1.ceph-mon
App Version Status Scale Charm Channel Rev Address Exposed Message
alertmanager unknown 0 alertmanager-k8s latest/stable 125 10.152.183.61 no
catalogue active 1 catalogue-k8s latest/stable 59 10.152.183.130 no
grafana 9.5.3 active 1 grafana-k8s latest/stable 117 10.152.183.76 no
loki 2.9.6 active 1 loki-k8s latest/stable 160 10.152.183.60 no
prometheus 2.52.0 waiting 1 prometheus-k8s latest/stable 209 10.152.183.234 no waiting for units to settle down
prometheus-scrape-target-k8s n/a active 1 prometheus-scrape-target-k8s latest/stable 34 10.152.183.120 no
traefik v2.11.0 waiting 1/0 traefik-k8s latest/stable 194 10.152.183.86 no installing agent
Unit Workload Agent Address Ports Message
catalogue/0* active idle 10.1.33.100
grafana/0* active idle 10.1.33.91
loki/0* active idle 10.1.33.90
prometheus-scrape-target-k8s/0* active idle 10.1.33.82
prometheus/0* error idle 10.1.33.104 hook failed: "upgrade-charm"
traefik/0* error idle 10.1.33.88 hook failed: "ingress-relation-broken" for catalogue:ingress
Offer Application Charm Rev Connected Endpoint Interface Role
grafana grafana grafana-k8s 117 2/2 grafana-dashboard grafana_dashboard requirer
loki loki loki-k8s 160 2/2 logging loki_push_api provider
prometheus prometheus prometheus-k8s 209 2/2 metrics-endpoint prometheus_scrape requirer
receive-remote-write prometheus_remote_write provider
Integration provider Requirer Interface Type Message
alertmanager:alerting loki:alertmanager alertmanager_dispatch regular
alertmanager:alerting prometheus:alertmanager alertmanager_dispatch regular
alertmanager:grafana-dashboard grafana:grafana-dashboard grafana_dashboard regular
alertmanager:grafana-source grafana:grafana-source grafana_datasource regular
alertmanager:replicas alertmanager:replicas alertmanager_replica peer
alertmanager:self-metrics-endpoint prometheus:metrics-endpoint prometheus_scrape regular
catalogue:catalogue alertmanager:catalogue catalogue regular
catalogue:catalogue grafana:catalogue catalogue regular
catalogue:catalogue prometheus:catalogue catalogue regular
catalogue:replicas catalogue:replicas catalogue_replica peer
ceph-mon:metrics-endpoint prometheus:metrics-endpoint prometheus_scrape regular
grafana:grafana grafana:grafana grafana_peers peer
grafana:metrics-endpoint prometheus:metrics-endpoint prometheus_scrape regular
grafana:replicas grafana:replicas grafana_replicas peer
loki:grafana-dashboard grafana:grafana-dashboard grafana_dashboard regular
loki:grafana-source grafana:grafana-source grafana_datasource regular
loki:metrics-endpoint prometheus:metrics-endpoint prometheus_scrape regular
loki:replicas loki:replicas loki_replica peer
prometheus-scrape-target-k8s:metrics-endpoint prometheus:metrics-endpoint prometheus_scrape regular
prometheus:grafana-dashboard grafana:grafana-dashboard grafana_dashboard regular
prometheus:grafana-source grafana:grafana-source grafana_datasource regular
prometheus:prometheus-peers prometheus:prometheus-peers prometheus_peers peer
traefik:ingress alertmanager:ingress ingress regular
traefik:ingress catalogue:ingress ingress regular
traefik:ingress-per-unit loki:ingress ingress_per_unit regular
traefik:ingress-per-unit prometheus:ingress ingress_per_unit regular
traefik:metrics-endpoint prometheus:metrics-endpoint prometheus_scrape regular
traefik:peers traefik:peers traefik_peers peer
traefik:traefik-route grafana:ingress traefik_route regular
root@microk8s-0:~# microk8s.kubectl get all -A
NAMESPACE NAME READY STATUS RESTARTS AGE
cos pod/catalogue-0 2/2 Running 0 9h
cos pod/grafana-0 3/3 Running 0 9h
cos pod/loki-0 3/3 Running 0 9h
cos pod/modeloperator-c47c94774-xg76v 1/1 Running 4 (21d ago) 41d
cos pod/prometheus-0 1/2 Running 0 8h
cos pod/prometheus-scrape-target-k8s-0 1/1 Running 6 (21d ago) 40d
cos pod/traefik-0 2/2 Running 10 (21d ago) 41d
kube-system pod/calico-kube-controllers-6588bbfbdb-nbtlh 1/1 Running 0 14d
kube-system pod/calico-node-2q74k 1/1 Running 0 14d
kube-system pod/calico-node-89nr7 1/1 Running 0 14d
kube-system pod/calico-node-f2psb 1/1 Running 0 14d
kube-system pod/coredns-864597b5fd-h7vlh 1/1 Running 2 (21d ago) 48d
kube-system pod/hostpath-provisioner-7df77bc496-xrd26 1/1 Running 1 (21d ago) 22d
metallb-system pod/controller-5f7bb57799-mjltn 1/1 Running 1 (21d ago) 22d
metallb-system pod/speaker-bbm9f 1/1 Running 1 (21d ago) 22d
metallb-system pod/speaker-hg5h8 1/1 Running 9 (21d ago) 49d
metallb-system pod/speaker-p5hzj 1/1 Running 4 (21d ago) 49d
NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
cos service/alertmanager ClusterIP 10.152.183.61 <none> 65535/TCP 41d
cos service/alertmanager-endpoints ClusterIP None <none> <none> 41d
cos service/catalogue ClusterIP 10.152.183.130 <none> 65535/TCP,80/TCP 41d
cos service/catalogue-endpoints ClusterIP None <none> <none> 41d
cos service/grafana ClusterIP 10.152.183.76 <none> 65535/TCP,3000/TCP 41d
cos service/grafana-endpoints ClusterIP None <none> <none> 41d
cos service/loki ClusterIP 10.152.183.60 <none> 65535/TCP,3100/TCP 41d
cos service/loki-endpoints ClusterIP None <none> <none> 41d
cos service/modeloperator ClusterIP 10.152.183.67 <none> 17071/TCP 41d
cos service/prometheus ClusterIP 10.152.183.234 <none> 65535/TCP,9090/TCP 41d
cos service/prometheus-endpoints ClusterIP None <none> <none> 41d
cos service/prometheus-scrape-target-k8s ClusterIP 10.152.183.120 <none> 65535/TCP 40d
cos service/prometheus-scrape-target-k8s-endpoints ClusterIP None <none> <none> 40d
cos service/traefik ClusterIP 10.152.183.86 <none> 65535/TCP 41d
cos service/traefik-endpoints ClusterIP None <none> <none> 41d
cos service/traefik-lb LoadBalancer 10.152.183.141 10.88.56.26 80:32432/TCP,443:30964/TCP 22d
default service/kubernetes ClusterIP 10.152.183.1 <none> 443/TCP 49d
kube-system service/kube-dns ClusterIP 10.152.183.10 <none> 53/UDP,53/TCP,9153/TCP 49d
metallb-system service/webhook-service ClusterIP 10.152.183.123 <none> 443/TCP 49d
NAMESPACE NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
kube-system daemonset.apps/calico-node 3 3 3 3 3 kubernetes.io/os=linux 49d
metallb-system daemonset.apps/speaker 3 3 3 3 3 kubernetes.io/os=linux 49d
NAMESPACE NAME READY UP-TO-DATE AVAILABLE AGE
cos deployment.apps/modeloperator 1/1 1 1 41d
kube-system deployment.apps/calico-kube-controllers 1/1 1 1 49d
kube-system deployment.apps/coredns 1/1 1 1 49d
kube-system deployment.apps/hostpath-provisioner 1/1 1 1 49d
metallb-system deployment.apps/controller 1/1 1 1 49d
NAMESPACE NAME DESIRED CURRENT READY AGE
cos replicaset.apps/modeloperator-c47c94774 1 1 1 41d
kube-system replicaset.apps/calico-kube-controllers-55fff87cdf 0 0 0 43d
kube-system replicaset.apps/calico-kube-controllers-59654dbbf 0 0 0 14d
kube-system replicaset.apps/calico-kube-controllers-5b8d7465f6 0 0 0 30d
kube-system replicaset.apps/calico-kube-controllers-5d86d446cf 0 0 0 44d
kube-system replicaset.apps/calico-kube-controllers-6588bbfbdb 1 1 1 14d
kube-system replicaset.apps/calico-kube-controllers-66c5c6884d 0 0 0 14d
kube-system replicaset.apps/calico-kube-controllers-684f9474b5 0 0 0 43d
kube-system replicaset.apps/calico-kube-controllers-6cbb8946d5 0 0 0 49d
kube-system replicaset.apps/calico-kube-controllers-74567f7d84 0 0 0 30d
kube-system replicaset.apps/calico-kube-controllers-75cdd899b7 0 0 0 48d
kube-system replicaset.apps/calico-kube-controllers-8658c8f5d7 0 0 0 30d
kube-system replicaset.apps/coredns-864597b5fd 1 1 1 49d
kube-system replicaset.apps/hostpath-provisioner-7df77bc496 1 1 1 49d
metallb-system replicaset.apps/controller-5f7bb57799 1 1 1 49d
NAMESPACE NAME READY AGE
cos statefulset.apps/alertmanager 0/0 41d
cos statefulset.apps/catalogue 1/1 41d
cos statefulset.apps/grafana 1/1 41d
cos statefulset.apps/loki 1/1 41d
cos statefulset.apps/prometheus 0/1 41d
cos statefulset.apps/prometheus-scrape-target-k8s 1/1 40d
cos statefulset.apps/traefik 1/1 41d
Relevant log output
Exception:
juju debug-log --include=prometheus/0
unit-prometheus-0: 19:20:19 ERROR unit.prometheus/0.juju-log Uncaught exception while in charm code:
Traceback (most recent call last):
File "/var/lib/juju/agents/unit-prometheus-0/charm/venv/httpx/_transports/default.py", line 69, in map_httpcore_exceptions
yield
File "/var/lib/juju/agents/unit-prometheus-0/charm/venv/httpx/_transports/default.py", line 233, in handle_request
resp = self._pool.handle_request(req)
File "/var/lib/juju/agents/unit-prometheus-0/charm/venv/httpcore/_sync/connection_pool.py", line 216, in handle_request
raise exc from None
File "/var/lib/juju/agents/unit-prometheus-0/charm/venv/httpcore/_sync/connection_pool.py", line 196, in handle_request
response = connection.handle_request(
File "/var/lib/juju/agents/unit-prometheus-0/charm/venv/httpcore/_sync/connection.py", line 99, in handle_request
raise exc
File "/var/lib/juju/agents/unit-prometheus-0/charm/venv/httpcore/_sync/connection.py", line 76, in handle_request
stream = self._connect(request)
File "/var/lib/juju/agents/unit-prometheus-0/charm/venv/httpcore/_sync/connection.py", line 154, in _connect
stream = stream.start_tls(**kwargs)
File "/var/lib/juju/agents/unit-prometheus-0/charm/venv/httpcore/_backends/sync.py", line 168, in start_tls
raise exc
File "/usr/lib/python3.8/contextlib.py", line 131, in __exit__
self.gen.throw(type, value, traceback)
File "/var/lib/juju/agents/unit-prometheus-0/charm/venv/httpcore/_exceptions.py", line 14, in map_exceptions
raise to_exc(exc) from exc
httpcore.ConnectError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1131)
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "./src/charm.py", line 1083, in <module>
main(PrometheusCharm)
File "/var/lib/juju/agents/unit-prometheus-0/charm/venv/ops/main.py", line 548, in main
manager.run()
File "/var/lib/juju/agents/unit-prometheus-0/charm/venv/ops/main.py", line 527, in run
self._emit()
File "/var/lib/juju/agents/unit-prometheus-0/charm/venv/ops/main.py", line 513, in _emit
self.framework.reemit()
File "/var/lib/juju/agents/unit-prometheus-0/charm/venv/ops/framework.py", line 870, in reemit
self._reemit()
File "/var/lib/juju/agents/unit-prometheus-0/charm/venv/ops/framework.py", line 950, in _reemit
custom_handler(event)
File "/var/lib/juju/agents/unit-prometheus-0/charm/lib/charms/tempo_k8s/v1/charm_tracing.py", line 546, in wrapped_function
return callable(*args, **kwargs) # type: ignore
File "./src/charm.py", line 530, in _configure
if self.resources_patch.is_ready():
File "/var/lib/juju/agents/unit-prometheus-0/charm/lib/charms/tempo_k8s/v1/charm_tracing.py", line 546, in wrapped_function
return callable(*args, **kwargs) # type: ignore
File "/var/lib/juju/agents/unit-prometheus-0/charm/lib/charms/observability_libs/v0/kubernetes_compute_resources_patch.py", line 550, in is_ready
return self.patcher.is_ready(self._pod, resource_reqs)
File "/var/lib/juju/agents/unit-prometheus-0/charm/lib/charms/observability_libs/v0/kubernetes_compute_resources_patch.py", line 397, in is_ready
self.get_templated(),
File "/var/lib/juju/agents/unit-prometheus-0/charm/lib/charms/observability_libs/v0/kubernetes_compute_resources_patch.py", line 371, in get_templated
statefulset = self.client.get(
File "/var/lib/juju/agents/unit-prometheus-0/charm/venv/lightkube/core/client.py", line 140, in get
return self._client.request("get", res=res, name=name, namespace=namespace)
File "/var/lib/juju/agents/unit-prometheus-0/charm/venv/lightkube/core/generic_client.py", line 244, in request
resp = self.send(req)
File "/var/lib/juju/agents/unit-prometheus-0/charm/venv/lightkube/core/generic_client.py", line 216, in send
return self._client.send(req, stream=stream)
File "/var/lib/juju/agents/unit-prometheus-0/charm/venv/httpx/_client.py", line 914, in send
response = self._send_handling_auth(
File "/var/lib/juju/agents/unit-prometheus-0/charm/venv/httpx/_client.py", line 942, in _send_handling_auth
response = self._send_handling_redirects(
File "/var/lib/juju/agents/unit-prometheus-0/charm/venv/httpx/_client.py", line 979, in _send_handling_redirects
response = self._send_single_request(request)
File "/var/lib/juju/agents/unit-prometheus-0/charm/venv/httpx/_client.py", line 1015, in _send_single_request
response = transport.handle_request(request)
File "/var/lib/juju/agents/unit-prometheus-0/charm/venv/httpx/_transports/default.py", line 233, in handle_request
resp = self._pool.handle_request(req)
File "/usr/lib/python3.8/contextlib.py", line 131, in __exit__
self.gen.throw(type, value, traceback)
File "/var/lib/juju/agents/unit-prometheus-0/charm/venv/httpx/_transports/default.py", line 86, in map_httpcore_exceptions
raise mapped_exc(message) from exc
httpx.ConnectError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1131)
unit-prometheus-0: 19:20:20 ERROR juju.worker.uniter.operation hook "upgrade-charm" (via hook dispatching script: dispatch) failed: exit status 1
### Additional context
_No response_
Yes, we did encounter the same issue and, as @mcfly722 suspects, the cause was the certificate on the kube-apiserver. MicroK8s has a built-in command to refresh the certs; you'll need to run it for ca.crt.
Be sure to read the notes at the bottom of the command's documentation carefully. This operation is not safe to run with active workloads, and it will require you to rejoin any other MicroK8s nodes to the cluster.
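For reference, the sequence looks roughly like this (double-check `microk8s refresh-certs --help` on your version first, and heed the warnings above about active workloads):

```shell
# Refresh the CA; this regenerates the certificates signed by it and
# restarts MicroK8s services. Not safe with active workloads, and any
# other nodes must rejoin the cluster afterwards.
sudo microk8s refresh-certs --cert ca.crt

# Once the apiserver is back with trusted certs, retry the failed hook:
juju resolved prometheus/0
```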