[kube-prometheus-stack] Lost Prometheus targets after upgrade to version 63 #4878
Comments
Same here, I had to revert! |
@kallaics did you check the matching label? The default values now set release: release_name as the default label, and I had to add this label to all ServiceMonitors |
You are right, I forgot to add this step to the reproduction steps. The custom matching label is a difference from the default settings. I double-checked the matching labels; my custom matching labels were present on the resources. The Helm chart configuration was not changed during the upgrade on my side, and after the downgrade everything was okay again. |
Instead:
use
Ref:
|
I would like to remove the default release label selector. The following works as expected:

```shell
helm template foo kube-prometheus-stack \
  --repo https://prometheus-community.github.io/helm-charts \
  >z0

cat >values.yaml <<EOF
prometheus:
  prometheusSpec:
    ruleSelector:
      matchLabels: null
    serviceMonitorSelector:
      matchLabels: null
EOF

helm template foo kube-prometheus-stack \
  --repo https://prometheus-community.github.io/helm-charts \
  --values values.yaml \
  >z1
```

```
$ diff z0 z1
--- z0
+++ z1
@@ -2273,8 +2273,7 @@
   routePrefix: "/"
   serviceAccountName: foo-kube-prometheus-stack-prometheus
   serviceMonitorSelector:
-    matchLabels:
-      release: 'foo'
+    {}
   serviceMonitorNamespaceSelector:
     {}
   podMonitorSelector:
@@ -2297,8 +2296,7 @@
   ruleNamespaceSelector:
     {}
   ruleSelector:
-    matchLabels:
-      release: 'foo'
+    {}
   scrapeConfigSelector:
     matchLabels:
       release: 'foo'
```

But if I use it as a subchart:

```shell
cat >Chart.yaml <<EOF
---
apiVersion: v2
name: system
type: application
version: 0.0.0
appVersion: 0.0.0
dependencies:
  - name: kube-prometheus-stack
    version: 63.1.0
    repository: https://prometheus-community.github.io/helm-charts
EOF

cat >values.yaml <<EOF
kube-prometheus-stack:
  prometheus:
    prometheusSpec:
      ruleSelector:
        matchLabels: null
      serviceMonitorSelector:
        matchLabels: null
EOF

helm template foo . \
  --values values.yaml \
  >z1
```

```
$ diff z0 z1 | grep matchLabels
<empty>
```

In other words, the null override is ignored. Instead:

```shell
cat >values.yaml <<EOF
kube-prometheus-stack:
  prometheus:
    prometheusSpec:
      ruleSelector:
        matchLabels: a
      serviceMonitorSelector:
        matchLabels: a
EOF

helm template foo . \
  --values values.yaml \
  >z1
```

```
$ diff z0 z1 | grep matchLabels -A2
-    matchLabels:
-      release: 'foo'
+    matchLabels: a
   serviceMonitorNamespaceSelector:
     {}
--
-    matchLabels:
-      release: 'foo'
+    matchLabels: a
   scrapeConfigSelector:
     matchLabels:
       release: 'foo'
```

Maybe due to helm/helm#12637? |
@yellowhat Correct. We are aware that there is a Helm issue which introduces an incompatibility with sub-charts. Ref: #4818. Could you try this:
The helm template output may look goofy, but due to how duplicate keys in YAML work, the config from the last occurrence of the key wins. Note that this is maybe a short-term workaround; we are looking into a proper solution. |
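The duplicate-key trick behind this workaround can be modeled outside Helm. A minimal Python sketch (an illustration of "last duplicate key wins" resolution, not Helm's actual YAML parser):

```python
# Sketch (assumption): many YAML parsers keep the LAST occurrence of a
# duplicate mapping key, which is why repeating the selector key later in
# values.yaml can act as an override without needing `null`.
# Modeled here with plain key/value pairs: the later pair wins.
def last_wins(pairs):
    """Simulate duplicate-key resolution: later pairs overwrite earlier ones."""
    out = {}
    for key, value in pairs:
        out[key] = value
    return out

doc = last_wins([
    ("serviceMonitorSelector", {"matchLabels": {"release": "foo"}}),
    ("serviceMonitorSelector", {}),  # duplicate key, listed later
])
print(doc["serviceMonitorSelector"])  # -> {}
```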
Hi guys, I never thought that over the weekend somebody would find the root cause and provide a workaround as well. Big high five to all participants! I agree with @yellowhat's recommendation to apply the changes. I have no clue how many users use it. @jkroepke good catch, but it will probably be a "major" change. I think it would also be great to move these items there. We will see whether it can land in the next version or only in the next major release. |
From what I can see, the default value is already empty:

```yaml
kube-prometheus-stack:
  prometheus:
    prometheusSpec:
      additionalConfig: {}
```

How can I remove the |
If |
@jkroepke the Helm issue with null values for subcharts that would hit in this case is afaik already fixed, and I was able to upgrade to this new version without any changes to my deployment: https://github.com/broersma-forslund/homelab/pull/570/files I suspect users with issues are using older versions of Helm. |
May I ask which version you are using? I am on 3.16.1. |
Mhm weird, I'm on v3.15.2. But it's clearly working for my scenario. |
It only works if it's placed in the default values.yaml of the umbrella chart. I see in your diff that that's the case. It does not work if the override comes from additional values, e.g. via --values. |
As far as I know helm does not support overriding a non-null value with a null value in a subchart, e.g.
so with this setup it is impossible to override the value in e.g. https://github.com/prometheus-community/helm-charts/blob/main/charts/kube-prometheus-stack/values.yaml#L3708-L3710 See the various Helm issues: helm/helm#12741, helm/helm#13222, helm/helm#12511 |
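Helm's documented intent is that an explicit null in user-supplied values deletes the corresponding default key; the linked issues concern this deletion not taking effect for subchart values. A rough Python model of that merge rule (`coalesce` is a hypothetical name for illustration, not Helm's API):

```python
# Sketch (assumption): Helm's value coalescing is roughly a recursive merge
# where an explicit null in the overriding map deletes the defaulted key.
def coalesce(base, override):
    """Merge `override` onto `base`; None values delete keys."""
    out = dict(base)
    for key, value in override.items():
        if value is None:
            out.pop(key, None)  # null removes the default
        elif isinstance(value, dict) and isinstance(out.get(key), dict):
            out[key] = coalesce(out[key], value)  # recurse into nested maps
        else:
            out[key] = value
    return out

defaults = {"serviceMonitorSelector": {"matchLabels": {"release": "foo"}}}
user = {"serviceMonitorSelector": {"matchLabels": None}}
print(coalesce(defaults, user))  # -> {'serviceMonitorSelector': {}}
```

The bug reports above are about cases where this null-deletion step is skipped for values addressed to a subchart, so the default `release` label survives.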
As explained above, this does work as overrides in the parent chart's default values, but apparently not in values overrides provided separately to the parent chart. |
Hi all, this morning I discussed possible options with @QuentinBisson to recover from the failed v63 release. Both proposals will result in another breaking change, but it is what it is.

Proposal 1 - Switch to plain strings (Vote with 🎉)

All selectors become plain strings. Instead of:

```yaml
kube-prometheus-stack:
  prometheus:
    prometheusSpec:
      scrapeConfigSelector:
        matchLabels:
          monitoring: prometheus
```

this format will be used:

```yaml
kube-prometheus-stack:
  prometheus:
    prometheusSpec:
      scrapeConfigSelector: |-
        matchLabels:
          monitoring: prometheus
```

This would avoid the need to set values to null while giving full flexibility to fill in values. An empty string means null in that case. If set, the default value release: {{ $.Release.Name }} would be overwritten without setting anything else to null. The plain string would be passed through tpl as well. On the template side, it would be changed from

to

Proposal 2 - Revert to old values (Vote with 🚀)

Basically revert #4818 without any further improvements. |
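Proposal 1's three-state semantics can be sketched in plain Python (the function name and return convention are illustrative, not actual chart helpers): unset keeps the default release label, an empty string means null, and any other string is passed through:

```python
# Hypothetical sketch of Proposal 1's string-typed selector semantics.
def render_selector_string(value, release_name):
    """Map a string-typed selector value to the YAML the chart would emit."""
    if value is None:  # unset: the chart default (release label) applies
        return "matchLabels:\n  release: '%s'" % release_name
    if value == "":    # empty string: null selector, nothing emitted
        return None
    return value       # user-provided YAML string passed through verbatim

print(render_selector_string(None, "foo"))
print(render_selector_string("matchLabels:\n  monitoring: prometheus", "foo"))
```

The appeal of this design is that overriding never requires a null value to reach the subchart, which sidesteps the Helm coalescing issues discussed above.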
How would it look to introduce a variable to apply the default label (release: {{ .Release.Name }})? In my case it will ignore the

Option A: Add a new variable at the ruleConfigSelector level.

```yaml
kube-prometheus-stack:
  prometheus:
    prometheusSpec:
      ruleConfigSelector:
        includeReleaseLabel: false # Default: true; applies the 'release' label. Ignored when matchLabels is null.
        matchLabels: # Allows configuring additional labels too, with or without the release label.
          monitoring: prometheus
```

The template:

```
{{- if not (kindIs "invalid" .Values.prometheus.prometheusSpec.ruleNamespaceSelector) }}
ruleNamespaceSelector:
  {{- if .Values.prometheus.prometheusSpec.ruleNamespaceSelector.includeReleaseLabel }}
  release: {{ .Release.Name }}
  {{- end }}
  {{- if .Values.prometheus.prometheusSpec.ruleNamespaceSelector.matchLabels }}
  {{ toYaml .Values.prometheus.prometheusSpec.ruleNamespaceSelector.matchLabels | nindent 4 }}
  {{- end }}
{{- end }}
```

Option B: the ruleConfigSelector part stays untouched and a new variable is added at the prometheusSpec level.

```yaml
kube-prometheus-stack:
  prometheus:
    prometheusSpec:
      ruleConfigWithReleaseLabel: false # Default: true; applies the 'release' label. Ignored when matchLabels is null.
      ruleConfigSelector:
        matchLabels: # Allows configuring additional labels too, with or without the release label.
          monitoring: prometheus
```

The template:

```
{{- if not (kindIs "invalid" .Values.prometheus.prometheusSpec.ruleNamespaceSelector) }}
ruleNamespaceSelector:
  {{- if .Values.prometheus.prometheusSpec.ruleConfigWithReleaseLabel }}
  release: {{ .Release.Name }}
  {{- end }}
  {{- if .Values.prometheus.prometheusSpec.ruleNamespaceSelector.matchLabels }}
  {{ toYaml .Values.prometheus.prometheusSpec.ruleNamespaceSelector.matchLabels | nindent 4 }}
  {{- end }}
{{- end }}
```
 |
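The Option A logic above can be modeled in plain Python (the `includeReleaseLabel` flag is this proposal's suggestion, not an existing chart value): the release label is added unless disabled, and any user-supplied matchLabels are merged on top.

```python
# Hypothetical model of the proposed Option A template logic.
def build_match_labels(release_name, selector):
    """Compute the effective matchLabels for a selector config dict."""
    labels = {}
    if selector.get("includeReleaseLabel", True):  # default keeps the release label
        labels["release"] = release_name
    labels.update(selector.get("matchLabels") or {})  # merge custom labels on top
    return labels

print(build_match_labels("foo", {"includeReleaseLabel": False,
                                 "matchLabels": {"monitoring": "prometheus"}}))
# -> {'monitoring': 'prometheus'}
```

Note that under this model the release label and custom labels can also be combined, by leaving `includeReleaseLabel` at its default while setting `matchLabels`.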
Same problem here. So basically this comment here does not work for us:
|
Instead of having to set it to an empty string or null, I would much prefer something more functional, like for example being able to set it to
It doesn't help that the Prometheus Operator CRD itself is also unusual in that it supports three states (nil, empty and filled). I don't think the current kps chart supports all three states, does it? |
Same, using a special label to discover the targets; had to revert back to 62 because everything disappeared 😃 |
@rouke-broersma What do you think about my proposal above? It can keep both options. The only issue is that you cannot use the two settings in an "OR" relation (keep the default release label AND scrape my custom resources with different label(s)). |
Your proposal does not exactly match the actual scenarios functionally. The actual scenario is not whether or not the release label is included; it is one of these three:
|
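For context on those three states, here is a hedged summary of how the Prometheus Operator CRDs are commonly described as treating selectors (nil selects nothing, empty selects everything in the watched namespaces, filled selects by labels); check the operator documentation for the authoritative semantics:

```python
# Sketch of the three selector states the thread distinguishes.
def selector_meaning(selector):
    """Describe a selector value's effect (per common operator semantics)."""
    if selector is None:
        return "nil: select no objects"
    if selector == {}:
        return "empty: select all objects in the watched namespaces"
    return "filled: select objects matching %r" % (selector,)

print(selector_meaning(None))
print(selector_meaning({}))
print(selector_meaning({"matchLabels": {"release": "foo"}}))
```

A chart that only distinguishes "release label on/off" cannot express all three of these states, which is the objection being raised here.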
We decided to perform a rollback: #4883. My motivation for using some of the 'cool' features is quite low, because there may be other unknown bugs in the Helm ecosystem. I want to avoid another failed release. |
Describe the bug
We have PodMonitor, ServiceMonitor and ScrapeConfig resources, and after the upgrade these resources completely disappeared from the targets list in Prometheus and their metrics are no longer collected. I spent a morning looking for any relevant change in the release notes, but I didn't find anything. The only solution was to downgrade.
All affected resources are included through Selector.matchLabels
Worked well on version: 62.7.0
Did not work on versions: 63.0.0 and 63.0.1
Prometheus configuration
Scrape configuration (I know it is in alpha status, so we are testing it on non-production environments only)
Thanks guys for the help.
What's your helm version?
version.BuildInfo{Version:"v3.16.1", GitCommit:"5a5449dc42be07001fd5771d56429132984ab3ab", GitTreeState:"dirty", GoVersion:"go1.23.1"}
What's your kubectl version?
Client Version: v1.31.1 Kustomize Version: v5.4.2 Server Version: v1.29.7
Which chart?
kube-prometheus-stack
What's the chart version?
63.0.0, 63.1.0
What happened?
After the Helm chart upgrade, most of the targets disappeared from Prometheus.
Affected resources:
What you expected to happen?
Targets should not disappear from Prometheus.
How to reproduce it?
Install Prometheus from kube-prometheus-stack version 62.7.0
Install a PodMonitor resource in a different namespace from Prometheus
Install ServiceMonitor and ScrapeConfig resources in the same namespace as Prometheus
After all targets work well, upgrade the chart to 63.0.0 or 63.1.0
Enter the changed values of values.yaml?
None
Enter the command that you execute and failing/misfunctioning.
helm install prometheus prometheus-community/kube-prometheus-stack --values values.yaml
Anything else we need to know?
The deployment happened via FluxCD v2.3.0 (latest stable).
A full Helm chart reinstall from scratch did not solve the issue.
All CRDs are on the latest version for this Helm chart.