agent-operator: doesn't provide information on why it's not doing anything #3363

Open
uhthomas opened this issue Mar 26, 2023 · 2 comments
Labels
enhancement New feature or request operator Grafana Agent Operator related variant/operator Related to Grafana Agent Static Operator.

Comments

@uhthomas
Hey, I haven't seen anyone else report this problem, so I imagine I'm doing something obviously wrong. I've really struggled to debug it, so I thought it was worth raising regardless, as it may be a good opportunity to introduce better logging and feedback for errors or misconfiguration.

I've deployed grafana-agent-operator and it doesn't seem to do anything with custom resources like GrafanaAgent or LogsInstance. I'm under the impression that the operator should, at the very least, be creating a Deployment or DaemonSet for the agent.

There are no pods at all, and describing the resource shows no status or events.

❯ k -n grafana-agent get po
No resources found in grafana-agent namespace.
❯ k -n grafana-agent describe grafanaagents.monitoring.grafana.com
Name:         grafana-agent
Namespace:    grafana-agent
Labels:       app.kubernetes.io/instance=grafana-agent
              app.kubernetes.io/managed-by=automata
              app.kubernetes.io/name=grafana-agent
              app.kubernetes.io/version=0.32.1
Annotations:  <none>
API Version:  monitoring.grafana.com/v1alpha1
Kind:         GrafanaAgent
Metadata:
  Creation Timestamp:  2023-03-19T01:23:19Z
  Generation:          4
  Managed Fields:
    API Version:  monitoring.grafana.com/v1alpha1
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .:
          f:kubectl.kubernetes.io/last-applied-configuration:
        f:labels:
          .:
          f:app.kubernetes.io/instance:
          f:app.kubernetes.io/managed-by:
          f:app.kubernetes.io/name:
          f:app.kubernetes.io/version:
      f:spec:
        .:
        f:disableReporting:
        f:disableSupportBundle:
        f:enableConfigReadAPI:
        f:image:
        f:logs:
          .:
          f:instanceSelector:
        f:metrics:
          .:
          f:externalLabels:
            .:
            f:cluster:
          f:instanceSelector:
        f:serviceAccountName:
    Manager:         kubectl-client-side-apply
    Operation:       Update
    Time:            2023-03-26T03:18:01Z
  Resource Version:  1099462
  UID:               59d09ede-9049-4aec-9bc2-bd19c5f5b455
Spec:
  Disable Reporting:       false
  Disable Support Bundle:  false
  Enable Config Read API:  false
  Image:                   grafana/agent:v0.32.1
  Logs:
    Instance Selector:
      Match Labels:
        Agent:  grafana-agent
  Metrics:
    External Labels:
      Cluster:  unwind
    Instance Selector:
      Match Labels:
        Agent:           grafana-agent
  Service Account Name:  grafana-agent
Events:                  <none>

The logs from the operator also don't really help, even with --log.level=debug.

❯ k -n grafana-agent-operator logs grafana-agent-operator-dd4686f88-fk795
level=info ts=2023-03-26T10:44:54.879077984Z component=controller-runtime.metrics msg="Metrics server is starting to listen" addr=:8080
level=info ts=2023-03-26T10:44:54.879623566Z msg="starting manager"
level=info ts=2023-03-26T10:44:54.879868679Z path=/metrics kind=metrics addr=[::]:8080 msg="Starting server"
level=info ts=2023-03-26T10:44:54.880110958Z controller=node controllerGroup= controllerKind=Node msg="Starting EventSource" source="kind source: *v1.Node"
level=info ts=2023-03-26T10:44:54.880157505Z controller=node controllerGroup= controllerKind=Node msg="Starting EventSource" source="kind source: *v1.Service"
level=info ts=2023-03-26T10:44:54.880175122Z controller=node controllerGroup= controllerKind=Node msg="Starting EventSource" source="kind source: *v1.Endpoints"
level=info ts=2023-03-26T10:44:54.880188435Z controller=node controllerGroup= controllerKind=Node msg="Starting Controller"
level=info ts=2023-03-26T10:44:54.880219787Z controller=grafanaagent controllerGroup=monitoring.grafana.com controllerKind=GrafanaAgent msg="Starting EventSource" source="kind source: *v1alpha1.GrafanaAgent"
level=info ts=2023-03-26T10:44:54.880259906Z controller=grafanaagent controllerGroup=monitoring.grafana.com controllerKind=GrafanaAgent msg="Starting EventSource" source="kind source: *v1.StatefulSet"
level=info ts=2023-03-26T10:44:54.880281839Z controller=grafanaagent controllerGroup=monitoring.grafana.com controllerKind=GrafanaAgent msg="Starting EventSource" source="kind source: *v1.DaemonSet"
level=info ts=2023-03-26T10:44:54.880301266Z controller=grafanaagent controllerGroup=monitoring.grafana.com controllerKind=GrafanaAgent msg="Starting EventSource" source="kind source: *v1.Deployment"
level=info ts=2023-03-26T10:44:54.880319682Z controller=grafanaagent controllerGroup=monitoring.grafana.com controllerKind=GrafanaAgent msg="Starting EventSource" source="kind source: *v1.Secret"
level=info ts=2023-03-26T10:44:54.880353892Z controller=grafanaagent controllerGroup=monitoring.grafana.com controllerKind=GrafanaAgent msg="Starting EventSource" source="kind source: *v1.Service"
level=info ts=2023-03-26T10:44:54.880384389Z controller=grafanaagent controllerGroup=monitoring.grafana.com controllerKind=GrafanaAgent msg="Starting EventSource" source="kind source: *v1.Secret"
level=info ts=2023-03-26T10:44:54.880403868Z controller=grafanaagent controllerGroup=monitoring.grafana.com controllerKind=GrafanaAgent msg="Starting EventSource" source="kind source: *v1alpha1.LogsInstance"
level=info ts=2023-03-26T10:44:54.880420805Z controller=grafanaagent controllerGroup=monitoring.grafana.com controllerKind=GrafanaAgent msg="Starting EventSource" source="kind source: *v1alpha1.PodLogs"
level=info ts=2023-03-26T10:44:54.880454408Z controller=grafanaagent controllerGroup=monitoring.grafana.com controllerKind=GrafanaAgent msg="Starting EventSource" source="kind source: *v1alpha1.MetricsInstance"
level=info ts=2023-03-26T10:44:54.880485492Z controller=grafanaagent controllerGroup=monitoring.grafana.com controllerKind=GrafanaAgent msg="Starting EventSource" source="kind source: *v1alpha1.Integration"
level=info ts=2023-03-26T10:44:54.880506422Z controller=grafanaagent controllerGroup=monitoring.grafana.com controllerKind=GrafanaAgent msg="Starting EventSource" source="kind source: *v1.PodMonitor"
level=info ts=2023-03-26T10:44:54.880529989Z controller=grafanaagent controllerGroup=monitoring.grafana.com controllerKind=GrafanaAgent msg="Starting EventSource" source="kind source: *v1.Probe"
level=info ts=2023-03-26T10:44:54.880548504Z controller=grafanaagent controllerGroup=monitoring.grafana.com controllerKind=GrafanaAgent msg="Starting EventSource" source="kind source: *v1.ServiceMonitor"
level=info ts=2023-03-26T10:44:54.880564865Z controller=grafanaagent controllerGroup=monitoring.grafana.com controllerKind=GrafanaAgent msg="Starting EventSource" source="kind source: *v1.Secret"
level=info ts=2023-03-26T10:44:54.880593308Z controller=grafanaagent controllerGroup=monitoring.grafana.com controllerKind=GrafanaAgent msg="Starting EventSource" source="kind source: *v1.ConfigMap"
level=info ts=2023-03-26T10:44:54.880615514Z controller=grafanaagent controllerGroup=monitoring.grafana.com controllerKind=GrafanaAgent msg="Starting Controller"
level=info ts=2023-03-26T10:44:54.983078173Z controller=node controllerGroup= controllerKind=Node msg="Starting workers" workercount=1
level=info ts=2023-03-26T10:44:54.983229972Z controller=node controllerGroup= controllerKind=Node Node="unsupported value type" namespace= name=talos-su3-l23 reconcileID=8995d331-d5a4-4676-a006-8542fea3c95d msg="reconciling node"
level=debug ts=2023-03-26T10:44:54.983508784Z controller=node controllerGroup= controllerKind=Node Node="unsupported value type" namespace= name=talos-su3-l23 reconcileID=8995d331-d5a4-4676-a006-8542fea3c95d msg="reconciling kubelet service" svc=grafana-agent-operator/kubelet
level=info ts=2023-03-26T10:44:54.985699825Z controller=grafanaagent controllerGroup=monitoring.grafana.com controllerKind=GrafanaAgent msg="Starting workers" workercount=1
level=info ts=2023-03-26T10:44:54.985774221Z controller=grafanaagent controllerGroup=monitoring.grafana.com controllerKind=GrafanaAgent GrafanaAgent="unsupported value type" namespace=grafana-agent name=grafana-agent reconcileID=3b71a747-3695-42d7-87b4-2f2a16bef514 msg="reconciling grafana-agent"
level=info ts=2023-03-26T10:44:54.986405198Z controller=grafanaagent controllerGroup=monitoring.grafana.com controllerKind=GrafanaAgent GrafanaAgent="unsupported value type" namespace=grafana-agent name=grafana-agent reconcileID=3b71a747-3695-42d7-87b4-2f2a16bef514 msg="reconciling secret" secret=grafana-agent-secrets
level=debug ts=2023-03-26T10:44:54.989946391Z controller=node controllerGroup= controllerKind=Node Node="unsupported value type" namespace= name=talos-su3-l23 reconcileID=8995d331-d5a4-4676-a006-8542fea3c95d msg="reconciling kubelet endpoints" eps=grafana-agent-operator/kubelet
level=info ts=2023-03-26T10:44:54.992184399Z controller=grafanaagent controllerGroup=monitoring.grafana.com controllerKind=GrafanaAgent GrafanaAgent="unsupported value type" namespace=grafana-agent name=grafana-agent reconcileID=3b71a747-3695-42d7-87b4-2f2a16bef514 msg="deleting integrations Deployment" deploy=grafana-agent/grafana-agent-integrations-deploy
level=info ts=2023-03-26T10:44:54.99228564Z controller=grafanaagent controllerGroup=monitoring.grafana.com controllerKind=GrafanaAgent GrafanaAgent="unsupported value type" namespace=grafana-agent name=grafana-agent reconcileID=3b71a747-3695-42d7-87b4-2f2a16bef514 msg="deleting integrations DaemonSet" ds=grafana-agent/grafana-agent-integrations-ds
level=debug ts=2023-03-26T10:44:54.99232075Z controller=grafanaagent controllerGroup=monitoring.grafana.com controllerKind=GrafanaAgent GrafanaAgent="unsupported value type" namespace=grafana-agent name=grafana-agent reconcileID=3b71a747-3695-42d7-87b4-2f2a16bef514 msg="done reconciling grafana-agent"
level=info ts=2023-03-26T10:44:54.992383883Z controller=grafanaagent controllerGroup=monitoring.grafana.com controllerKind=GrafanaAgent GrafanaAgent="unsupported value type" namespace=grafana-agent-operator name=grafana-agent-operator reconcileID=4f3c5f61-02cb-4959-9cda-160a96fa3ae3 msg="reconciling grafana-agent"
level=info ts=2023-03-26T10:44:54.992777934Z controller=grafanaagent controllerGroup=monitoring.grafana.com controllerKind=GrafanaAgent GrafanaAgent="unsupported value type" namespace=grafana-agent-operator name=grafana-agent-operator reconcileID=4f3c5f61-02cb-4959-9cda-160a96fa3ae3 msg="reconciling secret" secret=grafana-agent-operator-secrets
level=info ts=2023-03-26T10:44:54.994953393Z controller=node controllerGroup= controllerKind=Node Node="unsupported value type" namespace= name=talos-3sl-aqp reconcileID=3ce0f2a5-e9e0-4322-b05d-051d786c2b61 msg="reconciling node"
level=debug ts=2023-03-26T10:44:54.995219871Z controller=node controllerGroup= controllerKind=Node Node="unsupported value type" namespace= name=talos-3sl-aqp reconcileID=3ce0f2a5-e9e0-4322-b05d-051d786c2b61 msg="reconciling kubelet service" svc=grafana-agent-operator/kubelet
level=info ts=2023-03-26T10:44:54.997204801Z controller=grafanaagent controllerGroup=monitoring.grafana.com controllerKind=GrafanaAgent GrafanaAgent="unsupported value type" namespace=grafana-agent-operator name=grafana-agent-operator reconcileID=4f3c5f61-02cb-4959-9cda-160a96fa3ae3 msg="deleting integrations Deployment" deploy=grafana-agent-operator/grafana-agent-operator-integrations-deploy
level=info ts=2023-03-26T10:44:54.997346518Z controller=grafanaagent controllerGroup=monitoring.grafana.com controllerKind=GrafanaAgent GrafanaAgent="unsupported value type" namespace=grafana-agent-operator name=grafana-agent-operator reconcileID=4f3c5f61-02cb-4959-9cda-160a96fa3ae3 msg="deleting integrations DaemonSet" ds=grafana-agent-operator/grafana-agent-operator-integrations-ds
level=debug ts=2023-03-26T10:44:54.997405572Z controller=grafanaagent controllerGroup=monitoring.grafana.com controllerKind=GrafanaAgent GrafanaAgent="unsupported value type" namespace=grafana-agent-operator name=grafana-agent-operator reconcileID=4f3c5f61-02cb-4959-9cda-160a96fa3ae3 msg="done reconciling grafana-agent"
level=debug ts=2023-03-26T10:44:55.000434754Z controller=node controllerGroup= controllerKind=Node Node="unsupported value type" namespace= name=talos-3sl-aqp reconcileID=3ce0f2a5-e9e0-4322-b05d-051d786c2b61 msg="reconciling kubelet endpoints" eps=grafana-agent-operator/kubelet
level=info ts=2023-03-26T10:44:55.005151218Z controller=node controllerGroup= controllerKind=Node Node="unsupported value type" namespace= name=talos-avz-rb5 reconcileID=264bfc83-0efc-4b80-8c67-d533e59c1c7e msg="reconciling node"
level=debug ts=2023-03-26T10:44:55.005391055Z controller=node controllerGroup= controllerKind=Node Node="unsupported value type" namespace= name=talos-avz-rb5 reconcileID=264bfc83-0efc-4b80-8c67-d533e59c1c7e msg="reconciling kubelet service" svc=grafana-agent-operator/kubelet
level=debug ts=2023-03-26T10:44:55.010440758Z controller=node controllerGroup= controllerKind=Node Node="unsupported value type" namespace= name=talos-avz-rb5 reconcileID=264bfc83-0efc-4b80-8c67-d533e59c1c7e msg="reconciling kubelet endpoints" eps=grafana-agent-operator/kubelet
level=info ts=2023-03-26T10:44:55.015073079Z controller=node controllerGroup= controllerKind=Node Node="unsupported value type" namespace= name=talos-god-636 reconcileID=ac2c5b99-f9a6-452b-83e7-dca3a64bad22 msg="reconciling node"
level=debug ts=2023-03-26T10:44:55.015379696Z controller=node controllerGroup= controllerKind=Node Node="unsupported value type" namespace= name=talos-god-636 reconcileID=ac2c5b99-f9a6-452b-83e7-dca3a64bad22 msg="reconciling kubelet service" svc=grafana-agent-operator/kubelet
level=debug ts=2023-03-26T10:44:55.020466852Z controller=node controllerGroup= controllerKind=Node Node="unsupported value type" namespace= name=talos-god-636 reconcileID=ac2c5b99-f9a6-452b-83e7-dca3a64bad22 msg="reconciling kubelet endpoints" eps=grafana-agent-operator/kubelet
level=info ts=2023-03-26T10:44:55.027138829Z controller=node controllerGroup= controllerKind=Node Node="unsupported value type" namespace= name=talos-l94-p4c reconcileID=437adfbb-3019-4f44-ad03-74a327eb7bb1 msg="reconciling node"
level=debug ts=2023-03-26T10:44:55.027399767Z controller=node controllerGroup= controllerKind=Node Node="unsupported value type" namespace= name=talos-l94-p4c reconcileID=437adfbb-3019-4f44-ad03-74a327eb7bb1 msg="reconciling kubelet service" svc=grafana-agent-operator/kubelet
level=debug ts=2023-03-26T10:44:55.032409039Z controller=node controllerGroup= controllerKind=Node Node="unsupported value type" namespace= name=talos-l94-p4c reconcileID=437adfbb-3019-4f44-ad03-74a327eb7bb1 msg="reconciling kubelet endpoints" eps=grafana-agent-operator/kubelet
level=info ts=2023-03-26T10:45:07.832265094Z controller=grafanaagent controllerGroup=monitoring.grafana.com controllerKind=GrafanaAgent GrafanaAgent="unsupported value type" namespace=grafana-agent name=grafana-agent reconcileID=7acbcea2-9913-444c-ad3f-00895de1a6f5 msg="reconciling grafana-agent"
level=debug ts=2023-03-26T10:45:07.832349171Z controller=grafanaagent controllerGroup=monitoring.grafana.com controllerKind=GrafanaAgent GrafanaAgent="unsupported value type" namespace=grafana-agent name=grafana-agent reconcileID=7acbcea2-9913-444c-ad3f-00895de1a6f5 msg="detected deleted agent"
level=debug ts=2023-03-26T10:45:07.832371006Z controller=grafanaagent controllerGroup=monitoring.grafana.com controllerKind=GrafanaAgent GrafanaAgent="unsupported value type" namespace=grafana-agent name=grafana-agent reconcileID=7acbcea2-9913-444c-ad3f-00895de1a6f5 msg="done reconciling grafana-agent"
level=info ts=2023-03-26T10:45:20.203059625Z controller=grafanaagent controllerGroup=monitoring.grafana.com controllerKind=GrafanaAgent GrafanaAgent="unsupported value type" namespace=grafana-agent name=grafana-agent reconcileID=cf404ca2-bfa3-45ef-9e2e-f6747c8ac101 msg="reconciling grafana-agent"
level=info ts=2023-03-26T10:45:20.203567307Z controller=grafanaagent controllerGroup=monitoring.grafana.com controllerKind=GrafanaAgent GrafanaAgent="unsupported value type" namespace=grafana-agent name=grafana-agent reconcileID=cf404ca2-bfa3-45ef-9e2e-f6747c8ac101 msg="reconciling secret" secret=grafana-agent-secrets
level=info ts=2023-03-26T10:45:20.214877589Z controller=grafanaagent controllerGroup=monitoring.grafana.com controllerKind=GrafanaAgent GrafanaAgent="unsupported value type" namespace=grafana-agent name=grafana-agent reconcileID=cf404ca2-bfa3-45ef-9e2e-f6747c8ac101 msg="deleting integrations Deployment" deploy=grafana-agent/grafana-agent-integrations-deploy
level=info ts=2023-03-26T10:45:20.214994687Z controller=grafanaagent controllerGroup=monitoring.grafana.com controllerKind=GrafanaAgent GrafanaAgent="unsupported value type" namespace=grafana-agent name=grafana-agent reconcileID=cf404ca2-bfa3-45ef-9e2e-f6747c8ac101 msg="deleting integrations DaemonSet" ds=grafana-agent/grafana-agent-integrations-ds
level=debug ts=2023-03-26T10:45:20.215027469Z controller=grafanaagent controllerGroup=monitoring.grafana.com controllerKind=GrafanaAgent GrafanaAgent="unsupported value type" namespace=grafana-agent name=grafana-agent reconcileID=cf404ca2-bfa3-45ef-9e2e-f6747c8ac101 msg="done reconciling grafana-agent"

I'd appreciate any help. If it's helpful, this is the deployment for the operator. The CRDs and cluster roles are also in the same directory.

@uhthomas
Author

uhthomas commented Mar 26, 2023

Hmm, I do see this logic, which deploys nothing (and deletes any previously deployed DaemonSet) if no integrations are defined.

if len(d.Integrations) == 0 {
	// There's nothing to deploy; delete anything that might've been deployed
	// from a previous reconcile.
	level.Info(l).Log("msg", "deleting integrations DaemonSet", "ds", key)
	var ds apps_v1.DaemonSet
	return deleteManagedResource(ctx, r.Client, key, &ds)
}

I'll see if I can make it work with that. A clearer log might have been helpful.
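
As a rough illustration of what a clearer log could say, here is a minimal sketch that builds a message explaining why nothing is deployed. It uses only the Go standard library rather than the operator's own logger, and `skipReason` and its parameters are hypothetical names for illustration, not part of the operator.

```go
package main

import "fmt"

// skipReason is a hypothetical helper (not part of the operator). It returns
// a message explaining why no workload will be created for the resource
// identified by key, or an empty string when something was matched.
func skipReason(kind, key string, matched int) string {
	if matched > 0 {
		return ""
	}
	return fmt.Sprintf(
		"no %s selected for %s; nothing to deploy, removing any previously managed resources",
		kind, key,
	)
}

func main() {
	// Zero integrations matched, so the message states why nothing is deployed.
	fmt.Println(skipReason("integrations", "grafana-agent/grafana-agent-integrations-ds", 0))
}
```

A message along these lines states the cause (nothing was selected) rather than only the side effect (a delete call), which is the part that was hard to infer from the logs above.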

@uhthomas
Author

❯ k -n grafana-agent get po
NAME              READY   STATUS    RESTARTS   AGE
grafana-agent-0   2/2     Running   0          2m29s

Yes! Okay. I added a MetricsInstance and made sure the label selectors matched (they didn't before), and now things seem to be working as expected.
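
For anyone hitting the same thing, the selector mismatch can be pictured with a small sketch of `matchLabels` semantics: every key/value pair in the selector must be present on the selected resource. This is a standalone illustration using only the standard library, not the operator's code. The selector key `agent` is assumed from the describe output above, and the example MetricsInstance labels are hypothetical.

```go
package main

import "fmt"

// matchLabels reports whether every key/value pair in selector is present in
// labels. In this sketch an empty selector matches everything.
func matchLabels(selector, labels map[string]string) bool {
	for k, want := range selector {
		if got, ok := labels[k]; !ok || got != want {
			return false
		}
	}
	return true
}

func main() {
	// Selector assumed from the GrafanaAgent spec shown in the issue.
	selector := map[string]string{"agent": "grafana-agent"}

	// Hypothetical MetricsInstance labels, for illustration only.
	mismatched := map[string]string{"app": "grafana-agent"}
	matched := map[string]string{"agent": "grafana-agent", "app": "grafana-agent"}

	fmt.Println(matchLabels(selector, mismatched)) // false
	fmt.Println(matchLabels(selector, matched))    // true
}
```

If no instance carries the labels the selector asks for, nothing is selected, which is consistent with the operator creating no pods.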

I don't know if I'm just an idiot or what, but I struggled with this for a good few days. I'd like to leave this issue open to specifically request clearer logs, and events published on the GrafanaAgent resource. One of the best ways to understand the status of something is to describe it, and Events: <none> was really unhelpful.

Thanks 😄

@uhthomas uhthomas changed the title from "agent-operator: doesn't do anything" to "agent-operator: doesn't provide information on why it's not doing anything" Mar 26, 2023
@rfratto rfratto added enhancement New feature or request type/operator labels Mar 26, 2023
@rfratto rfratto added operator Grafana Agent Operator related and removed type/operator labels Nov 2, 2023
@rfratto rfratto added the variant/operator Related to Grafana Agent Static Operator. label Apr 9, 2024