Why is exporting failing? Not seeing any data go to Splunk HF #986

xxPhuNguyenxx · 2023-10-23T13:36:00Z

We're currently trying to migrate from Splunk Connect for Kubernetes (SCK) to Splunk OpenTelemetry. We've uninstalled SCK and followed the instructions for installing Splunk OTel. Our values.yaml is below.

values.yaml
clusterName: ocp-sbx1
splunkPlatform:
  token: xxxxxxxxxx
  endpoint: https://SplunkHF/services/collector/event
  index: sotl
  metricsIndex: sotm
  logsEnabled: true
  metricsEnabled: false
  tracesEnabled: false
  maxConnections: 10 # Maximum HTTP connections to use simultaneously when sending data.
  disableCompression: false # Whether to disable gzip compression over HTTP. Defaults to true.
  timeout: 10s # HTTP timeout when sending data. Defaults to 10s.
  idleConnTimeout: 5s # Idle connection timeout. defaults to 10s
  insecureSkipVerify: false # default to true. Once this works, we'll need to put Ca certs in place
  retryOnFailure:
    enabled: true
    initialInterval: 30s # Time to wait after the first failure before retrying; ignored if enabled is false. Defaults to 5s
    maxInterval: 60s # The upper bound on backoff; ignored if enabled is false. Default is 30s
    maxElapsedTime: 600s # The maximum amount of time spent trying to send a batch; ignored if enabled is false. Default is 300s
logsEngine: otel
cloudProvider: "aws"
distribution: "eks"
environment: sandbox

Otel Logging:
kubectl logs pod/ocp-aws-sbx1-splunk-otel-collector-agent-tgb87 -f
2023/10/04 14:29:12 settings.go:399: Set config to [/conf/relay.yaml]
2023/10/04 14:29:12 settings.go:452: Set ballast to 165 MiB
2023/10/04 14:29:12 settings.go:468: Set memory limit to 450 MiB
2023-10-04T14:29:12.823Z info service/telemetry.go:84 Setting up own telemetry...
2023-10-04T14:29:12.823Z info service/telemetry.go:201 Serving Prometheus metrics {"address": "0.0.0.0:58889", "level": "Basic"}
2023-10-04T14:29:12.825Z info service/service.go:138 Starting otelcol... {"Version": "v0.85.0", "NumCPU": 32}
2023-10-04T14:29:12.825Z info extensions/extensions.go:31 Starting extensions...
2023-10-04T14:29:12.825Z info extensions/extensions.go:34 Extension is starting... {"kind": "extension", "name": "file_storage"}
2023-10-04T14:29:12.825Z info extensions/extensions.go:38 Extension started. {"kind": "extension", "name": "file_storage"}
2023-10-04T14:29:12.825Z info extensions/extensions.go:34 Extension is starting... {"kind": "extension", "name": "health_check"}
2023-10-04T14:29:12.825Z info [email protected]/healthcheckextension.go:35 Starting health_check extension {"kind": "extension", "name": "health_check", "config": {"Endpoint":"0.0.0.0:13133","TLSSetting":null,"CORS":null,"Auth":null,"MaxRequestBodySize":0,"IncludeMetadata":false,"ResponseHeaders":null,"Path":"/","ResponseBody":null,"CheckCollectorPipeline":{"Enabled":false,"Interval":"5m","ExporterFailureThreshold":5}}}
2023-10-04T14:29:12.825Z warn [email protected]/warning.go:40 Using the 0.0.0.0 address exposes this server to every network interface, which may facilitate Denial of Service attacks {"kind": "extension", "name": "health_check", "documentation": "https://github.com/open-telemetry/opentelemetry-collector/blob/main/docs/security-best-practices.md#safeguards-against-denial-of-service-attacks"}
2023-10-04T14:29:12.825Z info extensions/extensions.go:38 Extension started. {"kind": "extension", "name": "health_check"}
2023-10-04T14:29:12.825Z info extensions/extensions.go:34 Extension is starting... {"kind": "extension", "name": "k8s_observer"}
2023-10-04T14:29:12.825Z info extensions/extensions.go:38 Extension started. {"kind": "extension", "name": "k8s_observer"}
2023-10-04T14:29:12.825Z info extensions/extensions.go:34 Extension is starting... {"kind": "extension", "name": "memory_ballast"}
2023-10-04T14:29:12.908Z info [email protected]/memory_ballast.go:41 Setting memory ballast {"kind": "extension", "name": "memory_ballast", "MiBs": 165}
2023-10-04T14:29:13.008Z info extensions/extensions.go:38 Extension started. {"kind": "extension", "name": "memory_ballast"}
2023-10-04T14:29:13.008Z info extensions/extensions.go:34 Extension is starting... {"kind": "extension", "name": "zpages"}
2023-10-04T14:29:13.008Z info [email protected]/zpagesextension.go:53 Registered zPages span processor on tracer provider {"kind": "extension", "name": "zpages"}
2023-10-04T14:29:13.008Z info [email protected]/zpagesextension.go:63 Registered Host's zPages {"kind": "extension", "name": "zpages"}
2023-10-04T14:29:13.009Z info [email protected]/zpagesextension.go:75 Starting zPages extension {"kind": "extension", "name": "zpages", "config": {"TCPAddr":{"Endpoint":"localhost:55679"}}}
2023-10-04T14:29:13.009Z info extensions/extensions.go:38 Extension started. {"kind": "extension", "name": "zpages"}
2023-10-04T14:29:13.009Z warn [email protected]/warning.go:40 Using the 0.0.0.0 address exposes this server to every network interface, which may facilitate Denial of Service attacks {"kind": "receiver", "name": "otlp", "data_type": "logs", "documentation": "https://github.com/open-telemetry/opentelemetry-collector/blob/main/docs/security-best-practices.md#safeguards-against-denial-of-service-attacks"}
2023-10-04T14:29:13.009Z info [email protected]/otlp.go:83 Starting GRPC server {"kind": "receiver", "name": "otlp", "data_type": "logs", "endpoint": "0.0.0.0:4317"}
2023-10-04T14:29:13.009Z warn [email protected]/warning.go:40 Using the 0.0.0.0 address exposes this server to every network interface, which may facilitate Denial of Service attacks {"kind": "receiver", "name": "otlp", "data_type": "logs", "documentation": "https://github.com/open-telemetry/opentelemetry-collector/blob/main/docs/security-best-practices.md#safeguards-against-denial-of-service-attacks"}
2023-10-04T14:29:13.009Z info [email protected]/otlp.go:101 Starting HTTP server {"kind": "receiver", "name": "otlp", "data_type": "logs", "endpoint": "0.0.0.0:4318"}
2023-10-04T14:29:13.009Z info adapter/receiver.go:45 Starting stanza receiver {"kind": "receiver", "name": "filelog", "data_type": "logs"}
2023-10-04T14:29:13.011Z info healthcheck/handler.go:132 Health Check state change {"kind": "extension", "name": "health_check", "status": "ready"}
2023-10-04T14:29:13.011Z info service/service.go:161 Everything is ready. Begin running and processing data.
2023-10-04T14:29:13.213Z info fileconsumer/file.go:194 Started watching file {"kind": "receiver", "name": "filelog", "data_type": "logs", "component": "fileconsumer", "path": "/var/log/pods/argo-cd_argocd-application-controller-0_5e3a9022-da1d-4987-9d5a-8ab6d91155b3/argocd-application-controller/0.log"}
2023-10-04T14:29:13.708Z info exporterhelper/queued_retry.go:351 Exporting failed. Will retry the request after interval. {"kind": "exporter", "data_type": "logs", "name": "splunk_hec/platform_logs", "error": "HTTP 404 \"Not Found\"", "interval": "36.084025499s"}
2023-10-04T14:29:30.534Z info exporterhelper/queued_retry.go:351 Exporting failed. Will retry the request after interval. {"kind": "exporter", "data_type": "logs", "name": "splunk_hec/platform_logs", "error": "HTTP 404 \"Not Found\"", "interval": "1m1.421101624s"}
2023-10-04T14:29:41.624Z info exporterhelper/queued_retry.go:351 Exporting failed. Will retry the request after interval. {"kind": "exporter", "data_type": "logs", "name": "splunk_hec/platform_logs", "error": "HTTP 404 \"Not Found\"", "interval": "27.609176856s"}

atoulme · 2023-10-24T15:48:40Z

Is this a duplicate of #987? Please reach out to Splunk Support.

xxPhuNguyenxx · 2023-10-24T17:24:15Z

No, actually it's a different issue. The other one is from our on-prem OCP environment and shows a different error. This one is from our AWS EKS cluster, where we're sending to a custom port because of a port conflict.

With the other one (#987), we're getting a weird connection issue even though a manual curl command to the HEC endpoint works from the pod.
2023-10-18T20:30:00.323Z info exporterhelper/retry_sender.go:177 Exporting failed. Will retry the request after interval. {"kind": "exporter", "data_type": "logs", "name": "splunk_hec/platform_logs", "error": "Post \"https://splunkhf/services/collector/event\": net/http: request canceled (Client.Timeout exceeded while awaiting headers)", "interval": "37.248737291s"}

With this one it just throws the "Exporting failed" error. Regarding the HTTP 404 "Not Found", what exactly is not found? What's causing the 404?

2023-10-04T14:29:41.624Z info exporterhelper/queued_retry.go:351 Exporting failed. Will retry the request after interval. {"kind": "exporter", "data_type": "logs", "name": "splunk_hec/platform_logs", "error": "HTTP 404 \"Not Found\"", "interval": "27.609176856s"}

atoulme · 2023-10-24T17:53:59Z

The HEC exporter is hitting an HTTP server that returns a 404 Not Found status code. Check that the endpoint is correct. It might be a good idea to hit this endpoint from your cluster to confirm it works. For more help, it's best to open a support case.
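
As a rough aid for checking the endpoint, the POSIX shell sketch below splits a HEC URL into scheme, host, effective port, and path. `describe_endpoint` is a hypothetical helper, not part of the chart or the collector, and the URL is the one from the values.yaml above (the hostname looks redacted, so the real value may differ). A URL without an explicit port resolves to 443 for https, while HEC listens on 8088 by default, so a custom port would have to appear in the endpoint itself.

```shell
# Hypothetical helper: print scheme, host, effective port and path of a URL.
describe_endpoint() {
  url=$1
  scheme=${url%%://*}          # text before "://"
  rest=${url#*://}             # text after "://"
  hostport=${rest%%/*}         # up to the first "/"
  case $rest in
    */*) path=/${rest#*/} ;;   # everything after the first "/"
    *)   path=/ ;;
  esac
  case $hostport in
    *:*) host=${hostport%%:*}; port=${hostport##*:} ;;
    *)   host=$hostport
         # No explicit port: fall back to the scheme default.
         if [ "$scheme" = https ]; then port=443; else port=80; fi ;;
  esac
  printf 'scheme=%s host=%s port=%s path=%s\n' "$scheme" "$host" "$port" "$path"
}

describe_endpoint "https://SplunkHF/services/collector/event"
# prints: scheme=https host=SplunkHF port=443 path=/services/collector/event
```

If the port printed here is not the one the heavy forwarder's HEC input listens on, the request may be reaching a different HTTP server, which is one possible source of a 404.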

xxPhuNguyenxx · 2023-10-24T18:10:21Z

I'm a bit hesitant to submit a Splunk support case, mainly because we'll spend a lot of time on the HF, which is definitely working (we have over 5 TB going to it daily with no issues), or get sidetracked. Is there a way to set the logging to debug or confirm which endpoint is used?

When I run a curl command from the pod, it always returns success.

curl -k https://splunkhf/services/collector/event -H "Authorization: Splunk xxxx-xxxx-xxxx-xxxx-xxxx" -d '{"index":"main","event":"testing"}'
{"text":"Success","code":0}%

atoulme · 2023-10-24T20:08:02Z

Sure, please see https://github.com/signalfx/splunk-otel-collector-chart/blob/main/docs/troubleshooting.md
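
For reference, a minimal sketch of raising the agent's log level through a values override, assuming the chart passes `agent.config` through to the collector's `service.telemetry.logs.level` setting as the linked troubleshooting page describes. The file name `debug-values.yaml` is a placeholder chosen for this example.

```shell
# Write a small override file that sets the collector's own log level to debug.
# debug-values.yaml is a hypothetical file name for this sketch.
cat > debug-values.yaml <<'EOF'
agent:
  config:
    service:
      telemetry:
        logs:
          level: debug
EOF

# Then pass it alongside the existing values.yaml on the next upgrade, e.g.:
#   helm upgrade <release> <chart> -f values.yaml -f debug-values.yaml
```

The release and chart names are left as placeholders because they are not stated in this thread.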

matthewmodestino · 2023-11-02T23:38:14Z

It has to be something in the way your values.yaml endpoint is being set in the configmap.

Try reviewing the configmap:

kubectl get cm <name_of_agent_configmap> -o yaml

If you open a case, the support team can reach out internally to grab one of us to help. Alternatively, holler at your SE and tell them to ping me, or if you are in the Splunk community Slack, we can review there.
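
To narrow the configmap review down to the exporter from the logs, a small filter like the one below may help. It is only a sketch: `hec_exporter_block` is a hypothetical helper that reads YAML on stdin and prints the `splunk_hec/platform_logs:` section by following indentation, on the assumption that the rendered config defines the exporter under that name (the name is taken from the log lines in this thread).

```shell
# Hypothetical helper: print the "splunk_hec/platform_logs:" block from YAML on stdin.
hec_exporter_block() {
  awk '
    /splunk_hec\/platform_logs:[[:space:]]*$/ {
      match($0, /^ */); indent = RLENGTH   # remember the header indentation
      show = 1; print; next
    }
    show {
      match($0, /^ */)
      # Stop at the first non-blank line indented no deeper than the header.
      if (RLENGTH <= indent && $0 !~ /^ *$/) { show = 0; next }
      print
    }
  '
}

# Usage against the live cluster:
#   kubectl get cm <name_of_agent_configmap> -o yaml | hec_exporter_block
```

The `endpoint:` line in the printed block is what the exporter actually posts to, which can then be compared with the URL used in the manual curl test.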

atoulme · 2023-11-17T18:15:14Z

Please follow up with a support case if the problem persists.

atoulme closed this as completed Nov 17, 2023
