You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
level.Debug(logger).Log("msg", "Malformed name tag", "tag", tag, "component", component)
This log is very useful for debugging missing tags in metrics. However to get it we have to set statsd-exporter at debug level logging, which also brings in the "incoming line" log which logs way too much causing issues for us.
Can we increase this log to at least info level? Debug doesn't make sense to me for such important info
The text was updated successfully, but these errors were encountered:
The reason we set it at debug level is because it might emit a log line for more or less every incoming event. As a general rule, we've kept everything O(incoming) out of info-level logs, to avoid logging volume (= cost) surprises in normal operation.
Some people run this exporter with a lot of events coming in, logging a significant fraction of these for even a short period across the whole fleet may be wildly expensive. On the other hand, you're right that we are not giving them a lot of options to find out what's wrong without turning on the firehose on at least a subset.
If you're up for it, I would love to have rate limited logging of problematic event lines. What I roughly imagine is that we identify the non-varying parts of the log line, and only emit one line per some time period for each of these. We'll probably want some settings to tune that, since what's acceptable may vary a lot, but as a default I would say one log line per metric cardinality per minute would be a good starting point. Ideally we'd have some indication how many lines we've rate limited away when we do finally flush them.
In general, this would allow us to elevate many of the "something is wrong with this event" logs to Warning, so they're shown and stand out by default.
On this log line in particular:
how useful is it in isolation? How much do you need to know about the full incoming event to make sense of it?
In this function, there's two uses of the "malformed name tag" message, making it indistinguishable which error path you hit. When touching this, please give them distinct messages.
matthiasr
changed the title
Malformed name tag - Log level should be higher
Safe logging of problems in incoming events [was: Malformed name tag - Log level should be higher]
Aug 18, 2024
The following log is currently set at debug level:
statsd_exporter/pkg/line/line.go
Line 124 in 58769c7
This log is very useful for debugging missing tags in metrics. However to get it we have to set statsd-exporter at debug level logging, which also brings in the "incoming line" log which logs way too much causing issues for us.
Can we increase this log to at least info level? Debug doesn't make sense to me for such important info
The text was updated successfully, but these errors were encountered: