database_observability: report health of component and collectors #2392
@@ -127,7 +132,7 @@ func (c *QuerySample) fetchQuerySamples(ctx context.Context) error {
 	}

 	if strings.HasSuffix(sampleText, "...") {
-		level.Info(c.logger).Log("msg", "skipping parsing truncated query", "digest", digest)
+		level.Debug(c.logger).Log("msg", "skipping parsing truncated query", "digest", digest)
Drive-by: demote noisy info log to debug
	if len(schemas) == 0 {
		level.Info(c.logger).Log("msg", "no schema detected from information_schema.schemata")
		return nil
	}
Drive-by: log if no schema is detected
Report the component as unhealthy if an error occurs while starting the collectors or if any collector stops during operation.
LGTM. I just suggested a different approach that would give the collectors more flexibility over their health status, but feel free to ignore it.
	Start(context.Context) error
	Stopped() bool
In the future, you might have collectors that are still running but should be considered unhealthy. To support this, the collector interface could expose a CurrentHealth function that returns the health object. You would then no longer need the healthErr attribute: the component's CurrentHealth function would just call CurrentHealth on every collector.
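A minimal sketch of what this suggestion could look like, not the actual implementation. The `Health` and `HealthType` types below are local stand-ins so the example compiles on its own, and `Name()`, `aggregateHealth`, `staticCollector`, and the collector names are hypothetical, introduced only for illustration.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// HealthType and Health are local stand-ins for the component health types.
// They are assumptions made so that this sketch is self-contained.
type HealthType int

const (
	HealthTypeHealthy HealthType = iota
	HealthTypeUnhealthy
)

type Health struct {
	Health     HealthType
	Message    string
	UpdateTime time.Time
}

// Collector is a hypothetical variant of the collector interface in which
// each collector reports its own health instead of only a Stopped() flag.
type Collector interface {
	Name() string
	CurrentHealth() Health
}

// aggregateHealth is what the component's CurrentHealth could delegate to:
// the component is unhealthy as soon as any collector is not healthy.
func aggregateHealth(collectors []Collector) Health {
	var unhealthy []string
	for _, c := range collectors {
		if h := c.CurrentHealth(); h.Health != HealthTypeHealthy {
			unhealthy = append(unhealthy, fmt.Sprintf("%s: %s", c.Name(), h.Message))
		}
	}
	if len(unhealthy) > 0 {
		return Health{
			Health:     HealthTypeUnhealthy,
			Message:    strings.Join(unhealthy, "; "),
			UpdateTime: time.Now(),
		}
	}
	return Health{
		Health:     HealthTypeHealthy,
		Message:    "all collectors are healthy",
		UpdateTime: time.Now(),
	}
}

// staticCollector is a test double that always reports a fixed health.
type staticCollector struct {
	name   string
	health Health
}

func (s staticCollector) Name() string          { return s.name }
func (s staticCollector) CurrentHealth() Health { return s.health }
```

With this shape, a collector that keeps running in a degraded state can return an unhealthy `Health` without having to stop, which the `Stopped() bool` flag cannot express.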
Yes, that's a great point. I wanted to start simple for now, since the collectors aren't resilient at all anyway (they stop as soon as they hit any error). I agree that in the future we might want to delegate this logic to the collectors themselves.
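For comparison, the simpler approach described here could be sketched roughly as follows. This is an illustration under assumptions, not the code from the PR: only `Start(context.Context) error` and `Stopped() bool` come from the diff, while `Name()`, `checkHealth`, `fakeCollector`, and the collector names are hypothetical.

```go
package main

import (
	"context"
	"fmt"
	"strings"
)

// Collector mirrors the two methods shown in the diff. Name() is a
// hypothetical addition used here only to build a readable message.
type Collector interface {
	Name() string
	Start(context.Context) error
	Stopped() bool
}

// checkHealth returns nil when the component can be reported as healthy.
// It returns an error if startup failed (startErr, recorded by the caller)
// or if any collector has stopped since it was started.
func checkHealth(startErr error, collectors []Collector) error {
	if startErr != nil {
		return fmt.Errorf("failed to start collectors: %w", startErr)
	}
	var stopped []string
	for _, c := range collectors {
		if c.Stopped() {
			stopped = append(stopped, c.Name())
		}
	}
	if len(stopped) > 0 {
		return fmt.Errorf("collectors stopped: %s", strings.Join(stopped, ", "))
	}
	return nil
}

// fakeCollector is a test double with a fixed stopped state.
type fakeCollector struct {
	name    string
	stopped bool
}

func (f *fakeCollector) Name() string                { return f.name }
func (f *fakeCollector) Start(context.Context) error { return nil }
func (f *fakeCollector) Stopped() bool               { return f.stopped }
```

The trade-off is that health is binary per collector: a collector is either running or stopped, so a running collector that is misbehaving still counts as healthy.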
PR Description
Report the component as unhealthy if an error occurs while starting the collectors or if any collector stops during operation.
Which issue(s) this PR fixes
n.a.