Add new alert rules for throttling #509
gabrielcocenza
commented
Nov 28, 2024
- If OpenSearch is throttling, this alert signals that optimizations are needed, such as scaling the number of nodes or changing query and indexing patterns
- It's recommended that OpenSearch runs with at least 3 nodes to have high availability
@@ -105,3 +105,23 @@
"for": "1m"
"labels":
  "severity": "alert"

- "alert": "OpenSearchFewNodes"
In my opinion, this alert is not very relevant. High-availability requirements should be explained in the docs or the deployment requirements, not raised as an alert once the deployment is integrated with observability.
I kind of agree. However, the generated metrics cannot tell us that a node is lost, apart from checking whether the `up` metric is working. I created this as a warning, just to trigger something non-critical so that operators know the current cluster is not operating as expected.
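For illustration only, a check based on the `up` metric could look roughly like the sketch below. The alert name and the `job` label value are hypothetical and are not taken from this PR; `up` itself is the standard per-target metric that Prometheus records for every scrape.

```yaml
# Hypothetical sketch: alert name and job label are assumptions, not part of this PR.
"groups":
- "name": "opensearch-up-example"
  "rules":
  - "alert": "OpenSearchScrapeTargetDownExample"
    # up is 1 when the last scrape of a target succeeded and 0 when it failed.
    "expr": "up{job=\"opensearch\"} == 0"
    "for": "1m"
    "labels":
      "severity": "warning"
```

A rule like this only fires for targets Prometheus still knows about, so it would not notice a node that was removed from the scrape configuration entirely.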
I've removed the node count alert. We can add it in the future if necessary.
Hey @gabrielcocenza, unfortunately we cannot limit ourselves to 3+ nodes. The 1-node and 2-node cluster scenarios are also supposed to work. On the other hand, the throttling alert is a really good one!
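As a rough sketch of what a throttling rule of this kind might look like, written in the same quoted-key style as the diff above: the alert name, the annotations, and the metric name `opensearch_indices_indexing_is_throttled_bool` are assumptions for illustration (the metric name assumes the Prometheus exporter plugin for OpenSearch) and may differ from the rules actually added in this PR.

```yaml
# Illustrative sketch only: names and the metric below are assumptions, not the PR's actual rules.
"groups":
- "name": "opensearch-throttling-example"
  "rules":
  - "alert": "OpenSearchThrottlingExample"
    # Assumed gauge that is 1 while indexing is being throttled on a node.
    "expr": "opensearch_indices_indexing_is_throttled_bool > 0"
    "for": "1m"
    "labels":
      "severity": "warning"
    "annotations":
      "summary": "OpenSearch is throttling indexing on {{ $labels.instance }}"
      "description": "Consider scaling the number of nodes or changing query and indexing patterns."
```

The `for` clause keeps short throttling bursts from firing the alert; the `1m` value mirrors the one visible in the diff and would need tuning per deployment.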