Update documentation for new AD settings #4835
Signed-off-by: Jonah Calvo <[email protected]>
@@ -18,6 +18,10 @@ You can configure the anomaly detector processor by specifying a key and the opt
| :--- | :--- | :--- |
| `keys` | Yes | A non-ordered `List<String>` that is used as input to the ML algorithm to detect anomalies in the values of the keys in the list. At least one key is required. |
| `mode` | Yes | The ML algorithm (or model) used to detect anomalies. You must provide a mode. See [random_cut_forest mode](#random_cut_forest-mode). |
| `identification_keys` | No | If provided, anomalies will be detected within each unique instance of this key. For example, providing `ip` here will have anomalies detected separately for each unique IP address. |
| `cardinality_limit` | No | If using `identification_keys`, a new ML model will be created for every degree of cardinality. This can cause a large amount of memory usage, so setting a limit to the number of models is useful. Defaults to 5000. |
| `verbose` | No | By default, the RCF algorithm will alert once on a level shift. For example, if latency is consistently 50-100 and jumps to consistently ~1000, only one anomaly will be detected. Setting `verbose` to `true` will alert many times for such a shift. |
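As a rough illustration of how the three new options in this hunk might be combined, here is a minimal sketch of a processor entry. The pipeline name `ad-pipeline` and the `latency` field are hypothetical, the `ip` field is borrowed from the `identification_keys` description above, and the sketch assumes `identification_keys` takes a list in the same way `keys` does. The source and sink sections are omitted.

```yaml
# Sketch only: pipeline name and field names are assumptions for illustration.
ad-pipeline:
  processor:
    - anomaly_detector:
        # Values of these keys are fed to the ML algorithm.
        keys: ["latency"]
        mode:
          random_cut_forest:
        # Detect anomalies separately for each unique IP address
        # (assumed to be a list, like `keys`).
        identification_keys: ["ip"]
        # Cap the number of per-key models to bound memory usage (default is 5000).
        cardinality_limit: 5000
        # Keep alerting on sustained shifts rather than minimizing alerts.
        verbose: true
```

Leaving out `identification_keys` and `cardinality_limit` would keep a single model for all events, and leaving out `verbose` would keep the default alert-minimizing behavior discussed in the review comments.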
"RCF algorithm will alert once on a level shift" is too strong. How about: "RCF will try to automatically learn and reduce the number of anomalies. For example, if latency is consistently 50-100 and jumps to consistently ~1000, only the first few points after the transition will be detected (unless there are other spikes or anomalies). Likewise, for repeated spikes to the same level, RCF will likely eliminate many of the spikes after a few initial ones. The goal of this default setting is to minimize alerts. Setting `verbose` to `true` will alert consistently on these repeated cases and may be useful in detecting anomalous behavior that lasts an extended period of time." Please feel free to reword as well.
Made this change, thanks for the suggestion.
Signed-off-by: Jonah Calvo <[email protected]>
Thanks!
Hey team, could I get an approval on this updated documentation?
@vagimeli - can you please review this? Thank you!
Please see doc review comments.
_data-prepper/pipelines/configuration/processors/anomaly-detector.md
@JonahCalvo @hdhalter Please see doc review comments. Once changes are made, it'll be good to go.
…tor.md Co-authored-by: Melissa Vagi <[email protected]> Signed-off-by: Jonah Calvo <[email protected]>
Signed-off-by: Jonah Calvo <[email protected]>
Sorry I missed this, folks. Great comments, applied changes and removed unclear usage of 'few'. Thanks!
* Update documentation for new AD settings
  Signed-off-by: Jonah Calvo <[email protected]>
* update wording for verbose
  Signed-off-by: Jonah Calvo <[email protected]>
* Update _data-prepper/pipelines/configuration/processors/anomaly-detector.md
  Co-authored-by: Melissa Vagi <[email protected]>
  Signed-off-by: Jonah Calvo <[email protected]>
* Update _data-prepper/pipelines/configuration/processors/anomaly-detector.md
  Co-authored-by: Melissa Vagi <[email protected]>
  Signed-off-by: Jonah Calvo <[email protected]>
* Update _data-prepper/pipelines/configuration/processors/anomaly-detector.md
  Co-authored-by: Melissa Vagi <[email protected]>
  Signed-off-by: Jonah Calvo <[email protected]>
* Remove 'few' from description
  Signed-off-by: Jonah Calvo <[email protected]>
---------
Signed-off-by: Jonah Calvo <[email protected]>
Signed-off-by: Jonah Calvo <[email protected]>
Co-authored-by: Melissa Vagi <[email protected]>
Description
Updates the Anomaly Detection docs with new configuration options.
Checklist
For more information on following Developer Certificate of Origin and signing off your commits, please check here.