Update documentation for new AD settings #4835

Merged
merged 6 commits into from
Sep 21, 2023

Conversation

JonahCalvo
Contributor

@JonahCalvo JonahCalvo commented Aug 18, 2023

Description

Updates the Anomaly Detection docs with new configuration options.

Checklist

  • By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license and subject to the Developers Certificate of Origin.
    For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@@ -18,6 +18,10 @@ You can configure the anomaly detector processor by specifying a key and the opt
| :--- | :--- | :--- |
| `keys` | Yes | A non-ordered `List<String>` that is used as input to the ML algorithm to detect anomalies in the values of the keys in the list. At least one key is required.
| `mode` | Yes | The ML algorithm (or model) used to detect anomalies. You must provide a mode. See [random_cut_forest mode](#random_cut_forest-mode).
| `identification_keys` | No | If provided, anomalies will be detected within each unique instance of this key. For example, providing `ip` here will have anomalies detected separately for each unique IP address.
| `cardinality_limit` | No | If using `identification_keys`, a new ML model will be created for every degree of cardinality. This can cause a large amount of memory usage, so setting a limit to the number of models is useful. Defaults to 5000.
| `verbose` | No | By default, the RCF algorithm will alert once on a level shift. For example, if latency is consistently 50-100 and jumps to consistently ~1000, only one anomaly will be detected. Setting `verbose` to `true` will alert many times for such a shift.
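To make the new options concrete, here is a minimal pipeline sketch that sets all three alongside the required `keys` and `mode`. The pipeline name, the `http` source, the `stdout` sink, and the `latency` and `ip` field names are illustrative assumptions. `identification_keys` is written as a list on the assumption that it takes the same `List<String>` form as `keys`; the table above does not state its type.

```yaml
# Hypothetical pipeline; source, sink, and field names are placeholders.
ad-pipeline:
  source:
    http:
  processor:
    - anomaly_detector:
        # Required: numeric field(s) fed to the ML algorithm.
        keys: ["latency"]
        # Required: the detection algorithm.
        mode:
          random_cut_forest:
        # Optional: detect anomalies separately per unique value of this key.
        # Assumed to be a list, like `keys`.
        identification_keys: ["ip"]
        # Optional: cap on the number of per-key models (default 5000).
        cardinality_limit: 5000
        # Optional: keep alerting on sustained shifts instead of minimizing alerts.
        verbose: true
  sink:
    - stdout:
```

Lowering `cardinality_limit` trades coverage for memory: with a high-cardinality key such as an IP address, each unique value gets its own model, so the limit bounds how many models are held at once.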

@sudiptoguha sudiptoguha Aug 18, 2023

"RCF algorithm will alert once on a level shift" -> this is too strong. How about "RCF will try to auto learn and reduce the number of anomalies. For example, if latency is consistently 50-100 and jumps to consistently ~1000, only the first few points after the transition will be detected (unless there are other spikes/anomalies). Likewise, for repeated spikes to the same level, RCF will likely eliminate many of the spikes after a few initial ones. The goal of this default setting is to minimize alerts. Setting verbose to true will alert consistently on these repeated cases and may be useful in detecting anomalous behavior that lasts an extended period of time." Please feel free to reword as well.

Contributor Author

Made this change, thanks for the suggestion.

Signed-off-by: Jonah Calvo <[email protected]>

@sudiptoguha sudiptoguha left a comment

Thanks!

@JonahCalvo
Contributor Author

Hey team, could I get an approval on this updated documentation?

@hdhalter
Contributor

@vagimeli - can you please review this? Thank you!

Contributor

@vagimeli vagimeli left a comment

Please see doc review comments.

@vagimeli
Contributor

> @vagimeli - can you please review this? Thank you!

@JonahCalvo @hdhalter Please see doc review comments. Once changes are made, it'll be good to go.

@JonahCalvo
Contributor Author

Sorry I missed this, folks. Great comments. I applied the changes and removed the unclear usage of 'few'. Thanks!

@Naarcha-AWS Naarcha-AWS merged commit 7eceb2b into opensearch-project:main Sep 21, 2023
2 checks passed
harshavamsi pushed a commit to harshavamsi/documentation-website that referenced this pull request Oct 31, 2023
* Update documentation for new AD settings

Signed-off-by: Jonah Calvo <[email protected]>

* update wording for verbose

Signed-off-by: Jonah Calvo <[email protected]>

* Update _data-prepper/pipelines/configuration/processors/anomaly-detector.md

Co-authored-by: Melissa Vagi <[email protected]>
Signed-off-by: Jonah Calvo <[email protected]>

* Update _data-prepper/pipelines/configuration/processors/anomaly-detector.md

Co-authored-by: Melissa Vagi <[email protected]>
Signed-off-by: Jonah Calvo <[email protected]>

* Update _data-prepper/pipelines/configuration/processors/anomaly-detector.md

Co-authored-by: Melissa Vagi <[email protected]>
Signed-off-by: Jonah Calvo <[email protected]>

* Remove 'few' from description

Signed-off-by: Jonah Calvo <[email protected]>

---------

Signed-off-by: Jonah Calvo <[email protected]>
Signed-off-by: Jonah Calvo <[email protected]>
Co-authored-by: Melissa Vagi <[email protected]>
vagimeli added a commit that referenced this pull request Dec 21, 2023