[FEATURE] Add support for indexing pressure stats #453

Open
dbwiddis opened this issue Apr 13, 2023 · 6 comments
Labels
enhancement New feature or request performance Make it fast!

Comments

@dbwiddis
Member

Is your feature request related to a problem?

Extensions need access to the node / shard indexing pressure APIs to know when to throttle requests.

See this blog post.

The Indexing APIs in OpenSearch such as _bulk allows you to write data in the cluster, which is distributed across multiple shards, on multiple data nodes. However, at times indexing requests may suffer performance degradation due to a number of reasons including non-optimal cluster configuration, shard strategy, traffic spikes, available node resources and more. These issues are further exacerbated for larger multi-node cluster and indices with many shards. All of these could cause out-of-memory errors, long garbage collection (GC) pauses, and reduced throughput, affecting the overall availability of data nodes in addition to degrading performance.

But with this API:

Primary parameters, are leading indicators for node duress and are governed through soft limits. ... Breach of primary parameters reflect no actual rejections yet, but triggers an evaluation of the secondary signals.

The Anomaly Detection HCAD framework currently uses local node Indexing Pressure as a throttling mechanism. When indexing pressure is low, all results are recorded; when it is high, selective results are discarded to respect node health.

We would like to enable a similar mechanism for the Anomaly Detection extension, where we can self-throttle what we index based on the health of the cluster.
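As a rough illustration of the kind of self-throttling described above, here is a minimal sketch. It is not the HCAD implementation: the class name `IndexingPressureThrottle`, the linear drop-off policy, and the soft-limit parameter are all assumptions made for this example. It takes the node's current indexing pressure bytes and its limit, keeps every result while usage is at or below the soft limit, and discards an increasing share of results as usage approaches the limit.

```java
import java.util.Random;

// Hypothetical helper, not part of opensearch-java or the AD plugin.
final class IndexingPressureThrottle {
    private final double softLimit; // fraction of the limit below which everything is kept
    private final Random random;

    IndexingPressureThrottle(double softLimit, Random random) {
        if (softLimit < 0.0 || softLimit >= 1.0) {
            throw new IllegalArgumentException("softLimit must be in [0, 1)");
        }
        this.softLimit = softLimit;
        this.random = random;
    }

    /** Fraction of the indexing pressure budget in use, clamped to [0, 1]. */
    static double pressure(long currentBytes, long limitBytes) {
        if (limitBytes <= 0) {
            return 1.0; // unknown limit: assume the node is under full pressure
        }
        return Math.min(1.0, Math.max(0.0, (double) currentBytes / limitBytes));
    }

    /** 1.0 at or below the soft limit, falling linearly to 0.0 at full pressure. */
    double keepProbability(double pressure) {
        if (pressure <= softLimit) {
            return 1.0;
        }
        return Math.max(0.0, (1.0 - pressure) / (1.0 - softLimit));
    }

    /** Decides whether a single result should be indexed or discarded. */
    boolean shouldRecord(long currentBytes, long limitBytes) {
        return random.nextDouble() < keepProbability(pressure(currentBytes, limitBytes));
    }
}
```

A caller would feed `shouldRecord` with values read from the node stats, which is exactly the data the client does not yet expose. A real policy would probably also weigh result priority rather than dropping uniformly at random.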

While the opensearch-java client does implement NodeStatsRequest and NodeStatsResponse objects for use with client.nodes().stats(), the response object only includes a subset of the available node stats. Notably, it is missing the results for indexing_pressure (currently used by the AD plugin) and shard_indexing_pressure (as described in the blog post above).

What solution would you like?

Add parsing for the full set of Nodes stats: specifically indexing_pressure and shard_indexing_pressure.

What alternatives have you considered?

  1. Constructing a request with the GET /_nodes/stats/indexing_pressure/ or GET /_nodes/stats/shard_indexing_pressure/ endpoints and manually parsing the JSON response.
  2. Sending a Transport request to fetch the data via a TransportAction
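
For reference, alternative 1 might look roughly like the sketch below. It uses only the JDK (`java.net.http`, Java 11+) and a regular expression instead of a JSON library, which is fragile and shown only to keep the example self-contained. The class and method names are hypothetical, and the sample payload is an abbreviated, made-up illustration of the `indexing_pressure` section, not output captured from a cluster.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.util.OptionalLong;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of opensearch-java.
final class IndexingPressureStats {

    /** Builds a GET request for the indexing_pressure section of the nodes stats API. */
    static HttpRequest statsRequest(String baseUrl) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/_nodes/stats/indexing_pressure"))
                .header("Accept", "application/json")
                .GET()
                .build();
    }

    /** Returns the first non-negative integer stored under the given field name, if any. */
    static OptionalLong firstLongField(String json, String field) {
        Matcher m = Pattern.compile("\"" + Pattern.quote(field) + "\"\\s*:\\s*(\\d+)").matcher(json);
        return m.find() ? OptionalLong.of(Long.parseLong(m.group(1))) : OptionalLong.empty();
    }

    public static void main(String[] args) {
        // Assumed, abbreviated response shape for a single node.
        String sample = "{\"nodes\":{\"node-1\":{\"indexing_pressure\":{\"memory\":{"
                + "\"current\":{\"all_in_bytes\":2048},"
                + "\"total\":{\"all_in_bytes\":900000,\"coordinating_rejections\":3},"
                + "\"limit_in_bytes\":53687091}}}}}";

        System.out.println(statsRequest("http://localhost:9200").uri());
        System.out.println(firstLongField(sample, "all_in_bytes").getAsLong());   // current usage
        System.out.println(firstLongField(sample, "limit_in_bytes").getAsLong()); // node limit
    }
}
```

Because the regex returns the first match, it only works when the `current` block precedes `total` and a single node is queried. That brittleness is a large part of why typed parsing in the client would be preferable.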

Do you have any additional context?

Originated from: opensearch-project/opensearch-sdk-java#655

@dbwiddis dbwiddis added enhancement New feature or request untriaged labels Apr 13, 2023
@wbeckler

wbeckler commented Oct 3, 2023

in which version of opensearch were these stats added to the node stats API? Is the next step to add this to the API spec?

@dbwiddis
Member Author

dbwiddis commented Oct 3, 2023

in which version of opensearch were these stats added to the node stats API?

The blog post says 1.2. The relevant PR adding the API is opensearch-project/OpenSearch#1336 merged on Oct 7 2021, which would seem to concur (v1.2.0 label on the PR, 1.1 was Oct 5, 1.2 was Nov 23.)

Is the next step to add this to the API spec?

Possibly? I've heard mixed reports on whether we use the spec files. Happy to attempt to update those; is the process for doing so, and for prompting client builds, documented somewhere? The issue discussing generating from spec is still open/unresolved: opensearch-project/OpenSearch#3090. See also opensearch-project/opensearch-clients#19.

@wbeckler

wbeckler commented Oct 3, 2023

opensearch-java is not built from spec, but there's been generator progress in the more loosely typed languages. We should start with smithy spec modification. I'm guessing that's here: https://github.com/opensearch-project/opensearch-api-specification/tree/main/model/nodes/stats

You would be the first person since the creation of the smithy spec who would be adding a new field/parameter. Please feel free to open issues in the api spec repo as you run into the inevitable friction in this process.

@VachaShah
Collaborator

To add this feature to the Java client, it has to be done through manual code until we implement a generator for this client.

For other clients where we have client generators in progress, it would be through the spec. We have been adding a lot of new specs to the Smithy specs in various modules, so please feel free to raise an issue for the same and we can help you get those changes in!

@dbwiddis
Member Author

dbwiddis commented Oct 3, 2023

Feel free to assign this issue to me. I'll take care of it as time permits. I understand:

  1. I need to update the spec at the api-spec repo to make sure it propagates to other clients (I'll file an issue first), and
  2. In parallel I need to make the change here manually

@VachaShah
Collaborator

Yup that sounds right @dbwiddis. Thank you for taking this up!

4 participants