Add metrics section to benchmark #4972

Merged
merged 14 commits into from
Sep 25, 2023
61 changes: 61 additions & 0 deletions _benchmark/metrics/index.md
@@ -0,0 +1,61 @@
---
layout: default
title: Metrics
nav_order: 25
has_children: true
---

# Metrics

After a workload completes, OpenSearch Benchmark stores all metric records within its metrics store. These metrics can be kept in memory or in an OpenSearch cluster.

## Storing metrics

You can specify whether metrics are stored in memory or in a metrics store while running the benchmark by setting the [`datastore.type`](https://opensearch.org/docs/latest/benchmark/configuring-benchmark/#results_publishing) parameter in your `benchmark.ini` file.

### In memory

If you want to store metrics in memory while running the benchmark, provide the following settings in the `results_publishing` section of `benchmark.ini`:

```ini
[results_publishing]
datastore.type = in-memory
datastore.host = <host-url>
datastore.port = <host-port>
datastore.secure = False
datastore.ssl.verification_mode = <ssl-verification-details>
datastore.user = <username>
datastore.password = <password>
```

### OpenSearch

If you want to store metrics in an external OpenSearch metrics store while running the benchmark, provide the following settings in the `results_publishing` section of `benchmark.ini`:

```ini
[results_publishing]
datastore.type = opensearch
datastore.host = <opensearch endpoint>
datastore.port = 443
datastore.secure = true
datastore.ssl.verification_mode = none
datastore.user = <opensearch basic auth username>
datastore.password = <opensearch basic auth password>
datastore.number_of_replicas =
datastore.number_of_shards =
```
When neither `datastore.number_of_replicas` nor `datastore.number_of_shards` is provided, OpenSearch uses the default values: `0` for the number of replicas and `1` for the number of shards. If these settings are changed after the data store cluster is created, the new replica and shard settings will only apply when new results indexes are created at the end of the month.
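
The default handling described above can be illustrated with a minimal sketch. It is not OpenSearch Benchmark's own code: `datastore_index_settings` and `int_or_default` are hypothetical helpers, the host name is a placeholder, and treating an empty value the same as a missing one is an assumption.

```python
import configparser

# Defaults described above: 0 replicas and 1 shard when the settings are absent.
DEFAULT_REPLICAS = 0
DEFAULT_SHARDS = 1

# Example configuration; the host is a placeholder.
SAMPLE_INI = """
[results_publishing]
datastore.type = opensearch
datastore.host = metrics.example.org
datastore.port = 443
datastore.secure = true
datastore.number_of_replicas =
datastore.number_of_shards = 3
"""


def int_or_default(section, key, default):
    """Return the integer value of `key`, or `default` if it is missing or empty."""
    raw = section.get(key, "").strip()
    return int(raw) if raw else default


def datastore_index_settings(ini_text):
    """Hypothetical helper: resolve replica and shard counts from benchmark.ini text."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    section = parser["results_publishing"]
    return {
        "number_of_replicas": int_or_default(
            section, "datastore.number_of_replicas", DEFAULT_REPLICAS
        ),
        "number_of_shards": int_or_default(
            section, "datastore.number_of_shards", DEFAULT_SHARDS
        ),
    }


print(datastore_index_settings(SAMPLE_INI))
```

In this sample, the replica count falls back to the default because its value is empty, while the explicit shard count is kept.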

After you run OpenSearch Benchmark configured to use OpenSearch as a data store, OpenSearch Benchmark creates three indexes:

- `benchmark-metrics-YYYY-MM`: Holds granular metric and telemetry data.
- `benchmark-results-YYYY-MM`: Holds data based on final results.
- `benchmark-test-executions-YYYY-MM`: Holds data about `execution-ids`.

You can visualize data inside these indexes in OpenSearch Dashboards.
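
As a small illustration of the monthly naming scheme, the following sketch derives the three index names for a given date. The helper `monthly_index_names` is hypothetical and only mirrors the `YYYY-MM` pattern listed above.

```python
from datetime import date

# Index name prefixes taken from the list above.
INDEX_PREFIXES = (
    "benchmark-metrics",
    "benchmark-results",
    "benchmark-test-executions",
)


def monthly_index_names(day):
    """Return the three index names for the month that contains `day`."""
    suffix = day.strftime("%Y-%m")
    return [f"{prefix}-{suffix}" for prefix in INDEX_PREFIXES]


print(monthly_index_names(date(2023, 8, 10)))
```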


## Next steps

- For more information about how to design a metrics store, see [Metric records]({{site.url}}{{site.baseurl}}/benchmark/metrics/metric-records/).
- For more information about what metrics are stored, see [Metric keys]({{site.url}}{{site.baseurl}}/benchmark/metrics/metric-keys/).
47 changes: 47 additions & 0 deletions _benchmark/metrics/metric-keys.md
@@ -0,0 +1,47 @@
---
layout: default
title: Metric keys
nav_order: 35
parent: Metrics
---

# Metric keys

Metric keys are the metrics that OpenSearch Benchmark stores, based on the configuration in the [metric record]({{site.url}}{{site.baseurl}}/benchmark/metrics/metric-records/). OpenSearch Benchmark stores the following metrics:


- `latency`: The time period between submitting a request and receiving the complete response. This also includes wait time, such as the time the request spends waiting until it is ready to be serviced by OpenSearch Benchmark.
- `service_time`: The time period between sending a request and receiving the corresponding response. This metric is similar to latency but does not include wait time.
- `processing_time`: The time period between starting to process a request and receiving the complete response. Contrary to service time, this metric also includes the OpenSearch Benchmark client-side processing overhead. Large differences between service time and processing time indicate a high overhead in the client and can thus point to a potential client-side bottleneck, which requires investigation.
- `throughput`: The number of operations that OpenSearch Benchmark can perform within a certain time period, usually per second. See the [workload reference]({{site.url}}{{site.baseurl}}/benchmark/workloads/index/) for definitions of operation types.
- `disk_io_write_bytes`: The number of bytes written to disk during the benchmark. On Linux, this metric corresponds to only the bytes that have been written by OpenSearch Benchmark. On macOS, it includes the number of bytes written by all processes.
- `disk_io_read_bytes`: The number of bytes read from disk during the benchmark. On macOS, this includes the number of bytes read by all processes.
- `node_startup_time`: The amount of time, in seconds, from the start of the process until the node is running.
- `node_total_young_gen_gc_time`: The total runtime of the young-generation garbage collector across the whole cluster, as reported by the Nodes Stats API.
- `node_total_young_gen_gc_count`: The total number of young-generation garbage collections across the whole cluster, as reported by the Nodes Stats API.
- `node_total_old_gen_gc_time`: The total runtime of the old-generation garbage collector across the whole cluster, as reported by the Nodes Stats API.
- `node_total_old_gen_gc_count`: The total number of old-generation garbage collections across the whole cluster, as reported by the Nodes Stats API.
- `node_total_zgc_cycles_gc_time`: The total time spent by the Z Garbage Collector (ZGC) on garbage collecting across the whole cluster, as reported by the Nodes Stats API.
- `node_total_zgc_cycles_gc_count`: The total number of garbage collections ZGC performed across the whole cluster, as reported by the Nodes Stats API.
- `node_total_zgc_pauses_gc_time`: The total time ZGC spent in Stop-The-World pauses across the whole cluster, as reported by the Nodes Stats API.
- `node_total_zgc_pauses_gc_count`: The total number of Stop-The-World pauses during ZGC execution across the whole cluster, as reported by the Nodes Stats API.
- `segments_count`: The total number of open segments, as reported by the Index Stats API.
- `segments_memory_in_bytes`: The total number of bytes used for all open segments, as reported by the Index Stats API.
- `segments_doc_values_memory_in_bytes`: The number of bytes used for document values, as reported by the Index Stats API.
- `segments_stored_fields_memory_in_bytes`: The number of bytes used for stored fields, as reported by the Index Stats API.
- `segments_terms_memory_in_bytes`: The number of bytes used for terms, as reported by the Index Stats API.
- `segments_norms_memory_in_bytes`: The number of bytes used for norms, as reported by the Index Stats API.
- `segments_points_memory_in_bytes`: The number of bytes used for points, as reported by the Index Stats API.
- `merges_total_time`: The cumulative runtime of merges for primary shards, as reported by the Index Stats API. Note that this time is not wall clock time. If M merge threads ran for N minutes, OpenSearch Benchmark reports the amount of time as M * N minutes, not N minutes. These metric records have an additional per-shard property that contains the times across primary shards in an array.
- `merges_total_count`: The cumulative number of merges of primary shards, as reported by Index Stats API under `_all/primaries`.
- `merges_total_throttled_time`: The cumulative time for merges that have been throttled, as reported by the Index Stats API. Note that this time is not wall clock time. These metric records have an additional per-shard property that contains the times across primary shards in an array.
- `indexing_total_time`: The cumulative time used for indexing of primary shards, as reported by the Index Stats API. Note that this is not wall clock time. These metric records have an additional per-shard property that contains the times across primary shards in an array.
- `indexing_throttle_time`: The cumulative time during which indexing has been throttled, as reported by the Index Stats API. Note that this is not wall clock time. These metric records have an additional per-shard property that contains the times across primary shards in an array.
- `refresh_total_time`: The cumulative time used for index refresh of primary shards, as reported by the Index Stats API. Note that this is not wall clock time. These metric records have an additional per-shard property that contains the times across primary shards in an array.
- `refresh_total_count`: The cumulative number of refreshes of primary shards, as reported by the Index Stats API under `_all/primaries`.
- `flush_total_time`: The cumulative time used for index flush of primary shards, as reported by the Index Stats API. Note that this is not wall clock time. These metric records have an additional per-shard property that contains the times across primary shards in an array.
- `flush_total_count`: The cumulative number of flushes of primary shards, as reported by the Index Stats API under `_all/primaries`.
- `final_index_size_bytes`: The final index size on the file system after all nodes have been shut down at the end of the benchmark, in bytes. It includes all files in the nodes’ data directories, such as index files and the translog.
- `store_size_in_bytes`: The size of the index, excluding the translog, as reported by the Index Stats API, in bytes.
- `translog_size_in_bytes`: The size of the translog, as reported by the Index Stats API, in bytes.
- `ml_processing_time`: An object containing the minimum, mean, median, and maximum bucket processing time per machine learning job, in milliseconds. These metrics are only available if a machine learning job has been created in the respective benchmark.
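
The relationship between `latency`, `service_time`, and `processing_time` can be sketched with one hypothetical request timeline. The field names and timestamps below are illustrative assumptions that follow the definitions in the list above. They are not taken from the OpenSearch Benchmark source code.

```python
from dataclasses import dataclass


@dataclass
class RequestTimeline:
    """Hypothetical timestamps for one request, in milliseconds."""

    scheduled_ms: int          # when the request was submitted for execution
    processing_start_ms: int   # when the client started to process the request
    request_sent_ms: int       # when the request was sent to the cluster
    response_received_ms: int  # when the complete response was received


def timing_metrics(t):
    """Derive the three timing metrics, in milliseconds, from one timeline."""
    return {
        # Includes the time the request spent waiting before being serviced.
        "latency": t.response_received_ms - t.scheduled_ms,
        # Excludes wait time and client-side overhead.
        "service_time": t.response_received_ms - t.request_sent_ms,
        # Like service time, but includes client-side processing overhead.
        "processing_time": t.response_received_ms - t.processing_start_ms,
    }


timeline = RequestTimeline(
    scheduled_ms=0,
    processing_start_ms=40,
    request_sent_ms=45,
    response_received_ms=105,
)
metrics = timing_metrics(timeline)
print(metrics)
# A large gap between processing time and service time hints at client overhead.
print(metrics["processing_time"] - metrics["service_time"])
```

In this timeline, the request waits 40 ms before processing starts, and the client adds 5 ms of overhead on top of the service time.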
119 changes: 119 additions & 0 deletions _benchmark/metrics/metric-records.md
@@ -0,0 +1,119 @@
---
layout: default
title: Metric records
nav_order: 30
parent: Metrics
---

# Metric records

OpenSearch Benchmark stores metrics in the `benchmark-metrics-*` indexes. A new index is created each month. The following is an example metric record stored in the `benchmark-metrics-2023-08` index:

```json
{
  "_index": "benchmark-metrics-2023-08",
  "_id": "UiNY4YkBpMtdJ7uj2rUe",
  "_version": 1,
  "_score": null,
  "_source": {
    "@timestamp": 1691702842821,
    "relative-time-ms": 65.90720731765032,
    "test-execution-id": "8c43ee4c-cb34-494b-81b2-181be244f832",
    "test-execution-timestamp": "20230810T212711Z",
    "environment": "local",
    "workload": "geonames",
    "test_procedure": "append-no-conflicts",
    "provision-config-instance": "external",
    "name": "service_time",
    "value": 607.8001195564866,
    "unit": "ms",
    "sample-type": "normal",
    "meta": {
      "source_revision": "unknown",
      "distribution_version": "1.1.0",
      "distribution_flavor": "oss",
      "index": "geonames",
      "took": 13,
      "success": true,
      "success-count": 125,
      "error-count": 0
    },
    "task": "index-append",
    "operation": "index-append",
    "operation-type": "bulk"
  },
  "fields": {
    "@timestamp": [
      "2023-08-10T21:27:22.821Z"
    ],
    "test-execution-timestamp": [
      "2023-08-10T21:27:11.000Z"
    ]
  },
  "highlight": {
    "workload": [
      "@opensearch-dashboards-highlighted-field@geonames@/opensearch-dashboards-highlighted-field@"
    ],
    "meta.index": [
      "@opensearch-dashboards-highlighted-field@geonames@/opensearch-dashboards-highlighted-field@"
    ]
  },
  "sort": [
    1691702831000
  ]
}
```
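
To show how such a record might be consumed, the following sketch parses a trimmed copy of the example and prints a one-line summary. The `summarize` helper is hypothetical, and the record is shortened to a few `_source` fields for brevity.

```python
import json

# Trimmed copy of the example record; only a few `_source` fields are kept.
RECORD_JSON = """
{
  "_index": "benchmark-metrics-2023-08",
  "_source": {
    "name": "service_time",
    "value": 607.8001195564866,
    "unit": "ms",
    "sample-type": "normal",
    "task": "index-append",
    "meta": {"success": true, "success-count": 125, "error-count": 0}
  }
}
"""


def summarize(record):
    """Build a short, human-readable summary of one metric record."""
    source = record["_source"]
    return f'{source["task"]}: {source["name"]} = {source["value"]:.1f} {source["unit"]}'


record = json.loads(RECORD_JSON)
print(summarize(record))  # index-append: service_time = 607.8 ms
```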

The following fields, found in the `_source` section of the metric record, are configurable in the `opensearch-benchmarks-metrics-*` file.

## @timestamp

The time at which the sample was taken, in milliseconds since the epoch. For request-related metrics, such as `latency` or `service_time`, this is the time at which OpenSearch Benchmark issued the request.
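
As a quick check of the encoding, the following sketch converts the `@timestamp` value from the example record into a UTC date and time. The helper name `epoch_ms_to_utc` is illustrative.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def epoch_ms_to_utc(epoch_ms):
    """Convert milliseconds since the Unix epoch to a timezone-aware UTC datetime."""
    # Integer millisecond arithmetic avoids floating-point rounding.
    return EPOCH + timedelta(milliseconds=epoch_ms)


ts = epoch_ms_to_utc(1691702842821)
print(ts.isoformat(timespec="milliseconds"))  # 2023-08-10T21:27:22.821+00:00
```

The result matches the `@timestamp` value shown in the `fields` section of the example record.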

## relative-time-ms

The relative time since the start of the benchmark, in milliseconds. This is useful for comparing time-series graphs across multiple tests. For example, you can compare the indexing throughput over time across multiple tests.

## test-execution-id

A UUID that changes on every invocation of the workload. It is intended to group all samples of a benchmarking run.
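
A minimal sketch of that grouping follows. The sample documents are hypothetical and heavily trimmed, and the short IDs `run-a` and `run-b` stand in for real UUIDs.

```python
from collections import defaultdict

# Hypothetical, heavily trimmed `_source` documents from two benchmark runs.
SAMPLES = [
    {"test-execution-id": "run-a", "name": "service_time", "value": 10.0},
    {"test-execution-id": "run-b", "name": "service_time", "value": 30.0},
    {"test-execution-id": "run-a", "name": "service_time", "value": 20.0},
]


def group_by_execution(samples):
    """Collect sample values per test execution ID, preserving input order."""
    grouped = defaultdict(list)
    for sample in samples:
        grouped[sample["test-execution-id"]].append(sample["value"])
    return dict(grouped)


print(group_by_execution(SAMPLES))
```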

## test-execution-timestamp

The timestamp of when the workload was invoked (always in UTC).

## environment

The `environment` describes the origin of a metric record. This is defined when initially [configuring]({{site.url}}{{site.baseurl}}/benchmark/configuring-benchmark/) OpenSearch Benchmark. You can use separate environments for different benchmarks but store the metric records in the same index.

## workload, test_procedure, provision-config-instance

The workload, test procedure, and provision configuration instance for which the metric was produced.

## name, value, unit

The actual metric name and value, with an optional unit. Depending on the nature of a metric, it is either sampled periodically by OpenSearch Benchmark (for example, CPU utilization or query latency) or measured once (for example, the final size of the index).

## sample-type

Indicates whether the sample was taken while the benchmark was running in warmup mode (`warmup`) or during normal measurement (`normal`). Only `normal` samples are considered for the results that are reported.
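
The effect of this filter can be sketched as follows. The samples are hypothetical, and `mean_of_normal_samples` is an illustrative helper rather than the aggregation that OpenSearch Benchmark itself performs.

```python
from statistics import mean

# Hypothetical samples: one taken during warmup, two during normal measurement.
SAMPLES = [
    {"name": "service_time", "value": 900.0, "sample-type": "warmup"},
    {"name": "service_time", "value": 600.0, "sample-type": "normal"},
    {"name": "service_time", "value": 620.0, "sample-type": "normal"},
]


def mean_of_normal_samples(samples, metric_name):
    """Average a metric over `normal` samples only, ignoring warmup samples."""
    values = [
        s["value"]
        for s in samples
        if s["name"] == metric_name and s["sample-type"] == "normal"
    ]
    return mean(values)


print(mean_of_normal_samples(SAMPLES, "service_time"))  # 610.0
```

The slow warmup sample does not influence the result.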

## meta

The meta information for each metric record, including the following:

- CPU info: The number of physical and logical cores and the model name.
- OS info: The name and version of the operating system.
- Hostname.
- Node name: A unique name given to each node when OpenSearch Benchmark provisions the cluster.
- Source revision: The Git hash of the version of OpenSearch that is benchmarked.
- Distribution version: The distribution version of OpenSearch that is benchmarked.
- Custom tags: You can define custom tags by using the command line flag `--user-tags`. The tags are prefixed by `tag_` in order to avoid accidental clashes with OpenSearch Benchmark internal tags.
- Operation specific: An optional substructure of the operation. For bulk requests, this may be the number of documents; for searches, the number of hits.

Depending on the metric record, some meta information might be missing.
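
The `tag_` prefixing of custom tags can be sketched as follows. This is an assumption-laden illustration: it assumes that the `--user-tags` value is a comma-separated list of `key:value` pairs, and `user_tags_to_meta` is a hypothetical helper, not an OpenSearch Benchmark function.

```python
def user_tags_to_meta(user_tags):
    """Map a `key:value,key:value` string to `tag_`-prefixed meta entries.

    The input format is an assumption made for this sketch.
    """
    meta = {}
    for pair in user_tags.split(","):
        if not pair.strip():
            continue  # skip empty segments, such as a trailing comma
        key, _, value = pair.partition(":")
        meta[f"tag_{key.strip()}"] = value.strip()
    return meta


print(user_tags_to_meta("intention:baseline,disk:ssd"))
```

Because of the prefix, a custom tag named `index` would be stored as `tag_index` and would not collide with the `index` key in the `meta` object of the example record.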

## Next steps

- For more information about how to access OpenSearch Benchmark metrics, see [Metrics]({{site.url}}{{site.baseurl}}/benchmark/metrics/index/).
- For more information about the metrics stored in OpenSearch Benchmark, see [Metric keys]({{site.url}}{{site.baseurl}}/benchmark/metrics/metric-keys/).
Binary file added images/benchmark/metric-index-pattern.png