Merge branch 'main' into collapse-search-results-opensearch-project#7507
vagimeli authored Sep 24, 2024
2 parents cf38f29 + d9829d7 commit 224721d
Showing 10 changed files with 339 additions and 23 deletions.
21 changes: 21 additions & 0 deletions _benchmark/glossary.md
@@ -0,0 +1,21 @@
---
layout: default
title: Glossary
nav_order: 10
---

# OpenSearch Benchmark glossary

The following terms are commonly used in OpenSearch Benchmark:

- **Corpora**: A collection of documents.
- **Latency**: If `target-throughput` is disabled (has no value or a value of `0`), then latency is equal to service time. If `target-throughput` is enabled (has a value of `1` or greater), then latency is equal to the service time plus the amount of time the request waits in the queue before being sent. For example, a request that waits 200 ms in the queue and has a service time of 300 ms has a latency of 500 ms.
- **Metric keys**: The metrics stored by OpenSearch Benchmark, based on the configuration in the [metrics record]({{site.url}}{{site.baseurl}}/benchmark/metrics/metric-records/).
- **Operations**: The list of API operations that a workload performs.
- **Pipeline**: A series of steps occurring both before and after running a workload that determines benchmark results.
- **Schedule**: A list of two or more operations performed in the order they appear when a workload is run.
- **Service time**: The amount of time taken for `opensearch-py`, the primary client for OpenSearch Benchmark, to send a request and receive a response from the OpenSearch cluster. It includes the amount of time taken for the server to process a request as well as for network latency, load balancer overhead, and deserialization/serialization.
- **Summary report**: A report generated at the end of a test based on the metric keys defined in the workload.
- **Test**: A single invocation of the OpenSearch Benchmark binary.
- **Throughput**: The number of operations completed in a given period of time.
- **Workload**: A collection of one or more benchmarking tests that use a specific document corpus to perform a benchmark against a cluster. The document corpus contains any indexes, data files, or operations invoked when the workload runs.
4 changes: 2 additions & 2 deletions _clients/index.md
@@ -53,8 +53,8 @@ To view the compatibility matrix for a specific client, see the `COMPATIBILITY.m

Client | Recommended version
:--- | :---
-[Elasticsearch Java low-level REST client](https://search.maven.org/artifact/org.elasticsearch.client/elasticsearch-rest-client/7.13.4/jar) | 7.13.4
-[Elasticsearch Java high-level REST client](https://search.maven.org/artifact/org.elasticsearch.client/elasticsearch-rest-high-level-client/7.13.4/jar) | 7.13.4
+[Elasticsearch Java low-level REST client](https://central.sonatype.com/artifact/org.elasticsearch.client/elasticsearch-rest-client/7.13.4) | 7.13.4
+[Elasticsearch Java high-level REST client](https://central.sonatype.com/artifact/org.elasticsearch.client/elasticsearch-rest-high-level-client/7.13.4) | 7.13.4
[Elasticsearch Python client](https://pypi.org/project/elasticsearch/7.13.4/) | 7.13.4
[Elasticsearch Node.js client](https://www.npmjs.com/package/@elastic/elasticsearch/v/7.13.0) | 7.13.0
[Elasticsearch Ruby client](https://rubygems.org/gems/elasticsearch/versions/7.13.0) | 7.13.0
4 changes: 2 additions & 2 deletions _field-types/supported-field-types/knn-vector.md
@@ -175,7 +175,7 @@ By default, k-NN vectors are `float` vectors, in which each dimension is 4 bytes
Byte vectors are supported only for the `lucene` and `faiss` engines. They are not supported for the `nmslib` engine.
{: .note}
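
As an illustration, a minimal mapping that stores `byte` vectors with the `lucene` engine might look like the following (the index name, dimension, and method values are assumptions, not taken from this page):

```json
PUT /byte-vector-index
{
  "settings": {
    "index": {
      "knn": true
    }
  },
  "mappings": {
    "properties": {
      "my_vector": {
        "type": "knn_vector",
        "dimension": 8,
        "data_type": "byte",
        "method": {
          "name": "hnsw",
          "space_type": "l2",
          "engine": "lucene"
        }
      }
    }
  }
}
```
{% include copy-curl.html %}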

-In [k-NN benchmarking tests](https://github.com/opensearch-project/k-NN/tree/main/benchmarks/perf-tool), the use of `byte` rather than `float` vectors resulted in a significant reduction in storage and memory usage as well as improved indexing throughput and reduced query latency. Additionally, precision on recall was not greatly affected (note that recall can depend on various factors, such as the [quantization technique](#quantization-techniques) and data distribution).
+In [k-NN benchmarking tests](https://github.com/opensearch-project/opensearch-benchmark-workloads/tree/main/vectorsearch), the use of `byte` rather than `float` vectors resulted in a significant reduction in storage and memory usage as well as improved indexing throughput and reduced query latency. Additionally, precision on recall was not greatly affected (note that recall can depend on various factors, such as the [quantization technique](#quantization-techniques) and data distribution).

When using `byte` vectors, expect some loss of precision in the recall compared to using `float` vectors. Byte vectors are useful in large-scale applications and use cases that prioritize a reduced memory footprint in exchange for a minimal loss of recall.
{: .important}
@@ -411,7 +411,7 @@ As an example, assume that you have 1 million vectors with a dimension of 256 an

### Quantization techniques

-If your vectors are of the type `float`, you need to first convert them to the `byte` type before ingesting the documents. This conversion is accomplished by _quantizing the dataset_---reducing the precision of its vectors. There are many quantization techniques, such as scalar quantization or product quantization (PQ), which is used in the Faiss engine. The choice of quantization technique depends on the type of data you're using and can affect the accuracy of recall values. The following sections describe the scalar quantization algorithms that were used to quantize the [k-NN benchmarking test](https://github.com/opensearch-project/k-NN/tree/main/benchmarks/perf-tool) data for the [L2](#scalar-quantization-for-the-l2-space-type) and [cosine similarity](#scalar-quantization-for-the-cosine-similarity-space-type) space types. The provided pseudocode is for illustration purposes only.
+If your vectors are of the type `float`, you need to first convert them to the `byte` type before ingesting the documents. This conversion is accomplished by _quantizing the dataset_---reducing the precision of its vectors. There are many quantization techniques, such as scalar quantization or product quantization (PQ), which is used in the Faiss engine. The choice of quantization technique depends on the type of data you're using and can affect the accuracy of recall values. The following sections describe the scalar quantization algorithms that were used to quantize the [k-NN benchmarking test](https://github.com/opensearch-project/opensearch-benchmark-workloads/tree/main/vectorsearch) data for the [L2](#scalar-quantization-for-the-l2-space-type) and [cosine similarity](#scalar-quantization-for-the-cosine-similarity-space-type) space types. The provided pseudocode is for illustration purposes only.

#### Scalar quantization for the L2 space type

2 changes: 1 addition & 1 deletion _install-and-configure/plugins.md
@@ -181,7 +181,7 @@ Continue with installation? [y/N]y

### Install a plugin using Maven coordinates

-The `opensearch-plugin install` tool also allows you to specify Maven coordinates for available artifacts and versions hosted on [Maven Central](https://search.maven.org/search?q=org.opensearch.plugin). The tool parses the Maven coordinates you provide and constructs a URL. As a result, the host must be able to connect directly to the Maven Central site. The plugin installation fails if you pass coordinates to a proxy or local repository.
+The `opensearch-plugin install` tool also allows you to specify Maven coordinates for available artifacts and versions hosted on [Maven Central](https://central.sonatype.com/namespace/org.opensearch.plugin). The tool parses the Maven coordinates you provide and constructs a URL. As a result, the host must be able to connect directly to the Maven Central site. The plugin installation fails if you pass coordinates to a proxy or local repository.

#### Usage
```bash
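# Illustrative example (the original hunk is truncated at this point):
# install a plugin by its Maven coordinates in groupId:artifactId:version form.
# The coordinates below are placeholders; substitute those of the plugin you
# want to install.
bin/opensearch-plugin install org.opensearch.plugin:opensearch-anomaly-detection:2.2.0.0
```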
2 changes: 1 addition & 1 deletion _monitoring-your-cluster/pa/index.md
@@ -60,7 +60,7 @@ private-key-file-path = specify_path

The Performance Analyzer plugin is included in the installations for [Docker]({{site.url}}{{site.baseurl}}/opensearch/install/docker/) and [tarball]({{site.url}}{{site.baseurl}}/opensearch/install/tar/), but you can also install the plugin manually.

-To install the Performance Analyzer plugin manually, download the plugin from [Maven](https://search.maven.org/search?q=org.opensearch.plugin) and install it using the standard [plugin installation]({{site.url}}{{site.baseurl}}/opensearch/install/plugins/) process. Performance Analyzer runs on each node in a cluster.
+To install the Performance Analyzer plugin manually, download the plugin from [Maven](https://central.sonatype.com/namespace/org.opensearch.plugin) and install it using the standard [plugin installation]({{site.url}}{{site.baseurl}}/opensearch/install/plugins/) process. Performance Analyzer runs on each node in a cluster.

To start the Performance Analyzer root cause analysis (RCA) agent on a tarball installation, run the following command:
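
A sketch of that command, assuming a standard tarball layout in which the RCA agent CLI ships in `bin/` (the paths and environment variables shown are assumptions; check your installation):

```bash
# Illustrative invocation (assumed paths): point the OpenSearch environment
# variables at the tarball directory, then start the RCA agent CLI.
OPENSEARCH_HOME="$PWD" OPENSEARCH_JAVA_HOME="$PWD/jdk" ./bin/performance-analyzer-agent-cli
```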

193 changes: 193 additions & 0 deletions _search-plugins/knn/disk-based-vector-search.md
@@ -0,0 +1,193 @@
---
layout: default
title: Disk-based vector search
nav_order: 16
parent: k-NN search
has_children: false
---

# Disk-based vector search
**Introduced 2.17**
{: .label .label-purple}

For low-memory environments, OpenSearch provides _disk-based vector search_, which significantly reduces the operational costs for vector workloads. Disk-based vector search uses [binary quantization]({{site.url}}{{site.baseurl}}/search-plugins/knn/knn-vector-quantization/#binary-quantization), compressing vectors and thereby reducing the memory requirements. This memory optimization provides large memory savings at the cost of slightly increased search latency while still maintaining strong recall.

To use disk-based vector search, set the [`mode`]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/knn-vector/#vector-workload-modes) parameter to `on_disk` for your vector field type. This parameter will configure your index to use secondary storage.

## Creating an index for disk-based vector search

To create an index for disk-based vector search, send the following request:

```json
PUT my-vector-index
{
  "mappings": {
    "properties": {
      "my_vector_field": {
        "type": "knn_vector",
        "dimension": 8,
        "space_type": "innerproduct",
        "data_type": "float",
        "mode": "on_disk"
      }
    }
  }
}
```
{% include copy-curl.html %}

By default, the `on_disk` mode configures the index to use the `faiss` engine and `hnsw` method. The default [`compression_level`]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/knn-vector/#compression-levels) of `32x` reduces the amount of memory the vectors require by a factor of 32. To preserve the search recall, rescoring is enabled by default. A search on a disk-optimized index runs in two phases: The compressed index is searched first, and then the results are rescored using full-precision vectors loaded from disk.

To reduce the compression level, provide the `compression_level` parameter when creating the index mapping:

```json
PUT my-vector-index
{
  "mappings": {
    "properties": {
      "my_vector_field": {
        "type": "knn_vector",
        "dimension": 8,
        "space_type": "innerproduct",
        "data_type": "float",
        "mode": "on_disk",
        "compression_level": "16x"
      }
    }
  }
}
```
{% include copy-curl.html %}

For more information about the `compression_level` parameter, see [Compression levels]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/knn-vector/#compression-levels). Note that for `4x` compression, the `lucene` engine will be used.
{: .note}
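
For example, the following mapping (the dimension and space type are illustrative) requests `4x` compression, which uses the `lucene` engine:

```json
PUT my-vector-index
{
  "mappings": {
    "properties": {
      "my_vector_field": {
        "type": "knn_vector",
        "dimension": 8,
        "space_type": "l2",
        "data_type": "float",
        "mode": "on_disk",
        "compression_level": "4x"
      }
    }
  }
}
```
{% include copy-curl.html %}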

If you need finer-grained control, you can override additional k-NN parameters in the method definition. For example, to improve recall, increase the `ef_construction` parameter value:

```json
PUT my-vector-index
{
  "mappings": {
    "properties": {
      "my_vector_field": {
        "type": "knn_vector",
        "dimension": 8,
        "space_type": "innerproduct",
        "data_type": "float",
        "mode": "on_disk",
        "method": {
          "parameters": {
            "ef_construction": 512
          }
        }
      }
    }
  }
}
```
{% include copy-curl.html %}

The `on_disk` mode only works with the `float` data type.
{: .note}

## Ingestion

You can perform document ingestion for a disk-optimized vector index in the same way as for a regular vector index. To index several documents in bulk, send the following request:

```json
POST _bulk
{ "index": { "_index": "my-vector-index", "_id": "1" } }
{ "my_vector_field": [1.5, 1.5, 1.5, 1.5, 1.5, 1.5, 1.5, 1.5], "price": 12.2 }
{ "index": { "_index": "my-vector-index", "_id": "2" } }
{ "my_vector_field": [2.5, 2.5, 2.5, 2.5, 2.5, 2.5, 2.5, 2.5], "price": 7.1 }
{ "index": { "_index": "my-vector-index", "_id": "3" } }
{ "my_vector_field": [3.5, 3.5, 3.5, 3.5, 3.5, 3.5, 3.5, 3.5], "price": 12.9 }
{ "index": { "_index": "my-vector-index", "_id": "4" } }
{ "my_vector_field": [4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5], "price": 1.2 }
{ "index": { "_index": "my-vector-index", "_id": "5" } }
{ "my_vector_field": [5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5], "price": 3.7 }
{ "index": { "_index": "my-vector-index", "_id": "6" } }
{ "my_vector_field": [6.5, 6.5, 6.5, 6.5, 6.5, 6.5, 6.5, 6.5], "price": 10.3 }
{ "index": { "_index": "my-vector-index", "_id": "7" } }
{ "my_vector_field": [7.5, 7.5, 7.5, 7.5, 7.5, 7.5, 7.5, 7.5], "price": 5.5 }
{ "index": { "_index": "my-vector-index", "_id": "8" } }
{ "my_vector_field": [8.5, 8.5, 8.5, 8.5, 8.5, 8.5, 8.5, 8.5], "price": 4.4 }
{ "index": { "_index": "my-vector-index", "_id": "9" } }
{ "my_vector_field": [9.5, 9.5, 9.5, 9.5, 9.5, 9.5, 9.5, 9.5], "price": 8.9 }
```
{% include copy-curl.html %}

## Search

Search is also performed in the same way as in other index configurations. The key difference is that, by default, the `oversample_factor` of the rescore parameter is set to `3.0` (unless you override the `compression_level`). For more information, see [Rescoring quantized results using full precision]({{site.url}}{{site.baseurl}}/search-plugins/knn/approximate-knn/#rescoring-quantized-results-using-full-precision). To perform vector search on a disk-optimized index, provide the search vector:

```json
GET my-vector-index/_search
{
  "query": {
    "knn": {
      "my_vector_field": {
        "vector": [1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5, 8.5],
        "k": 5
      }
    }
  }
}
```
{% include copy-curl.html %}

As with other index configurations, you can override k-NN parameters in the search request:

```json
GET my-vector-index/_search
{
  "query": {
    "knn": {
      "my_vector_field": {
        "vector": [1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5, 8.5],
        "k": 5,
        "method_parameters": {
          "ef_search": 512
        },
        "rescore": {
          "oversample_factor": 10.0
        }
      }
    }
  }
}
```
{% include copy-curl.html %}

[Radial search]({{site.url}}{{site.baseurl}}/search-plugins/knn/radial-search-knn/) does not support disk-based vector search.
{: .note}

## Model-based indexes

For [model-based indexes]({{site.url}}{{site.baseurl}}/search-plugins/knn/approximate-knn/#building-a-k-nn-index-from-a-model), you can specify the `on_disk` parameter in the training request in the same way that you would specify it during index creation. By default, `on_disk` mode will use the [Faiss IVF method]({{site.url}}{{site.baseurl}}/search-plugins/knn/knn-index/#supported-faiss-methods) and a compression level of `32x`. To run the training API, send the following request:

```json
POST /_plugins/_knn/models/_train/test-model
{
  "training_index": "train-index-name",
  "training_field": "train-field-name",
  "dimension": 8,
  "max_training_vector_count": 1200,
  "search_size": 100,
  "description": "My model",
  "space_type": "innerproduct",
  "mode": "on_disk"
}
```
{% include copy-curl.html %}

This command assumes that training data has been ingested into the `train-index-name` index. For more information, see [Building a k-NN index from a model]({{site.url}}{{site.baseurl}}/search-plugins/knn/approximate-knn/#building-a-k-nn-index-from-a-model).
{: .note}

You can override the `compression_level` for disk-optimized indexes in the same way as for regular k-NN indexes.
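
For example, the following training request (values illustrative) overrides the default `32x` compression level with `16x`:

```json
POST /_plugins/_knn/models/_train/test-model
{
  "training_index": "train-index-name",
  "training_field": "train-field-name",
  "dimension": 8,
  "max_training_vector_count": 1200,
  "search_size": 100,
  "description": "My model",
  "space_type": "innerproduct",
  "mode": "on_disk",
  "compression_level": "16x"
}
```
{% include copy-curl.html %}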


## Next steps

- For more information about binary quantization, see [Binary quantization]({{site.url}}{{site.baseurl}}/search-plugins/knn/knn-vector-quantization/#binary-quantization).
- For more information about k-NN vector workload modes, see [Vector workload modes]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/knn-vector/#vector-workload-modes).
2 changes: 1 addition & 1 deletion _search-plugins/knn/knn-vector-quantization.md
@@ -436,7 +436,7 @@ GET my-vector-index/_search
"my_vector_field": {
"vector": [1.5, 5.5, 1.5, 5.5, 1.5, 5.5, 1.5, 5.5],
"k": 10,
"method_params": {
"method_parameters": {
"ef_search": 10
},
"rescore": {
