From 332f936ec430cb675ad9db8086a59031aa13ad5a Mon Sep 17 00:00:00 2001 From: zhichao-aws Date: Sun, 28 Apr 2024 12:40:36 +0800 Subject: [PATCH 01/12] query tokens Signed-off-by: zhichao-aws --- _query-dsl/specialized/neural-sparse.md | 39 +++++++++++++++++++++---- _search-plugins/neural-sparse-search.md | 27 +++++++++++++++-- 2 files changed, 58 insertions(+), 8 deletions(-) diff --git a/_query-dsl/specialized/neural-sparse.md b/_query-dsl/specialized/neural-sparse.md index 70fcfd892c..2651244cb9 100644 --- a/_query-dsl/specialized/neural-sparse.md +++ b/_query-dsl/specialized/neural-sparse.md @@ -10,12 +10,12 @@ nav_order: 55 Introduced 2.11 {: .label .label-purple } -Use the `neural_sparse` query for vector field search in [neural sparse search]({{site.url}}{{site.baseurl}}/search-plugins/neural-sparse-search/). +Use the `neural_sparse` query for vector field search in [neural sparse search]({{site.url}}{{site.baseurl}}/search-plugins/neural-sparse-search/). Query can be performed using raw text or sparse vectors. ## Request fields Include the following request fields in the `neural_sparse` query: - +1. query by raw text: ```json "neural_sparse": { "": { @@ -24,17 +24,26 @@ Include the following request fields in the `neural_sparse` query: } } ``` +2. query by sparse vector: +```json +"neural_sparse": { + "": { + "query_tokens": "" + } +} +``` The top-level `vector_field` specifies the vector field against which to run a search query. The following table lists the other `neural_sparse` query fields. Field | Data type | Required/Optional | Description :--- | :--- | :--- -`query_text` | String | Required | The query text from which to generate vector embeddings. -`model_id` | String | Required | The ID of the sparse encoding model or tokenizer model that will be used to generate vector embeddings from the query text. The model must be deployed in OpenSearch before it can be used in sparse neural search. 
For more information, see [Using custom models within OpenSearch]({{site.url}}{{site.baseurl}}/ml-commons-plugin/using-ml-models/) and [Neural sparse search]({{site.url}}{{site.baseurl}}/search-plugins/neural-sparse-search/). +`query_text` | String | Optional | The query text from which to generate sparse vector embeddings. +`model_id` | String | Optional | The ID of the sparse encoding model or tokenizer model that will be used to generate vector embeddings from the query text. The model must be deployed in OpenSearch before it can be used in sparse neural search. For more information, see [Using custom models within OpenSearch]({{site.url}}{{site.baseurl}}/ml-commons-plugin/using-ml-models/) and [Neural sparse search]({{site.url}}{{site.baseurl}}/search-plugins/neural-sparse-search/). To set a default model id in neural sparse query, see [`neural_query_enricher`]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/neural-query-enricher/). +`query_tokens` | Map | Optional | The query tokens, also referred as sparse vector embeddings. Just like dense semantic retrieval, we can use raw sparse vectors generated by neural models or tokenizers to perform the semantic search. Either `query_text` or `query_tokens` must be provided for `neural_sparse` query. `max_token_score` | Float | Optional | (Deprecated) The theoretical upper bound of the score for all tokens in the vocabulary (required for performance optimization). For OpenSearch-provided [pretrained sparse embedding models]({{site.url}}{{site.baseurl}}/ml-commons-plugin/pretrained-models/#sparse-encoding-models), we recommend setting `max_token_score` to 2 for `amazon/neural-sparse/opensearch-neural-sparse-encoding-doc-v1` and to 3.5 for `amazon/neural-sparse/opensearch-neural-sparse-encoding-v1`. This field has been deprecated as of OpenSearch 2.12. #### Example request - +1. query by raw text ```json GET my-nlp-index/_search { @@ -48,4 +57,24 @@ GET my-nlp-index/_search } } ``` +2. 
query by sparse vector: +```json +GET my-nlp-index/_search +{ + "query": { + "neural_sparse": { + "passage_embedding": { + "query_tokens": { + "hi" : 4.338913, + "planets" : 2.7755864, + "planet" : 5.0969057, + "mars" : 1.7405145, + "earth" : 2.6087382, + "hello" : 3.3210192 + } + } + } + } +} +``` {% include copy-curl.html %} \ No newline at end of file diff --git a/_search-plugins/neural-sparse-search.md b/_search-plugins/neural-sparse-search.md index 58918565c4..d7e3759ab4 100644 --- a/_search-plugins/neural-sparse-search.md +++ b/_search-plugins/neural-sparse-search.md @@ -16,7 +16,7 @@ Introduced 2.11 When selecting a model, choose one of the following options: - Use a sparse encoding model at both ingestion time and search time (high performance, relatively high latency). -- Use a sparse encoding model at ingestion time and a tokenizer model at search time (low performance, relatively low latency). +- Use a sparse encoding model at ingestion time and a tokenizer at search time (relatively low performance, low latency). The tokenizers doesn't conduct model inference, but for consist experience we still deploy and invoke them via ml-commons model APIs. **PREREQUISITE**
Before using neural sparse search, make sure to set up a [pretrained sparse embedding model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/pretrained-models/#sparse-encoding-models) or your own sparse embedding model. For more information, see [Choosing a model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/integrating-ml-models/#choosing-a-model). @@ -144,11 +144,11 @@ PUT /my-nlp-index/_doc/2 Before the document is ingested into the index, the ingest pipeline runs the `sparse_encoding` processor on the document, generating vector embeddings for the `passage_text` field. The indexed document includes the `passage_text` field, which contains the original text, and the `passage_embedding` field, which contains the vector embeddings. -## Step 4: Search the index using neural search +## Step 4: Search the index using neural sparse search To perform a neural sparse search on your index, use the `neural_sparse` query clause in [Query DSL]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/index/) queries. -The following example request uses a `neural_sparse` query to search for relevant documents: +The following example request uses a `neural_sparse` query to search for relevant documents using raw query text: ```json GET my-nlp-index/_search @@ -241,6 +241,27 @@ The response contains the matching documents: } ``` +You can also use the neural_sparse query with sparse vector embeddings: +```json +GET my-nlp-index/_search +{ + "query": { + "neural_sparse": { + "passage_embedding": { + "query_tokens": { + "hi" : 4.338913, + "planets" : 2.7755864, + "planet" : 5.0969057, + "mars" : 1.7405145, + "earth" : 2.6087382, + "hello" : 3.3210192 + } + } + } + } +} +``` + ## Setting a default model on an index or field A [`neural_sparse`]({{site.url}}{{site.baseurl}}/query-dsl/specialized/neural-sparse/) query requires a model ID for generating sparse embeddings. 
To eliminate passing the model ID with each neural_sparse query request, you can set a default model on index-level or field-level. From aa52fcdcce8e2e0cea494aa59cc7d035651d3044 Mon Sep 17 00:00:00 2001 From: zhichao-aws Date: Sun, 28 Apr 2024 12:43:33 +0800 Subject: [PATCH 02/12] fix typo Signed-off-by: zhichao-aws --- _search-plugins/neural-sparse-search.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_search-plugins/neural-sparse-search.md b/_search-plugins/neural-sparse-search.md index d7e3759ab4..f5debc7033 100644 --- a/_search-plugins/neural-sparse-search.md +++ b/_search-plugins/neural-sparse-search.md @@ -16,7 +16,7 @@ Introduced 2.11 When selecting a model, choose one of the following options: - Use a sparse encoding model at both ingestion time and search time (high performance, relatively high latency). -- Use a sparse encoding model at ingestion time and a tokenizer at search time (relatively low performance, low latency). The tokenizers doesn't conduct model inference, but for consist experience we still deploy and invoke them via ml-commons model APIs. +- Use a sparse encoding model at ingestion time and a tokenizer at search time (relatively low performance, low latency). The tokenizers doesn't conduct model inference, but for consistent experience we still deploy and invoke them via ml-commons model APIs. **PREREQUISITE**
Before using neural sparse search, make sure to set up a [pretrained sparse embedding model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/pretrained-models/#sparse-encoding-models) or your own sparse embedding model. For more information, see [Choosing a model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/integrating-ml-models/#choosing-a-model). From 22bb78c87f27a9f02a1fc6fee50bdf0cab82a326 Mon Sep 17 00:00:00 2001 From: zhichao-aws Date: Sun, 28 Apr 2024 12:44:14 +0800 Subject: [PATCH 03/12] typo Signed-off-by: zhichao-aws --- _search-plugins/neural-sparse-search.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_search-plugins/neural-sparse-search.md b/_search-plugins/neural-sparse-search.md index f5debc7033..655dbd27b2 100644 --- a/_search-plugins/neural-sparse-search.md +++ b/_search-plugins/neural-sparse-search.md @@ -241,7 +241,7 @@ The response contains the matching documents: } ``` -You can also use the neural_sparse query with sparse vector embeddings: +You can also use the `neural_sparse` query with sparse vector embeddings: ```json GET my-nlp-index/_search { From b0654fd47307462876571c4c1b0bd730418cbc7e Mon Sep 17 00:00:00 2001 From: zhichao-aws Date: Sun, 28 Apr 2024 13:56:09 +0800 Subject: [PATCH 04/12] fix Signed-off-by: zhichao-aws --- _search-plugins/neural-sparse-search.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/_search-plugins/neural-sparse-search.md b/_search-plugins/neural-sparse-search.md index 655dbd27b2..b75ed08115 100644 --- a/_search-plugins/neural-sparse-search.md +++ b/_search-plugins/neural-sparse-search.md @@ -16,7 +16,7 @@ Introduced 2.11 When selecting a model, choose one of the following options: - Use a sparse encoding model at both ingestion time and search time (high performance, relatively high latency). -- Use a sparse encoding model at ingestion time and a tokenizer at search time (relatively low performance, low latency). 
The tokenizers doesn't conduct model inference, but for consistent experience we still deploy and invoke them via ml-commons model APIs. +- Use a sparse encoding model at ingestion time and a tokenizer at search time (relatively low performance, low latency). The tokenizers doesn't conduct model inference, but for consistent experience we still deploy and invoke them using ml-commons model APIs. **PREREQUISITE**
Before using neural sparse search, make sure to set up a [pretrained sparse embedding model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/pretrained-models/#sparse-encoding-models) or your own sparse embedding model. For more information, see [Choosing a model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/integrating-ml-models/#choosing-a-model). @@ -29,7 +29,7 @@ To use neural sparse search, follow these steps: 1. [Create an ingest pipeline](#step-1-create-an-ingest-pipeline). 1. [Create an index for ingestion](#step-2-create-an-index-for-ingestion). 1. [Ingest documents into the index](#step-3-ingest-documents-into-the-index). -1. [Search the index using neural search](#step-4-search-the-index-using-neural-search). +1. [Search the index using neural search](#step-4-search-the-index-using-neural-sparse-search). ## Step 1: Create an ingest pipeline From 8092db084e43fb2d2c54153ef23819b5c4c99d31 Mon Sep 17 00:00:00 2001 From: zhichao-aws Date: Tue, 30 Apr 2024 11:11:46 +0800 Subject: [PATCH 05/12] Update _query-dsl/specialized/neural-sparse.md Co-authored-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com> Signed-off-by: zhichao-aws --- _query-dsl/specialized/neural-sparse.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/_query-dsl/specialized/neural-sparse.md b/_query-dsl/specialized/neural-sparse.md index 2651244cb9..0a51658827 100644 --- a/_query-dsl/specialized/neural-sparse.md +++ b/_query-dsl/specialized/neural-sparse.md @@ -15,7 +15,8 @@ Use the `neural_sparse` query for vector field search in [neural sparse search]( ## Request fields Include the following request fields in the `neural_sparse` query: -1. 
query by raw text: +### Example: Query by raw text + ```json "neural_sparse": { "": { From 8f6cfb5d626a5a13f0a5003f691aae2dd8cf510e Mon Sep 17 00:00:00 2001 From: zhichao-aws Date: Tue, 30 Apr 2024 11:11:54 +0800 Subject: [PATCH 06/12] Update _query-dsl/specialized/neural-sparse.md Co-authored-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com> Signed-off-by: zhichao-aws --- _query-dsl/specialized/neural-sparse.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_query-dsl/specialized/neural-sparse.md b/_query-dsl/specialized/neural-sparse.md index 0a51658827..5329e59570 100644 --- a/_query-dsl/specialized/neural-sparse.md +++ b/_query-dsl/specialized/neural-sparse.md @@ -25,7 +25,7 @@ Include the following request fields in the `neural_sparse` query: } } ``` -2. query by sparse vector: +### Example: Query by sparse vector ```json "neural_sparse": { "": { From 0c621a7620b5c41926b693320869d6bc78c7ce33 Mon Sep 17 00:00:00 2001 From: zhichao-aws Date: Tue, 30 Apr 2024 11:12:56 +0800 Subject: [PATCH 07/12] Update _query-dsl/specialized/neural-sparse.md Co-authored-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com> Signed-off-by: zhichao-aws --- _query-dsl/specialized/neural-sparse.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_query-dsl/specialized/neural-sparse.md b/_query-dsl/specialized/neural-sparse.md index 5329e59570..7b38d965bb 100644 --- a/_query-dsl/specialized/neural-sparse.md +++ b/_query-dsl/specialized/neural-sparse.md @@ -40,7 +40,7 @@ Field | Data type | Required/Optional | Description :--- | :--- | :--- `query_text` | String | Optional | The query text from which to generate sparse vector embeddings. `model_id` | String | Optional | The ID of the sparse encoding model or tokenizer model that will be used to generate vector embeddings from the query text. The model must be deployed in OpenSearch before it can be used in sparse neural search. 
For more information, see [Using custom models within OpenSearch]({{site.url}}{{site.baseurl}}/ml-commons-plugin/using-ml-models/) and [Neural sparse search]({{site.url}}{{site.baseurl}}/search-plugins/neural-sparse-search/). To set a default model id in neural sparse query, see [`neural_query_enricher`]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/neural-query-enricher/). -`query_tokens` | Map | Optional | The query tokens, also referred as sparse vector embeddings. Just like dense semantic retrieval, we can use raw sparse vectors generated by neural models or tokenizers to perform the semantic search. Either `query_text` or `query_tokens` must be provided for `neural_sparse` query. +`query_tokens` | Map | Optional | The query tokens, sometimes referred to as sparse vector embeddings. Similar to dense semantic retrieval, use raw sparse vectors generated by neural models or tokenizers to perform a semantic search query. Use either the `query_text` option for raw field vectors, or `query_tokens` option for sparse vectors, must be provided for the `neural_sparse` query to operate. `max_token_score` | Float | Optional | (Deprecated) The theoretical upper bound of the score for all tokens in the vocabulary (required for performance optimization). For OpenSearch-provided [pretrained sparse embedding models]({{site.url}}{{site.baseurl}}/ml-commons-plugin/pretrained-models/#sparse-encoding-models), we recommend setting `max_token_score` to 2 for `amazon/neural-sparse/opensearch-neural-sparse-encoding-doc-v1` and to 3.5 for `amazon/neural-sparse/opensearch-neural-sparse-encoding-v1`. This field has been deprecated as of OpenSearch 2.12. 
#### Example request From 472b990cd0cb2884af9b24bb448cec8af6dd137e Mon Sep 17 00:00:00 2001 From: zhichao-aws Date: Tue, 30 Apr 2024 11:13:07 +0800 Subject: [PATCH 08/12] Update _query-dsl/specialized/neural-sparse.md Co-authored-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com> Signed-off-by: zhichao-aws --- _query-dsl/specialized/neural-sparse.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/_query-dsl/specialized/neural-sparse.md b/_query-dsl/specialized/neural-sparse.md index 7b38d965bb..d5caae63d5 100644 --- a/_query-dsl/specialized/neural-sparse.md +++ b/_query-dsl/specialized/neural-sparse.md @@ -44,7 +44,8 @@ Field | Data type | Required/Optional | Description `max_token_score` | Float | Optional | (Deprecated) The theoretical upper bound of the score for all tokens in the vocabulary (required for performance optimization). For OpenSearch-provided [pretrained sparse embedding models]({{site.url}}{{site.baseurl}}/ml-commons-plugin/pretrained-models/#sparse-encoding-models), we recommend setting `max_token_score` to 2 for `amazon/neural-sparse/opensearch-neural-sparse-encoding-doc-v1` and to 3.5 for `amazon/neural-sparse/opensearch-neural-sparse-encoding-v1`. This field has been deprecated as of OpenSearch 2.12. #### Example request -1. 
query by raw text +**Query by raw text** + ```json GET my-nlp-index/_search { From 89ebcabf28a2c1276fc57a87919e04b8db2bd47e Mon Sep 17 00:00:00 2001 From: zhichao-aws Date: Tue, 30 Apr 2024 11:13:17 +0800 Subject: [PATCH 09/12] Update _query-dsl/specialized/neural-sparse.md Co-authored-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com> Signed-off-by: zhichao-aws --- _query-dsl/specialized/neural-sparse.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/_query-dsl/specialized/neural-sparse.md b/_query-dsl/specialized/neural-sparse.md index d5caae63d5..a6c4aa0a7b 100644 --- a/_query-dsl/specialized/neural-sparse.md +++ b/_query-dsl/specialized/neural-sparse.md @@ -59,7 +59,8 @@ GET my-nlp-index/_search } } ``` -2. query by sparse vector: +**Query by sparse vector** + ```json GET my-nlp-index/_search { From 701a4ac092a2683b4ed37dd2e880718fa618f778 Mon Sep 17 00:00:00 2001 From: zhichao-aws Date: Tue, 30 Apr 2024 11:13:56 +0800 Subject: [PATCH 10/12] Update _search-plugins/neural-sparse-search.md Co-authored-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com> Signed-off-by: zhichao-aws --- _search-plugins/neural-sparse-search.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_search-plugins/neural-sparse-search.md b/_search-plugins/neural-sparse-search.md index b75ed08115..38d343c90c 100644 --- a/_search-plugins/neural-sparse-search.md +++ b/_search-plugins/neural-sparse-search.md @@ -16,7 +16,7 @@ Introduced 2.11 When selecting a model, choose one of the following options: - Use a sparse encoding model at both ingestion time and search time (high performance, relatively high latency). -- Use a sparse encoding model at ingestion time and a tokenizer at search time (relatively low performance, low latency). The tokenizers doesn't conduct model inference, but for consistent experience we still deploy and invoke them using ml-commons model APIs. 
+- Use a sparse encoding model at ingestion time and a tokenizer at search time for relatively low performance and low latency. The tokenism doesn't conduct model inference, therefore for a more consistent experience, deploy and invoke tokenizer using the ML commons Model API. **PREREQUISITE**
Before using neural sparse search, make sure to set up a [pretrained sparse embedding model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/pretrained-models/#sparse-encoding-models) or your own sparse embedding model. For more information, see [Choosing a model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/integrating-ml-models/#choosing-a-model). From 2d3e042c6bd9eb5f727976dafc33a1e3651803f2 Mon Sep 17 00:00:00 2001 From: zhichao-aws Date: Tue, 30 Apr 2024 11:14:08 +0800 Subject: [PATCH 11/12] Update _search-plugins/neural-sparse-search.md Co-authored-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com> Signed-off-by: zhichao-aws --- _search-plugins/neural-sparse-search.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_search-plugins/neural-sparse-search.md b/_search-plugins/neural-sparse-search.md index 38d343c90c..90e64f5cec 100644 --- a/_search-plugins/neural-sparse-search.md +++ b/_search-plugins/neural-sparse-search.md @@ -148,7 +148,7 @@ Before the document is ingested into the index, the ingest pipeline runs the `sp To perform a neural sparse search on your index, use the `neural_sparse` query clause in [Query DSL]({{site.url}}{{site.baseurl}}/opensearch/query-dsl/index/) queries. 
-The following example request uses a `neural_sparse` query to search for relevant documents using raw query text: +The following example request uses a `neural_sparse` query to search for relevant documents using a raw text query: ```json GET my-nlp-index/_search From 9d32ecbb4a96a9c8dad99663d0082fc260eb50ec Mon Sep 17 00:00:00 2001 From: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com> Date: Tue, 30 Apr 2024 12:59:48 -0500 Subject: [PATCH 12/12] Apply suggestions from code review Co-authored-by: Nathan Bower Signed-off-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com> --- _query-dsl/specialized/neural-sparse.md | 6 +++--- _search-plugins/neural-sparse-search.md | 2 +- 2 files changed, 4 insertions(+), 4 deletions(-) diff --git a/_query-dsl/specialized/neural-sparse.md b/_query-dsl/specialized/neural-sparse.md index a6c4aa0a7b..47f77fa95d 100644 --- a/_query-dsl/specialized/neural-sparse.md +++ b/_query-dsl/specialized/neural-sparse.md @@ -10,7 +10,7 @@ nav_order: 55 Introduced 2.11 {: .label .label-purple } -Use the `neural_sparse` query for vector field search in [neural sparse search]({{site.url}}{{site.baseurl}}/search-plugins/neural-sparse-search/). Query can be performed using raw text or sparse vectors. +Use the `neural_sparse` query for vector field search in [neural sparse search]({{site.url}}{{site.baseurl}}/search-plugins/neural-sparse-search/). The query can use either raw text or sparse vector tokens. ## Request fields @@ -39,8 +39,8 @@ The top-level `vector_field` specifies the vector field against which to run a s Field | Data type | Required/Optional | Description :--- | :--- | :--- `query_text` | String | Optional | The query text from which to generate sparse vector embeddings. -`model_id` | String | Optional | The ID of the sparse encoding model or tokenizer model that will be used to generate vector embeddings from the query text. The model must be deployed in OpenSearch before it can be used in sparse neural search. 
For more information, see [Using custom models within OpenSearch]({{site.url}}{{site.baseurl}}/ml-commons-plugin/using-ml-models/) and [Neural sparse search]({{site.url}}{{site.baseurl}}/search-plugins/neural-sparse-search/). To set a default model id in neural sparse query, see [`neural_query_enricher`]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/neural-query-enricher/). +`model_id` | String | Optional | The ID of the sparse encoding model or tokenizer model that will be used to generate vector embeddings from the query text. The model must be deployed in OpenSearch before it can be used in sparse neural search. For more information, see [Using custom models within OpenSearch]({{site.url}}{{site.baseurl}}/ml-commons-plugin/using-ml-models/) and [Neural sparse search]({{site.url}}{{site.baseurl}}/search-plugins/neural-sparse-search/). For information on setting a default model ID in a neural sparse query, see [`neural_query_enricher`]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/neural-query-enricher/). `query_tokens` | Map | Optional | The query tokens, sometimes referred to as sparse vector embeddings. Similarly to dense semantic retrieval, you can use raw sparse vectors generated by neural models or tokenizers to perform a semantic search query. Use either the `query_text` option for raw text or the `query_tokens` option for sparse vectors. One of the two must be provided in order for the `neural_sparse` query to operate.
`max_token_score` | Float | Optional | (Deprecated) The theoretical upper bound of the score for all tokens in the vocabulary (required for performance optimization). For OpenSearch-provided [pretrained sparse embedding models]({{site.url}}{{site.baseurl}}/ml-commons-plugin/pretrained-models/#sparse-encoding-models), we recommend setting `max_token_score` to 2 for `amazon/neural-sparse/opensearch-neural-sparse-encoding-doc-v1` and to 3.5 for `amazon/neural-sparse/opensearch-neural-sparse-encoding-v1`. This field has been deprecated as of OpenSearch 2.12. #### Example request diff --git a/_search-plugins/neural-sparse-search.md b/_search-plugins/neural-sparse-search.md index 90e64f5cec..fd86b3f6b0 100644 --- a/_search-plugins/neural-sparse-search.md +++ b/_search-plugins/neural-sparse-search.md @@ -16,7 +16,7 @@ Introduced 2.11 When selecting a model, choose one of the following options: - Use a sparse encoding model at both ingestion time and search time (high performance, relatively high latency). -- Use a sparse encoding model at ingestion time and a tokenizer at search time for relatively low performance and low latency. The tokenism doesn't conduct model inference, therefore for a more consistent experience, deploy and invoke tokenizer using the ML commons Model API. +- Use a sparse encoding model at ingestion time and a tokenizer at search time for relatively low performance and low latency. The tokenizer doesn't conduct model inference, so you can deploy and invoke a tokenizer using the ML Commons Model API for a more consistent experience. **PREREQUISITE**
Before using neural sparse search, make sure to set up a [pretrained sparse embedding model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/pretrained-models/#sparse-encoding-models) or your own sparse embedding model. For more information, see [Choosing a model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/integrating-ml-models/#choosing-a-model).
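The `query_tokens` examples added by these patches can be exercised from a client. The following is a minimal sketch, not part of the patch series itself: it assumes the Python `requests` library, a local cluster at the default port, and the `my-nlp-index`/`passage_embedding` names used in the examples above.

```python
# Sketch: issuing a `neural_sparse` query with raw sparse vector tokens,
# following the "query by sparse vector" example from the patches above.
# The cluster URL, index name, and token weights are illustrative assumptions.
import json


def build_neural_sparse_query(field: str, query_tokens: dict) -> dict:
    """Build a search body using the `query_tokens` option of `neural_sparse`."""
    return {"query": {"neural_sparse": {field: {"query_tokens": query_tokens}}}}


tokens = {"hi": 4.338913, "hello": 3.3210192, "mars": 1.7405145}
body = build_neural_sparse_query("passage_embedding", tokens)
print(json.dumps(body))

# Sending the request could look like the following (requires a running cluster):
# import requests
# resp = requests.get("http://localhost:9200/my-nlp-index/_search", json=body)
# resp.raise_for_status()
```

Because `query_tokens` skips query-time model inference entirely, a precomputed token map like this avoids any `model_id` round trip at search time.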