
Doc review
Signed-off-by: Fanit Kolchina <[email protected]>
kolchfa-aws committed Dec 5, 2024
1 parent a331379 commit 2daa567
Showing 1 changed file with 8 additions and 8 deletions.
16 changes: 8 additions & 8 deletions _analyzers/tokenizers/ngram.md

# N-gram tokenizer

-The `ngram` tokenizer split text into overlapping n-grams (sequences of characters) of a specified length. This tokenizer is particularly useful when you want to perform partial word matching or autocomplete search functionality, as it generates substrings (character n-grams) of the original input text.
+The `ngram` tokenizer splits text into overlapping n-grams (sequences of characters) of a specified length. This tokenizer is particularly useful when you want to perform partial word matching or autocomplete search functionality because it generates substrings (character n-grams) of the original input text.
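For example, with the default `min_gram` of `1` and `max_gram` of `2`, the word `cat` yields the grams `c`, `ca`, `a`, `at`, and `t`.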

## Example usage

-The following example request creates a new index named `my_index` and configures an analyzer with `ngram` tokenizer:
+The following example request creates a new index named `my_index` and configures an analyzer with an `ngram` tokenizer:

```json
PUT /my_index
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_ngram_analyzer": {
          "type": "custom",
          "tokenizer": "my_ngram_tokenizer"
        }
      },
      "tokenizer": {
        "my_ngram_tokenizer": {
          "type": "ngram",
          "min_gram": 3,
          "max_gram": 4,
          "token_chars": ["letter", "digit"]
        }
      }
    }
  }
}
```
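In this sketch, `my_ngram_analyzer` and `my_ngram_tokenizer` are illustrative names; `min_gram: 3` with `max_gram: 4` keeps the gram-length spread at `1`, the default allowed by `index.max_ngram_diff`, and `token_chars` limits grams to letters and digits.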

## Generated tokens

-Use the following request to examine the tokens generated using the created analyzer:
+Use the following request to examine the tokens generated using the analyzer:

```json
POST /my_index/_analyze
{
  "analyzer": "my_ngram_analyzer",
  "text": "Search"
}
```

The response contains the generated tokens:

```json
{
  "tokens": [
    { "token": "Sea", "start_offset": 0, "end_offset": 3, "type": "word", "position": 0 },
    { "token": "Sear", "start_offset": 0, "end_offset": 4, "type": "word", "position": 1 },
    { "token": "ear", "start_offset": 1, "end_offset": 4, "type": "word", "position": 2 },
    { "token": "earc", "start_offset": 1, "end_offset": 5, "type": "word", "position": 3 },
    { "token": "arc", "start_offset": 2, "end_offset": 5, "type": "word", "position": 4 },
    { "token": "arch", "start_offset": 2, "end_offset": 6, "type": "word", "position": 5 },
    { "token": "rch", "start_offset": 3, "end_offset": 6, "type": "word", "position": 6 }
  ]
}
```
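The sample text `Search` is assumed here for illustration. With `min_gram: 3` and `max_gram: 4`, the tokenizer emits every 3- and 4-character substring of the input, ordered by starting offset, which is what the token list above shows.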

-## Configuration
+## Parameters

The `ngram` tokenizer can be configured with the following parameters.

Parameter | Required/Optional | Data type | Description
:--- | :--- | :--- | :---
`min_gram` | Optional | Integer | Minimum length of n-grams. Default is `1`.
`max_gram` | Optional | Integer | Maximum length of n-grams. Default is `2`.
-`token_chars` | Optional | List of strings | Character classes to be included in tokenization. The following are the possible options:<br>- `letter`<br>- `digit`<br>- `whitespace`<br>- `punctuation`<br>- `symbol`<br>- `custom` (Parameter `custom_token_chars` needs to also be configured in this case)<br>Default is empty list (`[]`) which retains all the characters
-`custom_token_chars` | Optional | String | Custom characters that will be included as part of the tokens.
+`token_chars` | Optional | List of strings | Character classes to be included in tokenization. Valid values are:<br>- `letter`<br>- `digit`<br>- `whitespace`<br>- `punctuation`<br>- `symbol`<br>- `custom` (You must also specify the `custom_token_chars` parameter)<br>Default is an empty list (`[]`), which retains all characters.
+`custom_token_chars` | Optional | String | Custom characters to be included as part of the tokens.
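
For example, the following request (an illustrative sketch; the index and tokenizer names are placeholders) keeps letters, digits, and the hyphen in generated grams by combining the `custom` character class with `custom_token_chars`:

```json
PUT /my_hyphen_index
{
  "settings": {
    "analysis": {
      "tokenizer": {
        "hyphen_ngram_tokenizer": {
          "type": "ngram",
          "min_gram": 2,
          "max_gram": 3,
          "token_chars": ["letter", "digit", "custom"],
          "custom_token_chars": "-"
        }
      }
    }
  }
}
```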

### Maximum difference between `min_gram` and `max_gram`

-The maximum difference between `min_gram` and `max_gram` is configured using index level setting `index.max_ngram_diff` and defaults to `1`.
+The maximum difference between `min_gram` and `max_gram` is configured using the index-level `index.max_ngram_diff` setting and defaults to `1`.

-The following example creates index with custom `index.max_ngram_diff` setting:
+The following example creates an index with a custom `index.max_ngram_diff` setting:

```json
PUT /my-index
{
  "settings": {
    "index.max_ngram_diff": 2
  }
}
```
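The value `2` is an assumed example: any value greater than the default `1` widens the permitted spread, allowing, for instance, a tokenizer with `min_gram: 3` and `max_gram: 5` on this index.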
