Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for search using the "fields" parameter with knn_vector field #2314

Merged
merged 11 commits into from
Jan 14, 2025

Conversation

e-emoto
Copy link
Contributor

@e-emoto e-emoto commented Dec 7, 2024

Description

This change implements valueFetcher in KNNVectorFieldType so search can support using the "fields" parameter when used with the knn_vector field type. ITs for this change were added to OpenSearchIT after the change was tested manually.

Manual Testing Output

curl -X PUT "localhost:9200/test_index" -H 'Content-Type: application/json' -d' 
{
  "settings" : {
    "number_of_shards" : 1,
    "number_of_replicas" : 0,
    "index.knn": true
  },
  "mappings": {
       "properties": {
       "test_field1": {
           "type": "knn_vector",
           "dimension": 2,
           "method": {
               "name":"hnsw",
               "engine":"nmslib",
               "space_type": "l2",
               "parameters":{
                 "m":20,
                 "ef_construction": 23
               }
           }
      },"test_field2": {
           "type": "knn_vector",
           "dimension": 3,
           "method": {
               "name":"hnsw",
               "engine":"nmslib",
               "space_type": "l2",
               "parameters":{
                 "m":20,
                 "ef_construction": 23
               }
           }
      },"test_field3": {
           "type": "text"
      }
   }
  }
}
'

curl -X PUT "localhost:9200/_bulk" -H 'Content-Type: application/json' -d'
{ "index": { "_index": "test_index" } }
{ "test_field1": [1.5, 5.5], "test_field2": [1.5, 5.5, 3.5], "test_field3": "text 1"}
{ "index": { "_index": "test_index" } }
{ "test_field1": [2.5, 3.5], "test_field2": [2.5, 3.5, 7.5], "test_field3": "text 2"}
{ "index": { "_index": "test_index" } }
{ "test_field1": [4.5, 5.5], "test_field2": [4.5, 5.5, 1.5]}
{ "index": { "_index": "test_index" } }
{ "test_field1": [1.5, 5.5], "test_field2": [1.5, 5.5, 2.5]}
{ "index": { "_index": "test_index" } }
{ "test_field3": "text 3"}
{ "index": { "_index": "test_index" } }
{ "test_field3": "text 4"}
'

curl -X GET "localhost:9200/test_index/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "fields": ["test_field1"],
  "query": {
    "knn": {
      "test_field1": {
        "vector": [2.0, 4.0],
        "k": 10
      }
    }
  }
}
'

{
  "took" : 4,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 4,
      "relation" : "eq"
    },
    "max_score" : 0.6666667,
    "hits" : [
      {
        "_index" : "test_index",
        "_id" : "6lOiwpMB0RzOL9-Fc3K5",
        "_score" : 0.6666667,
        "_source" : {
          "test_field1" : [
            2.5,
            3.5
          ],
          "test_field2" : [
            2.5,
            3.5,
            7.5
          ],
          "test_field3" : "text 2"
        },
        "fields" : {
          "test_field1" : [
            2.5,
            3.5
          ]
        }
      },
      {
        "_index" : "test_index",
        "_id" : "6VOiwpMB0RzOL9-Fc3K5",
        "_score" : 0.2857143,
        "_source" : {
          "test_field1" : [
            1.5,
            5.5
          ],
          "test_field2" : [
            1.5,
            5.5,
            3.5
          ],
          "test_field3" : "text 1"
        },
        "fields" : {
          "test_field1" : [
            1.5,
            5.5
          ]
        }
      },
      {
        "_index" : "test_index",
        "_id" : "7FOiwpMB0RzOL9-Fc3K5",
        "_score" : 0.2857143,
        "_source" : {
          "test_field1" : [
            1.5,
            5.5
          ],
          "test_field2" : [
            1.5,
            5.5,
            2.5
          ]
        },
        "fields" : {
          "test_field1" : [
            1.5,
            5.5
          ]
        }
      },
      {
        "_index" : "test_index",
        "_id" : "61OiwpMB0RzOL9-Fc3K5",
        "_score" : 0.10526316,
        "_source" : {
          "test_field1" : [
            4.5,
            5.5
          ],
          "test_field2" : [
            4.5,
            5.5,
            1.5
          ]
        },
        "fields" : {
          "test_field1" : [
            4.5,
            5.5
          ]
        }
      }
    ]
  }
}

Related Issues

Resolves #1633

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Ethan Emoto <eemoto@amazon.com>
@navneet1v
Copy link
Collaborator

@e-emoto please fix the failing CIs

e-emoto and others added 2 commits December 9, 2024 10:07
Signed-off-by: Ethan Emoto <eemoto@amazon.com>
Signed-off-by: Ethan Emoto <eemoto@amazon.com>
Signed-off-by: Ethan Emoto <eemoto@amazon.com>
Signed-off-by: Ethan Emoto <eemoto@amazon.com>
@navneet1v
Copy link
Collaborator

ITs for this change were added to OpenSearchIT after the change was tested manually.

can you add the manual testing result on the PR?

e-emoto and others added 3 commits December 13, 2024 10:49
Signed-off-by: Ethan Emoto <eemoto@amazon.com>
Signed-off-by: Ethan Emoto <eemoto@amazon.com>
Signed-off-by: Ethan Emoto <emoto.ethan@gmail.com>
Signed-off-by: Ethan Emoto <eemoto@amazon.com>
Signed-off-by: Ethan Emoto <emoto.ethan@gmail.com>
Signed-off-by: Ethan Emoto <eemoto@amazon.com>
Copy link
Member

@jmazanec15 jmazanec15 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM thanks @e-emoto !

@jmazanec15 jmazanec15 merged commit 5c63fde into opensearch-project:main Jan 14, 2025
33 checks passed
opensearch-trigger-bot bot pushed a commit that referenced this pull request Jan 14, 2025
…ield (#2314)

Signed-off-by: Ethan Emoto <eemoto@amazon.com>
(cherry picked from commit 5c63fde)
e-emoto added a commit to e-emoto/k-NN that referenced this pull request Jan 21, 2025
…ield (opensearch-project#2314)

Signed-off-by: Ethan Emoto <eemoto@amazon.com>
(cherry picked from commit 5c63fde)
e-emoto added a commit to e-emoto/k-NN that referenced this pull request Jan 21, 2025
…ield (opensearch-project#2314)

Signed-off-by: Ethan Emoto <emoto.ethan@gmail.com>
Signed-off-by: Ethan Emoto <eemoto@amazon.com>
(cherry picked from commit 5c63fde)
@e-emoto
Copy link
Contributor Author

e-emoto commented Jan 21, 2025

Created backport PR manually to fix issue with sign-off email:
#2411

navneet1v pushed a commit that referenced this pull request Jan 23, 2025
…th knn_vector field (#2411)

* Add support for search using the "fields" parameter with knn_vector field (#2314)

Signed-off-by: Ethan Emoto <emoto.ethan@gmail.com>
Signed-off-by: Ethan Emoto <eemoto@amazon.com>
(cherry picked from commit 5c63fde)

* Fix NPE in backporting fields parameter ITs

Signed-off-by: Ethan Emoto <emoto.ethan@gmail.com>

* Mute testKNNWarmupCustomLegacyFieldMapping (#2416)

Signed-off-by: Kunal Kotwani <kkotwani@amazon.com>

---------

Signed-off-by: Ethan Emoto <emoto.ethan@gmail.com>
Signed-off-by: Kunal Kotwani <kkotwani@amazon.com>
Co-authored-by: Kunal Kotwani <kkotwani@amazon.com>
e-emoto added a commit to e-emoto/k-NN that referenced this pull request Jan 23, 2025
…ield (opensearch-project#2314)

Signed-off-by: Ethan Emoto <emoto.ethan@gmail.com>
Signed-off-by: Ethan Emoto <eemoto@amazon.com>
(cherry picked from commit 5c63fde)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] Indices that include knn_vector field fail when search includes "fields" parameter
6 participants