Inconsistent results returned for the same query #268

Open
ministat opened this issue Mar 14, 2022 · 6 comments

@ministat

Milvus cluster: v2.0.1
milvus-sdk-java: 2.0.4


    public static void runQuickSearch(String collectionName) {
        DescIndexResponseWrapper.IndexDesc indexDesc =
                describeIndexInfo(collectionName)
                        .getIndexDescByFieldName(PROPERTIES.getProperty("VECTOR_FIELD"));
        int topK = 20;
        int nq = 5;
        final List<List<Float>> vectors = new ArrayList<>();
        List<Long> ids = new ArrayList<>();
        for (int i = 0; i < nq; i++) {
            vectors.add(QUERY_EMBEDDINGS.get(i));
            ids.add((long)i);
        }

        final String SEARCH_PARAM = "{\"nprobe\":64}";
        SearchParam searchParam = SearchParam.newBuilder()
                .withCollectionName(collectionName)
                .withTopK(topK)
                .withMetricType(indexDesc.getMetricType())
                .withVectors(vectors)
                .withVectorFieldName(PROPERTIES.getProperty("VECTOR_FIELD"))
                .withParams(SEARCH_PARAM)
                .build();
        long begin = System.currentTimeMillis();
        R<SearchResults> response = milvusClient.search(searchParam);
        long end = System.currentTimeMillis();
        long cost = end - begin;
        System.out.println("Search time cost: " + cost + "ms");
        handleResponseStatus(response);
        SearchResultsWrapper wrapper = new SearchResultsWrapper(response.getData().getResults());
        List<List<Long>> results = new ArrayList<>();
        for (int i = 0; i < vectors.size(); ++i) {
            System.out.println("Search result of No." + i);
            List<SearchResultsWrapper.IDScore> scores = wrapper.getIDScore(i);
            System.out.println(scores);
            List<Long> result = new ArrayList<>();
            for (SearchResultsWrapper.IDScore score : scores) {
                result.add(score.getLongID());
            }
            if (!result.isEmpty()) {
                results.add(result);
            }
        }
    }

Running the same search twice returns different results for query No.0. The first ID is 87356234 in one run and 504814 in the other:
...
Search time cost: 369ms
Search result of No.0
[(ID: 87356234 Score: 0.94079435), (ID: 344333 Score: 0.9679534), (ID: 64429592 Score: 0.9938071), (ID: 8430679 Score: 1.0020553), (ID: 40642738 Score: 1.030692), (ID: 18530806 Score: 1.0386481), (ID: 80468484 Score: 1.0398453), (ID: 22664991 Score: 1.0444247), (ID: 53179241 Score: 1.0483937), (ID: 21192937 Score: 1.0600007), (ID: 88762980 Score: 1.0619603), (ID: 14925721 Score: 1.0722969), (ID: 99117874 Score: 1.080706), (ID: 13934280 Score: 1.0825374), (ID: 72027288 Score: 1.0837624), (ID: 97814940 Score: 1.084217), (ID: 13105118 Score: 1.090414), (ID: 97950492 Score: 1.0904853), (ID: 60066306 Score: 1.091538), (ID: 18760679 Score: 1.0933377)]
...

Search result of No.0
[(ID: 504814 Score: 0.9377404), (ID: 87356234 Score: 0.94079435), (ID: 344333 Score: 0.9679534), (ID: 28539420 Score: 0.98356724), (ID: 64429592 Score: 0.9938071), (ID: 8430679 Score: 1.0020553), (ID: 40642738 Score: 1.030692), (ID: 9295796 Score: 1.0313499), (ID: 18530806 Score: 1.0386481), (ID: 80468484 Score: 1.0398453), (ID: 22664991 Score: 1.0444247), (ID: 53179241 Score: 1.0483937), (ID: 21192937 Score: 1.0600007), (ID: 88762980 Score: 1.0619603), (ID: 82631435 Score: 1.0677121), (ID: 14925721 Score: 1.0722969), (ID: 95601336 Score: 1.0781072), (ID: 99117874 Score: 1.080706), (ID: 21686232 Score: 1.0807402), (ID: 61457390 Score: 1.0821154)]
...
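
To see exactly where the two runs disagree, the ID lists can be diffed. The following is a minimal sketch, not part of milvus-sdk-java: `ResultDiff` and `onlyIn` are hypothetical names, and the two lists are the No.0 IDs copied from the outputs above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for comparing two top-K ID lists; not part of milvus-sdk-java.
class ResultDiff {
    // IDs of query No.0 from the first output above.
    static final List<Long> RUN_1 = Arrays.asList(
            87356234L, 344333L, 64429592L, 8430679L, 40642738L,
            18530806L, 80468484L, 22664991L, 53179241L, 21192937L,
            88762980L, 14925721L, 99117874L, 13934280L, 72027288L,
            97814940L, 13105118L, 97950492L, 60066306L, 18760679L);

    // IDs of query No.0 from the second output above.
    static final List<Long> RUN_2 = Arrays.asList(
            504814L, 87356234L, 344333L, 28539420L, 64429592L,
            8430679L, 40642738L, 9295796L, 18530806L, 80468484L,
            22664991L, 53179241L, 21192937L, 88762980L, 82631435L,
            14925721L, 95601336L, 99117874L, 21686232L, 61457390L);

    /** Returns the IDs of {@code a} that are absent from {@code b}, in the order of {@code a}. */
    static List<Long> onlyIn(List<Long> a, List<Long> b) {
        Set<Long> other = new HashSet<>(b);
        List<Long> out = new ArrayList<>();
        for (Long id : a) {
            if (!other.contains(id)) {
                out.add(id);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println("Only in run 1: " + onlyIn(RUN_1, RUN_2));
        System.out.println("Only in run 2: " + onlyIn(RUN_2, RUN_1));
    }
}
```

For these two lists, each run contains seven IDs that the other lacks. The seven IDs unique to the first run all have scores of 1.0825374 or higher, which is above the last score of the second run (1.0821154), so the first run looks like the second run with seven closer candidates missing. That pattern might mean the first search did not cover all of the data, but this is only a guess from the output.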

@yhmo
Contributor

yhmo commented Mar 14, 2022

Do you mean that the first search returns [(ID: 87356234 Score: 0.94079435), (ID: 344333 Score: 0.9679534), ...... ], while the second search returns [(ID: 504814 Score: 0.9377404), (ID: 87356234 Score: 0.94079435), (ID: 344333 Score: 0.9679534), ...... ]?

Were there any operations (a delete, for example) between the two search requests?

@ministat
Author

Yes, as you can see, the results differ. There were no other operations between the two searches. The data set is the 100m one used in milvus_bootcamp.

@yhmo
Contributor

yhmo commented Mar 17, 2022

Could you show me the "SEARCH_PARAM"?
Is the index IVF_FLAT? What is the value of "nlist"? What is the value of "nprobe" in the "SEARCH_PARAM"?

@yhmo
Contributor

yhmo commented Mar 17, 2022

Is it possible to provide reproducible steps so that we can debug into the source code?

@ministat
Author

SEARCH_PARAM is {"nprobe":64}, so nprobe is 64. The index is IVF_SQ8.
The dataset is the 100m one, and I followed these steps to create the index: https://github.com/milvus-io/bootcamp/blob/master/benchmark_test/lab2_sift1b_100m.md

I have created a Java program that reproduces this issue and sent it to the Milvus community. I hope you will receive it. Please run it as follows:

java -jar target/milvus-benchmark-1.0-SNAPSHOT.jar -a QUICKSEARCH -c ann_100m_sq8

Search time cost: 477ms
Search result of No.0
[(ID: 87356234 Score: 0.94079435), (ID: 344333 Score: 0.9679534), (ID: 64429592 Score: 0.9938071), (ID: 8430679 Score: 1.0020553), (ID: 40642738 Score: 1.030692), (ID: 18530806 Score: 1.0386481), (ID: 80468484 Score: 1.0398453), (ID: 22664991 Score: 1.0444247), (ID: 53179241 Score: 1.0483937), (ID: 21192937 Score: 1.0600007), (ID: 88762980 Score: 1.0619603), (ID: 14925721 Score: 1.0722969), (ID: 99117874 Score: 1.080706), (ID: 13934280 Score: 1.0825374), (ID: 72027288 Score: 1.0837624), (ID: 97814940 Score: 1.084217), (ID: 13105118 Score: 1.090414), (ID: 97950492 Score: 1.0904853), (ID: 60066306 Score: 1.091538), (ID: 18760679 Score: 1.0933377)]
……
Search time cost: 334ms
Search result of No.0
[(ID: 504814 Score: 0.9377404), (ID: 87356234 Score: 0.94079435), (ID: 344333 Score: 0.9679534), (ID: 28539420 Score: 0.98356724), (ID: 64429592 Score: 0.9938071), (ID: 8430679 Score: 1.0020553), (ID: 40642738 Score: 1.030692), (ID: 9295796 Score: 1.0313499), (ID: 18530806 Score: 1.0386481), (ID: 80468484 Score: 1.0398453), (ID: 22664991 Score: 1.0444247), (ID: 53179241 Score: 1.0483937), (ID: 21192937 Score: 1.0600007), (ID: 88762980 Score: 1.0619603), (ID: 82631435 Score: 1.0677121), (ID: 14925721 Score: 1.0722969), (ID: 95601336 Score: 1.0781072), (ID: 99117874 Score: 1.080706), (ID: 21686232 Score: 1.0807402), (ID: 61457390 Score: 1.0821154)]
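
A check like this can also be automated by running the same query several times and reporting the first run that disagrees with the first result. The sketch below is a hypothetical harness, not code from the reproduction program: `RepeatCheck` and `firstDivergentRun` are made-up names, and the `Supplier` stands in for a call such as `milvusClient.search(searchParam)` followed by extracting the IDs of one query.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical harness for detecting unstable search results; not part of milvus-sdk-java.
class RepeatCheck {
    /**
     * Calls {@code search} {@code runs} times and compares each result with the first one.
     * Returns the index of the first run whose ID list differs from run 0, or -1 if all runs agree.
     */
    static int firstDivergentRun(Supplier<List<Long>> search, int runs) {
        List<Long> baseline = search.get();
        for (int run = 1; run < runs; run++) {
            if (!baseline.equals(search.get())) {
                return run;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Canned results standing in for three real searches (assumption for illustration).
        Iterator<List<Long>> canned = Arrays.asList(
                Arrays.asList(87356234L, 344333L),
                Arrays.asList(87356234L, 344333L),
                Arrays.asList(504814L, 87356234L)).iterator();
        System.out.println("First divergent run: " + firstDivergentRun(canned::next, 3));
    }
}
```

The comparison uses `List.equals`, so it is order-sensitive: a run that returns the same IDs in a different order also counts as divergent.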

@ministat
Author

I have uploaded my Java program to: https://github.com/ministat/milvus-unstable-results. Please check whether you can reproduce this issue.
