
Add on disk 4x compression with Faiss #2425

Open
wants to merge 9 commits into 2.x from faiss_ondisk_4x

Conversation

naveentatikonda (Member)

Description

Add on-disk 4x compression with Faiss, which accepts fp32 vectors as input and dynamically quantizes them into byte-sized vectors using the Faiss SQ8 quantizer.
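For context, 8-bit scalar quantization maps each fp32 component (4 bytes) to a single byte code, which is where the 4x on-disk compression comes from. Below is a minimal illustrative sketch of the idea, not the actual Faiss SQ8 implementation; the class and method names are made up:

```java
// Illustrative sketch of 8-bit scalar quantization (NOT the actual Faiss code).
// Training finds a per-dimension [min, max] range; quantization maps each
// fp32 component to an unsigned 8-bit code in [0, 255] within that range.
public class Sq8Sketch {

    // Returns {min[], max[]} computed over the training vectors.
    public static float[][] train(float[][] vectors, int dim) {
        float[] min = new float[dim];
        float[] max = new float[dim];
        java.util.Arrays.fill(min, Float.POSITIVE_INFINITY);
        java.util.Arrays.fill(max, Float.NEGATIVE_INFINITY);
        for (float[] v : vectors) {
            for (int d = 0; d < dim; d++) {
                min[d] = Math.min(min[d], v[d]);
                max[d] = Math.max(max[d], v[d]);
            }
        }
        return new float[][] { min, max };
    }

    // int[] is used instead of byte[] here to avoid signed-byte confusion
    // in the example; a real implementation would store raw bytes.
    public static int[] quantize(float[] v, float[] min, float[] max) {
        int[] codes = new int[v.length];
        for (int d = 0; d < v.length; d++) {
            float range = max[d] - min[d];
            float x = range == 0 ? 0f : (v[d] - min[d]) / range;
            codes[d] = Math.min(255, Math.max(0, Math.round(x * 255f)));
        }
        return codes;
    }
}
```

Each 4-byte float becomes one 1-byte code, so the quantized vectors occupy a quarter of the original space on disk.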

Related Issues

Resolves #1723

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on the Developer Certificate of Origin and signing off your commits, please check here.

@naveentatikonda added the Features (Introduces a new unit of functionality that satisfies a requirement) and backport 2.x labels on Jan 23, 2025
@naveentatikonda force-pushed the faiss_ondisk_4x branch 2 times, most recently from 6721604 to 7739606 on January 23, 2025 23:03
@naveentatikonda changed the base branch from main to 2.x on January 23, 2025 23:03
@shatejas (Collaborator) left a comment:

Reviewed partially

jni/src/faiss_index_service.cpp (outdated, resolved)
@@ -155,6 +155,47 @@ void IndexService::writeIndex(
}
}

jlong IndexService::initIndexFromTemplate(
Collaborator

The only difference between this and initIndex is the index creation call to Faiss; can we abstract out the shared logic and reuse the rest, please? You could pass in the pointer returned by Faiss to reuse the logic, and set the index unique pointer there.

Member Author

Refactored as discussed, and also validated that the index is getting deleted.

jni/src/faiss_index_service.cpp (resolved)
src/main/java/org/opensearch/knn/index/KNNIndexShard.java (outdated, resolved)
Signed-off-by: Naveen Tatikonda <[email protected]>
@naveentatikonda force-pushed the faiss_ondisk_4x branch 2 times, most recently from 18c4a8a to 4449fde on January 25, 2025 17:05
@naveentatikonda (Member Author) commented on Jan 27, 2025

Update: after reviewing the benchmarks and discussing with the other k-NN maintainers, we are putting this feature on hold for the 2.19 release. We will run more benchmarking tests, compare the results with Lucene (with the rescoring POC), and then decide whether we need to switch the default engine for on_disk 4x compression.

Comment on lines 263 to 265
if ((quantizationParams.getTypeIdentifier()).equals(
ScalarQuantizationParams.generateTypeIdentifier(ScalarQuantizationType.EIGHT_BIT)
)) {
Collaborator

To ensure an NPE doesn't occur when quantizationParams.getTypeIdentifier() == null, we should reverse the check:

ScalarQuantizationParams.generateTypeIdentifier(ScalarQuantizationType.EIGHT_BIT).equals(quantizationParams.getTypeIdentifier())
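The suggested reversal is the standard constant-first equals idiom; `java.util.Objects.equals` is an equivalent null-safe alternative. A minimal sketch, using an illustrative identifier constant rather than the real type-identifier format:

```java
import java.util.Objects;

// Demonstrates the null-safe equality idioms suggested in the review.
// EIGHT_BIT_ID is an illustrative stand-in for the generated type identifier.
public class NullSafeEquals {
    static final String EIGHT_BIT_ID = "scalar/8";

    // Calling equals on the known non-null constant cannot throw an NPE,
    // even when typeIdentifier is null.
    public static boolean isEightBit(String typeIdentifier) {
        return EIGHT_BIT_ID.equals(typeIdentifier);
    }

    // Objects.equals handles null on either side.
    public static boolean isEightBitAlt(String typeIdentifier) {
        return Objects.equals(EIGHT_BIT_ID, typeIdentifier);
    }
}
```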

Comment on lines 266 to 271
quantizationState = quantizationService.train(quantizationParams, knnVectorValues, totalLiveDocs, fieldInfo);
} else {
initQuantizationStateWriterIfNecessary();
quantizationState = quantizationService.train(quantizationParams, knnVectorValues, totalLiveDocs, fieldInfo);
quantizationStateWriter.writeState(fieldInfo.getFieldNumber(), quantizationState);
}
Collaborator

I think this whole logic can be simplified: create the quantization state first, and then just check whether the writer needs to be initialized and whether it needs to write the state.
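A hypothetical sketch of that simplification, with made-up names standing in for the quantization service and state writer: the train call happens exactly once, and only the persistence step is conditional:

```java
// Hypothetical sketch of the suggested refactor (names are illustrative,
// not the actual OpenSearch k-NN API).
public class TrainThenWrite {

    // Stand-in for the quantization state writer dependency.
    interface StateWriter {
        void writeState(int fieldNumber, String state);
    }

    // Stand-in for quantizationService.train(...).
    static String train() {
        return "trained-state";
    }

    public static String buildState(boolean persist, int fieldNumber, StateWriter writer) {
        // Create the quantization state exactly once, regardless of branch.
        String state = train();
        // Only the persistence step is conditional.
        if (persist) {
            writer.writeState(fieldNumber, state);
        }
        return state;
    }
}
```

This removes the duplicated train call from both branches of the original if/else.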

return (indexInfo.getQuantizationState() instanceof ByteScalarQuantizationState);
}

private byte[] getIndexTemplate(BuildIndexParams indexInfo) {
Collaborator

Nitpick: make the parameter final for all of these functions.

@@ -58,7 +59,7 @@ static IndexBuildSetup prepareIndexBuild(KNNVectorValues<?> knnVectorValues, Bui
int bytesPerVector;
int dimensions;

if (quantizationState != null) {
if (quantizationState != null && !(quantizationState instanceof ByteScalarQuantizationState)) {
Collaborator

This needs Javadoc. Also, this kind of instanceof check makes me nervous; can we think of an alternative here?
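One possible alternative to the instanceof check (purely a sketch; these names are not the actual k-NN classes) is to make bytes-per-vector a polymorphic property of the state itself, so callers never need to know the concrete state type:

```java
// Hypothetical sketch: each quantization state reports its own on-disk
// vector width, replacing the instanceof dispatch in prepareIndexBuild.
public class StatePolymorphism {

    interface QuantizationState {
        int bytesPerVector(int dimensions);
    }

    // 1-bit quantization packs 8 dimensions per byte.
    static class OneBitState implements QuantizationState {
        public int bytesPerVector(int dimensions) {
            return (dimensions + 7) / 8;
        }
    }

    // Byte (SQ8) quantization uses one byte per dimension.
    static class ByteState implements QuantizationState {
        public int bytesPerVector(int dimensions) {
            return dimensions;
        }
    }

    public static int resolveBytesPerVector(QuantizationState state, int dimensions) {
        // No quantization: full fp32 width (4 bytes per dimension).
        return state == null ? dimensions * 4 : state.bytesPerVector(dimensions);
    }
}
```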

Comment on lines +124 to +126
public QuantizationState train(final TrainingRequest<float[]> trainingRequest, final FieldInfo fieldInfo) throws IOException {
return null;
}
Collaborator

same as above

Comment on lines +69 to +71
public QuantizationState train(final TrainingRequest<float[]> trainingRequest, final FieldInfo fieldInfo) throws IOException {
return null;
}
Collaborator

same as above

@@ -31,6 +32,8 @@ public interface Quantizer<T, R> {
*/
QuantizationState train(TrainingRequest<T> trainingRequest) throws IOException;

QuantizationState train(TrainingRequest<T> trainingRequest, FieldInfo fieldInfo) throws IOException;
Collaborator

Please add Javadoc, and also explain why we need this function. I thought we were keeping the quantization framework free of fieldInfo and other index-level dependencies.

import static org.opensearch.knn.common.FieldInfoExtractor.extractVectorDataType;
import static org.opensearch.knn.index.codec.transfer.OffHeapVectorTransferFactory.getVectorTransfer;

public class ByteScalarQuantizer implements Quantizer<float[], byte[]> {
Collaborator

Please add Javadoc to all of your new classes.

Comment on lines +60 to +71
if (sampledIndices.length == 0) {
return null;
}
Collaborator

Should we add some logging here?

Labels
backport main; Features (Introduces a new unit of functionality that satisfies a requirement)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[FEATURE] Faiss online training to quantize fp32 vectors as int8 vectors
4 participants