from colbert.infra import Run, RunConfig, ColBERTConfig
from colbert import Indexer

if __name__ == '__main__':
    with Run().context(RunConfig(nranks=4, experiment="msmarco")):
        config = ColBERTConfig(
            nbits=16,
            root=".",
        )
        indexer = Indexer(checkpoint="./colbertv2.0", config=config)
        indexer.index(name="msmarco.nbits=16", collection="./collection.tsv")
If I set nbits to 16 and run the indexer, I get the following error:
Clustering 35121672 points in 128D to 262144 clusters, redo 1 times, 20 iterations
Preprocessing in 9.63 s
Iteration 19 (7666.62 s, search 7389.69 s): objective=8.38992e+06 imbalance=1.247 nsplit=0
[Jul 23, 04:56:30] Loading decompress_residuals_cpp extension (set COLBERT_LOAD_TORCH_EXTENSION_VERBOSE=True for more info)...
[Jul 23, 04:56:34] Loading packbits_cpp extension (set COLBERT_LOAD_TORCH_EXTENSION_VERBOSE=True for more info)...
Process Process-2:
Traceback (most recent call last):
  File "/home/jovyan/.conda/envs/colbert/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/home/jovyan/.conda/envs/colbert/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/home/jovyan/colbert-prune/experiments/ColBERTv2-reproduce/ColBERT/colbert/infra/launcher.py", line 134, in setup_new_process
    return_val = callee(config, *args)
  File "/home/jovyan/colbert-prune/experiments/ColBERTv2-reproduce/ColBERT/colbert/indexing/collection_indexer.py", line 33, in encode
    encoder.run(shared_lists)
  File "/home/jovyan/colbert-prune/experiments/ColBERTv2-reproduce/ColBERT/colbert/indexing/collection_indexer.py", line 68, in run
    self.train(shared_lists) # Trains centroids from selected passages
  File "/home/jovyan/colbert-prune/experiments/ColBERTv2-reproduce/ColBERT/colbert/indexing/collection_indexer.py", line 237, in train
    bucket_cutoffs, bucket_weights, avg_residual = self._compute_avg_residual(centroids, heldout)
  File "/home/jovyan/colbert-prune/experiments/ColBERTv2-reproduce/ColBERT/colbert/indexing/collection_indexer.py", line 315, in _compute_avg_residual
    compressor = ResidualCodec(config=self.config, centroids=centroids, avg_residual=None)
  File "/home/jovyan/colbert-prune/experiments/ColBERTv2-reproduce/ColBERT/colbert/indexing/codecs/residual.py", line 61, in __init__
    x = (i >> (j - self.nbits)) & mask
ValueError: negative shift count
nbits = 16 and k = 1000 seem to be the correct configuration according to the ColBERTv2 paper. How can I reproduce the experiment?
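
For reference, the failing expression from the traceback can be reproduced outside ColBERT. The snippet below is only a minimal sketch: the expression `(i >> (j - nbits)) & mask` is copied from residual.py line 61, but the function names, the mask definition, and the loop bounds (`j` stepping down from 8 in steps of `nbits`, i.e. walking over the bits of a single byte) are my assumptions, not the library's actual code. Under those assumptions, any `nbits` larger than 8 makes `j - nbits` negative, and Python raises the same ValueError.

```python
def extract_chunks(byte_value, nbits):
    """Split one 8-bit value into nbits-wide chunks, highest bits first.

    Hypothetical helper: only the shift-and-mask expression comes from the
    traceback; the mask and loop bounds are assumptions for illustration.
    """
    mask = (1 << nbits) - 1
    chunks = []
    # Assumed loop: j marks the upper bit boundary of each chunk in a byte.
    for j in range(8, 0, -nbits):
        # With nbits > 8 the shift amount (j - nbits) is negative.
        chunks.append((byte_value >> (j - nbits)) & mask)
    return chunks


def try_nbits(nbits):
    """Return 'ok' if the extraction succeeds, else the error message."""
    try:
        extract_chunks(0b10110100, nbits)
        return "ok"
    except ValueError as exc:
        return str(exc)


# Compare several widths, including the nbits=16 from the config above.
results = {nbits: try_nbits(nbits) for nbits in (1, 2, 4, 8, 16)}
for nbits, outcome in results.items():
    print(f"nbits={nbits}: {outcome}")
```

In this sketch only widths that evenly divide 8 get through the loop, which suggests the error comes from the value of nbits itself rather than from the collection or the checkpoint.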