A few fixes to raft-ann-bench recipe and docs (#1806)

Authors:
  - Corey J. Nolet (https://github.com/cjnolet)

Approvers:
  - Divye Gala (https://github.com/divyegala)
  - Dante Gama Dessavre (https://github.com/dantegd)
  - Ray Douglass (https://github.com/raydouglass)

URL: #1806
rapidsai · Sep 7, 2023 · be378ee
1 parent f691fc9
commit be378ee

Showing 4 changed files with 15 additions and 10 deletions.
diff --git a/conda/recipes/raft-ann-bench-cpu/meta.yaml b/conda/recipes/raft-ann-bench-cpu/meta.yaml
@@ -57,6 +57,7 @@ requirements:
     - matplotlib
     - python
     - pyyaml
+    - benchmark
 
 about:
   home: https://rapids.ai/

diff --git a/conda/recipes/raft-ann-bench/meta.yaml b/conda/recipes/raft-ann-bench/meta.yaml
@@ -90,7 +90,11 @@ requirements:
     - libfaiss {{ faiss_version }}
     {% endif %}
     - h5py {{ h5py_version }}
-
+    - benchmark
+    - glog {{ glog_version }}
+    - matplotlib
+    - python
+    - pyyaml
 about:
   home: https://rapids.ai/
   license: Apache-2.0

diff --git a/docs/source/raft_ann_benchmarks.md b/docs/source/raft_ann_benchmarks.md
@@ -8,7 +8,7 @@ The easiest way to install these benchmarks is through conda. We provide package
 
 ```bash
 
-mamba env create --name raft_ann_benchmarks
+mamba create --name raft_ann_benchmarks
 conda activate raft_ann_benchmarks
 
 # to install GPU package:
@@ -25,7 +25,7 @@ Please see the [build instructions](ann_benchmarks_build.md) to build the benchm
 ## Running the benchmarks
 
 ### Usage
-There are 3 general steps to running the benchmarks and vizualizing the results:
+There are 4 general steps to running the benchmarks and visualizing the results:
 1. Prepare Dataset
 2. Build Index and Search Index
 3. Data Export
@@ -39,7 +39,7 @@ expected to be defined to run these scripts; this variable holds the directory w
 
 ### End-to-end example: Million-scale
 
-The steps below demonstrate how to download, install, and run benchmarks on a subset of 10M vectors from the Yandex Deep-1B dataset By default the datasets will be stored and used from the folder indicated by the RAPIDS_DATASET_ROOT_DIR environment variable if defined, otherwise a datasets subfolder from where the script is being called:
+The steps below demonstrate how to download, install, and run benchmarks on a subset of 10M vectors from the Yandex Deep-1B dataset By default the datasets will be stored and used from the folder indicated by the RAPIDS_DATASET_ROOT_DIR environment variable if defined, otherwise a datasets sub-folder from where the script is being called:
 
 ```bash
 
@@ -56,7 +56,7 @@ python -m raft-ann-bench.data_export --dataset deep-image-96-inner
 python -m raft-ann-bench.plot --dataset deep-image-96-inner
 ```
 
-Configuration files already exist for the following list of the million-scale datasets. These all work out-of-the-box with the `--dataset` argument. Other million-scale datasets from `ann-benchmarks.com` will work, but will require a json configuration file to be created in `python/raft-ann-bench/src/raft-ann-bench/conf`.
+Configuration files already exist for the following list of the million-scale datasets. Please refer to [ann-benchmarks datasets](https://github.com/erikbern/ann-benchmarks/#data-sets) for more information, including actual train and sizes. These all work out-of-the-box with the `--dataset` argument. Other million-scale datasets from `ann-benchmarks.com` will work, but will require a json configuration file to be created in `python/raft-ann-bench/src/raft-ann-bench/run/conf`.
 - `deep-image-96-angular`
 - `fashion-mnist-784-euclidean`
 - `glove-50-angular`
@@ -80,17 +80,17 @@ mkdir -p datasets/deep-1B
 # (1) prepare dataset
 # download manually "Ground Truth" file of "Yandex DEEP"
 # suppose the file name is deep_new_groundtruth.public.10K.bin
-python python -m raft-ann-bench.split_groundtruth --groundtruth datasets/deep-1B/deep_new_groundtruth.public.10K.bin
+python -m raft-ann-bench.split_groundtruth --groundtruth datasets/deep-1B/deep_new_groundtruth.public.10K.bin
 # two files 'groundtruth.neighbors.ibin' and 'groundtruth.distances.fbin' should be produced
 
 # (2) build and search index
-python python -m raft-ann-bench.run --dataset deep-1B
+python -m raft-ann-bench.run --dataset deep-1B
 
 # (3) export data
-python python -m raft-ann-bench.data_export --dataset deep-1B
+python -m raft-ann-bench.data_export --dataset deep-1B
 
 # (4) plot results
-python python -m raft-ann-bench.plot --dataset deep-1B
+python -m raft-ann-bench.plot --dataset deep-1B
 ```
 
 The usage of `python -m raft-ann-bench.split-groundtruth` is:

diff --git a/python/raft-ann-bench/src/raft-ann-bench/run/__main__.py b/python/raft-ann-bench/src/raft-ann-bench/run/__main__.py
@@ -145,7 +145,7 @@ def main():
 
     # Read list of allowed algorithms
     try:
-        import pylibraft  # noqa: F401
+        import rmm  # noqa: F401
 
         gpu_present = True
     except ImportError:
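
The hunk above switches the GPU probe from importing `pylibraft` to importing `rmm`. A minimal sketch of that import-probe pattern follows, assuming the `except` branch (cut off in the hunk) simply marks the GPU as absent. The helper name `module_available` is hypothetical and is not part of `raft-ann-bench`.

```python
import importlib


def module_available(name: str) -> bool:
    """Return True if `name` can be imported, False if the import fails.

    Hypothetical helper for illustration; the script in the diff uses an
    inline try/except instead.
    """
    try:
        importlib.import_module(name)
    except ImportError:
        # ModuleNotFoundError is a subclass of ImportError, so a missing
        # package lands here as well.
        return False
    return True


# `rmm` is the module probed in the hunk; treating a failed import as
# "no GPU" is an assumption about the part of the code not shown.
gpu_present = module_available("rmm")
print(f"gpu_present={gpu_present}")
```

Note that a successful import only shows that the package is installed, not that a usable device is attached, so this check is best read as "GPU packages are present".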