Commit

A few fixes to raft-ann-bench recipe and docs (#1806)
Authors:
  - Corey J. Nolet (https://github.com/cjnolet)

Approvers:
  - Divye Gala (https://github.com/divyegala)
  - Dante Gama Dessavre (https://github.com/dantegd)
  - Ray Douglass (https://github.com/raydouglass)

URL: #1806
cjnolet authored Sep 7, 2023
1 parent f691fc9 commit be378ee
Showing 4 changed files with 15 additions and 10 deletions.
1 change: 1 addition & 0 deletions conda/recipes/raft-ann-bench-cpu/meta.yaml
@@ -57,6 +57,7 @@ requirements:
  - matplotlib
  - python
  - pyyaml
+ - benchmark

  about:
  home: https://rapids.ai/
6 changes: 5 additions & 1 deletion conda/recipes/raft-ann-bench/meta.yaml
@@ -90,7 +90,11 @@ requirements:
  - libfaiss {{ faiss_version }}
  {% endif %}
  - h5py {{ h5py_version }}
-
+ - benchmark
+ - glog {{ glog_version }}
+ - matplotlib
+ - python
+ - pyyaml
  about:
  home: https://rapids.ai/
  license: Apache-2.0
16 changes: 8 additions & 8 deletions docs/source/raft_ann_benchmarks.md
@@ -8,7 +8,7 @@ The easiest way to install these benchmarks is through conda. We provide package

  ```bash

- mamba env create --name raft_ann_benchmarks
+ mamba create --name raft_ann_benchmarks
  conda activate raft_ann_benchmarks

  # to install GPU package:
@@ -25,7 +25,7 @@ Please see the [build instructions](ann_benchmarks_build.md) to build the benchm
  ## Running the benchmarks

  ### Usage
- There are 3 general steps to running the benchmarks and vizualizing the results:
+ There are 4 general steps to running the benchmarks and visualizing the results:
  1. Prepare Dataset
  2. Build Index and Search Index
  3. Data Export
@@ -39,7 +39,7 @@ expected to be defined to run these scripts; this variable holds the directory w

  ### End-to-end example: Million-scale

- The steps below demonstrate how to download, install, and run benchmarks on a subset of 10M vectors from the Yandex Deep-1B dataset By default the datasets will be stored and used from the folder indicated by the RAPIDS_DATASET_ROOT_DIR environment variable if defined, otherwise a datasets subfolder from where the script is being called:
+ The steps below demonstrate how to download, install, and run benchmarks on a subset of 10M vectors from the Yandex Deep-1B dataset By default the datasets will be stored and used from the folder indicated by the RAPIDS_DATASET_ROOT_DIR environment variable if defined, otherwise a datasets sub-folder from where the script is being called:

  ```bash

@@ -56,7 +56,7 @@ python -m raft-ann-bench.data_export --dataset deep-image-96-inner
  python -m raft-ann-bench.plot --dataset deep-image-96-inner
  ```

- Configuration files already exist for the following list of the million-scale datasets. These all work out-of-the-box with the `--dataset` argument. Other million-scale datasets from `ann-benchmarks.com` will work, but will require a json configuration file to be created in `python/raft-ann-bench/src/raft-ann-bench/conf`.
+ Configuration files already exist for the following list of the million-scale datasets. Please refer to [ann-benchmarks datasets](https://github.com/erikbern/ann-benchmarks/#data-sets) for more information, including actual train and sizes. These all work out-of-the-box with the `--dataset` argument. Other million-scale datasets from `ann-benchmarks.com` will work, but will require a json configuration file to be created in `python/raft-ann-bench/src/raft-ann-bench/run/conf`.
  - `deep-image-96-angular`
  - `fashion-mnist-784-euclidean`
  - `glove-50-angular`
@@ -80,17 +80,17 @@ mkdir -p datasets/deep-1B
  # (1) prepare dataset
  # download manually "Ground Truth" file of "Yandex DEEP"
  # suppose the file name is deep_new_groundtruth.public.10K.bin
- python python -m raft-ann-bench.split_groundtruth --groundtruth datasets/deep-1B/deep_new_groundtruth.public.10K.bin
+ python -m raft-ann-bench.split_groundtruth --groundtruth datasets/deep-1B/deep_new_groundtruth.public.10K.bin
  # two files 'groundtruth.neighbors.ibin' and 'groundtruth.distances.fbin' should be produced

  # (2) build and search index
- python python -m raft-ann-bench.run --dataset deep-1B
+ python -m raft-ann-bench.run --dataset deep-1B

  # (3) export data
- python python -m raft-ann-bench.data_export --dataset deep-1B
+ python -m raft-ann-bench.data_export --dataset deep-1B

  # (4) plot results
- python python -m raft-ann-bench.plot --dataset deep-1B
+ python -m raft-ann-bench.plot --dataset deep-1B
  ```

  The usage of `python -m raft-ann-bench.split-groundtruth` is:
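The docs change in this file describes where datasets are stored: the folder named by `RAPIDS_DATASET_ROOT_DIR` if that variable is defined, otherwise a `datasets` sub-folder of the directory the script is called from. A minimal sketch of that lookup rule is below. The helper name `resolve_dataset_root` and its parameters are hypothetical and are not taken from the raft-ann-bench source.

```python
import os


def resolve_dataset_root(env=None, cwd=None):
    """Return the dataset folder following the rule stated in the docs.

    Hypothetical helper for illustration only; `env` and `cwd` are
    injectable so the rule can be checked without touching the real
    process environment.
    """
    env = os.environ if env is None else env
    cwd = os.getcwd() if cwd is None else cwd
    # Use the environment variable when defined, else <cwd>/datasets.
    return env.get("RAPIDS_DATASET_ROOT_DIR", os.path.join(cwd, "datasets"))
```

Calling `resolve_dataset_root()` with no arguments reads the real environment and current working directory.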
2 changes: 1 addition & 1 deletion python/raft-ann-bench/src/raft-ann-bench/run/__main__.py
@@ -145,7 +145,7 @@ def main():

      # Read list of allowed algorithms
      try:
-         import pylibraft # noqa: F401
+         import rmm # noqa: F401

          gpu_present = True
      except ImportError:
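The change to `run/__main__.py` uses a common pattern: a successful import is taken as a sign that a capability (here, GPU support) is available. A minimal sketch of that pattern follows. The helper `module_available` is hypothetical, and the assumption that the `except` branch sets `gpu_present = False` comes from the pattern itself, since that line is outside the hunk shown.

```python
import importlib


def module_available(name: str) -> bool:
    """Return True if `name` can be imported, False otherwise.

    Hypothetical helper; the commit itself uses an inline try/except.
    """
    try:
        importlib.import_module(name)
    except ImportError:
        return False
    return True


# Assumption: the benchmark treats an importable `rmm` as "GPU present".
gpu_present = module_available("rmm")
```

Only `ImportError` is caught here, so a package that is installed but fails for another reason at import time would still raise.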
