# Various fixes to reproducible benchmarks #1800

Merged 120 commits into `branch-23.10` on Sep 11, 2023. Changes below are from 119 of the 120 commits.

## Commits
bd738ec
ANN-benchmarks: switch to use gbench
achirkin Aug 9, 2023
7473c62
Disable NVTX if the nvtx3 headers are missing
achirkin Aug 9, 2023
aa10d7c
Merge branch 'branch-23.10' into enh-google-benchmarks
achirkin Aug 10, 2023
bed126c
Merge branch 'branch-23.10' into enh-google-benchmarks
achirkin Aug 10, 2023
09ea7a7
Merge remote-tracking branch 'upstream/branch-23.10' into python-ann-…
divyegala Aug 11, 2023
2917886
try to run gbench executable
divyegala Aug 12, 2023
49732b1
Allow to compile ANN_BENCH without CUDA
achirkin Aug 17, 2023
76cfb40
Merge remote-tracking branch 'rapidsai/branch-23.10' into enh-google-…
achirkin Aug 17, 2023
9b588af
Fix style
achirkin Aug 17, 2023
6d6c17d
Adapt ANN benchmark python scripts
achirkin Aug 17, 2023
b89b27d
Make the default behavior to produce one executable per benchmark
achirkin Aug 17, 2023
163a40c
Fix style problems / pre-commit
achirkin Aug 17, 2023
0bb51a3
Merge branch 'branch-23.10' into enh-google-benchmarks
achirkin Aug 22, 2023
2b9f649
Merge remote-tracking branch 'rapidsai/branch-23.10' into enh-google-…
achirkin Aug 23, 2023
9728f7e
Merge branch 'branch-23.10' into enh-google-benchmarks
achirkin Aug 24, 2023
7b1bf01
Merge remote-tracking branch 'origin/branch-23.10' into enh-google-be…
cjnolet Aug 24, 2023
1daf2bf
Adding k and batch-size options to run.py
cjnolet Aug 24, 2023
4e0a53e
Merge branch 'branch-23.10' - CONFIGS ONLY - dataset_memtype follows …
achirkin Aug 24, 2023
04893c9
Add dataset_memory_type/query_memory_type as build/search parameters
achirkin Aug 24, 2023
b24fcf7
middle of merge, not building
divyegala Aug 24, 2023
30f7467
Tuning guide
cjnolet Aug 24, 2023
3e35121
Merge remote-tracking branch 'artem/enh-google-benchmarks' into enh-g…
cjnolet Aug 24, 2023
f927f69
compiling, index building successful, search failing
divyegala Aug 24, 2023
404cd10
Merge remote-tracking branch 'corey/enh-google-benchmarks' into pytho…
divyegala Aug 24, 2023
2f19c44
FEA first commit rebasing changes on gbench branch
dantegd Aug 25, 2023
e0586de
FIX fixing straggling changes from rebase
dantegd Aug 25, 2023
0eaa7e0
Fix FAISS using a destroyed stream from previous benchmark case
achirkin Aug 25, 2023
9896963
Merge remote-tracking branch 'artem/enh-google-benchmarks' into enh-g…
cjnolet Aug 25, 2023
4062d6f
Fixing issue in conf file and stubbing out parameter tuning guide
cjnolet Aug 25, 2023
7141c21
Adding CAGRA to tuning guide
cjnolet Aug 25, 2023
7c42a78
Adding ivf-flat description to tuning guide
cjnolet Aug 25, 2023
92a37a8
Updating ivf-flat and ivf-pq
cjnolet Aug 25, 2023
3982840
Adding tuning guide tables for ivf-flat and ivf-pq for faiss and raft
cjnolet Aug 25, 2023
d2bfc11
Reatio is not required
cjnolet Aug 25, 2023
80482fb
Merge remote-tracking branch 'corey/enh-google-benchmarks' into pytho…
divyegala Aug 25, 2023
31594e7
FIX changes that got lost during rebasing
dantegd Aug 25, 2023
82f195e
write build,search results
divyegala Aug 25, 2023
be6eb56
FIX PEP8 fixes
dantegd Aug 25, 2023
0cf1c6f
CLeaning up a couple configs
cjnolet Aug 25, 2023
f5bf15a
FIX typo in cmake conditional
dantegd Aug 25, 2023
617c60f
add tuning guide for cagra, modify build param
divyegala Aug 25, 2023
3948f0c
Merge remote-tracking branch 'corey/enh-google-benchmarks' into pytho…
divyegala Aug 26, 2023
74c9a1b
remove data_export, use gbench csvs to plot
divyegala Aug 26, 2023
902f9f4
fix typo in docs path for results
divyegala Aug 26, 2023
9b82f85
Merge pull request #2 from divyegala/python-ann-bench-use-gbench
cjnolet Aug 26, 2023
1198e1a
for plotting, pick up recall/qps from anywhere in the csv columns
divyegala Aug 26, 2023
24c1619
Merge remote-tracking branch 'divye/python-ann-bench-use-gbench' into…
cjnolet Aug 26, 2023
3f647c3
add output-filepath for plot.py
divyegala Aug 26, 2023
354287d
fix typo in docs
divyegala Aug 26, 2023
e0dfbab
Reverting changes to deep-100M
cjnolet Aug 26, 2023
16e233b
FIX typo in build.sh
dantegd Aug 27, 2023
cac89d0
DBG Make cmake verbose
dantegd Aug 27, 2023
bb3a194
Merge branch 'branch-23.10' into enh-google-benchmarks
achirkin Aug 28, 2023
7d8ee13
FAISS refinement
cjnolet Aug 28, 2023
1720e11
Merge branch 'enh-google-benchmarks' of github.com:cjnolet/raft into …
cjnolet Aug 28, 2023
c0ee323
FIX typo in build.sh
dantegd Aug 28, 2023
aa608d2
DBG single commit of changes
dantegd Aug 28, 2023
697ab89
FIX Add openmp changes from main branch
dantegd Aug 28, 2023
8b0c4c2
FIX recipe env variables
dantegd Aug 28, 2023
49fd31d
adding build time plot
divyegala Aug 28, 2023
be3da1a
merging corey's upstream
divyegala Aug 28, 2023
e92827a
FIX flag in the wrong conditional in build.sh
dantegd Aug 28, 2023
b9e7771
Merge pull request #4 from divyegala/python-ann-bench-use-gbench
cjnolet Aug 28, 2023
8e5ab5d
Merge remote-tracking branch 'artem/enh-google-benchmarks' into enh-g…
cjnolet Aug 29, 2023
f331a94
Merge branch 'enh-google-benchmarks' of github.com:cjnolet/raft into …
cjnolet Aug 29, 2023
b9defb7
FIX remove accidentally deleted file
dantegd Aug 29, 2023
b569861
ENH merge changes from debug PR
dantegd Aug 29, 2023
cdd8d6b
ENH merge changes from enh-google-benchmark branch and create package…
dantegd Aug 29, 2023
b9e9ea6
FIX build.sh flag that was deleted in a bad merge
dantegd Aug 29, 2023
e420593
Merge branch 'branch-23.10' into enh-google-benchmarks
achirkin Aug 29, 2023
913dec2
Move the 'dump_parameters' earlier in the benchmarks to have higher c…
achirkin Aug 29, 2023
8861fc8
Implementing some of the review feedback
cjnolet Aug 29, 2023
2f52b02
Bench ann
cjnolet Aug 29, 2023
c28326c
Merge remote-tracking branch 'artem/enh-google-benchmarks' into enh-g…
cjnolet Aug 29, 2023
521b696
Fixing a couple potential merge artifacts
cjnolet Aug 29, 2023
94296ca
FIX multiple fixes
dantegd Aug 29, 2023
0a35608
FIX multiple fixes
dantegd Aug 30, 2023
d11043c
Merge branch 'enh-google-benchmarks' into dev-enh-google-benchmarks
cjnolet Aug 30, 2023
d6757c1
Merge branch 'branch-23.10' into dev-enh-google-benchmarks
cjnolet Aug 30, 2023
78356aa
Merging python (will need some more fixes later but this will work fo…
cjnolet Aug 30, 2023
9ce6ce0
Merge branch 'branch-23.10' into dev-enh-google-benchmarks
cjnolet Aug 30, 2023
184c46d
FIX merge conflicts from fork and local
dantegd Aug 31, 2023
0dc3ce4
FIX many improvements and small fixes
dantegd Aug 31, 2023
5dd7db2
FIX small fixes from minor errors in prior merges
dantegd Aug 31, 2023
14bcb92
ANN-bench: more flexible cuda_stub.hpp
achirkin Aug 31, 2023
50c9fe2
Make dlopen more flexible looking for the cudart version to link.
achirkin Aug 31, 2023
c6df11a
Fixing style
cjnolet Aug 31, 2023
aaaa182
Merge remote-tracking branch 'artem/enh-ann-bench-flexible-stub' into…
cjnolet Aug 31, 2023
9ffd68e
Fixing omp error
cjnolet Aug 31, 2023
c947004
Merge branch 'branch-23.10' into enh-ann-bench-flexible-stub
achirkin Aug 31, 2023
11f353b
FIxing a couple thing in conf files
cjnolet Aug 31, 2023
858d0a5
Merge branch 'branch-23.10' into dev-enh-google-benchmarks
dantegd Sep 1, 2023
d236090
Adding data_export
cjnolet Sep 1, 2023
c47a1bf
Merge branch 'dev-enh-google-benchmarks' of github.com:dantegd/raft i…
cjnolet Sep 1, 2023
615807a
Merge branch 'branch-23.10' into enh-ann-bench-flexible-stub
achirkin Sep 1, 2023
feef4f3
Merge branch 'branch-23.10' into dev-enh-google-benchmarks
cjnolet Sep 1, 2023
fb2140f
Merge remote-tracking branch 'artem/enh-ann-bench-flexible-stub' into…
cjnolet Sep 1, 2023
998bf48
fix dask pinnings in raft-dask recipe
divyegala Sep 1, 2023
be85537
FIX PR review feedback and readme updates
dantegd Sep 1, 2023
abb4f69
Merge branch 'branch-23.10' into dev-enh-google-benchmarks
dantegd Sep 1, 2023
076d2de
DOC doc updates
dantegd Sep 1, 2023
c6014a9
FIX pep8
dantegd Sep 1, 2023
5a12ce3
FIX docs and plot datasets path
dantegd Sep 1, 2023
d0c150b
Doing some cleanup of ggnn, expanding param tuning docs, fixing hnswl…
cjnolet Sep 1, 2023
03b5e4f
Merge branch 'branch-23.10' into imp-google-benchmarks
cjnolet Sep 1, 2023
fbdc1fa
FIX found typo in cmake
dantegd Sep 1, 2023
954aa87
FIX missing parameter in python
dantegd Sep 1, 2023
15b0dc0
FIX correct conditional
dantegd Sep 1, 2023
d863ce6
FIX for single gpu arch detection in CMake
dantegd Sep 2, 2023
c271a4e
Merge branch 'dev-enh-google-benchmarks' into imp-google-benchmarks
cjnolet Sep 2, 2023
0d60c56
FIX PR review fixes and a {yea}
dantegd Sep 2, 2023
0193607
More fixes
cjnolet Sep 2, 2023
fcc158a
Merge remote-tracking branch 'origin/branch-23.10' into dev-enh-googl…
cjnolet Sep 2, 2023
047e941
Merge branch 'dev-enh-google-benchmarks' into imp-google-benchmarks
cjnolet Sep 2, 2023
b7a6d9a
Merge branch 'branch-23.10' into imp-google-benchmarks
cjnolet Sep 5, 2023
9244674
Merge remote-tracking branch 'origin/branch-23.10' into imp-google-be…
cjnolet Sep 5, 2023
432fa45
Merge branch 'imp-google-benchmarks' of github.com:cjnolet/raft into …
cjnolet Sep 5, 2023
732b923
Update cpp/bench/ann/src/common/util.hpp
cjnolet Sep 7, 2023
ef112d0
Merge branch 'branch-23.10' into imp-google-benchmarks
cjnolet Sep 8, 2023
be6cd5c
Merge branch 'branch-23.10' into imp-google-benchmarks
cjnolet Sep 11, 2023
## Files changed
3 changes: 1 addition & 2 deletions — `cpp/bench/ann/src/ggnn/ggnn_benchmark.cu`

```diff
@@ -33,8 +33,7 @@ template <typename T>
 void parse_build_param(const nlohmann::json& conf,
                        typename raft::bench::ann::Ggnn<T>::BuildParam& param)
 {
-  param.dataset_size = conf.at("dataset_size");
-  param.k            = conf.at("k");
+  param.k = conf.at("k");
 
   if (conf.contains("k_build")) { param.k_build = conf.at("k_build"); }
   if (conf.contains("segment_size")) { param.segment_size = conf.at("segment_size"); }
```
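For readers unfamiliar with the parsing pattern above: required keys are read with `at()`, which throws if the key is missing, while optional keys are guarded with `contains()`. A minimal, self-contained sketch of the same pattern — `nlohmann::json` is what the benchmark code really uses, but this `BuildParam` and its default values are hypothetical stand-ins, not the real `raft::bench::ann::Ggnn<T>::BuildParam`:

```cpp
#include <iostream>
#include <nlohmann/json.hpp>

// Hypothetical stand-in for raft::bench::ann::Ggnn<T>::BuildParam.
struct BuildParam {
  int k{10};
  int k_build{24};
  int segment_size{32};
};

// Same pattern as the diff above: at() for required keys (throws
// nlohmann::json::out_of_range when absent), contains() for optional ones.
void parse_build_param(const nlohmann::json& conf, BuildParam& param)
{
  param.k = conf.at("k");
  if (conf.contains("k_build")) { param.k_build = conf.at("k_build"); }
  if (conf.contains("segment_size")) { param.segment_size = conf.at("segment_size"); }
}

int main()
{
  auto conf = nlohmann::json::parse(R"({"k": 10, "k_build": 64})");
  BuildParam param;
  parse_build_param(conf, param);
  // segment_size keeps its default because the key is absent.
  std::cout << param.k << " " << param.k_build << " " << param.segment_size << "\n";
}
```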
22 changes: 4 additions & 18 deletions — `cpp/bench/ann/src/ggnn/ggnn_wrapper.cuh`

```diff
@@ -38,8 +38,6 @@ class Ggnn : public ANN<T> {
   int num_layers{4};  // L
   float tau{0.5};
   int refine_iterations{2};
-
-  size_t dataset_size;
   int k;  // GGNN requires to know k during building
 };
@@ -182,24 +180,17 @@ GgnnImpl<T, measure, D, KBuild, KQuery, S>::GgnnImpl(Metric metric,
   }
 
   if (dim != D) { throw std::runtime_error("mis-matched dim"); }
-
-  int device;
-  RAFT_CUDA_TRY(cudaGetDevice(&device));
-
-  ggnn_ = std::make_unique<GGNNGPUInstance>(
-    device, build_param_.dataset_size, build_param_.num_layers, true, build_param_.tau);
 }
 
 template <typename T, DistanceMeasure measure, int D, int KBuild, int KQuery, int S>
 void GgnnImpl<T, measure, D, KBuild, KQuery, S>::build(const T* dataset,
                                                        size_t nrow,
                                                        cudaStream_t stream)
 {
-  if (nrow != build_param_.dataset_size) {
-    throw std::runtime_error(
-      "build_param_.dataset_size = " + std::to_string(build_param_.dataset_size) +
-      " , but nrow = " + std::to_string(nrow));
-  }
+  int device;
+  RAFT_CUDA_TRY(cudaGetDevice(&device));
+  ggnn_ = std::make_unique<GGNNGPUInstance>(
+    device, nrow, build_param_.num_layers, true, build_param_.tau);
 
   ggnn_->set_base_data(dataset);
   ggnn_->set_stream(stream);
@@ -212,11 +203,6 @@ void GgnnImpl<T, measure, D, KBuild, KQuery, S>::build(const T* dataset,
 template <typename T, DistanceMeasure measure, int D, int KBuild, int KQuery, int S>
 void GgnnImpl<T, measure, D, KBuild, KQuery, S>::set_search_dataset(const T* dataset, size_t nrow)
 {
-  if (nrow != build_param_.dataset_size) {
-    throw std::runtime_error(
-      "build_param_.dataset_size = " + std::to_string(build_param_.dataset_size) +
-      " , but nrow = " + std::to_string(nrow));
-  }
   ggnn_->set_base_data(dataset);
 }
```
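The substance of this change: previously the wrapper's constructor created the `GGNNGPUInstance`, which forced users to pass `dataset_size` as a build parameter and forced `build()` and `set_search_dataset()` to validate `nrow` against it. Creating the instance lazily inside `build()`, where the true row count arrives, removes the parameter and both checks. A minimal sketch of the deferred-construction pattern — the type names here are hypothetical stand-ins, not the real GGNN API:

```cpp
#include <cstddef>
#include <memory>

// Stand-in for GGNNGPUInstance; the real constructor also takes a device
// id and other arguments elided here.
struct GraphIndex {
  GraphIndex(std::size_t n_rows, int num_layers, float tau)
    : n_rows(n_rows), num_layers(num_layers), tau(tau) {}
  std::size_t n_rows;
  int num_layers;
  float tau;
};

class GgnnLikeWrapper {
 public:
  // No dataset size is needed at construction time any more.
  void build(const float* dataset, std::size_t nrow)
  {
    // Deferred: size the index from the nrow the caller actually passes.
    index_ = std::make_unique<GraphIndex>(nrow, num_layers_, tau_);
    // ... set base data from `dataset` and run graph construction ...
    (void)dataset;
  }

 private:
  std::unique_ptr<GraphIndex> index_;
  int num_layers_{4};
  float tau_{0.5f};
};
```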
6 changes: 4 additions & 2 deletions — `cpp/bench/ann/src/hnswlib/hnswlib_wrapper.h`

```diff
@@ -31,6 +31,8 @@
 #include <utility>
 #include <vector>
 
+#include <omp.h>
+
 #include "../common/ann_types.hpp"
 #include <hnswlib.h>
 
@@ -164,13 +166,13 @@ class HnswLib : public ANN<T> {
   struct BuildParam {
     int M;
     int ef_construction;
-    int num_threads{1};
+    int num_threads = omp_get_num_procs();
   };
 
   using typename ANN<T>::AnnSearchParam;
   struct SearchParam : public AnnSearchParam {
     int ef;
-    int num_threads{1};
+    int num_threads = omp_get_num_procs();
   };
 
   HnswLib(Metric metric, int dim, const BuildParam& param);
```

Review thread on the `SearchParam::num_threads` default:

**@achirkin** (Contributor, Sep 5, 2023): Does this work well with `num_threads > n_queries`? We've had similar logic in the raft host refinement, and the performance on small batches was horrible due to the overhead of managing many threads compared to the amount of work (`n_queries = 1`).

**Author** (Member): Good question. This one is challenging because we don't (and shouldn't) know the number of queries when we set this argument. We could set it to something small(ish) like 8 or 16, but that would just lower the saturation point for larger query batches. In general, I know the queries for online systems are going to be long-tailed, with 1 in the main mass and >= 100 in the tail. The problem is that when we run larger batch sizes, we aren't giving hnsw a fair try at all. My thinking was to take the middle ground: set this to the number of cores. I guess I should measure the impact directly. What kind of perf difference are you seeing for, say, a batch size of 10 when the thread pool contains the number of available cores?

**@achirkin** (Contributor): Maybe this explicit pool does a better job than the openmp machinery that we rely upon in the refine operation. But there, I got something like a ~100x boost for a single-query batch (`n_queries = 1`, 72 cores). I've done some other refactoring at the same time, though.
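One possible mitigation for the small-batch concern raised in this thread — a sketch only, not what the PR does — is to clamp the pool width to the batch size at search time, so a single-query batch never pays the overhead of a full-width pool. This assumes an OpenMP-style per-query loop; hnswlib's benchmark wrapper uses its own explicit thread pool, so the exact mechanics would differ:

```cpp
#include <algorithm>
#include <cstddef>
#include <omp.h>

// Cap the worker count by the number of queries in the batch; with
// n_queries = 1 this degenerates to a serial loop with no pool overhead.
inline int effective_threads(int num_threads, std::size_t n_queries)
{
  return static_cast<int>(
    std::min<std::size_t>(static_cast<std::size_t>(num_threads), n_queries));
}

void search_batch(std::size_t n_queries, int num_threads = omp_get_num_procs())
{
  const int nt = effective_threads(num_threads, n_queries);
#pragma omp parallel for num_threads(nt)
  for (long i = 0; i < static_cast<long>(n_queries); ++i) {
    // ... run query i against the index ...
    (void)i;
  }
}
```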
22 changes: 22 additions & 0 deletions — `docs/source/ann_benchmarks_param_tuning.md` (all lines added at `@@ -85,6 +85,28 @@`, after the IVF-PQ section)

### `hnswlib`


| Parameter        | Type            | Required | Data Type                            | Default     | Description |
|------------------|-----------------|----------|--------------------------------------|-------------|-------------|
| `efConstruction` | `build_param`   | Y        | Positive Integer >0                  |             | Controls index build time and accuracy. Bigger values increase index quality, but only up to a point; past it, further increases no longer improve quality. |
| `M`              | `build_param`   | Y        | Positive Integer often between 2-100 |             | Number of bi-directional links created for every new element during construction. Higher values work better for datasets with high intrinsic dimensionality and/or high recall requirements; lower values work for low intrinsic dimensionality and/or low recall. Also affects the algorithm's memory consumption. |
| `numThreads`     | `build_param`   | N        | Positive Integer >0                  | # cpu cores | Number of threads to use when building the index. |
| `ef`             | `search_params` | Y        | Positive Integer >0                  |             | Size of the dynamic candidate list used during search. Higher values give more accurate but slower search. Cannot be lower than `k`. |
| `numThreads`     | `search_params` | N        | Positive Integer >0                  | # cpu cores | Number of threads to use for queries. |

Please refer to the [HNSW algorithm parameters guide](https://github.com/nmslib/hnswlib/blob/master/ALGO_PARAMS.md) from `hnswlib` to learn more about these arguments.
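To make the table concrete: a sketch of parsing one configuration entry with these keys, using `nlohmann::json` as the benchmark code above does. The `name` key and every value are purely illustrative, not taken from a real config file:

```cpp
#include <iostream>
#include <nlohmann/json.hpp>

int main()
{
  // Illustrative only: key names from the table above, values made up.
  auto conf = nlohmann::json::parse(R"({
    "name": "hnswlib.M16",
    "build_param": {"efConstruction": 500, "M": 16, "numThreads": 32},
    "search_params": [{"ef": 10}, {"ef": 50}, {"ef": 200}]
  })");

  // Each dict in search_params is one parameter combination to benchmark.
  for (const auto& sp : conf["search_params"]) {
    std::cout << "ef = " << sp["ef"] << "\n";
  }
}
```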

## GGNN Index

### `ggnn`


| Parameter           | Type          | Required | Data Type            | Default | Description |
|---------------------|---------------|----------|----------------------|---------|-------------|
| `k`                 | `build_param` | Y        | Positive Integer >0  |         | Number of neighbors to search for; GGNN requires `k` to be known when the index is built. |
| `k_build`           | `build_param` | N        | Positive Integer >0  |         | Number of neighbors per node in the constructed search graph. |
| `segment_size`      | `build_param` | N        | Positive Integer >0  |         | Size of the segments the dataset is partitioned into during graph construction. |
| `num_layers`        | `build_param` | N        | Positive Integer >0  | 4       | Number of layers (`L`) in the hierarchical graph. |
| `tau`               | `build_param` | N        | Positive Float >0    | 0.5     | Slack factor trading construction time against graph quality. |
| `refine_iterations` | `build_param` | N        | Positive Integer >=0 | 2       | Number of refinement iterations applied after the initial graph build. |

Parameter names, defaults, and the `k` requirement above follow the `BuildParam` struct and `parse_build_param` in `cpp/bench/ann/src/ggnn/ggnn_wrapper.cuh` shown earlier; the remaining descriptions are informal, so refer to the GGNN project documentation for authoritative definitions and for search-time parameters.