Various fixes to reproducible benchmarks #1800

Merged: rapids-bot merged 120 commits into rapidsai:branch-23.10 from cjnolet:imp-google-benchmarks on Sep 11, 2023. The changes shown are from 119 of the 120 commits.

Commits
bd738ec  ANN-benchmarks: switch to use gbench (achirkin)
7473c62  Disable NVTX if the nvtx3 headers are missing (achirkin)
aa10d7c  Merge branch 'branch-23.10' into enh-google-benchmarks (achirkin)
bed126c  Merge branch 'branch-23.10' into enh-google-benchmarks (achirkin)
09ea7a7  Merge remote-tracking branch 'upstream/branch-23.10' into python-ann-… (divyegala)
2917886  try to run gbench executable (divyegala)
49732b1  Allow to compile ANN_BENCH without CUDA (achirkin)
76cfb40  Merge remote-tracking branch 'rapidsai/branch-23.10' into enh-google-… (achirkin)
9b588af  Fix style (achirkin)
6d6c17d  Adapt ANN benchmark python scripts (achirkin)
b89b27d  Make the default behavior to produce one executable per benchmark (achirkin)
163a40c  Fix style problems / pre-commit (achirkin)
0bb51a3  Merge branch 'branch-23.10' into enh-google-benchmarks (achirkin)
2b9f649  Merge remote-tracking branch 'rapidsai/branch-23.10' into enh-google-… (achirkin)
9728f7e  Merge branch 'branch-23.10' into enh-google-benchmarks (achirkin)
7b1bf01  Merge remote-tracking branch 'origin/branch-23.10' into enh-google-be… (cjnolet)
1daf2bf  Adding k and batch-size options to run.py (cjnolet)
4e0a53e  Merge branch 'branch-23.10' - CONFIGS ONLY - dataset_memtype follows … (achirkin)
04893c9  Add dataset_memory_type/query_memory_type as build/search parameters (achirkin)
b24fcf7  middle of merge, not building (divyegala)
30f7467  Tuning guide (cjnolet)
3e35121  Merge remote-tracking branch 'artem/enh-google-benchmarks' into enh-g… (cjnolet)
f927f69  compiling, index building successful, search failing (divyegala)
404cd10  Merge remote-tracking branch 'corey/enh-google-benchmarks' into pytho… (divyegala)
2f19c44  FEA first commit rebasing changes on gbench branch (dantegd)
e0586de  FIX fixing straggling changes from rebase (dantegd)
0eaa7e0  Fix FAISS using a destroyed stream from previous benchmark case (achirkin)
9896963  Merge remote-tracking branch 'artem/enh-google-benchmarks' into enh-g… (cjnolet)
4062d6f  Fixing issue in conf file and stubbing out parameter tuning guide (cjnolet)
7141c21  Adding CAGRA to tuning guide (cjnolet)
7c42a78  Adding ivf-flat description to tuning guide (cjnolet)
92a37a8  Updating ivf-flat and ivf-pq (cjnolet)
3982840  Adding tuning guide tables for ivf-flat and ivf-pq for faiss and raft (cjnolet)
d2bfc11  Reatio is not required (cjnolet)
80482fb  Merge remote-tracking branch 'corey/enh-google-benchmarks' into pytho… (divyegala)
31594e7  FIX changes that got lost during rebasing (dantegd)
82f195e  write build,search results (divyegala)
be6eb56  FIX PEP8 fixes (dantegd)
0cf1c6f  CLeaning up a couple configs (cjnolet)
f5bf15a  FIX typo in cmake conditional (dantegd)
617c60f  add tuning guide for cagra, modify build param (divyegala)
3948f0c  Merge remote-tracking branch 'corey/enh-google-benchmarks' into pytho… (divyegala)
74c9a1b  remove data_export, use gbench csvs to plot (divyegala)
902f9f4  fix typo in docs path for results (divyegala)
9b82f85  Merge pull request #2 from divyegala/python-ann-bench-use-gbench (cjnolet)
1198e1a  for plotting, pick up recall/qps from anywhere in the csv columns (divyegala)
24c1619  Merge remote-tracking branch 'divye/python-ann-bench-use-gbench' into… (cjnolet)
3f647c3  add output-filepath for plot.py (divyegala)
354287d  fix typo in docs (divyegala)
e0dfbab  Reverting changes to deep-100M (cjnolet)
16e233b  FIX typo in build.sh (dantegd)
cac89d0  DBG Make cmake verbose (dantegd)
bb3a194  Merge branch 'branch-23.10' into enh-google-benchmarks (achirkin)
7d8ee13  FAISS refinement (cjnolet)
1720e11  Merge branch 'enh-google-benchmarks' of github.com:cjnolet/raft into … (cjnolet)
c0ee323  FIX typo in build.sh (dantegd)
aa608d2  DBG single commit of changes (dantegd)
697ab89  FIX Add openmp changes from main branch (dantegd)
8b0c4c2  FIX recipe env variables (dantegd)
49fd31d  adding build time plot (divyegala)
be3da1a  merging corey's upstream (divyegala)
e92827a  FIX flag in the wrong conditional in build.sh (dantegd)
b9e7771  Merge pull request #4 from divyegala/python-ann-bench-use-gbench (cjnolet)
8e5ab5d  Merge remote-tracking branch 'artem/enh-google-benchmarks' into enh-g… (cjnolet)
f331a94  Merge branch 'enh-google-benchmarks' of github.com:cjnolet/raft into … (cjnolet)
b9defb7  FIX remove accidentally deleted file (dantegd)
b569861  ENH merge changes from debug PR (dantegd)
cdd8d6b  ENH merge changes from enh-google-benchmark branch and create package… (dantegd)
b9e9ea6  FIX build.sh flag that was deleted in a bad merge (dantegd)
e420593  Merge branch 'branch-23.10' into enh-google-benchmarks (achirkin)
913dec2  Move the 'dump_parameters' earlier in the benchmarks to have higher c… (achirkin)
8861fc8  Implementing some of the review feedback (cjnolet)
2f52b02  Bench ann (cjnolet)
c28326c  Merge remote-tracking branch 'artem/enh-google-benchmarks' into enh-g… (cjnolet)
521b696  Fixing a couple potential merge artifacts (cjnolet)
94296ca  FIX multiple fixes (dantegd)
0a35608  FIX multiple fixes (dantegd)
d11043c  Merge branch 'enh-google-benchmarks' into dev-enh-google-benchmarks (cjnolet)
d6757c1  Merge branch 'branch-23.10' into dev-enh-google-benchmarks (cjnolet)
78356aa  Merging python (will need some more fixes later but this will work fo… (cjnolet)
9ce6ce0  Merge branch 'branch-23.10' into dev-enh-google-benchmarks (cjnolet)
184c46d  FIX merge conflicts from fork and local (dantegd)
0dc3ce4  FIX many improvements and small fixes (dantegd)
5dd7db2  FIX small fixes from minor errors in prior merges (dantegd)
14bcb92  ANN-bench: more flexible cuda_stub.hpp (achirkin)
50c9fe2  Make dlopen more flexible looking for the cudart version to link. (achirkin)
c6df11a  Fixing style (cjnolet)
aaaa182  Merge remote-tracking branch 'artem/enh-ann-bench-flexible-stub' into… (cjnolet)
9ffd68e  Fixing omp error (cjnolet)
c947004  Merge branch 'branch-23.10' into enh-ann-bench-flexible-stub (achirkin)
11f353b  FIxing a couple thing in conf files (cjnolet)
858d0a5  Merge branch 'branch-23.10' into dev-enh-google-benchmarks (dantegd)
d236090  Adding data_export (cjnolet)
c47a1bf  Merge branch 'dev-enh-google-benchmarks' of github.com:dantegd/raft i… (cjnolet)
615807a  Merge branch 'branch-23.10' into enh-ann-bench-flexible-stub (achirkin)
feef4f3  Merge branch 'branch-23.10' into dev-enh-google-benchmarks (cjnolet)
fb2140f  Merge remote-tracking branch 'artem/enh-ann-bench-flexible-stub' into… (cjnolet)
998bf48  fix dask pinnings in raft-dask recipe (divyegala)
be85537  FIX PR review feedback and readme updates (dantegd)
abb4f69  Merge branch 'branch-23.10' into dev-enh-google-benchmarks (dantegd)
076d2de  DOC doc updates (dantegd)
c6014a9  FIX pep8 (dantegd)
5a12ce3  FIX docs and plot datasets path (dantegd)
d0c150b  Doing some cleanup of ggnn, expanding param tuning docs, fixing hnswl… (cjnolet)
03b5e4f  Merge branch 'branch-23.10' into imp-google-benchmarks (cjnolet)
fbdc1fa  FIX found typo in cmake (dantegd)
954aa87  FIX missing parameter in python (dantegd)
15b0dc0  FIX correct conditional (dantegd)
d863ce6  FIX for single gpu arch detection in CMake (dantegd)
c271a4e  Merge branch 'dev-enh-google-benchmarks' into imp-google-benchmarks (cjnolet)
0d60c56  FIX PR review fixes and a {yea} (dantegd)
0193607  More fixes (cjnolet)
fcc158a  Merge remote-tracking branch 'origin/branch-23.10' into dev-enh-googl… (cjnolet)
047e941  Merge branch 'dev-enh-google-benchmarks' into imp-google-benchmarks (cjnolet)
b7a6d9a  Merge branch 'branch-23.10' into imp-google-benchmarks (cjnolet)
9244674  Merge remote-tracking branch 'origin/branch-23.10' into imp-google-be… (cjnolet)
432fa45  Merge branch 'imp-google-benchmarks' of github.com:cjnolet/raft into … (cjnolet)
732b923  Update cpp/bench/ann/src/common/util.hpp (cjnolet)
ef112d0  Merge branch 'branch-23.10' into imp-google-benchmarks (cjnolet)
be6cd5c  Merge branch 'branch-23.10' into imp-google-benchmarks (cjnolet)
Conversations
Does this work well with num_threads > n_queries? We had similar logic in the raft host refinement, and the performance on small batches was horrible because the overhead of managing many threads dwarfed the amount of work (n_queries = 1).
Good question. This one is challenging because we don't (and shouldn't) know the number of queries when we set this argument. We could set it to something small(ish) like 8 or 16, but that would just lower the saturation point for larger query batches. In general, I know the queries for online systems are going to be long-tailed, with 1 being in the main mass and >= 100 being in the tail.
The problem is that when we run larger batch sizes, we aren't giving hnsw a fair try at all. My thinking was to take the middle ground: set this to the number of cores. I guess I should measure the impact directly. What kind of perf difference are you seeing for, say, a batch size of 10 when the thread pool contains the number of available cores?
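As a rough illustration of the trade-off being weighed here (clamping the pool to the batch size is only an option if the batch size is known when the pool is sized), a minimal sketch, not code from this PR and with hypothetical names, of picking a pool size that never exceeds the batch size:

```cpp
// Hypothetical sketch: cap the search thread-pool size at the batch size so a
// single-query batch runs on one thread while large batches can still use all
// available cores.
#include <algorithm>
#include <cstddef>
#include <thread>

inline std::size_t suggested_pool_size(std::size_t n_queries)
{
  std::size_t n_cores = std::thread::hardware_concurrency();
  if (n_cores == 0) { n_cores = 1; }  // hardware_concurrency() may report 0
  return std::min(n_queries, n_cores);
}
```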
Maybe this explicit pool does a better job than the OpenMP machinery that we rely upon in the refine operation. But there, I got something like a ~100x boost for a single-query batch (n_queries = 1, 72 cores). I did some other refactoring at the same time, though.
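For comparison, the OpenMP pattern alluded to above could clamp the team size to the batch size roughly like this (an illustrative sketch with hypothetical names, not the actual raft refine code):

```cpp
// Illustrative sketch: limit the OpenMP team to the batch size so that
// n_queries = 1 does not spin up (and synchronize) a full team of threads.
#include <algorithm>
#include <omp.h>

void search_batch(int n_queries)
{
  int n_threads = std::max(1, std::min(n_queries, omp_get_max_threads()));
#pragma omp parallel for num_threads(n_threads)
  for (int i = 0; i < n_queries; ++i) {
    // ... per-query search / refinement work ...
  }
}
```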