-
Notifications
You must be signed in to change notification settings - Fork 73
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FEA] Support for half-float mixed precise in brute-force #225
Conversation
rhdong
commented
Jul 17, 2024
•
edited
Loading
edited
- distance supports half-float mixed precision
- prefiltered_brute_force supports half
- migrate the ann brute force test cases and support half
- distance adaptation - prefiltered_brute_force supports half - migrate the ann brute force test cases and support half
Hey @cjnolet @benfred , here is the performance result(code link), one line of float following one for
|
Linking #110 |
Thanks for providing the benchmarks above @rhdong. For smaller number of queries (e.g. 10) it looks like the half precision is significantly slower than the single-precision. It's very common for these algos to be used in online scenarios where 1 query at a time is used. Any idea why we are seeing this perf degradation and how to fix it? |
It looks like it only happens on Col_Major; let me take a look. |
@@ -61,8 +61,8 @@ endfunction() | |||
# To use a different RAFT locally, set the CMake variable | |||
# CPM_raft_SOURCE=/path/to/local/raft | |||
find_and_configure_raft(VERSION ${RAFT_VERSION}.00 | |||
FORK ${RAFT_FORK} | |||
PINNED_TAG ${RAFT_PINNED_TAG} | |||
FORK rhdong |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question- why is this dependent upon RAFT? what do we still have in raft that is needed for this PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a PR for raft which is required by the feature: rapidsai/raft#2382
@@ -0,0 +1,51 @@ | |||
/* | |||
* Copyright (c) 2021-2023, NVIDIA CORPORATION. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All new files should have 2024 as the year. Please make this change in all new files introduced in this PR.
template <typename Type, | ||
typename layout = layout_c_contiguous, | ||
typename IdxT = int, | ||
typename DistT = Type> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm really really not in love at all with introducing new template types. We should be going the opposite direction.
cpp/src/distance/distance.cu
Outdated
raft::resources const& handle, \ | ||
raft::device_matrix_view<const DataT, IdxT, layout> const x, \ | ||
raft::device_matrix_view<const DataT, IdxT, layout> const y, \ | ||
raft::device_matrix_view<OutT, IdxT, layout> dist, \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need a new template param for this? Is this absolutely necessary or is this just a convenience?
I understand you were able to drop the binary size down in this PR by fixing the pairwise distance instnatiations, but that fix should be decoupled from your additions here. In other words- if we could drop the binary size by 100mb or more just by fixing exiting instantiations, I'd rather not add new intantiations in unecessarily that add to the binary size. Can these be at all avoided using the existing template types? We have a lot of them already...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I'm afraid it's necessary if we want to support a distance type different from the input type(like half -> float) unless we make large-scale refactoring on this part of code because all of the APIs assume inputs and output share the same type before the PR
.
… control - Refactoring `distance-ext.cuh` for fewer line code
… rhdong/half-knn
@@ -15,7 +15,7 @@ | |||
*/ | |||
#pragma once | |||
|
|||
#ifndef CUVS_EXPLICIT_INSTANTIATE_ONLY | |||
#ifndef _CUVS_EXPLICIT_INSTANTIATE_ONLY |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @cjnolet , I'm not so sure if our Cagra module needs to open this macro, so add a _
for the placeholder and wait for further instruction. If you feel it is necessary, could you help tag suitable reviewers/owners to take a look? (I have tried to open it, got redefined errors, I guess they caused by search_single_cta_inst.cuh
and search_multi_cta_inst.cuh
). Thanks!
@@ -23,7 +23,7 @@ | |||
namespace cuvs::neighbors::cagra::detail { | |||
namespace single_cta_search { | |||
|
|||
#ifdef CUVS_EXPLICIT_INSTANTIATE_ONLY | |||
#ifdef _CUVS_EXPLICIT_INSTANTIATE_ONLY |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @cjnolet , I'm not so sure if our Cagra module needs to open this macro, so add a _
for the placeholder. If you feel it is necessary, could you help tag a suitable reviewer to take a look? Thanks!
/merge |