Add explicit instantiations for IVF-PQ search kernels used in tests #2212
What is the best way to avoid this extra compilation @lowener and @achirkin? One idea would be to cast away the index type if |
We had a discussion a little while back of just having the |
6c7275b to 1df5de7
As discussed offline, I recommend pushing further changes to the filter function to 24.06 and keeping this PR in its current, non-breaking form. I have added a reference to the tracking issue for filters, #1738.
- move common -ext.cuh headers to raft_internal
- move .cu files back to src
Overview of the changes in compile time. The tests and benchmarks with filtering, plus the test with `uint32_t` (highlighted in bold), are now compiled in parallel.
Thanks, @tfeher, for the PR! Looks good to me. A small nitpick about the internal indexing type below.
2bf77db to 53a6f1a
lgtm!
/merge
…apidsai#2212)

Compilation of IVF-PQ search kernels can be time consuming. In `libraft.so`, the compilation is done in parallel for kernels without filtering and with the `int64_t` index type.

We have a test with the `uint32_t` index type, as well as tests for `bitset_filter` with both 32-bit and 64-bit index types. This PR adds explicit template instantiations for these tests. This avoids repeated compilation of the kernels with a filter, and it also enables parallel compilation of the `compute_similarity` kernel for different template types. The kernels with these additional type parameters are not added to `libraft.so`; they are only linked together with the test executable.

Note that this PR does not increase the number of compiled kernels, but it enables compiling them in parallel.

Authors:
- Tamas Bela Feher (https://github.com/tfeher)

Approvers:
- Artem M. Chirkin (https://github.com/achirkin)
- Ben Frederickson (https://github.com/benfred)

URL: rapidsai#2212
Compilation of IVF-PQ search kernels can be time consuming. In `libraft.so`, the compilation is done in parallel for kernels without filtering and with the `int64_t` index type.

We have a test with the `uint32_t` index type, as well as tests for `bitset_filter` with both 32-bit and 64-bit index types. This PR adds explicit template instantiations for these tests. This avoids repeated compilation of the kernels with a filter, and it also enables parallel compilation of the `compute_similarity` kernel for different template types. The kernels with these additional type parameters are not added to `libraft.so`; they are only linked together with the test executable.

Note that this PR does not increase the number of compiled kernels, but it enables compiling them in parallel.
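
The general mechanism can be illustrated with a minimal, host-only C++ sketch. This is not the code from the PR: `bitset_filter_sketch` and `count_passing` are hypothetical stand-ins for a filter type and a templated search kernel, and everything is shown in one file for brevity. In a real build, the `extern template` declarations would sit in a shared header, and each explicit instantiation definition would go into its own source file so the build system can compile them in parallel.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a filter type; not the RAFT bitset_filter.
template <typename IdxT>
struct bitset_filter_sketch {
  std::vector<bool> allowed;
  bool operator()(IdxT i) const
  {
    assert(static_cast<std::size_t>(i) < allowed.size());
    return allowed[static_cast<std::size_t>(i)];
  }
};

// Hypothetical stand-in for an expensive-to-compile templated kernel.
template <typename IdxT, typename FilterT>
std::size_t count_passing(IdxT n, const FilterT& filter)
{
  std::size_t count = 0;
  for (IdxT i = 0; i < n; ++i) {
    if (filter(i)) { ++count; }
  }
  return count;
}

// Explicit instantiation declarations (would live in a shared header).
// They tell every including translation unit NOT to instantiate these
// specializations implicitly, so each one is compiled only once.
extern template std::size_t
count_passing<std::uint32_t, bitset_filter_sketch<std::uint32_t>>(
  std::uint32_t, const bitset_filter_sketch<std::uint32_t>&);
extern template std::size_t
count_passing<std::int64_t, bitset_filter_sketch<std::int64_t>>(
  std::int64_t, const bitset_filter_sketch<std::int64_t>&);

// Explicit instantiation definitions. In a real project each of these
// would be placed in a separate source file (one per type combination),
// which lets the build compile them as independent, parallel jobs.
template std::size_t
count_passing<std::uint32_t, bitset_filter_sketch<std::uint32_t>>(
  std::uint32_t, const bitset_filter_sketch<std::uint32_t>&);
template std::size_t
count_passing<std::int64_t, bitset_filter_sketch<std::int64_t>>(
  std::int64_t, const bitset_filter_sketch<std::int64_t>&);
```

A test that calls `count_passing<std::uint32_t>(n, filter)` then links against the object file holding the matching definition instead of recompiling the template body. The total number of instantiations stays the same; only where and when they are compiled changes.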