Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace GEMM backend: cublas.gemm -> cublaslt.matmul #1736

Merged
merged 52 commits into from
Jan 23, 2024
Merged
Show file tree
Hide file tree
Changes from 39 commits
Commits
Show all changes
52 commits
Select commit Hold shift + click to select a range
2cc477b
Replace GEMM backend: cublas.gemm -> cublaslt.matmul
achirkin Aug 14, 2023
dc7a9a4
Replace broken (due to missing direct includes) direct uses of cublas…
achirkin Aug 14, 2023
34a9479
Merge branch 'branch-23.10' into fea-cublaslt-matmul
achirkin Aug 15, 2023
71c03c0
Fix docs
achirkin Aug 15, 2023
a2fb088
Replace cublasgemm where it makes sense
achirkin Aug 16, 2023
699de0c
Fix a typo
achirkin Aug 16, 2023
f994f19
Merge branch 'branch-23.10' into fea-cublaslt-matmul
achirkin Aug 16, 2023
f4d634a
Put the cache into the resource handle as a user-define resource
achirkin Aug 21, 2023
2d1bf5c
Merge branch 'branch-23.10' into fea-cublaslt-matmul
achirkin Aug 22, 2023
e57eebf
Move matmul into a separate file
achirkin Aug 22, 2023
d44bf20
Complete the docs
achirkin Aug 22, 2023
facf81d
Merge branch 'branch-23.10' into fea-cublaslt-matmul
achirkin Aug 23, 2023
157d8ae
Merge branch 'branch-23.10' into fea-cublaslt-matmul
achirkin Aug 24, 2023
be68b61
Merge branch 'branch-23.10' into fea-cublaslt-matmul
achirkin Aug 24, 2023
f5ac41a
Merge branch 'branch-23.10' into fea-cublaslt-matmul
achirkin Aug 28, 2023
2d4dcb2
Merge branch 'branch-23.10' into fea-cublaslt-matmul
achirkin Aug 29, 2023
6f58669
Merge branch 'branch-23.10' into fea-cublaslt-matmul
achirkin Aug 30, 2023
a0e93fd
Merge branch 'branch-23.10' into fea-cublaslt-matmul
achirkin Aug 30, 2023
4c0d742
Merge branch 'branch-23.10' into fea-cublaslt-matmul
achirkin Aug 30, 2023
01c3634
Merge branch 'branch-23.10' into fea-cublaslt-matmul
achirkin Aug 30, 2023
abb3f00
Merge branch 'branch-23.10' into fea-cublaslt-matmul
achirkin Aug 30, 2023
e24b1c0
Merge branch 'branch-23.10' into fea-cublaslt-matmul
achirkin Aug 31, 2023
de29580
move matmul.hpp to cublaslt_wrappers.hpp
achirkin Aug 31, 2023
3835ed0
Merge branch 'branch-23.10' into fea-cublaslt-matmul
achirkin Aug 31, 2023
de60202
Merge branch 'branch-23.10' into fea-cublaslt-matmul
achirkin Sep 1, 2023
fe84fae
Merge branch 'branch-23.10' into fea-cublaslt-matmul
achirkin Sep 5, 2023
f47626a
Merge branch 'branch-23.10' into fea-cublaslt-matmul
cjnolet Sep 6, 2023
d7efc0c
Merge branch 'branch-23.10' into fea-cublaslt-matmul
achirkin Sep 7, 2023
dd7ee22
Merge branch 'branch-23.10' into fea-cublaslt-matmul
achirkin Sep 8, 2023
01e62b0
Merge branch 'branch-23.10' into fea-cublaslt-matmul
achirkin Sep 9, 2023
8fdf6cc
Merge branch 'branch-23.10' into fea-cublaslt-matmul
achirkin Sep 13, 2023
324f5c6
Merge branch 'branch-23.10' into fea-cublaslt-matmul
achirkin Sep 19, 2023
ba6883f
Merge branch 'branch-23.10' into fea-cublaslt-matmul
achirkin Sep 19, 2023
a56ea2c
Merge branch 'branch-23.10' into fea-cublaslt-matmul
achirkin Nov 20, 2023
cd4663a
Merge branch 'branch-23.12' into fea-cublaslt-matmul
achirkin Nov 20, 2023
c2f1daa
Merge branch 'branch-24.02' into fea-cublaslt-matmul
achirkin Dec 14, 2023
c976de0
Merge branch 'branch-24.02' into fea-cublaslt-matmul
achirkin Dec 15, 2023
a5de437
Merge branch 'branch-24.02' into fea-cublaslt-matmul
achirkin Jan 8, 2024
7849786
Update copyright year for changed files
achirkin Jan 8, 2024
9bec3cf
Merge branch 'branch-24.02' into fea-cublaslt-matmul
achirkin Jan 12, 2024
ceb8d10
Merge branch 'branch-24.02' into fea-cublaslt-matmul
achirkin Jan 15, 2024
1f39534
Deprecate linalg/gemm.cuh
achirkin Jan 15, 2024
b2e3b8b
Update copyright years
achirkin Jan 15, 2024
05c64fc
Rename user_resource -> custom_resource
achirkin Jan 15, 2024
fdbe003
Merge branch 'branch-24.02' into fea-cublaslt-matmul
achirkin Jan 17, 2024
9e08c0f
Merge branch 'branch-24.02' into fea-cublaslt-matmul
achirkin Jan 17, 2024
97f1d49
Merge branch 'branch-24.02' into fea-cublaslt-matmul
achirkin Jan 18, 2024
6164e4f
Merge branch 'branch-24.02' into fea-cublaslt-matmul
achirkin Jan 19, 2024
f6ded84
Merge branch 'branch-24.02' into fea-cublaslt-matmul
achirkin Jan 19, 2024
88ecbb0
Merge branch 'branch-24.02' into fea-cublaslt-matmul
achirkin Jan 22, 2024
ca11f9f
Use plain the vector instead of the unordered_map for the cache and c…
achirkin Jan 23, 2024
47303b7
Merge branch 'branch-24.02' into fea-cublaslt-matmul
achirkin Jan 23, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 3 additions & 3 deletions cpp/include/raft/core/resource/cublas_handle.hpp
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
/*
* Copyright (c) 2022-2023, NVIDIA CORPORATION.
* Copyright (c) 2022-2024, NVIDIA CORPORATION.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
Expand Down Expand Up @@ -60,8 +60,8 @@ class cublas_resource_factory : public resource_factory {
*/

/**
* Load a cublasres_t from raft res if it exists, otherwise
* add it and return it.
* Load a `cublasHandle_t` from raft res if it exists, otherwise add it and return it.
*
* @param[in] res the raft resources object
* @return cublas handle
*/
Expand Down
68 changes: 68 additions & 0 deletions cpp/include/raft/core/resource/cublaslt_handle.hpp
Original file line number Diff line number Diff line change
@@ -0,0 +1,68 @@
/*
* Copyright (c) 2023-2024, NVIDIA CORPORATION.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#pragma once

#include <cublasLt.h>
#include <raft/core/cublas_macros.hpp>
#include <raft/core/resource/resource_types.hpp>
#include <raft/core/resources.hpp>

#include <memory>

namespace raft::resource {

class cublaslt_resource : public resource {
achirkin marked this conversation as resolved.
Show resolved Hide resolved
public:
cublaslt_resource() { RAFT_CUBLAS_TRY(cublasLtCreate(&handle_)); }
~cublaslt_resource() noexcept override { RAFT_CUBLAS_TRY_NO_THROW(cublasLtDestroy(handle_)); }
auto get_resource() -> void* override { return &handle_; }

private:
cublasLtHandle_t handle_;
};

/** Factory that knows how to construct a specific raft::resource to populate the res_t. */
class cublaslt_resource_factory : public resource_factory {
public:
auto get_resource_type() -> resource_type override { return resource_type::CUBLASLT_HANDLE; }
auto make_resource() -> resource* override { return new cublaslt_resource(); }
};

/**
* @defgroup resource_cublaslt cuBLASLt handle resource functions
* @{
*/

/**
* Load a `cublasLtHandle_t` from raft res if it exists, otherwise add it and return it.
*
* @param[in] res the raft resources object
* @return cublasLt handle
*/
inline auto get_cublaslt_handle(resources const& res) -> cublasLtHandle_t
{
if (!res.has_resource_factory(resource_type::CUBLASLT_HANDLE)) {
res.add_resource_factory(std::make_shared<cublaslt_resource_factory>());
}
auto ret = *res.get_resource<cublasLtHandle_t>(resource_type::CUBLASLT_HANDLE);
return ret;
};

/**
* @}
*/

} // namespace raft::resource
4 changes: 3 additions & 1 deletion cpp/include/raft/core/resource/resource_types.hpp
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
/*
* Copyright (c) 2022-2023, NVIDIA CORPORATION.
* Copyright (c) 2022-2024, NVIDIA CORPORATION.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
Expand Down Expand Up @@ -43,6 +43,8 @@ enum resource_type {
// CUDA-free builds
THRUST_POLICY, // thrust execution policy
WORKSPACE_RESOURCE, // rmm device memory resource
CUBLASLT_HANDLE, // cublasLt handle
USER_DEFINED, // user-defined default-constructible resource

LAST_KEY // reserved for the last key
};
Expand Down
83 changes: 83 additions & 0 deletions cpp/include/raft/core/resource/user_resource.hpp
Original file line number Diff line number Diff line change
@@ -0,0 +1,83 @@
/*
* Copyright (c) 2023-2024, NVIDIA CORPORATION.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#pragma once

#include <raft/core/resource/resource_types.hpp>
#include <raft/core/resources.hpp>

#include <memory>
#include <typeindex>

namespace raft::resource {

class user_resource : public resource {
achirkin marked this conversation as resolved.
Show resolved Hide resolved
public:
user_resource() = default;
~user_resource() noexcept override = default;
auto get_resource() -> void* override { return this; }

template <typename ResourceT>
auto load() -> ResourceT*
{
std::lock_guard<std::mutex> _(lock_);
auto key = std::type_index{typeid(ResourceT)};
auto pos = map_.find(key);
if (pos != map_.end()) { return reinterpret_cast<ResourceT*>(pos->second.get()); }
auto store_ptr = new ResourceT{};
map_[key] =
std::shared_ptr<void>(store_ptr, [](void* ptr) { delete reinterpret_cast<ResourceT*>(ptr); });
return store_ptr;
}

private:
std::unordered_map<std::type_index, std::shared_ptr<void>> map_{};
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I was thinking about overheads of copying the resources handle in the gemm/matmul, it occurred to me that we hadn't discussed what should happen to user resources when the resources handle is copied. I see three sensible options:

  1. [current implementation] The whole map is shared between the old and the new handle. Low overhead, but setting a user resource in one handle affects the other handle as well.
  2. Do a shallow copy of the map when copying the resources handle. Incurs more unwanted overheads, but the behavior of individual user resources is the same as of the compile-time-registered resources (setting an individual resource affects only the current handle).
  3. Drop the user resource map on copy. Incurs a low overhead during copy. Resets the state of the user resources, not sure for better or worse (e.g. the kernel cache resources are independent).

@cjnolet, do you have a preference here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@achirkin maybe we could provide a flag for this? Perhaps even have a predefined element in the map of user resources containing this flag? This way they user could set/change the flag using raft::resource::set_user_resource_copy_mode()` or something? We can default it to the least overhead option.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cjnolet do you mean a runtime flag? I've been thinking about that. At first glance, it seems nice to have this flexibility. However, I think, this has a potential to do more harm than good.
From what I've seen so far, most of the "users" of the user_resource are within raft itself (e.g. caching in the algorithms). When we add these use cases, we keep in mind particular behavior of the resource (e.g. cached kernel parameters are shared across the copied handles or kept private). I can imagine this behavior affecting the performance of the algorithms. If an external user is able to change this behavior at runtime (because they want a particular behavior in their piece of code) it will have an adverse effect on overall performance of the other parts of raft codebase.
Therefore, I think it's reasonable to decide on this behavior once and for all and design the uses cases with the chosen behavior in mind.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From what I've seen so far, most of the "users" of the user_resource are within raft itself (e.g. caching in the algorithms).

Your comment here makes me struggle with the name a little bit, because I was absolutely thinking we would want to invite users to use a "user resource". One of the reasons we didn't use an unordered map for the other resources is because lookup has shown to be slow, especially when the map is small.

Rather than a "user resource", this seems to be more of a "runtime shared resource cache". I'd prefer to name it appropriately based on its use and I think that name is generalized enough to tell a potential user (internal or external) what it's for. Docs can help describe it even better for anyone who might have further confusion on the name.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's ok to invite users to use this resource, but you're right the name implies it's made specifically for users, which is wrong. How about "custom_resource"?

To me, the only difference of the custom_resource to the other resources is that raft can't allocate a slot for it at compile time. I've tried to reflect it in the updated name and docs. How does it look to you now?

Also regarding the resource copy policy: I suggest we stick to the status quo now (option 1) and revise it when we decide something on raft::resources and thread-safety issue.

std::mutex lock_{};
};

/** Factory that knows how to construct a specific raft::resource to populate the res_t. */
class user_resource_factory : public resource_factory {
public:
auto get_resource_type() -> resource_type override { return resource_type::USER_DEFINED; }
auto make_resource() -> resource* override { return new user_resource(); }
};

/**
* @defgroup resource_user_defined user-defined resource functions
* @{
*/

/**
* Get the user-defined default-constructible resource if it exists, create it otherwise.
*
* @tparam ResourceT the type of the resource; it must be complete and default-constructible.
*
* @param[in] res the raft resources object
* @return a pointer to the user-defined resource.
*/
template <typename ResourceT>
auto get_user_resource(resources const& res) -> ResourceT*
{
if (!res.has_resource_factory(resource_type::USER_DEFINED)) {
res.add_resource_factory(std::make_shared<user_resource_factory>());
}
return res.get_resource<user_resource>(resource_type::USER_DEFINED)->load<ResourceT>();
};

/**
* @}
*/

} // namespace raft::resource
Loading