Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Propagate kernel name via KernelSettings #278

Merged
merged 5 commits into from
Nov 22, 2024

Conversation

torsteingrindvik
Copy link
Contributor

Fixes #234

Names now show up!

image

Propagating this down from the proc macro until both crates/cubecl-cpp/src/shared/kernel.rs as well as crates/cubecl-cuda/src/compute/server.rs could get a hold of the name required a bit of kernel_name field duplication in the relevant data structures.
If there's a more straight forward way to propagate it I'm happy to do that, but I couldn't see any obvious way.

Also notice from the image that the same four names show up twice, but with different performance characteristics.
This is because the example I ran first goes through f32 then f16.
I hope to support automatically adding a suffix in a follow-up if it's not too hairy.

…code kernel naming

Signed-off-by: Torstein Grindvik <[email protected]>
@ArthurBrussee
Copy link
Contributor

That's awesome! Had the same problem, but in wgpu. Would it be possible to use the names as a label for the shader on wgpu too?

@torsteingrindvik
Copy link
Contributor Author

torsteingrindvik commented Nov 19, 2024

That's awesome! Had the same problem, but in wgpu. Would it be possible to use the names as a label for the shader on wgpu too?

Yes, from what I've done in wgpu I know that the entry point can be named by users somewhere in the pipeline setup.
Then similarly the CubeCL code that produces wgsl source code should match the pipeline setup.

Do you have a debugger for wgpu that shows just "main" "main" "main" etc.? Asking because I don't know what the best way of seeing compute shader calls from wpgu is.

@ maintainers: If you see this PR and you're happy please go ahead and merge. I'll then do the wgpu in a follow-up as I'm out of time right now. Else I'll continue here next time I'm coding

Comment on lines 9 to 10
pub name: Option<&'static str>,
pub kernel_name: String,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the difference between name and kernel_name ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should aim to have only one name field for simplicity.

Copy link
Contributor Author

@torsteingrindvik torsteingrindvik Nov 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, I didn't do it in the first pass since I wasn't sure where it was used.

I removed the previous name. That was a full core::any::type_name name which can't be used as (CUDA, wgpu, etc.) names since it has fully qualified paths and generics etc.

Luckily it was only used for debugging: #38

Your PR there shows the full name:

name: gelu::gelu_array::GeluArray<
    cubecl_core::frontend::element::float::F32,
    cubecl_cuda::runtime::CudaRuntime,
>
cube_dim: (4, 1, 1)
shared_memory: 0 bytes
source:

extern "C" __global__ void kernel(float input_0[], float output_0[],
                                  uint info[]) {

With this PR that changes to:

name: gelu_array
cube_dim: (1, 1, 1)
shared_memory: 0 bytes

source:

extern "C" __global__ void gelu_array(float_4 input_0[], float_4 output_0[],
                                      uint info[]) {

(I edited the snippets to show the relevant comparison).

Copy link
Member

@syl20bnr syl20bnr Nov 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it is useful to have the fully qualified name in the debug info. In this case we could maybe keep both but name them more explicitly name and fqn_name ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. I added that back, but changed the names to "entrypoint_name" (the new one) and "debug_name" (the old one).

To make it less confusing I added some docs:

image

@torsteingrindvik
Copy link
Contributor Author

Added a commit to make wgpu + wgsl/spirv also use the kernel name as their entrypoint names.

Copy link
Member

@nathanielsimard nathanielsimard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM Thanks 🙏

@nathanielsimard nathanielsimard merged commit a0b1971 into tracel-ai:main Nov 22, 2024
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

CUDA kernels names shouldn't all be "kernel"
4 participants