We currently rely on the following hack in order to have code adapt to the CUDA target architecture:
#[import(cc = "device", name = "[]{return __CUDA_ARCH__;}")] fn cuda_device_arch() -> i32;
Calling this "function" generates []{return __CUDA_ARCH__;}(), which evaluates to the desired value.
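To illustrate why the trick works: the backend seems to paste the imported name verbatim and append the call parentheses, so a lambda expression used as the "name" turns into an immediately invoked lambda. Below is a minimal host-only C++ sketch of the emitted call site. The macro DEMO_ARCH and the function emitted_call_site are hypothetical stand-ins, since __CUDA_ARCH__ is only defined while compiling device code.

```cpp
// Stand-in for __CUDA_ARCH__, which only exists during CUDA device compilation.
// DEMO_ARCH and its value are assumptions made for this host-only sketch.
#define DEMO_ARCH 750

// Roughly what the backend would emit for a call to cuda_device_arch():
// the imported "name" followed by "()", i.e. an immediately invoked lambda.
inline int emitted_call_site() {
    return []{return DEMO_ARCH;}();
}
```

The lambda has no captures and a constant body, so the downstream compiler can be expected to fold the call into the constant.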
While this works very well, it is not very elegant. I would therefore like to propose allowing built-in variables to be imported in addition to functions, for example:
#[import(cc = "device", name = "threadIdx.x")] threadIdx_x: u32;
Any use of threadIdx_x would simply generate threadIdx.x.
This would also mean there's no longer a need to emit wrapper functions like
just so that the downstream compiler can inline them away again.
In fact, I would go as far as to propose to allow an arbitrary expression (really just a string that's emitted like the name would have been) to be specified for an imported variable instead of a name, e.g.:
where any read of the variable's value would simply generate (threadIdx.x * 2). One can specify either a name or an expression. This seems both technically sounder and potentially very useful: it would allow us to inject arbitrary expressions into the generated code and thus give us access to the full set of features available in the target language, without relying on the backend/runtime to add more and more wrapper functions for every use case we might imagine…
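As a rough sketch of what this could mean on the backend side (not the actual Artic/Thorin code), a C-like emitter would only need to paste the stored string wherever the variable is read. The ImportedVar struct and the emit_read function below are hypothetical names introduced purely for illustration.

```cpp
#include <string>

// Hypothetical description of an imported variable: exactly one of
// `name` or `expr` is expected to be non-empty.
struct ImportedVar {
    std::string name;  // e.g. "threadIdx.x"
    std::string expr;  // e.g. "(threadIdx.x * 2)"
};

// Text a C-like backend could emit for every read of the variable:
// the expression if one was given, otherwise the plain name.
inline std::string emit_read(const ImportedVar& v) {
    return v.expr.empty() ? v.name : v.expr;
}
```

In this sketch the string is emitted verbatim, so any parentheses needed for correct precedence would have to be part of the expression the user writes.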
For thread IDs in particular, I think it's good that they are functions, because at the IR level that's what they have to be: you might need to generate arbitrary code every time they're accessed. There may be room for some syntactic sugar (where () is no longer necessary to perform a call in certain scenarios), but ideally as part of a bigger language expansion, not piecemeal.
One thing we could add, for C-based backends anyways, is a sort of asm-like construct, but for C code:
// rough draft, don't look into the signature too much
#[import(cc = "thorin", name = "c_mixin")]
fn c_mixin[T](code: &[u8], yield: &[u8], ...);

fn @get_bits_in_byte() -> i32 {
    let x = c_mixin("#include <limits.h>", "CHAR_BIT");
    return x;
}
That would perhaps even simplify the backend a little, allowing more platform features to be exposed as part of library code without needing backend support.
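For illustration only, here is one guess at the C code such a mixin might produce for get_bits_in_byte. It assumes the first argument is emitted as a file-scope fragment and the second argument is pasted as the yielded expression. The draft above leaves the exact semantics open, so this is an interpretation rather than a specification.

```cpp
#include <limits.h>  // from the first c_mixin argument, assumed hoisted to file scope

// Possible output for get_bits_in_byte(): the second argument ("CHAR_BIT")
// is pasted as the expression whose value the mixin yields.
inline int get_bits_in_byte() {
    int x = CHAR_BIT;
    return x;
}
```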
> For thread IDs in particular, I think it's good that they are functions, because at the IR level that's what they have to be: you might need to generate arbitrary code every time they're accessed. There may be room for some syntactic sugar (where () is no longer necessary to perform a call in certain scenarios), but ideally as part of a bigger language expansion, not piecemeal.
Well, I guess another way of looking at what I'm suggesting is as a means of specifying exactly what code to generate every time the value in question is accessed. And yes, the ability to do more things through library code rather than hardcoding them into the C backend is exactly why I think this would be great to have. In particular, it would allow us to try things out in the codebase we're currently working on without first having to integrate them into the backend, only to find out later that they didn't work out the way we had hoped…