CUDA kernel differentiation with autodiff_deferred fails to compile if the kernel contains a CUDA.@atomic operation. The versions used are: Enzyme v0.13.19, CUDA v5.5.2, Julia v1.11.2. MWE below:
using CUDA
using Enzyme

function test_kern!(y, x)
    i = threadIdx().x
    for j=1:size(y, 1)
        @inbounds CUDA.@atomic y[j] += x[j, i]
    end
    return
end

function grad_test_kern!(y, dy, x, dx)
    autodiff_deferred(Reverse, Const(test_kern!), Const, Duplicated(y, dy), Duplicated(x, dx))
    return
end

d = 10
y = CUDA.zeros(d)
dy = similar(y)
fill!(dy, 1f0)
x = CUDA.randn(d, 128)
dx = zero(x)
@cuda threads=size(x, 2) grad_test_kern!(y, dy, x, dx);
Array(dx)
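For reference, here is a rough CPU sketch of the result the MWE should produce if it compiled. It does not use Enzyme or CUDA: `test_cpu!` and `grad_test_cpu!` are hypothetical helpers written by hand for illustration, under the assumption that the kernel computes `y[j] = sum over i of x[j, i]`, so that the reverse pass accumulates `dy[j]` into every `dx[j, i]`.

```julia
# CPU analogue of the kernel (hypothetical helper): every column i adds x[j, i] into y[j].
function test_cpu!(y, x)
    for i in axes(x, 2), j in axes(y, 1)
        y[j] += x[j, i]
    end
    return nothing
end

# Hand-written reverse pass (hypothetical helper, not Enzyme output).
# y[j] depends linearly on x[j, i] with coefficient 1, so dx[j, i] accumulates dy[j].
function grad_test_cpu!(dy, dx)
    for i in axes(dx, 2), j in axes(dy, 1)
        dx[j, i] += dy[j]
    end
    return nothing
end

d = 10
x = randn(Float32, d, 128)
y_ref = zeros(Float32, d)
test_cpu!(y_ref, x)           # forward pass: row sums of x

dy_ref = ones(Float32, d)     # same seed as fill!(dy, 1f0) in the MWE
dx_ref = zero(x)
grad_test_cpu!(dy_ref, dx_ref)
```

With the seed `dy` set to ones, `dx_ref` comes out as all ones, which is what `Array(dx)` would be expected to show once the GPU version compiles.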
The resulting error reads:
LLVM error: Cannot select: 0x74f5c70: f32,ch = AtomicLoad<(load acquire (s32) from %ir."'ipc20_unwrap.i", addrspace 1)> 0x67a6a00, 0x74f59d0,