Enzyme fails to differentiate KA.jl kernel in Julia 1.11 #2198

Open
cncastillo opened this issue Dec 14, 2024 · 6 comments

Comments

@cncastillo

Hi! First of all, thank you very much for this amazing package 😄 I have been struggling to make this simple example work in Julia 1.11.2 (it works in Julia 1.10.7):

using CUDA, KernelAbstractions, Enzyme

c = 1:64
@kernel function square!(x, @Const(c))
    I = @index(Global, Linear)
    @inbounds x[I] = c[I] * x[I] ^ 2
end

function f!(x, backend)
    kernel = square!(backend)
    kernel(x, c, ndrange = size(x))
    KernelAbstractions.synchronize(backend)
end

x = CUDA.ones(64)
backend = KernelAbstractions.get_backend(x)

∂f_∂x = similar(x)
∂f_∂x .= 1.0
Enzyme.autodiff(
    Reverse, 
    f!, 
    Duplicated(x, ∂f_∂x), 
    Const(backend)
)

∂f_∂x

When running this code I get:

ERROR: Enzyme compilation failed due to an internal error.
 Please open an issue with the code to reproduce and full error log on github.com/EnzymeAD/Enzyme.jl
 To toggle more information for debugging (needed for bug reports), set Enzyme.Compiler.VERBOSE_ERRORS[] = true (default false)

Stacktrace:
 [1] setindex!
   @ ./array.jl:987
 [2] _sort!
   @ ./sort.jl:831
 [3] multiple call sites
   @ unknown:0

Stacktrace:
  [1] (::Enzyme.Compiler.var"#getparent#69"{})(v::LLVM.Value, offset::LLVM.Value, hasload::Bool)
    @ Enzyme.Compiler ~/.julia/packages/Enzyme/haqjK/src/llvm/transforms.jl:888
  [2] (::Enzyme.Compiler.var"#getparent#69"{})(v::LLVM.Value, offset::LLVM.Value, hasload::Bool)
    @ Enzyme.Compiler ~/.julia/packages/Enzyme/haqjK/src/llvm/transforms.jl:777
  [3] nodecayed_phis!(mod::LLVM.Module)
    @ Enzyme.Compiler ~/.julia/packages/Enzyme/haqjK/src/llvm/transforms.jl:891
  [4] optimize!(mod::LLVM.Module, tm::LLVM.TargetMachine)
    @ Enzyme.Compiler ~/.julia/packages/Enzyme/haqjK/src/compiler/optimize.jl:582
  [5] codegen(output::Symbol, job::GPUCompiler.CompilerJob{…}; libraries::Bool, deferred_codegen::Bool, optimize::Bool, toplevel::Bool, strip::Bool, validate::Bool, only_entry::Bool, parent_job::Nothing)
    @ Enzyme.Compiler ~/.julia/packages/Enzyme/haqjK/src/compiler.jl:4096
  [6] codegen
    @ ~/.julia/packages/Enzyme/haqjK/src/compiler.jl:3223 [inlined]
  [7] _thunk(job::GPUCompiler.CompilerJob{Enzyme.Compiler.EnzymeTarget, Enzyme.Compiler.EnzymeCompilerParams}, postopt::Bool)
    @ Enzyme.Compiler ~/.julia/packages/Enzyme/haqjK/src/compiler.jl:5273
  [8] _thunk
    @ ~/.julia/packages/Enzyme/haqjK/src/compiler.jl:5273 [inlined]
  [9] cached_compilation
    @ ~/.julia/packages/Enzyme/haqjK/src/compiler.jl:5324 [inlined]
 [10] thunkbase(mi::Core.MethodInstance, World::UInt64, FA::Type{…}, A::Type{…}, TT::Type, Mode::Enzyme.API.CDerivativeMode, width::Int64, ModifiedBetween::NTuple{…} where N, ReturnPrimal::Bool, ShadowInit::Bool, ABI::Type, ErrIfFuncWritten::Bool, RuntimeActivity::Bool)
    @ Enzyme.Compiler ~/.julia/packages/Enzyme/haqjK/src/compiler.jl:5434
 [11] thunk_generator(world::UInt64, source::LineNumberNode, FA::Type, A::Type, TT::Type, Mode::Enzyme.API.CDerivativeMode, Width::Int64, ModifiedBetween::NTuple{…} where N, ReturnPrimal::Bool, ShadowInit::Bool, ABI::Type, ErrIfFuncWritten::Bool, RuntimeActivity::Bool, self::Any, fakeworld::Any, fa::Type, a::Type, tt::Type, mode::Type, width::Type, modifiedbetween::Type, returnprimal::Type, shadowinit::Type, abi::Type, erriffuncwritten::Type, runtimeactivity::Type)
    @ Enzyme.Compiler ~/.julia/packages/Enzyme/haqjK/src/compiler.jl:5601
 [12] autodiff
    @ ~/.julia/packages/Enzyme/haqjK/src/Enzyme.jl:485 [inlined]
 [13] autodiff
    @ ~/.julia/packages/Enzyme/haqjK/src/Enzyme.jl:544 [inlined]
 [14] autodiff(::ReverseMode{…}, ::typeof(f!), ::Duplicated{…}, ::Const{…})
    @ Enzyme ~/.julia/packages/Enzyme/haqjK/src/Enzyme.jl:516
 [15] top-level scope
    @ REPL[9]:1
Some type information was truncated. Use `show(err)` to see complete types.
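
The two hints printed with this error (the VERBOSE_ERRORS flag and the truncated type information) suggest capturing a fuller log before reporting. A minimal sketch of one way to do that, assuming a hypothetical helper named log_failure that is not part of Enzyme or of this thread; it uses only Base Julia:

```julia
# Hypothetical helper (not an Enzyme API): run `thunk`, and if it throws,
# write the error message and backtrace to the file at `path`.
# Returns true if an error was caught and logged, false otherwise.
function log_failure(thunk, path::AbstractString)
    try
        thunk()
        return false
    catch err
        open(path, "w") do io
            # showerror with a backtrace prints the message followed by the stack frames
            showerror(io, err, catch_backtrace())
        end
        return true
    end
end
```

With Enzyme loaded, one would presumably first set Enzyme.Compiler.VERBOSE_ERRORS[] = true, as the error message suggests, and then call log_failure(() -> Enzyme.autodiff(Reverse, f!, Duplicated(x, ∂f_∂x), Const(backend)), "enzyme_error.log") so the resulting file can be attached to the issue.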
@wsmoses
Member

wsmoses commented Dec 17, 2024

Edit: ah yes, as you mentioned, it does work on 1.10.

Can you retry with Julia 1.10? I think this is an issue with 1.11's gc_loaded.

@wsmoses
Member

wsmoses commented Dec 23, 2024

Can you retry this on latest main and see if it still triggers?
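
For readers following along, one way to try unreleased code is to track the repository's main branch with Pkg. A minimal sketch, assuming the registry resolves the name Enzyme and that the branch is called main, as the comment implies; the function name use_enzyme_main is made up for illustration:

```julia
using Pkg

# Illustrative wrapper (the name is hypothetical): switch the active environment
# to the main branch of Enzyme, then print what was resolved.
function use_enzyme_main()
    Pkg.add(name = "Enzyme", rev = "main")  # track the branch instead of a tagged release
    Pkg.status("Enzyme")                    # show which version/revision is now in use
end
```

Calling use_enzyme_main() needs network access and modifies the active environment, so a throwaway environment is probably the safer place to run it. Pkg.free("Enzyme") should return to the registered release afterwards.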

@cncastillo
Author

cncastillo commented Dec 23, 2024

I tried the latest stable version (Enzyme v0.13.25), which gives the same error, and the current dev version, which gives a new error:

StackOverflowError:
Stacktrace:
  [1] LLVM.LLVMType(ref::Ptr{LLVM.API.LLVMOpaqueType})
    @ LLVM ~/.julia/packages/LLVM/wMjUU/src/core/type.jl:49
  [2] value_type
    @ ~/.julia/packages/LLVM/wMjUU/src/core/value.jl:54 [inlined]
  [3] (::Enzyme.Compiler.var"#getparent#71"{LLVM.Context, LLVM.Function, LLVM.IntegerType, Int64, Dict{LLVM.PHIInst, LLVM.PHIInst}, Dict{LLVM.PHIInst, LLVM.PHIInst}, LLVM.PHIInst, LLVM.BitCastInst})(b::LLVM.IRBuilder, v::LLVM.Value, offset::LLVM.Value, hasload::Bool)
    @ Enzyme.Compiler ~/.julia/dev/Enzyme/src/llvm/transforms.jl:609
  [4] (::Enzyme.Compiler.var"#getparent#71"{LLVM.Context, LLVM.Function, LLVM.IntegerType, Int64, Dict{LLVM.PHIInst, LLVM.PHIInst}, Dict{LLVM.PHIInst, LLVM.PHIInst}, LLVM.PHIInst, LLVM.BitCastInst})(b::LLVM.IRBuilder, v::LLVM.Value, offset::LLVM.Value, hasload::Bool)
    @ Enzyme.Compiler ~/.julia/dev/Enzyme/src/llvm/transforms.jl:615
  [5] (::Enzyme.Compiler.var"#getparent#71"{LLVM.Context, LLVM.Function, LLVM.IntegerType, Int64, Dict{LLVM.PHIInst, LLVM.PHIInst}, Dict{LLVM.PHIInst, LLVM.PHIInst}, LLVM.PHIInst, LLVM.BitCastInst})(b::LLVM.IRBuilder, v::LLVM.Value, offset::LLVM.Value, hasload::Bool) (repeats 10889 times)
    @ Enzyme.Compiler ~/.julia/dev/Enzyme/src/llvm/transforms.jl:859
  [6] (::Enzyme.Compiler.var"#getparent#71"{LLVM.Context, LLVM.Function, LLVM.IntegerType, Int64, Dict{LLVM.PHIInst, LLVM.PHIInst}, Dict{LLVM.PHIInst, LLVM.PHIInst}, LLVM.PHIInst, LLVM.BitCastInst})(b::LLVM.IRBuilder, v::LLVM.Value, offset::LLVM.Value, hasload::Bool)
    @ Enzyme.Compiler ~/.julia/dev/Enzyme/src/llvm/transforms.jl:780
  [7] (::Enzyme.Compiler.var"#getparent#71"{LLVM.Context, LLVM.Function, LLVM.IntegerType, Int64, Dict{LLVM.PHIInst, LLVM.PHIInst}, Dict{LLVM.PHIInst, LLVM.PHIInst}, LLVM.PHIInst, LLVM.BitCastInst})(b::LLVM.IRBuilder, v::LLVM.Value, offset::LLVM.Value, hasload::Bool)
    @ Enzyme.Compiler ~/.julia/dev/Enzyme/src/llvm/transforms.jl:644
  [8] (::Enzyme.Compiler.var"#getparent#71"{LLVM.Context, LLVM.Function, LLVM.IntegerType, Int64, Dict{LLVM.PHIInst, LLVM.PHIInst}, Dict{LLVM.PHIInst, LLVM.PHIInst}, LLVM.PHIInst, LLVM.BitCastInst})(b::LLVM.IRBuilder, v::LLVM.Value, offset::LLVM.Value, hasload::Bool)
    @ Enzyme.Compiler ~/.julia/dev/Enzyme/src/llvm/transforms.jl:780
  [9] nodecayed_phis!(mod::LLVM.Module)
    @ Enzyme.Compiler ~/.julia/dev/Enzyme/src/llvm/transforms.jl:933
 [10] optimize!(mod::LLVM.Module, tm::LLVM.TargetMachine)
    @ Enzyme.Compiler ~/.julia/dev/Enzyme/src/compiler/optimize.jl:582
 [11] codegen(output::Symbol, job::GPUCompiler.CompilerJob{Enzyme.Compiler.EnzymeTarget, Enzyme.Compiler.EnzymeCompilerParams}; libraries::Bool, deferred_codegen::Bool, optimize::Bool, toplevel::Bool, strip::Bool, validate::Bool, only_entry::Bool, parent_job::Nothing)
    @ Enzyme.Compiler ~/.julia/dev/Enzyme/src/compiler.jl:4108
 [12] codegen
    @ ~/.julia/dev/Enzyme/src/compiler.jl:3240 [inlined]
 [13] _thunk(job::GPUCompiler.CompilerJob{Enzyme.Compiler.EnzymeTarget, Enzyme.Compiler.EnzymeCompilerParams}, postopt::Bool)
    @ Enzyme.Compiler ~/.julia/dev/Enzyme/src/compiler.jl:5289
 [14] _thunk
    @ ~/.julia/dev/Enzyme/src/compiler.jl:5289 [inlined]
 [15] cached_compilation
    @ ~/.julia/dev/Enzyme/src/compiler.jl:5341 [inlined]
 [16] thunkbase(mi::Core.MethodInstance, World::UInt64, FA::Type{<:Annotation}, A::Type{<:Annotation}, TT::Type, Mode::Enzyme.API.CDerivativeMode, width::Int64, ModifiedBetween::NTuple{N, Bool} where N, ReturnPrimal::Bool, ShadowInit::Bool, ABI::Type, ErrIfFuncWritten::Bool, RuntimeActivity::Bool, edges::Vector{Any})
    @ Enzyme.Compiler ~/.julia/dev/Enzyme/src/compiler.jl:5452
 [17] thunk_generator(world::UInt64, source::LineNumberNode, FA::Type, A::Type, TT::Type, Mode::Enzyme.API.CDerivativeMode, Width::Int64, ModifiedBetween::NTuple{N, Bool} where N, ReturnPrimal::Bool, ShadowInit::Bool, ABI::Type, ErrIfFuncWritten::Bool, RuntimeActivity::Bool, self::Any, fakeworld::Any, fa::Type, a::Type, tt::Type, mode::Type, width::Type, modifiedbetween::Type, returnprimal::Type, shadowinit::Type, abi::Type, erriffuncwritten::Type, runtimeactivity::Type)
    @ Enzyme.Compiler ~/.julia/dev/Enzyme/src/compiler.jl:5637
 [18] autodiff
    @ ~/.julia/dev/Enzyme/src/Enzyme.jl:485 [inlined]
 [19] autodiff
    @ ~/.julia/dev/Enzyme/src/Enzyme.jl:544 [inlined]
 [20] autodiff(::ReverseMode{false, false, FFIABI, false, false}, ::typeof(f!), ::Duplicated{CuArray{Float32, 1, CUDA.DeviceMemory}}, ::Const{CUDABackend})
    @ Enzyme ~/.julia/dev/Enzyme/src/Enzyme.jl:516
 [21] top-level scope
    @ REPL[10]:1

@wsmoses
Member

wsmoses commented Jan 2, 2025

Okay, can you give this another go? The getparent issue should (I hope) be fixed now.

@cncastillo
Author

Using the latest dev version I get:

ERROR: Enzyme compilation failed due to an internal error.
 Please open an issue with the code to reproduce and full error log on github.com/EnzymeAD/Enzyme.jl
 To toggle more information for debugging (needed for bug reports), set Enzyme.Compiler.VERBOSE_ERRORS[] = true (default false)

Stacktrace:
 [1] #synchronize#1003
   @ ~/.julia/packages/CUDA/2kjXI/lib/cudadrv/synchronization.jl:200
 [2] synchronize (repeats 2 times)
   @ ~/.julia/packages/CUDA/2kjXI/lib/cudadrv/synchronization.jl:194
 [3] synchronize
   @ ~/.julia/packages/CUDA/2kjXI/src/CUDAKernels.jl:29
 [4] augmented_primal
   @ ~/.julia/packages/KernelAbstractions/0r40T/ext/EnzymeExt.jl:61

Stacktrace:
  [1] (::Enzyme.Compiler.var"#getparent#69"{})(b::LLVM.IRBuilder, v::LLVM.Value, offset::LLVM.Value, hasload::Bool, phicache::Dict{…})
    @ Enzyme.Compiler ~/.julia/dev/Enzyme/src/llvm/transforms.jl:931
  [2] (::Enzyme.Compiler.var"#getparent#69"{})(b::LLVM.IRBuilder, v::LLVM.Value, offset::LLVM.Value, hasload::Bool, phicache::Dict{…})
    @ Enzyme.Compiler ~/.julia/dev/Enzyme/src/llvm/transforms.jl:615
  [3] (::Enzyme.Compiler.var"#getparent#69"{})(b::LLVM.IRBuilder, v::LLVM.Value, offset::LLVM.Value, hasload::Bool, phicache::Dict{…})
    @ Enzyme.Compiler ~/.julia/dev/Enzyme/src/llvm/transforms.jl:644
  [4] (::Enzyme.Compiler.var"#getparent#69"{})(b::LLVM.IRBuilder, v::LLVM.Value, offset::LLVM.Value, hasload::Bool, phicache::Dict{…})
    @ Enzyme.Compiler ~/.julia/dev/Enzyme/src/llvm/transforms.jl:780
  [5] nodecayed_phis!(mod::LLVM.Module)
    @ Enzyme.Compiler ~/.julia/dev/Enzyme/src/llvm/transforms.jl:938
  [6] optimize!(mod::LLVM.Module, tm::LLVM.TargetMachine)
    @ Enzyme.Compiler ~/.julia/dev/Enzyme/src/compiler/optimize.jl:582
  [7] nested_codegen!(mode::Enzyme.API.CDerivativeMode, mod::LLVM.Module, funcspec::Core.MethodInstance, world::UInt64)
    @ Enzyme.Compiler ~/.julia/dev/Enzyme/src/compiler.jl:401
  [8] enzyme_custom_common_rev(forward::Bool, B::LLVM.IRBuilder, orig::LLVM.CallInst, gutils::Enzyme.Compiler.GradientUtils, normalR::Ptr{…}, shadowR::Ptr{…}, tape::Nothing)
    @ Enzyme.Compiler ~/.julia/dev/Enzyme/src/rules/customrules.jl:960
  [9] enzyme_custom_augfwd
    @ ~/.julia/dev/Enzyme/src/rules/customrules.jl:1503 [inlined]
 [10] enzyme_custom_augfwd_cfunc(B::Ptr{…}, OrigCI::Ptr{…}, gutils::Ptr{…}, normalR::Ptr{…}, shadowR::Ptr{…}, tapeR::Ptr{…})
    @ Enzyme.Compiler ~/.julia/dev/Enzyme/src/rules/llvmrules.jl:18
 [11] EnzymeCreatePrimalAndGradient(logic::Enzyme.Logic, todiff::LLVM.Function, retType::Enzyme.API.CDIFFE_TYPE, constant_args::Vector{…}, TA::Enzyme.TypeAnalysis, returnValue::Bool, dretUsed::Bool, mode::Enzyme.API.CDerivativeMode, runtimeActivity::Bool, width::Int64, additionalArg::Ptr{…}, forceAnonymousTape::Bool, typeInfo::Enzyme.FnTypeInfo, uncacheable_args::Vector{…}, augmented::Ptr{…}, atomicAdd::Bool)
    @ Enzyme.API ~/.julia/dev/Enzyme/src/api.jl:268
 [12] enzyme!(job::GPUCompiler.CompilerJob{…}, mod::LLVM.Module, primalf::LLVM.Function, TT::Type, mode::Enzyme.API.CDerivativeMode, width::Int64, parallel::Bool, actualRetType::Type, wrap::Bool, modifiedBetween::NTuple{…} where N, returnPrimal::Bool, expectedTapeType::Type, loweredArgs::Set{…}, boxedArgs::Set{…})
    @ Enzyme.Compiler ~/.julia/dev/Enzyme/src/compiler.jl:1703
 [13] codegen(output::Symbol, job::GPUCompiler.CompilerJob{…}; libraries::Bool, deferred_codegen::Bool, optimize::Bool, toplevel::Bool, strip::Bool, validate::Bool, only_entry::Bool, parent_job::Nothing)
    @ Enzyme.Compiler ~/.julia/dev/Enzyme/src/compiler.jl:4547
 [14] codegen
    @ ~/.julia/dev/Enzyme/src/compiler.jl:3350 [inlined]
 [15] _thunk(job::GPUCompiler.CompilerJob{Enzyme.Compiler.EnzymeTarget, Enzyme.Compiler.EnzymeCompilerParams}, postopt::Bool)
    @ Enzyme.Compiler ~/.julia/dev/Enzyme/src/compiler.jl:5407
 [16] _thunk
    @ ~/.julia/dev/Enzyme/src/compiler.jl:5407 [inlined]
 [17] cached_compilation
    @ ~/.julia/dev/Enzyme/src/compiler.jl:5459 [inlined]
 [18] thunkbase(mi::Core.MethodInstance, World::UInt64, FA::Type{…}, A::Type{…}, TT::Type, Mode::Enzyme.API.CDerivativeMode, width::Int64, ModifiedBetween::NTuple{…} where N, ReturnPrimal::Bool, ShadowInit::Bool, ABI::Type, ErrIfFuncWritten::Bool, RuntimeActivity::Bool, edges::Vector{…})
    @ Enzyme.Compiler ~/.julia/dev/Enzyme/src/compiler.jl:5570
 [19] thunk_generator(world::UInt64, source::LineNumberNode, FA::Type, A::Type, TT::Type, Mode::Enzyme.API.CDerivativeMode, Width::Int64, ModifiedBetween::NTuple{…} where N, ReturnPrimal::Bool, ShadowInit::Bool, ABI::Type, ErrIfFuncWritten::Bool, RuntimeActivity::Bool, self::Any, fakeworld::Any, fa::Type, a::Type, tt::Type, mode::Type, width::Type, modifiedbetween::Type, returnprimal::Type, shadowinit::Type, abi::Type, erriffuncwritten::Type, runtimeactivity::Type)
    @ Enzyme.Compiler ~/.julia/dev/Enzyme/src/compiler.jl:5755
 [20] autodiff
    @ ~/.julia/dev/Enzyme/src/Enzyme.jl:485 [inlined]
 [21] autodiff
    @ ~/.julia/dev/Enzyme/src/Enzyme.jl:544 [inlined]
 [22] autodiff(::ReverseMode{…}, ::typeof(f!), ::Duplicated{…}, ::Const{…})
    @ Enzyme ~/.julia/dev/Enzyme/src/Enzyme.jl:516
 [23] top-level scope
    @ REPL[16]:1
Some type information was truncated. Use `show(err)` to see complete types.

@wsmoses
Member

wsmoses commented Jan 9, 2025

Okay, my patch to CUDA.jl fixing that has been released. Want to give it another go?
