I built the GASS backend with
"cmake -G Ninja ../llvm -DLLVM_ENABLE_PROJECTS="clang;clang-tools-extra" -DLLVM_TARGETS_TO_BUILD="GASS;NVPTX;X86" -DCMAKE_BUILD_TYPE=Debug -DLLVM_ENABLE_ASSERTIONS=ON -DLLVM_ENABLE_RTTI=ON -DCUDA_ARCH=70"
Then I created a .bc file with
"clang++ --cuda-path=/usr/local/cuda-11.3/ -std=c++11 --cuda-gpu-arch=sm_70 -emit-llvm test.cu -c"
test.cu looks like
Then I used llc to invoke the GASS backend:
"llc --march=gass test-cuda-nvptx64-nvidia-cuda-sm_70.bc -o test.s"
I get the error
"GENERIC load not implemented"
So my question is: what am I doing wrong, and how should I test this program?
The failing node is
"t74: i32,ch = load<(dereferenceable load 4 from %ir.ix)> t73, FrameIndex:i64<5>, undef:i64"
The output of N->dumpr() is
"t74: i32,ch = load<(dereferenceable load 4 from %ir.ix)> t73, FrameIndex:i64<5>, undef:i64
t73: ch = store<(store 4 into %ir.idx)> t72, t70, FrameIndex:i64<7>, undef:i64
t72: ch = TokenFactor t66:1, t67:1, t68:1
t66: i32,ch = load<(dereferenceable load 4 from %ir.ix)> t65, FrameIndex:i64<5>, undef:i64
t65: ch = store<(store 4 into %ir.iy)> t52, t63, FrameIndex:i64<6>, undef:i64
t52: ch = store<(store 4 into %ir.ix)> t38, t50, FrameIndex:i64<5>, undef:i64
t38: ch = store<(store 4 into %ir.ny.addr)> t36, t10, FrameIndex:i64<4>, undef:i64
t36: ch = store<(store 4 into %ir.nx.addr)> t34, t8, FrameIndex:i64<3>, undef:i64
t34: ch = store<(store 8 into %ir.MatC.addr)> t32, t6, FrameIndex:i64<2>, undef:i64
t32: ch = store<(store 8 into %ir.MatB.addr)> t30, t4, FrameIndex:i64<1>, undef:i64
t30: ch = store<(store 8 into %ir.MatA.addr)> t0, t2, FrameIndex:i64<0>, undef:i64
t0: ch = EntryToken
t2: i64 = <<Unknown Node #338>> TargetConstant:i32<352>
t4: i64 = <<Unknown Node #338>> TargetConstant:i32<360>
t6: i64 = <<Unknown Node #338>> TargetConstant:i32<368>
t8: i32 = <<Unknown Node #338>> TargetConstant:i32<376>
t10: i32 = <<Unknown Node #338>> TargetConstant:i32<380>
t50: i32 = add t42, t49
t42: i32 = AssertZext t40, ValueType:ch:i10
t40: i32 = llvm.nvvm.read.ptx.sreg.tid.x TargetConstant:i64<5183>
t49: i32 = mul t44, t48
t44: i32 = llvm.nvvm.read.ptx.sreg.ntid.x TargetConstant:i64<5173>
t48: i32 = AssertZext t46, ValueType:ch:i31
t46: i32 = llvm.nvvm.read.ptx.sreg.ctaid.x TargetConstant:i64<5125>
t63: i32 = add t55, t62
t55: i32 = AssertZext t54, ValueType:ch:i10
t54: i32 = llvm.nvvm.read.ptx.sreg.tid.y TargetConstant:i64<5184>
t62: i32 = mul t57, t61
t57: i32 = llvm.nvvm.read.ptx.sreg.ntid.y TargetConstant:i64<5174>
t61: i32 = AssertZext t59, ValueType:ch:i16
t59: i32 = llvm.nvvm.read.ptx.sreg.ctaid.y TargetConstant:i64<5126>
t67: i32,ch = load<(dereferenceable load 4 from %ir.iy)> t65, FrameIndex:i64<6>, undef:i64
t68: i32,ch = load<(dereferenceable load 4 from %ir.ny.addr)> t65, FrameIndex:i64<4>, undef:i64
t70: i32 = add nsw t66, t69
t69: i32 = mul nsw t67, t68"
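To narrow this down, here is a rough sketch of filtering the dump for loads whose address operand is a FrameIndex, i.e. a stack slot belonging to a function-local variable. The file name dag_dump.txt is made up for illustration, and it holds only a few lines copied from the dump above. Stack-slot accesses like these typically show up in code compiled without optimization, so rebuilding the .bc with an optimization level such as -O1 might make them disappear; that is an assumption I have not checked against the GASS backend.

```shell
#!/bin/sh
# Hypothetical file name; the contents are a few lines copied from the dump.
cat > dag_dump.txt <<'EOF'
t74: i32,ch = load<(dereferenceable load 4 from %ir.ix)> t73, FrameIndex:i64<5>, undef:i64
t73: ch = store<(store 4 into %ir.idx)> t72, t70, FrameIndex:i64<7>, undef:i64
t50: i32 = add t42, t49
t67: i32,ch = load<(dereferenceable load 4 from %ir.iy)> t65, FrameIndex:i64<6>, undef:i64
EOF

# Print every load whose address operand is a FrameIndex (a stack slot).
grep -E '= load<.*FrameIndex' dag_dump.txt

# Count those loads; stores and arithmetic nodes are not matched.
frame_loads=$(grep -cE '= load<.*FrameIndex' dag_dump.txt)
echo "loads from stack slots: $frame_loads"
```

In this excerpt only t74 and t67 match; the store t73 also uses a FrameIndex but is skipped because the pattern requires a load.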
Mainly, I would like to ask how to run the test cases in an sm_70 environment.