[Issue]: Build composable_kernel error with "CMAKE_BUILD_TYPE=Debug" #1709

Open
kewang-xlnx opened this issue Dec 2, 2024 · 4 comments

kewang-xlnx commented Dec 2, 2024

Problem Description

I wanted to build composable_kernel with "CMAKE_BUILD_TYPE=Debug". The configure step succeeded with the following command:

cmake \
  -D CMAKE_PREFIX_PATH=/opt/rocm \
  -D CMAKE_CXX_COMPILER=/opt/rocm/bin/hipcc \
  -D CMAKE_BUILD_TYPE=Debug \
  -D GPU_TARGETS="gfx90a" \
  ..

However, when I built the CK library with "make -j", I got the following error messages:


[ 33%] Built target generate_cpp_files
ld.lld: error: ../../library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/xdl/mem/device_grouped_conv2d_fwd_xdl_nhwgc_gkyxc_nhwgk_int8_mem_inter_instance.cpp.o:(.rodata._ZN2c
k16tensor_operation6device47DeviceGroupedConvFwdMultipleABD_Xdl_CShuffle_V3ILi2ENS_13tensor_layout11convolution5NHWGCENS4_5GKYXCENS_5TupleIJEEENS4_5NHWGKEaaiaS8_aNS0_12element_wise11PassThroughESB_SB_LNS1_32ConvolutionForwardSpecial
izationE2ELNS1_18GemmSpecializationE7ELi128ELi16ELi32ELi64ELi8ELi8ELi16ELi16ELi1ELi1ENS_8SequenceIJLi8ELi16ELi1EEEENSE_IJLi1ELi0ELi2EEEESG_Li2ELi8ELi8ELi0ESF_SG_SG_Li2ELi8ELi8ELi0ELi1ELi1ENSE_IJLi1ELi16ELi1ELi8EEEELi4ELNS_26BlockGem
mPipelineSchedulerE1ELNS_24BlockGemmPipelineVersionE1EaaE7Invoker7RunGemmERKNSK_8ArgumentERK12StreamConfig+0x0): relocation R_X86_64_PC32 out of range: 2147663974 is not in [-2147483648, 2147483647]; references section '.text._ZN2ck
16tensor_operation6device47DeviceGroupedConvFwdMultipleABD_Xdl_CShuffle_V3ILi2ENS_13tensor_layout11convolution5NHWGCENS4_5GKYXCENS_5TupleIJEEENS4_5NHWGKEaaiaS8_aNS0_12element_wise11PassThroughESB_SB_LNS1_32ConvolutionForwardSpeciali
zationE2ELNS1_18GemmSpecializationE7ELi128ELi16ELi32ELi64ELi8ELi8ELi16ELi16ELi1ELi1ENS_8SequenceIJLi8ELi16ELi1EEEENSE_IJLi1ELi0ELi2EEEESG_Li2ELi8ELi8ELi0ESF_SG_SG_Li2ELi8ELi8ELi0ELi1ELi1ENSE_IJLi1ELi16ELi1ELi8EEEELi4ELNS_26BlockGemm
PipelineSchedulerE1ELNS_24BlockGemmPipelineVersionE1EaaE7Invoker7RunGemmERKNSK_8ArgumentERK12StreamConfig'
>>> referenced by device_grouped_conv2d_fwd_xdl_nhwgc_gkyxc_nhwgk_int8_mem_inter_instance.cpp

ld.lld: error: ../../library/src/tensor_operation_instance/gpu/grouped_conv2d_fwd/CMakeFiles/device_grouped_conv2d_fwd_instance.dir/xdl/mem/device_grouped_conv2d_fwd_xdl_nhwgc_gkyxc_nhwgk_int8_mem_inter_instance.cpp.o:(.rodata._ZN2c
k16tensor_operation6device47DeviceGroupedConvFwdMultipleABD_Xdl_CShuffle_V3ILi2ENS_13tensor_layout11convolution5NHWGCENS4_5GKYXCENS_5TupleIJEEENS4_5NHWGKEaaiaS8_aNS0_12element_wise11PassThroughESB_SB_LNS1_32ConvolutionForwardSpecial
izationE2ELNS1_18GemmSpecializationE7ELi128ELi16ELi32ELi64ELi8ELi8ELi16ELi16ELi1ELi1ENS_8SequenceIJLi8ELi16ELi1EEEENSE_IJLi1ELi0ELi2EEEESG_Li2ELi8ELi8ELi0ESF_SG_SG_Li2ELi8ELi8ELi0ELi1ELi1ENSE_IJLi1ELi16ELi1ELi8EEEELi4ELNS_26BlockGem
mPipelineSchedulerE1ELNS_24BlockGemmPipelineVersionE1EaaE7Invoker7RunGemmERKNSK_8ArgumentERK12StreamConfig+0x4): relocation R_X86_64_PC32 out of range: 2147664010 is not in [-2147483648, 2147483647]; references section '.text._ZN2ck
16tensor_operation6device47DeviceGroupedConvFwdMultipleABD_Xdl_CShuffle_V3ILi2ENS_13tensor_layout11convolution5NHWGCENS4_5GKYXCENS_5TupleIJEEENS4_5NHWGKEaaiaS8_aNS0_12element_wise11PassThroughESB_SB_LNS1_32ConvolutionForwardSpeciali
zationE2ELNS1_18GemmSpecializationE7ELi128ELi16ELi32ELi64ELi8ELi8ELi16ELi16ELi1ELi1ENS_8SequenceIJLi8ELi16ELi1EEEENSE_IJLi1ELi0ELi2EEEESG_Li2ELi8ELi8ELi0ESF_SG_SG_Li2ELi8ELi8ELi0ELi1ELi1ENSE_IJLi1ELi16ELi1ELi8EEEELi4ELNS_26BlockGemm
PipelineSchedulerE1ELNS_24BlockGemmPipelineVersionE1EaaE7Invoker7RunGemmERKNSK_8ArgumentERK12StreamConfig'
>>> referenced by device_grouped_conv2d_fwd_xdl_nhwgc_gkyxc_nhwgk_int8_mem_inter_instance.cpp
ld.lld: error: too many errors emitted, stopping now (use --error-limit=0 to see all errors)
clang++: error: linker command failed with exit code 1 (use -v to see invocation)
failed to execute:/opt/rocm-6.2.0/lib/llvm/bin/clang++ --driver-mode=g++ -O3 --hip-link  -g \@CMakeFiles/ckProfiler.dir/objects1.rsp \@CMakeFiles/ckProfiler.dir/objects2.rsp -o "../../bin/ckProfiler" ../../lib/libutility.a -pthread /opt/rocm/lib/libamdhip64.so.6.2.60200 --hip-link /opt/rocm-6.2.0/lib/llvm/lib/clang/18/lib/linux/libclang_rt.builtins-x86_64.a
make[2]: *** [profiler/src/CMakeFiles/ckProfiler.dir/build.make:2529: bin/ckProfiler] Error 1
make[1]: *** [CMakeFiles/Makefile2:136480: profiler/src/CMakeFiles/ckProfiler.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....
[100%] Built target device_mha_instance
make: *** [Makefile:166: all] Error 2

How can I fix this error? Alternatively, how can I get a debug version of the CK library and examples?

Operating System

Ubuntu 20.04.6 LTS

CPU

AMD EPYC 73F3 16-Core Processor

GPU

AMD Instinct MI250X

Other

No response

ROCm Version

ROCm 6.0.0

ROCm Component

No response

Steps to Reproduce

No response

(Optional for Linux users) Output of /opt/rocm/bin/rocminfo --support

No response

Additional Information

No response

@kewang-xlnx changed the title from "[Issue]: Build error composable_kernel with "CMAKE_BUILD_TYPE=Debug"" to "[Issue]: Build composable_kernel error with "CMAKE_BUILD_TYPE=Debug"" on Dec 2, 2024
@ppanchad-amd

Hi @kewang-xlnx. An internal ticket has been created to investigate your issue. Thanks!

@schung-amd
Contributor

Hi @kewang-xlnx, I'll try to reproduce this. A couple of clarifying questions: are you building one of the release branches of CK or the develop branch? Are you building inside the Docker container suggested by the README?


schung-amd commented Dec 3, 2024

I was able to reproduce this on ROCm 6.2.4 with the default branch. Unfortunately, building all of CK with the Debug build type is not currently supported. Instead, we recommend building only the individual examples and kernel instances you need with the Debug build type. The cause of the error is that `R_X86_64_PC32` relocations hold a signed 32-bit offset, so the linker cannot reach anything more than about 2 GiB away, and the debug build of `ckProfiler` exceeds that (the log shows an offset of 2147663974 against a maximum of 2147483647). In theory the `--offload-compress` clang flag should help here, but I didn't have any success with it; you can try it and see whether it solves the issue on your end. If you have any related questions I can forward them to the internal team, but a full Debug build doesn't currently seem possible.
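
One way to follow the suggestion of building individual pieces is to configure a separate Debug build directory and build a single target rather than running a bare `make -j`. The sketch below reuses the configure flags from the original report. The target name `device_grouped_conv2d_fwd_instance` is taken from the object path in the linker error; the helper name `build_debug_target`, the `build-debug` directory, and the assumption that it is run from the CK source root are illustrative only. It has not been verified against a particular CK revision, and there is no guarantee that any given target stays under the 2 GiB limit.

```shell
#!/usr/bin/env bash
# Sketch: configure a Debug build tree and build one target only.
# Assumptions (not from the thread): run from the CK source root; the
# build directory name "build-debug" and this helper are hypothetical.
build_debug_target() {
    local target="$1"                    # e.g. device_grouped_conv2d_fwd_instance
    local build_dir="${2:-build-debug}"  # defaults to ./build-debug

    if [ -z "$target" ]; then
        echo "usage: build_debug_target <target> [build_dir]" >&2
        return 2
    fi

    mkdir -p "$build_dir" || return 1
    (
        cd "$build_dir" || exit 1
        # Same configure flags as in the original report.
        cmake \
            -D CMAKE_PREFIX_PATH=/opt/rocm \
            -D CMAKE_CXX_COMPILER=/opt/rocm/bin/hipcc \
            -D CMAKE_BUILD_TYPE=Debug \
            -D GPU_TARGETS="gfx90a" \
            .. || exit 1
        # Build only the requested target instead of the whole project.
        make -j "$(nproc)" "$target"
    )
}

# Example (target name taken from the linker error above):
# build_debug_target device_grouped_conv2d_fwd_instance
```

Running the build in a subshell keeps the caller's working directory unchanged, and a dedicated build directory avoids mixing Debug objects with an existing Release tree.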


kewang-xlnx commented Dec 5, 2024

Hi @schung-amd, thank you for testing and for your reply. Could I contact you through our internal Teams?
