WARN: child terminated by signal: 11: Segmentation fault for (C++) #1948
Additionally, when the program is started with the command $ uftrace record --time -t 1s --no-libcall --no-event, startup can take more than 10 minutes. Could an attach method be supported? That way, the startup command would not need to be modified.
Help me, please.
Hi, it looks like
OK, let me fill in some details.
Thanks. The following are my debug messages:
gdb debugger
Is there any backtrace from uftrace (or libmcount) when it got the segfault? Also, I'm not sure why it takes so long. What do you see when you use a different time filter like
It would be a lot more helpful if you could share with us what your
You can get some reference for the bug report from #1949.
Thanks @honggyukim. My program is very large and comes from business software, so here is a test on ggerganov/llama.cpp; it also hits a segmentation fault. The test steps are as follows (see #1949):
$ uftrace record --no-libcall ./llama-cli
Log start
main: build = 3601 (2339a0be)
main: built with cc (GCC) 8.5.0 20210514
main: seed = 1724037308
WARN: Segmentation fault: address not mapped (addr: 0x14bc576c)
WARN: if this happens only with uftrace, please consider -e/--estimate-return option.
WARN: Backtrace from uftrace v0.16-13-gc546 ( x86_64 dwarf python3 luajit tui perf sched dynamic kernel )
WARN: =====================================
WARN: [2] (llama_load_model_from_file[52a9e0] <= llama_init_from_gpt_params[5b9a9b])
WARN: [1] (llama_init_from_gpt_params[5b99fa] <= main[431e9a])
WARN: [0] (main[43170a] <= __libc_start_main[7fa15de29d85])
Please report this bug to https://github.com/namhyung/uftrace/issues.
WARN: child terminated by signal: 11: Segmentation fault
It still crashes even with the -e option.
$ uftrace record --no-libcall -e ./llama-cli
Log start
main: build = 3601 (2339a0be)
main: built with cc (GCC) 8.5.0 20210514
main: seed = 1724037362
WARN: Segmentation fault: address not mapped (addr: 0x14bdfaba)
WARN: Backtrace from uftrace v0.16-13-gc546 ( x86_64 dwarf python3 luajit tui perf sched dynamic kernel )
WARN: =====================================
WARN: [4] (llama_load_model_from_file[52a9e0] <= llama_init_from_gpt_params[5b9a9b])
WARN: [3] (llama_model_default_params[4d779a] <= llama_model_params_from_gpt_params[5b50a8])
WARN: [2] (llama_model_params_from_gpt_params[5b509d] <= llama_init_from_gpt_params[5b9a36])
WARN: [1] (llama_init_from_gpt_params[5b99fa] <= main[431e9a])
WARN: [0] (main[43170a] <= __libc_start_main[7f5343c97d85])
Please report this bug to https://github.com/namhyung/uftrace/issues.
WARN: child terminated by signal: 11: Segmentation fault
Here is the backtrace.
$ gdb -q --args uftrace record --no-libcall -e ./llama-cli
Reading symbols from uftrace...
(gdb) r
Starting program: /usr/local/bin/uftrace record --no-libcall -e ./llama-cli
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
[Detaching after fork from child process 1137111]
[New Thread 0x7ffff4bce700 (LWP 1137112)]
[New Thread 0x7ffff43cd700 (LWP 1137113)]
[New Thread 0x7ffff3bcc700 (LWP 1137114)]
[New Thread 0x7ffff33cb700 (LWP 1137115)]
[New Thread 0x7ffff2bca700 (LWP 1137116)]
[New Thread 0x7ffff23c9700 (LWP 1137117)]
[New Thread 0x7ffff1bc8700 (LWP 1137118)]
[New Thread 0x7ffff13c7700 (LWP 1137119)]
Log start
main: build = 3601 (2339a0be)
main: built with cc (GCC) 8.5.0 20210514
main: seed = 1724037407
WARN: Segmentation fault: address not mapped (addr: 0x14bf5cf4)
WARN: Backtrace from uftrace v0.16-13-gc546 ( x86_64 dwarf python3 luajit tui perf sched dynamic kernel )
WARN: =====================================
WARN: [4] (llama_load_model_from_file[52a9e0] <= llama_init_from_gpt_params[5b9a9b])
WARN: [3] (llama_model_default_params[4d779a] <= llama_model_params_from_gpt_params[5b50a8])
WARN: [2] (llama_model_params_from_gpt_params[5b509d] <= llama_init_from_gpt_params[5b9a36])
WARN: [1] (llama_init_from_gpt_params[5b99fa] <= main[431e9a])
WARN: [0] (main[43170a] <= __libc_start_main[7ffff6a79d85])
Please report this bug to https://github.com/namhyung/uftrace/issues.
WARN: child terminated by signal: 11: Segmentation fault
[Thread 0x7ffff13c7700 (LWP 1137119) exited]
[Thread 0x7ffff1bc8700 (LWP 1137118) exited]
[Thread 0x7ffff23c9700 (LWP 1137117) exited]
[Thread 0x7ffff2bca700 (LWP 1137116) exited]
[Thread 0x7ffff33cb700 (LWP 1137115) exited]
[Thread 0x7ffff3bcc700 (LWP 1137114) exited]
[Thread 0x7ffff43cd700 (LWP 1137113) exited]
[Thread 0x7ffff4bce700 (LWP 1137112) exited]
[Inferior 1 (process 1136592) exited with code 02]
Missing separate debuginfos, use: dnf debuginfo-install bash-4.4.20-4.tl3.tencentos.x86_64 brotli-1.0.6-3.tl3.x86_64 bzip2-libs-1.0.6-26.tl3.x86_64 capstone-4.0.2-5.el8.x86_64 cyrus-sasl-lib-2.1.27-6.tl3.x86_64 elfutils-debuginfod-client-0.190-2.tl3.x86_64 elfutils-libelf-0.190-2.tl3.x86_64 elfutils-libs-0.190-2.tl3.x86_64 glibc-2.28-225.tl3.x86_64 keyutils-libs-1.5.10-9.tl3.x86_64 krb5-libs-1.18.2-22.tl3.x86_64 libcom_err-1.45.6-5.tl3.x86_64 libcurl-7.61.1-30.tl3.2.x86_64 libgcc-8.5.0-18.tl3.x86_64 libidn2-2.2.0-1.tl3.x86_64 libnghttp2-1.33.0-3.tl3.1.x86_64 libpsl-0.20.2-6.tl3.x86_64 libselinux-2.9-8.tl3.x86_64 libssh-0.9.6-6.tl3.x86_64 libstdc++-8.5.0-18.tl3.x86_64 libtraceevent-1.5.3-1.tl3.x86_64 libxcrypt-4.1.1-6.tl3.x86_64 libzstd-1.4.4-1.tl3.x86_64 ncurses-libs-6.1-10.20180224.tl3.x86_64 openssl-libs-1.1.1k-9.tl3.3.x86_64 pcre2-10.32-2.tl3.x86_64 xz-libs-5.2.4-4.tl3.x86_64 zlib-1.2.11-21.tl3.x86_64
(gdb) bt
No stack.
$ gdb ./llama-cli ~/corefile/core-llama-cli-11-1000-1000-1133916-1724037308
GNU gdb (GDB) Red Hat Enterprise Linux 9.2-4.tl3
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./llama-cli...
warning: Can't open file (null) during file-backed mapping note processing
warning: Can't open file (null) during file-backed mapping note processing
[New LWP 1133916]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
warning: the debug information found in "/usr/lib/debug//lib64/libbz2.so.1.0.6-1.0.6-26.tl3.x86_64.debug" does not match "/lib64/libbz2.so.1" (CRC mismatch).
warning: the debug information found in "/usr/lib/debug//usr/lib64/libbz2.so.1.0.6-1.0.6-26.tl3.x86_64.debug" does not match "/lib64/libbz2.so.1" (CRC mismatch).
Missing separate debuginfo for /lib64/libbz2.so.1
Try: dnf --enablerepo='*debug*' install /usr/lib/debug/.build-id/fe/194562c2e19c235fb8c13b3ec7931029ed2f7f.debug
Core was generated by `./llama-cli'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x000000000052a9e0 in llama_load_model_from_file (path_model=0x1e1d510 "models/7B/ggml-model-f16.gguf",
params=...) at src/llama.cpp:16936
16936 struct llama_model_params params) {
Missing separate debuginfos, use: dnf debuginfo-install libgcc-8.5.0-18.tl3.x86_64 libgomp-8.5.0-18.tl3.x86_64 libstdc++-8.5.0-18.tl3.x86_64 libtraceevent-1.5.3-1.tl3.x86_64 libzstd-1.4.4-1.tl3.x86_64 xz-libs-5.2.4-4.tl3.x86_64 zlib-1.2.11-21.tl3.x86_64
(gdb) bt
#0 0x000000000052a9e0 in llama_load_model_from_file (path_model=0x1e1d510 "models/7B/ggml-model-f16.gguf",
params=...) at src/llama.cpp:16936
#1 0x00000000005b9a9b in llama_init_from_gpt_params (params=...) at common/common.cpp:2107
#2 0x0000000000431e9a in main (argc=<optimized out>, argv=<optimized out>) at examples/main/main.cpp:210
(gdb)
I tried debugging using lldb:
$ lldb
(lldb) file /usr/local/bin/uftrace
Current executable set to '/usr/local/bin/uftrace' (x86_64).
(lldb) settings set -- target.run-args record --no-libcall -e ./llama-cli
(lldb) run
Process 1521648 launched: '/usr/local/bin/uftrace' (x86_64)
warning: (x86_64) /lib64/libonion.so unsupported DW_FORM values: 0x1f20 0x1f21
warning: (x86_64) /lib64/libdl.so.2 unsupported DW_FORM values: 0x1f20 0x1f21
warning: (x86_64) /lib64/librt.so.1 unsupported DW_FORM values: 0x1f20 0x1f21
warning: (x86_64) /lib64/libstdc++.so.6 unsupported DW_FORM values: 0x1f20 0x1f21
warning: (x86_64) /lib64/libelf.so.1 unsupported DW_FORM values: 0x1f20 0x1f21
warning: (x86_64) /lib64/libdw.so.1 unsupported DW_FORM values: 0x1f20 0x1f21
warning: (x86_64) /lib64/libcapstone.so.4 unsupported DW_FORM values: 0x1f20 0x1f21
warning: (x86_64) /lib64/libm.so.6 unsupported DW_FORM values: 0x1f20 0x1f21
warning: (x86_64) /lib64/libncursesw.so.6 unsupported DW_FORM values: 0x1f20 0x1f21
warning: (x86_64) /lib64/libtinfo.so.6 unsupported DW_FORM values: 0x1f20 0x1f21
warning: (x86_64) /lib64/libpthread.so.0 unsupported DW_FORM values: 0x1f20 0x1f21
warning: (x86_64) /lib64/libc.so.6 unsupported DW_FORM values: 0x1f20 0x1f21
warning: (x86_64) /lib64/libgcc_s.so.1 unsupported DW_FORM values: 0x1f20 0x1f21
warning: (x86_64) /lib64/libzstd.so.1 unsupported DW_FORM values: 0x1f20 0x1f21
Log start
main: build = 3601 (2339a0be)
main: built with cc (GCC) 8.5.0 20210514 (TencentOS 8.5.0-18) for x86_64-redhat-linux
main: seed = 1724050099
WARN: Segmentation fault: address not mapped (addr: 0x16425d50)
WARN: Backtrace from uftrace v0.16-13-gc546 ( x86_64 dwarf python3 luajit tui perf sched dynamic kernel )
WARN: =====================================
WARN: [4] (llama_load_model_from_file[52a9e0] <= llama_init_from_gpt_params[5b9a9b])
WARN: [3] (llama_model_default_params[4d779a] <= llama_model_params_from_gpt_params[5b50a8])
WARN: [2] (llama_model_params_from_gpt_params[5b509d] <= llama_init_from_gpt_params[5b9a36])
WARN: [1] (llama_init_from_gpt_params[5b99fa] <= main[431e9a])
WARN: [0] (main[43170a] <= __libc_start_main[7ffff6a79d85])
Please report this bug to https://github.com/namhyung/uftrace/issues.
Process 1521648 stopped and restarted: thread 1 received signal: SIGCHLD
WARN: child terminated by signal: 11: Segmentation fault
warning: (x86_64) /lib64/libdebuginfod.so.1 unsupported DW_FORM values: 0x1f20 0x1f21
warning: (x86_64) /lib64/libcurl.so.4 unsupported DW_FORM values: 0x1f20 0x1f21
warning: (x86_64) /lib64/libpsl.so.5 unsupported DW_FORM values: 0x1f20 0x1f21
warning: (x86_64) /lib64/libssl.so.1.1 unsupported DW_FORM values: 0x1f20 0x1f21
warning: (x86_64) /lib64/libk5crypto.so.3 unsupported DW_FORM values: 0x1f20 0x1f21
warning: (x86_64) /lib64/libcom_err.so.2 unsupported DW_FORM values: 0x1f20 0x1f21
warning: (x86_64) /lib64/libldap-2.4.so.2 unsupported DW_FORM values: 0x1f20 0x1f21
warning: (x86_64) /lib64/liblber-2.4.so.2 unsupported DW_FORM values: 0x1f20 0x1f21
warning: (x86_64) /lib64/libbrotlidec.so.1 unsupported DW_FORM values: 0x1f20 0x1f21
warning: (x86_64) /lib64/libkrb5support.so.0 unsupported DW_FORM values: 0x1f20 0x1f21
warning: (x86_64) /lib64/libkeyutils.so.1 unsupported DW_FORM values: 0x1f20 0x1f21
warning: (x86_64) /lib64/libresolv.so.2 unsupported DW_FORM values: 0x1f20 0x1f21
warning: (x86_64) /lib64/libsasl2.so.3 unsupported DW_FORM values: 0x1f20 0x1f21
warning: (x86_64) /lib64/libbrotlicommon.so.1 unsupported DW_FORM values: 0x1f20 0x1f21
warning: (x86_64) /lib64/libselinux.so.1 unsupported DW_FORM values: 0x1f20 0x1f21
warning: (x86_64) /lib64/libpcre2-8.so.0 unsupported DW_FORM values: 0x1f20 0x1f21
Process 1521648 exited with status = 2 (0x00000002)
(lldb) tb
No breakpoints currently set.
I'm not sure if these warnings make a difference.
$ lldb -c core-llama-cli-11-1000-1000-1521657-1724050099 -- ~/t/lab/llama.cpp/llama-cli
(lldb) target create "/data/home/user00/t/lab/llama.cpp/llama-cli" --core "core-llama-cli-11-1000-1000-1521657-1724050099"
warning: (x86_64) /lib64/libonion.so unsupported DW_FORM values: 0x1f20 0x1f21
warning: (x86_64) /lib64/libstdc++.so.6 unsupported DW_FORM values: 0x1f20 0x1f21
warning: (x86_64) /lib64/libm.so.6 unsupported DW_FORM values: 0x1f20 0x1f21
warning: (x86_64) /lib64/libgcc_s.so.1 unsupported DW_FORM values: 0x1f20 0x1f21
warning: (x86_64) /lib64/libpthread.so.0 unsupported DW_FORM values: 0x1f20 0x1f21
warning: (x86_64) /lib64/libc.so.6 unsupported DW_FORM values: 0x1f20 0x1f21
warning: (x86_64) /lib64/libdl.so.2 unsupported DW_FORM values: 0x1f20 0x1f21
warning: (x86_64) /lib64/librt.so.1 unsupported DW_FORM values: 0x1f20 0x1f21
warning: (x86_64) /lib64/libelf.so.1 unsupported DW_FORM values: 0x1f20 0x1f21
warning: (x86_64) /lib64/libdw.so.1 unsupported DW_FORM values: 0x1f20 0x1f21
warning: (x86_64) /lib64/libcapstone.so.4 unsupported DW_FORM values: 0x1f20 0x1f21
warning: (x86_64) /lib64/libzstd.so.1 unsupported DW_FORM values: 0x1f20 0x1f21
Core file '/data/home/user00/corefile/core-llama-cli-11-1000-1000-1521657-1724050099' (x86_64) was loaded.
(lldb) bt
* thread #1, name = 'llama-cli', stop reason = signal SIGSEGV
* frame #0: 0x000000000052a9e0 llama-cli`llama_load_model_from_file(path_model="models/7B/ggml-model-f16.gguf", params=llama_model_params @ 0x00007fffffffbf70) at llama.cpp:16936:45
frame #1: 0x00000000005b9a9b llama-cli`llama_init_from_gpt_params(params=0x00007fffffffc7b0) at common.cpp:2107:43
frame #2: 0x0000000000431e9a llama-cli`main(argc=<unavailable>, argv=<unavailable>) at main.cpp:210:69
frame #3: 0x00007ffff6a79d85 libc.so.6`__libc_start_main + 229
frame #4: 0x000000000043896e llama-cli`_start + 46
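The uftrace backtrace lines in the logs above follow a regular `[N] (callee[addr] <= caller[addr])` shape, so they can be compared mechanically against the debugger frames. The following is only a sketch: the function name `parse_uftrace_backtrace` and the regular expression are illustrative assumptions (not part of uftrace), the bracketed values are treated as opaque hexadecimal addresses, and the pattern only handles plain C-style identifiers, so demangled C++ names with `::` or templates would need a wider pattern.

```python
import re

# Matches lines such as:
#   WARN: [2] (llama_load_model_from_file[52a9e0] <= llama_init_from_gpt_params[5b9a9b])
# Assumption: symbol names consist of word characters only.
FRAME_RE = re.compile(
    r"\[(?P<idx>\d+)\] \((?P<callee>\w+)\[(?P<callee_addr>[0-9a-f]+)\]"
    r" <= (?P<caller>\w+)\[(?P<caller_addr>[0-9a-f]+)\]\)"
)


def parse_uftrace_backtrace(log):
    """Extract frames from uftrace's WARN backtrace, in the order they appear."""
    frames = []
    for line in log.splitlines():
        m = FRAME_RE.search(line)
        if m:
            frames.append({
                "index": int(m["idx"]),
                "callee": m["callee"],
                "callee_addr": int(m["callee_addr"], 16),
                "caller": m["caller"],
                "caller_addr": int(m["caller_addr"], 16),
            })
    return frames


# Backtrace copied from the first run in this thread (without -e).
sample = """\
WARN: [2] (llama_load_model_from_file[52a9e0] <= llama_init_from_gpt_params[5b9a9b])
WARN: [1] (llama_init_from_gpt_params[5b99fa] <= main[431e9a])
WARN: [0] (main[43170a] <= __libc_start_main[7fa15de29d85])
"""

frames = parse_uftrace_backtrace(sample)
top = frames[0]

# Program counters of frames #0 and #1 as reported by gdb and lldb on the core file.
debugger_frame0_pc = 0x52A9E0
debugger_frame1_pc = 0x5B9A9B

print(top["callee"], hex(top["callee_addr"]))
print(top["callee_addr"] == debugger_frame0_pc)
print(top["caller_addr"] == debugger_frame1_pc)
```

For the logs in this thread, the innermost uftrace frame and the debugger frames #0 and #1 report the same addresses (0x52a9e0 and 0x5b9a9b), so both tools agree on where the child process was when it received SIGSEGV.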
Thanks @namhyung. I will collect the backtrace information and also change the collection frequency. (If I use --nop, it doesn't seem to cause a segmentation fault.)
Looks like the stack memory was overwritten by something. Does your program read the return address from the stack or calculate something from it? But it doesn't explain the
step 0
env
source code
step 1
$ readelf -s app |grep mcount
   405: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND mcount@GLIBC_2.2.5 (2)
432268: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND mcount
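The readelf check in step 1 confirms that the binary references an undefined `mcount` symbol, which is what a build with `-pg` normally produces. The same check can be scripted when many binaries need to be verified. This is a minimal sketch, not part of uftrace: the helper name `undefined_mcount_symbols` is hypothetical, and the set of symbol names it looks for (`mcount`, `_mcount`, `__fentry__`) is an assumption about which profiling hooks are relevant for a given toolchain.

```python
def undefined_mcount_symbols(readelf_output):
    """Return undefined mcount-style symbols found in `readelf -s` output text."""
    # Assumption: these are the profiling hook names worth checking for.
    wanted = ("mcount", "_mcount", "__fentry__")
    found = []
    for line in readelf_output.splitlines():
        fields = line.split()
        # Symbol rows look like: Num: Value Size Type Bind Vis Ndx Name [(version)]
        if len(fields) < 8 or not fields[0].endswith(":"):
            continue
        ndx, name = fields[6], fields[7]
        base = name.split("@")[0]  # drop a symbol version suffix such as @GLIBC_2.2.5
        if ndx == "UND" and base in wanted:
            found.append(name)
    return found


# Output copied from step 1 of this report.
sample = """\
   405: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND mcount@GLIBC_2.2.5 (2)
432268: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND mcount
"""

print(undefined_mcount_symbols(sample))
```

To run it against a real binary, the text could come from `subprocess.run(["readelf", "-s", "app"], capture_output=True, text=True).stdout`, assuming `readelf` is installed and `app` is the binary from step 1. An empty result would suggest the binary was not built with `-pg`.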
step 2
0x0000000000000000
uftrace/arch/x86_64/mcount.S
Line 80 in c54660b