Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Segmentation fault with uftrace dump --flame-graph #1928

Open
innerout opened this issue Jun 3, 2024 · 8 comments
Open

Segmentation fault with uftrace dump --flame-graph #1928

innerout opened this issue Jun 3, 2024 · 8 comments

Comments

@innerout
Copy link

innerout commented Jun 3, 2024

Hi,

I am trying to run uftrace dump --flame-graph with a small test that I have. Although record, report and graph commands work fine, the flamegraph option throws a segmentation fault.

I am attaching the uftrace.data that are generated from the execution.
(Edit)
I cannot attach uftrace.data since they are 28 MB and github has a limit of 25 MB for attachments.

From a small investigation with gdb I found that uftrace crashes here:

0x0000555555591e75 in graph_get_task (task=0x0, tg_size=tg_size@entry=56) 
at /usr/src/debug/uftrace/uftrace-0.16/utils/graph.c:41
41                      if (tg->task->tid == task->tid)

Also, here is the uftrace info output:

# system information
# ==================
# program version     : v0.16 ( x86_64 dwarf python3 luajit tui perf sched dynamic kernel )
# recorded on         : Mon Jun  3 14:30:44 2024
# cmdline             : uftrace record -k tests/test_par_threadpool
# cpu info            : Intel(R) Core(TM) i7-7700 CPU @ 3.60GHz
# number of cpus      : 8 / 8 (online / possible)
# memory info         : 15.7 / 31.2 GB (free / total)
# system load         : 2.64 / 2.94 / 2.95 (1 / 5 / 15 min)
# kernel version      : Linux 6.9.3-zen1-1-zen
# hostname            : kernel
# distro              : "EndeavourOS"
#
# process information
# ===================
# number of tasks     : 5
# task list           : 205624(test_par_thread), 205645(bg_worker), 205646(bg_worker), 205647(bg_worker), 205648(bg_worker)
# exe image           : /home/gxanth/gitfolders/parallax/build/tests/test_par_threadpool
# build id            : dec3c32d8d5412209b5447be9d2c69ffb4351508
# pattern             : regex
# exit status         : exited with code: 0
# elapsed time        : 1.018009329 sec
# cpu time            : 2.222 / 0.983 sec (sys / user)
# context switch      : 14094 / 657 (voluntary / involuntary)
# max rss             : 6920 KB
# page fault          : 0 / 1005 (major / minor)
# disk iops           : 0 / 8 (read / write)

@namhyung
Copy link
Owner

namhyung commented Jun 4, 2024

Thanks for the report. It's strange that task is NULL when you call graph_get_task(). The flame graph logic is handled in dump_flame_task_rstack() and it calls the graph_get_task(). But it also accesses task->rstack so it should see the problem there before. Also I think the code should not allow task being NULL in the first place.

Can you please upload the the full backtrace from GDB?

Also you can try --tid option to limit the tasks during uftrace dump. Note that it accepts CSV to specify multiple tasks.

@innerout
Copy link
Author

innerout commented Jun 4, 2024

In the URL https://files.fm/u/cs3pkm6h27 is the uftrace.data of the run. I don't know if they can be used on different computers.
Here is the full backtrace. gdb.txt

@namhyung
Copy link
Owner

namhyung commented Jun 7, 2024

Thanks, I can reproduce the problem. FYI uftrace is designed to be able to analyze data from different machine and even from a different architecture.

I'll take a look at the bug.

@innerout
Copy link
Author

innerout commented Jun 7, 2024

Keep up the good work, and thank you very much for all the information!

@namhyung
Copy link
Owner

namhyung commented Jun 7, 2024

Hmm.. sometimes kernel trace data contains related tasks and it makes get_task_handle() returning NULL. This would fix the problem but I need to take a look if there are more to fix.

diff --git a/cmds/dump.c b/cmds/dump.c
index 75a81373..2ad53eb1 100644
--- a/cmds/dump.c
+++ b/cmds/dump.c
@@ -1152,6 +1152,8 @@ static void dump_flame_kernel_rstack(struct uftrace_dump_ops *ops,
 
        tid = kernel->tids[cpu];
        task = get_task_handle(kernel->handle, tid);
+       if (task == NULL)
+               return;
 
        graph = graph_get_task(task, sizeof(*graph));
 

@namhyung
Copy link
Owner

namhyung commented Jun 9, 2024

I also see the report output is invalid. I think there's a mismatch of ENTRY and EXIT. Probably with sched in/out..

$ uftrace report | head
  Total time   Self time       Calls  Function
  ==========  ==========  ==========  ====================
   25.000  h   25.000  h      890300  x64_sys_call
   25.000  h   25.000  h       14686  switch_fpu_return
    2.077  s  574.590 ms     1022312  pthread_mutex_lock
    1.360  s  580.720 ms     1022312  pthread_mutex_unlock
    1.002  s    5.955 us           1  main
    1.000  s   20.558 us           1  sleep
  612.461 ms  616.700 ms       13912  linux:schedule
   77.251 ms   77.411 ms         570  linux:schedule (pre-empted)

@namhyung
Copy link
Owner

namhyung commented Jun 9, 2024

Merged 7791257.

@innerout
Copy link
Author

Thanks! I will test it and report back if there are any issues!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants