Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

create a real call graph that can have cycles in --graphviz #1500

Open
honggyukim opened this issue Jul 30, 2022 · 8 comments · May be fixed by #1978
Open

create a real call graph that can have cycles in --graphviz #1500

honggyukim opened this issue Jul 30, 2022 · 8 comments · May be fixed by #1978

Comments

@honggyukim
Copy link
Collaborator

honggyukim commented Jul 30, 2022

uftrace graph actually doesn't show a call graph, but just a call tree output. For example, t-fibonacci example's output is as follows.

$ gcc -pg -o t-fibonacci tests/s-fibonacci.c

$ uftrace record t-fibonacci

$ uftrace graph
# Function Call Graph for 't-fibonacci' (session: 7701e718f823010e)
========== FUNCTION CALL GRAPH ==========
# TOTAL TIME   FUNCTION
   20.354 us : (1) t-fibonacci
    2.199 us :  +-(1) __monstartup
             :  |
    1.129 us :  +-(1) __cxa_atexit
             :  |
   17.026 us :  +-(1) main
   16.499 us :    (1) fib
   15.859 us :    (2) fib
   14.887 us :    (4) fib
   12.782 us :    (8) fib
    8.362 us :    (14) fib
    2.976 us :    (10) fib
    0.390 us :    (2) fib

As the output shows there are multiple nodes for fib function. It's because there are multiple backtrace or call paths of each fib node.

The --graphviz output also uses the same logic when building a graph, precisely a tree. So its output doesn't show the multiple nodes for the same function, but it draws multiple edges for the same calls.

The --graphviz output of t-fibonacci is as follows.

$ gcc -pg -o t-fibonacci tests/s-fibonacci.c
$ uftrace record t-fibonacci
$ uftrace dump --graphviz
      ...
digraph G {
    "t-fibonacci"
    "t-fibonacci" -> "__monstartup" [xlabel = "1"]
    "t-fibonacci" -> "__cxa_atexit" [xlabel = "1"]
    "t-fibonacci" -> "main" [xlabel = "1"]
    "main" -> "fib" [xlabel = "1"]
    "fib" -> "fib" [xlabel = "2"]
    "fib" -> "fib" [xlabel = "4"]
    "fib" -> "fib" [xlabel = "8"]
    "fib" -> "fib" [xlabel = "14"]
    "fib" -> "fib" [xlabel = "10"]
    "fib" -> "fib" [xlabel = "2"]
}

Its image shows too many edges from fib to fib as follows.
image

But it would be much better to make it as follows.

digraph G {
    "t-fibonacci"
    "t-fibonacci" -> "__monstartup" [xlabel = "1"]
    "t-fibonacci" -> "__cxa_atexit" [xlabel = "1"]
    "t-fibonacci" -> "main" [xlabel = "1"]
    "main" -> "fib" [xlabel = "1"]
    "fib" -> "fib" [xlabel = "40"]
}

This modified dot produces much clear image as follows.
image

@daeroro
Copy link

daeroro commented Aug 2, 2022

I want to try this one!

@honggyukim
Copy link
Collaborator Author

@daeroro Thanks. But it'd be much easier to do #1499 first.

@daeroro
Copy link

daeroro commented Aug 2, 2022

@honggyukim
Thank you. I'll try #1499 first.

@hjongkim
Copy link
Contributor

I'll try this.

@honggyukim
Copy link
Collaborator Author

Thanks. Please keep in mind that this can be applied to mermaid result as well so we need to have a common routine that combines the related edges.

@hjongkim
Copy link
Contributor

Thanks for the advice. I'll also consider the impact on mermaid functionality.

@deepseafishy
Copy link

Hi, I have been looking into this issue for the past few days and I think I know the root of the problem, but I don't really know how to fix this. As mentioned intially, there are multiple fib nodes since there are multiple call paths. Seems like tracking such multiple call paths are done by add_graph_entry() through changing the pointer for the tg->node:

out:
	node->nr_calls++;
	tg->node = node;

and by add_graph_exit(), again changing the pointer for the tg->node:

out:
	node->time += fstack->total_time;
	node->child_time += fstack->child_time;

	if (exit_cb)
		exit_cb(tg, cb_arg);

	tg->node = node->parent;

If I am not wrong, tg->node is therefore decided by how many UFTRACE_ENTRY and UFTRACE_EXIT are encountered in the dump file. So I thought about changing this logic to not have a different fib node for every different call path, but then it would not make a proper graph in the first place. My thoughts are either:

  1. have a separate variable in the node like bool recursive to denote that it is a node of recursive calls, and skip any of the prints regarding recursive calls during print_graph_to_graphviz(),
  2. or reduce the edges after creating the graph by iterating through the whole.

@gichoel told me that the latter would create significant overhead if the initial graph is extremely large in size. Maybe the former idea could work, but I wanted some feedback if there are any better ideas.

@namhyung
Copy link
Owner

namhyung commented Oct 1, 2024

I think the first approach would work if you can count the number of recursive calls correctly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants