Profiling native threads? #332
Comments
Not right now =( We merge the native stack traces into Python frames, but not vice versa. You'll have to use other native profiling tools such as perf to profile the native thread.
That’s unfortunate. Can you say more about this merging? Does it need to happen?
Indeed, it would be very helpful to have py-spy handle native threads in its reports, to understand the performance of CPU-intensive Python programs that use data science libraries like numpy, which rely on multi-threaded native linear algebra libraries such as OpenBLAS and MKL. The same goes for machine learning libraries like scikit-learn, lightgbm and xgboost, which use OpenMP threads in the CPU-intensive sections of their code written in Cython or C++. At the moment profiling with
We're using libunwind-ptrace in PyPerf, and we just place native frames on top of the Python frames (stopping at the first native frame that is the [...]). IIRC py-spy uses libunwind-ptrace as well? So this rather simple scheme could work.
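The scheme described in the comment above could be sketched roughly as follows. This is only an illustration, not PyPerf's or py-spy's actual code: `merge_stacks`, the `is_interpreter_frame` predicate, and every frame name in the usage example are hypothetical, and the comment does not say which native frame marks the stopping point, so the boundary is left as a caller-supplied assumption.

```python
def merge_stacks(native_frames, python_frames, is_interpreter_frame):
    """Place native frames on top of the Python frames.

    Both lists are ordered innermost (most recent call) first.
    Native frames are kept until the first one that the caller
    identifies as belonging to the interpreter; everything below
    that point is replaced by the Python frames.
    """
    merged = []
    for frame in native_frames:
        if is_interpreter_frame(frame):
            break  # the Python frames take over from here
        merged.append(frame)
    return merged + list(python_frames)


# Hypothetical frame names, for illustration only.
def is_eval_loop(frame):
    # Assumption: the boundary is the interpreter's eval loop.
    return frame == "_PyEval_EvalFrameDefault"


python_thread = merge_stacks(
    ["dgemm_kernel", "cblas_dgemm", "_PyEval_EvalFrameDefault", "main"],
    ["dot (app.py:10)", "<module> (app.py:20)"],
    is_eval_loop,
)

# A thread that never runs the interpreter has no Python frames and no
# boundary frame, so its native stack is returned unchanged.
native_thread = merge_stacks(["worker", "start_thread"], [], is_eval_loop)
```

Under these assumptions a purely native thread needs no special casing, which may be why the scheme is described as simple.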
@benfred It would be great to have native threads in py-spy: in my case, some of those native threads are managed by OpenMP via Cython. Furthermore, if speedscope ever supports multitrack views with time-aligned traces, it would be very helpful to understand when those native threads come into play and interact with the calling Python code. Would @Jongy's suggested solution above work?
Does `py-spy record` ignore threads that don’t contain any Python stack frame by default?

I have a Python program with a native extension (that happens to be written in Rust). That extension starts a thread (with Rust’s `std::thread::spawn`) to do some CPU-intensive work in parallel with other work. The child thread never runs a Python interpreter. The SVG output of the profiler is missing everything in the second thread.

`--native` does show Rust stack frames, but only in the parent thread. Adding `--threads` adds the ID of the parent thread to the output but nothing else. Adding `--idle` doesn’t seem to change anything for this program.

When using `py-spy dump --pid` (at the right time) however, the stack of both threads is printed correctly.

Can I use py-spy to profile both threads?
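Since `py-spy dump` is reported to print both threads, one crude stopgap might be to call it repeatedly and count how often each line of output appears. The sketch below is an assumption-laden illustration, not a py-spy feature: `record_cmd`, `dump_cmd` and `poll_dumps` are hypothetical helpers, and counting identical output lines is only a rough stand-in for real sampling. Each dump starts a new process, so the achievable sampling rate is low.

```python
import subprocess
import time
from collections import Counter


def record_cmd(pid, output="profile.svg"):
    """Command line combining the flags tried in the report above."""
    return ["py-spy", "record", "--native", "--threads", "--idle",
            "--pid", str(pid), "--output", output]


def dump_cmd(pid, native=False):
    """Command line for a one-off stack dump of the process."""
    cmd = ["py-spy", "dump", "--pid", str(pid)]
    if native:
        cmd.append("--native")
    return cmd


def poll_dumps(pid, samples=10, interval=0.1, run=None):
    """Hypothetical helper: run `py-spy dump` several times and count
    how many dumps each distinct non-empty output line appeared in."""
    if run is None:
        def run(cmd):
            result = subprocess.run(cmd, capture_output=True,
                                    text=True, check=True)
            return result.stdout
    counts = Counter()
    for _ in range(samples):
        seen = {line.strip() for line in run(dump_cmd(pid)).splitlines()}
        seen.discard("")
        counts.update(seen)
        time.sleep(interval)
    return counts
```

Lines that show up in most dumps hint at where each thread spends its time, but for real measurements a native profiler such as perf, as suggested in the first comment, is likely the better tool.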