This bug is very similar to a previous one (fixed by @rocallahan), which can be found here: "RR waitpid bug not seen during non-recording".

This time, it is an "awaiter thread" that uses `signalfd` and `epoll` to be notified of events, and then runs `waitpid` as one normally would.

There's a test application that reproduces this bug 100% of the time here: clone the repo and run `./run.sh` (it builds the application, then runs it first without and then with `rr record`).

Short summary of the reproducer:

The tracer application has a thread whose job is to be notified via `signalfd` on `SIGCHLD` that it can call `waitpid`. The tracee is `TRACEME`d and set up to signal for most of the ptrace events.

The actual `CLONE` events show up as usual, but the subsequent events for the cloned tasks do not. This effectively puts those tasks to sleep forever, since the tracer application never gets the SIGSTOP/SIGTRAP (or whatever signal is delivered at the next signal-delivery-stop for the newly created task).

This test application works just fine when no additional tracer/debugger thread is doing the epoll+waitpid dance (i.e. when it is done on the "main thread/task leader"), so I'm guessing the solution will be very similar to that of the previous related issue #3658.