pigz hanging waiting on lock #116

Open
AZaugg opened this issue Feb 7, 2024 · 4 comments

AZaugg commented Feb 7, 2024

I am seeing an issue where pigz gets stuck. I have cpio piping data into pigz. Here are the open file descriptors for pigz (PID 2018279), followed by those for cpio (PID 2018278):

root@m [ /proc/2018279/fd ]# ls -l
total 0
lr-x------ 1 root root 64 Feb  6 07:05 0 -> 'pipe:[881736186]'
l-wx------ 1 root root 64 Feb  6 07:05 1 -> /var/tmp/dracut.IsQD2b/initramfs.img
l-wx------ 1 root root 64 Feb  6 07:05 2 -> /dev/null
root@m [ /proc/2018279/fd ]# cd /proc/2018278/fd
root@m [ /proc/2018278/fd ]# ls -l
total 0
lr-x------ 1 root root 64 Feb  6 07:05 0 -> 'pipe:[881736185]'
l-wx------ 1 root root 64 Feb  6 07:05 1 -> 'pipe:[881736186]'
l-wx------ 1 root root 64 Feb  6 07:05 2 -> /dev/null
lr-x------ 1 root root 64 Feb  6 07:05 3 -> /var/tmp/dracut.IsQD2b/initramfs/usr/bin/less

On the pigz side I can see:

root@m [ /proc/2018279 ]# strace -p 2018279 -f
strace: Process 2018279 attached with 19 threads
[pid 2018297] futex(0x59452f6ce6e0, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
[pid 2018296] futex(0x59452f6ce6e0, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
[pid 2018295] futex(0x59452f6ce6e0, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
[pid 2018294] futex(0x59452f6ce6e0, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
[pid 2018293] futex(0x59452f6ce6e0, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
[pid 2018292] futex(0x59452f6ce6e0, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
[pid 2018291] futex(0x59452f6ce6e0, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
[pid 2018290] futex(0x59452f6ce6e0, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
[pid 2018289] futex(0x59452f6ce6e0, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
[pid 2018288] futex(0x59452f6ce6e0, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
[pid 2018287] futex(0x59452f6ce6e0, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
[pid 2018286] futex(0x59452f6ce6e0, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
[pid 2018285] futex(0x59452f6ce6e0, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
[pid 2018284] futex(0x59452f6ce6e0, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
[pid 2018283] futex(0x59452f6ce6e0, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
[pid 2018282] futex(0x59452f6ce6e0, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
[pid 2018281] futex(0x59452f6ce6e0, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
[pid 2018280] futex(0x5945315b1114, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, FUTEX_BITSET_MATCH_ANY <unfinished ...>
[pid 2018279] futex(0x59452f6ce6e0, FUTEX_WAIT_PRIVATE, 2, NULL
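In the trace above, 18 of the 19 threads are waiting on the same futex address (0x59452f6ce6e0), and one thread waits on a different address. A minimal sketch for tallying this from saved `strace -f` output follows. The function name `summarize_futex_waits` and the regular expression are illustrative assumptions, not part of strace or pigz, and the pattern only covers lines shaped like the ones shown here.

```python
import re
from collections import Counter

# Matches lines shaped like:
# [pid 2018297] futex(0x59452f6ce6e0, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
# Only the first operation flag is captured (text after "|" is ignored).
FUTEX_LINE = re.compile(
    r"\[pid\s+(\d+)\]\s+futex\((0x[0-9a-fA-F]+),\s*(FUTEX_\w+)"
)


def summarize_futex_waits(strace_text):
    """Count threads per (futex address, operation) pair in strace output."""
    counts = Counter()
    for line in strace_text.splitlines():
        match = FUTEX_LINE.search(line)
        if match:
            _pid, address, operation = match.groups()
            counts[(address, operation)] += 1
    return counts


# Three lines copied from the trace above, used as sample input.
SAMPLE = """\
[pid 2018297] futex(0x59452f6ce6e0, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
[pid 2018280] futex(0x5945315b1114, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, FUTEX_BITSET_MATCH_ANY <unfinished ...>
[pid 2018279] futex(0x59452f6ce6e0, FUTEX_WAIT_PRIVATE, 2, NULL
"""

if __name__ == "__main__":
    for (address, operation), n in summarize_futex_waits(SAMPLE).most_common():
        print(f"{n} thread(s) waiting on {address} via {operation}")
```

Many threads waiting on a single address suggests contention on one lock, but the tally alone does not show which thread, if any, currently holds it.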

It is stuck on a lock. Looking at the kernel stack:

root@m[ /proc/2018279 ]# cat stack
[<0>] futex_wait_queue_me+0xa2/0x100
[<0>] futex_wait+0x105/0x250
[<0>] do_futex+0x1a2/0xaf0
[<0>] __x64_sys_futex+0x78/0x1e0
[<0>] do_syscall_64+0x5c/0x90
[<0>] entry_SYSCALL_64_after_hwframe+0x67/0xd1

Has anyone seen pigz get stuck like this?

madler (Owner) commented Feb 7, 2024

What operating system?

AZaugg (Author) commented Feb 8, 2024

Azure Linux
Kernel 5.15.125.1-2
glibc-2.35-6
pigz-2.6-2

I should add what I see when looking at the core:

(gdb) info threads
  Id   Target Id                           Frame
* 1    Thread 0x73b25af1a700 (LWP 2814192) 0x000073b25afa605f in __lll_lock_wait () from /lib/libc.so.6
  2    Thread 0x73b25af19640 (LWP 2814193) 0x000073b25afa5e8a in __futex_abstimed_wait_common () from /lib/libc.so.6
  3    Thread 0x73b25a6d6640 (LWP 2814194) 0x000073b25afa605f in __lll_lock_wait () from /lib/libc.so.6
  4    Thread 0x73b259eaa640 (LWP 2814195) 0x000073b25afa605f in __lll_lock_wait () from /lib/libc.so.6
  5    Thread 0x73b25965d640 (LWP 2814196) 0x000073b25afa605f in __lll_lock_wait () from /lib/libc.so.6
  6    Thread 0x73b258e31640 (LWP 2814197) 0x000073b25afa605f in __lll_lock_wait () from /lib/libc.so.6
  7    Thread 0x73b243fff640 (LWP 2814198) 0x000073b25afa605f in __lll_lock_wait () from /lib/libc.so.6
  8    Thread 0x73b2437fe640 (LWP 2814199) 0x000073b25afa605f in __lll_lock_wait () from /lib/libc.so.6
  9    Thread 0x73b242ffd640 (LWP 2814200) 0x000073b25afa605f in __lll_lock_wait () from /lib/libc.so.6
  10   Thread 0x73b2427fc640 (LWP 2814201) 0x000073b25afa605f in __lll_lock_wait () from /lib/libc.so.6
  11   Thread 0x73b241ffb640 (LWP 2814202) 0x000073b25afa605f in __lll_lock_wait () from /lib/libc.so.6
  12   Thread 0x73b2417fa640 (LWP 2814203) 0x000073b25afa605f in __lll_lock_wait () from /lib/libc.so.6
  13   Thread 0x73b240ff9640 (LWP 2814204) 0x000073b25afa605f in __lll_lock_wait () from /lib/libc.so.6
  14   Thread 0x73b22bfff640 (LWP 2814205) 0x000073b25afa605f in __lll_lock_wait () from /lib/libc.so.6
  15   Thread 0x73b22b7fe640 (LWP 2814206) 0x000073b25afa605f in __lll_lock_wait () from /lib/libc.so.6
  16   Thread 0x73b22affd640 (LWP 2814207) 0x000073b25afa605f in __lll_lock_wait () from /lib/libc.so.6
  17   Thread 0x73b22a7fc640 (LWP 2814208) 0x000073b25afa605f in __lll_lock_wait () from /lib/libc.so.6
  18   Thread 0x73b229ffb640 (LWP 2814209) 0x000073b25afa605f in __lll_lock_wait () from /lib/libc.so.6

madler (Owner) commented Feb 8, 2024

I have not seen this exactly before, but there have been two reports on SuSE systems of a hang due to a pthread bug in that system, which is why I asked about your OS.

There is this report of a pthread bug in glibc that could impact pigz due to its use of condition waits. If you look at those messages, the one at the end from just last month is asking about whether a fix to glibc has been made or not. Sounds like not.

Your problem may be related to that, or it may be something else. These sorts of reports are very rare, so it is difficult to conclude anything.

It seems that pthread is a difficult thing to write correctly.
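For readers unfamiliar with the "condition waits" mentioned above, here is a minimal sketch of the general pattern, written with Python's `threading.Condition` rather than pthreads. It is an analogy only and is not pigz's code. The class `ValueLock` and the function `run_demo` are hypothetical names invented for this illustration.

```python
import threading


class ValueLock:
    """A lock paired with an integer value that threads can wait on.

    Hypothetical helper for illustration; not taken from pigz.
    """

    def __init__(self, value=0):
        self._cond = threading.Condition()
        self._value = value

    def add(self, delta):
        """Change the value under the lock and wake every waiter."""
        with self._cond:
            self._value += delta
            self._cond.notify_all()

    def wait_until(self, target, timeout=None):
        """Block until the value reaches target; return False on timeout."""
        with self._cond:
            return self._cond.wait_for(
                lambda: self._value >= target, timeout=timeout
            )


def run_demo(workers=4):
    """Start workers that each bump the counter, then wait for all of them."""
    done = ValueLock()
    threads = [
        threading.Thread(target=done.add, args=(1,)) for _ in range(workers)
    ]
    for t in threads:
        t.start()
    # The timeout is a safety net for the demo only.
    finished = done.wait_until(workers, timeout=5.0)
    for t in threads:
        t.join()
    return finished


if __name__ == "__main__":
    print("all workers reported:", run_demo())
```

The waiter depends entirely on the wake-up being delivered. If the underlying threading library were to lose a wake-up, a wait without a timeout would block forever even though the application logic is correct, which is the kind of failure a condition-variable bug could cause.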

rtissera commented

I am seeing this issue too, on Debian 11 with kernel 6.6.13 (backports), on a Beelink SER5 Pro (Ryzen 7 5800H, 32 GB RAM, NVMe SSD).
