DataStorm/reliability test hang #3309

Open
pepone opened this issue Dec 26, 2024 · 4 comments · May be fixed by #3310

Comments

@pepone
Member

pepone commented Dec 26, 2024

The test hangs. I reproduced the same issue on a MacBook Pro and on Debian.

The hanging test is "running reader/writer as client with 2 nodes (reversed start order)".

I hit this while testing with the fixes from #3294.

Reader:

Thread 1 (Thread 0x7f988966f100 (LWP 82271) "reader"):
#0  0x00007f98888a4f16 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007f98888a75d8 in pthread_cond_wait () from /lib/x86_64-linux-gnu/libc.so.6
#2  0x00007f9889504bbd in DataStormI::DataElementI::waitForListeners (this=0x561885f8cdc0, count=1) at src/DataStorm/DataElementI.cpp:559
#3  0x00007f988950a716 in DataStormI::KeyDataWriterI::waitForReaders (this=0x561885f8cdc0, count=1) at src/DataStorm/DataElementI.cpp:1194
#4  0x0000561885abf811 in DataStorm::Writer<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >::waitForReaders (this=0x7ffd50e1f680, count=1) at ../cpp/include/DataStorm/DataStorm.h:1616
#5  0x0000561885aba562 in Reader::run (this=0x7ffd50e1fec0, argc=1, argv=0x7ffd50e20078) at test/DataStorm/reliability/Reader.cpp:67
#6  0x0000561885abf989 in Test::runTest<Reader> (argc=17, argv=0x7ffd50e20078) at test/include/TestHelper.h:152
#7  0x0000561885abb1ca in main (argc=17, argv=0x7ffd50e20078) at test/DataStorm/reliability/Reader.cpp:129

Writer:

Thread 1 (Thread 0x7f3d340a2100 (LWP 82269) "writer"):
#0  0x00007f3d344a4f16 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007f3d344a75d8 in pthread_cond_wait () from /lib/x86_64-linux-gnu/libc.so.6
#2  0x00007f3d3510cedb in std::condition_variable::wait<DataStormI::DataReaderI::getNextUnread()::<lambda()> >(std::unique_lock<std::mutex> &, struct {...}) (this=0x558c05c77120, __lock=..., __p=...) at /usr/include/c++/12/condition_variable:102
#3  0x00007f3d35106075 in DataStormI::DataReaderI::getNextUnread (this=0x558c05c36620) at src/DataStorm/DataElementI.cpp:728
#4  0x0000558c04d212ac in DataStorm::Reader<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >::getNextUnread (this=0x7ffceb1a8720) at ../cpp/include/DataStorm/DataStorm.h:1396
#5  0x0000558c04d1c387 in Writer::run (this=0x7ffceb1a89e0, argc=1, argv=0x7ffceb1a8b98) at test/DataStorm/reliability/Writer.cpp:54
#6  0x0000558c04d213a3 in Test::runTest<Writer> (argc=14, argv=0x7ffceb1a8b98) at test/include/TestHelper.h:152
#7  0x0000558c04d1cc8e in main (argc=14, argv=0x7ffceb1a8b98) at test/DataStorm/reliability/Writer.cpp:96
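
Both traces above end in a condition-variable wait with no deadline, so a missed notification shows up as an indefinite hang instead of a test failure. As a minimal sketch only (the `ListenerCounter` class and its methods are hypothetical and are not DataStorm code), a bounded wait with a predicate makes this kind of stall detectable:

```cpp
#include <chrono>
#include <condition_variable>
#include <cstddef>
#include <mutex>

// Hypothetical stand-in for an element that tracks connected listeners.
// This is an illustration, not the DataStorm implementation.
class ListenerCounter
{
public:
    // Record one more connected listener and wake any waiting thread.
    void addListener()
    {
        {
            std::lock_guard<std::mutex> lock(_mutex);
            ++_count;
        }
        _cond.notify_all();
    }

    // Wait until at least `count` listeners are connected.
    // Returns false if the timeout expires first, so the caller can
    // report the stall instead of blocking forever.
    bool waitForListeners(std::size_t count, std::chrono::milliseconds timeout)
    {
        std::unique_lock<std::mutex> lock(_mutex);
        return _cond.wait_for(lock, timeout, [&] { return _count >= count; });
    }

private:
    std::mutex _mutex;
    std::condition_variable _cond;
    std::size_t _count = 0;
};
```

Because the predicate is re-checked under the lock, a notification sent before the wait starts is not lost. A timeout here would therefore point at the count never being updated, not at a missed wakeup.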

client-122624-1336.log
node1-122624-1336.log
node2-122624-1336.log
server-122624-1336.log

@pepone pepone added this to the 3.8.0 milestone Dec 26, 2024
@pepone
Member Author

pepone commented Dec 26, 2024

The test eventually recovers, but it was stuck for a while:

[ running reader/writer as client with 2 nodes (reversed start order) test - 12/26/24 13:36:44 ]
starting node1... ok
starting node2... ok
process /home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/build/x86_64-linux-gnu/shared/reader --Ice.Default.Host=127.0.0.1 --Ice.Warn.Connections=1 --Ice.Default.Protocol=tcp --Ice.IPv6=0 --Ice.PrintStackTraces=1 --Ice.ThreadPool.Server.Size=1 --Ice.ThreadPool.Server.SizeMax=3 --Ice.ThreadPool.Server.SizeWarn=0 --DataStorm.Node.Multicast.Enabled=0 --DataStorm.Node.Server.Enabled=0 --DataStorm.Node.ConnectTo="tcp -p 12021" --DataStorm.Trace.Topic=1 --DataStorm.Trace.Session=3 --DataStorm.Trace.Data=2 --Ice.Trace.Protocol=1 --Ice.LogFile=/home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/client-122624-1336.log pid=82271 is hanging - 12/26/24 13:37:44
process /home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/build/x86_64-linux-gnu/shared/reader --Ice.Default.Host=127.0.0.1 --Ice.Warn.Connections=1 --Ice.Default.Protocol=tcp --Ice.IPv6=0 --Ice.PrintStackTraces=1 --Ice.ThreadPool.Server.Size=1 --Ice.ThreadPool.Server.SizeMax=3 --Ice.ThreadPool.Server.SizeWarn=0 --DataStorm.Node.Multicast.Enabled=0 --DataStorm.Node.Server.Enabled=0 --DataStorm.Node.ConnectTo="tcp -p 12021" --DataStorm.Trace.Topic=1 --DataStorm.Trace.Session=3 --DataStorm.Trace.Data=2 --Ice.Trace.Protocol=1 --Ice.LogFile=/home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/client-122624-1336.log pid=82271 is hanging - 12/26/24 13:38:14
process /home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/build/x86_64-linux-gnu/shared/reader --Ice.Default.Host=127.0.0.1 --Ice.Warn.Connections=1 --Ice.Default.Protocol=tcp --Ice.IPv6=0 --Ice.PrintStackTraces=1 --Ice.ThreadPool.Server.Size=1 --Ice.ThreadPool.Server.SizeMax=3 --Ice.ThreadPool.Server.SizeWarn=0 --DataStorm.Node.Multicast.Enabled=0 --DataStorm.Node.Server.Enabled=0 --DataStorm.Node.ConnectTo="tcp -p 12021" --DataStorm.Trace.Topic=1 --DataStorm.Trace.Session=3 --DataStorm.Trace.Data=2 --Ice.Trace.Protocol=1 --Ice.LogFile=/home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/client-122624-1336.log pid=82271 is hanging - 12/26/24 13:38:44
process /home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/build/x86_64-linux-gnu/shared/reader --Ice.Default.Host=127.0.0.1 --Ice.Warn.Connections=1 --Ice.Default.Protocol=tcp --Ice.IPv6=0 --Ice.PrintStackTraces=1 --Ice.ThreadPool.Server.Size=1 --Ice.ThreadPool.Server.SizeMax=3 --Ice.ThreadPool.Server.SizeWarn=0 --DataStorm.Node.Multicast.Enabled=0 --DataStorm.Node.Server.Enabled=0 --DataStorm.Node.ConnectTo="tcp -p 12021" --DataStorm.Trace.Topic=1 --DataStorm.Trace.Session=3 --DataStorm.Trace.Data=2 --Ice.Trace.Protocol=1 --Ice.LogFile=/home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/client-122624-1336.log pid=82271 is hanging - 12/26/24 13:39:14
process /home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/build/x86_64-linux-gnu/shared/reader --Ice.Default.Host=127.0.0.1 --Ice.Warn.Connections=1 --Ice.Default.Protocol=tcp --Ice.IPv6=0 --Ice.PrintStackTraces=1 --Ice.ThreadPool.Server.Size=1 --Ice.ThreadPool.Server.SizeMax=3 --Ice.ThreadPool.Server.SizeWarn=0 --DataStorm.Node.Multicast.Enabled=0 --DataStorm.Node.Server.Enabled=0 --DataStorm.Node.ConnectTo="tcp -p 12021" --DataStorm.Trace.Topic=1 --DataStorm.Trace.Session=3 --DataStorm.Trace.Data=2 --Ice.Trace.Protocol=1 --Ice.LogFile=/home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/client-122624-1336.log pid=82271 is hanging - 12/26/24 13:39:44
process /home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/build/x86_64-linux-gnu/shared/reader --Ice.Default.Host=127.0.0.1 --Ice.Warn.Connections=1 --Ice.Default.Protocol=tcp --Ice.IPv6=0 --Ice.PrintStackTraces=1 --Ice.ThreadPool.Server.Size=1 --Ice.ThreadPool.Server.SizeMax=3 --Ice.ThreadPool.Server.SizeWarn=0 --DataStorm.Node.Multicast.Enabled=0 --DataStorm.Node.Server.Enabled=0 --DataStorm.Node.ConnectTo="tcp -p 12021" --DataStorm.Trace.Topic=1 --DataStorm.Trace.Session=3 --DataStorm.Trace.Data=2 --Ice.Trace.Protocol=1 --Ice.LogFile=/home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/client-122624-1336.log pid=82271 is hanging - 12/26/24 13:40:14
process /home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/build/x86_64-linux-gnu/shared/reader --Ice.Default.Host=127.0.0.1 --Ice.Warn.Connections=1 --Ice.Default.Protocol=tcp --Ice.IPv6=0 --Ice.PrintStackTraces=1 --Ice.ThreadPool.Server.Size=1 --Ice.ThreadPool.Server.SizeMax=3 --Ice.ThreadPool.Server.SizeWarn=0 --DataStorm.Node.Multicast.Enabled=0 --DataStorm.Node.Server.Enabled=0 --DataStorm.Node.ConnectTo="tcp -p 12021" --DataStorm.Trace.Topic=1 --DataStorm.Trace.Session=3 --DataStorm.Trace.Data=2 --Ice.Trace.Protocol=1 --Ice.LogFile=/home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/client-122624-1336.log pid=82271 is hanging - 12/26/24 13:40:44
process /home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/build/x86_64-linux-gnu/shared/reader --Ice.Default.Host=127.0.0.1 --Ice.Warn.Connections=1 --Ice.Default.Protocol=tcp --Ice.IPv6=0 --Ice.PrintStackTraces=1 --Ice.ThreadPool.Server.Size=1 --Ice.ThreadPool.Server.SizeMax=3 --Ice.ThreadPool.Server.SizeWarn=0 --DataStorm.Node.Multicast.Enabled=0 --DataStorm.Node.Server.Enabled=0 --DataStorm.Node.ConnectTo="tcp -p 12021" --DataStorm.Trace.Topic=1 --DataStorm.Trace.Session=3 --DataStorm.Trace.Data=2 --Ice.Trace.Protocol=1 --Ice.LogFile=/home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/client-122624-1336.log pid=82271 is hanging - 12/26/24 13:41:14
process /home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/build/x86_64-linux-gnu/shared/reader --Ice.Default.Host=127.0.0.1 --Ice.Warn.Connections=1 --Ice.Default.Protocol=tcp --Ice.IPv6=0 --Ice.PrintStackTraces=1 --Ice.ThreadPool.Server.Size=1 --Ice.ThreadPool.Server.SizeMax=3 --Ice.ThreadPool.Server.SizeWarn=0 --DataStorm.Node.Multicast.Enabled=0 --DataStorm.Node.Server.Enabled=0 --DataStorm.Node.ConnectTo="tcp -p 12021" --DataStorm.Trace.Topic=1 --DataStorm.Trace.Session=3 --DataStorm.Trace.Data=2 --Ice.Trace.Protocol=1 --Ice.LogFile=/home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/client-122624-1336.log pid=82271 is hanging - 12/26/24 13:41:45
process /home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/build/x86_64-linux-gnu/shared/reader --Ice.Default.Host=127.0.0.1 --Ice.Warn.Connections=1 --Ice.Default.Protocol=tcp --Ice.IPv6=0 --Ice.PrintStackTraces=1 --Ice.ThreadPool.Server.Size=1 --Ice.ThreadPool.Server.SizeMax=3 --Ice.ThreadPool.Server.SizeWarn=0 --DataStorm.Node.Multicast.Enabled=0 --DataStorm.Node.Server.Enabled=0 --DataStorm.Node.ConnectTo="tcp -p 12021" --DataStorm.Trace.Topic=1 --DataStorm.Trace.Session=3 --DataStorm.Trace.Data=2 --Ice.Trace.Protocol=1 --Ice.LogFile=/home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/client-122624-1336.log pid=82271 is hanging - 12/26/24 13:42:15
process /home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/build/x86_64-linux-gnu/shared/reader --Ice.Default.Host=127.0.0.1 --Ice.Warn.Connections=1 --Ice.Default.Protocol=tcp --Ice.IPv6=0 --Ice.PrintStackTraces=1 --Ice.ThreadPool.Server.Size=1 --Ice.ThreadPool.Server.SizeMax=3 --Ice.ThreadPool.Server.SizeWarn=0 --DataStorm.Node.Multicast.Enabled=0 --DataStorm.Node.Server.Enabled=0 --DataStorm.Node.ConnectTo="tcp -p 12021" --DataStorm.Trace.Topic=1 --DataStorm.Trace.Session=3 --DataStorm.Trace.Data=2 --Ice.Trace.Protocol=1 --Ice.LogFile=/home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/client-122624-1336.log pid=82271 is hanging - 12/26/24 13:42:45
process /home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/build/x86_64-linux-gnu/shared/reader --Ice.Default.Host=127.0.0.1 --Ice.Warn.Connections=1 --Ice.Default.Protocol=tcp --Ice.IPv6=0 --Ice.PrintStackTraces=1 --Ice.ThreadPool.Server.Size=1 --Ice.ThreadPool.Server.SizeMax=3 --Ice.ThreadPool.Server.SizeWarn=0 --DataStorm.Node.Multicast.Enabled=0 --DataStorm.Node.Server.Enabled=0 --DataStorm.Node.ConnectTo="tcp -p 12021" --DataStorm.Trace.Topic=1 --DataStorm.Trace.Session=3 --DataStorm.Trace.Data=2 --Ice.Trace.Protocol=1 --Ice.LogFile=/home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/client-122624-1336.log pid=82271 is hanging - 12/26/24 13:43:15
process /home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/build/x86_64-linux-gnu/shared/reader --Ice.Default.Host=127.0.0.1 --Ice.Warn.Connections=1 --Ice.Default.Protocol=tcp --Ice.IPv6=0 --Ice.PrintStackTraces=1 --Ice.ThreadPool.Server.Size=1 --Ice.ThreadPool.Server.SizeMax=3 --Ice.ThreadPool.Server.SizeWarn=0 --DataStorm.Node.Multicast.Enabled=0 --DataStorm.Node.Server.Enabled=0 --DataStorm.Node.ConnectTo="tcp -p 12021" --DataStorm.Trace.Topic=1 --DataStorm.Trace.Session=3 --DataStorm.Trace.Data=2 --Ice.Trace.Protocol=1 --Ice.LogFile=/home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/client-122624-1336.log pid=82271 is hanging - 12/26/24 13:43:45
process /home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/build/x86_64-linux-gnu/shared/reader --Ice.Default.Host=127.0.0.1 --Ice.Warn.Connections=1 --Ice.Default.Protocol=tcp --Ice.IPv6=0 --Ice.PrintStackTraces=1 --Ice.ThreadPool.Server.Size=1 --Ice.ThreadPool.Server.SizeMax=3 --Ice.ThreadPool.Server.SizeWarn=0 --DataStorm.Node.Multicast.Enabled=0 --DataStorm.Node.Server.Enabled=0 --DataStorm.Node.ConnectTo="tcp -p 12021" --DataStorm.Trace.Topic=1 --DataStorm.Trace.Session=3 --DataStorm.Trace.Data=2 --Ice.Trace.Protocol=1 --Ice.LogFile=/home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/client-122624-1336.log pid=82271 is hanging - 12/26/24 13:44:15
process /home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/build/x86_64-linux-gnu/shared/reader --Ice.Default.Host=127.0.0.1 --Ice.Warn.Connections=1 --Ice.Default.Protocol=tcp --Ice.IPv6=0 --Ice.PrintStackTraces=1 --Ice.ThreadPool.Server.Size=1 --Ice.ThreadPool.Server.SizeMax=3 --Ice.ThreadPool.Server.SizeWarn=0 --DataStorm.Node.Multicast.Enabled=0 --DataStorm.Node.Server.Enabled=0 --DataStorm.Node.ConnectTo="tcp -p 12021" --DataStorm.Trace.Topic=1 --DataStorm.Trace.Session=3 --DataStorm.Trace.Data=2 --Ice.Trace.Protocol=1 --Ice.LogFile=/home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/client-122624-1336.log pid=82271 is hanging - 12/26/24 13:44:45
process /home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/build/x86_64-linux-gnu/shared/reader --Ice.Default.Host=127.0.0.1 --Ice.Warn.Connections=1 --Ice.Default.Protocol=tcp --Ice.IPv6=0 --Ice.PrintStackTraces=1 --Ice.ThreadPool.Server.Size=1 --Ice.ThreadPool.Server.SizeMax=3 --Ice.ThreadPool.Server.SizeWarn=0 --DataStorm.Node.Multicast.Enabled=0 --DataStorm.Node.Server.Enabled=0 --DataStorm.Node.ConnectTo="tcp -p 12021" --DataStorm.Trace.Topic=1 --DataStorm.Trace.Session=3 --DataStorm.Trace.Data=2 --Ice.Trace.Protocol=1 --Ice.LogFile=/home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/client-122624-1336.log pid=82271 is hanging - 12/26/24 13:45:15
process /home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/build/x86_64-linux-gnu/shared/reader --Ice.Default.Host=127.0.0.1 --Ice.Warn.Connections=1 --Ice.Default.Protocol=tcp --Ice.IPv6=0 --Ice.PrintStackTraces=1 --Ice.ThreadPool.Server.Size=1 --Ice.ThreadPool.Server.SizeMax=3 --Ice.ThreadPool.Server.SizeWarn=0 --DataStorm.Node.Multicast.Enabled=0 --DataStorm.Node.Server.Enabled=0 --DataStorm.Node.ConnectTo="tcp -p 12021" --DataStorm.Trace.Topic=1 --DataStorm.Trace.Session=3 --DataStorm.Trace.Data=2 --Ice.Trace.Protocol=1 --Ice.LogFile=/home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/client-122624-1336.log pid=82271 is hanging - 12/26/24 13:45:45
process /home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/build/x86_64-linux-gnu/shared/reader --Ice.Default.Host=127.0.0.1 --Ice.Warn.Connections=1 --Ice.Default.Protocol=tcp --Ice.IPv6=0 --Ice.PrintStackTraces=1 --Ice.ThreadPool.Server.Size=1 --Ice.ThreadPool.Server.SizeMax=3 --Ice.ThreadPool.Server.SizeWarn=0 --DataStorm.Node.Multicast.Enabled=0 --DataStorm.Node.Server.Enabled=0 --DataStorm.Node.ConnectTo="tcp -p 12021" --DataStorm.Trace.Topic=1 --DataStorm.Trace.Session=3 --DataStorm.Trace.Data=2 --Ice.Trace.Protocol=1 --Ice.LogFile=/home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/client-122624-1336.log pid=82271 is hanging - 12/26/24 13:46:15
process /home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/build/x86_64-linux-gnu/shared/reader --Ice.Default.Host=127.0.0.1 --Ice.Warn.Connections=1 --Ice.Default.Protocol=tcp --Ice.IPv6=0 --Ice.PrintStackTraces=1 --Ice.ThreadPool.Server.Size=1 --Ice.ThreadPool.Server.SizeMax=3 --Ice.ThreadPool.Server.SizeWarn=0 --DataStorm.Node.Multicast.Enabled=0 --DataStorm.Node.Server.Enabled=0 --DataStorm.Node.ConnectTo="tcp -p 12021" --DataStorm.Trace.Topic=1 --DataStorm.Trace.Session=3 --DataStorm.Trace.Data=2 --Ice.Trace.Protocol=1 --Ice.LogFile=/home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/client-122624-1336.log pid=82271 is hanging - 12/26/24 13:46:45
process /home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/build/x86_64-linux-gnu/shared/reader --Ice.Default.Host=127.0.0.1 --Ice.Warn.Connections=1 --Ice.Default.Protocol=tcp --Ice.IPv6=0 --Ice.PrintStackTraces=1 --Ice.ThreadPool.Server.Size=1 --Ice.ThreadPool.Server.SizeMax=3 --Ice.ThreadPool.Server.SizeWarn=0 --DataStorm.Node.Multicast.Enabled=0 --DataStorm.Node.Server.Enabled=0 --DataStorm.Node.ConnectTo="tcp -p 12021" --DataStorm.Trace.Topic=1 --DataStorm.Trace.Session=3 --DataStorm.Trace.Data=2 --Ice.Trace.Protocol=1 --Ice.LogFile=/home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/client-122624-1336.log pid=82271 is hanging - 12/26/24 13:47:15
process /home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/build/x86_64-linux-gnu/shared/reader --Ice.Default.Host=127.0.0.1 --Ice.Warn.Connections=1 --Ice.Default.Protocol=tcp --Ice.IPv6=0 --Ice.PrintStackTraces=1 --Ice.ThreadPool.Server.Size=1 --Ice.ThreadPool.Server.SizeMax=3 --Ice.ThreadPool.Server.SizeWarn=0 --DataStorm.Node.Multicast.Enabled=0 --DataStorm.Node.Server.Enabled=0 --DataStorm.Node.ConnectTo="tcp -p 12021" --DataStorm.Trace.Topic=1 --DataStorm.Trace.Session=3 --DataStorm.Trace.Data=2 --Ice.Trace.Protocol=1 --Ice.LogFile=/home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/client-122624-1336.log pid=82271 is hanging - 12/26/24 13:47:45
process /home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/build/x86_64-linux-gnu/shared/reader --Ice.Default.Host=127.0.0.1 --Ice.Warn.Connections=1 --Ice.Default.Protocol=tcp --Ice.IPv6=0 --Ice.PrintStackTraces=1 --Ice.ThreadPool.Server.Size=1 --Ice.ThreadPool.Server.SizeMax=3 --Ice.ThreadPool.Server.SizeWarn=0 --DataStorm.Node.Multicast.Enabled=0 --DataStorm.Node.Server.Enabled=0 --DataStorm.Node.ConnectTo="tcp -p 12021" --DataStorm.Trace.Topic=1 --DataStorm.Trace.Session=3 --DataStorm.Trace.Data=2 --Ice.Trace.Protocol=1 --Ice.LogFile=/home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/client-122624-1336.log pid=82271 is hanging - 12/26/24 13:48:15
process /home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/build/x86_64-linux-gnu/shared/reader --Ice.Default.Host=127.0.0.1 --Ice.Warn.Connections=1 --Ice.Default.Protocol=tcp --Ice.IPv6=0 --Ice.PrintStackTraces=1 --Ice.ThreadPool.Server.Size=1 --Ice.ThreadPool.Server.SizeMax=3 --Ice.ThreadPool.Server.SizeWarn=0 --DataStorm.Node.Multicast.Enabled=0 --DataStorm.Node.Server.Enabled=0 --DataStorm.Node.ConnectTo="tcp -p 12021" --DataStorm.Trace.Topic=1 --DataStorm.Trace.Session=3 --DataStorm.Trace.Data=2 --Ice.Trace.Protocol=1 --Ice.LogFile=/home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/client-122624-1336.log pid=82271 is hanging - 12/26/24 13:48:45
process /home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/build/x86_64-linux-gnu/shared/reader --Ice.Default.Host=127.0.0.1 --Ice.Warn.Connections=1 --Ice.Default.Protocol=tcp --Ice.IPv6=0 --Ice.PrintStackTraces=1 --Ice.ThreadPool.Server.Size=1 --Ice.ThreadPool.Server.SizeMax=3 --Ice.ThreadPool.Server.SizeWarn=0 --DataStorm.Node.Multicast.Enabled=0 --DataStorm.Node.Server.Enabled=0 --DataStorm.Node.ConnectTo="tcp -p 12021" --DataStorm.Trace.Topic=1 --DataStorm.Trace.Session=3 --DataStorm.Trace.Data=2 --Ice.Trace.Protocol=1 --Ice.LogFile=/home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/client-122624-1336.log pid=82271 is hanging - 12/26/24 13:49:15
process /home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/build/x86_64-linux-gnu/shared/reader --Ice.Default.Host=127.0.0.1 --Ice.Warn.Connections=1 --Ice.Default.Protocol=tcp --Ice.IPv6=0 --Ice.PrintStackTraces=1 --Ice.ThreadPool.Server.Size=1 --Ice.ThreadPool.Server.SizeMax=3 --Ice.ThreadPool.Server.SizeWarn=0 --DataStorm.Node.Multicast.Enabled=0 --DataStorm.Node.Server.Enabled=0 --DataStorm.Node.ConnectTo="tcp -p 12021" --DataStorm.Trace.Topic=1 --DataStorm.Trace.Session=3 --DataStorm.Trace.Data=2 --Ice.Trace.Protocol=1 --Ice.LogFile=/home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/client-122624-1336.log pid=82271 is hanging - 12/26/24 13:49:45
process /home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/build/x86_64-linux-gnu/shared/reader --Ice.Default.Host=127.0.0.1 --Ice.Warn.Connections=1 --Ice.Default.Protocol=tcp --Ice.IPv6=0 --Ice.PrintStackTraces=1 --Ice.ThreadPool.Server.Size=1 --Ice.ThreadPool.Server.SizeMax=3 --Ice.ThreadPool.Server.SizeWarn=0 --DataStorm.Node.Multicast.Enabled=0 --DataStorm.Node.Server.Enabled=0 --DataStorm.Node.ConnectTo="tcp -p 12021" --DataStorm.Trace.Topic=1 --DataStorm.Trace.Session=3 --DataStorm.Trace.Data=2 --Ice.Trace.Protocol=1 --Ice.LogFile=/home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/client-122624-1336.log pid=82271 is hanging - 12/26/24 13:50:15
process /home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/build/x86_64-linux-gnu/shared/reader --Ice.Default.Host=127.0.0.1 --Ice.Warn.Connections=1 --Ice.Default.Protocol=tcp --Ice.IPv6=0 --Ice.PrintStackTraces=1 --Ice.ThreadPool.Server.Size=1 --Ice.ThreadPool.Server.SizeMax=3 --Ice.ThreadPool.Server.SizeWarn=0 --DataStorm.Node.Multicast.Enabled=0 --DataStorm.Node.Server.Enabled=0 --DataStorm.Node.ConnectTo="tcp -p 12021" --DataStorm.Trace.Topic=1 --DataStorm.Trace.Session=3 --DataStorm.Trace.Data=2 --Ice.Trace.Protocol=1 --Ice.LogFile=/home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/client-122624-1336.log pid=82271 is hanging - 12/26/24 13:50:45
process /home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/build/x86_64-linux-gnu/shared/reader --Ice.Default.Host=127.0.0.1 --Ice.Warn.Connections=1 --Ice.Default.Protocol=tcp --Ice.IPv6=0 --Ice.PrintStackTraces=1 --Ice.ThreadPool.Server.Size=1 --Ice.ThreadPool.Server.SizeMax=3 --Ice.ThreadPool.Server.SizeWarn=0 --DataStorm.Node.Multicast.Enabled=0 --DataStorm.Node.Server.Enabled=0 --DataStorm.Node.ConnectTo="tcp -p 12021" --DataStorm.Trace.Topic=1 --DataStorm.Trace.Session=3 --DataStorm.Trace.Data=2 --Ice.Trace.Protocol=1 --Ice.LogFile=/home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/client-122624-1336.log pid=82271 is hanging - 12/26/24 13:51:15
process /home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/build/x86_64-linux-gnu/shared/reader --Ice.Default.Host=127.0.0.1 --Ice.Warn.Connections=1 --Ice.Default.Protocol=tcp --Ice.IPv6=0 --Ice.PrintStackTraces=1 --Ice.ThreadPool.Server.Size=1 --Ice.ThreadPool.Server.SizeMax=3 --Ice.ThreadPool.Server.SizeWarn=0 --DataStorm.Node.Multicast.Enabled=0 --DataStorm.Node.Server.Enabled=0 --DataStorm.Node.ConnectTo="tcp -p 12021" --DataStorm.Trace.Topic=1 --DataStorm.Trace.Session=3 --DataStorm.Trace.Data=2 --Ice.Trace.Protocol=1 --Ice.LogFile=/home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/client-122624-1336.log pid=82271 is hanging - 12/26/24 13:51:45

[ running reader as client and writer as server with 2 nodes (reversed start order) test - 12/26/24 13:52:01 ]

Note that the recovery only happened after I paused the reader and writer in the debugger to retrieve the stack traces and then detached the debugger. Pausing and detaching might have caused the connection to be lost and the session to start a new recovery.

@pepone
Member Author

pepone commented Dec 26, 2024

The recovery happens after the connection is closed by the inactivity timeout; see the log for a similar failure:

-- 12/26/24 15:57:30150 /home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/build/x86_64-linux-gnu/shared/writer: Session: disconnected node session (peer = `D4FB79DC-F7E7-42AD-97B7-EF78D70022B8 -t -e 1.1:tcp -h 127.0.0.1 -p 12022 -t 60000')
-- 12/26/24 15:57:30151 /home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/build/x86_64-linux-gnu/shared/writer: Session: p/2: session 'sf/2-E1B02D64-0418-47BD-AE62-B5DEED482775' disconnected:
   local address = 127.0.0.1:55018
   remote address = 127.0.0.1:12022
   connection closed because it remained inactive for longer than the inactivity timeout
-- 12/26/24 15:57:30152 /home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/build/x86_64-linux-gnu/shared/writer: Session: p/2: retrying connecting to 'E1B02D64-0418-47BD-AE62-B5DEED482775 -t -e 1.1:tcp -h 127.0.0.1 -p 12022 -t 60000' in 0 (ms), retry 1/6
-- 12/26/24 15:57:30152 /home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/build/x86_64-linux-gnu/shared/writer: Session: p/2: trying to reconnect session with 'E1B02D64-0418-47BD-AE62-B5DEED482775 -t -e 1.1:tcp -h 127.0.0.1 -p 12022 -t 60000'
-- 12/26/24 15:57:30152 /home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/build/x86_64-linux-gnu/shared/writer: Protocol: received validate connection 
   message type = 3 (validate connection)
   compression status = 0 (not compressed; do not compress response, if any)
   message size = 14
   transport = tcp
   local address = 127.0.0.1:39270
   remote address = 127.0.0.1:12022

I suspect there is a race condition in the announcement of elements with established sessions, and the writer gets stuck waiting for the connected-readers notification.
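
To make the suspected failure mode concrete, here is one generic way such a race can arise. It is a sketch under assumptions, not a description of DataStorm internals. If an element announcement is processed before the session is marked as established and is simply dropped, the peer never learns about the element and waits until something forces a reconnect. The hypothetical `BufferingSession` below (all names are invented for illustration) shows the usual remedy of queuing early announcements and replaying them once the session is established:

```cpp
#include <functional>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical session model; none of these names come from DataStorm.
class BufferingSession
{
public:
    explicit BufferingSession(std::function<void(const std::string&)> onAttach)
        : _onAttach(std::move(onAttach))
    {
    }

    // Called when a peer announces an element; may race with establish().
    void announce(const std::string& element)
    {
        {
            std::lock_guard<std::mutex> lock(_mutex);
            if (!_established)
            {
                // Too early to attach: remember the announcement instead of dropping it.
                _pending.push_back(element);
                return;
            }
        }
        _onAttach(element);
    }

    // Called once the session is established; replays buffered announcements.
    void establish()
    {
        std::vector<std::string> pending;
        {
            std::lock_guard<std::mutex> lock(_mutex);
            _established = true;
            pending.swap(_pending);
        }
        // Invoke the callback outside the lock to avoid re-entrancy deadlocks.
        for (const auto& element : pending)
        {
            _onAttach(element);
        }
    }

private:
    std::function<void(const std::string&)> _onAttach;
    std::mutex _mutex;
    bool _established = false;
    std::vector<std::string> _pending;
};
```

With this shape, an announcement is attached exactly once whichever call wins the race. The sketch does not guarantee ordering between replayed announcements and ones that arrive concurrently with `establish()`.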

@pepone
Member Author

pepone commented Dec 28, 2024

Here is another occurrence:

testing writer connection closure... ok
testing reader connection closure... process /home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/build/x86_64-linux-gnu/shared/writer --Ice.Default.Host=127.0.0.1 --Ice.Warn.Connections=1 --Ice.Default.Protocol=tcp --Ice.IPv6=0 --Ice.PrintStackTraces=1 --DataStorm.Node.Multicast.Enabled=0 --DataStorm.Node.Server.Enabled=0 --DataStorm.Node.ConnectTo="tcp -p 12022" --DataStorm.Trace.Topic=1 --DataStorm.Trace.Session=3 --DataStorm.Trace.Data=2 --Ice.Trace.Protocol=1 --Ice.LogFile=/home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/client-122824-0951.log pid=1455565 is hanging - 12/28/24 09:52:21
process /home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/build/x86_64-linux-gnu/shared/writer --Ice.Default.Host=127.0.0.1 --Ice.Warn.Connections=1 --Ice.Default.Protocol=tcp --Ice.IPv6=0 --Ice.PrintStackTraces=1 --DataStorm.Node.Multicast.Enabled=0 --DataStorm.Node.Server.Enabled=0 --DataStorm.Node.ConnectTo="tcp -p 12022" --DataStorm.Trace.Topic=1 --DataStorm.Trace.Session=3 --DataStorm.Trace.Data=2 --Ice.Trace.Protocol=1 --Ice.LogFile=/home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/client-122824-0951.log pid=1455565 is hanging - 12/28/24 09:52:51
process /home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/build/x86_64-linux-gnu/shared/writer --Ice.Default.Host=127.0.0.1 --Ice.Warn.Connections=1 --Ice.Default.Protocol=tcp --Ice.IPv6=0 --Ice.PrintStackTraces=1 --DataStorm.Node.Multicast.Enabled=0 --DataStorm.Node.Server.Enabled=0 --DataStorm.Node.ConnectTo="tcp -p 12022" --DataStorm.Trace.Topic=1 --DataStorm.Trace.Session=3 --DataStorm.Trace.Data=2 --Ice.Trace.Protocol=1 --Ice.LogFile=/home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/client-122824-0951.log pid=1455565 is hanging - 12/28/24 09:53:21
process /home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/build/x86_64-linux-gnu/shared/writer --Ice.Default.Host=127.0.0.1 --Ice.Warn.Connections=1 --Ice.Default.Protocol=tcp --Ice.IPv6=0 --Ice.PrintStackTraces=1 --DataStorm.Node.Multicast.Enabled=0 --DataStorm.Node.Server.Enabled=0 --DataStorm.Node.ConnectTo="tcp -p 12022" --DataStorm.Trace.Topic=1 --DataStorm.Trace.Session=3 --DataStorm.Trace.Data=2 --Ice.Trace.Protocol=1 --Ice.LogFile=/home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/client-122824-0951.log pid=1455565 is hanging - 12/28/24 09:53:51
process /home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/build/x86_64-linux-gnu/shared/writer --Ice.Default.Host=127.0.0.1 --Ice.Warn.Connections=1 --Ice.Default.Protocol=tcp --Ice.IPv6=0 --Ice.PrintStackTraces=1 --DataStorm.Node.Multicast.Enabled=0 --DataStorm.Node.Server.Enabled=0 --DataStorm.Node.ConnectTo="tcp -p 12022" --DataStorm.Trace.Topic=1 --DataStorm.Trace.Session=3 --DataStorm.Trace.Data=2 --Ice.Trace.Protocol=1 --Ice.LogFile=/home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/client-122824-0951.log pid=1455565 is hanging - 12/28/24 09:54:21
process /home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/build/x86_64-linux-gnu/shared/writer --Ice.Default.Host=127.0.0.1 --Ice.Warn.Connections=1 --Ice.Default.Protocol=tcp --Ice.IPv6=0 --Ice.PrintStackTraces=1 --DataStorm.Node.Multicast.Enabled=0 --DataStorm.Node.Server.Enabled=0 --DataStorm.Node.ConnectTo="tcp -p 12022" --DataStorm.Trace.Topic=1 --DataStorm.Trace.Session=3 --DataStorm.Trace.Data=2 --Ice.Trace.Protocol=1 --Ice.LogFile=/home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/client-122824-0951.log pid=1455565 is hanging - 12/28/24 09:54:51
process /home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/build/x86_64-linux-gnu/shared/writer --Ice.Default.Host=127.0.0.1 --Ice.Warn.Connections=1 --Ice.Default.Protocol=tcp --Ice.IPv6=0 --Ice.PrintStackTraces=1 --DataStorm.Node.Multicast.Enabled=0 --DataStorm.Node.Server.Enabled=0 --DataStorm.Node.ConnectTo="tcp -p 12022" --DataStorm.Trace.Topic=1 --DataStorm.Trace.Session=3 --DataStorm.Trace.Data=2 --Ice.Trace.Protocol=1 --Ice.LogFile=/home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/client-122824-0951.log pid=1455565 is hanging - 12/28/24 09:55:21
process /home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/build/x86_64-linux-gnu/shared/writer --Ice.Default.Host=127.0.0.1 --Ice.Warn.Connections=1 --Ice.Default.Protocol=tcp --Ice.IPv6=0 --Ice.PrintStackTraces=1 --DataStorm.Node.Multicast.Enabled=0 --DataStorm.Node.Server.Enabled=0 --DataStorm.Node.ConnectTo="tcp -p 12022" --DataStorm.Trace.Topic=1 --DataStorm.Trace.Session=3 --DataStorm.Trace.Data=2 --Ice.Trace.Protocol=1 --Ice.LogFile=/home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/client-122824-0951.log pid=1455565 is hanging - 12/28/24 09:55:51
process /home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/build/x86_64-linux-gnu/shared/writer --Ice.Default.Host=127.0.0.1 --Ice.Warn.Connections=1 --Ice.Default.Protocol=tcp --Ice.IPv6=0 --Ice.PrintStackTraces=1 --DataStorm.Node.Multicast.Enabled=0 --DataStorm.Node.Server.Enabled=0 --DataStorm.Node.ConnectTo="tcp -p 12022" --DataStorm.Trace.Topic=1 --DataStorm.Trace.Session=3 --DataStorm.Trace.Data=2 --Ice.Trace.Protocol=1 --Ice.LogFile=/home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/client-122824-0951.log pid=1455565 is hanging - 12/28/24 09:56:22
ok
testing reader multiple connection closure without writer activity... process /home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/build/x86_64-linux-gnu/shared/writer --Ice.Default.Host=127.0.0.1 --Ice.Warn.Connections=1 --Ice.Default.Protocol=tcp --Ice.IPv6=0 --Ice.PrintStackTraces=1 --DataStorm.Node.Multicast.Enabled=0 --DataStorm.Node.Server.Enabled=0 --DataStorm.Node.ConnectTo="tcp -p 12022" --DataStorm.Trace.Topic=1 --DataStorm.Trace.Session=3 --DataStorm.Trace.Data=2 --Ice.Trace.Protocol=1 --Ice.LogFile=/home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/client-122824-0951.log pid=1455565 is hanging - 12/28/24 09:57:52
process /home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/build/x86_64-linux-gnu/shared/writer --Ice.Default.Host=127.0.0.1 --Ice.Warn.Connections=1 --Ice.Default.Protocol=tcp --Ice.IPv6=0 --Ice.PrintStackTraces=1 --DataStorm.Node.Multicast.Enabled=0 --DataStorm.Node.Server.Enabled=0 --DataStorm.Node.ConnectTo="tcp -p 12022" --DataStorm.Trace.Topic=1 --DataStorm.Trace.Session=3 --DataStorm.Trace.Data=2 --Ice.Trace.Protocol=1 --Ice.LogFile=/home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/client-122824-0951.log pid=1455565 is hanging - 12/28/24 09:58:22
process /home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/build/x86_64-linux-gnu/shared/writer --Ice.Default.Host=127.0.0.1 --Ice.Warn.Connections=1 --Ice.Default.Protocol=tcp --Ice.IPv6=0 --Ice.PrintStackTraces=1 --DataStorm.Node.Multicast.Enabled=0 --DataStorm.Node.Server.Enabled=0 --DataStorm.Node.ConnectTo="tcp -p 12022" --DataStorm.Trace.Topic=1 --DataStorm.Trace.Session=3 --DataStorm.Trace.Data=2 --Ice.Trace.Protocol=1 --Ice.LogFile=/home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/client-122824-0951.log pid=1455565 is hanging - 12/28/24 09:58:52
process /home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/build/x86_64-linux-gnu/shared/writer --Ice.Default.Host=127.0.0.1 --Ice.Warn.Connections=1 --Ice.Default.Protocol=tcp --Ice.IPv6=0 --Ice.PrintStackTraces=1 --DataStorm.Node.Multicast.Enabled=0 --DataStorm.Node.Server.Enabled=0 --DataStorm.Node.ConnectTo="tcp -p 12022" --DataStorm.Trace.Topic=1 --DataStorm.Trace.Session=3 --DataStorm.Trace.Data=2 --Ice.Trace.Protocol=1 --Ice.LogFile=/home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/client-122824-0951.log pid=1455565 is hanging - 12/28/24 09:59:22
process /home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/build/x86_64-linux-gnu/shared/writer --Ice.Default.Host=127.0.0.1 --Ice.Warn.Connections=1 --Ice.Default.Protocol=tcp --Ice.IPv6=0 --Ice.PrintStackTraces=1 --DataStorm.Node.Multicast.Enabled=0 --DataStorm.Node.Server.Enabled=0 --DataStorm.Node.ConnectTo="tcp -p 12022" --DataStorm.Trace.Topic=1 --DataStorm.Trace.Session=3 --DataStorm.Trace.Data=2 --Ice.Trace.Protocol=1 --Ice.LogFile=/home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/client-122824-0951.log pid=1455565 is hanging - 12/28/24 09:59:52
process /home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/build/x86_64-linux-gnu/shared/writer --Ice.Default.Host=127.0.0.1 --Ice.Warn.Connections=1 --Ice.Default.Protocol=tcp --Ice.IPv6=0 --Ice.PrintStackTraces=1 --DataStorm.Node.Multicast.Enabled=0 --DataStorm.Node.Server.Enabled=0 --DataStorm.Node.ConnectTo="tcp -p 12022" --DataStorm.Trace.Topic=1 --DataStorm.Trace.Session=3 --DataStorm.Trace.Data=2 --Ice.Trace.Protocol=1 --Ice.LogFile=/home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/client-122824-0951.log pid=1455565 is hanging - 12/28/24 10:00:22
process /home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/build/x86_64-linux-gnu/shared/writer --Ice.Default.Host=127.0.0.1 --Ice.Warn.Connections=1 --Ice.Default.Protocol=tcp --Ice.IPv6=0 --Ice.PrintStackTraces=1 --DataStorm.Node.Multicast.Enabled=0 --DataStorm.Node.Server.Enabled=0 --DataStorm.Node.ConnectTo="tcp -p 12022" --DataStorm.Trace.Topic=1 --DataStorm.Trace.Session=3 --DataStorm.Trace.Data=2 --Ice.Trace.Protocol=1 --Ice.LogFile=/home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/client-122824-0951.log pid=1455565 is hanging - 12/28/24 10:00:52
process /home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/build/x86_64-linux-gnu/shared/writer --Ice.Default.Host=127.0.0.1 --Ice.Warn.Connections=1 --Ice.Default.Protocol=tcp --Ice.IPv6=0 --Ice.PrintStackTraces=1 --DataStorm.Node.Multicast.Enabled=0 --DataStorm.Node.Server.Enabled=0 --DataStorm.Node.ConnectTo="tcp -p 12022" --DataStorm.Trace.Topic=1 --DataStorm.Trace.Session=3 --DataStorm.Trace.Data=2 --Ice.Trace.Protocol=1 --Ice.LogFile=/home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/client-122824-0951.log pid=1455565 is hanging - 12/28/24 10:01:22
process /home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/build/x86_64-linux-gnu/shared/writer --Ice.Default.Host=127.0.0.1 --Ice.Warn.Connections=1 --Ice.Default.Protocol=tcp --Ice.IPv6=0 --Ice.PrintStackTraces=1 --DataStorm.Node.Multicast.Enabled=0 --DataStorm.Node.Server.Enabled=0 --DataStorm.Node.ConnectTo="tcp -p 12022" --DataStorm.Trace.Topic=1 --DataStorm.Trace.Session=3 --DataStorm.Trace.Data=2 --Ice.Trace.Protocol=1 --Ice.LogFile=/home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/client-122824-0951.log pid=1455565 is hanging - 12/28/24 10:01:52
ok

Unfortunately, this is not reported as a failure, because the session recovers once the idle timeout closes the connection.
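
One way a test could surface this kind of stall instead of passing after a late recovery is to bound the wait and check the result. The following is only a minimal sketch using the standard library: `ListenerGate`, `add`, and `waitFor` are hypothetical names for illustration, not DataStorm APIs, and the real `waitForReaders` may behave differently.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical helper, not part of DataStorm: counts attached listeners and
// lets a test wait for them with a deadline instead of blocking forever.
class ListenerGate
{
public:
    // Called when a listener attaches; wakes any waiting thread.
    void add()
    {
        {
            std::lock_guard<std::mutex> lock(_mutex);
            ++_count;
        }
        _cond.notify_all();
    }

    // Returns true if at least `count` listeners attached before `timeout`
    // elapsed, false if the deadline passed first.
    bool waitFor(int count, std::chrono::milliseconds timeout)
    {
        std::unique_lock<std::mutex> lock(_mutex);
        return _cond.wait_for(lock, timeout, [&] { return _count >= count; });
    }

private:
    std::mutex _mutex;
    std::condition_variable _cond;
    int _count = 0;
};
```

A test using something like this could treat a `false` return as a failure, so a wait that only completes after the idle timeout fires would no longer go unnoticed. The timeout value would have to be chosen shorter than the idle timeout for that to work.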

@pepone
Member Author

pepone commented Dec 28, 2024

Logs and stacks from a different failure:

[ running reader/writer as client with 2 nodes test - 12/28/24 18:41:44 ]
starting node1... ok
starting node2... ok
testing writer connection closure... ok
testing reader connection closure... process /home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/build/x86_64-linux-gnu/shared/writer --Ice.Default.Host=127.0.0.1 --Ice.Warn.Connections=1 --Ice.Default.Protocol=tcp --Ice.IPv6=0 --Ice.PrintStackTraces=1 --DataStorm.Node.Multicast.Enabled=0 --DataStorm.Node.Server.Enabled=0 --DataStorm.Node.ConnectTo="tcp -p 12022" --DataStorm.Trace.Topic=1 --DataStorm.Trace.Session=3 --DataStorm.Trace.Data=2 --Ice.Trace.Protocol=1 --Ice.LogFile=/home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/client-122824-1841.log pid=2510615 is hanging - 12/28/24 18:42:44
process /home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/build/x86_64-linux-gnu/shared/writer --Ice.Default.Host=127.0.0.1 --Ice.Warn.Connections=1 --Ice.Default.Protocol=tcp --Ice.IPv6=0 --Ice.PrintStackTraces=1 --DataStorm.Node.Multicast.Enabled=0 --DataStorm.Node.Server.Enabled=0 --DataStorm.Node.ConnectTo="tcp -p 12022" --DataStorm.Trace.Topic=1 --DataStorm.Trace.Session=3 --DataStorm.Trace.Data=2 --Ice.Trace.Protocol=1 --Ice.LogFile=/home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/client-122824-1841.log pid=2510615 is hanging - 12/28/24 18:43:15
process /home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/build/x86_64-linux-gnu/shared/writer --Ice.Default.Host=127.0.0.1 --Ice.Warn.Connections=1 --Ice.Default.Protocol=tcp --Ice.IPv6=0 --Ice.PrintStackTraces=1 --DataStorm.Node.Multicast.Enabled=0 --DataStorm.Node.Server.Enabled=0 --DataStorm.Node.ConnectTo="tcp -p 12022" --DataStorm.Trace.Topic=1 --DataStorm.Trace.Session=3 --DataStorm.Trace.Data=2 --Ice.Trace.Protocol=1 --Ice.LogFile=/home/jose/Documents/3.8/ice/cpp/test/DataStorm/reliability/client-122824-1841.log pid=2510615 is hanging - 12/28/24 18:43:45

Reader:

Thread 1 (Thread 0x7fb85b88f100 (LWP 2510613) "reader"):
#0  0x00007fb85c4a4f16 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007fb85c4a75d8 in pthread_cond_wait () from /lib/x86_64-linux-gnu/libc.so.6
#2  0x00007fb85d104b49 in DataStormI::DataElementI::waitForListeners (this=0x55b29d798dc0, count=1) at src/DataStorm/DataElementI.cpp:562
#3  0x00007fb85d10a6a2 in DataStormI::KeyDataWriterI::waitForReaders (this=0x55b29d798dc0, count=1) at src/DataStorm/DataElementI.cpp:1197
#4  0x000055b29bd5a811 in DataStorm::Writer<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >::waitForReaders (this=0x7ffd347ceae0, count=1) at ../cpp/include/DataStorm/DataStorm.h:1616
#5  0x000055b29bd55562 in Reader::run (this=0x7ffd347cf320, argc=1, argv=0x7ffd347cf4d8) at test/DataStorm/reliability/Reader.cpp:67
#6  0x000055b29bd5a989 in Test::runTest<Reader> (argc=17, argv=0x7ffd347cf4d8) at test/include/TestHelper.h:152
#7  0x000055b29bd561ca in main (argc=17, argv=0x7ffd347cf4d8) at test/DataStorm/reliability/Reader.cpp:129

Writer:

Thread 1 (Thread 0x7fb90e65b100 (LWP 2510615) "writer"):
#0  0x00007fb90d8a4f16 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007fb90d8a75d8 in pthread_cond_wait () from /lib/x86_64-linux-gnu/libc.so.6
#2  0x00007fb90e50ce67 in std::condition_variable::wait<DataStormI::DataReaderI::getNextUnread()::<lambda()> >(std::unique_lock<std::mutex> &, struct {...}) (this=0x564f1654ea00, __lock=..., __p=...) at /usr/include/c++/12/condition_variable:102
#3  0x00007fb90e506001 in DataStormI::DataReaderI::getNextUnread (this=0x564f1650dd10) at src/DataStorm/DataElementI.cpp:731
#4  0x0000564f15c7d2ac in DataStorm::Reader<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >::getNextUnread (this=0x7ffe7180f2f0) at ../cpp/include/DataStorm/DataStorm.h:1396
#5  0x0000564f15c78387 in Writer::run (this=0x7ffe7180f5b0, argc=1, argv=0x7ffe7180f768) at test/DataStorm/reliability/Writer.cpp:54
#6  0x0000564f15c7d3a3 in Test::runTest<Writer> (argc=14, argv=0x7ffe7180f768) at test/include/TestHelper.h:152
#7  0x0000564f15c78c8e in main (argc=14, argv=0x7ffe7180f768) at test/DataStorm/reliability/Writer.cpp:96

client-122824-1841.log
node1-122824-1841.log
node2-122824-1841.log
server-122824-1841.log

@pepone pepone linked a pull request Jan 3, 2025 that will close this issue