Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Serial link is closed after unsuccessful session #775

Closed
vortex314 opened this issue Feb 29, 2024 · 7 comments
Closed

Serial link is closed after unsuccessful session #775

vortex314 opened this issue Feb 29, 2024 · 7 comments
Labels
bug Something isn't working

Comments

@vortex314
Copy link

Describe the bug

See issue : eclipse-zenoh/zenoh-pico#357

It looks like the link is closed when a future times out and is never opened again for listening on data

[2024-02-29T17:49:56Z TRACE zenoh_transport::unicast::universal::rx] Received: TransportMessage { body: Frame(Frame { reliability: BestEffort, sn: 1000, ext_qos: QoSType { inner: 5 }, payload: [NetworkMessage { body: Push(Push { wire_expr: 1:Sender:, ext_qos: QoS { priority: Data, congestion: Block, express: false }, ext_tstamp: None, ext_nodeid: NodeIdType { node_id: 0 }, payload: Put(Put { timestamp: None, encoding: Exact(Empty), ext_sinfo: None, ext_attachment: None, ext_unknown: [], payload: ZBuf { slices: [[5b, 20, 20, 20, 20, 39, 39, 39, 5d, 20, 5b, 45, 53, 50, 49, 44, 46, 5d, 7b, 45, 53, 50, 33, 32, 7d, 20, 20, 5a, 65, 6e, 6f, 68, 2d, 50, 69, 63, 6f, 21]] } }) }) }] }) }
[2024-02-29T17:49:56Z TRACE zenoh_transport::unicast::universal::tx] Scheduled: NetworkMessage { body: Push(Push { wire_expr: 1:Receiver:/zenoh-pico-pub, ext_qos: QoS { priority: Data, congestion: Block, express: false }, ext_tstamp: None, ext_nodeid: NodeIdType { node_id: 0 }, payload: Put(Put { timestamp: Some(65e0c3c4314f7e70/9221b5ff42804107098424500607502c), encoding: Exact(Empty), ext_sinfo: None, ext_attachment: None, ext_unknown: [], payload: ZBuf { slices: [[5b, 20, 20, 20, 20, 39, 39, 39, 5d, 20, 5b, 45, 53, 50, 49, 44, 46, 5d, 7b, 45, 53, 50, 33, 32, 7d, 20, 20, 5a, 65, 6e, 6f, 68, 2d, 50, 69, 63, 6f, 21]] } }) }) }
[2024-02-29T17:49:56Z TRACE z_serial] Read 14 bytes COBS [02, 06, 0B, 05, E9, 07, 1E, 01, 01, 03, 2E, 38, F9, 00]
[2024-02-29T17:49:56Z TRACE z_serial] Frame after COBS encoding [06, 00, 05, E9, 07, 1E, 01, 01, 03, 2E, 38, F9]
[2024-02-29T17:49:56Z TRACE z_serial] Received CRC F9382E03  Computed CRC F9382E03
[2024-02-29T17:49:56Z TRACE zenoh_transport::unicast::universal::rx] Received: TransportMessage { body: Frame(Frame { reliability: BestEffort, sn: 1001, ext_qos: QoSType { inner: 5 }, payload: [NetworkMessage { body: Declare(Declare { ext_qos: QoS { priority: Data, congestion: Drop, express: false }, ext_tstamp: None, ext_nodeid: NodeIdType { node_id: 0 }, body: UndeclareKeyExpr(UndeclareKeyExpr { id: 1 }) }) }] }) }
[2024-02-29T17:49:57Z TRACE zenoh_transport::unicast::universal::rx] Received: TransportMessage { body: KeepAlive(KeepAlive) }
[2024-02-29T17:49:58Z TRACE z_serial] Read 10 bytes COBS [02, 02, 02, 23, 05, 53, 76, 8F, F9, 00]
[2024-02-29T17:49:58Z TRACE z_serial] Frame after COBS encoding [02, 00, 23, 00, 53, 76, 8F, F9]
[2024-02-29T17:49:58Z TRACE z_serial] Received CRC F98F7653  Computed CRC F98F7653
[2024-02-29T17:49:58Z TRACE zenoh_transport::unicast::universal::rx] Received: TransportMessage { body: Close(Close { reason: 0, session: true }) }
[2024-02-29T17:49:58Z DEBUG zenoh_transport::unicast::universal::transport] [9221b5ff42804107098424500607502c] Closing transport with peer: 86ef5321af62cb9129a0d0b479d1a105
[2024-02-29T17:49:58Z TRACE zenoh_transport::unicast::universal::link] serial//dev/ttyUSB1 => serial/1e235275-fb93-4009-a20b-0d3d42a0b62b: closing
[2024-02-29T17:49:59Z TRACE zenoh_transport::unicast::manager] Accepting link... serial//dev/ttyUSB1 => serial/1e235275-fb93-4009-a20b-0d3d42a0b62b
[2024-02-29T17:49:59Z TRACE z_serial] Read 27 bytes COBS [02, 13, 18, 01, 08, F2, 78, 2D, 96, CE, 7F, F7, 2D, 47, FB, A9, 29, 3A, E1, CA, D7, 85, 76, 8F, F4, FA, 00]
[2024-02-29T17:49:59Z TRACE z_serial] Frame after COBS encoding [13, 00, 01, 08, F2, 78, 2D, 96, CE, 7F, F7, 2D, 47, FB, A9, 29, 3A, E1, CA, D7, 85, 76, 8F, F4, FA]
[2024-02-29T17:49:59Z TRACE z_serial] Received CRC FAF48F76  Computed CRC FAF48F76
[2024-02-29T17:49:59Z TRACE z_serial] Frame before COBS encoding [48, 00, 61, 08, F0, 2C, 50, 07, 06, 50, 24, 84, 09, 07, 41, 80, 42, FF, B5, 21, 92, 0A, DC, 05, 31, 30, C1, 51, 9D, 0E, D5, 66, F7, 9D, D9, 33, B2, 49, 92, 6C, 72, 6F, 77, DA, 24, 25, CB, 99, 95, 16, 6D, 14, CE, 99, 66, 5B, 42, 00, D6, E6, C4, 75, B9, B1, 14, 20, 95, BD, 1D, A0, 7E, E6, 53, D9, DA, AF, D4, FE]
[2024-02-29T17:49:59Z TRACE z_serial] Wrote 80bytes COBS [02, 48, 38, 61, 08, F0, 2C, 50, 07, 06, 50, 24, 84, 09, 07, 41, 80, 42, FF, B5, 21, 92, 0A, DC, 05, 31, 30, C1, 51, 9D, 0E, D5, 66, F7, 9D, D9, 33, B2, 49, 92, 6C, 72, 6F, 77, DA, 24, 25, CB, 99, 95, 16, 6D, 14, CE, 99, 66, 5B, 42, 15, D6, E6, C4, 75, B9, B1, 14, 20, 95, BD, 1D, A0, 7E, E6, 53, D9, DA, AF, D4, FE, 00]
[2024-02-29T17:50:00Z TRACE zenoh_transport::unicast::universal::rx] Received: TransportMessage { body: KeepAlive(KeepAlive) }
[2024-02-29T17:50:02Z TRACE zenoh_transport::unicast::universal::rx] Received: TransportMessage { body: KeepAlive(KeepAlive) }
[2024-02-29T17:50:05Z TRACE zenoh_transport::unicast::universal::rx] Received: TransportMessage { body: KeepAlive(KeepAlive) }
[2024-02-29T17:50:07Z TRACE zenoh_transport::unicast::universal::rx] Received: TransportMessage { body: KeepAlive(KeepAlive) }
[2024-02-29T17:50:09Z DEBUG zenoh_transport::unicast::manager] future has timed out
[2024-02-29T17:50:10Z TRACE zenoh_transport::unicast::universal::rx] Received: TransportMessage { body: KeepAlive(KeepAlive) }
[2024-02-29T17:50:12Z TRACE zenoh_transport::unicast::universal::rx] Received: TransportMessage { body: KeepAlive(KeepAlive) }
[2024-02-29T17:50:15Z TRACE zenoh_transport::unicast::universal::rx] Received: TransportMessage { body: KeepAlive(KeepAlive) }
[2024-02-29T17:50:17Z TRACE zenoh_transport::unicast::universal::rx] Received: TransportMessage { body: KeepAlive(KeepAlive) }
[2024-02-29T17:50:20Z TRACE zenoh_transport::unicast::universal::rx] Received: TransportMessage { body: KeepAlive(KeepAlive) }
[2024-02-29T17:50:22Z TRACE zenoh_transport::unicast::universal::rx] Received: TransportMessage { body: KeepAlive(KeepAlive) }
[2024-02-29T17:50:25Z TRACE zenoh_transport::unicast::universal::rx] Received: TransportMessage { body: KeepAlive(KeepAlive) }
[2024-02-29T17:50:27Z TRACE zenoh_transport::unicast::universal::rx] Received: TransportMessage { body: KeepAlive(KeepAlive) }
[2024-02-29T17:50:30Z TRACE zenoh_transport::unicast::universal::rx] Received: TransportMessage { body: KeepAlive(KeepAlive) }

The zenoh-pico code tries opening the session again, but from zenohd side, no more data is being read. It looks like connection/link was dropped completely.
Restarting zenohd , makes that everything works again for some time.
So it looks a bug at server side. I suspect that the serial is handled like a tcp connection, where you close the link.

To reproduce

Run zenoh-pico at client side via serial

System info

Platform Ubuntu 22
Zenoh Pico side on esp32

PRETTY_NAME="Ubuntu 22.04.4 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04.4 LTS (Jammy Jellyfish)"
VERSION_CODENAME=jammy
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=jammy

@vortex314 vortex314 added the bug Something isn't working label Feb 29, 2024
@vortex314
Copy link
Author

I created a backtrace when this happens in zenohd

[2024-02-29T23:14:12Z TRACE zenoh_transport::unicast::universal::rx] Received: TransportMessage { body: KeepAlive(KeepAlive) }
[2024-02-29T23:14:14Z TRACE zenoh_transport::unicast::universal::rx] Received: TransportMessage { body: KeepAlive(KeepAlive) }
[2024-02-29T23:14:17Z TRACE zenoh_transport::unicast::universal::rx] Received: TransportMessage { body: KeepAlive(KeepAlive) }
[2024-02-29T23:14:19Z TRACE zenoh_transport::unicast::universal::rx] Received: TransportMessage { body: KeepAlive(KeepAlive) }
[2024-02-29T23:14:19Z DEBUG zenoh_transport::unicast::manager] future has timed out
thread 'async-std/runtime' panicked at 'explicit panic', io/zenoh-transport/src/unicast/manager.rs:751:17
stack backtrace:
   0: rust_begin_unwind
             at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be/library/std/src/panicking.rs:593:5
   1: core::panicking::panic_fmt
             at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be/library/core/src/panicking.rs:67:14
   2: core::panicking::panic
             at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be/library/core/src/panicking.rs:117:5
   3: zenoh_transport::unicast::manager::<impl zenoh_transport::manager::TransportManager>::handle_new_link_unicast::{{closure}}::{{closure}}
             at ./io/zenoh-transport/src/unicast/manager.rs:751:17
   4: <async_std::task::builder::SupportTaskLocals<F> as core::future::future::Future>::poll::{{closure}}
             at /home/lieven/.cargo/registry/src/index.crates.io-6f17d22bba15001f/async-std-1.12.0/src/task/builder.rs:199:17
   5: async_std::task::task_locals_wrapper::TaskLocalsWrapper::set_current::{{closure}}
             at /home/lieven/.cargo/registry/src/index.crates.io-6f17d22bba15001f/async-std-1.12.0/src/task/task_locals_wrapper.rs:60:13
   6: std::thread::local::LocalKey<T>::try_with
             at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be/library/std/src/thread/local.rs:270:16
   7: std::thread::local::LocalKey<T>::with
             at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be/library/std/src/thread/local.rs:246:9
   8: async_std::task::task_locals_wrapper::TaskLocalsWrapper::set_current
             at /home/lieven/.cargo/registry/src/index.crates.io-6f17d22bba15001f/async-std-1.12.0/src/task/task_locals_wrapper.rs:55:9
   9: <async_std::task::builder::SupportTaskLocals<F> as core::future::future::Future>::poll
             at /home/lieven/.cargo/registry/src/index.crates.io-6f17d22bba15001f/async-std-1.12.0/src/task/builder.rs:197:13
  10: async_executor::Executor::spawn::{{closure}}
             at /home/lieven/.cargo/registry/src/index.crates.io-6f17d22bba15001f/async-executor-1.5.1/src/lib.rs:145:20
  11: <core::pin::Pin<P> as core::future::future::Future>::poll
             at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be/library/core/src/future/future.rs:125:9
  12: async_task::raw::RawTask<F,T,S,M>::run
             at /home/lieven/.cargo/registry/src/index.crates.io-6f17d22bba15001f/async-task-4.4.0/src/raw.rs:563:17
  13: async_task::runnable::Runnable<M>::run
             at /home/lieven/.cargo/registry/src/index.crates.io-6f17d22bba15001f/async-task-4.4.0/src/runnable.rs:782:18
  14: async_executor::Executor::run::{{closure}}::{{closure}}
             at /home/lieven/.cargo/registry/src/index.crates.io-6f17d22bba15001f/async-executor-1.5.1/src/lib.rs:236:21
  15: <futures_lite::future::Or<F1,F2> as core::future::future::Future>::poll
             at /home/lieven/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-lite-1.13.0/src/future.rs:529:33
  16: async_executor::Executor::run::{{closure}}
             at /home/lieven/.cargo/registry/src/index.crates.io-6f17d22bba15001f/async-executor-1.5.1/src/lib.rs:243:32
  17: <futures_lite::future::Or<F1,F2> as core::future::future::Future>::poll
             at /home/lieven/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-lite-1.13.0/src/future.rs:529:33
  18: async_io::driver::block_on
             at /home/lieven/.cargo/registry/src/index.crates.io-6f17d22bba15001f/async-io-1.13.0/src/driver.rs:146:33
  19: async_global_executor::reactor::block_on::{{closure}}
             at /home/lieven/.cargo/registry/src/index.crates.io-6f17d22bba15001f/async-global-executor-2.3.1/src/reactor.rs:3:18
  20: async_global_executor::reactor::block_on
             at /home/lieven/.cargo/registry/src/index.crates.io-6f17d22bba15001f/async-global-executor-2.3.1/src/reactor.rs:12:5
  21: async_global_executor::threading::thread_main_loop::{{closure}}::{{closure}}
             at /home/lieven/.cargo/registry/src/index.crates.io-6f17d22bba15001f/async-global-executor-2.3.1/src/threading.rs:95:17
  22: std::thread::local::LocalKey<T>::try_with
             at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be/library/std/src/thread/local.rs:270:16
  23: std::thread::local::LocalKey<T>::with
             at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be/library/std/src/thread/local.rs:246:9
  24: async_global_executor::threading::thread_main_loop::{{closure}}
             at /home/lieven/.cargo/registry/src/index.crates.io-6f17d22bba15001f/async-global-executor-2.3.1/src/threading.rs:89:13
  25: std::panicking::try::do_call
             at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be/library/std/src/panicking.rs:500:40
  26: __rust_try
  27: std::panicking::try
             at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be/library/std/src/panicking.rs:464:19
  28: std::panic::catch_unwind
             at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be/library/std/src/panic.rs:142:14
  29: async_global_executor::threading::thread_main_loop
             at /home/lieven/.cargo/registry/src/index.crates.io-6f17d22bba15001f/async-global-executor-2.3.1/src/threading.rs:88:12
  30: core::ops::function::FnOnce::call_once
             at /rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be/library/core/src/ops/function.rs:250:5
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
[2024-02-29T23:14:22Z TRACE zenoh_transport::unicast::universal::rx] Received: TransportMessage { body: KeepAlive(KeepAlive) }

@vortex314
Copy link
Author

I got the impression , this issue in tokio-serial is back : berkowski/tokio-serial#52
Th e nomal behaviour looks like this :

[2024-03-01T22:47:24Z TRACE zenoh_transport::unicast::manager] Accepting link... serial//dev/ttyUSB1 => serial/3a1bf942-a67a-405c-a710-80625bd1ac6e
[2024-03-01T22:47:24Z TRACE z_serial] Read 27 bytes COBS [02, 13, 09, 01, 08, F2, A8, 2D, A6, 06, BF, 0C, 3D, 80, 27, 7F, D8, 95, 3D, D7, 76, 30, A2, 03, 88, F8, 00]
[2024-03-01T22:47:24Z TRACE z_serial] Frame after COBS encoding [13, 00, 01, 08, F2, A8, 2D, A6, 06, BF, 00, 3D, 80, 27, 7F, D8, 95, 3D, D7, 76, 30, A2, 00, 88, F8]
[2024-03-01T22:47:24Z TRACE z_serial] Received CRC F88800A2  Computed CRC F88800A2
[2024-03-01T22:47:24Z TRACE z_serial] Frame before COBS encoding [38, 00, 61, 08, F0, 9F, 95, 58, 88, 95, 85, 85, 56, A2, C8, 28, A6, 55, 8B, D3, 2E, 0A, DC, 05, 21, 20, A6, E0, F0, DF, 75, 04, 13, 57, 09, 73, D3, 67, AD, 93, CC, 86, 12, F8, AD, EC, 43, 21, 3B, 1E, 36, 31, 1A, C4, 38, C3, BE, D0, AE, 95, 0C, FB]
[2024-03-01T22:47:24Z TRACE z_serial] Wrote 64bytes COBS [02, 38, 3D, 61, 08, F0, 9F, 95, 58, 88, 95, 85, 85, 56, A2, C8, 28, A6, 55, 8B, D3, 2E, 0A, DC, 05, 21, 20, A6, E0, F0, DF, 75, 04, 13, 57, 09, 73, D3, 67, AD, 93, CC, 86, 12, F8, AD, EC, 43, 21, 3B, 1E, 36, 31, 1A, C4, 38, C3, BE, D0, AE, 95, 0C, FB, 00]
[2024-03-01T22:47:24Z TRACE z_serial] Read 45 bytes COBS [02, 25, 03, 42, 0A, 27, 21, 20, A6, E0, F0, DF, 75, 04, 13, 57, 09, 73, D3, 67, AD, 93, CC, 86, 12, F8, AD, EC, 43, 21, 3B, 1E, 36, 31, 1A, C4, 38, C3, BE, D0, DD, C8, 97, FF, 00]
[2024-03-01T22:47:24Z TRACE z_serial] Frame after COBS encoding [25, 00, 42, 0A, 00, 21, 20, A6, E0, F0, DF, 75, 04, 13, 57, 09, 73, D3, 67, AD, 93, CC, 86, 12, F8, AD, EC, 43, 21, 3B, 1E, 36, 31, 1A, C4, 38, C3, BE, D0, DD, C8, 97, FF]
[2024-03-01T22:47:24Z TRACE z_serial] Received CRC FF97C8DD  Computed CRC FF97C8DD

When the connection get lost and no recovery is possible :

[2024-03-01T22:47:28Z TRACE zenoh_transport::unicast::manager] Accepting link... serial//dev/ttyUSB1 => serial/3a1bf942-a67a-405c-a710-80625bd1ac6e
[2024-03-01T22:47:28Z TRACE z_serial] Read 27 bytes COBS [02, 13, 18, 01, 08, F2, AE, 35, 8A, C8, 09, 23, 36, 86, CC, 80, 1D, D8, EF, FD, 57, 21, 3C, B4, E1, F9, 00]
[2024-03-01T22:47:28Z TRACE z_serial] Frame after COBS encoding [13, 00, 01, 08, F2, AE, 35, 8A, C8, 09, 23, 36, 86, CC, 80, 1D, D8, EF, FD, 57, 21, 3C, B4, E1, F9]
[2024-03-01T22:47:28Z TRACE z_serial] Received CRC F9E1B43C  Computed CRC F9E1B43C
[2024-03-01T22:47:28Z TRACE z_serial] Frame before COBS encoding [48, 00, 61, 08, F0, 9F, 95, 58, 88, 95, 85, 85, 56, A2, C8, 28, A6, 55, 8B, D3, 2E, 0A, DC, 05, 31, 30, 92, E0, 31, 5C, 1A, F7, F4, CB, BE, FF, 9E, 5F, 97, CE, C7, 94, C7, 80, 49, CC, 4E, BC, EB, 98, B3, 8F, 11, 4C, E4, D8, 69, 90, 00, 43, FB, 7A, 79, 35, AB, 50, 00, 94, AD, 13, B9, BF, 89, D2, 3A, 55, 29, F9]
[2024-03-01T22:47:28Z TRACE z_serial] Wrote 80bytes COBS [02, 48, 39, 61, 08, F0, 9F, 95, 58, 88, 95, 85, 85, 56, A2, C8, 28, A6, 55, 8B, D3, 2E, 0A, DC, 05, 31, 30, 92, E0, 31, 5C, 1A, F7, F4, CB, BE, FF, 9E, 5F, 97, CE, C7, 94, C7, 80, 49, CC, 4E, BC, EB, 98, B3, 8F, 11, 4C, E4, D8, 69, 90, 08, 43, FB, 7A, 79, 35, AB, 50, 0C, 94, AD, 13, B9, BF, 89, D2, 3A, 55, 29, F9, 00]
[2024-03-01T22:47:28Z TRACE zenoh_transport::unicast::universal::rx] Received: TransportMessage { body: KeepAlive(KeepAlive) }
[2024-03-01T22:47:31Z TRACE zenoh_transport::unicast::universal::rx] Received: TransportMessage { body: KeepAlive(KeepAlive) }
[2024-03-01T22:47:33Z TRACE zenoh_transport::unicast::universal::rx] Received: TransportMessage { body: KeepAlive(KeepAlive) }
[2024-03-01T22:47:36Z TRACE zenoh_transport::unicast::universal::rx] Received: TransportMessage { body: KeepAlive(KeepAlive) }
[2024-03-01T22:47:38Z DEBUG zenoh_transport::unicast::manager] future has timed out

It could be that 80 bytes write blocked ( until the future timed out) or the serial read became impossible.

@vortex314
Copy link
Author

Using strace to dig deeper, it looks like the code freezes for exactly 10 sec.

915384 16:23:57 read(9, "{", 1)         = 1
915384 16:23:57 read(9, "\347", 1)      = 1
915384 16:23:57 read(9, "0", 1)         = 1
915384 16:23:57 read(9, "\375", 1)      = 1
915384 16:23:57 read(9, "\0", 1)        = 1
915384 16:23:57 write(9, "\2HMa\10\360\245\261\275j\236R\306!'\223g\220h8V\270\n\334\00510\360G9\325d"..., 80) = 80
915368 16:23:57 <... epoll_wait resumed>[{events=EPOLLOUT, data={u32=3640101888, u64=94119258459136}}], 1024, -1) = 1
915384 16:23:57 ioctl(9, TCSBRK, 1 <unfinished ...>
915368 16:23:57 epoll_wait(3, [{events=EPOLLOUT, data={u32=3640101888, u64=94119258459136}}], 1024, -1) = 1
915368 16:23:57 epoll_wait(3,  <unfinished ...>
915384 16:23:57 <... ioctl resumed>)    = 0
915384 16:23:57 read(9, 0x5599d8f75060, 1) = -1 EAGAIN (Resource temporarily unavailable)
915384 16:23:57 write(7, "\1\0\0\0\0\0\0\0", 8) = 8
915381 16:23:57 <... epoll_wait resumed>[{events=EPOLLIN, data={u32=4294967295, u64=18446744073709551615}}], 1024, -1) = 1
915384 16:23:57 futex(0x7fab08000c60, FUTEX_WAIT_BITSET_PRIVATE, 8, NULL, FUTEX_BITSET_MATCH_ANY <unfinished ...>
915381 16:23:57 read(7, "\1\0\0\0\0\0\0\0", 8) = 8
915381 16:23:57 epoll_ctl(6, EPOLL_CTL_MOD, 7, {events=EPOLLIN|EPOLLPRI|EPOLLERR|EPOLLHUP|EPOLLRDHUP|EPOLLONESHOT, data={u32=4294967295, u64=18446744073709551615}}) = 0
915381 16:23:57 futex(0x7fab2c005950, FUTEX_WAIT_BITSET_PRIVATE, 1, {tv_sec=92763, tv_nsec=285354722}, FUTEX_BITSET_MATCH_ANY) = -1 ETIMEDOUT (Connection timed out)
915381 16:23:57 timerfd_settime(8, 0, {it_interval={tv_sec=0, tv_nsec=0}, it_value={tv_sec=9, tv_nsec=988161742}}, NULL) = 0
915381 16:23:57 epoll_ctl(6, EPOLL_CTL_MOD, 8, {events=EPOLLIN|EPOLLPRI|EPOLLERR|EPOLLHUP|EPOLLRDHUP|EPOLLONESHOT, data={u32=4294967295, u64=18446744073709551615}}) = 0
915381 16:23:57 epoll_wait(6, [{events=EPOLLIN, data={u32=4294967295, u64=18446744073709551615}}], 1024, -1) = 1
915381 16:24:07 read(7, 0x7fab56318df0, 8) = -1 EAGAIN (Resource temporarily unavailable)
915381 16:24:07 epoll_ctl(6, EPOLL_CTL_MOD, 7, {events=EPOLLIN|EPOLLPRI|EPOLLERR|EPOLLHUP|EPOLLRDHUP|EPOLLONESHOT, data={u32=4294967295, u64=18446744073709551615}}) = 0
915381 16:24:07 futex(0x7fab08000c60, FUTEX_WAKE_PRIVATE, 1) = 1
915384 16:24:07 <... futex resumed>)    = 0
915381 16:24:07 futex(0x7fab2c005950, FUTEX_WAIT_BITSET_PRIVATE, 1, {tv_sec=92773, tv_nsec=274497740}, FUTEX_BITSET_MATCH_ANY <unfinished ...>
915384 16:24:07 futex(0x7fab10000c60, FUTEX_WAKE_PRIVATE, 1) = 1
915382 16:24:07 <... futex resumed>)    = 0
915384 16:24:07 write(2, "thread '", 8 <unfinished ...>
915382 16:24:07 timerfd_settime(8, 0, {it_interval={tv_sec=0, tv_nsec=0}, it_value={tv_sec=0, tv_nsec=0}},  <unfinished ...>
915381 16:24:07 <... futex resumed>)    = -1 ETIMEDOUT (Connection timed out)
915384 16:24:07 <... write resumed>)    = 8
915382 16:24:07 <... timerfd_settime resumed>NULL) = 0
915384 16:24:07 write(2, "async-std/runtime", 17 <unfinished ...>
915381 16:24:07 futex(0x7fab2c005950, FUTEX_WAIT_BITSET_PRIVATE, 1, {tv_sec=92773, tv_nsec=274700929}, FUTEX_BITSET_MATCH_ANY <unfinished ...>
915384 16:24:07 <... write resumed>)    = 17
915382 16:24:07 epoll_ctl(6, EPOLL_CTL_MOD, 8, {events=EPOLLIN|EPOLLPRI|EPOLLERR|EPOLLHUP|EPOLLRDHUP|EPOLLONESHOT, data={u32=4294967295, u64=18446744073709551615}} <unfinished ...>
915384 16:24:07 write(2, "' panicked at '", 15 <unfinished ...>
915382 16:24:07 <... epoll_ctl resumed>) = 0
915384 16:24:07 <... write resumed>)    = 15
915381 16:24:07 <... futex resumed>)    = -1 ETIMEDOUT (Connection timed out)
915384 16:24:07 write(2, "explicit panic", 14 <unfinished ...>

@vortex314
Copy link
Author

I tweaked an option : transport.unicast.max_links in zenohd config
I changed this from 1 to 10 and it seems to address the issue. After 643 sessions from the espidf in serial mode, I guess it works as expected. Iḿ guessing that the 10 failing sessions could have been serial communications errors. The remaining issue looks to be something on the ESP32 itself.

Closing Zenoh Session... sessions_ok= 643 sessions_fail=10 publish_ok=64400  
Opening Zenoh Session... session_ok= 644 sessions_fail=10 publish_ok=64400  
I (895053) uart: queue free spaces: 100

assert failed: spinlock_acquire spinlock.h:122 (result == core_id || result == SPINLOCK_FREE)


Backtrace: 0x4008204a:0x3ffb5380 0x40086af9:0x3ffb53a0 0x4008bd29:0x3ffb53c0 0x40089a0f:0x3ffb54e0 0x400879c9:0x3ffb5500 0x4008310




ELF file SHA256: 910a12a83198f52b

Rebooting...

zenohd logs for completeness :

[2024-03-05T23:27:45Z INFO  zenoh::net::runtime] Using PID: 7161e25eddf12eb50838dfda9e0a2890
[2024-03-05T23:27:45Z INFO  zenoh::net::runtime::orchestrator] Zenoh can be reached at: serial//dev/ttyUSB1
[2024-03-05T23:27:45Z INFO  zenoh::net::runtime::orchestrator] Zenoh can be reached at: tcp/127.0.0.1:7447
[2024-03-05T23:27:45Z INFO  zenoh::net::runtime::orchestrator] zenohd listening scout messages on 224.0.0.224:7446
[2024-03-05T23:27:45Z INFO  zenohd] Finished loading plugins
[2024-03-05T23:29:16Z ERROR zenoh_transport::unicast::establishment::accept] Received invalid message instead of an OpenSyn on serial//dev/ttyUSB1 => serial/d6aa84cc-e202-4a9b-beab-3b0d6ee65d60: InitSyn(InitSyn { version: 8, whatami: Client, zid: 47986c1ae9de4a352a14c411f0caa074, resolution: Resolution(10), batch_size: 65535, ext_qos: None, ext_shm: None, ext_auth: None, ext_mlink: None, ext_lowlatency: None, ext_compression: None }) at io/zenoh-transport/src/unicast/establishment/accept.rs:403.
[2024-03-05T23:30:05Z ERROR zenoh_link_serial::unicast] Read error on Serial link serial//dev/ttyUSB1 => serial/d6aa84cc-e202-4a9b-beab-3b0d6ee65d60: Unable COBS decode: () at io/zenoh-links/zenoh-link-serial/src/unicast.rs:150.
[2024-03-05T23:32:16Z ERROR zenoh_transport::unicast::establishment::accept] Received invalid message instead of an OpenSyn on serial//dev/ttyUSB1 => serial/d6aa84cc-e202-4a9b-beab-3b0d6ee65d60: InitSyn(InitSyn { version: 8, whatami: Client, zid: 48e28fb1c6f0b4db763f1cd4b2c54776, resolution: Resolution(10), batch_size: 65535, ext_qos: None, ext_shm: None, ext_auth: None, ext_mlink: None, ext_lowlatency: None, ext_compression: None }) at io/zenoh-transport/src/unicast/establishment/accept.rs:403.
[2024-03-05T23:32:34Z ERROR zenoh_transport::unicast::establishment::accept] Received invalid message instead of an OpenSyn on serial//dev/ttyUSB1 => serial/d6aa84cc-e202-4a9b-beab-3b0d6ee65d60: InitSyn(InitSyn { version: 8, whatami: Client, zid: 63ea56a18984af47cadba4b1d847884f, resolution: Resolution(10), batch_size: 65535, ext_qos: None, ext_shm: None, ext_auth: None, ext_mlink: None, ext_lowlatency: None, ext_compression: None }) at io/zenoh-transport/src/unicast/establishment/accept.rs:403.
[2024-03-05T23:40:24Z ERROR zenoh_transport::unicast::establishment::accept] Received invalid message instead of an OpenSyn on serial//dev/ttyUSB1 => serial/d6aa84cc-e202-4a9b-beab-3b0d6ee65d60: InitSyn(InitSyn { version: 8, whatami: Client, zid: 49701b2f7b84174e2d0a2711d1106291, resolution: Resolution(10), batch_size: 65535, ext_qos: None, ext_shm: None, ext_auth: None, ext_mlink: None, ext_lowlatency: None, ext_compression: None }) at io/zenoh-transport/src/unicast/establishment/accept.rs:403.
[2024-03-05T23:40:41Z ERROR zenoh_transport::unicast::establishment::accept] Received invalid message instead of an OpenSyn on serial//dev/ttyUSB1 => serial/d6aa84cc-e202-4a9b-beab-3b0d6ee65d60: InitSyn(InitSyn { version: 8, whatami: Client, zid: a83ce8b51ccc3a3f8957803a19b2627a, resolution: Resolution(10), batch_size: 65535, ext_qos: None, ext_shm: None, ext_auth: None, ext_mlink: None, ext_lowlatency: None, ext_compression: None }) at io/zenoh-transport/src/unicast/establishment/accept.rs:403.
[2024-03-05T23:42:55Z ERROR zenoh_link_serial::unicast] Read error on Serial link serial//dev/ttyUSB1 => serial/d6aa84cc-e202-4a9b-beab-3b0d6ee65d60: Serial is smaller than the minimum size at io/zenoh-links/zenoh-link-serial/src/unicast.rs:150.
[2024-03-05T23:44:48Z ERROR zenoh_transport::unicast::establishment::accept] Received invalid message instead of an OpenSyn on serial//dev/ttyUSB1 => serial/d6aa84cc-e202-4a9b-beab-3b0d6ee65d60: InitSyn(InitSyn { version: 8, whatami: Client, zid: 78716a775872647d64e7051118b32fb1, resolution: Resolution(10), batch_size: 65535, ext_qos: None, ext_shm: None, ext_auth: None, ext_mlink: None, ext_lowlatency: None, ext_compression: None }) at io/zenoh-transport/src/unicast/establishment/accept.rs:403.

@vortex314
Copy link
Author

ESP32 crash details

Opening Zenoh Session... session_ok= 674 sessions_fail=8 publish_ok=67400  
I (923273) uart: queue free spaces: 100

assert failed: spinlock_acquire spinlock.h:122 (result == core_id || result == SPINLOCK_FREE)


Backtrace: 0x4008204a:0x3ffb5380 0x40086af9:0x3ffb53a0 0x4008bd29:0x3ffb53c0 0x40089a0f:0x3ffb54e0 0x400879c9:0x3ffb5500 0x4008313c:0x3ffb5540 0x400f46a2:0x3ffb5560 0x400d4c6d:0x3ffb5580 0x400d5548:0x3ffb55a0 0x400d4ec0:0x3ffb55c0 0x400d4b89:0x3ffb55e0 0x400d1ec9:0x3ffb5600 0x400d1c1a:0x3ffb5620 0x40106604:0x3ffb5780 0x4008984d:0x3ffb57a0

  #0  0x4008204a:0x3ffb5380 in panic_abort at /home/lieven/.platformio/packages/framework-espidf/components/esp_system/panic.c:408
  #1  0x40086af9:0x3ffb53a0 in esp_system_abort at /home/lieven/.platformio/packages/framework-espidf/components/esp_system/esp_system.c:137
  #2  0x4008bd29:0x3ffb53c0 in __assert_func at /home/lieven/.platformio/packages/framework-espidf/components/newlib/assert.c:85
  #3  0x40089a0f:0x3ffb54e0 in spinlock_acquire at /home/lieven/.platformio/packages/framework-espidf/components/esp_hw_support/include/soc/spinlock.h:122
      (inlined by) xPortEnterCriticalTimeout at /home/lieven/.platformio/packages/framework-espidf/components/freertos/port/xtensa/port.c:301
  #4  0x400879c9:0x3ffb5500 in vPortEnterCritical at /home/lieven/.platformio/packages/framework-espidf/components/freertos/port/xtensa/include/freertos/portmacro.h:578
      (inlined by) xQueueSemaphoreTake at /home/lieven/.platformio/packages/framework-espidf/components/freertos/queue.c:1563
  #5  0x4008313c:0x3ffb5540 in pthread_mutex_lock_internal at /home/lieven/.platformio/packages/framework-espidf/components/pthread/pthread.c:620
  #6  0x400f46a2:0x3ffb5560 in pthread_mutex_destroy at /home/lieven/.platformio/packages/framework-espidf/components/pthread/pthread.c:585
  #7  0x400d4c6d:0x3ffb5580 in zp_mutex_free at lib/zenoh-pico/src/system/espidf/system.c:101
  #8  0x400d5548:0x3ffb55a0 in _z_unicast_transport_clear at lib/zenoh-pico/src/transport/unicast/transport.c:299
  #9  0x400d4ec0:0x3ffb55c0 in _z_transport_clear at lib/zenoh-pico/src/transport/transport.c:58
  #10 0x400d4b89:0x3ffb55e0 in _z_session_clear at lib/zenoh-pico/src/session/utils.c:105
  #11 0x400d1ec9:0x3ffb5600 in _z_session_rc_drop at lib/zenoh-pico/include/zenoh-pico/net/session.h:69
      (inlined by) z_open at lib/zenoh-pico/src/api/api.c:549
  #12 0x400d1c1a:0x3ffb5620 in app_main at src/main.cpp:76
  #13 0x40106604:0x3ffb5780 in main_task at /home/lieven/.platformio/packages/framework-espidf/components/freertos/port/port_common.c:141 (discriminator 2)
  #14 0x4008984d:0x3ffb57a0 in vPortTaskWrapper at /home/lieven/.platformio/packages/framework-espidf/components/freertos/port/xtensa/port.c:142

@vortex314
Copy link
Author

And another one after increasing ESP32 stack size

Putting Data ('demo/example/zenoh-pico-pub': '[     94] [ESPIDF]{ESP32}  Zenoh-Pico!')...
Putting Data ('demo/example/zenoh-pico-pub': '[     95] [ESPIDF]{ESP32}  Zenoh-Pico!')...
Putting Data ('demo/example/zenoh-pico-pub': '[     96] [ESPIDF]{ESP32}  Zenoh-Pico!')...
Putting Data ('demo/example/zenoh-pico-pub': '[     97] [ESPIDF]{ESP32}  Zenoh-Pico!')...
Putting Data ('demo/example/zenoh-pico-pub': '[     98] [ESPIDF]{ESP32}  Zenoh-Pico!')...
Putting Data ('demo/example/zenoh-pico-pub': '[     99] [ESPIDF]{ESP32}  Zenoh-Pico!')...
Closing Zenoh Session... sessions_ok= 592 sessions_fail=10 publish_ok=59300  
Guru Meditation Error: Core  0 panic'ed (LoadProhibited). Exception was unhandled.

Core  0 register dump:
PC      : 0x400db0b9  PS      : 0x00060f30  A0      : 0x800dc962  A1      : 0x3ffea8a0  
A2      : 0x00000002  A3      : 0x00000008  A4      : 0x00000064  A5      : 0x00000001  
A6      : 0x00000001  A7      : 0x00000001  A8      : 0x00000000  A9      : 0x3ffea830  
A10     : 0x00000000  A11     : 0x3ffea890  A12     : 0x00000000  A13     : 0x00000064  
A14     : 0x00000000  A15     : 0x000ca090  SAR     : 0x0000001e  EXCCAUSE: 0x0000001c  
EXCVADDR: 0x0000001c  LBEG    : 0x4000c28c  LEND    : 0x4000c296  LCOUNT  : 0x00000000  


Backtrace: 0x400db0b6:0x3ffea8a0 0x400dc95f:0x3ffea8c0 0x400d9d4e:0x3ffea910 0x400d643d:0x3ffea940 0x400d5fb8:0x3ffea960 0x400d50ad:0x3ffea980 0x400d4bf9:0x3ffeaa20 0x4008984d:0x3ffeaa40

  #0  0x400db0b6:0x3ffea8a0 in uart_check_buf_full at /home/lieven/.platformio/packages/framework-espidf/components/driver/uart.c:1302
  #1  0x400dc95f:0x3ffea8c0 in uart_read_bytes at /home/lieven/.platformio/packages/framework-espidf/components/driver/uart.c:1341
  #2  0x400d9d4e:0x3ffea910 in _z_read_serial at lib/zenoh-pico/src/system/espidf/network.c:639
  #3  0x400d643d:0x3ffea940 in _z_f_link_read_serial at lib/zenoh-pico/src/link/unicast/serial.c:106
  #4  0x400d5fb8:0x3ffea960 in _z_link_recv_zbuf at lib/zenoh-pico/src/link/link.c:139
  #5  0x400d50ad:0x3ffea980 in _zp_unicast_read_task at lib/zenoh-pico/src/transport/unicast/read.c:85
  #6  0x400d4bf9:0x3ffeaa20 in z_task_wrapper at lib/zenoh-pico/src/system/espidf/system.c:58
  #7  0x4008984d:0x3ffeaa40 in vPortTaskWrapper at /home/lieven/.platformio/packages/framework-espidf/components/freertos/port/xtensa/port.c:142



@vortex314
Copy link
Author

To continue the story , the fact that the ESP32 crashes after some time looks to be linked to lack of heap space. Monitoring the heap shows that it continually decreases after every loop on open/close zenoh session with declare/undeclare publisher.
To be continued...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

1 participant