As a preamble: I found one of our ZYRE servers with more than 700 sockets open since its last restart (a few weeks earlier).
Among them, more than 270 were established with the same remote peer (an Android device).
I then dug deeper and investigated a bit.
Eventually, I managed to reproduce this with the C source code below and the basic scenario described afterwards:
#include <stdio.h>
#include <zyre.h>

int main(void) {
    zyre_t *node = zyre_new("LINUX-SERVER");
    if (!node) {
        fprintf(stderr, "Error: Failed to create ZYRE node.\n");
        return -1;
    }
    zyre_set_port(node, 15670); // To not interfere with other nodes in lab.
    zyre_set_beacon_peer_port(node, 25670); // For practical reasons with NETSTAT output.
    zyre_start(node);
    zyre_join(node, "dummy-group");
    while (!zsys_interrupted) {
        // Receive ZYRE event
        zyre_event_t *event = zyre_event_new(node);
        if (event) {
            const char *event_type = zyre_event_type(event);
            const char *peer_id = zyre_event_peer_uuid(event);
            // Only report peers entering or leaving the network.
            if (streq(event_type, "ENTER") || streq(event_type, "EXIT")) {
                printf("%s - %-10s\n", peer_id, event_type);
            }
            zyre_event_destroy(&event);
        } else {
            // No event --> wait a little bit.
            zclock_sleep(100);
        }
    }
    zyre_leave(node, "dummy-group");
    zyre_stop(node);
    zyre_destroy(&node);
    return 0;
}
Scenario to reproduce
Start this program on a Linux machine, A.
Start a similar one on a different machine, B.
On A, I observe 2 active TCP connections (output simplified):
tcp 0 0 192.168.57.130:25670 0.0.0.0:* LISTEN 2196653/zre-server
tcp 0 0 192.168.57.130:39564 192.168.57.172:25670 ESTABLISHED 2196653/zre-server A --> B
tcp 0 0 192.168.57.130:25670 192.168.57.172:36072 ESTABLISHED 2196653/zre-server A <-- B
Now unplug the Ethernet cable on B. After a few seconds, A reports an "EXIT" event from
B and one socket is closed automatically, but the second socket (A <-- B) remains active:
tcp 0 0 192.168.57.130:25670 0.0.0.0:* LISTEN 2196653/zre-server
tcp 0 0 192.168.57.130:25670 192.168.57.172:36072 ESTABLISHED 2196653/zre-server A <-- B
If the cable is plugged back in, 2 new sockets are created, but the former A <-- B socket is still present:
tcp 0 0 192.168.57.130:25670 0.0.0.0:* LISTEN 2196653/zre-server
tcp 0 0 192.168.57.130:41530 192.168.57.172:25670 ESTABLISHED 2196653/zre-server A --> B
tcp 0 0 192.168.57.130:25670 192.168.57.172:36072 ESTABLISHED 2196653/zre-server A <-- B (former)
tcp 0 0 192.168.57.130:25670 192.168.57.172:52104 ESTABLISHED 2196653/zre-server A <-- B
Repeat the operation and more sockets are seen.
I was hoping that if ZYRE (or the layers below) is able to close the A --> B socket, it could close the second one as well.
At least something is detected "correctly", since 2 new sockets are created when B comes back.
This becomes more problematic when B is a laptop (or an Android device) moving in and out of Wi-Fi coverage,
or when the laptop lid is closed (hibernation) without a shutdown.
I tried to play with TCP_KEEPALIVE, but without any success so far.
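For reference, this is what TCP keepalive looks like at the plain socket level on Linux. It is a minimal sketch, not what ZYRE does internally: `enable_keepalive` is a hypothetical helper and the timing values are arbitrary. In ZYRE the sockets are owned by libzmq, which has its own `ZMQ_TCP_KEEPALIVE*` socket options; whether and how ZYRE applies them to the inbound (A <-- B) connections is exactly the open question here.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Hypothetical helper: enable TCP keepalive on an existing TCP socket
 * using the Linux-specific tuning options.
 *   idle_s     : seconds of inactivity before the first probe
 *   interval_s : seconds between unanswered probes
 *   probes     : unanswered probes before the connection is dropped
 * Returns 0 on success, -1 if any setsockopt() call fails. */
int enable_keepalive(int fd, int idle_s, int interval_s, int probes)
{
    int on = 1;

    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof on) < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle_s, sizeof idle_s) < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &interval_s, sizeof interval_s) < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &probes, sizeof probes) < 0)
        return -1;
    return 0;
}
```

With `enable_keepalive(fd, 30, 5, 3)`, a peer that silently disappears (cable unplugged, device out of coverage) would be declared dead after roughly 30 + 3 × 5 = 45 seconds, at which point the kernel closes the connection.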
The issue below looks related, actually: