Drop the MEET packet if the link node is in handshake state
After valkey-io#1307 was merged, we noticed an assertion failure in setClusterNodeToInboundClusterLink:
```
=== ASSERTION FAILED ===
==> '!link->node' is not true
```

Since valkey-io#778, we call setClusterNodeToInboundClusterLink to attach the node to the link
during MEET processing. If we receive another MEET packet shortly afterwards, while the
node is still in the handshake state, we hit this assertion and crash the server.

If the link is bound to a node, the node is in the handshake state, and we receive
a MEET packet, the sender may have sent multiple MEET packets while reconnecting, so
we drop the MEET here. Note that in getNodeFromLinkAndMsg, a node in the handshake
state has a random name and is not truly "known", so we do not know the sender.
Dropping the MEET packet prevents us from creating a random node, avoids incorrect
link binding, and prevents a duplicate MEET packet from eliminating the handshake state.
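
The failure and the guard can be sketched in isolation. The following is a minimal model, not the code from src/cluster_legacy.c: every `toy_`/`TOY_` name is hypothetical, the message type values are arbitrary, and only the `!link->node` precondition and the drop condition are taken from this commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the real cluster types and constants. */
enum { TOY_MSG_PING = 0, TOY_MSG_MEET = 1 };
enum { TOY_NODE_HANDSHAKE = 1 << 0 };

typedef struct toy_node { int flags; } toy_node;
typedef struct toy_link { toy_node *node; } toy_link;

static bool toy_node_in_handshake(const toy_node *n) {
    return (n->flags & TOY_NODE_HANDSHAKE) != 0;
}

/* Models the precondition asserted in setClusterNodeToInboundClusterLink:
 * an inbound link must not already be bound to a node. */
static void toy_attach_node(toy_link *l, toy_node *n) {
    assert(!l->node);
    l->node = n;
}

/* The guard added by this commit: a MEET arriving on a link whose node
 * is still in the handshake state is dropped. */
static bool toy_should_drop_meet(const toy_link *l, int type) {
    return type == TOY_MSG_MEET && l->node != NULL && toy_node_in_handshake(l->node);
}

/* Returns 1 if a MEET bound `fresh` to the link, 0 if the packet was
 * dropped or is not a MEET. Without the guard on the first line, a second
 * MEET during the handshake would reach toy_attach_node and abort. */
static int toy_process_packet(toy_link *l, int type, toy_node *fresh) {
    if (toy_should_drop_meet(l, type)) return 0;
    if (type != TOY_MSG_MEET) return 0;
    toy_attach_node(l, fresh);
    return 1;
}
```

In this model, the first MEET on an unbound link attaches a handshake node, and a second MEET on the same link returns 0 and leaves the binding untouched instead of tripping the assertion.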

Signed-off-by: Binbin <[email protected]>
enjoy-binbin committed Dec 13, 2024
1 parent 32f2c73 commit f342755
Showing 1 changed file with 17 additions and 4 deletions.
21 changes: 17 additions & 4 deletions src/cluster_legacy.c
@@ -3004,7 +3004,8 @@ int clusterIsValidPacket(clusterLink *link) {
     }

     if (type == server.cluster_drop_packet_filter || server.cluster_drop_packet_filter == -2) {
-        serverLog(LL_WARNING, "Dropping packet that matches debug drop filter");
+        serverLog(LL_WARNING, "Dropping packet of type %s that matches debug drop filter",
+                  clusterGetMessageTypeString(type));
         return 0;
     }

@@ -3095,7 +3096,7 @@ int clusterProcessPacket(clusterLink *link) {
         if (server.debug_cluster_close_link_on_packet_drop &&
             (type == server.cluster_drop_packet_filter || server.cluster_drop_packet_filter == -2)) {
             freeClusterLink(link);
-            serverLog(LL_WARNING, "Closing link for matching packet type %hu", type);
+            serverLog(LL_WARNING, "Closing link for matching packet type %s", clusterGetMessageTypeString(type));
             return 0;
         }
         return 1;
@@ -3111,8 +3112,8 @@ int clusterProcessPacket(clusterLink *link) {
             freeClusterLink(link);
             serverLog(
                 LL_NOTICE,
-                "Closing link for node that sent a lightweight message of type %hu as its first message on the link",
-                type);
+                "Closing link for node that sent a lightweight message of type %s as its first message on the link",
+                clusterGetMessageTypeString(type));
             return 0;
         }
         clusterNode *sender = link->node;
@@ -3121,6 +3122,18 @@ int clusterProcessPacket(clusterLink *link) {
         return 1;
     }

+    if (type == CLUSTERMSG_TYPE_MEET && link->node && nodeInHandshake(link->node)) {
+        /* If the link is bound to a node and the node is in the handshake state, and we receive
+         * a MEET packet, it may be that the sender sent multiple MEET packets when reconnecting,
+         * and in here we are dropping the MEET. Note that in getNodeFromLinkAndMsg, the node in
+         * the handshake state has a random name and not truly "known", so we don't know the sender.
+         * Dropping the MEET packet can prevent us from creating a random node, avoid incorrect
+         * link binding, and avoid duplicate MEET packet eliminate the handshake state. */
+        serverLog(LL_NOTICE, "Dropping MEET packet from node %.40s because the node is already in handshake state",
+                  link->node->name);
+        return 1;
+    }
+
     uint16_t flags = ntohs(hdr->flags);
     uint64_t sender_claimed_current_epoch = 0, sender_claimed_config_epoch = 0;
     clusterNode *sender = getNodeFromLinkAndMsg(link, hdr);