
Standalone FAILOVER: Fix disconnect of blocked clients in standalone failover and support REDIRECT response #848

Open · wants to merge 2 commits into unstable from disconnect_or_redirect
Conversation

gmbnomis (Contributor)

This PR consists of two commits. If a client is waiting for the response to a blocking command when a failover is initiated, it is unblocked and disconnected during the failover. While this needs to happen in general, it currently happens too early. The first commit defers the disconnect until after a successful failover. It can be considered a bugfix and could be a candidate for a 7.2.x release.

Now that the disconnect happens at the correct point in time, we can redirect clients that support the "redirect" capability instead of disconnecting them. This is implemented in the second commit.

These changes are related to the bug described in issue #821.

More details are covered in the commit messages of the two commits:

Fix early unblock/disconnect of clients during failover

When entering the FAILOVER_IN_PROGRESS state during a failover,
disconnectAllBlockedClients is called (as part of
replicationSetPrimary) to disconnect all blocked clients (except
those that are postponed due to a client pause).

While this makes sense for a role change in general, it happens too
early in the case of a failover:

  1. The role switch hasn't happened yet in FAILOVER_IN_PROGRESS;
    thus, there is no new primary to connect to.
  2. The failover may still fail. In this case, there is no need to
    unblock/disconnect clients at all.

Therefore, move the disconnectAllBlockedClients call to the point
where we unblock the paused clients after the failover completes. As
said above, this call must only happen if the failover succeeds.

Extend two tests to verify the changed behavior:

  1. "failover to a replica with force works": Issue a blocking command
    before the failover and verify that it is unblocked after the
    failover has finished.

  2. "failover aborts if target rejects sync request": Both a blocking
    command issued before a failover as well as one issued during the
    failover won't be interrupted when the failover eventually
    aborts. (Note that the location of the previous SET FOO BAR
    command in the test was wrong, as it did not block and thus,
    wouldn't be "interrupted" by the failover in any case.)

Role change: Redirect suitable clients instead of disconnecting them

In case of an instance role switch in standalone mode, clients with
blocking commands are disconnected after an UNBLOCKED error is sent to
the client.

Now that the REDIRECT error is available as an alternative (#325),
issue a REDIRECT (redirecting the client to the primary) for clients
that support it. If a client is read-only, it may remain blocked in
some situations. There is no need to disconnect the client in either
case. This is in line with the semantics of the MOVED error in cluster
mode.

@madolson requested a review from soloestoy on July 30, 2024 23:34
codecov bot commented Jul 31, 2024

Codecov Report

Attention: Patch coverage is 95.83333% with 1 line in your changes missing coverage. Please review.

Project coverage is 70.35%. Comparing base (7795152) to head (1a850d0).
Report is 6 commits behind head on unstable.

Files Patch % Lines
src/blocked.c 91.66% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##           unstable     #848      +/-   ##
============================================
- Coverage     70.37%   70.35%   -0.03%     
============================================
  Files           112      112              
  Lines         61505    61515      +10     
============================================
- Hits          43286    43280       -6     
- Misses        18219    18235      +16     
Files Coverage Δ
src/cluster_legacy.c 85.44% <100.00%> (+0.03%) ⬆️
src/replication.c 87.10% <100.00%> (-0.04%) ⬇️
src/server.h 100.00% <ø> (ø)
src/blocked.c 91.55% <91.66%> (-0.09%) ⬇️

... and 14 files with indirect coverage changes

@gmbnomis force-pushed the disconnect_or_redirect branch from 33a1fa3 to 6e4762e on July 31, 2024 21:07
src/blocked.c — review thread (outdated, resolved)
@gmbnomis (Contributor Author)

@soloestoy Adapted to #871

@gmbnomis force-pushed the disconnect_or_redirect branch from e9f30de to 1a850d0 on August 18, 2024 20:33
c->bstate.btype == BLOCKED_STREAM || c->bstate.btype == BLOCKED_MODULE)) {
/* If the client is blocked on module, but not on a specific key,
* don't unblock it. */
if (c->bstate.btype == BLOCKED_MODULE && !moduleClientIsBlockedOnKeys(c)) continue;
Member:

I understand the general logic about key ownership and write commands. However, module logic can vary greatly and I would hate to place assumptions on how modules work. Maybe we can let modules opt in by registering an API to query whether the module wishes to be kept alive after a role change?

Contributor Author:

I realized that I should have added that I did not "invent" this code; it follows the redirect logic in Valkey cluster:

int clusterRedirectBlockedClientIfNeeded(client *c) {

This particular condition was introduced by redis/redis#9483. The discussion there centered on slot migration in cluster mode, but MOVED is also sent (or not sent) following the usual readwrite/readonly logic on a replica.

So, it looks applicable to the redirect case here, but, TBH, I don't know the module API very well. @soloestoy, as you were part of the discussion on redis/redis#9483, do you think we should do this here as well? (alternatively we could just continue to disconnect clients which are blocked on a module or we may need to come up with something more complex, like @ranshid suggested.)

Member:

@gmbnomis actually, I agree. I forgot about the redirect logic taking the same decision, so I guess modules are already capable of handling it (even though I think modules should have the ability to influence the decision). I would like to ask, though: why not just reuse the function (clusterRedirectBlockedClientIfNeeded)?

Contributor Author:

Looking at both functions, the commonalities are quite limited because the redirect logic in cluster is much more complex and not applicable to the standalone case. We probably could extract the check whether we need to look at a client in more detail, i.e. something like:

int isRedirectCandidate(client *c) {
    if (c->flag.blocked && (c->bstate.btype == BLOCKED_LIST || c->bstate.btype == BLOCKED_ZSET ||
                            c->bstate.btype == BLOCKED_STREAM || c->bstate.btype == BLOCKED_MODULE)) {
        /* If the client is blocked on module, but not on a specific key,
         * don't unblock it. */
        if (c->bstate.btype != BLOCKED_MODULE || moduleClientIsBlockedOnKeys(c)) return 1;
    }
    return 0;
}

That's not much. Do you think this is worthwhile?

@PingXie (Member) commented Aug 21, 2024:

I think it is worthwhile to create a small function or two, less about code reuse and more about abstraction. For instance, I see the following concepts worth capturing:

  1. standalone redirection readiness
!server.cluster_enabled && server.primary_host && c->capa & CLIENT_CAPA_REDIRECT 

This is also used at

if (!server.cluster_enabled && c->capa & CLIENT_CAPA_REDIRECT && server.primary_host && !mustObeyClient(c) &&

  2. client conditions
c->flag.blocked && (c->bstate.btype == BLOCKED_LIST || c->bstate.btype == BLOCKED_ZSET ||
                    c->bstate.btype == BLOCKED_STREAM || c->bstate.btype == BLOCKED_MODULE)

Then L294-L296 could be rewritten as (in pseudocode)

if (redirection_is_enabled() && client_is_in_right_state(c)) {
...
}

Lastly, I am not sure about the module and read-only client cases. The changes look like they leave those clients in a perpetually blocked state, while for all other cases we either redirect (new behavior) or drop the connection (old behavior).

Contributor Author:

I think it is worthwhile to create a small function or two, less about code reusing but more for abstraction.

Yes, that makes sense. I will give it a try once we sort out what the logic should be.

Lastly, I am not sure about the module and read-only client cases. The changes look like leaving them in a perpetually blocked state; while for all other cases, we either redirect (new behavior) or drop the connection (old behavior).

I realized only today that CLUSTER FAILOVER does not result in MOVED errors for blocked clients satisfying these client conditions (as the documentation of the command and the comments for clusterRedirectBlockedClientIfNeeded may suggest).

Instead, we end up in exactly the function we are discussing. That is to say, we return UNBLOCKED and disconnect.

So, Valkey cluster seems to handle unblocking due to slot migration and unblocking due to primary/replica change differently. (I don't know whether this is by intention or not).

Thus, what I deemed to be "well-known territory" apparently isn't. Furthermore, replicationSetPrimary is not only called during failover but also by an arbitrary REPLICAOF command.

So, I think you are right and we should either redirect or disconnect. Shall I convert the current "do nothing" cases (i.e. continue) into disconnects? (because we are not sure whether redirecting the client is actually beneficial in these cases)

(Alternatively, we could come to the conclusion that we don't need to be "better than CLUSTER FAILOVER" and decide against doing this redirect optimization at all.)

@@ -3023,7 +3023,7 @@ void replicationStartPendingFork(void);
void replicationHandlePrimaryDisconnection(void);
void replicationCachePrimary(client *c);
void resizeReplicationBacklog(void);
void replicationSetPrimary(char *ip, int port);
void replicationSetPrimary(char *ip, int port, int disconnect_blocked);
Member:

IIUC, the only case where we do not want to disconnect blocked clients is when server.failover_state == FAILOVER_IN_PROGRESS, right? Can we then just check for this condition inside replicationSetPrimary instead of passing in an external condition?

Contributor Author:

Yes, that should be possible, but I am reluctant to make replicationSetPrimary (which isn't specific to failover) behave differently based on some sub-state of the failover state machine. I am not against it, but I think the intention gets clearer if the caller of replicationSetPrimary states "I will take care of blocked clients myself".

Member:

IMO, passing this flag to replicationSetPrimary and having it handle client disconnects is already extending the function's scope. I am fine with letting it handle the state logic and encapsulating the decision in a single place.

Member:

I agree with @gmbnomis that we shouldn't introduce this dependency (server.failover_state = FAILOVER_IN_PROGRESS) to replicationSetPrimary. There's no inherent order between setting the failover state to in-progress and fixing the replication states. Introducing a failover state check adds an unrelated concern to the function, which in turn forces the caller to perform these operations in a specific sequence, an unnecessary constraint IMO and also not resilient to future changes.

Member:

OK, I agree; this is also good reasoning.

@soloestoy (Member):

IIUC, the current issue being described is to postpone the interruption of blocked connections in replicationSetPrimary until after the entire failover is completed, that is, after FAILOVER_IN_PROGRESS. This is not a fix related to the correctness of blocking command execution, but rather an optimization, right?

@gmbnomis (Contributor Author) commented Aug 21, 2024:

IIUC, the current issue being described is to postpone the interruption of blocked connections in replicationSetPrimary until after the entire failover is completed, that is, after FAILOVER_IN_PROGRESS.

Actually, it is not only postponing, but also not disconnecting those clients when the failover does not happen in the end.

This is not a fix related to the correctness of the block command execution, but rather an optimization, right?

Previously, a client that was blocked before the failover (e.g. a BRPOP):

  • continued to be blocked and proceeded normally if the failover failed during FAILOVER_WAIT_FOR_SYNC
  • was unblocked and disconnected if the failover failed during FAILOVER_IN_PROGRESS (with the error UNBLOCKED force unblock from blocking operation, instance state changed (master -> replica?), but there was no such change in the end)

Now, both cases are the same from the client's point of view. I don't think that this fixes a huge problem, as said in the PR comment "It can be considered as a bugfix and could be a candidate for a 7.2.x release."

For the REDIRECT part, yes, it is an optimization which becomes possible because we have REDIRECT now and because we know where to redirect to.

Status: Optional for next RC
5 participants