Standalone FAILOVER: Fix disconnect of blocked clients in standalone failover and support REDIRECT response #848
base: unstable
Conversation
Codecov Report. Attention: Patch coverage is

Additional details and impacted files:

```
@@            Coverage Diff             @@
##           unstable     #848    +/-   ##
============================================
- Coverage     70.37%   70.35%   -0.03%
============================================
  Files           112      112
  Lines         61505    61515      +10
============================================
- Hits          43286    43280       -6
- Misses        18219    18235      +16
```
Force-pushed from 33a1fa3 to 6e4762e
Force-pushed from 6e4762e to 12053bb
Force-pushed from 12053bb to e9f30de
@soloestoy Adapted to #871
When entering the `FAILOVER_IN_PROGRESS` state during a failover, `disconnectAllBlockedClients` is called (as part of `replicationSetPrimary`) to disconnect all blocked clients (except those that are postponed due to a client pause). While this makes sense in case of a role change in general, it happens too early in case of failover:

1. The role switch hasn't happened in `FAILOVER_IN_PROGRESS` yet; thus, there is no new primary to connect to.
2. The failover may still fail. In this case, there is no need to unblock/disconnect clients at all.

Therefore, move the `disconnectAllBlockedClients` call to when we unblock the paused clients after the failover completes. As said above, this call must only happen if the failover succeeds.

Extend two tests to verify the changed behavior:

1. "failover to a replica with force works": Issue a blocking command before the failover and verify that it is unblocked after the failover finished.
2. "failover aborts if target rejects sync request": Both a blocking command issued before a failover as well as one issued during the failover won't be interrupted when the failover eventually aborts. (Note that the location of the previous `SET FOO BAR` command in the test was wrong, as it did not block and thus wouldn't be "interrupted" by the failover in any case.)

Signed-off-by: Simon Baatz <[email protected]>
In case of an instance role switch in standalone mode, clients with blocking commands get disconnected with an UNBLOCKED error sent to the client. Now that the REDIRECT error is available as an alternative (valkey-io#325), issue a REDIRECT (redirecting the client to the primary) for clients that support it. If a client is read-only, it may remain blocked in some situations. There is no need to disconnect the client in either case. This is in line with the semantics of the MOVED error in cluster mode.

Signed-off-by: Simon Baatz <[email protected]>
Force-pushed from e9f30de to 1a850d0
```c
            c->bstate.btype == BLOCKED_STREAM || c->bstate.btype == BLOCKED_MODULE)) {
    /* If the client is blocked on module, but not on a specific key,
     * don't unblock it. */
    if (c->bstate.btype == BLOCKED_MODULE && !moduleClientIsBlockedOnKeys(c)) continue;
```
I understand the general logic about key ownership and write commands. However, module logic can vary greatly, and I would hate to place assumptions on how modules work. Maybe we can let modules opt in by registering an API to query whether the module wishes to be kept alive after a role change?
I realized that I should have added that I did not "invent" this code; it is following the redirect logic in Valkey cluster (line 1215 in 7795152):

```c
int clusterRedirectBlockedClientIfNeeded(client *c) {
```
This particular condition was introduced by redis/redis#9483. The discussion there centered on slot migration in cluster, but the MOVED will also be sent/not sent following the usual readwrite/readonly logic on a replica.
So, it looks applicable to the redirect case here, but, TBH, I don't know the module API very well. @soloestoy, as you were part of the discussion on redis/redis#9483, do you think we should do this here as well? (alternatively we could just continue to disconnect clients which are blocked on a module or we may need to come up with something more complex, like @ranshid suggested.)
@gmbnomis actually, I agree. I forgot about the redirect logic taking the same decision, so I guess the modules are already capable of handling it (even though I think modules should have the ability to impact the decision). I would like to ask, though, why not just reuse the function (clusterRedirectBlockedClientIfNeeded) ?
Looking at both functions, the commonalities are quite limited because the redirect logic in cluster is much more complex and not applicable to the standalone case. We probably could extract the check whether we need to look at a client in more detail, i.e. something like:
```c
int isRedirectCandidate(client *c) {
    if (c->flag.blocked && (c->bstate.btype == BLOCKED_LIST || c->bstate.btype == BLOCKED_ZSET ||
                            c->bstate.btype == BLOCKED_STREAM || c->bstate.btype == BLOCKED_MODULE)) {
        /* If the client is blocked on module, but not on a specific key,
         * don't unblock it. */
        if (c->bstate.btype != BLOCKED_MODULE || moduleClientIsBlockedOnKeys(c)) return 1;
    }
    return 0;
}
```
That's not much. Do you think this is worthwhile?
I think it is worthwhile to create a small function or two, less about code reuse and more about abstraction. For instance, I see the following concepts worth capturing:

- Standalone redirection readiness:

  ```c
  !server.cluster_enabled && server.primary_host && c->capa & CLIENT_CAPA_REDIRECT
  ```

  This is also used at line 3913 in 829243e:

  ```c
  if (!server.cluster_enabled && c->capa & CLIENT_CAPA_REDIRECT && server.primary_host && !mustObeyClient(c) &&
  ```

- Client conditions:

  ```c
  c->flag.blocked && (c->bstate.btype == BLOCKED_LIST || c->bstate.btype == BLOCKED_ZSET ||
                      c->bstate.btype == BLOCKED_STREAM || c->bstate.btype == BLOCKED_MODULE)
  ```

Then L294-L296 could be rewritten as (in pseudo code):

```c
if (redirection_is_enabled() && client_is_in_right_state(c)) {
    ...
}
```
Lastly, I am not sure about the module and read-only client cases. The changes look like leaving them in a perpetually blocked state; while for all other cases, we either redirect (new behavior) or drop the connection (old behavior).
> I think it is worthwhile to create a small function or two, less about code reusing but more for abstraction.
Yes, that makes sense. I will give it a try once we sort out what the logic should be.
> Lastly, I am not sure about the module and read-only client cases. The changes look like leaving them in a perpetually blocked state; while for all other cases, we either redirect (new behavior) or drop the connection (old behavior).
I realized only today that `CLUSTER FAILOVER` does not result in `MOVED` errors for blocked clients satisfying these client conditions (as the documentation of the command and the comments for `clusterRedirectBlockedClientIfNeeded` may suggest). Instead, we end up in exactly the function we are discussing. That is to say, we return UNBLOCKED and disconnect.

So, Valkey cluster seems to handle unblocking due to slot migration and unblocking due to a primary/replica change differently. (I don't know whether this is intentional or not.)
Thus, what I deemed to be "well-known territory" apparently isn't. Furthermore, `replicationSetPrimary` is not only called during failover but also by an arbitrary `REPLICAOF` command.

So, I think you are right and we should either redirect or disconnect. Shall I convert the current "do nothing" cases (i.e. `continue`) into disconnects? (Because we are not sure whether redirecting the client is actually beneficial in these cases.)

(Alternatively, we could come to the conclusion that we don't need to be "better than `CLUSTER FAILOVER`" and decide against doing this redirect optimization at all.)
```diff
@@ -3023,7 +3023,7 @@ void replicationStartPendingFork(void);
 void replicationHandlePrimaryDisconnection(void);
 void replicationCachePrimary(client *c);
 void resizeReplicationBacklog(void);
-void replicationSetPrimary(char *ip, int port);
+void replicationSetPrimary(char *ip, int port, int disconnect_blocked);
```
IIUC, the only case in which we do not want to disconnect blocked clients is when `server.failover_state = FAILOVER_IN_PROGRESS`, right? Can we then just check for this condition inside `replicationSetPrimary` instead of passing in an external condition?
Yes, that should be possible, but I am reluctant to make `replicationSetPrimary` (which isn't specific to failover) behave differently based on some sub-state of the failover state machine. I am not against it, but I think the intention gets clearer if the caller of `replicationSetPrimary` states "I will take care of blocked clients myself".
IMO, passing this flag to `replicationSetPrimary` and having it handle client disconnects already extends the function's scope. I am fine with letting it handle the state logic and encapsulating the decision in a single place.
I agree with @gmbnomis that we shouldn't introduce this dependency (`server.failover_state = FAILOVER_IN_PROGRESS`) into `replicationSetPrimary`. There's no inherent order between setting the failover state to in-progress and fixing the replication states. Introducing a failover state check adds an unrelated concern to the function, which in turn forces the caller to perform these operations in a specific sequence, an unnecessary constraint IMO and also not resilient to future changes.
O.K., I agree, this is also good reasoning.
> IIUC, the current issue being described is to postpone the interruption of the blocked connection in
Actually, it is not only postponing, but also not disconnecting those clients when the failover does not happen in the end.
Previously, a client that was blocked before the failover (e.g. a
Now, both cases are the same from the client's point of view. I don't think that this fixes a huge problem; as said in the PR comment, "It can be considered as a bugfix and could be a candidate for a 7.2.x release." For the REDIRECT part, yes, it is an optimization which becomes possible because we have REDIRECT now and because we know where to redirect to.
This PR consists of two commits.

If a client is waiting for the response to a blocking command when a failover is initiated, it will be unblocked and disconnected during the failover. While this needs to happen in general, it currently happens too early. The first commit lets the disconnect happen only after a successful failover. It can be considered a bugfix and could be a candidate for a 7.2.x release.
Now that the disconnect happens at the correct point in time, we can redirect clients that support the "redirect" capability instead of disconnecting them. This is implemented in the second commit.
These changes are related to the bug described in issue #821.
More details are covered in the commit messages of the two commits:
Fix early unblock/disconnect of clients during failover

When entering the `FAILOVER_IN_PROGRESS` state during a failover, `disconnectAllBlockedClients` is called (as part of `replicationSetPrimary`) to disconnect all blocked clients (except those that are postponed due to a client pause).

While this makes sense in case of a role change in general, this is happening too early in case of failover:

1. The role switch hasn't happened in `FAILOVER_IN_PROGRESS` yet; thus, there is no new primary to connect to.
2. The failover may still fail. In this case, there is no need to unblock/disconnect clients at all.

Therefore, move the `disconnectAllBlockedClients` call to when we unblock the paused clients after the failover completes. As said above, this call must only happen if the failover succeeds.

Extend two tests to verify the changed behavior:

1. "failover to a replica with force works": Issue a blocking command before the failover and verify that it is unblocked after the failover finished.
2. "failover aborts if target rejects sync request": Both a blocking command issued before a failover as well as one issued during the failover won't be interrupted when the failover eventually aborts. (Note that the location of the previous `SET FOO BAR` command in the test was wrong, as it did not block and thus wouldn't be "interrupted" by the failover in any case.)
Role change: Redirect suitable clients instead of disconnecting them

In case of an instance role switch in standalone mode, clients with blocking commands get disconnected with an UNBLOCKED error sent to the client.

Now that the REDIRECT error is available as an alternative (#325), issue a REDIRECT (redirecting the client to the primary) for clients that support it. There is no need to disconnect the client in this case. This is in line with the semantics of the MOVED error in cluster mode.