
Standalone FAILOVER: Fix disconnect of blocked clients in standalone failover and support REDIRECT response #848

Open · wants to merge 2 commits into unstable from disconnect_or_redirect
Conversation

gmbnomis (Contributor)

This PR consists of two commits. If a client is waiting for the response to a blocking command when a failover is initiated, it is unblocked and disconnected during the failover. While this needs to happen in general, it currently happens too early. The first commit defers the disconnect until after a successful failover. It can be considered a bugfix and could be a candidate for a 7.2.x release.

Now that the disconnect happens at the correct point in time, we can redirect clients that support the "redirect" capability instead of disconnecting them. This is implemented in the second commit.

These changes are related to the bug described in issue #821.

More details are covered in the commit messages of the two commits:

Fix early unblock/disconnect of clients during failover

When entering the FAILOVER_IN_PROGRESS state during a failover,
disconnectAllBlockedClients is called (as part of
replicationSetPrimary) to disconnect all blocked clients (except
those that are postponed due to a client pause).

While this makes sense for a role change in general, it happens too
early in the case of a failover:

  1. The role switch hasn't happened yet in FAILOVER_IN_PROGRESS;
    thus, there is no new primary to connect to.
  2. The failover may still fail. In this case, there is no need to
    unblock/disconnect clients at all.

Therefore, move the disconnectAllBlockedClients call to the point
where we unblock the paused clients after the failover completes. As
said above, this call must only happen if the failover succeeds.

Extend two tests to verify the changed behavior:

  1. "failover to a replica with force works": Issue a blocking command
    before the failover and verify that it is unblocked after the
    failover has finished.

  2. "failover aborts if target rejects sync request": Both a blocking
    command issued before a failover as well as one issued during the
    failover won't be interrupted when the failover eventually
    aborts. (Note that the location of the previous SET FOO BAR
    command in the test was wrong, as it did not block and thus,
    wouldn't be "interrupted" by the failover in any case.)

Role change: Redirect suitable clients instead of disconnecting them

In case of an instance role switch in standalone mode, clients with
blocking commands are disconnected after an UNBLOCKED error is sent to
the client.

Now that the REDIRECT error is available as an alternative (#325),
issue a REDIRECT (redirecting the client to the primary) for clients
that support it. If a client is read-only, it may remain blocked in
some situations. There is no need to disconnect the client in either
case. This is in line with the semantics of the MOVED error in cluster
mode.

@madolson requested a review from soloestoy on July 30, 2024 23:34
codecov bot commented Jul 31, 2024

Codecov Report

Attention: Patch coverage is 95.83333% with 1 line in your changes missing coverage. Please review.

Project coverage is 70.35%. Comparing base (7795152) to head (1a850d0).
Report is 6 commits behind head on unstable.

Files Patch % Lines
src/blocked.c 91.66% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##           unstable     #848      +/-   ##
============================================
- Coverage     70.37%   70.35%   -0.03%     
============================================
  Files           112      112              
  Lines         61505    61515      +10     
============================================
- Hits          43286    43280       -6     
- Misses        18219    18235      +16     
Files Coverage Δ
src/cluster_legacy.c 85.44% <100.00%> (+0.03%) ⬆️
src/replication.c 87.10% <100.00%> (-0.04%) ⬇️
src/server.h 100.00% <ø> (ø)
src/blocked.c 91.55% <91.66%> (-0.09%) ⬇️

... and 14 files with indirect coverage changes

@gmbnomis force-pushed the disconnect_or_redirect branch from 33a1fa3 to 6e4762e on July 31, 2024 21:07
src/blocked.c — review thread (outdated, resolved)
@gmbnomis (Contributor Author)

@soloestoy Adapted to #871

@gmbnomis force-pushed the disconnect_or_redirect branch from e9f30de to 1a850d0 on August 18, 2024 20:33
c->bstate.btype == BLOCKED_STREAM || c->bstate.btype == BLOCKED_MODULE)) {
/* If the client is blocked on module, but not on a specific key,
* don't unblock it. */
if (c->bstate.btype == BLOCKED_MODULE && !moduleClientIsBlockedOnKeys(c)) continue;
Member:

I understand the general logic about key ownership and write commands. However, module logic can vary greatly and I would hate to place assumptions on how modules work. Maybe we can let modules opt in by registering an API to query whether the module wishes to be kept alive after a role change?

Contributor Author:

I realized that I should have added that I did not "invent" this code; it follows the redirect logic in Valkey cluster:

int clusterRedirectBlockedClientIfNeeded(client *c) {

This particular condition was introduced by redis/redis#9483. The discussion there centered on slot migration in cluster mode, but MOVED is also sent (or not sent) following the usual readwrite/readonly logic on a replica.

So, it looks applicable to the redirect case here, but, TBH, I don't know the module API very well. @soloestoy, as you were part of the discussion on redis/redis#9483, do you think we should do this here as well? (alternatively we could just continue to disconnect clients which are blocked on a module or we may need to come up with something more complex, like @ranshid suggested.)

Member:

@gmbnomis actually, I agree. I forgot about the redirect logic taking the same decision, so I guess modules are already capable of handling it (even though I think modules should have the ability to influence the decision). I would like to ask, though: why not just reuse the function (clusterRedirectBlockedClientIfNeeded)?

Contributor Author:

Looking at both functions, the commonalities are quite limited because the redirect logic in cluster is much more complex and not applicable to the standalone case. We probably could extract the check whether we need to look at a client in more detail, i.e. something like:

int isRedirectCandidate(client *c) {
    if (c->flag.blocked && (c->bstate.btype == BLOCKED_LIST || c->bstate.btype == BLOCKED_ZSET ||
                            c->bstate.btype == BLOCKED_STREAM || c->bstate.btype == BLOCKED_MODULE)) {
        /* If the client is blocked on module, but not on a specific key,
         * don't unblock it. */
        if (c->bstate.btype != BLOCKED_MODULE || moduleClientIsBlockedOnKeys(c)) return 1;
    }
    return 0;
}

That's not much. Do you think this is worthwhile?

@PingXie (Member) commented Aug 21, 2024:

I think it is worthwhile to create a small function or two, less about code reuse and more about abstraction. For instance, I see the following concepts worth capturing:

  1. standalone redirection readiness
!server.cluster_enabled && server.primary_host && c->capa & CLIENT_CAPA_REDIRECT 

This is also used at

if (!server.cluster_enabled && c->capa & CLIENT_CAPA_REDIRECT && server.primary_host && !mustObeyClient(c) &&

  2. client conditions
c->flag.blocked && (c->bstate.btype == BLOCKED_LIST || c->bstate.btype == BLOCKED_ZSET ||
                    c->bstate.btype == BLOCKED_STREAM || c->bstate.btype == BLOCKED_MODULE)

Then L294-L296 could be rewritten as (in pseudocode)

if (redirection_is_enabled() && client_is_in_right_state(c)) {
...
}

Lastly, I am not sure about the module and read-only client cases. The changes look like they leave those clients in a perpetually blocked state, while for all other cases we either redirect (new behavior) or drop the connection (old behavior).

Contributor Author:

I think it is worthwhile to create a small function or two, less about code reusing but more for abstraction.

Yes, that makes sense. I will give it a try once we sort out what the logic should be.

Lastly, I am not sure about the module and read-only client cases. The changes look like leaving them in a perpetually blocked state; while for all other cases, we either redirect (new behavior) or drop the connection (old behavior).

I realized only today that CLUSTER FAILOVER does not result in MOVED errors for blocked clients satisfying these client conditions (as the documentation of the command and the comments for clusterRedirectBlockedClientIfNeeded may suggest).

Instead, we end up in exactly the function we are discussing. That is to say, we return UNBLOCKED and disconnect.

So, Valkey cluster seems to handle unblocking due to slot migration and unblocking due to primary/replica change differently. (I don't know whether this is by intention or not).

Thus, what I deemed to be "well-known territory" apparently isn't. Furthermore, replicationSetPrimary is not only called during failover but also by an arbitrary REPLICAOF command.

So, I think you are right and we should either redirect or disconnect. Shall I convert the current "do nothing" cases (i.e. continue) into disconnects? (because we are not sure whether redirecting the client is actually beneficial in these cases)

(Alternatively, we could come to the conclusion that we don't need to be "better than CLUSTER FAILOVER" and decide against doing this redirect optimization at all.)

@@ -3023,7 +3023,7 @@ void replicationStartPendingFork(void);
void replicationHandlePrimaryDisconnection(void);
void replicationCachePrimary(client *c);
void resizeReplicationBacklog(void);
void replicationSetPrimary(char *ip, int port);
void replicationSetPrimary(char *ip, int port, int disconnect_blocked);
Member:

IIUC, the only case where we do not want to disconnect blocked clients is when server.failover_state == FAILOVER_IN_PROGRESS, right? Can we then just check for this condition inside replicationSetPrimary instead of passing in an external condition?

Contributor Author:

Yes, that should be possible, but I am reluctant to make replicationSetPrimary (which isn't specific to failover) behave differently based on some sub-state of the failover state machine. I am not against it, but I think the intention gets clearer if the caller of replicationSetPrimary states "I will take care of blocked clients myself".

Member:

IMO, passing this flag to replicationSetPrimary and having it handle client disconnects is already extending the function's scope. I am fine with letting it handle the state logic and encapsulating the decision in a single place.

Member:

I agree with @gmbnomis that we shouldn't introduce this dependency (server.failover_state = FAILOVER_IN_PROGRESS) to replicationSetPrimary. There's no inherent order between setting the failover state to in-progress and fixing the replication states. Introducing a failover state check adds an unrelated concern to the function, which in turn forces the caller to perform these operations in a specific sequence, an unnecessary constraint IMO and also not resilient to future changes.

Member:

OK, I agree; this is also good reasoning.

@soloestoy (Member):

IIUC, the current issue being described is to postpone the interruption of blocked connections in replicationSetPrimary until after the entire failover is completed, that is, after FAILOVER_IN_PROGRESS. This is not a fix related to the correctness of blocking command execution, but rather an optimization, right?

@gmbnomis (Contributor Author) commented Aug 21, 2024:

IIUC, the current issue being described is to postpone the interruption of blocked connections in replicationSetPrimary until after the entire failover is completed, that is, after FAILOVER_IN_PROGRESS.

Actually, it is not only postponing, but also not disconnecting those clients when the failover does not happen in the end.

This is not a fix related to the correctness of the block command execution, but rather an optimization, right?

Previously, a client that was blocked before the failover (e.g. a BRPOP):

  • continued to be blocked and proceeded normally if the failover failed during FAILOVER_WAIT_FOR_SYNC
  • was unblocked and disconnected if the failover failed during FAILOVER_IN_PROGRESS (with the error UNBLOCKED force unblock from blocking operation, instance state changed (master -> replica?), but there was no such change in the end)

Now, both cases are the same from the client's point of view. I don't think that this fixes a huge problem, as said in the PR comment "It can be considered as a bugfix and could be a candidate for a 7.2.x release."

For the REDIRECT part, yes, it is an optimization which becomes possible because we have REDIRECT now and because we know where to redirect to.

Status: Optional for next RC
5 participants