Fix bugs related to synchronous TLS connections blocking #837

Merged (23 commits) on Aug 12, 2024

@naglera naglera commented Jul 29, 2024

  • Fix a TLS bug where connections were shut down by the primary's main process while the child process was still writing, causing the main process to block.
  • TLS connection fix: file descriptors are set to blocking mode in the main thread, followed by a blocking write. This sets the file descriptors to non-blocking if TLS is used (see connTLSSyncWrite()) (@xbasel).
  • Improve the reliability of dual-channel tests. Modify the pause mechanism to verify process status directly, rather than relying on logs.
  • Ensure that server.repl_offset and server.replid are updated correctly when dual channel synchronization completes successfully. Failing to do so led to failures in replication tests that validate replication IDs or compare replication offsets.

wait_for_log_messages $idx {"*Process is about to stop.*"} 0 2000 1
wait_for_condition 50 1000 {
[exec ps -o state= -p $pid] eq "T" ||
[exec ps -o state= -p $pid] eq "Z"

Did you plan to catch that? I mean, this is basically the state when we are waiting to collect the zombie; it is not the expected flow we are trying to catch...
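For readers less familiar with the Tcl test helpers, the polling check under discussion can be sketched in Python. This is only an illustration, assuming the two numeric arguments of `wait_for_condition` mean 50 attempts spaced 1000 ms apart; `wait_for_process_state` and `ps_state` are hypothetical names, not helpers from the test suite. The set of accepted states is a parameter, so a caller can decide whether "Z" (zombie) should count, which is the point the comment above raises.

```python
import subprocess
import time
from typing import Callable, Iterable


def ps_state(pid: int) -> str:
    """Return the one-letter process state that `ps -o state= -p PID` prints."""
    result = subprocess.run(
        ["ps", "-o", "state=", "-p", str(pid)],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout.strip()


def wait_for_process_state(
    get_state: Callable[[], str],
    wanted: Iterable[str],
    max_tries: int = 50,
    delay_s: float = 1.0,
) -> bool:
    """Poll get_state() until it returns one of the wanted states.

    Returns True as soon as a wanted state is seen and False once
    max_tries polls have passed without seeing one.
    """
    wanted_states = set(wanted)
    for _ in range(max_tries):
        if get_state() in wanted_states:
            return True
        time.sleep(delay_s)
    return False
```

A call such as `wait_for_process_state(lambda: ps_state(pid), {"T"})` would wait only for a stopped process, while passing `{"T", "Z"}` would also accept a zombie.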


codecov bot commented Jul 29, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 70.23%. Comparing base (b4d96ca) to head (575d549).
Report is 76 commits behind head on unstable.

Additional details and impacted files
@@             Coverage Diff              @@
##           unstable     #837      +/-   ##
============================================
- Coverage     70.38%   70.23%   -0.16%     
============================================
  Files           112      112              
  Lines         61462    61470       +8     
============================================
- Hits          43261    43173      -88     
- Misses        18201    18297      +96     
Files with missing lines    Coverage Δ
src/networking.c            88.72% <100.00%> (-0.08%) ⬇️
src/rdb.c                   76.16% <100.00%> (-0.11%) ⬇️
src/replication.c           87.14% <100.00%> (-0.04%) ⬇️

... and 13 files with indirect coverage changes


ranshid commented Jul 29, 2024

@naglera please fix the issue headline, as it now includes 2 fixes. Please also add an explanation for the fixes in the top comment.

@naglera naglera changed the title Wait for pause should check process status instead of logs Fix dual-channel-replication related issues Jul 29, 2024
@madolson madolson added the run-extra-tests Run extra tests on this PR (Runs all tests from daily except valgrind and RESP) label Jul 29, 2024
Signed-off-by: naglera <[email protected]>
@madolson

There are some new errors I haven't seen before in the extended tests: https://github.com/valkey-io/valkey/actions/runs/10146979329/job/28056324528?pr=837

naglera and others added 18 commits July 30, 2024 09:59
On busy hosts, rdb-key-save-delay may pause the process for longer than
expected due to long, recurring context switches

Signed-off-by: naglera <[email protected]>
…lly succeed"

This reverts commit 10c73e5.

Signed-off-by: naglera <[email protected]>
This reverts commit 5f1cd17.

Signed-off-by: naglera <[email protected]>
1.Replica recover rdb-connection killed
2.Replica recover main-connection killed

We can only catch bgsave while it is in progress or expect the replica to complete the sync, but not both.
Using rdb-key-save-delay is not predictable enough when the machine is under high load.

Signed-off-by: naglera <[email protected]>
Signed-off-by: naglera <[email protected]>
1. Test replica's buffer limit reached
2. dual-channel-replication fails when primary diskless disabled
In both cases we should not count on rdb-key-save-delay to be precise on machines with high load.

Signed-off-by: naglera <[email protected]>
Test replica unable to join dual channel replication sync after started
We should not count on rdb-key-save-delay to be precise on machines with high load.

Signed-off-by: naglera <[email protected]>
…rite.

Dual-channel file descriptors are set to blocking mode in the main
thread, followed by a blocking write. This sets the file descriptors
to non-blocking if TLS is used (see `connTLSSyncWrite()`).

The child process runs in blocking mode. If a write operation fails
to write the entire buffer, SSL returns `SSL_ERROR_WANT_WRITE`. This
error is ignored, causing the replica to fail in loading the RDB.

This change sets the file descriptors to blocking mode after the blocking write.
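To illustrate why an ignored "want write" condition loses data, here is a minimal sketch of a write loop that retries until the whole buffer is delivered. It uses a plain non-blocking socket as a stand-in for a TLS connection: Python raises `BlockingIOError` where OpenSSL would report `SSL_ERROR_WANT_WRITE`. The helper name `write_all` and the timeout are assumptions made for this example, and this is not the code changed by the commit.

```python
import select
import socket


def write_all(sock: socket.socket, data: bytes, timeout_s: float = 5.0) -> int:
    """Write every byte of data to sock, even if sock is non-blocking.

    A short write or a would-block condition is not treated as success:
    the loop waits until the socket is writable again and continues from
    the first unsent byte.
    """
    view = memoryview(data)
    sent = 0
    while sent < len(view):
        try:
            sent += sock.send(view[sent:])
        except BlockingIOError:
            # Analogous to SSL_ERROR_WANT_WRITE: the kernel buffer is full,
            # so wait for the peer to drain it instead of giving up.
            _, writable, _ = select.select([], [sock], [], timeout_s)
            if not writable:
                raise TimeoutError("peer did not drain the socket in time")
    return sent
```

If the `except` branch simply returned, the receiver would see a truncated payload, which matches the symptom described above where the replica fails to load the RDB.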

Signed-off-by: xbasel <[email protected]>
Signed-off-by: Madelyn Olson <[email protected]>
Fix "Test dual-channel-replication primary reject set-rdb-client after client
killed". When the replica is paused, the rdb child process can't recognize
that the connection was closed. We need to resume the replica in order to fail the sync.

Signed-off-by: naglera <[email protected]>
When using a blocking connection we can't normally close the connection in
the main process context, since it may block the main process if the
replica is not responding. We also can't skip this, since
otherwise the child process will continue the save.

Signed-off-by: naglera <[email protected]>
- dual-channel-replication fails when primary diskless disabled
- Replica recover rdb-connection killed

In both tests we should make the child process sleep for shorter intervals so
the save is terminated on time

Signed-off-by: naglera <[email protected]>
This reverts commit 75ff15d.

Signed-off-by: naglera <[email protected]>
…nsfer error

Currently lastbgsave_status is used in bgsave or disk-replication,
and the target is the disk. In valkey-io#60, we update it on a transfer error.
I think it is mainly used in tests, so we can use the log to replace it.

It changes lastbgsave_status to err in this case, but it is strange
that it does not set ok or err in the above if and the following else.
Also note that this will affect stop-writes-on-bgsave-error.

Signed-off-by: Binbin <[email protected]>
dual-channel-replication fails when primary diskless disabled - we
should wait for bgproc to exit

Signed-off-by: naglera <[email protected]>
@madolson madolson merged commit 27fce29 into valkey-io:unstable Aug 12, 2024
55 of 56 checks passed
mapleFU pushed a commit to mapleFU/valkey that referenced this pull request Aug 21, 2024
- Fix a TLS bug where connections were shut down by the primary's main
process while the child process was still writing, causing the main
process to block.
- TLS connection fix: file descriptors are set to blocking mode in the
main thread, followed by a blocking write. This sets the file
descriptors to non-blocking if TLS is used (see `connTLSSyncWrite()`)
(@xbasel).
- Improve the reliability of dual-channel tests. Modify the pause
mechanism to verify process status directly, rather than relying on logs.
- Ensure that `server.repl_offset` and `server.replid` are updated
correctly when dual channel synchronization completes successfully.
Failing to do so led to failures in replication tests that validate
replication IDs or compare replication offsets.

---------

Signed-off-by: naglera <[email protected]>
Signed-off-by: naglera <[email protected]>
Signed-off-by: xbasel <[email protected]>
Signed-off-by: Madelyn Olson <[email protected]>
Signed-off-by: Binbin <[email protected]>
Co-authored-by: ranshid <[email protected]>
Co-authored-by: xbasel <[email protected]>
Co-authored-by: Madelyn Olson <[email protected]>
Co-authored-by: Binbin <[email protected]>
Signed-off-by: mwish <[email protected]>
mapleFU pushed a commit to mapleFU/valkey that referenced this pull request Aug 22, 2024
madolson added a commit that referenced this pull request Sep 2, 2024
@madolson madolson added the release-notes This issue should get a line item in the release notes label Sep 2, 2024
@madolson madolson changed the title Fix dual-channel-replication related issues Fix bugs related to synchronous TLS connections blocking Sep 2, 2024
madolson added a commit that referenced this pull request Sep 3, 2024
ranshid pushed a commit that referenced this pull request Nov 21, 2024
This change prevents unintended side effects on connection state and
improves consistency with non-TLS sync operations.

For example, when invoking `connTLSSyncRead` with a blocking file
descriptor, the mode is switched to non-blocking upon `connTLSSyncRead`
exit. If the code assumes the file descriptor remains blocking and calls
the normal `read` expecting it to block, it may result in a short read.

This caused a crash in dual-channel, which was fixed in this PR by
relocating `connBlock()`:
#837

Signed-off-by: xbasel <[email protected]>
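The side effect described in this commit message, a sync helper that leaves the descriptor in a different mode than it found it, can be avoided by saving and restoring the mode around the operation. The sketch below shows that idea with Python sockets; `temporarily_nonblocking` is a hypothetical helper, and the commit text does not state that the actual change is implemented this way.

```python
import socket
from contextlib import contextmanager


@contextmanager
def temporarily_nonblocking(sock: socket.socket):
    """Switch sock to non-blocking mode, then restore its previous mode.

    The previous timeout (None means blocking) is restored on exit, so
    code that later expects a blocking read still gets one.
    """
    previous_timeout = sock.gettimeout()
    sock.setblocking(False)
    try:
        yield sock
    finally:
        sock.settimeout(previous_timeout)
```

With this pattern, a caller that passes in a blocking socket gets a blocking socket back, and a caller that passes in a non-blocking one keeps it non-blocking.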
tezc added a commit to redis/redis that referenced this pull request Jan 13, 2025
This PR is based on:

#12109
valkey-io/valkey#60

Closes: #11678

**Motivation**

During a full sync, when master is delivering RDB to the replica,
incoming write commands are kept in a replication buffer in order to be
sent to the replica once RDB delivery is completed. If RDB delivery
takes a long time, it might create memory pressure on master. Also, once
a replica connection accumulates replication data which is larger than
output buffer limits, master will kill replica connection. This may
cause a replication failure.

The main benefit of the rdb channel replication is streaming incoming
commands in parallel to the RDB delivery. This approach shifts
replication stream buffering to the replica and reduces load on master.
We do this by opening another connection for RDB delivery. The main
channel on replica will be receiving replication stream while rdb
channel is receiving the RDB.

This feature also helps to reduce master's main process CPU load. By
opening a dedicated connection for the RDB transfer, the bgsave process
has access to the new connection and it will stream RDB directly to the
replicas. Before this change, due to TLS connection restriction, the
bgsave process was writing RDB bytes to a pipe and the main process was
forwarding it to the replica. This is no longer necessary, so the main
process can avoid these expensive socket read/write syscalls. It also
means RDB delivery to replica will be faster as it avoids this step.

In summary, replication will be faster and master's performance during
full syncs will improve.


**Implementation steps**

1. When replica connects to the master, it sends 'rdb-channel-repl' as
part of capability exchange to let master know that replica supports rdb
channel.
2. When replica lacks sufficient data for PSYNC, master sends
+RDBCHANNELSYNC reply with replica's client id. As the next step, the
replica opens a new connection (rdb-channel) and configures it against
the master with the appropriate capabilities and requirements. It also
sends given client id back to master over rdbchannel, so that master can
associate these channels. (The initial replica connection will be referred
to as main-channel.) Then, replica requests fullsync using the RDB channel.
3. Prior to forking, master attaches the replica's main channel to the
replication backlog to deliver replication stream starting at the
snapshot end offset.
4. The master main process sends replication stream via the main
channel, while the bgsave process sends the RDB directly to the replica
via the rdb-channel. Replica accumulates replication stream in a local
buffer, while the RDB is being loaded into the memory.
5. Once the replica completes loading the rdb, it drops the rdb channel
and streams the accumulated replication stream into the db. Sync is
completed.
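The replica side of steps 4 and 5 can be modeled in a few lines. This is a toy simulation under simplifying assumptions (commands are plain key/value sets and the RDB is a dictionary); `ReplicaSync` and its method names are invented for the illustration and do not correspond to functions in the codebase.

```python
from dataclasses import dataclass, field


@dataclass
class ReplicaSync:
    """Toy model of a replica during rdb channel replication."""

    db: dict = field(default_factory=dict)
    pending: list = field(default_factory=list)
    loading: bool = True

    def on_main_channel(self, key, value):
        """Handle one write received over the main channel."""
        if self.loading:
            # The RDB is still loading: accumulate the stream locally.
            self.pending.append((key, value))
        else:
            self.db[key] = value

    def on_rdb_loaded(self, snapshot: dict):
        """Install the snapshot, then replay the accumulated stream in order."""
        self.db = dict(snapshot)
        for key, value in self.pending:
            self.db[key] = value
        self.pending.clear()
        self.loading = False
```

Replaying the buffer only after the snapshot is installed is what lets writes that arrived during the transfer override the older values contained in the RDB.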

**Some details**
- Currently, rdbchannel replication is supported only if
`repl-diskless-sync` is enabled on master. Otherwise, replication will
happen over a single connection as before.
- On replica, there is a limit to replication stream buffering. Replica
uses a new config `replica-full-sync-buffer-limit` to limit number of
bytes to accumulate. If it is not set, replica inherits
`client-output-buffer-limit <replica>` hard limit config. If we reach
this limit, replica stops accumulating. This is not a failure scenario
though. Further accumulation will happen on master side. Depending on
the configured limits on master, master may kill the replica connection.
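The buffering rule in the second bullet can be sketched as follows. The function and class names are hypothetical, and treating a resolved limit of 0 as "no limit" is an assumption of this sketch rather than something stated above.

```python
def effective_buffer_limit(full_sync_buffer_limit: int, replica_hard_limit: int) -> int:
    """Pick the byte limit for accumulating the replication stream.

    A value of 0 for replica-full-sync-buffer-limit means "inherit the
    client-output-buffer-limit <replica> hard limit".
    """
    return full_sync_buffer_limit or replica_hard_limit


class BoundedReplBuffer:
    """Accumulates replication stream chunks up to a byte limit."""

    def __init__(self, limit: int):
        self.limit = limit  # assumption of this sketch: 0 means unlimited
        self.chunks = []
        self.size = 0

    def offer(self, chunk: bytes) -> bool:
        """Store chunk if it fits; return False when the limit is reached.

        A False result is not an error: the replica simply stops
        accumulating, and further data piles up on the master side.
        """
        if self.limit and self.size + len(chunk) > self.limit:
            return False
        self.chunks.append(chunk)
        self.size += len(chunk)
        return True
```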

**API changes in INFO output:**

1. New replica state: `send_bulk_and_stream`. Indicates full sync is
still in progress for this replica. It is receiving replication stream
and rdb in parallel.
```
slave0:ip=127.0.0.1,port=5002,state=send_bulk_and_stream,offset=0,lag=0
```
Replica state changes in steps:
- First, replica sends psync and receives +RDBCHANNELSYNC:
`state=wait_bgsave`
- After replica connects with rdbchannel and delivery starts:
`state=send_bulk_and_stream`
- After full sync: `state=online`

2. On replica side, replication stream buffering metrics:
- replica_full_sync_buffer_size: Currently accumulated replication
stream data in bytes.
- replica_full_sync_buffer_peak: Peak number of bytes that this instance
accumulated in the lifetime of the process.

```
replica_full_sync_buffer_size:20485             
replica_full_sync_buffer_peak:1048560
```
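A monitoring script could read these fields with a small parser. The sketch below works on the sample lines shown above; the function names are made up for the example and no client library is assumed.

```python
def parse_replica_line(line: str) -> dict:
    """Parse an INFO replication line such as 'slave0:ip=...,state=...'."""
    name, _, fields = line.strip().partition(":")
    result = {"name": name}
    for pair in fields.split(","):
        key, _, value = pair.partition("=")
        result[key] = value
    return result


def parse_info_metrics(text: str) -> dict:
    """Parse 'key:value' lines from INFO output into a dictionary."""
    metrics = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and section headers
        key, sep, value = line.partition(":")
        if sep:
            metrics[key] = value
    return metrics
```

For example, a check that a full sync is still in progress could test whether `parse_replica_line(line)["state"]` equals `send_bulk_and_stream`.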

**API changes in CLIENT LIST**

In `client list` output, rdbchannel clients will have 'C' flag in
addition to 'S' replica flag:
```
id=11 addr=127.0.0.1:39108 laddr=127.0.0.1:5001 fd=14 name= age=5 idle=5 flags=SC db=0 sub=0 psub=0 ssub=0 multi=-1 watch=0 qbuf=0 qbuf-free=0 argv-mem=0 multi-mem=0 rbs=1024 rbp=0 obl=0 oll=0 omem=0 tot-mem=1920 events=r cmd=psync user=default redir=-1 resp=2 lib-name= lib-ver= io-thread=0
```
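Detecting an rdbchannel client from such a line comes down to splitting the `key=value` tokens and checking the flags. A minimal sketch, with hypothetical helper names:

```python
def parse_client_line(line: str) -> dict:
    """Split one CLIENT LIST line into a dictionary of its key=value fields."""
    fields = {}
    for token in line.split():
        key, _, value = token.partition("=")
        fields[key] = value
    return fields


def is_rdb_channel_replica(client: dict) -> bool:
    """True when the flags contain both 'S' (replica) and 'C' (rdb channel)."""
    flags = client.get("flags", "")
    return "S" in flags and "C" in flags
```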

**Config changes:**
- `replica-full-sync-buffer-limit`: Controls how much replication data
replica can accumulate during rdbchannel replication. If it is not set
(a value of 0), replica will inherit `client-output-buffer-limit
<replica>` hard limit config to limit accumulated data.
- `repl-rdb-channel` config is added as a hidden config. This is mostly
for testing as we need to support both rdbchannel replication and the
older single connection replication (to keep compatibility with older
versions and rdbchannel replication will not be enabled if
repl-diskless-sync is not enabled). It affects both the master (not to
respond to rdb channel requests) and the replica (not to declare
capability).

**Internal API changes:**
Changes that were introduced to Redis replication:
- New replication capability is added to replconf command: `capa
rdb-channel-repl`. Indicates replica is capable of rdb channel
replication. Replica sends it when it connects to master along with
other capabilities.
- If replica needs fullsync, master replies `+RDBCHANNELSYNC
<client-id>` to the replica's PSYNC request.
- When replica opens rdbchannel connection, as part of replconf command,
it sends `rdb-channel 1` to let master know this is rdb channel. Also,
it sends `main-ch-client-id <client-id>` as part of replconf command so
master can associate channels.
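The messages listed above can be put together as a rough wire-level sketch. The RESP array encoding is the standard command format; everything else is an assumption made for illustration, in particular that `rdb-channel 1` and `main-ch-client-id <client-id>` travel in a single REPLCONF command and that the reply is exactly `+RDBCHANNELSYNC <client-id>` with a numeric id. The function names are hypothetical.

```python
from typing import Optional


def encode_command(*args: str) -> bytes:
    """Encode a command as a RESP array of bulk strings."""
    out = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg.encode()
        out.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(out)


def parse_rdbchannelsync(reply: str) -> Optional[int]:
    """Return the client id from '+RDBCHANNELSYNC <client-id>', else None."""
    parts = reply.strip().split()
    if len(parts) == 2 and parts[0] == "+RDBCHANNELSYNC" and parts[1].isdigit():
        return int(parts[1])
    return None


def capability_announcement() -> bytes:
    """Main channel: announce rdb channel support during the handshake."""
    return encode_command("REPLCONF", "capa", "rdb-channel-repl")


def rdb_channel_handshake(client_id: int) -> bytes:
    """RDB channel: identify the connection and tie it to the main channel.

    Sending both options in one REPLCONF is an assumption of this sketch.
    """
    return encode_command(
        "REPLCONF", "rdb-channel", "1", "main-ch-client-id", str(client_id)
    )
```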
  
**Testing:**
As rdbchannel replication is enabled by default, we run the whole test suite
with it. However, as we need to support both rdbchannel and single
connection replication, we'll be running some tests twice with
`repl-rdb-channel yes/no` config.

**Replica state diagram**
```
* * Replica state machine *
 *
 * Main channel state
 * ┌───────────────────┐
 * │RECEIVE_PING_REPLY │
 * └────────┬──────────┘
 *          │ +PONG
 * ┌────────▼──────────┐
 * │SEND_HANDSHAKE     │                     RDB channel state
 * └────────┬──────────┘            ┌───────────────────────────────┐
 *          │+OK                ┌───► RDB_CH_SEND_HANDSHAKE         │
 * ┌────────▼──────────┐        │   └──────────────┬────────────────┘
 * │RECEIVE_AUTH_REPLY │        │    REPLCONF main-ch-client-id <clientid>
 * └────────┬──────────┘        │   ┌──────────────▼────────────────┐
 *          │+OK                │   │ RDB_CH_RECEIVE_AUTH_REPLY     │
 * ┌────────▼──────────┐        │   └──────────────┬────────────────┘
 * │RECEIVE_PORT_REPLY │        │                  │ +OK
 * └────────┬──────────┘        │   ┌──────────────▼────────────────┐
 *          │+OK                │   │  RDB_CH_RECEIVE_REPLCONF_REPLY│
 * ┌────────▼──────────┐        │   └──────────────┬────────────────┘
 * │RECEIVE_IP_REPLY   │        │                  │ +OK
 * └────────┬──────────┘        │   ┌──────────────▼────────────────┐
 *          │+OK                │   │ RDB_CH_RECEIVE_FULLRESYNC     │
 * ┌────────▼──────────┐        │   └──────────────┬────────────────┘
 * │RECEIVE_CAPA_REPLY │        │                  │+FULLRESYNC
 * └────────┬──────────┘        │                  │Rdb delivery
 *          │                   │   ┌──────────────▼────────────────┐
 * ┌────────▼──────────┐        │   │ RDB_CH_RDB_LOADING            │
 * │SEND_PSYNC         │        │   └──────────────┬────────────────┘
 * └─┬─────────────────┘        │                  │ Done loading
 *   │PSYNC (use cached-master) │                  │
 * ┌─▼─────────────────┐        │                  │
 * │RECEIVE_PSYNC_REPLY│        │    ┌────────────►│ Replica streams replication
 * └─┬─────────────────┘        │    │             │ buffer into memory
 *   │                          │    │             │
 *   │+RDBCHANNELSYNC client-id │    │             │
 *   ├──────┬───────────────────┘    │             │
 *   │      │ Main channel           │             │
 *   │      │ accumulates repl data  │             │
 *   │   ┌──▼────────────────┐       │     ┌───────▼───────────┐
 *   │   │ REPL_TRANSFER     ├───────┘     │    CONNECTED      │
 *   │   └───────────────────┘             └────▲───▲──────────┘
 *   │                                          │   │
 *   │                                          │   │
 *   │  +FULLRESYNC    ┌───────────────────┐    │   │
 *   ├────────────────► REPL_TRANSFER      ├────┘   │
 *   │                 └───────────────────┘        │
 *   │  +CONTINUE                                   │
 *   └──────────────────────────────────────────────┘
 */
 ```
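 The RDB channel column of the diagram can also be read as a transition table. The sketch below encodes it that way; the state names are taken from the diagram, while the event labels `handshake_sent` and `done_loading` are hypothetical names chosen for this example.

```python
# Transition table for the RDB channel column of the diagram.
# Keys are (current state, event); values are the next state.
RDB_CHANNEL_TRANSITIONS = {
    ("RDB_CH_SEND_HANDSHAKE", "handshake_sent"): "RDB_CH_RECEIVE_AUTH_REPLY",
    ("RDB_CH_RECEIVE_AUTH_REPLY", "+OK"): "RDB_CH_RECEIVE_REPLCONF_REPLY",
    ("RDB_CH_RECEIVE_REPLCONF_REPLY", "+OK"): "RDB_CH_RECEIVE_FULLRESYNC",
    ("RDB_CH_RECEIVE_FULLRESYNC", "+FULLRESYNC"): "RDB_CH_RDB_LOADING",
    # After loading, the replica streams its local buffer and is connected.
    ("RDB_CH_RDB_LOADING", "done_loading"): "CONNECTED",
}


def advance(state: str, event: str) -> str:
    """Return the next state, or raise ValueError for an unexpected event."""
    try:
        return RDB_CHANNEL_TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"unexpected event {event!r} in state {state}") from None


def run(events, start: str = "RDB_CH_SEND_HANDSHAKE") -> str:
    """Apply a sequence of events and return the final state."""
    state = start
    for event in events:
        state = advance(state, event)
    return state
```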
 -----
 This PR also contains changes and ideas from: 
valkey-io/valkey#837
valkey-io/valkey#1173
valkey-io/valkey#804
valkey-io/valkey#945
valkey-io/valkey#989
---------

Co-authored-by: Yuan Wang <[email protected]>
Co-authored-by: debing.sun <[email protected]>
Co-authored-by: Moti Cohen <[email protected]>
Co-authored-by: naglera <[email protected]>
Co-authored-by: Amit Nagler <[email protected]>
Co-authored-by: Madelyn Olson <[email protected]>
Co-authored-by: Binbin <[email protected]>
Co-authored-by: Viktor Söderqvist <[email protected]>
Co-authored-by: Ping Xie <[email protected]>
Co-authored-by: Ran Shidlansik <[email protected]>
Co-authored-by: ranshid <[email protected]>
Co-authored-by: xbasel <[email protected]>
YaacovHazan pushed a commit to redis/redis that referenced this pull request Jan 14, 2025
Labels
release-notes This issue should get a line item in the release notes run-extra-tests Run extra tests on this PR (Runs all tests from daily except valgrind and RESP)
5 participants