Support a method to reload a CDC table without recreating it #12753
Comments
This is somewhat similar to Alter MV. IIUC, the main motivation is to preserve all downstream streaming jobs. The problem then is how to decide which changes should be yielded to these downstream jobs if data has already been lost.
|
By full replication, do you mean doing a full scan of the source table? |
Why is the upstream binlog/WAL file missing in the first place? Is this an expected situation? I guess this is due to binlog/WAL retention? Correct me if I am wrong: the job cannot be recovered only if the missing upstream binlog/WAL files are "unconsumed" by RW. If this is the case, it means something unexpected is happening in the RW cluster that causes lag in consuming the binlog/WAL files. That sounds rare to me, and dropping and recreating the CDC table sounds acceptable to me. |
I tried to repro binlog missing via the following steps:
Here are my findings:
I am using this git commit (git-5a7e0883e10e6cba54a70888717b3d2152bc9a61) since this is the locally cached one and my box takes a long time to pull the latest image for some reason. |
Hi @hzxa21, thanks for the effort. However, I doubt that this is a correct way to reproduce the binlog-missing case caused by retention. The
mysql> show master status;
+---------------+----------+--------------+------------------+-------------------+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+---------------+----------+--------------+------------------+-------------------+
| binlog.000009 | 13947 | | | |
+---------------+----------+--------------+------------------+-------------------+
1 row in set (0.01 sec)
mysql> PURGE BINARY LOGS BEFORE now();
Query OK, 0 rows affected, 1 warning (0.00 sec)
mysql> show warnings;
+---------+------+------------------------------------------------------------------------+
| Level | Code | Message |
+---------+------+------------------------------------------------------------------------+
| Warning | 1868 | file ./binlog.000009 was not purged because it is the active log file. |
+---------+------+------------------------------------------------------------------------+
1 row in set (0.00 sec)
mysql> reset master to 10;
Query OK, 0 rows affected (0.02 sec)
The In summary, if the binlog file is accidentally lost rather than removed by the retention policy, then the following behavior is expected. In this case, we need to drop and rebuild the CDC table.
|
Right now we should focus on the case where the binlog file is deleted by the MySQL retention policy. In this case, the persisted source offset may point to a binlog that MySQL has already purged, so the connector cannot restore its position upon recovery and the CDC table falls out of sync with its upstream. I found that Debezium can emit heartbeat events to downstream. With the help of heartbeat events, we can keep the source offset up to date with the upstream even when the upstream has no updates for a long time. The offset can then be reset successfully during recovery. I will implement support for heartbeat events first to solve the problem for our customer.
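To make the heartbeat idea concrete, here is a minimal simulation, not the actual connector code. It assumes the persisted offset can be reduced to the numeric suffix of the binlog file, and that recovery works only if that file is still retained upstream. `SourceEvent`, `persisted_offset`, and `can_resume` are hypothetical names made up for this sketch. In Debezium, heartbeats are enabled through the `heartbeat.interval.ms` connector property.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class SourceEvent:
    binlog_seq: int     # numeric suffix of the binlog file, e.g. 9 for binlog.000009
    is_heartbeat: bool  # True for a heartbeat, False for a real row change


def persisted_offset(events: Iterable[SourceEvent], use_heartbeat: bool) -> Optional[int]:
    """Return the last binlog sequence that would be persisted as the source offset."""
    offset = None
    for event in events:
        # Without heartbeat support, only real row changes advance the offset.
        if event.is_heartbeat and not use_heartbeat:
            continue
        offset = event.binlog_seq
    return offset


def can_resume(offset: Optional[int], oldest_retained_seq: int) -> bool:
    """The connector can resume only if the offset's binlog file has not been purged."""
    return offset is not None and offset >= oldest_retained_seq


# The table is idle after binlog 3; other activity rotates the binlog up to 9,
# and the retention policy has purged every file older than 8.
events = [SourceEvent(3, False), SourceEvent(5, True), SourceEvent(9, True)]
oldest_retained_seq = 8

print(can_resume(persisted_offset(events, use_heartbeat=False), oldest_retained_seq))
print(can_resume(persisted_offset(events, use_heartbeat=True), oldest_retained_seq))
```

Without heartbeats the offset stays at the last real change and falls behind the retention window. With heartbeats it follows the upstream position even while the table itself is idle.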
|
Did you accidentally reference this? 👀 |
No. I think the problem you mentioned is more general and complex, for example when the compute node has been down for a long time. But right now we can narrow the scope and assume that the compute node won't be down for longer than the binlog expiration time, so that the connector can reset its offset upon recovery. |
We had another round of discussion on this issue. Let me migrate it to this thread.
@hzxa21 and I discussed the case where the user confirms that their upstream table is append-only. We can provide a command that lets the user trigger a full re-sync with their upstream table, and introduce a new type of conflict handling. Since this new type of conflict handling will introduce overhead, it should be switched off after the re-sync is done. |
This issue has been open for 60 days with no activity. Could you please update the status? Feel free to continue discussion or close as not planned. |
Is your feature request related to a problem? Please describe.
When we try to recover a CDC table job, if the upstream binlog/WAL file is missing, the job currently cannot be recovered. Right now, we can only drop the table and recreate it to do a full replication. However, rebuilding a CDC table can be tedious, because there may be many downstream streaming jobs that depend on the CDC table.
One of our customers requested a method to reload the CDC table without recreating it.
Describe the solution you'd like
Some ideas:
A REFRESH command that allows users to trigger a full replication for a specific CDC table, mainly for operational scenarios.
Reference:
DB2: https://www.ibm.com/docs/en/db2-for-zos/12?topic=statements-refresh-table
PG: https://www.postgresql.org/docs/current/sql-refreshmaterializedview.html
Describe alternatives you've considered
No response
Additional context
related: #12313, #14060