You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Currently, when we handle the SINK INTO TABLE statement, we run two commands atomically: a CreateStreamingJob to create a streaming graph to generate the new input, and a ReplaceTable to recreate the the downstream streaming graph that writes to the table.
When handling ReplaceTable, we drop all the actors that write the downstream table, and recreate new ones to accept the new input. However, after dropping the original actors, we don't wait for the uncommitted data to be committed. Since the data are uncommitted, it is not visible to the newly created actors. Therefore, when the new actors try to read keys that has uncommitted data, it won't read the correct one, which may cause data inconsistency. On the other hand, for ReplaceTable command that handles ALTER TABLE statement, it will treat it as a configuration, and will wait for all data to be committed before dropping and recreating the actors, and therefore won't cause bug.
Potential solution:
When handling SINK INTO TABLE statement, treat it as configuration change and run pause/resume command before and after the command.
Do not run ReplaceTable when handling SINK INTO TABLE. Instead, modify the inputs of downstream streaming graph on the flight. We can leverage the mutation information to ensure that the inputs configuration is changed at the boundary of barrier.
Error message/log
Found by reading code.
To Reproduce
No response
Expected behavior
No response
How did you deploy RisingWave?
No response
The version of RisingWave
No response
Additional context
No response
The text was updated successfully, but these errors were encountered:
Describe the bug
Currently, when we handle the
SINK INTO TABLE
statement, we run two commands atomically: aCreateStreamingJob
to create a streaming graph to generate the new input, and aReplaceTable
to recreate the the downstream streaming graph that writes to the table.When handling
ReplaceTable
, we drop all the actors that write the downstream table, and recreate new ones to accept the new input. However, after dropping the original actors, we don't wait for the uncommitted data to be committed. Since the data are uncommitted, it is not visible to the newly created actors. Therefore, when the new actors try to read keys that has uncommitted data, it won't read the correct one, which may cause data inconsistency. On the other hand, forReplaceTable
command that handlesALTER TABLE
statement, it will treat it as a configuration, and will wait for all data to be committed before dropping and recreating the actors, and therefore won't cause bug.Potential solution:
SINK INTO TABLE
statement, treat it as configuration change and run pause/resume command before and after the command.ReplaceTable
when handlingSINK INTO TABLE
. Instead, modify the inputs of downstream streaming graph on the flight. We can leverage the mutation information to ensure that the inputs configuration is changed at the boundary of barrier.Error message/log
To Reproduce
No response
Expected behavior
No response
How did you deploy RisingWave?
No response
The version of RisingWave
No response
Additional context
No response
The text was updated successfully, but these errors were encountered: