Support pausing a decoupled Sink from SQL #17357

StrikeW · 2024-06-20T03:34:42Z

Is your feature request related to a problem? Please describe.

Use case: sometimes a user may want to stop data from flowing out of RisingWave (RW) without pausing the entire cluster.

We don't provide a way for users to pause just a single Sink right now. I think that for a Sink with sink_decouple = true, it is feasible to provide a way to pause the Sink job from SQL.

Describe the solution you'd like

No response

Describe alternatives you've considered

No response

Additional context

No response

tabVersion · 2024-06-20T03:38:43Z

so why not directly drop the sink?

StrikeW · 2024-06-20T03:42:24Z

so why not directly drop the sink?

If we sink an MV out, then when we recreate the sink we need to backfill the historical data again to ensure all data reaches the downstream. Backfilling historical data will re-deliver data that is already in the downstream and introduce unnecessary write traffic there.

hzxa21 · 2024-06-20T04:22:43Z

so why not directly drop the sink?

If we sink an MV out, then when we recreate the sink we need to backfill the historical data again to ensure all data reaches the downstream. Backfilling historical data will re-deliver data that is already in the downstream and introduce unnecessary write traffic there.

We support a snapshot option in CREATE SINK: https://docs.risingwave.com/docs/current/sql-create-sink/

When snapshot=false, the backfilling will be skipped.

hzxa21 · 2024-06-20T04:31:58Z

Dropping the sink and recreating it with snapshot=false can cause the data ingested between DROP SINK and CREATE SINK to be missing from the sink, though.
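To make the trade-off concrete, here is a minimal toy model in Python. It is not RisingWave code: the functions `drop_and_recreate` and `pause_and_resume`, the integer row ids, and the assumption of an append-only MV are all hypothetical, chosen only to illustrate the gap described above. It compares recreating the sink with and without a snapshot against a pause that buffers rows until resume.

```python
from collections import Counter


def drop_and_recreate(events, drop_at, recreate_at, snapshot):
    """Rows seen downstream when the sink is dropped and later recreated.

    `events` lists row ids in ingestion order (append-only MV assumed).
    Rows [drop_at, recreate_at) are ingested while no sink exists.
    """
    delivered = list(events[:drop_at])        # sent before DROP SINK
    if snapshot:
        # Backfill re-sends everything in the MV at recreate time.
        delivered += events[:recreate_at]
    delivered += events[recreate_at:]         # sent after CREATE SINK
    return delivered


def pause_and_resume(events, pause_at, resume_at):
    """Rows seen downstream if the sink is paused and rows are buffered."""
    delivered = list(events[:pause_at])
    buffered = events[pause_at:resume_at]     # retained while paused
    return delivered + list(buffered) + list(events[resume_at:])


def missing(events, delivered):
    return sorted(set(events) - set(delivered))


def duplicated(delivered):
    return sorted(r for r, n in Counter(delivered).items() if n > 1)


events = list(range(10))
no_snapshot = drop_and_recreate(events, 4, 7, snapshot=False)
with_snapshot = drop_and_recreate(events, 4, 7, snapshot=True)
paused = pause_and_resume(events, 4, 7)

print(missing(events, no_snapshot), duplicated(no_snapshot))      # gap, no duplicates
print(missing(events, with_snapshot), duplicated(with_snapshot))  # no gap, duplicates
print(missing(events, paused), duplicated(paused))                # neither
```

In this sketch, only the pause path delivers every row exactly once; the snapshot=false path loses the rows ingested while the sink did not exist, and the snapshot path re-sends rows that were already delivered.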

I can think of one case where this feature is useful: the user's downstream system is overloaded and they want to stop the traffic for a while until they can fix the downstream. Not sure whether it is valid, though. If it is just a temporary downstream failure, the current retry-with-backoff mechanism with sink decoupling on seems good enough.

StrikeW · 2024-06-20T04:48:55Z

Dropping the sink and recreating it with snapshot=false can cause the data ingested between DROP SINK and CREATE SINK to be missing from the sink, though.

Exactly. To ensure at-least-once delivery, backfilling is still needed.

I can think of one case where this feature is useful: the user's downstream system is overloaded and they want to stop the traffic for a while until they can fix the downstream.

Yes. We encountered a case where a user was sinking an MV with a large amount of data into a downstream PG, but there were some issues with the PG CDC source that they wanted to troubleshoot.

fuyufjh · 2024-06-20T06:36:43Z

sink_decouple, or the fact that there is a hidden log store in front of the sink, is somewhat counter-intuitive from the user's perspective. Thus, I would like to keep it transparent to users in normal cases. Only when the sink can't work does the hidden log store help avoid failing all streaming jobs.
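As a rough illustration of that idea, the sketch below models a sink with a buffer in front of it. It is a toy under stated assumptions, not RisingWave's implementation: the class `DecoupledSink`, the in-memory `deque` standing in for the log store, and the `paused` flag are all hypothetical. The upstream always appends to the buffer, so a failing (or paused) downstream does not fail the upstream; rows are simply retained until they can be written.

```python
from collections import deque


class DecoupledSink:
    """Toy stand-in for a sink with a log store in front of it."""

    def __init__(self, write):
        self.write = write    # callable that delivers one row downstream
        self.log = deque()    # hypothetical in-memory "log store"
        self.paused = False   # hypothetical pause switch

    def on_upstream(self, row):
        # The upstream only appends; it never waits on the downstream.
        self.log.append(row)
        self.drain()

    def drain(self):
        if self.paused:
            return
        while self.log:
            try:
                self.write(self.log[0])
            except ConnectionError:
                return        # keep the row buffered and retry on a later drain
            self.log.popleft()  # remove only after a successful write


downstream = []
health = {"up": True}


def write_row(row):
    if not health["up"]:
        raise ConnectionError("downstream unavailable")
    downstream.append(row)


sink = DecoupledSink(write_row)
sink.on_upstream("a")
health["up"] = False
sink.on_upstream("b")   # buffered, upstream keeps running
health["up"] = True
sink.drain()            # buffered row is delivered after recovery
```

A row is removed from the buffer only after the write succeeds, which is why a retry can deliver it again later rather than lose it.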

Applying this idea here, I think we could do this in risectl as a contingency measure. Users won't be aware of it except when troubleshooting.

StrikeW added the type/feature label Jun 20, 2024

github-actions bot added this to the release-1.10 milestone Jun 20, 2024

fuyufjh removed this from the release-1.10 milestone Jul 10, 2024

fuyufjh assigned wenym1 Jul 10, 2024
