Support pausing a decoupled Sink from SQL #17357

Open
StrikeW opened this issue Jun 20, 2024 · 6 comments
@StrikeW
Contributor

StrikeW commented Jun 20, 2024

Is your feature request related to a problem? Please describe.

Use case: sometimes a user may want to stop data from flowing out of RW without pausing the entire cluster.

We don't currently provide a way for a user to pause just a Sink. I think that for a Sink with sink_decouple = true, it is feasible to provide a way to pause the Sink job from SQL.

Describe the solution you'd like

No response

Describe alternatives you've considered

No response

Additional context

No response

@github-actions github-actions bot added this to the release-1.10 milestone Jun 20, 2024
@tabVersion
Contributor

so why not directly drop the sink?

@StrikeW
Contributor Author

StrikeW commented Jun 20, 2024

> so why not directly drop the sink?

If we sink an MV out, then when we recreate the sink we need to backfill historical data again to ensure all data goes downstream. The backfill re-delivers data that is already downstream and introduces unnecessary write traffic to it.

@hzxa21
Collaborator

hzxa21 commented Jun 20, 2024

> > so why not directly drop the sink?
>
> If we sink an MV out, then when we recreate the sink we need to backfill historical data again to ensure all data goes downstream. The backfill re-delivers data that is already downstream and introduces unnecessary write traffic to it.

We support a snapshot option in CREATE SINK: https://docs.risingwave.com/docs/current/sql-create-sink/

When snapshot=false, the backfilling will be skipped.

@hzxa21
Collaborator

hzxa21 commented Jun 20, 2024

Dropping the sink and recreating it with snapshot=false can cause the data ingested between DROP SINK and CREATE SINK to be missing from the sink, though.

I can think of one case where this feature is useful: when a user's downstream system is overloaded and they want to stop the traffic for a while until they can fix the downstream. Not sure whether it is valid, though. If it is just a temporary downstream failure, the current retry-with-backoff mechanism with sink decoupling enabled seems good enough.
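To make the drop-and-recreate trade-off concrete, here is a toy Python model. It is only a sketch, not RisingWave code: the function names and the row labels are hypothetical, and it assumes the recreated sink either replays everything in the MV (snapshot=true) or only sees rows ingested after it is created (snapshot=false).

```python
def rows_sent_by_recreated_sink(delivered_before_drop, ingested_during_gap,
                                ingested_after_create, snapshot):
    """Rows a recreated sink writes downstream (toy model, hypothetical names)."""
    sent = []
    if snapshot:
        # Backfill replays the whole MV: rows downstream already has,
        # plus rows that arrived while no sink existed.
        sent.extend(delivered_before_drop)
        sent.extend(ingested_during_gap)
    # Rows ingested after CREATE SINK are always delivered.
    sent.extend(ingested_after_create)
    return sent


def summarize(snapshot):
    """Return (rows re-delivered to downstream, rows never delivered)."""
    before = ["r1", "r2", "r3"]  # delivered by the old sink before DROP SINK
    gap = ["r4", "r5"]           # ingested between DROP SINK and CREATE SINK
    after = ["r6"]               # ingested after CREATE SINK
    sent = rows_sent_by_recreated_sink(before, gap, after, snapshot)
    redelivered = [r for r in sent if r in before]
    missing = [r for r in gap + after if r not in sent]
    return redelivered, missing


print(summarize(True))   # (['r1', 'r2', 'r3'], [])
print(summarize(False))  # ([], ['r4', 'r5'])
```

In this model, snapshot=true loses nothing but rewrites every row downstream already has, while snapshot=false avoids the rewrite but drops the rows from the gap. A pause would avoid both.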

@StrikeW
Contributor Author

StrikeW commented Jun 20, 2024

> Dropping the sink and recreating it with snapshot=false can cause the data ingested between DROP SINK and CREATE SINK to be missing from the sink, though.

That's it. To ensure at-least-once delivery, backfilling is still needed.

> I can think of one case where this feature is useful: when a user's downstream system is overloaded and they want to stop the traffic for a while until they can fix the downstream.

Yes. We encountered a case where a user was sinking an MV with a large amount of data into a downstream PG, but there were some issues with the PG CDC source that they wanted to troubleshoot.

@fuyufjh
Member

fuyufjh commented Jun 20, 2024

sink_decouple, or the fact that there is a hidden log store in front of the sink, is somewhat counter-intuitive from the user's perspective. Thus, I would like to keep it transparent to users in normal cases. Only when the sink can't work does the hidden log store help, by avoiding the failure of all streaming jobs.

Applying this idea here, I think we could do this in risectl as a contingency measure. Users won't be aware of it except when troubleshooting.
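For readers unfamiliar with why a log store makes pausing feasible, the idea can be sketched in a few lines of Python. This is an illustrative toy, not RisingWave's implementation: the `DecoupledSink` class and its methods are hypothetical, and a real log store is persistent rather than an in-memory queue.

```python
from collections import deque


class DecoupledSink:
    """Toy model of a sink with a log store in front of it (hypothetical)."""

    def __init__(self):
        self.log_store = deque()  # rows accepted from upstream, not yet delivered
        self.downstream = []      # rows written to the external system
        self.paused = False

    def write(self, row):
        # Upstream is never blocked: rows always land in the log store first.
        self.log_store.append(row)
        self._drain()

    def pause(self):
        self.paused = True

    def resume(self):
        self.paused = False
        self._drain()

    def _drain(self):
        # Deliver buffered rows in order unless the sink is paused.
        while not self.paused and self.log_store:
            self.downstream.append(self.log_store.popleft())


sink = DecoupledSink()
sink.write("r1")
sink.pause()
sink.write("r2")
sink.write("r3")
print(sink.downstream)  # ['r1']
sink.resume()
print(sink.downstream)  # ['r1', 'r2', 'r3']
```

While paused, upstream writes keep succeeding and accumulate in the log store, so nothing is lost and nothing has to be backfilled on resume. The cost is that the log store grows for as long as the sink stays paused.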

@fuyufjh fuyufjh removed this from the release-1.10 milestone Jul 10, 2024