feat: compute node unregisters from meta for graceful shutdown #17662

Merged (11 commits) on Jul 26, 2024

Commit 4dcfb9d: remove no-actor optimization & move actor op handler to async context
Task list completed (task-list-completed), started 2024-07-29 07:00:06

3 / 4 tasks completed

1 task still to be completed

Required Tasks

Task Status
The compute node will first unregister from the meta service, so that subsequent batch queries and streaming jobs won't be scheduled here. [Incomplete]
Then, it sends a Shutdown message on the barrier control stream, triggering a recovery on the new set of compute nodes. [Incomplete]
After that, the compute node waits for the connection to be reset. [Incomplete]
Finally, it exits the entrypoint function and then the process gracefully. (The full sequence is sketched below.) [Incomplete]
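A minimal sketch of this four-step sequence, assuming a hypothetical bidirectional barrier control stream and an `unregister` RPC future; the names (`ControlEvent`, `graceful_shutdown`, `unregister`) are illustrative, not the actual RisingWave APIs:

```rust
use futures::{Sink, SinkExt, Stream, StreamExt};

/// Message sent upstream on the barrier control stream; the `Shutdown`
/// variant is assumed to correspond to the new shutdown signal.
enum ControlEvent {
    Shutdown,
}

async fn graceful_shutdown<S, E>(
    unregister: impl std::future::Future<Output = Result<(), E>>,
    mut barrier_stream: S,
) -> Result<(), E>
where
    S: Sink<ControlEvent, Error = E> + Stream<Item = Result<ControlEvent, E>> + Unpin,
{
    // 1. Unregister from the meta service so that subsequent batch queries
    //    and streaming jobs are no longer scheduled onto this node.
    unregister.await?;

    // 2. Send Shutdown on the barrier control stream, letting the meta node
    //    trigger a recovery on the remaining set of compute nodes.
    barrier_stream.send(ControlEvent::Shutdown).await?;

    // 3. Wait for the meta node to reset the connection, i.e. for the
    //    control stream to terminate.
    while barrier_stream.next().await.is_some() {}

    // 4. Return to the entrypoint, which then exits the process gracefully.
    Ok(())
}
```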
I have written necessary rustdoc comments [Completed]
I have added necessary unit tests and integration tests [Completed]
All checks passed in ./risedev check (or alias, ./risedev c) [Completed]
My PR needs documentation updates. (Please use the Release note section below to summarize the impact on users) [Incomplete]
#17802 Graphite [Incomplete]
#17662 Graphite 👈 [Incomplete]
#17633 Graphite [Incomplete]
#17586 Graphite [Incomplete]
#17608 Graphite [Incomplete]
main [Incomplete]
we always clear the executor cache when scaling in, so there might be no big difference in streaming performance, [Incomplete]
recovery does not affect batch availability, [Incomplete]
online scaling can be less responsive (depending on the number of in-flight barriers), which may not fit within the default killing timeout of 30s in Kubernetes, [Incomplete]
Explicitly send an Err(Shutdown) through the stream from the compute node.
We need to be able to recognize this error and differentiate it from other errors in the meta node, for (slightly) different handling behavior, e.g., not ignoring this error even if it's not associated with an in-flight barrier. [Incomplete]
Close the channel and let the meta service acknowledge it by receiving a None.
I'm just concerned whether this is reliable enough, i.e., whether there are any other unexpected scenarios where the stream gets disconnected without an error. [Incomplete]
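A sketch of how the meta node might tell these two options apart when reading the control stream; the event names and the `is_shutdown` predicate are assumptions for illustration, not the actual meta-node code:

```rust
use futures::{Stream, StreamExt};

/// How the meta node might classify one read from a compute node's barrier
/// control stream (names assumed for illustration).
enum ControlStreamEvent<M, E> {
    /// A normal message associated with barrier handling.
    Message(M),
    /// Option 1: the compute node explicitly sent Err(Shutdown); this must be
    /// handled even if it is not associated with an in-flight barrier.
    GracefulShutdown,
    /// Any other error keeps its existing handling.
    Error(E),
    /// Option 2: the stream ended with None; harder to tell apart from an
    /// unexpected disconnect, which is the reliability concern above.
    Closed,
}

async fn next_event<M, E>(
    stream: &mut (impl Stream<Item = Result<M, E>> + Unpin),
    is_shutdown: impl Fn(&E) -> bool,
) -> ControlStreamEvent<M, E> {
    match stream.next().await {
        Some(Ok(msg)) => ControlStreamEvent::Message(msg),
        Some(Err(e)) if is_shutdown(&e) => ControlStreamEvent::GracefulShutdown,
        Some(Err(e)) => ControlStreamEvent::Error(e),
        None => ControlStreamEvent::Closed,
    }
}
```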