feat(meta): deprecate parallel unit #17523
Merged
Task list completed / task-list-completed
succeeded
Jul 23, 2024 in 0s
4 / 4 tasks completed
All tasks have been completed
Details
Required Tasks
Task | Status |
---|---|
Removed the dependency on persistent global ParallelUnit , replacing them with dynamic temporary WorkerSlots . Previously, ParallelUnit had two semantics: Worker and Worker Parallelism. So for each Worker with parallelism of Parallelism, we generated (WorkerId, 0), (WorkerId, 1) and so on to replace it. The vnode mapping in the communication between meta and the frontend also uses worker slots, which should be simplified to WorkerMapping in the future. |
Incomplete |
Removed the VnodeMapping of Fragment . Since VnodeMapping was implemented through ParallelUnitMapping , we can derive VnodeMapping from ActorBitmaps and ActorStatus , so a persistent VnodeMapping is not needed. Moreover, binding Fragments to Workers is actually quite strange. |
Incomplete |
Modified the Reschedule interface. The previous syntax was {fragment_id}-[parallel_unit]+[parallel_unit] , which has now been changed to {fragment_id}-[worker_id:count]+[worker_id:count] {fragment_id}:[worker_id:diff] . Reschedule is now defined by modifying the number of allocations on workers. Simulation tests have all been updated to adapt. |
Incomplete |
rw_parallel_units is still retained, changed to the form of (slot_id, worker_id) . |
Incomplete |
The ParallelUnit in ActorStatus has not been removed because I haven't figured out how to handle backward compatibility elegantly yet. 🥵 We will only use the WorkerId field. If it's a new ActorStatus, we will fill ParallelUnit with u32::MAX as the id. | Incomplete |
The original interface for generating stable resize plans has been removed in this PR. Due to the introduction of alter syntax and auto scale, this feature is rarely used. If we were to modify it, this PR would continue to grow larger. Let's discuss this later. | Incomplete |
The parallel_units field in the WorkerNode message has been removed; its field number is now reserved. |
Incomplete |
A new field, uint32 parallelism , has been introduced to the WorkerNode message, enhancing its descriptive power. |
Incomplete |
Reserved vnode_mapping in TableFragments . |
Incomplete |
Replaced and reserved parallel_unit_migration_plan in MigrationPlan with a new worker_slot_migration_plan field. |
Incomplete |
Reserved parallel_unit_id in ListActorStatesFoResponse , substituting it with a new worker_id field. |
Incomplete |
The Reschedule message is now more appropriately named WorkerReschedule , with changes to the map types for added_parallel_units and removed_parallel_units . |
Incomplete |
The RescheduleRequest message has been simplified by removing the reschedules field and including worker_reschedules . |
Incomplete |
Deletion of GetReschedulePlanRequest and GetReschedulePlanResponse messages alongside their corresponding RPC method in ScaleService (i.e., GetReschedulePlan ) to reflect the new design. |
Incomplete |
Reflected the Protocol Buffers changes with a switch from using worker.parallel_units.len() to worker.parallelism as usize . |
Incomplete |
Redesigned the worker slot to worker node mapping system following the removal of parallel_units , utilizing a new structure based on the id and parallelism . |
Incomplete |
Modified the total_parallelism function to compute using the new parallelism field rather than the length of parallel_units . |
Incomplete |
WorkerSlotId now uses a custom implementation of fmt for its `Debug |
Incomplete |
I have written necessary rustdoc comments | Completed |
I have added necessary unit tests and integration tests | Completed |
I have added test labels as necessary. See [details] | Completed |
All checks passed in ./risedev check (or alias, ./risedev c ) |
Completed |
During scheduling, we use WorkerSlot to ensure this correspondence, meaning the WorkerSlots of NoShuffle's upstream and downstream are consistent. | Incomplete |
During scaling, which is the rescheduling process, the original 1-to-1 relationship is retained in the actor's dispatcher. We can rebuild based on this relationship without needing WorkerSlot alignment. For upstream and downstream in cascade no shuffle expansion, we use temporary WorkerSlots to solve it (same as scheduling). | Incomplete |
We can have a "code deletion" PR first (e.g., GetReschedulePlan ). I guess it's trivial to merge, and can reduce a lot of LoC of this PR. |
Incomplete |
Another idea is to do the "concept mapping" (or do abstraction) refactoring first: e.g., previously we have worker.parallel_units.len() . We can first refactor that into a method worker.parallelism() (without changing the implementation). And in the new PR, we just need to change the implementation. (But I'm not sure how much similar things we can do for this PR, and how much it would help) |
Incomplete |
Graphite can help a lot to work on the splitted PRs simultaneously! (specifically, automatically rebase later PRs when you make changes to prior ones) | Incomplete |
Do we persist the mapping? | Incomplete |
Do we directly use worker id? | Incomplete |
Is worker id reusable? | Incomplete |
Do we persist the mapping? | Incomplete |
Do we directly use worker id? | Incomplete |
Is worker id reusable? | Incomplete |
Loading