feat(meta): deprecate parallel unit #17523

Merged · 19 commits · Jul 23, 2024

Update assert in sink recovery test to allow equal counts

36f3c83
Task list completed — task-list-completed succeeded Jul 23, 2024: 4 / 4 tasks completed.
Removed the dependency on the persistent, global ParallelUnit, replacing it with dynamic, temporary WorkerSlots. Previously, ParallelUnit carried two meanings: the worker and the worker's parallelism. So for each worker with parallelism N, we generate (WorkerId, 0), (WorkerId, 1), ..., (WorkerId, N - 1) to replace it. The vnode mapping exchanged between the meta node and the frontend also uses worker slots; this should be simplified to a WorkerMapping in the future.
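The slot expansion described above can be sketched roughly as follows; the type and field names here are illustrative, not the exact definitions in this PR:

```rust
// Hypothetical sketch: expanding a worker's parallelism into transient
// worker slots, mirroring how (WorkerId, 0..N) replaces the old
// persistent ParallelUnit pairs.
type WorkerId = u32;

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct WorkerSlotId {
    worker_id: WorkerId,
    slot_idx: u32,
}

/// For a worker with parallelism `n`, derive its transient slots
/// (WorkerId, 0), (WorkerId, 1), ..., (WorkerId, n - 1).
fn worker_slots(worker_id: WorkerId, parallelism: u32) -> Vec<WorkerSlotId> {
    (0..parallelism)
        .map(|slot_idx| WorkerSlotId { worker_id, slot_idx })
        .collect()
}

fn main() {
    let slots = worker_slots(3, 4);
    assert_eq!(slots.len(), 4);
    assert_eq!(slots[1], WorkerSlotId { worker_id: 3, slot_idx: 1 });
    println!("{:?}", slots);
}
```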
Removed the VnodeMapping of Fragment. Since VnodeMapping was implemented on top of ParallelUnitMapping, and it can be derived from the actor bitmaps and ActorStatus, a persistent VnodeMapping is unnecessary. Moreover, binding fragments to workers was rather odd in the first place.
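A rough sketch of the derivation mentioned above, assuming per-actor vnode bitmaps and an actor-to-worker placement map; the types, names, and vnode count are illustrative, not the actual risingwave definitions:

```rust
use std::collections::HashMap;

// Hedged sketch: recover a vnode -> worker mapping from per-actor vnode
// bitmaps plus each actor's worker placement, instead of persisting a
// VnodeMapping.
const VNODE_COUNT: usize = 8; // illustrative; real systems use e.g. 256

fn derive_vnode_to_worker(
    actor_bitmaps: &HashMap<u32, Vec<bool>>, // actor_id -> owned vnodes
    actor_status: &HashMap<u32, u32>,        // actor_id -> worker_id
) -> Vec<Option<u32>> {
    let mut mapping = vec![None; VNODE_COUNT];
    for (actor_id, bitmap) in actor_bitmaps {
        let worker_id = actor_status[actor_id];
        for (vnode, owned) in bitmap.iter().enumerate() {
            if *owned {
                mapping[vnode] = Some(worker_id);
            }
        }
    }
    mapping
}

fn main() {
    let mut bitmaps = HashMap::new();
    bitmaps.insert(1, vec![true, true, true, true, false, false, false, false]);
    bitmaps.insert(2, vec![false, false, false, false, true, true, true, true]);
    let mut status = HashMap::new();
    status.insert(1, 10); // actor 1 runs on worker 10
    status.insert(2, 20); // actor 2 runs on worker 20
    let mapping = derive_vnode_to_worker(&bitmaps, &status);
    assert_eq!(mapping[0], Some(10));
    assert_eq!(mapping[7], Some(20));
}
```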
Modified the Reschedule interface. The previous syntax, {fragment_id}-[parallel_unit]+[parallel_unit], has been changed from {fragment_id}-[worker_id:count]+[worker_id:count] to {fragment_id}:[worker_id:diff]: a reschedule is now defined by modifying the number of allocations on each worker. All simulation tests have been updated to match.
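A minimal sketch of parsing the new {fragment_id}:[worker_id:diff] form, where diff is the signed change in slots on that worker; the parser shape and exact grammar here are assumptions for illustration:

```rust
// Hedged sketch: parse "15:[1:+2,2:-1]" into a fragment id plus a list
// of (worker_id, diff) pairs. The real grammar may differ.
fn parse_reschedule(input: &str) -> Option<(u32, Vec<(u32, i32)>)> {
    let (fragment, rest) = input.split_once(':')?;
    let fragment_id: u32 = fragment.trim().parse().ok()?;
    let body = rest.trim().strip_prefix('[')?.strip_suffix(']')?;
    let mut diffs = Vec::new();
    for pair in body.split(',') {
        let (worker, diff) = pair.split_once(':')?;
        // `+2` and `-1` both parse as signed i32 diffs.
        diffs.push((worker.trim().parse().ok()?, diff.trim().parse().ok()?));
    }
    Some((fragment_id, diffs))
}

fn main() {
    let (fragment_id, diffs) = parse_reschedule("15:[1:+2,2:-1]").unwrap();
    assert_eq!(fragment_id, 15);
    assert_eq!(diffs, vec![(1, 2), (2, -1)]);
}
```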
Considering compatibility, rw_parallel_units is still retained, changed to the form (slot_id, worker_id).
The ParallelUnit in ActorStatus has not been removed, because I haven't yet figured out how to handle backward compatibility elegantly. 🥵 Only the WorkerId field will be used; for a new ActorStatus, ParallelUnit is filled with u32::MAX as the id.
The original interface for generating stable resize plans has been removed in this PR. With the introduction of the ALTER syntax and auto scaling, this feature is rarely used, and modifying it would make this PR grow even larger. Let's discuss this later.
The parallel_units field in the WorkerNode message has been removed; its field number is now reserved.
A new field, uint32 parallelism, has been introduced to the WorkerNode message, enhancing its descriptive power.
Reserved vnode_mapping in TableFragments.
Replaced parallel_unit_migration_plan in MigrationPlan with a new worker_slot_migration_plan field, reserving the old one.
Reserved parallel_unit_id in ListActorStatesResponse, substituting it with a new worker_id field.
The Reschedule message is now more appropriately named WorkerReschedule, with changes to the map types for added_parallel_units and removed_parallel_units.
The RescheduleRequest message has been simplified by removing the reschedules field and including worker_reschedules.
Deleted the GetReschedulePlanRequest and GetReschedulePlanResponse messages alongside their corresponding RPC method in ScaleService (i.e., GetReschedulePlan) to reflect the new design.
Reflected the Protocol Buffers changes with a switch from worker.parallel_units.len() to worker.parallelism as usize.
Redesigned the worker-slot-to-worker-node mapping following the removal of parallel_units, using a new structure based on the worker id and parallelism.
Modified the total_parallelism function to compute using the new parallelism field rather than the length of parallel_units.
WorkerSlotId now uses a custom implementation of `fmt` for its `Debug` representation.
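The custom Debug mentioned above might look like the sketch below, assuming WorkerSlotId packs the worker id and slot index into a single u64; the packing scheme and output format are illustrative assumptions:

```rust
use std::fmt;

// Hedged sketch: a WorkerSlotId packed from worker id (high 32 bits)
// and slot index (low 32 bits), with a custom Debug that prints a
// readable "[worker:slot]" form instead of the raw packed value.
#[derive(Clone, Copy, PartialEq, Eq, Hash)]
struct WorkerSlotId(u64);

impl WorkerSlotId {
    fn new(worker_id: u32, slot_idx: u32) -> Self {
        Self(((worker_id as u64) << 32) | slot_idx as u64)
    }
    fn worker_id(&self) -> u32 {
        (self.0 >> 32) as u32
    }
    fn slot_idx(&self) -> u32 {
        self.0 as u32
    }
}

impl fmt::Debug for WorkerSlotId {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "[{}:{}]", self.worker_id(), self.slot_idx())
    }
}

fn main() {
    let id = WorkerSlotId::new(7, 3);
    assert_eq!(format!("{:?}", id), "[7:3]");
}
```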
I have written necessary rustdoc comments
I have added necessary unit tests and integration tests
I have added test labels as necessary. See [details]
All checks passed in `./risedev check` (or alias, `./risedev c`)
During scheduling, we use WorkerSlots to ensure this correspondence: the WorkerSlots of a NoShuffle's upstream and downstream are kept consistent.
During scaling (i.e., the rescheduling process), the original 1-to-1 relationship is retained in the actor's dispatcher, so we can rebuild from that relationship without WorkerSlot alignment. For the upstream and downstream in cascading NoShuffle expansion, we use temporary WorkerSlots to solve it, the same as in scheduling.
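The alignment idea above can be sketched as scheduling both sides of a NoShuffle pair onto the same ordered slot list, so actor i of each fragment lands on the same slot; the function and type names are hypothetical:

```rust
// Hedged sketch: keep NoShuffle upstream/downstream actors aligned by
// assigning both fragments the same ordered list of worker slots.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct WorkerSlotId {
    worker_id: u32,
    slot_idx: u32,
}

/// Returns (upstream_actor, downstream_actor, shared slot) triples.
fn schedule_no_shuffle_pair(
    slots: &[WorkerSlotId],
    upstream_actors: &[u32],
    downstream_actors: &[u32],
) -> Vec<(u32, u32, WorkerSlotId)> {
    assert_eq!(upstream_actors.len(), downstream_actors.len());
    assert!(slots.len() >= upstream_actors.len());
    upstream_actors
        .iter()
        .zip(downstream_actors)
        .zip(slots)
        .map(|((&up, &down), &slot)| (up, down, slot))
        .collect()
}

fn main() {
    let slots = [
        WorkerSlotId { worker_id: 1, slot_idx: 0 },
        WorkerSlotId { worker_id: 2, slot_idx: 0 },
    ];
    let plan = schedule_no_shuffle_pair(&slots, &[100, 101], &[200, 201]);
    // upstream 100 and downstream 200 share slot (1, 0)
    assert_eq!(plan[0], (100, 200, slots[0]));
}
```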
We can have a "code deletion" PR first (e.g., removing GetReschedulePlan). I guess it's trivial to merge, and it can shave a lot of LoC off this PR.
Another idea is to do the "concept mapping" (or abstraction) refactoring first: e.g., previously we have worker.parallel_units.len(). We can first refactor that into a method worker.parallelism() (without changing the implementation), and in the new PR we just need to change the implementation. (But I'm not sure how many similar things we can do for this PR, or how much it would help.)
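The refactor-first idea can be sketched as below: introduce the accessor while it still wraps the old field, so call sites migrate before the representation changes. The struct layout here is a simplified stand-in for the real WorkerNode:

```rust
// Hedged sketch of the "concept mapping first" refactor: step 1 adds
// worker.parallelism() as a thin wrapper over the old field, so call
// sites switch from worker.parallel_units.len() to worker.parallelism()
// with no behavior change; step 2 (the follow-up PR) swaps the
// implementation to the new uint32 parallelism field.
struct ParallelUnit {
    id: u32,
}

struct WorkerNode {
    parallel_units: Vec<ParallelUnit>,
}

impl WorkerNode {
    /// Step 1: implementation unchanged, call sites migrate.
    fn parallelism(&self) -> usize {
        self.parallel_units.len()
    }
}

fn main() {
    let w = WorkerNode {
        parallel_units: vec![ParallelUnit { id: 0 }, ParallelUnit { id: 1 }],
    };
    assert_eq!(w.parallelism(), 2);
}
```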
Graphite can help a lot when working on the split PRs simultaneously (specifically, it automatically rebases later PRs when you make changes to earlier ones).
Do we persist the mapping? Incomplete
Do we directly use worker id? Incomplete
Is worker id reusable? Incomplete