Helix Node/Instance Swap #2662

zpinto · 2023-10-17T20:31:54Z

Helix Node/Instance Swap

Implementation of an N -> N + 1 -> N instance swap covering all replicas on the instance being replaced.

Bootstrap new replicas on the SWAP_IN node: New replicas are created for all the partitions hosted on the node to be replaced. These new replicas are then assigned to and bootstrapped on the SWAP_IN node.
Remove replicas from the SWAP_OUT node: After all the new replicas have bootstrapped successfully, the SWAP_OUT node has its replicas dropped. Waiting until all replicas bootstrap on the SWAP_IN node ensures that every partition keeps its desired replication factor.
Do not populate SWAP_IN replicas in routing tables until the swap is completed. (This avoids spectators sending traffic to the SWAP_IN node.)
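The two phases above can be sketched as plain set operations. This is only an illustration of the N -> N + 1 -> N replica count, not Helix code: the `Assignment` type, the function names, and the node names are all hypothetical.

```python
from typing import Dict, Set

# Hypothetical model: partition name -> set of instance names hosting a replica.
Assignment = Dict[str, Set[str]]


def bootstrap_on_swap_in(assignment: Assignment, swap_out: str, swap_in: str) -> Assignment:
    """Phase 1: add a SWAP_IN replica for every partition hosted on SWAP_OUT (N -> N + 1)."""
    return {
        partition: (hosts | {swap_in}) if swap_out in hosts else set(hosts)
        for partition, hosts in assignment.items()
    }


def drop_from_swap_out(assignment: Assignment, swap_out: str) -> Assignment:
    """Phase 2: drop every replica hosted on SWAP_OUT (N + 1 -> N)."""
    return {partition: hosts - {swap_out} for partition, hosts in assignment.items()}


# Example with N = 2: node-b is swapped out and node-b2 is swapped in.
before = {"p0": {"node-a", "node-b"}, "p1": {"node-b", "node-c"}}
during = bootstrap_on_swap_in(before, swap_out="node-b", swap_in="node-b2")
after = drop_from_swap_out(during, swap_out="node-b")
```

In this example every partition has three replicas in `during` and returns to two in `after`, with `node-b2` taking over each slot previously held by `node-b`.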

Requirements

Swapping a node must behave like a transaction. While new replicas are bootstrapping on the SWAP_IN host, the replica count of all partitions should reach N + 1 (with the replicas in topState or secondTopState) before the SWAP_OUT node can begin dropping replicas, which returns the replica count to N.
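One possible reading of this requirement is a gate that blocks the drop phase until every partition on the SWAP_OUT node also has a SWAP_IN replica in an accepted state. The sketch below assumes a hypothetical `StateMap` shape and a MasterSlave-style state model; it is not the actual `HelixAdmin.canCompleteSwap` implementation.

```python
from typing import Dict, Set

# Hypothetical model: partition name -> {instance name -> current state}.
StateMap = Dict[str, Dict[str, str]]


def can_complete_swap(
    current_states: StateMap,
    swap_out: str,
    swap_in: str,
    accepted_states: Set[str],
) -> bool:
    """Return True only if every partition hosted on SWAP_OUT also has a
    SWAP_IN replica whose state is in accepted_states (assumed here to be
    the topState and secondTopState of the state model)."""
    for replicas in current_states.values():
        if swap_out not in replicas:
            continue  # this partition is not affected by the swap
        if replicas.get(swap_in) not in accepted_states:
            return False  # SWAP_IN replica is missing or still bootstrapping
    return True


accepted = {"MASTER", "SLAVE"}  # assumed topState / secondTopState
in_progress = {
    "p0": {"node-a": "MASTER", "node-b": "SLAVE", "node-b2": "SLAVE"},
    "p1": {"node-b": "MASTER", "node-c": "SLAVE", "node-b2": "OFFLINE"},
}
ready = {
    "p0": {"node-a": "MASTER", "node-b": "SLAVE", "node-b2": "SLAVE"},
    "p1": {"node-b": "MASTER", "node-c": "SLAVE", "node-b2": "SLAVE"},
}
```

With these inputs, `can_complete_swap(in_progress, "node-b", "node-b2", accepted)` is false because `p1` is still OFFLINE on `node-b2`, while the same call on `ready` is true.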

Assumptions

When a node is being swapped, the following criteria must be met for the SWAP_IN node:

The cluster must have a TOPOLOGY with a FAULT_ZONE_TYPE and an EndNodeType:

EndNodeType is the last key in the TOPOLOGY. The corresponding value is considered to be the logicalId. This is necessary as logicalId will need to be used for deterministic placement when the SWAP_IN node is introduced as an assignable node.

The DOMAIN of the SWAP_IN and SWAP_OUT nodes must have identical fault zone and logicalId.
This will be further explained in the topology section.
If WAGED is enabled in the cluster, the INSTANCE_CAPACITY_MAP of the two nodes should match exactly.

Without these assumptions, it is not possible to swap a node, as the result would not be considered the best possible assignment for the cluster. If these assumptions cannot be met, this functionality should not be used. Instead, the old node should go through the Node Evacuation flow. A new instance can be added in the ENABLED state either before or after the evacuation to replenish the capacity lost by removing the old node.
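The assumptions above could be checked before a swap is accepted, roughly as follows. This sketch assumes that TOPOLOGY is a slash-separated path such as `/zone/host` and that DOMAIN is a comma-separated list of `key=value` pairs; the function names and the violation messages are hypothetical and do not mirror the real sanity checks.

```python
from typing import Dict, List, Optional


def parse_domain(domain: str) -> Dict[str, str]:
    """Parse a DOMAIN string such as "zone=z1,host=h7" into a dict."""
    pairs = (item.split("=", 1) for item in domain.split(",") if item.strip())
    return {key.strip(): value.strip() for key, value in pairs}


def end_node_type(topology: str) -> str:
    """Return the last key of a TOPOLOGY path such as "/zone/host"."""
    return topology.strip("/").split("/")[-1]


def swap_violations(
    topology: str,
    fault_zone_type: str,
    swap_out_domain: str,
    swap_in_domain: str,
    waged_enabled: bool = False,
    swap_out_capacity: Optional[Dict[str, int]] = None,
    swap_in_capacity: Optional[Dict[str, int]] = None,
) -> List[str]:
    """List every assumption that the SWAP_IN / SWAP_OUT pair violates."""
    out_domain = parse_domain(swap_out_domain)
    in_domain = parse_domain(swap_in_domain)
    violations = []

    def same_value(key: str) -> bool:
        # Both nodes must define the key and agree on its value.
        return key in out_domain and out_domain[key] == in_domain.get(key)

    if not same_value(fault_zone_type):
        violations.append("fault zone mismatch")
    if not same_value(end_node_type(topology)):
        violations.append("logicalId mismatch")
    if waged_enabled and swap_out_capacity != swap_in_capacity:
        violations.append("capacity mismatch")
    return violations
```

An empty list means the pair satisfies the assumptions under this model; any entry suggests falling back to the Node Evacuation flow instead.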

Execution

Add sanity checks for SWAP_IN and SWAP_OUT to HelixAdmin.setInstanceOperation and addInstance.
Create HelixAdmin.canCompleteSwap and completeSwapIfPossible (returns whether the swap finished).
Refactor BaseControllerDataProvider to only allow AssignableNodes (either SWAP_OUT or SWAP_IN), and update all call sites to use either getAssignableInstances or getAllInstances.
Refactor BestPossibleStateCalcStage to add the SWAP_IN node with correct states to stateMaps containing the SWAP_OUT node, following partitionAssignment and partitionState calculation.
Refactor WAGED to do assignment based on logicalId instead of instanceName.
Refactor the routing provider to not include replicas from SWAP_IN nodes in routing tables.
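As an illustration of the last item, a routing view can be derived by filtering out replicas hosted on nodes that are still marked SWAP_IN. The data shapes and the `routable_replicas` name below are assumptions made for this sketch and are not taken from the Helix routing provider.

```python
from typing import Dict, Set

# Hypothetical model: partition name -> {instance name -> current state}.
StateMap = Dict[str, Dict[str, str]]


def routable_replicas(external_view: StateMap, swap_in_instances: Set[str]) -> StateMap:
    """Return a copy of the view without replicas hosted on SWAP_IN nodes,
    so that spectators never route traffic to a node whose swap is unfinished."""
    return {
        partition: {
            instance: state
            for instance, state in replicas.items()
            if instance not in swap_in_instances
        }
        for partition, replicas in external_view.items()
    }


view = {
    "p0": {"node-a": "MASTER", "node-b": "SLAVE", "node-b2": "SLAVE"},
    "p1": {"node-b": "MASTER", "node-c": "SLAVE", "node-b2": "SLAVE"},
}
routing = routable_replicas(view, swap_in_instances={"node-b2"})
```

Once the swap completes and the node is no longer marked SWAP_IN, passing an empty set leaves the view unchanged and the new replicas become routable.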


zpinto · 2023-12-20T22:14:57Z

This issue has now been closed with the merging of the Application Cluster Manager branch. #2714

xyuanlu assigned xyuanlu and zpinto and unassigned xyuanlu Oct 17, 2023

zpinto mentioned this issue Oct 17, 2023

HelixAdmin APIs and pipeline changes to support Helix Node Swap #2661

Merged

21 tasks

This was referenced Dec 13, 2023

Build Topology with only required levels (FaultZone and EndNode) #2712

Merged

Build Topology with only required levels (FaultZone and EndNode) #2713

Merged

xyuanlu mentioned this issue Dec 18, 2023

Merge feature branch - Application cluster manager #2714

Merged

3 tasks

zpinto closed this as completed Dec 20, 2023
