
Added support to perform cluster promotion/demotion
Signed-off-by: Utkarsh Bhatt <[email protected]>
UtkarshBhatthere committed Oct 20, 2024
1 parent f68f6e7 commit e36a368
Showing 17 changed files with 631 additions and 29 deletions.
3 changes: 3 additions & 0 deletions .github/workflows/tests.yml
@@ -718,6 +718,9 @@ jobs:
- name: Verify RBD mirror
run : ~/actionutils.sh remote_verify_rbd_mirroring

- name: Failover site A to site B
  run : ~/actionutils.sh remote_failover_to_siteb

- name: Disable RBD mirror
run : ~/actionutils.sh remote_disable_rbd_mirroring

1 change: 1 addition & 0 deletions docs/how-to/index.rst
@@ -55,6 +55,7 @@ RBD pools and images.

import-remote-cluster
configure-rbd-mirroring
perform-site-failover

Upgrading your cluster
----------------------
81 changes: 81 additions & 0 deletions docs/how-to/perform-site-failover.rst
@@ -0,0 +1,81 @@
=============================================
Perform failover for replicated RBD resources
=============================================

In case of a disaster, all replicated RBD pools can be failed over to a non-primary remote.

An operator can promote a non-primary cluster; this in turn promotes all replicated RBD
images in all RBD pools and makes them primary. This enables them to be consumed by VMs
and other workloads.

Prerequisites
--------------
1. A primary and a secondary MicroCeph cluster, for example named "primary_cluster" and "secondary_cluster".
2. primary_cluster has imported configurations from secondary_cluster and vice versa. Refer to :doc:`import remote <./import-remote-cluster>`.
3. RBD remote replication is configured for at least one RBD image. Refer to :doc:`configure rbd replication <./configure-rbd-mirroring>`.

Failover to a non-primary remote cluster
-----------------------------------------
List all the resources on 'secondary_cluster' to check primary status.

.. code-block:: none

   sudo microceph remote replication rbd list
   +-----------+------------+------------+---------------------+
   | POOL NAME | IMAGE NAME | IS PRIMARY | LAST LOCAL UPDATE   |
   +-----------+------------+------------+---------------------+
   | pool_one  | image_one  | false      | 2024-10-14 09:03:17 |
   | pool_one  | image_two  | false      | 2024-10-14 09:03:17 |
   +-----------+------------+------------+---------------------+

An operator can perform a cluster-wide promotion as follows:

.. code-block:: none

   sudo microceph remote replication rbd promote --remote primary_cluster --yes-i-really-mean-it

Here, the ``--remote`` parameter helps MicroCeph filter the resources to promote.
Since promoting secondary_cluster may cause a split-brain condition in the future,
it is necessary to pass the ``--yes-i-really-mean-it`` flag.
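
Without the flag, the promotion request is rejected with a risk warning; an
illustrative run (the exact message may differ, compare the demote example
later in this document):

.. code-block:: none

   sudo microceph remote replication rbd promote --remote primary_cluster
   failed to process promote_replication request for rbd: promotion may cause data loss on this
   cluster. If you understand the *RISK* and you're *ABSOLUTELY CERTAIN* that is what you want,
   pass --yes-i-really-mean-it.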

Verify RBD remote replication primary status
---------------------------------------------

Check the status of 'pool_one' on 'secondary_cluster' again to verify the primary status.

.. code-block:: none

   sudo microceph remote replication rbd status pool_one
   +-----------+------------+------------+---------------------+
   | POOL NAME | IMAGE NAME | IS PRIMARY | LAST LOCAL UPDATE   |
   +-----------+------------+------------+---------------------+
   | pool_one  | image_one  | true       | 2024-10-14 09:06:12 |
   | pool_one  | image_two  | true       | 2024-10-14 09:06:12 |
   +-----------+------------+------------+---------------------+

The status shows that there are two replicated images and both of them are now primary.

Failback to old primary
------------------------

Once the disaster-struck cluster (primary_cluster) is back online, the RBD resources
can be failed back to it. By this time, however, the RBD images at the current primary
(secondary_cluster) will have diverged from primary_cluster. Thus, to get a clean sync,
the operator must decide which cluster is demoted to non-primary status. This cluster
will then receive the RBD mirror updates from the standing primary.

Note: Demotion can cause data loss and hence can only be performed with the ``--yes-i-really-mean-it`` flag.

At primary_cluster (the primary before the disaster), perform the demotion:

.. code-block:: none

   sudo microceph remote replication rbd demote --remote secondary_cluster
   failed to process demote_replication request for rbd: demotion may cause data loss on this cluster. If you
   understand the *RISK* and you're *ABSOLUTELY CERTAIN* that is what you want, pass --yes-i-really-mean-it.

Now, again at primary_cluster, perform the demotion with the ``--yes-i-really-mean-it`` flag:

.. code-block:: none

   sudo microceph remote replication rbd demote --remote secondary_cluster --yes-i-really-mean-it

Note: MicroCeph will demote the primary pools and issue a resync for all mirrored
images; this may cause data loss at the old primary cluster.
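
To confirm, list the resources on the demoted cluster (primary_cluster) once
more; a sketch of the expected output, assuming the same two images
(timestamps are illustrative):

.. code-block:: none

   sudo microceph remote replication rbd list
   +-----------+------------+------------+---------------------+
   | POOL NAME | IMAGE NAME | IS PRIMARY | LAST LOCAL UPDATE   |
   +-----------+------------+------------+---------------------+
   | pool_one  | image_one  | false      | 2024-10-14 09:12:30 |
   | pool_one  | image_two  | false      | 2024-10-14 09:12:30 |
   +-----------+------------+------------+---------------------+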
29 changes: 29 additions & 0 deletions docs/reference/commands/remote-replication-rbd.rst
@@ -96,3 +96,32 @@ Usage:
--force forcefully disable replication for rbd resource
``promote``
------------

Promote local cluster to primary

Usage:

.. code-block:: none

   microceph remote replication rbd promote [flags]

.. code-block:: none

   --remote remote MicroCeph cluster name
   --force  forcefully promote site to primary
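
For example, promoting the local cluster relative to a remote named
primary_cluster (an illustrative invocation, matching the failover how-to):

.. code-block:: none

   sudo microceph remote replication rbd promote --remote primary_cluster --yes-i-really-mean-it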
``demote``
------------

Demote local cluster to secondary

Usage:

.. code-block:: none

   microceph remote replication rbd demote [flags]

.. code-block:: none

   --remote remote MicroCeph cluster name
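
For example, demoting the local cluster relative to a remote named
secondary_cluster (an illustrative invocation, matching the failover how-to):

.. code-block:: none

   sudo microceph remote replication rbd demote --remote secondary_cluster --yes-i-really-mean-it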
9 changes: 9 additions & 0 deletions microceph/api/ops_replication.go
@@ -31,6 +31,7 @@ var opsReplicationCmd = rest.Endpoint{
var opsReplicationWorkloadCmd = rest.Endpoint{
Path: "ops/replication/{wl}",
Get: rest.EndpointAction{Handler: getOpsReplicationWorkload, ProxyTarget: false},
Put: rest.EndpointAction{Handler: putOpsReplicationWorkload, ProxyTarget: false},
}

// CRUD Replication
@@ -47,6 +48,12 @@ func getOpsReplicationWorkload(s state.State, r *http.Request) response.Response
return cmdOpsReplication(s, r, types.ListReplicationRequest)
}

// putOpsReplicationWorkload handles site-level (promote/demote) operations
func putOpsReplicationWorkload(s state.State, r *http.Request) response.Response {
// either promote or demote (already encoded in request)
return cmdOpsReplication(s, r, "")
}

// getOpsReplicationResource handles status operation for a certain resource.
func getOpsReplicationResource(s state.State, r *http.Request) response.Response {
return cmdOpsReplication(s, r, types.StatusReplicationRequest)
@@ -105,6 +112,8 @@ func cmdOpsReplication(s state.State, r *http.Request, patchRequest types.Replic
return response.SmartError(fmt.Errorf("unknown workload %s, resource %s", wl, resource))
}

logger.Debugf("REPOPS: %s received for %s: %s", req.GetWorkloadRequestType(), wl, resource)

return handleReplicationRequest(s, r.Context(), req)
}

10 changes: 7 additions & 3 deletions microceph/api/types/replication.go
@@ -12,9 +12,13 @@ type ReplicationRequestType string
const (
EnableReplicationRequest ReplicationRequestType = "POST-" + constants.EventEnableReplication
ConfigureReplicationRequest ReplicationRequestType = "PUT-" + constants.EventConfigureReplication
DisableReplicationRequest ReplicationRequestType = "DELETE-" + constants.EventDisableReplication
StatusReplicationRequest ReplicationRequestType = "GET-" + constants.EventStatusReplication
ListReplicationRequest ReplicationRequestType = "GET-" + constants.EventListReplication
PromoteReplicationRequest ReplicationRequestType = "PUT-" + constants.EventPromoteReplication
DemoteReplicationRequest ReplicationRequestType = "PUT-" + constants.EventDemoteReplication
// Delete Requests
DisableReplicationRequest ReplicationRequestType = "DELETE-" + constants.EventDisableReplication
// Get Requests
StatusReplicationRequest ReplicationRequestType = "GET-" + constants.EventStatusReplication
ListReplicationRequest ReplicationRequestType = "GET-" + constants.EventListReplication
)

type CephWorkloadType string
102 changes: 100 additions & 2 deletions microceph/ceph/rbd_mirror.go
@@ -214,8 +214,8 @@ func DisablePoolMirroring(pool string, peer RbdReplicationPeer, localName string
return nil
}

// DisableMirroringAllImagesInPool disables mirroring for all images for a pool enabled in pool mirroring mode.
func DisableMirroringAllImagesInPool(poolName string) error {
// DisableAllMirroringImagesInPool disables mirroring for all images for a pool enabled in pool mirroring mode.
func DisableAllMirroringImagesInPool(poolName string) error {
poolStatus, err := GetRbdMirrorVerbosePoolStatus(poolName, "", "")
if err != nil {
err := fmt.Errorf("failed to fetch status for %s pool: %v", poolName, err)
@@ -236,6 +236,28 @@ func DisableMirroringAllImagesInPool(poolName string) error {
return nil
}

// ResyncAllMirroringImagesInPool triggers a resync for all mirroring images inside a mirroring pool.
func ResyncAllMirroringImagesInPool(poolName string) error {
poolStatus, err := GetRbdMirrorVerbosePoolStatus(poolName, "", "")
if err != nil {
err := fmt.Errorf("failed to fetch status for %s pool: %v", poolName, err)
logger.Error(err.Error())
return err
}

flaggedImages := []string{}
for _, image := range poolStatus.Images {
err := flagImageForResync(poolName, image.Name)
if err != nil {
return fmt.Errorf("failed to resync %s/%s: %v", poolName, image.Name, err)
}
flaggedImages = append(flaggedImages, image.Name)
}

logger.Debugf("REPRBD: Resynced %v images in %s pool.", flaggedImages, poolName)
return nil
}

// getPeerUUID returns the peer ID for the requested peer name.
func getPeerUUID(pool string, peerName string, client string, cluster string) (string, error) {
poolInfo, err := GetRbdMirrorPoolInfo(pool, cluster, client)
@@ -304,6 +326,7 @@ func BootstrapPeer(pool string, localName string, remoteName string) error {
}

// ############################# Ceph Commands #############################
// configurePoolMirroring enables/disables mirroring for a pool.
func configurePoolMirroring(pool string, mode types.RbdResourceType, localName string, remoteName string) error {
var args []string
if mode == types.RbdResourceDisabled {
@@ -361,6 +384,7 @@ func configureImageMirroring(req types.RbdReplicationRequest) error {
return nil
}

// getSnapshotSchedule fetches the schedule of the snapshots.
func getSnapshotSchedule(pool string, image string) (imageSnapshotSchedule, error) {
if len(pool) == 0 || len(image) == 0 {
return imageSnapshotSchedule{}, fmt.Errorf("ImageName(%s/%s) not complete", pool, image)
@@ -484,6 +508,42 @@ func configureImageFeatures(pool string, image string, op string, feature string
return nil
}

// enableRbdImageFeatures enables the list of rbd features on the requested resource.
func enableRbdImageFeatures(poolName string, imageName string, features []string) error {
for _, feature := range features {
err := configureImageFeatures(poolName, imageName, "enable", feature)
if err != nil && !strings.Contains(err.Error(), "one or more requested features are already enabled") {
return err
}
}
return nil
}

// disableRbdImageFeatures disables the list of rbd features on the requested resource.
func disableRbdImageFeatures(poolName string, imageName string, features []string) error {
for _, feature := range features {
err := configureImageFeatures(poolName, imageName, "disable", feature)
if err != nil {
return err
}
}
return nil
}

// flagImageForResync flags requested mirroring image in the given pool for resync.
func flagImageForResync(poolName string, imageName string) error {
args := []string{
"mirror", "image", "resync", fmt.Sprintf("%s/%s", poolName, imageName),
}

_, err := processExec.RunCommand("rbd", args...)
if err != nil {
return err
}

return nil
}

// peerBootstrapCreate generates peer bootstrap token on remote ceph cluster.
func peerBootstrapCreate(pool string, client string, cluster string) (string, error) {
args := []string{
@@ -548,6 +608,44 @@ func peerRemove(pool string, peerId string, localName string, remoteName string)
return nil
}

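// promotePool promotes the given rbd pool to primary, optionally forcing the promotion.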
func promotePool(poolName string, isForce bool, remoteName string, localName string) error {
args := []string{
"mirror", "pool", "promote", poolName,
}

if isForce {
args = append(args, "--force")
}

// add --cluster and --id args
args = appendRemoteClusterArgs(args, remoteName, localName)

output, err := processExec.RunCommand("rbd", args...)
if err != nil {
return fmt.Errorf("failed to promote pool(%s): %v", poolName, err)
}

logger.Debugf("REPRBD: Promotion Output: %s", output)
return nil
}

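// demotePool demotes the given rbd pool to non-primary.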
func demotePool(poolName string, remoteName string, localName string) error {
args := []string{
"mirror", "pool", "demote", poolName,
}

// add --cluster and --id args
args = appendRemoteClusterArgs(args, remoteName, localName)

output, err := processExec.RunCommand("rbd", args...)
if err != nil {
return fmt.Errorf("failed to demote pool(%s): %v", poolName, err)
}

logger.Debugf("REPRBD: Demotion Output: %s", output)
return nil
}

// ########################### HELPERS ###########################

func IsRemoteConfiguredForRbdMirror(remoteName string) bool {
40 changes: 40 additions & 0 deletions microceph/ceph/rbd_mirror_test.go
@@ -1,6 +1,7 @@
package ceph

import (
"fmt"
"os"
"testing"

@@ -93,3 +94,42 @@ func (ks *RbdMirrorSuite) TestPoolInfo() {
assert.Equal(ks.T(), resp.LocalSiteName, "magical")
assert.Equal(ks.T(), resp.Peers[0].RemoteName, "simple")
}

func (ks *RbdMirrorSuite) TestPromotePoolOnSecondary() {
r := mocks.NewRunner(ks.T())
output, _ := os.ReadFile("./test_assets/rbd_mirror_promote_secondary_failure.txt")

// mocks and expectations
r.On("RunCommand", []interface{}{
"rbd", "mirror", "pool", "promote", "pool"}...).Return("", fmt.Errorf("%s", string(output))).Once()
r.On("RunCommand", []interface{}{
"rbd", "mirror", "pool", "promote", "pool", "--force"}...).Return("ok", nil).Once()
processExec = r

// Test standard promotion.
err := handlePoolPromotion("pool", false)
assert.ErrorContains(ks.T(), err, "If you understand the *RISK* and you're *ABSOLUTELY CERTAIN*")

err = handlePoolPromotion("pool", true)
assert.NoError(ks.T(), err)
}

func (ks *RbdMirrorSuite) TestDemotePoolOnSecondary() {
r := mocks.NewRunner(ks.T())

output, _ := os.ReadFile("./test_assets/rbd_mirror_verbose_pool_status.json")

// mocks and expectations
r.On("RunCommand", []interface{}{
"rbd", "mirror", "pool", "demote", "pool"}...).Return("ok", nil).Once()
r.On("RunCommand", []interface{}{
"rbd", "mirror", "pool", "status", "pool", "--verbose", "--format", "json"}...).Return(string(output), nil).Once()
r.On("RunCommand", []interface{}{
"rbd", "mirror", "image", "resync", "pool/image_one"}...).Return("ok", nil).Once()
r.On("RunCommand", []interface{}{
"rbd", "mirror", "image", "resync", "pool/image_two"}...).Return("ok", nil).Once()
processExec = r

// Test standard demotion.
err := handlePoolDemotion("pool")
assert.NoError(ks.T(), err)
}
