From 8f46603c7e9e85ba80e7353ac80d4fbc2177841b Mon Sep 17 00:00:00 2001
From: Kamoltat Sirivadhna
Date: Mon, 25 Nov 2024 21:17:37 +0000
Subject: [PATCH] doc/rados/operations/stretch-mode.rst: Added Limitations to stretch pool configurations

Fixes: https://tracker.ceph.com/issues/68338

Signed-off-by: Kamoltat Sirivadhna
sd
Signed-off-by: Kamoltat Sirivadhna
---
 doc/rados/operations/stretch-mode.rst | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)

diff --git a/doc/rados/operations/stretch-mode.rst b/doc/rados/operations/stretch-mode.rst
index ffb94e52943d6..e8be5e13e6ada 100644
--- a/doc/rados/operations/stretch-mode.rst
+++ b/doc/rados/operations/stretch-mode.rst
@@ -94,6 +94,29 @@ configuration across the entire cluster. Conversely, opt for a ``stretch pool``
 when you need a particular pool to be replicated across ``more than two data centers``,
 providing a more granular level of control and a larger cluster size.
 
+Limitations
+-----------
+
+Individual Stretch Pools do not support I/O operations during a netsplit
+scenario between two or more zones. While the cluster remains accessible for
+basic Ceph commands, I/O remains unavailable until the netsplit is
+resolved. This is different from ``stretch mode``, where the tiebreaker monitor
+can isolate one zone of the cluster and continue I/O operations in degraded
+mode during a netsplit. See :ref:`stretch_mode1`.
+
+Ceph is designed to tolerate multiple host failures. However, if more than 25% of
+the OSDs in the cluster go down, Ceph may stop marking OSDs as out, which prevents rebalancing
+and may leave some PGs inactive. This behavior is controlled by the ``mon_osd_min_in_ratio`` parameter.
+By default, ``mon_osd_min_in_ratio`` is set to 0.75, meaning that at least 75% of the OSDs
+in the cluster must remain ``in`` before any additional OSDs can be marked out.
+This setting prevents too many OSDs from being marked out, as this might lead to significant
+data movement. The data movement can cause high client I/O impact and long recovery times when
+the OSDs are returned to service. If Ceph stops marking OSDs as out, some PGs may fail to
+rebalance to surviving OSDs, potentially leading to ``inactive`` PGs.
+See https://tracker.ceph.com/issues/68338 for more information.
+
+.. _stretch_mode1:
+
 Stretch Mode
 ============
 Stretch mode is designed to handle deployments in which you cannot guarantee the
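
The ``mon_osd_min_in_ratio`` threshold described in the Limitations text can be illustrated with a small model. This is a minimal sketch, not Ceph's implementation: the function name ``simulate_mark_out`` and the rule that an OSD may be marked out only while the current ``in`` ratio is at or above the threshold are assumptions made for illustration.

```python
def simulate_mark_out(total_osds, failed_osds, min_in_ratio=0.75):
    """Toy model of the mon_osd_min_in_ratio guard (hypothetical helper).

    Assumption: a failed OSD is marked out only while the fraction of
    OSDs currently "in" is at or above min_in_ratio. Returns a tuple
    (marked_out, left_down_but_in).
    """
    in_osds = total_osds
    marked_out = 0
    for _ in range(failed_osds):
        # Compare against the absolute threshold to avoid repeated division.
        if in_osds >= min_in_ratio * total_osds:
            in_osds -= 1
            marked_out += 1
        else:
            # Threshold reached: remaining failed OSDs stay "in" while down,
            # so their PGs are not remapped to surviving OSDs.
            break
    return marked_out, failed_osds - marked_out


if __name__ == "__main__":
    for failed in (10, 40):
        out, stuck = simulate_mark_out(100, failed)
        print(f"failed={failed}: marked out={out}, down but still in={stuck}")
```

In this model, a small failure is absorbed completely, while a large one leaves some failed OSDs down but still ``in``, which corresponds to the case where PGs cannot rebalance and may become ``inactive``.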