diff --git a/README.md b/README.md
index 8f85e5e..4cb54ca 100644
--- a/README.md
+++ b/README.md
@@ -9,12 +9,37 @@ The operator itself is built with the [Operator framework](https://github.com/op

 It inspired by [spotahome/redis-operator](https://github.com/spotahome/redis-operator).

+![Redis Cluster atop Kubernetes](/static/redis-sentinel-readme.png)
+
+* Create a statefulset to manage Redis instances (masters and replicas). Each Redis instance has a default PreStop script that can perform a failover if the master is down.
+* Create a statefulset to manage Sentinel instances that control the Redis nodes. Each Sentinel instance has a default ReadinessProbe script that detects whether the current Sentinel's status is ok. When a Sentinel pod is not ready, it is removed from Service load balancers.
+* Create a Service and a Headless service for the Sentinel statefulset.
+* Create a Headless service for the Redis statefulset.
+
+Table of Contents
+=================
+
+ * [redis-operator](#redis-operator)
+ * [Overview](#overview)
+ * [Prerequisites](#prerequisites)
+ * [Features](#features)
+ * [Quick Start](#quick-start)
+ * [Deploy redis operator](#deploy-redis-operator)
+ * [Deploy a sample redis cluster](#deploy-a-sample-redis-cluster)
+ * [Resize an Redis Cluster](#resize-an-redis-cluster)
+ * [Create redis cluster with password](#create-redis-cluster-with-password)
+ * [Dynamically changing redis config](#dynamically-changing-redis-config)
+ * [Persistence](#persistence)
+ * [Custom SecurityContext](#custom-securitycontext)
+ * [Cleanup](#cleanup)
+ * [Automatic failover details](#automatic-failover-details)
+
 ## Prerequisites

-* go version v1.12+.
-* Access to a Kubernetes v1.11.3+ cluster.
+* go version v1.13+.
+* Access to a Kubernetes v1.13.10+ cluster.

-## Capabilities
+## Features
 In addition to the sentinel's own capabilities, redis-operator can:

 * Push events and update status to the Kubernetes when resources have state changes
@@ -28,7 +53,7 @@ In addition to the sentinel's own capabilities, redis-operator can:
 ## Quick Start
 ### Deploy redis operator

-Build and push the redis-operator and e2e test image
+Build and push the redis-operator image
 ```
 $ make REGISTRY=you_public_registry build-image
 $ make REGISTRY=you_public_registry push
@@ -92,33 +117,32 @@ Verify that the cluster instances and its components are running.
 ```
 $ kubectl get rediscluster
 NAME   SIZE   STATUS    AGE
-test   3      Healthy   22h
+test   3      Healthy   4m9s

 $ kubectl get all -l app.kubernetes.io/managed-by=redis-operator
-NAME                                       READY   STATUS    RESTARTS   AGE
-pod/redis-cluster-test-0                   1/1     Running   0          22h
-pod/redis-cluster-test-1                   1/1     Running   0          22h
-pod/redis-cluster-test-2                   1/1     Running   0          22h
-pod/redis-sentinel-test-7cbd85785b-6llfp   1/1     Running   0          22h
-pod/redis-sentinel-test-7cbd85785b-ggqw4   1/1     Running   0          22h
-pod/redis-sentinel-test-7cbd85785b-nxxfc   1/1     Running   0          22h
-
-NAME                          TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)     AGE
-service/redis-sentinel-test   ClusterIP   xxxxxxxxxx                 26379/TCP   22h
-
-NAME                                  READY   UP-TO-DATE   AVAILABLE   AGE
-deployment.apps/redis-sentinel-test   3/3     3            3           22h
-
-NAME                                             DESIRED   CURRENT   READY   AGE
-replicaset.apps/redis-sentinel-test-7cbd85785b   3         3         3       22h
-
-NAME                                  READY   AGE
-statefulset.apps/redis-cluster-test   3/3     22h
+NAME                        READY   STATUS    RESTARTS   AGE
+pod/redis-cluster-test-0    1/1     Running   0          4m16s
+pod/redis-cluster-test-1    1/1     Running   0          3m22s
+pod/redis-cluster-test-2    1/1     Running   0          2m40s
+pod/redis-sentinel-test-0   1/1     Running   0          4m16s
+pod/redis-sentinel-test-1   1/1     Running   0          81s
+pod/redis-sentinel-test-2   1/1     Running   0          18s
+
+NAME                                   TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)     AGE
+service/redis-cluster-test             ClusterIP   None                        6379/TCP    4m16s
+service/redis-sentinel-headless-test   ClusterIP   None                        26379/TCP   4m16s
+service/redis-sentinel-test            ClusterIP   10.22.22.34                 26379/TCP   4m16s
+
+NAME                                   READY   AGE
+statefulset.apps/redis-cluster-test    3/3     4m16s
+statefulset.apps/redis-sentinel-test   3/3     4m16s
 ```

 * redis-cluster-: Redis statefulset
-* redis-sentinel-: Sentinel deployment
+* redis-sentinel-: Sentinel statefulset
 * redis-sentinel-: Sentinel service
+* redis-sentinel-headless-: Sentinel headless service
+* redis-cluster-: Redis headless service

 Describe the Redis Cluster, Viewing Events and Status
 ```
@@ -277,18 +301,23 @@ spec:
       cpu: 50m
       memory: 30Mi
   size: 3
+  # when disablePersistence is set to false, the following configurations will be set automatically:
+
+  # disablePersistence: false
+  # config["save"] = "900 1 300 10"
   # config["appendonly"] = "yes"
-  # config["auto-aof-rewrite-min-size"] = "1gb"
+  # config["auto-aof-rewrite-min-size"] = "536870912"
   # config["repl-diskless-sync"] = "yes"
-  # config["repl-backlog-size"] = "60mb"
-  # config["repl-diskless-sync-delay"] = "5"
+  # config["repl-backlog-size"] = "62914560"
   # config["aof-load-truncated"] = "yes"
   # config["stop-writes-on-bgsave-error"] = "no"

+  # when disablePersistence is set to true, the following configurations will be set automatically:
+
+  # disablePersistence: true
   # config["save"] = ""
   # config["appendonly"] = "no"

-  disablePersistence: false
   storage:
     # By default, the persistent volume claims will be deleted when the Redis Cluster be delete.
     # If this is not the expected usage, a keepAfterDeletion flag can be added under the storage section
@@ -344,4 +373,37 @@ $ kubectl delete -f deploy/namespace/role.yaml
 $ kubectl delete -f deploy/namespace/role_binding.yaml
 $ kubectl delete -f deploy/service_account.yaml
 $ kubectl delete -f deploy/crds/redis_v1beta1_rediscluster_crd.yaml
-```
\ No newline at end of file
+```
+
+## Automatic failover details
+
+Redis-operator builds a **Highly Available Redis cluster with Sentinel**. Sentinel continuously checks the MASTER and SLAVE
+instances in the Redis cluster to verify that they are working as expected. If Sentinel detects a failure in the
+MASTER node in a given cluster, it starts a failover process. As a result, Sentinel picks a SLAVE
+instance and promotes it to MASTER. Finally, the remaining SLAVE instances are automatically reconfigured
+to use the new MASTER instance.
+
+The operator guarantees the following:
+* Only one Redis instance is the master in a cluster
+* The number of Redis instances (masters and replicas) equals the number set in the RedisCluster specification
+* The number of Sentinels equals the number set in the RedisCluster specification
+* All Redis slaves have the same master
+* All Sentinels point to the same Redis master
+* Sentinel has no dead nodes
+
+However, Kubernetes pods are volatile: they can be deleted and recreated, a pod's IP changes when the pod is recreated,
+and the old IP can be recycled and assigned to other pods.
+Unfortunately, Sentinel cannot remove stale entries from its in-memory Sentinel list or Redis list when pod IPs change.
+This happens because a Sentinel node has no way to deregister itself from the Sentinel cluster before it dies,
+so the Sentinel node list grows without any control.
+
+To ensure that Sentinel is working properly, the operator sends a **RESET (SENTINEL RESET *)** signal to the Sentinel nodes
+one by one (if no failover is running at that moment).
+After a `SENTINEL RESET mastername` command, Sentinels refresh the list of replicas within the next 10 seconds, adding only the
+ones listed as correctly replicating in the current master's INFO output.
+During this refresh time, the `SENTINEL slaves ` command cannot return any result from Sentinel, so the operator sends the
+RESET signal to the Sentinels one by one and waits for each Sentinel's status to become ok (it monitors the correct master and has slaves).
+Additionally, each Sentinel instance has a default ReadinessProbe script that detects whether the current Sentinel's status is ok.
+When a Sentinel pod is not ready, it is removed from Service load balancers.
+The operator also creates a headless service for the Sentinel statefulset. If you cannot get a result from the `SENTINEL slaves ` command,
+you can try polling the headless domain.
diff --git a/static/redis-sentinel-readme.png b/static/redis-sentinel-readme.png
new file mode 100644
index 0000000..4ce3f2b
Binary files /dev/null and b/static/redis-sentinel-readme.png differ
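
The one-by-one reset described in the "Automatic failover details" section of this diff can be sketched as a small simulation. This is only an illustration under stated assumptions, not the operator's actual code: `FakeSentinel`, `wait_until_ok`, `reset_one_by_one`, and the two-poll refresh delay are hypothetical names and values invented here, and nothing connects to a real Redis or Sentinel instance. It shows the ordering the README text describes: skip the reset while a failover is running, reset one Sentinel, wait until it monitors the correct master and has replicas again, then move to the next.

```python
class FakeSentinel:
    """Hypothetical in-memory stand-in for one Sentinel pod (not a real client)."""

    def __init__(self, name, master, replicas):
        self.name = name
        self.master = master
        self.replicas = list(replicas)
        self._pending = None   # replica list that becomes visible after the refresh
        self._polls_left = 0   # assumed refresh delay, counted in polls

    def reset(self, live_replicas):
        # Mimics the effect of a reset: the known replica list is emptied
        # and only rebuilt after a short refresh period.
        self.replicas = []
        self._pending = list(live_replicas)
        self._polls_left = 2

    def poll(self):
        # Advance the simulated refresh by one step.
        if self._pending is None:
            return
        self._polls_left -= 1
        if self._polls_left <= 0:
            self.replicas = self._pending
            self._pending = None

    def is_ok(self, expected_master):
        # "ok" as described in the text: correct master and at least one replica.
        return self.master == expected_master and bool(self.replicas)


def wait_until_ok(sentinel, expected_master, max_polls):
    """Poll one Sentinel until it reports ok, or give up after max_polls."""
    for _ in range(max_polls):
        sentinel.poll()
        if sentinel.is_ok(expected_master):
            return True
    return False


def reset_one_by_one(sentinels, expected_master, live_replicas,
                     failover_in_progress=False, max_polls=10):
    """Reset Sentinels sequentially; return the names reset, in order."""
    if failover_in_progress:
        return []  # never reset while a failover is running
    done = []
    for sentinel in sentinels:
        sentinel.reset(live_replicas)
        if not wait_until_ok(sentinel, expected_master, max_polls):
            raise TimeoutError(f"{sentinel.name} did not become ok after reset")
        done.append(sentinel.name)
    return done


if __name__ == "__main__":
    # Each Sentinel starts with a stale replica entry left by a recreated pod.
    pods = [FakeSentinel(f"sentinel-{i}", "10.0.0.1", ["10.0.0.9", "10.0.0.2"])
            for i in range(3)]
    print(reset_one_by_one(pods, "10.0.0.1", ["10.0.0.2", "10.0.0.3"]))
```

Because each Sentinel is awaited before the next one is reset, at most one Sentinel has an empty replica list at any time, which is the reason the text gives for resetting sequentially rather than all at once.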