Kanister Logo

Kanister

Go Report Card Build Status

Overview

Kanister is a framework that enables application-level data management on Kubernetes. It allows domain experts to capture application-specific data management tasks via Blueprints, which can be easily shared and extended. The framework takes care of the tedious details surrounding execution on Kubernetes and presents a homogeneous operational experience across applications at scale.

Design Goals

The design of Kanister was driven by the following main goals:

  1. Application-Centric: Given the increasingly complex and distributed nature of cloud-native data services, there is a growing need for data management tasks to be performed at the application level. Experts who possess domain knowledge of a specific application should be able to capture its requirements when performing data operations on that application.

  2. API Driven: Data management tasks for each specific application may vary widely, so these tasks should be encapsulated by a well-defined API that provides a uniform data management experience. Each application expert can provide an application-specific, pluggable implementation that satisfies this API, enabling a homogeneous data management experience across diverse and evolving data services.

  3. Extensible: Any data management solution capable of managing a diverse set of applications must be flexible enough to capture the needs of custom data services running in a variety of environments. Such flexibility can only be provided if the solution itself can easily be extended.

Documentation

This README provides the basic set of information to get up and running with Kanister. For further information, please refer to the Kanister Documentation.

Quick Start

The following commands install Kanister and a Kanister-enabled MySQL chart, and configure backups to an AWS S3 bucket.

# Install the Kanister Controller
helm install --name myrelease --namespace kanister stable/kanister-operator --set image.tag=v0.3.0

# Add Kanister Charts
helm repo add kanister http://charts.kanister.io

# Install MySQL and configure its Kanister Blueprint.
helm install kanister/kanister-mysql                        \
    --name mysql-release --namespace mysql-ns               \
    --set kanister.s3_bucket="mysql-backup-bucket"          \
    --set kanister.s3_api_key="${AWS_ACCESS_KEY_ID}"        \
    --set kanister.s3_api_secret="${AWS_SECRET_ACCESS_KEY}" \
    --set kanister.controller_namespace=kanister

# Perform a backup by creating an ActionSet
cat << EOF | kubectl create -f -
apiVersion: cr.kanister.io/v1alpha1
kind: ActionSet
metadata:
  generateName: mysql-backup-
  namespace: kanister
spec:
  actions:
  - name: backup
    blueprint: mysql-release-kanister-mysql-blueprint
    object:
      kind: Deployment
      name: mysql-release-kanister-mysql
      namespace: mysql-ns
EOF

Getting Started

Prerequisites

In order to use Kanister, you will need to have the following set up:

  • Kubernetes version 1.8 or higher
  • kubectl
  • Helm

Kanister Installation

Kanister is based on the operator pattern. The first step in using Kanister is to deploy the Kanister controller, which can be configured and installed using Helm. See the kanister-operator chart's documentation for more information. Once Helm is initialized, install the controller with:

helm install --name myrelease --namespace kanister stable/kanister-operator --set image.tag=v0.3.0

If you wish to build and deploy the controller from source, instructions to do so can be found here.

Helm can give us the status of the objects we've installed:

$ helm status myrelease
LAST DEPLOYED: Wed Mar 21 16:40:43 2018
NAMESPACE: kanister
STATUS: DEPLOYED

RESOURCES:
==> v1/ServiceAccount
NAME                         SECRETS  AGE
myrelease-kanister-operator  1        9s

==> v1beta1/ClusterRole
NAME                                      AGE
myrelease-kanister-operator-cluster-role  9s

==> v1beta1/ClusterRoleBinding
NAME                                   AGE
myrelease-kanister-operator-edit-role  9s
myrelease-kanister-operator-cr-role    9s

==> v1beta1/Deployment
NAME                         DESIRED  CURRENT  UP-TO-DATE  AVAILABLE  AGE
myrelease-kanister-operator  1        1        1           1          9s

==> v1/Pod(related)
NAME                                          READY  STATUS   RESTARTS  AGE
myrelease-kanister-operator-1484730505-2s279  1/1    Running  0         9s

...

To check the status of the controller's pod:

# Check the pod's status.
$ kubectl --namespace kanister get pod -l app=kanister-operator
NAME                                 READY     STATUS    RESTARTS   AGE
kanister-operator-2733194401-l79mg   1/1       Running   1          12m

The Kanister controller will create CRDs on startup if they don't already exist. We can verify that they exist:

$ kubectl get crd
NAME                        AGE
actionsets.cr.kanister.io   30m
blueprints.cr.kanister.io   30m

As shown above, two custom resources are defined - Blueprints and ActionSets. A Blueprint specifies a set of actions that can be executed on an application. An ActionSet provides the necessary runtime information to trigger taking an action on the application.
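
To make these concrete, a Blueprint has roughly the following shape. This skeleton is illustrative only: the names are placeholders, and the exact argument schema for functions like KubeExec depends on the Kanister version, so refer to the complete Blueprints in the examples directory.

apiVersion: cr.kanister.io/v1alpha1
kind: Blueprint
metadata:
  name: example-blueprint            # placeholder name
  namespace: kanister
actions:
  backup:                            # action name referenced by ActionSets
    type: Deployment                 # kind of Kubernetes object the action targets
    phases:
    - func: KubeExec                 # a built-in Kanister function
      name: takeBackup
      args: []                       # function arguments; schema varies by version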

Since Kanister follows the operator pattern, other useful kubectl commands work with the Kanister controller as well, such as fetching the logs:

$ kubectl --namespace kanister logs -l app=kanister-operator

In addition to installing the Kanister controller, install the appropriate kanctl binary from the releases page. Alternatively, you can install kanctl using the following command. Make sure your GOPATH is set.

$ go install -v github.com/kanisterio/kanister/cmd/kanctl

Example Application: Helm-Deployed MySQL

This git repo contains Helm charts of stateful applications from the stable chart repo, modified to include Kanister Blueprints. These applications can be easily backed up and restored.

The following commands will install MySQL and configure Kanister to backup to an S3 bucket named mysql-backup-bucket.

# Add Kanister charts
helm repo add kanister http://charts.kanister.io

# Install MySQL and configure its Kanister Blueprint.
helm install kanister/kanister-mysql                        \
    --name mysql-release --namespace mysql-ns               \
    --set kanister.s3_bucket="mysql-backup-bucket"          \
    --set kanister.s3_api_key="${AWS_ACCESS_KEY_ID}"        \
    --set kanister.s3_api_secret="${AWS_SECRET_ACCESS_KEY}" \
    --set kanister.controller_namespace=kanister

To backup this application's data, we create a Kanister ActionSet. The command to create an ActionSet is included in the Helm notes, which can be displayed with helm status mysql-release.

$ cat << EOF | kubectl create -f -
apiVersion: cr.kanister.io/v1alpha1
kind: ActionSet
metadata:
  generateName: mysql-backup-
  namespace: kanister
spec:
  actions:
  - name: backup
    blueprint: mysql-release-kanister-mysql-blueprint
    object:
      kind: Deployment
      name: mysql-release-kanister-mysql
      namespace: mysql-ns
EOF
actionset "mysql-backup-qgx06" created

We can now restore this backup by chaining a restore off the ActionSet we just created using kanctl.

$ kanctl --namespace kanister perform restore --from mysql-backup-qgx06
actionset restore-mysql-backup-qgx06-bd4mq created

Example Application: MongoDB

To get a more detailed overview of Kanister's components, let's walk through a non-Helm example of using Kanister to backup and restore MongoDB. In this example, we will deploy MongoDB with a sidecar container. This sidecar container will include the necessary tools to store protected data from MongoDB into an S3 bucket in AWS. Note that a sidecar container is not required to use Kanister, but is just one of several ways to access tools needed to protect the application.
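
In practice, the sidecar approach amounts to adding a second container to the pod template that Kanister can exec into. The fragment below is a hypothetical sketch, not the contents of mongo-cluster.yaml; the sidecar name and image are placeholders.

# Pod template fragment (illustrative)
containers:
- name: mongo                            # the main database container
  image: mongo:3.4
- name: backup-tools                     # hypothetical sidecar
  image: example.io/backup-tools:latest  # placeholder image bundling mongodump and an S3 client
  command: ["tail", "-f", "/dev/null"]   # keep the sidecar alive between actions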

1. Deploy the Application

The following command deploys the example MongoDB application in the default namespace:

$ kubectl apply -f ./examples/mongo-sidecar/mongo-cluster.yaml
configmap "mongo-cluster" created
service "mongo-cluster" created
statefulset "mongo-cluster" created

Once MongoDB is running, you can populate it with some data. Let's add a collection called "restaurants" to a test database:

# Connect to MongoDB by running a shell inside MongoDB's pod
$ kubectl exec -i -t mongo-cluster-0 -- bash -l

# From inside the shell, use the mongo CLI to insert some data into the test database
$ mongo test --quiet --eval "db.restaurants.insert({'name' : 'Roys', 'cuisine' : 'Hawaiian', 'id' : '8675309'})"
WriteResult({ "nInserted" : 1 })

# View the restaurants data in the test database
$ mongo test --quiet --eval "db.restaurants.find()"
{ "_id" : ObjectId("5a1dd0719dcbfd513fecf87c"), "name" : "Roys", "cuisine" : "Hawaiian", "id" : "8675309" }

2. Protect the Application

Next, create a Blueprint that describes how backup and restore actions can be executed on this application. The Blueprint for this application can be found at ./examples/mongo-sidecar/blueprint.yaml. Notice that the backup action of the Blueprint references the S3 location specified in the ConfigMap in ./examples/mongo-sidecar/s3-location-configmap.yaml. For this example to work, update the path field of s3-location-configmap.yaml to point to an S3 bucket to which you have access. Also update secrets.yaml with AWS credentials that have read/write access to that bucket, setting the data values aws_access_key_id and aws_secret_access_key to the base64-encoded keys.
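
For reference, the two files are shaped roughly as follows. This sketch assumes the S3 location lives under a path key, as described above; check the files themselves for the authoritative layout.

apiVersion: v1
kind: ConfigMap
metadata:
  name: mongo-s3-location
data:
  path: s3://my-backup-bucket/mongo        # an S3 path you can write to
---
apiVersion: v1
kind: Secret
metadata:
  name: aws-creds
type: Opaque
data:
  aws_access_key_id: <base64-encoded access key ID>
  aws_secret_access_key: <base64-encoded secret access key>

The following commands create the ConfigMap, the Secret, and the Blueprint in the controller's namespace: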

# Base64-encode the AWS keys (-n avoids encoding a trailing newline)
$ echo -n "YOUR_KEY" | base64

# Create the ConfigMap with an S3 path
$ kubectl apply -f ./examples/mongo-sidecar/s3-location-configmap.yaml
configmap "mongo-s3-location" created

# Create the secrets with the AWS credentials
$ kubectl apply -f ./examples/mongo-sidecar/secrets.yaml
secrets "aws-creds" created

# Create the Blueprint for MongoDB
$ kubectl apply -f ./examples/mongo-sidecar/blueprint.yaml
blueprint "mongo-sidecar" created

You can now take a backup of MongoDB's data using an ActionSet that defines a backup action for this application. Create the ActionSet in the same namespace as the controller.
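
The backup-actionset.yaml file ties the pieces together by naming the Blueprint, the target object, and the ConfigMap and Secret the Blueprint consumes. The sketch below shows the general shape; the reference keys (location, aws) are illustrative, so consult the file in the repo for the exact names.

apiVersion: cr.kanister.io/v1alpha1
kind: ActionSet
metadata:
  generateName: mongo-backup-
  namespace: kanister
spec:
  actions:
  - name: backup
    blueprint: mongo-sidecar
    object:
      kind: StatefulSet
      name: mongo-cluster
      namespace: default
    configMaps:
      location:                  # illustrative reference key
        name: mongo-s3-location
        namespace: default
    secrets:
      aws:                       # illustrative reference key
        name: aws-creds
        namespace: default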

$ kubectl --namespace kanister apply -f ./examples/mongo-sidecar/backup-actionset.yaml
actionset "mongo-backup-12046" created

$ kubectl --namespace kanister get actionsets.cr.kanister.io
NAME                KIND
mongo-backup-12046  ActionSet.v1alpha1.cr.kanister.io

3. Disaster strikes!

Let's say someone with fat fingers accidentally deleted the restaurants collection using the following command:

# Drop the restaurants collection
$ mongo test --quiet --eval "db.restaurants.drop()"
true

If you try to access this data in the database, you should see that it is no longer there:

$ mongo test --quiet --eval "db.restaurants.find()"
# No entries should be found in the restaurants collection

4. Restore the Application

To restore the missing data, we want to use the backup created in step 2. An easy way to do this is to leverage kanctl, a command-line tool that helps create ActionSets that depend on other ActionSets:

$ kanctl --namespace kanister perform restore --from "mongo-backup-12046"
actionset restore-mongo-backup-12046-s1wb7 created

# View the status of the ActionSet
$ kubectl --namespace kanister get actionset restore-mongo-backup-12046-s1wb7 -oyaml

You should now see that the data has been successfully restored to MongoDB:

$ mongo test --quiet --eval "db.restaurants.find()"
{ "_id" : ObjectId("5a1dd0719dcbfd513fecf87c"), "name" : "Roys", "cuisine" : "Hawaiian", "id" : "8675309" }

5. Delete the Artifacts

The artifacts created by the backup action can be cleaned up using the following command:

$ kanctl --namespace kanister perform delete --from "mongo-backup-12046"
actionset "delete-mongo-backup-12046-kf8mt" created

# View the status of the ActionSet
$ kubectl --namespace kanister get actionset delete-mongo-backup-12046-kf8mt -oyaml

Cleanup

The Kanister components can be cleaned up with the following commands:

$ helm delete --purge myrelease
# Delete remaining ActionSets before removing the CRDs that define them
$ kubectl --namespace kanister delete actionset --all
$ kubectl delete crd {actionsets,blueprints}.cr.kanister.io

More Applications

Example applications deployed through YAML manifests can be found in the examples directory.

Support

For troubleshooting help, you can email the Kanister Google Group, reach out to us on Slack, or file an issue.

License

Apache License 2.0, see LICENSE.
