fix: Resolve broken links
Signed-off-by: Kevin Carter <[email protected]>
cloudnull committed Feb 26, 2024
1 parent fc0edf5 commit 192e288
Showing 3 changed files with 25 additions and 38 deletions.
10 changes: 5 additions & 5 deletions docs/build-test-envs.md
@@ -10,8 +10,8 @@ Take a moment to orient yourself, there are a few items to consider before movin

### Clone Genestack

> Your local genestack repository will be transferred to the eventual launcher instance for convenience (_perfect for development_).
See [[Getting Started|https://github.com/rackerlabs/genestack/wiki#getting-started]] for an example on how to recursively clone the repository and its submodules.
> Your local genestack repository will be transferred to the eventual launcher instance for convenience (**perfect for development**).
See [Getting Started](quickstart.md) for an example of how to recursively clone the repository and its submodules.

### Create a VirtualEnv

@@ -29,9 +29,9 @@ pip install ansible openstacksdk

The openstacksdk used by the Ansible playbook needs a valid configuration for your environment to stand up the test resources.

An example `clouds.yaml` that could be placed in [ansible/playbooks/](../../tree/main/ansible/playbooks):
An example `clouds.yaml`:

```
``` yaml
cache:
auth: true
expiration_time: 3600
@@ -50,7 +50,7 @@ clouds:
identity_api_version: "3"
```
See the configuration guide [[here|https://docs.openstack.org/openstacksdk/latest/user/config/configuration.html]] for more examples.
See the configuration guide [here](https://docs.openstack.org/openstacksdk/latest/user/config/configuration.html) for more examples.
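
Because the updated text no longer says where `clouds.yaml` should live, it may help to recall how openstacksdk searches for it. The sketch below mirrors the documented search order on Linux: an `OS_CLIENT_CONFIG_FILE` override first, then the current directory, `~/.config/openstack`, and `/etc/openstack`. The function name and its arguments are illustrative assumptions and are not part of openstacksdk or Genestack; only the `.yaml` file name is considered here.

```python
from pathlib import Path


def candidate_clouds_yaml(env, cwd, home):
    """Return the places openstacksdk would look for clouds.yaml, in order.

    Hypothetical helper for illustration only: it mirrors the documented
    search order on Linux and does not call openstacksdk itself.
    """
    paths = []
    override = env.get("OS_CLIENT_CONFIG_FILE")
    if override:
        # An explicit override goes to the front of the search list.
        paths.append(Path(override))
    paths.append(Path(cwd) / "clouds.yaml")
    paths.append(Path(home) / ".config" / "openstack" / "clouds.yaml")
    paths.append(Path("/etc/openstack/clouds.yaml"))
    return paths


if __name__ == "__main__":
    import os

    # Report which candidate locations exist on this machine.
    for path in candidate_clouds_yaml(os.environ, Path.cwd(), Path.home()):
        print(path, "(found)" if path.exists() else "(missing)")
```

Running the script from the directory where you intend to run the playbook shows which file, if any, would be picked up first.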
## Create a Test Environment
16 changes: 8 additions & 8 deletions docs/index.md
@@ -13,36 +13,36 @@ to manage cloud infrastructure in the way you need it.

They say a picture is worth 1000 words, so here's a picture.

![Genestack Architecture Diagram](assets/images/diagram-genestack.png)
![Genestack Architecture Diagram](../assets/images/diagram-genestack.png)

---

Building our cloud future has never been this simple.

### 0.Getting Started
## 0.Getting Started
* [Getting Started](getting-started.md)
* [Building Virtual Environments for Testing](build-test-envs.md)

### 1.Kubernetes
## 1.Kubernetes
* [Building Your Kubernetes Environment](build-k8s.md)
* [Retrieve kube config](kube-config.md)

### 2.Storage
## 2.Storage
* [Create Persistent Storage](Create-Persistent-Storage.md)

### 3.Infrastructure
## 3.Infrastructure
* [Deploy Required Infrastructure](deploy-required-infrastructure.md)
* [Deploy Prometheus](prometheus.md)
* [Deploy Vault](vault.md)

### 4.Openstack Infrastructure
## 4.Openstack Infrastructure
* [Deploy Openstack on k8s](Deploy-Openstack.md)

#### Post Deployment
## Post Deployment
* [Post Deploy Operations](post-deploy-ops.md)
* [Building Local Images](build-local-images.md)
* [OVN Database Backup](ovn-db-backup.md)

#### Upgrades
## Upgrades
* [Running Genestack Upgrade](genestack-upgrade.md)
* [Running Kubernetes Upgrade](k8s-upgrade.md)
37 changes: 12 additions & 25 deletions docs/ovn-db-backup.md
@@ -1,48 +1,36 @@
- [Background](#background)
- [Backup](#backup)
- [Restoration and recovery](#restoration-and-recovery)
- [Recovering when a majority of OVN DB nodes work fine](#recovering-when-a-majority-of-ovn-db-nodes-work-fine)
- [Recovering from a majority of OVN DB node failures or a total cluster failure](#recovering-from-a-majority-of-ovn-db-node-failures-or-a-total-cluster-failure)
- [Trying to use _OVN_ DB files in `/etc/origin/ovn` on the _k8s_ nodes](#trying-to-use-ovn-db-files-in-etcoriginovn-on-the-k8s-nodes)
- [Finding the first node](#finding-the-first-node)
- [Trying to create a pod for `ovsdb-tool`](#trying-to-create-a-pod-for-ovsdb-tool)
- [`ovsdb-tool` from your Linux distribution's packaging system](#ovsdb-tool-from-your-linux-distributions-packaging-system)
- [Conclusion of using the OVN DB files on your _k8s_ nodes](#conclusion-of-using-the-ovn-db-files-on-your-k8s-nodes)
- [Full recovery](#full-recovery)

# Background

By default, _Genestack_ creates a pod that runs _OVN_ snapshots daily in the `kube-system` namespace where you find other centralized _OVN_ things. These get stored on a persistent storage volume associated with the `ovndb-backup` _PersistentVolumeClaim_. Snapshots older than 30 days get deleted.
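
The 30-day retention rule described above can be sketched as follows. This is a minimal illustration of the pruning logic only, not the script that the _Genestack_ backup pod actually runs; the snapshot names and the `expired_snapshots` function are hypothetical.

```python
from datetime import datetime, timedelta

RETENTION_DAYS = 30  # retention window stated in the text


def expired_snapshots(snapshots, now, retention_days=RETENTION_DAYS):
    """Return the names of snapshots strictly older than the retention window.

    `snapshots` maps a snapshot name to the datetime it was taken.
    Hypothetical helper; the real backup job may decide differently.
    """
    cutoff = now - timedelta(days=retention_days)
    return sorted(name for name, taken in snapshots.items() if taken < cutoff)


if __name__ == "__main__":
    # Made-up snapshot names and dates, for illustration only.
    example = {
        "snapshot-a": datetime(2024, 1, 1),
        "snapshot-b": datetime(2024, 2, 20),
    }
    print(expired_snapshots(example, now=datetime(2024, 2, 26)))
```

A snapshot taken exactly at the cutoff is kept in this sketch; whether the real job treats that boundary the same way is an assumption.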

You should primarily follow the [Kube-OVN documentation on backup and recovery](https://kubeovn.github.io/docs/stable/en/ops/recover-db/) and consider the information here supplementary.

# Backup
## Backup

A default _Genestack_ installation creates a _k8s_ _CronJob_ in the `kube-system` namespace alongside the other central OVN components that will store snapshots of the OVN NB and SB in the _PersistentVolume_ for the _PersistentVolumeClaim_ named `ovndb-backup`. Storing these on the persistent volume like this matches the conventions for _MariaDB_ in _Genestack_.

You may wish to implement shipping these off of the cluster to a permanent location, as you might have cluster problems that could interfere with your ability to get these off of the _PersistentVolume_ when you need these backups.
## Restoration and recovery

# Restoration and recovery
You may wish to implement shipping these off of the cluster to a permanent location, as you might have cluster problems that could interfere with your ability to get these off of the _PersistentVolume_ when you need these backups.

## Recovering when a majority of OVN DB nodes work fine
### Recovering when a majority of OVN DB nodes work fine

If you have a majority of _k8s_ nodes running `ovn-central` working fine, you can just follow the directions in the _Kube-OVN_ documentation for kicking a node out. Things mostly work normally when you have a majority because OVSDB HA uses the Raft consensus algorithm, which only requires a majority of the nodes for full functionality, so you don't have to do anything too strange or extreme to recover. You essentially kick the bad node out and let it recover.
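
The majority rule that decides which recovery path applies can be written down as a small calculation. This is a sketch of the arithmetic only, under the assumption that every `ovn-central` node is a voting member of the cluster; the function names are hypothetical and nothing here queries a real cluster.

```python
def quorum_size(total_nodes):
    """Smallest number of nodes that forms a strict majority."""
    return total_nodes // 2 + 1


def has_quorum(healthy_nodes, total_nodes):
    """True when the healthy nodes still form a majority of the cluster."""
    return healthy_nodes >= quorum_size(total_nodes)


def recovery_path(healthy_nodes, total_nodes):
    """Pick which recovery section of this document applies (illustrative)."""
    if has_quorum(healthy_nodes, total_nodes):
        return "kick out the bad node(s) and let them rejoin"
    return "majority failure: recover from DB files or a snapshot"


if __name__ == "__main__":
    for healthy in range(4):
        print(healthy, "of 3 healthy:", recovery_path(healthy, 3))
```

With three `ovn-central` nodes, losing one still leaves a majority of two, while losing two does not.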

## Recovering from a majority of OVN DB node failures or a total cluster failure
### Recovering from a majority of OVN DB node failures or a total cluster failure

**You probably shouldn't use this section if you don't have a majority OVN DB node failure. Just kick out the minority of bad nodes as indicated above instead**. Use this section to recover from a failure of the **majority** of nodes.

As a first step, you will need to get database files to run the recovery. You can try to use files on your nodes as described below, or use one of the backup snapshots.

### Trying to use _OVN_ DB files in `/etc/origin/ovn` on the _k8s_ nodes
#### Trying to use _OVN_ DB files in `/etc/origin/ovn` on the _k8s_ nodes

You can use the information in this section to try to get the files to use for your recovery from your running _k8s_ nodes.

The _Kube-OVN_ documentation shows trying to use _OVN_ DB files from `/etc/origin/ovn` on the _k8s_ nodes. You can try this, or skip this section and use a backup snapshot as shown below if you have one. However, you can probably try to use the files on the nodes as described here first, and then switch to the latest snapshot backup from the `CronJob` later if trying to use the files on the _k8s_ nodes doesn't seem to work, since restoring from the snapshot backup fully rebuilds the database.

The directions in the _Kube-OVN_ documentation use `docker run` to get a working `ovsdb-tool` to try to work with the OVN DB files on the nodes, but _k8s_ installations mostly use `CRI-O`, `containerd`, or other container runtimes, so you probably can't pull the image and run it with `docker` as shown. I will cover this and some alternatives below.

#### Finding the first node
##### Finding the first node

The _Kube-OVN_ documentation directs you to pick the node running the `ovn-central` pod associated with the first IP of the `NODE_IPS` environment variable. You should find the `NODE_IPS` environment variable defined on an `ovn-central` pod or the `ovn-central` _Deployment_. Assuming you can run the `kubectl` commands, the following example gets the node IPs from the deployment:

@@ -60,14 +48,13 @@ k8s-controller01 Ready control-plane 3d17h v1.28.6 10.130.140.246
root@k8s-controller01:~#
```


#### Trying to create a pod for `ovsdb-tool`
##### Trying to create a pod for `ovsdb-tool`

As an alternative to `docker run` since your _k8s_ cluster probably doesn't use _Docker_ itself, you can **possibly** try to create a pod instead of running a container directly, but you should **try it before scaling your _OVN_ replicas down to 0**, as not having `ovn-central` available should interfere with pod creation. The broken `ovn-central` might still prevent _k8s_ from creating the pod even if you haven't scaled your replicas down, however.

**Read below the pod manifest for edits you may need to make**

```
``` yaml
apiVersion: v1
kind: Pod
metadata:
@@ -115,15 +102,15 @@ To reiterate, if you reached this step, this pod creation may not work because o

If creating this pod worked, **scale your replicas to 0**, use `ovsdb-tool` to make the files you will use for restore (both north and south DB), then jump to _Full Recovery_ as described below here and in the _Kube-OVN_ documentation.

#### `ovsdb-tool` from your Linux distribution's packaging system
##### `ovsdb-tool` from your Linux distribution's packaging system

As an alternative to `docker run`, which may not work on your cluster, and to pod creation, which may not work because of your broken OVN, you can still try to use the OVN DB files on your _k8s_ nodes instead of going to one of your snapshot backups by installing your distribution's package that provides `ovsdb-tool` (`openvswitch-common` on Ubuntu). You risk (and will probably have) a slight version mismatch with the OVS version within your normal `ovn-central` pods. OVSDB has a stable format, so this likely will not cause any problems. Even so, you should probably restore a previously saved snapshot in preference to using an `ovsdb-tool` with a slightly mismatched version, and consider the mismatched version only if you don't have other options.

#### Conclusion of using the OVN DB files on your _k8s_ nodes
##### Conclusion of using the OVN DB files on your _k8s_ nodes

The entire section on using the OVN DB files from your nodes just gives you an alternative to a planned snapshot backup for getting something to restore the database from. From here forward, the directions converge with full recovery as described below and in the full _Kube-OVN_ documentation.

### Full recovery
#### Full recovery

You start here when you have north database and south database files you want to use to run your recovery, whether you retrieved them from one of your _k8s_ nodes as described above, or got them from one of your snapshots. Technically, the south database should get rebuilt with only the north database, but if you have the two that go together, you can save the time it would take for a full rebuild by also restoring the south DB. It also avoids relying on the ability to rebuild the south DB in case something goes wrong.
