From e46addbf242ebf525502c6ef641c164638946e0b Mon Sep 17 00:00:00 2001 From: "Kyle J. Davis" Date: Thu, 4 Jan 2024 14:38:03 -0700 Subject: [PATCH 1/8] initial brupop --- content/en/brupop/1.3.x/install/_index.markdown | 5 +++++ content/en/brupop/1.3.x/install/helm/index.markdown | 0 content/en/brupop/1.3.x/install/manifest/index.markdown | 7 +++++++ content/en/brupop/_index.markdown | 6 ++++++ 4 files changed, 18 insertions(+) create mode 100644 content/en/brupop/1.3.x/install/_index.markdown create mode 100644 content/en/brupop/1.3.x/install/helm/index.markdown create mode 100644 content/en/brupop/1.3.x/install/manifest/index.markdown create mode 100644 content/en/brupop/_index.markdown diff --git a/content/en/brupop/1.3.x/install/_index.markdown b/content/en/brupop/1.3.x/install/_index.markdown new file mode 100644 index 00000000..28bae3a3 --- /dev/null +++ b/content/en/brupop/1.3.x/install/_index.markdown @@ -0,0 +1,5 @@ ++++ +type="docs" +title="Installing" ++++ + diff --git a/content/en/brupop/1.3.x/install/helm/index.markdown b/content/en/brupop/1.3.x/install/helm/index.markdown new file mode 100644 index 00000000..e69de29b diff --git a/content/en/brupop/1.3.x/install/manifest/index.markdown b/content/en/brupop/1.3.x/install/manifest/index.markdown new file mode 100644 index 00000000..593228ee --- /dev/null +++ b/content/en/brupop/1.3.x/install/manifest/index.markdown @@ -0,0 +1,7 @@ ++++ +title = "Installing with pre-baked manifest" +type = "docs" +description = "foo bar Manifest" ++++ + +lorem ipsum \ No newline at end of file diff --git a/content/en/brupop/_index.markdown b/content/en/brupop/_index.markdown new file mode 100644 index 00000000..2b3bf0d4 --- /dev/null +++ b/content/en/brupop/_index.markdown @@ -0,0 +1,6 @@ ++++ +type="docs" +title="Brupop" +description="Bottlerocket Update Operator" ++++ + From 66807351f93b41acea1a91921493db52ac525eea Mon Sep 17 00:00:00 2001 From: "Kyle J. 
Davis" Date: Thu, 18 Jan 2024 12:41:08 -0700 Subject: [PATCH 2/8] initial --- .../1.3.x/{install => }/_index.markdown | 4 +- .../brupop/1.3.x/install/helm/index.markdown | 0 .../1.3.x/install/manifest/index.markdown | 7 - .../en/brupop/1.3.x/operate/_index.markdown | 5 + content/en/brupop/1.3.x/setup/_index.markdown | 13 ++ .../1.3.x/setup/cert-manager/index.markdown | 45 +++++ .../1.3.x/setup/configure/index.markdown | 164 ++++++++++++++++++ .../brupop/1.3.x/setup/install/index.markdown | 60 +++++++ .../brupop/1.3.x/troubleshoot/_index.markdown | 5 + content/en/brupop/_index.markdown | 7 +- content/en/faq/_index.markdown | 1 + content/en/os/_index.markdown | 1 + data/versions/current.toml | 8 +- .../cross-project-current-link.html | 10 ++ layouts/shortcodes/current-version.html | 8 + layouts/shortcodes/github-at-commit.html | 11 ++ 16 files changed, 339 insertions(+), 10 deletions(-) rename content/en/brupop/1.3.x/{install => }/_index.markdown (52%) delete mode 100644 content/en/brupop/1.3.x/install/helm/index.markdown delete mode 100644 content/en/brupop/1.3.x/install/manifest/index.markdown create mode 100644 content/en/brupop/1.3.x/operate/_index.markdown create mode 100644 content/en/brupop/1.3.x/setup/_index.markdown create mode 100644 content/en/brupop/1.3.x/setup/cert-manager/index.markdown create mode 100644 content/en/brupop/1.3.x/setup/configure/index.markdown create mode 100644 content/en/brupop/1.3.x/setup/install/index.markdown create mode 100644 content/en/brupop/1.3.x/troubleshoot/_index.markdown create mode 100644 layouts/shortcodes/cross-project-current-link.html create mode 100644 layouts/shortcodes/current-version.html create mode 100644 layouts/shortcodes/github-at-commit.html diff --git a/content/en/brupop/1.3.x/install/_index.markdown b/content/en/brupop/1.3.x/_index.markdown similarity index 52% rename from content/en/brupop/1.3.x/install/_index.markdown rename to content/en/brupop/1.3.x/_index.markdown index 28bae3a3..0a7e6017 100644 --- 
a/content/en/brupop/1.3.x/install/_index.markdown +++ b/content/en/brupop/1.3.x/_index.markdown @@ -1,5 +1,7 @@ +++ type="docs" -title="Installing" +title="1.3.x" +++ + + diff --git a/content/en/brupop/1.3.x/install/helm/index.markdown b/content/en/brupop/1.3.x/install/helm/index.markdown deleted file mode 100644 index e69de29b..00000000 diff --git a/content/en/brupop/1.3.x/install/manifest/index.markdown b/content/en/brupop/1.3.x/install/manifest/index.markdown deleted file mode 100644 index 593228ee..00000000 --- a/content/en/brupop/1.3.x/install/manifest/index.markdown +++ /dev/null @@ -1,7 +0,0 @@ -+++ -title = "Installing with pre-baked manifest" -type = "docs" -description = "foo bar Manifest" -+++ - -lorem ipsum \ No newline at end of file diff --git a/content/en/brupop/1.3.x/operate/_index.markdown b/content/en/brupop/1.3.x/operate/_index.markdown new file mode 100644 index 00000000..0a6eed0d --- /dev/null +++ b/content/en/brupop/1.3.x/operate/_index.markdown @@ -0,0 +1,5 @@ ++++ +type="docs" +title="Operate" +weight=10 ++++ diff --git a/content/en/brupop/1.3.x/setup/_index.markdown b/content/en/brupop/1.3.x/setup/_index.markdown new file mode 100644 index 00000000..86d4a94b --- /dev/null +++ b/content/en/brupop/1.3.x/setup/_index.markdown @@ -0,0 +1,13 @@ ++++ +type="docs" +title="Setup" +weight=1 ++++ + +Setting up Brupop for the first time has three major steps: + +- Installing the prerequisite, `cert-manager` on your cluster, +- Installing Brupop itself, +- Labeling the nodes you want to update with Brupop. + +Many clusters require nothing more than the three above steps, but familiarize yourself with the additional configuration options before installing as you may need to tweak the configuration for your particular needs. 
diff --git a/content/en/brupop/1.3.x/setup/cert-manager/index.markdown b/content/en/brupop/1.3.x/setup/cert-manager/index.markdown new file mode 100644 index 00000000..684356dd --- /dev/null +++ b/content/en/brupop/1.3.x/setup/cert-manager/index.markdown @@ -0,0 +1,45 @@ ++++ +title = "Prerequisite: cert-manager" +type = "docs" +description = "Prepare your cluster for Brupop" +weight = 1 ++++ + +Brupop uses [cert-manager](https://cert-manager.io/) to manage self-signed certificates. You can install it with `kubectl` or `helm`. + +## Installing `cert-manager` using `kubectl` + +You can use `kubectl` to install cert-manager: + +```shell +kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.8.2/cert-manager.yaml +``` + +## Installing `cert-manager` using `helm` + +First, add the `cert-manager` helm chart: + +```shell +helm repo add jetstack https://charts.jetstack.io +``` + +Then update your local chart: + +```shell +helm repo update +``` + +Finally, install `cert-manager` including its CRDs: + +```shell +helm install \ + cert-manager jetstack/cert-manager \ + --namespace cert-manager \ + --create-namespace \ + --version v1.8.2 \ + --set installCRDs=true +``` + +## Next step + +After installing `cert-manager`, go ahead and [install Brupop itself](../install/). diff --git a/content/en/brupop/1.3.x/setup/configure/index.markdown b/content/en/brupop/1.3.x/setup/configure/index.markdown new file mode 100644 index 00000000..cf24b8e7 --- /dev/null +++ b/content/en/brupop/1.3.x/setup/configure/index.markdown @@ -0,0 +1,164 @@ ++++ +title = "Configure Brupop" +type = "docs" +description = "Making the operator work for your needs" +weight = 30 ++++ + + +When you install Brupop, the operator comes pre-configured with reasonable defaults. +[Labeling your nodes](#label-nodes) is the only required configuration step. 
+ +## Required Configuration + +### Label nodes + +{{% alert title="Warning" color="warning" %}} +You can fully install Brupop, but if you do not apply the proper node labels, the operator will not update your nodes. +{{% /alert %}} + +[Kubernetes node labels](https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/) control which nodes Brupop updates; +specifically, the label `bottlerocket.aws/updater-interface-version=2.0.0` dictates which nodes in the cluster get automatic updates. + +You can label nodes using {{< cross-project-current-link url="/en/os/x.x.x/api/settings/kubernetes/#node-labels" project="os" >}}`settings.kubernetes.node-labels`{{< /cross-project-current-link >}} with TOML (including instance user data), using `apiclient` in a host container, or `kubectl`: + +#### `apiclient` + +```shell +apiclient set settings.kubernetes.node-labels.bottlerocket.aws/updater-interface-version=2.0.0 +``` + +#### `eksctl` + +```yaml +... +nodeGroups: + - name: name-of-your-nodegroup + labels: { bottlerocket.aws/updater-interface-version: 2.0.0 } +... +``` + +#### `kubectl` + +```shell +# replace MY_NODE_NAME with the name of your node +kubectl label node MY_NODE_NAME bottlerocket.aws/updater-interface-version=2.0.0 +``` + +##### Label all nodes + +If you are running Bottlerocket on all nodes in your cluster, you can use `kubectl` to label all nodes at once: + +```shell +kubectl label node $(kubectl get nodes -o jsonpath='{.items[*].metadata.name}') bottlerocket.aws/updater-interface-version=2.0.0 +``` + +#### TOML / User Data + +```TOML +... +[settings.kubernetes.node-labels] +"bottlerocket.aws/updater-interface-version" = "2.0.0" +... 
+``` + +## Optional Configuration + +### Scheduling + +Brupop schedules node updates based a cron expression in the following format: + +```text + ┌───────────── seconds (0 - 59) + │ ┌───────────── minute (0 - 59) + │ │ ┌───────────── hour (0 - 23) + │ │ │ ┌───────────── day of the month (1 - 31) + │ │ │ │ ┌───────────── month (Jan, Feb, Mar, Apr, Jun, Jul, Aug, Sep, Oct, Nov, Dec) + │ │ │ │ │ ┌───────────── day of the week (Mon, Tue, Wed, Thu, Fri, Sat, Sun) + │ │ │ │ │ │ ┌───────────── year (formatted as YYYY) + │ │ │ │ │ │ │ + │ │ │ │ │ │ │ + * * * * * * * +``` + +#### Helm + +You can configure the schedule with `scheduler_cron_expression`. + +#### Kubernetes YAML + +In the controller deployment, you can change the schedule by alerting the `env` named `SCHEDULER_CRON_EXPRESSION` to the desired cron expression `value`. +See {{% github-at-commit repo="bottlerocket-os/bottlerocket-update-operator" path="/deploy/charts/bottlerocket-update-operator/templates/controller-deployment.yaml" %}}`controller-deployment.yaml`{{% /github-at-commit %}} for more details on the stuctures. + +--- + +### Concurrent Updates + +You can set the maximum concurrency of updates that Brupop will perform. +You either set a specific number of concurrent updates or, alternately, `"unlimited"` to update as many nodes as possible concurrently. +In either case, Brupop always respects [PodDisruptionBudgets](https://kubernetes.io/docs/tasks/run-application/configure-pdb/). + +{{% alert title="Conflicts between load balancing and concurrency" color="warning" %}} +Take caution when setting concurrency and excluding load balancers together, as misconfiguration can result in a condition where all nodes exclude load balancing. +{{% /alert %}} + +#### Helm + +You can configure the concurrency by `max_concurrent_updates` . + +#### Kubernetes YAML + +In the controller deployment, you can change the schedule by alerting the `env` named `SCHEDULER_CRON_EXPRESSION` to the desired `value`. 
+See {{% github-at-commit repo="bottlerocket-os/bottlerocket-update-operator" path="/deploy/charts/bottlerocket-update-operator/templates/controller-deployment.yaml" %}}`controller-deployment.yaml`{{% /github-at-commit %}} for more details on the stuctures. + +--- + +### API Server Ports + +By default, the operator's API server uses port `8443` for internal traffic and port `443` for node agents, but you can change these ports via configuration. +Both ports must be set or the operator will fail to start. + +#### Helm + +You can configure the API server ports by changing the value of `apiserver_internal_port` for internal traffic and `apiserver_service_port` for node agent traffic. + +#### Kubernetes YAML + +If configuring Brupop via Kubernetes YAML, you need to change the port values in several places, see {{% github-at-commit repo="bottlerocket-os/bottlerocket-update-operator" path="/bottlerocket-update-operator.yaml" %}}pre-baked YAML manifest{{% /github-at-commit %}} and the following templates for more details on the structures: + +- `{{< github-at-commit repo="bottlerocket-os/bottlerocket-update-operator" path="/deploy/charts/bottlerocket-shadow/templates/custom-resource-definition.yaml" >}}custom-resource-definition.yaml{{< /github-at-commit >}}` +- `{{< github-at-commit repo="bottlerocket-os/bottlerocket-update-operator" path="/deploy/charts/bottlerocket-update-operator/templates/api-server-service.yaml" >}}api-server-service.yaml{{< /github-at-commit >}}` +- `{{< github-at-commit repo="bottlerocket-os/bottlerocket-update-operator" path="/deploy/charts/bottlerocket-update-operator/templates/agent-daemonset.yaml" >}}agent-daemonset.yaml{{< /github-at-commit >}}` +- `{{< github-at-commit repo="bottlerocket-os/bottlerocket-update-operator" path="/deploy/charts/bottlerocket-update-operator/templates/api-server-deployment.yaml" >}}api-server-deployment.yam{{< /github-at-commit >}}` + +--- + +### Logging + +Brupop emits logs from the controller, agent, and API 
server through standard Kubernetes logging mechanisms but you configure the log format and filter. + +Log formatting has four options: + +- `full`: Human-readable, single-line logs, +- `compact`: A shorter version of `full`, +- `pretty`: "Excessively pretty", terminal-optimized human-readable logs (default), +- `json`: New line-delimited JSON-formatted (machine-readable) logs. + +You can optionally set the logs to add ANSI colour information, which is helpful if viewing in a terminal, but adds garbage characters for non-terminal logging utilities. + +Log filtering accepts on both typical log levels (`info` (default), `debug`, `error`) or through [filter directives](https://docs.rs/tracing-subscriber/0.3.17/tracing_subscriber/filter/struct.EnvFilter.html#directives). + +#### Helm + +You can configure the log format with `logging.formatter` and ANSI color with `logging.ansi_enabled` (`true`/`false`). + +To change the log filtering, set the `logging.controller.tracing_filter`, `logging.agent.tracing_filter`, and `logging.apiserver.tracing_filter` to the desired log level or filter directive. + +#### Kubernetes YAML + +You need to configure the logging seperately for each item seperately, see the following templates: + +- API Server: `{{< github-at-commit repo="bottlerocket-os/bottlerocket-update-operator" path="/deploy/charts/bottlerocket-update-operator/templates/api-server-deployment.yaml" >}}api-server-deployment.yaml{{< /github-at-commit >}}` +- + +To configure the format of your logs with, you need to change the `env` named `LOGGING_FORMATTER` to the desired format option. 
diff --git a/content/en/brupop/1.3.x/setup/install/index.markdown b/content/en/brupop/1.3.x/setup/install/index.markdown new file mode 100644 index 00000000..3fbb0cd3 --- /dev/null +++ b/content/en/brupop/1.3.x/setup/install/index.markdown @@ -0,0 +1,60 @@ ++++ +title = "Install Brupop" +type = "docs" +description = "Install the Bottlerocket Update Operator to your Kubernetes cluster" +weight = 10 ++++ + +Installing Brupop creates the custom resource definitions (CRDs), roles, and deployments and uses the latest operator image from [Amazon ECR Public](https://gallery.ecr.aws/bottlerocket/bottlerocket-update-operator). + +You can install Brupop either [with `helm`](#install-with-helm) or a [pre-baked manifest](#install-with-a-manifest). + +## Install with `helm` + +First, add the `bottlerocket-operator-chart` + +```shell +helm repo add brupop https://bottlerocket-os.github.io/bottlerocket-update-operator +``` + +Then update your local chart: + +```shell +helm repo update +``` + +Create a namespace for the operator: + +```shell +kubectl create namespace brupop-bottlerocket-aws +``` + +Next, install the Brupop custom resource definition: + +```shell +helm install brupop-crd brupop/bottlerocket-shadow +``` + +Finally, install the operator itself: + +```shell +helm install brupop-operator brupop/bottlerocket-update-operator +``` + +After you've installed the operator, you can move on to the next step: [configuring Brupop](../configure/). + +## Install with a Manifest + +First, download the manifest from the release to your local machine and run the following: + +```shell +kubectl apply -f bottlerocket-update-operator-v{{< current-version project="brupop" >}}.yaml +``` + +Alternately, you can point `kubectl` directly at the manifest URL. 
+ +```shell +kubectl apply -f https://github.com/bottlerocket-os/bottlerocket-update-operator/releases/download/v{{< current-version project="brupop" >}}/bottlerocket-update-operator-v{{< current-version project="brupop" >}}.yaml +``` + +After you've installed the operator, you can move on to the next step: [configuring Brupop](../configure/). diff --git a/content/en/brupop/1.3.x/troubleshoot/_index.markdown b/content/en/brupop/1.3.x/troubleshoot/_index.markdown new file mode 100644 index 00000000..31298755 --- /dev/null +++ b/content/en/brupop/1.3.x/troubleshoot/_index.markdown @@ -0,0 +1,5 @@ ++++ +type="docs" +title="Troubleshoot" +weight=30 ++++ diff --git a/content/en/brupop/_index.markdown b/content/en/brupop/_index.markdown index 2b3bf0d4..a5e34255 100644 --- a/content/en/brupop/_index.markdown +++ b/content/en/brupop/_index.markdown @@ -1,6 +1,11 @@ +++ type="docs" title="Brupop" -description="Bottlerocket Update Operator" +description="Documentation for the Bottlerocket Update Operator (aka Brupop)" +++ + +## Version & Update Policy + +Brupop follows semantic ([semver](https://semver.org/)) versioning to ensure that minor (e.g. 1.1.1 -> 1.2.0) or patch (e.g. 1.1.0 -> 1.1.1) updates do not introduce any breaking or incompatible changes. +However, patches are only provided to the latest version, so you should keep your Brupop installation up to date with the latest release. 
diff --git a/content/en/faq/_index.markdown b/content/en/faq/_index.markdown index 966e9072..4b21b6cb 100644 --- a/content/en/faq/_index.markdown +++ b/content/en/faq/_index.markdown @@ -1,6 +1,7 @@ +++ type="docs" title="FAQ" +weight=1 +++ {{< faqlist >}} diff --git a/content/en/os/_index.markdown b/content/en/os/_index.markdown index 03876fe1..58b1ae8e 100644 --- a/content/en/os/_index.markdown +++ b/content/en/os/_index.markdown @@ -4,6 +4,7 @@ type="docs" description="Documentation for the Bottlerocket operating system" body_class="suppress_section_listing" no_version_warning=true +weight=2 +++ This section covers installing and using the Bottlerocket operating system[^1]. If you’re looking for information on building, contributing to, or learning about the inner workings of Bottlerocket, the [GitHub repo](https://github.com/bottlerocket-os/bottlerocket) has more information. diff --git a/data/versions/current.toml b/data/versions/current.toml index 99940add..5ff3e96c 100644 --- a/data/versions/current.toml +++ b/data/versions/current.toml @@ -7,4 +7,10 @@ [k8s] versions = ["1.23","1.24","1.25","1.26","1.27","1.28"] [ecs] - versions = ["1","2"] \ No newline at end of file + versions = ["1","2"] + +[brupop] + major = 1 + minor = 13 + patch = 0 + tag_commit = "6455a43fd717765da044a95a18a60a8286020971" diff --git a/layouts/shortcodes/cross-project-current-link.html b/layouts/shortcodes/cross-project-current-link.html new file mode 100644 index 00000000..a6b9d6af --- /dev/null +++ b/layouts/shortcodes/cross-project-current-link.html @@ -0,0 +1,10 @@ +{{- $project := .Get "project" | default "os" -}} +{{- $url_arg := .Get "url" -}} +{{- $current_version_data := $.Site.Data.versions.current -}} +{{- $v := index $current_version_data $project -}} + +{{- $new_url := print "/" $v.major "." 
$v.minor ".x/" -}} + +{{- $url := replace $url_arg "/x.x.x/" $new_url }} + +{{ .Inner | markdownify }} diff --git a/layouts/shortcodes/current-version.html b/layouts/shortcodes/current-version.html new file mode 100644 index 00000000..679a2009 --- /dev/null +++ b/layouts/shortcodes/current-version.html @@ -0,0 +1,8 @@ +{{- $project := .Get "project" | default "os" -}} +{{- $minor_override_value := .Get "minor_override" -}} +{{- $patch_override_value := .Get "patch_override" -}} +{{- $seperator := default "." (.Get "seperator_override") }} +{{- $version_data := $.Site.Data.versions.current -}} +{{- $project_data := index $version_data $project -}} + +{{ $project_data.major }}{{ $seperator }}{{ default $project_data.minor $minor_override_value }}{{ $seperator }}{{ default $project_data.patch $patch_override_value }} \ No newline at end of file diff --git a/layouts/shortcodes/github-at-commit.html b/layouts/shortcodes/github-at-commit.html new file mode 100644 index 00000000..70d65916 --- /dev/null +++ b/layouts/shortcodes/github-at-commit.html @@ -0,0 +1,11 @@ +{{- $currentPath := print .Page.File.Dir -}} +{{- /* break apart the path */ -}} +{{- $parts := split $currentPath "/" -}} +{{- /* 1st (base 0) project has the version */ -}} +{{- $path_project := index $parts 0 -}} +{{- $repo := .Get "repo" -}} +{{- $project := .Get "project" | default $path_project -}} +{{- $path := .Get "path" -}} +{{- $tag_commit := index (index $.Site.Data.versions.current $project) "tag_commit" -}} +{{- $github_url := print "https://github.com/" $repo "/blob/" $tag_commit $path }} +{{ .Inner }} \ No newline at end of file From 583790ba9e9a053a4f3a7b97354840376914e9aa Mon Sep 17 00:00:00 2001 From: "Kyle J. 
Davis" Date: Wed, 24 Jan 2024 15:07:39 -0700 Subject: [PATCH 3/8] adding concepts section --- .../en/brupop/1.3.x/concepts/index.markdown | 53 ++++++++ .../{_index.markdown => index.markdown} | 0 content/en/brupop/1.3.x/setup/_index.markdown | 2 +- .../1.3.x/setup/configure/index.markdown | 114 ++++++++++-------- 4 files changed, 117 insertions(+), 52 deletions(-) create mode 100644 content/en/brupop/1.3.x/concepts/index.markdown rename content/en/brupop/1.3.x/operate/{_index.markdown => index.markdown} (100%) diff --git a/content/en/brupop/1.3.x/concepts/index.markdown b/content/en/brupop/1.3.x/concepts/index.markdown new file mode 100644 index 00000000..7450a107 --- /dev/null +++ b/content/en/brupop/1.3.x/concepts/index.markdown @@ -0,0 +1,53 @@ ++++ +title = "Concepts" +type = "docs" +description = "Understanding Brupop" +weight = 1 ++++ + +## Declarative, in-place updates + +You can update Bottlerocket in a couple of ways: + +* node replacement where new instances with a new version of the OS replace nodes with older versions of the OS, +* in-place updates where the node downloads a new version of the OS and reboots into a new version of the OS while maintaining the same instance. + +There is no single preferred nor advised method to update a node; each method has pros and cons depending on your situation. + +Bottlerocket Update Operator (Brupop) is a Kubernetes operator for managing in-place updates of Bottlerocket on Kubernetes. If you use Bottlerocket on ECS or intend to replace nodes in Kubernetes, Brupop is not for you. Even if you do plan to do in-place updates Brupop is not required as you can manage in-place updates in other ways. However, Brupop offers a declarative, automated way to manage in-place Bottlerocket updates. + +## Controlled updates + +Brupop uses the Kubernetes controller pattern in an effort to safely update all the nodes whilst minimizing disruptions to workloads. 
To achieve this, Brupop does the following: + +* Controls the rate and flow of updates across the entire cluster, +* First prevents new workloads from being scheduled to the node then drains existing workloads prior to updates, +* Contains and prevents the propagation of update problems when the controller detects update failures. + +Brupop collects the state of each node with an agent. The Brupop Agent runs in a container on each node as a DaemonSet. This agent sends the state to an API Server. The API Server runs in a container on the cluster itself and communicates with the Kubernetes API to record the state as a custom resource. + +The Controller also runs in a container on the cluster where it regularly evaluates the information about the state of each node and the cluster as a whole; based on this information it supplies instructions to the individual agents about update actions. + +## States + +At any given point nodes are in one of five Brupop states: idle, staged & performed update, rebooted into update, monitoring update or error reset. A node is never in more than one state. The state of each node is represented as a Kubernetes Custom Resource called a BottlerocketShadow resource or brs. + +### Idle + +A node in the idle state does not have a pending update in progress. Most of the time your nodes will remain in this state. + +### Staged & Performed Update + +Bottlerocket uses multiple partitions to manage in-place updates. The OS runs from one partition and, when a new update is available, the update is downloaded and installed into the other. The Brupop controller periodically requests the agent to check for and download the most recent version of Bottlerocket. Once downloaded, Bottlerocket modifies the bootloader configuration to boot from the partition with the update and the agent changes the state to Staged & Performed Update with the Brupop API server. 
+ +### Reboot into Update + +To minimize disruptions to the workloads running in the cluster, the controller signals to Kubernetes to prevent new workloads from being scheduled on to the node as well as shut down existing workloads (drain). Once drained, the agent triggers a reboot into the new OS and changes the state to Rebooted Into Update with the Brupop API server. + +### Monitoring Update + +Once the node reboots, the update is technically complete; however, the period while your workloads start up is critical. Although Bottlerocket’s versioning and variant scheme is built to mitigate incompatibilities between OS versions, there is always a chance that an unforeseen incompatibility exists with some component of your architecture. Typically, these incompatibilities become visible after the update occurs and during workload start. Consequently, Brupop waits before marking the node with the API server as fully complete; instead, the agent sets the state to Monitoring Update with the API Server. This monitoring period prevents a situation where nodes update quickly but end up in an unhealthy state. Once the monitoring period completes, the Agent sets the state back to Idle with the API Server. + +### Error Reset + +If any of the above states fails, the state becomes Error Reset before transitioning back to Idle. 
diff --git a/content/en/brupop/1.3.x/operate/_index.markdown b/content/en/brupop/1.3.x/operate/index.markdown similarity index 100% rename from content/en/brupop/1.3.x/operate/_index.markdown rename to content/en/brupop/1.3.x/operate/index.markdown diff --git a/content/en/brupop/1.3.x/setup/_index.markdown b/content/en/brupop/1.3.x/setup/_index.markdown index 86d4a94b..00fbc0a7 100644 --- a/content/en/brupop/1.3.x/setup/_index.markdown +++ b/content/en/brupop/1.3.x/setup/_index.markdown @@ -1,7 +1,7 @@ +++ type="docs" title="Setup" -weight=1 +weight=5 +++ Setting up Brupop for the first time has three major steps: diff --git a/content/en/brupop/1.3.x/setup/configure/index.markdown b/content/en/brupop/1.3.x/setup/configure/index.markdown index cf24b8e7..b2e4e5be 100644 --- a/content/en/brupop/1.3.x/setup/configure/index.markdown +++ b/content/en/brupop/1.3.x/setup/configure/index.markdown @@ -9,6 +9,10 @@ weight = 30 When you install Brupop, the operator comes pre-configured with reasonable defaults. [Labeling your nodes](#label-nodes) is the only required configuration step. +Aside from labeling nodes, you configure Brupop with Helm or with a manifest. +Helm reduces the configuration burden for Brupop substantially with few downsides, so this documentation focuses on configuration with Helm. +If you choose not to use Helm, refer to the pre-baked manifest for an example. 
+ ## Required Configuration ### Label nodes @@ -64,36 +68,19 @@ kubectl label node $(kubectl get nodes -o jsonpath='{.items[*].metadata.name}') ## Optional Configuration -### Scheduling - -Brupop schedules node updates based a cron expression in the following format: - -```text - ┌───────────── seconds (0 - 59) - │ ┌───────────── minute (0 - 59) - │ │ ┌───────────── hour (0 - 23) - │ │ │ ┌───────────── day of the month (1 - 31) - │ │ │ │ ┌───────────── month (Jan, Feb, Mar, Apr, Jun, Jul, Aug, Sep, Oct, Nov, Dec) - │ │ │ │ │ ┌───────────── day of the week (Mon, Tue, Wed, Thu, Fri, Sat, Sun) - │ │ │ │ │ │ ┌───────────── year (formatted as YYYY) - │ │ │ │ │ │ │ - │ │ │ │ │ │ │ - * * * * * * * -``` - -#### Helm - -You can configure the schedule with `scheduler_cron_expression`. +### API Server Ports -#### Kubernetes YAML +__Helm Configuration__: `apiserver_internal_port` for internal traffic, `apiserver_service_port` for node agent traffic. -In the controller deployment, you can change the schedule by alerting the `env` named `SCHEDULER_CRON_EXPRESSION` to the desired cron expression `value`. -See {{% github-at-commit repo="bottlerocket-os/bottlerocket-update-operator" path="/deploy/charts/bottlerocket-update-operator/templates/controller-deployment.yaml" %}}`controller-deployment.yaml`{{% /github-at-commit %}} for more details on the stuctures. +By default, the operator's API server uses port `8443` for internal traffic and port `443` for node agents, but you can change these ports via configuration. +Both ports must be set or the operator will fail to start. --- ### Concurrent Updates +__Helm Configuration__: `max_concurrent_updates` + You can set the maximum concurrency of updates that Brupop will perform. You either set a specific number of concurrent updates or, alternately, `"unlimited"` to update as many nodes as possible concurrently. In either case, Brupop always respects [PodDisruptionBudgets](https://kubernetes.io/docs/tasks/run-application/configure-pdb/). 
@@ -102,34 +89,24 @@ In either case, Brupop always respects [PodDisruptionBudgets](https://kubernetes Take caution when setting concurrency and excluding load balancers together, as misconfiguration can result in a condition where all nodes exclude load balancing. {{% /alert %}} -#### Helm +--- -You can configure the concurrency by `max_concurrent_updates` . +### Namespace -#### Kubernetes YAML +__Helm Configuration__: `namespace` -In the controller deployment, you can change the schedule by alerting the `env` named `SCHEDULER_CRON_EXPRESSION` to the desired `value`. -See {{% github-at-commit repo="bottlerocket-os/bottlerocket-update-operator" path="/deploy/charts/bottlerocket-update-operator/templates/controller-deployment.yaml" %}}`controller-deployment.yaml`{{% /github-at-commit %}} for more details on the stuctures. +You can change the namespace where Kubernetes deploys Brupop (default: `brupop-bottlerocket-aws`). --- -### API Server Ports - -By default, the operator's API server uses port `8443` for internal traffic and port `443` for node agents, but you can change these ports via configuration. -Both ports must be set or the operator will fail to start. - -#### Helm - -You can configure the API server ports by changing the value of `apiserver_internal_port` for internal traffic and `apiserver_service_port` for node agent traffic. +### Load balancer exclusion -#### Kubernetes YAML +__Helm Configuration__: `exclude_from_lb_wait_time_in_sec` -If configuring Brupop via Kubernetes YAML, you need to change the port values in several places, see {{% github-at-commit repo="bottlerocket-os/bottlerocket-update-operator" path="/bottlerocket-update-operator.yaml" %}}pre-baked YAML manifest{{% /github-at-commit %}} and the following templates for more details on the structures: +This option excludes the node from load balancing and then delays draining the node for the specified number of seconds. 
+Internally, Brupop uses [`node.kubernetes.io/exclude-from-external-load-balancers`](https://kubernetes.io/docs/reference/labels-annotations-taints/#node-kubernetes-io-exclude-from-external-load-balancers) to exclude the node from load balancing. -- `{{< github-at-commit repo="bottlerocket-os/bottlerocket-update-operator" path="/deploy/charts/bottlerocket-shadow/templates/custom-resource-definition.yaml" >}}custom-resource-definition.yaml{{< /github-at-commit >}}` -- `{{< github-at-commit repo="bottlerocket-os/bottlerocket-update-operator" path="/deploy/charts/bottlerocket-update-operator/templates/api-server-service.yaml" >}}api-server-service.yaml{{< /github-at-commit >}}` -- `{{< github-at-commit repo="bottlerocket-os/bottlerocket-update-operator" path="/deploy/charts/bottlerocket-update-operator/templates/agent-daemonset.yaml" >}}agent-daemonset.yaml{{< /github-at-commit >}}` -- `{{< github-at-commit repo="bottlerocket-os/bottlerocket-update-operator" path="/deploy/charts/bottlerocket-update-operator/templates/api-server-deployment.yaml" >}}api-server-deployment.yam{{< /github-at-commit >}}` +See [Concurrent Updates](#concurrent-updates) for an important warning about concurrency and load balancer exclusion. --- @@ -137,6 +114,10 @@ If configuring Brupop via Kubernetes YAML, you need to change the port values in Brupop emits logs from the controller, agent, and API server through standard Kubernetes logging mechanisms but you configure the log format and filter. +#### Format + +__Helm Configuration__: `logging.formatter` + Log formatting has four options: - `full`: Human-readable, single-line logs, @@ -144,21 +125,52 @@ Log formatting has four options: - `pretty`: "Excessively pretty", terminal-optimized human-readable logs (default), - `json`: New line-delimited JSON-formatted (machine-readable) logs. 
-You can optionally set the logs to add ANSI colour information, which is helpful if viewing in a terminal, but adds garbage characters for non-terminal logging utilities. +#### Colours + +__Helm Configuration__: `logging.ansi_enabled` + +You can optionally set the logs to add ANSI colour information (`true`/`false`), which is helpful if viewing in a terminal, but adds garbage characters for non-terminal logging utilities. + +#### Filter + +__Helm Configuration__: The controller, agent, and API server are configured via `logging.controller.tracing_filter`, `logging.agent.tracing_filter`, and `logging.apiserver.tracing_filter` (respectively). Log filtering accepts on both typical log levels (`info` (default), `debug`, `error`) or through [filter directives](https://docs.rs/tracing-subscriber/0.3.17/tracing_subscriber/filter/struct.EnvFilter.html#directives). -#### Helm +--- -You can configure the log format with `logging.formatter` and ANSI color with `logging.ansi_enabled` (`true`/`false`). +### Placement -To change the log filtering, set the `logging.controller.tracing_filter`, `logging.agent.tracing_filter`, and `logging.apiserver.tracing_filter` to the desired log level or filter directive. +__Helm Configuration__: `placement.agent`, `placement.controller`, `placement.apiserver` -#### Kubernetes YAML +With these configurations, you can control the [tolerations](https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/) for the agent, controller and API server. +For the controller and API server you can also control the [node selector](https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#nodeselector), and [pod affinity and anti-affinity](https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#affinity-and-anti-affinity).
-You need to configure the logging seperately for each item seperately, see the following templates: +--- + +### Private Image Registry + +__Helm Configuration__: `image_pull_secrets` + +If you are testing Brupop with a private image registry, you can configure pull secrets to fetch images. + +--- + +### Scheduling -- API Server: `{{< github-at-commit repo="bottlerocket-os/bottlerocket-update-operator" path="/deploy/charts/bottlerocket-update-operator/templates/api-server-deployment.yaml" >}}api-server-deployment.yaml{{< /github-at-commit >}}` -- +__Helm Configuration__: `scheduler_cron_expression` -To configure the format of your logs with, you need to change the `env` named `LOGGING_FORMATTER` to the desired format option. +Brupop schedules node updates based on a cron expression in the following format: + +```text + ┌───────────── seconds (0 - 59) + │ ┌───────────── minute (0 - 59) + │ │ ┌───────────── hour (0 - 23) + │ │ │ ┌───────────── day of the month (1 - 31) + │ │ │ │ ┌───────────── month (Jan, Feb, Mar, Apr, May, Jun, Jul, Aug, Sep, Oct, Nov, Dec) + │ │ │ │ │ ┌───────────── day of the week (Mon, Tue, Wed, Thu, Fri, Sat, Sun) + │ │ │ │ │ │ ┌───────────── year (formatted as YYYY) + │ │ │ │ │ │ │ + │ │ │ │ │ │ │ + * * * * * * * +``` From 68d6de4a90c8adfe88f2037c2b3c1e3a5e5c01d3 Mon Sep 17 00:00:00 2001 From: "Kyle J.
Davis" Date: Fri, 2 Feb 2024 08:05:22 -0700 Subject: [PATCH 4/8] revisions for diagrams --- assets/scss/_styles_project.scss | 77 +++ .../en/brupop/1.3.x/concepts/index.markdown | 9 +- .../en/brupop/1.3.x/operate/index.markdown | 31 +- .../brupop/1.3.x/troubleshoot/_index.markdown | 5 - .../brupop/1.3.x/troubleshoot/index.markdown | 66 +++ .../7_2-why-updates-bottlerocket-aws.markdown | 7 + ...t_all_nodes_have_available_update.markdown | 9 + .../brupop-agent-controller-diagram.html | 530 ++++++++++++++++++ ...brupop_agent_api_server_control_plane.html | 54 ++ .../shortcodes/brupop_components_diagram.html | 82 +++ layouts/shortcodes/setting-reference.html | 14 +- 11 files changed, 874 insertions(+), 10 deletions(-) delete mode 100644 content/en/brupop/1.3.x/troubleshoot/_index.markdown create mode 100644 content/en/brupop/1.3.x/troubleshoot/index.markdown create mode 100644 content/en/faqitems/7_2-why-updates-bottlerocket-aws.markdown create mode 100644 content/en/faqitems/7_3_not_all_nodes_have_available_update.markdown create mode 100644 layouts/shortcodes/brupop-agent-controller-diagram.html create mode 100644 layouts/shortcodes/brupop_agent_api_server_control_plane.html create mode 100644 layouts/shortcodes/brupop_components_diagram.html diff --git a/assets/scss/_styles_project.scss b/assets/scss/_styles_project.scss index 971d8216..c51680d1 100644 --- a/assets/scss/_styles_project.scss +++ b/assets/scss/_styles_project.scss @@ -502,6 +502,83 @@ nav.foldable-nav .with-child.depad { padding-left: 0; } +.brupop-diagram { + .node, + .agent, + .api-server, + .controller, + .unused-container, + .unused-volume, + .active-volume, + .line-arrow-connector .arrow-head, + .ellipses { + pointer-events: all; + } + .node { + fill: $tan; + rx: 3; + } + .agent, + .api-server, + .controller, + .unused-container, + .unused-volume, + .active-volume { + rx: 2; + } + .agent { + fill: $light-teal; + } + .unused-container, + .unused-volume { + fill: $white; + } + + .label { + 
font-size: 12px; + font-family: $td-fonts-serif; + font-weight: 600; + + &.active-volume-label { + fill: $white; + text-anchor: middle; + } + &.outer-label { + fill: $dark-blue; + text-anchor: middle; + } + } + .active-volume { + fill: $dark-blue; + } + + .line-arrow-connector { + stroke-miterlimit: 10; + .connector { + stroke: $dark-blue; + fill: none; + pointer-events : stroke; + } + .arrow-head { + stroke: $dark-blue; + fill: $dark-blue; + } + } + + .api-server { + fill: $dark-orange; + } + .controller { + fill: $light-blue; + } + .ellipses { + fill: $tan; + stroke: none; + rx: 5; + ry: 5; + } +} + /* old docs notice */ .pageinfo.olddocs { margin-left: 0; diff --git a/content/en/brupop/1.3.x/concepts/index.markdown b/content/en/brupop/1.3.x/concepts/index.markdown index 7450a107..c784a2c1 100644 --- a/content/en/brupop/1.3.x/concepts/index.markdown +++ b/content/en/brupop/1.3.x/concepts/index.markdown @@ -4,10 +4,17 @@ type = "docs" description = "Understanding Brupop" weight = 1 +++ +--- + +## test + +{{< brupop-agent-controller-diagram >}} +{{< brupop_agent_api_server_control_plane >}} +{{< brupop_components_diagram >}} ## Declarative, in-place updates -You can update Bottlerocket in a couple of ways: +You can update Bottlerocket in a couple of ways: * node replacement where new instances with a new version of the OS replace nodes with older versions of the OS, * in-place updates where the node downloads a new version of the OS and reboots into a new version of the OS while maintaining the same instance. diff --git a/content/en/brupop/1.3.x/operate/index.markdown b/content/en/brupop/1.3.x/operate/index.markdown index 0a6eed0d..21975bab 100644 --- a/content/en/brupop/1.3.x/operate/index.markdown +++ b/content/en/brupop/1.3.x/operate/index.markdown @@ -1,5 +1,34 @@ +++ type="docs" -title="Operate" +title="Operate & Observe" weight=10 +++ + +After installation on your cluster Brupop runs in the background and generally requires no intervention. 
+Your nodes will check for updates and apply them according to your configuration and the Bottlerocket update waves. + +However, you can observe the status of the updates by [adhoc query](#adhoc-query) or set up [on-going monitoring](#on-going-monitoring). + +## Adhoc Query + +If you want to see the update status of your nodes, use `kubectl` to get the custom resource `brs`: + +```shell +kubectl get brs --namespace brupop-bottlerocket-aws +``` + +`kubectl` returns the [state](../concepts/#states), current version, target state, and target version. For example: + +```shell +NAME STATE VERSION TARGET STATE TARGET VERSION +brs-node-1 Idle 1.17.0 Idle +brs-node-2 Idle 1.17.0 StagedUpdate 1.18.0 +``` + +## On-going monitoring + +To facilitate on-going monitoring, the Brupop API server and controller provide you with metrics endpoints (`/metrics`) compatible with [Prometheus](https://prometheus.io/). +The metrics endpoints expose two metrics: one that describes the current version of each node (`brupop_hosts_version`) and another for the [state](../concepts/#states) of each node (`brupop_hosts_state`). + +For a sample configuration of using Prometheus with Brupop see the [configuration on the Brupop GitHub Repo](#). +Additionally, [Containers On AWS has a step-by-step walkthrough](#) using EKS, Brupop, and Prometheus.
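
The ad hoc query above can be narrowed to just the nodes that have an update in flight. The following is a minimal sketch, not part of Brupop: `brs_pending` is a hypothetical helper name, and it assumes the whitespace-separated column layout of the sample output (name, state, version, target state, target version) with the state and version columns never empty.

```shell
# Hypothetical helper (not shipped with Brupop): reads the table printed by
# `kubectl get brs` on stdin and prints "<name> <target state>" for every
# node whose target state is something other than Idle.
# Assumption: columns are whitespace-separated and in the order shown in the
# sample output, so the target state is the fourth field of each data row.
brs_pending() {
  # NR > 1 skips the header row; NF >= 4 skips blank or short lines.
  awk 'NR > 1 && NF >= 4 && $4 != "Idle" { print $1, $4 }'
}
```

Assuming the default namespace, it could be used as `kubectl get brs --namespace brupop-bottlerocket-aws | brs_pending`. Parsing the table is fragile if the columns ever change, so for anything longer-lived the Prometheus metrics are likely the better source.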
diff --git a/content/en/brupop/1.3.x/troubleshoot/_index.markdown b/content/en/brupop/1.3.x/troubleshoot/_index.markdown deleted file mode 100644 index 31298755..00000000 --- a/content/en/brupop/1.3.x/troubleshoot/_index.markdown +++ /dev/null @@ -1,5 +0,0 @@ -+++ -type="docs" -title="Troubleshoot" -weight=30 -+++ diff --git a/content/en/brupop/1.3.x/troubleshoot/index.markdown b/content/en/brupop/1.3.x/troubleshoot/index.markdown new file mode 100644 index 00000000..3daf571d --- /dev/null +++ b/content/en/brupop/1.3.x/troubleshoot/index.markdown @@ -0,0 +1,66 @@ ++++ +type="docs" +title="Troubleshoot" +weight=30 ++++ + +## Debugging information + +Brupop’s components emit useful logs for debugging and troubleshooting. + +### API Server deployment logs + +Searching through the API Server’s deployment logs for a particular Node ID will yield the mutations to the node. Assuming the default namespace you can retrieve these by running: + +``` +kubectl logs deployment/brupop-apiserver --namespace brupop-bottlerocket-aws +``` + +### Agent logs + +Logs from the agent show the specific update actions taken on a particular node. + +First, find the node in the list of the Brupop agent pods (assuming the default namespace): + +``` +kubectl get pods --selector=brupop.bottlerocket.aws/component=agent -o wide --namespace brupop-bottlerocket-aws +``` + +From this list get the logs for the agent you’re troubleshooting by replacing `` with the node name from the previous step. + +``` +kubectl logs --namespace brupop-bottlerocket-aws +``` + +## Common Issues + +### Stuck Updates + +When one or more nodes do not progress through the states and return to idle it is a “stuck update.” By default, Brupop only updates one node at a time so a single node can prevent nodes across the cluster from updating. + +There are a few potential causes of stuck updates: + +1. Pod Disruption Budget preventing a node drain. Brupop uses the Kubernetes Eviction API to drain pods from a node.
It’s possible to have Pod Disruption Budgets configured (often mistakenly) to disallow pod removal, resulting in an un-drainable node that Brupop cannot update. + **Troubleshooting step:** Check your pod disruption budget configuration. +2. Unable to access `updates.bottlerocket.aws`. Bottlerocket needs to access metadata from a public endpoint to get information about the most recent release. Production environments may limit this type of outbound access. + **Troubleshooting step:** Log into the control container of a node and run `apiclient update check`. + Failures with this check indicate an outbound block. + **Potential solution:** Scrape the contents of `updates.bottlerocket.aws` with [Tuftool](https://github.com/awslabs/tough/tree/develop/tuftool#download-tuf-repo) and serve from within your cluster, then update your settings accordingly for `settings.updates.metadata-base-url` and `settings.updates.targets-base-url`. +3. Other issues while updating. + **Troubleshooting step:** Check the agent logs for the stuck node. + +### **Bottlerocket instances start with an old version of Bottlerocket** + +After using Brupop for a while you may notice that any brand new nodes added to the cluster start with an older version of Bottlerocket, then Brupop flags them for an update almost immediately. Brupop can only update existing nodes and it doesn’t manage the node creation process. How you address this issue depends on how you created your nodes: + +* Auto-scaling group: Update your AMI ID in the launch configuration or template. +* Manual creation of nodes with AWS CLI: Update the `image-id` argument to the latest AMI ID. +* VMware: Change the `target-name` argument when downloading the OVA with tuftool. + +## Also See + +* Bottlerocket FAQ + * Why do some of the nodes in my cluster have an update available and others do not? + * Why are my nodes egressing to [https://updates.bottlerocket.aws](https://updates.bottlerocket.aws/)?
+* Log Configuration + diff --git a/content/en/faqitems/7_2-why-updates-bottlerocket-aws.markdown b/content/en/faqitems/7_2-why-updates-bottlerocket-aws.markdown new file mode 100644 index 00000000..cd753518 --- /dev/null +++ b/content/en/faqitems/7_2-why-updates-bottlerocket-aws.markdown @@ -0,0 +1,7 @@ ++++ +question = "Why are my nodes egressing to https://updates.bottlerocket.aws?" +group = "Updates" ++++ + +The [Bottlerocket Updater API](https://github.com/bottlerocket-os/bottlerocket/blob/develop/sources/updater/README.md) uses TUF metadata served from a public endpoint. +The default AWS variants endpoint is `updates.bottlerocket.aws`. diff --git a/content/en/faqitems/7_3_not_all_nodes_have_available_update.markdown b/content/en/faqitems/7_3_not_all_nodes_have_available_update.markdown new file mode 100644 index 00000000..b61a6178 --- /dev/null +++ b/content/en/faqitems/7_3_not_all_nodes_have_available_update.markdown @@ -0,0 +1,9 @@ ++++ +question = "Why do some of the nodes in my cluster have an update available and others do not?" +group = "Updates" ++++ + +This is normal. +Bottlerocket uses "waves" to stagger deployment of updates. +When a node starts for the first time, the boot process generates a random seed (or uses the value from {{< setting-reference setting="settings.updates.seed" current_version="true">}}settings.updates.seed{{}}). +Bottlerocket's update process uses the seed to determine if a node should update, so in the situation where some of your nodes have an available update and some do not, it just means that the update wave hasn't reached that seed of some nodes and it has for the others. diff --git a/layouts/shortcodes/brupop-agent-controller-diagram.html b/layouts/shortcodes/brupop-agent-controller-diagram.html new file mode 100644 index 00000000..1717e3f3 --- /dev/null +++ b/layouts/shortcodes/brupop-agent-controller-diagram.html @@ -0,0 +1,530 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ [SVG diagram markup not recoverable. Text labels in the diagram: Host OS; Kubernetes Control Plane; New Version; Running Version; Old Version; New Running Version; Reboot; Node 1; Update state to; Rebooted Into Update; Stop new and drain existing workloads; Monitoring Update]
\ No newline at end of file diff --git a/layouts/shortcodes/brupop_agent_api_server_control_plane.html b/layouts/shortcodes/brupop_agent_api_server_control_plane.html new file mode 100644 index 00000000..2ed474ea --- /dev/null +++ b/layouts/shortcodes/brupop_agent_api_server_control_plane.html @@ -0,0 +1,54 @@ + + + + + + + + + + + + + + Bottlerocket Host + + + + + + + + + + + + + + + + + + + Kubernetes Control Plane + + + + + + + + + + + + + + + + + + \ No newline at end of file diff --git a/layouts/shortcodes/brupop_components_diagram.html b/layouts/shortcodes/brupop_components_diagram.html new file mode 100644 index 00000000..168ff739 --- /dev/null +++ b/layouts/shortcodes/brupop_components_diagram.html @@ -0,0 +1,82 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Brupop Agent + (1x / node) + + + + + + + + + + + + + + + + API Server + (Default: + 3x / cluster) + + + + + + + + + + Controller + (1x / cluster) + + + \ No newline at end of file diff --git a/layouts/shortcodes/setting-reference.html b/layouts/shortcodes/setting-reference.html index 6e501297..bb29055a 100644 --- a/layouts/shortcodes/setting-reference.html +++ b/layouts/shortcodes/setting-reference.html @@ -2,9 +2,17 @@ {{- $setting := or (.Get "setting") $.Inner -}} {{- $setting_parts := strings.Split $setting "." -}} {{- $ref := index $setting_parts 1 -}} -{{- $current_path := print .Page.File.Dir -}} -{{- $parts := split $current_path "/" -}} -{{- $version := index $parts 1 -}} +{{- $version := "" -}} +{{- if (.Get "current_version") -}} + {{- $versions := index $.Site.Data.versions.current "os" -}} + {{/* create the version string (e.g. `1.14.x`) */}} + {{- $version = print $versions.major "." 
$versions.minor ".x" -}} +{{- else -}} + {{- $current_path := print .Page.File.Dir -}} + + {{- $parts := split $current_path "/" -}} + {{- $version = index $parts 1 -}} +{{- end -}} {{- $lang := print $.Page.Language -}} {{- $settings_at_version := index (index .Site.Data.settings $version) $ref }} {{- $page := . -}} From 7084aef1f74c8a2cfc7a6e9136ba291e3170dc9b Mon Sep 17 00:00:00 2001 From: "Kyle J. Davis" Date: Thu, 8 Feb 2024 12:53:09 -0700 Subject: [PATCH 5/8] adds udpates to styles and nearly complete brupop docs --- assets/scss/_styles_project.scss | 111 +++- assets/scss/_variables_project.scss | 10 +- assets/scss/rtl/_main.scss | 1 + content/en/brupop/1.3.x/_index.markdown | 3 - .../en/brupop/1.3.x/concepts/index.markdown | 74 ++- .../en/brupop/1.3.x/operate/index.markdown | 13 +- content/en/brupop/1.3.x/setup/_index.markdown | 7 +- .../1.3.x/setup/cert-manager/index.markdown | 12 +- .../1.3.x/setup/configure/index.markdown | 151 ++++- .../brupop/1.3.x/setup/install/index.markdown | 4 +- .../brupop/1.3.x/troubleshoot/index.markdown | 47 +- content/en/brupop/_index.markdown | 20 +- .../7_2-why-updates-bottlerocket-aws.markdown | 2 +- ..._all_nodes_have_available_update.markdown} | 1 - data/versions/current.toml | 4 +- .../brupop-agent-controller-diagram.html | 530 ------------------ .../agent-api-server-control-plane.html} | 0 .../brupop/agent-controller-diagram.html | 51 ++ .../brupop/cert-manager-version.html | 1 + .../components-diagram.html} | 24 +- layouts/shortcodes/brupop/idle.html | 59 ++ layouts/shortcodes/brupop/monitoring.html | 70 +++ .../shortcodes/brupop/reboot-into-update.html | 94 ++++ layouts/shortcodes/brupop/state-machine.html | 52 ++ layouts/shortcodes/github-at-commit.html | 11 - .../shortcodes/github-link-at-version.html | 11 + 26 files changed, 691 insertions(+), 672 deletions(-) create mode 100644 assets/scss/rtl/_main.scss rename content/en/faqitems/{7_3_not_all_nodes_have_available_update.markdown => 
7_3-not_all_nodes_have_available_update.markdown} (99%) delete mode 100644 layouts/shortcodes/brupop-agent-controller-diagram.html rename layouts/shortcodes/{brupop_agent_api_server_control_plane.html => brupop/agent-api-server-control-plane.html} (100%) create mode 100644 layouts/shortcodes/brupop/agent-controller-diagram.html create mode 100644 layouts/shortcodes/brupop/cert-manager-version.html rename layouts/shortcodes/{brupop_components_diagram.html => brupop/components-diagram.html} (83%) create mode 100644 layouts/shortcodes/brupop/idle.html create mode 100644 layouts/shortcodes/brupop/monitoring.html create mode 100644 layouts/shortcodes/brupop/reboot-into-update.html create mode 100644 layouts/shortcodes/brupop/state-machine.html delete mode 100644 layouts/shortcodes/github-at-commit.html create mode 100644 layouts/shortcodes/github-link-at-version.html diff --git a/assets/scss/_styles_project.scss b/assets/scss/_styles_project.scss index c51680d1..708b24e9 100644 --- a/assets/scss/_styles_project.scss +++ b/assets/scss/_styles_project.scss @@ -1,5 +1,4 @@ @import '_home_svg.scss'; - .btn-lg, .btn-group-lg > .btn { border-radius: 6px; } @@ -431,7 +430,7 @@ nav.foldable-nav .with-child, nav.foldable-nav .without-child { stroke: none; } - + path.exit-flag { fill: $dark-blue; } @@ -502,6 +501,59 @@ nav.foldable-nav .with-child.depad { padding-left: 0; } +.start-align-labels.brupop-diagram { + .outer-label { + &.active-volume-label, + &.outer-label { + text-anchor: start; + } + } +} + +.brupop-diagram, +.brupop-state-machine { + .line-arrow-connector { + stroke-miterlimit: 10; + stroke: $dark-blue; + .connector { + + fill: none; + pointer-events : stroke; + &.dotted { + stroke-dasharray: 1 2; + } + } + .arrow-head { + fill: $dark-blue; + } + + } + + .label { + font-size: 12px; + font-family: $td-fonts-serif; + font-weight: 300; + + &.active-volume-label { + fill: $white; + text-anchor: middle; + } + &.outer-label { + fill: $dark-blue; + text-anchor: middle; + 
} + + } +} +.brupop-state-machine { + .state { + fill: $tan; + rx: 9; + ry: 9; + pointer-events: all; + } +} + .brupop-diagram { .node, .agent, @@ -511,7 +563,10 @@ nav.foldable-nav .with-child.depad { .unused-volume, .active-volume, .line-arrow-connector .arrow-head, - .ellipses { + .ellipses, + .future-volume, + .label-backer, + .wait { pointer-events: all; } .node { @@ -523,46 +578,32 @@ nav.foldable-nav .with-child.depad { .controller, .unused-container, .unused-volume, - .active-volume { + .active-volume, + .future-volume { rx: 2; } .agent { fill: $light-teal; } + + .label-backer, .unused-container, .unused-volume { fill: $white; } - .label { - font-size: 12px; - font-family: $td-fonts-serif; - font-weight: 600; - &.active-volume-label { - fill: $white; - text-anchor: middle; - } - &.outer-label { - fill: $dark-blue; - text-anchor: middle; - } - } .active-volume { fill: $dark-blue; } + .future-volume { + fill: url(#stripes); + } - .line-arrow-connector { - stroke-miterlimit: 10; - .connector { - stroke: $dark-blue; - fill: none; - pointer-events : stroke; - } - .arrow-head { - stroke: $dark-blue; - fill: $dark-blue; - } + .wait { + stroke-width: 2px; + stroke: $light-blue; + fill: none; } .api-server { @@ -577,6 +618,20 @@ nav.foldable-nav .with-child.depad { rx: 5; ry: 5; } + + #stripes { + width: 7; + height: 7; + rect { + fill: $dark-blue; + } + line { + stroke: #ffffff; + opacity: 0.1; + stroke-width: 7px; + } + } + } /* old docs notice */ diff --git a/assets/scss/_variables_project.scss b/assets/scss/_variables_project.scss index a816efa0..8e12d8f9 100644 --- a/assets/scss/_variables_project.scss +++ b/assets/scss/_variables_project.scss @@ -1,5 +1,13 @@ + $google_font_name: "IBM Plex Sans"; -$google_font_family: "IBM+Plex+Sans+Condensed:ital,wght@0,300;0,600;1,300;1,600&family=IBM+Plex+Sans:ital,wght@0,100;0,300;0,600;1,100;1,300;1,600"; +$google_font_family: "IBM+Plex+Sans:ital,wght@0,100;0,300;0,600;1,100;1,300;1,600"; + +// this is a work around for 
the baked in css2 vs css call to google fonts. I don't like having two calls, but there isn't a clean way to work around this +$google_font_family_secondary: "IBM+Plex+Sans+Condensed:ital,wght@0,300;0,600;1,300;1,600"; +$web-font-path_secondary: "https://fonts.googleapis.com/css?family=#{$google_font_family_secondary}&display=swap"; +@import url($web-font-path_secondary); + + $heading_font_stack: "'IBM Plex Sans Condensed', sans-serif"; diff --git a/assets/scss/rtl/_main.scss b/assets/scss/rtl/_main.scss new file mode 100644 index 00000000..c605a3e2 --- /dev/null +++ b/assets/scss/rtl/_main.scss @@ -0,0 +1 @@ +// override. This is for RTL support, we don't need it. \ No newline at end of file diff --git a/content/en/brupop/1.3.x/_index.markdown b/content/en/brupop/1.3.x/_index.markdown index 0a7e6017..4b634b54 100644 --- a/content/en/brupop/1.3.x/_index.markdown +++ b/content/en/brupop/1.3.x/_index.markdown @@ -2,6 +2,3 @@ type="docs" title="1.3.x" +++ - - - diff --git a/content/en/brupop/1.3.x/concepts/index.markdown b/content/en/brupop/1.3.x/concepts/index.markdown index c784a2c1..5c47cb00 100644 --- a/content/en/brupop/1.3.x/concepts/index.markdown +++ b/content/en/brupop/1.3.x/concepts/index.markdown @@ -1,60 +1,90 @@ +++ title = "Concepts" type = "docs" -description = "Understanding Brupop" +description = "Introduction to the components and concepts used in Brupop" weight = 1 +++ ---- - -## test - -{{< brupop-agent-controller-diagram >}} -{{< brupop_agent_api_server_control_plane >}} -{{< brupop_components_diagram >}} - -## Declarative, in-place updates You can update Bottlerocket in a couple of ways: -* node replacement where new instances with a new version of the OS replace nodes with older versions of the OS, -* in-place updates where the node downloads a new version of the OS and reboots into a new version of the OS while maintaining the same instance. 
+* **node replacement** where new instances with a new version of the OS replace nodes with older versions of the OS, +* **in-place updates** where the node downloads a new version of the OS and reboots into a new version of the OS while maintaining the same instance. -There is no single preferred nor advised method to update a node; each method has pros and cons depending on your situation. +There is no single preferred nor advised method to update a node; both methods have pros and cons depending on your situation. -Bottlerocket Update Operator (Brupop) is a Kubernetes operator for managing in-place updates of Bottlerocket on Kubernetes. If you use Bottlerocket on ECS or intend to replace nodes in Kubernetes, Brupop is not for you. Even if you do plan to do in-place updates Brupop is not required as you can manage in-place updates in other ways. However, Brupop offers a declarative, automated way to manage in-place Bottlerocket updates. +You can trigger an {{< cross-project-current-link project="os" url="/en/os/x.x.x/update/methods/in-place/#apiclient-commands">}}in-place update manually with the API{{< /cross-project-current-link >}} or you can use the Bottlerocket Update Operator (Brupop). +**Brupop is a Kubernetes operator for managing in-place updates of Bottlerocket on Kubernetes.** + +If you use Bottlerocket on ECS or intend to replace nodes in Kubernetes, Brupop is not for you. +Even if you do plan to do in-place updates Brupop is not required as you can manage in-place updates in other ways. +However, Brupop offers a declarative, automated way to manage in-place Bottlerocket updates. ## Controlled updates -Brupop uses the Kubernetes controller pattern in an effort to safely update all the nodes whilst minimizing disruptions to workloads.
To achieve this, Brupop does the following: +Brupop uses the [Kubernetes controller pattern](https://kubernetes.io/docs/concepts/architecture/controller/) in an effort to safely update all the nodes whilst minimizing disruptions to workloads. +To achieve this, Brupop does the following: * Controls the rate and flow of updates across the entire cluster, * First prevents new workloads from being scheduled to the node then drains existing workloads prior to updates, * Contains and prevents the propagation of update problems when the controller detects update failures. -Brupop collects the state of each node with an agent. The Brupop Agent runs in a container on each node as a DaemonSet. This agent sends the state to an API Server. The API Server runs in a container on the cluster itself and communicates with the Kubernetes API to record the state as a custom resource. +{{< brupop/components-diagram >}} + +Brupop collects the state of each node with an agent. +The Brupop Agent runs in a container on each node as a DaemonSet. +This agent sends the state to an API Server. +API Server instances run in the cluster itself and communicate with the Kubernetes API to record the state as a custom resource. + +{{< brupop/agent-api-server-control-plane >}} + +{{< alert title="Bottlerocket API Server vs Brupop API Server?" color="success" >}} +Don’t confuse Bottlerocket’s {{< cross-project-current-link project="os" url="/en/os/x.x.x/concepts/api-driven/">}}API Server{{< /cross-project-current-link >}} with Brupop’s API Server; these are two distinct servers that happen to share a name. +In this part of the documentation, unless otherwise noted, assume that “API Server” refers to the Brupop API Server. +{{< /alert >}} The Controller also runs in a container on the cluster where it regularly evaluates the information about the state of each node and the cluster as a whole; based on this information it supplies instructions to the individual agents about update actions.
+{{< brupop/agent-controller-diagram >}} + ## States -At any given point nodes are in one of five Brupop states: idle, staged & performed update, rebooted into update, monitoring update or error reset. A node is never in more than one state. The state of each node is represented as a Kubernetes Custom Resource called a BottlerocketShadow resource or brs. +At any given point nodes are in one of five Brupop states: **idle, staged & performed update, rebooted into update, monitoring update** or **error reset**. +A node is never in more than one state. +The state of each node is represented as a [Kubernetes Custom Resource](https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/) called a BottlerocketShadow resource or `brs`. + +{{< brupop/state-machine >}} ### Idle -A node in the idle state does not have a pending update in-process. Most of the time your nodes will remain in this state.\ +A node in the **idle** state does not have a pending update in-process. +Most of the time your nodes will remain in this state. + +{{< brupop/idle >}} ### Staged & Performed Update -Bottlerocket uses multiple partitions to manage in-place updates. The OS runs from one partition and, when a new update is available, the update is downloaded and installed into the other. The Brupop controller periodically requests the agent to check for and download the most recent version of Bottlerocket. Once downloaded, Bottlerocket modifies the bootloader configuration to boot from the partition with the update and the agent changes the state to Staged & Performed Update with the Brupop API server. +Bottlerocket uses multiple partitions to manage in-place updates. +The OS runs from one partition and, when a new update is available, the update is downloaded and installed into the other. +The Brupop controller periodically requests the agent to check for and download the most recent version of Bottlerocket. 
+Once downloaded, Bottlerocket modifies the bootloader configuration to boot from the partition with the update, and the agent changes the state to **Staged & Performed Update** with the Brupop API server. ### Reboot into Update -To minimize disruptions to the workloads running in the cluster, the controller signals to Kubernetes to prevent new workloads from being scheduled on to the node as well as shut down existing workloads (drain). Once drained, the agent triggers a reboot into the new OS and changes the state to Rebooted Into Update with the Brupop API server. +{{< brupop/reboot-into-update >}} + +To minimize disruptions to the workloads running in the cluster, the controller signals Kubernetes to prevent new workloads from being scheduled onto the node and to shut down existing workloads (drain). +Once the node is drained, the agent triggers a reboot into the new OS and changes the state to **Rebooted Into Update** with the Brupop API server. ### Monitoring Update -Once the node reboots the update is technically complete, however the time whilst all your workloads startup is critical. Bottlerocket’s versioning and variant scheme is built to mitigate incompatibilities between OS versions, there is always a chance that an unforeseen incompatibility exists with some component of your architecture. Typically, these incompatibilities become visible after the update occurs and during workload start. Consequently, Brupop waits before marking the node with the API server as fully complete, instead the agent sets the state to Monitoring Update with the API Server. This monitoring period prevents the cluster creating a situation where nodes update quickly but in an unhealthy state. Once the monitoring period completes, the Agent sets the state back to Idle with the API Server. +{{< brupop/monitoring >}} + +Once the node reboots, the update is technically complete; however, the period while your workloads start up is critical. 
+Although Bottlerocket’s versioning and variant scheme is built to mitigate incompatibilities between OS versions, there is always a chance that an unforeseen incompatibility exists with some component of your architecture. +Brupop’s state machine has a reserved state for monitoring these incompatibilities (**Monitoring Update**); however, as of this version, this state is a no-op. +You can suggest a direction for this state on the [Brupop GitHub Repo](https://github.com/bottlerocket-os/bottlerocket-update-operator/issues/new?assignees=&labels=&projects=&template=issue.md&title=Suggestion%20for%20monitoring%20state). +Consequently, the Agent immediately transitions through **Monitoring Update** back to **Idle** with the API server. ### Error Reset -In the situation that any of the above states fail, the state becomes Error Reset before transitioning back to Idle. +If any of the above states fails, the state becomes **Error Reset** before transitioning back to **Idle**. \ No newline at end of file diff --git a/content/en/brupop/1.3.x/operate/index.markdown b/content/en/brupop/1.3.x/operate/index.markdown index 21975bab..fccfa7b6 100644 --- a/content/en/brupop/1.3.x/operate/index.markdown +++ b/content/en/brupop/1.3.x/operate/index.markdown @@ -1,11 +1,11 @@ +++ -type="docs" -title="Operate & Observe" -weight=10 +type = "docs" +title = "Operate & Observe" +weight = 10 +description = "Understanding the day-to-day use of Brupop" +++ -After installation on your cluster Brupop runs in the background and generally requires no intervention. -Your nodes will check for updates and apply them according your configuration and the Bottlerocket update waves. +After installation on your cluster Brupop runs in the background and generally requires no intervention. Your nodes will check for updates and apply them according to your configuration and the Bottlerocket update waves. 
However, you can observe the status of the updates by [adhoc query](#adhoc-query) or setup [on-going monitoring](#on-going-monitoring). @@ -30,5 +30,4 @@ brs-node-2 Idle 1.17.0 StagedUpda To facilitate on-going monitoring the Brupop API server and controller provide you with metrics endpoints (`/metrics`) compatible with [Prometheus](https://prometheus.io/). The metrics endpoints expose two metrics: one that describes the current version of each node (`brupop_hosts_version`) and another for the [state](../concepts/#states) of each node (`brupop_hosts_state`). -For a sample configuration of using Prometheus with Brupop see the [configuration on the Brupop GitHub Repo](#). -Additionally, [Containers On AWS has a step-by-step walkthrough](#) using EKS, Brupop, and Prometheus. +For a sample configuration of using Prometheus with Brupop see the {{< github-link-at-version url="https://github.com/bottlerocket-os/bottlerocket-update-operator/blob/vx.x.x/deploy/examples/prometheus-resources.yaml" project="brupop" >}}configuration on the Brupop GitHub Repo{{}}. 
diff --git a/content/en/brupop/1.3.x/setup/_index.markdown b/content/en/brupop/1.3.x/setup/_index.markdown index 00fbc0a7..87be0632 100644 --- a/content/en/brupop/1.3.x/setup/_index.markdown +++ b/content/en/brupop/1.3.x/setup/_index.markdown @@ -1,7 +1,8 @@ +++ -type="docs" -title="Setup" -weight=5 +type = "docs" +title = "Setup" +weight = 5 +description = "Steps to use and configure Brupop on your Bottlerocket nodes" +++ Setting up Brupop for the first time has three major steps: diff --git a/content/en/brupop/1.3.x/setup/cert-manager/index.markdown b/content/en/brupop/1.3.x/setup/cert-manager/index.markdown index 684356dd..fd0bec09 100644 --- a/content/en/brupop/1.3.x/setup/cert-manager/index.markdown +++ b/content/en/brupop/1.3.x/setup/cert-manager/index.markdown @@ -5,14 +5,18 @@ description = "Prepare your cluster for Brupop" weight = 1 +++ -Brupop uses [cert-manager](https://cert-manager.io/) to manage self-signed certificates. You can install it with `kubectl` or `helm`. +Brupop uses [cert-manager](https://cert-manager.io/) to manage self-signed certificates. You can install it with `kubectl` or [helm](https://helm.sh/). + +{{% alert title="Note" color="success" %}} +This guide uses the most recent release of `cert-manager`, {{< brupop/cert-manager-version >}}, but there is no particular hard dependency on this version. 
+{{% /alert %}} ## Installing `cert-manager` using `kubectl` -You can use `kubectl` to install cert-manager: +Use `kubectl` to install cert-manager: ```shell -kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.8.2/cert-manager.yaml +kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v{{< brupop/cert-manager-version >}}/cert-manager.yaml ``` ## Installing `cert-manager` using `helm` @@ -36,7 +40,7 @@ helm install \ cert-manager jetstack/cert-manager \ --namespace cert-manager \ --create-namespace \ - --version v1.8.2 \ + --version v{{< brupop/cert-manager-version >}} \ --set installCRDs=true ``` diff --git a/content/en/brupop/1.3.x/setup/configure/index.markdown b/content/en/brupop/1.3.x/setup/configure/index.markdown index b2e4e5be..f86010b3 100644 --- a/content/en/brupop/1.3.x/setup/configure/index.markdown +++ b/content/en/brupop/1.3.x/setup/configure/index.markdown @@ -5,34 +5,34 @@ description = "Making the operator work for your needs" weight = 30 +++ - When you install Brupop, the operator comes pre-configured with reasonable defaults. [Labeling your nodes](#label-nodes) is the only required configuration step. -Aside from labeling nodes, you configure Brupop with helm or with a manifest. -Helm reduces the configuration burden for Brupop substantially with few down sides, so this the documentation focuses configuration with Helm. -If you choose to not use Helm, refer to the pre-baked manifest for an example. +Aside from labeling nodes, you configure Brupop with [helm](https://helm.sh/) or with a manifest. +Helm reduces the configuration burden for Brupop substantially with few down sides, so this documentation focuses on configuration with Helm. 
+If you choose to not use Helm, refer to the {{< github-link-at-version url="https://github.com/bottlerocket-os/bottlerocket-update-operator/blob/vx.x.x/bottlerocket-update-operator.yaml" project="brupop" >}}pre-baked manifest for an example{{< /github-link-at-version >}}. + + ## Required Configuration ### Label nodes {{% alert title="Warning" color="warning" %}} -You can fully install Brupop but if you do not apply the proper node labels the operator will not your update nodes. +You can fully install Brupop but if you do not apply the proper node labels the operator will not update your nodes. {{% /alert %}} -[Kubernetes node labels](https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/) controls which nodes Brupop updates; -specfically, the label `bottlerocket.aws/updater-interface-version=2.0.0` dictactes which nodes in the cluster get automatic updates. +[Kubernetes node labels](https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/) control which nodes Brupop updates; specifically, the label `bottlerocket.aws/updater-interface-version=2.0.0` dictates which nodes in the cluster get automatic updates. -You can label nodes using {{< cross-project-current-link url="/en/os/x.x.x/api/settings/kubernetes/#node-labels" project="os" >}}`settings.kubernetes.node-labels`{{}} with TOML (including instance user data), using `apiclient` in a host container, or `kubectl`: +You can label nodes using {{< cross-project-current-link url="/en/os/x.x.x/api/settings/kubernetes/#node-labels" >}}`settings.kubernetes.node-labels`{{}} with TOML ({{< cross-project-current-link url="/en/os/x.x.x/concepts/api-driven/#user-data" >}}including instance user data{{}}), using `apiclient` in a host container, or `kubectl`. 
-#### `apiclient` +#### Label a node with `apiclient` ```shell apiclient set settings.kubernetes.node-labels.bottlerocket.aws/updater-interface-version=2.0.0 ``` -#### `eksctl` +#### Label all nodes when starting an EKS cluster with `eksctl` ```yaml ... @@ -42,7 +42,9 @@ nodeGroups: ... ``` -#### `kubectl` +#### Labeling nodes with `kubectl` + +#### Label a single node ```shell # replace MY_NODE_NAME with the name of your node @@ -57,7 +59,9 @@ If you are running Bottlerocket on all nodes in your cluster, you can use `kubec kubectl label node $(kubectl get nodes -o jsonpath='{.items[*].metadata.name}') bottlerocket.aws/updater-interface-version=2.0.0 ``` -#### TOML / User Data +#### Labeling a node with the Bottlerocket API + +You can add the following TOML to your instance user data: ```TOML ... @@ -66,14 +70,26 @@ kubectl label node $(kubectl get nodes -o jsonpath='{.items[*].metadata.name}') ... ``` +From the control container, run the following: + +```shell +apiclient set settings.kubernetes.node-labels.bottlerocket.aws/updater-interface-version=2.0.0 +``` + ## Optional Configuration ### API Server Ports __Helm Configuration__: `apiserver_internal_port` for internal traffic, `apiserver_service_port` for node agent traffic. -By default, the operator's API server uses port `8443` for internal traffic and port `443` for node agents, but you can change these ports via configuration. -Both ports must be set or the operator will fail to start. + +By default, the operator’s API server uses port `8443` for internal traffic and port `443` for node agents, but you can change these ports via configuration. Both ports must be set or the operator will fail to start. + +Example: + +```YAML +apiserver_internal_port: "8443" +``` --- @@ -81,33 +97,48 @@ Both ports must be set or the operator will fail to start. __Helm Configuration__: `max_concurrent_updates` -You can set the maximum concurrency of updates that Brupop will perform. 
-You either set a specific number of concurrent updates or, alternately, `"unlimited"` to update as many nodes as possible concurrently. -In either case, Brupop always respects [PodDisruptionBudgets](https://kubernetes.io/docs/tasks/run-application/configure-pdb/). +You can set the maximum concurrency of updates that Brupop will perform. You either set a specific number of concurrent updates or, alternately, `"unlimited"` to update as many nodes as possible concurrently. In either case, Brupop always respects [PodDisruptionBudgets](https://kubernetes.io/docs/tasks/run-application/configure-pdb/). {{% alert title="Conflicts between load balancing and concurrency" color="warning" %}} Take caution when setting concurrency and excluding load balancers together, as misconfiguration can result in a condition where all nodes exclude load balancing. {{% /alert %}} +Example: + +```yaml +max_concurrent_updates: "1" +``` + --- ### Namespace __Helm Configuration__: `brupop-bottlerocket-aws` + You can change the namespace where the Kubernetes deploys Brupop (default: `brupop-bottlerocket-aws`). +Example: + +```yaml +namespace: "brupop-bottlerocket-aws" +``` + --- ### Load balancer exclusion __Helm Configuration__: `exclude_from_lb_wait_time_in_sec` -With this option, you can control the exclusion of the node from load balancing and delays draining the node for the number of seconds specified. -Internally, Brupop uses [`node.kubernetes.io/exclude-from-external-load-balancers`](https://kubernetes.io/docs/reference/labels-annotations-taints/#node-kubernetes-io-exclude-from-external-load-balancers) to exclude the node from load balancing. +With this option, you can control the exclusion of the node from load balancing and delays draining the node for the number of seconds specified. 
 Internally, Brupop uses [`node.kubernetes.io/exclude-from-external-load-balancers`](https://kubernetes.io/docs/reference/labels-annotations-taints/#node-kubernetes-io-exclude-from-external-load-balancers) to exclude the node from load balancing. See [Concurrent Updates](#concurrent-updates) for an important warning about concurrency and load balancer exclusion. +Example: +```yaml +exclude_from_lb_wait_time_in_sec: "0" +``` + --- ### Logging @@ -123,7 +154,14 @@ Log formatting has four options: - `full`: Human-readable, single-line logs, - `compact`: A shorter version of `full`, - `pretty`: "Excessively pretty", terminal-optimized human-readable logs (default), -- `json`: New line-delimited JSON-formatted (machine-readable) logs. +- `json`: New line-delimited JSON-formatted (machine-readable) logs. + +Example: + +```yaml +logging: + formatter: "pretty" +``` #### Colours @@ -131,12 +169,30 @@ __Helm Configuration__: `logging.ansi_enabled` You can optionally set the logs to add ANSI colour information (`true`/`false`), which is helpful if viewing in a terminal, but adds garbage characters for non-terminal logging utilities. +Example: + +```yaml +logging: + ansi_enabled: "true" +``` + #### Filter __Helm Configuration__: The controller, agent, and API server are configured via`logging.controller.tracing_filter`, `logging.agent.tracing_filter`, and `logging.apiserver.tracing_filter` (respectively). Log filtering accepts on both typical log levels (`info` (default), `debug`, `error`) or through [filter directives](https://docs.rs/tracing-subscriber/0.3.17/tracing_subscriber/filter/struct.EnvFilter.html#directives). 
+Example: + +```yaml + controller: + tracing_filter: "info" + agent: + tracing_filter: "debug" + apiserver: + tracing_filter: "error" +``` + --- ### Placement @@ -144,7 +200,44 @@ Log filtering accepts on both typical log levels (`info` (default), `debug`, `er __Helm Configuration__: `placement.agent`, `placement.controller`, `placement.apiserver` With these configurations, you can control the [tolerations](https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/) for the agent, controller and API server. -For the controller and and API server you can also control the [node selector](https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#nodeselector), and [pod affinitiy and anti-affinity](https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#affinity-and-anti-affinity). +For the controller and API server you can also control the [node selector](https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#nodeselector), and [pod affinitiy and anti-affinity](https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#affinity-and-anti-affinity). + +Example: + +```yaml +# Placement controls +# See the Kubernetes documentation about placement controls for more details: +# * https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/ +# * https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#nodeselector +# * https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#affinity-and-anti-affinity +placement: + agent: + # The agent is a daemonset, so the only controls that apply to it are tolerations. + tolerations: [] + + controller: + tolerations: [] + nodeSelector: {} + podAffinity: {} + podAntiAffinity: {} + + apiserver: + tolerations: [] + nodeSelector: {} + podAffinity: {} + # By default, apiserver pods prefer not to be scheduled to the same node. 
+ podAntiAffinity: + preferredDuringSchedulingIgnoredDuringExecution: + - weight: 1 + podAffinityTerm: + labelSelector: + matchExpressions: + - key: brupop.bottlerocket.aws/component + operator: In + values: + - apiserver + topologyKey: kubernetes.io/hostname +``` --- @@ -154,13 +247,20 @@ __Helm Configuration__: `image_pull_secrets` If you are testing Brupop with a private image registry, you can configure pull secrets to fetch images. +Example: + +```yaml +image_pull_secrets: + - name: "brupop" +``` + --- ### Scheduling __Helm Configuration__: `scheduler_cron_expression` -Brupop schedules node updates based a cron expression in the following format: +Brupop schedules node updates based on a cron expression in the following format: ```text ┌───────────── seconds (0 - 59) @@ -174,3 +274,10 @@ Brupop schedules node updates based a cron expression in the following format: │ │ │ │ │ │ │ * * * * * * * ``` + +Example: + +```yaml +# Every day at 3 AM +scheduler_cron_expression: "* * 3 * * * *" +``` diff --git a/content/en/brupop/1.3.x/setup/install/index.markdown b/content/en/brupop/1.3.x/setup/install/index.markdown index 3fbb0cd3..8e70165c 100644 --- a/content/en/brupop/1.3.x/setup/install/index.markdown +++ b/content/en/brupop/1.3.x/setup/install/index.markdown @@ -11,7 +11,7 @@ You can install Brupop either [with `helm`](#install-with-helm) or a [pre-baked ## Install with `helm` -First, add the `bottlerocket-operator-chart` +First, using [helm](https://helm.sh/) add the `bottlerocket-operator-chart` ```shell helm repo add brupop https://bottlerocket-os.github.io/bottlerocket-update-operator @@ -45,7 +45,7 @@ After you've installed the operator, you can move on to the next step: [configur ## Install with a Manifest -First, download the manifest from the release to your local machine and run the following: +First, }}">download the manifest from the release to your local machine and run the following: ```shell kubectl apply -f bottlerocket-update-operator-v{{< 
current-version project="brupop" >}}.yaml diff --git a/content/en/brupop/1.3.x/troubleshoot/index.markdown b/content/en/brupop/1.3.x/troubleshoot/index.markdown index 3daf571d..d9275ec2 100644 --- a/content/en/brupop/1.3.x/troubleshoot/index.markdown +++ b/content/en/brupop/1.3.x/troubleshoot/index.markdown @@ -1,18 +1,19 @@ +++ -type="docs" -title="Troubleshoot" -weight=30 +type = "docs" +title = "Troubleshoot" +weight = 30 +description = "Debugging and solving Brupop problems" +++ ## Debugging information -Brupop’s components emit useful logs for debugging and troubleshooting. +Brupop’s components emit useful logs for debugging and troubleshooting. ### API Server deployment logs Searching through the API Server’s deployment logs for a particular Node ID will yield the mutations to the node. Assuming the default namespace you can retrieve these by running: -``` +```shell kubectl logs deployment/brupop-apiserver --namespace brupop-bottlerocket-aws ``` @@ -22,13 +23,13 @@ Logs from the agent show the specific update actions taken on a particular node. First, find the node in the list of the Brupop agent pods (assuming the default namespace): -``` +```shell kubectl get pods --selector=brupop.bottlerocket.aws/component=agent -o wide --namespace brupop-bottlerocket-aws ``` From this list get the logs for the agent you’re troubleshooting by replacing `` with the node name from the previous step. -``` +```shell kubectl logs --namespace brupop-bottlerocket-aws ``` @@ -36,20 +37,22 @@ kubectl logs --namespace brupop-bottlerocket-aws ### Stuck Updates -When one or mode nodes do not progress through the states and return to idle it is a “stuck update.” By default, Brupop only updates one node so a single node can prevent nodes across the cluster from updating. +When one or more nodes do not progress through the states and return to idle it is a "stuck update." By default, Brupop only updates one node so a single node can prevent nodes across the cluster from updating. 
There are a few potential causes of stuck updates: -1. Pod Disruption Budget preventing a node drain. Brupop uses the Kubernetes Eviction API to drain pods form a node. It’s possible to Pod Disrutpion Budgets configured (often mistakenly) to disallow a pod removal resulting in a un-drainable node that Brupop cannot update. - **Troubleshooting step:** Check your pod disruption budget configuration. -2. Unable to access `updates.bottlerocket.aws`. Bottlerocket needs to access metadata from a public endpoint to get information about the most recent release. Production environments may limit this type of outbound access. - **Troubleshooting step:** Log into the control container of a node and run `apiclient update check`. - Failures with this check indicate an outbound block. - **Potential solution:** Scrape the contents of `updates.bottlerocket.aws` with `[Tuftool](https://github.com/awslabs/tough/tree/develop/tuftool#download-tuf-repo)` and serve from within your cluster, then update your settings accordingly for `settings.updates.metadata-base-url` and `settings.updates.metadata-base-url`. -3. Other issues while updating. - **Troubleshooting step:** Check the agent logs for the stuck node. +1. Pod Disruption Budget preventing a node drain. Brupop uses the Kubernetes Eviction API to drain pods from a node. +It’s possible to have Pod Disruption Budgets configured (often mistakenly) to disallow a pod removal, resulting in an un-drainable node that Brupop cannot update. + **Troubleshooting step:** Check your pod disruption budget configuration. +2. Unable to access `updates.bottlerocket.aws`. +Bottlerocket needs to access metadata from a public endpoint to get information about the most recent release. Production environments may limit this type of outbound access. +**Troubleshooting step:** Log into the control container of a node and run `apiclient update check`. +Failures with this check indicate an outbound block. 
+**Potential solution:** Scrape the contents of `updates.bottlerocket.aws` with [`Tuftool`](https://github.com/awslabs/tough/tree/develop/tuftool#download-tuf-repo) and serve from within your cluster, then update your settings accordingly for `settings.updates.metadata-base-url` and `settings.updates.targets-base-url`. +3. Other issues while updating. +**Troubleshooting step:** Check the agent logs for the stuck node. -### **Bottlerocket instances start with an old version of Bottlerocket** +### Bottlerocket instances start with an old version of Bottlerocket After using Brupop for a while you may notice that any brand new nodes added to the cluster start with an older version Bottlerocket then Brupop flags them for an update almost immediately. Brupop can only update existing nodes and it doesn’t manage the node creation process. Depending on how you created your nodes determines how to address this issue: @@ -57,10 +60,10 @@ After using Brupop for a while you may notice that any brand new nodes added to * Manual creation of nodes with AWS CLI: Update the `image-id` argument to the latest AMI ID * VMware: Change the `target-name` argument when downloading the OVA with tuftool -## Also See +## Related -* Bottlerocket FAQ - * Why do some of the nodes in my cluster have an update available and others do not? - * Why are my nodes egressing to [https://updates.bottlerocket.aws](https://updates.bottlerocket.aws/)? 
-* Log Configuration +* [Bottlerocket FAQ](/en/faq) + - [Why do some of the nodes in my cluster have an update available and others do not?](/en/faq/#7_3) + - [Why are my nodes egressing to `updates.bottlerocket.aws`?](/en/faq/#7_2) +* [Log Configuration](../setup/configure/#logging) diff --git a/content/en/brupop/_index.markdown b/content/en/brupop/_index.markdown index a5e34255..5f7102cc 100644 --- a/content/en/brupop/_index.markdown +++ b/content/en/brupop/_index.markdown @@ -1,11 +1,29 @@ +++ type="docs" title="Brupop" -description="Documentation for the Bottlerocket Update Operator (aka Brupop)" +description="Documentation for the Bottlerocket Update Operator (Brupop)" +body_class="suppress_section_listing" +no_version_warning=true +++ +This section covers installing and using the Bottlerocket Update Operator only. If you’re seeking general information about Bottlerocket updates, {{< cross-project-current-link project="os" url="/en/os/x.x.x/update/" >}}check the Updating documentation for the OS{{< /cross-project-current-link >}}. + +If you’re looking for information on building, contributing to, or learning about the inner workings of Brupop, the [GitHub repo](https://github.com/bottlerocket-os/bottlerocket-update-operator) is a better destination. + +## Organization + +The Brupop documentation is organized by minor version, with each minor release getting its own namespaced, version-specific section. Inside each version-specific section are subsections that address specific tasks or categories of information. + +The current documented versions: + +{{< subsections-list >}} + ## Version & Update Policy Brupop follows semantic ([semver](https://semver.org/)) versioning to ensure that minor (e.g. 1.1.1 -> 1.2.0) or patch (e.g. 1.1.0 -> 1.1.1) updates do not introduce any breaking or incompatible changes. However, patches are only provided to the latest version, so you should keep your Brupop installation up to date with the lastest release. 
+ +## Something Missing? + +This [documentation is open-source](https://github.com/bottlerocket-os/bottlerocket-project-website/tree/main/content/en/brupop) and likely incomplete, but will evolve over time to encompass a more complete explanation of the software. Should you find gaps, you’re invited to file issues or contribute. diff --git a/content/en/faqitems/7_2-why-updates-bottlerocket-aws.markdown b/content/en/faqitems/7_2-why-updates-bottlerocket-aws.markdown index cd753518..460bb82e 100644 --- a/content/en/faqitems/7_2-why-updates-bottlerocket-aws.markdown +++ b/content/en/faqitems/7_2-why-updates-bottlerocket-aws.markdown @@ -1,5 +1,5 @@ +++ -question = "Why are my nodes egressing to https://updates.bottlerocket.aws?" +question = "Why are my nodes egressing to `updates.bottlerocket.aws`?" group = "Updates" +++ diff --git a/content/en/faqitems/7_3_not_all_nodes_have_available_update.markdown b/content/en/faqitems/7_3-not_all_nodes_have_available_update.markdown similarity index 99% rename from content/en/faqitems/7_3_not_all_nodes_have_available_update.markdown rename to content/en/faqitems/7_3-not_all_nodes_have_available_update.markdown index b61a6178..493a4734 100644 --- a/content/en/faqitems/7_3_not_all_nodes_have_available_update.markdown +++ b/content/en/faqitems/7_3-not_all_nodes_have_available_update.markdown @@ -2,7 +2,6 @@ question = "Why do some of the nodes in my cluster have an update available and others do not?" group = "Updates" +++ - This is normal. Bottlerocket uses "waves" to stagger deployment of updates. When a node starts for the first time, the boot process generates a random seed (or uses the value from {{< setting-reference setting="settings.updates.seed" current_version="true">}}settings.updates.seed{{}}). 
diff --git a/data/versions/current.toml b/data/versions/current.toml index 5ff3e96c..ded62c72 100644 --- a/data/versions/current.toml +++ b/data/versions/current.toml @@ -11,6 +11,6 @@ [brupop] major = 1 - minor = 13 + minor = 3 patch = 0 - tag_commit = "6455a43fd717765da044a95a18a60a8286020971" + cert_manager = "1.14.1" diff --git a/layouts/shortcodes/brupop-agent-controller-diagram.html b/layouts/shortcodes/brupop-agent-controller-diagram.html deleted file mode 100644 index 1717e3f3..00000000 --- a/layouts/shortcodes/brupop-agent-controller-diagram.html +++ /dev/null @@ -1,530 +0,0 @@
-[Deleted SVG diagram markup omitted. Recoverable text labels: Host OS; Kubernetes Control Plane; New Version; Running Version; Old Version; New Running Version; Reboot; Node 1; Update state to Rebooted Into Update; Stop new and drain existing workloads; Update state to Monitoring Update.]
\ No newline at end of file diff --git a/layouts/shortcodes/brupop_agent_api_server_control_plane.html b/layouts/shortcodes/brupop/agent-api-server-control-plane.html similarity index 100% rename from layouts/shortcodes/brupop_agent_api_server_control_plane.html rename to layouts/shortcodes/brupop/agent-api-server-control-plane.html diff --git a/layouts/shortcodes/brupop/agent-controller-diagram.html b/layouts/shortcodes/brupop/agent-controller-diagram.html new file mode 100644 index 00000000..a9df83b0 --- /dev/null +++ b/layouts/shortcodes/brupop/agent-controller-diagram.html @@ -0,0 +1,51 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Bottlerocket Host + + + + + + + + + + + + + Kubernetes Control Plane + + + + \ No newline at end of file diff --git a/layouts/shortcodes/brupop/cert-manager-version.html b/layouts/shortcodes/brupop/cert-manager-version.html new file mode 100644 index 00000000..b406627e --- /dev/null +++ b/layouts/shortcodes/brupop/cert-manager-version.html @@ -0,0 +1 @@ +{{ $.Site.Data.versions.current.brupop.cert_manager }} \ No newline at end of file diff --git a/layouts/shortcodes/brupop_components_diagram.html b/layouts/shortcodes/brupop/components-diagram.html similarity index 83% rename from layouts/shortcodes/brupop_components_diagram.html rename to layouts/shortcodes/brupop/components-diagram.html index 168ff739..d3ca900f 100644 --- a/layouts/shortcodes/brupop_components_diagram.html +++ b/layouts/shortcodes/brupop/components-diagram.html @@ -1,8 +1,8 @@ - + @@ -45,9 +45,9 @@ - - Brupop Agent - (1x / node) + + Brupop Agent + (1x / node) @@ -62,10 +62,10 @@ - - API Server - (Default: - 3x / cluster) + + API Server + (Default: + 3x / cluster) @@ -74,9 +74,9 @@ - - Controller - (1x / cluster) + + Controller + (1x / cluster) \ No newline at end of file diff --git a/layouts/shortcodes/brupop/idle.html b/layouts/shortcodes/brupop/idle.html new file mode 100644 index 00000000..da278a23 --- /dev/null +++ 
b/layouts/shortcodes/brupop/idle.html @@ -0,0 +1,59 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + New Version + + + + Running Version + + + + + Download + + + + + + + Make + Boot Partition + + + + + Update state to + Staged & Performed + + + + \ No newline at end of file diff --git a/layouts/shortcodes/brupop/monitoring.html b/layouts/shortcodes/brupop/monitoring.html new file mode 100644 index 00000000..3efb50a9 --- /dev/null +++ b/layouts/shortcodes/brupop/monitoring.html @@ -0,0 +1,70 @@ + + + + + + + + + + + + + Prev Version + + + + + New Running Version + + + + Node 1 + + + + + + + + + + + + + + + + + + + + + + + + + + + Prev Version + + + + + New Running Version + + + + Node 1 + + + + Update state to + + + + Monitoring Update + + + \ No newline at end of file diff --git a/layouts/shortcodes/brupop/reboot-into-update.html b/layouts/shortcodes/brupop/reboot-into-update.html new file mode 100644 index 00000000..7dc2b69d --- /dev/null +++ b/layouts/shortcodes/brupop/reboot-into-update.html @@ -0,0 +1,94 @@ + + + + + + + + + + + + + + + + + + + New Version + + + + + Running Version + + + + + + + + + + + + + + + + + + Prev Version + + + + + + New Running Version + + + + + + + + + + + + + Reboot + + + + Node 1 + + + + Node 1 + + + + Update state to + + + + Rebooted Into Update + + + + + + + + + Stop new and drain existing worloads + + + \ No newline at end of file diff --git a/layouts/shortcodes/brupop/state-machine.html b/layouts/shortcodes/brupop/state-machine.html new file mode 100644 index 00000000..ed24af03 --- /dev/null +++ b/layouts/shortcodes/brupop/state-machine.html @@ -0,0 +1,52 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Idle + + + Staged & Performed + Update + + + Rebooted + into Update + + + Monitoring + Update + + + + Error Reset + + \ No newline at end of file diff --git a/layouts/shortcodes/github-at-commit.html b/layouts/shortcodes/github-at-commit.html deleted file mode 100644 index 70d65916..00000000 
--- a/layouts/shortcodes/github-at-commit.html +++ /dev/null @@ -1,11 +0,0 @@ -{{- $currentPath := print .Page.File.Dir -}} -{{- /* break apart the path */ -}} -{{- $parts := split $currentPath "/" -}} -{{- /* 1st (base 0) project has the version */ -}} -{{- $path_project := index $parts 0 -}} -{{- $repo := .Get "repo" -}} -{{- $project := .Get "project" | default $path_project -}} -{{- $path := .Get "path" -}} -{{- $tag_commit := index (index $.Site.Data.versions.current $project) "tag_commit" -}} -{{- $github_url := print "https://github.com/" $repo "/blob/" $tag_commit $path }} -{{ .Inner }} \ No newline at end of file diff --git a/layouts/shortcodes/github-link-at-version.html b/layouts/shortcodes/github-link-at-version.html new file mode 100644 index 00000000..06f61ea1 --- /dev/null +++ b/layouts/shortcodes/github-link-at-version.html @@ -0,0 +1,11 @@ +{{- $project := .Get "project" | default "os" -}} +{{- $url_arg := .Get "url" -}} +{{- $replace := .Get "replace" | default "/vx.x.x/" -}} +{{- $current_version_data := $.Site.Data.versions.current -}} +{{- $v := index $current_version_data $project -}} + +{{- $new_url := print "/v" $v.major "." $v.minor "." $v.patch "/" -}} + +{{- $url := replace $url_arg $replace $new_url }} + +{{ .Inner | markdownify }} From 4100f67ab11dc5ae81aac980855ec688a95ec69d Mon Sep 17 00:00:00 2001 From: "Kyle J. 
Davis" Date: Thu, 8 Feb 2024 15:20:26 -0700 Subject: [PATCH 6/8] touchups, uninstall, and faq tweaks --- .../en/brupop/1.3.x/concepts/index.markdown | 19 ++++++++------ .../en/brupop/1.3.x/operate/index.markdown | 2 +- .../1.3.x/setup/configure/index.markdown | 25 ++++++++----------- .../brupop/1.3.x/setup/install/index.markdown | 2 +- .../brupop/1.3.x/troubleshoot/index.markdown | 15 +++++------ .../en/brupop/1.3.x/uninstall/_index.markdown | 22 ++++++++++++++++ content/en/brupop/_index.markdown | 4 +-- layouts/partials/faq-body.html | 2 +- layouts/partials/faq-index.html | 2 +- .../{idle.html => staged-and-performed.html} | 2 +- 10 files changed, 59 insertions(+), 36 deletions(-) create mode 100644 content/en/brupop/1.3.x/uninstall/_index.markdown rename layouts/shortcodes/brupop/{idle.html => staged-and-performed.html} (98%) diff --git a/content/en/brupop/1.3.x/concepts/index.markdown b/content/en/brupop/1.3.x/concepts/index.markdown index 5c47cb00..c21a048b 100644 --- a/content/en/brupop/1.3.x/concepts/index.markdown +++ b/content/en/brupop/1.3.x/concepts/index.markdown @@ -8,11 +8,11 @@ weight = 1 You can update Bottlerocket in a couple of ways: * **node replacement** where new instances with a new version of the OS replace nodes with older versions of the OS, -* **in-place updates** where the node downloads a new version of the OS and reboots into a new version of the OS while maintaining the same instance. +* **in-place updates** where the node downloads a new version of the OS and reboots into a new version of the OS while maintaining the same instance/machine. There is no single preferred nor advised method to update a node; both methods have pros and cons depending on your situation. -You can trigger an {{< cross-project-current-link project="os" url="/en/os/x.x.x/update/methods/in-place/#apiclient-commands">}}in-place update of manually with the API{{< /cross-project-current-link >}} or you can use the Bottlerocket Update Operator (Brupop). 
+You can trigger an {{< cross-project-current-link project="os" url="/en/os/x.x.x/update/methods/in-place/#apiclient-commands">}}in-place update manually with the API{{< /cross-project-current-link >}} or you can use the Bottlerocket Update Operator (Brupop). **Brupop is a Kubernetes operator for managing in-place updates of Bottlerocket on Kubernetes.** If you use Bottlerocket on ECS or intend to replace nodes in Kubernetes, Brupop is not for you. @@ -31,14 +31,14 @@ To achieve this, Brupop does the following: {{< brupop/components-diagram >}} Brupop collects the state of each node with an agent. -The Brupop Agent runs in a container on each node as a DaemonSet. +The Brupop Agent runs in a container on each node as a [DaemonSet](https://kubernetes.io/docs/concepts/workloads/controllers/daemonset/). This agent sends the state to an API Server. API Server instances run in the cluster itself and communicates with the Kubernetes API to record the state as a custom resource. {{< brupop/agent-api-server-control-plane >}} {{< alert title="Bottlerocket API Server vs Brupop API Server?" color="success" >}} -Don’t confuse Bottlerocket’s {{< cross-project-current-link project="os" url="/en/os/x.x.x/concepts/api-driven/">}}API Server{{< /cross-project-current-link >}} with Brupop’s API Server, these are two distinct servers, just with the same name. +Don’t confuse Bottlerocket’s {{< cross-project-current-link project="os" url="/en/os/x.x.x/concepts/api-driven/">}}API Server{{< /cross-project-current-link >}} with Brupop’s API Server, these are two distinct things, just with the same name. In this part of the documentation, unless otherwise noted, assume that “API Server” refers to the Brupop API Server. 
{{< /alert >}} @@ -48,9 +48,9 @@ The Controller also runs in a container on the cluster where it regularly evalua ## States -At any given point nodes are in one of five Brupop states: **idle, staged & performed update, rebooted into update, monitoring update** or **error reset**. +At any given point nodes are in one of five Brupop states: **idle**, **staged & performed update**, **rebooted into update**, **monitoring update** or **error reset**. A node is never in more than one state. -The state of each node is represented as a [Kubernetes Custom Resource](https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/) called a BottlerocketShadow resource or `brs`. +The state of each node is represented as a [Kubernetes Custom Resource](https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/) called a `BottlerocketShadow` resource or `brs`. {{< brupop/state-machine >}} @@ -59,10 +59,11 @@ The state of each node is represented as a [Kubernetes Custom Resource](https:// A node in the **idle** state does not have a pending update in-process. Most of the time your nodes will remain in this state. -{{< brupop/idle >}} - ### Staged & Performed Update +{{< brupop/staged-and-performed >}} + + Bottlerocket uses multiple partitions to manage in-place updates. The OS runs from one partition and, when a new update is available, the update is downloaded and installed into the other. The Brupop controller periodically requests the agent to check for and download the most recent version of Bottlerocket. @@ -83,6 +84,8 @@ Once the node reboots the update is technically complete, however the time whils Bottlerocket’s versioning and variant scheme is built to mitigate incompatibilities between OS versions, there is always a chance that an unforeseen incompatibility exists with some component of your architecture. 
Brupop’s state machine has a reserved state for monitoring these incompatibilities (**Monitoring Updates**), however as of this version, this state is a noop. You can suggest a direction for this state on the [Brupop GitHub Repo](https://github.com/bottlerocket-os/bottlerocket-update-operator/issues/new?assignees=&labels=&projects=&template=issue.md&title=Suggestion%20for%20monitoring%20state). + + Consequently, the Agent immediately transitions through **Monitoring Updates** back to **Idle** with the API server. ### Error Reset diff --git a/content/en/brupop/1.3.x/operate/index.markdown b/content/en/brupop/1.3.x/operate/index.markdown index fccfa7b6..7a8cc9ae 100644 --- a/content/en/brupop/1.3.x/operate/index.markdown +++ b/content/en/brupop/1.3.x/operate/index.markdown @@ -27,7 +27,7 @@ brs-node-2 Idle 1.17.0 StagedUpda ## On-going monitoring -To facilitate on-going monitoring the Brupop API server and controller provide you with metrics endpoints (`/metrics`) compatible with [Prometheus](https://prometheus.io/). +To facilitate on-going monitoring the Brupop API server and controller provides you with metrics endpoints (`/metrics`) compatible with [Prometheus](https://prometheus.io/). The metrics endpoints expose two metrics: one that describes the current version of each node (`brupop_hosts_version`) and another for the [state](../concepts/#states) of each node (`brupop_hosts_state`). For a sample configuration of using Prometheus with Brupop see the {{< github-link-at-version url="https://github.com/bottlerocket-os/bottlerocket-update-operator/blob/vx.x.x/deploy/examples/prometheus-resources.yaml" project="brupop" >}}configuration on the Brupop GitHub Repo{{}}. 
diff --git a/content/en/brupop/1.3.x/setup/configure/index.markdown b/content/en/brupop/1.3.x/setup/configure/index.markdown index f86010b3..3c617bdc 100644 --- a/content/en/brupop/1.3.x/setup/configure/index.markdown +++ b/content/en/brupop/1.3.x/setup/configure/index.markdown @@ -9,10 +9,8 @@ When you install Brupop, the operator comes pre-configured with reasonable defau [Labeling your nodes](#label-nodes) is the only required configuration step. Aside from labeling nodes, you configure Brupop with [helm](https://helm.sh/) or with a manifest. -Helm reduces the configuration burden for Brupop substantially with few down sides, so this documentation focuses on configuration with Helm. -If you choose to not use Helm, refer to the {{< github-link-at-version url="https://github.com/bottlerocket-os/bottlerocket-update-operator/blob/vx.x.x/bottlerocket-update-operator.yaml" project="brupop" >}}pre-baked manifest for an example{{< /github-link-at-version >}}. - - +Helm reduces the configuration burden for Brupop substantially with few down sides, so this documentation focuses on configuration with helm. +If you choose to not use helm, refer to the {{< github-link-at-version url="https://github.com/bottlerocket-os/bottlerocket-update-operator/blob/vx.x.x/bottlerocket-update-operator.yaml" project="brupop" >}}pre-baked manifest for an example{{< /github-link-at-version >}}. ## Required Configuration @@ -24,7 +22,7 @@ You can fully install Brupop but if you do not apply the proper node labels the [Kubernetes node labels](https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/) control which nodes Brupop updates; specifically, the label `bottlerocket.aws/updater-interface-version=2.0.0` dictates which nodes in the cluster get automatic updates. 
-You can label nodes using {{< cross-project-current-link url="/en/os/x.x.x/api/settings/kubernetes/#node-labels" >}}`settings.kubernetes.node-labels`{{}} with TOML ({{< cross-project-current-link url="/en/os/x.x.x/concepts/api-driven/#user-data" >}}including instance user data{{}}), using `apiclient` in a host container, or `kubectl`. +You can label nodes using {{< cross-project-current-link url="/en/os/x.x.x/api/settings/kubernetes/#node-labels" >}}`settings.kubernetes.node-labels`{{}} with TOML {{< cross-project-current-link url="/en/os/x.x.x/concepts/api-driven/#user-data" >}}(including instance user data){{}}, using `apiclient` in a host container, or `kubectl`. #### Label a node with `apiclient` @@ -61,7 +59,7 @@ kubectl label node $(kubectl get nodes -o jsonpath='{.items[*].metadata.name}') #### Labeling a node with the Bottlerocket API -You can add the following TOML to your instance user data: +Add the following TOML to your instance user data: ```TOML ... @@ -82,8 +80,7 @@ apiclient set settings.kubernetes.node-labels.bottlerocket.aws/updater-interface __Helm Configuration__: `apiserver_internal_port` for internal traffic, `apiserver_service_port` for node agent traffic. - -By default, the operator’s API server uses port `8443` for internal traffic and port `443` for node agents, but you can change these ports via configuration. Both ports must be set or the operator will fail to start. +By default, the operator’s API server uses port `8443` for internal traffic and port `443` for node agents, but you can change these ports via this configuration. Both ports must be set or the operator will fail to start. Example: @@ -97,10 +94,10 @@ apiserver_internal_port: "8443" __Helm Configuration__: `max_concurrent_updates` -You can set the maximum concurrency of updates that Brupop will perform. You either set a specific number of concurrent updates or, alternately, `"unlimited"` to update as many nodes as possible concurrently. 
In either case, Brupop always respects [PodDisruptionBudgets](https://kubernetes.io/docs/tasks/run-application/configure-pdb/). +You can set the maximum concurrency of updates that Brupop will perform. You either set a specific number of concurrent updates or, alternately, `"unlimited"` to update as many nodes as possible concurrently. In either case, Brupop always respects [`PodDisruptionBudget`](https://kubernetes.io/docs/tasks/run-application/configure-pdb/). {{% alert title="Conflicts between load balancing and concurrency" color="warning" %}} -Take caution when setting concurrency and excluding load balancers together, as misconfiguration can result in a condition where all nodes exclude load balancing. +Take caution when setting concurrency and [excluding load balancers](#load-balancer-exclusion) together, as misconfiguration can result in a condition where all nodes exclude load balancing and can never drain fully to complete the update. Setting up `PodDisruptionBudget` guards against this condition. {{% /alert %}} Example: @@ -115,8 +112,7 @@ max_concurrent_updates: "1" __Helm Configuration__: `brupop-bottlerocket-aws` - -You can change the namespace where the Kubernetes deploys Brupop (default: `brupop-bottlerocket-aws`). +You can change the namespace where Kubernetes deploys Brupop (default: `brupop-bottlerocket-aws`). Example: @@ -130,11 +126,12 @@ namespace: "brupop-bottlerocket-aws" __Helm Configuration__: `exclude_from_lb_wait_time_in_sec` -With this option, you can control the exclusion of the node from load balancing and delays draining the node for the number of seconds specified. Internally, Brupop uses [`node.kubernetes.io/exclude-from-external-load-balancers`](https://kubernetes.io/docs/reference/labels-annotations-taints/#node-kubernetes-io-exclude-from-external-load-balancers) to exclude the node from load balancing. 
+With this option, you control the exclusion of the node from load balancing and delays draining the node for the number of seconds specified. Internally, Brupop uses [`node.kubernetes.io/exclude-from-external-load-balancers`](https://kubernetes.io/docs/reference/labels-annotations-taints/#node-kubernetes-io-exclude-from-external-load-balancers) to exclude the node from load balancing. See [Concurrent Updates](#concurrent-updates) for an important warning about concurrency and load balancer exclusion. Example: + ```yaml exclude_from_lb_wait_time_in_sec: "0" ``` @@ -173,7 +170,7 @@ Example: ```yaml logging: - ansi_enabled: "pretty" + ansi_enabled: "true" ``` #### Filter diff --git a/content/en/brupop/1.3.x/setup/install/index.markdown b/content/en/brupop/1.3.x/setup/install/index.markdown index 8e70165c..9019d950 100644 --- a/content/en/brupop/1.3.x/setup/install/index.markdown +++ b/content/en/brupop/1.3.x/setup/install/index.markdown @@ -1,7 +1,7 @@ +++ title = "Install Brupop" type = "docs" -description = "Install the Bottlerocket Update Operator to your Kubernetes cluster" +description = "Install the Bottlerocket Update Operator on your Kubernetes cluster" weight = 10 +++ diff --git a/content/en/brupop/1.3.x/troubleshoot/index.markdown b/content/en/brupop/1.3.x/troubleshoot/index.markdown index d9275ec2..28bfa5bb 100644 --- a/content/en/brupop/1.3.x/troubleshoot/index.markdown +++ b/content/en/brupop/1.3.x/troubleshoot/index.markdown @@ -41,24 +41,25 @@ When one or more nodes do not progress through the states and return to idle it There are a few potential causes of stuck updates: -1. Pod Disruption Budget preventing a node drain. Brupop uses the Kubernetes Eviction API to drain pods form a node. -It’s possible to Pod Disrutpion Budgets configured (often mistakenly) to disallow a pod removal resulting in a un-drainable node that Brupop cannot update. +1. Pod Disruption Budget preventing a node drain. 
Brupop uses the Kubernetes Eviction API to drain pods from a node. +It’s possible to have Pod Disruption Budgets configured (often mistakenly) to disallow a pod removal resulting in a un-drainable node that Brupop cannot update. **Troubleshooting step:** Check your pod disruption budget configuration. 2. Unable to access `updates.bottlerocket.aws`. Bottlerocket needs to access metadata from a public endpoint to get information about the most recent release. Production environments may limit this type of outbound access. **Troubleshooting step:** Log into the control container of a node and run `apiclient update check`. Failures with this check indicate an outbound block. -**Potential solution:** Scrape the contents of `updates.bottlerocket.aws` with [`Tuftool`](https://github.com/awslabs/tough/tree/develop/tuftool#download-tuf-repo) and serve from within your cluster, then update your settings accordingly for `settings.updates.metadata-base-url` and `settings.updates.metadata-base-url`. +**Potential solution:** Scrape the contents of `updates.bottlerocket.aws` with [`Tuftool`](https://github.com/awslabs/tough/tree/develop/tuftool#download-tuf-repo) and serve from within your cluster, then update your settings accordingly for {{< setting-reference setting="settings.updates.metadata-base-url" current_version="true">}}settings.updates.metadata-base-url{{}} and {{< setting-reference setting="settings.updates.targets-base-url" current_version="true">}}settings.updates.targets-base-url{{}}. + 3. Other issues while updating. **Troubleshooting step:** Check the agent logs for the stuck node. ### Bottlerocket instances start with an old version of Bottlerocket -After using Brupop for a while you may notice that any brand new nodes added to the cluster start with an older version Bottlerocket then Brupop flags them for an update almost immediately. Brupop can only update existing nodes and it doesn’t manage the node creation process. 
Depending on how you created your nodes determines how to address this issue: +After using Brupop for a while you may notice that any brand new nodes added to the cluster start with an older version of Bottlerocket then Brupop flags them for an update almost immediately. Brupop can only update existing nodes and it doesn’t manage the node creation process. Depending on how you created your nodes determines how to address this issue: -* Auto-scaling group: update your AMI ID in the launch configuration or template. -* Manual creation of nodes with AWS CLI: Update the `image-id` argument to the latest AMI ID -* VMware: Change the `target-name` argument when downloading the OVA with tuftool +* **Auto-scaling group**: update your AMI ID in the launch configuration or template. +* **Manual creation of nodes with AWS CLI**: Update the `image-id` argument to the latest AMI ID +* **VMware**: Change the `target-name` argument when downloading the OVA with tuftool ## Related diff --git a/content/en/brupop/1.3.x/uninstall/_index.markdown b/content/en/brupop/1.3.x/uninstall/_index.markdown new file mode 100644 index 00000000..9393dd2c --- /dev/null +++ b/content/en/brupop/1.3.x/uninstall/_index.markdown @@ -0,0 +1,22 @@ ++++ +type = "docs" +title = "Disable/Uninstall" +weight = 90 +description = "Removing Brupop from nodes or your cluster" ++++ + +You can disable Brupop from managing some or all nodes of your cluster as well as fully remove it from your cluster. + +## Disabling Brupop on nodes + +Brupop will only manage updates for the nodes you’ve labeled `bottlerocket.aws/updater-interface-version=2.0.0`. +Consequently, if you remove the label, Brupop will no longer manage the node updates. See the [Kubectl `label` docs](https://kubernetes.io/docs/reference/generated/kubectl/kubectl-commands#label) for more information on removing a label. 
+ +## Uninstalling Brupop + +To fully remove Brupop from your cluster, execute the following [helm](https://helm.sh/) uninstall operations on your cluster: + +```shell +helm uninstall brupop +helm uninstall brupop-crd +``` diff --git a/content/en/brupop/_index.markdown b/content/en/brupop/_index.markdown index 5f7102cc..92e171eb 100644 --- a/content/en/brupop/_index.markdown +++ b/content/en/brupop/_index.markdown @@ -21,8 +21,8 @@ The current documented versions: ## Version & Update Policy -Brupop follows semantic ([semver](https://semver.org/)) versioning to ensure that minor (e.g. 1.1.1 -> 1.2.0) or patch (e.g. 1.1.0 -> 1.1.1) updates do not introduce any breaking or incompatible changes. -However, patches are only provided to the latest version, so you should keep your Brupop installation up to date with the lastest release. +Brupop follows semantic ([semver](https://semver.org/)) versioning to ensure that minor (e.g. `1.1.1` -> `1.2.0`) or patch (e.g. `1.1.0` -> `1.1.1`) updates do not introduce any breaking or incompatible changes. +However, patches are only provided to the latest version, so you should keep your Brupop installation up to date with the latest release. ## Something Missing? diff --git a/layouts/partials/faq-body.html b/layouts/partials/faq-body.html index 3455fe91..773f43fb 100644 --- a/layouts/partials/faq-body.html +++ b/layouts/partials/faq-body.html @@ -6,7 +6,7 @@

{{ $group_name }}

{{- range (sort (index $questions $group_name) "question" ) -}} -

{{ .question }}

+

{{ .question | markdownify }}

{{ .answer }} {{- end -}}
diff --git a/layouts/partials/faq-index.html b/layouts/partials/faq-index.html index 2ac6cbc3..43dc421b 100644 --- a/layouts/partials/faq-index.html +++ b/layouts/partials/faq-index.html @@ -6,7 +6,7 @@
    {{- range (sort (index $questions $group_name) "question" ) -}}
  1. - {{ .question }} + {{ .question | markdownify }}
  2. {{- end -}}
diff --git a/layouts/shortcodes/brupop/idle.html b/layouts/shortcodes/brupop/staged-and-performed.html similarity index 98% rename from layouts/shortcodes/brupop/idle.html rename to layouts/shortcodes/brupop/staged-and-performed.html index da278a23..a2d6fb52 100644 --- a/layouts/shortcodes/brupop/idle.html +++ b/layouts/shortcodes/brupop/staged-and-performed.html @@ -1,5 +1,5 @@ + width="60%" viewBox="-0.5 -0.5 490 271" role="img"> From 84f635ab92bfb02ff833c0ae14c7cf70507430ec Mon Sep 17 00:00:00 2001 From: "Kyle J. Davis" Date: Wed, 14 Feb 2024 07:27:43 -0700 Subject: [PATCH 7/8] Update content/en/brupop/1.3.x/concepts/index.markdown MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Co-authored-by: Arnaldo García --- content/en/brupop/1.3.x/concepts/index.markdown | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/en/brupop/1.3.x/concepts/index.markdown b/content/en/brupop/1.3.x/concepts/index.markdown index c21a048b..d008993e 100644 --- a/content/en/brupop/1.3.x/concepts/index.markdown +++ b/content/en/brupop/1.3.x/concepts/index.markdown @@ -8,7 +8,7 @@ weight = 1 You can update Bottlerocket in a couple of ways: * **node replacement** where new instances with a new version of the OS replace nodes with older versions of the OS, -* **in-place updates** where the node downloads a new version of the OS and reboots into a new version of the OS while maintaining the same instance/machine. +* **in-place updates** where the node downloads and reboots into a new version of the OS while maintaining the same instance/machine. There is no single preferred nor advised method to update a node; both methods have pros and cons depending on your situation. From aa10c5281d1ff1713f644fd9132d5d67aef70294 Mon Sep 17 00:00:00 2001 From: "Kyle J. 
Davis" Date: Wed, 14 Feb 2024 08:43:31 -0700 Subject: [PATCH 8/8] feedback fixes, tidy line breaks, add current to version label --- content/en/brupop/1.3.x/_index.markdown | 2 +- .../en/brupop/1.3.x/concepts/index.markdown | 4 +-- .../en/brupop/1.3.x/operate/index.markdown | 6 +++-- .../1.3.x/setup/cert-manager/index.markdown | 3 ++- .../1.3.x/setup/configure/index.markdown | 27 +++++++++++-------- .../brupop/1.3.x/troubleshoot/index.markdown | 14 ++++++---- .../en/brupop/1.3.x/uninstall/_index.markdown | 3 ++- content/en/brupop/_index.markdown | 9 ++++--- 8 files changed, 41 insertions(+), 27 deletions(-) diff --git a/content/en/brupop/1.3.x/_index.markdown b/content/en/brupop/1.3.x/_index.markdown index 4b634b54..2778aee2 100644 --- a/content/en/brupop/1.3.x/_index.markdown +++ b/content/en/brupop/1.3.x/_index.markdown @@ -1,4 +1,4 @@ +++ type="docs" -title="1.3.x" +title="1.3.x (Current)" +++ diff --git a/content/en/brupop/1.3.x/concepts/index.markdown b/content/en/brupop/1.3.x/concepts/index.markdown index d008993e..7fc35297 100644 --- a/content/en/brupop/1.3.x/concepts/index.markdown +++ b/content/en/brupop/1.3.x/concepts/index.markdown @@ -63,7 +63,6 @@ Most of the time your nodes will remain in this state. {{< brupop/staged-and-performed >}} - Bottlerocket uses multiple partitions to manage in-place updates. The OS runs from one partition and, when a new update is available, the update is downloaded and installed into the other. The Brupop controller periodically requests the agent to check for and download the most recent version of Bottlerocket. @@ -85,9 +84,8 @@ Bottlerocket’s versioning and variant scheme is built to mitigate incompatibil Brupop’s state machine has a reserved state for monitoring these incompatibilities (**Monitoring Updates**), however as of this version, this state is a noop. 
You can suggest a direction for this state on the [Brupop GitHub Repo](https://github.com/bottlerocket-os/bottlerocket-update-operator/issues/new?assignees=&labels=&projects=&template=issue.md&title=Suggestion%20for%20monitoring%20state). - Consequently, the Agent immediately transitions through **Monitoring Updates** back to **Idle** with the API server. ### Error Reset -In the situation that any of the above states fail, the state becomes **Error Reset** before transitioning back to **Idle**. \ No newline at end of file +In the situation that any of the above states fail, the state becomes **Error Reset** before transitioning back to **Idle**. diff --git a/content/en/brupop/1.3.x/operate/index.markdown b/content/en/brupop/1.3.x/operate/index.markdown index 7a8cc9ae..39181437 100644 --- a/content/en/brupop/1.3.x/operate/index.markdown +++ b/content/en/brupop/1.3.x/operate/index.markdown @@ -5,7 +5,8 @@ weight = 10 description = "Understanding the day-to-day use of Brupop" +++ -After installation on your cluster Brupop runs in the background and generally requires no intervention. Your nodes will check for updates and apply them according to your configuration and the Bottlerocket update waves. +After installation on your cluster Brupop runs in the background and generally requires no intervention. +Your nodes will check for updates and apply them according to your configuration and the Bottlerocket update waves. However, you can observe the status of the updates by [adhoc query](#adhoc-query) or setup [on-going monitoring](#on-going-monitoring). @@ -17,7 +18,8 @@ If you want to see the update status of your nodes, use `kubectl` to get the cus kubectl get brs --namespace brupop-bottlerocket-aws ``` -`kubectl` returns the [state](../concepts/#states), current version, target state, and target version. For example: +`kubectl` returns the [state](../concepts/#states), current version, target state, and target version. 
+For example: ```shell AME STATE VERSION TARGET STATE TARGET VERSION diff --git a/content/en/brupop/1.3.x/setup/cert-manager/index.markdown b/content/en/brupop/1.3.x/setup/cert-manager/index.markdown index fd0bec09..8324f870 100644 --- a/content/en/brupop/1.3.x/setup/cert-manager/index.markdown +++ b/content/en/brupop/1.3.x/setup/cert-manager/index.markdown @@ -5,7 +5,8 @@ description = "Prepare your cluster for Brupop" weight = 1 +++ -Brupop uses [cert-manager](https://cert-manager.io/) to manage self-signed certificates. You can install it with `kubectl` or [helm](https://helm.sh/). +Brupop uses [cert-manager](https://cert-manager.io/) to manage self-signed certificates. +You can install it with `kubectl` or [helm](https://helm.sh/). {{% alert title="Note" color="success" %}} This guide uses the most recent release of `cert-manager`, {{< brupop/cert-manager-version >}}, but there is no particular hard dependency on this version. diff --git a/content/en/brupop/1.3.x/setup/configure/index.markdown b/content/en/brupop/1.3.x/setup/configure/index.markdown index 3c617bdc..7a463a45 100644 --- a/content/en/brupop/1.3.x/setup/configure/index.markdown +++ b/content/en/brupop/1.3.x/setup/configure/index.markdown @@ -26,6 +26,8 @@ You can label nodes using {{< cross-project-current-link url="/en/os/x.x.x/api/s #### Label a node with `apiclient` +From the control or admin container, run the following: + ```shell apiclient set settings.kubernetes.node-labels.bottlerocket.aws/updater-interface-version=2.0.0 ``` @@ -64,23 +66,22 @@ Add the following TOML to your instance user data: ```TOML ... [settings.kubernetes.node-labels] -"bottlerocket.aws/updater-interface-version" = 2.0.0 +"bottlerocket.aws/updater-interface-version" = "2.0.0" ... 
``` -From the control container, run the following: - -```shell -apiclient set settings.kubernetes.node-labels.bottlerocket.aws/updater-interface-version=2.0.0 -``` - ## Optional Configuration ### API Server Ports __Helm Configuration__: `apiserver_internal_port` for internal traffic, `apiserver_service_port` for node agent traffic. -By default, the operator’s API server uses port `8443` for internal traffic and port `443` for node agents, but you can change these ports via this configuration. Both ports must be set or the operator will fail to start. +Brupop uses two ports for [communication between components](../../concepts/#controlled-updates): `apiserver_internal_port` for the controller and the [`BottlerocketShadow` custom resource](../../concepts/#states) and the `apiserver_service_port` for the conversion webhook. +Refer to the +{{< github-link-at-version project="brupop" url="https://github.com/bottlerocket-os/bottlerocket-update-operator/blob/vx.x.x/bottlerocket-update-operator.yaml">}} manifest {{< / github-link-at-version >}} for more information on the usage of each port. + +By default, the operator’s API server uses port `8443` for internal traffic and port `443` for node agents, but you can change these ports via this configuration. +Both ports must be set or the operator will fail to start. Example: @@ -94,10 +95,13 @@ apiserver_internal_port: "8443" __Helm Configuration__: `max_concurrent_updates` -You can set the maximum concurrency of updates that Brupop will perform. You either set a specific number of concurrent updates or, alternately, `"unlimited"` to update as many nodes as possible concurrently. In either case, Brupop always respects [`PodDisruptionBudget`](https://kubernetes.io/docs/tasks/run-application/configure-pdb/). +You can set the maximum concurrency of updates that Brupop will perform. +You either set a specific number of concurrent updates or, alternately, `"unlimited"` to update as many nodes as possible concurrently.
+In either case, Brupop always respects [`PodDisruptionBudget`](https://kubernetes.io/docs/tasks/run-application/configure-pdb/). {{% alert title="Conflicts between load balancing and concurrency" color="warning" %}} -Take caution when setting concurrency and [excluding load balancers](#load-balancer-exclusion) together, as misconfiguration can result in a condition where all nodes exclude load balancing and can never drain fully to complete the update. Setting up `PodDisruptionBudget` guards against this condition. +Use caution when setting concurrency and [excluding load balancers](#load-balancer-exclusion) together, as misconfiguration can result in a condition where all nodes exclude load balancing and can never drain fully to complete the update. +Setting up `PodDisruptionBudget` guards against this condition. {{% /alert %}} Example: @@ -126,7 +130,8 @@ namespace: "brupop-bottlerocket-aws" __Helm Configuration__: `exclude_from_lb_wait_time_in_sec` -With this option, you control the exclusion of the node from load balancing and delays draining the node for the number of seconds specified. Internally, Brupop uses [`node.kubernetes.io/exclude-from-external-load-balancers`](https://kubernetes.io/docs/reference/labels-annotations-taints/#node-kubernetes-io-exclude-from-external-load-balancers) to exclude the node from load balancing. +With this option, you control the exclusion of the node from load balancing and delay draining the node for the number of seconds specified. +Internally, Brupop uses [`node.kubernetes.io/exclude-from-external-load-balancers`](https://kubernetes.io/docs/reference/labels-annotations-taints/#node-kubernetes-io-exclude-from-external-load-balancers) to exclude the node from load balancing. See [Concurrent Updates](#concurrent-updates) for an important warning about concurrency and load balancer exclusion.
diff --git a/content/en/brupop/1.3.x/troubleshoot/index.markdown b/content/en/brupop/1.3.x/troubleshoot/index.markdown index 28bfa5bb..b2c7c61c 100644 --- a/content/en/brupop/1.3.x/troubleshoot/index.markdown +++ b/content/en/brupop/1.3.x/troubleshoot/index.markdown @@ -11,7 +11,8 @@ Brupop’s components emit useful logs for debugging and troubleshooting. ### API Server deployment logs -Searching through the API Server’s deployment logs for a particular Node ID will yield the mutations to the node. Assuming the default namespace you can retrieve these by running: +Searching through the API Server’s deployment logs for a particular Node ID will yield the mutations to the node. +Assuming the default namespace you can retrieve these by running: ```shell kubectl logs deployment/brupop-apiserver --namespace brupop-bottlerocket-aws @@ -41,11 +42,13 @@ When one or more nodes do not progress through the states and return to idle it There are a few potential causes of stuck updates: -1. Pod Disruption Budget preventing a node drain. Brupop uses the Kubernetes Eviction API to drain pods from a node. +1. Pod Disruption Budget preventing a node drain. +Brupop uses the Kubernetes Eviction API to drain pods from a node. It’s possible to have Pod Disruption Budgets configured (often mistakenly) to disallow a pod removal resulting in a un-drainable node that Brupop cannot update. **Troubleshooting step:** Check your pod disruption budget configuration. 2. Unable to access `updates.bottlerocket.aws`. -Bottlerocket needs to access metadata from a public endpoint to get information about the most recent release. Production environments may limit this type of outbound access. +Bottlerocket needs to access metadata from a public endpoint to get information about the most recent release. +Production environments may limit this type of outbound access. **Troubleshooting step:** Log into the control container of a node and run `apiclient update check`. 
Failures with this check indicate an outbound block. **Potential solution:** Scrape the contents of `updates.bottlerocket.aws` with [`Tuftool`](https://github.com/awslabs/tough/tree/develop/tuftool#download-tuf-repo) and serve from within your cluster, then update your settings accordingly for {{< setting-reference setting="settings.updates.metadata-base-url" current_version="true">}}settings.updates.metadata-base-url{{}} and {{< setting-reference setting="settings.updates.targets-base-url" current_version="true">}}settings.updates.targets-base-url{{}}. @@ -55,7 +58,9 @@ Failures with this check indicate an outbound block. ### Bottlerocket instances start with an old version of Bottlerocket -After using Brupop for a while you may notice that any brand new nodes added to the cluster start with an older version of Bottlerocket then Brupop flags them for an update almost immediately. Brupop can only update existing nodes and it doesn’t manage the node creation process. Depending on how you created your nodes determines how to address this issue: +After using Brupop for a while, you may notice that brand new nodes added to the cluster start with an older version of Bottlerocket, and Brupop then flags them for an update almost immediately. +Brupop can only update existing nodes; it doesn’t manage the node creation process. +How you created your nodes determines how to address this issue: * **Auto-scaling group**: update your AMI ID in the launch configuration or template.
* **Manual creation of nodes with AWS CLI**: Update the `image-id` argument to the latest AMI ID @@ -67,4 +72,3 @@ After using Brupop for a while you may notice that any brand new nodes added to - [Why do some of the nodes in my cluster have an update available and others do not?](/en/faq/#7_3) - [Why are my nodes egressing to `updates.bottlerocket.aws`?](/en/faq/#7_2) * [Log Configuration](../setup/configure/#logging) - diff --git a/content/en/brupop/1.3.x/uninstall/_index.markdown b/content/en/brupop/1.3.x/uninstall/_index.markdown index 9393dd2c..6cdbfb02 100644 --- a/content/en/brupop/1.3.x/uninstall/_index.markdown +++ b/content/en/brupop/1.3.x/uninstall/_index.markdown @@ -10,7 +10,8 @@ You can disable Brupop from managing some or all nodes of your cluster as well a ## Disabling Brupop on nodes Brupop will only manage updates for the nodes you’ve labeled `bottlerocket.aws/updater-interface-version=2.0.0`. -Consequently, if you remove the label, Brupop will no longer manage the node updates. See the [Kubectl `label` docs](https://kubernetes.io/docs/reference/generated/kubectl/kubectl-commands#label) for more information on removing a label. +Consequently, if you remove the label, Brupop will no longer manage the node updates. +See the [Kubectl `label` docs](https://kubernetes.io/docs/reference/generated/kubectl/kubectl-commands#label) for more information on removing a label. ## Uninstalling Brupop diff --git a/content/en/brupop/_index.markdown b/content/en/brupop/_index.markdown index 92e171eb..06770508 100644 --- a/content/en/brupop/_index.markdown +++ b/content/en/brupop/_index.markdown @@ -7,13 +7,15 @@ no_version_warning=true +++ -This section covers installing and using the Bottlerocket Update Operator only. If you’re seeking general information about Bottlerocket updates, {{< cross-project-current-link project="os" url="/en/os/x.x.x/update/" >}}check the Updating documentation for the OS{{< /cross-project-current-link >}}. 
+This section covers installing and using the Bottlerocket Update Operator only. +If you’re seeking general information about Bottlerocket updates, {{< cross-project-current-link project="os" url="/en/os/x.x.x/update/" >}}check the Updating documentation for the OS{{< /cross-project-current-link >}}. If you’re looking for information on building, contributing to, or learning about the inner workings of Brupop, the [GitHub repo](https://github.com/bottlerocket-os/bottlerocket-update-operator) is a better destination. ## Organization -The Brupop documentation is organized by minor version, with each minor release getting it’s own namespaced, version-specific section. Inside each version-specific sections are subsections which address specific tasks or categories of information. +The Brupop documentation is organized by minor version, with each minor release getting its own namespaced, version-specific section. +Inside each version-specific section are subsections that address specific tasks or categories of information. The current documented versions: @@ -26,4 +28,5 @@ However, patches are only provided to the latest version, so you should keep you ## Something Missing? -This [documentation is open-source](https://github.com/bottlerocket-os/bottlerocket-project-website/tree/main/content/en/brupop) and likely incomplete, but will evolve over time to encompass a more complete explanation of the software. Should you find gaps, you’re invited to file issues or contribute. +This [documentation is open-source](https://github.com/bottlerocket-os/bottlerocket-project-website/tree/main/content/en/brupop) and likely incomplete, but will evolve over time to encompass a more complete explanation of the software. +Should you find gaps, you’re invited to file issues or contribute.