diff --git a/docs/src/_parts/bootstrap_config.md b/docs/src/_parts/bootstrap_config.md new file mode 100644 index 000000000..6594f10ae --- /dev/null +++ b/docs/src/_parts/bootstrap_config.md @@ -0,0 +1,485 @@ +### cluster-config +**Type:** `object`
+ + +### cluster-config.network +**Type:** `object`
+ +Configuration options for the network feature. + +### cluster-config.network.enabled +**Type:** `bool`
+ +Determines if the feature should be enabled. +If omitted defaults to `true` + +### cluster-config.dns +**Type:** `object`
+ +Configuration options for the dns feature. + +### cluster-config.dns.enabled +**Type:** `bool`
+ +Determines if the feature should be enabled. +If omitted defaults to `true` + +### cluster-config.dns.cluster-domain +**Type:** `string`
+ +Sets the local domain of the cluster. +If omitted defaults to `cluster.local`. + +### cluster-config.dns.service-ip +**Type:** `string`
 + +Sets the IP address of the dns service. If omitted defaults to the IP address +of the Kubernetes service created by the feature. + +Can be used to point to an external dns server when the feature is disabled. + +### cluster-config.dns.upstream-nameservers +**Type:** `[]string`<br>
+ +Sets the upstream nameservers used to forward queries for out-of-cluster +endpoints. + +If omitted defaults to `/etc/resolv.conf` and uses the nameservers of the node. + +### cluster-config.ingress +**Type:** `object`
+ +Configuration options for the ingress feature. + +### cluster-config.ingress.enabled +**Type:** `bool`
+ +Determines if the feature should be enabled. +If omitted defaults to `false` + +### cluster-config.ingress.default-tls-secret +**Type:** `string`
+ +Sets the name of the secret to be used for providing default encryption to +ingresses. + +Ingresses can specify another TLS secret in their resource definitions, +in which case the default secret won't be used. + +### cluster-config.ingress.enable-proxy-protocol +**Type:** `bool`
+ +Determines if the proxy protocol should be enabled for ingresses. +If omitted defaults to `false`. + +### cluster-config.load-balancer +**Type:** `object`
+ +Configuration options for the load-balancer feature. + +### cluster-config.load-balancer.enabled +**Type:** `bool`
+ +Determines if the feature should be enabled. +If omitted defaults to `false`. + +### cluster-config.load-balancer.cidrs +**Type:** `[]string`
+ +Sets the CIDRs used for assigning IP addresses to Kubernetes services with type +`LoadBalancer`. + +### cluster-config.load-balancer.l2-mode +**Type:** `bool`
+ +Determines if L2 mode should be enabled. +If omitted defaults to `false`. + +### cluster-config.load-balancer.l2-interfaces +**Type:** `[]string`
+ +Sets the interfaces to be used for announcing IP addresses through ARP. +If omitted all interfaces will be used. + +### cluster-config.load-balancer.bgp-mode +**Type:** `bool`
+ +Determines if BGP mode should be enabled. +If omitted defaults to `false`. + +### cluster-config.load-balancer.bgp-local-asn +**Type:** `int`
+ +Sets the ASN to be used for the local virtual BGP router. +Required if bgp-mode is true. + +### cluster-config.load-balancer.bgp-peer-address +**Type:** `string`
+ +Sets the IP address of the BGP peer. +Required if bgp-mode is true. + +### cluster-config.load-balancer.bgp-peer-asn +**Type:** `int`
+ +Sets the ASN of the BGP peer. +Required if bgp-mode is true. + +### cluster-config.load-balancer.bgp-peer-port +**Type:** `int`
 + +Sets the port of the BGP peer. +Required if bgp-mode is true.
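For illustration only, a `cluster-config.load-balancer` section with BGP mode enabled might look like the following sketch; the CIDR, ASNs and peer address are placeholders to be replaced with values from your own network:

```yaml
cluster-config:
  load-balancer:
    enabled: true
    cidrs:
      - 10.100.0.0/24            # placeholder address pool for LoadBalancer services
    bgp-mode: true
    bgp-local-asn: 64512         # placeholder ASN for the local virtual BGP router
    bgp-peer-address: 10.0.10.1  # placeholder peer address
    bgp-peer-asn: 64513          # placeholder peer ASN
    bgp-peer-port: 179           # port of the BGP peer
```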
+### cluster-config.local-storage +**Type:** `object`<br> + +Configuration options for the local-storage feature. + +### cluster-config.local-storage.enabled +**Type:** `bool`<br>
+ +Determines if the feature should be enabled. +If omitted defaults to `false`. + +### cluster-config.local-storage.local-path +**Type:** `string`
+ +Sets the path to be used for storing volume data. +If omitted defaults to `/var/snap/k8s/common/rawfile-storage` + +### cluster-config.local-storage.reclaim-policy +**Type:** `string`
+ +Sets the reclaim policy of the storage class. +If omitted defaults to `Delete`. +Possible values: `Retain | Recycle | Delete` + +### cluster-config.local-storage.default +**Type:** `bool`
+ +Determines if the storage class should be set as default. +If omitted defaults to `true` + +### cluster-config.gateway +**Type:** `object`
+ +Configuration options for the gateway feature. + +### cluster-config.gateway.enabled +**Type:** `bool`
+ +Determines if the feature should be enabled. +If omitted defaults to `true`. + +### cluster-config.metrics-server +**Type:** `object`
+ +Configuration options for the metric server feature. + +### cluster-config.metrics-server.enabled +**Type:** `bool`
+ +Determines if the feature should be enabled. +If omitted defaults to `true`. + +### cluster-config.cloud-provider +**Type:** `string`
 + +Sets the cloud provider to be used by the cluster. + +When this is set to `external`, the node will wait for an external cloud provider to +perform cloud-specific setup and finish node initialisation. + +Possible values: `external`. + +### cluster-config.annotations +**Type:** `map[string]string`<br>
+ +Annotations is a map of strings that can be used to store arbitrary metadata configuration. +Please refer to the annotations reference for further details on these options. + +### control-plane-taints +**Type:** `[]string`
+ +List of taints to be applied to control plane nodes. + +### pod-cidr +**Type:** `string`
+ +The CIDR to be used for assigning pod addresses. +If omitted defaults to `10.1.0.0/16`. + +### service-cidr +**Type:** `string`
+ +The CIDR to be used for assigning service addresses. +If omitted defaults to `10.152.183.0/24`. + +### disable-rbac +**Type:** `bool`
+ +Determines if RBAC should be disabled. +If omitted defaults to `false`. + +### secure-port +**Type:** `int`
+ +The port number for kube-apiserver to use. +If omitted defaults to `6443`. + +### k8s-dqlite-port +**Type:** `int`
+ +The port number for k8s-dqlite to use. +If omitted defaults to `9000`. + +### datastore-type +**Type:** `string`
 + +The type of datastore to be used. +If omitted defaults to `k8s-dqlite`. + +Can be used to point to an external datastore like etcd. + +Possible values: `k8s-dqlite | external`.
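As a sketch of how this combines with the `datastore-servers` field described next (and normally with the datastore certificate fields further below), an external etcd datastore could be declared like this; the endpoints are placeholders:

```yaml
datastore-type: external
datastore-servers:           # placeholder etcd endpoints
  - https://10.0.0.11:2379
  - https://10.0.0.12:2379
```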
+### datastore-servers +**Type:** `[]string`<br> + +The server addresses to be used when `datastore-type` is set to `external`. + +### datastore-ca-crt +**Type:** `string`<br>
+ +The CA certificate to be used when communicating with the external datastore. + +### datastore-client-crt +**Type:** `string`
+ +The client certificate to be used when communicating with the external +datastore. + +### datastore-client-key +**Type:** `string`
+ +The client key to be used when communicating with the external datastore. + +### extra-sans +**Type:** `[]string`
+ +List of extra SANs to be added to certificates. + +### ca-crt +**Type:** `string`
+ +The CA certificate to be used for Kubernetes services. +If omitted defaults to an auto generated certificate. + +### ca-key +**Type:** `string`
+ +The CA key to be used for Kubernetes services. +If omitted defaults to an auto generated key. + +### client-ca-crt +**Type:** `string`
+ +The client CA certificate to be used for Kubernetes services. +If omitted defaults to an auto generated certificate. + +### client-ca-key +**Type:** `string`
+ +The client CA key to be used for Kubernetes services. +If omitted defaults to an auto generated key. + +### front-proxy-ca-crt +**Type:** `string`
+ +The CA certificate to be used for the front proxy. +If omitted defaults to an auto generated certificate. + +### front-proxy-ca-key +**Type:** `string`
+ +The CA key to be used for the front proxy. +If omitted defaults to an auto generated key. + +### front-proxy-client-crt +**Type:** `string`
+ +The client certificate to be used for the front proxy. +If omitted defaults to an auto generated certificate. + +### front-proxy-client-key +**Type:** `string`
+ +The client key to be used for the front proxy. +If omitted defaults to an auto generated key. + +### apiserver-kubelet-client-crt +**Type:** `string`
+ +The client certificate to be used by kubelet for communicating with the kube-apiserver. +If omitted defaults to an auto generated certificate. + +### apiserver-kubelet-client-key +**Type:** `string`
+ +The client key to be used by kubelet for communicating with the kube-apiserver. +If omitted defaults to an auto generated key. + +### admin-client-crt +**Type:** `string`
+ +The admin client certificate to be used for Kubernetes services. +If omitted defaults to an auto generated certificate. + +### admin-client-key +**Type:** `string`
+ +The admin client key to be used for Kubernetes services. +If omitted defaults to an auto generated key. + +### kube-proxy-client-crt +**Type:** `string`
+ +The client certificate to be used for the kube-proxy. +If omitted defaults to an auto generated certificate. + +### kube-proxy-client-key +**Type:** `string`
+ +The client key to be used for the kube-proxy. +If omitted defaults to an auto generated key. + +### kube-scheduler-client-crt +**Type:** `string`
+ +The client certificate to be used for the kube-scheduler. +If omitted defaults to an auto generated certificate. + +### kube-scheduler-client-key +**Type:** `string`
+ +The client key to be used for the kube-scheduler. +If omitted defaults to an auto generated key. + +### kube-controller-manager-client-crt +**Type:** `string`
+ +The client certificate to be used for the Kubernetes controller manager. +If omitted defaults to an auto generated certificate. + +### kube-controller-manager-client-key +**Type:** `string`
+ +The client key to be used for the Kubernetes controller manager. +If omitted defaults to an auto generated key. + +### service-account-key +**Type:** `string`
+ +The key to be used by the default service account. +If omitted defaults to an auto generated key. + +### apiserver-crt +**Type:** `string`
+ +The certificate to be used for the kube-apiserver. +If omitted defaults to an auto generated certificate. + +### apiserver-key +**Type:** `string`
+ +The key to be used for the kube-apiserver. +If omitted defaults to an auto generated key. + +### kubelet-crt +**Type:** `string`
+ +The certificate to be used for the kubelet. +If omitted defaults to an auto generated certificate. + +### kubelet-key +**Type:** `string`
+ +The key to be used for the kubelet. +If omitted defaults to an auto generated key. + +### kubelet-client-crt +**Type:** `string`
+ +The certificate to be used for the kubelet client. +If omitted defaults to an auto generated certificate. + +### kubelet-client-key +**Type:** `string`
+ +The key to be used for the kubelet client. +If omitted defaults to an auto generated key. + +### extra-node-config-files +**Type:** `map[string]string`
 + +Additional files that are uploaded to `/var/snap/k8s/common/args/conf.d/` +on a node during bootstrap. These files can then be referenced by Kubernetes +service arguments. + +The keys are the file names and the values are the file contents.
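A sketch of how this composes with the argument overrides documented below; the file name, policy contents and flag values are illustrative only:

```yaml
extra-node-config-files:
  # hypothetical file, uploaded as /var/snap/k8s/common/args/conf.d/audit-policy.yaml
  audit-policy.yaml: |
    apiVersion: audit.k8s.io/v1
    kind: Policy
    rules:
      - level: Metadata
extra-node-kube-apiserver-args:
  # reference the uploaded file from a kube-apiserver argument
  --audit-policy-file: /var/snap/k8s/common/args/conf.d/audit-policy.yaml
  # a value of null deletes the flag from the node's argument file
  --profiling: null
```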
+### extra-node-kube-apiserver-args +**Type:** `map[string]string`<br> + +Additional arguments that are passed to the `kube-apiserver` only for that specific node. +A parameter that is explicitly set to `null` is deleted. +The format is `map[<--flag-name>]`. + +### extra-node-kube-controller-manager-args +**Type:** `map[string]string`<br>
+ +Additional arguments that are passed to the `kube-controller-manager` only for that specific node. +A parameter that is explicitly set to `null` is deleted. +The format is `map[<--flag-name>]`. + +### extra-node-kube-scheduler-args +**Type:** `map[string]string`
+ +Additional arguments that are passed to the `kube-scheduler` only for that specific node. +A parameter that is explicitly set to `null` is deleted. +The format is `map[<--flag-name>]`. + +### extra-node-kube-proxy-args +**Type:** `map[string]string`
+ +Additional arguments that are passed to the `kube-proxy` only for that specific node. +A parameter that is explicitly set to `null` is deleted. +The format is `map[<--flag-name>]`. + +### extra-node-kubelet-args +**Type:** `map[string]string`
+ +Additional arguments that are passed to the `kubelet` only for that specific node. +A parameter that is explicitly set to `null` is deleted. +The format is `map[<--flag-name>]`. + +### extra-node-containerd-args +**Type:** `map[string]string`
+ +Additional arguments that are passed to `containerd` only for that specific node. +A parameter that is explicitly set to `null` is deleted. +The format is `map[<--flag-name>]`. + +### extra-node-k8s-dqlite-args +**Type:** `map[string]string`
+ +Additional arguments that are passed to `k8s-dqlite` only for that specific node. +A parameter that is explicitly set to `null` is deleted. +The format is `map[<--flag-name>]`. + +### extra-node-containerd-config +**Type:** `apiv1.MapStringAny`
 + +Extra configuration for the containerd `config.toml`.
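Taken together, a minimal bootstrap configuration file using a handful of the fields above might look like the following sketch; all values shown are the documented defaults except the ingress and local-storage toggles, and the file is typically passed to the bootstrap command (e.g. `k8s bootstrap --file <path>`), depending on how you deploy:

```yaml
cluster-config:
  network:
    enabled: true
  dns:
    enabled: true
    cluster-domain: cluster.local
  ingress:
    enabled: true
  local-storage:
    enabled: true
    local-path: /var/snap/k8s/common/rawfile-storage
pod-cidr: 10.1.0.0/16
service-cidr: 10.152.183.0/24
```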
diff --git a/docs/src/_parts/control_plane_join_config.md b/docs/src/_parts/control_plane_join_config.md new file mode 100644 index 000000000..fa2919e45 --- /dev/null +++ b/docs/src/_parts/control_plane_join_config.md @@ -0,0 +1,152 @@ +### extra-sans +**Type:** `[]string`<br> + +List of extra SANs to be added to certificates. + +### front-proxy-client-crt +**Type:** `string`<br>
+ +The client certificate to be used for the front proxy. +If omitted defaults to an auto generated certificate. + +### front-proxy-client-key +**Type:** `string`
+ +The client key to be used for the front proxy. +If omitted defaults to an auto generated key. + +### kube-proxy-client-crt +**Type:** `string`
 + +The client certificate to be used for the kube-proxy. +If omitted defaults to an auto generated certificate. + +### kube-proxy-client-key +**Type:** `string`<br>
 + +The client key to be used for the kube-proxy. +If omitted defaults to an auto generated key. + +### kube-scheduler-client-crt +**Type:** `string`<br>
+ +The client certificate to be used for the kube-scheduler. +If omitted defaults to an auto generated certificate. + +### kube-scheduler-client-key +**Type:** `string`
+ +The client key to be used for the kube-scheduler. +If omitted defaults to an auto generated key. + +### kube-controller-manager-client-crt +**Type:** `string`
+ +The client certificate to be used for the Kubernetes controller manager. +If omitted defaults to an auto generated certificate. + +### kube-controller-manager-client-key +**Type:** `string`
+ +The client key to be used for the Kubernetes controller manager. +If omitted defaults to an auto generated key. + +### apiserver-crt +**Type:** `string`
+ +The certificate to be used for the kube-apiserver. +If omitted defaults to an auto generated certificate. + +### apiserver-key +**Type:** `string`
+ +The key to be used for the kube-apiserver. +If omitted defaults to an auto generated key. + +### kubelet-crt +**Type:** `string`
+ +The certificate to be used for the kubelet. +If omitted defaults to an auto generated certificate. + +### kubelet-key +**Type:** `string`
+ +The key to be used for the kubelet. +If omitted defaults to an auto generated key. + +### kubelet-client-crt +**Type:** `string`
+ +The client certificate to be used for the kubelet. +If omitted defaults to an auto generated certificate. + +### kubelet-client-key +**Type:** `string`
+ +The client key to be used for the kubelet. +If omitted defaults to an auto generated key. + +### extra-node-config-files +**Type:** `map[string]string`
 + +Additional files that are uploaded to `/var/snap/k8s/common/args/conf.d/` +on a node during bootstrap. These files can then be referenced by Kubernetes +service arguments. + +The keys are the file names and the values are the file contents. + +### extra-node-kube-apiserver-args +**Type:** `map[string]string`<br>
+ +Additional arguments that are passed to the `kube-apiserver` only for that specific node. +A parameter that is explicitly set to `null` is deleted. +The format is `map[<--flag-name>]`. + +### extra-node-kube-controller-manager-args +**Type:** `map[string]string`
+ +Additional arguments that are passed to the `kube-controller-manager` only for that specific node. +A parameter that is explicitly set to `null` is deleted. +The format is `map[<--flag-name>]`. + +### extra-node-kube-scheduler-args +**Type:** `map[string]string`
+ +Additional arguments that are passed to the `kube-scheduler` only for that specific node. +A parameter that is explicitly set to `null` is deleted. +The format is `map[<--flag-name>]`. + +### extra-node-kube-proxy-args +**Type:** `map[string]string`
+ +Additional arguments that are passed to the `kube-proxy` only for that specific node. +A parameter that is explicitly set to `null` is deleted. +The format is `map[<--flag-name>]`. + +### extra-node-kubelet-args +**Type:** `map[string]string`
+ +Additional arguments that are passed to the `kubelet` only for that specific node. +A parameter that is explicitly set to `null` is deleted. +The format is `map[<--flag-name>]`. + +### extra-node-containerd-args +**Type:** `map[string]string`
+ +Additional arguments that are passed to `containerd` only for that specific node. +A parameter that is explicitly set to `null` is deleted. +The format is `map[<--flag-name>]`. + +### extra-node-k8s-dqlite-args +**Type:** `map[string]string`
+ +Additional arguments that are passed to `k8s-dqlite` only for that specific node. +A parameter that is explicitly set to `null` is deleted. +The format is `map[<--flag-name>]`. + +### extra-node-containerd-config +**Type:** `apiv1.MapStringAny`
+ +Extra configuration for the containerd config.toml + diff --git a/docs/src/_parts/install.md b/docs/src/_parts/install.md new file mode 100644 index 000000000..ec6c261e8 --- /dev/null +++ b/docs/src/_parts/install.md @@ -0,0 +1,3 @@ +``` +sudo snap install k8s --classic --channel=1.31/stable +``` \ No newline at end of file diff --git a/docs/src/_parts/template-explanation b/docs/src/_parts/template-explanation index 905a30f07..3ec1911ae 100644 --- a/docs/src/_parts/template-explanation +++ b/docs/src/_parts/template-explanation @@ -15,11 +15,6 @@ The documentation also supports various diagrams-as-code options. We prefer to use UML-style diagrams, but you can also use Mermaid or many other types. -Diagrams like this are processed using the 'kroki' directive: - -```{kroki} ../../assets/ck-cluster.puml -``` - ## Links Explanations frequently include links to other documents. In particular, please diff --git a/docs/src/_parts/template-tutorial b/docs/src/_parts/template-tutorial index 039f0ee52..d1d46591e 100644 --- a/docs/src/_parts/template-tutorial +++ b/docs/src/_parts/template-tutorial @@ -68,7 +68,7 @@ workload and remove everything again! ## Next Steps -- Keep mastering Canonical Kubernetes with kubectl: [How to use kubectl] +- How to control {{product}} with `kubectl`: [How to use kubectl] - Explore Kubernetes commands with our [Command Reference Guide] - Learn how to set up a multi-node environment [Setting up a K8s cluster] - Configure storage options [Storage] diff --git a/docs/src/_parts/worker_join_config.md b/docs/src/_parts/worker_join_config.md new file mode 100644 index 000000000..70a515a8f --- /dev/null +++ b/docs/src/_parts/worker_join_config.md @@ -0,0 +1,78 @@ +### kubelet-crt +**Type:** `string`
+ +The certificate to be used for the kubelet. +If omitted defaults to an auto generated certificate. + +### kubelet-key +**Type:** `string`
+ +The key to be used for the kubelet. +If omitted defaults to an auto generated key. + +### kubelet-client-crt +**Type:** `string`
+ +The client certificate to be used for the kubelet. +If omitted defaults to an auto generated certificate. + +### kubelet-client-key +**Type:** `string`
+ +The client key to be used for the kubelet. +If omitted defaults to an auto generated key. + +### kube-proxy-client-crt +**Type:** `string`
+ +The client certificate to be used for the kube-proxy. +If omitted defaults to an auto generated certificate. + +### kube-proxy-client-key +**Type:** `string`
+ +The client key to be used for the kube-proxy. +If omitted defaults to an auto generated key. + +### extra-node-config-files +**Type:** `map[string]string`
 + +Additional files that are uploaded to `/var/snap/k8s/common/args/conf.d/` +on a node during bootstrap. These files can then be referenced by Kubernetes +service arguments. + +The keys are the file names and the values are the file contents. + +### extra-node-kube-proxy-args +**Type:** `map[string]string`<br>
+ +Additional arguments that are passed to the `kube-proxy` only for that specific node. +A parameter that is explicitly set to `null` is deleted. +The format is `map[<--flag-name>]`. + +### extra-node-kubelet-args +**Type:** `map[string]string`
+ +Additional arguments that are passed to the `kubelet` only for that specific node. +A parameter that is explicitly set to `null` is deleted. +The format is `map[<--flag-name>]`. + +### extra-node-containerd-args +**Type:** `map[string]string`
+ +Additional arguments that are passed to `containerd` only for that specific node. +A parameter that is explicitly set to `null` is deleted. +The format is `map[<--flag-name>]`. + +### extra-node-k8s-apiserver-proxy-args +**Type:** `map[string]string`
+ +Additional arguments that are passed to `k8s-api-server-proxy` only for that specific node. +A parameter that is explicitly set to `null` is deleted. +The format is `map[<--flag-name>]`. + +### extra-node-containerd-config +**Type:** `apiv1.MapStringAny`
 + +Extra configuration for the containerd `config.toml`.
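As an illustrative sketch, a worker join configuration could combine the fields above to tune node-specific arguments; the flag values are examples only:

```yaml
extra-node-kubelet-args:
  --max-pods: "200"                # example override
  --container-log-max-size: null   # null deletes the flag
extra-node-kube-proxy-args:
  --proxy-mode: ipvs               # example override
```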
diff --git a/docs/src/assets/capi-ck8s.svg b/docs/src/assets/capi-ck8s.svg new file mode 100644 index 000000000..d7df80727 --- /dev/null +++ b/docs/src/assets/capi-ck8s.svg @@ -0,0 +1,4 @@
[capi-ck8s.svg (architecture diagram); text labels: Canonical Kubernetes Bootstrap Provider (CABPCK); CAPI Machine with Canonical Kubernetes Config; CA; Join Token; kubeconfig; Canonical Kubernetes Control Plane Provider (CACPCK); Infrastructure Provider; Control Plane; Worker Nodes; VM #1 to VM #N; Provisioned (Workload) Cluster; User; Cluster EP; clusterctl get config; Bootstrap (Management) Cluster; Bootstrap secret; Deliver cloud-init for nodes; User talks to cluster EP; Generate Secrets: Join Token, CA]
diff --git a/docs/src/assets/how-to-cloud-storage-aws-ccm.yaml b/docs/src/assets/how-to-cloud-storage-aws-ccm.yaml new file mode 100644 index 000000000..fa6dc3cb9 --- /dev/null +++ b/docs/src/assets/how-to-cloud-storage-aws-ccm.yaml @@ -0,0 +1,170 @@ +--- +apiVersion: apps/v1 +kind: DaemonSet +metadata: + name: aws-cloud-controller-manager + namespace: kube-system + labels: + k8s-app: aws-cloud-controller-manager +spec: + selector: + matchLabels: + k8s-app: aws-cloud-controller-manager + updateStrategy: + type: RollingUpdate + template: + metadata: + labels: + k8s-app: aws-cloud-controller-manager + spec: + nodeSelector: + node-role.kubernetes.io/control-plane: "" + tolerations: + - key: node.cloudprovider.kubernetes.io/uninitialized + value: "true" + effect: NoSchedule + - effect: NoSchedule + key: node-role.kubernetes.io/control-plane + affinity: + nodeAffinity: + requiredDuringSchedulingIgnoredDuringExecution: + nodeSelectorTerms: + - matchExpressions: + - key: node-role.kubernetes.io/control-plane + operator: Exists + serviceAccountName: cloud-controller-manager + containers: + - name: aws-cloud-controller-manager + image: registry.k8s.io/provider-aws/cloud-controller-manager:v1.28.3 + args: + - --v=2 + - --cloud-provider=aws + - --use-service-account-credentials=true + - --configure-cloud-routes=false + resources: + requests: + cpu: 200m + hostNetwork: true +--- +apiVersion: v1 +kind: ServiceAccount +metadata: + name: cloud-controller-manager + namespace: kube-system +--- +apiVersion: rbac.authorization.k8s.io/v1 +kind: RoleBinding +metadata: + name: cloud-controller-manager:apiserver-authentication-reader + namespace: kube-system +roleRef: + apiGroup: rbac.authorization.k8s.io + kind: Role + name: extension-apiserver-authentication-reader +subjects: + - apiGroup: "" + kind: ServiceAccount + name: cloud-controller-manager + namespace: kube-system +--- +apiVersion: rbac.authorization.k8s.io/v1 +kind: ClusterRole +metadata: + name: system:cloud-controller-manager +rules: +- apiGroups: + - "" + resources: + - events + verbs: + - create + - patch + - update +- apiGroups: + - "" + resources: + - nodes + verbs: + - '*' +- apiGroups: + - "" + resources: + - nodes/status + verbs: + - patch +- apiGroups: + - "" + resources: + - services + verbs: + - list + - patch + - update + - watch +- apiGroups: + - "" + resources: + - services/status + verbs: + - list + - patch + - update + - watch +- apiGroups: + - "" + resources: + - serviceaccounts + verbs: + - create + - get + - list + - watch +- apiGroups: + - "" + resources: + - persistentvolumes + verbs: + - get + - list + - update + - watch +- apiGroups: + - "" + resources: + - endpoints + verbs: + - create + - get + - list + - watch + - update +- apiGroups: + - coordination.k8s.io + resources: + - leases + verbs: + - create + - get + - list + - watch + - update +- apiGroups: + - "" + resources: + - serviceaccounts/token + verbs: + - create +--- +kind: ClusterRoleBinding +apiVersion: rbac.authorization.k8s.io/v1 +metadata: + name: system:cloud-controller-manager +roleRef: + apiGroup: rbac.authorization.k8s.io + kind: ClusterRole + name: system:cloud-controller-manager +subjects: + - apiGroup: "" + kind: ServiceAccount + name: cloud-controller-manager + namespace: kube-system diff --git a/docs/src/assets/k8sd-component.puml b/docs/src/assets/k8sd-component.puml index f95cd278c..3e39ce90b 100644 --- a/docs/src/assets/k8sd-component.puml +++ b/docs/src/assets/k8sd-component.puml @@ -16,13 +16,13 @@ Container(K8sSnapDistribution.State, "State", 
$techn="", $descr="Datastores hold Container(K8sSnapDistribution.KubernetesServices, "Kubernetes Services", $techn="", $descr="API server, kubelet, kube-proxy, scheduler, kube-controller", $tags="", $link="") Container_Boundary("K8sSnapDistribution.K8sd_boundary", "K8sd", $tags="") { - Component(K8sSnapDistribution.K8sd.CLI, "CLI", $techn="CLI", $descr="The CLI the offered", $tags="", $link="") + Component(K8sSnapDistribution.K8sd.CLI, "CLI", $techn="CLI", $descr="The CLI offered", $tags="", $link="") Component(K8sSnapDistribution.K8sd.APIviaHTTP, "API via HTTP", $techn="REST", $descr="The API interface offered", $tags="", $link="") - Component(K8sSnapDistribution.K8sd.CLustermanagement, "CLuster management", $techn="", $descr="Management of the cluster with the help of MicroCluster", $tags="", $link="") + Component(K8sSnapDistribution.K8sd.CLustermanagement, "Cluster management", $techn="", $descr="Management of the cluster with the help of MicroCluster", $tags="", $link="") } Rel(K8sAdmin, K8sSnapDistribution.K8sd.CLI, "Sets up and configured the cluster", $techn="", $tags="", $link="") -Rel(CharmK8s, K8sSnapDistribution.K8sd.APIviaHTTP, "Orchestrates the lifecycle management of K8s", $techn="", $tags="", $link="") +Rel(CharmK8s, K8sSnapDistribution.K8sd.APIviaHTTP, "Orchestrates the lifecycle management of K8s when deployed with Juju", $techn="", $tags="", $link="") Rel(K8sSnapDistribution.K8sd.CLustermanagement, K8sSnapDistribution.KubernetesServices, "Configures", $techn="", $tags="", $link="") Rel(K8sSnapDistribution.KubernetesServices, K8sSnapDistribution.State, "Uses by default", $techn="", $tags="", $link="") Rel(K8sSnapDistribution.K8sd.CLustermanagement, K8sSnapDistribution.State, "Keeps state in", $techn="", $tags="", $link="") diff --git a/docs/src/capi/explanation/capi-ck8s.md b/docs/src/capi/explanation/capi-ck8s.md index d75db76ac..5ba3487b3 100644 --- a/docs/src/capi/explanation/capi-ck8s.md +++ b/docs/src/capi/explanation/capi-ck8s.md @@ -1,12 +1,26 @@ # Cluster API - {{product}} -ClusterAPI (CAPI) is an open-source Kubernetes project that provides a declarative API for cluster creation, configuration, and management. It is designed to automate the creation and management of Kubernetes clusters in various environments, including on-premises data centers, public clouds, and edge devices. - -CAPI abstracts away the details of infrastructure provisioning, networking, and other low-level tasks, allowing users to define their desired cluster configuration using simple YAML manifests. This makes it easier to create and manage clusters in a repeatable and consistent manner, regardless of the underlying infrastructure. In this way a wide range of infrastructure providers has been made available, including but not limited to Amazon Web Services (AWS), Microsoft Azure, Google Cloud Platform (GCP), and OpenStack. - -CAPI also abstracts the provisioning and management of Kubernetes clusters allowing for a variety of Kubernetes distributions to be delivered in all of the supported infrastructure providers. {{product}} is one such Kubernetes distribution that seamlessly integrates with Cluster API. +ClusterAPI (CAPI) is an open-source Kubernetes project that provides a +declarative API for cluster creation, configuration, and management. It is +designed to automate the creation and management of Kubernetes clusters in +various environments, including on-premises data centres, public clouds, and +edge devices. 
+ +CAPI abstracts away the details of infrastructure provisioning, networking, and +other low-level tasks, allowing users to define their desired cluster +configuration using simple YAML manifests. This makes it easier to create and +manage clusters in a repeatable and consistent manner, regardless of the +underlying infrastructure. In this way a wide range of infrastructure providers +has been made available, including but not limited to Amazon Web Services +(AWS), Microsoft Azure, Google Cloud Platform (GCP), and OpenStack. + +CAPI also abstracts the provisioning and management of Kubernetes clusters +allowing for a variety of Kubernetes distributions to be delivered in all of +the supported infrastructure providers. {{product}} is one such Kubernetes +distribution that seamlessly integrates with Cluster API. With {{product}} CAPI you can: + - provision a cluster with: - Kubernetes version 1.31 onwards - risk level of the track you want to follow (stable, candidate, beta, edge) @@ -20,21 +34,59 @@ Please refer to the “Tutorial” section for concrete examples on CAPI deploym ## CAPI architecture -Being a cloud-native framework, CAPI implements all its components as controllers that run within a Kubernetes cluster. There is a separate controller, called a ‘provider’, for each supported infrastructure substrate. The infrastructure providers are responsible for provisioning physical or virtual nodes and setting up networking elements such as load balancers and virtual networks. In a similar way, each Kubernetes distribution that integrates with ClusterAPI is managed by two providers: the control plane provider and the bootstrap provider. The bootstrap provider is responsible for delivering and managing Kubernetes on the nodes, while the control plane provider handles the control plane’s specific lifecycle. - -The CAPI providers operate within a Kubernetes cluster known as the management cluster. The administrator is responsible for selecting the desired combination of infrastructure and Kubernetes distribution by instantiating the respective infrastructure, bootstrap, and control plane providers on the management cluster. - -The management cluster functions as the control plane for the ClusterAPI operator, which is responsible for provisioning and managing the infrastructure resources necessary for creating and managing additional Kubernetes clusters. It is important to note that the management cluster is not intended to support any other workload, as the workloads are expected to run on the provisioned clusters. As a result, the provisioned clusters are referred to as workload clusters. - -Typically, the management cluster runs in a separate environment from the clusters it manages, such as a public cloud or an on-premises data center. It serves as a centralized location for managing the configuration, policies, and security of multiple managed clusters. By leveraging the management cluster, users can easily create and manage a fleet of Kubernetes clusters in a consistent and repeatable manner. +Being a cloud-native framework, CAPI implements all its components as +controllers that run within a Kubernetes cluster. There is a separate +controller, called a ‘provider’, for each supported infrastructure substrate. +The infrastructure providers are responsible for provisioning physical or +virtual nodes and setting up networking elements such as load balancers and +virtual networks. 
In a similar way, each Kubernetes distribution that +integrates with ClusterAPI is managed by two providers: the control plane +provider and the bootstrap provider. The bootstrap provider is responsible for +delivering and managing Kubernetes on the nodes, while the control plane +provider handles the control plane’s specific lifecycle. + +The CAPI providers operate within a Kubernetes cluster known as the management +cluster. The administrator is responsible for selecting the desired combination +of infrastructure and Kubernetes distribution by instantiating the respective +infrastructure, bootstrap, and control plane providers on the management +cluster. + +The management cluster functions as the control plane for the ClusterAPI +operator, which is responsible for provisioning and managing the infrastructure +resources necessary for creating and managing additional Kubernetes clusters. +It is important to note that the management cluster is not intended to support +any other workload, as the workloads are expected to run on the provisioned +clusters. As a result, the provisioned clusters are referred to as workload +clusters. + +Typically, the management cluster runs in a separate environment from the +clusters it manages, such as a public cloud or an on-premises data centre. It +serves as a centralised location for managing the configuration, policies, and +security of multiple managed clusters. By leveraging the management cluster, +users can easily create and manage a fleet of Kubernetes clusters in a +consistent and repeatable manner. The {{product}} team maintains the two providers required for integrating with CAPI: -- The Cluster API Bootstrap Provider {{product}} (**CABPCK**) responsible for provisioning the nodes in the cluster and preparing them to be joined to the Kubernetes control plane. When you use the CABPCK you define a Kubernetes Cluster object that describes the desired state of the new cluster and includes the number and type of nodes in the cluster, as well as any additional configuration settings. The Bootstrap Provider then creates the necessary resources in the Kubernetes API server to bring the cluster up to the desired state. Under the hood, the Bootstrap Provider uses cloud-init to configure the nodes in the cluster. This includes setting up SSH keys, configuring the network, and installing necessary software packages. - -- The Cluster API Control Plane Provider {{product}} (**CACPCK**) enables the creation and management of Kubernetes control planes using {{product}} as the underlying Kubernetes distribution. Its main tasks are to update the machine state and to generate the kubeconfig file used for accessing the cluster. The kubeconfig file is stored as a secret which the user can then retrieve using the `clusterctl` command. - -```{figure} ./capi-ck8s.svg +- The Cluster API Bootstrap Provider {{product}} (**CABPCK**) responsible for + provisioning the nodes in the cluster and preparing them to be joined to the + Kubernetes control plane. When you use the CABPCK you define a Kubernetes + Cluster object that describes the desired state of the new cluster and + includes the number and type of nodes in the cluster, as well as any + additional configuration settings. The Bootstrap Provider then creates the + necessary resources in the Kubernetes API server to bring the cluster up to + the desired state. Under the hood, the Bootstrap Provider uses cloud-init to + configure the nodes in the cluster. 
This includes setting up SSH keys, + configuring the network, and installing necessary software packages. + +- The Cluster API Control Plane Provider {{product}} (**CACPCK**) enables the + creation and management of Kubernetes control planes using {{product}} as the + underlying Kubernetes distribution. Its main tasks are to update the machine + state and to generate the kubeconfig file used for accessing the cluster. The + kubeconfig file is stored as a secret which the user can then retrieve using + the `clusterctl` command. + +```{figure} ../../assets/capi-ck8s.svg :width: 100% :alt: Deployment of components diff --git a/docs/src/capi/explanation/in-place-upgrades.md b/docs/src/capi/explanation/in-place-upgrades.md new file mode 100644 index 000000000..5196fd7d1 --- /dev/null +++ b/docs/src/capi/explanation/in-place-upgrades.md @@ -0,0 +1,132 @@ +# In-Place Upgrades + +Regularly upgrading the Kubernetes version of the machines in a cluster +is important. While rolling upgrades are a popular strategy, certain +situations will require in-place upgrades: + +- Resource constraints (i.e. cost of additional machines). +- Expensive manual setup process for nodes. + +## Annotations + +CAPI machines are considered immutable. Consequently, machines are replaced +instead of reconfigured. +While CAPI doesn't support in-place upgrades, {{product}} CAPI does +by leveraging annotations for the implementation. +For a deeper understanding of the CAPI design decisions, consider reading about +[machine immutability in CAPI][1], and Kubernetes objects: [`labels`][2], +[`spec` and `status`][3]. + +## Controllers + +In {{product}} CAPI, there are two main types of controllers that handle the +process of performing in-place upgrades: + +- Single Machine In-Place Upgrade Controller +- Orchestrated In-Place Upgrade Controller + +The core component of performing an in-place upgrade is the `Single Machine +Upgrader`. The controller watches for annotations on machines and reconciles +them to ensure the upgrades happen smoothly. + +The `Orchestrator` watches for certain annotations on +machine owners, reconciles them and upgrades groups of owned machines. +It’s responsible for ensuring that all the machines owned by the +reconciled object get upgraded successfully. + +The main annotations that drive the upgrade process are as follows: + +- `v1beta2.k8sd.io/in-place-upgrade-to` --> `upgrade-to` : Instructs +the controller to perform an upgrade with the specified option/method. +- `v1beta2.k8sd.io/in-place-upgrade-status` --> `status` : As soon as the +controller starts the upgrade process, the object will be marked with the +`status` annotation which can either be `in-progress`, `failed` or `done`. +- `v1beta2.k8sd.io/in-place-upgrade-release` --> `release` : When the +upgrade is performed successfully, this annotation will indicate the current +Kubernetes release/version installed on the machine. + +For a complete list of annotations and their values please +refer to the [annotations reference page][4]. This explanation proceeds +to use abbreviations of the mentioned labels. + +### Single Machine In-Place Upgrade Controller + +The Machine objects can be marked with the `upgrade-to` annotation to +trigger an in-place upgrade for that machine. While watching for changes +on the machines, the single machine upgrade controller notices this annotation +and attempts to upgrade the Kubernetes version of that machine to the +specified version. 
+ +Upgrade methods or options can be specified to upgrade to a snap channel, +revision, or a local snap file already placed on the +machine in air-gapped environments. + +A successfully upgraded machine shows the following annotations: + +```yaml +annotations: + v1beta2.k8sd.io/in-place-upgrade-release: "channel=1.31/stable" + v1beta2.k8sd.io/in-place-upgrade-status: "done" +``` + +If the upgrade fails, the controller will mark the machine and retry +the upgrade immediately: + +```yaml +annotations: + # the `upgrade-to` causes the retry to happen + v1beta2.k8sd.io/in-place-upgrade-to: "channel=1.31/stable" + v1beta2.k8sd.io/in-place-upgrade-status: "failed" + + # orchestrator will notice this annotation and knows that the + # upgrade for this machine failed + v1beta2.k8sd.io/in-place-upgrade-last-failed-attempt-at: "Sat, 7 Nov + 2024 13:30:00 +0400" +``` + +By applying and removing annotations, the single machine +upgrader determines the upgrade status of the machine it’s trying to +reconcile and takes necessary actions to successfully complete an +in-place upgrade. The following diagram shows the flow of the in-place +upgrade of a single machine: + +![Diagram][img-single-machine] + +### Machine Upgrade Process + +The {{product}}'s `k8sd` daemon exposes endpoints that can be used to +interact with the cluster. The single machine upgrader calls the +`/snap/refresh` endpoint on the machine to trigger the upgrade +process while checking `/snap/refresh-status` periodically. + +![Diagram][img-k8sd-call] + +### In-place upgrades on large workload clusters + +While the “Single Machine In-Place Upgrade Controller” is responsible +for upgrading individual machines, the "Orchestrated In-Place Upgrade +Controller" ensures that groups of machines will get upgraded. +By applying the `upgrade-to` annotation on an object that owns machines +(e.g. a `MachineDeployment`), this controller will mark the owned machines +one by one which will cause the "Single Machine Upgrader" to pickup those +annotations and upgrade the machines. To avoid undesirable situations + like quorum loss or severe downtime, these upgrades happen in sequence. + +The failures and successes of individual machine upgrades will be reported back +to the orchestrator by the single machine upgrader via annotations. 
+ +The illustrated flow of orchestrated in-place upgrades: + +![Diagram][img-orchestrated] + + + +[img-single-machine]: https://assets.ubuntu.com/v1/1200f040-single-machine.png +[img-k8sd-call]: https://assets.ubuntu.com/v1/518eb73a-k8sd-call.png +[img-orchestrated]: https://assets.ubuntu.com/v1/8f302a00-orchestrated.png + + +[1]: https://cluster-api.sigs.k8s.io/user/concepts#machine-immutability-in-place-upgrade-vs-replace +[2]: https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/ +[3]: https://kubernetes.io/docs/concepts/overview/working-with-objects/#object-spec-and-status +[4]: ../reference/annotations.md diff --git a/docs/src/capi/explanation/index.md b/docs/src/capi/explanation/index.md index 775dd26a3..d4ad076be 100644 --- a/docs/src/capi/explanation/index.md +++ b/docs/src/capi/explanation/index.md @@ -11,12 +11,12 @@ Overview ```{toctree} :titlesonly: -:globs: +:glob: about security capi-ck8s.md - +in-place-upgrades.md ``` diff --git a/docs/src/capi/explanation/security.md b/docs/src/capi/explanation/security.md index 6c1048a19..002e071d1 100644 --- a/docs/src/capi/explanation/security.md +++ b/docs/src/capi/explanation/security.md @@ -1,2 +1,2 @@ -```{include} /snap/explanation/security.md +```{include} ../../snap/explanation/security.md ``` diff --git a/docs/src/capi/howto/custom-ck8s.md b/docs/src/capi/howto/custom-ck8s.md index d81191980..ba3a1d0fe 100644 --- a/docs/src/capi/howto/custom-ck8s.md +++ b/docs/src/capi/howto/custom-ck8s.md @@ -1,6 +1,6 @@ # Install custom {{product}} on machines -By default, the `version` field in the machine specifications will determine which {{product}} is downloaded from the `stable` rist level. While you can install different versions of the `stable` risk level by changing the `version` field, extra steps should be taken if you're willing to install a specific risk level. +By default, the `version` field in the machine specifications will determine which {{product}} is downloaded from the `stable` risk level. While you can install different versions of the `stable` risk level by changing the `version` field, extra steps should be taken if you're willing to install a specific risk level. This guide walks you through the process of installing custom {{product}} on workload cluster machines. ## Prerequisites @@ -13,7 +13,7 @@ To follow this guide, you will need: Please refer to the [getting-started guide][getting-started] for further details on the required setup. -In this guide we call the generated cluster spec manifrst `cluster.yaml`. +In this guide we call the generated cluster spec manifest `cluster.yaml`. ## Overwrite the existing `install.sh` script diff --git a/docs/src/capi/howto/external-etcd.md b/docs/src/capi/howto/external-etcd.md index f6509fb22..a77600c68 100644 --- a/docs/src/capi/howto/external-etcd.md +++ b/docs/src/capi/howto/external-etcd.md @@ -9,7 +9,7 @@ with an external etcd. To follow this guide, you will need: -- [Clusterctl][clusterctl] installed +- [clusterctl][clusterctl] installed - A CAPI management cluster initialised with the infrastructure, bootstrap and control plane providers of your choice. Please refer to the [getting-started guide][getting-started] for instructions. @@ -78,7 +78,7 @@ kubectl get secrets ## Update etcd cluster template -Please refer to [capi-templates][capi-templates] for the latest templates. +Please refer to [CAPI-templates][CAPI-templates] for the latest templates. 
Update the control plane resource `CK8sControlPlane` so that it is configured to store the Kubernetes state in etcd. Add the following additional configuration to the cluster template `cluster-template.yaml`: @@ -120,5 +120,5 @@ clusterctl describe cluster peaches ``` [getting-started]: ../tutorial/getting-started.md -[capi-templates]: https://github.com/canonical/cluster-api-k8s/tree/main/templates +[CAPI-templates]: https://github.com/canonical/cluster-api-k8s/tree/main/templates [clusterctl]: https://cluster-api.sigs.k8s.io/clusterctl/overview diff --git a/docs/src/capi/howto/in-place-upgrades.md b/docs/src/capi/howto/in-place-upgrades.md new file mode 100644 index 000000000..7b20c9dab --- /dev/null +++ b/docs/src/capi/howto/in-place-upgrades.md @@ -0,0 +1,98 @@ +# Perform an in-place upgrade for a machine + +This guide walks you through the steps to perform an in-place upgrade for a +Cluster API managed machine. + +## Prerequisites + +To follow this guide, you will need: + +- A Kubernetes management cluster with Cluster API and providers installed + and configured. +- A target workload cluster managed by CAPI. +- `kubectl` installed and configured to access your management cluster. +- The workload cluster kubeconfig. + +Please refer to the [getting-started guide][getting-started] for further +details on the required setup. +This guide refers to the workload cluster as `c1` and its +kubeconfig as `c1-kubeconfig.yaml`. + +## Check the current cluster status + +Prior to the upgrade, ensure that the management cluster is in a healthy +state. + +``` +kubectl get nodes -o wide +``` + +Confirm the Kubernetes version of the workload cluster: + +``` +kubectl --kubeconfig c1-kubeconfig.yaml get nodes -o wide +``` + +## Annotate the machine + +In this first step, annotate the Machine resource with +the in-place upgrade annotation. In this example, the machine +is called `c1-control-plane-xyzbw`. + +``` +kubectl annotate machine c1-control-plane-xyzbw "v1beta2.k8sd.io/in-place-upgrade-to=" +``` + +`` can be one of: + +* `channel=` which refreshes k8s to the given snap channel. + e.g. `channel=1.30-classic/stable` +* `revision=` which refreshes k8s to the given revision. + e.g. `revision=123` +* `localPath=` which refreshes k8s with the snap file from + the given absolute path. e.g. `localPath=full/path/to/k8s.snap` + +Please refer to the [ClusterAPI Annotations Reference][annotations-reference] +for further details on these options. + +## Monitor the in-place upgrade + +Watch the status of the in-place upgrade for the machine, +by running the following command and checking the +`v1beta2.k8sd.io/in-place-upgrade-status` annotation: + +``` +kubectl get machine c1-control-plane-xyzbw -o yaml +``` + +On a successful upgrade: + +* Value of the `v1beta2.k8sd.io/in-place-upgrade-status` annotation + will be changed to `done` +* Value of the `v1beta2.k8sd.io/in-place-upgrade-release` annotation + will be changed to the `` used to perform the upgrade. + +## Cancelling a failing upgrade + +The upgrade is retried periodically if the operation was unsuccessful. 
+ +The upgrade can be cancelled by running the following commands +that remove the annotations: + +``` +kubectl annotate machine c1-control-plane-xyzbw "v1beta2.k8sd.io/in-place-upgrade-to-" +kubectl annotate machine c1-control-plane-xyzbw "v1beta2.k8sd.io/in-place-upgrade-change-id-" +``` + +## Verify the Kubernetes upgrade + +Confirm that the node is healthy and runs on the new Kubernetes version: + +``` +kubectl --kubeconfig c1-kubeconfig.yaml get nodes -o wide +``` + + + +[getting-started]: ../tutorial/getting-started.md +[annotations-reference]: ../reference/annotations.md diff --git a/docs/src/capi/howto/index.md b/docs/src/capi/howto/index.md index 375a5025a..3bf0cca3a 100644 --- a/docs/src/capi/howto/index.md +++ b/docs/src/capi/howto/index.md @@ -14,11 +14,13 @@ Overview :glob: :titlesonly: -external-etcd +Use external etcd rollout-upgrades +in-place-upgrades upgrade-providers migrate-management custom-ck8s +refresh-certs ``` --- diff --git a/docs/src/capi/howto/migrate-management.md b/docs/src/capi/howto/migrate-management.md index 11a1474f3..f902a0731 100644 --- a/docs/src/capi/howto/migrate-management.md +++ b/docs/src/capi/howto/migrate-management.md @@ -1,4 +1,4 @@ -# Migrate the managment cluster +# Migrate the management cluster Management cluster migration is a really powerful operation in the cluster’s lifecycle as it allows admins to move the management cluster in a more reliable substrate or perform maintenance tasks without disruptions. diff --git a/docs/src/capi/howto/refresh-certs.md b/docs/src/capi/howto/refresh-certs.md new file mode 100644 index 000000000..9f8d3347d --- /dev/null +++ b/docs/src/capi/howto/refresh-certs.md @@ -0,0 +1,107 @@ +# Refreshing Workload Cluster Certificates + +This how-to will walk you through the steps to refresh the certificates for +both control plane and worker nodes in your {{product}} Cluster API cluster. + +## Prerequisites + +- A Kubernetes management cluster with Cluster API and Canonical K8s providers + installed and configured. +- A target workload cluster managed by Cluster API. +- `kubectl` installed and configured to access your management cluster. + +Please refer to the [getting-started guide][getting-started] for further +details on the required setup. +This guide refers to the workload cluster as `c1`. + +```{note} To refresh the certificates in your cluster, make sure it was +initially set up with self-signed certificates. You can verify this by +checking the `CK8sConfigTemplate` resource for the cluster to see if a +`BootstrapConfig` value was provided with the necessary certificates. +``` + +### Refresh Control Plane Node Certificates + +To refresh the certificates on control plane nodes, follow these steps for each +control plane node in your workload cluster: + +1. First, check the names of the control plane machines in your cluster: + +``` +clusterctl describe cluster c1 +``` + +2. For each control plane machine, annotate the machine resource with the +`v1beta2.k8sd.io/refresh-certificates` annotation. The value of the annotation +should specify the duration for which the certificates will be valid. For +example, to refresh the certificates for a control plane machine named +`c1-control-plane-nwlss` to expire in 10 years, run the following command: + +``` +kubectl annotate machine c1-control-plane-nwlss v1beta2.k8sd.io/refresh-certificates=10y +``` + +```{note} The value of the annotation can be specified in years (y), months +(mo), (d) days, or any unit accepted by the [ParseDuration] function in +Go. 
+``` + +The Cluster API provider will automatically refresh the certificates on the +control plane node and restart the necessary services. To track the progress of +the certificate refresh, check the events for the machine resource: + +``` +kubectl get events --field-selector involvedObject.name=c1-control-plane-nwlss +``` + +The machine will be ready once the event `CertificatesRefreshDone` is +displayed. + +3. After the certificate refresh is complete, the new expiration date will be +displayed in the `machine.cluster.x-k8s.io/certificates-expiry` annotation of +the machine resource: + +``` +"machine.cluster.x-k8s.io/certificates-expiry": "2034-10-25T14:25:23-05:00" +``` + +### Refresh Worker Node Certificates + +To refresh the certificates on worker nodes, follow these steps for each worker +node in your workload cluster: + +1. Check the names of the worker machines in your cluster: + +``` +clusterctl describe cluster c1 +``` + +2. Add the `v1beta2.k8sd.io/refresh-certificates` annotation to each worker +machine, specifying the desired certificate validity duration. For example, to +set the certificates for `c1-worker-md-0-4lxb7-msq44` to expire in 10 years: + +``` +kubectl annotate machine c1-worker-md-0-4lxb7-msq44 v1beta2.k8sd.io/refresh-certificates=10y +``` + +The ClusterAPI provider will handle the certificate refresh and restart +necessary services. Track the progress by checking the machine's events: + +``` +kubectl get events --field-selector involvedObject.name=c1-worker-md-0-4lxb7-msq44 +``` + +The machine will be ready once the event `CertificatesRefreshDone` is +displayed. + +3. After the certificate refresh is complete, the new expiration date will be +displayed in the `machine.cluster.x-k8s.io/certificates-expiry` annotation of +the machine resource: + +``` +"machine.cluster.x-k8s.io/certificates-expiry": "2034-10-25T14:33:04-05:00" +``` + + +[getting-started]: ../tutorial/getting-started.md +[ParseDuration]: https://pkg.go.dev/time#ParseDuration diff --git a/docs/src/capi/index.md b/docs/src/capi/index.md index df4102a01..87b56a3ab 100644 --- a/docs/src/capi/index.md +++ b/docs/src/capi/index.md @@ -1,5 +1,21 @@ # Installing {{product}} with Cluster API +```{toctree} +:hidden: +Overview +``` + +```{toctree} +:hidden: +:titlesonly: +:glob: +:caption: Deploy with Cluster API +tutorial/index.md +howto/index.md +explanation/index.md +reference/index.md +``` + Cluster API (CAPI) is a Kubernetes project focused on providing declarative APIs and tooling to simplify provisioning, upgrading, and operating multiple Kubernetes clusters. The supporting infrastructure, like virtual machines, networks, load balancers, and VPCs, as well as the cluster configuration are all defined in the same way that cluster operators are already familiar with. {{product}} supports deploying and operating Kubernetes through CAPI. ![Illustration depicting working on components and clouds][logo] @@ -55,10 +71,10 @@ and constructive feedback. 
[Code of Conduct]: https://ubuntu.com/community/ethos/code-of-conduct -[community]: /charm/reference/community -[contribute]: /snap/howto/contribute -[roadmap]: /snap/reference/roadmap -[overview page]: /charm/explanation/about -[arch]: /charm/reference/architecture +[community]: ../charm/reference/community +[contribute]: ../snap/howto/contribute +[roadmap]: ../snap/reference/roadmap +[overview page]: ../charm/explanation/about +[arch]: ../charm/reference/architecture [Juju]: https://juju.is -[k8s snap package]: /snap/index \ No newline at end of file +[k8s snap package]: ../snap/index \ No newline at end of file diff --git a/docs/src/capi/reference/annotations.md b/docs/src/capi/reference/annotations.md index 8f9e87fa9..3d540b446 100644 --- a/docs/src/capi/reference/annotations.md +++ b/docs/src/capi/reference/annotations.md @@ -7,9 +7,17 @@ pairs that can be used to reflect additional metadata for CAPI resources. The following annotations can be set on CAPI `Machine` resources. +### In-place Upgrade + | Name | Description | Values | Set by user | |-----------------------------------------------|------------------------------------------------------|------------------------------|-------------| | `v1beta2.k8sd.io/in-place-upgrade-to` | Trigger a Kubernetes version upgrade on that machine | snap version e.g.:
- `localPath=/full/path/to/k8s.snap`
- `revision=123`
- `channel=latest/edge` | yes | | `v1beta2.k8sd.io/in-place-upgrade-status` | The status of the version upgrade | in-progress\|done\|failed | no | | `v1beta2.k8sd.io/in-place-upgrade-release` | The current version on the machine | snap version e.g.:
- `localPath=/full/path/to/k8s.snap`
- `revision=123`
- `channel=latest/edge` | no | | `v1beta2.k8sd.io/in-place-upgrade-change-id` | The ID of the currently running upgrade | ID string | no | + +### Refresh Certificates + +| Name | Description | Values | Set by user | +|-----------------------------------------------|------------------------------------------------------|------------------------------|-------------| +| `v1beta2.k8sd.io/refresh-certificates` | The requested duration (TTL) that the refreshed certificates should expire in. | Duration (TTL) string. A number followed by a unit e.g.: `1mo`, `1y`, `90d`
Allowed units: Any unit supported by `time.ParseDuration` as well as `y` (year), `mo` (month) and `d` (day). | yes | diff --git a/docs/src/capi/reference/configs.md b/docs/src/capi/reference/configs.md index 60ce9bebe..870d240f9 100644 --- a/docs/src/capi/reference/configs.md +++ b/docs/src/capi/reference/configs.md @@ -68,6 +68,7 @@ spec: - echo "second-command" ``` +(preruncommands)= ### `preRunCommands` **Type:** `[]string` @@ -107,7 +108,7 @@ spec: **Required:** no -`airGapped` is used to signal that we are deploying to an airgap environment. In this case, the provider will not attempt to install k8s-snap on the machine. The user is expected to install k8s-snap manually with [`preRunCommands`](#preRunCommands), or provide an image with k8s-snap pre-installed. +`airGapped` is used to signal that we are deploying to an air-gapped environment. In this case, the provider will not attempt to install k8s-snap on the machine. The user is expected to install k8s-snap manually with [`preRunCommands`](#preruncommands), or provide an image with k8s-snap pre-installed. **Example Usage:** ```yaml @@ -120,7 +121,7 @@ spec: **Required:** no -`initConfig` is configuration for the initializing the cluster features +`initConfig` is configuration for the initialising the cluster features **Fields:** @@ -192,8 +193,8 @@ spec: | `datastoreType` | `string` | The type of datastore to use for the control plane. | `""` | | `datastoreServersSecretRef` | `struct{name:str, key:str}` | A reference to a secret containing the datastore servers. | `{}` | | `k8sDqlitePort` | `int` | The port to use for k8s-dqlite. If unset, 2379 (etcd) will be used. | `2379` | -| `microclusterAddress` | `string` | The address (or CIDR) to use for microcluster. If unset, the default node interface is chosen. | `""` | -| `microclusterPort` | `int` | The port to use for microcluster. If unset, ":2380" (etcd peer) will be used. | `":2380"` | +| `microclusterAddress` | `string` | The address (or CIDR) to use for MicroCluster. If unset, the default node interface is chosen. | `""` | +| `microclusterPort` | `int` | The port to use for MicroCluster. If unset, ":2380" (etcd peer) will be used. | `":2380"` | | `extraKubeAPIServerArgs` | `map[string]string` | Extra arguments to add to kube-apiserver. | `map[]` | **Example Usage:** diff --git a/docs/src/capi/reference/index.md b/docs/src/capi/reference/index.md index cd239300c..b37f2385f 100644 --- a/docs/src/capi/reference/index.md +++ b/docs/src/capi/reference/index.md @@ -12,7 +12,7 @@ Overview :titlesonly: releases annotations -community +Community configs ``` diff --git a/docs/src/capi/tutorial/getting-started.md b/docs/src/capi/tutorial/getting-started.md index 8f5554b19..71a8f823a 100644 --- a/docs/src/capi/tutorial/getting-started.md +++ b/docs/src/capi/tutorial/getting-started.md @@ -170,7 +170,7 @@ provision. You can generate a cluster manifest for a selected set of commonly used infrastructures via templates provided by the {{product}} team. -Ensure you have initialized the desired infrastructure provider and fetch +Ensure you have initialised the desired infrastructure provider and fetch the {{product}} provider repository: ``` diff --git a/docs/src/charm/explanation/index.md b/docs/src/charm/explanation/index.md index 58409c598..9b4652cf2 100644 --- a/docs/src/charm/explanation/index.md +++ b/docs/src/charm/explanation/index.md @@ -39,4 +39,4 @@ details or information such as the command reference or release notes. 
[Tutorials section]: ../tutorial/index [How-to guides]: ../howto/index [Reference section]: ../reference/index -[explanation topic]: /snap/explanation/index.md +[explanation topic]: ../../snap/explanation/index.md diff --git a/docs/src/charm/explanation/security.md b/docs/src/charm/explanation/security.md index 6c1048a19..002e071d1 100644 --- a/docs/src/charm/explanation/security.md +++ b/docs/src/charm/explanation/security.md @@ -1,2 +1,2 @@ -```{include} /snap/explanation/security.md +```{include} ../../snap/explanation/security.md ``` diff --git a/docs/src/charm/howto/charm.md b/docs/src/charm/howto/charm.md index 5b85f6b62..06c609c9e 100644 --- a/docs/src/charm/howto/charm.md +++ b/docs/src/charm/howto/charm.md @@ -9,7 +9,7 @@ This guide assumes the following: - The rest of this page assumes you already have Juju installed and have added [credentials] for a cloud and bootstrapped a controller. -- If you still need to do this, please take a look at the quickstart +- If you still need to do this, please take a look at the quick-start instructions, or, for custom clouds (OpenStack, MAAS), please consult the [Juju documentation][juju]. - You are not using the Juju 'localhost' cloud (see [localhost diff --git a/docs/src/charm/howto/contribute.md b/docs/src/charm/howto/contribute.md index eda251301..dff142dca 100644 --- a/docs/src/charm/howto/contribute.md +++ b/docs/src/charm/howto/contribute.md @@ -88,7 +88,7 @@ it on the [Diátaxis website]. In essence though, this guides the way we categorise and write our documentation. You can see there are four main categories of documentation: -- **Tutorials** for guided walkthroughs +- **Tutorials** for guided walk-throughs - **How to** pages for specific tasks and goals - **Explanation** pages which give background reasons and, well, explanations - **Reference**, where you will find the commands, the roadmap, etc. diff --git a/docs/src/charm/howto/cos-lite.md b/docs/src/charm/howto/cos-lite.md index 80efeebe3..97838e42a 100644 --- a/docs/src/charm/howto/cos-lite.md +++ b/docs/src/charm/howto/cos-lite.md @@ -28,7 +28,7 @@ juju add-model --config logging-config='=DEBUG' microk8s-ubuntu We also set the logging level to DEBUG so that helpful debug information is shown when you use `juju debug-log` (see [juju debug-log][juju-debug-log]). -Use the Ubuntu charm to deploy an application named “microk8s”: +Use the Ubuntu charm to deploy an application named `microk8s`: ``` juju deploy ubuntu microk8s --series=focal --constraints="mem=8G cores=4 root-disk=30G" @@ -36,13 +36,13 @@ juju deploy ubuntu microk8s --series=focal --constraints="mem=8G cores=4 root-di Deploy MicroK8s on Ubuntu by accessing the unit you created at the last step with `juju ssh microk8s/0` and following the -[Install Microk8s][how-to-install-microk8s] guide for configuration. +[Install MicroK8s][how-to-install-MicroK8s] guide for configuration. ```{note} Make sure to enable the hostpath-storage and MetalLB addons for -Microk8s. +MicroK8s. 
``` -Export the Microk8s kubeconfig file to your current directory after +Export the MicroK8s kubeconfig file to your current directory after configuration: ``` @@ -57,9 +57,9 @@ command): KUBECONFIG=microk8s-config.yaml juju add-k8s microk8s-cloud ``` -## Deploying COS Lite on the Microk8s cloud +## Deploying COS Lite on the MicroK8s cloud -On the Microk8s cloud, create a new model and deploy the `cos-lite` bundle: +On the MicroK8s cloud, create a new model and deploy the `cos-lite` bundle: ``` juju add-model cos-lite microk8s-cloud @@ -145,4 +145,4 @@ you can head over to the [COS Lite documentation][cos-lite-docs]. [juju-models]: https://juju.is/docs/juju/model [juju-debug-log]: https://juju.is/docs/juju/juju-debug-log [cross-model-integration]: https://juju.is/docs/juju/relation#heading--cross-model -[how-to-install-microk8s]: https://microk8s.io/docs/getting-started \ No newline at end of file +[how-to-install-MicroK8s]: https://microk8s.io/docs/getting-started \ No newline at end of file diff --git a/docs/src/charm/howto/custom-registry.md b/docs/src/charm/howto/custom-registry.md new file mode 100644 index 000000000..2db2dcc39 --- /dev/null +++ b/docs/src/charm/howto/custom-registry.md @@ -0,0 +1,76 @@ +# Configure a Custom Registry + +The `k8s` charm can be configured to use a custom container registry for its +container images. This is particularly useful if you have a private registry or +operate in an air-gapped environment where you need to pull images from a +different registry. This guide will walk you through the steps to set up `k8s` +charm to pull images from a custom registry. + +## Prerequisites + +- A running `k8s` charm cluster. +- Access to a custom container registry from the cluster (e.g., docker registry + or Harbor). + +## Configure the Charm + +To configure the charm to use a custom registry, you need to set the +`containerd_custom_registries` configuration option. This options allows +the charm to configure `containerd` to pull images from registries that require +authentication. This configuration option should be a JSON-formatted array of +credential objects. For more details on the `containerd_custom_registries` +option, refer to the [charm configurations] documentation. + +For example, to configure the charm to use a custom registry at +`myregistry.example.com:5000` with the username `myuser` and password +`mypassword`, set the `containerd_custom_registries` configuration option as +follows: + +``` +juju config k8s containerd_custom_registries='[{ + "url": "http://myregistry.example.com:5000", + "host": "myregistry.example.com:5000", + "username": "myuser", + "password": "mypassword" +}]' +``` + +Allow the charm to apply the configuration changes and wait for Juju to +indicate that the changes have been successfully applied. You can monitor the +progress by running: + +``` +juju status --watch 2s +``` + +## Verify the Configuration + +Once the charm is configured and active, verify that the custom registry is +configured correctly by creating a new workload and ensuring that the images +are being pulled from the custom registry. 
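Before creating that workload, a quick sanity check is to read the option back and confirm that Juju has stored the value you set. This is a plain `juju config` read of the same option used above, nothing specific to the charm beyond the option name:

```
juju config k8s containerd_custom_registries
```

The command should print the JSON array of credential objects exactly as you provided it.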
+ +For example, to create a new workload using the `nginx:latest` image that you +have previously pushed to the `myregistry.example.com:5000` registry, run the +following command: + +``` +kubectl run nginx --image=myregistry.example.com:5000/nginx:latest +``` + +To confirm that the image has been pulled from the custom registry and that the +workload is running, use the following command: + +``` +kubectl get pod nginx -o jsonpath='{.spec.containers[*].image}{"->"}{.status.containerStatuses[*].ready}' +``` + +The output should indicate that the image was pulled from the custom registry +and that the workload is running. + +``` +myregistry.example.com:5000/nginx:latest->true +``` + + + +[charm configurations]: https://charmhub.io/k8s/configurations diff --git a/docs/src/charm/howto/index.md b/docs/src/charm/howto/index.md index 0f885b73e..618ab666a 100644 --- a/docs/src/charm/howto/index.md +++ b/docs/src/charm/howto/index.md @@ -16,10 +16,11 @@ Overview charm install-lxd -etcd +Integrate with etcd proxy cos-lite contribute +custom-registry ``` diff --git a/docs/src/charm/howto/install-lxd.md b/docs/src/charm/howto/install-lxd.md index 321ce4e2c..99a41b7e1 100644 --- a/docs/src/charm/howto/install-lxd.md +++ b/docs/src/charm/howto/install-lxd.md @@ -24,7 +24,7 @@ profiles by running the command: lxc profile list ``` -For example, suppose we have created a model called 'myk8s'. This will +For example, suppose we have created a model called `myk8s`. This will output a table like this: ``` @@ -73,7 +73,7 @@ lxc profile show juju-myk8s ``` ```{note} For an explanation of the settings in this file, - [see below](explain-rules) + [see below](explain-rules-charm) ``` ## Deploying to a container @@ -81,7 +81,7 @@ lxc profile show juju-myk8s We can now deploy {{product}} into the LXD-based model as described in the [charm][] guide. -(explain-rules)= +(explain-rules-charm)= ## Explanation of custom LXD rules diff --git a/docs/src/charm/howto/proxy.md b/docs/src/charm/howto/proxy.md index 7a514dee9..8a57c4fc7 100644 --- a/docs/src/charm/howto/proxy.md +++ b/docs/src/charm/howto/proxy.md @@ -1,6 +1,6 @@ # Configuring proxy settings for K8s -{{product}} packages a number of utilities (eg curl, helm) which need +{{product}} packages a number of utilities (for example curl, helm) which need to fetch resources they expect to find on the internet. In a constrained network environment, such access is usually controlled through proxies. diff --git a/docs/src/charm/index.md b/docs/src/charm/index.md index 5ebb51296..83f34fe72 100644 --- a/docs/src/charm/index.md +++ b/docs/src/charm/index.md @@ -1,5 +1,20 @@ # {{product}} charm documentation +```{toctree} +:hidden: +Overview +``` + +```{toctree} +:hidden: +:titlesonly: +:caption: Deploy with Juju +tutorial/index.md +howto/index.md +explanation/index.md +reference/index.md +``` + The {{product}} charm, `k8s`, is an operator: software which wraps an application and contains all of the instructions necessary for deploying, configuring, scaling, integrating the application on any cloud supported by @@ -66,10 +81,10 @@ and constructive feedback. 
[Code of Conduct]: https://ubuntu.com/community/ethos/code-of-conduct -[community]: /charm/reference/community -[contribute]: /snap/howto/contribute -[roadmap]: /snap/reference/roadmap -[overview page]: /charm/explanation/about -[arch]: /charm/reference/architecture +[community]: reference/community +[contribute]: ../snap/howto/contribute +[roadmap]: ../snap/reference/roadmap +[overview page]: explanation/about +[arch]: reference/architecture [Juju]: https://juju.is -[k8s snap package]: /snap/index \ No newline at end of file +[k8s snap package]: ../snap/index \ No newline at end of file diff --git a/docs/src/charm/reference/architecture.md b/docs/src/charm/reference/architecture.md index 28a51c753..2e2696ba5 100644 --- a/docs/src/charm/reference/architecture.md +++ b/docs/src/charm/reference/architecture.md @@ -1,5 +1,5 @@ # K8s charm architecture -```{include} /snap/reference/architecture.md +```{include} ../../snap/reference/architecture.md :start-after: '## Canonical K8s charms' ``` diff --git a/docs/src/charm/reference/charms.md b/docs/src/charm/reference/charms.md index 29450218a..eb889da5e 100644 --- a/docs/src/charm/reference/charms.md +++ b/docs/src/charm/reference/charms.md @@ -24,13 +24,13 @@ The source code for both charms is contained in a single repository: [https://github.com/canonical/k8s-operator][repo] -Please see the [readme file][] there for further specifics of the charm +Please see the [README file][] there for further specifics of the charm implementation. [Juju]: https://juju.is -[explaining channels]: /charm/explanation/channels +[explaining channels]: ../explanation/channels [cs-k8s]: https://charmhub.io/k8s [cs-k8s-worker]: https://charmhub.io/k8s-worker -[readme file]: https://github.com/canonical/k8s-operator#readme +[README file]: https://github.com/canonical/k8s-operator#readme [repo]: https://github.com/canonical/k8s-operator \ No newline at end of file diff --git a/docs/src/charm/reference/index.md b/docs/src/charm/reference/index.md index 606de9a9a..d145e1bfd 100644 --- a/docs/src/charm/reference/index.md +++ b/docs/src/charm/reference/index.md @@ -15,7 +15,7 @@ releases charms proxy architecture -community +Community ``` diff --git a/docs/src/charm/reference/proxy.md b/docs/src/charm/reference/proxy.md index e4dfd0e16..1ebabb32b 100644 --- a/docs/src/charm/reference/proxy.md +++ b/docs/src/charm/reference/proxy.md @@ -1,2 +1,2 @@ -```{include} /snap/reference/proxy.md +```{include} ../../snap/reference/proxy.md ``` \ No newline at end of file diff --git a/docs/src/charm/tutorial/getting-started.md b/docs/src/charm/tutorial/getting-started.md index c43b189e5..b0859a5d1 100644 --- a/docs/src/charm/tutorial/getting-started.md +++ b/docs/src/charm/tutorial/getting-started.md @@ -239,5 +239,5 @@ informed of updates. 
[Juju client]: https://juju.is/docs/juju/install-and-manage-the-client [Juju tutorial]: https://juju.is/docs/juju/tutorial [Kubectl]: https://kubernetes.io/docs/reference/kubectl/ -[the channel explanation page]: /snap/explanation/channels -[releases page]: /charm/reference/releases \ No newline at end of file +[the channel explanation page]: ../../snap/explanation/channels +[releases page]: ../reference/releases \ No newline at end of file diff --git a/docs/src/snap/explanation/certificates.md b/docs/src/snap/explanation/certificates.md index 5417bb13b..63ee334b6 100644 --- a/docs/src/snap/explanation/certificates.md +++ b/docs/src/snap/explanation/certificates.md @@ -3,7 +3,7 @@ Certificates are a crucial part of Kubernetes' security infrastructure, serving to authenticate and secure communication within the cluster. They play a key role in ensuring that communication between various components (such as the -API server, kubelets, and the datastore) is both encrypted and restricted to +API server, kubelet, and the datastore) is both encrypted and restricted to authorised components only. In Kubernetes, [X.509][] certificates are primarily used for diff --git a/docs/src/snap/explanation/cis.md b/docs/src/snap/explanation/cis.md new file mode 100644 index 000000000..527342676 --- /dev/null +++ b/docs/src/snap/explanation/cis.md @@ -0,0 +1,28 @@ +# CIS Hardening + +CIS Hardening refers to the process of implementing security configurations that +align with the benchmarks set forth by the [Center for Internet Security] (CIS). +These [benchmarks] are a set of best practices and guidelines designed to secure +various software and hardware systems, including Kubernetes clusters. The +primary goal of CIS hardening is to reduce the attack surface and enhance the +overall security posture of an environment by enforcing configurations that are +known to protect against common vulnerabilities and threats. + +## Why is CIS Hardening Important for Kubernetes? + +Kubernetes, by its nature, is a complex system with many components interacting +in a distributed environment. This complexity can introduce numerous security +risks if not properly managed such as unauthorised access, data breaches and +service disruption. CIS hardening for Kubernetes focuses on configuring various +components of a Kubernetes cluster to meet the security standards specified in +the [CIS Kubernetes Benchmark]. + +## Apply CIS Hardening to {{product}} + +If you would like to apply CIS hardening to your cluster see our [how-to guide]. + + +[benchmarks]: https://www.cisecurity.org/cis-benchmarks +[Center for Internet Security]: https://www.cisecurity.org/ +[CIS Kubernetes Benchmark]: https://www.cisecurity.org/benchmark/kubernetes +[how-to guide]: ../howto/cis-hardening.md \ No newline at end of file diff --git a/docs/src/snap/explanation/clustering.md b/docs/src/snap/explanation/clustering.md index a185a6f8c..374c46d66 100644 --- a/docs/src/snap/explanation/clustering.md +++ b/docs/src/snap/explanation/clustering.md @@ -18,14 +18,13 @@ and scheduling of workloads. This is the overview of a {{product}} cluster: -```{kroki} ../../assets/ck-cluster.puml -``` +![cluster6][] ## The Role of `k8sd` in Kubernetes Clustering `k8sd` plays a vital role in the {{product}} architecture, enhancing the functionality of both the Control Plane and Worker nodes through the use -of [microcluster]. This component simplifies cluster management tasks, such as +of [MicroCluster]. 
This component simplifies cluster management tasks, such as adding or removing nodes and integrating them into the cluster. It also manages essential features like DNS and networking within the cluster, streamlining the entire process for a more efficient operation. @@ -69,7 +68,11 @@ entire life-cycle. Their components include: - **Container Runtime**: The software responsible for running containers. In {{product}} the runtime is `containerd`. + + +[cluster6]: https://assets.ubuntu.com/v1/e6d02e9c-cluster6.svg + [Kubernetes Components]: https://kubernetes.io/docs/concepts/overview/components/ -[microcluster]: https://github.com/canonical/microcluster +[MicroCluster]: https://github.com/canonical/microcluster diff --git a/docs/src/snap/explanation/epa.md b/docs/src/snap/explanation/epa.md new file mode 100644 index 000000000..8d3786991 --- /dev/null +++ b/docs/src/snap/explanation/epa.md @@ -0,0 +1,547 @@ +# Enhanced Platform Awareness + +Enhanced Platform Awareness (EPA) is a methodology and a set of enhancements +across various layers of the orchestration stack. + +EPA focuses on discovering, scheduling and isolating server hardware +capabilities. This document provides a detailed guide of how EPA applies to +{{product}}, which centre around the following technologies: + +- **HugePage support**: In GA from Kubernetes v1.14, this feature enables the + discovery, scheduling and allocation of HugePages as a first-class + resource. +- **Real-time kernel**: Ensures that high-priority tasks are run within a + predictable time frame, providing the low latency and high determinism + essential for time-sensitive applications. +- **CPU pinning** (CPU Manager for Kubernetes (CMK)): In GA from Kubernetes + v1.26, provides mechanisms for CPU pinning and isolation of containerised + workloads. +- **NUMA topology awareness**: Ensures that CPU and memory allocation are + aligned according to the NUMA architecture, reducing memory latency and + increasing performance for memory-intensive applications. +- **Single Root I/O Virtualisation (SR-IOV)**: Enhances networking by enabling + virtualisation of a single physical network device into multiple virtual + devices. +- **DPDK (Data Plane Development Kit)**: A set of libraries and drivers for + fast packet processing, designed to run in user space, optimising network + performance and reducing latency. + +This document provides relevant links to detailed instructions for setting up +and installing these technologies. It is designed for developers and architects +who wish to integrate these new technologies into their {{product}}-based +networking solutions. The separate [how to guide][howto-epa] for EPA includes the +necessary steps to implement these features on {{product}}. + +## HugePages + +HugePages are a feature in the Linux kernel which enables the allocation of +larger memory pages. This reduces the overhead of managing large amounts of +memory and can improve performance for applications that require significant +memory access. + +### Key features + +- **Larger memory pages**: HugePages provide larger memory pages (e.g., 2MB or + 1GB) compared to the standard 4KB pages, reducing the number of pages the + system must manage. +- **Reduced overhead**: By using fewer, larger pages, the system reduces the + overhead associated with page table entries, leading to improved memory + management efficiency. +- **Improved TLB performance**: The Translation Lookaside Buffer (TLB) stores + recent translations of virtual memory to physical memory addresses. 
Using + HugePages increases TLB hit rates, reducing the frequency of memory + translation lookups. +- **Enhanced application performance**: Applications that access large amounts + of memory can benefit from HugePages by experiencing lower latency and + higher throughput due to reduced page faults and better memory access + patterns. +- **Support for high-performance workloads**: Ideal for high-performance + computing (HPC) applications, databases and other memory-intensive + workloads that demand efficient and fast memory access. +- **Native Kubernetes integration**: Starting from Kubernetes v1.14, HugePages + are supported as a native, first-class resource, enabling their + discovery, scheduling and allocation within Kubernetes environments. + +### Application to Kubernetes + +The architecture for HugePages on Kubernetes integrates the management and +allocation of large memory pages into the Kubernetes orchestration system. Here +are the key architectural components and their roles: + +- **Node configuration**: Each Kubernetes node must be configured to reserve + HugePages. This involves setting the number of HugePages in the node's + kernel boot parameters. +- **Kubelet configuration**: The `kubelet` on each node must be configured to + recognise and manage HugePages. This is typically done through the `kubelet` + configuration file, specifying the size and number of HugePages. +- **Pod specification**: HugePages are requested and allocated at the pod + level through resource requests and limits in the pod specification. Pods + can request specific sizes of HugePages (e.g., 2MB or 1GB). +- **Scheduler awareness**: The Kubernetes scheduler is aware of HugePages as a + resource and schedules pods onto nodes that have sufficient HugePages + available. This ensures that pods with HugePages requirements are placed + appropriately. Scheduler configurations and policies can be adjusted to + optimise HugePages allocation and utilisation. +- **Node Feature Discovery (NFD)**: Node Feature Discovery can be used to + label nodes with their HugePages capabilities. This enables scheduling + decisions to be based on the available HugePages resources. +- **Resource quotas and limits**: Kubernetes enables the definition of resource + quotas and limits to control the allocation of HugePages across namespaces. + This helps in managing and isolating resource usage effectively. +- **Monitoring and metrics**: Kubernetes provides tools and integrations + (e.g., Prometheus, Grafana) to monitor and visualise HugePages usage across + the cluster. This helps in tracking resource utilisation and performance. + Metrics can include HugePages allocation, usage and availability on each + node, aiding in capacity planning and optimisation. + +## Real-time kernel + +A real-time kernel ensures that high-priority tasks are run within a +predictable time frame, crucial for applications requiring low latency and high +determinism. Note that this can also impede applications which were not +designed with these considerations. + +### Key features + +- **Predictable task execution**: A real-time kernel ensures that + high-priority tasks are run within a predictable and bounded time frame, + reducing the variability in task execution time. +- **Low latency**: The kernel is optimised to minimise the time it takes to + respond to high-priority tasks, which is crucial for applications that + require immediate processing. 
+- **Priority-based scheduling**: Tasks are scheduled based on their priority + levels, with real-time tasks being given precedence over other types of + tasks to ensure they are processed promptly. +- **Deterministic behaviour**: The kernel guarantees deterministic behaviour, + meaning the same task will have the same response time every time it is + run, essential for time-sensitive applications. +- **Preemption:** The real-time kernel supports preemptive multitasking, + allowing high-priority tasks to interrupt lower-priority tasks to ensure + critical tasks are run without delay. +- **Resource reservation**: System resources (such as CPU and memory) can be + reserved by the kernel for real-time tasks, ensuring that these resources + are available when needed. +- **Enhanced interrupt handling**: Interrupt handling is optimised to ensure + minimal latency and jitter, which is critical for maintaining the + performance of real-time applications. +- **Real-time scheduling policies**: The kernel includes specific scheduling + policies (e.g., SCHED\_FIFO, SCHED\_RR) designed to manage real-time tasks + effectively and ensure they meet their deadlines. + +These features make a real-time kernel ideal for applications requiring precise +timing and high reliability. + +### Application to Kubernetes + +The architecture for integrating a real-time kernel into Kubernetes involves +several components and configurations to ensure that high-priority, low-latency +tasks can be managed effectively within a Kubernetes environment. Here are the +key architectural components and their roles: + +- **Real-time kernel installation**: Each Kubernetes node must run a real-time + kernel. This involves installing a real-time kernel package and configuring + the system to use it. +- **Kernel boot parameters**: The kernel boot parameters must be configured to + optimise for real-time performance. This includes isolating CPU cores and + configuring other kernel parameters for real-time behaviour. +- **Kubelet configuration**: The `kubelet` on each node must be configured to + recognise and manage real-time workloads. This can involve setting specific + `kubelet` flags and configurations. +- **Pod specification**: Real-time workloads are specified at the pod level + through resource requests and limits. Pods can request dedicated CPU cores + and other resources to ensure they meet real-time requirements. +- **CPU Manager**: Kubernetes’ CPU Manager is a critical component for + real-time workloads. It enables the static allocation of CPUs to + containers, ensuring that specific CPU cores are dedicated to particular + workloads. +- **Scheduler awareness**: The Kubernetes scheduler must be aware of real-time + requirements and prioritise scheduling pods onto nodes with available + real-time resources. +- **Priority and preemption**: Kubernetes supports priority and preemption to + ensure that critical real-time pods are scheduled and run as needed. This + involves defining pod priorities and enabling preemption to ensure + high-priority pods can displace lower-priority ones if necessary. +- **Resource quotas and limits**: Kubernetes can define resource quotas + and limits to control the allocation of resources for real-time workloads + across namespaces. This helps manage and isolate resource usage effectively. +- **Monitoring and metrics**: Monitoring tools such as Prometheus and Grafana + can be used to track the performance and resource utilisation of real-time + workloads. 
Metrics include CPU usage, latency and task scheduling times, + which help in optimising and troubleshooting real-time applications. +- **Security and isolation**: Security contexts and isolation mechanisms + ensure that real-time workloads are protected and run in a controlled + environment. This includes setting privileged containers and configuring + namespaces. + +## CPU pinning + +CPU pinning enables specific CPU cores to be dedicated to a particular process +or container, ensuring that the process runs on the same CPU core(s) every +time, which reduces context switching and cache invalidation. + +### Key features + +- **Dedicated CPU Cores**: CPU pinning allocates specific CPU cores to a + process or container, ensuring consistent and predictable CPU usage. +- **Reduced context switching**: By running a process or container on the same + CPU core(s), CPU pinning minimises the overhead associated with context + switching, leading to better performance. +- **Improved cache utilisation**: When a process runs on a dedicated CPU core, + it can take full advantage of the CPU cache, reducing the need to fetch data + from main memory and improving overall performance. +- **Enhanced application performance**: Applications that require low latency + and high performance benefit from CPU pinning as it ensures they have + dedicated processing power without interference from other processes. +- **Consistent performance**: CPU pinning ensures that a process or container + receives consistent CPU performance, which is crucial for real-time and + performance-sensitive applications. +- **Isolation of workloads**: CPU pinning isolates workloads on specific CPU + cores, preventing them from being affected by other workloads running on + different cores. This is especially useful in multi-tenant environments. +- **Improved predictability**: By eliminating the variability introduced by + sharing CPU cores, CPU pinning provides more predictable performance + characteristics for critical applications. +- **Integration with Kubernetes**: Kubernetes supports CPU pinning through the + CPU Manager (in GA since v1.26), which allows for the static allocation of + CPUs to containers. This ensures that containers with high CPU demands have + the necessary resources. + +### Application to Kubernetes + +The architecture for CPU pinning in Kubernetes involves several components and +configurations to ensure that specific CPU cores can be dedicated to particular +processes or containers, thereby enhancing performance and predictability. Here +are the key architectural components and their roles: + +- **Kubelet configuration**: The `kubelet` on each node must be configured to + enable CPU pinning. This involves setting specific `kubelet` flags to + activate the CPU Manager. +- **CPU manager**: Kubernetes’ CPU Manager is a critical component for CPU + pinning. It allows for the static allocation of CPUs to containers, ensuring + that specific CPU cores are dedicated to particular workloads. The CPU + Manager can be configured to either static or none. Static policy enables + exclusive CPU core allocation to Guaranteed QoS (Quality of Service) pods. +- **Pod specification**: Pods must be specified to request dedicated CPU + resources. This is done through resource requests and limits in the pod + specification. +- **Scheduler awareness**: The Kubernetes scheduler must be aware of the CPU + pinning requirements. It schedules pods onto nodes with available CPU + resources as requested by the pod specification. 
The scheduler ensures that + pods with specific CPU pinning requests are placed on nodes with sufficient + free dedicated CPUs. +- **NUMA Topology Awareness**: For optimal performance, CPU pinning should be + aligned with NUMA (Non-Uniform Memory Access) topology. This ensures that + memory accesses are local to the CPU, reducing latency. Kubernetes can be + configured to be NUMA-aware, using the Topology Manager to align CPU + and memory allocation with NUMA nodes. +- **Node Feature Discovery (NFD)**: Node Feature Discovery can be used to + label nodes with their CPU capabilities, including the availability of + isolated and reserved CPU cores. +- **Resource quotas and limits**: Kubernetes can define resource quotas + and limits to control the allocation of CPU resources across namespaces. + This helps in managing and isolating resource usage effectively. +- **Monitoring and metrics**: Monitoring tools such as Prometheus and Grafana + can be used to track the performance and resource utilisation of CPU-pinned + workloads. Metrics include CPU usage, core allocation and task scheduling + times, which help in optimising and troubleshooting performance-sensitive + applications. +- **Isolation and security**: Security contexts and isolation mechanisms + ensure that CPU-pinned workloads are protected and run in a controlled + environment. This includes setting privileged containers and configuring + namespaces to avoid resource contention. +- **Performance Tuning**: Additional performance tuning can be achieved by + isolating CPU cores at the OS level and configuring kernel parameters to + minimise interference from other processes. This includes setting CPU + isolation and `nohz_full` parameters (reduces the number of scheduling-clock + interrupts, improving energy efficiency and [reducing OS jitter][no_hz]). + +## NUMA topology awareness + +NUMA (Non-Uniform Memory Access) topology awareness ensures that the CPU and +memory allocation are aligned according to the NUMA architecture, which can +reduce memory latency and increase performance for memory-intensive +applications. + +The Kubernetes Memory Manager enables the feature of guaranteed memory (and +HugePages) allocation for pods in the Guaranteed QoS (Quality of Service) +class. + +The Memory Manager employs hint generation protocol to yield the most suitable +NUMA affinity for a pod. The Memory Manager feeds the central manager (Topology +Manager) with these affinity hints. Based on both the hints and Topology +Manager policy, the pod is rejected or admitted to the node. + +Moreover, the Memory Manager ensures that the memory which a pod requests is +allocated from a minimum number of NUMA nodes. + +### Key features + +- **Aligned CPU and memory allocation**: NUMA topology awareness ensures that + CPUs and memory are allocated in alignment with the NUMA architecture, + minimising cross-node memory access latency. +- **Reduced memory latency**: By ensuring that memory is accessed from the + same NUMA node as the CPU, NUMA topology awareness reduces memory latency, + leading to improved performance for memory-intensive applications. +- **Increased performance**: Applications benefit from increased performance + due to optimised memory access patterns, which is especially critical for + high-performance computing and data-intensive tasks. +- **Kubernetes Memory Manager**: The Kubernetes Memory Manager supports + guaranteed memory allocation for pods in the Guaranteed QoS (Quality of + Service) class, ensuring predictable performance. 
+- **Hint generation protocol**: The Memory Manager uses a hint generation + protocol to determine the most suitable NUMA affinity for a pod, helping to + optimise resource allocation based on NUMA topology. +- **Integration with Topology Manager**: The Memory Manager provides NUMA + affinity hints to the Topology Manager. The Topology Manager then decides + whether to admit or reject the pod based on these hints and the configured + policy. +- **Optimised resource allocation**: The Memory Manager ensures that the + memory requested by a pod is allocated from the minimum number of NUMA + nodes, thereby optimising resource usage and performance. +- **Enhanced scheduling decisions**: The Kubernetes scheduler, in conjunction + with the Topology Manager, makes informed decisions about pod placement to + ensure optimal NUMA alignment, improving overall cluster efficiency. +- **Support for HugePages**: The Memory Manager also supports the allocation + of HugePages, ensuring that large memory pages are allocated in a NUMA-aware + manner, further enhancing performance for applications that require large + memory pages. +- **Improved application predictability**: By aligning CPU and memory + allocation with NUMA topology, applications experience more predictable + performance characteristics, crucial for real-time and latency-sensitive + workloads. +- **Policy-Based Management**: NUMA topology awareness can be managed through + policies so that administrators can configure how resources should be + allocated based on the NUMA architecture, providing flexibility and control. + +### Application to Kubernetes + +The architecture for NUMA topology awareness in Kubernetes involves several +components and configurations to ensure that CPU and memory allocations are +optimised according to the NUMA architecture. This setup reduces memory latency +and enhances performance for memory intensive applications. Here are the key +architectural components and their roles: + +- **Node configuration**: Each Kubernetes node must have NUMA-aware hardware. + The system's NUMA topology can be inspected using tools such as `lscpu` or + `numactl`. +- **Kubelet configuration**: The `kubelet` on each node must be configured to + enable NUMA topology awareness. This involves setting specific `kubelet` + flags to activate the Topology Manager. +- **Topology Manager**: The Topology Manager is a critical component that + coordinates resource allocation based on NUMA topology. It receives NUMA + affinity hints from other managers (e.g., CPU Manager, Device Manager) and + makes informed scheduling decisions. +- **Memory Manager**: The Kubernetes Memory Manager is responsible for + managing memory allocation, including HugePages, in a NUMA-aware manner. It + ensures that memory is allocated from the minimum number of NUMA nodes + required. The Memory Manager uses a hint generation protocol to provide NUMA + affinity hints to the Topology Manager. +- **Pod specification**: Pods can be specified to request NUMA-aware resource + allocation through resource requests and limits, ensuring that they get + allocated in alignment with the NUMA topology. +- **Scheduler awareness**: The Kubernetes scheduler works in conjunction with + the Topology Manager to place pods on nodes that meet their NUMA affinity + requirements. The scheduler considers NUMA topology during the scheduling + process to optimise performance. 
+- **Node Feature Discovery (NFD)**: Node Feature Discovery can be used to + label nodes with their NUMA capabilities, providing the scheduler with + information to make more informed placement decisions. +- **Resource quotas and limits**: Kubernetes allows defining resource quotas + and limits to control the allocation of NUMA-aware resources across + namespaces. This helps in managing and isolating resource usage effectively. +- **Monitoring and metrics**: Monitoring tools such as Prometheus and Grafana + can be used to track the performance and resource utilisation of NUMA-aware + workloads. Metrics include CPU and memory usage per NUMA node, helping in + optimising and troubleshooting performance-sensitive applications. +- **Isolation and security**: Security contexts and isolation mechanisms + ensure that NUMA-aware workloads are protected and run in a controlled + environment. This includes setting privileged containers and configuring + namespaces to avoid resource contention. +- **Performance tuning**: Additional performance tuning can be achieved by + configuring kernel parameters and using tools like `numactl` to bind + processes to specific NUMA nodes. + +## SR-IOV (Single Root I/O Virtualisation) + +SR-IOV enables a single physical network device to appear as multiple separate +virtual devices. This can be beneficial for network-intensive applications that +require direct access to the network hardware. + +### Key features + +- **Multiple Virtual Functions (VFs)**: SR-IOV enables a single physical + network device to be partitioned into multiple virtual functions (VFs), each + of which can be assigned to a virtual machine or container as a separate + network interface. +- **Direct hardware access**: By providing direct access to the physical + network device, SR-IOV bypasses the software-based network stack, reducing + overhead and improving network performance and latency. +- **Improved network throughput**: Applications can achieve higher network + throughput as SR-IOV enables high-speed data transfer directly + between the network device and the application. +- **Reduced CPU utilisation**: Offloading network processing to the hardware + reduces the CPU load on the host system, freeing up CPU resources for other + tasks and improving overall system performance. +- **Isolation and security**: Each virtual function (VF) is isolated from + others, providing security and stability. This isolation ensures that issues + in one VF do not affect other VFs or the physical function (PF). +- **Dynamic resource allocation**: SR-IOV supports dynamic allocation of + virtual functions, enabling resources to be adjusted based on application + demands without requiring changes to the physical hardware setup. +- **Enhanced virtualisation support**: SR-IOV is particularly beneficial in + virtualised environments, enabling better network performance for virtual + machines and containers by providing them with dedicated network interfaces. +- **Kubernetes integration**: Kubernetes supports SR-IOV through the use of + network device plugins, enabling the automatic discovery, allocation, + and management of virtual functions. +- **Compatibility with Network Functions Virtualisation (NFV)**: SR-IOV is + widely used in NFV deployments to meet the high-performance networking + requirements of virtual network functions (VNFs), such as firewalls, + routers and load balancers. 
+- **Reduced network latency**: As network packets can bypass the + hypervisor's virtual switch, SR-IOV significantly reduces network latency, + making it ideal for latency-sensitive applications. + +### Application to Kubernetes + +The architecture for SR-IOV (Single Root I/O Virtualisation) in Kubernetes +involves several components and configurations to ensure that virtual functions +(VFs) from a single physical network device can be managed and allocated +efficiently. This setup enhances network performance and provides direct access +to network hardware for applications requiring high throughput and low latency. +Here are the key architectural components and their roles: + +- **Node configuration**: Each Kubernetes node with SR-IOV capable hardware + must have the SR-IOV drivers and tools installed. This includes the SR-IOV + network device plugin and associated drivers. +- **SR-IOV enabled network interface**: The physical network interface card + (NIC) must be configured to support SR-IOV. This involves enabling SR-IOV in + the system BIOS and configuring the NIC to create virtual functions (VFs). +- **SR-IOV network device plugin**: The SR-IOV network device plugin is + deployed as a DaemonSet in Kubernetes. It discovers SR-IOV capable network + interfaces and manages the allocation of virtual functions (VFs) to pods. +- **Device Plugin Configuration**: The SR-IOV device plugin requires a + configuration file that specifies the network devices and the number of + virtual functions (VFs) to be managed. +- **Pod specification**: Pods can request SR-IOV virtual functions by + specifying resource requests and limits in the pod specification. The SR-IOV + device plugin allocates the requested VFs to the pod. +- **Scheduler awareness**: The Kubernetes scheduler must be aware of the + SR-IOV resources available on each node. The device plugin advertises the + available VFs as extended resources, which the scheduler uses to place pods + accordingly. Scheduler configuration ensures pods with SR-IOV requests are + scheduled on nodes with available VFs. +- **Resource quotas and limits**: Kubernetes enables the definition of + resource quotas and limits to control the allocation of SR-IOV resources + across namespaces. This helps manage and isolate resource usage effectively. +- **Monitoring and metrics**: Monitoring tools such as Prometheus and Grafana + can be used to track the performance and resource utilisation of + SR-IOV-enabled workloads. Metrics include VF allocation, network throughput, + and latency, helping optimise and troubleshoot performance-sensitive + applications. +- **Isolation and security**: SR-IOV provides isolation between VFs, ensuring + that each VF operates independently and securely. This isolation is critical + for multi-tenant environments where different workloads share the same + physical network device. +- **Dynamic resource allocation**: SR-IOV supports dynamic allocation and + deallocation of VFs, enabling Kubernetes to adjust resources based on + application demands without requiring changes to the physical hardware + setup. + +## DPDK (Data Plane Development Kit) + +The Data Plane Development Kit (DPDK) is a set of libraries and drivers for +fast packet processing. It is designed to run in user space, so that +applications can achieve high-speed packet processing by bypassing the kernel. 
+DPDK is used to optimise network performance and reduce latency, making it +ideal for applications that require high-throughput and low-latency networking, +such as telecommunications, cloud data centres and network functions +virtualisation (NFV). + +### Key features + +- **High performance**: DPDK can process millions of packets per second per + core, using multi-core CPUs to scale performance. +- **User-space processing**: By running in user space, DPDK avoids the + overhead of kernel context switches and uses HugePages for better + memory performance. +- **Poll Mode Drivers (PMD)**: DPDK uses PMDs that poll for packets instead of + relying on interrupts, which reduces latency. + +### DPDK architecture + +The main goal of the DPDK is to provide a simple, complete framework for fast +packet processing in data plane applications. Anyone can use the code to +understand some of the techniques employed, to build upon for prototyping or to +add their own protocol stacks. + +The framework creates a set of libraries for specific environments through the +creation of an Environment Abstraction Layer (EAL), which may be specific to a +mode of the Intel® architecture (32-bit or 64-bit), user space +compilers or a specific platform. These environments are created through the +use of Meson files (needed by Meson, the software tool for automating the +building of software that DPDK uses) and configuration files. Once the EAL +library is created, the user may link with the library to create their own +applications. Other libraries, outside of EAL, including the Hash, Longest +Prefix Match (LPM) and rings libraries are also provided. Sample applications +are provided to help show the user how to use various features of the DPDK. + +The DPDK implements a run-to-completion model for packet processing, where all +resources must be allocated prior to calling data plane applications, running +as execution units on logical processing cores. The model does not support a +scheduler and all devices are accessed by polling. The primary reason for not +using interrupts is the performance overhead imposed by interrupt processing. + +In addition to the run-to-completion model, a pipeline model may also be used +by passing packets or messages between cores via the rings. This enables work +to be performed in stages and is a potentially more efficient use of code on +cores. This is suitable for scenarios where each pipeline must be mapped to a +specific application thread or when multiple pipelines must be mapped to the +same thread. + +### Application to Kubernetes + +The architecture for integrating the Data Plane Development Kit (DPDK) into +Kubernetes involves several components and configurations to ensure high-speed +packet processing and low-latency networking. DPDK enables applications to +bypass the kernel network stack, providing direct access to network hardware +and significantly enhancing network performance. Here are the key architectural +components and their roles: + +- **Node configuration**: Each Kubernetes node must have the DPDK libraries + and drivers installed. This includes setting up HugePages and binding + network interfaces to DPDK-compatible drivers. +- **HugePages configuration**: DPDK requires HugePages for efficient memory + management. Configure the system to reserve HugePages. +- **Network interface binding**: Network interfaces must be bound to + DPDK-compatible drivers (e.g., vfio-pci) to be used by DPDK applications. 
+- **DPDK application container**: Create a Docker container image with the + DPDK application and necessary libraries. Ensure that the container runs + with appropriate privileges and mounts HugePages. +- **Pod specification**: Deploy the DPDK application in Kubernetes by + specifying the necessary resources, including CPU pinning and HugePages, in + the pod specification. +- **CPU pinning**: For optimal performance, DPDK applications should use + dedicated CPU cores. Configure CPU pinning in the pod specification. +- **SR-IOV for network interfaces**: Combine DPDK with SR-IOV to provide + high-performance network interfaces. Allocate SR-IOV virtual functions (VFs) + to DPDK pods. +- **Scheduler awareness**: The Kubernetes scheduler must be aware of the + resources required by DPDK applications, including HugePages and CPU + pinning, to place pods appropriately on nodes with sufficient resources. +- **Monitoring and metrics**: Use monitoring tools like Prometheus and Grafana + to track the performance of DPDK applications, including network throughput, + latency and CPU usage. +- **Resource quotas and limits**: Define resource quotas and limits to control + the allocation of resources for DPDK applications across namespaces, + ensuring fair resource distribution and preventing resource contention. +- **Isolation and security**: Ensure that DPDK applications run in isolated + and secure environments. Use security contexts to provide the necessary + privileges while maintaining security best practices. + + + + +[no_hz]: https://www.kernel.org/doc/Documentation/timers/NO_HZ.txt +[howto-epa]: ../howto/epa + diff --git a/docs/src/snap/explanation/index.md b/docs/src/snap/explanation/index.md index 3feb4fb83..5c95b7d39 100644 --- a/docs/src/snap/explanation/index.md +++ b/docs/src/snap/explanation/index.md @@ -16,7 +16,9 @@ certificates channels clustering ingress -/snap/explanation/security +epa +security +cis ``` --- diff --git a/docs/src/snap/explanation/ingress.md b/docs/src/snap/explanation/ingress.md index 6e7c73c5d..09ebf334b 100644 --- a/docs/src/snap/explanation/ingress.md +++ b/docs/src/snap/explanation/ingress.md @@ -19,7 +19,7 @@ CNI (Container Network Interface) called [Cilium][Cilium]. If you wish to use a different network plugin the implementation and configuration falls under your responsibility. -Learn how to use the {{product}} default network in the [networking HowTo guide][Network]. +Learn how to use the {{product}} default network in the [networking how-to guide][Network]. ## Kubernetes Pods and Services @@ -54,8 +54,7 @@ that routes traffic from outside of your cluster to services inside of your clus Please do not confuse this with the Kubernetes Service LoadBalancer type which operates at layer 4 and routes traffic directly to individual pods. -```{kroki} ../../assets/ingress.puml -``` +![cluster6][] With {{product}}, enabling Ingress is easy: See the [default Ingress guide][Ingress]. @@ -73,10 +72,14 @@ the responsibility of implementation falls upon you. You will need to create the Ingress resource, outlining rules that direct traffic to your application's Kubernetes service. 
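As a minimal sketch of such a resource (the names `my-ingress` and `my-service`, the path and the port are placeholders for your own application, not values provided by {{product}}), an Ingress that routes all HTTP traffic to a single service could look like this:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-ingress
spec:
  rules:
  - http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: my-service   # the Service exposing your application
            port:
              number: 80       # the Service port that receives the traffic
```

Apply the manifest with `kubectl apply -f` and, once an Ingress controller is running in the cluster, external traffic will be routed to `my-service` according to these rules.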
+ + +[cluster6]: https://assets.ubuntu.com/v1/e6d02e9c-cluster6.svg + -[Ingress]: /snap/howto/networking/default-ingress -[Network]: /snap/howto/networking/default-network +[Ingress]: ../howto/networking/default-ingress +[Network]: ../howto/networking/default-network [Cilium]: https://cilium.io/ [network plugin]: https://kubernetes.io/docs/concepts/extend-kubernetes/compute-storage-net/network-plugins/ [Service]: https://kubernetes.io/docs/concepts/services-networking/service/ diff --git a/docs/src/snap/explanation/security.md b/docs/src/snap/explanation/security.md index 53f5ea727..8daeb368f 100644 --- a/docs/src/snap/explanation/security.md +++ b/docs/src/snap/explanation/security.md @@ -44,11 +44,11 @@ have access to your cluster. Describing the security mechanisms of these clouds is out of the scope of this documentation, but you may find the following links useful. -- Amazon Web Services -- Google Cloud Platform -- Metal As A Service(MAAS) -- Microsoft Azure -- VMWare VSphere +- [Amazon Web Services security][] +- [Google Cloud Platform security][] +- [Metal As A Service(MAAS) hardening][] +- [Microsoft Azure security][] +- [VMware VSphere hardening guides][] ## Security Compliance @@ -62,4 +62,10 @@ check the [roadmap][] for current areas of work. [Kubernetes Security documentation]: https://kubernetes.io/docs/concepts/security/overview/ [snap documentation]: https://snapcraft.io/docs/security-sandboxing [rocks-security]: https://canonical-rockcraft.readthedocs-hosted.com/en/latest/explanation/rockcraft/ -[roadmap]: /snap/reference/roadmap +[roadmap]: ../reference/roadmap +[Amazon Web Services security]: https://aws.amazon.com/security/ +[Google Cloud Platform security]:https://cloud.google.com/security/ +[Metal As A Service(MAAS) hardening]:https://maas.io/docs/snap/3.0/ui/hardening-your-maas-installation +[Microsoft Azure security]:https://docs.microsoft.com/en-us/azure/security/azure-security +[VMware VSphere hardening guides]: https://www.vmware.com/security/hardening-guides.html + diff --git a/docs/src/snap/howto/backup-restore.md b/docs/src/snap/howto/backup-restore.md index cb5345ac4..dc54a9cab 100644 --- a/docs/src/snap/howto/backup-restore.md +++ b/docs/src/snap/howto/backup-restore.md @@ -64,7 +64,7 @@ sudo k8s kubectl expose deployment nginx -n workloads --port 80 ## Install Velero Download the Velero binary from the -[releases page on github][releases] and place it in our `PATH`. In this case we +[releases page on GitHub][releases] and place it in our `PATH`. In this case we install the v1.14.1 Linux binary for AMD64 under `/usr/local/bin`: ```bash @@ -100,7 +100,7 @@ EOF ``` We are now ready to install Velero into the cluster, with an aws plugin that -[matches][aws-plugin-matching] the velero release: +[matches][aws-plugin-matching] the Velero release: ```bash SERVICE_URL="http://${SERVICE}.velero.svc:9000" diff --git a/docs/src/snap/howto/cis-hardening.md b/docs/src/snap/howto/cis-hardening.md new file mode 100644 index 000000000..f44c65cb1 --- /dev/null +++ b/docs/src/snap/howto/cis-hardening.md @@ -0,0 +1,292 @@ +# CIS compliance + +CIS Hardening refers to the process of implementing security configurations that +align with the benchmarks set by the [Center for Internet Security (CIS)][]. The +open source tool [kube-bench][] is designed to automatically check whether +your Kubernetes clusters are configured according to the +[CIS Kubernetes Benchmark][]. This guide covers how to setup your {{product}} +cluster with kube-bench. 
+ +## What you'll need + +This guide assumes the following: + +- You have a bootstrapped {{product}} cluster (see the [Getting Started] +[getting-started-guide] guide) +- You have root or sudo access to the machine + +## Install kube-bench + +Download the latest [kube-bench release][] on your Kubernetes nodes. Make sure +to select the appropriate binary version. + +For example, to download the Linux binary, use the following command. Replace +`KB` by the version listed in the releases page. + +``` +KB=8.0 +mkdir kube-bench +cd kube-bench +curl -L https://github.com/aquasecurity/kube-bench/releases/download/v0.$KB/kube-bench_0.$KB\_linux_amd64.tar.gz -o kube-bench_0.$KB\_linux_amd64.tar.gz +``` + +Extract the downloaded tarball and move the binary to a directory in your PATH: + +``` +tar -xvf kube-bench_0.$KB\_linux_amd64.tar.gz +sudo mv kube-bench /usr/local/bin/ +``` + +Verify kube-bench installation. + +``` +kube-bench version +``` + +The output should list the version installed. + +Install `kubectl` and configure it to interact with the cluster. + +```{warning} +This will override your ~/.kube/config if you already have kubectl installed in your cluster. +``` + +``` +sudo snap install kubectl --classic +mkdir ~/.kube/ +sudo k8s kubectl config view --raw > ~/.kube/config +export KUBECONFIG=~/.kube/config +``` + +Get CIS hardening checks applicable for {{product}}: + +``` +git clone -b ck8s https://github.com/canonical/kube-bench.git kube-bench-ck8s-cfg +``` + +Test-run kube-bench against {{product}}: + +``` +sudo -E kube-bench --version ck8s-dqlite-cis-1.24 --config-dir ./kube-bench-ck8s-cfg/cfg/ --config ./kube-bench-ck8s-cfg/cfg/config.yaml +``` + +## Harden your deployments + +Before running a CIS Kubernetes audit, it is essential to first harden your +{{product}} deployment to minimise vulnerabilities and ensure +compliance with industry best practices, as defined by the CIS Kubernetes +Benchmark. + +### Control plane nodes + +Run the following commands on your control plane nodes. + +#### Configure auditing + +Create an audit-policy.yaml file under `/var/snap/k8s/common/etc/` and specify +the level of auditing you desire based on the [upstream instructions][]. Here is +a minimal example of such a policy file. + +``` +sudo sh -c 'cat >/var/snap/k8s/common/etc/audit-policy.yaml <>/var/snap/k8s/common/args/kube-apiserver </var/snap/k8s/common/etc/eventconfig.yaml </var/snap/k8s/common/etc/admission-control-config-file.yaml <>/var/snap/k8s/common/args/kube-apiserver <>/var/snap/k8s/common/args/kubelet < +[Center for Internet Security (CIS)]:https://www.cisecurity.org/ +[kube-bench]:https://aquasecurity.github.io/kube-bench/v0.6.15/ +[CIS Kubernetes Benchmark]:https://www.cisecurity.org/benchmark/kubernetes +[getting-started-guide]: ../tutorial/getting-started +[kube-bench release]: https://github.com/aquasecurity/kube-bench/releases +[upstream instructions]:https://kubernetes.io/docs/tasks/debug/debug-cluster/audit/ +[rate limits]:https://kubernetes.io/docs/reference/config-api/apiserver-eventratelimit.v1alpha1 diff --git a/docs/src/snap/howto/contribute.md b/docs/src/snap/howto/contribute.md index 05e08f1d2..67f1372b9 100644 --- a/docs/src/snap/howto/contribute.md +++ b/docs/src/snap/howto/contribute.md @@ -88,7 +88,7 @@ it on the [Diátaxis website]. In essence though, this guides the way we categorise and write our documentation. 
You can see there are four main categories of documentation: -- **Tutorials** for guided walkthroughs +- **Tutorials** for guided walk-throughs - **How to** pages for specific tasks and goals - **Explanation** pages which give background reasons and, well, explanations - **Reference**, where you will find the commands, the roadmap, etc. diff --git a/docs/src/snap/howto/epa.md b/docs/src/snap/howto/epa.md new file mode 100644 index 000000000..278cb3420 --- /dev/null +++ b/docs/src/snap/howto/epa.md @@ -0,0 +1,1148 @@ +# How to set up Enhanced Platform Awareness + +This section explains how to set up the Enhanced Platform Awareness (EPA) +features in a {{product}} cluster. Please see the [EPA explanation +page][explain-epa] for details about how EPA applies to {{product}}. + +The content starts with the setup of the environment (including steps for using +[MAAS][MAAS]). Then the setup of {{product}}, including the Multus & SR-IOV/DPDK +networking components. Finally, the steps needed to test every EPA feature: +HugePages, Real-time Kernel, CPU Pinning / NUMA Topology Awareness and +SR-IOV/DPDK. + +## What you'll need + +- An Ubuntu Pro subscription (required for real-time kernel) +- Ubuntu instances **or** a MAAS environment to run {{product}} on + + +## Prepare the Environment + + +`````{tabs} +````{group-tab} Ubuntu + +First, run the `numactl` command to get the number of CPUs available for NUMA: + +``` +numactl -s +``` + +This example output shows that there are 32 CPUs available for NUMA: + +``` +policy: default +preferred node: current +physcpubind: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 +cpubind: 0 1 +nodebind: 0 1 +membind: 0 1 +``` + +```{dropdown} Detailed explanation of output + +- `policy: default`: indicates that the system is using the default NUMA policy. The default policy typically tries to allocate memory on the same node as the processor executing a task, but it can fall back to other nodes if necessary. +- `preferred node: current`: processes will prefer to allocate memory from the current node (the node where the process is running). However, if memory is not available on the current node, it can be allocated from other nodes. +- `physcpubind: 0 1 2 3 ... 31 `: shows the physical CPUs that processes are allowed to run on. In this case, the system has 32 physical CPUs enabled for NUMA, and processes can use any of them. +- `cpubind: 0 1 `: indicates the specific CPUs that the current process (meaning the process “numactl \-s”) is bound to. It's currently using CPUs 0 and 1. +- `nodebind: 0 1 `: shows the NUMA nodes that the current process (meaning the process “numactl \-s”) is allowed to use for memory allocation. It has access to both node 0 and node 1. +- `membind`: 0 1 `: confirms that the current process (meaning the process “numactl \-s”) can allocate memory from both node 0 and node 1. +``` + +### Enable the real-time kernel + +The real-time kernel enablement requires an ubuntu pro subscription and some additional tools to be available. + +``` +sudo pro attach +sudo apt update && sudo apt install ubuntu-advantage-tools +sudo pro enable realtime-kernel +``` + +This should produce output similar to: + +``` +One moment, checking your subscription first +Real-time kernel cannot be enabled with Livepatch. +Disable Livepatch and proceed to enable Real-time kernel? (y/N) y +Disabling incompatible service: Livepatch +The Real-time kernel is an Ubuntu kernel with PREEMPT_RT patches integrated. + +This will change your kernel. 
To revert to your original kernel, you will need +to make the change manually. + +Do you want to continue? [ default = Yes ]: (Y/n) Y +Updating Real-time kernel package lists +Updating standard Ubuntu package lists +Installing Real-time kernel packages +Real-time kernel enabled +A reboot is required to complete install. +``` + +First the Ubuntu system is attached to an Ubuntu Pro subscription +(needed to use the real-time kernel), requiring you to enter a token +associated with the subscription. After successful attachment, your +system gains access to the Ubuntu Pro repositories, including the one +containing the real-time kernel packages. Once the tools and +real-time kernel are installed, a reboot is required to start using +the new kernel. + +### Create a configuration file to enable HugePages and CPU isolation + +The bootloader will need a configuration file to enable the recommended +boot options (explained below) to enable HugePages and CPU isolation. + +In this example, the host has 128 CPUs, and 2M / 1G HugePages are enabled. +This is the command to update the boot options and reboot the system: + +``` +cat < /etc/default/grub.d/epa_kernel_options.cfg +GRUB_CMDLINE_LINUX_DEFAULT="${GRUB_CMDLINE_LINUX_DEFAULT} intel_iommu=on iommu=pt usbcore.autosuspend=-1 selinux=0 enforcing=0 nmi_watchdog=0 crashkernel=auto softlockup_panic=0 audit=0 tsc=nowatchdog intel_pstate=disable mce=off hugepagesz=1G hugepages=1000 hugepagesz=2M hugepages=0 default_hugepagesz=1G kthread_cpus=0-31 irqaffinity=0-31 nohz=on nosoftlockup nohz_full=32-127 rcu_nocbs=32-127 rcu_nocb_poll skew_tick=1 isolcpus=managed_irq,32-127 console=tty0 console=ttyS0,115200n8" +EOF +sudo chmod 0644 /etc/netplan/99-sriov_vfs.yaml +update-grub +reboot +``` + +```{dropdown} Explanation of boot options + +- `intel_iommu=on`: Enables Intel's Input-Output Memory Management Unit (IOMMU), which is used for device virtualisation and Direct Memory Access (DMA) remapping. +- `iommu=pt`: Sets the IOMMU to passthrough mode, allowing devices to directly access physical memory without translation. +- `usbcore.autosuspend=-1`: Disables USB autosuspend, preventing USB devices from being automatically suspended to save power. +- `selinux=0`: Disables Security-Enhanced Linux (SELinux), a security module that provides mandatory access control. +- `enforcing=0`: If SELinux is enabled, this option sets it to permissive mode, where policies are not enforced but violations are logged. +- `nmi_watchdog=0`: Disables the Non-Maskable Interrupt (NMI) watchdog, which is used to detect and respond to system hangs. +- `crashkernel=auto`: Reserves a portion of memory for capturing a crash dump in the event of a kernel crash. +- `softlockup_panic=0`: Prevents the kernel from panicking (crashing) on detecting a soft lockup, where a CPU appears to be stuck. +- `audit=0`: Disables the kernel auditing system, which logs security-relevant events. +- `tsc=nowatchdog`: Disables the Time Stamp Counter (TSC) watchdog, which checks for issues with the TSC. +- `intel_pstate=disable`: Disables the Intel P-state driver, which controls CPU frequency scaling. +- `mce=off`: Disables Machine Check Exception (MCE) handling, which detects and reports hardware errors. +- `hugepagesz=1G hugepages=1000`: Allocates 1000 huge pages of 1GB each. +- `hugepagesz=2M hugepages=0`: Configures huge pages of 2MB size but sets their count to 0\. +- `default_hugepagesz=1G`: Sets the default size for huge pages to 1GB. +- `kthread_cpus=0-31`: Restricts kernel threads to run on CPUs 0-31. 
+- `irqaffinity=0-31`: Restricts interrupt handling to CPUs 0-31. +- `nohz=on`: Enables the nohz (no timer tick) mode, reducing timer interrupts on idle CPUs. +- `nosoftlockup`: Disables the detection of soft lockups. +- `nohz_full=32-127`: Enables nohz\_full (full tickless) mode on CPUs 32-127, reducing timer interrupts during application processing. +- `rcu_nocbs=32-127`: Offloads RCU (Read-Copy-Update) callbacks to CPUs 32-127, preventing them from running on these CPUs. +- `rcu_nocb_poll`: Enables polling for RCU callbacks instead of using interrupts. +- `skew_tick=1`: Skews the timer tick across CPUs to reduce contention. +- `isolcpus=managed_irq,32-127`: Isolates CPUs 32-127 and assigns managed IRQs to them, reducing their involvement in system processes and dedicating them to specific workloads. +- `console=tty0`: Sets the console output to the first virtual terminal. +- `console=ttyS0,115200n8`: Sets the console output to the serial port ttyS0 with a baud rate of 115200, 8 data bits, no parity, and 1 stop bit. +``` + +Once the reboot has taken place, ensure the HugePages configuration has been applied: + +``` +grep HugePages /proc/meminfo +``` + +This should generate output indicating the number of pages allocated + +``` +HugePages_Total: 1000 +HugePages_Free: 1000 +HugePages_Rsvd: 0 +HugePages_Surp: 0 +``` + + +Next, create a configuration file to configure the network interface +to use SR-IOV (so it can create virtual functions afterwards) using +Netplan. In the example below the file is created first, then the configuration is +applied, making 128 virtual functions available for use in the environment: + +``` +cat < /etc/netplan/99-sriov_vfs.yaml + network: + ethernets: + enp152s0f1: + virtual-function-count: 128 +EOF +sudo chmod 0600 /etc/netplan/99-sriov_vfs.yaml +sudo netplan apply +ip link show enp152s0f1 +``` + +The output of the last command should indicate the device is working and has generated the expected +virtual functions. + +``` +5: enp152s0f1: mtu 9000 qdisc mq state UP mode DEFAULT group default qlen 1000 + link/ether 40:a6:b7:96:d8:89 brd ff:ff:ff:ff:ff:ff + vf 0 link/ether ae:31:7f:91:09:97 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off + vf 1 link/ether 32:09:8b:f7:07:4b brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off + vf 2 link/ether 12:b9:c6:08:fc:36 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off + .......... + vf 125 link/ether 92:10:ff:8a:e5:0c brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off + vf 126 link/ether 66:fe:ad:f2:d3:05 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off + vf 127 link/ether ca:20:00:c6:83:dd brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off +``` + +```{dropdown} Explanation of steps + * Breakdown of the content of the file `/etc/netplan/99-sriov\_vfs.yaml` : + * path: `/etc/netplan/99-sriov\_vfs.yaml`: This specifies the location of the configuration file. The "99" prefix in the filename usually indicates that it will be processed last, potentially overriding other configurations. + * enp152s0f1: This is the name of the physical network interface you want to create VFs on. This name may vary depending on your system. + * virtual-function-count: 128: This is the key line that instructs Netplan to create 128 virtual functions on the specified physical interface. Each of these VFs can be assigned to a different virtual machine or container, effectively allowing them to share the physical adapter's bandwidth. 
+ * permissions: "0600": This is an optional line that sets the file permissions to 600 (read and write access only for the owner). + * Breakdown of the output of ip link show enp152s0f1 command: + * Main interface: + * 5: The index number of the network interface in the system. + * enp152s0f1: The name of the physical network interface. + * \: The interface's flags indicating its capabilities (e.g., broadcast, multicast) and current status (UP). + * mtu 9000: The maximum transmission unit (MTU) is set to 9000 bytes, larger than the typical 1500 bytes, likely for jumbo frames. + * qdisc mq: The queuing discipline (qdisc) is set to "mq" (multi-queue), designed for multi-core systems. + * state UP: The interface is currently active and operational. + * mode DEFAULT: The interface is in the default mode of operation. + * qlen 1000: The maximum number of packets allowed in the transmit queue. + * link/ether 40:a6:b7:96:d8:89: The interface's MAC address (a unique hardware identifier). + * Virtual functions: + * vf \: The index number of the virtual function. + * link/ether \: The MAC address assigned to the virtual function. + * spoof checking on: A security feature to prevent MAC address spoofing (pretending to be another device). + * link-state auto: The link state (up/down) is determined automatically based on the physical connection. + * trust off: The interface doesn't trust the incoming VLAN (Virtual LAN) tags. + * Results: + * Successful VF Creation: The output confirms a success creation of 128 VFs (numbered 0 through 127\) on the enp152s0f1 interface. + * VF Availability: Each VF is ready for use, and they can be assigned i.e. to {{product}} containers to give them direct access to the network through this physical network interface. + * MAC Addresses: Each VF has its own unique MAC address, which is essential for network communication. +``` + + +Now enable DPDK, first by cloning the DPDK repository, and then placing the script which +will bind the VFs to the VFIO-PCI driver in the location that will run +automatically each time the system boots up, so the VFIO +(Virtual Function I/O) bindings are applied consistently: + +``` +git clone https://github.com/DPDK/dpdk.git /home/ubuntu/dpdk +cat < /var/lib/cloud/scripts/per-boot/dpdk_bind.sh + #!/bin/bash + if [ -d /home/ubuntu/dpdk ]; then + modprobe vfio-pci + vfs=$(python3 /home/ubuntu/dpdk/usertools/dpdk-devbind.py -s | grep drv=iavf | awk '{print $1}' | tail -n +11) + python3 /home/ubuntu/dpdk/usertools/dpdk-devbind.py --bind=vfio-pci $vfs + fi +sudo chmod 0755 /var/lib/cloud/scripts/per-boot/dpdk_bind.sh +``` + +```{dropdown} Explanation + * Load VFIO Module (modprobe vfio-pci): If the DPDK directory exists, the script loads the VFIO-PCI kernel module. This module is necessary for the VFIO driver to function. + * The script uses the `dpdk-devbind.py` tool (included with DPDK) to list the available network devices and their drivers. + * It filters this output using grep drv=iavf to find devices that are currently using the iavf driver (a common driver for Intel network adapters), excluding the physical network interface itself and just focusing on the virtual functions (VFs). + * Bind VFs to VFIO: The script uses `dpdk-devbind.py` again, this time with the \--bind=vfio-pci option, to bind the identified VFs to the VFIO-PCI driver. This step essentially tells the kernel to relinquish control of these devices to DPDK. 
+``` + +To test that the VFIO Kernel Module and DPDK are enabled: + +``` +lsmod | grep -E 'vfio' +``` + +...should indicate the kernel module is loaded + +``` +vfio_pci 16384 0 +vfio_pci_core 94208 1 vfio_pci +vfio_iommu_type1 53248 0 +vfio 73728 3 vfio_pci_core,vfio_iommu_type1,vfio_pci +iommufd 98304 1 vfio +irqbypass 12288 2 vfio_pci_core,kvm + +``` + +Running the helper script: + +``` +python3 /home/ubuntu/dpdk/usertools/dpdk-devbind.py -s +``` + +...should return a list of network devices using DPDK: + +``` +Network devices using DPDK-compatible driver +============================================ +0000:98:12.2 'Ethernet Adaptive Virtual Function 1889' drv=vfio-pci unused=iavf +0000:98:12.3 'Ethernet Adaptive Virtual Function 1889' drv=vfio-pci unused=iavf +0000:98:12.4 'Ethernet Adaptive Virtual Function 1889' drv=vfio-pci unused=iavf +.... +``` + +With these preparation steps we have enabled the features of EPA: + +- NUMA and CPU Pinning are available to the first 32 CPUs +- Real-Time Kernel is enabled +- HugePages are enabled and 1000 1G huge pages are available +- SR-IOV is enabled in the enp152s0f1 interface, with 128 virtual + function interfaces bound to the vfio-pci driver (that could also use the iavf driver) +- DPDK is enabled in all the 128 virtual function interfaces + +```` + +````{group-tab} MAAS + +To prepare a machine for CPU isolation, HugePages, real-time kernel, +SR-IOV and DPDK we leverage cloud-init through MAAS. + +``` +#cloud-config + +apt: + sources: + rtk.list: + source: "deb https://:@private-ppa.launchpadcontent.net/canonical-kernel-rt/ppa/ubuntu jammy main" + +write_files: + # set kernel option with hugepages and cpu isolation + - path: /etc/default/grub.d/100-telco_kernel_options.cfg + content: | + GRUB_CMDLINE_LINUX_DEFAULT="${GRUB_CMDLINE_LINUX_DEFAULT} intel_iommu=on iommu=pt usbcore.autosuspend=-1 selinux=0 enforcing=0 nmi_watchdog=0 crashkernel=auto softlockup_panic=0 audit=0 tsc=nowatchdog intel_pstate=disable mce=off hugepagesz=1G hugepages=1000 hugepagesz=2M hugepages=0 default_hugepagesz=1G kthread_cpus=0-31 irqaffinity=0-31 nohz=on nosoftlockup nohz_full=32-127 rcu_nocbs=32-127 rcu_nocb_poll skew_tick=1 isolcpus=managed_irq,32-127 console=tty0 console=ttyS0,115200n8" + permissions: "0644" + + # create sriov VFs + - path: /etc/netplan/99-sriov_vfs.yaml + content: | + network: + ethernets: + enp152s0f1: + virtual-function-count: 128 + permissions: "0600" + + # ensure VFs are bound to vfio-pci driver (so they can be consumed by pods) + - path: /var/lib/cloud/scripts/per-boot/dpdk_bind.sh + content: | + #!/bin/bash + if [ -d /home/ubuntu/dpdk ]; then + modprobe vfio-pci + vfs=$(python3 /home/ubuntu/dpdk/usertools/dpdk-devbind.py -s | grep drv=iavf | awk '{print $1}' | tail -n +11) + python3 /home/ubuntu/dpdk/usertools/dpdk-devbind.py --bind=vfio-pci $vfs + fi + permissions: "0755" + + # set proxy variables + - path: /etc/environment + content: | + HTTPS_PROXY=http://10.18.2.1:3128 + HTTP_PROXY=http://10.18.2.1:3128 + NO_PROXY=10.0.0.0/8,192.168.0.0/16,127.0.0.1,172.16.0.0/16,.svc,localhost + https_proxy=http://10.18.2.1:3128 + http_proxy=http://10.18.2.1:3128 + no_proxy=10.0.0.0/8,192.168.0.0/16,127.0.0.1,172.16.0.0/16,.svc,localhost + append: true + + # add rtk ppa key + - path: /etc/apt/trusted.gpg.d/rtk.asc + content: | + -----BEGIN PGP PUBLIC KEY BLOCK----- + Comment: Hostname: + Version: Hockeypuck 2.2 + + xsFNBGAervwBEADHCeEuR7WKRiEII+uFOu8J+W47MZOcVhfNpu4rdcveL4qe4gj4 + nsROMHaINeUPCmv7/4EXdXtTm1VksXeh4xTeqH6ZaQre8YZ9Hf4OYNRcnFOn0KR+ + 
aCk0OWe9xkoDbrSYd3wmx8NG/Eau2C7URzYzYWwdHgZv6elUKk6RDbDh6XzIaChm + kLsErCP1SiYhKQvD3Q0qfXdRG908lycCxgejcJIdYxgxOYFFPcyC+kJy2OynnvQr + 4Yw6LJ2LhwsA7bJ5hhQDCYZ4foKCXX9I59G71dO1fFit5O/0/oq0xe7yUYCejf7Z + OqD+TzEK4lxLr1u8j8lXoQyUXzkKIL0SWEFT4tzOFpWQ2IBs/sT4X2oVA18dPDoZ + H2SGxCUcABfne5zrEDgkUkbnQRihBtTyR7QRiE3GpU19RNVs6yAu+wA/hti8Pg9O + U/5hqifQrhJXiuEoSmmgNb9QfbR3tc0ZhKevz4y+J3vcnka6qlrP1lAirOVm2HA7 + STGRnaEJcTama85MSIzJ6aCx4omCgUIfDmsi9nAZRkmeomERVlIAvcUYxtqprLfu + 6plDs+aeff/MAmHbak7yF+Txj8+8F4k6FcfNBT51oVSZuqFwyLswjGVzWol6aEY7 + akVIrn3OdN2u6VWlU4ZO5+sjP4QYsf5K2oVnzFVIpYvqtO2fGbxq/8dRJQARAQAB + zSVMYXVuY2hwYWQgUFBBIGZvciBDYW5vbmljYWwgS2VybmVsIFJUwsGOBBMBCgA4 + FiEEc4Tsv+pcopCX6lNfLz1Vl/FsjCEFAmAervwCGwMFCwkIBwIGFQoJCAsCBBYC + AwECHgECF4AACgkQLz1Vl/FsjCF9WhAAnwfx9njs1M3rfsMMuhvPxx0WS65HDlq8 + SRgl9K2EHtZIcS7lHmcjiTR5RD1w+4rlKZuE5J3EuMnNX1PdCYLSyMQed+7UAtX6 + TNyuiuVZVxuzJ5iS7L2ZoX05ASgyoh/Loipc+an6HzHqQnNC16ZdrBL4AkkGhDgP + ZbYjM3FbBQkL2T/08NcwTrKuVz8DIxgH7yPAOpBzm91n/pV248eK0a46sKauR2DB + zPKjcc180qmaVWyv9C60roSslvnkZsqe/jYyDFuSsRWqGgE5jNyIb8EY7K7KraPv + 3AkusgCh4fqlBxOvF6FJkiYeZZs5YXvGQ296HTfVhPLOqctSFX2kuUKGIq2Z+H/9 + qfJFGS1iaUsoDEUOaU27lQg5wsYa8EsCm9otroH2P3g7435JYRbeiwlwfHMS9EfK + dwD38d8UzZj7TnxGG4T1aLb3Lj5tNG6DSko69+zqHhuknjkRuAxRAZfHeuRbACgE + nIa7Chit8EGhC2GB12pr5XFWzTvNFdxFhbG+ed7EiGn/v0pVQc0ZfE73FXltg7et + bkoC26o5Ksk1wK2SEs/f8aDZFtG01Ys0ASFICDGW2tusFvDs6LpPUUggMjf41s7j + 4tKotEE1Hzr38EdY+8faRaAS9teQdH5yob5a5Bp5F5wgmpqZom/gjle4JBVaV5dI + N5rcnHzcvXw= + =asqr + -----END PGP PUBLIC KEY BLOCK----- + permissions: "0644" + +# install the snap +snap: + commands: + 00: 'snap install k8s --classic --channel=1.31/beta' + +runcmd: +# fetch dpdk driver binding script +- su ubuntu -c "git config --global http.proxy http://10.18.2.1:3128" +- su ubuntu -c "git clone https://github.com/DPDK/dpdk.git /home/ubuntu/dpdk" +- apt update +- DEBIAN_FRONTEND=noninteractive apt install -y linux-headers-6.8.1-1004-realtime linux-image-6.8.1-1004-realtime linux-modules-6.8.1-1004-realtime linux-modules-extra-6.8.1-1004-realtime + +# enable kernel options +- update-grub + +# reboot to activate realtime-kernel and kernel options +power_state: + mode: reboot +``` + +```{note} + +In the above file, the `realtime kernel` 6.8 is installed from a private PPA. +It was recently backported from 24.04 to 22.04 and is still going through +some validation stages. Once it is officially released, it will be +installable via the Ubuntu Pro CLI. +``` + + + +```` +````` + +## {{product}} setup + +{{product}} is delivered as a [snap][]. + +This section explains how to set up a dual node {{product}} cluster for testing +EPA capabilities. + +### Control plane and worker node + +1. [Install the snap][install-link] from the relevant [channel][channel]. + + ```{note} + A pre-release channel is required currently until there is a stable release of {{product}}. + ``` + + For example: + + + ```{include} ../../_parts/install.md + ``` + +2. Create a file called *configuration.yaml*. In this configuration file we let + the snap start with its default CNI (calico), with CoreDNS deployed and we + also point k8s to the external etcd. + + ```yaml + cluster-config: + network: + enabled: true + dns: + enabled: true + local-storage: + enabled: true + extra-node-kubelet-args: + --reserved-cpus: "0-31" + --cpu-manager-policy: "static" + --topology-manager-policy: "best-effort" + ``` + +3. Bootstrap {{product}} using the above configuration file. 
+ + ``` + sudo k8s bootstrap --file configuration.yaml + ``` + +#### Verify the control plane node is running + +After a few seconds you can query the API server with: + +``` +sudo k8s kubectl get all -A +``` + +### Add a second k8s node as a worker + +1. Install the k8s snap on the second node + + ```{include} ../../_parts/install.md + ``` + +2. On the control plane node generate a join token to be used for joining the + second node + + ``` + sudo k8s get-join-token --worker + ``` + +3. On the worker node create the configuration.yaml file + + ``` + extra-node-kubelet-args: + --reserved-cpus: "0-31" + --cpu-manager-policy: "static" + --topology-manager-policy: "best-effort" + ``` + +4. On the worker node use the token to join the cluster + + ``` + sudo k8s join-cluster --file configuration.yaml + ``` + + +#### Verify the two node cluster is ready + +After a few seconds the second worker node will register with the control +plane. You can query the available workers from the first node: + +``` +sudo k8s kubectl get nodes +``` + +The output should list the connected nodes: + +``` +NAME STATUS ROLES AGE VERSION +pc6b-rb4-n1 Ready control-plane,worker 22h v1.31.0 +pc6b-rb4-n3 Ready worker 22h v1.31.0 +``` + +### Multus and SR-IOV setup + +Apply the 'thick' Multus plugin (in case of resource scarcity we can consider +deploying the thin flavour) + +``` +sudo k8s kubectl apply -f https://raw.githubusercontent.com/k8snetworkplumbingwg/multus-cni/master/deployments/multus-daemonset-thick.yml +``` + +```{note} +The memory limits for the Multus pod spec in the DaemonSet should be +increased (i.e. to 500Mi instead 50Mi) to avoid OOM issues when deploying +multiple workload pods in parallel. +``` + +#### SR-IOV Network Device Plugin + +Create `sriov-dp.yaml` configMap: + +``` +cat <TAbort- SERR- + Kernel driver in use: vfio-pci + Kernel modules: iavf +``` + +Now, create a test pod that will claim a network interface from the DPDK +network: + +``` +cat < + +[MAAS]: https://maas.io +[channel]: ../explanation/channels/ +[install-link]: install/snap +[snap]: https://snapcraft.io/docs +[cyclictest]: https://github.com/jlelli/rt-tests +[explain-epa]: ../explanation/epa \ No newline at end of file diff --git a/docs/src/snap/howto/external-datastore.md b/docs/src/snap/howto/external-datastore.md index 5c4204432..bd583a3ca 100644 --- a/docs/src/snap/howto/external-datastore.md +++ b/docs/src/snap/howto/external-datastore.md @@ -37,7 +37,7 @@ datastore-client-key: | ``` -* `datastore-url` expects a comma seperated list of addresses +* `datastore-url` expects a comma separated list of addresses (e.g. `https://10.42.254.192:2379,https://10.42.254.193:2379,https://10.42.254.194:2379`) * `datastore-ca-crt` expects a certificate for the CA in PEM format diff --git a/docs/src/snap/howto/index.md b/docs/src/snap/howto/index.md index 1445e2a32..0df54a503 100644 --- a/docs/src/snap/howto/index.md +++ b/docs/src/snap/howto/index.md @@ -17,13 +17,15 @@ Overview install/index networking/index storage/index -external-datastore -proxy +Use an external datastore backup-restore refresh-certs restore-quorum +two-node-ha +Set up Enhanced Platform Awareness +cis-hardening contribute -support +Get support ``` --- diff --git a/docs/src/snap/howto/install/index.md b/docs/src/snap/howto/install/index.md index 6c7403acc..76e1169c7 100644 --- a/docs/src/snap/howto/install/index.md +++ b/docs/src/snap/howto/install/index.md @@ -12,8 +12,8 @@ the current How-to guides below. 
:glob: :titlesonly: -snap +Install from a snap multipass -lxd -offline +Install in LXD +Install in air-gapped environments ``` diff --git a/docs/src/snap/howto/install/lxd.md b/docs/src/snap/howto/install/lxd.md index e81a1c11b..60f8df590 100644 --- a/docs/src/snap/howto/install/lxd.md +++ b/docs/src/snap/howto/install/lxd.md @@ -109,7 +109,7 @@ port assigned by Kubernetes. In this example, we will use [Microbot] as it provides a simple HTTP endpoint to expose. These steps can be applied to any other deployment. -First, initialize the k8s cluster with +First, initialise the k8s cluster with ``` lxc exec k8s -- sudo k8s bootstrap @@ -239,4 +239,4 @@ need to access for example storage devices (See comment in [^5]). [default-bridged-networking]: https://ubuntu.com/blog/lxd-networking-lxdbr0-explained [Microbot]: https://github.com/dontrebootme/docker-microbot [AppArmor]: https://apparmor.net/ -[channels]: /snap/explanation/channels +[channels]: ../../explanation/channels diff --git a/docs/src/snap/howto/install/multipass.md b/docs/src/snap/howto/install/multipass.md index 7c15847c5..d008b9815 100644 --- a/docs/src/snap/howto/install/multipass.md +++ b/docs/src/snap/howto/install/multipass.md @@ -1,6 +1,6 @@ # Install with Multipass (Ubuntu/Mac/Windows) -**Multipass** is a simple way to run Ubuntu in a +[Multipass][]is a simple way to run Ubuntu in a virtual machine, no matter what your underlying OS. It is the recommended way to run {{product}} on Windows and macOS systems, and is equally useful for running multiple instances of the `k8s` snap on Ubuntu too. @@ -26,7 +26,7 @@ Multipass is shipped as a snap for Ubuntu and other OSes which support the Windows users should download and install the Multipass installer from the website. -The latest version is available here , +The [latest Windows version][] is available to download, though you may wish to visit the [Multipass website][] for more details. @@ -37,7 +37,7 @@ though you may wish to visit the [Multipass website][] for more details. Users running macOS should download and install the Multipass installer from the website. -The latest version is available here , +The [latest macOS version] is available to download, though you may wish to visit the [Multipass website][] for more details, including an alternate install method using `brew`. @@ -60,14 +60,14 @@ multipass launch 22.04 --name k8s-node --memory 4G --disk 20G --cpus 2 This command specifies: -- **22.04**: The Ubuntu image used as the basis for the instance -- **--name**: The name by which you will refer to the instance -- **--memory**: The memory to allocate -- **--disk**: The disk space to allocate -- **--cpus**: The number of CPU cores to reserve for this instance +- `22.04`: The Ubuntu image used as the basis for the instance +- `--name`: The name by which you will refer to the instance +- `--memory`: The memory to allocate +- `--disk`: The disk space to allocate +- `--cpus`: The number of CPU cores to reserve for this instance For more details of creating instances with Multipass, please see the -[Multipass documentation][multipass-options] about instance creation. +[Multipass documentation][Multipass-options] about instance creation. 
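To confirm the instance came up with the requested resources before going
further, Multipass can report its state and allocations (a quick check, using
the instance name from the example above):

```
multipass info k8s-node
```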
## Access the created instance @@ -111,8 +111,11 @@ multipass purge +[Multipass]:https://multipass.run/ [snap-support]: https://snapcraft.io/docs/installing-snapd -[multipass-options]: https://multipass.run/docs/get-started-with-multipass-linux#heading--create-a-customised-instance +[Multipass-options]: https://multipass.run/docs/get-started-with-multipass-linux#heading--create-a-customised-instance [install instructions]: ./snap [Getting started]: ../../tutorial/getting-started [Multipass website]: https://multipass.run/docs +[latest Window version]:https://multipass.run/download/windows +[latest macOS version]:https://multipass.run/download/macos diff --git a/docs/src/snap/howto/install/offline.md b/docs/src/snap/howto/install/offline.md index dc251347e..e522efa50 100644 --- a/docs/src/snap/howto/install/offline.md +++ b/docs/src/snap/howto/install/offline.md @@ -29,7 +29,7 @@ are necessary to verify the integrity of the packages. ```{note} Update the version of k8s by adjusting the channel parameter. For more information on channels visit the -[channels explanation](/snap/explanation/channels.md). +[channels explanation](../../explanation/channels.md). ``` ```{note} @@ -91,7 +91,7 @@ All workloads in a Kubernetes cluster are run as an OCI image. Kubernetes needs to be able to fetch these images and load them into the container runtime. For {{product}}, it is also necessary to fetch the images used -by its features (network, dns, etc.) as well as any images that are +by its features (network, DNS, etc.) as well as any images that are needed to run specific workloads. ```{note} @@ -120,12 +120,12 @@ ghcr.io/canonical/k8s-snap/sig-storage/csi-node-driver-registrar:v2.10.1 ghcr.io/canonical/k8s-snap/sig-storage/csi-provisioner:v5.0.1 ghcr.io/canonical/k8s-snap/sig-storage/csi-resizer:v1.11.1 ghcr.io/canonical/k8s-snap/sig-storage/csi-snapshotter:v8.0.1 -ghcr.io/canonical/metrics-server:0.7.0-ck1 +ghcr.io/canonical/metrics-server:0.7.0-ck2 ghcr.io/canonical/rawfile-localpv:0.8.0-ck4 ``` -A list of images can also be found in the `images.txt` file when unsquashing the -downloaded k8s snap. +A list of images can also be found in the `images.txt` file when the +downloaded k8s snap is unsquashed. Please ensure that the images used by workloads are tracked as well. @@ -299,11 +299,10 @@ After a while, confirm that all the cluster nodes show up in the output of the [Core20]: https://canonical.com/blog/ubuntu-core-20-secures-linux-for-iot -[svc-ports]: /snap/explanation/services-and-ports.md -[proxy]: /snap/howto/proxy.md +[proxy]: ../networking/proxy.md [sync-images-yaml]: https://github.com/canonical/k8s-snap/blob/main/build-scripts/hack/sync-images.yaml [regsync]: https://github.com/regclient/regclient/blob/main/docs/regsync.md [regctl]: https://github.com/regclient/regclient/blob/main/docs/regctl.md [regctl.sh]: https://github.com/canonical/k8s-snap/blob/main/src/k8s/tools/regctl.sh -[nodes]: /snap/tutorial/add-remove-nodes.md +[nodes]: ../../tutorial/add-remove-nodes.md [squid]: https://www.squid-cache.org/ diff --git a/docs/src/snap/howto/install/snap.md b/docs/src/snap/howto/install/snap.md index cc16e7cd6..b18afdf1c 100644 --- a/docs/src/snap/howto/install/snap.md +++ b/docs/src/snap/howto/install/snap.md @@ -80,4 +80,4 @@ ready state. 
[channels page]: ../../explanation/channels [snap]: https://snapcraft.io/docs [snapd support]: https://snapcraft.io/docs/installing-snapd -[bootstrap]: /snap/reference/bootstrap-config-reference \ No newline at end of file +[bootstrap]: ../../reference/bootstrap-config-reference \ No newline at end of file diff --git a/docs/src/snap/howto/networking/default-dns.md b/docs/src/snap/howto/networking/default-dns.md index 34722d3a1..6cc6e4aac 100644 --- a/docs/src/snap/howto/networking/default-dns.md +++ b/docs/src/snap/howto/networking/default-dns.md @@ -94,4 +94,4 @@ sudo k8s help disable -[getting-started-guide]: /snap/tutorial/getting-started +[getting-started-guide]: ../../tutorial/getting-started diff --git a/docs/src/snap/howto/networking/default-ingress.md b/docs/src/snap/howto/networking/default-ingress.md index 90498d910..e70d66157 100644 --- a/docs/src/snap/howto/networking/default-ingress.md +++ b/docs/src/snap/howto/networking/default-ingress.md @@ -55,7 +55,7 @@ You should see three options: ### TLS Secret You can create a TLS secret by following the official -[Kubernetes documentation][kubectl-create-secret-tls/]. +[Kubernetes documentation][kubectl-create-secret-TLS/]. Please remember to use `sudo k8s kubectl` (See the [kubectl-guide]). Tell Ingress to use your new Ingress certificate: @@ -105,7 +105,7 @@ sudo k8s help disable -[kubectl-create-secret-tls/]: https://kubernetes.io/docs/reference/kubectl/generated/kubectl_create/kubectl_create_secret_tls/ +[kubectl-create-secret-TLS/]: https://kubernetes.io/docs/reference/kubectl/generated/kubectl_create/kubectl_create_secret_tls/ [proxy-protocol]: https://kubernetes.io/docs/reference/networking/service-protocols/#protocol-proxy-special -[getting-started-guide]: /snap/tutorial/getting-started -[kubectl-guide]: /snap/tutorial/kubectl +[getting-started-guide]: ../../tutorial/getting-started +[kubectl-guide]: ../../tutorial/kubectl diff --git a/docs/src/snap/howto/networking/default-loadbalancer.md b/docs/src/snap/howto/networking/default-loadbalancer.md index 88a2a20fb..6552b87a0 100644 --- a/docs/src/snap/howto/networking/default-loadbalancer.md +++ b/docs/src/snap/howto/networking/default-loadbalancer.md @@ -9,7 +9,7 @@ explains how to configure and enable the load-balancer. This guide assumes the following: - You have root or sudo access to the machine. -- You have a bootstraped {{product}} cluster (see the [Getting +- You have a bootstrapped {{product}} cluster (see the [Getting Started][getting-started-guide] guide). ## Check the status and configuration @@ -28,13 +28,15 @@ To check the current configuration of the load-balancer, run the following: ``` sudo k8s get load-balancer ``` + This should output a list of values like this: -- `cidrs` - a list containing [cidr] or IP address range definitions of the +- `cidrs` - a list containing [CIDR] or IP address range definitions of the pool of IP addresses to use - `l2-mode` - whether L2 mode (failover) is turned on -- `l2-interfaces` - optional list of interfaces to announce services over (defaults to all) +- `l2-interfaces` - optional list of interfaces to announce services over + (defaults to all) - `bgp-mode` - whether BGP mode is active. - `bgp-local-asn` - the local Autonomous System Number (ASN) - `bgp-peer-address` - the peer address @@ -47,7 +49,8 @@ These values are configured using the `k8s set`command, e.g.: sudo k8s set load-balancer.l2-mode=true ``` -Note that for the BGP mode, it is necessary to set ***all*** the values simultaneously. E.g. 
+Note that for the BGP mode, it is necessary to set ***all*** the values +simultaneously. E.g. ``` sudo k8s set load-balancer.bgp-mode=true load-balancer.bgp-local-asn=64512 load-balancer.bgp-peer-address=10.0.10.55/32 load-balancer.bgp-peer-asn=64512 load-balancer.bgp-peer-port=7012 @@ -77,6 +80,5 @@ sudo k8s disable load-balancer - -[cidr]: https://en.wikipedia.org/wiki/Classless_Inter-Domain_Routing -[getting-started-guide]: /snap/tutorial/getting-started +[CIDR]: https://en.wikipedia.org/wiki/Classless_Inter-Domain_Routing +[getting-started-guide]: ../../tutorial/getting-started diff --git a/docs/src/snap/howto/networking/default-network.md b/docs/src/snap/howto/networking/default-network.md index bde0aad9e..d39537bc2 100644 --- a/docs/src/snap/howto/networking/default-network.md +++ b/docs/src/snap/howto/networking/default-network.md @@ -91,4 +91,4 @@ sudo k8s disable network --help -[getting-started-guide]: /snap/tutorial/getting-started +[getting-started-guide]: ../../tutorial/getting-started diff --git a/docs/src/snap/howto/networking/dualstack.md b/docs/src/snap/howto/networking/dualstack.md index a7114ce6f..efa1b0c19 100644 --- a/docs/src/snap/howto/networking/dualstack.md +++ b/docs/src/snap/howto/networking/dualstack.md @@ -6,13 +6,13 @@ both IPv4 and IPv6 addresses, allowing them to communicate over either protocol. This document will guide you through enabling dual-stack, including necessary configurations, known limitations, and common issues. -### Prerequisites +## Prerequisites Before enabling dual-stack, ensure that your environment supports IPv6, and that your network configuration (including any underlying infrastructure) is compatible with dual-stack operation. -### Enabling Dual-Stack +## Enabling Dual-Stack Dual-stack can be enabled by specifying both IPv4 and IPv6 CIDRs during the cluster bootstrap process. The key configuration parameters are: @@ -133,11 +133,18 @@ cluster bootstrap process. The key configuration parameters are: working. -### CIDR Size Limitations +## CIDR Size Limitations When setting up dual-stack networking, it is important to consider the limitations regarding CIDR size: -- **/64 is too large for the Service CIDR**: Using a `/64` CIDR for services -may cause issues like failure to initialize the IPv6 allocator. This is due +- **/108 is the maximum size for the Service CIDR** +Using a smaller value than `/108` for service CIDRs +may cause issues like failure to initialise the IPv6 allocator. This is due to the CIDR size being too large for Kubernetes to handle efficiently. + +See upstream reference: [kube-apiserver validation][kube-apiserver-test] + + + +[kube-apiserver-test]: https://github.com/kubernetes/kubernetes/blob/master/cmd/kube-apiserver/app/options/validation_test.go#L435 diff --git a/docs/src/snap/howto/networking/index.md b/docs/src/snap/howto/networking/index.md index 98d42bd55..d015577a2 100644 --- a/docs/src/snap/howto/networking/index.md +++ b/docs/src/snap/howto/networking/index.md @@ -11,9 +11,11 @@ how to configure and use key capabilities of {{product}}. 
```{toctree} :titlesonly: -/snap/howto/networking/default-dns.md -/snap/howto/networking/default-network.md -/snap/howto/networking/default-ingress.md -/snap/howto/networking/default-loadbalancer.md -/snap/howto/networking/dualstack.md +Use default DNS +Use default network +Use default Ingress +Use default load-balancer +Enable Dual-Stack networking +Set up an IPv6-only cluster +Configure proxy settings ``` diff --git a/docs/src/snap/howto/networking/ipv6.md b/docs/src/snap/howto/networking/ipv6.md new file mode 100644 index 000000000..65bd5cc99 --- /dev/null +++ b/docs/src/snap/howto/networking/ipv6.md @@ -0,0 +1,142 @@ +# How to set up an IPv6-Only Cluster + +An IPv6-only Kubernetes cluster operates exclusively using IPv6 addresses, +without support for IPv4. This configuration is ideal for environments that +are transitioning away from IPv4 or want to take full advantage of IPv6's +expanded address space. This document, explains how to set up +an IPv6-only cluster, including key configurations and necessary checks +to ensure proper setup. + +## Prerequisites + +Before setting up an IPv6-only cluster, ensure that: + +- Your environment supports IPv6. +- Network infrastructure, such as routers, firewalls, and DNS, are configured +to handle IPv6 traffic. +- Any underlying infrastructure (e.g. cloud providers, bare metal setups) +must be IPv6-compatible. + +## Setting Up an IPv6-Only Cluster + +The process of creating an IPv6-only cluster involves specifying only IPv6 +CIDRs for pods and services during the bootstrap process. Unlike dual-stack, +only IPv6 CIDRs are used. + +1. **Bootstrap Kubernetes with IPv6 CIDRs** + +Start by bootstrapping the Kubernetes cluster and providing only IPv6 +CIDRs for pods and services: + +```bash +sudo k8s bootstrap --timeout 10m --interactive +``` + +When prompted, set the pod and service CIDRs to IPv6 ranges. For example: + +``` +Please set the Pod CIDR: [fd01::/108] +Please set the Service CIDR: [fd98::/108] +``` + +Alternatively, these values can be configured in a bootstrap configuration file +named `bootstrap-config.yaml` in this example: + +```yaml +pod-cidr: fd01::/108 +service-cidr: fd98::/108 +``` + +Specify the configuration file during the bootstrapping process: + +```bash +sudo k8s bootstrap --file bootstrap-config.yaml +``` + +2. **Verify Pod and Service Creation** + +Once the cluster is up, verify that all pods are running: + +```sh +sudo k8s kubectl get pods -A +``` + +Deploy a pod with an nginx web-server and expose it via a service to verify +connectivity of the IPv6-only cluster. Create a manifest file +`nginx-ipv6.yaml` with the following content: + +```yaml +apiVersion: apps/v1 +kind: Deployment +metadata: + name: nginx-ipv6 +spec: + selector: + matchLabels: + run: nginx-ipv6 + replicas: 1 + template: + metadata: + labels: + run: nginx-ipv6 + spec: + containers: + - name: nginx-ipv6 + image: rocks.canonical.com/cdk/diverdane/nginxipv6:1.0.0 + ports: + - containerPort: 80 +--- +apiVersion: v1 +kind: Service +metadata: + name: nginx-ipv6 + labels: + run: nginx-ipv6 +spec: + type: NodePort + ipFamilies: + - IPv6 + ports: + - port: 80 + protocol: TCP + selector: + run: nginx-ipv6 +``` + +Deploy the web-server and its service by running: + +```sh +sudo k8s kubectl apply -f nginx-ipv6.yaml +``` + +3. 
**Verify IPv6 Connectivity** + +Retrieve the service details to confirm an IPv6 address is assigned: + +```sh +sudo k8s kubectl get service nginx-ipv6 -n default +``` + +Obtain the service’s IPv6 address from the output: + +``` +NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE +nginx-ipv6 NodePort fd98::7534 80:32248/TCP 2m +``` + +Use the assigned IPv6 address to test connectivity: + +```bash +curl http://[fd98::7534]/ +``` + +A welcome message from the nginx web-server is displayed when IPv6 +connectivity is set up correctly. + +## IPv6-Only Cluster Considerations + +**Service and Pod CIDR Sizing** + +Use `/108` as the maximum size for Service CIDRs. Larger ranges (e.g., `/64`) +may lead to allocation errors or Kubernetes failing to initialise the IPv6 +address allocator. diff --git a/docs/src/snap/howto/networking/proxy.md b/docs/src/snap/howto/networking/proxy.md new file mode 100644 index 000000000..d0f64414d --- /dev/null +++ b/docs/src/snap/howto/networking/proxy.md @@ -0,0 +1,57 @@ +# Configure proxy settings for K8s + +{{product}} packages a number of utilities (for example curl, helm) which need +to fetch resources they expect to find on the internet. In a constrained +network environment, such access is usually controlled through proxies. + +To set up a proxy using squid follow the +[How to install a Squid server][squid] tutorial. + +## Adding proxy configuration for the k8s snap + +If necessary, create the `snap.k8s.containerd.service.d` directory: + +```bash +sudo mkdir -p /etc/systemd/system/snap.k8s.containerd.service.d +``` + +```{note} It is important to add whatever address ranges are used by the + cluster itself to the `NO_PROXY` and `no_proxy` variables. +``` + +For example, assume we have a proxy running at `http://squid.internal:3128` and +we are using the networks `10.0.0.0/8`,`192.168.0.0/16` and `172.16.0.0/12`. +We would add the configuration to the +(`/etc/systemd/system/snap.k8s.containerd.service.d/http-proxy.conf`) file: + +```bash +# /etc/systemd/system/snap.k8s.containerd.service.d/http-proxy.conf +[Service] +Environment="HTTPS_PROXY=http://squid.internal:3128" +Environment="HTTP_PROXY=http://squid.internal:3128" +Environment="NO_PROXY=10.0.0.0/8,10.152.183.1,192.168.0.0/16,127.0.0.1,172.16.0.0/12" +Environment="https_proxy=http://squid.internal:3128" +Environment="http_proxy=http://squid.internal:3128" +Environment="no_proxy=10.0.0.0/8,10.152.183.1,192.168.0.0/16,127.0.0.1,172.16.0.0/12" +``` + +Note that you may need to restart for these settings to take effect. + + +```{note} Include the CIDR **10.152.183.0/24** in both the +`no_proxy` and `NO_PROXY` environment variables, as it's the default Kubernetes +service CIDR. If you are using a different service CIDR, update this setting +accordingly. This ensures pods can access the cluster's Kubernetes API Server. +Also, include the default pod range (**10.1.0.0/16**) and any local networks +needed. +``` + +## Adding proxy configuration for the k8s charms + +Proxy configuration is handled by Juju when deploying the `k8s` charms. Please +see the [documentation for adding proxy configuration via Juju][juju-proxy]. 
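Whichever method you use, it can be useful to confirm that containerd actually
picked up the proxy environment. A minimal sketch for the snap, assuming the
drop-in file created above:

```bash
# Reload systemd so the new drop-in is read, then restart the snap services
sudo systemctl daemon-reload
sudo snap restart k8s

# The proxy variables should now appear in the service environment
sudo systemctl show snap.k8s.containerd.service --property=Environment
```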
+ + + +[juju-proxy]: ../../../charm/howto/proxy +[squid]: https://ubuntu.com/server/docs/how-to-install-a-squid-server diff --git a/docs/src/snap/howto/restore-quorum.md b/docs/src/snap/howto/restore-quorum.md index aeb15b721..9050797c7 100755 --- a/docs/src/snap/howto/restore-quorum.md +++ b/docs/src/snap/howto/restore-quorum.md @@ -1,9 +1,9 @@ # Recovering a Cluster After Quorum Loss Highly available {{product}} clusters can survive losing one or more -nodes. [Dqlite], the default datastore, implements a [Raft] based protocol where -an elected leader holds the definitive copy of the database, which is then -replicated on two or more secondary nodes. +nodes. [Dqlite], the default datastore, implements a [Raft] based protocol +where an elected leader holds the definitive copy of the database, which is +then replicated on two or more secondary nodes. When the a majority of the nodes are lost, the cluster becomes unavailable. If at least one database node survived, the cluster can be recovered using the @@ -64,8 +64,8 @@ sudo snap stop k8s ## Recover the Database -Choose one of the remaining alive cluster nodes that has the most recent version -of the Raft log. +Choose one of the remaining alive cluster nodes that has the most recent +version of the Raft log. Update the ``cluster.yaml`` files, changing the role of the lost nodes to "spare" (2). Additionally, double check the addresses and IDs specified in @@ -73,7 +73,8 @@ Update the ``cluster.yaml`` files, changing the role of the lost nodes to files were moved across nodes. The following command guides us through the recovery process, prompting a text -editor with informative inline comments for each of the dqlite configuration files. +editor with informative inline comments for each of the dqlite configuration +files. ``` sudo /snap/k8s/current/bin/k8sd cluster-recover \ @@ -82,29 +83,40 @@ sudo /snap/k8s/current/bin/k8sd cluster-recover \ --log-level 0 ``` -Please adjust the log level for additional debug messages by increasing its value. -The command creates database backups before making any changes. +Please adjust the log level for additional debug messages by increasing its +value. The command creates database backups before making any changes. -The above command will reconfigure the Raft members and create recovery tarballs -that are used to restore the lost nodes, once the Dqlite configuration is updated. +The above command will reconfigure the Raft members and create recovery +tarballs that are used to restore the lost nodes, once the Dqlite +configuration is updated. ```{note} -By default, the command will recover both Dqlite databases. If one of the databases -needs to be skipped, use the ``--skip-k8sd`` or ``--skip-k8s-dqlite`` flags. -This can be useful when using an external Etcd database. +By default, the command will recover both Dqlite databases. If one of the +databases needs to be skipped, use the ``--skip-k8sd`` or ``--skip-k8s-dqlite`` +flags. This can be useful when using an external Etcd database. ``` -Once the "cluster-recover" command completes, restart the k8s services on the node: +```{note} +Non-interactive mode can be requested using the ``--non-interactive`` flag. +In this case, no interactive prompts or text editors will be displayed and +the command will assume that the configuration files have already been updated. + +This allows automating the recovery procedure. 
+``` + +Once the "cluster-recover" command completes, restart the k8s services on the +node: ``` sudo snap start k8s ``` -Ensure that the services started successfully by using ``sudo snap services k8s``. -Use ``k8s status --wait-ready`` to wait for the cluster to become ready. +Ensure that the services started successfully by using +``sudo snap services k8s``. Use ``k8s status --wait-ready`` to wait for the +cluster to become ready. -You may notice that we have not returned to an HA cluster yet: ``high availability: no``. -This is expected as we need to recover +You may notice that we have not returned to an HA cluster yet: +``high availability: no``. This is expected as we need to recover ## Recover the remaining nodes @@ -113,28 +125,34 @@ nodes. For k8sd, copy ``recovery_db.tar.gz`` to ``/var/snap/k8s/common/var/lib/k8sd/state/recovery_db.tar.gz``. When the k8sd -service starts, it will load the archive and perform the necessary recovery steps. +service starts, it will load the archive and perform the necessary recovery +steps. The k8s-dqlite archive needs to be extracted manually. First, create a backup of the current k8s-dqlite state directory: ``` -sudo mv /var/snap/k8s/common/var/lib/k8s-dqlite /var/snap/k8s/common/var/lib/k8s-dqlite.bkp +sudo mv /var/snap/k8s/common/var/lib/k8s-dqlite \ + /var/snap/k8s/common/var/lib/k8s-dqlite.bkp ``` Then, extract the backup archive: ``` sudo mkdir /var/snap/k8s/common/var/lib/k8s-dqlite -sudo tar xf recovery-k8s-dqlite-$timestamp-post-recovery.tar.gz -C /var/snap/k8s/common/var/lib/k8s-dqlite +sudo tar xf recovery-k8s-dqlite-$timestamp-post-recovery.tar.gz \ + -C /var/snap/k8s/common/var/lib/k8s-dqlite ``` -Node specific files need to be copied back to the k8s-dqlite state dir: +Node specific files need to be copied back to the k8s-dqlite state directory: ``` -sudo cp /var/snap/k8s/common/var/lib/k8s-dqlite.bkp/cluster.crt /var/snap/k8s/common/var/lib/k8s-dqlite -sudo cp /var/snap/k8s/common/var/lib/k8s-dqlite.bkp/cluster.key /var/snap/k8s/common/var/lib/k8s-dqlite -sudo cp /var/snap/k8s/common/var/lib/k8s-dqlite.bkp/info.yaml /var/snap/k8s/common/var/lib/k8s-dqlite +sudo cp /var/snap/k8s/common/var/lib/k8s-dqlite.bkp/cluster.crt \ + /var/snap/k8s/common/var/lib/k8s-dqlite +sudo cp /var/snap/k8s/common/var/lib/k8s-dqlite.bkp/cluster.key \ + /var/snap/k8s/common/var/lib/k8s-dqlite +sudo cp /var/snap/k8s/common/var/lib/k8s-dqlite.bkp/info.yaml \ + /var/snap/k8s/common/var/lib/k8s-dqlite ``` Once these steps are completed, restart the k8s services: @@ -143,13 +161,15 @@ Once these steps are completed, restart the k8s services: sudo snap start k8s ``` -Repeat these steps for all remaining nodes. Once a quorum is achieved, the cluster -will be reported as "highly available": +Repeat these steps for all remaining nodes. Once a quorum is achieved, +the cluster will be reported as "highly available": ``` $ sudo k8s status cluster status: ready -control plane nodes: 10.80.130.168:6400 (voter), 10.80.130.167:6400 (voter), 10.80.130.164:6400 (voter) +control plane nodes: 10.80.130.168:6400 (voter), + 10.80.130.167:6400 (voter), + 10.80.130.164:6400 (voter) high availability: yes datastore: k8s-dqlite network: enabled diff --git a/docs/src/snap/howto/storage/ceph.md b/docs/src/snap/howto/storage/ceph.md index b448cab9f..b379d2e5f 100644 --- a/docs/src/snap/howto/storage/ceph.md +++ b/docs/src/snap/howto/storage/ceph.md @@ -29,7 +29,7 @@ this demonstration will have less than 5 OSDs. 
(See [placement groups]) ceph osd pool create kubernetes 128 ``` -Initialize the pool as a Ceph block device pool. +Initialise the pool as a Ceph block device pool. ``` rbd pool init kubernetes @@ -48,8 +48,7 @@ capabilities to administer your Ceph cluster: ceph auth get-or-create client.kubernetes mon 'profile rbd' osd 'profile rbd pool=kubernetes' mgr 'profile rbd pool=kubernetes' ``` -For more information on user capabilities in Ceph, see -[https://docs.ceph.com/en/latest/rados/operations/user-management/#authorization-capabilities] +For more information on user capabilities in Ceph, see the [authorisation capabilities page][] ``` [client.kubernetes] @@ -60,7 +59,7 @@ Note the generated key, you will need it at a later step. ## Generate csi-config-map.yaml -First, get the fsid and the monitor addresses of your cluster. +First, get the `fsid` and the monitor addresses of your cluster. ``` sudo ceph mon dump @@ -79,7 +78,7 @@ election_strategy: 1 dumped monmap epoch 2 ``` -Keep note of the v1 IP (`10.0.0.136:6789`) and the fsid +Keep note of the v1 IP (`10.0.0.136:6789`) and the `fsid` (`6d5c12c9-6dfb-445a-940f-301aa7de0f29`) as you will need to refer to them soon. ``` @@ -131,11 +130,10 @@ Then apply: kubectl apply -f csi-kms-config-map.yaml ``` -If you do need to configure a KMS provider, an example ConfigMap is available in -the Ceph repository: -[https://github.com/ceph/ceph-csi/blob/devel/examples/kms/vault/kms-config.yaml] +If you do need to configure a KMS provider, an [example ConfigMap][] is available +in the Ceph repository. -Create the `ceph-config-map.yaml` which will be stored inside a ceph.conf file +Create the `ceph-config-map.yaml` which will be stored inside a `ceph.conf` file in the CSI containers. This `ceph.conf` file will be used by Ceph daemons on each container to authenticate with the Ceph cluster. @@ -188,7 +186,7 @@ Then apply: kubectl apply -f csi-rbd-secret.yaml ``` -## Create ceph-csi's custom Kubernetes objects +## Create ceph-csi custom Kubernetes objects Create the ServiceAccount and RBAC ClusterRole/ClusterRoleBinding objects: @@ -251,7 +249,7 @@ Then apply: kubectl apply -f csi-rbd-sc.yaml ``` -## Create a Persistant Volume Claim (PVC) for a RBD-backed file-system +## Create a Persistent Volume Claim (PVC) for a RBD-backed file-system This PVC will allow users to request RBD-backed storage. @@ -279,7 +277,7 @@ Then apply: kubectl apply -f pvc.yaml ``` -## Create a pod that binds to the Rados Block Device PVC +## Create a pod that binds to the RADOS Block Device PVC Finally, create a pod configuration that uses the RBD-backed PVC. @@ -313,7 +311,7 @@ kubectl apply -f pod.yaml ## Verify that the pod is using the RBD PV -To verify that the csi-rbd-demo-pod is indeed using a RBD Persistant Volume, run +To verify that the `csi-rbd-demo-pod` is indeed using a RBD Persistent Volume, run the following commands, you should see information related to attached volumes in both of their outputs: @@ -331,7 +329,9 @@ Ceph documentation: [Intro to Ceph]. 
[Ceph]: https://ceph.com/ -[getting-started-guide]: ../tutorial/getting-started.md +[getting-started-guide]: ../../tutorial/getting-started.md [block-devices-and-kubernetes]: https://docs.ceph.com/en/latest/rbd/rbd-kubernetes/ [placement groups]: https://docs.ceph.com/en/mimic/rados/operations/placement-groups/ [Intro to Ceph]: https://docs.ceph.com/en/latest/start/intro/ +[authorisation capabilities page]: https://docs.ceph.com/en/latest/rados/operations/user-management/#authorization-capabilities +[example ConfigMap]: https://github.com/ceph/ceph-csi/blob/devel/examples/kms/vault/kms-config.yaml diff --git a/docs/src/snap/howto/storage/cloud.md b/docs/src/snap/howto/storage/cloud.md new file mode 100644 index 000000000..920b297dc --- /dev/null +++ b/docs/src/snap/howto/storage/cloud.md @@ -0,0 +1,496 @@ +# How to use cloud storage
+
+{{product}} simplifies the process of integrating and managing cloud storage
+solutions like Amazon EBS. This guide provides steps to configure IAM policies,
+deploy the cloud controller manager, and set up the necessary drivers for you
+to take advantage of cloud storage solutions in the context of Kubernetes.
+
+## What you'll need
+
+This guide is for AWS and assumes the following:
+
+- You have root or sudo access to an Amazon EC2 instance
+- You can create roles and policies in AWS
+
+
+## Set IAM Policies
+
+Your instance will need a few IAM policies to be able to communicate with the
+AWS APIs. The policies provided here are quite open and should be scoped down
+based on your security requirements.
+
+You will most likely want to create a role for your instance. You can call this
+role "k8s-control-plane" or "k8s-worker". Then, define and attach the following
+policies to the role. Once the role is created with the required policies,
+attach the role to the instance.
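+
+If you manage AWS from the command line, the role can also be created and
+attached with the AWS CLI. The following is only a sketch: it assumes the role
+name "k8s-control-plane", an EC2 trust policy saved as
+`ec2-trust-policy.json`, the control plane policy below saved as
+`control-plane-policy.json`, and a placeholder instance ID. Adjust these to
+your environment.
+
+```bash
+# Create the role (the trust policy must allow ec2.amazonaws.com to assume it)
+aws iam create-role --role-name k8s-control-plane \
+    --assume-role-policy-document file://ec2-trust-policy.json
+
+# Attach the policy document shown below as an inline policy
+aws iam put-role-policy --role-name k8s-control-plane \
+    --policy-name k8s-control-plane-policy \
+    --policy-document file://control-plane-policy.json
+
+# Wrap the role in an instance profile and associate it with the instance
+aws iam create-instance-profile --instance-profile-name k8s-control-plane
+aws iam add-role-to-instance-profile \
+    --instance-profile-name k8s-control-plane --role-name k8s-control-plane
+aws ec2 associate-iam-instance-profile \
+    --instance-id i-0123456789abcdef0 \
+    --iam-instance-profile Name=k8s-control-plane
+```
+
+The same steps apply to a worker node, using the worker policy document and a
+role name such as "k8s-worker".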
+ +For a control plane node: + +```{dropdown} Control Plane Policies +```json +{ + "Version": "2012-10-17", + "Statement": [ + { + "Effect": "Allow", + "Action": [ + "autoscaling:DescribeAutoScalingGroups", + "autoscaling:DescribeLaunchConfigurations", + "autoscaling:DescribeTags", + "ec2:DescribeInstances", + "ec2:DescribeRegions", + "ec2:DescribeRouteTables", + "ec2:DescribeSecurityGroups", + "ec2:DescribeSubnets", + "ec2:DescribeVolumes", + "ec2:DescribeAvailabilityZones", + "ec2:CreateSecurityGroup", + "ec2:CreateTags", + "ec2:CreateVolume", + "ec2:ModifyInstanceAttribute", + "ec2:ModifyVolume", + "ec2:AttachVolume", + "ec2:AuthorizeSecurityGroupIngress", + "ec2:CreateRoute", + "ec2:DeleteRoute", + "ec2:DeleteSecurityGroup", + "ec2:DeleteVolume", + "ec2:DetachVolume", + "ec2:RevokeSecurityGroupIngress", + "ec2:DescribeVpcs", + "ec2:DescribeInstanceTopology", + "elasticloadbalancing:AddTags", + "elasticloadbalancing:AttachLoadBalancerToSubnets", + "elasticloadbalancing:ApplySecurityGroupsToLoadBalancer", + "elasticloadbalancing:CreateLoadBalancer", + "elasticloadbalancing:CreateLoadBalancerPolicy", + "elasticloadbalancing:CreateLoadBalancerListeners", + "elasticloadbalancing:ConfigureHealthCheck", + "elasticloadbalancing:DeleteLoadBalancer", + "elasticloadbalancing:DeleteLoadBalancerListeners", + "elasticloadbalancing:DescribeLoadBalancers", + "elasticloadbalancing:DescribeLoadBalancerAttributes", + "elasticloadbalancing:DetachLoadBalancerFromSubnets", + "elasticloadbalancing:DeregisterInstancesFromLoadBalancer", + "elasticloadbalancing:ModifyLoadBalancerAttributes", + "elasticloadbalancing:RegisterInstancesWithLoadBalancer", + "elasticloadbalancing:SetLoadBalancerPoliciesForBackendServer", + "elasticloadbalancing:AddTags", + "elasticloadbalancing:CreateListener", + "elasticloadbalancing:CreateTargetGroup", + "elasticloadbalancing:DeleteListener", + "elasticloadbalancing:DeleteTargetGroup", + "elasticloadbalancing:DescribeListeners", + "elasticloadbalancing:DescribeLoadBalancerPolicies", + "elasticloadbalancing:DescribeTargetGroups", + "elasticloadbalancing:DescribeTargetHealth", + "elasticloadbalancing:ModifyListener", + "elasticloadbalancing:ModifyTargetGroup", + "elasticloadbalancing:RegisterTargets", + "elasticloadbalancing:DeregisterTargets", + "elasticloadbalancing:SetLoadBalancerPoliciesOfListener", + "iam:CreateServiceLinkedRole", + "kms:DescribeKey" + ], + "Resource": [ + "*" + ] + } + ] +} +``` + +For a worker node: + +```{dropdown} Worker Policies +```json +{ + "Version": "2012-10-17", + "Statement": [ + { + "Effect": "Allow", + "Action": [ + "ec2:DescribeInstances", + "ec2:DescribeRegions", + "ecr:GetAuthorizationToken", + "ecr:BatchCheckLayerAvailability", + "ecr:GetDownloadUrlForLayer", + "ecr:GetRepositoryPolicy", + "ecr:DescribeRepositories", + "ecr:ListImages", + "ecr:BatchGetImage" + ], + "Resource": "*" + } + ] +} +``` + +## Add a tag to your EC2 Instance + +A cluster using the AWS cloud provider needs to label existing nodes and +resources with a ClusterID or the kube-controller-manager will not start. Add +the following tag to your instance, making sure to replace the placeholder id +with your own (this can simply be "k8s" or "my-k8s-cluster"). + +``` +kubernetes.io/cluster/=owned +``` + +## Set your host name + +The cloud controller manager uses the node name to correctly associate the node +with an EC2 instance. In {{product}}, the node name is derived from the +hostname of the machine. 
Therefore, before bootstrapping the cluster, you must +first set an appropriate host name. + +```bash +echo "$(sudo cloud-init query ds.meta_data.local-hostname)" | sudo tee /etc/hostname +``` + +Then, reboot the machine. + +```bash +sudo reboot +``` + +When the machine is up, use `hostname -f` to check the host name. It should +look like: + +```bash +ip-172-31-11-86.us-east-2.compute.internal +``` + +This host name format is called IP-based naming and is specific to AWS. + + +## Bootstrap {{product}} + +Now that your machine has an appropriate host name, you are ready to bootstrap +{{product}}. + +First, create a bootstrap configuration file that sets the cloud-provider +configuration to "external". + +```bash +echo "cluster-config: + cloud-provider: external" > bootstrap-config.yaml +``` + +Then, bootstrap the cluster: + +```bash +sudo k8s bootstrap --file ./bootstrap-config.yaml +sudo k8s status --wait-ready +``` + +## Deploy the cloud controller manager + +Now that you have an appropriate host name, policies, and a {{product}} +cluster, you have everything you need to deploy the cloud controller manager. + +Here is a YAML definition file that sets appropriate defaults for you, it +configures the necessary service accounts, roles, and daemonsets: + +```{dropdown} CCM deployment manifest +```yaml +--- +apiVersion: apps/v1 +kind: DaemonSet +metadata: + name: aws-cloud-controller-manager + namespace: kube-system + labels: + k8s-app: aws-cloud-controller-manager +spec: + selector: + matchLabels: + k8s-app: aws-cloud-controller-manager + updateStrategy: + type: RollingUpdate + template: + metadata: + labels: + k8s-app: aws-cloud-controller-manager + spec: + nodeSelector: + node-role.kubernetes.io/control-plane: "" + tolerations: + - key: node.cloudprovider.kubernetes.io/uninitialized + value: "true" + effect: NoSchedule + - effect: NoSchedule + key: node-role.kubernetes.io/control-plane + affinity: + nodeAffinity: + requiredDuringSchedulingIgnoredDuringExecution: + nodeSelectorTerms: + - matchExpressions: + - key: node-role.kubernetes.io/control-plane + operator: Exists + serviceAccountName: cloud-controller-manager + containers: + - name: aws-cloud-controller-manager + image: registry.k8s.io/provider-aws/cloud-controller-manager:v1.28.3 + args: + - --v=2 + - --cloud-provider=aws + - --use-service-account-credentials=true + - --configure-cloud-routes=false + resources: + requests: + cpu: 200m + hostNetwork: true +--- +apiVersion: v1 +kind: ServiceAccount +metadata: + name: cloud-controller-manager + namespace: kube-system +--- +apiVersion: rbac.authorization.k8s.io/v1 +kind: RoleBinding +metadata: + name: cloud-controller-manager:apiserver-authentication-reader + namespace: kube-system +roleRef: + apiGroup: rbac.authorization.k8s.io + kind: Role + name: extension-apiserver-authentication-reader +subjects: + - apiGroup: "" + kind: ServiceAccount + name: cloud-controller-manager + namespace: kube-system +--- +apiVersion: rbac.authorization.k8s.io/v1 +kind: ClusterRole +metadata: + name: system:cloud-controller-manager +rules: +- apiGroups: + - "" + resources: + - events + verbs: + - create + - patch + - update +- apiGroups: + - "" + resources: + - nodes + verbs: + - '*' +- apiGroups: + - "" + resources: + - nodes/status + verbs: + - patch +- apiGroups: + - "" + resources: + - services + verbs: + - list + - patch + - update + - watch +- apiGroups: + - "" + resources: + - services/status + verbs: + - list + - patch + - update + - watch +- apiGroups: + - "" + resources: + - serviceaccounts + 
verbs: + - create + - get + - list + - watch +- apiGroups: + - "" + resources: + - persistentvolumes + verbs: + - get + - list + - update + - watch +- apiGroups: + - "" + resources: + - endpoints + verbs: + - create + - get + - list + - watch + - update +- apiGroups: + - coordination.k8s.io + resources: + - leases + verbs: + - create + - get + - list + - watch + - update +- apiGroups: + - "" + resources: + - serviceaccounts/token + verbs: + - create +--- +kind: ClusterRoleBinding +apiVersion: rbac.authorization.k8s.io/v1 +metadata: + name: system:cloud-controller-manager +roleRef: + apiGroup: rbac.authorization.k8s.io + kind: ClusterRole + name: system:cloud-controller-manager +subjects: + - apiGroup: "" + kind: ServiceAccount + name: cloud-controller-manager + namespace: kube-system +``` + +You can apply the CCM manifest easily by running the following command: + +```bash +sudo k8s kubectl apply -f https://raw.githubusercontent.com/canonical/k8s-snap/main/docs/src/assets/how-to-cloud-storage-aws-ccm.yaml +``` + +After a moment, you should see the cloud controller manager pod was +successfully deployed. + +```bash +NAME READY STATUS RESTARTS AGE +aws-cloud-controller-manager-ndbtq 1/1 Running 1 (3h51m ago) 9h +``` + +## Deploy the EBS CSI Driver + +Now that the cloud controller manager is deployed and can communicate with AWS, +you are ready to deploy the EBS CSI driver. The easiest way to deploy the +driver is with the Helm chart. Luckily, {{product}} has a built-in Helm +command. + +If you want to create encrypted drives, you need to add the statement to the +policy you are using for the instance. + +```json +{ + "Effect": "Allow", + "Action": [ + "kms:Decrypt", + "kms:GenerateDataKeyWithoutPlaintext", + "kms:CreateGrant" + ], + "Resource": "*" +} +``` + +Then, add the Helm repo for the EBS CSI Driver. + +```bash +sudo k8s helm repo add aws-ebs-csi-driver https://kubernetes-sigs.github.io/aws-ebs-csi-driver +sudo k8s helm repo update +``` + +Finally, install the Helm chart, making sure to set the correct region as an +argument. + +```bash +sudo k8s helm upgrade --install aws-ebs-csi-driver \ + --namespace kube-system \ + aws-ebs-csi-driver/aws-ebs-csi-driver \ + --set controller.region= +``` + +Once the command completes, you can verify the pods are successfully deployed: + +```bash +kubectl get pods -n kube-system -l app.kubernetes.io/name=aws-ebs-csi-driver +``` + +```bash +NAME READY STATUS RESTARTS AGE +ebs-csi-controller-78bcd46cf8-5zk8q 5/5 Running 2 (3h48m ago) 8h +ebs-csi-controller-78bcd46cf8-g7l5h 5/5 Running 1 (3h48m ago) 8h +ebs-csi-node-nx6rg 3/3 Running 0 9h +``` + +The status of all pods should be "Running". + +## Deploy a workload + +Everything is in place for you to deploy a workload that dynamically creates +and uses an EBS volume. + +First, create a StorageClass and a PersistentVolumeClaim: + +``` +sudo k8s kubectl apply -f - < Volumes` page in AWS, you should see a 10Gi gp3 volume. diff --git a/docs/src/snap/howto/storage/index.md b/docs/src/snap/howto/storage/index.md index 4310beac7..f79732d4d 100644 --- a/docs/src/snap/howto/storage/index.md +++ b/docs/src/snap/howto/storage/index.md @@ -12,6 +12,7 @@ default storage built-in to {{product}}. 
```{toctree} :titlesonly: -storage -ceph -``` \ No newline at end of file +Use default storage +Use Ceph storage +Use cloud storage +``` diff --git a/docs/src/snap/howto/storage/storage.md b/docs/src/snap/howto/storage/storage.md index dbba33631..3b7d0a261 100644 --- a/docs/src/snap/howto/storage/storage.md +++ b/docs/src/snap/howto/storage/storage.md @@ -62,4 +62,4 @@ Disabling storage only removes the CSI driver. The persistent volume claims will still be available and your data will remain on disk. -[getting-started-guide]: ../tutorial/getting-started.md +[getting-started-guide]: ../../tutorial/getting-started.md diff --git a/docs/src/snap/howto/two-node-ha.md b/docs/src/snap/howto/two-node-ha.md new file mode 100644 index 000000000..e7b1cdc99 --- /dev/null +++ b/docs/src/snap/howto/two-node-ha.md @@ -0,0 +1,430 @@ +# Two-Node High-Availability with Dqlite + +High availability (HA) is a mandatory requirement for most production-grade +Kubernetes deployments, usually implying three or more nodes. + +Two-node HA clusters are sometimes preferred for cost savings and operational +efficiency considerations. Follow this guide to learn how Canonical Kubernetes +can achieve high availability with just two nodes while using the default +datastore, [Dqlite]. Both nodes will be active members of the cluster, sharing +the Kubernetes load. + +Dqlite cannot achieve a [Raft] quorum with fewer than three nodes. This means +that Dqlite will not be able to replicate data and the secondaries will simply +forward the queries to the primary node. + +In the event of a node failure, database recovery will require following the +steps in the [Dqlite recovery guide]. + +## Proposed solution + +Since Dqlite data replication is not available in this situation, we propose +using synchronous block level replication through +[Distributed Replicated Block Device] (DRBD). + +The cluster monitoring and failover process will be handled by [Pacemaker] and +[Corosync]. After a node failure, the DRBD volume will be mounted on the +standby node, allowing access to the latest Dqlite database version. + +Additional recovery steps are automated and invoked through Pacemaker. + +### Prerequisites: + +* Please ensure that both nodes are part of the Kubernetes cluster. + See the [getting started] and [add/remove nodes] guides. +* The user associated with the HA service has SSH access to the peer node and + passwordless sudo configured. For simplicity, the default "ubuntu" user can + be used. +* We recommend using static IP configuration. + +The [two-node-ha.sh script] automates most operations related to the two-node +HA scenario and is included in the snap. + +The first step is to install the required packages: + +``` +/snap/k8s/current/k8s/hack/two-node-ha.sh install_packages +``` + +### Distributed Replicated Block Device (DRBD) + +This example uses a loopback device as DRBD backing storage: + +``` +sudo dd if=/dev/zero of=/opt/drbd0-backstore bs=1M count=2000 +``` + +Ensure that the loopback device is attached at boot time, before Pacemaker +starts. + +``` +cat < +HATWO_ADDR= + +cat < +HATWO_ADDR= + +sudo mv /etc/corosync/corosync.conf /etc/corosync/corosync.conf.orig + +cat < +HATWO_ADDR= +DRBD_MOUNT_DIR=${DRBD_MOUNT_DIR:-"/mnt/drbd0"} + +sudo crm configure < + +# remove the node constraint. +sudo crm resource clear fs_res +``` + +### Managing Kubernetes Snap Services + +For the two-node HA setup, k8s snap services should no longer start +automatically. Instead, they will be managed by a wrapper service. 
+ +``` +for f in `sudo snap services k8s | awk 'NR>1 {print $1}'`; do + echo "disabling snap.$f" + sudo systemctl disable "snap.$f"; +done +``` + +### Preparing the wrapper service + +The next step is to define the wrapper service. Add the following to +``/etc/systemd/system/two-node-ha-k8s.service``. + +```{note} +the sample uses the ``ubuntu`` user, feel free to use a different one as long as the prerequisites +are met. +``` + +``` +[Unit] +Description=K8s service wrapper handling Dqlite recovery for two-node HA setups. +After=network.target pacemaker.service + +[Service] +User=ubuntu +Group=ubuntu +Type=oneshot +ExecStart=/bin/bash /snap/k8s/current/k8s/hack/two-node-ha.sh start_service +ExecStop=/bin/bash sudo snap stop k8s +RemainAfterExit=true + +[Install] +WantedBy=multi-user.target +``` + +```{note} +The ``two-node-ha.sh start_service`` command used by the service wrapper +automatically detects the expected Dqlite role based on the DRBD state. +It then takes the necessary steps to bootstrap the Dqlite state directories, +synchronise with the peer node (if available) and recover the database. +``` + +When a DRBD failover occurs, the ``two-node-ha-k8s`` service needs to be +restarted. To accomplish this, we are going to define a separate service that +will be invoked by Pacemaker. Create a file called +``/etc/systemd/system/two-node-ha-k8s-failover.service`` containing the +following: + +``` +[Unit] +Description=Managed by Pacemaker, restarts two-node-ha-k8s on failover. +After=network.target home-ubuntu-workspace.mount + +[Service] +Type=oneshot +ExecStart=systemctl restart two-node-ha-k8s +RemainAfterExit=true +``` + +Reload the systemd configuration and set ``two-node-ha-k8s`` to start +automatically. Notice that ``two-node-ha-k8s-failover`` must not be configured +to start automatically, but instead is going to be managed through Pacemaker. + +``` +sudo systemctl enable two-node-ha-k8s +sudo systemctl daemon-reload +``` + +Make sure that both nodes have been configured using the above steps before +moving forward. + +### Automating the failover procedure + +Define a new Pacemaker resource that will invoke the +``two-node-ha-k8s-failover`` service when a DRBD failover occurs. 
+ +``` +sudo crm configure < +[Dqlite]: https://dqlite.io/ +[Raft]: https://raft.github.io/ +[Distributed Replicated Block Device]: https://ubuntu.com/server/docs/distributed-replicated-block-device-drbd +[Dqlite recovery guide]: restore-quorum +[external datastore guide]: external-datastore +[two-node-ha.sh script]: https://github.com/canonical/k8s-snap/blob/main/k8s/hack/two-node-ha.sh +[getting started]: ../tutorial/getting-started +[add/remove nodes]: ../tutorial/add-remove-nodes +[Pacemaker]: https://clusterlabs.org/pacemaker/ +[Corosync]: https://clusterlabs.org/corosync.html +[Pacemaker fencing]: https://clusterlabs.org/pacemaker/doc/2.1/Pacemaker_Explained/html/fencing.html +[split brain]: https://en.wikipedia.org/wiki/Split-brain_(computing) diff --git a/docs/src/snap/index.md b/docs/src/snap/index.md index 7022a799f..d4b1e4f92 100644 --- a/docs/src/snap/index.md +++ b/docs/src/snap/index.md @@ -1,5 +1,20 @@ # {{product}} snap documentation +```{toctree} +:hidden: +Overview +``` + +```{toctree} +:hidden: +:titlesonly: +:maxdepth: 6 +tutorial/index.md +howto/index.md +explanation/index.md +reference/index.md +``` + The {{product}} snap is a performant, lightweight, secure and opinionated distribution of **Kubernetes** which includes everything needed to create and manage a scalable cluster suitable for all use cases. @@ -70,4 +85,4 @@ and constructive feedback. [roadmap]: ./reference/roadmap [overview page]: ./explanation/about [architecture documentation]: ./reference/architecture -[Juju charm]: /charm/index +[Juju charm]: ../charm/index diff --git a/docs/src/snap/reference/annotations.md b/docs/src/snap/reference/annotations.md index b5e4404d8..0868cda20 100644 --- a/docs/src/snap/reference/annotations.md +++ b/docs/src/snap/reference/annotations.md @@ -6,8 +6,31 @@ the bootstrap configuration. | Name | Description | Values | |---------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------------| -| `k8sd/v1alpha/lifecycle/skip-cleanup-kubernetes-node-on-remove` | If set, only microcluster and file cleanup are performed. This is helpful when an external controller (e.g., CAPI) manages the Kubernetes node lifecycle. By default, k8sd will remove the Kubernetes node when it is removed from the cluster. | "true"\|"false" | +| `k8sd/v1alpha/lifecycle/skip-cleanup-kubernetes-node-on-remove` | If set, only MicroCluster and file cleanup are performed. This is helpful when an external controller (e.g., CAPI) manages the Kubernetes node lifecycle. By default, k8sd will remove the Kubernetes node when it is removed from the cluster. | "true"\|"false" | +| `k8sd/v1alpha/lifecycle/skip-stop-services-on-remove` | If set, the k8s services will not be stopped on the leaving node when removing the node. This is helpful when an external controller (e.g., CAPI) manages the Kubernetes node lifecycle. By default, all services are stopped on leaving nodes. | "true"\|"false" | +| `k8sd/v1alpha1/csrsigning/auto-approve` | If set, certificate signing requests created by worker nodes are auto approved. | "true"\|"false" | +| `k8sd/v1alpha1/calico/apiserver-enabled` | Enable the installation of the Calico API server to enable management of Calico APIs using kubectl. 
| "true"\|"false" | +| `k8sd/v1alpha1/calico/encapsulation-v4` | The type of encapsulation to use on the IPv4 pool. | "IPIP"\|"VXLAN"\|"IPIPCrossSubnet"\|"VXLANCrossSubnet"\|"None" | +| `k8sd/v1alpha1/calico/encapsulation-v6` | The type of encapsulation to use on the IPv6 pool. | "IPIP"\|"VXLAN"\|"IPIPCrossSubnet"\|"VXLANCrossSubnet"\|"None" | +| `k8sd/v1alpha1/calico/autodetection-v4/firstFound` | Use default interface matching parameters to select an interface, performing best-effort filtering based on well-known interface names. | "true"\|"false" | +| `k8sd/v1alpha1/calico/autodetection-v4/kubernetes` | Configure Calico to detect node addresses based on the Kubernetes API. | "NodeInternalIP" | +| `k8sd/v1alpha1/calico/autodetection-v4/interface` | Enable IP auto-detection based on interfaces that match the given regex. | string | +| `k8sd/v1alpha1/calico/autodetection-v4/skipInterface` | Enable IP auto-detection based on interfaces that do not match the given regex. | string | +| `k8sd/v1alpha1/calico/autodetection-v4/canReach` | Enable IP auto-detection based on which source address on the node is used to reach the specified IP or domain. | string | +| `k8sd/v1alpha1/calico/autodetection-v4/cidrs` | Enable IP auto-detection based on which addresses on the nodes are within one of the provided CIDRs. | []string (comma separated) | +| `k8sd/v1alpha1/calico/autodetection-v6/firstFound` | Use default interface matching parameters to select an interface, performing best-effort filtering based on well-known interface names. | "true"\|"false" | +| `k8sd/v1alpha1/calico/autodetection-v6/kubernetes` | Configure Calico to detect node addresses based on the Kubernetes API. | "NodeInternalIP" | +| `k8sd/v1alpha1/calico/autodetection-v6/interface` | Enable IP auto-detection based on interfaces that match the given regex. | string | +| `k8sd/v1alpha1/calico/autodetection-v6/skipInterface` | Enable IP auto-detection based on interfaces that do not match the given regex. | string | +| `k8sd/v1alpha1/calico/autodetection-v6/canReach` | Enable IP auto-detection based on which source address on the node is used to reach the specified IP or domain. | string | +| `k8sd/v1alpha1/calico/autodetection-v6/cidrs` | Enable IP auto-detection based on which addresses on the nodes are within one of the provided CIDRs. | []string (comma separated) | +| `k8sd/v1alpha1/cilium/devices` | List of devices facing cluster/external network (used for BPF NodePort, BPF masquerading and host firewall); supports `+` as wildcard in device name, e.g. `eth+,ens+` | string | +| `k8sd/v1alpha1/cilium/direct-routing-device` | Device name used to connect nodes in direct routing mode (used by BPF NodePort, BPF host routing); if empty, automatically set to a device with k8s InternalIP/ExternalIP or with a default route. Bridge type devices are ignored in automatic selection | string | +| `k8sd/v1alpha1/cilium/vlan-bpf-bypass` | Comma separated list of VLAN tags to bypass eBPF filtering on native devices. Cilium enables a firewall on native devices and filters all unknown traffic, including VLAN 802.1q packets, which pass through the main device with the associated tag (e.g., VLAN device eth0.4000 and its main interface eth0). Supports `0` as wildcard for bypassing all VLANs. e.g. `4001,4002` | []string | +| `k8sd/v1alpha1/metrics-server/image-repo` | Override the default image repository for the metrics-server. | string | +| `k8sd/v1alpha1/metrics-server/image-tag` | Override the default image tag for the metrics-server. 
| string | + -[bootstrap]: /snap/reference/bootstrap-config-reference +[bootstrap]: bootstrap-config-reference diff --git a/docs/src/snap/reference/architecture.md b/docs/src/snap/reference/architecture.md index b3adad13e..02170835e 100644 --- a/docs/src/snap/reference/architecture.md +++ b/docs/src/snap/reference/architecture.md @@ -10,8 +10,7 @@ current design of {{product}}, following the [C4 model]. This overview of {{product}} demonstrates the interactions of Kubernetes with users and with other systems. -```{kroki} ../../assets/overview.puml -``` +![cluster5][] Two actors interact with the Kubernetes snap: @@ -20,7 +19,8 @@ Two actors interact with the Kubernetes snap: access to the cluster. That initial user is able to configure the cluster to match their needs and of course create other users that may or may not have admin privileges. The K8s admin is also able to maintain workloads running - in the cluster. + in the cluster. If you deploy {{product}} from a snap, this is how the cluster + is manually orchestrated. - **K8s user**: A user consuming the workloads hosted in the cluster. Users do not have access to the Kubernetes API server. They need to access the cluster @@ -52,8 +52,7 @@ distribution. We have identified the following: Looking more closely at what is contained within the K8s snap itself: -```{kroki} ../../assets/k8s-container.puml -``` +![cluster1][] The `k8s` snap distribution includes the following: @@ -74,8 +73,7 @@ The `k8s` snap distribution includes the following: K8sd is the component that implements and exposes the operations functionality needed for managing the Kubernetes cluster. -```{kroki} ../../assets/k8sd-component.puml -``` +![cluster2][] At the core of the `k8sd` functionality we have the cluster manager that is responsible for configuring the services, workload and features we deem @@ -107,8 +105,7 @@ This functionality is exposed via the following interfaces: Canonical `k8s` Charms encompass two primary components: the [`k8s` charm][K8s charm] and the [`k8s-worker` charm][K8s-worker charm]. -```{kroki} ../../assets/charms-architecture.puml -``` +![cluster4][] Charms are instantiated on a machine as a Juju unit, and a collection of units constitutes an application. Both `k8s` and `k8s-worker` units are responsible @@ -119,7 +116,7 @@ determines the node's role in the Kubernetes cluster. The `k8s` charm manages directing the `juju` controller to reach the model's eventually consistent state. For more detail on Juju's concepts, see the [Juju docs][]. -The administrator may choose any supported cloud-types (Openstack, MAAS, AWS, +The administrator may choose any supported cloud-types (OpenStack, MAAS, AWS, GCP, Azure...) on which to manage the machines making up the Kubernetes cluster. Juju selects a single leader unit per application to act as a centralised figure with the model. The `k8s` leader oversees Kubernetes @@ -140,6 +137,12 @@ and the sharing of observability data with the [`Canonical Observability Stack (COS)`][COS docs]. This modular and integrated approach facilitates a robust and flexible {{product}} deployment managed through Juju. 
+ + +[cluster1]: https://assets.ubuntu.com/v1/dfc43753-cluster1.svg +[cluster2]: https://assets.ubuntu.com/v1/f634743e-k8sd.svg +[cluster4]: https://assets.ubuntu.com/v1/24fd1773-cluster4.svg +[cluster5]: https://assets.ubuntu.com/v1/bcfe150f-overview.svg [C4 model]: https://c4model.com/ diff --git a/docs/src/snap/reference/bootstrap-config-reference.md b/docs/src/snap/reference/bootstrap-config-reference.md index 047622c5d..758828e04 100644 --- a/docs/src/snap/reference/bootstrap-config-reference.md +++ b/docs/src/snap/reference/bootstrap-config-reference.md @@ -1,520 +1,14 @@ # Bootstrap configuration file reference -A YAML file can be supplied to the `k8s bootstrap` command to configure and +A YAML file can be supplied to the `k8s bootstrap` command to configure and customise the cluster. This reference section provides the format of this file by listing all available options and their details. See below for an example. -## Format Specification +## Configuration options -### cluster-config.network - -**Type:** `object`<br>
-**Required:** `No` - -Configuration options for the network feature - -#### cluster-config.network.enabled - -**Type:** `bool`
-**Required:** `No`
- -Determines if the feature should be enabled. -If omitted defaults to `true` - -### cluster-config.dns - -**Type:** `object`
-**Required:** `No` - -Configuration options for the dns feature - -#### cluster-config.dns.enabled - -**Type:** `bool`
-**Required:** `No`
- -Determines if the feature should be enabled. -If omitted defaults to `true` - -#### cluster-config.dns.cluster-domain - -**Type:** `string`
-**Required:** `No`
- -Sets the local domain of the cluster. -If omitted defaults to `cluster.local` - -#### cluster-config.dns.service-ip - -**Type:** `string`
-**Required:** `No`
- -Sets the IP address of the dns service. If omitted defaults to the IP address -of the Kubernetes service created by the feature. - -Can be used to point to an external dns server when feature is disabled. - - -#### cluster-config.dns.upstream-nameservers - -**Type:** `list[string]`
-**Required:** `No`
- -Sets the upstream nameservers used to forward queries for out-of-cluster -endpoints. -If omitted defaults to `/etc/resolv.conf` and uses the nameservers of the node. - - -### cluster-config.ingress - -**Type:** `object`
-**Required:** `No` - -Configuration options for the ingress feature - -#### cluster-config.ingress.enabled - -**Type:** `bool`
-**Required:** `No`
- -Determines if the feature should be enabled. -If omitted defaults to `false` - -#### cluster-config.ingress.default-tls-secret - -**Type:** `string`
-**Required:** `No`
- -Sets the name of the secret to be used for providing default encryption to -ingresses. - -Ingresses can specify another TLS secret in their resource definitions, -in which case the default secret won't be used. - -#### cluster-config.ingress.enable-proxy-protocol - -**Type:** `bool`
-**Required:** `No`
- -Determines if the proxy protocol should be enabled for ingresses. -If omitted defaults to `false` - - -### cluster-config.load-balancer - -**Type:** `object`
-**Required:** `No` - -Configuration options for the load-balancer feature - -#### cluster-config.load-balancer.enabled - -**Type:** `bool`
-**Required:** `No`
- -Determines if the feature should be enabled. -If omitted defaults to `false` - -#### cluster-config.load-balancer.cidrs - -**Type:** `list[string]`
-**Required:** `No`
- -Sets the CIDRs used for assigning IP addresses to Kubernetes services with type -`LoadBalancer`. - -#### cluster-config.load-balancer.l2-mode - -**Type:** `bool`
-**Required:** `No`
- -Determines if L2 mode should be enabled. -If omitted defaults to `false` - -#### cluster-config.load-balancer.l2-interfaces - -**Type:** `list[string]`
-**Required:** `No`
- -Sets the interfaces to be used for announcing IP addresses through ARP. -If omitted all interfaces will be used. - -#### cluster-config.load-balancer.bgp-mode - -**Type:** `bool`
-**Required:** `No`
- -Determines if BGP mode should be enabled. -If omitted defaults to `false` - -#### cluster-config.load-balancer.bgp-local-asn - -**Type:** `int`
-**Required:** `Yes if bgp-mode is true`
- -Sets the ASN to be used for the local virtual BGP router. - -#### cluster-config.load-balancer.bgp-peer-address - -**Type:** `string`
-**Required:** `Yes if bgp-mode is true`
- -Sets the IP address of the BGP peer. - -#### cluster-config.load-balancer.bgp-peer-asn - -**Type:** `int`
-**Required:** `Yes if bgp-mode is true`
- -Sets the ASN of the BGP peer. - -#### cluster-config.load-balancer.bgp-peer-port - -**Type:** `int`
-**Required:** `Yes if bgp-mode is true`
- -Sets the port of the BGP peer. - - -### cluster-config.local-storage - -**Type:** `object`
-**Required:** `No` - -Configuration options for the local-storage feature - -#### cluster-config.local-storage.enabled - -**Type:** `bool`
-**Required:** `No`
- -Determines if the feature should be enabled. -If omitted defaults to `false` - -#### cluster-config.local-storage.local-path - -**Type:** `string`
-**Required:** `No`
- -Sets the path to be used for storing volume data. -If omitted defaults to `/var/snap/k8s/common/rawfile-storage` - -#### cluster-config.local-storage.reclaim-policy - -**Type:** `string`
-**Required:** `No`
-**Possible Values:** `Retain | Recycle | Delete` - -Sets the reclaim policy of the storage class. -If omitted defaults to `Delete` - -#### cluster-config.local-storage.default - -**Type:** `bool`
-**Required:** `No`
- -Determines if the storage class should be set as default. -If omitted defaults to `true` - - -### cluster-config.gateway - -**Type:** `object`
-**Required:** `No` - -Configuration options for the gateway feature - -#### cluster-config.gateway.enabled - -**Type:** `bool`
-**Required:** `No`
- -Determines if the feature should be enabled. -If omitted defaults to `true` - -### cluster-config.cloud-provider - -**Type:** `string`
-**Required:** `No`
-**Possible Values:** `external` - -Sets the cloud provider to be used by the cluster. - -When this is set as `external`, node will wait for an external cloud provider to -do cloud specific setup and finish node initialization. - -### control-plane-taints - -**Type:** `list[string]`
-**Required:** `No` - -List of taints to be applied to control plane nodes. - -### pod-cidr - -**Type:** `string`
-**Required:** `No` - -The CIDR to be used for assigning pod addresses. -If omitted defaults to `10.1.0.0/16` - -### service-cidr - -**Type:** `string`
-**Required:** `No` - -The CIDR to be used for assigning service addresses. -If omitted defaults to `10.152.183.0/24` - -### disable-rbac - -**Type:** `bool`
-**Required:** `No` - -Determines if RBAC should be disabled. -If omitted defaults to `false` - -### secure-port - -**Type:** `int`
-**Required:** `No` - -The port number for kube-apiserver to use. -If omitted defaults to `6443` - -### k8s-dqlite-port - -**Type:** `int`
-**Required:** `No` - -The port number for k8s-dqlite to use. -If omitted defaults to `9000` - -### datastore-type - -**Type:** `string`
-**Required:** `No`
-**Possible Values:** `k8s-dqlite | external` - -The type of datastore to be used. -If omitted defaults to `k8s-dqlite` - -Can be used to point to an external datastore like etcd. - -### datastore-servers - -**Type:** `list[string]`
-**Required:** `No`
- -The server addresses to be used when `datastore-type` is set to `external`. - -### datastore-ca-crt - -**Type:** `string`
-**Required:** `No`
- -The CA certificate to be used when communicating with the external datastore. - -### datastore-client-crt - -**Type:** `string`
-**Required:** `No`
- -The client certificate to be used when communicating with the external -datastore. - -### datastore-client-key - -**Type:** `string`
-**Required:** `No`
- -The client key to be used when communicating with the external datastore. - -### extra-sans - -**Type:** `list[string]`
-**Required:** `No`
- -List of extra SANs to be added to certificates. - -### ca-crt - -**Type:** `string`
-**Required:** `No`
- -The CA certificate to be used for Kubernetes services. -If omitted defaults to an auto generated certificate. - -### ca-key - -**Type:** `string`
-**Required:** `No`
- -The CA key to be used for Kubernetes services. -If omitted defaults to an auto generated key. - -### front-proxy-ca-crt - -**Type:** `string`
-**Required:** `No`
- -The CA certificate to be used for the front proxy. -If omitted defaults to an auto generated certificate. - -### front-proxy-ca-key - -**Type:** `string`
-**Required:** `No`
- -The CA key to be used for the front proxy. -If omitted defaults to an auto generated key. - -### front-proxy-client-crt - -**Type:** `string`
-**Required:** `No`
- -The client certificate to be used for the front proxy. -If omitted defaults to an auto generated certificate. - -### front-proxy-client-key - -**Type:** `string`
-**Required:** `No`
- -The client key to be used for the front proxy. -If omitted defaults to an auto generated key. - - -### apiserver-kubelet-client-crt - -**Type:** `string`
-**Required:** `No`
- -The client certificate to be used by kubelet for communicating with the -kube-apiserver. -If omitted defaults to an auto generated certificate. - -### apiserver-kubelet-client-key - -**Type:** `string`
-**Required:** `No`
- -The client key to be used by kubelet for communicating with the kube-apiserver. -If omitted defaults to an auto generated key. - -### service-account-key - -**Type:** `string`
-**Required:** `No`
- -The key to be used by the default service account. -If omitted defaults to an auto generated key. - -### apiserver-crt - -**Type:** `string`
-**Required:** `No`
- -The certificate to be used for the kube-apiserver. -If omitted defaults to an auto generated certificate. - -### apiserver-key - -**Type:** `string`
-**Required:** `No`
- -The key to be used for the kube-apiserver. -If omitted defaults to an auto generated key. - -### kubelet-crt - -**Type:** `string`
-**Required:** `No`
- -The certificate to be used for the kubelet. -If omitted defaults to an auto generated certificate. - -### kubelet-key - -**Type:** `string`
-**Required:** `No`
- -The key to be used for the kubelet. -If omitted defaults to an auto generated key. - -### extra-node-config-files - -**Type:** `map[string]string`
-**Required:** `No`
- -Additional files that are uploaded `/var/snap/k8s/common/args/conf.d/` -to a node on bootstrap. These files can them be references by Kubernetes -service arguments. -The format is `map[]`. - -### extra-node-kube-apiserver-args - -**Type:** `map[string]string`
-**Required:** `No`
- -Additional arguments that are passed to the `kube-apiserver` only for that -specific node. Overwrites default configuration. A parameter that is explicitly -set to `null` is deleted. The format is `map[<--flag-name>]`. - -### extra-node-kube-controller-manager-args - -**Type:** `map[string]string`
-**Required:** `No`
- -Additional arguments that are passed to the `kube-controller-manager` only for -that specific node. Overwrites default configuration. A parameter that is -explicitly set to `null` is deleted. The format is `map[<--flag-name>]`. - -### extra-node-kube-scheduler-args - -**Type:** `map[string]string`
-**Required:** `No`
- -Additional arguments that are passed to the `kube-scheduler` only for that -specific node. Overwrites default configuration. A parameter that is explicitly -set to `null` is deleted. The format is `map[<--flag-name>]`. - -### extra-node-kube-proxy-args - -**Type:** `map[string]string`
-**Required:** `No`
- -Additional arguments that are passed to the `kube-proxy` only for that -specific node. Overwrites default configuration. A parameter that is explicitly -set to `null` is deleted. The format is `map[<--flag-name>]`. - -### extra-node-kubelet-args - -**Type:** `map[string]string`
-**Required:** `No`
- -Additional arguments that are passed to the `kubelet` only for that -specific node. Overwrites default configuration. A parameter that is explicitly -set to `null` is deleted. The format is `map[<--flag-name>]`. - -### extra-node-containerd-args - -**Type:** `map[string]string`
-**Required:** `No`
- -Additional arguments that are passed to `containerd` only for that -specific node. Overwrites default configuration. A parameter that is explicitly -set to `null` is deleted. The format is `map[<--flag-name>]`. - -### extra-node-k8s-dqlite-args - -**Type:** `map[string]string`
-**Required:** `No`
+```{include} ../../_parts/bootstrap_config.md +``` -Additional arguments that are passed to `k8s-dqlite` only for that -specific node. Overwrites default configuration. A parameter that is explicitly -set to `null` is deleted. The format is `map[<--flag-name>]`. ## Example diff --git a/docs/src/snap/reference/certificates.md b/docs/src/snap/reference/certificates.md index 29df8bdb0..996d80f6c 100644 --- a/docs/src/snap/reference/certificates.md +++ b/docs/src/snap/reference/certificates.md @@ -26,13 +26,13 @@ their issuance. | **Common Name** | **Purpose** | **File Location** | **Primary Function** | **Signed By** | |--------------------------------------------|-----------|------------------------------------------------------|------------------------------------------------------------------|-----------------------------| | `kube-apiserver` | Server | `/etc/kubernetes/pki/apiserver.crt` | Securing the API server endpoint | `kubernetes-ca` | -| `apiserver-kubelet-client` | Client | `/etc/kubernetes/pki/apiserver-kubelet-client.crt` | API server communication with kubelets | `kubernetes-ca-client` | +| `apiserver-kubelet-client` | Client | `/etc/kubernetes/pki/apiserver-kubelet-client.crt` | API server communication with kubelet | `kubernetes-ca-client` | | `kube-apiserver-etcd-client` | Client | `/etc/kubernetes/pki/apiserver-etcd-client.crt` | API server communication with etcd | `kubernetes-ca-client` | | `front-proxy-client` | Client | `/etc/kubernetes/pki/front-proxy-client.crt` | API server communication with the front-proxy | `kubernetes-front-proxy-ca` | | `system:kube-controller-manager` | Client | `/etc/kubernetes/pki/controller-manager.crt` | Communication between the controller manager and the API server | `kubernetes-ca-client` | | `system:kube-scheduler` | Client | `/etc/kubernetes/pki/scheduler.crt` | Communication between the scheduler and the API server | `kubernetes-ca-client` | | `system:kube-proxy` | Client | `/etc/kubernetes/pki/proxy.crt` | Communication between kube-proxy and the API server | `kubernetes-ca-client` | -| `system:node:$hostname` | Client | `/etc/kubernetes/pki/kubelet-client.crt` | Authentication of kubelets to the API server | `kubernetes-ca-client` | +| `system:node:$hostname` | Client | `/etc/kubernetes/pki/kubelet-client.crt` | Authentication of kubelet to the API server | `kubernetes-ca-client` | | `k8s-dqlite` | Client | `/var/snap/k8s/common/var/lib/k8s-dqlite/cluster.crt`| Communication between k8s-dqlite nodes and API server | `self-signed` | | `root@$hostname` | Client | `/var/snap/k8s/common/var/lib/k8s-dqlite/cluster.crt` | Communication between k8sd nodes | `self-signed` | diff --git a/docs/src/snap/reference/control-plane-join-config-reference.md b/docs/src/snap/reference/control-plane-join-config-reference.md new file mode 100755 index 000000000..06875521a --- /dev/null +++ b/docs/src/snap/reference/control-plane-join-config-reference.md @@ -0,0 +1,11 @@ +# Control plane node join configuration file reference + +A YAML file can be supplied to the `k8s join-cluster ` command to configure and +customise new nodes. + +This reference section provides all available options for control plane nodes. 
+ +## Configuration options + +```{include} ../../_parts/control_plane_join_config.md +``` diff --git a/docs/src/snap/reference/index.md b/docs/src/snap/reference/index.md index f1720e760..bd7a9217f 100644 --- a/docs/src/snap/reference/index.md +++ b/docs/src/snap/reference/index.md @@ -16,10 +16,12 @@ commands annotations certificates bootstrap-config-reference +control-plane-join-config-reference +worker-join-config-reference proxy troubleshooting architecture -community +Community roadmap ``` diff --git a/docs/src/snap/reference/proxy.md b/docs/src/snap/reference/proxy.md index 1fa29f765..31acde759 100644 --- a/docs/src/snap/reference/proxy.md +++ b/docs/src/snap/reference/proxy.md @@ -37,5 +37,6 @@ how to set these. -[How to guide for configuring proxies for the k8s snap]: /snap/howto/proxy -[How to guide for configuring proxies for k8s charms]: /charm/howto/proxy +[How to guide for configuring proxies for the k8s snap]: ../howto/networking/proxy +[How to guide for configuring proxies for k8s charms]: ../../charm/howto/proxy + diff --git a/docs/src/snap/reference/releases.md b/docs/src/snap/reference/releases.md index 319f16737..b6035eb58 100644 --- a/docs/src/snap/reference/releases.md +++ b/docs/src/snap/reference/releases.md @@ -18,7 +18,7 @@ Currently {{product}} is working towards general availability, but you can install it now to try: - **Clustering** - need high availability or just an army of worker nodes? - {{product}} is emminently scaleable, see the [tutorial on adding + {{product}} is eminently scalable, see the [tutorial on adding more nodes][nodes]. - **Networking** - Our built-in network component allows cluster administrators to automatically scale and secure network policies across the cluster. Find @@ -32,8 +32,8 @@ Follow along with the [tutorial] to get started! -[tutorial]: /snap/tutorial/getting-started -[nodes]: /snap/tutorial/add-remove-nodes +[tutorial]: ../tutorial/getting-started +[nodes]: ../tutorial/add-remove-nodes [COS Lite]: https://charmhub.io/cos-lite -[networking]: /snap/howto/networking/index -[observability documentation]: /charm/howto/cos-lite \ No newline at end of file +[networking]: ../howto/networking/index +[observability documentation]: ../../charm/howto/cos-lite \ No newline at end of file diff --git a/docs/src/snap/reference/roadmap.md b/docs/src/snap/reference/roadmap.md index 63550f85c..a97ae5657 100644 --- a/docs/src/snap/reference/roadmap.md +++ b/docs/src/snap/reference/roadmap.md @@ -7,7 +7,7 @@ future direction and priorities of the project. Our roadmap matches the cadence of the Ubuntu release cycle, so `24.10` is the same as the release date for Ubuntu 24.10. This does not precisely map to the release cycle of Kubernetes versions, so please consult the [release notes] for -specifics of whatfeatures have been delivered. +specifics of what features have been delivered. ``` {csv-table} {{product}} public roadmap diff --git a/docs/src/snap/reference/troubleshooting.md b/docs/src/snap/reference/troubleshooting.md index f6b44a3f1..a4edf0d94 100644 --- a/docs/src/snap/reference/troubleshooting.md +++ b/docs/src/snap/reference/troubleshooting.md @@ -44,7 +44,7 @@ the kubelet. kubelet needs a feature from cgroup and the kernel may not be set up appropriately to provide the cpuset feature. 
``` -E0125 00:20:56.003890 2172 kubelet.go:1466] "Failed to start ContainerManager" err="failed to initialize top level QOS containers: root container [kubepods] doesn't exist" +E0125 00:20:56.003890 2172 kubelet.go:1466] "Failed to start ContainerManager" err="failed to initialise top level QOS containers: root container [kubepods] doesn't exist" ``` ### Explanation @@ -54,7 +54,7 @@ An excellent deep-dive of the issue exists at Commenter [@haircommander][] [states][kubernetes-122955-2020403422] > basically: we've figured out that this issue happens because libcontainer -> doesn't initialize the cpuset cgroup for the kubepods slice when the kubelet +> doesn't initialise the cpuset cgroup for the kubepods slice when the kubelet > initially calls into it to do so. This happens because there isn't a cpuset > defined on the top level of the cgroup. however, we fail to validate all of > the cgroup controllers we need are present. It's possible this is a @@ -68,7 +68,7 @@ Commenter [@haircommander][] [states][kubernetes-122955-2020403422] ### Solution This is in the process of being fixed upstream via -[kubernetes/kuberetes #125923][kubernetes-125923]. +[kubernetes/kubernetes #125923][kubernetes-125923]. In the meantime, the best solution is to create a `Delegate=yes` configuration in systemd. diff --git a/docs/src/snap/reference/worker-join-config-reference.md b/docs/src/snap/reference/worker-join-config-reference.md new file mode 100755 index 000000000..d10ea5ba2 --- /dev/null +++ b/docs/src/snap/reference/worker-join-config-reference.md @@ -0,0 +1,11 @@ +# Worker node join configuration file reference + +A YAML file can be supplied to the `k8s join-cluster ` command to configure and +customise new worker nodes. + +This reference section provides all available options for worker nodes. + +## Configuration options + +```{include} ../../_parts/worker_join_config.md +``` diff --git a/docs/src/snap/tutorial/add-remove-nodes.md b/docs/src/snap/tutorial/add-remove-nodes.md index 736474b46..72ff32988 100644 --- a/docs/src/snap/tutorial/add-remove-nodes.md +++ b/docs/src/snap/tutorial/add-remove-nodes.md @@ -8,7 +8,7 @@ This tutorial simplifies the concept by creating a cluster within a controlled environment using two Multipass VMs. The approach here allows us to focus on the foundational aspects of clustering using {{product}} without the complexities of a full-scale, production setup. If your nodes are already -installed, you can skip the multipass setup and go to [step 2](step2). +installed, you can skip the Multipass setup and go to [step 2](step2). ## Before starting @@ -76,7 +76,7 @@ A base64 token will be printed to your terminal. Keep it handy as you will need it for the next step. ```{note} It's advisable to name the new node after the hostname of the - worker node (in this case, the VM's hostname is worker). + worker node (in this case, the VM hostname is worker). ``` ### 3. 
Join the cluster on the worker node @@ -142,7 +142,7 @@ multipass purge ## Next Steps - Discover how to enable and configure Ingress resources [Ingress][Ingress] -- Keep mastering {{product}} with kubectl [How to use +- Learn more about {{product}} with kubectl [How to use kubectl][Kubectl] - Explore Kubernetes commands with our [Command Reference Guide][Command Reference] @@ -153,8 +153,8 @@ multipass purge [Getting started]: getting-started [Multipass Installation]: https://multipass.run/install -[Ingress]: /snap/howto/networking/default-ingress +[Ingress]: ../howto/networking/default-ingress [Kubectl]: kubectl -[Command Reference]: /snap/reference/commands -[Storage]: /snap/howto/storage -[Networking]: /snap/howto/networking/index.md +[Command Reference]: ../reference/commands +[Storage]: ../howto/storage/index +[Networking]: ../howto/networking/index.md diff --git a/docs/src/snap/tutorial/getting-started.md b/docs/src/snap/tutorial/getting-started.md index d613e202b..69832a87d 100644 --- a/docs/src/snap/tutorial/getting-started.md +++ b/docs/src/snap/tutorial/getting-started.md @@ -94,9 +94,10 @@ Let's deploy a demo NGINX server: sudo k8s kubectl create deployment nginx --image=nginx ``` -This command launches a [pod](https://kubernetes.io/docs/concepts/workloads/pods/), -the smallest deployable unit in Kubernetes, -running the NGINX application within a container. +This command launches a +[pod](https://kubernetes.io/docs/concepts/workloads/pods/), the smallest +deployable unit in Kubernetes, running the NGINX application within a +container. You can check the status of your pods by running: @@ -202,18 +203,18 @@ This option ensures complete removal of the snap and its associated data. ## Next Steps -- Keep mastering {{product}} with kubectl: [How to use kubectl] +- Learn more about {{product}} with kubectl: [How to use kubectl] - Explore Kubernetes commands with our [Command Reference Guide] - Learn how to set up a multi-node environment [Setting up a K8s cluster] -- Configure storage options [Storage] +- Configure storage options: [Storage] - Master Kubernetes networking concepts: [Networking] - Discover how to enable and configure Ingress resources [Ingress] [How to use kubectl]: kubectl -[Command Reference Guide]: /snap/reference/commands +[Command Reference Guide]: ../reference/commands [Setting up a K8s cluster]: add-remove-nodes -[Storage]: /snap/howto/storage -[Networking]: /snap/howto/networking/index.md -[Ingress]: /snap/howto/networking/default-ingress.md \ No newline at end of file +[Storage]: ../howto/storage/index +[Networking]: ../howto/networking/index.md +[Ingress]: ../howto/networking/default-ingress.md \ No newline at end of file diff --git a/docs/src/snap/tutorial/index.md b/docs/src/snap/tutorial/index.md index 237c89aed..b124db24b 100644 --- a/docs/src/snap/tutorial/index.md +++ b/docs/src/snap/tutorial/index.md @@ -33,6 +33,6 @@ Finally, our [Reference section] is for when you need to check specific details or information such as the command reference or release notes. 
-[How-to guides]: /snap/howto/index -[Explanation section]: /snap/explanation/index -[Reference section]: /snap/reference/index +[How-to guides]: ../howto/index +[Explanation section]: ../explanation/index +[Reference section]: ../reference/index diff --git a/docs/src/snap/tutorial/kubectl.md b/docs/src/snap/tutorial/kubectl.md index 2f9fed807..02b643c18 100644 --- a/docs/src/snap/tutorial/kubectl.md +++ b/docs/src/snap/tutorial/kubectl.md @@ -123,7 +123,7 @@ pods will have a status of `ContainerCreating`. -[Command Reference Guide]: /snap/reference/commands +[Command Reference Guide]: ../reference/commands [Getting Started]: getting-started [kubernetes-api-server]: https://kubernetes.io/docs/reference/command-line-tools-reference/kube-apiserver/ [kubeconfig-doc]: https://kubernetes.io/docs/concepts/configuration/organize-cluster-access-kubeconfig/