From c2b8971c60100ae8b1a301b6f95891c90f811890 Mon Sep 17 00:00:00 2001 From: Karl Cardenas <29551334+karl-cardenas-coding@users.noreply.github.com> Date: Thu, 5 Sep 2024 15:40:28 -0700 Subject: [PATCH] docs: PE-4831 (#3809) * docs: PE-4831 * chore: fix missing link * docs: Apply suggestions from code review Co-authored-by: Lenny Chen <55669665+lennessyy@users.noreply.github.com> --------- Co-authored-by: Lenny Chen <55669665+lennessyy@users.noreply.github.com> (cherry picked from commit 37efadcf97a10d7a29e873629549147c1be0ab76) --- .../release-notes/known-issues.md | 1 + docs/docs-content/troubleshooting/edge.md | 57 +++++++++++++++++++ 2 files changed, 58 insertions(+) diff --git a/docs/docs-content/release-notes/known-issues.md b/docs/docs-content/release-notes/known-issues.md index 35729c5ff1..4615ab72bf 100644 --- a/docs/docs-content/release-notes/known-issues.md +++ b/docs/docs-content/release-notes/known-issues.md @@ -16,6 +16,7 @@ The following table lists all known issues that are currently active and affecti | Description | Workaround | Publish Date | Product Component | | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | 
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------------- | ---------------------------- | +| An issue with Edge hosts using [Trusted Boot](../clusters/edge/trusted-boot/trusted-boot.md) and encrypted drives occurs when TRIM is not enabled. As a result, Solid-State Drive and Nonvolatile Memory Express drives experience degraded performance and potentially cause cluster failures. This [issue](https://github.com/kairos-io/kairos/issues/2693) stems from [Kairos](https://kairos.io/) not passing through the `--allow-discards` flag to the `systemd-cryptsetup attach` command. | Check out the [Degreated Performance on Disk Drives](../troubleshooting/edge.md#scenario---degreated-performance-on-disk-drives) troubleshooting guide for guidance on workaround. | September 4, 2024 | Edge | | The AWS CSI pack has a [Pod Disruption Budget](https://kubernetes.io/docs/tasks/run-application/configure-pdb/) (PDB) that allows for a maximum of one unavailable pod. This behavior causes an issue for single-node clusters as well as clusters with a single control plane node and a single worker node where the control plane lacks worker capability. [Operating System (OS) patch](../clusters/cluster-management/os-patching.md) updates may attempt to evict the CSI controller without success, resulting in the node remaining in the un-schedulable state. | If OS patching is enabled, allow the control plane nodes to have worker capability. 
For single-node clusters, turn off the OS patching feature. | September 4, 2024 | Cluster, Packs | | On AWS IaaS Microk8s clusters, OS patching can get stuck and fail. | Refer to the [Troubleshooting](../troubleshooting/nodes.md#os-patch-fails-on-aws-with-microk8s-127) section for debug steps. | August 17, 2024 | Palette | | When upgrading a self-hosted Palette instance from 4.3 to 4.4 the MongoDB pod may be stuck with the following error: `ReadConcernMajorityNotAvailableYet: Read concern majority reads are currently not possible.` | Delete the PVC, PV and the pod manually. All resources will be recreated with the correct configuration. | August 17, 2024 | Self-Hosted Palette | diff --git a/docs/docs-content/troubleshooting/edge.md b/docs/docs-content/troubleshooting/edge.md index 9aa53bd516..107a66db68 100644 --- a/docs/docs-content/troubleshooting/edge.md +++ b/docs/docs-content/troubleshooting/edge.md @@ -176,3 +176,60 @@ configured, which will result in cluster deployment failure. ``` This will start the `systemd-resolved.service` process and move the cluster creation process forward. + +## Scenario - Degreated Performance on Disk Drives + +If you are experiencing degraded performance on disk drives, such as Solid-State Drive or Nonvolatile Memory Express +drives, and you have [Trusted Boot](../clusters/edge/trusted-boot/trusted-boot.md) enabled. The degraded performance may +be caused by TRIM operations not being enabled on the drives. TRIM allows the OS to notify the drive which data blocks +are no longer in use and can be erased internally. To enable TRIM operations, use the following steps. + +### Debug Steps + +1. Log in to [Palette](https://console.spectrocloud.com). + +2. Navigate to the left **Main Menu** and click on **Profiles**. + +3. Select the **Cluster Profile** that you want to use for your Edge cluster. + +4. Click on the BYOOS layer to access its YAML configuration. + +5. 
Add the following configuration to the YAML to enable TRIM operations on encrypted partitions. + + ```yaml + stages: + boot.after: + - name: Ensure encrypted partitions can be trimmed + commands: + - | + DEVICES=$(lsblk -p -n -l -o NAME) + if cat /proc/cmdline | grep rd.immucore.uki; then TRUSTED_BOOT="true"; fi + for part in $DEVICES + do + if cryptsetup isLuks $part; then + echo "Detected encrypted partition $part, ensuring TRIM is enabled..." + if ! cryptsetup status ${part#/dev/} | grep discards; then + echo "TRIM is not enabled on $part, enabling TRIM..." + if [ "$TRUSTED_BOOT" = "true" ]; then + cryptsetup refresh --allow-discards --persistent ${part#/dev/} + else + if cryptsetup status ${part#/dev/} | grep LUKS2; then OPTIONS="--persistent"; fi + passphrase=$(echo '{ "data": "{ \"label\": \"LABEL\" }"}' | /system/discovery/kcrypt-discovery-challenger "discovery.password" | jq -r '.data') + echo $passphrase | cryptsetup refresh --allow-discards $OPTIONS ${part#/dev/} + fi + if [ "$?" = "0" ]; then + echo "TRIM is now enabled on $part" + else + echo "TRIM coud not be enabled on $part!" + fi + else + echo "TRIM is already enabled on $part, nothing to do." + fi + fi + done + ``` + +6. Click on **Confirm Updates** to save the changes. + +7. Use the updated profile to create a [new Edge cluster](../clusters/edge/site-deployment/cluster-deployment.md) or + update an existing Edge cluster.
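
To confirm that the workaround took effect, you can inspect the `flags` line that `cryptsetup status` prints for each active mapping. The following is a minimal sketch and is not part of the patch above: the helper names `has_discards` and `report_trim` are hypothetical, and it assumes that `lsblk`, `awk`, and `cryptsetup` are available on the Edge host and that you run it as root.

```shell
# Hypothetical helper: reads `cryptsetup status` output on stdin and succeeds
# when the flags line lists "discards" (that is, TRIM is allowed).
has_discards() {
  grep -Eq '^[[:space:]]*flags:.*discards'
}

# Hypothetical helper: prints the TRIM state of every active dm-crypt mapping.
# Assumes lsblk reports encrypted mappings with TYPE "crypt".
report_trim() {
  for mapping in $(lsblk -n -l -o NAME,TYPE | awk '$2 == "crypt" { print $1 }'); do
    if cryptsetup status "$mapping" | has_discards; then
      echo "$mapping: TRIM enabled"
    else
      echo "$mapping: TRIM not enabled"
    fi
  done
}
```

After sourcing the snippet in a root shell on the host, call `report_trim` to print one line per encrypted mapping.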