Add E2E automation, debug helpers (#94)
* WIP: Add script to automate testing of two node cluster
* Add template for cluster profile
* Remove separate function to build stylus framework image
  Earthly stylus goal is now building this goal too
* WIP: automation of cluster launch
* Move test script to test/
* Extract user data template
* Add creation of all machines
* Refactor main
* Add missing required variable
* wip: e2e vmware test
  Signed-off-by: Tyler Gillson <[email protected]>
* finish automating e2e provisioning
  Signed-off-by: Tyler Gillson <[email protected]>
* tidy whitespace & remove oz's changes to default user-data template
  Signed-off-by: Tyler Gillson <[email protected]>
* fixes & docs
  Signed-off-by: Tyler Gillson <[email protected]>
* add isTwoNodeCandidate flag to cluster template
  Signed-off-by: Tyler Gillson <[email protected]>
* fix: remove invalid node-status-update-frequency flag
  Signed-off-by: Tyler Gillson <[email protected]>
* fix: ensure unique VM names in vSphere; docs
  Signed-off-by: Tyler Gillson <[email protected]>
* remove invalid arg from cluster profile template
  Signed-off-by: Tyler Gillson <[email protected]>
* add livenessSeconds & VIP config
  Signed-off-by: Tyler Gillson <[email protected]>
* feat: add destroy_edge_hosts
  Signed-off-by: Tyler Gillson <[email protected]>
* install ping for two node
  Signed-off-by: Tyler Gillson <[email protected]>
* feat: configurable NIC_NAME
  Signed-off-by: Tyler Gillson <[email protected]>
* fix test-two-node.sh
* add debug scripts, cleanup funcs, remove two-node env hack
  Signed-off-by: Tyler Gillson <[email protected]>

---------

Signed-off-by: Tyler Gillson <[email protected]>
Co-authored-by: Oz Tiram <[email protected]>
1 parent 54ef5b6, commit 7e3f355
Showing 9 changed files with 741 additions and 15 deletions.
@@ -8,5 +8,5 @@ config.yaml
content-*/*
*.arg
.idea

.DS_Store
hack/*.img
.DS_Store
@@ -0,0 +1,16 @@
VERSION 0.6

ARG OSBUILDER_VERSION=v0.7.11
ARG OSBUILDER_IMAGE=quay.io/kairos/osbuilder-tools:$OSBUILDER_VERSION
ARG ISO_NAME=debug

# replace with your CanvOS provider image
ARG PROVIDER_IMAGE=oci:tylergillson/ubuntu:k3s-1.26.4-v4.0.4-071c2c23

build:
    FROM $OSBUILDER_IMAGE
    WORKDIR /build
    COPY . ./

    RUN /entrypoint.sh --name $ISO_NAME --debug build-iso --squash-no-compression --date=false $PROVIDER_IMAGE --output /build/
    SAVE ARTIFACT /build/$ISO_NAME.iso kairos.iso AS LOCAL build/$ISO_NAME.iso
@@ -0,0 +1,17 @@
# Debugging Kairos

If you're facing hard-to-diagnose issues with your custom provider image, you can use the scripts in this directory to obtain verbose Kairos output.

## Steps
1. Use Earthly to generate an ISO from your CanvOS provider image:
   ```
   earthly +build --PROVIDER_IMAGE=<your_provider_image> # e.g., oci:tylergillson/ubuntu:k3s-1.26.4-v4.0.4-071c2c23
   ```
   If successful, `build/debug.iso` will be created.
2. Launch a local VM based on the debug ISO using QEMU and pipe all output to a log file:
   ```
   ./launch-qemu.sh build/debug.iso | tee out.log
   ```
3. Once the VM boots, run `reboot` to return to the GRUB menu, select the desired entry, and press `e` to edit it. Append `rd.debug rd.immucore.debug` to the end of the `linux` line for that entry, then press `CTRL+x` to boot with your edits. Verbose Kairos debug logs should appear on the console and will be persisted to `out.log`.
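
Note on steps 1 and 2: the two commands can be chained with a small guard so that QEMU is not started when the ISO build did not produce a file. The following is a minimal sketch only; the function name `launch_debug_vm` and its optional arguments are hypothetical and not part of this commit, and it assumes it is run from the directory that contains `launch-qemu.sh`.

```shell
#!/bin/bash
# Hypothetical wrapper (not part of the commit) around step 2 of the README.
# It refuses to start QEMU when the ISO from step 1 is missing.
launch_debug_vm() {
  iso="${1:-build/debug.iso}"   # default path produced by step 1
  log="${2:-out.log}"           # log file name used in step 2
  if [ ! -f "$iso" ]; then
    echo "ISO not found: $iso (run 'earthly +build' first)" >&2
    return 1
  fi
  # Same pipeline as step 2: mirror all VM output into the log file.
  ./launch-qemu.sh "$iso" | tee "$log"
}
```

Calling `launch_debug_vm` with no arguments uses the README's default paths; the GRUB edit in step 3 still has to be done by hand on the serial console.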
Empty file.
@@ -0,0 +1,25 @@
#!/bin/bash

# Screenshot capability:
# https://unix.stackexchange.com/a/476617

if [ ! -e disk.img ]; then
    qemu-img create -f qcow2 disk.img 60g
fi

# -nic bridge,br=br0,model=virtio-net-pci \
qemu-system-x86_64 \
    -enable-kvm \
    -cpu "${CPU:=host}" \
    -nographic \
    -spice port=9000,addr=127.0.0.1,disable-ticketing=yes \
    -m ${MEMORY:=10096} \
    -smp ${CORES:=5} \
    -monitor unix:/tmp/qemu-monitor.sock,server=on,wait=off \
    -serial mon:stdio \
    -rtc base=utc,clock=rt \
    -chardev socket,path=qga.sock,server=on,wait=off,id=qga0 \
    -device virtio-serial \
    -device virtserialport,chardev=qga0,name=org.qemu.guest_agent.0 \
    -drive if=virtio,media=disk,file=disk.img \
    -drive if=ide,media=cdrom,file="${1}"
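
Note on the script above: `CPU`, `MEMORY`, and `CORES` are read through the shell's `${VAR:=default}` expansion, so they can be overridden from the environment without editing the file. The sketch below only illustrates that expansion; the helper name `show_vm_settings` is hypothetical and not part of this commit.

```shell
#!/bin/bash
# Hypothetical helper (not part of the commit): print the VM settings the
# launch script would use, applying the same ${VAR:=default} expansions.
show_vm_settings() {
  # ${VAR:=default} assigns the default only if VAR is unset or empty.
  echo "cpu=${CPU:=host} memory=${MEMORY:=10096} cores=${CORES:=5}"
}

# Run in subshells so the assignments do not leak between the two examples.
( unset CPU MEMORY CORES; show_vm_settings )          # script defaults
( unset CPU; MEMORY=4096 CORES=2; show_vm_settings )  # overridden values
```

With the real script, the equivalent override would be `MEMORY=4096 CORES=2 ./launch-qemu.sh build/debug.iso | tee out.log`.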
@@ -0,0 +1,64 @@
{
  "metadata": {
    "name": "_____place_holder_____",
    "description": "",
    "labels": {}
  },
  "spec": {
    "version": "1.0.0",
    "template": {
      "type": "infra",
      "cloudType": "edge-native",
      "packs": [
        {
          "name": "edge-native-byoi",
          "type": "spectro",
          "layer": "os",
          "version": "1.0.0",
          "tag": "1.0.0",
          "values": "pack:\n content:\n images:\n - image: \"{{.spectro.pack.edge-native-byoi.options.system.uri}}\"\n # Below config is default value, please uncomment if you want to modify default values\n #drain:\n #cordon: true\n #timeout: 60 # The length of time to wait before giving up, zero means infinite\n #gracePeriod: 60 # Period of time in seconds given to each pod to terminate gracefully. If negative, the default value specified in the pod will be used\n #ignoreDaemonSets: true\n #deleteLocalData: true # Continue even if there are pods using emptyDir (local data that will be deleted when the node is drained)\n #force: true # Continue even if there are pods that do not declare a controller\n #disableEviction: false # Force drain to use delete, even if eviction is supported. This will bypass checking PodDisruptionBudgets, use with caution\n #skipWaitForDeleteTimeout: 60 # If pod DeletionTimestamp older than N seconds, skip waiting for the pod. Seconds must be greater than 0 to skip.\nstylusPackage: container://OCI_REGISTRY/stylus-linux-amd64:v0.0.0-STYLUS_HASH\noptions:\n system.uri: \"OCI_REGISTRY/ubuntu:k3s-1.26.4-v4.0.4-STYLUS_HASH\"",
          "registry": {
            "metadata": {
              "uid": "_____place_holder_____",
              "name": "Public Repo",
              "kind": "pack",
              "isPrivate": false
            }
          }
        },
        {
          "name": "edge-k3s",
          "type": "spectro",
          "layer": "k8s",
          "version": "1.26.4",
          "tag": "1.26.4",
"values": "cluster:\n config: |\n flannel-backend: host-gw\n disable-network-policy: false\n disable:\n - traefik\n - local-storage\n - servicelb\n - metrics-server\n\n # configure the pod cidr range\n cluster-cidr: \"192.170.0.0/16\"\n\n # configure service cidr range\n service-cidr: \"192.169.0.0/16\"\n\n # kubeconfig must be in run for the stylus operator to manage the cluster\n write-kubeconfig: /run/kubeconfig\n write-kubeconfig-mode: 600\n\n # additional component settings to harden installation\n kube-apiserver-arg:\n - anonymous-auth=true\n - profiling=false\n - disable-admission-plugins=AlwaysAdmit\n - default-not-ready-toleration-seconds=60\n - default-unreachable-toleration-seconds=60\n - enable-admission-plugins=AlwaysPullImages,NamespaceLifecycle,ServiceAccount,NodeRestriction\n - audit-log-path=/var/log/apiserver/audit.log\n - audit-policy-file=/etc/kubernetes/audit-policy.yaml\n - audit-log-maxage=30\n - audit-log-maxbackup=10\n - audit-log-maxsize=100\n - authorization-mode=RBAC,Node\n - tls-cipher-suites=TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,TLS_RSA_WITH_AES_256_GCM_SHA384,TLS_RSA_WITH_AES_128_GCM_SHA256\n kube-controller-manager-arg:\n - profiling=false\n - terminated-pod-gc-threshold=25\n - use-service-account-credentials=true\n - feature-gates=RotateKubeletServerCertificate=true\n - node-monitor-period=5s\n - node-monitor-grace-period=20s\n - pod-eviction-timeout=20s\n kube-scheduler-arg:\n - profiling=false\n kubelet-arg:\n - read-only-port=0\n - event-qps=0\n - feature-gates=RotateKubeletServerCertificate=true\n - protect-kernel-defaults=true\n - 
tls-cipher-suites=TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,TLS_RSA_WITH_AES_256_GCM_SHA384,TLS_RSA_WITH_AES_128_GCM_SHA256\n - rotate-server-certificates=true\nstages:\n initramfs:\n - sysctl:\n vm.overcommit_memory: 1\n kernel.panic: 10\n kernel.panic_on_oops: 1\n kernel.printk: \"0 4 0 7\"\n - directories:\n - path: \"/var/log/apiserver\"\n permissions: 0644\n files:\n - path: /etc/hosts\n permission: \"0644\"\n content: |\n 127.0.0.1 localhost\n - path: \"/etc/kubernetes/audit-policy.yaml\"\n owner_string: \"root\"\n permission: 0600\n content: |\n apiVersion: audit.k8s.io/v1\n kind: Policy\n rules:\n - level: None\n users: [\"system:kube-proxy\"]\n verbs: [\"watch\"]\n resources:\n - group: \"\" # core\n resources: [\"endpoints\", \"services\", \"services/status\"]\n - level: None\n users: [\"system:unsecured\"]\n namespaces: [\"kube-system\"]\n verbs: [\"get\"]\n resources:\n - group: \"\" # core\n resources: [\"configmaps\"]\n - level: None\n users: [\"kubelet\"] # legacy kubelet identity\n verbs: [\"get\"]\n resources:\n - group: \"\" # core\n resources: [\"nodes\", \"nodes/status\"]\n - level: None\n userGroups: [\"system:nodes\"]\n verbs: [\"get\"]\n resources:\n - group: \"\" # core\n resources: [\"nodes\", \"nodes/status\"]\n - level: None\n users:\n - system:kube-controller-manager\n - system:kube-scheduler\n - system:serviceaccount:kube-system:endpoint-controller\n verbs: [\"get\", \"update\"]\n namespaces: [\"kube-system\"]\n resources:\n - group: \"\" # core\n resources: [\"endpoints\"]\n - level: None\n users: [\"system:apiserver\"]\n verbs: [\"get\"]\n resources:\n - group: \"\" # core\n resources: [\"namespaces\", \"namespaces/status\", \"namespaces/finalize\"]\n - level: None\n users: [\"cluster-autoscaler\"]\n verbs: [\"get\", \"update\"]\n namespaces: 
[\"kube-system\"]\n resources:\n - group: \"\" # core\n resources: [\"configmaps\", \"endpoints\"]\n # Don't log HPA fetching metrics.\n - level: None\n users:\n - system:kube-controller-manager\n verbs: [\"get\", \"list\"]\n resources:\n - group: \"metrics.k8s.io\"\n # Don't log these read-only URLs.\n - level: None\n nonResourceURLs:\n - /healthz*\n - /version\n - /swagger*\n # Don't log events requests.\n - level: None\n resources:\n - group: \"\" # core\n resources: [\"events\"]\n # node and pod status calls from nodes are high-volume and can be large, don't log responses for expected updates from nodes\n - level: Request\n users: [\"kubelet\", \"system:node-problem-detector\", \"system:serviceaccount:kube-system:node-problem-detector\"]\n verbs: [\"update\",\"patch\"]\n resources:\n - group: \"\" # core\n resources: [\"nodes/status\", \"pods/status\"]\n omitStages:\n - \"RequestReceived\"\n - level: Request\n userGroups: [\"system:nodes\"]\n verbs: [\"update\",\"patch\"]\n resources:\n - group: \"\" # core\n resources: [\"nodes/status\", \"pods/status\"]\n omitStages:\n - \"RequestReceived\"\n # deletecollection calls can be large, don't log responses for expected namespace deletions\n - level: Request\n users: [\"system:serviceaccount:kube-system:namespace-controller\"]\n verbs: [\"deletecollection\"]\n omitStages:\n - \"RequestReceived\"\n # Secrets, ConfigMaps, and TokenReviews can contain sensitive \u0026 binary data,\n # so only log at the Metadata level.\n - level: Metadata\n resources:\n - group: \"\" # core\n resources: [\"secrets\", \"configmaps\"]\n - group: authentication.k8s.io\n resources: [\"tokenreviews\"]\n omitStages:\n - \"RequestReceived\"\n # Get repsonses can be large; skip them.\n - level: Request\n verbs: [\"get\", \"list\", \"watch\"]\n resources:\n - group: \"\" # core\n - group: \"admissionregistration.k8s.io\"\n - group: \"apiextensions.k8s.io\"\n - group: \"apiregistration.k8s.io\"\n - group: \"apps\"\n - group: 
\"authentication.k8s.io\"\n - group: \"authorization.k8s.io\"\n - group: \"autoscaling\"\n - group: \"batch\"\n - group: \"certificates.k8s.io\"\n - group: \"extensions\"\n - group: \"metrics.k8s.io\"\n - group: \"networking.k8s.io\"\n - group: \"policy\"\n - group: \"rbac.authorization.k8s.io\"\n - group: \"settings.k8s.io\"\n - group: \"storage.k8s.io\"\n omitStages:\n - \"RequestReceived\"\n # Default level for known APIs\n - level: RequestResponse\n resources:\n - group: \"\" # core\n - group: \"admissionregistration.k8s.io\"\n - group: \"apiextensions.k8s.io\"\n - group: \"apiregistration.k8s.io\"\n - group: \"apps\"\n - group: \"authentication.k8s.io\"\n - group: \"authorization.k8s.io\"\n - group: \"autoscaling\"\n - group: \"batch\"\n - group: \"certificates.k8s.io\"\n - group: \"extensions\"\n - group: \"metrics.k8s.io\"\n - group: \"networking.k8s.io\"\n - group: \"policy\"\n - group: \"rbac.authorization.k8s.io\"\n - group: \"settings.k8s.io\"\n - group: \"storage.k8s.io\"\n omitStages:\n - \"RequestReceived\"\n # Default level for all other requests.\n - level: Metadata\n omitStages:\n - \"RequestReceived\"\npack:\n palette:\n config:\n oidc:\n identityProvider: noauth", | ||
          "registry": {
            "metadata": {
              "uid": "_____place_holder_____",
              "name": "Public Repo",
              "kind": "pack",
              "isPrivate": false
            }
          }
        },
        {
          "name": "cni-custom",
          "type": "spectro",
          "layer": "cni",
          "version": "0.1.0",
          "tag": "0.1.0",
          "values": "manifests:\n byo-cni:\n contents: |\n apiVersion: v1\n kind: ConfigMap\n metadata:\n name: custom-cni\n data:\n # property-like keys; each key maps to a simple value\n custom-cni: \"byo-cni\"",
          "registry": {
            "metadata": {
              "uid": "_____place_holder_____",
              "name": "Public Repo",
              "kind": "pack",
              "isPrivate": false
            }
          }
        }
      ]
    }
  }
}
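
Note on the template above: the `values` strings carry the literal tokens `OCI_REGISTRY` and `STYLUS_HASH`, which presumably get replaced before the profile is submitted. The test script that does this is not shown in this view, so the following is only a sketch of one way such a substitution could be done with `sed`. The function name `render_profile` and the example registry and hash values are hypothetical, and the `_____place_holder_____` fields are not handled here.

```shell
#!/bin/bash
# Hypothetical helper (not part of the commit): fill in the OCI_REGISTRY and
# STYLUS_HASH tokens of the cluster profile template read from stdin.
render_profile() {
  registry="$1"
  hash="$2"
  # '|' is the sed delimiter so registry paths containing '/' need no escaping.
  # Values containing '|' or '&' would need escaping first.
  sed -e "s|OCI_REGISTRY|${registry}|g" -e "s|STYLUS_HASH|${hash}|g"
}

# Example on the stylusPackage fragment from the template, with made-up values:
echo 'stylusPackage: container://OCI_REGISTRY/stylus-linux-amd64:v0.0.0-STYLUS_HASH' |
  render_profile "registry.example.com/demo" "abc1234"
```

A whole file can be rendered the same way by redirecting it into the function's stdin and sending the output to a new file.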