feat: Add kata-containers extension #279

Merged: 1 commit, Feb 20, 2024

fidencio
Contributor

@fidencio fidencio commented Dec 7, 2023

Kata Containers provides an OCI runtime that focuses on protecting the host from malicious workloads, taking advantage of KVM to provide an extra isolation layer.

Kata Containers is also the foundation piece for Confidential Containers, as it's the most suitable OCI runtime to be used with Trusted Execution Environments.

Having Kata Containers here, even restricted to only one of its drivers (for now), opens the path for future collaboration and gives Talos a reasonable path to becoming a TEE-capable Kubernetes distro.

For now we're sticking with Cloud Hypervisor as the preferred driver for Kata Containers. That could change in the future, but we don't want to start out by increasing the image size significantly, so we're taking the smallest footprint achievable with Kata Containers stable releases.

Depends on: siderolabs/talos#8287

Kata Containers: https://katacontainers.io/
Cloud Hypervisor: https://www.cloudhypervisor.org/
Confidential Containers: https://github.com/confidential-containers

@fidencio
Contributor Author

fidencio commented Dec 7, 2023

Note: this is x86_64-specific right now, but it could easily be modified to also work on aarch64.

@@ -66,6 +66,7 @@ TARGETS += i915-ucode
 TARGETS += intel-ice-firmware
 TARGETS += intel-ucode
 TARGETS += iscsi-tools
+TARGETS += kata-containers
Member

Could you add this to .kres.yaml and run make rekres? The Makefile is autogenerated.

Contributor Author

⋊> extensions on main ⨯ make rekres
latest: Pulling from siderolabs/kres
Digest: sha256:36a8b5e8947500373f12ae9149be7cab2186261d3bfa194171271c5a8944bcca
Status: Image is up to date for ghcr.io/siderolabs/kres:latest
ghcr.io/siderolabs/kres:latest
gen started
GITHUB_TOKEN is missing, GitHub API integration is disabled
Error: failed to parse remote URL: origin	https://github.com/siderolabs/extensions (fetch)
origin	https://github.com/siderolabs/extensions (push)
Usage:
  kres gen [flags]

Flags:
  -h, --help   help for gen

failed to parse remote URL: origin	https://github.com/siderolabs/extensions (fetch)
origin	https://github.com/siderolabs/extensions (push)
make: *** [Makefile:190: rekres] Error 1

Not exactly sure what I'm missing here.

Contributor Author

Even passing the GITHUB_TOKEN I get the same error, by the way.
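
The text that failed to parse contains two tab-separated lines that look like `git remote -v` output rather than a single URL. One way to see what a bare URL would look like, assuming (this thread does not confirm it) that the tool derives the remote from the local git setup, is sketched below. The helper name `first_remote_url` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: pull the bare URL out of `git remote -v`-style text.
# Input lines look like: "origin<TAB>https://host/org/repo (fetch)".
first_remote_url() {
    # awk splits on tabs and spaces by default, so field 2 of line 1 is the URL.
    awk 'NR == 1 { print $2 }'
}

# Usage (run inside the repository checkout):
#   git remote -v | first_remote_url
#   git remote get-url origin    # compare: is this a single line with only the URL?
```

If the second command prints more than one line, a local git alias or configuration may be worth checking; that is a guess, not a diagnosis confirmed here.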

@frezbo
Member

frezbo commented Dec 7, 2023

If you're interested in adding tests, you can take a look at https://github.com/siderolabs/talos/blob/main/internal/integration/api/extensions_qemu.go#L365

We could help you with testing; otherwise don't sweat it, I can add a test.

@fidencio
Contributor Author

fidencio commented Dec 7, 2023

> We could help you with testing; otherwise don't sweat it, I can add a test.

Adding a test should be quite straightforward from what I can see. I can add one.
The main question is, shall the test be added after this one gets merged? Is there a way to have the test added on a different repo depending on this PR here?

@frezbo
Member

frezbo commented Dec 7, 2023

> We could help you with testing; otherwise don't sweat it, I can add a test.
>
> Adding a test should be quite straightforward from what I can see. I can add one. The main question is, shall the test be added after this one gets merged? Is there a way to have the test added on a different repo depending on this PR here?

Currently the test can run in CI only after this gets merged. I usually test it locally, then approve the extensions PR.

@frezbo
Member

frezbo commented Dec 7, 2023

Once Talos also moves to using GHA for CI, we'll do some kind of cross-testing for PRs (planned for 1.7, might be a little late).

fidencio added a commit to fidencio/talos that referenced this pull request Dec 7, 2023
Let's add a very basic test for the Kata Containers extension, mimicking
what's already in place for gVisor.

This depends on the work being done in:
siderolabs/extensions#279

Signed-off-by: Fabiano Fidêncio <[email protected]>
@frezbo
Member

frezbo commented Dec 7, 2023

Here's a short summary of the steps I usually follow to test an extension:

Follow these to have a local setup: https://www.talos.dev/v1.5/advanced/developing-talos/

Run make <extension-name> REGISTRY=127.0.0.1:5005 PUSH=true

then, in the talos repo, run these:

make kernel initramfs
make talosctl
make imager PUSH=true
make image-installer IMAGER_ARGS="--system-extension-image=<previously pushed extension image>"
crane push _out/installer-amd64.tar 127.0.0.1:5005/<some username>/installer-<extension-name>-dirty

Then follow https://www.talos.dev/v1.5/advanced/developing-talos/#running-talos-cluster and replace the installer image with the one pushed before.

Then build the integration test binary with make _out/integration-test-linux-amd64

and run:

_out/integration-test-linux-amd64 -test.v -talos.failfast -talos.talosctlpath _out/talosctl-linux-amd64 -talos.kubectlpath _out/kubectl -talos.provisioner qemu -talos.name talos-default -talos.crashdump=false -talos.extensions.qemu -test.run TestIntegration/api.ExtensionsSuite/<TestName> .
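
The Talos-side steps above can be collected into a small helper that prints the commands for review before anything is run. This is only a sketch: the function names, the `someuser` namespace, and the way the values are wired together are hypothetical, while the make targets and the crane invocation are the ones listed in the comment. The registry address uses the 127.0.0.1:5005 value from the make step.

```shell
#!/bin/sh
# Placeholder values; adjust to the local setup.
REGISTRY="127.0.0.1:5005"        # local registry from the Talos dev setup
REGISTRY_USER="someuser"         # hypothetical registry namespace
EXTENSION="kata-containers"      # extension target added in this PR

# Image reference for the installer that bundles the extension.
installer_ref() {
    printf '%s/%s/installer-%s-dirty\n' "$REGISTRY" "$REGISTRY_USER" "$EXTENSION"
}

# Print the commands instead of executing them.
# $1 is the extension image reference pushed by the extensions repo build.
print_talos_steps() {
    cat <<EOF
make kernel initramfs
make talosctl
make imager PUSH=true
make image-installer IMAGER_ARGS="--system-extension-image=$1"
crane push _out/installer-amd64.tar $(installer_ref)
EOF
}

# Usage: print_talos_steps "<previously pushed extension image>"
```

Printing rather than executing keeps the sketch safe to source in any directory; the output can be pasted into a shell inside the talos checkout.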

- |
mkdir -p /tmp/kata-containers

tar xvJpf ata-static.tar.xz -C /tmp/kata-containers/
Contributor

Typo: ata-static.tar.xz -> kata-static.tar.xz

@frezbo
Member

frezbo commented Dec 13, 2023

Okay, it seems we cannot use the static archives from Kata; containerd-shim-kata-v2 seems to be linked against glibc.

@frezbo
Member

frezbo commented Dec 13, 2023

> Okay, it seems we cannot use the static archives from Kata; containerd-shim-kata-v2 seems to be linked against glibc.

Okay, I got around this by using the Rust runtime, which is statically compiled, but now getting this error:

  Warning  FailedCreatePodSandBox  10s   kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create containerd task: failed to create shim task: Others("failed to handler message try init runtime instance\n\nCaused by:\n    0: load config\n    1: load toml config\n    2: No such file or directory (os error 2)"): unknown
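
The cause chain in that event is easier to read once the literal `\n` sequences are expanded. A minimal sketch, assuming the message has been copied into a single-quoted shell string; `expand_error` is a hypothetical helper name.

```shell
#!/bin/sh
# Expand literal backslash escapes (such as "\n") embedded in an error string.
# printf's %b conversion interprets backslash escapes in its argument.
expand_error() {
    printf '%b\n' "$1"
}

# Usage:
#   expand_error 'try init runtime instance\n\nCaused by:\n    0: load config\n    1: load toml config'
```

With the escapes expanded, each numbered cause lands on its own line, so the innermost one (here, the missing file) is the last line printed.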

@frezbo
Member

frezbo commented Dec 13, 2023

> > Okay, it seems we cannot use the static archives from Kata; containerd-shim-kata-v2 seems to be linked against glibc.
>
> Okay, I got around this by using the Rust runtime, which is statically compiled, but now getting this error:
>
>   Warning  FailedCreatePodSandBox  10s   kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create containerd task: failed to create shim task: Others("failed to handler message try init runtime instance\n\nCaused by:\n    0: load config\n    1: load toml config\n    2: No such file or directory (os error 2)"): unknown

Fixed the config file path and now getting this error:

  Warning  FailedCreatePodSandBox  2s    kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create containerd task: failed to create shim task: Others("failed to handler message try init runtime instance\n\nCaused by:\n    0: load config\n    1: load toml config\n    2: Can not find plugin for hypervisor clh"): unknown

I'm not sure why; the hypervisor.clh section defines the path correctly:

talosctl -n 10.5.0.3 read /usr/local/share/kata-containers/configuration.toml | head -n 15

[hypervisor.clh]
path = "/usr/local/bin/cloud-hypervisor"
kernel = "/usr/local/share/kata-containers/vmlinux.container"
initrd = "/usr/local/share/kata-containers/kata-containers-initrd.img"

# rootfs filesystem type:
#   - ext4 (default)
#   - xfs
#   - erofs
rootfs_type="ext4"

# Enable confidential guest support.
# Toggling that setting may trigger different hardware features, ranging
# from memory encryption to both memory and CPU-state encryption and integrity.
# The Kata Containers runtime dynamically detects the available feature set and
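
To rule out a missing file, the paths referenced by the `[hypervisor.clh]` section can be extracted from the config text and then checked one by one on the node. The sketch below is a line-oriented extractor, not a full TOML parser: it only understands `key = "value"` lines with spaces around `=` and values without spaces, as in the excerpt above. The helper names `toml_get` and `clh_paths` are hypothetical.

```shell
#!/bin/sh
# Print the value of a key inside one TOML table. TOML text is read from stdin.
# $1 = table name (without brackets), $2 = key
toml_get() {
    awk -v section="[$1]" -v key="$2" '
        /^\[/ { in_section = ($0 == section) }     # track which table header we are under
        in_section && $1 == key && $2 == "=" {
            value = $3
            gsub(/"/, "", value)                   # strip the surrounding quotes
            print value
            exit
        }
    '
}

# Print the three file paths the clh hypervisor section points at, one per line.
clh_paths() {
    config=$(cat)
    for key in path kernel initrd; do
        printf '%s\n' "$config" | toml_get hypervisor.clh "$key"
    done
}

# Usage:
#   talosctl -n 10.5.0.3 read /usr/local/share/kata-containers/configuration.toml | clh_paths
```

Each printed path can then be inspected on the node, for example with `talosctl list`. Note that this only covers missing files; it would not by itself explain a "Can not find plugin" error.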

@fidencio fidencio force-pushed the topic/kata-containers-extension branch from a3158f1 to f8ba0b8 Compare December 13, 2023 09:25
@fidencio
Contributor Author

> Okay, it seems we cannot use the static archives from Kata; containerd-shim-kata-v2 seems to be linked against glibc.

It's statically linked against glibc, is this a problem?
We can easily switch to providing a binary that's linked against musl, I think.
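
Whether a given shim binary is statically or dynamically linked can be read from the description that file(1) prints. A minimal sketch that classifies that description; `linkage_of` is a hypothetical helper, and it only inspects the text passed to it.

```shell
#!/bin/sh
# Classify a binary as static or dynamic from the output of `file -b <binary>`.
# $1 = the description string printed by file(1)
linkage_of() {
    case "$1" in
        *"statically linked"*|*"static-pie linked"*) echo static ;;
        *"dynamically linked"*)                      echo dynamic ;;
        *)                                           echo unknown ;;
    esac
}

# Usage:
#   linkage_of "$(file -b /path/to/containerd-shim-kata-v2)"
```

This distinguishes static from dynamic linking only; it does not report which libc the binary was built against.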

@fidencio
Contributor Author

The Rust runtime is a no-go for now, as it's not in a stable release and is not passing all our tests.
We plan to officially switch to the Rust runtime as part of 4.0.0, but for now we should stick to the Go one, even if it means providing a musl-linked runtime.

@frezbo frezbo force-pushed the topic/kata-containers-extension branch from f8ba0b8 to 36f397c Compare December 13, 2023 12:12
@frezbo
Member

frezbo commented Dec 18, 2023

More updates:

  • building the Go runtime fixed the "Can not find plugin for hypervisor clh" error, but I then got a syslog error
  • adding a dummy syslog extension fixed that issue, and I'm now able to schedule a pod.
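
Scheduling a pod on the new runtime needs a RuntimeClass and a pod that requests it. The sketch below prints such a manifest. The handler name `kata` is an assumption, since this thread does not state the handler registered by the extension, and the pod name and pause image are placeholders chosen for illustration.

```shell
#!/bin/sh
# Emit a RuntimeClass plus a pod that requests it.
# ASSUMPTION: the containerd handler is named "kata"; pass another name as $1 if it differs.
kata_smoke_manifest() {
    handler="${1:-kata}"
    cat <<EOF
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: ${handler}
handler: ${handler}
---
apiVersion: v1
kind: Pod
metadata:
  name: kata-smoke-test
spec:
  runtimeClassName: ${handler}
  containers:
    - name: pause
      image: registry.k8s.io/pause:3.9
EOF
}

# Usage:
#   kata_smoke_manifest | kubectl apply -f -
```

If the pod reaches Running, the sandbox was created by the selected handler; a FailedCreatePodSandBox event points back at the runtime configuration.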

Kata Containers provides an OCI runtime that focuses on protecting the
host from malicious workloads, taking advantage of KVM to provide an
extra isolation layer.

Kata Containers is also the foundation piece for Confidential
Containers, as it's the most suitable OCI runtime to be used with
Trusted Execution Environments.

Having Kata Containers here, even restricted to only one of its drivers
(for now), opens the path for future collaboration and gives Talos a
reasonable path to becoming a TEE-capable Kubernetes distro.

For now we're sticking with Cloud Hypervisor as the preferred driver for
Kata Containers. That could change in the future, but we don't want to
start out by increasing the image size significantly, so we're taking
the smallest footprint achievable with Kata Containers stable releases.

Kata Containers: https://katacontainers.io/
Cloud Hypervisor: https://www.cloudhypervisor.org/
Confidential Containers: https://github.com/confidential-containers

Depends on: siderolabs/talos#8287

Signed-off-by: Fabiano Fidêncio <[email protected]>
Signed-off-by: Noel Georgi <[email protected]>
@frezbo frezbo force-pushed the topic/kata-containers-extension branch from 36f397c to ec1cf8c Compare February 20, 2024 10:35
@frezbo
Member

frezbo commented Feb 20, 2024

/ok-to-test

Member

@frezbo frezbo left a comment

Thank you ❤️! Now that siderolabs/talos#8287 is part of Talos, this can be merged.

@frezbo
Member

frezbo commented Feb 20, 2024

/m

@talos-bot talos-bot merged commit ec1cf8c into siderolabs:main Feb 20, 2024
14 checks passed
frezbo pushed a commit to fidencio/talos that referenced this pull request Feb 20, 2024
Let's add a very basic test for the Kata Containers extension, mimicking
what's already in place for gVisor.

This depends on the work being done in:
siderolabs/extensions#279

Signed-off-by: Fabiano Fidêncio <[email protected]>
Signed-off-by: Noel Georgi <[email protected]>
dsseng pushed a commit to dsseng/talos that referenced this pull request Mar 7, 2024
Let's add a very basic test for the Kata Containers extension, mimicking
what's already in place for gVisor.

This depends on the work being done in:
siderolabs/extensions#279

Signed-off-by: Fabiano Fidêncio <[email protected]>
Signed-off-by: Noel Georgi <[email protected]>