Commit

Add build scripts for building Nvidia and Neuron AMIs based on AL2023 (#1924)

Co-authored-by: Carter <[email protected]>
Co-authored-by: Nikolay Kvetsinski <[email protected]>
3 people authored Aug 29, 2024
1 parent 32bd8b4 commit a943086
Showing 33 changed files with 2,337 additions and 9 deletions.
8 changes: 7 additions & 1 deletion Makefile
Original file line number Diff line number Diff line change
@@ -30,6 +30,12 @@ ifeq ($(enable_fips), true)
AMI_VARIANT := $(AMI_VARIANT)-fips
endif

ifeq ($(os_distro), al2023)
ifdef enable_accelerator
AMI_VARIANT := $(AMI_VARIANT)-$(enable_accelerator)
endif
endif

ami_name ?= $(AMI_VARIANT)-node-$(K8S_VERSION_MINOR)-$(AMI_VERSION)

# ami owner overrides for cn/gov-cloud
@@ -91,7 +97,7 @@ validate: ## Validate packer config

.PHONY: k8s
k8s: validate ## Build default K8s version of EKS Optimized AMI
@echo "Building AMI [os_distro=$(os_distro) kubernetes_version=$(kubernetes_version) arch=$(arch)]"
@echo "Building AMI [os_distro=$(os_distro) kubernetes_version=$(kubernetes_version) arch=$(arch) $(if $(enable_accelerator),enable_accelerator=$(enable_accelerator))]"
$(PACKER_BINARY) build -timestamp-ui -color=false $(PACKER_ARGS) $(PACKER_TEMPLATE_FILE)

# DEPRECATION NOTICE: `make` targets for each Kubernetes minor version will not be added after 1.28
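For illustration, the way the new `enable_accelerator` conditional folds into `ami_name` can be sketched in shell. The variant prefix and version strings below are hypothetical example values, not ones produced by this Makefile:

```shell
# Hypothetical sketch of how the accelerator suffix composes into ami_name.
# "amazon-eks-node-al2023-x86_64" and "v20240829" are made-up example values.
AMI_VARIANT="amazon-eks-node-al2023-x86_64"
enable_accelerator="nvidia"

# Mirrors the Makefile's `ifdef enable_accelerator` branch.
if [ -n "${enable_accelerator}" ]; then
  AMI_VARIANT="${AMI_VARIANT}-${enable_accelerator}"
fi

ami_name="${AMI_VARIANT}-1.29-v20240829"
echo "${ami_name}"  # amazon-eks-node-al2023-x86_64-nvidia-1.29-v20240829
```

With `enable_accelerator` unset, the suffix is simply omitted and the name is unchanged from the non-accelerated build.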
16 changes: 15 additions & 1 deletion README.md
@@ -58,4 +58,18 @@ For security issues or concerns, please do not open an issue or pull request on

## ⚖️ License Summary

This sample code is made available under a modified MIT license. See the LICENSE file.
This sample code is made available under an MIT-0 license. See the LICENSE file.

Although this repository is released under the MIT license, when using NVIDIA accelerated AMIs you agree to the NVIDIA Cloud End User License Agreement: https://s3.amazonaws.com/EULA/NVidiaEULAforAWS.pdf.

Although this repository is released under the MIT license, NVIDIA accelerated AMIs
use the third party [open-gpu-kernel-modules](https://github.com/NVIDIA/open-gpu-kernel-modules). The open-gpu-kernel-modules project's licensing includes the dual MIT/GPLv2 license.

Although this repository is released under the MIT license, NVIDIA accelerated AMIs
use the third party [nvidia-container-toolkit](https://github.com/NVIDIA/nvidia-container-toolkit). The nvidia-container-toolkit project's licensing includes the Apache-2.0 license.

Although this repository is released under the MIT license, Neuron accelerated AMIs
use the third party [Neuron Driver](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/release-notes/runtime/aws-neuronx-dkms/index.html). The Neuron Driver project's licensing includes the GPLv2 license.

Although this repository is released under the MIT license, accelerated AMIs
use the third party [Elastic Fabric Adapter Driver](https://github.com/amzn/amzn-drivers/tree/master/kernel/linux/efa). The Elastic Fabric Adapter Driver project's licensing includes the GPLv2 license.
22 changes: 22 additions & 0 deletions doc/usage/al2023.md
@@ -5,6 +5,7 @@
<!-- template-variable-table-boundary -->
| Variable | Description |
| - | - |
| `enable_accelerator` | Vendor that provides the GPU or accelerator hardware. Currently Neuron and NVIDIA are supported. |
| `ami_component_description` | |
| `ami_description` | |
| `ami_name` | |
@@ -23,11 +24,13 @@
| `enable_fips` | Install openssl and enable fips related kernel parameters |
| `encrypted` | |
| `iam_instance_profile` | The name of an IAM instance profile to launch the EC2 instance with. |
| `enable_efa` | Valid options are ```true``` or ```false```. Whether or not to install the software needed to use AWS Elastic Fabric Adapter (EFA) network interfaces. |
| `instance_type` | |
| `kms_key_id` | |
| `kubernetes_build_date` | |
| `kubernetes_version` | |
| `launch_block_device_mappings_volume_size` | |
| `nvidia_driver_major_version` | To be used only when ```enable_accelerator = nvidia```. Driver major version to install; depends on what is available in the NVIDIA repository. |
| `remote_folder` | Directory path for shell provisioner scripts on the builder instance |
| `runc_version` | |
| `security_group_id` | |
@@ -44,3 +47,22 @@
| `volume_type` | |
| `working_dir` | Directory path for ephemeral resources on the builder instance |
<!-- template-variable-table-boundary -->

## Accelerated images

You can build images that contain the Neuron or NVIDIA drivers and runtime configuration. To build a Neuron image, execute:

```
make k8s=1.29 os_distro=al2023 enable_accelerator=neuron enable_efa=true
```

To build an NVIDIA image, execute:
```
make k8s=1.29 os_distro=al2023 enable_accelerator=nvidia enable_efa=true
```

You can pass the NVIDIA driver major version explicitly:
```
make k8s=1.29 os_distro=al2023 enable_accelerator=nvidia enable_efa=true nvidia_driver_major_version=560
```
To see which driver versions are available, check the NVIDIA AL2023 [repository](https://developer.download.nvidia.com/compute/cuda/repos/amzn2023/).
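As a rough sketch, the major versions can be extracted from a repository package listing. The package names below are fabricated samples; a real check would fetch the index from the repository URL above:

```shell
# Sketch: extracting NVIDIA driver major versions from a package listing.
# The two package names are hypothetical samples, not a live repo index.
listing='nvidia-driver-555.42.06-1.amzn2023.x86_64.rpm
nvidia-driver-560.35.03-1.amzn2023.x86_64.rpm'

# Capture the leading major version of each package name, deduplicated.
majors="$(printf '%s\n' "${listing}" \
  | sed -nE 's/^nvidia-driver-([0-9]+)\..*$/\1/p' \
  | sort -u | xargs)"
echo "${majors}"  # 555 560
```

Any of the listed majors could then be passed as `nvidia_driver_major_version` to the `make` invocation above.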
11 changes: 9 additions & 2 deletions nodeadm/internal/containerd/config.go
@@ -25,7 +25,9 @@ var (
)

type containerdTemplateVars struct {
SandboxImage string
SandboxImage string
RuntimeName string
RuntimeBinaryName string
}

func writeContainerdConfig(cfg *api.NodeConfig) error {
@@ -37,6 +39,7 @@ func writeContainerdConfig(cfg *api.NodeConfig) error {
if err != nil {
return err
}

// because the logic in containerd's import merge decides to completely
// overwrite entire sections, we want to implement this merging ourselves.
// see: https://github.com/containerd/containerd/blob/a91b05d99ceac46329be06eb43f7ae10b89aad45/cmd/containerd/server/config/config.go#L407-L431
@@ -56,8 +59,12 @@
}

func generateContainerdConfig(cfg *api.NodeConfig) ([]byte, error) {
instanceOptions := applyInstanceTypeMixins(cfg.Status.Instance.Type)

configVars := containerdTemplateVars{
SandboxImage: cfg.Status.Defaults.SandboxImage,
SandboxImage: cfg.Status.Defaults.SandboxImage,
RuntimeBinaryName: instanceOptions.RuntimeBinaryName,
RuntimeName: instanceOptions.RuntimeName,
}
var buf bytes.Buffer
if err := containerdConfigTemplate.Execute(&buf, configVars); err != nil {
7 changes: 4 additions & 3 deletions nodeadm/internal/containerd/config.template.toml
@@ -6,7 +6,7 @@ state = "/run/containerd"
address = "/run/containerd/containerd.sock"

[plugins."io.containerd.grpc.v1.cri".containerd]
default_runtime_name = "runc"
default_runtime_name = "{{.RuntimeName}}"
discard_unpacked_layers = true

[plugins."io.containerd.grpc.v1.cri"]
@@ -15,11 +15,12 @@ sandbox_image = "{{.SandboxImage}}"
[plugins."io.containerd.grpc.v1.cri".registry]
config_path = "/etc/containerd/certs.d:/etc/docker/certs.d"

[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.{{.RuntimeName}}]
runtime_type = "io.containerd.runc.v2"
base_runtime_spec = "/etc/containerd/base-runtime-spec.json"

[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.{{.RuntimeName}}.options]
BinaryName = "{{.RuntimeBinaryName}}"
SystemdCgroup = true

[plugins."io.containerd.grpc.v1.cri".cni]
65 changes: 65 additions & 0 deletions nodeadm/internal/containerd/runtime_config.go
@@ -0,0 +1,65 @@
package containerd

import (
"slices"
"strings"

"go.uber.org/zap"
)

type instanceOptions struct {
RuntimeName string
RuntimeBinaryName string
}

type instanceTypeMixin struct {
instanceFamilies []string
apply func() instanceOptions
}

func (m *instanceTypeMixin) matches(instanceType string) bool {
instanceFamily := strings.Split(instanceType, ".")[0]
return slices.Contains(m.instanceFamilies, instanceFamily)
}

var (
// TODO: fetch this list dynamically
nvidiaInstances = []string{"p3", "p3dn", "p4d", "p4de", "p5", "g4", "g4dn", "g5", "g6", "g6e"}
NvidiaInstanceTypeMixin = instanceTypeMixin{
instanceFamilies: nvidiaInstances,
apply: applyNvidia,
}

mixins = []instanceTypeMixin{
NvidiaInstanceTypeMixin,
}
)

const nvidiaRuntimeName = "nvidia"
const nvidiaRuntimeBinaryName = "/usr/bin/nvidia-container-runtime"
const defaultRuntimeName = "runc"
const defaultRuntimeBinaryName = "/usr/sbin/runc"

// applyInstanceTypeMixins returns the containerd runtime options for
// config.toml based on the instance family, falling back to the defaults
func applyInstanceTypeMixins(instanceType string) instanceOptions {
for _, mixin := range mixins {
if mixin.matches(instanceType) {
return mixin.apply()
}
}
zap.L().Info("No instance-specific containerd runtime configuration needed", zap.String("instanceType", instanceType))
return applyDefault()
}

// applyNvidia adds the needed NVIDIA containerd options
func applyNvidia() instanceOptions {
zap.L().Info("Configuring NVIDIA runtime")
return instanceOptions{RuntimeName: nvidiaRuntimeName, RuntimeBinaryName: nvidiaRuntimeBinaryName}
}

// applyDefault adds the default runc containerd options
func applyDefault() instanceOptions {
zap.L().Info("Configuring default runtime")
return instanceOptions{RuntimeName: defaultRuntimeName, RuntimeBinaryName: defaultRuntimeBinaryName}
}
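The dispatch in `applyInstanceTypeMixins` can be illustrated standalone: the instance family is the prefix before the first `.`, matched against a known list. A minimal sketch, with an abridged family list (the real list above is longer):

```go
// Sketch of the instance-family dispatch: split off the family prefix and
// look it up. The family list here is abridged for illustration only.
package main

import (
	"fmt"
	"slices"
	"strings"
)

var nvidiaFamilies = []string{"p3", "p4d", "p5", "g5", "g6"} // abridged

// runtimeFor returns the containerd runtime name for an instance type.
func runtimeFor(instanceType string) string {
	family := strings.Split(instanceType, ".")[0]
	if slices.Contains(nvidiaFamilies, family) {
		return "nvidia"
	}
	return "runc" // default for everything else, including Neuron instances
}

func main() {
	fmt.Println(runtimeFor("g5.2xlarge")) // nvidia
	fmt.Println(runtimeFor("m5.xlarge"))  // runc
}
```

Note that Neuron instance families are deliberately absent: as the tests below show, they keep the default `runc` runtime.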
31 changes: 31 additions & 0 deletions nodeadm/internal/containerd/runtime_config_test.go
@@ -0,0 +1,31 @@
package containerd

import (
"reflect"
"testing"
)

func TestApplyInstanceTypeMixins(t *testing.T) {

var nvidiaExpectedOutput = instanceOptions{RuntimeName: "nvidia", RuntimeBinaryName: "/usr/bin/nvidia-container-runtime"}
var neuronExpectedOutput = instanceOptions{RuntimeName: "runc", RuntimeBinaryName: "/usr/sbin/runc"}
var nonAcceleratedExpectedOutput = instanceOptions{RuntimeName: "runc", RuntimeBinaryName: "/usr/sbin/runc"}

var tests = []struct {
name string
instanceType string
expectedOutput instanceOptions
}{
{name: "nvidia_test", instanceType: "p5.xlarge", expectedOutput: nvidiaExpectedOutput},
{name: "neuron_test", instanceType: "inf2.xlarge", expectedOutput: neuronExpectedOutput},
// non accelerated instance
{name: "non_accelerated_test", instanceType: "m5.xlarge", expectedOutput: nonAcceleratedExpectedOutput},
}
for _, test := range tests {
actual := applyInstanceTypeMixins(test.instanceType)

if !reflect.DeepEqual(actual, test.expectedOutput) {
t.Fatalf("unexpected output in test case %s: %+v, expecting: %+v", test.name, actual, test.expectedOutput)
}
}
}
@@ -23,6 +23,7 @@ base_runtime_spec = '/etc/containerd/base-runtime-spec.json'
runtime_type = 'io.containerd.runc.v2'

[plugins.'io.containerd.grpc.v1.cri'.containerd.runtimes.runc.options]
BinaryName = '/usr/sbin/runc'
SystemdCgroup = true

[plugins.'io.containerd.grpc.v1.cri'.registry]
@@ -0,0 +1,18 @@
---
apiVersion: node.eks.aws/v1alpha1
kind: NodeConfig
spec:
cluster:
name: my-cluster
apiServerEndpoint: https://example.com
certificateAuthority: Y2VydGlmaWNhdGVBdXRob3JpdHk=
cidr: 10.100.0.0/16
containerd:
config: |
version = 2
[grpc]
address = "/run/foo/foo.sock"
[plugins."io.containerd.grpc.v1.cri".containerd]
discard_unpacked_layers = false
@@ -0,0 +1,30 @@
root = '/var/lib/containerd'
state = '/run/containerd'
version = 2

[grpc]
address = '/run/foo/foo.sock'

[plugins]
[plugins.'io.containerd.grpc.v1.cri']
sandbox_image = '602401143452.dkr.ecr.us-west-2.amazonaws.com/eks/pause:3.5'

[plugins.'io.containerd.grpc.v1.cri'.cni]
bin_dir = '/opt/cni/bin'
conf_dir = '/etc/cni/net.d'

[plugins.'io.containerd.grpc.v1.cri'.containerd]
default_runtime_name = 'runc'
discard_unpacked_layers = false

[plugins.'io.containerd.grpc.v1.cri'.containerd.runtimes]
[plugins.'io.containerd.grpc.v1.cri'.containerd.runtimes.runc]
base_runtime_spec = '/etc/containerd/base-runtime-spec.json'
runtime_type = 'io.containerd.runc.v2'

[plugins.'io.containerd.grpc.v1.cri'.containerd.runtimes.runc.options]
BinaryName = '/usr/sbin/runc'
SystemdCgroup = true

[plugins.'io.containerd.grpc.v1.cri'.registry]
config_path = '/etc/containerd/certs.d:/etc/docker/certs.d'
15 changes: 15 additions & 0 deletions nodeadm/test/e2e/cases/containerd-runtime-config-neuron/run.sh
@@ -0,0 +1,15 @@
#!/usr/bin/env bash

set -o errexit
set -o nounset
set -o pipefail

source /helpers.sh

mock::aws /etc/aemm-inf1-config.json
mock::kubelet 1.27.0
wait::dbus-ready

nodeadm init --skip run --config-source file://config.yaml

assert::files-equal /etc/containerd/config.toml expected-containerd-config.toml
@@ -0,0 +1,18 @@
---
apiVersion: node.eks.aws/v1alpha1
kind: NodeConfig
spec:
cluster:
name: my-cluster
apiServerEndpoint: https://example.com
certificateAuthority: Y2VydGlmaWNhdGVBdXRob3JpdHk=
cidr: 10.100.0.0/16
containerd:
config: |
version = 2
[grpc]
address = "/run/foo/foo.sock"
[plugins."io.containerd.grpc.v1.cri".containerd]
discard_unpacked_layers = false
@@ -0,0 +1,30 @@
root = '/var/lib/containerd'
state = '/run/containerd'
version = 2

[grpc]
address = '/run/foo/foo.sock'

[plugins]
[plugins.'io.containerd.grpc.v1.cri']
sandbox_image = '602401143452.dkr.ecr.us-west-2.amazonaws.com/eks/pause:3.5'

[plugins.'io.containerd.grpc.v1.cri'.cni]
bin_dir = '/opt/cni/bin'
conf_dir = '/etc/cni/net.d'

[plugins.'io.containerd.grpc.v1.cri'.containerd]
default_runtime_name = 'nvidia'
discard_unpacked_layers = false

[plugins.'io.containerd.grpc.v1.cri'.containerd.runtimes]
[plugins.'io.containerd.grpc.v1.cri'.containerd.runtimes.nvidia]
base_runtime_spec = '/etc/containerd/base-runtime-spec.json'
runtime_type = 'io.containerd.runc.v2'

[plugins.'io.containerd.grpc.v1.cri'.containerd.runtimes.nvidia.options]
BinaryName = '/usr/bin/nvidia-container-runtime'
SystemdCgroup = true

[plugins.'io.containerd.grpc.v1.cri'.registry]
config_path = '/etc/containerd/certs.d:/etc/docker/certs.d'
15 changes: 15 additions & 0 deletions nodeadm/test/e2e/cases/containerd-runtime-config-nvidia/run.sh
@@ -0,0 +1,15 @@
#!/usr/bin/env bash

set -o errexit
set -o nounset
set -o pipefail

source /helpers.sh

mock::aws /etc/aemm-g5-config.json
mock::kubelet 1.27.0
wait::dbus-ready

nodeadm init --skip run --config-source file://config.yaml

assert::files-equal /etc/containerd/config.toml expected-containerd-config.toml
3 changes: 3 additions & 0 deletions nodeadm/test/e2e/infra/Dockerfile
@@ -34,6 +34,9 @@ COPY --from=imds-mock-build /imds-mock /usr/local/bin/imds-mock
# certificateAuthority: Y2VydGlmaWNhdGVBdXRob3JpdHk=
# cidr: 10.100.0.0/16
COPY test/e2e/infra/aemm-default-config.json /etc/aemm-default-config.json
COPY test/e2e/infra/aemm-inf1-config.json /etc/aemm-inf1-config.json
COPY test/e2e/infra/aemm-g5-config.json /etc/aemm-g5-config.json
COPY test/e2e/infra/nvidia-ctk /usr/bin/nvidia-ctk
COPY --from=nodeadm-build /nodeadm /usr/local/bin/nodeadm
COPY test/e2e/infra/systemd/kubelet.service /usr/lib/systemd/system/kubelet.service
COPY test/e2e/infra/systemd/containerd.service /usr/lib/systemd/system/containerd.service
