Customize tinkerbell load balancer interface (#8805)
* Customize tinkerbell load balancer interface

* Add mini design doc for configuring tinkerbell stack LB interface

* Add loadBalancerInterface as an optional configuration to the baremetal configuration docs
sp1999 authored Sep 30, 2024
1 parent 519faa3 commit f613daf
Showing 11 changed files with 186 additions and 17 deletions.
Expand Up @@ -41,6 +41,10 @@ spec:
description: HookImagesURLPath can be used to override the default
Hook images path to pull from a local server.
type: string
loadBalancerInterface:
description: LoadBalancerInterface can be used to configure a load
balancer interface for the Tinkerbell stack.
type: string
osImageURL:
description: OSImageURL can be used to override the default OS image
path to pull from a local server. OSImageURL is a URL to the OS
4 changes: 4 additions & 0 deletions config/manifest/eksa-components.yaml
Expand Up @@ -6791,6 +6791,10 @@ spec:
description: HookImagesURLPath can be used to override the default
Hook images path to pull from a local server.
type: string
loadBalancerInterface:
description: LoadBalancerInterface can be used to configure a load
balancer interface for the Tinkerbell stack.
type: string
osImageURL:
description: OSImageURL can be used to override the default OS image
path to pull from a local server. OSImageURL is a URL to the OS
82 changes: 82 additions & 0 deletions designs/tinkerbell-stack-load-balancer-interface.md
@@ -0,0 +1,82 @@
# Tinkerbell stack load balancer interface customization

## Problem Statement:

Customers want to specify the network interface used by the Tinkerbell stack load balancer, overriding the default interface chosen by the current load balancer (kube-vip) daemonset. Today this can be done by setting the [vip_interface](https://github.com/kube-vip/kube-vip/blob/04ce471366c21d4586fb2d683cd166f0dc4e18ce/pkg/kubevip/config_envvar.go#L34) environment variable on the kube-vip daemonset after the cluster is created, but that change does not persist when the management cluster is upgraded. To solve this, we will let users specify the interface in the cluster spec and configure it in our kube-vip daemonset. This doc proposes where the interface should be specified in the cluster spec and discusses the trade-offs against an alternative option.

## Proposed Solution:

Specify the interface in the TinkerbellDatacenterConfig object spec at the root level:

**API Schema:**

```
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: Cluster
metadata:
  name: mgmt-cluster
spec:
  ...
  kubernetesVersion: "1.30"
  ...
---
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: TinkerbellDatacenterConfig
metadata:
  name: sjparekh-mgmt
spec:
  ...
  tinkerbellIP: "x.x.x.x"
  osImageURL: "https://s3-bucket-url/ubuntu.gz"
  loadBalancerInterface: "eth0"
  skipLoadBalancerDeployment: false
  ...
```

**Tradeoffs:**

This approach puts the interface on a Tinkerbell-specific custom resource that already holds other Tinkerbell configuration, so it is a natural place for it. The drawback is that we are adding a field to the Tinkerbell datacenter config custom resource even though the field is not directly related to the datacenter.

Another drawback is that adding more fields in the future will require changing the API again. On the other hand, a dedicated field lets us validate the value and fail fast if it is misconfigured.
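
To illustrate what failing fast on a misconfigured value could look like, here is a minimal sketch of a validation function. It is hypothetical and not shown in this change: the function name and the specific rules are assumptions. The rules follow the Linux limits on interface names (at most 15 characters, no slash or whitespace), and an empty value is accepted because the field is optional.

```go
package main

import (
	"fmt"
	"strings"
)

// maxInterfaceNameLen is the longest interface name Linux accepts
// (IFNAMSIZ is 16, which includes the trailing NUL byte).
const maxInterfaceNameLen = 15

// validateLoadBalancerInterface is a hypothetical check, not part of this change.
// An empty value is valid because the field is optional.
func validateLoadBalancerInterface(name string) error {
	if name == "" {
		return nil
	}
	if len(name) > maxInterfaceNameLen {
		return fmt.Errorf("loadBalancerInterface %q is longer than %d characters", name, maxInterfaceNameLen)
	}
	if strings.ContainsAny(name, "/ \t\n") {
		return fmt.Errorf("loadBalancerInterface %q contains a slash or whitespace", name)
	}
	return nil
}

func main() {
	for _, name := range []string{"", "eth0", "eth 0", "averyveryverylonginterface"} {
		fmt.Printf("%q -> %v\n", name, validateLoadBalancerInterface(name))
	}
}
```

A check like this can only confirm that the name is well formed. Whether the interface actually exists on the nodes cannot be known when the spec is parsed.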

**Implementation Details:**

The load balancer interface specified in the cluster spec will be passed to the tinkerbell stack helm chart through the [createValuesOverride](https://github.com/aws/eks-anywhere/blob/e24df70ec55e1be403e19685aded8850d3c45dad/pkg/providers/tinkerbell/stack/stack.go#L511) method of the [Installer](https://github.com/aws/eks-anywhere/blob/e24df70ec55e1be403e19685aded8850d3c45dad/pkg/providers/tinkerbell/stack/stack.go#L78C6-L78C15) struct when installing the stack during cluster create/upgrade operations.

The upstream helm chart template already handles setting the [vip_interface](https://github.com/tinkerbell/charts/blob/95df5bc5f89c76dd0f6cc2955bb590f023d94f28/tinkerbell/stack/templates/kubevip.yaml#L34C9-L37C19) env variable in the kube-vip daemonset with the interface value from the [values.yaml](https://github.com/tinkerbell/charts/blob/95df5bc5f89c76dd0f6cc2955bb590f023d94f28/tinkerbell/stack/values.yaml#L38) file.
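
The flow described above can be sketched in a few lines of Go. This is a simplified stand-in, not the real `stack.Installer`: the `kubevipValues` helper is hypothetical and builds only the `stack.kubevip` portion of the values, whereas the real `createValuesOverride` builds the full map. The key point it shows is that the `interface` key is set only when the user supplied a value, so the chart's own default applies otherwise.

```go
package main

import "fmt"

// Installer is a trimmed-down stand-in for the real stack.Installer.
type Installer struct {
	loadBalancerInterface string
}

// InstallOption mutates an Installer before the stack is installed.
type InstallOption func(*Installer)

// WithLoadBalancerInterface records the interface requested in the cluster spec.
func WithLoadBalancerInterface(iface string) InstallOption {
	return func(s *Installer) { s.loadBalancerInterface = iface }
}

// kubevipValues is a hypothetical helper that builds only the
// stack.kubevip part of the helm values override.
func (s *Installer) kubevipValues(loadBalancerIP string) map[string]interface{} {
	kubevip := map[string]interface{}{
		"enabled":        true,
		"loadBalancerIP": loadBalancerIP,
	}
	// Override only when a value was provided, so the chart default is kept otherwise.
	if s.loadBalancerInterface != "" {
		kubevip["interface"] = s.loadBalancerInterface
	}
	return map[string]interface{}{
		"stack": map[string]interface{}{"kubevip": kubevip},
	}
}

func main() {
	s := &Installer{}
	for _, opt := range []InstallOption{WithLoadBalancerInterface("eth0")} {
		opt(s)
	}
	fmt.Println(s.kubevipValues("1.2.3.4"))
}
```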

**Testing:**

* E2E tests would be added to verify that the load balancer is indeed deployed with the expected interface
* Unit tests would be added for any functional changes implemented


**Documentation:**

* Add `loadBalancerInterface` as an optional configuration for the Tinkerbell datacenter config [fields](https://anywhere.eks.amazonaws.com/docs/getting-started/baremetal/bare-spec/#tinkerbelldatacenterconfig-fields) in the EKS Anywhere docs.
* Document that in single-node clusters, the same interface is used for load balancing both the Tinkerbell stack and the control plane components. Users who want separate interfaces in a single-node cluster have to skip deploying kube-vip and deploy their own load balancers instead, one for the control plane components and one for the Tinkerbell stack, each configured with its own interface.
* Specify that new nodes will not be rolled out when the cluster is created or upgraded with the custom interface.

## Alternate Solutions Considered:

Specify the interface by exposing a tinkerbell config map at the root level of the cluster spec

**API Schema:**

```
apiVersion: anywhere.eks.amazonaws.com/v1alpha1
kind: Cluster
metadata:
  name: mgmt-cluster
spec:
  ...
  kubernetesVersion: "1.30"
  tinkerbellConfig:
    loadBalancerInterface: "eth0"
  ...
```

**Tradeoffs:**

This approach would let us add more Tinkerbell configuration fields in the future without changing the API. The drawback is that it adds provider-specific configuration to the root level of the cluster spec, which does not seem appropriate, and no other provider has an equivalent.
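
The validation side of this trade-off can be illustrated with plain `encoding/json`. The sketch below is an assumption-laden illustration, not how EKS Anywhere parses cluster specs: `typedSpec` and `mapSpec` are hypothetical stand-ins for the two API shapes, and `decodeStrict` is a helper invented for this example. With a dedicated typed field, a misspelled key can be rejected during decoding, while a free-form map accepts any key and needs separate validation.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// typedSpec mirrors the proposed approach: a dedicated, typed field.
type typedSpec struct {
	LoadBalancerInterface string `json:"loadBalancerInterface,omitempty"`
}

// mapSpec mirrors the alternative: a free-form map at the root of the cluster spec.
type mapSpec struct {
	TinkerbellConfig map[string]string `json:"tinkerbellConfig,omitempty"`
}

// decodeStrict rejects fields that the target type does not declare.
func decodeStrict(data []byte, out interface{}) error {
	dec := json.NewDecoder(bytes.NewReader(data))
	dec.DisallowUnknownFields()
	return dec.Decode(out)
}

func main() {
	// Both inputs misspell the key as "loadBalancerInterfce".
	typo := []byte(`{"loadBalancerInterfce":"eth0"}`)
	fmt.Println("typed field:", decodeStrict(typo, &typedSpec{}))

	nested := []byte(`{"tinkerbellConfig":{"loadBalancerInterfce":"eth0"}}`)
	fmt.Println("free-form map:", decodeStrict(nested, &mapSpec{}))
}
```

The typed decode returns an error for the unknown field, while the map decode succeeds and silently carries the misspelled key.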
3 changes: 3 additions & 0 deletions docs/content/en/docs/getting-started/baremetal/bare-spec.md
Expand Up @@ -301,6 +301,9 @@ EKS Anywhere for Bare Metal uses `kube-vip` load balancer by default to expose t
You can disable this feature by setting this field to `true`.
>**_NOTE:_** If you skip load balancer deployment, you will have to ensure that the Tinkerbell stack is available at [tinkerbellIP]({{< relref "#tinkerbellip-required" >}}) once the cluster creation is finished. One way to achieve this is by using the [MetalLB]({{< relref "../../packages/metallb" >}}) package.

### loadBalancerInterface (optional)
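Optional field to configure the network interface that the `kube-vip` load balancer uses for the Tinkerbell stack. If this field is not set, the default from the Tinkerbell stack chart is used.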

## TinkerbellMachineConfig Fields
In the example, there are `TinkerbellMachineConfig` sections for control plane (`my-cluster-name-cp`) and worker (`my-cluster-name`) machine groups.
The following fields identify information needed to configure the nodes in each of those groups.
2 changes: 2 additions & 0 deletions pkg/api/v1alpha1/tinkerbelldatacenterconfig_types.go
Expand Up @@ -24,6 +24,8 @@ type TinkerbellDatacenterConfigSpec struct {
// SkipLoadBalancerDeployment when set to "true" can be used to skip deploying a load balancer to expose Tinkerbell stack.
// Users will need to deploy and configure a load balancer manually after the cluster is created.
SkipLoadBalancerDeployment bool `json:"skipLoadBalancerDeployment,omitempty"`
// LoadBalancerInterface can be used to configure a load balancer interface for the Tinkerbell stack.
LoadBalancerInterface string `json:"loadBalancerInterface,omitempty"`
}

// TinkerbellDatacenterConfigStatus defines the observed state of TinkerbellDatacenterConfig
2 changes: 2 additions & 0 deletions pkg/providers/tinkerbell/create.go
Expand Up @@ -48,6 +48,7 @@ func (p *Provider) PreCAPIInstallOnBootstrap(ctx context.Context, cluster *types
p.tinkerbellIP,
cluster.KubeconfigFile,
p.datacenterConfig.Spec.HookImagesURLPath,
stack.WithLoadBalancerInterface(p.datacenterConfig.Spec.LoadBalancerInterface),
stack.WithBootsOnDocker(),
stack.WithHostNetworkEnabled(true), // enable host network on bootstrap cluster
stack.WithLoadBalancerEnabled(false),
Expand Down Expand Up @@ -107,6 +108,7 @@ func (p *Provider) PostWorkloadInit(ctx context.Context, cluster *types.Cluster,
p.templateBuilder.datacenterSpec.TinkerbellIP,
cluster.KubeconfigFile,
p.datacenterConfig.Spec.HookImagesURLPath,
stack.WithLoadBalancerInterface(p.datacenterConfig.Spec.LoadBalancerInterface),
stack.WithBootsOnKubernetes(),
stack.WithHostNetworkEnabled(false), // disable host network on workload cluster
stack.WithStackServiceEnabled(true), // use stack service on workload cluster
45 changes: 29 additions & 16 deletions pkg/providers/tinkerbell/stack/stack.go
Expand Up @@ -30,6 +30,7 @@ const (
port = "port"
addr = "addr"
enabled = "enabled"
kubevipInterface = "interface"

boots = "boots"
smee = "smee"
Expand Down Expand Up @@ -76,22 +77,30 @@ type StackInstaller interface {
}

type Installer struct {
docker Docker
filewriter filewriter.FileWriter
helm Helm
podCidrRange string
registryMirror *registrymirror.RegistryMirror
proxyConfig *v1alpha1.ProxyConfiguration
namespace string
bootsOnDocker bool
hostNetwork bool
loadBalancer bool
stackService bool
dhcpRelay bool
docker Docker
filewriter filewriter.FileWriter
helm Helm
podCidrRange string
registryMirror *registrymirror.RegistryMirror
proxyConfig *v1alpha1.ProxyConfiguration
namespace string
loadBalancerInterface string
bootsOnDocker bool
hostNetwork bool
loadBalancer bool
stackService bool
dhcpRelay bool
}

type InstallOption func(s *Installer)

// WithLoadBalancerInterface is an InstallOption that allows you to configure load balancer interface for the tinkerbell stack.
func WithLoadBalancerInterface(loadBalancerInterface string) InstallOption {
return func(s *Installer) {
s.loadBalancerInterface = loadBalancerInterface
}
}

// WithBootsOnDocker is an InstallOption to run Boots as a Docker container.
func WithBootsOnDocker() InstallOption {
return func(s *Installer) {
Expand Down Expand Up @@ -141,7 +150,7 @@ func (s *Installer) AddNoProxyIP(IP string) {
}

// NewInstaller returns a Tinkerbell StackInstaller which can be used to install or uninstall the Tinkerbell stack.
func NewInstaller(docker Docker, filewriter filewriter.FileWriter, helm Helm, namespace string, podCidrRange string, registryMirror *registrymirror.RegistryMirror, proxyConfig *v1alpha1.ProxyConfiguration) StackInstaller {
func NewInstaller(docker Docker, filewriter filewriter.FileWriter, helm Helm, namespace, podCidrRange string, registryMirror *registrymirror.RegistryMirror, proxyConfig *v1alpha1.ProxyConfiguration) StackInstaller {
return &Installer{
docker: docker,
filewriter: filewriter,
Expand Down Expand Up @@ -177,7 +186,7 @@ func (s *Installer) Install(ctx context.Context, bundle releasev1alpha1.Tinkerbe
return fmt.Errorf("parsing hookOverride: %v", err)
}

valuesMap := s.createValuesOverride(bundle, bootEnv, tinkerbellIP, osiePath)
valuesMap := s.createValuesOverride(bundle, bootEnv, tinkerbellIP, s.loadBalancerInterface, osiePath)

values, err := yaml.Marshal(valuesMap)
if err != nil {
Expand Down Expand Up @@ -373,7 +382,7 @@ func (s *Installer) Upgrade(ctx context.Context, bundle releasev1alpha1.Tinkerbe
return fmt.Errorf("parsing hookOverride: %v", err)
}

valuesMap := s.createValuesOverride(bundle, bootEnv, tinkerbellIP, osiePath)
valuesMap := s.createValuesOverride(bundle, bootEnv, tinkerbellIP, s.loadBalancerInterface, osiePath)

values, err := yaml.Marshal(valuesMap)
if err != nil {
Expand Down Expand Up @@ -508,7 +517,7 @@ func (s *Installer) HasLegacyChart(ctx context.Context, bundle releasev1alpha1.T
}

// createValuesOverride generates the values override file to send to helm.
func (s *Installer) createValuesOverride(bundle releasev1alpha1.TinkerbellBundle, bootEnv []string, tinkerbellIP string, osiePath *url.URL) map[string]interface{} {
func (s *Installer) createValuesOverride(bundle releasev1alpha1.TinkerbellBundle, bootEnv []string, tinkerbellIP, loadBalancerInterface string, osiePath *url.URL) map[string]interface{} {
valuesMap := map[string]interface{}{
tink: map[string]interface{}{
controller: map[string]interface{}{
Expand Down Expand Up @@ -592,5 +601,9 @@ func (s *Installer) createValuesOverride(bundle releasev1alpha1.TinkerbellBundle
},
}

if loadBalancerInterface != "" {
valuesMap[stack].(map[string]interface{})[kubevip].(map[string]interface{})[kubevipInterface] = loadBalancerInterface
}

return valuesMap
}
5 changes: 5 additions & 0 deletions pkg/providers/tinkerbell/stack/stack_test.go
Expand Up @@ -154,6 +154,11 @@ func TestTinkerbellStackInstallWithDifferentOptions(t *testing.T) {
expectedFile: "testdata/expected_with_load_balancer_enabled_false.yaml",
opts: []stack.InstallOption{stack.WithLoadBalancerEnabled(false)},
},
{
name: "with_load_balancer_interface",
expectedFile: "testdata/expected_with_load_balancer_interface.yaml",
opts: []stack.InstallOption{stack.WithLoadBalancerInterface("test-interface")},
},
{
name: "with_kubernetes_options",
expectedFile: "testdata/expected_with_kubernetes_options.yaml",
@@ -0,0 +1,51 @@
hegel:
image: public.ecr.aws/eks-anywhere/hegel:latest
trustedProxies:
- 192.168.0.0/16
rufio:
additionalArgs:
- -metrics-bind-address=127.0.0.1:8080
image: public.ecr.aws/eks-anywhere/rufio:latest
smee:
deploy: true
http:
additionalKernelArgs: []
osieUrl:
host: anywhere-assests.eks.amazonaws.com
path: /tinkerbell/hook
port: ""
scheme: https
tinkServer:
ip: 1.2.3.4
port: "42113"
image: public.ecr.aws/eks-anywhere/boots:latest
publicIP: 1.2.3.4
tinkWorkerImage: public.ecr.aws/eks-anywhere/tink-worker:latest
trustedProxies:
- 192.168.0.0/16
stack:
hook:
enabled: false
hostNetwork: false
image: public.ecr.aws/eks-anywhere/nginx:latest
kubevip:
additionalEnv:
- name: prometheus_server
value: :2213
- name: lb_class_only
value: "true"
enabled: false
image: public.ecr.aws/eks-anywhere/kube-vip:latest
interface: "test-interface"
loadBalancerIP: 1.2.3.4
relay:
enabled: false
image: public.ecr.aws/eks-anywhere/tink-relay:latest
initImage: public.ecr.aws/eks-anywhere/tink-relay-init:latest
service:
enabled: false
tink:
controller:
image: public.ecr.aws/eks-anywhere/tink-controller:latest
server:
image: public.ecr.aws/eks-anywhere/tink-server:latest
2 changes: 2 additions & 0 deletions pkg/providers/tinkerbell/tinkerbell_test.go
Expand Up @@ -605,6 +605,7 @@ func TestPreCAPIInstallOnBootstrapSuccess(t *testing.T) {
gomock.Any(),
gomock.Any(),
gomock.Any(),
gomock.Any(),
)

err := provider.PreCAPIInstallOnBootstrap(ctx, cluster, clusterSpec)
Expand Down Expand Up @@ -651,6 +652,7 @@ func TestPostWorkloadInitSuccess(t *testing.T) {
gomock.Any(),
gomock.Any(),
gomock.Any(),
gomock.Any(),
)
stackInstaller.EXPECT().UninstallLocal(ctx)

3 changes: 2 additions & 1 deletion pkg/providers/tinkerbell/upgrade.go
Expand Up @@ -505,7 +505,7 @@ func (p *Provider) validateMachineCfg(ctx context.Context, cluster *types.Cluste
return nil
}

// PreCoreComponentsUpgrade staisfies the Provider interface.
// PreCoreComponentsUpgrade satisfies the Provider interface.
func (p *Provider) PreCoreComponentsUpgrade(
ctx context.Context,
cluster *types.Cluster,
Expand Down Expand Up @@ -584,6 +584,7 @@ func (p *Provider) PreCoreComponentsUpgrade(
p.datacenterConfig.Spec.TinkerbellIP,
cluster.KubeconfigFile,
p.datacenterConfig.Spec.HookImagesURLPath,
stack.WithLoadBalancerInterface(p.datacenterConfig.Spec.LoadBalancerInterface),
stack.WithBootsOnKubernetes(),
stack.WithStackServiceEnabled(true),
stack.WithDHCPRelayEnabled(true),