Add multiple ingress controller support + traefik #5943

Merged: 3 commits, Jul 10, 2024
8 changes: 7 additions & 1 deletion charts/chart_versions.yaml
@@ -14,9 +14,15 @@ charts:
- version: 1.29.002
filename: /charts/rke2-coredns.yaml
bootstrap: true
- - version: 4.10.101
+ - version: 4.10.102
filename: /charts/rke2-ingress-nginx.yaml
bootstrap: false
+ - version: 25.0.000
+ filename: /charts/rke2-traefik.yaml
+ bootstrap: false
+ - version: 25.0.000
+ filename: /charts/rke2-traefik-crd.yaml
+ bootstrap: false
- version: 3.12.002
filename: /charts/rke2-metrics-server.yaml
bootstrap: false
29 changes: 29 additions & 0 deletions docs/adrs/008-traefik-ingress.md
@@ -0,0 +1,29 @@
# Support for Alternative Ingress Controllers

Date: 2024-05-21

## Status

Accepted

## Context

RKE2 currently supports only a single ingress controller, ingress-nginx.
It has been requested that RKE2 support alternative ingress controllers, similar to how RKE2 supports multiple CNIs.

## Decision

* A new --ingress-controller flag will be added; the default will be only `ingress-nginx` to preserve current behavior.
* All selected ingress controllers will be deployed to the cluster.
* The first selected ingress controller will be set as the default, via the `ingressclass.kubernetes.io/is-default-class` annotation
on the IngressClass resource.
* Any packaged ingress controllers not listed in the flag value will be disabled, similar to how inactive packaged CNIs are handled.
* RKE2 will package Traefik's HelmChart as a supported ingress controller, deploying as a DaemonSet + ClusterIP Service
for parity with the `ingress-nginx` default configuration, given RKE2's lack of a default LoadBalancer controller.
* RKE2 will use mirrored upstream Traefik images; custom-rebuilt hardened-traefik images will not be provided or supported.

## Consequences

* We will add an additional packaged component and CLI flag for ingress controller selection.
* We will need to track updates to Traefik and the Traefik chart.
* QA will need additional resources to test the new ingress controllers.
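
The default-class rule in the Decision list above can be sketched as follows. This is a minimal illustration, not RKE2 code: `ingressClassAnnotations` is a hypothetical helper, and only the annotation key is taken from the ADR. It marks the first selected controller as the default and leaves the others unannotated.

```go
package main

import "fmt"

// defaultClassAnnotation is the annotation key named in the ADR.
const defaultClassAnnotation = "ingressclass.kubernetes.io/is-default-class"

// ingressClassAnnotations is a hypothetical helper (not part of RKE2).
// For each selected controller it returns the annotations its IngressClass
// would carry; only the first selected controller is marked as the default.
func ingressClassAnnotations(selected []string) map[string]map[string]string {
	out := make(map[string]map[string]string, len(selected))
	for i, name := range selected {
		if _, seen := out[name]; seen {
			continue // ignore repeated selections so the first entry wins
		}
		annotations := map[string]string{}
		if i == 0 {
			annotations[defaultClassAnnotation] = "true"
		}
		out[name] = annotations
	}
	return out
}

func main() {
	selection := []string{"traefik", "ingress-nginx"}
	for name, annotations := range ingressClassAnnotations(selection) {
		fmt.Printf("%s default=%q\n", name, annotations[defaultClassAnnotation])
	}
}
```

With the selection `traefik,ingress-nginx`, both controllers are deployed and only `traefik` carries the default-class annotation.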
7 changes: 4 additions & 3 deletions pkg/bootstrap/bootstrap.go
@@ -162,7 +162,7 @@ func Stage(resolver *images.Resolver, nodeConfig *daemonconfig.Node, cfg cmds.Ag

// UpdateManifests copies the staged manifests into the server's manifests dir, and applies
// cluster configuration values to any HelmChart manifests found in the manifests directory.
- func UpdateManifests(resolver *images.Resolver, nodeConfig *daemonconfig.Node, cfg cmds.Agent) error {
+ func UpdateManifests(resolver *images.Resolver, ingressController string, nodeConfig *daemonconfig.Node, cfg cmds.Agent) error {
ref, err := resolver.GetReference(images.Runtime)
if err != nil {
return err
@@ -189,7 +189,7 @@ func UpdateManifests(resolver *images.Resolver, nodeConfig *daemonconfig.Node, c

// Fix up HelmCharts to pass through configured values.
// This needs to be done every time in order to sync values from the CLI
- if err := setChartValues(manifestsDir, nodeConfig, cfg); err != nil {
+ if err := setChartValues(manifestsDir, ingressController, nodeConfig, cfg); err != nil {
return errors.Wrap(err, "failed to rewrite HelmChart manifests to pass through CLI values")
}
return nil
@@ -309,7 +309,7 @@ func copyFile(target, source string) error {
// pass through settings to both the Helm job and the chart values.
// NOTE: This will probably fail if any manifest contains multiple documents. This should
// not matter for any of our packaged components, but may prevent this from working on user manifests.
- func setChartValues(manifestsDir string, nodeConfig *daemonconfig.Node, cfg cmds.Agent) error {
+ func setChartValues(manifestsDir, ingressController string, nodeConfig *daemonconfig.Node, cfg cmds.Agent) error {
chartValues := map[string]string{
"global.clusterCIDR": util.JoinIPNets(nodeConfig.AgentConfig.ClusterCIDRs),
"global.clusterCIDRv4": util.JoinIP4Nets(nodeConfig.AgentConfig.ClusterCIDRs),
@@ -318,6 +318,7 @@ func setChartValues(manifestsDir string, nodeConfig *daemonconfig.Node, cfg cmds
"global.clusterDomain": nodeConfig.AgentConfig.ClusterDomain,
"global.rke2DataDir": cfg.DataDir,
"global.serviceCIDR": util.JoinIPNets(nodeConfig.AgentConfig.ServiceCIDRs),
+ "global.systemDefaultIngressClass": ingressController,
"global.systemDefaultRegistry": nodeConfig.AgentConfig.SystemDefaultRegistry,
"global.cattle.systemDefaultRegistry": nodeConfig.AgentConfig.SystemDefaultRegistry,
}
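
To make the dotted keys in `chartValues` concrete, here is a minimal sketch of how flat keys such as `global.systemDefaultIngressClass` correspond to nested chart values, in the way Helm-style `--set a.b=c` keys are commonly interpreted. `expandDotted` is a hypothetical helper written for illustration; how RKE2 actually hands these values to the HelmChart resources is not shown in this hunk.

```go
package main

import (
	"fmt"
	"strings"
)

// expandDotted is a hypothetical helper (not RKE2 code). It turns flat,
// dot-separated keys into nested maps. It assumes no key is both a leaf and
// a prefix of another key; such conflicts are not handled here.
func expandDotted(flat map[string]string) map[string]any {
	root := map[string]any{}
	for key, value := range flat {
		parts := strings.Split(key, ".")
		node := root
		// Walk (and create) the intermediate maps for every part but the last.
		for _, part := range parts[:len(parts)-1] {
			child, ok := node[part].(map[string]any)
			if !ok {
				child = map[string]any{}
				node[part] = child
			}
			node = child
		}
		node[parts[len(parts)-1]] = value
	}
	return root
}

func main() {
	values := expandDotted(map[string]string{
		"global.systemDefaultIngressClass": "traefik",
		"global.clusterDomain":             "cluster.local",
	})
	global := values["global"].(map[string]any)
	fmt.Println(global["systemDefaultIngressClass"]) // prints traefik
}
```

A chart template could then read the value as `.Values.global.systemDefaultIngressClass`, assuming the chart follows the usual Helm values layout.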
125 changes: 80 additions & 45 deletions pkg/cli/cmds/server.go
@@ -1,6 +1,7 @@
package cmds

import (
"errors"
"strings"

"github.com/k3s-io/k3s/pkg/cli/cmds"
@@ -9,29 +10,20 @@ import (
"github.com/rancher/wrangler/v3/pkg/slice"
"github.com/sirupsen/logrus"
"github.com/urfave/cli"
"k8s.io/apimachinery/pkg/util/sets"
)

const (
rke2Path = "/var/lib/rancher/rke2"
)

var (
- DisableItems = []string{"rke2-coredns", "rke2-ingress-nginx", "rke2-metrics-server"}
- CNIItems = []string{"calico", "canal", "cilium", "flannel"}

config = rke2.Config{}

serverFlag = []cli.Flag{
- &cli.StringSliceFlag{
- Name: "cni",
- Usage: "(networking) CNI Plugins to deploy, one of none, " + strings.Join(CNIItems, ", ") + "; optionally with multus as the first value to enable the multus meta-plugin (default: canal)",
- EnvVar: "RKE2_CNI",
- },
- &cli.BoolFlag{
- Name: "enable-servicelb",
- Usage: "(components) Enable rke2 default cloud controller manager's service controller",
- EnvVar: "RKE2_ENABLE_SERVICELB",
- },
+ rke2.CNIFlag,
+ rke2.IngressControllerFlag,
+ rke2.ServiceLBFlag,
}

k3sServerBase = mustCmdFromK3S(cmds.NewServerCommand(ServerRun), K3SFlagSet{
@@ -81,7 +73,7 @@ var (
"kine-tls": dropFlag,
"default-local-storage-path": dropFlag,
"disable": {
- Usage: "(components) Do not deploy packaged components and delete any deployed components (valid items: " + strings.Join(DisableItems, ", ") + ")",
+ Usage: "(components) Do not deploy packaged components and delete any deployed components (valid items: " + strings.Join(rke2.DisableItems, ", ") + ")",
},
"disable-scheduler": copyFlag,
"disable-cloud-controller": copyFlag,
@@ -166,47 +158,90 @@ func ServerRun(clx *cli.Context) error {
validateCloudProviderName(clx, Server)
validateProfile(clx, Server)
validateCNI(clx)
validateIngress(clx)
return rke2.Server(clx, config)
}

// validateCNI validates the CNI selection, and disables any un-selected CNI charts
func validateCNI(clx *cli.Context) {
cnis := []string{}
for _, cni := range clx.StringSlice("cni") {
for _, v := range strings.Split(cni, ",") {
cnis = append(cnis, v)
disableExceptSelected(clx, rke2.CNIItems, rke2.CNIFlag, func(values cli.StringSlice) (cli.StringSlice, error) {
switch len(values) {
case 0:
values = append(values, "canal")
fallthrough
case 1:
if values[0] == "multus" {
return nil, errors.New("multus must be used alongside another primary cni selection")
}
clx.Set("disable", "rke2-multus")
case 2:
if values[0] == "multus" {
values = values[1:]
} else {
return nil, errors.New("may only provide multiple values if multus is the first value")
}
default:
return nil, errors.New("must specify 1 or 2 values")
}
}
return values, nil
})
}

switch len(cnis) {
case 0:
cnis = append(cnis, "canal")
fallthrough
case 1:
if cnis[0] == "multus" {
logrus.Fatal("invalid value provided for --cni flag: multus must be used alongside another primary cni selection")
// validateIngress validates the ingress controller selection, and disables any un-selected ingress controller charts
func validateIngress(clx *cli.Context) {
disableExceptSelected(clx, rke2.IngressItems, rke2.IngressControllerFlag, func(values cli.StringSlice) (cli.StringSlice, error) {
if len(values) == 0 {
values = append(values, "ingress-nginx")
}
clx.Set("disable", "rke2-multus")
case 2:
if cnis[0] == "multus" {
cnis = cnis[1:]
} else {
logrus.Fatal("invalid values provided for --cni flag: may only provide multiple values if multus is the first value")
return values, nil
})
}

// disableExceptSelected takes a list of valid flag values, and a CLI StringSlice flag that holds the user's selected values.
// Selected values are split to support comma-separated lists, in addition to repeated use of the same flag.
// Once the list has been split, a validation function is called to allow for custom validation or defaulting of selected values.
// Finally, charts for any valid items not selected are added to the --disable list.
// A value of 'none' will cause all valid items to be disabled.
// Errors from the validation function, or selection of a value not in the valid list, will cause a fatal error to be logged.
func disableExceptSelected(clx *cli.Context, valid []string, flag *cli.StringSliceFlag, validateFunc func(cli.StringSlice) (cli.StringSlice, error)) {
// split comma-separated values
values := cli.StringSlice{}
if flag.Value != nil {
for _, value := range *flag.Value {
for _, v := range strings.Split(value, ",") {
values = append(values, v)
}
}
default:
logrus.Fatal("invalid values provided for --cni flag: may not provide more than two values")
}

switch {
case cnis[0] == "none":
fallthrough
case slice.ContainsString(CNIItems, cnis[0]):
for _, d := range CNIItems {
if cnis[0] != d {
clx.Set("disable", "rke2-"+d)
clx.Set("disable", "rke2-"+d+"-crd")
}
// validate the flag after splitting values
if v, err := validateFunc(values); err != nil {
logrus.Fatalf("Failed to validate --%s flag: %v", flag.Name, err)
} else {
flag.Value = &v
}

// prepare a list of items to disable, based on all valid components.
// we have to use an intermediate set because the flag interface
// doesn't allow us to remove flag values once added.
disabledCharts := sets.Set[string]{}
Review comment (Contributor):

I wonder why we need a set instead of a slice. If I understand correctly, the "values" are always slices that we are hardcoding with rke2 components in "pkg/rke2/rke2.go", e.g. CNIItems.

Reply (@brandond, Member Author, May 23, 2024):

I made it a set so that I can easily remove entries from it by value, without having to find their indexes in the slice and elide them. That would be much more boilerplate code than just the Insert() and Delete() functions that sets provide. I could use a map[string]struct{} and insert/delete keys, but generic sets are implemented as a map[comparable]struct{} under the hood anyway:
https://github.com/kubernetes/apimachinery/blob/master/pkg/util/sets/set.go#L24-L25

Reply (Contributor):

Aaah, OK. I was thinking of the property that sets cannot contain duplicates, and that did not make sense here. Thanks.
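
The trade-off discussed in the review thread above can be illustrated with a small, self-contained sketch. The `set` type here is a hypothetical stand-in written for this example, backed by a map with empty struct values as the reply describes; it is not the `k8s.io/apimachinery` implementation, and `removeFromSlice` is likewise only illustrative.

```go
package main

import (
	"fmt"
	"sort"
)

// set is a minimal stand-in for a generic set, backed by a map with
// empty struct values. It is an illustration only.
type set[T comparable] map[T]struct{}

// Insert adds items to the set; repeated items are harmless.
func (s set[T]) Insert(items ...T) {
	for _, item := range items {
		s[item] = struct{}{}
	}
}

// Delete removes items by value; missing items are ignored.
func (s set[T]) Delete(items ...T) {
	for _, item := range items {
		delete(s, item)
	}
}

// sortedKeys returns the set contents in a stable order for printing.
func sortedKeys(s set[string]) []string {
	keys := make([]string, 0, len(s))
	for k := range s {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

// removeFromSlice is the slice equivalent: scan for the value and rebuild.
func removeFromSlice(items []string, target string) []string {
	kept := make([]string, 0, len(items))
	for _, item := range items {
		if item != target {
			kept = append(kept, item)
		}
	}
	return kept
}

func main() {
	disabled := set[string]{}
	disabled.Insert("rke2-ingress-nginx", "rke2-ingress-nginx-crd", "rke2-traefik", "rke2-traefik-crd")
	disabled.Delete("rke2-traefik", "rke2-traefik-crd")
	fmt.Println(sortedKeys(disabled)) // prints [rke2-ingress-nginx rke2-ingress-nginx-crd]

	list := []string{"rke2-ingress-nginx", "rke2-traefik"}
	fmt.Println(removeFromSlice(list, "rke2-traefik")) // prints [rke2-ingress-nginx]
}
```

Deleting from the set is a single call per value, while the slice version needs a scan and a rebuilt slice for each removal.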

for _, d := range valid {
disabledCharts.Insert("rke2-"+d, "rke2-"+d+"-crd")
}

// re-enable components for any selected flag values
for _, d := range *flag.Value {
switch {
case d == "none":
break
case slice.ContainsString(valid, d):
disabledCharts.Delete("rke2-"+d, "rke2-"+d+"-crd")
default:
logrus.Fatalf("Invalid value %s for --%s flag: must be one of %s", d, flag.Name, strings.Join(valid, ","))
}
default:
logrus.Fatal("invalid value provided for --cni flag")
}

for _, c := range disabledCharts.UnsortedList() {
clx.Set("disable", c)
}
}
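
The flow of `disableExceptSelected` can be restated without the CLI types as a rough sketch. `chartsToDisable`, `validateCNI`, and `contains` below are hypothetical names for this example; unlike the real function, the sketch returns errors instead of logging fatally, and it leaves out the `rke2-multus` handling.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
	"strings"
)

// contains reports whether items includes target.
func contains(items []string, target string) bool {
	for _, item := range items {
		if item == target {
			return true
		}
	}
	return false
}

// chartsToDisable is a hypothetical, CLI-free restatement of the flow above.
func chartsToDisable(valid, raw []string, validate func([]string) ([]string, error)) ([]string, error) {
	// Split comma-separated values so "a,b" and repeated flags behave alike.
	var values []string
	for _, r := range raw {
		values = append(values, strings.Split(r, ",")...)
	}
	values, err := validate(values)
	if err != nil {
		return nil, err
	}
	// Start with every packaged chart disabled.
	disabled := map[string]struct{}{}
	for _, v := range valid {
		disabled["rke2-"+v] = struct{}{}
		disabled["rke2-"+v+"-crd"] = struct{}{}
	}
	// Re-enable charts for selected values; "none" leaves all disabled.
	for _, v := range values {
		switch {
		case v == "none":
		case contains(valid, v):
			delete(disabled, "rke2-"+v)
			delete(disabled, "rke2-"+v+"-crd")
		default:
			return nil, fmt.Errorf("invalid value %s: must be one of %s", v, strings.Join(valid, ","))
		}
	}
	out := make([]string, 0, len(disabled))
	for name := range disabled {
		out = append(out, name)
	}
	sort.Strings(out)
	return out, nil
}

// validateCNI mirrors the CNI rules in the diff, minus the multus chart handling.
func validateCNI(values []string) ([]string, error) {
	switch len(values) {
	case 0:
		return []string{"canal"}, nil
	case 1:
		if values[0] == "multus" {
			return nil, errors.New("multus must be used alongside another primary cni selection")
		}
		return values, nil
	case 2:
		if values[0] != "multus" {
			return nil, errors.New("may only provide multiple values if multus is the first value")
		}
		return values[1:], nil
	default:
		return nil, errors.New("must specify 1 or 2 values")
	}
}

func main() {
	cnis := []string{"calico", "canal", "cilium", "flannel"}
	fmt.Println(chartsToDisable(cnis, []string{"multus,cilium"}, validateCNI))
}
```

Starting from "everything disabled" and deleting the selected entries means a repeated selection is harmless, which matches the reason given in the review for using a set.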
15 changes: 8 additions & 7 deletions pkg/pebinaryexecutor/pebinary.go
@@ -46,19 +46,20 @@ var (
)

type PEBinaryConfig struct {
ManifestsDir string
ImagesDir string
Resolver *images.Resolver
CNIPlugin win.CNIPlugin
CloudProvider *CloudProviderConfig
CISMode bool
Resolver *images.Resolver
ManifestsDir string
DataDir string
AuditPolicyFile string
KubeletPath string
CNIName string
ImagesDir string
KubeConfigKubeProxy string
IngressController string
CISMode bool
DisableETCD bool
IsServer bool
CNIName string
CNIPlugin win.CNIPlugin
}

type CloudProviderConfig struct {
@@ -105,7 +106,7 @@ func (p *PEBinaryConfig) Bootstrap(ctx context.Context, nodeConfig *config.Node,
}

if p.IsServer {
- return bootstrap.UpdateManifests(p.Resolver, nodeConfig, cfg)
+ return bootstrap.UpdateManifests(p.Resolver, p.IngressController, nodeConfig, cfg)
Review comment (Contributor):

I have just realized that if we ever reach this line, something would be broken, because Windows can never be the server. What if we use this PR to add an error message saying that this is not supported?

Reply (Member Author):

I wired it up here in case we ever do support Windows server nodes. I don't think it needs to be an error; it's just not a code path that is reachable at the moment.

}

restConfig, err := clientcmd.BuildConfigFromFlags("", nodeConfig.AgentConfig.KubeConfigK3sController)
26 changes: 13 additions & 13 deletions pkg/podexecutor/staticpod.go
@@ -105,25 +105,25 @@ type ControlPlaneProbeConfs struct {
}

type StaticPodConfig struct {
Resolver *images.Resolver
stopKubelet context.CancelFunc
CloudProvider *CloudProviderConfig
ControlPlaneResources
ControlPlaneProbeConfs
DataDir string
RuntimeEndpoint string
ManifestsDir string
IngressController string
ImagesDir string
AuditPolicyFile string
PSAConfigFile string
KubeletPath string
ControlPlaneEnv
ControlPlaneMounts
ManifestsDir string
ImagesDir string
Resolver *images.Resolver
CloudProvider *CloudProviderConfig
DataDir string
AuditPolicyFile string
PSAConfigFile string
KubeletPath string
RuntimeEndpoint string
ControlPlaneProbeConfs
CISMode bool
DisableETCD bool
ExternalDatabase bool
IsServer bool

stopKubelet context.CancelFunc
}

type CloudProviderConfig struct {
@@ -159,7 +159,7 @@ func (s *StaticPodConfig) Bootstrap(_ context.Context, nodeConfig *daemonconfig.
return err
}
if s.IsServer {
- return bootstrap.UpdateManifests(s.Resolver, nodeConfig, cfg)
+ return bootstrap.UpdateManifests(s.Resolver, s.IngressController, nodeConfig, cfg)
}

// Remove the kube-proxy static pod manifest before starting the agent.
27 changes: 26 additions & 1 deletion pkg/rke2/rke2.go
@@ -22,6 +22,7 @@ import (
"github.com/pkg/errors"
"github.com/rancher/rke2/pkg/controllers/cisnetworkpolicy"
"github.com/rancher/rke2/pkg/images"

"github.com/rancher/wrangler/v3/pkg/slice"
"github.com/sirupsen/logrus"
"github.com/urfave/cli"
@@ -62,6 +63,30 @@ type ExtraEnv struct {
CloudControllerManager cli.StringSlice
}

var (
DisableItems = []string{"rke2-coredns", "rke2-metrics-server", "rke2-snapshot-controller", "rke2-snapshot-controller-crd", "rke2-snapshot-validation-webhook"}
CNIItems = []string{"calico", "canal", "cilium", "flannel"}
IngressItems = []string{"ingress-nginx", "traefik"}

CNIFlag = &cli.StringSliceFlag{
Name: "cni",
Usage: "(networking) CNI Plugins to deploy, one of none, " + strings.Join(CNIItems, ", ") + "; optionally with multus as the first value to enable the multus meta-plugin (default: canal)",
EnvVar: "RKE2_CNI",
Value: &cli.StringSlice{},
}
IngressControllerFlag = &cli.StringSliceFlag{
Name: "ingress-controller",
Usage: "(networking) Ingress Controllers to deploy, one of none, " + strings.Join(IngressItems, ", ") + "; the first value will be set as the default ingress class (default: ingress-nginx)",
EnvVar: "RKE_INGRESS_CONTROLLER",
Value: &cli.StringSlice{},
}
ServiceLBFlag = &cli.BoolFlag{
Name: "enable-servicelb",
Usage: "(components) Enable rke2 default cloud controller manager's service controller",
EnvVar: "RKE2_ENABLE_SERVICELB",
}
)

// Valid CIS Profile versions
const (
CISProfile123 = "cis-1.23"
@@ -115,7 +140,7 @@ func Server(clx *cli.Context, cfg Config) error {

var leaderControllers rawServer.CustomControllers

- cnis := clx.StringSlice("cni")
+ cnis := *CNIFlag.Value
if cisMode && (len(cnis) == 0 || slice.ContainsString(cnis, "canal")) {
leaderControllers = append(leaderControllers, cisnetworkpolicy.Controller)
} else {