Commit 4aa1b44 ("fix") - marek-veber, Jan 20, 2025 - docs/proposals/20241127-namespace-separation.md
replaces:
superseded-by:
---

# Enable adoption in advanced multi-tenant scenarios

## Table of Contents

<!-- START doctoc generated TOC please keep comment here to allow auto update -->
<!-- DON'T EDIT THIS SECTION, INSTEAD RE-RUN doctoc TO UPDATE -->

- [Glossary](#glossary)
- [Summary](#summary)
- [Paradigm 1: Isolated Cluster Management](#paradigm-1-isolated-cluster-management)
- [Paradigm 2: Centralized Cluster Management](#paradigm-2-centralized-cluster-management)
- [Challenge: Coexistence of Both Paradigms](#challenge-coexistence-of-both-paradigms)
- [Motivation](#motivation)
- [Goals](#goals)
- [Non-Goals/Future Work](#non-goalsfuture-work)
- [Proposal](#proposal)
- [A deployment example](#a-deployment-example)
- [Global resources:](#global-resources)
- [Namespace `capi1-system`](#namespace-capi1-system)
- [Namespace `capi2-system`](#namespace-capi2-system)
- [User Stories](#user-stories)
- [Story 1 - Isolated Cluster Management](#story-1---isolated-cluster-management)
- [Story 2 - Centralized Cluster Management](#story-2---centralized-cluster-management)
- [Story 3 - Hierarchical deployment using CAPI](#story-3---hierarchical-deployment-using-capi)
- [Functional Requirements](#functional-requirements)
- [FR1 - watch multiple namespaces](#fr1---watch-multiple-namespaces)
- [FR2 - watch on all namespaces excluding multiple namespaces](#fr2---watch-on-all-namespaces-excluding-multiple-namespaces)
- [Non-Functional Requirements](#non-functional-requirements)
- [NFR1](#nfr1)
- [NFR2](#nfr2)
- [Implementation Details/Notes/Constraints](#implementation-detailsnotesconstraints)
- [Current state:](#current-state)
- [Watch on multiple namespaces](#watch-on-multiple-namespaces)
- [Exclude watching on selected namespaces](#exclude-watching-on-selected-namespaces)
- [Security Model](#security-model)
- [Risks and Mitigations](#risks-and-mitigations)
- [Alternatives](#alternatives)
- [Upgrade Strategy](#upgrade-strategy)
- [Additional Details](#additional-details)
- [Test Plan](#test-plan)
- [Implementation History](#implementation-history)

<!-- END doctoc generated TOC please keep comment here to allow auto update -->

## Summary
Service providers and consumers use a management cluster to provision and manage the lifecycle of Kubernetes clusters via the Kubernetes Cluster API (CAPI).
Two distinct paradigms coexist to address different operational and security requirements.

### Paradigm 1: Isolated Cluster Management
Each Kubernetes cluster operates its own suite of CAPI controllers, targeting specific namespaces as a hidden implementation detail.
This paradigm avoids using webhooks and prioritizes isolation and granularity.

**Key Features**:
- **Granular Lifecycle Management**: Independent versioning and upgrades for each cluster's CAPI components.
- **Logging and Metrics**: Per-cluster logging, forwarding, and metric collection.
- **Resource Isolation**: Defined resource budgets for CPU, memory, and storage on a per-cluster basis.
- **Security Requirements**:
- **Network Policies**: Per-cluster isolation using tailored policies.
- **Cloud Provider Credentials**: Each cluster uses its own set of isolated credentials.
- **Kubeconfig Access**: Dedicated access controls for kubeconfig per cluster.

Extending the existing command-line option `--namespace=<ns1, …>` so that it can be passed multiple times is proposed in PR [#11397](https://github.com/kubernetes-sigs/cluster-api/pull/11397).
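As a sketch, an isolated instance's manager Deployment could pass the repeated flag like this (the image tag and namespace names are illustrative assumptions, not taken from the proposal):

```yaml
# Sketch only: "instance 1" deployed in capi1-system, watching two namespaces.
# The repeated --namespace flag is the extension proposed in PR #11397.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: capi-controller-manager
  namespace: capi1-system
spec:
  # selector/labels elided for brevity
  template:
    spec:
      serviceAccountName: capi-manager
      containers:
        - name: manager
          # all instances share the same image (same CAPI version)
          image: registry.k8s.io/cluster-api/cluster-api-controller:v1.9.0
          args:
            - --namespace=ns1.1
            - --namespace=ns1.2
```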

---

### Paradigm 2: Centralized Cluster Management
This paradigm manages multiple Kubernetes clusters using a shared, centralized suite of CAPI controllers. It is designed for scenarios with less stringent isolation requirements.

**Characteristics**:
- Operates under simplified constraints compared to [Paradigm 1](#paradigm-1-isolated-cluster-management).
- Reduces management overhead through centralization.
- Prioritizes ease of use and scalability over strict isolation.

The addition of the new command-line option `--excluded-namespace=<ns1, …>` is proposed in this PR [#11370](https://github.com/kubernetes-sigs/cluster-api/pull/11370).
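A corresponding sketch for the centralized instance, which would exclude the namespaces owned by the isolated instances (namespace names are illustrative assumptions):

```yaml
# Sketch only: the "last resort" instance watches all namespaces except the
# ones listed; --excluded-namespace is the new flag proposed in PR #11370.
args:
  - --excluded-namespace=ns1.1
  - --excluded-namespace=ns1.2
  - --excluded-namespace=ns2.1
```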

---

### Challenge: Coexistence of Both Paradigms
To enable [Paradigm 1](#paradigm-1-isolated-cluster-management) and [Paradigm 2](#paradigm-2-centralized-cluster-management) to coexist within the same management cluster, the following is required:
- **Scope Restriction**: Paradigm 2 must have the ability to restrict its scope to avoid interference with resources owned by Paradigm 1.
- **Resource Segregation**: Paradigm 2 must be unaware of CAPI resources managed by [Paradigm 1](#paradigm-1-isolated-cluster-management) to prevent cross-contamination and conflicts.

This coexistence strategy ensures both paradigms can fulfill their respective use cases without compromising operational integrity.

## Motivation
In a multi-tenant environment where a single cluster acts as the provisioner for multiple tenants, using CAPI requires careful consideration of namespace isolation
to maintain security and operational boundaries between tenants. In such setups, it is essential to configure the CAPI controller instances
to either watch or exclude specific groups of namespaces based on the isolation requirements.
This can be achieved by setting up namespace-scoped controllers or applying filters, such as field selectors, to define the namespaces each instance should monitor.
By doing so, administrators can ensure that the activities of one tenant do not interfere with others, while also reducing resource overhead by limiting the scope of CAPI operations.
This approach enhances scalability, security, and manageability, making it well suited for environments with strict multi-tenancy requirements.
Our motivation is to have a provisioning cluster that also serves as a provisioned cluster, leveraging a hierarchical structure of clusters.
Two namespaces are used by the management cluster, while the remaining namespaces are monitored by the CAPI manager to oversee other managed clusters.

This enhancement has also been requested repeatedly by the CAPI community:
* https://github.com/kubernetes-sigs/cluster-api/issues/11192
* https://github.com/kubernetes-sigs/cluster-api/issues/11193
* https://github.com/kubernetes-sigs/cluster-api/issues/7775

### Goals
There are some restrictions when using multiple providers, see: https://cluster-api.sigs.k8s.io/developer/core/support-multiple-instances
We need to:
1. extend the existing feature to limit watching to specified namespaces,
2. add a new feature to watch all namespaces except selected ones,
3. run multiple CAPI controller instances:
   - each watching only specified namespaces: `capi1-system`, …, `capi$(N-1)-system`
   - and a last-resort instance watching the remaining namespaces, excluding those already watched by the previously mentioned instances

### Non-Goals/Future Work
Non-goals:
* it is not necessary to support different versions of the CRDs; we assume all instances:
  * use the same version of CAPI (the same container image)
  * share the same CRDs
* the contract and RBAC need to be solved per specific provider (AWS, Azure, ...)


## Proposal
We are proposing to:
* enable selecting multiple namespaces: extend `--namespace=<ns>` to `--namespace=<ns1, …>` to watch the selected namespaces
  * the code change only extends an existing map from a single item to multiple items
  * this is a small and straightforward update of the existing feature to limit watching to a specified namespace, so the maintenance complexity should not grow here
* add the new command-line option `--excluded-namespace=<ens1, …>` to define the list of excluded namespaces
  * the code change only sets the `Cache.Options.DefaultFieldSelector` option to exclude objects matching any of the specified namespace names
  * the maintenance complexity should not grow much here either
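Conceptually, the exclusion option amounts to a default field selector on the controller cache; serialized, the effect of `--excluded-namespace=ns1 --excluded-namespace=ns2` would resemble the following (a sketch of the mechanics, not the literal configuration format):

```yaml
# Sketch: objects are filtered out of the cache when their namespace matches
# any of the excluded names.
fieldSelector: "metadata.namespace!=ns1,metadata.namespace!=ns2"
```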

Our objectives include:
- each CAPI instance:
  - runs in a separate namespace and uses its own service account
  - can specify namespaces through command-line arguments:
    - to watch - e.g.: `--namespace <ns1> --namespace <ns2>`
    - to exclude from watching - e.g.: `--excluded-namespace <ns1> --excluded-namespace <ns2>`
- we do not support multiple versions of CAPI
- all running CAPI-instances:
- are using the same container image (same version of CAPI)
- are sharing global resources:
- CRDs:
- cluster.x-k8s.io:
- addons.cluster.x-k8s.io: clusterresourcesetbindings, clusterresourcesets
- cluster.x-k8s.io: clusterclasses, clusters, machinedeployments, machinehealthchecks, machinepools, machinesets, machines
- ipam.cluster.x-k8s.io: ipaddressclaims, ipaddresses
- runtime.cluster.x-k8s.io: extensionconfigs
  - NOTE: the webhooks defined in the CRDs point to the first instance only
- cluster roles and access management:
  - the default CAPI deployment defines a global cluster role:
    - the `ClusterRole/capi-aggregated-manager-role`
    - the `ClusterRoleBinding/capi-manager-rolebinding` to bind the service account `<instance-namespace>:capi-manager` of each CAPI instance (e.g. `capi1-system:capi-manager`, ..., `capiN-system:capi-manager`) to the `ClusterRole`
  - in the case of [Paradigm 1](#paradigm-1-isolated-cluster-management) we can define a cluster role per instance and grant access only to the namespaces that will be watched by the given instance
  - in the case of [Paradigm 2](#paradigm-2-centralized-cluster-management) we need access to all namespaces, as defined in the default CAPI deployment cluster role.
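For [Paradigm 1](#paradigm-1-isolated-cluster-management), one way to narrow access (a sketch; binding and namespace names are assumptions) is to replace the cluster-wide binding with per-namespace `RoleBinding`s referencing the shared `ClusterRole`:

```yaml
# Sketch: grant instance 1's service account access only in a namespace it
# watches; repeat this RoleBinding in each watched namespace (ns1.2, ...).
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: capi1-manager-rolebinding
  namespace: ns1.1
subjects:
  - kind: ServiceAccount
    name: capi-manager
    namespace: capi1-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: capi-aggregated-manager-role
```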


### A deployment example
Let's consider an example of how to deploy multiple instances when [both paradigms coexist](#challenge-coexistence-of-both-paradigms).

#### Global resources:
* CRDs (*.cluster.x-k8s.io) - the webhooks will point into the first instance
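Since the full manifest is collapsed in this excerpt, the following is only a hedged sketch of how a shared CRD's conversion webhook might point at the first instance (the service name and namespace are assumptions):

```yaml
# Sketch: each shared CRD has a single conversion webhook, so it can reference
# the webhook service of only one instance (here the first, in capi1-system).
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: clusters.cluster.x-k8s.io
spec:
  group: cluster.x-k8s.io
  # names/versions elided for brevity
  conversion:
    strategy: Webhook
    webhook:
      conversionReviewVersions: ["v1"]
      clientConfig:
        service:
          name: capi-webhook-service
          namespace: capi1-system
          path: /convert
```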

### User Stories
We need to deploy multiple CAPI instances in the same cluster and divide the list of namespaces: certain well-known namespaces are assigned to be watched by given instances, and one instance is designated to watch the rest.
E.g.:
* instance1 (deployed in `capi1-system`) is watching `ns1.1`, `ns1.2`, ... `ns1.n1`
* instance2 (deployed in `capi2-system`) is watching `ns2.1`, `ns2.2`, ... `ns2.n2`
* ...
* last-resort instance (deployed in `capiR-system`) is watching the rest of namespaces

#### Story 1 - Isolated Cluster Management
We need to limit the list of namespaces to watch. This is possible today, but only for a single namespace; we need one instance to watch multiple namespaces.

#### Story 2 - Centralized Cluster Management
We need to exclude a list of namespaces from watching, to reduce management overhead through centralization.

#### Story 3 - Hierarchical deployment using CAPI
A provisioning cluster that is also a provisioned cluster at the same time, using a hierarchical structure of clusters.
Two namespaces are used by the management cluster, while the remaining namespaces are watched by the CAPI manager to oversee other managed clusters.

RedHat Jira Issues:
* [ACM-15441](https://issues.redhat.com/browse/ACM-15441) - CAPI requires an option for watching multiple namespaces,
* [ACM-14973](https://issues.redhat.com/browse/ACM-14973) - CAPI controllers should provide an option to ignore namespaces


#### Functional Requirements
The `Alternatives` section is used to highlight and record other possible approaches.

## Upgrade Strategy

We do not expect any changes while upgrading.

## Additional Details

