
NPEP-122 Tenancy API: Add Tenancy API design suggestions. #178

Open · wants to merge 1 commit into base: main
Conversation

npinaeva (Member)

Tracking issue #122


netlify bot commented Nov 20, 2023

Deploy Preview for kubernetes-sigs-network-policy-api ready!

Name Link
🔨 Latest commit 9a876cc
🔍 Latest deploy log https://app.netlify.com/sites/kubernetes-sigs-network-policy-api/deploys/673c68c56d782c000890bfc9
😎 Deploy Preview https://deploy-preview-178--kubernetes-sigs-network-policy-api.netlify.app

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Nov 20, 2023
@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Nov 20, 2023
@astoycos astoycos self-assigned this Nov 21, 2023
astoycos (Member) left a comment

Good start, thanks @npinaeva! Sorry for all the comments!

Currently, the way you've intertwined the challenges, examples, and actual proposed YAMLs is a bit confusing for me.

I think you can define all of the problems upfront (tenancy selection, peers and actions, and priorities) in a single concise blurb, and then reference those problems in the "possible designs / example YAML" section, since IMO there are 2 general design paths going forward:

  1. Allow Admins to use ANP + BANP to define tenancy
  2. Provide new resources to Admins for building tenancy

And all the other design problems (tenancy selection, peers and actions, and priorities) seem to be subcategories of that initial decision.

astoycos (Member) commented Dec 7, 2023

Since this NPEP is now "implementable" let's make sure Sig-Network API approvers have a look as well.

/assign @khenidak

astoycos (Member) commented Dec 7, 2023

/assign @thockin

danwinship (Contributor)

OK, try on this idea:

There are exactly 2 tenancy use cases:

  1. Overridable isolation: Traffic within tenants should be Allowed as though by a BANP rule which has higher precedence than any explicit BANP Deny rules. Traffic between tenants should be Denied as though by a BANP rule that has lower precedence than any explicit BANP Allow rules.
  2. Strict isolation: Traffic within tenants should be Allowed as though by a BANP rule which has higher precedence than any explicit BANP Deny rules. Traffic between tenants should be Denied as though by an ANP rule that has lower precedence than any explicit ANP Allow rules.

An API that allows exactly those two options and nothing else satisfies all of 4.1-4.4, right?
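A minimal sketch of an API restricted to exactly those two options, assuming a single cluster-wide tenancy definition (the kind and field names here are hypothetical; a similar shape is proposed further down in this thread):

kind: TenantDefinition
metadata:
  name: default
spec:
  # namespaces sharing the same values for these labels form a tenant
  labels: ["user"]
  # Overridable = use case 1 above (BANP-level deny between tenants)
  # Strict      = use case 2 above (ANP-level deny between tenants)
  tenancyType: Overridable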

danwinship (Contributor) commented Dec 8, 2023

Going back to what I was saying above about "cluster default policy", what about:

type ClusterDefaultPolicySpec struct {
        // tenancyLabels defines the labels used for dividing namespaces up into
        // tenants. Namespaces that have the same values for all of the labels in
        // tenancyLabels are part of the same tenant. Namespaces that lack at least
        // one of the labels are not part of any tenant. If this is empty then
        // tenancy-based matching is not part of the cluster default policy.
        TenancyLabels []string

        // userNamespaces defines what constitutes a "user namespace" (as
        // opposed to a "system namespace"). If this is not set then
        // user-namespace-based matching is not part of the cluster default policy.
        UserNamespaces *metav1.LabelSelector

        // ingress defines the cluster default policy for pod-ingress traffic
        Ingress ClusterDefaultPolicyRules

        // egress defines the cluster default policy for pod-egress traffic
        Egress ClusterDefaultPolicyRules
}

type ClusterDefaultPolicyRules struct {
        // withinUserNamespace defines the default policy for pod-to-pod
        // traffic within a user namespace. This must be nil if the cluster default
        // policy does not define userNamespaces.
        WithinUserNamespace *ClusterDefaultPolicyRule

        // withinTenant defines the default policy for pod-to-pod traffic
        // within a tenant (which was not matched by a previous rule). This
        // must be nil if the cluster default policy does not define
        // tenancyLabels.
        WithinTenant *ClusterDefaultPolicyRule

        // withinPodNetwork defines the default policy for pod-to-pod traffic
        // within the pod network (which was not matched by a previous rule).
        // Note that with some network implementations, it may be inefficient
        // to specify a "withinPodNetwork" rule (rather than just letting the
        // pod network case fall through to "default").
        WithinPodNetwork *ClusterDefaultPolicyRule

        // default defines the default policy for all traffic to a pod (for ingress)
        // or from a pod (for egress) which was not matched by a previous rule.
        // If not specified, this defaults to "DefaultAllow".
        Default *ClusterDefaultPolicyRule
}

type ClusterDefaultPolicyRule string

const (
        // DefaultAllow indicates that traffic should be allowed by default, but may
        // be denied by a BaselineAdminNetworkPolicy, NetworkPolicy, or
        // AdminNetworkPolicy. (This is the Kubernetes default for traffic which
        // does not have any other policy specified.)
        ClusterDefaultPolicyDefaultAllow ClusterDefaultPolicyRule = "DefaultAllow"

        // Allow indicates that traffic should be allowed by default, but may
        // be denied by an AdminNetworkPolicy. (Attempting to deny it with a
        // BaselineAdminNetworkPolicy or NetworkPolicy will have no effect.)
        ClusterDefaultPolicyAllow ClusterDefaultPolicyRule = "Allow"

        // DefaultDeny indicates that traffic should be denied by default, but may
        // be allowed by a BaselineAdminNetworkPolicy, NetworkPolicy, or
        // AdminNetworkPolicy.
        ClusterDefaultPolicyDefaultDeny ClusterDefaultPolicyRule = "DefaultDeny"

        // Deny indicates that traffic should be denied by default, but may
        // be allowed by an AdminNetworkPolicy. (Attempting to allow it with a
        // BaselineAdminNetworkPolicy or NetworkPolicy will have no effect.)
        ClusterDefaultPolicyDeny ClusterDefaultPolicyRule = "Deny"
)

So then, "overridable isolation" is:

kind: ClusterDefaultPolicy
name: default
spec:
    tenancyLabels:
        - "user"
    ingress:
        withinTenant: DefaultAllow
        withinPodNetwork: DefaultDeny
    egress:
        withinTenant: DefaultAllow
        withinPodNetwork: DefaultDeny

strict isolation is:

kind: ClusterDefaultPolicy
name: default
spec:
    tenancyLabels:
        - "user"
    ingress:
        withinTenant: DefaultAllow
        withinPodNetwork: Deny
    egress:
        withinTenant: DefaultAllow
        withinPodNetwork: Deny

"I don't care about the traffic within tenants, but I want to make sure traffic is denied to/from everywhere else (including N/S)" is

kind: ClusterDefaultPolicy
name: default
spec:
    tenancyLabels:
        - "user"
    ingress:
        withinTenant: DefaultAllow
        default: DefaultDeny
    egress:
        withinTenant: DefaultAllow
        default: DefaultDeny

"I don't care about within-cluster traffic, but cluster egress should be blocked by default" is

kind: ClusterDefaultPolicy
name: default
spec:
    tenancyLabels:
        - "user"
    ingress:
        # (doesn't need to be specified; this is the default)
        default: DefaultAllow
    egress:
        withinPodNetwork: DefaultAllow
        default: DefaultDeny

Though this API isn't quite right because you probably always want to DefaultAllow traffic in system namespaces, and this doesn't really give any way to express that...
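For illustration, a hedged sketch of how the userNamespaces field might be used (the selector values are invented): everything outside a few system namespaces is a "user namespace", same-namespace traffic there stays open by default, and other pod-network traffic is default-denied.

kind: ClusterDefaultPolicy
name: default
spec:
    # "user namespace" = any namespace not in this illustrative system list
    userNamespaces:
        matchExpressions:
            - key: "kubernetes.io/metadata.name"
              operator: NotIn
              values: ["kube-system", "monitoring-ns"]
    ingress:
        withinUserNamespace: DefaultAllow
        withinPodNetwork: DefaultDeny
    egress:
        withinUserNamespace: DefaultAllow
        withinPodNetwork: DefaultDeny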

npinaeva (Member, author) commented Dec 13, 2023

> Going back to what I was saying above about "cluster default policy", what about:

The invisible priorities this struct introduces are the most confusing part for me: every ClusterDefaultPolicyRules action is applied somewhere between ANP, NP, and BANP (with the priority implied by the action type), but it also has "which was not matched by a previous rule" semantics, which is even more confusing. (But maybe we can make it more explicit somehow.)

"I don't care about the traffic within tenants, but I want to make sure traffic is denied to/from everywhere else (including N/S)" is

name: default
spec:
    tenancyLabels:
        - "user"
    ingress:
        withinTenant: DefaultAllow
        default: DefaultDeny
    egress:
        withinTenant: DefaultAllow
        default: DefaultDeny

Don't we need Deny instead of DefaultDeny here? I thought this use case wants strict isolation for N/S

I also didn't quite get the usage/meaning of UserNamespaces since there were no examples.

npinaeva (Member, author)

@danwinship @astoycos @Dyanngg I tried to sum up the provided comments/ideas in a new section called "Another approach". Please take another look and let me know what you think.

Dyanngg (Contributor) commented Dec 19, 2023

> The invisible priorities this struct introduces are the most confusing part for me: every ClusterDefaultPolicyRules action is applied somewhere between ANP, NP, and BANP (with the priority implied by the action type), but it also has "which was not matched by a previous rule" semantics, which is even more confusing. (But maybe we can make it more explicit somehow.)

Huge +1 here. If we are going to introduce new CRD types (which it seems like we are), interoperation between tenancy-related objects and AdminNetworkPolicy objects is key. Any tenancy rule should have a clear and explicit priority, so that when there are ANPs on top of these tenancy rules, it is unambiguous which takes effect.

astoycos (Member) left a comment

So there are still a few comments left over from the last time I reviewed that I think are relevant.

Additionally, I am still not on board with creating a new NetworkTenancy object... (And I think generally most of the others are of the same opinion.)

At the base of this, we need an easy way to apply rules to a set of namespaces, with the ability to change the priority of those rules. ANP + BANP already provide this; if the existing semantics of the API don't work, let's add a new subject type and/or new peer types as needed.

Finally, the "we have to define everything as being in a tenant" concern just doesn't seem like an issue to me: if a tenant is defined and is blocked off from "all other tenants", that also inherently means it's blocked from all other pods, IMO. There doesn't need to be an explicit "NONE" tenant or anything like that.

Remember, we're not shooting for perfection here; generally we need the least complex API that is agreeable with the majority of the community :)

tssurya (Contributor) commented Jan 30, 2024

> Additionally, I am still not on board with creating a new NetworkTenancy object... (And I think generally most of the others are of the same opinion.)

+1 to this from my side

danwinship (Contributor)

> I am still not on board with creating a new NetworkTenancy object... (And I think generally most of the others are of the same opinion.)

I agree we should not have tenant-specific objects that mirror a large percentage of the functionality of ANP/BANP. But having an object to just define what tenancy means, which would then be referenced by ANPs/BANPs, would make sense. (Individual ANPs don't define which pods are in which namespaces themselves. Why should they define which namespaces are in which tenants themselves?) Or alternatively, having an object with a very restricted set of tenant policy-setting functionality could make sense.

> At the base of this, we need an easy way to apply rules to a set of namespaces, with the ability to change the priority of those rules.

Our user stories do not require the ability to set arbitrary priorities on tenancy-related rules.

> ANP + BANP already provide this; if the existing semantics of the API don't work, let's add a new subject type and/or new peer types as needed.

We already tried putting tenancy into the subject/peer framework, and we didn't like how it came out. Any tenancy system that does not involve any new object types seems like it is going to end up being basically equivalent to what we had in the original API with SameLabels/NotSameLabels. It will be a weird peer type that doesn't behave like any of the other peer types, and it will have the problem that different policies can have different definitions of tenancy, even though that makes the implementation more complicated and there are no user stories for that functionality.
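For reference, the removed peer form being discussed looked roughly like this in the original alpha API (reproduced from memory; treat the exact spelling as approximate):

ingress:
  - action: Allow
    from:
      - namespaces:
          sameLabels: ["user"]      # peer namespace shares the subject's "user" value
  - action: Deny
    from:
      - namespaces:
          notSameLabels: ["user"]   # peer namespace has a different "user" value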

danwinship (Contributor)

> and it will have the problem that different policies can have different definitions of tenancy, even though that makes the implementation more complicated and there are no user stories for that functionality.

In particular... this NPEP started when Nadia pointed out that SameLabels/NotSameLabels tend to result in an explosion of "implementation rules" (i.e. OVN ACLs, iptables rules, etc) because you need to turn the tenancy-based ANP rule into a set of namespace-based implementation rules.

But if we say that there is only a single definition of tenancy for the entire cluster, then that makes it much easier to represent tenancy as a concept within the implementation rules, allowing for more-optimized tenancy matches. Instead of changing "subject and peer are in the same tenant" into "subject's namespace is A or B and peer's namespace is A or B; or subject's namespace is C and peer's namespace is C; or ...", you can just represent it as "subject's tenant == peer's tenant".

To give two examples pertaining to non-existent ANP implementations ( 😄 )

  • In openshift-sdn, k8s Namespaces all have numeric IDs associated with them, and for pod-to-pod traffic, the namespace ID of the source pod is kept in reg0 and the ID of the destination pod is in reg1, so that namespaceSelector-based NetworkPolicies can be implemented by OVS flows that just check reg0 and reg1 rather than needing to be based on pod IPs. The same approach could be used with "tenant IDs". (In fact, the "namespace IDs" originally were "tenant IDs", in openshift-sdn's "multi-tenant" mode.)
  • In an nftables-based implementation, if there was only a single definition of tenancy, we could just create a set containing pairs of pod IP addresses for all such pairs of IPs that are in the same tenant. Then a tenancy-matching peer just turns into a single lookup of { ip saddr . ip daddr } in the tenancy set.

(OK, so actually the nftables example could be done even with "every policy creates its own definition of tenancy", by just creating a separate set for each SameLabels/NotSameLabels expression. But you might be less inclined to implement it that way, because those sets would be big. The openshift-sdn example really requires a single cluster-wide definition of tenancy though, since there are only so many registers, so you can't represent "this pod is part of tenant 12 with respect to ANP X, and part of tenant 378 with respect to ANP Y, and ...")

npinaeva (Member, author) commented Feb 2, 2024

Let me try to address the main point here: using (B)ANP rules for tenancy.
The current (B)ANP semantics is: the subject selects pods to which all ingress and egress rules are applied.
Tenancy can't be used with these semantics, because it has its own "subject", defined by the tenancy labels.
It doesn't make a lot of sense to allow something like

apiVersion: policy.networking.k8s.io/v1alpha1
kind: AdminNetworkPolicy
spec:
  priority: 10
  subject:
    namespaces:
      matchLabels:
        kubernetes.io/metadata.name: sensitive-ns
  ingress:
    - action: Deny
      from:
      - namespaces:
         namespaceSelector: {}
    - action: Allow
      tenantLabels: ["user"]

because tenancy doesn't have anything to do with the given subject; it just tries to be injected into the existing priority system.
Therefore, we can only use tenancy in (B)ANP at the same level as subject.

kind: AdminNetworkPolicy
spec:
  priority: 10
  # EITHER
  subject: <...>
  ingress: <...>
  egress: <...>
  # OR
  tenant:
# example contents of tenant, the format is not defined yet
    tenancyLabels:
      - "user"
    ingress: Allow
    egress: Allow

This raises a question about the priority between tenancy and the ingress/egress rules. We can either define it (say, tenancy always has higher priority) or only allow setting one part (see the EITHER/OR above). I prefer the first option, but it will require re-prioritization in existing implementations (since we need to reserve 1 extra priority slot before the ingress and egress rules, which wasn't needed before).
If you have other API options that make sense, please let me know, because ^ is the best I can do without adding a new object.

danwinship (Contributor) commented Feb 6, 2024

Right. So I'm leaning toward either:

1. A single external tenant definition, arbitrary rules in ANPs/BANPs:

kind: TenantDefinition
metadata:
  name: default
spec:
  labels: ["user"]

and then (based on your example above):

apiVersion: policy.networking.k8s.io/v1alpha1
kind: AdminNetworkPolicy
spec:
  priority: 10
  subject:
    namespaces:
      matchLabels:
        kubernetes.io/metadata.name: sensitive-ns
  ingress:
    - action: Deny
      from:
      - namespaces:
         namespaceSelector: {}
    - action: Allow
      from:
      - tenant: Same

which I think is coherent: it says "pods in sensitive namespaces allow ingress from pods in their tenant, but not from pods in other tenants".

2. A single external tenant definition, only two possible rules:

kind: TenantDefinition
metadata:
  name: default
spec:
  labels: ["user"]
  tenancyType: Soft

as per #178 (comment), where your only options are "hard tenancy" (inter-tenant traffic is denied like with an ANP) and "soft tenancy" (inter-tenant traffic is denied like with a BANP).


(I still like the "ClusterDefaultNetworkPolicy" idea, but I'm giving up on trying to push bigger changes through.)

We could also start with the second approach, and then if it seems too limited, deprecate the tenancyType field in the next alpha and flip to the first approach instead.

Dyanngg (Contributor) commented Feb 6, 2024

I'd personally vote for "external tenant definition, arbitrary rules in ANPs/BANPs", as I don't feel the tenancy definition should be overloaded with priority.
On top of that, I think the above proposal still does not solve the original issue we had with sameLabels: the namespaces selected in the subject field can 1. fall exactly within the tenancy definition, with all of them having the tenancy label(s); 2. intersect with the tenancy definition, with only some of them having the tenancy label(s); 3. be disjoint from the tenancy definition. We had a hard time agreeing on what to do with policy subjects that don't have the tenancy labels (either no-op for those specific namespaces, or group them into a "default" tenant, but none of the solutions were explicit and ideal).
If we are open to tenant definition as separate object, I will propose something like:

kind: TenantDefinition
metadata:
  name: by-user
spec:
  labels: ["user"]
apiVersion: policy.networking.k8s.io/v1alpha1
kind: AdminNetworkPolicy
spec:
  priority: 10
  subject:
     tenantNamespaces: by-user  # select namespaces by the label list in the by-user tenant definition
                               # in this case, this is equivalent to {namespaces where label "user" exists}
  ingress:
    - action: Allow
      from:
      - tenant: Same  # 'same' and 'notSame' are well-defined here, as subjects must have all the label keys
    - action: Deny
      from:
      - namespaces:
         namespaceSelector: {}

In this case, the only thing we would need to document is how subject: tenantNamespaces works, which should be pretty straightforward to explain.

danwinship (Contributor) commented Feb 6, 2024

> We had a hard time agreeing on what to do with policy subjects that don't have the tenancy labels (either no-op for those specific namespaces, or group them into a "default" tenant, but none of the solutions were explicit and ideal).

I'm pretty sure we need to answer this regardless of what approach we go with.

The options are:

  1. Some namespaces aren't part of any tenant but you are prevented by validation/admission-control from creating an ANP that uses a tenancy-based rule if the subject selects pods/namespaces that aren't part of a tenant.
    • So in this case Nadia's example would be invalid because it's not syntactically guaranteed that the "sensitive-ns" namespace has a "user" label. You'd have to change the subject to have matchExpression: [ { key: "kubernetes.io/metadata.name", operator: "In", values: ["sensitive-ns"] }, { key: "user", operator: "Exists" } ].
    • This implies that the tenancy definition has to be inside the ANP, because we generally never do cross-object validation
  2. Some namespaces aren't part of any tenant, but for tenancy rule purposes, all such namespaces form a tenant with each other (the "nil"/"default" tenant)
  3. Some namespaces aren't part of any tenant, but for tenancy rule purposes, each such namespace is treated as its own unique tenant.
  4. Some namespaces aren't part of any tenant, and for tenancy rule purposes, neither sameTenant nor notSameTenant rules will apply to them (SQL NULL / floating-point NaN semantics)
    • So in Nadia's example, if the "sensitive-ns" did not have a "user" label, then the "Deny" rule would still apply, but the "Allow" rule would not.
  5. Some namespaces aren't part of any tenant, and any ANP that refers to tenancy at all automatically doesn't apply to them.
    • So in Nadia's example, if the "sensitive-ns" did not have a "user" label, then the entire ANP would be a no-op.

> On top of that, I think the above proposal still does not solve the original issue we had with sameLabels: the namespaces selected in the subject field can 1. fall exactly within the tenancy definition, with all of them having the tenancy label(s); 2. intersect with the tenancy definition, with only some of them having the tenancy label(s); 3. be disjoint from the tenancy definition.

Note that this is more of a problem with some of the options than it is with others.
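For concreteness, option 1's constrained subject (the inline matchExpression above) written out in full as a sketch:

subject:
  namespaces:
    matchExpressions:
      - key: "kubernetes.io/metadata.name"
        operator: In
        values: ["sensitive-ns"]
      - key: "user"
        operator: Exists    # syntactically guarantees every subject namespace is in a tenant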

npinaeva (Member, author)

Latest update based on the feedback from the SIG meeting on Feb 13.
Pros and cons for every priority option are updated; check "Final suggestion" to discuss the fields of the final CRD version.

astoycos (Member) left a comment

I'm happy with the suggested way forward and think we can hash out any final design questions in the actual API change pull request. Thanks for all the work @npinaeva!

k8s-ci-robot (Contributor)

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: astoycos, npinaeva

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 27, 2024
@astoycos astoycos added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Feb 27, 2024
Review thread on npeps/npep-122.md:

    - action: Deny
      to:
      - namespaces: {}
Contributor:

The goal of my proposal was to put all of the explicitly-tenancy-related rules into one place. Eg, this use case would be a single object:

kind: NetworkTenancy
spec:
  tenancyLabels:
    - "user"
  type: Soft

Even if you think we need something more explicit than that, it's really inelegant to have to have 2 objects (the NetworkTenancy and the BANP), which both have to specify "tenants are defined by the 'user' label", but using two different syntaxes. If you make NetworkTenancy specify the between-tenant behavior rather than the within-tenant behavior, then you can specify the same behavior with a single object:

kind: NetworkTenancy
spec:
  tenancyLabels:
    - "user"
  precedence: BANP
  action: Deny

Member (author):

First of all, I am not sure between-tenant behaviour is a necessary part of tenancy. It is not obvious why "other tenants" should be different from "other namespaces"; I'm not sure denying traffic to a namespace outside your tenant should depend on whether that namespace is also part of some other tenant.
I'm not saying it shouldn't be there, but the core concept of tenancy is sameTenant, which is also a dynamic selector that is not possible to express in any other way. Performance-wise, I would probably even expect notSameTenant to be represented as a lower-priority deny rule for the tenancy selector (with a higher-priority sameTenant rule), since you don't need to create dynamic notSameTenant sets of pods for every tenant; you only need to do that for sameTenant.
About including the full use case (both allow and deny with their priorities): the main problem I faced is integration with ANP/BANP. When you only use sameTenant, you need only 1 implicit priority (for every use case we have, highest-ANP and highest-BANP would work). But with deny rules, you may want deny to also be highest-priority (use case 4.2), or lower than allow-from-monitoring (use case 4.4.1). Considering that implicit priorities add complexity, trying to track where exactly each one fits for both allow and deny rules just complicates the design (e.g. I am not sure how the example in #178 (comment) defines tenancy priority relative to ANP priority).
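A sketch of that rewrite, borrowing the hypothetical tenant: Same peer from earlier in this thread: deny-notSameTenant becomes a higher-priority sameTenant rule plus a catch-all deny, so only sameTenant pod sets need to be maintained dynamically.

ingress:
  - action: Allow       # higher-priority sameTenant rule
    from:
      - tenant: Same    # hypothetical peer type
  - action: Deny        # lower-priority catch-all standing in for notSameTenant
    from:
      - namespaces:
          namespaceSelector: {}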

Contributor:

> First of all, I am not sure between-tenant behaviour is a necessary part of tenancy.

Wow. OK, I guess I understand why we're having trouble agreeing on the details of the API now.

As I see it, tenancy is 100% about between-tenant behavior and 0% about anything else.

Look at the original user story:

> As a cluster admin, I want to build tenants in my cluster that are isolated from each other by default. Tenancy may be modeled as 1:1, where 1 tenant is mapped to a single Namespace, or 1:n, where a single tenant may own more than 1 Namespace.

It only talks about isolation between tenants, not anything that happens inside the tenants. And the updated 4.1 says

> ... By default, cross-tenant traffic is dropped. ...

and again, neither 4.1 nor 4.2 says anything about what happens inside tenants.

The goal of tenancy is to put up walls between certain sets of namespaces. That's it. Traffic between tenants is restricted in a particular way, but all traffic that isn't between-tenant behaves exactly like it would have without the tenant definitions; you can make rules like "everybody can talk to the apiserver" or "namespaces with a user label can't talk to the internet by default", but you don't need a tenancy API to do that. You only need the tenancy API to specify between-tenant behavior.

Contributor:

Not sure if it helps, but coming here from an end-user/implementor perspective: the one single ask I have been getting, always, is putting a wall up between tenants (a tenant here being a set of namespaces); the users I know almost always design "allow to yourself (isSame), deny to all other tenants (isNotSame)" => I know this is covered in one of the user stories already, but re-hashing that this is the single biggest thing the current ANP/BANP can't do well enough; I think other things like allow {} to DNS and deny {} can be done using ANP/BANP rules, unless I am missing something.

  • prio1: allow to your own tenant (I think this is 4.3, but at ANP level? I have not yet heard an ask for BANP level)
  • prio2: allow to DNS
  • prio3: allow from monitoring
  • last prio: deny everything to everything, total lockdown

The prio1 rule expression is what we struggle with in ANP, because putting {} in the subject doesn't work, as it does not allow reflecting tenancy in peers; the rest can be done with ANP/BANP rules. So really the ask is almost always an easy way to say "please allow to my own tenant", because the deny is almost always going to be the default-deny BANP at the bottom, which takes care of putting up the walls. OR, if you do the {} to {} deny in an ANP, that also solves the problem (I can see how the deny to other tenants can be useful if we can express it explicitly).
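The stack described above, sketched as ANP + BANP; the tenant: Same peer is the hypothetical missing piece (borrowed from earlier in this thread), everything else is expressible today, and the namespace selectors are illustrative:

kind: AdminNetworkPolicy
spec:
  priority: 1
  subject:
    namespaces: {}
  egress:
    - action: Allow          # prio1: allow to your own tenant (hypothetical peer)
      to:
        - tenant: Same
    - action: Allow          # prio2: allow to DNS (illustrative selector)
      to:
        - namespaces:
            namespaceSelector:
              matchLabels:
                kubernetes.io/metadata.name: kube-system
  ingress:
    - action: Allow          # prio3: allow from monitoring (illustrative selector)
      from:
        - namespaces:
            namespaceSelector:
              matchLabels:
                kubernetes.io/metadata.name: monitoring-ns
---
kind: BaselineAdminNetworkPolicy
metadata:
  name: default
spec:
  subject:
    namespaces: {}
  ingress:
    - action: Deny           # last prio: total lockdown by default
      from:
        - namespaces:
            namespaceSelector: {}
  egress:
    - action: Deny
      to:
        - namespaces:
            namespaceSelector: {}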

Review thread on npeps/npep-122.md:

    - action: Deny
      to:
      - namespaces: {}
Contributor:

again, this would be a single object if you were specifying the between-tenant behavior:

kind: NetworkTenancy
spec:
  tenancyLabels:
    - "user"
  precedence: ANP
  action: Deny

or with my earlier proposal:

kind: NetworkTenancy
spec:
  tenancyLabels:
    - "user"
  type: Hard

Review thread on npeps/npep-122.md:

    - action: Deny
      to:
      - namespaces: {}
Contributor:

So this can't be done if NetworkTenancy specifies the between-tenant rule, but I still don't think Story 3 is a real use case.

FTR, you described this as "you may have a strong policy where every engineering team should be always able to talk to all the other engineering teams" but this doesn't just say that, it says "the engineering teams are not allowed to have any internal security". If a user tries to create NPs with rules like "only the frontend server can connect to the backend server and only the backend server can connect to the database pod", they won't work, because there's an ANP Allow rule overriding them. I really doubt any administrator wants that policy.

ANP Allow rules really only make sense for targeted policies (like "allow from monitoring" or "allow to DNS").

Member (author):

If an admin allows namespaces to create their own network policies, then they may deny internal tenant connections by accident, and it will break things. So as an admin, you may want to say "always allow same tenant", which still lets namespaces specify policies for external-to-tenant traffic, but ensures internal tenant connectivity is not broken by accident. (Meaning it should protect internal tenant connections from being broken both intentionally and by accident.)

Contributor:

Yeah, I don't think that's a real use case. No admin who cares about security is going to want to prohibit users from using NP within their own namespaces. (And any admin who doesn't care about security isn't going to be using ANP in the first place.)

Review thread on npeps/npep-122.md:

    - action: Deny
      from:
      - namespaces: {}
Contributor:

This isn't quite right: 4.4.1 says "I want to allow ingress traffic from the monitoring namespace to all namespaces." (emphasis added) whereas this only allows to namespaces with a user label. So in your syntax, you would need 2 ANPs: one for allow-from-monitoring-to-all and one for deny-from-all-to-user-namespaces.

But anyway, with a between-tenants rule, the NetworkTenancy becomes action: Deny and then you don't need a separate deny-to-user-namespaces rule so you can just remove both the namespace selector and the Deny rule from the ANP.

kind: NetworkTenancy
spec:
  tenancyLabels:
    - "user"
  precedence: ANP
  action: Deny

kind: AdminNetworkPolicy
spec:
  priority: 1
  subject:
    namespaces: {}
  ingress:
    - action: Allow
      from:
        - namespaces:
            namespaceSelector:
              matchLabels:
                kubernetes.io/metadata.name: monitoring-ns

Now the "allow from monitoring" rule doesn't reference tenancy at all, because that rule isn't about tenancy; it applies equally to tenanted and non-tenanted namespaces.

(Also, this rewriting emphasizes the fact that this user story is basically just an extended version of 4.2, clarifying how NetworkTenancy and AdminNetworkPolicy interact.)

Member (author):

I am not sure that, in the user stories context, we ever meant "all namespaces" as explicitly all namespaces in the cluster, even the ones that are not part of tenancy (why would we discuss non-tenancy namespaces in these use cases?). Not sure if the wording is wrong, or if we have been thinking about these use cases differently this whole time. E.g. I interpret 4.4.1 as "I want to allow ingress traffic from the monitoring namespace to all (tenancy-affected) namespaces", as you can see in the yaml example. (I guess having yamls helps clarify this.)
The same applies to the next comment.

Review thread on npeps/npep-122.md:

    - action: Deny
      from:
      - namespaces: {}
Contributor:

> I want to setup "deny all" BANP to protect cluster workloads, but I want to allow internal connections within tenant.

As with 4.4.1, I think 4.4.2 says it wants baseline deny all for everyone, not just for tenants. Although... that doesn't really make sense: non-tenanted namespaces would be locked down for same-namespace traffic, but tenanted namespaces would be non-locked-down and users wouldn't even have the option of trying to lock them down (which seems to conflict with the admin's goal of "protecting cluster workloads"). As with Story 3, this does not seem like a plausible real-world use case to me.

danwinship (Contributor) commented Mar 7, 2024

OK, Nadia and I met to talk about this.

I don't really like the current "Final suggestion" as proposed, because it makes the common cases complicated: you need both a Tenancy object and a separate ANP or BANP. But as Nadia points out, my version of the proposal doesn't implement all of the user stories we previously agreed upon. And if we do actually agree on all of those user stories, then the API I had proposed wasn't a good API and we should do it differently. 🙁

Part of the problem is that the existing user stories aren't entirely clear (eg, there were several places where they talk about "allowing" traffic, which I interpreted as "don't interfere with", but in Nadia's examples she interpreted as "action: Allow"). And, eg, Nadia's example for 4.3 used an ANP-level Allow rule, which seemed bizarre to me, but it makes a lot more sense if it was supposed to be BANP-level (which is a use case that Nadia brought up several times in our call: basically, "I want default-deny for everything except within-tenant traffic"). (But even with that interpretation it's still not implementable with my proposed API.)

So maybe we need to revisit the user stories again, at least to clarify the wording. It might also help to explicitly indicate the expected results for every interesting pair of pods in each case:

  • a pod connecting to a pod in the same namespace in the same tenant
  • a pod connecting to a pod in a different namespace in the same tenant
  • a pod connecting to a pod in a different tenant
  • a pod connecting to a pod in the same namespace, which is not in a tenant
  • a pod connecting to a pod in a different namespace, neither of which is a tenant
  • a pod in a tenant connecting to a pod in a non-tenant namespace
  • a pod in a non-tenant namespace connecting to a pod in a tenant
  • a pod in a tenant connecting to a cluster-external IP
  • a pod in a non-tenant namespace connecting to a cluster-external IP

In particular, I feel (though Nadia does not) that some of the user stories lead to "bad" results that probably aren't what any administrator would actually want. (eg, "I want default-deny for everything except within-tenant traffic" means that pod to pod in a tenant namespace is allowed by default, but pod to pod in a non-tenant namespace is denied by default. Nadia says they would just have to create NPs in the non-tenant namespaces in that case, but again, to me that seems like a failure of API design on our part; we should have figured out some way to give them the behavior they wanted in both tenanted and non-tenanted namespaces all at once...)

Anyway, I'm on PTO next week so feel free to merge anything you want while I'm not around to object. 🙂

Review thread on npeps/npep-122.md:

    - action: Deny
      from:
      - namespaces: {}
Contributor:

Looking at this again:

  • If the precedence is "ANP, Tenancy, NP, BANP" then the tenancy rule is ignored, because the ANP's Deny would run first
  • If the precedence is "Tenancy, ANP, NP, BANP" then the Allow-from-monitoring rule is ignored, because the Tenancy's Skip rule would bypass the ANP.

Member (author):

In this example it works as your second option, but the Skip action is only applied to sameTenant namespaces, and monitoring is not sameTenant. So traffic from the monitoring ns won't match the tenancy rules (unless there is a tenant that monitoring is a part of, in which case monitoring will be passed as sameTenant).

npinaeva (Member, author) commented Mar 12, 2024

> we should have figured out some way to give them the behavior they wanted in both tenanted and non-tenanted namespaces all at once...

This can be done by assigning every namespace that is not part of a tenant a unique tenancy label, making it a one-namespace tenant. Potentially, an always-allow-same-namespace policy could be expressed as a tenancy with the kubernetes.io/metadata.name tenancy label, but if it needs to be combined with another tenancy... well, we'd need 2 tenancies then (and they would need to be allow-based, not deny-based).
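A sketch of that trick in one possible object shape (the exact kind and fields are still under discussion): since every namespace has a unique kubernetes.io/metadata.name value, this makes each namespace its own tenant, i.e. "always allow same-namespace traffic".

kind: NetworkTenancy
spec:
  tenancyLabels:
    - "kubernetes.io/metadata.name"   # unique per namespace => one-namespace tenants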

While multiple ANPs with the same priority are allowed, we can probably allow multiple Tenancies, or a Tenancy and an ANP, with the same priority; but if we decide to allow only one ANP per priority, Tenancy needs to be accounted for in the same range.

Review thread on npeps/npep-122.md:

**CONS**: BANP doesn't have a priority; to use this method we would need to define a priority for BANP.
Reviewer:

If we're going to have new resource types, I think there's value in aligning any new objects with the naming/structure of the existing objects. Having Admin/BaselineAdmin in the name of the policy resources but then embedding it in the Tenancy resource feels like mixing abstractions and it prevents the A and BA versions from evolving differently.

> to use this method we would need to define a priority for BANP

I don't think that's true; there's no reason that both types have to have the exact same structure:

  • AdminXXX: has priority, fits in with ANP
  • BaselineXXX: no priority(?), always higher precedence than BANP.

Member (author):

I think you are talking about option 4.1: create 2 new objects with implicit priorities (I will add that one too).
The main "con" is that it would be 2 new CRDs with just 2 fields each, while following the existing structure is the "pro", so let's discuss which one makes more sense.


Review thread on npeps/npep-122.md:

spec:
  action: SkipSameTenant/DenyNotSameTenant
Reviewer:

Is Skip the same as Pass in an ANP? I think it is, so let's standardise on one name for this class of action. I vote for Pass, because it matches Calico's name for the same concept.

Member (author):

Oh something went wrong with my wording that day, I did intend to call it Pass, thank you!


Review thread on npeps/npep-122.md:

4. Create 1 new object with implicit priorities.

`precedence` field + reserved highest-priority rule before (B)ANP
Reviewer:

(If I've understood the semantics correctly!) Having no priority in the ANP case feels like it interacts in a strange way with ANP. ANP should be as powerful as or more powerful than namespaced NP, but if you have a Tenancy that does "skip" (assuming skip is the same as "pass" in an ANP), there's no place for an ANP to override it, yet a namespaced NP can.

A concrete example where this would be awkward: if you try to use the auto-created name label for your tenancy so that all namespaces are isolated, then, I think, you'd naturally want to apply admin NPs that override that for kube-system/monitoring/etc. So it undermines story 2 in the KEP:

> As a cluster admin, I want to apply non-overridable allow rules to certain pods(s) and(or) Namespace(s) that enable the selected resources to communicate with all other cluster internal entities.

Member (author):

The skip action is actually made to let NPs "override" something, which an ANP rule can't do. We sometimes explain this action as "skip = delegate to namespaced network policies". It is useful when you want to skip some of the following ANP rules, but don't want to write != exceptions for e.g. same-namespace traffic. Also, a cluster admin doesn't always have a decision (allow or deny) for some traffic, and wants to let network policies decide.
As for the example: if I understand it correctly, you just don't include namespaces that you don't want to be affected by tenancy in the tenancy policy (I'm not sure there is any other need to override a tenancy policy).
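A hedged illustration of "skip = delegate": an ANP Pass rule hands the decision for some traffic to namespaced NetworkPolicies (or BANP) instead of deciding at admin level. The peer syntax follows the other examples in this thread, and the namespace name is illustrative.

apiVersion: policy.networking.k8s.io/v1alpha1
kind: AdminNetworkPolicy
metadata:
  name: delegate-example
spec:
  priority: 5
  subject:
    namespaces: {}
  ingress:
    - action: Pass           # don't decide here; namespaced NetworkPolicies (or BANP) decide
      from:
        - namespaces:
            namespaceSelector:
              matchLabels:
                kubernetes.io/metadata.name: monitoring-ns
    - action: Deny           # applies to everything the Pass rule didn't match
      from:
        - namespaces:
            namespaceSelector: {}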

Review thread on npeps/npep-122.md:

3. Create 2 objects with ANP and BANP priorities (let's say NetworkTenancy and BaselineNetworkTenancy)

kind: NetworkTenancy
Reviewer:

When I read "Tenancy" I think of the singular contract between landlord and tenant. I'd expect each tenant to have a Tenancy whereas this object is a policy for how tenancies are implicitly created... So, perhaps it should be (Baseline)AdminNetworkTenancyPolicy to fit in with the other resources. Verbose, but that is the k8s way!

Is there any chance this will grow beyond the Tenant use case. Might we regret not calling it AdminNetworkPartitioningPolicy or something "generic" like that?

Member (author):

Good point; we actually had a working name of TenancyNetworkPolicy (which I haven't added to the NPEP yet, but now I will), lmk if that sounds good.
About future use cases: we usually try not to predict future use cases and not to over-engineer APIs before it's needed. So if we need more similar functionality at some point, we'll probably create a new API and some migration tool, if that makes sense.


Review thread on npeps/npep-122.md:

      from:
      - namespaces: {}
tssurya (Contributor) commented Apr 5, 2024

All the examples provided here are priceless and well thought out, thank you Nadia!!
I have a silly question: whichever way we decide to go, if I do:

tenancyLabels:
- "user"  (nsA: prod, nsB: staging, nsC: prod, nsD: staging, nsE: prod)
- "color" (nsA: red, nsB: red, nsC: blue, nsD: blue, nsE: red)

then the subject is all namespaces that have either of these tenancyLabels set, right? (not ALL?)

And does the concept of sameTenant (throughout this NPEP) mean any other tenant that has either of these keys with the same value, or any tenant that has ALL of these keys with the same value; is it && or ||?

So who is sameTenant here for nsA? Is it nsC, nsB and nsE? Or only nsE?

Member (author):

We have mostly discussed this for the one-label case, but I think it applies to multiple labels too (I will add a section for that).
For one label, we decided that only namespaces that have that label will be selected, to avoid a potential None tenant. Therefore, for multiple labels, only namespaces that have all the tenancy labels will be selected, for the same reason. So sameTenant for nsA is: namespaces with user=prod AND color=red.
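A worked example with the labels from the question above:

# tenancyLabels: ["user", "color"]
# nsA: user=prod,    color=red   -> tenant (prod, red)
# nsB: user=staging, color=red   -> tenant (staging, red)
# nsC: user=prod,    color=blue  -> tenant (prod, blue)
# nsD: user=staging, color=blue  -> tenant (staging, blue)
# nsE: user=prod,    color=red   -> tenant (prod, red)
#
# All tenancy labels must match (&&), so sameTenant for nsA is only nsE.
# A namespace missing either label is not selected by the tenancy at all.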

k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Mark this PR as fresh with /remove-lifecycle stale
  • Close this PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 5, 2024
npinaeva (Member, author)

Current state: we need to agree on the user story clarification (already in the commit) first before moving on to the API discussion.

k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Mark this PR as fresh with /remove-lifecycle rotten
  • Close this PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Nov 9, 2024
tssurya (Contributor) commented Nov 19, 2024

/remove-lifecycle rotten

@k8s-ci-robot k8s-ci-robot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Nov 19, 2024