Implement inline CIDR block egress peer #185
Conversation
apis/v1alpha1/shared_types.go
// +optional
// +kubebuilder:validation:MinItems=1
// +kubebuilder:validation:MaxItems=100
Networks []string `json:"networks,omitempty" validate:"omitempty,dive,cidr"`
Need to come back to how we want the validation of the CIDRs to be done.
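(Aside for later readers: the `validate:"omitempty,dive,cidr"` struct tag above is go-playground/validator syntax, which, as far as I can tell, the apiserver does not evaluate on its own; something would have to invoke the validator explicitly, roughly as in the hypothetical sketch below. That is part of why the discussion below moves toward kubebuilder/CEL markers instead. The struct and variable names here are illustrative, not from the PR.)

```go
package main

import (
	"fmt"

	"github.com/go-playground/validator/v10"
)

// egressPeer is a hypothetical mirror of the field above, only to show what
// would need to run for the `validate` tag to have any effect.
type egressPeer struct {
	Networks []string `validate:"omitempty,dive,cidr"`
}

func main() {
	v := validator.New()
	fmt.Println(v.Struct(egressPeer{Networks: []string{"10.0.0.0/8", "fd00::/64"}})) // <nil>
	fmt.Println(v.Struct(egressPeer{Networks: []string{"not-a-cidr"}}))              // error on the "cidr" tag
}
```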
Using https://blog.markhatton.co.uk/2011/03/15/regular-expressions-for-ip-addresses-cidr-ranges-and-hostnames/ for now (taken from ocp/api).
/test pull-network-policy-api-verify
@tssurya You just need to run gofmt here
/retest
apis/v1alpha1/shared_types.go
// <network-policy-api:experimental>
// +optional
// +kubebuilder:validation:MinItems=1
// +kubebuilder:validation:MaxItems=45
So the moment I set this to 50, I exceed the cost quota for the CRD.
I think it's because each CRD can have 100 egress rules, each rule can have 100 egress peers, and each peer can have 50 or 100 network CIDRs, so it blows out of proportion for CEL validation. Thus the max we can do here is 45, for a length of 48.
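(A rough back-of-the-envelope illustration of the multiplication being described, using the limits mentioned in this comment; the real CEL cost estimator is more involved, so treat this only as intuition. The package and constant names are made up.)

```go
package sizing

// Worst-case number of CIDR strings a single AdminNetworkPolicy object could
// hold, which is roughly what the apiserver's CEL cost budget has to cover.
const (
	maxEgressRules  = 100 // egress rules per policy
	maxPeersPerRule = 100 // peers per egress rule
	maxCIDRsPerPeer = 45  // networks per peer (the value chosen above)

	worstCaseCIDRs = maxEgressRules * maxPeersPerRule * maxCIDRsPerPeer // 450000
)
```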
Either way, this can be done only from Kubernetes 1.31 onwards; we need to go with a regex until then.
In general this is pretty simple, but I think you need to decide whether IPv4-in-IPv6 addresses are supported or not. I would suggest either simplifying the regex to prevent this, or preventing the presence of `.` and `:` in the same string by adding a CEL validation:

+kubebuilder:validation:XValidation:rule="self.contains(':') != self.contains('.')",message="CIDR must be either an IPv4 or IPv6 address. IPv4 address embedded in IPv6 addresses are not supported"
Ack, thanks @JoelSpeed: I will double check whether we want to do v4-in-v6 support and then change things accordingly. What about
I will double check, but I'm pretty sure when implementing it we excluded 4-in-6 support. Edit: 4-in-6 is explicitly disallowed in CEL.
Oh, yeah, we don't. (`isCIDR` is exactly correct for what we want.)
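(For reference, a sketch of what the marker-driven validation could look like once the CEL IP/CIDR library is usable for CRD validation (Kubernetes 1.31, per the TODO below); the type name and the exact combination of rules are illustrative, not the final API. It simply pairs `isCIDR()` with the 4-in-6 exclusion suggested earlier in this thread.)

```go
// CIDR is an illustrative string type holding a single IPv4 or IPv6 prefix.
// +kubebuilder:validation:XValidation:rule="isCIDR(self)",message="must be a valid CIDR, e.g. 10.0.0.0/8 or fd00::/8"
// +kubebuilder:validation:XValidation:rule="self.contains(':') != self.contains('.')",message="IPv4 addresses embedded in IPv6 CIDRs are not supported"
type CIDR string
```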
apis/v1alpha1/shared_types.go
// has no impact on the behavior of the policy applied to the peer
// trying to talk to the Service. It will work in the same way as if the
// Service didn't exist since policy is applied after ServiceVIP (clusterIP,
// externalIP, loadBalancerIngressVIP) is rewritten to the backendIPs.
Unfortunately, this isn't quite right; ClusterIPs and ExternalIPs are always rewritten before NP enforcement, but load balancer IPs sometimes aren't; in some cases, pod-to-LoadBalancer traffic must be sent to the load balancer rather than being processed locally. (This would happen either if the CloudProvider sets `hostname` rather than `ip` in the service's `LoadBalancerStatus` (like AWS), or if it sets `ip` but also uses the new `ipMode: Proxy`.)
And once a packet leaves the node and goes to an external load balancer, then all bets are off. When the packet comes back into the cluster, it's not even theoretically possible for the ANP implementation to reliably figure out its original source IP (since the LB may have transformed the connection in arbitrary ways).
(And of course, if the Service has endpoint IPs that are cluster-external, then a pod-to-LoadBalancer connection that goes to the LB would presumably then go directly to the cluster-external IPs, without coming back into the cluster at all, so there's no way the ANP implementation could enforce a `Networks` selector that blocked cluster-external IPs at that point.)
So I can think of four options:

1. Require LoadBalancer implementations to participate in ANP enforcement, so that we can guarantee that pod-to-LB connections will have the same semantics as pod-to-ClusterIP. This seems really really unlikely to happen, but I wanted to list it just for completeness.
2. Require ANP implementations to "pre-approve" pod-to-LB traffic by figuring out "if I were to rewrite this traffic myself, would it then pass ANP?", and if not, then drop it rather than forwarding it to the LB. Note that in the AWS-like case this requires using FQDN rules to recognize that traffic is headed for an LB. And then there are theoretical edge cases like "what if one of the service's endpoints is allowed but another one isn't?". So this also doesn't seem great.
3. Declare that it's undefined whether you can match post-LB traffic (IOW, a `Pods` selector may or may not catch pod-to-LB-to-pod traffic, and a `Networks` selector may or may not catch pod-to-LB-to-cluster-external traffic), but declare that you can block pre-LB traffic with a `Networks` selector that matches the LB IP. (This might be annoying for implementations to get right in the case where the LB traffic doesn't leave the node, but it should not actually be impossible for anyone.) If we were going to do this, we would probably want to say that you can block pod-to-`externalIPs` traffic too, for consistency, even though we don't run into the same "we don't know what the cloud infra is doing" problem there. (In fact, we could potentially say that this applies to `clusterIPs` too... we'd have to think that through. This is potentially the simplest solution though, because then the rule just becomes "if the packet got DNATted, then check both the original and the new dest IP against all policies, rather than only checking the new dest IP".)
4. Declare that all pod-to-LB traffic has undefined semantics with respect to ANP, and policies may have different effects in different clusters. (This is easiest for implementors, but worst for users. And of course, this is what NP does because we hadn't noticed the problem until it was too late and there were already conflicting implementations.)
FTR, note that in the cluster-ingress case (which is even worse), I feel like there's a chance the answer will end up being "delegate policy enforcement to the Gateway rather than trying to do it in ANP". I'm not sure if that has any implications for trying to decide what to do here.
(People may also want to consider how they expect ANPs to interact with other implementation-specific egress functionality (eg, EgressIPs), and whether that implies anything about the best choice for LB behavior.)
> Unfortunately, this isn't quite right; ClusterIPs and ExternalIPs are always rewritten before NP enforcement, but load balancer IPs sometimes aren't; in some cases, pod-to-LoadBalancer traffic must be sent to the load balancer rather than being processed locally. (This would happen either if the CloudProvider sets `hostname` rather than `ip` in the service's `LoadBalancerStatus` (like AWS), or if it sets `ip` but also uses the new `ipMode: Proxy`.)
>
> And once a packet leaves the node and goes to an external load balancer, then all bets are off. When the packet comes back into the cluster, it's not even theoretically possible for the ANP implementation to reliably figure out its original source IP (since the LB may have transformed the connection in arbitrary ways).
OH I completely forgot about the aws LB use case.. :/ great call out.
> (And of course, if the Service has endpoint IPs that are cluster-external, then a pod-to-LoadBalancer connection that goes to the LB would presumably then go directly to the cluster-external IPs, without coming back into the cluster at all, so there's no way the ANP implementation could enforce a `Networks` selector that blocked cluster-external IPs at that point.)
YEA,,,,!
> So I can think of four options:
>
> 1. Require LoadBalancer implementations to participate in ANP enforcement, so that we can guarantee that pod-to-LB connections will have the same semantics as pod-to-ClusterIP. This seems really really unlikely to happen, but I wanted to list it just for completeness.
> 2. Require ANP implementations to "pre-approve" pod-to-LB traffic by figuring out "if I were to rewrite this traffic myself, would it then pass ANP?", and if not, then drop it rather than forwarding it to the LB. Note that in the AWS-like case this requires using FQDN rules to recognize that traffic is headed for an LB. And then there are theoretical edge cases like "what if one of the service's endpoints is allowed but another one isn't?". So this also doesn't seem great.
Yeah, except for the AWS case, other ingress VIP types will work well with this solution. Even with the way Services are implemented there is inconsistency with LBs, because some are hairpinned within the cluster, some are sent out, etc., but I guess even in that case we didn't really define what is correct for implementations to follow...
> 3. Declare that it's undefined whether you can match post-LB traffic (IOW, a `Pods` selector may or may not catch pod-to-LB-to-pod traffic, and a `Networks` selector may or may not catch pod-to-LB-to-cluster-external traffic), but declare that you can block pre-LB traffic with a `Networks` selector that matches the LB IP. (This might be annoying for implementations to get right in the case where the LB traffic doesn't leave the node, but it should not actually be impossible for anyone.)
Yea... at least in ovnk we won't block pre-LB traffic; it's gonna be applied after the DNAT, except in the AWS case where we will :D
> If we were going to do this, we would probably want to say that you can block pod-to-`externalIPs` traffic too, for consistency, even though we don't run into the same "we don't know what the cloud infra is doing" problem there. (In fact, we could potentially say that this applies to `clusterIPs` too... we'd have to think that through. This is potentially the simplest solution though, because then the rule just becomes "if the packet got DNATted, then check both the original and the new dest IP against all policies, rather than only checking the new dest IP".)
>
> 4. Declare that all pod-to-LB traffic has undefined semantics with respect to ANP, and policies may have different effects in different clusters. (This is easiest for implementors, but worst for users. And of course, this is what NP does because we hadn't noticed the problem until it was too late and there were already conflicting implementations.)
I like this the BEST. :D But perhaps we are then following in the footsteps of something NetPol did badly and we don't necessarily want to do that to users?
> FTR, note that in the cluster-ingress case (which is even worse), I feel like there's a chance the answer will end up being "delegate policy enforcement to the Gateway rather than trying to do it in ANP". I'm not sure if that has any implications for trying to decide what to do here.
>
> (People may also want to consider how they expect ANPs to interact with other implementation-specific egress functionality (eg, EgressIPs), and whether that implies anything about the best choice for LB behavior.)
Yeah, good point about cross-feature functionality and user experience.
> 3. Declare that it's undefined whether you can match post-LB traffic (IOW, a `Pods` selector may or may not catch pod-to-LB-to-pod traffic, and a `Networks` selector may or may not catch pod-to-LB-to-cluster-external traffic), but declare that you can block pre-LB traffic with a `Networks` selector that matches the LB IP. (This might be annoying for implementations to get right in the case where the LB traffic doesn't leave the node, but it should not actually be impossible for anyone.)
>
> yea... at least in ovnk we won't block pre-LB traffic; it's gonna be applied after the DNAT, except in the AWS case where we will :D
ugh...
So what I was trying to say with "check both the original and the new dest IP against all policies, rather than only checking the new dest IP" was that, if you have a rule saying "Deny to 192.168.0.0/24", then instead of translating that into a single rule that only checks the real/current dest IP, you'd translate it into two rules. eg:
# iptables
... -d 192.168.0.0/24 ... -j DROP
... -m conntrack --ctstate DNAT --ctorigdst 192.168.0.0/24 ... -j DROP
# OVS
... nw_dst=192.168.0.0/24, actions=drop
... ct_state=+dnat,ct_nw_dst=192.168.0.0/24, actions=drop
(except more optimized)
But it looks like OVN ACL expressions don't expose OVS `ct_nw_dst`... (`ct.dnat` is the equivalent of `ct_state=+dnat`, but there's no equivalent of `ct_nw_dst` 🙁)
> 4. Declare that all pod-to-LB traffic has undefined semantics with respect to ANP, and policies may have different effects in different clusters....
>
> I like this the BEST. :D But perhaps we are then following in the footsteps of something NetPol did badly and we don't necessarily want to do that to users?
That's how I feel. I mean, think of the user stories that would have had to be there to get you to choose this behavior on purpose. "As an administrator, I need to be able to block certain users from accessing X. But if the user creates a LoadBalancer Service pointing to it, then whatever, I guess I don't care in that case."
(Admittedly, with the current default RBAC permissions, users cannot create Services that point to cluster-external IPs or to another user's pods... So... maybe we can get away with that?)
My initial inclination is that wherever possible we should try to avoid CIDR-based matches for rich k8s constructs, because it means that every time something relocates elsewhere on the network, the user must update their network policies. The beauty of policies for Pods is that the Pods can churn as much as they like, move IPs, redeploy elsewhere, etc, and this doesn't impact the user's specification of the policy. It's just up to the implementation to learn the right information and apply the policy correctly. (And sure, we can always paper over this with controllers that react to stuff in the network and write the ANPs, but that seems like the implementation detail rather than the user-facing API; implementations should have the freedom to choose if that's the way they want it to work).
Whether this "avoid CIDR applying to services" idea means aiming for the node to do a best-effort translation and apply policy based on that, or whether to have an explicit selector that is specifically for selecting traffic towards services by the service's labels, or one of the other options... I'm not sure.
> declare that you can block pre-LB traffic with a `Networks` selector that matches the LB IP. (This might be annoying for implementations to get right in the case where the LB traffic doesn't leave the node, but it should not actually be impossible for anyone.)
If we're talking "how to apply the policy in cases where the CNI is not responsible for service translation", then I think this is doable in Cilium without a huge amount of effort; but if this is intended to apply to all services, including ones that the CNI is responsible for translating, then this seems to equate to building out a whole extra policy calculation + enforcement stage. So yes, a pain.
Can't we say that CIDR-based policy enforcement should happen on the original packet sent by the pod (pod-selector rules can still be applied after service IP resolution)? Then the only thing that matters is what the pod sends/receives as dst/src IP. Considering the subject is always a pod, it kind of makes sense to think about IPs in terms of what the pod sees, and not how the packet will be tweaked/routed/etc. This could also be enriched later with extra LB/gateway/etc. policies?
apis/v1alpha1/shared_types.go
// such a VIP. For example, a Networks selector that denies traffic to the entire
// service CIDR will not actually block any service traffic.
What if the VIP has no backends/endpoints? The implementation can't translate the traffic, so at that point is the traffic subject to CIDR policy or not? I believe that the behaviour in Cilium today is to drop this traffic with an explicit error code, but not motivated by policy at all (before the policy check).
(Other than that, the "translate first, apply policy after" pattern is how Cilium works with respect to policy applying to service traffic.)
Ah! I think this question is also related to the `LoadBalancer` behaviour question above 😅. If we consider that the `LoadBalancer` "doesn't have any backends", because actually the implementation of the LB is external to the current node, then how would we apply that policy? Interesting. #185 (comment)
Yes, right, there is a discrepancy with LBs; I will open a new issue to track that.
FYI for reviewers, this PR's scope is to get the CIDR peer in, and I will add a TODO linking to #203 to clarify the issues around load balancers...
apis/v1alpha1/shared_types.go
// https://blog.markhatton.co.uk/2011/03/15/regular-expressions-for-ip-addresses-cidr-ranges-and-hostnames/
// The resulting regex is an OR of both regexes. IPv4 address embedded in IPv6 addresses are not supported.
// TODO: Change the CIDR's validation regex to use CEL isCIDR() in Kube 1.31 when it is available.
// +kubebuilder:validation:Pattern=`(^(([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){3}([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])(\/(3[0-2]|[1-2][0-9]|[0-9]))$)|(^s*((([0-9A-Fa-f]{1,4}:){7}([0-9A-Fa-f]{1,4}|:))|(([0-9A-Fa-f]{1,4}:){6}(:[0-9A-Fa-f]{1,4}|((25[0-5]|2[0-4]d|1dd|[1-9]?d)(.(25[0-5]|2[0-4]d|1dd|[1-9]?d)){3})|:))|(([0-9A-Fa-f]{1,4}:){5}(((:[0-9A-Fa-f]{1,4}){1,2})|:((25[0-5]|2[0-4]d|1dd|[1-9]?d)(.(25[0-5]|2[0-4]d|1dd|[1-9]?d)){3})|:))|(([0-9A-Fa-f]{1,4}:){4}(((:[0-9A-Fa-f]{1,4}){1,3})|((:[0-9A-Fa-f]{1,4})?:((25[0-5]|2[0-4]d|1dd|[1-9]?d)(.(25[0-5]|2[0-4]d|1dd|[1-9]?d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){3}(((:[0-9A-Fa-f]{1,4}){1,4})|((:[0-9A-Fa-f]{1,4}){0,2}:((25[0-5]|2[0-4]d|1dd|[1-9]?d)(.(25[0-5]|2[0-4]d|1dd|[1-9]?d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){2}(((:[0-9A-Fa-f]{1,4}){1,5})|((:[0-9A-Fa-f]{1,4}){0,3}:((25[0-5]|2[0-4]d|1dd|[1-9]?d)(.(25[0-5]|2[0-4]d|1dd|[1-9]?d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){1}(((:[0-9A-Fa-f]{1,4}){1,6})|((:[0-9A-Fa-f]{1,4}){0,4}:((25[0-5]|2[0-4]d|1dd|[1-9]?d)(.(25[0-5]|2[0-4]d|1dd|[1-9]?d)){3}))|:))|(:(((:[0-9A-Fa-f]{1,4}){1,7})|((:[0-9A-Fa-f]{1,4}){0,5}:((25[0-5]|2[0-4]d|1dd|[1-9]?d)(.(25[0-5]|2[0-4]d|1dd|[1-9]?d)){3}))|:)))(%.+)?s*(\/(12[0-8]|1[0-1][0-9]|[1-9][0-9]|[0-9]))$)` |
Isn't this regex completely wrong? I think all backslash characters got dropped on that random blog page it was taken from 😬 Look e.g. at the IPv6 address regex there: it starts with `^s*` instead of `^\s*`, which makes much more sense... And in this particular one, look at e.g. `1dd` - I guess it should have been `1\d\d` instead?
Hi @mkow! Thanks for the review:
(^s*((([0-9A-Fa-f]{1,4}:){7}([0-9A-Fa-f]{1,4}|:))|(([0-9A-Fa-f]{1,4}:){6}(:[0-9A-Fa-f]{1,4}|((25[0-5]|2[0-4]d|1dd|[1-9]?d)(.(25[0-5]|2[0-4]d|1dd|[1-9]?d)){3})|:))|(([0-9A-Fa-f]{1,4}:){5}(((:[0-9A-Fa-f]{1,4}){1,2})|:((25[0-5]|2[0-4]d|1dd|[1-9]?d)(.(25[0-5]|2[0-4]d|1dd|[1-9]?d)){3})|:))|(([0-9A-Fa-f]{1,4}:){4}(((:[0-9A-Fa-f]{1,4}){1,3})|((:[0-9A-Fa-f]{1,4})?:((25[0-5]|2[0-4]d|1dd|[1-9]?d)(.(25[0-5]|2[0-4]d|1dd|[1-9]?d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){3}(((:[0-9A-Fa-f]{1,4}){1,4})|((:[0-9A-Fa-f]{1,4}){0,2}:((25[0-5]|2[0-4]d|1dd|[1-9]?d)(.(25[0-5]|2[0-4]d|1dd|[1-9]?d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){2}(((:[0-9A-Fa-f]{1,4}){1,5})|((:[0-9A-Fa-f]{1,4}){0,3}:((25[0-5]|2[0-4]d|1dd|[1-9]?d)(.(25[0-5]|2[0-4]d|1dd|[1-9]?d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){1}(((:[0-9A-Fa-f]{1,4}){1,6})|((:[0-9A-Fa-f]{1,4}){0,4}:((25[0-5]|2[0-4]d|1dd|[1-9]?d)(.(25[0-5]|2[0-4]d|1dd|[1-9]?d)){3}))|:))|(:(((:[0-9A-Fa-f]{1,4}){1,7})|((:[0-9A-Fa-f]{1,4}){0,5}:((25[0-5]|2[0-4]d|1dd|[1-9]?d)(.(25[0-5]|2[0-4]d|1dd|[1-9]?d)){3}))|:)))(%.+)?s*(\/(12[0-8]|1[0-1][0-9]|[1-9][0-9]|[0-9]))$)
is the v6 CIDR regex that I have here..
The blog: https://blog.markhatton.co.uk/2011/03/15/regular-expressions-for-ip-addresses-cidr-ranges-and-hostnames/ has
^s*((([0-9A-Fa-f]{1,4}:){7}([0-9A-Fa-f]{1,4}|:))|(([0-9A-Fa-f]{1,4}:){6}(:[0-9A-Fa-f]{1,4}|((25[0-5]|2[0-4]d|1dd|[1-9]?d)(.(25[0-5]|2[0-4]d|1dd|[1-9]?d)){3})|:))|(([0-9A-Fa-f]{1,4}:){5}(((:[0-9A-Fa-f]{1,4}){1,2})|:((25[0-5]|2[0-4]d|1dd|[1-9]?d)(.(25[0-5]|2[0-4]d|1dd|[1-9]?d)){3})|:))|(([0-9A-Fa-f]{1,4}:){4}(((:[0-9A-Fa-f]{1,4}){1,3})|((:[0-9A-Fa-f]{1,4})?:((25[0-5]|2[0-4]d|1dd|[1-9]?d)(.(25[0-5]|2[0-4]d|1dd|[1-9]?d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){3}(((:[0-9A-Fa-f]{1,4}){1,4})|((:[0-9A-Fa-f]{1,4}){0,2}:((25[0-5]|2[0-4]d|1dd|[1-9]?d)(.(25[0-5]|2[0-4]d|1dd|[1-9]?d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){2}(((:[0-9A-Fa-f]{1,4}){1,5})|((:[0-9A-Fa-f]{1,4}){0,3}:((25[0-5]|2[0-4]d|1dd|[1-9]?d)(.(25[0-5]|2[0-4]d|1dd|[1-9]?d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){1}(((:[0-9A-Fa-f]{1,4}){1,6})|((:[0-9A-Fa-f]{1,4}){0,4}:((25[0-5]|2[0-4]d|1dd|[1-9]?d)(.(25[0-5]|2[0-4]d|1dd|[1-9]?d)){3}))|:))|(:(((:[0-9A-Fa-f]{1,4}){1,7})|((:[0-9A-Fa-f]{1,4}){0,5}:((25[0-5]|2[0-4]d|1dd|[1-9]?d)(.(25[0-5]|2[0-4]d|1dd|[1-9]?d)){3}))|:)))(%.+)?s*(\/(12[0-8]|1[0-1][0-9]|[1-9][0-9]|[0-9]))$
So do you mean the regex on that blog is also wrong and we need backslashes? Almost everywhere I looked online I didn't see backslashes; having said that, I am no expert here. Could you please provide me with your alternative source?
cc @JoelSpeed we do the same in OCP here: https://github.com/openshift/api/blob/ce10821dc9997694d8058564308819011d27383e/config/v1/types_infrastructure.go#L1862, which is where I got it from. I plan to write some tests on this for sure, but I'd think it can't be completely wrong if we are using it there, unless I am missing something very obvious here? cc @mkowalski
Hey all, this is the regex we use in OCP, and of course there may be a dozen corner cases it is not catching. Given how tricky the syntax of IPv6 addresses is, I am not sure it is possible to have an entirely correct regex (but a proof would be needed).
@mkow can you please provide an example of an IPv6 address that is correct and is not caught by this regex? Or can you provide an example of a non-IPv6 string that passes this regex? That would definitely help.
> so do you mean the regex on that blog is also wrong and we need backslash?

I would rather say: don't blindly copy-paste things without understanding them ;) Yes, the blog is completely wrong.

> Could you please provide me with your alternative source?

I don't have any, sorry. I guess, just read the IPv6 CIDR specification and build the regex carefully yourself. Or don't use a regex for this purpose.

> there may be a dozen of corner cases it is not catching

It's not about corner cases; this regex is just some garbage which (I guess) got mangled when the author of that blog tried to add it to the HTML on their blog. Most of it doesn't make any sense, just read it yourself.

> @mkow can you please provide an example of IPv6 address that is correct and is not catched by this regex? Or can you provide an example of non-IPv6 string that is passing this regex? That would definitely help

For example, the regex from this PR accepts this as a valid IPv6 CIDR: `sssss::1ddd1ddd1ddd1dd%!@#$%^&*()/0`
If it helps, here's the regex + tests we have in Cilium for the equivalent validation pattern: https://github.com/cilium/cilium/blob/25d946d3d88e0be2ccbcec7bb12a99905729ba16/pkg/policy/api/cidr_test.go#L156
Thanks @mkow, yeah, that one looks like a nice tester! Let me see if this works: https://www.perplexity.ai/search/what-is-the-ILs5tiG7RHWJo4ZzyL7t8A :)
Heh, clearly it didn't; it's copying from whatever sources are out there, which all seem wrong.. :/
May be worth comparing with Go unit tests too for some corner cases: https://cs.opensource.google/go/go/+/refs/tags/go1.22.0:src/net/netip/netip_test.go
I also came across a regex that might be easier to understand as a switch statement over the different allowable address patterns, using regex-vis.com; ref: https://ihateregex.io/expr/ipv6/
I'm not saying this has full coverage for all allowed cases, but I can comprehend what the cases are using regex-vis.com
(([0-9a-fA-F]{1,4}:){7,7}[0-9a-fA-F]{1,4}|([0-9a-fA-F]{1,4}:){1,7}:|([0-9a-fA-F]{1,4}:){1,6}:[0-9a-fA-F]{1,4}|([0-9a-fA-F]{1,4}:){1,5}(:[0-9a-fA-F]{1,4}){1,2}|([0-9a-fA-F]{1,4}:){1,4}(:[0-9a-fA-F]{1,4}){1,3}|([0-9a-fA-F]{1,4}:){1,3}(:[0-9a-fA-F]{1,4}){1,4}|([0-9a-fA-F]{1,4}:){1,2}(:[0-9a-fA-F]{1,4}){1,5}|[0-9a-fA-F]{1,4}:((:[0-9a-fA-F]{1,4}){1,6})|:((:[0-9a-fA-F]{1,4}){1,7}|:)|fe80:(:[0-9a-fA-F]{0,4}){0,4}%[0-9a-zA-Z]{1,}|::(ffff(:0{1,4}){0,1}:){0,1}((25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9])\.){3,3}(25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9])|([0-9a-fA-F]{1,4}:){1,4}:((25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9])\.){3,3}(25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9]))
If only we could embed a comment string for each regexp group that tools like regex-vis.com could render, to provide context for the intent of each grouping. The lack of 'intent' is what makes complicated regexps so brittle and hard to maintain.
Thanks all, I'll discuss with the repo maintainers and k8s API reviewers to see how to move forward on this one. Considering the API is not GA-ed, we might skip the regex for now (I find it tricky to get the absolutely correct regex and don't want to risk going with a wrong regex leading to breakages) until 1.31, where we introduce `isCIDR()`. This might mean implementations need to do their own validation using `net.ParseCIDR()`, but I think we can live with that. If anyone has strong concerns or suggestions about this, please lmk here on the thread.
PS: This also stems from the fact that I don't have cycles to go off and build a v6 regex currently, so if someone is willing to do that, I'm happy to give credit there as well.
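(For context, a minimal sketch of the kind of implementation-side check the comment above refers to; it is not part of this PR, and the helper name is made up. Note that `net.ParseCIDR` accepts prefixes with host bits set, such as `10.0.0.1/24`, so an implementation wanting strict prefixes might also compare against the masked network, as below.)

```go
package main

import (
	"fmt"
	"net"
)

// validateCIDRs is a hypothetical helper an implementation could use to check
// the Networks field until CEL isCIDR() validation is available on the CRD.
func validateCIDRs(cidrs []string) error {
	for _, c := range cidrs {
		ip, ipNet, err := net.ParseCIDR(c)
		if err != nil {
			return fmt.Errorf("invalid CIDR %q: %w", c, err)
		}
		// Optionally reject entries with host bits set, e.g. "10.0.0.1/24".
		if !ip.Equal(ipNet.IP) {
			return fmt.Errorf("CIDR %q has host bits set; expected %s", c, ipNet)
		}
	}
	return nil
}

func main() {
	fmt.Println(validateCIDRs([]string{"10.0.0.0/8", "fd00::/64"})) // <nil>
	fmt.Println(validateCIDRs([]string{"10.0.0.1/24"}))             // error: host bits set
}
```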
This PR adds support for implementing inline CIDR peer blocks. Signed-off-by: Surya Seetharaman <[email protected]>
LGTM
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: astoycos, Dyanngg, tssurya.