Implement inline CIDR block egress peer #185
Conversation
apis/v1alpha1/shared_types.go
// +optional
// +kubebuilder:validation:MinItems=1
// +kubebuilder:validation:MaxItems=100
Networks []string `json:"networks,omitempty" validate:"omitempty,dive,cidr"`
Need to come back to how we want the validation of the CIDRs to be done.
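(Aside for later readers: the `validate:"omitempty,dive,cidr"` struct tag above is go-playground/validator syntax, which, as far as I can tell, the apiserver does not evaluate on its own; something would have to invoke the validator explicitly, roughly as in the hypothetical sketch below. That is part of why the discussion below moves toward kubebuilder/CEL markers instead. The struct and variable names here are illustrative, not from the PR.)

```go
package main

import (
	"fmt"

	"github.com/go-playground/validator/v10"
)

// egressPeer is a hypothetical mirror of the field above, only to show what
// would need to run for the `validate` tag to have any effect.
type egressPeer struct {
	Networks []string `validate:"omitempty,dive,cidr"`
}

func main() {
	v := validator.New()
	fmt.Println(v.Struct(egressPeer{Networks: []string{"10.0.0.0/8", "fd00::/64"}})) // <nil>
	fmt.Println(v.Struct(egressPeer{Networks: []string{"not-a-cidr"}}))              // error on the "cidr" tag
}
```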
Using https://blog.markhatton.co.uk/2011/03/15/regular-expressions-for-ip-addresses-cidr-ranges-and-hostnames/ for now (taken from ocp/api).
/test pull-network-policy-api-verify
@tssurya You just need to run gofmt here
/retest
apis/v1alpha1/shared_types.go
// <network-policy-api:experimental>
// +optional
// +kubebuilder:validation:MinItems=1
// +kubebuilder:validation:MaxItems=45
So the moment I set this to 50, I exceed the cost quota for the CRD.
I think it's because each CRD can have 100 egress rules, each rule can have 100 egress peers, and each peer can have 50 or 100 network CIDRs, so it blows out of proportion for CEL validation. Thus the max we can do here is 45, for a length of 48.
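(A rough back-of-the-envelope illustration of the multiplication being described, using the limits mentioned in this comment; the real CEL cost estimator is more involved, so treat this only as intuition. The package and constant names are made up.)

```go
package sizing

// Worst-case number of CIDR strings a single AdminNetworkPolicy object could
// hold, which is roughly what the apiserver's CEL cost budget has to cover.
const (
	maxEgressRules  = 100 // egress rules per policy
	maxPeersPerRule = 100 // peers per egress rule
	maxCIDRsPerPeer = 45  // networks per peer (the value chosen above)

	worstCaseCIDRs = maxEgressRules * maxPeersPerRule * maxCIDRsPerPeer // 450000
)
```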
Either way, this can be done only from Kubernetes 1.31 onwards; we need to go with a regex until then.
In general this is pretty simple, but I think you need to decide whether IPv4-in-IPv6 addresses are supported or not. I would suggest either simplifying the regex to prevent this, or preventing the presence of `.` and `:` in the same string by adding a CEL validation:

+kubebuilder:validation:XValidation:rule="self.contains(':') != self.contains('.')",message="CIDR must be either an IPv4 or IPv6 address. IPv4 address embedded in IPv6 addresses are not supported"
Ack, thanks @JoelSpeed: I will double check whether we want to do v4-in-v6 support and then change things accordingly. What about
I will double check, but I'm pretty sure when implementing it we excluded 4-in-6 support. Edit: 4-in-6 is explicitly disallowed in CEL.
Oh, yeah, we don't. (`isCIDR` is exactly correct for what we want.)
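(For reference, a sketch of what the marker-driven validation could look like once the CEL IP/CIDR library is usable for CRD validation (Kubernetes 1.31, per the TODO below); the type name and the exact combination of rules are illustrative, not the final API. It simply pairs `isCIDR()` with the 4-in-6 exclusion suggested earlier in this thread.)

```go
// CIDR is an illustrative string type holding a single IPv4 or IPv6 prefix.
// +kubebuilder:validation:XValidation:rule="isCIDR(self)",message="must be a valid CIDR, e.g. 10.0.0.0/8 or fd00::/8"
// +kubebuilder:validation:XValidation:rule="self.contains(':') != self.contains('.')",message="IPv4 addresses embedded in IPv6 CIDRs are not supported"
type CIDR string
```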
apis/v1alpha1/shared_types.go
// has no impact on the behavior of the policy applied to the peer
// trying to talk to the Service. It will work in the same way as if the
// Service didn't exist since policy is applied after ServiceVIP (clusterIP,
// externalIP, loadBalancerIngressVIP) is rewritten to the backendIPs.
Unfortunately, this isn't quite right; ClusterIPs and ExternalIPs are always rewritten before NP enforcement, but load balancer IPs sometimes aren't; in some cases, pod-to-LoadBalancer traffic must be sent to the load balancer rather than being processed locally. (This would happen either if the CloudProvider sets `hostname` rather than `ip` in the service's `LoadBalancerStatus` (like AWS), or if it sets `ip` but also uses the new `ipMode: Proxy`.)
And once a packet leaves the node and goes to an external load balancer, then all bets are off. When the packet comes back into the cluster, it's not even theoretically possible for the ANP implementation to reliably figure out its original source IP (since the LB may have transformed the connection in arbitrary ways).
(And of course, if the Service has endpoint IPs that are cluster-external, then a pod-to-LoadBalancer connection that goes to the LB would presumably then go directly to the cluster-external IPs, without coming back into the cluster at all, so there's no way the ANP implementation could enforce a `Networks` selector that blocked cluster-external IPs at that point.)
So I can think of four options:

1. Require LoadBalancer implementations to participate in ANP enforcement, so that we can guarantee that pod-to-LB connections will have the same semantics as pod-to-ClusterIP. This seems really really unlikely to happen, but I wanted to list it just for completeness.
2. Require ANP implementations to "pre-approve" pod-to-LB traffic by figuring out "if I were to rewrite this traffic myself, would it then pass ANP?", and if not, then drop it rather than forwarding it to the LB. Note that in the AWS-like case this requires using FQDN rules to recognize that traffic is headed for an LB. And then there are theoretical edge cases like "what if one of the service's endpoints is allowed but another one isn't?". So this also doesn't seem great.
3. Declare that it's undefined whether you can match post-LB traffic (IOW, a `Pods` selector may or may not catch pod-to-LB-to-pod traffic, and a `Networks` selector may or may not catch pod-to-LB-to-cluster-external traffic), but declare that you can block pre-LB traffic with a `Networks` selector that matches the LB IP. (This might be annoying for implementations to get right in the case where the LB traffic doesn't leave the node, but it should not actually be impossible for anyone.) If we were going to do this, we would probably want to say that you can block pod-to-`externalIPs` traffic too, for consistency, even though we don't run into the same "we don't know what the cloud infra is doing" problem there. (In fact, we could potentially say that this applies to `clusterIPs` too... we'd have to think that through. This is potentially the simplest solution though, because then the rule just becomes "if the packet got DNATted, then check both the original and the new dest IP against all policies, rather than only checking the new dest IP".)
4. Declare that all pod-to-LB traffic has undefined semantics with respect to ANP, and policies may have different effects in different clusters. (This is easiest for implementors, but worst for users. And of course, this is what NP does because we hadn't noticed the problem until it was too late and there were already conflicting implementations.)
FTR, note that in the cluster-ingress case (which is even worse), I feel like there's a chance the answer will end up being "delegate policy enforcement to the Gateway rather than trying to do it in ANP". I'm not sure if that has any implications for trying to decide what to do here.
(People may also want to consider how they expect ANPs to interact with other implementation-specific egress functionality (eg, EgressIPs), and whether that implies anything about the best choice for LB behavior.)
> Unfortunately, this isn't quite right; ClusterIPs and ExternalIPs are always rewritten before NP enforcement, but load balancer IPs sometimes aren't; in some cases, pod-to-LoadBalancer traffic must be sent to the load balancer rather than being processed locally. (This would happen either if the CloudProvider sets `hostname` rather than `ip` in the service's `LoadBalancerStatus` (like AWS), or if it sets `ip` but also uses the new `ipMode: Proxy`.)
>
> And once a packet leaves the node and goes to an external load balancer, then all bets are off. When the packet comes back into the cluster, it's not even theoretically possible for the ANP implementation to reliably figure out its original source IP (since the LB may have transformed the connection in arbitrary ways).
OH I completely forgot about the aws LB use case.. :/ great call out.
> (And of course, if the Service has endpoint IPs that are cluster-external, then a pod-to-LoadBalancer connection that goes to the LB would presumably then go directly to the cluster-external IPs, without coming back into the cluster at all, so there's no way the ANP implementation could enforce a `Networks` selector that blocked cluster-external IPs at that point.)
YEA,,,,!
> So I can think of four options:
>
> 1. Require LoadBalancer implementations to participate in ANP enforcement, so that we can guarantee that pod-to-LB connections will have the same semantics as pod-to-ClusterIP. This seems really really unlikely to happen, but I wanted to list it just for completeness.
> 2. Require ANP implementations to "pre-approve" pod-to-LB traffic by figuring out "if I were to rewrite this traffic myself, would it then pass ANP?", and if not, then drop it rather than forwarding it to the LB. Note that in the AWS-like case this requires using FQDN rules to recognize that traffic is headed for an LB. And then there are theoretical edge cases like "what if one of the service's endpoints is allowed but another one isn't?". So this also doesn't seem great.
Yeah, except for the AWS case, other ingress VIP types will work well with this solution. Even with the way Services are implemented there is inconsistency with LBs, because some are hairpinned within the cluster, some are sent out, etc., but I guess even in that case we didn't really define what is correct for implementations to follow...
> 3. Declare that it's undefined whether you can match post-LB traffic (IOW, a `Pods` selector may or may not catch pod-to-LB-to-pod traffic, and a `Networks` selector may or may not catch pod-to-LB-to-cluster-external traffic), but declare that you can block pre-LB traffic with a `Networks` selector that matches the LB IP. (This might be annoying for implementations to get right in the case where the LB traffic doesn't leave the node, but it should not actually be impossible for anyone.)
Yea... at least in ovnk we won't block pre-LB traffic; it's gonna be applied after the DNAT, except in the AWS case where we will :D
> If we were going to do this, we would probably want to say that you can block pod-to-`externalIPs` traffic too, for consistency, even though we don't run into the same "we don't know what the cloud infra is doing" problem there. (In fact, we could potentially say that this applies to `clusterIPs` too... we'd have to think that through. This is potentially the simplest solution though, because then the rule just becomes "if the packet got DNATted, then check both the original and the new dest IP against all policies, rather than only checking the new dest IP".)
>
> 4. Declare that all pod-to-LB traffic has undefined semantics with respect to ANP, and policies may have different effects in different clusters. (This is easiest for implementors, but worst for users. And of course, this is what NP does because we hadn't noticed the problem until it was too late and there were already conflicting implementations.)
I like this the BEST. :D But perhaps we are then following in the footsteps of something NetPol did badly and we don't necessarily want to do that to users?
> FTR, note that in the cluster-ingress case (which is even worse), I feel like there's a chance the answer will end up being "delegate policy enforcement to the Gateway rather than trying to do it in ANP". I'm not sure if that has any implications for trying to decide what to do here.
>
> (People may also want to consider how they expect ANPs to interact with other implementation-specific egress functionality (eg, EgressIPs), and whether that implies anything about the best choice for LB behavior.)
Yeah, good point about cross-feature functionality and user experience.
> 3. Declare that it's undefined whether you can match post-LB traffic (IOW, a `Pods` selector may or may not catch pod-to-LB-to-pod traffic, and a `Networks` selector may or may not catch pod-to-LB-to-cluster-external traffic), but declare that you can block pre-LB traffic with a `Networks` selector that matches the LB IP. (This might be annoying for implementations to get right in the case where the LB traffic doesn't leave the node, but it should not actually be impossible for anyone.)
>
> yea... at least in ovnk we won't block pre-LB traffic; it's gonna be applied after the DNAT, except in the AWS case where we will :D
ugh...
So what I was trying to say with "check both the original and the new dest IP against all policies, rather than only checking the new dest IP" was that, if you have a rule saying "Deny to 192.168.0.0/24", then instead of translating that into a single rule that only checks the real/current dest IP, you'd translate it into two rules. eg:
# iptables
... -d 192.168.0.0/24 ... -j DROP
... -m conntrack --ctstate DNAT --ctorigdst 192.168.0.0/24 ... -j DROP
# OVS
... nw_dst=192.168.0.0/24, actions=drop
... ct_state=+dnat,ct_nw_dst=192.168.0.0/24, actions=drop
(except more optimized)
But it looks like OVN ACL expressions don't expose OVS `ct_nw_dst`... (`ct.dnat` is the equivalent of `ct_state=+dnat`, but there's no equivalent of `ct_nw_dst` 🙁)
> 4. Declare that all pod-to-LB traffic has undefined semantics with respect to ANP, and policies may have different effects in different clusters....
>
> I like this the BEST. :D But perhaps we are then following in the footsteps of something NetPol did badly and we don't necessarily want to do that to users?
That's how I feel. I mean, think of the user stories that would have had to be there to get you to choose this behavior on purpose. "As an administrator, I need to be able to block certain users from accessing X. But if the user creates a LoadBalancer Service pointing to it, then whatever, I guess I don't care in that case."
(Admittedly, with the current default RBAC permissions, users cannot create Services that point to cluster-external IPs or to another user's pods... So... maybe we can get away with that?)
My initial inclination is that wherever possible we should try to avoid CIDR-based matches for rich k8s constructs, because it means that every time something relocates elsewhere on the network, the user must update their network policies. The beauty of policies for Pods is that the Pods can churn as much as they like, move IPs, redeploy elsewhere, etc, and this doesn't impact the user's specification of the policy. It's just up to the implementation to learn the right information and apply the policy correctly. (And sure, we can always paper over this with controllers that react to stuff in the network and write the ANPs, but that seems like the implementation detail rather than the user-facing API; implementations should have the freedom to choose if that's the way they want it to work).
Whether this "avoid CIDR applying to services" idea means aiming for the node to do a best-effort translation and apply policy based on that, or whether to have an explicit selector that is specifically for selecting traffic towards services by the service's labels, or one of the other options... I'm not sure.
> declare that you can block pre-LB traffic with a `Networks` selector that matches the LB IP. (This might be annoying for implementations to get right in the case where the LB traffic doesn't leave the node, but it should not actually be impossible for anyone.)
If we're talking "how to apply the policy in cases where the CNI is not responsible for service translation", then I think this is doable in Cilium without a huge amount of effort; but if this is intended to apply to all services, including ones that the CNI is responsible for translating, then this seems to equate to building out a whole extra policy calculation + enforcement stage. So yes, a pain.
Can't we say that CIDR-based policy enforcement should happen on the original packet sent by the pod (pod-selector rules can still be applied after service IP resolution)? Then the only thing that matters is what the pod sends/receives as dst/src IP. Considering the subject is always a pod, it kind of makes sense to think about IPs in terms of what the pod sees, and not how the packet will be tweaked/routed/etc. This could also be enriched later with extra LB/gateway/etc. policies?
apis/v1alpha1/shared_types.go
// such a VIP. For example, a Networks selector that denies traffic to the entire
// service CIDR will not actually block any service traffic.
What if the VIP has no backends/endpoints? The implementation can't translate the traffic, so at that point is the traffic subject to CIDR policy or not? I believe that the behaviour in Cilium today is to drop this traffic with an explicit error code, but not motivated by policy at all (before the policy check).
(Other than that, the "translate first, apply policy after" pattern is how Cilium works with respect to policy applying to service traffic.)
Ah! I think this question is also related to the `LoadBalancer` behaviour question above 😅. If we consider that the `LoadBalancer` "doesn't have any backends", because actually the implementation of the LB is external to the current node, then how would we apply that policy? Interesting. #185 (comment)
Yes, right, there is a discrepancy with LBs; I will open a new issue to track that.
FYI for reviewers, this PR's scope is to get the CIDR peer in, and I will add a TODO linking to #203 to clarify the issues around load balancers...
apis/v1alpha1/shared_types.go
// https://blog.markhatton.co.uk/2011/03/15/regular-expressions-for-ip-addresses-cidr-ranges-and-hostnames/
// The resulting regex is an OR of both regexes. IPv4 address embedded in IPv6 addresses are not supported.
// TODO: Change the CIDR's validation regex to use CEL isCIDR() in Kube 1.31 when it is available.
// +kubebuilder:validation:Pattern=`(^(([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){3}([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])(\/(3[0-2]|[1-2][0-9]|[0-9]))$)|(^s*((([0-9A-Fa-f]{1,4}:){7}([0-9A-Fa-f]{1,4}|:))|(([0-9A-Fa-f]{1,4}:){6}(:[0-9A-Fa-f]{1,4}|((25[0-5]|2[0-4]d|1dd|[1-9]?d)(.(25[0-5]|2[0-4]d|1dd|[1-9]?d)){3})|:))|(([0-9A-Fa-f]{1,4}:){5}(((:[0-9A-Fa-f]{1,4}){1,2})|:((25[0-5]|2[0-4]d|1dd|[1-9]?d)(.(25[0-5]|2[0-4]d|1dd|[1-9]?d)){3})|:))|(([0-9A-Fa-f]{1,4}:){4}(((:[0-9A-Fa-f]{1,4}){1,3})|((:[0-9A-Fa-f]{1,4})?:((25[0-5]|2[0-4]d|1dd|[1-9]?d)(.(25[0-5]|2[0-4]d|1dd|[1-9]?d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){3}(((:[0-9A-Fa-f]{1,4}){1,4})|((:[0-9A-Fa-f]{1,4}){0,2}:((25[0-5]|2[0-4]d|1dd|[1-9]?d)(.(25[0-5]|2[0-4]d|1dd|[1-9]?d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){2}(((:[0-9A-Fa-f]{1,4}){1,5})|((:[0-9A-Fa-f]{1,4}){0,3}:((25[0-5]|2[0-4]d|1dd|[1-9]?d)(.(25[0-5]|2[0-4]d|1dd|[1-9]?d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){1}(((:[0-9A-Fa-f]{1,4}){1,6})|((:[0-9A-Fa-f]{1,4}){0,4}:((25[0-5]|2[0-4]d|1dd|[1-9]?d)(.(25[0-5]|2[0-4]d|1dd|[1-9]?d)){3}))|:))|(:(((:[0-9A-Fa-f]{1,4}){1,7})|((:[0-9A-Fa-f]{1,4}){0,5}:((25[0-5]|2[0-4]d|1dd|[1-9]?d)(.(25[0-5]|2[0-4]d|1dd|[1-9]?d)){3}))|:)))(%.+)?s*(\/(12[0-8]|1[0-1][0-9]|[1-9][0-9]|[0-9]))$)` |
Isn't this regex completely wrong? I think all backslash characters got dropped on that random blog page it was taken from 😬 Look e.g. at the IPv6 address regex there: it starts with `^s*` instead of `^\s*`, which makes much more sense... And in this particular one, look at e.g. `1dd` - I guess it should have been `1\d\d` instead?
Hi @mkow! Thanks for the review:
(^s*((([0-9A-Fa-f]{1,4}:){7}([0-9A-Fa-f]{1,4}|:))|(([0-9A-Fa-f]{1,4}:){6}(:[0-9A-Fa-f]{1,4}|((25[0-5]|2[0-4]d|1dd|[1-9]?d)(.(25[0-5]|2[0-4]d|1dd|[1-9]?d)){3})|:))|(([0-9A-Fa-f]{1,4}:){5}(((:[0-9A-Fa-f]{1,4}){1,2})|:((25[0-5]|2[0-4]d|1dd|[1-9]?d)(.(25[0-5]|2[0-4]d|1dd|[1-9]?d)){3})|:))|(([0-9A-Fa-f]{1,4}:){4}(((:[0-9A-Fa-f]{1,4}){1,3})|((:[0-9A-Fa-f]{1,4})?:((25[0-5]|2[0-4]d|1dd|[1-9]?d)(.(25[0-5]|2[0-4]d|1dd|[1-9]?d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){3}(((:[0-9A-Fa-f]{1,4}){1,4})|((:[0-9A-Fa-f]{1,4}){0,2}:((25[0-5]|2[0-4]d|1dd|[1-9]?d)(.(25[0-5]|2[0-4]d|1dd|[1-9]?d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){2}(((:[0-9A-Fa-f]{1,4}){1,5})|((:[0-9A-Fa-f]{1,4}){0,3}:((25[0-5]|2[0-4]d|1dd|[1-9]?d)(.(25[0-5]|2[0-4]d|1dd|[1-9]?d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){1}(((:[0-9A-Fa-f]{1,4}){1,6})|((:[0-9A-Fa-f]{1,4}){0,4}:((25[0-5]|2[0-4]d|1dd|[1-9]?d)(.(25[0-5]|2[0-4]d|1dd|[1-9]?d)){3}))|:))|(:(((:[0-9A-Fa-f]{1,4}){1,7})|((:[0-9A-Fa-f]{1,4}){0,5}:((25[0-5]|2[0-4]d|1dd|[1-9]?d)(.(25[0-5]|2[0-4]d|1dd|[1-9]?d)){3}))|:)))(%.+)?s*(\/(12[0-8]|1[0-1][0-9]|[1-9][0-9]|[0-9]))$)
is the v6 CIDR regex that I have here..
The blog: https://blog.markhatton.co.uk/2011/03/15/regular-expressions-for-ip-addresses-cidr-ranges-and-hostnames/ has
^s*((([0-9A-Fa-f]{1,4}:){7}([0-9A-Fa-f]{1,4}|:))|(([0-9A-Fa-f]{1,4}:){6}(:[0-9A-Fa-f]{1,4}|((25[0-5]|2[0-4]d|1dd|[1-9]?d)(.(25[0-5]|2[0-4]d|1dd|[1-9]?d)){3})|:))|(([0-9A-Fa-f]{1,4}:){5}(((:[0-9A-Fa-f]{1,4}){1,2})|:((25[0-5]|2[0-4]d|1dd|[1-9]?d)(.(25[0-5]|2[0-4]d|1dd|[1-9]?d)){3})|:))|(([0-9A-Fa-f]{1,4}:){4}(((:[0-9A-Fa-f]{1,4}){1,3})|((:[0-9A-Fa-f]{1,4})?:((25[0-5]|2[0-4]d|1dd|[1-9]?d)(.(25[0-5]|2[0-4]d|1dd|[1-9]?d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){3}(((:[0-9A-Fa-f]{1,4}){1,4})|((:[0-9A-Fa-f]{1,4}){0,2}:((25[0-5]|2[0-4]d|1dd|[1-9]?d)(.(25[0-5]|2[0-4]d|1dd|[1-9]?d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){2}(((:[0-9A-Fa-f]{1,4}){1,5})|((:[0-9A-Fa-f]{1,4}){0,3}:((25[0-5]|2[0-4]d|1dd|[1-9]?d)(.(25[0-5]|2[0-4]d|1dd|[1-9]?d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){1}(((:[0-9A-Fa-f]{1,4}){1,6})|((:[0-9A-Fa-f]{1,4}){0,4}:((25[0-5]|2[0-4]d|1dd|[1-9]?d)(.(25[0-5]|2[0-4]d|1dd|[1-9]?d)){3}))|:))|(:(((:[0-9A-Fa-f]{1,4}){1,7})|((:[0-9A-Fa-f]{1,4}){0,5}:((25[0-5]|2[0-4]d|1dd|[1-9]?d)(.(25[0-5]|2[0-4]d|1dd|[1-9]?d)){3}))|:)))(%.+)?s*(\/(12[0-8]|1[0-1][0-9]|[1-9][0-9]|[0-9]))$
So do you mean the regex on that blog is also wrong and we need backslashes? Almost everywhere I looked online I didn't see backslashes; having said that, I am no expert here. Could you please provide me with your alternative source?
cc @JoelSpeed we do the same in OCP here: https://github.com/openshift/api/blob/ce10821dc9997694d8058564308819011d27383e/config/v1/types_infrastructure.go#L1862, which is where I got it from. I plan to write some tests on this for sure, but I'd think it can't be completely wrong if we are using it there, unless I am missing something very obvious here? cc @mkowalski
Hey all, this is the regex we use in OCP, and of course there may be a dozen corner cases it is not catching. Given how tricky the syntax of IPv6 addresses is, I am not sure it is possible to have an entirely correct regex (but a proof would be needed).
@mkow can you please provide an example of an IPv6 address that is correct and is not caught by this regex? Or can you provide an example of a non-IPv6 string that passes this regex? That would definitely help.
> so do you mean the regex on that blog is also wrong and we need backslash?

I would rather say: don't blindly copy-paste things without understanding them ;) Yes, the blog is completely wrong.

> Could you please provide me with your alternative source?

I don't have any, sorry. I guess, just read the IPv6 CIDR specification and build the regex carefully yourself. Or don't use a regex for this purpose.

> there may be a dozen of corner cases it is not catching

It's not about corner cases; this regex is just some garbage which (I guess) got mangled when the author of that blog tried to add it to the HTML on their blog. Most of it doesn't make any sense, just read it yourself.

> @mkow can you please provide an example of IPv6 address that is correct and is not catched by this regex? Or can you provide an example of non-IPv6 string that is passing this regex? That would definitely help

For example, the regex from this PR accepts this as a valid IPv6 CIDR: `sssss::1ddd1ddd1ddd1dd%!@#$%^&*()/0`
If it helps, here's the regex + tests we have in Cilium for the equivalent validation pattern: https://github.com/cilium/cilium/blob/25d946d3d88e0be2ccbcec7bb12a99905729ba16/pkg/policy/api/cidr_test.go#L156
Thanks @mkow, yeah, that one looks like a nice tester! Let me see if this works: https://www.perplexity.ai/search/what-is-the-ILs5tiG7RHWJo4ZzyL7t8A :)
Heh, clearly it didn't; it's copying from whatever sources are out there, which all seem wrong.. :/
May be worth comparing with Go unit tests too for some corner cases: https://cs.opensource.google/go/go/+/refs/tags/go1.22.0:src/net/netip/netip_test.go
I also came across a regex that might be easier to understand as a switch statement over the different allowable address patterns, using regex-vis.com; ref: https://ihateregex.io/expr/ipv6/
I'm not saying this has full coverage for all allowed cases, but I can comprehend what the cases are using regex-vis.com
(([0-9a-fA-F]{1,4}:){7,7}[0-9a-fA-F]{1,4}|([0-9a-fA-F]{1,4}:){1,7}:|([0-9a-fA-F]{1,4}:){1,6}:[0-9a-fA-F]{1,4}|([0-9a-fA-F]{1,4}:){1,5}(:[0-9a-fA-F]{1,4}){1,2}|([0-9a-fA-F]{1,4}:){1,4}(:[0-9a-fA-F]{1,4}){1,3}|([0-9a-fA-F]{1,4}:){1,3}(:[0-9a-fA-F]{1,4}){1,4}|([0-9a-fA-F]{1,4}:){1,2}(:[0-9a-fA-F]{1,4}){1,5}|[0-9a-fA-F]{1,4}:((:[0-9a-fA-F]{1,4}){1,6})|:((:[0-9a-fA-F]{1,4}){1,7}|:)|fe80:(:[0-9a-fA-F]{0,4}){0,4}%[0-9a-zA-Z]{1,}|::(ffff(:0{1,4}){0,1}:){0,1}((25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9])\.){3,3}(25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9])|([0-9a-fA-F]{1,4}:){1,4}:((25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9])\.){3,3}(25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9]))
If only we could embed a comment string for each regexp group that tools like regex-vis.com could render, to provide context for the intent of each grouping. The lack of 'intent' is what makes complicated regexps so brittle and hard to maintain.
Thanks all, I'll discuss with the repo maintainers and k8s API reviewers to see how to move forward on this one. Considering the API is not GA-ed, we might skip the regex for now (I find it tricky to get the absolutely correct regex and don't want to risk going with a wrong regex leading to breakages) until 1.31, where we introduce `isCIDR()`. This might mean implementations need to do their own validation using `net.ParseCIDR()`, but I think we can live with that. If anyone has strong concerns or suggestions about this, please lmk here on the thread.
PS: This also stems from the fact that I don't have cycles to go off and build a v6 regex currently, so if someone is willing to do that, I'm happy to give credit there as well.
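(For context, a minimal sketch of the kind of implementation-side check the comment above refers to; it is not part of this PR, and the helper name is made up. Note that `net.ParseCIDR` accepts prefixes with host bits set, such as `10.0.0.1/24`, so an implementation wanting strict prefixes might also compare against the masked network, as below.)

```go
package main

import (
	"fmt"
	"net"
)

// validateCIDRs is a hypothetical helper an implementation could use to check
// the Networks field until CEL isCIDR() validation is available on the CRD.
func validateCIDRs(cidrs []string) error {
	for _, c := range cidrs {
		ip, ipNet, err := net.ParseCIDR(c)
		if err != nil {
			return fmt.Errorf("invalid CIDR %q: %w", c, err)
		}
		// Optionally reject entries with host bits set, e.g. "10.0.0.1/24".
		if !ip.Equal(ipNet.IP) {
			return fmt.Errorf("CIDR %q has host bits set; expected %s", c, ipNet)
		}
	}
	return nil
}

func main() {
	fmt.Println(validateCIDRs([]string{"10.0.0.0/8", "fd00::/64"})) // <nil>
	fmt.Println(validateCIDRs([]string{"10.0.0.1/24"}))             // error: host bits set
}
```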
This PR adds support for implementing inline CIDR peer blocks. Signed-off-by: Surya Seetharaman <[email protected]>
LGTM
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: astoycos, Dyanngg, tssurya.