VPC Subnet Routing [1/2] -- RPW and System Routers #5777
OPTE now prevents itself from being unloaded if its underlay state is set. Currently, underlay setup is performed only once, and it seems to be the case that XDE can be unloaded in some scenarios (e.g., `a4x2` setup). However, a consequence is that removing the driver requires an extra operation to explicitly clear the underlay state. This PR adds this operation to the `cargo xtask virtual-hardware destroy` command. This is currently blocked on opte#485 being approved/merged. Closes #5314.
These update in response to VPC subnet changes. Now to plumb them into OPTE.
Currently there are no triggers attached to most of the operations that will cause us to either a) push or b) re-resolve VPC routes, but this lays the basis for sled-agent and the background task to talk in terms of versions.
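One way to picture "talking in terms of versions" is a version-gated update on the receiving side. The sketch below is an assumption, not the code in this PR: `RouterKey`, `VersionedRoutes`, and `SledRouteState` are hypothetical names, and routes are plain strings for brevity.

```rust
use std::collections::HashMap;

/// Hypothetical key identifying one router's route set.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct RouterKey(pub String);

/// A route set tagged with a monotonically increasing version (assumed shape).
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct VersionedRoutes {
    pub version: u64,
    pub routes: Vec<String>,
}

/// Hypothetical per-sled view of the route sets it currently holds.
#[derive(Default)]
pub struct SledRouteState {
    current: HashMap<RouterKey, VersionedRoutes>,
}

impl SledRouteState {
    /// Versions currently held; a background task could compare these
    /// against its own to decide which route sets need to be pushed.
    pub fn versions(&self) -> HashMap<RouterKey, u64> {
        self.current
            .iter()
            .map(|(k, v)| (k.clone(), v.version))
            .collect()
    }

    /// Accept an update only if it is strictly newer than what is held.
    /// Returns true when the update was applied.
    pub fn apply(&mut self, key: RouterKey, update: VersionedRoutes) -> bool {
        match self.current.get(&key) {
            Some(existing) if existing.version >= update.version => false,
            _ => {
                self.current.insert(key, update);
                true
            }
        }
    }
}
```

Under this scheme a stale or repeated push is a no-op, so the sender can retry freely.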
My current 'zero-to-testing' setup is captured below in
|
This is a pre-requisite for oxidecomputer/omicron#5777. As always, we may want to hold merging this until all approvals of that PR are in to avoid blocking bugfixes to maghemite.
Incredible work, things are looking good in a4x2, looking forward to exercising this on Dogfood 🚀
common/src/api/internal/shared.rs
Outdated
/// Identifier for a VPC and/or subnet.
#[derive(
    Copy, Clone, Debug, Deserialize, Serialize, JsonSchema, PartialEq, Eq, Hash,
)]
pub struct RouterId {
    pub vni: Vni,
    pub subnet: Option<IpNet>,
}
Commenting for posterity, Kyle and I discussed this offline: it seems like the `Option` here doesn't really make it clear how the field will be used in the event that it is `Some(value)` or `None`.
Fixed in 8162a23 -- we're now using `RouterKind::System` and `RouterKind::Custom(IpNet)`, which is far clearer.
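A minimal sketch of what the revised identifier might look like, assuming the `Option<IpNet>` field is replaced by an enum-typed field. Only the variant names `RouterKind::System` and `RouterKind::Custom(IpNet)` come from the comment above; the field name `kind`, the stand-in `Vni` and `IpNet` types, and the `describe` helper are hypothetical, and the serde/schemars derives are omitted to keep the example standalone.

```rust
use std::net::IpAddr;

/// Stand-in for the real VNI type (hypothetical placeholder).
#[derive(Copy, Clone, Debug, PartialEq, Eq, Hash)]
pub struct Vni(pub u32);

/// Stand-in for an IP network: base address plus prefix length (hypothetical).
#[derive(Copy, Clone, Debug, PartialEq, Eq, Hash)]
pub struct IpNet {
    pub addr: IpAddr,
    pub prefix: u8,
}

/// Which router a rule set belongs to: the VPC-wide system router,
/// or a custom router attached to one subnet.
#[derive(Copy, Clone, Debug, PartialEq, Eq, Hash)]
pub enum RouterKind {
    System,
    Custom(IpNet),
}

/// Identifier for a router; the field name `kind` is an assumption.
#[derive(Copy, Clone, Debug, PartialEq, Eq, Hash)]
pub struct RouterId {
    pub vni: Vni,
    pub kind: RouterKind,
}

/// Both cases must be handled explicitly, which is the clarity the
/// `Option`-based version lacked.
pub fn describe(id: &RouterId) -> String {
    match id.kind {
        RouterKind::System => format!("system router for VNI {}", id.vni.0),
        RouterKind::Custom(net) => format!(
            "custom router for {}/{} in VNI {}",
            net.addr, net.prefix, id.vni.0
        ),
    }
}
```

Compared with `Option<IpNet>`, the enum names the meaning of each case at the call site instead of leaving `None` to be interpreted by convention.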
…#5823)

This PR builds on #5777 to provide the Custom routers for subnets as described in RFD21. This entails a few things:

* We remove the `unpublished = true` tag from the user API for VPC routers and routes.
* Custom routers may be attached/detached to a VPC subnet using the `custom_router` field in subnet `POST` and `PUT` requests.
* NICs now individually have a `transit_ips` list, which denotes an additional set of CIDR blocks that a NIC is allowed to send and receive traffic on. This is set during `POST` and/or `PUT` on instances which are stopped. This is a key feature to enable software routing by instances, as today's default behaviour drops any packets not matching an assigned IP for an instance.
  * I suspect there will be some discussion over the shape of this API, so there isn't yet test coverage here until we know we're happy with it.
* Revisited which router routes can be created by users, e.g., better validation on v4/v6 dest/target pairs.

There are some allowances around currently non-existent features:

* **Internet Gateways.** We allow unlimited use of one pseudo-gateway, `inetgw:outbound`, which appears in our existing rules. Using this target sends packets upstream as it does today.
* **VPC peering.** VPCs as destinations/targets are currently disallowed in router routes.

Closes #2116.
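The effect of `transit_ips` on address checking can be illustrated roughly as follows. This is a sketch under assumptions, not OPTE's implementation: `Cidr4`, `Nic`, and `allows` are hypothetical names, and only IPv4 is handled to keep it short.

```rust
use std::net::Ipv4Addr;

/// Hypothetical IPv4 CIDR block: base address plus prefix length.
#[derive(Clone, Copy, Debug)]
pub struct Cidr4 {
    pub base: Ipv4Addr,
    pub prefix: u8,
}

impl Cidr4 {
    /// True if `addr` falls inside this block.
    pub fn contains(&self, addr: Ipv4Addr) -> bool {
        // A /0 block matches everything; shifting a u32 by 32 would overflow,
        // so that case is handled separately. Prefixes above 32 are clamped.
        let mask = if self.prefix == 0 {
            0
        } else {
            u32::MAX << (32 - u32::from(self.prefix.min(32)))
        };
        (u32::from(addr) & mask) == (u32::from(self.base) & mask)
    }
}

/// Hypothetical NIC view: one assigned address plus optional transit blocks.
pub struct Nic {
    pub assigned: Ipv4Addr,
    pub transit_ips: Vec<Cidr4>,
}

impl Nic {
    /// An address is acceptable if it is the assigned IP or lies in any
    /// transit block; with an empty list this reduces to the default
    /// behaviour of accepting only the assigned IP.
    pub fn allows(&self, addr: Ipv4Addr) -> bool {
        addr == self.assigned || self.transit_ips.iter().any(|c| c.contains(addr))
    }
}
```

An instance acting as a software router would list the CIDR blocks it forwards for in `transit_ips`, so that packets for those ranges are no longer dropped.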
This PR wires up all the backing machinery for VPC subnet routing, and automatically resolves and pushes updated rules to sleds using an RPW. This allows instances in all subnets of a VPC to talk with one another -- assuming no firewall rules have been configured otherwise. At a high level, this works by a few changes:
`(0.0.0.0/0, ::/0) -> inetgw:outbound`.
`subnet:{name} -> subnet:{name}` for each subnet, which are later resolved to both v4 and v6 entries.

The most immediate consequence in this PR is that hosts within a VPC -- on different subnets -- will be able to talk with one another at last. The user-facing API (#2116) will be re-enabled in a concurrent PR -- #5823 -- as will NIC spoof detection hole-punching.
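The system-router rules described above can be sketched as a small resolution step: two default routes to the outbound gateway, plus one v4 and one v6 entry per subnet. The types and the `system_routes` function here are hypothetical illustrations, not the resolver in this PR, and destinations are rendered as strings for simplicity.

```rust
use std::net::{Ipv4Addr, Ipv6Addr};

/// Hypothetical subnet record carrying both address families.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct Subnet {
    pub name: String,
    pub v4: (Ipv4Addr, u8),
    pub v6: (Ipv6Addr, u8),
}

/// Hypothetical route target, mirroring `inetgw:{name}` and `subnet:{name}`.
#[derive(Clone, Debug, PartialEq, Eq)]
pub enum Target {
    InternetGateway(String),
    Subnet(String),
}

/// A resolved route: a concrete CIDR destination and its target.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct ResolvedRoute {
    pub dest: String,
    pub target: Target,
}

/// Build the system router's resolved routes for a VPC (sketch only).
pub fn system_routes(subnets: &[Subnet]) -> Vec<ResolvedRoute> {
    let gw = Target::InternetGateway("outbound".to_string());
    // (0.0.0.0/0, ::/0) -> inetgw:outbound
    let mut out = vec![
        ResolvedRoute { dest: "0.0.0.0/0".to_string(), target: gw.clone() },
        ResolvedRoute { dest: "::/0".to_string(), target: gw },
    ];
    // subnet:{name} -> subnet:{name}, expanded to one v4 and one v6 entry each.
    for s in subnets {
        let target = Target::Subnet(s.name.clone());
        out.push(ResolvedRoute {
            dest: format!("{}/{}", s.v4.0, s.v4.1),
            target: target.clone(),
        });
        out.push(ResolvedRoute {
            dest: format!("{}/{}", s.v6.0, s.v6.1),
            target,
        });
    }
    out
}
```

A VPC with N subnets therefore yields 2 + 2N resolved entries in this sketch.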
Depends on oxidecomputer/opte#490.
Closes #2232, Fixes #1336.
A few pieces will block tests passing & merge-readiness:
`lab-2.0-opte-0.32` image.