Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unpublish VPC router related endpoints #4493

Merged
merged 2 commits into from
Nov 14, 2023
Merged

Unpublish VPC router related endpoints #4493

merged 2 commits into from
Nov 14, 2023

Conversation

david-crespo
Copy link
Contributor

@david-crespo david-crespo commented Nov 13, 2023

These endpoints aren't currently used for anything, so we don't want them showing up in either the docs, the generated clients, or the CLI.

Per https://github.com/oxidecomputer/docs/issues/232 (private repo)

@karencfv
Copy link
Contributor

Wait!!! Don't you need to be able to delete these in order to delete Subnets, so you can delete VPCs, so you can delete projects?

@david-crespo
Copy link
Contributor Author

david-crespo commented Nov 14, 2023

Wait!!! Don't you need to be able to delete these in order to delete Subnets, so you can delete VPCs, so you can delete projects?

Great point. I forgot to check. I was surprised to find everything is fine.

VPC routers are children of VPCs, not subnets.

omicron/schema/crdb/dbinit.sql

Lines 1400 to 1412 in 6bbaa80

CREATE TABLE IF NOT EXISTS omicron.public.vpc_router (
/* Identity metadata (resource) */
id UUID PRIMARY KEY,
name STRING(63) NOT NULL,
description STRING(512) NOT NULL,
time_created TIMESTAMPTZ NOT NULL,
time_modified TIMESTAMPTZ NOT NULL,
/* Indicates that the object has been deleted */
time_deleted TIMESTAMPTZ,
kind omicron.public.vpc_router_kind NOT NULL,
vpc_id UUID NOT NULL,
rcgen INT NOT NULL
);

The child of subnet is a NIC, so delete subnet checks for child NICs but not child routers.

pub async fn vpc_delete_subnet(
&self,
opctx: &OpContext,
db_subnet: &VpcSubnet,
authz_subnet: &authz::VpcSubnet,
) -> DeleteResult {
opctx.authorize(authz::Action::Delete, authz_subnet).await?;
use db::schema::network_interface;
use db::schema::vpc_subnet::dsl;
let conn = self.pool_connection_authorized(opctx).await?;
// Verify there are no child network interfaces in this VPC Subnet
if network_interface::dsl::network_interface
.filter(network_interface::dsl::subnet_id.eq(authz_subnet.id()))
.filter(network_interface::dsl::time_deleted.is_null())
.select(network_interface::dsl::id)
.limit(1)
.first_async::<Uuid>(&*conn)
.await
.optional()
.map_err(|e| public_error_from_diesel(e, ErrorHandler::Server))?
.is_some()
{
return Err(Error::InvalidRequest {
message: String::from(
"VPC Subnet cannot be deleted while \
network interfaces in the subnet exist",
),
});
}
// Delete the subnet, conditional on the rcgen not having changed.
let now = Utc::now();
let updated_rows = diesel::update(dsl::vpc_subnet)
.filter(dsl::time_deleted.is_null())
.filter(dsl::id.eq(authz_subnet.id()))
.filter(dsl::rcgen.eq(db_subnet.rcgen))
.set(dsl::time_deleted.eq(now))
.execute_async(&*self.pool_connection_authorized(opctx).await?)
.await
.map_err(|e| {
public_error_from_diesel(
e,
ErrorHandler::NotFoundByResource(authz_subnet),
)
})?;
if updated_rows == 0 {
return Err(Error::InvalidRequest {
message: String::from(
"deletion failed to to concurrent modification",
),
});
} else {
Ok(())
}
}

I think the relationship between subnets and routers and you might have had in mind is this: creating a subnet causes a route to be created in the system router, and deleting a subnet causes that route to be deleted. At least, that's supposed to be what happens — it turns out that is an undone TODO:

// TODO: When a subnet is created it should add a route entry into the VPC's
// system router
pub(crate) async fn vpc_create_subnet(

// TODO: When a subnet is deleted it should remove its entry from the VPC's
// system router.
pub(crate) async fn vpc_delete_subnet(

VPC delete checks for child subnets but not child routers:

pub async fn project_delete_vpc(
&self,
opctx: &OpContext,
db_vpc: &Vpc,
authz_vpc: &authz::Vpc,
) -> DeleteResult {
opctx.authorize(authz::Action::Delete, authz_vpc).await?;
use db::schema::vpc::dsl;
use db::schema::vpc_subnet;
// Note that we don't ensure the firewall rules are empty here, because
// we allow deleting VPCs with firewall rules present. Inserting new
// rules is serialized with respect to the deletion by the row lock
// associated with the VPC row, since we use the collection insert CTE
// pattern to add firewall rules.
// We _do_ need to check for the existence of subnets. VPC Subnets
// cannot be deleted while there are network interfaces in them
// (associations between an instance and a VPC Subnet). Because VPC
// Subnets are themselves containers for resources that we don't want to
// auto-delete (now, anyway), we've got to check there aren't any. We
// _might_ be able to make this a check for NICs, rather than subnets,
// but we can't have NICs be a child of both tables at this point, and
// we need to prevent VPC Subnets from being deleted while they have
// NICs in them as well.
if vpc_subnet::dsl::vpc_subnet
.filter(vpc_subnet::dsl::vpc_id.eq(authz_vpc.id()))
.filter(vpc_subnet::dsl::time_deleted.is_null())
.select(vpc_subnet::dsl::id)
.limit(1)
.first_async::<Uuid>(
&*self.pool_connection_authorized(opctx).await?,
)
.await
.optional()
.map_err(|e| public_error_from_diesel(e, ErrorHandler::Server))?
.is_some()
{
return Err(Error::InvalidRequest {
message: String::from(
"VPC cannot be deleted while VPC Subnets exist",
),
});
}
// Delete the VPC, conditional on the subnet_gen not having changed.
let now = Utc::now();
let updated_rows = diesel::update(dsl::vpc)
.filter(dsl::time_deleted.is_null())
.filter(dsl::id.eq(authz_vpc.id()))
.filter(dsl::subnet_gen.eq(db_vpc.subnet_gen))
.set(dsl::time_deleted.eq(now))
.execute_async(&*self.pool_connection_authorized(opctx).await?)
.await
.map_err(|e| {
public_error_from_diesel(
e,
ErrorHandler::NotFoundByResource(authz_vpc),
)
})?;
if updated_rows == 0 {
Err(Error::InvalidRequest {
message: String::from(
"deletion failed to to concurrent modification",
),
})
} else {
Ok(())
}
}

Delete router itself doesn't have any conditions.

pub async fn vpc_delete_router(
&self,
opctx: &OpContext,
authz_router: &authz::VpcRouter,
) -> DeleteResult {
opctx.authorize(authz::Action::Delete, authz_router).await?;
use db::schema::vpc_router::dsl;
let now = Utc::now();
diesel::update(dsl::vpc_router)
.filter(dsl::time_deleted.is_null())
.filter(dsl::id.eq(authz_router.id()))
.set(dsl::time_deleted.eq(now))
.execute_async(&*self.pool_connection_authorized(opctx).await?)
.await
.map_err(|e| {
public_error_from_diesel(
e,
ErrorHandler::NotFoundByResource(authz_router),
)
})?;
Ok(())
}

Finally, the delete VPC function in nexus/app also deletes all routers.

pub(crate) async fn project_delete_vpc(
&self,
opctx: &OpContext,
vpc_lookup: &lookup::Vpc<'_>,
) -> DeleteResult {
let (.., authz_vpc, db_vpc) = vpc_lookup.fetch().await?;
let authz_vpc_router = authz::VpcRouter::new(
authz_vpc.clone(),
db_vpc.system_router_id,
LookupType::ById(db_vpc.system_router_id),
);
// Possibly delete the VPC, then the router and firewall.
//
// We must delete the VPC first. This will fail if the VPC still
// contains at least one subnet, since those are independent containers
// that track network interfaces as child resources. If we delete the
// router first, it'll succeed even if the VPC contains Subnets, which
// means the router is now gone from an otherwise-live subnet.
//
// This is a good example of need for the original comment:
//
// TODO: This should eventually use a saga to call the
// networking subsystem to have it clean up the networking resources
self.db_datastore
.project_delete_vpc(opctx, &db_vpc, &authz_vpc)
.await?;
self.db_datastore.vpc_delete_router(&opctx, &authz_vpc_router).await?;
// Delete all firewall rules after deleting the VPC, to ensure no
// firewall rules get added between rules deletion and VPC deletion.
self.db_datastore
.vpc_delete_all_firewall_rules(&opctx, &authz_vpc)
.await
}

I tested this on dogfood and was able to delete a VPC after deleting subnets, did not have to delete routers. The VPCs integration test also covers this when it deletes a VPC after deleting subnets.

// Delete the default VPC Subnet and VPC.
let default_subnet_url = get_subnet_url("new-name", "default");
NexusRequest::object_delete(client, &default_subnet_url)
.authn_as(AuthnMode::PrivilegedUser)
.execute()
.await
.unwrap();
NexusRequest::object_delete(client, &vpc_url)
.authn_as(AuthnMode::PrivilegedUser)
.execute()
.await
.unwrap();

Copy link
Contributor

@karencfv karencfv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great point. I forgot to check. I was surprised to find everything is fine.

phew 😅 carry on then

@david-crespo david-crespo merged commit a1d0738 into main Nov 14, 2023
21 checks passed
@david-crespo david-crespo deleted the unpublish-routers branch November 14, 2023 17:34
@rmustacc
Copy link

An additional note on the API surface is that every VPC has a System VPC Router, which always exists with the VPC. The bits that weren't there yet while show information about that router also allow for custom VPC Routers to be created which are then applied to individual subnets. So in the future, when there are custom VPC routers that can apply to subnets, that'll likely be something we consider blocking a VPC delete, but since none of those exist today and the System VPC router is not something that can be created or destroyed (from a theoretical perspective, obviously this doesn't exist hence the unpublish).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants