Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Network Operator BGP API improvements #6362

Merged
merged 15 commits into from
Aug 22, 2024
Merged

Network Operator BGP API improvements #6362

merged 15 commits into from
Aug 22, 2024

Conversation

internet-diglett
Copy link
Contributor

@internet-diglett internet-diglett commented Aug 16, 2024

#6081

Refactored existing api to list bgp announce sets and added a new api to list announcements for an announce set

#6244 and #6245

Refactored bgp datastore methods to propagate additional error context as well as emit error types for the external API (404, 403, etc)

#6246

Two part fix:

  1. Emit an error if operator attempts to create more than one bgp configuration for the same ASN
  2. Emit an error if the operator tries to configure BGP peers for the same switch that point to bgp configs with different ASNs

#6349

Updated bgp datastore methods to always lookup the parent record (not just when the name needs to be resolved to a UUID). This guards us against situations where an invalid UUID is accidentally submitted.

Related

Resolving these issues for BGP

@internet-diglett internet-diglett changed the title Network Operator API improvements Network Operator BGP API improvements Aug 16, 2024
@internet-diglett
Copy link
Contributor Author

Changes for #6081 with updated Oxide CLI:

sled03 ➜  oxide.rs git:(main) ✗ oxide --profile a4x2 system networking bgp announce-set list
[
  {
    "description": "Announce set for AS 47",
    "id": "4040ce02-f01d-449f-997e-aa5eb1ae1c8a",
    "name": "as47-announce",
    "time_created": "2024-08-19T20:49:49.040115Z",
    "time_modified": "2024-08-19T20:49:49.040115Z"
  },
  {
    "description": "Announce set for AS 48",
    "id": "c2e77d75-4c3d-4c61-8669-86e221bdaeca",
    "name": "as48-announce",
    "time_created": "2024-08-19T20:49:49.935591Z",
    "time_modified": "2024-08-19T20:49:49.935591Z"
  }
]


sled03 ➜  oxide.rs git:(main) ✗ oxide --profile a4x2 system networking bgp announcement list --name-or-id "as47-announce"
[
  {
    "address_lot_block_id": "4040ce02-f01d-449f-997e-aa5eb1ae1c8a",
    "announce_set_id": "4040ce02-f01d-449f-997e-aa5eb1ae1c8a",
    "network": "198.51.100.0/24"
  }
]

@internet-diglett internet-diglett marked this pull request as ready for review August 20, 2024 18:57
@internet-diglett internet-diglett added this to the 10 milestone Aug 20, 2024
Copy link
Contributor

@rcgoodfellow rcgoodfellow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Levon! Comments follow. Will take this for a test drive shortly.

.await?;

if !configs_with_asn.is_empty() {
error!(opctx.log, "config for asn already exists"; "asn" => ?config.asn, "configs" => ?configs_with_asn);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe the name of this method should be bgp_config_create instead of bgp_config_set as the latter implies we'll update/overwrite the existing config if it exists.

if let Some(err) = err.take() {
error!(opctx.log, "{msg}"; "error" => ?err);
err
} else {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you provide a comment on why non "takable" errors result in a 500?

match e {
diesel::result::Error::NotFound => err
.bail(Error::not_found_by_id(
ResourceType::BgpConfig,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be ResourceType::BgpAnnounceSet?

diesel::result::Error::NotFound => err
.bail(Error::not_found_by_name(
ResourceType::BgpConfig,
&name,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above, should probably be ResourceType::BgpAnnounceSet.

match e {
diesel::result::Error::NotFound => err
.bail(Error::not_found_by_id(
ResourceType::BgpConfig,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ResourceType::BgpAnnounceSet

match e {
diesel::result::Error::NotFound => err
.bail(Error::not_found_by_name(
ResourceType::BgpConfig,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ResourceType::BgpAnnounceSet

}
})?;

// if we're setting a port settings id (and therefore activating a configuration
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! I guess this is one of the consequences of being able to build up arbitrary configs, but only have some of them active - we have to validate that anything that is being activated is valid against the rest of the active set.

@internet-diglett
Copy link
Contributor Author

Some more test output from a4x2:

Error un-squashing for bgp:

sled03 ➜  oxide.rs git:(main) ✗ oxide --profile a4x2 system networking bgp config create --asn 49 --bgp-announce-set-id "32ff136e-2b11-47d6-9f23-85eb278095be" --description "BGP config for AS 49" --name as49
error
Error Response: status: 404 Not Found; headers: {"content-type": "application/json", "x-request-id": "5cbf4d5c-7005-43e9-8eac-c132dba884b4", "content-length": "185", "date": "Wed, 21 Aug 2024 02:38:02 GMT"}; value: Error { error_code: Some("ObjectNotFound"), message: "not found: bgp-announce-set with id \"32ff136e-2b11-47d6-9f23-85eb278095be\"", request_id: "5cbf4d5c-7005-43e9-8eac-c132dba884b4" }

Not allowing multiple configs for a bgp asn:

sled03 ➜  oxide.rs git:(main) ✗ oxide --profile a4x2 system networking bgp config create --asn 48 --bgp-announce-set-id "86afcc48-933e-4048-ab9b-0442aaea4efc" --description "another BGP config for AS 48" --name as48-2
error
Error Response: status: 409 Conflict; headers: {"content-type": "application/json", "x-request-id": "b14be4d5-1f08-4550-9d92-3368317a36d9", "content-length": "150", "date": "Wed, 21 Aug 2024 02:39:35 GMT"}; value: Error { error_code: Some("Conflict"), message: "cannot have more than one configuration per ASN", request_id: "b14be4d5-1f08-4550-9d92-3368317a36d9" }
Error Response: status: 409 Conflict; headers: {"content-type": "application/json", "x-request-id": "b14be4d5-1f08-4550-9d92-3368317a36d9", "content-length": "150", "date": "Wed, 21 Aug 2024 02:39:35 GMT"}; value: Error { error_code: Some("Conflict"), message: "cannot have more than one configuration per ASN", request_id: "b14be4d5-1f08-4550-9d92-3368317a36d9" }

Comment on lines 104 to 112
// TODO: remove once per-switch-multi-asn support is added
// Bail if an existing config for this ASN already exists.
// This is a temporary measure until multi-asn-per-switch is supported.
let configs_with_asn: Vec<BgpConfig> = dsl::bgp_config
.filter(dsl::asn.eq(config.asn))
.filter(dsl::time_deleted.is_null())
.select(BgpConfig::as_select())
.load_async(&conn)
.await?;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I found something that needs to be changed here. Elsewhere in Nexus our logic relies on this code being idempotent, so I'm reworking this to allow and exact match to return Ok(config) but otherwise bail with the error.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This still might not be fixed, doing a bit more testing

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I have it sorted, it was another diesel-ism

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's been happy for multiple a4x2 laps now, auto-merging

@internet-diglett internet-diglett enabled auto-merge (squash) August 22, 2024 01:54
@internet-diglett internet-diglett enabled auto-merge (squash) August 22, 2024 02:32
@internet-diglett internet-diglett merged commit f1e1aff into main Aug 22, 2024
23 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants