Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(Multiple) Floating IP Support #4559

Merged
merged 32 commits into from
Dec 6, 2023
Merged

(Multiple) Floating IP Support #4559

merged 32 commits into from
Dec 6, 2023

Conversation

FelixMcFelix
Copy link
Contributor

@FelixMcFelix FelixMcFelix commented Nov 24, 2023

This PR adds external API endpoints to create and manipulate floating IP objects within a project, and ensures they will be attached/detached during the instance create/destroy sagas. External IPs served to instances are now partitioned into ephemeral and floating when talking to sled-agent, as OPTE will prefer floating IP addresses for outbound traffic. The new endpoints are:

  • POST /v1/floating-ips
  • DELETE /v1/floating-ips/{floating_ip}
  • GET /v1/floating-ips
  • GET /v1/floating-ips/{floating_ip}

Currently, this only allows floating IPs to be attached/detached during instance create. Separate per-instance attach/detach endpoints for live instances will follow in another PR, since I suspect there's extra saga work needing done to get it right.

This is closely tied with opte#420 -- the semantics of having external IPs of either/both type are described on that PR. helios / deploy will fail until that is merged and we can create a new helios + OPTE image.

Will close #1467 and close #1334.

@@ -265,6 +265,22 @@ pub struct ExternalIp {
pub kind: IpKind,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re: views::ExternalIp, I don't know if we want to change this to an enum (rather than rely on IpKind) to directly hand over a views::FloatingIp instead of needing users to (painfully) look it up again. The console works with this as-is, so as long as ip is flattened out then we can make this change.

@FelixMcFelix FelixMcFelix marked this pull request as ready for review November 28, 2023 19:50
@rcgoodfellow
Copy link
Contributor

It would be good to have docs for this in the form of a guide on managing floating IPs.

@rcgoodfellow
Copy link
Contributor

Since there are new endpoints, we also need updates to oxide.rs to add those endpoints.

@rcgoodfellow
Copy link
Contributor

I deployed this branch and was able to launch an instance with a floating IP and an ephemeral IP 🎉

The following worked for me.

$ oxide api /v1/floating-ips?project=murphy --method POST --input floater.json
$ cat floater.json
{
    "name": "floater",
    "description": "a floater",
    "address": "192.168.20.47"
}
$ oxide api /v1/instances?project=murphy --method POST --input - <<EOF
{
  "name": "myinst",
  "description": "my inst",
  "hostname": "myinst",
  "memory": 1073741824,
  "ncpus": 2,
  "disks": [
    {
      "type": "attach",
      "name": "bookworm-boot"
    }
  ],
  "external_ips": [{"type": "ephemeral"}, {"type": "floating", "floating_ip_name": "floater"}]
}
EOF

I had several missteps along the way, but the errors coming back from the API were very helpful and the corrections needed were clear 👍.

Upon launching the instance, I was able to ssh in via both the floating and ephemeral IP.

I was also able to verify that OPTE was preferring the floating IP over the ephemeral on egress for ICMP, UDP and TCP traffic.

Basic kicking of the tires all looks good here. Great work, I know this is a lot of integration work in addition to the underlying OPTE bits.

Copy link
Contributor

@rcgoodfellow rcgoodfellow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, Kyle. Just a few comments/questions.

use db::schema::external_ip::dsl;

// Verify this FIP is not attached to any instances/services.
if db_fip.parent_id.is_some() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should getting the db_fip value and acting on the value for the detach action below be done in a transaction to ensure that we can't get a race condition between attaching and deleting?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be improved by doing so, yes. The actual diesel::update attach/detach/delete operations all include a filter on parent_id, so currently if preempted by another write they will back out without doing anything unintended, but I'd prefer a transaction too.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good. Let's get a follow-up issue logged for that so we don't lose track.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The update_and_check() machinery is probably going to be helpful here. It lets one do conditional updates, say on the parent ID being NULL, and returning a detectable error in the case where that condition is not met.

use db::schema::external_ip::dsl;

// Verify this FIP is not attached to any instances/services.
if db_fip.parent_id.is_some() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question as the delete case, does the parent id check need to be done in a transaction context with the attach to avoid races?

) -> UpdateResult<FloatingIp> {
use db::schema::external_ip::dsl;

let Some(instance_id) = db_fip.parent_id else {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question on needing to be in transaction for db_fip check.

nexus/src/app/mod.rs Show resolved Hide resolved
@FelixMcFelix FelixMcFelix added this to the 5 milestone Dec 6, 2023
Copy link
Contributor

@rcgoodfellow rcgoodfellow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM and testing checks out, thanks!

I think it would be good to have someone who understands the instance lifecycle much better than me take a look. Perhaps @gjcolombo?

@rcgoodfellow
Copy link
Contributor

In the interest of getting things lined up for the release, I'm merging this.

@rcgoodfellow rcgoodfellow merged commit 1a3443c into main Dec 6, 2023
23 checks passed
@rcgoodfellow rcgoodfellow deleted the felixmcfelix/floating-ip branch December 6, 2023 23:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api Related to the API. database Related to database access networking Related to the networking.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Handle multiple external IP addresses per instance Implement Floating IPs
3 participants