Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Blueprint execution: Send OPTE firewall rules for services #5233

Merged
merged 6 commits into from
Mar 12, 2024

Conversation

jgallagher
Copy link
Contributor

@jgallagher jgallagher commented Mar 8, 2024

This is the reworked and caught-up-with-main #5157 (which I'll close momentarily). I believe I addressed all the comments on the PR except those related to its extension trait style and the nexus-capabilities crate. This introduces a more focused nexus-networking crate that exposes free functions instead of extension traits.

I'll run this through madrid before merging, but I believe it should work (absent any mistakes I made merging changes across branches) now that #5202 has landed.

Closes #4886

Comment on lines 93 to 107
// After deploying omicron zones, we may need to refresh OPTE service
// firewall rules. This is an idempotent operation, so we don't attempt
// to optimize out calling it in unnecessary cases, although we expect
// _most_ cases this is not needed.
nexus_networking::plumb_service_firewall_rules(
datastore,
&opctx,
&[],
&opctx,
&opctx.log,
)
.await
.context("failed to plumb service firewall rules to sleds")
.map_err(|err| vec![err])?;

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to triple-check -- this is the real meat of this PR, and the rest is more-or-less refactoring?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct

nexus/reconfigurator/execution/src/lib.rs Outdated Show resolved Hide resolved
@jgallagher jgallagher force-pushed the john/nexus-networking-free-fns branch from c788362 to f37525f Compare March 11, 2024 13:02
@jgallagher
Copy link
Contributor Author

Installed this on madrid, then replaced the nexus binaries with one that differs only by:

--- a/nexus/src/app/deployment.rs
+++ b/nexus/src/app/deployment.rs
@@ -122,7 +122,7 @@ impl super::Nexus {
             &sled_rows,
             &zpool_rows,
             &ip_pool_range_rows,
-            NEXUS_REDUNDANCY,
+            NEXUS_REDUNDANCY + 1,
         )?;

         // The choice of which inventory collection to use here is not

I ran through the normal blueprint dance, and ended up with a fourth Nexus running with an external IP:

# omdb db network list-eips
 IP              PORTS        KIND      STATE     OWNER_KIND  OWNER_ID                              OWNER_NAME
 172.20.28.1/32  0/65535      floating  Attached  service     cec24a89-df52-4e08-bdb5-a481d8002495  ExternalDns
 172.20.28.2/32  0/65535      floating  Attached  service     80a56da5-fcec-4bf3-8cc2-cd73d76aa09c  Nexus
 172.20.28.3/32  0/65535      floating  Attached  service     e48f11fa-b31f-4378-952f-e8ce8439d3ac  Nexus
 172.20.28.4/32  0/65535      floating  Attached  service     c1f04b44-464f-4319-ba41-c89cb457f8f3  Nexus
 172.20.28.5/32  0/16383      SNAT      Attached  service     09e99277-4765-49bb-80a2-2209983d077c  Ntp
 172.20.28.6/32  16384/32767  SNAT      Attached  service     b44d208c-7d12-4ec0-abcb-dc2ac2b5340a  Ntp
 172.20.28.7/32  0/65535      floating  Attached  service     4d48055e-c486-4788-8c0f-f6652baf2385  Nexus

and correct OPTE firewall rules:

BRM42220004 # opteadm list-ports
LINK   MAC ADDRESS        IPv4 ADDRESS  EPHEMERAL IPv4  FLOATING IPv4  IPv6 ADDRESS  EXTERNAL IPv6  FLOATING IPv6  STATE
opte0  A8:40:25:FF:CA:F7  172.30.3.5    None            None           None          None           None           running
opte1  A8:40:25:FF:EC:7A  172.30.1.5    None            172.20.28.1    None          None           None           running
opte2  A8:40:25:FF:80:00  172.30.2.8    None            172.20.28.7    None          None           None           running
BRM42220004 # opteadm dump-layer -p opte2 firewall
Port opte2 - Layer firewall
======================================================================
Inbound Flows
----------------------------------------------------------------------
PROTO  SRC IP  SPORT  DST IP  DPORT  HITS  ACTION

Outbound Flows
----------------------------------------------------------------------
PROTO  SRC IP  SPORT  DST IP  DPORT  HITS  ACTION

Inbound Rules
----------------------------------------------------------------------
ID   PRI    HITS  PREDICATES            ACTION
12   65534  0     inner.ip.proto=TCP    "Stateful Allow"
                  inner.ulp.dst=80,443

DEF  --     0     --                    "deny"

Outbound Rules
----------------------------------------------------------------------
ID   PRI  HITS  PREDICATES  ACTION
DEF  --   2     --          "stateful allow"

and confirmed I could talk to it from outside madrid:

% strace -f --trace=connect oxide --resolve recovery.sys.madrid.eng.oxide.computer:443:172.20.28.7 ping 2>&1 | rg -v 'attached|exited'
[pid 442841] connect(9, {sa_family=AF_INET, sin_port=htons(443), sin_addr=inet_addr("172.20.28.7")}, 16) = -1 EINPROGRESS (Operation now in progress)
{
  "status": "ok"
}

use std::net::SocketAddrV6;
use uuid::Uuid;

pub fn sled_lookup<'a>(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not use LookupPath directly?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was just moving stuff what existed without trying to make any changes. I can drop this one and keep Nexus::sled_client around? (That one is called a half dozen times or so; I guess we could also replace all of those with LookupPath invocations, but that's entirely unrelated to this PR, I think.)

// License, v. 2.0. If a copy of the MPL was not distributed with this
// file, You can obtain one at https://mozilla.org/MPL/2.0/.

//! Functionality related to firewall rules.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the contents of this file basically just moved, not changed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct; only the function parameters changed (to replace self with datastore, etc.), and the corresponding changes to the function bodies.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I see that now. Not a big deal either way.

@jgallagher jgallagher merged commit fdbbbe5 into main Mar 12, 2024
22 checks passed
@jgallagher jgallagher deleted the john/nexus-networking-free-fns branch March 12, 2024 21:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

add sled: planner support for deploying Nexus
3 participants