Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add LLDP configuration support #6185

Merged
merged 1 commit into from
Aug 22, 2024
Merged

Add LLDP configuration support #6185

merged 1 commit into from
Aug 22, 2024

Conversation

Nieuwejaar
Copy link
Contributor

Several months ago, we added basic support for running lldp in the switch zone. That support consisted of plumbing through the chassis ID and sidecar revision, and starting the daemon.

This change adds support for enabling LLDP on individual ports, and for configuring per-port settings. These settings are consumed by wicket and are plumbed through the bootstore and into crdb. They are passed on to the daemon in the switch zone via SMF properties.

With this new functionality, we will be announcing our presence on a customer's network and should be visible through their network management tools. The daemon running in the switch zone will be collecting information from its peers on their network, but that collected data is not yet plumbed back through the oxide API/CLI. That will be addressed as follow-on work.

@@ -77,7 +77,7 @@ impl<'t> SmfHelper<'t> {
"addpropvalue",
&prop.to_string(),
&format!("{}:", valtype.to_string()),
&val.to_string(),
&format!("\"{}\"", val.to_string()),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is needed to allow the settings of property values containing spaces, e.g., "oxide switch 1".

}
}

table! {
lldp_service_config (id) {
lldp_link_config (id) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes the per-link lldp config data directly visible from that link's configuration. There was a level of indirection through a service_config which wasn't necessary.

@@ -620,8 +620,8 @@ output.intermediate_only = true
service_name = "lldp"
source.type = "prebuilt"
source.repo = "lldp"
source.commit = "30e5d89fae9190c69258ca77d5d5a1acec064742"
source.sha256 = "f58bfd1b77748544b5b1a99a07e52bab8dc5673b9bd3a745ebbfdd614d492328"
source.commit = "2a9f92dd55817af792670cd86e6a01676f5c79a0"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This LLDP update gives us a daemon that can understand the new per-port properties.

/*
* The old lldp_service_config_id is being replaced with lldp_link_config_id.
*/
ALTER TABLE omicron.public.switch_port_settings_link_config DROP COLUMN IF EXISTS lldp_service_config_id;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This column and the tables below can be dropped without losing any data, as no prior release has had any way to populate any of them. They were added as placeholders/stakes-in-the-ground last year.

@@ -749,7 +750,7 @@ impl ServiceInner {
.iter()
.map(|config| NexusTypes::PortConfigV2 {
port: config.port.clone(),
routes: config
routes: config
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a bunch of spacing-only changes here, which cargo fmt added, and which cargo fmt -- -check now requires. Only the lldp clause at 799 represents a substantive change.

info!(self.inner.log, "Setting up uplinkd service");
let smfh = SmfHelper::new(&zone, &SwitchService::Uplink);
info!(self.inner.log, "ensuring scrimlet uplinks");
let usmfh = SmfHelper::new(&zone, &SwitchService::Uplink);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have two different SMF helpers, as we are updating the properties for two different services in the loop below.

Comment on lines +1399 to +1401
diesel::delete(lldp_link_config::dsl::lldp_link_config)
.filter(lldp_link_config::id.eq_any(lldp_link_ids))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a huge deal, but out of curiosity -- lldp_link_config has a time_deleted field, but it looks like we're doing hard deletion here instead of soft-deletion. When do you plan on using soft vs hard deletion for this table?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't actually have a way to delete the LLDP config at all - other than deleting the whole port, I guess. The soft delete field was carried over from the earlier, never-used, tables. I'm not really sure why we would use a soft delete for this. I'll see if Ry had a reason for going this way initially, but I expect I'll take it out.

.select(LldpServiceConfig::as_select())
use db::schema::lldp_link_config;
result.link_lldp = lldp_link_config::dsl::lldp_link_config
.filter(lldp_link_config::id.eq_any(lldp_link_ids))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we want to use soft-deletion, this should also probably:

.filter(lldp_link_config::dsl::time_deleted.is_null())

If we want to use hard-deletion, this isn't necessary, but then we should drop the time_deleted column

..Default::default()
},
Some(l) => LldpLinkConfigCreate {
enabled: l.status == LldpAdminStatus::Enabled,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this translation problematic? Seems weird to me to report "enabled: false" when we might be still be allowing rx_only or tx_only traffic.

(This isn't my domain of expertise, so I trust your judgement, just seems odd to have those settings convert to "LLDP if off!")

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a mismatch between what we support, and what the LLDP RFC supports. The protocol allows a node to be rx-only or tx-only, but the old LLDP tables only included enabled/disabled. It looks like Cisco at least does support all 4 options, so maybe we should as well. I'll run this by Ry and/or Trey to see whether we should offer that flexibility.

lldp: LldpLinkConfigCreate {
enabled: true,
link_name: Some("Link Name".into()),
link_description: Some("link_ Dscription".into()),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

super nit, because this is a test-only string and really doesn't matter, but:

Suggested change
link_description: Some("link_ Dscription".into()),
link_description: Some("link description".into()),

@@ -118,6 +118,22 @@ switch = "switch0"
# Neighbors we expect to peer with over BGP on this port.
bgp_peers = []

# LLDP settings for this port
#[rack_network_config.switch0.qsfp0.lldp]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it intentional that this config is commented out?

@@ -393,6 +471,8 @@ pub struct PortConfigV2 {
/// Whether or not to set autonegotiation
#[serde(default)]
pub autoneg: bool,
/// LLDP configuration for this port
pub lldp: Option<LldpPortConfig>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this require a #[serde(default)] to be backwards compatible?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope. The fact that it's an Option means that it's legal for it to be missing entirely. Essentially, the [serde(default)] is implicitly None for Options.

I tested this with both fresh installs and mupdateing from older, pre-llldp instances as well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, ignore my comments then! I went searching for details about this online before leaving that comment but could not find an answer. Thanks for the dilligence.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like Great White says, "My, my, my, I'm once bitten, twice shy, babe".

Copy link
Contributor

@andrewjstone andrewjstone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good @Nieuwejaar. I was mainly looking for bootstore related backwards compat issues, but also took a quick look elsewhere. I think since we may read a PortConfigV2 in the bootstore that is lacking the new lldp field, we need to tag it with #[serde(default)]. Otherwise, I would expect deserialization to fail on upgrades. I'd suggest testing out an upgrade on madrid or london to be sure everything works as expected if you haven't already.

@Nieuwejaar Nieuwejaar merged commit 45fb575 into main Aug 22, 2024
23 checks passed
@Nieuwejaar Nieuwejaar deleted the lldp branch August 22, 2024 01:34
smklein added a commit that referenced this pull request Aug 26, 2024
#6185 introduced a schema
change which added a non-nullable column `lldp_link_config_id` to the
`switch_port_settings_link_config` table.

However, on systems where `switch_port_settings_link_config` had rows,
there was no "default" value, and the schema update appears to fail.
This is currently the case on our dogfood system.

To mitigate: this PR makes the `lldp_link_config_id` column nullable, by
updating the existing schema change (for cases where it could not
complete previously) and by adding a new schema change (for cases where
the `switch_port_settings_link_config` table was empty, and the schema
change **did** previously complete).

Fixes #6433
nickryand pushed a commit to nickryand/omicron that referenced this pull request Aug 27, 2024
oxidecomputer#6185 introduced a schema
change which added a non-nullable column `lldp_link_config_id` to the
`switch_port_settings_link_config` table.

However, on systems where `switch_port_settings_link_config` had rows,
there was no "default" value, and the schema update appears to fail.
This is currently the case on our dogfood system.

To mitigate: this PR makes the `lldp_link_config_id` column nullable, by
updating the existing schema change (for cases where it could not
complete previously) and by adding a new schema change (for cases where
the `switch_port_settings_link_config` table was empty, and the schema
change **did** previously complete).

Fixes oxidecomputer#6433
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants