Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NAT RPW #3804

Merged
merged 26 commits into from
Nov 18, 2023
Merged

NAT RPW #3804

merged 26 commits into from
Nov 18, 2023

Conversation

internet-diglett
Copy link
Contributor

@internet-diglett internet-diglett commented Jul 31, 2023

  • Add db table for tracking nat entries
  • Add endpoint for retrieving changesets
  • Update instance sagas to update table and trigger RPW
  • Periodically cleanup soft-deleted entries that no longer
    need to be sync'd by dendrite.

The other half of the RPW lives in Dendrite. It will periodically
check for a changeset, or check for a changeset when the trigger
endpoint is called by the relevant saga / nexus operation.

common/src/sql/dbinit.sql Outdated Show resolved Hide resolved
/// empty vec is returned.
#[endpoint {
method = GET,
path = "/rpw/nat/ipv4/changeset/{from_gen}"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tiny nitpick -- I'd just prefix these paths with "/nat", the fact that they're called from Dendrite is an implementation detail

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I still might remove the /rpw prefix, the fact that Nexus processes this as an RPW is an implementation detail.

@internet-diglett
Copy link
Contributor Author

Testing resolution of oxidecomputer/buildomat#38

* Add db table for tracking nat entries
* Add endpoint for retrieving changesets
* Update instance sagas to update table and trigger RPW
* Periodically cleanup soft-deleted entries that no longer
  need to be sync'd by dendrite.

The other half of the RPW lives in Dendrite. It will periodically
check for a changeset, or check for a changeset when the trigger
endpoint is called by the relevant saga / nexus operation.
@internet-diglett internet-diglett marked this pull request as ready for review October 19, 2023 21:52
@smklein smklein self-assigned this Nov 6, 2023
Copy link
Collaborator

@smklein smklein left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My first pass here is focused a bit more at the DB than the RPW itself

Comment on lines 14 to 19
// TODO correctness
// If we're not going to store ipv4 and ipv6
// NAT entries in the same table, and we don't
// need any of the special properties of the IpNetwork
// column type, does it make sense to use a different
// column type?
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couple questions about this:

  • I don't see anything in this struct that appears IPv4 specific -- doesn't ipnetwork::IpNetwork encapsulate both v4 and v6 addresses? Could this entire struct just be IpNatValues (and same elsewhere in this file) or would that be a bad idea?
  • If that genuinely is a bad idea, https://docs.rs/ipnetwork/0.20.0/ipnetwork/struct.Ipv4Network.html exists, and we could use that instead?

Either way, we should resolve this comment before merging!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By keeping the tables for v4 and v6 separate, we maintain a 1:1 mapping with the tables in dendrite, which does make troubleshooting, observability, and state reconciliation a bit more straightforward, so my gut says "keep them separate"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll try using Ipv4Network unless someone has an objection :)

Comment on lines 32 to 36
// TODO correctness
// If we're not going to store ipv4 and ipv6
// NAT entries in the same table, we should probably
// make the types more restrictive to prevent an
// accidental ipv6 entry from being created.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as the previous comment

// different target sled / vpc / mac, we will violate a uniqueness
// constraint (which is desired in this case).
//
// It seems that this isn't straightforward to express in diesel
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know it's not trivial, but it's probably worth expressing this in Diesel -- if we added a necessary column here, this raw SQL code would still compile, but fail to actually insert a valid entry.

I do think you had the right idea -- get moving with sql_query -- but hopefully the following code lets us add a little more type-safety here.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have not executed this code, but I have confirmed it compiles:

        use db::schema::ipv4_nat_entry::dsl;                                                                                                                                                                                                                                                                                                                        
        use diesel::sql_types;                                                                                                                                                                                                                                                                                                                                      
                                                                                                                                                                                                                                                                                                                                                                    
        // Look up any NAT entries that already have the exact parameters                                                                                                                                                                                                                                                                                           
        // we're trying to INSERT.                                                                                                                                                                                                                                                                                                                                  
        let matching_entry_subquery = dsl::ipv4_nat_entry                                                                                                                                                                                                                                                                                                           
            .filter(dsl::external_address.eq(nat_entry.external_address))                                                                                                                                                                                                                                                                                           
            .filter(dsl::first_port.eq(nat_entry.first_port))                                                                                                                                                                                                                                                                                                       
            .filter(dsl::last_port.eq(nat_entry.last_port))                                                                                                                                                                                                                                                                                                         
            .filter(dsl::sled_address.eq(nat_entry.sled_address))                                                                                                                                                                                                                                                                                                   
            .filter(dsl::vni.eq(nat_entry.vni))                                                                                                                                                                                                                                                                                                                     
            .filter(dsl::mac.eq(nat_entry.mac))                                                                                                                                                                                                                                                                                                                     
            .select((                                                                                                                                                                                                                                                                                                                                               
                dsl::external_address,                                                                                                                                                                                                                                                                                                                              
                dsl::first_port,                                                                                                                                                                                                                                                                                                                                    
                dsl::last_port,                                                                                                                                                                                                                                                                                                                                     
                dsl::sled_address,                                                                                                                                                                                                                                                                                                                                  
                dsl::vni,                                                                                                                                                                                                                                                                                                                                           
                dsl::mac,                                                                                                                                                                                                                                                                                                                                           
            ));                                                                                                                                                                                                                                                                                                                                                     
                                                                                                                                                                                                                                                                                                                                                                    
        // SELECT exactly the values we're trying to INSERT, but only                                                                                                                                                                                                                                                                                               
        // if it does not already exist.                                                                                                                                                                                                                                                                                                                            
        let new_entry_subquery = diesel::dsl::select((                                                                                                                                                                                                                                                                                                              
            nat_entry.external_address.into_sql::<sql_types::Inet>(),                                                                                                                                                                                                                                                                                               
            nat_entry.first_port.into_sql::<sql_types::Int4>(),                                                                                                                                                                                                                                                                                                     
            nat_entry.last_port.into_sql::<sql_types::Int4>(),                                                                                                                                                                                                                                                                                                      
            nat_entry.sled_address.into_sql::<sql_types::Inet>(),                                                                                                                                                                                                                                                                                                   
            nat_entry.vni.into_sql::<sql_types::Int4>(),                                                                                                                                                                                                                                                                                                            
            nat_entry.mac.into_sql::<sql_types::BigInt>(),                                                                                                                                                                                                                                                                                                          
        )).filter(diesel::dsl::not(diesel::dsl::exists(matching_entry_subquery)));                                                                                                                                                                                                                                                                                  
                                                                                                                                                                                                                                                                                                                                                                    
        diesel::insert_into(dsl::ipv4_nat_entry)                                                                                                                                                                                                                                                                                                                    
            .values(new_entry_subquery)                                                                                                                                                                                                                                                                                                                             
            .into_columns((                                                                                                                                                                                                                                                                                                                                         
                dsl::external_address,                                                                                                                                                                                                                                                                                                                              
                dsl::first_port,                                                                                                                                                                                                                                                                                                                                    
                dsl::last_port,                                                                                                                                                                                                                                                                                                                                     
                dsl::sled_address,                                                                                                                                                                                                                                                                                                                                  
                dsl::vni,                                                                                                                                                                                                                                                                                                                                           
                dsl::mac,                                                                                                                                                                                                                                                                                                                                           
            ))                                                                                                                                                                                                                                                                                                                                                      
            .execute_async(&*self.pool_connection_authorized(opctx).await?)                                                                                                                                                                                                                                                                                         
            .await                                                                                                                                                                                                                                                                                                                                                  
            .map_err(|e| public_error_from_diesel(e, ErrorHandler::Server))?; 

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will give this a try! Thanks for giving this a look!

Comment on lines 83 to 90
"
UPDATE omicron.public.ipv4_nat_entry
SET
version_removed = nextval('omicron.public.ipv4_nat_version'),
time_deleted = now()
WHERE time_deleted IS NULL AND version_removed IS NULL AND id = $1 AND version_added = $2
",
)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we definitely need custom SQL to handle nextval, but the rest of this could be vanilla diesel.

If we add the following function:

fn ipv4_nat_next_version() -> diesel::expression::SqlLiteral<BigInt> {                                                                                                                                                                                                                                                                                            
    diesel::dsl::sql::<BigInt>("nextval('omicron.public.ipv4_nat_version')")                                                                                                                                                                                                                                                                                      
}

Then this particular snippet of SQL could be replaced with the following:

        use db::schema::ipv4_nat_entry::dsl;                                                                                                                                                                                                                                                                                                                      
        let updated_rows = diesel::update(dsl::ipv4_nat_entry)                                                                                                                                                                                                                                                                                                    
            .set((                                                                                                                                                                                                                                                                                                                                                
                dsl::version_removed.eq(ipv4_nat_next_version().nullable()),                                                                                                                                                                                                                                                                                      
                dsl::time_deleted.eq(Utc::now()),                                                                                                                                                                                                                                                                                                                 
            ))                                                                                                                                                                                                                                                                                                                                                    
            .filter(dsl::time_deleted.is_null())                                                                                                                                                                                                                                                                                                                  
            .filter(dsl::version_removed.is_null())                                                                                                                                                                                                                                                                                                               
            .filter(dsl::id.eq(nat_entry.id))                                                                                                                                                                                                                                                                                                                     
            .filter(dsl::version_added.eq(nat_entry.version_added))                                                                                                                                                                                                                                                                                               
            .execute_async(&*self.pool_connection_authorized(opctx).await?)                                                                                                                                                                                                                                                                                       
            .await                                                                                                                                                                                                                                                                                                                                                
            .map_err(|e| public_error_from_diesel(e, ErrorHandler::Server))?;  

Comment on lines 47 to 48
pub version_added: SqlU32,
pub version_removed: Option<SqlU32>,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we be storing these versions as BigInts, rather than SqlU32s?

If we overflow a u32 (very possible!) these values would start throwing deserialization errors.

changeset.sort_by_key(|e| e.gen);
Ok(HttpResponseOk(changeset))
};
apictx.external_latencies.instrument_dropshot_handler(&rqctx, handler).await
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
apictx.external_latencies.instrument_dropshot_handler(&rqctx, handler).await
apictx.internal_latencies.instrument_dropshot_handler(&rqctx, handler).await

let view = Ipv4NatGenView { gen };
Ok(HttpResponseOk(view))
};
apictx.external_latencies.instrument_dropshot_handler(&rqctx, handler).await
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
apictx.external_latencies.instrument_dropshot_handler(&rqctx, handler).await
apictx.internal_latencies.instrument_dropshot_handler(&rqctx, handler).await

Ok(())
}

pub async fn ipv4_nat_cancel_delete(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this be marked #[cfg(test)]? The act of cancelling a delete seems somewhat unsafe to me outside test contexts.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was originally intended for usage in saga unwinds, but I think we've since decided that we shouldn't try to unwind deletion of network configuration and instead consider it the "point of no return"

struct RpwNatPathParam {
/// which change number to start generating
/// the change set from
from_gen: u32,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as other comment, about u32 vs u64 representation of generation numbers

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this still needs updating

Comment on lines 596 to 603
// TODO: Remove (maybe)
// This endpoint may be more trouble than it's worth. The reason this endpoint
// may be a headache is because the underlying SEQUENCE in the DB will advance,
// even on failed transactions. This is not a big deal, but it will lead to
// callers trying to catch up to entries that don't exist if they check this
// endpoint first. As an alternative, we could just page through the
// changesets until we get nothing back, then revert to periodic polling
// of the changeset endpoint.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where'd we land on this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't seen a place where it has merit, so I'll remove it.

@smklein smklein assigned internet-diglett and unassigned smklein Nov 6, 2023
async {
let log = &opctx.log;

let result = self.datastore.ipv4_nat_current_version(opctx).await;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To verify my understanding:

  • This task periodically reads the latest generation number from the DB
  • It sends that generation number to Dendrite instances (client.ipv4_nat_generation())
  • It cleans old records

So the "dendrite-playing-catch-up" entirely relies on Dendrite reaching back to the internal API within Nexus, correct?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's pretty close to how it works:

  • Omicron periodically reads the latest generation in DB
  • Omicron checks to see what generation numbers the various dendrite instances have (we are querying dendrite, not triggering any work in dendrite)
  • Omicron only cleans up records older than the earliest found dendrite generation

The idea is that we don't clean up changes that a dendrite instance hasn't picked up yet. We also shouldn't have a detrimental TOCTOU race because the only way a dendrite instance moves backwards is if it crashes, and in that case it's okay that we're cleaning up old entries because we don't want that dendrite instance to re-apply them anyway.

Dendrite only catches up when:

  • An event in the control plane calls the trigger endpoint on the dendrite API (a "doorbell" of sorts)
  • Dendrite's internal RPW interval timer expires

In these events, yes, Dendrite will reach back via the internal API to catch up on the changes that have occurred.

@rcgoodfellow
Copy link
Contributor

When you get a chance, could you merge in main so this can be tested in the a4x2 testbed topology?

@internet-diglett
Copy link
Contributor Author

Note that since we are changing some of the types used by dendrite via the internal API, the deploy task will probably fail since we must first push our changes here, update dendrite, then reference the updated dendrite in the package manifest here.

Copy link
Collaborator

@smklein smklein left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Last few comments are mostly nitpicks, but this looks good! Thanks for the query cleanup!

struct RpwNatPathParam {
/// which change number to start generating
/// the change set from
from_gen: u32,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this still needs updating

/// empty vec is returned.
#[endpoint {
method = GET,
path = "/rpw/nat/ipv4/changeset/{from_gen}"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I still might remove the /rpw prefix, the fact that Nexus processes this as an RPW is an implementation detail.

.await?;
}
IpNetwork::V6(_v6net) => {
// TODO: implement handling of v6 nat.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is unimplemented, we should throw an error rather than silently doing nothing

}
}

/// Database representation of an Ipv4 NAT Generation.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey, it took me a while to figure this out -- this is a view of a sequence, not a table, and the log_cnt + is_called values are synthetically provided from CockroachDB.

That's worth documenting! I do not find that behavior obvious - I was trying to find out where these were used.

Related: Do we actually need to include log_cnt and is_called here? When we query for the Ipv4NatGen, we only read last_value, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, thus far we only care about last_value, I can delete the other two columns from the struct

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, it seems that since we're not using this struct anymore thanks to some of the other changes that have been made :D

@@ -2899,14 +2847,67 @@ CREATE TABLE IF NOT EXISTS omicron.public.db_metadata (
CHECK (singleton = true)
);


CREATE SEQUENCE IF NOT EXISTS omicron.public.ipv4_nat_version START 1 INCREMENT 1;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a super minor nitpick, but we did some work to ensure that the order of declarations in dbinit.sql should mostly not matter.

Can we move this stuff up earlier? I find it a little odd to have the ordering of dbinit be:

  • Most of the structures we use
  • The db_metadata table declaration
  • The tables that were added for NAT
  • INSERT into db_metadata

(I kinda think db_metadata's declaration + INSERT should be right next to each other, at the end of the file, if opssible)

Copy link
Collaborator

@smklein smklein left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One minor ordering comment aside, this looks good to merge to me!

@@ -0,0 +1 @@
CREATE SEQUENCE IF NOT EXISTS omicron.public.ipv4_nat_version START 1 INCREMENT 1;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before you merge this -- looks like 10.0.0 is already used on main?

https://github.com/oxidecomputer/omicron/tree/main/schema/crdb/10.0.0

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, I'll need to bump dendrite before merging anyway so it will give me one another chance to get my version ahead of main 😂

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dendrite PR merged, bumping Omicron dependency version and Schema version

@internet-diglett internet-diglett merged commit 14f8f31 into main Nov 18, 2023
21 checks passed
@internet-diglett internet-diglett deleted the nat_rpw branch November 18, 2023 04:22
karencfv pushed a commit to karencfv/omicron that referenced this pull request Nov 21, 2023
This was accidentally merged in with 11.0.0 in oxidecomputer#3804

Fixes oxidecomputer#4530 (but see oxidecomputer#4531 for followup work)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants