Include enums in dbinit_equal_sum_of_all_up test #4472

Merged Nov 10, 2023 (7 commits)
57 changes: 53 additions & 4 deletions nexus/tests/integration_tests/schema.rs
@@ -168,13 +168,14 @@ async fn query_crdb_schema_version(crdb: &CockroachInstance) -> String {
//
// Note that for the purposes of schema comparisons, we don't care about parsing
// the contents of the database, merely the schema and equality of contained data.
-#[derive(Eq, PartialEq, Clone, Debug)]
Contributor Author: I had to get rid of Eq to make this work because floats don't impl Eq. Fortunately, it seems our equality checks only need PartialEq anyway.
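As a quick illustration of why the derive had to change (a minimal sketch, not the PR's code): f32 implements PartialEq but not Eq, because NaN != NaN violates the reflexivity that Eq promises, so any type containing a float can only derive PartialEq.

```rust
// f32 is PartialEq but not Eq: NaN != NaN breaks the reflexivity
// that Eq requires, so a type embedding f32 cannot derive Eq.
#[derive(PartialEq, Debug)]
struct Float4Value(f32);

fn main() {
    // Ordinary float values compare as expected via PartialEq.
    assert_eq!(Float4Value(1.5), Float4Value(1.5));
    // NaN is never equal to itself, which is exactly why Eq is off the table.
    assert_ne!(Float4Value(f32::NAN), Float4Value(f32::NAN));
}
```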

+#[derive(PartialEq, Clone, Debug)]
enum AnySqlType {
    DateTime,
    String(String),
    Bool(bool),
    Uuid(Uuid),
    Int8(i64),
+    Float4(f32),
    // TODO: This isn't exhaustive, feel free to add more.
    //
    // These should only be necessary for rows where the database schema changes also choose to
@@ -213,6 +214,9 @@ impl<'a> tokio_postgres::types::FromSql<'a> for AnySqlType {
        if i64::accepts(ty) {
            return Ok(AnySqlType::Int8(i64::from_sql(ty, raw)?));
        }
+        if f32::accepts(ty) {
+            return Ok(AnySqlType::Float4(f32::from_sql(ty, raw)?));
+        }
        Err(anyhow::anyhow!(
            "Cannot parse type {ty}. If you're trying to use this type in a table which is populated \
during a schema migration, consider adding it to `AnySqlType`."
@@ -224,7 +228,7 @@ during a schema migration, consider adding it to `AnySqlType`."
}
}

-#[derive(Eq, PartialEq, Debug)]
+#[derive(PartialEq, Debug)]
struct NamedSqlValue {
    // It's a little redundant to include the column name alongside each value,
    // but it results in a prettier diff.
@@ -240,7 +244,7 @@ impl NamedSqlValue {
}

// A generic representation of a row of SQL data
-#[derive(Eq, PartialEq, Debug)]
+#[derive(PartialEq, Debug)]
struct Row {
    values: Vec<NamedSqlValue>,
}
@@ -339,6 +343,42 @@ async fn crdb_select(
    result
}

async fn crdb_list_enums(crdb: &CockroachInstance) -> Vec<Row> {
    let client = crdb.connect().await.expect("failed to connect");

    // https://www.cockroachlabs.com/docs/stable/pg-catalog
    // https://www.postgresql.org/docs/current/catalog-pg-enum.html
    let sql = "SELECT n.nspname as schema,
            t.typname as typename,
            e.enumsortorder as sort_order,
            e.enumlabel as label
        FROM pg_type t
        JOIN pg_enum e ON t.oid = e.enumtypid
        JOIN pg_catalog.pg_namespace n ON n.oid = t.typnamespace
        WHERE t.typtype = 'e'
        ORDER BY n.nspname, t.typname, e.enumsortorder;"
        .to_string();
    let rows = client
        .query(&sql, &[])
        .await
        .unwrap_or_else(|_| panic!("failed to list enums"));
    client.cleanup().await.expect("cleaning up after wipe");

    let mut result = vec![];
    for row in rows {
        let mut row_result = Row::new();
        for i in 0..row.len() {
            let column_name = row.columns()[i].name();
            row_result.values.push(NamedSqlValue {
                column: column_name.to_string(),
                value: row.get(i),
            });
        }
        result.push(row_result);
    }
    result
}

async fn read_all_schema_versions() -> BTreeSet<SemverVersion> {
let mut all_versions = BTreeSet::new();

@@ -569,10 +609,11 @@ const PG_INDEXES: [&'static str; 5] =
const TABLES: [&'static str; 4] =
    ["table_catalog", "table_schema", "table_name", "table_type"];

-#[derive(Eq, PartialEq, Debug)]
+#[derive(PartialEq, Debug)]
struct InformationSchema {
    columns: Vec<Row>,
    constraint_column_usage: Vec<Row>,
+    enums: Vec<Row>,
    key_column_usage: Vec<Row>,
    referential_constraints: Vec<Row>,
    views: Vec<Row>,
@@ -589,6 +630,11 @@ impl InformationSchema {
        // the columns diff especially needs this: it can be 20k lines otherwise
        similar_asserts::assert_eq!(self.tables, other.tables);
        similar_asserts::assert_eq!(self.columns, other.columns);
+        similar_asserts::assert_eq!(
+            self.enums,
+            other.enums,
+            "Enums did not match. Members must have the same order in dbinit.sql and migrations. If a migration adds a member, it should be added at the end of the definition in dbinit.sql."
+        );
        similar_asserts::assert_eq!(self.views, other.views);
        similar_asserts::assert_eq!(
            self.table_constraints,
@@ -624,6 +670,8 @@ impl InformationSchema {
        )
        .await;

+        let enums = crdb_list_enums(crdb).await;
+
        let constraint_column_usage = crdb_select(
            crdb,
            CONSTRAINT_COLUMN_USAGE.as_slice().into(),
@@ -694,6 +742,7 @@ impl InformationSchema {
        Self {
            columns,
            constraint_column_usage,
+            enums,
            key_column_usage,
            referential_constraints,
            views,
6 changes: 3 additions & 3 deletions schema/crdb/dbinit.sql
@@ -191,7 +191,6 @@ CREATE UNIQUE INDEX IF NOT EXISTS lookup_switch_by_rack ON omicron.public.switch

CREATE TYPE IF NOT EXISTS omicron.public.service_kind AS ENUM (
    'clickhouse',
-    'clickhouse_keeper',
Contributor: I'm sure what I'm about to say is absolutely taboo, but I think we should actually update the migration and not this. The specific update would be an AFTER 'clickhouse' clause added to the enum's ALTER statement. For deployments where 'clickhouse_keeper' already exists, that migration step won't be re-run, but for new systems it'll be the same as what we have today.

Contributor Author (@david-crespo, Nov 8, 2023): Oh, that's very cool. I didn't know you could insert it into a particular spot. We might have issues with the order-number comparison if we do that: I think when you insert between two existing members, the new member gets a decimal order number halfway between them. Then again, I think we'd also have this problem if we ever deleted a variant. Important not-really-edge cases to consider.

Contributor: It shouldn't be an issue, because the enum isn't stored as a decimal. You can find an overview of the format here. Postgres stores enum sort orders as floats, so you'd have [1.0, 2.0], and inserting a value between them would end up with [1.0, 1.5, 2.0]. CRDB just does some fancy byte-index padding, which essentially ends up being a similar concept. So this should still work as expected.
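To make the midpoint behavior described above concrete, here is a minimal sketch (illustrative only: `insert_between` is an invented helper, and CRDB's actual byte-padding encoding differs in representation, though not in concept):

```rust
// Illustrative sketch of the sort-order concept: existing enum
// members carry orders like [1.0, 2.0], and inserting a member
// between two neighbors assigns the midpoint, preserving relative
// order without renumbering anything.
fn insert_between(orders: &mut Vec<f32>, index: usize) {
    let mid = (orders[index - 1] + orders[index]) / 2.0;
    orders.insert(index, mid);
}

fn main() {
    let mut orders = vec![1.0_f32, 2.0];
    // Insert a new member between positions 0 and 1.
    insert_between(&mut orders, 1);
    assert_eq!(orders, vec![1.0, 1.5, 2.0]);
}
```

A freshly initialized database, by contrast, numbers its members with consecutive whole numbers, which is exactly the mismatch discussed next.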

Contributor Author: I'm saying those order numbers won't match what's generated from scratch in dbinit.sql, which would be [1.0, 2.0, 3.0], so the test will fail as currently written.

Contributor Author: Nope! To my surprise, this still passes. I don't think we can change the migrations that have already been applied on customer racks, but I agree the AFTER trick is what we should do in the future.

diff --git a/schema/crdb/4.0.0/up1.sql b/schema/crdb/4.0.0/up1.sql
index b26b646d6..ba67d8394 100644
--- a/schema/crdb/4.0.0/up1.sql
+++ b/schema/crdb/4.0.0/up1.sql
@@ -1 +1 @@
-ALTER TYPE omicron.public.service_kind ADD VALUE IF NOT EXISTS 'clickhouse_keeper';
+ALTER TYPE omicron.public.service_kind ADD VALUE IF NOT EXISTS 'clickhouse_keeper' AFTER 'clickhouse';

diff --git a/schema/crdb/4.0.0/up2.sql b/schema/crdb/4.0.0/up2.sql
index 018d39c13..e71cb22f3 100644
--- a/schema/crdb/4.0.0/up2.sql
+++ b/schema/crdb/4.0.0/up2.sql
@@ -1 +1 @@
-ALTER TYPE omicron.public.dataset_kind ADD VALUE IF NOT EXISTS 'clickhouse_keeper';
+ALTER TYPE omicron.public.dataset_kind ADD VALUE IF NOT EXISTS 'clickhouse_keeper' AFTER 'clickhouse';

diff --git a/schema/crdb/dbinit.sql b/schema/crdb/dbinit.sql
index 6e8ab0e39..da842cbfe 100644
--- a/schema/crdb/dbinit.sql
+++ b/schema/crdb/dbinit.sql
@@ -191,6 +191,7 @@ CREATE UNIQUE INDEX IF NOT EXISTS lookup_switch_by_rack ON omicron.public.switch
 
 CREATE TYPE IF NOT EXISTS omicron.public.service_kind AS ENUM (
   'clickhouse',
+  'clickhouse_keeper',
   'cockroach',
   'crucible',
   'crucible_pantry',
@@ -201,7 +202,6 @@ CREATE TYPE IF NOT EXISTS omicron.public.service_kind AS ENUM (
   'ntp',
   'oximeter',
   'tfport',
-  'clickhouse_keeper',
   'mgd'
 );
 
@@ -402,9 +402,9 @@ CREATE TYPE IF NOT EXISTS omicron.public.dataset_kind AS ENUM (
   'crucible',
   'cockroach',
   'clickhouse',
+  'clickhouse_keeper',
   'external_dns',
-  'internal_dns',
-  'clickhouse_keeper'
+  'internal_dns'
 );
 
 /*

Contributor: If we have to leave the migrations as-is, we should at least comment on why the enum arms are out of order.

    'cockroach',
    'crucible',
    'crucible_pantry',
@@ -202,6 +201,7 @@ CREATE TYPE IF NOT EXISTS omicron.public.service_kind AS ENUM (
    'ntp',
    'oximeter',
    'tfport',
+    'clickhouse_keeper',
Collaborator (@smklein, Nov 9, 2023): @zephraph asked the following extremely relevant question:

"is it unsafe to update the migrations to add the AFTER clause? That migration won't rerun on customer deployments, right? Even if it did, the enum is already present so it wouldn't try to readd it"

So there are a few cases we need to consider:

1. Someone hasn't deployed yet. In this case, whatever dbinit.sql says goes, so it "matches" trivially during formatting.
2. Someone ran with an old version of dbinit.sql and applied the schema migrations to get to this order.
   - Before this PR: clickhouse_keeper appears at the end of the enum in the deployed DB, which doesn't match dbinit.sql.
   - After this PR: dbinit.sql matches the order applied by schema changes.
3. Someone ran with a newish version of dbinit.sql and hasn't applied any schema changes.
   - Before this PR: clickhouse_keeper does not appear at the end of the enum in the deployed DB.
   - After this PR: the enum order is no longer correct! In the deployed DB, clickhouse_keeper is early in the order, but this doesn't match dbinit.sql, which thinks it shows up later.

By making this change, we "fix" case (2), but we "break" case (3).

I point this out to identify: if you rely on enum order after this change, there are still deployments that exist where we do not have such a guarantee.
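The order-dependence described above can be checked mechanically. As a rough sketch (purely illustrative: `enum_members` and the row shapes are assumptions, not the PR's code), one could group pg_enum rows by type in enumsortorder order, so two deployments' enums compare member-by-member regardless of whether the stored orders are whole numbers or midpoints:

```rust
use std::collections::BTreeMap;

// Illustrative sketch: given (typename, enumsortorder, enumlabel)
// rows like those returned by the pg_enum query in this PR, collect
// each type's labels in sort order so two deployments can be
// compared member-by-member.
fn enum_members(rows: &[(&str, f32, &str)]) -> BTreeMap<String, Vec<String>> {
    let mut sorted: Vec<_> = rows.to_vec();
    // Sort by type name, then by the float sort order. partial_cmp
    // is fine here: enumsortorder values are never NaN.
    sorted.sort_by(|a, b| a.0.cmp(b.0).then(a.1.partial_cmp(&b.1).unwrap()));
    let mut members: BTreeMap<String, Vec<String>> = BTreeMap::new();
    for (ty, _order, label) in sorted {
        members.entry(ty.to_string()).or_default().push(label.to_string());
    }
    members
}

fn main() {
    // A fresh dbinit.sql yields whole-number orders; a migrated
    // deployment may hold a midpoint like 1.5, but the labels still
    // come out in the same relative order.
    let fresh = enum_members(&[
        ("service_kind", 1.0, "clickhouse"),
        ("service_kind", 2.0, "clickhouse_keeper"),
        ("service_kind", 3.0, "cockroach"),
    ]);
    let migrated = enum_members(&[
        ("service_kind", 1.0, "clickhouse"),
        ("service_kind", 1.5, "clickhouse_keeper"),
        ("service_kind", 2.0, "cockroach"),
    ]);
    assert_eq!(fresh, migrated);
}
```

Note this compares label order only; it deliberately ignores the raw sort-order values, which is what makes midpoint-inserted members compare equal to freshly initialized ones.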

Collaborator: All this is to say: we can make this change if you want, but I think it's still unsafe to rely on the order of our enums across existing deployments unless we add a "schema change" that validates the order.

Collaborator: (Also, to be explicit: the changes you've added to the integration tests are still extremely useful, in either situation.)

Collaborator: It's a pain in the ass, but we could:

This is a "full table scan" operation, but the data sizes are really small, so I don't think it's so bad. We then add that as a schema change while updating dbinit.sql at the same time:

- New deployments -> use dbinit.sql, new order.
- Old deployments -> use the schema change, land on the same order as dbinit.sql.

Contributor: Just a minor note... it doesn't have to be at the end. It'll still work inserted after 'clickhouse', as we originally intended clickhouse_keeper to be.

Contributor: I agree with this approach (even if it makes me slightly nervous, because we have to do a small amount of data migration). Two primary justifications lead me to believe this is a good idea:

1. Given that enums are ordered, having this integration test is valuable. It's caught this case, where migrations result in an effectively different enum shape than what we intended in dbinit.
2. We have an active case of database bifurcation. Installations that were or will be provisioned from a schema version greater than 4.0.0 will effectively have different values for service_kind and dataset_kind than those provisioned with an earlier version that had to run that migration. I don't have a full understanding of what challenges this could cause outside of ordering issues, but it seems safer to sand down this sharp edge before we cut ourselves on it.

Contributor Author (@david-crespo, Nov 10, 2023): I think I did it in 338828a. Rather than keep clickhouse_keeper2 around to confuse people in perpetuity, I decided to do the same thing again to change it back to clickhouse_keeper after fixing the order.

See the README I added for the migration summarizing what it does: https://github.com/oxidecomputer/omicron/blob/338828a24837790f7b96a30dde7aecf7963748d0/schema/crdb/10.0.0/README.md

    'mgd'
);

@@ -402,9 +402,9 @@ CREATE TYPE IF NOT EXISTS omicron.public.dataset_kind AS ENUM (
    'crucible',
    'cockroach',
    'clickhouse',
-    'clickhouse_keeper',
    'external_dns',
-    'internal_dns'
+    'internal_dns',
+    'clickhouse_keeper'
);

/*