
Want a way to test data migrations #4747

Closed · Fixed by #4783

david-crespo opened this issue Jan 2, 2024 · 1 comment

Labels: database (Related to database access) · development (Bugs, paper cuts, feature requests, or other thoughts on making omicron development better) · Testing & Analysis (Tests & Analyzers)

@david-crespo (Contributor):

While working on #4261, I found it hard to gain confidence in my data migrations using the existing tooling. One thing I found helpful was using DB Fiddle to set up some similar tables and run similar migrations on them to make sure I got the expected results.

https://www.db-fiddle.com/f/pzmM69My6mZWq1NEKm56PG/1

However, this is lacking in ways that formal tooling could improve on:

  1. It's Postgres 15, not CRDB, and especially not the version of CRDB we're running
  2. The tables are simplified because making them match exactly is annoying to do by hand, even though it would be easy to do automatically

So what I have in mind is essentially a way of (see the sketch after this list):

  1. Running migrations up to but not including the one under test
  2. Running some SQL to insert rows in the DB
  3. Running the migration under test
  4. Asserting stuff about the resulting DB state
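
A rough sketch of the kind of test I mean (the helper names, version strings, and SQL here are entirely made up; they stand in for whatever the real harness would expose):

```rust
// Sketch only: `setup_test_crdb`, `apply_migrations_through`, `apply_migration`,
// and `sql` are hypothetical placeholders, and `my_table` / its columns are an
// invented example schema.
#[tokio::test]
async fn migration_under_test_migrates_existing_rows() {
    let crdb = setup_test_crdb().await;

    // 1. Apply every migration up to, but not including, the one under test.
    apply_migrations_through(&crdb, "N-1").await;

    // 2. Seed rows shaped like real pre-migration data.
    sql(&crdb, "INSERT INTO my_table (id, old_column) VALUES (gen_random_uuid(), 'foo');")
        .await;

    // 3. Run the migration under test.
    apply_migration(&crdb, "N").await;

    // 4. Assert on the resulting state.
    let rows = sql(&crdb, "SELECT new_column FROM my_table;").await;
    assert_eq!(rows, vec!["foo".to_string()]);
}
```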
@david-crespo added the database, Testing & Analysis, and development labels on Jan 2, 2024.
@smklein (Collaborator) commented Jan 2, 2024:

```rust
// This test verifies that executing all schemas, in order, twice, still
// correctly performs an upgrade.
//
// This attempts to be a rough approximation for multiple Nexuses each
// simultaneously executing these operations.
#[tokio::test]
async fn versions_have_idempotent_up() {
    let config = load_test_config();
    let logctx =
        LogContext::new("versions_have_idempotent_up", &config.pkg.log);
    let log = &logctx.log;
    let populate = false;
    let mut crdb = test_setup_just_crdb(&logctx.log, populate).await;
    let all_versions = read_all_schema_versions().await;

    for version in &all_versions {
        apply_update(log, &crdb, &version.to_string(), 2).await;
        assert_eq!(version.to_string(), query_crdb_schema_version(&crdb).await);
    }
    assert_eq!(
        LATEST_SCHEMA_VERSION.to_string(),
        query_crdb_schema_version(&crdb).await
    );

    crdb.cleanup().await.unwrap();
    logctx.cleanup_successful();
}
```
and
```rust
// Confirms that the application of all "up.sql" files, in order, is equivalent
// to applying "dbinit.sql", which should represent the latest-known schema.
#[tokio::test]
async fn dbinit_equals_sum_of_all_up() {
    let config = load_test_config();
    let logctx =
        LogContext::new("dbinit_equals_sum_of_all_up", &config.pkg.log);
    let log = &logctx.log;
    let populate = false;
    let mut crdb = test_setup_just_crdb(&logctx.log, populate).await;
    let all_versions = read_all_schema_versions().await;

    // Go from the first version to the latest version.
    for version in &all_versions {
        apply_update(log, &crdb, &version.to_string(), 1).await;
        assert_eq!(version.to_string(), query_crdb_schema_version(&crdb).await);
    }
    assert_eq!(
        LATEST_SCHEMA_VERSION.to_string(),
        query_crdb_schema_version(&crdb).await
    );

    // Query the newly constructed DB for information about its schema
    let observed_schema = InformationSchema::new(&crdb).await;
    let observed_data = observed_schema.query_all_tables(log, &crdb).await;
    crdb.cleanup().await.unwrap();

    // Create a new DB with data populated from dbinit.sql for comparison
    let populate = true;
    let mut crdb = test_setup_just_crdb(&logctx.log, populate).await;
    let expected_schema = InformationSchema::new(&crdb).await;
    let expected_data = expected_schema.query_all_tables(log, &crdb).await;

    // Validate that the schema is identical
    observed_schema.pretty_assert_eq(&expected_schema);
    assert_eq!(observed_data, expected_data);

    crdb.cleanup().await.unwrap();
    logctx.cleanup_successful();
}
```
These two tests seem close to what you're describing for (1), but it's unfortunately not super cheap to perform "all schema upgrades in order from 1.0.0 to latest". I've been eyeing these tests for optimization purposes anyway, so I wonder if we could share that workload somehow.

(That said, the idea of "update to right before my schema change, populate some test data, do the change, then validate the state" seems super reasonable)

@smklein self-assigned this on Jan 8, 2024.

smklein added a commit that referenced this issue on Jan 9, 2024:
Adds a schema test with "before" / "after" hooks, and adds an example specifically for the "23.0.0" migration.

My intent is that this can be used for any other schema migrations that would like to execute arbitrary SQL checks against the new schema too.

Fixes #4747
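
As a rough illustration only (the real hook signatures and helpers live in #4783; the client type, table, and columns below are assumptions), the "before" / "after" shape might look something like:

```rust
use tokio_postgres::Client;

// Hypothetical sketch, not the API from #4783: assumes the test harness hands
// each hook a `tokio_postgres::Client` connected to the test database, calling
// the "before" hook at the prior schema version and the "after" hook once the
// migration under test has been applied. `my_table` and its columns are made up.
async fn before_migration(client: &Client) {
    // Seed data while the schema is still at the previous version.
    client
        .batch_execute(
            "INSERT INTO my_table (id, old_column) VALUES (gen_random_uuid(), 'foo');",
        )
        .await
        .expect("seeding pre-migration rows");
}

async fn after_migration(client: &Client) {
    // Check that the migration transformed the seeded rows as expected.
    let rows = client
        .query("SELECT new_column FROM my_table;", &[])
        .await
        .expect("querying migrated rows");
    assert_eq!(rows.len(), 1);
}
```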