-
Notifications
You must be signed in to change notification settings - Fork 40
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[#3886 2/4] Region replacement omdb commands #5820
Conversation
This commit adds some commands to omdb related to the new region replacement logic: $ ./target/debug/omdb db region-replacement Query for information about region replacements, optionally manually triggering one Usage: omdb db region-replacement [OPTIONS] <COMMAND> Commands: list List region replacement requests status Show current region replacements and their status info Show detailed information for a region replacement request Manually request a region replacement help Print this message or the help of the given subcommand(s) `list` will list all region replacement requests, along with their request time and state. `status` will show a summary of all non-Complete region replacements, along with their state and progress. `info` will show a detailed view of a region replacement, starting with the details that the `status` summary shows, then showing all related notifications and steps taken to drive the replacement forward. Finally, `request` will request that a region be replaced, and return the ID of the replacement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nits.. just nits.
dev-tools/omdb/src/bin/omdb/db.rs
Outdated
println!(); | ||
|
||
if let Some(new_region_id) = request.new_region_id { | ||
println!(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the if let Some(repair) ..
fails beloe, i.e. the maybe_repair does not find anything, will we
still print an extra blank line here?
Maybe we don't need two blank lines, and instead we print space after each request?
Right now it looks like this:
EVT22200005 # omdb db region-replacement status
note: database URL not specified. Will search DNS.
note: (override with --db-url or OMDB_DB_URL)
note: using DNS server for subnet fd00:1122:3344::/48
note: (if this is not right, use --dns-server to specify an alternate DNS server)
note: using database URL postgresql://root@[fd00:1122:3344:101::4]:32221,[fd00:1122:3344:101::6]:32221,[fd00:1122:3344:101::5]:32221,[fd00:1122:3344:101::7]:32221,[fd00:1122:3344:101::3]:32221/omicron?sslmode=disable
note: database schema version matches expected (56.0.0)
af905bec-acb8-4d91-a439-af310d3f6d63:
started: 2024-05-28 20:44:57.042456 UTC
state: Running
old region id: 367d0aab-a3d5-470f-9ffa-214de71b6a5b
new region id: Some(7ae0833c-7ab2-4fa1-adb1-e1d72e7a1415)
progress: ██████████████████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ [ 233/ 1024]
bb75e969-ecc6-40f9-bab0-9e5520ae4f2e:
started: 2024-05-28 20:44:01.013417 UTC
state: Running
old region id: ff9d23b4-9d39-4be2-86fd-d8806f53ebd2
new region id: Some(90d3fd2f-582e-4e75-af05-74f5a90eda46)
progress: ███████████████████████████████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ [ 310/ 800]
So, the grouping is a little off. If you had many replacements, it might not be clear which progress
bar is for what.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I moved this println to after the bar.abandon call in 9970286, I think that abandon doesn't print a newline?
time: notification.time, | ||
repair_id: notification.repair_id.to_string(), | ||
repair_type: format!("{:?}", notification.repair_type), | ||
upstairs_id: notification.upstairs_id.to_string(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is upstairs_id
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's from whatever Upstairs is sending the notification.
|
||
bar.abandon(); | ||
|
||
println!(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The progress bar does not get to the end?
EVT22200005 # omdb db region-replacement info 3a4b04be-4bf9-43d6-a890-e5946d241084
note: database URL not specified. Will search DNS.
note: (override with --db-url or OMDB_DB_URL)
note: using DNS server for subnet fd00:1122:3344::/48
note: (if this is not right, use --dns-server to specify an alternate DNS server)
note: using database URL postgresql://root@[fd00:1122:3344:101::4]:32221,[fd00:1122:3344:101::6]:32221,[fd00:1122:3344:101::5]:32221,[fd00:1122:3344:101::7]:32221,[fd00:1122:3344:101::3]:32221/omicron?sslmode=disable
note: database schema version matches expected (56.0.0)
started: 2024-05-28 21:02:19.949975 UTC
state: Complete
old region id: ff1e4d49-8959-42e1-9fcc-ca219a68ef8e
new region id: Some(897b3307-36a8-4d0f-8ea2-9ba1562e0898)
Repair notifications
TIME REPAIR_ID REPAIR_TYPE UPSTAIRS_ID SESSION_ID NOTIFICATION_TYPE
2024-05-28 21:02:57.786939 UTC a661e5b9-0735-4ff6-a318-693349460b4b Live 23776670-5d21-449f-9552-efad0fe32b2a dd90808a-ff3b-41d0-a273-6a64e820458c Started
2024-05-28 21:08:21.759356 UTC a661e5b9-0735-4ff6-a318-693349460b4b Live 23776670-5d21-449f-9552-efad0fe32b2a dd90808a-ff3b-41d0-a273-6a64e820458c Succeeded
progress: █████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████░ [ 799/ 800]
Repair steps
TIME STEP_TYPE DETAILS
2024-05-28 21:02:47.663609 UTC Propolis instance Some(94122867-721b-40a2-aa85-518139367113) vmm Some(735975a0-26dd-4fe0-bb87-80e5770b16b4)
Is there a way to short-circuit it if it is done? Instead of printing the incomplete bar?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this is a bug in Crucible's notification logic, in that it doesn't send the progress update notification when it's done? I'd rather fix that bug than have omdb print the complete bar. Opened oxidecomputer/crucible#1319
dev-tools/omdb/src/bin/omdb/db.rs
Outdated
let requests: Vec<RegionReplacement> = datastore | ||
.pool_connection_for_tests() | ||
.await? | ||
.transaction_async(|conn| async move { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
out of interest, why is this a transaction? IIUC only a single query is performed...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have no answer, removed those transaction calls in a155af9
/// Show detailed information for a region replacement | ||
Info(RegionReplacementInfoArgs), | ||
/// Manually request a region replacement | ||
Request(RegionReplacementRequestArgs), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd probably make this more explicit: Perhaps name it Start
instead of Request
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the Request
command name as the first state things end up in is Requested
.
omdb
can't actually start it, it can just put the request into the database that
someone else has to then go and make happen.
dev-tools/omdb/src/bin/omdb/db.rs
Outdated
let limit = fetch_opts.fetch_limit; | ||
|
||
let requests: Vec<RegionReplacement> = datastore | ||
.pool_connection_for_tests() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should almost certainly use the opctx
argument.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pool_connection_authorized
is a private method and can't be accessed from omdb. Short of moving these into DataStore
I'm not sure what to do here except that, so that's done in a155af9, except when the fetch opts limit needs to be honoured.
dev-tools/omdb/src/bin/omdb/db.rs
Outdated
let limit = fetch_opts.fetch_limit; | ||
|
||
let requests: Vec<RegionReplacement> = datastore | ||
.pool_connection_for_tests() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above wrt opctx
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
addressed in a155af9
dev-tools/omdb/src/bin/omdb/db.rs
Outdated
|
||
/// Show details for a single region replacement | ||
async fn cmd_db_region_replacement_info( | ||
_opctx: &OpContext, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same as above
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
addressed in a155af9
dev-tools/omdb/src/bin/omdb/db.rs
Outdated
async fn cmd_db_region_replacement_info( | ||
_opctx: &OpContext, | ||
datastore: &DataStore, | ||
_fetch_opts: &DbFetchOptions, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we don't need these, we should probably get rid of them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
addressed in a155af9
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! My only required fix is to use the real opctx. Take or leave other suggestions.
remove unused arguments don't use a transaction when only one query is performed
This commit adds some commands to omdb related to the new region replacement logic:
list
will list all region replacement requests, along with their request time and state.status
will show a summary of all non-Complete region replacements, along with their state and progress.info
will show a detailed view of a region replacement, starting with the details that thestatus
summary shows, then showing all related notifications and steps taken to drive the replacement forward.Finally,
request
will request that a region be replaced, and return the ID of the replacement.