[#5333 3/6] Region snapshot replacement start #6294
Conversation
This commit adds a "region snapshot replacement start" background task that will:

1) check for snapshots that are stored on disks that were expunged, and insert region snapshot replacement requests for them.

2) check if there are any region snapshot replacement requests in the `Requested` state and run the new "region snapshot replacement start" saga for them. This background task will also pick up manually requested region snapshot replacements.

Also in this commit is the "region snapshot replacement start" saga, which will transition a region snapshot replacement request from `Requested` to `Allocating` and:

- allocate a new Region (with source set to Some so that a clone occurs)
- create a blank Volume that will be used to stash the snapshot to delete later
- swap the snapshot being replaced with the new region that was cloned from the others in the read-only region set
- update the region snapshot replacement request's state to "ReplacementDone" and clear the operating saga id

This represents the first step to be taken after a snapshot goes away: allocate the replacement, and swap it into the affected snapshot volume. Once this is done, anything that uses the snapshot volume as a read-only parent will no longer reference the expunged snapshot, and will work without any issues.

Existing constructed Volumes running in a propolis or pantry context will remain unmodified: a future commit will contain the saga that takes care of performing the necessary read-only target replacements.
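As a rough, self-contained sketch of the request lifecycle described above (the type and method names here are illustrative stand-ins, not the actual Nexus code):

```rust
// Illustrative sketch only: models the state transitions the start saga
// drives, not the real Nexus types or datastore calls.

#[derive(Debug, Clone, Copy, PartialEq)]
enum ReplacementState {
    Requested,       // inserted by the background task (or manually)
    Allocating,      // the start saga has claimed the request
    ReplacementDone, // new region allocated and swapped into the volume
}

struct ReplacementRequest {
    state: ReplacementState,
    operating_saga_id: Option<u64>,
}

impl ReplacementRequest {
    // The start saga transitions Requested -> Allocating and records which
    // saga is operating on the request.
    fn start(&mut self, saga_id: u64) {
        assert_eq!(self.state, ReplacementState::Requested);
        self.state = ReplacementState::Allocating;
        self.operating_saga_id = Some(saga_id);
    }

    // After allocating the cloned region, creating the blank stash volume,
    // and swapping the new region into the snapshot volume, the saga marks
    // the request ReplacementDone and clears the operating saga id.
    fn finish(&mut self) {
        assert_eq!(self.state, ReplacementState::Allocating);
        self.state = ReplacementState::ReplacementDone;
        self.operating_saga_id = None;
    }
}

fn main() {
    let mut request = ReplacementRequest {
        state: ReplacementState::Requested,
        operating_saga_id: None,
    };
    request.start(1);
    request.finish();
    assert_eq!(request.state, ReplacementState::ReplacementDone);
}
```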
This looks great @jmpesp. Just minor comments from me.
Err(e) => {
    let s =
        format!("error creating replacement request: {e}");
Worth logging info about which `region_id` and `snapshot_id` the error occurred on, similar to the `Ok` case above?
definitely, that's an oversight on my part. added in fea9ad9
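For illustration only (this is not the literal fea9ad9 diff), an error string carrying the identifiers might be shaped like this, assuming the region and snapshot ids are in scope at the match site:

```rust
// Hypothetical sketch: name the region and snapshot the failure was for,
// so the log line is actionable on its own.
fn replacement_request_error(
    region_id: impl std::fmt::Display,
    snapshot_id: impl std::fmt::Display,
    e: impl std::fmt::Display,
) -> String {
    format!(
        "error creating replacement request for region {region_id} \
         snapshot {snapshot_id}: {e}"
    )
}
```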
Err(e) => {
    let s =
        format!("error looking up replacement request: {e}");
Worth logging information about which snapshot was problematic, similar to below?
also added in fea9ad9
Err(e) => {
    let s = format!(
        "sending region snapshot replacement request failed: \
Log info about the request that failed?
yep! done in 4a0ae26
assert_eq!(result, RegionSnapshotReplacementStartStatus::default());
assert_eq!(starter.count_reset(), 0);

// Add a region snapshot replacement request for a fake region snapshot
Probably a big pain in the ass, but this test could be a touch more complete if it also exercised `create_requests_for_region_snapshots_on_expunged_disks`. I believe you'd have to create a fake disk in the datastore, and a region, and then expunge the fake disk.
👍 ended up being not too bad, see 7132858
Oh, nice!
// License, v. 2.0. If a copy of the MPL was not distributed with this
// file, You can obtain one at https://mozilla.org/MPL/2.0/.

//! Background task for detecting region snapshots that need replacing and
I love how easy to read this whole RPW is.
thanks! :)
let alloc_region_params =
    sagactx.lookup::<AllocRegionParams>("alloc_region_params")?;

let (.., db_snapshot) = LookupPath::new(&opctx, &osagactx.datastore())
You already look up the snapshot in `rsrss_get_alloc_region_params`. Why do you look it up again here? Can you instead save the `volume_id` and `snapshot_id` in `AllocRegionParams`?
I like it - done in f2b9fc7 (soon on the list will be to use more typed uuids...)
// downstairs port. Once this goes away, Nexus will require a way to query
// for the repair port!

let mut repair_addr: SocketAddrV6 =
I think it would be clearer to explicitly say `source_repair_addr` here. This makes the call to `ensure_region_in_dataset` below more explicit in what it does.
Agreed, added in a2bda34 along with expanding the comment for `ensure_region_in_dataset`.
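A hedged sketch of the shape of that rename (not the actual a2bda34 diff; the helper and its parameters here are made up for illustration):

```rust
use std::net::SocketAddrV6;

// Hypothetical sketch: parse the snapshot's address and bind it under a
// name that states its role (the clone source for the newly allocated
// region), adjusting the port to the repair service before returning.
fn source_repair_addr(
    snapshot_addr: &str,
    repair_port: u16,
) -> Result<SocketAddrV6, String> {
    let mut source_repair_addr: SocketAddrV6 = snapshot_addr
        .parse()
        .map_err(|e| format!("error parsing {snapshot_addr}: {e}"))?;

    // The repair service listens on a different port than the downstairs
    // itself; swap it in here.
    source_repair_addr.set_port(repair_port);

    Ok(source_repair_addr)
}
```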
osagactx
    .datastore()
    .volume_replace_snapshot(
        VolumeWithTarget(old_volume_id),
AFAICT, the only change between this undo and the prior method (as stated in the comment) is this swap in parameters and the log message above. It may be worthwhile to create a helper method to do all the lookups and then only perform this last call in the forward/undo functions. That would make it clearer what is different IMO.
agreed, done in 94bc3cb
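A self-contained sketch of the shape the reviewer is describing (hypothetical names, not the 94bc3cb diff): one helper gathers the lookups, and the forward and undo actions differ only in the direction of the final swap.

```rust
// Hypothetical stand-ins for the real volume/target id types.
#[derive(Clone, Copy)]
struct Target(u64);

struct SwapParams {
    volume_id: Target,
    existing: Target,
    replacement: Target,
}

// Shared by the forward action and its undo: perform the lookups once,
// then decide which target is "existing" and which is "replacement".
fn lookup_swap_params(undo: bool) -> SwapParams {
    // In the saga this would read the saga context and the datastore; the
    // values here are placeholders.
    let (volume_id, old_target, new_target) = (Target(0), Target(1), Target(2));

    if undo {
        // The undo swaps the same pair back the other way.
        SwapParams { volume_id, existing: new_target, replacement: old_target }
    } else {
        SwapParams { volume_id, existing: old_target, replacement: new_target }
    }
}

// Stand-in for the single datastore call that performs the swap.
fn perform_swap(_params: SwapParams) {}

fn replace_snapshot_in_volume() {
    perform_swap(lookup_swap_params(false));
}

fn undo_replace_snapshot_in_volume() {
    perform_swap(lookup_swap_params(true));
}

fn main() {
    replace_snapshot_in_volume();
    undo_replace_snapshot_in_volume();
}
```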
Thanks for getting this out so quickly. No comments of any substance.
A question, this will replace the downstairs for a snapshot.
However, that snapshot could be a read_only_parent for a bunch of other disks.
Is that handled in a future PR?
//! beginning that process
//!
//! This task's responsibility is to create region snapshot replacement requests
//! when physical disks are expunged, and trigger the region snapshot
Do we have to wait for expungement, or can other things trigger a snapshot replacement (not counting omdb)? I don't think at the moment there exists anything that would want to. I suspect at some point we may want the ability to "move things around", and that means moving snapshots (or regions) off a disk that is not expunged.
Right, we were talking in the storage huddle and in chat about getting snapshot scrubbing for free once this is in - that would look like inserting region snapshot replacement requests directly. That'll come later after R10 :)
//! 1. Allocate a new region
//!
//! 2. Create a blank volume that can be later deleted to stash the snapshot
//!    being replaced. This is filled in the `volume_replace_snapshot`
Nit, change "This is filled in the .." to "This is populated in the .."
done in f89a395
//! request, and change the state to "Allocating". Then, it performs the
//! following steps:
//!
//! 1. Allocate a new region
This step includes the "clone" of an existing region, correct?
It does yeah - later on `ensure_region_in_dataset` is called with a source repair address.
GET_OLD_SNAPSHOT_VOLUME_ID -> "old_snapshot_volume_id" {
    + rsrss_get_old_snapshot_volume_id
}
CREATE_FAKE_VOLUME -> "unused_3" {
Why `unused_3` before `unused_2`?
because I can't count! 4319c0a
    .datastore()
    .regions_hard_delete(log, vec![region.id()])
    .await?;
}
If we don't find a new region and we expect to, should we log a message here?
added in 80b23e3
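For illustration (hedged; this is not the literal 80b23e3 change), the log line for the unexpected-empty case might look roughly like this, assuming a slog `Logger` is in hand:

```rust
use slog::{info, Logger};

// Hypothetical sketch: if the undo step finds no newly allocated region to
// clean up, say so rather than silently skipping the hard delete.
fn log_missing_new_region(log: &Logger, request_id: impl std::fmt::Display) {
    info!(
        log,
        "request {}: no new region was allocated, nothing to hard delete",
        request_id
    );
}
```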
// With a list of datasets and regions to ensure, other sagas need to have a
// separate no-op forward step for the undo action to ensure that the undo
// step occurs in the case that the ensure partially fails. Here this not
Nit: final sentence: "Here this is not required,.."
nice, 73c95c6
let Some(region_snapshot) = region_snapshot else {
    return Err(ActionError::action_failed(format!(
        "region snapshot {} {} {} deleted!",
Do we know this is deleted? Is that what a `None` means?
We do yeah - if the corresponding region_snapshot is missing then that means the volume references went to zero, meaning the crucible resources would have been cleaned up and the record hard deleted.
info!(
    log,
    "undo: replacing {} with {} in volume {}",
At this point is "old" from the broken region we were fixing? Or have we swapped old/new here? I'm not sure if this log is printing what I think it should be, which is opposite from the `srss_replace_snapshot_in_volume()`.

Or, wait, by saying `undo` at the top, you mean we are undo-ing what the previous step did, which is putting back things like they were. I guess I find this a little confusing, but if all the `undo` messages are like this, I'll just live with it.
The old and new are not swapped in this log message, but they are in the `volume_replace_snapshot` arguments. You're right, adding `undo:` is meant to signify that it's undoing the forward action. The intention is that someone can search for the text "replacing UUID with UUID in volume UUID" and get two results.

I put a small comment in 1b97525 explaining this intention.
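For illustration (hedged; this isn't the 1b97525 comment itself), the paired log lines would share the same searchable shape, with only the prefix differing:

```rust
use slog::{info, Logger};
use std::fmt::Display;

// Hypothetical sketch: the forward action and its undo log the same
// "replacing X with Y in volume Z" message so that a search for it finds
// both, and the undo line is prefixed with "undo:". Note that "old" and
// "new" keep their forward-direction meaning; only the arguments to the
// volume swap itself are reversed in the undo.
fn log_forward_swap(log: &Logger, old: impl Display, new: impl Display, volume: impl Display) {
    info!(log, "replacing {} with {} in volume {}", old, new, volume);
}

fn log_undo_swap(log: &Logger, old: impl Display, new: impl Display, volume: impl Display) {
    info!(log, "undo: replacing {} with {} in volume {}", old, new, volume);
}
```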