
Dataset ledger does not reflect ensured datasets #7095

Closed
andrewjstone opened this issue Nov 18, 2024 · 2 comments · Fixed by #7096
Labels
bug Something that isn't working.

Comments

@andrewjstone
Contributor

Background

I've been working on #6999, trying to find all the dependencies between reconfigurator execution steps and remove them so that all steps can run independently. Ideally, all steps are idempotent, and the failure of a single step gets reported as a warning without stopping other steps from running. We don't really want to run the steps in parallel or as separate background tasks, but we could if we wanted to. Eventually the structure of each step will change to be a `BoxedFuture`, so that execution is decoupled from the steps themselves and we can mostly prevent dependencies through a constrained API. The first step, however, is to find all the dependencies and remove them.

One of the dependency chains is currently deploy disks -> deploy datasets -> deploy zones -> record datasets, with plumb-firewall-rules mixed in but unrelated AFAICT. I was wondering what is forcing these to be dependencies so that I could go ahead and remove them. While reading that code I found a bug, which is the topic of this issue.

Bug

As part of ensuring datasets we persist them to a ledger. However, if there are errors when ensuring the datasets, we persist the ledger unconditionally. This is because the call to `datasets_ensure_internal` doesn't return a `Result`, but a `DatasetsManagementResult` that contains possible internal errors. These errors are not checked before the ledger is committed.
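To make the shape of the bug concrete, here is a minimal sketch of the flow described above. The types and function names mirror the real ones (`datasets_ensure_internal`, `DatasetsManagementResult`) but the bodies are illustrative, not the actual sled-agent code:

```rust
// Illustrative sketch, not sled-agent's real implementation.
#[derive(Debug, Default)]
struct DatasetsManagementResult {
    // Per-dataset errors ride inside the result rather than being
    // surfaced as an Err, so a caller can silently ignore them.
    errors: Vec<String>,
}

impl DatasetsManagementResult {
    fn has_error(&self) -> bool {
        !self.errors.is_empty()
    }
}

fn datasets_ensure_internal(fail: bool) -> DatasetsManagementResult {
    let mut result = DatasetsManagementResult::default();
    if fail {
        result.errors.push("zfs create failed".to_string());
    }
    result
}

fn ensure_and_commit_ledger(fail: bool, ledger_committed: &mut bool) -> DatasetsManagementResult {
    let result = datasets_ensure_internal(fail);
    // Bug: the ledger is committed unconditionally; result.errors is
    // never inspected first.
    *ledger_committed = true;
    result
}

fn main() {
    let mut committed = false;
    let result = ensure_and_commit_ledger(true, &mut committed);
    // The ledger now claims datasets that were never successfully ensured.
    assert!(result.has_error());
    assert!(committed);
}
```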

Now for the interesting part. I actually think this may be the right thing to do™. We should be writing down the desired state so that on cold boot the sled-agent can recreate these datasets if they don't already exist. I believe this was the only usage before #7006, so this wasn't actually problematic. If this is the primary use case, an argument can be made for persisting the desired state to the ledger before trying to ensure the datasets, in typical write-ahead-log fashion.

The bug arises because we later rely on the ledger to make decisions at runtime rather than looking at which datasets actually exist. In `check_requested_zone_datasets_exist` we call `self.inner.storage.datasets_config_list()`, which ends up reading this ledger of datasets that may not actually be running. This gives a false sense of safety: the zones may still fail to launch in the same manner as without this check. So the check doesn't make anything worse; it's just not correct.

If we really want to check that the datasets exist before launching zones, rather than bail early based on the ledger, we should check the actual datasets via the filesystem rather than reading the ledger.
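A hypothetical sketch of what consulting the filesystem ("reality") instead of the ledgered configuration ("intent") could look like. Shelling out to `zfs list` is one obvious way to ask ZFS whether a dataset exists right now; this is not the sled-agent's actual code, and the function name is made up:

```rust
use std::process::Command;

// Hypothetical helper: returns whether a ZFS dataset actually exists on
// this sled right now, by asking the zfs CLI rather than the ledger.
fn dataset_exists_on_disk(name: &str) -> bool {
    Command::new("zfs")
        .args(["list", "-H", "-o", "name", name])
        .output()
        // If `zfs` is missing or the dataset doesn't exist, report false.
        .map(|o| o.status.success())
        .unwrap_or(false)
}
```

Even this is only a point-in-time answer: the dataset can still vanish between the check and the zone launch.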

In essence, my suggestion is:

  1. Always write the ledger before ensuring datasets, and use it on cold boot.
  2. Check that the actual datasets exist before attempting to launch the corresponding zones, if we indeed don't want to attempt launching them otherwise. I'd probably do this one zone at a time though, because presumably we still want to launch the zones whose datasets do exist. On the next execution iteration, ensure-datasets and ensure-zones will presumably be called again and the system will fix itself.

Alternatively, we could avoid writing the desired state of the datasets to the ledger until they are created, but this feels wrong to me: the system may crash after the datasets are created but before the ledger is written, leaving the ledger out of date on cold boot.
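The write-ahead ordering from suggestion 1 can be sketched as follows. The function shape is illustrative, not the sled-agent API: persist the desired state first, then ensure, and return per-dataset failures as warnings instead of silently ledgering them:

```rust
use std::cell::RefCell;

// Illustrative write-ahead-log ordering: commit intent before acting on it.
fn ensure_datasets_wal_style<L, E>(mut commit_ledger: L, mut ensure: E) -> Vec<String>
where
    L: FnMut(),
    E: FnMut() -> Vec<String>, // returns per-dataset errors
{
    // 1. Write-ahead: the desired state is durable before we act, so a
    //    crash mid-ensure still leaves a correct ledger for cold boot.
    commit_ledger();
    // 2. Ensure datasets; failures are surfaced as warnings rather than
    //    deciding whether the ledger gets written.
    ensure()
}

fn main() {
    let order = RefCell::new(Vec::new());
    let warnings = ensure_datasets_wal_style(
        || order.borrow_mut().push("ledger"),
        || {
            order.borrow_mut().push("ensure");
            vec!["dataset X failed".to_string()]
        },
    );
    // The ledger write happens before the ensure attempt, and the
    // per-dataset error is reported rather than swallowed.
    assert_eq!(*order.borrow(), ["ledger", "ensure"]);
    assert_eq!(warnings, ["dataset X failed"]);
}
```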

@andrewjstone andrewjstone added the bug Something that isn't working. label Nov 18, 2024
@smklein
Collaborator

smklein commented Nov 18, 2024

The comments around `check_requested_zone_datasets_exist` try to clarify this somewhat:

    // Compares the incoming set of zones with the datasets this sled is
    // configured to use.
    //
    // If the requested zones contain any datasets not configured on this sled,
    // an error is returned.

This check is explicitly for configuration, rather than inventory. I figured I'd check configuration instead of inventory, as that specifies "intent from Nexus" rather than "what datasets actually exist".

As discussed in chat - we could try to observe the union of "all datasets reported by system inventory" x "all datasets in the latest configuration", but my fear is that any local check reliant on inventory would be inherently subject to a TOCTTOU. Basically: If we try to give an affirmative answer that "a dataset (or disk) exists", it might immediately be yanked from the system, powered down, etc.

In that case, we'd fall back to the behavior we have today: We'd try to boot the zone, and if the dataset is gone, booting the zone should fail.

@andrewjstone
Contributor Author

andrewjstone commented Nov 18, 2024

> The comments around `check_requested_zone_datasets_exist` try to clarify this somewhat:
>
>     // Compares the incoming set of zones with the datasets this sled is
>     // configured to use.
>     //
>     // If the requested zones contain any datasets not configured on this sled,
>     // an error is returned.
>
> This check is explicitly for configuration, rather than inventory. I figured I'd check configuration instead of inventory, as that specifies "intent from Nexus" rather than "what datasets actually exist".
>
> As discussed in chat - we could try to observe the union of "all datasets reported by system inventory" x "all datasets in the latest configuration", but my fear is that any local check reliant on inventory would be inherently subject to a TOCTTOU. Basically: If we try to give an affirmative answer that "a dataset (or disk) exists", it might immediately be yanked from the system, powered down, etc.
>
> In that case, we'd fall back to the behavior we have today: We'd try to boot the zone, and if the dataset is gone, booting the zone should fail.

Yeah, this was a misunderstanding on my part. Thanks for the clarification around the expunge use case: we want to error out if the request contains a dataset that should be expunged based on configuration.

As we discussed in chat, renaming the function to `check_requested_zone_datasets_configured` would suffice and resolve this issue. I'll go ahead and do that.
