Simulated Crucible agent should be stricter #4049

Merged: 14 commits merged into oxidecomputer:main from stricter_sled_agent on Sep 9, 2024

Conversation

Contributor

@jmpesp jmpesp commented Sep 7, 2023

There are a few more info messages here that will aid debugging, plus a few ways the simulated Crucible agent is now stricter:

  • it should be impossible to create a snapshot of a region in a bad state.

  • it should be an error to try to delete a snapshot for a non-existent region (the real sled agent would return a 404).

  • it should be an error to attempt to create a running snapshot for a snapshot that does not exist.

The actual Crucible agent reuses freed ports, so the simulated one should too; a rough sketch of these checks follows below.

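As an illustration only (the type and method names below are hypothetical, not the actual simulated-agent code), the checks and port reuse described above amount to something like this:

```rust
use std::collections::{BTreeSet, HashMap};

use anyhow::{bail, Result};

#[derive(Debug)]
enum State {
    Requested,
    Created,
    Tombstoned,
    Destroyed,
    Failed,
}

struct SimulatedAgent {
    regions: HashMap<String, State>,
    used_ports: BTreeSet<u16>,
}

impl SimulatedAgent {
    /// Refuse to snapshot a region that is missing or in a bad state.
    fn create_snapshot(&self, region_id: &str) -> Result<()> {
        match self.regions.get(region_id) {
            None => bail!("region {region_id} does not exist"),
            Some(State::Failed | State::Destroyed | State::Tombstoned) => {
                bail!("cannot create snapshot of region {region_id} in a bad state")
            }
            Some(_) => Ok(()),
        }
    }

    /// Hand out the lowest port not currently in use, so ports freed by
    /// deleted daemons get reused (mirroring the real agent's behaviour).
    fn reserve_port(&mut self) -> u16 {
        let mut port = 19000;
        while self.used_ports.contains(&port) {
            port += 1;
        }
        self.used_ports.insert(port);
        port
    }

    /// Free a port when its daemon is deleted so it can be handed out again.
    fn release_port(&mut self, port: u16) {
        self.used_ports.remove(&port);
    }
}
```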
jmpesp added a commit to jmpesp/omicron that referenced this pull request Sep 15, 2023
`decrease_crucible_resource_count_and_soft_delete_volume` does not
disambiguate cases where the snapshot_addr of a region_snapshot is
duplicated with another one, which can occur due to the Crucible Agent
reclaiming ports from destroyed daemons (see also oxidecomputer#4049, which makes the
simulated Crucible agent do this).

Several invocations of the snapshot create and snapshot delete sagas
could race in such a way that one of these ports would be reclaimed, and
then be used in a different snapshot, and the lifetime of both of these
would overlap! This would confuse our reference counting, which was
written with a naive assumption that this port reuse **wouldn't** occur
with these overlapping lifetimes. Spoiler alert, it can:

    root@[fd00:1122:3344:101::3]:32221/omicron> select * from region_snapshot where snapshot_addr = '[fd00:1122:3344:102::7]:19016';
                   dataset_id              |              region_id               |             snapshot_id              |         snapshot_addr         | volume_references
    ---------------------------------------+--------------------------------------+--------------------------------------+-------------------------------+--------------------
      80790bfd-4b81-4381-9262-20912e3826cc | 0387bbb7-1d54-4683-943c-6c17d6804de9 | 1a800928-8f93-4cd3-9df1-4129582ffc20 | [fd00:1122:3344:102::7]:19016 |                 0
      80790bfd-4b81-4381-9262-20912e3826cc | ff20e066-8815-4eb6-ac84-fab9b9103462 | bdd9614e-f089-4a94-ae46-e10b96b79ba3 | [fd00:1122:3344:102::7]:19016 |                 0
    (2 rows)

One way to solve this would be to create a UNIQUE INDEX on
`snapshot_addr` here, but then in these cases the snapshot creation
would return a 500 error to the user.

This commit adds a sixth column: `deleting`, a boolean that is true when
the region snapshot is part of a volume's `resources_to_clean_up`, and
false otherwise. This is used to select (as part of the transaction
for `decrease_crucible_resource_count_and_soft_delete_volume`) only the
region_snapshot records that were decremented as part of that
transaction, and skip re-deleting them otherwise.

This works because the overlapping lifetime of the records in the DB is
**not** the overlapping lifetime of the actual read-only downstairs
daemon: for the port to be reclaimed, the original daemon has to be
DELETEd, which happens after the decrement transaction has already
computed which resources to clean up:

1) a snapshot record is created:

                snapshot_id              |         snapshot_addr         | volume_references | deleted |
    -------------------------------------+-------------------------------+-------------------+----------
    1a800928-8f93-4cd3-9df1-4129582ffc20 | [fd00:1122:3344:102::7]:19016 |                 0 |  false  |

2) it is incremented as part of `volume_create`:

                snapshot_id              |         snapshot_addr         | volume_references | deleted |
    -------------------------------------+-------------------------------+-------------------+----------
    1a800928-8f93-4cd3-9df1-4129582ffc20 | [fd00:1122:3344:102::7]:19016 |                 1 |  false  |

3) when the volume is deleted, then the decrement transaction will:

  a) decrease `volume_references` by 1

                snapshot_id              |         snapshot_addr         | volume_references | deleted |
    -------------------------------------+-------------------------------+-------------------+----------
    1a800928-8f93-4cd3-9df1-4129582ffc20 | [fd00:1122:3344:102::7]:19016 |                 0 |  false  |

  b) note any `region_snapshot` records whose `volume_references` went to
     0 and have `deleted` = false, and return those in the list of
     resources to clean up:

     [ 1a800928-8f93-4cd3-9df1-4129582ffc20 ]

  c) set deleted = true for any region_snapshot records whose
     `volume_references` went to 0 and have deleted = false

                snapshot_id              |         snapshot_addr         | volume_references | deleted |
    -------------------------------------+-------------------------------+-------------------+----------
    1a800928-8f93-4cd3-9df1-4129582ffc20 | [fd00:1122:3344:102::7]:19016 |                 0 |   true  |

4) That read-only snapshot daemon is DELETEd, freeing up the port.
   Another snapshot creation occurs, using that reclaimed port:

                snapshot_id              |         snapshot_addr         | volume_references | deleted |
    -------------------------------------+-------------------------------+-------------------+----------
    1a800928-8f93-4cd3-9df1-4129582ffc20 | [fd00:1122:3344:102::7]:19016 |                 0 |   true  |
    bdd9614e-f089-4a94-ae46-e10b96b79ba3 | [fd00:1122:3344:102::7]:19016 |                 0 |  false  |

5) That new snapshot is incremented as part of `volume_create`:

                snapshot_id              |         snapshot_addr         | volume_references | deleted |
    -------------------------------------+-------------------------------+-------------------+----------
    1a800928-8f93-4cd3-9df1-4129582ffc20 | [fd00:1122:3344:102::7]:19016 |                 0 |   true  |
    bdd9614e-f089-4a94-ae46-e10b96b79ba3 | [fd00:1122:3344:102::7]:19016 |                 1 |  false  |

6) It is later deleted, and the decrement transaction will:

  a) decrease `volume_references` by 1:

                snapshot_id              |         snapshot_addr         | volume_references | deleted |
    -------------------------------------+-------------------------------+-------------------+----------
    1a800928-8f93-4cd3-9df1-4129582ffc20 | [fd00:1122:3344:102::7]:19016 |                 0 |   true  |
    bdd9614e-f089-4a94-ae46-e10b96b79ba3 | [fd00:1122:3344:102::7]:19016 |                 0 |  false  |

  b) note any `region_snapshot` records whose `volume_references` went to
     0 and have `deleted` = false, and return those in the list of
     resources to clean up:

     [ bdd9614e-f089-4a94-ae46-e10b96b79ba3 ]

  c) set deleted = true for any region_snapshot records whose
     `volume_references` went to 0 and have deleted = false

                snapshot_id              |         snapshot_addr         | volume_references | deleted |
    -------------------------------------+-------------------------------+-------------------+----------
    1a800928-8f93-4cd3-9df1-4129582ffc20 | [fd00:1122:3344:102::7]:19016 |                 0 |   true  |
    bdd9614e-f089-4a94-ae46-e10b96b79ba3 | [fd00:1122:3344:102::7]:19016 |                 0 |   true  |
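In other words, the decrement is conditional on `deleting` being false, and the transaction that drives a record to zero is the only one that claims it for cleanup. A simplified in-memory sketch of that logic (illustrative names only; the real version is a CockroachDB transaction driven from Nexus):

```rust
struct RegionSnapshot {
    snapshot_id: String,
    snapshot_addr: String,
    volume_references: u64,
    deleting: bool,
}

/// Decrement the reference counts for the snapshots a deleted volume used,
/// and return only the records that *this* call drove to zero. Records
/// already marked `deleting` are skipped, so a newer snapshot that reuses
/// the same snapshot_addr is never swept up by an older volume's cleanup.
fn soft_delete_volume_snapshots(
    table: &mut [RegionSnapshot],
    addrs_in_volume: &[&str],
) -> Vec<String> {
    let mut resources_to_clean_up = Vec::new();
    for record in table.iter_mut() {
        if record.deleting
            || !addrs_in_volume.contains(&record.snapshot_addr.as_str())
        {
            continue;
        }
        // a) decrease volume_references by 1
        record.volume_references = record.volume_references.saturating_sub(1);
        // b) + c) note which records went to zero in this call and mark them
        //    `deleting` so no later transaction re-deletes them
        if record.volume_references == 0 {
            record.deleting = true;
            resources_to_clean_up.push(record.snapshot_id.clone());
        }
    }
    resources_to_clean_up
}
```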
jmpesp added a commit that referenced this pull request Oct 10, 2023
@jmpesp jmpesp requested review from leftwo and removed request for jordanhendricks September 6, 2024 21:02
```rust
match region.state {
    State::Failed | State::Destroyed | State::Tombstoned => {
        bail!(
            "cannot create snapshot of region in state {:?}",
```
Contributor

Nit: Do you care to include the region ID in the bail/error message here? Or will it be easy enough to determine which region failed if you have many running at the same time?

Contributor Author

Good idea, done in 2a3cbd5
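For reference, the stricter check with the region ID included looks roughly like this (the field and type names are assumed for illustration; the actual wording landed in 2a3cbd5):

```rust
use anyhow::{bail, Result};

#[derive(Debug)]
enum State {
    Requested,
    Created,
    Tombstoned,
    Destroyed,
    Failed,
}

// Hypothetical region shape, just enough to show the error message.
struct Region {
    id: String,
    state: State,
}

fn check_snapshot_allowed(region: &Region) -> Result<()> {
    match region.state {
        State::Failed | State::Destroyed | State::Tombstoned => bail!(
            "cannot create snapshot of region {} in state {:?}",
            region.id,
            region.state,
        ),
        _ => Ok(()),
    }
}
```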


```rust
let map =
    self.running_snapshots.entry(id).or_insert_with(|| HashMap::new());

// If a running snapshot exists already, return it - this endpoint must
// be idempotent.
if let Some(running_snapshot) = map.get(&name.to_string()) {
    self.used_ports.remove(&port_number);
```
Contributor

Nit: Could you put the `let port_number = ...` below this `if`? Then you don't have to remove it here.

Contributor Author

Good catch! Done in 4b705af
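The suggested reordering ends up looking roughly like the following (placeholder types; the running-snapshot value is reduced to a bare port number for brevity, and the real change is in 4b705af):

```rust
use std::collections::{BTreeSet, HashMap};

struct Agent {
    // region id -> (snapshot name -> port of its running snapshot)
    running_snapshots: HashMap<String, HashMap<String, u16>>,
    used_ports: BTreeSet<u16>,
}

impl Agent {
    fn create_running_snapshot(&mut self, id: &str, name: &str) -> u16 {
        // If a running snapshot exists already, return it - this endpoint
        // must be idempotent - and since no port has been reserved yet,
        // there is nothing to undo.
        if let Some(port) =
            self.running_snapshots.get(id).and_then(|map| map.get(name))
        {
            return *port;
        }

        // Only now reserve a port for the new running snapshot.
        let port_number = self.next_free_port();
        self.running_snapshots
            .entry(id.to_string())
            .or_default()
            .insert(name.to_string(), port_number);
        port_number
    }

    fn next_free_port(&mut self) -> u16 {
        let mut port = 19000;
        while self.used_ports.contains(&port) {
            port += 1;
        }
        self.used_ports.insert(port);
        port
    }
}
```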

```rust
use super::*;
use omicron_test_utils::dev::test_setup_log;

/// Validate that the Crucible agent reuses ports
```
Contributor

Is this testing the sim agent, or the real agent?

Contributor Author

In this case it's the sim one

Contributor

If it's all testing the sim, we should make it clear in the comments that that's what we are testing, so I don't come back later and think this is testing a real agent :)

Contributor Author

Done! see bbca204
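Hypothetical wording of the clarified comment (the actual change is in bbca204); the point is only that the doc comment names the simulated agent explicitly:

```rust
/// Validate that the *simulated* Crucible agent reuses ports freed by
/// deleted running snapshots; the real agent is not exercised here.
#[test]
fn test_simulated_agent_port_reuse() {
    // test body elided
}
```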

Contributor

@leftwo leftwo left a comment

Just the one request: the comments for these tests should make clear that they're testing the simulated agent.

@jmpesp jmpesp enabled auto-merge (squash) September 9, 2024 21:10
@jmpesp jmpesp merged commit 69af861 into oxidecomputer:main Sep 9, 2024
16 checks passed
@jmpesp jmpesp deleted the stricter_sled_agent branch September 10, 2024 16:09