[tests] make instance reincarnation tests less racy #7295
Merged
A possible test flake exists in the tests for instance reincarnation. This is caused by a race that occurs when the request to simulate an instance's state transition is sent to the simulated sled-agent before an instance-start saga sends the request to start the instance to that sled-agent. When this occurs, the simulated state transition is lost, and the test keeps waiting for it forever. See this comment for details.
This commit resolves this by adding a new `instance_wait_for_simulated_transition` helper, which is identical to `instance_wait_for_state`, but with the addition of `instance_poke` requests every time the instance is not observed to be in the desired state. This is a bit of a blunt instrument, but it ensures that the simulated sled-agent will always be told to simulate a state transition after it's requested by the control plane.

Hopefully fixes #6727
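
For illustration, here is a minimal, self-contained sketch of the race and of the poke-on-every-poll approach described above. It is not the code from this PR: `SimSledAgent`, `State`, `wait_for_state`, and `wait_for_simulated_transition` are hypothetical stand-ins that use only the Rust standard library, and the real helpers talk to the control plane's API rather than to an in-memory struct.

```rust
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::{Duration, Instant};

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum State {
    Stopped,
    Starting,
    Running,
}

/// Hypothetical stand-in for the simulated sled-agent.
#[derive(Clone)]
struct SimSledAgent {
    state: Arc<Mutex<State>>,
}

impl SimSledAgent {
    fn new() -> Self {
        SimSledAgent { state: Arc::new(Mutex::new(State::Stopped)) }
    }

    /// What the start saga eventually asks for: Stopped -> Starting.
    fn start(&self) {
        let mut state = self.state.lock().unwrap();
        if *state == State::Stopped {
            *state = State::Starting;
        }
    }

    /// Simulate the pending transition. A poke that arrives before
    /// `start` finds nothing to finish and is silently lost.
    fn poke(&self) {
        let mut state = self.state.lock().unwrap();
        if *state == State::Starting {
            *state = State::Running;
        }
    }

    fn state(&self) -> State {
        *self.state.lock().unwrap()
    }
}

/// Shared polling loop. Returns `Err` with the last observed state on timeout.
fn poll(
    agent: &SimSledAgent,
    desired: State,
    timeout: Duration,
    poke: bool,
) -> Result<State, State> {
    let deadline = Instant::now() + timeout;
    loop {
        let observed = agent.state();
        if observed == desired {
            return Ok(observed);
        }
        if Instant::now() >= deadline {
            return Err(observed);
        }
        if poke {
            // Re-send the poke each time the desired state is not observed,
            // so a poke lost to the race is eventually replaced.
            agent.poke();
        }
        thread::sleep(Duration::from_millis(10));
    }
}

/// Polls only; hangs until timeout if the single earlier poke was lost.
fn wait_for_state(
    agent: &SimSledAgent,
    desired: State,
    timeout: Duration,
) -> Result<State, State> {
    poll(agent, desired, timeout, false)
}

/// Same as `wait_for_state`, plus a poke on every unsuccessful poll.
fn wait_for_simulated_transition(
    agent: &SimSledAgent,
    desired: State,
    timeout: Duration,
) -> Result<State, State> {
    poll(agent, desired, timeout, true)
}
```

In this sketch, a test that pokes once before `start` and then calls `wait_for_state` times out with the agent stuck in `Starting`, while `wait_for_simulated_transition` reaches `Running` regardless of whether `start` lands before or after the first poke.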