test: Cope with busy devices when creating partitions #20521

Closed

Conversation

mvollmer
Member

@mvollmer mvollmer commented May 28, 2024

UDisks2 wipes a freshly created partition, and this sometimes fails
with "Device or resource busy". A good fix might be to have UDisks2
retry the wiping a little bit, but for now we do what a user would do:
Format the created partition explicitly.


Fixes #20520
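
The "retry the wiping a little bit" idea from the description could look roughly like the sketch below. This is only an illustration under stated assumptions: `retry_while_busy` and its parameters are hypothetical names, not part of UDisks2, libblockdev, or the Cockpit test library, and the attempt count and delay are arbitrary.

```python
import errno
import time


def retry_while_busy(operation, attempts=5, delay=1.0, sleep=time.sleep):
    """Call operation() again while it fails with EBUSY.

    Hypothetical helper for illustration only; the name, the number of
    attempts, and the delay are assumptions, not taken from UDisks2.
    """
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except OSError as exc:
            # Only "Device or resource busy" is retried; any other error,
            # or EBUSY on the last attempt, is passed on to the caller.
            if exc.errno != errno.EBUSY or attempt == attempts:
                raise
            sleep(delay)
```

A caller would wrap the wipe or format step in a function and pass it as `operation`. If the device stays busy for all attempts, the original `OSError` still surfaces, so a permanently busy device is not hidden.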

To make sure that we catch all cases in our tests.
@mvollmer mvollmer added the no-test label May 28, 2024
@mvollmer
Member Author

Some tests have not been ported to create_partition_with_retry yet and are expected to fail because of the forced error. Let's see if they are a problem in practice.

@mvollmer mvollmer force-pushed the test-retry-partition-formatting branch from f6d02c7 to 8c88224 Compare May 28, 2024 08:06
@mvollmer
Member Author

> Some tests have not been ported to create_partition_with_retry yet and are expected to fail because of the forced error. Let's see if they are a problem in practice.

Now they have.

@martinpitt
Member

Note static-code:

not ok 8 /static-code/test-ruff
# test/common/storagelib.py:395:39: E201 [*] Whitespace after '{'

@@ -368,6 +368,39 @@ def doit():
raise
self.browser.wait(doit)

# Creating and formatting a partition sometimes fails with "Device
# or resource busy" when the storage stack trips over its own feet
# during the process. Let's do what the user would do and
Member

which is "swear and go back to MacOS"? 😁 🙈

More seriously, feels like an OK hack in our current position. Thanks!

Member Author

Yeah... oh, wipefs has --lock and even documents it to be used to coordinate with udevd... let me make this libblockdev PR...

Member Author

@mvollmer mvollmer May 28, 2024

Hmm, flock will not protect against open returning EBUSY, because the flocking happens after open...
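
The ordering problem described here can be sketched as follows. This assumes the tool opens the block device with `O_EXCL` (on Linux, such an open fails with `EBUSY` when the device is in use) and only then takes the lock; `open_then_lock` and its `opener`, `locker`, and `closer` parameters are hypothetical names used so the sequence can be exercised without a real device.

```python
import errno
import fcntl
import os


def open_then_lock(path, opener=os.open, locker=fcntl.flock, closer=os.close):
    """Open a device exclusively, then take an advisory lock on it.

    Hypothetical illustration; the hook parameters are assumptions that
    allow the ordering to be tested without a real block device.
    """
    # Step 1: open. If the device is busy, EBUSY is raised right here.
    fd = opener(path, os.O_RDWR | os.O_EXCL)
    try:
        # Step 2: lock. flock() operates on a file descriptor, so it can
        # only run once open() has already succeeded.
        locker(fd, fcntl.LOCK_EX)
    except OSError:
        closer(fd)
        raise
    return fd
```

Because the lock is taken on a descriptor that only exists after a successful open, a busy device makes step 1 fail before step 2 is ever reached, which is why the lock cannot guard against that particular error.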

UDisks2 wipes a freshly created partition, and this sometimes fails
with "Device or resource busy".  A good fix might be to have UDisks2
retry the wiping a little bit, but for now we do what a user would do:
Format the created partition explicitly.
@mvollmer mvollmer force-pushed the test-retry-partition-formatting branch from 8c88224 to d08853e Compare May 28, 2024 10:24
@mvollmer
Member Author

The Format step also ran into "Device busy" on TF rawhide, so a little delay might be in order. But now I actually hope that the device is permanently busy and we have a chance of figuring out why.

@mvollmer
Member Author

> so a little delay might be in order.

Well, d'oh, at this point we have already waited a minute for the dialog to close, hmm.

@mvollmer
Member Author

mvollmer commented May 28, 2024

We also have "Device or resource busy" from pure command line stuff in the tests:

Cannot exclusively open /dev/sda1, device in use.
Traceback (most recent call last):
  File "/source/test/verify/check-storage-used", line 38, in testUsed
    m.execute(f"echo einszweidrei | cryptsetup luksFormat --pbkdf-memory 32768 {disk}1")

I wonder if there are some new kernel shenanigans similar to the ones that made us do #17798

It looks like the new partitions are permanently busy... let's see if
we can figure out why.
@@ -19,6 +19,7 @@
import os.path
import re
import textwrap
import time

Check notice: Code scanning / CodeQL
Unused import (Note, test): Import of 'time' is not used.
@mvollmer
Member Author

Let's try avoiding scsi_debug for disks that get partitioned.

@mvollmer mvollmer force-pushed the test-retry-partition-formatting branch from a2ebf24 to fb9cfa1 Compare May 28, 2024 13:37
@mvollmer
Member Author

> Let's try avoiding scsi_debug for disks that get partitioned.

This seems to have worked. I can't explain it, and I am happy with that.

@mvollmer mvollmer closed this May 28, 2024
Labels
no-test For doc/workflow changes, or experiments which don't need a full CI run,
Development

Successfully merging this pull request may close these issues.

rawhide started to fail storage tests
2 participants