Cannot remount /etc while rebuilding with etc.overlay enabled #303262

Open
oluceps opened this issue Apr 11, 2024 · 11 comments · Fixed by #328221
Labels
0.kind: bug Something is broken

Comments

@oluceps
Member

oluceps commented Apr 11, 2024

Describe the bug

Cannot remount /etc while rebuilding with etc.overlay enabled.
This issue does not always appear.

Possibly related:
#270727
#291398

Steps To Reproduce

  systemd.sysusers.enable = true;
  system.etc.overlay.enable = true;
  system.etc.overlay.mutable = true;
Then run doas nixos-rebuild switch (or colmena apply-local --sudo):
<..snip
ktop.systemd1 /org/freedesktop/systemd1 org.freedesktop.systemd1.Manager ListUnitsByNames as 1 -- dbus-broker.service' exited with value 1 at /nix/store/0wsk0nwb6bq5f8bfxwny6bww68c44pji-nixos-system-kaambl-24.05.20240408.4cba8b5/bin/switch-to-configuration line 145.
kaambl | Activation failed: Child process exited with error code: 1
       | Failed: Child process exited with error code: 1
[ERROR] Failed to complete requested operation - Last 1 lines of logs:
[ERROR]  failure) Child process exited with error code: 1
[ERROR] Failed to deploy to kaambl - Last 20 lines of logs:
[ERROR]   stderr) Successfully installed Lanzaboote.
[ERROR]   stderr) stopping the following units: agenix-install-secrets.service, dae.service, systemd-tmpfiles-resetup.service, systemd-udevd-control.socket, systemd-udevd-kernel.socket, systemd-udevd.service
[ERROR]   stderr) activating the configuration...
[ERROR]   stdout) remounting /etc...
[ERROR]   stderr) mount: /tmp/tmp.T2fXevh7N0: overlay already mounted on /etc.
[ERROR]   stderr)        dmesg(1) may have more information after failed mount system call.
[ERROR]   stderr) Moving mount
[ERROR]   stderr) Mounting beneath top mount
[ERROR]   stderr) Invalid argument | move-mount.c: 553: main: move_mount
[ERROR]   stdout) Attaching mount /tmp/tmp.T2fXevh7N0 -> /etc
[ERROR]   stdout) Moving single attached mount
[ERROR]   stdout) Activation script snippet 'etc' failed (1)
[ERROR]   stderr) Reload daemon failed: Connection reset by peer
[ERROR]   stderr) reloading user units for elen...
[ERROR]   stderr) su: Cannot determine your user name.
[ERROR]   stderr) restarting sysinit-reactivation.target
[ERROR]   stderr) Failed to restart sysinit-reactivation.target: Connection timed out
[ERROR]   stderr) See system logs and 'systemctl status sysinit-reactivation.target' for details.
[ERROR]   stderr) '/nix/store/ydkp4xlbpmvf1j5xp09rw70vy3vb5n5a-system-path/bin/busctl --json=short call org.freedesktop.systemd1 /org/freedesktop/systemd1 org.freedesktop.systemd1.Manager ListUnitsByNames as 1 -- dbus-broker.service' exited with value 1 at /nix/store/0wsk0nwb6bq5f8bfxwny6bww68c44pji-nixos-system-kaambl-24.05.20240408.4cba8b5/bin/switch-to-configuration line 145.
[ERROR]  failure) Child process exited with error code: 1
[ERROR] -----
[ERROR] Operation failed with error: Child process exited with error code: 1
Hint: Backtrace available - Use `RUST_BACKTRACE=1` environment variable to display a backtrace

kernel log:

Apr 11 02:01:29 kaambl kernel: erofs: (device loop1): mounted with root inode @ nid 36.
Apr 11 02:01:29 kaambl kernel: overlayfs: upperdir is in-use as upperdir/workdir of another mount, mount with '-o index=off' to override exclusive upperdir protection.
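The overlayfs message above says the upperdir is already in use by another mount. One way to look for that from userspace is to group the overlay mounts in /proc/self/mountinfo by their upperdir option. The following is only a rough sketch: the helper names are made up for illustration, it assumes the standard mountinfo field layout, and it only sees mounts in the current mount namespace, so it may not reflect everything the kernel considers "in use".

```python
from collections import defaultdict


def overlay_upperdirs(mountinfo_text):
    """Map each overlay upperdir to the mount points that reference it."""
    users = defaultdict(list)
    for line in mountinfo_text.splitlines():
        # Everything after the " - " separator is: fstype, source, super options.
        left, sep, right = line.partition(" - ")
        if not sep:
            continue
        left_fields = left.split()
        right_fields = right.split()
        if len(left_fields) < 5 or not right_fields:
            continue
        if right_fields[0] != "overlay":
            continue
        mount_point = left_fields[4]
        super_opts = right_fields[2] if len(right_fields) > 2 else ""
        for opt in super_opts.split(","):
            if opt.startswith("upperdir="):
                users[opt[len("upperdir="):]].append(mount_point)
    return dict(users)


def shared_upperdirs(mountinfo_text):
    """Return only the upperdirs used by more than one overlay mount."""
    return {
        upper: points
        for upper, points in overlay_upperdirs(mountinfo_text).items()
        if len(points) > 1
    }


if __name__ == "__main__":
    with open("/proc/self/mountinfo") as f:
        for upper, points in shared_upperdirs(f.read()).items():
            print(f"{upper} is used by: {', '.join(points)}")
```

Running it right after a failed switch, before anything is cleaned up, would show whether a leftover temporary mount still references the same upperdir as /etc.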

Expected behavior

/etc successfully mounted and switch complete.

Metadata

> nix-info -m
 - system: `"x86_64-linux"`
 - host os: `Linux 6.8.4-cachyos, NixOS, 24.05 (Uakari), 24.05.20240408.4cba8b5`
 - multi-user?: `yes`
 - sandbox: `yes`
 - version: `nix-env (Nix) 2.18.2`
 - channels(root): `"nixos"`
 - nixpkgs: `/nix/store/yzkrxddg9fjcjcahb197lgrsz4i9cbhh-450afzqlzzgw6wnyc3dwysf3i5yxyqkr-source`

@oluceps oluceps added the 0.kind: bug Something is broken label Apr 11, 2024
@oluceps
Member Author

oluceps commented Apr 13, 2024

Worth mentioning: with system.etc.overlay.mutable = false; this does not occur.

@r-vdp
Contributor

r-vdp commented May 13, 2024

@nikstur this happens for me also when I have anything mounted on top of /etc.

This reproduces it:

diff --git a/nixos/tests/activation/etc-overlay-mutable.nix b/nixos/tests/activation/etc-overlay-mutable.nix
index 087c06408a71..5b150c61b08f 100644
--- a/nixos/tests/activation/etc-overlay-mutable.nix
+++ b/nixos/tests/activation/etc-overlay-mutable.nix
@@ -25,6 +25,9 @@
      machine.succeed("/run/current-system/bin/switch-to-configuration test")

    with subtest("switching to a new generation"):
+      machine.succeed("mkdir /etc/mountpoint")
+      machine.succeed("mount -t tmpfs tmpfs /etc/mountpoint")
+
      machine.fail("stat /etc/newgen")
      machine.succeed("echo -n 'mutable' > /etc/mutable") 
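Since the reproduction above depends on something being mounted below /etc, it can help to check a live system for such mounts before switching. A minimal sketch, assuming the standard /proc/self/mountinfo layout (the mount point is the fifth field); the function name is hypothetical and not part of any NixOS tooling:

```python
def mounts_under(mountinfo_text, prefix="/etc"):
    """Return the mount points that sit strictly beneath `prefix`."""
    prefix = prefix.rstrip("/")
    found = []
    for line in mountinfo_text.splitlines():
        fields = line.split()
        if len(fields) < 5:
            continue
        mount_point = fields[4]
        # Match "/etc/..." only, not "/etc" itself or "/etcetera".
        if mount_point.startswith(prefix + "/"):
            found.append(mount_point)
    return found


if __name__ == "__main__":
    with open("/proc/self/mountinfo") as f:
        for mount_point in mounts_under(f.read()):
            print(mount_point)
```

On a machine in the state created by the test change above, this would be expected to list /etc/mountpoint.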

@oluceps
Member Author

oluceps commented Aug 23, 2024

Still experiencing this:

kaambl | Evaluated kaambl
kaambl | Building kaambl
kaambl | /nix/store/ccjy49f1x5gvdgsb9qmi7crl2n05hisj-nixos-system-kaambl-24.11.20240729.9f10e67
kaambl | Built "/nix/store/ccjy49f1x5gvdgsb9qmi7crl2n05hisj-nixos-system-kaambl-24.11.20240729.9f10e67"
kaambl | Pushing system closure
kaambl | Pushed system closure
kaambl | No pre-activation keys to upload
kaambl | Activating system profile
kaambl | Installing Lanzaboote to "/efi"...
kaambl | Collecting garbage...
kaambl | Successfully installed Lanzaboote.
kaambl | stopping the following units: systemd-modules-load.service, systemd-tmpfiles-resetup.service, systemd-udevd-control.socket, systemd-udevd-kernel.socket, systemd-udevd.service
kaambl | activating the configuration...
kaambl | remounting /etc...
kaambl | mount: /tmp/tmp.uvOanRgx7X: /dev/loop0 already mounted or mount point busy.
kaambl |        dmesg(1) may have more information after failed mount system call.
kaambl | Moving mount
kaambl | Mounting beneath top mount
kaambl | Attaching mount /tmp/tmp.ShpAHKUkAh -> /etc
kaambl | Moving single attached mount
kaambl | Activation script snippet 'etc' failed (32)
kaambl | Failed to run activate script
kaambl | reloading user units for elen...
kaambl | Error: Failed to restart nixos-activation.service
kaambl | 
kaambl | Caused by:
kaambl |     Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
kaambl | restarting sysinit-reactivation.target
kaambl | Failed to restart sysinit-reactivation.target: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
kaambl | Error: Failed to get unit dbus-broker.service
kaambl | 
kaambl | Caused by:
kaambl |     Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.

@oluceps oluceps reopened this Aug 23, 2024
@oluceps
Member Author

oluceps commented Aug 23, 2024

> nix-info -m
 - system: `"x86_64-linux"`
 - host os: `Linux 6.10.2, NixOS, 24.11 (Vicuna), 24.11.20240729.9f10e67`
 - multi-user?: `yes`
 - sandbox: `yes`
 - version: `nix-env (Lix, like Nix) 2.91.0-dev-pre20240726-6abad7c
System type: x86_64-linux
Additional system types: i686-linux, x86_64-v1-linux, x86_64-v2-linux, x86_64-v3-linux
Features: gc, signed-caches
System configuration file: /etc/nix/nix.conf
User configuration files: /home/elen/.config/nix/nix.conf:/etc/xdg/nix/nix.conf:/home/elen/.local/share/flatpak/exports/etc/xdg/nix/nix.conf:/var/lib/flatpak/exports/etc/xdg/nix/nix.conf:/home/elen/.nix-profile/etc/xdg/nix/nix.conf:/nix/profile/etc/xdg/nix/nix.conf:/home/elen/.local/state/nix/profile/etc/xdg/nix/nix.conf:/etc/profiles/per-user/elen/etc/xdg/nix/nix.conf:/nix/var/nix/profiles/default/etc/xdg/nix/nix.conf:/run/current-system/sw/etc/xdg/nix/nix.conf
Store directory: /nix/store
State directory: /nix/var/nix
Data directory: /nix/store/w1y9gd6yxf8azq4ilnk7ghcbjkcp2bbx-lix-2.91.0-dev-pre20240726-6abad7c/share`
 - nixpkgs: `/nix/store/n5yzhgbv2vrf43rjdw831xynv82by12f-rb49nm580v5dp49y1ram2byyg7pd4sj1-source`

@nikstur
Contributor

nikstur commented Aug 23, 2024

Can you provide kernel logs with dmesg for this mount call? Otherwise I cannot tell what's going on. It looks like it fails to mount the metadata image.

@oluceps
Member Author

oluceps commented Aug 23, 2024

> Can you provide kernel logs with dmesg for this mount call? Otherwise I cannot tell what's going on. It looks like it fails to mount the metadata image.

I haven't seen any kernel log related to this, and it's hard to reproduce. I'll stay on this nixpkgs revision for a few weeks to see if I can reproduce it.

journalctl -k -p 5 --since 14:00

https://pb.nyaw.xyz/on-toucan.txt

Possibly related: #333999

@oluceps
Member Author

oluceps commented Aug 23, 2024

@Mic92
Member

Mic92 commented Aug 23, 2024

@nikstur I don't think there will be any logs if the kernel returns -EBUSY on the mount syscall. I think mount only prints this for filesystems that have custom error logs.
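For reference when reading the logs in this thread, the error strings can be tied back to errno values. The sketch below only uses Python's standard errno and os modules. The reading that "already mounted or mount point busy" is mount(8)'s wording for EBUSY, and that the "failed (32)" status matches mount(8)'s documented "mount failure" exit code, is an interpretation of the logs rather than something confirmed in the thread.

```python
import errno
import os

# mount(8) documents exit status 32 as "mount failure". Whether the
# activation script's "failed (32)" comes straight from mount is an assumption.
MOUNT_EX_FAIL = 32


def describe(err):
    """Format an errno value as 'NAME (number): message'."""
    return f"{errno.errorcode[err]} ({err}): {os.strerror(err)}"


if __name__ == "__main__":
    # EBUSY: what the mount syscall is suspected to return here.
    # EINVAL: matches the "Invalid argument" reported for move_mount.
    for err in (errno.EBUSY, errno.EINVAL):
        print(describe(err))
```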

@Mic92
Member

Mic92 commented Aug 23, 2024

@oluceps can you run strace -f -s512 -e mount nixos-rebuild switch as root and give us the output?

@oluceps
Member Author

oluceps commented Aug 24, 2024

> @oluceps can you run strace -f -s512 -e mount nixos-rebuild switch as root? And give us the output?

Here's the log from sudo strace -f -s512 -e mount nixos-rebuild switch --flake .:
https://pb.nyaw.xyz/famous-squirrel.txt

@Mic92
Member

Mic92 commented Aug 25, 2024

Interesting. This looks different than expected. I would have expected a failing mount system call, but it seems to have failed in a different syscall.
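Since the earlier activation log reports the failure in move_mount, a follow-up trace could include the newer mount API syscalls alongside mount. The sketch below only builds the strace command line. The syscall list is a guess at what is relevant, the helper name is hypothetical, and an older strace may reject names it does not know.

```python
import shlex

# Classic mount syscalls plus the newer mount API. Which of these the
# activation script actually uses is an assumption.
MOUNT_SYSCALLS = (
    "mount",
    "umount2",
    "open_tree",
    "move_mount",
    "fsopen",
    "fsconfig",
    "fsmount",
    "fspick",
    "mount_setattr",
)


def strace_argv(command, syscalls=MOUNT_SYSCALLS, strsize=512):
    """Build an strace argv that follows forks and traces only `syscalls`."""
    return [
        "strace",
        "-f",
        f"-s{strsize}",
        "-e",
        "trace=" + ",".join(syscalls),
        *command,
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so it can be reviewed first.
    print(shlex.join(strace_argv(["nixos-rebuild", "switch"])))
```

The printed command would then be run as root, in the same way as the original strace request.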

4 participants