systemd udev doesn't wait for HDDs, system can't boot (not even recovery) #20

Open
alaricljs opened this issue Jan 27, 2017 · 23 comments

@alaricljs

With systemd's udev, sd-zfs fails to find my boot devices. The boot then tries to enter rescue mode and fails there too: it cannot mount /sysroot, and systemd complains about not being able to do something with the password file. With the udev and zfs hooks, udev spins for a while waiting on my storage and then everything comes up clean. With systemd and sd-zfs, the udev startup appears to be parallelized with other items, and this breaks the process. The boot devices are SSDs, but primary storage is all HDDs set to not spin up unless told to. Unfortunately I don't have the resources to determine whether this also happens without sd-zfs.

Do you know of some way to force systemd to wait on udev before proceeding? The initramfs is a very truncated version and I don't know where to start looking.

@dasJ
Owner

dasJ commented Feb 21, 2017

That should not happen, as sd-zfs goes After udev:

https://github.com/dasJ/sd-zfs/blob/master/src/zfs-generator.c#L173

You can check the dependency tree by adding systemd-analyze to your initrd and generating a dependency graph.
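One possible way to narrow such a graph down to a single unit, as a minimal sketch. It assumes you can reach a shell inside the initrd and that systemd-analyze was added to the image (for example through BINARIES in /etc/mkinitcpio.conf). The helper name edges_for and the output path are made up for illustration.

```shell
# Keep only the graph edges that mention one unit.
# Reads the output of `systemd-analyze dot` on stdin.
# edges_for is a hypothetical helper name, not part of systemd or sd-zfs.
edges_for() {
    grep -F "\"$1\""
}

# Usage inside the initrd shell (not executed here):
#   systemd-analyze dot --order | edges_for sysroot.mount > /run/sysroot-order.dot
```

The filtered edges show which units sysroot.mount is ordered against, which is the information needed to see whether the udev and cryptsetup units are in the chain at all.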

@alaricljs
Author

Any thoughts on how to get that to work and get data out of it? The ZFS pool never imports or mounts, and no login access is available. I don't see a way to run this against a non-running systemd setup.

@dasJ
Owner

dasJ commented Feb 23, 2017 via email

@wallzero

I appear to be having the same issue, only it's also not waiting for cryptsetup.target.

@dasJ
Owner

dasJ commented Apr 28, 2017

Do you use hibernation? You may have the same problem I had: systemd/systemd#4577

@wallzero

wallzero commented May 1, 2017

No I don't use hibernation at the moment; I only have one SSD and I didn't want to partition it. My swap sits in a VDEV and won't work with hibernation as far as I know.

dasJ closed this as completed Dec 6, 2017

@wallzero

I just tried again on a fresh install and sd-zfs still seems to run immediately. Before I am even prompted for the LUKS password, a "[FAILED] Failed to mount /sysroot" error is logged. systemctl status sysroot.mount shows the following:

Where: /sysroot
What: zfs:rpool/root
Docs: ...
Process: 116 ExecMount=/usr/bin/mount zfs:rpool/root /sysroot -o rw ...

dasJ reopened this Dec 13, 2017

@dasJ
Owner

dasJ commented Dec 13, 2017

How did you configure LUKS?

@wallzero

wallzero commented Jan 3, 2018

I encrypted the drive with the following:

cryptsetup luksFormat -c aes-xts-plain64 -s 512 -h sha512 /dev/sda2

Then I modified /etc/default/grub and tried several things:

GRUB_CMDLINE_LINUX_DEFAULT="rd.luks.uuid=id rd.luks.name=id=luks rd.luks.crypttab=no rd.luks.options=tries=0,timeout=120s rootflags=x-systemd.mount-timeout=infinity,retry=10000,x-systemd.device-timeout=120s root=zfs:rpool/root zfs_force=1 quiet splash"

/etc/fstab doesn't have anything about the root, only the following:

# EFI
UUID=efiId /boot/efi vfat discard,umask=0077 0 0

# Boot
UUID=bootId /boot ext4 defaults,discard,nofail 0 0

# Swap
/dev/zvol/rpool/swap none swap defaults,discard 0 0

The zpool bootfs is configured:

zpool get bootfs
NAME   PROPERTY  VALUE       SOURCE
rpool  bootfs    rpool/root  local

@dasJ
Owner

dasJ commented Jan 4, 2018

Correct ..... almost.

You need to add sd-encrypt to the HOOKS in /etc/mkinitcpio.conf.
The cmdline LUKS options are only for the non-systemd initrd. My cmdline just looks like: root=zfs:zroot/root rw.

Now create a /etc/crypttab.initramfs. You can find the syntax in crypttab(5).
Yours should probably look like this:

luks     UUID=id - tries=0,timeout=120s

Also, you can add discard to the options in the crypttab entry when you have an SSD.

Then, rebuild your initcpio. Hope this helps.
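The crypttab step above could be scripted roughly as follows. This is only a sketch: the function name write_crypttab_initramfs and its optional fourth argument (an alternate root directory, handy for dry runs) are hypothetical, and the default options are simply the ones from the example line.

```shell
# write_crypttab_initramfs NAME UUID [OPTIONS] [ROOT]
# Writes a one-line crypttab.initramfs in the crypttab(5) layout:
#   name  device  keyfile  options
# "-" in the keyfile column means no key file (prompt for a passphrase).
# Hypothetical helper; ROOT defaults to the real filesystem root.
write_crypttab_initramfs() {
    cti_name=$1
    cti_uuid=$2
    cti_opts=${3:-tries=0,timeout=120s}
    cti_root=${4:-}
    printf '%s UUID=%s - %s\n' "$cti_name" "$cti_uuid" "$cti_opts" \
        > "$cti_root/etc/crypttab.initramfs"
}

# Usage as root (not executed here); /dev/sda2 is the LUKS partition
# from earlier in this thread:
#   write_crypttab_initramfs luks "$(blkid -s UUID -o value /dev/sda2)"
#   mkinitcpio -P    # rebuild all presets
```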

@dasJ
Owner

dasJ commented Jan 4, 2018

Btw, my full HOOKS are:

HOOKS="base systemd autodetect modconf block keyboard sd-vconsole sd-encrypt sd-zfs"

I don't really need base, but it helps when I have to troubleshoot stuff in the initrd.

@wallzero

wallzero commented Jan 4, 2018

Sorry, I had forgotten to include my /etc/mkinitcpio.conf:

HOOKS="base systemd autodetect modconf block keyboard keymap sd-encrypt sd-zfs filesystems fsck"

The only differences I see are that keymap is used instead of sd-vconsole and that filesystems and fsck are added.

@dasJ
Owner

dasJ commented Jan 4, 2018

Do you have a proper crypttab.initramfs?

@wallzero

wallzero commented Jan 4, 2018

I tried adding your crypttab.initramfs example with my UUID but after rebuilding the initcpio and updating grub it still appears to not wait for the password prompt. Same issue as above.

Also, why is it /etc/crypttab.initramfs and not /etc/crypttab?

@dasJ
Owner

dasJ commented Jan 4, 2018

Because the crypttab.initramfs is put into your initramfs, while crypttab isn't.
In your fstab, can you use /dev/mapper/whatever instead of the UUIDs? systemd is probably unable to generate the proper dependencies.
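A quick way to check that the crypttab really made it into the image, sketched under two assumptions: the image lives at /boot/initramfs-linux.img (the usual Arch name; adjust for your kernel), and the sd-encrypt hook stores crypttab.initramfs as etc/crypttab inside the image. The helper name has_crypttab is made up for illustration.

```shell
# Succeeds if a file listing on stdin contains etc/crypttab.
# The pattern tolerates a leading "./" or "/" in the listed paths.
# has_crypttab is a hypothetical helper name.
has_crypttab() {
    grep -Eq '(^|/)etc/crypttab$'
}

# Usage (not executed here):
#   lsinitcpio /boot/initramfs-linux.img | has_crypttab \
#       && echo "crypttab is in the image" \
#       || echo "crypttab is missing; check the hooks and rebuild"
```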

@wallzero

wallzero commented Jan 5, 2018

I'm sorry, do you mean for the ZFS partition? I do not have the ZFS partition UUID under a / entry in /etc/fstab. I am using the zpool bootfs option. I didn't mention above that I also set the mountpoint on the root partition:

zfs get mountpoint rpool/root
NAME PROPERTY VALUE SOURCE
rpool/root mountpoint / local

I could try zfs set mountpoint=legacy rpool/root?

/dev/mapper/ only contains control and luks links. I could also try /dev/mapper/luks in /etc/crypttab.initramfs?

@dschaper

dschaper commented Jan 9, 2018

I had the same problem. grub-mkconfig throws in additional root directives; see /etc/grub.d/10_linux at line 66 or so. My /etc/default/grub was set with root per the documentation, but the generated /boot/grub/grub.cfg had two root= entries on the kernel line: one with root: as required and one with root=ZFS=, which is what systemd picked up and tried to run with. Booting up and removing the first entry from the kernel line let me boot without issues.

For the record, my setup is ZFS on LUKS full-disk encryption. /boot is encrypted on a separate drive, with no keyfiles; I enter the passwords manually until I get things debugged correctly.
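To spot the duplicate root= situation described above without reading grub.cfg by eye, something like the following could be used. It is a sketch only; the function name find_duplicate_root is made up, and it simply flags any input line that carries more than one root= token.

```shell
# Print every input line that contains more than one root= parameter,
# prefixed with its line number. Exit status is 0 if any such line was
# found and 1 otherwise. find_duplicate_root is a hypothetical helper name.
find_duplicate_root() {
    awk '{
        n = 0
        for (i = 1; i <= NF; i++)
            if ($i ~ /^root=/) n++
        if (n > 1) { print NR ": " $0; found = 1 }
    } END { exit (found ? 0 : 1) }'
}

# Usage (not executed here):
#   find_duplicate_root < /boot/grub/grub.cfg   # generated GRUB config
#   find_duplicate_root < /proc/cmdline         # the running kernel
```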

@dasJ
Owner

dasJ commented Jan 9, 2018

@schlesiger Sorry, I was mistaken. Have you tried the hint of @dschaper ?

@wallzero

@dschaper Thank you for your input! @dasJ I will give @dschaper's solution a try! I already see two root= definitions in my /boot/grub/grub.cfg.

@maksim-pinguin

I have the same issue with systemd-boot. I checked my kernel parameters in the entries screen, and I don't have any duplicate root=/: values.
[image: IMG_20210623_201847]

That's the error during bootstrap. I also can't get an emergency shell, and journalctl is empty when I chroot in.

[image: IMG_20210623_201507]

@n-st

n-st commented Sep 26, 2021

I'm seeing the same behaviour — sd-zfs tries to import the pool before sd-encrypt has decrypted it — with this hook order:

HOOKS=(base systemd autodetect keyboard sd-vconsole modconf block sd-encrypt lvm2 sd-zfs filesystems fsck)

I'll try different device specifications (currently PARTLABEL=…) in /etc/crypttab.initramfs when I have some more time.

@maksim-pinguin For what it's worth, you can create an unlocked root account in your initramfs (which is created separately from your regular root account) to at least get an emergency shell: https://bbs.archlinux.org/viewtopic.php?pid=1927757#p1927757
From there, you can probably just zpool import -R /sysroot rpool; exit to continue booting normally.
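If the decrypted device may not exist yet when you reach the emergency shell, the manual import can be guarded with a small wait loop. This is a sketch, not something sd-zfs provides: wait_for_path is a made-up helper, and /dev/mapper/luks assumes the mapping name luks used in the crypttab example earlier in this thread.

```shell
# wait_for_path PATH [SECONDS]
# Poll once per second until PATH exists; give up after SECONDS tries
# (default 30). Returns 0 once the path exists, 1 on timeout.
# Hypothetical helper; plain POSIX sh so it also runs in a busybox shell.
wait_for_path() {
    wfp_path=$1
    wfp_tries=${2:-30}
    while [ "$wfp_tries" -gt 0 ]; do
        [ -e "$wfp_path" ] && return 0
        sleep 1
        wfp_tries=$((wfp_tries - 1))
    done
    return 1
}

# Usage in the emergency shell (not executed here):
#   wait_for_path /dev/mapper/luks 30 && zpool import -R /sysroot rpool
#   exit
```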

@misaka18931

I have the same error as @maksim-pinguin on my setup.

@maksim-pinguin

maksim-pinguin commented Apr 28, 2022

I managed to get it running with the old syntax for the kernel parameter regarding the zfs partition. Check this thread:
https://bbs.archlinux.org/viewtopic.php?pid=1979863#p1979863
