Correctly importing ZFS pools, and nixos-generate-config perl-ness #42178
Comments
Conversation on IRC starting at https://botbot.me/freenode/nixos/2018-06-18/?msg=101240239&page=8
There is also a prototype for ZFS upstream that does this at boot time: openzfs/zfs#4943
Complementary to that would be integrating this mount generator as well: openzfs/zfs#7329
I noticed this referenced in openzfs/zfs#4943, and I wanted to point out that I was actually unaware of some of the complexities of getting all devices to become available. I've rebased the zready patch and half-integrated it with the zfs-mount-generator, but there were many comments in that PR that need to be addressed. The zready patch would allow much faster boots on systems that have nonessential pools that take a long time to import, so merging it would certainly be preferred.
For non-root pools this is definitely the better choice.
The 'big hammer' I've got in #42269 is a better approach than requiring static configuration of the pool members, I think; it's only slightly slower than having the correct systemd dependencies would be, so we can wait for upstream. I'm still going to try to rewrite nixos-generate-config, on the principle that probably more people can contribute if it isn't Perl. There are other changes I might like to make to it, but for now I'll be sticking to a 1:1 rewrite. We'll see how the community feels about that in the corresponding PR, I suppose.
How's that rewrite? Been reading
You have to be quick, otherwise it might be python3: #48378
@jcrben Real life got in the way. Suffice it to say that I'm only just starting to look at it. I don't really mind if it's Python, I just don't like Perl very much. :D That said, I'm now thinking about rewriting the zfs-pool-import script in Rust. The shell scripting is getting over the top, and this is something that has to work.
Has there been any progress on this? |
Not really. It's on my todo-list... somewhere, but I found other work that has a higher reward-to-effort ratio. If you want to take over, be my guest; otherwise it'll happen eventually. |
Thank you for your contributions. This has been automatically marked as stale because it has had no activity for 180 days. If this is still important to you, we ask that you leave a comment below. Your comment can be as simple as "still important to me". This lets people see that at least one person still cares about this. Someone will have to do this at most twice a year if there is no other activity. Here are suggestions that might help resolve this more quickly:
Closing this issue in favor of #89396 |
Issue description
It's a long story, so bear with me for a minute. There's the ZFS pool issue; and then there's what I did in an attempt to fix it.
ZFS pools first.
These, like most volume managers, provide a level of indirection between the underlying device and the filesystem being mounted.
nixos-generate-config already has a degree of understanding of this, even if just through robust parsing; if you use it, it generates segments like the sketch below, which are correctly handled by the rest of NixOS, generating the appropriate systemd units for mounting the filesystem.
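(The example block from the original issue didn't survive here; the following is a minimal sketch of the kind of segment nixos-generate-config emits for a ZFS dataset, with a made-up dataset name and mount point.)

```nix
# Sketch of typical generated output for a ZFS dataset; names are illustrative.
fileSystems."/home" =
  { device = "rpool/home";
    fsType = "zfs";
  };
```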
Well, in principle. There's one problem: the mapping from device to filesystem is lost, which means systemd won't wait until the device nodes exist. There's an attempt to work around that with udevadm settle, but it doesn't actually work, because on modern systems init may be executed before the devices have even joined the bus. udevadm settle only waits until pending udev events are processed; it doesn't wait for devices it doesn't know about yet.

I have on multiple occasions (#16901 and #42087) attempted to fix this with hacky methods such as sleep loops, roughly the shape sketched below.
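(For illustration only: a rough sketch of what such a sleep-loop workaround could look like, written here as hypothetical initrd shell via boot.initrd.postDeviceCommands. It is not the code from #16901 or #42087, and the pool name is made up.)

```nix
# Hypothetical sleep-loop workaround, not the actual patch: poll until the
# pool shows up in the `zpool import` listing, giving up after ~30 seconds.
boot.initrd.postDeviceCommands = ''
  for i in $(seq 1 30); do
    zpool import -d /dev/disk/by-id 2>/dev/null | grep -q 'pool: rpool' && break
    sleep 1
  done
'';
```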
Unfortunately, as I was coding the newest hack, I realized there's a problem with that approach: ZFS will happily import a pool with some missing devices, so long as there's sufficient redundancy for the pool to function. In layman's terms, if you have a simple two-disk mirror, it will still import the pool and let you mount it even if one disk is missing.
This is only natural; mirroring would be pointless if that weren't possible. Unfortunately, ZFS will not afterwards re-join the second disk to the pool unless the user explicitly asks it to by calling zpool online.

So, in summary: the practical effect of the wait-loop code is to make it possible for ZFS pools to be imported in degraded mode, opening users up to data loss they weren't expecting. This is terrible, and I've spent the last few hours trying to think of a quick fix... without much success. I want to say it's unlikely in practice, but all I can say is that I haven't observed the behaviour myself.
I'm not sure how to quickly fix that, in a safe manner that can be backported to -stable. Help?
===
Now, the second issue: fixing it properly requires generating systemd dependencies for the backing devices, in the form of After= dependencies on device units like dev-disk-by\x2did-nvme\x2deui.0025385871b09294\x2dpart2.device. These cannot be generated at build time, since the build might not be run on the machine the pool exists on. (And it would be a breach of hermeticity.) It could be done at bootloader install time, and that's an option I've considered, though it's also one I feel rather uncomfortable with. Ideally, I'd like the devices to be defined in hardware-configuration.nix; that is what we already do for all non-pooled filesystems, in practice. (Roughly what such generated dependencies could look like is sketched below.)
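(A rough sketch of one possible shape for those dependencies, reusing the device unit above and assuming a non-root pool named tank imported by NixOS's per-pool zfs-import service; the exact form nixos-generate-config should emit is precisely what's open here. The doubled backslashes are just Nix string escaping for the literal backslashes in the unit name.)

```nix
# Sketch only: make the pool's import service wait for its backing devices.
# The service name and device unit are illustrative, not generated output.
systemd.services."zfs-import-tank" = {
  after = [ "dev-disk-by\\x2did-nvme\\x2deui.0025385871b09294\\x2dpart2.device" ];
  wants = [ "dev-disk-by\\x2did-nvme\\x2deui.0025385871b09294\\x2dpart2.device" ];
};
```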
Which means that nixos-generate-config should be generating this, and I've spent the last two hours trying to make it so. It's not particularly complicated... except that nixos-generate-config is written in Perl. I've never learned Perl, and it seems a bad choice for anything but historical reasons.
So would anyone mind terribly if I start the process of converting it to Rust, by way of making at least this particular case use a Rust binary?
It won't expand the runtime closure: Rust binaries link their dependencies statically, so nothing but glibc is needed at runtime.