Support creating and mounting Btrfs subvolumes #890

Open
laenion opened this issue Nov 29, 2019 · 20 comments
Labels
jira for syncing to jira kind/enhancement

Comments

@laenion
Contributor

laenion commented Nov 29, 2019

Feature Request

Btrfs is a supported file system for Ignition, but currently lacks support for several extended features. One of those features is subvolume support, which is used extensively on *SUSE distributions.

Desired Feature

As a first step, Ignition should at least support mounting multiple existing subvolumes from the same device, so that files can be written to them during the files stage. This currently fails because Ignition does not allow the same device name to be used more than once.
Mounting a single subvolume per device has been possible since #872 by using appropriate mount options.

As a second step, it would be good if Ignition also supported creating additional subvolumes.

Other Information

This is related to #815 but covers the opposite case: there, multiple disks contain one Btrfs file system.

Example

Subvolume layout from an openSUSE installation:

ID 256 gen 78440 top level 5 path @
ID 258 gen 157397 top level 256 path @/var
ID 259 gen 157350 top level 256 path @/usr/local
ID 260 gen 157389 top level 256 path @/tmp
ID 261 gen 144350 top level 256 path @/srv
ID 262 gen 157352 top level 256 path @/root
ID 263 gen 156614 top level 256 path @/opt
ID 264 gen 157397 top level 256 path @/home
ID 265 gen 124813 top level 256 path @/boot/grub2/x86_64-efi
ID 266 gen 152932 top level 256 path @/boot/grub2/i386-pc
ID 267 gen 157367 top level 256 path @/.snapshots
ID 1038 gen 2151 top level 258 path @/var/lib/machines
ID 1448 gen 154571 top level 267 path @/.snapshots/283/snapshot

Currently it is not possible to mount more than one of these subvolumes besides the root file system.
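To make the parent/child relationships in the listing above easier to see, here is a minimal Python sketch that parses lines in that format. It assumes the exact "ID ... gen ... top level ... path ..." layout shown above; the names `Subvolume`, `parse_subvolume_list` and `children_of` are made up for illustration and are not part of Ignition or btrfs-progs.

```python
import re
from typing import List, NamedTuple


class Subvolume(NamedTuple):
    subvol_id: int
    gen: int
    top_level: int  # ID of the subvolume that contains this one
    path: str


# Matches one line of the listing format shown in this issue.
_LINE = re.compile(r"^ID (\d+) gen (\d+) top level (\d+) path (.+)$")


def parse_subvolume_list(text: str) -> List[Subvolume]:
    """Parse a subvolume listing into structured records."""
    subvolumes = []
    for line in text.splitlines():
        match = _LINE.match(line.strip())
        if match is None:
            continue  # skip blank or unrecognized lines
        subvol_id, gen, top_level, path = match.groups()
        subvolumes.append(
            Subvolume(int(subvol_id), int(gen), int(top_level), path)
        )
    return subvolumes


def children_of(subvolumes: List[Subvolume], parent_id: int) -> List[str]:
    """Return the paths of subvolumes whose top level is parent_id."""
    return [s.path for s in subvolumes if s.top_level == parent_id]


# A shortened copy of the listing from this issue.
LISTING = """\
ID 256 gen 78440 top level 5 path @
ID 258 gen 157397 top level 256 path @/var
ID 267 gen 157367 top level 256 path @/.snapshots
ID 1038 gen 2151 top level 258 path @/var/lib/machines
ID 1448 gen 154571 top level 267 path @/.snapshots/283/snapshot
"""

subvolumes = parse_subvolume_list(LISTING)
```

With this, `children_of(subvolumes, 258)` picks out `@/var/lib/machines` as the only subvolume nested under `@/var`, while `children_of(subvolumes, 5)` returns just `@`.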

@laenion
Contributor Author

laenion commented Nov 29, 2019

I was wondering whether it would make sense to extend the path element.

To mount the /home subvolume one could use a syntax such as

 "device": "/dev/sda[/@/home]"

The brackets are escaped by udev, so there will be no conflict with existing path names (e.g. in a file system label), and this is the same syntax used by findmnt for SOURCE. This syntax would keep device as the primary key.
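As a rough illustration of how such a bracketed value could be split, here is a small Python sketch. It only covers the syntax proposed in this comment; the function name `split_source` and the rule that brackets may not appear anywhere else in the value are assumptions for the example, not existing Ignition behavior.

```python
import re
from typing import Optional, Tuple

# device, optionally followed by one [subvolume] suffix.
# Assumption: neither part may itself contain '[' or ']'.
_SOURCE = re.compile(
    r"^(?P<device>[^\[\]]+)(?:\[(?P<subvolume>[^\[\]]+)\])?$"
)


def split_source(source: str) -> Tuple[str, Optional[str]]:
    """Split 'device[subvolume]' into its two parts.

    The subvolume part is None when no bracket suffix is present.
    """
    match = _SOURCE.match(source)
    if match is None:
        raise ValueError(f"malformed source: {source!r}")
    return match.group("device"), match.group("subvolume")
```

For example, `split_source("/dev/sda[/@/home]")` yields the device `/dev/sda` and the subvolume `/@/home`, while a plain `/dev/sda` yields no subvolume, so existing configs would parse unchanged.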

Creation of new subvolumes is more complex due to the fact that Btrfs supports different styles of subvolume layouts. Looking at the example from above we can see that most of the subvolumes are direct children of @ (which is not the root file system, but @/.snapshots/283/snapshot is). On the other hand e.g. @/var/lib/machines is a subvolume of @/var. Probably a distinction between relative and absolute paths (starting with a '/' or not in the brackets maybe?) would be enough, but in any case Ignition would have to make sure the parent is mounted to be able to create the subvolume...

Any thoughts?

@laenion
Contributor Author

laenion commented Dec 23, 2019

Ping: Is there interest in this feature, or is extended Btrfs support out of scope?

bmwiedemann added a commit to bmwiedemann/openSUSE that referenced this issue Jan 14, 2020
https://build.opensuse.org/request/show/764393
by user dimstar_suse
Add 0002-allow-multiple-mounts-of-same-device.patch:
Allows mounting a device multiple times, e.g. to mount several subvolumes from a Btrfs device or bind mounting the device to multiple places, by adding the path to the key. [Workaround for gh#coreos/ignition#890]
@jdoss

jdoss commented Sep 28, 2020

With Fedora's move to btrfs by default would this issue be in scope now?

@bgilbert
Contributor

I think this issue is in scope, but I suspect we should add dedicated config attributes for it rather than overloading existing ones.

@jdoss, it seems that Fedora change only affects desktop variants.

@cgwalters
Member

@cmurf

This comment was marked as off-topic.

@bgilbert
Contributor

bgilbert commented Nov 8, 2022

Ignition supports compound primary keys. We could add a subvolume field to the filesystems section, and have config validation require that it be absent unless the format is btrfs. I think that's probably better than trying to put multiple data items into the device field.
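A minimal sketch of what that validation rule might look like, written in Python for brevity rather than in Ignition's own Go. The `subvolume` key is the hypothetical field proposed above, and the dictionary shape and error strings are assumptions for illustration only.

```python
from typing import Dict, List, Optional


def validate_filesystems(
    filesystems: List[Dict[str, Optional[str]]]
) -> List[str]:
    """Check the two rules discussed above and return error messages.

    1. `subvolume` (hypothetical field) must be absent unless format is btrfs.
    2. (device, subvolume) acts as a compound primary key, so the same pair
       may not appear twice, but one device may appear with several
       different subvolumes.
    """
    errors = []
    seen = set()
    for index, fs in enumerate(filesystems):
        device = fs.get("device")
        subvolume = fs.get("subvolume")
        if subvolume is not None and fs.get("format") != "btrfs":
            errors.append(
                f"filesystems[{index}]: subvolume requires format btrfs"
            )
        key = (device, subvolume)
        if key in seen:
            errors.append(
                f"filesystems[{index}]: duplicate device/subvolume pair"
            )
        seen.add(key)
    return errors
```

Under these rules, two entries for `/dev/sda` with subvolumes `@/var` and `@/home` pass validation, while repeating the same pair or setting `subvolume` on an ext4 filesystem is reported as an error.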

> Probably a distinction between relative and absolute paths (starting with a '/' or not in the brackets maybe?) would be enough, but in any case Ignition would have to make sure the parent is mounted to be able to create the subvolume...

Is the idea that a relative path would be relative to the default subvolume, and an absolute path would be relative to the top-level subvolume? Can we instead require that all paths be relative to the top-level subvolume?

In order of preference, I see a few options:

  1. Only support subvolume paths relative to the top-level subvolume. Create subvolumes by mounting the top-level subvolume, then creating subvolumes in order of ascending path length.

  2. Support subvolume paths relative to the top-level subvolume (which I'll call "top-relative") or to the default subvolume (which I'll call "default-relative"). At runtime, mount the top-level subvolume, look up the default subvolume, convert all default-relative subvolume paths to top-relative, fail if the same subvolume is specified multiple times, and create subvolumes in order of ascending path length.

  3. Support both default-relative and top-relative subvolume paths, but don't try to disambiguate them. Do two creation passes, one for each type.

    I don't think this would work well for Ignition. To keep the config declarative, Ignition needs to ensure that the same object can't be referenced multiple times, since otherwise it would matter in what order the operations are performed. We could define an order (e.g. "absolute first, then relative") but we've generally avoided that approach so far.
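To illustrate the ordering step in option 1, here is a short Python sketch. It assumes all paths are relative to the top-level subvolume, treats leading and trailing slashes as insignificant, and rejects duplicates so that each subvolume is referenced only once. The function name `creation_order` is hypothetical, and the sketch does not handle intermediate plain directories that are not themselves listed as subvolumes.

```python
from typing import Iterable, List


def creation_order(paths: Iterable[str]) -> List[str]:
    """Return top-relative subvolume paths in a safe creation order.

    A parent path is always a strict prefix of its children and therefore
    shorter, so sorting by ascending length puts parents first.
    """
    normalized = []
    for path in paths:
        parts = [part for part in path.split("/") if part]
        if not parts or "." in parts or ".." in parts:
            raise ValueError(f"invalid subvolume path: {path!r}")
        normalized.append("/".join(parts))
    if len(set(normalized)) != len(normalized):
        raise ValueError("same subvolume specified more than once")
    # The path itself is a tie-breaker so the result is deterministic.
    return sorted(normalized, key=lambda p: (len(p), p))
```

For the openSUSE layout earlier in this thread, passing `@/var/lib/machines`, `@` and `@/var` in any order returns `@`, then `@/var`, then `@/var/lib/machines`.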

Thoughts?

@bgilbert
Contributor

bgilbert commented Nov 8, 2022

And I guess we'd have to reject label for subvolumes?

@laenion
Contributor Author

laenion commented Nov 9, 2022

> Ignition supports compound primary keys. We could add a subvolume field to the filesystems section, and have config validation require that it be absent unless the format is btrfs. I think that's probably better than trying to put multiple data items into the device field.

Agreed.

> Probably a distinction between relative and absolute paths (starting with a '/' or not in the brackets maybe?) would be enough, but in any case Ignition would have to make sure the parent is mounted to be able to create the subvolume...
>
> Is the idea that a relative path would be relative to the default subvolume, and an absolute path would be relative to the top-level subvolume?

Not necessarily the default subvolume, but whatever is used for the current root ('/') file system.

> Can we instead require that all paths be relative to the top-level subvolume?

That sounds like a very reasonable approach, as everything will be reachable from the top-level subvolume (but not necessarily from the root file system).

> In order of preference, I see a few options:
>
> 1. Only support subvolume paths relative to the top-level subvolume. Create subvolumes by mounting the top-level subvolume, then creating subvolumes in order of ascending path length.

Thinking about this again, this is probably also the expected solution from a user's perspective: at least when using snapshots for the root file system, in most cases one probably doesn't want to have subvolumes of a snapshot.

> And I guess we'd have to reject label for subvolumes?

Yes, btrfs subvolumes don't have labels themselves.

@bgilbert
Contributor

bgilbert commented Nov 9, 2022

Okay, sounds good!

> Not necessarily the default subvolume, but whatever is used for the current root ('/') file system.

That couldn't work anyway. Filesystem creation happens in the disks stage, before the OS mounts /sysroot, and Ignition doesn't know what filesystem will be mounted there.

@travier travier added jira for syncing to jira kind/enhancement labels Nov 21, 2022
@queeup

queeup commented Jan 15, 2023

Any news about this? I'd really love to have this.

@har7an

har7an commented Jan 22, 2023

I'm also very interested in seeing this feature.

What exactly is the "top-level" subvolume you are referring to? Do you mean subvolid=5,subvol=/? Personally I'd appreciate being able to create a partition layout similar to Fedora's default, with root, home and var (and more as needed) subvolumes below subvolid=5, which are then mounted accordingly.

@bgilbert
Contributor

subvolid=5,subvol=/ is the top-level subvolume, yes. I believe option 1 from #890 (comment) would provide what you're looking for?

@har7an

har7an commented Apr 22, 2023

@bgilbert Yup, that would work just fine.

Out of curiosity: I don't see a /etc/fstab on CoreOS systems, so I assume that mounts are handled entirely through systemd mount units. Is that correct? So assuming I'd like to do a more elaborate Btrfs setup on CoreOS today, I'd have to create individual mount units for all subvolumes and then sort out the ordering between the units?

@bgilbert
Contributor

Yes, you'd need a separate mount unit for each subvolume. Butane normally creates that for you if you specify with_mount_unit: true, but it would need to learn about btrfs subvolumes.

I assume btrfs itself doesn't impose any particular mount order requirements? systemd mount units automatically order themselves with respect to parent mounts (see "Implicit Dependencies" in systemd.mount(5)) so no special ordering is required there.
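For anyone writing these units by hand in the meantime, here is a hedged Python sketch that renders one mount unit per subvolume. It is not what Butane generates; the unit contents are a minimal assumption (`What=`, `Where=`, `Type=` and `Options=` are standard systemd.mount(5) settings), and `mount_unit_name` is a simplified approximation of `systemd-escape --path --suffix=mount` that does not collapse repeated slashes.

```python
def mount_unit_name(path: str) -> str:
    """Approximate systemd's path escaping for a mount unit name."""
    trimmed = path.strip("/")
    if not trimmed:
        return "-.mount"  # the root directory is a special case
    escaped = []
    for index, char in enumerate(trimmed):
        if char == "/":
            escaped.append("-")
        elif char.isascii() and (char.isalnum() or char in ":_"):
            escaped.append(char)
        elif char == "." and index > 0:
            escaped.append(char)  # a leading dot must be escaped
        else:
            # Everything else, including '-', becomes \xXX per byte.
            escaped.extend(f"\\x{byte:02x}" for byte in char.encode("utf-8"))
    return "".join(escaped) + ".mount"


def render_mount_unit(device: str, path: str, subvolume: str) -> str:
    """Render a minimal mount unit for one btrfs subvolume."""
    return "\n".join([
        "[Unit]",
        f"Description=Mount btrfs subvolume {subvolume} at {path}",
        "",
        "[Mount]",
        f"What={device}",
        f"Where={path}",
        "Type=btrfs",
        f"Options=subvol={subvolume}",
        "",
        "[Install]",
        "WantedBy=local-fs.target",
        "",
    ])
```

The unit for `/var/home` would be written to a file named by `mount_unit_name("/var/home")`, that is `var-home.mount`. Because systemd orders mount units after their parent mounts automatically, no explicit `After=` lines are included here.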

@dsreyes1014

Hey guys. Dusty helped me figure a workaround to get subvolumes mounted but it seems it has to be declared here:

storage:
  filesystems:
    - path: /var
      device: /dev/disk/by-id/$DISKID
      format: btrfs
      mount_options: [subvol=$SUBVOLUME1]
      ...
    - path: /var/home
      device: /dev/disk/by-uuid/$DISKUUID
      format: btrfs
      mount_options: [subvol=$SUBVOLUME2]
      ...

and not with systemd mount units, because of the order in which Ignition processes the config.

One issue is that you can't declare the same device twice, because Butane errors out with a duplication message. That is why I use $DISKID for the first mount and $DISKUUID for the second (both are symlinks to the same device). Another issue is that we are limited to mounting two subvolumes unless we can somehow create more persistent symlinks to the device. There is a discussion here with a bit more detail about my findings with this.
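The behavior described above suggests that the duplicate check compares device strings rather than the devices they resolve to; that is an inference from this workaround, not something confirmed from the Butane source. The following Python sketch demonstrates the difference using temporary files and symlinks as stand-ins for a device node and its /dev/disk/by-* links. All names are illustrative.

```python
import os
import tempfile
from typing import Callable, Iterable, List, Tuple


def find_duplicates(
    devices: Iterable[str], key: Callable[[str], str]
) -> List[str]:
    """Return devices whose key was already seen earlier in the list."""
    seen = set()
    duplicates = []
    for device in devices:
        identity = key(device)
        if identity in seen:
            duplicates.append(device)
        seen.add(identity)
    return duplicates


def demo() -> Tuple[int, int]:
    """Count duplicates found by string comparison vs. resolved target."""
    with tempfile.TemporaryDirectory() as root:
        node = os.path.join(root, "sda1")  # stand-in for the device node
        open(node, "w").close()
        by_id = os.path.join(root, "by-id-link")
        by_uuid = os.path.join(root, "by-uuid-link")
        os.symlink(node, by_id)
        os.symlink(node, by_uuid)
        devices = [by_id, by_uuid]
        by_string = find_duplicates(devices, key=str)
        by_target = find_duplicates(devices, key=os.path.realpath)
    return len(by_string), len(by_target)
```

A string comparison sees two distinct devices, while resolving the symlinks reveals that both entries point at the same node. This is also why the workaround is limited by the number of distinct persistent symlinks available for a device.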

@har7an

har7an commented Jan 20, 2024

@dsreyes1014

I understand how this mounts Btrfs subvolumes (and that's a pretty creative workaround btw), but this doesn't actually create them from a bare disk, or does it? Are you creating these subvolumes by hand in advance?

@dsreyes1014

@har7an

Not at the moment. Ignition doesn't have the capability to create and mount subvolumes. I am creating the subvolumes manually beforehand.

@har7an

har7an commented Mar 10, 2024

So, out of pure curiosity and since this is something I'd really like to see (All my FCOS instances run on Btrfs): Where would a change like this need to be implemented? I feel like this discussion isn't actively tracked or implemented anywhere, but maybe that's just my impression from following this thread. If my time permits and I don't have to modify a dozen repos all at once, maybe I'd give it a stab.

I think that having some implementation would at least allow us to discuss how we best implement it and what's feasible.

@ispanos

ispanos commented Aug 11, 2024

Is it possible to allow mounting multiple subvolumes by ignoring duplicate devices when format: btrfs is set? At least as a temporary workaround for those of us who already have a specific structure.

Projects
None yet
Development

No branches or pull requests

10 participants