The destination of symbolic links under /dev/zvol is not created correctly. #15904

Closed
masato-yoshi-dnsalias-com opened this issue Feb 18, 2024 · 9 comments · Fixed by #15970
Labels
Type: Defect Incorrect behavior (e.g. crash, hang)

Comments

@masato-yoshi-dnsalias-com

System information

Type                  Version/Name
Distribution Name     Proxmox
Distribution Version  8.1
Kernel Version        6.5.11-7-pve
Architecture          x86_64
OpenZFS Version       2.2.2

Describe the problem you're observing

Create a volume (zvol) in ZFS.
When a partition is created on the volume, a symbolic link for it is created under /dev/zvol.
However, if the volume has 16 or more partitions, the symbolic links no longer point to the correct device files.
The link for the volume itself may also end up pointing to a partition device.

example
test-vol -> ../zd123
test-vol-part1 -> ../zd123p1

test-vol-part16 -> ../zd123p1

The major and minor numbers of partition devices numbered 16 and above follow a different scheme from those of partitions 1 through 15.

Describe how to reproduce the problem

zfs create -V 20G rpool/data/test-vol
gdisk /dev/zvol/rpool/data/test-vol
Create 16 or more GPT partitions.

Include any warning/errors/backtraces from the system logs

----- udev monitor -----
KERNEL[9554.307575] remove /devices/virtual/block/zd32/zd32p1 (block)
KERNEL[9554.307762] remove /devices/virtual/block/zd32/zd32p2 (block)
KERNEL[9554.307925] remove /devices/virtual/block/zd32/zd32p3 (block)
KERNEL[9554.308189] remove /devices/virtual/block/zd32/zd32p4 (block)
KERNEL[9554.308476] remove /devices/virtual/block/zd32/zd32p5 (block)
KERNEL[9554.308550] remove /devices/virtual/block/zd32/zd32p6 (block)
KERNEL[9554.308605] remove /devices/virtual/block/zd32/zd32p7 (block)
KERNEL[9554.308644] remove /devices/virtual/block/zd32/zd32p8 (block)
KERNEL[9554.308657] remove /devices/virtual/block/zd32/zd32p9 (block)
KERNEL[9554.308683] remove /devices/virtual/block/zd32/zd32p10 (block)
KERNEL[9554.308697] remove /devices/virtual/block/zd32/zd32p11 (block)
KERNEL[9554.308717] remove /devices/virtual/block/zd32/zd32p12 (block)
KERNEL[9554.308733] remove /devices/virtual/block/zd32/zd32p13 (block)
KERNEL[9554.308776] remove /devices/virtual/block/zd32/zd32p14 (block)
KERNEL[9554.308825] remove /devices/virtual/block/zd32/zd32p15 (block)
KERNEL[9554.308853] remove /devices/virtual/block/zd32/zd32p16 (block)
KERNEL[9554.308864] remove /devices/virtual/block/zd32/zd32p17 (block)
KERNEL[9554.309270] change /devices/virtual/block/zd32 (block)
KERNEL[9554.309335] add /devices/virtual/block/zd32/zd32p1 (block)
KERNEL[9554.309368] add /devices/virtual/block/zd32/zd32p2 (block)
KERNEL[9554.309406] add /devices/virtual/block/zd32/zd32p3 (block)
KERNEL[9554.309421] add /devices/virtual/block/zd32/zd32p4 (block)
KERNEL[9554.309439] add /devices/virtual/block/zd32/zd32p5 (block)
KERNEL[9554.309454] add /devices/virtual/block/zd32/zd32p6 (block)
KERNEL[9554.309473] add /devices/virtual/block/zd32/zd32p7 (block)
KERNEL[9554.309491] add /devices/virtual/block/zd32/zd32p8 (block)
KERNEL[9554.309519] add /devices/virtual/block/zd32/zd32p9 (block)
KERNEL[9554.309530] add /devices/virtual/block/zd32/zd32p10 (block)
KERNEL[9554.309539] add /devices/virtual/block/zd32/zd32p11 (block)
KERNEL[9554.309555] add /devices/virtual/block/zd32/zd32p12 (block)
KERNEL[9554.309575] add /devices/virtual/block/zd32/zd32p13 (block)
KERNEL[9554.309612] add /devices/virtual/block/zd32/zd32p14 (block)
KERNEL[9554.309623] add /devices/virtual/block/zd32/zd32p15 (block)
KERNEL[9554.309640] add /devices/virtual/block/zd32/zd32p16 (block)
KERNEL[9554.309662] add /devices/virtual/block/zd32/zd32p17 (block)
UDEV [9554.316582] remove /devices/virtual/block/zd32/zd32p1 (block)
UDEV [9554.317037] remove /devices/virtual/block/zd32/zd32p2 (block)
UDEV [9554.318366] remove /devices/virtual/block/zd32/zd32p7 (block)
UDEV [9554.318918] remove /devices/virtual/block/zd32/zd32p14 (block)
UDEV [9554.319830] remove /devices/virtual/block/zd32/zd32p3 (block)
UDEV [9554.320509] remove /devices/virtual/block/zd32/zd32p16 (block)
UDEV [9554.320805] remove /devices/virtual/block/zd32/zd32p4 (block)
UDEV [9554.321799] remove /devices/virtual/block/zd32/zd32p5 (block)
UDEV [9554.322379] remove /devices/virtual/block/zd32/zd32p9 (block)
UDEV [9554.323082] remove /devices/virtual/block/zd32/zd32p11 (block)
UDEV [9554.323318] remove /devices/virtual/block/zd32/zd32p13 (block)
UDEV [9554.324108] remove /devices/virtual/block/zd32/zd32p17 (block)
UDEV [9554.324343] remove /devices/virtual/block/zd32/zd32p12 (block)
UDEV [9554.324541] remove /devices/virtual/block/zd32/zd32p6 (block)
UDEV [9554.324812] remove /devices/virtual/block/zd32/zd32p15 (block)
UDEV [9554.324833] remove /devices/virtual/block/zd32/zd32p10 (block)
UDEV [9554.325440] remove /devices/virtual/block/zd32/zd32p8 (block)
UDEV [9554.328891] change /devices/virtual/block/zd32 (block)
UDEV [9554.338125] add /devices/virtual/block/zd32/zd32p7 (block)
UDEV [9554.339560] add /devices/virtual/block/zd32/zd32p12 (block)
UDEV [9554.341336] add /devices/virtual/block/zd32/zd32p1 (block)
UDEV [9554.341361] add /devices/virtual/block/zd32/zd32p9 (block)
UDEV [9554.343148] add /devices/virtual/block/zd32/zd32p5 (block)
UDEV [9554.343908] add /devices/virtual/block/zd32/zd32p3 (block)
UDEV [9554.345078] add /devices/virtual/block/zd32/zd32p14 (block)
UDEV [9554.345551] add /devices/virtual/block/zd32/zd32p11 (block)
UDEV [9554.345963] add /devices/virtual/block/zd32/zd32p15 (block)
UDEV [9554.347310] add /devices/virtual/block/zd32/zd32p6 (block)
UDEV [9554.348411] add /devices/virtual/block/zd32/zd32p2 (block)
UDEV [9554.348725] add /devices/virtual/block/zd32/zd32p8 (block)
UDEV [9554.348815] add /devices/virtual/block/zd32/zd32p10 (block)
UDEV [9554.349146] add /devices/virtual/block/zd32/zd32p17 (block)
UDEV [9554.349643] add /devices/virtual/block/zd32/zd32p4 (block)
UDEV [9554.350612] add /devices/virtual/block/zd32/zd32p16 (block)
UDEV [9554.353088] add /devices/virtual/block/zd32/zd32p13 (block)
KERNEL[9555.313007] remove /devices/virtual/block/zd32/zd32p1 (block)
KERNEL[9555.313103] remove /devices/virtual/block/zd32/zd32p2 (block)
KERNEL[9555.313362] remove /devices/virtual/block/zd32/zd32p3 (block)
KERNEL[9555.313879] remove /devices/virtual/block/zd32/zd32p4 (block)
KERNEL[9555.314046] remove /devices/virtual/block/zd32/zd32p5 (block)
KERNEL[9555.314179] remove /devices/virtual/block/zd32/zd32p6 (block)
KERNEL[9555.314309] remove /devices/virtual/block/zd32/zd32p7 (block)
KERNEL[9555.314506] remove /devices/virtual/block/zd32/zd32p8 (block)
KERNEL[9555.314670] remove /devices/virtual/block/zd32/zd32p9 (block)
KERNEL[9555.314807] remove /devices/virtual/block/zd32/zd32p10 (block)
KERNEL[9555.314933] remove /devices/virtual/block/zd32/zd32p11 (block)
KERNEL[9555.315060] remove /devices/virtual/block/zd32/zd32p12 (block)
KERNEL[9555.315189] remove /devices/virtual/block/zd32/zd32p13 (block)
KERNEL[9555.315318] remove /devices/virtual/block/zd32/zd32p14 (block)
KERNEL[9555.315449] remove /devices/virtual/block/zd32/zd32p15 (block)
KERNEL[9555.315582] remove /devices/virtual/block/zd32/zd32p16 (block)
KERNEL[9555.316093] remove /devices/virtual/block/zd32/zd32p17 (block)
UDEV [9555.317837] remove /devices/virtual/block/zd32/zd32p2 (block)
UDEV [9555.318004] remove /devices/virtual/block/zd32/zd32p1 (block)
KERNEL[9555.318130] change /devices/virtual/block/zd32 (block)
KERNEL[9555.318169] add /devices/virtual/block/zd32/zd32p1 (block)
KERNEL[9555.318263] add /devices/virtual/block/zd32/zd32p2 (block)
KERNEL[9555.318333] add /devices/virtual/block/zd32/zd32p3 (block)
KERNEL[9555.318407] add /devices/virtual/block/zd32/zd32p4 (block)
KERNEL[9555.318476] add /devices/virtual/block/zd32/zd32p5 (block)
KERNEL[9555.318537] add /devices/virtual/block/zd32/zd32p6 (block)
KERNEL[9555.318607] add /devices/virtual/block/zd32/zd32p7 (block)
KERNEL[9555.318668] add /devices/virtual/block/zd32/zd32p8 (block)
KERNEL[9555.318738] add /devices/virtual/block/zd32/zd32p9 (block)
KERNEL[9555.318810] add /devices/virtual/block/zd32/zd32p10 (block)
KERNEL[9555.318885] add /devices/virtual/block/zd32/zd32p11 (block)
KERNEL[9555.318948] add /devices/virtual/block/zd32/zd32p12 (block)
KERNEL[9555.319019] add /devices/virtual/block/zd32/zd32p13 (block)
KERNEL[9555.319080] add /devices/virtual/block/zd32/zd32p14 (block)
UDEV [9555.319155] remove /devices/virtual/block/zd32/zd32p4 (block)
KERNEL[9555.319175] add /devices/virtual/block/zd32/zd32p15 (block)
KERNEL[9555.319250] add /devices/virtual/block/zd32/zd32p16 (block)
KERNEL[9555.319314] add /devices/virtual/block/zd32/zd32p17 (block)
UDEV [9555.319881] remove /devices/virtual/block/zd32/zd32p12 (block)
UDEV [9555.321574] remove /devices/virtual/block/zd32/zd32p9 (block)
UDEV [9555.322132] remove /devices/virtual/block/zd32/zd32p7 (block)
UDEV [9555.322535] remove /devices/virtual/block/zd32/zd32p14 (block)
UDEV [9555.323161] remove /devices/virtual/block/zd32/zd32p15 (block)
UDEV [9555.323661] remove /devices/virtual/block/zd32/zd32p8 (block)
UDEV [9555.324129] remove /devices/virtual/block/zd32/zd32p17 (block)
UDEV [9555.325578] remove /devices/virtual/block/zd32/zd32p11 (block)
UDEV [9555.325603] remove /devices/virtual/block/zd32/zd32p5 (block)
UDEV [9555.325621] remove /devices/virtual/block/zd32/zd32p16 (block)
UDEV [9555.325638] remove /devices/virtual/block/zd32/zd32p13 (block)
UDEV [9555.325654] remove /devices/virtual/block/zd32/zd32p3 (block)
UDEV [9555.325670] remove /devices/virtual/block/zd32/zd32p6 (block)
UDEV [9555.325729] remove /devices/virtual/block/zd32/zd32p10 (block)
UDEV [9555.349470] change /devices/virtual/block/zd32 (block)
UDEV [9555.356982] add /devices/virtual/block/zd32/zd32p4 (block)
UDEV [9555.359976] add /devices/virtual/block/zd32/zd32p7 (block)
UDEV [9555.362572] add /devices/virtual/block/zd32/zd32p10 (block)
UDEV [9555.362664] add /devices/virtual/block/zd32/zd32p8 (block)
UDEV [9555.366217] add /devices/virtual/block/zd32/zd32p14 (block)
UDEV [9555.366242] add /devices/virtual/block/zd32/zd32p12 (block)
UDEV [9555.366676] add /devices/virtual/block/zd32/zd32p16 (block)
UDEV [9555.367651] add /devices/virtual/block/zd32/zd32p11 (block)
UDEV [9555.375981] add /devices/virtual/block/zd32/zd32p2 (block)
UDEV [9555.376015] add /devices/virtual/block/zd32/zd32p15 (block)
UDEV [9555.376036] add /devices/virtual/block/zd32/zd32p17 (block)
UDEV [9555.376053] add /devices/virtual/block/zd32/zd32p1 (block)
UDEV [9555.376071] add /devices/virtual/block/zd32/zd32p6 (block)
UDEV [9555.376092] add /devices/virtual/block/zd32/zd32p13 (block)
UDEV [9555.376118] add /devices/virtual/block/zd32/zd32p3 (block)
UDEV [9555.376146] add /devices/virtual/block/zd32/zd32p9 (block)
UDEV [9555.376177] add /devices/virtual/block/zd32/zd32p5 (block)
----- udev monitor -----
----- ls -l /dev/zvol/rpool/data -----
lrwxrwxrwx 1 root root 16 Feb 18 13:37 testvol -> ../../../zd32p16
lrwxrwxrwx 1 root root 15 Feb 18 13:37 testvol-part1 -> ../../../zd32p1
lrwxrwxrwx 1 root root 16 Feb 18 13:37 testvol-part10 -> ../../../zd32p10
lrwxrwxrwx 1 root root 16 Feb 18 13:37 testvol-part11 -> ../../../zd32p11
lrwxrwxrwx 1 root root 16 Feb 18 13:37 testvol-part12 -> ../../../zd32p12
lrwxrwxrwx 1 root root 16 Feb 18 13:37 testvol-part13 -> ../../../zd32p13
lrwxrwxrwx 1 root root 16 Feb 18 13:37 testvol-part14 -> ../../../zd32p14
lrwxrwxrwx 1 root root 16 Feb 18 13:37 testvol-part15 -> ../../../zd32p15
lrwxrwxrwx 1 root root 15 Feb 18 13:37 testvol-part2 -> ../../../zd32p2
lrwxrwxrwx 1 root root 16 Feb 18 13:37 testvol-part3 -> ../../../zd32p17
lrwxrwxrwx 1 root root 15 Feb 18 13:37 testvol-part4 -> ../../../zd32p4
lrwxrwxrwx 1 root root 15 Feb 18 13:37 testvol-part5 -> ../../../zd32p5
lrwxrwxrwx 1 root root 15 Feb 18 13:37 testvol-part6 -> ../../../zd32p6
lrwxrwxrwx 1 root root 15 Feb 18 13:37 testvol-part7 -> ../../../zd32p7
lrwxrwxrwx 1 root root 15 Feb 18 13:37 testvol-part8 -> ../../../zd32p8
lrwxrwxrwx 1 root root 15 Feb 18 13:37 testvol-part9 -> ../../../zd32p9
----- ls -l /dev/zvol/rpool/data -----
----- ls -l /dev -----
brw-rw---- 1 root disk 230, 32 Feb 18 13:29 /dev/zd32
brw-rw---- 1 root disk 230, 33 Feb 18 13:29 /dev/zd32p1
brw-rw---- 1 root disk 230, 42 Feb 18 13:29 /dev/zd32p10
brw-rw---- 1 root disk 230, 43 Feb 18 13:29 /dev/zd32p11
brw-rw---- 1 root disk 230, 44 Feb 18 13:29 /dev/zd32p12
brw-rw---- 1 root disk 230, 45 Feb 18 13:29 /dev/zd32p13
brw-rw---- 1 root disk 230, 46 Feb 18 13:29 /dev/zd32p14
brw-rw---- 1 root disk 230, 47 Feb 18 13:29 /dev/zd32p15
brw-rw---- 1 root disk 259, 1 Feb 18 13:29 /dev/zd32p16
brw-rw---- 1 root disk 259, 2 Feb 18 13:29 /dev/zd32p17
brw-rw---- 1 root disk 230, 34 Feb 18 13:29 /dev/zd32p2
brw-rw---- 1 root disk 230, 35 Feb 18 13:29 /dev/zd32p3
brw-rw---- 1 root disk 230, 36 Feb 18 13:29 /dev/zd32p4
brw-rw---- 1 root disk 230, 37 Feb 18 13:29 /dev/zd32p5
brw-rw---- 1 root disk 230, 38 Feb 18 13:29 /dev/zd32p6
brw-rw---- 1 root disk 230, 39 Feb 18 13:29 /dev/zd32p7
brw-rw---- 1 root disk 230, 40 Feb 18 13:29 /dev/zd32p8
brw-rw---- 1 root disk 230, 41 Feb 18 13:29 /dev/zd32p9
----- ls -l /dev -----

@masato-yoshi-dnsalias-com masato-yoshi-dnsalias-com added the Type: Defect Incorrect behavior (e.g. crash, hang) label Feb 18, 2024
@siv0
Contributor

siv0 commented Feb 23, 2024

Thanks for the report and the reproducer (did not have time to test this explicitly yet)!

From a very quick look at the source (grepping for zvol and udev in the source and looking at udev/zvol_id.c)
it seems that the minor is limited to 4 bits per zvol:

include/sys/fs/zfs.h:#define    ZVOL_MINOR_BITS         4
include/sys/fs/zfs.h:#define    ZVOL_MINOR_MASK         ((1U << ZVOL_MINOR_BITS) - 1)
include/sys/fs/zfs.h:#define    ZVOL_MINORS             (1 << 4)
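
To make the layout implied by these defines concrete, here is a minimal sketch of how a minor number under the zvol major splits into a zvol index and a slot. The three macros are copied from the grep output above; the helper names `zvol_index_of`, `zvol_slot_of` and `print_minor_layout` are made up for illustration and are not part of OpenZFS.

```c
#include <assert.h>
#include <stdio.h>

/* Values as quoted from include/sys/fs/zfs.h above. */
#define	ZVOL_MINOR_BITS		4
#define	ZVOL_MINOR_MASK		((1U << ZVOL_MINOR_BITS) - 1)
#define	ZVOL_MINORS		(1 << 4)

/* Hypothetical helper: which block of 16 minors (which zvol) owns minor_nr. */
static unsigned int
zvol_index_of(unsigned int minor_nr)
{
	return (minor_nr >> ZVOL_MINOR_BITS);
}

/* Hypothetical helper: slot inside that block; 0 is the zvol itself. */
static unsigned int
zvol_slot_of(unsigned int minor_nr)
{
	return (minor_nr & ZVOL_MINOR_MASK);
}

/* Hypothetical helper: print the decoded layout of one minor number. */
static void
print_minor_layout(unsigned int minor_nr)
{
	/* Sanity check: the mask covers exactly ZVOL_MINORS slots. */
	assert(ZVOL_MINOR_MASK + 1 == (unsigned int)ZVOL_MINORS);
	printf("minor %u: zvol #%u, slot %u\n", minor_nr,
	    zvol_index_of(minor_nr), zvol_slot_of(minor_nr));
}
```

Applied to the listing in the report, minor 32 (/dev/zd32) decodes to zvol #2, slot 0, and minor 47 (/dev/zd32p15) decodes to zvol #2, slot 15, which is the last slot in the block. Partition 16 therefore cannot be numbered inside the zvol's own block of minors.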

one option we might consider for Proxmox VE is setting the volmode property to dev for the zvols (and/or as default for newly created pools) - maybe this works for you as well?
Do you really use the partitions of a zvol (which in PVE is usually a guest-disk) on the host? (If yes, why? just so we don't miss a use-case we don't think about)

However I think this issue still should be addressed (e.g. by not trying to generate any devicelinks above the 16th partition at all)

@Fabian-Gruenbichler
Contributor

there seems to be an overflow at the moment:

  • ZFS uses 4 bits for the partitions (ZVOL_MINOR_BITS)
  • the kernel itself has 20 bits for the minor numbers (MINORBITS)
  • there is no masking/check in ZFS to ensure that a high partition number doesn't overflow the 20 MINORBITS (-> MINOR(dev))

in any case, since increasing ZVOL_MINOR_BITS would entail reducing the number of allowed zvols, and it also needs to be in sync between userspace (udev/zvol_id.c) and the kernel module, the easy "fix" would be to just warn about partitions >= 16 instead of creating wrong block devs and symlinks.

@masato-yoshi-dnsalias-com
Author

masato-yoshi-dnsalias-com commented Feb 24, 2024

Thank you for your reply.
I have now set volmode=dev on the volume so that no symbolic links are created for its partitions.
Proxmox VE hosts almost never use the partitions directly.
Because VMs are being migrated from VMware to Proxmox VE, there are cases where a VM fails to start and the volume is inspected with gdisk.
However, there is no case where the partition symbolic links themselves are needed.

My question: when performing ZFS replication with Proxmox VE, is the same volmode set at the replication destination?
(I suspect it is not.)
If it is not, the replicated data will be corrupted unless the volmode setting is changed before replication starts.

@masato-yoshi-dnsalias-com
Author

When replicating with Proxmox VE, the volmode was inherited at the replication destination.

@Fabian-Gruenbichler
Contributor

after a closer look, what I wrote above is not true at all (thankfully!).

ZFS itself only takes care of creating the zvol block devices (and those of snapshots, if snapdev is set accordingly). the partition devices (if volmode is set accordingly) are created by the kernel itself by scanning the zvol block device. since the zfs code sets the minor count according to its own hard coded shift value, the kernel will allocate the block devices for partitions outside of that count (16 at the moment, with the first slot taken by the zvol itself!) using a different major number. so the block devices themselves are actually fine, there is no unintended overflow there, and the block devices with the non-ZFS major number can be used just fine.

there is a bug in the udev helper zvol_id though, since that just blindly does:

	unsigned int dev_part = minor(sb.st_rdev) % ZVOL_MINORS;
	if (dev_part != 0)
		sprintf(zvol_name + strlen(zvol_name), "-part%u", dev_part);

which means that the 16th (or 32nd, or 48th, ...) partition might overwrite the symlink for the zvol itself, and any other partition > 16 might overwrite the symlink for its "remainder" sibling (17 -> 2, 18 -> 3, etc.). this is obviously bad for anything relying on the symlink, especially the main zvol one!
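
The collision can be reproduced outside udev with a small sketch. `legacy_link_name` is a hypothetical stand-in for the quoted logic, not the actual zvol_id code: it derives the link name from the minor number alone, as the snippet above does, and so it cannot tell which major the minor belongs to.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define	ZVOL_MINORS	(1 << 4)

/*
 * Hypothetical stand-in for the quoted zvol_id logic: build the symlink
 * name from the zvol name and the device's minor number only.
 */
static void
legacy_link_name(char *buf, size_t buflen, const char *zvol_name,
    unsigned int minor_nr)
{
	unsigned int dev_part = minor_nr % ZVOL_MINORS;

	assert(buf != NULL && zvol_name != NULL);
	if (dev_part != 0)
		(void) snprintf(buf, buflen, "%s-part%u", zvol_name, dev_part);
	else
		(void) snprintf(buf, buflen, "%s", zvol_name);
}
```

With the numbers from the original report, /dev/zd32p1 (230, 33) and /dev/zd32p16 (259, 1) both produce the name testvol-part1, so whichever udev event is processed last wins. A partition that happens to get a minor divisible by 16 under the other major produces the bare zvol name and replaces the link for the volume itself.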

I don't really see a way to fix this other than to make the udev helper only work on /dev/zd* (already implied atm, but not enforced) and simply parse the pXX suffix to detect and map partitions. the major device number is not reliable (it can be changed via a module parameter), and to compare with the "main" one (zvol and partitions < 16) to detect partitions >= 16 we'd need to parse and manipulate the device name anyway.
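
A rough sketch of what parsing the suffix could look like follows. It is an illustration of the idea only, not the patch that was eventually merged, and the function name `zd_partition_from_name` and its return convention are assumptions made for this example.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: return the partition number encoded in a kernel
 * device name such as "zd96p16", or 0 for a whole zvol such as "zd96".
 * Returns -1 if the name does not look like zdN or zdNpM.
 */
static long
zd_partition_from_name(const char *name)
{
	const char *p;
	char *end;
	long part;

	assert(name != NULL);
	if (strncmp(name, "zd", 2) != 0 || !isdigit((unsigned char)name[2]))
		return (-1);

	/* Skip the zvol's own number. */
	p = name + 2;
	while (isdigit((unsigned char)*p))
		p++;

	if (*p == '\0')
		return (0);	/* whole zvol, no partition suffix */
	if (*p != 'p' || !isdigit((unsigned char)p[1]))
		return (-1);

	part = strtol(p + 1, &end, 10);
	if (*end != '\0' || part <= 0)
		return (-1);
	return (part);
}
```

Because the result depends only on the device name, it is the same whether the kernel placed the partition under the zvol major or under a different one.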

@masato-yoshi-dnsalias-com
Author

masato-yoshi-dnsalias-com commented Feb 26, 2024

Below is the /dev/zd* listing after creating 60 partitions.
From the 16th partition onward the major number changes, and all of those partitions share the same new major number.
However, their minor numbers are not always assigned contiguously.

It seems that a correct symbolic link can only be created by using the number that follows zd?p in the device name.

----- ls -l /dev/zd* -----
brw-rw---- 1 root disk 230, 0 Feb 27 00:30 /dev/zd0
brw-rw---- 1 root disk 230, 1 Feb 27 00:30 /dev/zd0p1
brw-rw---- 1 root disk 230, 10 Feb 27 00:30 /dev/zd0p10
brw-rw---- 1 root disk 230, 11 Feb 27 00:30 /dev/zd0p11
brw-rw---- 1 root disk 230, 12 Feb 27 00:30 /dev/zd0p12
brw-rw---- 1 root disk 230, 13 Feb 27 00:30 /dev/zd0p13
brw-rw---- 1 root disk 230, 14 Feb 27 00:30 /dev/zd0p14
brw-rw---- 1 root disk 230, 15 Feb 27 00:30 /dev/zd0p15
brw-rw---- 1 root disk 259, 0 Feb 27 00:30 /dev/zd0p16
brw-rw---- 1 root disk 259, 1 Feb 27 00:30 /dev/zd0p17
brw-rw---- 1 root disk 259, 2 Feb 27 00:30 /dev/zd0p18
brw-rw---- 1 root disk 259, 9 Feb 27 00:30 /dev/zd0p19
brw-rw---- 1 root disk 230, 2 Feb 27 00:30 /dev/zd0p2
brw-rw---- 1 root disk 259, 10 Feb 27 00:30 /dev/zd0p20
brw-rw---- 1 root disk 259, 11 Feb 27 00:30 /dev/zd0p21
brw-rw---- 1 root disk 259, 12 Feb 27 00:30 /dev/zd0p22
brw-rw---- 1 root disk 259, 13 Feb 27 00:30 /dev/zd0p23
brw-rw---- 1 root disk 259, 14 Feb 27 00:30 /dev/zd0p24
brw-rw---- 1 root disk 259, 15 Feb 27 00:30 /dev/zd0p25
brw-rw---- 1 root disk 259, 16 Feb 27 00:30 /dev/zd0p26
brw-rw---- 1 root disk 259, 17 Feb 27 00:30 /dev/zd0p27
brw-rw---- 1 root disk 259, 18 Feb 27 00:30 /dev/zd0p28
brw-rw---- 1 root disk 259, 19 Feb 27 00:30 /dev/zd0p29
brw-rw---- 1 root disk 230, 3 Feb 27 00:30 /dev/zd0p3
brw-rw---- 1 root disk 259, 20 Feb 27 00:30 /dev/zd0p30
brw-rw---- 1 root disk 259, 21 Feb 27 00:30 /dev/zd0p31
brw-rw---- 1 root disk 259, 22 Feb 27 00:30 /dev/zd0p32
brw-rw---- 1 root disk 259, 23 Feb 27 00:30 /dev/zd0p33
brw-rw---- 1 root disk 259, 63 Feb 27 00:30 /dev/zd0p34
brw-rw---- 1 root disk 259, 64 Feb 27 00:30 /dev/zd0p35
brw-rw---- 1 root disk 259, 65 Feb 27 00:30 /dev/zd0p36
brw-rw---- 1 root disk 259, 66 Feb 27 00:30 /dev/zd0p37
brw-rw---- 1 root disk 259, 67 Feb 27 00:30 /dev/zd0p38
brw-rw---- 1 root disk 259, 68 Feb 27 00:30 /dev/zd0p39
brw-rw---- 1 root disk 230, 4 Feb 27 00:30 /dev/zd0p4
brw-rw---- 1 root disk 259, 69 Feb 27 00:30 /dev/zd0p40
brw-rw---- 1 root disk 259, 70 Feb 27 00:30 /dev/zd0p41
brw-rw---- 1 root disk 259, 71 Feb 27 00:30 /dev/zd0p42
brw-rw---- 1 root disk 259, 72 Feb 27 00:30 /dev/zd0p43
brw-rw---- 1 root disk 259, 73 Feb 27 00:30 /dev/zd0p44
brw-rw---- 1 root disk 259, 74 Feb 27 00:30 /dev/zd0p45
brw-rw---- 1 root disk 259, 75 Feb 27 00:30 /dev/zd0p46
brw-rw---- 1 root disk 259, 76 Feb 27 00:30 /dev/zd0p47
brw-rw---- 1 root disk 259, 77 Feb 27 00:30 /dev/zd0p48
brw-rw---- 1 root disk 259, 78 Feb 27 00:30 /dev/zd0p49
brw-rw---- 1 root disk 230, 5 Feb 27 00:30 /dev/zd0p5
brw-rw---- 1 root disk 259, 79 Feb 27 00:30 /dev/zd0p50
brw-rw---- 1 root disk 259, 80 Feb 27 00:30 /dev/zd0p51
brw-rw---- 1 root disk 259, 81 Feb 27 00:30 /dev/zd0p52
brw-rw---- 1 root disk 259, 82 Feb 27 00:30 /dev/zd0p53
brw-rw---- 1 root disk 259, 83 Feb 27 00:30 /dev/zd0p54
brw-rw---- 1 root disk 259, 84 Feb 27 00:30 /dev/zd0p55
brw-rw---- 1 root disk 259, 85 Feb 27 00:30 /dev/zd0p56
brw-rw---- 1 root disk 259, 86 Feb 27 00:30 /dev/zd0p57
brw-rw---- 1 root disk 259, 87 Feb 27 00:30 /dev/zd0p58
brw-rw---- 1 root disk 259, 88 Feb 27 00:30 /dev/zd0p59
brw-rw---- 1 root disk 230, 6 Feb 27 00:30 /dev/zd0p6
brw-rw---- 1 root disk 259, 89 Feb 27 00:30 /dev/zd0p60
brw-rw---- 1 root disk 230, 7 Feb 27 00:30 /dev/zd0p7
brw-rw---- 1 root disk 230, 8 Feb 27 00:30 /dev/zd0p8
brw-rw---- 1 root disk 230, 9 Feb 27 00:30 /dev/zd0p9
brw-rw---- 1 root disk 230, 16 Feb 27 00:20 /dev/zd16
brw-rw---- 1 root disk 230, 32 Feb 27 00:20 /dev/zd32
----- ls -l /dev/zd* -----

@Vanav

Vanav commented Feb 26, 2024

I can confirm this bug in real-world use.
The official Ubuntu 23.04, 23.10, and 24.04 cloud images (e.g. https://cloud-images.ubuntu.com/noble/20240220/noble-server-cloudimg-amd64.img) contain a partition 16, which overwrites the symlink /dev/zvol/rpool/data/vm-100-disk-0, and the VM fails to boot:

# fdisk -l /dev/zd96

Disk /dev/zd96: 3.5 GiB, 3758096384 bytes, 7340032 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 16384 bytes
I/O size (minimum/optimal): 16384 bytes / 16384 bytes
Disklabel type: gpt
Disk identifier: 1324DA90-D3AA-4979-BD95-A0A4EADEECA9

Device         Start     End Sectors  Size Type
/dev/zd96p1  2099200 7339998 5240799  2.5G Linux filesystem
/dev/zd96p14    2048   10239    8192    4M BIOS boot
/dev/zd96p15   10240  227327  217088  106M EFI System
/dev/zd96p16  227328 2097152 1869825  913M Linux extended boot

Partition table entries are not in disk order.


# ls -al /dev/zd96*

 230       96    3670016 zd96
 230       97    2620399 zd96p1
 230      110       4096 zd96p14
 230      111     108544 zd96p15
 259        0     934912 zd96p16

Error:

# ls -al /dev/zvol/rpool/data/vm-100-disk-0*

lrwxrwxrwx 1 root root 16 Feb 26 20:42 /dev/zvol/rpool/data/vm-100-disk-0 -> ../../../zd96p16
lrwxrwxrwx 1 root root 15 Feb 26 20:42 /dev/zvol/rpool/data/vm-100-disk-0-part1 -> ../../../zd96p1
lrwxrwxrwx 1 root root 16 Feb 26 20:42 /dev/zvol/rpool/data/vm-100-disk-0-part14 -> ../../../zd96p14
lrwxrwxrwx 1 root root 16 Feb 26 20:42 /dev/zvol/rpool/data/vm-100-disk-0-part15 -> ../../../zd96p15

Error: /dev/zvol/rpool/data/vm-100-disk-0 -> ../../../zd96p16
Correct:

lrwxrwxrwx 1 root root 14 Feb 26 20:41 /dev/zvol/rpool/data/vm-100-disk-0 -> ../../../zd96
lrwxrwxrwx 1 root root 16 Feb 26 20:41 /dev/zvol/rpool/data/vm-100-disk-0-part1 -> ../../../zd96p1
lrwxrwxrwx 1 root root 17 Feb 26 20:41 /dev/zvol/rpool/data/vm-100-disk-0-part14 -> ../../../zd96p14
lrwxrwxrwx 1 root root 17 Feb 26 20:41 /dev/zvol/rpool/data/vm-100-disk-0-part15 -> ../../../zd96p15
lrwxrwxrwx 1 root root 17 Feb 26 20:41 /dev/zvol/rpool/data/vm-100-disk-0-part3 -> ../../../zd96p16

Workaround: zfs set volmode=dev rpool/data

@Fabian-Gruenbichler
Contributor

all of that information is already in my previous reply..

to summarize:

  • the /dev/zd* block devices, including those for partitions with the p* suffix, are created correctly - but if there are many partitions, the higher partition indices might have a different major number and a minor that does not follow the numbering scheme
  • the udev zvol_id helper doesn't handle those partitions, and creates wrong symlinks -> this needs to be fixed, likely by just extracting the partition number from the pX suffix instead of deriving it from the minor device number

Fabian-Gruenbichler added a commit to Fabian-Gruenbichler/zfs that referenced this issue Mar 6, 2024
If a zvol has more than 15 partitions, the minor device number exhausts
the slot count reserved for partitions next to the zvol itself. As a
result, the minor number cannot be used to determine the partition
number for the higher partition, and doing so results in wrong named
symlinks being generated by udev.

Since the partition number is encoded in the block device name anyway,
let's just extract it from there instead.

Fixes: openzfs#15904

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
@Fabian-Gruenbichler
Contributor

https://bugzilla.proxmox.com/show_bug.cgi?id=5288 (just an example of a real-world image that when imported as zvol causes havoc)

behlendorf pushed a commit that referenced this issue Mar 21, 2024
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Closes #15904
Closes #15970
tonyhutter pushed a commit that referenced this issue May 2, 2024
datty pushed a commit to datty/zfsonlinux that referenced this issue Jun 13, 2024
For upstream issue and PR discussion see:
openzfs/zfs#15970
openzfs/zfs#15904

Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
lundman pushed a commit to openzfsonwindows/openzfs that referenced this issue Sep 4, 2024

4 participants