Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Impossible to use "metal-iso" from any partition #9538

Closed
vills opened this issue Oct 21, 2024 · 5 comments · Fixed by #9555
Closed

Impossible to use "metal-iso" from any partition #9538

vills opened this issue Oct 21, 2024 · 5 comments · Fixed by #9555
Assignees

Comments

@vills
Copy link

vills commented Oct 21, 2024

Bug Report

In docs it is stated, that any correctly labeled partition on block device can be used to provide configuration to Talos - https://www.talos.dev/v1.8/reference/kernel/#metal-iso. But is it not. Only labeled filesystem that occupies whole block device can be used.

Description

i digged the code and placing config file in partition doesn’t work. at least in version 1.7.7, that i checked. in readConfigFromISO definition probe.GetDevWithFileSystemLabel(constants.MetalConfigISOLabel) return correct device (partition) name (for example, /dev/loop0p7), but later in filesystem.Probe(dev.Device().Name()) .Name() return block device name without partition in it (/dev/loop0) and .Probe() can’t parse superblock because of it.

List of partitions:

NAME                     MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINTS NAME               FSTYPE            FSVER    LABEL     UUID
loop0                      7:0    0    10G  0 loop              loop0
├─loop0p1                259:0    0   100M  0 part              loop0p1            vfat              FAT32    EFI       A9C5-03C6
├─loop0p2                259:1    0     1M  0 part              loop0p2
├─loop0p3                259:2    0  1000M  0 part              loop0p3            xfs                        BOOT      8310ba69-b39b-482b-80b6-79ebbdf1ffc9
├─loop0p4                259:3    0     1M  0 part              loop0p4
└─loop0p5                259:4    0     9M  0 part              loop0p5            ext4              1.0      metal-iso 2a1e612c-a81d-49e2-80e9-d6c27679874d

Logs

metal-iso-logs.txt

Environment

  • Talos version: 1.7.7, 1.8.1
  • Kubernetes version: doesn't apply
  • Platform: libvirt/qemu/kvm

This patch should fix issue:

diff --git a/internal/app/machined/pkg/runtime/v1alpha1/platform/metal/metal.go b/internal/app/machined/pkg/runtime/v1alpha1/platform/metal/metal.go
index ccbf858f4..dfc456c24 100644
--- a/internal/app/machined/pkg/runtime/v1alpha1/platform/metal/metal.go
+++ b/internal/app/machined/pkg/runtime/v1alpha1/platform/metal/metal.go
@@ -132,7 +132,7 @@ func readConfigFromISO(ctx context.Context, r state.State) ([]byte, error) {
        //nolint:errcheck
        defer dev.Close()

-       sb, err := filesystem.Probe(dev.Device().Name())
+       sb, err := filesystem.Probe(dev.Path)
        if err != nil {
                return nil, err
        }
@@ -141,7 +141,7 @@ func readConfigFromISO(ctx context.Context, r state.State) ([]byte, error) {
                return nil, stderrors.New("error while substituting filesystem type")
        }

-       if err = unix.Mount(dev.Device().Name(), mnt, sb.Type(), unix.MS_RDONLY, ""); err != nil {
+       if err = unix.Mount(dev.Path, mnt, sb.Type(), unix.MS_RDONLY, ""); err != nil {
                return nil, fmt.Errorf("failed to mount iso: %w", err)
        }
@smira smira self-assigned this Oct 21, 2024
smira added a commit to smira/talos that referenced this issue Oct 23, 2024
Fixes siderolabs#9538

Re-do the implementation by using the volume management primitives, so
that we can avoid/skip old code. This should fix all issues related to
the partition/whole disk.

Fix issues in the volume management (exposed, as we haven't used it this
way before).

Build a test case in `talosctl cluster create` to inject machine config
via `metal-iso`.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
smira added a commit to smira/talos that referenced this issue Oct 25, 2024
Fixes siderolabs#9538

Re-do the implementation by using the volume management primitives, so
that we can avoid/skip old code. This should fix all issues related to
the partition/whole disk.

Fix issues in the volume management (exposed, as we haven't used it this
way before).

Build a test case in `talosctl cluster create` to inject machine config
via `metal-iso`.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
(cherry picked from commit d393938)
@rcarpa
Copy link

rcarpa commented Nov 15, 2024

Hi team.

Since this change, talos cannot automount the following block device with a configuration file on it. The config.yaml file located inside was automatically generated by terraform; its content is not important at this stage.
controlplane_config.raw.gz

It fails with the following error:
image

The problem persists both on 1.8.3 and 1.9.0-alpha2. Reverting to 1.8.1 makes it work again.

A slightly related, but different issue: I didn't go through the code very attentively, but looking at d393938 , I have a feeling that the ext2 file system isn't supported anymore because you require a FS from an explicit list in the code. Is this the case? I artificially inflated the size so that an ext4 file system is created by mke2fs for this test, but it creates an ext2 fs when the partition is smaller than 2MB.

@vills
Copy link
Author

vills commented Nov 15, 2024

ext4 isn't supported to (from my tests). XFS works well. Try it.

@rcarpa
Copy link

rcarpa commented Nov 15, 2024

ext4 isn't supported to (from my tests). XFS works well. Try it.

Thank you for confirming the issue and for the workaround. Is ext4 just plain unsupported by the team, or is it a bug?

@vills
Copy link
Author

vills commented Nov 15, 2024

Idk. I tried ext4 for metal-iso partition without luck. Switches to xfs and it solved my problem. Tried on 1.8.2.

@smira
Copy link
Member

smira commented Nov 18, 2024

extfs wasn't officially supported, but we plan to solve this partially in 1.9

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jan 18, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants