PXE: ip=dhcp network configuration seems to stop boot progress #412

Closed
wkruse opened this issue Mar 6, 2020 · 11 comments

@wkruse

wkruse commented Mar 6, 2020

I am running Matchbox in a VirtualBox 6.1 VM and using Terraform v0.12.21 with terraform-provider-ct v0.4.0 and terraform-provider-matchbox v0.3.0 to create groups and profiles. I am trying to install a second VM (3 GB RAM, 2 CPUs, NIC: Intel PRO/1000 MT Desktop, 32 GB disk) from PXE.

This is the iPXE script provided by Matchbox:

#!ipxe
kernel /assets/fedora-coreos/31.20200210.3.0/fedora-coreos-31.20200210.3.0-live-kernel-x86_64 ip=dhcp rd.neednet=1 initrd=fedora-coreos-31.20200210.3.0-live-initramfs.x86_64.img audit=0 coreos.inst=yes coreos.inst.image_url=http://matchbox.foo:8080/assets/fedora-coreos/31.20200210.3.0/fedora-coreos-31.20200210.3.0-metal.x86_64.raw.xz coreos.inst.ignition_url=http://matchbox.foo:8080/ignition?uuid=${uuid}&mac=${mac:hexhyp} coreos.inst.install_dev=sda coreos.inst.platform_id=metal coreos.inst.insecure coreos.inst.skip_reboot console=tty0 console=ttyS0 systemd.journald.max_level_console=debug
initrd /assets/fedora-coreos/31.20200210.3.0/fedora-coreos-31.20200210.3.0-live-initramfs.x86_64.img
boot

PXE boot works

[Screenshots: pxe-boot-1, pxe-boot-4, pxe-boot-5]

but Ignition doesn't seem to run. I also don't see anything in the logs. How could I debug it further?

@dustymabe
Member

I'm working with @wkruse on IRC to see if we can get to the root of the problem.

@dghubble
Member

dghubble commented Mar 6, 2020 via email

@dustymabe
Member

I worked with him earlier. To simplify the process I had him just try to do a live PXE boot (no install). It seemed like we were able to fetch an ignition config but we could never get the system to get past the ignition fetch stage. His system would completely stop at:

[   32.699978] systemd[1]: Started Ignition (disks).
[   32.794904] audit: type=1130 audit(1583512308.873:8): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=kernel msg='unit=ignition-disks comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[   32.795789] ignition[782]: disks: disks passed
[   32.807232] ignition[782]: Ignition finished successfully
[  OK  ] Reached target Initrd Root Device.
[   32.813304] systemd[1]: Reached target Initrd Root Device.

Comparing this to a system that does boot properly, you can see that the next step is mounting the root filesystem at /sysroot/:

Mar 06 18:46:43 localhost systemd[1]: Started Ignition (disks).
Mar 06 18:46:43 localhost systemd[1]: Reached target Initrd Root Device.
Mar 06 18:46:43 localhost systemd-journald[297]: Missed 10 kernel messages
Mar 06 18:46:43 localhost kernel: audit: type=1130 audit(1583520403.835:10): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=kernel msg='unit=ignition-disks comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=>
Mar 06 18:46:43 localhost audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=kernel msg='unit=ignition-disks comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Mar 06 18:46:43 localhost systemd[1]: Mounting /sysroot...

@jlebon
Member

jlebon commented Mar 6, 2020

Probably worth trying again with systemd.log_level=debug systemd.log_target=console systemd.journald.forward_to_console=1 and see if anything obvious shows up.
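
As a rough sketch of how those options might be added: the kernel arguments in the iPXE script are a space-separated list, so the debug options can be appended to the end of it. The variable names below are illustrative only, and the base argument list is shortened from the script in the report.

```shell
# Kernel arguments from the original iPXE script (shortened for illustration).
base_args="ip=dhcp rd.neednet=1 console=tty0 console=ttyS0"

# Debug options suggested in the comment above.
debug_args="systemd.log_level=debug systemd.log_target=console systemd.journald.forward_to_console=1"

# Append the debug options to the existing argument list.
kernel_args="$base_args $debug_args"
printf '%s\n' "$kernel_args"
```

The resulting string would go after the kernel path on the `kernel` line of the iPXE script.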

@wkruse
Author

wkruse commented Mar 13, 2020

I tried it again with debug enabled; the output is in the gist.

@wkruse
Author

wkruse commented Mar 13, 2020

Using the same setup, a live PXE boot of CoreOS 2345.3.0 works; the output is in the gist.

@dustymabe
Member

Some more investigation: after removing ip=dhcp rd.neednet=1, @wkruse's system is able to come up. So it seems to be networking related. Still digging in to find out more.
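
For anyone repeating this test, one way to drop those two arguments from an existing command line is sketched below. `strip_args` is a hypothetical helper written for this example (it is not part of Matchbox or dracut), and the sample command line is shortened from the script in the report.

```shell
# Remove the given arguments from a space-separated kernel command line.
# strip_args is a hypothetical helper. The body runs in a subshell so that
# "set -f" (no globbing, e.g. for the "?" in URLs) does not leak out.
strip_args() (
    set -f
    cmdline=$1
    shift
    result=""
    for arg in $cmdline; do
        keep=1
        for drop in "$@"; do
            [ "$arg" = "$drop" ] && keep=0
        done
        [ "$keep" -eq 1 ] && result="${result:+$result }$arg"
    done
    printf '%s\n' "$result"
)

strip_args "ip=dhcp rd.neednet=1 console=tty0 console=ttyS0" ip=dhcp rd.neednet=1
# prints: console=tty0 console=ttyS0
```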

@wkruse
Author

wkruse commented Mar 13, 2020

Using the static config "ip=172.17.4.11::172.17.4.253:255.255.255.0:node01:eth0:none:172.17.4.253" fixes the boot problem for me. So it seems like DHCP works at the beginning (PXE boot), and then also while fetching the Ignition config, but stops after that...
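
To make the fields of that static configuration easier to read, here is a small sketch that splits it on ":". The field order (client IP, peer, gateway, netmask, hostname, interface, autoconf, DNS server) is my reading of the `ip=` syntax in dracut.cmdline(7), and the variable names are chosen for this example.

```shell
# Static configuration from the comment above, without the leading "ip=".
static_ip="172.17.4.11::172.17.4.253:255.255.255.0:node01:eth0:none:172.17.4.253"

# Split on ":"; the empty second field is the unused peer address.
IFS=: read -r client peer gateway netmask hostname iface autoconf dns1 <<EOF
$static_ip
EOF

printf 'client=%s gateway=%s netmask=%s hostname=%s iface=%s autoconf=%s dns1=%s\n' \
    "$client" "$gateway" "$netmask" "$hostname" "$iface" "$autoconf" "$dns1"
```

Here `autoconf=none` is what turns off DHCP for the interface, which matches the observation that the boot only hangs when DHCP is used.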

@dustymabe changed the title from "Installing from PXE boot doesn't work" to "PXE: ip=dhcp network configuration seems to stop boot progress" on Mar 13, 2020
@dustymabe
Member

@wkruse - any other information we can add here?

The latest testing release (31.20200323.2.0) has reworked networking in the initramfs, so maybe that will solve the issue. Can you try it?

@wkruse
Author

wkruse commented Mar 27, 2020

@dustymabe I guess I was hitting #358, as I had to run nmcli connection down eth0 and nmcli connection up eth0 after the live PXE boot in order to install to disk. After installing to disk and booting from disk for the first time, I had to run nmcli connection down eth0 and nmcli connection up eth0 again.
I will try 31.20200323.2.0 now.
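
For reference, the workaround above can be wrapped in a small function. This is only a sketch: `bounce_connection` is a hypothetical helper name, and `eth0` is the connection name from this report, which may differ on other systems.

```shell
# Take a NetworkManager connection down and bring it back up.
# bounce_connection is a hypothetical helper, not an nmcli feature.
bounce_connection() {
    conn=$1
    # Only bring the connection up if taking it down succeeded.
    nmcli connection down "$conn" && nmcli connection up "$conn"
}
```

It would be called as `bounce_connection eth0`, most likely with root privileges.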

@wkruse
Author

wkruse commented Mar 27, 2020

31.20200323.2.0 fixes it for me. DHCP live PXE boot, install to disk and reboot work now, the system doesn't stop anymore. Thanks a lot! 👍 🍻

@wkruse wkruse closed this as completed Mar 27, 2020