Generation 2 Hyper-V VM boots too fast for boot_command to trigger #7278

Closed
KimRechnagel opened this issue Feb 5, 2019 · 64 comments · Fixed by #7970
Labels
builder/hyperv, stage/waiting-on-upstream (this issue is waiting on an upstream change)

Comments

@KimRechnagel

KimRechnagel commented Feb 5, 2019

I'm trying to build a generation 2 Windows Server 2016 VM on Windows 10 with the hyper-v role installed. I have the exact same issue as janegilring in the quote below. More info in a minute, just need to get out of this "Reference new issue" popup which just closed on me.

"I was just setting up the same Packer build configuration in a different environment (lab - slower hardware).
The issue in that environment seems to be the opposite: While Packer is in the "Starting the virtual machine..." state, the VM has already started and the "Press any key to start installation" screen is gone when Packer gets to the waiting state. Even when setting the boot wait to 0 seconds, Packer is too slow to type the boot commands.
However, I suppose that's another issue so I'll create one after some more testing.

Originally posted by @janegilring in #6208 (comment)"

@KimRechnagel
Author

KimRechnagel commented Feb 5, 2019

Log output:
==> hyperv-iso: Creating build directory...
==> hyperv-iso: Retrieving ISO
hyperv-iso: Using file in-place: file:///C:/Automation/ISO/Newest2016/windows2016.ISO
==> hyperv-iso: Starting HTTP server on port 8068
==> hyperv-iso: Creating switch 'internal_switch' if required...
==> hyperv-iso: switch 'internal_switch' already exists. Will not delete on cleanup...
==> hyperv-iso: Creating virtual machine...
==> hyperv-iso: Enabling Integration Service...
==> hyperv-iso: Setting boot drive to os dvd drive C:/Automation/ISO/Newest2016/windows2016.ISO ...
==> hyperv-iso: Mounting os dvd drive C:/Automation/ISO/Newest2016/windows2016.ISO ...
==> hyperv-iso: Skipping mounting Integration Services Setup Disk...
==> hyperv-iso: Mounting secondary DVD images...
==> hyperv-iso: Mounting secondary dvd drive ./windows/2016/answer.iso ...
==> hyperv-iso: Configuring vlan...
==> hyperv-iso: Starting the virtual machine...
==> hyperv-iso: Attempting to connect with vmconnect...
==> hyperv-iso: Host IP for the HyperV machine: 192.168.10.103
==> hyperv-iso: Typing the boot command...
==> hyperv-iso: Waiting for WinRM to become available...

When Packer gets to the "Typing the boot command...." part the VM is already way past the "Press any key to boot from cd or dvd" prompt.

I have tried to start up in headless mode but the VM still starts too fast. I'm not really sure if there is any solution to this other than building an ISO which doesn't prompt me to press a key to start the installation. I have had plenty of success with building generation 1 VMs on the same Windows 10 machine, but I don't see the prompt here though. Below is the template I'm using.

@KimRechnagel
Author

    {
      "builders": [
        {
          "boot_wait": "0s",
          "boot_command": [ "aaaaaaa" ],
          "configuration_version": "9.0",
          "vm_name": "windows2016",
          "type": "hyperv-iso",
          "disk_size": 76800,
          "floppy_files": [],
          "secondary_iso_images": [
            "./windows/2016/answer.iso"
          ],
          "headless": false,
          "http_directory": "./windows/common/http/",
          "guest_additions_mode": "disable",
          "iso_url": "../ISO/Newest2016/windows2016.ISO",
          "iso_checksum_type": "none",
          "iso_checksum": "e3779d4b1574bf711b063fe457b3ba63",
          "communicator": "winrm",
          "winrm_username": "vagrant",
          "winrm_password": "vagrant",
          "winrm_timeout": "4h",
          "shutdown_command": "shutdown /s /t 10 /f /d p:4:2 /c \"Packer Shutdown\"",
          "ram_size": 2048,
          "cpu": 1,
          "generation": 2,
          "switch_name": "internal_switch",
          "enable_secure_boot": true
        }
      ],
      "provisioners": [
        {
          "type": "powershell",
          "elevated_user": "vagrant",
          "elevated_password": "vagrant",
          "scripts": [
            "./windows/common/cleanup.ps1"
          ]
        }
      ],
      "post-processors": [
        {
          "type": "vagrant",
          "keep_input_artifact": false,
          "output": "{{.Provider}}_windows-2016.box"
        }
      ]
    }

@KimRechnagel changed the title from "Generation 2 Hyper-V VM boots to fast for boot_command to trigger" to "Generation 2 Hyper-V VM boots too fast for boot_command to trigger" Feb 5, 2019
@SwampDragons
Contributor

@marcinbojko I know you've done a lot with generation 2 Windows VMs -- do you have any insights for a workaround here? I don't think there's really anything Packer can do here because Gen 2 VMs just blast through the boot sequence so fast.

@marcinbojko
Contributor

@SwampDragons - what's funny - having lots of different Hyper-V stacks (different bare metal and versions) and different DVDs/ISOs to test, I can say one thing: it's unpredictable ;)
Unfortunately, the only workaround I've found is to make a boot loop with:

      "boot_command": [
        "a<enter><wait>a<enter><wait>a<enter><wait>a<enter>"
      ],
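
When more attempts are needed, the same retry pattern can be generated instead of typed by hand. The Python sketch below is only an illustration: `build_boot_command` is a hypothetical helper, not part of Packer, and it relies on Packer's `<wait>` token (a one-second pause) between keypresses.

```python
import json


def build_boot_command(key="a", repeats=4, press_enter=True):
    """Return a Packer-style boot_command list that repeats one keypress.

    Hypothetical helper for illustration only; it is not part of Packer.
    """
    # One attempt is the key itself, optionally followed by <enter>.
    step = key + ("<enter>" if press_enter else "")
    # Separate the attempts with <wait> so there is a pause between them.
    return ["<wait>".join([step] * repeats)]


if __name__ == "__main__":
    # Print a fragment that can be pasted into a builder block.
    print(json.dumps({"boot_command": build_boot_command()}, indent=2))
```

With the defaults this reproduces the string above; raising `repeats` stretches the window during which keys are sent, at the cost of a longer wait before the installer is left alone.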

@marcinbojko
Contributor

marcinbojko commented Feb 5, 2019

@SwampDragons I'd suggest maybe using a feature called 'start delay', as it's better for Packer to wait a second or ten than to just let a Gen 2 VM fly.
image

The name of the feature is shown here:

 get-vm -Name ito-el6-n1.spcph.local|select name,automaticstartdelay

Name                   AutomaticStartDelay
----                   -------------------
ito-el6-n1.spcph.local                   0

@SwampDragons
Contributor

Startup_delay is a great hint! I'll add it to the hyper-v docs.

@KimRechnagel
Author

KimRechnagel commented Feb 6, 2019

Hey guys, I really appreciate your suggestions on the issue here. Unfortunately the AutomaticStartDelay setting won't help much here as it doesn't slow down the boot process when the VM gets the initial start trigger.

What AutomaticStartDelay really does is prevent a boot storm when a Hyper-V host, or an entire Hyper-V cluster, running many VMs is rebooted.

Example:
VM1 is running on host1
VM1 has AutomaticStartDelay set to 60 seconds
Host1 is rebooted
VM1 was originally running on host1 prior to reboot so VM1 will automatically startup again when the hyper-v service has started
Hyper-V waits 60 seconds before powering on/starting VM1
After 60 seconds VM1 powers on and runs through the boot process as fast as possible.
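
The sequence above can be restated as a tiny model. This Python sketch only encodes the behaviour described in this comment; it does not talk to Hyper-V, and the function name and trigger labels are made up for illustration.

```python
def seconds_before_power_on(trigger, automatic_start_delay):
    """Model how long Hyper-V waits before powering on a VM.

    Illustration of the behaviour described in the comment, not Hyper-V code.
    trigger: "host_reboot" when the host restarts a VM that was running
             before the reboot, "manual" for an explicit start request
             (the case that applies to a Packer build).
    """
    if trigger == "host_reboot":
        # The delay only staggers automatic starts after the host comes up.
        return automatic_start_delay
    if trigger == "manual":
        # An explicit start request is not delayed at all.
        return 0
    raise ValueError("unknown trigger: " + repr(trigger))


if __name__ == "__main__":
    for trigger in ("host_reboot", "manual"):
        print(trigger, seconds_before_power_on(trigger, 60))
```

In both cases the boot sequence itself runs at full speed once the VM is powered on, which is why the setting does not widen the "Press any key" window.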

I'll take a look at the boot_command tweak suggested here. My boot_command string is currently: "boot_command": [ "a<wait>a<wait>a<wait>a<wait>a<wait>a<wait>a" ],

It doesn't seem to have any effect in the VM though as I don't see the VM rebooting multiple times. It could actually work though I guess. I'll grab some screenshots in order to get you a better understanding of what happens at my end.

@KimRechnagel
Author

Hmm I tried the following settings, but the VM doesn't seem to get any input from packer at all:
"boot_wait": "5s",
"boot_command": ["<leftCtrlOn><leftAltOn><endOn><leftCtrlOff><leftAltOff><endOff><wait>a<enter>"],

@marcinbojko
Contributor

@KimRechnagel - my initial understanding of your problem was that Packer was too slow to start interfering with the VM's boot menu - in which case AutomaticStartDelay is the key to it.
I don't recall having these issues, even on super-duper fast hosts with SSD storage.
Could you start gathering data? Packer version, and which terminal you're using (cmd, PowerShell, ConEmu).
Also, let's try changing the ISO, as I recall some of the latest releases (I am using the Partner channel, though) had problems with boot_command - can you download and check just the generic Windows 2016 Evaluation ISO?
Last but not least, could you try my templates?
https://github.com/marcinbojko/hv-packer

@marcinbojko
Contributor

@SwampDragons - it's not so great, as it has to be set by Packer during VM creation ;) I'd suggest adding this option to the packer commands (and of course to the code as well) to be able to slow things down a little for super fast VMs.

@KimRechnagel
Author

KimRechnagel commented Feb 6, 2019

@marcinbojko Yes, packer is too slow to start interfering with the boot menu, or the VM is too fast for vmconnect.exe, which I can see in the code that packer is using, to connect to the VM.

I'm not trying to be rude, but AutomaticStartDelay has nothing to do with this issue, as this setting works exactly as I described above. I tested it locally on my machine by setting AutomaticStartDelay to 10 seconds and then starting the VM. It doesn't delay anything after the start request has been sent to the VM; it just tells the host to wait X seconds before sending the start request to the VM when, e.g., the host has been rebooted.

I'll test with another ISO and will also collect data about my system, versions etc. as per your suggestion.

Thanks for your feedback.

@marcinbojko
Contributor

@KimRechnagel - no worries, startdelay would be recommended in our first understanding of your problem - which we already ruled out.

@KimRechnagel
Author

Hmm maybe the "solution" could be as simple as getting packer to connect to the VM before sending the Start-VM cmdlet.

I just "tested" it manually and what happens is that I connect to the VM and see the black console. When I hit the start button it still takes vmconnect about 3-4 seconds to actually display the boot screen. I see the "Press any key to boot from CD or DVD..." for about 1 second before it times out and tries to PXE boot instead.

I guess the issue might just be that vmconnect.exe is too slow to connect. Well, I'll look into that as well.
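
To make the race concrete, here is a rough Python model of when keystrokes arrive relative to the prompt. All timings are illustrative assumptions loosely based on the numbers above, and `keys_in_prompt_window` is a hypothetical helper, not anything Packer exposes.

```python
def keys_in_prompt_window(connect_delay, boot_wait, key_interval, key_count,
                          prompt_open, prompt_close):
    """Return the keystroke send times that land inside the prompt window.

    All times are seconds after the VM starts. This is a simplified model:
    the first key goes out once the console is attached (connect_delay) and
    boot_wait has elapsed, and later keys follow at a fixed interval.
    """
    first_key = connect_delay + boot_wait
    send_times = [first_key + i * key_interval for i in range(key_count)]
    return [t for t in send_times if prompt_open <= t <= prompt_close]


if __name__ == "__main__":
    # Assumed numbers: console attaches 3.5 s after start, prompt gone by 3 s.
    late = keys_in_prompt_window(3.5, 0, 1.0, 7, 0, 3.0)
    # Same prompt window, but with the console attached before the VM starts.
    early = keys_in_prompt_window(0, 0, 1.0, 7, 0, 3.0)
    print("keys landing when attaching late:", len(late))
    print("keys landing when attaching early:", len(early))
```

Under these assumed timings no key lands when the console attaches late, while several land when it is attached up front. Whether attaching earlier helps in practice depends on the real timings on a given host.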

@marcinbojko
Contributor

@KimRechnagel - what would happen if you switched to enhanced session (in vmconnect) for this particular Packer VM?

@KimRechnagel
Author

KimRechnagel commented Feb 6, 2019

@marcinbojko Enhanced session was already enabled. I disabled it but unfortunately it didn't change anything.

I did test something else, although it raises a lot of other challenges with DHCP/PXE etc. If I change the boot order to be:
Hard drive
Network Adapter
DVD Drive (my install ISO)
DVD Drive (answer.iso with autounattend.xml etc.)

Then the VM waits for PXE to time out, and vmconnect has plenty of time to connect to the VM. The problem with this is that I then only have a small window to send the boot commands, between the end of the PXE timeout and the moment the "Press any key to boot...." prompt times out. Furthermore, if I had a DHCP/BOOTP server on my network, that would complicate the boot process even more.

A question regarding boot_command on Hyper-V: the documentation states that I can add "On" to e.g. <LeftCtrl> in order for Packer to hold down the key, which would allow me to send Ctrl, Alt, End (reboot), but it doesn't seem to work. Maybe because the scancodes haven't been implemented in the same way on Hyper-V as on e.g. VirtualBox, VMware etc.?

I tried with
"boot_command": ["<leftCtrlOn><leftAltOn><endOn><leftCtrlOff><leftAltOff><endOff><wait>a<enter>"],
But it didn't do anything. Well, maybe my issue is that the boot_command isn't sent at all :-)

I never saw the "Press any key to boot..." when creating Gen1 VMs, so I don't actually know if boot_command works on my setup.

Still waiting for the eval ISO to download.

@marcinbojko
Contributor

My current settings:
image

@KimRechnagel
Author

How far into the installation is this? Did it just start?
I don't see the bootmgfw.efi in my settings.

@marcinbojko
Contributor

That's interesting - my Packer build is just going through the 3rd batch of WU.
As far as I know (in 2016/2019) Gen2 machine should have this file.

@KimRechnagel
Author

I tested the templates you linked from your github repo. I used the ISO which I have downloaded from the VLSC site. Same issue. I'll test again with the eval ISO in about 20 minutes when it has finished downloading.

My settings with your template:
image

@KimRechnagel
Author

It seems like your Hyper-V host is physical, or at least running on Server 2016? I'm testing on my laptop with the latest version of Windows 10. It might make a difference when building Gen 2 machines.

@marcinbojko
Contributor

True. I am not a windows guy, however I'll try with w10.

@KimRechnagel
Author

Ok the evaluation ISO finished downloading. I didn't change anything but the ISO, I used your templates... and it works. It's very odd... it seems like the eval ISO waits just about 1-2 seconds longer at the "Press any key to boot" prompt, which means that packer has time to connect and send the boot_command.

@marcinbojko
Contributor

Yup, that's what I noticed in the thread you mentioned. Switching to a different ISO (Partner channel) broke my deployment flow. BLAME Microsoft?

@KimRechnagel
Author

It does not work with the template I modified myself. I tested two times now and the boot_command does not seem to be sent. I'll tweak the settings one line at a time until I figure out what triggers this.

@KimRechnagel
Author

KimRechnagel commented Feb 6, 2019

Wow this is weird. I managed to "break" your template as well by changing: "iso_url": ".\\iso\\Windows_Server_2016_Datacenter_EVAL_en-us_14393_refresh.ISO",

To: "iso_url": "../ISO/Newest2016/Windows_Server_2016_Datacenter_EVAL_en-us_14393_refresh.ISO",

Changed it back, and it worked again

@KimRechnagel
Author

Well, now your template fails again. It consistently failed three times in a row. This is odd. There is a very very fine balance between when it works and not.
I'll keep testing.

@SwampDragons added the stage/waiting-on-upstream label (this issue is waiting on an upstream change) and removed the bug and question labels Feb 6, 2019
@marcinbojko
Contributor

@SwampDragons sorry for replying on a closed issue - I'd like to try an approach with -AutomaticStartDelay passed to New-VM or Set-VM. So the sequence would be: run vmconnect and WAIT for the VM to start.
The problem is I have absolutely no clue about Golang. If it's not too much trouble, can you point me to the piece of code that builds or sets the 'new-vm' or 'set-vm' part?

@SwampDragons reopened this Feb 6, 2019
@SwampDragons
Contributor

Ah, sorry; didn't realize you were thinking of adding this option. The PowerShell scripts that comprise the hyperv driver are here, and the new-vm code specifically is here

The new-vm code uses golang templating to produce a minimal powershell script and allow us to work around passing a ton of parameters into our Powershell call.

@KimRechnagel
Author

@marcinbojko I tested your template on a standalone physical Dell PowerEdge 815 Hyper-V 2012 R2 host with local hard drives. The funny thing is that I see the same behavior as you. The VM starts, I see the "Press any key" prompt for maybe 3-4 seconds (Packer seems to be connected here), then the VM goes into PXE boot, times out after 60 seconds, goes back to "Press any key" and THEN starts the installation.

@marcinbojko
Contributor

2012/2016 and Windows 10 up to 1803. On Windows 10 1809 / Server 2019, Packer is unusable.

@KimRechnagel
Author

@marcinbojko Just an update. I have built a nested hyper-v host on a hyper-v 2016 cluster. I have used my original 1809 ISO from the VLSC site as well as the evaluation ISO and so far I have not had any issues with packer connecting too slow. It seems like vmconnect.exe connects way faster in my current setup, so missing the boot_command is not an issue.

@marcinbojko
Contributor

Interesting. DNS issues?

@KimRechnagel
Author

I don't think so, as it wouldn't make sense if vmconnect relies on DNS to lookup local VMs.

@marcinbojko
Contributor

I suppose your nested HV is a standalone, out of AD host?

@KimRechnagel
Author

Yes, it's a standalone Hyper-V host. Its only purpose is building Packer templates which I'm going to use on an Ansible server elsewhere in our infrastructure. The Packer Hyper-V host is a member of an AD and uses the AD DNS servers. The Hyper-V host is also a DHCP server exclusively for the Packer templates.

@ladar
Contributor

ladar commented Mar 1, 2019

"I have tried to start up in headless mode but the VM still starts too fast."

I wish I had this bug. It takes 48 hours for all of my Hyper-V images to build. Sometimes longer if any of the builds get stuck at the "Waiting for ssh access" stage.

In regards to a solution, I had a couple of ideas you could try (I obviously don't have the hardware to do so myself)... namely, try adding PXE (aka the Network Boot option) to your boot order. That might buy the seconds you need. You can modify this chunk of packer code to include the network boot option. Just update the powershell command packer is using.

A more reliable kludge might be booting the machine without an ISO (or using a non-bootable dummy ISO if it's needed to ensure a DVD drive is provisioned)... and then waiting for the new guest to reach the UEFI error screen (shown below). From there you can mount the actual installation ISO, and with a sufficient delay, have packer start the boot command/install process by first triggering a reboot, (aka press any key) onto the freshly mounted ISO.

If the latter idea works, (aka if you test it by mounting/swapping the ISO manually), then we know a potential fix, and the focus could shift to making packer use this strategy.

screenshot from 2019-02-28 18-04-30

@Geogboe

Geogboe commented Apr 1, 2019

I've been encountering this same issue and wanted to detail something that appears to be working for me.

First off my environment:

  • Host: windows 10 version 1809
  • HW: pcie SSD
  • Packer v1.3.5
  • Powershell 5.1
  • OS: Windows Server 2019 Eval ( from Microsoft website )
  • Generation 2, verified DVD as first boot option
  • NOT headless. vmconnect running

Issue:
As others have mentioned, when the VM is first booted, it runs through the boot options before packer ever connects leaving you stuck at this screen:

image

note: This screen is actually frozen. It took me a while to realize this but if you've made it this far, packer can NO longer send boot commands. I kept trying to send ctrl+alt+del/ctrl+shift+alt+del to try and reboot the machine and nothing was happening.

Solution

The solution for me, which feels very much like a hack, was to wait until the frozen boot screen times out (about 60s). After that, this error screen will launch:

image

The good news is, once you get to the error screen, Packer can start sending boot commands again, so from here you just need to tab down and press enter to restart.

Here's the code:

            "boot_wait": "70s",
            "boot_command": [
                "<tab><wait><enter><wait>",
                "a<wait>a<wait>a<wait>a<wait>a<wait>a<wait>"
            ],

@pjstam

pjstam commented Aug 6, 2019

Why not create an image with the efisys_noprompt.bin file and skip this 'press any key' prompt entirely? It works for all hypervisors. Don't forget to get a new hash (Get-FileHash).
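
A rebuilt ISO has a different checksum, so the `iso_checksum` value in the template has to be regenerated. `Get-FileHash` uses SHA256 by default; where PowerShell is not at hand, a Python equivalent might look like the sketch below. The function name is a placeholder, and the ISO path is whatever is passed on the command line.

```python
import hashlib
import sys


def file_sha256(path, chunk_size=1024 * 1024):
    """Return the SHA256 hex digest of a file, read in chunks.

    Reading in chunks avoids loading a multi-gigabyte ISO into memory.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    # Usage (script name is arbitrary): python iso_hash.py path/to/rebuilt.iso
    if len(sys.argv) > 1:
        print(file_sha256(sys.argv[1]))
```

The printed value would go into `iso_checksum`, with `iso_checksum_type` set to `sha256` so that the two agree.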

@SwampDragons
Contributor

We've got a PR containing a potential fix for this, if anyone is up for testing it out:

https://circleci.com/gh/hashicorp/packer/9097#artifacts/containers/0

@mosby797

@SwampDragons - this worked for me.

@Wasapl

Wasapl commented Sep 26, 2019

@SwampDragons Thank you, this works for us also!
Do you know when 1.4.4 will be released?

IlyaFinkelshteyn pushed a commit to appveyor/build-images that referenced this issue Sep 26, 2019
@SwampDragons
Contributor

Next week, probably Tuesday. :)

@Wasapl

Wasapl commented Oct 8, 2019

@SwampDragons, thank you for your hard work! Packer 1.4.4 has been released and can be found on the https://packer.io/downloads.html page.
Will it be released at https://github.com/hashicorp/packer/releases?
Will docker images be updated at hub.docker.com?

@SwampDragons
Contributor

Ah, I will fix that link. However, 1.4.4 had a critical HyperV bug that we didn't catch until just after releasing -- you should probably use the nightly build that is already linked there.

@marcinbojko
Contributor

Oh, I think it was my doing. I was fixing this broken indentation and it slipped somehow.

@SwampDragons
Contributor

@marcinbojko It's my fault for not catching it in tests; it's already fixed on the master branch, so no worries.

@ghost

ghost commented Jan 23, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@ghost locked and limited conversation to collaborators Jan 23, 2020
8 participants