azurerm_windows_virtual_machine fails to create when referencing a "specialised" shared gallery image. #7772

Open
woter1832 opened this issue Jul 16, 2020 · 24 comments

Comments

@woter1832
Contributor

woter1832 commented Jul 16, 2020

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Terraform (and AzureRM Provider) Version

Terraform v0.12.28
+ provider.azuread v0.10.0
+ provider.azurerm v2.18.0

Affected Resource(s)

  • azurerm_windows_virtual_machine

Terraform Configuration Files

resource "azurerm_windows_virtual_machine" "pool" {
  for_each = {
    for key, i in local.vm_config :
    key => i
  }
  
  name                  = each.value.vm_name
  resource_group_name   = var.resource_group_name
  location              = var.location
  size                  = each.value.vm_size
  admin_username        = each.value.admin_username
  admin_password        = each.value.admin_password
  network_interface_ids = each.value.network_interface_ids
  license_type          = each.value.license_type
  timezone              = each.value.timezone
  
  source_image_id       = each.value.source_image_id
 #redacted source_image_id value: /subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/rgvmimagedemo/providers/Microsoft.Compute/galleries/myGallary/images/windows_server_2019/versions/0.0.1
  
  tags                  = merge(var.tags, { purpose = each.value.purpose }, { node = each.value.node_key + 1 })
  
  os_disk {
    caching              = each.value.os_disk_caching
    storage_account_type = each.value.os_disk_storage_account_type
  }

  # source_image_reference {
  #   publisher = "MicrosoftWindowsServer"
  #   offer     = "WindowsServer"
  #   sku       = "2019-Datacenter"
  #   version   = "latest"
  # }
}

Debug Output

Panic Output

Expected Behavior

Deploy two Windows VMs from "Shared Gallery Image"

Actual Behavior

A spurious error that does not relate to the issue.

module.compute_region_0.module.windows_virtual_machine.azurerm_windows_virtual_machine.pool["1"]: Creating...
Error: creating Windows Virtual Machine "vm1" (Resource Group "rgvmimagedemo"): compute.VirtualMachinesClient#CreateOrUpdate: Failure sending request: StatusCode=400 -- Original Error: Code="InvalidParameter" Message="Parameter 'osProfile' is not allowed." Target="osProfile"

  on ..\azurerm_windows_virtual_machine\main.tf line 82, in resource "azurerm_windows_virtual_machine" "pool":
  82: resource "azurerm_windows_virtual_machine" "pool" {

Repeats for VM2.
(Line 82 is the beginning of the resource: resource "azurerm_windows_virtual_machine" "pool" {)

If this isn't a bug, please consider updating the documentation in a way that would have prevented my mistake, perhaps with an example of using source_image_id.

Steps to Reproduce

  1. terraform apply

Important Factoids

This may be related to #5998 although the error is different.

Plan succeeds.

I did a test by creating the old os_profile{} block, adding admin_username and admin_password which made the plan fail.

If this is the same regression as mentioned in #5998, can you check azurerm_linux_virtual_machine too, please?

If I comment out source_image_id and use source_image_reference instead, apply works and the VMs deploy.

T. I. A.

References

There is an unanswered related question on your discuss site: https://discuss.hashicorp.com/t/unable-to-create-azure-windows-vm/6672

@woter1832
Contributor Author

The documentation is not clear on exactly what value source_image_id takes. However, I have also tried pointing it at the image rather than the version, with the same error:

source_image_id value: /subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/rgvmimagedemo/providers/Microsoft.Compute/galleries/myGallary/images/windows_server_2019

I also note that Resource Explorer does not show any galleries resources.

@woter1832
Contributor Author

To try to rule out any stupidity on my part, I've used the azurerm_shared_image_version data source and consumed its id attribute. Same result.
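A minimal sketch of that wiring, for reference. The gallery, image, and version names come from the redacted ID in the original report; the variables `admin_password` and `network_interface_id`, the location, and the VM size are placeholders introduced only for this illustration, not the reporter's actual configuration.

```hcl
provider "azurerm" {
  features {}
}

# Hypothetical inputs, introduced only for this sketch.
variable "admin_password" {
  type = string
}

variable "network_interface_id" {
  type = string
}

# Look up an existing image version in the shared image gallery.
data "azurerm_shared_image_version" "win2019" {
  name                = "0.0.1"
  image_name          = "windows_server_2019"
  gallery_name        = "myGallary"
  resource_group_name = "rgvmimagedemo"
}

resource "azurerm_windows_virtual_machine" "example" {
  name                  = "vm1"
  resource_group_name   = "rgvmimagedemo"
  location              = "westeurope"
  size                  = "Standard_B1s"
  admin_username        = "LocalAdmin"
  admin_password        = var.admin_password
  network_interface_ids = [var.network_interface_id]

  # Full resource ID of the image version, taken from the data source.
  source_image_id = data.azurerm_shared_image_version.win2019.id

  os_disk {
    caching              = "ReadWrite"
    storage_account_type = "Standard_LRS"
  }
}
```

As reported later in this thread, a configuration of this shape appears to succeed only when the image definition is generalised; with a specialised image it fails with the osProfile error shown above.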

@woter1832
Contributor Author

woter1832 commented Jul 21, 2020

At the behest of Terraform support, I created a cut-down version of the code which still gave the same error.

# #############################################
# VARIABLES
# #############################################

variable "client_secret" {
    type = string
}

# #############################################
# PROVIDERS
# #############################################

provider "azurerm" {
    subscription_id = "00000000-0000-0000-0000-000000000000"
    tenant_id       = "00000000-0000-0000-0000-000000000000"
    client_id       = "00000000-0000-0000-0000-000000000000"
    client_secret   = var.client_secret
    version         = "2.18.0"
    features {}
}

# #############################################
# VERSION AND REMOTE STATE
# #############################################

terraform {
    required_version = ">=0.12"
}

resource "azurerm_windows_virtual_machine" "test" {
    name                  = "myVm01"
    resource_group_name   = "rgvmimagedemo"
    location              = "westeurope"
    size                  = "Standard_B1s"
    admin_username        = "LocalAdmin"
    admin_password        = "Pa$$w0rd!2345"
    network_interface_ids = ["/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/rgvmimagedemo/providers/Microsoft.Network/networkInterfaces/nic1"]
    license_type          = "Windows_Server"
    timezone              = "GMT Standard Time"

    #source_image_id       = "/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/rgvmimagedemo/providers/Microsoft.Compute/galleries/myGallery01/images/windows_server_2019/versions/0.0.2"
    #source_image_id       = "/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/rgvmimagedemo/providers/Microsoft.Compute/galleries/myGallery01/images/windows_server_2019"
    #source_image_id       = "/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/rgvmimagedemo/providers/Microsoft.Compute/galleries/myGallery01/images/windows_server_2019_capture/versions/0.0.3"
    source_image_id        = ""
    tags                  = {purpose = "Test OS Images gallery deployment"}

    os_disk {
        caching              = "ReadWrite"
        storage_account_type = "Standard_LRS"
    }

    boot_diagnostics {
        storage_account_uri = "https://someSA.blob.core.windows.net"
    }
}

After rereading some of the MS docs, osimage seems to relate to Generalised vs. Specialised VMs. I was trying to deploy a Specialised image; after creating a Generalised image instead, my VMs deployed. Specialised images cause this error, so it looks like either they aren't supported or there's a bug. They do work in PowerShell.

I am rather confused about the different methods of creating images. On the one hand, there is the "Managed Image", created by sysprepping (generalising) with OOBE and then clicking Capture in the VM blade. On the other hand, there is a newer method that should work directly from a VM without having to convert it to an image (as described here). Using this method, to get the VM into a Generalised state I had to run sysprep on the VM (with shutdown), deallocate it, and then set the VM to Generalised by using this API call. I was then able to create the "Shared Gallery Image" and version using the PowerShell in Microsoft's tutorial, setting -OsState Generalized. This allowed my Terraform code to work. I should add that I could create a Specialised image straight from an existing VM without having to run sysprep, and then deploy it using PowerShell.

Please could someone from Hashicorp / MS confirm whether my findings are accurate, i.e. that Specialised VMs are either not supported or do not work?

If the latter, is this something that can be fixed and if not, can we have a note added to the documentation, please?

I'll leave this open to allow Hashicorp to respond, but essentially, I seem to have resolved the issue.

@woter1832 woter1832 changed the title azurerm_windows_virtual_machine fails to create when referencing a gallery image. azurerm_windows_virtual_machine fails to create when referencing a "specialised" shared gallery image. Jul 21, 2020
@ArcturusZhang
Contributor

Hi @woter1832 thanks for this issue!

Although the azurerm provider now supports creating specialized images, it does not yet support provisioning VMs or VMSSes from them. This is being implemented in PR #7524, but it is currently blocked by an inconsistency in the Azure compute API. Please stay tuned.

Thanks!

@satyakrish

(upvoted this)
Any updates on this issue? We have a scenario where the application requires specific user groups to be present, so we would like to use specialized images. However, we are facing the same issue described here when trying to create a VM from a specialized image. Support for this feature would really help us.

@TheBlackMini
Contributor

I'm also stuck with this issue but for azurerm_linux_virtual_machine instead.

@SylvainMartel

Same problem here. Do I understand correctly that Terraform is unable to deploy "specialized" images at all?

@dkarlovi

dkarlovi commented Aug 5, 2021

@kinwolfqc it appears so, I have exactly the same issue with the latest Azure provider. There doesn't seem to be a solution at this moment other than creating a generalized version of the image.

@dkarlovi

dkarlovi commented Aug 5, 2021

I can confirm that using a generalized image with source_image_id works. The only thing is, creating generalized images is IMO a PITA if you don't need them. :|

@burnedikt

I managed to create a (Windows) VM based on a specialised image (version) from a shared image gallery with terraform by relying on the (deprecated) azurerm_virtual_machine resource instead of azurerm_windows_virtual_machine. I am aware that this is not a future-proof solution but it might help someone as a temporary workaround.

@abdullah-lt

This comment has been minimized.

@SteFletcher

This comment was marked as off-topic.

@ChrisPetr0

I'm having the exact same issue with a linux vm + specialized image via the azurerm_linux_virtual_machine.

I am having this same issue as well with Linux VM. Even though this was marked as off topic, wanted to add another data point. With the old resource of azurerm_virtual_machine, it does indeed work.

@lingclound

This issue seems to have lasted more than 2 years with no final solution so far. Is there any workaround? Thanks in advance.

@ChrisPetr0

This issue seems to have lasted more than 2 years with no final solution so far. Is there any workaround? Thanks in advance.

My "workaround" was to generalize the image... Heh, that's all I got.

@bn-jswick

Hi,
Is there any news on this? I ran into this issue today while trying to create VMs using some internal custom images.

Terraform v1.3.8
on linux_amd64
+ provider registry.terraform.io/hashicorp/azuread v2.33.0
+ provider registry.terraform.io/hashicorp/azurerm v3.43.0
+ provider registry.terraform.io/hashicorp/random v3.4.3

@swinster

I guess there are no updates.... It's been 3 years now :(

@swinster

swinster commented Aug 28, 2023

FWIW, I was able to get a specialised image working with the azurerm_virtual_machine resource type (as mentioned by both @burnedikt and @ChrisPetr0 above), but ONLY when using a base VM that used a "Standard" security type (as opposed to "Trusted Launch" security type, which is becoming the norm). I believe Gen 1 VMs always use Standard security, and Gen 2 VMs usually use Trusted Launch security by default (the Azure CLI will be updated in November).

To prove this out, it is possible to force deployment of a Windows 11 Gen 2 Standard Security VM as a base image using the Azure CLI on which we can customise and specialise, for example:

az group create --name myResourceGroup --location eastus
az vm create --resource-group myResourceGroup --name myVM --image "MicrosoftWindowsDesktop:windows-11:win11-22h2-pro:22621.2134.230801" --public-ip-sku Standard --admin-username testadmin --admin-password Password1234! --security-type Standard

Then, I modify and capture a specialised image in the Azure Compute Gallery. I can then use something like this Terraform to deploy a working VM from this specialised image:

resource "azurerm_virtual_machine" "windows" {
  name                        = "testingvm"
  resource_group_name         = data.azurerm_resource_group.main.name
  location                    = data.azurerm_resource_group.main.location

  vm_size                     = "Standard_B2s"
  network_interface_ids       = [azurerm_network_interface.winnic.id]

  storage_image_reference {
    id                        = "/subscriptions/xxxxxx-xxxx-xxxx-xxxx-xxxxxxxxx/resourceGroups/myComputeGallery_RG/providers/Microsoft.Compute/galleries/myComputeGallery/images/Win11Gen2StandardSecuritySpecialised"
  }

  storage_os_disk {
    name                      = "${var.prefix}-osdisk"
    caching                   = "ReadWrite"
    create_option             = "FromImage"
    managed_disk_type         = "Standard_LRS"
  }
}

If we follow the same process but instead deploy a Windows 11 Gen 2 VM with Trusted Launch security, then customise and capture the specialised image to the Azure Compute Gallery, subsequent VMs fail to deploy from the image. Essentially, the only thing that changes in the Terraform snippet above is the image id in storage_image_reference. The error thrown is something like this:

Error: compute.VirtualMachinesClient#CreateOrUpdate: Failure sending request: StatusCode=400 -- Original Error: Code="BadRequest" Message="The provided gallery image only supports the creation of VMs and VM Scale Sets with 'TrustedLaunch' security type."

  with azurerm_virtual_machine.windows,
  on user-script.tf line 80, in resource "azurerm_virtual_machine" "windows":
  80: resource "azurerm_virtual_machine" "windows" {

I do not believe the azurerm_virtual_machine resource type supports (or will ever support) the "Trusted Launch" security options, hence our failure.

It seems you cannot change the VM security type once it has been deployed, so instead, you would need to start from scratch using this method. Capturing an image of a VM retains the security type. Of course (and as pointed out), the azurerm_virtual_machine resource type is now deprecated, and the Trusted Launch security type will only ever become more widespread, so you are potentially painting yourself into a corner.

When generalising a Gen 2 VM that uses the Trusted Launch security type, I can use some Terraform similar to:

resource "azurerm_windows_virtual_machine" "windows" {
  admin_password            = "Password1234!"
  admin_username            = "testadmin"
  name                      = "testingvm"
  resource_group_name       = data.azurerm_resource_group.main.name
  location                  = data.azurerm_resource_group.main.location

  size                      = "Standard_B2s"
  network_interface_ids     = [azurerm_network_interface.winnic.id]
  secure_boot_enabled       = true
  vtpm_enabled              = true

  source_image_id           = "/subscriptions/xxxxxx-xxxx-xxxx-xxxx-xxxxxxxxx/resourceGroups/myComputeGallery_RG/providers/Microsoft.Compute/galleries/myComputeGallery/images/Win11_Gen2_TurstedLaunch_Generalised"
  
  identity {
    type                    = "SystemAssigned"
  }
  
  os_disk {
    name                    = "${var.prefix}-osdisk"
    caching                 = "ReadWrite"
    storage_account_type    = "Standard_LRS"
  }
  
  depends_on = [
    azurerm_network_interface.winnic,
  ]
}

This uses the newer azurerm_windows_virtual_machine resource type, and we can specify the secure_boot_enabled and vtpm_enabled parameters to signify a Trusted Launch security VM (although these settings are independent within Azure - you can have a Trusted Launch VM without either Secure Boot or TPM enabled).

However, when switching to use a specialised image in the same Terraform snippet, we get another failure:

Error: creating Windows Virtual Machine: (Name "testingvm" / Resource Group "apr-0ymlynte2ggae"): compute.VirtualMachinesClient#CreateOrUpdate: Failure sending request: StatusCode=400 -- Original Error: Code="InvalidParameter" Message="Parameter 'osProfile' is not allowed." Target="osProfile"

  with azurerm_windows_virtual_machine.windows,
  on user-script.tf line 80, in resource "azurerm_windows_virtual_machine" "windows":
  80: resource "azurerm_windows_virtual_machine" "windows" {

This is, of course, what @woter1832 initially reported above.

When deploying a VM from a specialised image in the portal, the Admin username and password fields are greyed out. However, we can't remove the admin_username and admin_password parameters from the Terraform, as they are required by the azurerm_windows_virtual_machine resource type. I'm unsure whether this is the cause of the issue in this case. Maybe it's trying to change something in the OS profile that cannot be changed (hence the error).

Whatever, this is super frustrating as Gen 2 VMs using the Trusted Launch security type can be deployed from specialised images via the Azure portal and all other Azure command line options.

@soumilk-LT

This is a HUGE problem !!!

The simple use case is: I need to create a VM (or VMSS) from a Specialised image, using Trusted Launch and Spot priority.
I am listing the issues we are facing while automating Azure via Terraform.

  1. The priority and secure_boot_enabled attributes are only supported in the new azurerm_windows_virtual_machine and azurerm_linux_virtual_machine resources, BUT I can't use a specialized image with those.

  2. The azurerm_virtual_machine resource type supports specialized images, but it does not have an option to set priority or secure_boot_enabled.
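To illustrate point 1, here is a rough sketch of where these attributes sit on the newer resource. All variables are hypothetical placeholders and the VM name and size are arbitrary choices; going by the reports earlier in this thread, this shape is only expected to work when `source_image_id` points at a generalised image.

```hcl
provider "azurerm" {
  features {}
}

# Hypothetical inputs, introduced only for this sketch.
variable "resource_group_name" {
  type = string
}

variable "location" {
  type = string
}

variable "network_interface_id" {
  type = string
}

variable "source_image_id" {
  type = string
}

variable "admin_password" {
  type = string
}

resource "azurerm_windows_virtual_machine" "spot" {
  name                  = "spotvm01"
  resource_group_name   = var.resource_group_name
  location              = var.location
  size                  = "Standard_D2s_v3"
  admin_username        = "testadmin"
  admin_password        = var.admin_password
  network_interface_ids = [var.network_interface_id]

  # Spot priority; an eviction policy must be set alongside it.
  priority        = "Spot"
  eviction_policy = "Deallocate"

  # Trusted Launch related settings.
  secure_boot_enabled = true
  vtpm_enabled        = true

  # Assumed to be a generalised gallery image (see the reports above).
  source_image_id = var.source_image_id

  os_disk {
    caching              = "ReadWrite"
    storage_account_type = "Standard_LRS"
  }
}
```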

Due to the above problems, we are incurring losses because the automation cannot be implemented properly. We use Terraform for all our clouds, and only the Azure provider has such limitations. We need to create these VMs manually every time (via console or CLI).

I have been following this thread for a long time now and there is still no progress. It's super frustrating to see such stupid limitations when the Terraform providers for AWS and GCP are highly mature. PLEASE fix this ASAP.

@agyss

agyss commented Mar 26, 2024

While this IMHO is a must have, I use the following workaround:

resource "terraform_data" "vm" {
  triggers_replace = [
    local.vm_name,
    "${var.dedicated_host_id != "" ? "--host ${var.dedicated_host_id}" : ""}",
    data.azurerm_resource_group.main.name,
    var.source_image_id,
    var.admin_password,
    data.azurerm_subnet.subnet.id,
    var.region
  ]

  input = {
    name            = local.vm_name
    host_id_section = "${var.dedicated_host_id != "" ? "--host ${var.dedicated_host_id}" : ""}"
    group           = data.azurerm_resource_group.main.name
    image           = var.source_image_id
    adminpw         = var.admin_password
    subnetid        = data.azurerm_subnet.subnet.id
    location        = var.region
  }
  provisioner "local-exec" {
    when    = create
    command = "az vm create -g ${self.input.group} -n ${self.input.name} --location \"${self.input.location}\" --image ${self.input.image} --specialized --admin-password ${self.input.adminpw} --size ${var.vm_size} --computer-name ${var.vm_name} --subnet ${self.input.subnetid} --enable-secure-boot --public-ip-address \"\" --nsg \"\" --enable-vtpm --os-disk-size-gb 256 --storage-sku Premium_LRS --zone 1 --os-disk-delete-option Delete --nic-delete-option Delete --security-type TrustedLaunch ${self.input.host_id_section}"
  }

  provisioner "local-exec" {
    when    = destroy
    command = "az vm delete -g ${self.input.group} -n ${self.input.name} --yes"
  }
}

The main 'problem' with this approach is that the active subscription must first be set, with 'az account set --subscription', to the one where the resource should be created, but this command could be added.
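One way to avoid depending on the CLI's active subscription is to pass the global `--subscription` argument on each `az` call rather than running `az account set` beforehand. The following is a trimmed-down sketch of that variation, not a drop-in replacement: `var.subscription_id` and the other variables are hypothetical names introduced here, and the `az vm create` flags are cut down to a minimum for readability.

```hcl
# Hypothetical inputs, introduced only for this sketch.
variable "subscription_id" {
  type = string
}

variable "resource_group_name" {
  type = string
}

variable "vm_name" {
  type = string
}

variable "source_image_id" {
  type = string
}

resource "terraform_data" "vm" {
  triggers_replace = [
    var.subscription_id,
    var.resource_group_name,
    var.vm_name,
    var.source_image_id,
  ]

  # Destroy-time provisioners may only reference self, so everything the
  # delete command needs (including the subscription) is stored in input.
  input = {
    subscription = var.subscription_id
    group        = var.resource_group_name
    name         = var.vm_name
    image        = var.source_image_id
  }

  # Runs at creation time (the default for provisioners).
  provisioner "local-exec" {
    command = "az vm create --subscription ${self.input.subscription} -g ${self.input.group} -n ${self.input.name} --image ${self.input.image} --specialized --security-type TrustedLaunch"
  }

  provisioner "local-exec" {
    when    = destroy
    command = "az vm delete --subscription ${self.input.subscription} -g ${self.input.group} -n ${self.input.name} --yes"
  }
}
```

Passing the subscription per command leaves the CLI's global state untouched, which may matter if several configurations share the same machine or pipeline agent.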

@nathanblair

Creating a generalized image isn't possible for all virtual machine image definitions. There should be marked support of the azurerm_virtual_machine resource until azurerm_{windows|linux}_virtual_machine matures enough to support the specialized image deployment use case.

@Bjarki2330

Creating a generalized image isn't possible for all virtual machine image definitions. There should be marked support of the azurerm_virtual_machine resource until azurerm_{windows|linux}_virtual_machine matures enough to support the specialized image deployment use case.

I second this. I need to recreate three virtual machines for an important workload and want to use a specialized image with the windows/linux resource and this is driving me insane.

@joaocc

joaocc commented Jul 15, 2024

Hi. Any news regarding this issue? Thanks

@BillysCoolJob

BillysCoolJob commented Nov 26, 2024

Guys, this issue has been open since 2020.... It bothered me a ton when I ran into it 2 years ago and it's bothering me that I'm running into it again....

This basically blocks the deployment of any VM from an image if you want it to have secure boot.
