During a resource adjustment the VM is not shut down using ACPI #96

Closed
mritzmann opened this issue Mar 6, 2020 · 4 comments · Fixed by #120

Comments

@mritzmann
Contributor

Describe the bug
During a resource adjustment, Terraform restarts the VM. I noticed that the VM is not shut down cleanly with an ACPI event. Instead, it is powered off hard.

At least that's how I interpret the code:

https://github.com/terraform-providers/terraform-provider-nutanix/blob/34a9c14c88956bce549deb60599602e2a5b224d2/nutanix/resource_nutanix_virtual_machine.go#L1234-L1238

Expected behavior

  • Terraform should send an ACPI shutdown and wait at least five minutes.
  • The time to wait should be configurable.

Logs
I'm happy to provide logs if necessary.

Versions (please complete the following information):

  • OS that is executing Terraform: Debian 8.11
  • Terraform: v0.11.14
  • Nutanix Cluster (Prism Element / AOS): 5.10.9.1 LTS
  • Nutanix Prism Central: 5.11.2.1
  • Terraform provider version: v1.0.2


@PacoDw
Contributor

PacoDw commented May 11, 2020

Hello @mritzmann, thank you so much for your review. I will check it out and let you know when I have news.

@PacoDw
Contributor

PacoDw commented May 13, 2020

Hello @mritzmann, I implemented a new attribute to configure the power_state_mechanism and fixed some related bugs. You can check the PR for more details or try out the new configuration.

Please let me know if you have any other comments or concerns. Thanks.

@mritzmann
Contributor Author

Hi @PacoDw, thanks for taking care of it. I built the provider plugin from source and tried to test it, but received the following error message:

Error: internal error: cannot shut down the VM with UUID(2cea71f2-192e-4d1f-8f4f-32937abb4515): error waiting for vm (2cea71f2-192e-4d1f-8f4f-32937abb4515) to update: timeout while waiting for state to become 'COMPLETE' (last state: 'RUNNING', timeout: 1m0s)

My server.tf:

resource "nutanix_virtual_machine" "tf-test" {
  name            = "test"
  cluster_uuid    = "00056dce-32b5-c4ed-0000-00000001170b"
  memory_size_mib = "2048"
  num_sockets     = "2"
  lifecycle {
    prevent_destroy = true
    ignore_changes = [
      nic_list,
      disk_list,
      guest_customization_cloud_init_user_data,
    ]
  }
  power_state_mechanism_config {
    guest_transition_config {
      should_fail_on_script_failure = true
      enable_script_exec            = true
    }
    mechanism = "ACPI"
  }
}

This is probably because my test VM did not respond to the ACPI command within one minute, or not at all (which is more likely a problem with my VM). Hence my questions:

  • Is it possible to configure Terraform to wait only a certain amount of time and then hard shut down the VM if it does not respond to the ACPI command? This is important for us because we cannot always determine whether a VM responds to ACPI commands. The best option for us would be for Terraform to force the change in all cases without raising an error.
  • Can the 1m timeout be adjusted? Depending on the VM configuration, it can take several minutes before a shutdown can complete. Example: a Kubernetes node needs several minutes to move all Docker containers to another node.

@marinsalinas
Contributor

marinsalinas commented May 15, 2020

@mritzmann Could you test it again against master and add the following to your configuration:

provider "nutanix" {
  wait_timeout = 6 // wait up to 6 minutes
  // ... other provider configuration
}
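
For reference, here is a minimal sketch of how the two settings mentioned in this thread might sit together in one configuration. The attribute names and values are copied from the comments above. That wait_timeout also bounds the wait for the ACPI shutdown is an assumption based on the suggestion, not something confirmed in this thread.

```hcl
# Sketch only: attribute names are taken from the comments in this thread.
provider "nutanix" {
  # Per the comment above, the value is in minutes.
  wait_timeout = 6
  # ... other provider configuration
}

resource "nutanix_virtual_machine" "tf-test" {
  name            = "test"
  cluster_uuid    = "00056dce-32b5-c4ed-0000-00000001170b"
  memory_size_mib = "2048"
  num_sockets     = "2"

  # Request a guest (ACPI) shutdown instead of a hard power-off.
  power_state_mechanism_config {
    mechanism = "ACPI"
  }
}
```

If the assumption holds, a resource adjustment would wait up to six minutes for the guest to shut down before reporting the timeout error quoted earlier. This sketch does not add a fallback to a hard power-off.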
