
proxmox_virtual_environment_file broken #521

Closed
dawidole opened this issue Aug 23, 2023 · 12 comments · Fixed by #526 or #533
Labels
🐛 bug Something isn't working

Comments

@dawidole commented Aug 23, 2023

Describe the bug
An attempt to create proxmox_virtual_environment_file resource, specifically

resource "proxmox_virtual_environment_file" "debian_cloud_image" {
  lifecycle {
    ignore_changes = [
      source_file
    ]
  }

  content_type = "iso"
  datastore_id = "local"
  node_name    = pve_node

  source_file {
    path = "./debian-11-generic-amd64.img"
  }
}

ends up with a crashed plugin when using a provider version newer than v0.28.0. I believe this commit broke it:
f901e71

To Reproduce
Steps to reproduce the behavior:

  1. Create a resource as above
  2. Run terraform apply
  3. See error


Expected behavior
Resource would be created :)

Working example, v0.28.0:

proxmox_virtual_environment_file.debian_cloud_image["dev-pve04"]: Creation complete after 2m36s [id=local:iso/debian-11-generic-amd64.img]
2023-08-23T20:15:16.517+0200 [DEBUG] provider.stdio: received EOF, stopping recv loop: err="rpc error: code = Unavailable desc = error reading from server: EOF"
2023-08-23T20:15:16.551+0200 [DEBUG] provider: plugin process exited: path=.terraform/providers/registry.terraform.io/bpg/proxmox/0.28.0/linux_amd64/terraform-provider-proxmox_v0.28.0 pid=1023245
2023-08-23T20:15:16.552+0200 [DEBUG] provider: plugin exited
2023-08-23T20:15:16.565+0200 [DEBUG] POST <redacted>
╷
│ Warning: Applied changes may be incomplete
│
│ The plan was created with the -target option in effect, so some changes requested in the configuration may have been ignored and the output values may not be fully updated. Run the following command to
│ verify that no other changes are pending:
│     terraform plan
│
│ Note that the -target option is not suitable for routine use, and is provided only for exceptional situations such as recovering from errors or mistakes, or when Terraform specifically suggests to use it
│ as part of an error message.
╵
2023-08-23T20:15:16.754+0200 [DEBUG] DELETE <redacted>/lock

Apply complete! Resources: 2 added, 0 changed, 1 destroyed.

broken, v0.30.1:

proxmox_virtual_environment_file.debian_cloud_image["dev-pve05"]: Still creating... [2m31s elapsed]
2023-08-23T20:18:45.344+0200 [DEBUG] provider.terraform-provider-proxmox_v0.30.1: Received HTTP Response: tf_http_trans_id=658488ec-955b-c1af-495a-c4f49c763f97 tf_provider_addr=registry.terraform.io/bpg/proxmox tf_resource_type=proxmox_virtual_environment_file Cache-Control=max-age=0 tf_http_op_type=response tf_http_res_body={"data":"UPID:dev-pve01:003F9B4A:06C7022D:64E64D85:imgcopy::root@pam:"} tf_http_res_status_code=200 Expires="Wed, 23 Aug 2023 18:18:45 GMT" tf_mux_provider=tf5to6server.v5tov6Server Content-Length=71 Content-Type=application/json;charset=UTF-8 @caller=github.com/hashicorp/terraform-plugin-sdk/v2@v2.27.0/helper/logging/logging_http_transport.go:162 Server=pve-api-daemon/3.0 tf_http_res_version=HTTP/1.1 tf_req_id=a51fd800-dd90-b937-d94e-603c03e133e5 tf_rpc=ApplyResourceChange @module=proxmox Date="Wed, 23 Aug 2023 18:18:45 GMT" Pragma=no-cache tf_http_res_status_reason="200 OK" timestamp=2023-08-23T20:18:45.344+0200
2023-08-23T20:18:45.345+0200 [DEBUG] provider.terraform-provider-proxmox_v0.30.1: Sending HTTP Request: tf_http_req_uri=/api2/json/nodes/dev-pve05/tasks/UPID:dev-pve01:003F9B4A:06C7022D:64E64D85:imgcopy::root@pam:/status @module=proxmox Cookie=PVEAuthCookie=PVE:root@pam:<redacted> tf_http_op_type=request tf_http_req_method=GET Accept=application/json Accept-Encoding=gzip tf_http_req_body= tf_http_trans_id=425a808a-2f8c-13a4-c579-2d0a4947a963 tf_rpc=ApplyResourceChange Host=10.227.0.11:8006 tf_mux_provider=tf5to6server.v5tov6Server tf_req_id=a51fd800-dd90-b937-d94e-603c03e133e5 tf_resource_type=proxmox_virtual_environment_file @caller=github.com/hashicorp/terraform-plugin-sdk/v2@v2.27.0/helper/logging/logging_http_transport.go:162 User-Agent=Go-http-client/1.1 tf_http_req_version=HTTP/1.1 tf_provider_addr=registry.terraform.io/bpg/proxmox timestamp=2023-08-23T20:18:45.345+0200
2023-08-23T20:18:45.372+0200 [DEBUG] provider.terraform-provider-proxmox_v0.30.1: Received HTTP Response: @module=proxmox tf_http_res_version=HTTP/1.1 tf_provider_addr=registry.terraform.io/bpg/proxmox tf_resource_type=proxmox_virtual_environment_file tf_http_res_body="{"data":null,"errors":{"upid":"no such task"}}" tf_http_res_status_code=400 tf_http_res_status_reason="400 Parameter verification failed." Cache-Control=max-age=0 Content-Length=46 Content-Type=application/json;charset=UTF-8 Expires="Wed, 23 Aug 2023 18:18:45 GMT" tf_http_trans_id=425a808a-2f8c-13a4-c579-2d0a4947a963 tf_mux_provider=tf5to6server.v5tov6Server tf_rpc=ApplyResourceChange @caller=github.com/hashicorp/terraform-plugin-sdk/v2@v2.27.0/helper/logging/logging_http_transport.go:162 Date="Wed, 23 Aug 2023 18:18:45 GMT" Pragma=no-cache Server=pve-api-daemon/3.0 tf_http_op_type=response tf_req_id=a51fd800-dd90-b937-d94e-603c03e133e5 timestamp=2023-08-23T20:18:45.372+0200
2023-08-23T20:18:45.372+0200 [ERROR] provider.terraform-provider-proxmox_v0.30.1: failed to close file reader: tf_mux_provider=tf5to6server.v5tov6Server tf_req_id=a51fd800-dd90-b937-d94e-603c03e133e5 tf_resource_type=proxmox_virtual_environment_file @caller=github.com/bpg/terraform-provider-proxmox/proxmox/nodes/storage.go:243 error="close /tmp/multipart3390830491: file already closed" tf_provider_addr=registry.terraform.io/bpg/proxmox tf_rpc=ApplyResourceChange @module=proxmox timestamp=2023-08-23T20:18:45.372+0200
2023-08-23T20:18:45.448+0200 [ERROR] provider.terraform-provider-proxmox_v0.30.1: Response contains error diagnostic: diagnostic_severity=ERROR tf_proto_version=6.3 tf_provider_addr=registry.terraform.io/bpg/proxmox tf_req_id=a51fd800-dd90-b937-d94e-603c03e133e5 tf_rpc=ApplyResourceChange @module=sdk.proto diagnostic_detail= diagnostic_summary="error uploading file to datastore local: failed waiting for upload - error retrievinf task status: received an HTTP 400 response - Reason: Parameter verification failed. (upid: no such task)" tf_resource_type=proxmox_virtual_environment_file @caller=github.com/hashicorp/terraform-plugin-go@v0.18.0/tfprotov6/internal/diag/diagnostics.go:58 timestamp=2023-08-23T20:18:45.448+0200
2023-08-23T20:18:45.449+0200 [ERROR] vertex "proxmox_virtual_environment_file.debian_cloud_image[\"dev-pve05\"]" error: error uploading file to datastore local: failed waiting for upload - error retrievinf task status: received an HTTP 400 response - Reason: Parameter verification failed. (upid: no such task)
2023-08-23T20:18:45.459+0200 [DEBUG] POST <redacted>
╷
│ Warning: Applied changes may be incomplete
│
│ The plan was created with the -target option in effect, so some changes requested in the configuration may have been ignored and the output values may not be fully updated. Run the following command to
│ verify that no other changes are pending:
│     terraform plan
│
│ Note that the -target option is not suitable for routine use, and is provided only for exceptional situations such as recovering from errors or mistakes, or when Terraform specifically suggests to use it
│ as part of an error message.
╵
╷
│ Error: error uploading file to datastore local: failed waiting for upload - error retrievinf task status: received an HTTP 400 response - Reason: Parameter verification failed. (upid: no such task)
│
│   with proxmox_virtual_environment_file.debian_cloud_image["dev-pve05"],
│   on debian-image.tf line 21, in resource "proxmox_virtual_environment_file" "debian_cloud_image":
│   21: resource "proxmox_virtual_environment_file" "debian_cloud_image" {
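As an aside, the DEBUG output above also shows a secondary error, "failed to close file reader: file already closed", which suggests the upload path closes the same reader twice (e.g. once explicitly and once in a deferred cleanup). A common Go pattern for making Close idempotent is to wrap the closer in sync.Once; this is a minimal illustrative sketch, not the provider's actual code, and onceCloser/fakeFile are hypothetical names:

```go
package main

import (
	"fmt"
	"io"
	"sync"
)

// onceCloser wraps an io.Closer so repeated Close calls are safe: only the
// first call reaches the underlying closer, later calls return the same
// result. This avoids "file already closed" when both a deferred Close and
// an explicit Close run.
type onceCloser struct {
	c    io.Closer
	once sync.Once
	err  error
}

func (o *onceCloser) Close() error {
	o.once.Do(func() { o.err = o.c.Close() })
	return o.err
}

// fakeFile stands in for an *os.File: a second Close on a real file errors.
type fakeFile struct{ closed bool }

func (f *fakeFile) Close() error {
	if f.closed {
		return fmt.Errorf("file already closed")
	}
	f.closed = true
	return nil
}

func main() {
	oc := &onceCloser{c: &fakeFile{}}
	fmt.Println(oc.Close(), oc.Close()) // both nil: second call is a no-op
}
```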
@dawidole dawidole added the 🐛 bug Something isn't working label Aug 23, 2023
@bpg (Owner) commented Aug 29, 2023

Hey @dawidole! 👋🏼

Thanks for the report! Although I can't reproduce this issue in my lab, I suspect it has something to do with the availability of the task that PVE creates to handle the uploaded file:

[Screenshot: PVE task list showing the file upload task]

The task ID is returned to the provider in the response to the file upload, but when the provider checks the status of the task, PVE replies with "no such task" 😕

I'll make a quick fix to simply retry the operation in this case.

@bpg (Owner) commented Aug 29, 2023

@all-contributors please add @dawidole for bug

@allcontributors (Contributor) commented
@bpg

I've put up a pull request to add @dawidole! 🎉

@GJKrupa commented Aug 31, 2023

I've just upgraded from 0.30.1 to 0.30.2 and I'm still seeing the issue when trying to upload an ISO to Proxmox 8.0.4

Code:

resource "proxmox_virtual_environment_file" "talos_iso" {
  datastore_id = var.shared_storage_name
  content_type = "iso"
  node_name = var.controllers[0].node
  source_file {
    path = "https://github.com/siderolabs/talos/releases/download/${var.talos_version}/metal-amd64.iso"
    file_name = "metal-amd64-${var.talos_version}.iso"
  }
}

Init output:

Initializing the backend...

Initializing provider plugins...
- Finding bpg/proxmox versions matching "0.30.2"...
- Finding hashicorp/vault versions matching "3.20.0"...
- Using previously-installed bpg/proxmox v0.30.2
- Using previously-installed hashicorp/vault v3.20.0

Plan output:

  # proxmox_virtual_environment_file.talos_iso will be created
  + resource "proxmox_virtual_environment_file" "talos_iso" {
      + content_type           = "iso"
      + datastore_id           = "nas_raid5_nfs"
      + file_modification_date = (known after apply)
      + file_name              = (known after apply)
      + file_size              = (known after apply)
      + file_tag               = (known after apply)
      + id                     = (known after apply)
      + node_name              = "beelink"
      + timeout_upload         = 1800

      + source_file {
          + changed   = false
          + file_name = "metal-amd64-v1.5.1.iso"
          + insecure  = false
          + path      = "https://github.com/siderolabs/talos/releases/download/v1.5.1/metal-amd64.iso"
        }
    }

Result

Error: error uploading file to datastore nas_raid5_nfs: failed waiting for upload - error retrieving task status: received an HTTP 400 response - Reason: Parameter verification failed. (upid: no such task)

@bpg bpg reopened this Aug 31, 2023
@bpg (Owner) commented Aug 31, 2023

Hmmm... that's interesting.
Maybe the underlying datastore type plays a role. I would imagine NFS will be much slower than local storage in this scenario.
Then again, the OP was using the local datastore.
I still can't reproduce this 😕

@GJKrupa How big is your file?
Also, could you check whether the task for the file upload was created in PVE at all? Similar to the screenshot I posted earlier.

@GJKrupa commented Aug 31, 2023

The file is around 85MB. It's being downloaded over a 1Gbit broadband link and transferred to a Synology NAS over GigE. I've tried uploading the same file from the laptop instead of downloading from GitHub but I see the same error.

There is a Copy Data task being created, and it runs to completion after Terraform exits. I've compared the UPID in the TF_LOG=DEBUG output to the Proxmox console and they match. I noticed that the task sometimes doesn't show up in the console until a second or so after Terraform errors out, but that could just be due to latency in the polling.

What I did just find out is that the apply succeeds if the target node is the same as the provider endpoint and fails if it's a different node. The task is different in these cases. In both instances it loads the image to the endpoint node first but then it either does a cp or an scp to get it to the final location depending on where the target of the upload is. For my use case I can work around it by changing the node since it's using shared storage but will fail if, for example, I was trying to distribute a file into local storage on multiple nodes.
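GJKrupa's observation matches the DEBUG log earlier in this thread: the provider polls /nodes/dev-pve05/tasks/UPID:dev-pve01:.../status, yet the UPID itself names dev-pve01 as the node that owns the task, so any other node answers "no such task". Per the PVE UPID layout (UPID:node:pid:pstart:starttime:type:id:user:), the owning node can be recovered from the identifier itself; a sketch, where nodeFromUPID is a hypothetical helper and not the provider's actual API:

```go
package main

import (
	"fmt"
	"strings"
)

// nodeFromUPID extracts the node name from a PVE task UPID of the form
// "UPID:<node>:<pid>:<pstart>:<starttime>:<type>:<id>:<user>:".
// Task status must be polled on this node; other nodes reply "no such task".
func nodeFromUPID(upid string) (string, error) {
	parts := strings.Split(upid, ":")
	if len(parts) < 8 || parts[0] != "UPID" {
		return "", fmt.Errorf("malformed UPID: %q", upid)
	}
	return parts[1], nil
}

func main() {
	// UPID taken from the DEBUG log in this issue.
	node, err := nodeFromUPID("UPID:dev-pve01:003F9B4A:06C7022D:64E64D85:imgcopy::root@pam:")
	fmt.Println(node, err)
}
```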

@GJKrupa commented Aug 31, 2023

I've tried the above comparison with local storage and the behaviour is the same - fails if it's copying to another node, succeeds if it's on the same node.

@bpg (Owner) commented Aug 31, 2023

Those are important details, thanks for the testing @GJKrupa!

@all-contributors please add @GJKrupa for test

@allcontributors (Contributor) commented

I've put up a pull request to add @GJKrupa! 🎉

@bpg (Owner) commented Sep 1, 2023

That was it! I was able to reproduce it in a similar scenario in a multi-node PVE cluster; the #533 fix seems to be working for me.

@GJKrupa, @dawidole would you be able to test v0.30.3 (will be released in a few hours) in your environments?

@bpg bpg closed this as completed in #533 Sep 1, 2023
@zimmertr (Sponsor) commented

I had to grant the role I was using for my API token the Datastore.AllocateTemplate permission to solve this.

Two other thoughts:

  1. Is there a location where all of the necessary permissions are listed for API actions?
  2. Thank you SO MUCH for picking up development on this provider! After years of struggle with the other one I'm so thrilled to see a new project come around.

@bpg (Owner) commented Sep 22, 2023

  1. Is there a location where all of the necessary permissions are listed for API actions?

No, unfortunately. So far, most of my testing has been done under the root@pam account (:shame:)
But this is a really good point -- we should document the necessary API permissions for each resource. I'll keep this in mind when migrating to the new provider framework on the way to v1.0.

  1. Thank you SO MUCH for picking up development on this provider! After years of struggle with the other one I'm so thrilled to see a new project come around.

@zimmertr, thanks a lot for the good words! ❤️ This is purely a hobby / free time project, and any support and encouragement means a lot to me :)
