Unable to restart guest OS using vSphere #21

Closed
rgomezceis opened this issue Jun 11, 2024 · 6 comments · Fixed by #22

Comments

@rgomezceis

Using the latest release.

The first reboot works, but once the VM is back up and I try to restart it again, it fails with:

"Cannot complete operation because VMware Tools is not running in this virtual machine. Failed to reset the virtual machine: Cannot execute scripts."

The pod is running and logs no errors.

kube-system          talos-vmtoolsd-wmjjk                           1/1     Running
@jonkerj
Collaborator

jonkerj commented Jun 12, 2024

The error indicates that vSphere is not receiving communications from vmtoolsd. Could you check whether the vmtoolsd logs show anything unusual, or post them here?

@rgomezceis
Author

rgomezceis commented Jun 12, 2024

Just this:

$ kubectl logs -n kube-system talos-vmtoolsd-rh4c7
{"level":"info","msg":"talos-vmtoolsd version latest\nCopyright 2020-2022 Oliver Kuckertz <oliver.kuckertz@mologie.de>\nThis program is free software and available under the Apache 2.0 license."}

Pod:

Containers:
  talos-vmtoolsd:
    Container ID:   containerd://1d1e8d606fdf6b56bf085decbbac5362c7f0690ec36f57fca8fc6dc84e996179
    Image:          ghcr.io/siderolabs/talos-vmtoolsd:latest
    Image ID:       ghcr.io/siderolabs/talos-vmtoolsd@sha256:8eefb326375abf45f07d5922e25701aa43bbf7aa50f86927a6d24633e44c3ca1
    Port:           <none>
    Host Port:      <none>
    State:          Running
      Started:      Tue, 11 Jun 2024 20:33:43 +0000
    Last State:     Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Tue, 11 Jun 2024 20:25:30 +0000
      Finished:     Tue, 11 Jun 2024 20:29:03 +0000
    Ready:          True
    Restart Count:  1
    Limits:
      cpu:     500m
      memory:  64Mi
    Requests:
      cpu:     500m
      memory:  8Mi
    Environment:
      TALOS_CONFIG_PATH:  /etc/talos-vmtoolsd/talosconfig
      TALOS_HOST:          (v1:status.hostIP)
    Mounts:
      /etc/talos-vmtoolsd from config (ro)
Conditions:
  Type                        Status
  PodReadyToStartContainers   True
  Initialized                 True
  Ready                       True
  ContainersReady             True
  PodScheduled                True
Volumes:
  config:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  talos-vmtoolsd-config
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     ceis.node.workload:NoSchedule op=Exists
                 node-role.kubernetes.io/control-plane:NoSchedule op=Exists
                 node-role.kubernetes.io/master:NoSchedule op=Exists
                 node.cloudprovider.kubernetes.io/uninitialized:NoSchedule op=Exists
                 node.kubernetes.io/disk-pressure:NoSchedule op=Exists
                 node.kubernetes.io/memory-pressure:NoSchedule op=Exists
                 node.kubernetes.io/not-ready:NoExecute op=Exists
                 node.kubernetes.io/pid-pressure:NoSchedule op=Exists
                 node.kubernetes.io/unreachable:NoExecute op=Exists
                 node.kubernetes.io/unschedulable:NoSchedule op=Exists
Events:          <none>

@jonkerj
Collaborator

jonkerj commented Jun 12, 2024

Hmm, that's not much. Could you add this env var to your container spec and check the logs again? According to main.go, this is how the log level is set.

env:
- name: LOG_LEVEL
  value: debug  # or even trace

@rgomezceis
Author

rgomezceis commented Jun 12, 2024

{"level":"info","msg":"talos-vmtoolsd version latest\nCopyright 2020-2022 Oliver Kuckertz <oliver.kuckertz@mologie.de>\nThis program is free software and available under the Apache 2.0 license."}
{"level":"debug","module":"vmware-guestinfo","msg":"Opened channel 0"}
{"level":"debug","module":"vmware-guestinfo","msg":"Opened channel 1"}
{"handler_name":"reset","level":"debug","module":"nanotoolbox","msg":"incoming RPC request"}
{"level":"debug","module":"tboxcmds","msg":"sending hostname: ceis-worker-3"}
{"level":"debug","module":"tboxcmds","msg":"sending OS full name: Talos v1.7.4-cb3a8308"}
{"level":"debug","module":"tboxcmds","msg":"sending OS short name: Talos v1.7.4"}
{"level":"debug","module":"tboxcmds","msg":"GuestNicInfo: adding name=eth0 mac={mac} ip={ipv4}"}
{"level":"debug","module":"tboxcmds","msg":"GuestNicInfo: adding name=eth0 mac={mac} ip={ipv6}"}
{"handler_name":"Capabilities_Register","level":"debug","module":"nanotoolbox","msg":"incoming RPC request"}
{"level":"debug","module":"vmware-guestinfo","msg":"No message to retrieve"}
{"level":"debug","module":"vmware-guestinfo","msg":"No message to retrieve"}
{"handler_name":"Vix_1_Relayed_Command","level":"debug","module":"nanotoolbox","msg":"incoming RPC request"}
{"command":"vix","level":"debug","module":"nanotoolbox","msg":"sending tools state version=\"Talos v1.7.4-cb3a8308\" versionShort=\"Talos v1.7.4\" hostname=\"ceis-worker-3\""}
{"level":"debug","module":"vmware-guestinfo","msg":"No message to retrieve"}
{"level":"debug","module":"vmware-guestinfo","msg":"No message to retrieve"}
{"level":"debug","module":"vmware-guestinfo","msg":"No message to retrieve"}
{"level":"debug","module":"vmware-guestinfo","msg":"No message to retrieve"}
{"level":"debug","module":"vmware-guestinfo","msg":"No message to retrieve"}
{"level":"debug","module":"vmware-guestinfo","msg":"No message to retrieve"}
{"level":"debug","module":"vmware-guestinfo","msg":"No message to retrieve"}
{"level":"debug","module":"vmware-guestinfo","msg":"No message to retrieve"}
{"level":"debug","module":"vmware-guestinfo","msg":"No message to retrieve"}
{"level":"debug","module":"vmware-guestinfo","msg":"No message to retrieve"}
{"level":"debug","module":"vmware-guestinfo","msg":"No message to retrieve"}
{"level":"debug","module":"vmware-guestinfo","msg":"No message to retrieve"}
{"level":"debug","module":"vmware-guestinfo","msg":"No message to retrieve"}
{"level":"debug","module":"vmware-guestinfo","msg":"No message to retrieve"}
{"level":"debug","module":"vmware-guestinfo","msg":"No message to retrieve"}
{"level":"debug","module":"vmware-guestinfo","msg":"No message to retrieve"}

We are using the vSphere CPI; I don't know if that matters.
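
The debug output above contains three `incoming RPC request` entries (`reset`, `Capabilities_Register`, `Vix_1_Relayed_Command`) among the polling messages. A minimal sketch for tallying those handlers from saved `kubectl logs` output follows. The field names `handler_name` and `msg` come from the log lines above; the function name `summarize_rpc_requests` is hypothetical and not part of talos-vmtoolsd. If a later restart attempt reached the daemon, one would expect an additional `reset` entry, though that reading is an assumption.

```python
import json
from collections import Counter


def summarize_rpc_requests(log_text):
    """Count 'incoming RPC request' log entries per handler name."""
    counts = Counter()
    for line in log_text.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not a JSON log line
        if not isinstance(entry, dict):
            continue
        if entry.get("msg") == "incoming RPC request":
            counts[entry.get("handler_name", "<unknown>")] += 1
    return counts


# A few lines copied from the debug log in this thread.
sample = """\
{"handler_name":"reset","level":"debug","module":"nanotoolbox","msg":"incoming RPC request"}
{"level":"debug","module":"vmware-guestinfo","msg":"No message to retrieve"}
{"handler_name":"Capabilities_Register","level":"debug","module":"nanotoolbox","msg":"incoming RPC request"}
{"handler_name":"Vix_1_Relayed_Command","level":"debug","module":"nanotoolbox","msg":"incoming RPC request"}
"""

if __name__ == "__main__":
    for handler, count in summarize_rpc_requests(sample).items():
        print(f"{handler}: {count}")
```

Comparing the tally before and after a failing restart attempt might show whether vSphere sent the request at all or rejected it before contacting the guest.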

@robinelfrink
Collaborator

I'm seeing the same with talos-vmtoolsd as a system extension on vCloud. What's interesting is that after the reboot talos-vmtoolsd properly reports IP addresses.

I'll try to find out what's going on here.

@robinelfrink
Collaborator

robinelfrink commented Jun 27, 2024

A reboot works only when it is the first one after power-on; subsequent requests throw a Java exception on the ESX host, with this error in the UI:

      "message": "Cannot complete operation because VMware Tools is not running in this virtual machine.\nFailed to reset the virtual machine: Cannot execute scripts.",
      "faultMessage": [
        {
          "_type": "com.vmware.vim.binding.impl.vmodl.LocalizableMessageImpl",
          "key": "msg.vigor.reset.fail",
          "arg": [
            {
              "_type": "com.vmware.vim.binding.impl.vmodl.KeyAnyValueImpl",
              "key": "1",
              "value": "msg.foundryErrMsgId.VIX_E_POWEROP_SCRIPTS_NOT_AVAILABLE"
            }
          ],
          "message": "Failed to reset the virtual machine: Cannot execute scripts."
        }
      ]

I'll keep trying to figure out what's going on, but don't expect a fix very soon. If you're automating the reboots, I suggest you try a shutdown followed by a power-on for now.
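
The suggested workaround (shut down, then power on) could be automated roughly as follows. This is a sketch under assumptions: `vm` is expected to behave like a pyVmomi `vim.VirtualMachine` (methods `ShutdownGuest()` and `PowerOnVM_Task()`, attribute `runtime.powerState` whose string form is `poweredOff` once the VM is off), and the helper name `power_cycle` is hypothetical. It also assumes the guest shutdown request still succeeds in the failing state, which this thread does not confirm.

```python
import time


def power_cycle(vm, timeout=300.0, poll_interval=5.0,
                sleep=time.sleep, clock=time.monotonic):
    """Shut the guest down, wait for power-off, then power it on again.

    `vm` is assumed to look like a pyVmomi vim.VirtualMachine.
    """
    vm.ShutdownGuest()  # ask the guest tools for a clean shutdown
    deadline = clock() + timeout
    # Poll until the VM reports that it is powered off.
    while str(vm.runtime.powerState) != "poweredOff":
        if clock() >= deadline:
            raise TimeoutError("guest did not power off within the timeout")
        sleep(poll_interval)
    # Start the VM again; the returned task can be awaited by the caller.
    return vm.PowerOnVM_Task()
```

The `sleep` and `clock` parameters exist only so the loop can be exercised without a real vCenter connection.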

Here's the full response in the vSphere UI: stacktrace.json
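
To pull the VIX error identifier out of a saved response, something like the sketch below may help. It assumes the saved file has the same `faultMessage` / `arg` / `value` layout as the excerpt quoted in this comment; the helper name `vix_error_ids` is hypothetical.

```python
def vix_error_ids(fault):
    """Collect the trailing identifier of each faultMessage[].arg[].value."""
    ids = []
    for message in fault.get("faultMessage", []):
        for arg in message.get("arg", []):
            value = arg.get("value")
            if isinstance(value, str):
                # "msg.foundryErrMsgId.VIX_E_..." -> "VIX_E_..."
                ids.append(value.rsplit(".", 1)[-1])
    return ids


# The faultMessage portion of the excerpt quoted in this thread.
excerpt = {
    "faultMessage": [
        {
            "_type": "com.vmware.vim.binding.impl.vmodl.LocalizableMessageImpl",
            "key": "msg.vigor.reset.fail",
            "arg": [
                {
                    "_type": "com.vmware.vim.binding.impl.vmodl.KeyAnyValueImpl",
                    "key": "1",
                    "value": "msg.foundryErrMsgId.VIX_E_POWEROP_SCRIPTS_NOT_AVAILABLE",
                }
            ],
            "message": "Failed to reset the virtual machine: Cannot execute scripts.",
        }
    ]
}

if __name__ == "__main__":
    print(vix_error_ids(excerpt))
```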
