Fix RPC input error handling #6

Merged 3 commits on Sep 26, 2022

jonkerj (Collaborator) commented Aug 30, 2022

In our environment (vSphere 7.0, ESXi 7.0) we saw many "Unable to send a message over the communication channel 0" errors, and vSphere did not process any of the guest info being submitted. To debug this, we improved the tool's logging, changed the GitHub workflow so that it actually builds images in our fork, and ultimately fixed the way error responses are sent.

We suspect vSphere/ESXi is sending "unknown" RPC requests, which were answered in an invalid way ("Unknown Command" instead of "ERROR Unknown Command"). We dug into the way open-vm-toolsd handles unknown RPC requests and mimicked that behavior.

The tool now works perfectly in our environment.

@jonkerj jonkerj force-pushed the master branch 2 times, most recently from 93ff5b6 to 018540c Compare August 30, 2022 13:02
It really helps finding out what's going wrong when vsphere and
talos-vmtoolsd are not happy with each other.

Also, log request as string instead of []byte, so logrus does not base64
it.

Signed-off-by: Jorik Jonker <jorik.jonker@eu.equinix.com>
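
A minimal sketch of why logging the request as a string helps. It assumes the logger serializes fields through Go's encoding/json (as a JSON-style formatter would), which encodes []byte values as base64. The helper name encodeField and the sample payload "ping" are hypothetical and not taken from talos-vmtoolsd.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// encodeField is a hypothetical helper that serializes a single log field
// the way a JSON-based log formatter would.
func encodeField(v any) string {
	out, err := json.Marshal(map[string]any{"request": v})
	if err != nil {
		return ""
	}
	return string(out)
}

func main() {
	raw := []byte("ping") // illustrative payload, not a real RPC request

	// A []byte value is emitted as base64, which is hard to read in logs.
	fmt.Println(encodeField(raw))

	// Converting to string first keeps the request human-readable.
	fmt.Println(encodeField(string(raw)))
}
```

With this sketch, the first line prints the base64 form of the payload and the second prints it as plain text.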
open-vm-toolsd seems to prefix error replies with "ERROR " instead of
"ERR " (when there is a callback) or "" (when there is no callback).

Signed-off-by: Jorik Jonker <jorik.jonker@eu.equinix.com>
Remove hardcoded references to main repo, remove dependency on repo
secrets.

Signed-off-by: Jorik Jonker <jorik.jonker@eu.equinix.com>
mologie (Collaborator) commented Sep 26, 2022

Thank you, again, and sorry for taking so long to get back to you on this! It is interesting that vSphere stopped processing guestinfo commands after the invalid response, since I do not observe this behavior with an identical basic stack on my end (vSphere 7 U0-U2 enterprise + vSAN). I wonder which component is sending those requests.

Interestingly, the error you fixed also exists in the original govmomi service implementation and remained undetected there, but I checked against open-vm-tools, and prefixing with ERROR instead of ERR or nothing is the correct behavior.

@mologie mologie changed the title from "Fix talos-vmtoolsd" to "Fix RPC input error handling" Sep 26, 2022
@mologie mologie merged commit e2a8ff4 into siderolabs:master Sep 26, 2022