Fingerprinting Nomad Clients (on restart) on Windows and ESXI hosts can far exceed the heartbeat #10761

Closed
idrennanvmware opened this issue Jun 15, 2021 · 4 comments · Fixed by #10790
Labels: theme/client, theme/dependencies, type/bug

idrennanvmware (Contributor) commented Jun 15, 2021

Nomad version

1.0.4

Operating system and Environment details

Windows

Issue

When restarting a Nomad agent on a Windows VM running on an ESXi host (vSphere), fingerprinting can take over 15 seconds.

Right now we work around this by adjusting the heartbeat TTL settings, but that is not ideal. Unfortunately, this issue causes all Windows allocations to be marked "lost" during a Nomad agent restart, which then causes other disruptions to our cluster(s).
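
For anyone who needs the same workaround in the meantime, here is a rough sketch of the kind of server-side change meant above. It assumes the `heartbeat_grace` and `min_heartbeat_ttl` options of the Nomad `server` block; the values are illustrative placeholders, not the settings actually used in our clusters.

```hcl
# Nomad server agent configuration (illustrative values only; tune for your cluster)
server {
  enabled = true

  # Extra time allowed after a missed heartbeat before the client is
  # considered down. Assumed value, chosen to exceed the ~15s fingerprint time.
  heartbeat_grace = "30s"

  # Lower bound on the heartbeat TTL that servers hand out to clients.
  # Assumed value.
  min_heartbeat_ttl = "30s"
}
```

The tradeoff is that longer values also delay detection of clients that really are down, which is why this is only a stopgap.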

Fix

@fredwangwang has submitted a PR to the underlying library, shirou/gopsutil#1088. When/if it is merged, we request that the fix be included in the next Nomad release.

Thanks!

tgross (Member) commented Jun 15, 2021

Thanks @idrennanvmware. I think getting that fixed in the upstream library is the right approach. Marking this as accepted; once that lands, we'll get it into the next patch release.

tgross added the theme/client and theme/dependencies labels on Jun 15, 2021
idrennanvmware (Contributor, Author) commented

@tgross @shoenig - the above PR just got merged. woohoo.

fredwangwang pushed a commit to fredwangwang/nomad that referenced this issue Jun 19, 2021
fredwangwang added a commit to fredwangwang/nomad that referenced this issue Jun 19, 2021
fredwangwang added a commit to fredwangwang/nomad that referenced this issue Jun 19, 2021
shoenig added this to the 1.1.2 milestone on Jun 21, 2021
idrennanvmware (Contributor, Author) commented

Thanks to everyone for making this such a quick process!

github-actions (bot) commented

I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

github-actions bot locked this issue as resolved and limited conversation to collaborators on Oct 18, 2022