Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Return a meaningful reattach error for non-existent pids on Windows #125

Merged
merged 1 commit into from
Oct 4, 2019

Conversation

lachlanorr
Copy link
Contributor

This fix causes no change in behavior on Unix systems, and only
affects the error return type from reattach() when the reattach data
provided has a non-existent pid on Windows.

With this change in place, reattach() will now consistently return
ErrProcessNotFound on both Windows and Unix systems when the process
is indeed not found.

I encountered this while writing a Windows Nomad Task Driver plugin,
here's more context from the Nomad perspective:

The Reattach info Nomad stores can get stale, whereby the pid no
longer refers to a live process.

This has happened to me regularly (but not always) when
troubleshooting plugin development as I'm testingn what happens when
Nomad is forcefully killed and restarted.

On Windows, when the go-plugin code calls FindProcess with as stale
pid, the following error is always returned:
os.SyscallError - 'OpenProcess: the paremter is incorrect'

On Unix systems, FindProcess never returns an error, at least as
presently defined in the go source code.

In order for Nomad to recognize this condition we need to bubble up
the ErrProcesNotFound error, with which we can then futher propogate
the SingletonPluginExited error from singleton.go.

With these changes in place, whenever a stale pid is found in the
Reattach info, Nomad now properly restarts the plugin exe.

This fix causes no change in behavior on Unix systems, and only
affects the error return type from reattach() when the reattach data
provided has a non-existent pid on Windows.

With this change in place, reattach() will now consistently return
ErrProcessNotFound on both Windows and Unix systems when the process
is indeed not found.

I encountered this while writing a Windows Nomad Task Driver plugin,
here's more context from the Nomad perspective:

The Reattach info Nomad stores can get stale, whereby the pid no
longer refers to a live process.

This has happened to me regularly (but not always) when
troubleshooting plugin development as I'm testingn what happens when
Nomad is forcefully killed and restarted.

On Windows, when the go-plugin code calls FindProcess with as stale
pid, the following error is always returned:
os.SyscallError - 'OpenProcess: the paremter is incorrect'

On Unix systems, FindProcess never returns an error, at least as
presently defined in the go source code.

In order for Nomad to recognize this condition we need to bubble up
the ErrProcesNotFound error, with which we can then futher propogate
the SingletonPluginExited error from singleton.go.

With these changes in place, whenever a stale pid is found in the
Reattach info, Nomad now properly restarts the plugin exe.
@hashicorp-cla
Copy link

hashicorp-cla commented Aug 8, 2019

CLA assistant check
All committers have signed the CLA.

return nil, err
// On Unix systems, FindProcess never returns an error.
// On Windows, for non-existent pids it returns:
// os.SyscallError - 'OpenProcess: the paremter is incorrect'
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for catching - in a similar logs, I noticed the formatting was a bit different:

// os.SyscallError - 'OpenProcess: The parameter is incorrect'

@notnoop
Copy link

notnoop commented Oct 4, 2019

Thanks you @lachlanorr so much - I'll follow up with maintainers to get this PR merged.

@notnoop notnoop merged commit 8091134 into hashicorp:master Oct 4, 2019
notnoop pushed a commit to hashicorp/nomad that referenced this pull request Oct 4, 2019
Upgrade go-plugin to latest to pick up Windows fix in
hashicorp/go-plugin#125 .
notnoop pushed a commit to hashicorp/nomad that referenced this pull request Oct 4, 2019
Upgrade go-plugin to latest to pick up Windows fix in
hashicorp/go-plugin#125 .
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants