driver/docker: protect against nil container #7749

Merged (2 commits) on Apr 20, 2020

@notnoop notnoop commented Apr 19, 2020

Protect against a panic when we attempt to start a container with a name
that conflicts with an existing one. If the existing one is being
deleted while Nomad first attempts to create the container,
`createContainer` fails with `container already exists`, but the
`containerByName` lookup returns a nil container reference, and
dereferencing it causes a crash.

I'm not certain how we get into this state, except for being very
unlucky. I suspect that this case may be the result of a concurrent
restart or the Docker engine API not being fully consistent (e.g. an
earlier call purged the container, but Docker hasn't yet freed up the
resources needed to create a new container with the same name).

If that's the case, then re-attempting creation will hopefully succeed,
or we'd at least fail enough times for the alloc to be rescheduled to
another node.
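
To illustrate the guard described above, here is a minimal, self-contained sketch. It is not the actual diff from this PR: `fakeEngine`, `lookupByName`, `createWithRetry`, and `errNameConflict` are hypothetical stand-ins for the Docker client and the driver's helpers, and reusing the existing container when the lookup succeeds is a simplification made for the example.

```go
package main

import (
	"errors"
	"fmt"
)

// container is a hypothetical stand-in for the Docker client's container type.
type container struct {
	ID   string
	Name string
}

// errNameConflict is a hypothetical sentinel for the "container already exists" error.
var errNameConflict = errors.New("container already exists")

// fakeEngine simulates an engine that keeps a name reserved for a while
// after the old container has started being deleted.
type fakeEngine struct {
	conflictsLeft int // number of create calls that still report a conflict
}

func (e *fakeEngine) create(name string) (*container, error) {
	if e.conflictsLeft > 0 {
		e.conflictsLeft--
		return nil, errNameConflict
	}
	return &container{ID: "c1", Name: name}, nil
}

// lookupByName always returns (nil, nil) here, mimicking a lookup that
// races with the deletion and finds nothing.
func (e *fakeEngine) lookupByName(name string) (*container, error) {
	return nil, nil
}

// createWithRetry retries creation when the name conflicts but the lookup
// finds no container, instead of dereferencing the nil result.
func createWithRetry(e *fakeEngine, name string, maxAttempts int) (*container, error) {
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		c, err := e.create(name)
		if err == nil {
			return c, nil
		}
		if !errors.Is(err, errNameConflict) {
			return nil, err
		}
		existing, lookupErr := e.lookupByName(name)
		if lookupErr != nil {
			return nil, lookupErr
		}
		if existing == nil {
			// The conflicting container is gone already: try creating again.
			continue
		}
		// Simplification for this sketch: hand back the existing container.
		return existing, nil
	}
	return nil, fmt.Errorf("failed to create container %q after %d attempts", name, maxAttempts)
}

func main() {
	c, err := createWithRetry(&fakeEngine{conflictsLeft: 2}, "web", 5)
	fmt.Println(c != nil, err)

	_, err = createWithRetry(&fakeEngine{conflictsLeft: 10}, "web", 3)
	fmt.Println(err)
}
```

With a bounded number of attempts, a persistent conflict surfaces as an ordinary error rather than a panic, which leaves the decision to reschedule the alloc to the normal failure path.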

Fixes #7738

@notnoop notnoop requested a review from tgross April 19, 2020 20:36
@notnoop notnoop self-assigned this Apr 19, 2020
@notnoop notnoop added this to Triaged in Nomad - Community Issues Triage via automation Apr 19, 2020
@tgross tgross left a comment

LGTM.

[ci skip]
@notnoop notnoop merged commit 68d0a9e into master Apr 20, 2020
Nomad - Community Issues Triage automation moved this from Triaged to Done Apr 20, 2020
@notnoop notnoop deleted the b-docker-panic branch April 20, 2020 14:31
github-actions bot commented Jan 9, 2023

I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions.
If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jan 9, 2023
Linked issue (may be closed by merging this pull request): panic/SIGSEGV in Docker driver, invalid memory address or nil pointer dereference