Issue a stop when start container failed with EOF error #2245

Merged
merged 1 commit into from
Oct 24, 2019
4 changes: 3 additions & 1 deletion agent/dockerclient/dockerapi/errors.go
@@ -24,6 +24,8 @@ const (
DockerTimeoutErrorName = "DockerTimeoutError"
// CannotInspectContainerErrorName is the name of container inspect error.
CannotInspectContainerErrorName = "CannotInspectContainerError"
// CannotStartContainerErrorName is the name of container start error.
CannotStartContainerErrorName = "CannotStartContainerError"
// CannotDescribeContainerErrorName is the name of describe container error.
CannotDescribeContainerErrorName = "CannotDescribeContainerError"
)
@@ -201,7 +203,7 @@ func (err CannotStartContainerError) Error() string {

// ErrorName returns name of the CannotStartContainerError
func (err CannotStartContainerError) ErrorName() string {
- return "CannotStartContainerError"
+ return CannotStartContainerErrorName
}

// CannotInspectContainerError indicates any error when trying to inspect a container
14 changes: 13 additions & 1 deletion agent/engine/task_manager.go
@@ -15,6 +15,8 @@ package engine

import (
"context"
"io"
"strings"
"sync"
"time"

@@ -666,10 +668,20 @@ func (mtask *managedTask) handleEventError(containerChange dockerContainerChange
container.SetKnownStatus(currentKnownStatus)
container.SetDesiredStatus(apicontainerstatus.ContainerStopped)
errorName := event.Error.ErrorName()
errorStr := event.Error.Error()
shouldForceStop := false
if errorName == dockerapi.DockerTimeoutErrorName || errorName == dockerapi.CannotInspectContainerErrorName {
// If there's an error with inspecting the container or in case of timeout error,
- // we'll also assume that the container has transitioned to RUNNING and issue
+ // we'll assume that the container has transitioned to RUNNING and issue
// a stop. See #1043 for details
shouldForceStop = true
} else if errorName == dockerapi.CannotStartContainerErrorName && strings.HasSuffix(errorStr, io.EOF.Error()) {
// If we get an EOF error from Docker when starting the container, we don't really know whether the
Contributor

Do we know when an EOF error occurs?

Contributor Author

From my understanding, an EOF means the client didn't read anything from the response, which can happen if the connection closes before the server writes the response.

// container is started anyway. So issuing a stop here as well. See #1708.
shouldForceStop = true
}

if shouldForceStop {
seelog.Warnf("Managed task [%s]: forcing container [%s] to stop",
mtask.Arn, container.Name)
go mtask.engine.transitionContainer(mtask.Task, container, apicontainerstatus.ContainerStopped)
12 changes: 12 additions & 0 deletions agent/engine/task_manager_test.go
@@ -139,6 +139,18 @@ func TestHandleEventError(t *testing.T) {
ExpectedContainerDesiredStatusStopped: true,
ExpectedOK: false,
},
{
Name: "Start failed with EOF error",
EventStatus: apicontainerstatus.ContainerRunning,
CurrentContainerKnownStatus: apicontainerstatus.ContainerCreated,
Error: &dockerapi.CannotStartContainerError{
FromError: errors.New("error during connect: Post http://%2Fvar%2Frun%2Fdocker.sock/v1.19/containers/containerid/start: EOF"),
},
ExpectedContainerKnownStatusSet: true,
ExpectedContainerKnownStatus: apicontainerstatus.ContainerCreated,
ExpectedContainerDesiredStatusStopped: true,
ExpectedOK: false,
},
{
Name: "Inspect failed during create",
EventStatus: apicontainerstatus.ContainerCreated,