Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add timeout for Docker Version API #1363

Merged
merged 4 commits into from
Apr 26, 2018
Merged

Add timeout for Docker Version API #1363

merged 4 commits into from
Apr 26, 2018

Conversation

aaithal
Copy link
Contributor

@aaithal aaithal commented Apr 25, 2018

Summary

Add timeout for Docker Version API

Implementation details

  • b1dae3d dockerclient: add timeout for Version API
    This change adds a timeout for Docker client's Version API.
    Additionally, the Docker version returned in saved in the task engine.
    This helps with avoiding additional unnecessarily calls to the Docker
    Version API everytime the agent connects to the backend. This should
    help with issues seen in ecs-agent 1.17.2 stop connection to AWS and start reporting expired credential to task #1331

    Finally, the dockerclient package has been refactored to extract all
    timeout related constants to dockerclient package from the dockerapi
    package to avoid circular dependencies.

  • 63f0650 updated go-dickerclient for VersionWithContext
    Updated go-dockerclient to get the VersionWithContext() method, which
    helps with adding timeouts against the Docker Version() API, which can
    occasionally timeout.

Testing

  • Builds on Linux (make release)
  • Builds on Windows (go build -out amazon-ecs-agent.exe ./agent)
  • Unit tests on Linux (make test) pass
  • Unit tests on Windows (go test -timeout=25s ./agent/...) pass
  • Integration tests on Linux (make run-integ-tests) pass
  • Integration tests on Windows (.\scripts\run-integ-tests.ps1) pass
  • Functional tests on Linux (make run-functional-tests) pass
  • Functional tests on Windows (.\scripts\run-functional-tests.ps1) pass

New tests cover the changes:

Description for the changelog

Licensing

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

Updated go-dockerclient to get the VersionWithContext() method, which
helps with adding timeouts against the Docker Version() API, which can
occasionally timeout.
This change adds a timeout for Docker client's Version API.
Additionally, the Docker version returned in saved in the task engine.
This helps with avoiding additional unnecessarily calls to the Docker
Version API everytime the agent connects to the backend. This should
help with issues seen in aws#1331

Finally, the dockerclient package has been refactored to extract all
timeout related constants to dockerclient package from the dockerapi
package to avoid circular dependencies.
@aaithal aaithal added this to the 1.17.4 milestone Apr 25, 2018
aaithal added a commit to aaithal/amazon-ecs-agent that referenced this pull request Apr 25, 2018
@@ -292,7 +292,6 @@
name = "golang.org/x/net"
packages = [
"context",
"context/ctxhttp",

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is nice, I think go docker client finally moved to builtin context. Out of this PR: probably we can get rid of golang.org/x/ dependency?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of this PR: probably we can get rid of golang.org/x/ dependency?

Not sure, there are many sub-packages that we have a dependency on. Which one in particular you want to get rid of?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the golang.org/x/net can be removed.

@@ -1126,7 +1131,18 @@ func (engine *DockerTaskEngine) State() dockerstate.TaskEngineState {

// Version returns the underlying docker version.
func (engine *DockerTaskEngine) Version() (string, error) {
return engine.client.Version()
if engine.dockerVersion != "" {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think we need a lock for this method? Also, do we need a retry for this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this needs a retry. Looking at the usage, there's a remote possibility of a race condition. I'll add a scoped down mutex just for this for now. We can expand it's scope to the whole of DockerTaskEngine if needed.

client, err := dg.dockerClient()
if err != nil {
return "", err
}
info, err := client.Version()
info, err := client.VersionWithContext(derivedCtx)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we not listening to the ctx.Done channel like the other Docker APIs?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other APIs require special error handling on timeouts. They all require the output from Docker APIs to be massaged in some way and are invoked in an internal async method. Version() doesn't need any such semantics.

// if the client version returns a MinAPIVersion and APIVersion, then use it to return
// all the Docker clients between MinAPIVersion and APIVersion, else try pinging
// the clients in getKnownAPIVersions
var minAPIVersion, apiVersion string
// get a Docker client with the default supported version
client, err := newVersionedClient(endpoint, string(minDockerAPIVersion))
if err == nil {
clientVersion, err := client.Version()
derivedCtx, cancel := context.WithTimeout(ctx, dockerclient.VersionTimeout)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't just call client.Version here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because Version() is defined on dockerGoClient. This one is dockeriface.Client

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we not calling dockerGoClient's Version instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because it's not there. dockerGoClient belongs to the engine package and is only used by DockerTaskEngine

CHANGELOG.md Outdated
@@ -1,5 +1,8 @@
# Changelog

## 1.17.4-dev
* Bug - Fixed a bug where Docker Version() API takes never returns by adding a timeout [#1363](https://github.com/aws/amazon-ecs-agent/pull/1363)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: API never returns by..

Moved the logic of caching Docker daemon version string out of the
DockerTaskEngine to dockerGoClient struct. This makes it much more
reusable across the agent.

Also, got rid of the unused SetDockerClient() method in the
DockerTaskEngine.
Copy link

@richardpen richardpen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@@ -1,5 +1,8 @@
# Changelog

## 1.17.4-dev
* Bug - Fixed a bug where Docker Version() API never returns by adding a timeout [#1363](https://github.com/aws/amazon-ecs-agent/pull/1363)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we put Version() in `` ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it matters. I can modify it if you really insist (prefer not to 😃 ).

// if the client version returns a MinAPIVersion and APIVersion, then use it to return
// all the Docker clients between MinAPIVersion and APIVersion, else try pinging
// the clients in getKnownAPIVersions
var minAPIVersion, apiVersion string
// get a Docker client with the default supported version
client, err := newVersionedClient(endpoint, string(minDockerAPIVersion))
if err == nil {
clientVersion, err := client.Version()
derivedCtx, cancel := context.WithTimeout(ctx, dockerclient.VersionTimeout)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we not calling dockerGoClient's Version instead?

@aaithal aaithal merged commit 256a7fd into aws:dev Apr 26, 2018
@aaithal aaithal modified the milestones: 1.17.4, 1.18.0 May 10, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants