
Upgrade go docker client #987

Merged · richardpen merged 5 commits into aws:dev from upgrade-go-docker-client on Oct 12, 2017

Conversation

@richardpen commented Sep 20, 2017

Summary

Upgrade go docker client

Implementation details

The only change here is adding a buffer to the channel that receives Docker events, because go-dockerclient changed its behavior: if the listener's buffer is full, the event won't be sent. See the related code here
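
For context, a minimal sketch of the buffered-listener setup (illustrative only; it assumes go-dockerclient's AddEventListener API, and everything except containerEventsBufferSize is hypothetical):

package main

import (
	"log"

	docker "github.com/fsouza/go-dockerclient"
)

// containerEventsBufferSize mirrors the constant added in this PR.
const containerEventsBufferSize = 1000

func main() {
	client, err := docker.NewClientFromEnv()
	if err != nil {
		log.Fatal(err)
	}

	// A buffered channel leaves room for bursts of events; with the upgraded
	// go-dockerclient, an event sent to a listener whose buffer is full is
	// dropped instead of blocking the client.
	events := make(chan *docker.APIEvents, containerEventsBufferSize)
	if err := client.AddEventListener(events); err != nil {
		log.Fatal(err)
	}

	for event := range events {
		log.Printf("docker event: %s %s", event.Status, event.ID)
	}
}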

Testing

  • Builds on Linux (make release)
  • Builds on Windows (go build -o amazon-ecs-agent.exe ./agent)
  • Unit tests on Linux (make test) pass
  • Unit tests on Windows (go test -timeout=25s ./agent/...) pass
  • Integration tests on Linux (make run-integ-tests) pass
  • Integration tests on Windows (.\scripts\run-integ-tests.ps1) pass
  • Functional tests on Linux (make run-functional-tests) pass
  • Functional tests on Windows (.\scripts\run-functional-tests.ps1) pass

New tests cover the changes:

Description for the changelog

Licensing

This contribution is under the terms of the Apache 2.0 License:

@@ -39,7 +39,8 @@ import (
 )

 const (
 	dockerDefaultTag = "latest"
+	containerEventsBufferSize = 1000
Contributor

Why was this changed? Please add a description in either the commit message or as inline documentation. Doesn't this open up the possibility of the ECS Agent processing multiple events from docker simultaneously? Have you analyzed the impact of the same on the container and task engines?

Author

The description is added in the Summary; I will add the same to the commit message.
I don't think the ECS Agent can process multiple events simultaneously; based on the code, the events will be handled serially. But I could be wrong, please correct me if that's not true.

Have you analyzed the impact of the same on the container and task engines?

Can you elaborate here? I don't think this will have an impact on the container or task engine, since the behavior of handling events doesn't change. The events will still be handled serially in the task engine.

@@ -39,7 +39,8 @@ import (
 )

 const (
 	dockerDefaultTag = "latest"
+	containerEventsBufferSize = 1000
Contributor

Why 1000?
Could you please add more details here?

Author

1000 was picked because the maximum number of containers that can run on an instance is ~1000. Let me know if you have better ideas.

Contributor

That's not accurate. There's a limit of 1023 containers when using the bridge network mode, but there is no limit for containers using host or none.

Contributor

A note/comment here is going to be useful.
It would be nice to calculate this value rather than using a fixed constant, imho.

@richardpen force-pushed the upgrade-go-docker-client branch from 9deeed2 to 5fed2e5 on September 21, 2017 18:41
@richardpen
Author

@aaithal @samuelkarp @vsiddharth I have changed this to the same behavior as before, where the goroutine that was used to send events was moved from go-dockerclient into the agent. Let me know if you have other suggestions.

@@ -712,13 +715,24 @@ func metadataFromContainer(dockerContainer *docker.Container) DockerContainerMet
return metadata
}

// CopyDockerEventsWithoutBlocking copies the event from input to output without blocking the input
func CopyDockerEventsWithoutBlocking(in chan *docker.APIEvents, out chan *docker.APIEvents) {
Contributor

I don't understand how this doesn't block. This just means that you have an effective buffer of 2*containerEventsBufferSize and if that's full, you'll still block...

Author

Even if out is full, reads from in won't block, since the loop simply reads and creates a goroutine to do the write. In that case these goroutines will be blocked until out isn't full, but go-dockerclient can keep writing to in so that the agent won't miss any events.

In a rare case, if creating the goroutines slows down (meaning the producer is much faster than the consumer), in could become full; that's why I gave in a buffer size of 100. This is the best way I can think of, please suggest.
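
A rough sketch of the pattern described above, with hypothetical names (this is a simplification, not the exact PR code):

import docker "github.com/fsouza/go-dockerclient"

// copyEventsWithoutBlocking illustrates the idea: the reader of in never
// blocks, because each event is handed to its own goroutine; only those
// goroutines block while out is full.
func copyEventsWithoutBlocking(in <-chan *docker.APIEvents, out chan<- *docker.APIEvents) {
	for event := range in {
		go func(e *docker.APIEvents) {
			out <- e // may block, but does not stall the reader of in
		}(event)
	}
}

The trade-off, discussed below, is that each pending event costs a goroutine and the relative order of delivery to out is no longer guaranteed.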

Contributor

I missed that you're creating a goroutine for each event. I think this is not a good idea; we'll have goroutine explosion and will lose any ordering that was provided by go-dockerclient.

You can implement a non-blocking, infinite (subject to how much memory you have) buffer through appending to a slice; a rough sketch follows this list. However, I think the more important question here is to determine what the correct behavior should be:

  • Do we rely on every single event being delivered? We have logic that tolerates event loss and reconciles with explicit inspects.
  • What happens when we lose an event? Our reconciliation runs every 10 minutes, so we could have a 10-minute window of inconsistency.
  • Do we need every event? No, there's only a small subset of events that we pay attention to at all.
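
A minimal sketch of the slice-backed buffer idea (hypothetical names, not the agent's implementation):

import (
	"sync"

	docker "github.com/fsouza/go-dockerclient"
)

// eventBuffer grows without bound (subject to available memory), so the
// producer side never blocks and event order is preserved.
type eventBuffer struct {
	lock   sync.Mutex
	events []*docker.APIEvents
}

// push appends an event; it never blocks the producer.
func (b *eventBuffer) push(event *docker.APIEvents) {
	b.lock.Lock()
	defer b.lock.Unlock()
	b.events = append(b.events, event)
}

// pop removes and returns the oldest buffered event; ok is false when the
// buffer is empty, so the consumer can poll or wait before retrying.
func (b *eventBuffer) pop() (event *docker.APIEvents, ok bool) {
	b.lock.Lock()
	defer b.lock.Unlock()
	if len(b.events) == 0 {
		return nil, false
	}
	event = b.events[0]
	b.events = b.events[1:]
	return event, true
}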

Author

I agree this is not a good idea, but it is the same as how the agent works today. Previously the goroutine to send each event was created in the go-dockerclient code; now they are created in the agent code.

Currently, go-dockerclient doesn't guarantee the order of the events sent, so I think the agent already has some tolerance for event ordering.

Do we rely on every single event being delivered? We have logic that tolerates event loss and reconciles with explicit inspects.
What happens when we lose an event? Our reconciliation runs every 10 minutes, so we could have a 10-minute window of inconsistency.

Ten minutes is really a big gap; we probably don't want task status change reports to have a 10-minute delay.

Do we need every event? No, there's only a small subset of events that we pay attention to at all.

We don't need all of these events, but the agent has to do the filtering itself; currently go-dockerclient doesn't support filters.

I will try to implement a non-blocking infinite buffer to solve the goroutine explosion problem while keeping the order of the events. Any concerns or suggestions?

Contributor

It might make more sense to filter here so we can discard early. It also might make more sense to do analysis of expected event volume under high load to determine if we can figure out an appropriate buffer size.

@richardpen force-pushed the upgrade-go-docker-client branch from 5fed2e5 to c5ed3ff on September 29, 2017 17:59
@richardpen
Author

@aaithal @samuelkarp @vsiddharth I did some tests starting and stopping ~800 containers: no events are missed with a buffer size of 100, while with a buffer size of 1 some events are still missed. Also, from the benchmark test, Docker produces at most about 60 events per second. Please let me know if you have other concerns.

@aaithal
Contributor

aaithal commented Sep 29, 2017

@richardpen Awesome! Thanks for doing that. Can you post the details of your experiments here (scripts if you have some) so that we can reproduce/rerun these tests in the future? My concern is that 60 events per second might be the max tps that the docker daemon is hitting today, but it can change with improvements made to that code base. It's not a good place to be, constantly worrying about the buffer size we have with each new release of docker. We should ensure that we have a better reconciliation story for missing events altogether, because at some point in time we will miss these events.

@richardpen
Author

@aaithal, I agree with you that this still depends on the tps of docker events, and that it isn't the only factor that impacts the behavior. It is also impacted by the speed of writes to the channel (go-dockerclient writing events) and of reads from the channel plus goroutine creation (the agent reading the channel and creating a goroutine to record the event). I'm not sure how much better we can do, but this should unblock us for now to upgrade go-dockerclient. We could have other options in the future, like making the reconciliation interval configurable or moving to a different client library.

The test I did for benchmarking the docker events spike launches short-lived containers on a high-performance instance and counts the output of docker events, e.g.:

#!/bin/bash

for i in {1..1000};
do
    docker run -d --rm ubuntu sleep 5s &
done

The test with the ecs-agent tries to launch 1000 short-lived containers (with sleep 1m) on a high-performance instance and counts the events sent by go-dockerclient versus the events received by the agent.

@aaithal
Contributor

aaithal commented Oct 2, 2017

@richardpen Thanks. I'd be a lot more comfortable with this change if we reduced our steady state polling/reconciliation time to 5m instead of the 10m that it's currently set to. I'd even say that something like 2m or 3m should be fine. But, at the very least, we should reduce it to 5m so that the worst-case latency of missing an event and reporting the same to the backend is reduced to half of what it's set to today.

containerTypeEvent = "container"
)

var ignoredDockerEvents = []string{"rename", "kill", "destroy", "pause", "exec_create", "exec_start", "top", "attach", "export", "pull", "push", "tag", "untag", "import", "delete"}
Contributor

Instead of doing this, we should filter events by specifying the type=container parameter for the /events API. We should add support for this in the vendored library here. I understand that this may require a new PR; if you want to address this separately, please add a TODO for that.

If you're doing a TODO with this PR, please reformat this list to be the reverse of what it is right now. That is, let's have a list of not ignored Docker events and accept only that as this list might/will grow over time. And, make it so that each list entry is added in a new line:

var events = []string{
 "event1",
 "event2",
 ...
}

}

// Serve starts reading from the input channel and writes to the buffer
func (buffer *InfiniteBuffer) Serve(events chan *docker.APIEvents) {
Contributor

The very common Serve() method from http is a blocking call and it might be confusing for us to have a method named Serve that's not blocking. Can you rename this to StartListener() or something?

}

// Write writes the event into the buffer
func (buffer *InfiniteBuffer) Write(event *docker.APIEvents) {
Contributor

CopyEvents might be a better name here?

type InfiniteBuffer struct {
events []*docker.APIEvents
empty bool
waitDone chan struct{}
Contributor

Can we use a sync.WaitGroup instead for this? It makes it simpler to understand the code.

richardpen added 2 commits October 6, 2017 21:48
@richardpen force-pushed the upgrade-go-docker-client branch from 854b23f to b09465f on October 9, 2017 17:58
@aaithal (Contributor) left a comment

Changes look good to me. Can you set this up on an instance and run a stress workload and make sure that the state transition latencies do not go up?

@richardpen
Author

From the benchmark test, latencies were reduced and the failure rate did not increase. Any other concerns?

@vsiddharth (Contributor) left a comment

LGTM :shipit:

@@ -42,6 +42,8 @@ const (
dockerDefaultTag = "latest"
// imageNameFormat is the name of a image may look like: repo:tag
imageNameFormat = "%s:%s"
// the buffer size will ensure agent doesn't miss any event from docker
Contributor

nit: dockerEventBufferSize is used to ensure ....

@richardpen merged commit 1045da5 into aws:dev on Oct 12, 2017
@richardpen deleted the upgrade-go-docker-client branch on November 21, 2017 01:04