-
Notifications
You must be signed in to change notification settings - Fork 618
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
dependencygraph: Enforce shutdown order #1866
dependencygraph: Enforce shutdown order #1866
Conversation
435b6b5
to
6ec0db1
Compare
c983940
to
b33029c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
once tests pass
are the ft failures due to timeouts? |
Great question. They were passing before i rebased 😖 |
492919c
to
0d7f7fe
Compare
Nevermind, it was panicking. I'm looking into it |
We now apply shutdown order in any dependency case, including dependsOn directives, links, or volumes. What this means is that agent will now make a best effort attempt to shutdown containers in the inverse order they were created. For example, a container using a link for communication will wait until the linked container has terminated before terminating itself. Likewise, a container named in another container's dependsOn directive will wait until that dependent container terminates. One note about the current implementation is that the dependencies aren't assumed to be transitive, so if a chain exists such that: A -> B -> C Container "C" will shutdown before "B", but it won't check status against container "A" explicity. If A depends on C, we expect: A -> B -> C A -> C The lack of transitive dependency logic applies is consistent with startup order as of this commit.
The link / volume dependency tests are now affected by shutdown order, so the tests now take longer. Previously, it would take a max of 30s (the default docker stop timeout for agent). Now, since the containers stop in order, it will take a max of 30s * n, where n is the number of containers. Increasing the test timeout is a short term fix until we have granular start/stop timeouts plumbed through the ecs service.
Instead of explicitly checking against many conditions, we now validate that the expected condition has progressed beyond started This mirrors prior behavior in the codebase, and reduces cyclo complexity.
0d7f7fe
to
89bb2e8
Compare
Logdriver test is failing suddenly. I've made a note to fix the test in #1835 |
Summary
Enforce shutdown ordering within a task.
Implementation details
We now apply shutdown order in any dependency case, including
dependsOn directives, links, or volumes. What this means is that
agent will now make a best effort attempt to shutdown containers
in the inverse order they were created.
For example, a container using a link for communication will wait
until the linked container has terminated before terminating
itself. Likewise, a container named in another container's
dependsOn directive will wait until that dependent container
terminates.
One note about the current implementation is that the dependencies
aren't assumed to be transitive, so if a chain exists such that:
A -> B -> C
Container "C" will shutdown before "B", but it won't check status
against container "A" explicity. If A depends on C, we expect:
A -> B -> C
A -> C
The lack of transitive dependency logic applies is consistent with
startup order as of this commit.
Testing
Added additional tests to cover the shutdown use cases.
Licensing
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.