Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stop container when dependencies are non-resolvable #3218

Merged
merged 6 commits into from
May 25, 2022

Conversation

angelcar
Copy link
Contributor

@angelcar angelcar commented May 19, 2022

Summary

Addresses #2579 where a task can be stuck in PENDING for ever when container dependencies cannot be fulfilled.

Implementation details

Adding the concept of terminal dependency error to detect cases where container dependencies can never succeed. When a terminal dependency error is detected, we simply forcefully stop the container with the "bad" dependencies.

This new behavior triggers a chain reaction that eventually stops the task if an essential container has terminal dependency errors. If essential containers don't have terminal dependency errors, then the task is allowed to start.

Containers that are forcefully stopped are marked with exitCode 143. This is the code docker assigns when a container receives SIGTERM. Although we technically don't send SIGTERM to containers stopped due to terminal dependency errors, this exit code is the closest to reality. However, customers will be able to see a text description of the exact error when this edge case is hit.

For example (from AWS console):

Status reason CannotStartContainerError: dependency graph: failed to resolve container ordering dependency [container-1(public.ecr.aws/u5s6c6g2/just-sleep) (STOPPED->RUNNING) - Exit: 1] for target [container-2(public.ecr.aws/u5s6c6g2/just-sleep) (NONE->RUNNING)] as dep
143
["/sleep","10"]

Testing

New tests cover the changes: yes

Description for the changelog

Bug - Fix an issue where a task can be stuck in PENDING for ever when container dependencies can never be fulfilled.

Licensing

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@angelcar angelcar force-pushed the fix/container_ordering_stuck branch from de4672d to 1836472 Compare May 19, 2022 22:18
@angelcar angelcar marked this pull request as ready for review May 19, 2022 23:20
@angelcar angelcar changed the title [WIP]Stop container when dependencies are non-resolvable Stop container when dependencies are non-resolvable May 19, 2022
Copy link
Contributor

@singholt singholt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🙌

Copy link
Member

@fierlion fierlion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Thanks for the well reasoned PR with clear tests which act as a walkthrough of the changes.)

@angelcar angelcar merged commit 2dd337e into aws:dev May 25, 2022
@ubhattacharjya ubhattacharjya added this to the 1.61.2 milestone Jun 1, 2022
rsheik29 pushed a commit to rsheik29/amazon-ecs-agent that referenced this pull request Jul 11, 2022
Stop container when dependencies are non-resolvable
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants