Don't register into cluster if instance is part of warm pool #3000
Comments
Thanks for bringing this to our attention. Unfortunately, warm pool lifecycle states are an attribute of the Auto Scaling group. As of now, the ECS Agent is not aware of whether it is running in a warm-pool instance or not; furthermore, it is not directly aware of the Auto Scaling group itself.
That being said, when the EC2 instance stops, the ECS Agent disconnects, which makes the instance ineligible for task placement despite being registered with the cluster. This is because the cluster "knows" whether the agent is registered and only sends tasks to instances with a healthy ECS Agent. However, the agent will be seen as "connected" for as long as it remains running. If this situation is causing problems for your use case, one solution would be to stop the ECS Agent at the beginning of your userdata script if the instance state is not "InService", as indicated by the warm pool LifecycleState.
Pseudo code for the userdata script would look like:
WARM_POOL_DATA=`aws autoscaling describe-warm-pool --auto-scaling-group-name my-asg`
INSTANCE_ID=`curl http://169.254.169.254/latest/meta-data/instance-id`
# Search for $INSTANCE_ID in $WARM_POOL_DATA
# Retrieve LifecycleState for $INSTANCE_ID
sudo systemctl start ecs # No-op if the service is already running
if [ "$LifecycleState" != "InService" ]; then
  sudo systemctl stop ecs
fi
Thoughts? |
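For reference, a more concrete, runnable sketch of that userdata script is below. This is only an illustration, not the agent's or ECS's own mechanism: it assumes jq is installed on the instance, that the instance profile allows autoscaling:DescribeWarmPool, that the AWS CLI has a default region configured, and that my-asg stands in for the real Auto Scaling group name.
#!/bin/bash
set -euo pipefail
ASG_NAME="my-asg" # placeholder for the actual Auto Scaling group name
# Instance ID from the instance metadata service
INSTANCE_ID=$(curl -s http://169.254.169.254/latest/meta-data/instance-id)
# Describe the warm pool and extract this instance's LifecycleState (empty if the instance is not listed)
WARM_POOL_DATA=$(aws autoscaling describe-warm-pool --auto-scaling-group-name "$ASG_NAME")
LIFECYCLE_STATE=$(echo "$WARM_POOL_DATA" | jq -r --arg id "$INSTANCE_ID" '.Instances[] | select(.InstanceId == $id) | .LifecycleState')
# No-op if the service is already running
sudo systemctl start ecs
# Stop the agent only when the instance is listed in the warm pool and is not yet InService,
# so it does not stay connected to the cluster while warming up
if [ -n "$LIFECYCLE_STATE" ] && [ "$LIFECYCLE_STATE" != "InService" ]; then
  sudo systemctl stop ecs
fi
The extra emptiness check is there so that an instance launched straight into service (and therefore not listed in the warm pool output) is left untouched.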
Unfortunately, I've seen ECS try to assign a task to an instance from the warm pool, I guess because it takes a while before the lack of health checks from the agent results in the ECS instance being deemed "not connected".
We will work on this and update this issue when there are more details. In the meantime, feel free to use the workaround proposed. Thanks.
Summary
When using Auto Scaling group warm pools, an instance gets registered in the cluster even though it's about to be shut down (to be made part of the warm pool). I wonder if the agent should defer registration until the instance is in the right lifecycle state. This would avoid instances showing up in the cluster list and the potential for tasks to be assigned to an instance that really shouldn't receive them.