
Capacity Rebalance support #1124

Closed
2 of 4 tasks
zeyaddeeb opened this issue Nov 27, 2020 · 9 comments · Fixed by #1326

Comments

@zeyaddeeb

I have issues

I was looking to add capacity rebalance to some worker launch templates and ran into issues:

Example:

resource "aws_autoscaling_group" "example" {
  capacity_rebalance = true
}

as per this PR: hashicorp/terraform-provider-aws#16127

I'm submitting a...

  • bug report
  • feature request
  • support request - read the FAQ first!
  • kudos, thank you, warm fuzzy

What is the current behavior?

Adding capacity_rebalance = true to a worker group launch template does not work, since the module has no corresponding lookup value.

If this is a bug, how to reproduce? Please include a code sample if relevant.

What's the expected behavior?

Are you able to fix this problem and submit a PR? Link here if you have already.

Environment details

  • Affected module version: latest
  • OS: ubuntu18.04
  • Terraform version: 0.13

Any other relevant info

hashicorp/terraform-provider-aws#16127

@AdamTylerLynch

AdamTylerLynch commented Nov 30, 2020

@zeyaddeeb you mention a Launch Template, but your example is showing creating an Auto Scaling group. Can you please share your code so we can help troubleshoot?

@AdamTylerLynch

Capacity Rebalance is not a property of a launch template, though; it is a property of the Auto Scaling group. You can create an Auto Scaling group and associate it with a launch template (or multiple launch templates in the latest versions of the SDK).
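
For illustration only, a rough sketch of that association using the plain AWS provider resources (the resource names, AMI variable, instance type, and subnet variable below are placeholders, not anything defined by this module):

resource "aws_launch_template" "workers" {
  name_prefix   = "workers-"
  image_id      = var.worker_ami_id # placeholder variable
  instance_type = "m5.large"
}

resource "aws_autoscaling_group" "workers" {
  name                = "workers"
  min_size            = 1
  max_size            = 3
  vpc_zone_identifier = var.subnet_ids # placeholder variable

  # Capacity Rebalance is set on the ASG, not on the launch template.
  capacity_rebalance = true

  launch_template {
    id      = aws_launch_template.workers.id
    version = "$Latest"
  }
}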

@zeyaddeeb
Author

Maybe I'm using the wrong terminology, but it would look something like this:

...

  worker_groups_launch_template = [
    {
      name                                     = "worker1"
      subnets                                  = var.subnet_ids
      asg_min_size                             = 1
      asg_desired_capacity                     = 1
      asg_max_size                             = 1
      autoscaling_enabled                      = true
      capacity_rebalance                       = true   # desired variable
    },

...

@AdamTylerLynch

AdamTylerLynch commented Dec 2, 2020

@zeyaddeeb I misunderstood. Thank you for the clarification.

@AdamTylerLynch

After reviewing: as of this posting, AWS EKS/ECS does not support the Capacity Rebalance feature.

@petekneller

@AdamTylerLynch When you say Capacity Rebalancing is not supported by EKS, can you elaborate please? I don't understand how a feature on the ASG requires direct support in EKS. We use spot instances managed by EKS and were, like the OP, looking to enable Capacity Rebalance. I enabled it manually for an ASG via the console and it seemed to have the expected effect. When I then went looking for the correct option on this TF module I found my way here. Are you able to clarify what would need to happen in EKS before this feature was available in the module?

@davidgp1701

Hi,

I would also be interested in adding this to the module. I'm even willing to submit a PR for it, but as @petekneller commented, I'm not sure I understand why @AdamTylerLynch says it is not compatible with EKS.

So, my idea of why this feature would be interesting is the following. I have deployed both the Cluster Autoscaler (https://docs.aws.amazon.com/eks/latest/userguide/cluster-autoscaler.html) and the AWS Node Termination Handler (https://github.com/aws/aws-node-termination-handler). In my mind the workflow would be like this:

  1. At some point a spot instance in the ASG will get a Rebalance Recommendation notification (https://docs.aws.amazon.com/autoscaling/ec2/userguide/capacity-rebalance.html), meaning the instance is at elevated risk of being reclaimed. This notification arrives much earlier than the spot interruption notice, which gives only 120 seconds.
  2. On arrival of that notification, you can configure the AWS Node Termination Handler to cordon the node (mark it unschedulable) and, since last month (feat: add ability to drain on rebalance aws/aws-node-termination-handler#400), also to drain it. That is what I would like.
  3. At the same time, if your spot instances use the "capacity-optimized" allocation strategy and the ASG has "Capacity Rebalance" enabled, the ASG will automatically launch a replacement instance for the one that is about to be reclaimed.
  4. The new instance is created and automatically registered in EKS, the at-risk node is drained, and its pods migrate to the new one. After the draining process ends, two things can happen: the instance is terminated by AWS, or the Cluster Autoscaler sees that no pod has used it for the configured time limit and removes it from the ASG.

That should optimize the usage of spot instances. Maybe I'm missing something. As far as I can see, this option is supported by the ASG Terraform resource: https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/autoscaling_group#capacity_rebalance . A rough sketch of such a configuration is shown below.
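
Not something the module exposes yet, but as a sketch of step 3 above using the plain ASG resource (all names, sizes, the subnet variable, and the launch template reference are placeholders):

resource "aws_autoscaling_group" "spot_workers" {
  name                = "spot-workers"
  min_size            = 1
  max_size            = 5
  desired_capacity    = 2
  vpc_zone_identifier = var.subnet_ids # placeholder variable

  # Proactively replace spot instances that receive a rebalance recommendation.
  capacity_rebalance = true

  mixed_instances_policy {
    instances_distribution {
      on_demand_base_capacity                  = 0
      on_demand_percentage_above_base_capacity = 0
      spot_allocation_strategy                 = "capacity-optimized"
    }

    launch_template {
      launch_template_specification {
        launch_template_id = aws_launch_template.workers.id # placeholder launch template
        version            = "$Latest"
      }
    }
  }
}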

@LAKostis
Contributor

#1326

@github-actions

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Nov 20, 2022