Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug] [aws_emr_instance_fleet] allocation_strategy does not detect correctly #34151

Closed
hongbo-miao opened this issue Oct 29, 2023 · 8 comments · Fixed by #34367
Closed

[Bug] [aws_emr_instance_fleet] allocation_strategy does not detect correctly #34151

hongbo-miao opened this issue Oct 29, 2023 · 8 comments · Fixed by #34367
Labels
bug Addresses a defect in current functionality. good first issue Call to action for new contributors looking for a place to start. Smaller or straightforward issues. service/emr Issues and PRs that pertain to the emr service.
Milestone

Comments

@hongbo-miao
Copy link

hongbo-miao commented Oct 29, 2023

Terraform Core Version

1.6.2

AWS Provider Version

5.23.1

Affected Resource(s)

  • aws_emr_instance_fleet

Expected Behavior

I have Amazon EMR Instance Fleet provisioned by Terraform. Inside task node allocation_strategy is using "price-capacity-optimized". Terraform created the whole EMR cluster correctly.

image

However, I run terraform apply second time, Terraform cannot detect current Amazon EMR Instance Fleet's allocation_strategy correctly.

image

Actual Behavior

Terraform detects allocation_strategy wrong. It thinks it is "capacity-optimized" which actually is "price-capacity-optimized".

Relevant Error/Panic Output Snippet

No response

Terraform Configuration Files

resource "aws_emr_cluster" "hm_amazon_emr_cluster" {
  name                              = var.amazon_emr_cluster_name
  release_label                     = var.amazon_emr_version
  applications                      = ["Trino"]
  termination_protection            = true
  keep_job_flow_alive_when_no_steps = true
  log_uri                           = "s3://integrated-test-data-archer/amazon-emr/logs/"
  ec2_attributes {
    instance_profile                  = "arn:aws:iam::272394222652:instance-profile/hm-emr-profile"
    subnet_id                         = "subnet-xxxxxxxxxxxxxxxxx"
    emr_managed_master_security_group = "sg-xxxxxxxxxxxxxxxxx"
    emr_managed_slave_security_group  = "sg-xxxxxxxxxxxxxxxxx"
    service_access_security_group     = "sg-xxxxxxxxxxxxxxxxx"
    key_name                          = "hm-ec2-key-pair"
  }
  master_instance_fleet {
    name                      = "Primary"
    target_on_demand_capacity = 1
    launch_specifications {
      on_demand_specification {
        allocation_strategy = "lowest-price"
      }
    }
    instance_type_configs {
      instance_type     = "r6a.xlarge"
      weighted_capacity = 1
    }
  }
  core_instance_fleet {
    name                      = "Core"
    target_on_demand_capacity = 1
    launch_specifications {
      on_demand_specification {
        allocation_strategy = "lowest-price"
      }
    }
    instance_type_configs {
      instance_type     = "r6a.2xlarge"
      weighted_capacity = 1
    }
  }
  bootstrap_action {
    name = "set_up"
    path = "s3://hongbomiao-bucket/amazon-emr/hm-amazon-emr-cluster-trino/bootstrap-actions/set_up.sh"
  }
  configurations_json = jsonencode(
    [
      {
        Classification : "delta-defaults",
        Properties : {
          "delta.enabled" : "true"
        }
      },
      {
        Classification : "trino-connector-delta",
        Properties : {
          "hive.metastore" : "glue"
        }
      }
    ]
  )
  service_role = "arn:aws:iam::27239422xxxx:role/service-role/AmazonEMR-ServiceRole-hm"
}

resource "aws_emr_instance_fleet" "hm_amazon_emr_cluster_task_instance_fleet" {
  cluster_id           = module.hm_amazon_emr_cluster.id
  name                 = "Task"
  target_spot_capacity = 1
  instance_type_configs {
    instance_type                              = "r6a.2xlarge"
    weighted_capacity                          = 1
    bid_price_as_percentage_of_on_demand_price = 100
  }
  launch_specifications {
    spot_specification {
      allocation_strategy      = "price-capacity-optimized"
      timeout_duration_minutes = 60
      timeout_action           = "TERMINATE_CLUSTER"
    }
  }
}

Steps to Reproduce

  1. terraform apply
  2. After the EMR cluster started, run terraform plan, you will see for aws_emr_instance_fleet, its allocation_strategy is supposed to be detected as "price-capacity-optimized", instead Terraform detects it as "capacity-optimized".

Debug Output

https://gist.github.com/hongbo-miao/080208ee300d9561e37f48450dd0bfd1

Workaround Solution

Add

resource "aws_emr_instance_fleet" "hm_amazon_emr_cluster_task_instance_fleet" {
  # ...
  lifecycle {
    ignore_changes = [
      launch_specifications[0].spot_specification["allocation_strategy"]
    ]
  }
}

to ignore detecting "allocation_strategy".

Would you like to implement a fix?

None

@hongbo-miao hongbo-miao added the bug Addresses a defect in current functionality. label Oct 29, 2023
@github-actions
Copy link

Community Note

Voting for Prioritization

  • Please vote on this issue by adding a 👍 reaction to the original post to help the community and maintainers prioritize this request.
  • Please see our prioritization guide for information on how we prioritize.
  • Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request.

Volunteering to Work on This Issue

  • If you are interested in working on this issue, please leave a comment.
  • If this would be your first contribution, please review the contribution guide.

@github-actions github-actions bot added the service/emr Issues and PRs that pertain to the emr service. label Oct 29, 2023
@terraform-aws-provider terraform-aws-provider bot added the needs-triage Waiting for first response or review from a maintainer. label Oct 29, 2023
@acwwat
Copy link
Contributor

acwwat commented Nov 3, 2023

It looks like when the resource is read into Terraform, the allocation strategy is hardcoded to capacity-optimized thus causing the perpetual difference:

https://github.com/hashicorp/terraform-provider-aws/blob/main/internal/service/emr/cluster.go#L2106

The commented line above the linked one would probably fix the problem, assuming the value coming back from the AWS API is lowercase.

As a note, this seems to be both a bug and outdated code. In the past, the only value supported is capacity-optimized as suggested in the CLI doc. The attribute description says:

AllocationStrategy -> (string)
Specifies the strategy to use in launching Spot Instance fleets. Currently, the only option is capacity-optimized (the default), which launches instances from Spot Instance pools with optimal capacity for the number of instances that are launching.

But the JSON format/schema provides multiple allowed values:

capacity-optimized"|"price-capacity-optimized"|"lowest-price"|"diversified

@justinretzolk justinretzolk added good first issue Call to action for new contributors looking for a place to start. Smaller or straightforward issues. and removed needs-triage Waiting for first response or review from a maintainer. labels Nov 7, 2023
@acwwat
Copy link
Contributor

acwwat commented Nov 8, 2023

It looks like the change in the attribute value not only applies to capacity-optimized but to all allowed values. Below is an example for price-capacity-optimized. So we may need something a bit more dynamic to convert the value from AWS API to lowercase and replace _ with -.

The ListInstanceFleets API indicates that the lower-case names should be returned, as with the AddInstanceFleet API. So I think it's an AWS API issue that the return values is changed.

Terraform will perform the following actions:

  # aws_emr_cluster.test must be replaced
-/+ resource "aws_emr_cluster" "test" {
      ~ arn                               = "arn:aws:elasticmapreduce:us-east-1:999999999999:cluster/j-1111111111111" -> (known after apply)
      ~ cluster_state                     = "WAITING" -> (known after apply)
      - ebs_root_volume_size              = 0 -> null
      ~ id                                = "j-1111111111111" -> (known after apply)
      ~ keep_job_flow_alive_when_no_steps = true -> (known after apply)
      ~ master_public_dns                 = "ip-10-0-1-222.ec2.internal" -> (known after apply)
        name                              = "test-emr-cluster"
      - placement_group_config            = [] -> null
      ~ scale_down_behavior               = "TERMINATE_AT_TASK_COMPLETION" -> (known after apply)
      ~ step                              = [] -> (known after apply)
      - tags                              = {} -> null
      ~ tags_all                          = {} -> (known after apply)
      ~ termination_protection            = false -> (known after apply)
        # (6 unchanged attributes hidden)

      ~ core_instance_fleet {
          ~ id                             = "if-1111111111111" -> (known after apply)
            name                           = "core fleet"
          ~ provisioned_on_demand_capacity = 0 -> (known after apply)
          ~ provisioned_spot_capacity      = 2 -> (known after apply)
            # (2 unchanged attributes hidden)

          - instance_type_configs {
              - bid_price_as_percentage_of_on_demand_price = 100 -> null
              - instance_type                              = "m4.2xlarge" -> null
              - weighted_capacity                          = 2 -> null

              - ebs_config {
                  - iops                 = 0 -> null
                  - size                 = 100 -> null
                  - type                 = "gp2" -> null
                  - volumes_per_instance = 1 -> null
                }
            }
          - instance_type_configs {
              - bid_price_as_percentage_of_on_demand_price = 100 -> null
              - instance_type                              = "m4.xlarge" -> null
              - weighted_capacity                          = 1 -> null

              - ebs_config {
                  - iops                 = 0 -> null
                  - size                 = 100 -> null
                  - type                 = "gp2" -> null
                  - volumes_per_instance = 1 -> null
                }
            }
          - instance_type_configs {
              - bid_price_as_percentage_of_on_demand_price = 80 -> null
              - instance_type                              = "m3.xlarge" -> null
              - weighted_capacity                          = 1 -> null

              - ebs_config {
                  - iops                 = 0 -> null
                  - size                 = 100 -> null
                  - type                 = "gp2" -> null
                  - volumes_per_instance = 1 -> null
                }
            }
          + instance_type_configs {
              + bid_price_as_percentage_of_on_demand_price = 100
              + instance_type                              = "m4.2xlarge"
              + weighted_capacity                          = 2

              + ebs_config {
                  + size                 = 100
                  + type                 = "gp2"
                  + volumes_per_instance = 1
                }
            }
          + instance_type_configs {
              + bid_price_as_percentage_of_on_demand_price = 100
              + instance_type                              = "m4.xlarge"
              + weighted_capacity                          = 1

              + ebs_config {
                  + size                 = 100
                  + type                 = "gp2"
                  + volumes_per_instance = 1
                }
            }
          + instance_type_configs {
              + bid_price_as_percentage_of_on_demand_price = 80
              + instance_type                              = "m3.xlarge"
              + weighted_capacity                          = 1

              + ebs_config {
                  + size                 = 100
                  + type                 = "gp2"
                  + volumes_per_instance = 1
                }
            }

          ~ launch_specifications {
              ~ spot_specification {
                  ~ allocation_strategy      = "PRICE_CAPACITY_OPTIMIZED" -> "price-capacity-optimized" # forces replacement
                    # (3 unchanged attributes hidden)
                }
            }
        }

      ~ ec2_attributes {
          ~ emr_managed_master_security_group = "sg-00000000000000000" -> (known after apply)
          ~ emr_managed_slave_security_group  = "sg-11111111111111111" -> (known after apply)
          ~ service_access_security_group     = "sg-22222222222222222" -> (known after apply)
          ~ subnet_ids                        = [
              - "subnet-00000000000000000",
            ] -> (known after apply)
            # (2 unchanged attributes hidden)
        }

      ~ master_instance_fleet {
          ~ id                             = "if-1111111111111" -> (known after apply)
          ~ provisioned_on_demand_capacity = 1 -> (known after apply)
          ~ provisioned_spot_capacity      = 0 -> (known after apply)
            # (2 unchanged attributes hidden)

          - instance_type_configs {
              - bid_price_as_percentage_of_on_demand_price = 100 -> null
              - instance_type                              = "m3.xlarge" -> null
              - weighted_capacity                          = 1 -> null
            }
          + instance_type_configs {
              + bid_price_as_percentage_of_on_demand_price = 100
              + instance_type                              = "m3.xlarge"
              + weighted_capacity                          = 1
            }
        }

        # (1 unchanged block hidden)
    }

Plan: 1 to add, 0 to change, 1 to destroy.

@caiusb
Copy link

caiusb commented Nov 30, 2023

This is an issue when trying to create an aws_emr_instance_fleet with an allocation strategy other than capacity-optimized. I get expected launch_specifications.0.spot_specification.0.allocation_strategy to be one of [capacity-optimized], got price-capacity-optimized

price-capacity-optimized has been around for over a year now, and it would be great if we could add support for this in terraform, as it makes a difference for large clusters.

@caiusb
Copy link

caiusb commented Dec 1, 2023

This is an issue when trying to create an aws_emr_instance_fleet with an allocation strategy other than capacity-optimized. I get expected launch_specifications.0.spot_specification.0.allocation_strategy to be one of [capacity-optimized], got price-capacity-optimized

price-capacity-optimized has been around for over a year now, and it would be great if we could add support for this in terraform, as it makes a difference for large clusters.

I was using an older version of the provider in my test setup; Creating a cluster works as expected.

@github-actions github-actions bot added this to the v5.36.0 milestone Feb 8, 2024
Copy link

github-actions bot commented Feb 8, 2024

This functionality has been released in v5.36.0 of the Terraform AWS Provider. Please see the Terraform documentation on provider versioning or reach out if you need any assistance upgrading.

For further feature requests or bug reports with this functionality, please create a new GitHub issue following the template. Thank you!

@hongbo-miao
Copy link
Author

Confirmed this issue has been fixed, thanks! ☀️

Copy link

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Mar 25, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
bug Addresses a defect in current functionality. good first issue Call to action for new contributors looking for a place to start. Smaller or straightforward issues. service/emr Issues and PRs that pertain to the emr service.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants