Error tagging resources #12427

Closed
ghost opened this issue Mar 17, 2020 · 5 comments · Fixed by #12735
Labels: bug, regression, service/ec2

Comments

@ghost

ghost commented Mar 17, 2020

This issue was originally opened by @vmorkunas as hashicorp/terraform#24395. It was migrated here as a result of the provider split. The original body of the issue is below.


Terraform Version

Terraform v0.12.23
provider.aws v2.53.0

Terraform Configuration Files

Root module

module "peering_ldap_intapp" {
    source = "../../modules/Stack/Peering"
    stackCommon = var.stackCommon
    providers = {
        aws.src = aws.ldap
        aws.dst = aws.stack
    }
    peering = {
        peering_connection_name = "Ldap-IntApp", 
        different_account = true,
        account_id = var.account_id,
        src_vpc_id = var.ldap_ops_vpc_id,
        dst_vpc_id = module.intapp_vpc.vpc.id,
    }
}

Peering module

resource "aws_vpc_peering_connection" "src_peering" {
    provider = aws.src
    peer_owner_id = var.peering.different_account ? var.peering.account_id : null
    vpc_id = var.peering.src_vpc_id
    peer_vpc_id = var.peering.dst_vpc_id
    peer_region   = var.stackCommon.stack_region
    auto_accept   = false

    tags = merge(
        map(
            "Name", "${var.stackCommon.stack_name}-${var.peering.peering_connection_name}"
        ),
        var.stackCommon.common_tags
    )

    lifecycle {
        create_before_destroy = true
    }
}

Debug Output

Crash Output

error updating EC2 VPC Peering Connection (pcx-0c090a18f48d63647) tags: error tagging resource (pcx-0c090a18f48d63647): InvalidVpcPeeringConnectionID.NotFound: The vpcPeeringConnection ID 'pcx-0c090a18f48d63647' does not exist

Expected Behavior

Resource should be tagged

Actual Behavior

No tags are added, and execution stops with the error message above.

Steps to Reproduce

Issue doesn't occur all the time.

  1. terraform init
  2. terraform apply

Additional Context

Many resources started failing at the tagging step with the same error: resource not found.

@ghost ghost added the service/ec2 Issues and PRs that pertain to the ec2 service. label Mar 17, 2020
@github-actions github-actions bot added the needs-triage Waiting for first response or review from a maintainer. label Mar 17, 2020
@brightshine1111

We've started seeing the same thing within the past few days. We tried rolling the AWS provider back a couple of versions, but it didn't help.

@dshkyra
Contributor

dshkyra commented Apr 7, 2020

I'm also seeing this issue after switching to terraform AWS provider v2.54 for security groups and KMS keys:

Error: error adding EC2 Security Group (sg-123) tags: error tagging resource (sg-123): InvalidGroup.NotFound: The security group 'sg-123' does not exist

Error: error updating KMS Key (key-123) tags: error tagging resource (key-123): NotFoundException: Key 'arn:aws:kms:us-east-1:1234567890:key/key-123' does not exist

@mgusiew-guide
Contributor

mgusiew-guide commented Apr 8, 2020

My team has also observed tagging errors after switching to terraform AWS provider v2.52 (we also switched from terraform 0.12.18 to 0.12.23)

We often get errors when tagging internet gateways or security groups. See samples below:

  1. Tagging internet gateway:
    Error: error adding EC2 Internet Gateway (igw-013f22d7f3ebe56e9) tags: error tagging resource (igw-013f22d7f3ebe56e9): InvalidInternetGatewayID.NotFound: The internetGateway ID 'igw-013f22d7f3ebe56e9' does not exist status code: 400, request id: 58d2d7a0-538f-4948-a198-55f552a5aef5 on ../../../commons/modules/vpc/main.tf line 12, in resource "aws_internet_gateway" "test_igw": resource "aws_internet_gateway" "test_igw" [command.go:158: command.go:158: retry.go:80: Returning due to fatal error: FatalError{Underlying: exit status 1}

  2. Tagging security group:
    error adding EC2 Security Group (sg-0847237555c0220b9) tags: error tagging resource (sg-0847237555c0220b9): InvalidGroup.NotFound: The security group 'sg-0847237555c0220b9' does not exist status code: 400, request id: 002fe92a-e58e-4f91-9ac7-2ed4cbd733d1 on ../../../commons/modules/ec2/main.tf line 43, in resource "aws_security_group" "test_sg": resource "aws_security_group" "test_sg" [command.go:158: [command.go:158: retry.go:80: Returning due to fatal error: FatalError{Underlying: exit status 1}

We drilled into the security group scenario, searched CloudTrail, and found that the security group and its tags are created at exactly the same time (to one-second precision). So the create-tags call may start before the security group creation has fully completed (a race condition); according to the AWS docs, it takes some time for a security group to propagate.

Unfortunately, this results in flaky tests, so it would be great to have it fixed.

@bflad bflad added bug Addresses a defect in current functionality. regression Pertains to a degraded workflow resulting from an upstream patch or internal enhancement. and removed needs-triage Waiting for first response or review from a maintainer. labels Apr 8, 2020
@bflad bflad self-assigned this Apr 8, 2020
bflad added a commit that referenced this issue Apr 8, 2020
…CreateEc2Tags implementation

Reference: #11060
Reference: #12427
Reference: https://github.com/terraform-providers/terraform-provider-aws/blob/0c56c9bea1291e77f28ae99c79748251a2e23517/aws/tags.go#L198

The EC2 service has special considerations during resource creation to retry for eventual consistency within the API itself on "NotFound" errors. This switches Terraform resources that cannot tag-on-create due to the lack of EC2 API support to the keyvaluetags implementation that handles this eventual consistency automatically for 5 minutes (or max retries if API is throttling).

This retry logic was present prior to the service refactoring to keyvaluetags (although errantly retrying on "NotFound" errors on every update, which may never succeed) and was still present in the `aws_vpc` and `aws_subnet` resources manually in the logic after the refactoring.

This refactor also catches cases where the resource `Create` function was depending on the `Update` logic to handle tagging on creation logic. We discourage the usage of `Update` after `Create` for new resources, but this refactor only guards against running the tag update logic rather than bundling more complex resource refactoring into this changeset.

Now all EC2 resources are consolidated to similar tagging on creation logic.

Output from acceptance testing:

```

```
bflad added a commit that referenced this issue Apr 8, 2020
…creation

Reference: #9953
Reference: #11781
Reference: #12427 (comment)

This refactors the resource logic to prevent `Update` after `Create` type logic errors with duplicate API calls (potential error points for eventual consistency):

- Setting `description` on creation previously was done once during the `CreateKey` call and again via a separate `UpdateKeyDescription` call
- Setting `policy` on creation previously was done once during the `CreateKey` call and again via a separate `PutKeyPolicy` call
- Setting `tags` on creation previously was done once during the `CreateKey` call and again via a separate `TagResource` call
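
The duplicate-call pattern in the list above can be sketched in Go with a fake client that records each operation. The operation names come from the commit message; `fakeKMS`, `keyInput`, `createThenUpdate`, and `createOnce` are hypothetical names for this illustration and do not reflect the provider's real code or the AWS SDK types.

```go
package main

import "fmt"

// keyInput is a hypothetical stand-in for the arguments of a CreateKey request.
type keyInput struct {
	Description string
	Policy      string
	Tags        map[string]string
}

// fakeKMS records which API operations were invoked, in order.
type fakeKMS struct{ calls []string }

// CreateKey accepts description, policy, and tags in a single request.
func (f *fakeKMS) CreateKey(in keyInput) string {
	f.calls = append(f.calls, "CreateKey")
	return "key-123"
}

func (f *fakeKMS) UpdateKeyDescription(id, description string) {
	f.calls = append(f.calls, "UpdateKeyDescription")
}

func (f *fakeKMS) PutKeyPolicy(id, policy string) {
	f.calls = append(f.calls, "PutKeyPolicy")
}

func (f *fakeKMS) TagResource(id string, tags map[string]string) {
	f.calls = append(f.calls, "TagResource")
}

// createThenUpdate mirrors the old flow: every attribute is sent twice, and
// each follow-up call is another point where a just-created key may not be visible yet.
func createThenUpdate(api *fakeKMS, in keyInput) string {
	id := api.CreateKey(in)
	api.UpdateKeyDescription(id, in.Description)
	api.PutKeyPolicy(id, in.Policy)
	api.TagResource(id, in.Tags)
	return id
}

// createOnce mirrors the refactored flow: everything goes in the create call.
func createOnce(api *fakeKMS, in keyInput) string {
	return api.CreateKey(in)
}

func main() {
	in := keyInput{Description: "example", Policy: "{}", Tags: map[string]string{"Name": "example"}}

	before, after := &fakeKMS{}, &fakeKMS{}
	createThenUpdate(before, in)
	createOnce(after, in)

	fmt.Println(len(before.calls), before.calls)
	fmt.Println(len(after.calls), after.calls)
}
```

In this sketch the old flow issues four API calls and the refactored flow issues one, which removes three opportunities for a "does not exist" error immediately after creation.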

This also adds eventual consistency retries for reading tags on resource creation and removes the resource `Exists` function, which can be another source of issues and whose removal is required for the upcoming Terraform Plugin SDK v2.

Previously from operator error reports:

```
Error: error listing tags for KMS Key (***): NotFoundException: Key 'arn:aws:kms:***:key/***' does not exist

Error: error updating KMS Key (key-123) tags: error tagging resource (key-123): NotFoundException: Key 'arn:aws:kms:us-east-1:1234567890:key/key-123' does not exist
```

Output from acceptance testing:

```
--- PASS: TestAccAWSKmsKey_disappears (14.50s)
--- PASS: TestAccAWSKmsKey_asymmetricKey (40.34s)
--- PASS: TestAccAWSKmsKey_basic (43.60s)
--- PASS: TestAccAWSKmsKey_policy (58.38s)
--- PASS: TestAccAWSKmsKey_tags (59.07s)
--- PASS: TestAccAWSKmsKey_isEnabled (324.81s)
```
@gdavison gdavison added this to the v2.57.0 milestone Apr 9, 2020
@ghost
Author

ghost commented Apr 10, 2020

This has been released in version 2.57.0 of the Terraform AWS provider. Please see the Terraform documentation on provider versioning or reach out if you need any assistance upgrading.

For further feature requests or bug reports with this functionality, please create a new GitHub issue following the template for triage. Thanks!

@ghost
Author

ghost commented May 10, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. Thanks!

@ghost ghost locked and limited conversation to collaborators May 10, 2020
5 participants