Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Removing whitespace in ECR policy forces resource recreation #3839

Closed
suneeta-mall opened this issue Mar 20, 2018 · 10 comments
Closed

Removing whitespace in ECR policy forces resource recreation #3839

suneeta-mall opened this issue Mar 20, 2018 · 10 comments
Labels
question A question about existing functionality; most questions are re-routed to discuss.hashicorp.com. service/ecr Issues and PRs that pertain to the ecr service.

Comments

@suneeta-mall
Copy link

Terraform Version

0.11.4 (also happens on 0.11.1)
provider "template" (1.0.0)...
provider "aws" (1.7.1)...

Affected Resource(s)

  • ECR and ECR policy

Terraform Configuration Files

Following snippet has whitespace in " "countType": "imageCountMoreThan", " if we remove the trailing whitespace and make it " "countType": "imageCountMoreThan"," terraform thinks its new policy and forces all ecr recreation (see output).

resource "aws_ecr_repository" "ecr-repo" {
  count   = "${length(var.repositories)}"
  name    = "nearmap/${var.repositories[count.index]}"
}


resource "aws_ecr_lifecycle_policy" "has-env-tag" {
  count        = "${length(var.repositories)}"
  repository   = "nearmap/${var.repositories[count.index]}"

  policy = <<EOF
{
    "rules": [
        {
            "rulePriority": 1,
            "description": "Keep last 900 images",
            "selection": {
                "tagStatus": "tagged",
                "tagPrefixList": ["XXX"],
                "countType": "imageCountMoreThan",                
                "countNumber": 900
            },
            "action": {
                "type": "expire"
            }
        }
    ]
}
EOF
}

Debug Output

-/+ module.ecr.aws_ecr_lifecycle_policy.latest[8] (new resource required)
      id:             "XXXX/XXXXr" => <computed> (forces new resource)
      policy:         "...." (forces new resource)
      registry_id:    "XXXXXXX" => <computed>
      repository:     "XXXX/XXXX" => "XXXX/XXX"

Expected Behavior

Benign changes such as whitespace outside policy values should not cause resource recreation. Actually, unsure why change in policy causes ECR recreation anyway.

Actual Behavior

Should not affect anything .. its whitespace outside the values of policy content. Also, even if the policy content/payload is changed, it should not cause recreation.

Steps to Reproduce

Please list the steps required to reproduce the issue, for example:

  1. terraform apply will show that Terraform is going to recreate all ECRs
@bflad
Copy link
Contributor

bflad commented Mar 20, 2018

Hi @suneeta-mall 👋 sorry you're seeing this unexpected behavior. It looks like #3246 which was released in v1.9.0 of the AWS provider has the correct DiffSuppressFunc: suppressEquivalentJsonDiffs to prevent this type of behavior for the aws_ecr_lifecycle_policy resource. Please see the Terraform documentation on provider versioning or reach out if you need any assistance upgrading.

Can you please check that out and let us know if you still have an issue?

Actually, unsure why change in policy causes ECR recreation anyway.

The aws_ecr_lifecycle_policy resource has the policy attribute marked as ForceNew: true currently. There is probably not too much of a reason for this other than code simplification. We would probably accept a pull request to remove the ForceNew: true from the policy attribute and set the resourceAwsEcrLifecyclePolicyCreate (renamed slightly) to also function as the Update function for resource.

@bflad bflad added question A question about existing functionality; most questions are re-routed to discuss.hashicorp.com. waiting-response Maintainers are waiting on response from community or contributor. service/ecr Issues and PRs that pertain to the ecr service. labels Mar 20, 2018
@suneeta-mall
Copy link
Author

suneeta-mall commented Mar 20, 2018

Hey @bflad ,

Good to know! I am happy to try .. but terraform thinks 1.9.0 is not available

provider "aws" {
  region  = "${var.region}"
  profile = "${var.profile}"
  version = "~> 1.9.0"
}

and I get this ..

Initializing provider plugins...
- Checking for available provider plugins on https://releases.hashicorp.com...

No provider "aws" plugins meet the constraint "~> 1.7.0,~> 1.7.0,~> 1.7.0,~> 1.7.0,~> 1.9.0".

Did I miss anything in using 1.9.0?
Thanks for help!

@bflad
Copy link
Contributor

bflad commented Mar 20, 2018

It looks like you have providers declaring version = "~> 1.7.0", which is not compatible with the constraint version = "~> 1.9.0" (basically you're asking Terraform if its possible to have a version like 1.7.X also match 1.9.X). You'll need to reconcile all your ~> 1.7.0 version constraints to >= 1.7.0, ~> 1.7, or ~> 1.9.0 as well so the 1.9.0 update is allowed.

For what its worth, the latest version is 1.11.0.

@suneeta-mall
Copy link
Author

@bflad Ah silly me! yes you are right .. that was the problem with version issue.

Looks like the issue with whitespace outside policy document is not fixed .. even in 1.11.X

-/+ module.ecr.aws_ecr_lifecycle_policy.untagged[9] (new resource required)
      id:          "XXX/XXX" => <computed> (forces new resource)

on version

Initializing provider plugins...
- Checking for available provider plugins on https://releases.hashicorp.com...
- Downloading plugin for provider "aws" (1.11.0)...

@bflad
Copy link
Contributor

bflad commented Mar 20, 2018

Hmm. I think at this point I will need to see the actual policy: "XXX" => "YYY" line in the plan with both the before and after. It sounds like AWS might have normalized the JSON a different way after upload or added something material (not just whitespace difference). Could you please provide that line from the plan with any sensitive information wiped or it can be encrypted with the HashiCorp GPG Key? Thanks.

@suneeta-mall
Copy link
Author

@bflad Thanks for pointing in right direction .. It looks like the actual issue is more complicated than we thought ....

Looks like if I am on AWS provider # 1.7.X, and I apply more than one policy (3 in this case) on ECR, Terraform thinks its created all the requested policy on ECR (and this is reflected by state file) but actually its only created one of the many policies on ECR resource .. Looking at our outcomes, its not clear which one of the many policies that effectively gets applied. We end up seeing different policy on different ECR but all ECRs have only one policy of 3 we are supplying.

Without whitespace change, somehow this behaviour seem to repeatable. I cant not why repeatable apply/plan dont see that ECR is missing 2 policies and only has 1. It just think state file match the state of ECR which is false.

Changing whitespace, as mentioned before, outside policy document, on 1.7.x, for some reason causes Terraform to think that policy is changed and now forces the recreation of sources. If I have to guess, I would say, its only actively applying the change thinking the policy is changed and not checking refreshing the true state of ECR?? I am not sure.
I have not applied the change with whitespace to see if the final policy/policies that will in truth be applied on ECR will still be same or different .. or all three will be applied...(I do not want to recreate my ECR and loose images).

However, changing the provider version to be 1.11.X, without any changes at all (no whitespace, nothing), causes terraform to think that ECRs needs to be recreated. My guess is because it now finds the difference in state file and actual ECR policy and sees that two are missing and only 1 is present and given that change in ECR is force recreate and it creates the ECR. But I havenot applied them so know what would happen.

Heres the trim down and stripped log from my console (for brevity I have removed other ECRs that
I create based on a list) and the logs only show outcome of one ECR XXX/XXXX:

➜  project git:(master) terraform init                                                  

Initializing modules...

Initializing provider plugins...
- Checking for available provider plugins on https://releases.hashicorp.com...
- Downloading plugin for provider "aws" (1.7.1)...
- Downloading plugin for provider "template" (1.0.0)...

The following providers do not have any version constraints in configuration,
so the latest version was installed.

To prevent automatic upgrades to new major versions that may contain breaking
changes, it is recommended to add version = "..." constraints to the
corresponding provider blocks in configuration, with the constraint strings
suggested below.

* provider.template: version = "~> 1.0"

Terraform has been successfully initialized!

You may now begin working with Terraform. Try running "terraform plan" to see
any changes that are required for your infrastructure. All Terraform commands
should now work.

If you ever set or change modules or backend configuration for Terraform,
rerun this command to reinitialize your working directory. If you forget, other
commands will detect it and remind you to do so if necessary.

➜  project git:(master) terraform plan -var 'profile=prod' -state=terraform-prod.tfstate
Refreshing Terraform state in-memory prior to plan...
The refreshed state will be used to calculate this plan, but will not be
persisted to local or remote state storage.

aws_ecr_repository.ecr-repo[1]: Refreshing state... (ID: XXXX/XXXXX)
aws_ecr_lifecycle_policy.has-env-tag[1]: Refreshing state... (ID: XXXX/XXXXX)
aws_ecr_lifecycle_policy.untagged[1]: Refreshing state... (ID: XXXX/XXXXX)
aws_ecr_lifecycle_policy.latest[1]: Refreshing state... (ID: XXXX/XXXXX)
aws_ecr_repository_policy.repo-policy[1]: Refreshing state... (ID: XXXX/XXXXX)

------------------------------------------------------------------------

No changes. Infrastructure is up-to-date.

This means that Terraform did not detect any differences between your
configuration and real physical resources that exist. As a result, no
actions need to be performed.
➜  project git:(master) terraform plan -var 'profile=prod' -state=terraform-prod.tfstate
➜  project git:(master) rm -rf .terraform 


➜  project git:(master) git stash apply 
### This git stash only changes the version of aws plugin to be 1.11.X from 1.7.X

➜  project git:(master) ✗ terraform init                                                  

Initializing modules...

Initializing provider plugins...
- Checking for available provider plugins on https://releases.hashicorp.com...
- Downloading plugin for provider "template" (1.0.0)...
- Downloading plugin for provider "aws" (1.11.0)...

The following providers do not have any version constraints in configuration,
so the latest version was installed.

To prevent automatic upgrades to new major versions that may contain breaking
changes, it is recommended to add version = "..." constraints to the
corresponding provider blocks in configuration, with the constraint strings
suggested below.

* provider.template: version = "~> 1.0"

Terraform has been successfully initialized!

You may now begin working with Terraform. Try running "terraform plan" to see
any changes that are required for your infrastructure. All Terraform commands
should now work.

If you ever set or change modules or backend configuration for Terraform,
rerun this command to reinitialize your working directory. If you forget, other
commands will detect it and remind you to do so if necessary.
➜  project git:(master) ✗ terraform plan -var 'profile=prod' -state=terraform-prod.tfstate
Refreshing Terraform state in-memory prior to plan...
The refreshed state will be used to calculate this plan, but will not be
persisted to local or remote state storage.

aws_ecr_repository_policy.repo-policy[1]: Refreshing state... (ID: XXXX/XXXXX)
aws_ecr_lifecycle_policy.latest[1]: Refreshing state... (ID: XXXX/XXXXX)
aws_ecr_lifecycle_policy.untagged[1]: Refreshing state... (ID: XXXX/XXXXX)
aws_ecr_repository.ecr-repo[1]: Refreshing state... (ID: XXXX/XXXXX)
aws_ecr_lifecycle_policy.has-env-tag[1]: Refreshing state... (ID: XXXX/XXXXX)

------------------------------------------------------------------------

An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
-/+ destroy and then create replacement

Terraform will perform the following actions:

-/+ module.ecr.aws_ecr_lifecycle_policy.has-env-tag[1] (new resource required)
      id:          "XXXX/XXXXX" => <computed> (forces new resource)
      policy:      "{\"rules\":[{\"rulePriority\":1,\"description\":\"Expire images older than 3 days\",\"selection\":{\"tagStatus\":\"untagged\",\"countType\":\"sinceImagePushed\",\"countUnit\":\"days\",\"countNumber\":3},\"action\":{\"type\":\"expire\"}}]}" => "{\n    \"rules\": [\n        {\n            \"rulePriority\": 1,\n            \"description\": \"Keep last 900 images\",\n            \"selection\": {\n                \"tagStatus\": \"tagged\",\n                \"tagPrefixList\": [\"YYYY\"],\n                \"countType\": \"imageCountMoreThan\",                \n                \"countNumber\": 900\n            },\n            \"action\": {\n                \"type\": \"expire\"\n            }\n        }\n    ]\n}\n" (forces new resource)
      registry_id: "973383851042" => <computed>
      repository:  "XXXX/XXXXX" => "XXXX/XXXXX"


-/+ module.ecr.aws_ecr_lifecycle_policy.latest[1] (new resource required)
      id:          "XXXX/XXXXX" => <computed> (forces new resource)
      policy:      "{\"rules\":[{\"rulePriority\":1,\"description\":\"Expire images older than 3 days\",\"selection\":{\"tagStatus\":\"untagged\",\"countType\":\"sinceImagePushed\",\"countUnit\":\"days\",\"countNumber\":3},\"action\":{\"type\":\"expire\"}}]}" => "{\n    \"rules\": [\n        {\n            \"rulePriority\": 2,\n            \"description\": \"Keep last 1 images\",\n            \"selection\": {\n                \"tagStatus\": \"tagged\",\n                \"tagPrefixList\": [\"latest\"],\n                \"countType\": \"imageCountMoreThan\",                \n                \"countNumber\": 1\n            },\n            \"action\": {\n                \"type\": \"expire\"\n            }\n        }\n    ]\n}\n" (forces new resource)
      registry_id: "973383851042" => <computed>
      repository:  "XXXX/XXXXX" => "XXXX/XXXXX"


Plan: ZZZZZ

------------------------------------------------------------------------

Note: You didn't specify an "-out" parameter to save this plan, so Terraform
can't guarantee that exactly these actions will be performed if
"terraform apply" is subsequently run.

For the difference of with and without whitespace outside ECR policy on version 1.7.X, the trim down version of logs are:

  • Without whitespace change:
No changes. Infrastructure is up-to-date.
  • With whitespace change (outside policy document):
Terraform will perform the following actions:

-/+ module.ecr.aws_ecr_lifecycle_policy.has-env-tag[1] (new resource required)
      id:          "XXX/XXXX" => <computed> (forces new resource)
      policy:      "{\n    \"rules\": [\n        {\n            \"rulePriority\": 1,\n            \"description\": \"Keep last 900 images\",\n            \"selection\": {\n                \"tagStatus\": \"tagged\",\n                \"tagPrefixList\": [\"YYY\"],\n                \"countType\": \"imageCountMoreThan\",                \n                \"countNumber\": 900\n            },\n            \"action\": {\n                \"type\": \"expire\"\n            }\n        }\n    ]\n}\n" => "{\n    \"rules\": [\n        {\n            \"rulePriority\": 1,\n            \"description\": \"Keep last 900 images\",\n            \"selection\": {\n                \"tagStatus\": \"tagged\",\n                \"tagPrefixList\": [\"YYY\"],\n                \"countType\": \"imageCountMoreThan\",\n                \"countNumber\": 900\n            },\n            \"action\": {\n                \"type\": \"expire\"\n            }\n        }\n    ]\n}\n" (forces new resource)
      registry_id: "973383851042" => <computed>
      repository:  "XXX/XXXX" => "XXX/XXXX"

-/+ module.ecr.aws_ecr_lifecycle_policy.latest[1] (new resource required)
      id:          "XXX/XXXX" => <computed> (forces new resource)
      policy:      "{\n    \"rules\": [\n        {\n            \"rulePriority\": 2,\n            \"description\": \"Keep last 1 images\",\n            \"selection\": {\n                \"tagStatus\": \"tagged\",\n                \"tagPrefixList\": [\"latest\"],\n                \"countType\": \"imageCountMoreThan\",                \n                \"countNumber\": 1\n            },\n            \"action\": {\n                \"type\": \"expire\"\n            }\n        }\n    ]\n}\n" => "{\n    \"rules\": [\n        {\n            \"rulePriority\": 2,\n            \"description\": \"Keep last 1 images\",\n            \"selection\": {\n                \"tagStatus\": \"tagged\",\n                \"tagPrefixList\": [\"latest\"],\n                \"countType\": \"imageCountMoreThan\",\n                \"countNumber\": 1\n            },\n            \"action\": {\n                \"type\": \"expire\"\n            }\n        }\n    ]\n}\n" (forces new resource)
      registry_id: "973383851042" => <computed>
      repository:  "XXX/XXXX" => "XXX/XXXX"


Plan: 20 to add, 0 to change, 20 to destroy.

------------------------------------------------------------------------

Note: You didn't specify an "-out" parameter to save this plan, so Terraform
can't guarantee that exactly these actions will be performed if
"terraform apply" is subsequently run.

I am in fix on how to resolve this without having to destroy my ECRs. Since you mentioned, change in ECR policy is force recreate , I have not covered that. I can look at raising a PR later for that when the main issue with ECR is sorted.

@bflad
Copy link
Contributor

bflad commented Mar 20, 2018

When you say "destroy my ECRs", you do not mean your ECR repository (the repository) itself is getting deleted right? The aws_ecr_lifecycle_policy resource only calls the SDK function DeleteLifecyclePolicy and should not delete the actual repository:

https://github.com/terraform-providers/terraform-provider-aws/blob/0a40a3742459fa038f47329d064ebd9104bf3c44/aws/resource_aws_ecr_lifecycle_policy.go#L91

If it does, that seems we should raise an issue with AWS support as that is not a documented behavior: https://docs.aws.amazon.com/sdk-for-go/api/service/ecr/#ECR.DeleteLifecyclePolicy

That said, I think you may be confused here because there should only ever be a single aws_ecr_lifecycle_policy resource attached to a single aws_ecr_repository. In order to support multiple rules, you'll need to collect together all the rules to create a single policy JSON which can then be used by this resource. e.g. to apply two lifecycle policy rules to a single ECR repository:

resource "aws_ecr_lifecycle_policy" "example" {
  # ... other configuration ...
  policy = <<EOF
{
  "rules": [
    {
      "rulePriority": 1,
      "description": "Keep last 900 images",
      "selection": {
        "tagStatus": "tagged",
        "tagPrefixList": [
          "YYY"
        ],
        "countType": "imageCountMoreThan",
        "countNumber": 900
      },
      "action": {
        "type": "expire"
      }
    },
    {
      "rulePriority": 2,
      "description": "Keep last 1 images",
      "selection": {
        "tagStatus": "tagged",
        "tagPrefixList": [
          "latest"
        ],
        "countType": "imageCountMoreThan",
        "countNumber": 1
      },
      "action": {
        "type": "expire"
      }
    }
  ]
}
EOF
}

If you need to selectively add or remove the rules depending on some conditional logic, then its doable but requires some extra work with the template data source.

@suneeta-mall
Copy link
Author

@bflad Ah Indeed I misread the situation and its only ECR policy recreate! That makes perfect sense..

Re aws_ecr_lifecycle_policy, right, I was not aware that its only aws_ecr_lifecycle_policy per ECR . AWS seem to allow to add many... and TF did not complain anyway that we are attaching more than supported. May be I missed the documentation but should this be more obvious?

Given I was looking with blind eye for a bit, I will go back and update on the actual whitespace problem again and update you.

Thanks for all help!

@suneeta-mall
Copy link
Author

@bflad I confirm that the issue with policy recreate on whitespace change outside policy is fixed in 1.11.X.

I will close this .. Not sure if I care so much about policy reforce recreation if actually the policy has changed. I misinterpreted that as ECR recreate .. that was my bad .. thanks for all the help!

@ghost
Copy link

ghost commented Apr 7, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. Thanks!

@ghost ghost locked and limited conversation to collaborators Apr 7, 2020
@breathingdust breathingdust removed the waiting-response Maintainers are waiting on response from community or contributor. label Sep 17, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
question A question about existing functionality; most questions are re-routed to discuss.hashicorp.com. service/ecr Issues and PRs that pertain to the ecr service.
Projects
None yet
Development

No branches or pull requests

3 participants