fix(eks): k8s resources accidentally deleted due to logical ID change #12053
Merged
Conversation
The `KubernetesManifest` construct used `kubectl apply` for both CREATE and UPDATE operations. This means that if, for example, two manifests had resources with the same k8s name (`metadata.name`), creating the second manifest would not fail; it would silently overwrite the existing resource definition.

As a consequence, if the logical ID of a `KubernetesManifest` resource was changed (without a change in the physical name), CFN would perform a replacement, which involves a CREATE of the new resource followed by a DELETE of the old one. Since the CREATE operation was implemented through `apply`, it succeeded as a no-op, and the subsequent DELETE then removed the resource. The net result was that the resource was deleted.

The solution is to use `kubectl create --save-config` instead of `kubectl apply` for CREATE operations. This yields the desired CREATE semantics (duh!). Now, if a `KubernetesManifest` resource is defined with a k8s object name that already exists, the CREATE operation fails as expected. The logical ID change scenario (resource replacement) also issues a CREATE operation first, which fails. To change the logical ID of a `KubernetesManifest` resource, users will have to either delete the old resource or change its physical name.

Since this is quite hard to test (due to multi-phase deployments and failure modes), this was tested manually:

1. Defined a manifest with logical name X1 and physical name Y1 -> CREATE was issued.
2. Changed the logical name to X2 (physical name remains Y1) -> the update failed because CFN issues a CREATE operation first (#10397).
3. Changed the physical name to Y2 as well -> the deploy succeeded, the new resource was created, and the old resource was pruned.

This fixes #10397
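For illustration only, here is a minimal CDK sketch of the scenario described above. The construct id, cluster configuration, and `metadata.name` values are hypothetical; the point is that the construct id drives the logical ID while `metadata.name` is the physical k8s object name, so renaming only the construct triggers a CloudFormation create-then-delete replacement.

```ts
import * as cdk from '@aws-cdk/core';
import * as eks from '@aws-cdk/aws-eks';

class ManifestStack extends cdk.Stack {
  constructor(scope: cdk.App, id: string) {
    super(scope, id);

    const cluster = new eks.Cluster(this, 'Cluster', {
      version: eks.KubernetesVersion.V1_18,
    });

    // The construct id 'ConfigMapX1' determines the logical ID; the physical
    // name is the k8s object name 'my-config' (metadata.name).
    new eks.KubernetesManifest(this, 'ConfigMapX1', {
      cluster,
      manifest: [{
        apiVersion: 'v1',
        kind: 'ConfigMap',
        metadata: { name: 'my-config', namespace: 'default' },
        data: { key: 'value' },
      }],
    });

    // Renaming the construct id to 'ConfigMapX2' (with metadata.name unchanged)
    // makes CloudFormation replace the resource: CREATE the new logical resource
    // first, then DELETE the old one. Before this fix, the CREATE (kubectl apply)
    // was a silent no-op and the DELETE removed 'my-config'; after the fix, the
    // CREATE (kubectl create --save-config) fails instead of deleting data.
  }
}

const app = new cdk.App();
new ManifestStack(app, 'ManifestStack');
```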
eladb added the pr-linter/exempt-test (The PR linter will not require test changes) label on Dec 13, 2020
github-actions bot added the @aws-cdk/aws-eks (Related to Amazon Elastic Kubernetes Service) label on Dec 13, 2020
iliapolo approved these changes on Dec 13, 2020
Thank you for contributing! Your pull request will be updated from master and then merged automatically (do not update manually, and be sure to allow changes to be pushed to your fork).
eladb pushed a commit that referenced this pull request on Dec 14, 2020
The change in #12053 introduced a regression which causes failures in creating new clusters. Since we changed the KubernetesManifest resource to use `kubectl create` in CREATE operations, the attempt to create the `aws-auth` config map is failing because this config map is already created by the cluster. This change adds an `override` to `KubernetesManifest` which will cause CREATE to be performed using `apply` instead, which practically allows overriding/adopting existing K8s resources.
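A hedged sketch of how that escape hatch could be used to adopt the pre-existing `aws-auth` config map. The `overwrite` option name is an assumption based on the commit description (which calls it an "override") and may differ from the final API; the helper name and config map contents are hypothetical.

```ts
import { Construct } from '@aws-cdk/core';
import * as eks from '@aws-cdk/aws-eks';

// Hypothetical helper: defines the aws-auth config map, which the cluster has
// already created, and opts in to adopting it rather than failing on CREATE.
function adoptAwsAuth(scope: Construct, cluster: eks.Cluster): eks.KubernetesManifest {
  return new eks.KubernetesManifest(scope, 'AwsAuth', {
    cluster,
    // Assumed option name: CREATE falls back to `kubectl apply`, so an existing
    // k8s object with the same name is overridden/adopted instead of erroring.
    overwrite: true,
    manifest: [{
      apiVersion: 'v1',
      kind: 'ConfigMap',
      metadata: { name: 'aws-auth', namespace: 'kube-system' },
      data: {},
    }],
  });
}
```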
mergify bot pushed a commit that referenced this pull request on Dec 14, 2020
…12068)

The change in #12053 introduced a regression which causes failures in creating new clusters. Since we changed the KubernetesManifest resource to use `kubectl create` in CREATE operations, the attempt to create the `aws-auth` config map is failing because this config map is already created by the cluster. This change adds an `override` to `KubernetesManifest` which will cause CREATE to be performed using `apply` instead, which practically allows overriding/adopting existing K8s resources.

----

*By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license*
flochaz pushed a commit to flochaz/aws-cdk that referenced this pull request on Jan 5, 2021
…aws#12053)

The `KubernetesManifest` construct used `kubectl apply` for both CREATE and UPDATE operations. This means that if, for example, two manifests had resources with the same k8s name (`metadata.name`), creating the second manifest would not fail; it would silently overwrite the existing resource definition.

As a consequence, if the logical ID of a `KubernetesManifest` resource was changed (without a change in the physical name), CFN would perform a replacement, which involves a CREATE of the new resource followed by a DELETE of the old one. Since the CREATE operation was implemented through `apply`, it succeeded as a no-op, and the subsequent DELETE then removed the resource. The net result was that the resource was deleted.

The solution is to use `kubectl create --save-config` instead of `kubectl apply` for CREATE operations. This yields the desired CREATE semantics (duh!). Now, if a `KubernetesManifest` resource is defined with a k8s object name that already exists, the CREATE operation fails as expected. The logical ID change scenario (resource replacement) also issues a CREATE operation first, which fails. To change the logical ID of a `KubernetesManifest` resource, users will have to either delete the old resource or change its physical name.

Since this is quite hard to test (due to multi-phase deployments and failure modes), this was tested manually:

1. Defined a manifest with logical name X1 and physical name Y1 -> CREATE was issued.
2. Changed the logical name to X2 (physical name remains Y1) -> the update failed because CFN issues a CREATE operation first (aws#10397).
3. Changed the physical name to Y2 as well -> the deploy succeeded, the new resource was created, and the old resource was pruned.

This fixes aws#10397

----

*By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license*
flochaz pushed a commit to flochaz/aws-cdk that referenced this pull request on Jan 5, 2021
…ws#12068)

The change in aws#12053 introduced a regression which causes failures in creating new clusters. Since we changed the KubernetesManifest resource to use `kubectl create` in CREATE operations, the attempt to create the `aws-auth` config map is failing because this config map is already created by the cluster. This change adds an `override` to `KubernetesManifest` which will cause CREATE to be performed using `apply` instead, which practically allows overriding/adopting existing K8s resources.

----

*By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license*