(core): unused cross-region exports can't be deleted #27902

Open
shahin opened this issue Nov 8, 2023 · 17 comments
Labels
@aws-cdk/core Related to core CDK functionality bug This issue is a bug. effort/medium Medium work item – several days of effort p1

Comments

@shahin

shahin commented Nov 8, 2023

Describe the bug

If a stack maintains multiple cross-region exports, and only a subset of them are imported by other stacks, attempting to remove a cross-region export value that is not imported by any deployed stack fails. The cdk diff shows the unused export being removed:

❯ cdk diff [...] --exclusively
[...]
[~] Custom::CrossRegionExportWriter ExportsWriteruswest2[...]
ExportsWriteruswest2[...]
 └─ [~] WriterProps
     └─ [~] .exports:
         └─ [-] Removed: ./cdk/exports/unused-value/anotherstackuseast1Ref[...]certificate[...]

but cdk deploy then fails with the following error:

ExportsWriteruswest2[...]/Resource/Default (ExportsWriteruswest2[...]) Received response status [FAILED] from custom resource. Message returned: Error: Exports cannot be updated: 

    at throwIfAnyInUse (/var/task/index.js:4:10)

Other existing issues do not describe this bug.

Expected Behavior

CDK should fail to deploy with the error above only if the cdk diff shows an export being removed that is currently imported by another stack.

Current Behavior

CDK fails to deploy with the error above even when cdk diff shows only exports being removed that are not currently imported by another stack.

Reproduction Steps

  1. Create two cross-region exports from stack A to a stack B in a different region.
  2. Remove one of the two imports from stack B and deploy stack B exclusively. Observe that one import is removed from stack B's ["CrossRegionExportReader"]["ReaderProps"] template in the CloudFormation console.
  3. Attempt to deploy stack A. Observe one of the two exports being removed in the diff, and that on deploy the ExportWriter fails to update and the deploy fails with the error above. (See the Python sketch after these steps.)
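
A minimal sketch of such a setup in Python (aws-cdk-lib v2 with cross_region_references enabled; the stack names, construct names, and account below are illustrative placeholders, not taken from the original report):

# repro_sketch.py -- hypothetical minimal reproduction
from aws_cdk import App, Environment, Stack
from aws_cdk import aws_sns as sns
from aws_cdk import aws_ssm as ssm
from constructs import Construct


class ProducerStack(Stack):
    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)
        # Two values that the consumer stack will import across regions.
        self.topic_one = sns.Topic(self, "TopicOne")
        self.topic_two = sns.Topic(self, "TopicTwo")


class ConsumerStack(Stack):
    def __init__(self, scope: Construct, construct_id: str, *,
                 producer: ProducerStack, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)
        # Referencing attributes of resources in another region creates the
        # Custom::CrossRegionExportWriter / ExportReader pair in the two stacks.
        ssm.StringParameter(self, "ImportOne",
                            string_value=producer.topic_one.topic_arn)
        # Step 2 of the repro: delete the parameter below, deploy this stack
        # with --exclusively, then deploy the producer stack and observe the
        # "Exports cannot be updated" failure.
        ssm.StringParameter(self, "ImportTwo",
                            string_value=producer.topic_two.topic_arn)


app = App()
stack_a = ProducerStack(app, "StackA",
                        env=Environment(account="111111111111", region="us-west-2"),
                        cross_region_references=True)
ConsumerStack(app, "StackB",
              env=Environment(account="111111111111", region="us-east-1"),
              producer=stack_a,
              cross_region_references=True)
app.synth()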

Possible Solution

No response

Additional Information/Context

No response

CDK CLI Version

2.94.0 (build 987c329)

Framework Version

No response

Node.js Version

v18.17.1

OS

macOS

Language

Python

Language Version

Python (3.10.4)

Other information

No response

@shahin shahin added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels Nov 8, 2023
@github-actions github-actions bot added the @aws-cdk/core Related to core CDK functionality label Nov 8, 2023
@khushail
Contributor

khushail commented Nov 9, 2023

Hi @shahin, could you please share the sample code so that this issue can be reproduced?

@khushail khushail added response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. and removed needs-triage This issue or PR still needs to be triaged. labels Nov 9, 2023

This issue has not received a response in a while. If you want to keep this issue open, please leave a comment below and auto-close will be canceled.

@github-actions github-actions bot added the closing-soon This issue will automatically close in 4 days unless further comments are made. label Nov 12, 2023
@shahin
Author

shahin commented Nov 15, 2023

Thanks for your attention to this issue. The reproduction steps above are the best I can do for now.

@github-actions github-actions bot removed closing-soon This issue will automatically close in 4 days unless further comments are made. response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. labels Nov 15, 2023
@khushail khushail self-assigned this Nov 30, 2023
@khushail khushail added p2 effort/medium Medium work item – several days of effort needs-review needs-reproduction This issue needs reproduction. labels Nov 30, 2023
@renschler

renschler commented Jan 8, 2024

I am facing this issue as well. @shahin do you know if there is a workaround? Is manually deleting the SSM parameters in all regions before deploying a way to get around this? I'm debugging in a dev environment now but trying to figure out how I'm going to roll my changes out to prod without a crash; makes me nervous haha

> Reproduction Steps
> Create two cross-region exports, exported from stack A into another stack B in a different region.
> Remove one of the two imports from stack B and deploy stack B exclusively. Observe that one import is removed from stack B's ["CrossRegionExportReader"]["ReaderProps"] template in the CloudFormation console.
> Attempt to deploy stack A. Observe one of the two exports being removed in the diff. Observe that on the deploy, the ExportWriter fails to update and the deploy fails with the error above

My scenario was slightly different, but essentially the same issue.

  • When I removed one of the two imports from stack B and deployed stack A exclusively, I got a crash. When I attempted to deploy the whole app at once (i.e. both stacks at once), I also got a crash. (Note there is no way for me to deploy stack B exclusively from what I can tell, because I've specified that stack B is dependent on stack A.)

@shahin
Author

shahin commented Jan 8, 2024

@renschler sorry, I don't know of a workaround in CDK. Deleting the SSM parameters might be part of the solution but did not by itself unblock deployment for us.

> (Note there is no way for me to deploy stack B exclusively from what I can tell because I've specified that stack B is dependent on stack A).

If stack B depends on stack A, you can still deploy stack B by itself with the --exclusively flag:

❯ cdk deploy --help
cdk deploy [STACKS..]

[...]
-e, --exclusively          Only deploy requested stacks, don't include
                             dependencies                              [boolean]
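
For example (StackB here is a placeholder for whatever your dependent stack is called):

❯ cdk deploy StackB --exclusively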

@renschler

renschler commented Jan 8, 2024

thanks @shahin

@khushail I made a reproduction repo; please let me know if you have any questions

https://github.com/renschler/cross-region-delete

@renschler

renschler commented Jan 8, 2024

ps, as a warning to others: don't delete those unnecessary SSM parameters, it will create more issues! I'm now having to re-add them manually to SSM in every region where I deleted them. Then, after manually re-adding them, you'll hit a "last modified date does not match with the last modified date of the retrieved parameters" error during deploy; the only way to fix that was to run cdk synth and then manually update the stacks in the AWS console using the generated JSON template in cdk.out/

@NickDarvey

Given the bug found by @elliotsegler in #29699, it looks like the error Error: Exports cannot be updated is explicitly thrown when an SSM parameter has the tag aws-cdk:strong-ref:SomeStackName 1.

Perhaps a workaround is removing that tag and then redeploying? I understand the tag is there to stop updates 2, but I'm not sure why that's necessary 3. (So this workaround might not be a very good workaround!) In my case I am making a change that removes a cross-region reference so... should be fine?
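
For anyone who wants to try that, a rough AWS CLI sketch (the parameter name and stack name are placeholders, and removing the tag bypasses the safety check described above, so proceed with care):

❯ aws ssm list-tags-for-resource --resource-type Parameter \
    --resource-id /cdk/exports/StackB/some-export-name
❯ aws ssm remove-tags-from-resource --resource-type Parameter \
    --resource-id /cdk/exports/StackB/some-export-name \
    --tag-keys aws-cdk:strong-ref:StackB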

Footnotes

  1. https://github.com/aws/aws-cdk/blob/36fd79d8714bd29527bb1184ec10cd504b83510d/packages/%40aws-cdk/custom-resource-handlers/lib/core/cross-region-ssm-writer-handler/index.ts#L92-L137

  2. https://github.com/aws/aws-cdk/blob/b82320b08ebcda98b85be8ceb56a5a4b39511d4a/packages/aws-cdk-lib/core/adr/cross-region-stack-references.md?plain=1#L243-L247

  3. My best guess is that CloudFormation won't detect a change in the parameter's value for the importing stack, so resources using the parameter won't actually get updated with the new value. The strong-ref check is there to stop people getting confused about why their resources don't get updated with the new value.

@elliotsegler

@NickDarvey I'm wondering if that's an ordering issue...

The cross-region-reader-handler and associated Custom::CrossRegionReader resource get added to the stack that consumes the exports. That handler is capable of removing the tags 1.

In my case I was letting CDK resolve the order of operations, and it was attempting to delete resources in the stack that contained the exports before updating the reader stack, which would have removed those tags. I worked around it by specifically calling a deployment of the reader stack, then a second phase of deployments to the stack that contained the now-unused exports.
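
Roughly, with placeholder stack names (StackB consumes the exports, StackA produces them):

❯ cdk deploy StackB --exclusively    # reader stack first; its handler removes the strong-ref tags
❯ cdk deploy StackA --exclusively    # writer stack second; the now-unused export can be deleted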

Footnotes

  1. https://github.com/aws/aws-cdk/blob/36fd79d8714bd29527bb1184ec10cd504b83510d/packages/%40aws-cdk/custom-resource-handlers/lib/core/cross-region-ssm-reader-handler/index.ts#L31-L33

@NickDarvey

> I worked around it by specifically calling a deployment of the reader stack

That would have worked for me too actually!

@renschler

> thanks @shahin
>
> @khushail I made a reproduction repo; please let me know if you have any questions
>
> https://github.com/renschler/cross-region-delete

@TheRealAmazonKendra I made a reproduction repo here; if anyone has issues reproducing let me know and I can help explain!

@andreialecu
Contributor

Looks like we got hit by this too. The stack is now in UPDATE_ROLLBACK_FAILED state and cannot seem to be rolled back.

Any workarounds to get it unstuck? Would skipping the resource be safe?

@candu

candu commented May 29, 2024

I can see the challenge here: if I export a cross-region reference from stack A to stack B, and stack B depends on resources in stack A, and I delete the resource this cross-region reference refers to, I then have a cyclic update dependency.

From the ADR linked above, I'd want to update stack B (export reader), then stack A (export writer).

However, since stack B depends on resources in stack A, I'd want to update stack A, then stack B.

As a result, neither stack is safe to update before the other.

Maybe an ideal resolution would be to detect the cyclic dependency here and fast-fail? In cases like this, I'd much rather have fast-fail behaviour than have to untangle UPDATE_ROLLBACK_FAILED snags (e.g. with a warning that guides me to use --exclusively, or to break the update into two steps).

@AposLaz

AposLaz commented Jun 27, 2024

Any updates for this issue?

@pahud pahud removed the needs-reproduction This issue needs reproduction. label Aug 12, 2024
@comcalvi comcalvi added p1 and removed p2 labels Aug 23, 2024
@ruslan-roboto

This is giving us headaches as well. Any update would be immensely helpful. Thanks!

@flochaz
Contributor

flochaz commented Oct 15, 2024

On my side, I got it fixed by the following (a rough CLI sketch follows the list):

  1. Manually reverting the SSM parameters to their old values, WITHOUT THE aws-cdk:strong-ref TAG! (Check your CustomCrossRegionExportWriter Lambda logs in CloudWatch to get the old and new values, and be careful to distinguish the log entries that correspond to the initial write from the ones triggered by the rollback :) )
  2. Triggering a "Continue update rollback" of the CloudFormation stack stuck in UPDATE_ROLLBACK_FAILED, skipping the failing CustomCrossRegionExportWriter resource (check the box in the advanced continue-rollback options)
  3. Redeploying the stacks
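
A rough CLI sketch of steps 1 and 2 (the parameter name, value, stack name, and logical resource ID are placeholders; take the real ones from your own Lambda logs and template, and if the restored parameter still carries the aws-cdk:strong-ref tag, remove it as sketched in an earlier comment):

❯ aws ssm put-parameter --name /cdk/exports/StackB/some-export-name \
    --type String --overwrite --value "<old value from the ExportWriter Lambda logs>"
❯ aws cloudformation continue-update-rollback --stack-name StackA \
    --resources-to-skip ExportsWriteruswest2XXXXXXXX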

@JonWallsten
Contributor

I see two EASY solutions conceptually, though maybe harder technically.
Both would require a way to keep the reference in the ExportWriter resource while removing it from the ExportReader:
a) either by adding it manually to the ExportWriter in the producing stack, or
b) by having some kind of DummyImport in the receiving stack that only creates the export and not the import.

Then you could very easily create a new resource alongside the old one if you need to replace something, like a Certificate, or remove the resource if that's what you need.
