(3.0.0‐3.10.1) Build image CloudFormation stacks fail to delete after images are successfully built

Hanwen edited this page Jul 9, 2024 · 2 revisions

The issue

Starting September 19, 2023, the build-image CloudFormation stack ends in DELETE_FAILED status after the image is built successfully. The failures in the stack events look like this:

Timestamp | Logical ID | Status | Status reason
2023-12-01 06:00:20 UTC-0800 | aws-parallelcluster-3-7-2-amzn2-hvm-arm64-202312011211 | DELETE_FAILED | The following resource(s) failed to delete: [DeleteStackFunctionExecutionRole].
2023-12-01 06:00:19 UTC-0800 | DeleteStackFunctionExecutionRole | DELETE_FAILED | Internal Failure

The image is built correctly even though the stack is in DELETE_FAILED status, and you can use it as a custom AMI for cluster creation.

Affected versions

ParallelCluster versions 3.0.0-3.10.1 are affected.

Mitigation

To delete a stack that is in DELETE_FAILED status, delete it manually in the AWS CloudFormation web console or with the AWS CLI. To fix stack self-deletion for future build-image runs, specify a custom IAM role with the CleanupLambdaRole parameter (in the Build / Iam section) of the build image configuration file. The custom IAM role should contain the policies described here.
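As a minimal sketch of the manual AWS CLI cleanup, the helper below requests deletion of the stack again and then waits for the result. The function name delete_build_image_stack is hypothetical and not part of ParallelCluster. It assumes the AWS CLI is installed and configured with credentials and a region that can reach the stack.

```shell
# Hypothetical helper: retry deletion of a build-image stack left in DELETE_FAILED.
# Usage: delete_build_image_stack STACK_NAME
delete_build_image_stack() {
  if [ -z "$1" ]; then
    echo "usage: delete_build_image_stack STACK_NAME" >&2
    return 1
  fi
  # Ask CloudFormation to delete the stack again.
  aws cloudformation delete-stack --stack-name "$1" || return 1
  # Block until the deletion finishes; exits non-zero if the deletion fails again.
  aws cloudformation wait stack-delete-complete --stack-name "$1"
}
```

For the stack in the events shown earlier, the call would be delete_build_image_stack aws-parallelcluster-3-7-2-amzn2-hvm-arm64-202312011211. If the wait step returns a non-zero status, the stack events in the console show which resource failed.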