Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CDK Refactoring Tools #162

Open
1 of 2 tasks
Gtofig opened this issue May 21, 2020 · 15 comments
Open
1 of 2 tasks

CDK Refactoring Tools #162

Gtofig opened this issue May 21, 2020 · 15 comments
Assignees
Labels
management/rfc status/planning RFC implementation is being planned

Comments

@Gtofig
Copy link

Gtofig commented May 21, 2020

At the moment, when I move constructs around in CDK, this results in logical IDs changing and resources being replaced during deployment, even stateful ones. This causes major pain, especially if it's a production service.

I would like CDK to offer a simpler way of refactoring code without having to worry about resource replacement.

Use Case

Refactoring code is a natural part of development, including CDK development. I don't want to have to worry about breaking production due to moving constructs around.

Proposed Solution

Option 1) Work with CloudFormation team to support resource logical ID renaming without replacement (maybe use CDK metadata as a key to identify that it's the same resource)

Option 2) Make it easier to hardcode logical IDs in CDK. At the moment it only works via L1 resources, but most constructs we deal with are not L1, therefore it becomes painful to manually set logical IDs. Perhaps Construct could expose a method to hardcode prefix for logical IDs

Other

  • 👋 I may be able to implement this feature request
  • ⚠️ This feature might incur a breaking change

This is a 🚀 Feature Request

@eladb
Copy link
Contributor

eladb commented May 21, 2020

Are you familiar with renameLogicalId?

@eladb
Copy link
Contributor

eladb commented May 21, 2020

Transferring this to the RFC repo as this requires further discussion.

@eladb eladb transferred this issue from aws/aws-cdk May 21, 2020
@Gtofig
Copy link
Author

Gtofig commented May 21, 2020

Are you familiar with renameLogicalId?

Yes, but this only works at L1 level, so if I have a high level construct which I move around, it becomes more complicated. Perhaps you have a pattern in mind for this?

@tdclark
Copy link

tdclark commented May 21, 2020

It also seems like a big ask to have to drop down to renameLogicalId every time I want to move/refactor a construct around my CDK app. Also, how would this function for composed constructs? If I have a construct that I vend to multiple apps, how would renameLogicalId help me move around some of the sub-constructs within the construct and not have impact on the users of the construct?

@tdclark
Copy link

tdclark commented Aug 31, 2020

Any updates here? It seems untenable to not allow folks to refactor CDK code after the first deployment without causing breakages.

@rix0rrr
Copy link
Contributor

rix0rrr commented Sep 15, 2020

I think the better solution is to allow remapping construct paths at any point in the tree.

As long as that's a hierarchical operation we should be fine.

@foriequal0
Copy link

I use this https://gist.github.com/foriequal0/f1f4ea279fb64836e5fb38efefa133d7 while waiting for a feature that rename logical id in a stack

@markandrus
Copy link

I have just refactored a DynamoDB Table from a Stack into its own Construct within the Stack. My colleagues and I have already deployed copies of the Stack before this change into our respective AWS Accounts. How do I ensure the refactor I've just checked in will cleanly deploy, without my colleagues and I all having to jump through hoops, rename things, mess with the cdk.context.json, etc.?

It would be great if CDK had a solution for such a refactor.

@fitzoh
Copy link

fitzoh commented Dec 6, 2021

I really like how this is handled in pulumi via aliasing.
There's a great blog post explaining how it works here

@Gtofig
Copy link
Author

Gtofig commented Jun 15, 2022

Would CDK context be an appropriate tool to help here?

I'm thinking if we had a method like freezeLogicalId() on a Construct, which would check CDK context for existing logical IDs for this construct, and if not found, it would store current logical IDs into the context.

@epheat
Copy link

epheat commented Dec 19, 2022

I think this is pretty big gap in aws-cdk functionality that really hinders the cdk's main goal of allowing composability and higher level logic for creating cloud infrastructure. The lack of built-in support for refactoring without resource recreation implicitly influences developers to create giant flat stacks of resources to ensure logicalId stability, which ends up just looking like CFN templates.

A tighter integration with CloudFormation resource import functionality could probably close the gap - @foriequal0 may be onto something with their comment here: aws/aws-cdk#4733 (comment)

But until the cdk addresses it, this will always be a sharp edge. One possible workaround is to add an optional logicalId override value in the props of your reusable constructs. This will allow consumers of the construct to pin the logicalId if they desire, or omit it if they'd rather have the cdk automatically generate it. With the current state of things I'd actually recommend that for stateful named resources such as DynamoDb tables, s3 buckets, etc. users should always provide a logicalId override from the get-go, in order to allow for later refactors without replacement of the underlying resource. Otherwise you'll eventually end up with a bunch of random (table.node.defaultChild as CfnTable).overrideLogicalId("MyStackAlarmedTableTable5E3CHJS") all over your codebase.

A dummy example:

// alarmed-table.ts - a reusable construct that spins up a table and an alarm monitoring its usage

import { Construct } from "constructs";
import { aws_dynamodb as dynamodb, aws_cloudwatch as cloudwatch } from "aws-cdk-lib";

export interface MyAlarmedTableProps {
  tableName: string;
  tableLogicalId?: string;
}

export class MyAlarmedTable extends Construct {
  readonly table: dynamodb.Table;
  readonly usageAlarm: cloudwatch.Alarm;

  constructor(scope: Construct, id: string, props: MyAlarmedTableProps) {
    super(scope, id);

    this.table = new dynamodb.Table(this, 'Table', {
      tableName: props.tableName,
      billingMode: BillingMode.PAY_PER_REQUEST,
    });
    if (props.tableLogicalId) {
      (this.table.node.defaultChild as CfnTable).overrideLogicalId(props.tableLogicalId);
    }

    this.alarm = new cloudwatch.Alarm(this, 'UsageAlarm', {
      // metric for throttled requests, latency, or something
    });
  }
}

and its usage:

// stack.ts

import { Stack, StackProps } from "aws-cdk-lib";
import { MyAlarmedTable } from "./alarmed-table";

class MyStack extends Stack {
  constructor(scope: App, id: string, props: StackProps) {
    super(scope, id, props);

    new MyAlarmedTable(this, 'UnitOrdersTable', {
      tableName: `UnitOrders-${props.env.region}`,
      tableLogicalId: "UnitOrdersDDB",
    });
  }
}

With this approach, no matter where you refactor and move that UnitOrdersTable resource within the CDK app, it will always refer to the same logical resource.

@evgenyka evgenyka added status/planning RFC implementation is being planned and removed status/proposed Newly proposed RFC labels Sep 28, 2023
@BwL1289
Copy link

BwL1289 commented Mar 7, 2024

Does anyone have any recent advice on how to deal with this?

I've taken great care not to use physical names, but sometimes you just can't avoid it. For example, if you're working with cognito resources like a user pool identity provider or api gateway custom domain names where CDK requires a domain_name, its just not possible to get around. The physical names of those resources are required (and it wouldn't make sense for CDK to generate values for them anyway).

I've played with rename_logical_id but I'm unclear about unintended downstream consequences and confusion between rename_logical_id, override_logical_id, and allocate_logical_id (I know they operate on different types of resources, but some best practices of employing one over the other would be appreciated). If I understand correctly, allocate_logical_id is for "prefixing" type use cases.

As an aside, dropping to L1s for every stateful construct is a bit of a pain but can be easily circumvented.

If you're not using CDK in a static setting, this sharp edge can be dangerous in production for stateful resources, especially if you're using it in to deploy infra for customers.

Would love to hear if there's advice on this. Thanks in advance!

See #455, aws/aws-cdk#22016 and aws/aws-cdk#1424, aws/aws-cdk#4733

@BwL1289
Copy link

BwL1289 commented Mar 7, 2024

For now, here is how I'm dealing with this (in Python).

First, create a small wrapper to make pylance happy and dropping to L1s more convenient.

def override_logical_id(resource: Resource | CfnResource, logical_id: str) -> None:
    if isinstance(resource, Resource):
        cfn_resource = cast(CfnResource, resource.node.default_child)
        cfn_resource.override_logical_id(logical_id)
    else:
        resource.override_logical_id(logical_id)

Then for each resource where I need to retain state (cognito, S3, etc) and I only want to delete/update/recreate resources if I explicitly change the Id myself, I do the following:

override_logical_id(cognito_user_pool, "CognitoUserPool")

Ofc, the downside here is that if I make an update to the resource in CDK, it won't get deployed.

FWIW I know this can easily be done in a more global way w/ aspects, but for now, I'd like to keep this as stupid simple as possible.

Can someone please validate this approach and let me know of any 'gotchas' I may be missing? Does this also mean I can now name these resources without fear of naming conflicts and stack rollbacks?

This is for new deployments only, not existing deployments.

Thank you.

@BwL1289
Copy link

BwL1289 commented Mar 14, 2024

For anyone who finds this, answering my own question here.

Yes, this will work. And no, in my limited experience overriding logicalIds, this does not mean you can now name these resources without fear of naming conflicts and stack rollbacks.

If you override the logicalId and provide a name (for example, bucket_name for an S3 resource), your stack will fail with the following even if you don't have an existing bucket with that name:

The stack named WsStack failed creation, it may need to be manually deleted from the AWS console: ROLLBACK_COMPLETE: Resource handler returned message: "<bucket name>" already exists

Unclear why, but that's what I'm seeing.

@BwL1289
Copy link

BwL1289 commented Apr 20, 2024

Apologies in advance for the length of this post.

More discovery from the front lines: Api Gateway resources need to be handled with extreme care. See below for details.

Before diving into specifics, for anyone reading this, this sharp edge in CDK is extremely sharp. How sharp?

This sharp:

main-qimg-6c17adac88f211ef7d49f8c8f4fe10df

In all seriousness, to the hardworking CDK engineers — you are appreciated. Nonetheless, the issue of unintended side effects while changing seemingly minor things in CDK really needs to be addressed and rectified.

For example, while changing an Id upstream of various Api Gateway resources (even when many downstream Ids like apigw.DomainName, r53.ARecord, etc, were locked, and the resulting affected resources were undefined by our application) caused naming conflicts and stack rollbacks (more on this below):

Base path already exists for this domain name.

Did I know things may be affected downstream? Yes, of course. But I was under the false assumption that I had locked all Ids that needed to be locked, because I was not explicitly defining any of the resources that were affected. Admittedly, this may have been a naive assumption.

After discussing with AWS Support, the following resources need to be locked in Api Gateway even if you are not defining them:

apigw.DomainName
apigw.BasePathMapping
apigw.Deployment
apigw.Stage
apigw.UsagePlan

In our case, we were not defining apigw.BasePathMapping, but because Cfn/CDK assumes that the (none) base path is a named resource, we got: base path already exists for this domain name.

A separate albeit related issue is that some of the Api Gateway methods cannot be passed an Id at all, so you have to use the resource construct itself. For example:

the add_base_path_mapping method on DomainName does not take an Id as an arg, so you have to use the BasePathMapping resource construct in order to freeze the Id.

For reference, here's the relevant info from the docs for Deployment, which is the cause of the above as CDK tries to do the heavy lifting for you:

Normally, you don't need to define deployments manually. The RestApi construct manages a Deployment resource that represents the latest model. It can be accessed through restApi.latestDeployment (unless deploy: false is set when defining the RestApi).

If you manually define this resource, you will need to know that since deployments are immutable, as long as the resource's logical ID doesn't change, the deployment will represent the snapshot in time in which the resource was created. This means that if you modify the RestApi model (i.e. add methods or resources), these changes will not be reflected unless a new deployment resource is created.

All of that being said, we're now freezing as many Ids for many of the Api Gateway resources as we can because it's difficult to test 32 (2^5) permutations of the resources above, so now our code looks like this:

**# ↓ WARNING: DO NOT CHANGE ↓**
override_logical_id(
    self._api_gw_custom_domain_name,
    "ApiGatewayCustomDomainName",  **# WARNING: DO NOT CHANGE**
)
**# ↑ WARNING: DO NOT CHANGE ↑**

**# ↓ WARNING: DO NOT CHANGE ↓**
override_logical_id(
    self._api_gw_custom_domain_base_path_mapping,
    "ApiGatewayCustomDomainNameBasePathMapping",  **# WARNING: DO NOT CHANGE** 
)
**# ↑ WARNING: DO NOT CHANGE ↑**

**# ↓ WARNING: DO NOT CHANGE ↓**
override_logical_id(
    self._api_gw_deployment,
    "ApiGatewayDeployment",  **# WARNING: DO NOT CHANGE** 
)
**# ↑ WARNING: DO NOT CHANGE ↑**


. . . 

Which just looks silly.

I know this issue has been open ~4 years, there have been myriad github issues created, and according to AWS support, dozens of internal support tickets created by customers about this.

My hope is that the CDK team can dedicate time to finding / picking a solution to this problem as some seemingly viable options (IMO) have been offered. But, not doing anything more than making it possible to rename Ids is (again IMO) not enough

Imagine if after CDK v1 was released, large parts of the codebase could never be changed?

I hope the intent of this post is clear — let's find a solution to this problem. It's not to point fingers or be ungrateful. I know how hard the CDK org has worked.

NOTE: if any of the above is incorrect, please let me know. Happy to edit this post as needed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
management/rfc status/planning RFC implementation is being planned
Projects
None yet
Development

No branches or pull requests