Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(core): Cyclic dependencies during prepareApp() #29501

Open
FlorinAsavoaie opened this issue Mar 15, 2024 · 7 comments
Open

(core): Cyclic dependencies during prepareApp() #29501

FlorinAsavoaie opened this issue Mar 15, 2024 · 7 comments
Assignees
Labels
@aws-cdk/core Related to core CDK functionality bug This issue is a bug. effort/medium Medium work item – several days of effort p1

Comments

@FlorinAsavoaie
Copy link
Contributor

FlorinAsavoaie commented Mar 15, 2024

Describe the bug

In certain scenarios, the prepareApp() phase generates cyclic dependencies while trying to find transitive dependencies of nested resources.

Expected Behavior

DependsOn is not generated on resources that really don't need it.

Current Behavior

Cyclic dependencies are generated by adding unnecessary DependsOn properties to resources.

Reproduction Steps

https://github.com/FlorinAsavoaie/cdk-grant-bug

I wrote a very simple piece of code here and a test for it. You will notice that the Queue starts depending on the TaskRole Policy and there are no reasons why that should be happening.

Additional Information/Context

I tried debugging this as much as possible. During synthesis, DependsOn on the CfnQueue resource node is empty until it reaches the prepareApp(root) step. Something goes wrong in the logic here.

If you search prepareApp in the list of issues on the project, most of the issues are referring to Cyclic Dependencies. I wouldn't be surprised if most of them are unnecessary, like reported here.

CDK CLI Version

2.132.1 (build 9df7dd3)

Node.js Version

v20.11.1

OS

Linux, Windows

Language

TypeScript, Java

@FlorinAsavoaie FlorinAsavoaie added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels Mar 15, 2024
@github-actions github-actions bot added the @aws-cdk/core Related to core CDK functionality label Mar 15, 2024
@pahud pahud self-assigned this Mar 18, 2024
@pahud
Copy link
Contributor

pahud commented Mar 18, 2024

I don't have any clue off the top of my head.

This works for me:

export class DummyStack extends Stack {
  constructor(scope: Construct, id: string, props: StackProps) {
    super(scope, id, props);  

    const vpc = ec2.Vpc.fromLookup(this, 'Vpc', { isDefault: true });
    const task = new ecs.FargateTaskDefinition(this, 'TaskDefinition');

    new ecs.FargateService(this, 'Service', {
      cluster: new ecs.Cluster(this, 'Cluster', { vpc }),
      taskDefinition: task,
    })

    task.addContainer('Container', { image: ecs.ContainerImage.fromRegistry('nginx') });
    const queue = new sqs.Queue(this, 'Queue');
    queue.grantConsumeMessages(task.taskRole);
  } 
}

But this is having ValidationError on cdk deploy

cdk-dummy failed: Error [ValidationError]: Circular dependency between resources: [dummyfargateserviceQueueE402EBE3, dummyfargateserviceSecurityGroup4B42F2EB, TaskDefinitionTaskRoleDefaultPolicy282E8624, dummyfargateserviceService49EB81BF]

export class DummyFargateService extends ecs.FargateService {
  constructor(scope: Construct, id: string) {
    super(scope, id, {
      cluster: new ecs.Cluster(scope, 'Cluster', { 
        vpc: ec2.Vpc.fromLookup(scope, 'Vpc', { isDefault: true }),
      }),
      taskDefinition: new ecs.FargateTaskDefinition(scope, 'TaskDefinition'),
    })

    this.taskDefinition.addContainer('Container', { image: ecs.ContainerImage.fromRegistry('nginx') });

    const queue = new sqs.Queue(this, 'Queue');
    queue.grantConsumeMessages(this.taskDefinition.taskRole);
  }
}

Looking at the synthesized template:

template YAML
Resources:
  ClusterEB0386A7:
    Type: AWS::ECS::Cluster
    Metadata:
      aws:cdk:path: cdk-dummy/Cluster/Resource
  TaskDefinitionTaskRoleFD40A61D:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Statement:
          - Action: sts:AssumeRole
            Effect: Allow
            Principal:
              Service: ecs-tasks.amazonaws.com
        Version: "2012-10-17"
    Metadata:
      aws:cdk:path: cdk-dummy/TaskDefinition/TaskRole/Resource
  TaskDefinitionTaskRoleDefaultPolicy282E8624:
    Type: AWS::IAM::Policy
    Properties:
      PolicyDocument:
        Statement:
          - Action:
              - sqs:ChangeMessageVisibility
              - sqs:DeleteMessage
              - sqs:GetQueueAttributes
              - sqs:GetQueueUrl
              - sqs:ReceiveMessage
            Effect: Allow
            Resource:
              Fn::GetAtt:
                - dummyfargateserviceQueueE402EBE3
                - Arn
        Version: "2012-10-17"
      PolicyName: TaskDefinitionTaskRoleDefaultPolicy282E8624
      Roles:
        - Ref: TaskDefinitionTaskRoleFD40A61D
    Metadata:
      aws:cdk:path: cdk-dummy/TaskDefinition/TaskRole/DefaultPolicy/Resource
  TaskDefinitionB36D86D9:
    Type: AWS::ECS::TaskDefinition
    Properties:
      ContainerDefinitions:
        - Essential: true
          Image: nginx
          Name: Container
      Cpu: "256"
      Family: cdkdummyTaskDefinition07492867
      Memory: "512"
      NetworkMode: awsvpc
      RequiresCompatibilities:
        - FARGATE
      TaskRoleArn:
        Fn::GetAtt:
          - TaskDefinitionTaskRoleFD40A61D
          - Arn
    Metadata:
      aws:cdk:path: cdk-dummy/TaskDefinition/Resource
  dummyfargateserviceService49EB81BF:
    Type: AWS::ECS::Service
    Properties:
      Cluster:
        Ref: ClusterEB0386A7
      DeploymentConfiguration:
        Alarms:
          AlarmNames: []
          Enable: false
          Rollback: false
        MaximumPercent: 200
        MinimumHealthyPercent: 50
      EnableECSManagedTags: false
      LaunchType: FARGATE
      NetworkConfiguration:
        AwsvpcConfiguration:
          AssignPublicIp: DISABLED
          SecurityGroups:
            - Fn::GetAtt:
                - dummyfargateserviceSecurityGroup4B42F2EB
                - GroupId
          Subnets:
            - subnet-071c85610846aa9c0
            - subnet-0ef7ac49e1edb06e4
            - subnet-0e2177a10a166f87d
      TaskDefinition:
        Ref: TaskDefinitionB36D86D9
    DependsOn:
      - TaskDefinitionTaskRoleDefaultPolicy282E8624
      - TaskDefinitionTaskRoleFD40A61D
    Metadata:
      aws:cdk:path: cdk-dummy/dummy-fargate-service/Service
  dummyfargateserviceSecurityGroup4B42F2EB:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: cdk-dummy/dummy-fargate-service/SecurityGroup
      SecurityGroupEgress:
        - CidrIp: 0.0.0.0/0
          Description: Allow all outbound traffic by default
          IpProtocol: "-1"
      VpcId: vpc-1f5b7e78
    DependsOn:
      - TaskDefinitionTaskRoleDefaultPolicy282E8624
      - TaskDefinitionTaskRoleFD40A61D
    Metadata:
      aws:cdk:path: cdk-dummy/dummy-fargate-service/SecurityGroup/Resource
  dummyfargateserviceQueueE402EBE3:
    Type: AWS::SQS::Queue
    DependsOn:
      - TaskDefinitionTaskRoleDefaultPolicy282E8624
      - TaskDefinitionTaskRoleFD40A61D
    UpdateReplacePolicy: Delete
    DeletionPolicy: Delete
    Metadata:
      aws:cdk:path: cdk-dummy/dummy-fargate-service/Queue/Resource
  CDKMetadata:
    Type: AWS::CDK::Metadata
    Properties:
      Analytics: v2:deflate64:H4sIAAAAAAAA/11QTQ+CMAz9Ld7HFEiMd4xeFbybOYopH5vuA0PI/rtjYEI89fW1fX1tQuM0pbsN++iIl03U4oOOhWG8ITloaRUH4mv3EbimY9ZabUCRrBI/eGLqyQzcmG6OUKFAg1JMDf+MFIahALXiltkCVI9+jx9aoCPIOjrmsg1siBfZIh+mdEaOAE+8V+BWoRnOStrXLLEiHNFv7/tqwQalAJwLost5jghZAq31to8PNN77Z9QaMVJWGOyA5nP8AoemEv0pAQAA
    Metadata:
      aws:cdk:path: cdk-dummy/CDKMetadata/Default
Parameters:
  BootstrapVersion:
    Type: AWS::SSM::Parameter::Value<String>
    Default: /cdk-bootstrap/hnb659fds/version
    Description: Version of the CDK Bootstrap resources in this environment, automatically retrieved from SSM Parameter Store. [cdk:skip]
Rules:
  CheckBootstrapVersion:
    Assertions:
      - Assert:
          Fn::Not:
            - Fn::Contains:
                - - "1"
                  - "2"
                  - "3"
                  - "4"
                  - "5"
                - Ref: BootstrapVersion
        AssertDescription: CDK bootstrap stack version 6 required. Please run 'cdk bootstrap' with a recent version of the CDK CLI.

Looking into the template I found:

  • Queue depends on TaskRole and DefaultPolicy
  • DefaultPolicy depends on Queue

I think Queue should not DependOn any other resources but I can't find what makes that dependency.

@pahud pahud added p1 effort/medium Medium work item – several days of effort and removed needs-triage This issue or PR still needs to be triaged. labels Mar 18, 2024
@FlorinAsavoaie
Copy link
Contributor Author

I am fairly confident it happens here.

@pahud
Copy link
Contributor

pahud commented Mar 18, 2024

@FlorinAsavoaie Thank you for your insight. I'll bring up and discuss with the maintainers.

@pahud pahud removed their assignment Mar 22, 2024
@comcalvi
Copy link
Contributor

There's two workaround to this:

  1. avoid inheritance, and use a composition approach instead:
export class CdkGrantBugComponent extends Construct {
  constructor(scope: Construct, id: string) {
    super(scope, id);
    const fargateService = new ecs.FargateService(this, id, {
      cluster: ecs.Cluster.fromClusterAttributes(this, 'Cluster', {
        clusterName: 'Cluster',
        vpc: ec2.Vpc.fromVpcAttributes(this, 'Vpc', {
          availabilityZones: ['us-east-1'],
          publicSubnetIds: ['sub-1'],
          vpcId: cdk.Fn.importValue('Vpc'),
        }),
      }),
      taskDefinition: new ecs.FargateTaskDefinition(this, 'TaskDefinition'),
    });

    fargateService.taskDefinition.addContainer('Container', { image: ecs.ContainerImage.fromRegistry('nginx') });

    const queue = new sqs.Queue(this, 'Queue');

    queue.grantConsumeMessages(fargateService.taskDefinition.taskRole);
  }
}
  1. Use the scope for the queue instead of this:
export class CdkGrantBug extends ecs.FargateService {
  constructor(scope: Construct, id: string) {
    super(scope, id, {
      cluster: ecs.Cluster.fromClusterAttributes(scope, 'Cluster', {
        clusterName: 'Cluster',
        vpc: ec2.Vpc.fromVpcAttributes(scope, 'Vpc', {
          availabilityZones: ['us-east-1'],
          publicSubnetIds: ['sub-1'],
          vpcId: cdk.Fn.importValue('Vpc'),
        }),
      }),
      taskDefinition: new ecs.FargateTaskDefinition(scope, 'TaskDefinition'),
    });

    this.taskDefinition.addContainer('Container', { image: ecs.ContainerImage.fromRegistry('nginx') });

    const queue = new sqs.Queue(scope, 'Queue'); // <- here

    queue.grantConsumeMessages(this.taskDefinition.taskRole);
  }
}

When you create the queue in the this scope, you get this construct path for the queue:

CdkGrantBugStack/CdkGrantBug/Queue/Resource

When you instead create the queue in the scope scope, you get this construct path (this results in the queue having no dependencies at all):

CdkGrantBugStack/Queue/Resource

The interesting bit is the construct path of the TaskRole:

CdkGrantBugStack/TaskDefinition/TaskRole/Resource

I'm not sure what causes this dependency to appear at all yet...the Queue should never really depend on the IAM resources, it should be the other way around

@FlorinAsavoaie
Copy link
Contributor Author

I understand both workaround and have since applied the composition approach in my particular case.

However, this is still a bug and I believe it affects a lot of other cases. It is definitely in that bit of code in the prepareApp function, I debugged until there, but the code in there is quite cumbersome and I haven't had the time to go through it deeper or try to create a simpler test using only IDependency.

@comcalvi
Copy link
Contributor

This where the dependency is being created: https://github.com/aws/aws-cdk/blob/main/packages/aws-cdk-lib/aws-ecs/lib/base/base-service.ts#L680

Every child construct of the BaseService (which FargateService inherits from) will have a depdency on the taskRole. The fix here is to scope down the dependencies of BaseService to only be the resources we actually care about.

@FlorinAsavoaie
Copy link
Contributor Author

@comcalvi with all due respect, I am 99.99% sure that's not true. Simply attaching a debugger on the code you can easily see that the precise dependency discussed here is not added during synth so it is definitely not in the constructs code but in the method described by me in the original bug.

@comcalvi comcalvi self-assigned this Apr 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
@aws-cdk/core Related to core CDK functionality bug This issue is a bug. effort/medium Medium work item – several days of effort p1
Projects
None yet
Development

No branches or pull requests

3 participants