-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
aws-ecs: LogDriver automatically creates log-resource-policies, exhausting unmodifiable resource limit #22307
Comments
@trks1970 This makes sense to me. Before we dive deep into the fix, can you provide reproduction steps with CDK code that can help us quickly reproduce this issue from our end? |
This issue has not received a response in a while. If you want to keep this issue open, please leave a comment below and auto-close will be canceled. |
Will take some time to provide an example. |
Hi @pahud we are running into this too with some CDK-created CloudWatch log group policies: Example 1, sending cloudtrail events to eventbridge and logging the events:
Example 2, creating an OpenSearch domain:
|
Are you having the Resource limit exceeded error just with this? Can you provide a full sample that reproduces this error? const loginEvent = aws_cloudtrail.Trail.onEvent(this, 'login-event', {
eventPattern: {
source: ['aws.signin'],
detailType: ['AWS Console Sign In via CloudTrail'],
},
});
loginEvent.addTarget(new targets.CloudWatchLogGroup(new logs.LogGroup(this, 'login-event-logs'))); |
@pahud We already have 10 cloudwatch log group policies in our account. Running the code you mentioned will create one more, and thus the error. |
OK I see. Looking at this
When we new AwsLogDriver() for a container definition, there will always be a grantWrite() execution, adding a new log group policy to this log group resource.
I don't have immediate workaround and I am making it p1 here for more visibility. Meanwhile I will try reproduce this in my account by creating 10 containers with the same log group. |
I was trying to create 10 containers with 2 approaches below and I didn't hit the error you mentioned. import * as cdk from 'aws-cdk-lib';
import { Construct } from 'constructs';
export class EcsTsStack extends cdk.Stack {
constructor(scope: Construct, id: string, props?: cdk.StackProps) {
super(scope, id, props);
const logGroup = new cdk.aws_logs.LogGroup(this, 'LG')
const task = new cdk.aws_ecs.FargateTaskDefinition(this, 'Task', {
cpu: 512,
memoryLimitMiB: 4096,
});
const task2 = new cdk.aws_ecs.FargateTaskDefinition(this, 'Task2', {
cpu: 512,
memoryLimitMiB: 4096,
});
// this works
for (let x = 0; x < 10; x++) {
task.addContainer(`nginx${x}`, {
image: cdk.aws_ecs.ContainerImage.fromRegistry('nginx:latest'),
memoryLimitMiB: 256,
logging: new cdk.aws_ecs.AwsLogDriver({
logGroup,
streamPrefix: 'demo-',
})
});
};
// this works as well
for (let x = 0; x < 10; x++) {
task2.addContainer(`nginx${x}`, {
image: cdk.aws_ecs.ContainerImage.fromRegistry('nginx:latest'),
memoryLimitMiB: 256,
logging: new cdk.aws_ecs.AwsLogDriver({
logGroup: new cdk.aws_logs.LogGroup(this, `LG${x}`),
streamPrefix: 'demo-',
})
});
};
}
} Can you provide me with the full cdk code that reproduces your error? |
OK I can reproduce this as below: // create 10 events each having a loggroup as the event target
for (let x = 0; x < 10; x++) {
const loginEvent = cdk.aws_cloudtrail.Trail.onEvent(this, `login-event${x}`, {
eventPattern: {
source: ['aws.signin'],
detailType: ['AWS Console Sign In via CloudTrail'],
},
});
loginEvent.addTarget(new cdk.aws_events_targets.CloudWatchLogGroup(
new cdk.aws_logs.LogGroup(this, `login-event-logs${x}`),
));
} This will create 10 cloudwatch logs resource policies and according to the doc the hard limit is 10 thus the error. The resources can be listed with I think this is not a bug but a limitation of cloudwatch logs resources. From CDK's perspective we probably can have some workaround such as opt out the resource policy creation but I am not sure if this is a good idea. I am making this as p1 feature request for more visibility from the team. |
Similar Issue: #20313 |
I have come up with a possible workaround. |
Any Update on this? |
@sakurai-ryo just seeing pull request on this. when do we expect this change to available? any local workaround we can have? Thanks |
@XiaowenMaoA If this doesn't work, please send me the code and I may be able to suggest a workaround. |
@sakurai-ryo we are not inside ECS task. create OS domain logic is like following.
|
Thanks @XiaowenMaoA Looking at this part of the code, it appears that we are currently using CustomResource to create the ResourcePolicy for the logs.
Since CustomResource is executed by CloudFormation after The PR I created is an ECS fix, and even if it were merged, it would not solve the problem on the OpenSearch side. |
@sakurai-ryo i found one related issue created before, #23637 i can not find a way to pass the shared ResourcePolicy and with logging enabled. Is there any possible workaround? |
This PR modified to avoid creating unnecessary `ResourcePolicy` in CloudWatch Logs. ## issue summary The related issue reports an error when using the awslogs driver on ECS. This error is caused by the creation of a ResourcePolicy in CloudWatch Logs that reaches the maximum number of ResourcePolicies. https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/cloudwatch_limits_cwl.html ## Current behavior In some cases, this ResourcePolicy will be created and in other cases it will not be created. Currently, `Grant.addToPrincipalOrResource` is used to grant permissions to ExecutionRole and Log Group in the ECS taskDef. https://github.com/aws/aws-cdk/blob/607dccb0fd920d25f0fe2613b83c9830322c439e/packages/aws-cdk-lib/aws-ecs/lib/log-drivers/aws-log-driver.ts#L138 https://github.com/aws/aws-cdk/blob/607dccb0fd920d25f0fe2613b83c9830322c439e/packages/aws-cdk-lib/aws-logs/lib/log-group.ts#L194 https://github.com/aws/aws-cdk/blob/607dccb0fd920d25f0fe2613b83c9830322c439e/packages/aws-cdk-lib/aws-iam/lib/grant.ts#L122 `Grant.addToPrincipalOrResource` first grants permissions to the Grantee (ExecutionRole) and creates a resource base policy for cross account access in cases where certain conditions are not met. This condition is determined by the contents of the `principalAccount` of the ExecutionRole and the accountID in the `env.account` and whether or not these are Tokens, but in this scenario, cross account access is not necessary. https://github.com/aws/aws-cdk/blob/607dccb0fd920d25f0fe2613b83c9830322c439e/packages/aws-cdk-lib/aws-iam/lib/grant.ts#L141 Also, when the `LogGroup.grantWrite` call was added to `aws-log-driver.ts`, the ResourcePolicy for logs could not be created from CFn and only granted to the ExecutionRole. #1291 ![スクリーンショット 2023-12-27 1 08 20](https://github.com/aws/aws-cdk/assets/58683719/5a17a041-d560-45fa-bac6-cdc3894b18bc) https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/ReleaseHistory.html Therefore, the resource base policy should not be necessary when using the awslogs driver. ## Major changes This PR changed to grant permissions only to ExecutionRole when using the awslogs driver. With this fix, ResourcePolicy will no longer be created when using the awslogs driver. I don't consider this a breaking change, as it changes the content of the generated template, but does not change the behavior of forwarding logs to CloudWatch Logs. However, if this is a breaking change, I think it is necessary to use the feature flag. fixes #22307, fixes #20313 ---- *By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license*
|
This PR modified to avoid creating unnecessary `ResourcePolicy` in CloudWatch Logs. The related issue reports an error when using the awslogs driver on ECS. This error is caused by the creation of a ResourcePolicy in CloudWatch Logs that reaches the maximum number of ResourcePolicies. https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/cloudwatch_limits_cwl.html In some cases, this ResourcePolicy will be created and in other cases it will not be created. Currently, `Grant.addToPrincipalOrResource` is used to grant permissions to ExecutionRole and Log Group in the ECS taskDef. https://github.com/aws/aws-cdk/blob/607dccb0fd920d25f0fe2613b83c9830322c439e/packages/aws-cdk-lib/aws-ecs/lib/log-drivers/aws-log-driver.ts#L138 https://github.com/aws/aws-cdk/blob/607dccb0fd920d25f0fe2613b83c9830322c439e/packages/aws-cdk-lib/aws-logs/lib/log-group.ts#L194 https://github.com/aws/aws-cdk/blob/607dccb0fd920d25f0fe2613b83c9830322c439e/packages/aws-cdk-lib/aws-iam/lib/grant.ts#L122 `Grant.addToPrincipalOrResource` first grants permissions to the Grantee (ExecutionRole) and creates a resource base policy for cross account access in cases where certain conditions are not met. This condition is determined by the contents of the `principalAccount` of the ExecutionRole and the accountID in the `env.account` and whether or not these are Tokens, but in this scenario, cross account access is not necessary. https://github.com/aws/aws-cdk/blob/607dccb0fd920d25f0fe2613b83c9830322c439e/packages/aws-cdk-lib/aws-iam/lib/grant.ts#L141 Also, when the `LogGroup.grantWrite` call was added to `aws-log-driver.ts`, the ResourcePolicy for logs could not be created from CFn and only granted to the ExecutionRole. #1291 ![スクリーンショット 2023-12-27 1 08 20](https://github.com/aws/aws-cdk/assets/58683719/5a17a041-d560-45fa-bac6-cdc3894b18bc) https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/ReleaseHistory.html Therefore, the resource base policy should not be necessary when using the awslogs driver. This PR changed to grant permissions only to ExecutionRole when using the awslogs driver. With this fix, ResourcePolicy will no longer be created when using the awslogs driver. I don't consider this a breaking change, as it changes the content of the generated template, but does not change the behavior of forwarding logs to CloudWatch Logs. However, if this is a breaking change, I think it is necessary to use the feature flag. fixes #22307, fixes #20313 ---- *By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license*
@XiaowenMaoA |
Any update on this? we don't want to add policies on our roles, but we are forced to do it when defining the logs.. |
Hi @stoicad. |
Hi @sakurai-ryo, yes we do. We are not allowed to touch the existing roles via CDK, but when defining this one:
it tries to add permissions to the role attached to the ECS (as a separate policy), although the logs:* is already available on the role. |
Thanks @stoicad. |
Describe the bug
We are running 12 Fargate containers, each of them logging to CloudWatch via LogDriver, creare with cdk.
However, the last 2 fails to deploy, due to
"Resource limit exceeded. (Service: CloudWatchLogs, Status Code: 400,
which I understand is fixed at 10 and that is it. No way to change.
However, it looks like log-resource-policy is created automaticall by LogDriver, which uses up the amount.
Expected Behavior
It is not possible a) not to create a policy or b) reuse a policy
Current Behavior
A log-resource-policy is created with each LogDriver.awsLogs call.
Reproduction Steps
Deploy 10 Containers with LogDriver
Possible Solution
No response
Additional Information/Context
No response
CDK CLI Version
2.40.0 (build 56ba2ab)
Framework Version
aws-cdk-lib:2.27.0
Node.js Version
16.15.1
OS
Win10
Language
Java
Language Version
Kotlin
Other information
No response
The text was updated successfully, but these errors were encountered: