Asynchronous processing (async/await) in a construct #8273
Comments
I wrap my "main" (entrypoint) in an async function, since I have some similar needs (I pre-fetch some things from Secrets Manager). I don't use the out of the box directory structure (what comes with async function main() {
const res = await getMyAsyncThing();
const app = new App();
new FooStack(app, 'fooStack');
...
}
main(); It works great for me! |
We currently do not support this and in general we consider this an anti-pattern. One of the tenets of CDK apps is that given the same source they will always produce the same output (same as a compiler). If you need to perform async operations, it means you are going to the network to consult with an external entity, which by definition means you lose determinism. File system operations can be done synchronously in Node.js, so consulting your local disk is "ok", but bear in mind that this still means that you may end up with non-deterministic outputs, which breaks some core assumptions of the framework and is considered bad practice in the operational sense. One direction we are considering is to open up the context provider framework (see aws/aws-cdk-rfcs#167). Closing for now. |
@eladb I mostly agree with you here. But one important corner case is if you want to use node's crypto library. Those functions all use promises / callbacks. |
Which functions? |
@eladb nevermind, I erred. There are sync versions of what I thought was purely async. Thanks! |
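For readers who land here with the same question, a minimal sketch of the synchronous side of Node's crypto module follows. The helper names (`newSalt`, `hashPassword`, `sha256Hex`) are made up for illustration; `randomBytes`, `scryptSync` and `createHash` are the actual Node.js functions.

```typescript
import { createHash, randomBytes, scryptSync } from 'node:crypto';

// randomBytes called without a callback runs synchronously and returns a Buffer.
export function newSalt(): Buffer {
  return randomBytes(16);
}

// scryptSync is the blocking counterpart of the callback-based crypto.scrypt.
export function hashPassword(password: string, salt: Buffer): string {
  return scryptSync(password, salt, 32).toString('hex');
}

// Hash objects are synchronous by design: update() and digest() return immediately.
export function sha256Hex(input: string): string {
  return createHash('sha256').update(input).digest('hex');
}
```

Note that `newSalt` is synchronous but still non-deterministic, so calling it during synthesis would change the output on every run; the two hashing helpers return the same value for the same input.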
Strong disagree here, and a bit annoyed. Sorry. I need/want a custom asset bundler for Cloudwatch Synthetic event lambdas (due to the directory layout requirements if you have more than a single It's disingenuous to claim to be protecting users from themselves maybe possibly doing something that could lead to non-deterministic behavior as a means of justifying only allowing |
Is it possible to use the new asset bundling API to run |
Unfortunately not, (Thank you for taking the time to respond. Sorry for the tone, just hit this wall after hours of digging into the codebase.) |
You don't have to implement |
Is there any way to do this without falling back to Docker (which is a |
At the moment the only way would be to simply use |
Hmm, what about context providers? Are you telling me that *.fromLookup(...) will also not go over the network? |
This was my thought. I think limiting CDK to sync activities is an odd thing to impose. I have a few use cases where I want to use CDK to generate CloudFormation/Terraform based on data in an API or from another source. For me, the main benefit of CDK is being able to use "real code" to create infra and to create constructs that can do some of the heavy lifting for me when it comes to configuration. I can do this with Python because most of it is synchronous, but I was disappointed to find this being a limitation when using CDK with TypeScript. It feels like something CDK should support overall, letting users decide whether it's an anti-pattern based on their requirements, workflows and use cases. |
I've been using async code for the CDK in typescript for a while (because I wanted to use rollup to bundle my lambdas), but in order to do so I had to stay away from the CDK-way which is defining your code using the class constructs. I used async factories instead. Just a simple example: // Instead of the following
export class MyConstruct extends cdk.Construct {
constructor(scope: cdk.Construct, id: string, props: MyConstructProps) {
super(scope, id);
this.lambda = new lambda.Function(this, "MyLambda", {
code: ...
});
}
}
// I just did
export async function createMyConstruct(scope: cdk.Construct, id: string, props: MyConstructProps) {
const construct = new cdk.Construct(scope, id);
// Named `fn` so the variable does not shadow the imported `lambda` module.
const fn = new lambda.Function(construct, "MyLambda", {
code: await bundleCodeWithRollup(...),
});
return {
construct,
lambda: fn,
};
} AFAIK, there shouldn't be any technical issues ever if you do it like this. |
One of the big problems in my opinion is that the AWS SDK is asynchronous, so it's not compatible with the CDK. |
@girotomas This is my use-case exactly. Because of AWS account quotas, I need to use sub-accounts to scale. So when I want to update my stacks, I need to check current resource usage (dynamo), perhaps create a new account (sdk), and create/update stacks across the fleet. It's still deterministic. Given a certain state of the database, a certain output is achieved. |
When we say "deterministic" in this context we mean that a commit in your repository will always produce the same CDK output. This is a common invariant for compilers and build systems and this is where the CDK tenet comes from. If you consult an external database during synthesis, this invariant may break, depending on the contents of your database. What I would recommend to do is to write a little program/script that queries your database, creates any accounts needed and then writes some JSON file with a data model that can be read by your CDK app during synthesis. This file will be committed to source control, which means that if I clone the repo and run To keep this file up-to-date you can create a simple scheduled task that simply runs this script and commits the change to your repo. Very easy to do with something like GitHub workflows. This commit will trigger your CI/CD, and your CDK app will be resynthed accordingly. |
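A minimal sketch of the data-file approach described above, assuming a hypothetical `accounts.json` layout; the interface, file name and function names are illustrative and not part of the CDK. The writer is what the scheduled script would call after its (async) database queries, and the reader is what the CDK app would call during synthesis.

```typescript
import { mkdtempSync, readFileSync, writeFileSync } from 'node:fs';
import { tmpdir } from 'node:os';
import { join } from 'node:path';

// Hypothetical data model produced by the out-of-band script.
export interface AccountsFile {
  accounts: { id: string; region: string }[];
}

// Called by the scheduled script once it has finished querying the database.
export function writeAccountsFile(path: string, data: AccountsFile): void {
  // Stable formatting keeps the committed diffs small and readable.
  writeFileSync(path, JSON.stringify(data, null, 2) + '\n');
}

// Called by the CDK app during synthesis: a synchronous local read, no network.
export function readAccountsFile(path: string): AccountsFile {
  return JSON.parse(readFileSync(path, 'utf8')) as AccountsFile;
}

// Round trip in a temporary directory, standing in for the committed file.
const demoPath = join(mkdtempSync(join(tmpdir(), 'cdk-data-')), 'accounts.json');
writeAccountsFile(demoPath, { accounts: [{ id: '111111111111', region: 'eu-west-1' }] });
export const demoAccounts = readAccountsFile(demoPath);
```

In a real app the reader would loop over `accounts` and instantiate one stack per entry, so the synthesized output depends only on what is committed.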
@eladb Thanks. I'll head in that direction. I found that querying and caching the data seems to be the best way to build CDK apps anyway. As I code my way through the different triggers (user signup, user domain registration/transfer, bulk stack update, etc), I'm finding the natural boundaries seem to line up nicely with the cache method. If a nightly job checks for aws account creation, it can easily also commit the metadata to the repo. |
100% agree. Our experience shows that this pattern works pretty well and helps maintaining healthy architectural boundaries. |
Aside from that, not supporting async locks out other patterns like worker-threads or wasm. I think we are thinking too small here. There are many valid asynchronous use cases, that are still deterministic. |
I am thinking about storing the git commit signature in a system parameter at every deploy, which is an async command. Do you think this would also break the CDK assumptions? I could do that with the AWS CLI and a post-deploy hook, but that could break if a colleague doesn't have the CLI installed, or is using some strange operating system that has no bash (cough... win cough.. dows). |
I just wanted to add that my use case is very similar to what @girotomas mentioned. As a specific example, in my stack I set up an API gateway and secure it behind an API key; however, I wanted to include the API key value in the stack outputs so that it is easier for developers to use. I realized this was not possible, so the simplest (and most cost-effective) solution was to use the AWS SDK to auto-generate an API key and store it in Parameter Store. In my CDK script, I essentially have logic to either retrieve the API key value from this parameter, or else auto-generate a value and create the parameter if it doesn't exist. This allows me to retain the same API key value for a stack, and also populate a stack output with the value for the API key. The one downside is, as mentioned, that the AWS SDKs all seem to be asynchronous, so I'd need to use the |
I think we can all agree we are all worse off for this discussion. No one more than the cdk itself. |
I'd like to point out that Serverless Stack has supported CDK parallel builds and async constructs (well, stacks) for a few versions now: https://github.com/serverless-stack/sst/releases/tag/v1.11.1 |
I was facing a problem where I needed to retain an API Gateway custom domain since I didn't want to configure the DNS records of the not AWS managed domain whenever I redeploy the CDK app. Since CDK gave me a "Domain name already exists" error when I wanted to redeploy the stack I had the choice
import { Stack, StackProps, Duration, CfnOutput, RemovalPolicy } from 'aws-cdk-lib';
import { Construct } from 'constructs';
import { Certificate } from 'aws-cdk-lib/aws-certificatemanager';
import * as apigateway from 'aws-cdk-lib/aws-apigateway';
import { NodejsFunction } from 'aws-cdk-lib/aws-lambda-nodejs';
import { Runtime } from 'aws-cdk-lib/aws-lambda';
import * as path from 'path';
import { EndpointType } from 'aws-cdk-lib/aws-apigateway';
import * as sdk from 'aws-sdk';
export class PlatformApiStack extends Stack {
constructor(scope: Construct, id: string, props?: StackProps) {
super(scope, id, props);
// Wrapping the whole stack in an async function is considered anti-pattern.
// The alternative would be a Lambda backed CustomResource where you'd have to deal with resource creation
// using AWS SDK functions. Since this solution here works fine, why bother with the overhead.
(async () => {
// API Gateway
const api = new apigateway.RestApi(this, `${process.env.STACK_NAME}-${process.env.STACK_ENV}_api`, {
description: 'Customer API',
deployOptions: {
stageName: `${process.env.STACK_ENV}`
},
endpointConfiguration: { types: [EndpointType.REGIONAL] },
// enable CORS (TODO: harden it later)
defaultCorsPreflightOptions: {
allowHeaders: ['Content-Type', 'X-Amz-Date', 'Authorization', 'X-Api-Key'],
allowMethods: ['OPTIONS', 'GET', 'POST', 'PUT', 'PATCH', 'DELETE'],
allowCredentials: true,
allowOrigins: ['*']
}
});
let apiGatewayDomainExists = false;
const sdk_apigw = new sdk.APIGateway();
try {
// getDomainName rejects when the domain is missing, which lands in the catch block below.
await sdk_apigw.getDomainName({ domainName: `${process.env.DOMAIN}` }).promise();
apiGatewayDomainExists = true;
console.log(`API Gateway custom domain "${process.env.DOMAIN}" does exist and will NOT be created.`);
} catch (error) {
console.log(`API Gateway custom domain "${process.env.DOMAIN}" does not exist and will be created.`);
}
if (!apiGatewayDomainExists) {
const domainName = new apigateway.DomainName(this, `${process.env.STACK_NAME}-${process.env.STACK_ENV}_domain`, {
domainName: `${process.env.DOMAIN}`,
certificate: Certificate.fromCertificateArn(this, `${process.env.STACK_NAME}-${process.env.STACK_ENV}_cert`, `${process.env.AWS_ACM_CERT_ARN}`),
endpointType: EndpointType.REGIONAL,
mapping: api
});
domainName.applyRemovalPolicy(RemovalPolicy.RETAIN);
new CfnOutput(this, `${process.env.STACK_NAME}-${process.env.STACK_ENV}_api_gateway_domain_name`, { value: domainName.domainNameAliasDomainName });
}
//.... all the other stuff
})();
}
} This works as expected and only creates the DomainName resource if it doesn't exist already. Why would I bother using a CustomResource for this and are there other ways to achieve this goal? Thanks 🙂 |
I just ran into an issue where I was using Of course that involves awaiting on an SDK call while the stack is being constructed, which is not possible and led me here. In the end followed the above example's recommendation and put all the values I needed into the stack's props, requiring the caller to look up the right values and pass them in to the |
Having read all the use cases mentioned in this thread, I have another use case that required me to want to await an async operation: dynamically importing arbitrary TypeScript modules containing Construct classes based on some context value passed to Those dynamically imported modules are committed to Git but in separate projects: in fact, I have a generic, reusable CDK project that creates a sizable "standard" stack, and gets included as a Git submodule in other projects, some of which need to add minor extensions to that standard stack, hence needing to add some unique CDK code that has no business being included in the generic CDK project just so it can be conditionally invoked based on a config value. I think my use case is "deterministic" in every sense, as all code is committed in Git. I could publish the standard stack as an NPM package, but then I'd have to have a full CDK project with all its extra "boilerplate" in all my concrete projects instead of just adding a submodule and occasionally providing a single .ts file to be dynamically imported. |
One could argue that defining an entire stack inside a constructor is an anti-pattern. Constructors are for building bare resources or dependency injection, not complex if/else logic or other things we are forced to do. anyways, something as simple as If we have For example
|
@Kilowhisky CDK can do what you want, but you approach it differently. You need to use something like DockerImageAsset to have your project built inside Docker, and then that is what gets deployed. Async would definitely be good but I can see their point. They are forced to work within the limitations of CloudFormation. It would be better in many ways to scrap CloudFormation and have CDK issue all the API calls directly, as that would allow working around many of CloudFormation's limitations. At the moment a suitable workaround is putting your async code in the |
Have a look at CDK for Terraform 😉 Back when CDK was "unique" (CloudFormation only), this architectural decision did not have as big an impact as it does nowadays with all these new "targets" (Terraform, Kubernetes and even repositories with Projen). |
You can't even generate a zip file in NodeJS synchronously, this is a ridiculous limitation! |
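When only an async library is available for a task like this, one escape hatch is what the thread later calls shelling out to a separate process: run the async code in a child Node process and block on it. The sketch below assumes a stand-in async task (a timer that resolves to a small object); `childScript` and `runAsyncInChild` are hypothetical names, while `execFileSync` and `process.execPath` are standard Node.js.

```typescript
import { execFileSync } from 'node:child_process';

// Source of the child program. The timer stands in for any promise-based work
// (for example an async-only archiver); the result is printed as JSON on stdout.
const childScript = `
(async () => {
  const value = await new Promise((resolve) =>
    setTimeout(() => resolve({ bundled: true, bytes: 1234 }), 50));
  process.stdout.write(JSON.stringify(value));
})();
`;

// Blocks the calling thread until the child exits, then parses its stdout.
export function runAsyncInChild(): { bundled: boolean; bytes: number } {
  const stdout = execFileSync(process.execPath, ['-e', childScript], { encoding: 'utf8' });
  return JSON.parse(stdout);
}
```

The cost is one extra process start per call, and everything the child produces has to be serialized through stdout or written to disk.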
I've been reading emails for this thread for nearly 3 years. So before I unsubscribe, I thought I'd share my thoughts and research on the topic, given that this issue was closed with failed reasoning (logical fallacy):
(Emphasis mine). Unfortunately this logic fails even the most trivial of thought experiments. The failure here is assuming asynchronous software is by definition non-deterministic, synchronous software is by definition deterministic, and that the only use case for asynchronous software is networking or database queries or other non-deterministic tasks. Software concurrency is unrelated to the determinism of the resulting output of an application. If you concurrently add up a sum of N numbers, the result is the same as if you did it synchronously. If you synchronously add up the sum of R random numbers (generated synchronously), the output is still random. If deterministic concurrency was impossible, languages such as Go (with goroutines) and the Trying to protect your users from themselves and forcing unpopular arbitrary rules that fit an outdated/misinformed perspective will always result in your users finding a different (and often better) way to do what they want, how they want to do it - usually via your competitors. I was an early adopter and promoter of the AWS CDK. Sadly, this is one of several unfortunate architectural design decisions which has resulted in my company (and myself) abandoning the AWS CDK entirely. I'm sorry if it sounds harsh, but I believe honest feedback is important. |
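The summing thought experiment can be written down directly. This is an illustrative sketch only: `sumSync` and `sumConcurrent` are made-up names, and random timer delays stand in for concurrent work that finishes in an unpredictable order.

```typescript
// Plain sequential sum.
export function sumSync(values: number[]): number {
  return values.reduce((total, value) => total + value, 0);
}

// Sums fixed-size chunks concurrently. Each chunk resolves after a random delay,
// so the completion order differs from run to run.
export async function sumConcurrent(values: number[], chunkSize = 3): Promise<number> {
  const chunks: number[][] = [];
  for (let i = 0; i < values.length; i += chunkSize) {
    chunks.push(values.slice(i, i + chunkSize));
  }
  const partials = await Promise.all(
    chunks.map(
      (chunk) =>
        new Promise<number>((resolve) =>
          setTimeout(() => resolve(sumSync(chunk)), Math.random() * 20),
        ),
    ),
  );
  // Promise.all returns results in input order, and integer addition does not
  // depend on completion order, so the total matches sumSync.
  return sumSync(partials);
}
```

The converse holds as well: a fully synchronous function that calls `Math.random()` returns a different value on each call.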
@rrrix I also thought AWS CDK would be the tool. More and more I discovered that there are so many limitations, often due to CloudFormation, and I didn't want to always create custom resources and write SDK code. A recent problem has been creating multiple GSIs on a DynamoDB table. Not possible with CDK. In the end I've found that Terraform, due to its under-the-hood usage of the AWS SDK, gets things done easier and quicker for my use cases. |
@rrrix Thanks for your clear and on-point statement! |
It won't solve any other pains that are outlined above, but if you are genuinely blocked on async (ha) and only need it at the top stack level, I'd encourage checking out SST. It's a meta framework built on top of the CDK that, amongst other things, supports async in Stack declarations. |
Just want to add here that I stumbled across this thread and was pretty disheartened about it as well. However, upon looking at the aws-quickstart-blueprints library for TypeScript, there is a pattern for async/await through the CDK by nature of the fact that Node will wait until all promises have resolved. While this isn't free, creating an underlying resource stack and then keeping a list of ordered calls to subsequent addons is a pattern that will chain async behaviors. This definitely has some edges to it, but allows for things like fetching an existing secret and running bcrypt on it for a Kubernetes secret that is created for Argo CD. It would be nice to create a further abstraction of this current library that does not specifically target EKS clusters, but it can be cannibalized to basically be a build method that calls X number of "AddOn" interfaces that have an async deploy/postDeploy method in successive order after creating the underlying resource. The only place I can see this being problematic would be if a resource is created late in one of these methods and immediately accessed in another constructor (so it is not without teeth, but I believe I can make this edge case go away for usage of custom_resources). Would love to see this more readily supported by the base framework though. |
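A stripped-down sketch of the ordering idea in this comment, written with plain objects rather than the blueprints library itself. The `AddOn` interface shape, the `build` function and the choice to run every `deploy` before any `postDeploy` are assumptions made for illustration; the real library's contract may differ.

```typescript
// Hypothetical add-on contract: deploy is required, postDeploy is optional.
export interface AddOn {
  readonly name: string;
  deploy(log: string[]): Promise<void>;
  postDeploy?(log: string[]): Promise<void>;
}

// Creates the "underlying resource" first, then awaits each add-on in list order,
// so a slow add-on can never be overtaken by a faster one declared after it.
export async function build(addOns: AddOn[]): Promise<string[]> {
  const log: string[] = ['base resources created'];
  for (const addOn of addOns) {
    await addOn.deploy(log);
  }
  for (const addOn of addOns) {
    if (addOn.postDeploy) {
      await addOn.postDeploy(log);
    }
  }
  return log;
}

const delay = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Two demo add-ons: the first is deliberately slower than the second.
export const slowAddOn: AddOn = {
  name: 'slow',
  async deploy(log) {
    await delay(20);
    log.push('slow deployed');
  },
};

export const fastAddOn: AddOn = {
  name: 'fast',
  async deploy(log) {
    log.push('fast deployed');
  },
  async postDeploy(log) {
    log.push('fast post-deploy');
  },
};
```

Because `build` awaits each step before starting the next, the log order follows the declared list, not the individual durations.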
In my case, I need to create a CloudFront PublicKey, so I have a script to create/rotate public/private key contents, and some metadata (createdAt), and save it to a Systems Manager parameter. Then save the parameter name in cdk.context.json. At synth time I need to retrieve the public key contents to create the PublicKey, so I'm forced to do an async call to the AWS SSM API... I don't think saving key contents to source control is a good practice 😅 |
Workaround using Lazy to resolve the value from an immediately invoked async function:
Synth Output:
|
@deuscapturus Wouldn't that solution result in race conditions leading to possible random failures? To test this, if you put a delay in the async function, does it still work? e.g.
|
I tested for race conditions before posting. |
The explanation of non-determinism doesn't make sense at all, because it only applies to the Node.js nature of async operations. In Java CDK one could consult a database or make an HTTP call and proceed with the CDK based on that response, because Java will do that in a "sync" way. The real issue is that the code architecture of using class constructors to build your infrastructure is nonsense (this is the real anti-pattern here), and in Node.js you cannot have async class instantiation. |
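Since a constructor cannot be `async`, a common way around it is a static async factory: do the awaiting first, then call an ordinary synchronous constructor with the resolved values. The sketch below uses a hypothetical plain class, `ApiStackModel`, standing in for a stack; `lookup` stands in for any async check, such as whether a custom domain already exists.

```typescript
export interface ApiStackProps {
  readonly domainExists: boolean;
}

// Plain class standing in for a stack; it only records which resources it would define.
export class ApiStackModel {
  readonly resources: string[] = ['RestApi'];

  // The constructor stays synchronous and only sees already-resolved values.
  private constructor(props: ApiStackProps) {
    if (!props.domainExists) {
      this.resources.push('DomainName');
    }
  }

  // All awaiting happens here, before the instance is constructed.
  static async create(lookup: () => Promise<boolean>): Promise<ApiStackModel> {
    const domainExists = await lookup();
    return new ApiStackModel({ domainExists });
  }
}
```

Calling `await ApiStackModel.create(async () => true)` yields an instance with only the `RestApi` entry, while a lookup that resolves to `false` adds `DomainName`. The same shape works when the caller does the lookup itself and passes the result in through props.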
@eladb open this ticket up, please. Most people's views are against your view. |
How do you pass the same/current credentials that the CDK is using to the SDK when creating an instance from a class? e.g.
const sdk_apigw = new sdk.APIGateway({credentials: xxx}); |
Just in case anyone else stumbles on this thread and doesn't want to shell out to a separate process, it's possible to force synchronous execution by using node worker threads. There are some hoops to jump through depending on your flavor of TS/JS, but overall it's:
Not really great in terms of performance since the async code is now blocking, but if it's expected to be reasonably fast at least it's an option. It's also possible to use |
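One possible shape for the worker-thread approach, offered as a sketch and not as the commenter's original steps: the main thread starts a worker, blocks on `Atomics.wait` over a `SharedArrayBuffer`, and then reads the reply synchronously with `receiveMessageOnPort`. The worker source, the `runAsyncBlocking` name and the upper-casing task are placeholders for real async work.

```typescript
import { MessageChannel, Worker, receiveMessageOnPort } from 'node:worker_threads';

// Worker program (CommonJS, run via eval). The timer stands in for real async work.
const workerSource = `
const { workerData } = require('node:worker_threads');
const { port, signal, input } = workerData;
(async () => {
  const result = await new Promise((resolve) =>
    setTimeout(() => resolve(input.toUpperCase()), 50));
  port.postMessage({ result });           // queue the reply first
  const flag = new Int32Array(signal);
  Atomics.store(flag, 0, 1);              // then flip the flag
  Atomics.notify(flag, 0);                // and wake the blocked main thread
})();
`;

export function runAsyncBlocking(input: string): string {
  const signal = new SharedArrayBuffer(4);
  const flag = new Int32Array(signal);
  const { port1, port2 } = new MessageChannel();
  const worker = new Worker(workerSource, {
    eval: true,
    workerData: { port: port2, signal, input },
    transferList: [port2], // ports passed in workerData must be transferred
  });
  // Blocks this thread until the worker stores 1, or until the 10 s timeout.
  Atomics.wait(flag, 0, 0, 10_000);
  const reply = receiveMessageOnPort(port1);
  port1.close();
  void worker.terminate();
  if (!reply) {
    throw new Error('worker did not reply in time');
  }
  return reply.message.result as string;
}
```

`runAsyncBlocking('abc')` returns `'ABC'` after roughly the timer delay plus worker start-up. While it waits, the main thread's event loop is frozen, so nothing else in the process makes progress.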
❓ General Issue
The Question
Is it possible to do asynchronous processing in a Construct? Is there anything in CDK that will wait for a promise to resolve? It seems that everything is synchronous.
I need to resolve a string value to a path, but it could take some time, so I implemented it asynchronously.
Options considered:
Aspects
The visit method is synchronous.
Tokens
The resolve method is synchronous.
Create the app asynchronously
Context
The same as before, but put the result into the context, so I don't have to pass it to every stack/construct. Honestly, I don't really like this solution, because the construct is reaching out to some known location to get a value, like process.env calls that are scattered throughout the code.
Is there any support in CDK for asynchronous processing? Or is it an anti-pattern and I'm doing it the wrong way?
Environment
Other information