-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
(aws-appsync): resolver replacement issues #13269
Comments
This one is pretty serious in the fact that once you place your CF stack into a bad state, you can't roll back CDK to a working deployment without manual intervention and an outage. I'm going to escalate it with my TAM but is there a potential workaround to ensure that the unique ids do not change? |
It looks like its been a known issue for a few years https://forums.aws.amazon.com/thread.jspa?messageID=884492 |
This can be worked around as described in aws-amplify/amplify-cli#682. What I did was version my resolver |
@bigkraig thats a great find! could you post an example of the versioning, perhaps this is something we could handle in the construct opaquely? |
Just to show my work, here is a selection of the diff. How we might work this into the construct is something else, as we need to modify the schema blob. My original types were named Schema change:
I then also created some exports in the same path that contains the schema with:
And the modification to my resolvers look like:
|
ahh I see. Yeah this exact strategy may not be possible to handle within the construct as changing the schema won't be easy unless the user is generating the schema via CDK and not bringing their own. Versioning in general seems like something we should play with though. |
I've raised a similar issue on the AppSync community (aws/aws-appsync-community#146). There seems to be some disagreement about what causes this. The only way I could reliably reproduce this was to deploy a broken schema followed by the correct schema. There is also an issue where you rename fields/resolvers and they become detached but I couldn't reliably reproduce this with any consistency. This seems to affect larger stacks where a change on one resolver will cause others to become detached with larger stacks just having a higher chance of this happening. |
@alextriaca I'm not sure what would have caused your workaround to break in recent releases. I've had a hard time reliably reproducing the detached resolvers as well. |
Spent many hours debugging and working around the |
Is this being worked on actively by the Appsync/Cloudformation team? The CFN error Can we have an official statement of what is causing the issue and an ETA on a fix? OR if this is user error please explain the correct way to update/replace resolvers and datasources. |
I can reliably get the There doesn't seem to be a way of changing to a Lambda resolver short of changing the API. Changing the API because of a CFT/appsync bug seems like a terrible idea, my UI folks will be unhappy. |
Wicked find @pszabop this is awesome news. Reliably reproducing this issue has been seriously difficult so far. I really hope this helps AWS get to the bottom of this issue! |
@pszabop this is my experience too. I can reliably reproduce the error by first creating the resolver FROM the data source then attempting to change the data source type. Creating the resolver against the data source results in the inability to update it. You'll need to delete the resolver then re-create it. Does not go over well in production. In my experience to avoid this undesired behavior DO NOT create the resolver from the data source, create it from the appsync api. const graphqlApi = new GraphqlApi(...);
const demoDS = api.addDynamoDbDataSource(...);
// Do this
graphqlApi.createResolver(...)
//Not this
demoDS.createResolver(...) https://docs.aws.amazon.com/cdk/api/latest/docs/@aws-cdk_aws-appsync.CfnResolver.html |
Hi! The above solutions are not working for me - I need to add a caching configuration to the resolver and the only way to do that seems to be using CfnResolver. Changing the schema type names would be somewhat of a heavy lift. Do we have a path forward for this? Thanks, |
This is the workaround I'm using for now: // creates a resolver for a datasource but with a static name so it's safe to move around
// CDK gets mad ("Only one resolver is allowed per field") if you give a new resolver a different name for the same field
// so this helps with that
// https://github.com/aws/aws-cdk/issues/13269
createResolver<TA, TR>({ fieldName, typeName, dataSource, ...rest }: CreateResolver<TA, TR>) {
const nameBase = `${upperFirst(typeName)}${upperFirst(fieldName)}`
// create resolver from datasource
const resolver = dataSource.createResolver({
typeName,
fieldName,
...rest,
})
// graphQL resolver CFN - make name static
const resolverCfn = resolver.node.defaultChild as CfnResolver
resolverCfn.overrideLogicalId(`${nameBase}Resolver`) // don't rename if moved around
return resolver
}
} This is a real problem though that makes AppSync and CDK extremely perilous to use. |
The CDK path changes as you move it around therefore resulting in a new logical id. "Updates in place" work if the logical id is the same. Overriding the logical id is the only way to really get around this. My previous suggestion to attach it to the api and not the datasource "solves" the issue because the path doesn't change. The appsync graphq api construct names resolvers with Another option is to do this dynamically. You can build an aspect and attach it to the api for example: import { CfnResolver } from '@aws-cdk/aws-appsync'
import { IConstruct } from '@aws-cdk/core'
interface IAspect {
visit: (node: IConstruct) => void
}
class EnforceAppSyncResolverNaming implements IAspect {
visit (node: IConstruct): void {
if (node instanceof CfnResolver) node.overrideLogicalId(`${node.typeName}${node.fieldName}Resolver`)
}
}
export { EnforceAppSyncResolverNaming } export default class GraphqlApiStack extends Stack {
api: GraphqlApi
constructor (scope: App, id: string, props: GraphqlStackProps) {
super(scope, id, props)
this.api = new GraphqlApi(this, 'GraphqlApi', {
name: 'GraphqlApi',
schema: Schema.fromAsset('graphql/schema.graphql'),
authorizationConfig: {
defaultAuthorization: {
authorizationType: AuthorizationType.API_KEY
}
}
})
Aspects.of(this.api).add(new EnforceAppSyncResolverNaming())
}
} |
Hey, the suggested solution from @mmccall10 and @revmischa are only a solution to keep the the logicalIds stable, but not not helpful if you encounter the issue in a production setup, since adding the Aspect will change the logicalId, which will result in the Did anyone find solution that does not require manually detaching the resolvers before running CloudFormation with the new resolver names? |
This can be a confusing land mine for newbies to step on. Here's an example from when this issue cost me many frustrated hours: https://repost.aws/questions/QUmiAqh543TR2MnU40ucFx1g/possible-to-override-default-graph-ql-model-resolvers-with-lambda-function-resolvers Could I please suggest a more informative error message when this happens? That message is pithy and direct and meaningful and ... useless in this scenario. Especially for people who are not (yet?) AppSync experts. It's not ideal that people need to understand this much mental model for how the nuts and bolts in the cloud affect their workflows. But it's even worse if the error is baffling. I have a high pain tolerance for this kind of thing but most people would have given up on this and used a less-ideal GraphQL schema as a workaround. Just a usability suggestion from a customer. Thank you for a fantastic, game-changing product. |
Has anyone come up with a better solution for this? I can't change to using |
For now, I think the best solution here on the CDK side is to remove the |
Fixes an issue where users couldn't control the ID's of appsync resolvers and functions which would cause them to be replaced whenever the data source of the resolver or function was replaced. For resolvers, this could cause a deployment error as cloudformation would attempt to create the new instance of the resolver before deleting the old one throwing 'only one resolver per field' error. Removing `dataSource.createFunction` and `dataSource.createResolver` in favor of the same methods on the `GraphQlApi` construct makes it clear that the parent scope of resolvers and functions is an API and not a data source and therefore wont be replaced when changing data sources. BREAKING CHANGE: removes `createFunction` and `createResolver` from `IDataSource`. Fixes: #13269
Fixes an issue that would cause unexpected resource replacement for appsync resolvers and functions because of construct nesting and ID generation. Changes `createResolver` and `createFunction` methods on `GraphQlApi` and `DataSource` constructs to require explicitly passing an ID. Additionally changes the scope of the constructs created in `createResolver` and `createFunction` on the `DataSource` construct to be `this.api` instead of `this`. This allows users to change the data sources of resolvers and functions while keeping the IDs stable and avoiding resource replacement. This helps to avoid the `only one resolver per field` error that occurs when deleting a resolver on a field, and adding a new one within the same deployment. BREAKING CHANGE: `DataSource.createResolver`, `DataSource.createFunction`, and `GraphQlApi.createResolver` now require 2 arguments instead of 1. Fixes: #13269
fix(appsync): unstable IDs on resolvers and functions Fixes an issue that would cause unexpected resource replacement for appsync resolvers and functions because of construct nesting and ID generation. Changes `createResolver` and `createFunction` methods on `GraphQlApi` and `DataSource` constructs to require explicitly passing an ID. Additionally changes the scope of the constructs created in `createResolver` and `createFunction` on the `DataSource` construct to be `this.api` instead of `this`. This allows users to change the data sources of resolvers and functions while keeping the IDs stable and avoiding resource replacement. This helps to avoid the `only one resolver per field` error that occurs when deleting a resolver on a field, and adding a new one within the same deployment. BREAKING CHANGE: `DataSource.createResolver`, `DataSource.createFunction`, and `GraphQlApi.createResolver` now require 2 arguments instead of 1. Fixes: #13269 ---- ### All Submissions: * [x] Have you followed the guidelines in our [Contributing guide?](https://github.com/aws/aws-cdk/blob/main/CONTRIBUTING.md) ### Adding new Construct Runtime Dependencies: * [ ] This PR adds new construct runtime dependencies following the process described [here](https://github.com/aws/aws-cdk/blob/main/CONTRIBUTING.md/#adding-construct-runtime-dependencies) ### New Features * [ ] Have you added the new feature to an [integration test](https://github.com/aws/aws-cdk/blob/main/INTEGRATION_TESTS.md)? * [ ] Did you use `yarn integ` to deploy the infrastructure and generate the snapshot (i.e. `yarn integ` without `--dry-run`)? *By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license*
|
Instead of removing these methods, we changed the parenting of the generated constructs to use the api instead of the data source and also require explicitly passing the ID so it cannot change unexpectedly from underneath you. If you're still running into this issue please let us know as I'd like to discuss additional solutions if needed. |
@corymhall @MrArnoldPalmer For what its worth, the fix in #23322 makes the upgrade path of this change is seemingly quite difficult and might need some TLC for users with existing stacks. Say if you previously had this resolver:
This now has to become:
The diff for this now looks like:
CloudFormation then processes the
There is also no seemingly no way to re-use the previously used ID. |
I encountered this exact behavior when trying to upgrade to v2.55.0 : I added IDs to all our resolvers and CloudFormation deployment failed with errors I had to revert the CDK update as we have a stack in Production using the previous resolvers (which were created without an ID) |
You can work around this by specifying the same IDs that were being computed before the change (as you are doing @neilferreira) but instead of using You can also use |
It now works using the workaround :
Thank you @MrArnoldPalmer :) |
fix(appsync): unstable IDs on resolvers and functions Fixes an issue that would cause unexpected resource replacement for appsync resolvers and functions because of construct nesting and ID generation. Changes `createResolver` and `createFunction` methods on `GraphQlApi` and `DataSource` constructs to require explicitly passing an ID. Additionally changes the scope of the constructs created in `createResolver` and `createFunction` on the `DataSource` construct to be `this.api` instead of `this`. This allows users to change the data sources of resolvers and functions while keeping the IDs stable and avoiding resource replacement. This helps to avoid the `only one resolver per field` error that occurs when deleting a resolver on a field, and adding a new one within the same deployment. BREAKING CHANGE: `DataSource.createResolver`, `DataSource.createFunction`, and `GraphQlApi.createResolver` now require 2 arguments instead of 1. Fixes: aws#13269 ---- ### All Submissions: * [x] Have you followed the guidelines in our [Contributing guide?](https://github.com/aws/aws-cdk/blob/main/CONTRIBUTING.md) ### Adding new Construct Runtime Dependencies: * [ ] This PR adds new construct runtime dependencies following the process described [here](https://github.com/aws/aws-cdk/blob/main/CONTRIBUTING.md/#adding-construct-runtime-dependencies) ### New Features * [ ] Have you added the new feature to an [integration test](https://github.com/aws/aws-cdk/blob/main/INTEGRATION_TESTS.md)? * [ ] Did you use `yarn integ` to deploy the infrastructure and generate the snapshot (i.e. `yarn integ` without `--dry-run`)? *By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license*
fix(appsync): unstable IDs on resolvers and functions Fixes an issue that would cause unexpected resource replacement for appsync resolvers and functions because of construct nesting and ID generation. Changes `createResolver` and `createFunction` methods on `GraphQlApi` and `DataSource` constructs to require explicitly passing an ID. Additionally changes the scope of the constructs created in `createResolver` and `createFunction` on the `DataSource` construct to be `this.api` instead of `this`. This allows users to change the data sources of resolvers and functions while keeping the IDs stable and avoiding resource replacement. This helps to avoid the `only one resolver per field` error that occurs when deleting a resolver on a field, and adding a new one within the same deployment. BREAKING CHANGE: `DataSource.createResolver`, `DataSource.createFunction`, and `GraphQlApi.createResolver` now require 2 arguments instead of 1. Fixes: aws#13269 ---- ### All Submissions: * [x] Have you followed the guidelines in our [Contributing guide?](https://github.com/aws/aws-cdk/blob/main/CONTRIBUTING.md) ### Adding new Construct Runtime Dependencies: * [ ] This PR adds new construct runtime dependencies following the process described [here](https://github.com/aws/aws-cdk/blob/main/CONTRIBUTING.md/#adding-construct-runtime-dependencies) ### New Features * [ ] Have you added the new feature to an [integration test](https://github.com/aws/aws-cdk/blob/main/INTEGRATION_TESTS.md)? * [ ] Did you use `yarn integ` to deploy the infrastructure and generate the snapshot (i.e. `yarn integ` without `--dry-run`)? *By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license*
👋 we are experiencing this issue for our customer use-case which can be described as below: However, during the deployment it fails with I have attached the CFN logs below: From the logs for original "Test" stack (pic2), we see that it's waiting in phase "update_complete_cleanup_in_progress" where I would assume the resolvers to be soft deleted and detached. Then it wouldn't cause the subsequent new stack to fail with the error mentioned above. Please let me know if you need any other information. |
@phani-srikar workarounds posted here don't work when moving resolvers to a new stack. A depends on relationship between stacks doesn't do anything to tell the service that some resources from one need to be deleted before resources from the other can be deployed, it just means that for one (CustomTestStack) to be deployed, the other (TestStack) must also be, which it already is. So there is no way to declare a dependency on a resource from a stack being deleted before another is created. For your case I would suggest deploying the changes to the original stack first, deleting the old resolvers, then deploying the new stack creating the new ones. I'm investigating alternatives with the service team in the meantime, but right now this isn't something that cloudformation supports. |
Hi @MrArnoldPalmer. Thanks for your quick response and confirming the behavior.
Given these pros and cons, would be really useful if the service can handle detaching the resolvers once they are soft deleted and waiting for cleanup. |
There seem to be issues when that come up during appsync deployment when resolvers for specific fields are replaced. If you change the uniqueID of a resolver, this will cause a CFN deployment where the old resolver is removed, and the new one is created. Since only one resolver can exist on a field, this will cause the deployment to fail with
Only one resolver is allowed per field
.Alternatively, this can also result in 'detached' resolvers, which are not actually triggered on their corresponding queries.
This may require resolution on the cloudformation or appsync side since the changes being submitted by the CDK are valid and replacing existing resolvers is allowed, though obviously could incur some downtime.
See: #13238 and #12635 for relevant details.
The text was updated successfully, but these errors were encountered: