-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
INCIDEN-839: Revert ESbuild and AWS SDK updates #268
Merged
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This effectively reverts #259 and #266 as the write-activity-log lambda is failing all requests after them. The update to ESbuild isn't actually a minor update - going to 0.22.0 breaks how packages are built. See their changelog (https://github.com/evanw/esbuild/blob/main/CHANGELOG.md) and related issues (evanw/esbuild#3819) for more details. We did already update to ESBuild 0.23.0 which should have solved the problem, but we're still seeing all invocations failing. Let's revert back to a known good config while we understand what's happen, but we're still seeing all invocations failing. Let's revert back to a known good config while we understand what's happened.
saralk
approved these changes
Jul 3, 2024
alex9smith
added a commit
that referenced
this pull request
Jul 3, 2024
I'm suspicious that there's a difference here compared to the last PR (#268) so I've deleted `package-lock.json` and regenerated it with `npm install`
alex9smith
added a commit
that referenced
this pull request
Jul 3, 2024
## Proposed changes <!-- Provide a summary of your changes in the title above --> <!-- Include the Jira ticket number in square brackets as prefix, eg `[OLH-XXXX]: PR Title` --> ### What changed Revert ESbuild version in actions and also regenerate the `package-lock.json` file since I'm suspicious that there's a difference now compared to the last PR. ### Why did it change #268 didn't fix the lambda
alex9smith
added a commit
that referenced
this pull request
Jul 3, 2024
The ESBuild reverts in #268 and #270 did fix the problem building the lambdas, but the write activity log lambda was still failing with timeouts. It turns out this was for a different reason - the input queue for this function was backed up with ~100k events. We have batch sizes of 10 events per invocation, but under normal operation the batches are much smaller than this. This lambda does a lot per event because it has to encrypt and write each to DynamoDB seperately. That meant once it was processing 10 events per invocation these were legitimately timing out because all the operations take longer than 5 seconds. This is the only lambda that does so much on each invocation. Increase the timeout to 30 seconds. I can see that batches of 10 events are taking ~9 seconds to process in production now, so this is comfortably above that.
1 task
alex9smith
added a commit
that referenced
this pull request
Jul 3, 2024
## Proposed changes <!-- Provide a summary of your changes in the title above --> <!-- Include the Jira ticket number in square brackets as prefix, eg `[OLH-XXXX]: PR Title` --> ### What changed Increase the timeout to 30 seconds for the `write-activity-log` lambda. ### Why did it change The ESBuild reverts in #268 and #270 did fix the problem building the lambdas, but the write activity log lambda was still failing with timeouts. It turns out this was for a different reason - the input queue for this function was backed up with ~100k events. We have batch sizes of 10 events per invocation, but under normal operation the batches are much smaller than this. This lambda does a lot per event because it has to encrypt and write each to DynamoDB seperately. That meant once it was processing 10 events per invocation these were legitimately timing out because all the operations take longer than 5 seconds. This is the only lambda that does so much on each invocation. I can see that batches of 10 events are taking ~9 seconds to process in production now, so this is comfortably above that. I've manually changed this in integration and production already and proved that it has solved the problem. ## Checklists <!-- Merging this PR deploys to production. Please answer accurately. --> ### Environment variables or secrets - [x] No environment variables or secrets were added or changed
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Proposed changes
What changed
Revert ESbuild and AWS SDK updates. This effectively reverts #259 and #266 as the write-activity-log lambda is failing all requests after them.
Why did it change
The update to ESbuild isn't actually a minor update - going to 0.22.0 breaks how packages are built. See their changelog (https://github.com/evanw/esbuild/blob/main/CHANGELOG.md) and related issues (evanw/esbuild#3819) for more details.
We did already update to ESBuild 0.23.0 which should have solved the problem, but we're still seeing all invocations failing. Let's revert back to a known good config while we understand what's happened.
Related links
Checklists
Environment variables or secrets
How to review
Please check the version numbers here against the ones in #259