Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

S3 message spillover #45

Merged
merged 6 commits into from
Sep 5, 2024
Merged

S3 message spillover #45

merged 6 commits into from
Sep 5, 2024

Conversation

alukach
Copy link
Collaborator

@alukach alukach commented Sep 4, 2024

As described in #37 (comment), I recommend that we try a custom middleware to automatically push all responses greater than 6MBs to AWS S3 and utilize a presigned URL + 303 redirect to instruct clients to retrieve the results from that location, thereby side-stepping the AWS Lambda Response limit. A 303 (rather than a 302 or 307) is recommended as it informs clients that a GET request should be used to follow the redirect, even when the original request was a POST (docs):

This response code is often sent back as a result of PUT or POST methods so the client may retrieve a confirmation, or view a representation of a real-world object (see HTTP range-14). The method to retrieve the redirected resource is always GET.

The S3 bucket should auto-flush these messages after 1 day.

@alukach alukach force-pushed the feature/s3-message-spillover branch 4 times, most recently from 1061a23 to 1dcfcf2 Compare September 4, 2024 05:18
@alukach alukach force-pushed the feature/s3-message-spillover branch from 1dcfcf2 to d08b6c7 Compare September 4, 2024 05:20
@alukach alukach changed the title Feature/s3 message spillover S3 message spillover Sep 4, 2024
Copy link
Collaborator

@zacdezgeo zacdezgeo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alukach, the approach with the s3 redirect is pretty seamless (once we figured out the encoding error). The user experience stays exactly the same, and performance remains relatively constant from my quick and dirty benchmarking:

Time for 0 fields: 1.8640 seconds
Time for 1 fields: 1.2667 seconds
Time for 2 fields: 1.3008 seconds
Time for 3 fields: 1.4476 seconds
[...]
Time for 35 fields: 2.7537 seconds
Time for 36 fields: 2.8424 seconds
Time for 37 fields: 2.9549 seconds
Time for 38 fields: 5.3728 seconds

and now we can return as much data a user requests!

Feel free to re-request my review once the patch is done on the middleware.

space2stats_api/src/tests/test_api.py Show resolved Hide resolved
@alukach alukach force-pushed the feature/s3-message-spillover branch from 3021dc4 to 17722ec Compare September 4, 2024 20:23
@zacdezgeo zacdezgeo self-requested a review September 5, 2024 16:37
Copy link
Collaborator

@zacdezgeo zacdezgeo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So awesome! Thanks @alukach.

@zacdezgeo
Copy link
Collaborator

Will try deploying, and I'll merge if everything goes well.

@zacdezgeo
Copy link
Collaborator

Deployment succeeded. I just picked up on the bucket creation in cdk/aws_stack.py. Are we good to merge @alukach ?

@alukach alukach merged commit 021f960 into main Sep 5, 2024
2 checks passed
@alukach alukach deleted the feature/s3-message-spillover branch September 5, 2024 18:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants