Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC - Static Generation / SSG Improvements #355

Closed
danielcondemarin opened this issue Apr 11, 2020 · 54 comments
Closed

RFC - Static Generation / SSG Improvements #355

danielcondemarin opened this issue Apr 11, 2020 · 54 comments

Comments

@danielcondemarin
Copy link
Contributor

danielcondemarin commented Apr 11, 2020

Problem

Next.js recently released a new API designed to improve Static Site Generation. More details can be found in vercel/next.js#9524.

serverless-next.js is not currently compatible with the new Next.js API and this RFC proposes changes to support these features.

Proposal

getStaticProps.

For pages that declares getStaticProps, Next.js generates a JSON file that holds the return value of executing getStaticProps. The file is used for client-side routing.

All JSON files for a given build can be found in .next/prerender-manifest.json.
serverless-next.js will lookup the files and upload them to S3 so they can be fetch-ed from the browser.
Also, a new cache behaviour will be created in CloudFront for the path pattern /_next/data/*. This will match any requests made to the JSON files and forward them to S3.

getStaticPaths

When you have a dynamic route you can add a method called getStaticPaths to the page. The way it works is Next.js generates a fallback HTML page that can be served when the route has not been pre-rendered on a previous request.

If the path is returned in getStaticPaths then is pre-rendered at build time. If is not pre-rendered at build time then it needs to be compiled server side in the background the first time the route is visited.

Here is a diagram to illustrate how this will work in serverless-next.js (fallback: true):

Serverless SSG Improvements - GetStaticPaths Fallback_ true

You can also set fallback: false. In this case a 404 is returned instead of pre-rendering:

Serverless SSG Improvements - GetStaticPaths Fallback_ false

getServerProps

getServerProps behaves similarly to getInitialProps. The difference is that on client side transitions, Next.js makes an api request to the server to run getServerProps and uses that to render the page client side.
In serverless-next.js this will be handled in the origin-request edge function as illustrated below:

Serverless SSG Improvements - GetServerSideProps

Preview mode

Preview mode enables you to preview the "draft" of a page at request time rather than build time. Next.js achieves this by always calling getStaticProps on the page at request time whenever preview mode is enabled. Details of how to enable preview mode and use it can be found in https://nextjs.org/docs/advanced-features/preview-mode.

In serverless-next.js this will be handled in the origin-request edge function as illustrated below:

Serverless SSG Improvements - Preview Mode

@janus-reith
Copy link

janus-reith commented Apr 11, 2020

So in step 2, if not cached in cloudfront yet, it would always call the lambda and not S3 directy, right? How would the static fallback page be returned in that step, while the actual page is still rendered?

Also: Could it be possible to avoid calling the lamba, and just utilize that logic directly in cloudfront somehow, maybe by defining behaviours per page? So that Cloudfront could directly make calls to S3, and if a page is not present, serve the fallback page for that route, and still trigger that lambda for the page creation. Or if triggering the lamba is not possible, still do the initial lookup from S3, and if not present, call the lambda and make that serve the static fallback page while still creating the actual page. Or do the opposite, and serve the static fallback page from S3 and make that page trigger the lambda in its client side while it's in fallback mode and awaiting the page.

Although the overhead might be negligible as most things will be cached in Cloudfront, I was under the impression that the architecture would allow to always serve any request for a static page directly from file, without needing to call a function first, including the fallback and revalidate pages.

@danielcondemarin
Copy link
Contributor Author

danielcondemarin commented Apr 11, 2020

Hi @janus-reith Thanks for the questions! Made me realise a few things I didn't consider before.

Let me start with this:

Could it be possible to avoid calling the lamba, and just utilize that logic directly in cloudfront somehow, maybe by defining behaviours per page?

It is possible but wouldn't scale. CloudFront has a limit of 25 cache behaviours per distribution. It wouldn't work for apps with more than 25 page routes which is a fairly common use case.

So in step 2, if not cached in cloudfront yet, it would always call the lambda and not S3 directy, right? How would the static fallback page be returned in that step, while the actual page is still rendered?

Yes you are correct, it would call the lambda. Also, this question made me realise something I had missed. I assumed by setting callbackWaitsForEmptyEventLoop = false I could make Lambda@Edge forward the request onto S3 for the fallback whilst in the background generating the actual page. However from reading the docs AWS wouldn't actually run anything in the background. It freezes the execution context of the Lambda and up to their discretion it may resume in the next invocation any outstanding events.
There are other ways to approach this like using some sort of queue / polling mechanism (e.g. SQS) but I want to avoid adding more complexity to the architecture if possible.

I'll keep looking for solutions, any ideas are appreciated 🙏

@janus-reith
Copy link

janus-reith commented Apr 11, 2020

Thanks for the explanation, I didn't know about that Cloudfront limit.

However from reading the docs AWS wouldn't actually run anything in the background. It freezes the execution context of the Lambda and up to their discretion it may resume in the next invocation any outstanding events.

Could my third suggestion work for this maybe?
Serve the static fallback page from S3 and make that page trigger the lambda on the client side while it's in fallback mode and awaiting the page.
I did'nt look it up exactly, but when the client recieves the static fallback page for a route, it does some call to the server and waits for a response when the actual page finished rendering. So that call could be trigger for the page creation I guess?
Im not sure if that actually is a good approach, because when the lambda is hit initially and could directly start creating the page that would be finished faster than to wait until the client recieved the fallback page and then sends a request that triggers the render, but that could be simpler to implement.

One could probably verify easily if default next start or Now deployments are doing that aswell by fetching a fallback page first time with disabled js, so if that client side call triggers the creation it wouldn't execute then. Then retry that request and check the output.
Update: Just tried that out on a Now deployment that has fallback enabled, and it does not seem to be the case, the pages were available on second request without a clientside fetch involved.

@danielcondemarin
Copy link
Contributor Author

@janus-reith I may have a come up with a solution similar to what you suggested.

Serverless SSG Improvements

Let me know what you think 🙏

A few notes on this approach:

  • It doesn't require updating the pre-render manifest whenever a new page is compiled.
  • Optimises for the most common use case. origin-request edge function assumes the page has been compiled already and 404s are handled on origin-response.
  • Latency when serving the Fallback page should be minimal -> single read roundtrip to S3.

cc. @oseibonsu

@janus-reith
Copy link

@danielcondemarin Im not sure about that last part where you get /blog/hello, it seems redundant if you already fetched the json.
I just checked that on a fallback page, and I can't see another document being fetched there either.
Therefore I believe that render would just execute clientside with the fetched json.

@danielcondemarin
Copy link
Contributor Author

danielcondemarin commented Apr 12, 2020

@danielcondemarin Im not sure about that last part where you get /blog/hello, it seems redundant if you already fetched the json.

That last part is just to illustrate what would happen if there is a second request after the actual page and JSON props file has been generated in step 2.
Also, there is a distinction between fetching via client side or doing a page load / refresh.
When client side, the JSON file is fetched like you said, but if you do a page refresh then the HTML document is served and no JSON file is requested.

@janus-reith
Copy link

Ah ok, yes I agree then, it wasn't clear for me in in the diagram - Maybe separate that third step from the other two which are connected? :)

@danielcondemarin
Copy link
Contributor Author

Ah ok, yes I agree then, it wasn't clear for me in in the diagram - Maybe separate that third step from the other two which are connected? :)

I've updated the RFC with an updated diagram that provides description for each step so is more clear 👍

@HaNdTriX
Copy link

HaNdTriX commented Apr 22, 2020

Afaik Next.js relies heavily on RFC5861 for serving stale content and revalidating pages in the background.

https://github.com/zeit/next.js/blob/18036d4e5198b6375a849c584c8b5a822ee41952/packages/next/next-server/server/send-payload.ts#L31-L47

Currently this extension is not supported by Cloudfront. Nevertheless it definetly would make sense to follow the RFC as close as possible.

@janus-reith
Copy link

@HaNdTriX I guess this would make it possible to skip the lamba and serve directly from S3 instead if available?

@danielcondemarin
Copy link
Contributor Author

Afaik Next.js relies heavily on RFC5861 for serving stale content and revalidating pages in the background.

https://github.com/zeit/next.js/blob/18036d4e5198b6375a849c584c8b5a822ee41952/packages/next/next-server/server/send-payload.ts#L31-L47

Currently this extension is not supported by Cloudfront. Nevertheless it definetly would make sense to follow the RFC as close as possible.

@HaNdTriX I will be covering vercel/next.js#11552 on a separate RFC. Whilst CloudFront doesn't support stale-while-revalidate I suspect we can achieve something pretty similar using the Lambda@Edge origin-request and origin-response hooks.

@danielcondemarin
Copy link
Contributor Author

danielcondemarin commented May 14, 2020

Hi folks, I've just merged to master support for getStaticPaths (with fallback: false) and possibly also support for getServerSideProps although I haven't tested the latter.
Please let me know if someone could do a bit of testing on this before I release as stable.

Thanks to @mertyildiran for doing all the hard work!

@Negan1911
Copy link
Contributor

Hey @danielcondemarin I could test it in a couple of projects, is this pushed into an alpha release on npm? Or should I just pull from github

@danielcondemarin
Copy link
Contributor Author

Hey @danielcondemarin I could test it in a couple of projects, is this pushed into an alpha release on npm? Or should I just pull from github

Thanks a lot @Negan1911 . I'll get it deployed to an alpha release today!

@marcfielding1
Copy link

marcfielding1 commented May 15, 2020

@danielcondemarin Are you free for a quick chat about this, please find my LI here: https://www.linkedin.com/in/marc-fielding-a8bb3293/

I mentioned you guys in GCS connect talk I was doing yesterday, I really, really want us to commit to using this since it'll change the way we do eCommerce.

FYI We can test this across almost 600 pages that require these features.

@danielcondemarin
Copy link
Contributor Author

Hey @danielcondemarin I could test it in a couple of projects, is this pushed into an alpha release on npm? Or should I just pull from github

Thanks a lot @Negan1911 . I'll get it deployed to an alpha release today!

@Negan1911 and anyone else who can test I've now published serverless-next.js@1.12.0-alpha.2. Also added to our examples examples/blog-starter 🎉 straight from next's repo and works like a charm.

@Negan1911
Copy link
Contributor

Negan1911 commented May 16, 2020

@danielcondemarin Only have to say something: Works incredibly good, seriously! I'm on the early steps of building my own startup with a cofounder, and our UI deployments are based on this library, as soon as we get a better budget that doesn't depend on my monthly salary I would definitively come back and donate. Thanks for maintaining this library!

@SarKurd
Copy link
Contributor

SarKurd commented May 16, 2020

@danielcondemarin tested both examples/blog-starter and examples/with-loading(getServerSideProps), both working great.
Thanks for implementing this

@lone-cloud
Copy link
Contributor

lone-cloud commented May 16, 2020

I just redeployed a newish project with the new alpha and I see some improvements, but the dynamic routes don't seem to be being pre-fetched correctly. My assumption was that it's supposed to be working with the alpha.

To be more specific we have dynamic paths for "products". The alpha fixed the fetching of "/_next/data/hash/products.json" which was failing in 1.11.5, but the pre-fetching is failing for paths like "/_next/static/hash/pages/products/product slug.js". It looks like "/_next/static/runtime/main-hash.js" is trying to pre-fetch them. It is currently a mystery to me where these files are supposed to come from as I don't see them generated on next build.
Note that I am using the new "unstable_revalidate" flag here and returning the product from getStaticProps like { props: { product, unstable_revalidate: 1 };.

n/m, I wasn't link to dynamic routes correctly :( That part works perfectly.

@marcfielding1
Copy link

@Negan1911 Heh us two, I don't think I've been this excited about something since the first release of Doom.

@vdev9
Copy link

vdev9 commented May 20, 2020

@danielcondemarin when will you merge serverless-next.js@1.2.0-alpha.2 !!

@medv
Copy link

medv commented May 21, 2020

Works well for us with next@latest and serverless-next.js@1.12-alpha.5

However navigating to a page using getServerSideProps() causes a 403 from S3 on data/HASH/serversidepropspage.json during routing (doesn't exist), which gets caught by next.js and causes a hard window refresh to the new route. This is tenable for now, but of course app-wide state is not possible to maintain without writing it to localStorage, as most of our pages use getServerSideProps.

@danielcondemarin
Copy link
Contributor Author

Has step 1 been done already?

Nope. Only the origin-request is attached. You'd need to add another entry for origin-response to the same lambda fn.

I'm thinking of only supporting the latest manifest types. Objections?

No objections, it simplifies things. We can document which version of next is required for it to work.

https://nextjs.org/blog/next-9-4#incremental-static-regeneration-beta

That one will be tricky to implement. We could discuss it separately after this RFC is completed.

@andrewgadziksonos
Copy link
Collaborator

Hello @lone-cloud and @danielcondemarin!

I would like to offer my support to this repo and help contribute to its code base. Wasn't sure what the best way was to contact you both so please forgive me if this wasn't the proper avenue. I've read the contribute readme and I'm fully setup and ready to go. Let me know what I could tackle, or help tackle.

Thanks,

Andrew

@danielcondemarin
Copy link
Contributor Author

Hello @lone-cloud and @danielcondemarin!

I would like to offer my support to this repo and help contribute to its code base. Wasn't sure what the best way was to contact you both so please forgive me if this wasn't the proper avenue. I've read the contribute readme and I'm fully setup and ready to go. Let me know what I could tackle, or help tackle.

Thanks,

Andrew

@andrewgadziksonos Appreciate the help, thanks! Could you send me a DM on twitter and I'll add you to our contributors channel.

@andrewgadziksonos
Copy link
Collaborator

Hello @lone-cloud and @danielcondemarin!
I would like to offer my support to this repo and help contribute to its code base. Wasn't sure what the best way was to contact you both so please forgive me if this wasn't the proper avenue. I've read the contribute readme and I'm fully setup and ready to go. Let me know what I could tackle, or help tackle.
Thanks,
Andrew

@andrewgadziksonos Appreciate the help, thanks! Could you send me a DM on twitter and I'll add you to our contributors channel.

@danielcondemarin It says you can't be messaged directly. Here's my twitter

@nodabladam
Copy link

I am testing this out and noticed that the https://CLOUDFRONT-ID.cloudfront.net/_next/data/BUILD-ID/page.json doesn't have cache headers so each user seems to get x-cache: Miss from cloudfront. Could/should the files served from _next/data have some header like "cache-control": "public,max-age=30672000,immutable" set on them so it serves a bit faster?

@thchia
Copy link
Collaborator

thchia commented Aug 9, 2020

@danielcondemarin is there progress on supporting the fallback: true use case? If not I don't mind attempting this. I would like to clarify a few things though:

  • Logic to respond to a request for /blog/hello with /blog/[slug].html has not been implemented yet right?
    • I believe the most efficient way to do this in when handling the HTML pages/public files? Currently the logic checks for public files, static HTML pages and pre-rendered HTML pages. If the requested path is none of these, we can do one more check to see if there is a fallback HTML page before moving on.
  • For handling the JSON request (which originates from the fallback page), I'm a little confused. In the SSR implementation, the lambda takes JSON requests from client side transitions and builds the response without ever going to the origin. In the proposed solution for fallback pages, it seems you recommend to undo this, send the request directly to the origin, and in the origin-response check the status:
    • 404: generate the JSON and HTML, return the JSON and upload the HTML into the static-pages directory in the origin. What if this was a JSON request that originated from a client side transition to a SSR page? We don't want to upload that HTML.
    • any other: pass on the data (this dynamic fallback route has been visited before, or we are dealing with a SSG page)

@danielcondemarin
Copy link
Contributor Author

danielcondemarin commented Aug 9, 2020

@danielcondemarin is there progress on supporting the fallback: true use case?

I've not had time to look into the fallback: true implementation. Happy for you to PR and I'll help where I can 👍

  • Logic to respond to a request for /blog/hello with /blog/[slug].html has not been implemented yet right?

Pre-rendered dynamic routes should work fine. Are you seeing any issues?

  • I believe the most efficient way to do this in when handling the HTML pages/public files? Currently the logic checks for public files, static HTML pages and pre-rendered HTML pages. If the requested path is none of these, we can do one more check to see if there is a fallback HTML page before moving on.

To check if there is a fallback page from the origin-request handler you'd need a roundtrip to S3 as the lambda function doesn't have any fallback HTML pages in the deployment artefact. The flow I proposed is not much different, only that it does the roundtrip on origin-response only if the route wasn't pre-rendered already (hence the 404 check).

What if this was a JSON request that originated from a client side transition to a SSR page? We don't want to upload that HTML.

If is a client side transition to a SSR page (aka getServerProps) we should ignore those. There is a way to differentiate between the 2 like next-server does.

@thchia
Copy link
Collaborator

thchia commented Aug 12, 2020

@danielcondemarin thanks for the feedback - I am working on this.

@danielcondemarin
Copy link
Contributor Author

danielcondemarin commented Aug 14, 2020

Props to @thchia for adding getStaticPaths fallback: true support and a bunch other improvements via #544 🙌
This is currently available on alpha @sls-next/serverless-component@1.17.0-alpha.5. Please try it and feedback 🙏

@thchia
Copy link
Collaborator

thchia commented Aug 25, 2020

@danielcondemarin I am thinking about Preview Mode (not sure I can commit to working on it yet). I believe that preview mode is indicated by the presence of certain cookies, meaning that in order to make sure we don't accidentally cache the preview response, we should be:

  1. Whitelisting said cookies in CloudFront (so we don't serve cached, non-preview data during a preview request);
  2. Setting a cache-control age of 0 on the generated responses (so the response is not cached - in preview mode we don't know how often the underlying data is changing, so we should not cache preview responses).

Edit: As I am playing around, having not done either of these things, the problems I expected are not appearing. I'll keep looking at it.

Edit 2: The cache behaviour for * forwards all cookies, explaining why we get cache misses when the preview cookies are present. This means I still expect to see the problems for the data requests (but have not tried yet). Also, regarding point 2, I realise it is Next generating the response when we render in the lambda, so I think this is already handled.

@danielcondemarin
Copy link
Contributor Author

@danielcondemarin I am thinking about Preview Mode (not sure I can commit to working on it yet).

I believe that preview mode is indicated by the presence of certain cookies

That's right. More specifically there are 2 cookies involved,

  • __prerender_bypass
  • __next_preview_data

I think the first one is simply to indicate the page should be SSR'ed instead of it being pre-rendered and cached at the edge locations.

The second cookie is used to validate the preview request was issued from someone with access to preview mode.

Whitelisting said cookies in CloudFront (so we don't serve cached, non-preview data during a preview request);
Setting a cache-control age of 0 on the generated responses (so the response is not cached - in preview mode we don't know how often the underlying data is changing, so we should not cache preview responses).

👍

@ezalorsara
Copy link

I hope unstable_revalidate will be supported soon.

@danielcondemarin
Copy link
Contributor Author

Hi folks, I've published @sls-next/serverless-component@1.17.0-alpha.13 which should have Preview Mode support. Please test and feedback if you find any issues. Thanks to @thchia for implementing!

@weplay-devservices
Copy link

weplay-devservices commented Oct 14, 2020

I assume nobody looked yet at fallback: 'unstable_blocking' as it is not yet stated in docs - but looks quite ok if I ask me.

@minlare
Copy link

minlare commented Oct 20, 2020

@HaNdTriX I will be covering vercel/next.js#11552 on a separate RFC. Whilst CloudFront doesn't support stale-while-revalidate I suspect we can achieve something pretty similar using the Lambda@Edge origin-request and origin-response hooks.

@danielcondemarin @dphang - do you have any update regarding implentation of revalidate for getStaticProps ?

@danielcondemarin
Copy link
Contributor Author

@danielcondemarin @dphang - do you have any update regarding implentation of revalidate for getStaticProps ?

Implementing revalidate is a tricky one I might have underestimated. In order to implement the stale-while-revalidate spec properly you have to generate a fresh copy of the page in the background whilst serving the stale copy in a non-blocking fashion. Doing that in Lambda is not possible due the way the lambda execution environment works. In other words, Lambda doesn't let you return a response to the client and keep on running a task on the background. The closest thing to running something in the background is to set callbackWaitsForEmptyEventLoop to false but that freezes outstanding events so not even that would work.
I need to take a closer look at Next.js implementation as I'm not sure if they conform strictly to the spec. Any ideas / suggestions please let me know 🙂

@faceshape
Copy link

faceshape commented Oct 21, 2020

@danielcondemarin not too familiar with the internals, would something like this make sense (very simplified):

  • when the user requests a page with revalidation, if there's a cached version on s3 return that and put a message in some kind of queue (aws sqs?) to regenerate/revalidate that page,
  • have a separate normal lambda fn that processes that queue and essentially regenerates the page and stores it on s3.

This essentially moves the background regeneration into a separate lambda.

@danielcondemarin
Copy link
Contributor Author

@faceshape That's a sensible idea I've actually mulled over before. However it has a few downsides, one is it requires another piece of infrastructure (SQS) to add. While it may not seem like a big deal, the more infrastructure surface we add to serverless-next.js, the slower builds get and more difficult it is to maintain the project. The second, strictly speaking even making a call to SQS to post a message, would be blocking, as in, we have to wait for the request to SQS to fulfil before returning the response to the client.

@janus-reith
Copy link

I wonder what I'm missing here - A separate lambda invocation makes sense, but why would SQS, or anything else on top be necessary? Couldn't the first Lambda just invoke the second lambda with InvocationType Event and not await its result?

@danielcondemarin
Copy link
Contributor Author

I wonder what I'm missing here - A separate lambda invocation makes sense, but why would SQS, or anything else on top be necessary? Couldn't the first Lambda just invoke the second lambda with InvocationType Event and not await its result?

You're right @janus-reith. That removes the need for SQS even though I think AWS still uses a queue behind the scenes for async invocations, but if is abstracting it away from us that's best!

My only remaining concern would be latency between regions, say that a user is in us-east-1 and hits one of the edge functions in that region, when invoking the other lambda for background page regeneration what latency are we expected to see? I realise that it won't need to wait for the regeneration lambda to process but the request itself to put the event in the queue what region is that happening and how do we know is not going to a queue somewhere in eu-*.
My initial guess would be the queue AWS uses behind the scenes would be colocated in the same region as the async lambda but I'm not sure of that.

@danielcondemarin
Copy link
Contributor Author

Since all of the features in the RFC have now been implemented will close this and we can open a separate issue for revalidate.

@Talhafayyaz11
Copy link

#1317

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests