getStaticPaths re-imports node modules preventing caching remote data in memory #10933

Closed
smnh opened this issue Mar 10, 2020 · 13 comments

@smnh

smnh commented Mar 10, 2020

Bug report

Describe the bug

When running next dev and requesting a page that defines getStaticPaths, the node modules required by the page are re-imported, which prevents caching remote headless CMS data in memory.

The same happens when running next build: modules required by a page component that defines getStaticPaths are re-imported for every pre-rendered page, making it impossible to fetch all the remote data in a single API request and reuse it for all pre-rendered pages.

Important Note:
Headless CMS services may limit the number of API requests per month and may charge when this limit is exceeded. The development and build of statically generated sites should minimize usage of the headless CMS API and re-fetch the data only when it has changed. The caching and data invalidation logic might be implemented by CMS clients. Additionally, headless CMS services have endpoints to fetch all the data in a single request. Therefore, to decrease API usage and support caching, I think Next.js should allow importing modules that cache their data in memory, and should not re-import them every time a page is pre-rendered, whether running the dev server or building the site.

To Reproduce

Create a simple page pages/[...slug].js:

import React from 'react';
import pageLayouts from '../layouts';
import cmsClient from '../ssg/cms-client';

class Page extends React.Component {
    render() {
        // every page can have different layout, pick the layout based
        // on the model of the page (_type in Sanity CMS)
        const PageLayout = pageLayouts[this.props.page._type];
        return <PageLayout {...this.props}/>;
    }
}

export async function getStaticPaths() {
    console.log('Page [...slug].js getStaticPaths');
    const paths = await cmsClient.getStaticPaths();
    return { paths, fallback: false };
}

export async function getStaticProps({ params }) {
    console.log('Page [...slug].js getStaticProps, params: ', params);
    const pagePath = '/' + params.slug.join('/');
    const props = await cmsClient.getStaticPropsForPageAtPath(pagePath);
    // If not using JSON.parse(JSON.stringify(props)), next.js throws following error when running "next build"
    // Error occurred prerendering page "/blog/design-team-collaborates". Read more: https://err.sh/next.js/prerender-error:
    // Error: Error serializing `.posts[4]` returned from `getStaticProps` in "/[...slug]".
    // Reason: Circular references cannot be expressed in JSON.
    return { props: JSON.parse(JSON.stringify(props)) };
}

export default Page;

Implement a simple singleton CMS client that fetches CMS data and caches it in memory:

class CMSClient {

    constructor() {
        console.log('CMSClient constructor');
        this.data = null;
    }

    async getData() {
        if (this.data) {
            console.log('CMSClient getData, has cached data, return it');
            return this.data;
        }
        console.log('CMSClient getData, has no cached data, fetch data from CMS');
        this.data = await this.fetchDataFromCMS();
        return this.data;
    }

    async getStaticPaths() {
        console.log('CMSClient getStaticPaths');
        const data = await this.getData();
        return this.getPathsFromCMSData(data);
    }

    async getStaticPropsForPageAtPath(pagePath) {
        console.log('CMSClient getStaticPropsForPath');
        const data = await this.getData();
        return this.getPropsFromCMSDataForPagePath(data, pagePath);
    }

    async fetchDataFromCMS() { ... }
    getPathsFromCMSData(data) { ... }
    getPropsFromCMSDataForPagePath(data, pagePath) { ... }
}

module.exports = new CMSClient();

Navigate to any page rendered by [...slug].js, for example /about.
The following logs will be printed on the server:

CMSClient constructor
Page [...slug].js getStaticPaths
CMSClient getStaticPaths
CMSClient getData, has no cached data, fetch data from CMS
Page [...slug].js getStaticProps, params:  { slug: [ 'about' ] }
CMSClient getStaticPropsForPath
CMSClient getData, has cached data, return it
  • The constructor is invoked. Assuming the [...slug].js module was loaded for the first time, this is OK.
  • [...slug].js calls getStaticPaths. OK according to the docs: "Runs on every request in development".
  • getStaticPaths of the CMS client is invoked. It has no cached data because the client was just constructed, so getData fetches the data for the first time. OK.
  • [...slug].js calls getStaticProps. OK according to the docs: "Runs on every request in development".
  • getStaticPropsForPath of the CMS client is invoked. It already has cached data, so getData returns early with the cached data. OK.

Refresh the page or click a link <Link href="/[...slug]" as="/about"><a>About</a></Link>.
The following logs will be printed on the server:

CMSClient constructor
Page [...slug].js getStaticPaths
CMSClient getStaticPaths
CMSClient getData, has no cached data, fetch data from CMS
Page [...slug].js getStaticProps, params:  { slug: [ 'about' ] }
CMSClient getStaticPropsForPath
CMSClient getData, has cached data, return it

As can be seen, the CMS client is constructed again every time a page is requested (even though it uses the same page module), and the same steps for fetching and caching the data are repeated. This behavior suggests that when a page is requested and getStaticPaths is called, all modules are re-imported.

Note: When using getStaticProps without getStaticPaths, the client is not constructed on every request and therefore cached data is used as expected. See link to demo repository below.
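
For comparison, here is a minimal sketch of how a module-level cache behaves in a single plain Node.js process, and what a re-import does to it. It does not reproduce how Next.js loads page modules. The temporary module file and its exported shape are made up for illustration, and evicting the entry from require.cache is only a way to simulate a re-import.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Write a throwaway module that mimics a singleton client with an in-memory cache.
// The file name and the exported object are illustrative only.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'cms-client-'));
const modulePath = path.join(dir, 'cms-client.js');
fs.writeFileSync(modulePath, 'module.exports = { data: null };\n');

// First import: the module body runs and the singleton is created.
const first = require(modulePath);
first.data = 'fetched from CMS';

// Second import in the same process: Node returns the cached module instance.
const second = require(modulePath);
const sameInstance = second === first;

// Simulate a re-import by evicting the module from the require cache.
delete require.cache[require.resolve(modulePath)];
const reimported = require(modulePath);
const cacheSurvived = reimported.data !== null;

console.log({ sameInstance, cacheSurvived });
```

The second require returns the same object, so data cached on it is still there. After the simulated re-import the module body runs again and the cached data is gone, which matches the repeated "CMSClient constructor" log above.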

Expected behavior

When running the next dev server (or next build), the modules imported by a page component should be imported only once and reused, to allow them to cache remote data in memory.

Page [...slug].js getStaticPaths
CMSClient getStaticPaths
CMSClient getData, has cached data, return it
Page [...slug].js getStaticProps, params:  { slug: [ 'about' ] }
CMSClient getStaticPropsForPath
CMSClient getData, has cached data, return it

System information

  • OS: macOS
  • Browser: Chrome
  • Version of Next.js: 9.3.0

Additional context

I've set up an example repository that reproduces this issue. It uses Sanity as the headless CMS. The README file has all the info needed to set up a Sanity account and import the initial data used by this example site.

https://github.com/stackbithq/azimuth-nextjs-sanity/tree/nextjs-ssg-api
(use nextjs-ssg-api branch)

Note, when loading the root page '/' (pages/index.js) which has only the getStaticProps method and does not have getStaticPaths, the CMS client is not constructed on every request and therefore data cached in memory of the CMS client module is used as expected.

@ijjk
Member

ijjk commented Mar 12, 2020

Hi, we run getStaticPaths and getStaticProps in separate workers to ensure we're prerendering pages as fast as possible. In development, we make sure this separation is honored by also running getStaticPaths in a separate worker, so that you don't see differing behavior between development and a production build.
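
The effect of that separation on an in-memory cache can be sketched with two plain Node.js child processes. This is an illustration only: it assumes nothing about how Next.js actually creates its workers, and workerSource is a made-up stand-in for a module with a module-level cache.

```javascript
const { spawnSync } = require('child_process');

// Code that each simulated "worker" runs: a module-level cache plus two lookups.
const workerSource = `
  let cache = null;
  function getData() {
    if (cache) return 'cached';
    cache = { pages: [] }; // stands in for data fetched from a CMS
    return 'fetched';
  }
  console.log(getData() + ',' + getData());
`;

// Run the same code in a fresh Node.js process and capture what it printed.
function runWorker() {
  const result = spawnSync(process.execPath, ['-e', workerSource], { encoding: 'utf8' });
  return result.stdout.trim();
}

const firstWorker = runWorker();
const secondWorker = runWorker();
console.log({ firstWorker, secondWorker });
```

Inside each process the second lookup hits the cache, but the second process starts from an empty cache again because memory is not shared between processes.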

If you would like to cache data between calls to getStaticPaths and getStaticProps, you can use various strategies: for example, writing the cache to the filesystem in getStaticPaths and reading it back in getStaticProps, or storing the data in something like Redis and querying it from there.

The filesystem cache approach works pretty well on the Notion blog example and helps prevent re-fetching of data that has already been fetched.
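
A minimal sketch of the filesystem strategy follows. It is not taken from the Notion blog example. CACHE_FILE, fetchAllPagesFromCMS and the page shape are hypothetical, and the two page functions are written synchronously and without export so the sketch runs on its own; in a real page they would be exported async functions.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical cache location. The temp directory is used only so the sketch
// runs anywhere; a real project would pick a path available to every worker.
const CACHE_FILE = path.join(os.tmpdir(), 'cms-data-cache.json');

let fetchCount = 0;
// Hypothetical stand-in for a single "fetch everything" CMS request.
function fetchAllPagesFromCMS() {
  fetchCount += 1;
  return [
    { path: '/about', title: 'About' },
    { path: '/blog/hello', title: 'Hello' },
  ];
}

function writeCache(pages) {
  fs.writeFileSync(CACHE_FILE, JSON.stringify(pages), 'utf8');
}

function readCache() {
  return JSON.parse(fs.readFileSync(CACHE_FILE, 'utf8'));
}

// Fetches once and persists the result for later getStaticProps calls.
function getStaticPaths() {
  const pages = fetchAllPagesFromCMS();
  writeCache(pages);
  const paths = pages.map((page) => ({
    params: { slug: page.path.split('/').filter(Boolean) },
  }));
  return { paths, fallback: false };
}

// Reads from the file instead of calling the CMS again.
function getStaticProps({ params }) {
  const pagePath = '/' + params.slug.join('/');
  const page = readCache().find((p) => p.path === pagePath) || null;
  return { props: { page } };
}

const { paths } = getStaticPaths();
const results = paths.map((entry) => getStaticProps(entry));
console.log(fetchCount, results.map((r) => r.props.page.title));
```

Because the data travels through a file rather than through module state, it does not matter which process runs getStaticProps. The cost is that the cache now needs an explicit invalidation rule.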

@tomduncalf

Oh man, this is a shame, it makes my use case much more complicated (I assumed state in a "store" class would be preserved in a single client-side session as you navigate between pages, but actually the store gets recreated when you navigate to a page generated by getStaticPaths). Is there any way to disable the worker thread, or to approach this in a different way? Having to serialize the client state into LocalStorage or whatever feels like overkill for my use case (it's just UI state that should not persist beyond a single session).

@tomduncalf

Actually, am I doing something wrong here? My site can be entirely run as SSR, and if I browse the exported site (just served from a web server, no Next.js server), I can see that when I navigate between non-getStaticPaths-generated pages, the browser just loads the extra bits of JS, as expected. But if I navigate to a getStaticPaths-generated page, the browser does a full reload of the page, which also makes the load noticeably slower. That seems like strange behaviour if it is correct, so perhaps I have set something up wrong?

@tomduncalf

I was doing something wrong, heh: dynamic links need both href and as.

@eric-burel
Contributor

eric-burel commented Mar 29, 2021

Is it safe to store those data in .next/cache for instance? From the Notion example I don't really grasp the strategies used to manage the cache across builds. When you have one page, you can safely drop the cache and rebuild it in getStaticPaths. But if 2 pages are using the same cache entry, I'd like to guarantee that the cache is dropped beforehand when I run yarn dev or yarn build.

Edit: maybe we can get some unique build ID to check against?
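
One way to sketch the build ID idea: stamp the cache file with an ID and treat any entry written under a different ID as a miss. Everything here is an assumption for illustration, including the cache file location, the helper names, and where the ID comes from (for instance an environment variable set by the script that launches yarn dev or yarn build). Nothing in this thread confirms that Next.js exposes such an ID to getStaticPaths or getStaticProps.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical cache location; a fresh temp directory keeps the sketch repeatable.
const cacheDir = fs.mkdtempSync(path.join(os.tmpdir(), 'cms-cache-'));
const CACHE_FILE = path.join(cacheDir, 'cms-data.json');

// Returns the cached data only if it was written under the same build ID.
function readCacheForBuild(buildId) {
  try {
    const entry = JSON.parse(fs.readFileSync(CACHE_FILE, 'utf8'));
    return entry.buildId === buildId ? entry.data : null;
  } catch {
    return null; // a missing or unreadable cache file counts as a miss
  }
}

function writeCacheForBuild(buildId, data) {
  fs.writeFileSync(CACHE_FILE, JSON.stringify({ buildId, data }), 'utf8');
}

// Any page can call this; only the first caller within a build fetches.
function getData(buildId, fetchData) {
  const cached = readCacheForBuild(buildId);
  if (cached !== null) return cached;
  const data = fetchData();
  writeCacheForBuild(buildId, data);
  return data;
}

// Usage: two pages share one fetch within a build; a new build ID refetches.
let fetches = 0;
const fetchData = () => {
  fetches += 1;
  return { pages: ['/about'] };
};

getData('build-1', fetchData); // miss: fetches and writes the cache
getData('build-1', fetchData); // hit: same build ID
const fetchesInFirstBuild = fetches;
getData('build-2', fetchData); // miss: the entry belongs to another build
console.log({ fetchesInFirstBuild, fetches });
```

With this layout no page has to delete the cache up front: whichever page runs first in a new build overwrites the stale entry, and the pages that follow reuse it.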

@Vadorequest
Contributor

Vadorequest commented May 26, 2021

@ijjk This is still a frequently requested feature. Your answer basically says "find/implement your own solution". What the community is asking for is a built-in way to do something like this.

One of my main issues in crafting my own solution is that it has to be called during or before the Webpack run, which makes it hard to use the same language version: the Webpack script (in next.config.js) doesn't use the TypeScript/ESXXX version my code usually uses, so I can't reuse code I've already written in my app and need to duplicate it. It feels wrong, and even if duplicated code were fine (it's not), it's still not clear to me how I should implement it to make the data accessible to getStaticProps/getStaticPaths.

The "best" (not tested) solutions I've found so far are from #18550:

Last time I checked, they didn't fit my needs. I'll check again, but I really need the framework to come up with a proper built-in solution.

@Vadorequest
Contributor

I used https://github.com/ricokahler/next-plugin-preval to prefetch all translations from the Locize API (i18n) during the Webpack bundling phase and make them statically accessible to all components, pages and API routes.

PR: UnlyEd/next-right-now#337


The journey wasn't so easy, though. I encountered several Webpack bugs (related to the plugin's implementation), see ricokahler/next-plugin-preval#26, and overall the developer experience has decreased (although performance has increased).

It's clearly better than nothing, and clearly not as good as what the Next.js team does -- hence the need for a built-in way to do something like this. Nobody likes to mess with Webpack, and the Next.js Webpack config is highly sophisticated.

@leerob @timneutkens Some dev feedback regarding this long-lasting issue (I wanted to do this a year ago)

@jrandeniya

I've implemented the file-based cache solution here if anyone is interested. I needed it for my own project:
jrandeniya/nextjs-ssg-contentful-cached.

It's in TypeScript, with build-time type generation and caching of calls between getStaticProps and getStaticPaths.

@smnh
Author

smnh commented Dec 20, 2021

@jrandeniya nice!

We (@stackbit) have sponsored a similar project that solves similar issues with Contentful and other CMSes. Check it out at: https://github.com/contentlayerdev/contentlayer

Also, we have created a utility package that enables "Hot Content Reloads" for page props while working locally with Next.js: https://github.com/stackbit/nextjs-hot-content-reload

@jrandeniya

@smnh - this looks great. I'll check it out later and see if I can contribute 😄

@cureau

cureau commented Jan 22, 2022

If anyone wants to follow @ijjk's approach, here's a function where you can cache data on SSG.

  • Make sure to REPLACE THIS with your actual data fetching logic
  • Make sure your cached data is somewhere that's included in your build
  • If you want to simplify what's below and only refresh cache during deployments, delete the DELETE THIS block

import fs from 'fs';
import { promisify } from 'util';
import path from 'path';

const readFile = promisify(fs.readFile);
const writeFile = promisify(fs.writeFile);

const cacheFile = path.resolve('./data/cached-data'); // REPLACE THIS

const getDataFromFS = async () => {
  let Data = [];
  let refreshCache = false;

  // DELETE THIS IF YOU WANT
  // 43200 = twice a day; you can also take it out if you just want to cache on build
  const refreshCacheSeconds = Number(process.env.REFRESH_CACHE_SECONDS) || 43200;
  try {
    // statSync throws if the cache file does not exist yet (e.g. on the first run)
    const fileStats = fs.statSync(cacheFile);
    const secondsSinceLastUpdate = (Date.now() - fileStats.mtime.getTime()) / 1000;
    console.log(
      `seconds since cache update: ${secondsSinceLastUpdate}; refreshCacheSeconds: ${refreshCacheSeconds}`
    );
    refreshCache = secondsSinceLastUpdate > refreshCacheSeconds;
  } catch {
    // No cache file to inspect, so there is nothing to read below
    refreshCache = true;
  }
  // END OF DELETE THIS; just take out the conditional if (!refreshCache) below

  if (!refreshCache) {
    try {
      Data = JSON.parse(await readFile(cacheFile, 'utf8'));
      console.log(`Reading ${Data.length} from file`);
    } catch {
      console.log('No cached data found');
    }
  }

  if (Data.length === 0) {
    try {
      console.log('Fetching from database. NOTE: you should only see this once per build');
      Data = await fetchYourDataHere(); // REPLACE THIS
    } catch (err) {
      console.warn(
        `Failed to load from firestore using fetchYourDataHere, are we connected to the right firestore db?`
      );
    }
    // Write the cache before returning so later calls can read it
    console.time('writing data to filesystem');
    try {
      await writeFile(cacheFile, JSON.stringify(Data), 'utf8');
    } catch (e) {
      console.warn('Failed to write cache file', e);
    }
    console.timeEnd('writing data to filesystem');
  } else {
    console.log('successfully fetched from filesystem');
  }
  return Data;
};

export default getDataFromFS;

@github-actions
Contributor

This closed issue has been automatically locked because it had no new activity for a month. If you are running into a similar issue, please create a new issue with the steps to reproduce. Thank you.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Feb 21, 2022
@leerob
Member

leerob commented Aug 4, 2022

For folks landing here from Google -> https://github.com/vercel/examples/tree/main/solutions/reuse-responses
