Provide batch actions for handling nodes to increase performance #5079

Closed
szimek opened this issue Apr 22, 2018 · 4 comments
Labels
type: question or discussion Issue discussing or asking a question about Gatsby

Comments

@szimek
Contributor

szimek commented Apr 22, 2018

Description

I'm using the gatsby-source-contentful plugin. I've got 6 spaces (a space is something like a project or a database), one for each language, and each space has about 500-600 posts and about 1500 images. The problem is that one of the first steps, source and transform nodes, takes about 10 minutes to complete. The whole build takes about 30 minutes and requires raising the memory limit for the Node.js process to 6 GB, but that part (i.e. generating HTML pages, JS and CSS) will hopefully be at least somewhat optimized by switching to Webpack 4 in Gatsby v2.

Back to source and transform nodes. I added some logging to Gatsby and it seems that the main culprit is the reducer for the CREATE_NODE action. Its code is pretty simple and standard for Redux apps:

```js
newState = {
  ...state,
  [action.payload.id]: action.payload,
}
```

In my case the final state has about 24K nodes. The plugin adds them one by one, dispatching a CREATE_NODE action each time. This means that, when the cache is empty, the state has to be copied 24K times. At the beginning a copy takes less than 1ms, but at the end, when the state has over 20K entries, it takes about 30ms. More than 4K nodes are created for each space, so when processing the last space, about 2 minutes (4K nodes * 30ms) are spent only on copying the state.
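To make the cost described above concrete, here is a minimal sketch that counts how many existing entries the spread has to copy when nodes are added one at a time. The `nodesReducer` and `entriesCopied` names are illustrative assumptions, not Gatsby's actual code; only the spread pattern is taken from the snippet above.

```js
// Hypothetical reducer that mirrors the spread pattern quoted above.
function nodesReducer(state = {}, action) {
  switch (action.type) {
    case "CREATE_NODE":
      // Copies every existing entry into a new object, then adds one more.
      return { ...state, [action.payload.id]: action.payload };
    default:
      return state;
  }
}

// Adds n nodes one by one and counts how many existing entries
// the spread had to copy in total.
function entriesCopied(n) {
  let state = {};
  let copied = 0;
  for (let i = 0; i < n; i++) {
    copied += Object.keys(state).length; // entries copied by this dispatch
    state = nodesReducer(state, {
      type: "CREATE_NODE",
      payload: { id: `node-${i}` },
    });
  }
  return { copied, size: Object.keys(state).length };
}

console.log(entriesCopied(1000)); // { copied: 499500, size: 1000 }
```

The total grows as n * (n - 1) / 2, so 24,000 nodes would mean roughly 288 million entry copies, which is consistent with the per-dispatch time growing as the state fills up.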

Would it be possible to add batch actions for managing nodes, so that instead of creating nodes one by one, a plugin like Contentful could dispatch a CREATE_NODES action just once and do something like:

```js
newState = {
  ...state,
  ...action.payload,
}
```

where action.payload is an object that contains all new nodes?

If it's not possible, then maybe there are other ways to optimize it?
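The proposal could be sketched roughly as follows. This is only an illustration under assumptions: `CREATE_NODES` is the hypothetical action suggested in this issue, and `nodesReducer` and `toPayload` are made-up names rather than anything from Gatsby's codebase.

```js
// Hypothetical reducer handling both the existing single-node action
// and the proposed batch action.
function nodesReducer(state = {}, action) {
  switch (action.type) {
    case "CREATE_NODE":
      return { ...state, [action.payload.id]: action.payload };
    case "CREATE_NODES":
      // Proposed batch action: payload is an object keyed by node id,
      // so the existing state is copied once per batch, not once per node.
      return { ...state, ...action.payload };
    default:
      return state;
  }
}

// Illustrative helper: turns an array of nodes into an id-keyed payload.
function toPayload(nodes) {
  return Object.fromEntries(nodes.map((node) => [node.id, node]));
}

const nodes = [
  { id: "a", title: "First" },
  { id: "b", title: "Second" },
];

const oneByOne = nodes.reduce(
  (state, node) => nodesReducer(state, { type: "CREATE_NODE", payload: node }),
  {}
);
const batched = nodesReducer(
  {},
  { type: "CREATE_NODES", payload: toPayload(nodes) }
);

console.log(JSON.stringify(oneByOne) === JSON.stringify(batched)); // true
```

Both paths end in the same state; the batch path just pays the cost of copying the existing entries once.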

@m-allanson m-allanson added the type: question or discussion Issue discussing or asking a question about Gatsby label Apr 23, 2018
@szimek szimek changed the title Provide batch actions for handling nodes Provide batch actions for handling nodes to increase performance Apr 23, 2018
@m-allanson
Contributor

@pieh or @KyleAMathews do you have any thoughts on this?

@pieh
Contributor

pieh commented Apr 24, 2018

I was just checking whether breaking the Redux convention of not mutating state, and simply adding new nodes to the existing state, would be a bad idea, and found this:

> Furthermore, when it gets expensive (e.g. fast array changes), you can start using a library like Immutable.js that has very fast copying thanks to structural sharing. With Immutable.js, copying even large arrays isn't really that expensive because large chunks of the memory are reused. Finally, whether with or without Immutable.js, immutability helps us efficiently rerender the app because we know what exactly has changed thanks to the objects not being mutated.

(reduxjs/redux#758 (comment))

So I guess it's worth exploring that option. I think this could also close #4685 in a single PR.

@KyleAMathews
Contributor

It would probably be more useful to do a PR like #4681 that switches things to use a Map. In v2, we'll be switching all internal storage of nodes and other core data objects to use Maps, as they're a lot faster (#4685).
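As a rough illustration of the Map idea, the sketch below keeps nodes in a `Map` and updates it in place. It is an assumption about what such a reducer might look like, not the code from #4681 or from Gatsby v2; `nodesReducer` is a made-up name.

```js
// Hypothetical Map-backed reducer: inserts are done in place,
// so adding a node does not copy the existing entries.
function nodesReducer(state = new Map(), action) {
  switch (action.type) {
    case "CREATE_NODE":
      state.set(action.payload.id, action.payload);
      return state; // same Map instance, deliberately mutated
    default:
      return state;
  }
}

let state = nodesReducer(undefined, { type: "@@INIT" });
const initial = state;

for (const id of ["a", "b", "c"]) {
  state = nodesReducer(state, { type: "CREATE_NODE", payload: { id } });
}

console.log(state.size); // 3
console.log(state === initial); // true
```

The trade-off is that the reducer returns the same reference after every update, so anything that detects changes by reference equality will not notice new nodes. That is the Redux convention being weighed earlier in this thread.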

@KyleAMathews
Contributor

v2 is a lot faster! So closing this for now.
