
[v2] out-of-memory error with large amount of nodes #6611

Closed
pieh opened this issue Jul 20, 2018 · 2 comments

pieh commented Jul 20, 2018

With a sufficiently large amount of data-heavy nodes, Gatsby v2 will fail with a JavaScript heap out of memory error (without increasing the heap size).

Terminal output
⢀ source and transform nodes
contentTypes fetched 42
Updated entries  6908
Deleted entries  0
Updated assets  2800
Deleted assets  0
Fetch Contentful data: 108808.495ms
success source and transform nodes — 116.323 s

<--- Last few GCs --->

[60520:0x103800000]   132309 ms: Mark-sweep 1422.2 (1564.9) -> 1421.8 (1564.9) MB, 138.1 / 0.0 ms  allocation failure GC in old space requested
[60520:0x103800000]   132557 ms: Mark-sweep 1421.8 (1564.9) -> 1421.8 (1523.9) MB, 247.6 / 0.7 ms  last resort GC in old space requested
[60520:0x103800000]   132726 ms: Mark-sweep 1421.8 (1523.9) -> 1421.8 (1519.4) MB, 168.9 / 0.0 ms  last resort GC in old space requested


<--- JS stacktrace --->

==== JS stack trace =========================================

Security context: 0xf62d8aa57c1 <JSObject>
    1: mapToObject(aka mapToObject) [/Users/mike/dev/gatsbyinc/project/node_modules/gatsby/dist/redux/index.js:~32] [pc=0xb78e9d0547a](this=0xf625f7822d1 <undefined>,map=0xf622b48d6c1 <Map map = 0xf62f42048d9>)
    2: /* anonymous */ [/Users/mike/dev/gatsbyinc/project/node_modules/gatsby/dist/redux/index.js:103] [bytecode=0xf6295584df1 offset=79](this=0xf62dea0c211 <JSGlobal Object>,state...

FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - JavaScript heap out of memory
 1: node::Abort() [/Users/mike/.nvm/versions/node/v8.11.1/bin/node]
 2: node::FatalException(v8::Isolate*, v8::Local<v8::Value>, v8::Local<v8::Message>) [/Users/mike/.nvm/versions/node/v8.11.1/bin/node]
 3: v8::internal::V8::FatalProcessOutOfMemory(char const*, bool) [/Users/mike/.nvm/versions/node/v8.11.1/bin/node]

This seems to be caused by serializing the huge nodes array to a JSON string:

Here we pass all nodes and the nodes of the given type as args for the setFieldsOnGraphQLNodeType API hook:

intermediateType.name = typeName
intermediateType.nodes = nodes
const fieldsFromPlugins = await apiRunner(`setFieldsOnGraphQLNodeType`, {
  type: intermediateType,
  allNodes: getNodes(),
  traceId: `initial-setFieldsOnGraphQLNodeType`,
  parentSpan: span,
})

This later gets serialized (JSON.stringify(args)):

apiRunInstance.id = `${api}|${apiRunInstance.startTime}|${
  apiRunInstance.traceId
}|${JSON.stringify(args)}`

The resulting string itself wouldn't cause the out-of-memory error on its own; it seems like the process of serializing it does:
[memory usage screenshot]

What I think is happening is that JSON.stringify concatenates array items one by one, and those intermediate strings are kept around, so we run out of memory before they are garbage collected (not sure about this, maybe someone with deeper V8 knowledge can correct me if I'm wrong).
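
A rough standalone sketch of that pattern (the node shape, counts, and payload sizes here are made up for illustration, not taken from the project above):

// Sketch: build a "data heavy" nodes array and serialize it, watching the heap
// before and after. Bumping NODE_COUNT / PAYLOAD_BYTES high enough will crash
// a default-sized heap in the same way as the log above.
const NODE_COUNT = 20000
const PAYLOAD_BYTES = 10 * 1024

const nodes = []
for (let i = 0; i < NODE_COUNT; i++) {
  nodes.push({
    id: `node-${i}`,
    internal: { type: `FakeType` },
    content: `x`.repeat(PAYLOAD_BYTES), // stand-in for real node data
  })
}

const before = process.memoryUsage().heapUsed
const serialized = JSON.stringify({ allNodes: nodes }) // the expensive step
const after = process.memoryUsage().heapUsed

console.log(`serialized length: ${(serialized.length / 1024 / 1024).toFixed(1)} MB`)
console.log(`heap grew by:      ${((after - before) / 1024 / 1024).toFixed(1)} MB`)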

This is the change that fixes the problem (or maybe I should say works around it) - pieh@b32ae20 (just don't pass nodes as-is to the API hook). But if we had 2x more nodes, we would hit the same problem when we serialize the reducer state to save it to a file. Right now it seems we get the out-of-memory crash because we serialize the nodes array twice independently (in API runner tracing and in redux state saving).
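
For illustration, the shape of that workaround looks roughly like this (a sketch of the idea only, not the actual pieh@b32ae20 diff; the accessor names are hypothetical):

// Sketch only: expose nodes behind functions instead of materialized arrays.
// JSON.stringify skips function-valued properties, so the tracing id built in
// the API runner no longer pulls every node into the serialized string.
const fieldsFromPlugins = await apiRunner(`setFieldsOnGraphQLNodeType`, {
  type: intermediateType,
  // hypothetical accessors, not the real Gatsby API surface
  getNodesOfType: () => nodes,
  getAllNodes: () => getNodes(),
  traceId: `initial-setFieldsOnGraphQLNodeType`,
  parentSpan: span,
})

The trade-off is that anything reading type.nodes or allNodes directly would have to call an accessor instead.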

pieh added the type: bug label Jul 20, 2018
pieh self-assigned this Jul 20, 2018
KyleAMathews commented

Not passing allNodes is an easy win. Also, serializing arguments to generate an ID is a problem... there should be a cheaper way to generate the id there, even if it's just that we include some specific ways to generate an ID for specific API calls that can be expensive, like this one.
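
For illustration, one cheaper option could look like this (sketch only, not an agreed design; it drops whatever dedup the stringified args provided, which per-API id generators, as suggested above, could restore where it matters):

// Sketch: make the run instance id unique without touching the args object.
let apiRunCounter = 0

apiRunInstance.id = `${api}|${apiRunInstance.startTime}|${
  apiRunInstance.traceId
}|${apiRunCounter++}`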

KyleAMathews commented

The markdown benchmark site would be good for testing this. I noticed it hit memory pressure around ~50k pages — probably from this issue.

https://github.com/gatsbyjs/gatsby/tree/master/benchmarks/markdown
