You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
The instructions are very simple, displayed directly in the README, and require no dependencies installed. (other than Node/npm of course)
How often does it reproduce? Is there a required condition?
On my computers at least (desktop and laptop), it reproduces always, given a loop-count of 5000.
Specifically:
If the loop count is >=3000, it always reproduces.
If it is <=2000, it never reproduces. (at least I don't recall it doing so)
The cutoff point between working and not working is somewhere around 2700 on my dev (Windows) computer.
What is the expected behavior?
It is expected that the npm run start_trackHeap command will not crash partway through operation -- or if it does crash, that at least some sort of error message or dump or the like is output, helping users identify what is wrong.
What do you see instead?
On Node 16.13.2 (the latest LTS release), the process just abruptly stops without any information whatsoever.
On Node 14.17.1, the process abruptly stops, but at least has the decency to output a 3221225477 error code. (as shown in the screen capture in the linked repo)
Additional information
While the easiest way to reproduce the problem is by adding the --track-heap-objects flag to Node, another way is by trying to take a memory snapshot while (or maybe before/after -- not sure) the example code/data-transformation is running. (which is how I discovered the bug initially)
Of course, you don't need to look into the code of postgraphile, since the minimal repro above is self-contained, and much easier to debug. But the above is some context on how I came across it. (ie. it is indeed a bug that can be hit in the wild; in my case, I ended up spending over 30 hours trying to debug it, because of the utter lack of error message, stack trace, etc., combined with its depth in a third-party library, and my assumption that the bug would be "higher level" than the language/engine itself)
The text was updated successfully, but these errors were encountered:
Venryx
changed the title
Abrupt crash on particular structure/processing, when --track-heap-objects is enabled
Abrupt crash on particular structure/processing, when --track-heap-objects is enabled (or memory profiling)
Jan 15, 2022
I found a way to restructure the deepClone function which avoids the crash:
constdeepClone=value=>{if(Array.isArray(value)){returnvalue.map(val=>deepClone(val));}if(typeofvalue==="object"&&value){// crash occurs, with this version/*return Object.keys(value).reduce((memo, k)=>{ memo[k] = deepClone(value[k]); return memo; }, {});*/// crash does not occur, with this versionconstfunc=(memo,k,value)=>{memo[k]=deepClone(value[k]);returnmemo;};returnObject.keys(value).reduce((memo,k)=>func(memo,k,value),{});}returnvalue;};
So the bug in NodeJS must have something to do with recursive functions.
Screen capture showing before and after
EDIT: Benjie found a better workaround for the crash that simply unrolls the calls to reduce, rather than inserting a (userland) redundant function:
const deepClone = value=>{
if (Array.isArray(value)) {
const l = value.length;
const cloned = new Array(l);
for (let i = 0; i < l; i++) {
cloned[i] = deepClone(value[i]);
}
return cloned;
}
if (typeof value === "object" && value) {
// Unrolled `reduce` to work around memory bug in Node.js
const cloned = Object.create(null);
const keys = Object.keys(value);
for (let i = 0, l = keys.length; i < l; i++) {
const k = keys[i];
cloned[k] = deepClone(value[k]);
}
return cloned;
}
return value;
};
Version
16.13.2 (the latest LTS release), as well as v14.17.1
Platform
Microsoft Windows NT 10.0.19043.0 x64, as well as: Linux 4.15.0-163-generic #171-Ubuntu SMP Fri Nov 5 11:55:11 UTC 2021 x86_64 Linux
Subsystem
No response
What steps will reproduce the bug?
I created a minimal reproduction here: https://github.com/issue-repros/nodejs/tree/main/TrackHeapObjects_Crash
The instructions are very simple, displayed directly in the README, and require no dependencies installed. (other than Node/npm of course)
How often does it reproduce? Is there a required condition?
On my computers at least (desktop and laptop), it reproduces always, given a loop-count of 5000.
Specifically:
If the loop count is >=3000, it always reproduces.
If it is <=2000, it never reproduces. (at least I don't recall it doing so)
The cutoff point between working and not working is somewhere around 2700 on my dev (Windows) computer.
What is the expected behavior?
It is expected that the
npm run start_trackHeap
command will not crash partway through operation -- or if it does crash, that at least some sort of error message or dump or the like is output, helping users identify what is wrong.What do you see instead?
On Node 16.13.2 (the latest LTS release), the process just abruptly stops without any information whatsoever.
On Node 14.17.1, the process abruptly stops, but at least has the decency to output a
3221225477
error code. (as shown in the screen capture in the linked repo)Additional information
While the easiest way to reproduce the problem is by adding the
--track-heap-objects
flag to Node, another way is by trying to take a memory snapshot while (or maybe before/after -- not sure) the example code/data-transformation is running. (which is how I discovered the bug initially)As for the context in which I originally hit the issue: I was debugging why my server was crashing whenever I was trying to take memory profiles. I narrowed the bug down to being in the postgraphile library somewhere, and then narrowed it down further still to this line: https://github.com/graphile/graphile-engine/blob/a9fcbafc52244495420faac5da0e05cbceaf5e69/packages/graphile-build-pg/src/plugins/PgIntrospectionPlugin.js#L821
Of course, you don't need to look into the code of
postgraphile
, since the minimal repro above is self-contained, and much easier to debug. But the above is some context on how I came across it. (ie. it is indeed a bug that can be hit in the wild; in my case, I ended up spending over 30 hours trying to debug it, because of the utter lack of error message, stack trace, etc., combined with its depth in a third-party library, and my assumption that the bug would be "higher level" than the language/engine itself)The text was updated successfully, but these errors were encountered: