VM: Extreme memory growth #3113
It looks like `context` objects are not collected by the GC until the max memory limit is reached. If I specify `--max-old-space-size`, I can stop the memory growth. If I rewrite the code to reuse the context object, memory usage and performance are a lot better. I understand why performance is a lot better (re-using the same object is obviously faster than creating new ones), but why is the memory profile so different?
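For illustration, a minimal sketch of the two shapes being compared (the function names and code are hypothetical, not the original snippet):

```js
'use strict';
const vm = require('vm');

// Variant A: a fresh context per run. Each call allocates a new
// v8::Context, and these pile up until a full GC finally runs.
function runInFreshContext(code) {
  return vm.runInContext(code, vm.createContext({}));
}

// Variant B: one shared context, reused for every run.
const sharedContext = vm.createContext({});
function runInSharedContext(code) {
  return vm.runInContext(code, sharedContext);
}
```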
The fact that the memory growth is stopped by using `--max-old-space-size` means that those objects are collectable by the gc. Probably gc isn't triggered in this situation. Will take a look in an hour.
What you're seeing is more or less the expected behavior. Garbage collection is expensive, and garbage collecting VM contexts even more so, so V8 will generally prefer to grow the heap instead of sweeping it. If you start node with `--expose-gc` and call `gc()` from time to time, memory usage should stabilize.
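A sketch of the two workarounds discussed in this thread (flag values and the interval are illustrative):

```js
// Run as: node --expose-gc app.js
// (the alternative is: node --max-old-space-size=256 app.js, which caps
// the old space so V8 is forced to sweep earlier)
'use strict';
const vm = require('vm');

let runs = 0;
function runSandboxed(code) {
  vm.runInNewContext(code, {});
  // With --expose-gc, V8 exposes a global gc() function. Forcing a full
  // collection every N runs keeps discarded contexts from accumulating.
  if (typeof global.gc === 'function' && ++runs % 100 === 0) {
    global.gc();
  }
}
```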
@bnoordhuis This is still quite unexpected, especially as we didn't have this problem with v0.10. I understand that I could either expose the gc or limit the max heap size, but both solutions feel like some sort of hack to me.
It is what it is. There is not much we (node.js) can do here; GC heuristics are predominantly a V8 affair. We can't really hint to V8 that now would be a good time to collect garbage, because node.js doesn't really know when a VM context is well and truly dead.
Oh well, that's unfortunate. :( I worked around this issue for now by re-using the same context instead of creating a new one each time. Not perfect, but better than using up all available memory on the system.
The code fires only scavenges, but not mark-sweeps (until the total RSS reaches the limit).
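One way to observe this for yourself (a sketch; `repro.js` is a hypothetical filename): run a reproduction under V8's `--trace_gc` flag and watch which collector fires.

```js
// Run as: node --trace_gc repro.js
// --trace_gc prints one line per collection; here you see a steady stream
// of "Scavenge" lines but (almost) no "Mark-sweep" lines until the heap
// nears its limit, matching the observation above.
'use strict';
const vm = require('vm');
for (let i = 0; i < 1000000; i++) {
  vm.createContext();
}
```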
It's also worth noting that somewhy …
@ChALkeR What I've seen in our app is that even after running for more than 48 hours, none of the memory that is used up by the contexts is ever really freed again. Basically, memory usage grows to some limit around 1 GB and then stays there forever. Not sure why no mark-sweep is triggered and why these context objects are not cleaned up.
What do you mean by «not freed again»? It's actually freed when a mark-sweep is triggered and re-used for further allocations. If your …
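To check whether the memory is really reusable rather than leaked, a small sketch (assumes `--expose-gc`; exact numbers will vary):

```js
// Run as: node --expose-gc check.js
'use strict';
const vm = require('vm');

for (let i = 0; i < 100000; i++) {
  vm.createContext();
}
const before = process.memoryUsage();
global.gc(); // force a full mark-sweep
const after = process.memoryUsage();
// If the contexts are collectable, heapUsed (and usually heapTotal)
// drops sharply after the forced collection.
console.log('heapUsed :', before.heapUsed, '->', after.heapUsed);
console.log('heapTotal:', before.heapTotal, '->', after.heapTotal);
```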
A side note: …
@indutny Maybe `AdjustAmountOfExternalAllocatedMemory` would help here?
That's about 1 MiB per each (empty) context.
@ChALkeR One thing I noticed: I don't see any call to `AdjustAmountOfExternalAllocatedMemory` anywhere in `src/node_contextify.cc`. The v8 source says the following about it: it gives V8 an indication of the amount of externally allocated memory that is kept alive by JavaScript objects, which V8 uses to decide when to perform global garbage collections. Looks like this call was somehow lost during the merge of the `contextify` module into core.
Just for the reference, the following diff

```diff
diff --git a/src/node_contextify.cc b/src/node_contextify.cc
index 6520355..d21fc80 100644
--- a/src/node_contextify.cc
+++ b/src/node_contextify.cc
@@ -83,10 +83,13 @@ class ContextifyContext {
     proxy_global_.SetWeak(this, WeakCallback<Object, kProxyGlobal>);
     proxy_global_.MarkIndependent();
     references_++;
+
+    env->isolate()->AdjustAmountOfExternalAllocatedMemory(5 * 1024 * 1024);
   }

   ~ContextifyContext() {
+    env_->isolate()->AdjustAmountOfExternalAllocatedMemory(-5 * 1024 * 1024);
     context_.Reset();
     proxy_global_.Reset();
     sandbox_.Reset();
```

is just enough stress for gc to clean things up and keep the memory usage stable there.
@arthurschreiber It could be related, but by modifying just the destructor you can't solve this problem.
@ChALkeR Are you sure? My C++ is not that great, but I think the destructor is invoked here: https://github.com/nodejs/node/blob/master/src/node_contextify.cc#L337.
I looked into it a while ago and IIRC, it's because the `v8::Context` can stay alive after the `ContextifyContext` itself has been destroyed; embedded weak references in code objects and such keep it alive. It doesn't help that `ContextifyContext` does a complicated dance with multiple weakly persistent objects.
Looking at lines 336-337, …

I tried sticking …
cc @domenic again

At this point @ChALkeR has a better grasp on this than I do.
For reference, reduced testcase:

```js
'use strict';
var vm = require('vm');

for (var i = 0; i < 10000000; i++) {
  if (i % 10 === 0) {
    console.log("After", i, "iterations:",
                parseInt(process.memoryUsage().heapTotal / 1024 / 1024), "MB");
  }
  vm.createContext();
}
```
+1, we have faced the same problem while executing a precompiled vm.Script with runInNewContext and runInContext. It went pretty badly for us. If anyone could provide some hint about where this is located in the v8 source, we would like to investigate (we don't know C, but we'll do our best).
I'm seeing the same kind of issue: from the heap dumps, it looks like I've got vm sandboxes piling up from calls to runInNewContext. A simple test case, with very little going on in the script, benefits from --expose-gc and calling gc().

Would it be possible to expose some way of marking/disposing of a sandbox/context as unwanted, e.g. unref() or .dispose()?
That wouldn't make a difference. As long as there are still references (from your own code or e.g. from weak embedded pointers in code objects), the context can't be reclaimed by the garbage collector.
With #3962 it still leaks memory, but at least it collects it a bit faster (partly because allocation takes much less time).
Just as an FYI, since it is suggested many times in here: if you add the --expose-gc flag, it adds gc to the context when it is created in runInNewContext. The gc must reference the context, and now the context references the gc, so it leaks as a result of this flag being set. Setting context.gc = null after the context is created solves this. I updated #3249 as well.
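A sketch of that workaround as described (whether `gc` actually lands on your sandbox depends on the node version; treat this as illustrative):

```js
// Run as: node --expose-gc app.js
'use strict';
const vm = require('vm');

const sandbox = {};
vm.runInNewContext('var x = 42;', sandbox);
// Per the comment above: with --expose-gc, the fresh context ends up
// holding a gc reference, creating a cycle that keeps it alive.
// Nulling the property breaks the cycle so the context can be collected.
sandbox.gc = null;
```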
Has anyone done any work on comparing v0.10 to v4 and v5, or even master (perhaps V8 4.9 is interesting now)? I'm hearing that this issue is holding up some users from migrating to v4+ since it's such a big deal. @nodejs/v8 I'm wondering if this is something on V8's radar at all? How do they use new contexts in Chromium, or is that functionality entirely for embedders? I.e. would Chromium feel any pain from this change from their user-base, or is it on us to provide feedback about what we're experiencing and try to find a resolution with them?
I can reproduce this with v4 and v5, but it seems to be fixed on master.
It seems like I fixed this issue recently when I simplified the logic of how node_contextify was keeping references to the V8 context (relevant commits ebbbc5a...66566df). Next I can take a look at whether these are easily backportable.

Accidentally fixing something, nice! Thanks for looking into this @ofrobots!
These back-port to v4.x. @nodejs/lts: what would be the process to get this backported? These 4 commits (ebbbc5a...66566df) need to be backported, but I don't want to tag the issue (#5392) with the lts-watch tag.
For some reason I thought that the issue contained more commits than just these 4. That is not the case, so I am just going to tag the issue.
I have test builds showing up here: https://nodejs.org/download/test/v4.3.2-test20160316bdc07626e8/ using https://github.com/rvagg/io.js/tree/5392-backport where I have those 4 commits applied on top of the current v4.x. I'd appreciate it if those having trouble with this problem could test it out and report back.
It turns out the above 4 commits are not functionally complete; an additional commit (#5800) is needed to avoid a crash. (@rvagg, would you mind putting together another build?) I would really like to get some independent verification that this memory growth issue is indeed fixed. In my experimentation, the reduced test-case from @ChALkeR above does seem to be fixed.
New builds with the extra commit are available here: https://nodejs.org/download/test/v4.3.2-test2016032010820d8910/. Please test, folks!
Using the build provided by @rvagg in development benchmarks dramatically reduces the memory footprint of creating and tearing down VMs over time. This fix is a huge boon for our team! Thank you!

I'm looking to do a test with 5.9.1 later this week. I've seen good results locally but need to kick the tyres more with a % of production traffic to be certain.
If things turn out to be good with this in 5.9.1, then I don't see any reason they won't be applied to a v4 release soonish. Given that it turned out to be not quite as simple as hoped (there was a regression in 5.9.0), it might be something we want to sit on for a little longer to ensure that the functionality is stable in v5 before we have confidence to pull it back. / @nodejs/lts
+1 to that, @rvagg. I would love to see these changes backported, but would also like to see them mature a little longer in v5.
Finally had a chance to test this. I was able to do an initial non-prod performance test, and the metrics look good: I saw an overall drop in CPU usage, and memory is reduced. I'm hoping to do a realistic prod test in the next couple of days, but so far I am very happy with the results. Thanks for looking into this. I should clarify that those peaks are expected in my particular use case; that is actually when node-vm is executing.
Cool. So this can be closed, right?
Closing this as it's not reproducible anymore and seems to be fixed.
PR to get this into v4.x: #6871
Released in v4.4.5 folks: https://nodesource.com/blog/node-js-v4-4-5-release-brief/ |