Performance regression when enumerating objects (for shallow cloning) v12.15.0 to v12.16.1 #32049
Comments
I ran some tests and found about 5-7% overhead on v12.16.1 (compared to v12.15.0). Is that consistent with your results? If so, this is a small regression that would be hard to perceive in a real-world application (by comparison, #31961 makes the operations hundreds of times slower, which is easier to notice in an application). Have you noticed a regression in an application after upgrading from v12.15.0 to v12.16.1? If so, can you share how much overhead you saw? It might be something else (not covered by this test case).
Yes, we have noticed a slowdown in an application, which I have pinpointed to looping through object keys for shallow cloning. When I run one of the unit tests, it slows down from ~2100 ms to ~2800 ms. So the difference is really perceivable. That said, the test case I produced here is really simple, and the result I'm seeing can be influenced by a multitude of other factors which I'm still trying to work out. So yes, there can definitely be something else, but it is also definitely related to shallow cloning, as changing that to an assignment makes the issue go away. I'll be working to build up a better reduced case...
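The cloning pattern under discussion is copying every enumerable key into a fresh object, rather than reusing the same instance. A sketch of that pattern (hypothetical; the actual application code is not shown in this thread):

```javascript
// Sketch of the shallow-cloning pattern described above (hypothetical;
// the real application code is not part of this thread).
function shallowClone(source) {
  const copy = {};
  for (const key of Object.keys(source)) {
    copy[key] = source[key];
  }
  return copy;
}

// An object with ~150 keys, similar in shape to the ones described.
const original = {};
for (let i = 0; i < 150; i++) {
  original['key' + i] = i;
}

const cloned = shallowClone(original);
console.log(cloned !== original);        // true: a distinct instance
console.log(Object.keys(cloned).length); // 150
```

Replacing the clone with a plain assignment (`const cloned = original;`) is the change the comment says makes the slowdown disappear.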
That's a 33% slowdown, so even if you're affected by this issue, it's unlikely to be causing all the overhead. If you can provide a test case closer to your unit tests (without sensitive code), we can help pinpoint the issue. Or maybe a CPU profile of your unit tests (with --cpu-prof, --prof, or Linux perf).
Not exactly. The unit test in question (call it an integration test if you will: it exercises an endpoint and goes through the whole app) does a lot more than just the shallow cloning, and there are definitely slowdowns in other parts of it. However, there is one method which does shallow cloning of two objects (each with 100-200 keys). That method gets called ~1000 times. The grand total (sum of durations based on …
(For comparison: if I remove the shallow cloning, the method takes 2-3 ms total in both cases.) So that's a 5x slowdown just there. I think the rest of the slowdown is fairly spread out across the rest of the system, as it isn't obvious from the flame charts, but I'll have to investigate that too at some point.
That's the tricky part, right? :) I can't figure out what is special about the objects being cloned, because the reduced case does not exhibit the same behavior. There is no deopt in the method, nothing to grab onto... It might also be a consequence of some other conditions? Possibly GC kicking in aggressively or something like that? Anyways, working on this...
I'm definitely not able to share the CPU profile. Any tips on what I should be looking for inside? That's the same data as you can see in dev tools or …
OK, so 15: …
16: …
If I skip the shallow cloning and return the same instance, I get roughly the same number of GC messages in the log, and the memory usage is a lot closer:
15: …
16: …
I think the script that I have in the top post also exhibits higher memory usage. Not sure if any of this is related, but figured it's worth testing/sharing... I also started looking at various V8 options that I could play with, and I've no idea what …
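One way to observe the allocation pressure being discussed (a rough sketch, not the reporter's actual measurement) is to compare `process.memoryUsage().heapUsed` around a burst of clone operations; running the script with `--trace-gc` produces the GC log mentioned above:

```javascript
// Rough illustration of the extra heap growth under discussion
// (indicative only; actual numbers depend on V8 version and GC timing).
function shallowClone(source) {
  const copy = {};
  for (const key of Object.keys(source)) copy[key] = source[key];
  return copy;
}

const big = {};
for (let i = 0; i < 200; i++) big['k' + i] = i;

const before = process.memoryUsage().heapUsed;
const clones = [];
for (let i = 0; i < 1000; i++) clones.push(shallowClone(big));
const after = process.memoryUsage().heapUsed;

console.log('retained clones:', clones.length);
console.log('heap growth (bytes):', after - before);
```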
I wonder if this is a memory leak. You could take a heap snapshot on each version and compare them in Chrome DevTools to see which objects are being retained. Also, are you able to test on v13.10.1? It would be good to double-check whether the issue is already fixed.
v13.10.1 still has the problem, and so does latest master. Next week I'll try to work on a better case to reproduce this, but I spotted a slowdown (~3x) in other areas where we use various methods of cloning. The one place I did poke at also shows increased time for GC. I have not yet checked the memory snapshot diffs. I have, however, tried a couple of experiments: …
So it feels like it has less to do with GC and more with how/where the objects get allocated? I'm sorry if I'm not explaining this well; V8 memory management is totally not my area of competence :) I'm also not really able to repro much of that with the reduced case I have at the top, so I'll keep working on that next week.
I've tested the above on the following:
The trend is not really helping.
We looked into that. However, forcing …
Note that 13.10.1 and master are using the same V8 version, showing the results are unstable (which is consistent with what I've seen so far), but there's a small overhead. @mcollina can you try the latest node-v8 build as well? Premature promotion to old space makes sense. If you haven't already, you can use …
@mmarchini in node-v8 it gets even worse. This is the output from the …
The only conclusion I could draw from this is that the object is allocated in the Large Object Space.
Ok, I think I've identified part of the problem. In Node 12.15.0 the object did not go into large object space. In Node 12.16.1 and beyond, it moves into large object space.
Found another repo where we can see a ~10% total slowdown in the unit test suite when changing from 12.15.0 to 12.16.1. Running …
Once again: looping through an object to construct a new object...
Tracked in V8 as well: https://bugs.chromium.org/p/v8/issues/detail?id=10339
Closed, this seems fixed.
Does this need any action to get into v14.x? |
I thought it was already shipped in v14; normally PRs are backported into a current release after 2 weeks.
My bad, it seems to actually be there: 748720e, not sure how I missed it. Thanks.
What steps will reproduce the bug?
The following code became slightly slower going from v12.15.0 to v12.16.1 (likely 12.16.0).
Additional information
I'm still not 100% sure this accounts for all of the slowdowns I'm seeing while looping through object keys, and I'm working on a further reduced test case, but the results here are very consistent.
Adding more items into the object makes things slower, seemingly linearly.
Note that this is quite similar to #31961; however, the fix on master in 3d894d0 resolves #31961 but not this issue.