-
Notifications
You must be signed in to change notification settings - Fork 7.3k
deps: don't busy loop in v8 cpu profiler thread #25268
deps: don't busy loop in v8 cpu profiler thread #25268
Conversation
Backport 6964a9e0685fa186d9d9b7907be17505e839db1a from upstream v8. Original commit message: Make CPU profiler do not hog 100% of CPU. Tick event processor should not stay in a tight loop when there's nothing to do. It can go sleep until next sample event. LOG=N BUG=v8:3967 Committed: https://crrev.com/6964a9e0685fa186d9d9b7907be17505e839db1a Cr-Commit-Position: refs/heads/master@{#28211} Fixes nodejs#25137 Related: nodejs#9439, nodejs#8789
LGTM, thank you! |
@tunniclm Testing this PR, I cannot even reproduce the 100% CPU issue in the first place. I've tried to reproduce it by running:
without success. How do you reproduce this problem? |
@misterdjules The problem is with the CpuProfiler API, which is a native V8 API. The problem can be reproduced via a Node add on that starts the profiler in this way. The --prof option is not affected as the code path is different with that option (--prof writes events/samples to file, whereas CpuProfiler does runtime processing of the events between samples to keep an up-to-date mapping of code addresses.) |
An example addon to reproduce the problem (for Node 0.12 / v8 3.28), start_profiler.cc:
With binding.gyp:
And test.js:
|
lgtm |
@tunniclm I just noticed that there is no regression test in the changes from V8 upstream. I think it would be helpful if we could come up with a regression test, as it would prevent us from breaking that again. We would probably need to write it in a way similar to the tests in I'm OK to land this without a regression test for now as such a regression would not be a severe issue, but I think it would help us to have a regression test sooner than later. |
@tunniclm Also, thank you very much for the more detailed information. I was able to reproduce the problem and verify the fix in this PR, so LGTM. |
Triggered node-accept-pull-request to land this PR, so it should be merged in v0.12 soon. Thank you @tunniclm! |
Backport 6964a9e0685fa186d9d9b7907be17505e839db1a from upstream v8. Original commit message: Make CPU profiler do not hog 100% of CPU. Tick event processor should not stay in a tight loop when there's nothing to do. It can go sleep until next sample event. LOG=N BUG=v8:3967 Committed: https://crrev.com/6964a9e0685fa186d9d9b7907be17505e839db1a Cr-Commit-Position: refs/heads/master@{#28211} Fixes nodejs#25137 Related: nodejs#9439, nodejs#8789 PR: nodejs#25268 PR-URL: nodejs#25268 Reviewed-By: Michael Dawson <michael_dawson@ca.ibm.com> Reviewed-By: Julien Gilli <jgilli@fastmail.fm>
@joyent/node-coreteam The node-accept-pull-request Jenkins job keeps failing because I made the decision to move forward with this and land this PR anyway since it was the only test failing, and I was able to make sure the test actually passes on two different Windows systems (2012 server and Windows 7). |
FWIW, also added a reference to this floating patch in the documentation on our V8 upgrade process in the Wiki. |
@misterdjules I have a test that could work, but it currently has a dependency on an npm module for measuring CPU usage. Do the tests in test/addons need to be self-contained, without external dependencies? |
@tunniclm Currently the only tests with external dependencies that I'm aware of are the ones in These tests are not run automatically by the CI platform when landing pull requests. I think the reason for that are:
What is the npm module that you'd need to use in this case? If we can't manage to write this test without a dependency on a npm module, we could explore different options:
We might need to reorganize the tests directory a bit. @tunniclm Thoughts? |
@misterdjules At present, the test relies on the usage module to gather the process CPU usage. I just ran the test on Windows and found usage doesn't support Windows. Having said that, the high CPU issue is not solved on Windows anyway. |
@tunniclm Given that the high CPU usage issue is not solved on Windows, I would think that the best solution currently would be to:
It seems that adding Windows support to https://github.com/arunoda/node-usage could be reasonably simple, so that approach would work on Windows too when node-usage supports Windows CPU usage gathering. Before moving on though, I would like to get some feedback from @joyent/node-tsc and especially @orangemocha and @mhdawson because that's a slight departure from what has been done in tests run automatically by the CI platform in the past. |
@misterdjules arguably we need a better way of defining test suites than just using subdirectories. Our test runner already supports some basic predicates to specify the expected test outcomes based on the run environment. We currently use this for defining flaky tests (see https://github.com/joyent/node/blob/v0.12/test/internet/internet.status#L5). One of the possible outcomes is SKIP, which means that the test is not run. By adding an environment var 'runtype' we might be able to use these status files to configure what gets run in CI versus other test run types. But I think it would be best to defer these changes till after we have reconciled CI with io.js. So the directory restructuring you suggest sounds reasonable as an interim solution. Regarding the npm rebuild phase, can that not be encapsulated in the test itself? |
@orangemocha The npm rebuild phase could be encapsulated in the test itself, but it seems that even if abstracting something like I don't have a very strong opinion though, if doing the Thank you! |
@misterdjules if it can be implemented in a reusable way in |
@misterdjules in respect to node-usage I'd want to check its behaviour on pLinuz, zLinux and AIX |
@ misterdjules is there some existing doc which describes the different test "buckets" and when we believe each set should be run ? A table might be most useful that lists the bucket (directory name right now) when it run (part of make test, at PR accept time, or as an extended test etc.) That would help me to respond to your question As I understand it now the main difference is that we'd add running test/external to what the CI runs by default ? |
@mhdawson There is no documentation about that, it would be great to have it and link it from the contributing guide, both on nodejs.org and in Yes, my suggestion is to add tests in Currently, "quickly" means "in less than 60 seconds". I'm not sure if |
Backport 6964a9e0685fa186d9d9b7907be17505e839db1a from upstream v8.
Original commit message:
Fixes #25137
Related: #9439, #8789