-
Notifications
You must be signed in to change notification settings - Fork 30k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Trace event support for Node.js #9304
Conversation
Awesome
Should we consider doing this by default? |
@AndreasMadsen I think that could be a good idea and it did generate useful information but I want to keep this PR focused on the core tracing system. I am hoping that instrumentation to generate trace events can be left to a follow on. |
Is it possible to add docs and tests? |
@cjihrig Working on tests now. Are the debugger docs the best place to put doc changes or should they go somewhere else? |
Good question. Do you think there will be enough content to warrant a separate Tracing documentation page? If not, then the Debugger docs or even guides might be an appropriate place. Others can weigh in here too. |
In the example above, trace config doesn't look complex enough to warrant a JSON config file. It can be difficult to arrange for files on disk to appear when wanting to configure node behaviour. I suggest the example JSON config file would be better expressed as
|
@cjihrig I've added an end to end test and put some documentation in the debugger docs. Happy to move it if there is a better place. @sam-github I've updated the description to list the full set of supported configuration options. Though I would anticipate included and excluded categories to be the options relevant to node, V8 does provide additional configuration. |
@@ -0,0 +1,29 @@ | |||
'use strict'; | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you include ../common
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
const cp = require('child_process'); | ||
|
||
const CODE = 'for (var i = 0; i < 100000; i++) { \"test\" + i }'; | ||
const FILE_NAME = 'node_trace.1.log'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The file should probably be written to common.tmpDir
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
const CODE = 'for (var i = 0; i < 100000; i++) { \"test\" + i }'; | ||
const FILE_NAME = 'node_trace.1.log'; | ||
|
||
fs.access(FILE_NAME, (err) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can use common.fileExists()
to avoid a level of indentation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
const proc = cp.spawn(process.execPath, | ||
[ '--enable-tracing', '-e', CODE ]); | ||
|
||
proc.once('exit', () => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add common.mustCall()
around the callback.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
@matthewloring I still think the configuration should be via CLI options. JSON should be avoided unless necessary, where necessary means "complex data structures", which this doesn't have, its a bunch of boolean or string flags. Its not even nice for dev use... typing CLI options is easy, seeing CLI docs in the help output is easy, having to write your options into a file for a one off trace run is not easy. Sure, do it for permanent config like .eslintrc, but ephemeral config like tracing? Doesn't make sense to me. I have tooling that allows node monitoring and tracing to be enabled in production, and I've been looking forward to the v8 trace lib getting integrated, pretty excited about this. Restarting node with new CLI options is easy, having to write my config into filesystem (somewhere, where?) is not convenient. There isn't even a way to guarantee its cleaned up afterwards. I can do it (well, assuming I'm not running in docker with a read-only FS I can do it), but its not a good way. |
proc.once('exit', common.mustCall(() => { | ||
assert(common.fileExists(FILE_NAME)); | ||
fs.readFile(FILE_NAME, (err, data) => { | ||
fs.unlinkSync(FILE_NAME); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tiniest of nits that can totally be ignored: No need to clean up after the test in the temp directory. Each test that needs to use the temp directory runs common.refreshTmpDir()
which will clean up anything there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed, thanks.
@sam-github That makes a lot of sense. I'm happy to introduce command line flags for included/excluded categories and get rid of the json configuration. I'm going to give it a little time before changing this in case anyone has other ideas on configuration approaches. |
Does this have any performance impact when off? |
@Fishrock123 The performance impact is minimal specially when off For every call site, we check if the category is enabled or disabled the first time the call site was visited, and this value is cached so that we do not have to do the expensive category check again. If tracing was off, then it was enabled: all cached references are updated. And we do that to avoid doing the category check every time we visit the call site. But that limits the ability to change the enabled categories when tracing is already on. Instead, we need to stop change, then start again. That being said, trace-events should not be used to track high frequency event*, sampling and runtime stats should be used instead (both are getting standardized in tracing). |
@fmeawad Ignoring the tight loop scenario, a single http request can make 12-18 async calls. A high speed proxy can process upwards of 30k reqs/sec. Which results in 360k-540k async calls/sec. This ignores sync calls that would also be traced (e.g. writes to a socket that are immediately flushed to the kernel). At that volume, what performance impact would we expect to see with this both enabled and disabled? |
On the side, this PR doesn't replace async hooks in any way. The two are complimentary. |
`--trace-config` flag: | ||
|
||
```txt | ||
node --enable-trace --trace-config=trace-config.json server.js |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
--enable-tracing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, fixed.
node --enable-trace --trace-config=trace-config.json server.js | ||
``` | ||
|
||
where `trace-config.json` may override any of the default configuration values: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
More clarification would be helpful here:
- Are the values below actually the defaults, or just examples?
- What are the valid values for
record_mode
? - What do each of those boolean flags mean?
- Are there any other known categories for the node/v8 engine? (I realize user code can add arbitrary tracing categories.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Configuration has been simplified to just specify enabled categories through a command line flag. Other configuration options are not useful at this time (only one record_mode
supported by the buffer, systrace not being used by node, etc). We are working on getting a list of categories compiled here: https://github.com/v8/v8/wiki/Tracing%20V8. Currently node does not introduce any categories of its own. The node
category introduced by this PR is just a placeholder and can be updated when the community decides how/where trace points should be added to core.
"enable_systrace": true, | ||
"enable_argument_filter": true, | ||
"included_categories": [ "v8", "node" ], | ||
"excluded_categories": [ ] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does excluding categories actually do anything? As far as I can see, TraceConfig::IsCategoryGroupEnabled()
ignores that excluded category list, and the list is not checked anywhere else either.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is used by Chrome since they specify enabled categories using regular expressions which can be refined by excluded categories. The node/v8 category parser doesn't support regexes so this isn't needed at this time.
std::string str((std::istreambuf_iterator<char>(fin)), | ||
std::istreambuf_iterator<char>()); | ||
TraceConfigParser::FillTraceConfig(env->isolate(), trace_config, | ||
str.c_str()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like the above code will crash if the trace config file is not found or has invalid contents. Is that acceptable? Or should there be a more helpful error message?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed.
@trevnorris According to my math, If you are using a single thread for that, an async call can take as low as 1.5us, a trace event can take up to 4us, therefore I do not recommend using trace events for that. The off overhead is going to be high as well, it will be ~10% here. But it seems this is an easy experiment to conduct, and that would be my recommendation here. |
@trevnorris Just for a data point, adding the trace events used to generate the graphs above to async-wrap (but with tracing disabled) caused no change in request latency and ~2% reduction in requests per second for a server with no logic: var server = http.createServer(function(req, res) {
res.end('Hello world');
}); The % performance impact would be even lower if the server did anything before responding. We will definitely need to be careful about how many trace points are added and where they go. |
bool TraceConfigParser::GetBoolean(Isolate* isolate, Local<Context> context, | ||
Local<Object> object, const char* property) { | ||
Local<Value> value = GetValue(isolate, context, object, property); | ||
if (value->IsNumber()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be:
if (value->IsNumber() || value->IsBoolean())
I'm not certain IsNumber()
is false for a boolean literal, but I'm assuming that's the case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed.
Mutex::ScopedLock stream_scoped_lock(stream_mutex_); | ||
if (total_traces_ >= kTracesPerFile) { | ||
total_traces_ = 0; | ||
// Destroying the member JSONTraceWriter object appends "]"\n"" to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"]}"
to be more precise. (And no \n
.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed, thanks!
@sam-github I've updated the category configuration to use the @jasongin Thank you for the review. This round of comments should all be addressed by the last commit. |
Node.js application. | ||
|
||
The set of categories for which traces are recorded can be specified using the | ||
`--enabled-categories` flag followed by a list of comma separated category names. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably the categories command-line option should have the word 'tracing' in it somehow, so that it's more obviously associated with tracing. Maybe --tracing-categories
?
And is there any way to help avoid confusion with the other command-line options starting with --trace-
, which are actually about stack traces?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, that's a good point. Maybe --enable-trace-events
and --trace-event-categories
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make sense to have them all start with --trace-event so its clear they go together. For example:
--trace-events-enabled
--trace-event-categories
Not sure how this fits with the way we use enabled/disabled for existing options but having all options related to trace-events start with the same prefix might make it easier to know they are related.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated the flags.
@matthewloring This isn't landing cleanly on v7.x. Mind submitting a backport PR? |
If this is backported please also include #10959. |
@evanlucas I opened a backport PR here: #11106. |
Adding possibly-gratuitous but hopefully-not-harmful |
ping @nodejs/lts ... do we want this in 4 and 6? |
The versions of V8 in 4 and 6 do not have the APIs that this change depends on. |
`node_trace.*.log` files are generated by `NodeTraceWriter`. Refs: nodejs#9304
`node_trace.*.log` files are generated by `NodeTraceWriter`. Refs: nodejs#9304 PR-URL: nodejs#12754 Reviewed-By: Сковорода Никита Андреевич <chalkerx@gmail.com> Reviewed-By: James M Snell <jasnell@gmail.com> Reviewed-By: Anna Henningsen <anna@addaleax.net> Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Checklist
make -j8 test
(UNIX), orvcbuild test nosign
(Windows) passesAffected core subsystem(s)
src
Description of change
Background: nodejs/diagnostics#30, nodejs/diagnostics#53
This PR adds support for trace-event tracing to Node.js. It provides a mechanism to centralize tracing information generated by V8, Node core, and userspace code. The PR contains a few components:
Generating Trace Events
This PR introduces macros for recording trace events. Synchronous operations can be traced by wrapping them with the appropriate macros:
Asynchronous operations can be traced using:
An example of async traces can be seen below. The example was generated by integrating the above macros into Async-Wrap.
Additional examples as well as a complete list of trace macros can be found in
src/tracing/trace_event.h
. This PR does not introduce any JS interface for generating events or tracing of Node core with the hope that the community can determine how the instrumentation of core should be approached and how this functionality can be exposed to end users.Viewing Trace Events
The current trace serialization outputs traces in a format that can be visualized by navigating to
chrome://tracing
in Chrome and loading a trace output file.We can zoom in on this visualization to view VM events alongside the opening and closing of TCP connections:
Usage
Tracing can be enabled by running:
By default, this will record all events in the
"v8"
and"node"
categories. To trace other categories or ignore either of these categories, use the--trace-event-categories
flag:This PR is the result of a lot of hard work by @misterpoe, @kjin, @lpy, @fmeawad, and @ofrobots.
Thanks for the review!
/cc @nodejs/diagnostics