
Node.js Foundation Diagnostics WorkGroup Meeting 2019-09-11 #326

Closed

mhdawson opened this issue Sep 9, 2019 · 15 comments

@mhdawson
Member

mhdawson commented Sep 9, 2019

Time

UTC Wed 11-Sep-2019 17:30 (05:30 PM):

Timezone Date/Time
US / Pacific Wed 11-Sep-2019 10:30 (10:30 AM)
US / Mountain Wed 11-Sep-2019 11:30 (11:30 AM)
US / Central Wed 11-Sep-2019 12:30 (12:30 PM)
US / Eastern Wed 11-Sep-2019 13:30 (01:30 PM)
London Wed 11-Sep-2019 18:30 (06:30 PM)
Amsterdam Wed 11-Sep-2019 19:30 (07:30 PM)
Moscow Wed 11-Sep-2019 20:30 (08:30 PM)
Chennai Wed 11-Sep-2019 23:00 (11:00 PM)
Hangzhou Thu 12-Sep-2019 01:30 (01:30 AM)
Tokyo Thu 12-Sep-2019 02:30 (02:30 AM)
Sydney Thu 12-Sep-2019 03:30 (03:30 AM)

Or in your local time:

Links

Agenda

Extracted from diag-agenda labelled issues and pull requests from the nodejs org prior to the meeting.

nodejs/node

  • Async Hooks do not recognize execution contexts created when processing thenables #22360

nodejs/diagnostics

  • proposal: --inspect-store flag #298
  • Proposal to drive Diagnostics WG initiatives through user journeys #295
  • Perfetto in Node.js #277
  • Diagnostics "Best Practices" Guide? #211
  • [async_hooks] stable API - tracking issue #124

Invited

  • Diagnostics team: @nodejs/diagnostics

Observers/Guests

Notes

The agenda comes from issues labelled with diag-agenda across all of the repositories in the nodejs org. Please label any additional issues that should be on the agenda before the meeting starts.

Joining the meeting

mhdawson self-assigned this Sep 9, 2019
@slonka

slonka commented Sep 9, 2019

Hi everyone! I would like to discuss the state of CPU profiling in production. Here is a summary and some code samples of the things I've tried: https://github.com/slonka/nodejs-production-ready-profiling

I'm tagging @mmarchini since he's listed as the CPU Profiling champion here: https://github.com/nodejs/diagnostics#current-initiatives

@vmarchaud
Contributor

@slonka I'm interested in why you put a warning against using the inspector in production, other than that it's considered experimental? I've been using it in production for a few years without issue, and it's the same C++ API that pprof-node uses too.

@mmarchini
Contributor

Just for context, the inspector module (require('inspector')) is experimental. The Inspector Protocol and the V8 CPU Profiler are considered stable and production-safe afaik (cc @hashseed @psmarshall @eugeneo can you confirm this?)
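For reference, a minimal sketch (not from this thread) of taking a CPU profile programmatically through the inspector module; the output file name, the 1000 µs sampling interval, and the 10-second duration are illustrative:

```js
// Minimal sketch: profile the current process for 10 seconds and write the
// result to a .cpuprofile file that Chrome DevTools can open.
const inspector = require('inspector');
const fs = require('fs');

const session = new inspector.Session();
session.connect();

session.post('Profiler.enable', () => {
  // Sampling interval in microseconds (illustrative value).
  session.post('Profiler.setSamplingInterval', { interval: 1000 }, () => {
    session.post('Profiler.start', () => {
      setTimeout(() => {
        session.post('Profiler.stop', (err, res) => {
          if (!err) fs.writeFileSync('./app.cpuprofile', JSON.stringify(res.profile));
          session.disconnect();
        });
      }, 10000);
    });
  });
});
```

Note that this only touches the Profiler domain; the Debugger domain is never enabled.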

@slonka

slonka commented Sep 10, 2019

I'm interested in why you put a warning against using the inspector in production, other than that it's considered experimental?

Because it's experimental and I did not check if it collects the same information as others.

I've been using it for few years in production without issue

Can you share some info about the overhead? I was worried that enabling the inspector interface would cause a large performance hit.

it's the same C++ API that's used by pprof-node too

Correct me if I'm wrong, but if it used the same C++ API then I guess it would not work with worker-threads, right?


@mmarchini what about native frames? Has anything changed since #148 (comment)?

Also I've got an app that shows the following summary when run with --prof:

 [Summary]:
   ticks  total  nonlib   name
   5558    0.8%   80.2%  JavaScript
      0    0.0%    0.0%  C++
   8448    1.2%  121.9%  GC
  717022   99.0%          Shared libraries
   1373    0.2%          Unaccounted
 [Shared libraries]:
   ticks  total  nonlib   name
  636589   87.9%          /lib/x86_64-linux-gnu/libc-2.23.so

Is there any other way than using perf to see which functions are called in libc-2.23.so?


What's the state of "Best Practices" Guide in terms of profiling?

@gireeshpunathil
Member

@slonka:

What's the state of "Best Practices" Guide in terms of profiling?

Still a TODO. We have been aiming to gather consensus on the overall structure and layout of the best-practices content, but unfortunately that was not completed.

Feel free to contribute to profiling best practices, if you are willing and have the bandwidth!

@slonka

slonka commented Sep 10, 2019

@gireeshpunathil

Feel free to contribute to profiling best practices, if you are willing and have the bandwidth!

Is this the right repo https://github.com/goldbergyoni/nodebestpractices#7-performance-best-practices ?

@vmarchaud
Contributor

@slonka

Can you share some info about the overhead? I was worried that enabling the inspector interface would cause a large performance hit.

I didn't even see changes when the Profiler domain was enabled, and while it's running the sampling is done on a separate thread, so again no impact. The only cost is when the C++ side calls the JavaScript callback with the profile result, which is pretty minor.

Correct me if I'm wrong, but if it used the same C++ API then I guess it would not work with worker-threads, right?

The problem is not the C++ API but the way the inspector works. Each worker has its own V8 instance, so you need to propagate the Profiler.start request to each of them and collect the result from the right V8 instance. I believe some work has been done (there is an inspector domain for that: NodeWorker), but I didn't follow exactly what is working and what isn't.
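For illustration, a minimal sketch of that propagation idea (not from this thread, and assuming your Node version supports the inspector module inside workers): each worker profiles its own isolate with its own inspector.Session and ships the result back over normal worker messaging. The message names here are made up.

```js
// worker.js (hypothetical): each worker has its own V8 isolate, so it runs
// its own inspector session and profiles itself on request.
const { parentPort } = require('worker_threads');
const inspector = require('inspector');

const session = new inspector.Session();
session.connect();
session.post('Profiler.enable');

parentPort.on('message', (msg) => {
  if (msg === 'profiler:start') {
    session.post('Profiler.start');
  } else if (msg === 'profiler:stop') {
    session.post('Profiler.stop', (err, res) => {
      // Ship this worker's profile back to the main thread for aggregation.
      parentPort.postMessage({ type: 'profile', profile: err ? null : res.profile });
    });
  }
});
```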

Is there any other way than using perf to see which functions are called in libc-2.23.so?

Not to my knowledge, and I'm pretty sure V8 will never be able to know that information without instrumenting around the function call itself, which is exactly what perf is supposed to do (at the kernel level).

@mmarchini
Contributor

@mmarchini what about native frames? Has anything changed since #148 (comment)?

No changes in that regard. If you also need C++ frames, the best options are --prof or a system-level profiler (Linux perf, dtrace, Instruments, etc.). Bear in mind that --prof won't give you all C++ frames (although I don't remember which ones it will get).

I was worried that enabling the inspector interface would cause a large performance hit.

The Inspector Protocol was created to be production-safe. It also separates features into Domains. The only domain that affects performance (afaik) is the Debugger domain, but you won't want to use it in production in most cases anyway.

@slonka

slonka commented Sep 10, 2019

@mmarchini So right now the best option to get the complete picture is perf + the --perf-basic-prof flag, right?

A couple of questions about --prof:

  1. How hard would it be to implement starting and stopping --prof at runtime? (see Node CPU Profiling Roadmap #148 (comment)). Doing canary deployments just to turn on --prof seems to be overkill.
  2. Another problem is that the log grows quite quickly. How hard would it be to implement log rotation of the --prof output (based on some event like SIGUSR2)?
  3. How hard would it be to add a sampling interval option to --prof (something like Profiler.setSamplingInterval in the inspector)?

The only domain that affects performance (afaik) is the Debugger domain, but you won't want to use it in production in most cases anyway.

I think that sampling profilers have low overhead but I do not think they are free. I'll try to create a benchmark in my repo.
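For what it's worth, a benchmark along those lines could be as simple as the following sketch (the workload and iteration count are illustrative, and this is not from the repo): time a CPU-bound loop with and without an active CPU profile.

```js
// Hypothetical micro-benchmark: compare a CPU-bound workload with and without
// the V8 CPU profiler running.
const inspector = require('inspector');

function workload() {
  let acc = 0;
  for (let i = 0; i < 1e8; i++) acc += Math.sqrt(i);
  return acc;
}

function time(label) {
  const start = process.hrtime.bigint();
  workload();
  const ms = Number(process.hrtime.bigint() - start) / 1e6;
  console.log(`${label}: ${ms.toFixed(1)} ms`);
}

time('baseline');

const session = new inspector.Session();
session.connect();
session.post('Profiler.enable', () => {
  session.post('Profiler.start', () => {
    time('with profiler');
    session.post('Profiler.stop', () => session.disconnect());
  });
});
```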

@Flarna
Member

Flarna commented Sep 10, 2019

I have seen quite high memory overhead with the V8 CpuProfiler if the profiled application has a lot of functions (really a lot...).
There were significant improvements in the latest Node.js versions, but depending on the application the overhead may still be significant.
For the same cases, starting the CpuProfiler also takes quite long, and it is a synchronous call, so it may impact your application.
The root cause for both is that the profiler has to copy some information (e.g. function locations) from the JS thread to the profiling thread to allow mapping of instruction pointers to functions. This information is not much per function, but if there are a lot of functions it sums up.

But I'm quite sure that most applications are not affected by this.

see nodejs/node#23070 (comment)

@gireeshpunathil
Member

@slonka - no, I meant the best practices initiative that is part of this (Node.js Diagnostics) working group itself: #211 and #254

@mhdawson
Member Author

Pinging Peter to see if he's starting the meeting

@mhdawson
Member Author

Seems to have started; it must be the way I tried to join.

@gireeshpunathil
Member

Were minutes captured for this meeting? If so, can we close this?

@gireeshpunathil
Member

meeting happened, closing
