doc: include poor-performance diagnostic #4928

Merged
merged 1 commit into from
Nov 15, 2022
1 change: 1 addition & 0 deletions locale/en/docs/guides/diagnostics/index.md
@@ -17,5 +17,6 @@ This is the available set of diagnostics guides:

* [Memory](/en/docs/guides/diagnostics/memory)
* [Live Debugging](/en/docs/guides/diagnostics/live-debugging)
* [Poor Performance](/en/docs/guides/diagnostics/poor-performance)

[Diagnostics Working Group]: https://github.com/nodejs/diagnostics
36 changes: 36 additions & 0 deletions locale/en/docs/guides/diagnostics/poor-performance/index.md
@@ -0,0 +1,36 @@
---
title: Poor Performance - Diagnostics
layout: docs.hbs
---

# Poor Performance

This document explains how to profile a Node.js process.

* [Poor Performance](#poor-performance)
  * [My application has a poor performance](#my-application-has-a-poor-performance)
    * [Symptoms](#symptoms)
    * [Debugging](#debugging)

## My application has a poor performance

### Symptoms

My application's latency is high, and I have already confirmed that the bottleneck
is not in my dependencies, such as databases and downstream services. So I suspect that
my application spends significant time running code or processing information.

You are satisfied with your application's performance in general but would like to
understand which parts of your application can be improved to run faster or more
efficiently. This can be useful when you want to improve the user experience or save
computation costs.

### Debugging

In this use case, we are interested in the pieces of code that use more CPU cycles than
the others. When we do this locally, we usually try to optimize our code.

This document provides two simple ways to profile a Node.js application:

* [Using V8 Sampling Profiler](https://nodejs.org/en/docs/guides/simple-profiling/)
* [Using Linux Perf](/en/docs/guides/diagnostics/poor-performance/using-linux-perf)
@@ -0,0 +1,87 @@
---
title: Poor Performance - Using Linux Perf
layout: docs.hbs
---

# Using Linux Perf

[Linux Perf](https://perf.wiki.kernel.org/index.php/Main_Page) provides low-level CPU profiling with JavaScript,
native, and OS-level frames.

**Important**: this tutorial applies only to Linux.

## How To

Linux Perf is usually available through the `linux-tools-common` package. Starting a Node.js application with either
`--perf-basic-prof` or `--perf-basic-prof-only-functions` makes it support _perf_events_.

`--perf-basic-prof` always writes to a file (`/tmp/perf-PID.map`), which can lead to unbounded disk growth.
If that's a concern, use either the [linux-perf](https://www.npmjs.com/package/linux-perf) module
or `--perf-basic-prof-only-functions`.
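
To keep an eye on that growth, a small script can report the size of the map file for a given PID. This is a minimal sketch: the helper names `mapFilePath` and `mapFileSize` and the file name `map-size.js` are hypothetical, and the only detail taken from this guide is the `/tmp/perf-PID.map` naming.

```javascript
'use strict';
const fs = require('node:fs');

// Path of the map file written for a given PID (naming taken from this guide).
function mapFilePath(pid) {
  return `/tmp/perf-${pid}.map`;
}

// Size of the map file in bytes, or null if no map file exists for that PID.
function mapFileSize(pid) {
  try {
    return fs.statSync(mapFilePath(pid)).size;
  } catch (err) {
    if (err.code === 'ENOENT') return null;
    throw err;
  }
}

// Hypothetical usage: node map-size.js 3870
if (process.argv[2]) {
  const pid = Number(process.argv[2]);
  const size = mapFileSize(pid);
  console.log(
    size === null
      ? `no map file for PID ${pid}`
      : `${mapFilePath(pid)}: ${size} bytes`
  );
}
```

Running it periodically while the application is under load gives a rough idea of how fast the file grows.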

The main difference between the two flags is that `--perf-basic-prof-only-functions` produces less output, which makes
it a viable option for production profiling.

```console
# Launch the application and get the PID
$ node --perf-basic-prof-only-functions index.js &
[1] 3870
```
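
If you do not have an application at hand, a deliberately CPU-bound `index.js` along these lines gives the profiler something to show. This is only an illustrative sketch: the `fibonacci` function and the `DEMO_PORT` environment variable are assumptions made for this example, not part of Node.js or of this guide.

```javascript
'use strict';
const http = require('node:http');

// Deliberately slow recursive Fibonacci: it burns CPU so that it
// stands out in the recorded samples.
function fibonacci(n) {
  return n < 2 ? n : fibonacci(n - 1) + fibonacci(n - 2);
}

// Listen only when DEMO_PORT (a hypothetical variable) is set, so the
// file can also be loaded without starting a server.
if (process.env.DEMO_PORT) {
  http
    .createServer((req, res) => {
      // Every request triggers the expensive computation.
      res.end(`${fibonacci(30)}\n`);
    })
    .listen(Number(process.env.DEMO_PORT));
}
```

With this sketch, the launch command would become `DEMO_PORT=3000 node --perf-basic-prof-only-functions index.js &`.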

Then record events at the desired frequency:

```console
$ sudo perf record -F 99 -p 3870 -g
```

In this phase, you may want to run a load test against the application to generate more records for a reliable
analysis. When the job is done, stop the perf process by sending a SIGINT (Ctrl-C) to the command.

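`perf record` writes the collected samples to a `perf.data` file in the current directory. Node.js, in turn, writes a
map file inside the `/tmp` folder, usually called `/tmp/perf-PID.map` (in the example above: `/tmp/perf-3870.map`),
which `perf` uses to resolve the names of the JavaScript functions called.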

To aggregate those results into a single file, execute:

```console
$ sudo perf script > perfs.out
```

```console
$ cat ./perfs.out
node 3870 25147.878454: 1 cycles:
ffffffffb5878b06 native_write_msr+0x6 ([kernel.kallsyms])
ffffffffb580d9d5 intel_tfa_pmu_enable_all+0x35 ([kernel.kallsyms])
ffffffffb5807ac8 x86_pmu_enable+0x118 ([kernel.kallsyms])
ffffffffb5a0a93d perf_pmu_enable.part.0+0xd ([kernel.kallsyms])
ffffffffb5a10c06 __perf_event_task_sched_in+0x186 ([kernel.kallsyms])
ffffffffb58d3e1d finish_task_switch+0xfd ([kernel.kallsyms])
ffffffffb62d46fb __sched_text_start+0x2eb ([kernel.kallsyms])
ffffffffb62d4b92 schedule+0x42 ([kernel.kallsyms])
ffffffffb62d87a9 schedule_hrtimeout_range_clock+0xf9 ([kernel.kallsyms])
ffffffffb62d87d3 schedule_hrtimeout_range+0x13 ([kernel.kallsyms])
ffffffffb5b35980 ep_poll+0x400 ([kernel.kallsyms])
ffffffffb5b35a88 do_epoll_wait+0xb8 ([kernel.kallsyms])
ffffffffb5b35abe __x64_sys_epoll_wait+0x1e ([kernel.kallsyms])
ffffffffb58044c7 do_syscall_64+0x57 ([kernel.kallsyms])
ffffffffb640008c entry_SYSCALL_64_after_hwframe+0x44 ([kernel.kallsyms])
....
```

The raw output can be hard to read, so it is typically used to generate flamegraphs for better
visualization.

![Example nodejs flamegraph](https://user-images.githubusercontent.com/26234614/129488674-8fc80fd5-549e-4a80-8ce2-2ba6be20f8e8.png)

To generate a flamegraph from this result, follow [this tutorial](https://nodejs.org/en/docs/guides/diagnostics-flamegraph/#create-a-flame-graph-with-system-perf-tools)
from step 6.
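
To get a feel for what happens between the raw `perf script` output and a flamegraph, the stacks can be folded into one line per unique call stack with a sample count. The sketch below is a simplified illustration of that folding step, not the tool used in the linked tutorial. The function name `foldStacks` and the file name `fold.js` are hypothetical, and it assumes that samples are separated by blank lines and that each frame line has the `address symbol (module)` shape shown above.

```javascript
'use strict';
const fs = require('node:fs');

// Fold `perf script` text into a Map of "comm;root;...;leaf" -> sample count.
function foldStacks(text) {
  const counts = new Map();
  // Assumption: each sample is a header line followed by frame lines,
  // and samples are separated by blank lines.
  for (const block of text.split(/\n\s*\n/)) {
    const lines = block.split('\n').map((l) => l.trim()).filter(Boolean);
    if (lines.length < 2) continue; // no frames in this block
    const [header, ...frames] = lines;
    const comm = header.split(/\s+/)[0]; // process name, e.g. "node"
    const names = [];
    for (const frame of frames) {
      // Frame shape: "<hex address> <symbol> (<module>)"
      const m = /^[0-9a-f]+\s+(.+?)\s+\(.*\)$/.exec(frame);
      if (!m) continue; // skip lines that are not frames
      names.push(m[1].replace(/\+0x[0-9a-f]+$/, '')); // drop the "+0x..." offset
    }
    if (names.length === 0) continue;
    // perf prints the innermost frame first; folded stacks go root to leaf.
    const key = [comm, ...names.reverse()].join(';');
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  return counts;
}

// Hypothetical usage: node fold.js perfs.out
if (process.argv[2]) {
  const counts = foldStacks(fs.readFileSync(process.argv[2], 'utf8'));
  for (const [stack, count] of counts) {
    console.log(`${stack} ${count}`);
  }
}
```

Stacks with the highest counts are the ones that were on the CPU most often while sampling, which is what the widest bars in a flamegraph represent.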

Because `perf` is not a Node.js-specific tool, it might have issues with how JavaScript code is optimized in
Node.js. See [perf output issues](https://nodejs.org/en/docs/guides/diagnostics-flamegraph/#perf-output-issues) for
further reference.

## Useful Links

* https://nodejs.org/en/docs/guides/diagnostics-flamegraph/
* https://www.brendangregg.com/blog/2014-09-17/node-flame-graphs-on-linux.html
* https://perf.wiki.kernel.org/index.php/Main_Page
* https://blog.rafaelgss.com.br/node-cpu-profiler