docs(linux-perf): add tutorial of linux_perf #493

Merged 3 commits on Aug 25, 2021
67 changes: 63 additions & 4 deletions documentation/profiling/step2/using_linux_perf.md
@@ -1,13 +1,72 @@
# Using Linux Perf

Linux Perf provides low level CPU profiling with JavaScript, native and OS level frames.
[Linux Perf](https://perf.wiki.kernel.org/index.php/Main_Page) provides low-level CPU profiling with JavaScript, native, and OS-level frames.

**Important**: this tutorial is only available on Linux.

## How To

// TODO
Linux Perf is usually available through the `linux-tools-common` package. Starting a Node.js application with either `--perf-basic-prof` or `--perf-basic-prof-only-functions` enables support for _perf_events_.

> `--perf-basic-prof` always writes to a file (`/tmp/perf-PID.map`), which can lead to unbounded disk growth. If that is a concern, use either the module https://www.npmjs.com/package/linux-perf or `--perf-basic-prof-only-functions`.

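To keep an eye on the disk usage mentioned in the note, it can help to look for map files whose process is no longer running. The following is only a sketch: the helper name `stale_perf_maps` is hypothetical, and it assumes a Linux `/proc` filesystem and the `perf-PID.map` naming described above.

```sh
# Hypothetical helper: list perf map files in directory $1 (default /tmp)
# whose PID no longer corresponds to a running process.
stale_perf_maps() {
  dir="${1:-/tmp}"
  for f in "$dir"/perf-*.map; do
    [ -e "$f" ] || continue                 # the glob matched nothing
    pid="${f##*/perf-}"                     # strip the path and "perf-" prefix
    pid="${pid%.map}"                       # strip the ".map" suffix
    case "$pid" in
      ''|*[!0-9]*) continue ;;              # skip names that are not a PID
    esac
    [ -d "/proc/$pid" ] || echo "$f"        # no /proc entry: process is gone
  done
}

# Usage: stale_perf_maps /tmp
```
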
The main difference between the two is that `--perf-basic-prof-only-functions` produces less output, which makes it a viable option for production profiling.

```sh
# Launch the application and get the PID
node --perf-basic-prof-only-functions index.js &
[1] 3870
```
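
When this step is scripted, the PID does not need to be copied by hand, because the shell stores the PID of the most recent background job in `$!`. A minimal sketch, assuming a POSIX-compatible shell; the function name `start_profiled` and the variable `APP_PID` are made up for illustration:

```sh
# Hypothetical helper: start a script under Node.js with perf map support
# and remember the PID of the background process.
start_profiled() {
  node --perf-basic-prof-only-functions "$1" &
  APP_PID=$!   # $! holds the PID of the last background job
}

# Usage:
#   start_profiled index.js
#   echo "profiling target PID: $APP_PID"
```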

Then record events at the desired sampling frequency:

```sh
sudo perf record -F 99 -p 3870 -g
```

In this phase, you may want to run a load test against the application in order to generate more records for a reliable analysis. When the job is done, stop the perf process by sending a SIGINT (Ctrl-C) to the command.

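As an alternative to stopping the recording by hand, `perf record` can be given a command after `--` and will record for as long as that command runs. The sketch below uses `sleep` for that purpose; the helper name `record_for` and its 30-second default are assumptions chosen for illustration, and the flags are the same ones used in the command above (`-F` for the sampling frequency, `-p` for the target PID, `-g` for call graphs).

```sh
# Hypothetical helper: sample process $1 at 99 Hz for $2 seconds (default 30).
record_for() {
  sudo perf record -F 99 -p "$1" -g -- sleep "${2:-30}"
}

# Usage: record_for 3870 30
```
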
`perf record` writes the collected samples to a `perf.data` file in the current directory. Node.js, in turn, writes `/tmp/perf-PID.map` (in the example above: `/tmp/perf-3870.map`), which maps code addresses to JavaScript function names so that `perf` can label those frames.

To dump those results as text into a file, run:

```sh
sudo perf script > perfs.out
```

```sh
cat ./perfs.out
node 3870 25147.878454: 1 cycles:
ffffffffb5878b06 native_write_msr+0x6 ([kernel.kallsyms])
ffffffffb580d9d5 intel_tfa_pmu_enable_all+0x35 ([kernel.kallsyms])
ffffffffb5807ac8 x86_pmu_enable+0x118 ([kernel.kallsyms])
ffffffffb5a0a93d perf_pmu_enable.part.0+0xd ([kernel.kallsyms])
ffffffffb5a10c06 __perf_event_task_sched_in+0x186 ([kernel.kallsyms])
ffffffffb58d3e1d finish_task_switch+0xfd ([kernel.kallsyms])
ffffffffb62d46fb __sched_text_start+0x2eb ([kernel.kallsyms])
ffffffffb62d4b92 schedule+0x42 ([kernel.kallsyms])
ffffffffb62d87a9 schedule_hrtimeout_range_clock+0xf9 ([kernel.kallsyms])
ffffffffb62d87d3 schedule_hrtimeout_range+0x13 ([kernel.kallsyms])
ffffffffb5b35980 ep_poll+0x400 ([kernel.kallsyms])
ffffffffb5b35a88 do_epoll_wait+0xb8 ([kernel.kallsyms])
ffffffffb5b35abe __x64_sys_epoll_wait+0x1e ([kernel.kallsyms])
ffffffffb58044c7 do_syscall_64+0x57 ([kernel.kallsyms])
ffffffffb640008c entry_SYSCALL_64_after_hwframe+0x44 ([kernel.kallsyms])
....
```
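
Before reaching for a flamegraph, a quick text summary of the raw file can already hint at where time is spent. The sketch below counts how often each function shows up as the first frame listed for a sample. It is not part of `perf`; the helper name `top_leaf_frames` is hypothetical, and it assumes the layout shown above: a header line per sample, followed by frame lines of the form `address symbol+offset (module)`.

```sh
# Hypothetical helper: read `perf script` output on stdin and print, for each
# function, how many samples list it as their first frame.
top_leaf_frames() {
  awk '
    # Frame line: hex address, symbol, and a trailing "(module)".
    /^[ \t]*[0-9a-f]+[ \t]+.*\)[ \t]*$/ {
      if (want_leaf) {
        sym = $0
        sub(/^[ \t]*[0-9a-f]+[ \t]+/, "", sym)    # drop the address
        sub(/[ \t]+\([^()]*\)[ \t]*$/, "", sym)   # drop the module
        sub(/\+0x[0-9a-f]+$/, "", sym)            # drop the offset
        count[sym]++
        want_leaf = 0
      }
      next
    }
    # Any other non-empty line is treated as the header of a new sample.
    NF > 0 { want_leaf = 1 }
    END { for (s in count) print count[s], s }
  ' | sort -rn
}

# Usage: top_leaf_frames < perfs.out | head
```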

The raw output can be hard to read, so the raw file is typically used to generate flamegraphs for better visualization.


![Example nodejs flamegraph](https://user-images.githubusercontent.com/26234614/129488674-8fc80fd5-549e-4a80-8ce2-2ba6be20f8e8.png)

To generate a flamegraph from this result, follow [this tutorial](https://nodejs.org/en/docs/guides/diagnostics-flamegraph/#create-a-flame-graph-with-system-perf-tools) from step 6.

Review comment (member, author): I'm pointing to Node.js docs here to avoid duplication.


Because `perf` is not a Node.js-specific tool, its output might have issues with how JavaScript code is optimized in Node.js. See [perf output issues](https://nodejs.org/en/docs/guides/diagnostics-flamegraph/#perf-output-issues) for further reference.

## Useful Links

- https://nodejs.org/en/docs/guides/diagnostics-flamegraph/
- https://github.com/mmarchini/node-linux-perf
- https://www.brendangregg.com/blog/2014-09-17/node-flame-graphs-on-linux.html
- https://perf.wiki.kernel.org/index.php/Main_Page
- https://blog.rafaelgss.com.br/node-cpu-profiler

Review comment (member, author): Not sure if I can reference this blog post here, but it's pretty similar to the tutorial described here.