Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add first of the new diagnostics articles to guides #1444

Merged
merged 5 commits into from
Feb 20, 2019

Conversation

naugtur
Copy link
Contributor

@naugtur naugtur commented Nov 3, 2017

This is the first one of a few guides I would like to propose to add here.
There's more: https://github.com/naugtur/node-diagnostics-howtos

The goal is to make knowledge about diagnostics tools more approachable and gather it in one place.

I'm setting up this PR to discuss how we want to do this. Please share thoughts on the format in general.
As discussed in recent @nodejs/diagnostics meeting, we'd like to proceed with more of those guides, all of them written in a similar fashion.

Related: nodejs/community-committee#163

Solaris vms are no longer needed for flame graphs on linux!

1. install lldb in your system (eg. with apt-get)
1. install llnode according to readme in [github project](https://github.com/nodejs/llnode)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not 100% sure entire lldb and llnode installation is necessary, maybe a subset would suffice? I had trouble getting a flame graph without them, but maybe it's about one of the dependencies.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that I don't think we need either of those 2 specifically. I think its just the perf output that is being used.


Wait, what is that `sleep 3` for? It's there to keep the perf running - despite `-p` option pointing to a different pid, the command needs to be executed on a process and end with it. Just a convenient way to keep perf going for a given time.

Why is `-F` (profiling frequency) set to 99? I honestly don't know, Netflix guys used that value. Results look good. You can adjust if you want.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Too honest. This part will have to change. On the other hand, I think we should provide a reasonable default here instead of a lengthy explanation of what -F really does and how to decide on a value.


## Examples

See a real flame graph - download the file from /samples/flame-graph-example.html
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How are other resources stored? This is a path from https://github.com/naugtur/node-diagnostics-howtos. This guide should have a link to an example. nodejs/nodejs.org repo doesn't seem to be a place for those.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An external link should work.


# Flame Graphs

## What's a flame graph useful for?
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Explaining why before explaining how - part of my intended pattern for those guides.


1. install lldb in your system (eg. with apt-get)
1. install llnode according to readme in [github project](https://github.com/nodejs/llnode)
1. install linux-tools-common (eg. with apt-get)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It sounds like this might be to get perf. If so it would be good to say that specifically so that we know why we need each of the pieces that we are asking people to install.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, I'll update it. thanks.

@mhdawson
Copy link
Member

mhdawson commented Nov 3, 2017

We probably need to give a heads up on nodejs/benchmarking#168 as well as it seems 8.x has issues due to the pipeline change.

1. install stackvis `npm i -g stackvis`
1. run node with perf enabled `perf record -e cycles:u -g -- node --perf-basic-prof app.js`
1. disregard warnings unless they're saying you can't run perf due to missing packages; you may get some warnings about not being able to access kernel module samples which you're not after anyway.
1. run `perf script`, but pipe it through some cleanup: `perf script | egrep -v "( __libc_start| LazyCompile | v8::internal::| Builtin:| Stub:| LoadIC:|\[unknown\]| LoadPolymorphicIC:)" | sed 's/ LazyCompile:[*~]\?/ /' > perfs.out` [explanation](#about-perf-script-filtering)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be good to have the part you want to cut and paste on its own line. In this case it I included the 'explanation' part in my cut and paste and of course if failed. So while people can figure that out making it easier to get it right the first time makes sense to me.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 on a code block, and I think it's worth repeating the whole perf line:

1. Run your script through `perf` again, removing V8 internals and redirecting output to a file.
```bash
perf record -e cycles:u -g -- node --perf-basic-prof app.js > perfs.out
```

I'd suggest moving the grep and sed into your paragraph below. I think it's a good idea, and worth mentioning, but it makes the initial example more complex, and as you note later it isn't always useful.

@mhdawson
Copy link
Member

mhdawson commented Nov 3, 2017

Overall this is a great doc. I was able to follow it and get flamegraphs out easily within 5-10 mins.

My main suggestions would be a bit more explanation about what is going on in each of the steps. Having said that you already have some of that later on so maybe I just needed to read further down. Maybe a line up front that says 'if you want to understand each step better take a look at the sections that follow were we go into more detail'

1. install linux-tools-common (eg. with apt-get)
1. try running `perf` - it might complain about missing kernel modules, install them too
1. install stackvis `npm i -g stackvis`
1. run node with perf enabled `perf record -e cycles:u -g -- node --perf-basic-prof app.js`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the number at the beginning reflect a step in the process. If so they should be incremented at every step.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's how markdown works. Generated HTML will have an ordered list here.
I could switch to incremental so the MD file is more readable, but it's not meant to be read without rendering.

Copy link
Member

@lpinca lpinca Nov 4, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't know 😮. Always used 1. 2. 3. .... Thanks.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tend to spend a lot of time reading markdown files in the raw (not least in PR review), so I'd appreciate incremental numbering.

@lpinca
Copy link
Member

lpinca commented Nov 4, 2017

It might make sense to mention 0x which makes creating flames graph very easy.

@naugtur
Copy link
Contributor Author

naugtur commented Nov 4, 2017

@lpinca 0x looks great, thanks for mentioning it.

I'm not sure if showing a one step opinionated tool is helpful for the educational purposes. Maybe a separate sections with links to all tools suits this article, but it's a maintenance cost to curate the list and keep adding new and removing deprecated.

It's just my opinion, let's wait and see what others think.

@fhemberger
Copy link
Contributor

@naugtur Hi, what's the status of this PR? Is something still missing? Do you need help/feedback?

@fhemberger fhemberger self-assigned this Dec 26, 2017
@fhemberger fhemberger removed their assignment Dec 27, 2017
@naugtur
Copy link
Contributor Author

naugtur commented Dec 28, 2017

Hi,

I'm going to address the minor issues above as soon as I get out of the newborn baby situation at home, sorry for the delays, there's been complications.

From my point of view this work is ready for publication, but I'm not even a native speaker. I'm looking forward to editing/style support as discussed in nodejs/community-committee#163

Oh, and nodejs/benchmarking#168 is a serious issue with this guide unfortunately. It will need a section added at least.

@fhemberger
Copy link
Contributor

@naugtur Congratulations and all the best for your kid! 🎉

@fhemberger
Copy link
Contributor

fhemberger commented Dec 28, 2017

/cc @nodejs/diagnostics @nodejs/benchmarking PTAL

@naugtur
Copy link
Contributor Author

naugtur commented Dec 28, 2017

I just pushed some tweaks I had laying around and a stub of the section explaining 8.x V8 issues. It's likely not good enough, but should get us started.


## How to create a flame graph

You might have heard creating a flame graph for Node.js is difficult, but that's not true (anymore).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I support this a lot as a technical documentation for how to make a flame graph using only basic primitives. However, I wouldn't call this easy. I someone asked me how to make flame graphs easily I would defer to 0x.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hehe, easy in comparison. It's a second mention of 0x, so it's time I added a mention. thx

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

someone who hasn't heard its difficult now has, maybe this will make people nervous before they even started? I third the 0x reference


Why do I need them at all?

Well, without these options you'll still get a flame graph, but with most bars labeled `v8::Function::Call`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be more detailed. They contain a map between the code addresses and the JS function names, collected by walking stack. I have not tried it, but do you really get more v8::Function::Call frames? I would suspect that you simply don't see the JS function frames, as pref doesn't know how to resolve the symbols for those code address.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did generate a flame graph with all javascript functions labeled v8::Function::Call if I didn't pass --perf-* - that's what it does.

--perf-basic-prof is probably doing more than just exposing the function names from JS, but I don't know enough about it to elaborate.

@mike-kaufman
Copy link

@naugtur - per nodejs/community-committee#163 discussion, can you add docs and help-wanted labels to this as a start?

@fhemberger
Copy link
Contributor

@naugtur What's the status of this PR? Is something still missing or are we already discussing details and enhancements?

I'd love to get this merged, if there's nothing blocking, we still can iron out little nits in subsequent PRs.

Copy link
Member

@gibfahn gibfahn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some thoughts/suggestions (from someone who isn't familiar with this stuff).

1. install linux-tools-common (eg. with apt-get)
1. try running `perf` - it might complain about missing kernel modules, install them too
1. install stackvis `npm i -g stackvis`
1. run node with perf enabled `perf record -e cycles:u -g -- node --perf-basic-prof app.js`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tend to spend a lot of time reading markdown files in the raw (not least in PR review), so I'd appreciate incremental numbering.

1. install stackvis `npm i -g stackvis`
1. run node with perf enabled `perf record -e cycles:u -g -- node --perf-basic-prof app.js`
1. disregard warnings unless they're saying you can't run perf due to missing packages; you may get some warnings about not being able to access kernel module samples which you're not after anyway.
1. run `perf script`, but pipe it through some cleanup: `perf script | egrep -v "( __libc_start| LazyCompile | v8::internal::| Builtin:| Stub:| LoadIC:|\[unknown\]| LoadPolymorphicIC:)" | sed 's/ LazyCompile:[*~]\?/ /' > perfs.out` [explanation](#about-perf-script-filtering)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 on a code block, and I think it's worth repeating the whole perf line:

1. Run your script through `perf` again, removing V8 internals and redirecting output to a file.
```bash
perf record -e cycles:u -g -- node --perf-basic-prof app.js > perfs.out
```

I'd suggest moving the grep and sed into your paragraph below. I think it's a good idea, and worth mentioning, but it makes the initial example more complex, and as you note later it isn't always useful.


### Node's profiling options

`--perf-basic-prof-only-functions` and `--perf-basic-prof` seem like the only two you might be initially interested in for debugging your JavaScript code. Other options should only be useful for profiling Node itself, which is outside the scope of this howto.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe:

--perf-basic-prof-only-functions and --perf-basic-prof seem like the only two you might be initially interested in for debugging your JavaScript code. Other options should only be useful for profiling Node itself, which is outside the scope of this howto.

to

--perf-basic-prof-only-functions and --perf-basic-prof are the two that are useful for debugging your JavaScript code. Other options are used for profiling Node itself, which is outside the scope of this howto.

perf record -F99 -p `pgrep -n node` -g -- sleep 3
```

Wait, what is that `sleep 3` for? It's there to keep the perf running - despite `-p` option pointing to a different pid, the command needs to be executed on a process and end with it. Just a convenient way to keep perf going for a given time.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This wasn't immediately clear to me. Maybe something like:

perf runs for the life of the command you pass to it, whether or not you're actually profiling that command. sleep 3 ensures that perf runs for 3 seconds.

You might have heard creating a flame graph for Node.js is difficult, but that's not true (anymore).
Solaris vms are no longer needed for flame graphs on linux!

1. install linux-tools-common - the package containing `perf` (eg. with apt-get)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe:

Install perf (usually available through the linux-tools-common package if not already installed).

1. try running `perf` - it might complain about missing kernel modules, install them too
1. install stackvis `npm i -g stackvis`
1. run node with perf enabled `perf record -e cycles:u -g -- node --perf-basic-prof app.js`
1. disregard warnings unless they're saying you can't run perf due to missing packages; you may get some warnings about not being able to access kernel module samples which you're not after anyway.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe simplify to:

Disregard non-fatal warnings, e.g. those about accessing kernel module samples.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The errors you get did sound pretty fatal to me, but they only matter if they say perf is missing some dependency. I went through figuring it out and I don't think I would understand what you mean by "non-fatal" before I figured it out.

1. install linux-tools-common - the package containing `perf` (eg. with apt-get)
1. try running `perf` - it might complain about missing kernel modules, install them too
1. install stackvis `npm i -g stackvis`
1. run node with perf enabled `perf record -e cycles:u -g -- node --perf-basic-prof app.js`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be nice to break this out too:

1. Run your script through `perf` to ensure everything is setup correctly:
```bash
perf record -e cycles:u -g -- node --perf-basic-prof app.js
```


Wait, what is that `sleep 3` for? It's there to keep the perf running - despite `-p` option pointing to a different pid, the command needs to be executed on a process and end with it. Just a convenient way to keep perf going for a given time.

Why is `-F` (profiling frequency) set to 99? It's a reasonable default. You can adjust if you want. Lower values should produce less output with less precise results.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe:

-F99 tells perf to take 99 samples per second, for more precision increase the value.


Why is `-F` (profiling frequency) set to 99? It's a reasonable default. You can adjust if you want. Lower values should produce less output with less precise results.

After you get that 3 second perf record, proceed with generating the flame graph with last two steps from above.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

last two -> the last two


### About `perf script` filtering

The long line of greps and seds after `perf script` call is there to remove some V8 and node internals so that it's easier for you to focus on your JavaScript calls. These are only important if you are looking for an issue with Node internals.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I mentioned above, I'd suggest putting the grep/sed in here:

### Filtering out Node internal functions

Usually you just want to look at the performance of your own calls, so filtering out Node and V8 internal functions can make the graph much easier to read. You can clean up your perf file with:

```bash
sed -i \
  -e "s/( __libc_start| LazyCompile | v8::internal::| Builtin:| Stub:| LoadIC:|\[unknown\]| LoadPolymorphicIC:)/d" \
  -e 's/ LazyCompile:[*~]\?/ /' \
  perfs.out
```

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a grammatician, but I read the "These" of the second sentence to refer to the most recently mentioned thing, "JavaScript calls", maybe be more explicit: The internals are only important if you are ....

@fhemberger
Copy link
Contributor

ping @naugtur

@nodejs/diagnostics Could you please help landing this?

@naugtur
Copy link
Contributor Author

naugtur commented Jul 12, 2018

Hi, I'm so sorry I couldn't get back to this.
I'm still trying to carve out the time after becoming a father. I should be able to come back to this doc in the next 2 weeks.

@fhemberger
Copy link
Contributor

@naugtur Hi, no worries! Take your time. I was hoping that someone from @nodejs/diagnostics could help out a bit, as it looks like there are only some details left to be done.

@naugtur
Copy link
Contributor Author

naugtur commented Jul 13, 2018

@fhemberger there were some corner cases with node8 to add or work around and I'm not sure if they propagate to v10.

Anyway, as soon as I arrange to get back to it I'm hoping to finish this one and start another.

@naugtur
Copy link
Contributor Author

naugtur commented Jul 27, 2018

@fhemberger I've addressed the comments and reorganized the document a bit. Hope it's reasonable.

Next, I'll make sure the guide actually works after all those changes and report back :)

@mmarchini
Copy link
Contributor

Just an FYI: there's a new flag (--interpreted-frames-native-stack) on Node.js v10.x to address the "missing frames" issue introduced with Turbofan.

@fhemberger
Copy link
Contributor

@naugtur Awesome, thank you so much!

@naugtur
Copy link
Contributor Author

naugtur commented Jul 29, 2018

@mmarchini Great! I'll have to read up on that and add another edit.

@naugtur
Copy link
Contributor Author

naugtur commented Sep 17, 2018

Hi, I've added interpreted-frames-native-stack to the doc.
It could be ready to publish, but I have some minor concerns.

  • sed has different syntax on mac, maybe we should go back to egrep
  • I wanted to test it on all versions of node, but now I'm getting flamegraphs without function names and with node`_ZN... instead. With the same commands that used to work. Looks like there's one more factor I'm not taking into account here and it changed with some updates I got.

@mcollina
Copy link
Member

@naugtur You'll need to add --interpreted-frames-native-stack on all commands to work and show the functions.

@mmarchini
Copy link
Contributor

I wanted to test it on all versions of node, but now I'm getting flamegraphs without function names and with node`_ZN... instead.

The Linux perf you're using was not compiled with demangle support, see https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1396654. IIRC this if fixed on Ubuntu 18.04.

@naugtur
Copy link
Contributor Author

naugtur commented Sep 18, 2018

The Linux perf you're using was not compiled with demangle support, see https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1396654. IIRC this if fixed on Ubuntu 18.04.

I saw that bug and:

  1. I want to mention it in the guide
  2. I wonder why it worked before - I'm running it on the same system as 1yr ago. It might be due to a kernel update.

Once that's done, I'm hoping to 🚢 it :)

4. disregard warnings unless they're saying you can't run perf due to missing packages; you may get some warnings about not being able to access kernel module samples which you're not after anyway.
5. Run `perf script > perfs.out` to generate the data file you'll visualize in a moment. It's useful to [apply some cleanup](#filtering-out-node-internal-functions) for a more readable graph
6. install stackvis if not yet installed `npm i -g stackvis`
7. run `stackvis perf < perfs.out > flamegraph.htm`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried this today and the result does not display nicely for me in the browser. Does it still work ok for you?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should suggest using Brendan Gregg's flamegraph scripts instead? Stackvis result is quite unreadable here as well:

stackvis perf < perfs.out > flamegraph.htm :

image

./FlameGraph/stackcollapse-perf.pl < ./perfs.out | ./FlameGraph/flamegraph.pl --color=js > ./flamegraph.svg:

image

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did follow that suggestion in the demo I was putting together for my NodeConfEU presentation. FlameGraph worked for me. Once we close on the content we should see if we can figure out some testing to make sure our recommendation continues to work properly.

Copy link
Contributor Author

@naugtur naugtur Oct 2, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's exactly why I'm not saying this is ready to publish - it started showing me the same issue you have on the screenshot. It seemed an issue with perf output, so I upgraded my OS and retired, got the same thing, ran out of time. I'll check other options, but it's good to see the output from perf seems ok and it's about the visualization tool.

I'm hoping to look for reasons in stackvis and explore other options. IMHO it'd be nice to use something one can install from npm after all.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@naugtur thanks for the update.

Copy link
Contributor Author

@naugtur naugtur Oct 27, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've compared various flamegraph generators and the difference in readabiliy is mostly due to overflow:hidden of long trace names, but the input data is still incorrect. I switched to ubuntu 18.04.1 and it still produces bad stack traces.

Planning to check on a few systems and compare.

And BTW, 0x didn't work for me either.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but the input data is still incorrect

Incorrect how? If you could share an example I can happily take a look :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://naugtur.egnyte.com/fl/6GGszZUQFy
Every tool I used to generate a flame graph on my machine had the same issues

@fhemberger
Copy link
Contributor

fhemberger commented Feb 20, 2019

@naugtur you have been working on this for quite a while, so I'd really love to get this merged. I don't want to stress you if you're busy, but what is still needed for this PR to land? Maybe someone can help.

We could also see that we merge the current version, and add further details later if this works for you.

@naugtur
Copy link
Contributor Author

naugtur commented Feb 20, 2019

@fhemberger The last thing I was hoping to fix here is issues with displaying function names I'm experiencing. I've tried this on 2 different Ubuntu versions I use on daily basis and didn't get the nice names on either of them.
I've failed to find time to install a clean new system and prove it works.

If anybody can go through the steps and confirm they're seeing a nice flame graph with correct function names - I'd consider it ready.

I'm also ok with merging this as-is because the part that's not working for me seems to require a fix independent of the whole procedure.

@fhemberger
Copy link
Contributor

@naugtur Okay, then I'm merging it. If someone comes up with a nicer flame graph, this can be added in a separate PR.

Thank you so much for your work!

@fhemberger fhemberger merged commit c170a70 into nodejs:master Feb 20, 2019
fhemberger added a commit that referenced this pull request Feb 20, 2019
@naugtur
Copy link
Contributor Author

naugtur commented Feb 20, 2019

Thanks, sorry for the delays.
Noone's ever really ready for kids ;)

fhemberger added a commit that referenced this pull request Feb 21, 2019
* Add guide: Diagnostics - Flame Graphs

Ref: #1444

* zh-cn:trans for 'Diagnostics - Flame Graphs'

trans for 'Diagnostics - Flame Graphs'.
@ghost
Copy link

ghost commented Feb 21, 2019

@nodejs/localization: Feel free to translate to your own languages if possible :)

@AliSawari
Copy link

@Maledong thanks
I will do my part once I have the chance, really busy these days :))

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.