-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for generating perf maps for simple perf profiling #6030
Conversation
Subscribe to Label Actioncc @peterhuene
This issue or pull request has been labeled: "wasmtime:api", "wasmtime:c-api", "wasmtime:config", "wasmtime:docs"
Thus the following users have been cc'd because of the following labels:
To subscribe or unsubscribe from this label, edit the |
Label Messager: wasmtime:configIt looks like you are changing Wasmtime's configuration options. Make sure to
To modify this label's message, edit the To add new label messages or remove existing label messages, edit the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whoa I had no idea this was a supported format by perf
! Is there some online documentation for this? Or is there a reason to use jitdump
over this?
I've got some minor comments but otherwise this seems reasonable to me to add.
Jitdump can handle code memory being reused by multiple functions at different times. In addition it allows perf to show the actual instructions that executed rather than just the function names. It also supports recording unwind tables as opposed to requiring frame pointers for unwinding. And finally it allows recording line tables.
https://github.com/torvalds/linux/blob/master/tools/perf/Documentation/jit-interface.txt |
I think the performance issues of the jitdump support are because we emit a record for each individual function at
|
Nice find @bjorn3! Although I'm not deep enough into the jitdump support to know how to fix this properly and quickly, and perfmap support is a great start for our use case. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the info and the link @bjorn3! I'm not sure what I expected in the way of documentation but... that I suppose at least does what it says on the tin! Makes sense we'll definitely want to keep the jitdump interface for "more robust" usage.
instead of emitting a single record covering the entire text section of the module as intended.
Oh I didn't realize that it was possible to do this, but reading over the documentation it looks like JIT_CODE_LOAD
only supports one symbol? Do you know off the top of your head how the whole text section would be loaded? I'd love to fix this myself to avoid generating thousands of tiny *.so
files on all perf inject
calls.
for (idx, func, len) in module.trampolines() { | ||
let (addr, len) = (func as usize as *const u8, len); | ||
let name = format!("wasm::trampoline[{}]", idx.index()); | ||
let _ = file.write_all(Self::make_line(&name, addr, len).as_bytes()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah I missed this earlier, but I think we'll want ot do something other than ignoring errors here ideally. If an error happens it should probably "close" the file and terminate all future writing to it I suspect? (along with perhaps a warning message printed?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point! I've added log messages and early-returns. Looks like a BufWriter
will automatically flush in its Drop impl, so we can avoid doing it explicitly in most early-returns.
) { | ||
let mut file = PERFMAP_FILE.lock().unwrap(); | ||
let file = file.as_mut().unwrap(); | ||
let _ = file.write_all(Self::make_line(name, addr, size).as_bytes()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this may want a flush at the end too?
Also with a BufWriter
I think you could pass that into make_line
to avoid the intermediate string allocation (e.g. write directly to the buffer). Although not required of course, that's ok to defer to if it's ever actually an issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Turns out File
is also a Write
implementator, so could have make_line
take &mut dyn Write
and use it here without having a BufWriter
(since there's only one write in this function, it didn't seem worth having an extra buffer).
Right, I should have read the docs more carefully. My bad. All we can do I guess is to make |
Wow, |
|
I should have addressed all the feedback, thanks for the review! |
This adds support for a simpler way to profile with
perf
, using perf maps files. Whilejitdump
support is pretty effective and complete when you have access to DWARF metadata, theperf-inject
step may take lots of time, and it is not optimized at all when modules contain lots of functions. (In our embedding, I've stoppedperf-inject
after it ran for 5 hours and didn't complete, while creating 175K.so
objects.) Simplistic support for perf maps enable basic profiling: caller/callee trees, top/bottom time analysis, flamegraphs etc., and works very nicely when the wasm modules have many functions.I've also tweaked the documentation and tried to make the feature available in the C API too.
I haven't put this behind a feature toggle because this profiling agent does very little, so it's always compiled when running Linux. For consistency I could add one, if you prefer.