Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for generating perf maps for simple perf profiling #6030

Merged
merged 10 commits into from
Mar 20, 2023

Conversation

bnjbvr
Copy link
Member

@bnjbvr bnjbvr commented Mar 16, 2023

This adds support for a simpler way to profile with perf, using perf maps files. While jitdump support is pretty effective and complete when you have access to DWARF metadata, the perf-inject step may take lots of time, and it is not optimized at all when modules contain lots of functions. (In our embedding, I've stopped perf-inject after it ran for 5 hours and didn't complete, while creating 175K .so objects.) Simplistic support for perf maps enable basic profiling: caller/callee trees, top/bottom time analysis, flamegraphs etc., and works very nicely when the wasm modules have many functions.

I've also tweaked the documentation and tried to make the feature available in the C API too.

I haven't put this behind a feature toggle because this profiling agent does very little, so it's always compiled when running Linux. For consistency I could add one, if you prefer.

@github-actions github-actions bot added wasmtime:api Related to the API of the `wasmtime` crate itself wasmtime:c-api Issues pertaining to the C API. wasmtime:config Issues related to the configuration of Wasmtime wasmtime:docs Issues related to Wasmtime's documentation labels Mar 16, 2023
@github-actions
Copy link

Subscribe to Label Action

cc @peterhuene

This issue or pull request has been labeled: "wasmtime:api", "wasmtime:c-api", "wasmtime:config", "wasmtime:docs"

Thus the following users have been cc'd because of the following labels:

  • peterhuene: wasmtime:api, wasmtime:c-api

To subscribe or unsubscribe from this label, edit the .github/subscribe-to-label.json configuration file.

Learn more.

@github-actions
Copy link

github-actions bot commented Mar 16, 2023

Label Messager: wasmtime:config

It looks like you are changing Wasmtime's configuration options. Make sure to
complete this check list:

  • If you added a new Config method, you wrote extensive documentation for
    it.

    Our documentation should be of the following form:

    Short, simple summary sentence.
    
    More details. These details can be multiple paragraphs. There should be
    information about not just the method, but its parameters and results as
    well.
    
    Is this method fallible? If so, when can it return an error?
    
    Can this method panic? If so, when does it panic?
    
    # Example
    
    Optional example here.
    
  • If you added a new Config method, or modified an existing one, you
    ensured that this configuration is exercised by the fuzz targets.

    For example, if you expose a new strategy for allocating the next instance
    slot inside the pooling allocator, you should ensure that at least one of our
    fuzz targets exercises that new strategy.

    Often, all that is required of you is to ensure that there is a knob for this
    configuration option in wasmtime_fuzzing::Config (or one
    of its nested structs).

    Rarely, this may require authoring a new fuzz target to specifically test this
    configuration. See our docs on fuzzing for more details.

  • If you are enabling a configuration option by default, make sure that it
    has been fuzzed for at least two weeks before turning it on by default.


To modify this label's message, edit the .github/label-messager/wasmtime-config.md file.

To add new label messages or remove existing label messages, edit the
.github/label-messager.json configuration file.

Learn more.

Copy link
Member

@alexcrichton alexcrichton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whoa I had no idea this was a supported format by perf! Is there some online documentation for this? Or is there a reason to use jitdump over this?

I've got some minor comments but otherwise this seems reasonable to me to add.

crates/jit/src/profiling/perfmap_linux.rs Outdated Show resolved Hide resolved
crates/jit/src/profiling/perfmap_linux.rs Outdated Show resolved Hide resolved
crates/cli-flags/src/lib.rs Outdated Show resolved Hide resolved
@bjorn3
Copy link
Contributor

bjorn3 commented Mar 16, 2023

Or is there a reason to use jitdump over this?

Jitdump can handle code memory being reused by multiple functions at different times. In addition it allows perf to show the actual instructions that executed rather than just the function names. It also supports recording unwind tables as opposed to requiring frame pointers for unwinding. And finally it allows recording line tables.

Is there some online documentation for this?

https://github.com/torvalds/linux/blob/master/tools/perf/Documentation/jit-interface.txt

@bjorn3
Copy link
Contributor

bjorn3 commented Mar 16, 2023

I think the performance issues of the jitdump support are because we emit a record for each individual function at

for (idx, func) in module.finished_functions() {
instead of emitting a single record covering the entire text section of the module as intended.

@bnjbvr
Copy link
Member Author

bnjbvr commented Mar 16, 2023

I think the performance issues of the jitdump support are because we emit a record for each individual function at

Nice find @bjorn3! Although I'm not deep enough into the jitdump support to know how to fix this properly and quickly, and perfmap support is a great start for our use case.

Copy link
Member

@alexcrichton alexcrichton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the info and the link @bjorn3! I'm not sure what I expected in the way of documentation but... that I suppose at least does what it says on the tin! Makes sense we'll definitely want to keep the jitdump interface for "more robust" usage.

instead of emitting a single record covering the entire text section of the module as intended.

Oh I didn't realize that it was possible to do this, but reading over the documentation it looks like JIT_CODE_LOAD only supports one symbol? Do you know off the top of your head how the whole text section would be loaded? I'd love to fix this myself to avoid generating thousands of tiny *.so files on all perf inject calls.

for (idx, func, len) in module.trampolines() {
let (addr, len) = (func as usize as *const u8, len);
let name = format!("wasm::trampoline[{}]", idx.index());
let _ = file.write_all(Self::make_line(&name, addr, len).as_bytes());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I missed this earlier, but I think we'll want ot do something other than ignoring errors here ideally. If an error happens it should probably "close" the file and terminate all future writing to it I suspect? (along with perhaps a warning message printed?)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point! I've added log messages and early-returns. Looks like a BufWriter will automatically flush in its Drop impl, so we can avoid doing it explicitly in most early-returns.

) {
let mut file = PERFMAP_FILE.lock().unwrap();
let file = file.as_mut().unwrap();
let _ = file.write_all(Self::make_line(name, addr, size).as_bytes());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this may want a flush at the end too?

Also with a BufWriter I think you could pass that into make_line to avoid the intermediate string allocation (e.g. write directly to the buffer). Although not required of course, that's ok to defer to if it's ever actually an issue.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Turns out File is also a Write implementator, so could have make_line take &mut dyn Write and use it here without having a BufWriter (since there's only one write in this function, it didn't seem worth having an extra buffer).

@bjorn3
Copy link
Contributor

bjorn3 commented Mar 16, 2023

Oh I didn't realize that it was possible to do this, but reading over the documentation it looks like JIT_CODE_LOAD only supports one symbol? Do you know off the top of your head how the whole text section would be loaded? I'd love to fix this myself to avoid generating thousands of tiny *.so files on all perf inject calls.

Right, I should have read the docs more carefully. My bad. All we can do I guess is to make perf inject not produce a .so for each individual function or write some rust code that has the same functionality as perf inject but is faster.

@bjorn3
Copy link
Contributor

bjorn3 commented Mar 16, 2023

Wow, perf inject has a pretty naive implementation. It is not even that we do something wrong causing each function to be a separate elf file. Instead it really produces a elf file for each function unconditionally: https://github.com/torvalds/linux/blob/9c1bec9c0b08abeac72ed6214b723adc224013bf/tools/perf/util/jitdump.c#L466

@bnjbvr
Copy link
Member Author

bnjbvr commented Mar 16, 2023

or write some rust code that has the same functionality as perf inject but is faster.

👀 mstange/samply#33

@bnjbvr
Copy link
Member Author

bnjbvr commented Mar 20, 2023

I should have addressed all the feedback, thanks for the review!

@alexcrichton alexcrichton enabled auto-merge March 20, 2023 14:32
@alexcrichton alexcrichton added this pull request to the merge queue Mar 20, 2023
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Mar 20, 2023
@bnjbvr bnjbvr enabled auto-merge March 20, 2023 15:58
@bnjbvr bnjbvr added this pull request to the merge queue Mar 20, 2023
@bnjbvr bnjbvr merged commit 6f4f30c into bytecodealliance:main Mar 20, 2023
@bnjbvr bnjbvr deleted the perf-map branch March 20, 2023 17:05
kpreisser added a commit to kpreisser/wasmtime-dotnet that referenced this pull request May 10, 2023
peterhuene pushed a commit to bytecodealliance/wasmtime-dotnet that referenced this pull request May 11, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
wasmtime:api Related to the API of the `wasmtime` crate itself wasmtime:c-api Issues pertaining to the C API. wasmtime:config Issues related to the configuration of Wasmtime wasmtime:docs Issues related to Wasmtime's documentation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants