Use WASM function names in compiled objects #8627

Merged
merged 7 commits into from
May 16, 2024
45 changes: 40 additions & 5 deletions crates/wasmtime/src/compile.rs
@@ -27,6 +27,7 @@ use crate::Engine;
use anyhow::{Context, Result};
use std::{
any::Any,
borrow::Cow,
collections::{btree_map, BTreeMap, BTreeSet, HashMap, HashSet},
mem,
};
@@ -346,6 +347,9 @@ struct CompileInputs<'a> {
}

impl<'a> CompileInputs<'a> {
/// Maximum length of symbols generated in objects.
const MAX_SYMBOL_LEN: usize = 96;

fn push_input(&mut self, f: impl FnOnce(&dyn Compiler) -> Result<CompileOutput> + Send + 'a) {
self.inputs.push(Box::new(f));
}
@@ -419,6 +423,26 @@ impl<'a> CompileInputs<'a> {
ret
}

fn clean_symbol(name: &str) -> Cow<str> {
// Just to be on the safe side, avoid passing user-provided data to tools
// like "perf" or "objdump", and filter the name. Allow only characters usually
// used for function names, plus some characters that might be used in name
// mangling.
let bad_char = |c: char| !c.is_alphanumeric() && !r"<>[]_-:@$".contains(c);
if name.chars().any(bad_char) {
let mut name = name
.chars()
.map(|c| if bad_char(c) { '?' } else { c })
.collect::<String>()
.trim_matches('?')
.to_string();
name.truncate(Self::MAX_SYMBOL_LEN);
Member

.take(N) in the iterator chain should allow the truncation to happen inline (and stop iteration early if so, since the whole chain is lazy)

Contributor Author

Done.

Contributor Author

Actually, no, I'm not sure how to do this; I want to truncate the string after trim_matches('?') has removed repeated occurrences of '?' from the cleaned string, not before.

Contributor

You mention that you intended to turn any run of multiple ? into a single ?. Note that trim_matches doesn't do that. Instead, it deletes all copies of the given pattern from both ends of the string. That's why it can return a borrowed &str instead of a heap-allocated String: it doesn't change the middle of the string, so it can just adjust the start pointer and the length.

I think this should do more or less what you want (though I'm typing it in GitHub's web editor without testing it so who knows). It deletes all ? at the beginning of the string, then compresses any run of multiple ? into a single ?. It can leave a single ? at the end of the string but I couldn't think of a simple way to fix that and I don't think it matters. Also it uses take like Chris suggested.

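            // Starting at '?' makes the filter drop any leading '?'.
            let mut last_char = '?';
            let name = name
                .chars()
                .map(|c| if bad_char(c) { '?' } else { c })
                // Drop a '?' only when the previous character was also a '?'.
                .filter(|&c| { let keep = c != '?' || last_char != '?'; last_char = c; keep })
                .take(Self::MAX_SYMBOL_LEN)
                .collect::<String>();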
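
As a side note on the two points raised in this thread, here is a minimal standalone sketch (not part of the PR). It shows that `trim_matches` only strips the ends of a string, and that `chars().take(n)` limits by characters, while `String::truncate` works on byte offsets and panics if the offset falls inside a multi-byte character. The helper names `trim_question_marks` and `take_chars` are hypothetical and exist only for illustration.

```rust
/// Hypothetical helper: strips '?' from both ends only.
/// The result borrows from the input; the middle of the string is untouched.
fn trim_question_marks(s: &str) -> &str {
    s.trim_matches('?')
}

/// Hypothetical helper: keeps at most `max_chars` characters (not bytes).
/// Because the iterator chain is lazy, iteration stops once the limit is hit.
fn take_chars(s: &str, max_chars: usize) -> String {
    s.chars().take(max_chars).collect()
}
```

For example, `trim_question_marks("??foo???bar??")` leaves the interior run of `?` in place, and `take_chars("héllo", 2)` cuts after the second character even though byte offset 2 lies inside `é`. Since `is_alphanumeric` accepts non-ASCII letters, a byte-based truncation of a cleaned name could hit such an offset.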

Cow::Owned(name)
} else {
Cow::Borrowed(&name[..])
}
}

fn collect_inputs_in_translations(
&mut self,
types: &'a ModuleTypesBuilder,
@@ -436,13 +460,24 @@ impl<'a> CompileInputs<'a> {
let func_index = translation.module.func_index(def_func_index);
let (info, function) =
compiler.compile_function(translation, def_func_index, func_body, types)?;
- Ok(CompileOutput {
- key: CompileKey::wasm_function(module, def_func_index),
- symbol: format!(
- "wasm[{}]::function[{}]",
+ let symbol = match translation
+ .debuginfo
+ .name_section
+ .func_names
+ .get(&func_index)
+ {
+ Some(name) => format!(
+ "wasm[{}]::func[{}]::{}",
module.as_u32(),
- func_index.as_u32()
+ func_index.as_u32(),
+ Self::clean_symbol(&name)
),
+ None => format!("wasm[{}]::func[{}]", module.as_u32(), func_index.as_u32()),
+ };

+ Ok(CompileOutput {
+ key: CompileKey::wasm_function(module, def_func_index),
+ symbol,
function: CompiledFunction::Function(function),
info: Some(info),
})
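
For readers who want to try the naming scheme outside of Wasmtime, the following is a self-contained sketch under stated assumptions, not the merged implementation. It uses plain `u32` indices and an `Option<&str>` in place of the module, function index, and name-section lookup from the diff; `symbol_name` and `is_bad_char` are hypothetical names. The cleaning step follows the run-collapsing approach suggested in the review thread. As in the diff, names that contain no filtered characters are borrowed as-is and are not truncated.

```rust
use std::borrow::Cow;

/// Maximum length (in characters) of a cleaned name, as in the diff.
const MAX_SYMBOL_LEN: usize = 96;

/// Same character filter as the diff: alphanumerics plus a few
/// characters that may appear in mangled names are allowed.
fn is_bad_char(c: char) -> bool {
    !c.is_alphanumeric() && !r"<>[]_-:@$".contains(c)
}

/// Replaces filtered characters with '?', drops leading '?', collapses
/// runs of '?' into one, and limits the result to MAX_SYMBOL_LEN chars.
fn clean_symbol(name: &str) -> Cow<'_, str> {
    if !name.chars().any(is_bad_char) {
        return Cow::Borrowed(name);
    }
    // Starting at '?' makes the filter drop any leading '?'.
    let mut last = '?';
    let cleaned: String = name
        .chars()
        .map(|c| if is_bad_char(c) { '?' } else { c })
        .filter(|&c| {
            let keep = c != '?' || last != '?';
            last = c;
            keep
        })
        .take(MAX_SYMBOL_LEN)
        .collect();
    Cow::Owned(cleaned)
}

/// Hypothetical stand-in for the `match` in the diff: appends the cleaned
/// name when one is available, otherwise uses the index-only form.
fn symbol_name(module: u32, func: u32, name: Option<&str>) -> String {
    match name {
        Some(name) => format!("wasm[{}]::func[{}]::{}", module, func, clean_symbol(name)),
        None => format!("wasm[{}]::func[{}]", module, func),
    }
}
```

Calling `symbol_name(0, 3, Some("my_func"))` yields a symbol with the name appended, while `symbol_name(0, 3, None)` falls back to the index-only form.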