Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cranelift: add support for the Mac aarch64 calling convention #2742

Merged
merged 1 commit into from
Mar 22, 2021
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 2 additions & 2 deletions Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -37,7 +37,7 @@ wasi-cap-std-sync = { path = "crates/wasi-common/cap-std-sync", version = "0.25.
structopt = { version = "0.3.5", features = ["color", "suggestions"] }
object = { version = "0.23.0", default-features = false, features = ["write"] }
anyhow = "1.0.19"
target-lexicon = { version = "0.11.0", default-features = false }
target-lexicon = { version = "0.12.0", default-features = false }
pretty_env_logger = "0.4.0"
file-per-thread-logger = "0.1.1"
wat = "1.0.36"
Expand Down
2 changes: 1 addition & 1 deletion cranelift/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -34,7 +34,7 @@ log = "0.4.8"
termcolor = "1.1.2"
capstone = { version = "0.7.0", optional = true }
wat = { version = "1.0.36", optional = true }
target-lexicon = { version = "0.11", features = ["std"] }
target-lexicon = { version = "0.12", features = ["std"] }
peepmatic-souper = { path = "./peepmatic/crates/souper", version = "0.72.0", optional = true }
pretty_env_logger = "0.4.0"
rayon = { version = "1", optional = true }
Expand Down
2 changes: 1 addition & 1 deletion cranelift/codegen/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -17,7 +17,7 @@ cranelift-codegen-shared = { path = "./shared", version = "0.72.0" }
cranelift-entity = { path = "../entity", version = "0.72.0" }
cranelift-bforest = { path = "../bforest", version = "0.72.0" }
hashbrown = { version = "0.9.1", optional = true }
target-lexicon = "0.11"
target-lexicon = "0.12"
log = { version = "0.4.6", default-features = false }
serde = { version = "1.0.94", features = ["derive"], optional = true }
bincode = { version = "1.2.1", optional = true }
Expand Down
1 change: 1 addition & 0 deletions cranelift/codegen/meta/src/shared/settings.rs
Original file line number Diff line number Diff line change
Expand Up @@ -192,6 +192,7 @@ pub(crate) fn define() -> SettingGroup {
"cold",
"system_v",
"windows_fastcall",
"apple_aarch64",
"baldrdash_system_v",
"baldrdash_windows",
"baldrdash_2020",
Expand Down
34 changes: 30 additions & 4 deletions cranelift/codegen/src/isa/aarch64/abi.rs
Original file line number Diff line number Diff line change
Expand Up @@ -171,6 +171,21 @@ impl ABIMachineSpec for AArch64MachineDeps {
let has_baldrdash_tls = call_conv == isa::CallConv::Baldrdash2020;

// See AArch64 ABI (https://c9x.me/compile/bib/abi-arm64.pdf), sections 5.4.
//
// MacOS aarch64 is slightly different, see also
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing I noticed here: "If the total number of bytes for stack-based arguments is not a multiple of 8 bytes, insert padding on the stack to maintain the 8-byte alignment requirements." However below we align the final stack-arg area size up to a 16-byte alignment.

(Related to this, my understanding is that the trap-on-not-16-aligned-SP behavior of aarch64 is configurable with a mode bit as well; maybe this means Apple runs with only an 8-aligned stack?)

I think this is OK as it should be fine to reserve extra space at a callsite, but we should document that we diverge and why it's OK (and verify that it is!).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, and it maintains our invariant that the stack is always allocated in 16-bytes chunked, thus always aligned, which is nice. (Otherwise would require more changes when generating prologues and epilogues.)

// https://developer.apple.com/documentation/xcode/writing_arm64_code_for_apple_platforms.
// We are diverging from the MacOS aarch64 implementation in the
// following ways:
// - sign- and zero- extensions of data types less than 32 bits are not
// implemented yet.
// - i128 arguments passing isn't implemented yet in the standard (non
// MacOS) aarch64 ABI.
// - we align the arguments stack space to a 16-bytes boundary, while
// the MacOS allows aligning only on 8 bytes. In practice it means we're
// slightly overallocating when calling, which is fine, and doesn't
// break our other invariants that the stack is always allocated in
// 16-bytes chunks.

let mut next_xreg = 0;
let mut next_vreg = 0;
let mut next_stack: u64 = 0;
Expand Down Expand Up @@ -264,13 +279,24 @@ impl ABIMachineSpec for AArch64MachineDeps {
*next_reg += 1;
remaining_reg_vals -= 1;
} else {
// Compute size. Every arg takes a minimum slot of 8 bytes. (16-byte
// stack alignment happens separately after all args.)
// Compute the stack slot's size.
let size = (ty_bits(param.value_type) / 8) as u64;
let size = std::cmp::max(size, 8);
// Align.

let size = if call_conv != isa::CallConv::AppleAarch64 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When depending on new features of a dependency could you also update the dependency requirement in Cargo.toml? That would be useful for users like cg_clif that only update a single crate at a time and keep the rest pinned using Cargo.lock.

// Every arg takes a minimum slot of 8 bytes. (16-byte stack
// alignment happens separately after all args.)
std::cmp::max(size, 8)
} else {
// MacOS aarch64 allows stack slots with sizes less than 8
// bytes. They still need to be properly aligned on their
// natural data alignment, though.
size
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could the fast and cold call conv use this too? They are unstable anyway and this saves stack usage.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe? It's a bit out of scope for this PR, so I'll try not to accidentally introduce new changes there. Plus, I am not sure if these conventions are used; I seem to recall that there fast implies the default calling convention in some cases, and if we're not being careful that might mean subtly breaking other calling conventions. We should probably audit the "fast"/"cold" calling conventions at some point, and re-design them from the ground up.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 -- I actually think we should do something about the "fast" and "cold" conventions, but we should be a little more explicit in designing them, and probably give them better names. I'm not a huge fan of generic terms like "fast ABI" because the ambiguity is confusing -- both on the user/embedder side ("what guarantees does this have? how fast is fast? when can I use it?") and on the implementer side (do we just choose an array of features that lead to better speed, and add more as we think of them? IMHO an evolving ABI is a recipe for subtle bugs as we mutate invariants over time).

So I'd rather design a "fast internal Cranelift ABI v1", implement it, and then keep that as a first-class, well-defined ABI alongside the others, and retire names like "fast" and "cold", just for clarity's sake. But, that's a deeper discussion for another day, I think!

};

// Align the stack slot.
debug_assert!(size.is_power_of_two());
next_stack = align_to(next_stack, size);

ret.push(ABIArg::stack(
next_stack as i64,
param.value_type,
Expand Down
20 changes: 13 additions & 7 deletions cranelift/codegen/src/isa/call_conv.rs
Original file line number Diff line number Diff line change
Expand Up @@ -10,22 +10,24 @@ use serde::{Deserialize, Serialize};
#[derive(Debug, Copy, Clone, PartialEq, Eq, Hash)]
#[cfg_attr(feature = "enable-serde", derive(Serialize, Deserialize))]
pub enum CallConv {
/// Best performance, not ABI-stable
/// Best performance, not ABI-stable.
Fast,
/// Smallest caller code size, not ABI-stable
/// Smallest caller code size, not ABI-stable.
Cold,
/// System V-style convention used on many platforms
/// System V-style convention used on many platforms.
SystemV,
/// Windows "fastcall" convention, also used for x64 and ARM
/// Windows "fastcall" convention, also used for x64 and ARM.
WindowsFastcall,
/// SpiderMonkey WebAssembly convention on systems using natively SystemV
/// Mac aarch64 calling convention, which is a tweak aarch64 ABI.
AppleAarch64,
/// SpiderMonkey WebAssembly convention on systems using natively SystemV.
BaldrdashSystemV,
/// SpiderMonkey WebAssembly convention on Windows
/// SpiderMonkey WebAssembly convention on Windows.
BaldrdashWindows,
/// SpiderMonkey WebAssembly convention for "ABI-2020", with extra TLS
/// register slots in the frame.
Baldrdash2020,
/// Specialized convention for the probestack function
/// Specialized convention for the probestack function.
Probestack,
}

Expand All @@ -36,6 +38,7 @@ impl CallConv {
// Default to System V for unknown targets because most everything
// uses System V.
Ok(CallingConvention::SystemV) | Err(()) => Self::SystemV,
Ok(CallingConvention::AppleAarch64) => Self::AppleAarch64,
Ok(CallingConvention::WindowsFastcall) => Self::WindowsFastcall,
Ok(unimp) => unimplemented!("calling convention: {:?}", unimp),
}
Expand All @@ -49,6 +52,7 @@ impl CallConv {
LibcallCallConv::Cold => Self::Cold,
LibcallCallConv::SystemV => Self::SystemV,
LibcallCallConv::WindowsFastcall => Self::WindowsFastcall,
LibcallCallConv::AppleAarch64 => Self::AppleAarch64,
LibcallCallConv::BaldrdashSystemV => Self::BaldrdashSystemV,
LibcallCallConv::BaldrdashWindows => Self::BaldrdashWindows,
LibcallCallConv::Baldrdash2020 => Self::Baldrdash2020,
Expand Down Expand Up @@ -80,6 +84,7 @@ impl fmt::Display for CallConv {
Self::Cold => "cold",
Self::SystemV => "system_v",
Self::WindowsFastcall => "windows_fastcall",
Self::AppleAarch64 => "apple_aarch64",
Self::BaldrdashSystemV => "baldrdash_system_v",
Self::BaldrdashWindows => "baldrdash_windows",
Self::Baldrdash2020 => "baldrdash_2020",
Expand All @@ -96,6 +101,7 @@ impl str::FromStr for CallConv {
"cold" => Ok(Self::Cold),
"system_v" => Ok(Self::SystemV),
"windows_fastcall" => Ok(Self::WindowsFastcall),
"apple_aarch64" => Ok(Self::AppleAarch64),
"baldrdash_system_v" => Ok(Self::BaldrdashSystemV),
"baldrdash_windows" => Ok(Self::BaldrdashWindows),
"baldrdash_2020" => Ok(Self::Baldrdash2020),
Expand Down
3 changes: 3 additions & 0 deletions cranelift/codegen/src/isa/x64/abi.rs
Original file line number Diff line number Diff line change
Expand Up @@ -907,6 +907,7 @@ fn get_intreg_for_retval(
_ => None,
},
CallConv::BaldrdashWindows | CallConv::Probestack => todo!(),
CallConv::AppleAarch64 => unreachable!(),
}
}

Expand All @@ -933,6 +934,7 @@ fn get_fltreg_for_retval(
_ => None,
},
CallConv::BaldrdashWindows | CallConv::Probestack => todo!(),
CallConv::AppleAarch64 => unreachable!(),
}
}

Expand Down Expand Up @@ -1001,6 +1003,7 @@ fn get_callee_saves(call_conv: &CallConv, regs: &Set<Writable<RealReg>>) -> Vec<
.filter(|r| is_callee_save_fastcall(r.to_reg()))
.collect(),
CallConv::Probestack => todo!("probestack?"),
CallConv::AppleAarch64 => unreachable!(),
};
// Sort registers for deterministic code output. We can do an unstable sort because the
// registers will be unique (there are no dups).
Expand Down
1 change: 1 addition & 0 deletions cranelift/codegen/src/isa/x86/abi.rs
Original file line number Diff line number Diff line change
Expand Up @@ -512,6 +512,7 @@ pub fn prologue_epilogue(func: &mut ir::Function, isa: &dyn TargetIsa) -> Codege
}
CallConv::Probestack => unimplemented!("probestack calling convention"),
CallConv::Baldrdash2020 => unimplemented!("Baldrdash ABI 2020"),
CallConv::AppleAarch64 => unreachable!(),
}
}

Expand Down
3 changes: 2 additions & 1 deletion cranelift/codegen/src/machinst/abi_impl.rs
Original file line number Diff line number Diff line change
Expand Up @@ -646,7 +646,8 @@ impl<M: ABIMachineSpec> ABICalleeImpl<M> {
|| call_conv == isa::CallConv::Fast
|| call_conv == isa::CallConv::Cold
|| call_conv.extends_baldrdash()
|| call_conv.extends_windows_fastcall(),
|| call_conv.extends_windows_fastcall()
|| call_conv == isa::CallConv::AppleAarch64,
"Unsupported calling convention: {:?}",
call_conv
);
Expand Down
2 changes: 1 addition & 1 deletion cranelift/filetests/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -23,7 +23,7 @@ gimli = { version = "0.23.0", default-features = false, features = ["read"] }
log = "0.4.6"
memmap2 = "0.2.1"
num_cpus = "1.8.0"
target-lexicon = "0.11"
target-lexicon = "0.12"
thiserror = "1.0.15"
anyhow = "1.0.32"

Expand Down
2 changes: 1 addition & 1 deletion cranelift/frontend/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -12,7 +12,7 @@ edition = "2018"

[dependencies]
cranelift-codegen = { path = "../codegen", version = "0.72.0", default-features = false }
target-lexicon = "0.11"
target-lexicon = "0.12"
log = { version = "0.4.6", default-features = false }
hashbrown = { version = "0.9.1", optional = true }
smallvec = { version = "1.6.1" }
Expand Down
2 changes: 1 addition & 1 deletion cranelift/jit/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -18,7 +18,7 @@ anyhow = "1.0"
region = "2.2.0"
libc = { version = "0.2.42" }
errno = "0.2.4"
target-lexicon = "0.11"
target-lexicon = "0.12"
memmap2 = { version = "0.2.1", optional = true }
log = { version = "0.4.6", default-features = false }

Expand Down
2 changes: 1 addition & 1 deletion cranelift/native/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -12,7 +12,7 @@ edition = "2018"

[dependencies]
cranelift-codegen = { path = "../codegen", version = "0.72.0", default-features = false }
target-lexicon = "0.11"
target-lexicon = "0.12"

[features]
default = ["std"]
Expand Down
2 changes: 1 addition & 1 deletion cranelift/object/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -13,7 +13,7 @@ edition = "2018"
cranelift-module = { path = "../module", version = "0.72.0" }
cranelift-codegen = { path = "../codegen", version = "0.72.0", default-features = false, features = ["std"] }
object = { version = "0.23.0", default-features = false, features = ["write"] }
target-lexicon = "0.11"
target-lexicon = "0.12"
anyhow = "1.0"
log = { version = "0.4.6", default-features = false }

Expand Down
2 changes: 1 addition & 1 deletion cranelift/reader/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -12,7 +12,7 @@ edition = "2018"
[dependencies]
cranelift-codegen = { path = "../codegen", version = "0.72.0" }
smallvec = "1.6.1"
target-lexicon = "0.11"
target-lexicon = "0.12"
thiserror = "1.0.15"

[badges]
Expand Down
2 changes: 1 addition & 1 deletion cranelift/wasm/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -25,7 +25,7 @@ thiserror = "1.0.4"

[dev-dependencies]
wat = "1.0.36"
target-lexicon = "0.11"
target-lexicon = "0.12"
# Enable the riscv feature for cranelift-codegen, as some tests require it
cranelift-codegen = { path = "../codegen", version = "0.72.0", default-features = false, features = ["riscv"] }

Expand Down
2 changes: 1 addition & 1 deletion crates/debug/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -16,7 +16,7 @@ gimli = "0.23.0"
wasmparser = "0.76"
object = { version = "0.23.0", default-features = false, features = ["read_core", "elf", "write"] }
wasmtime-environ = { path = "../environ", version = "0.25.0" }
target-lexicon = { version = "0.11.0", default-features = false }
target-lexicon = { version = "0.12.0", default-features = false }
anyhow = "1.0"
thiserror = "1.0.4"
more-asserts = "0.2.1"
Expand Down
2 changes: 1 addition & 1 deletion crates/jit/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -27,7 +27,7 @@ wasmtime-obj = { path = "../obj", version = "0.25.0" }
rayon = { version = "1.0", optional = true }
region = "2.2.0"
thiserror = "1.0.4"
target-lexicon = { version = "0.11.0", default-features = false }
target-lexicon = { version = "0.12.0", default-features = false }
wasmparser = "0.76"
more-asserts = "0.2.1"
anyhow = "1.0"
Expand Down
2 changes: 1 addition & 1 deletion crates/obj/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -15,7 +15,7 @@ anyhow = "1.0"
wasmtime-environ = { path = "../environ", version = "0.25.0" }
object = { version = "0.23.0", default-features = false, features = ["write"] }
more-asserts = "0.2.1"
target-lexicon = { version = "0.11.0", default-features = false }
target-lexicon = { version = "0.12.0", default-features = false }
wasmtime-debug = { path = "../debug", version = "0.25.0" }

[badges]
Expand Down
2 changes: 1 addition & 1 deletion crates/profiling/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -18,7 +18,7 @@ lazy_static = "1.4"
libc = { version = "0.2.60", default-features = false }
scroll = { version = "0.10.1", features = ["derive"], optional = true }
serde = { version = "1.0.99", features = ["derive"] }
target-lexicon = "0.11.0"
target-lexicon = "0.12.0"
wasmtime-environ = { path = "../environ", version = "0.25.0" }
wasmtime-runtime = { path = "../runtime", version = "0.25.0" }
ittapi-rs = { version = "0.1.5", optional = true }
Expand Down
2 changes: 1 addition & 1 deletion crates/test-programs/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -15,7 +15,7 @@ wasi-common = { path = "../wasi-common", version = "0.25.0" }
wasi-cap-std-sync = { path = "../wasi-common/cap-std-sync", version = "0.25.0" }
wasmtime = { path = "../wasmtime", version = "0.25.0" }
wasmtime-wasi = { path = "../wasi", version = "0.25.0" }
target-lexicon = "0.11.0"
target-lexicon = "0.12.0"
pretty_env_logger = "0.4.0"
tempfile = "3.1.0"
os_pipe = "0.9"
Expand Down
2 changes: 1 addition & 1 deletion crates/wasmtime/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -19,7 +19,7 @@ wasmtime-jit = { path = "../jit", version = "0.25.0" }
wasmtime-cache = { path = "../cache", version = "0.25.0", optional = true }
wasmtime-profiling = { path = "../profiling", version = "0.25.0" }
wasmtime-fiber = { path = "../fiber", version = "0.25.0", optional = true }
target-lexicon = { version = "0.11.0", default-features = false }
target-lexicon = { version = "0.12.0", default-features = false }
wasmparser = "0.76"
anyhow = "1.0.19"
region = "2.2.0"
Expand Down
2 changes: 1 addition & 1 deletion fuzz/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -13,7 +13,7 @@ cranelift-codegen = { path = "../cranelift/codegen" }
cranelift-reader = { path = "../cranelift/reader" }
cranelift-wasm = { path = "../cranelift/wasm" }
libfuzzer-sys = "0.4.0"
target-lexicon = "0.11"
target-lexicon = "0.12"
peepmatic-fuzzing = { path = "../cranelift/peepmatic/crates/fuzzing", optional = true }
wasmtime = { path = "../crates/wasmtime" }
wasmtime-fuzzing = { path = "../crates/fuzzing" }
Expand Down
3 changes: 0 additions & 3 deletions tests/all/func.rs
Original file line number Diff line number Diff line change
Expand Up @@ -121,9 +121,6 @@ fn signatures_match() {
}

#[test]
// Note: Cranelift only supports refrerence types (used in the wasm in this
// test) on x64.
#[cfg(target_arch = "x86_64")]
fn import_works() -> Result<()> {
static HITS: AtomicUsize = AtomicUsize::new(0);

Expand Down
Loading