s390x: Add z14 support #2991

uweigand · 2021-06-16T13:06:23Z

Add support for processor features (including auto-detection).
Move base architecture set requirement back to z14.
Add z15 feature sets and re-enable z15-specific code generation
when required features are available.

alexcrichton · 2021-06-16T16:40:22Z

I was curious so I ran the test suite in qemu and I ran into an issue that looks like:

---- wasi_cap_std_sync::fd_readdir stdout ----
preopen: "/tmp/wasi_common_fd_readdirfr5CGv"
guest stderr:
thread 'main' panicked at 'assertion failed: `(left == right)`
  left: `0`,
 right: `2`: expected two entries in an empty directory', src/bin/fd_readdir.rs:76:5
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

===
Error: error while testing Wasm module 'fd_readdir'

Caused by:
    wasm trap: call stack exhausted
    wasm backtrace:
        0: 0xaafb - <unknown>!<std::sys::wasi::stdio::Stderr as std::io::Write>::is_write_vectored::hf152121ba89ed5c9
        1: 0xa867 - <unknown>!rust_panic
        2: 0xa3fd - <unknown>!std::panicking::rust_panic_with_hook::hf735cc98c0f3e6f4
        3: 0x9b35 - <unknown>!std::panicking::begin_panic_handler::{{closure}}::hb082d09953c1ceec
        4: 0x9a76 - <unknown>!std::sys_common::backtrace::__rust_end_short_backtrace::hac58197bca415fd5
        5: 0xa2a1 - <unknown>!rust_begin_unwind
        6: 0xff0b - <unknown>!core::panicking::panic_fmt::hf8b3045973a2d1f9
        7: 0x10b73 - <unknown>!core::panicking::assert_failed::inner::h4a10935a4d4a4d0d
        8: 0x3581 - <unknown>!core::panicking::assert_failed::hf88aca872cdb2b11
        9: 0x167b - <unknown>!fd_readdir::main::hb00a7e1c801281d4
       10: 0x2dcf - <unknown>!std::sys_common::backtrace::__rust_begin_short_backtrace::hf92d88d850fed84d
       11: 0x2e06 - <unknown>!std::rt::lang_start::{{closure}}::hc83ff37db6c562d6
       12: 0xa912 - <unknown>!std::rt::lang_start_internal::h3bc712c5a299b4e4
       13: 0x1ec4 - <unknown>!__original_main
       14:  0x545 - <unknown>!_start
       15: 0x13c08 - <unknown>!_start.command_export
    note: run with `WASMTIME_BACKTRACE_DETAILS=1` environment variable to display more information

Something seems off if it's saying that the call stack is exhausted. Perhaps a qemu bug? Maybe a backend bug? In any case, I was just curious how it would run under qemu, although it unfortunately didn't make it to the meat of the tests.

I was also a little surprised at how slow the compile was, our aarch64 build finishes building tests in ~18m but the s390x tests built in ~27m. This is the speed of the LLVM backend for s390x presumably, so nothing related to Wasmtime, just curious!

uweigand · 2021-06-16T16:56:25Z

Note that mainline qemu doesn't quite support z14 yet. Support has been merged into the qemu s390x maintainer repo (branch s390-next in https://gitlab.com/cohuck/qemu.git) but not yet mainline. Not sure if this explains this particular crash.

> I was also a little surprised at how slow the compile was, our aarch64 build finishes building tests in ~18m but the s390x tests built in ~27m. This is the speed of the LLVM backend for s390x presumably, so nothing related to Wasmtime, just curious!

Is this running as a cross-compiler, or running the native LLVM under qemu? I don't see any particular reason why the s390x back-end should be significantly slower than the aarch64 back-end when running as a cross-compiler ...

uweigand · 2021-06-16T16:58:05Z

Also, please hold off merging this a bit -- I just noticed that there seems to be a bug in the auxv crate that causes getauxval to sometimes return a wrong value, so the native platform is mis-detected. I'm currently testing a fix to just use getauxval from the libc crate, which works correctly (and seems more straightforward anyway).

alexcrichton · 2021-06-16T17:01:06Z

Ah yeah it was using stock qemu 6.0.0, and "stack overflow" also happens with illegal instructions, so that would indeed explain that!

For the slowness, it's LLVM running natively but compiling to s390x. It could also just be variance in GitHub Actions perhaps, but afaik the only thing affecting the speed of compiling the test suite in this case would be the s390x backend in LLVM. In any case, it's not something we'll fix here, just something I was curious about.

uweigand · 2021-06-16T17:07:16Z

> Ah yeah it was using stock qemu 6.0.0, and "stack overflow" also happens with illegal instructions, so that would indeed explain that!
>
> For the slowness, it's LLVM running natively but compiling to s390x. It could also just be variance in GitHub Actions perhaps, but afaik the only thing affecting the speed of compiling the test suite in this case would be the s390x backend in LLVM. In any case, it's not something we'll fix here, just something I was curious about.

Is there a simple way to reproduce this process outside of GitHub Actions? I could have a look ...

alexcrichton · 2021-06-16T17:39:48Z

While not exactly easy, one possible way to reproduce is to run the same steps locally that CI does, which basically just downloads QEMU, builds it, and then configures some env vars for cargo's build.

uweigand · 2021-06-16T18:32:43Z

> Also, please hold off merging this a bit -- I just noticed that there seems to be a bug in the auxv crate that causes getauxval to sometimes return a wrong value, so the native platform is mis-detected. I'm currently testing a fix to just use getauxval from the libc crate, which works correctly (and seems more straightforward anyway).

OK, this is fixed now. The current version passes the full test suite on both z14 and z15, and it will indeed use the z15 instructions on the latter. As far as I can see, this should be good to merge now. FYI @cfallin .
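The auto-detection discussed above can be pictured roughly as follows. This is a minimal sketch, not the code from this PR: the `S390xFeatures` struct and both function names are hypothetical, and the HWCAP bit value is what I understand the Linux s390 headers to define for vector-enhancements facility 2 (a z15-level facility), so treat it as an assumption to verify. The raw value would come from something like `unsafe { libc::getauxval(libc::AT_HWCAP) }`, which needs the libc crate and is left out so the sketch stays std-only.

```rust
/// Assumed AT_HWCAP bit for the s390x vector-enhancements facility 2
/// (check the kernel's s390 asm/elf.h before relying on this value).
const HWCAP_S390_VXRS_EXT2: u64 = 1 << 15;

/// Hypothetical feature set. The z14 baseline is implied; the flag marks
/// an optional z15-level facility.
#[derive(Debug, Default, Clone, Copy, PartialEq, Eq)]
pub struct S390xFeatures {
    pub vxrs_ext2: bool,
}

/// Decode a raw AT_HWCAP word into the (hypothetical) feature set.
pub fn features_from_hwcap(hwcap: u64) -> S390xFeatures {
    S390xFeatures {
        vxrs_ext2: (hwcap & HWCAP_S390_VXRS_EXT2) != 0,
    }
}

/// Pick the code-generation level: z15-specific code only when the
/// required feature was detected, otherwise the z14 baseline.
pub fn codegen_level(features: S390xFeatures) -> &'static str {
    if features.vxrs_ext2 {
        "z15"
    } else {
        "z14"
    }
}
```

Keeping the decoding as a pure function of the HWCAP word makes the mis-detection case easy to test without an s390x machine: feed in a value with the bit clear and confirm the baseline is chosen.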

cfallin

Thanks! This all looks great; just the tiniest of nits on a comment formatting issue below, and a question about upstreaming feature detection which shouldn't block merging this PR.

cranelift/codegen/src/isa/s390x/inst/mod.rs

cranelift/native/src/lib.rs


cfallin

Thanks!

uweigand · 2021-06-18T18:25:49Z

> While not exactly easy, one possible way to reproduce is to run the same steps locally that CI does, which basically just downloads QEMU, builds it, and then configures some env vars for cargo's build.

Turns out this has nothing to do with qemu; I'm seeing the same failure natively. This is related to the --features "test-programs/test_programs" argument used by ./ci/run-tests.sh -- I hadn't been using this argument in my testing, which means I've apparently never even attempted to execute some of those tests.

I'll have a look why those tests are failing.

uweigand · 2021-06-22T12:08:17Z

Turns out this was an endian bug in the handling of the Dirent data type: #3016

With this, I can now successfully run ./ci/run-tests.sh (at least natively).
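To illustrate the class of bug (the actual fix lives in #3016 and is not shown in this thread): wasm linear memory is little-endian, so a host on big-endian s390x has to serialize each field explicitly rather than copying a native struct. The sketch below assumes the WASI preview1 `dirent` layout as I understand it (24 bytes: `d_next` at offset 0, `d_ino` at 8, `d_namlen` at 16, `d_type` at 20); the method names are hypothetical.

```rust
/// Size in bytes of the dirent header as laid out in wasm memory (assumed).
pub const DIRENT_SIZE: usize = 24;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Dirent {
    pub d_next: u64,
    pub d_ino: u64,
    pub d_namlen: u32,
    pub d_type: u8,
}

impl Dirent {
    /// Serialize field by field in little-endian order, independent of
    /// the host's native byte order. Trailing padding stays zeroed.
    pub fn to_wasm_bytes(&self) -> [u8; DIRENT_SIZE] {
        let mut buf = [0u8; DIRENT_SIZE];
        buf[0..8].copy_from_slice(&self.d_next.to_le_bytes());
        buf[8..16].copy_from_slice(&self.d_ino.to_le_bytes());
        buf[16..20].copy_from_slice(&self.d_namlen.to_le_bytes());
        buf[20] = self.d_type;
        buf
    }

    /// Inverse of `to_wasm_bytes`: decode the little-endian fields.
    pub fn from_wasm_bytes(buf: &[u8; DIRENT_SIZE]) -> Self {
        let mut next = [0u8; 8];
        next.copy_from_slice(&buf[0..8]);
        let mut ino = [0u8; 8];
        ino.copy_from_slice(&buf[8..16]);
        let mut namlen = [0u8; 4];
        namlen.copy_from_slice(&buf[16..20]);
        Dirent {
            d_next: u64::from_le_bytes(next),
            d_ino: u64::from_le_bytes(ino),
            d_namlen: u32::from_le_bytes(namlen),
            d_type: buf[20],
        }
    }
}
```

A raw memory copy of the struct would produce the same bytes on x86-64 or aarch64 but byte-swapped fields on s390x, which would be consistent with the guest seeing garbage entry counts only on the big-endian host.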

alexcrichton · 2021-06-22T14:20:02Z

The trap was originally reported as stack exhaustion, but given the wasm stack that doesn't seem to be the case. Was the trap classification fixed by #3014? I could definitely imagine that switching endianness would cause some random traps on reads/writes in wasm though...

uweigand · 2021-06-22T14:44:17Z

> The trap was originally reported as stack exhaustion, but given the wasm stack that doesn't seem to be the case. Was the trap classification fixed by #3014?

Looks like this is indeed the case! I now get wasm trap: unreachable which seems reasonable for a rust_panic.

github-actions bot added the cranelift and cranelift:meta labels Jun 16, 2021

uweigand force-pushed the s390x-z14 branch from a699f3a to e0c9698 June 16, 2021 17:18

uweigand force-pushed the s390x-z14 branch from e0c9698 to 8eddde9 June 16, 2021 17:47

cfallin approved these changes Jun 16, 2021


uweigand force-pushed the s390x-z14 branch from 8eddde9 to a9ab202 June 17, 2021 08:17

s390x: Add z14 support

def54fb

* Add support for processor features (including auto-detection).
* Move base architecture set requirement back to z14.
* Add z15 feature sets and re-enable z15-specific code generation when required features are available.

uweigand force-pushed the s390x-z14 branch from a9ab202 to def54fb June 17, 2021 08:23

cfallin approved these changes Jun 17, 2021


cfallin merged commit 5ddf562 into bytecodealliance:main Jun 17, 2021

uweigand deleted the s390x-z14 branch June 18, 2021 18:25
