s390x: Add z14 support #2991

Merged: 1 commit into bytecodealliance:main, Jun 17, 2021
Conversation

@uweigand (Member, Author)

  • Add support for processor features (including auto-detection).

  • Move base architecture set requirement back to z14.

  • Add z15 feature sets and re-enable z15-specific code generation
    when required features are available.
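
As a rough illustration of the last bullet (the flag name and the lowering choice are assumptions for this sketch, not the actual cranelift-codegen API), z15-specific code generation ends up gated on a detected feature along these lines:

```rust
// Hypothetical sketch: pick a lowering for a little-endian vector load
// depending on detected processor features. Vector-enhancements
// facility 2 is a z15 feature; the field name here is illustrative.
struct IsaFlags {
    has_vxrs_ext2: bool,
}

fn lower_le_vector_load(flags: &IsaFlags) -> &'static str {
    if flags.has_vxrs_ext2 {
        // z15: a single byte-reversed vector load suffices.
        "byte-reversed vector load (z15)"
    } else {
        // z14 base architecture set: plain load plus explicit swaps.
        "plain load + explicit byte shuffle (z14)"
    }
}

fn main() {
    let flags = IsaFlags { has_vxrs_ext2: false }; // e.g. running on z14
    println!("lowering: {}", lower_le_vector_load(&flags));
}
```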

The github-actions bot added the cranelift (Issues related to the Cranelift code generator) and cranelift:meta (Everything related to the meta-language) labels on Jun 16, 2021.
@alexcrichton (Member)

I was curious, so I ran the test suite under qemu and ran into an issue that looks like:

---- wasi_cap_std_sync::fd_readdir stdout ----
preopen: "/tmp/wasi_common_fd_readdirfr5CGv"
guest stderr:
thread 'main' panicked at 'assertion failed: `(left == right)`
  left: `0`,
 right: `2`: expected two entries in an empty directory', src/bin/fd_readdir.rs:76:5
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

===
Error: error while testing Wasm module 'fd_readdir'

Caused by:
    wasm trap: call stack exhausted
    wasm backtrace:
        0: 0xaafb - <unknown>!<std::sys::wasi::stdio::Stderr as std::io::Write>::is_write_vectored::hf152121ba89ed5c9
        1: 0xa867 - <unknown>!rust_panic
        2: 0xa3fd - <unknown>!std::panicking::rust_panic_with_hook::hf735cc98c0f3e6f4
        3: 0x9b35 - <unknown>!std::panicking::begin_panic_handler::{{closure}}::hb082d09953c1ceec
        4: 0x9a76 - <unknown>!std::sys_common::backtrace::__rust_end_short_backtrace::hac58197bca415fd5
        5: 0xa2a1 - <unknown>!rust_begin_unwind
        6: 0xff0b - <unknown>!core::panicking::panic_fmt::hf8b3045973a2d1f9
        7: 0x10b73 - <unknown>!core::panicking::assert_failed::inner::h4a10935a4d4a4d0d
        8: 0x3581 - <unknown>!core::panicking::assert_failed::hf88aca872cdb2b11
        9: 0x167b - <unknown>!fd_readdir::main::hb00a7e1c801281d4
       10: 0x2dcf - <unknown>!std::sys_common::backtrace::__rust_begin_short_backtrace::hf92d88d850fed84d
       11: 0x2e06 - <unknown>!std::rt::lang_start::{{closure}}::hc83ff37db6c562d6
       12: 0xa912 - <unknown>!std::rt::lang_start_internal::h3bc712c5a299b4e4
       13: 0x1ec4 - <unknown>!__original_main
       14:  0x545 - <unknown>!_start
       15: 0x13c08 - <unknown>!_start.command_export
    note: run with `WASMTIME_BACKTRACE_DETAILS=1` environment variable to display more information

Something seems off there if it's claiming the call stack is exhausted. Perhaps a qemu bug? Maybe a backend bug? In any case, I was just curious how qemu would run; unfortunately it didn't make it to the meat of the tests.

I was also a little surprised at how slow the compile was: our aarch64 build finishes building tests in ~18m, but the s390x tests took ~27m to build. Presumably this is the speed of LLVM's s390x backend, so nothing related to Wasmtime; just curious!

@uweigand (Member, Author)

Note that mainline qemu doesn't quite support z14 yet. Support has been merged into the qemu s390x maintainer repo (branch s390-next in https://gitlab.com/cohuck/qemu.git) but not yet into mainline. I'm not sure whether this explains this particular crash.

> I was also a little surprised at how slow the compile was: our aarch64 build finishes building tests in ~18m, but the s390x tests took ~27m to build. Presumably this is the speed of LLVM's s390x backend, so nothing related to Wasmtime; just curious!

Is this running as a cross-compiler, or running the native LLVM under qemu? I don't see any particular reason why the s390x back-end should be significantly slower than the aarch64 back-end when running as a cross-compiler...

@uweigand (Member, Author)

Also, please hold off merging this for a bit -- I just noticed that there seems to be a bug in the auxv crate that causes getauxval to sometimes return a wrong value, so the native platform is mis-detected. I'm currently testing a fix that just uses getauxval from the libc crate, which works correctly (and seems more straightforward anyway).
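
For reference, a minimal sketch of what the libc-based detection might look like, assuming the libc crate as a dependency (the HWCAP constant follows the kernel's arch/s390/include/asm/elf.h; the specific feature bit chosen here is illustrative):

```rust
// Minimal sketch: read the kernel-provided HWCAP bits through the libc
// crate's getauxval binding instead of the auxv crate.
// Kernel-defined bit for vector-enhancements facility 2 (a z15 feature).
const HWCAP_S390_VXRS_EXT2: libc::c_ulong = 1 << 15;

fn host_has_vxrs_ext2() -> bool {
    // getauxval(3) returns 0 for an absent entry, so a missing
    // AT_HWCAP simply reads as "no optional features".
    let hwcap = unsafe { libc::getauxval(libc::AT_HWCAP) };
    hwcap & HWCAP_S390_VXRS_EXT2 != 0
}

fn main() {
    println!("vector-enhancements facility 2: {}", host_has_vxrs_ext2());
}
```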

@alexcrichton (Member)

Ah yeah, it was using stock qemu 6.0.0, and "stack overflow" also happens with illegal instructions, so that would indeed explain it!

For the slowness, it's LLVM running natively but compiling to s390x. It could also just be variance in GitHub Actions, but afaik the only thing affecting the speed of compiling the test suite in this case would be the s390x backend in LLVM. In any case it's not something we'll fix here; just something I was curious about.

@uweigand (Member, Author)

> Ah yeah, it was using stock qemu 6.0.0, and "stack overflow" also happens with illegal instructions, so that would indeed explain it!
>
> For the slowness, it's LLVM running natively but compiling to s390x. It could also just be variance in GitHub Actions, but afaik the only thing affecting the speed of compiling the test suite in this case would be the s390x backend in LLVM. In any case it's not something we'll fix here; just something I was curious about.

Is there a simple way to reproduce this process outside of GitHub Actions? I could have a look...

@alexcrichton (Member)

While not exactly easy, one possible way to reproduce it is to run the same steps locally that CI does, which basically just downloads QEMU, builds it, and then configures some env vars for cargo's build.
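
Concretely, that boils down to something like the following (versions, paths, and the sysroot location here are illustrative guesses, not the exact CI script):

```sh
# Build qemu's s390x user-mode emulator (qemu 6.0.0 was what CI used).
curl -LO https://download.qemu.org/qemu-6.0.0.tar.xz
tar xf qemu-6.0.0.tar.xz
(cd qemu-6.0.0 && ./configure --target-list=s390x-linux-user && make -j"$(nproc)")

# Point cargo at a cross-linker, and run every test binary under qemu;
# -L supplies the s390x sysroot with the dynamic loader and libraries.
export CARGO_TARGET_S390X_UNKNOWN_LINUX_GNU_LINKER=s390x-linux-gnu-gcc
export CARGO_TARGET_S390X_UNKNOWN_LINUX_GNU_RUNNER="$PWD/qemu-6.0.0/build/qemu-s390x -L /usr/s390x-linux-gnu"

cargo test --target s390x-unknown-linux-gnu
```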

@uweigand (Member, Author)

> Also, please hold off merging this for a bit -- I just noticed that there seems to be a bug in the auxv crate that causes getauxval to sometimes return a wrong value, so the native platform is mis-detected. I'm currently testing a fix that just uses getauxval from the libc crate, which works correctly (and seems more straightforward anyway).

OK, this is fixed now. The current version passes the full test suite on both z14 and z15, and it does indeed use the z15 instructions on the latter. As far as I can see, this should be good to merge now. FYI @cfallin.

@cfallin (Member) left a comment

Thanks! This all looks great; just the tiniest of nits on a comment-formatting issue below, and a question about upstreaming feature detection, which shouldn't block merging this PR.

Review threads: cranelift/codegen/src/isa/s390x/inst/mod.rs (outdated, resolved); cranelift/native/src/lib.rs (resolved).
@cfallin (Member) left a comment

Thanks!

@cfallin merged commit 5ddf562 into bytecodealliance:main on Jun 17, 2021
@uweigand (Member, Author)

> While not exactly easy, one possible way to reproduce it is to run the same steps locally that CI does, which basically just downloads QEMU, builds it, and then configures some env vars for cargo's build.

Turns out this has nothing to do with qemu; I'm seeing the same failure natively. It is related to the --features "test-programs/test_programs" argument used by ./ci/run-tests.sh -- I hadn't been using this argument in my testing, which means I apparently never even attempted to execute some of those tests.

I'll have a look at why those tests are failing.

@uweigand deleted the s390x-z14 branch on June 18, 2021
@uweigand (Member, Author)

Turns out this was an endian bug in the handling of the Dirent data type: #3016

With this, I can now successfully run ./ci/run-tests.sh (at least natively).
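
To illustrate the class of bug (a simplified stand-in, not the actual wasi-common code or the exact fix in #3016): wasm linear memory is little-endian, so a big-endian host must encode each multi-byte field explicitly when writing a dirent record into guest memory, rather than copying host-native bytes.

```rust
// Simplified stand-in for the Dirent fix: serialize each field with an
// explicit little-endian encoding instead of the host's native one.
// On x86/aarch64 hosts native order happens to match; on s390x it doesn't.
fn write_dirent(buf: &mut [u8; 24], d_next: u64, d_ino: u64, d_namlen: u32, d_type: u8) {
    buf[0..8].copy_from_slice(&d_next.to_le_bytes());     // cookie of the next entry
    buf[8..16].copy_from_slice(&d_ino.to_le_bytes());     // inode of this entry
    buf[16..20].copy_from_slice(&d_namlen.to_le_bytes()); // length of the name
    buf[20] = d_type;                                     // file type tag
    // bytes 21..24 remain zero padding
}

fn main() {
    let mut buf = [0u8; 24];
    write_dirent(&mut buf, 1, 42, 5, 4);
    // The name-length field must read back the same on any host.
    assert_eq!(&buf[16..20], &5u32.to_le_bytes());
}
```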

@alexcrichton (Member)

The trap was originally reported as call-stack exhaustion, but given the wasm backtrace that doesn't seem to be the case; was the trap classification fixed by #3014? I could definitely imagine that switching endianness would cause some random traps on reads/writes in wasm, though...

@uweigand (Member, Author)

> The trap was originally reported as call-stack exhaustion, but given the wasm backtrace that doesn't seem to be the case; was the trap classification fixed by #3014?

Looks like this is indeed the case! I now get `wasm trap: unreachable`, which seems reasonable for a `rust_panic`.
