s390x: Add z14 support #2991
I was curious so I ran the test suite in qemu and I ran into an issue that looks like:
where something seems off there if it's saying that the call stack is exhausted. Perhaps a qemu bug? Maybe a backend bug? In any case, I was just curious how qemu would run, although it unfortunately didn't make it to the meat of the tests. I was also a little surprised at how slow the compile was: our aarch64 build finishes building tests in ~18m, but the s390x tests built in ~27m. This is presumably the speed of the LLVM backend for s390x, so nothing related to Wasmtime, just curious!
Note that mainline qemu doesn't quite support z14 yet. Support has been merged into the qemu s390x maintainer repo (branch s390-next in https://gitlab.com/cohuck/qemu.git) but not yet mainline. Not sure if this explains this particular crash.
Is this running as a cross-compiler, or running the native LLVM under qemu? I don't see any particular reason why the s390x back-end should be significantly slower than the aarch64 back-end when running as a cross-compiler ...
Also, please hold off merging this a bit -- I just noticed that there seems to be a bug in the auxv crate that causes getauxval to sometimes return a wrong value, so the native platform is mis-detected. I'm currently testing a fix to just use getauxval from the libc crate, which works correctly (and seems more straightforward anyway).
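For readers unfamiliar with how this kind of detection works on Linux: the kernel publishes a hardware-capability word under the `AT_HWCAP` key of the auxiliary vector, and `getauxval` returns it. The sketch below is not the code from this PR. To stay within the standard library it reads `/proc/self/auxv` instead of calling `getauxval`, it assumes a 64-bit host, and the helper names `find_auxv_entry` and `has_z15_vector` are hypothetical. The bit tested is the Linux s390x `VXRS_EXT2` capability (vector-enhancements facility 2), and it only has that meaning on an s390x host.

```rust
use std::fs;

/// Key of the hardware-capability entry in the Linux auxiliary vector.
const AT_HWCAP: u64 = 16;
/// Linux s390x HWCAP bit for vector-enhancements facility 2.
/// On other architectures the same bit means something else.
const HWCAP_S390_VXRS_EXT2: u64 = 1 << 15;

/// Hypothetical helper: scan a raw auxiliary vector, laid out as
/// (key, value) pairs of native-endian 64-bit words, for `key`.
fn find_auxv_entry(raw: &[u8], key: u64) -> Option<u64> {
    for pair in raw.chunks_exact(16) {
        let mut k = [0u8; 8];
        let mut v = [0u8; 8];
        k.copy_from_slice(&pair[..8]);
        v.copy_from_slice(&pair[8..]);
        if u64::from_ne_bytes(k) == key {
            return Some(u64::from_ne_bytes(v));
        }
    }
    None
}

/// Hypothetical helper: does this HWCAP word advertise the
/// vector-enhancements facility 2 bit?
fn has_z15_vector(hwcap: u64) -> bool {
    hwcap & HWCAP_S390_VXRS_EXT2 != 0
}

fn main() {
    // If the file cannot be read (for example on a non-Linux host),
    // report that the value is unknown instead of guessing.
    let hwcap = fs::read("/proc/self/auxv")
        .ok()
        .and_then(|raw| find_auxv_entry(&raw, AT_HWCAP));
    match hwcap {
        Some(bits) => println!(
            "AT_HWCAP = {:#x}, VXRS_EXT2 bit set: {}",
            bits,
            has_z15_vector(bits)
        ),
        None => println!("AT_HWCAP not available on this host"),
    }
}
```

Keeping the bit test in a pure function separate from the code that obtains the word makes the decision easy to check without s390x hardware.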
Ah yeah, it was using stock qemu 6.0.0, and "stack overflow" also happens with illegal instructions, so that would indeed explain that! For the slowness, it's LLVM running natively but compiling to s390x. It could also just be variance in GitHub Actions perhaps, but afaik the only thing affecting the speed of compiling the test suite in this case would be the s390x backend in LLVM. In any case, it's not something we'll fix here, just something I was curious about.
Is there a simple way to reproduce this process outside of GitHub Actions? I could have a look ...
While not exactly easy, one possible way to reproduce is to run the same steps locally that CI does, which basically just downloads QEMU, builds it, and then configures some env vars for cargo's build.
OK, this is fixed now. The current version passes the full test suite on both z14 and z15, and it will indeed use the z15 instructions on the latter. As far as I can see, this should be good to merge now. FYI @cfallin.
Thanks! This all looks great; just the tiniest of nits on a comment formatting issue below, and a question about upstreaming feature detection which shouldn't block merging this PR.
* Add support for processor features (including auto-detection).
* Move base architecture set requirement back to z14.
* Add z15 feature sets and re-enable z15-specific code generation when required features are available.
Thanks!
Turns out this has nothing to do with qemu, I'm seeing the same failure natively. This is related to the I'll have a look why those tests are failing.
Turns out this was an endian bug in handling of the With this, I can now successfully run
The trap was originally reported as call-stack exhaustion, but given the wasm stack that doesn't seem to be the case. Was the trap classification fixed by #3014? I could definitely imagine that switching endianness would cause some random traps on reads/writes in wasm, though...
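To illustrate why an endianness mix-up can surface as seemingly random traps: wasm linear memory is defined as little-endian, so a big-endian host such as s390x must byte-swap on every multi-byte access. The sketch below is a generic illustration, not the code that was fixed here (the thread does not show which component was affected), and both function names are hypothetical.

```rust
/// Hypothetical helper: read a 32-bit value from wasm linear memory.
/// Wasm memory is little-endian regardless of the host, so the bytes
/// are decoded explicitly as little-endian.
fn load_u32_le(memory: &[u8], offset: usize) -> Option<u32> {
    let bytes = memory.get(offset..offset.checked_add(4)?)?;
    let mut buf = [0u8; 4];
    buf.copy_from_slice(bytes);
    Some(u32::from_le_bytes(buf))
}

/// Hypothetical helper: what a big-endian host would get if it read
/// the same bytes in its native order without swapping.
fn load_u32_unswapped_on_big_endian(memory: &[u8], offset: usize) -> Option<u32> {
    let bytes = memory.get(offset..offset.checked_add(4)?)?;
    let mut buf = [0u8; 4];
    buf.copy_from_slice(bytes);
    Some(u32::from_be_bytes(buf))
}

fn main() {
    // The value 0x12345678 as stored in wasm memory.
    let memory = [0x78u8, 0x56, 0x34, 0x12];

    let correct = load_u32_le(&memory, 0).unwrap();
    let swapped = load_u32_unswapped_on_big_endian(&memory, 0).unwrap();

    println!("correct: {:#010x}", correct); // prints "correct: 0x12345678"
    println!("swapped: {:#010x}", swapped); // prints "swapped: 0x78563412"
}
```

If the byte-swapped value is then used as an address, length, or table index, it will usually be far out of range, which is consistent with traps showing up in unrelated-looking places.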
Looks like this is indeed the case! I now get