Add an option to support big-endian machines #12387
Hi @kripken, would you have any suggestions on this issue? |
Yes, it is a little odd that typed arrays use the system endianness, while wasm Memories are always little endian. This is an issue for all emscripten output, because it is a combination of wasm (for compiled code) and JS (to access Web APIs). The JS will use a typed array view to read from memory in a fast way (for example compiled C can call an OpenGL API, and the JS side will read data to send to WebGL). On a big endian machine that may not be correct, hence we have that assertion. In principle we could add a mode in which we rewrite all memory access from JS to use DataView. We haven't had a request for this yet. Do you have a use case that needs it? |
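The mismatch described here can be demonstrated in a few lines of standalone JS (a sketch for illustration, not emscripten's actual code):

```javascript
// Standalone sketch: the same two bytes written and read both ways.
const buf = new ArrayBuffer(4);
const view = new DataView(buf);
const bytes = new Uint8Array(buf);

// A DataView write with littleEndian = true produces the same bytes on
// every host...
view.setInt16(0, 0x1234, /* littleEndian */ true);
console.log(bytes[0] === 0x34 && bytes[1] === 0x12); // true everywhere

// ...while a typed-array read uses the host's native byte order: this is
// 0x1234 on a little-endian host but 0x3412 on a big-endian one, which is
// exactly the mismatch the assertion guards against.
const HEAP16 = new Int16Array(buf);
console.log('host is little-endian:', HEAP16[0] === 0x1234);
```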
Thanks for the info. Yes, an issue has come up when using such modules. So to clarify, it's not possible to always continue with LE ordering when compiling to wasm and only assert when compiling to JS? |
@miladfarca Correct. Any time there is any JS - and there is by default, as wasm can't access Web APIs itself - then this is an issue. The only solutions are either avoiding JS access to wasm memory entirely, or rewriting the JS memory accesses to use DataView with an explicit little-endian flag.
Option 2 wouldn't be a large amount of work. Is this important for your project? |
@kripken Thanks for clarifying. It's not a specific project we are trying to support; rather, we are trying to support BE architectures in general. Node.js and V8 support big-endian platforms and we are seeing endianness problems with similar modules. In terms of adding support using DataView, is this something we could contribute, or would emscripten be able to add the changes so we can test them on BE machines? |
I think some web APIs like WebGL in particular would be more difficult as they'd have to convert data back and forth, knowing which buffers contain what kind of data. However this may be moot, if BE arches are mostly a mainframe/server thing that run node, not browsers. |
@Brion correct, at this point we are only concerned with Node.js and server related processes and as you mentioned WebGL should not be an issue. |
A contribution would be great! It shouldn't be too hard I think. The key thing would be to write a transformation pass on the acorn AST (see the existing acorn passes for examples) that rewrites HEAP16[..] => dataView.getInt16(.., endianness), and similar for sets. (The safeHeap pass may also be interesting as it does something very close, but it hasn't been converted to acorn yet.) @Brion WebGL would indeed be an issue, yeah. That would come up specifically when reading data from memory to send to WebGL. |
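To make the shape of that rewrite concrete, here is a hand-written before/after for one access width (purely illustrative, not the output of any actual pass; `dataView` is assumed to wrap the same buffer as the HEAP views):

```javascript
const buffer = new ArrayBuffer(16);
const HEAP16 = new Int16Array(buffer);
const dataView = new DataView(buffer);
const ptr = 6; // byte address, as the compiled code computes it

// The emitted JS indexes the typed array in elements:
//   HEAP16[ptr >> 1] = 1000;
//   x = HEAP16[ptr >> 1];
// The pass would rewrite those into byte-addressed DataView calls with an
// explicit little-endian flag:
dataView.setInt16(ptr, 1000, /* littleEndian */ true);
const x = dataView.getInt16(ptr, /* littleEndian */ true);
console.log(x); // 1000
```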
Btw, there is a PR up to write |
@kripken I've created this PR #12780 as a start so we can discuss possible ways to support BE machines. In this case I'm redirecting all the HEAP gets/sets with a Proxy and using DataViews to do the actual reading and writing. I made a simple test case. Note that we still cannot compile on BE machines; once the runtime is fixed we can fix the BE compilation after. |
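A minimal sketch of that Proxy idea (an assumed shape for illustration; the actual PR may differ): indexed gets/sets on a HEAP-like object are trapped and serviced by a DataView that always uses little-endian order.

```javascript
function makeLittleEndianHeap16(buffer) {
  const view = new DataView(buffer);
  return new Proxy({}, {
    // HEAP16[i] — read 2 bytes at byte offset i * 2, little-endian
    get(target, prop) {
      return view.getInt16(Number(prop) * 2, true);
    },
    // HEAP16[i] = v — write little-endian regardless of host byte order
    set(target, prop, value) {
      view.setInt16(Number(prop) * 2, value, true);
      return true;
    },
  });
}

const HEAP16 = makeLittleEndianHeap16(new ArrayBuffer(16));
HEAP16[3] = 0x1234;
console.log(HEAP16[3]); // 4660 (0x1234) on both LE and BE hosts
```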
Initial PR to add BE support has landed: #13413 |
This issue has been automatically marked as stale because there has been no activity in the past year. It will be closed automatically if no further activity occurs in the next 30 days. Feel free to re-open at any time if this issue is still relevant. |
@miladfarca @kripken Do you think there is a performance drop on LE systems when using this option? |
@juru1234 This comment has some details on the issue: #12780 (comment), though we would love to enable it by default as Wasm usage on big-endian platforms (using Node.js) is definitely increasing. |
Is there a big-endian node platform that can be easily tested by folks without S/390 big iron these days? I actually gave a quick shot at using qemu system emulation to test something in node for linux/s390 on a whim a few weeks ago, but the emulator crashed when I tried to run the node binary... "Back in the day" of big-endian PPC Macs this was easier to exercise on consumer hardware. ;) |
@bvibber you can register for a test system here: https://linuxone.cloud.marist.edu/#/register?flag=VM |
Thanks, that will come in handy for testing my js & php packages! |
It is worth measuring, but I would guess there is a penalty, yes. Things like typed-array accesses are very heavily optimized in VMs, while DataView calls may not be. This option is also larger in code size (the DataView calls don't minify as well as plain HEAP indexing). For those reasons I think this should remain an option and not be on by default. But if we gather benchmark data that shows there actually is no slowdown, and we have an idea how to reduce the code size cost, then that could change my mind. |
Quick test shows a slight penalty on repeated calls to dataView.get/setInt8/16/32 vs Int8/16/32Array on my M1 MacBook Air, though this penalty would only be paid on individual accesses on the JavaScript side so this benchmark may not be super relevant. ;) (See details below) I suspect the minification issue is of greater importance, since larger size for the .js bundle will be paid for all users of the BE-friendly builds. Wrapping the load/stores in callable functions would make them super minifier-friendly but may add overhead. (Idea: second Wasm module using the base memory but with memory accessor functions that always work in little-endian?)
endian-test.js:

function testInt32Array(buffer) {
let arr = new Int32Array(buffer);
let len = buffer.byteLength;
for (let i = 0; i < len; i += 4) {
arr[i >> 2] = arr[i >> 2] + 1 | 0;
}
}
function testInt16Array(buffer) {
let arr = new Int16Array(buffer);
let len = buffer.byteLength;
for (let i = 0; i < len; i += 2) {
arr[i >> 1] = arr[i >> 1] + 1 | 0;
}
}
function testInt8Array(buffer) {
let arr = new Int8Array(buffer);
let len = buffer.byteLength;
for (let i = 0; i < len; i++) {
arr[i] = arr[i] + 1 | 0;
}
}
function testDataView32(buffer) {
let view = new DataView(buffer);
let len = buffer.byteLength;
for (let i = 0; i < len; i += 4) {
view.setInt32(i, view.getInt32(i, true) + 1 | 0, true);
}
}
function testDataView16(buffer) {
let view = new DataView(buffer);
let len = buffer.byteLength;
for (let i = 0; i < len; i += 2) {
view.setInt16(i, view.getInt16(i, true) + 1 | 0, true);
}
}
function testDataView8(buffer) {
let view = new DataView(buffer);
let len = buffer.byteLength;
for (let i = 0; i < len; i++) {
view.setInt8(i, view.getInt8(i) + 1 | 0);
}
}
function testem() {
let len = 1024 * 1024 * 64;
function run(func, msg) {
let buffer = new ArrayBuffer(len);
let start = performance.now();
func(buffer);
let delta = performance.now() - start;
console.log(`${msg}: ${delta} ms`);
}
run(testInt8Array, 'Int8Array');
run(testInt16Array, 'Int16Array');
run(testInt32Array, 'Int32Array');
run(testDataView8, 'set/getInt8')
run(testDataView16, 'set/getInt16 LE')
run(testDataView32, 'set/getInt32 LE')
}
for (let i = 0; i < 10; i++) {
console.log(`run ${i}`);
testem();
console.log('');
} |
Looks like the current big-endian mode already wraps the DataView accessors in functions which are called as replacements for sets/loads into the HEAP arrays. |
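In that style, each load/store site calls a small named helper, which minifiers can shorten; a sketch of what such wrappers look like (helper names here are made up, not necessarily what emscripten emits):

```javascript
const buffer = new ArrayBuffer(64);
const view = new DataView(buffer);

// One tiny helper per access width; every call site minifies to a single
// short identifier instead of a full DataView method call.
function loadI16(byteOffset) {
  return view.getInt16(byteOffset, true); // always little-endian
}
function storeI16(byteOffset, value) {
  view.setInt16(byteOffset, value, true);
}

storeI16(8, -42);
console.log(loadI16(8)); // -42
```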
@bvibber I'm curious what LE platforms you are interested in targeting? |
Given that we now have |
LE platforms are Windows, Mac, iOS, Android, x86_64, arm64, and I guess anything else. BE is only s390/Linux node AFAIK. |
(IMHO a better fix is to re-spec TypedArrays to be always little-endian, fix V8's s390 output for that, and probably fix a billion other bugs in things using typed arrays and/or wasm in node libraries.) But that's obviously out of scope for emscripten. |
Are you looking to run your emscripten modules on s390/Linux? Do chrome or firefox even run there? |
All I care about is things running in node. I assume there are no browsers on s390/Linux in practice. |
Also, I have no particular investment in s390 nor any software I require running there; I simply wish the tooling I work with to be correct wherever it's run. |
So this is really a hypothetical requirement (at least from your POV @bvibber). You don't have an s390 machine that you want to run your modules on, nor any specific users asking for this? I'm not saying that means we should not invest at all in this effort, just trying to get an idea of where the demand for it is coming from. |
Correct; my primary concern is with the difficulty of debugging for downstream users of libraries using wasm components in them, as a person who sometimes creates such libraries and wants to make peoples' lives easier if the case comes up later. I don't think there's a lot of specific demand for big-endian in production, but some folks are aware that s390/linux exists, and has node available on it, and in case anybody's libraries end up getting used on one of those machines they should either work or fail gracefully, at a minimum. :) Since the build-time support option is already in there, as long as we have a clean fail-out if you try to instantiate a module on a big-endian host I'm happy with that. If there's interest in doing the work to push forward a clean, no-regressions "works everywhere" default build, that can live on a separate issue I think. |
I'm a Linux on s390x developer and yes, at least a graceful fail would have helped me a lot when I ran into this issue with bash-language-server :) |
So we do have that assertion message in debug builds. @juru1234 I assume you are asking for this message to also appear in release builds? In general we try to avoid that kind of thing, since even the text of the message itself ends up adding code size to all builds. In this case we could perhaps gate the inclusion on node support, since those building modules for the web who care about code size will already be removing the node support. |
If somebody wants to send a PR to add this error message, even in release builds (but only when node support is enabled), that would be great. |
@bvibber Thanks for running those benchmarks! I did some size comparisons. On a typical benchmark in our benchmark suite (e.g. zlib) the JS gets 5% larger with big-endian support. I also see similar things when compiling WebGL testcases. So this is not a huge increase, but I think that's big enough that we wouldn't want it to be the default. @sbc100 Interesting idea to improve the error message only in builds targeting non-Web. I would be ok with that. |
I'll see if I can whip up a quick warning patch that at least dumps a warning to console on non-debug non-big-endian build if run on big endian :D It will have a small but constant size cost, since it won't bloat the actual load/store sites. |
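One common way to detect host byte order at startup, which such a warning patch could use (a sketch, not the actual patch):

```javascript
function hostIsLittleEndian() {
  const probe = new ArrayBuffer(2);
  new Uint16Array(probe)[0] = 0x0102;
  // On a little-endian host the low byte (0x02) is stored first.
  return new Uint8Array(probe)[0] === 0x02;
}

if (!hostIsLittleEndian()) {
  console.warn('this build assumes a little-endian host; ' +
               'it will not work correctly on this machine');
}
console.log('little-endian host:', hostIsLittleEndian());
```

The probe itself is a handful of bytes of code, so the constant cost is the warning string, not the check.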
Hello,
It seems like big-endian machines are not able to use emscripten, as assertions are added in this commit: f178541
I understand that there are problems when it comes to Typed Arrays in JavaScript, since they are endianness dependent; however, WebAssembly is enforced to be little-endian. Runtime compilers such as Google V8 are compatible with running wasm (LE) binary code on BE machines. Is the above assertion for endianness only done when generating regular js code (such as asm.js), or does it also check when compiling to WebAssembly? In the case of wasm, would it be possible to continue with LE compilation even on BE machines? (I'm assuming assertions are only done during compilation to JS or to WASM, and emscripten isn't concerned with how the generated output gets executed?)