Replace _malloc, _memcpy and friends with imported from env. #5755

Closed
pepyakin opened this issue Nov 9, 2017 · 19 comments
@pepyakin

pepyakin commented Nov 9, 2017

I'm working in an environment where _malloc/_free and _memcpy, _memmove, _memset are provided by a custom env module. This environment requires tiny WASM binaries.

When I compile code with emcc, it emits its own implementations of these functions. Even if I explicitly declare these functions as extern, emcc still fills them in with the default implementations. However, if I provide these functions not as extern but as ordinary functions with bodies, emcc uses my versions.

So to get rid of these functions, I do some post-process stripping on the wasm binary with some home-brewed tools (I'm wondering if this is safe enough?).

Is there a better way to compile wasm for this kind of environment?
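
On the safety question, one thing any stripping tool has to handle is function index renumbering. In the WebAssembly function index space, imported functions come before module-defined ones, so turning a defined function such as _malloc into an import from env shifts the indices of other functions, and every reference to them (calls, exports, table elements, the start function) needs rewriting. Below is a minimal Python sketch of the index mapping only. `remap_function_indices` and its parameters are hypothetical names for illustration, not part of emcc or any existing tool.

```python
def remap_function_indices(num_imports, num_defined, moved):
    """Return an old-index -> new-index map for a wasm module.

    num_imports: number of imported functions (they occupy indices 0..num_imports-1)
    num_defined: number of module-defined functions (they follow the imports)
    moved: old indices of defined functions that are being turned into imports
    """
    total = num_imports + num_defined
    moved = set(moved)
    for index in moved:
        if not (num_imports <= index < total):
            raise ValueError("index %d is not a defined function" % index)

    # New layout: existing imports, then the newly imported functions,
    # then the defined functions that stay in the module.
    new_order = list(range(num_imports)) + sorted(moved)
    new_order += [i for i in range(num_imports, total) if i not in moved]
    return {old: new for new, old in enumerate(new_order)}


if __name__ == "__main__":
    # One existing import (index 0), three defined functions (1, 2, 3);
    # the function at old index 2 becomes an import.
    print(remap_function_indices(1, 3, {2}))
```

A tool that removes bodies without applying such a mapping everywhere would leave calls pointing at the wrong functions, which is the main risk of ad hoc stripping.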

@pepyakin pepyakin changed the title from "Replace _malloc," to "Replace _malloc, _memcpy and friends with imported from env." Nov 9, 2017
@pepyakin
Author

pepyakin commented Nov 9, 2017

Btw, I tried to use ONLY_MY_CODE, but it seems to trigger a js-optimizer run, which looks incorrect to me.

@juj
Collaborator

juj commented Nov 9, 2017

You can try replacing malloc on the C side, like shown here: https://github.com/kripken/emscripten/blob/incoming/tests/wrap_malloc.cpp . That might work in suppressing the dlmalloc ones.

If that does not work, or is somehow tricky, you can try commenting out this line https://github.com/kripken/emscripten/blob/incoming/tools/system_libs.py#L447 to see if that will be effective. If so, perhaps we can add a linker flag -s NO_DLMALLOC=1 or similar to do it out of the box.

@pepyakin
Author

pepyakin commented Nov 9, 2017

> You can try replacing malloc on the C side, like shown here:

Yep, that helps to replace the dlmalloc one, but it is not precisely what I want.
If I define a symbol named malloc, then it won't be possible to call the malloc symbol from env...

@kripken
Member

kripken commented Nov 9, 2017

If you want just your compiled code, no system libraries or JS support code, then you can use the standalone wasm option, which basically means creating a dynamic library (side module) of your code. That wouldn't include malloc if you didn't statically link it in yourself.

If that's what you're looking for, feedback on that path would be great.

@pepyakin
Author

pepyakin commented Nov 9, 2017

Sounds like what I need!
I should have said that I'm using Rust... and AFAIK it has some problems with SIDE_MODULE=1.
Anyway, I will try again and report back tomorrow.

@kripken
Member

kripken commented Nov 9, 2017

We should make sure that mode works well with Rust. When you can, please let me know what issues you find.

@pepyakin
Author

I managed to get a simple bare project building, and I hit a few issues.

First, I hit an issue in the post-link step: when SIDE_MODULE is set to 1, LLVM's opt is fed rust.metadata.bin files (they come from *.rlib archives).
My guess was that some combining of the intermediate bitcode files is not happening, so I tried building with LTO enabled (Rust's -C lto), and that worked.

The second issue is somewhat simpler. In the asm2wasm and wasm-as steps with SIDE_MODULE=0, these executables are launched by absolute path, like $(HOME)/.emscripten_ports/binaryen/binaryen-version_39/bin/asm2wasm. But with SIDE_MODULE=1, they are launched by relative path, like bin/asm2wasm. So I just copied these executables to ./bin.

After that I got wasm binaries.

I published a repository with sources: https://github.com/pepyakin/rust-side-module-poc

@pepyakin
Author

pepyakin commented Nov 10, 2017

About the second issue:

It looks like BINARYEN_ROOT isn't set up, because the ports setup is skipped when SIDE_MODULE is set to 1.

I can work around this issue by providing -s BINARYEN_ROOT=<path to my local binaryen> .
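
The relative bin/asm2wasm path is what one would expect if the tool path is built from an unset root. Here is a minimal Python sketch of that effect, assuming an unset root behaves like an empty string. `tool_path` is a hypothetical helper for illustration and not emscripten's actual code.

```python
import os


def tool_path(binaryen_root, tool):
    # Join the configured root with the conventional bin/ directory.
    # With an empty root, os.path.join yields a path relative to the
    # current working directory instead of an absolute one.
    return os.path.join(binaryen_root, "bin", tool)


if __name__ == "__main__":
    # Root configured: the tool resolves under that directory.
    print(tool_path("/opt/binaryen", "asm2wasm"))
    # Root unset (empty): the tool resolves relative to the cwd.
    print(tool_path("", "asm2wasm"))
```

Passing the root explicitly, as in the workaround above, restores the first case. The /opt/binaryen directory is a made-up example location.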

@kripken
Member

kripken commented Nov 10, 2017

I noticed the second issue yesterday too, yeah. #5763 should fix it.

I didn't understand the first issue. Are you passing emcc a bunch of bitcode files, and that didn't work (with what error?), then you linked them using rust beforehand (and passed emcc just one file) and that did?

@pepyakin
Author

Nothing special; here is how I build it:
build.sh
linker.sh

You can see the error in the log.

@kripken
Member

kripken commented Nov 10, 2017

Hmm, what's in that .bin file? Is it llvm bitcode? If it is, what happens if you just do emcc [that bin file]?

@pepyakin
Author

pepyakin commented Nov 10, 2017

As far as I understand, rust.metadata.bin comes from

$(HOME)/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/wasm32-unknown-emscripten/lib/libcore-03a45bcd40eeb81d.rlib

(you can download it here)

emcc $(HOME)/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/wasm32-unknown-emscripten/lib/libcore-03a45bcd40eeb81d.rlib
returns successfully and produces a.out.js.

It seems that this file is an archive. Here is the llvm-nm output for this rlib file:

-------- T _ZN4core3num7dec2flt5parse13parse_decimal17h9648af8cb2d440bdE
-------- T _ZN4core3num7dec2flt5parse7Decimal3new17h2c2a641aeb3d8e7aE
-------- t _ZN4core3num7dec2flt5parse9parse_exp17hb009161d7e0cfe7fE
-------- T _ZN4core3num7dec2flt5rawfp8Unpacked3new17hc29e3823b03e9d67E
-------- T _ZN4core3num7dec2flt5rawfp9big_to_fp17h865ca6b7c333ba45E
-------- T _ZN4core3num7dec2flt9algorithm10make_ratio17hcbbd0368e47393c4E
-------- T _ZN4core3num7dec2flt9algorithm12power_of_ten17haeff5ab6d0f9e2b4E
-------- T _ZN4core3num7dec2flt9pfe_empty17hc574b5ad28d35eecE
-------- T _ZN4core3num7flt2dec14determine_sign17hcfbb152bd2efbe4cE
.... <skipped> ....
-------- t _ZN4core3ptr13drop_in_place17h4f550a039845d939E
-------- t _ZN4core3ptr13drop_in_place17h566277a0ca1bddbeE
.... <skipped> ....
-------- d panic_bounds_check_loc.Y
-------- d panic_bounds_check_loc.Z
-------- d panic_bounds_check_loc.l
-------- d panic_bounds_check_loc.m
.... <skipped> ....
-------- d vtable.aw
-------- d vtable.az

rust.metadata.bin:

I guess rust.metadata.bin just contains some rust specific metadata.

It seems that there was a similar issue.

@kripken
Member

kripken commented Nov 11, 2017

Hmm, it says the older issue was fixed. But I guess there is something else in that archive that we can't handle. Specifically,

llvm-nm failed on file /tmp/emscripten_temp_3ab2M8_archive_contents/core-03a45bcd40eeb81d.core0.rust-cgu.bytecode.encoded: return code 1, error: /home/alon/Dev/fastcomp/build/bin/llvm-nm: /tmp/emscripten_temp_3ab2M8_archive_contents/core-03a45bcd40eeb81d.core0.rust-cgu.bytecode.encoded The file was not recognized as a valid object file

Any idea what that .bytecode.encoded file is? It starts with RUST_OBJECT.

@pepyakin
Author

Hm, according to this file it contains compressed LLVM bytecode.
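
One way to tell such members apart from real bitcode is to look at their leading bytes. The Python sketch below checks for the two LLVM bitcode magics (raw `BC 0xC0DE` and the little-endian wrapper magic `0x0B17C0DE`) and for the `RUST_OBJECT` prefix mentioned above. `classify_member` and the returned labels are hypothetical names for illustration, and nothing here decodes the compressed payload.

```python
RAW_BITCODE_MAGIC = b"BC\xc0\xde"          # raw LLVM bitcode stream
WRAPPED_BITCODE_MAGIC = b"\xde\xc0\x17\x0b"  # bitcode wrapper header, 0x0B17C0DE little-endian
RUST_OBJECT_MAGIC = b"RUST_OBJECT"         # prefix reported for .bytecode.encoded members


def classify_member(data):
    """Classify an archive member by its leading bytes."""
    if data.startswith(RAW_BITCODE_MAGIC) or data.startswith(WRAPPED_BITCODE_MAGIC):
        return "llvm-bitcode"
    if data.startswith(RUST_OBJECT_MAGIC):
        return "rust-encoded-bytecode"
    return "unknown"


if __name__ == "__main__":
    print(classify_member(b"BC\xc0\xde" + b"\x00" * 8))
    print(classify_member(b"RUST_OBJECT" + b"\x00" * 8))
    print(classify_member(b"some rust metadata"))
```

A linker driver could use a check like this to skip, or at least warn about, members that llvm-nm would reject.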

@pepyakin
Author

When processing the archive with SIDE_MODULE=0, emcc adds only core-03a45bcd40eeb81d.core0.rust-cgu.o:

https://github.com/pepyakin/rust-side-module-poc/blob/db2f6c31a4dfd1b5d50c9337195625f3fe74c72b/emcc-side_module_0.txt#L45-L47

But with SIDE_MODULE=1, emcc adds all files from the archive:

https://github.com/pepyakin/rust-side-module-poc/blob/db2f6c31a4dfd1b5d50c9337195625f3fe74c72b/emcc-side_module_1.txt#L29-L33

It looks like it has something to do with force_archive_contents:

If I set it to False, it seems to work and only adds core-03a45bcd40eeb81d.core0.rust-cgu.o, as with SIDE_MODULE=0.

However, I then hit a third issue:

          DEBUG:root:emcc step "post-link" took 0.13 seconds
          DEBUG:root:LLVM => JS
          Unsupported:   %59 = tail call { i128, i1 } @llvm.smul.with.overflow.i128(i128 %result.1132, i128 %26) #0, !dbg !413
          LLVM ERROR: Instruction not yet supported for integer types larger than 64 bits
          Traceback (most recent call last):
            File "/Users/pepyakin/tmp/emsdk_portable/emscripten/incoming/emcc", line 13, in <module>
              emcc.run()
            File "/Users/pepyakin/tmp/emsdk_portable/emscripten/incoming/emcc.py", line 1563, in run
              final = shared.Building.emscripten(final, append_ext=False, extra_args=extra_args)
            File "/Users/pepyakin/tmp/emsdk_portable/emscripten/incoming/tools/shared.py", line 2010, in emscripten
              call_emscripten(cmdline)
            File "/Users/pepyakin/tmp/emsdk_portable/emscripten/incoming/emscripten.py", line 2248, in _main
              temp_files.run_and_clean(lambda: main(
            File "/Users/pepyakin/tmp/emsdk_portable/emscripten/incoming/tools/tempfiles.py", line 79, in run_and_clean
              return func()
            File "/Users/pepyakin/tmp/emsdk_portable/emscripten/incoming/emscripten.py", line 2253, in <lambda>
              DEBUG=DEBUG,
            File "/Users/pepyakin/tmp/emsdk_portable/emscripten/incoming/emscripten.py", line 2151, in main
              temp_files=temp_files, DEBUG=DEBUG)
            File "/Users/pepyakin/tmp/emsdk_portable/emscripten/incoming/emscripten.py", line 93, in emscript
              backend_output = compile_js(infile, settings, temp_files, DEBUG)
            File "/Users/pepyakin/tmp/emsdk_portable/emscripten/incoming/emscripten.py", line 122, in compile_js
              shared.jsrun.timeout_run(subprocess.Popen(backend_args, stdout=subprocess.PIPE), note_args=backend_args)
            File "/Users/pepyakin/tmp/emsdk_portable/emscripten/incoming/tools/jsrun.py", line 20, in timeout_run
              raise Exception('Subprocess "' + ' '.join(note_args) + '" failed with exit code ' + str(proc.returncode) + '!')

This is no surprise to me, because Rust's core uses i128 types natively, and it seems emcc does not support them.
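
For reference, the rejected intrinsic computes a signed 128-bit product together with an overflow flag. The Python sketch below models that result pair using arbitrary-precision integers. It only illustrates the semantics and is not how a backend would legalize the operation into 64-bit pieces; `smul_with_overflow_i128` is a hypothetical name.

```python
BITS = 128
MASK = (1 << BITS) - 1
I128_MIN = -(1 << (BITS - 1))
I128_MAX = (1 << (BITS - 1)) - 1


def smul_with_overflow_i128(a, b):
    """Return (wrapped_product, overflowed) for signed 128-bit operands."""
    exact = a * b
    # Keep the low 128 bits, then reinterpret them as a signed value.
    wrapped = exact & MASK
    if wrapped > I128_MAX:
        wrapped -= 1 << BITS
    # Overflow happened exactly when wrapping changed the value.
    return wrapped, exact != wrapped


if __name__ == "__main__":
    print(smul_with_overflow_i128(2, 3))
    print(smul_with_overflow_i128(I128_MAX, 2))
```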

@kripken
Member

kripken commented Nov 11, 2017

Yes, when building a library the behavior is to link in all the contents of .a files (since they were provided as inputs, and we don't have a main() function or other way to know which of them is actually needed; what is needed depends on linking at runtime). So I think rust should pick the files it needs and tell emcc to build those.
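
The difference between the two modes can be sketched in Python as follows. With a known set of required symbols, only the members that define them (transitively) are pulled in; without one, every member is linked. `select_members`, its parameters, and the member and symbol names are hypothetical and for illustration only. The `force_all` flag is modeled on the force_archive_contents behavior discussed above, not copied from emcc's implementation.

```python
def select_members(members, needed, force_all=False):
    """Pick archive members to link.

    members: dict mapping member name -> (defined symbols, undefined symbols)
    needed: symbols the program is known to require (e.g. reachable from main)
    force_all: link every member, as when building a library
    """
    if force_all:
        return sorted(members)

    selected = []
    defined = set()
    undefined = set(needed)
    progress = True
    while progress:
        progress = False
        for name, (defs, undefs) in members.items():
            if name in selected:
                continue
            # Pull in a member only if it resolves something still undefined.
            if defs & undefined:
                selected.append(name)
                defined |= defs
                undefined |= undefs
                undefined -= defined
                progress = True
    return sorted(selected)


if __name__ == "__main__":
    archive = {
        "a.o": ({"foo"}, {"bar"}),
        "b.o": ({"bar"}, set()),
        "c.o": ({"baz"}, set()),
    }
    print(select_members(archive, {"foo"}))
    print(select_members(archive, set(), force_all=True))
```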

However, maybe we should also ignore a "broken" file like that by default? On the other hand ignoring it silently might be worse.

The third issue is code that fastcomp's backend can't legalize. We basically have code that legalizes anything clang will emit from C or C++, but arbitrary bitcode might not work. This is one motivation to use the wasm backend, although we could add it to fastcomp, but probably not worth it at this point. (I didn't know rust had i128 by default, btw, that's interesting.)

@pepyakin
Author

> The third issue is code that fastcomp's backend can't legalize. We basically have code that legalizes anything clang will emit from C or C++, but arbitrary bitcode might not work. This is one motivation to use the wasm backend, although we could add it to fastcomp, but probably not worth it at this point. (I didn't know rust had i128 by default, btw, that's interesting.)

I completely agree that it is not worth adding support for legalization of i128. Also, for Rust, a core library without i128 exists. Furthermore, Rust's i128 is still somewhat of a moving target, since it is still a nightly-only feature.

> However, maybe we should also ignore a "broken" file like that by default? On the other hand ignoring it silently might be worse.

It seems to me that we are already ignoring some "broken" files when we are collecting symbols.
Take a look at the fix of the issue mentioned above.

> So I think rust should pick the files it needs and tell emcc to build those.

Hm, I'm not sure whether this is possible/feasible. We probably need advice from some Rust wizards.

@kripken
Member

kripken commented Nov 13, 2017

Good point, that would be more consistent. Fix in #5777.

@stale

stale bot commented Sep 19, 2019

This issue has been automatically marked as stale because there has been no activity in the past year. It will be closed automatically if no further activity occurs in the next 7 days. Feel free to re-open at any time if this issue is still relevant.

@stale stale bot added the wontfix label Sep 19, 2019
@stale stale bot closed this as completed Sep 26, 2019