Does it support running wasm in go #568

Closed
zonghaishang opened this issue Apr 16, 2021 · 8 comments
@zonghaishang

zonghaishang commented Apr 16, 2021

At present, we support running Wasm modules in the MOSN project (implemented in Go). We found that TeaVM can also generate Wasm files, and we hope that Wasm generated from Java can run in the Go environment as well.

I wrote a demo to test the Wasm file generated by TeaVM and found that it imports some TeaVM-specific functions, which will not be implemented in the host.

➜  wasm git:(master) ✗ wasm-objdump -x classes.wasm -j Import
classes.wasm:   file format wasm 0x1
Section Details:

Import[6]:
// user defined function
 - func[0] sig=2 <.hello> <- .hello

// These functions are not required to execute the wasm module in go
 - func[1] sig=1 <teavm.logString> <- teavm.logString
 - func[2] sig=1 <teavm.logInt> <- teavm.logInt
 - func[3] sig=5 <teavm.logOutOfMemory> <- teavm.logOutOfMemory
 - func[4] sig=1 <teavmHeapTrace.init> <- teavmHeapTrace.init
 - func[5] sig=14 <teavm.currentTimeMillis> <- teavm.currentTimeMillis

Similarly, the export section also contains functions that TeaVM depends on:

➜  wasm git:(master) ✗ wasm-objdump -x classes.wasm -j Export

classes.wasm:   file format wasm 0x1

Section Details:

Export[36]:
// user defined function
 - func[8] <thePurposeOfLife> -> "thePurposeOfLife"

// These functions are not required to execute the wasm module in go
 - func[37] <teavm_gc_collect> -> "teavm_gc_collect"
 - func[38] <teavm_gc_collectFull> -> "teavm_gc_collectFull"
 - func[45] <teavm_gc_fixHeap> -> "teavm_gc_fixHeap"
 - func[46] <teavm_gc_tryShrink> -> "teavm_gc_tryShrink"
 - func[107] <teavm_catchException> -> "teavm_catchException"
 - func[109] <teavm_throwNullPointerException> -> "teavm_throwNullPointerException"
 - func[110] <teavm_throwArrayIndexOutOfBoundsException> -> "teavm_throwArrayIndexOutOfBoundsException"
 - func[119] <teavm_processQueue> -> "teavm_processQueue"
 - func[120] <teavm_stopped> -> "teavm_stopped"
 - func[137] <teavm_allocateString> -> "teavm_allocateString"
 - func[138] <teavm_stringData> -> "teavm_stringData"
 - func[139] <teavm_allocateObjectArray> -> "teavm_allocateObjectArray"
 - func[140] <teavm_allocateStringArray> -> "teavm_allocateStringArray"
 - func[141] <teavm_allocateByteArray> -> "teavm_allocateByteArray"
 - func[142] <teavm_allocateShortArray> -> "teavm_allocateShortArray"
 - func[143] <teavm_allocateCharArray> -> "teavm_allocateCharArray"
 - func[144] <teavm_allocateIntArray> -> "teavm_allocateIntArray"
 - func[145] <teavm_allocateLongArray> -> "teavm_allocateLongArray"
 - func[146] <teavm_allocateFloatArray> -> "teavm_allocateFloatArray"
 - func[147] <teavm_allocateDoubleArray> -> "teavm_allocateDoubleArray"
 - func[148] <teavm_objectArrayData> -> "teavm_objectArrayData"
 - func[149] <teavm_byteArrayData> -> "teavm_byteArrayData"
 - func[150] <teavm_shortArrayData> -> "teavm_shortArrayData"
 - func[151] <teavm_charArrayData> -> "teavm_charArrayData"
 - func[152] <teavm_intArrayData> -> "teavm_intArrayData"
 - func[153] <teavm_longArrayData> -> "teavm_longArrayData"
 - func[154] <teavm_floatArrayData> -> "teavm_floatArrayData"
 - func[155] <teavm_doubleArrayData> -> "teavm_doubleArrayData"
 - func[156] <teavm_arrayLength> -> "teavm_arrayLength"
 - func[215] <teavm_javaHeapAddress> -> "teavm_javaHeapAddress"
 - func[216] <teavm_availableBytes> -> "teavm_availableBytes"
 - func[217] <teavm_regionsAddress> -> "teavm_regionsAddress"
 - func[218] <teavm_regionSize> -> "teavm_regionSize"

Question:

  1. Wasm should be portable. Can dependencies on specific environments be disabled? Similar to the wasm generated by tinygo compilation, specify target=wasi, so that all dependencies on js will be removed.

Here is the wasm file (wasi specification) I compiled with go:

➜  wasm git:(fix_hijack_context_not_found) wasm-objdump -x bolt-go.wasm -j Import
bolt-go.wasm:   file format wasm 0x1
Section Details:

Import[5]:
// Functions generated by the compiler
 - func[0] sig=0 <fd_write> <- wasi_unstable.fd_write
 
//  User-defined host implementation functions
 - func[1] sig=1 <proxy_get_buffer_bytes> <- env.proxy_get_buffer_bytes
 - func[2] sig=2 <proxy_get_header_map_pairs> <- env.proxy_get_header_map_pairs
 - func[3] sig=1 <proxy_set_buffer_bytes> <- env.proxy_set_buffer_bytes
 - func[4] sig=2 <proxy_log> <- env.proxy_log

➜  wasm git:(fix_hijack_context_not_found) wasm-objdump -x bolt-go.wasm -j Export

bolt-go.wasm:   file format wasm 0x1

Section Details:

Export[28]:
// Functions generated by the compiler
 - memory[0] -> "memory"
 - func[71] <_start> -> "_start"

// User-defined wasm exported functions
 - func[176] <proxy_decode_buffer_bytes> -> "proxy_decode_buffer_bytes"

  2. Go itself has its own gc capability, the wasm module written in java is executed in go, is gc processed by wasm itself? Or do not have gc capability?

@konsoletyper
Owner

Wasm should be portable. Can dependencies on specific environments be disabled? Similar to the wasm generated by tinygo compilation, specify target=wasi, so that all dependencies on js will be removed.

False. Wasm IS NOT portable. It's portable as a virtual machine. But this virtual machine does not interact with the outer world (e.g. printing to console), it has to call functions that provide corresponding capabilities. There is no standard portable way to do that. Wasi does not work in the browser, so I can't call wasm portable.

So my answer is: no, it's impossible to remove these dependencies. But I don't see any problems here. There are few functions that you should provide to wasm module generated by TeaVM, they are quite trivial. What's the problem to write these functions for your environment?
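
To get a quick overview of which host functions would have to be written, the import listing from the first comment can be grouped by module. The following is a minimal sketch, not part of TeaVM or MOSN: the class name `ImportLister` is hypothetical, and it only assumes the `<- module.name` line shape shown in the `wasm-objdump` output above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: groups imports printed by wasm-objdump by module name. */
final class ImportLister {
    // Matches the tail of lines such as
    // " - func[1] sig=1 <teavm.logString> <- teavm.logString".
    private static final Pattern IMPORT_TAIL =
            Pattern.compile("<-\\s*([^.\\s]*)\\.(\\S+)\\s*$");

    /** Returns module name -> imported function names, in order of appearance. */
    static Map<String, List<String>> group(String objdumpOutput) {
        Map<String, List<String>> byModule = new LinkedHashMap<>();
        for (String line : objdumpOutput.split("\n")) {
            Matcher m = IMPORT_TAIL.matcher(line);
            if (m.find()) {
                byModule.computeIfAbsent(m.group(1), k -> new ArrayList<>())
                        .add(m.group(2));
            }
        }
        return byModule;
    }

    public static void main(String[] args) {
        String dump = String.join("\n",
                " - func[0] sig=2 <.hello> <- .hello",
                " - func[1] sig=1 <teavm.logString> <- teavm.logString",
                " - func[2] sig=1 <teavm.logInt> <- teavm.logInt",
                " - func[5] sig=14 <teavm.currentTimeMillis> <- teavm.currentTimeMillis");
        // Each printed entry is one module whose functions the host must supply.
        group(dump).forEach((module, names) ->
                System.out.println("[" + module + "] " + names));
    }
}
```

Every module listed this way, including the empty module name used by the user-defined `hello` import, has to be satisfied by the embedding host before the module can be instantiated.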

Go itself has its own gc capability, the wasm module written in java is executed in go, is gc processed by wasm itself? Or do not have gc capability?

The question is how TeaVM should interact with this GC. I see no way. TeaVM has its own GC.

@konsoletyper
Owner

And the main question is: why? Why do you want to run Java in such a strange way?

@zonghaishang
Author

And the main question is: why? Why do you want to run Java in such a strange way?

At present, the Go project MOSN supports Wasm modules, and we hope that Wasm compiled from Java can also be executed there (outside a JS environment).

We have implemented protocol extensions in the RPC, so we also hope that Java developers can provide extension plugins.

We noticed that Wasm plugins can currently be written in multiple languages: https://github.com/proxy-wasm/spec

  • AssemblyScript SDK
  • C++ SDK
  • Go (TinyGo) SDK
  • Rust SDK

For Java-to-Wasm compilers, we have not found any examples of use in a non-web environment so far.

@jcaesar

jcaesar commented Sep 17, 2021

I'd also like to run Java in such a strange way.

why?

I have added a plugin system to my existing application. I've based it on WebAssembly because

  • WebAssembly is comparatively language and architecture agnostic - you can write a plugin for my app in C, Rust, Go, Python, ...
  • Easy to embed into an existing application (compared to e.g. Node or a JVM) - most WASM runtimes are designed to be used as a library
  • The runtime I use allows for tricky things like having multiple isolated instances of the same module and limiting the CPU usage (both seem impossible with the non-strange way of Java running in OpenJDK)

So, I'd like to be able to support as many languages as possible, and supporting JVM languages would be a big plus in that direction.

Ultimately, this currently fails because I expect my plugins to provide me with pointers into their memory, and that memory should contain a CBOR buffer. I failed to achieve this, because Jackson et al. won't compile (e.g. because UUID and ConcurrentHashMap are missing), and because getting a pointer to the content of a byte[] is currently not possible (apologies if I've missed an intrinsic I can use from inside/Java).

There are few functions that you should provide [...] What's the problem to write these functions for your environment?

I've attempted this and it indeed wasn't difficult. But it also isn't a very good solution, since these imports are internal interface that may change at any release, right?
(Side note: This strategy is also used for Go, and showing the same weaknesses.)

Wasi does not work in the browser,

There are libraries/polyfills for that. Alternatively, implementing the three functions you need (fd_write for logString/logInt/LogOOM, random_get, clock_time_get) should also be rather trivial. The rest of the imports could be implemented in Java only(?), but that is a lot less trivial, and would increase binary size :(.
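
As a rough illustration of why such a mapping is small, the sketch below shows the two data conversions a shim would need: laying out one `fd_write` iovec (a pointer and a length, each a little-endian 32-bit value) and converting the nanosecond timestamp returned by `clock_time_get` to milliseconds. The class and method names are hypothetical, and a `ByteBuffer` stands in for Wasm linear memory; this is not how TeaVM is implemented.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch of the data a logString-over-fd_write shim would prepare. */
final class WasiShimSketch {
    /**
     * Copies the UTF-8 bytes of text to dataPtr and writes one iovec
     * {buf, buf_len} as two little-endian u32 values at iovPtr.
     * Returns the number of text bytes written.
     */
    static int writeIovec(ByteBuffer memory, int iovPtr, int dataPtr, String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        memory.order(ByteOrder.LITTLE_ENDIAN);   // Wasm memory is little-endian
        for (int i = 0; i < bytes.length; i++) {
            memory.put(dataPtr + i, bytes[i]);
        }
        memory.putInt(iovPtr, dataPtr);           // iovec.buf
        memory.putInt(iovPtr + 4, bytes.length);  // iovec.buf_len
        return bytes.length;
    }

    /** clock_time_get yields unsigned nanoseconds; currentTimeMillis wants milliseconds. */
    static long nanosToMillis(long wasiNanos) {
        return Long.divideUnsigned(wasiNanos, 1_000_000L);
    }

    public static void main(String[] args) {
        ByteBuffer memory = ByteBuffer.allocate(64);
        int written = writeIovec(memory, 0, 16, "hello\n");
        System.out.println("iovec -> ptr=" + memory.getInt(0) + " len=" + written);
        System.out.println(nanosToMillis(1_500_000_000L) + " ms");
    }
}
```

A real shim would then pass file descriptor 1 (stdout), the iovec address, an iovec count of 1, and an address for the returned byte count to `fd_write`.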

Anyway, TeaVM generated WASM currently seems to need at least a little support from the outside system. Maybe that could change when WASM gets its own GC and exception support, but currently, I don't think it's possible for TeaVM to run entirely with what WASI provides.

So in summary: TeaVM is for running Java in the browser. I would be happy if it could be extended to a broader scope, but I don't think it can (but not for the reasons previously mentioned here).

@konsoletyper
Owner

@jcaesar thanks for your great feedback!

Ultimately, this currently fails because I expect my plugins to provide me with pointers into their memory, and that memory should contain a CBOR buffer. I failed to achieve this, because Jackson et al. won't compile (e.g. because UUID and ConcurrentHashMap are missing),

Is Jackson that necessary? Can't you write your own CBOR parser/generator? Can you tell me more info so that I could reimplement TeaVM-friendly CBOR support (I already have JSON parser/generator compatible with Jackson annotations that is supported with TeaVM).
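
To give a sense of scale for a hand-written generator, here is a minimal sketch of a CBOR encoder covering integers, text strings, arrays and maps, following the head encoding of RFC 8949. The class name `TinyCbor` is hypothetical, and whether every JDK class it uses is available in a given TeaVM version is an assumption that would need checking.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical minimal CBOR writer: integers, text strings, arrays and maps. */
final class TinyCbor {
    private final ByteArrayOutputStream out = new ByteArrayOutputStream();

    // Writes a CBOR head: 3-bit major type plus a non-negative argument.
    private void head(int majorType, long arg) {
        int mt = majorType << 5;
        if (arg < 24) {
            out.write(mt | (int) arg);
        } else if (arg <= 0xFFL) {
            out.write(mt | 24);
            bigEndian(arg, 1);
        } else if (arg <= 0xFFFFL) {
            out.write(mt | 25);
            bigEndian(arg, 2);
        } else if (arg <= 0xFFFFFFFFL) {
            out.write(mt | 26);
            bigEndian(arg, 4);
        } else {
            out.write(mt | 27);
            bigEndian(arg, 8);
        }
    }

    private void bigEndian(long value, int byteCount) {
        for (int i = byteCount - 1; i >= 0; i--) {
            out.write((int) (value >>> (8 * i)));  // write(int) keeps the low 8 bits
        }
    }

    /** Major type 0 for non-negative values, major type 1 (argument -1 - v) otherwise. */
    void writeInt(long v) {
        if (v >= 0) head(0, v); else head(1, ~v);
    }

    void writeText(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        head(3, utf8.length);
        out.write(utf8, 0, utf8.length);
    }

    /** Caller must follow with exactly `items` values. */
    void startArray(int items) { head(4, items); }

    /** Caller must follow with exactly `pairs` key/value pairs. */
    void startMap(int pairs) { head(5, pairs); }

    byte[] toByteArray() { return out.toByteArray(); }

    public static void main(String[] args) {
        TinyCbor cbor = new TinyCbor();
        cbor.startMap(1);
        cbor.writeText("a");
        cbor.writeInt(1000);
        StringBuilder hex = new StringBuilder();
        for (byte b : cbor.toByteArray()) hex.append(String.format("%02x", b & 0xFF));
        System.out.println(hex);  // prints a161611903e8
    }
}
```

Byte strings, floats, tags and decoding are left out, so this only shows that the writing side of the format is compact.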

and because getting a pointer to the content of a byte[] is currently not possible (apologies if I've missed an intrinsic I can use from inside/Java).

Address.ofData(array).toLong(). But please note that TeaVM GC can relocate arrays, so this pointer may change. If you are interested, I can think about providing an API for allocating off-heap buffers.

There are few functions that you should provide [...] What's the problem to write these functions for your environment?

I've attempted this and it indeed wasn't difficult. But it also isn't a very good solution, since these imports are internal interface that may change at any release, right?
(Side note: This strategy is also used for Go, and showing the same weaknesses.)

Right. But the runtime part is rather small and does not change often. Do you know of some other strategy which solves this issue?

Wasi does not work in the browser,

There are libraries/polyfills for that. Alternatively, implementing the three functions you need (fd_write for logString/logInt/LogOOM, random_get, clock_time_get) should also be rather trivial. The rest of the imports could be implemented in Java only(?), but that is a lot less trivial, and would increase binary size :(.

Ah, I see. You want TeaVM-generated wasm file to depend directly on subset of WASI and also ship JS module that provides this subset of WASI API? Right?

Anyway, TeaVM generated WASM currently seems to need at least a little support from the outside system.

What kind of support do you mean?

Maybe that could change when WASM gets its own GC and exception support, but currently, I don't think it's possible for TeaVM to run entirely with what WASI provides.

What's the problem to run TeaVM entirely in WASI environment?

So in summary: TeaVM is for running Java in the browser. I would be happy if it could be extended to a broader scope, but I don't think it can (but not for the reasons previously mentioned here).

And what are the actual reasons?

@jcaesar

jcaesar commented Sep 17, 2021

What kind of support do you mean?
What's the problem to run TeaVM entirely in WASI environment?
And what are the actual reasons?

Sorry, I guess I was being cryptic.

  • I thought that an external run-time must call functions like teavm_catchException, teavm_gc_fixHeap, teavm_gc_collectFull?
  • WASI doesn't provide anything like teavm.towupper or teavmMath.atan2. Reimplementing those in pure Java would be a significant effort, and it's probably not so easy to find high quality libraries that reimplement these system functions. And doing so would increase the size of the generated binaries.

Do you know of some other strategy which solves this issue?

I would document the API and mention changes in the release notes. You could also change the import namespace (e.g. teavm_api_v1.logString) when introducing breaking changes. But compared to using WASI, this will always be a second-best.
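
One practical effect of a versioned namespace is that a host can reject an incompatible module up front instead of failing on a changed signature later. The sketch below illustrates that check under stated assumptions: `ImportNamespaceCheck` is a hypothetical name, imports are given as `module.name` strings, and `teavm_api_v2` is an invented future namespace used only for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Hypothetical host-side check against versioned import namespaces. */
final class ImportNamespaceCheck {
    /** Returns the imports whose module namespace the host does not implement. */
    static List<String> unsupported(List<String> imports, Set<String> supportedModules) {
        List<String> missing = new ArrayList<>();
        for (String imp : imports) {
            int dot = imp.indexOf('.');
            // Everything before the first dot is the module namespace.
            String module = dot < 0 ? "" : imp.substring(0, dot);
            if (!supportedModules.contains(module)) {
                missing.add(imp);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        List<String> imports = List.of(
                "teavm_api_v1.logString",
                "teavm_api_v2.logString");  // invented namespace for illustration
        System.out.println("unsupported: "
                + unsupported(imports, Set.of("teavm_api_v1")));
    }
}
```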

You want TeaVM-generated wasm file to depend directly on subset of WASI and also ship JS module that provides this subset of WASI API? Right?

Yes.


Sorry about the discussion below, it's off-topic for this issue. I'll move it if desired.

Is Jackson that necessary?

No, It's just the nicest thing out there. I'd be happy with anything that gives me a JsonNode or similar.

I already have JSON parser/generator compatible with Jackson

I'd be happy with using that, too. (I can add a switch to my API that specifies which format the data is passed in.) But I think I can't use that because it depends on JS?

Can't you write your own CBOR parser/generator?

Could, but I shun the effort. I think it would be easier to implement what little classes are missing in TeaVM. It seems like com.google.iot.cbor was missing only a few little things, like a hashCode implementation on Integer. (I hope I'll have some time for PRs for that in the next weeks...)

Can you tell me more info so that I could reimplement TeaVM-friendly CBOR support

The problem with these binary formats is that next week there's another. Having specific support for one of them is nice, but you'll probably be bombarded with requests for more. Anyway, here is an example of one of these libraries failing.

Address.ofData(array).toLong(). But please note that TeaVM GC can relocate arrays

Oh! Thank you. (I was hoping to be safe from the GC if I call the import in the same statement... But maybe I'm mixing this up with Go semantics.)

@konsoletyper
Owner

  • I thought that an external run-time must call functions like teavm_catchException, teavm_gc_fixHeap, teavm_gc_collectFull?

No, external runtime does not have to call these functions.

teavm_catchException is used to retrieve exception information from the last call. You can't throw an exception out of Wasm, so the only way to get an exception from TeaVM is to call a TeaVM function (in most cases main) and then call teavm_catchException to check whether that function actually threw anything. Yes, this could be improved. First, Wasm can actually throw an exception by reaching an unreachable instruction (a trap), but there's still no way to retrieve the actual Java exception. Second, TeaVM currently does not provide functions to inspect the returned exception (I hoped to add these functions some day). Anyway, I ended up with a 'no exception thrown to native' approach: I just write a main method that itself contains try { ... } catch (Throwable e) { ... }, where the handler interacts with native runtime functions.
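
The 'no exception thrown to native' approach can be sketched as follows. This is an illustration only: `HostErrorSink` is a hypothetical stand-in for whatever imported native function the host provides for error reporting, and the numeric status codes are an assumption.

```java
/** Sketch of an entry point that never lets a Throwable escape to the host. */
final class GuardedMain {
    /** Hypothetical stand-in for an imported host function that receives error text. */
    interface HostErrorSink {
        void report(String message);
    }

    /** Runs body; returns 0 on success, 1 if anything was thrown. */
    static int runGuarded(Runnable body, HostErrorSink host) {
        try {
            body.run();
            return 0;
        } catch (Throwable e) {
            // Hand the failure to the host as plain data instead of unwinding into it.
            host.report(e.getClass().getName() + ": " + e.getMessage());
            return 1;
        }
    }

    public static void main(String[] args) {
        int status = runGuarded(
                () -> { throw new IllegalStateException("plugin failed"); },
                message -> System.err.println("reported to host: " + message));
        System.out.println("status=" + status);
    }
}
```

With this shape the host only ever sees a status code and a message, so it does not need teavm_catchException at all.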

teavm_gc_collectFull is just a hint to start garbage collection right now, called from within native environment (just like you call System.gc from Java). TeaVM calls GC as needed.

teavm_gc_fixHeap is a rather cryptic function. While the program is running, the heap can occasionally be inconsistent (just a couple of global system pointers can be in a 'strange' state). The heap is required to be fully consistent in two cases:

  1. GC is to be performed. In fact, GC calls this function first.
  2. Heap dump is to be created. I wrote a heap dump collector as an external routine for the C backend (can be found here); however, there's no heap dump tool for Wasm. Since both backends share the runtime, this function is also available among the Wasm module exports.

  • WASI doesn't provide anything like teavm.towupper or teavmMath.atan2. Reimplementing those in pure Java would be a significant effort, and it's probably not so easy to find high quality libraries that reimplement these system functions. And doing so would increase the size of the generated binaries.

Right. As for towupper, I guess I could implement it. TeaVM ships with a highly compressed subset of the Unicode table needed for the Character.getType implementation. Maybe I could also include upper/lower case mapping and implement Character.toUpperCase.
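
One common way to keep such a mapping small is a sorted table of code point ranges with a fixed delta, searched with binary search. The sketch below shows the idea with three ranges only; it is not TeaVM's actual table format, and the class name `CaseTable` is hypothetical.

```java
/** Hypothetical range-based upper-case mapping; covers only three sample ranges. */
final class CaseTable {
    // Each row: {first code point, last code point, delta to add}. Sorted by first.
    private static final int[][] RANGES = {
        {0x0061, 0x007A, -32},  // Latin a..z
        {0x0430, 0x044F, -32},  // Cyrillic U+0430..U+044F
        {0x0450, 0x045F, -80},  // Cyrillic U+0450..U+045F
    };

    /** Returns the upper-case code point, or cp itself if no range covers it. */
    static int toUpper(int cp) {
        int lo = 0;
        int hi = RANGES.length - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            int[] range = RANGES[mid];
            if (cp < range[0]) {
                hi = mid - 1;
            } else if (cp > range[1]) {
                lo = mid + 1;
            } else {
                return cp + range[2];
            }
        }
        return cp;
    }

    public static void main(String[] args) {
        System.out.println((char) toUpper('q'));
        System.out.println(Integer.toHexString(toUpper(0x0451)));
    }
}
```

A complete table needs many more rows plus special cases that do not fit a fixed delta, which is where the binary size cost mentioned in the thread comes from.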

Is Jackson that necessary?

No, It's just the nicest thing out there. I'd be happy with anything that gives me a JsonNode or similar.

I'll write a simple JSON parser in an hour. Do you need one?

I already have JSON parser/generator compatible with Jackson

I'd be happy with using that, too. (I can add a switch to my API that specifies which format the data is passed in.) But I think I can't use that because it depends on JS?

This is not a problem, it's easy to fix.

Can't you write your own CBOR parser/generator?

Could, but I shun the effort. I think it would be easier to implement what little classes are missing in TeaVM. It seems like com.google.iot.cbor was missing only a few little things, like a hashCode implementation on Integer. (I hope I'll have some time for PRs for that in the next weeks...)

I'm not sure it's easy to implement them. For example, you mentioned ConcurrentHashMap. People used to think that since there are no threads in JS, ConcurrentHashMap could simply be replaced with HashMap. That's a simplification, because TeaVM replaces threads with its own coroutines. I have an old branch that provides a ConcurrentHashMap implementation friendly to coroutines, but I still have no idea how to test this implementation properly.

Can you tell me more info so that I could reimplement TeaVM-friendly CBOR support

The problem with these binary formats is that next week there's another. Having specific support for one of them is nice, but you'll probably be bombarded with requests for more. Anyway, here is an example of one of these libraries failing.

Isn't Class.getCanonicalName supported in latest versions? I recommend not using Maven Central versions of TeaVM, since I don't publish there anymore. Instead, use the latest build according to the instructions in readme.md.

Address.ofData(array).toLong(). But please note that TeaVM GC can relocate arrays

Oh! Thank you. (I was hoping to be safe from the GC if I call the import in the same statement... But maybe I'm mixing this up with Go semantics.)

Unlike the JVM, TeaVM GC does not relocate objects that are directly reachable from the stack. This means that if method foo calls native function bar and passes an array address there, bar can be sure that the array won't be relocated until bar completes. If native function bar calls Java method foo, which returns the address of an array, the array is not guaranteed to stay there.

@jcaesar

jcaesar commented Sep 17, 2021

I see. TeaVM holds significantly more magic than I expected. I'm considerably more optimistic that I might be able to run Java in my plugin system. (I'm on vacation next week... ;) So not too soon.)

Isn't Class.getCanonicalName supported in latest versions?

It is. (And sorry, I just used the archetype from getting started without thinking. That still comes with 0.6.1 from central.)
Now that I know that the current version has more support, I'll try a few more libraries. Maybe one of them happens to be usable as is.
(You're right, testing data structures like ConcurrentHashMap is tricky. Maybe in this case, one could at least use fuzzing, since execution is deterministic. (If it actually is...?))
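
The fuzzing idea can be sketched as a differential test: replay one seeded sequence of operations against a reference `HashMap` and the candidate map, and report the first step where they disagree. This is a single-threaded sketch with a hypothetical class name; it does not exercise coroutine interleavings, which would need a scheduler hook not shown here.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.Random;
import java.util.TreeMap;

/** Hypothetical differential fuzzer for Map implementations. */
final class MapFuzz {
    /**
     * Applies the same seeded put/get/remove sequence to a reference HashMap
     * and to the candidate. Returns the index of the first step where return
     * values or sizes differ, or -1 if they agree for all steps.
     */
    static int firstDivergence(Map<Integer, Integer> candidate, long seed, int steps) {
        Map<Integer, Integer> reference = new HashMap<>();
        Random rnd = new Random(seed);  // fixed seed makes every run repeatable
        for (int i = 0; i < steps; i++) {
            int op = rnd.nextInt(3);
            Integer key = rnd.nextInt(16);  // small key space so keys repeat often
            Integer value = rnd.nextInt();
            Integer expected;
            Integer actual;
            switch (op) {
                case 0:
                    expected = reference.put(key, value);
                    actual = candidate.put(key, value);
                    break;
                case 1:
                    expected = reference.get(key);
                    actual = candidate.get(key);
                    break;
                default:
                    expected = reference.remove(key);
                    actual = candidate.remove(key);
                    break;
            }
            if (!Objects.equals(expected, actual) || reference.size() != candidate.size()) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // TreeMap is only a placeholder for the implementation under test.
        System.out.println(firstDivergence(new TreeMap<>(), 42L, 10_000));
    }
}
```

Because the seed fixes the whole operation sequence, a reported step index can be replayed exactly when debugging.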

I'll write a simple JSON parser in an hour. Do you need one?

You may leave that exercise to the reader. ;)

3 participants