Prevent multiple fetch + initialize for side modules #14820

warvstar · 2021-08-07T15:04:59Z

So I have two builds of my program, a game engine, one where it's a single 37mb wasm file (5mb br compressed).

With my game engine, I use all threads available on my machine via Navigator.hardwareConcurrency (32). This works fine, as only the .js files are duplicated and my game starts up on my machine within 1-2 seconds with ~700ms of that been fetching and initializing the .js and wasm file and the remaining been engine + game startup time.

Now, I've started experimenting with hot reloading parts of my engine, to do this I've decided to use shared libraries and this has produced about 120 libraries around a few hundred KB and a couple libraries that are 11MB, 17MB. This has made the total size a little larger, around 42mb (6mb compressed), the size is not the issue though.

Now that the code is split among many wasm modules, emscripten is loading up each library (125 modules or 42mb) * threads (32), resulting in 1.3gb of code been downloaded and instantiated.

Is there currently a way to do what I'm trying to achieve in an efficient way, or would I have to modify emscripten for this purpose?

Also, slightly related, I can understand the worker.js files needing to match the threads, but is there not a way around the fetching of the extra non worker .js files?

Thanks

warvstar · 2021-08-07T19:34:07Z

Adding this check if (!ENVIRONMENT_IS_PTHREAD) before preloadDylibs(); inside run() has appeared to have solved the issue for me? Now they are only loaded once and I've tested calling between libraries and main and everything looks to be working fine.

Edit: Looks like I've run into issues when calling functions on other threads... as expected I suppose.

So this cannot work for me as is, because 42mb -> 1.3gb is just not very worthwhile just to use shared modules.

I'm actually mostly interested in runtime dynamic loading game modules, rather than engine modules, and the initial 140+ engine modules could be compiled as one MAIN_MODULE, so another way I've thought about doing this is to just build all the engine modules as a single module with MAIN_MODULE=2 and then at runtime I can load additional modules using emscripten_dlopen();. This looks like it should work, although it adds an 8+mb function (__wasm_apply_data_relocs) to the 40mb main module, this just happens to be too large of a function for Chrome and probably any other browser.

I guess this next question should probably have it's own thread and probably better suited to llvm, but would it not be possible to split __wasm_apply_data_relocs and __wasm_call_ctors into multiple functions if they are too large?

Is there any way this is possible today, if not, what would have to change to make this possible?

kripken · 2021-08-09T23:45:20Z

MAIN_MODULE would be the standard way to do this, yes, and to load libraries in side modules.

What do you mean by the function being too large? Do you get an error? (I would expect a function so large to not be fully optimized, but I'd hope it still runs.)

warvstar · 2021-08-10T02:37:33Z

Combining all those engines modules has caused the (__wasm_apply_data_relocs) method to grow to 8mb with full optimizations, Oz and O3 are similar. The error Chrome gives is this ```ModuleError: CompileError: WebAssembly.instantiate(): size 8101201 > maximum function size 7654321```

…

On Mon, Aug 9, 2021, 5:45 PM Alon Zakai ***@***.***> wrote: MAIN_MODULE would be the standard way to do this, yes, and to load libraries in side modules. What do you mean by the function being too large? Do you get an error? (I would expect a function so large to not be fully optimized, but I'd hope it still runs.) — You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub <#14820 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/ACBQKLZVX2CLJOQYQ6S3SFDT4BSBXANCNFSM5BXPSXWA> .

kripken · 2021-08-10T16:49:47Z

Interesting, yes, that is actually hitting the VM limit then...

As this is emitted by LLVM, I think the fix should be there. Specifically, I assume __wasm_apply_data_relocs is created in wasm-ld, so perhaps it could create sub-functions when it's close to that limit, what do you think @sbc100 ?

sbc100 · 2021-08-10T18:30:34Z

I wonder if we can instead move the relocations into data. Then the code could be tiny (and potentially not linker-generated).

It sounds like you are passing all the side modules on the command line (so that MAIN_MODULE=2 works correctly?) but that you don't actually want to load them on startup? Is that right? If that is correct then we the -sAUTOLOAD_DYLIBS=0 might be what you want?

sbc100 · 2021-08-10T18:31:35Z

Also, dynamic linking + pthreads is experimental right now, and one of the limitation is that runtime loading (loading after startup) basically doesn't work.

warvstar · 2021-08-10T20:45:35Z

It sounds like you are passing all the side modules on the command line (so that MAIN_MODULE=2 works correctly?) but that you don't actually want to load them on startup? Is that right? If that is correct then we the -sAUTOLOAD_DYLIBS=0 might be what you want?

Essentially the engine I'm using has hot reloading for modules, note I'm not talking about emscripten or wasm modules, just code that can be linked statically or as a shared library; on windows this works fine, on the web compiling them all statically works fine without MAIN_MODULE but so far I've not had success compiling them as shared libraries.

Originally, as in my first post, I tried compiling all modules as separate shared libraries (SIDE_MODULE) except for the launch module (MAIN_MODULE with the side modules listed). However it wants to load the side modules on each thread, that's 140x32 times the modules need to load, blowing out the memory on the devices I want to use this on. Even using -s AUTOLOAD_DYLIBS=0 does not avoid the problem, it just allows me to load some libraries later via loadDynamicLibrary but they all need to get instantiated on each thread. I imagine this would be easy to fix once interface types are ready.

For now, I've decided to tackle something slightly easier, instead of compiling all the engine modules as shared, I compile them all statically as one MAIN_MODULE, no shared modules yet - the reason I even added MAIN_MODULE is so that I can at least load up new modules dynamically and hopefully this avoids needing to instantiate the module multiple times.

This would allow me to code new parts of the engine or the game in the browser.

warvstar · 2021-08-10T20:48:34Z

Also, are the relocations even necessary (in the MAIN_MODULE) if the only need is to load libraries at runtime via dlopen? If not, would it be possible to have MAIN_MODULE=3 that only works with runtime / dlopen'ing libraries?

sbc100 · 2021-08-10T21:35:04Z

In theory MAIN_MODULE could/should be built without relocations... but last time I tried to do that I hit a roadblock. The problem is that side modules are loaded before the main module which means that static data region is not fixed in size.. which means addresses are not known at link time. One solution to that problem would be to load the main module first and somehow allocate the space for the side modules using the start of the heap (basically dump sbrk)... but that would also mean that calls from the main module into the side modules are forced to go via JS (as opposed to today where they can be direct but calls in the other direction are via JS).

sbc100 · 2021-08-10T21:35:48Z

Also, even if we make MAIN_MODULE non-relocatable is doesn't solve the general problem of __wasm_apply_data_relocs being large which can also effect SIDE_MODULEs.

dschuff · 2023-01-13T00:26:36Z

@sbc100 I'm guessing a fix for this would probably be dependent on #18376?
Also IIUC there are really 2 separate problems here: one is the title of the issue (redundant fetching/compiling of all modules across the threads, where we should be just postMessage-ing the modules), and the other is that __wasm_apply_data_relocs is too large; and I guess this is particular to use of dynamic modules because a statically linked module wouldn't need the data relocs at all...

sbc100 · 2023-01-13T01:21:31Z

Re: __wasm_apply_data_relocs, yes this only exists when in MAIN_MODULE and SIDE_MODULE binaries.

Re: Using postMessage, this could work, but even with #18376 it can only work in the dlopen case if workers are returning the event loop (which, in general, they don't). It should always work for preloadDylibs though, so we should probably do that.

sbc100 · 2023-03-09T21:17:32Z

Fixing #18552 will help some use cases like this.

dschuff added the dynamic_linking label Jan 13, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Prevent multiple fetch + initialize for side modules #14820

Prevent multiple fetch + initialize for side modules #14820

warvstar commented Aug 7, 2021

warvstar commented Aug 7, 2021 •

edited

Loading

kripken commented Aug 9, 2021

warvstar commented Aug 10, 2021 via email

kripken commented Aug 10, 2021

sbc100 commented Aug 10, 2021

sbc100 commented Aug 10, 2021

warvstar commented Aug 10, 2021 •

edited

Loading

warvstar commented Aug 10, 2021 •

edited

Loading

sbc100 commented Aug 10, 2021

sbc100 commented Aug 10, 2021

dschuff commented Jan 13, 2023

sbc100 commented Jan 13, 2023

sbc100 commented Mar 9, 2023

Prevent multiple fetch + initialize for side modules #14820

Prevent multiple fetch + initialize for side modules #14820

Comments

warvstar commented Aug 7, 2021

warvstar commented Aug 7, 2021 • edited Loading

kripken commented Aug 9, 2021

warvstar commented Aug 10, 2021 via email

kripken commented Aug 10, 2021

sbc100 commented Aug 10, 2021

sbc100 commented Aug 10, 2021

warvstar commented Aug 10, 2021 • edited Loading

warvstar commented Aug 10, 2021 • edited Loading

sbc100 commented Aug 10, 2021

sbc100 commented Aug 10, 2021

dschuff commented Jan 13, 2023

sbc100 commented Jan 13, 2023

sbc100 commented Mar 9, 2023

warvstar commented Aug 7, 2021 •

edited

Loading

warvstar commented Aug 10, 2021 •

edited

Loading

warvstar commented Aug 10, 2021 •

edited

Loading