[web] "Uncaught (in promise) 1991937888" or "RuntimeError: abort(undefined)" when loading multiple sessions of a large model #10957
Comments
From the code comment below, the maximum wasm memory is 2GB by default unless the module is built with the MAXIMUM_MEMORY option. ONNX Runtime Web is not built with that option, so it's limited to 2GB of memory.

// Set the maximum size of memory in the wasm module (in bytes). This is only
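To make the numbers concrete: wasm32 linear memory is sized in 64 KiB pages, and the maximum declared at build time is a hard ceiling regardless of how much RAM the browser could otherwise provide. A rough illustration of that mechanic (not ONNX Runtime Web's actual code):

// Illustration only: wasm memory is measured in 64 KiB pages, and a module
// built with a 2 GB maximum cannot grow past it.
const PAGE_BYTES = 64 * 1024;
const MAX_2GB_PAGES = (2 * 1024 ** 3) / PAGE_BYTES; // 32768 pages
const MAX_4GB_PAGES = (4 * 1024 ** 3) / PAGE_BYTES; // 65536 pages, the wasm32 ceiling (for reference)

const memory = new WebAssembly.Memory({ initial: 256, maximum: MAX_2GB_PAGES });
try {
  memory.grow(MAX_2GB_PAGES); // 256 + 32768 pages exceeds the declared maximum
} catch (e) {
  console.log('grow() rejected:', e); // RangeError, loosely analogous to the abort seen here
}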
@hanbitmyths Thanks for looking into this! A few questions:
1. Probably this can be improved by using a "debug" WebAssembly build (with -s ASSERTIONS=1). However, including another 4 .wasm files in the NPM package would significantly increase the package size, so this needs to be thought through carefully.
2. I see no harm in increasing max_size to 4GB.
3. This cannot be done with the current code, so code changes are needed to make it happen. However, using a singleton wasm instance per JS context is a design decision driven by the goal of saving memory, since there is no need to duplicate the memory used by ORT itself (it also simplifies the implementation of state management). The only benefit would be bypassing the 4GB limit, which may be supported by wasm64 in the future.
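For readers following along, the singleton design mentioned in point 3 can be pictured roughly like this (a sketch of the general pattern only, not ORT Web's actual source; loadOrtWasmModule and createSessionHandle are hypothetical stand-ins):

// Sketch of the singleton pattern described above (not actual ORT Web code).
// Every InferenceSession in the page reuses one wasm module instance, so all
// sessions share a single linear memory, and therefore a single memory cap.
let ortWasmPromise = null;

async function getOrtWasm() {
  if (!ortWasmPromise) {
    ortWasmPromise = loadOrtWasmModule(); // hypothetical internal loader
  }
  return ortWasmPromise;
}

async function createSession(modelBytes) {
  const wasm = await getOrtWasm(); // same instance, same memory, for every session
  return wasm.createSessionHandle(modelBytes); // hypothetical binding
}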
Thanks for looking into this! Would the changes required to support parallel execution of several sessions also allow those individual sessions to have their own separate 4 GB limits? In this OpenAI CLIP demo I'm using ONNX Runtime Web to get the embeddings for a user-provided directory of images, and if the user's machine has 16 GB of RAM and 16 threads, then I'd love it if it were possible to process images at up to ~16x the speed (the model takes ~400 MB of RAM IIRC, so there'd be leftover RAM for the OS and other processes). As an example, ideally it would be as simple as something like:

let session1 = await ort.InferenceSession.create(imageModelUrl, { resourceGroup: "foo" });
let session2 = await ort.InferenceSession.create(imageModelUrl, { resourceGroup: "foo" });
let session3 = await ort.InferenceSession.create(imageModelUrl, { resourceGroup: "bar" });

So in this case session1 and session2 would share one resource group ("foo") while session3 would get its own ("bar").
@fs-eire @hanbitmyths Wondering if there are any updates on this issue? The main two things are:
The first one seems like it just involves changing a single number, and it would unblock me and others on a few projects. Any chance a pull request would be accepted for that?
This should be fixed.
Describe the bug
When trying to initialize several sessions of a large model, I get some errors that aren't very helpful. I can create 4 sessions, but it errors when I try to create a 5th. The model is about 350 MB. I'm using the Wasm backend since the WebGL backend doesn't work for this model due to operator compatibility/support problems (#10031).
Urgency
Not super urgent.
System information
To Reproduce
Minimal reproduction: https://josephrocca.github.io/clip-image-sorter/debug-onnx-several-image-sessions-at-once.html
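For convenience, the linked page boils down to a loop like this sketch (the model URL is a placeholder; the real code is on the page above):

// Sketch of the repro: create sessions of the same ~350 MB model until one fails.
const modelUrl = 'https://example.com/clip-image-encoder.onnx'; // placeholder

const sessions = [];
for (let i = 0; i < 8; i++) {
  try {
    const session = await ort.InferenceSession.create(modelUrl, {
      executionProviders: ['wasm'],
    });
    sessions.push(session);
    console.log(`created session #${sessions.length}`);
  } catch (e) {
    // In my case the 5th create fails, once the shared wasm memory
    // approaches ~2 GB.
    console.error(`failed creating session #${sessions.length + 1}`, e);
    break;
  }
}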
Expected behavior
I expect to be able to create as many sessions as the browser's memory limits allow. The memory usage according to Chrome's Task Manager is about ~2 GB when the error is thrown. The browser allows much more than ~2 GB of memory usage when allocated using ArrayBuffers (up to 16 GB in Chrome, according to this answer), so I'm pretty sure this isn't hitting a browser memory limit, and the wasm module memory limit is 4 GB until we get wasm64/memory64, IIUC. But my knowledge here is limited.

Screenshots
Additional context
The reason I'm trying to create several instances of the same model is that I'm using the model to process a large folder of images, and I'd like to process several images at once. It seems like performance will scale better if I have several sessions with only a single thread each, instead of a single session with all the threads, roughly as in the sketch below.
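Concretely, the usage I have in mind is something like the following sketch (this assumes the memory limitation discussed above gets lifted so several large sessions can coexist; the feed name 'input' is a placeholder):

// Sketch only: one thread per session, with parallelism coming from running
// several single-threaded sessions concurrently (assumes the per-instance
// memory cap is no longer the bottleneck).
ort.env.wasm.numThreads = 1;

const sessions = await Promise.all(
  Array.from({ length: 4 }, () =>
    ort.InferenceSession.create(imageModelUrl, { executionProviders: ['wasm'] })
  )
);

// Fan the images out across the sessions.
async function embedAll(imageTensors) {
  return Promise.all(
    imageTensors.map((tensor, i) => {
      const session = sessions[i % sessions.length];
      return session.run({ input: tensor }); // 'input' is a placeholder feed name
    })
  );
}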