Regarding multithreading #481
Hi @TudbuT, there are sane ways to use JavaScript multi-threaded on GraalVM. We have listed them in https://github.com/oracle/graaljs/blob/master/docs/user/Multithreading.md. I don't think we will be easily convinced to adopt and support your suggestion, even in an "experimental" fashion. JavaScript, by the design of the language (ECMAScript), is single-threaded and takes little to no precaution to ensure that sane semantics apply when you execute it in a multi-threaded fashion. Problems we have seen (that were, e.g., exploitable in Nashorn) were around the object model, where severe data corruption occurs when two threads update the same hidden class (objects with identical structure). Maybe @eleinadani can help you achieve a reasonable solution for your problem without enabling full multi-threading.
What kind of data corruption? Maybe I can fix that in my fork. Do you mean [Class].prototype? If so, it would be possible to block only that from multithreading.
@TudbuT certain (internal) Graal.js components are not thread-safe. One example mentioned above by @wirthi is the internal "object model", which handles object shapes (a.k.a. "hidden classes") and object-related metadata. In that case, multiple threads updating the same internal metadata could lead to data corruption (e.g., wrong values stored in any object, not just the one being updated).
That sounds quite hard to implement... I'm sorry if I was wasting your time; for now, I will probably just stick to my fork. To explain why I'm so obsessed with multithreading: I'm using GraalVM to add plugins to a game, and have previously used Nashorn for it, so I used multithreading quite a lot, as Nashorn allowed it. Thanks for your help and information.
I'll leave this open for now; maybe I'll find some kind of solution. I'll experiment further.
Is there any news about multithreading?
I've privately tried making this work, and it seems to require just a few synchronized {} blocks to make all the problems I had encountered before disappear. However, I won't publish this work, as it is very clunky and I have left out a number of places because I am unfamiliar with the codebase.
@eleinadani, is there maybe any way to execute JavaScript code on graaljs like on Nashorn - multi-threaded and concurrently (with potential race conditions)?
@eleinadani for example, my application processes 300 rps (it is not high-load), and every request takes about 3 seconds to process. Do I have to create and hold in memory about 1,000 instances of graaljs? 1,000 instances of graaljs are very expensive in terms of memory... one instance of Nashorn is better than 1,000 instances of graaljs, isn't it? =((
@no0ker using a single engine to serve multiple (concurrent) requests is potentially unsafe, and is not supported in Graal.js for JavaScript code. All the supported models of concurrent execution are in the doc I've linked to above. To reduce memory consumption, you can share a single Engine across multiple independent Context instances. One example can be found in our documentation here. |
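As a concrete illustration of the shared-Engine pattern mentioned in that comment, here is a minimal sketch. It assumes the GraalVM SDK (org.graalvm.polyglot) is on the classpath; the class name and scripts are illustrative, not taken from the official documentation example.

```java
import org.graalvm.polyglot.Context;
import org.graalvm.polyglot.Engine;

public class SharedEngineDemo {
    public static void main(String[] args) {
        // One Engine holds the shared state (including cached compiled code).
        try (Engine engine = Engine.create()) {
            for (int i = 0; i < 3; i++) {
                // Each Context is independent but reuses the Engine,
                // which reduces per-Context memory and warmup cost.
                try (Context ctx = Context.newBuilder("js").engine(engine).build()) {
                    int result = ctx.eval("js", "21 * 2").asInt();
                    System.out.println(result);
                }
            }
        }
    }
}
```

The Contexts here run sequentially; the point is only that they are independent of each other while sharing one Engine.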
Every graaljs instance has many JS libraries... 1,000 instances of graaljs allocate about 1.5 GB of memory (even with a shared Engine and Source cache)...
Hmm... let's go unsafe then - how can I disable the concurrent thread check?
Can I change that check?
Yes, pretty sure. My fork (of graal, not graalJS) has an option for it that you can enable/disable, but I stopped merging from upstream as I am too lazy to fix merge conflicts for 10 minutes every day. |
If you do, all support from our side is off. We don't give any guarantees about what works if you change that. While you should be able to secure your own classes by proper synchronization, problems might occur in core classes like JavaScript's object model (as stated above already). So yes, that might work for some cases, but you will eventually run into typical multithreading problems like lost updates, corrupted objects, and potentially even deadlocks. Again: this is a model we neither endorse nor support. You have been warned. The models we support have been quoted above.
@wirthi Nashorn supports concurrent execution, but graaljs does not; that is a blocker for migrating from Nashorn to graaljs. Creating a polyglot context for every request costs CPU and memory. Creating a polyglot context is slower and more expensive than using one (1) instance of the Nashorn scriptEngine for all requests.
Hi,
The right body to address this criticism to is the ECMAScript TC39, in charge of the ECMAScript/JavaScript specification. JavaScript, by design, is specified with single-threaded evaluation semantics. Unlike other languages, it is NOT designed to allow full multi-threading. There are some models still possible (as linked above).
That is true. And it has been shown repeatedly that this model is fundamentally broken: you can run into corruptions of the object model by out-of-order updates, have lost updates or duplicate entries in objects, etc. You might be lucky to do the proper synchronization yourself; you might be lucky not to exercise any critical code (e.g., putting new properties into objects, or deleting them, or changing their datatype) concurrently; you might be lucky that such problems happen rarely enough that it takes ages until your application runs into one, and then you probably won't notice. But the problems are there and cannot easily be solved by the engine - except with expensive synchronization, with severe impact on execution performance.
As I stated: probably yes, at the cost of execution performance. As there are better models available, we will not work towards providing this one; and being convinced it is hard to get right, we will not support any endeavors in that area. If you run into any problems, see a data race, etc.: you have been warned. Again: there are good models for multithreading in JavaScript supported in Graal.js, but yes, they will most likely require you to rewrite your existing code to match those models. Best,
@wirthi if you can't support concurrent JS execution like Nashorn, why can't you provide an API for fast polyglot context cloning? Create a context (load libraries, init variables, etc.) and quickly deep-clone that context for every request - that looks production-ready. Sorry, but... Nashorn is fast. Nashorn is not expensive. Nashorn has been in production for many years. Nashorn supports concurrent JS execution. Nashorn is great. Nashorn is production-ready. But graaljs... =(
Hi, our Shared Engine Code Caching feature is designed for that case. Yes, there is still some cost for each Context you create, both in terms of footprint and startup time. We try to minimize that as much as possible. For any reasonably large code, the faster code execution should outweigh the cost of creating the new Context. If you have a huge library to load and only execute a tiny amount of JavaScript source code in each Context, it will be measurable, of course. If you can measure any problems with that, please share a reproducible example so we can have a look. But please try to make it a realistic example of the actual use case you have, with respect to the shared code (library) and the JavaScript code you execute for each new Context created. There might be some optimization potential we are still missing, but in general, this is something we have already optimized a lot. Best,
@wirthi a little benchmark is here https://github.com/no0ker/js_pt
Nashorn processed 300 requests (with the library) in time = 360 ms. Is there any way to increase graaljs performance in this case?
Hi @no0ker, thanks for your code; it is much easier to talk about concrete items. There is quite a fundamental difference in how you use Nashorn vs. Graal.js: in your Nashorn variant, you share ALL data between all instances. That can be very dangerous, depending on your usage scenario; data from one customer can easily leak to another one, for instance. But let's assume that's really what you want to do. Then let's do the same in Graal.js! Open one Context, evaluate the library once, and then execute all requests in the same Context without closing it. You can execute several such Contexts on a different thread each; that is fine as long as each Context is always executed by the same thread. You still cannot exchange data between the contexts, but I don't believe you want to do that, at least not in the example you gave here. I am attaching a version where I drafted this behavior based on your example (I modified it and moved everything into one file, got rid of the ThreadPool in favor of a for loop, removed the package, and used a local file instead of a resource for the library; you can of course keep the thread pool as long as you make sure that each thread is using one independent Context). This results in:
I ran 10 iterations here, executing the core user code 100,000 times each. Graal.js wins that one by a factor of 6. Note that I ran on GraalVM, which means that Nashorn is also optimized by GraalVM a bit here. But the impact should not be high, and it is a fair comparison anyway. So again: have several threads that each open a Context, load the library, and then execute in sequence all the code you want - that is a fine model. But be careful: in this model you can very easily pass data from one run to the next; closing the Context would guarantee this cannot happen. (Minor remark: you should also cache the source of the user code, i.e. the call to …) Best,
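The model described in the two comments above (one Context per thread, library loaded once, then many requests executed in sequence on that same thread) can be sketched roughly like this. It assumes the GraalVM SDK is on the classpath; the tiny "library" and "request" scripts are illustrative placeholders, not the attached benchmark code.

```java
import org.graalvm.polyglot.Context;
import org.graalvm.polyglot.Engine;
import org.graalvm.polyglot.Source;

public class ContextPerThread {
    public static void main(String[] args) throws Exception {
        try (Engine engine = Engine.create()) {
            // Cache the per-request source once, as the minor remark suggests.
            Source request = Source.newBuilder("js", "counter = counter + 1;", "req.js")
                    .cached(true).build();
            Thread[] threads = new Thread[4];
            for (int t = 0; t < threads.length; t++) {
                threads[t] = new Thread(() -> {
                    // One Context per thread, never shared: a supported model.
                    try (Context ctx = Context.newBuilder("js").engine(engine).build()) {
                        ctx.eval("js", "var counter = 0;");   // "library", loaded once
                        for (int i = 0; i < 1000; i++) {
                            ctx.eval(request);                // many sequential "requests"
                        }
                        System.out.println(ctx.eval("js", "counter").asInt());
                    }
                });
                threads[t].start();
            }
            for (Thread th : threads) th.join();
        }
    }
}
```

Note how state deliberately persists across "requests" within a thread's Context; as warned above, that is exactly where data could leak from one run to the next.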
@wirthi your benchmark version measures eval time in a single thread... =(
I have no insight into how you would code such a scenario. The point of my solution is: have one Context per thread. How many such threads you start, or how you start them, or how you distribute your requests to those threads (using a queue or so?) - I don't think we care about that. Using a thread pool might not be the ideal abstraction, as you need a 1:1 relation between thread and Context. Best,
Well, my point is that a typical Java Spring application processes all HTTP requests concurrently (not sequentially) on some thread pool.
This is exactly what I mean: sometimes, multithreading is required. Production environments usually run on servers with 32+ threads, and for a reason. I'd also like to add that there are faster alternatives which fully support current specifications and multithreading at the same time. I would, however, personally still prefer Graal, as it is much easier to do Java interop with it. I will also not link these alternatives, out of respect for this (despite the lack of official MT support, very good) project.
I get that people might be irresponsible with multithreading and cause tons of bug reports simply by misusing some experimental multithreading feature. However, I believe it is also important to look at the perspective of someone who needs multithreading for their app to work efficiently. Just because some people might file a (probably quite small) number of bug reports doesn't mean a useful feature should be completely blocked. Such bug reports could also be quickly dismissed if they have to do with multithreading issues, because the feature can be marked as unsupported/experimental. Losing objects and corruption is a very bad issue, but, like you said, this can be avoided with proper synchronization. Adding a log warning and making multithreading a setting won't hurt anyone. You could also synchronize some of the internal Graal methods, so that such operations are always synchronized properly.
And we fully support that. Just not in a "share all" model. |
The problem is not only user code; it is at the core of the language. JavaScript as a language is not prepared to handle that (concurrency). Any modification of an object (by user code) might trigger an internal update of the object model. It might change the storage type of a property that another thread is reading or writing a few clock cycles later: boom. To mitigate, we would need to synchronize every operation that reads or writes the object model. That comes with a high cost that we are not willing to take. Nashorn, to the best of my knowledge, has no strong mitigations for that case. When the project was still active within Oracle, we could trigger such crashes easily. That led to our decision just not to support this model. If you run every thread on a separate Context, …
Christian
I've just been reading through this thread out of curiosity, and I'm really struggling to see what the problem is. If you want to use a thread pool, well, that's the perfect way to think about it. This means you're dealing with a relatively small, fixed number of threads, and as wirthi has said, you only need to init one context per thread (NOT per request!). The cost is fixed and up-front. wirthi has pointed out that you lose some isolation doing this: requests processed on the same thread may leak between them since they share a context... but sharing the same Nashorn instance between all requests would presumably have the same issue, so this isn't a step down. If you have specific objections to this model, I must have missed them. If so, then I apologise for wading in uninformed, but it seems like you're talking past each other a bit, and it would be a shame for this not to be resolved to your satisfaction when I believe the answer is there.
@eliasvasylenko the cost of N graaljs instances is not fixed; the cost is linear.
But for a given run of the program on a given piece of hardware, N should really be fixed and known up front, and the cost of initializing all your contexts should only be incurred once at startup and then amortised over the duration of the run. More specifically, you probably want one thread per processor/core, so the cost per processing core is constant. Which means, in theory (modulo fighting for cache/memory resources), it should actually be pretty close to constant time to init all your contexts. Yes, if you scale the number of threads with the size of your input (rather than the size of your hardware), then the cost will be linear, but that would probably be an engineering mistake.
Sorry to hijack this thread, but a quick announcement. Especially because of this limitation of Graal-JS, and for other reasons, we decided to write a JS engine from scratch. Our needs are very different from the Graal project's - we don't care about pure Node.js or ECMA conformance; we just want a way to script on the JVM at run-time and take advantage of all the JVM features, including concurrency. We have also done some benchmarking; full details here: https://github.com/karatelabs/karate-js
FWIW, I recently opened a PR to add support for ReactJS rendering to Micronaut; that means executing JS with GraalJS in a multi-threaded server. Performance is good: I can get 100k rps on my laptop with all cores saturated (after a few fixes in GraalJS that are now on master). This is done by using a pool of contexts that share an engine. It's not very hard to make a cached pool in which threads grab a context when they need one, execute some JS, and then release it. From measuring, this server doesn't normally go above 32 contexts, and that's probably optimizable. So the question becomes how much memory overhead those contexts have for your use case. For ReactJS rendering it seems OK.
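The context-pool idea described above can be sketched with a plain BlockingQueue. This is a hypothetical minimal version, not the actual Micronaut PR code, and it assumes the GraalVM SDK is on the classpath. A Context may be used by different threads over time, as long as only one thread uses it at any moment, which the pool guarantees.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import org.graalvm.polyglot.Context;
import org.graalvm.polyglot.Engine;

public class ContextPoolDemo {
    public static void main(String[] args) throws Exception {
        final int POOL_SIZE = 4;
        try (Engine engine = Engine.create()) {
            // Pre-create a fixed pool of Contexts sharing one Engine.
            BlockingQueue<Context> pool = new ArrayBlockingQueue<>(POOL_SIZE);
            for (int i = 0; i < POOL_SIZE; i++) {
                pool.add(Context.newBuilder("js").engine(engine).build());
            }
            // A "request": borrow a context, run some JS, return the context.
            Runnable request = () -> {
                try {
                    Context ctx = pool.take();
                    try {
                        System.out.println(ctx.eval("js", "6 * 7").asInt());
                    } finally {
                        pool.put(ctx);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            };
            Thread[] workers = new Thread[8];
            for (int i = 0; i < workers.length; i++) {
                (workers[i] = new Thread(request)).start();
            }
            for (Thread t : workers) t.join();
            for (Context c : pool) c.close();
        }
    }
}
```

With more workers than contexts, requests simply block on `take()` until a context is free, which bounds memory at POOL_SIZE contexts.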
I've recently forked graaljs and forcefully enabled multithreading by removing the throw statement (a very dirty way to do it, I know; I just wanted to test whether multithreading works in practice), and the results were surprisingly good. Compared to other slight interop issues, this one was quite unnoticeable. In fact, during multiple benchmarks, I only got minor inconsistencies, which can be avoided by not benchmarking and instead using it normally, or by writing your code in such a way that two threads don't use the same JS variables too much. During normal use, I've had zero issues with 4 threads accessing the context at once.
This is NOT an actual bug report. I just want to make the suggestion to add an option, for example 'js.EnableMultithreading', and mark it experimental. There are multiple other issues with interop that don't matter in practice, so I think we shouldn't be denied multithreading completely. I might later open a pull request to add an option (if I manage to add one, that is :/ ).