Implement Truffle compiler control based on HotSpot's CompileBroker compilation activity #10135
...e/src/com.oracle.truffle.runtime/src/com/oracle/truffle/runtime/OptimizedTruffleRuntime.java
"Increase the code cache size using '-XX:ReservedCodeCacheSize=' and/or run with '-XX:+UseCodeCacheFlushing -XX:+MethodFlushing'.");
}
try {
queue.shutdownAndAwaitTermination(100 /* milliseconds */);
Can you describe a bit what you are trying to achieve here? Why would it help to block the entire interpreter thread if this happens?
I must confess that this is something I don't really understand very well. My actual intention is to stop/tear down the Truffle compiler and all associated compiler threads once compilation has been shut down in HotSpot, but I didn't find a good way to do it.
Could you please advise what the best reaction would be if HotSpot is shutting down compilations because of a full code cache without any hope for recovery? I.e. what can we do to run at full speed in the interpreter without any profiling/compilation overhead from that point on?
The fastest way to disable compilation is using OptimizedCallTarget.compilationFailed.
Every time we see a shutdown with a call target, we could disable compilation.
Best idea I have.
I put up an example of what I would do in another comment (no idea how to link draft comments here).
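For illustration only, the idea of disabling compilation per call target once a shutdown is observed might look roughly like the sketch below. `ToyCallTarget`, `maybeRequestCompilation` and `compileRequests` are hypothetical stand-ins invented for this example; they are not the actual `OptimizedCallTarget` API, and the real `compilationFailed` handling may differ.

```java
// Hypothetical toy model, NOT the real OptimizedCallTarget.
final class ToyCallTarget {
    private final String name;
    // Stand-in for a per-target "compilation failed" flag (assumption).
    private boolean compilationFailed;
    private int compileRequests;

    ToyCallTarget(String name) {
        this.name = name;
    }

    /**
     * Called when the target becomes hot. Returns true if a compilation
     * was requested, false if the target stays interpreted.
     */
    boolean maybeRequestCompilation(boolean compilerShutDown) {
        if (compilationFailed) {
            // Cheap check: this target never asks for compilation again.
            return false;
        }
        if (compilerShutDown) {
            // First time this target observes the shutdown: disable it.
            compilationFailed = true;
            return false;
        }
        compileRequests++;
        return true;
    }

    int compileRequests() {
        return compileRequests;
    }

    String name() {
        return name;
    }
}
```

In this sketch a target pays the cost of noticing the shutdown only once; afterwards the first branch short-circuits every further request.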
Thanks a lot for your quick and very helpful review. I've just pushed a new version of my changes which hopefully addresses your requests (except for the right reaction to compilation shutdowns, for which I've asked for additional help). I've now moved the compilation mode handling one level up into … Please let me know what you think. Thank you and best regards,
@chumer, could you please have a look at the updated version?
@simonis Thank you for the changes! I did another round of review. Looks like it's going to be the last one 👍
I can imagine something like the following:
It would be a little tricky to exclude all the JIT compilations and only allow Truffle compilations so the results don't get influenced by JIT compilations. Is this something acceptable as a Truffle test? Do you already have similar tests (i.e. simple tests which repeatedly trigger the compilation of a single method, or tests which call external tools and parse their output)? I'm just asking because the HotSpot JTreg tests have frameworks and tooling for this kind of test. I'm not familiar with the Truffle tests and don't want to re-invent the wheel, so I'd be happy if you could point me to some tests which I could use as a template to start with.
@chumer, could you please have a look at the updated version?
Looks good. Thanks a lot for this contribution, @simonis!
I will take it from here to integrate.
Integrated: #10358
Truffle compilations run in "hosted" mode, i.e. the Truffle runtime triggers compilations independently of HotSpot's `CompileBroker`. But the results of Truffle compilations are still stored as ordinary nmethods in HotSpot's code cache (with the help of the JVMCI method `jdk.vm.ci.hotspot.HotSpotCodeCacheProvider::installCode()`). The regular JIT compilers are controlled by the `CompileBroker`, which is aware of the code cache occupancy. If the code cache runs full, the `CompileBroker` temporarily pauses any subsequent JIT compilations until the code cache gets swept (if running with `-XX:+UseCodeCacheFlushing -XX:+MethodFlushing`, which is the default) or completely shuts down the JIT compilers if running with `-XX:-UseCodeCacheFlushing`.

Truffle compiled methods can contribute significantly to the overall code cache occupancy and they can trigger JIT compilation stalls if they fill the code cache up. But the Truffle framework itself is aware neither of the current code cache occupancy nor of the compilation activity of the `CompileBroker`. If Truffle tries to install a compiled method through JVMCI and the code cache is full, it will silently fail. Currently Truffle interprets such failures as transient errors and basically ignores them. Whenever the corresponding method gets hot again (usually immediately at the next invocation), Truffle will recompile it, just to fail again in the nmethod installation step if the code cache is still full.

When the code cache is tight, this can lead to situations where Truffle unnecessarily and repeatedly compiles methods which can't be installed in the code cache but which produce a significant CPU load. Instead, Truffle should poll HotSpot's `CompileBroker` compilation activity and pause compilations for as long as the `CompileBroker` is pausing JIT compilations (or completely shut down Truffle compilations if the `CompileBroker` has shut down the JIT compilers).

The corresponding JVMCI change is tracked under JDK-8344727: [JVMCI] Export the CompileBroker compilation activity mode for Truffle compiler control.

This PR fixes the problem by checking HotSpot's compilation activity mode in `OptimizedTruffleRuntime::submitForCompilation()` before actually submitting a compilation task to a compile queue. If the compilation activity mode is `RUN_COMPILATION`, the task is submitted as before without any changes. However, if the compilation activity mode is `STOP_COMPILATION` (i.e. the `CompileBroker` has temporarily stopped JIT compilations), we flush the current compile queue (because compiled methods cannot be installed anyway) and return `null`. We also start a timer which can be configured with the new `StoppedCompilationRetryDelay` parameter (defaults to 1000ms). After `StoppedCompilationRetryDelay` we submit a new compilation task, even if the compilation activity mode is not `RUN_COMPILATION`. This can help to trigger a code cache cleanup in situations when there are no JIT compilations, because code cache sweeping is only triggered when new nmethods are installed. Finally, when the compilation activity mode is `SHUTDOWN_COMPILATION`, we simply shut down the compilation queue and issue a warning to inform users about the code cache shortage.

I've manually tested the change by running an octane benchmark with a very small code cache. Before the change, we could see the following results:

As you can see, out of 2255 compilations, 1702 have been to no purpose, because they failed in the final installation step because of a full code cache (i.e. `BailoutException: Code installation failed: code cache is full: 1702`). The other interesting observation is that although the benchmark terminated after 3m11s wall clock time, it actually consumed 11m19s cpu time, because the Truffle compiler threads were continuously doing useless work.

With my changes applied, the picture looks as follows:

As you can see, we compile considerably fewer methods and we only have 16 compilation failures due to a full code cache. At the same time we can see that from the 410 methods which have been enqueued for compilation, 277 have been dequeued because of `Compilation temporary disabled due to full code cache`. Again, the wall clock time of the benchmark was 3m14s, but this time the overall cpu time was just slightly higher at 3m22s, because we no longer do such a huge amount of unnecessary compilations. Also notice how HotSpot's code cache `full_count`, which is incremented every time an nmethod can't be installed because of a full code cache, dropped from 3200 to 33.

These examples were run with `-XX:+UseCodeCacheFlushing -XX:+MethodFlushing`. If the JVM is configured with `-XX:-UseCodeCacheFlushing`, the benefits of this change are even higher.

Fixes #10133
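The gating on the compilation activity mode described above can be sketched roughly as follows. This is a simplified model under stated assumptions, not the code from this PR: `CompilationGate`, its `submit` method, the string task names and the explicit `nowMillis` argument are all invented for illustration. Only the three mode names and the idea of a retry delay come from the description.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical model of the submit-time check; not the Truffle implementation.
final class CompilationGate {
    // Mode names as used in the PR description.
    enum Activity { STOP_COMPILATION, RUN_COMPILATION, SHUTDOWN_COMPILATION }

    private final Deque<String> queue = new ArrayDeque<>();
    private final long retryDelayMillis;   // models StoppedCompilationRetryDelay
    private long stoppedSinceMillis = -1;  // -1 means no stop period is active
    private boolean shutDown;

    CompilationGate(long retryDelayMillis) {
        this.retryDelayMillis = retryDelayMillis;
    }

    /** Returns the task if it was enqueued, or null if it was rejected. */
    String submit(String task, Activity mode, long nowMillis) {
        if (shutDown) {
            return null;
        }
        switch (mode) {
            case RUN_COMPILATION:
                stoppedSinceMillis = -1;
                queue.add(task);
                return task;
            case STOP_COMPILATION:
                if (stoppedSinceMillis < 0) {
                    // Entering a stop period: flush the queue and start the timer.
                    stoppedSinceMillis = nowMillis;
                    queue.clear();
                    return null;
                }
                if (nowMillis - stoppedSinceMillis >= retryDelayMillis) {
                    // Delay elapsed: let one task through so that an nmethod
                    // installation attempt can trigger code cache sweeping.
                    stoppedSinceMillis = nowMillis;
                    queue.add(task);
                    return task;
                }
                return null;
            default: // SHUTDOWN_COMPILATION
                shutDown = true;
                queue.clear();
                // Illustrative message only, not the warning text of the PR.
                System.err.println("Compilation shut down: code cache is full.");
                return null;
        }
    }

    int queued() {
        return queue.size();
    }

    public static void main(String[] args) {
        CompilationGate gate = new CompilationGate(1000);
        System.out.println(gate.submit("a", Activity.RUN_COMPILATION, 0));
        System.out.println(gate.submit("b", Activity.STOP_COMPILATION, 10));
        System.out.println(gate.submit("c", Activity.STOP_COMPILATION, 1010));
    }
}
```

Passing the current time in explicitly keeps the sketch deterministic; a real runtime would presumably read a clock or use a scheduled timer instead.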