Parser crashing in native code due to multi-threaded access #11121

Closed
hubertp opened this issue Sep 18, 2024 · 9 comments · Fixed by #11147
Labels: --regression (Important: regression), -compiler, -parser, p-high (Should be completed in the next sprint)

Comments

@hubertp
Collaborator

hubertp commented Sep 18, 2024

C  [libenso_parser.dylib+0x2e81c]  alloc::collections::btree::search::_$LT$impl$u20$alloc..collections..btree..node..NodeRef$LT$BorrowType$C$K$C$V$C$alloc..collections..btree..node..marker..LeafOrInternal$GT$$GT$::search_tree::h2955aec13fd254fd+0x90
C  [libenso_parser.dylib+0x23e98]  Java_org_enso_syntax2_Parser_getUuidHigh+0x2c
j  org.enso.syntax2.Parser.getUuidHigh(JJJ)J+0 org.enso.syntax
j  org.enso.syntax2.Message.getUuid(JJ)Ljava/util/UUID;+6 org.enso.syntax
j  org.enso.syntax2.Tree.deserialize(Lorg/enso/syntax2/Message;)Lorg/enso/syntax2/Tree;+940 org.enso.syntax
j  org.enso.syntax2.Tree.deserialize(Lorg/enso/syntax2/Message;)Lorg/enso/syntax2/Tree;+1902 org.enso.syntax

or

Stack: [0x0000000287a38000,0x0000000287c3b000],  sp=0x0000000287c38920,  free space=2050k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
C  [libenso_parser.dylib+0x2e81c]  alloc::collections::btree::search::_$LT$impl$u20$alloc..collections..btree..node..NodeRef$LT$BorrowType$C$K$C$V$C$alloc..collections..btree..node..marker..LeafOrInternal$GT$$GT$::search_tree::h2955aec13fd254fd+0x90
C  [libenso_parser.dylib+0x23e98]  Java_org_enso_syntax2_Parser_getUuidHigh+0x2c
j  org.enso.syntax2.Parser.getUuidHigh(JJJ)J+0 org.enso.syntax
j  org.enso.syntax2.Message.getUuid(JJ)Ljava/util/UUID;+6 org.enso.syntax
j  org.enso.syntax2.Tree.deserialize(Lorg/enso/syntax2/Message;)Lorg/enso/syntax2/Tree;+940 org.enso.syntax
j  org.enso.syntax2.Tree.deserialize(Lorg/enso/syntax2/Message;)Lorg/enso/syntax2/Tree;+1902 org.enso.syntax
j  org.enso.syntax2.MultiSegmentAppSegment.deserialize(Lorg/enso/syntax2/Message;)Lorg/enso/syntax2/MultiSegmentAppSegment;+17 org.enso.syntax
j  org.enso.syntax2.Tree.deserialize(Lorg/enso/syntax2/Message;)Lorg/enso/syntax2/Tree;+3350 org.enso.syntax
j  org.enso.syntax2.Line.deserialize(Lorg/enso/syntax2/Message;)Lorg/enso/syntax2/Line;+17 org.enso.syntax

hs_err_pid90180.log
hs_err_pid90473.log

@hubertp hubertp added p-high Should be completed in the next sprint -parser --regression Important: regression labels Sep 18, 2024
@hubertp
Collaborator Author

hubertp commented Sep 19, 2024

Encountered another one:

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007d20e0463f1a, pid=849536, tid=849636
#
# JRE version: OpenJDK Runtime Environment GraalVM CE 21.0.2+13.1 (21.0.2+13) (build 21.0.2+13-jvmci-23.1-b30)
# Java VM: OpenJDK 64-Bit Server VM GraalVM CE 21.0.2+13.1 (21.0.2+13-jvmci-23.1-b30, mixed mode, sharing, tiered, jvmci, jvmci compiler, compressed oops, compressed class ptrs, g1 gc, linux-amd64)
# Problematic frame:
# C  [libenso_parser.so+0xaef1a]  alloc::collections::btree::search::_$LT$impl$u20$alloc..collections..btree..node..NodeRef$LT$BorrowType$C$K$C$V$C$alloc..collections..btree..node..marker..LeafOrInternal$GT$$GT$::search_tree::h40e1af357c1cc3f3+0xaa
#
# Core dump will be written. Default location: Core dumps may be processed with "/usr/share/apport/apport -p%p -s%s -c%c -d%d -P%P -u%u -g%g -- %E" (or dumping to /home/hubert/Documents/enso-projects/core.849536)
#
# An error report file with more information is saved as:
# /home/hubert/Documents/enso-projects/hs_err_pid849536.log
[10.884s][warning][os] Loading hsdis library failed
#
# If you would like to submit a bug report, please visit:
#   https://github.com/oracle/graal/issues
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#

hs_err_pid849536.log

The project worked without problems a couple of minutes earlier; it crashed after being re-opened.

@kazcw
Contributor

kazcw commented Sep 19, 2024

I have reproduced this; it is the same issue as #11104.

This was referenced Sep 19, 2024
@kazcw
Contributor

kazcw commented Sep 20, 2024

Adding this debugging code:
c912a55

I found this result:

[ERROR] [2024-09-19T17:45:12.839] [enso.org.enso.interpreter.service.ExecutionService] Evaluation of visualization [Text(Standard.Visualization.Helpers,a -> a.default_visualization.to_js_object.to_json,Vector())] failed in module [Module[Standard.Visualization.Helpers]] with [class java.lang.IllegalStateException]: Race condition detected
[ERROR] [2024-09-19T17:45:12.842] [enso.org.enso.interpreter.service.ExecutionService] Visualization for expression 72abc08d-c464-4ce2-853e-9f96c00ca36b failed: Race condition detected (evaluation result: None)
[ERROR] [2024-09-19T17:45:12.850] [enso.org.enso.interpreter.service.ExecutionService] Evaluation of visualization [Text(Standard.Visualization.Preprocessor,default_preprocessor,Vector())] failed in module [Module[Standard.Visualization.Preprocessor]] with [class java.lang.IllegalStateException]: Race condition detected
[ERROR] [2024-09-19T17:45:12.857] [enso.org.enso.interpreter.service.ExecutionService] Visualization for expression 54541b7d-f9fc-4ebb-81f0-907f3d1a634a failed: Race condition detected (evaluation result: None)
[ERROR] [2024-09-19T17:45:12.860] [enso.org.enso.interpreter.service.ExecutionService] Evaluation of visualization [Text(Standard.Visualization.Helpers,a -> a.default_visualization.to_js_object.to_json,Vector())] failed in module [Module[Standard.Visualization.Helpers]] with [class java.lang.IllegalStateException]: Race condition detected
[ERROR] [2024-09-19T17:45:12.860] [enso.org.enso.interpreter.service.ExecutionService] Visualization for expression b9f7fc7d-aee2-4f31-be3a-7961dbba800c failed: Race condition detected (evaluation result: None)
[ERROR] [2024-09-19T17:45:12.878] [enso.org.enso.interpreter.service.ExecutionService] Evaluation of visualization [Text(Standard.Visualization.Helpers,a -> a.default_visualization.to_js_object.to_json,Vector())] failed in module [Module[Standard.Visualization.Helpers]] with [class java.lang.IllegalStateException]: Race condition detected
[ERROR] [2024-09-19T17:45:12.878] [enso.org.enso.interpreter.service.ExecutionService] Visualization for expression 54541b7d-f9fc-4ebb-81f0-907f3d1a634a failed: Race condition detected (evaluation result: None)
[ERROR] [2024-09-19T17:45:12.880] [enso.org.enso.interpreter.service.ExecutionService] Evaluation of visualization [Text(Standard.Visualization.Preprocessor,default_preprocessor,Vector())] failed in module [Module[Standard.Visualization.Preprocessor]] with [class java.lang.IllegalStateException]: Race condition detected
[ERROR] [2024-09-19T17:45:12.880] [enso.org.enso.interpreter.service.ExecutionService] Visualization for expression 72abc08d-c464-4ce2-853e-9f96c00ca36b failed: Race condition detected (evaluation result: None)
[ERROR] [2024-09-19T17:45:12.882] [enso.org.enso.interpreter.service.ExecutionService] Evaluation of visualization [Text(Standard.Visualization.Helpers,a -> a.default_visualization.to_js_object.to_json,Vector())] failed in module [Module[Standard.Visualization.Helpers]] with [class java.lang.IllegalStateException]: Race condition detected
[ERROR] [2024-09-19T17:45:12.882] [enso.org.enso.interpreter.service.ExecutionService] Visualization for expression d17307e7-8b24-4e4f-8f12-76eca7080241 failed: Race condition detected (evaluation result: None)
[ERROR] [2024-09-19T17:45:12.890] [enso.org.enso.interpreter.service.ExecutionService] Evaluation of visualization [Text(Standard.Visualization.Preprocessor,default_preprocessor,Vector())] failed in module [Module[Standard.Visualization.Preprocessor]] with [class java.lang.IllegalStateException]: Race condition detected
[ERROR] [2024-09-19T17:45:12.890] [enso.org.enso.interpreter.service.ExecutionService] Visualization for expression b9f7fc7d-aee2-4f31-be3a-7961dbba800c failed: Race condition detected (evaluation result: None)
[ERROR] [2024-09-19T17:45:12.892] [enso.org.enso.interpreter.service.ExecutionService] Evaluation of visualization [Text(Standard.Visualization.Preprocessor,default_preprocessor,Vector())] failed in module [Module[Standard.Visualization.Preprocessor]] with [class java.lang.IllegalStateException]: Race condition detected
[ERROR] [2024-09-19T17:45:12.892] [enso.org.enso.interpreter.service.ExecutionService] Visualization for expression d17307e7-8b24-4e4f-8f12-76eca7080241 failed: Race condition detected (evaluation result: None)
[ERROR] [2024-09-19T17:45:12.894] [enso.org.enso.interpreter.service.ExecutionService] Evaluation of visualization [Text(Standard.Visualization.Preprocessor,default_preprocessor,Vector())] failed in module [Module[Standard.Visualization.Preprocessor]] with [class java.lang.IllegalStateException]: Race condition detected
[ERROR] [2024-09-19T17:45:12.894] [enso.org.enso.interpreter.service.ExecutionService] Visualization for expression c64f449c-84b2-4f24-9e4f-4865bb1bd7bd failed: Race condition detected (evaluation result: None)
[ERROR] [2024-09-19T17:45:12.894] [enso.org.enso.interpreter.service.ExecutionService] Evaluation of visualization [Text(Standard.Visualization.Helpers,a -> a.default_visualization.to_js_object.to_json,Vector())] failed in module [Module[Standard.Visualization.Helpers]] with [class java.lang.IllegalStateException]: Race condition detected
[ERROR] [2024-09-19T17:45:12.894] [enso.org.enso.interpreter.service.ExecutionService] Visualization for expression c64f449c-84b2-4f24-9e4f-4865bb1bd7bd failed: Race condition detected (evaluation result: None)

It seems the parser state is being concurrently modified by multiple threads. This is not supported. A parser can be moved between threads (with appropriate locks/fencing), or different threads can have their own parsers, but one parser instance must not be concurrently used by multiple threads.
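As a minimal sketch of the "appropriate locks" option, the following Java wrapper serializes all access to one shared instance. `StatefulParser` and `LockedParser` are hypothetical stand-ins invented for illustration; they are not the actual Enso parser classes, and the "parsing" here is only a placeholder that exercises a reused, non-thread-safe buffer.

```java
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical stand-in for a parser that keeps mutable state between calls. */
final class StatefulParser {
  // Reused scratch buffer; StringBuilder is not safe for concurrent use.
  private final StringBuilder scratch = new StringBuilder();

  String parse(String input) {
    scratch.setLength(0);
    scratch.append('[').append(input.trim()).append(']');
    return scratch.toString();
  }
}

/** Wraps one shared StatefulParser so that only one thread uses it at a time. */
final class LockedParser {
  private final StatefulParser parser = new StatefulParser();
  private final ReentrantLock lock = new ReentrantLock();

  String parse(String input) {
    lock.lock();
    try {
      return parser.parse(input);
    } finally {
      // Always release the lock, even if parsing throws.
      lock.unlock();
    }
  }
}
```

The other option, one parser per thread, avoids lock contention at the cost of one set of buffers per thread.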

@enso-bot

enso-bot bot commented Sep 20, 2024

Keziah Wesley reports a new STANDUP for today (2024-09-19):

Progress: Investigated parser problems, traced to unsupported sharing of parser instance between threads. It should be finished by 2024-09-26.

Next Day: Next day I will be working on the #11121 task. Next task.

mergify bot pushed a commit that referenced this issue Sep 20, 2024
- Fix debug logging for #11088 case--attempt to create an exception that is its own cause fails.
- In case the parser is used after closing, throw an `IllegalStateException` instead of UB. (This case is not known to occur and doesn't seem to be behind the #11121, but we should handle it more safely if it does.)
@kazcw
Contributor

kazcw commented Sep 20, 2024

After a review of parser usage in the backend, I've decided to simplify the parser API to make this kind of bug impossible.

  • Although the parser is logically stateless, it maintains some state between parses to enable a buffer reuse optimization.
  • Currently: The stateful API adds some complexity to parser usage; in particular, unsynchronized sharing of a parser instance between threads causes Undefined Behavior.
  • Plan: I will replace the stateful API with a stateless API; the parser bindings will handle the buffer reuse optimization internally using thread-local storage.
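
To make the plan concrete, here is a rough Java sketch of a stateless (static) entry point that keeps a reusable buffer in thread-local storage. This is only an illustration of the idea: the actual change hides the buffer reuse behind the JNI bindings, and `StatelessParserSketch`, its `parse` method, and the initial buffer size are assumptions made up for this example. The "parse" step is a placeholder that just counts UTF-8 bytes.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical stateless facade; not the actual Enso parser bindings. */
final class StatelessParserSketch {
  // One reusable buffer per thread, so callers never share mutable state.
  private static final ThreadLocal<ByteBuffer> BUFFER =
      ThreadLocal.withInitial(() -> ByteBuffer.allocate(64));

  private StatelessParserSketch() {}

  /** Placeholder for parsing: copies the input into the thread's buffer and returns its byte length. */
  static int parse(CharSequence input) {
    byte[] bytes = input.toString().getBytes(StandardCharsets.UTF_8);
    ByteBuffer buf = BUFFER.get();
    if (buf.capacity() < bytes.length) {
      // Grow the current thread's buffer; other threads are unaffected.
      buf = ByteBuffer.allocate(Math.max(bytes.length, buf.capacity() * 2));
      BUFFER.set(buf);
    }
    buf.clear();
    buf.put(bytes);
    buf.flip();
    return buf.remaining();
  }
}
```

Because callers never hold a parser object, there is nothing for them to share between threads by mistake.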

@kazcw kazcw moved this from ❓New to 🔧 Implementation in Issues Board Sep 20, 2024
kazcw added a commit that referenced this issue Sep 20, 2024
Stateless (static) parser interface. Buffer-reuse optimization is now hidden
behind JNI FFI implementation. Fixes #11121 and prevents similar bugs.
@kazcw kazcw mentioned this issue Sep 20, 2024
4 tasks
@kazcw kazcw added this to the 2024-09 Release milestone Sep 20, 2024
kazcw added a commit that referenced this issue Sep 21, 2024
Stateless (static) parser interface. Buffer-reuse optimization is now hidden
behind JNI FFI implementation. Fixes #11121 and prevents similar bugs.
@JaroslavTulach
Member

This is a very interesting result, @kazcw! So far I've been convinced that our execution is only single-threaded. We know we want to move towards a multi-threaded one, so fixing the parser to support multiple threads is desirable. But it is still surprising.

> Adding this debugging code: c912a55

I'd be interested in knowing the stack traces of callers when the collision in the critical section happens. Possibly this small modification of your code could give us traces of the first two threads that collide.

diff --git lib/rust/parser/generate-java/java/org/enso/syntax2/Parser.java lib/rust/parser/generate-java/java/org/enso/syntax2/Parser.java
index 2c375ee840..12290a230b 100644
--- lib/rust/parser/generate-java/java/org/enso/syntax2/Parser.java
+++ lib/rust/parser/generate-java/java/org/enso/syntax2/Parser.java
@@ -5,8 +5,12 @@ import java.net.URISyntaxException;
 import java.nio.ByteBuffer;
 import java.nio.ByteOrder;
 import java.nio.charset.StandardCharsets;
+import java.util.concurrent.atomic.AtomicInteger;
+import java.util.function.Supplier;
 
 public final class Parser implements AutoCloseable {
+  private final AtomicInteger mutators = new AtomicInteger(0);
+
   private static void initializeLibraries() {
     try {
       System.loadLibrary("enso_parser");
@@ -116,22 +120,39 @@ public final class Parser implements AutoCloseable {
   }
 
   public ByteBuffer parseInputLazy(CharSequence input) {
-    byte[] inputBytes = input.toString().getBytes(StandardCharsets.UTF_8);
-    ByteBuffer inputBuf = ByteBuffer.allocateDirect(inputBytes.length);
-    inputBuf.put(inputBytes);
-    return parseTreeLazy(state, inputBuf);
+    return criticalSection(
+        () -> {
+          byte[] inputBytes = input.toString().getBytes(StandardCharsets.UTF_8);
+          ByteBuffer inputBuf = ByteBuffer.allocateDirect(inputBytes.length);
+          inputBuf.put(inputBytes);
+          return parseTreeLazy(state, inputBuf);
+        });
   }
 
   public Tree parse(CharSequence input) {
-    byte[] inputBytes = input.toString().getBytes(StandardCharsets.UTF_8);
-    ByteBuffer inputBuf = ByteBuffer.allocateDirect(inputBytes.length);
-    inputBuf.put(inputBytes);
-    var serializedTree = parseTree(state, inputBuf);
-    var base = getLastInputBase(state);
-    var metadata = getMetadata(state);
-    serializedTree.order(ByteOrder.LITTLE_ENDIAN);
-    var message = new Message(serializedTree, input, base, metadata);
-    return Tree.deserialize(message);
+    return criticalSection(
+        () -> {
+          byte[] inputBytes = input.toString().getBytes(StandardCharsets.UTF_8);
+          ByteBuffer inputBuf = ByteBuffer.allocateDirect(inputBytes.length);
+          inputBuf.put(inputBytes);
+          var serializedTree = parseTree(state, inputBuf);
+          var base = getLastInputBase(state);
+          var metadata = getMetadata(state);
+          serializedTree.order(ByteOrder.LITTLE_ENDIAN);
+          var message = new Message(serializedTree, input, base, metadata);
+          return Tree.deserialize(message);
+        });
+  }
+
+  private <R> R criticalSection(Supplier<R> action) {
+    if (mutators.getAndIncrement() != 0) {
+      throw new IllegalStateException("Race condition detected. On enter.");
+    }
+    var r = action.get();
+    if (mutators.getAndDecrement() != 1) {
+      throw new IllegalStateException("Race condition detected. On exit.");
+    }
+    return r;
   }
 
   public static String getWarningMessage(Warning warning) {

Once the exception is thrown, there are going to be a lot of warnings, as the counters will be off. Alternatively, call new IllegalStateException("...").printStackTrace() instead of throwing, to make sure the stack trace is visible and the counter gets back to 0 unless there is an error.
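
One possible way to keep the bookkeeping consistent is to reset it in a `finally` block and to remember the stack trace of the thread that is currently inside the section. The sketch below is an assumption-laden variant of the idea in the diff, not the code that was actually committed; `RaceDetector` and its field names are hypothetical. Note that it is not re-entrant: a nested call from the same thread is reported as a collision too.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

/** Hypothetical detector for concurrent entry into a section that must be single-threaded. */
final class RaceDetector {
  // Stack trace captured by the thread currently inside the section, or null if it is free.
  private final AtomicReference<Throwable> owner = new AtomicReference<>();

  <R> R criticalSection(Supplier<R> action) {
    Throwable me = new Throwable("Entered by " + Thread.currentThread());
    if (!owner.compareAndSet(null, me)) {
      // Someone else is inside: report both stack traces (ours and, if still present, theirs).
      Throwable other = owner.get();
      IllegalStateException e =
          new IllegalStateException("Race condition detected by " + Thread.currentThread());
      if (other != null) {
        e.addSuppressed(other);
      }
      throw e;
    }
    try {
      return action.get();
    } finally {
      // Reset even when the action throws, so later calls are not falsely reported.
      owner.set(null);
    }
  }
}
```

The thread that loses the race leaves the marker untouched, so a single collision does not poison subsequent calls.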

@kazcw
Contributor

kazcw commented Sep 23, 2024

@JaroslavTulach Added more detailed diagnostics based on your suggestion, and I found this thread conflict:

Thread 0: Thread[#74,job-pool-3,5,main]
org.enso.syntax/org.enso.syntax2.Parser.parse(Parser.java:173)
org.enso.runtime/org.enso.compiler.core.EnsoParser.parse(EnsoParser.java:38)
org.enso.runtime/org.enso.compiler.Compiler.uncachedParseModule(Compiler.scala:602)
org.enso.runtime/org.enso.compiler.Compiler.parseModule(Compiler.scala:568)
org.enso.runtime/org.enso.compiler.Compiler.$anonfun$runCompilerPipeline$1(Compiler.scala:247)
org.enso.runtime/org.enso.compiler.Compiler.$anonfun$runCompilerPipeline$1$adapted(Compiler.scala:245)
org.enso.runtime/scala.collection.immutable.List.foreach(List.scala:333)
org.enso.runtime/org.enso.compiler.Compiler.runCompilerPipeline(Compiler.scala:245)
org.enso.runtime/org.enso.compiler.Compiler.go$1(Compiler.scala:229)
org.enso.runtime/org.enso.compiler.Compiler.runInternal(Compiler.scala:236)
org.enso.runtime/org.enso.compiler.Compiler.run(Compiler.scala:127)
org.enso.runtime/org.enso.interpreter.instrument.job.EnsureCompiledJob.compile(EnsureCompiledJob.scala:308)
org.enso.runtime/org.enso.interpreter.instrument.job.EnsureCompiledJob.$anonfun$ensureCompiledModule$1(EnsureCompiledJob.scala:118)
org.enso.runtime/scala.Option.map(Option.scala:242)
org.enso.runtime/org.enso.interpreter.instrument.job.EnsureCompiledJob.ensureCompiledModule(EnsureCompiledJob.scala:117)
org.enso.runtime/org.enso.interpreter.instrument.job.EnsureCompiledJob.$anonfun$ensureCompiledFiles$2(EnsureCompiledJob.scala:88)
org.enso.runtime/scala.collection.immutable.List.map(List.scala:246)
org.enso.runtime/scala.collection.immutable.List.map(List.scala:79)
org.enso.runtime/org.enso.interpreter.instrument.job.EnsureCompiledJob.ensureCompiledFiles(EnsureCompiledJob.scala:88)
org.enso.runtime/org.enso.interpreter.instrument.job.EnsureCompiledJob.$anonfun$run$1(EnsureCompiledJob.scala:68)
org.enso.runtime/org.enso.interpreter.instrument.execution.ReentrantLocking.withWriteCompilationLock(ReentrantLocking.scala:93)
org.enso.runtime/org.enso.interpreter.instrument.job.EnsureCompiledJob.run(EnsureCompiledJob.scala:64)
org.enso.runtime/org.enso.interpreter.instrument.job.EnsureCompiledJob.run(EnsureCompiledJob.scala:49)
org.enso.runtime/org.enso.interpreter.instrument.execution.JobExecutionEngine.$anonfun$runInternal$1(JobExecutionEngine.scala:138)
java.base/java.util.concurrent.FutureTask.run(FutureTask.java:317)
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
java.base/java.lang.Thread.run(Thread.java:1583)
org.graalvm.truffle/com.oracle.truffle.polyglot.PolyglotThread.access$001(PolyglotThread.java:53)
org.graalvm.truffle/com.oracle.truffle.polyglot.PolyglotThread$1.execute(PolyglotThread.java:106)
org.graalvm.truffle/com.oracle.truffle.polyglot.PolyglotThread$ThreadSpawnRootNode.executeImpl(PolyglotThread.java:140)
org.graalvm.truffle/com.oracle.truffle.polyglot.PolyglotThread$ThreadSpawnRootNode.execute(PolyglotThread.java:131)
org.graalvm.truffle.runtime/com.oracle.truffle.runtime.OptimizedCallTarget.executeRootNode(OptimizedCallTarget.java:745)
org.graalvm.truffle.runtime/com.oracle.truffle.runtime.OptimizedCallTarget.profiledPERoot(OptimizedCallTarget.java:669)
org.graalvm.truffle.runtime/com.oracle.truffle.runtime.OptimizedCallTarget.callBoundary(OptimizedCallTarget.java:602)
org.graalvm.truffle.runtime/com.oracle.truffle.runtime.OptimizedCallTarget.doInvoke(OptimizedCallTarget.java:586)
org.graalvm.truffle.runtime/com.oracle.truffle.runtime.OptimizedCallTarget.callIndirect(OptimizedCallTarget.java:519)
org.graalvm.truffle.runtime/com.oracle.truffle.runtime.OptimizedCallTarget.call(OptimizedCallTarget.java:500)
org.graalvm.truffle/com.oracle.truffle.polyglot.PolyglotThread.run(PolyglotThread.java:102)

Thread 1: Thread[#75,prioritized-job-pool-1,5,main]
org.enso.syntax/org.enso.syntax2.Parser.parse(Parser.java:173)
org.enso.runtime/org.enso.compiler.core.EnsoParser.parse(EnsoParser.java:38)
org.enso.runtime/org.enso.compiler.Compiler.runInline(Compiler.scala:688)
org.enso.runtime/org.enso.interpreter.node.expression.debug.EvalNode.parseExpression(EvalNode.java:80)
org.enso.runtime/org.enso.interpreter.node.expression.debug.EvalNodeGen.executeAndSpecialize(EvalNodeGen.java:148)
org.enso.runtime/org.enso.interpreter.node.expression.debug.EvalNodeGen.execute(EvalNodeGen.java:99)
org.enso.runtime/org.enso.interpreter.node.expression.builtin.debug.DebugEvalNode.execute(DebugEvalNode.java:28)
org.enso.runtime/org.enso.interpreter.node.expression.builtin.debug.DebugEvalMethodGen.execute(DebugEvalMethodGen.java:145)
org.graalvm.truffle.runtime/com.oracle.truffle.runtime.OptimizedCallTarget.executeRootNode(OptimizedCallTarget.java:745)
org.graalvm.truffle.runtime/com.oracle.truffle.runtime.OptimizedCallTarget.profiledPERoot(OptimizedCallTarget.java:669)
org.graalvm.truffle.runtime/com.oracle.truffle.runtime.OptimizedCallTarget.callBoundary(OptimizedCallTarget.java:602)
org.graalvm.truffle.runtime/com.oracle.truffle.runtime.OptimizedCallTarget.doInvoke(OptimizedCallTarget.java:586)
org.graalvm.truffle.runtime/com.oracle.truffle.runtime.OptimizedCallTarget.callDirect(OptimizedCallTarget.java:535)
org.graalvm.truffle.runtime/com.oracle.truffle.runtime.OptimizedDirectCallNode.call(OptimizedDirectCallNode.java:94)
org.enso.runtime/org.enso.interpreter.node.callable.ExecuteCallNode.callDirect(ExecuteCallNode.java:94)
org.enso.runtime/org.enso.interpreter.node.callable.ExecuteCallNodeGen.executeAndSpecialize(ExecuteCallNodeGen.java:171)
org.enso.runtime/org.enso.interpreter.node.callable.ExecuteCallNodeGen.executeCall(ExecuteCallNodeGen.java:101)
org.enso.runtime/org.enso.interpreter.node.callable.dispatch.LoopingCallOptimiserNode$RepeatedCallNode.executeRepeating(LoopingCallOptimiserNode.java:270)
org.graalvm.truffle/com.oracle.truffle.api.nodes.RepeatingNode.executeRepeatingWithValue(RepeatingNode.java:112)
org.graalvm.truffle.runtime/com.oracle.truffle.runtime.OptimizedOSRLoopNode.profilingLoop(OptimizedOSRLoopNode.java:169)
org.graalvm.truffle.runtime/com.oracle.truffle.runtime.OptimizedOSRLoopNode.execute(OptimizedOSRLoopNode.java:120)
org.enso.runtime/org.enso.interpreter.node.callable.dispatch.LoopingCallOptimiserNode.dispatch(LoopingCallOptimiserNode.java:95)
org.enso.runtime/org.enso.interpreter.node.callable.dispatch.LoopingCallOptimiserNode.cachedDispatch(LoopingCallOptimiserNode.java:69)
org.enso.runtime/org.enso.interpreter.node.callable.dispatch.LoopingCallOptimiserNodeGen.executeAndSpecialize(LoopingCallOptimiserNodeGen.java:153)
org.enso.runtime/org.enso.interpreter.node.callable.dispatch.LoopingCallOptimiserNodeGen.executeDispatch(LoopingCallOptimiserNodeGen.java:130)
org.enso.runtime/org.enso.interpreter.runtime.Module$InvokeMember.evalExpression(Module.java:662)
org.enso.runtime/org.enso.interpreter.runtime.Module$InvokeMember.doInvoke(Module.java:723)
org.enso.runtime/org.enso.interpreter.runtime.ModuleGen$InteropLibraryExports$Cached.executeAndSpecialize(ModuleGen.java:115)
org.enso.runtime/org.enso.interpreter.runtime.ModuleGen$InteropLibraryExports$Cached.invokeMember(ModuleGen.java:104)
org.graalvm.truffle/com.oracle.truffle.api.interop.InteropLibraryGen$CachedDispatch.invokeMember(InteropLibraryGen.java:8549)
org.enso.runtime/org.enso.interpreter.service.ExecutionService$InvokeMemberRootNode.execute(ExecutionService.java:608)
org.graalvm.truffle.runtime/com.oracle.truffle.runtime.OptimizedCallTarget.executeRootNode(OptimizedCallTarget.java:745)
org.graalvm.truffle.runtime/com.oracle.truffle.runtime.OptimizedCallTarget.profiledPERoot(OptimizedCallTarget.java:669)
org.graalvm.truffle.runtime/com.oracle.truffle.runtime.OptimizedCallTarget.callBoundary(OptimizedCallTarget.java:602)
org.graalvm.truffle.runtime/com.oracle.truffle.runtime.OptimizedCallTarget.doInvoke(OptimizedCallTarget.java:586)
org.graalvm.truffle.runtime/com.oracle.truffle.runtime.OptimizedCallTarget.callIndirect(OptimizedCallTarget.java:519)
org.graalvm.truffle.runtime/com.oracle.truffle.runtime.OptimizedCallTarget.call(OptimizedCallTarget.java:500)
org.enso.runtime/org.enso.interpreter.service.ExecutionService.evaluateExpression(ExecutionService.java:296)
org.enso.runtime/org.enso.interpreter.instrument.job.UpsertVisualizationJob$.$anonfun$evaluateVisualizationFunction$1(UpsertVisualizationJob.scala:368)
org.enso.runtime/scala.util.Try$.apply(Try.scala:210)
org.enso.runtime/org.enso.interpreter.instrument.job.UpsertVisualizationJob$.evaluateVisualizationFunction(UpsertVisualizationJob.scala:364)
org.enso.runtime/org.enso.interpreter.instrument.job.UpsertVisualizationJob$.evaluateModuleExpression(UpsertVisualizationJob.scala:445)
org.enso.runtime/org.enso.interpreter.instrument.job.UpsertVisualizationJob$.$anonfun$evaluateVisualizationExpression$2(UpsertVisualizationJob.scala:473)
org.enso.runtime/scala.util.Either.flatMap(Either.scala:352)
org.enso.runtime/org.enso.interpreter.instrument.job.UpsertVisualizationJob$.$anonfun$evaluateVisualizationExpression$1(UpsertVisualizationJob.scala:472)
org.enso.runtime/scala.util.Either.flatMap(Either.scala:352)
org.enso.runtime/org.enso.interpreter.instrument.job.UpsertVisualizationJob$.org$enso$interpreter$instrument$job$UpsertVisualizationJob$$evaluateVisualizationExpression(UpsertVisualizationJob.scala:471)
org.enso.runtime/org.enso.interpreter.instrument.job.UpsertVisualizationJob.$anonfun$run$1(UpsertVisualizationJob.scala:70)
org.enso.runtime/org.enso.interpreter.instrument.execution.ReentrantLocking.withContextLock(ReentrantLocking.scala:217)
org.enso.runtime/org.enso.interpreter.instrument.job.UpsertVisualizationJob.run(UpsertVisualizationJob.scala:68)
org.enso.runtime/org.enso.interpreter.instrument.job.UpsertVisualizationJob.run(UpsertVisualizationJob.scala:42)
org.enso.runtime/org.enso.interpreter.instrument.execution.JobExecutionEngine.$anonfun$runInternal$1(JobExecutionEngine.scala:138)
java.base/java.util.concurrent.FutureTask.run(FutureTask.java:317)
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
java.base/java.lang.Thread.run(Thread.java:1583)
org.graalvm.truffle/com.oracle.truffle.polyglot.PolyglotThread.access$001(PolyglotThread.java:53)
org.graalvm.truffle/com.oracle.truffle.polyglot.PolyglotThread$1.execute(PolyglotThread.java:106)
org.graalvm.truffle/com.oracle.truffle.polyglot.PolyglotThread$ThreadSpawnRootNode.executeImpl(PolyglotThread.java:140)
org.graalvm.truffle/com.oracle.truffle.polyglot.PolyglotThread$ThreadSpawnRootNode.execute(PolyglotThread.java:131)
org.graalvm.truffle.runtime/com.oracle.truffle.runtime.OptimizedCallTarget.executeRootNode(OptimizedCallTarget.java:745)
org.graalvm.truffle.runtime/com.oracle.truffle.runtime.OptimizedCallTarget.profiledPERoot(OptimizedCallTarget.java:669)
org.graalvm.truffle.runtime/com.oracle.truffle.runtime.OptimizedCallTarget.callBoundary(OptimizedCallTarget.java:602)
org.graalvm.truffle.runtime/com.oracle.truffle.runtime.OptimizedCallTarget.doInvoke(OptimizedCallTarget.java:586)
org.graalvm.truffle.runtime/com.oracle.truffle.runtime.OptimizedCallTarget.callIndirect(OptimizedCallTarget.java:519)
org.graalvm.truffle.runtime/com.oracle.truffle.runtime.OptimizedCallTarget.call(OptimizedCallTarget.java:500)
org.graalvm.truffle/com.oracle.truffle.polyglot.PolyglotThread.run(PolyglotThread.java:102)

@kazcw
Contributor

kazcw commented Sep 23, 2024

It seems visualization expression compilation and module compilation use the same Compiler instance from different threads concurrently. Are Compiler and the data it owns (besides Parser) designed for this, or could this cause other problems?

@JaroslavTulach JaroslavTulach changed the title Parser crashing in native code Parser crashing in native code due to multi-threaded access Sep 24, 2024
@JaroslavTulach
Member

> Added more detailed diagnostics ... and I found this thread conflict:

Amazing! Thanks a lot, @kazcw. So, it is DebugEvalNode... I recently had conflicting expectations about it too in #11022. It looks like some of us tend to forget that it does compilation, and that it does so in the middle of execution.

> It seems visualization expression compilation and module compilation use the same Compiler instance from different threads concurrently; is Compiler and its owned data (besides Parser) designed for this, or could this cause other problems?

That's a very good question. So far the common expectation among @4e6, @Akirathan, @hubertp was that compilation is single-threaded. Apparently it is not. Yes, it can have consequences.

Compiler references FreshNameSupply

FreshNameSupply contains a var that is not ready for multi-threaded access. However, the possible damage is low in this case: all that's needed is to avoid duplicated newName results within a single thread, and that is guaranteed (according to my understanding of the Java memory model).
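
If uniqueness across threads ever becomes a requirement, a counter of this kind could be made safe with an atomic increment. The following is only a sketch under that assumption; `FreshNameSupplySketch` and the name format are hypothetical and do not reflect the actual FreshNameSupply implementation.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical thread-safe variant of a fresh-name counter. */
final class FreshNameSupplySketch {
  private final AtomicLong counter = new AtomicLong();

  /** Returns a name that is unique across all threads sharing this instance. */
  String newName() {
    // getAndIncrement is atomic, so no two callers can observe the same value.
    return "<internal-" + counter.getAndIncrement() + ">";
  }
}
```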

Compiler references DefaultPackageRepository

I see attempts to make DefaultPackageRepository ready for multi-threaded use. Let's assume it is until we find evidence that it isn't.

EnsoContext & ModuleScope

In the end, the DebugNode calls the IrToTruffle step, and that may meddle with the internals of runtime structures like EnsoContext and ModuleScope. There has been some effort to make ModuleScope more ready for multi-threaded use (#9914), as it was known to behave badly under multi-threaded access. There are no known problems, but the code remains too convoluted to review. ensureCompiledModule then interacts with ModuleScope & co. too.

Goal

The goal is to get parallel compilation and execution working. A task to execute visualizations in parallel is pending somewhere. We will need to make ModuleScope more robust to achieve that. Other parts of the Enso Compiler (except the parser, which is being fixed in #11147) seem to be at least partly designed for multi-threaded access. We should strive to get multi-threaded usage working.

@farmaazon farmaazon moved this from 🔧 Implementation to 👁️ Code review in Issues Board Sep 24, 2024
jdunkerley pushed a commit that referenced this issue Sep 26, 2024
- Fix debug logging for #11088 case--attempt to create an exception that is its own cause fails.
- In case the parser is used after closing, throw an `IllegalStateException` instead of UB. (This case is not known to occur and doesn't seem to be behind the #11121, but we should handle it more safely if it does.)

(cherry picked from commit e587d56)
@mergify mergify bot closed this as completed in #11147 Sep 27, 2024
@mergify mergify bot closed this as completed in 2891981 Sep 27, 2024
@github-project-automation github-project-automation bot moved this from 👁️ Code review to 🟢 Accepted in Issues Board Sep 27, 2024
@farmaazon farmaazon moved this from 🟢 Accepted to 🗄️ Archived in Issues Board Oct 1, 2024