Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OpenJDK java/lang/Thread/virtual/stress/GetStackTraceALotWhenPinned timeout #18908

Open
pshipton opened this issue Feb 6, 2024 · 31 comments · Fixed by adoptium/aqa-tests#5395
Assignees
Labels

Comments

@pshipton
Copy link
Member

pshipton commented Feb 6, 2024

https://openj9-jenkins.osuosl.org/job/Test_openjdk22_j9_sanity.openjdk_s390x_linux_Nightly_testList_0/1/
jdk_lang_0
java/lang/Thread/virtual/stress/GetStackTraceALotWhenPinned.java#id0

https://openj9-artifactory.osuosl.org/artifactory/ci-openj9/Test/Test_openjdk22_j9_sanity.openjdk_s390x_linux_Nightly_testList_0/1/openjdk_test_output.tar.gz

06:20:19  Timeout information:
06:20:19  Running jcmd on process 798231
06:20:19  Dump written to /home/jenkins/workspace/Test_openjdk22_j9_sanity.openjdk_s390x_linux_Nightly_testList_0/aqa-tests/TKG/output_17072162697706/jdk_lang_0/work/scratch/1/core.20240206.111933.798231.0001.dmp
06:20:19  Dump written to /home/jenkins/workspace/Test_openjdk22_j9_sanity.openjdk_s390x_linux_Nightly_testList_0/aqa-tests/TKG/output_17072162697706/jdk_lang_0/work/scratch/1/javacore.20240206.112004.798231.0002.txt
06:20:19  Running jstack on process 798231
06:20:19  2024-02-06T11:20:04.428061278
06:20:19  Virtual machine: 798231 JVM information:
06:20:19  JRE 22 Linux s390x-64-Bit Compressed References 20240205_7 (JIT enabled, AOT enabled)
06:20:19  OpenJ9   - c92d58374fe
06:20:19  OMR      - 410ef0d2314
06:20:19  JCL      - 2a64d9c6fcb based on jdk-22+34
06:20:19  
06:20:19  "main" prio=5 Id=2 WAITING
06:20:19  	at java.base@22-internal/java.lang.Object.waitImpl(Native Method)
06:20:19  	at java.base@22-internal/java.lang.Object.wait(Object.java:255)
06:20:19  	at java.base@22-internal/java.lang.Object.wait(Object.java:221)
06:20:19  	at java.base@22-internal/java.lang.Thread.join(Thread.java:2069)
06:20:19  	at java.base@22-internal/java.lang.Thread.join(Thread.java:2145)
06:20:19  	at app//com.sun.javatest.regtest.agent.MainWrapper.main(MainWrapper.java:85)
06:20:19  
06:20:19  "JIT Compilation Thread-000" prio=10 Id=4 RUNNABLE
06:20:19  
06:20:19  "JIT Compilation Thread-001 Suspended" prio=10 Id=5 RUNNABLE
06:20:19  
06:20:19  "JIT Compilation Thread-002 Suspended" prio=10 Id=6 RUNNABLE
06:20:19  
06:20:19  "JIT Compilation Thread-003 Suspended" prio=10 Id=7 RUNNABLE
06:20:19  
06:20:19  "JIT Compilation Thread-004 Suspended" prio=10 Id=8 RUNNABLE
06:20:19  
06:20:19  "JIT Compilation Thread-005 Suspended" prio=10 Id=9 RUNNABLE
06:20:19  
06:20:19  "JIT Compilation Thread-006 Suspended" prio=10 Id=10 RUNNABLE
06:20:19  
06:20:19  "JIT Diagnostic Compilation Thread-007 Suspended" prio=10 Id=11 RUNNABLE
06:20:19  
06:20:19  "JIT-SamplerThread" prio=10 Id=12 TIMED_WAITING
06:20:19  
06:20:19  "IProfiler" prio=5 Id=13 RUNNABLE
06:20:19  
06:20:19  "Common-Cleaner" prio=8 Id=3 TIMED_WAITING
06:20:19  	at java.base@22-internal/java.lang.Object.waitImpl(Native Method)
06:20:19  	at java.base@22-internal/java.lang.Object.wait(Object.java:255)
06:20:19  	at java.base@22-internal/java.lang.Object.wait(Object.java:221)
06:20:19  	at java.base@22-internal/java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:137)
06:20:19  	at java.base@22-internal/jdk.internal.ref.CleanerImpl.run(CleanerImpl.java:140)
06:20:19  	at java.base@22-internal/java.lang.Thread.run(Thread.java:1582)
06:20:19  	at java.base@22-internal/jdk.internal.misc.InnocuousThread.run(InnocuousThread.java:186)
06:20:19  
06:20:19  "Finalizer thread" prio=5 Id=14 RUNNABLE
06:20:19  
06:20:19  "Concurrent Mark Helper" prio=1 Id=15 RUNNABLE
06:20:19  
06:20:19  "GC Worker" prio=5 Id=16 RUNNABLE
06:20:19  
06:20:19  "GC Worker" prio=5 Id=17 RUNNABLE
06:20:19  
06:20:19  "GC Worker" prio=5 Id=18 RUNNABLE
06:20:19  
06:20:19  "Attach API wait loop" prio=10 Id=21 RUNNABLE
06:20:19  	at java.base@22-internal/openj9.internal.tools.attach.target.IPC.waitSemaphore(Native Method)
06:20:19  	at java.base@22-internal/openj9.internal.tools.attach.target.CommonDirectory.waitSemaphore(CommonDirectory.java:264)
06:20:19  	at java.base@22-internal/openj9.internal.tools.attach.target.WaitLoop.waitForNotification(WaitLoop.java:66)
06:20:19  	at java.base@22-internal/openj9.internal.tools.attach.target.WaitLoop.run(WaitLoop.java:157)
06:20:19  
06:20:19  "MainThread" prio=5 Id=22 RUNNABLE
06:20:19  	at java.base@22-internal/java.lang.Thread.getStackTraceImpl(Native Method)
06:20:19  	at java.base@22-internal/java.lang.Thread.getStackTrace0(Thread.java:2476)
06:20:19  	- locked java.lang.Object@cea6ea87
06:20:19  	at java.base@22-internal/java.lang.Thread.asyncGetStackTrace(Thread.java:2457)
06:20:19  	at java.base@22-internal/java.lang.VirtualThread.asyncGetStackTrace(VirtualThread.java:978)
06:20:19  	at java.base@22-internal/java.lang.Thread.getStackTrace(Thread.java:2442)
06:20:19  	at app//GetStackTraceALotWhenPinned.main(GetStackTraceALotWhenPinned.java:84)
06:20:19  	at java.base@22-internal/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
06:20:19  	at java.base@22-internal/java.lang.reflect.Method.invoke(Method.java:586)
06:20:19  	at app//com.sun.javatest.regtest.agent.MainWrapper$MainTask.run(MainWrapper.java:138)
06:20:19  	at java.base@22-internal/java.lang.Thread.run(Thread.java:1582)
06:20:19  
06:20:19  "ForkJoinPool-1-worker-1" prio=5 Id=24 BLOCKED on java.lang.Object@cea6ea87 owned by "MainThread" Id=22
06:20:19  	at java.base@22-internal/java.lang.VirtualThread.unmount(VirtualThread.java:399)
06:20:19  	at java.base@22-internal/java.lang.VirtualThread.runContinuation(VirtualThread.java:253)
06:20:19  	at java.base@22-internal/java.util.concurrent.ForkJoinTask$RunnableExecuteAction.compute(ForkJoinTask.java:1726)
06:20:19  	at java.base@22-internal/java.util.concurrent.ForkJoinTask$RunnableExecuteAction.compute(ForkJoinTask.java:1717)
06:20:19  	at java.base@22-internal/java.util.concurrent.ForkJoinTask$InterruptibleTask.exec(ForkJoinTask.java:1641)
06:20:19  	at java.base@22-internal/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:507)
06:20:19  	at java.base@22-internal/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1491)
06:20:19  	at java.base@22-internal/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:2073)
06:20:19  	at java.base@22-internal/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:2035)
06:20:19  	at java.base@22-internal/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:187)
06:20:19  
06:20:19  "ForkJoinPool-1-worker-2" prio=5 Id=25 WAITING
06:20:19  	at java.base@22-internal/jdk.internal.misc.Unsafe.park(Native Method)
06:20:19  	at java.base@22-internal/java.util.concurrent.ForkJoinPool.awaitWork(ForkJoinPool.java:2145)
06:20:19  	at java.base@22-internal/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:2036)
06:20:19  	at java.base@22-internal/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:187)
06:20:19  
06:20:19  "VirtualThread-unparker" prio=5 Id=26 WAITING
06:20:19  	at java.base@22-internal/jdk.internal.misc.Unsafe.park(Native Method)
06:20:19  	at java.base@22-internal/java.util.concurrent.locks.LockSupport.park(LockSupport.java:371)
06:20:19  	at java.base@22-internal/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionNode.block(AbstractQueuedSynchronizer.java:519)
06:20:19  	at java.base@22-internal/java.util.concurrent.ForkJoinPool.unmanagedBlock(ForkJoinPool.java:4013)
06:20:19  	at java.base@22-internal/java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3961)
06:20:19  	at java.base@22-internal/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1707)
06:20:19  	at java.base@22-internal/java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1170)
06:20:19  	at java.base@22-internal/java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:899)
06:20:19  	at java.base@22-internal/java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1070)
06:20:19  	at java.base@22-internal/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
06:20:19  	at java.base@22-internal/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
06:20:19  	at java.base@22-internal/java.lang.Thread.run(Thread.java:1582)
06:20:19  	at java.base@22-internal/jdk.internal.misc.InnocuousThread.run(InnocuousThread.java:186)
06:20:19  
06:20:19  "ForkJoinPool-1-worker-3" prio=5 Id=27 WAITING
06:20:19  	at java.base@22-internal/jdk.internal.misc.Unsafe.park(Native Method)
06:20:19  	at java.base@22-internal/java.util.concurrent.ForkJoinPool.awaitWork(ForkJoinPool.java:2145)
06:20:19  	at java.base@22-internal/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:2036)
06:20:19  	at java.base@22-internal/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:187)
06:20:19  
06:20:19  "ForkJoinPool-1-worker-4" prio=5 Id=28 WAITING
06:20:19  	at java.base@22-internal/jdk.internal.misc.Unsafe.park(Native Method)
06:20:19  	at java.base@22-internal/java.util.concurrent.ForkJoinPool.awaitWork(ForkJoinPool.java:2145)
06:20:19  	at java.base@22-internal/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:2036)
06:20:19  	at java.base@22-internal/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:187)
06:20:19  
06:20:19  "file lock watchdog" prio=10 Id=30 TIMED_WAITING
06:20:19  	at java.base@22-internal/java.lang.Object.waitImpl(Native Method)
06:20:19  	at java.base@22-internal/java.lang.Object.wait(Object.java:255)
06:20:19  	at java.base@22-internal/java.lang.Object.wait(Object.java:221)
06:20:19  	at java.base@22-internal/java.util.TimerThread.mainLoop(Timer.java:607)
06:20:19  	at java.base@22-internal/java.util.TimerThread.run(Timer.java:548)
06:20:19  
06:20:19  "Attachment portNumber: 44871" prio=10 Id=32 RUNNABLE
06:20:19  	at java.base@22-internal/openj9.internal.tools.attach.target.DiagnosticUtils.dumpAllThreadsImpl(Native Method)
06:20:19  	at java.base@22-internal/openj9.internal.tools.attach.target.DiagnosticUtils.getThreadInfo(DiagnosticUtils.java:249)
06:20:19  	at java.base@22-internal/openj9.internal.tools.attach.target.DiagnosticUtils.executeDiagnosticCommand(DiagnosticUtils.java:185)
06:20:19  	at java.base@22-internal/openj9.internal.tools.attach.target.Attachment.doCommand(Attachment.java:251)
06:20:19  	at java.base@22-internal/openj9.internal.tools.attach.target.Attachment.run(Attachment.java:161)
06:20:19  
06:20:19  
06:20:19  --- Timeout information end.
06:20:19  2024-02-06T11:19:30.248219936Z 336026 remaining ...
06:20:19  2024-02-06T11:19:30.749430608Z 335815 remaining ...
06:20:19  2024-02-06T11:19:31.405589797Z 335742 remaining ...
06:20:19  2024-02-06T11:19:31.908375978Z 335693 remaining ...
06:20:19  Timeout signalled after 960 seconds
06:20:19  2024-02-06T11:19:32.412877104Z 335644 remaining ...
06:20:19  2024-02-06T11:19:32.922927033Z 335592 remaining ...
06:20:19  2024-02-06T11:20:03.339294832Z 335583 remaining ...
06:20:19  2024-02-06T11:20:03.840195248Z 334276 remaining ...
06:20:19  2024-02-06T11:20:04.342183739Z 332638 remaining ...
06:20:19  2024-02-06T11:20:04.848906147Z 331119 remaining ...
@pshipton
Copy link
Member Author

pshipton commented Feb 6, 2024

https://openj9-jenkins.osuosl.org/job/Test_openjdk22_j9_sanity.openjdk_s390x_linux_Nightly_testList_1/1
jdk_lang_1
java/lang/Thread/virtual/stress/GetStackTraceALotWhenPinned.java#id0

https://openj9-artifactory.osuosl.org/artifactory/ci-openj9/Test/Test_openjdk22_j9_sanity.openjdk_s390x_linux_Nightly_testList_1/1/openjdk_test_output.tar.gz

06:16:53  Timeout information:
06:16:53  Running jcmd on process 31288
06:16:53  Dump written to /home/jenkins/workspace/Test_openjdk22_j9_sanity.openjdk_s390x_linux_Nightly_testList_1/aqa-tests/TKG/output_17072160479527/jdk_lang_1/work/scratch/0/core.20240206.061600.31288.0001.dmp
06:16:53  Dump written to /home/jenkins/workspace/Test_openjdk22_j9_sanity.openjdk_s390x_linux_Nightly_testList_1/aqa-tests/TKG/output_17072160479527/jdk_lang_1/work/scratch/0/javacore.20240206.061630.31288.0002.txt
06:16:53  Running jstack on process 31288
06:16:53  2024-02-06T06:16:33.168758544
06:16:53  Virtual machine: 31288 JVM information:
06:16:53  JRE 22 Linux s390x-64-Bit 20240205_7 (JIT enabled, AOT enabled)
06:16:53  OpenJ9   - c92d58374fe
06:16:53  OMR      - 410ef0d2314
06:16:53  JCL      - 2a64d9c6fcb based on jdk-22+34
06:16:53  
06:16:53  "main" prio=5 Id=2 WAITING
06:16:53  	at java.base@22-internal/java.lang.Object.waitImpl(Native Method)
06:16:53  	at java.base@22-internal/java.lang.Object.wait(Object.java:255)
06:16:53  	at java.base@22-internal/java.lang.Object.wait(Object.java:221)
06:16:53  	at java.base@22-internal/java.lang.Thread.join(Thread.java:2069)
06:16:53  	at java.base@22-internal/java.lang.Thread.join(Thread.java:2145)
06:16:53  	at app//com.sun.javatest.regtest.agent.MainWrapper.main(MainWrapper.java:85)
06:16:53  
06:16:53  "JIT Compilation Thread-000" prio=10 Id=4 RUNNABLE
06:16:53  
06:16:53  "JIT Compilation Thread-001 Suspended" prio=10 Id=5 RUNNABLE
06:16:53  
06:16:53  "JIT Compilation Thread-002 Suspended" prio=10 Id=6 RUNNABLE
06:16:53  
06:16:53  "JIT Compilation Thread-003 Suspended" prio=10 Id=7 RUNNABLE
06:16:53  
06:16:53  "JIT Compilation Thread-004 Suspended" prio=10 Id=8 RUNNABLE
06:16:53  
06:16:53  "JIT Compilation Thread-005 Suspended" prio=10 Id=9 RUNNABLE
06:16:53  
06:16:53  "JIT Compilation Thread-006 Suspended" prio=10 Id=10 RUNNABLE
06:16:53  
06:16:53  "JIT Diagnostic Compilation Thread-007 Suspended" prio=10 Id=11 RUNNABLE
06:16:53  
06:16:53  "JIT-SamplerThread" prio=10 Id=12 TIMED_WAITING
06:16:53  
06:16:53  "IProfiler" prio=5 Id=13 RUNNABLE
06:16:53  
06:16:53  "Common-Cleaner" prio=8 Id=3 TIMED_WAITING
06:16:53  	at java.base@22-internal/java.lang.Object.waitImpl(Native Method)
06:16:53  	at java.base@22-internal/java.lang.Object.wait(Object.java:255)
06:16:53  	at java.base@22-internal/java.lang.Object.wait(Object.java:221)
06:16:53  	at java.base@22-internal/java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:137)
06:16:53  	at java.base@22-internal/jdk.internal.ref.CleanerImpl.run(CleanerImpl.java:140)
06:16:53  	at java.base@22-internal/java.lang.Thread.run(Thread.java:1582)
06:16:53  	at java.base@22-internal/jdk.internal.misc.InnocuousThread.run(InnocuousThread.java:186)
06:16:53  
06:16:53  "Finalizer thread" prio=5 Id=14 RUNNABLE
06:16:53  
06:16:53  "Concurrent Mark Helper" prio=1 Id=15 RUNNABLE
06:16:53  
06:16:53  "GC Worker" prio=5 Id=16 RUNNABLE
06:16:53  
06:16:53  "GC Worker" prio=5 Id=17 RUNNABLE
06:16:53  
06:16:53  "GC Worker" prio=5 Id=18 RUNNABLE
06:16:53  
06:16:53  "Attach API wait loop" prio=10 Id=21 RUNNABLE
06:16:53  	at java.base@22-internal/openj9.internal.tools.attach.target.IPC.waitSemaphore(Native Method)
06:16:53  	at java.base@22-internal/openj9.internal.tools.attach.target.CommonDirectory.waitSemaphore(CommonDirectory.java:264)
06:16:53  	at java.base@22-internal/openj9.internal.tools.attach.target.WaitLoop.waitForNotification(WaitLoop.java:66)
06:16:53  	at java.base@22-internal/openj9.internal.tools.attach.target.WaitLoop.run(WaitLoop.java:157)
06:16:53  
06:16:53  "MainThread" prio=5 Id=22 RUNNABLE
06:16:53  	at app//GetStackTraceALotWhenPinned$Barrier.await(GetStackTraceALotWhenPinned.java:117)
06:16:53  	at app//GetStackTraceALotWhenPinned.main(GetStackTraceALotWhenPinned.java:82)
06:16:53  	at java.base@22-internal/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
06:16:53  	at java.base@22-internal/java.lang.reflect.Method.invoke(Method.java:586)
06:16:53  	at app//com.sun.javatest.regtest.agent.MainWrapper$MainTask.run(MainWrapper.java:138)
06:16:53  	at java.base@22-internal/java.lang.Thread.run(Thread.java:1582)
06:16:53  
06:16:53  "ForkJoinPool-1-worker-1" prio=5 Id=24 WAITING
06:16:53  	at java.base@22-internal/jdk.internal.misc.Unsafe.park(Native Method)
06:16:53  	at java.base@22-internal/java.util.concurrent.ForkJoinPool.awaitWork(ForkJoinPool.java:2145)
06:16:53  	at java.base@22-internal/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:2036)
06:16:53  	at java.base@22-internal/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:187)
06:16:53  
06:16:53  "ForkJoinPool-1-worker-2" prio=5 Id=25 WAITING
06:16:53  	at java.base@22-internal/jdk.internal.misc.Unsafe.park(Native Method)
06:16:53  	at java.base@22-internal/java.util.concurrent.ForkJoinPool.awaitWork(ForkJoinPool.java:2145)
06:16:53  	at java.base@22-internal/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:2036)
06:16:53  	at java.base@22-internal/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:187)
06:16:53  
06:16:53  "ForkJoinPool-1-worker-3" prio=5 Id=26 WAITING
06:16:53  	at java.base@22-internal/jdk.internal.misc.Unsafe.park(Native Method)
06:16:53  	at java.base@22-internal/java.util.concurrent.ForkJoinPool.awaitWork(ForkJoinPool.java:2145)
06:16:53  	at java.base@22-internal/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:2036)
06:16:53  	at java.base@22-internal/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:187)
06:16:53  
06:16:53  "ForkJoinPool-1-worker-4" prio=5 Id=27 TIMED_WAITING
06:16:53  	at java.base@22-internal/jdk.internal.misc.Unsafe.park(Native Method)
06:16:53  	at java.base@22-internal/java.util.concurrent.ForkJoinPool.awaitWork(ForkJoinPool.java:2145)
06:16:53  	at java.base@22-internal/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:2036)
06:16:53  	at java.base@22-internal/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:187)
06:16:53  
06:16:53  "VirtualThread-unparker" prio=5 Id=28 WAITING
06:16:53  	at java.base@22-internal/jdk.internal.misc.Unsafe.park(Native Method)
06:16:53  	at java.base@22-internal/java.util.concurrent.locks.LockSupport.park(LockSupport.java:371)
06:16:53  	at java.base@22-internal/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionNode.block(AbstractQueuedSynchronizer.java:519)
06:16:53  	at java.base@22-internal/java.util.concurrent.ForkJoinPool.unmanagedBlock(ForkJoinPool.java:4013)
06:16:53  	at java.base@22-internal/java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3961)
06:16:53  	at java.base@22-internal/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1707)
06:16:53  	at java.base@22-internal/java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1170)
06:16:53  	at java.base@22-internal/java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:899)
06:16:53  	at java.base@22-internal/java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1070)
06:16:53  	at java.base@22-internal/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
06:16:53  	at java.base@22-internal/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
06:16:53  	at java.base@22-internal/java.lang.Thread.run(Thread.java:1582)
06:16:53  	at java.base@22-internal/jdk.internal.misc.InnocuousThread.run(InnocuousThread.java:186)
06:16:53  
06:16:53  "file lock watchdog" prio=10 Id=30 TIMED_WAITING
06:16:53  	at java.base@22-internal/java.lang.Object.waitImpl(Native Method)
06:16:53  	at java.base@22-internal/java.lang.Object.wait(Object.java:255)
06:16:53  	at java.base@22-internal/java.lang.Object.wait(Object.java:221)
06:16:53  	at java.base@22-internal/java.util.TimerThread.mainLoop(Timer.java:607)
06:16:53  	at java.base@22-internal/java.util.TimerThread.run(Timer.java:548)
06:16:53  
06:16:53  "Attachment portNumber: 36752" prio=10 Id=32 RUNNABLE
06:16:53  	at java.base@22-internal/openj9.internal.tools.attach.target.DiagnosticUtils.dumpAllThreadsImpl(Native Method)
06:16:53  	at java.base@22-internal/openj9.internal.tools.attach.target.DiagnosticUtils.getThreadInfo(DiagnosticUtils.java:249)
06:16:53  	at java.base@22-internal/openj9.internal.tools.attach.target.DiagnosticUtils.executeDiagnosticCommand(DiagnosticUtils.java:185)
06:16:53  	at java.base@22-internal/openj9.internal.tools.attach.target.Attachment.doCommand(Attachment.java:251)
06:16:53  	at java.base@22-internal/openj9.internal.tools.attach.target.Attachment.run(Attachment.java:161)
06:16:53  
06:16:53  
06:16:53  --- Timeout information end.
06:16:53  2024-02-06T11:15:56.354789666Z 334811 remaining ...
06:16:53  2024-02-06T11:15:57.304847040Z 334755 remaining ...
06:16:53  2024-02-06T11:15:57.810945791Z 334619 remaining ...
06:16:53  Timeout signalled after 960 seconds
06:16:53  2024-02-06T11:15:58.334285041Z 334475 remaining ...
06:16:53  2024-02-06T11:15:59.212961794Z 334434 remaining ...
06:16:53  2024-02-06T11:15:59.755126539Z 334421 remaining ...
06:16:53  2024-02-06T11:16:26.853400419Z 334404 remaining ...
06:16:53  2024-02-06T11:16:27.382698916Z 334374 remaining ...
06:16:53  2024-02-06T11:16:28.239463665Z 334372 remaining ...
06:16:53  2024-02-06T11:16:29.005149416Z 334370 remaining ...
06:16:53  2024-02-06T11:16:29.515176039Z 334354 remaining ...
06:16:53  2024-02-06T11:16:30.018537288Z 334315 remaining ...
06:16:53  2024-02-06T11:16:30.531717039Z 334294 remaining ...
06:16:53  2024-02-06T11:16:31.482719414Z 334291 remaining ...
06:16:53  2024-02-06T11:16:32.483652669Z 334250 remaining ...
06:16:53  2024-02-06T11:16:33.156306294Z 334246 remaining ...
06:16:53  2024-02-06T11:16:33.764316794Z 334227 remaining ...
06:16:53  2024-02-06T11:16:34.412770794Z 334225 remaining ...
06:16:53  2024-02-06T11:16:35.055028040Z 334191 remaining ...

@pshipton
Copy link
Member Author

pshipton commented Feb 7, 2024

Excluding the test on zlinux via adoptium/aqa-tests#5042

JasonFengJ9 pushed a commit to adoptium/aqa-tests that referenced this issue Feb 7, 2024
@tajila
Copy link
Contributor

tajila commented Feb 8, 2024

@fengxue-IS Please take a look at this

pshipton added a commit to pshipton/openjdk-tests that referenced this issue Feb 8, 2024
Fix adoptium#5042 which added the
excludes to the wrong files.

Issue eclipse-openj9/openj9#18908

Signed-off-by: Peter Shipton <Peter_Shipton@ca.ibm.com>
@pshipton
Copy link
Member Author

pshipton commented Feb 8, 2024

Corrected the excludes in adoptium/aqa-tests#5051

llxia pushed a commit to adoptium/aqa-tests that referenced this issue Feb 8, 2024
…5051)

Fix #5042 which added the
excludes to the wrong files.

Issue eclipse-openj9/openj9#18908

Signed-off-by: Peter Shipton <Peter_Shipton@ca.ibm.com>
@tajila tajila added the project:loom Used to track Project Loom related work label Feb 26, 2024
@tajila
Copy link
Contributor

tajila commented Mar 4, 2024

@fengxue-IS any update on this?

@fengxue-IS
Copy link
Contributor

I wasn't able to reproduce this failure individually on a zlinux machine, running grinder to see if this is can only be reproduced in the suite.

@fengxue-IS
Copy link
Contributor

fengxue-IS commented Mar 7, 2024

[Grinder 20x run on 20 machines 19/20 passed] (https://hyc-runtimes-jenkins.swg-devops.com/view/Test_grinder/job/Grinder/38416/)
All the passing machines average 2~5mins per iteration, the only failed machine took 8~9 mins for the passing runs and 16+mins for the failed run.

All machines are spec to 4CPU and ~8G RAM. so perf delta isn't related to insufficient core count.

Looking at the failed test's output, its very similar to the original where timeout occurred while the test is still slowly making process. So this doesn't seem to be a functional issue, just performance.

I'll see if modifying the test set up parameters could speed this up.

@fengxue-IS
Copy link
Contributor

fengxue-IS commented Mar 11, 2024

I've tried to modify the test to get a bit more detail of where the perf issue is coming from.

Noticed that the Thread.getStackTrace() API is whats taking majority of the time. after setting the thread state to unmount whenever API is called, that resulted in a 100x perf improvement on the API runtime (ie 1800~3000us -> 20~27us per getStackTrace()). Compared to RI, they only have a 10x perf delta (80~100us -> 10~15us) changing from unsteady state to consistent unmounted state.

This is a perf issue that we should consider to investigate and address by looking into how the getstacktrace call is done for virtual threads.

@pshipton
Copy link
Member Author

pshipton commented May 6, 2024

https://openj9-jenkins.osuosl.org/job/Test_openjdk21_j9_sanity.openjdk_s390x_linux_Nightly_testList_0/217 - rh7-390-2
jdk_lang_j9_0
java/lang/Thread/virtual/stress/GetStackTraceALotWhenPinned.java#id0

timeout

Changes from previous build
6e1e36f...2c34fcf
ibmruntimes/openj9-openjdk-jdk21@10c9cf1...5194f18 - update to 21.0.4

@pshipton
Copy link
Member Author

pshipton commented May 7, 2024

https://openj9-jenkins.osuosl.org/job/Test_openjdk21_j9_sanity.openjdk_s390x_linux_Nightly_testList_0/218 - rh8-390-1
jdk_lang0, jdk_lang_j9_0
java/lang/Thread/virtual/stress/GetStackTraceALotWhenPinned.java#id0

https://openj9-jenkins.osuosl.org/job/Test_openjdk21_j9_sanity.openjdk_s390x_linux_Nightly_testList_1/218 - ub20-390-5
jdk_lang_1
java/lang/Thread/virtual/stress/GetStackTraceALotWhenPinned.java#id0

timeout

https://openj9-jenkins.osuosl.org/job/Test_openjdk21_j9_sanity.openjdk_s390x_linux_OMR_testList_0/107/
jdk_lang_0
java/lang/Thread/virtual/stress/GetStackTraceALotWhenPinned.java#id0

@pshipton pshipton removed the jdk22 label May 9, 2024
@pshipton
Copy link
Member Author

pshipton commented May 9, 2024

There was an update 21.0.3 -> 21.0.4+1 and this has been regularly failing on zlinux ever since. I've set it as a blocker for the next release to figure out what's going on.

@fengxue-IS
Copy link
Contributor

I ll take a look at the update between 21.0.3 and 21.0.4+1

@fengxue-IS
Copy link
Contributor

fengxue-IS commented May 9, 2024

This test looks to be a back port to 21 from 22 in 21.0.4 with modification of a new class VThreadPinner added as a wrapper in the test code. looking at the output of test failures, seem to be same issue as previously mentioned due to performance, the test timed out after ~15mins completing 75~90% of total workload.

I will create a custom build to get some profiling data to see what is causing this bottleneck in performance in getStackTrace() API.

@pshipton
Copy link
Member Author

It just fails once or more in every build, I'm not going to keep reporting them.

@pshipton
Copy link
Member Author

We should probably exclude it.

@fengxue-IS
Copy link
Contributor

I created the PR to exclude test for zLinux until the perf issue is resolved.

The xmac failure looks to be different from zLinux as it triggered an assertion failure during stackwalk, will have to look more into the corefile generated to find the cause.

@fengxue-IS
Copy link
Contributor

I launched a 50x grinder on zlinux with the latest JDK 21 nightly build, all passed, longest is ~4mins / iteration, which is much less than the timeout.

Currently looks like performance isn't preventing test from completing, given that getStackTrace is a slowpath and test targets a very edge scenario, can we remove the blocker tag for this (possibly move this to backlog as well)? @pshipton

Note: this have only be verified on internal machine, I don't have access to verify this on the openj9 jenkins farm, so we should also test this on openj9 jenkins.

@pshipton
Copy link
Member Author

I will remove the blocker. Any update on the xmac failures?

@pshipton pshipton removed the blocker label Jun 12, 2024
@fengxue-IS
Copy link
Contributor

I've been running grinders on xmac, have not been able to reproduce the failure yet

@pshipton
Copy link
Member Author

pshipton commented Jun 12, 2024

Testing using default options on zlinux and xmac
60x across 3 machines https://openj9-jenkins.osuosl.org/view/Test/job/Grinder/3659 - passed on both xmac and AIX

@pshipton
Copy link
Member Author

The grinders passed, we could re-enable the testing and see how it goes. Perhaps some change has improved the behavior, or it's possible the tests only fail when they are run as part of the larger group of tests, and not when run standalone.

@pshipton
Copy link
Member Author

pshipton commented Jun 25, 2024

https://openj9-jenkins.osuosl.org/job/Test_openjdk21_j9_sanity.openjdk_s390x_linux_OMR_testList_0/128 - rh7-390-1
10:14:59 jdk_lang_0 - Test results: passed: 936; error: 1
10:14:59 Failed test cases:
10:14:59 TEST: java/lang/Thread/virtual/stress/GetStackTraceALotWhenPinned.java#id0
10:14:59
10:14:59 jdk_lang_j9_0 - Test results: passed: 936; error: 1
10:14:59 Failed test cases:
10:14:59 TEST: java/lang/Thread/virtual/stress/GetStackTraceALotWhenPinned.java#id0

@pshipton pshipton reopened this Jun 25, 2024
@fengxue-IS
Copy link
Contributor

Output show same timeout issue, 20% of workload completed in the 960s interval for jdk_lang_0, 80% complete for jdk_lang_j9_0

Running a 5 x 10x grinder on the jdk_lang_j9_0 suite to check failure rate as this have not been reproducible as stand alone test.

@pshipton
Copy link
Member Author

Pls exclude the test again on zlinux, it's failing in pretty much every build.

@fengxue-IS
Copy link
Contributor

PR open to disable test on zlinux

The 50x grinder from yesterday passed, so this seem to be specific to the openj9 jenkins environment.

@pshipton
Copy link
Member Author

Removing from the 0.46 milestone.

@JasonFengJ9
Copy link
Member

JDK23 x86-64_windows(win22x86-rtp-rt4-1)

[2024-09-10T09:29:37.028Z] variation: -Xdump:system:none -Xdump:heap:none -Xdump:system:events=gpf+abort+traceassert+corruptcache -XX:-JITServerTechPreviewMessage Mode650
[2024-09-10T09:29:37.912Z] JVM_OPTIONS:  -Xdump:system:none -Xdump:heap:none -Xdump:system:events=gpf+abort+traceassert+corruptcache -XX:-JITServerTechPreviewMessage -XX:-UseCompressedOops 

[2024-09-10T13:03:51.276Z] TEST: java/lang/Thread/virtual/stress/GetStackTraceALotWhenPinned.java#id0

[2024-09-10T13:03:51.288Z] TEST RESULT: Error. Program `C:\Users\jenkins\workspace\Test_openjdk23_j9_sanity.openjdk_x86-64_windows_testList_0\jdkbinary\j2sdk-image\bin\java' timed out (timeout set to 960000ms, elapsed time including timeout handling was 1012676ms).
[2024-09-10T13:03:51.288Z] --------------------------------------------------
[2024-09-10T13:41:09.616Z] Test results: passed: 930; error: 1
[2024-09-10T13:43:52.575Z] Report written to C:\Users\jenkins\workspace\Test_openjdk23_j9_sanity.openjdk_x86-64_windows_testList_0\aqa-tests\TKG\output_17259605494566\jdk_lang_1\report\html\report.html
[2024-09-10T13:43:52.575Z] Results written to C:\Users\jenkins\workspace\Test_openjdk23_j9_sanity.openjdk_x86-64_windows_testList_0\aqa-tests\TKG\output_17259605494566\jdk_lang_1\work
[2024-09-10T13:43:52.575Z] Error: Some tests failed or other problems occurred.
[2024-09-10T13:43:52.575Z] -----------------------------------
[2024-09-10T13:43:52.575Z] jdk_lang_1_FAILED

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants