Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Loom: jdk19 OpenJDK Invalid JIT return address ASSERTION FAILED swalk.c:1601 #15939

Closed
tajila opened this issue Sep 21, 2022 · 58 comments · Fixed by #16485 · May be fixed by #16403
Closed

Loom: jdk19 OpenJDK Invalid JIT return address ASSERTION FAILED swalk.c:1601 #15939

tajila opened this issue Sep 21, 2022 · 58 comments · Fixed by #16485 · May be fixed by #16403
Assignees
Labels
comp:jit comp:vm project:loom Used to track Project Loom related work
Milestone

Comments

@tajila
Copy link
Contributor

tajila commented Sep 21, 2022

Likely due to lack of JIT support.

FAILED	java/lang/Thread/virtual/stress/TimedGet.java
			STDOUT:
			Timeout refired 480 times
			34826
			45401
			56520
			67517
			77472
			87823
			98800
			110187
			121007
			131805
			142457
			152339
			162664
			173232
			183909
			194413
			205068
			215806
			226674
			238249
			250000
			STDERR:
			STATUS:Passed.
@tajila tajila added comp:vm comp:jit project:loom Used to track Project Loom related work labels Sep 21, 2022
@tajila tajila added this to the Java 19 milestone Sep 21, 2022
@vij-singh
Copy link

With Loom support now in, are these still an issue?

@tajila
Copy link
Contributor Author

tajila commented Oct 18, 2022

@fengxue-IS Is this test passing now?

@tajila
Copy link
Contributor Author

tajila commented Oct 25, 2022

@fengxue-IS any update on this?

@fengxue-IS
Copy link
Contributor

This test passed with -Xint, failed when JIT is enabled due to Invalid JIT return address.
Similar failure seen on java/lang/Thread/virtual/stress/PingPong.java on aarch64 and x86

Failed: main --enable-preview PingPong LTQ 1000000

I am working on triaging this, may need help from JIT team with understanding this cause

@fengxue-IS
Copy link
Contributor

I was able to get a core locally by directly compiling the test class java/lang/Thread/virtual/stress/PingPong.java and run with java --enable-preview PingPong

the invalid JIT Return address comes from GC walking JIT stacks on carrier thread after the continuation have yielded
Stacktrace:

Thread 29 "ForkJoinPool-1-" hit Breakpoint 1, jitWalkStackFrames (walkState=0x7fffc8753dc0) at /root/openj9-openjdk-jdk19/openj9/runtime/codert_vm/jswalk.c:282
282					walkState->walkThread->privateFlags |= J9_PRIVATE_FLAGS_STACK_CORRUPT;
(gdb) bt
#0  jitWalkStackFrames (walkState=0x7fffc8753dc0) at /root/openj9-openjdk-jdk19/openj9/runtime/codert_vm/jswalk.c:282
#1  0x00007ffff67a46a3 in walkStackFrames (currentThread=<optimized out>, walkState=0x7fffc8753dc0) at /root/openj9-openjdk-jdk19/openj9/runtime/vm/swalk.c:345
#2  0x00007ffff44a516c in GC_VMThreadStackSlotIterator::scanSlots (vmThread=vmThread@entry=0x252200, walkThread=walkThread@entry=0x252200, userData=userData@entry=0x7fffc87540e0, 
    oSlotIterator=oSlotIterator@entry=0x7ffff449ca70 <stackSlotIterator(J9JavaVM*, J9Object**, void*, J9StackWalkState*, void const*)>, includeStackFrameClassReferences=<optimized out>, 
    trackVisibleFrameDepth=<optimized out>) at /root/openj9-openjdk-jdk19/openj9/runtime/gc_structs/VMThreadStackSlotIterator.cpp:129
#3  0x00007ffff449c532 in MM_RootScanner::scanOneThread (this=0x7fffc8754150, env=0x7fff70026948, walkThread=0x252200, localData=0x7fffc87540e0)
    at /root/openj9-openjdk-jdk19/openj9/runtime/gc_base/RootScanner.cpp:542
#4  0x00007ffff449b23f in MM_RootScanner::scanThreads (this=0x7fffc8754150, env=0x7fff70026948) at /root/openj9-openjdk-jdk19/openj9/runtime/gc_base/RootScanner.cpp:508
#5  0x00007ffff449e172 in MM_RootScanner::scanRoots (this=0x7fffc8754150, env=0x7fff70026948) at /root/openj9-openjdk-jdk19/openj9/runtime/gc_base/RootScanner.cpp:972
#6  0x00007ffff4604eab in MM_MarkingDelegate::scanRoots (this=0x7ffff0047770, env=env@entry=0x7fff70026948, processLists=processLists@entry=true)
    at /root/openj9-openjdk-jdk19/openj9/runtime/gc_glue_java/MarkingDelegate.cpp:310
#7  0x00007ffff457ce6c in MM_MarkingScheme::markLiveObjectsRoots (this=<optimized out>, env=env@entry=0x7fff70026948, processLists=processLists@entry=true)
    at /root/openj9-openjdk-jdk19/omr/gc/base/MarkingScheme.cpp:363
#8  0x00007ffff462b8d9 in MM_ParallelMarkTask::run (this=0x7fffc8754230, env=0x7fff70026948) at /root/openj9-openjdk-jdk19/omr/gc/base/ParallelMarkTask.cpp:52
#9  0x00007ffff45951be in MM_ParallelDispatcher::run (this=0x7ffff00474c0, env=0x7fff70026948, task=0x7fffc8754230, newThreadCount=<optimized out>)
    at /root/openj9-openjdk-jdk19/omr/gc/base/ParallelDispatcher.cpp:588
#10 0x00007ffff45b07d5 in MM_ParallelGlobalGC::markAll (this=this@entry=0x7ffff000b740, env=env@entry=0x7fff70026948, initMarkMap=initMarkMap@entry=true)
    at /root/openj9-openjdk-jdk19/omr/gc/base/standard/ParallelGlobalGC.cpp:955
#11 0x00007ffff45b11a0 in MM_ParallelGlobalGC::mainThreadGarbageCollect (this=0x7ffff000b740, env=0x7fff70026948, allocDescription=0x7fffc8754668, initMarkMap=<optimized out>, rebuildMarkBits=<optimized out>)
    at /root/openj9-openjdk-jdk19/omr/gc/base/standard/ParallelGlobalGC.cpp:461
#12 0x00007ffff45ad552 in MM_ParallelGlobalGC::internalGarbageCollect (this=<optimized out>, env=0x7fff70026948, subSpace=<optimized out>, allocDescription=0x7fffc8754668)
    at /root/openj9-openjdk-jdk19/omr/gc/base/standard/ParallelGlobalGC.cpp:1234
#13 0x00007ffff4571ba6 in MM_Collector::garbageCollect (this=0x7ffff000b740, env=0x7fff70026948, callingSubSpace=0x7ffff0080c40, allocateDescription=0x7fffc8754668, gcCode=<optimized out>, 
    objectAllocationInterface=0x7fff70029460, baseSubSpace=0x7ffff00808b0, context=0x0) at /root/openj9-openjdk-jdk19/omr/gc/base/Collector.cpp:500
#14 0x00007ffff4626e00 in MM_MemorySubSpaceFlat::allocationRequestFailed (this=0x7ffff0080c40, env=0x7fff70026948, allocateDescription=0x7fffc8754668, allocationType=MM_MemorySubSpace::ALLOCATION_TYPE_TLH, 
    objectAllocationInterface=0x7fff70029460, baseSubSpace=0x7ffff00808b0, previousSubSpace=0x7ffff00808b0) at /root/openj9-openjdk-jdk19/omr/gc/base/MemorySubSpaceFlat.cpp:154
#15 0x00007ffff4628f18 in MM_MemorySubSpaceGeneric::allocateTLH (this=0x7ffff00808b0, env=0x7fff70026948, allocDescription=0x7fffc8754668, objectAllocationInterface=0x7fff70029460, baseSubSpace=0x0, 
    previousSubSpace=<optimized out>, shouldCollectOnFailure=true) at /root/openj9-openjdk-jdk19/omr/gc/base/MemorySubSpaceGeneric.cpp:377
#16 0x00007ffff459c9d4 in MM_TLHAllocationSupport::refresh (this=this@entry=0x7fff70029510, env=0x7fff70026948, allocDescription=allocDescription@entry=0x7fffc8754668, shouldCollectOnFailure=<optimized out>)
    at /root/openj9-openjdk-jdk19/omr/gc/base/TLHAllocationSupport.cpp:210
#17 0x00007ffff459cbbe in MM_TLHAllocationSupport::allocateFromTLH (this=0x7fff70029510, env=<optimized out>, allocDescription=0x7fffc8754668, shouldCollectOnFailure=<optimized out>)
    at /root/openj9-openjdk-jdk19/omr/gc/base/TLHAllocationSupport.cpp:279
#18 0x00007ffff459b51a in MM_TLHAllocationInterface::allocateObject (this=0x7fff70029460, env=0x7fff70026948, allocDescription=0x7fffc8754668, memorySpace=0x7ffff0080f90, shouldCollectOnFailure=true)
    at /root/openj9-openjdk-jdk19/omr/gc/base/TLHAllocationInterface.cpp:183
#19 0x00007ffff45a179b in MM_AllocateInitialization::allocateAndInitializeObject (omrVMThread=<optimized out>, this=0x7fffc8754650) at /root/openj9-openjdk-jdk19/omr/gc/base/AllocateInitialization.hpp:201
#20 OMR_GC_AllocateObject (omrVMThread=<optimized out>, allocator=allocator@entry=0x7fffc8754650) at /root/openj9-openjdk-jdk19/omr/gc/startup/omrgcalloc.cpp:39
#21 0x00007ffff44aaee6 in J9AllocateObject (vmThread=0x252200, clazz=0x239900, allocateFlags=<optimized out>) at /root/openj9-openjdk-jdk19/openj9/runtime/gc_modron_startup/mgcalloc.cpp:416
#22 0x00007fffefb6e7bd in slow_jitNewObjectImpl (nonZeroTLH=false, checkClassInit=true, currentThread=0x252200) at /root/openj9-openjdk-jdk19/openj9/runtime/codert_vm/cnathelp.cpp:573
#23 old_slow_jitNewObject (currentThread=0x252200) at /root/openj9-openjdk-jdk19/openj9/runtime/codert_vm/cnathelp.cpp:637
#24 0x00007fffefb83471 in jitNewObject () at /root/openj9-openjdk-jdk19/build/linux-x86_64-server-release/vm/runtime/codert_vm/xnathelp.s:1265
#25 0x0000000000252200 in ?? ()
* 29   Thread 0x7fffc8757700 (LWP 13730) "ForkJoinPool-1-" jitWalkStackFrames (walkState=0x7fffc8753dc0) at /root/openj9-openjdk-jdk19/openj9/runtime/codert_vm/jswalk.c:282

The Java stack of thread 29 in DDR:

	!stack 0x0024d700	!j9vmthread 0x0024d700	!j9thread 0x7fff70009788	tid 0x35a1 (13729) // (ForkJoinPool-1-worker-4)
	!stack 0x00252200	!j9vmthread 0x00252200	!j9thread 0x7fff70024790	tid 0x35a2 (13730) // (ForkJoinPool-1-worker-5)
	!stack 0x00255f00	!j9vmthread 0x00255f00	!j9thread 0x7fff70024d08	tid 0x35a3 (13731) // (ForkJoinPool-1-worker-6)
	!stack 0x00259b00	!j9vmthread 0x00259b00	!j9thread 0x7fff5400afc0	tid 0x35a4 (13732) // (ForkJoinPool-1-worker-7)

> !stack 0x252200
<252200> 	                        known but unhandled frame type com.ibm.j9ddr.vm29.pointer.U8Pointer @ 0x00000005 

 FAULT  FAULT  FAULT  FAULT  FAULT  FAULT  FAULT  FAULT  FAULT  FAULT  FAULT  FAULT  FAULT  FAULT  FAULT  FAULT 

<252200> 	!j9method 0x0000000000220338   java/util/concurrent/ForkJoinTask.adapt(Ljava/lang/Runnable;)Ljava/util/concurrent/ForkJoinTask;
<252200> 	!j9method 0x0000000000094E40   java/lang/VirtualThread.submitRunContinuation(Z)V
<252200> 	!j9method 0x0000000000094E80   java/lang/VirtualThread.lazySubmitRunContinuation()V
<252200> 	!j9method 0x0000000000094F20   java/lang/VirtualThread.afterYield()V
<252200> 	!j9method 0x0000000000094E20   java/lang/VirtualThread.runContinuation()V
Oct 31, 2022 2:55:27 PM com.ibm.j9ddr.vm29.events.DefaultEventListener corruptData
WARNING: CorruptDataException thrown walking stack. walkThread = 0x0000000000252200
com.ibm.j9ddr.AddressedCorruptDataException: Invalid JIT return address

@fengxue-IS
Copy link
Contributor

Update
It seems that PingPong test failure is intermittent. using TimedGet produced a more consistent failure for triaging.
I noticed that there seem to be two distinct issues with TimedGet, with JIT enabled, I would see failure due to
Invalid JIT return address
I will try to see if this can be narrowed down to where the PC is being set on stack.

When testing with -Xint there is an issue with corrupted stack object. This seem to be only an issue with gencon, running the test under -Xgcpolicy:balanced / optthruput both successfully completed
Discussed this with @LinHu2016 , and he will be investigating this case.

@LinHu2016
Copy link
Contributor

it looks similar issue as #16212

@0xdaryl
Copy link
Contributor

0xdaryl commented Nov 3, 2022

@nbhuiyan : please investigate this from a JIT perspective.

@babsingh
Copy link
Contributor

babsingh commented Nov 3, 2022

The original timeout no longer occurs. Currently, there are two failures associated to this test:

@nbhuiyan nbhuiyan self-assigned this Nov 3, 2022
@nbhuiyan
Copy link
Member

nbhuiyan commented Nov 4, 2022

I discussed with @fengxue-IS about the JIT invalid return address issue running the PingPong test. Jack reported that running with -Xjit:disableInlining,optLevel=noOpt prevented the crash from happening. I was able to reproduce the test failure locally at around the same failure rate that Jack reported earlier, using the command java --enable-preview -Xmx1g PingPong LTQ 1000000.

-Xjit:disableInlining,optLevel=noOpt preventing the failure from occurring does not necessarily mean that the failure was caused by a bad optimization or something going wrong during inlining. It seems that just adding JIT options significantly reduces the failure rate. I experimented with -Xjit options that are specific to a different platform and would not impact the compilation behaviour on x86, such as -Xjit:TR_DisableZ10, and that was sufficient to reduce the failure rate from 50% to less than 1% (verified using a bash for loop to run the test repeatedly). I needed to obtain verbose logs to get a list of methods compiled for a failed run, and then try narrowing down to the smallest set of method(s) needed to be compiled to reproduce the problem. I had to increase the iteration option of the PingPong test to a ridiculously large number and that seems to bring the failure rate to a reasonable level.

@nbhuiyan
Copy link
Member

nbhuiyan commented Nov 8, 2022

I have obtained a JIT verbose log from a failing run and that contained some of the methods seen in the strack trace from investigating a core dump. I was trying to narrow down to the smallest set of methods that need to be compiled in order for the invalid return address to occur, so that I can focus on what happens during the compilation of those method(s). However, using limit file to try various subsets of methods causes the problem to hide, so I have yet to figure out a way to progress further with that. I have also tried the GC race condition fix in #16290, and still saw the problem happening.

@tajila
Copy link
Contributor Author

tajila commented Nov 10, 2022

@nbhuiyan Do you think this will be resolved within 2 weeks?

@nbhuiyan
Copy link
Member

@tajila

Do you think this will be resolved within 2 weeks?

It's possible, depending on what the actual cause is. Examining the JIT logs of the methods leading up to the frame where invalid JIT return address occurs during a failed run does not show anything going wrong there, so I am starting to somewhat suspect that the problem may not be JIT-related.

@nbhuiyan
Copy link
Member

nbhuiyan commented Nov 17, 2022

Some updates with this investigation:

  • The invalid JIT return address is always within the JIT code cache, in an unused region between two methods that can vary every run
  • Examining the stack slots show a number of inlined JIT frames in a JIT-ed method. Explicitly preventing every single one of those methods from getting inlined reproduces the failure with the root method (ForkJoinTask.doExec) at the top.
  • It was possible to reproduce the failure with ForkJoinTask.doExec prevented from getting compiled, but this time we see a failure elsewhere, and get either a segfault or another JIT invalid return address error:

Segfault case, where we end up in the invalid JIT code cache region:

Unhandled exception
Type=Segmentation error vmState=0x00000000
J9Generic_Signal_Number=00000018 Signal_Number=0000000b Error_Value=00000000 Signal_Code=00000001
Handler1=00007FC61DCB0F10 Handler2=00007FC61D4B41D0 InaccessibleAddress=0000000000000000
RDI=000000000000001F RSI=0000000000000001 RAX=0000000000000000 RBX=0000000000217B00
RCX=0000000000000020 RDX=00000000003D9BE1 R8=0000000000000001 R9=00000000FFEDEF88
R10=000000000021CA78 R11=00000000C0092938 R12=00000000FFEDEFE8 R13=0000000000092B88
R14=00000000C0000000 R15=0000000000227600
RIP=00007FC5FC4EC0CB GS=0000 FS=0000 RSP=000000000021CB20
EFlags=0000000000010206 CS=0033 RBP=000000000023C000 ERR=0000000000000006
TRAPNO=000000000000000E OLDMASK=0000000000000000 CR2=0000000000000000
xmm0 0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm1 00000000003d9be0 (f: 4037600.000000, d: 1.994839e-317)
xmm2 000000000021caf8 (f: 2214648.000000, d: 1.094181e-317)
xmm3 000000000021cc99 (f: 2215065.000000, d: 1.094388e-317)
xmm4 00007fc61de37cca (f: 501447872.000000, d: 6.941073e-310)
xmm5 0000000000000038 (f: 56.000000, d: 2.766768e-322)
xmm6 0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm7 00000000c00c7dc0 (f: 3222044160.000000, d: 1.591901e-314)
xmm8 00000000001d35e8 (f: 1914344.000000, d: 9.458116e-318)
xmm9 0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm10 0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm11 0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm12 0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm13 0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm14 0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm15 0000000000000000 (f: 0.000000, d: 0.000000e+00)

Compiled_method=unknown (In JIT code segment 00007FC6180C7F78 but no method found)

JIT invalid return address case, where we fail during stackwalk:

*** Invalid JIT return address 00007FD2434CD649 in 00007FD240472440

Looking at how we end up with the invalid JIT return address:

...
#14 0x00007fd264e208f3 in invalidJITReturnAddress (
    walkState=<optimized out>)
    at /root/builds/openj9-openjdk-jdk19/openj9/runtime/vm/swalk.c:1632
#15 0x00007fd25f4a74b5 in jitWalkStackFrames (
    walkState=0x7fd240472440)
    at /root/builds/openj9-openjdk-jdk19/openj9/runtime/codert_vm/jswalk.c:283
#16 0x00007fd264e1f6c5 in walkStackFrames (
    currentThread=<optimized out>, walkState=0x7fd240472440)
    at /root/builds/openj9-openjdk-jdk19/openj9/runtime/vm/swalk.c:384
#17 0x00007fd25ec7f8a1 in jitMethodSampleInterrupt (
    vmThread=0x259d00, handlerKey=<optimized out>,
    userData=<optimized out>)
    at /root/builds/openj9-openjdk-jdk19/openj9/runtime/compiler/control/HookedByTheJit.cpp:1278
#18 0x00007fd264dde507 in dispatchAsyncEvents (
    currentThread=0x259d00, asyncEventFlags=1)
    at /root/builds/openj9-openjdk-jdk19/openj9/runtime/vm/hookableAsync.c:238
#19 0x00007fd264db5de8 in javaCheckAsyncMessages (
    currentThread=0x259d00, throwExceptions=0)
    at /root/builds/openj9-openjdk-jdk19/openj9/runtime/vm/AsyncMessageHandler.cpp:61
#20 0x00007fd25f4931e5 in old_slow_jitCheckAsyncMessages (
    currentThread=0x259d00)
    at /root/builds/openj9-openjdk-jdk19/openj9/runtime/codert_vm/cnathelp.cpp:1431
#21 0x00007fd25f4aa8d6 in jitCheckAsyncMessages ()
    at /root/builds/openj9-openjdk-jdk19/build/linux-x86_64-server-release/vm/runtime/codert_vm/xnathelp.s:1780
#22 0x0000000000259d00 in ?? ()   <--- J9VMThread
#23 0x00007fd240472b50 in ?? ()   <--- Ptr Obj - {java/util/concurrent/ForkJoinTask$AdaptedRunnableAction} +8
#24 0x00000000ffe52a90 in ?? ()   <--- Obj - {java/util/concurrent/ForkJoinTask$AdaptedRunnableAction}
#25 0x0000000000217700 in ?? ()   <--- J9Class - {java/lang/VirtualThread$$Lambda$8/0x0000000000000000}
#26 0x0000000000071c1f in ?? ()   <--- CP - {java/lang/Module} + 1551
#27 0x0000000000227600 in ?? ()   <--- J9Class - {java/util/concurrent/ForkJoinTask}
#28 0x0000000000259d00 in ?? ()   <--- J9VMThread
#29 0x0000000000010000 in ?? ()   <--- unknown
#30 0x00007fd264f63cca in j9_void_void ()
   from /root/builds/openj9-openjdk-jdk19/build/linux-x86_64-server-release/images/jdk/lib/default/libj9vm29.so
#31 0x0000000000000000 in ?? ()

The stack walker is triggered through jitCheckAsyncMessages, and the J9VMThread is 0x259d00. Here is what we have in that J9VMThread at the time of the crash:

J9VMThread at 0x259d00 {
  Fields for J9VMThread:
	0x0: struct JNINativeInterface_* functions = !jninativeinterface_ 0x00007FD2651A6B60
	0x8: struct J9JavaVM* javaVM = !j9javavm 0x00007FD26000FA80
	0x10: UDATA* arg0EA = !j9x 0x0000000000259A98
	0x18: UDATA* bytecodes = !j9x 0x0000000000000000
	0x20: UDATA* sp = !j9x 0x0000000000259A78
	0x28: U8* pc = !j9x 0x0000000000000005
	0x30: struct J9Method* literals = !j9method 0x0000000000000000
	0x38: UDATA jitStackFrameFlags = 0x0000000000000000 (0)
	0x40: struct J9Object* jitException = !j9object 0x0000000000000000
	0x48: struct J9Object* currentException = !j9object 0x0000000000000000
	0x50: UDATA* stackOverflowMark = !j9x 0x0000000000259450
	0x58: UDATA* stackOverflowMark2 = !j9x 0x0000000000259450
	0x60: U8* heapAlloc = !j9x 0x00000000FFE52AD8 // ""
	0x68: U8* heapTop = !j9x 0x00000000FFE53018 // "���"
	0x70: IDATA tlhPrefetchFTA = 0x0000000000000000 (0)
	0x78: U8* nonZeroHeapAlloc = !j9x 0x0000000000000000
	0x80: U8* nonZeroHeapTop = !j9x 0x0000000000000000
	0x88: IDATA nonZeroTlhPrefetchFTA = 0x0000000000000000 (0)
	0x90: struct J9ThreadMonitor* publicFlagsMutex = !j9threadmonitor 0x00007FD1D8009428
	0x98: UDATA publicFlags = 0x0000000000000020 (32)
	0xa0: UDATA compressObjectReferences = 0x0000000000000001 (1)
	0xa8: struct J9Object* threadObject = !j9object 0x00000000C00A3610 // jdk/internal/misc/CarrierThread
	0xb0: void* lowTenureAddress = !j9x 0x00000000C0000000
	0xb8: void* highTenureAddress = !j9x 0x00000000C0600000
	0xc0: void* heapBaseForActiveCardTable = !j9x 0x0000000000000000
	0xc8: void* activeCardTableBase = !j9x 0x00007FD25C93C000
	0xd0: UDATA heapSizeForActiveCardTable = 0x0000000000000000 (0)
	0xd8: void* heapBaseForBarrierRange0 = !j9x 0x00000000C0000000
	0xe0: UDATA heapSizeForBarrierRange0 = 0x0000000000600000 (6291456)
	0xe8: UDATA* jniLocalReferences = !j9x 0x0000000000000000
	0xf0: UDATA tempSlot = 0x00007FD25F4B4810 (140541518628880)
	0xf8: void* jitReturnAddress = !j9x 0x00007FD2434CD649
	0x100: void* floatTemp1 = !j9x 0x00007FD2434CD545
	0x108: void* floatTemp2 = !j9x 0x00000000FFE90018
	0x110: void* floatTemp3 = !j9x 0x00007FD2434BF0D4
	0x118: void* floatTemp4 = !j9x 0x0000000000000000
	0x120: UDATA returnValue = 0x00007FD280000000 (140542067343360)
	0x128: UDATA returnValue2 = 0x0000000000000001 (1)
	0x130: UDATA* objectFlagSpinLockAddress = !j9x 0x0000000000000000
	0x138: struct J9JavaStack* stackObject = !j9javastack 0x0000000000256C50
	0x140: struct J9Thread* osThread = !j9thread 0x00007FD1E4010540
	0x148: UDATA inspectionSuspendCount = 0x0000000000000000 (0)
	0x150: UDATA inspectorCount = 0x0000000000000000 (0)
	0x158: U32 eventFlags = 0x00000010 (16)
	0x15c: U32 osrFrameIndex = 0x00000000 (0)
	0x160: void* codertTOC = !j9x 0x0000000000000000
	0x168: U8* cardTableVirtualStart = !j9x 0x00007FD25C33C000 // "9ddr/vm29/pointer/StructurePointer"
	0x170: struct J9Object* stopThrowable = !j9object 0x0000000000000000
	0x178: struct J9Object* outOfMemoryError = !j9object 0x00000000C00A3698 // java/lang/OutOfMemoryError
	0x180: UDATA* jniCurrentReference = !j9x 0x0000000000000000
	0x188: UDATA* jniLimitReference = !j9x 0x0000000000000000
	0x190: struct J9VMThread* linkNext = !j9vmthread 0x000000000025DA00
	0x198: struct J9VMThread* linkPrevious = !j9vmthread 0x0000000000233500
	0x1a0: UDATA privateFlags = 0x0000000000040002 (262146)
	0x1a8: UDATA jitTOC = 0x0000000000000000 (0)
	0x1b0: UDATA ferReturnType = 0x0000000000000000 (0)
	0x1b8: U64 ferReturnValue = 0x0000000000000000 (0)
	0x1c0: U64 mgmtBlockedTimeTotal = 0x0000000000000000 (0)
	0x1c8: U64 mgmtBlockedTimeStart = 0x0000000000000000 (0)
	0x1d0: U64 mgmtWaitedTimeTotal = 0x0000000000000000 (0)
	0x1d8: U64 mgmtWaitedTimeStart = 0x0000000000000000 (0)
	0x1e0: UDATA jniVMAccessCount = 0x0000000000000000 (0)
	0x1e8: UDATA debugEventData1 = 0x0000000000000000 (0)
	0x1f0: UDATA debugEventData2 = 0x0000000000000000 (0)
	0x1f8: UDATA debugEventData3 = 0x0000000000000000 (0)
	0x200: UDATA debugEventData4 = 0x0000000000000000 (0)
	0x208: UDATA debugEventData5 = 0x0000000000000000 (0)
	0x210: UDATA debugEventData6 = 0x0000000000000000 (0)
	0x218: UDATA debugEventData7 = 0x0000000000000000 (0)
	0x220: UDATA debugEventData8 = 0x0000000000000000 (0)
	0x228: struct J9StackElement* classLoadingStack = !j9stackelement 0x0000000000000000
	0x230: UDATA jitTransitionJumpSlot = 0x0000000000000000 (0)
	0x238: struct J9ThreadMonitor* gcClassUnloadingMutex = !j9threadmonitor 0x0000000000000000
	0x240: struct J9VMThread* gcClassUnloadingThreadPrevious = !j9vmthread 0x0000000000000000
	0x248: struct J9VMThread* gcClassUnloadingThreadNext = !j9vmthread 0x0000000000000000
	0x250: struct J9StackWalkState* stackWalkState = !j9stackwalkstate 0x000000000025A000
	0x258: struct J9VMEntryLocalStorage* entryLocalStorage = !j9vmentrylocalstorage 0x00007FD240472B70
	0x260: UDATA gpProtected = 0x0000000000000001 (1)
	0x268: struct J9VMGCSublistFragment gcRememberedSet = !j9vmgcsublistfragment 0x0000000000259F68
	0x298: struct MM_GCRememberedSetFragment sATBBarrierRememberedSetFragment = !mm_gcrememberedsetfragment 0x0000000000259F98
	0x2c8: void* gcTaskListPtr = !j9x 0x0000000000000000
	0x2d0: UDATA* dropBP = !j9x 0x0000000000000000
	0x2d8: UDATA dropFlags = 0x0000000000000000 (0)
	0x2e0: struct J9Pool* monitorEnterRecordPool = !j9pool 0x00007FD1E40116F0
	0x2e8: struct J9MonitorEnterRecord* monitorEnterRecords = !j9monitorenterrecord 0x0000000000000000
	0x2f0: UDATA* jniArrayCache = !j9x 0x0000000000000000
	0x2f8: UDATA* jniArrayCache2 = !j9x 0x0000000000000000
	0x300: struct J9StackWalkState inlineStackWalkState = !j9stackwalkstate 0x000000000025A000
	0x538: struct J9JITDecompilationInfo* decompilationStack = !j9jitdecompilationinfo 0x0000000000000000
	0x540: struct J9ModronThreadLocalHeap allocateThreadLocalHeap = !j9modronthreadlocalheap 0x000000000025A240
	0x570: struct J9ModronThreadLocalHeap nonZeroAllocateThreadLocalHeap = !j9modronthreadlocalheap 0x000000000025A270
	0x5a0: void* sidecarEvent = !j9x 0x0000000000000000
	0x5a8: struct PortlibPTBuffers_struct* ptBuffers = !portlibptbuffers_ 0x0000000000000000
	0x5b0: struct J9Object* blockingEnterObject = !j9object 0x0000000000000000
	0x5b8: void* gcExtensions = !j9x 0x00007FD1E4012878
	0x5c0: void* gpInfo = !j9x 0x0000000000000000
	0x5c8: void* jitVMwithThreadInfo = !j9x 0x00007FD1D4003210
	0x5d0: U8* profilingBufferEnd = !j9x 0x00007FD1D00031C0 // "!�f"�
	0x5d8: U8* profilingBufferCursor = !j9x 0x00007FD1D0002E32 // "���@��"
	0x5e0: UDATA* j2iFrame = !j9x 0x0000000000000000
	0x5e8: UDATA currentOSStackFree = 0x000000000007CBD0 (510928)
	0x5f0: UDATA mgmtBlockedCount = 0x0000000000000000 (0)
	0x5f8: UDATA mgmtWaitedCount = 0x0000000000000230 (560)
	0x600: UDATA mgmtBlockedStart = 0x0000000000000000 (0)
	0x608: UDATA mgmtWaitedStart = 0x0000000000000000 (0)
	0x610: UDATA cardTableShiftSize = 0x0000000000000009 (9)
	0x618: void* aotVMwithThreadInfo = !j9x 0x0000000000000000
	0x620: UDATA asyncEventFlags = 0x0000000000000000 (0)
	0x628: struct J9Object* forceEarlyReturnObjectSlot = !j9object 0x0000000000000000
	0x630: struct J9MonitorEnterRecord* jniMonitorEnterRecords = !j9monitorenterrecord 0x0000000000000000
	0x638: struct J9DLTInformationBlock dltBlock = !j9dltinformationblock 0x000000000025A338
	0x7f8: struct J9VMGCSegregatedAllocationCacheEntry* segregatedAllocationCache = !j9vmgcsegregatedallocationcacheentry 0x0000000000000000
	0x800: struct J9StackWalkState* activeWalkState = !j9stackwalkstate 0x00007FD240472440
	0x808: void* jniCalloutArgs = !j9x 0x0000000000000000
	0x810: struct J9VMThread* exclusiveVMAccessQueueNext = !j9vmthread 0x0000000000000000
	0x818: struct J9VMThread* exclusiveVMAccessQueuePrevious = !j9vmthread 0x0000000000000000
	0x820: struct J9Object* javaLangThreadLocalCache = !j9object 0x0000000000000000
	0x828: UDATA jitCountDelta = 0x0000000000000002 (2)
	0x830: UDATA maxProfilingCount = 0x0000000000001771 (6001)
	0x838: j9objectmonitor_t[] objectMonitorLookupCache = !j9x 0x000000000025A538
	0x938: UDATA jniCriticalCopyCount = 0x0000000000000000 (0)
	0x940: UDATA jniCriticalDirectCount = 0x0000000000000000 (0)
	0x948: struct J9Pool* jniReferenceFrames = !j9pool 0x0000000000000000
	0x950: U32 ludclInlineDepth = 0x00000000 (0)
	0x954: U32 ludclBPOffset = 0x00000000 (0)
	0x958: UDATA* osrJittedFrameCopy = !j9x 0x0000000000000000
	0x960: struct J9OSRBuffer* osrBuffer = !j9osrbuffer 0x0000000000000000
	0x968: void* osrReturnAddress = !j9x 0x0000000000000000
	0x970: void* osrScratchBuffer = !j9x 0x0000000000000000
	0x978: void* jitArtifactSearchCache = !j9x 0x00007FD200009B00
	0x980: void* jitExceptionHandlerCache = !j9x 0x0000000000000000
	0x988: void* jitPrivateData = !j9x 0x00007FD240487EC0
	0x990: struct J9Method* jitMethodToBeCompiled = !j9method 0x0000000000000000
	0x998: UDATA privateFlags2 = 0x0000000000000000 (0)
	0x9a0: struct OMR_VMThread* omrVMThread = !omr_vmthread 0x000000000025A748
	0x9a8: void* gpuInfo = !j9x 0x0000000000000000
	0x9b0: void* startOfMemoryBlock = !j9x 0x0000000000259CA8
	0x9b8: UDATA inNative = 0x0000000000000000 (0)
	0x9c0: struct J9JITDecompilationInfo* lastDecompilation = !j9jitdecompilationinfo 0x0000000000000000
	0x9c8: struct J9GSParameters gsParameters = !j9gsparameters 0x000000000025A6C8
	0x9d8: UDATA readBarrierRangeCheckBase = 0xFFFFFFFFFFFFFFFF (-1)
	0x9e0: UDATA readBarrierRangeCheckTop = 0x0000000000000000 (0)
	0x9e8: U32 readBarrierRangeCheckBaseCompressed = 0xFFFFFFFF (4294967295)
	0x9ec: U32 readBarrierRangeCheckTopCompressed = 0x00000000 (0)
	0x9f0: UDATA safePointCount = 0x0000000000000000 (0)
	0x9f8: volatile struct J9HashTable* utfCache = !j9hashtable 0x0000000000000000
	0xa00: U64* ffiArgs = !j9x 0x0000000000000000
	0xa08: UDATA ffiArgCount = 0x0000000000000000 (0)
	0xa10: struct J9VMContinuation* currentContinuation = !j9vmcontinuation 0x0000000000000000
	0xa18: UDATA continuationPinCount = 0x0000000000000000 (0)
	0xa20: UDATA ownedMonitorCount = 0x0000000000000000 (0)
	0xa28: UDATA callOutCount = 0x0000000000000000 (0)
	0xa30: struct J9Object* carrierThreadObject = !j9object 0x00000000C00A3610 // jdk/internal/misc/CarrierThread
	0xa38: struct J9Object* extentLocalCache = !j9object 0x0000000000000000
}

In the J9VMThread, we see the JIT return address set as: 0x00007FD2434CD649, which is a region in the JIT code cache that is invalid. What I am now trying to understand now is what sets that field in the J9VMThread to this value, and where this is obtained from, and what it is supposed to be.

@babsingh
Copy link
Contributor

babsingh commented Nov 17, 2022

What I am now trying to understand now is what sets that field in the J9VMThread to this value, and where this is obtained from, and what it is supposed to be.

While mounting a virtual thread, we copy the J9VMContinuation fields into J9VMThread and store the carrier thread fields in J9VMContinuation. While unmounting a virtual thread, we restore the carrier thread state by again swapping values into J9VMThread and caching the virtual thread specific-details in J9VMContinuation. We may need to cache the jitReturnAddress and other similar JIT fields in J9VMContinuation, which may need to be swapped during virtual thread mount and unmount. J9VMContinuation contains fields that are thread-specific and cannot be shared between a virtual thread and its carrier thread. See code associated to the below function for the above operations. Currently, jitReturnAddress is not part of the swap.

static VMINLINE void
swapFieldsWithContinuation(J9VMThread *vmThread, J9VMContinuation *continuation, bool swapJ9VMthreadSavedRegisters = true)
{
/* Helper macro to swap fields between the two J9Class structs. */
#define SWAP_MEMBER(fieldName, fieldType, class1, class2) \
do { \
fieldType temp = (fieldType)((class1)->fieldName); \
(class1)->fieldName = (class2)->fieldName; \
(class2)->fieldName = (fieldType)temp; \
} while (0)
SWAP_MEMBER(arg0EA, UDATA*, vmThread, continuation);
SWAP_MEMBER(bytecodes, UDATA*, vmThread, continuation);
SWAP_MEMBER(sp, UDATA*, vmThread, continuation);
SWAP_MEMBER(pc, U_8*, vmThread, continuation);
SWAP_MEMBER(literals, J9Method*, vmThread, continuation);
SWAP_MEMBER(stackOverflowMark, UDATA*, vmThread, continuation);
SWAP_MEMBER(stackOverflowMark2, UDATA*, vmThread, continuation);
SWAP_MEMBER(stackObject, J9JavaStack*, vmThread, continuation);
SWAP_MEMBER(decompilationStack, J9JITDecompilationInfo*, vmThread, continuation);
SWAP_MEMBER(j2iFrame, UDATA*, vmThread, continuation);
J9VMEntryLocalStorage *threadELS = vmThread->entryLocalStorage;
/* Swap the JIT GPR registers data referenced by ELS */
J9JITGPRSpillArea tempGPRs = continuation->jitGPRs;
J9I2JState tempI2J = continuation->i2jState;
continuation->jitGPRs = *(J9JITGPRSpillArea*)threadELS->jitGlobalStorageBase;
continuation->i2jState = threadELS->i2jState;
if (swapJ9VMthreadSavedRegisters) {
*(J9JITGPRSpillArea*)threadELS->jitGlobalStorageBase = tempGPRs;
}
threadELS->i2jState = tempI2J;
SWAP_MEMBER(oldEntryLocalStorage, J9VMEntryLocalStorage*, threadELS, continuation);
}

@gacholio
Copy link
Contributor

Ah, I see now - jitReturnAddress is used while in JIT helpers (cnathelp.cpp). I do not believe it's relevant to transitions. I've done a search and don't see it used anywhere else.

Even in the helpers, the value would be stored into a resolve frame before any potential calls into java could occur.

You could try swapping it to see if the assertion goes away, but I do not believe it will have any effect.

@0xdaryl
Copy link
Contributor

0xdaryl commented Nov 17, 2022

@nbhuiyan : from the crash where you extracted the info above, can you prove whether the pop uword ptr J9TR_VMThread_jitReturnAddress[_rbp] wrote a correct value for the jitCheckAsyncMessages call? If yes then we're looking for something else to have stomped on that field.

I think prototyping adding a field to the J9VMContinuation struct and swapping the JIT return address should be easy enough to do. Could you prototype that just to rule that out as a cause?

@gacholio
Copy link
Contributor

gacholio commented Nov 17, 2022

jitCheckAsyncMessages can not run java code, so it is not a swap point for continuations.

@tajila
Copy link
Contributor Author

tajila commented Nov 22, 2022

I think prototyping adding a field to the J9VMContinuation struct and swapping the JIT return address should be easy enough to do. Could you prototype that just to rule that out as a cause?

@nbhuiyan do you have an update

@nbhuiyan
Copy link
Member

@tajila Yes, I have prototyped adding a jitReturnAddress field to J9VMContinuation struct, and using that to swap the JIT return address during virtual thread mounting and unmounting. I confirmed that the J9VMThread had jitReturnAddress field pointing to a valid address, however the stackWalkState still contained the invalid address that resulted in the assertion failure.

@gacholio
Copy link
Contributor

This is what I expected.

@nbhuiyan
Copy link
Member

nbhuiyan commented Nov 22, 2022

Using -verbose:stackwalk, I was able to track down what the invalid address belonged to, and it was the PC for method ForkJoinPool$WorkQueue::topLevelExec. At the time of the crash, the address was invalid.

Here is a part of the list of compiled methods obtained from a core dump using jdmpview:

       start=0x7fba8126d648  end=0x7fba8126e025   java/util/concurrent/ForkJoinPool$WorkQueue::topLevelExec(Ljava/util/concurrent/ForkJoinTask;Ljava/util/concurrent/ForkJoinPool$WorkQueue;)V
	start=0x7fba81292708  end=0x7fba81293221   java/util/concurrent/ForkJoinPool$WorkQueue::topLevelExec(Ljava/util/concurrent/ForkJoinTask;Ljava/util/concurrent/ForkJoinPool$WorkQueue;)V

There are similar methods with more than one entry in the list, such as ForkJoinPool.scan. When checking the core dump for what’s within the address range of the first entry of a JIT-ed method, it is invalid, but for the second entry’s address range, it is a valid address for the compiled method, meaning the second entry is due to recompilation, as confirmed by the JIT verbose log:

! (AOT load) java/util/concurrent/ForkJoinPool$WorkQueue.topLevelExec(Ljava/util/concurrent/ForkJoinTask;Ljava/util/concurrent/ForkJoinPool$WorkQueue;)V Q_SZ=42 Q_SZI=42 QW=180 j9m=00000000001FF940 time=107us compilationRelocationFailure (classByNameValidationFailure) memLimit=262144 KB compThreadID=0
+ (cold) java/util/concurrent/ForkJoinPool$WorkQueue.topLevelExec(Ljava/util/concurrent/ForkJoinTask;Ljava/util/concurrent/ForkJoinPool$WorkQueue;)V @ 00007FBA8126E088-00007FBA8126E245 OrdinaryMethod - Q_SZ=42 Q_SZI=42 QW=180 j9m=00000000001FF940 bcsz=83 GCR compThreadID=0 CpuLoad=9%(1%avg) JvmCpu=0%
+ (warm) java/util/concurrent/ForkJoinPool$WorkQueue.topLevelExec(Ljava/util/concurrent/ForkJoinTask;Ljava/util/concurrent/ForkJoinPool$WorkQueue;)V @ 00007FBA81292708-00007FBA81293221 OrdinaryMethod C Q_SZ=0 Q_SZI=0 QW=12 j9m=00000000001FF940 bcsz=83 compThreadID=0 CpuLoad=0%(0%avg) JvmCpu=56%

The warm compilation of topLevelExec was also the final entry in the verbose log, meaning this was the last method to successfully compile prior to the crash.

So for the crash, we have

*** Invalid JIT return address 00007FBA8126E105 in 00007FBA7AA4C6B0

and the jit return address is immediately after the JIT-ed method address range of the first entry of topLevelExec. In the core dump, that address was invalid, but within the range of the JIT code cache. The stack walk results in an error due to this invalid address in 00007FBA7AA4C6B0, which is the walkState object.

Here is the info we get using KCA on the J9Method:

Method   {ClassPath/Name.MethodName}: {java/util/concurrent/ForkJoinPool$WorkQueue.topLevelExec}
                           Signature: (Ljava/util/concurrent/ForkJoinTask;Ljava/util/concurrent/ForkJoinPool$WorkQueue;)V
                              Access: Final
                    J9Class/J9Method: 0x00000000001ff500 / 0x00000000001ff940
               Compiled Method Start: 0x00007fba81292708 (2909 bytes)
                      ByteCode Start: 0x00007fba582ce73c (83 bytes)
                   ROM Constant Pool: 0x00007fba582cdb30 (79 entries)
                       Constant Pool: 0x00000000001feec0 (79 entries)

As you can see, the compiled method start is the same value as the compiled method start of the second entry of topLevelExec reported by jdmpview. So somehow the walkState object needed to be updated with the new address for topLevelExec, but that’s what is not happening.

@fengxue-IS
Copy link
Contributor

local testing with #16374 against Skynet.java test:
5/5 passes for default GC (gencon)
3/3 passed for metronome(I noticed that there may be upto 30x regression compared with gencon is test performance)
1/1 crashed when running balance GC.

Not sure if crash in BalancedGC is an intermittent issue or not, have not seen on other GC policy( or I haven't ran enough times to trigger it)

Updated commit: dfebbcd
I modified the realtimeGC code to double scan the vmthread and avoid locking vthread mount/unmount, seem like that fixed the issue with pref regression.
2/2 passed for gencon
2/2 passed for metronome
1/1 Crash in Balanced GC still exist, will look into core later

@0xdaryl
Copy link
Contributor

0xdaryl commented Dec 1, 2022

How shall we proceed with this?

I have a fix that implements the stack walk in place in jitReleaseCodeStackWalk that has passed testing for JDK 11 and 19 on all platforms. I am ready to open a PR for that.

Alternatively, if your universal stack walk feature will be available soon then I can incorporate that instead.

However, given where we are in the release cycle I would prefer the first option because it is less disruptive in terms of code changes. I'm worried about introducing subtle problems that will be hard to track down late in the release.

@gacholio
Copy link
Contributor

gacholio commented Dec 1, 2022

@fengxue-IS We need a definitive answer on whether all continuations can currently be found via the vthread list.

@tajila
Copy link
Contributor Author

tajila commented Dec 1, 2022

@fengxue-IS We need a definitive answer on whether all continuations can currently be found via the vthread list.

For JDK19 this is the case.

@gacholio
Copy link
Contributor

gacholio commented Dec 1, 2022

OK so the current solution will work. In the future, the code really belongs in a VM helper, but that's complicated by realtime (which is basically the antithesis or loom anyway).

@0xdaryl 0xdaryl assigned 0xdaryl and unassigned nbhuiyan Dec 5, 2022
@tajila tajila changed the title Loom: timeouts in jtreg tests Loom: jdk19 OpenJDK Invalid JIT return address ASSERTION FAILED swalk.c:1601 Dec 5, 2022
@tajila
Copy link
Contributor Author

tajila commented Dec 5, 2022

I've renamed the issue to more accurated represent the problem

@pshipton
Copy link
Member

pshipton commented Dec 6, 2022

https://openj9-jenkins.osuosl.org/job/Test_openjdk19_j9_sanity.openjdk_ppc64_aix_Nightly/63/
jdk_lang_1
java/lang/Thread/virtual/stress/GetStackTraceALot.java#id0

https://openj9-artifactory.osuosl.org/artifactory/ci-openj9/Test/Test_openjdk19_j9_sanity.openjdk_ppc64_aix_Nightly/63/openjdk_test_output.tar.gz

23:45:13  STDERR:
23:45:13  
23:45:13  
23:45:13  *** Invalid JIT return address 0000000000000000 in 000001002312DE00
23:45:13  
23:45:13  04:41:21.095 0x1002312db00    j9vm.249    *   ** ASSERTION FAILED ** at /home/jenkins/workspace/Build_JDK19_ppc64_aix_Nightly/openj9/runtime/vm/swalk.c:1632: ((0 ))

@0xdaryl
Copy link
Contributor

0xdaryl commented Dec 6, 2022

Just FYI that my workaround fix also crashed with balanced GC with the same backtrace as Jack's PR.

@pshipton
Copy link
Member

pshipton commented Dec 9, 2022

https://openj9-jenkins.osuosl.org/job/Test_openjdk19_j9_sanity.openjdk_s390x_linux_Nightly/67
jdk_lang_1
java/lang/Thread/virtual/stress/GetStackTraceALot.java#id0

https://openj9-artifactory.osuosl.org/artifactory/ci-openj9/Test/Test_openjdk19_j9_sanity.openjdk_s390x_linux_Nightly/67/openjdk_test_output.tar.gz

22:05:16  *** Invalid JIT return address 000003FF696C6E14 in 000003FFA8A7E190
22:05:16  
22:05:16  03:04:47.621 0x3ffa4508600    j9vm.249    *   ** ASSERTION FAILED ** at /home/jenkins/workspace/Build_JDK19_s390x_linux_Nightly/openj9/runtime/vm/swalk.c:1632: ((0 ))

@pshipton
Copy link
Member

https://openj9-jenkins.osuosl.org/job/Test_openjdk19_j9_sanity.openjdk_x86-64_linux_Nightly/68
jdk_lang_1
java/lang/Thread/virtual/stress/GetStackTraceALot.java#id0

No diagnostics, process was terminated.

08:59:39  *** Invalid JIT return address 00007F2FA2DD659C in 00007F2FA03496C0
08:59:39  
08:59:39  04:55:05.186 0x7f2fdc4e1900    j9vm.249    *   ** ASSERTION FAILED ** at /home/jenkins/workspace/Build_JDK19_x86-64_linux_Nightly/openj9/runtime/vm/swalk.c:1632: ((0 ))

@pshipton
Copy link
Member

https://openj9-jenkins.osuosl.org/job/Test_openjdk19_j9_sanity.openjdk_s390x_linux_Nightly/71
jdk_lang_0
java/lang/Thread/virtual/stress/SleepALot.java#id0

https://openj9-artifactory.osuosl.org/artifactory/ci-openj9/Test/Test_openjdk19_j9_sanity.openjdk_s390x_linux_Nightly/71/openjdk_test_output.tar.gz

22:07:41  *** Invalid JIT return address 000003FF8417E53C in 000003FF9DEFDE10
22:07:41  
22:07:41  03:07:26.226 0x1d47300    j9vm.249    *   ** ASSERTION FAILED ** at /home/jenkins/workspace/Build_JDK19_s390x_linux_Nightly/openj9/runtime/vm/swalk.c:1632: ((0 ))

@pshipton
Copy link
Member

Not sure if this is related but I'll add it here for now.

https://openj9-jenkins.osuosl.org/job/Test_openjdk19_j9_sanity.openjdk_ppc64_aix_Nightly/72
jdk_lang_0
java/lang/Thread/virtual/stress/GetStackTraceALot.java#id0

https://openj9-artifactory.osuosl.org/artifactory/ci-openj9/Test/Test_openjdk19_j9_sanity.openjdk_ppc64_aix_Nightly/72/openjdk_test_output.tar.gz

23:44:08  Unhandled exception
23:44:08  Type=Unhandled trap vmState=0x00000000
23:44:08  J9Generic_Signal_Number=00000108 Signal_Number=00000005 Error_Value=00000000 Signal_Code=00000008
23:44:08  Handler1=09001000A10125B0 Handler2=09001000A0F13020
23:44:08  R0=000001002206B1A0 R1=00000100236EA650 R2=0000000000000000 R3=000000000A000000
23:44:08  R4=000000000A000000 R5=000000000A000000 R6=00000100236EA780 R7=0000000000000088
23:44:08  R8=FFFFFFFFFFFFFFFF R9=00000100234FAFC0 R10=0A000200093CF8E5 R11=09001000A1016138
23:44:08  R12=0000000022000244 R13=00000100236F5800 R14=000001002323ECE0 R15=000001002323EE00
23:44:08  R16=0000010021AE0118 R17=FFFFFFFFFFFFFFFF R18=0000000000000000 R19=09001000A0EA3288
23:44:08  R20=0000000000000000 R21=000001002323F940 R22=00000100236EA900 R23=0000000000000000
23:44:08  R24=090000000E38D0D8 R25=0000000000000000 R26=0000010023442100 R27=0000000000000001
23:44:08  R28=0A0000001FDC2920 R29=0A0000001FDC27B0 R30=0A0000000013CAD0 R31=0000000000000000
23:44:08  IAR=000001002206B1A4 LR=000001002206B1A0 MSR=A00000000200D032 CTR=000001001063B128
23:44:08  CR=4800024400000000 FPSCR=8200000000000000 XER=0000000082000000
23:44:08  FPR0 0000000000000000 (f: 0.000000, d: 0.000000e+00)
23:44:08  FPR1 c3e0000000000000 (f: 0.000000, d: -9.223372e+18)
23:44:08  FPR2 3ff0000000000000 (f: 0.000000, d: 1.000000e+00)
23:44:08  FPR3 433000000005537f (f: 349055.000000, d: 4.503600e+15)
23:44:08  FPR4 0000000000000000 (f: 0.000000, d: 0.000000e+00)
23:44:08  FPR5 4330000000000000 (f: 0.000000, d: 4.503600e+15)
23:44:08  FPR6 41154dfc00000000 (f: 0.000000, d: 3.490550e+05)
23:44:08  FPR7 0000000000000000 (f: 0.000000, d: 0.000000e+00)
23:44:08  FPR8 3f6069c0ff9e20ca (f: 4288553216.000000, d: 2.003552e-03)
23:44:08  FPR9 402bd1a5376c55c0 (f: 929846720.000000, d: 1.390946e+01)
23:44:08  FPR10 412e848000000000 (f: 0.000000, d: 1.000000e+06)
23:44:08  FPR11 43300000000f4240 (f: 1000000.000000, d: 4.503600e+15)
23:44:08  FPR12 4530000000000000 (f: 0.000000, d: 1.934281e+25)
23:44:08  FPR13 0000000000000000 (f: 0.000000, d: 0.000000e+00)
23:44:08  FPR14 0000000000000000 (f: 0.000000, d: 0.000000e+00)
23:44:08  FPR15 0000000000000000 (f: 0.000000, d: 0.000000e+00)
23:44:08  FPR16 0000000000000000 (f: 0.000000, d: 0.000000e+00)
23:44:08  FPR17 0000000000000000 (f: 0.000000, d: 0.000000e+00)
23:44:08  FPR18 0000000000000000 (f: 0.000000, d: 0.000000e+00)
23:44:08  FPR19 0000000000000000 (f: 0.000000, d: 0.000000e+00)
23:44:08  FPR20 0000000000000000 (f: 0.000000, d: 0.000000e+00)
23:44:08  FPR21 0000000000000000 (f: 0.000000, d: 0.000000e+00)
23:44:08  FPR22 0000000000000000 (f: 0.000000, d: 0.000000e+00)
23:44:08  FPR23 0000000000000000 (f: 0.000000, d: 0.000000e+00)
23:44:08  FPR24 0000000000000000 (f: 0.000000, d: 0.000000e+00)
23:44:08  FPR25 0000000000000000 (f: 0.000000, d: 0.000000e+00)
23:44:08  FPR26 0000000000000000 (f: 0.000000, d: 0.000000e+00)
23:44:08  FPR27 0000000000000000 (f: 0.000000, d: 0.000000e+00)
23:44:08  FPR28 0000000000000000 (f: 0.000000, d: 0.000000e+00)
23:44:08  FPR29 0000000000000000 (f: 0.000000, d: 0.000000e+00)
23:44:08  FPR30 0000000000000000 (f: 0.000000, d: 0.000000e+00)
23:44:08  FPR31 0000000000000000 (f: 0.000000, d: 0.000000e+00)
23:44:08  Target=2_90_20221217_141 (AIX 7.1)
23:44:08  CPU=ppc64 (16 logical CPUs) (0x780000000 RAM)
23:44:08  ----------- Stack Backtrace -----------
23:44:08  runJavaThread+0x28c (0x090000000DEE5CB0 [libj9vm29.so+0x74cb0])
23:44:08  _ZL23javaProtectedThreadProcP13J9PortLibraryPv+0x11c (0x090000000DE74000 [libj9vm29.so+0x3000])
23:44:08  omrsig_protect+0x4fc (0x090000000E2691E0 [libj9prt29.so+0x5f1e0])
23:44:08  javaThreadProc+0x70 (0x090000000DE73E74 [libj9vm29.so+0x2e74])
23:44:08  thread_wrapper+0x558 (0x090000000E2D299C [libj9thr29.so+0x599c])
23:44:08  _pthread_body+0xf0 (0x090000000051FE14 [libpthreads.a+0x3e14])
23:44:08  ---------------------------------------

@JasonFengJ9
Copy link
Member

JDK19 internal build(paix824)

openjdk version "19.0.1" 2022-10-18
IBM Semeru Runtime Open Edition 19.0.1+10 (build 19.0.1+10)
Eclipse OpenJ9 VM 19.0.1+10 (build master-a2475ea96, JRE 19 AIX ppc64-64-Bit Compressed References 20221018_15 (JIT enabled, AOT enabled)
OpenJ9   - a2475ea96
OMR      - 7d8ea8b6b
JCL      - 5168276e58 based on jdk-19.0.1+10)

[2022-12-31T20:42:45.757Z] variation: -Xdump:system:none -Xdump:heap:none -Xdump:system:events=gpf+abort+traceassert+corruptcache -XX:-JITServerTechPreviewMessage Mode150
[2022-12-31T20:42:45.757Z] JVM_OPTIONS:  -Xdump:system:none -Xdump:heap:none -Xdump:system:events=gpf+abort+traceassert+corruptcache -XX:-JITServerTechPreviewMessage -XX:+UseCompressedOops 

[2022-12-31T21:17:34.575Z] TEST: java/lang/Thread/virtual/stress/GetStackTraceALot.java#id0

[2022-12-31T21:17:34.576Z] STDERR:
[2022-12-31T21:17:34.576Z] 
[2022-12-31T21:17:34.576Z] 
[2022-12-31T21:17:34.576Z] *** Invalid JIT return address 0000000000000000 in 000000003028C000
[2022-12-31T21:17:34.576Z] 
[2022-12-31T21:17:34.600Z] 21:17:14.417 0x3028bd00    j9vm.249    *   ** ASSERTION FAILED ** at /home/jenkins/workspace/build-scripts/jobs/jdk19u/jdk19u-aix-ppc64-openj9/workspace/build/src/openj9/runtime/vm/swalk.c:1632: ((0 ))
[2022-12-31T21:17:34.600Z] JVMDUMP039I Processing dump event "traceassert", detail "" at 2022/12/31 16:17:14 - please wait.

[2022-12-31T21:24:30.283Z] jdk_lang_0_FAILED

@pshipton
Copy link
Member

pshipton commented Jan 10, 2023

https://openj9-jenkins.osuosl.org/job/Test_openjdk19_j9_sanity.openjdk_s390x_linux_Nightly/88
jdk_lang_1
java/lang/Thread/virtual/stress/SleepALot.java#id0

https://openj9-artifactory.osuosl.org/artifactory/ci-openj9/Test/Test_openjdk19_j9_sanity.openjdk_s390x_linux_Nightly/88/openjdk_test_output.tar.gz

22:05:06  *** Invalid JIT return address 000003FF4E74AFD8 in 000003FF8E97E320
22:05:06  
22:05:06  03:04:39.203 0x3ff00002400    j9vm.249    *   ** ASSERTION FAILED ** at /home/jenkins/workspace/Build_JDK19_s390x_linux_Nightly/openj9/runtime/vm/swalk.c:1632: ((0 ))

https://openj9-jenkins.osuosl.org/job/Test_openjdk19_j9_sanity.openjdk_s390x_linux_Nightly/89
jdk_lang_1
java/lang/Thread/virtual/stress/GetStackTraceALot.java#id0

https://openj9-artifactory.osuosl.org/artifactory/ci-openj9/Test/Test_openjdk19_j9_sanity.openjdk_s390x_linux_Nightly/89/openjdk_test_output.tar.gz

22:38:00  *** Invalid JIT return address 000003FF6C4BC1C6 in 000003FFB1FFE1B0
22:38:00  
22:38:00  03:37:19.996 0x3ffac552e00    j9vm.249    *   ** ASSERTION FAILED ** at /home/jenkins/workspace/Build_JDK19_s390x_linux_Nightly/openj9/runtime/vm/swalk.c:1632: ((0 ))

https://openj9-jenkins.osuosl.org/job/Test_openjdk19_j9_sanity.openjdk_s390x_linux_Nightly/90
jdk_lang_1
java/lang/Thread/virtual/stress/SleepALot.java#id0

https://openj9-artifactory.osuosl.org/artifactory/ci-openj9/Test/Test_openjdk19_j9_sanity.openjdk_s390x_linux_Nightly/90/openjdk_test_output.tar.gz

22:54:36  *** Invalid JIT return address 000003FF6AE4AC44 in 000003FFA8BFDE10
22:54:36  
22:54:36  03:53:56.783 0x3fefc02bb00    j9vm.249    *   ** ASSERTION FAILED ** at /home/jenkins/workspace/Build_JDK19_s390x_linux_Nightly/openj9/runtime/vm/swalk.c:1632: ((0 ))

@JasonFengJ9
Copy link
Member

JDK19 internal build(ub18x64rt1-9)

openjdk version "19.0.1-beta" 2022-10-18
IBM Semeru Runtime Open Edition 19.0.1+10-202301170035 (build 19.0.1-beta+10-202301170035)
Eclipse OpenJ9 VM 19.0.1+10-202301170035 (build master-082682ae5, JRE 19 Linux amd64-64-Bit Compressed References 20230117_23 (JIT enabled, AOT enabled)
OpenJ9   - 082682ae5
OMR      - ec81a72de
JCL      - 40656c4641 based on jdk-19.0.1+10)
[2023-01-17T01:34:11.784Z] variation: -Xdump:system:none -Xdump:heap:none -Xdump:system:events=gpf+abort+traceassert+corruptcache -XX:-JITServerTechPreviewMessage Mode650
[2023-01-17T01:34:11.784Z] JVM_OPTIONS:  -Xdump:system:none -Xdump:heap:none -Xdump:system:events=gpf+abort+traceassert+corruptcache -XX:-JITServerTechPreviewMessage -XX:-UseCompressedOops 

[2023-01-17T01:55:34.319Z] TEST: java/lang/Thread/virtual/stress/GetStackTraceALot.java#id0

[2023-01-17T01:55:34.320Z] STDERR:
[2023-01-17T01:55:34.320Z] 
[2023-01-17T01:55:34.320Z] 
[2023-01-17T01:55:34.320Z] *** Invalid JIT return address 00007F1131070814 in 00007F112A054450
[2023-01-17T01:55:34.320Z] 
[2023-01-17T01:55:34.320Z] 01:55:28.588 0x7f10cc098800    j9vm.249    *   ** ASSERTION FAILED ** at ../../../../../openj9/runtime/vm/swalk.c:1632: ((0 ))
[2023-01-17T01:55:34.320Z] JVMDUMP039I Processing dump event "traceassert", detail "" at 2023/01/16 20:55:28 - please wait.

[2023-01-17T02:01:24.305Z] jdk_lang_1_FAILED

@pshipton
Copy link
Member

pshipton commented Jan 20, 2023

https://openj9-jenkins.osuosl.org/job/Test_openjdk19_j9_sanity.openjdk_ppc64_aix_Nightly/93
jdk_lang_1
java/lang/Thread/virtual/stress/GetStackTraceALot.java#id0

https://openj9-artifactory.osuosl.org/artifactory/ci-openj9/Test/Test_openjdk19_j9_sanity.openjdk_ppc64_aix_Nightly/93/openjdk_test_output.tar.gz

00:03:50  *** Invalid JIT return address 0000010010627AE0 in 0000010023682BC0
00:03:50  
00:03:50  05:00:19.185 0x1002368b400    j9vm.249    *   ** ASSERTION FAILED ** at /home/jenkins/workspace/Build_JDK19_ppc64_aix_Nightly/openj9/runtime/vm/swalk.c:1632: ((0 ))

https://openj9-jenkins.osuosl.org/job/Test_openjdk19_j9_sanity.openjdk_s390x_linux_Nightly/96
jdk_lang_0
java/lang/Thread/virtual/stress/SleepALot.java#id0

https://openj9-artifactory.osuosl.org/artifactory/ci-openj9/Test/Test_openjdk19_j9_sanity.openjdk_s390x_linux_Nightly/96/openjdk_test_output.tar.gz

23:17:21  *** Invalid JIT return address 000003FF73F53D48 in 000003FF8DCFDE10
23:17:21  
23:17:21  04:16:49.699 0x1dd3a00    j9vm.249    *   ** ASSERTION FAILED ** at /home/jenkins/workspace/Build_JDK19_s390x_linux_Nightly/openj9/runtime/vm/swalk.c:1632: ((0 ))

https://openj9-jenkins.osuosl.org/job/Test_openjdk19_j9_sanity.openjdk_ppc64_aix_Nightly/98

fengxue-IS added a commit to fengxue-IS/aqa-tests that referenced this issue Feb 14, 2023
Tests excluded due eclipse-openj9/openj9#15939 are fixed by eclipse-openj9/openj9#16374

Signed-off-by: Jack Lu <Jack.S.Lu@ibm.com>
@tajila
Copy link
Contributor Author

tajila commented Feb 14, 2023

Closing as issue is addressed

@tajila tajila closed this as completed Feb 14, 2023
smlambert pushed a commit to adoptium/aqa-tests that referenced this issue Feb 14, 2023
Tests excluded due eclipse-openj9/openj9#15939 are fixed by eclipse-openj9/openj9#16374

Signed-off-by: Jack Lu <Jack.S.Lu@ibm.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment