[Mono] AOT compiler segfaults when building tests #56480

Closed
steveisok opened this issue Jul 28, 2021 · 4 comments
Labels: area-Codegen-AOT-mono, blocking-clean-ci (Blocking PR or rolling runs of 'runtime' or 'runtime-extra-platforms')

@steveisok
Member

steveisok commented Jul 28, 2021

Discovered in #56316

When we AOT-compile our functional tests, the AOT compiler crashes with a segmentation fault (signal 11). The backtrace shows:

* thread #1, name = 'tid_103', queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x44)
    frame #0: 0x00000001001487e1 mono-aot-cross`create_jit_info(cfg=<unavailable>, method_to_compile=<unavailable>) at mini.c:2595:18 [opt]
   2592					int end_offset;
   2593					if (ec->handler_offset + ec->handler_len < header->code_size) {
   2594						tblock = cfg->cil_offset_to_bb [ec->handler_offset + ec->handler_len];
-> 2595						if (tblock->native_offset) {
   2596							end_offset = tblock->native_offset;
   2597						} else {
   2598							int j, end;

* thread #1, name = 'tid_103', queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x44)
  * frame #0: 0x00000001001487e1 mono-aot-cross`create_jit_info(cfg=<unavailable>, method_to_compile=<unavailable>) at mini.c:2595:18 [opt]
    frame #1: 0x000000010014792c mono-aot-cross`mini_method_compile(method=<unavailable>, opts=<unavailable>, flags=JIT_FLAG_AOT | JIT_FLAG_FULL_AOT | JIT_FLAG_LLVM, parts=0, aot_method_index=179) at mini.c:3924:2 [opt]
    frame #2: 0x00000001001d9f1d mono-aot-cross`compile_method(acfg=0x0000000101008c00, method=0x0000000101a0fca0) at aot-compiler.c:8884:8 [opt]
    frame #3: 0x00000001001ca8cb mono-aot-cross`mono_compile_assembly [inlined] compile_methods(acfg=0x0000000101008c00) at aot-compiler.c:12356:3 [opt]
    frame #4: 0x00000001001ca74c mono-aot-cross`mono_compile_assembly(ass=0x0000000100f09970, opts=<unavailable>, aot_options=<unavailable>, global_aot_state=0x00007ffeefbff068) at aot-compiler.c:14125:2 [opt]
    frame #5: 0x00000001001b962c mono-aot-cross`mono_main at driver.c:1434:10 [opt]
    frame #6: 0x00000001001b9598 mono-aot-cross`mono_main(argc=<unavailable>, argv=<unavailable>) at driver.c:2682:3 [opt]
    frame #7: 0x0000000100003a73 mono-aot-cross`main [inlined] mono_main_with_options(argc=<unavailable>, argv=<unavailable>) at main.c:54:9 [opt]
    frame #8: 0x0000000100003a5f mono-aot-cross`main(argc=5, argv=0x00007ffeefbff2a0) at main.c:397:9 [opt]
    frame #9: 0x00007fff203fbf5d libdyld.dylib`start + 1
  thread #2, name = 'SGen worker'
    frame #0: 0x00007fff203adcde libsystem_kernel.dylib`__psynch_cvwait + 10
    frame #1: 0x00007fff203e0e49 libsystem_pthread.dylib`_pthread_cond_wait + 1298
    frame #2: 0x000000010013e5e3 mono-aot-cross`thread_func [inlined] mono_os_cond_wait(cond=<unavailable>, mutex=<unavailable>) at mono-os-mutex.h:219:8 [opt]
    frame #3: 0x000000010013e5c8 mono-aot-cross`thread_func at sgen-thread-pool.c:167:3 [opt]
    frame #4: 0x000000010013e4cb mono-aot-cross`thread_func(data=<unavailable>) at sgen-thread-pool.c:198:3 [opt]
    frame #5: 0x00007fff203e08fc libsystem_pthread.dylib`_pthread_start + 224
    frame #6: 0x00007fff203dc443 libsystem_pthread.dylib`thread_start + 15

If we select frame #2 (compile_method) and print the relevant method details, we get:

method->name = "Trim"
method->klass->name_space = "System.Buffers"
method->klass->name = "TlsOverPerCoreLockedStacksArrayPool`1"

This points to the following block, which was changed in PR #56316: https://github.com/dotnet/runtime/blob/96673df0d1ba24cf2be4ec9529a6ba54f7d97902/src/libraries/System.Private.CoreLib/src/System/Buffers/TlsOverPerCoreLockedStacksArrayPool.cs#L187-L280

@vargaz it looks like foreach may be causing problems under certain conditions.

steveisok added this to the 6.0.0 milestone on Jul 28, 2021
dotnet-issue-labeler bot added the untriaged label (New issue has not been triaged by the area owner) on Jul 28, 2021
steveisok removed the untriaged label (New issue has not been triaged by the area owner) on Jul 28, 2021
stephentoub added the blocking-clean-ci label (Blocking PR or rolling runs of 'runtime' or 'runtime-extra-platforms') on Jul 28, 2021
@vargaz
Contributor

vargaz commented Jul 29, 2021

Which test suite is this?
That PR has AOT failures on a lot of assemblies, not just corlib.

@steveisok
Member Author

@vargaz
Contributor

vargaz commented Jul 29, 2021

This was caused by a linker problem.

vargaz closed this as completed on Jul 29, 2021
@stephentoub
Member

> This was caused by a linker problem.

Isn't it still a problem? The other code generators apparently accept this input; shouldn't the AOT compiler at least not segfault in this case?

My PR is also still blocked until something is fixed somewhere.

ghost locked as resolved and limited conversation to collaborators on Aug 29, 2021