add unconditional warning before handling StackOverflow #54622

Merged: vtjnash merged 2 commits into master from jn/stackoverflow-warn on Jun 5, 2024

vtjnash (Member) commented May 29, 2024

Since this is not modeled by the exception logic, and it can interrupt arbitrary program state or corrupt locks (leading to hangs and other issues), as well as just frequently segfaulting again afterwards, give a printed message as soon as we notice things are going badly before attempting to recover. For example:

$ ./julia -e 'f() = f(); f()'
Warning: detected a stack overflow, which may result in problems for the program.
ERROR: StackOverflowError:
Stacktrace:
 [1] f() (repeats 2 times)
   @ Main ./none:1

Refs #52291
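
Because recovery from a stack overflow is outside what the exception logic guarantees, code that may recurse deeply is arguably better off failing with an ordinary exception before the stack runs out. A minimal sketch of that idea, assuming a hypothetical `DepthLimitExceeded` type and `countdown` function (neither is part of Julia or of this PR):

```julia
# Hypothetical exception type, used only for this illustration.
struct DepthLimitExceeded <: Exception
    limit::Int
end

# Hypothetical recursive function with an explicit depth budget.
# When the budget is exhausted it throws a normal Julia exception,
# so no signal handling or stack recovery is involved.
function countdown(n::Int; depth::Int = 0, limit::Int = 1_000)
    depth > limit && throw(DepthLimitExceeded(limit))
    n <= 0 && return 0
    return 1 + countdown(n - 1; depth = depth + 1, limit = limit)
end

# Shallow recursion completes normally.
shallow = countdown(100)

# Recursion past the budget fails with a regular, catchable exception.
outcome = try
    countdown(5_000; limit = 1_000)
    :finished
catch err
    err isa DepthLimitExceeded ? :depth_limit : rethrow()
end
```

Here `countdown(100)` returns normally, while `countdown(5_000; limit = 1_000)` throws `DepthLimitExceeded`, which `try`/`catch` can handle without the runtime having to recover from a signal.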

vtjnash added the "error handling" (Handling of exceptions by Julia or the user) and "error messages" (Better, more actionable error messages) labels May 29, 2024
KristofferC (Member) commented

Same with InterruptException then? Regarding the message, I don't know if anyone will really understand the meaning of "which may result in problems for the program".

vtjnash (Member, Author) commented May 31, 2024

Yeah, I wanted people to have realistic expectations that this is not within the guarantees of program behavior. The ^C case has some similar issues, but I don't want to print as much text (just the ^C) for it. Happy to take other wording suggestions, as I didn't like this much but couldn't think of anything clearer.

vtjnash added 2 commits June 4, 2024 10:15
Avoids a race condition where a signal (e.g. StackOverflowError) happens
while trying to initialize the rest of the frame, resulting in trying to
longjmp to garbage.

Since this is not modeled by the exception logic, and it can interrupt
arbitrary program state or corrupt locks (leading to hangs and other
issues), as well as just frequently segfaulting afterwards, give a
printed message as soon as we notice things are going badly before
attempting to recover.
vtjnash force-pushed the jn/stackoverflow-warn branch from 491f900 to 21b1e15 June 4, 2024 15:22
vtjnash (Member, Author) commented Jun 5, 2024

Well, that is curious. I didn't expect this PR to already help clarify a possible cause of a crash in CI (rr trace included as a reproducer); the problem appears likely to have been introduced in #47186:

abstractarray                                    (6) |        started at 2024-06-04T16:36:53.359
      From worker 6:	Warning: detected a stack overflow; program state may be corrupted, so further execution might be unreliable.
      From worker 6:	Warning: detected a stack overflow; program state may be corrupted, so further execution might be unreliable.
      From worker 6:	Warning: detected a stack overflow; program state may be corrupted, so further execution might be unreliable.
      From worker 6:	Warning: detected a stack overflow; program state may be corrupted, so further execution might be unreliable.
      From worker 6:	Warning: detected a stack overflow; program state may be corrupted, so further execution might be unreliable.
      From worker 6:	Warning: detected a stack overflow; program state may be corrupted, so further execution might be unreliable.
      From worker 6:	Warning: detected a stack overflow; program state may be corrupted, so further execution might be unreliable.
      From worker 6:	Warning: detected a stack overflow; program state may be corrupted, so further execution might be unreliable.
      From worker 6:
      From worker 6:	Please submit a bug report with steps to reproduce this fault, and any error messages that follow (in their entirety). Thanks.
      From worker 6:	Exception: EXCEPTION_ACCESS_VIOLATION at 0x7316662f -- gc_read_stack at C:/workdir/src\gc.c:1985 [inlined]
      From worker 6:	gc_mark_stack at C:/workdir/src\gc.c:2539
      From worker 6:	in expression starting at C:\buildkite-agent\builds\win2k22-amdci6-4\julialang\julia-master\julia-21b1e15b8f\share\julia\test\abstractarray.jl:1447
      From worker 6:	gc_read_stack at C:/workdir/src\gc.c:1985 [inlined]
      From worker 6:	gc_mark_stack at C:/workdir/src\gc.c:2539
      From worker 6:	gc_mark_outrefs at C:/workdir/src\gc.c:2760 [inlined]
      From worker 6:	gc_mark_loop_serial_ at C:/workdir/src\gc.c:2969
      From worker 6:	gc_mark_loop_serial at C:/workdir/src\gc.c:2992
      From worker 6:	gc_mark_loop at C:/workdir/src\gc.c:3169 [inlined]
      From worker 6:	_jl_gc_collect at C:/workdir/src\gc.c:3558
      From worker 6:	ijl_gc_collect at C:/workdir/src\gc.c:3937
      From worker 6:	maybe_collect at C:/workdir/src\gc.c:922 [inlined]
      From worker 6:	jl_gc_pool_alloc_inner at C:/workdir/src\gc.c:1325
      From worker 6:	ijl_gc_pool_alloc_instrumented at C:/workdir/src\gc.c:1383
      From worker 6:	argextype at .\compiler\optimize.jl:406
      From worker 6:	argextype at .\compiler\optimize.jl:399 [inlined]
      From worker 6:	argextype at .\compiler\optimize.jl:399 [inlined]
      From worker 6:	iscall_with_boundscheck at .\compiler\optimize.jl:664 [inlined]
      From worker 6:	scan_inconsistency! at .\compiler\optimize.jl:786
      From worker 6:	ScanStmt at .\compiler\optimize.jl:822
      From worker 6:	scan! at .\compiler/ssair\irinterp.jl:263
      From worker 6:	ipo_dataflow_analysis! at .\compiler\optimize.jl:931
      From worker 6:	optimize at .\compiler\optimize.jl:955
      From worker 6:	jfptr_optimize_41160 at C:\buildkite-agent\builds\win2k22-amdci6-4\julialang\julia-master\julia-21b1e15b8f\lib\julia\sys.dll (unknown line)
      From worker 6:	_jl_invoke at C:/workdir/src\gf.c:2997 [inlined]
      From worker 6:	ijl_apply_generic at C:/workdir/src\gf.c:3193
      From worker 6:	_typeinf at .\compiler\typeinfer.jl:253
      From worker 6:	typeinf at .\compiler\typeinfer.jl:215
      From worker 6:	jfptr_typeinf_38200 at C:\buildkite-agent\builds\win2k22-amdci6-4\julialang\julia-master\julia-21b1e15b8f\lib\julia\sys.dll (unknown line)
      From worker 6:	_jl_invoke at C:/workdir/src\gf.c:2997 [inlined]
      From worker 6:	ijl_invoke at C:/workdir/src\gf.c:3023
      From worker 6:	typeinf_edge at .\compiler\typeinfer.jl:863
      From worker 6:	abstract_call_method at .\compiler\abstractinterpretation.jl:660
      From worker 6:	abstract_call_gf_by_type at .\compiler\abstractinterpretation.jl:101
      From worker 6:	abstract_call_known at .\compiler\abstractinterpretation.jl:2199
      From worker 6:	abstract_call at .\compiler\abstractinterpretation.jl:2281
      From worker 6:	abstract_apply at .\compiler\abstractinterpretation.jl:1689
      From worker 6:	abstract_call_known at .\compiler\abstractinterpretation.jl:2101
      From worker 6:	abstract_call at .\compiler\abstractinterpretation.jl:2281
      From worker 6:	abstract_call at .\compiler\abstractinterpretation.jl:2274
      From worker 6:	abstract_call at .\compiler\abstractinterpretation.jl:2424
      From worker 6:	abstract_eval_call at .\compiler\abstractinterpretation.jl:2437
      From worker 6:	abstract_eval_statement_expr at .\compiler\abstractinterpretation.jl:2668
      From worker 6:	abstract_eval_statement at .\compiler\abstractinterpretation.jl:2776
      From worker 6:	abstract_eval_basic_statement at .\compiler\abstractinterpretation.jl:3094
      From worker 6:	typeinf_local at .\compiler\abstractinterpretation.jl:3347
      From worker 6:	typeinf_nocycle at .\compiler\abstractinterpretation.jl:3429
      From worker 6:	_typeinf at .\compiler\typeinfer.jl:235
      From worker 6:	typeinf at .\compiler\typeinfer.jl:215
      From worker 6:	jfptr_typeinf_38200 at C:\buildkite-agent\builds\win2k22-amdci6-4\julialang\julia-master\julia-21b1e15b8f\lib\julia\sys.dll (unknown line)
      From worker 6:	_jl_invoke at C:/workdir/src\gf.c:2997 [inlined]
      From worker 6:	ijl_invoke at C:/workdir/src\gf.c:3023
      From worker 6:	typeinf_edge at .\compiler\typeinfer.jl:863
      From worker 6:	abstract_call_method at .\compiler\abstractinterpretation.jl:660
      From worker 6:	abstract_call_gf_by_type at .\compiler\abstractinterpretation.jl:101
      From worker 6:	abstract_call_known at .\compiler\abstractinterpretation.jl:2199
      From worker 6:	abstract_call at .\compiler\abstractinterpretation.jl:2281
      From worker 6:	abstract_apply at .\compiler\abstractinterpretation.jl:1689
      From worker 6:	abstract_call_known at .\compiler\abstractinterpretation.jl:2101
      From worker 6:	abstract_call at .\compiler\abstractinterpretation.jl:2281
      From worker 6:	abstract_call at .\compiler\abstractinterpretation.jl:2274
      From worker 6:	abstract_call at .\compiler\abstractinterpretation.jl:2424
      From worker 6:	abstract_eval_call at .\compiler\abstractinterpretation.jl:2437
      From worker 6:	abstract_eval_statement_expr at .\compiler\abstractinterpretation.jl:2668
      From worker 6:	abstract_eval_statement at .\compiler\abstractinterpretation.jl:2776
      From worker 6:	abstract_eval_basic_statement at .\compiler\abstractinterpretation.jl:3094
      From worker 6:	typeinf_local at .\compiler\abstractinterpretation.jl:3347
      From worker 6:	typeinf_nocycle at .\compiler\abstractinterpretation.jl:3429
      From worker 6:	_typeinf at .\compiler\typeinfer.jl:235
      From worker 6:	typeinf at .\compiler\typeinfer.jl:215
      From worker 6:	jfptr_typeinf_38200 at C:\buildkite-agent\builds\win2k22-amdci6-4\julialang\julia-master\julia-21b1e15b8f\lib\julia\sys.dll (unknown line)
      From worker 6:	_jl_invoke at C:/workdir/src\gf.c:2997 [inlined]
      From worker 6:	ijl_invoke at C:/workdir/src\gf.c:3023
      From worker 6:	typeinf_edge at .\compiler\typeinfer.jl:863
      From worker 6:	abstract_call_method at .\compiler\abstractinterpretation.jl:660
      From worker 6:	abstract_call_gf_by_type at .\compiler\abstractinterpretation.jl:101
      From worker 6:	abstract_call_known at .\compiler\abstractinterpretation.jl:2199
      From worker 6:	abstract_call at .\compiler\abstractinterpretation.jl:2281
      From worker 6:	abstract_apply at .\compiler\abstractinterpretation.jl:1689
      From worker 6:	abstract_call_known at .\compiler\abstractinterpretation.jl:2101
      From worker 6:	abstract_call at .\compiler\abstractinterpretation.jl:2281
      From worker 6:	abstract_call at .\compiler\abstractinterpretation.jl:2274
      From worker 6:	abstract_call at .\compiler\abstractinterpretation.jl:2424
      From worker 6:	abstract_eval_call at .\compiler\abstractinterpretation.jl:2437
      From worker 6:	abstract_eval_statement_expr at .\compiler\abstractinterpretation.jl:2668
      From worker 6:	abstract_eval_statement at .\compiler\abstractinterpretation.jl:2776
      From worker 6:	abstract_eval_basic_statement at .\compiler\abstractinterpretation.jl:3094
      From worker 6:	typeinf_local at .\compiler\abstractinterpretation.jl:3347
      From worker 6:	typeinf_nocycle at .\compiler\abstractinterpretation.jl:3429
      From worker 6:	_typeinf at .\compiler\typeinfer.jl:235
      From worker 6:	typeinf at .\compiler\typeinfer.jl:215
      From worker 6:	typeinf_ext at .\compiler\typeinfer.jl:1121
      From worker 6:	typeinf_ext_toplevel at .\compiler\typeinfer.jl:1179 [inlined]
      From worker 6:	typeinf_ext_toplevel at .\compiler\typeinfer.jl:1177
      From worker 6:	jfptr_typeinf_ext_toplevel_38444 at C:\buildkite-agent\builds\win2k22-amdci6-4\julialang\julia-master\julia-21b1e15b8f\lib\julia\sys.dll (unknown line)
      From worker 6:	_jl_invoke at C:/workdir/src\gf.c:2997 [inlined]
      From worker 6:	ijl_apply_generic at C:/workdir/src\gf.c:3193 [inlined]
      From worker 6:	jl_apply at C:/workdir/src\julia.h:2189 [inlined]
      From worker 6:	jl_type_infer at C:/workdir/src\gf.c:393
      From worker 6:	jl_compile_method_internal at C:/workdir/src\gf.c:2582
      From worker 6:	_jl_invoke at C:/workdir/src\gf.c:3008 [inlined]
      From worker 6:	ijl_apply_generic at C:/workdir/src\gf.c:3193
      From worker 6:	macro expansion at C:\workdir\usr\share\julia\stdlib\v1.12\Test\src\Test.jl:676 [inlined]
      From worker 6:	macro expansion at C:\buildkite-agent\builds\win2k22-amdci6-4\julialang\julia-master\julia-21b1e15b8f\share\julia\test\abstractarray.jl:1460 [inlined]
      From worker 6:	macro expansion at C:\workdir\usr\share\julia\stdlib\v1.12\Test\src\Test.jl:1700 [inlined]
      From worker 6:	top-level scope at C:\buildkite-agent\builds\win2k22-amdci6-4\julialang\julia-master\julia-21b1e15b8f\share\julia\test\abstractarray.jl:1448
      From worker 6:	jl_fptr_args at C:/workdir/src\gf.c:2658
      From worker 6:	_jl_invoke at C:/workdir/src\gf.c:3016 [inlined]
      From worker 6:	ijl_invoke at C:/workdir/src\gf.c:3023
      From worker 6:	jl_toplevel_eval_flex at C:/workdir/src\toplevel.c:960
      From worker 6:	jl_toplevel_eval_flex at C:/workdir/src\toplevel.c:909
      From worker 6:	ijl_toplevel_eval at C:/workdir/src\toplevel.c:980 [inlined]
      From worker 6:	ijl_toplevel_eval_in at C:/workdir/src\toplevel.c:1022
      From worker 6:	eval at .\boot.jl:432 [inlined]
      From worker 6:	include_string at .\loading.jl:2589
      From worker 6:	jl_fptr_args at C:/workdir/src\gf.c:2658
      From worker 6:	_jl_invoke at C:/workdir/src\gf.c:2997 [inlined]
      From worker 6:	ijl_apply_generic at C:/workdir/src\gf.c:3193
      From worker 6:	_include at .\loading.jl:2649
      From worker 6:	include at .\Base.jl:559 [inlined]
      From worker 6:	macro expansion at C:\buildkite-agent\builds\win2k22-amdci6-4\julialang\julia-master\julia-21b1e15b8f\share\julia\test\testdefs.jl:33 [inlined]
      From worker 6:	macro expansion at C:\workdir\usr\share\julia\stdlib\v1.12\Test\src\Test.jl:1700 [inlined]
      From worker 6:	macro expansion at C:\buildkite-agent\builds\win2k22-amdci6-4\julialang\julia-master\julia-21b1e15b8f\share\julia\test\testdefs.jl:26 [inlined]
      From worker 6:	macro expansion at .\timing.jl:578 [inlined]
      From worker 6:	#runtests#1 at C:\buildkite-agent\builds\win2k22-amdci6-4\julialang\julia-master\julia-21b1e15b8f\share\julia\test\testdefs.jl:24
      From worker 6:	runtests at C:\buildkite-agent\builds\win2k22-amdci6-4\julialang\julia-master\julia-21b1e15b8f\share\julia\test\testdefs.jl:5 [inlined]
      From worker 6:	runtests at C:\buildkite-agent\builds\win2k22-amdci6-4\julialang\julia-master\julia-21b1e15b8f\share\julia\test\testdefs.jl:5
      From worker 6:	unknown function (ip: 0b67a7df)
      From worker 6:	_jl_invoke at C:/workdir/src\gf.c:2997 [inlined]
      From worker 6:	ijl_apply_generic at C:/workdir/src\gf.c:3193
      From worker 6:	jl_apply at C:/workdir/src\julia.h:2189 [inlined]
      From worker 6:	jl_f__call_latest at C:/workdir/src\builtins.c:875
      From worker 6:	jl_fptr_args at C:/workdir/src\gf.c:2658
      From worker 6:	_jl_invoke at C:/workdir/src\gf.c:2997 [inlined]
      From worker 6:	ijl_apply_generic at C:/workdir/src\gf.c:3193
      From worker 6:	jl_apply at C:/workdir/src\julia.h:2189 [inlined]
      From worker 6:	do_apply at C:/workdir/src\builtins.c:831
      From worker 6:	jl_f__apply_iterate at C:/workdir/src\builtins.c:839
      From worker 6:	#invokelatest#2 at .\essentials.jl:1035
      From worker 6:	jl_fptr_args at C:/workdir/src\gf.c:2658
      From worker 6:	_jl_invoke at C:/workdir/src\gf.c:2997 [inlined]
      From worker 6:	ijl_apply_generic at C:/workdir/src\gf.c:3193
      From worker 6:	jl_apply at C:/workdir/src\julia.h:2189 [inlined]
      From worker 6:	do_apply at C:/workdir/src\builtins.c:831
      From worker 6:	jl_f__apply_iterate at C:/workdir/src\builtins.c:839
      From worker 6:	invokelatest at .\essentials.jl:1030
      From worker 6:	jl_fptr_args at C:/workdir/src\gf.c:2658
      From worker 6:	_jl_invoke at C:/workdir/src\gf.c:2997 [inlined]
      From worker 6:	ijl_apply_generic at C:/workdir/src\gf.c:3193
      From worker 6:	jl_apply at C:/workdir/src\julia.h:2189 [inlined]
      From worker 6:	do_apply at C:/workdir/src\builtins.c:831
      From worker 6:	jl_f__apply_iterate at C:/workdir/src\builtins.c:839
      From worker 6:	#110 at C:\workdir\usr\share\julia\stdlib\v1.12\Distributed\src\process_messages.jl:287
      From worker 6:	run_work_thunk at C:\workdir\usr\share\julia\stdlib\v1.12\Distributed\src\process_messages.jl:70
      From worker 6:	#109 at C:\workdir\usr\share\julia\stdlib\v1.12\Distributed\src\process_messages.jl:287
      From worker 6:	unknown function (ip: 0b66aa3f)
      From worker 6:	_jl_invoke at C:/workdir/src\gf.c:2997 [inlined]
      From worker 6:	ijl_apply_generic at C:/workdir/src\gf.c:3193
      From worker 6:	jl_apply at C:/workdir/src\julia.h:2189 [inlined]
      From worker 6:	start_task at C:/workdir/src\task.c:1240
      From worker 6:	Allocations: 908022155 (Pool: 907997306; Big: 24849); GC: 289
Worker 6 terminated.
abstractarrayUNHANDLED TASK ERROR: IOError: read: connection reset by peer (ECONNRESET)
Stacktrace:
  [1] wait_readnb(x::TCPSocket, nb::Int32)
    @ Base .\stream.jl:410
  [2] (::Base.var"#wait_locked#851")(s::TCPSocket, buf::IOBuffer, nb::Int32)
    @ Base .\stream.jl:973
  [3] unsafe_read(s::TCPSocket, p::Ptr{UInt8}, nb::UInt32)
    @ Base .\stream.jl:979
  [4] unsafe_read
    @ .\io.jl:891 [inlined]
  [5] unsafe_read(s::TCPSocket, p::Base.RefValue{NTuple{4, Int32}}, n::Int32)
    @ Base .\io.jl:890
  [6] read!
    @ .\io.jl:895 [inlined]
  [7] deserialize_hdr_raw
    @ C:\buildkite-agent\builds\win2k22-amdci6-4\julialang\julia-master\julia-21b1e15b8f\share\julia\stdlib\v1.12\Distributed\src\messages.jl:167 [inlined]
  [8] message_handler_loop(r_stream::TCPSocket, w_stream::TCPSocket, incoming::Bool)
    @ Distributed C:\buildkite-agent\builds\win2k22-amdci6-4\julialang\julia-master\julia-21b1e15b8f\share\julia\stdlib\v1.12\Distributed\src\process_messages.jl:172
  [9] process_tcp_streams(r_stream::TCPSocket, w_stream::TCPSocket, incoming::Bool)
    @ Distributed C:\buildkite-agent\builds\win2k22-amdci6-4\julialang\julia-master\julia-21b1e15b8f\share\julia\stdlib\v1.12\Distributed\src\process_messages.jl:133
 [10] (::Distributed.var"#103#104"{TCPSocket, TCPSocket, Bool})()
    @ Distributed C:\buildkite-agent\builds\win2k22-amdci6-4\julialang\julia-master\julia-21b1e15b8f\share\julia\stdlib\v1.12\Distributed\src\process_messages.jl:121
                                    (6) |         failed at 2024-06-04T16:40:17.615

vtjnash merged commit b946b94 into master Jun 5, 2024
5 of 7 checks passed
vtjnash deleted the jn/stackoverflow-warn branch June 5, 2024 21:59
gbaraldi added a commit that referenced this pull request Jun 6, 2024