Suggestion: abbreviate stack traces by default #40138
Comments
So always take the first? How about:
Would only showing the first entry there really be better? |
I think so? The error message itself is pretty clear, the stack trace seems dispensable? |
Or you could show the first non-Base/stdlib frame |
But how would you know where in your code the problem originated from? |
This has come up a couple times in the past and I generally resist hiding information. To me, the tradeoff is not good since at best this just avoids a bit of visual noise, but at worst it makes debugging harder. We will start to see bug reports without the whole stack trace. Obviously many people will request the full trace before filing an issue, but it's one more step in the way of getting the information we need. To be a bit hyperbolic, it's like complaining that a smoke detector is too loud. |
I totally understand that view. To a novice user though, extraneous information can be daunting and actively prevent them from understanding where the problem is, potentially leading to unnecessary support requests or frustration. I was partly inspired by the complaint posted on Slack from the JuliaOptics guy, but I've seen that sort of complaint before and it was a bit of a barrier for me getting started. Readily available, discoverable, but not shown by default strikes a good balance there IMO. I made a similar argument in favor of low-cost notice of deprecation warnings, against them being hidden entirely. Take this example:
If a stack trace can be filtered for Base/stdlib methods and show the top one, it would be a lot more clear where those more simple errors originate. Maybe there are good reasons not to do it right now, but it might be useful as Julia matures and grows. Amending with another example. HTTP requests fail with an exception. But it's just information, the stack trace isn't useful there, and takes up ~20 lines depending on window width:
|
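A minimal sketch of the "show the first non-Base/stdlib frame" idea, assuming frames are classified by the root module of the method they belong to. The helper names (root_name, is_user_root, first_user_frame) are hypothetical and not part of Base; the list of internal roots is also an assumption, and stdlib frames would need extra handling.

```julia
# Hypothetical helper: name of the outermost module that owns a frame, or
# `nothing` when the frame carries no method information (e.g. C frames).
function root_name(frame::Base.StackTraces.StackFrame)
    li = frame.linfo
    li isa Core.MethodInstance || return nothing
    def = li.def
    m = def isa Module ? def : def.module
    return first(fullname(m))
end

# Which roots count as "internal" is an assumption of this sketch; stdlib
# frames would need their own entries or a path-based check.
is_user_root(name; internal = (:Base, :Core)) =
    name !== nothing && !(name in internal)

# First frame, counting from where the error was thrown, that belongs to
# user or package code; `nothing` if no such frame can be identified.
function first_user_frame(frames)
    i = findfirst(f -> is_user_root(root_name(f)), frames)
    return i === nothing ? nothing : frames[i]
end

@noinline parse_input(s) = parse(Int, s)   # throws inside Base for bad input

try
    parse_input("abc")
catch
    frame = first_user_frame(stacktrace(catch_backtrace()))
    frame === nothing || println("error reached via: ", frame)
end
```

Frames without method information are skipped rather than guessed at, so on some Julia versions inlined frames may not be reported by this sketch.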
Is there some way to override how the REPL displays exceptions? I'm imagining something like a user-supplied |
I want to add a voice from the "core dev" side in support of abbreviated stack traces. I think this is a major usability issue for the language. Two-language solutions have an advantage here: the boundary between the high-level (e.g. Python) and low-level language (e.g. C) is a natural stopping point for errors. If something is wrong about how the user is calling a function, it will get caught before calling into C or not at all; if something goes wrong inside of the C call, the user doesn't get a C stack trace, they see an error raised by the C function. Imagine if Python users were subjected to C stack traces deep into the CPython runtime. That's basically what we're currently subjecting Julia users to. Yes, the stack is all Julia calls, but if something goes wrong deep inside of stdlib, does the typical user care about a dozen internal function calls between the public stdlib function they called and where the error actually occurred? They do not. Printing all that extra information makes it much harder for the typical user to understand what actually went wrong and fix it. Frankly, when I get stack traces, I find it annoyingly difficult to figure out where the actual problem is in the wall of text that we dump. |
What if we restricted the stack trace by default to show (a) lines in code defined in Main or any modules in untracked/dev packages, (b) the entry points into all other code, and (c) the topmost stackframe. The example above would become something like:
The other thing that would help a lot is to abbreviate the display of long parameterized type names, as I've done manually above. So many times I've been faced with an unhelpful "wall of types". |
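The frame-selection rule proposed above, (a) frames in user code, (b) entry points into all other code, and (c) the topmost frame, can be sketched over a plain list of root-module names. This is only an illustration under stated assumptions: visible_frames is a hypothetical name, user_roots stands in for "Main or any dev'd packages", and the module list in the example is made up.

```julia
# `roots[i]` is the root module name of frame i, with frame 1 being where
# the error was thrown. Returns the indices of the frames to display.
function visible_frames(roots::Vector{Symbol}; user_roots = (:Main,))
    isuser(i) = roots[i] in user_roots
    keep = Int[]
    for i in eachindex(roots)
        topmost = i == 1                                          # rule (c)
        # rule (b): a non-user frame called directly from user code
        entry = !isuser(i) && i < length(roots) && isuser(i + 1)
        if topmost || isuser(i) || entry                          # rule (a)
            push!(keep, i)
        end
    end
    return keep
end

roots = [:Base, :Base, :Base, :HTTP, :HTTP, :Main, :Base, :Main]
println(visible_frames(roots))   # frames 2-4 are hidden
```

Frames 2 to 4 are dropped because they are neither user code nor called directly from user code; frame 5 stays because it is the function the user's frame 6 called.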
That's basically what my split-off package does right now if you haven't taken a look yet, @stevengj. My plan is to refine the idea there and then it can be brought back when it seems most issues are addressed. |
Also, I see that @vtjnash suggested an alternative heuristic here. However, I'm not convinced that "crossing API boundaries" is the right heuristic. e.g. if |
Bump on this discussion. While I understand the argument that we want stacktraces to be complete so that when someone posts them onto the Discourse we have all of the information to debug from, it turns out that the current strategy of "print everything" actually stops that from occurring in practice. I mean, the stacktraces are so long now that we post pictures of them in chats as jokes. Do we really think people are going to copy-paste all of that?
We won't start to, we already do see people cut stack traces. What we see is gigantic 1000 line stacktraces that confuse users, which makes them think it's all a bunch of junk. This leads people to shorten it themselves (like in https://discourse.julialang.org/t/incorporating-forcing-functions-in-the-ode-model/70133/5). Making people who don't know how to read stacktraces be the ones to figure out what to delete is the worst of all worlds. I think the majority of heuristics would do better than most users' ideas (and the average Discourse user seems to generally take the suggestion of the OP to just take the top of the stack. That almost never points to the solution so I would heavily oppose that). |
What's a good heuristic here? Within the text which is currently light grey, it could (1) just cap the length at 1 screen width or so? (2) cap the number of levels of nesting and print? Still printing all the arguments (thus all the bits of white text at the end of [10] here). This seems largely orthogonal to deciding which frames to print, and perhaps easier?
One reason to nevertheless print things is that the people you ask for help may know about these packages.
This isn't linked here, but is https://github.com/BioTurboNick/AbbreviatedStackTraces.jl . This is, if I understand right, primarily about deciding which frames to print. |
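Option (2) above, capping the number of nesting levels, can be sketched purely on the printed string. This is an illustration only: cap_nesting is a hypothetical name, it assumes balanced braces, and it knows nothing about the type beyond its printed form.

```julia
# Replace everything nested deeper than `maxdepth` levels of braces with "…".
function cap_nesting(s::AbstractString; maxdepth::Int = 1)
    io = IOBuffer()
    depth = 0
    for c in s
        if c == '{'
            depth += 1
            if depth <= maxdepth
                print(io, c)
            elseif depth == maxdepth + 1
                print(io, "{…")        # mark the first elided level once
            end
        elseif c == '}'
            # close the levels that were opened in the output
            depth <= maxdepth + 1 && print(io, c)
            depth -= 1
        elseif depth <= maxdepth
            print(io, c)
        end
    end
    return String(take!(io))
end

println(cap_nesting("BSpline{Linear{Throw{OnGrid}}}"))
println(cap_nesting("Vector{Tuple{Int64, Vector{Float64}}}"; maxdepth = 0))
```

Because it works on text, the same function could be applied to a whole frame line; a cap by screen width, option (1), would be a separate and simpler truncation.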
Ah thanks for linking my package @mcabbott, and to its related draft PR #40537, which it seems I neglected to link back to this issue. And thanks @ChrisRackauckas for bumping it. The solution I settled on there was to hide "internal" frames by default, but they would be viewable in full on demand by accessing an But I hadn't done anything about type information otherwise because I haven't had direct experience of the problem, enough to have an idea of how to handle it. That screenshot is pretty crazy. This might not be exactly what my package would produce (I haven't tested it with nested task exceptions, and it could probably be abbreviated further than this), but in theory the stack trace in the Discourse thread would be abbreviated to:
┌ Warning: Only a single thread available: MCMC chains are not sampled in parallel
└ @ AbstractMCMC C:\Users\Bharadwaj\.julia\packages\AbstractMCMC\BPJCW\src\sample.jl:291
ERROR: LoadError: TaskFailedException
nested task error: TaskFailedException
Stacktrace:
[1-5] internal
@ Base.Threads, AbstractMCMC
nested task error: MethodError: no constructors have been defined for Any
Stacktrace:
[1-4] internal
@ Base
[5] materialize
@ Base .\broadcast.jl:883 [inlined]
[6] fitting_epidemic_wildtype(__model__::DynamicPPL.Model{typeof(fitting_epidemic_wildtype), (:observ_data, :w_forcing, :m_forcing), (), (), Tuple{Matrix{Float64}, Interpolations.BSplineInterpolation{Float64, 1, Vector{Float64}, BSpline{Linear{Throw{OnGrid}}}, Tuple{Base.OneTo{Int64}}}, Interpolations.BSplineInterpolation{Float64, 1, Vector{Float64}, BSpline{Linear{Throw{OnGrid}}}, Tuple{Base.OneTo{Int64}}}}, Tuple{}, DynamicPPL.DefaultContext}, __varinfo__::DynamicPPL.UntypedVarInfo{DynamicPPL.Metadata{Dict{AbstractPPL.VarName, Int64}, Vector{Distribution}, Vector{AbstractPPL.VarName}, Vector{Real}, Vector{Set{DynamicPPL.Selector}}}, Float64}, __context__::DynamicPPL.SamplingContext{DynamicPPL.SampleFromUniform, DynamicPPL.DefaultContext, Random._GLOBAL_RNG}, observ_data::Matrix{Float64}, w_forcing::Interpolations.BSplineInterpolation{Float64, 1, Vector{Float64}, BSpline{Linear{Throw{OnGrid}}}, Tuple{Base.OneTo{Int64}}}, m_forcing::Interpolations.BSplineInterpolation{Float64, 1,
Vector{Float64}, BSpline{Linear{Throw{OnGrid}}}, Tuple{Base.OneTo{Int64}}})
@ Main c:\Users\Bharadwaj\Indian Institute of Science\COVID-19 variant study - General\codes_wildtype_fitting\discourse_NPI_code.jl:100
[7-21] internal
@ DynamicPPL, AbstractMCMC, Turing
Stacktrace:
[1-11] internal
@ Base, AbstractMCMC, ProgressLogging, Turing
[12] sample(model::DynamicPPL.Model{typeof(fitting_epidemic_wildtype), (:observ_data, :w_forcing, :m_forcing), (), (), Tuple{Matrix{Float64}, Interpolations.BSplineInterpolation{Float64, 1, Vector{Float64}, BSpline{Linear{Throw{OnGrid}}}, Tuple{Base.OneTo{Int64}}}, Interpolations.BSplineInterpolation{Float64, 1, Vector{Float64}, BSpline{Linear{Throw{OnGrid}}}, Tuple{Base.OneTo{Int64}}}}, Tuple{}, DynamicPPL.DefaultContext}, alg::NUTS{Turing.Core.ForwardDiffAD{40}, (), AdvancedHMC.DiagEuclideanMetric}, ensemble::MCMCThreads, N::Int64, n_chains::Int64)
@ Turing.Inference C:\Users\Bharadwaj\.julia\packages\Turing\uMQmD\src\inference\Inference.jl:189
[13] top-level scope
@ c:\Users\Bharadwaj\Indian Institute of Science\COVID-19 variant study - General\codes_wildtype_fitting\discourse_NPI_code.jl:120
in expression starting at c:\Users\Bharadwaj\Indian Institute of Science\COVID-19 variant study - General\codes_wildtype_fitting\discourse_NPI_code.jl:120
EDIT: oh, just saw the OneDrive stacktrace 😬. I believe my package would have cut it down from 116 unwrapped lines to 33. Which, once type information is reduced, would look closer to:
ERROR: LoadError: TaskFailedException
nested task error: TaskFailedException
Stacktrace:
[1-5] internal
@ Base.Threads, AbstractMCMC
nested task error: BoundsError: attempt to access 36-element interpolate(::Vector{Float64}, BSpline(Linear())) with element type Float64 at index [NaN]
Stacktrace:
[1] internal
@ Base
[2] BSplineInterpolation
@ C:\Users\Bharadwaj\.julia\packages\Interpolations\3gTQB\src\b-splines\indexing.jl:6 [inlined]
[3] epidemic_wildtype(...)
@ Main C:\Users\Bharadwaj\Indian Institute of Science\COVID-19 variant study - General\codes_wildtype_fitting\discourse_NPI_code.jl:62
[4-11] internal
@ SciMLBase, OrdinaryDiffEq, DiffEqBase
[12] #solve#43
@ C:\Users\Bharadwaj\.julia\packages\DiffEqBase\Samo4\src\solve.jl:73 [inlined]
[13] fitting_epidemic_wildtype(...)
@ Main C:\Users\Bharadwaj\Indian Institute of Science\COVID-19 variant study - General\codes_wildtype_fitting\discourse_NPI_code.jl:107
[14-43] internal
@ DynamicPPL, Turing, Turing.Inference, ForwardDiff, AdvancedHMC, UnPack, AbstractMCMC
Stacktrace:
[1-15] internal
@ Base, AbstractMCMC, Base.CoreLogging, ProgressLogging, Turing.Inference
[16] sample(...)
@ Turing.Inference C:\Users\Bharadwaj\.julia\packages\Turing\uMQmD\src\inference\Inference.jl:189
[17] top-level scope
@ C:\Users\Bharadwaj\Indian Institute of Science\COVID-19 variant study - General\codes_wildtype_fitting\discourse_NPI_code.jl:123
[18] include(fname::String)
@ Base.MainInclude .\client.jl:444
[19] top-level scope
@ REPL[2]:1
in expression starting at C:\Users\Bharadwaj\Indian Institute of Science\COVID-19 variant study - General\codes_wildtype_fitting\discourse_NPI_code.jl:123 |
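The "[1-5] internal @ Base.Threads, AbstractMCMC" lines in the traces above come from collapsing each run of consecutive hidden frames into one entry. A rough sketch of that collapsing step follows. It is not the package's actual code: collapse_internal is a hypothetical name, frames are plain named tuples, and how a frame gets flagged as internal is left out.

```julia
# Each frame is a named tuple `(mod = "...", internal = true/false)`.
# Runs of internal frames become a single "[i-j] internal @ mods" line.
function collapse_internal(frames)
    lines = String[]
    i = 1
    while i <= length(frames)
        if frames[i].internal
            j = i
            while j < length(frames) && frames[j + 1].internal
                j += 1                       # extend the run of internal frames
            end
            label = i == j ? "[$i]" : "[$i-$j]"
            mods = unique(f.mod for f in frames[i:j])
            push!(lines, "$label internal @ " * join(mods, ", "))
            i = j + 1
        else
            push!(lines, "[$i] @ $(frames[i].mod)")
            i += 1
        end
    end
    return lines
end

frames = [
    (mod = "Base", internal = true),
    (mod = "Interpolations", internal = false),
    (mod = "Main", internal = false),
    (mod = "SciMLBase", internal = true),
    (mod = "OrdinaryDiffEq", internal = true),
    (mod = "SciMLBase", internal = true),
    (mod = "Main", internal = false),
]
foreach(println, collapse_internal(frames))
```

Keeping the original frame numbers in the labels means a reader can still ask for "frame 5" of the full trace without renumbering.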
How about not printing the type parameters if they exceed, say 20 characters. That would reduce Chris' stack trace a lot, e.g. frame 10 would be:
At least for me, the type parameters always seem to be the worst offenders. |
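The "drop type parameters longer than 20 characters" rule could look roughly like this when applied to an already-printed signature. It is a sketch under assumptions: elide_long_params is a hypothetical name, the input must have balanced braces, and only outermost parameter lists are measured.

```julia
# Replace any outermost {...} parameter list longer than `limit` characters
# with {…}; shorter lists are kept verbatim.
function elide_long_params(s::AbstractString; limit::Int = 20)
    out = IOBuffer()
    group = IOBuffer()          # collects the current outermost {...} contents
    depth = 0
    for c in s
        if c == '{'
            depth += 1
            depth == 1 || print(group, c)
        elseif c == '}'
            depth -= 1
            if depth == 0
                params = String(take!(group))
                print(out, '{', length(params) > limit ? "…" : params, '}')
            else
                print(group, c)
            end
        elseif depth > 0
            print(group, c)
        else
            print(out, c)
        end
    end
    return String(take!(out))
end

sig = "alg::NUTS{Turing.Core.ForwardDiffAD{40}, (), AdvancedHMC.DiagEuclideanMetric}, N::Int64"
println(elide_long_params(sig))
```

Short parameter lists such as Vector{Float64} survive, so common types stay readable while the large generated ones collapse.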
Maybe this should be a separate issue from the other sense of abbreviation? Anyway the option to truncate it just by length not by nesting is dead-simple, since there is already a function handling a string: https://github.com/JuliaLang/julia/blob/master/base/show.jl#L2382-L2393 . For example, as in this gist. |
I should correct myself: My package does currently elide type parameters when there are more than 2. But I don't propose that's the best rule. It does make sense to talk about that separately. |
I am not a fan of heuristics. I think the better way to solve this is to create Then, we also modify the REPL so that we can interactively fold and unfold portions of the type. https://github.com/JuliaCollections/FoldingTrees.jl already does this but it's in a multiline format, and for type-printing I think we want an inline variant. This should be done in a way that I can use with an arbitrary type that I call Finally, we modify stacktraces to use All this should be done in a manner that allows VS Code to do similar operations but via mouse clicks. And yes, I agree with @ChrisRackauckas and others that this is a first-order usability issue. |
You mean this infamous
julia> StridedArray
StridedArray (alias for Union{DenseArray{T, N}, Base.ReinterpretArray{T, N, S, A, IsReshaped} where {A<:Union{SubArray{T, N, A, I, true} where {T, N, A<:DenseArray, I<:Union{Tuple{Vararg{Real}}, Tuple{AbstractUnitRange, Vararg{Any}}}}, DenseArray}, IsReshaped, S}, Base.ReshapedArray{T, N, A} where A<:Union{Base.ReinterpretArray{T, N, S, A, IsReshaped} where {T, N, A<:Union{SubArray{T, N, A, I, true} where {T, N, A<:DenseArray, I<:Union{Tuple{Vararg{Real}}, Tuple{AbstractUnitRange, Vararg{Any}}}}, DenseArray}, IsReshaped, S}, SubArray{T, N, A, I, true} where {T, N, A<:DenseArray, I<:Union{Tuple{Vararg{Real}}, Tuple{AbstractUnitRange, Vararg{Any}}}}, DenseArray}, SubArray{T, N, A, I} where {A<:Union{Base.ReinterpretArray{T, N, S, A, IsReshaped} where {T, N, A<:Union{SubArray{T, N, A, I, true} where {T, N, A<:DenseArray, I<:Union{Tuple{Vararg{Real}}, Tuple{AbstractUnitRange, Vararg{Any}}}}, DenseArray}, IsReshaped, S}, Base.ReshapedArray{T, N, A} where {T, N, A<:Union{Base.ReinterpretArray{T, N, S, A, IsReshaped} where {T, N, A<:Union{SubArray{T, N, A, I, true} where {T, N, A<:DenseArray, I<:Union{Tuple{Vararg{Real}}, Tuple{AbstractUnitRange, Vararg{Any}}}}, DenseArray}, IsReshaped, S}, SubArray{T, N, A, I, true} where {T, N, A<:DenseArray, I<:Union{Tuple{Vararg{Real}}, Tuple{AbstractUnitRange, Vararg{Any}}}}, DenseArray}}, DenseArray}, I<:Tuple{Vararg{Union{Int64, AbstractRange{Int64}, Base.AbstractCartesianIndex, Base.ReshapedArray{T, N, A, Tuple{}} where {T, N, A<:AbstractUnitRange}}}}}} where {T, N})
julia> f(::StridedArray) = 1
f (generic function with 1 method)
julia> methods(f)
# 1 method for generic function "f":
[1] f(::StridedArray) in Main at REPL[3]:1 |
Any plans to add more visual elements in stacktraces a la https://github.com/FedeClaudi/Term.jl ? Maybe adding boxes around the main error message could help? |
That is pretty! That could be a separate issue, or demonstration package, possibly? Though if you have an idea that would look particularly great with the abbreviated traces I came up with here: https://github.com/BioTurboNick/AbbreviatedStackTraces.jl I'd welcome an issue or PR. |
I had a very concrete way to shorten a lot of stack traces without losing any information while also making code easier to maintain, so I opened that in #45687 . It would solve the long stack trace problem for SciML at least. |
I would like to voice my support for improving the current state of stack traces, and soon. As a real-world example, when using OrdinaryDiffEq.jl for time integration, we regularly get error stacktraces for Trixi.jl that are around 80 KB in size, where two stack frames alone have >25k characters. If I trigger such an error with multithreading enabled (e.g., 32 threads), this is even worse. I truly appreciate the fantastic job the OrdinaryDiffEq.jl folks do, and I think they are doing everything right in leveraging Julia's capabilities to the full extent. However, I believe the tooling (in this case: stack traces) should keep up with such "extreme" usage of the Julia machinery (specifically, the type parameters). A one-size-fits-all solution for this issue is likely impossible, but if we were able to start with one of the proposed approaches (either radically shortening type parameter printing or omitting some frames), we would be able to gather experience with this in practice. Of course, there should be a command line flag + possibly an environment variable to restore the previous behavior (e.g., to globally turn off stacktrace shortening) and/or an easy option to recreate the full stack trace (like |
#45687 is something we want to follow up on. It would solve most SciML cases. |
|
Where can I find this/how can I use this - is this documented in the Julia manual somewhere? It does not solve the issue of the overly large stacktraces though, or does it? |
No, just pointing that out. It was in NEWS for 1.8 but it appears some docs are missing. Should probably be next to |
Can we have abbreviated stack traces (by default), and also output the full one elsewhere? The example Chris had is bad, and worse if people don't know where to cut. The short version could end with "full stack trace is at /tmp/something". It's unclear to me whether it's advisable to overwrite this file each time, or to log-rotate and keep the last 3 or so, maybe configurable. Or put it into syslog (maybe only if opted into with an ENV var)? Could we even pop up a window with the short version and a details button? [In the issue I came from that linked here, it was suggested that SIGTERM is normal and no stack trace should be given, since it fills up log files. That seems correct, and an exception. Possibly it should be opt-in overridable.] |
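A minimal sketch of the "short message on screen, full trace in a file" idea, using only showerror from Base. The function name report_briefly and the choice of a fresh temp file are assumptions for illustration; overwriting, rotation, and syslog are not addressed here.

```julia
# Print only the error message to `io`, and write the message plus the
# complete backtrace to a file whose path is reported to the user.
function report_briefly(io::IO, err, bt)
    path = tempname() * ".log"      # assumed location; not a proposed default
    open(path, "w") do f
        showerror(f, err, bt)       # message and full stack trace
        println(f)
    end
    showerror(io, err)              # message only
    println(io, "\nFull stack trace written to ", path)
    return path
end

try
    sqrt(-1.0)
catch err
    report_briefly(stderr, err, catch_backtrace())
end
```

Since the file always receives the unabbreviated trace, this would work the same whether or not the on-screen trace is later abbreviated.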
I think writing exceptions to a file would be a great idea. Could do a separate issue/PR for that, as that would possibly be useful regardless of whether traces get abbreviated, though it would work well with it. And might get better discussion than here. If nothing else, it could just copy what gets written to the Invoking anything with a GUI is probably a no-go in Base though. I think though the stacktrace in the issue you're referencing (from the Julia process crashing) is a different matter than the traces produced internally in a normal Julia session - different code path, different considerations. |
I wouldn't want to write to a file for non-erroneous situations, which is why I also mentioned SIGTERM, which is considered normal. Another thing: when I install packages, I often stop the precompilation process intentionally and get huge stacktraces. I'm not really interested in them, even if abbreviated. A separate PR could drop them entirely? But for real, valid stacktraces, I would be very OK with abbreviated ones (similar to Python, stopping at the C API) plus the file idea I proposed. Would anyone object to that? I'm wondering whether this could be considered a privacy issue, or bad for disk space reasons. The GUI idea was a way to avoid wasting disk space; log rotation would do that too. But it seems bad to only abbreviate and have no way of getting the full trace, since that can be helpful. If current stack traces go to logs in some situations, and too many are generated (never good), then we would lose out with log rotation (so it should be off by default?), only keeping the abbreviated ones in syslog... |
I made a proposal on the PR, in case people have participated on the issue and not the PR: #40537 (comment) Maybe it is a faster way forward? |
I'd like to suggest abbreviating the display of stack traces in errors by default, with the ability to display the full trace immediately after if you wish to see it.
As an example, this stack trace from when BenchmarkTools.@btime errors:
An improved experience might be something like:
Related to #36517, where @vtjnash mentioned the possibility of an errs variable that stores an error, similar to ans storing the last result.