RFC: Uncatchable FatalException. Separating bugs from recoverable errors. #15514
Comments
I don't think we need a special If the only thing |
@nalimilan, as I see it, the benefit of the Midori model arises from having a class of errors which are not catchable at all. " Perhaps a better interface would be to have a non-exported The key thing is that from the ordinary user's point of view there is a class of errors that cannot be caught. Consider a practical example: If You could see this as a bargain struck between the tool designer who is trying to make the tool safe and efficient and the tool user who just wants to get stuff done: " |
I do like the idea of something like this. The idea of catching, say, an UndefVarError is crazy enough to warrant some special treatment at the language level. I agree with @samoconnor that catching exceptions by type is not quite enough. It's too difficult to know exactly which types of exceptions to catch. Ideally, the default behavior for There is a strong connection here to error checks that can be optionally disabled, like bounds checks. If all disable-able checks are handled with abandonment, you can be sure correct programs will work with |
@JeffBezanson can you tell me if I'm on the right track for a quick-and-dirty proof-of-concept implementation of uncatchable... I'm thinking that the condition Line 3073 in 6b5a05e

So, instead of this:

    expand(:(try error("foo") catch e println(e) end))
    :($(Expr(:thunk, AST(:($(Expr(:lambda, Any[], Any[Any[Any[:e,:Any,18]],Any[],1], :(begin
    $(Expr(:enter, 0)) # none, line 1:
    GenSym(0) = (Main.error)("foo")
    $(Expr(:leave, 1))
    return GenSym(0)
    0:
    $(Expr(:leave, 1))
    e = $(Expr(:the_exception)) # none, line 1:
    return (Main.println)(e)
    end))))))))

... you would get this:

    :($(Expr(:thunk, AST(:($(Expr(:lambda, Any[], Any[Any[Any[:e,:Any,18]],Any[],1], :(begin
    $(Expr(:enter, 0)) # none, line 1:
    GenSym(0) = (Main.error)("foo")
    $(Expr(:leave, 1))
    return GenSym(0)
    0:
    $(Expr(:leave, 1))
    e = $(Expr(:the_exception)) # none, line 1:
    unless (Main.isa)(e,Main.FatalException) goto 2 # none, line 1:
    (Main.rethrow)(e)
    2: # none, line 1:
    return (Main.println)(e)
    end))))))))

A quick hack like this would allow experimenting with making e.g. |
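The effect of the lowering change asked about above can be approximated in user code, without touching the front end. This is a minimal sketch assuming current Julia 1.x syntax; `FatalException`, `InvariantViolation`, and `catch_nonfatal` are hypothetical names chosen for illustration, not part of Base or of any PR.

```julia
# Hypothetical marker supertype: anything below it is treated as a bug.
abstract type FatalException <: Exception end

# Hypothetical example of a fatal error type.
struct InvariantViolation <: FatalException
    msg::String
end

# Run `body()`. Ordinary exceptions go to `handler`; a FatalException is
# rethrown first, mimicking the `isa` check injected at the top of `catch`.
function catch_nonfatal(body, handler)
    try
        return body()
    catch e
        e isa FatalException && rethrow()
        return handler(e)
    end
end

# An ordinary error is handled as usual.
recovered = catch_nonfatal(() -> error("foo"), e -> :recovered)

# A fatal error bypasses the handler and reaches the outer `try`.
escaped = try
    catch_nonfatal(() -> throw(InvariantViolation("bad state")), e -> :recovered)
catch e
    e
end
```

Because this is ordinary library code, a plain outer `try`/`catch` can still intercept the fatal error, as the last statement does. That limitation is presumably why the question above is about doing the check during lowering instead.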
I agree that uncatchable exceptions make sense – I suspect they should terminate the current task. That means you can still write processes that can carry on so long as some other task is still running. |
Having read that midori article several times now, I'm not convinced why we would want this. Uncatchable exceptions are a bit like private methods, in that they're a statement by the library writer that leverages them that he knows better than all his users, and is just forcing them to go looking for workarounds. For writing an operating system, that is a safe bet and a reasonable design decision. For the things people use Julia for, not so much. Even UndefVarError could have uses in exploratory automated generation of code. Things that seem like "obvious bugs" in isolation can be recoverable exception situations that trigger backtracking or mode switching in real algorithms (e.g. "restoration mode" in many optimization solvers). |
It could potentially make throwing uncatchable exceptions much more efficient – given the current cost of adding a single error return to a function call, that would be good. I also just don't really believe that people catch these sorts of things correctly in general. At least with a bugs-kill-the-task approach, you know exactly what you're catching – a failed task – and there's a reasonable possibility that the task that's catching the failure is in working order. |
My only concern with having just bugs-kill-the-task is that abandonment-on-bug behaviour could be lost in a situation like using |
As I read Joe Duffy one of the things that made abandonment feasible on Midori was very light-weight processes, so if you wanted robustness in face of abandonment you could spin up a new process easily and do the processing there. I imagine a web server on Midori would do one process per request, so that even if one request did something crazy it would only kill that process, not the entire web server. Rust does something similar with its panics, except there a panic only tears down the current thread, not the entire process. This does not isolate the potential fallout of a logic bug as well, but I suppose it is a better performance trade-off when running on systems with more heavy-weight processes. Personally, having recently had to debug a subtle logic bug that had been feeding junk data into a database in production for six months I've developed a real preference for systems where bugs show up early and in spectacular fashion. |
The problem I have with this is that it's a relatively complex system already, and yet it does nothing to prevent people from catching e.g. a So I'm suggesting that (like in #7026) one would never be able to catch an exception type which wasn't explicitly mentioned after @StefanKarpinski Why should throwing a fatal error be fast? Are you proposing they would become a standard way to do control flow? Else, I don't really see the point. I also think @tkelman is right that sometimes it's useful to be able to catch even the most fatal exceptions for debugging or to temporarily work around an ugly bug in library code. |
@johansigfrids: Julia's tasks are lightweight enough to be used the way you describe. In fact, if you want to write a server that handles multiple requests concurrently (which you do), then you need to spawn a task for each of them anyway. Tearing down a thread won't really make any sense since our threading model will have tasks as the unit of work, and those will be mapped onto threads by the work scheduler. In other words, threads belong to the system, tasks belong to the program. @nalimilan: What I was thinking (vaguely) in terms of performance is that we currently have to worry about unwinding the stack, figuring out where to unwind to, constructing an error object, etc. The problem isn't so much the time it takes to do this but the optimizations that being prepared to do it prevents. The task abandonment on bugs approach would make errors terminate the current task, which amounts to just putting the task in the "error" state and returning to the scheduler, all the hard work would be done on the handling side – whatever task is waiting on this one would have the entire task and its stack to figure out what happened. But the open questions are: can we avoid causing a GC frame in the caller of a method that errors, and can we avoid having errors prevent inlining of otherwise simple methods? I don't know, but if all you have to do is terminate the thread and call the scheduler, it does seem plausible that this could be easier. I wholeheartedly agree that catching the wrong error is way too easy right now. This would mitigate that problem by making a whole class of errors that you shouldn't be catching at all just bypass any catch block. Joe Duffy's main point about separating out bugs from I/O exceptions and the like, is that it makes exceptions that are catchable far less common – otherwise it's impossible to write any code anywhere that isn't riddled with catchable exceptions. 
By distinguishing errors from exceptions and making only the latter catchable, the number of places where you have to worry about true exception handling is reduced to a manageable level. The key issue is that the set of catchable exceptions a function can throw is really part of its signature: they are also ways for the function to return, and if you want to write a correct program you need to handle them. The "chain of custody" proposal makes this explicit by requiring you to annotate call sites with In Midori, the compiler forces you to handle catchable exceptions – programs won't compile, let alone run unless you handle all catchable exceptions. In Julia, we won't do that – unless you opt into it by running some kind of static code analysis tool on your program. But what we can do is convert an unexpected exception into a task-terminating error – because failing to handle an exception is a programmer error. This gives Julia libraries flexibility to evolve their APIs and introduce exceptions where they didn't previously exist: in Midori, that would cause a compile-time error, but in Julia programs would continue to work, raising errors if unexpected exceptions occur; if you do run static analysis tools on your code beforehand to detect unhandled exceptions, then you would get a warning about any new unhandled exceptions when you upgrade dependencies, and get a chance to handle them – but your code will still run. |
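The supervision side of "bugs kill the task" can already be tried with today's tasks, even though nothing in current Julia makes the error uncatchable inside the task. The sketch below assumes Julia 1.3 or later, where `fetch` on a failed task throws `TaskFailedException`; `handle_request` and `supervise` are hypothetical names, and the out-of-bounds index merely stands in for a bug.

```julia
# Hypothetical request handler; indexing past the end stands in for a bug.
function handle_request(i)
    data = [10, 20, 30]
    return data[i]              # throws BoundsError when i > 3
end

# Run each request in its own task. A failed task is recorded as :failed
# and does not take down the other tasks or the supervisor.
function supervise(requests)
    tasks = map(requests) do i
        Threads.@spawn handle_request(i)
    end
    outcomes = Any[]
    for (i, t) in zip(requests, tasks)
        try
            push!(outcomes, i => fetch(t))
        catch e
            # Only a failed task is expected here; anything else is rethrown.
            e isa TaskFailedException || rethrow()
            push!(outcomes, i => :failed)
        end
    end
    return outcomes
end

outcomes = supervise([1, 4, 2])
```

The supervisor only ever sees "the task failed", which matches the point made earlier in the thread that the catching side knows exactly what it is catching. The failed task itself remains available for inspection.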
One way to think of catchable exceptions in the chain of custody model – and a possible way to implement them if we can reduce the class of exceptions sufficiently – is that they are literally part of the function signature and that the

    function bar(a, b)
        # before
        throw(BarException())
        # after
    end
    function foo1(x)
        # before
        bar(2x, y) throws BarException
        # after
    end
    function foo2(x)
        # before
        bar(2x, y)
        # after
    end

It is really a shorthand for writing this:

    function bar(a, b; handleBarException=error)
        # before
        handleBarException(BarException())
        # after
    end
    function foo1(x; handleBarException=error)
        # stuff
        bar(2x, y, handleBarException=handleBarException)
        # more
    end
    function foo2(x)
        # stuff
        bar(2x, y)
        # more
    end

Obviously for stuff like array indexing, we can't afford this kind of implementation, but if we reduced catchable exceptions to things like I/O and other non-bug conditions, then it might be a perfectly reasonable implementation. An interesting aspect of this implementation approach is that recovering from exceptions is trivial – just provide a handler that returns a value. I'm not sure if we'd want to do that or not. |
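The handler-passing reading of the proposal can be run as ordinary Julia. This is a minimal sketch in Julia 1.x, not the proposed syntax. The division by zero is an assumed trigger for `BarException`, added only so the example has something to fail on, and `y` is made an explicit argument so the functions are self-contained.

```julia
struct BarException <: Exception end

# The handler is an ordinary keyword argument. The default, `error`,
# turns an unhandled BarException into an ErrorException.
function bar(a, b; handleBarException = error)
    b == 0 && return handleBarException(BarException())  # assumed failure condition
    return a / b
end

# foo1 opts in: it accepts a handler and forwards it to bar.
function foo1(x, y; handleBarException = error)
    return bar(2x, y; handleBarException = handleBarException) + 1
end

# foo2 forwards nothing, so bar falls back to `error`.
foo2(x, y) = bar(2x, y) + 1

ok        = foo1(1, 2)                                  # normal path
recovered = foo1(1, 0; handleBarException = e -> 0.0)   # handler supplies a value
unhandled = try
    foo2(1, 0)
catch e
    e
end
```

Here the handler `e -> 0.0` resumes `bar` with a substitute value, which is the "recovery is trivial" property mentioned above. The call through `foo2` shows the other half: an exception nobody agreed to handle becomes a plain error.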
@StefanKarpinski it is interesting that your CoC model does almost the same thing as Midori, but opposite. i.e. Midori has no type annotation at the call site (just a

I like the exception type annotation on the method signature because it is nice self-documentation. I worry that the burden of annotating the type at the call site might discourage creation of fine-grained exception types (because call sites would end up with a growing list of

A counter argument would be that: if it is poor form for an API method to return more than one or two exception types; and if catchable exceptions are outnumbered by hard errors 10:1; then

I'm interested to know what you think about this type-at-definition vs type-at-call-site tradeoff.

For those who haven't studied the Midori blog post, the example above would look like this "the Midori way":

    function bar(a, b) throws BarException
        # before
        throw(BarException())
        # after
    end
    function foo1(x)
        # before
        bar(2x, y)      # <- error: missing "try"
        # after
    end
    function foo2(x)    # <- error: missing "throws"
        # before
        try bar(2x, y)  # <- "try" means rethrow whatever bar() throws.
        # after
    end
    function foo3(x) throws BarException
        # before
        try bar(2x, y)
        # after
    end

See Easily Auditable Callsites. The

Alternate value on exception:

    i = try foo(x, y) else 7

Exception as value (kind of like

    type Result{T}
        value::T
        exception
    end
    result = try foo(x, y) else catch
    if is_failure(result)
        log(result)
        throw(result.exception)
    end
    println(result.value)

Exception as value propagation (like a general form of

    x = try foo() else catch
    y = try bar() else catch
    z = x + y

See Syntactic Sugar. |
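The "alternate value" and "exception as value" forms can be imitated in current Julia with small helper functions. This is a rough sketch rather than the Midori syntax. `Result`, `is_failure`, `attempt`, and `attempt_or` are hypothetical names, and nothing here is checked at compile time the way Midori's `try` is.

```julia
# Hypothetical container: holds either a value or the exception that
# prevented one from being produced.
struct Result{T}
    value::Union{T,Nothing}
    exception::Any
end

is_failure(r::Result) = r.exception !== nothing

# Hypothetical stand-in for `try f() else catch`: capture the outcome
# instead of letting the exception propagate.
function attempt(f, ::Type{T}) where {T}
    try
        return Result{T}(f(), nothing)
    catch e
        return Result{T}(nothing, e)
    end
end

# Hypothetical stand-in for `try f() else default`.
function attempt_or(f, default)
    try
        return f()
    catch
        return default
    end
end

good = attempt(() -> parse(Int, "42"), Int)
bad  = attempt(() -> parse(Int, "abc"), Int)
i    = attempt_or(() -> parse(Int, "abc"), 7)
```

A caller would then branch on `is_failure(bad)` and either log and rethrow `bad.exception` or carry on with `good.value`, mirroring the `Result` example above.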
@nalimilan what Jeff said above hints at the reason:
e.g. If you assume that
Other performance opportunities include:
|
How would generic pass through functions like |
I guess my proposal boils down to this: functions have to be annotated via Then the details of how abandonment happens can vary depending on compilation options: in release mode, the program would just abort. But in debugging mode or at the REPL, the exception would still be catchable using e.g. This doesn't mean we shouldn't have guidelines about which exceptions should be considered fatal by function writers. For example, we would advise not to add |
This needs to be possible to work around even in release mode. If library A throws something that it considers fatal, and library B which wraps it does not handle that case, I can guarantee there will be cases where you need to call library B in a way that these "fatal" errors from inside A are recoverable. Classifying exceptions and degrees of fatalness is such a subjective thing, one decision will not be appropriate for all use cases. I'd hate to have to copy all of library B wholesale and need to modify its exception handling annotations to be able to do this. Or have to introduce Tasks for every single computation when so many computational use cases have so far been able to ignore the existence of the Task programming model entirely. |
@samoconnor: I appreciate the easily auditable call site thing, but it requires an amount of static analysis that we don't – and generally can't – do in Julia. When you see

One way around this would be to make the

    function foo0(x, y)
        bar(x, y) # parse-time error?
    end
    function bar throws BarException end # some syntax for declaring this
    function foo1(x, y)
        bar(x, y) # parse-time error
    end
    function foo2(x, y)
        b = randbool() ? bar : +
        b(x, y) # error?
    end
    function foo3(b, x, y)
        b(x, y) # could be bar, depending on how it's called
    end

If the

The only way that we can in general, without completely changing the dynamic nature of the language, do this sort of thing is in an opt-in static mode where we do that sort of checking and separate the program into code that we know is ok, that we don't know about, and that we know is wrong.

Midori is trying to do something different than what my chain of custody proposal is aiming at: Midori's approach ensures that you cannot call any function without handling all possible exceptions (counting explicitly ignoring them as "handling"); the chain of custody proposal ensures that if an exception occurs that you didn't expect, a fatal error is raised, rather than the exception being caught by code expecting a different condition. Both approaches have in common that they make sure that when you catch an exception, it's actually the one you expected to catch, which can easily not be the case in Julia currently. |
Would it be possible to map the exception throwing and handling mechanism onto function arguments? I'm not proposing to use that as actual syntax, but because Julia's semantics for function calls and argument type matching are very well defined, defining a new mechanism in terms of this would avoid the need to explicitly handle exception declarations ("throws") in Julia's run-time system. For example, a function that might throw 3 different exceptions might be represented internally as a function that takes 3 keyword arguments with particular reserved names and types. In this way, a mismatch would be detected, and the method selection mechanism would ensure that a function returning exception E can't be called from a site that doesn't handle exception E. If keyword arguments don't work for this, then maybe a single argument with a parameterized type |
@nalimilan: your proposal is pretty similar in spirit to mine. The main difference, afaict, is whether you annotate the method signature with I think some of the reason @JeffBezanson didn't like my proposal may have had to do with him not understanding and me not conveying that the number of functions with |
@eschnett: that's essentially what I proposed above with my |
@StefanKarpinski You're right, I was mostly adapting your plan to this issue's proposal regarding fatal errors. Maybe after this discussion distinguishing exceptions that are supposed to be caught (and therefore mentioned in the annotation) from others, Jeff will be more convinced... |
From the original issue description:
There is much discussion above about refinement of the try/catch mechanism. Obviously more discussion is needed before a conclusion is reached. Putting that aside, and returning to the issue of "abandonment" for handling bugs, my question to @JeffBezanson and @StefanKarpinski is: would you support a minimal PR that adds an uncatchable exception type? |
See #15906. This implements a very simple fatal error mechanism. The intention of this WIP PR is:
If the general idea is accepted, there is lots of scope for the performance optimisations suggested by Jeff and Stefan to be added later. (e.g. disabling some checks in release mode, immediate task termination etc) |
The Midori Error Model has two types of error handling: "abandonment" for bugs, and a `try/catch` mechanism for recoverable errors (this has similarities with the "chain of custody" exception handling idea: julep: "chain of custody" error handling #7026 (comment)).

Combining the ideas from #7026 with Midori's `throws` method annotation and auditable call sites seems like a promising way forward for Julia's `try/catch` mechanism.

This RFC focuses on "abandonment" for handling bugs. The eventual refinement of the `try/catch` mechanism can be dealt with separately. The proposal here is to add a simple fail-fast bug handling mechanism that can co-exist with whatever the final `try/catch` design turns out to be.

The proposed approach is to:
- add an `abstract FatalException` type,
- add `if isa(ex, FatalException) rethrow(ex)` at the top of every `catch` block (i.e. make `FatalException` uncatchable).

A selection of exception types could be made `<: FatalException`, e.g. perhaps `ArgumentError`, `AssertionError`, `StackOverflowError`, `OutOfMemoryError`, `UndefVarError` (Joe Duffy's blog post has a list of the error types that were treated as fatal in Midori).

This would immediately make things a little safer in all the places where existing Julia `catch` blocks currently catch more than was intended.

The downside would be that the REPL would crash hard at the first `FatalException`. The Midori answer to this would be that the REPL should start a separate process to execute the code that the user types in.

Another approach would be to have a special `catchfatal` keyword for use in the REPL. There are a few other cases where `catchfatal` is also needed, e.g. test frameworks; a server that needs to log `FatalException`s before exiting; and the remote side of `remotecall_fetch` that needs to catch and serialise the `FatalException`.

Low-level library exception types should probably be made `<: FatalException` by default, e.g. `UVError`, see #14972. In most cases the `UVError` should be translated into something meaningful like `HostNotFoundError` (which would not be fatal), but any raw `UVError` occurrences that slip between the cracks should be fatal so that they are noticed and fixed.

Joe Duffy says: