Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Principles: Error handling #84
Principles: Error handling #84
Changes from 5 commits
8a3ced1
5d53403
ce47f2d
9e21afd
827500c
ccb9469
4cc1618
fe0e1e4
9218134
6baa4f5
fe4a8af
8c58e3d
3ecbd2e
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another option is to roll back the damage to the state that was done by the error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general, please consider style when you're using parentheses. While I know you lean towards more use, I think things like this "(mostly)" could be better handled with a little rephrasing, like "the most effective way" or ".
https://developers.google.com/style/parentheses
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The repeated ", or" plus other commas (", if", ", that", ", among") in this sentence makes it hard to read. Consider rewording, e.g.:
For example, ... might mean:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Markdown considers bulleted lists to be separate paragraphs (with vertical whitespace above and below), so I'd rather not go that route. How's this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with the conclusion, but I don't agree with the reasoning :) I think the reasoning is important, because it determines what is considered a programming error.
Consider an issue that is typically considered a recoverable error -- an operation that opens a file for reading by name determines that there's no such file. "When such an error is detected, the original cause is neither known nor bounded." It can have many causes: the programmer forgot to check whether the file exists, the programmer forgot to call the routine that creates the file, the programmer forgot to handle "out of disk space" error from the routine that creates the file, the programmer that wrote the script that invokes this program creates the file in the wrong directory, etc. "Without more information, it's impossible to know, so the only way to somewhat reliably recover from a programming error is to discard the entire address space and terminate the program."
Given this explanation, is there a distinction between dereferencing a dandling pointer and file not found error?
If you ask me, I'd explain it in terms of preconditions of APIs and language features. Violating a precondition is a programming error that is non-recoverable. If an API or a programming language feature has a requirement that some condition must hold, but it can detect a violation and return control to the caller, then it is not a programming error -- it is regular control flow (which may be expressed using error handling language features if we so desire).
A precondition can be something an API requires (for example, a file must exist, input array must be non-empty, input array must be sorted etc.), or the programming language requires (a pointer to be dereferenced must point to valid memory, addition should not overflow etc.)
"File not found" is usually not a programming error, but it is entirely reasonable to design an API where "file not found" is a non-recoverable error that terminates the program (think of a map reduce batch job). In that case, file being present and readable is a precondition, and violating it is a programming error. So what is a precondition and what is an error that can be handled really depends on the designer of the API or of the language feature.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The distinction I would make is that in the case of a file-not-found error, we may not know the cause for certain, but the program can know (at least roughly) what the likely causes are. In the case of a dangling pointer, on the other hand, the program generally can't even know that, at least not with enough specificity to plausibly recover. I've tried to rephrase to make that clearer; does that help?
I agree with that, but it seems to just assert the position that I'm trying to justify here: that programming errors should be considered non-recoverable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still don't see much of a distinction. It is often possible to make an informed guess about why the pointer is dangling -- think about all those times when a report from ASan that such and such pointer is used after free is all one needs to implement a fix even when one can't reproduce the problem locally.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, but the code handling the error, and the programmer writing that code, doesn't have access to that ASan report. Or if they do, they should probably just fix the bug, rather than try to detect and programmatically recover from it. I've tried to make this point more explicit; does that help?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like "throw exceptions" suddenly pulls a ton of context from C++ into this document...
Can this be phrased more generically?
Possible approach:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Beyond the caveat you give below (or maybe instead? see my comments below) I think it would be good to pretty clearly call out that the goal is to address the underlying requirements here, just in a different way.
Basically, I think we don't want people to take away from this that Carbon won't be applicable in a sharply memory constrained environment. I think we're pretty committed to having some way to support such uses of Carbon if we want this to be viable in a wide range of environments. Just that the approach isn't expected to be for the default heap allocation mechanism to allow for recoverable failure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is tricky, since we're talking about the standard library here. Is the goal to make sure that we don't prevent those users from using Carbon, or is it to make sure that those use cases get first-class support in the standard library? The former would just mean we need to make sure the language doesn't get in their way, which I think we definitely do want, but the latter would mean we have to provide alternative memory-exhaustion-compatible APIs for everything in the standard library, which I think we definitely don't want. At the level of a principles doc, I don't know how to spell out where between those extremes we intend Carbon to land.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Separate comment -- maybe worth clarifying that this is true for the default memory allocation APIs, but not necessarily all?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like both of these statements are pushing a bit far into details and specifics that haven't materialized yet. I think they're more intended to be examples, but as written feel a bit sweeping in scope.
For example, I think we might work to enable parts of the standrad library to take advantage of different allocation strategies like this if we can find a clean way to incorporate it into the design. But it is a big "if", and I'm totally down with not overpromising. I just don't want to discourage too sharply either or preclude still open design exploration.
As I mentioned above, maybe we can replace specific caveats with a more general statement around working to explore and find ways of addressing the fundamental requirements of constrained systems programming which don't have as dramatic of an effect on the overall language and API design.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Working to explore" those use cases is pretty different from having them be an explicit goal (which you seem to be suggesting above), so I'm not sure what you're looking for here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think "may need to build their own libraries on top of it" covers this adequately: it does leave open the possibility of a standard library that includes recovery from memory allocation failure.
I do want to avoid over-promising in the other case too, though: saying we may provide heap allocation that allows recovery from allocation failure rather than saying we definitely will.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a little worried the second half here will be read to indicate that genuinely huge stack sizes will be necessary much like they are in C++.
I think we should (similar to above) actually address the use case for sharply limited stack size, but in a way that doesn't require recovering from arbitrary stack exhaustion.
As a concrete thing, I'd really love if we could allow threads to have very small data stacks by default while allowing them to grow cleanly to quite large when necessary. This would help reduce the address space pressure and other challenges of the current C++ model.
Anyways, mostly I worry we're getting too far into exactly how we will do this in Carbon rather than just the high level principle that the default memory allocation approach won't have recoverable errors on exhaustion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you say more? The connection between the two isn't obvious to me, because I don't see how recoverable stack-overflow errors can be used to mitigate a limited stack size, except in the very limited sense that they might let you isolate stack-overflow failures to a single computation, rather than the whole process. In other words, it seems like your system has to be designed so that the computations that you need to actually work will fit within the stack size limit, regardless of whether stack overflows are recoverable.
At least in the case of stack exhaustion, isn't that pretty much what I've done?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you want to explicitly call out that this may simply be via the return type?
Otherwise, I can see this reading as an explicit marking beyond the return type even if we end up completely representing the error in the return type...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not convinced on this paragraph. My canonical example for when the "invisible control flow" has been helpful for me in the past is writing network code. I can't always verify the data is good before doing any work (doing so might imply doing double the work or waiting to begin processing one packet until the next packet has arrived), so an error can occur fairly deep in the call stack. For certain types of errors, bad data suggests some sort of corruption somewhere. There is frequently no sane way to recover, so the program logic I want to express is "Go back to the code where I created this socket, clean up anything that I created since then, disconnect, and reconnect". Exceptions handle that perfectly and localize the error to the two places that care about it.
You somewhat address this later on in the section "Error propagation must be straightforward". To be convinced of this, I would want to see an alternative strategy that is at least somewhat comparable. I don't expect it to be as nice as "no code", but I want to understand just how much of that I am giving up to get the benefits you describe before I could say I support this paragraph.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rust and Swift mark the propagation of an error with a single token at the callsite (postfix
?
and prefixtry
, respectively), which seems about as close to "no code" as you can get while still being code (admittedly, in some cases you may also need parentheses for disambiguation, as with any other unary operator). That's the kind of thing I have in mind when I say propagation should be straightforward.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What might help strengthen the argument is to talk about the experience of a reader in the middle of the propagation, who is less familiar with the code than the author. This is IMO where the readability hit is felt most -- otherwise as David says it can feel like an effective way to separate concerns. But a reader who is trying to understand the behavior of code in the middle and is unaware that control flow doesn't proceed as expected based on the locally visible code can be left completely lost and having to read a much larger amount of code both up and down the call stack to understand what the local behavior is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you con make this a bit more squishy without removing the importance of it:
I think this also avoids a debate over "is it really altering the structure?" by instead focusing on how much structural churn is necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this conflates two questions.
(1) Does Carbon itself, either in the core language or in the standard library, establish an error hierarchy?
(2) Does Carbon allow/encourage/require users to define their own hierarchy?
The text itself mainly answers question 1, but the argument about brittle code also applies to question 2. I believe it's important to address both.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree this conflates those questions, but I think that conflation is correct: the two questions are aspects of one underlying question, namely whether classifying propagated errors is a programming practice that Carbon will encourage. I've tweaked part of the next paragraph to be less specific to (1); are there other places that you think put too much emphasis on (1), or not enough emphasis on (2)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"or other reusable error vocabulary" seems a bit over-broad to me. Go's
error
interface seems pretty harmless to me (in particular it doesn't require so many type shenanigans as Rust'sError
trait), and your arguments about the downside of hierarchy and classification don't seem to apply to it.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I phrased this poorly; I didn't intend to exclude things like that. Better?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The second part of this doesn't really make sense to me when first reading it. Reading the rest, I think I understand, but maybe to help clarify:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, except that I've omitted "common set" because I actually mean this to cover any properties.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like an argument against including an easy to use construct to propagate errors, rather than an argument against universal error classification APIs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, it's an argument against having both convenient error propagation and universal error classification. But as argued above, I think we need convenient error propagation, so classification has to be what we drop.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like this is actually an independent principle that would be worth having: the desire to minimize (and potentially avoid) having fundamental language constructs or control flow constructs whose only purpose is error handling, and instead to try to ensure the general facilities of the language are sufficient. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed offline, I regard that as a nice-to-have rather than a requirement, and I can easily imagine wanting to trade it off for priorities like readability, so I'm hesitant to enshrine it as a principle.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This example loses me, i think because it is imagining a fairly specific thing and I just don't have the context.
How essential is it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's important to provide a concrete example, because the previous discussion has been pretty abstract. However, if it's losing you, it's not doing that job. Let me try to elaborate, and then hopefully you can suggest how to phrase it more clearly without spending a paragraph on it:
There's a very common language feature that lets you specify a block of code and a set of pattern/handler pairs, and if an exception escapes the block, control is transferred to the handler whose pattern best matches the exception. try/catch in C++, Java, and JavaScript, try/except in Python, and do/catch in Swift are all examples. However, this feature really combines two separate pieces of functionality:
The primary practical consequence of the passage above is that Carbon will not have #2, but I don't want to just say "Carbon won't have try/catch", because nothing we've said so far has ruled out having #1 on its own, i.e. having a form of try/catch that doesn't incorporate pattern matching.