[DONE] Should we reconsider our error types? #460
Replies: 5 comments 2 replies
-
Note: I am not really familiar with the identity.rs codebase, so please take everything with a grain of salt. Great comment @olivereanderson!
This implies of course that you add your own errors types at least for the larger libraries that you consume. But I think that since you are the one that is writing the logic to integrate the underlying library, you are in a much better position to classify the errors than your user is. |
Beta Was this translation helpful? Give feedback.
-
Thanks for the nice write-up @olivereanderson! I'm focusing on the account error enum, and even there just the non-wrapped variants to keep a reasonable scope. This also repeats parts of what @elenaf9 already said. Fatal vs. LocalWhat is fatal and local is not always very obvious (aka error handling is hard). Taking a closer look at our account error enum, it seems that we avoided making that distinction in the past and simply enumerated all possible errors, be they fatal or local. If we want to trim that down, we need to make that distinction and so we need a way to distinguish between the two.
The idea would be to add a
More debatable variants are:
Some variants we should keep as local errors:
Internal vs. ExternalSeparate more clearly between internal and external errors. This boils down to more distinct error types, less wrapping, and more mapping. Basically what @elenaf9 said, applied to our situation.
Convenience vs. explicitnessI want to preface this: None of these options address the fact that functions have an error in their signature that has all the variants even though the function may actually return just a subset of it. I don't know how to handle the combinatorial explosion of many possible unique subsets that functions can return, but would like to be proven wrong. I know stronghold did just that in a way, but I'm unsure if it applies to the account. Perhaps someone wants to explore that option? I primarily see a convenience issue, i.e. having to add lots of There are at least two options of putting this distinction into code. One is to take the current account enum, reduce the variants according to the above, and add a Taking the sled error post as inspiration, the second approach is to generally change our function signatures to: fn method(...) -> Result<Result<T, AccountError>, FatalError> where pub type AccountResult<T> = Result<Result<T, AccountError>, FatalError> As pointed out in the sled post, this allows users to use In summary, this proposes to:
Would love to hear everyone's thoughts! |
Beta Was this translation helpful? Give feedback.
-
:::info A note on stabilityAnother argument against enum style errors at the crate level is versioning; Unless the enum is marked with Note that what is described above also applies to enum style errors on the module level, hence if we decide to have a module with a single enum style error, then we need to consider how likely it is for this module to have implemented everything that logically should or could belong there. Hence if we want to provide exhaustive matching on the errors we expose in our public APIs and also easily add new functionality without introducing a major version, then the easiest way to achieve this goal is to avoid error types on the crate and (if possible) module level. Do we have kitchen-sink enums on our hands ?This post by Matklad speaks of the kitchen-sink enum anti-pattern. From what I understand it means to add all possible errors, including external ones, into a single enum. Even though our case is not as dire as the example from that post, the three problems listed there still (at least partially) apply to us:
Note that we mostly dont' re-export all the errors from our dependencies, so point 1. only partially applies, but we should still keep this in mind. A more responsible way of pushing error handling to the callerIt is possible to push error handling to the caller/user in a more isolated fashion. The idea is well explained in the blog post by Matklad from the previous section so I will quote it directly here:
Another benefit of this approach is that it tends to be easier to test, as it leads to having less side effects for any given function. The downside of this approach is that it can also lead to making more functions part of the public API even if the user is not interested in them, and the user is also forced to understand more dependencies . One could however argue that in some situations one would have to understand If this alternative way of pushing error handling to the caller could be of interest to us, then we also need to consider how well it fits with the DID specifications, as they may specify exactly what a functions signature should be (see for instance resolution). |
Beta Was this translation helpful? Give feedback.
-
Status update: In our weekly meeting we agreed to change our error handling in this library and we will start with the I suggest that we keep this discussion open for now so that we may ask for suggestions and/or help while we are refactoring the errors. |
Beta Was this translation helpful? Give feedback.
-
I don't have much to add, but I really liked reading the full discussion. From a framework perspective, our design philosophy centers around being convenient to implement. This does mean that error handling is an important thing to do right. As is mentioned in the discussions, I agree that we shouldn't force users to figure out which errors matter to them. We should do the heavy lifting and create manageable subsets for them. I like the idea of splitting the Fatal errors out and just keep them a bit vague, but make it clear these are fatal, so nothing you can really do about it. Lastly, I also think that the above point on breaking changes should be taken into account. In short, my opinion is that we should take the burden of responsibility away from the library consumers, at the cost of more work. |
Beta Was this translation helpful? Give feedback.
-
Should we reconsider our error approach:
Introduction
Users with little experience with Rust may want to first read the background section.
Currently each of our crates introduce a single error enum with (often) many different variants that cover any possible way any single method exposed in the crate can fail. This makes it very easy to compose functions and return early with the try operator
?
, but also leads to a large cognitive burden on our users as they have to think about which errors will actually occur and how to handle them. When there are so many variants to consider it is tempting for library users to match on the variants they understand and/or expect and use a wild card for the the rest. The creator of the sled crate has stated that such things indeed do happen, and they can cause bugs over time. Here is a quote from the following blog post (thanks @PhilippGackstatter for pointing this one out):Furthermore it is also hard for us to maintain these huge error enums. The fact that https://github.com/iotaledger/identity.rs/blob/dev/identity-account/src/error.rs#L14 and https://github.com/iotaledger/identity.rs/blob/dev/identity-account/src/error.rs#L95 both are variants of the same enum, and the former has an
InvalidPrivateKey
variant of its own, suggests that we are already struggling. Finally having to think about the possibility of loads of values that may or may not exist does not feel right in a statically typed language like Rust.These are the reasons we should reconsider how we go about error handling in this project. The purpose of this discussion is to try to establish guidelines for how our library should present errors.
Some background on how errors are represented in Rust
This section is mainly meant to give developers who mostly use the javascript bindings (whose input we definitely also value) a better understanding of what is being discussed. Unfortunately we cannot explain every aspect of error handling in Rust and must settle for a brief overview.
Representing errors in Rust
The book Rust for Rustaceans by Jon Gjengset states that there are two main options for representing errors: Enumeration or erasure.
An example of the former:
We have a service which users can log in to. There are several different ways in which this operation can fail for a given user so we enumerate them:
And the corresponding function would look something like
Whenever this function fails we can find out exactly how and act accordingly;
If the database connection is lost we may wish to re-establish it and retry, if the password is typed wrong 10 times we may wish to deactivate the account for the time being, if the username is not found we tell the user "we could not find any use with the password they provided".
Suppose now instead that we are making a library for compressing and decompressing arbitrary bytes. Note that callers of the (de)compression function are not likely to understand or care exactly how our function failed, hence it is enough for them to just know that an error did occur so they can log it if they wish to do so. Thus in this scenario we may provide an opaque error type like
CompressionError
or even be more vague and just state that our function returns something resembling an error (an Error trait object, which is a bit like an interface in languages with a stronger emphasis on OOP).The Try operator
Suppose we have a function that loads a config file.
This operation can fail in several ways, such as the config file was not found, or it was corrupted etc, so the function signature would look like something like this.
We also want a function that can restore certain settings back to their defaults
restore_default_settings
. Suppose that the default settings can be found in the config file, thenrestore_default_settings
would typically callload_config
internally to get the config. For this reason we expectrestore_default_settings
to fail if the call toload_config
fails and it would actually be super convenient if there was no other way for this operation to fail as then we could implement it in the following way:here the
?
operator means ifload_config
failed then return that error now and don't proceed any further. Ifrestore_default_settings
could fail in other ways in addition to the config not being able to load, then we would have to do some more programming in order to convert theLoadingError
to the appropriate one. This is why it is so tempting to have a single error type, because then it becomes super easy to compose functions. The problem with this approach is that we don't help the function caller much in handling distinct errors. Even with the enumeration approach the enumeration can get so long that it becomes very hard for users to understand how a given function actually can fail, as it is unlikely that all the functions in our library can fail in exactly the same way. In the creator of this discussions opinion this is a problem we now need to tackle in this project.More advanced ways of representing errors in Rust
In the previous section we said that enumeration and erasure are the two main options we have for representing errors in Rust, but there is nothing stopping us from combining these two ways. Here is an example (again thanks @PhilippGackstatter )
The
StorageErrorKind
gives a conceptual explanation for what went wrong, if aTimeout
orConnectionLost
variant is discovered a retry might be suitable, but if aNotFound
orWriteError
occurs this might be so serious that the application might want to shut itself down and notify someone (by email) to fix their storage device. For debugging purposes theStorageError
also keeps the original error (represented as a trait object on the heap) in theinner
field that can either be printed, or one could attempt to cast it down to a more specific error, if more information is needed.Another interesting way of representing errors can be found in this PR from the stronghold team where the implementation is roughly
And then the methods return things like
Result<(), Error<WriteSnapshotError>>
andResult<(), Error<ReadSnapshotError>>
. In short (from what we understand) one gets method specific errors for methods that can fail in ways shared with others without having to duplicate the code for those common cases.What are the best practices for error handling in Rust as of today?
The following quote is from the book Rust for Rustaceans :
Although this is true, it might be a good idea to look at the patterns from stdlib pointed out by Steve Klabnik in this discussion:
Note also that although the io ErrorKind has 40 variants, it is indeed an error for a module and not an entire crate and Withoutboats mentions in the aforementioned discussion that
io::Error
is perhaps really a very unique case.In the future error handling might become easier in Rust. There are several RFCs and discussions where people wish for anonymous sum types and they point out more ergonomic granular error handling as an important use case. Whether anonymous sum types eventually will be added to Rust is an open question, but the powerset enum crate can already be very helpful in this regard. Unfortunately that crate is only available on nightly Rust at the moment.
One possible Suggestion for error guidelines in identity.rs
Here is a suggestion for some guidelines we could follow when it comes to how we represent errors/failure in our project:
We stop having monster enum error types. But could consider having some global opaque error types to use for failures that we don't expect the caller to handle in detail.
For functions in the same module that are expected to be composed, we can use a common error enum defined in that module.
High-level functions that implement a known protocol capable of failing in different ways, some of which the caller would want to handle differently, should have their own dedicated error enum types parsable by humans.
Everyone is encouraged to keep the following quote from Yuan et.al in mind throughout this discussion:
Beta Was this translation helpful? Give feedback.
All reactions