Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error wrapping and conversion between modules #68

Closed
2 of 4 tasks
ancazamfir opened this issue May 20, 2020 · 14 comments
Closed
2 of 4 tasks

Error wrapping and conversion between modules #68

ancazamfir opened this issue May 20, 2020 · 14 comments
Labels
I: dependencies Internal: related to dependencies I: logic Internal: related to the relaying logic
Milestone

Comments

@ancazamfir
Copy link
Collaborator

Summary

Current relayer uses the anomaly crate for defining error "kinds" and handling errors. It is not clear if and how conversion between different errors or "kinds" can be done.

Problem Definition

There are interactions between IBC modules where error handling is currently awkward. For example the channel message creation (ICS4) verifies that input is properly formatted by invoking validation part of the "host" (ICS24):

   (...
          port_id: String,
    )
....
            port_id: port_id
                .parse()
                .map_err(|e| crate::ics04_channel::error::Kind::IdentifierError.context(e))?,

This would be nice:

            port_id: port_id.parse()?,

For Admin Use

  • Not duplicate issue
  • Appropriate labels applied
  • Appropriate contributors tagged
  • Contributor assigned/self-assigned
@ancazamfir ancazamfir added the I: logic Internal: related to the relaying logic label May 20, 2020
@ancazamfir ancazamfir added this to the 0.6-6mo milestone May 20, 2020
@ancazamfir ancazamfir changed the title Error wrapping and conversion betweeen module Error wrapping and conversion between modules May 26, 2020
@ancazamfir ancazamfir modified the milestones: 0.6-6mo, 0.7-7mo Jun 16, 2020
@ancazamfir ancazamfir modified the milestones: v.0.0.3, v0.0.7 Aug 14, 2020
This was referenced Nov 6, 2020
@adizere adizere added the I: dependencies Internal: related to dependencies label Nov 12, 2020
@adizere
Copy link
Member

adizere commented Nov 12, 2020

Seems like tm-rs is getting some activity on this front soon: informalsystems/tendermint-rs#669.

Since the anomaly crate will become deprecated, we'll also need to move away to something else.

@tarcieri
Copy link

Here's an example of moving from anomaly to eyre: iqlusioninc/crates#555

eyre has a unified eyre::Report type for errors and an eyre::Result type that eliminates the need for the generic error parameter.

Together those should eliminate the need for any error wrapping/conversions entirely. Here's an example of what an error.rs looks like after switching to eyre:

https://github.com/iqlusioninc/crates/pull/555/files#diff-e1562983b9a9dca8f286969d7c6506314d6f3c215da068f3172356961d0cc048

Note that we also plan on moving Abscissa to use eyre, although I don't have an exact timeframe on that.

@adizere
Copy link
Member

adizere commented Jan 2, 2021

After playing with eyre (#493), and given our experience with anomaly so far, here are some observations:

We cannot get rid of all 'awkwardness' around error handling

With this issue we originally wanted to get rid of patterns such as

   (...
          port_id: String,
    )
....
            port_id: port_id
                .parse()
                .map_err(|e| crate::ics04_channel::error::Kind::IdentifierError.context(e))?,

and replace them with:

            port_id: port_id.parse()?,

This replacement is possible with eyre, but not always desirable. Sometimes we may want to chain errors and wrap them into one another to capture both the high-level error cause as well as the lower-level cause(s). In the example above, error chaining is useful because it will report on a high level error (namely, channel msg. creation) and also point to an underlying error (identifier error in parsing the port id). Below is an example where error chaining is not useful, and we can get rid of it without impacting debugging/usability:

With error wrapping

For the keys list command, we can wrap any lower-level keybase error into a ErrorMsg::Keys error:
https://github.com/informalsystems/ibc-rs/blob/e4629358526564e26850c071b4cb030485d52c8c/relayer-cli/src/commands/keys/list.rs#L50-L55

The output in the error case would look like this:

keys list failed: keys error: Keybase error

Without error wrapping

The code becomes:

let res = list_keys(opts);
// ...

Output becomes:

keys list failed: Keybase error

As a rule of thumb, I think that the deeper we are in the stack, the more we may wish to chain errors (with wrap/map) instead of simply propagating them.

Some advantages of using eyre

We can simplify by removing the .context

For example this

match value.type_url.as_str() {
            TENDERMINT_CONSENSUS_STATE_TYPE_URL => Ok(AnyConsensusState::Tendermint(
                TendermintConsensusState::decode_vec(&value.value)
                    .map_err(|e| Kind::InvalidRawConsensusState.context(e))?,

becomes this (last line is relevant)

match value.type_url.as_str() {
            TENDERMINT_CONSENSUS_STATE_TYPE_URL => Ok(AnyConsensusState::Tendermint(
                TendermintConsensusState::decode_vec(&value.value)
                    .wrap_err(Kind::InvalidRawConsensusState)?,

or another example, where this:

let raw_client_state = raw
            .client_state
            .ok_or_else(|| Kind::InvalidRawClientState.context("missing client state"))?;

becomes this:

let raw_client_state = raw
            .client_state
            .ok_or_else(|| eyre!("missing client_state"))?;

Instead of eyre!("missing client_state") we could alternatively introduce a typed error such as Kind::MissingClientState. But the former solution was simpler to adopt for this early stage.

Some error conversion becomes simpler

In the process method of handler/conn_open_ack.rs, for example:

        None => {
            // No connection end exists for this conn. identifier. Impossible to continue handshake.
-            Err(Into::<Error>::into(Kind::UninitializedConnection(
-                msg.connection_id().clone(),
-           )))
+            Err(Kind::UninitializedConnection(msg.connection_id().clone()))

Some naming conventions become simpler

We used to have two types for managing errors (see also informalsystems/tendermint-rs#150 (comment)):

  • pub type Error = anomaly::Error<Kind>; which is redefined in every module and assigned the same anomaly type
  • pub enum Kind is the enumeration of module-specific errors

Using eyre we:

  • get rid of redefining the Error type everywhere; instead, we use eyre::Report. For example:
 impl TryFrom<Any> for AnyHeader {
-    type Error = crate::ics02_client::error::Error;
+    type Error = eyre::Report;
  • keep the enumeration Kind and rename that into Error (this is how eyre seems to be intended for use, example)

Disadvantages of using eyre

  • the eyre::Report type contains the error and that makes it difficult to handle errors by match or reporting (see this downcast example or this) on them

  • In general, it seems like we are moving away from core Rust. Below are two examples of code (one where eyre::Result replaces core::Result, and one where we replace map_err with wrap_err) that looks cleaner with eyre, but has the downside of making us more dependent on the eyre library.

-    pub fn sub(&self, delta: u64) -> Result<Height, Error> {
+    pub fn sub(&self, delta: u64) -> eyre::Result<Height> {
-                    .map_err(|e| Kind::InvalidRawClientState.context(e))?,
+                    .wrap_err(Error::InvalidRawClientState)?,

@tarcieri
Copy link

tarcieri commented Jan 2, 2021

In general, it seems like we are moving away from core Rust. Below are two examples of code (one where eyre::Result replaces core::Result, and one where we replace map_err with wrap_err) that looks cleaner with eyre, but has the downside of making us more dependent on the eyre library.

I'd say it's pretty common to define aliases of Result regardless of what error handling approach is being used.

I think it's especially nice for consumers of a downstream crate which need to impl a trait that uses the downstream crate's Result type, e.g. -> somecrate::Result<T>.

io::Result is an example of this in std.

@adizere
Copy link
Member

adizere commented Jan 4, 2021

Thank you @tarcieri, I wasn't familiar that this is a common approach. The example with std::io::Result drives the point home very well. I guess the second disadvantage I mentioned ("moving away from core") is not so much of a problem.

The other disadvantage I mentioned above has to do with the difficulty of matching or handling a eyre::Report. As stated in their docs:

Use eyre if you don't think you'll do anything with an error other than report it.

In the ibc-rs modules, most errors are unrecoverable, and I can't think of a case where we'd need to handle an error. In the relayer, however, there is a subset of all possible errors we'd like to handle, for example:

  • failed to submit a tx, and therefore we'd want to retry that,
  • failed a light client operation, and we may want to retry that,
  • failed to get a query responses from a full node, which we may want to retry as well.

All of these are network calls. That being said, it will not be impossible to match the error and retry (just not very clean).

@romac
Copy link
Member

romac commented Jan 4, 2021

We cannot get rid of all 'awkwardness' around error handling

...

Sometimes we may want to chain errors and wrap them into one another to capture both the high-level error cause as well as the lower-level cause(s). In the example above, error chaining is useful because it will report on a high level error (namely, channel msg. creation) and also point to an underlying error (identifier error in parsing the port id).

Agreed. A potential solution for tackling the verbosity of wrapping the parse error in a high-level error would be to introduce a parse_port_id function somewhere in the ics04_channel module:

fn parse_port_id(port_id: &str) -> Result<PortId, crate::ics04_channel::error::Error> {
  port_id.parse().map_err(crate::ics04_channel::error::Error::InvalidIdentifier)
}

which would be used as:

...
  port_id: parse_port_id(&port_id)?,
...

with the crate::ics04_channel::error::Error being defined as:

#[derive(Error)]
enum Error {
  InvalidIdentifier(ValidationError),
}

@romac
Copy link
Member

romac commented Jan 4, 2021

As a rule of thumb, I think that the deeper we are in the stack, the more we may wish to chain errors (with wrap/map) instead of simply propagating them.

Agreed. In my opinion, library APIs should return specific, nested errors via a Result<T, my_module::Error> type. The eyre::Report type should only be used in "top-level" application code in cases where one does not need to statically know about the specific kind of error being returned (keeping in mind such code can always fall back on downcasts to recover the specific underlying error if needed).

@tarcieri
Copy link

tarcieri commented Jan 4, 2021

...the difficulty of matching or handling a eyre::Report. As stated in their docs:

Use eyre if you don't think you'll do anything with an error other than report it.

In the ibc-rs modules, most errors are unrecoverable, and I can't think of a case where we'd need to handle an error. In the relayer, however, there is a subset of all possible errors we'd like to handle

@adizere in general it seems people use a combination of error handling strategies, selecting which one most appropriately fits their needs for each given scenario.

This generally looks like using an enum (with a library like thiserror if you so desire) for lower-level libraries, and a library like eyre (or alternatively anyhow, but eyre seems to have largely surpassed it) for higher-level error handling, particularly at the application level.

@romac
Copy link
Member

romac commented Jan 4, 2021

or another example, where this:

let raw_client_state = raw
            .client_state
            .ok_or_else(|| Kind::InvalidRawClientState.context("missing client state"))?;

becomes this:

let raw_client_state = raw
            .client_state
            .ok_or_else(|| eyre!("missing client_state"))?;

Instead of eyre!("missing client_state") we could alternatively introduce a typed error such as Kind::MissingClientState. But the former solution was simpler to adopt for this early stage.

I'd argue that introducing a specific error type is both cleaner and more useful as it allows the enclosing function to return a Result<T, my_module::Error> rather than have to use eyre::Report (cf. #68 (comment) and #68 (comment)).

@romac
Copy link
Member

romac commented Jan 4, 2021

This generally looks like using an enum (with a library like thiserror if you so desire) for lower-level libraries, and a library like eyre (or alternatively anyhow, but eyre seems to have largely surpassed it) for higher-level error handling, particularly at the application level.

Agreed.

@romac
Copy link
Member

romac commented Jan 4, 2021

Regarding .context, I would add that while anomaly's context method is very useful to add "dynamic" context to an error, we should instead be more principled and define appropriate hierarchies of nested error types and use those at the library level.

@romac
Copy link
Member

romac commented Jan 4, 2021

@adizere Thanks for the great write-up!

In the ibc-rs modules, most errors are unrecoverable, and I can't think of a case where we'd need to handle an error. In the relayer, however, there is a subset of all possible errors we'd like to handle, for example:

  • failed to submit a tx, and therefore we'd want to retry that,
  • failed a light client operation, and we may want to retry that,
  • failed to get a query responses from a full node, which we may want to retry as well.
  • All of these are network calls. That being said, it will not be impossible to match the error and retry (just not very clean).

In general, I agree with your conclusions, and in light of the comment of yours quoted above, I would suggest we do the following:

  • return specific errors (ie. Result<T, MyError>) in the modules crate (those being defined as plain, nested enums with help from the this error crate for deriving the std::error::Error trait)
  • return dynamic errors (eyre::Report) via eyre::Result<T> aliased as relayer::Result<T> in the relayer crate
  • same in the relayer-cli crate

This would let us:

a) precisely define all the various errors and their relationship in the low-level modules crate
b) easily catch/handle the errors yielded by the modules crate in the relayer, eg. for retrying some operations
c) easily bubble up any errors we deem unrecoverable from the relayer and relayer-cli crate

What do you think?

@adizere
Copy link
Member

adizere commented Jan 5, 2021

What do you think?

In short, I love it!

Both the thread of discussion as well as the last suggestions are great, so I have nothing to add at this moment. I'll align #493 with these comments and then we'll see where we go from there. Many thanks @tarcieri & @romac.

@adizere adizere modified the milestones: v0.1.1, v0.1.2 Feb 9, 2021
@adizere
Copy link
Member

adizere commented Aug 16, 2021

This issue is no longer relevant with the introduction of flex-error #988

@adizere adizere closed this as completed Aug 16, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
I: dependencies Internal: related to dependencies I: logic Internal: related to the relaying logic
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants