Improve ChainId implementation; this is API breaking change #698

mina86 · 2023-06-01T02:36:49Z

ChainId::new allocates a new String so there’s no point in passing
ownership of name argument. The constructor can accept it as a str
slice without loss of functionality. This reduces allocations when
creating ids since caller no longer has to have an owned string.

Replace From implementations with TryFrom implementations. Conversion
now fails if string is not in an epoch format. There’s still a way to
construct an identifier without epoch format if it comes from
tendermint::chain::Id (or by using ChainId::new with zero version;
behaviour of that might change). And since there’s From
implementation available, also get rid of ChainId::from_string.

Optimise epoch format parsing. from_string used to check if argument
was in epoch format and then call chain_version which did it again.
To avoid duplicating work, introduce split_chain_id method which does
the parsing and returns result which includes all the information
callers might want.

Similarly, calling split and collecting values into a vector is
wasteful if all we need is the last token. Furthermore, regexes are
quite a heavy machinery for the task of checking if string ends with
a number. To address that, use split_last and parse the number to
check if it’s valid.

PR author checklist:

Added changelog entry, using unclog.
[n/a] Added tests.
[n/a] Linked to GitHub issue.
Updated code comments and documentation (e.g., docs/).
Tagged one reviewer who will be the one responsible for shepherding this PR.

Reviewer checklist:

Reviewed Files changed in the GitHub PR explorer.
Manually tested (in case integration/unit/mock tests are absent).

crates/ibc/src/core/ics24_host/identifier.rs

plafer

Thank you for this @mina86, this is a much needed cleanup of the type.

plafer · 2023-06-02T20:10:11Z

crates/ibc/src/core/ics24_host/identifier.rs

-                split.join("-")
-            };
-            self.version = version;
+        if self.version != 0 {


This is a slightly different check than previously. The previous implementation effectively checks if there's a - in the id, whereas this checks if version is 0 (which can occur with e.g. chain id "mockChainNotEpochFormat").

I suggest we:

store an is_epoch_format boolean in the type itself, so that we only need to parse the chain ID upon construction.

Use this is_epoch_format instead of self.version != 0

Change this signature to with_version(self, version: u64) -> Result<Self, IdentifierError>

we needlessly require the caller to have a mutable reference to the object

we should fail if the identifier is not in epoch format, and let the caller decide what to do then. Note that this function is not used anywhere, so this change of signature should be straightforward

self.version == 0 implies that the id is not in epoch format so
checking it is just an optimisation¹. As far as I can tell the old
and new code behave the same way (module panic if version doesn’t fit
u64). Neither the old nor the new code will change identifier such
as foo-bar for example.

I can add is_epoch_format if you prefer but it will increase the
size of the ChainId structure which I feel like it could end up
hurting rather than helping.

Returning Result certainly makes sense though note that changing
mut self to self doesn’t actually matter for the caller. self
is moved into the method. What possibly would make most sense is
with_version(self, u64) -> Result<Self, Self> (or changing the
semantics such that it adds version if not already in epoch format).

self.version == 0 implies that the id is not in epoch format so
checking it is just an optimisation¹.

Ah right, I didn't realize "foo-0" is not considered epoch format. So basically we had it wrong before, and this is correct.

I can add is_epoch_format if you prefer

Nope, this is no longer necessary (see previous comment).

What possibly would make most sense is with_version(self, u64) -> Result<Self, Self> (or changing the semantics such that it adds version if not already in epoch format).

The reason I still prefer my suggestion is that when not in epoch format, the concept of version in a chain ID doesn't exist. In practice, all chains now use the epoch format. I do not see a use case to "convert your chain ID to epoch format"; most likely, if someone called with_version() on a chain ID that is not in epoch format, it's a bug and they should fix it, as opposed to intended behavior.

Note that this whole "encode the version number in the chain id" evolved as a hack; it wasn't in the original IBC standard, and is pretty messy. Non-epoch format chain IDs are pretty much non-existent.

plafer · 2023-06-02T20:15:13Z

crates/ibc/src/core/ics24_host/identifier.rs

+impl<'a> From<&'a str> for ChainId {
+    fn from(value: &'a str) -> Self {
+        let version = Self::chain_version(value);
+        Self {
+            id: String::from(value),
+            version,
+        }
+    }
+}
+
+impl From<String> for ChainId {
+    fn from(value: String) -> Self {
+        let version = Self::chain_version(&value);
+        Self { id: value, version }
+    }


Panicking is not acceptable; then one could easily crash a chain by sending a ClientState object with a very long version number (here).

However, our previous solution of storing 0 in that case is also not that great. I suggest we simply implement TryFrom<String> and TryFrom<&'a str> instead, and bubble the errors up instead of hiding them. In tests, we can just unwrap(). Note that returning an error is consistent with ibc-go's implementation (where a panic is equivalent to failing to transaction).

Good catch though, I never noticed this behavior before.

So to clarify, the desired behaviour is:

foo → valid not in epoch format,

foo-bar → valid not in epoch format,

foo-0 → valid not in epoch format (?),

foo-1 → valid in epoch format and

foo-18446744073709551616 → invalid.

PS. Regarding panicking, I was under the impression that this code is
executed within the VM where panics result in transaction failure.

So to clarify, the desired behaviour is:

That's correct. I use this golang playground (extracted from ibc-go's implementation) to figure out which strings should be considered epoch format.

Note that foo-18446744073709551616 is considered valid given this regexp, but they later fail when trying to parse the version number into a u64.

PS. Regarding panicking, I was under the impression that this code is executed within the VM where panics result in transaction failure.

We make no such assumption. And as it turns out, most (if not all) of our users don't use a VM at all; everything is compiled directly in the binary. So a panic would make the whole node crash (unless of course they guard against it, but we don't assume they do).

What’s the output of ChainId::chain_version("foo-18446744073709551616")?

What’s the output of ChainId::from(tendermint::chain::Id("foo-18446744073709551616"))? That conversion is used for serde deserialisation.

What’s the output of ChainId::chain_version("foo-18446744073709551616")?

That should be an error; so the return type should be changed to Result<u64, IdentifierError>

What’s the output of ChainId::from(tendermint::chain::Id("foo-18446744073709551616"))?

We should implement TryFrom<tendermint::chain::Id> for ChainIdinstead, and return anIdentifierError`.

plafer · 2023-06-02T20:41:06Z

crates/ibc/src/core/ics24_host/identifier.rs

+            }
+            let (car, cdr) = version.as_bytes().split_first()?;
+            if b'1' <= *car && *car <= b'9' && cdr.iter().all(u8::is_ascii_digit) {
+                Some((name, version.parse().unwrap()))


As mentioned in other comments, we'd need to return an IdentifierError when the version parsing here fails

plafer · 2023-06-02T20:43:15Z

crates/ibc/src/core/ics24_host/identifier.rs

+    /// **Panics** if chain id is in epoch format but the version overflows
+    /// `u64`.


Reminder to remove this after removing the panicking behavior

ChainId::new allocates a new String so there’s no point in passing ownership of name argument. The constructor can accept it as a str slice without loss of functionality. This reduces allocations when creating ids since caller no longer has to have an owned string. Replace From implementations with TryFrom implementations. Conversion now fails if string is not in an epoch format. There’s still a way to construct an identifier without epoch format if it comes from tendermint::chain::Id (or by using ChainId::new with zero version; behaviour of that might change). And since there’s From implementation available, also get rid of ChainId::from_string. Optimise epoch format parsing. from_string used to check if argument was in epoch format and then call chain_version which did it again. To avoid duplicating work, introduce split_chain_id method which does the parsing and returns result which includes all the information callers might want. Similarly, calling split and collecting values into a vector is wasteful if all we need is the last token. Furthermore, regexes are quite a heavy machinery for the task of checking if string ends with a number. To address that, use split_last and parse the number to check if it’s valid.

plafer · 2023-06-21T13:35:35Z

Is #721 meant to replace this PR?

mina86 · 2023-06-21T13:58:57Z

Is #721 meant to replace this PR?

Not quite. Since this PR is growing I’ve moved some of the changes to separate PR.

Farhad-Shabani · 2023-07-28T18:44:42Z

Given that #762 is being merged, this PR gets outdated. Feel free to open an issue, if still any of the above has not been addressed.

mina86 force-pushed the a branch from f601c67 to 4a95e8e Compare June 1, 2023 02:37

DaviRain-Su reviewed Jun 1, 2023

View reviewed changes

crates/ibc/src/core/ics24_host/identifier.rs Show resolved Hide resolved

crates/ibc/src/core/ics24_host/identifier.rs Outdated Show resolved Hide resolved

mina86 force-pushed the a branch from 4a95e8e to 2e5acbb Compare June 2, 2023 10:34

plafer suggested changes Jun 2, 2023

View reviewed changes

mina86 force-pushed the a branch 2 times, most recently from b724ea9 to d8ce548 Compare June 10, 2023 13:15

mina86 changed the title ~~Improve ChainId implementation~~ Improve ChainId implementation; this is API breaking change Jun 10, 2023

mina86 force-pushed the a branch from d8ce548 to 983fb95 Compare June 10, 2023 16:35

Farhad-Shabani mentioned this pull request Jul 13, 2023

[ICS24] Enhancements and fixes for ChainId handling and validation #761

Closed

mina86 mentioned this pull request Jul 13, 2023

Optimise ChainId epoch format parsing #725

Closed

7 tasks

Farhad-Shabani closed this Jul 28, 2023

mina86 deleted the a branch October 31, 2023 06:06

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improve ChainId implementation; this is API breaking change #698

Improve ChainId implementation; this is API breaking change #698

mina86 commented Jun 1, 2023 •

edited

Loading

plafer left a comment

plafer Jun 2, 2023

mina86 Jun 2, 2023

plafer Jun 5, 2023

plafer Jun 2, 2023

mina86 Jun 2, 2023

plafer Jun 5, 2023

mina86 Jun 10, 2023

plafer Jun 12, 2023

plafer Jun 2, 2023

plafer Jun 2, 2023

plafer commented Jun 21, 2023

mina86 commented Jun 21, 2023

Farhad-Shabani commented Jul 28, 2023

		/// Panics if chain id is in epoch format but the version overflows
		/// `u64`.

Improve ChainId implementation; this is API breaking change #698

Improve ChainId implementation; this is API breaking change #698

Conversation

mina86 commented Jun 1, 2023 • edited Loading

PR author checklist:

Reviewer checklist:

plafer left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

plafer commented Jun 21, 2023

mina86 commented Jun 21, 2023

Farhad-Shabani commented Jul 28, 2023

mina86 commented Jun 1, 2023 •

edited

Loading