Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactoring and improve update client semantics #1011

Closed
wants to merge 4 commits into from

Conversation

soareschen
Copy link
Contributor

@soareschen soareschen commented May 28, 2021

Closes: #997

Description


For contributor use:

  • Updated the Unreleased section of CHANGELOG.md with the issue.
  • If applicable: Unit tests written, added test to CI.
  • Linked to Github issue with discussion and accepted design OR link to spec that describes this work.
  • Updated relevant documentation (docs/) and code comments.
  • Re-reviewed Files changed in the Github PR explorer.

Copy link
Member

@adizere adizere left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like your proposal for the NonZeroHeight type, which could help us remove the clumsy if h == Height::zero() checks around the codebase. It would be good if you can briefly argue for the rationale behind this new type in the PR description.

I know it's still a draft, so you're probably fixing these problems already, but please mind:

modules/src/ics02_client/height.rs Outdated Show resolved Hide resolved
relayer/src/chain/cosmos.rs Outdated Show resolved Hide resolved
.trusted_height
.and_then(|h| NonZeroHeight::new(ibc::Height::new(src_chain_id, h)));

let target_height = match m_target_height {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These checks make the command quite intelligent, which is nice. But it might be better if the user is allowed to exercise the corner-cases of update client CLIENT-ID --target-height 0 for example, especially for tx raw. To do that, it could be a good idea to leave TxUpdateClientCmd untouched (modulo the nice refactor with execute) and create a new command e.g., UpdateClientCmd that is similar to TxUpdateClientCmd but which has more checks in place to cover corner-cases.

Having a new UpdateClientCmd command will also address (part of) #773 .

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added a NonZeroHeight::unsafe_new() function for creation of heights that are really meant to be zero. Btw note that previously even if zero height is allowed in the command line, it still really means the absence of height and the relayer will still fetch the current height.

I have now added the UpdateClientCmd and have the two behaviors slightly different. In UpdateClientCmd, only the target height can optionally be provided, and the command will wait for the target height to reach before updating. For TxUpdateClientCmd, both trusted height and target height must be provided, and there is no sanity checking in the command. So if invalid heights are provided, they will go into the implementation code and return error from there.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For TxUpdateClientCmd, both trusted height and target height must be provided, and there is no sanity checking in the command. So if invalid heights are provided, they will go into the implementation code and return error from there.

What is the rationale for this modification? It's probably better to keep this API unchanged and leave it simple so that we can invoke these commands with minimal effort.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just my suggestion, because otherwise I am not really sure what is the point of having two commands with separate implementations if their behavior are the same. Or do you have any recommendation of how the two commands should be differentiated?

relayer/src/foreign_client.rs Show resolved Hide resolved
@soareschen soareschen force-pushed the soares/update-client-refactoring branch from 86df2fa to 02c3500 Compare June 1, 2021 08:33
@soareschen soareschen marked this pull request as ready for review June 1, 2021 12:26
@soareschen
Copy link
Contributor Author

The PR is now ready for review. Here is a more reliable way to reproduce #997:

  • Modify the timeout_commit and timeout_propose fields in the one-chain script to a longer duration, such as 10s.
  • Recreate the chains using ./scripts/dev-env.
  • Run commands such as./target/debug/hermes create client ibc-0 ibc-1 ; ./target/debug/hermes update client ibc-0 07-tendermint-{N}, with {N} replaced with expected new client ID.
  • The command should fail on master, but pass on this PR.

@soareschen soareschen force-pushed the soares/update-client-refactoring branch from d15c806 to 4d4cb52 Compare June 1, 2021 13:19
Copy link
Member

@adizere adizere left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Confirm that the fix for #997 works, thanks Soares!

> ./target/debug/hermes create client ibc-0 ibc-1 ; ./target/debug/hermes update client ibc-0 07-tendermint-6
Success: CreateClient(
    CreateClient(
        Attributes {
            height: Height {
                revision: 0,
                height: 522,
            },
            client_id: ClientId(
                "07-tendermint-6",
            ),
            client_type: Tendermint,
            consensus_height: Height {
                revision: 1,
                height: 521,
            },
        },
    ),
)
Jun 02 12:46:29.257  INFO ibc_relayer_cli::commands: using default configuration from '/Users/adi/.hermes/config.toml'
Jun 02 12:46:29.291  WARN ibc_relayer::foreign_client: [ibc-1 -> ibc-0:07-tendermint-6] skipping update: trusted height (1-521) >= chain target height (1-521)
Jun 02 12:46:29.291  INFO ibc_relayer::foreign_client: Client 07-tendermint-6 is already up-to-date with chain ibc-1@1-521
Success: None

The PR is difficult to review, however. There's multiple changes (in particular around the ForeignClient::build* methods) and could not understand their impact fully. The difficulty is mostly due to the complex ways in which these methods can be called and used (not necessarily because of this PR). So I'm wondering if it is possible to break the PR in two parts: one that strictly (and minimally) fixes 997, and another part with your refactoring.

#[options(help = "the trusted height of the client update", short = "t")]
trusted_height: Option<u64>,
#[options(
required,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should stay unchanged, not clear why we need it to be required.

dst_client_id: ClientId,

#[options(help = "the target height of the client update", short = "h")]
target_height: Option<u64>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just FYI: we may need to update the guide (this section) with this change.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

definitely needs guide update. Also, if we want to change this as compared to tx raw we should discuss to understand why and how. At first look I would propose hermes client update <chain-id> <cliend-id> (i.e. drop the target_height)

// If not specified, set trusted state to the highest height smaller than target height.
// Otherwise ensure that a consensus state at trusted height exists on-chain.
let cs_heights = self.consensus_state_heights()?;
let trusted_height = if trusted_height == Height::zero() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we don't have this check in place anymore, this means that any caller of build_update_client_with_trusted has to make sure that the client has a consensus state at trusted_height installed. Do I understand correctly?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct. In pretty much all places, the trusted height is already obtained from the consensus state height, so the check is unnecessary. With the new update client command, I make it such that the trusted height is also automatically fetchd, so there is no need for checking. The only place left unchecked is in the tx raw update-client command, but since it is a raw command I think it is good to bypass checking so that we can test the behavior on the actual chain.

@@ -540,9 +546,7 @@ impl ForeignClient {
))
})?;

assert!(!events.is_empty());

Ok(events.pop().unwrap())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice refactor around this method.

dst_client_id: ClientId,

#[options(help = "the target height of the client update", short = "h")]
target_height: Option<u64>,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

definitely needs guide update. Also, if we want to change this as compared to tx raw we should discuss to understand why and how. At first look I would propose hermes client update <chain-id> <cliend-id> (i.e. drop the target_height)

@ancazamfir
Copy link
Collaborator

Running locally e2e/run.py i get:

2021-06-02 12:37:20 [   DEBUG] {"result":null,"status":"success"}
Traceback (most recent call last):
  File "/Users/ancaz/rust/ibc-rs/./e2e/run.py", line 229, in <module>
    main()
  File "/Users/ancaz/rust/ibc-rs/./e2e/run.py", line 213, in main
    ibc0_client_id, ibc0_conn_id, ibc0_chan_id, ibc1_client_id, ibc1_conn_id, ibc1_chan_id = raw(config, ibc0 , ibc1, port_id)
  File "/Users/ancaz/rust/ibc-rs/./e2e/run.py", line 130, in raw
    ibc0_client_id = client.create_update_query_client(c, ibc0, ibc1)
  File "/Users/ancaz/rust/ibc-rs/e2e/e2e/client.py", line 128, in create_update_query_client
    update_client(c, dst, client.client_id)
  File "/Users/ancaz/rust/ibc-rs/e2e/e2e/client.py", line 111, in update_client
    res = cmd.run(c).success()
  File "/Users/ancaz/rust/ibc-rs/e2e/e2e/cmd.py", line 34, in success
    data = self.cmd.process(result)
  File "/Users/ancaz/rust/ibc-rs/e2e/e2e/client.py", line 49, in process
    return from_dict(ClientUpdated, result['UpdateClient']['common'])
KeyError: 'UpdateClient'

probably because the return value is None and update client event cannot be decoded.

(one-chain script changed with 10 seconds timeouts)

@soareschen
Copy link
Contributor Author

probably because the return value is None and update client event cannot be decoded.

Hmm makes sense. If the trusted height is already at target height, then None is returned, so the e2e script will need to handle that. I will fix the e2e script then.

@soareschen
Copy link
Contributor Author

So I'm wondering if it is possible to break the PR in two parts: one that strictly (and minimally) fixes 997, and another part with your refactoring.

I suppose if we want to fix #997 naively, we can just replace the comparator operator in the original build_update_client_and_send from < to <=:

diff --git a/relayer/src/foreign_client.rs b/relayer/src/foreign_client.rs
index dad8993d..1385fe4b 100644
--- a/relayer/src/foreign_client.rs
+++ b/relayer/src/foreign_client.rs
@@ -439,7 +439,7 @@ impl ForeignClient {
             // Get highest height smaller than target height
             cs_heights
                 .into_iter()
-                .find(|h| h < &target_height)
+                .find(|h| h <= &target_height)
                 .ok_or_else(|| {
                     ForeignClientError::ClientUpdate(format!(
                         "chain {} is missing trusted state smaller than target height {}",

I agree that the original code is pretty difficult to understand. And at least without familiarizing myself with what the code actually does, I wouldn't be able to be certain that the fix really solves the issue. Right now with the correct understanding after the refactoring, I think the simple fix could work.

The refactored code should preserve the original semantics while keeping it simpler. But since the original code is so complex, it certainly requires some reading to make sure that it really doesn't break anything.

@soareschen
Copy link
Contributor Author

I suppose if we want to fix #997 naively, we can just replace the comparator operator in the original build_update_client_and_send from < to <=:

Hmm maybe it's not that straightforward. Somewhere upstream the original code would still convert empty result to error.

Jun 02 11:19:29.047  WARN ibc_relayer::foreign_client: [ibc-1 -> ibc-0:07-tendermint-3] skipping update: trusted height (1-14) >= chain target height (1-14)
Error: tx error: error raised while updating client: Client 07-tendermint-3 is already up-to-date with chain ibc-1@1-14

@romac
Copy link
Member

romac commented Jun 2, 2021

So I'm wondering if it is possible to break the PR in two parts: one that strictly (and minimally) fixes 997, and another part with your refactoring.

That would be great, and would perhaps allow us to merge the fix for 997 before the release and take some time to properly review the refactoring afterwards.

@soareschen
Copy link
Contributor Author

I have opened #1036 for the quick fix.

@adizere
Copy link
Member

adizere commented Jun 9, 2021

This work will be subsided by later work that will replace Height::zero with Option<Height>.

Heights can have different meanings:

  • some specific height
  • none specified, which can mean an error (because a specific height is necessary)
  • none specified, which can mean the latest one

@soareschen:

  • Create a discussion to capture this refactor, describe the rationale.

@adizere adizere closed this Jun 9, 2021
@romac romac deleted the soares/update-client-refactoring branch January 26, 2022 14:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Hermes client update fails if client was recently created
4 participants