query client status CLI fails with high number of consensus states #3814

Closed
2 of 8 tasks
ancazamfir opened this issue Jan 25, 2024 · 6 comments
@ancazamfir
Collaborator

ancazamfir commented Jan 25, 2024

Summary of Bug

While debugging an IBC client on osmosis, I noticed that query client status fails (I added some debug output):

$ hermes query client status --chain osmosis-1 --client 07-tendermint-2007
[crates/relayer-cli/src/commands/query/client.rs:335] consensus_state_heights.len() = 881
[crates/relayer-cli/src/commands/query/client.rs:336] client_state.latest_height() = Height {
    revision: 0,
    height: 12502781,
}
[crates/relayer-cli/src/commands/query/client.rs:337] latest_consensus_height = Height {
    revision: 0,
    height: 11944260,
}
ERROR error decoding protobuf: error converting message type into domain type: the client consensus state was not found

As seen above, client_state.latest_height() is higher than the highest consensus state height returned by query_consensus_state_heights().

It looks like the query for all consensus states does not return the latest states and, in this particular case, the highest height returned is for a consensus state that has been pruned, hence the error. We need to debug this; it may be an issue with the pagination.

But in particular for this CLI we don't need to get all consensus states but only the one at client_state.latest_height().

So we need to:

  • fix the pagination for query_consensus_state_heights() (i.e. get N at a time and assemble the full list). Update: this was not the issue; rather:
  • the health check does not check the gRPC syncing status, which is a problem if the gRPC and RPC nodes are different
  • change the query client status CLI to use client_state.latest_height() to retrieve the latest consensus state
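The last item can be sketched as follows. This is a minimal, self-contained illustration, not the Hermes implementation: `Height`, `ClientView`, `StatusError` and `latest_consensus_state` are hypothetical stand-ins, and the in-memory map stands in for whatever the queried node reports. The point is that a single keyed lookup at the client's latest height avoids listing every consensus state height and taking the maximum.

```rust
use std::collections::BTreeMap;

/// Simplified stand-in for an IBC height (hypothetical, not the Hermes type).
/// Field order matters: the derived ordering compares revision first, then height.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub struct Height {
    pub revision: u64,
    pub height: u64,
}

/// Toy view of what one node reports for a client (hypothetical).
pub struct ClientView {
    /// Value of the client state's latest height.
    pub latest_height: Height,
    /// Consensus states the node still has, keyed by height.
    pub consensus_states: BTreeMap<Height, String>,
}

#[derive(Debug, PartialEq, Eq)]
pub enum StatusError {
    ConsensusStateNotFound(Height),
}

/// Fetch only the consensus state at the client's latest height,
/// rather than listing all heights and picking the highest one.
pub fn latest_consensus_state(view: &ClientView) -> Result<&String, StatusError> {
    view.consensus_states
        .get(&view.latest_height)
        .ok_or(StatusError::ConsensusStateNotFound(view.latest_height))
}
```

With this shape, a node that has no state at the latest height produces an error naming that height, which is easier to diagnose than a failure on an unrelated, pruned height.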

Version

all

Steps to Reproduce

Acceptance Criteria

query client status should succeed with high number of consensus states.


For Admin Use

  • Not duplicate issue
  • Appropriate labels applied
  • Appropriate milestone (priority) applied
  • Appropriate contributors tagged
  • Contributor assigned/self-assigned
@ancazamfir
Collaborator Author

> It looks like the query for all consensus states does not return all states and, for this particular case, the highest returned is for a consensus state that has been pruned...

This doesn't look like a pagination issue but I don't understand what is happening. When querying all consensus heights:

  • hermes gets 881 of them in the 10562226 - 11944260 range
  • simd gets 2947 in the 12218515 - 12502781 range

Then when hermes tries to query a single consensus state in the range it got, the state is not found:

$ hermes query client consensus --chain osmosis-1 --client 07-tendermint-2007 --consensus-height 11944260
ERROR error decoding protobuf: error converting message type into domain type: the client consensus state was not found

When hermes queries a consensus state from the range returned by simd, the state is found:

$ hermes query client consensus --chain osmosis-1 --client 07-tendermint-2007 --consensus-height 12218515
SUCCESS Tendermint(
    ConsensusState {
        timestamp: Time(
            2023-12-27 13:03:31.513030309,
        ),
        root: CommitmentRoot(
            "B5CF9EB99E55A27C2B680B02805EEC7519D33C9B2BDE505316FFDB3646CFF32B",
        ),
        next_validators_hash: Hash::Sha256(406F67F8FE04AD55291ED7DBD5F410C2B97C5E7B980F3D7826AD99E5EBE98F08),
    },
)

I tried with other clients on osmosis and the issue is the same.
Same results if I change query_consensus_state_heights() to use query_consensus_states().

I thought the states hermes gets might be old pruned states, but I tried to reproduce this on a local setup and was not able to (I may have to try at a higher scale).

@romac
Member

romac commented Jan 31, 2024

Do you still see the same issue if you apply this diff?

diff --git a/crates/relayer/src/chain/requests.rs b/crates/relayer/src/chain/requests.rs
index cc459081..68ed170c 100644
--- a/crates/relayer/src/chain/requests.rs
+++ b/crates/relayer/src/chain/requests.rs
@@ -117,6 +117,7 @@ impl PageRequest {
 
         PageRequest {
             limit: u32::MAX as u64,
+            reverse: true,
             ..Default::default()
         }
     }
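For readers unfamiliar with the `reverse` flag, the toy model below shows the effect the diff is presumably testing for. It is only a sketch under stated assumptions: `paginate` is a hypothetical helper, not the Cosmos SDK or Hermes code, and it assumes keys are stored in ascending order and that the number of results is capped by a limit. Under those assumptions, a forward query returns the lowest heights and a reverse query returns the highest ones first.

```rust
/// Toy model of a paginated query over keys stored in ascending order
/// (hypothetical helper, not the Cosmos SDK implementation).
/// Returns at most `limit` keys, walking from the end when `reverse` is set.
pub fn paginate(sorted_keys: &[u64], limit: usize, reverse: bool) -> Vec<u64> {
    if reverse {
        // Newest (highest) keys first.
        sorted_keys.iter().rev().take(limit).copied().collect()
    } else {
        // Oldest (lowest) keys first.
        sorted_keys.iter().take(limit).copied().collect()
    }
}
```

If the result set were being truncated, flipping `reverse` would change which end of the range comes back, so an unchanged result points away from pagination as the cause.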

@ancazamfir
Collaborator Author

> Do you still see the same issue if you apply this diff?

yes, it's the same

@ancazamfir
Collaborator Author

ancazamfir commented Jan 31, 2024

Oh, I think the gRPC endpoint belonged to a node that was not synced (the RPC endpoint was good, though). I changed it to a good node and it's fine. I wonder if there is a gRPC status query for the sync info. Will try to find out.

@ancazamfir
Collaborator Author

$ grpcurl -plaintext services.staketab.com:9010 cosmos.base.tendermint.v1beta1.Service.GetSyncing
{
  "syncing": true
}
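A health check that covers both endpoints could look roughly like the sketch below. It is an illustration only, not the change made in Hermes: `SyncInfo`, `Health` and `check_sync` are hypothetical names, and the two booleans are assumed to have been fetched already, one from the RPC node's status (its `catching_up` flag) and one from the gRPC `GetSyncing` response shown above.

```rust
/// Outcome of the sync check (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum Health {
    Healthy,
    Unhealthy(String),
}

/// Sync flags gathered from the two endpoints (hypothetical type).
pub struct SyncInfo {
    /// `catching_up` as reported by the RPC node's status.
    pub rpc_catching_up: bool,
    /// `syncing` as reported by the gRPC GetSyncing query.
    pub grpc_syncing: bool,
}

/// Report the chain endpoints as unhealthy if either node is still syncing.
/// The RPC and gRPC endpoints may be different nodes, so both are checked.
pub fn check_sync(info: &SyncInfo) -> Health {
    match (info.rpc_catching_up, info.grpc_syncing) {
        (false, false) => Health::Healthy,
        (true, false) => Health::Unhealthy("RPC node is still syncing".to_string()),
        (false, true) => Health::Unhealthy("gRPC node is still syncing".to_string()),
        (true, true) => Health::Unhealthy("RPC and gRPC nodes are still syncing".to_string()),
    }
}
```

For the setup described in this thread (good RPC node, syncing gRPC node), `check_sync` would return the gRPC-specific `Unhealthy` variant.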

@ancazamfir
Collaborator Author

Fixed by #3829 and #3833

This issue was closed.