Recover from missed RPC events after WebSocket subscription is closed by Tendermint #1205
Finished reviewing, I'm still testing though.
Ran different tests and it works well. This is a great improvement, great work Romain!
I only had one more minor comment regarding the logs. Besides that, please also remember:
- changelog
- revert one-chain temporary changes, revert Cargo.toml patch
Thanks :)
Just pushed a commit to improve the logs
It was already updated :) https://github.com/informalsystems/ibc-rs/pull/1205/files#diff-06572a96a58dc510037d5efa622f9bec8519bc1beab13c9f251e97e657a9d4edR29
Done
Will do once tendermint-rs 0.21 is out and after we do the update in a standalone PR.
… by Tendermint (informalsystems#1205)

After some investigation, the culprit for informalsystems#1196 seems to be that Tendermint is closing the WebSocket connection over which we listen for IBC events whenever more than 100 txs are included in a single block [0], as we are not able to pull the events fast enough over the WebSocket connection to avoid completely filling the event buffer in Tendermint (which currently has a hard-coded capacity of 100 events, hence the issue).

We never noticed this previously since this problem only appears in practice with a high-enough commit/propose timeout (to allow enough txs to be included in a single block), and we were testing with a lower value for the timeouts.

Now that we landed some changes in tendermint-rs [1] which allow us to notice the connection being closed, this PR makes use of this to resubscribe to the events and trigger a packet clear whenever we notice the connection being closed under our feet.

[0] tendermint/tendermint#6729
[1] informalsystems/tendermint-rs#929

---

* Propagate JSON-RPC errors through the Rust subscription
* Use tendermint-rs branch with both fixes
* Fix compilation issue in tests
* Clear pending packets when event subscription is cancelled
* Temp: Update one-chain script to use 10s commit timeout
* Use tendermint-rs master
* Update Cargo.lock
* Update changelog
* Update lockfile
* Increase delay before checking for relaying result in e2e tests
* Add comment explaining who the RPC error is propagated to
* Improve event monitor logs
* Reset `timeout_commit` and `timeout_propose` to 1s
Closes: #1196
Depends on:
tendermint::block::Size
tendermint-rs#931
Description
After some investigation, the culprit for #1196 seems to be that Tendermint closes the WebSocket connection over which we listen for IBC events whenever more than 100 txs are included in a single block [0]. We are not able to pull the events over the WebSocket connection fast enough to keep Tendermint's event buffer from filling up completely (it currently has a hard-coded capacity of 100 events, hence the issue).
We never noticed this previously since this problem only appears in practice with a high-enough commit/propose timeout (to allow enough txs to be included in a single block), and we were testing with a lower value for the timeouts.
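To make the failure mode above concrete, here is a rough model of a fixed-capacity event buffer being filled by one block's worth of tx events. This is only a sketch under stated assumptions: `publish_block`, `Outcome`, and the `drained_during_burst` parameter are hypothetical names invented for illustration, and the model is a simplification, not Tendermint's actual implementation. The only figure taken from the description is the capacity of 100 events.

```rust
use std::collections::VecDeque;

/// Capacity of the event buffer, as stated in the description (100 events).
const EVENT_BUFFER_CAPACITY: usize = 100;

/// Hypothetical outcome of publishing one block's tx events to a subscriber.
#[derive(Debug, PartialEq)]
enum Outcome {
    /// Every event fit in the buffer, so the subscription stays open.
    Delivered,
    /// The buffer was full and could not be drained in time,
    /// so the node closes the subscription.
    SubscriptionClosed { events_buffered: usize },
}

/// Simplified model (an assumption, not Tendermint's code): the node pushes
/// `txs_in_block` events in one burst, while the client manages to pull at
/// most `drained_during_burst` events off the buffer during that burst.
fn publish_block(txs_in_block: usize, drained_during_burst: usize) -> Outcome {
    let mut buffer: VecDeque<usize> = VecDeque::with_capacity(EVENT_BUFFER_CAPACITY);
    let mut drain_budget = drained_during_burst;

    for event in 0..txs_in_block {
        if buffer.len() == EVENT_BUFFER_CAPACITY {
            if drain_budget == 0 {
                // No room left and the client is not keeping up.
                return Outcome::SubscriptionClosed {
                    events_buffered: buffer.len(),
                };
            }
            // The client pulled one event, freeing a slot.
            buffer.pop_front();
            drain_budget -= 1;
        }
        buffer.push_back(event);
    }

    Outcome::Delivered
}
```

In this model a block with exactly 100 txs is delivered even if the client drains nothing, while a block with 101 txs closes the subscription unless the client pulls at least one event during the burst. That matches why short timeouts (small blocks) hid the problem.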
Now that we landed some changes in tendermint-rs [1] which allow us to notice the connection being closed, this PR makes use of this to resubscribe to the events and trigger a packet clear whenever we notice the connection being closed under our feet.
[0] tendermint/tendermint#6729
[1] informalsystems/tendermint-rs#929
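The recovery strategy described above (resubscribe, then trigger a packet clear) can be sketched with plain standard-library channels standing in for the WebSocket subscription. Everything here is a hypothetical illustration: `run_monitor`, `Action`, and `SubscriptionCancelled` are invented names, not the types used in this PR or in tendermint-rs, and resubscribing before clearing is an assumed ordering.

```rust
use std::sync::mpsc::Receiver;

/// Hypothetical error item meaning "the node cancelled our subscription".
#[derive(Debug)]
struct SubscriptionCancelled;

/// Hypothetical record of what the monitor decided to do, in order.
#[derive(Debug, PartialEq)]
enum Action {
    /// Relay the events received for the block at this height.
    Relay { height: u64 },
    /// A fresh subscription was opened after a cancellation.
    Resubscribed,
    /// Clear pending packets, since events may have been missed.
    ClearPendingPackets,
}

/// Runs the monitor loop until the current subscription ends without a
/// cancellation error, or until `subscribe` cannot open a new subscription.
fn run_monitor<F>(mut subscribe: F) -> Vec<Action>
where
    F: FnMut() -> Option<Receiver<Result<u64, SubscriptionCancelled>>>,
{
    let mut actions = Vec::new();
    let mut rx = match subscribe() {
        Some(rx) => rx,
        None => return actions,
    };

    loop {
        let item = rx.recv();
        match item {
            // Normal case: an event batch for some height.
            Ok(Ok(height)) => actions.push(Action::Relay { height }),
            // The error was propagated through the subscription:
            // open a new one, then clear packets we may have missed.
            Ok(Err(SubscriptionCancelled)) => match subscribe() {
                Some(new_rx) => {
                    rx = new_rx;
                    actions.push(Action::Resubscribed);
                    actions.push(Action::ClearPendingPackets);
                }
                None => break,
            },
            // The sending side is gone: the monitor shuts down.
            Err(_) => break,
        }
    }

    actions
}
```

The key point the sketch tries to capture is that the cancellation must arrive as an item on the subscription itself. Without the error being propagated, the monitor would simply stop receiving events and never know that it needs to recover.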