IBC upgrade plan summary #445
Comments
I like the categorization of the different types of upgrades. A quick sketch of an idea for solving 2 and 3 in IBC 1.0. (Note: this is just a rough idea; it could be unsafe or incorrect.)
Switching the light client algorithm in place would mean that the unprocessed packets that relied on the old algorithm can no longer be processed, unless the client implements some switching logic to internally swap light client algorithms based on height (this seems ugly).
Proposal Sketch:
Client changes:
Connection changes:
The Connection package can then introduce message type(s) for appending a client to this list. There probably needs to be a way to show and handle misbehaviour if the validator sets of a previous client height-range are signing different headers for a height-range on the latest client (i.e. we may need to ensure that forking still freezes clients even if the misbehaviour is only detectable across the different light clients).
There are no channel changes necessary. This seems like it should work for both 2 and 3 (and 1 if a cleaner implementation does not exist). It also allows previous clients to coexist with later clients.
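To make the "list of clients on a connection" idea concrete, here is a minimal sketch of what it might look like. Everything in it is an assumption for illustration: the `ClientEntry` and `Connection` types, the `StartHeight` field, and both method names are hypothetical and do not come from the proposal or from any existing IBC implementation.

```go
package main

import (
	"fmt"
	"sort"
)

// ClientEntry is a hypothetical record pairing a client with the first
// height at which it becomes the active client for the connection.
type ClientEntry struct {
	ClientID    string
	StartHeight uint64
}

// Connection is a hypothetical, stripped-down connection that keeps its
// clients ordered by ascending StartHeight.
type Connection struct {
	Clients []ClientEntry
}

// AppendClient adds a newer client to the end of the list. It rejects a
// client whose start height does not come after the current latest one.
func (c *Connection) AppendClient(id string, start uint64) error {
	if n := len(c.Clients); n > 0 && start <= c.Clients[n-1].StartHeight {
		return fmt.Errorf("start height %d must exceed %d", start, c.Clients[n-1].StartHeight)
	}
	c.Clients = append(c.Clients, ClientEntry{ClientID: id, StartHeight: start})
	return nil
}

// ClientForHeight returns the client responsible for the given height,
// so packets committed before an upgrade can still use the old client.
func (c *Connection) ClientForHeight(h uint64) (string, bool) {
	// Find the first entry that starts strictly after h.
	i := sort.Search(len(c.Clients), func(i int) bool {
		return c.Clients[i].StartHeight > h
	})
	if i == 0 {
		return "", false // h precedes every known client
	}
	return c.Clients[i-1].ClientID, true
}

func main() {
	conn := &Connection{}
	if err := conn.AppendClient("old-client", 1); err != nil {
		panic(err)
	}
	if err := conn.AppendClient("new-client", 100); err != nil {
		panic(err)
	}
	id, ok := conn.ClientForHeight(50)
	fmt.Println(id, ok)
}
```

Keeping the switch at the connection level is what avoids the "ugly" in-client switching logic: each client keeps a single algorithm, and the lookup by height picks the right one.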
Another major question (ref cosmos/cosmos-sdk#6531) is upgrade security: will Tendermint accept evidence from the old chain within the unbonding period after an upgrade? If not, this potentially opens a large attack vector for double-signing to fool IBC light clients without punishment, and we'd probably need to freeze light clients an unbonding period's worth of time before an upgrade.
Why would this be the case? New proofs might have to be created, but the packets should still be in the state (we can assume that the application state is persisted through the upgrade).
Ahh right this is correct. Still seems like the above proposal might make sense to implement for upgrade security. From what I understand there are two types of misbehaviour we might want to avoid:
As mentioned here:
There may be a case where the chain forks right before the upgrade, but in a way that is detectable only after the upgrade occurs. If so, updating the light client in place may make detecting and punishing this misbehaviour impossible (especially if the update is light-client breaking), since evidence detection and handling depends on things like the light client algorithm and header format. However, consider the proposal above where the old client still exists, but is simply supplanted by a newer client in the list of This effectively allows evidence to be processed by the IBC client and all subsequent heights to be frozen and unprocessable, even across light-client breaking upgrades, without any updates to Tendermint's evidence handling logic to handle this case. Note that we also do not need to freeze the light client for three weeks waiting for the old light client's unbonding period to expire; this avoids a potentially major UX pain in more popular connections. Of course, if all full nodes upgrade to the new chain before the evidence is caught, then the evidence will never get submitted. So a safer way to upgrade might be to leave some full nodes on the old chain for the duration of the unbonding period. These nodes will not receive any new blocks, but their evidence reactors should still be capable of receiving and gossiping evidence. From my understanding, a relayer can then pick up this evidence and relay it to connecting chains.
Not sure if this is a situation we are concerned about. Suppose there is a fork at the upgrade height, such that the validator set runs the new chain and continues operating the old chain past the upgrade height. It's unclear to me if any light client is fooled in this situation, since the light clients should know the upgrade height and new parameters ahead of time (since supported upgrades are pre-planned). However, it may still be a misbehaviour we want to punish. In this case, a full node tracking the old chain post-upgrade (a strategy repeated from the previous case) can still submit a header for the old chain to the older client. Since the old client has an The proposal described here #445 (comment) should handle and punish these forks correctly. The only necessary addition to the proposal described is the following fields on ClientState:
type ClientState struct {
    // old fields
    ...
    PrevClient string // empty string if this is first client of chain
    NextClient string // empty string if this is latest client of chain
}
This effectively creates a doubly-linked list of clients in the order in which they are active. This is useful to establish the ordering even at the client level (rather than just at the connection), and it helps when the next client needs to be frozen based on a misbehaviour caught on the previous client.
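As a rough illustration of how the PrevClient/NextClient links could be used, the sketch below freezes a client and every later client once misbehaviour is caught on an earlier one. The `ID` and `Frozen` fields, the `ClientStore` map, and the `AppendClient` and `FreezeFrom` methods are hypothetical names introduced only for this example; they are not part of the proposal or of the Cosmos SDK.

```go
package main

import "fmt"

// ClientState carries the two proposed link fields. ID and Frozen are
// hypothetical additions so the example is self-contained.
type ClientState struct {
	ID         string
	Frozen     bool
	PrevClient string // empty string if this is first client of chain
	NextClient string // empty string if this is latest client of chain
}

// ClientStore is a hypothetical in-memory lookup of clients by identifier.
type ClientStore map[string]*ClientState

// AppendClient links newID after latestID. Pass an empty latestID to
// create the first client of a chain.
func (s ClientStore) AppendClient(latestID, newID string) error {
	if _, exists := s[newID]; exists {
		return fmt.Errorf("client %q already exists", newID)
	}
	next := &ClientState{ID: newID}
	if latestID != "" {
		latest, ok := s[latestID]
		if !ok {
			return fmt.Errorf("client %q not found", latestID)
		}
		if latest.NextClient != "" {
			return fmt.Errorf("client %q is not the latest client", latestID)
		}
		latest.NextClient = newID
		next.PrevClient = latestID
	}
	s[newID] = next
	return nil
}

// FreezeFrom freezes the given client and walks the NextClient links so
// that every later client is frozen as well.
func (s ClientStore) FreezeFrom(id string) error {
	for cur := id; cur != ""; {
		c, ok := s[cur]
		if !ok {
			return fmt.Errorf("client %q not found", cur)
		}
		c.Frozen = true
		cur = c.NextClient
	}
	return nil
}

func main() {
	store := ClientStore{}
	links := [][2]string{{"", "client-a"}, {"client-a", "client-b"}, {"client-b", "client-c"}}
	for _, l := range links {
		if err := store.AppendClient(l[0], l[1]); err != nil {
			panic(err)
		}
	}
	// Misbehaviour caught on client-b: it and client-c are frozen.
	if err := store.FreezeFrom("client-b"); err != nil {
		panic(err)
	}
	for _, id := range []string{"client-a", "client-b", "client-c"} {
		fmt.Printf("%s frozen=%v\n", id, store[id].Frozen)
	}
}
```

Whether earlier clients should stay usable after a later one is frozen is left open here; this sketch only propagates the freeze forward.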
I think what you're proposing makes sense for IBC light clients, but we still need Tendermint to be able to process evidence within the unbonding period in order to slash validators, which IBC light clients cannot do. If this doesn't happen, validators have no disincentive against signing forks.
I think whether or not this needs to count as misbehaviour to retain the same security assumptions depends on how far in advance the upgrade is known about. If it is known about at least an unbonding period in advance, the light client can definitely reject any heights in the original epoch beyond the upgrade height. If it is not known at least an unbonding period in advance, such signatures would need to be treated as misbehaviour and slashed for.
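The rule in the previous comment can be written down as a small decision function. This is only a sketch of that one rule: the `Verdict` type, its constant names, the `Day` constant, and `ClassifyOldEpochHeader` are hypothetical, and real client logic would also need signature and validator-set checks that are omitted here.

```go
package main

import (
	"fmt"
	"time"
)

// Day is a convenience constant for expressing notice and unbonding periods.
const Day = 24 * time.Hour

// Verdict is a hypothetical classification of a header from the old epoch.
type Verdict int

const (
	NormalVerification Verdict = iota // at or below the upgrade height
	Reject                            // safely ignorable, no slashing needed
	Misbehaviour                      // must be treated as evidence
)

// ClassifyOldEpochHeader applies the rule: headers beyond the upgrade
// height can simply be rejected when the upgrade was announced at least an
// unbonding period in advance; otherwise they count as misbehaviour.
func ClassifyOldEpochHeader(headerHeight, upgradeHeight int64, notice, unbonding time.Duration) Verdict {
	if headerHeight <= upgradeHeight {
		return NormalVerification
	}
	if notice >= unbonding {
		return Reject
	}
	return Misbehaviour
}

func main() {
	// Upgrade at height 100, 21-day unbonding period.
	fmt.Println(ClassifyOldEpochHeader(101, 100, 30*Day, 21*Day) == Reject)
	fmt.Println(ClassifyOldEpochHeader(101, 100, 7*Day, 21*Day) == Misbehaviour)
}
```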
Correct, what I'm proposing is only a way to freeze light clients in the case of this behaviour, which would be necessary for IBC. It leaves the problem of slashing individual validators as an open problem that is more Tendermint's concern.
Per discussion with the Tendermint team, the state of affairs:
Assuming state machine upgrades always use the upgrade module, this leaves us with Tendermint-breaking upgrades as the remaining issue. Our plan is as follows: either Tendermint Core, in the first post-0.34 release which breaks past data structures (and thus requires a zero-height upgrade), will support processing past evidence (rendering this upgrade path safe), or in that exceptional upgrade case, we will freeze all IBC channels an unbonding period prior to the upgrade (which must be known about at least an unbonding period in advance), which would be unfortunate but not the end of the world. In terms of immediate IBC work, then, we still need to support this upgrade path (since it might be exercised in the future if Tendermint supports old evidence processing). We also need to implement the "freeze for an unbonding period" logic, which we need to think about a bit more carefully to ensure that it is safe: it's not just that validators could sign fake headers to prevent packets from being sent which were actually sent, they could also sign fake headers to send packets which weren't actually sent. I think it needs to work something like:
We should add the light client freeze operation to the spec.
I suppose we should also include the chain ID epoch extraction scheme in the spec, although it's a bit of a temporary hack.
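The thread does not spell out the extraction scheme, so the following is only a guess at its general shape. It assumes, purely for illustration, that the epoch is a decimal number after the last hyphen of the chain ID (for example `mychain-3`); the `ExtractEpoch` function is hypothetical and is not the Cosmos SDK's implementation.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// ExtractEpoch returns the numeric suffix after the last "-" in chainID.
// ASSUMPTION: the "<name>-<epoch>" layout is illustrative only; the
// thread does not define the actual format.
func ExtractEpoch(chainID string) (uint64, bool) {
	i := strings.LastIndex(chainID, "-")
	if i <= 0 || i == len(chainID)-1 {
		return 0, false // no hyphen, empty name, or empty suffix
	}
	epoch, err := strconv.ParseUint(chainID[i+1:], 10, 64)
	if err != nil {
		return 0, false // suffix is not an unsigned decimal number
	}
	return epoch, true
}

func main() {
	for _, id := range []string{"mychain-3", "my-chain-12", "mychain"} {
		epoch, ok := ExtractEpoch(id)
		fmt.Printf("%q epoch=%d ok=%v\n", id, epoch, ok)
	}
}
```

Returning a boolean rather than silently defaulting lets the caller decide how to treat chain IDs that carry no epoch.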
Just to confirm: eventually this will not live in the chain-id, and will just be data the validators sign over and client txs don't?
In principle, it should be; not sure what the Tendermint team's plans are.
Implemented in the Cosmos SDK. Future updates should be made to simplify this logic for Tendermint light clients, but those are implementation choices.
Hi, are there any plans to support emergency upgrade (3)? Thanks
This is a high-level summary of my current thoughts. If everyone agrees on executing this for 1.0, I'll update the SDK-side ADR with particular data structure information etc.
This document addresses the procedures for maintaining IBC functionality with minimal disruption to applications when upgrading a ledger which utilises IBC to send & receive messages from other ledgers. Our primary design goal is to minimise or eliminate disruptions to, or manual interventions required by, applications running on top of the ledger which use IBC to communicate with other applications on other ledgers. For now, we address only the case of a single ledger being upgraded. More complex multi-ledger upgrades should be reducible to sequenced instances of this process.
Basic assumptions:
Upgrades can be classified into three categories: