-
Notifications
You must be signed in to change notification settings - Fork 399
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Upgradability fixes #967
Upgradability fixes #967
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
partial review, left off on verifyChannelUpgrade
Co-authored-by: colin axnér <25233464+colin-axner@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work! I left a bunch of nits/suggestions and some questions. Will continue to review in more detail tomorrow
Co-authored-by: Damian Nolan <damiannolan@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, @AdityaSripal. I left a bunch of comments, some of them nits or inconsistencies (I stopped adding more after I realised that the fine details still need to be worked out 😅).
function channelRestorePath(portIdentifier: Identifier, channelIdentifier: Identifier): Path { | ||
return "channelUpgrades/restore/ports/{portIdentifier}/channels/{channelIdentifier}" | ||
} | ||
function channelCounterpartyLastPacketSequencePath(portIdentifier: Identifier, channelIdentifier: ChannelIdentifier): Path { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we have the ChannelIdentifier
type anywhere in the spec, so maybe we keep Identifier
? I would also use the word latest
over last
.
function channelCounterpartyLastPacketSequencePath(portIdentifier: Identifier, channelIdentifier: ChannelIdentifier): Path { | |
function channelCounterpartyLatestPacketSequencePath(portIdentifier: Identifier, channelIdentifier: Identifier): Path { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer Last actually since it denotes that there will be no packet sent after this one until we reopen. Rather than just a snapshot of the latest packet
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just noting in ibc-go the current usage is latest
. I'm happy with whatever but we should probably be consistent
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
previous discussion: cosmos/ibc-go#3451 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I now also prefer last
here after thinking some more about it.
provableStore.set(channelPath(portIdentifier, channelIdentifier), channel) | ||
initUpgradeChannelHandshake(portIdentifier, channelIdentifier, upgradeFields, counterpartyUpgrade.timeout) | ||
} else if currentChannel.state == INITUPGRADE { | ||
existingUpgrade = publicStore.get(channelUpgradePath) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we expect the upgrade sequence to be fastforwarded in this case as well if the counterparty upgrade sequence is higher in crossing hellos?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No in crossing hellos, we will just error if the sequences don't match
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cc: @colin-axner
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The upgrade sequence timing is somewhat tricky to wrap ones head around. There is a world in switch the current channel is on upgrade sequence 2 and the counterparty is on upgrade sequence 3 and moving the current channel sequence to 3 would result in a successful upgrade. Erroring on non-matching upgrade sequences would just result in a do over. But, at the same time, if this is a crossing hellos case, it is possible the counterparty errors after seeing we are in upgrade sequence 2, as they cannot fast forward our sequence. They would either need to wait for a timeout (same result as erroring early) or hope that we fast forward and then they can do an ACK
Given that fast forwarding is an optimization in this situation, I think it makes sense to leave out for now and maybe open an issue to add it later. In practice we may find it to be very useful or never used
Regarding ordering and now being able to freely change regardless of "strictness", I think we need to still add the pseudo code for verifying the ordering is supported by the underlying connection. see cosmos/ibc-go#3635 (comment) |
} | ||
``` | ||
|
||
```typescript | ||
function chanUpgradeConfirm( | ||
function chanUpgradeOpen( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there good reason to stray from the Confirm
naming convention? I find it a bit more natural, as open is kind of close to Init
in some ways
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess since the channel state is changing to OPEN
if the in flight packets have been flushed, it might be confusing to have the function called Confirm
when the end state will actually OPEN
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because this function is called on both ChannelEnds, it strays from the semantics of OnChanOpenConfirm which only gets called on the initiating chain.
This function effectively removes the "block" after flushing is complete on both ends. So I renamed it to avoid confusing the two similar but different datagrams since it isn't a perfect analogue of ChanOpenConfirm
@colin-axner and I were chatting this morning and I suggested the possibility of enforcing both parties to perform the Are there any downsides to doing to this? And what would they be? cc. @AdityaSripal |
Let's discuss the pros and cons of this synchronously |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wahoo! Excellent work! 🎉
Finally made it through all the changes! Leaving my approval as I think most everything can be addressed in followups. My main outstanding points of discussion:
- timeouts needing to be equal?
- enforcing init on both chains
- bool vs enum for flushing status
One side note, the upgrade sequence is somewhat of a tricky field since it changes often and it needs to be consistent for proofs. I think it'd be very valuable to add a section discussing when it changes, when it is fast forwarded and after what state it must not be modified, maybe also scenarios when sequences go out of sync. There's also a subtle check in canceling upgrades that the error receipt is >= current sequence. If it was just equality the handshake would break down, so I think that's also worth a direct mention outside of the pseudo-code
} | ||
``` | ||
|
||
- The chain that is proposing the upgrade should set the channel state from `OPEN` to `INITUPGRADE` | ||
- The counterparty chain that accepts the upgrade should set the channel state from `OPEN` to `TRYUPGRADE` | ||
- Once the initiating chain verifies the counterparty is in `TRYUPGRADE`, it must move to `ACKUPGRADE` unless all in-flight packets are already flushed on both ends, in which case it must move directly to `OPEN`. | ||
- The `TRYUPGRADE` chain must prove the counterparty is in `ACKUPGRADE` or completed the upgrade in `OPEN` AND have no in-flight packets on **both ends** before it can complete the upgrade and move to `OPEN`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it'd be useful to have some diagrams for these conditions (someone other than @AdityaSripal could do this, ie we can open an issue)
// it will verify the new upgrade field parameters, and make the relevant state changes for initializing a new upgrade: | ||
// - moving channel state to INITUPGRADE | ||
// - incrementing upgrade sequence | ||
function initUpgradeHandshake( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
one nice benefit of forcing crossing upgrade INIT's is that a app developer knows that OnUpgradeInit
will always be called on its end. It doesn't need to handle a crossing initialization in OnUpgradeTry
provableStore.set(channelPath(portIdentifier, channelIdentifier), channel) | ||
initUpgradeChannelHandshake(portIdentifier, channelIdentifier, upgradeFields, counterpartyUpgrade.timeout) | ||
} else if currentChannel.state == INITUPGRADE { | ||
existingUpgrade = publicStore.get(channelUpgradePath) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The upgrade sequence timing is somewhat tricky to wrap ones head around. There is a world in switch the current channel is on upgrade sequence 2 and the counterparty is on upgrade sequence 3 and moving the current channel sequence to 3 would result in a successful upgrade. Erroring on non-matching upgrade sequences would just result in a do over. But, at the same time, if this is a crossing hellos case, it is possible the counterparty errors after seeing we are in upgrade sequence 2, as they cannot fast forward our sequence. They would either need to wait for a timeout (same result as erroring early) or hope that we fast forward and then they can do an ACK
Given that fast forwarding is an optimization in this situation, I think it makes sense to leave out for now and maybe open an issue to add it later. In practice we may find it to be very useful or never used
sequence: counterpartyUpgradeSequence, | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be good to explicitly set the FlushStatus
as well
counterpartyChannelIdentifier: channelIdentifier, | ||
connectionHops: counterpartyHops, | ||
version: currentChannel.version, | ||
sequence: channel.sequence, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess the idea here is we trust that the counterparty would have canceled the upgrade in the TRY handler if our channel.UpgradeSequence != their channel.UpgradeSequence
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what does channel refer to here?
counterpartyChannelIdentifier: channelIdentifier, | ||
connectionHops: upgrade.fields.connectionHops, | ||
version: upgrade.fields.version, | ||
sequence: currentChannel.sequence, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be worth noting that implementations must make proof of the open state available. That is, it must not allow a successful switch to the open state followed by an immediate INIT of a new upgrade within the same provable state transition
// delete unnecessary state | ||
originalChannel = privateStore.get(channelRestorePath(portIdentifier, channelIdentifier)) | ||
currentChannel.state = OPEN | ||
currentChannel.flushStatus = NOTINFLUSH | ||
provableStore.set(channelPath(portIdentifier, channelIdentifier), originalChannel) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we fast forward our upgrade sequence upon canceling an upgrade if the error receipt is > our upgrade sequence?
if counterpartyUpgradeSequence > channel.upgradeSequence { | ||
channel.upgradeSequence = counterpartyUpgradeSequence | ||
channel.upgradeSequence = counterpartyUpgradeSequence - 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should be currentChannel
here and on line 522, right?
counterpartyFlushStatus: FlushStatus, | ||
counterpartyUpgrade: Upgrade, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there any benefit to passing the FlushStatus separately here?
Co-authored-by: colin axnér <25233464+colin-axner@users.noreply.github.com>
currentChannel = provableStore.get(channelPath(portIdentifier, channelIdentifier)) | ||
currentChannel.version = version | ||
provableStore.set(channelPath(portIdentifier, channelIdentifier), currentChannel) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe you should setting the upgrade, not the channel
Refactor according to @colin-axner suggestions
Happy path handshake is done
Cancel/Timeout still needs to change