Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: retransmission for tlc #89

Merged
merged 6 commits into from
Jun 26, 2024
Merged

Conversation

quake
Copy link
Member

@quake quake commented Jun 19, 2024

No description provided.

@quake quake requested a review from contrun June 19, 2024 07:45
@contrun
Copy link
Collaborator

contrun commented Jun 19, 2024

Only the messages sent by us need to be resent on connection reestablishment. We don't have to deliberately resend our RevokeAndAck as the counter-party will resend their CommitmentSigned and we can react to that. So there are only two possibilities, one is that we need to resend one CommitmentSigned, the other is that we only need to resend AddTlcs and RemoveTlc.

That is, we may previously send a sequence of events after a RevokeAndAck of the following two forms

  1. AddTlc/RemoveTlc, ... , CommitmentSigned, ...
  2. AddTlc/RemoveTlc, ...

In case 1, assume we don't have message after the CommitmentSigned for now (we will take care of that later). This case is no different from the ordinary commitment signed command. We only have to resend all the AddTlcs and RemoveTlcs. As our handler of CommitmentSigned already has the function to select all the AddTlc and RemoveTlc for that commitment transaction. We don't have to re-process all these tlcs. Just change the code there and and a parameter like resend_tlc.

But how do we know if we need to resend a CommitmentSigned? If we previously send a CommitmentSigned and the counter-party successfully processed this CommitmentSigned message, then he will have updated his remote_commitment_number (this is the number which he records how many commitment has signed by his counter-party, i.e. us). While connection is reestablished, we will find the remote_commitment_number sent from counter-party is larger than our local_commitment_number by 1. If the counter-party has not correctly processed the CommitmentSigned message, then, then we can just ignore our previous CommitmentSigned message. This way, we can determine whether we need to resend a CommitmentSigned. We only have to rebuild the commitment transaction and resend all the AddTlcs and RemoveTlcs that should be included in that commitment transaction.

What if we still have messages after our CommitmentSigned? Previously, I removed ChannelReadyFlags::AWAIT_REMOTE_REVOKE in #72. In hindsight, I think this flag is still useful. Say we send the following message sequences to the counter-party without receiving any RevokeAndAck

AddTlc1, RemoveTlc, ... , CommitmentSigned, RemoveTlc2, AddTlc2

Say the local commitment number for CommitmentSIgned is n. Then AddTlc2 and RemoveTlc2 should be in our commitment transaction n+1. The function insert_tlc currently actually set the commitment number to current local commitment number n (this is because local commitment number is updated only when RevokeAndAck is received). We need to fix that.

In the case 2, all we need is just pretend to send a commitment transaction. And do not send the actual CommitmentSIgned message but all the AddTlcs and RemoveTlcs instead.

ChannelState::ChannelReady() => {
// resend AddTlc, RemoveTlc and CommitmentSigned messages if needed
let local_commitment_number = reestablish_channel.local_commitment_number;
let mut need_resend_commitment_signed = false;
Copy link
Collaborator

@contrun contrun Jun 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should check whether we need to resend a CommitmentSigned as previous comment.

// resend AddTlc, RemoveTlc and CommitmentSigned messages if needed
let local_commitment_number = reestablish_channel.local_commitment_number;
let mut need_resend_commitment_signed = false;
for info in self.tlcs.values() {
Copy link
Collaborator

@contrun contrun Jun 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't have to repeat all this logic here. https://github.com/nervosnetwork/cfn-node/blob/f9e79b060f7329309eff752b7485bfacc91a629f/src/ckb/channel.rs#L2027 This function may be modified to check whether we need to resend a AddTlc or SendTlc. Just exclude all tlcs from previous round.

src/ckb/channel.rs Outdated Show resolved Hide resolved
@quake quake force-pushed the quake/retransmission branch 2 times, most recently from 79c5258 to ac5399c Compare June 25, 2024 07:52
// resend AddTlc, RemoveTlc and CommitmentSigned messages if needed
let mut need_resend_commitment_signed = false;
for info in self.tlcs.values() {
if info.is_offered() {
Copy link
Collaborator

@contrun contrun Jun 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here we have multiple implicit assumptions.

  1. If a TLC is added by us or a TLC is removed by us then we need to resend a CommitmentSigned message.
  2. We only need to resend a single CommitmentSigned message (or we only have a single TLC added/removed in this commitment phase).
  3. We only need to resend AddTlc or RemoveTlc messages when the channel is established.

Assumption 1 and assumption 2 are correct in current implementation. Assumption 2 is a direct corollary of assumption 1, but the code here can be a little fragile if we don't know all these assumptions and try to refactor the logic here. Can we add a warning if there are multiple AddTlc/RemoveTlc need to be processed?

Assumption 3 is actually wrong. For RemoveTlc, we may need to transition the state to Shutdown if both parties agreed to shutdown and this is the last pending TLC to be removed. https://github.com/nervosnetwork/cfn-node/blob/975f25668aff31b54ccad0eaf97a177069367de2/src/ckb/channel.rs#L593

In regard to assumption 3, I think we can create a AddTlc or RemoveTlc from the DetailedTlcInfo, then call handle_add_tlc_command or handle_remove_tlc_command, and make sure handle_add_tlc_command and handle_remove_tlc_command are always idempotent (note that functions handle_add_tlc_command and handle_remove_tlc_command will send CommitmentSigned, so we don't have to resend CommitmentSigned here). This way even if code changes, we can keep the the processing logic of RemoveTlc in channel reestablishment and normal operations in sync.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't have to reuse all the code of handle_add_tlc_command and handle_remove_tlc_command. The common logic can be extracted to new functions.

@quake quake merged commit f563279 into nervosnetwork:main Jun 26, 2024
10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

2 participants