Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[core] fix unrecoverable initial packets lose in group message mode #2257

Merged
merged 1 commit into from
Mar 14, 2022

Conversation

gou4shi1
Copy link
Contributor

@gou4shi1 gou4shi1 commented Mar 2, 2022

Fix #2236:

The need_notify_loss = false was introduced by #1448, according to #1448 (comment)

This is to prevent checking for loss for a freshly turned-to-running link. If this isn't done then the sequence distance between the last reception - which in this case, as a group member, is artificially updated to be in synch with other links in the group - and the current packet reception would be treated as loss and loss-reported. This actually isn't a loss because the "previous" sequence is not really a received packet sequence. After turning to RUNNING state though since this moment this range should be treated as a loss.

But it also introduces another bug #2236
I have carefully checked synchronizeWithGroup(), updateIdleLinkFrom(), processCtrlAck(), processCtrlLossReport() and packLostData(), with this PR, receivers may send ACK/NAK behind m_iSndLastAck, but senders will just ignore them, so it should be ok to remove need_notify_loss.

@ethouris
Copy link
Collaborator

ethouris commented Mar 2, 2022

AFAIK no. The sender will send in response the URQ_DROPREQ for an information that the loss-retransmission-requested packets are no longer available.

The problem actually begins with the group-specific sequence synchronization. This sequence synchronization is being done "to the closest possible place", but not to the exact place where the secondary link is going to start sending.

The procedure is the following, more-less:

  1. The idle link is being synchronized to sequence X.
  2. There's a notification of response timeout from the current active link on the sender.
  3. The sender then turns one of the currently idle links to running and sends the first packet with the sequence that is the last sent sequence at the sender. The receiver is completely unaware what sequence it may be.
  4. The receiver is stupid - it didn't know that anything happened on the sender, and simply receives a packet with some sequence. This likely is shifted towards the last sequence received by some packets, so there's definitely a gap between the last synchronized sequence and the newly received packet's sequence.

The tricky part here is that packets that were lost as first sent should be recovered, but not packets that predate it. There is, however, no information as to which packet was intended to be sent first. However I think it might be possible to retrieve this information from the group data as the last valid packet received and check if this sequence is between the synchronized sequence and the first received packet sequence.

@gou4shi1
Copy link
Contributor Author

gou4shi1 commented Mar 2, 2022

AFAIK no. The sender will send in response the URQ_DROPREQ for an information that the loss-retransmission-requested packets are no longer available.

Thanks for your reminder, yes, but these DROPREQ can make receivers move forward correctly.
It's actually a sync between senders and receivers.

However I think it might be possible to retrieve this information from the group data as the last valid packet received and check if this sequence is between the synchronized sequence and the first received packet sequence.

The last valid packet received from group may still not the firstly sent packet in the new-running-link.

@maxsharabayko maxsharabayko added Type: Bug Indicates an unexpected problem or unintended behavior [core] Area: Changes in SRT library core labels Mar 14, 2022
@maxsharabayko maxsharabayko added this to the v1.4.5 milestone Mar 14, 2022
@maxsharabayko maxsharabayko merged commit 3975428 into Haivision:master Mar 14, 2022
@gou4shi1 gou4shi1 deleted the fix-initial-lose branch March 14, 2022 09:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
[core] Area: Changes in SRT library core Type: Bug Indicates an unexpected problem or unintended behavior
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] Initially lost packets may be ignored in group message mode
3 participants