Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[core] Fix hang up on not enough space in the RCV buffer. #2745

Merged
merged 10 commits into from
Aug 8, 2023

Conversation

yomnes0
Copy link
Collaborator

@yomnes0 yomnes0 commented Jun 2, 2023

When there is space available in the receiving buffer after it has been fulled, send an ack to allow the transmitter to resume transmission

When there is space available in the receiving buffer after it has been fulled, send an ack to allow the transmitter to resume transmission
@maxsharabayko maxsharabayko added this to the v1.6.0 milestone Jun 2, 2023
@maxsharabayko maxsharabayko added Type: Bug Indicates an unexpected problem or unintended behavior [core] Area: Changes in SRT library core labels Jun 2, 2023
srtcore/core.h Outdated Show resolved Hide resolved
srtcore/core.cpp Outdated Show resolved Hide resolved
srtcore/core.cpp Outdated Show resolved Hide resolved
srtcore/core.cpp Outdated Show resolved Hide resolved
srtcore/core.cpp Outdated Show resolved Hide resolved
@yomnes0
Copy link
Collaborator Author

yomnes0 commented Jun 5, 2023

Naming convention is now respected and explicit check on true removed

@yomnes0 yomnes0 changed the title [core] Fix attempt for bug #2297 [core] Fix for bug #2297 Jun 9, 2023
@yomnes0 yomnes0 changed the title [core] Fix for bug #2297 [core] Fix hang up on not enough space in the RCV buffer. Jun 9, 2023
@maxsharabayko maxsharabayko linked an issue Jul 4, 2023 that may be closed by this pull request
srtcore/core.cpp Outdated Show resolved Hide resolved
srtcore/core.h Show resolved Hide resolved
srtcore/core.cpp Outdated Show resolved Hide resolved
@yomnes0
Copy link
Collaborator Author

yomnes0 commented Jul 7, 2023

I pushed some modification that are not intended to be merged as is. They are just a "quickfix" to allow for further debug with Maxim

@maxsharabayko
Copy link
Collaborator

maxsharabayko commented Jul 7, 2023

I feel like m_bBufferWasFull should be set back to false only upon successful sending of ACK.
Something like this:

diff --git a/srtcore/core.cpp b/srtcore/core.cpp
index 0e3cce0..dac1f03 100644
--- a/srtcore/core.cpp
+++ b/srtcore/core.cpp
@@ -7904,6 +7904,7 @@ int srt::CUDT::sendCtrlAck(CPacket& ctrlpkt, int size)
     SRT_ASSERT(ctrlpkt.getMsgTimeStamp() != 0);
     int nbsent = 0;
     int local_prevack = 0;
+    bool sendAckAgain = false; //Send ack again after a buffer full was detected
 
 #if ENABLE_HEAVY_LOGGING
     struct SaveBack
@@ -7927,12 +7928,16 @@ int srt::CUDT::sendCtrlAck(CPacket& ctrlpkt, int size)
     // The TSBPD thread may change the first lost sequence record (TLPKTDROP).
     // To avoid it the m_RcvBufferLock has to be acquired.
     UniqueLock bufflock(m_RcvBufferLock);
+    if (m_bBufferWasFull && getAvailRcvBufferSizeNoLock() > 0)
+    {
+        sendAckAgain = true;
+    }
 
     int32_t ack;    // First unacknowledged packet sequence number (acknowledge up to ack).
     if (!getFirstNoncontSequence((ack), (reason)))
         return nbsent;
 
-    if (m_iRcvLastAckAck == ack)
+    if (m_iRcvLastAckAck == ack && !sendAckAgain)
     {
         HLOGC(xtlog.Debug,
               log << CONID() << "sendCtrl(UMSG_ACK): last ACK %" << ack << "(" << reason << ") == last ACKACK");
@@ -7941,7 +7946,7 @@ int srt::CUDT::sendCtrlAck(CPacket& ctrlpkt, int size)
 
     // send out a lite ACK
     // to save time on buffer processing and bandwidth/AS measurement, a lite ACK only feeds back an ACK number
-    if (size == SEND_LITE_ACK)
+    if (size == SEND_LITE_ACK && !sendAckAgain)
     {
         bufflock.unlock();
         ctrlpkt.pack(UMSG_ACK, NULL, &ack, size);
@@ -8101,7 +8106,7 @@ int srt::CUDT::sendCtrlAck(CPacket& ctrlpkt, int size)
     // [[using locked(m_RcvBufferLock)]];
 
     // Send out the ACK only if has not been received by the sender before
-    if (CSeqNo::seqcmp(m_iRcvLastAck, m_iRcvLastAckAck) > 0)
+    if (CSeqNo::seqcmp(m_iRcvLastAck, m_iRcvLastAckAck) > 0 || sendAckAgain)
     {
         // NOTE: The BSTATS feature turns on extra fields above size 6
         // also known as ACKD_TOTAL_SIZE_VER100.
@@ -8117,9 +8122,7 @@ int srt::CUDT::sendCtrlAck(CPacket& ctrlpkt, int size)
         data[ACKD_RTT] = m_iSRTT;
         data[ACKD_RTTVAR] = m_iRTTVar;
         data[ACKD_BUFFERLEFT] = (int) getAvailRcvBufferSizeNoLock();
-        // a minimum flow window of 2 is used, even if buffer is full, to break potential deadlock
-        if (data[ACKD_BUFFERLEFT] < 2)
-            data[ACKD_BUFFERLEFT] = 2;
+        m_bBufferWasFull = (data[ACKD_BUFFERLEFT] == 0);
 
         if (steady_clock::now() - m_tsLastAckTime > m_tdACKInterval)
         {
diff --git a/srtcore/core.h b/srtcore/core.h
index e7ca57c..eb3e73f 100644
--- a/srtcore/core.h
+++ b/srtcore/core.h
@@ -934,7 +934,7 @@ private: // Receiving related data
     int32_t m_iAckSeqNo;                         // Last ACK sequence number
     sync::atomic<int32_t> m_iRcvCurrSeqNo;       // (RCV) Largest received sequence number. RcvQTh, TSBPDTh.
     int32_t m_iRcvCurrPhySeqNo;                  // Same as m_iRcvCurrSeqNo, but physical only (disregarding a filter)
-
+    bool m_bBufferWasFull; // Indicate that RX buffer was full last time a ack was sent
     int32_t m_iPeerISN;                          // Initial Sequence Number of the peer side
 
     uint32_t m_uPeerSrtVersion;

@yomnes0
Copy link
Collaborator Author

yomnes0 commented Aug 7, 2023

With the help of @maxsharabayko , this bug should be fixed by my last commit. Reschedule was done before m_iSndLastAck was updated, this could cause issues when the flightspan was close to the available size on the receiver

srtcore/core.cpp Outdated Show resolved Hide resolved
srtcore/core.cpp Outdated Show resolved Hide resolved
@maxsharabayko maxsharabayko modified the milestones: v1.6.0, v1.5.3 Aug 8, 2023
srtcore/core.cpp Outdated Show resolved Hide resolved
@maxsharabayko maxsharabayko merged commit 744035b into Haivision:master Aug 8, 2023
9 checks passed
guilletrejo added a commit to swxtchio/srt that referenced this pull request Aug 23, 2023
* [core] Fix crypto mode auto for listener sender (Haivision#2711).


Co-authored-by: oviano <ovcollyer@mac.com>

* [build] Upgraded CI: ubuntu to version 20.04 (Haivision#2682).

* [docs] Added the link for registration in slack to the getting started table (Haivision#2721).

* [core] Fixed FEC Emergency resize crash (Haivision#2717).

Fixed minimum history condition.

* [core] Fixed various compiler warnings on various platforms (Haivision#2679).

* [core] Minor fix of variable shadowing.

* [tests] Minor fix of variable shadowing.

* [build] Add -Wshadow=local to CMake build flags.
Supported since GCC 7.0.

* [core] Correct remaining endianness issues

Fixes the last two remaining test failures on big-endian.  These
operations were all already no-ops on little-endian, and unnecessarily
byteswapped the IP addresses on big-endian.

Closes: Haivision#2697

* [docs] Minor updates to AEAD docs plus changed v1.6.0 to 1.5.2 in some files

* [build] Fix downversioning of _WIN32_WINNT (Haivision#2754).

* [core] Fixed unhandled error in haicrypt (Haivision#2685).

* [core] Use overlapped WSASendTo to avoid loss in UDP sending (Haivision#2632).

* [core] Add volatile keyword to asm block in rdtsc (Haivision#2759).

* [core] Fixed srctime from closing socket was mistakenly cleared

* [core] Fixed group read-ready epoll events.

* [core] Removed unused CUDTGroup::m_Positions.

* [core] Perf improvement of group reading.

* [core] Fixed RCV buffer initialization in Rendezvous.

* [docs] Updating the explicit information for binding to IPv6 wildcard (Haivision#2765).

* [tests] Added custom main with transparent parameters for tests (Haivision#2681).

* [core] Fix memory leak when can't buffer a HS packet (Haivision#2757).

* [core] Refactor CRcvQueue::storePkt(..) for better resource management (Haivision#2775).

* [core] Fix hang up on not enough space in the RCV buffer (Haivision#2745).

When there is space available in the receiving buffer after it is full,
send an ack to allow the sender to resume transmission.
Reschedule sending if ACK decreases the flight span after sending is congested.

Co-authored-by: Maxim Sharabayko <maxlovic@gmail.com>

* [core] fix tsbpd() may deadlock with processCtrlShutdown()

* [core] Slightly optimize the RCV drop by message number (Haivision#2686).

Some minor improvements of logs and comments.

* [core] Rejection not undertaken in rendezvous after KMX failure (Haivision#2692).

* [core] Fix: In rendezvous when processing resulted in ACCEPT it was still sending rejection

* [core] Minor code clean up in CRateEstimator.

* [core] Initialize ISN and PeerISN in CUDT.

* [core] Drop unencrypted packets in GCM mode.

* [apps] Fix the build for target without IP_ADD_SOURCE_MEMBERSHIP (Haivision#2779).

* [core] Added maximum BW limit for retransmissions (Haivision#2714).

* [API] SRT version raised to 1.5.3.

* [apps] Fixed conditional IP_ADD_SOURCE_MEMBERSHIP in testmedia (Haivision#2780).

* [core] Fixed SRT_ATTR_REQUIRES use.

* [build] Added missing public header files in Windows binary installer (Haivision#2784).

The header file access_control.h was added to the source tree
at some point but was not added to the Windows installer.

---------

Co-authored-by: Maxim Sharabayko <maxlovic@gmail.com>
Co-authored-by: oviano <ovcollyer@mac.com>
Co-authored-by: Sektor van Skijlen <ethouris@gmail.com>
Co-authored-by: Maria Sharabayko <41019697+mbakholdina@users.noreply.github.com>
Co-authored-by: Maxim Sharabayko <maxsharabayko@haivision.com>
Co-authored-by: matoro <matoro@users.noreply.github.com>
Co-authored-by: Maria Sharabayko <msharabayko@haivision.com>
Co-authored-by: Steve Lhomme <robux4@ycbcr.xyz>
Co-authored-by: Aaron Jencks <32805004+aaron-jencks@users.noreply.github.com>
Co-authored-by: Guangqing Chen <hi@goushi.me>
Co-authored-by: john <hondaxiao@tencent.com>
Co-authored-by: yomnes0 <127947185+yomnes0@users.noreply.github.com>
Co-authored-by: Mikołaj Małecki <mmalecki@haivision.com>
Co-authored-by: Jose Santiago <jsantiago@haivision.com>
Co-authored-by: Thierry Lelegard <lelegard@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
[core] Area: Changes in SRT library core Type: Bug Indicates an unexpected problem or unintended behavior
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] Hang up on not enough space in the receiver buffer.
3 participants