[BUG] Main-Backup switching bandwidth peak #2157

Closed
J-Rogmann opened this issue Oct 12, 2021 · 3 comments

@J-Rogmann
Contributor

While testing the recently improved main/backup switching algorithm (mode=backup) in version 1.4.4, the following behaviour was observed. It is worth noting and explaining, since it can affect workflows, and the limitation should be taken into account when planning redundant connections with SRT socket groups.

We should also look into improving this in future versions.

Problem Description

When using socket groups in backup mode, a spike in bandwidth usage is observed when switching from the main connection to a backup connection. The freshly activated backup link carries the ongoing SRT stream plus the packets that are considered still in flight on the main link. Their arrival is doubtful, since the main-to-backup switch was triggered and the previous main link is most likely down, so these packets are sent again right away on the new connection, in parallel with the ongoing stream.

The result is a peak in bandwidth, which can be a problem if the stream bandwidth is close to the maximum available network bandwidth on the backup path.

Setting SRTO_MAXBW didn't help, in this very special case the algorithm doesn't kick in.

Example

  • 2 links with 1 Gbps and 155 Mbps
  • RTT1 = RTT2 = 250 ms
  • latency buffer = 4x RTT = 1000 ms
  • SRT stream bandwidth = 50 Mbps

When triggering switching to the backup link, a peak of 78 Mbps was seen in the first second of backup link activity.
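
To put the observed peak in proportion, the figures from this example can be worked through as plain arithmetic. This is only a sketch: it assumes the 78 Mbps figure is an average over a one-second window, which the report suggests but does not state explicitly.

```python
# Figures reported in the example above.
stream_mbps = 50.0   # nominal SRT stream bandwidth
peak_mbps = 78.0     # peak seen in the first second on the backup link

# Assumption: the peak is an average over a one-second window.
window_s = 1.0

excess_mbps = peak_mbps - stream_mbps
excess_ratio = excess_mbps / stream_mbps
extra_mbit = excess_mbps * window_s
# Amount of stream time that the extra data corresponds to, in milliseconds.
equivalent_stream_ms = extra_mbit / stream_mbps * 1000

print(f"excess over nominal rate: {excess_mbps:.0f} Mbps ({excess_ratio:.0%})")
print(f"extra data in the window: {extra_mbit:.0f} Mbit "
      f"(~{equivalent_stream_ms:.0f} ms of stream)")
```

Under that assumption the burst adds roughly 56% on top of the nominal rate, which corresponds to about 560 ms of stream data, less than the 1000 ms latency buffer.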

Conclusion and Recommendations

A test series with two links limited to 100 Mbps and another with two links limited to 10 Mbps showed that the link with the lowest available bandwidth in a backup socket group defines the maximum usable bandwidth for the SRT stream, which should stay below 40% of that link's available bandwidth.

  • path 1/2: 10/10 Mbps -> max 4 Mbps SRT stream bandwidth
  • path 1/2: 100/10 Mbps -> max 4 Mbps SRT stream bandwidth
  • path 1/2: 100/100 Mbps -> max 40 Mbps SRT stream bandwidth

When the stream bandwidth exceeds this 40% limit, dropped packets can be observed reproducibly when a main-to-backup switch is triggered.

Increasing the latency buffer from 4x RTT to 8x RTT gained another 10%, so up to 50% of the available bandwidth could be used without dropped packets during switching between main and backup.
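
The rule of thumb from this test series can be written down as a small planning helper. The function name and the lookup table are made up for illustration; the table only contains the two measured points (4x RTT and 8x RTT) and deliberately does not interpolate between them.

```python
# Usable share of the slowest link, as measured in the test series above.
# Keys are the latency buffer expressed as a multiple of RTT.
MEASURED_USABLE_SHARE = {4: 0.40, 8: 0.50}


def max_stream_mbps(link_mbps, latency_rtt_multiple=4):
    """Largest stream rate that survived a main/backup switch without drops.

    link_mbps is a list with the available bandwidth of every member link.
    """
    if latency_rtt_multiple not in MEASURED_USABLE_SHARE:
        raise ValueError("no measurement for this latency setting")
    share = MEASURED_USABLE_SHARE[latency_rtt_multiple]
    # The slowest member link of the group sets the budget.
    return min(link_mbps) * share


for links in ([10, 10], [100, 10], [100, 100]):
    print(links, "->", max_stream_mbps(links), "Mbps")
```

The loop reproduces the three rows of the list above (4, 4 and 40 Mbps).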

In scenarios with higher packet loss, the overhead of packet retransmissions also needs to be taken into account. Unless set explicitly, SRTO_OHEADBW defaults to 25%, which is the recovery bandwidth overhead allowed above the input rate itself.
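
As a rough illustration of what that overhead means for link planning, the sketch below computes the send rate ceiling for a given input rate. It assumes the relation described in the SRT socket option documentation for the case where SRTO_MAXBW is 0 and the limit is derived from the input rate; the function name is hypothetical.

```python
def max_send_rate_mbps(input_mbps, oheadbw_percent=25):
    """Send rate ceiling: input rate plus the recovery overhead on top."""
    return input_mbps * (100 + oheadbw_percent) / 100


print(max_send_rate_mbps(7))    # 7 Mbps input, default overhead
print(max_send_rate_mbps(50))   # 50 Mbps input, default overhead
```

With the default 25%, a 7 Mbps input may occupy up to 8.75 Mbps on the link, and a 50 Mbps input up to 62.5 Mbps.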

When using a single-socket SRT connection or a socket group in broadcast mode, up to 75% of the available network path bandwidth could be utilised for SRT with the default SRTO_OHEADBW setting. (In the real world this should be 65-70% to leave a safety margin.)

In backup mode this number is significantly smaller and needs attention if network bandwidth is limited.

To Do

We should investigate whether the initial retransmission of packets considered "still in flight" on the previous, now shut down main link can be minimised, and whether this initial burst could be smoothed out by the SRTO_MAXBW setting.
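
To reason about what smoothing would cost, here is an idealised back-of-the-envelope model, not a description of how the SRT library behaves. It assumes the backlog of re-sent packets can only use the headroom between the stream rate and the bandwidth cap, and it ignores control traffic and further retransmissions. The function name and the backlog value are hypothetical.

```python
def burst_drain_time_s(backlog_mbit, stream_mbps, maxbw_mbps):
    """Seconds needed to send a backlog using only the headroom under the cap."""
    headroom_mbps = maxbw_mbps - stream_mbps
    if headroom_mbps <= 0:
        raise ValueError("cap leaves no headroom; the backlog never drains")
    return backlog_mbit / headroom_mbps


# Hypothetical example: 3 Mbit backlog, 7 Mbps stream, 10 Mbps cap.
drain_s = burst_drain_time_s(3.0, 7.0, 10.0)
latency_s = 1.0  # configured latency buffer
print(f"drain time {drain_s:.2f} s, latency budget {latency_s:.2f} s")
```

In this model a smoothed burst only helps if the drain time stays within the latency budget; otherwise the re-sent packets would arrive too late to be played out.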

@J-Rogmann J-Rogmann added the Type: Bug Indicates an unexpected problem or unintended behavior label Oct 12, 2021
@maxsharabayko maxsharabayko added the [core] Area: Changes in SRT library core label Oct 12, 2021
@maxsharabayko maxsharabayko added this to the Backlog milestone Oct 12, 2021
@maxsharabayko
Collaborator

Tracing relevant progress.

> Setting SRTO_MAXBW didn't help, in this very special case the algorithm doesn't kick in.

PR #2232 fixes the MaxBW limit (issue #713).

@J-Rogmann
Contributor Author

J-Rogmann commented Mar 2, 2022

Main-Backup switching bandwidth peak #2157

This test was performed to investigate improvements with the current build of PR #2250 when switching from the main to the backup link. In earlier [tests](#2157) a bandwidth peak was observed, which also did not respect the configured bandwidth limit.

Two tests were performed, including captures.

The first was performed with SRTO_MAXBW set to 10 Mbps.

The second test was performed without a limit set.

Bandwidth limits and network impairments on LanForge were set to the same conditions as in the previous test [#2157](#2157).

Results

The results look very promising: no packets were dropped during switching, and SRTO_MAXBW was exceeded only slightly.

With SRTO_MAXBW set via maxbw=1250000 (bytes per second, i.e. 10 Mbps), the maximum bandwidth spike was around 1.154e+07 bit/s, so approximately 11.5 Mbps.
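
Since SRTO_MAXBW is given in bytes per second while the capture is read in bits per second, a short conversion makes the comparison explicit. The helper name is made up; the numbers are the ones quoted above.

```python
def maxbw_to_mbps(bytes_per_s):
    """Convert an SRTO_MAXBW value (bytes per second) to Mbps."""
    return bytes_per_s * 8 / 1e6


limit_mbps = maxbw_to_mbps(1_250_000)   # configured cap
spike_mbps = 1.154e7 / 1e6              # peak read from the capture, in Mbps
overshoot = (spike_mbps - limit_mbps) / limit_mbps

print(f"limit {limit_mbps:.2f} Mbps, spike {spike_mbps:.2f} Mbps, "
      f"overshoot {overshoot:.1%}")
```

The spike therefore exceeds the configured limit by roughly 15%.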

[Figure: i2157-snd-10M-limit]

Interestingly, when SRTO_MAXBW was not specified, the spike was a bit lower.

[Figure: i2157-snd-no-limit]

The graphs above are from the sender, based on pcap analysis with Wireshark.
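
For readers who want to reproduce a peak figure without the Wireshark GUI, the sketch below bins packet sizes into one-second buckets and reports the busiest one. It is not the procedure used for the graphs above. It assumes the capture has been exported as (relative time, frame length) pairs, for example with `tshark -r i2157-snd.pcapng -T fields -e frame.time_relative -e frame.len`; the sample data in the code is made up.

```python
from collections import defaultdict


def peak_mbps(samples, bin_s=1.0):
    """Return the highest per-bin bit rate in Mbps.

    samples is an iterable of (time_s, frame_len_bytes) pairs.
    """
    bits_per_bin = defaultdict(int)
    for time_s, frame_len in samples:
        bits_per_bin[int(time_s // bin_s)] += frame_len * 8
    if not bits_per_bin:
        return 0.0
    return max(bits_per_bin.values()) / bin_s / 1e6


# Hypothetical packets: two in the first second, three in the second one.
samples = [(0.1, 1250), (0.6, 1250), (1.2, 1250), (1.5, 1250), (1.9, 1250)]
print(peak_mbps(samples))
```

A smaller `bin_s` shows shorter bursts more sharply, at the cost of a noisier result.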

Log files

All log files and network captures can be downloaded [here](https://hai365.sharepoint.com/:u:/s/Team-SRT/Eb0L49XBy0lLhX6kSm6_IUkBy2GaUzxJPPL_hXUD5-B5ww?e=KFCyVJ).

Steps to reproduce

cd /home/haivision/projects/justus/rcvbuffer/xtransmit/_gcc7_2250/bin

bigFlop

tshark -i em3 -i em4 -s 1500 -w i2157-rcv.pcapng

bigFlip

tshark -i ens785f0 -i ens785f1 -s 1500 -w i2157-snd.pcapng

main/backup with MAXBW set

sender

./srt-xtransmit generate "srt://192.168.4.2:4200?latency=1000&grouptype=backup&weight=51&maxbw=1250000" "srt://192.168.3.2:4433?weight=50&maxbw=1250000" --sendrate 7Mbps --duration 60s --enable-metrics --statsfile group-snd-stats.csv --statsfreq 1s -v
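
The two member URIs in the sender command differ only in host, port and query options, so they can be generated rather than typed by hand. This is a convenience sketch with a hypothetical helper name; it only assembles strings and does not talk to srt-xtransmit.

```python
from urllib.parse import urlencode


def srt_uri(host, port, **options):
    """Build an srt:// URI with the given query options, in the order passed."""
    query = urlencode(options)
    return f"srt://{host}:{port}" + (f"?{query}" if query else "")


# maxbw is expressed in bytes per second: 10 Mbps -> 1250000.
maxbw = 10 * 1_000_000 // 8

main_uri = srt_uri("192.168.4.2", 4200, latency=1000, grouptype="backup",
                   weight=51, maxbw=maxbw)
backup_uri = srt_uri("192.168.3.2", 4433, weight=50, maxbw=maxbw)
print(main_uri)
print(backup_uri)
```

Deriving `maxbw` from a value in Mbps avoids mixing up bits and bytes when changing the limit.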

receiver

./srt-xtransmit receive "srt://:4200?latency=1000" "srt://:4433" --enable-metrics --metricsfile group-rcv-metrics.csv --metricsfreq 1s --statsfile group-rcv-stats.csv --statsfreq 1s -v --reconnect

results

sender:

Processing finished: ../../../hai-sync/MarkDown/Main-Backup switching bandwidth peak 2157/logs/i2157-snd.csv
~~~~~~~~~~~~~~~~~~~~~~~~~~~~ SRT Packets ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- SRT DATA+CONTROL pkts                                60383
- SRT DATA pkts                                        48748
  - Original DATA pkts sent                            48748    100.0%  out of orig+retrans sent DATA pkts
  - Retransmitted DATA pkts sent                           0      0.0%  out of orig+retrans sent DATA pkts
      Once                                              0(0)      0.0%
      Twice                                             0(0)      0.0%
      3×                                                0(0)      0.0%
      4×                                                0(0)      0.0%
      5+                                                0(0)      0.0%
- SRT CONTROL pkts                                     11635
  - ACK pkts received                                   5770
  - ACKACK pkts sent                                    5734
  - NAK pkts received                                      0
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Traffic ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- SRT DATA pkts
  - SRT payload + SRT hdr + UDP hdr (orig+retrans)           8.71 Mbps
  - SRT payload + SRT hdr (orig+retrans)                     8.66 Mbps
  - SRT payload (orig+retrans)                               8.55 Mbps
  - SRT payload + SRT hdr + UDP hdr (orig)                   8.71 Mbps
  - SRT payload + SRT hdr (orig)                             8.66 Mbps
  - SRT payload (orig)                                       8.55 Mbps
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Overhead ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- SRT DATA pkts
  - UDP+SRT headers over SRT payload (orig)                     1.87 %
  - Retransmitted over original sent pkts                        0.0 %
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Notations ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
pkts - packets
hdr - header
orig - original
retrans - retransmitted
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

receiver:

Processing finished: ../../../hai-sync/MarkDown/Main-Backup switching bandwidth peak 2157/logs/i2157-rcv.csv
~~~~~~~~~~~~~~~~~~~~~~~~~~~~ SRT Packets ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- SRT DATA+CONTROL pkts                                53452
- SRT DATA pkts                                        41804
  - Original DATA pkts received                        41804    100.0%  out of orig+retrans received DATA pkts
  - Retransmitted DATA pkts received                       0      0.0%  out of orig+retrans received DATA pkts
      Once                                              0(0)      0.0%
      Twice                                             0(0)      0.0%
      3×                                                0(0)      0.0%
      4×                                                0(0)      0.0%
      5+                                                0(0)      0.0%
  - Original DATA pkts lost                                0      0.0%  out of orig received+lost DATA pkts
      Recovered pkts                                       0      0.0%
      Unrecovered pkts                                     0      0.0%
- SRT CONTROL pkts                                     11648
  - ACK pkts sent                                       5834
  - ACKACK pkts received                                5710
  - NAK pkts sent                                          0
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Traffic ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- SRT DATA pkts
  - SRT payload + SRT hdr + UDP hdr (orig+retrans)           7.47 Mbps
  - SRT payload + SRT hdr (orig+retrans)                     7.42 Mbps
  - SRT payload (orig+retrans)                               7.34 Mbps
  - SRT payload + SRT hdr + UDP hdr (orig)                   7.47 Mbps
  - SRT payload + SRT hdr (orig)                             7.42 Mbps
  - SRT payload (orig)                                       7.34 Mbps
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Overhead ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- SRT DATA pkts
  - UDP+SRT headers over SRT payload (orig)                     1.77 %
  - Retransmitted over original (received+lost) pkts             0.0 %
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Notations ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
pkts - packets
hdr - header
orig - original
retrans - retransmitted
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
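
The header overhead lines in both reports can be cross-checked against the traffic figures printed above them. The sketch assumes the overhead is defined as header bytes relative to payload bytes, which is what the label "UDP+SRT headers over SRT payload" suggests; the function name is made up.

```python
def header_overhead_percent(total_mbps, payload_mbps):
    """Header share relative to payload, in percent."""
    return (total_mbps - payload_mbps) / payload_mbps * 100


# "SRT payload + SRT hdr + UDP hdr (orig)" vs "SRT payload (orig)".
sender = header_overhead_percent(8.71, 8.55)
receiver = header_overhead_percent(7.47, 7.34)
print(f"sender {sender:.2f} %, receiver {receiver:.2f} %")
```

Both values match the reported 1.87 % and 1.77 %, so the rounded traffic figures are consistent with the overhead lines.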

no limit, MAXBW not set

bigFlop

tshark -i em3 -i em4 -s 1500 -w i2157-rcv-no-limit.pcapng

bigFlip

tshark -i ens785f0 -i ens785f1 -s 1500 -w i2157-snd-no-limit.pcapng

sender

./srt-xtransmit generate "srt://192.168.4.2:4200?latency=1000&grouptype=backup&weight=51" "srt://192.168.3.2:4433?weight=50" --sendrate 7Mbps --duration 60s --enable-metrics --statsfile group-snd-stats.csv --statsfreq 1s -v

receiver

./srt-xtransmit receive "srt://:4200?latency=1000" "srt://:4433" --enable-metrics --metricsfile group-rcv-metrics.csv --metricsfreq 1s --statsfile group-rcv-stats.csv --statsfreq 1s -v --reconnect

results

receiver:

Processing finished: ../../../hai-sync/MarkDown/Main-Backup switching bandwidth peak 2157/logs/no-limit/i2157-rcv.csv
~~~~~~~~~~~~~~~~~~~~~~~~~~~~ SRT Packets ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- SRT DATA+CONTROL pkts                                53452
- SRT DATA pkts                                        41804
  - Original DATA pkts received                        41804    100.0%  out of orig+retrans received DATA pkts
  - Retransmitted DATA pkts received                       0      0.0%  out of orig+retrans received DATA pkts
      Once                                              0(0)      0.0%
      Twice                                             0(0)      0.0%
      3×                                                0(0)      0.0%
      4×                                                0(0)      0.0%
      5+                                                0(0)      0.0%
  - Original DATA pkts lost                                0      0.0%  out of orig received+lost DATA pkts
      Recovered pkts                                       0      0.0%
      Unrecovered pkts                                     0      0.0%
- SRT CONTROL pkts                                     11648
  - ACK pkts sent                                       5834
  - ACKACK pkts received                                5710
  - NAK pkts sent                                          0
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Traffic ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- SRT DATA pkts
  - SRT payload + SRT hdr + UDP hdr (orig+retrans)           7.47 Mbps
  - SRT payload + SRT hdr (orig+retrans)                     7.42 Mbps
  - SRT payload (orig+retrans)                               7.34 Mbps
  - SRT payload + SRT hdr + UDP hdr (orig)                   7.47 Mbps
  - SRT payload + SRT hdr (orig)                             7.42 Mbps
  - SRT payload (orig)                                       7.34 Mbps
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Overhead ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- SRT DATA pkts
  - UDP+SRT headers over SRT payload (orig)                     1.77 %
  - Retransmitted over original (received+lost) pkts             0.0 %

sender:

Processing finished: ../../../hai-sync/MarkDown/Main-Backup switching bandwidth peak 2157/logs/no-limit/i2157-snd-no-limit.csv
~~~~~~~~~~~~~~~~~~~~~~~~~~~~ SRT Packets ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- SRT DATA+CONTROL pkts                                60381
- SRT DATA pkts                                        48749
  - Original DATA pkts sent                            48749    100.0%  out of orig+retrans sent DATA pkts
  - Retransmitted DATA pkts sent                           0      0.0%  out of orig+retrans sent DATA pkts
      Once                                              0(0)      0.0%
      Twice                                             0(0)      0.0%
      3×                                                0(0)      0.0%
      4×                                                0(0)      0.0%
      5+                                                0(0)      0.0%
- SRT CONTROL pkts                                     11632
  - ACK pkts received                                   5770
  - ACKACK pkts sent                                    5734
  - NAK pkts received                                      0
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Traffic ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- SRT DATA pkts
  - SRT payload + SRT hdr + UDP hdr (orig+retrans)           8.71 Mbps
  - SRT payload + SRT hdr (orig+retrans)                     8.66 Mbps
  - SRT payload (orig+retrans)                               8.55 Mbps
  - SRT payload + SRT hdr + UDP hdr (orig)                   8.71 Mbps
  - SRT payload + SRT hdr (orig)                             8.66 Mbps
  - SRT payload (orig)                                       8.55 Mbps
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Overhead ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- SRT DATA pkts
  - UDP+SRT headers over SRT payload (orig)                     1.87 %
  - Retransmitted over original sent pkts                        0.0 %

@maxsharabayko
Collaborator

Closing as resolved by PR #2232.
