# [BUG] Main-Backup switching bandwidth peak #2157
This test was performed to investigate improvements with the current build of PR #2250 when switching from the main to the backup link. In earlier tests (#2157) a bandwidth peak was observed, which also disrespected a set bandwidth limit. Two tests were performed, including network captures: the first with a bandwidth limit (MAXBW) set, the second without a limit.

**Results:** The results look very promising: no packets got dropped during switching. (The charts attached to the original issue are from the sender side, based on pcap analysis with Wireshark.)

**Log files:** All log files and network captures can be downloaded [here](https://hai365.sharepoint.com/:u:/s/Team-SRT/Eb0L49XBy0lLhX6kSm6_IUkBy2GaUzxJPPL_hXUD5-B5ww?e=KFCyVJ).

**Steps to reproduce**

```
cd /home/haivision/projects/justus/rcvbuffer/xtransmit/_gcc7_2250/bin
```

Captures:

- bigFlop: `tshark -i em3 -i em4 -s 1500 -w i2157-rcv.pcapng`
- bigFlip: `tshark -i ens785f0 -i ens785f1 -s 1500 -w i2157-snd.pcapng`

*Test 1: main/backup with MAXBW set*

Sender:

```
./srt-xtransmit generate "srt://192.168.4.2:4200?latency=1000&grouptype=backup&weight=51&maxbw=1250000" "srt://192.168.3.2:4433?weight=50&maxbw=1250000" --sendrate 7Mbps --duration 60s --enable-metrics --statsfile group-snd-stats.csv --statsfreq 1s -v
```

Receiver:

```
./srt-xtransmit receive srt://:4200?latency=1000 srt://:4433 --enable-metrics --metricsfile group-rcv-metrics.csv --metricsfreq 1s --statsfile group-rcv-stats.csv --statsfreq 1s -v --reconnect
```

Results: sender and receiver statistics (screenshots in the original issue).

*Test 2: no limit, MAXBW not set*

Captures:

- bigFlop: `tshark -i em3 -i em4 -s 1500 -w i2157-rcv-no-limit.pcapng`
- bigFlip: `tshark -i ens785f0 -i ens785f1 -s 1500 -w i2157-snd-no-limit.pcapng`

Sender:

```
./srt-xtransmit generate "srt://192.168.4.2:4200?latency=1000&grouptype=backup&weight=51" "srt://192.168.3.2:4433?weight=50" --sendrate 7Mbps --duration 60s --enable-metrics --statsfile group-snd-stats.csv --statsfreq 1s -v
```

Receiver:

```
./srt-xtransmit receive srt://:4200?latency=1000 srt://:4433 --enable-metrics --metricsfile group-rcv-metrics.csv --metricsfreq 1s --statsfile group-rcv-stats.csv --statsfreq 1s -v --reconnect
```

Results: sender and receiver statistics (screenshots in the original issue).
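As a sanity check on the commands above: SRT's `SRTO_MAXBW` option (and the `maxbw` URI parameter) is expressed in bytes per second, so the `maxbw=1250000` value used here corresponds to the 10 Mbps link limit of the test setup. A quick conversion:

```python
def mbps_to_maxbw(mbps: float) -> int:
    """Convert a link rate in Mbit/s to SRT's maxbw unit (bytes per second)."""
    return int(mbps * 1_000_000 / 8)

print(mbps_to_maxbw(10))  # 1250000, the value used in the repro commands
```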
Closing as resolved by PR #2232.
While testing the recently improved main/backup switching algorithm (`mode=backup`) in version 1.4.4, the following behaviour was observed. It is worth noting and explaining, since it can have an impact on workflows, and the limitation should be taken into consideration when planning redundant connections with SRT socket groups. Furthermore, we should look into improvements in future versions.
## Problem Description

When using socket groups in backup mode, a spike in bandwidth usage is observed when switching from a `main` connection to a `backup` connection. The freshly started backup link sends the ongoing SRT stream plus a number of packets that are considered to be still in flight on the `main` link. Their arrival is doubtful, since the `main`-to-`backup` switch was triggered and the previous `main` line is most likely down, so these packets are immediately sent again on the new connection, in parallel to the ongoing stream.

As a result, a peak in bandwidth can be seen, which can be a problem if the streaming bandwidth is close to the maximum available network bandwidth on that `backup` network path. Setting `SRTO_MAXBW` didn't help; in this particular case the algorithm doesn't kick in.
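To get a feel for the size of that retransmission burst, the volume of data the sender may consider "still in flight" can be roughly modelled as send rate × latency. This is a simplified illustration, not SRT's actual bookkeeping; the real window also depends on RTT and buffer state. With the 7 Mbps stream and 1000 ms latency used in the tests:

```python
def inflight_bytes(rate_mbps: float, latency_ms: float) -> int:
    """Rough upper bound on data in flight: send rate times latency window."""
    return int(rate_mbps * 1_000_000 / 8 * latency_ms / 1000)

# 7 Mbps stream with a 1000 ms SRT latency: roughly 875 kB that may be
# re-sent on the backup link right after the switch, on top of the live stream.
print(inflight_bytes(7, 1000))  # 875000
```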
## Example

When triggering a switch to the `backup` link, a peak of 78 Mbps was seen in the first second of `backup` link activity.

## Conclusion and Recommendations
A test series with two links limited to 100 Mbps, and another with two links limited to 10 Mbps, showed that the link with the lowest available bandwidth within a `backup` socket group defines the maximum usable bandwidth for the SRT stream, which should be lower than 40% of that available network bandwidth. When this 40% limit is exceeded by setting a higher stream bandwidth, dropped packets can be observed and reproduced when triggering a `main`-to-`backup` switch.

Increasing the `latency` buffer to 8×RTT instead of 4×RTT gained another 10%, so up to 50% of the available bandwidth could be used without seeing dropped packets during switching between `main` and `backup`.

When facing higher packet-loss scenarios, the overhead of packet retransmissions also needs to be taken into account. The default of `SRTO_OHEADBW` is 25% recovery bandwidth overhead above the input rate, if not explicitly set to a different value.
When using a single-socket SRT connection or a socket group in broadcast mode, up to 75% of the available network path bandwidth can be utilised for SRT with the default `SRTO_OHEADBW` setting. (In the real world this should be 65–70%, to leave a safety margin.)
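These rules of thumb can be collected into a small helper. Note that the 40%, 50%, and 75% figures are the empirical values reported in this issue, not SRT constants, and the function name is purely illustrative:

```python
def max_stream_rate(link_mbps: float, mode: str, long_latency: bool = False) -> float:
    """Empirical usable stream rate per the findings in this issue.

    mode: "backup" (main/backup group) or "single" (single socket / broadcast group).
    long_latency: True if latency is raised to ~8x RTT (helps in backup mode).
    """
    if mode == "backup":
        # 40% of the slowest link's bandwidth; ~50% with an 8x RTT latency buffer.
        factor = 0.50 if long_latency else 0.40
    else:
        # Single socket / broadcast: the default 25% SRTO_OHEADBW retransmission
        # overhead leaves about 75% of the path usable (65-70% with a safety margin).
        factor = 0.75
    return link_mbps * factor

print(max_stream_rate(10, "backup"))        # 4.0
print(max_stream_rate(10, "backup", True))  # 5.0
print(max_stream_rate(100, "single"))       # 75.0
```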
In `backup` mode this number is significantly smaller and needs attention if network bandwidth is limited.

## To Do
We should investigate whether the initial retransmission of packets considered "still in flight" on the shut-down previous main link can be minimised, and whether this initial burst could be smoothed out by the `SRTO_MAXBW` setting.
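One way such a burst could be smoothed is classic token-bucket pacing of the retransmission backlog, which is essentially the behaviour one would hope a `SRTO_MAXBW` cap enforces. A minimal sketch (not SRT's actual implementation; the 875 kB backlog and 1316-byte payload size are illustrative assumptions):

```python
class TokenBucket:
    """Token-bucket pacer: at most `rate` bytes/s, bursts up to `burst` bytes."""

    def __init__(self, rate: float, burst: float):
        self.rate = rate      # refill rate in bytes per second
        self.burst = burst    # bucket capacity in bytes
        self.tokens = burst   # start with a full bucket

    def advance(self, dt: float) -> None:
        """Refill tokens for `dt` seconds of elapsed time."""
        self.tokens = min(self.burst, self.tokens + self.rate * dt)

    def try_send(self, nbytes: int) -> bool:
        """Consume tokens for one packet if enough are available."""
        if self.tokens >= nbytes:
            self.tokens -= nbytes
            return True
        return False


# Drain a hypothetical 875 kB retransmission backlog in 1316-byte payloads
# through a pacer capped at 1.25 MB/s (the maxbw value used in the tests).
bucket = TokenBucket(rate=1_250_000, burst=10_000)
backlog, t = 875_000, 0.0
while backlog > 0:
    bucket.advance(0.001)   # simulate 1 ms ticks
    t += 0.001
    while backlog > 0 and bucket.try_send(1316):
        backlog -= 1316
print(round(t, 2))  # the burst drains in well under a second at the capped rate
```

The point of the sketch: instead of a one-second 78 Mbps spike, the paced backlog drains at the configured cap, spread over the time the bucket refills.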