[META] Optimize the WebRTC stack to the maximum #157

Open
ehfd opened this issue May 25, 2024 · 2 comments
Labels
enhancement New feature or request funding Requires funding to implement help wanted External contribution is required interface OS input, display, or audio interfaces performance Performance or latency issues, not critical but impacts usage transport Underlying media or data transport protocols upstream Requires upstream development from dependencies web Web components including gst-web

Comments

@ehfd
Member

ehfd commented May 25, 2024

Linked with #160, #153, #152, #39, #34, #30

Also read: m1k1o/neko#371

As of the v1.6.0 release, we have much higher confidence in the performance optimizations of our WebRTC stack.
We found a way to eliminate jitter buffer latency in the WebRTC decoder using playout-delay and jitterBufferTarget, alongside many other measures that stabilize and improve the video and input (DataChannel) stack.

We have also switched to smaller Opus frames to see whether latency improves (tracked in #153), although NetEQ in Chrome mostly works on its own regardless.

Several interventions remain that could push this WebRTC stack to its best achievable performance.

Backend:

  • Correctly implement YUV 4:4:4 color

https://issues.chromium.org/issues/40198264

This is possible in WebRTC: Nutanix Frame implemented YUV 4:4:4 within Chromium quite some time ago.
However, color in YUV 4:2:0 (#160) should be solved first, as there is no legitimate reason for YUV 4:2:0 color to differ from the original source by more than +/- 1.

  • Obtain the sweet spot of video encoder maximum and minimum QP parameters

https://multi.app/blog/making-illegible-slow-webrtc-screenshare-legible-and-fast
https://multi.app/blog/measuring-shared-control-latency

  • Investigate the usage of queues to GStreamer RTP payloaders

Currently, the Opus queue is commented out. However, queues may have useful features.
Beyond re-investigating how effective queues are for Opus and what role they play in latency, queues in the video RTP payloaders may (or may not) also help during congestion, where latency spikes can persist for >5-15 seconds because the WebRTC decoder scrambles to decode very late frames instead of simply dropping them.
An as-yet-unknown web browser configuration might also eliminate this situation entirely.
Any solution must work well with infinite keyframe/GOP configurations and NACK/PLI with RTX.

  • Compress DataChannel using GZip

Nestri appears to have seen meaningful input latency reductions with this.
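
A minimal sketch of what gzip-compressing DataChannel messages could look like on the browser side, using the standard CompressionStream API. The helper names (gzipBytes, gunzipBytes, sendCompressed), the 256-byte threshold, and the sample message format are assumptions for illustration only, not code from this project or from Nestri. Gzip adds a fixed 18-byte header and trailer, so very short input messages can grow rather than shrink; the sketch therefore sends short messages uncompressed.

```javascript
// Hypothetical helpers: names and threshold are illustrative assumptions.
async function gzipBytes(bytes) {
    const stream = new Blob([bytes]).stream().pipeThrough(new CompressionStream("gzip"));
    return new Uint8Array(await new Response(stream).arrayBuffer());
}

async function gunzipBytes(bytes) {
    const stream = new Blob([bytes]).stream().pipeThrough(new DecompressionStream("gzip"));
    return new Uint8Array(await new Response(stream).arrayBuffer());
}

// Send short messages as plain text and longer ones as gzip-compressed binary.
// The receiver can tell them apart with `typeof event.data === "string"`.
async function sendCompressed(channel, text, minBytes = 256) {
    const raw = new TextEncoder().encode(text);
    if (raw.byteLength < minBytes) {
        channel.send(text);
        return;
    }
    channel.send(await gzipBytes(raw));
}
```

Note that compression here is asynchronous, so a long message that is still being compressed could be overtaken by a short one sent right after it. Whether that matters, and whether compression helps at all for small input events, would need to be measured.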

Frontend:

  • Override system power settings (Especially Chromium on Windows) to decode full frames

https://web.dev/articles/requestvideoframecallback-rvfc
It seems that when the system is in a low-power (power efficiency) mode, video frames are not decoded quickly, as in the example from the article above. This increases perceived latency because frames are not painted as often as they should be.
Some settings in WebRTC or
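
One way to check whether frames are being painted late, following the requestVideoFrameCallback article linked above, is to compare each frame's receiveTime with its expectedDisplayTime. This is only a sketch: the function names are made up for illustration, and receiveTime is an optional metadata field that browsers provide for WebRTC sources, so the helper returns null when it is absent.

```javascript
// Hypothetical helper: milliseconds between the browser receiving the encoded
// frame and the time it expects to display it. Returns null if receiveTime is missing.
function receiveToDisplayMs(metadata) {
    if (typeof metadata.receiveTime !== "number") return null;
    return metadata.expectedDisplayTime - metadata.receiveTime;
}

// Hypothetical helper: report a sample for every presented frame of a <video> element.
function watchFrameLatency(video, onSample) {
    const onFrame = (now, metadata) => {
        onSample(receiveToDisplayMs(metadata), metadata.presentedFrames);
        // The callback fires once per registration, so register again for the next frame.
        video.requestVideoFrameCallback(onFrame);
    };
    video.requestVideoFrameCallback(onFrame);
}
```

Comparing these samples with the system in normal and in power-saving mode might show whether the power setting is really what delays painting.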

  • Moreover, jitterBufferTarget / jitterBufferDelayHint / playoutDelayHint are not well understood. Find out where these and other hidden WebRTC settings can improve upon the current approach.

Current configuration (referenced from https://groups.google.com/g/discuss-webrtc/c/wtuhQu6c1KY/m/Usq84y0mAQAJ; somewhat CPU-intensive but acceptable with async. It could be optimized further, and we still need a way to assess the effect of this configuration in web browsers):

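// Repeatedly emit the minimum latency target on every receiver.
// jitterBufferTarget is the standardized attribute (milliseconds);
// jitterBufferDelayHint and playoutDelayHint are older names (seconds).
webrtc.peerConnection.getReceivers().forEach((receiver) => {
    const intervalLoop = setInterval(async () => {
        // Stop the loop once the track has ended or the transport is no longer connected
        if (receiver.track.readyState !== "live" || receiver.transport.state !== "connected") {
            clearInterval(intervalLoop);
            return;
        }
        receiver.jitterBufferTarget = receiver.jitterBufferDelayHint = receiver.playoutDelayHint = 0;
    }, 15);
});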
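
To assess what the configuration above actually does in a browser, one option is to sample the standard inbound-rtp statistics and compute the average jitter buffer delay between two samples from jitterBufferDelay (seconds, cumulative) and jitterBufferEmittedCount. This is a sketch only; the function names are assumptions and not part of the project.

```javascript
// Hypothetical helper: average jitter buffer delay in milliseconds between two
// "inbound-rtp" stats samples. Returns null if nothing was emitted in between.
function averageJitterBufferDelayMs(prev, curr) {
    const emitted = curr.jitterBufferEmittedCount - prev.jitterBufferEmittedCount;
    if (emitted <= 0) return null;
    const delaySeconds = curr.jitterBufferDelay - prev.jitterBufferDelay;
    return (delaySeconds / emitted) * 1000;
}

// Hypothetical helper: fetch the inbound video stats entry from a peer connection.
async function sampleInboundVideo(peerConnection) {
    const report = await peerConnection.getStats();
    for (const stat of report.values()) {
        if (stat.type === "inbound-rtp" && stat.kind === "video") return stat;
    }
    return null;
}
```

Sampling once per second with and without the interval loop, and comparing the resulting averages, would give a rough measure of whether the 15 ms loop is needed or whether setting the target once is enough.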

WebRTC:

Check whether merging webrtcbin back into a single session is feasible: it seems that the video-delay could have reduced video latency without requiring two separate sessions.

  • Merge two different WebRTC sessions into one with multiple independent streams:

Use a=group:BUNDLE 0 1 2 3 ... and a=mid:0, a=mid:1, ... to establish one SDP session with independent streams for Audio, Video, DataChannel (m=application x UDP/DTLS/SCTP webrtc-datachannel), Microphone, Webcam, and other stream types that neither interfere with each other nor perform audio/video synchronization.

Such as:

v=0
o=- 2 IN IP4 1.1.1.1
t=0 0
a=group:BUNDLE 0 1 2 3
a=fingerprint:sha-256
a=setup:actpass
m=audio x UDP/TLS/RTP/SAVPF 111 63
c=IN IP4 0.0.0.0
a=rtcp:x IN IP4 0.0.0.0
a=mid:0
a=extmap:3 http://www.ietf.org/id/draft-holmer-rmcat-transport-wide-cc-extensions-01
a=sendonly
a=msid:id audio
a=rtcp-mux
a=rtcp-rsize
a=rtpmap:111 opus/48000/2
a=rtcp-fb:111 transport-cc
a=fmtp:111 minptime=10;useinbandfec=1
a=rtpmap:63 red/48000/2
a=rtcp-fb:63 transport-cc
a=fmtp:63 111/111
a=ptime:10
m=video x UDP/TLS/RTP/SAVPF 96 97 101 102 98
c=IN IP4 0.0.0.0
a=rtcp:x IN IP4 0.0.0.0
a=mid:1
a=extmap:2 http://www.webrtc.org/experiments/rtp-hdrext/abs-send-time
a=extmap:3 http://www.ietf.org/id/draft-holmer-rmcat-transport-wide-cc-extensions-01
a=extmap:7 http://www.webrtc.org/experiments/rtp-hdrext/video-timing
a=extmap:12 http://www.webrtc.org/experiments/rtp-hdrext/playout-delay
a=sendonly
a=msid:id video
a=rtcp-mux
a=rtcp-rsize
a=rtpmap:101 H264/90000
a=rtcp-fb:101 transport-cc
a=rtcp-fb:101 ccm fir
a=rtcp-fb:101 nack
a=rtcp-fb:101 nack pli
a=fmtp:101 level-asymmetry-allowed=1;packetization-mode=1;sps-pps-idr-in-keyframe=1;profile-level-id=42e01f
a=rtpmap:102 rtx/90000
a=fmtp:102 apt=101;rtx-time=125
m=application x UDP/DTLS/SCTP webrtc-datachannel
c=IN IP4 0.0.0.0
a=mid:2
a=sctp-port:5000
a=max-message-size:262144
m=audio x UDP/TLS/RTP/SAVPF 111
c=IN IP4 0.0.0.0
a=rtcp:x IN IP4 0.0.0.0

The main purpose is to keep the streams isolated so that no audio/video synchronization takes place (which inevitably adds latency) and to improve DataChannel performance by keeping it as an independent stream separate from the video, while still handling all of them through one TURN relay port (or other type of WebRTC port) within a single SDP.
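
When experimenting with a merged session, a small check of the negotiated SDP can confirm that every mid listed in a=group:BUNDLE has a matching m= section. A minimal sketch, where inspectBundle is a hypothetical helper name:

```javascript
// Hypothetical helper: list the m= sections of an SDP and compare their mids
// against the a=group:BUNDLE line.
function inspectBundle(sdp) {
    const lines = sdp.split(/\r?\n/);
    const groupLine = lines.find((line) => line.startsWith("a=group:BUNDLE"));
    const bundled = groupLine ? groupLine.trim().split(/\s+/).slice(1) : [];

    const sections = [];
    for (const line of lines) {
        if (line.startsWith("m=")) {
            // "m=video 9 UDP/TLS/RTP/SAVPF 101" -> kind "video"
            sections.push({ kind: line.slice(2).split(" ")[0], mid: null });
        } else if (line.startsWith("a=mid:") && sections.length > 0) {
            sections[sections.length - 1].mid = line.slice("a=mid:".length).trim();
        }
    }

    const mids = sections.map((section) => section.mid);
    return {
        sections,
        // mids named in the BUNDLE group without a matching m= section
        missing: bundled.filter((mid) => !mids.includes(mid)),
        // m= sections that are not part of the BUNDLE group
        unbundled: mids.filter((mid) => mid !== null && !bundled.includes(mid)),
    };
}
```

In a browser this could be called as inspectBundle(peerConnection.localDescription.sdp) after negotiation; both missing and unbundled should be empty if everything shares one transport.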

  • RTP Header Extensions and other WebRTC browser-side, server-side settings to implement and improve:

https://www.rtcbits.com/2023/05/webrtc-header-extensions.html

a=extmap:1 http://www.webrtc.org/experiments/rtp-hdrext/abs-send-time
a=extmap:2 http://www.ietf.org/id/draft-holmer-rmcat-transport-wide-cc-extensions-01
a=extmap:3 http://www.webrtc.org/experiments/rtp-hdrext/video-timing
a=extmap:4 http://www.webrtc.org/experiments/rtp-hdrext/playout-delay

Note: http://www.webrtc.org/experiments/rtp-hdrext/color-space causes the Chrome WebRTC decoder to skip the Hardware Decoder and go straight to the Software FFmpeg decoder.

The above RTP Header Extensions are known to help with controlling latency and timing. They can be implemented in GStreamer so that the RTP payloaders emit them.
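
Regarding the color-space note above: if an SDP happens to contain that extension, one possible workaround is to strip the corresponding a=extmap line before the description is applied. The sketch below assumes SDP munging is acceptable in this setup; removeExtmap is a hypothetical helper name.

```javascript
// Hypothetical helper: drop every a=extmap line that refers to the given URI.
// Line format: a=extmap:<id>[/<direction>] <URI> [<attributes>]
function removeExtmap(sdp, uri) {
    return sdp
        .split(/\r?\n/)
        .filter((line) => !(line.startsWith("a=extmap:") && line.split(" ")[1] === uri))
        .join("\r\n");
}
```

For example, removeExtmap(offer.sdp, "http://www.webrtc.org/experiments/rtp-hdrext/color-space") leaves all other header extensions untouched.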

https://gitlab.freedesktop.org/gstreamer/gstreamer/-/issues/3549
https://gitlab.freedesktop.org/gstreamer/gstreamer/-/issues/3550

https://gstreamer.freedesktop.org/documentation/rtpmanager/rtphdrextclientaudiolevel.html
https://gstreamer.freedesktop.org/documentation/rtpmanager/rtphdrextmid.html

SDP support in web browsers: https://codepen.io/kwst/full/yLaaxRy

draft-holmer-rmcat-transport-wide-cc-extensions-01 is enabled for video and audio when rtpgccbwe is active. abs-send-time and video-timing are not available in GStreamer. playout-delay has been implemented in a very restricted, temporary form in gstwebrtc_app.py, where only zero values can be sent (which is all we need anyway).
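
For reference, the playout-delay extension payload is three bytes: a 12-bit minimum delay followed by a 12-bit maximum delay, each in units of 10 ms, as described in the libwebrtc documentation for the extension. The sketch below only illustrates that layout; it is not the code in gstwebrtc_app.py, and encodePlayoutDelay is a hypothetical name.

```javascript
// Hypothetical helper: pack minimum and maximum playout delay (milliseconds)
// into the 3-byte extension payload (two 12-bit fields, 10 ms granularity).
function encodePlayoutDelay(minMs, maxMs) {
    const min = Math.round(minMs / 10);
    const max = Math.round(maxMs / 10);
    if (min < 0 || max > 0xfff || min > max) {
        throw new RangeError("delays must satisfy 0 <= min <= max <= 40950 ms");
    }
    const packed = (min << 12) | max;
    return Uint8Array.of((packed >> 16) & 0xff, (packed >> 8) & 0xff, packed & 0xff);
}
```

The zero-only case used today corresponds to encodePlayoutDelay(0, 0), which is three zero bytes.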

  • Investigate imageattr and flexfec in video:

a=imageattr:96 send [x=[1280:1920],y=[720:1080],fps=[30:60]]
a=imageattr:97 send [x=[1280:1920],y=[720:1080],fps=[30:60]]
a=rtpmap:98 flexfec-03/90000
a=rtcp-fb:98 transport-cc
a=fmtp:98 repair-window=10000000
a=ssrc-group:FEC-FR

  • Larger DataChannels:

a=max-message-size:262144

  • Understand the effects of b=AS: and x-google-max-bitrate (on the receiving side, not the sending side or browser-to-browser!):

nextcloud/spreed#6739
https://groups.google.com/g/discuss-webrtc/c/u7k1_hASS4Q
https://stackoverflow.com/questions/57653899/how-to-increase-the-bitrate-of-webrtc
https://groups.google.com/g/discuss-webrtc/c/udyHHPnrQMo
pion/webrtc#1827
https://ekobit.com/blog/diving-deeper-into-webrtc-advanced-options-and-possibilities/
https://chromium.googlesource.com/external/webrtc/+/a6b99448eec51527eca0bc59f6da71061d02e807/webrtc/media/base/mediaconstants.cc
https://groups.google.com/g/discuss-webrtc/c/ORJdeoFAaBE
https://webrtc.googlesource.com/src/+/refs/heads/main/docs/native-code/sdp-ext/fmtp-x-google-per-layer-pli.md

The links above may contain irrelevant information, since most of them concern controlling the sender bitrate, whereas here webrtcbin is the sender and it does not use libwebrtc.

b=AS:300000
a=fmtp:96 sps-pps-idr-in-keyframe=1;x-google-max-bitrate=300000;x-google-min-bitrate=0;x-google-start-bitrate=12000
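
To test what these lines do on the receiving side, they can be injected into the video section of an SDP before it is applied. A minimal sketch, assuming SDP munging is acceptable and that each m= section carries its own c= line as in the example SDP earlier in this issue; setVideoBandwidth is a hypothetical helper, and whether the browser honors these values as a receiver is exactly the open question.

```javascript
// Hypothetical helper: set b=AS and x-google-max-bitrate (both in kbps) for one
// video payload type. Other fmtp lines (rtx, flexfec, audio) are left unchanged.
function setVideoBandwidth(sdp, payloadType, maxKbps) {
    const fmtpPrefix = `a=fmtp:${payloadType} `;
    const out = [];
    let inVideo = false;
    for (const line of sdp.split(/\r?\n/)) {
        if (line.startsWith("m=")) inVideo = line.startsWith("m=video");
        // Drop any existing b=AS line in the video section; it is re-added below.
        if (inVideo && line.startsWith("b=AS:")) continue;
        if (inVideo && line.startsWith(fmtpPrefix) && !line.includes("x-google-max-bitrate")) {
            out.push(`${line};x-google-max-bitrate=${maxKbps}`);
            continue;
        }
        out.push(line);
        // b= lines belong after the c= line and before the a= lines of a section.
        if (inVideo && line.startsWith("c=")) out.push(`b=AS:${maxKbps}`);
    }
    return out.join("\r\n");
}
```

Running the same stream with several values and watching the received bitrate in getStats() would show whether the settings have any effect when webrtcbin is the sender.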
  • Different protocol topologies to TURN and STUN

https://neko.m1k1o.net/#/getting-started/configuration?id=webrtc

Pion provides various WebRTC configurations and protocols, including EPR, UDPMUX, TCPMUX, NAT1TO1, ICE-LITE, and ICE-TCP. These techniques allow more setup flexibility beyond TURN/STUN and make it possible to limit port ranges or use a single port for many connections. Equivalent functionality should be implemented with GStreamer's webrtcbin.

https://www.w3.org/2021/03/media-production-workshop/talks/slides/sergio-garcia-murillo-whip.pdf
https://groups.google.com/g/discuss-webrtc/c/wtuhQu6c1KY
https://henbos.github.io/webrtc-timing/
https://github.com/jakearchibald/web-platform-tests/blob/master/webrtc-extensions/RTCRtpReceiver-playoutDelayHint.html
https://mediasoup.discourse.group/t/webrtc-playout-delay-extension/2067
https://issues.chromium.org/issues/324276557
https://bugzilla.mozilla.org/show_bug.cgi?id=1592988
https://groups.google.com/a/chromium.org/g/blink-dev/c/4W4orKqA3Rs
https://www.reddit.com/r/WebRTC/comments/ipewaq/disable_use_of_jitter_buffer/?rdt=58693

@ehfd ehfd added enhancement New feature or request help wanted External contribution is required upstream Requires upstream development from dependencies funding Requires funding to implement transport Underlying media or data transport protocols performance Performance or latency issues, not critical but impacts usage web Web components including gst-web interface OS input, display, or audio interfaces labels May 25, 2024
@ehfd ehfd changed the title [META] Optimize the WebRTC stack to the extreme [META] Optimize the WebRTC stack to the maximum May 25, 2024
@ehfd
Member Author

ehfd commented Jul 6, 2024

The answer is all in: https://github.com/webrtc-sdk/libwebrtc

Someone's going to have to dive into this.
