Improve audio player behavior #4572

rom1v · 2024-01-07T21:05:48Z

This PR improves internal implementation details of the audio player.

Atomics

The main change removes lock contention between the audio receiver thread and the audio output thread in the "happy path": synchronization there is replaced by atomics. Locking is kept for corner cases where the writer (audio receiver) thread needs to "read" (to consume/drop samples).
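To illustrate the happy path, here is a minimal sketch of a single-producer/single-consumer ring buffer synchronized only by C11 atomics. It is not the scrcpy implementation: the names (`struct ring`, `ring_write`, `ring_read`, `RING_CAP`) and the tiny capacity are hypothetical, and the corner case where the writer must also drop samples is deliberately left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define RING_CAP 8 // hypothetical capacity in samples (one slot stays unused)

struct ring {
    int16_t data[RING_CAP];
    atomic_size_t head; // next write index, advanced only by the producer
    atomic_size_t tail; // next read index, advanced only by the consumer
};

static void ring_init(struct ring *r) {
    atomic_init(&r->head, 0);
    atomic_init(&r->tail, 0);
}

// Producer side (audio receiver thread): returns the number of samples written.
static size_t ring_write(struct ring *r, const int16_t *src, size_t n) {
    assert(r != NULL && src != NULL);
    size_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    // Acquire: see the slots the consumer has finished reading.
    size_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);
    size_t free_slots = (tail + RING_CAP - head - 1) % RING_CAP;
    if (n > free_slots) {
        n = free_slots;
    }
    for (size_t i = 0; i < n; ++i) {
        r->data[(head + i) % RING_CAP] = src[i];
    }
    // Release: publish the samples before the new head becomes visible.
    atomic_store_explicit(&r->head, (head + n) % RING_CAP,
                          memory_order_release);
    return n;
}

// Consumer side (audio output thread): returns the number of samples read.
static size_t ring_read(struct ring *r, int16_t *dst, size_t n) {
    assert(r != NULL && dst != NULL);
    size_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    // Acquire: see the samples published by the producer.
    size_t head = atomic_load_explicit(&r->head, memory_order_acquire);
    size_t avail = (head + RING_CAP - tail) % RING_CAP;
    if (n > avail) {
        n = avail;
    }
    for (size_t i = 0; i < n; ++i) {
        dst[i] = r->data[(tail + i) % RING_CAP];
    }
    // Release: hand the consumed slots back to the producer.
    atomic_store_explicit(&r->tail, (tail + n) % RING_CAP,
                          memory_order_release);
    return n;
}
```

Each index has exactly one writer, so no lock is needed as long as the producer only writes and the consumer only reads. As soon as the producer also has to move `tail` (to drop samples), both threads modify the same index, which is the corner case where this PR keeps a lock.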

Compensation thresholds

To adjust the audio samples so that a target latency is preserved between the input and the output, compensation (think "resampling", see blogpost) is applied. The compensation is proportional to the difference between the actual buffering level and the target buffering level.

But to avoid spurious compensation (due to noise in the estimate), it was only enabled if this difference exceeded 1 ms. However, the buffering level does not change continuously: it increases abruptly when a packet is received, and decreases abruptly when an audio block is consumed, so a rolling average is used to estimate it. This estimate may sometimes vary by enough to trigger (unwanted) compensation.

To avoid the problem, make two changes:

 - increase the rolling average smoothness
 - increase the threshold to enable compensation from 1 ms to 4 ms

But keep a smaller threshold (1 ms) for disabling compensation, so that the buffering level is restored closer to the target value. This avoids keeping the actual level close to the compensation threshold.
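The two thresholds form a hysteresis, which can be sketched as follows. This is an illustration only: the function name `update_compensation_state` and the macros are hypothetical, the 48 kHz rate is taken from the example later in this description, and the use of a strict `>` comparison is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define SAMPLE_RATE 48000                          // assumed sample rate (Hz)
#define ENABLE_THRESHOLD  (SAMPLE_RATE * 4 / 1000) // 4 ms = 192 samples
#define DISABLE_THRESHOLD (SAMPLE_RATE * 1 / 1000) // 1 ms = 48 samples

static_assert(ENABLE_THRESHOLD > DISABLE_THRESHOLD,
              "hysteresis needs the enable threshold above the disable one");

// Returns whether compensation should be active after this update.
// `active` is the previous state; `avg` and `target` are in samples.
static bool update_compensation_state(bool active, int32_t avg,
                                      int32_t target) {
    int32_t diff = abs(avg - target);
    // While active, keep compensating until the level is within 1 ms of the
    // target; while inactive, only start once the level is more than 4 ms off.
    int32_t threshold = active ? DISABLE_THRESHOLD : ENABLE_THRESHOLD;
    return diff > threshold;
}
```

With a target of 2400 samples, an average of 2300 does not start compensation (100 < 192), but it does not stop an ongoing one either (100 > 48).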

Here is a log capture before the changes (scrcpy -Vverbose); note the spurious non-zero compensation values:

VERBOSE: [Audio] Buffering: target=2400 avg=2353.506104 cur=2610 compensation=0
VERBOSE: [Audio] Buffering: target=2400 avg=2354.768311 cur=2370 compensation=0
VERBOSE: [Audio] Buffering: target=2400 avg=2300.147705 cur=1890 compensation=99
VERBOSE: [Audio] Buffering: target=2400 avg=2376.177734 cur=2635 compensation=0
VERBOSE: [Audio] Buffering: target=2400 avg=2359.578613 cur=2395 compensation=0
VERBOSE: [Audio] Buffering: target=2400 avg=2340.571777 cur=2635 compensation=59
VERBOSE: [Audio] Buffering: target=2400 avg=2343.081055 cur=2649 compensation=56
VERBOSE: [Audio] Buffering: target=2400 avg=2360.970947 cur=2423 compensation=0
VERBOSE: [Audio] Buffering: target=2400 avg=2365.144531 cur=2423 compensation=0
VERBOSE: [Audio] Buffering: target=2400 avg=2361.825684 cur=2423 compensation=0
VERBOSE: [Audio] Buffering: target=2400 avg=2369.891357 cur=2663 compensation=0
VERBOSE: [Audio] Buffering: target=2400 avg=2368.985352 cur=2423 compensation=0
VERBOSE: [Audio] Buffering: target=2400 avg=2354.009277 cur=2183 compensation=0
VERBOSE: [Audio] Buffering: target=2400 avg=2355.908203 cur=2903 compensation=0
VERBOSE: [Audio] Buffering: target=2400 avg=2385.742920 cur=2663 compensation=0
VERBOSE: [Audio] Buffering: target=2400 avg=2274.370605 cur=1943 compensation=125
VERBOSE: [Audio] Buffering: target=2400 avg=2205.103271 cur=1735 compensation=194
VERBOSE: [Audio] Buffering: target=2400 avg=2144.965820 cur=2023 compensation=255
VERBOSE: [Audio] Buffering: target=2400 avg=2328.314941 cur=2327 compensation=71
VERBOSE: [Audio] Buffering: target=2400 avg=2413.046875 cur=2345 compensation=0
VERBOSE: [Audio] Buffering: target=2400 avg=2303.484375 cur=1865 compensation=96
VERBOSE: [Audio] Buffering: target=2400 avg=2282.799561 cur=2369 compensation=117
VERBOSE: [Audio] Buffering: target=2400 avg=2449.577148 cur=2398 compensation=0
VERBOSE: [Audio] Buffering: target=2400 avg=2423.165527 cur=2398 compensation=0
VERBOSE: [Audio] Buffering: target=2400 avg=2492.875244 cur=2398 compensation=0
VERBOSE: [Audio] Buffering: target=2400 avg=2480.594482 cur=2398 compensation=0
VERBOSE: [Audio] Buffering: target=2400 avg=2394.034912 cur=2398 compensation=0
VERBOSE: [Audio] Buffering: target=2400 avg=2286.450928 cur=1438 compensation=113
VERBOSE: [Audio] Buffering: target=2400 avg=2260.310547 cur=2186 compensation=139
VERBOSE: [Audio] Buffering: target=2400 avg=2252.251953 cur=1981 compensation=147
VERBOSE: [Audio] Buffering: target=2400 avg=2422.288330 cur=2498 compensation=0
VERBOSE: [Audio] Buffering: target=2400 avg=2572.453613 cur=2498 compensation=-172

The compensation values are expressed in samples / 4 seconds, so for example a value of 96 means 24 samples compensated per second (for 48000 input samples, there will be 48024 samples written to the buffer, which adds 500µs of compensation).
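The arithmetic can be checked with a short sketch. The helper names are hypothetical. Computing the compensation as the truncated difference `target - avg` is an assumption that happens to match the non-zero values in the log above (for example avg=2300.147705 gives 99, avg=2572.453613 gives -172); it ignores the threshold that forces small values to 0.

```c
#include <assert.h>
#include <stdint.h>

#define COMPENSATION_WINDOW_S 4 // logged values are samples per 4 seconds

// Assumed mapping from buffering levels to the logged compensation value:
// the difference to the target, truncated toward zero.
static int32_t compensation_from_levels(float avg, int32_t target) {
    return (int32_t) ((float) target - avg);
}

// Latency change per second of audio, in microseconds, for a logged value.
static double compensation_us_per_second(int32_t logged, int32_t sample_rate) {
    assert(sample_rate > 0);
    double samples_per_second = (double) logged / COMPENSATION_WINDOW_S;
    return samples_per_second * 1e6 / sample_rate;
}
```

For a logged value of 96 at 48000 Hz, this gives 24 samples per second, that is 500 µs of compensation per second of audio.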

And after the changes:

VERBOSE: [Audio] Buffering: target=2400 avg=2364.726318 cur=2260 compensation=0
VERBOSE: [Audio] Buffering: target=2400 avg=2350.423096 cur=2020 compensation=0
VERBOSE: [Audio] Buffering: target=2400 avg=2358.727783 cur=2260 compensation=0
VERBOSE: [Audio] Buffering: target=2400 avg=2339.984131 cur=2020 compensation=0
VERBOSE: [Audio] Buffering: target=2400 avg=2335.473145 cur=2020 compensation=0
VERBOSE: [Audio] Buffering: target=2400 avg=2332.891602 cur=2260 compensation=0
VERBOSE: [Audio] Buffering: target=2400 avg=2333.311523 cur=2020 compensation=0
VERBOSE: [Audio] Buffering: target=2400 avg=2334.060791 cur=2980 compensation=0
VERBOSE: [Audio] Buffering: target=2400 avg=2327.629883 cur=2500 compensation=0
VERBOSE: [Audio] Buffering: target=2400 avg=2338.355957 cur=2740 compensation=0
VERBOSE: [Audio] Buffering: target=2400 avg=2361.206299 cur=2980 compensation=0
VERBOSE: [Audio] Buffering: target=2400 avg=2349.394043 cur=1540 compensation=0
VERBOSE: [Audio] Buffering: target=2400 avg=2339.491943 cur=2260 compensation=0
VERBOSE: [Audio] Buffering: target=2400 avg=2339.862305 cur=2260 compensation=0
VERBOSE: [Audio] Buffering: target=2400 avg=2344.531494 cur=2020 compensation=0
VERBOSE: [Audio] Buffering: target=2400 avg=2381.400879 cur=1780 compensation=0
VERBOSE: [Audio] Buffering: target=2400 avg=2437.357422 cur=2980 compensation=0
VERBOSE: [Audio] Buffering: target=2400 avg=2486.074463 cur=2740 compensation=0
VERBOSE: [Audio] Buffering: target=2400 avg=2541.514648 cur=2740 compensation=0
VERBOSE: [Audio] Buffering: target=2400 avg=2544.961914 cur=1540 compensation=0
VERBOSE: [Audio] Buffering: target=2400 avg=2518.584473 cur=1780 compensation=0
VERBOSE: [Audio] Buffering: target=2400 avg=2484.096680 cur=2740 compensation=0
VERBOSE: [Audio] Buffering: target=2400 avg=2474.057373 cur=2980 compensation=0
VERBOSE: [Audio] Buffering: target=2400 avg=2465.440430 cur=2020 compensation=0
VERBOSE: [Audio] Buffering: target=2400 avg=2469.177002 cur=2980 compensation=0
VERBOSE: [Audio] Buffering: target=2400 avg=2489.142090 cur=2020 compensation=0
VERBOSE: [Audio] Buffering: target=2400 avg=2481.343994 cur=2980 compensation=0
VERBOSE: [Audio] Buffering: target=2400 avg=2496.606689 cur=2500 compensation=0
VERBOSE: [Audio] Buffering: target=2400 avg=2511.326172 cur=2260 compensation=0
VERBOSE: [Audio] Buffering: target=2400 avg=2512.120361 cur=2980 compensation=0
VERBOSE: [Audio] Buffering: target=2400 avg=2507.924072 cur=2980 compensation=0
VERBOSE: [Audio] Buffering: target=2400 avg=2491.568604 cur=2980 compensation=0

Spurious compensation is still possible, but less likely (of course, expected compensation still occurs, for example on buffer underflow).

The audio output thread only reads samples from the buffer, and most of the time, the audio receiver thread only writes samples to the buffer. In these cases, using atomics avoids lock contention. There are still corner cases where the audio receiver thread needs to "read" samples (and drop them), so lock only in these cases. PR #4572 <#4572>

Use different thresholds for enabling and disabling compensation. Concretely, enable compensation if the difference between the average and the target buffering levels exceeds 4 ms (instead of 1 ms). This avoids unnecessary compensation due to small noise in buffering level estimation. But keep a smaller threshold (1 ms) for disabling compensation, so that the buffering level is restored closer to the target value. This avoids keeping the actual level close to the compensation threshold. PR #4572 <#4572>

The buffering level does not change continuously: it increases abruptly when a packet is received, and decreases abruptly when an audio block is consumed. To estimate the buffering level, a rolling average is used. To make the buffering more stable, increase the smoothness of this rolling average. This decreases the risk of enabling audio compensation due to an estimation error. PR #4572 <#4572>
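As an illustration of what "smoothness" means here, the following sketch implements a simple rolling average whose `range` controls how much weight each new value gets. The struct and function names are hypothetical, and the actual averaging formula used by the audio player may differ.

```c
#include <assert.h>

struct rolling_avg {
    float avg;      // current estimate
    unsigned range; // larger range = smoother estimate
    unsigned count; // number of values taken into account, capped at range
};

static void rolling_avg_init(struct rolling_avg *a, unsigned range) {
    assert(range > 0);
    a->avg = 0;
    a->range = range;
    a->count = 0;
}

// Each new value weighs 1/count, so with a larger range a single abrupt
// change (a packet received, a block consumed) moves the estimate less.
static void rolling_avg_push(struct rolling_avg *a, float value) {
    if (a->count < a->range) {
        ++a->count;
    }
    a->avg = ((a->count - 1) * a->avg + value) / a->count;
}
```

For example, after a steady level of 2400 samples, a single value of 2900 moves an average with range 2 to 2650, but an average with range 8 only to 2462.5.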

The assumption that underflow and overbuffering are caused by jitter (and that the delay between the producer and consumer will be caught up) does not always hold. For example, if the consumer does not consume at the expected rate (the SDL callback is not called often enough, which is an audio output issue), many samples will be dropped due to overbuffering, decreasing the average buffering indefinitely. Prevent the average buffering from becoming negative, to limit the consequences of such unexpected behavior. PR #4572 <#4572>
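A minimal sketch of such a guard, assuming the average is decreased by the number of dropped samples (the function name `average_after_drop` is hypothetical):

```c
#include <assert.h>
#include <stdint.h>

// Decrease the average buffering estimate by the number of dropped samples,
// but never let it become negative.
static float average_after_drop(float avg, uint32_t dropped) {
    assert(avg >= 0.0f);
    float updated = avg - (float) dropped;
    return updated < 0.0f ? 0.0f : updated;
}
```

Without the clamp, repeated drops could push the estimate arbitrarily far below zero, and the compensation computed from it would grow accordingly.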

scrcpy v2.4

Changes since v2.3.1:
 - Add UHID keyboard and mouse support (Genymobile#4473)
 - Simulate tilt multitouch by pressing Shift (Genymobile#4529)
 - Add rotation support for non-default display (Genymobile#4698)
 - Improve audio player (Genymobile#4572)
 - Adapt to display API changes in Android 15 (Genymobile#4646, Genymobile#4656, Genymobile#4657)
 - Adapt audio workarounds to Android 14 (Genymobile#4492)
 - Fix clipboard for IQOO devices on Android 14 (Genymobile#4492, Genymobile#4589, Genymobile#4703)
 - Fix integer overflow for audio packet duration (Genymobile#4536)
 - Rework cleanup (Genymobile#4649)
 - Upgrade FFmpeg to 6.1.1 in Windows releases (Genymobile#4713)
 - Upgrade libusb to 1.0.27 in Windows releases (Genymobile#4713)
 - Various technical fixes

rom1v force-pushed the audio_player_atomic branch 4 times, most recently from 52fe7b5 to c8f00f0 (January 19, 2024 16:13)

rom1v force-pushed the audio_player_atomic branch 3 times, most recently from 0126abd to 4dd201e (January 24, 2024 15:17)

rom1v force-pushed the audio_player_atomic branch 3 times, most recently from edad610 to 8fe2e29 (February 2, 2024 13:47)

rom1v force-pushed the audio_player_atomic branch from 0c5b7af to 4e35761 (February 16, 2024 10:59)

rom1v added 8 commits February 17, 2024 15:59

Limit buffering time value

d47ecef

This avoids unreasonable values which could lead to integer overflow. PR #4572 <#4572>

Fix audio player comment

dfa3f97

PR #4572 <#4572>

Use early return to avoid additional indentation

4502126

PR #4572 <#4572>

Minimize buffer underflow on starting

c12fdf9

If playback starts too early, insert silence until the buffer is filled up to at least target_buffering before playing. PR #4572 <#4572>
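One way to picture this is a helper that pads the start of the buffer with silence up to the target level. This is only a sketch on a flat array: the function name `prefill_silence` and its parameters are hypothetical, and the real player works on its own audio buffer structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// `buf` holds `buffered` samples at its start and has room for `cap` samples.
// If fewer than `target_buffering` samples are buffered, prepend silence so
// that playback starts from the target level. Returns the new sample count.
static size_t prefill_silence(int16_t *buf, size_t cap, size_t buffered,
                              size_t target_buffering) {
    assert(buf != NULL && buffered <= cap);
    if (buffered >= target_buffering) {
        return buffered; // enough data already, nothing to insert
    }
    size_t target = target_buffering < cap ? target_buffering : cap;
    size_t silence = target - buffered;
    // Move the existing samples back so that the silence is played first.
    memmove(buf + silence, buf, buffered * sizeof(*buf));
    memset(buf, 0, silence * sizeof(*buf));
    return target;
}
```

Starting from the target level rather than from a nearly empty buffer leaves room to absorb jitter before the first underflow.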

rom1v force-pushed the audio_player_atomic branch from 4e35761 to a7cf4da (February 17, 2024 15:14)

rom1v merged commit a7cf4da into dev Feb 17, 2024

rom1v added a commit that referenced this pull request May 27, 2024

Never lock in audio player

a5a3b51

PR #4572 removed the need for locks except for corner cases. Now replace the remaining lock sections by atomics. Refs #4572 <#4572>

rom1v added a commit that referenced this pull request May 28, 2024

Never lock in audio player

ce06282

PR #4572 removed the need for locks except for corner cases. Now replace the remaining lock sections by atomics. Refs #4572 <#4572>

rom1v added a commit that referenced this pull request May 29, 2024

Never lock in audio player

37457f9

PR #4572 removed the need for locks except for corner cases. Now replace the remaining lock sections by atomics. Refs #4572 <#4572>

rom1v added a commit that referenced this pull request May 30, 2024

Never lock in audio player

a14b798

PR #4572 removed the need for locks except for corner cases. Now replace the remaining lock sections by atomics. Refs #4572 <#4572>
