Prevent clicks when looping in AudioStreamWAV #83341
The audio thread isn't really performance constrained in the traditional sense. Outside of Godot I've been doing a lot of embedded programming and it's nothing like that. The main thing is we avoid the use of any mutexes and allocation (because most allocators require mutexes). Audio mixing is the most cache-friendly operation I can think of, and branch predictors on the kinds of processors that Godot runs on tend to be pretty good at their jobs, so I wouldn't sweat an additional branch or two if the condition you're branching on is cheap to evaluate. Do we have any test files for this bug so I could listen before and after?
Then there's the option to remove the check whether the special case handling is necessary. Might be slightly less performant but would require less code (one less template parameter and thus half the function calls to
I used the cooking_S00 example from the initial post in the issue thread and converted it to WAV. That did click for me when looping: Edit: "Loop Begin" was set to 396900 (9 Seconds). |
For this change I'd say that as you're only adding O(1) cache misses/branch mispredicts (per mix, per WAV playback) you should do whatever is least complex and easiest to read and review. edit: More concretely, if you have a branch in the inner loop that's either almost always taken or almost never taken, we should ignore it from a performance standpoint for this change. If that sounds agreeable let me know when I should take a look at the code. I remember this part of the codebase requiring some amount of focus to review. |
Ah, yes, that makes a lot of sense. I've removed the template parameter, so all that's new now is a new if in the inner loop and two parameters to I haven't touched the ADPCM path, by the way. As far as I can tell, that doesn't do interpolation, so there shouldn't be anything to fix there. Would be great if you could take a look at it now. |
I like this better. Definitely more readable. :) Just have some questions, but I might need to go re-familiarize myself with how wav looping is actually implemented.
    break;
case AudioStreamWAV::LOOP_PINGPONG:
    // Interpolate the last sample with the second-to-last sample (mirror).
    sample_border = MAX(0, sample_limit - 2);
This looks right for handling the playback bouncing off the last sample of the loop (and thank you for thinking about extremely short sound files) but will this do the right thing when the ping pong loop bounces off the start position? I'm also a little bit unclear as to why the forward and backwards cases are handled identically.
Yes, it took me some time to get my head around which cases we're dealing with here. In the end, we only ever need to worry about the right edge because `do_resample` will always interpolate between the "previous" sample and the "next" one, regardless of the direction we're moving in. `offset` is mapped to samples by discarding the lower bits, so we can be up to almost a whole sample after the (start of the) last sample, but we cannot be before the first sample. I found it somewhat less obvious that we can end up after the (start of the) last sample when walking backwards and wrapping around to the loop end (because the sub-sample part is retained), but that then maps to the same case of having to deal with the loop end.
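The mapping from `offset` to a sample described above can be sketched roughly as follows. This is a minimal illustration, not the engine's code: the constant names and the choice of 13 fractional bits are assumptions made for the example.

```cpp
#include <cassert>
#include <cstdint>

// Assumed fixed-point layout for illustration only: the lower FRAC_BITS
// bits hold the sub-sample position, the upper bits the sample index.
constexpr int FRAC_BITS = 13;
constexpr int64_t FRAC_LEN = int64_t(1) << FRAC_BITS;
constexpr int64_t FRAC_MASK = FRAC_LEN - 1;

// Sample index: the lower (fractional) bits are discarded, so the result
// rounds down and can never land before the first sample.
inline int64_t sample_index(int64_t offset) {
    assert(offset >= 0);
    return offset >> FRAC_BITS;
}

// Sub-sample position in [0, FRAC_LEN), used as the interpolation weight.
inline int64_t sample_fraction(int64_t offset) {
    assert(offset >= 0);
    return offset & FRAC_MASK;
}
```

With this layout, an offset just below the loop end maps to the last sample with a fraction of almost one whole sample, which is exactly the case where the interpolation partner matters.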
Not sure if I'm doing a very good job at explaining what I mean 😅, so here's an example: If we have a loop that starts at 0 and has a length of 10, then the loop end is 10 and the last sample that is part of the loop is 9. `offset` is a fixed-point number but if it were a float, its valid values would be between 0.0 and 9.99... I believe the original code correctly ensures that `offset` will never leave the [0, 10) range. The problem was that for positions > 9, the interpolation was done between samples 9 and 10, although 10 isn't part of the loop region. On the lower end, however, even the extreme value 0.0 would be correctly interpolated between samples 0 and 1, so there's no issue there. When starting at 3.5 and playing 10 samples backwards, the code would play 4 samples starting from 3.5 (positions 3.5, 2.5, 1.5, 0.5) and then 6 samples starting at 9.5, so the critical case is still in the 9-10 range.
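The backwards walk from this example can be reproduced with a small sketch. It uses floating-point positions instead of the fixed-point `offset`, and `backward_positions` is a hypothetical helper written for this illustration, not a function from the engine.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: lists the positions visited when stepping backwards
// one sample at a time through a loop covering [loop_begin, loop_end).
std::vector<double> backward_positions(double start, int count, double loop_begin, double loop_end) {
    assert(loop_end > loop_begin);
    assert(start >= loop_begin && start < loop_end);
    std::vector<double> positions;
    double pos = start;
    for (int i = 0; i < count; i++) {
        positions.push_back(pos);
        pos -= 1.0;
        if (pos < loop_begin) {
            // Wrap to the loop end; the fractional part is retained.
            pos += loop_end - loop_begin;
        }
    }
    return positions;
}
```

Calling `backward_positions(3.5, 10, 0.0, 10.0)` visits 3.5, 2.5, 1.5, 0.5 and then continues from 9.5 down to 4.5. Only 9.5 lies past the start of the last sample, so the loop end is the only place where the interpolation partner needs special handling.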
I had to do quite a bit of familiarization today, too. 😅 There's a lot going on here. |
"It's just copying one PCM buffer into another, how hard could it be?" -- me, to myself, circa 2020 😆 I'm still getting clicks in the wav file and loop points you provided but in audacity it doesn't seem to be looping at zero crossings, so I'm going to find my own loop points and then re-test. |
I tried 396954 to 3571752 and I'm still getting clicks both before and after these changes. :/ |
Being able to do backwards and ping-pong looping is really neat, but it does open a whole new can of worms, that's true. 😄
Hm, let me check that again. 🤔 In any case, it's correct that there's no zero crossing at 9 seconds, but the waveform there should line up with the end of the stream. |
That makes sense. And even if you loop at zero crossings I think you can still get sidebands because of abrupt frequency or amplitude changes so that could be what I'm hearing. I'm definitely not a competent sound engineer. Just trying to find a good test case. |
So I don't get a click with the original test case at 396900. I do get a click with your 396954, but I suppose that's expected, because the stream ends on a positive value, so the jump to zero will click (should be the same in other applications). 3571752 is 8 ms before the end of the stream. That loop sounds... interesting. 😄 Can you set a breakpoint around line 285 and check what values are actually used for mixing? Mine are: |
Yeah I just picked zero crossings without listening to them because audacity takes exclusive control of my audio device and I was halfway through an album I like that I wanted to go back and finish. I don't have an environment set up to actually debug Godot, which I should probably fix because I spend a lot of time working on the engine. I fell out of the habit of working with debuggers when I spent a few years working on an app notorious for crashing debuggers. I'm going to shelve this and work on some other stuff because I do work on Godot as a day job and this is my weekend so it feels like I should be doing something else. I'll try and pick it back up soon though. |
Ah that's okay. I was just wondering if maybe we were using different mixing rates or something and if that could explain the difference, so it would have been interesting if the values that actually end up in the method match up. Enjoy your weekend! 🙂 |
Ok, I tested only with 3 WAV files (I'm pretty tired right now) which were clearly popping. They all have loop offset > 0. But I cut off the intro, so I had 6 files to test (looping back to 0 also popped). With this PR: They're all still popping. I was only skimming over your comments, but you seem to get some popping as well? |
Thanks for testing!
I didn't before, but I just tried a few things and I do get a click when I set my system sampling rate to match the file sampling rate! In my mind that should've been unproblematic because this case does not require interpolation and thus should be unaffected by any issues with the interpolation, but clearly that's either not true or this case has a problem of its own. 😅 Thanks for pointing me in the right direction! |
Funnily enough, there were indeed two separate issues. The one I fixed in my original commit affects only the resampling case while the other affects almost exclusively the non-resampling case. Specifically, this line: I've changed this now to
|
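The commit message further down describes this second issue as a remaining-samples calculation that returns one sample too many when the remaining length is evenly divisible by the increment. The sketch below shows one way such an off-by-one can arise and be avoided. Both functions are hypothetical and written for illustration; they are not the lines changed in the PR.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical off-by-one variant: floor division plus one. This is right
// when the remaining length is not a multiple of the increment, but
// returns one sample too many when it divides evenly.
inline int64_t samples_until_limit_off_by_one(int64_t offset, int64_t limit, int64_t increment) {
    assert(increment > 0 && offset < limit);
    return (limit - offset) / increment + 1;
}

// Ceiling division: counts the positions offset, offset + increment, ...
// that are still strictly below limit.
inline int64_t samples_until_limit(int64_t offset, int64_t limit, int64_t increment) {
    assert(increment > 0 && offset < limit);
    return (limit - offset + increment - 1) / increment;
}
```

When the input and output rates match, the increment is exactly one sample, so the remaining length always divides evenly and the extra sample shows up on every pass through the loop. With resampling, an even division is the exception, which would fit the observation that this issue almost exclusively affects the non-resampling case.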
@bs-mwoerner: sorry for the late testing. But I can say that all 6 test WAV files are now looping smoothly (tested only forward looping; with looping at second 0 and > 0). They still pop in Godot 4.1.1 and 4.2.1 (tested there just for reference). Could you please squash the commits into 1 and rebase on latest on master? @ellenhp: Ping, could you please review (if you have the time) |
…scontinuities/clicks in the waveform when looping. - Fixed calculation of remaining samples in the loop returning one too many samples when the input sample rate matches the output sample rate (causing the remaining length to be evenly divisible by the increment), removing another possible source of clicks.
d38be0d to 6d4b1b8
That's great! Thanks for testing. I've squashed all commits into one and rebased on the current master. 👍 |
@bs-mwoerner: I poked @reduz for a review, and he mentioned doing this once on import. |
@MJacred I had a look at the importer and, yes, as long as there's no way to play a file with a loop mode different from what was specified during import, I think we can hardcode the correct overflow sample into the imported copy of the stream (with a bit of special-case handling for when the loop ends at the end of the stream). This would solve the issue with having to handle loop modes when picking the next sample (resampling case) and replace that part of this fix. The other part (non-resampling case), though, would still need to be fixed in the playback, I think. If I remember correctly, that was just a miscalculation when advancing the pointer. |
I think we do support changing loop mode at runtime. At the very least it is possible and I've suggested people do it before to reduce export size in cases where they might otherwise need two copies of the same resource |
@ellenhp: Thanks! I can confirm. Just tested in Godot v4.2.1: while changing the loop mode in the editor requires a re-import, and the editor makes the UI widgets insensitive for clicks, you can still change the mode at runtime. Even disabling. |
@bs-mwoerner: Hm… can we go hybrid with this? No idea if this can work (can also be used as inspiration for a different / more refined solution):
|
@MJacred Hm, but this would then only work in the editor, since we can't re-import during runtime, can we? With exported builds, we don't even have access to the source media, so we definitely can't reimport anything there. I feel adding considerable code complexity for an optimization that doesn't affect release builds isn't really worth it. However, it looks like the actual WAV data is mutable during runtime, so I suppose we could have a solution that modifies the stream in-memory in accordance with the loop mode. Then we wouldn't need the special case handling during mixing. We would, however, have to make sure that we keep track of these changes, so that we can undo them before a loop setting or the actual data is changed and then modify the stream anew for the new settings afterwards. I'd like to hear ellenhp's input on this. I remember her saying that with branch prediction, the impact of a single "if" that's almost never taken in the mixing loop is negligible, so I'm not yet convinced that the additional management effort for replacing this with transient stream modifications is a good trade-off (we would add lots of small things that need to be done in a number of places). One example of an edge case we'd have to consider: It's possible to save the stream to a file while it is being played, so we'd have to make sure that we undo the stream modification on the fly while saving without affecting what the playback sees.
Yeah, I'm not an expert on any particular microarchitecture, but processors are designed to be able to handle branches like the one in the inner loop here without stalling the pipeline. You could optimize out the arithmetic add/subtract by comparing edit: looked into this and I'm still convinced it's not an issue for any modern CPU, including old android phones and stuff, but I think the heat death comment is pretty far into the realm of hyperbole. 😆 |
Yeah. The re-import is more like a "save this default config".
Yes, tracking the changes would be necessary.
If I understood everything correctly, then the trade-off should be agreeable, right?
Hm… I don't get that. Shouldn't the user desire exactly this behaviour (edge case)? |
Personally, I'd go the "handle this during playback" route, because patching the stream in-memory creates more possibilities to forget to do or undo the modifications at the various places that they need to be done and thus introduces more ways for things to go wrong. If it's true that there's no noticeable difference in performance (I didn't do any benchmarking, so that's an assumption for now), then I'd prefer the simpler solution for maintainability reasons. Just for context: reduz's original suggestion of preprocessing the stream during import doesn't work with loop mode changes during runtime, because this preprocessing is destructive, so something needs to be done during runtime either way. The remaining options are handling looping during playback and modifying the stream in memory. The first one is simpler and has no side effects (the actual data is not touched), the second one saves one
Hm, that's actually a good question. The docs for
Thanks for moving this forward, by the way! I had almost forgotten about this issue. |
Happy to be of help! And a big "thank you" for everyone who pitched in! It's good that we went through our options. Now we have sth. to report back. I'll poke again in the contributor chat to get an official review on this. |
@reduz: Can we get your review on this, please? It would be great to get one more audio fix into 4.3 |
@bs-mwoerner: I got a response from Juan on this issue (I'm late in relaying this)
What is your take on this response? |
Loop offsets >0 aren't an edge case, they're a core feature of the engine. The issue is that there's a bug in the interpolation code that shows up at the loop boundary. I understand having some objections to this PR. It adds branches in inner loops. I understand wanting to see profiling data to back it up, or asking for it to be commented better. But if Juan doesn't want the underlying bug to ever be fixed I don't really know what to say. On a meta level: It's very frustrating that after putting in dozens or sometimes hundreds of developer-hours across all contributors into a fix, ~5 minutes of Juan's time can completely sink it because he either doesn't understand the problem fully or doesn't think it's important enough to review. This is the main reason I completely deprioritized contributing to and maintaining the project. I recognize that I'm rocking the boat by saying this, but it's because I still care about the project, and I want the best for it. Providing this feedback is important. |
I was a bit confused initially, because when Juan says "Interpolation does just that, it should never go past the end sample, so even if there is a next sample it should not be audible", that sounds like an exact description of what this PR does: It stops the interpolation from going past the end sample, which it previously did. After giving this some thought, I now think when Juan says "interpolation", he may refer to overlapping and fading between the end and the start of the loop (whereas we were talking about interpolating between individual samples in the stream to account for sample rate differences or non-standard playback rates). I ran a new test with the official 4.3 build and as of now, the clicking problem is still there, but if there are plans to implement overlapping loops for WAV streams at some point in the future, then Juan may mean that we shouldn't bother fixing these issues now, because they will likely not be heard through the fade anyway? |
Partially fixes #64775 (namely, the WAV part. OGG is addressed by #80452).
`AudioStreamWAV` supports resampling and variable playback speeds by using sub-sample accuracy in the playback position and linear interpolation between two neighboring samples.
When forward or backward looping is enabled between samples s and e (inclusive) and the playback pointer is between samples e and e + 1, interpolation is done between samples e and e + 1. The next/previous sample to be played after/before e, however, is s, so interpolation should instead be done between e and s to prevent discontinuities/clicks in the signal. For pingpong looping, interpolation in that same case should be done between e and e - 1, because the stream is essentially mirrored at the end.
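The border handling described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the PR's implementation: it uses a floating-point position instead of the fixed-point offset, and `LoopMode`, `border_sample` and `interpolate_at` are hypothetical names. Clamping the pingpong partner to the loop start for one-sample loops is also a choice made for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class LoopMode { Forward, Backward, PingPong };

// Picks the interpolation partner for the last loop sample e (loop_last):
// the loop start s for forward/backward looping, e - 1 for pingpong.
inline std::size_t border_sample(LoopMode mode, std::size_t loop_first, std::size_t loop_last) {
    assert(loop_first <= loop_last);
    if (mode == LoopMode::PingPong) {
        // Mirror at the loop end; stay inside the loop if it has one sample.
        return loop_last > loop_first ? loop_last - 1 : loop_last;
    }
    return loop_first;
}

// Linear interpolation at pos, which must lie in [loop_first, loop_last + 1).
inline float interpolate_at(const std::vector<float> &data, double pos, LoopMode mode,
        std::size_t loop_first, std::size_t loop_last) {
    assert(loop_first <= loop_last && loop_last < data.size());
    assert(pos >= static_cast<double>(loop_first) && pos < static_cast<double>(loop_last + 1));
    const std::size_t index = static_cast<std::size_t>(pos);
    const float frac = static_cast<float>(pos - static_cast<double>(index));
    // Only the last loop sample needs a special partner; every other
    // position interpolates with its direct neighbor.
    const std::size_t next = (index == loop_last) ? border_sample(mode, loop_first, loop_last) : index + 1;
    return data[index] + (data[next] - data[index]) * frac;
}
```

For a ramp 0, 1, ..., 9 looped over samples 0 to 9 and followed by an unrelated sample, position 9.5 interpolates toward sample 0 in forward and backward mode and toward sample 8 in pingpong mode, so the value behind the loop end never leaks into the output.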
Ensuring this border condition requires one additional check per output sample. I don't know how performance sensitive we are in the audio thread so just to be sure, I added one more template parameter to the `do_resample` method so that there's a separate code path for mixing blocks near the end of the loop region. That produces quite the cascade of function calls, but allows skipping the border checks for the vast majority of blocks.