Add QOA (Quite OK Audio) as a WAV compression mode #91014

Merged
merged 1 commit into from
May 2, 2024

Conversation

DeeJayLSP
Contributor

@DeeJayLSP DeeJayLSP commented Apr 22, 2024

This is an alternative (a better one in my opinion) to #88646, with all caveats from it nullified. Closes godotengine/godot-proposals#9133 too.

Once again: according to the article announcing it, QOA was developed to be a better alternative to ADPCM formats for use in games (not exclusively, but that is its best use case).

Production release templates, for some reason, had a binary size decrease of 4064 bytes on Linux.

The patches in qoa.h suppress a few warnings, allow the editor to build (since the implementation is included in both the importer and the stream), and reduce the binary size penalty (it would be a bit over 4 KiB otherwise).


This simply adds QOA as a compression mode within AudioStreamWAV:

image

Briefly, the differences between IMA-ADPCM and QOA within AudioStreamWAV should be:

+ IMA-ADPCM distorts many types of sound, especially at higher frequencies. The most QOA will do is add barely audible white noise to higher frequencies.

+ IMA-ADPCM isn't resampled on playback, which means sounds whose sample rate differs from the project's mix rate will get heavily distorted. QOA doesn't use prediction when fetching decoded samples, so it can be resampled.

+ Since IMA-ADPCM decoding uses prediction, only Forward loop mode is available. QOA uses prediction too, but only within frames: each frame is decoded to a buffer and samples are fetched from that buffer, so all loop modes can work. Resampling had to be adapted to avoid some unnecessary decode callbacks.

- QOA is slightly more complex than IMA-ADPCM, which should result in increased CPU usage. Despite this, it's still much faster than MP3 and Vorbis.

Supersedes #88646. Unlike that PR, standalone QOA files can't be used directly (which shouldn't be a problem, as IMA-ADPCM WAVs could never be used either).

Member

@Calinou Calinou left a comment


Tested locally, it works as expected. Same MRP as here: #88646 (review)

File sizes of the imported WAV (converted to QOA on import) match the one from the previous PR.

@Riteo
Contributor

Riteo commented Apr 22, 2024

I really like this approach! (can we call this non-destructive asset management?)

But at this point I wonder why we can't have the best of both worlds and also allow importing plain QOA files, basically merging this and the superseded PR. Sorry, was there a reason for that? I can't find it in the old thread. There seemed to be a pretty good consensus there.

@DeeJayLSP
Contributor Author

DeeJayLSP commented Apr 23, 2024

> I really like this approach! (can we call this non-destructive asset management?)
>
> But at this point I wonder why we can't have the best of both worlds and also allow importing plain QOA files, basically merging this and the superseded PR. Sorry, was there a reason for that? I can't find it in the old thread. There seemed to be a pretty good consensus there.

Currently there is no common software capable of converting audio files into QOA for distribution, so few people would use it.

Also, according to the format's creator, QOA is meant to be embeddable, and so I believe it made more sense to make it a WAV compression mode instead of another AudioStream type.

If there's demand, QOA files could be allowed to be imported in the future. IMA-ADPCM never had that demand.

@Riteo
Contributor

Riteo commented Apr 23, 2024

@DeeJayLSP I see, this makes perfect sense, thanks for clearing things up!

Contributor

@Riteo Riteo left a comment


Just took a look, and everything looks fine! I'm not an audio person, so some things went over my head, but they didn't raise any obvious red flags.

Small nitpick: the patch comments out a few lines. I suppose those lines might as well be removed altogether.

(Resolved, outdated review thread on thirdparty/misc/patches/qoa-min-fix.patch)
@DeeJayLSP DeeJayLSP force-pushed the qoa-wav-playback branch 2 times, most recently from e3b3c3c to 722bc5f Compare April 23, 2024 04:12
@DeeJayLSP
Contributor Author

DeeJayLSP commented Apr 24, 2024

The number of workarounds I'm having to do for the sake of optimal resampling...

@DeeJayLSP
Contributor Author

DeeJayLSP commented Apr 24, 2024

I explained this a few times but I wanted to leave a definitive explanation for the workarounds.

QOA frames are composed of 5120 samples. When playback begins, or when the position crosses from sample 5119 to 5120 (or any other frame boundary), the decoder is triggered, and a buffer large enough for 5120 samples gets replaced with the new frame data.
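As a rough illustration of the frame arithmetic described above, here is a minimal sketch. The names `frame_of`, `offset_in_frame` and `FrameCache` are hypothetical and are not taken from the PR; only the 5120-sample frame length comes from the explanation. The decode step itself is replaced by a counter.

```cpp
#include <cassert>
#include <cstdint>

// Samples per QOA frame, as stated in the explanation above.
constexpr int64_t QOA_FRAME_SAMPLES = 5120;

// Hypothetical helper: index of the frame that holds a sample position.
constexpr int64_t frame_of(int64_t sample_pos) {
    return sample_pos / QOA_FRAME_SAMPLES;
}

// Hypothetical helper: offset of the sample inside the decoded frame buffer.
constexpr int64_t offset_in_frame(int64_t sample_pos) {
    return sample_pos % QOA_FRAME_SAMPLES;
}

// Hypothetical playback state: remembers which frame is currently decoded.
struct FrameCache {
    int64_t cached_frame = -1; // -1 means nothing has been decoded yet
    int decode_calls = 0;      // stands in for the real decode work

    // Returns true when the requested position required a new decode.
    bool ensure(int64_t sample_pos) {
        assert(sample_pos >= 0);
        const int64_t frame = frame_of(sample_pos);
        if (frame == cached_frame) {
            return false; // the buffer already holds this frame
        }
        cached_frame = frame;
        ++decode_calls; // a real implementation would refill the buffer here
        return true;
    }
};
```

With this sketch, requesting positions 0, 5119 and 5120 in that order triggers a decode on the first and third request only.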

PCM8/16's resampling works by interpolating the current sample with the next one. This becomes a problem when interpolating during backwards playback or at a different sampling rate.
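For readers unfamiliar with the idea, interpolating the current sample with the next one can be sketched as below. This assumes plain linear interpolation on floats; the function name `lerp_sample` is hypothetical, and the engine's actual resampler may use a different numeric representation.

```cpp
#include <cassert>

// Linear interpolation between the current sample and the next one.
// `frac` is the fractional playback position between the two samples.
inline float lerp_sample(float current, float next, float frac) {
    assert(frac >= 0.0f && frac <= 1.0f);
    return current + (next - current) * frac;
}
```

The point that matters for QOA is that every output value needs two neighbouring input samples, which may sit in different frames.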

On a backwards playback, the following situation could occur:

// Previous sample: 5120
- Fetch sample 5119, interpolate with 5120
- Fetch sample 5118, interpolate with 5119

The position crossed the 5120 boundary 3 times, so the decoder would be triggered 3 times for two consecutive samples.

The solution was simple: fetch samples backwards. Add 1 to the current position, then reverse the from/to assignment. Of course, only when playing backwards.
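The effect of the access order can be checked with a small simulation. This is one reading of the fix rather than the PR's code: `count_decodes` and `backward_accesses` are hypothetical helpers, and a "decode" is counted whenever an access lands in a frame other than the one currently buffered.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

constexpr int64_t FRAME_SAMPLES = 5120; // samples per QOA frame, as stated above

// Counts how often the buffered frame has to change over a list of accesses.
// `decoded_frame` is the frame assumed to be in the buffer beforehand.
int count_decodes(const std::vector<int64_t> &accesses, int64_t decoded_frame) {
    int decodes = 0;
    for (int64_t pos : accesses) {
        assert(pos >= 0);
        const int64_t frame = pos / FRAME_SAMPLES;
        if (frame != decoded_frame) {
            decoded_frame = frame;
            ++decodes;
        }
    }
    return decodes;
}

// Builds the accesses for `count` backward steps starting at `start`.
// Naive order: current sample, then current + 1.
// Reversed order (the assumed fix): current + 1 first, then current.
std::vector<int64_t> backward_accesses(int64_t start, int64_t count, bool reversed_order) {
    std::vector<int64_t> out;
    for (int64_t i = 0; i < count; ++i) {
        const int64_t pos = start - i;
        if (reversed_order) {
            out.push_back(pos + 1);
            out.push_back(pos);
        } else {
            out.push_back(pos);
            out.push_back(pos + 1);
        }
    }
    return out;
}
```

Starting at 5119 with frame 1 already buffered, two backward steps cost 3 decodes in the naive order and 1 in the reversed order under this model.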

This had a side effect: when going from the first to the last sample in a backward loop, it would try to interpolate the last sample with the first, resulting in a pop. The solution was to simply return the last sample whenever a sample beyond the length is requested.
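The "return the last sample" rule amounts to clamping the read position. A minimal sketch, assuming the decoded audio is available as a plain vector; `fetch_clamped` is a hypothetical name, not the function used in the PR:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns the sample at `pos`, or the last sample when `pos` is past the end,
// so the final sample is never interpolated against the first one.
int16_t fetch_clamped(const std::vector<int16_t> &samples, int64_t pos) {
    assert(!samples.empty() && pos >= 0);
    const int64_t last = static_cast<int64_t>(samples.size()) - 1;
    const int64_t clamped = pos > last ? last : pos;
    return samples[static_cast<std::size_t>(clamped)];
}
```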

Due to the way resampling works, some samples end up being fetched repeatedly, each time with a different fraction. If the repeated sample happens to be the last one in a QOA frame (guaranteed if the audio's mix rate is half the project's), the same problem as above would occur.

- Fetch sample 5119, interpolate with 5120
- Fetch sample 5119, interpolate with 5120 // Again but with a different fraction

The solution I came up with was to store the current sample, then return it directly if the next request is for the same position, instead of going through all the checks again, which is what causes the problem. I don't think this is the best solution, but at least it solves the problem.
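The store-and-return idea can be sketched as a one-entry cache in front of the slower fetch path. `LastSampleCache` and its members are hypothetical names, and the slow path is passed in as a callback purely for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

// One-entry cache: remembers the most recently fetched sample so that a
// repeated request for the same position skips the frame checks entirely.
struct LastSampleCache {
    int64_t cached_pos = -1;  // -1 means nothing is cached yet
    int16_t cached_value = 0;
    int slow_path_calls = 0;  // how often the full fetch path ran

    int16_t get(int64_t pos, const std::function<int16_t(int64_t)> &slow_fetch) {
        assert(pos >= 0);
        if (pos == cached_pos) {
            return cached_value; // same position as last time: no checks, no decode
        }
        ++slow_path_calls;
        cached_value = slow_fetch(pos);
        cached_pos = pos;
        return cached_value;
    }
};
```

In the 5119/5120 example above, the second request for 5119 is answered from the cache, so the buffer still holds the frame containing 5120 when the interpolation partner is fetched again.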

There might be ways to optimize this. In a scenario where resampling isn't used, QOA decoding could be just this.

@DeeJayLSP DeeJayLSP force-pushed the qoa-wav-playback branch 3 times, most recently from 40ca9ec to 2229096 Compare April 24, 2024 20:07
@DeeJayLSP DeeJayLSP force-pushed the qoa-wav-playback branch 4 times, most recently from 314e526 to b319054 Compare April 25, 2024 15:25
(Two resolved, outdated review threads on scene/resources/audio_stream_wav.cpp)
@DeeJayLSP
Contributor Author

DeeJayLSP commented Apr 29, 2024

Is it normal for binary sizes to decrease by over 4KB after implementing a feature like this?

(I did test to see if it would work)

@DeeJayLSP
Contributor Author

DeeJayLSP commented May 2, 2024

For a while now I've been unable to find ways to optimize or fix potential problems in this implementation, so this is the point where I say I believe it's in its best state.

@akien-mga akien-mga modified the milestones: 4.x, 4.3 May 2, 2024
@akien-mga akien-mga merged commit 9cb3a16 into godotengine:master May 2, 2024
16 checks passed
@akien-mga
Member

Thanks!

Development

Successfully merging this pull request may close these issues.

Add support for QOA (Quite OK Audio) format
6 participants