Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replaygain backend ffmpeg #3056

Merged
merged 6 commits into from
Jul 21, 2019
Merged

Conversation

zsinskri
Copy link
Contributor

Add a replaygain backend using the ffmpeg CLI-tool and its ebur128 filter.
It is a alternative to the bs1770gain to create R128_* tags.

comparision with r128gain backend

#3055 also uses ffmpeg, but with an intermediary python module ("r128gain"). This pull request replicates some of the work done in r128gain, but avoids the (beta-) dependency.

The album gain algorithm implemented in this backend - the mean of all track gains, also used by the bs1770gain backend - gives different results than the implementation in r128gain (calculating the gain for a concatenation of all tracks). Calculating the mean is obviously faster than rescanning the whole album, but might deliver worse results.

This backend also does not implement any kind of threading (a single track is scanned at a time), but r128gain does.

@zsinskri
Copy link
Contributor Author

I'm not really happy with my changes to command_output: it feels wrong to just append both streams. Is there a better way to access stderr with the existing tooling in beets?

@zsinskri
Copy link
Contributor Author

I just found this conversation. It seems like using the mean to calculate the album gain is just plain wrong. I will look into this...

Copy link
Member

@sampsyo sampsyo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome! Thanks for getting this started. It looks great already; I noted the two issues inline.

We don't currently have a variant of command_output that gets the stderr stream, but doing that is probably a good idea. Maybe it would be best to just refactor that function to always return both streams—and change everywhere that calls it to ignore (or appropriately use) the stderr data?

beets/util/__init__.py Outdated Show resolved Hide resolved
beetsplug/replaygain.py Outdated Show resolved Hide resolved
# convert db to LUFS according to:
# http://wiki.hydrogenaud.io/index.php?title=ReplayGain_
# specification#Reference_level
self._target_level = config['targetlevel'].as_number() - 103
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How exactly are the options tragetlevel and r128 supposed to work together?
The only backend that already supports r128, Bs1770gainBackend, just completely ignores targetlevel. (That might be a bug?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In Bs1770gainBackend initialising it as an EBU R128 Backend only sets the target level to -23 LUFS. So it is impossible to configure the target level of R128_* tags.
Maybe it would make sense to add a config r128_targetlevel, defaulting to 80 db (that corresponds to -23 LUFS according to http://wiki.hydrogenaud.io/index.php?title=ReplayGain_specification#Reference_level).
Then add a parameter target_level to Backend.compute_track_gain and Backend.compute_album_gain that is either targetlevel or r128_targetlevel depending on the filetype. That also makes the whole ReplayGainPlugin.r128_backend_instance obsolete, the same backend can just be called with different target levels.
I guess I should create a separate pull request implementing that change?

Also Bs1770gainBackend's method configuration just changes the target level. Shouldn't that be controlled by targetlevel?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I like that proposed design a lot—with that change, it would be great to dispense with the distinction between different "backend instances" for classic RG and R128.

I also agree that it seems wrong for Bs1770gainBackend to unilaterally change the target level. It does seem like the configuration should control that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has been implemented in #3065.

@sampsyo
Copy link
Member

sampsyo commented Oct 27, 2018

This is already looking great! Thank you for carefully considering all the details here.

If it seems appropriate to you, let's merge this PR and work on the refactoring you mentioned in a separate PR?

@zsinskri
Copy link
Contributor Author

Yes, I'm fine with merging.

@zsinskri
Copy link
Contributor Author

zsinskri commented Nov 7, 2018

Have I missed something, that I should do to get this pull request merged, or is it just waiting for a review?

@sampsyo
Copy link
Member

sampsyo commented Nov 7, 2018

It's just in my queue to review. Thanks for your patience; I'll get to it real soon!

@zsinskri
Copy link
Contributor Author

Just merged master and resolved conflicts, no new features in latest push.

@zsinskri
Copy link
Contributor Author

Testing this backend's accuracy (especially the custom album gain calculation) with an unscientific sample size of 1:

beets w/ ffmpeg r128gain (uses ffmpeg) beets w/ bs1770gain
R128_TRACK_GAIN -1536 -1536 -1526
R128_ALBUM_GAIN -1662 -1664 -1669

These are with #3314 merged (otherwise every backend will produce garbage).

r128gain calculates the album gain by truly concatenating all tracks and letting ffmpeg re-analyse all audio. We are only 0.0078125 LU (= (1664-1662)/pow(2,8)) off. Considering the difference between bs1770gain and r128gain is 3.5 times as large (or for the track gain 5 times as large) this seems quite good.

@zsinskri
Copy link
Contributor Author

Both ffmpeg and bs1770gain can calculate either the sample or true peak. At least to my understanding the sample peak is the highest stored value and thus really fast to compute, while the true peak is the maximum of the resulting waveform and takes somewhat longer.

The ffmpeg backend currently offers the peak configuration option, defaulting to true peak as the played waveform is what we actually care about. (Don't we? Is there any relevant standart?)

But the bs1770gain backend unconditionally chooses the sample peak (by using the -p, not -t, CLI-Option). Should I change the default value of the peak option to sample? Why was the sample peak chosen for the bs1770gain backend?

I guess bs1770gain should also respect peak if we do not decide to remove that option. But that is something for a different PR.

@sampsyo
Copy link
Member

sampsyo commented Jun 22, 2019

Huh, good point. I don’t know why that was chosen as the default behavior for the existing backend. It does seem like the backends should be consistent, but that true peak should be the default. Let’s make it do that—perhaps in a PR just for cleaning up the semantics of the “peak” config option?

@zsinskri
Copy link
Contributor Author

The peak config option is introduced by this PR as a ffmpeg specific option (as I did not know that other backends could support it too).

cleaning up the semantics of the “peak” config option?

If its semantics are off that should be fixed right here and only bs1770gain's support added in a separate PR to avoid merging suboptimal code. Or should we move the whole peak option to a new PR?

@sampsyo
Copy link
Member

sampsyo commented Jun 22, 2019

Aha! Sorry, I misunderstood. No, I think it’s fine to keep it here, but I like your idea of making the other backend respect the option too (and use the same default).

beetsplug/replaygain.py Outdated Show resolved Hide resolved
@zsinskri
Copy link
Contributor Author

Now everything is uptodate with master and even the version check has been added.

This branch should be ready to merge once again.
If there is anything I can do to help the review process of this merge request, please let me know.

beets/util/__init__.py Outdated Show resolved Hide resolved
zsinskri added a commit to zsinskri/beets that referenced this pull request Jul 14, 2019
Return a namedtuple CommandOutput(stdout, stderr) instead of just stdout from
util.command_ouput, allowing separate access to stdout and stderr.

This change is required by the ffmpeg replaygain backend (GitHub
PullRequest beetbox#3056) as ffmpeg's ebur128 filter outputs only to stderr.
@zsinskri
Copy link
Contributor Author

I have rebased this branch onto #3329 which is now a perquisite for these changes (as per fortes great suggestion to split #3329 off), cleaning up some history in the process.

Previously using EBU R128 forced the use of the bs1770gain backend.
This change adds a whitelist of backends supporting R128. When the configured
backend is in that list it will also be used for R128 calculations. Otherwise
bs1770gain is still used as a default.

This should not change the overall behaviour of the program at all, but allow
for further R128-supporting backends to be added.
Add replaygain backend using ffmpeg's ebur128 filter.

The album gain is calculated as the mean of all BS.1770 gating block powers.
Besides differences in gating block offset, this should be equivalent to a
BS.1770 analysis of a proper concatenation of all tracks.

Just calculating the mean of all track gains (as implemented by the bs1770gain
backend) yields incorrect results as that would:
- completely ignore track lengths
  - just using length in seconds won't work either (e.g. BS.1770 ignores
    passages below a threshold)
- take the mean of track loudness, not power

When using the ffmpeg replaygain backend to create R128_*_GAIN tags, the
targetlevel will be set to -23 LUFS. GitHub PullRequest beetbox#3065 will make this
configurable.
It will also skip peak calculation, as there is no R128_*_PEAK tag.

It is checked if the libavfilter library supports replaygain calculation. Before
version 6.67.100 that did require the `--enable-libebur128` compile-time-option,
after that the ebur128 library is included in libavfilter itself. Thus we
require either a recent enough libavfilter version or the `--enable-libebur128`
option.
Add changelog entry for the new ffmpeg replaygain backend.
Use keyword arguments to make the ffmpeg parser more readable.
@zsinskri
Copy link
Contributor Author

Rebasing onto master again so that GitHub does not show changes from #3329 under "Files changed". This should make this PR easier to review.

Copy link
Member

@sampsyo sampsyo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just did a full review (at last)—this looks wonderful! Here are just a few suggestions.

)

# check that peak_method is valid
valid_peak_method = ("true", "sample")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No parentheses are necessary here.


# ffmpeg backend
class FfmpegBackend(Backend):
"""A replaygain backend using ffmpegs ebur128 filter.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"""A replaygain backend using ffmpegs ebur128 filter.
"""A replaygain backend using ffmpeg's ebur128 filter.


def _analyse_item(self, item, count_blocks=True):
"""Analyse item. Returns a Pair (Gain object, number of gating
blocks above threshold).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Analyse item. Return a pair of a Gain object and the number of gating blocks above the threshold.


line_integrated_loudness = self._find_line(
output, b" Integrated loudness:",
start_line=(len(output) - 1), step_size=-1,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No parentheses are necessary here.

def _parse_float(self, line):
"""Extract a float.

Extract a float from a key value pair in `line`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This extra sentence may not add much?

This plugin can use one of many backends to compute the ReplayGain values:
GStreamer, mp3gain (and its cousin, aacgain), Python Audio Tools or ffmpeg.
mp3gain can be easier to install but GStreamer, Audio Tools and ffmpeg support
more audio formats.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would even say that ffmpeg belongs to the "easy to install" camp! It's available in nearly every package manager ever, and it usually comes as a complete package with everything included.

This commit mostly addresses feedback:
- remove some unused parenthesis
- fix a typo
- expand some docstrings
- document that ffmpeg is usually easy to install
Use the POSIX character class instead of `\s` to match all whitespace in a
regular expression describing the language of valid inputs, in order to avoid a
test failure for the invalid escape sequence `\s` in Python strings.
@sampsyo
Copy link
Member

sampsyo commented Jul 21, 2019

This looks perfect! Thank you again for all your hard work on this—this new backend will be a very useful alternative IMO. 🎉

@sampsyo sampsyo merged commit bd6a5cf into beetbox:master Jul 21, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants