Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: replaygain: R128 support #2560

Merged
merged 4 commits into from
Jun 11, 2017
Merged

RFC: replaygain: R128 support #2560

merged 4 commits into from
Jun 11, 2017

Conversation

autrimpo
Copy link
Contributor

Following the discussion in #2557 I've decided to try implement a very WIP, 'proof of concept' support of R128 to the replaygain plugin.

As I'm not very familiar with the codebase nor Python I'm creating a PR already and would appreciate if someone more knowledgable could review the changes.

Things that I'd especially like comments on:

  • I'm saving and restoring the backend because R128 is only supported by bs1770 as far as I know. Is that needed? If yes, perhaps somehow detect and create the second backend in advance and switch between those two instead of recreating it every time.
  • There's a lot of shared code between the branches right now, should it branch only for those 2 specific calls or perhaps even branch inside those calls?
  • How to decide when to branch. My idea was a new plugin setting, a list of formats with a sensible default (so right now only Opus).
  • The MediaType and db entries - they work, but are they correct?
  • Removal of REPLAYGAIN_ tags when tagging with override set - right now, my files are incorrectly tagged with these and I'd like the R128_ to replace them. When/where should it be done, should it be configurable?
  • Expanding on the 'branching on list' and the previous point - is tagging with both systems desirable and how should it be handled?

This only calculates track gain, album gain is still missing. But I've tested it on MPD and DeadBeef and the gain produced with these changes works as expected.

@mention-bot
Copy link

@autrimpo, thanks for your PR! By analyzing the history of the files in this pull request, we identified @sampsyo, @geigerzaehler and @jrobeson to be potential reviewers.

Copy link
Member

@sampsyo sampsyo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here are a few comments tied to specific lines of code. I'll also answer some general questions separately…

r128_track_gain = MediaField(
StorageStyle(
u'R128_TRACK_GAIN',
float_places=0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't need float_places if you don't intend to store floating-point numbers. Is this field supposed to store an integer?

StorageStyle(
u'R128_TRACK_GAIN',
float_places=0
),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With just a StorageStyle, this will only support "free-form" tag formats. On MP3, MPEG-4, and WMA, it will crash unceremoniously. I think the right thing to do is to fill in logical names for these other formats, even if support isn't widespread yet.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added support for other formats, with names more or less correspoinding to their replaygain equivalents.

method = self.method
self.method = '--ebu'
output = self.compute_gain(items, False)
self.method = method
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, yes—mutating and restoring the object's state is a bit hacky. Maybe this decision should be made inside compute_gain? The idea would be to write a clear policy about when the "format-specific default" overrides the global default or the user's configuration.

self._log.info(u"ReplayGain error: {0}", e)
except FatalReplayGainError as e:
raise ui.UserError(
u"Fatal replay gain error: {0}".format(e))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm; I see what you mean about duplicate structure between the two branches. It does seem like a good idea to trim this down, but I don't have any non-obvious suggestions at the moment…

u"ReplayGain backend failed for track {0}".format(item)
self._log.debug(u'format: {0}', item.format)

if item.format == 'Opus':
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe some manner of global function to make this decision (e.g., should_use_r128) would be useful to encapsulate the policy here?

@sampsyo
Copy link
Member

sampsyo commented May 13, 2017

Removal of REPLAYGAIN_ tags when tagging with override set - right now, my files are incorrectly tagged with these and I'd like the R128_ to replace them. When/where should it be done, should it be configurable?

Good question. I don't have a strong feeling whether it should be configurable (maybe some people want both sets of tags for maximum compatibility? Or maybe we should force compliance with the RFC's recommendation, so adding R128 tags clobbers RG tags but not the other way around?). I imagine it should be done in the store_track_gain/store_album_gain functions (or rather, the new versions of the same).

@autrimpo
Copy link
Contributor Author

I've cleaned up the code a little, the decision making is now done by a method and controlled by a setting (whitelist of formats). Unfortunately I can't delete the REPLAYGAIN_ tags, only set their value to be empty. Any hints on how to do that would be appreciated.

@sampsyo
Copy link
Member

sampsyo commented May 15, 2017

Sadly, actually deleting the tag is a little more difficult than it should be. (Real, general support for deleting tags is a long-standing debacle—see #919, for example.) The actual answer is that we need a way to invoke del mediafile.rg_track_gain inside the Item.write() method when the corresponding Item field is None. As pedantic as this sounds, can we perhaps address this in a separate PR? Adding this tag-deletion ability would be really useful in general.

Copy link
Member

@sampsyo sampsyo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome! This design seems quite a bit simpler, and the configurable list of format names seems like the right thing.

A note for the future: it looks like the only trouble in the tests so far is that we need to add the new field name to the MediaFile tests. https://travis-ci.org/beetbox/beets/jobs/232405195#L894

@@ -822,6 +823,9 @@ def __init__(self):
if self.config['auto']:
self.import_stages = [self.imported]

# Formats to use R128.
self.r128_whitelist = self.config['r128'].as_str_seq()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool! This configurable format list makes sense to me. (We'll need to add it to the documentation eventually—unless we want to leave it undocumented, if we think people shouldn't be fiddling with it.)

self.r128_backend_instance = ''

def should_use_r128(self, item):
return item.format in self.r128_whitelist
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A docstring for this method might be useful.

raise ui.UserError(
u'replaygain initialization failed: {0}'.format(e))

self.r128_backend_instance.method = '--ebu'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method body looks like it might have gotten indented one step too far.

@autrimpo
Copy link
Contributor Author

I see, in that case I'll skip the tag removal functionality for now. I'll add the tests, and I of course plan to update the plugin documentation to reflect the final implementation.

@autrimpo autrimpo force-pushed the r128 branch 4 times, most recently from 2b9ab76 to 2cde5c0 Compare May 15, 2017 19:01
@autrimpo
Copy link
Contributor Author

autrimpo commented May 15, 2017

I've added album gain, so this can be considered a 'final draft'.
However, the ALAC and MP4 tests are failing with Python 3. I've tried to debug it but don't know what the problem is.

I'm also making an assumption that all the files in an album are in the same format, but I think it's a fairly safe one to make and I can't really think of a way how to handle multiple formats and gain standards in one album in an elegant way. Maybe exiting gracefully and notifying the user would be a bit better if it happens?

MP4SoundCheckStorageStyle(
'----:com.apple.iTunes:iTunR128',
index=0
),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd recommend dropping the SoundCheck storage styles (this and the MP3 variant above). The SC format is included for the RG tags for compatibility with the iTunes-specific normalization metadata, which we should probably (eventually) break off as a third standard that deserves a third set of tags.

We can also cross our fingers and hope that this solves the crash on Python 3…

@sampsyo
Copy link
Member

sampsyo commented May 16, 2017

Great! I think the error probably has to do with the SoundCheck conversion stuff, so we may be able to sidestep it altogether. In short, the MediaFile abstraction layer is trying to data stored as bytes in the MPEG-4 format as a Unicode string, which won't do.

Good point about heterogenous albums. Yeah, just logging an error and "giving up" on the album (raising a ReplayGainError) would be nice—and slightly better than silently assuming the format from the first track on the album.

@autrimpo autrimpo force-pushed the r128 branch 2 times, most recently from 13eca81 to 856fc9b Compare May 16, 2017 19:34
@autrimpo
Copy link
Contributor Author

autrimpo commented May 17, 2017

Alright, I've finally fixed the tests.

MP4 tags starting with ----: are encoded as unicode. The float branch of _safe_cast handled the decoding fine but the int branch had no such logic, that's why REPLAYGAIN_ passed fine but R128_ struggled. I've also taken the liberty of adjusting the regex a little, to support a sign prefix. I think the spec allows both in the tag and - is actually very common, as with replaygain values.

If everything looks fine I'll rewrite the history to something prettier (I was thinking 3 commits - the _safe_cast fix, the tests and finally the implementation itself) and update the documentation (fourth commit).

Copy link
Member

@sampsyo sampsyo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work tracking that down! This looks great—thanks for all your effort.

I personally don't mind a "messy" git history, but if you're inclined to clean things up, that sequence of 3 commits sounds great.

if not isinstance(val, six.string_types):
if isinstance(val, bytes):
val = val.decode('utf-8', 'ignore')
else:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool; this looks right. Just to clarify what's going on, may I recommend elif not isinstance(val, six.string_types) here?

val = six.text_type(val)
# Get a number from the front of the string.
val = re.match(r'[0-9]*', val.strip()).group(0)
val = re.match(r'[\+-]?[0-9]*', val.strip()).group(0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Good call.

),
MP3DescStorageStyle(
u'r128_track_gain'
),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It belatedly occurs to me to recommend that we don't try use capitalizations here. It was useful for RG, where there are apparently both styles in use, but here we're essentially "inventing a standard"—I don't see any prior art yet indicating one or the other. We can always add both later if someone complains.

The only problem is I'm not sure which to choose… maybe lower case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, the RFC always refers to it in upper case, so I guess we should go with that for consistency.

@autrimpo
Copy link
Contributor Author

Cleaned up, updated the docs.

@sampsyo sampsyo merged commit 8168a88 into beetbox:master Jun 11, 2017
sampsyo added a commit that referenced this pull request Jun 11, 2017
RFC: replaygain: R128 support
sampsyo added a commit that referenced this pull request Jun 11, 2017
@sampsyo
Copy link
Member

sampsyo commented Jun 11, 2017

Thank you for your patience! I've finally done a complete review and merged. This looks great. ✨

@autrimpo
Copy link
Contributor Author

Glad to hear that!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants