Fixes #1210: Skip non-audio tracks from MusicBrainz #2776

Merged: sampsyo merged 3 commits into beetbox:master from skip-non-audio-tracks on Jan 1, 2018

Conversation

nguillaumin
Contributor

This ignores non-audio tracks during import:

  • Data tracks, based on their title [data track] (which seems to be
    the MusicBrainz convention, as there's no specific flag to indicate
    that a track is a data one),
  • Video tracks, based on the video=true attribute.

It's similar to the Picard changes mentioned in #1210, except it doesn't
deal with [silence] tracks: These ones will probably require a setting
to let the user control if they should be imported or not.
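
A minimal sketch of the filtering described above, not the actual patch. It assumes each track is a dict shaped like a MusicBrainz web-service result (a `recording` dict with a `title` and an optional `video` string). The helper names `is_data_track`, `is_video_track` and `audio_tracks` are made up for illustration.

```python
DATA_TRACK_TITLE = '[data track]'


def is_data_track(track):
    """Data tracks can only be recognised by their conventional title."""
    return track['recording'].get('title') == DATA_TRACK_TITLE


def is_video_track(track):
    """Video tracks carry video='true' on the recording."""
    return track['recording'].get('video') == 'true'


def audio_tracks(tracks):
    """Return the tracks that are neither data nor video tracks."""
    return [t for t in tracks
            if not is_data_track(t) and not is_video_track(t)]


# Invented sample data, for illustration only.
tracks = [
    {'recording': {'title': 'Song'}},
    {'recording': {'title': 'Clip', 'video': 'true'}},
    {'recording': {'title': '[data track]'}},
    {'recording': {'title': '[silence]'}},
]
kept = [t['recording']['title'] for t in audio_tracks(tracks)]
```

Note that `[silence]` tracks pass through untouched here, matching the scope described above.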

Users may want to keep tracking video tracks, for example if they rip
the audio part of the video tracks. Added a setting to allow this.
@nguillaumin nguillaumin deleted the skip-non-audio-tracks branch December 31, 2017 18:50
@nguillaumin nguillaumin restored the skip-non-audio-tracks branch December 31, 2017 18:55
@nguillaumin nguillaumin reopened this Dec 31, 2017
@nguillaumin
Contributor Author

Hm, I added a setting to continue including video tracks following @pprkut's comment on #2688, but I realize that we may also need a setting to continue including non-audio media too. Is that correct, @pprkut: would we need both?

Member

@sampsyo sampsyo left a comment

This looks awesome; thank you! I have only one comment from a code review.

The configuration option seems like a good idea. I'm OK with merging this as-is, and we can then sort out how the configuration options should work. Specifically, I would be interested to hear from @pprkut whether one configuration option that controls both video tracks and video media would be the right thing. If so, we might consider simplifying the name to ignore_video.


    if ('video' in track['recording'] and
            track['recording']['video'] == 'true' and
            config['match']['ignore_video_tracks'].get(bool) is True):
Member

No need for is True—calling .get(bool) is guaranteed to return a bool.

In fact, it's actually possible to drop that part too—Confit automatically gets the truthiness of a view when it's used as a boolean.
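
A small sketch of the simplification suggested here. To keep it self-contained it does not import the real configuration library: `StubView` is a hypothetical stand-in that only mimics the two behaviours mentioned in this comment (`.get(bool)` returning a bool, and the view being usable directly as a boolean). `should_skip` is likewise an invented name.

```python
class StubView:
    """Hypothetical stand-in for a configuration view; NOT the real class."""

    def __init__(self, value):
        self._value = value

    def get(self, typ=None):
        # Mimics `.get(bool)`: coerce to the requested type if one is given.
        return typ(self._value) if typ else self._value

    def __bool__(self):
        # Mimics the view taking on the truthiness of its value.
        return bool(self.get())


def should_skip(track, ignore_video_tracks):
    """Skip flagged video tracks when the setting is enabled."""
    # No `.get(bool) is True` needed: the view is used as a boolean.
    if track['recording'].get('video') == 'true' and ignore_video_tracks:
        return True
    return False


video = {'recording': {'title': 'Clip', 'video': 'true'}}
audio = {'recording': {'title': 'Song'}}
```

With this stand-in, `should_skip(video, StubView(True))` is true, while a disabled setting or an unflagged track yields false.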

Contributor Author

Fair enough, I changed it

@nguillaumin
Contributor Author

Thanks!

Regarding the settings I thought it would be good to split them regardless, because then we could have a list of formats to ignore, rather than hard-coding them in the code (Data CD, DVD, DVD-Video, etc). That way users could choose exactly what format they want to ignore or not. But let's see what @pprkut thinks.

@sampsyo
Member

sampsyo commented Jan 1, 2018

Great! I'll merge this initial version for now, and we can keep the discussion going about how to set up the config options.

@sampsyo sampsyo merged commit 3272b93 into beetbox:master Jan 1, 2018
@pprkut
Contributor

pprkut commented Jan 8, 2018

Hey! Sorry for the late reply! I would indeed also like to be able to import non-audio media, but wouldn't that work with the same setting? Ideally, video media would have video=true for all tracks on it. Maybe that same fact can be used to remove the hardcoded media format list? I.e. ignore media where all tracks are video tracks
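
The idea of ignoring media where all tracks are video tracks could look roughly like the sketch below. The `track-list` key and the `medium_is_all_video` helper are assumptions about the data shape, not code from this PR.

```python
def medium_is_all_video(medium):
    """True when the medium has tracks and every one is flagged video."""
    tracks = medium.get('track-list', [])
    # all() is True for an empty list, so require at least one track.
    return bool(tracks) and all(
        track['recording'].get('video') == 'true' for track in tracks
    )


# Invented sample media, for illustration only.
dvd = {'format': 'DVD-Video', 'track-list': [
    {'recording': {'title': 'Live clip', 'video': 'true'}},
    {'recording': {'title': 'Interview', 'video': 'true'}},
]}
mixed = {'format': 'CD', 'track-list': [
    {'recording': {'title': 'Song'}},
    {'recording': {'title': 'Bonus clip', 'video': 'true'}},
]}
```

This only works as a replacement for a hard-coded format list if every track on a video medium really is flagged in the data.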

@nguillaumin
Contributor Author

I think this is how it will work with that change plus a couple of others that were merged:

  • Specific media formats will be ignored (e.g. DVD-Video). The list of formats to ignore is configurable.
  • Tracks with video=true will be ignored. This is configurable too.
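
As a rough sketch of how the two configurable checks combine: the plain-dict config, the `ignored_media` key name and the `keep_tracks` helper are illustrative assumptions; only `ignore_video_tracks` is a name taken from this thread.

```python
def keep_tracks(release, config):
    """Apply the medium-level check first, then the track-level check."""
    kept = []
    for medium in release['medium-list']:
        # Medium-level setting: skip whole media of an ignored format.
        if medium.get('format') in config['ignored_media']:
            continue
        for track in medium['track-list']:
            # Track-level setting: skip recordings flagged video='true'.
            if (config['ignore_video_tracks'] and
                    track['recording'].get('video') == 'true'):
                continue
            kept.append(track['recording']['title'])
    return kept


# Invented release and config values, for illustration only.
release = {'medium-list': [
    {'format': 'CD', 'track-list': [
        {'recording': {'title': 'Song'}},
        {'recording': {'title': 'Bonus clip', 'video': 'true'}},
    ]},
    {'format': 'DVD-Video', 'track-list': [
        {'recording': {'title': 'Concert film'}},
    ]},
]}
config = {'ignored_media': ['DVD-Video'], 'ignore_video_tracks': True}
kept = keep_tracks(release, config)
```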

@sampsyo
Member

sampsyo commented Jan 9, 2018

I guess I read @pprkut’s question as asking whether checking for video=true is enough by itself. Would that catch all the tracks on video media?

@nguillaumin
Contributor Author

Ah I see. Well, I have no idea 😉 We'd have to find a MusicBrainz example of a video track that doesn't have video=true. Looking at the related Picard changes it seems video=true is the only way to flag a video track.

@pprkut
Contributor

pprkut commented Jan 10, 2018

One could argue that a video track not marked as video=true would be something to fix in musicbrainz and not in beets ;)
I was trying to come up with a use-case where you'd want video tracks, but not if they are on certain formats, but it's getting really theoretical. I'd think that just checking for video=true would cover 99% of the use cases.

@sampsyo
Member

sampsyo commented Jan 10, 2018

Got it. If that's the case then, @nguillaumin, perhaps we can remove the medium-level version of the setting? Just filtering on video=true might be all we need.

@nguillaumin
Contributor Author

Hm, I'm keen to keep it:

  • I think it makes sense to just ignore media that can't hold audio data. There's no point in parsing UMD or CD-Rom media for example?
  • There are some (a lot?) of cases where tracks on a video-type media are not flagged as video. For example this and this. I'm not sure that's a MusicBrainz issue because the media is of type DVD or Blu-ray, so there's no need to flag each track as video=true. My understanding of this flag is that it's only useful for mixed-media, like a CD with a video track, or Digital Media that contains a video file.

@nguillaumin
Contributor Author

Up to you of course, I won't object if you want to remove it 😉

@pprkut
Contributor

pprkut commented Jan 11, 2018

> There are some (a lot?) of cases where tracks on a video-type media are not flagged as video. For example this and this. I'm not sure that's a MusicBrainz issue because the media is of type DVD or Blu-ray, so there's no need to flag each track as video=true. My understanding of this flag is that it's only useful for mixed-media, like a CD with a video track, or Digital Media that contains a video file.

The video flag is on the recording, not the track. You don't actually see the media a recording is on. If there are releases with tracks that are clearly video, but are not marked as such, that should be fixed in musicbrainz. Note however that a track being on a DVD-Video medium (for example) doesn't mean it's a video track. I have some video DVDs containing audio only tracks.

@pprkut
Contributor

pprkut commented Jan 11, 2018

> I think it makes sense to just ignore media that can't hold audio data. There's no point in parsing UMD or CD-Rom media for example?

I don't think that's a safe assumption to make. Those are data disks and could potentially hold audio data. I, for example, do have a release that has a bunch of MP3 files on a CD-ROM (there are no audio tracks next to the data track).

@nguillaumin
Contributor Author

Fair enough, if that's the case. Do you have any doc explaining this? I was trying to look up docs about the video attribute and how non-audio media is supposed to be entered in MusicBrainz but didn't find anything...

@pprkut
Contributor

pprkut commented Jan 13, 2018

@Freso might be able to answer that better than me, but I'll try :)
Guidelines mostly depend on the actual release. Situations can be different, so making general rules about how non-audio tracks should be entered is difficult.
There are pages explaining how (and when) to mark tracks as video. There's another one for data tracks.

Anything else you'd be interested in in particular?

@nguillaumin
Contributor Author

I guess I was after something that would confirm that video needs to be set to true, even for recordings that are only on video-only media (e.g. a DVD-Video). What happens for example for CD+DVD releases of live shows that have exactly the same content, but the DVD has video and the CD doesn't. Are 2 recordings created?

> I have some video DVDs containing audio only tracks.

I'm not sure I understand this? Either it's a DVD-Audio, but if it's a DVD-Video then it's a video track (even if it does only play audio when you play it on your TV, it needs to be played on a DVD-Video player). Perhaps it's more a matter of interpretation, i.e. is it video because it has actual video content, or because it's played through a video medium?

@pprkut
Contributor

pprkut commented Jan 13, 2018

> Check the "Video" checkbox if this is a video recording (an audio track uploaded to Youtube with a static photo does not qualify as a video, this should be used only for actual videos).

That's a quote from here. Nothing is marked as video by default. There's an open feature request for musicbrainz to mark tracks automatically as video if they are on a video medium, but for now everything has to be done explicitly.

> What happens for example for CD+DVD releases of live shows that have exactly the same content, but the DVD has video and the CD doesn't. Are 2 recordings created?

Yes, there will be 2 recordings. One for the audio-only version, and one for the video.

> I'm not sure I understand this? Either it's a DVD-Audio, but if it's a DVD-Video then it's a video track (even if it does only play audio when you play it on your TV, it needs to be played on a DVD-Video player). Perhaps it's more a matter of interpretation, i.e. is it video because it has actual video content, or because it's played through a video medium?

It's video if the audio is linked to moving pictures. A static background picture (like mentioned for youtube tracks above) doesn't make it a video. I also have a DVD though where when ripping the track that plays the audio you'd really get audio-only, not even a single-frame background picture attached to it.

@nguillaumin
Contributor Author

Thanks, that clarifies things a bit.

I still have a gut feeling there's value in having a setting to choose which media to include / exclude when importing, but I can't give a clear reason as to why, so if you want to take it away that's fine.

@sampsyo
Member

sampsyo commented Jan 31, 2018

Hi! Just checking back in on this: as it currently stands, we have two settings, right? It seems OK to keep them both, but maybe it would be useful to reexamine the defaults. For example, do most users really need to skip both video=true tracks and the list of non-audio media types? Or is there any chance that could cause weird behavior, so that we should just stick with one or the other being enabled by default (for example, just ignore_video_tracks might make the most sense)?

@nguillaumin
Contributor Author

Hehe, I think that's a tough question to answer. I would tend to think that most users would skip non audio tracks and media, because that's what I do. Others will think differently...

If there's a channel through which you can survey users, that would be ideal, but if not then I suspect all we can do is guess and take a chance.

@sampsyo
Member

sampsyo commented Jan 31, 2018

Well, I suppose my question boils down to: are there examples of situations where the two settings differ? That is, are there tracks that will be caught by one setting but not the other? I get the sense that the answer is "ideally, no," but looking for real examples one way or the other might help.

@nguillaumin
Contributor Author

Hm... I'm not exactly sure what you're after 😉 but basically, as per the previous discussion, MusicBrainz makes the distinction of "Video media format" and "Video recordings", like we would do with these two settings.

So you can have a media (e.g. Digital Media) with a single video recording (that may or may not get ignored depending on ignore_video_tracks): https://musicbrainz.org/ws/2/release/49da37ee-065a-4d7f-a204-9dda8047aad4?inc=recordings (See last track).

Or you can have a Video media (DVD) where the tracks were not flagged individually as being video: https://musicbrainz.org/ws/2/release/c9ca3d33-881c-41ad-8ad0-a6edd1aec185?inc=recordings (See the 2nd media). That may well be a MusicBrainz issue, but it exists.

Does that help?

@sampsyo
Member

sampsyo commented Jan 31, 2018

I think I see! Here's another way of putting my question (with possible answers): what are some of the actual differences in the experience of users with these three settings?

  1. Ignoring video media, and ignoring video tracks (the current default).
  2. Ignoring video media but not tracks.
  3. Ignoring video tracks but not media.

It seems like the answer is that there may be some tracks that don't get filtered out in 3, but that's probably a MusicBrainz data problem that should be fixed—or a legitimate case of audio tracks appearing on typically-video media. But I don't think we've seen an example of anything meaningful getting missed in case 2.

I think I lean toward making option 3 the default. We may miss some video tracks that way, but that seems like it's OK. It's a good semantic match for what most users probably actually want. And leaving the option to ignore entire media, for people who want it, will help address the remaining tracks.
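
To make the three combinations concrete, here is a small sketch that runs them over two invented releases modelled on the kinds of cases mentioned in this thread: a digital release with one flagged video recording, and a CD+DVD release whose DVD tracks are not flagged. The data, the `VIDEO_MEDIA` list and the `surviving` helper are all illustrative assumptions, not real MusicBrainz data or beets code.

```python
VIDEO_MEDIA = {'DVD', 'DVD-Video', 'Blu-ray'}  # illustrative format list

# The three numbered options above, as (ignore media, ignore tracks).
OPTIONS = {
    1: {'ignore_media': True, 'ignore_tracks': True},
    2: {'ignore_media': True, 'ignore_tracks': False},
    3: {'ignore_media': False, 'ignore_tracks': True},
}


def surviving(release, ignore_media, ignore_tracks):
    """Return titles of tracks that would still be matched."""
    titles = []
    for fmt, tracks in release:
        if ignore_media and fmt in VIDEO_MEDIA:
            continue
        for title, is_video in tracks:
            if ignore_tracks and is_video:
                continue
            titles.append(title)
    return titles


# Each release is a list of (format, [(title, flagged_as_video), ...]).
digital = [('Digital Media', [('Song', False), ('Bonus clip', True)])]
cd_dvd = [('CD', [('Live 1', False)]),
          ('DVD', [('Live 1 (film)', False)])]

results = {n: (surviving(digital, **opts), surviving(cd_dvd, **opts))
           for n, opts in OPTIONS.items()}
```

On this made-up data, option 3 keeps the unflagged DVD track, while option 2 keeps the flagged clip on the digital release; option 1 drops both.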

@nguillaumin
Contributor Author

Hm, I'm not sure. For me the default that makes sense is 1. (which is why I implemented it 😉).

I don't rip the audio track of video media (either a full media like DVD or a single video track), I only rip audio CDs. Beets currently reports video tracks (media or single track) as "missing" when I import my media. With the new default, these tracks would not be reported missing. I would be inclined to think that's what most users want: it has been requested in #1210 and #2688, which is where all of this is coming from. See this comment on #2688:

"In general I think the right thing to do is that all media files that are marked as video in MB should not be considered for matching. After all beets currently only deals with audio files."

Other users are ripping video tracks; in that case the tracks weren't marked as missing by the current version of Beets. With the new default set to 1, these tracks will be marked "unmatched". If set to 3, some (but not all) will be marked "unmatched". You would be inclined to think that's what most users want.

> good semantic match for what most users probably actually want

Going back to my initial point, unless you have survey data, I think it's hard to claim this. #1210 and #2688 seem to show that some users want to ignore video tracks and media. Is that just some, or most?

To sum up, I don't think there's a good default unless we have more insight. Neither of us is "wrong", it's just that we use the software differently.

We have to make a choice though, so I think I would go either with ignore all, or ignore none. It's likely that users either import audio only (like me), or import everything (video media and individual tracks like @pprkut ). Using 2. or 3. as a default may actually cover the smallest portion of users, the ones who import only individual video tracks but not full media, or the ones importing full media but no video tracks. Though again, it's a guess 😅

@sampsyo
Member

sampsyo commented Jan 31, 2018

Certainly we can’t know what people want without a survey, but I can anticipate what will be the least confusing. :) That user above, for example:

> all media files that are marked as video in MB should not be considered for matching

is talking about tracks, not media, which suggests they expect option number 3. That’s what I mean by “semantic match”: the tracks thing is the easiest to explain, even if it might make some “mistakes” in some circumstances.

So I think I’ll make that the default. Fortunately, your options now make it easy to get exactly the behavior that anyone might want!

@sampsyo
Member

sampsyo commented Jan 31, 2018

Also! If you're interested, probably the best way to get feedback would be a quick post on our forum: https://discourse.beets.io

@pprkut
Contributor

pprkut commented Jan 31, 2018

I agree, option 3 sounds like a sensible default approach, if only because it doesn't hide musicbrainz data problems/inconsistencies/etc and might trigger people to improve the data there instead of hacking around it in beets.

Just for completeness though, a case that might be missed in scenario 2 is videos in data tracks. Beets doesn't handle data tracks yet, but once it does, those would qualify as video recordings not on video media. They would be covered by scenario 3 though, if those tracks are properly marked as video in musicbrainz.

A different question though. What would be the behavior if you ignore video tracks (and/or video media) and you want to import one that is only that, for example a live DVD or some such? I would imagine that beets would try to match a release with 0 tracks on it? Or would it abort that process somehow?

This could hit people new to beets who don't know that video tracks are ignored by default, or maybe don't even know that the tracks they want to import were originally video tracks. Could be nice if they'd get some feedback then in the form of "Hey, you're trying to import a video release, but beets is currently configured to ignore those. Read more about it here" or some such.

@sampsyo
Member

sampsyo commented Jan 31, 2018

That's a great point. It would, the way we've currently implemented it, just say the release has zero tracks, so all the ones you're trying to import are "extra."

One easy (but incomplete) way to address this would be to add some debug logging, at least, that enumerates all the tracks that are skipped because they're video tracks. Then people at least have a hope of figuring out by looking at the verbose log.
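
A rough sketch of that kind of debug logging, using only the standard `logging` module. The logger name, the `drop_video_tracks` helper and the message wording are assumptions for illustration, not the eventual implementation.

```python
import logging

log = logging.getLogger('beets')  # logger name is an assumption


def drop_video_tracks(tracks):
    """Filter out video tracks, logging each one that is skipped."""
    kept = []
    for track in tracks:
        recording = track['recording']
        if recording.get('video') == 'true':
            log.debug('skipping video track: %s', recording.get('title'))
            continue
        kept.append(track)
    if tracks and not kept:
        # The zero-track case discussed above: say why nothing is left.
        log.debug('all %d tracks were skipped as video tracks', len(tracks))
    return kept
```

With verbose logging enabled, a user importing a video-only release would then at least see one line per skipped track plus a summary line.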

@nguillaumin
Contributor Author

> that user above, for example:
> is talking about tracks, not media

I don't think that's the case. If you read the full issue it's originally about a CD+DVD, so it's a media issue:

> Matching a folder containing all audio files of a MB release made of CD+DVD for instance fails since beets does take into account also the video files.
> For instance (1 CD+ 1DVD): [...]

I'm reading that as "all non audio files should be skipped", probably because they don't know the subtleties of MB regarding media and recordings. Perhaps we can ask them, @zuzzurro.

As you say the options are flexible enough to do whatever users may want, so I'm glad we kept both in the end 😉 3. is fine, as long as I can reconfigure it. 😄

sampsyo added a commit that referenced this pull request Feb 8, 2018
We determined on the PR thread that ignoring video tracks is enough, and
ignoring typically-video media has more pitfalls.