Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Changes to add work id as a tag. #2619

Closed
wants to merge 5 commits into from
Closed

Conversation

mhendu
Copy link

@mhendu mhendu commented Jul 1, 2017

This does not currently function - need advice or help from someone more familiar with programming than I.

This does not currently function - need advice or help from someone more familiar with programming than I.
@mention-bot
Copy link

@mhendu, thanks for your PR! By analyzing the history of the files in this pull request, we identified @sampsyo, @geigerzaehler and @jrobeson to be potential reviewers.

@mhendu mhendu mentioned this pull request Jul 1, 2017
@mhendu
Copy link
Author

mhendu commented Jul 1, 2017

I don't think work_disambig does anything in this case, so probably an unnecessary change, but not really sure how to get this working.

Copy link
Member

@Freso Freso left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think what is making this not work is that you haven't changed the MusicBrainz queries to request the Work information? Not sure though and don't have time to dig into the code right now. @sampsyo?

@@ -157,9 +158,10 @@ def __init__(self, title, track_id, artist=None, artist_id=None,
medium_total=None, artist_sort=None, disctitle=None,
artist_credit=None, data_source=None, data_url=None,
media=None, lyricist=None, composer=None, composer_sort=None,
arranger=None, track_alt=None):
arranger=None, track_alt=None, work_id=None):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would put the work_id=None into the same place it's put in all the other contexts (ie., after track_id). Other places should be using artist=…, … to specify the artist etc. specifically. If they don't and thus break by this, we've found places that need fixing.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, did this and made the other suggested update but no luck. Any other thoughts?

Copy link
Member

@sampsyo sampsyo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for getting this started! I think the problem is in updating TrackInfo, as I indicated inline.

for work_relation in recording.get('work-relation-list', ()):
if work_relation['type'] != 'performance':
continue
work_id=work_relation['work']['id']
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like you're assigning to a new variable work_id here, but this is never making it into the TrackInfo object. You might want to use info.work_id = ... here.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, tried that but no luck. Anything else I'm missing?

@sampsyo
Copy link
Member

sampsyo commented Jul 1, 2017

OK! If those didn't work, the next thing to try is print-debugging. That is, try putting print(work_id) and such after the various places where you've changed things—this can help you see whether the data is what you think it is throughout the execution.

@mhendu
Copy link
Author

mhendu commented Jul 2, 2017

Well I spoke too soon - when searching by a release it does return the work id, but if looking for a singleton it does not do this. Not really clear to me why this is the case. Any suggestions would be appreciated.

When I put in 'print(work_id)' immediately following the statement in mb.py that is intended to populate the work_id it just prints brackets ([]). Also, it writes the work id to an M4A file but not to an MP3 file (although both have the work id in beets).

@mhendu
Copy link
Author

mhendu commented Jul 2, 2017

Fixed MP3 tagging. Issue RE: singletons remains.

@mhendu
Copy link
Author

mhendu commented Jul 2, 2017

OK, so I could be wrong, but it appears to be an issue with this pull request:

#2333

I do have an updated version of musicbrainzngs and confirmed by printing the track_includes variable that it shows work-level-rels. When I print the results of track_includes (for a track that does have a work id and a composer) I don't see this data included. Composer doesn't seem to be available for anything that's in track_includes. Neither does work id, but this is available in work-rels, which is a valid parameter for a recording.

If you update musicbrainz.py to allow for work-rels as a parameter for recordings then my fork should work; you also have to update mb.py to include work-rels in track_includes. I haven't placed that updated version here. The issue with composers at the track level remains. Not sure of a good way around that.

@sampsyo
Copy link
Member

sampsyo commented Jul 2, 2017

Hmm… are you suggesting that we should change the work-level-rels include to just work-rels? That sounds possible! If it seems like the right thing to you, perhaps it's time for a second pull request?

(And is that the same root cause as the singletons issue you pointed out?)

@mhendu
Copy link
Author

mhendu commented Jul 2, 2017

Well I think (could be mistaken) that an update to MusicBrainzNGS is required to add work-rels as an include for recordings. I don't know if you need work-level-rels or not; I just left it in and it seems to work fine. I can fork MusicBrainzNGS and make the update there, but for this fork of beets to give the expected result I do think that MusicBrainzNGS must also be updated.

@sampsyo
Copy link
Member

sampsyo commented Jul 2, 2017

Aha! If I understand correctly from the docs, it looks like work-rels is supposed to be used on track requests and work-level-rels is supposed to be used for album requests. Does that match your experience? Can you perhaps to a little experimentation to see whether we need one, the other, or both to cover all cases?

Thanks for looking into this!

@mhendu
Copy link
Author

mhendu commented Jul 9, 2017

I didn't see any indication that work-level-rels does anything for a track lookup. For instance, these two queries return the same result:

https://musicbrainz.org/ws/2/recording/56322a25-62ee-4f2a-8af8-db06681a35cd
https://musicbrainz.org/ws/2/recording/56322a25-62ee-4f2a-8af8-db06681a35cd?inc=work-level-rels

Adding work-rels gets more specific information about the work (but does not return the composer):

https://musicbrainz.org/ws/2/recording/56322a25-62ee-4f2a-8af8-db06681a35cd?inc=work-rels

@sampsyo
Copy link
Member

sampsyo commented Jul 9, 2017

That makes sense! In that case, I'm inclined to agree that extending the MB library is the right way to start.

@mhendu
Copy link
Author

mhendu commented Jul 9, 2017

OK, spoke too soon. I think work-level-rels is intended to be combined with other lookups to return results. See for instance:

http://musicbrainz.org/ws/2/recording/aff7db3a-b566-47ef-bb44-b11fa2a88295?inc=work-rels+artist-rels+artist-credits+work-level-rels

Now without:

http://musicbrainz.org/ws/2/recording/aff7db3a-b566-47ef-bb44-b11fa2a88295?inc=work-rels+artist-rels+artist-credits

You need artist-rels, work-rels and work-level-rels to get the composer. Omitting any one of those will return results without the composer.

@dosoe
Copy link
Contributor

dosoe commented Jul 20, 2017

Hi! Concerning the work-level-rels vs work-rels I use work-rels in this plugin: #2580 and it works fine (however I use the get_work_by_id function, I think beets uses the get_recording_by_id or get_album_by_id function) . Maybe I could add work_id (and parent_work_id) in the plugin? Or would you prefer to have it in the main code, since the parentwork plugin is agonizingly slow (a few seconds per recording) because he also fetches the related works? If you do prefer it, I don't need to put it into the parentwork plugin. However, the way it's done there could be interesting. Btw, to be honest it seems weird to me to fetch the work_id (which nobody can do anything with, it's only for the database) and not fetch the work title and disambiguation. Concerning musicbrainzngs, I did have problems with it (see #2333 comment of 11th of May) so I had to use the git version and not the standart version.

@dosoe
Copy link
Contributor

dosoe commented Jul 20, 2017

I actually fetch the work-ids in the parentwork plugin (#2580) but I use them only for disambiguation.

@dosoe
Copy link
Contributor

dosoe commented Jul 20, 2017

Just genuinely wondering: why, of all things, do you want the work_id (and not title)?

@sampsyo
Copy link
Member

sampsyo commented Jul 20, 2017

Good question. A good policy might be to fetch everything we can in core that doesn’t require additional MusicBrainz API request. Beyond that, everything else can be left to the plugin. Does that help draw the appropriate line?

If we end up with the work MBID but not the work’s title by applying this policy (I’m not sure whether this is the case), then that might help speed up the plugin’s operation because it can start by looking up by work ID.

@mhendu
Copy link
Author

mhendu commented Jul 21, 2017

@dosoe to answer your question, I wanted work_id because it's a unique identifier that I can theoretically use to build playlists with covers, remixes, alternate versions of songs that I like. The work title wouldn't allow you to do this since it may not be unique. The problem is that a large portion (probably the majority) of the recordings in Musicbrainz don't have a corresponding work_id, so this is nice in theory but maybe not practical except for really well-known recordings and well-known covers.

Makes sense to me to have all the info beets can grab from Musicbrainz without additional API pulls placed into the database / tags - seems like that would be desirable.

@dosoe
Copy link
Contributor

dosoe commented Jul 21, 2017

Hi! I am contributing to MB, and in my experience, classical music is pretty well covered concerning work relationships (even if there is still plenty of work) and my parentwork plugin provides the links to add works on recordings. On non-classical music however the situation is not that good, since there the notion of work is much less central.
Concerning what can be done with only one API query, there is a lot that can be done. I know for sure that we can fetch the work title, disambiguation, id, related artists, related works like medleys, arrangements, and parent works up to level one, catalog name and number. As an example: https://musicbrainz.org/release/7e77cba7-5584-4f58-9da2-ee24af146eb7 Everything that is visible on this page can be fetched with one API request.

So what I can do is make beets core fetch work title, disambig and id and then use the fetched work id in the parentwork plugin. This will speed up the plugin because it means one less api query. I can add a log that explains how to add works for recordings that don't have some as I do in the parentwork plugin.
However that means that the parentwork plugin and this commit will be linked, the parentwork plugin will not be able to work without this work_id-fetching feature, so the pull requests will somehow be linked together.
Practically, how should I do this? Just do everything in one pull request (the one of the parentwork plugin)? Make a new pull request for it and condition the parentwork plugin to it? Use this pull request I don't have ownership rights on?

@sampsyo
Copy link
Member

sampsyo commented Jul 21, 2017

A fresh PR never hurts! They’re free to create. 😃 Just clearly describing the differences between multiple options can help organize the effort.

@dosoe
Copy link
Contributor

dosoe commented Jul 21, 2017

Ok, just wondering because all there PR are related. Should I adapt the parentwork plugin so that it fits with the future PR (just to be sure: This should fetch work title, id, disambiguation)? I added a work tag in the parentwork plugin, should I just move it to this new PR?

@sampsyo
Copy link
Member

sampsyo commented Jul 21, 2017

Yeah, absolutely! I like the idea of basing the parentwork plugin off of the new functionality, and moving the work tag here also makes sense. (Speaking personally, reusing an existing PR or opening a new one doesn’t make that much of a difference; whichever’s more convenient is fine.)

@dosoe
Copy link
Contributor

dosoe commented Jul 21, 2017

Well, the point is that I'm already using the parentwork plugin, so if I rewrite it and I make a separate branch with the work fetching, I will have one version of my code with the plugin (that wouldn't work because would be depending on the work fetching and therefore will hardly be testable), one with the work fetching, one with the performer fetching (#2563) and none of them does what I want it to do (fetch the parentwork), so I would have a forth version of beets with the code I'm actually using in real life... That's why I on one side push that you accept the pull request #2563 and on the other side would like to make the plugin and the parentwork fetching in one pull request. However if you consider that the plugin and the work fetching are two separate things I would understand and do it the complicated way.

@dosoe
Copy link
Contributor

dosoe commented Jul 21, 2017

Ok I did it in the pull request #2580 that also adds the parentwork plugin for the reasons stated above. It also uses the picard tag mapping to put the work tag on the files. I have one problem with the work_disambig tag: sometimes a recording contains multiple works, and some of them have a disambiguation, others don't. If I just create a plain list where the disambiguations, if they are present, are put in, then there will be an issue because it will be impossible to tell which disambiguation corresponds to which work (for example, if the first one has no disambiguation and the second has one, then work would look like: [work1, work2] and work_disambig would look like: [disambig] , however, the disambiguation corresponds to the second work). If I create a list with disambiguations and add empty strings for each work that doesn't have one, it will remove this problem but create another: if I make my paths like $work ($work_disambig) then if a work has no disambiguation, the path will look like: work () which is not satisfying either. Nevertheless, that is the solution I chose, but there could be a discussion about it. So all the work discussed above and in the previous comments is in #2580, I propose to continue the discussion there.

@dosoe dosoe mentioned this pull request Jul 21, 2017
@dosoe
Copy link
Contributor

dosoe commented Jul 23, 2017

I finished the implementation as described. Any comments?

arcresu pushed a commit to beetbox/mediafile that referenced this pull request May 19, 2019
@arcresu
Copy link
Member

arcresu commented May 19, 2019

Just a heads up that there's a plan to split MediaFile into a standalone Python package and git repository. The change here to add the work ID is affected by that, but since it seems like something we definitely want to add, I went ahead and migrated that change to the new repository as beetbox/mediafile#8.

@arcresu
Copy link
Member

arcresu commented May 31, 2019

Looks like we can close this one now that #3272 is merged 🎉

@arcresu arcresu closed this May 31, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants