Fetch work title from MusicBrainz #2452

dosoe · 2017-02-23T16:19:46Z

Hi!
I'm new to beets. I have a big collection of classical music that is in the MB database.
For classical music a straightforward way of ordering it is to order it first by composer, then by work and then by performer.
MusicBrainz has a relation "recording of" that relates a recording to a work. A work can have a relation "part of" that refers either to another work or to a catalogue. So the works are organised as a tree with the work-work relation "part of" or "part" as a link.
Would it be possible with beets to "climb up" the tree, so that I can create a folder named after the "parent work" that contains the recordings of the parts of that work?

As an example, if we look at this recording of the Matthäus-Passion of Bach: https://musicbrainz.org/release/51d08afb-0d81-4617-8c33-979806910ddf
There, every recording has a "recording of" relation followed by a work. This work is in turn part of the work "Matthäus-Passion, BWV 244: Teil ?", which is part of the work "Matthäus-Passion, BWV 244", which is part of the catalogue "Bach-Werke-Verzeichnis" with the number BWV 244.
Would it be possible to go up the ladder to the uppermost work "Matthäus-Passion, BWV 244" and optionally (not every composer has a complete catalogue, and not every work on MB is linked appropriately to the corresponding catalogue) to the catalogue number?
This would give:

Music/Bach, Johann Sebastian/Matthäus-Passion, BWV 244/Performer/St. Matthew Passion

(I choose sort names for artists; otherwise I get stuck with Russian, Japanese and other names that I can't read for Suzuki, Prokofiev etc.)
And here the next problem appears: whom do we choose as the performer? An obvious choice would be the recording artist, but this has two problems. In many cases, the recording artist is just the composer because no one updated it. In other cases, as in this release, the recording artist changes from track to track because, for example, the choir doesn't sing in every track. A solution might be to list all performers linked to those recordings in the album that are linked to the work. This seems to be the best solution to me. However, it doesn't work if we have, for example, a "best of" compilation of performances of a given work, but that is pretty rare (even if I have an example in my collection). Maybe there is a better solution.

Is there a way to implement this into beets?

Dorian

dosoe · 2017-02-23T16:35:13Z

Maybe, if a given performer has made several recordings of the same work, the date of the recording could be added.

dosoe · 2017-02-23T16:37:47Z

And if there is no performer or recording artist, use the release artist.
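The fallback order suggested in these comments could be sketched roughly as follows. The function and its argument names are hypothetical and do not correspond to existing beets fields; skipping a recording artist that equals the composer is an assumption based on the remark above that the recording artist is often just the composer.

```python
def choose_performer(performers, recording_artist, release_artist, composer):
    """Pick a performer label using the fallback order from the discussion.

    All argument names are hypothetical, chosen only for illustration.
    """
    # Prefer the performers linked to the recordings of the work.
    if performers:
        return "; ".join(sorted(set(performers)))
    # The recording artist is often just the composer; skip it in that case.
    if recording_artist and recording_artist != composer:
        return recording_artist
    # Last resort: the release artist.
    return release_artist
```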

sampsyo · 2017-02-24T04:37:00Z

Ok, sounds intriguing! How would you propose to expose the "parent work" information? Would we just record the title, or other stuff too? Do you have ideas for the names of the fields that should hold this information?

dosoe · 2017-02-25T20:01:19Z

Hi! Thanks for your reply. I would say the title and the disambiguation (for example for arrangements like https://musicbrainz.org/work/51bb8773-8492-3773-ab88-73a89c922c3d ). The composer information would be saved in a different tag.
I don't know how beets is built, but I would imagine adding two tags, "work" and "composer", and then having some routine that fetches the "parent works" and the "performers" from the MusicBrainz database. Or maybe even adding an additional tag for "parent work". This kind of data organisation is relevant especially for classical music; as far as I can tell, more modern music is rarely organised as several movements of a bigger piece.

sampsyo · 2017-02-25T20:09:15Z

OK, so just to summarize, you'd like to be able to access these fields in path templates, right?

$work: This would be the title of the directly associated work.
$composer: I think we already have this one, as of #2333 (Add Composer, Lyricist and Arranger tags).
$parentwork: The title of the parent work?
$parentwork_composer: Would this be relevant too?
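As a rough illustration of how such fields would end up in a path, here is a sketch that uses Python's `string.Template` as a stand-in for the beets path template engine (which is a separate implementation). The `$performer` field and all values are hypothetical; the resulting path matches the layout requested at the top of the thread.

```python
from string import Template

# Hypothetical field values for one track. The field names follow the
# proposal above and are not guaranteed to exist in beets.
fields = {
    "composer": "Bach, Johann Sebastian",
    "parentwork": "Matthäus-Passion, BWV 244",
    "performer": "Performer",
    "title": "St. Matthew Passion",
}

# string.Template only mimics the $field syntax of beets path formats.
path_format = Template("Music/$composer/$parentwork/$performer/$title")
path = path_format.substitute(fields)
print(path)
```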

dosoe · 2017-02-25T20:20:09Z

Ideally, $parentwork_composer would be relevant. In practice, I expect it to always be the same as $composer, but for the sake of completeness, if it doesn't involve a lot of work, I would take it as well.

dosoe · 2017-02-25T20:22:04Z

Another question: is there a way to use sort names for artists (and therefore composers) with beets?

sampsyo · 2017-02-25T20:27:04Z

OK, thanks. But just to be clear, $composer already does what you want, right?

If so, I'll make this ticket into a request to get the work title field. Then, as a second stage, we can consider doing the "parent work" thing to get copies of the relevant fields reflecting the parent work.

Yes, we do fetch artist_sort and albumartist_sort from MusicBrainz. If you're ever wondering about this kind of a thing, you can type beet fields to get a complete list.

dosoe · 2017-02-25T20:30:15Z

I never tried it out, but from what I can read in that conversation, yes. It even uses the "arranger" tag, which can be helpful as well. As I said, I'm new to beets; I'm just importing my collection right now.

sampsyo · 2017-02-25T20:31:52Z

OK, cool. Marked this as a feature request for that first stage.

dosoe · 2017-02-25T21:56:21Z

Thanks!

dosoe · 2017-04-14T14:44:03Z

Hi! I have seen that you added "composer" as a field, could it be possible to also make a field "composer_sort" (and "arranger_sort" by the way) or does this field contain it already?

sampsyo · 2017-04-15T15:30:39Z

Hi, @dosoe—that sounds like a separate feature request. Maybe it deserves a separate GitHub thread?

dosoe · 2017-05-15T22:15:06Z

Hi!
After having fun with 'composer_sort', 'arranger_sort' and 'lyricist_sort' (to be submitted), I'm trying to get 'work' and 'parent_work' settled.
For this I tried something out: I added this to the track_info function in beets/autotag/mb.py:

    for work_relation in recording.get('work-relation-list', ()):
        if work_relation['type'] != 'performance':
            continue
        work.append(work_relation['work']['title'])

This gives the title of the work. Then, inside the same loop:

            for parent_work_relation_1 in work_relation['work'].get('work-relation-list',()):
                if parent_work_relation_1['type'] != 'parts':
                    continue
                parent_work_1.append(parent_work_relation_1['work']['title'])

This gives the title of the work that the initial one is part of.

However, when I try to do the same to get the parent work of parent_work_1, I get nothing:
parent_work_relation_1['work'].get('work-relation-list',())
gives an empty list, even though parent_work_1 is in fact part of a work (tested with the work
https://musicbrainz.org/work/2fb76aa1-b37f-3e05-a185-f7e607efaf80 and a recording of that work from my collection).
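The behaviour described here can be reproduced on a hand-written dict. The sketch below assumes the recording data has the shape used in the snippets above; the titles are made up and nothing is fetched from MusicBrainz.

```python
# A hand-written stand-in for the recording data described above. The
# titles and nesting are illustrative, not copied from MusicBrainz.
recording = {
    "work-relation-list": [
        {
            "type": "performance",
            "work": {
                "title": "Movement",
                "work-relation-list": [
                    # The parent appears, but without its own relations.
                    {"type": "parts", "work": {"title": "Part I"}},
                ],
            },
        }
    ]
}


def nested_titles(recording):
    """Return (work, parent) title pairs found in one recording lookup."""
    pairs = []
    for rel in recording.get("work-relation-list", ()):
        if rel["type"] != "performance":
            continue
        work = rel["work"]
        for parent_rel in work.get("work-relation-list", ()):
            if parent_rel["type"] == "parts":
                pairs.append((work["title"], parent_rel["work"]["title"]))
    return pairs


pairs = nested_titles(recording)
# The grandparent is out of reach: the nested parent has no relation list.
parent = recording["work-relation-list"][0]["work"]["work-relation-list"][0]
grandparents = parent["work"].get("work-relation-list", ())
```

Going further up therefore needs a separate lookup of the parent work itself.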

What seems to be happening is that beets queries the recording and takes all the information it can from the result. In this case (https://musicbrainz.org/recording/546c4659-96c0-46ee-9b31-cfe3e78a1c48 for testing) that is the work, parent_work_1, and other tags such as the composer.
So would there be a way to query the work rather than the recording, using something similar to the existing track_url and album_url functions (I don't know how and where they are used)? It is easy to get a work's ID (work_relation['work']['id'], with everything defined as above) and from there its URL:

    def work_url(workid):
        return urljoin(BASE_URL, 'work/' + workid)

I don't know where and how you use the album_url and track_url functions to get to the actual recording info, but you probably do.

Right now I'm going up the ladder of parent works by hand. Once this works, I would rather use a 'while' loop to climb to the top of the ladder.

Another question: how can I submit a pull request (for adding the 'arranger_sort' and 'lyricist_sort' tags) while continuing to work on this 'work' stuff? I also have some issues with the 'arranger' and 'lyricist' tags exposed in #2333.

sampsyo · 2017-05-16T15:53:40Z

Hmm… to summarize, it sounds like you're interested in how to query the MusicBrainz web service for a specific work ID? For that, you go through the client library and use, for example, get_work_by_id. In general, you might want to read a little bit about the MusicBrainz Web service. For example, here's the URL for the recording you mentioned with its work relations included:
https://musicbrainz.org/ws/2/recording/546c4659-96c0-46ee-9b31-cfe3e78a1c48?inc=work-rels
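For reference, a lookup URL like the one above can be assembled as in this sketch. The helper name `ws_url` is hypothetical, and no request is actually sent; with the client library, the corresponding call in this thread is `musicbrainzngs.get_work_by_id(work_id, includes=[...])`.

```python
from urllib.parse import urljoin

WS_ROOT = "https://musicbrainz.org/ws/2/"


def ws_url(entity, mbid, includes=()):
    """Build a MusicBrainz web service lookup URL (no request is made)."""
    url = urljoin(WS_ROOT, f"{entity}/{mbid}")
    if includes:
        # Several includes are joined with '+' in the inc parameter.
        url += "?inc=" + "+".join(includes)
    return url


recording_url = ws_url(
    "recording", "546c4659-96c0-46ee-9b31-cfe3e78a1c48", ["work-rels"])
print(recording_url)
```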

About the other question: to submit a new PR, the thing to do is to put your work in a branch and push it to your fork. Then you can open several PRs at once; one for each branch.

dosoe · 2017-05-16T16:03:22Z

I will read about the MB web service, it sounds like it could answer some of my questions (but not today).
I wonder if it would be even better to make a work_info function like the track_info function in beets/autotag/mb.py. This could also fetch the composer, the lyricist and, more generally, the tags that MB associates with works.

About the other question: So the idea is that I have more than one fork of beets on my repo, right?

sampsyo · 2017-05-16T18:23:13Z

Sure, a work_info function would be OK—but it would need to look different from the track_info function. The latter produces a complete TrackInfo object, and there would not be a corresponding WorkInfo object (because there is no such thing as a "work" in the beets database). It would need to build up information to put on the TrackInfo object.

Furthermore, it would be something of a problem if we needed to issue a series of new MusicBrainz API requests to get the work data. Is it possible to pull all the information out of the work-rels included data? If not, we may need to make this data an optional feature to avoid making metadata fetching take much longer than it does currently.

No, there's no need to fork the repo twice—you can just create different branches within one git repository. (The GitHub help pages can be useful for this.)

dosoe · 2017-05-16T19:35:20Z

But can't we make a WorkInfo object and just not put it into the library? Just use it as a temporary variable.

sampsyo · 2017-05-16T19:37:07Z

Sure! But I'd argue that you'd probably be better off with just a plain dict instead.
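A plain-dict version of the proposed work_info helper might look like the sketch below. It assumes that `work` is the 'work' entry of a lookup that included artist relations; the output key names are hypothetical.

```python
def work_info(work):
    """Collect work-level fields into a plain dict.

    `work` is assumed to be the 'work' entry of a lookup that included
    artist relations; the output key names are hypothetical.
    """
    composers = [
        rel["artist"]
        for rel in work.get("artist-relation-list", ())
        if rel["type"] == "composer"
    ]
    return {
        "work": work["title"],
        # Not every work has a disambiguation comment.
        "work_disambig": work.get("disambiguation", ""),
        "composer": [artist["name"] for artist in composers],
        "composer_sort": [artist["sort-name"] for artist in composers],
    }
```

The resulting dict can then be copied onto the TrackInfo object field by field, without introducing a WorkInfo class.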

dosoe · 2017-05-16T19:45:15Z

Yes, I would be very satisfied with that.

dosoe · 2017-05-25T15:16:46Z

OK, thanks to get_work_by_id I managed to fetch the work title, the work disambiguation, the parent work title, the parent work disambiguation, the parent work composer name and the parent work composer sort name. However, I already have an open pull request (#2563), so if I just push this to my repo it will be added as a commit to that one. Additionally, I only implemented the fetching part (in beets/autotag/mb.py) and not everything around it.
However, I can show you what the code looks like.
Just insert it into the track_info function:

lyricist = []
composer = []
composer_sort = []
work = []
work_disambig = []
parent_work = []
parent_work_disambig = []
parent_composer = []
parent_composer_sort = []
for work_relation in recording.get('work-relation-list', ()):
    if work_relation['type'] != 'performance':
        continue
    # Look up the work itself so that its own relations are available.
    work_id = work_relation['work']['id']
    work_info = musicbrainzngs.get_work_by_id(
        work_id, includes=["work-rels", "artist-rels"])
    work.append(work_info['work']['title'])
    # Not every work has a disambiguation comment.
    work_disambig.append(work_info['work'].get('disambiguation', ''))
    # Climb the 'parts' relations until a work without a parent is reached.
    partof = True
    while partof:
        partof = False
        for work_father in work_info['work'].get('work-relation-list', ()):
            # A 'backward' direction means the related work is the parent.
            if (work_father['type'] == 'parts'
                    and work_father.get('direction') == 'backward'):
                work_info = musicbrainzngs.get_work_by_id(
                    work_father['work']['id'],
                    includes=["work-rels", "artist-rels"])
                partof = True
                break  # assumes a single parent per work
    # work_info now describes the topmost work (or the work itself).
    for artist in work_info['work'].get('artist-relation-list', ()):
        if artist['type'] == 'composer':
            parent_composer.append(artist['artist']['name'])
            parent_composer_sort.append(artist['artist']['sort-name'])
    parent_work.append(work_info['work']['title'])
    parent_work_disambig.append(work_info['work'].get('disambiguation', ''))

instead of

lyricist = []
composer = []
composer_sort = []
for work_relation in recording.get('work-relation-list', ()):
    if work_relation['type'] != 'performance':
        continue

I guess there should also be 'parent_lyricist' and 'parent_lyricist_sort' tags, but that is easy and quick to do. If the work is not part of a bigger one, the parent_work is the work itself (same for all the 'parent_' tags).
What I assume here is that a work has only one parent, which might not always be the case.
There are probably style errors, but it works for me.

I will need more time to sort out how to choose the performer correctly.

This calls the musicbrainzngs.get_work_by_id function multiple times, so there might be an issue because we can only query the server once a second.
Additionally, a significant proportion of my recent test runs failed with a 503 error (service unavailable).

sampsyo · 2017-05-25T17:16:11Z

Hmm; that's interesting! I notice that this seems to have gotten quite a bit more complicated. It seems like it would be a worthy goal to see if this can be done in a more generic way: that is, maybe we can write one function to get all the information for the "parent work," and then a separate function that pulls out all the work-related information from any work? Then, we can just join "parent_" onto the front of all the stuff from the parent work in one fell swoop, rather than needing to duplicate logic for every field.
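The "join parent_ onto the front" idea could be sketched as below, assuming the work-related information has already been gathered into plain dicts. The helper names and field names are hypothetical.

```python
def with_prefix(info, prefix="parent_"):
    """Return a copy of info with every key prefixed."""
    return {prefix + key: value for key, value in info.items()}


def merge_fields(work_fields, parent_fields):
    """Combine a work's fields with the prefixed fields of its parent."""
    merged = dict(work_fields)
    merged.update(with_prefix(parent_fields))
    return merged


# Example with made-up values: the same extraction result is reused for
# the parent, so no per-field logic needs to be duplicated.
merged = merge_fields(
    {"work": "Aria", "composer": "Bach, Johann Sebastian"},
    {"work": "Matthäus-Passion, BWV 244", "composer": "Bach, Johann Sebastian"},
)
```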

dosoe · 2017-05-25T19:20:33Z

Most of this code (the while loop) only aims to find the parent work.

dosoe · 2017-05-25T19:22:22Z

And then you get all the info about it once you have it.

dosoe · 2017-05-25T22:11:07Z

So basically what is happening is the following:
I get the work ID from the recording relationships.
Then I get the work relationships by using get_work_by_id.
In the work relationships I look for a work with 'type': 'parts' and 'direction': 'backward'.
If there is none, then this is the parent work; if there is one, I take its ID.
Then I repeat with the ID I just got.

Once we have the parent work, we take its name, composer, composer_sort, lyricist, etc. The disambiguation needs a try/except because some works don't have one. In the same way, only works with a parent have a relation with a 'direction' tag in their work relationships.
We should also watch out for duplicates, so maybe append a value to the list only if it's not already there, since a recording can very well contain more than one work that are all part of the same parent work.
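The steps above can be sketched with the lookup passed in as a function, so the logic can be tried on fake data without contacting MusicBrainz. Here `fetch` stands in for something like `musicbrainzngs.get_work_by_id(...)['work']`; the helper names and the guard against cycles are additions for illustration, not part of the code posted earlier.

```python
def find_parent_id(work):
    """Return the ID of the work this one is part of, or None."""
    for rel in work.get("work-relation-list", ()):
        if rel["type"] == "parts" and rel.get("direction") == "backward":
            return rel["work"]["id"]
    return None


def top_work(work_id, fetch):
    """Follow parent links from work_id and return the topmost work.

    `fetch` maps a work ID to a work dict (hypothetical stand-in for the
    MusicBrainz lookup).
    """
    seen = set()
    work = fetch(work_id)
    while True:
        seen.add(work["id"])
        parent_id = find_parent_id(work)
        # Stop at the top, or if bad data would make us loop forever.
        if parent_id is None or parent_id in seen:
            return work
        work = fetch(parent_id)


def add_unique(values, value):
    """Append value only if it is not already present."""
    if value not in values:
        values.append(value)
```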

It may be coded clumsily, but it works.

Now there are two issues:
- First, there might be several parents of one work. I believe that would be an error in the MB database, but maybe there are good reasons for this to happen. So far I don't know how to deal with this.
- Second, each call of get_work_by_id is a request to MB and I can only make one per second. This would therefore substantially slow down the autotagger (I guess), so maybe it would be a good idea to make it optional or put it in a plugin (as far as I can tell, this is useful only for classical music, and if there were many classical lovers here this would already have been implemented). I have no idea how to do this.
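One possible way to reduce the number of requests, not proposed in the thread itself, is to remember works that were already looked up, on the assumption that many tracks of a release share the same ancestors. A minimal sketch, where `fetch` is a hypothetical stand-in for the MusicBrainz lookup:

```python
def cached(fetch):
    """Wrap a work lookup so that each ID is fetched at most once."""
    cache = {}

    def lookup(work_id):
        if work_id not in cache:
            cache[work_id] = fetch(work_id)
        return cache[work_id]

    return lookup
```

This does not remove the one-request-per-second limit; it only avoids repeating lookups for IDs that were already seen.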

sampsyo · 2017-05-26T03:10:32Z

Yeah, making a plugin would be a great way to make the extra queries optional and encapsulate the new code! It's actually fairly straightforward: the beets plugin system has an "import stage" API, where you can add arbitrary code to run on music that's been imported. So an easy way to get started would be to make a plugin that just runs this same code in an import stage, making calls into the beets.autotag.mb module.
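A minimal sketch of such a plugin is shown below. The class name, the `parentwork` field and the `lookup_parent_work` placeholder are hypothetical, and reading a work ID from an `mb_workid` field is an assumption; only the `BeetsPlugin` base class, the `import_stages` list, `task.imported_items()` and `item.store()` come from the beets plugin API.

```python
try:
    from beets.plugins import BeetsPlugin
except ImportError:
    # Fallback so the sketch can be read and exercised without beets.
    BeetsPlugin = object


def lookup_parent_work(work_id):
    """Hypothetical placeholder for the MusicBrainz parent-work lookup."""
    return None


class ParentWorkPlugin(BeetsPlugin):
    def __init__(self):
        super().__init__()
        self.lookup = lookup_parent_work
        # Import stages run on each task after it has been imported.
        self.import_stages = [self.imported]

    def imported(self, session, task):
        for item in task.imported_items():
            # Assumption: the work ID is available as 'mb_workid'.
            work_id = item.get("mb_workid")
            if not work_id:
                continue
            parent = self.lookup(work_id)
            if parent:
                item["parentwork"] = parent  # stored as a flexible field
                item.store()
```

Because the extra queries live in a plugin, users who do not enable it keep the current import speed.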

Let me know if I can help more with pointing the way!

dosoe · 2017-05-26T10:59:42Z

Yes, I would appreciate it if you could help me set it up.

sampsyo · 2017-05-26T17:43:35Z

Sure! Here's the place to start: http://docs.beets.io/en/v1.4.3/dev/plugins.html

Feel free to post questions along the way if anything comes up.

dosoe · 2017-05-31T13:43:56Z

OK, now I have a starting file, using https://beets.readthedocs.io/en/v1.4.3/dev/plugins.html#add-path-format-functions-and-fields and the keyfinder plugin as a template. However, I don't know how to add a new tag: the keyfinder plugin just has a tag mapping and writes directly to the file, but that doesn't work for me. Additionally, I don't know how to make it run when importing and updating as well.
I'm attaching the code as it is so far. The part about fetching the data from MB works as far as I can tell, even if it is a little ugly (I'm not a programmer).

parentwork.txt

dosoe · 2017-05-31T15:01:30Z

Here is an updated version. It works (when I ask it to print the data, it is correct); I just don't get how to write the tags into the library. I made a branch for it, but I would like to commit this without modifying my other pull request #2563.
parentwork.txt

sampsyo · 2017-06-01T02:26:39Z

Cool! It looks like you're already adding the relevant information to the Item objects and calling store(), so I think that should be enough?

JDLH · 2017-06-03T08:15:04Z

I've only read this thread, not examined the code. I'm encouraged to see an effort to handle tagging by composer and work title well.

From my knowledge of MusicBrainz, three things to be careful of:

1. Many releases will not have relationships linking their recordings to works.
2. A single recording may link to multiple works, say if a single track of an opera recording contains what the score describes as two scenes, which are represented in MusicBrainz as two Work entities.
3. There are sometimes 1, or 2, or 3 levels in a Work "is a part of" Work tree. A Work linked to a Recording may be the top-level work all by itself. Or it may be a part of the top-level work (e.g. a movement in a symphony). Or it may be a grandchild of a top-level work (e.g. a scene which is part of an Act which is part of an opera).

It might be helpful to find examples of each of these in MusicBrainz, and include them in your unit test cases.
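As a starting point, the three cases could be turned into small fixtures like the ones below. All data is made up, and `parent_of` is a simplified, hypothetical stand-in for the MusicBrainz "part of" links; real test cases would use data taken from actual MusicBrainz entities.

```python
def top_level_works(recording, parent_of):
    """Return the distinct top-level work titles for one recording.

    `parent_of` maps a work title to the title of its parent (a
    simplified stand-in for "part of" relations).
    """
    tops = []
    for rel in recording.get("work-relation-list", ()):
        if rel["type"] != "performance":
            continue
        title = rel["work"]["title"]
        seen = {title}
        # Climb until there is no parent (or a loop in the data).
        while title in parent_of and parent_of[title] not in seen:
            title = parent_of[title]
            seen.add(title)
        if title not in tops:
            tops.append(title)
    return tops


parent_of = {"Scene 1": "Act I", "Scene 2": "Act I", "Act I": "Opera"}

# Case 1: a recording without any work relationships.
no_works = {}
# Case 2 and the grandchild part of case 3: two scenes in one track.
two_scenes = {"work-relation-list": [
    {"type": "performance", "work": {"title": "Scene 1"}},
    {"type": "performance", "work": {"title": "Scene 2"}},
]}
# Case 3, top-level variant: the linked work has no parent at all.
standalone = {"work-relation-list": [
    {"type": "performance", "work": {"title": "Song"}},
]}
```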

Good luck with this plug-in!

dosoe · 2017-06-03T09:34:31Z

Hi! Thanks for your input. First, keep in mind that I am writing this script to handle classical music, because that's what my collection mainly consists of and that's where the composer and the work are important entities.
Concerning point 1: I can't do anything about it, except add the works myself on MB (60 000 edits so far). With the corresponding scripts it is pretty quick to do.
Concerning point 2: This plugin doesn't output a single work title but a list of all the works related to the recording. Then for each of these works it fetches the parent work. One thing could be an issue: I assume that each work has only one parent work, which I would expect to be true, but strictly speaking there is no reason for it.
Concerning point 3: This is the reason why I'm using a while loop. I tested it with different works: some CPE Bach sonatas that have 1 or 2 levels and the Matthew Passion of JS Bach, which has up to 4 levels of parent works.
This script is aimed at classical music, because there the work information is valuable. As said above, my goal is to be able to have a file tree such as: parentcomposer/parentwork/performer/recordings
A remaining problem is finding the performer. The MB tag on the individual recordings can vary between the parts of a work (for an opera, for example), so I might make a list of all performers of all the tracks containing parts of the parent work. But then I will have trouble identifying them precisely, as I know examples of releases where the same work is played twice with different performers.
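The "list of all performers of all the tracks containing parts of the parent work" idea could be sketched as follows. The track dicts and their keys ('parentwork', 'performers') are hypothetical.

```python
from collections import defaultdict


def performers_by_parent_work(tracks):
    """Group performer names by parent work across the tracks of a release.

    Each track is a dict with hypothetical keys 'parentwork' (a title)
    and 'performers' (a list of names).
    """
    grouped = defaultdict(list)
    for track in tracks:
        names = grouped[track["parentwork"]]
        for name in track["performers"]:
            # Keep first-seen order and skip duplicates.
            if name not in names:
                names.append(name)
    return dict(grouped)
```

As noted above, this merges the performers when a release contains the same work twice with different performers; telling those apart would need extra information such as the medium or the track position.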

jacksondm33 · 2021-12-22T21:00:22Z

Is there a reason TRACK_INCLUDES does not include 'work-rels'? This causes work relations to not be fetched by track_for_id. By adding it, importing songs (singletons) include 'mb_workid', 'composer', 'lyricist', etc. tags, while they did not before.

sampsyo added the needinfo label Feb 24, 2017

sampsyo changed the title ~~Use works for ordering classical music~~ Fetch work title from MusicBrainz Feb 25, 2017

sampsyo added the feature label and removed the needinfo label Feb 25, 2017

dosoe mentioned this issue May 31, 2017

Add parentwork plugin #2580

Closed

mhendu mentioned this issue Jul 1, 2017

Fetch MusicBrainz work ID #2618

Closed
