Some releases don't get their works fetched #3308
Comments
Hmm; is it really just big releases that don't include work relationships? Can you link to an example where the works are included and one where they're not?

It can be instructive, when investigating MB behaviors like this, to take these and load the actual API response in a browser. Here's an API lookup for one of those albums, for instance:

This can let you quickly experiment with different releases and

You might also be interested in reading the MusicBrainz API docs, which describe what relationships are legal for which entities:

Please see if you can narrow down what's going on with MusicBrainz itself by talking directly to the API! If it seems to be doing something "wrong," we can file a bug with the MB folks. I don't think we should make separate MB requests for every recording by default.
Indeed when I check https://musicbrainz.org/ws/2/release/9c5c043e-bc69-4edb-81a4-1aaf9c81e6dc?inc=media+recordings+release-groups+labels+artist-credits+aliases+recording-level-rels+work-rels+work-level-rels+artist-rels which is the equivalent of
As can be seen, the artists and their aliases are there, but that's pretty much all. I would also have expected recording dates to be there, as well as instruments (Gould as pianist). Now compare with https://musicbrainz.org/ws/2/release/db49c56b-7e11-4cbc-8fcc-577a031e8cd6?inc=media+recordings+release-groups+labels+artist-credits+aliases+recording-level-rels+work-rels+work-level-rels+artist-rels which is a release that contains exactly the same recordings (it's pretty much the first medium of the release above). There, the first track is much more detailed:
As we can see, it contains the work title and other relations: composer, arrangements, performers and their instruments, producer, recording date, etc. It is the same recording and we asked for the same information, yet we get much more information for the smaller release.
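To make this kind of side-by-side comparison easier to repeat, here is a minimal sketch that rebuilds the two lookup URLs quoted above from a release MBID and the same list of `inc` values. The helper name `release_lookup_url` is made up for illustration; it only assembles the URL and does not contact MusicBrainz.

```python
# The inc values are copied from the URLs quoted in this comment.
INCLUDES = [
    "media", "recordings", "release-groups", "labels", "artist-credits",
    "aliases", "recording-level-rels", "work-rels", "work-level-rels",
    "artist-rels",
]


def release_lookup_url(mbid, includes=INCLUDES):
    """Build a /ws/2/release lookup URL (hypothetical helper, no network access)."""
    # MusicBrainz separates inc values with "+", as in the URLs above.
    return "https://musicbrainz.org/ws/2/release/{}?inc={}".format(
        mbid, "+".join(includes)
    )


if __name__ == "__main__":
    # The big release and the smaller one that shares its recordings.
    for mbid in ("9c5c043e-bc69-4edb-81a4-1aaf9c81e6dc",
                 "db49c56b-7e11-4cbc-8fcc-577a031e8cd6"):
        print(release_lookup_url(mbid))
```

Opening both printed URLs in a browser shows the two responses discussed here next to each other.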
Wait, I'm not sure this is a problem in musicbrainzngs, which is the name of the Python library. Perhaps you mean the MusicBrainz server? There are details about the MB bug tracker on the wiki:
Answer from MB:
Should we implement a check so that, if the release has more than 500 tracks, we get the data track by track?
Makes sense! I don't think we can do that by default: fetching every recording for a 500-track album will take a very long time, and it will be wasted if the user doesn't need work information. Maybe it should be behind a configuration option? Or maybe it could be part of the responsibility of the

Also, I'm intrigued by this suggestion:

Because this person didn't say "you have to fetch every recording individually," it suggests there may still be some way to fetch them all in bulk, by "browsing." Maybe that's worth looking into?
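To make the "browsing" idea concrete, here is a rough sketch of paging through a release's recordings in chunks instead of issuing one request per track. It is not tested against the live server. The function `browse_all_recordings` is hypothetical, and the `browse` argument stands for any callable that behaves like `musicbrainzngs.browse_recordings`; the `recording-list` and `recording-count` response keys are assumed from how that library usually shapes browse results.

```python
def browse_all_recordings(browse, release_id, includes, page_size=100):
    """Collect every recording of a release by paging through `browse`.

    `browse` is injected so the paging logic can be tried without network
    access; it must accept release=, includes=, limit= and offset=.
    """
    recordings = []
    offset = 0
    while True:
        page = browse(release=release_id, includes=includes,
                      limit=page_size, offset=offset)
        batch = page.get("recording-list", [])
        recordings.extend(batch)
        offset += len(batch)
        total = page.get("recording-count")  # assumed key, may be absent
        # Stop on an empty page, or once the reported total is reached.
        if not batch or (total is not None and offset >= total):
            break
    return recordings
```

With the real library this might be called as `browse_all_recordings(musicbrainzngs.browse_recordings, albumid, ['work-rels'])`. Whether the server returns work relationships for browse requests is exactly the thing that would need checking. A 500-track release would then take about five requests rather than 500.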
Problem
I implemented the `work`, `mb_workid` and `work_disambig` tags not so long ago (#3272). My problem is: for some recordings, the works just don't get fetched. It concerns especially the very big releases (20+ CDs like for example https://musicbrainz.org/release/9c5c043e-bc69-4edb-81a4-1aaf9c81e6dc or https://musicbrainz.org/release/9bcd75dd-995e-482b-8ba7-1ef074d253de ). I tried to backtrace the error (by putting random prints in `beets/autotag/mb.py` and then running `beet mbsync` on the problematic releases).

What I can see is:

While `RELEASE_INCLUDES` (in `beets/autotag/mb.py`) does contain `'work-rels'` and `'work-level-rels'`, for the releases I'm looking at, `TRACK_INCLUDES` doesn't: `musicbrainzngs.VALID_INCLUDES['recording']` contains `'work-rels'` but not `'work-level-rels'`, which is odd.

At line 494 in `beets/autotag/mb.py`, `musicbrainzngs.get_release_by_id(albumid, RELEASE_INCLUDES)` doesn't contain any works, even if the recordings do have works and `RELEASE_INCLUDES` contains `'work-rels'` and `'work-level-rels'`. I first tried to check `musicbrainzngs.get_recording_by_id(recording['id'], TRACK_INCLUDES)` for all the recordings, and it turns out it doesn't contain any works, because of the first error. If I now add `'work-rels'` to `TRACK_INCLUDES` and then look at `musicbrainzngs.get_recording_by_id(recording['id'], TRACK_INCLUDES)`, then it contains the works just fine.

So I'm wondering: why do we get the work relationships with `musicbrainzngs.get_recording_by_id(recording['id'], TRACK_INCLUDES)` but not with `musicbrainzngs.get_release_by_id(albumid, RELEASE_INCLUDES)`, even if both ask for `'work-rels'` and `'work-level-rels'` for `musicbrainzngs.get_release_by_id`?

A quick and dirty fix would be to ask for `musicbrainzngs.get_recording_by_id(recording['id'], TRACK_INCLUDES)` for each track. The problem is, there is a significant performance loss if we have one MusicBrainz query for each track instead of each release, but I didn't look at it in too much detail. It seems to me that `musicbrainzngs` doesn't send all the info we ask for for very big releases; could that be because it is too big and they have a cap on the maximum size they can send?

Of course, I checked: the concerned recordings do have works on MB.
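For anyone who wants to reproduce the check without scattering prints through `mb.py`, here is a small sketch that lists the recordings in a release response that came back without any work relationships. The function name `recordings_without_works` is made up. The nested keys (`medium-list`, `track-list`, `recording`, `work-relation-list`) are assumed from the dictionaries `musicbrainzngs` usually returns, so they should be checked against a real response.

```python
def recordings_without_works(release):
    """Return the IDs of recordings that carry no work relationships.

    `release` is assumed to be the inner 'release' dict of a
    musicbrainzngs.get_release_by_id() result.
    """
    missing = []
    for medium in release.get("medium-list", []):
        for track in medium.get("track-list", []):
            recording = track.get("recording", {})
            # A missing or empty list both count as "no works fetched".
            if not recording.get("work-relation-list"):
                missing.append(recording.get("id"))
    return missing
```

It could be run on `musicbrainzngs.get_release_by_id(albumid, RELEASE_INCLUDES)['release']`; only the IDs it returns would then need the per-recording fallback, which limits the extra queries to the tracks that are actually affected.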
Setup
My configuration (output of `beet config`) is: