First try adding new albuminfo and trackinfo class #3568
This new implementation of AlbumInfo and TrackInfo allows flexible tags. A quick test with some of my data seems to work. It has one significant difference: because I overcharged |
This behaviour of |
It raises the question: The goal of this is to have some core attributes that are always populated and then some that are flexibly attributed. Which attributes are core attributes? From the definition of |
Great! Thanks for getting this going! Here's what I think we should do:
Does that seem reasonable? Perhaps I'm being too aggressive here about eliminating all built-in attributes, but I think it is probably the right thing to do.
I think in principle, yes, but I'm running into trouble adapting |
Ok, that's gonna be a wild chase to find all the |
This already works.
That's already the case.
e84c272 to 62566ee
…AlbumInfo to the absence of positional arguments
That's a working prototype. Now I kept all the default values (all the tags that were previously set to |
Looking great so far! It's awesome that the "apply" functions do not need to change at all (for now).
One thing to eventually clean up: it looks like the Map class is taken from a Stack Overflow answer? https://stackoverflow.com/a/32107024/39182
It might be a bit "overkill" for what we need. For example, I don't think we're using the *args part of the constructor? And similarly I don't think we're using the __dict__ part of the functionality? Maybe something much simpler like Confuse's AttrDict would suffice?
https://github.com/beetbox/confuse/blob/3bf9680e7d242136cc304232f902db06c4cc8e11/confuse.py#L1659-L1667
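To make the "something much simpler" suggestion concrete, here is a minimal sketch of an attribute-access dictionary. It is not copied from Confuse or from this PR; the class body and the example field names are assumptions for illustration only.

```python
class AttrDict(dict):
    """Hypothetical minimal dict whose keys are also usable as attributes."""

    def __getattr__(self, key):
        # Only reached when normal attribute lookup fails, so dict
        # methods such as .get() and .items() keep working.
        try:
            return self[key]
        except KeyError:
            raise AttributeError(key) from None

    def __setattr__(self, key, value):
        # Store every attribute assignment as a dictionary entry.
        self[key] = value


# Fields are supplied as keywords and new ones can be added freely.
info = AttrDict(title="one", length=186.5)
info.artist = "someone"
```

With this shape there is no *args handling and no __dict__ juggling; a missing field raises AttributeError, so hasattr() and getattr() with a default behave as usual.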
I do need some of the methods, because it needs to be hashable and I need a |
I see… can you elaborate a little bit on which parts of the code require those two things (hashability and deep copying)? For hashability, I think we might not want a smart approach that hashes based on the data… we might want every object to be "unequal to" every other unique object. I think that's the default for |
I remember deepcopy being a problem at my previous attempt, especially for python2. It seems like it's linked to pickle and needs a
This shows the types have to be hashable. |
OK—would you mind double-checking to see whether deep copying is still necessary anywhere? It might work to just delete those various methods and see if any tests fail… That traceback is an example showing that the type does indeed need to be hashable, but this doesn't necessarily imply that the implementation should use the actual contents instead of the object identity. In fact, I think it would be incorrect to make two objects be considered equal in that context if their contents are equal. I think a default-ish implementation like |
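As an illustration of hashing by object identity rather than by contents, here is a rough sketch. The class name is hypothetical, and overriding __eq__ alongside __hash__ is my own assumption to keep equality and hashing consistent; it is not necessarily what the PR ends up doing.

```python
class IdentityDict(dict):
    """Hypothetical dict subclass that hashes and compares by identity."""

    def __hash__(self):
        # Plain dicts are unhashable; use the object's identity instead.
        return id(self)

    def __eq__(self, other):
        # Two objects are equal only if they are the same object.
        return self is other

    def __ne__(self, other):
        return self is not other


a = IdentityDict(title="one")
b = IdentityDict(title="one")

# Same contents, but distinct objects stay distinct as dictionary keys.
mapping = {a: "first", b: "second"}
```

Here a and b hold identical data yet remain separate keys, which is the behavior argued for above: objects with equal contents are not treated as the same match candidate.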
|
Yep, that's all it does—it only works on bytestrings (i.e., |
The only ones I could think of are acoustic fingerprints and similar pseudo-strings.
Whatever it is, it's probably simpler to just decode all strings into unicode and add a list of exceptions rather than doing it the other way. Is there a reason to convert everything to unicode?
Yeah: the main reason to use the "positive" rather than "negative" approach is that, with this PR, the list of fields is meant to be extensible. So when someone comes along and adds a new field, perhaps in a plugin, we don't want to have to touch the rest of the code. We would need to choose a "default behavior," and the most sensible default is not to touch any data: pass it along as the metadata source provided it.
But why do you actually bother to convert anything to unicode? Are regular bytestrings not good enough? Is it for special characters (Cyrillic, Japanese, Chinese, etc.)?
Yes. Eventually, all strings that represent text in beets must be Unicode strings. That's the only way to reliably represent the full range of characters people use in their metadata.
Then it would make sense to make the default behaviour to convert every string to unicode unless stated otherwise.
But that runs into the problem above: what if a plugin wants to add a field that is supposed to contain bytes? Especially if it's not a built-in beets plugin, it would have no way to instruct beets core to skip the conversion.
Why not? A field gets converted only if it exists.
Say I write a plugin that provides a fingerprint tag, $myfp. It holds a byte string, intentionally. The beets core has never heard of this before. But the loop you're proposing will look like this:
    for field in self.data:
        if isinstance(self[field], bytes) and field not in do_not_convert_these_fields:
            self[field] = self[field].decode('utf8')
Because |
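For contrast, the "positive" approach discussed above can be sketched roughly as follows. The TEXT_FIELDS list and the helper name are assumptions made up for this example, not the actual list in beets; only the decode(codec, 'ignore') call mirrors the diff under review.

```python
# Hypothetical allow-list: only these fields are known to hold text.
TEXT_FIELDS = ("title", "artist", "album")


def decode_text_fields(data, codec="utf-8"):
    """Decode bytes values in place, but only for known text fields."""
    for field in TEXT_FIELDS:
        value = data.get(field)
        if isinstance(value, bytes):
            data[field] = value.decode(codec, "ignore")
    return data


# "myfp" stands in for a plugin-provided field that core knows nothing about.
info = {"title": b"caf\xc3\xa9", "myfp": b"\x00\xffraw"}
decode_text_fields(info)
```

The title is decoded to text, while the unknown myfp field passes through as bytes without the plugin having to register an exception anywhere.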
Then is there anything that holds this PR back from being merged? I'm thinking of adding a method like |
Looking great overall! Here's one more code review with a few low-level revisions.
beets/autotag/hooks.py
Outdated
@@ -138,53 +144,41 @@ def decode(self, codec='utf-8'):
    if isinstance(value, bytes):
        setattr(self, fld, value.decode(codec, 'ignore'))

if self.tracks:
if 'tracks' in self:
Perhaps tracks is the one thing we should keep non-optional. That is, you must provide a list of track objects, unlike all the other fields, which are different because they are just metadata.
I'm actually not sure why we have this if. The loop just doesn't do anything if the list is empty.
I agree, to me it doesn't make sense to have an album without tracks.
beets/autotag/hooks.py
Outdated
for track in self.tracks:
    track.decode(codec)

def dup_albuminfo(self):
Let's just call this copy (because it's obviously a method on AlbumInfo).
beets/autotag/hooks.py
Outdated
tracks = []
for track in self.tracks:
    tracks.append(track.dup_trackinfo())
dupe.tracks = tracks
This loop can be replaced with a list comprehension:
dupe.tracks = [track.copy() for track in self.tracks]
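Putting the two suggestions together (a method simply named copy, plus the list comprehension), the result might look roughly like this. Track and Album are hypothetical stand-ins built on plain dict, not the real TrackInfo and AlbumInfo classes.

```python
class Track(dict):
    """Hypothetical stand-in for a track info object."""

    def copy(self):
        # Return a new Track holding the same fields.
        return Track(self)


class Album(dict):
    """Hypothetical stand-in for an album info object."""

    def copy(self):
        # Copy the album's own fields, then copy each track so the
        # duplicate does not share track objects with the original.
        dupe = Album(self)
        dupe["tracks"] = [track.copy() for track in self["tracks"]]
        return dupe


album = Album(album="x", tracks=[Track(title="one")])
dupe = album.copy()
dupe["tracks"][0]["title"] = "changed"
```

Changing a track on the duplicate leaves the original album's track untouched, which is what the tests in this PR rely on when they previously used copy.deepcopy().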
beets/autotag/hooks.py
Outdated
@@ -224,6 +219,11 @@ def decode(self, codec='utf-8'):
    if isinstance(value, bytes):
        setattr(self, fld, value.decode(codec, 'ignore'))

def dup_trackinfo(self):
Also call this copy?
test/test_autotag.py
Outdated
trackinfo.append(TrackInfo(u'three', None))
trackinfo.append(TrackInfo(title=u'one', track_id=None))
trackinfo.append(TrackInfo(title=u'two', track_id=None))
trackinfo.append(TrackInfo(title=u'three', track_id=None))
Seems like track_id=None may no longer be necessary?
test/test_autotag.py
Outdated
@@ -595,7 +597,8 @@ def item(i, length):
items.append(item(12, 186.45916150485752))

def info(index, title, length):
    return TrackInfo(title, None, length=length, index=index)
    return TrackInfo(title=title, track_id=None, length=length,
As above.
test/test_autotag.py
Outdated
@@ -749,13 +752,15 @@ def test_albumtype_applied(self):
    self.assertEqual(self.items[1].albumtype, 'album')

def test_album_artist_overrides_empty_track_artist(self):
    my_info = copy.deepcopy(self.info)
    # make a deepcopy of self.info
This comment is probably not necessary every time?
test/test_ui.py
Outdated
i2 = library.Item()
i2.bitrate = 4321
i2.length = 10 * 60 + 54
i2.format = "F"
Any particular reason why this is not i2 = self.item.copy()?
I wonder why this hasn't been done long ago; a docstring in one of the parent classes explicitly says that deepcopy() doesn't work on these objects.
Should I provide the changelog as well? Should the documentation be altered?
Awesome! A changelog entry would be great. I can't think of anywhere else in the docs where we mention this stuff, so probably nothing else needs to change. Thanks for all your work on this!
Co-authored-by: Adrian Sampson <adrian@radbox.org>
That should be it then.
Awesome!! Thank you for your careful work on this. This is a long-standing request that will help enable lots of interesting additions in the future. Three cheers!
I'm coming back to this PR: I tried to add new fields, which should be easy now, but I found out we missed an important part in |
New attempt at dealing with #1547 learning from errors in #2650