embedart has invalid text encoding? #2264

Closed
gboysko opened this issue Nov 13, 2016 · 13 comments
Labels
needinfo We need more details or follow-up from the filer before this can be tagged "bug" or "feature."

Comments

@gboysko

gboysko commented Nov 13, 2016

Problem

When I run embedart on an entire album, MP3Diag then complains that the "text encoding is unsupported for APIC frame in ID3V2 tag".

$ beet embedart grind

Setup

  • OS: Ubuntu 16.04
  • Python version: 2.7.12
  • beets version: 1.3.18
  • Turning off plugins made problem go away (yes/no): N/A

My configuration (output of beet config) is:

directory: /mnt/980425CD0425AF66/MP3s.new
import:
    move: yes
    write: yes
original_date: yes
asciify_paths: yes
id3v23: yes
plugins: edit embedart fetchart lastgenre scrub missing info
paths:
    default: $albumartist/$album%aunique{}/$track $title
replaygain:
    backend: gstreamer
@gboysko gboysko changed the title embedart has invalid text encoding? embedart has invalid text encoding? Nov 13, 2016
@gboysko
Author

gboysko commented Nov 13, 2016

Looking at mid3v2 it seems to be using UTF-16 encoding. Looking at the output generated by Picard, I see that the encoding for the ID3V2.3 tag is UTF-16, however, the APIC tag is encoded with LATIN1. I wonder if UTF-16 encoding makes sense for image data.
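
For context, here is a rough sketch of how an APIC frame body is laid out according to the ID3v2.3 spec: the leading encoding byte governs only the description string, while the MIME type is always Latin-1 and the picture data is raw bytes. The helper names (`build_apic_body`, `parse_apic_body`) are made up for illustration and are not part of beets or mutagen.

```python
# Sketch of the APIC frame body layout (frame header omitted):
#   encoding byte | MIME type (Latin-1) + 0x00 | picture type byte |
#   description (in the declared encoding) + terminator | raw image bytes
# All names here are hypothetical helpers, not a real library API.

CODECS = {0: "latin-1", 1: "utf-16", 2: "utf-16-be", 3: "utf-8"}
TERMINATORS = {0: b"\x00", 1: b"\x00\x00", 2: b"\x00\x00", 3: b"\x00"}


def build_apic_body(encoding, mime, picture_type, description, image_data):
    """Assemble an APIC body; only the description uses `encoding`."""
    return (
        bytes([encoding])
        + mime.encode("latin-1") + b"\x00"
        + bytes([picture_type])
        + description.encode(CODECS[encoding]) + TERMINATORS[encoding]
        + image_data  # stored as-is, never text-encoded
    )


def parse_apic_body(body):
    """Split an APIC body back into its five fields."""
    encoding = body[0]
    mime_end = body.index(b"\x00", 1)
    mime = body[1:mime_end].decode("latin-1")
    picture_type = body[mime_end + 1]
    desc_start = mime_end + 2
    term = TERMINATORS[encoding]
    step = len(term)
    pos = desc_start
    # Scan in code-unit-sized steps so a 2-byte terminator stays aligned.
    while body[pos:pos + step] != term:
        if pos >= len(body):
            raise ValueError("unterminated description")
        pos += step
    description = body[desc_start:pos].decode(CODECS[encoding])
    return encoding, mime, picture_type, description, body[pos + step:]


jpeg_stub = b"\xff\xd8\xff\xe0"  # placeholder bytes, not a real image
for enc in (0, 1):
    fields = parse_apic_body(build_apic_body(enc, "image/jpeg", 3, "cover", jpeg_stub))
    print(fields)
```

Under either encoding the image bytes come back unchanged, so the encoding byte only matters for the (often empty) description, and for readers that reject anything other than Latin-1 there.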

@sampsyo sampsyo added the needinfo We need more details or follow-up from the filer before this can be tagged "bug" or "feature." label Nov 13, 2016
@sampsyo
Member

sampsyo commented Nov 13, 2016

Interesting! Can you find any other references on the Web that dictate specific encodings for APIC? Or perhaps this is a special proclivity in MP3Diag?

@gboysko
Author

gboysko commented Nov 14, 2016

I will see what I can find. The only reference I have found is http://id3.org/id3v2.3.0, but it gives no indication that APIC should be encoded in any format other than UTF-16. I'll reach out to the creator of MP3Diag.

@gboysko
Author

gboysko commented Nov 14, 2016

Is this the same problem as #899?

@sampsyo
Member

sampsyo commented Nov 14, 2016

Good point; it could be! Encoding in id3v23 mode is a somewhat tricky business.

@gboysko
Author

gboysko commented Nov 15, 2016

Here is another investigation showing problems with the EasyTag application:

https://mail.gnome.org/archives/easytag-list/2013-March/msg00008.html

Seems to be consistent with what I'm seeing.

What do we need to take this forward? Is this code in another module? Or is this controlled by beets?

@sampsyo
Member

sampsyo commented Nov 15, 2016

Does it seem like the right solution is to solve #899 entirely? That is, if we switch to non-Unicode encodings entirely for ID3v2.3, will this problem be solved?

If so, you can help by taking a look at mediafile.py (yes, in this repository) and thinking about how we might switch to use different encodings based on the id3v23 flag on the MediaFile class.

@gboysko
Author

gboysko commented Nov 15, 2016

I don't think we should switch to non-Unicode text encoding for all of
ID3V2.3. Rather, there seems to be something specific about the APIC frame
that requires Latin1 encoding for it to be accepted by a large number of
devices/viewers.

@sampsyo
Member

sampsyo commented Nov 15, 2016

OK, good call. In any case, we'll need some mechanism to let that storage style check which encoding it should use—the other issue gives a little more detail.
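
One possible shape for such a mechanism, as a minimal sketch only: prefer Latin-1 for the APIC description whenever it fits, and fall back to a Unicode encoding that is valid for the target tag version otherwise. The function name `pick_apic_encoding` and the plain-integer constants are assumptions for illustration; this is not the code in mediafile.py.

```python
# Hypothetical helper: choose the APIC encoding byte from the description
# text and the id3v23 flag. The integer values are the ID3v2 encoding bytes.
LATIN1, UTF16, UTF8 = 0, 1, 3


def pick_apic_encoding(description, id3v23):
    """Return Latin-1 when the description allows it, else a Unicode fallback."""
    try:
        description.encode("latin-1")
    except UnicodeEncodeError:
        # UTF-8 is not defined in ID3v2.3, so fall back to UTF-16 there.
        return UTF16 if id3v23 else UTF8
    return LATIN1


print(pick_apic_encoding("", id3v23=True))
print(pick_apic_encoding("обложка", id3v23=True))
print(pick_apic_encoding("обложка", id3v23=False))
```

Since cover-art descriptions are usually empty or plain ASCII, a rule like this would write Latin-1 in the common case without giving up Unicode descriptions entirely.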

@gboysko
Author

gboysko commented Nov 16, 2016

I'm new to the code base, but mediafile.py seems to conflict with what I'm seeing. That is, line 923 seems to specify that UTF-8 encoding (3) is used. However, the output of mid3v2 suggests that UTF-16 (1) was used. I'm not sure how to explain the difference.

@sampsyo
Member

sampsyo commented Nov 17, 2016

Weird. It seems like this could use some more detailed debugging to nail down exactly what's going on…

@lazka
Contributor

lazka commented Nov 18, 2016

I'm new to the code base, but mediafile.py seems to conflict with what I'm seeing. That is, line 923 seems to specify that UTF-8 encoding (3) is used. However, the output of mid3v2 suggests that UTF-16 (1) was used. I'm not sure how to explain the difference.

When saving text to id3v2.3, mutagen will use utf-16 in case utf-8 was specified since utf-8 isn't available in v2.3.
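
To illustrate the fallback: ID3v2.3 defines only encoding bytes 0 (Latin-1) and 1 (UTF-16 with BOM), while 2 (UTF-16BE) and 3 (UTF-8) were added in v2.4. The sketch below mimics the downgrade with a hypothetical function; it is not mutagen's actual implementation.

```python
# Illustrative only: map a requested ID3 text-encoding byte to one that is
# valid for the tag version being written. `encoding_for_version` is a
# hypothetical name, not a mutagen function.
LATIN1, UTF16, UTF16BE, UTF8 = 0, 1, 2, 3
V23_ENCODINGS = {LATIN1, UTF16}


def encoding_for_version(requested, minor_version):
    """Downgrade v2.4-only encodings to UTF-16 when writing ID3v2.3."""
    if minor_version == 3 and requested not in V23_ENCODINGS:
        return UTF16
    return requested


print(encoding_for_version(UTF8, 3))    # UTF-8 requested, v2.3 tag
print(encoding_for_version(UTF8, 4))    # UTF-8 requested, v2.4 tag
print(encoding_for_version(LATIN1, 3))  # Latin-1 is valid in both
```

That would account for the mismatch: the code asks for encoding 3, but with id3v23 enabled the frame ends up on disk with encoding 1.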

I've opened a PR: #2270

@sampsyo
Copy link
Member

sampsyo commented Nov 18, 2016

That would explain it; thanks!

3 participants