web: UnicodeEncodeError on non-latin1 characters in filename #2815

Open
UniIsland opened this issue Feb 22, 2018 · 3 comments
Labels
bug bugs that are confirmed and actionable python 3 Arises from the Python 2->3 transition.

Comments

@UniIsland

Problem

I'm using beets with Tomahawk. If I try to play a song with CJK characters in its name, the web server throws a UnicodeEncodeError.

127.0.0.1 - - [22/Feb/2018 15:28:19] "GET /item/10933/file HTTP/1.0" 200 -
Error on request:
Traceback (most recent call last):
  File "/usr/local/Cellar/pyenv/1.2.1/versions/3.6.4/lib/python3.6/site-packages/werkzeug/serving.py", line 270, in run_wsgi
    execute(self.server.app)
  File "/usr/local/Cellar/pyenv/1.2.1/versions/3.6.4/lib/python3.6/site-packages/werkzeug/serving.py", line 261, in execute
    write(data)
  File "/usr/local/Cellar/pyenv/1.2.1/versions/3.6.4/lib/python3.6/site-packages/werkzeug/serving.py", line 227, in write
    self.send_header(key, value)
  File "/usr/local/Cellar/pyenv/1.2.1/versions/3.6.4/lib/python3.6/http/server.py", line 508, in send_header
    ("%s: %s\r\n" % (keyword, value)).encode('latin-1', 'strict'))
UnicodeEncodeError: 'latin-1' codec can't encode characters in position 46-50: ordinal not in range(256)

Here's a song that can cause this problem (fields dumped with the export plugin):

    {
        "acoustid_fingerprint": null,
        "acoustid_id": null,
        "album": "匆匆",
        "albumartist": "胡德夫",
        "albumartist_credit": "胡德夫",
        "albumartist_sort": "Ara Kimbo",
        "albumdisambig": null,
        "albumstatus": "Official",
        "albumtype": "album",
        "arranger": "",
        "art": false,
        "artist": "胡德夫",
        "artist_credit": "胡德夫",
        "artist_sort": "Ara Kimbo",
        "asin": null,
        "bitdepth": 0,
        "bitrate": 128000,
        "bpm": 0,
        "catalognum": "WFM05001",
        "channels": 2,
        "comments": null,
        "comp": false,
        "composer": null,
        "composer_sort": null,
        "country": "TW",
        "date": "2005-04-01",
        "day": null,
        "disc": 1,
        "disctitle": null,
        "disctotal": 1,
        "encoder": null,
        "format": "MP3",
        "genre": "Folk",
        "genres": [
            "Folk"
        ],
        "grouping": null,
        "initial_key": null,
        "label": "野火樂集",
        "language": "zho",
        "length": 316.6040625,
        "lyricist": null,
        "lyrics": "",
        "mb_albumartistid": "46dfef42-826d-4cb1-8d28-940d30aa3bf9",
        "mb_albumid": "de95a0cb-87c0-4d64-b753-f5c98bde3271",
        "mb_artistid": "46dfef42-826d-4cb1-8d28-940d30aa3bf9",
        "mb_releasegroupid": "0fe19e52-b54b-4a6a-946d-a23b58766e7c",
        "mb_trackid": "b69fce9c-373b-46a3-b060-4ffdc4800430",
        "media": "CD",
        "month": 4,
        "original_date": "2005-04-01",
        "original_day": null,
        "original_month": 4,
        "original_year": 2005,
        "r128_album_gain": 0,
        "r128_track_gain": 0,
        "rg_album_gain": -4.44,
        "rg_album_peak": 1.088344,
        "rg_track_gain": -5.09,
        "rg_track_peak": 1.032934,
        "samplerate": 44100,
        "script": "Hant",
        "title": "太平洋的風",
        "track": 1,
        "tracktotal": 12,
        "year": 2005
    }

Setup

  • OS: macOS 10.13.3
  • Python version: 3.6.4
  • beets version: 1.4.6
@sampsyo sampsyo added the needinfo We need more details or follow-up from the filer before this can be tagged "bug" or "feature." label Feb 22, 2018
@sampsyo
Member

sampsyo commented Feb 22, 2018

Hello! Thanks for the details. Because this error happens when sending the headers, I suspect that the problem only occurs because there are non-Latin1 characters in the filename (not just in the metadata). Can you confirm that the filename has CJK characters?
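
The failing step in the traceback can be reproduced in isolation. The sketch below is only an illustration: it mirrors the strict latin-1 encoding that `send_header` applies in the traceback above, and both the helper name `header_is_latin1_safe` and the header values are hypothetical, not taken from beets or Werkzeug.

```python
def header_is_latin1_safe(keyword: str, value: str) -> bool:
    """Return True if the header line survives the strict latin-1
    encoding that http.server's send_header applies (see traceback)."""
    try:
        ("%s: %s\r\n" % (keyword, value)).encode("latin-1", "strict")
    except UnicodeEncodeError:
        return False
    return True


# Hypothetical header values, for illustration only.
ascii_header = "attachment; filename=track.mp3"
cjk_header = "attachment; filename=太平洋的風.mp3"

print(header_is_latin1_safe("Content-Disposition", ascii_header))
print(header_is_latin1_safe("Content-Disposition", cjk_header))
```

If the second check fails while the first passes, the filename in the header is the culprit rather than the metadata.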

@waweic
Contributor

waweic commented Feb 28, 2018

This seems to be a Python 3-specific issue. It works fine for me with Python 2.7 and Chromium. In Python 3, it even occurs with characters like the right single quotation mark (U+2019), which appears in filenames pretty often. This could possibly be prevented by ASCIIfying the attachment_filename or by using a fallback filename. Is that an option?

@sampsyo
Member

sampsyo commented Feb 28, 2018

Thanks! Yeah, it seems like the right thing to do is to ASCIIfy the filename. (For clues about how to do this, see the uses of unidecode elsewhere in the codebase.)
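
As a rough sketch of that idea, and not the fix that landed in beets, the following uses only the standard library. It assumes a simpler strategy than unidecode: unidecode would transliterate characters, whereas this version just drops whatever has no ASCII form and falls back to a placeholder name. The helpers `ascii_fallback` and `content_disposition` and the default name `file` are hypothetical. The extra `filename*` parameter is the RFC 6266 way to carry the original UTF-8 name for clients that understand it.

```python
import unicodedata
from urllib.parse import quote


def ascii_fallback(filename: str, default: str = "file") -> str:
    """Best-effort ASCII version of a filename (stdlib only)."""
    # NFKD splits accented letters into base letter + combining mark,
    # so dropping non-ASCII code points keeps the base letter.
    decomposed = unicodedata.normalize("NFKD", filename)
    ascii_name = decomposed.encode("ascii", "ignore").decode("ascii")
    stem, dot, ext = ascii_name.rpartition(".")
    if dot and not stem.strip():
        # Everything before the extension was dropped; keep the extension.
        return f"{default}.{ext}"
    return ascii_name.strip() or default


def content_disposition(filename: str) -> str:
    """Build a header value that is safe to encode as latin-1."""
    # Escape characters that would break the quoted-string form.
    fallback = ascii_fallback(filename).replace("\\", "\\\\").replace('"', '\\"')
    # Percent-encode the UTF-8 bytes of the original name.
    encoded = quote(filename, safe="")
    return f"attachment; filename=\"{fallback}\"; filename*=UTF-8''{encoded}"


print(content_disposition("太平洋的風.mp3"))
print(content_disposition("Don\u2019t Stop.mp3"))
```

The resulting value contains only ASCII, so the strict latin-1 encoding in `send_header` no longer has anything to reject.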

@sampsyo sampsyo added bug bugs that are confirmed and actionable python 3 Arises from the Python 2->3 transition. and removed needinfo We need more details or follow-up from the filer before this can be tagged "bug" or "feature." labels Mar 1, 2018
@sampsyo sampsyo changed the title from "Web plugin throws UnicodeEncodeError on non-latin characters." to "web: UnicodeEncodeError on non-latin1 characters in filename" Mar 1, 2018