Lyrics plugin cannot fetch lyrics when accented (non-ascii?) characters in URL (from artist name or title) #2357

Closed
katonagl opened this issue Dec 30, 2016 · 2 comments
Labels
needinfo We need more details or follow-up from the filer before this can be tagged "bug" or "feature."

Comments

katonagl commented Dec 30, 2016

Problem

I am trying to fetch lyrics for songs where either the artist name or the title contains non-ASCII characters.

The command used is beet -vv lyrics halász micimackó

The output is:

lyrics: failed to fetch: http://lyrics.wikia.com/Hal%C3%A1sz_Judit:Micimack%C3%B3 (404)
lyrics: failed to fetch: https://www.musixmatch.com/lyrics/Hal%C3%A1sz-Judit/Micimack%C3%B3 (404)
lyrics: lyrics not found: Halász Judit - Halász Judit - Micimackó

However, the second link does exist. I suspect the problem is with character encoding, since titles containing only ASCII characters work.
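
One way to check the encoding hypothesis is to rebuild the failing URL by hand and compare it with the one in the log. The sketch below is not the plugin's code: `musixmatch_style_url` is a hypothetical helper, and the URL pattern (spaces replaced by hyphens, then percent-encoded as UTF-8) is only inferred from the log line above.

```python
from urllib.parse import quote, unquote


def musixmatch_style_url(artist, title):
    """Hypothetical helper: build a URL shaped like the one in the log."""
    def slug(text):
        # Assumption: spaces become hyphens, then the text is
        # percent-encoded as UTF-8 (the default for quote()).
        return quote(text.replace(" ", "-"))

    return "https://www.musixmatch.com/lyrics/{}/{}".format(
        slug(artist), slug(title)
    )


url = musixmatch_style_url("Halász Judit", "Micimackó")
logged = "https://www.musixmatch.com/lyrics/Hal%C3%A1sz-Judit/Micimack%C3%B3"

# The rebuilt URL is identical to the logged one.
print(url == logged)
# Decoding the escapes gives back the accented characters.
print(unquote(url))
```

If the two URLs match, the accented characters were escaped as ordinary UTF-8 percent-encoding, which suggests the 404 may have a cause other than the encoding itself.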

Setup

  • OS: Opensuse Tumbleweed
  • Python version: 3.5.1
  • beets version: 1.4.1
  • Turning off plugins made problem go away (yes/no): it is a problem with a plugin

My configuration (output of beet config) is:

lyrics:
    bing_lang_from: []
    force: yes
    auto: yes
    google_API_key: REDACTED
    bing_client_secret: REDACTED
    genius_api_key: REDACTED
    google_engine_ID: REDACTED
    bing_lang_to:
    fallback:
    sources:
    - google
    - lyricwiki
    - lyrics.com
    - musixmatch

import:
    move: yes
directory: /media/music
mbsubmit:
    format: $track. $title ($length)
    threshold: medium
library: ~/.config/beets/musiclibrary.blb

plugins: lyrics fetchart fromfilename mbsubmit scrub
scrub:
    auto: yes
fetchart:
    auto: yes
    google_engine: 001442825323518660753:hrh5ch1gjzm
    cautious: no
    cover_names:
    - cover
    - front
    - art
    - album
    - folder
    sources:
    - filesystem
    - coverart
    - itunes
    - amazon
    - albumart
    store_source: no
    maxwidth: 0
    enforce_ratio: no
    google_key: REDACTED
    fanarttv_key: REDACTED
    minwidth: 0
@sampsyo sampsyo added the needinfo We need more details or follow-up from the filer before this can be tagged "bug" or "feature." label Dec 30, 2016
Member

sampsyo commented Dec 30, 2016

Hi! It actually looks like this isn't an encoding issue but a case of Musixmatch blocking our scraper:

>>> import requests
>>> requests.get('https://www.musixmatch.com/lyrics/Hal%C3%A1sz-Judit/Micimack3%B3')
<Response [404]>

We can change our user-agent from the requests default, but I'm afraid it will only be a matter of time before that gets blocked too. We'll see, I guess?

Member

sampsyo commented Dec 30, 2016

OK, I've added a User-Agent header to the plugin. It's not a real fix, of course, but it might make this work for now! Care to give it a try?
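
For readers unfamiliar with the workaround, here is a minimal sketch of attaching an explicit User-Agent header to a request. It is not the change made to the plugin: it uses the standard library's `urllib.request` instead of `requests` so that it runs without extra dependencies, and both `USER_AGENT` and `build_request` are hypothetical names. The string the plugin actually sends is not shown in this thread.

```python
from urllib.request import Request

# Hypothetical value, for illustration only.
USER_AGENT = "example-lyrics-fetcher/0.1"


def build_request(url, user_agent=USER_AGENT):
    """Return an unsent GET request carrying an explicit User-Agent."""
    return Request(url, headers={"User-Agent": user_agent})


req = build_request(
    "https://www.musixmatch.com/lyrics/Hal%C3%A1sz-Judit/Micimack%C3%B3"
)

# urllib stores header names capitalized as "User-agent".
print(req.get_header("User-agent"))
```

With `requests`, the same idea is expressed by passing `headers={"User-Agent": ...}` to `requests.get`. Nothing is sent over the network in this sketch; the request object is only constructed and inspected.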
