lyrics: In Genius backend, tolerate artist disambiguation markers #4791

Open
ronajon opened this issue May 15, 2023 · 3 comments · May be fixed by #5474
Assignees
Labels
feature features we would like to implement

Comments

@ronajon

ronajon commented May 15, 2023

Problem

Importing Genius lyrics does not work for certain artists.
The reason is that Genius does not list these artists under their plain band name (Psychonaut or Brutus). Because several bands share the same name, they are listed as <bandname>-<country code>, e.g. psychonaut-be (https://genius.com/artists/Psychonaut-be) and brutus-be (https://genius.com/artists/brutus-be).

Running this command in verbose (-vv) mode:

$ beet -vv lyrics violate consensus reality all your gods have gone

Led to this problem:

user configuration: /Media/home/.config/beets/config.yaml
data directory: /Media/home/.config/beets
plugin paths: 
Sending event: pluginload
library database: /Media/home/.config/beets/library.db
library directory: /Media/Music
Sending event: library_opened
lyrics: Genius failed to find a matching artist for 'Psychonaut'
lyrics: failed to fetch: https://www.musixmatch.com/lyrics/Psychonaut/All-Your-Gods-Have-Gone (404)
lyrics: lyrics not found: Psychonaut - Violate Consensus Reality - All Your Gods Have Gone
Sending event: cli_exit

Here's a link to the music files that trigger the bug (if relevant):

Setup

  • OS: alpine 3.17.3
  • Python version: 3.10.11
  • beets version: 1.6.0
  • Turning off plugins made problem go away (no):

My configuration (output of beet config) is:

lyrics:
    bing_lang_from: []
    google_API_key: REDACTED
    google_engine_ID: REDACTED
    fallback: ''
    sources: genius musixmatch
    auto: yes
    bing_client_secret: REDACTED
    bing_lang_to:
    genius_api_key: REDACTED
    force: no
    local: no
directory: /Media/Music
library: /Media/home/.config/beets/library.db

import:
    copy: no
    write: yes
ignore: ['?eaDir*']
incremental: yes
genres: yes

ui:
    color: yes

paths:
    default: $albumartist/$albumartist - $year - $album/$albumartist - $album - $track - $title

plugins: web discogs fetchart mbsync duplicates info missing lyrics
web:
    host: 0.0.0.0
    readonly: no
    include_paths: yes
    port: 8337
    cors: ''
    cors_supports_credentials: no
    reverse_proxy: no
fetchart:
    auto: yes
    cover_names: cover front art album folder
    sources: coverart itunes amazon albumart
    minwidth: 0
    maxwidth: 0
    quality: 0
    max_filesize: 0
    enforce_ratio: no
    cautious: no
    google_key: REDACTED
    google_engine: 001442825323518660753:hrh5ch1gjzm
    fanarttv_key: REDACTED
    lastfm_key: REDACTED
    store_source: no
    high_resolution: no
    deinterlace: no
    cover_format:
discogs:
    index_tracks: yes
    apikey: REDACTED
    apisecret: REDACTED
    tokenfile: discogs_token.json
    source_weight: 0.5
    user_token: REDACTED
    separator: ', '
missing:
    count: no
    total: no
    album: no
duplicates:
    album: no
    checksum: ''
    copy: ''
    count: no
    delete: no
    format: ''
    full: no
    keys: []
    merge: no
    move: ''
    path: no
    tiebreak: {}
    strict: no
    tag: ''
@sampsyo sampsyo added the needinfo We need more details or follow-up from the filer before this can be tagged "bug" or "feature." label May 16, 2023
@sampsyo
Member

sampsyo commented May 16, 2023

This sounds annoying! It would be helpful to experiment with different ways of resolving the ambiguity. For example, does simply dropping the last two-letter word always work, or does that ever introduce ambiguity with a different artist?

Here's where to start when tweaking the matching heuristic:

if slug(hit_artist) == slug(artist):
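
To illustrate why an exact slug comparison like the one above misses these artists, here is a minimal sketch. The `slug` helper below is a simplified stand-in written for this example, not the plugin's actual implementation, and the hit name "Psychonaut (BE)" is an assumption about how Genius returns the disambiguated artist.

```python
import re


def slug(text: str) -> str:
    """Simplified stand-in for the plugin's slug helper (assumption)."""
    # Lowercase, collapse runs of non-word characters into "-", trim the ends.
    return re.sub(r"\W+", "-", text.lower()).strip("-")


artist = "Psychonaut"           # artist name stored in the library
hit_artist = "Psychonaut (BE)"  # assumed name returned by Genius

# The disambiguation marker survives slugging, so the strict equality
# check rejects a hit that is in fact the right artist.
print(slug(artist))
print(slug(hit_artist))
print(slug(hit_artist) == slug(artist))
```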

@ronajon
Author

ronajon commented May 16, 2023

If I use a regex to strip the [<2-letter country code>] marker from the artist name returned by Genius, it seems to work.
The change is on line 359:

old

hit_artist = hit["result"]["primary_artist"]["name"]

new

hit_artist = re.sub(r'.[\(\[]..[\)\]]','',hit["result"]["primary_artist"]["name"]) 
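
The pattern above starts with `.`, which matches any character, and it is not anchored, so it can also remove bracketed two-character groups in the middle of a name. A stricter variant might look like the sketch below. The helper name `strip_disambiguation` is hypothetical, and the sketch assumes the marker is always a trailing pair of letters in round or square brackets.

```python
import re

# Optional whitespace, an opening bracket, exactly two letters, a closing
# bracket, all anchored to the end of the name.
_MARKER = re.compile(r"\s*[(\[][A-Za-z]{2}[)\]]\s*$")


def strip_disambiguation(name: str) -> str:
    """Drop a trailing two-letter marker such as "(BE)" or "[BE]"."""
    return _MARKER.sub("", name)


print(strip_disambiguation("Psychonaut (BE)"))  # marker removed
print(strip_disambiguation("Brutus [BE]"))      # marker removed
print(strip_disambiguation("Artist (Live)"))    # left alone: not two letters
```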

@sampsyo
Member

sampsyo commented May 17, 2023

Nice, that seems like a good step! An eventual PR should try both (the original and truncated name, if any) to make sure we don't miss artists that happen to look like this.
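
Trying both names could be sketched roughly as follows. This is only an illustration of the idea, not code from the plugin: `slug` is a simplified stand-in, `candidate_names` and `artist_matches` are hypothetical helpers, and "Some Band (UK)" is a made-up artist whose real name happens to end in such a marker.

```python
import re

_MARKER = re.compile(r"\s*[(\[][A-Za-z]{2}[)\]]\s*$")


def slug(text: str) -> str:
    """Simplified stand-in for the plugin's slug helper (assumption)."""
    return re.sub(r"\W+", "-", text.lower()).strip("-")


def candidate_names(hit_artist: str):
    """Yield the original name first, then the truncated name if it differs."""
    yield hit_artist
    stripped = _MARKER.sub("", hit_artist)
    if stripped and stripped != hit_artist:
        yield stripped


def artist_matches(hit_artist: str, artist: str) -> bool:
    """Accept the hit if either candidate name slugs to the wanted artist."""
    wanted = slug(artist)
    return any(slug(name) == wanted for name in candidate_names(hit_artist))


print(artist_matches("Psychonaut (BE)", "Psychonaut"))     # truncated name matches
print(artist_matches("Some Band (UK)", "Some Band (UK)"))  # original name still matches
print(artist_matches("Brutus (BE)", "Psychonaut"))         # different artist
```

Checking the original name first means an artist whose name really does end in a bracketed two-letter group is still matched exactly.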

@sampsyo sampsyo changed the title lyrics not found with exisiting artists in Genius lyrics: In Genius backend, tolerate artist disambiguation markers May 17, 2023
@sampsyo sampsyo added feature features we would like to implement and removed needinfo We need more details or follow-up from the filer before this can be tagged "bug" or "feature." labels May 17, 2023
snejus added a commit that referenced this issue Oct 8, 2024
This commit introduces a distance threshold mechanism to the Genius
backend and unifies its implementation across the rest of backends that
perform searching and matching artists and titles.

- Create a new `SearchBackend` base class with a method `check_match`
  that performs checking.
- Start using undocumented `dist_thresh` configuration option for good,
  and mention it in the docs. This controls the maximum allowable
  distance for matching artist and title names.

These changes aim to improve the accuracy of lyrics matching, especially
when there are slight variations in artist or title names, see #4791.
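
The commit message does not show the implementation, so the following is only a sketch of what a distance-threshold check can look like. It uses `difflib.SequenceMatcher` from the standard library as a stand-in distance measure, which may differ from what beets uses. The function names, their signatures, and the threshold values are assumptions for illustration; only the option name `dist_thresh` comes from the message above.

```python
from difflib import SequenceMatcher


def string_distance(a: str, b: str) -> float:
    """Return 0.0 for identical strings and up to 1.0 for unrelated ones."""
    return 1.0 - SequenceMatcher(None, a.lower(), b.lower()).ratio()


def is_close_match(
    target_artist: str,
    target_title: str,
    artist: str,
    title: str,
    dist_thresh: float = 0.25,  # illustrative value, not the plugin default
) -> bool:
    """Accept a hit when both artist and title are within the threshold."""
    worst = max(
        string_distance(target_artist, artist),
        string_distance(target_title, title),
    )
    return worst <= dist_thresh


title = "All Your Gods Have Gone"
print(is_close_match("Psychonaut", title, "Psychonaut (BE)", title))
print(is_close_match("Psychonaut", title, "Brutus", title))
```

With a ratio-based measure like this one, a fixed-length marker weighs more heavily on short names: "Brutus" against "Brutus (BE)" lands farther apart than "Psychonaut" against "Psychonaut (BE)", so the threshold has to be chosen with short artist names in mind.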
snejus added a commit that referenced this issue Oct 9, 2024
@snejus snejus self-assigned this Oct 9, 2024
snejus added a commit that referenced this issue Oct 12, 2024
This commit introduces a distance threshold mechanism for the Genius and
Google backends.

- Create a new `SearchBackend` base class with a method `check_match`
  that performs checking.
- Start using undocumented `dist_thresh` configuration option for good,
  and mention it in the docs. This controls the maximum allowable
  distance for matching artist and title names.

These changes aim to improve the accuracy of lyrics matching, especially
when there are slight variations in artist or title names, see #4791.
snejus added a commit that referenced this issue Oct 13, 2024
snejus added a commit that referenced this issue Oct 19, 2024
snejus added a commit that referenced this issue Oct 23, 2024
snejus added a commit that referenced this issue Oct 30, 2024
snejus added a commit that referenced this issue Nov 22, 2024
snejus added a commit to snejus/beets that referenced this issue Dec 5, 2024
snejus added a commit that referenced this issue Dec 7, 2024