Scrape new Genius song page html #3594

Merged
merged 5 commits into from
May 17, 2020


stlutz
Contributor

@stlutz stlutz commented May 16, 2020

As noted in #3535, Genius sometimes serves HTML pages that the existing code cannot scrape. The fix in #3554 stops the lyrics plugin from crashing, but it simply skips these pages even though they contain the desired lyrics. I added a few lines to the scraping algorithm to handle the new layout.
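For readers who want a picture of the fallback idea, here is a minimal sketch, not the code from this PR. It tries an old-style container first and falls back to a new-style one. The class names `lyrics` and `Lyrics__Container`, and the helper names `LyricsExtractor`, `extract`, and `scrape_lyrics`, are assumptions made for illustration; it uses only the standard library's `html.parser`.

```python
from html.parser import HTMLParser


class LyricsExtractor(HTMLParser):
    """Collect text inside <div> elements whose class satisfies a predicate."""

    def __init__(self, class_matches):
        super().__init__()
        self.class_matches = class_matches
        self.depth = 0  # <div> nesting depth inside a matching container
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "br" and self.depth:
            # Line breaks inside the lyrics become newlines.
            self.chunks.append("\n")
        elif tag == "div":
            classes = (dict(attrs).get("class") or "").split()
            if self.depth:
                self.depth += 1  # nested div inside a container
            elif any(self.class_matches(c) for c in classes):
                self.depth = 1  # entering a lyrics container

    def handle_endtag(self, tag):
        if tag == "div" and self.depth:
            self.depth -= 1
            if self.depth == 0:
                # Separate consecutive containers with a newline.
                self.chunks.append("\n")

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)


def extract(html, class_matches):
    """Return the stripped text of all matching containers ('' if none)."""
    parser = LyricsExtractor(class_matches)
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks).strip()


def scrape_lyrics(html):
    """Try the old layout first, then fall back to the new one."""
    # Assumed old layout: a single <div class="lyrics">.
    lyrics = extract(html, lambda c: c == "lyrics")
    if not lyrics:
        # Assumed new layout: one or more divs with a generated class
        # name starting with "Lyrics__Container".
        lyrics = extract(html, lambda c: c.startswith("Lyrics__Container"))
    return lyrics or None
```

Trying the old layout first keeps the existing behaviour for pages that still use it, and the fallback only runs when nothing was found.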

While I was at it, I also removed the indirection through the /song API, so we only need to query Genius twice per song instead of three times.

I also included the artist in the search query sent to Genius. This produces much better search results for songs with very common titles but lesser-known artists.
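The two search-related changes can be illustrated with a small sketch, which is not the plugin's actual code. It builds a query from title and artist, then takes the song page URL straight from a matching search hit instead of making a further /song request. The response field names (`response`, `hits`, `result`, `primary_artist`, `url`) are assumed to mirror the Genius search response, and `build_search_query` and `find_song_url` are hypothetical helper names.

```python
def build_search_query(title, artist):
    """Combine title and artist into a single search string."""
    return f"{title} {artist}"


def find_song_url(search_json, artist):
    """Return the page URL of the first hit by the given artist, else None.

    `search_json` is an already-decoded search response; its field names
    are an assumption about the response shape.
    """
    wanted = artist.strip().lower()
    for hit in search_json["response"]["hits"]:
        result = hit["result"]
        hit_artist = result["primary_artist"]["name"]
        if hit_artist.strip().lower() == wanted:
            # The hit already carries the song page URL, so no extra
            # lookup is needed to find the page to scrape.
            return result["url"]
    return None
```

With this shape, one request performs the search and a second fetches the song page, which matches the two queries per song described above. The artist comparison here is a plain case-insensitive match; a real implementation may need something more forgiving.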

stlutz added 4 commits May 16, 2020 13:26
Searching only for the title and verifying the artist afterwards means that songs with very common titles are not found, since Genius limits the number of returned hits.
An example would be 'Saviour' by 'Circa Waves'.
…ng lyrics.

The search results already include the correct song page url, making it superfluous to do another request via the /song api just to get it.
@sampsyo
Member

sampsyo commented May 16, 2020

Woohoo; looks awesome! Would you mind adding a quick changelog entry describing how this works now?

@sampsyo
Member

sampsyo commented May 17, 2020

Awesome; thanks!!

sampsyo added a commit that referenced this pull request May 17, 2020
@sampsyo sampsyo merged commit 485abb0 into beetbox:master May 17, 2020