Update utils import to resolve src module error #17

Open
wants to merge 12 commits into
base: master
from
2 changes: 2 additions & 0 deletions .gitignore
@@ -15,3 +15,5 @@ build/
src/settings.cfg
tab_scraper/
release_notes.txt
src copy/utils.py
.DS_Store
24 changes: 13 additions & 11 deletions README.md
@@ -1,15 +1,16 @@
# tab-scraper
An interface for downloading guitar tabs from Ultimate Guitar.

An interface for downloading guitar tabs from Ultimate Guitar. Please check the news at the bottom.

![ui-image](screens/ui-screen.png)

Get screenshots of Guitar Chords, Tabs, Bass Tabs and Ukulele Chords with no clutter.

Chords | Tab
:------:|:------|
![chords](screens/feather-chords.png) | ![tab](screens/sultans-tab.png)
| Chords | Tab |
| :---------------------------------: | :---------------------------- |
| ![chords](screens/feather-chords.png) | ![tab](screens/sultans-tab.png) |

You can also download GuitarPro and PowerTab files. <br>
You can also download GuitarPro and PowerTab files. `<br>`
All files are sorted into directories for quick and easy access.

### Prerequisites
@@ -25,17 +26,18 @@ All files are sorted into directories for quick and easy access.

#### Command Line

1. Open settings.cfg and enter in the root directory where you would like all tabs to be stored e.g. <i>username/Music/Tabs/ </i>
2. Download [Geckodriver](https://github.com/mozilla/geckodriver/releases) and put the geckodriver executable into the <i>src</i> directory.
1. Open settings.cfg and enter in the root directory where you would like all tabs to be stored e.g. `<i>`username/Music/Tabs/ `</i>`
2. Download [Geckodriver](https://github.com/mozilla/geckodriver/releases) and put the geckodriver executable into the `<i>`src`</i>` directory.
3. Run `pip install -r requirements.txt`
4. run `python tab_scraper.py` from <i>src</i> directory.
4. run `python tab_scraper.py` from `<i>`src`</i>` directory.

### Built With

- Python 3

- [PyQT5](https://pypi.org/project/PyQt5/)

- [Selenium](https://selenium-python.readthedocs.io/)

- [Geckodriver](https://github.com/mozilla/geckodriver/releases)

### News

I am assuming the original author has given up on this project, as I myself forked it and began working on it over a year ago (and also promptly disappeared from the project) and there have been no updates since. I'll do my best to get this working again on all platforms, but I cannot make any promises. I'll likely take the existing GUI and rebuild it using libraries I'm familiar with, and build new logic to go in the back end as well. Keep an eye out for updates.
134 changes: 134 additions & 0 deletions jsonout.txt
@@ -0,0 +1,134 @@
[
{
"artist_name": "TesseracT",
"artist_url": "/artist/tesseract_29262",
"song_name": "Juno",
"marketing_type": "official",
"tab_url": "https://www.ultimate-guitar.com/pro/?utm_source=UltimateGuitar&amp;utm_medium=Search&amp;utm_campaign=UG+Search&amp;utm_content=Official+Version&amp;artist=TesseracT&amp;song=juno&amp;tab_id=2433301",
"device": null,
"app_link": "//www.ultimate-guitar.com/send?ug_from=yozio_splash&amp;url=https://play.google.com/store/apps/details?id=com.ultimateguitar.tabs&amp;ug_campaign=UG_TPAndroid_SearchTpLink_SearchPage_mobile0&amp;referrer=utm_campaign=UG_TPAndroid_SearchTpLink_SearchPage_mobile0",
"highlight": {
"song_name": [
[
0,
4
]
],
"artist_name": [
[
0,
9
]
]
}
},
{
"artist_name": "TesseracT",
"artist_url": "/artist/tesseract_29262",
"song_name": "Juno",
"marketing_type": "TabPro",
"rating": 4.8404,
"votes": 7,
"tab_url": "https://www.ultimate-guitar.com/pro/?utm_source=UltimateGuitar&amp;utm_medium=Search&amp;utm_campaign=UG+Search&amp;artist=TesseracT&amp;song=juno&amp;tab_id=2425607",
"device": null,
"app_link": "//www.ultimate-guitar.com/send?ug_from=yozio_splash&amp;url=https://play.google.com/store/apps/details?id=com.ultimateguitar.tabs&amp;ug_campaign=UG_TPAndroid_SearchTpLink_SearchPage_mobile0&amp;referrer=utm_campaign=UG_TPAndroid_SearchTpLink_SearchPage_mobile0",
"highlight": {
"song_name": [
[
0,
4
]
],
"artist_name": [
[
0,
9
]
]
}
},
{
"id": 2390557,
"song_id": 2743409,
"song_name": "Juno",
"artist_id": 29262,
"artist_name": "TesseracT",
"type": "Pro",
"part": "",
"version": 1,
"votes": 10,
"rating": 4.80031,
"date": "1527099921",
"status": "approved",
"preset_id": 28177,
"tab_access_type": "public",
"tp_version": 1,
"tonality_name": "Fm",
"version_description": "Whole song transcription, all instruments, ambient guitar and lyrics included. Multiple notation/time signature interpretations for similar parts as it's quite ambiguous. The Tab Pro conversion by UG has a few weird things going on, and the lyrics don't align like in the original file, so keep that in mind.",
"verified": 0,
"recording": {
"is_acoustic": 0,
"tonality_name": "",
"performance": null,
"recording_artists": []
},
"artist_url": "https://www.ultimate-guitar.com/artist/tesseract_29262",
"tab_url": "https://tabs.ultimate-guitar.com/tab/tesseract/juno-guitar-pro-2390557"
},
{
"id": 2395961,
"song_id": 2743409,
"song_name": "Juno",
"artist_id": 29262,
"artist_name": "TesseracT",
"type": "Pro",
"part": "",
"version": 2,
"votes": 0,
"rating": 0,
"date": "1528113155",
"status": "approved",
"preset_id": 28177,
"tab_access_type": "public",
"tp_version": 2,
"tonality_name": "",
"version_description": "",
"verified": 0,
"recording": {
"is_acoustic": 0,
"tonality_name": "",
"performance": null,
"recording_artists": []
},
"artist_url": "https://www.ultimate-guitar.com/artist/tesseract_29262",
"tab_url": "https://tabs.ultimate-guitar.com/tab/tesseract/juno-guitar-pro-2395961"
},
{
"id": 2425607,
"song_id": 2743409,
"song_name": "Juno",
"artist_id": 29262,
"artist_name": "TesseracT",
"type": "Pro",
"part": "",
"version": 3,
"votes": 7,
"rating": 4.8404,
"date": "1531581826",
"status": "approved",
"preset_id": 28177,
"tab_access_type": "public",
"tp_version": 3,
"tonality_name": "F",
"version_description": "I saw TesseracT in Paris to be sure of finger placement.",
"verified": 0,
"recording": {
"is_acoustic": 0,
"tonality_name": "",
"performance": null,
"recording_artists": []
},
"artist_url": "https://www.ultimate-guitar.com/artist/tesseract_29262",
"tab_url": "https://tabs.ultimate-guitar.com/tab/tesseract/juno-guitar-pro-2425607"
}
]
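The sample above mixes two kinds of entries: promotional ones that carry a `marketing_type` key and no `id`, and ordinary tab entries that carry an `id`, `votes`, and `rating`. A minimal sketch of how such a list could be filtered and ranked is shown below. The helper names (`community_tabs`, `best_rated`, `clean_url`) are hypothetical and not part of this PR, and the inline sample is abbreviated from jsonout.txt with one URL shortened.

```python
import html
import json

# Abbreviated sample in the same shape as jsonout.txt (first URL shortened).
SAMPLE = """
[
  {"artist_name": "TesseracT", "song_name": "Juno", "marketing_type": "official",
   "tab_url": "https://www.ultimate-guitar.com/pro/?utm_source=UltimateGuitar&amp;tab_id=2433301"},
  {"id": 2390557, "song_name": "Juno", "artist_name": "TesseracT", "type": "Pro",
   "version": 1, "votes": 10, "rating": 4.80031,
   "tab_url": "https://tabs.ultimate-guitar.com/tab/tesseract/juno-guitar-pro-2390557"},
  {"id": 2395961, "song_name": "Juno", "artist_name": "TesseracT", "type": "Pro",
   "version": 2, "votes": 0, "rating": 0,
   "tab_url": "https://tabs.ultimate-guitar.com/tab/tesseract/juno-guitar-pro-2395961"},
  {"id": 2425607, "song_name": "Juno", "artist_name": "TesseracT", "type": "Pro",
   "version": 3, "votes": 7, "rating": 4.8404,
   "tab_url": "https://tabs.ultimate-guitar.com/tab/tesseract/juno-guitar-pro-2425607"}
]
"""


def community_tabs(results):
    """Keep entries that have an "id"; promotional entries lack one."""
    return [entry for entry in results if "id" in entry]


def best_rated(results):
    """Return the tab with the highest rating, using votes as a tie-breaker."""
    tabs = community_tabs(results)
    return max(tabs, key=lambda entry: (entry["rating"], entry["votes"]), default=None)


def clean_url(url):
    """Turn HTML entities such as &amp; back into plain characters."""
    return html.unescape(url)


results = json.loads(SAMPLE)
tabs = community_tabs(results)
best = best_rated(results)
print(len(tabs), best["tab_url"])
print(clean_url(results[0]["tab_url"]))
```

Ranking by rating first is only one possible choice; a version with many votes and a slightly lower rating might be preferable in practice.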
81 changes: 81 additions & 0 deletions new/utilities.py
@@ -0,0 +1,81 @@
# To do:
# Check for updates in search results page on UG
# Build new GUI (likely using tkinter, as I'm not that familiar with PyQt5)
# Build functions to search on UG and parse search results using updated Selenium
# Connect GUI and utility functions
import os,sys,json
import subprocess
import sys
from tqdm import tqdm
def install(package):
    subprocess.check_call([sys.executable, "-m", "pip", "install", package])
fail = False
try:
    from bs4 import BeautifulSoup as bs
except ImportError:
    fail = True
    print(f'WARNING: MODULE BeautifulSoup4 MISSING, ATTEMPTING TO INSTALL')
    install('beautifulsoup4')
from datetime import date, timedelta
try:
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service
except ImportError:
    fail = True
    print(f'WARNING: MODULE selenium MISSING, ATTEMPTING TO INSTALL')
    install('selenium')
from time import sleep
try:
    from webdriver_manager.chrome import ChromeDriverManager
except ImportError:
    fail = True
    print(f'WARNING: MODULE webdriver-manager MISSING, ATTEMPTING TO INSTALL')
    install('webdriver-manager')
try:
    import requests
except ImportError:
    fail = True
    print(f'WARNING: MODULE requests MISSING, ATTEMPTING TO INSTALL')
    install('requests')
if fail == True:
    print(f'WARNING: ONE OR MORE MODULES WERE MISSING AND HAVE BEEN INSTALLED. THE PROGRAM WILL NEED TO BE RE-RUN AS A RESULT.')
    quit()
# set up currentdir variable
currentdir = os.path.dirname(os.path.realpath(__file__))
parentdir = os.path.dirname(currentdir)
sys.path.append(parentdir)
# creating a simple logger
logger = open(os.path.join(currentdir,'logfile.txt'),'w')
# create a webdriver instance using selenium, set chrome options to headless to avoid a window popping up every time you scrape
from selenium.webdriver.chrome.options import Options
chrome_options = Options()
# set chrome options "headless" to ensure a window doesn't pop up
chrome_options.add_argument("--headless")
# and the below line to simulate an actual browser window size
chrome_options.add_argument("--window-size=1100,1000")
# the below line instantiates the driver instance, with a simple webdriver check
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()),options=chrome_options)
# the below line gives us a useragent header to pretend to be a real browser
driver.execute_cdp_cmd('Network.setUserAgentOverride', {"userAgent": 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.53 Safari/537.36'})



SEARCH_URL = "https://www.ultimate-guitar.com/search.php?search_type=title&value={}"
# raw strings keep the regex backslashes from being read as (invalid) string escapes
RESULTS_PATTERN = r"\&quot\;results\&quot\;:(\[.*?\]),\&quot\;pagination\&quot\;"
#RESULTS_PATTERN = "\"results\":(\[.*?\]),\"pagination\""
RESULTS_COUNT_PATTERN = r"\&quot\;tabs\&quot\;,\&quot\;results_count\&quot\;:([0-9]+?),\&quot\;results\&quot\;"
#RESULTS_COUNT_PATTERN = "\"tabs\",\"results_count\":([0-9]+?),\"results\""
DOWNLOAD_TIMEOUT = 15
# retrieves a URL and returns a soup object
def retrieve(url):
    # retrieve the page using our driver instance
    driver.get(url)
    # sleep for 3 seconds to allow the page to fully load, and also to simulate human behavior
    sleep(3)
    # get the page source to scrape through for information
    data = driver.page_source
    # turn it into a soup object and return it
    soup = bs(data, features="html.parser")
    driver.close()
    return soup
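
The diff defines `RESULTS_PATTERN` and `RESULTS_COUNT_PATTERN` but does not yet show them in use. The sketch below illustrates one way they could be applied to raw page text. It assumes the search page embeds its results as HTML-escaped JSON (with `&quot;` in place of quotes), which is what the patterns suggest but which this PR does not confirm. The patterns are restated without the redundant backslashes, and `extract_results`, `extract_count`, and `SAMPLE_PAGE` are hypothetical names and data made up for illustration.

```python
import html
import json
import re

# Same expressions as in utilities.py, written as raw strings.
RESULTS_PATTERN = r"&quot;results&quot;:(\[.*?\]),&quot;pagination&quot;"
RESULTS_COUNT_PATTERN = r"&quot;tabs&quot;,&quot;results_count&quot;:([0-9]+?),&quot;results&quot;"

# Made-up page fragment; the real markup on the site may differ.
SAMPLE_PAGE = (
    '<div class="js-store" data-content="{&quot;type&quot;:&quot;tabs&quot;,'
    '&quot;results_count&quot;:2,&quot;results&quot;:['
    '{&quot;id&quot;:2390557,&quot;song_name&quot;:&quot;Juno&quot;},'
    '{&quot;id&quot;:2425607,&quot;song_name&quot;:&quot;Juno&quot;}'
    '],&quot;pagination&quot;:{&quot;current&quot;:1}}"></div>'
)


def extract_results(page_source):
    """Return the list of result dicts found in the page, or [] if none match."""
    match = re.search(RESULTS_PATTERN, page_source, re.DOTALL)
    if match is None:
        return []
    # The captured text is still HTML-escaped, so unescape before parsing JSON.
    return json.loads(html.unescape(match.group(1)))


def extract_count(page_source):
    """Return the reported number of results, or 0 if the count is not found."""
    match = re.search(RESULTS_COUNT_PATTERN, page_source)
    return int(match.group(1)) if match else 0


print(extract_count(SAMPLE_PAGE))
print([entry["id"] for entry in extract_results(SAMPLE_PAGE)])
```

Note that the text must be searched in its escaped form. If the page is first parsed and re-serialized, the escaping may change, in which case the commented-out plain-quote patterns would be the ones to try.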
