A simple crawler that fetches the data, saves it to SQLite, and can export it to CSV.
-
How do I get set up? Install Python 3.x and create a virtualenv: see here how to.
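For example, with Python's built-in venv module (the path is just a placeholder):
user@server:~$ python3 -m venv ~/climate-env && source ~/climate-env/bin/activate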
-
Install all requirements:
user@server:~$ pip install -r requirements
-
Create tables and import base data:
user@server:~$ python create_tables.py && python import_weather_stations.py && python import_wind_directions.py
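The real table definitions live in the repository; as a rough sketch of what a create_tables.py could look like, assuming SQLAlchemy (which the sqlite:// DATABASE_URI format suggests) and a hypothetical table definition:

# Hypothetical sketch -- the actual schema is defined by the project.
from sqlalchemy import create_engine, Column, Integer, String, Float, MetaData, Table

from config import DATABASE_URI  # e.g. 'sqlite:////tmp/climate.db'

engine = create_engine(DATABASE_URI)
metadata = MetaData()

# Example table only; the project also has wind directions and measurements.
weather_stations = Table(
    'weather_stations', metadata,
    Column('id', Integer, primary_key=True),
    Column('name', String),
    Column('latitude', Float),
    Column('longitude', Float),
)

metadata.create_all(engine)  # creates any tables that do not exist yet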
-
Change the config before crawling:
Step 1: Create a login on INMET (you will need an account): see here how to.
Step 2: Set user/pass in config.py:
USER = 'email@xpto.com'
PASS = '123456'
-
Set the DATABASE_URI string in config.py, for example:
DATABASE_URI = 'sqlite:////tmp/climate.db'
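Putting both steps together, config.py ends up with something like this (all values are placeholders):
# config.py -- replace the placeholders with your own INMET credentials
USER = 'email@xpto.com'
PASS = '123456'
DATABASE_URI = 'sqlite:////tmp/climate.db'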
-
Crawling!! (Follow progress in crawler.log)
user@server:~$ python crawler_data.py
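Roughly, crawler_data.py drives a headless browser to log in and fetch the station data. A minimal sketch of that pattern, using the legacy Selenium API that shipped alongside PhantomJS support; the INMET URL and form field names here are assumptions, not the project's actual selectors:

# Hypothetical sketch of the crawl loop -- URL and selectors are assumptions.
import logging
from selenium import webdriver

from config import USER, PASS

logging.basicConfig(filename='crawler.log', level=logging.INFO)

driver = webdriver.PhantomJS()  # line 20 in crawler_data.py; swap the driver here
driver.get('http://www.inmet.gov.br/')            # assumed login page
driver.find_element_by_name('usuario').send_keys(USER)  # assumed field name
driver.find_element_by_name('senha').send_keys(PASS)    # assumed field name
driver.find_element_by_name('entrar').click()           # assumed submit button
logging.info('Logged in, starting crawl')
# ...iterate over stations and persist rows to the database...
driver.quit()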
-
Edit the query in export_data.py as needed to export to CSV, then run:
user@server:~$ python export_data.py
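A sketch of the export pattern, assuming SQLAlchemy and the standard csv module; the query and table name are placeholders, the real ones live in export_data.py:

# Hypothetical sketch -- adjust the query to what you need to export.
import csv

from sqlalchemy import create_engine, text

from config import DATABASE_URI

engine = create_engine(DATABASE_URI)
query = text('SELECT * FROM weather_stations')  # placeholder query

with engine.connect() as conn, open('export.csv', 'w', newline='') as f:
    result = conn.execute(query)
    writer = csv.writer(f)
    writer.writerow(result.keys())      # header row from the query columns
    writer.writerows(result.fetchall()) # one CSV row per result row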
This project uses PhantomJS as its webdriver. If you want to switch to another webdriver, the supported drivers are listed below (a swap example follows the list); the line to change is line 20 of crawler_data.py.
- webdriver.Firefox
- webdriver.FirefoxProfile
- webdriver.Chrome
- webdriver.ChromeOptions
- webdriver.Ie
- webdriver.Opera
- webdriver.PhantomJS
- webdriver.Remote
- webdriver.DesiredCapabilities
- webdriver.ActionChains
- webdriver.TouchActions
- webdriver.Proxy
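For example, switching line 20 of crawler_data.py from PhantomJS to Firefox would look like:
driver = webdriver.PhantomJS()  # before
driver = webdriver.Firefox()    # after; requires Firefox and its driver to be installed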
Install PhantomJS
Tutorial on how to install PhantomJS: gist - Tutorial
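If the tutorial is unavailable, one common alternative (assuming npm is installed) is the prebuilt npm package:
user@server:~$ npm install -g phantomjs-prebuilt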