- Scrapy (for crawling the site)
- IPython notebook, pandas, json, etc. (for conversion to Excel)
-
- put your YW (YachtWorld) search string into SEARCH_URL in the file yachtworldcrawler/yachtworldcrawler/spiders/yachtworld.py (or keep the default)
-
- cd yachtworldcrawler/yachtworldcrawler
-
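- scrapy crawl <spider name> -o yachts.json (scrapy crawl needs a spider name; use the value of the name attribute set in spiders/yachtworld.py)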
-
- PROFIT!
-
- open ipython notebook ./yw-analyzer.ipynb
-
- adjust IN_PATH/OUT_PATH in step 2 of the notebook to match your input and output files
-
- run all cells
-
- PROFIT!
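
For readers who want the conversion step without opening the notebook, here is a minimal sketch of what a JSON-to-Excel conversion with pandas could look like. It is not the contents of yw-analyzer.ipynb, which are not shown here. The function names load_listings and export_listings are hypothetical, the IN_PATH/OUT_PATH values are placeholders, and it assumes yachts.json is a single JSON array of items, as Scrapy writes for a .json output file.

```python
import json
from pathlib import Path

import pandas as pd

# Placeholder paths (assumption); the notebook's real IN_PATH/OUT_PATH are not shown here.
IN_PATH = "yachts.json"
OUT_PATH = "yachts.xlsx"


def load_listings(in_path):
    """Load a JSON array of scraped items into a DataFrame (hypothetical helper)."""
    with open(in_path, encoding="utf-8") as f:
        items = json.load(f)
    # json_normalize flattens nested dicts into dotted column names.
    return pd.json_normalize(items)


def export_listings(df, out_path):
    """Write the table to Excel for .xlsx paths, or to CSV for any other suffix."""
    if Path(out_path).suffix.lower() == ".xlsx":
        # Writing .xlsx requires an Excel engine such as openpyxl to be installed.
        df.to_excel(out_path, index=False)
    else:
        df.to_csv(out_path, index=False)


if __name__ == "__main__":
    export_listings(load_listings(IN_PATH), OUT_PATH)
```

The CSV fallback is only there so the sketch can be tried without an Excel engine installed; the column names depend entirely on the fields the spider emits.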