This Git project is for the DFB data mining competition. It contains a Scrapy-based crawler that collects TV series data from Youku Index and Douban.
The crawler.
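To run a spider, use: scrapy crawl -o OUTPUT -a list=tv.txt micro
or: scrapy crawl -o OUTPUT -a list=tv.txt macro
Here OUTPUT is the file the scraped items are written to, and list=tv.txt is passed to the spider as an argument.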
For more information, see the Scrapy documentation.
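
Scrapy hands every -a name=value option to the spider's constructor as a keyword argument, so list=tv.txt reaches the spider as a file path. The sketch below shows one way such a list file could be read. It assumes tv.txt holds one series title per line in UTF-8, which this README does not state, and load_titles is a hypothetical helper, not a function from this project.

```python
from pathlib import Path
from typing import List


def load_titles(list_path: str) -> List[str]:
    """Return the series titles found in a list file.

    Assumed format (not confirmed by the project): one title per line,
    UTF-8 encoded. Blank lines and repeated titles are skipped, and the
    original order is preserved.
    """
    titles: List[str] = []
    seen = set()
    for raw_line in Path(list_path).read_text(encoding="utf-8").splitlines():
        title = raw_line.strip()
        if title and title not in seen:
            seen.add(title)
            titles.append(title)
    return titles
```

A spider could call load_titles on the path it receives through the list argument and build one search request per title.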
The Python virtual environment for this project. It contains all the packages the crawler uses.