WeChat: zhong_888520 , Email: zhong_heart@163.com

MobileNew

Target URL: http://shouji.tenaa.com.cn/Mobile/MobileNew.aspx
Scrapes the detail page of each phone and saves the phone's images into a matching folder.
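A minimal sketch of the "one folder per phone" step, assuming the image bytes have already been downloaded. The helper names `safe_folder_name` and `save_image` are hypothetical and are not taken from the repository; the spider itself may instead rely on Scrapy's built-in image pipeline.

```python
import re
from pathlib import Path


def safe_folder_name(model: str) -> str:
    """Replace characters that are invalid in common filesystems."""
    cleaned = re.sub(r'[\\/:*?"<>|]+', "_", model).strip()
    return cleaned or "unknown"


def save_image(base_dir: Path, model: str, filename: str, data: bytes) -> Path:
    """Write image bytes to base_dir/<model>/<filename> and return the path."""
    folder = base_dir / safe_folder_name(model)
    folder.mkdir(parents=True, exist_ok=True)  # create the per-phone folder once
    target = folder / filename
    target.write_bytes(data)
    return target
```

Sanitizing the model name first keeps a slash in a phone model from silently creating nested directories.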

haodfSpider

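Target: the 好大夫在线 (Haodf Online) app.
Phone traffic is captured with Fiddler to obtain the request parameters, which are then assembled into requests; the returned JSON data is parsed. The code may stop working after an app upgrade, but the overall logic stays the same.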
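The capture-then-replay flow can be sketched roughly as follows. The endpoint `BASE_URL`, the parameter names, and the `content` key are all placeholders chosen for illustration; the real values come from the captured traffic and are not documented here.

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoint, for illustration only (not the app's real API).
BASE_URL = "https://api.example.com/doctor/list"


def build_url(base_url: str, params: dict) -> str:
    """Append the captured parameters to the base URL as a query string."""
    # Sorting gives a stable URL, which also helps later deduplication.
    return f"{base_url}?{urlencode(sorted(params.items()))}"


def parse_items(body: str) -> list:
    """Return the list under the (assumed) 'content' key, or [] if absent."""
    payload = json.loads(body)
    content = payload.get("content")
    return content if isinstance(content, list) else []
```

In a Scrapy spider, the URL built this way would typically be passed to `scrapy.Request`, and `parse_items` would be called on `response.text` inside the callback.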

YiYaoWang

Target site: 一药网 (Yiyaowang). Scrapes the review text of products.

weiboBudweiser

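Target site: Sina Weibo (mobile API).
A crawler project written for Budweiser. Because of the sheer volume of Weibo data, crawling is distributed using Redis, and crawled URLs are deduplicated with hashlib. The cookie pool used by requests is served by a Python Flask app (cookies are in fact optional: setting a download delay and deploying to several servers also works). The storage pipelines write asynchronously, which improves storage throughput.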
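A minimal sketch of hash-based URL deduplication. The choice of SHA-1 and the `SeenUrls` class are assumptions made for illustration; the in-memory set stands in for the shared Redis set that a distributed crawl would use.

```python
import hashlib


def url_fingerprint(url: str) -> str:
    """Return a fixed-length hex digest that identifies the URL."""
    return hashlib.sha1(url.encode("utf-8")).hexdigest()


class SeenUrls:
    """In-memory stand-in for a shared Redis set of fingerprints."""

    def __init__(self):
        self._seen = set()

    def add_if_new(self, url: str) -> bool:
        """Record the URL; return True only the first time it is seen."""
        fp = url_fingerprint(url)
        if fp in self._seen:
            return False
        self._seen.add(fp)
        return True
```

With Redis, the same check can be done in one round trip, because `SADD` returns 1 when the member was added and 0 when it was already in the set. Storing digests instead of raw URLs keeps each entry a fixed size.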


Repository contents (26 commits): MobileNew, YiYaoWang, haodfSpider, weiboAPI, weiboBudweiser, Automation.py, README.md


To start a spider: scrapy crawl [SpiderName]


i-artist/scrapySpider
