
weefatboi/spider-projects


spider-projects

a collection of my personal web crawler projects

Automated Data Mining and Related Impacts on the Real Estate Industry

scrapes the HomeFinder, Realtor, and Homes sites for real estate listings by city and state
aggregates the results into one master list, joined on full address, keeping track of which sites each listing came from
reads each website's structured JSON response instead of parsing HTML
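
The aggregation step described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the function names, the `address` and `source` keys, and the address normalization rule are all assumptions made for the example.

```python
def normalize_address(address: str) -> str:
    """Lowercase and collapse commas, periods, and extra spaces so that
    differently formatted copies of the same address compare equal."""
    cleaned = address.lower().replace(",", " ").replace(".", " ")
    return " ".join(cleaned.split())


def merge_listings(listings):
    """Merge per-site listing dicts into one record per full address.

    Each input dict is assumed (hypothetically) to carry "address" and
    "source" keys. Other fields are copied over, and the first site that
    reports a field wins.
    """
    merged = {}
    for listing in listings:
        key = normalize_address(listing["address"])
        record = merged.setdefault(
            key, {"address": listing["address"], "sources": []}
        )
        # Preserve every site that reported this address, without duplicates.
        if listing["source"] not in record["sources"]:
            record["sources"].append(listing["source"])
        for field, value in listing.items():
            if field not in ("address", "source"):
                record.setdefault(field, value)
    return list(merged.values())


listings = [
    {"address": "12 Oak St, Austin, TX", "source": "HomeFinder", "price": 350000},
    {"address": "12 oak st,  austin, tx", "source": "Realtor", "beds": 3},
    {"address": "9 Elm Ave, Austin, TX", "source": "Homes", "price": 410000},
]
master_list = merge_listings(listings)
```

Joining on a normalized key rather than the raw string is a design choice for the sketch; real listings may need stricter normalization (unit numbers, street suffixes) than shown here.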

Steam spider

scrapes the Steam Top Sellers list and emails the user a curated list of deals (under $10)
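
A rough sketch of the curation step, assuming each scraped game is a dict with `title` and `price` strings (hypothetical field names). Sending the email itself is left out; the example only filters the deals and builds a plain-text body.

```python
from decimal import Decimal

PRICE_LIMIT = Decimal("10.00")  # the "under $10" threshold from the description


def parse_price(text: str) -> Decimal:
    """Convert a price string such as "$9.99" to a Decimal.

    Non-numeric labels such as "Free" are treated as 0 (an assumption).
    """
    cleaned = text.strip().lstrip("$").replace(",", "")
    if cleaned.replace(".", "", 1).isdigit():
        return Decimal(cleaned)
    return Decimal("0")


def curate_deals(games, limit=PRICE_LIMIT):
    """Return (title, price) pairs strictly under the limit, cheapest first."""
    deals = [(game["title"], parse_price(game["price"])) for game in games]
    return sorted((d for d in deals if d[1] < limit), key=lambda d: d[1])


def format_email_body(deals):
    """Render one "Title: $price" line per deal."""
    return "\n".join(f"{title}: ${price:.2f}" for title, price in deals)


games = [
    {"title": "A", "price": "$19.99"},
    {"title": "B", "price": "$4.99"},
    {"title": "C", "price": "$9.99"},
    {"title": "D", "price": "$10.00"},
]
body = format_email_body(curate_deals(games))
```

`Decimal` is used instead of `float` so that a price of exactly $10.00 is reliably excluded from the "under $10" list.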

sample email output:

HackerNews spider

scrapes HackerNews article titles, source links, and upvote points
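
To illustrate the three fields being collected, here is a self-contained sketch that uses the standard library's `html.parser` rather than Scrapy selectors. The `titleline` and `score` class names and the sample markup are assumptions about the page structure, and the parser class is hypothetical.

```python
from html.parser import HTMLParser


class HNParser(HTMLParser):
    """Collects title, link, and points from Hacker News-style markup."""

    def __init__(self):
        super().__init__()
        self.articles = []
        self._expect_link = False  # inside span.titleline, waiting for its <a>
        self._capture = None       # "title" or "score" while reading text

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "span" and "titleline" in classes:
            self._expect_link = True
        elif tag == "a" and self._expect_link:
            self._expect_link = False
            self._capture = "title"
            self.articles.append(
                {"title": "", "link": attrs.get("href"), "points": 0}
            )
        elif tag == "span" and "score" in classes:
            self._capture = "score"

    def handle_data(self, data):
        if not self.articles:
            return
        if self._capture == "title":
            self.articles[-1]["title"] += data
        elif self._capture == "score":
            # Text looks like "120 points"; keep the leading integer.
            parts = data.split()
            if parts and parts[0].isdigit():
                self.articles[-1]["points"] = int(parts[0])

    def handle_endtag(self, tag):
        if tag in ("a", "span"):
            self._capture = None


def parse_articles(html: str):
    parser = HNParser()
    parser.feed(html)
    return parser.articles


SAMPLE = """
<tr class="athing"><td class="title"><span class="titleline"><a href="https://example.com/a">First post</a></span></td></tr>
<tr><td class="subtext"><span class="score">120 points</span></td></tr>
<tr class="athing"><td class="title"><span class="titleline"><a href="https://example.com/b">Second post</a></span></td></tr>
<tr><td class="subtext"><span class="score">1 point</span></td></tr>
"""
articles = parse_articles(SAMPLE)
```

Entries without a score row (job postings, for example) keep the default of 0 points in this sketch.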

sample news output:

Amazon spider

scrapes Amazon search results by search term
the user can pass category=<some-search-term> on the command line to scrape that term's results
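
One way the `category` term could be turned into a request URL is sketched below. The base URL, the `k` and `page` query parameters, and the function name are assumptions for illustration; the source does not say how the spider builds its requests. With Scrapy, a term like this is commonly supplied as a spider argument (`-a category=<some-search-term>`).

```python
from urllib.parse import urlencode

BASE_URL = "https://www.amazon.com/s"  # assumed search endpoint


def build_search_url(category: str, page: int = 1) -> str:
    """Build a search URL for the given term, URL-encoding spaces and symbols."""
    params = {"k": category.strip()}
    if page > 1:
        params["page"] = page
    return f"{BASE_URL}?{urlencode(params)}"


first_page = build_search_url("gaming mouse")
second_page = build_search_url("laptop", page=2)
```

Encoding the term with `urlencode` keeps multi-word or punctuated search terms from producing a malformed URL.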

sample search results output:

Credits

  1. scrapy-rotating-proxies
  2. scrapy-user-agents
  3. Scrapy
  4. AWS SES
  5. Postman
