Webcrawler for incoming links

Requirements

npm, Node.js, MySQL

Install

  1. Clone this repo

  2. Run npm install

  3. Create the MySQL database and adjust the settings in ./db.js to match it

  4. Run node crawler.js --install

  5. Add a first website (e.g. http://news.ycombinator.com) with an empty status to the sites table
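
Step 5 can be done with any MySQL client. As a minimal sketch, assuming the sites table has columns named `url` and `status` (hypothetical names; check the schema that `--install` creates), the seed row could be prepared as a parameterized statement like this. `buildSeedInsert` is an illustrative helper, not part of this repo.

```javascript
// Hypothetical helper: builds a parameterized INSERT for the first site.
// The column names `url` and `status` are assumptions, not taken from the repo.
function buildSeedInsert(url) {
  // Validate the URL early; the URL constructor throws on malformed input.
  const parsed = new URL(url);
  return {
    sql: 'INSERT INTO sites (url, status) VALUES (?, ?)',
    values: [parsed.href, ''], // empty status, as step 5 asks
  };
}

console.log(buildSeedInsert('http://news.ycombinator.com'));
```

The returned `sql` and `values` pair can be passed to a MySQL client that supports `?` placeholders, which avoids pasting the URL directly into the query string.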

Run the crawler

node crawler.js

What it cannot do

  • crawl PDFs
  • crawl Flash
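
Given these limits, one option is to skip such links before fetching them. The sketch below is an illustration only: whether crawler.js filters links this way is an assumption, and `isCrawlable` and `SKIPPED_EXTENSIONS` are hypothetical names.

```javascript
// Hypothetical filter: returns false for links that point at PDF or Flash files.
const SKIPPED_EXTENSIONS = ['.pdf', '.swf'];

function isCrawlable(link) {
  let pathname;
  try {
    // Only the path is checked, so query strings do not hide the extension.
    pathname = new URL(link).pathname.toLowerCase();
  } catch (err) {
    return false; // not a valid absolute URL
  }
  return !SKIPPED_EXTENSIONS.some((ext) => pathname.endsWith(ext));
}

console.log(isCrawlable('http://example.com/index.html'));
console.log(isCrawlable('http://example.com/paper.pdf'));
```

Checking the file extension is a cheap heuristic; a stricter check would inspect the Content-Type header of the response.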