Crawls websites and saves found URLs to a file.

Install Node.js, then run npm install in ./crawler.
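For example, assuming you are in the repository root:

cd ./crawler
npm install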
There are 2 required CLI arguments:
- First argument: the domain to crawl
- Second argument: path to the file where the found URLs should be saved

And 2 optional CLI arguments:
- Third argument: connection count limit. Default is 15.
- Fourth argument: redirect count limit. Default is 15.
For example, if you want to crawl example.com and save the found URLs to ./test.txt, run the following command:

node ./index.js example.com test.txt
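To also set the optional limits, pass them positionally in the order listed above. For instance, to allow 10 concurrent connections and 5 redirects:

node ./index.js example.com test.txt 10 5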
To download the collected URLs and archive them as a WARC file, use Wget, replacing CHANGE_THIS with the path to your saved URL list:

wget --input-file=CHANGE_THIS --warc-file="warc" --force-directories --tries=10
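For instance, to archive the URLs saved in test.txt from the example above:

wget --input-file=test.txt --warc-file="warc" --force-directories --tries=10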