Ask James - The Serverless Web Crawler

An article explaining how to use this code is available at https://read.acloud.guru/how-to-build-a-serverless-web-crawler-7c1c2f37a36

This is a naive crawler, built as a proof of concept, and it is not appropriate for production usage. Do not use it against production websites. Do not use it against websites where you do not have permission to crawl. Do not violate the AWS Terms & Conditions.

This code is provided for educational purposes only with no warranty implied.

Misuse may result in considerable AWS expenses and may negatively impact target websites.

Do not run this code unless you understand the implications of web crawling. You are entirely responsible for the consequences of running this code.

Installation

Clone the repository and run npm install in the cloned directory.

Usage

Don't forget to:

Update your testEvent.json
Create the DynamoDB table 'crawler' - the table should have a partition key called 'url', no sort key and capacity set to on-demand.
Add the DynamoDB stream ARN to serverless.yml (when you are ready)
Spend time to test and understand what the code is doing
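
The table described in the list above could be created from Node.js along these lines. This is a minimal sketch, not code from this repository: it assumes the AWS SDK for JavaScript v3 package `@aws-sdk/client-dynamodb` is installed and that credentials and a region are already configured. The helper names `crawlerTableParams` and `createCrawlerTable` are hypothetical, and enabling a stream with the `NEW_IMAGE` view type is an assumption based on the stream ARN step; check the article for the setting the crawler actually expects.

```javascript
// Build the CreateTable parameters for the 'crawler' table:
// partition key 'url' (string), no sort key, on-demand capacity.
function crawlerTableParams(tableName = 'crawler') {
  return {
    TableName: tableName,
    AttributeDefinitions: [{ AttributeName: 'url', AttributeType: 'S' }],
    KeySchema: [{ AttributeName: 'url', KeyType: 'HASH' }], // HASH = partition key
    BillingMode: 'PAY_PER_REQUEST', // on-demand capacity
    // Assumption: a stream is enabled so that there is a stream ARN to
    // reference in serverless.yml. The view type here is a guess.
    StreamSpecification: { StreamEnabled: true, StreamViewType: 'NEW_IMAGE' },
  };
}

// Create the table and return the ARN of its stream.
async function createCrawlerTable() {
  // Required lazily so the params builder works without the SDK installed.
  const { DynamoDBClient, CreateTableCommand } = require('@aws-sdk/client-dynamodb');
  const client = new DynamoDBClient({}); // region and credentials come from the environment
  const result = await client.send(new CreateTableCommand(crawlerTableParams()));
  return result.TableDescription.LatestStreamArn;
}

// Example usage (creates a real AWS resource, so it is left commented out):
// createCrawlerTable().then((arn) => console.log('Stream ARN:', arn));
```

Creating the table in the AWS console works just as well. The sketch only makes the key schema and billing mode explicit.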

Support

If you have any questions or comments, feel free to contact me (James Beswick) at @jbesw. I hope you enjoy!

