Scrappy - Puppeteer-based Web Scraping Tool

Scrappy is a web scraping tool built on top of Puppeteer, a Node.js library that provides a high-level API for controlling a browser such as Chrome or Chromium. With Scrappy, you can automate extracting data from websites, navigating between pages, and interacting with page elements.

Features

Easy-to-use API: Scrappy provides a simple and intuitive API for interacting with web pages, making it easy to write scraping scripts.
Headless browser automation: Scrappy leverages Puppeteer's headless browser capabilities, allowing you to scrape websites that rely on JavaScript for rendering content.
Page navigation and interaction: Scrappy enables you to navigate through multiple pages, click buttons, fill forms, submit data, and perform other interactions just like a real user would.
Data extraction: Scrappy provides powerful methods for extracting data from web pages, including selecting elements using CSS or XPath selectors, retrieving attribute values, text content, and more.
Concurrency and parallelism: Scrappy supports running multiple scraping tasks concurrently, so you can scrape several websites at once instead of one after another.
Persistence: Scrappy supports saving scraped data to various output formats such as JSON, CSV, or a database of your choice.
Customization: Scrappy is highly customizable, allowing you to configure various aspects such as user agents, timeouts, request headers, and more.
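
To make the extraction, concurrency, and persistence features above more concrete, here is a minimal sketch written against Puppeteer's public API directly. It is not Scrappy's own API: the helper names (extractLinks, mapWithLimit, toCsv, scrapeToCsv), the 'a[href]' selector, the default concurrency of 2, and the 30-second timeout are all assumptions chosen for illustration.

```javascript
// Sketch using Puppeteer's public API directly. These helpers are
// hypothetical and are not functions exported by Scrappy.

// Collect the text and href of every element matching `selector`.
// `page` is a Puppeteer Page (anything with a compatible $$eval works).
async function extractLinks(page, selector) {
  return page.$$eval(selector, (els) =>
    els.map((el) => ({ text: el.textContent.trim(), href: el.href }))
  );
}

// Run `worker` over `items` with at most `limit` tasks in flight,
// returning results in the same order as `items`.
async function mapWithLimit(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0;
  async function runner() {
    while (next < items.length) {
      const i = next++;
      results[i] = await worker(items[i], i);
    }
  }
  const count = Math.min(limit, items.length);
  await Promise.all(Array.from({ length: count }, () => runner()));
  return results;
}

// Serialize an array of flat objects to CSV, quoting fields that
// contain commas, quotes, or line breaks.
function toCsv(rows) {
  if (rows.length === 0) return '';
  const headers = Object.keys(rows[0]);
  const escape = (value) => {
    const s = String(value ?? '');
    return /[",\n\r]/.test(s) ? `"${s.replace(/"/g, '""')}"` : s;
  };
  const lines = rows.map((row) => headers.map((h) => escape(row[h])).join(','));
  return [headers.join(','), ...lines].join('\n');
}

// Scrape several URLs with one headless browser and return CSV text.
async function scrapeToCsv(urls, { concurrency = 2 } = {}) {
  const puppeteer = require('puppeteer'); // loaded lazily; must be installed
  const browser = await puppeteer.launch();
  try {
    const perUrl = await mapWithLimit(urls, concurrency, async (url) => {
      const page = await browser.newPage();
      try {
        // Wait for JavaScript-rendered content before reading the DOM.
        await page.goto(url, { waitUntil: 'networkidle2', timeout: 30000 });
        const links = await extractLinks(page, 'a[href]');
        return links.map((link) => ({ source: url, ...link }));
      } finally {
        await page.close();
      }
    });
    return toCsv(perUrl.flat());
  } finally {
    await browser.close();
  }
}
```

Calling scrapeToCsv(['https://example.com']) resolves to a CSV string with source, text, and href columns, which could then be written to disk with fs.writeFileSync. Capping the number of open pages keeps memory use bounded, since each Puppeteer page is a full browser tab.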

Installation

Scrappy requires Node.js and npm (Node Package Manager, which ships with Node.js). Follow the steps below to install it:

Clone the Scrappy repository from GitHub:

git clone https://github.com/biratdatta/scrappy.git

Change into the cloned directory and install the dependencies:

cd scrappy
npm install
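
One way to confirm that the installation worked is a small smoke test that launches the browser and reads back some inline HTML. This is a sketch, not a script shipped with the repository: the smokeTest function is hypothetical, and it assumes that npm install made the puppeteer package available in the project.

```javascript
// Hypothetical smoke test; not part of the Scrappy repository.
// `puppeteer` is passed in so the caller decides how it is loaded.
async function smokeTest(puppeteer) {
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    // Render inline HTML so the check needs no network access.
    await page.setContent('<h1>ok</h1>');
    return await page.$eval('h1', (el) => el.textContent);
  } finally {
    // Always close the browser, even if a step above throws.
    await browser.close();
  }
}

// Usage, from the project directory:
// smokeTest(require('puppeteer')).then(console.log);
```

If the browser fails to launch, the rejected promise carries Puppeteer's error message, which usually points at a missing browser download or missing system libraries.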
