Toptal Scraper

A web scraper built with Node.js and Puppeteer that scrapes developer resume data from Toptal, an exclusive network of the top freelance software developers, designers, finance experts, product managers, and project managers in the world.

Features

Scrape developer profiles from Toptal
Save scraped data into MongoDB

Scraped Data

The scraper gets the following data from each developer profile:

id
name
title
location
country
summary
skills
top_skills
portfolio
availability
preferred_env
amazing
work_exp
proj_exp
education
certification
category_skills
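
As a quick sanity check on a scraped record, the field list above can be turned into a small completeness test. This is a minimal sketch: the field names come from the list, but `PROFILE_FIELDS`, `missingFields`, and the sample record are hypothetical names and values introduced for illustration, and the README does not state the type of each field.

```javascript
// Field names copied from the list above; the check itself is an assumption,
// not part of the repository's code.
const PROFILE_FIELDS = [
  'id', 'name', 'title', 'location', 'country', 'summary', 'skills',
  'top_skills', 'portfolio', 'availability', 'preferred_env', 'amazing',
  'work_exp', 'proj_exp', 'education', 'certification', 'category_skills',
];

// Returns the names of fields that are absent (undefined) on a scraped record.
function missingFields(profile) {
  return PROFILE_FIELDS.filter((field) => profile[field] === undefined);
}

// Hypothetical partial record, for illustration only.
const partial = { id: 'example-id', name: 'Example Developer', skills: [] };
console.log(missingFields(partial).length); // 14
```

A check like this could run before each write so that incomplete profiles are logged rather than stored silently.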

Getting Started

Prerequisites

Node.js and npm installed on your machine.
MongoDB instance running either locally or cloud-based (like MongoDB Atlas)

Installing

Clone this repository

git clone https://github.com/dreamjet31/toptal_scraper.git

Install the dependencies
```
cd toptal_scraper
npm install
```
Create a .env file and add your MongoDB connection string:
```
MONGODB_URI=mongodb+srv://<username>:<password>@cluster0.mongodb.net/test?retryWrites=true&w=majority
```
Replace <username> and <password> with the credentials of your MongoDB database user.
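
If the username or password contains characters such as `@`, `:`, or `/`, they need to be percent-encoded before they go into the connection string. The sketch below shows one way to build the value for `MONGODB_URI`; `buildMongoUri` and the sample credentials are hypothetical, and the host and database name simply mirror the placeholder above.

```javascript
// Hypothetical helper: builds an Atlas-style URI with percent-encoded credentials.
function buildMongoUri({ username, password, host, database }) {
  const user = encodeURIComponent(username);
  const pass = encodeURIComponent(password);
  return `mongodb+srv://${user}:${pass}@${host}/${database}?retryWrites=true&w=majority`;
}

// Sample values for illustration only.
const uri = buildMongoUri({
  username: 'scraper',
  password: 'p@ss:word/1',
  host: 'cluster0.mongodb.net',
  database: 'test',
});
console.log(uri);
// mongodb+srv://scraper:p%40ss%3Aword%2F1@cluster0.mongodb.net/test?retryWrites=true&w=majority
```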
Run the scraper
```
node index.js
```
Wait for the script to finish. All data is saved in the MongoDB collection 'resume'.
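
The README does not say whether the script inserts or upserts records, so the following is only a sketch of one option: writing each profile into the `resume` collection keyed on its `id`, so that re-running the scraper updates existing documents instead of duplicating them. `saveProfile` is a hypothetical helper; `collection` is assumed to be a `Collection` object from the `mongodb` driver.

```javascript
// Sketch only: `collection` is assumed to be a MongoDB driver Collection,
// e.g. client.db().collection('resume'). Upserting on `id` is an assumption;
// the repository's index.js may store records differently.
function saveProfile(collection, profile) {
  return collection.updateOne(
    { id: profile.id },   // match an existing document by profile id
    { $set: profile },    // overwrite its fields with the fresh scrape
    { upsert: true },     // insert a new document when none matches
  );
}
```

Because the helper receives the collection as an argument, it can be exercised with a stub object in tests without a running database.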

NOTE: Please ensure that you have a stable internet connection while running the script to successfully scrape the data.
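
For connections that drop occasionally, a retry wrapper with exponential backoff is one way to make a long scrape more tolerant. This is an illustrative sketch rather than something the repository is known to include; `backoffDelays` and `withRetry` are hypothetical names, and the default retry count and delay are arbitrary.

```javascript
// Delays (ms) for exponential backoff: base, 2*base, 4*base, ...
function backoffDelays(retries, baseMs) {
  return Array.from({ length: retries }, (_, i) => baseMs * 2 ** i);
}

// Runs an async task once, then retries after each delay until it succeeds
// or the delays run out, in which case the last error is rethrown.
async function withRetry(task, { retries = 3, baseMs = 1000 } = {}) {
  let lastError;
  for (const delay of [0, ...backoffDelays(retries, baseMs)]) {
    if (delay > 0) await new Promise((resolve) => setTimeout(resolve, delay));
    try {
      return await task();
    } catch (error) {
      lastError = error;
    }
  }
  throw lastError;
}

console.log(backoffDelays(3, 1000)); // [ 1000, 2000, 4000 ]
```

With Puppeteer, a page load could then be wrapped as `await withRetry(() => page.goto(url))`, where `page` and `url` come from the calling code.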

Dependencies

dotenv: Loads environment variables from a .env file into process.env
memory-cache: In-memory cache that is simple to use
mongodb: Node.js driver for MongoDB
puppeteer: Provides a high-level API to control Chrome or Chromium over the DevTools Protocol

Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

License

This project is licensed under the ISC License.
