Scraper

The system that scrapes data for the website.

Add a new scraper

Create a scraper

Create a new file at ./groups/<name>.js:

export const name = '<name of group>';
export const url = '<page that list brands>';
export const infoUrl = '<wikipedia page>';

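export const scrapDetails = async (get$, getPage) => {
    // slugify, description and picture are placeholders in this template:
    // import or compute them before building the details object.
    const details = {
        name,
        slug: slugify(name),
        url,
        infoUrl,
        description,
        picture,
    };
    return details;
};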

export const scrapBrands = async (get$, getPage) => {
    const brands = new Map();
    return brands;
};
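The template calls slugify without showing where it comes from. The project may already provide its own helper or rely on a package; if not, a minimal stand-in could look like the sketch below. This is an assumption for illustration, not the project's actual implementation.

```javascript
// Hypothetical slugify helper: lowercases, strips accents, and joins words with dashes.
const slugify = (text) =>
    text
        .normalize('NFKD') // split accented characters into base letter + combining mark
        .replace(/[\u0300-\u036f]/g, '') // drop the combining marks
        .toLowerCase()
        .replace(/[^a-z0-9]+/g, '-') // collapse every non-alphanumeric run into one dash
        .replace(/^-+|-+$/g, ''); // trim leading and trailing dashes

console.log(slugify('Procter & Gamble')); // procter-gamble
```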

Scrape details

Usually, we scrape details from the group's Wikipedia page.

A default scraper, getDetailsScraper, is available. Given the group's URL, it scrapes the name, description and logo of the group.

You can replace the scrapDetails function of your group with:

import { getDetailsScraper } from '../utils/index.js';

export const scrapDetails = getDetailsScraper(url, infoUrl);

Scrape the brands

In your scrapBrands function, you can use either Cheerio or Puppeteer through get$ and getPage, respectively:

export const scrapBrands = async (get$, getPage) => {
    const $ = await get$(url);
    const page = await getPage(url);
};
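As a rough illustration of the Cheerio route, the sketch below fills the brands Map from a list of links. The URL, the '.brand-list a' selector, and the shape of each Map entry are assumptions made for the example; check the existing groups for the shape the website actually expects.

```javascript
// Hypothetical page listing the brands; replace with the group's real URL.
const url = 'https://example.com/brands';

// In a group file this function would be exported (export const scrapBrands = ...).
const scrapBrands = async (get$) => {
    // get$ is assumed to resolve to a Cheerio instance loaded with the page.
    const $ = await get$(url);
    const brands = new Map();

    // '.brand-list a' is a made-up selector; adapt it to the page's markup.
    $('.brand-list a').each((index, element) => {
        const name = $(element).text().trim();
        if (!name) return; // skip links without visible text

        // Assumed entry shape: the real shape used in data.json may differ.
        brands.set(name, { name, url: $(element).attr('href') });
    });

    return brands;
};
```

Keying the Map by brand name means a brand listed twice on the page is only stored once.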

From there, you are free to use whatever library you need. For examples, see the existing scrapers in ./packages/scraper/groups/*
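For pages that only render their content in the browser, the Puppeteer route may be a better fit. The sketch below assumes getPage resolves to a Puppeteer page already opened on the given URL; the URL, the selector, and the entry shape are again hypothetical.

```javascript
// Hypothetical page listing the brands; replace with the group's real URL.
const url = 'https://example.com/brands';

// In a group file this function would be exported (export const scrapBrands = ...).
const scrapBrands = async (get$, getPage) => {
    // getPage is assumed to resolve to a Puppeteer Page navigated to url.
    const page = await getPage(url);

    // page.$$eval runs the callback in the browser on every matching element.
    // '.brand-list a' is a made-up selector; adapt it to the page's markup.
    const links = await page.$$eval('.brand-list a', (elements) =>
        elements.map((element) => ({
            name: element.textContent.trim(),
            url: element.href,
        }))
    );

    const brands = new Map();
    for (const link of links) {
        // Assumed entry shape: the real shape used in data.json may differ.
        if (link.name) brands.set(link.name, link);
    }
    return brands;
};
```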

Run the command

yarn scrap <name>

This adds the new group and its brands to the shared data in ./packages/website/public/data.json

Usage

yarn start <group>

⚠️ New data replaces the previous data.
