
A list of web scrapers I created both in PyCharm and JupyterLab


LidorPrototype/Web-Scrapers


Web Scraping

Developer: L-ES

The web scrapers are:

1) Selenium scraper of 30 websites
    - selenium_scraper.py: The Selenium scraper. It downloads all of the prices from 30 different websites (some of them similar) into CSV files. It is meant to run every day and download the previous day's final results for each store or market in Israel.
    - apply_exe.py: This file takes the 2-3 types of files that were downloaded from each store and applies some simple math to them to calculate and show all the discounts.
    - upload_azure_blob_containers.py: This file uploads the final Parquet files returned by the .exe file for every store to Azure Blob Containers.
2) Amazon Scraper: Searches for a term and scrapes all of its listings
3) IMDb Scrapers (1,000 best movies):
    - One-page scraper: takes a page and gathers all the data from it
    - Multi-page scraper: same as the one-page scraper, but it also goes through all the pages of the list
4) Israel Bank Scrapers:
    - About Section
    - Research Section
    - Statistics Section
    - Term Dictionary Section
    - Careers Section
5) Twitter Scraper: Logs into a given account (credentials must be provided), searches for a term, and gathers all the tweet data
6) Yahoo Scraper: Goes into 'Yahoo! Finance' and grabs:
    - Financials tab data
    - Profile tab data
    - Statistics tab data
    - Historical Stock tab data
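
The multi-page IMDb scraper in item 3 is described as the one-page scraper plus a loop over every page of the list. Below is a minimal sketch of that idea, not the repository's actual code. The names `scrape_one_page`, `scrape_all_pages`, `fetch_page`, `max_pages` and `fake_fetch` are all hypothetical, and `fake_fetch` stands in for a real HTTP request plus HTML parsing so the example runs without network access.

```python
from typing import Callable, Dict, List

Row = Dict[str, str]
FetchPage = Callable[[int], List[Row]]


def scrape_one_page(fetch_page: FetchPage, page: int) -> List[Row]:
    """Return every row found on a single page of the list."""
    return fetch_page(page)


def scrape_all_pages(fetch_page: FetchPage, max_pages: int = 100) -> List[Row]:
    """Call the one-page scraper on page 1, 2, 3, ... until a page is empty.

    max_pages is a safety cap (an assumption of this sketch) so a site that
    never returns an empty page cannot cause an endless loop.
    """
    rows: List[Row] = []
    for page in range(1, max_pages + 1):
        page_rows = scrape_one_page(fetch_page, page)
        if not page_rows:
            break  # an empty page means the end of the list was reached
        rows.extend(page_rows)
    return rows


def fake_fetch(page: int) -> List[Row]:
    """Hypothetical stand-in for downloading and parsing one list page."""
    pages = {
        1: [{"title": "Movie A"}, {"title": "Movie B"}],
        2: [{"title": "Movie C"}],
    }
    return pages.get(page, [])


all_rows = scrape_all_pages(fake_fetch)
print(len(all_rows), "rows scraped")
```

Passing the page fetcher in as a function keeps the pagination loop independent of how a page is actually downloaded, so the same loop could wrap a Selenium-based or a requests-based one-page scraper.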
