lil-km/Selenium-Images-Scraper-for-Rawpixel-Website

Selenium Images Scraper 🕷 for Rawpixel Website

This project scrapes public domain images from the rawpixel website.

Prerequisites

  • Install Chrome Web Browser.
  • Install Chrome WebDriver, making sure its version is compatible with your Chrome Web Browser version.
  • Install Python 3.7.
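A version mismatch between Chrome and Chrome WebDriver is a common reason for Selenium failing to start. The sketch below is one possible way to compare the two, assuming you paste in the version strings yourself (for example from the chrome://version page and from `chromedriver --version`). The helper names `major_version` and `versions_compatible` are hypothetical and not part of this project, and matching on the major version only is a simplifying assumption.

```python
import re
from typing import Optional


def major_version(version_text: str) -> Optional[int]:
    """Return the major number of the first dotted version found, or None."""
    match = re.search(r"(\d+)\.\d+(?:\.\d+)*", version_text)
    return int(match.group(1)) if match else None


def versions_compatible(chrome_text: str, driver_text: str) -> bool:
    """Assumption: browser and driver are compatible when major versions match."""
    chrome_major = major_version(chrome_text)
    driver_major = major_version(driver_text)
    return chrome_major is not None and chrome_major == driver_major


if __name__ == "__main__":
    # Illustrative strings only; replace them with your own version output.
    chrome = "Google Chrome 114.0.5735.90"
    driver = "ChromeDriver 114.0.5735.90"
    print("compatible" if versions_compatible(chrome, driver) else "mismatch")
```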

Requirements

  • The requirements.txt file lists the Selenium and tqdm libraries.
  • Install the requirements with pip install -r requirements.txt. It is good practice to install dependencies in an isolated Python virtual environment.

Getting started

To start using this project, you need an account on the rawpixel website. Create one if you do not have it yet.

  1. Open Chrome Web Browser in debugging mode.

    1. Find the directory where your Chrome Web Browser application (chrome.exe) is installed, and copy its path.
    2. Add that path to the PATH system environment variable. Do not include chrome.exe itself in the path.
    3. Create a new directory for the browser to use as its user data directory. This avoids conflicts with the profile of your already installed Chrome Web Browser.
    4. Open a command prompt (cmd) and enter this command:
          chrome.exe --remote-debugging-port=9014 --user-data-dir="<absolute path to the created directory>"
      
      The command launches a Chrome Web Browser window in debugging mode.
    5. In the opened window (tab), navigate to the rawpixel website and log in with your account information, then go to a public domain images album of your choice.
      NOTE: This feature only works with Chrome Web Browser version 63 or later.
  2. Next, run the get_session_cookies.py Python script to save your session cookies. Here is the command to run:

    python code/get_session_cookies.py \
        --webdriver="<absolute path to chrome webdriver>"
    

    It saves the session cookies in the cookies.pkl file.
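The steps above can be sketched roughly as follows. This is not the contents of get_session_cookies.py, only a minimal illustration of how Selenium can attach to a Chrome window that was started with a remote debugging port and then pickle its cookies. It assumes the Selenium 4 API (`Service` and the `debuggerAddress` option), assumes the browser listens on 127.0.0.1 at the port used in the command above, and uses hypothetical helper names.

```python
import pickle

# Assumption: Chrome was launched locally with --remote-debugging-port=9014.
DEBUGGER_ADDRESS = "127.0.0.1:9014"


def attach_to_running_chrome(webdriver_path, debugger_address=DEBUGGER_ADDRESS):
    """Attach Selenium to the already running Chrome instead of starting a new one."""
    # Imported here so the cookie helpers below work without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service

    options = webdriver.ChromeOptions()
    options.add_experimental_option("debuggerAddress", debugger_address)
    return webdriver.Chrome(service=Service(webdriver_path), options=options)


def save_cookies(cookies, path="cookies.pkl"):
    """Write a list of cookie dictionaries to a pickle file."""
    with open(path, "wb") as handle:
        pickle.dump(cookies, handle)


def load_cookies(path="cookies.pkl"):
    """Read back the list of cookie dictionaries written by save_cookies."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


# Possible usage, with a placeholder path:
# driver = attach_to_running_chrome("<absolute path to chrome webdriver>")
# save_cookies(driver.get_cookies())
```

Attaching to the existing window means the login you performed by hand is reused, so no credentials need to be passed to the script.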

Running the scraper

  1. Finally, run the img_scraper_rawpixel.py Python script to download images into your specified directory. Here is the command to run:

    python code/img_scraper_rawpixel.py \
        --webdriver="<absolute path to chrome webdriver>" \
        --output_dir="<absolute path to output directory>" \
        --url="<url of your choice>"
    

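Command Arguments:

  • --webdriver: absolute path to the Chrome WebDriver.
  • --output_dir: absolute path to the output directory where the images are saved.
  • --url: the rawpixel URL of your choice to scrape.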
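For readers who want to see how flags like these are typically declared, here is a minimal argparse sketch. It mirrors the three flags shown in the command above, but it is not taken from img_scraper_rawpixel.py. Marking every flag as required and the help texts are assumptions.

```python
import argparse


def build_parser():
    """Build a parser for the three flags used in the scraper command."""
    parser = argparse.ArgumentParser(
        description="Illustrative parser for the rawpixel image scraper flags."
    )
    # Assumption: all three flags are mandatory.
    parser.add_argument("--webdriver", required=True,
                        help="absolute path to chrome webdriver")
    parser.add_argument("--output_dir", required=True,
                        help="absolute path to output directory")
    parser.add_argument("--url", required=True,
                        help="url of your choice")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.webdriver, args.output_dir, args.url)
```

argparse accepts both the `--flag=value` form used in this README and the `--flag value` form.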

Authors

K Tonpa.
