For the article "Webpages Are Getting Larger Every Year, and Here’s Why it Matters"
Author: Jorge Orpinel Perez
© 2018 Pingdom AB.
A Python script that uses Selenium and headless Chrome to determine the average page size across a list of websites. The measurement includes transferSize and any other content loaded dynamically to display the home page of each site.
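The idea can be sketched roughly as follows. This is not the code in from_list.py, only a minimal illustration under stated assumptions: the names `COLLECT_ENTRIES_JS`, `total_transfer_size`, `measure_page`, and `average_page_size` are hypothetical, page size is approximated by summing `transferSize` over the browser's navigation and resource timing entries, and chromedriver is assumed to be listening on its default port, 9515.

```python
from statistics import mean

# JavaScript evaluated in the loaded page (an assumption of this sketch):
# collect the navigation entry plus every resource entry recorded by the
# browser's Performance API, keeping only the name and transferSize.
COLLECT_ENTRIES_JS = """
return performance.getEntriesByType('navigation')
    .concat(performance.getEntriesByType('resource'))
    .map(e => ({name: e.name, transferSize: e.transferSize || 0}));
"""


def total_transfer_size(entries):
    """Sum the transferSize (in bytes) over a list of performance entries."""
    return sum(entry.get("transferSize", 0) for entry in entries)


def measure_page(url, executor="http://127.0.0.1:9515"):
    """Load url in headless Chrome and return its summed transferSize."""
    # Imported lazily so the pure helpers work without Selenium installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    options.add_argument("--headless")
    # Connect to an already running chromedriver instead of spawning one.
    driver = webdriver.Remote(command_executor=executor, options=options)
    try:
        driver.get(url)
        return total_transfer_size(driver.execute_script(COLLECT_ENTRIES_JS))
    finally:
        driver.quit()


def average_page_size(urls, measure=measure_page):
    """Average the measured size over urls; measure is injectable for testing."""
    return mean(measure(url) for url in urls)
```

Note that browsers report a `transferSize` of 0 for cross-origin resources that do not send a `Timing-Allow-Origin` header, so a sum like this can undercount.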
This tool was developed and run with Python 3.6.5 on macOS 10.13.
Later versions should continue to work.
- Chrome 68+ installed
- ChromeDriver on the system PATH (chromedriver 2.42 used)
See requirements.txt:
- Python language bindings for Selenium WebDriver (selenium 3.14 used)
To install, we will use virtualenv:
virtualenv venv
source venv/bin/activate
pip install -r requirements.txt
Virtualenv installs pip automatically.
Save a list of web page URIs (one per line) in a plain text file. Included in 2018-09-15-alexa-topsites-50-preview.txt is a sample list of 50 top sites published by Alexa (Sep 2018).
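As an illustration of the expected input format, a list file like this could be read as shown here. The helper name `read_urls` is hypothetical and the script itself may parse the file differently; this sketch simply takes every non-empty line as one URI.

```python
from pathlib import Path


def read_urls(path):
    """Return the non-empty lines of a text file, stripped of whitespace."""
    text = Path(path).read_text(encoding="utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]
```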
Make sure the script is executable by your user:
chmod u+x from_list.py
You may now run it:
chromedriver 2> /dev/null & # Listens on port 9515 by default (same as --port=9515). Runs in background.
./from_list.py 2018-09-15-alexa-top-sites-50.txt
See the file docstring in from_list.py for further info.
Don't forget to stop chromedriver after running the Python script, e.g.:
fg # Bring chromedriver to the foreground
^C # Ctrl+C
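If you would rather not manage the background job by hand, one option is to start and stop chromedriver from Python. This is only a sketch and not part of from_list.py: the `background_process` helper is hypothetical, and it assumes `chromedriver` is on the PATH.

```python
import subprocess
from contextlib import contextmanager


@contextmanager
def background_process(command=("chromedriver",)):
    """Run command in the background and stop it when the block exits."""
    proc = subprocess.Popen(
        list(command),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    try:
        yield proc
    finally:
        # Ask the process to exit; force-kill it if it does not comply.
        proc.terminate()
        try:
            proc.wait(timeout=5)
        except subprocess.TimeoutExpired:
            proc.kill()
            proc.wait()
```

With this, `with background_process(): ...` keeps chromedriver alive only for the duration of the block. The driver may need a moment after starting before it accepts connections.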