To support a Scientific Initiation project on asphalt impact, we developed a web scraping application that collects research data from EPDs (Environmental Product Declarations). The EPDs come from the Emerald Eco-Label EPD Tool. The current application extracts specific information from every EPD across 39 U.S. states; in total, we collected data from 1717 EPDs. The data is written as tables (matrices) in CSV format, and every state has its own CSV file.
This documentation is a work in progress. It assumes that you are a beginner in programming and unfamiliar with Python projects, so it may feel overly step-by-step if you are more experienced.
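As a rough illustration of the one-CSV-file-per-state output described above, the sketch below groups already-scraped records by state and writes one file per state. The function name `write_state_csvs` and the `state` key are hypothetical; they are not taken from the project's code, and the real column layout may differ.

```python
import csv
from collections import defaultdict
from pathlib import Path


def write_state_csvs(records, out_dir):
    """Write one CSV file per state and return the created paths (sorted by state)."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # Group the records by their (hypothetical) "state" key.
    by_state = defaultdict(list)
    for record in records:
        by_state[record["state"]].append(record)

    paths = []
    for state, rows in sorted(by_state.items()):
        path = out_dir / f"{state}.csv"
        # newline="" lets the csv module control line endings itself.
        with path.open("w", newline="", encoding="utf-8") as handle:
            writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()))
            writer.writeheader()
            writer.writerows(rows)
        paths.append(path)
    return paths
```

Each record is assumed to be a dictionary with the same keys, so the first record of a state can define the header row.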
Note
Last Update 18/10/2023
Running the application is easy, but first you need to install the project dependencies. Let's get to it.
First, make sure that Python 3 is installed on your machine; you can download it from here.
The project depends on the Requests and BeautifulSoup4 libraries. To avoid installing them globally (which may cause version conflicts with other Python projects on your computer), we strongly recommend using a virtual environment (venv). It takes a little more work, but it is worth it! Let's create your venv and install the libraries.
To create a virtual environment:
Windows:
py -m venv .venv
Mac/Linux:
python3 -m venv .venv
You should now have a .venv folder inside the project folder. To install the dependencies only in this .venv, first activate it with this command:
Windows:
.\.venv\Scripts\activate
Mac/Linux:
source .venv/bin/activate
If everything went well, you will now see '(.venv)' at the left of the console prompt, which shows that the virtual environment is active.
Important
Make sure to activate the venv; otherwise, the libraries will be installed globally.
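If you are unsure whether the venv is really active, Python itself can tell you. The sketch below is not part of the project; the function name `in_virtualenv` is made up for illustration. It relies on the fact that, inside an environment created with `python -m venv`, `sys.prefix` points to the venv folder while `sys.base_prefix` still points to the base Python installation.

```python
import sys


def in_virtualenv():
    """Return True when the running interpreter belongs to a virtual environment."""
    # Outside a venv both prefixes are identical; inside one they differ.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    if in_virtualenv():
        print(f"Virtual environment active: {sys.prefix}")
    else:
        print("No virtual environment active; pip would install globally.")
```

Save it to a file and run it with the same `py` or `python3` command you use for the project, after activating the venv.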
To install the dependencies:
pip install -r requirements.txt
To list the installed packages:
pip list
You should see these packages in the left column:
beautifulsoup4
bs4
certifi
charset-normalizer
idna
pip
requests
setuptools
soupsieve
urllib3
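As an alternative to reading the `pip list` output by eye, a short script can check that the two main dependencies are importable. This is only a sketch: the helper `missing_modules` is hypothetical and not part of the project. Note that BeautifulSoup4 is imported under the name `bs4`.

```python
import importlib.util


def missing_modules(names):
    """Return the module names from `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # "requests" and "bs4" are the import names of Requests and BeautifulSoup4.
    missing = missing_modules(["requests", "bs4"])
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All dependencies are installed.")
```

If anything is reported as missing, check that the venv is active and run the install command again.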
Done! The environment is finally set up and you can run the application.
Windows:
py run.py
Mac/Linux:
python3 run.py
After running the scraper, you can deactivate the virtual environment with this command:
deactivate