A Python script that scrapes a given set of websites and tests whether each one is an e-commerce website.
Developed and tested with Python 3.x on Windows. It should also run in UNIX or macOS environments.
What you need to install before running the script:
Python 3.x
Install the following third-party modules with pip:
pandas : pip install pandas
requests : pip install requests
The rest of the modules ship with the Python 3.x distribution. If one is missing, install it with pip install <module-name>.
Run the script from the command line:
python ecom_website_selector.py -i <input file listing all websites> -o <output CSV file in which to store the results>
The -o option is not mandatory.
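The command line above could be parsed along the following lines. This is only a sketch, not the script's actual code: the `-i` and `-o` flags come from the usage line, while the function name `parse_args`, the long option names `--input` and `--output`, and the `None` default for the output file are assumptions made for illustration.

```python
import argparse


def parse_args(argv=None):
    """Parse the -i/-o options shown in the usage line (illustrative sketch)."""
    parser = argparse.ArgumentParser(
        description="Test whether each website in a list is an e-commerce site."
    )
    # -i is required: a .txt file listing the websites to test.
    parser.add_argument("-i", "--input", required=True,
                        help="input .txt file listing the websites")
    # -o is optional; the None default is an assumption, not documented behavior.
    parser.add_argument("-o", "--output", default=None,
                        help="output CSV file for the results (optional)")
    return parser.parse_args(argv)


if __name__ == "__main__":
    args = parse_args()
    print("input:", args.input, "output:", args.output)
```

Passing a list such as `parse_args(["-i", "websites.txt"])` makes it easy to exercise the parser without touching the real command line.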
- Reads all websites from the given input .txt file and stores them in a list.
- Iterates over each website in the list to test whether it is an e-commerce website.
- Uses the requests module to get the HTML markup of the website's home page.
- Looks for a given set of links in the HTML markup to confirm whether the site is e-commerce.
- If a website cannot be reached at the time of the test, the script marks it as a non-e-commerce site because of the HTTP error.
- Relies heavily on a given set of links to decide whether a website is e-commerce.
-> Possible improvement: a more efficient web scraping design to validate whether a site is e-commerce.
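The flow described in the list above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in ecom_website_selector.py: the function names are hypothetical, and the `ECOM_MARKERS` link fragments are invented placeholders, since this README does not list the links the script actually checks. Writing the results to CSV is left out.

```python
# Hypothetical link fragments; the script's real set of links is not shown here.
ECOM_MARKERS = ("/cart", "/checkout", "add-to-cart")


def read_websites(path):
    """Return the non-empty lines of the input .txt file as a list of websites."""
    with open(path, encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]


def fetch_home_page(url, timeout=10):
    """Return the home page HTML, or None if the site cannot be reached."""
    # Imported lazily so the pure helpers below work without requests installed.
    import requests
    try:
        response = requests.get(url, timeout=timeout)
        response.raise_for_status()  # treat HTTP 4xx/5xx as unreachable
    except requests.RequestException:
        return None
    return response.text


def is_ecommerce(html, markers=ECOM_MARKERS):
    """Return True if any marker appears in the markup; None counts as False."""
    if html is None:
        return False  # unreachable sites are marked as non-e-commerce
    lowered = html.lower()
    return any(marker in lowered for marker in markers)


def classify_sites(sites, fetch=fetch_home_page):
    """Return a list of (site, is_ecommerce) pairs for the given websites."""
    return [(site, is_ecommerce(fetch(site))) for site in sites]
```

Because `classify_sites` takes the fetch function as a parameter, the matching logic can be checked offline by passing a stub that returns canned HTML. A plain substring match like this also shows the limitation noted above: the result is only as good as the chosen set of links.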