Web-Scraping-Amazon-MensFashion-Search-Images-Beautiful-Soup-Python

Using Python and Beautiful Soup, this code downloads images from the Men's Fashion department (under Clothing) on amazon.com. For every search listed in the keywords file, it downloads the top N images.

Beautiful Soup parses a webpage's HTML so we can easily access any of its tags, and even narrow the results down by class. It's really easy to use; just check the code and you will get the gist of how it works.
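To make the tag-and-class idea concrete, here is a minimal sketch. The HTML snippet, the class names (`product-image`, `banner`) and the helper `image_urls` are made up for illustration; Amazon's real markup is different and changes over time, so this is not the selector the repository's code uses.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML; real Amazon pages use different tags and class names.
SAMPLE_HTML = """
<div class="result">
  <img class="product-image" src="https://example.com/a.jpg" alt="Red jacket">
  <img class="product-image" src="https://example.com/b.jpg" alt="Blue jacket">
  <img class="banner" src="https://example.com/ad.jpg" alt="Ad">
</div>
"""


def image_urls(html, css_class, limit):
    """Return the src of the first `limit` <img> tags that carry `css_class`."""
    soup = BeautifulSoup(html, "html.parser")
    # find_all filters by tag name and class; `limit` stops after N matches.
    tags = soup.find_all("img", class_=css_class, limit=limit)
    return [tag["src"] for tag in tags if tag.has_attr("src")]


print(image_urls(SAMPLE_HTML, "product-image", 2))
```

Filtering by class is what keeps unrelated images (like the `banner` one here) out of the results.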

My Use Case

I have used Amazon's Clothing -> Men's Fashion department to scrape images from. A file named keywords.txt contains all the keywords that will be searched. Each line should be filled in using this format:

"(Number of images you want) (((SINGLE SPACE))) (Keywords separated by spaces)"

Example: "12 red jacket" downloads the first 12 images returned when "red jacket" is searched in Amazon's Men's Fashion department.
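As a rough sketch of how a line in that keywords.txt format could be read, the helpers below split off the leading count and keep the rest as the search phrase. The function names `parse_keyword_line` and `parse_keywords` are hypothetical and may not match what the actual script does.

```python
def parse_keyword_line(line):
    """Split a line such as '12 red jacket' into (12, 'red jacket')."""
    # Everything before the first space is the count; the rest is the query.
    count_text, _, query = line.strip().partition(" ")
    query = query.strip()
    if not count_text.isdecimal() or not query:
        raise ValueError(f"expected '<count> <keywords>', got: {line!r}")
    return int(count_text), query


def parse_keywords(text):
    """Parse the whole file contents, skipping blank lines."""
    return [parse_keyword_line(line) for line in text.splitlines() if line.strip()]


print(parse_keywords("12 red jacket\n5 blue jeans\n"))
```

Rejecting malformed lines early avoids sending an empty or nonsensical search to the site.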

The code is quite easy and self-explanatory.

Output

[Screenshot: 2018-08-11, 11:57:54 PM]

Images are stored in an individual folder for each search, as listed in the keywords file.

[Screenshot: 2018-08-11, 11:58:02 PM] [Screenshot: 2018-08-11, 11:58:55 PM]

Issues with this

You may sometimes get an error that says "urllib.error.HTTPError: HTTP Error 503: Service Unavailable". Don't worry, it happens. Amazon doesn't allow automated access to its data, so it rejects requests that don't appear to come from a regular browser. Try again after some time, or maybe the next day; it usually works.
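The "try again later" advice can also be automated. The sketch below retries only on HTTP 503 and waits longer after each failure. The helper name `fetch_with_retry`, the `User-Agent` value, and the delay settings are all assumptions for illustration; none of them come from the repository, and there is no guarantee that Amazon will accept the request.

```python
import time
import urllib.error
import urllib.request

# Assumed browser-like header; the exact value is illustrative only.
HEADERS = {"User-Agent": "Mozilla/5.0"}


def fetch_with_retry(url, attempts=3, base_delay=2.0,
                     opener=urllib.request.urlopen, sleep=time.sleep):
    """Fetch `url`, retrying on HTTP 503 with exponentially growing delays."""
    for attempt in range(attempts):
        request = urllib.request.Request(url, headers=HEADERS)
        try:
            with opener(request) as response:
                return response.read()
        except urllib.error.HTTPError as err:
            # Give up on any other error, or once the last attempt has failed.
            if err.code != 503 or attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)  # 2s, 4s, 8s, ...
```

The `opener` and `sleep` parameters exist only so the function can be exercised without real network calls or real waiting; in normal use you would call `fetch_with_retry(url)` and leave them at their defaults.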

Happy Coding :)