Image Data Collection Tool for Object Detection, Segmentation & Classification, powered by Web Scraping (Google Images) ~ Image Scraping Peeps!
- Clone the Repository:
git clone https://github.com/Sid-047/Image-DataCollection.git
- Navigate to the Project Directory:
cd Image-DataCollection
- Install Dependencies:
pip install -r requirements.txt
Note: Mozilla Firefox is the recommended web browser. Install it with the command for your OS:
Windows
winget install Mozilla.Firefox
macOS
brew install firefox
Linux
sudo snap install firefox
- Wait, Wanna Create a QueryList?
python queryList.py
Here it Comes!
Come On Start Entering the Search QueryKeyWords Yo! Enter 'Exit' to Finish
<Search Keyword Query1>
<Search Keyword Query2>
. . .
<Search Keyword QueryN>
Exit
The Search KeyWord Query List Yo!
['<Search Keyword Query1>', '<Search Keyword Query2>', ..., '<Search Keyword QueryN>']
Now copy the QueryList
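To make the prompt loop above concrete, here is a minimal sketch of how such a list could be collected. It is not the actual code of queryList.py: the function name `build_query_list`, the `sentinel` parameter, and the exact-match handling of `Exit` are all assumptions made for illustration.

```python
def build_query_list(lines, sentinel="Exit"):
    """Collect non-empty search queries until the sentinel line is seen.

    Hypothetical helper: matching 'Exit' exactly (case-sensitive) is an
    assumption, not confirmed behavior of queryList.py.
    """
    queries = []
    for line in lines:
        text = line.strip()
        if text == sentinel:
            break  # stop reading; anything after the sentinel is ignored
        if text:  # skip blank lines
            queries.append(text)
    return queries


# Usage with a fixed list standing in for the lines a user would type.
sample_input = ["golden retriever", "  tabby cat  ", "", "Exit", "never read"]
print(build_query_list(sample_input))  # ['golden retriever', 'tabby cat']
```

The printed list has the same shape as the `q = [...]` line expected by ImgScrapping.py, so it can be pasted in directly.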
- List the Search Queries:
# ImgScrapping.py
q = ['<Search Keyword Query>', '<Search Keyword Query>', '<Search Keyword Query>']
Edit this line of code, or paste the QueryList from the previous step.
- Run the Tool:
python ImgScrapping.py
- Boom! That is it.
- But Wait! What if your Program Crashed? No Worries:
python URLset_convo.py
Select the right TimeStamp, then you're Good to Go!
- Just the Last One:
python ImgDown.py
You should now see the Image Files being written.
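For a rough idea of what a threaded download step with a timeout can look like, here is a sketch built only on the Python standard library. It is not the code of ImgDown.py: the names `fetch_bytes` and `download_all`, the 10-second timeout, and the worker count are all assumptions chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from urllib.request import urlopen


def fetch_bytes(url, timeout=10):
    """Download one URL; blocking socket operations give up after `timeout` seconds."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


def download_all(urls, fetch=fetch_bytes, max_workers=8):
    """Fetch URLs concurrently and return (results, failures), both keyed by URL.

    `fetch` is injectable so the function can be tried without network access.
    """
    results, failures = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each future back to its URL so errors can be attributed.
        futures = {pool.submit(fetch, url): url for url in urls}
        for future in as_completed(futures):
            url = futures[future]
            try:
                results[url] = future.result()
            except Exception as exc:  # timeouts, HTTP errors, malformed URLs
                failures[url] = exc
    return results, failures
```

One slow or dead URL only ties up a single worker until its timeout expires, so the remaining downloads keep moving. Failed URLs are collected rather than raised, which makes it easy to retry them later.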
- Automated Image Web Scraping via Selenium.
- The image URLs are backed up to a .txt file in real time.
- Image files are written dynamically, without overwriting existing files.
- Threading and timeouts are used to write the Image files efficiently.
- The Image URLs are scraped first; the Image downloads are initiated afterwards.
- The QueryList can be generated via the built-in tool from user input, one query per line.
- Should a glitch disrupt the execution, Fear Not! The URLs stored in the .txt files can be used to initiate Image downloads via ImgDown.py.
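Two of the ideas above, backing up URLs as they are found and never overwriting an existing image, can be sketched in a few lines. This is an illustration under stated assumptions, not the tool's actual implementation: the helper names `unique_path`, `log_url`, and `read_urls`, the `_1`, `_2` suffix scheme, and the `.jpg` default are all hypothetical.

```python
from pathlib import Path


def unique_path(directory, stem, suffix=".jpg"):
    """Return a path inside `directory` that does not exist yet.

    Hypothetical naming scheme: cat.jpg, cat_1.jpg, cat_2.jpg, ...
    """
    directory = Path(directory)
    candidate = directory / f"{stem}{suffix}"
    counter = 1
    while candidate.exists():
        candidate = directory / f"{stem}_{counter}{suffix}"
        counter += 1
    return candidate


def log_url(log_file, url):
    """Append one URL per line; closing the file after each write flushes
    Python's buffer, so already-logged URLs are not lost if the script crashes."""
    with open(log_file, "a", encoding="utf-8") as handle:
        handle.write(url + "\n")


def read_urls(log_file):
    """Read logged URLs back, dropping blank lines and duplicates, keeping order."""
    with open(log_file, encoding="utf-8") as handle:
        return list(dict.fromkeys(line.strip() for line in handle if line.strip()))
```

Note that `unique_path` checks and then returns, so two threads asking for the same stem at the same moment could still collide. A real multi-threaded writer would need a lock around the check-and-write, or open the file in exclusive-create mode (`"xb"`).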
This project is licensed under the MIT License - see the LICENSE file for details.