Forked repository and added conversion python script

The added script is convert_annotations.py.

Use the toolkit as usual to gather images from the Open Images Dataset. After gathering the images, run the following from the root directory:

python convert_annotations.py

This generates .txt annotation files in the format required for custom object detection with YOLOv3. The text files are written to the same folder as the images.
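To make the target format concrete, here is a minimal sketch of the kind of conversion involved. It is not the code in convert_annotations.py. It assumes each toolkit label line has the form `ClassName XMin YMin XMax YMax` with pixel coordinates, and that YOLO expects `class_id x_center y_center width height` normalized to the image size. The function name `oid_line_to_yolo` and its parameters are hypothetical.

```python
def oid_line_to_yolo(line, class_names, img_width, img_height):
    """Convert one 'ClassName XMin YMin XMax YMax' line (pixels, assumed layout)
    into a YOLO line 'class_id x_center y_center width height' (normalized)."""
    parts = line.split()
    # The last four tokens are the box; everything before them is the
    # class name, which may contain spaces (e.g. "Traffic light").
    name = " ".join(parts[:-4])
    x_min, y_min, x_max, y_max = map(float, parts[-4:])

    # The class id is the position of the name in the class list.
    class_id = class_names.index(name)

    # Box center and size, normalized by the image dimensions.
    x_center = (x_min + x_max) / 2.0 / img_width
    y_center = (y_min + y_max) / 2.0 / img_height
    width = (x_max - x_min) / img_width
    height = (y_max - y_min) / img_height

    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# Example with made-up values: a 400x500 image and two classes.
print(oid_line_to_yolo("Traffic light 100 50 300 250",
                       ["Apple", "Traffic light"], 400, 500))
# prints: 1 0.500000 0.300000 0.500000 0.400000
```

Splitting on the last four tokens keeps multi-word class names intact, which matters if the class list contains names with spaces.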

~ OIDv4 ToolKit ~

Do you want to build your personal object detector but you don't have enough images to train your model? Do you want to train your personal image classifier, but you are tired of the deadly slowness of ImageNet? Have you already discovered Open Images Dataset v4 that has 600 classes and more than 1,700,000 images with related bounding boxes ready to use? Do you want to exploit it for your projects but you don't want to download gigabytes and gigabytes of data!?

With this repository we can help you get the best out of this dataset with as little effort as possible. In particular, this practical ToolKit written in Python3 gives you the following options for both object detection and image classification tasks:

(2.0) Object Detection

download any of the 600 classes of the dataset individually, creating the related bounding boxes for each downloaded image
download multiple classes at the same time, creating separate folders and bounding boxes for each of them
download multiple classes into a common folder for all of them, with a single annotation file for each image
download a single class or multiple classes with the desired attributes
use the practical visualizer to inspect the downloaded classes

(3.0) Image Classification

download any of the 19,794 classes into a common labeled folder
use the many available commands to select only the desired images (e.g. only test images)

The code is well documented and designed to be easy to extend and improve. Angelo and I are pleased if our little bit of code can help you with your project and research. Enjoy ;)

Open Image Dataset v4

All the information related to this huge dataset can be found here. The next few lines simply summarize some statistics and important tips.

Object Detection

	Train	Validation	Test	#Classes
Images	1,743,042	41,620	125,436	-
Boxes	14,610,229	204,621	625,282	600

Image Classification

	Train	Validation	Test	#Classes
Images	9,011,219	41,620	125,436	-
Machine-Generated Labels	78,977,695	512,093	1,545,835	7,870
Human-Verified Labels	27,894,289	551,390	1,667,399	19,794

As the previous tables show, we have access to images from three different groups: train, validation and test. The ToolKit provides a way to restrict the search to a specific group. Regarding object detection, it's important to underline that some annotations have been made as a group, meaning that a single bounding box covers more than one instance. As mentioned by the creators of the dataset:

IsGroupOf: Indicates that the box spans a group of objects (e.g., a bed of flowers or a crowd of people). We asked annotators to use this tag for cases with more than 5 instances which are heavily occluding each other and are physically touching.

This is another option of the ToolKit that can be used to select only the desired images.
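As an illustration of what filtering on the IsGroupOf flag means, the following sketch drops grouped boxes from a few annotation rows. It is not how the ToolKit implements the option. The column names follow a subset of the dataset's box annotation CSV as far as we know it; the image IDs, label IDs and coordinates are made up, and `drop_group_boxes` is a hypothetical helper.

```python
import csv
import io

# Made-up sample rows; only a subset of the annotation columns is shown.
SAMPLE_CSV = """ImageID,LabelName,XMin,XMax,YMin,YMax,IsGroupOf
img_a,/m/example1,0.10,0.40,0.20,0.60,0
img_b,/m/example1,0.00,1.00,0.00,1.00,1
img_c,/m/example2,0.30,0.50,0.30,0.70,0
"""


def drop_group_boxes(rows):
    """Keep only rows whose box is not flagged as spanning a group of objects."""
    return [row for row in rows if row["IsGroupOf"] != "1"]


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
kept = drop_group_boxes(rows)

# Only the boxes that cover a single instance remain.
print([row["ImageID"] for row in kept])
# prints: ['img_a', 'img_c']
```

Whether to keep or discard grouped boxes depends on the task: a detector for individual objects usually benefits from dropping them, while a coarse "is this class present" model may not care.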

Finally, it's interesting to notice that not all annotations have been produced by humans: the creators also exploited an enhanced version of the method reported here [1].

