The Flickr8K dataset is a good choice when getting started with image captioning.
It is realistic yet relatively small, so you can download it and build models on your workstation using a CPU.
You can use the link below to request the dataset:
https://illinois.edu/fb/sec/1713398
Within a short time, you will receive an email that contains links to two files:
Flickr8k_Dataset.zip (1 Gigabyte): An archive of all photographs.
Flickr8k_text.zip (2.2 Megabytes): An archive of all text descriptions for photographs.
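Once both files are downloaded, they need to be unpacked before use. The snippet below is a minimal sketch of one way to do that with Python's standard zipfile module. It assumes the two archives sit in the current working directory; the helper name extract_archive and the destination folder names are illustrative choices, not something prescribed by the dataset.

```python
import zipfile
from pathlib import Path


def extract_archive(zip_path, dest_dir):
    """Extract zip_path into dest_dir and return the archive's member names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
        return archive.namelist()


if __name__ == "__main__":
    # Archive names are those listed in the email; the destination
    # folder names are assumptions made for this example.
    archives = [
        ("Flickr8k_Dataset.zip", "Flickr8k_Dataset"),
        ("Flickr8k_text.zip", "Flickr8k_text"),
    ]
    for zip_name, folder in archives:
        if Path(zip_name).exists():
            names = extract_archive(zip_name, folder)
            print(f"{zip_name}: extracted {len(names)} entries into {folder}/")
        else:
            print(f"{zip_name} not found; download it first")
```

Extracting each archive into its own folder keeps the photographs and the text descriptions separate, which makes later loading code easier to follow.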