Image-Captioning

Image Captioning using Recurrent Neural Networks

In this project, we use deep neural network models to caption Flickr images.
The dataset contains 8091 images; each image has an ID and 5 captions.
We use a pretrained BERT model to obtain the embeddings and an LSTM layer to generate the captions.
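
The README does not say how the captions are turned into training examples. A common approach for this kind of model, assumed here and not confirmed by the project, is to expand each caption into (prefix, next word) pairs and left-pad every prefix to a fixed length. The sketch below uses only the standard library; the names `build_vocab` and `make_samples`, the `startseq`/`endseq` markers, the whitespace tokenization, and the sample image ID are all hypothetical.

```python
def build_vocab(captions_by_id):
    """Map each word to an integer index; index 0 is reserved for padding."""
    words = sorted(
        {w for caps in captions_by_id.values() for c in caps for w in c.lower().split()}
    )
    return {w: i for i, w in enumerate(words, start=1)}


def make_samples(captions_by_id, vocab, max_length):
    """Expand every caption into (image_id, padded_prefix, next_word_index) triples."""
    samples = []
    for image_id, caps in captions_by_id.items():
        for caption in caps:
            seq = [vocab[w] for w in caption.lower().split()]
            for i in range(1, len(seq)):
                prefix = seq[:i][-max_length:]  # keep at most max_length tokens
                padded = [0] * (max_length - len(prefix)) + prefix  # left-pad with 0
                samples.append((image_id, padded, seq[i]))
    return samples


# Hypothetical data: one image ID with two of its captions (assumed format).
captions_by_id = {
    "img_001": ["startseq a dog runs endseq", "startseq a dog plays endseq"],
}
vocab = build_vocab(captions_by_id)
samples = make_samples(captions_by_id, vocab, max_length=6)
```

Under these assumptions, the vocabulary size passed to the model would be `len(vocab) + 1`, because index 0 is kept for padding.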

Model Architecture

|--------------------------------|   | -------------------------------|
|     pictures_input(2048,)      |   |   captions_input(max_length,)  |
|--------------------------------|   | -------------------------------|
                 ↓                                   ↓               
|--------------------------------|   | -------------------------------|
|          Dropout(0.5)          |   |  Embedding(vocab_size, 128)    |
|--------------------------------|   | -------------------------------|
                 ↓                                   ↓
|--------------------------------|   | -------------------------------|
|         Dense(256, relu)       |   |           LSTM(128)            |
|--------------------------------|   | -------------------------------|
                 ↓                                    ↓
|--------------------------------|                    ↓
|          Dropout(0.5)          |                    ↓    
|--------------------------------|                    ↓
                 ↓                                    ↓   
|--------------------------------|                    ↓
|         Dense(256, relu)       |                    ↓
|--------------------------------|                    ↓
                 ↓                                    ↓
| --------------------------------------------------------------------|
|                             Concatenate                             |
| --------------------------------------------------------------------|
                                  ↓
| --------------------------------------------------------------------|
|                          Dense(128, relu)                           |
| --------------------------------------------------------------------|
                                  ↓
| --------------------------------------------------------------------|
|                          Dense(vocab_size, softmax)                 |
| --------------------------------------------------------------------|
                                  ↓
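
The diagram above can be written down as a Keras functional model. This is a minimal sketch that follows only the layers shown in the diagram; it assumes TensorFlow's Keras API, and the function name `build_caption_model` and the variable names are illustrative, not taken from the project code. The diagram does not state the loss, the optimizer, or how the BERT embeddings enter the model, so none of these are included here.

```python
from tensorflow.keras import Model, layers


def build_caption_model(vocab_size: int, max_length: int) -> Model:
    """Two-branch captioning model following the diagram (a sketch, not the project's code)."""
    # Image branch: 2048-dimensional feature vector per picture.
    pictures_input = layers.Input(shape=(2048,), name="pictures_input")
    x = layers.Dropout(0.5)(pictures_input)
    x = layers.Dense(256, activation="relu")(x)
    x = layers.Dropout(0.5)(x)
    x = layers.Dense(256, activation="relu")(x)

    # Caption branch: sequence of word indices of length max_length.
    captions_input = layers.Input(shape=(max_length,), name="captions_input")
    y = layers.Embedding(vocab_size, 128)(captions_input)
    y = layers.LSTM(128)(y)

    # Merge both branches and predict a distribution over the vocabulary.
    merged = layers.Concatenate()([x, y])
    z = layers.Dense(128, activation="relu")(merged)
    outputs = layers.Dense(vocab_size, activation="softmax")(z)

    return Model(inputs=[pictures_input, captions_input], outputs=outputs)


# Example values chosen for illustration only.
model = build_caption_model(vocab_size=1000, max_length=30)
```

Concatenating the 256-dimensional image vector with the 128-dimensional LSTM output gives a 384-dimensional vector, which the final two dense layers map to a probability for each word in the vocabulary.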
