
Generates suitable captions for the images of people and animals input by the user.


parask11/image-captioner


Image Captioner

Getting Started

Image Captioner generates a caption for each image you give it. For example:

[Images: result1, result2, result3]

File descriptions

  1. app.py: Main script; run it to start the server.
  2. generate_captions.py: Python module that compiles the AI model and makes predictions.
  3. embedding_matrix.pkl: Matrix of word embeddings for the vocabulary.
  4. train_descriptions.pkl: Dictionary that maps image names to their captions in the training data.
  5. word_to_index.pkl: Dictionary that maps each word in the vocabulary to its index number.
  6. index_to_word.pkl: Dictionary that maps each index number to its word in the vocabulary.
  7. results: Contains sample results from testing.
  8. static: Stores the images uploaded by the user while generating captions.
  9. templates: Contains index.html, which generates the UI.
  10. preparing_data.ipynb: Jupyter Notebook that prepares the data for training.
  11. training_model.ipynb: Jupyter Notebook that trains the model.
  12. generate_captions.ipynb: Jupyter Notebook that imports all the essentials and generates the captions.
  13. model_weights: Folder that contains models 28-40 of the 40 saved during the 40 training epochs. (Models below 28 were not useful.)
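
The two vocabulary dictionaries listed above are inverses of each other. The following is a minimal sketch of how such a pair could be built, pickled, and used to encode and decode a caption. The toy captions, the helper names (build_vocab, encode, decode), and the choice to start indices at 1 are assumptions made for illustration; the repository's actual .pkl files may be structured differently.

```python
import pickle
import tempfile
from pathlib import Path


def build_vocab(captions):
    """Return (word_to_index, index_to_word) for every word in the captions."""
    words = sorted({w for caption in captions for w in caption.split()})
    # Assumption: indices start at 1 so that 0 stays free for padding.
    word_to_index = {w: i for i, w in enumerate(words, start=1)}
    index_to_word = {i: w for w, i in word_to_index.items()}
    return word_to_index, index_to_word


def encode(caption, word_to_index):
    """Map a caption to a list of indices, skipping unknown words."""
    return [word_to_index[w] for w in caption.split() if w in word_to_index]


def decode(indices, index_to_word):
    """Map a list of indices back to a caption string."""
    return " ".join(index_to_word[i] for i in indices)


captions = ["a dog runs on grass", "a man rides a horse"]  # toy data
word_to_index, index_to_word = build_vocab(captions)

# Round-trip one dictionary through pickle, the format the .pkl files use.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "word_to_index.pkl"
    with open(path, "wb") as f:
        pickle.dump(word_to_index, f)
    with open(path, "rb") as f:
        loaded = pickle.load(f)

ids = encode("a dog rides a horse", loaded)
print(ids)
print(decode(ids, index_to_word))
```

Only unpickle files you trust, since pickle.load can execute arbitrary code from a crafted file.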

External Data

  1. glove.6B.50d.txt: Text file that maps words to their corresponding 50-dimensional vectors. (Link: https://drive.google.com/open?id=1mqHRTOyF87fHoiuRZwOlgcYwcCynQ5Ki)
  2. encoding_train_features.pkl: Dictionary that maps training images to their corresponding 2048-dimensional vectors. (Link: https://drive.google.com/open?id=1qO4fgm8qUu0eIslMpg6oqqmcZil5qs9k)
  3. flickr30k_images: Training images and their captions. (Link: https://www.kaggle.com/hsankesara/flickr-image-dataset)
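
To illustrate how a GloVe text file relates to an embedding matrix, here is a rough sketch that parses lines of the form "word v1 v2 ... v50" and fills one matrix row per vocabulary word. It is not taken from the repository's notebooks: the function names (load_glove, build_embedding_matrix), the toy sample lines, and the assumption that indices run contiguously from 1 with row 0 reserved for padding are all hypothetical.

```python
EMBEDDING_DIM = 50  # glove.6B.50d.txt holds 50 values per word


def load_glove(lines):
    """Parse GloVe text lines ("word v1 v2 ... v50") into {word: [floats]}."""
    vectors = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) != EMBEDDING_DIM + 1:
            continue  # skip blank or malformed lines
        vectors[parts[0]] = [float(v) for v in parts[1:]]
    return vectors


def build_embedding_matrix(word_to_index, vectors):
    """Row i holds the vector of the word with index i; unknown words stay zero."""
    # Assumption: indices are 1..N, so row 0 is left as an all-zero padding row.
    matrix = [[0.0] * EMBEDDING_DIM for _ in range(len(word_to_index) + 1)]
    for word, idx in word_to_index.items():
        if word in vectors:
            matrix[idx] = vectors[word]
    return matrix


# Toy stand-in for the real file: two fake 50-dimensional entries.
sample_lines = [
    "dog " + " ".join(["0.5"] * EMBEDDING_DIM),
    "cat " + " ".join(["-0.25"] * EMBEDDING_DIM),
]
vectors = load_glove(sample_lines)
matrix = build_embedding_matrix({"dog": 1, "cat": 2, "zebra": 3}, vectors)
print(len(matrix), len(matrix[0]))
```

With the real file, an open file object can be passed in place of sample_lines, for example load_glove(open("glove.6B.50d.txt", encoding="utf-8")). The nested list can be converted to an array type before it is handed to a model.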

Making virtual environment (Optional but recommended)

  1. Make the environment: python -m venv captioner (for any other name: python -m venv <name>)

  2. Activate the environment: source captioner/bin/activate (for any other name: source <name>/bin/activate)

Installation

  1. Clone the repository: git clone https://github.com/parask11/image-captioner.git

  2. Go into the directory: cd image-captioner

  3. Install the requirements: pip install -r requirements.txt

Running

Run the Python script: python app.py

This starts a server.

Open the link in a browser: localhost:5000

The UI will appear. Upload images and generate the captions!