CSE 847 Machine Learning Final Project

Advanced Image Captioning using Attention Mechanism and Word Embeddings

Presentation link - https://youtu.be/8ChOLvpfkzc

Problem Statement - Image captioning is the automatic generation of one or more natural-language sentences that describe an image. In recent years the field has advanced rapidly, moving from early template-based models to deep neural network approaches. This report surveys recent image captioning research, focusing on models that combine convolutional neural networks (CNNs) with long short-term memory (LSTM) networks. We examine the merits of two strategies, attention mechanisms and word embeddings, and review the evaluation metrics and datasets commonly used to assess them.
