CSE 847 Machine Learning Final Project

Advanced Image Captioning using Attention Mechanism and Word Embeddings

Presentation link - https://youtu.be/8ChOLvpfkzc

Problem Statement - Image captioning is the automatic generation of one or more natural-language sentences that describe an image. In recent years the field has advanced rapidly, moving from early template-based models to deep neural network approaches. This report provides an overview of recent image captioning research, focusing on models that combine convolutional neural networks (CNNs) with long short-term memory (LSTM) networks. We examine the merits of two strategies, attention mechanisms and word embeddings, and review the evaluation metrics and datasets commonly used to assess them.
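
To make the attention idea concrete, here is a minimal sketch of one soft-attention step, in which a decoder state weights a set of CNN region features and produces a context vector for the next word. It is an illustration under assumptions, not the project's implementation: the names `attend`, `region_features`, and `hidden_state` are hypothetical, the toy vectors are made up, and plain dot-product scoring stands in for whatever learned scoring function a real model would use.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(region_features, hidden_state):
    """One soft-attention step (illustrative only).

    region_features: list of feature vectors, one per image region
                     (in a real model these would come from a CNN).
    hidden_state:    current decoder state (in a real model, from the LSTM).
    Assumes both use the same dimensionality so a dot product is defined.
    """
    # Score each region by its dot product with the decoder state.
    scores = [
        sum(f * h for f, h in zip(region, hidden_state))
        for region in region_features
    ]
    # Normalize the scores into attention weights that sum to 1.
    weights = softmax(scores)
    # Context vector: attention-weighted average of the region features.
    dim = len(region_features[0])
    context = [
        sum(w * region[d] for w, region in zip(weights, region_features))
        for d in range(dim)
    ]
    return weights, context


# Toy input: three image regions with 2-dimensional features.
regions = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
hidden = [2.0, 0.0]
weights, context = attend(regions, hidden)
print(weights, context)
```

In a captioning model, the context vector would typically be fed to the LSTM together with the embedding of the previously generated word, so that each word is predicted from the image regions most relevant at that step.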