# Show and Tell, but on video
A quick Jupyter mini-guide on how to use the COCO Python API is included.

```shell
pip install pycocotools-windows
```
Run the make script to fetch the COCO dataset (2017 challenge, ~50 GB; requires GNU wget).

Download the YOLO weights (the class names and config file are included in this repo, but the weights are too big to ship):

```shell
cd YOLO
wget https://pjreddie.com/media/files/yolov3.weights
```
Train:

```shell
python train.py
```

Run:

```shell
python run.py
```
Following the original architecture (and repo), this project uses ResNet-152 as the encoder and an LSTM as the decoder.
| CNN Encoder | RNN Decoder |
| --- | --- |
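As a sketch of this encoder-decoder setup (hypothetical layer sizes and class name; the repo's actual hyperparameters may differ), the pooled CNN feature can seed the LSTM's initial state before word logits are produced:

```python
import torch
import torch.nn as nn

class CaptionDecoder(nn.Module):
    """LSTM decoder sketch: a pooled image feature (e.g. from ResNet-152)
    initializes the hidden state, then caption tokens are scored step by step.
    Sizes are illustrative, not the repo's exact configuration."""
    def __init__(self, feat_dim=2048, embed_dim=256, hidden_dim=512, vocab_size=10000):
        super().__init__()
        self.init_h = nn.Linear(feat_dim, hidden_dim)  # CNN feature -> h0
        self.init_c = nn.Linear(feat_dim, hidden_dim)  # CNN feature -> c0
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.fc = nn.Linear(hidden_dim, vocab_size)    # hidden -> vocab logits

    def forward(self, feats, captions):
        h0 = self.init_h(feats).unsqueeze(0)           # (1, batch, hidden)
        c0 = self.init_c(feats).unsqueeze(0)
        emb = self.embed(captions)                     # (batch, seq, embed)
        out, _ = self.lstm(emb, (h0, c0))
        return self.fc(out)                            # (batch, seq, vocab)

feats = torch.randn(2, 2048)                # stand-in for ResNet-152 pooled features
caps = torch.randint(0, 10000, (2, 7))      # dummy token ids
logits = CaptionDecoder()(feats, caps)
print(logits.shape)                         # torch.Size([2, 7, 10000])
```

Note the original Show and Tell paper feeds the image embedding as the first LSTM input; seeding the hidden state, as above, is a common variant of the same design.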
Darknet's YOLO is used to constrain where the model should look.
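For illustration, here is a minimal helper (hypothetical; not the repo's actual post-processing) that keeps only the frame regions YOLO detected above a confidence threshold, so the encoder sees just those crops:

```python
import numpy as np

def crop_detections(frame, boxes, conf_thresh=0.5):
    """Crop the regions YOLO flagged in a frame.
    `boxes` holds (x, y, w, h, confidence) tuples in pixel coordinates.
    Hypothetical helper to show the idea; the repo's code may differ."""
    h, w = frame.shape[:2]
    crops = []
    for x, y, bw, bh, conf in boxes:
        if conf < conf_thresh:
            continue                      # drop low-confidence detections
        x0, y0 = max(0, int(x)), max(0, int(y))
        x1, y1 = min(w, int(x + bw)), min(h, int(y + bh))
        crops.append(frame[y0:y1, x0:x1])
    return crops

frame = np.zeros((416, 416, 3), dtype=np.uint8)      # dummy YOLO-sized frame
boxes = [(10, 20, 100, 80, 0.9), (0, 0, 50, 50, 0.3)]
crops = crop_detections(frame, boxes)
print(len(crops), crops[0].shape)  # 1 (80, 100, 3)
```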
```bibtex
@misc{https://doi.org/10.48550/arxiv.1411.4555,
  doi = {10.48550/ARXIV.1411.4555},
  url = {https://arxiv.org/abs/1411.4555},
  author = {Vinyals, Oriol and Toshev, Alexander and Bengio, Samy and Erhan, Dumitru},
  keywords = {Computer Vision and Pattern Recognition (cs.CV), FOS: Computer and information sciences},
  title = {Show and Tell: A Neural Image Caption Generator},
  publisher = {arXiv},
  year = {2014},
  copyright = {arXiv.org perpetual, non-exclusive license}
}
```
To be improved ✔️ (visit the new repo):

- Migrate to an OpenCV GPU build
- Add an attention mechanism to the decoder
- Reduce the model's parameter count for faster inference
- Replace the greedy nearest-word search with a beam search over the vocabulary
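The last item above can be sketched as a generic beam search. This is a toy, self-contained version (the `step_fn` and tokens are hypothetical; a real implementation would score the LSTM's vocabulary logits at each step):

```python
import heapq
import math

def beam_search(step_fn, start_token, end_token, beam_width=3, max_len=10):
    """Keep the `beam_width` best partial sequences instead of only the
    single greedy best. `step_fn(seq)` returns {token: probability} for
    the next word given the sequence so far (hypothetical interface)."""
    beams = [(0.0, [start_token])]            # (negative log-prob, sequence)
    finished = []
    for _ in range(max_len):
        candidates = []
        for nll, seq in beams:
            if seq[-1] == end_token:          # sequence already complete
                finished.append((nll, seq))
                continue
            for tok, p in step_fn(seq).items():
                candidates.append((nll - math.log(p), seq + [tok]))
        if not candidates:
            break
        beams = heapq.nsmallest(beam_width, candidates)  # prune to best beams
    finished.extend(b for b in beams if b[1][-1] == end_token)
    return min(finished)[1] if finished else min(beams)[1]

# Toy next-word distribution keyed on the last token (illustrative only).
probs = {"<s>": {"a": 0.6, "b": 0.4},
         "a": {"</s>": 1.0},
         "b": {"c": 1.0},
         "c": {"</s>": 1.0}}
best = beam_search(lambda seq: probs[seq[-1]], "<s>", "</s>")
print(best)  # ['<s>', 'a', '</s>']
```

Unlike the greedy search, the beam keeps lower-probability prefixes alive in case they lead to a better complete caption.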