
CaptionThis: A Python-based deep learning tool that generates insightful captions for images. It uses the BLIP model and was trained on Google's Conceptual Captions dataset.


luisdavidgarcia/CaptionThis

 
 



CaptionThis 📷 🔤

CaptionThis is a Python command-line tool that uses a deep learning model to generate captions describing the images it is given as input.


Overview

Summary

CaptionThis is a deep learning project aimed at generating descriptive captions for images using Python. The system is accessible through a command-line interface and leverages a large training dataset for improving caption quality.
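As a rough illustration of how a command-line captioner built on BLIP can be wired together, here is a minimal sketch. It is not the project's actual code: the function names, the flags, and the checkpoint name `Salesforce/blip-image-captioning-base` are assumptions chosen for the example, and it relies on the Hugging Face `transformers`, `torch`, and `Pillow` packages being installed.

```python
import argparse

# Assumption: this README does not say which BLIP weights the project uses,
# so a publicly available checkpoint name stands in here.
DEFAULT_CHECKPOINT = "Salesforce/blip-image-captioning-base"


def build_parser():
    """Build a hypothetical CLI parser; flag names are illustrative only."""
    parser = argparse.ArgumentParser(
        prog="captionthis",
        description="Generate a caption for an image.",
    )
    parser.add_argument("image", help="path to the image file to caption")
    parser.add_argument("--checkpoint", default=DEFAULT_CHECKPOINT,
                        help="BLIP checkpoint name or local directory")
    parser.add_argument("--device", choices=["auto", "cpu", "cuda"],
                        default="auto", help="where to run the model")
    parser.add_argument("--max-new-tokens", type=int, default=30,
                        help="upper bound on generated caption length")
    return parser


def caption_image(image_path, checkpoint, device, max_new_tokens):
    """Load BLIP and return one caption for the image at image_path."""
    # Heavy dependencies are imported lazily so that `--help` stays fast.
    import torch
    from PIL import Image
    from transformers import BlipForConditionalGeneration, BlipProcessor

    if device == "auto":
        # Use the GPU through CUDA when one is available, else the CPU.
        device = "cuda" if torch.cuda.is_available() else "cpu"

    processor = BlipProcessor.from_pretrained(checkpoint)
    model = BlipForConditionalGeneration.from_pretrained(checkpoint).to(device)
    model.eval()

    image = Image.open(image_path).convert("RGB")
    inputs = processor(images=image, return_tensors="pt").to(device)
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return processor.decode(output_ids[0], skip_special_tokens=True)


def main(argv=None):
    """Parse arguments, caption the image, and print the result."""
    args = build_parser().parse_args(argv)
    print(caption_image(args.image, args.checkpoint, args.device,
                        args.max_new_tokens))
```

To try the sketch, call `main(["path/to/image.jpg"])` from a script or add the usual `if __name__ == "__main__": main()` guard. The first run downloads the checkpoint, so it needs network access.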

Team

The CaptionThis team consists of 6 Cal Poly students. The team members are listed below:

Skills Gained

  1. Deep Learning and AI: Utilized TensorFlow and PyTorch for building and training AI models, showcasing expertise in machine learning and neural networks.
  2. Image Processing and Computer Vision: Employed image processing techniques and computer vision libraries to handle and analyze image data.
  3. Natural Language Processing (NLP): Applied NLP techniques in conjunction with the BLIP model for generating image captions, demonstrating the ability to work with language models and textual data.
  4. Python Programming: Developed the tool using Python, indicating strong programming skills in a language widely used in AI and data science.
  5. Use of GPU Acceleration with CUDA: Utilized CUDA for GPU acceleration, which is essential for efficiently training deep learning models.
  6. Data Collection and Preprocessing: Implemented web scraping and data preprocessing, crucial for gathering and preparing datasets for training AI models.
  7. Concurrency and Multithreading: Used a multithreaded approach for efficient web scraping, showcasing the ability to write efficient and scalable code.
  8. Command Line Interface (CLI) Development: Developed a user-friendly command-line interface for the tool, enhancing its accessibility and ease of use.
  9. Software Engineering Best Practices: Applied principles of software development, including version control (evident from the use of GitHub), code optimization, and modular design.
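The multithreaded scraping mentioned in item 7 could look roughly like the sketch below. This is an illustration under assumptions, not the project's scraper: `fetch_all` and `fetch_with_urllib` are hypothetical names, and the fetch function is passed in as a parameter so the threading logic can be exercised without network access.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from urllib.request import urlopen


def fetch_with_urllib(url, timeout=10.0):
    """Download one URL and return its raw bytes (hypothetical helper)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


def fetch_all(urls, fetch=fetch_with_urllib, max_workers=8):
    """Fetch many URLs concurrently with a thread pool.

    Returns a dict mapping each URL to its downloaded bytes, or to None
    when the fetch raised an exception (dead links are common in
    scraped image datasets).
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit every download up front; the pool caps concurrency.
        futures = {pool.submit(fetch, url): url for url in urls}
        for future in as_completed(futures):
            url = futures[future]
            try:
                results[url] = future.result()
            except Exception:
                # Record the failure and keep going with the other URLs.
                results[url] = None
    return results
```

Threads suit this workload because downloading is I/O-bound: while one thread waits on the network, others can make progress. Failed downloads are recorded as None rather than aborting the whole batch.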

Getting Started

Here is all you need to know to set up this repo on your local machine and start developing!

Setup

  1. Clone this repository: git clone https://github.com/Jkozmo10/CaptionThis.git

Project Structure

Contributing

Here are all of the steps you should follow whenever contributing to this repo!

Making Changes

  1. Before you start making changes, always make sure you're on the main branch, then run git pull to make sure your code is up to date
  2. Create a branch named after the change you will make: git checkout -b <name-of-branch>
  3. Make changes to the code

Committing Changes

When interacting with Git/GitHub, feel free to use the command line, the VS Code extension, or GitHub Desktop. These steps assume you have already created a branch using git checkout -b <branch-name> and have made all necessary code changes for the provided task.

  1. View the diffs of each file you changed using the VS Code GitHub extension or GitHub Desktop
  2. git add . (to stage all files) or git add <file-name> (to stage specific file)
  3. git commit -m "<description>" or git commit to get a message prompt
  4. git push -u origin <name-of-branch>

Making Pull Requests

  1. Go to the Pull Requests tab on this repo
  2. Find your PR, and provide a description of your change, steps to test it, and any other notes
  3. Link your PR to the corresponding Issue
  4. Request a reviewer to check your code
  5. Once approved, your code is ready to be merged in 🎉

Documents and Artifacts

  1. Project Proposal
  2. Features, Requirements, and Evaluation Criteria
  3. System Design and Architecture
  4. Implementation and Prototypes

References

  1. Pre-Trained Model
  2. Helpful Training Google Colab Notebooks
  3. Example Google Colab Image Captioning
  4. Generating Datasets


Languages

  • Python 98.5%
  • Shell 1.5%