Imagine revolutionizing TV control with gestures! As the data science team at a leading home electronics company, we embarked on a journey to develop a feature for smart TVs that recognizes five different gestures, letting users control the TV seamlessly without a remote.
Each gesture corresponds to a specific command:
- Thumbs up: Increase the volume
- Thumbs down: Decrease the volume
- Left swipe: 'Jump' backward 10 seconds
- Right swipe: 'Jump' forward 10 seconds
- Stop: Pause the movie
Our challenge? Continuously monitoring the TV's webcam feed and recognizing these gestures reliably.
To tackle this problem, we adopted a Conv3D + RNN architecture, combining 3D convolutional layers with an LSTM (Long Short-Term Memory) network. The Conv3D layers capture spatial features and short-range motion across frames, while the LSTM models the longer-range temporal order of the sequence.
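As a rough illustration, the sketch below builds a small Conv3D + LSTM network in Keras. The layer sizes, number of blocks, and input resolution are assumptions for illustration, not the exact architecture used in the notebooks.

```python
# A minimal Conv3D + LSTM sketch in Keras (layer sizes, input resolution,
# and frame count are assumptions, not the project's exact architecture).
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (Conv3D, MaxPooling3D, BatchNormalization,
                                     TimeDistributed, Flatten, LSTM, Dense, Dropout)

NUM_FRAMES, HEIGHT, WIDTH, CHANNELS = 30, 120, 120, 3  # assumed input shape
NUM_CLASSES = 5  # thumbs up/down, left/right swipe, stop

model = Sequential([
    # 3D convolutions capture spatial features plus short-range motion
    Conv3D(16, (3, 3, 3), activation='relu', padding='same',
           input_shape=(NUM_FRAMES, HEIGHT, WIDTH, CHANNELS)),
    BatchNormalization(),
    MaxPooling3D(pool_size=(1, 2, 2)),   # pool only spatially, keep all 30 frames

    Conv3D(32, (3, 3, 3), activation='relu', padding='same'),
    BatchNormalization(),
    MaxPooling3D(pool_size=(1, 2, 2)),

    # Flatten each frame's feature map, then model temporal order with an LSTM
    TimeDistributed(Flatten()),
    LSTM(64),
    Dropout(0.25),
    Dense(NUM_CLASSES, activation='softmax'),
])

model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.summary()
```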
The dataset consists of video sequences, each containing 30 frames. The videos were recorded with regular webcams to simulate real interactions with a smart TV, and each video is labeled with one of five classes (0-4), corresponding to the five gestures.
To improve generalization, we used an ImageDataGenerator for data augmentation, increasing the effective diversity of the training data.
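One subtlety with video data is that the same random transform should be applied to every frame of a clip so the sequence stays coherent. The sketch below shows one way to do that with ImageDataGenerator; the augmentation ranges and the `augment_sequence` helper are illustrative assumptions, not the project's exact settings.

```python
# Frame-consistent augmentation sketch with ImageDataGenerator
# (rotation/shift/zoom ranges are assumptions).
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(rotation_range=10,
                             width_shift_range=0.1,
                             height_shift_range=0.1,
                             zoom_range=0.1)

def augment_sequence(frames):
    """Apply one random transform to all frames of a video.

    frames: array of shape (num_frames, height, width, channels)
    """
    # Sample the transform once so every frame moves together
    params = datagen.get_random_transform(frames.shape[1:])
    return np.stack([datagen.apply_transform(f, params) for f in frames])

# Example: augment a dummy 30-frame clip
clip = np.random.rand(30, 120, 120, 3).astype('float32')
augmented = augment_sequence(clip)
```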
- `data/`: Contains the dataset (not included in this repository due to size).
- `notebooks/`: Jupyter notebooks for data exploration, model development, and evaluation.
- `models/`: Saved model checkpoints.
- `README.md`: You're reading it!
- Download the dataset from here: https://drive.google.com/uc?id=1ehyrYBQ5rbQQe6yL4XbLWe3FMvuVUGiL
- Set up your Python environment with the necessary libraries.
- Explore the provided Jupyter notebooks to understand the project.
- Train the model and save checkpoints as needed (a minimal sketch follows below).
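The snippet below is one hedged way to wire up checkpoint saving during training; the file path, batch generators (`train_generator`, `val_generator`), epoch count, and learning-rate schedule are placeholders for whatever you use in the notebooks.

```python
# Training sketch with checkpoint saving (paths, epoch count, and the
# train/val generators are assumptions, not the project's exact setup).
from tensorflow.keras.callbacks import ModelCheckpoint, ReduceLROnPlateau

checkpoint = ModelCheckpoint(
    'models/gesture_model-{epoch:02d}-{val_accuracy:.3f}.h5',
    monitor='val_accuracy',
    save_best_only=True,
    verbose=1,
)
reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=3)

history = model.fit(
    train_generator,
    validation_data=val_generator,
    epochs=25,
    callbacks=[checkpoint, reduce_lr],
)
```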
Contributions are welcome! Feel free to open issues, provide feedback, or submit pull requests.
This project is licensed under the MIT License - see the LICENSE file for details.