This repository documents a 300-day coding challenge focused on vision technologies. It serves as a comprehensive log of the journey, providing insights into the progress and evolution of skills. Get ready for 300 days of coding excitement, challenges, and triumphs in the universe of computer vision!
Welcome to my 300-day coding challenge focused on vision technologies! This repository documents my daily coding efforts in the realm of computer vision, encompassing tasks such as semantic segmentation, object detection, classification, reinforcement learning, and GANs. On some days I will also solve DSA problems from LeetCode to improve my Python skills. The 300 days will also include some general Python-based projects to showcase and improve my skills. The goal is to actively code for at least 1 hour a day for 300 days in the year 2024.
# | Project Title | Description | Framework | Comments | Status |
---|---|---|---|---|---|
1 | Road Sign Classifier | Multiclass classification of road sign images | PyTorch | Building training and tracking pipelines from scratch | 🟢 |
2 | Human Action Recognition | Video-based multiclass classification of human actions | TensorFlow | In Progress: training baseline models | 🟢 |
🟠 : To Do
🟢 : In Progress
🟣 : Complete
2024-04-03
- Task Description: LeetCode problems: Longest Palindromic Substring, Zigzag Conversion, Reverse Integer & Remove Element
Embark on a thrilling 300-day coding odyssey, a quest where every day is a new adventure in the realm of computer vision and deep learning. Join me on this exciting journey of practical coding tasks, where each day unfolds with hands-on challenges, research paper implementations, and real-world problem-solving.
Here's what makes this challenge an epic adventure:
- Hands-on Coding: Dive deep into practical coding tasks, from implementing cutting-edge research papers to tackling real-world problems head-on.
- Continuous Learning: Embrace a culture of lifelong learning, exploring new concepts, algorithms, and frameworks in the dynamic field of vision technologies.
- Beyond Boundaries: Explore the frontiers of computer vision and deep learning, pushing the limits with projects that range from semantic segmentation to GANs, reinforcement learning, and more.
- Building a Robust Portfolio: Craft a comprehensive portfolio of projects and code snippets, showcasing not just skills, but the journey of growth and innovation.
- Progressive Learning: Witness the evolution of skills as each day adds new layers of expertise, building a solid foundation and demonstrating continuous improvement.
- Meaningful Contributions: Connect, collaborate, and share insights with a growing community of enthusiasts, making this journey a collective exploration of the fascinating world of vision technologies.
The repository is organized as follows:
- DailyLogs: Daily log and description of the tasks undertaken.
- Projects: Repositories and subfolders containing individual projects, each focused on a specific aspect of vision technologies.
- CodingChallenges: Code snippets or solutions from coding challenges, providing a mix of practical coding skills and problem-solving capabilities.
Here's a glimpse into the projects accomplished during the inaugural 30-Day Sprint of my 300-day challenge:
- Implementing Vision Transformer (ViT) from Scratch: Developing a deep understanding of the ViT architecture and translating theoretical concepts into functional code to create a ViT model using PyTorch.
- Training a Semantic Segmentation Model with Open3D: Leveraging the Open3D library to train a semantic segmentation model on the SemanticKITTI dataset, involving data loading, transformation, and visualization tasks.
- Exploring Classic Control Tasks for Reinforcement Learning: Delving into classic control environments to understand Markov Decision Processes (MDP), Temporal Difference (TD) learning, and Q-learning, implementing these concepts in Python using reinforcement learning techniques.
- Building a Multimodal GAN for Image Generation: Constructing a Generative Adversarial Network (GAN) capable of generating images from text descriptions by combining pre-trained models such as CLIP and VQGAN, emphasizing multi-modal fusion and learning.
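To make the TD learning and Q-learning item above a little more concrete, here is a minimal sketch of tabular Q-learning. It is an illustration only, not the code from the project: the 1-D chain environment, the function name `q_learning`, and all hyperparameter values are assumptions chosen to keep the example self-contained.

```python
import random


def q_learning(n_states=5, episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical 1-D chain (illustrative only).

    States are 0..n_states-1. Action 0 moves left, action 1 moves right.
    Entering the last state yields reward 1 and ends the episode.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]  # q[state][action]
    goal = n_states - 1

    for _ in range(episodes):
        state = 0
        while state != goal:
            # Epsilon-greedy action selection, breaking ties at random.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1

            # Deterministic transition; state 0 acts as a wall on the left.
            next_state = max(0, state - 1) if action == 0 else state + 1
            reward = 1.0 if next_state == goal else 0.0

            # TD target: the goal is terminal, so nothing is bootstrapped from it.
            best_next = 0.0 if next_state == goal else max(q[next_state])
            td_error = reward + gamma * best_next - q[state][action]
            q[state][action] += alpha * td_error

            state = next_state
    return q


if __name__ == "__main__":
    for state, (left, right) in enumerate(q_learning()):
        print(f"state {state}: Q(left)={left:.3f}  Q(right)={right:.3f}")
```

Under these assumptions, the learned table should prefer moving right in every non-terminal state, since that is the shortest path to the rewarded goal. Real classic control environments have larger or continuous state spaces, so they typically need discretization or function approximation on top of this update rule.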
Here's a log of the daily tasks completed during the coding challenge:
Day | Date | Task Description | Tags |
---|---|---|---|
50 | 2024-04-03 | LeetCode: Longest Palindromic Substring, Zigzag Conversion, Reverse Integer & Remove Element | DSA |
49 | 2024-04-02 | Exploring Graph Neural Networks using PyG: Link Prediction & Link Regression on toy MovieLens dataset | GNN |
48 | 2024-04-01 | Exploring Graph Neural Networks using PyG: Understanding message passing and utilization of various aggregation functions | GNN |
47 | 2024-03-26 | Exploring Graph Neural Networks using PyG: Understanding GNN predictions with the Captum library and reviewing a GNN overview | GNN |
46 | 2024-03-25 | Exploring Graph Neural Networks using PyG: Point Cloud Classification with PointNet++ on the GeometricShapes dataset | GNN |
45 | 2024-03-24 | Exploring Graph Neural Networks using PyG: Working on understanding and implementing Recurrent GNNs | GNN |
44 | 2024-03-22 | Exploring Graph Neural Networks using PyG: Data handling in PyG, MetaPath2vec & Graph Pooling - DiffPool | GNN |
43 | 2024-03-21 | Exploring Graph Neural Networks using PyG: Edge analysis for label prediction & Edge analysis for link prediction | GNN |
42 | 2024-03-20 | Exploring Graph Neural Networks using PyG: Graph Generation, Recurrent GNNs, DeepWalk and Node2Vec | GNN |
41 | 2024-03-19 | Exploring Graph Neural Networks using PyG: Spectral Graph Convolutional Layers, Aggregation Functions in GNNs, GAE and VGAE, ARGA and ARGVA | GNN |
40 | 2024-03-18 | Exploring Graph Neural Networks using PyG: node classification and graph classification tasks | GNN |
39 | 2024-03-17 | LeetCode: 0016-3sum-closest and 0017-letter-combinations-of-a-phone-number | DSA |
38 | 2024-03-15 | Exploring 3D object detection by implementing a model using methods including Frustum PointNets and VoteNet | DL 3D |
37 | 2024-03-14 | Finished implementing the ESRGAN paper in PyTorch. | GANs |
36 | 2024-03-13 | Working on image super-resolution and implementing a SOTA model like ESRGAN (Enhanced Super-Resolution Generative Adversarial Networks). GitHub Repo: ESRGAN | GANs |
35 | 2024-03-12 | Finished implementing the PointNet paper in PyTorch. | DL 3D |
34 | 2024-03-11 | LeetCode problems: 15-3sum | DSA |
33 | 2024-03-08 | Implementing the PointNet paper in PyTorch. | DL 3D |
32 | 2024-03-07 | Explored PyTorch3D tutorials and updated the 3D Vision Playground repo. | DL 3D |
31 | 2024-03-06 | Researching PointNet paper for code recreation | DL 3D |
30 | 2024-03-05 | Implemented the VAE paper from scratch in PyTorch, training on MNIST | GANs |
29 | 2024-03-04 | Completed VQGAN implementation for code repository | GANs |
28 | 2024-03-01 | Exploring the Mesa library for agent-based modeling, analysis and visualization | RL |
27 | 2024-02-29 | Implementing VQGAN paper from scratch in PyTorch. VQGAN debugging and scripting for transformer | GANs |
26 | 2024-02-28 | Implementing VQGAN paper from scratch in PyTorch. Scripts for encoder-decoder as well as VQGAN arch. | GANs |
25 | 2024-02-27 | Built scripts for editing person's clothes in image using pretrained segmentation and diffusion models: 1 2 | Diffusion CLIP |
24 | 2024-02-26 | Implementing VQGAN paper from scratch. Understanding the paper and code repo, building skeleton. | GANs |
23 | 2024-02-24 | Trained a multimodal GAN to generate image from text using pretrained CLIP ('ViT-B/32') and Taming Transformers (VQGAN) pretrained models | GANs |
22 | 2024-02-23 | Working on multimodal GAN architecture to generate image from text | GANs |
21 | 2024-02-22 | Trained a basic GAN on the MNIST dataset and an advanced GAN architecture on the CelebA dataset; WANDB tracking here | GANs |
20 | 2024-02-20 | Finished implementing the ProGAN paper from scratch in PyTorch. Currently training on the CelebA-HQ dataset! | GANs |
19 | 2024-02-19 | Implementing the ProGAN paper from scratch in PyTorch. | GANs |
18 | 2024-02-18 | Implemented the CycleGAN paper from scratch in PyTorch. Trained for 150 epochs on a custom car2damagedcar dataset | GANs |
17 | 2024-02-17 | Implemented the pix2pix paper from scratch in PyTorch. Training for 500 epochs on the Maps Dataset | GANs |
16 | 2024-02-16 | Implemented the WGAN and WGAN-GP papers from scratch in PyTorch and trained them on the MNIST dataset | GANs |
15 | 2024-02-15 | Implemented the DCGAN model from scratch in PyTorch and trained it on the MNIST dataset | GANs |
14 | 2024-02-14 | Trained a Semantic Segmentation model with Open3D and Open3D-ML packages with PyTorch on SemanticKITTI dataset | DL 3D |
13 | 2024-02-13 | Explored the Open3D and Open3D-ML packages and performed data loading, transformation and visualization tasks. | DL 3D |
12 | 2024-02-12 | Trained a simple 2-layer model to play the classic Snake game in PyTorch | RL |
11 | 2024-02-10 | Trained two models in PyTorch on the ViT architecture for Multiclass Road Sign Classifier. | DL 2D |
10 | 2024-02-09 | Built pipelines for dataset manipulation and training in PyTorch for Multiclass Road Sign Classifier. | DL 2D |
9 | 2024-02-07 | Hugging Face RL course completed units 7, 8a, 8b and advanced topics. Certificate | RL |
8 | 2024-02-06 | Hugging Face RL course completed units 4, 5 and 6. | RL |
7 | 2024-02-03 | LeetCode problems: 11-container-with-most-water and 26-remove-duplicates-from-sorted-array | DSA |
6 | 2024-02-01 | Explored datasets, structured project and trained EfficientNet_B0 model for Multiclass Human Action Classification from video data | DL 3D |
5 | 2024-01-31 | Explored datasets, conducted EDA, and structured project for Multiclass Road Sign Classifier. | DL 2D |
4 | 2024-01-29 | Implementing Vision Transformer (ViT) model from scratch in PyTorch. | DL 2D |
3 | 2024-01-28 | LeetCode problems: 1-two-sum, 2-add-two-numbers, 4-median-of-two-sorted-arrays | DSA |
2 | 2024-01-27 | Explored classic control tasks; studied MDP, TD, Monte Carlo, Q-Learning theory | RL |
1 | 2024-01-26 | MDP basics exploration on custom Maze env with random policy exploration. | RL |
Feel free to reach out, provide feedback, or collaborate on any aspect of the journey. Let's embark on this coding adventure together!
Happy Coding! 🚀