100-Days-of-ML-Code

Daily log to track my progress on the 100 days of ML code challenge.

Day 1 (09-09-18) : Naive Bayes

  • Started with the intro to machine learning course on Udacity
  • Learnt the basics of a Naive Bayes classifier on the iris dataset
  • Working on classifying the Stanley Terrain dataset and graphing the decision surface

Day 2 (10-09-18) : Naive Bayes mini-project

  • Working on the Naive Bayes mini-project to classify email.
  • Tried really hard to make the python 2.7 code compatible with 3.6 and learnt about dos2unix and pickling of data.
  • Completed the Naive Bayes project with accuracy of 90.24% (Need to improve it!)

Day 3 (11-09-18) : SVM and Linear Algebra

  • Improved the accuracy to 97.869% and completed the mini-project.
  • Started the lesson on Support Vector Machines.
  • Completed Week 1 of Mathematics for Machine Learning: Linear Algebra, a course from Imperial College London on Coursera.

Day 4 (12-09-18) : SVM and Decision Trees

  • Completed the SVM mini-project with 99.08% accuracy using an rbf kernel
  • Started the lesson on Decision Trees

Day 5 (13-09-18) : Decision Trees mini-project

  • Working on the Decision tree mini-project
  • Referred to 3Blue1Brown's Essence of Calculus playlist

Day 6 (14-09-18) Decision Tree(Entropy and Information gain) and KNN

  • Completed the Decision Tree mini-project
  • Learnt about the K-Nearest Neighbours classifier and implemented the same

Day 7 (15-09-18) K-Nearest Neighbours

  • Implemented the KNN classifier after referring to this Medium article
  • Watched 2 more videos from 3Blue1Brown's Essence of Calculus playlist
  • Watched Siraj Raval's video on classifiers

Day 8 (16-09-18) RandomForest classifier, Datasets and Questions

  • Completed the lesson on datasets and questions to gain key inferences
  • Completed the lesson WEB 3.0 from Siraj Raval's Decentralized Applications playlist
  • Implemented the RandomForest classifier and read up about AdaBoost

Day 9 (17-09-18) Linear Regression, Unsupervised Learning (K Means)

  • Completed the lesson on Regressions and implemented the same in the mini-project
  • Completed the analysis of outliers in the Enron dataset and the Q&A on the analysis
  • Completed the lesson on unsupervised learning (K-Means clustering)
  • Implemented K Means clustering on the Enron dataset
  • Completed the lesson on feature scaling (MinMaxScaler)
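
Min-max scaling maps a feature to the range [0, 1] using x' = (x - min) / (max - min). Below is a small pure-Python sketch of that formula. The lesson itself used sklearn's MinMaxScaler; the helper name, the sample values and the constant-feature guard here are my own assumptions.

```python
def min_max_scale(values):
    """Rescale a list of numbers to [0, 1] with (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption for this sketch: a constant feature maps to all zeros
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

weights = [115.0, 140.0, 175.0]   # made-up sample values
scaled = min_max_scale(weights)   # smallest value -> 0.0, largest -> 1.0
```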

Day 10 (18-09-18) Bag of words, stemming and TfIdf using NLTK

  • Stemming using NLTK(Natural Language Toolkit)
  • Completed the lesson on text learning
  • Completed implementing the string processing techniques in the dataset (17578 emails)

Day 11 (19-09-18) Feature Selection, Dimensionality Reduction(PCA) and Validation

  • Completed the lesson on feature selection
  • Implemented Lasso regression to understand regularization
  • Completed the lesson on dimensionality reduction
  • Working on the eigenfaces mini-project
  • Completed the lesson on Validation and its exercises

Day 12 (20-09-18) Evaluation metrics and intro to neural networks

  • Completed the lesson on evaluation metrics and its exercises
  • Started the Deeplearning.ai course Neural Networks and Deep Learning by Andrew Ng
  • Completed the intro to machine learning course on Udacity!!

Day 13 (21-09-18) Enron Fraud Detection

  • Working on finding the persons of interest from the Enron emails dataset
  • Completed Week 1 of the Neural networks and deep learning course

Day 14 (22-09-18) Intro to tensorflow and tensorflow.js

  • Read up about Tensorflow from the documentation and medium articles
  • Watched 2 Coding Train videos to understand Tensorflow.js

Day 15 (23-09-18) Intro to deep learning

  • Implemented classifier and regressor using tensorflow and compared the same with the sklearn implementations
  • Learnt about the softmax, one-hot encoding and cross-entropy loss minimization using gradient descent
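
A rough sketch of how softmax, one-hot encoding and cross-entropy fit together, with one gradient descent step on the logits. It is plain Python rather than the TensorFlow code from that day, and all names and numbers are illustrative assumptions.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def one_hot(index, num_classes):
    """Encode a class index as a vector with a single 1.0."""
    return [1.0 if i == index else 0.0 for i in range(num_classes)]

def cross_entropy(probs, target):
    """Cross-entropy between predicted probabilities and a one-hot target."""
    return -sum(t * math.log(p) for t, p in zip(target, probs) if t > 0)

def gradient_step(logits, target, lr=0.5):
    """One gradient descent step; d(loss)/d(logits) = softmax(logits) - target."""
    probs = softmax(logits)
    return [z - lr * (p - t) for z, p, t in zip(logits, probs, target)]

logits = [0.0, 0.0, 0.0]          # made-up starting scores for 3 classes
target = one_hot(0, 3)            # the true class is class 0
loss_before = cross_entropy(softmax(logits), target)
loss_after = cross_entropy(softmax(gradient_step(logits, target)), target)
```

With uniform starting scores the loss is ln(3); after one step the probability of the true class rises, so the loss drops.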

Day 16 (24-09-18) Data preprocessing and handling missing data

  • Learnt best practices to handle missing data and effective feature selection
  • Practiced the preprocessing workflow

Day 17 (25-09-18) Stock Predictor App

  • Built a basic stock predictor app that predicts the value of the stock and the value of the company
  • Referred to this video by Siraj Raval

Day 18 (26-09-18) Data Science on the HI-SEAS dataset

  • Analyzed the Mars HI-SEAS dataset using SVM (and PCA) to unearth outliers and analyze for predictive analytics
  • Performed data wrangling and analysis using dplyr in R

Day 19 (27-09-18) Intro to deep learning

Day 20 (28-09-18) Neural network for notMNIST

  • Built a neural network with 84% accuracy for the notMNIST dataset
  • Completed lesson 1 of the intro to deep learning course

Day 21 (29-09-18) Neural Networks and deep learning

  • Working on week 2 of Andrew Ng's course on deep learning and neural networks
  • Implemented gradient descent from scratch

Day 22 (30-09-18) Neural Networks and Deep Learning

  • Completed assignment 1 of week 2
  • Implemented logistic regression using a neural network approach to classify images
  • Completed Week 2 of Andrew Ng's course

Day 23 (1-10-18) Implemented gradient descent from scratch

  • Implemented gradient descent from scratch
  • Learnt more about activation functions sigmoid, tanh, ReLU and leaky ReLU
  • Learnt about the advantages and differences between tensorflow.js and tensorflow
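
For reference, the four activation functions mentioned above can be sketched in a few lines of plain Python. The leaky ReLU slope of 0.01 is a commonly used default that I am assuming here, not a value from the course.

```python
import math

def sigmoid(z):
    """Squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def tanh(z):
    """Squashes any real number into (-1, 1), centred on zero."""
    return math.tanh(z)

def relu(z):
    """Passes positive inputs through and clips negatives to zero."""
    return max(0.0, z)

def leaky_relu(z, alpha=0.01):
    """Like ReLU, but keeps a small slope (alpha) for negative inputs."""
    return z if z > 0 else alpha * z
```

The practical difference shows up for negative inputs: relu(-2.0) is exactly zero (so its gradient is zero too), while leaky_relu(-2.0) keeps a small negative value.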

Day 24 (2-10-18) Planar data classification using a neural network

  • Completed planar classification assignment
  • Completed Week 3 of Andrew Ng's Neural Networks course
  • Started Week 4 of the course

Day 25 (3-10-18) Deep neural networks

  • Completed all lecture videos of Week 4 pertaining to deep neural networks
  • Working on the programming assignments
  • Completed assignment 1

Day 26 (4-10-18) Cat-notCat classifier from scratch

  • Working on a cat-notCat binary classifier using a deep neural net
  • Completed Week 4 of the course and obtained the certificate!

Day 27 (5-10-18) Hyperparameter tuning and regularization

  • Learnt the math behind Frobenius norm and regularization
  • Started course 2 of Andrew Ng's Deeplearning.ai specialization

Day 28 (6-10-18) Optimization and regularization

  • Completed week 1 materials and working on the optimization exercises
  • Implemented l2-regularization from scratch
  • Implemented dropout (forward and back-prop) from scratch
  • Implemented Gradient checking from scratch
  • Completed Week 1 of the course
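
A small sketch of gradient checking on a loss with an L2 penalty. It compares a hand-written gradient against a central-difference estimate and shows that the check catches a gradient that forgets the regularization term. The loss, the weights and the value of LAMBDA are made up for illustration and are not from the course assignment.

```python
import math

LAMBDA = 0.1  # L2 regularization strength (illustrative value)

def loss(w):
    """Squared error on one made-up example plus an L2 penalty."""
    residual = 2.0 * w[0] - 1.0 * w[1] - 1.0
    return residual ** 2 + 0.5 * LAMBDA * sum(wi ** 2 for wi in w)

def analytic_grad(w):
    """Hand-derived gradient of loss, including the L2 term."""
    residual = 2.0 * w[0] - 1.0 * w[1] - 1.0
    data_grad = [2.0 * residual * 2.0, 2.0 * residual * -1.0]
    return [g + LAMBDA * wi for g, wi in zip(data_grad, w)]

def buggy_grad(w):
    """Deliberately wrong gradient: forgets the L2 term."""
    residual = 2.0 * w[0] - 1.0 * w[1] - 1.0
    return [2.0 * residual * 2.0, 2.0 * residual * -1.0]

def numerical_grad(f, w, eps=1e-6):
    """Central-difference estimate of the gradient, one coordinate at a time."""
    grads = []
    for i in range(len(w)):
        plus, minus = list(w), list(w)
        plus[i] += eps
        minus[i] -= eps
        grads.append((f(plus) - f(minus)) / (2.0 * eps))
    return grads

def relative_difference(a, b):
    """||a - b|| / (||a|| + ||b||); small values mean the gradients agree."""
    diff = math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    denom = math.sqrt(sum(x * x for x in a)) + math.sqrt(sum(y * y for y in b))
    return diff / denom if denom > 0 else 0.0

w = [0.5, -1.5]
good = relative_difference(analytic_grad(w), numerical_grad(loss, w))
bad = relative_difference(buggy_grad(w), numerical_grad(loss, w))
```

Here good comes out tiny while bad is orders of magnitude larger, which is the signal gradient checking is meant to give.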

Day 29 (7-10-18) mini-batch gradient descent with momentum and Adam

  • Implemented mini-batch gradient descent with momentum
  • Implemented Adam optimization from the ICLR 2015 paper
  • Completed week 2 of the course
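
A simplified sketch of the two update rules on a one-dimensional problem. It leaves out the mini-batching and works on a single scalar parameter; the Adam update follows the standard form with bias correction, and the hyperparameter values, function names and test function are assumptions chosen for illustration.

```python
import math

def momentum_minimize(grad, x0, lr=0.1, beta=0.9, steps=200):
    """Gradient descent where the step follows a moving average of gradients."""
    x, v = x0, 0.0
    for _ in range(steps):
        v = beta * v + (1 - beta) * grad(x)
        x -= lr * v
    return x

def adam_minimize(grad, x0, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=500):
    """Adam: momentum on the gradient plus scaling by a running average of g^2."""
    x, m, v = x0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1 - beta1) * g        # first moment estimate
        v = beta2 * v + (1 - beta2) * g * g    # second moment estimate
        m_hat = m / (1 - beta1 ** t)           # bias correction
        v_hat = v / (1 - beta2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x

def grad_f(x):
    """Gradient of the toy objective f(x) = (x - 3)^2, minimized at x = 3."""
    return 2.0 * (x - 3.0)

x_momentum = momentum_minimize(grad_f, x0=0.0)
x_adam = adam_minimize(grad_f, x0=0.0)
```

Both runs should end up close to the minimum at x = 3 on this toy objective.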

Day 30 (8-10-18) Batch normalization, softmax and Structuring ML Projects!

  • Implemented batch normalization from scratch
  • Working on the SIGNS dataset to identify numbers from sign language (Using Tensorflow)
  • Completed the course on Improving deep neural nets - certificate

Day 31 (9-10-18) Structuring ML Projects and transfer learning

  • Completed the course on structuring machine learning projects! Certificate
  • Learnt more about transfer learning

Day 32 (10-10-18) Familiarizing myself with Tensorflow

  • Read and practiced from the Tensorflow documentation to better understand the workflow
  • Understood the importance of GPUs in Deep Learning and the tensorflow-gpu module

Day 33 (11-10-18) Edge detection and convolutions

  • Started Week 1 of Andrew Ng's course on Convolutional Neural Networks
  • Learnt more about Tensorflow from Jordi Torres' Deep Learning book

Day 34 (12-10-18) Building a Convolutional layer and Pooling

  • Completed Week 1 of Convolutional Neural Networks
  • Learnt about pooling(POOL) and fully connected(FC)

Day 35 (13-10-18) GDG DevFest 2018 and CNN step by step

  • Attended GDG DevFest 2018! Was a very informative event for ML/AI practitioners
  • Working on building a CNN step by step

Day 36 (14-10-18) What is AlphaGo Zero and intro to RL

  • Learnt more about Google's AlphaGo Zero and why it's such a big breakthrough
  • Learnt the very basics of Reinforcement Learning

Day 37 (15-10-18) Basics of Reinforcement Learning

  • Learnt about Basics of RL from David Silver's online course

Day 38 (16-10-18) CNNs

  • Learnt about Pooling layers for CNNs and improved implementation
  • Working on Week 2 content of Andrew Ng's CNNs course

Day 39 (17-10-18) Landing a rocket using Reinforcement Learning

  • Learning about PPOs (Proximal Policy Optimization) in RL
  • Learning about rocket launches to build an app to track space-flight schedules
  • Building and training a ConvNet in TensorFlow for a classification problem

Day 40 (18-10-18) NASA SpaceApps Preparation

  • Spent some time preparing data from NASA datasets for the topic "Do YOU Know When the Next Rocket Launch Is?"

Day 41 (19-10-18) Data preparation and pre-processing

  • Prepared and pre-processed the data for the NASA SpaceApps competition

Day 42 (20-10-18) NASA SpaceApps Challenge Nationals

  • Using the GLOBE dataset to predict effective sunlight cover on solar panels

Day 43 (21-10-18) Worked on CNNs and Monte Carlo Simulations

  • Used Monte Carlo simulations and normalization to predict the conversion factor for solar panels
  • Used the conversion factor thus obtained to build a calculator to visualize the data
  • Worked on CNNs with a 'selu' activation function for better learning rate with normalization

Day 44 (22-10-18) Revised CNNs from Andrew Ng's course notes

  • Revised building CNNs from scratch from Andrew Ng's course notes

Day 45 (23-10-18) Artificial Intelligence

  • Studying for internal exam on the subject of Artificial Intelligence

Day 46 (24-10-18) Artificial Intelligence

  • Studied for my AI exam on 25th
  • This includes predicate logic, Bayesian statistics, Bayesian networks and partitioned semantic nets

Day 47 (25-10-18) AI exam and CNNs for roof exposure estimation

  • Gave my AI exam and probably aced it!
  • Working on training a CNN on scraped, labelled roof pictures with given dimensions to estimate the solar irradiance incident on the surface

Day 48 (26-10-18) Car detection using YOLOv2

  • Working on a You Only Look Once (YOLO) model for car detection
  • The ML project pipeline is underway
  • Went to the Google office for a meetup called #chAI where early stage AI startups explained the deep learning they have been doing
  • Fixed all deployment bugs in the NASA SpaceApps project and hosted the website

Day 49 (27-10-18) Learnt more about YOLO

  • Learnt more about YOLOv2 from medium articles
  • Got project guidance and tips on the Solar roof CNNs project from Vibhor Kalra from merak.ai
  • He suggested looking into tensorflow.js if browser based real-time models need to be deployed
  • Need to learn about deploying a tensorflow project
  • Learnt about tf-lite models and their merits and demerits for DL apps

Day 50 (28-10-18) Monte Carlo simulations and curve fitting in R

  • Improved the prediction model for the solar project and working on the final submission as today is the last day
  • Registered for the Microsoft AI challenge to improve Bing's suggestion box answers using DL models
  • Re-doing the plan for the next 50 days to get the most done from this challenge

Day 51 (29-10-18) Sentiment classification

  • An implementation from Andrew Trask's blog about sentiment classification to frame problems in deep learning
  • Completed CNN implementation from scratch
  • Still working on a feedback analysis of the progress thus far to get much more done in the second half of the challenge

Day 52 (30-10-18) Improving CNN Backpropagation

  • Studying the math behind backpropagation (for CNNs) from Ian Goodfellow's Deep Learning Textbook

Day 53 (31-10-18) Fully functioning ConvNets using Tensorflow

  • Implemented a fully functional CNN using Tensorflow
  • Improved the friend dashboard project
  • Also created a Genomic and AI related GitHub organization for related projects

Day 54 (1-11-18) GenomicAI's website and revising data science in R from Rafael's Textbook

  • Working on GenomicAI's website. Looking to finish it up after Monday's exam
  • Revising data science in R from Harvard Prof Rafael's textbook

Day 55 (2-11-18) GCP's How Google does ML

  • Completed half the course by Google Cloud Platform on 'How Google does ML'
  • Working on the paper on 'Genomic analysis for personalized medicine'

Day 56 (3-11-18) GCP Datalab for ML instances

  • Earthquakes project using a datalab instance Link
  • Project Link
  • The Common pitfalls in ML deployment. Gosh it has much more to do with stuff other than ML!
  • Completed Google Cloud's first course 'How Google does ML' Link

Day 57 (4-11-18) Launching into ML on GCP

  • Started learning the procedure to prepare and pre-process datasets to bucket in cloud instances
  • Working on GenomicAI's about page

Day 58 (5-11-18) (Still!)Launching into ML on GCP

  • Made UI improvements to the Social Network
  • Learnt more about operationalizing ML models for production using the Google Cloud Platform

Day 59 (6-11-18) Completed Launching into ML

  • Awaiting the scholarship confirmation. I just about finished the content in the course from Coursera
  • Working on the site page for GenomicsAI

Day 60 (7-11-18) Comparing AWS and GCP

  • Exploring AWS ML APIs as compared to GCP's ML APIs
  • Buying parts for my Deep Learning rig. Got a GTX 1080 and an 8th gen Intel i7 processor. Need to save up and buy the rest of the parts!

Day 61 (8-11-18) Learning up about building a recommendation engine

  • Made some progress on Google's GCP challenge on specializing in ML by 30th November
  • Learnt about some of the methods to build and operationalize a recommendation engine

Day 62 (9-11-18) Working on a recommendation engine for the social network

  • Working on the prototype for a recommendation engine for user's feed in a social network I am building
  • The social network is built on a PostgreSQL database with a flask business logic. Check out my profile for details

Day 63 (10-11-18) Started the PyTorch Challenge

  • Working on lesson 1 of the content to build a neural network using PyTorch
  • Made significant headway in the Social Network project

Day 64 (11-11-18) Learnt more about using PyTorch for DL

  • Continued the Udacity course on PyTorch

Day 65 (12-11-18) Started the Move 37 course

  • Started Siraj Raval's Move 37 course for Reinforcement Learning
  • Working on the social networking application for my DBMS project. Looking for ideas to include ML concepts in it

Day 66 (13-11-18) Completed Lesson 1 in the PyTorch Challenge

  • Completed lesson 1 in the PyTorch challenge
  • Learnt about the Bellman equation in Reinforcement Learning (Move 37)

Day 67 (14-11-18) PyTorch Challenge

  • Fixed performance bugs in the 'Social Network' project (which I have to submit soon)
  • Working on the PyTorch challenge lesson 2

Day 68 (15-11-18) CUDA programming using PyTorch

  • Learnt about CUDA programming to utilize a GPU to its max
  • Learnt about TFlite models for deep learning on a smartphone

Day 69 (16-11-18) Google Cloud : Machine Learning and BigQuery

  • Learnt about the basics of using Google Cloud along with BigQuery datasets and ML
  • Worked on learning about TFX to build end to end Deep Learning models
  • Completed the minimum viable project for DBMS lab!

Day 70 (17-11-18) Completed lesson 3 of PyTorch Challenge

  • Completed lesson 3 of the PyTorch challenge
  • Working on the DNNs with PyTorch lesson

Day 71 (18-11-18) Learnt more about PyTorch and Tensorflow workflows

  • Imperative programming in PyTorch and the dynamic front end is more suited for research implementations
  • Learnt more about deploying models as low level C++ and the production-ready Tensorflow workflow

Day 72 (19-11-18) Learnt the Keras workflow for RESNET-50 Implementation

  • Read about classic CNN architectures like AlexNet, Lenet-5, VGG-16 and Microsoft Research's Resnet
  • Learnt the Keras workflow to implement Resnet
  • Learnt more about Skip connections with Convolutional as well as ID blocks

Day 73 (20-11-18) Working on object detection for cars

  • Completed the Resnet implementation
  • Working on car/object detection

Day 74 (21-11-18) Finished the YOLOv2 implementation

  • Implemented part of the YOLO paper (released on June 12th 2015), but without the K-Means clustering for drawing the bounding boxes
  • Learnt about implementing non-max suppression and IoU for filtering probabilistic results
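
A minimal sketch of IoU (intersection over union) and greedy non-max suppression in plain Python. This is not the YOLOv2 assignment code; the box format (x1, y1, x2, y2), the 0.5 threshold and the sample boxes are assumptions for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Keep the highest-scoring boxes, dropping any box that overlaps a kept one."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep  # indices into boxes, best score first

# Made-up detections: the first two overlap heavily, the third is far away
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = non_max_suppression(boxes, scores)
```

The second box overlaps the first with an IoU of 81/119 (about 0.68), so it is suppressed, and only the first and third boxes survive.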

Day 75 (22-11-18) Working on a Faster R-CNN implementation

  • Despite YOLO being a good solution, tried to implement Fast R-CNN and Faster R-CNN from scratch in Tensorflow
  • Poor results on this. It is more or less guessing the solution despite using Adam optimization
  • YOLO with K-Means clustering seems like a better option. Will look into it soon

Day 76 (23-11-18) O'Reilly Tensorflow for Deep Learning

  • Learnt about the Computation graph and how parallelizing TF clusters improves performance
  • Ordered parts for my Deep Learning PC!
  • Have an i7-8700, an NVIDIA GTX 1080 and the MSI A-Pro Z370 motherboard so far!

Day 77 (24-11-18) Working on Face Recognition

  • Working on Siamese Networks for learning Similarity functions
  • Improving Happy House with Face recognition

Day 78 (25-11-18) Working on the PyTorch lesson 4

  • Working on Udacity's Lesson 4 of the PyTorch challenge
  • Got the O'Reilly Data Science book in R with the Tidyverse
  • Working on the problem statement for Hackference Hackathon

Day 79 (26-11-18) O'Reilly Tensorflow Models

  • Work on the Hackference hack cancelled as the deadline for documentation submission passed hours before we submitted our proposal!
  • Learnt more about GPU and CUDA programming for Deep Learning
  • Completed 2 chapters of the O'Reilly Deep Learning with Tensorflow book. It is a fantastic book to read!

Day 80 (27-11-18) Pandas and Matplotlib for Data Science in Python

  • Watched a few videos by SentDex on Youtube for a hands-on refresher in Pandas and Matplotlib
  • Preparing for data science internships and thus reviewed Sampling theory and some sample interview questions
  • Learnt more about the Tidyverse in R and how statisticians build their workflow in it
  • The initial steps include: Exploratory Data Analysis(EDA) with ggplot2, wrangling with tidyr and dplyr, and programming with magrittr and purrr

Day 81 (28-11-18) Deep Learning in insurance

  • Had to spend time on exam preparation, but managed to review data science material in R
  • Revised notes from a previous Harvard Data Science in Genomics course that I had audited in Feb

Day 82 (29-11-18) Papers papers everywhere!

  • Spent time on reading papers in Computer Vision to further deepen my understanding of CNNs
  • Read about CNN architectures in depth
  • Ran the numbers and did some research for the Acko insurance hackathon proposal

Day 83 (30-11-18) Neural Style Transfer paper

  • Spent time reviewing Deep Dream and Neural Style transfer with Gatys et al and their paper
  • Worked on the documentation for the Acko Hack proposal of insurance premium predictions for self-driving car adoption
  • Made plans for the home stretch of the challenge with lots of cool stuff planned for the 18 days to come and more!

Day 84 (1-12-18) Worked on CNNs Assignment and my own implementation of the paper

  • Worked on the neural style transfer assignment
  • Read up more and worked on the implementation of the Neural Style Transfer paper from scratch

Day 85 (2-12-18) Completed the course on CNNs

  • Completed the course on CNNs as part of the Deep Learning Specialization Certificate

Day 86 (3-12-18) Hierarchical Clustering

  • Spent time learning about the different clustering techniques apart from K-Means to solve the question for an internship interview
  • Learnt about the end-to-end data science pipeline

Day 87 (6-12-18) Papers papers everywhere 2!

  • Spent a whole lot of time going through different papers handpicked from Arxiv and Arxiv sanity
  • Topics and papers include CycleGan (Didn't really get it!), DeepFace, FaceNet, etc.

Day 88 (7-12-18) Google Cloud Developers Meetup

  • Started the Sequence Models course by Andrew Ng
  • Attended the Google Cloud Meetup which covered topics including using Kibana, elasticsearch and Cloud ML API

Day 89 (8-12-18) GDG AI Meetup - Hands-on session on NLP

  • Attended the GDG AI Meetup at Altimetrik
  • Worked on the colab notebook pertaining to building a top class sentiment analyser using Spacy and Altair
  • NLP with Spacy which is an industry grade NLP library along with NLTK

Day 90 (9-12-18) Time Series with GCP

  • Worked on time series analysis with Google cloud codelab
  • GPU accelerated sessions with PyTorch support (CUDA backed)

Day 91 (10-12-18) O'Reilly From Scratch Challenge

  • Worked on O'Reilly's Tensorflow from scratch challenge from the email newsletter
  • Also had my DBMS lab exam today. Did great by the way!

Day 92 (11-12-18) RNNs and GRUs

  • Worked on implementing RNNs and GRUs from scratch
  • Understood the efficiency tradeoffs of GRUs while working on it

Day 93 (12-12-18) Breast Cancer Classification

  • Using the UCI FNA (Fine Needle Aspiration) dataset to classify tumours
  • Worked on implementing the entire Machine learning pipeline from data preprocessing to model validation
  • Used pandas for data pre-processing, seaborn for exploratory data analysis and an SVM for classification

Day 94 (13-12-18) Acko Insurance Hackathon Project: Phase 1

  • Working on 3 Kaggle datasets to predict insurance claims in various categories
  • Used the AllClaims Insurance dataset to predict the insurance claims in Auto and Life insurance industries
  • Working on a chat application for better customer retention

Day 95 (14-12-18) Acko Insurance Hackathon Project: Phase 2

  • Working on the Acko insurance project submission
  • Learnt about lstms and seq2seq word embeddings to build a chatbot
  • All set for the hackathon tomorrow!

Day 96 (15-12-18) Sequence Models

  • Completed Week 1 of Andrew Ng's Sequence Models course
  • Working on the Jazz production with lstms problem statement
  • Successfully submitted the Acko Hackathon solution (We didn't get shortlisted :( )

Day 97 (17-12-18) Working on my research paper

  • Resumed work on my Genomics Research paper
  • Learnt about Python galaxy for DNA Sequencing
  • Spent time learning about Stanford's clusters for gene sequencing and their enormous budgets!

Day 98 (18-12-18) Built my Deep Learning Rig!

Day 99 (21-12-18) ARIMA models for Time series analysis

  • Autoregressive Integrated Moving Average models are perfect for time series prediction
  • Used it on data that includes a seasonal temporal shift. The data was non-stationary, with trends in the distribution, and thus had to be differenced (the "integrated" part of ARIMA), as in the Box-Jenkins approach
  • Walk-forward validation is extremely accurate as it provides every iteration with all the available data. This is computationally intensive and hence can be used only for small datasets.
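
A rough sketch of differencing and walk-forward validation in plain Python. To stay dependency-free it swaps ARIMA for a toy drift forecast (last value plus the mean first difference), so it only illustrates the validation loop, not the model I actually used. The function names and the sample series are made up.

```python
def difference(series, lag=1):
    """Differencing: removes a trend by subtracting the value `lag` steps back."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]

def forecast_next(history):
    """Toy stand-in for ARIMA: last value plus the mean first difference."""
    diffs = difference(history)
    drift = sum(diffs) / len(diffs)
    return history[-1] + drift

def walk_forward(series, n_test):
    """Refit on all data seen so far, predict one step, then reveal the true value."""
    history = list(series[:-n_test])
    predictions = []
    for actual in series[-n_test:]:
        predictions.append(forecast_next(history))  # the model sees only the past
        history.append(actual)                      # the history grows every step
    return predictions

series = [1, 3, 5, 7, 9, 11, 13]        # made-up series with a linear trend
predictions = walk_forward(series, n_test=3)
```

The cost mentioned above is visible in the loop: the model is refit on a growing history at every step, so the number of fits scales with the size of the test window.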

Day 100 (22-12-18) Memory Networks for single supporting fact problems

  • Learnt about memory networks and applied it on the bAbI dataset
  • Memory networks can also be used to make chatbots as they have more information gain than lstms with seq2seq embeddings
  • Worked on the Reddit dataset to build a general purpose dataset

Day 101-105 (23-12-18 - 26-12-18) Genomics for personalised medicine Whitepaper

  • Private repo for now. Will make it public soon!

Challenge Complete!

  • It has been a wonderful learning curve and I am looking forward to doing another one after my exams!