I am Rishikesh Fulari, and I am open to working on fascinating AI projects. I have been working in the NLP domain for a while now, but would love to collaborate on anything related to ML.
- Previously worked as a Machine Learning Engineer at CrimeCheck.ai (a Bangalore-based startup, now acquired by Idfy).
- Completed a Bachelor's in Computer Science with distinction; currently pursuing a Master's in Computer Science at Purdue University.
1. Devised a named-entity recognition system to convert unstructured Indian addresses to structured ones. product
- Indian addresses do not follow any particular structure and therefore suffer from plenty of formatting and spelling mistakes. This project aimed at implementing a named entity recognition model to convert those unstructured addresses to structured ones.
- Collected address data from various publicly available sources like OpenStreetMap, government websites, Indian postal department, etc. Created a synthetic dataset of 90 million addresses and trained a language model using masked language modeling.
- Further fine-tuned this model for the downstream NLP task of named-entity recognition, to identify and label the different components of an address. This is an industry project and is currently in use at my past employer.
- Fine-tuned Detectron2 to extract tabular data and other useful information from receipts and other documents.
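The last step of the address project, turning per-token NER labels into a structured record, can be sketched as follows. This is a minimal illustration and not the production code: the BIO tagging scheme, the label names (`HOUSE`, `STREET`, `CITY`, `PINCODE`), and the helper `group_address_entities` are all assumptions made for the example.

```python
from collections import defaultdict


def group_address_entities(tokens, tags):
    """Merge BIO-tagged tokens into a {label: text} mapping.

    The BIO scheme and the label names are hypothetical; the real
    model's label set is not described in this README.
    """
    fields = defaultdict(list)  # label -> list of token spans
    current_label = None
    for token, tag in zip(tokens, tags):
        prefix, _, label = tag.partition("-")
        if prefix == "B" or (prefix == "I" and label != current_label):
            # Start a new span (an I- tag without a matching B- is treated leniently).
            fields[label].append([token])
            current_label = label
        elif prefix == "I":
            # Continue the span that is currently open for this label.
            fields[label][-1].append(token)
        else:
            # "O" (outside) tokens close any open span and are dropped.
            current_label = None
    return {
        label: ", ".join(" ".join(span) for span in spans)
        for label, spans in fields.items()
    }


# Toy input, invented for illustration only.
tokens = ["Flat", "12", "MG", "Road", "Bangalore", "560001"]
tags = ["B-HOUSE", "I-HOUSE", "B-STREET", "I-STREET", "B-CITY", "B-PINCODE"]
structured = group_address_entities(tokens, tags)
```

In practice the tags would come from the fine-tuned model's predictions rather than a hand-written list.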
1. Automating the ML life cycle using MLOps tools for a car price prediction machine learning model. code
- Complete end-to-end MLOps implementation for training, maintaining, and monitoring a machine learning model that predicts the price of a used car from several relevant factors.
- It includes experiment tracking with MLflow, workflow orchestration with Prefect, and Grafana for monitoring and detecting data drift. The model is deployed using Docker and Flask.
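To make the data drift idea concrete, here is a small sketch of one possible drift score, the Population Stability Index (PSI), computed between a reference sample and a current sample of a single numeric feature. The choice of PSI, the bin count, and the function name are assumptions for illustration; the README does not say which drift metric the project actually monitors.

```python
import math


def population_stability_index(reference, current, bins=10):
    """Compute PSI between two samples of one numeric feature.

    Bin edges are taken from the reference sample; values outside
    that range are clipped into the first or last bin.
    """
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if the feature is constant

    def bin_fractions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        # A small floor avoids log(0) and division by zero for empty bins.
        return [max(c / len(values), 1e-6) for c in counts]

    ref = bin_fractions(reference)
    cur = bin_fractions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))


# Toy data: identical samples give no drift, a shifted sample gives a large score.
reference = list(range(100))
psi_same = population_stability_index(reference, list(range(100)))
psi_shifted = population_stability_index(reference, [v + 50 for v in range(100)])
```

A score like this could be logged on a schedule and plotted on a dashboard; a frequently quoted rule of thumb treats values above roughly 0.25 as significant drift, though the right threshold depends on the feature.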
2. Predicting the readmission rate of diabetic patients using machine learning for better healthcare. blog code
- Implemented an end-to-end machine learning pipeline, from data cleaning to model deployment on an AWS EC2 instance, demonstrating full-stack machine learning skills.
- Hospitals in the USA are penalized by the government if a patient is readmitted within 30 days of discharge, yet they have no means of predicting which patients will be readmitted. This project addresses the problem by using machine learning to predict which patients are likely to be readmitted within 30 days.
3. Predicting the likelihood of conversion of a free-tier user to a paid one for an Ed-tech company. blog code demo
- Implemented an end-to-end machine learning model to predict whether a free-tier user would buy a subscription to the e-learning platform ‘365 data science’, using real-world platform analytics data.
- The data was provided by the Ed-tech platform, and the final model was deployed on Hugging Face Spaces with Streamlit as the front-end framework.
4. Developed and deployed an application to generate captions for visually challenged people. demo
- Implemented a deep learning pipeline using pretrained models from Hugging Face to generate captions for images; the captions are then fed to a text-to-speech API that reads them aloud. This project was built to help differently abled people browse image content on the internet.
- Implemented an end-to-end machine learning model to predict whether two given questions have the same semantic meaning.
- Used the Quora question pair similarity dataset and embeddings from a pretrained model to detect semantic similarity. Forums and Q&A sections are often filled with duplicate entries; this project aimed to find those duplicate questions using recent advances in natural language processing, such as embeddings from pretrained models.
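One common way to compare two question embeddings is cosine similarity with a decision threshold. The sketch below assumes that approach; the README does not state how the embeddings were actually compared, and the function names, the toy vectors, and the `0.8` threshold are illustrative assumptions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero embedding as dissimilar to everything
    return dot / (norm_a * norm_b)


def is_duplicate(emb_q1, emb_q2, threshold=0.8):
    """Flag two questions as duplicates when their embeddings are close enough.

    The threshold is a hypothetical value; it would normally be tuned
    on labeled question pairs.
    """
    return cosine_similarity(emb_q1, emb_q2) >= threshold


# Toy 3-dimensional vectors standing in for real sentence embeddings.
same_direction = is_duplicate([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
orthogonal = is_duplicate([1.0, 0.0, 0.0], [0.0, 1.0, 0.0])
```

Real embeddings from a pretrained model would have hundreds of dimensions, but the comparison step stays the same.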
6. Predicting the user engagement for celebrity tweets. demo
- Developed and deployed a machine learning model that predicts user engagement, measured as the number of retweets a particular tweet would get, based on the semantic meaning and timestamp of the tweet.
- Scraped Twitter data from Justin Bieber’s Twitter account and used it as training data to predict the number of retweets his tweets would get.
- Python, TensorFlow, PyTorch
- Machine Learning, Deep Learning, Data Analysis, Predictive Modeling, Forecasting methods
- Natural Language Processing
- Deep Learning
- Machine Learning