A list of repositories containing data science projects I completed for academic, self-learning, competition, and hobby purposes. They are presented as Jupyter notebooks.
-
-
Supervised Learning: Customer Segmentation Using RFM Analysis: In this project we perform customer segmentation using Recency, Frequency and Monetary (RFM) analysis on current customers. After segmentation, each customer is labeled as a possible target or not. Finally, we train an LGBMClassifier on these labels and use it to predict on new customers.
Tools: scikit-learn, pandas, RFM analysis, LGBMClassifier, imblearn, plotly
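
As a rough illustration of the RFM step described above, the sketch below builds a Recency/Frequency/Monetary table with pandas. The transaction data, the column names (`customer_id`, `order_date`, `amount`) and the median-based `is_target` rule are assumptions made for this example; they are not taken from the project notebook.

```python
import pandas as pd

# Hypothetical transaction log; values and column names are invented.
transactions = pd.DataFrame({
    "customer_id": [1, 1, 2, 3, 3, 3],
    "order_date": pd.to_datetime([
        "2023-01-05", "2023-03-10", "2023-02-20",
        "2023-01-15", "2023-02-15", "2023-03-25",
    ]),
    "amount": [50.0, 70.0, 20.0, 30.0, 45.0, 60.0],
})

# Reference date: one day after the most recent order in the data.
snapshot = transactions["order_date"].max() + pd.Timedelta(days=1)

# One row per customer: days since last order, order count, total spend.
rfm = transactions.groupby("customer_id").agg(
    recency=("order_date", lambda d: (snapshot - d.max()).days),
    frequency=("order_date", "count"),
    monetary=("amount", "sum"),
)

# Assumed labeling rule: recent buyers with at least median total spend.
rfm["is_target"] = (
    (rfm["recency"] <= rfm["recency"].median())
    & (rfm["monetary"] >= rfm["monetary"].median())
)

print(rfm)
```

A label column like `is_target` could then serve as the training target for a classifier; the actual segmentation rule used in the project may differ.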
-
Supervised Learning: Stock Percentage Estimation: Estimates stock percentage using a random forest regressor.
Tools: scikit-learn, pandas, RandomForestRegressor, seaborn
-
Supervised Learning: Predict Customer Buying Behavior: This was the second task of the British Airways Virtual Experience program. Using the provided data, we had to predict whether a customer would complete a booking.
Tools: scikit-learn, pandas, imbalanced-learn, seaborn, matplotlib
-
Supervised Learning: Medical Cost Analysis: This was the final project for the Global AI Hub International ML Bootcamp. The aim of the project is to estimate the approximate cost of a person's health insurance.
Tools: scikit-learn, pandas, seaborn, matplotlib
-
Supervised Learning: Predict Career Longevity for NBA Rookies: This was a competition that was part of the Python for Machine Learning International Bootcamp by Global AI Hub. The aim is to predict career longevity of NBA rookies.
Tools: scikit-learn, xgboost, pandas, imbalanced-learn, seaborn, matplotlib
-
Supervised Learning: Parkinson's Disease Detection: The aim of the project is to predict whether a person has Parkinson's disease using biomedical voice measurements.
Tools: scikit-learn, pandas, seaborn, matplotlib
-
Supervised Learning: Predicting Type of Wine: The aim of the project is to predict whether a wine is red or white using the well-known Wine Quality dataset.
Tools: scikit-learn, pandas, imbalanced-learn, seaborn, matplotlib
-
Unsupervised Learning: Mall Customers Segmentation: Analyzes a dataset of mall customers and segments them with KMeans clustering.
Tools: scikit-learn, pandas, KMeans, seaborn, matplotlib
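
A minimal sketch of KMeans-based segmentation with scikit-learn follows. The two synthetic features (standing in for something like income and spending score), the choice of two clusters, and the scaling step are assumptions for illustration, not the settings used in the notebook.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for two customer features; values are invented.
rng = np.random.default_rng(0)
low = rng.normal(loc=[20, 20], scale=2.0, size=(20, 2))
high = rng.normal(loc=[80, 80], scale=2.0, size=(20, 2))
X = np.vstack([low, high])

# Scale features so both contribute equally to the distance metric.
X_scaled = StandardScaler().fit_transform(X)

# Fit KMeans and assign each customer to a segment.
model = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = model.fit_predict(X_scaled)

print(labels)
```

On real data the number of clusters is usually not known in advance; comparing inertia or silhouette scores across several values of `n_clusters` is a common way to choose it.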
-
-
-
Web Scraping and Customer Reviews Analysis: This was the first task of the British Airways Virtual Experience program. Customer reviews of British Airways were scraped from Skytrax, then cleaned and analysed using NLP techniques such as topic modelling and sentiment analysis.
Tools: scikit-learn, pandas, beautifulsoup, nltk, gensim, wordcloud, vader, LatentDirichletAllocation, seaborn, matplotlib
-
-
-
Python
-
Stock Data Analysis: Analyzes stock data for insights.
-
Data Quality Analysis: Documents the data quality issues found during analysis.
-
-
Power BI
-
Sprocket Central Dashboard: Power BI reports from Task 3 of the KPMG Virtual Internship Program that analyze the Sprocket Central customer data.
-
SuperStore Dashboard: Power BI reports that analyze the SuperStore sales data.
-
-
SQL
- Analysis of Hospital ER Data: Analysis of a hospital ER dataset using SQL, visualized with seaborn.
-
-
-
Python
- Kafka Producer and Consumer: An introduction to Apache Kafka, containing a producer that sends data to a topic and a consumer that reads from that topic.
-