Projects and challenges related to data analytics and data science. All projects here were completed during competitions or hackathons, or as hobby projects.

Project Name | Description | Libraries Used |
---|---|---|
Data Preprocessing - Credit Scores | Preparing a report for a bank's loan division on the factors that affect the likelihood of a loan default | pandas, matplotlib, numpy |
Traffic Indicators | Explore and understand causes of traffic congestion on a highway based on factors such as weather, time of day and accidents | pandas, matplotlib, numpy |
Ebay Car Sales | Preprocessing car sales data to handle missing values, duplicates and outliers | pandas, numpy |
Hacker News | Analyse posts on Hacker News and identify engagement trends for types of posts | csv, datetime |
Car Sales EDA | Exploratory analysis of used car sales data to identify trends and outliers, followed by an analysis of how car prices vary with a range of factors. | pandas, numpy, matplotlib, seaborn |
Statistical Testing | Running statistical tests on cell phone users' data to compare revenue earned from users on different plans and in different regions | pandas, numpy, matplotlib, seaborn, scipy, math, regex |
Video Game Sales | Using historical video game sales and rating data to inform global advertising strategy. | pandas, numpy, matplotlib, seaborn, scipy, sys, warnings |
SQL Music Analysis | Running simple SQL queries to analyse trends and identify prospective markets. The data consists of multiple relational databases | sql |
Employee Satisfaction | Analysis of data from employee exit surveys to identify common causes of employee churn. | pandas, numpy |
Bank Churn Prediction | Predicting if a customer will churn based on certain characteristics. | pandas, numpy, sklearn |
Phone Plan Classifier | Classify legacy customers into new plans based on usage metrics such as calls, internet and texts. | pandas, numpy, sklearn |
Telecom Churn Prediction | Predict churn in phone and internet customers based on certain parameters | pandas, numpy, sklearn, datetime |