This repository contains a portfolio of Data Science and Machine Learning projects.
The projects are presented as IPython Notebooks and PDFs.

No | Notebook | Description |
---|---|---|
1 | NumPy Overview | Overview of how to use NumPy |
2 | Pandas Overview | Overview of how to use pandas |
3 | Matplotlib Overview | Overview of how to use Matplotlib for data visualization |
4 | Seaborn Overview | Overview of how to use seaborn for data visualization |

No | Notebook | Description |
---|---|---|
1 | Feature Engineering: Variable Types & Characteristics | Collection of variable types and characteristics, such as missing data mechanisms (MNAR, MCAR, MAR), cardinality, distributions, linear model assumptions, outliers, and variable magnitude |
2 | Feature Engineering: Univariate Missing Data Imputation | Collection of univariate missing data imputation techniques, such as mean/median/mode, arbitrary value, end of distribution, random sample, and many more |
3 | Feature Engineering: Multivariate Missing Data Imputation | KNN and MICE multivariate missing data imputation |
4 | Feature Engineering: Categorical Encoding | Collection of categorical encoding techniques, such as rare label encoding, one-hot encoding, WoE encoding, and other monotonic relationship encodings |
5 | Feature Engineering: Variable Transformation | Collection of variable transformation techniques for transforming non-Gaussian distributions for linear models, such as the log, Box-Cox, and Yeo-Johnson transformers |
6 | Feature Engineering: Discretization | Collection of discretization methods, such as equal-width discretization, equal-frequency discretization, K-means discretization, and many more |
7 | Feature Selection: Filter Methods | Collection of feature selection filter methods, such as constant features, quasi-constant features, duplicated feature pairs, multicollinearity, mutual information, ANOVA, and many more |
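
As a rough illustration of two techniques named in the table above (median imputation and equal-width discretization), here is a minimal standard-library sketch. It is not taken from the notebooks; the helper names `impute_median` and `equal_width_bins` and the sample data are hypothetical.

```python
from bisect import bisect_right
from statistics import median


def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def equal_width_bins(values, n_bins):
    """Assign each value to one of n_bins intervals of equal width."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    # Interior cut points only; the maximum value falls in the last bin.
    edges = [lo + width * i for i in range(1, n_bins)]
    return [bisect_right(edges, v) for v in values]


ages = [22, None, 35, 58, None, 41, 30]  # made-up sample data
filled = impute_median(ages)
bins = equal_width_bins(filled, n_bins=3)
print(filled)  # [22, 35, 35, 58, 35, 41, 30]
print(bins)    # [0, 1, 1, 2, 1, 1, 0]
```

In practice the notebooks may rely on library transformers instead; this sketch only shows the underlying idea.
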

No | Notebook | Report | Dashboard | Description |
---|---|---|---|---|
1 | E-Commerce Sales Performance and Customer RFM Behavior Analysis | | Tableau Dashboard Story | E-commerce companies want to understand their sales performance and customer behavior. The goals of this analysis are to understand customer behavior and to recommend ways to increase sales and customer satisfaction |
2 | Credit Default Risk_Home Credit_Light GBM | | - | Credit default risk classification and debtor grading using LightGBM, with SHAP for model explainability |
3 | Book Recommendation System_Content and Item-based Collaborative Filtering | | - | A book recommendation system that helps users choose books based on the books they have purchased |
4 | Article Topic Classification_Kumparan_Light GBM | | - | A model that classifies article topics from their content using TF-IDF vectorization |
5 | Airplane Passengers_SARIMA Forecasting | - | - | Seasonal forecasting of the number of airplane passengers using walk-forward validation |
6 | Sales Advertising_Linear Regression | - | - | Sales prediction based on advertising amount |
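
For readers unfamiliar with the walk-forward validation mentioned for the forecasting project, the sketch below shows the general idea with an expanding training window. It is an assumption-laden illustration, not the notebook's code: a seasonal-naive forecast stands in for SARIMA, and the function names and sample series are hypothetical.

```python
def walk_forward_splits(n_obs, initial_train, horizon=1):
    """Yield (train_indices, test_indices) with an expanding training window."""
    start = initial_train
    while start + horizon <= n_obs:
        yield range(0, start), range(start, start + horizon)
        start += horizon


def seasonal_naive(history, season_length, horizon):
    """Stand-in model: repeat the values observed one season earlier."""
    base = len(history) - season_length
    return [history[base + (h % season_length)] for h in range(horizon)]


series = [10, 20, 30, 40, 12, 22, 32, 42, 14, 24, 34, 44]  # made-up, season length 4
errors = []
for train_idx, test_idx in walk_forward_splits(len(series), initial_train=8, horizon=2):
    history = [series[i] for i in train_idx]   # only past observations
    actual = [series[i] for i in test_idx]     # the next `horizon` points
    forecast = seasonal_naive(history, season_length=4, horizon=2)
    errors.extend(abs(a - f) for a, f in zip(actual, forecast))

mae = sum(errors) / len(errors)  # mean absolute error across all folds
print(mae)  # 2.0
```

Each fold refits (here, simply re-reads) the model on all data up to the forecast origin, so no future observation leaks into the training window.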