This repository contains a project in which I build a clustering model to segment the customers of a bank who take out credit.
The goal is to investigate which group of customers takes the highest credit amounts. The dataset contains both categorical and numerical data, so I used K-Means clustering with one-hot encoding for the categorical features.
I also used the K-Prototypes algorithm, because it handles categorical and numerical data together without encoding the categorical features.
The result was that the group with ages between 20 and 68 years takes the credits with the highest amounts.
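
As a rough illustration of the K-Means approach described above, here is a minimal sketch using scikit-learn. The toy data and the column names (`age`, `credit_amount`, `housing`) are assumptions made for this example and are not taken from the repository's dataset; the number of clusters is also chosen arbitrarily.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy data: column names and values are illustrative only.
customers = pd.DataFrame({
    "age": [22, 25, 47, 52, 33, 61, 29, 58],
    "credit_amount": [1200, 1500, 8000, 9500, 2100, 10200, 1800, 8700],
    "housing": ["rent", "rent", "own", "own", "rent", "own", "free", "own"],
})

numeric_cols = ["age", "credit_amount"]
categorical_cols = ["housing"]

# Scale the numerical columns and one-hot encode the categorical ones,
# so that K-Means receives a purely numerical feature matrix.
preprocess = ColumnTransformer([
    ("num", StandardScaler(), numeric_cols),
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
])

model = Pipeline([
    ("preprocess", preprocess),
    ("kmeans", KMeans(n_clusters=2, n_init=10, random_state=0)),
])

# Assign each customer to a cluster, then compare credit amounts per cluster.
customers["cluster"] = model.fit_predict(customers[numeric_cols + categorical_cols])
summary = customers.groupby("cluster")["credit_amount"].mean()
print(summary)
```

Comparing the mean credit amount per cluster is one simple way to see which group takes the highest amounts. The notebook may use a different number of clusters and different preprocessing.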
- Python
- Scikit-Learn
- Plotly
- Seaborn
- Gower distance
- kmodes package for the K-Prototypes algorithm
- Prince library for Factor Analysis of Mixed Data (FAMD, a PCA analogue for numerical and categorical data)
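
To give an idea of what the Gower distance in the list above measures, here is a small hand-written sketch in NumPy. It is not the implementation used in the notebook; the function name `gower_matrix_sketch` and the toy values are assumptions for illustration. Each numerical feature contributes its absolute difference divided by the feature's range, each categorical feature contributes 0 for a match and 1 for a mismatch, and the distance is the mean over all features.

```python
import numpy as np

def gower_matrix_sketch(numeric, categorical):
    """Pairwise Gower distances for mixed data (illustrative sketch)."""
    numeric = np.asarray(numeric, dtype=float)
    categorical = np.asarray(categorical)

    # Numerical part: absolute difference scaled by each feature's range.
    ranges = numeric.max(axis=0) - numeric.min(axis=0)
    ranges[ranges == 0] = 1.0  # avoid division by zero for constant columns
    num_part = np.abs(numeric[:, None, :] - numeric[None, :, :]) / ranges

    # Categorical part: 0 when the categories match, 1 otherwise.
    cat_part = (categorical[:, None, :] != categorical[None, :, :]).astype(float)

    # Gower distance: mean of the per-feature dissimilarities.
    n_features = numeric.shape[1] + categorical.shape[1]
    return (num_part.sum(axis=2) + cat_part.sum(axis=2)) / n_features

# Hypothetical customers: (age, credit amount) plus one categorical feature.
numeric = [[20, 1000], [40, 3000], [60, 5000]]
categorical = [["rent"], ["rent"], ["own"]]

distances = gower_matrix_sketch(numeric, categorical)
print(distances.round(3))
```

Because every feature contributes a value between 0 and 1, the resulting distances also lie between 0 and 1, which makes numerical and categorical features comparable without one-hot encoding.
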
If you want to use the repository, you can clone it with git or download it.
**You can see the notebook here**