My submission for the Home Credit Default Risk Kaggle competition.
The objective of this competition is to predict how capable each applicant is of repaying a loan. I started with a simple exploratory data analysis, comparing the distributions of features between classes and looking for correlations. Later, I experimented with manual and automated feature engineering and compared the performance of various machine learning models.
- Python - The programming language of this project
- Pandas and NumPy - Data wrangling
- Various Scikit-learn tools and pipelines - Data preprocessing
- Imbalanced-learn - Class balancing
- Seaborn and Matplotlib - Data visualization
- Scikit-learn, LightGBM - Machine learning model training and evaluation
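
To show how the tools listed above can fit together, here is a minimal sketch of a preprocessing and training pipeline. It is not the code from this submission: the data is synthetic (generated with `make_classification`), the logistic regression is only a placeholder for the models I compared, and `class_weight="balanced"` stands in for resampling with Imbalanced-learn so that the example needs nothing beyond NumPy and Scikit-learn. The score is ROC AUC, the metric this competition is evaluated on.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic, imbalanced stand-in for the application table (NOT the real data).
X, y = make_classification(
    n_samples=2000,
    n_features=10,
    n_informative=5,
    weights=[0.92, 0.08],  # assumed imbalance, chosen only for illustration
    random_state=0,
)

# Blank out about 5% of the values so the imputation step has work to do.
rng = np.random.default_rng(0)
X[rng.random(X.shape) < 0.05] = np.nan

# Stratify the split so both parts keep the same class ratio.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

model = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
    # class_weight="balanced" reweights classes instead of resampling them;
    # it is a stand-in for an Imbalanced-learn sampler in this sketch.
    ("clf", LogisticRegression(class_weight="balanced", max_iter=1000)),
])
model.fit(X_train, y_train)

# Score with the predicted probability of the positive (minority) class.
proba = model.predict_proba(X_test)[:, 1]
auc = roc_auc_score(y_test, proba)
print(f"ROC AUC on held-out synthetic data: {auc:.3f}")
```

Keeping imputation and scaling inside the `Pipeline` means they are fitted on the training split only, which avoids leaking information from the held-out rows. Swapping the last step for a LightGBM classifier, or adding a sampler through Imbalanced-learn's own pipeline class, follows the same pattern.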