Scikit-learn is a free, open-source machine learning library for the Python programming language. It provides a range of classification, regression, and clustering algorithms, including support vector machines, random forests, gradient boosting, and k-means, and is designed to interoperate with the Python numerical and scientific libraries NumPy and SciPy.
This repository contains three notebooks, each covering a different aspect of data preprocessing for machine learning with scikit-learn:
- Feature encoding
- Feature scaling
- Missing-value imputation
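
As a rough orientation, the three topics above can be sketched in a few lines. This is a minimal illustration, not code taken from the notebooks: the toy arrays `colors` and `heights` are hypothetical, and `OneHotEncoder`, `SimpleImputer`, and `StandardScaler` are just one common choice of scikit-learn transformer for each step.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy data (an assumption for illustration only).
colors = np.array([["red"], ["green"], ["red"], ["blue"]])   # categorical feature
heights = np.array([[1.0], [np.nan], [3.0], [5.0]])          # numeric feature with a gap

# Feature encoding: one binary column per distinct category.
encoder = OneHotEncoder()
colors_encoded = encoder.fit_transform(colors).toarray()

# Missing-value imputation: replace NaN with the mean of the observed values.
imputer = SimpleImputer(strategy="mean")
heights_imputed = imputer.fit_transform(heights)

# Feature scaling: shift and rescale to zero mean and unit variance.
scaler = StandardScaler()
heights_scaled = scaler.fit_transform(heights_imputed)

print(encoder.categories_[0])
print(colors_encoded)
print(heights_imputed.ravel())
print(heights_scaled.ravel())
```

Imputation runs before scaling here so that the scaler never sees missing values. In a real project each transformer would normally be fitted on the training split only and then applied to the test split with `transform`.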