Exploratory data analysis (EDA) and a comparison of different learning algorithms on the Car Features dataset.
Preprocessing of the dataset includes the following steps:
-Deleting attributes with a large amount of missing data.
-Deleting duplicate data.
-Removing outliers.
-Imputing the missing data.
-Encoding the categorical data.
-Scaling the dataset.
After these preprocessing steps, the dataset is ready to be fed into a learning model.
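
The preprocessing steps listed above could be sketched roughly as follows with pandas and scikit-learn. This is only an illustration and not necessarily the code used in this project: the toy DataFrame, its column names (`make`, `hp`, `mostly_empty`, `price`), the 40% missing-data threshold, the IQR outlier rule, and the median / most-frequent imputation strategies are all assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler


def clean(df, target, max_missing=0.4, iqr_factor=1.5):
    """Drop sparse columns, duplicate rows, and outliers in the target."""
    # Delete attributes whose share of missing values exceeds max_missing (assumed threshold).
    df = df.loc[:, df.isna().mean() <= max_missing]
    # Delete duplicate rows.
    df = df.drop_duplicates()
    # Remove outliers in the target column with the IQR rule (assumed rule).
    q1, q3 = df[target].quantile([0.25, 0.75])
    iqr = q3 - q1
    keep = df[target].between(q1 - iqr_factor * iqr, q3 + iqr_factor * iqr)
    return df[keep].reset_index(drop=True)


def build_preprocessor(X):
    """Impute, encode, and scale, with separate handling per column type."""
    numeric = X.select_dtypes(include="number").columns.tolist()
    categorical = X.select_dtypes(exclude="number").columns.tolist()
    num_pipe = Pipeline([
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
    ])
    cat_pipe = Pipeline([
        ("impute", SimpleImputer(strategy="most_frequent")),
        ("encode", OneHotEncoder(handle_unknown="ignore")),
    ])
    # sparse_threshold=0 forces a dense array as output.
    return ColumnTransformer(
        [("num", num_pipe, numeric), ("cat", cat_pipe, categorical)],
        sparse_threshold=0,
    )


# Hypothetical toy data standing in for the real dataset.
toy = pd.DataFrame({
    "make": ["A", "B", "A", "B", "A", "A", np.nan, "B"],
    "hp": [100.0, 150.0, 100.0, np.nan, 120.0, 130.0, 110.0, 900.0],
    "mostly_empty": [np.nan] * 7 + [1.0],
    "price": [20000, 30000, 20000, 25000, 22000, 24000, 21000, 500000],
})

cleaned = clean(toy, target="price")
X = cleaned.drop(columns="price")
y = cleaned["price"]
Xt = build_preprocessor(X).fit_transform(X)
print(cleaned.shape, Xt.shape)
```

In practice the preprocessor would be fitted on the training split only and then applied to the test split, so that imputation and scaling statistics do not leak from the test data.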
The different models tested here are:
-Linear Regression
-KNN Regression
-Support Vector Regression (SVR)
-Decision Tree Regression
-Random Forest Regression
-XGBoost
XGBoost achieved the best R2 score among the models tested here.
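
A comparison of this kind could be set up along the following lines. This is a minimal sketch on synthetic data from `make_regression`, not the Car Features dataset, so it does not reproduce the result reported above. The hyperparameters are assumed defaults, and XGBoost is included only if the `xgboost` package is installed.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for the preprocessed dataset (assumption).
X, y = make_regression(n_samples=300, n_features=8, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

models = {
    "Linear Regression": LinearRegression(),
    "KNN Regression": KNeighborsRegressor(n_neighbors=5),
    "SVR": SVR(),
    "Decision Tree Regression": DecisionTreeRegressor(random_state=0),
    "Random Forest Regression": RandomForestRegressor(n_estimators=100, random_state=0),
}

# XGBoost is an optional third-party dependency; skip it if it is missing.
try:
    from xgboost import XGBRegressor
    models["XGBoost"] = XGBRegressor(n_estimators=200, random_state=0)
except ImportError:
    pass

# Fit every model on the training split and score it on the held-out split.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = r2_score(y_test, model.predict(X_test))

# Print the models from best to worst R2 score.
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<28}{score:.3f}")
```

Because the synthetic target here is linear in the features, the ranking printed by this sketch says nothing about the ranking on the real data; only the evaluation procedure carries over.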