This repository contains my solution to the Machine Learning Summer Training Hackathon 2022 from Analytics Vidhya.
The data comes from the Machine Learning Summer Training Hackathon. Steps of my solution:
- Univariate analysis of features: check for normal distribution and outliers
- Bivariate analysis of features: check for correlation between the features and the target variable
- Inspection of the correlation matrix
- Imputation of missing values for 'education'
- One-hot encoding for 'proof_submitted'
- Label encoding for 'education'
- Modelling. Models tried: Logistic Regression, Decision Tree Classifier, Random Forest Classifier, Bagging Classifier, XGBoost. My final model is the Random Forest Classifier, with a final macro F1 score of 0.524564 on the test data.
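
For readers unfamiliar with the metric reported in the last step, here is a minimal pure-Python sketch of how a macro F1 score is computed: the F1 score is calculated for each class separately and the results are averaged with equal weight, so a minority class counts as much as the majority class. The function name `macro_f1` and the toy labels are assumptions made for illustration. They are not taken from the notebook or the hackathon data. The result should agree with scikit-learn's `f1_score(..., average="macro")`, which is likely what was used in practice.

```python
def macro_f1(y_true, y_pred):
    """Return the unweighted mean of the per-class F1 scores."""
    # Consider every class that appears in either the truth or the predictions.
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    scores = []
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs.
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy labels invented for illustration only (not the hackathon data).
y_true = [0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 1, 1, 0]

score = macro_f1(y_true, y_pred)
print(f"macro F1 = {score:.3f}")
```

In this toy example the accuracy is 4/6, but the weaker F1 of the minority class pulls the macro average down, which is why the metric is a common choice for imbalanced classification problems.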