
miguelgd54/118BFinal-Project


Census Income: Exploratory Data Science and Machine Learning

In this project, we compared the performance of different classifiers on a binary classification task. The goal was to predict whether a person makes more or less than 50K per year, given a set of categorical and continuous attributes. We used support vector machines with a linear kernel (SVM) and multilayer perceptrons with different activation functions. Testing accuracy was around 0.86 for all classifiers except the SVM with automated correction for class imbalance, which had significantly lower testing accuracy. Training and testing accuracies were similar for all classifiers, indicating good generalization from training to test data. The biggest challenge was the class imbalance in the data set, which resulted in relatively poor recall and, consequently, low F1 scores. Using sklearn's automated correction for class imbalance improved recall, but at the cost of lower precision.

Please see "COGS_118b_final_report.pdf" for the full paper. The above is an excerpt from the paper.
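
The comparison described in the excerpt can be sketched roughly as follows. This is a minimal illustration, not the code from the paper: it uses a synthetic imbalanced data set from `make_classification` as a stand-in for the census data, and it assumes that "automated correction for class imbalance" refers to sklearn's `class_weight="balanced"` option. The model names, hyperparameters, and the 75/25 class split are all hypothetical choices made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the census data (assumption): about 25% positive labels.
X, y = make_classification(
    n_samples=2000,
    n_features=20,
    n_informative=8,
    weights=[0.75, 0.25],
    random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Hypothetical model set mirroring the classifiers named in the excerpt.
models = {
    "svm_linear": SVC(kernel="linear"),
    # Assumed to be the "automated correction for class imbalance".
    "svm_linear_balanced": SVC(kernel="linear", class_weight="balanced"),
    "mlp_relu": MLPClassifier(activation="relu", max_iter=500, random_state=0),
    "mlp_tanh": MLPClassifier(activation="tanh", max_iter=500, random_state=0),
    "mlp_logistic": MLPClassifier(activation="logistic", max_iter=500, random_state=0),
}

results = {}
for name, model in models.items():
    # Scale features first; both SVMs and MLPs are sensitive to feature scale.
    clf = make_pipeline(StandardScaler(), model)
    clf.fit(X_train, y_train)
    pred = clf.predict(X_test)
    results[name] = {
        "train_accuracy": accuracy_score(y_train, clf.predict(X_train)),
        "test_accuracy": accuracy_score(y_test, pred),
        "precision": precision_score(y_test, pred, zero_division=0),
        "recall": recall_score(y_test, pred, zero_division=0),
        "f1": f1_score(y_test, pred, zero_division=0),
    }

for name, scores in results.items():
    print(name, {metric: round(value, 3) for metric, value in scores.items()})
```

Reporting precision, recall, and F1 alongside accuracy matters here because, on imbalanced data, a classifier can reach high accuracy while still missing many minority-class examples. Comparing the two SVM rows shows how class weighting shifts the balance between precision and recall on this synthetic data; the numbers will not match those in the paper.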
