Census Income: Exploratory Data Science and Machine Learning

In this project, we compared the performance of different classifiers on a binary classification task. The goal was to predict whether a person makes more or less than $50K per year, given a set of categorical and continuous attributes. The algorithms we used were support vector machines with a linear kernel (SVM) and multilayer perceptrons with different activation functions. Testing accuracy was around 0.86 for all classifiers except the SVM with automated correction for class imbalance, which had significantly lower testing accuracy. Training and testing accuracies were similar for all classifiers, indicating good generalization from training to test data. The biggest challenge was the class imbalance in the data set, which resulted in relatively poor recall and, consequently, low F1 scores. Using sklearn's automated correction for class imbalance improved recall, but at the cost of lower precision.
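The comparison described above can be sketched roughly as follows. This is a minimal illustration, not the code from the notebook: it uses synthetic data from `make_classification` in place of `adult.data`, and it assumes that "automated correction for class imbalance" refers to scikit-learn's `class_weight="balanced"` option. The function name `evaluate_models`, the class ratio, the hidden layer size, and the choice of activations are all hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC


def evaluate_models(random_state=0):
    """Fit several classifiers on imbalanced synthetic data and score them."""
    # Synthetic stand-in for the census data: the 75/25 class split is an
    # assumed ratio chosen only to make the minority class visible.
    X, y = make_classification(
        n_samples=2000,
        n_features=10,
        n_informative=5,
        weights=[0.75, 0.25],
        random_state=random_state,
    )
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=random_state
    )

    # Hypothetical model settings; the report's actual hyperparameters
    # are not given in this excerpt.
    models = {
        "svm_linear": LinearSVC(max_iter=5000, random_state=random_state),
        "svm_linear_balanced": LinearSVC(
            class_weight="balanced", max_iter=5000, random_state=random_state
        ),
        "mlp_relu": MLPClassifier(
            hidden_layer_sizes=(16,), activation="relu",
            max_iter=500, random_state=random_state,
        ),
        "mlp_tanh": MLPClassifier(
            hidden_layer_sizes=(16,), activation="tanh",
            max_iter=500, random_state=random_state,
        ),
    }

    results = {}
    for name, model in models.items():
        # Scale features first; both linear SVMs and MLPs are scale-sensitive.
        clf = make_pipeline(StandardScaler(), model)
        clf.fit(X_train, y_train)
        y_pred = clf.predict(X_test)
        results[name] = {
            "train_accuracy": accuracy_score(y_train, clf.predict(X_train)),
            "test_accuracy": accuracy_score(y_test, y_pred),
            "precision": precision_score(y_test, y_pred, zero_division=0),
            "recall": recall_score(y_test, y_pred, zero_division=0),
            "f1": f1_score(y_test, y_pred, zero_division=0),
        }
    return results


if __name__ == "__main__":
    for name, metrics in evaluate_models().items():
        print(name, {k: round(v, 3) for k, v in metrics.items()})
```

Reporting precision and recall alongside accuracy matters here because, on an imbalanced data set, a classifier can reach high accuracy while still missing many minority-class examples. Comparing the two SVM rows is one way to look for the recall and precision trade-off mentioned above, though the numbers on synthetic data will not match those in the report.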

The above is an excerpt from the full paper; please see "COGS_118B_final_report.pdf" for the complete text.
