
Clustering of undergraduate majors according to economic outcomes


pscharfe/Clustering-College-Majors-in-Python


Clustering College Majors By Economic Outcomes

  • Dataset from the data journalism website FiveThirtyEight, listing college majors with their associated demographic data and economic outcomes, chiefly income and employment
  • Data includes only recent BA/BS graduates to get an up-to-date view of current economic outcomes
  • Begins by constructing new variables, e.g., share of recent grads working part-time
  • Exploratory data analysis with correlations, visualizations, and multiple linear regressions
  • Three algorithms (K-Means, KNN, and Hierarchical Clustering) were used to build clusters of majors
  • Loops over candidate cluster counts, scored with different validation methods, identified the optimal number of clusters
  • Principal Component Analysis was applied to visualize the two successful models: K-Means and Hierarchical Clustering
  • [Figure: PCA plot visualizing the K-Means clusters]
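
As a rough illustration of the first steps in the list above, the sketch below builds a part-time share variable and loops over candidate cluster counts for K-Means. It is not the repository's code: the data are synthetic, the column names (`total`, `part_time`, `median_income`, `unemployment_rate`) are hypothetical, and the silhouette score is only one assumed example of a validation method, since the README does not say which ones were used.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the majors table. All values and column names
# are invented for illustration; they are not the FiveThirtyEight data.
rng = np.random.default_rng(0)
group_centers = [
    # (median income, unemployment rate, part-time share)
    (35000, 0.09, 0.30),
    (50000, 0.06, 0.20),
    (70000, 0.04, 0.10),
]
rows = []
for median, unemp, pt_share in group_centers:
    for _ in range(20):
        total = int(rng.integers(1000, 50000))
        rows.append({
            "total": total,
            "part_time": int(total * rng.normal(pt_share, 0.01)),
            "median_income": rng.normal(median, 1500),
            "unemployment_rate": rng.normal(unemp, 0.005),
        })
majors = pd.DataFrame(rows)

# Constructed variable: share of recent grads working part-time.
majors["part_time_share"] = majors["part_time"] / majors["total"]

# Standardize so that income (in dollars) does not dominate the distances.
features = ["median_income", "unemployment_rate", "part_time_share"]
X = StandardScaler().fit_transform(majors[features])

# Loop over candidate cluster counts and score each K-Means fit.
# The silhouette score is an assumed choice of validation method.
scores = {}
for k in range(2, 8):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = silhouette_score(X, labels)

best_k = max(scores, key=scores.get)
print("silhouette-optimal k:", best_k)
```

Because the synthetic data are generated from three well-separated groups, the loop should recover three clusters. On real data the scores are usually much closer together, which is one reason to compare more than one validation method.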

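
The hierarchical clustering and PCA steps mentioned in the list could look roughly like the following. Again this is a sketch under assumptions rather than the author's implementation: the feature matrix is synthetic, and Ward linkage and three clusters are picked purely for illustration.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in: three invented groups of 15 "majors", four features each.
rng = np.random.default_rng(1)
centers = np.array([[0, 0, 0, 0], [5, 5, 0, 0], [0, 5, 5, 5]], dtype=float)
X_raw = np.vstack([c + rng.normal(scale=0.5, size=(15, 4)) for c in centers])
X = StandardScaler().fit_transform(X_raw)

# Agglomerative (hierarchical) clustering; Ward linkage is an assumed choice.
labels = AgglomerativeClustering(n_clusters=3, linkage="ward").fit_predict(X)

# Project the standardized features onto the first two principal components
# so that the clusters can be drawn on a 2-D scatter plot.
pca = PCA(n_components=2)
coords = pca.fit_transform(X)

print("projected shape:", coords.shape)
print("variance explained by 2 components:", pca.explained_variance_ratio_.sum())
```

The resulting `coords` and `labels` can be passed to a scatter plot, for example matplotlib's `plt.scatter(coords[:, 0], coords[:, 1], c=labels)`, to color each major by its cluster.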