Use anomaly detection techniques to create a holistic early detection tool that indicates which high school students are most at risk to drop out of college. An individual’s high school experience is not confined to the academic components. As such, an effective model should incorporate both environmental and educational factors, including various descriptive data on the student’s home area, the school’s area, and the school’s overall structure and performance. This project combined this information with data on students throughout their secondary educational careers (i.e., from ninth through twelfth grade) in an attempt to develop a model that could detect during high school which students have a higher probability of dropping out of college.
The clustering-based and classification-based anomaly detection algorithms detail the situational and numeric circumstances, respectively, that most frequently result in a student dropping out of college. High school administrators could implement these models at the culmination of each school year to identify which students are most at risk for dropping out in college. Then, administrators could provide additional support to those students during the following school year to decrease that risk. College administrators could also follow this same process to minimize dropout rates.