This project aims to conduct a comprehensive analysis to identify the key factors that significantly impact student grades and develop a model to predict student performance. By utilizing statistical modeling techniques, correlation analysis, and feature importance determination, we seek to gain valuable insights that can enhance interventions and support systems, ultimately improving academic performance.
Understanding the factors that influence student grades is essential for educational institutions and stakeholders to implement effective strategies that foster academic success. By analyzing a wide range of potential variables, we can uncover the most influential factors affecting student performance. Additionally, developing a predictive model can provide insights into future performance trends and enable targeted interventions.
- Identify the key factors that significantly impact student grades.
- Utilize statistical modeling techniques to analyze and understand the relationships between various variables and academic performance.
- Conduct correlation analysis to determine the strength and direction of associations between factors and student grades.
- Determine feature importance to prioritize the most influential variables.
- Develop a predictive model to forecast student grades based on the identified factors.
- Provide actionable insights that can inform the development of tailored interventions and support systems to improve academic performance.
The data for this project focuses on student achievement in secondary education in two Portuguese schools. The dataset includes various attributes such as student grades, demographic information, social factors, and school-related features. The data for this project was downloaded from the UCI Machine Learning Repository.
The following table provides a description of the attributes present in both the student-mat.csv(Math course)
and student-por.csv (Portuguese language course)
datasets:
Attribute | Attribute Description | Attribute Type | Attribute Values |
---|---|---|---|
school | Student's school | Binary | 'GP' - Gabriel Pereira 'MS' - Mousinho da Silveira |
sex | Student's sex | Binary | 'F' - female 'M' - male |
age | Student's age | Numeric | From 15 to 22 |
address | Student's home address type | Binary | 'U' - urban 'R' - rural |
famsize | Family size | Binary | 'LE3' - less or equal to 3 'GT3' - greater than 3 |
Pstatus | Parent's cohabitation status | Binary | 'T' - living together 'A' - apart |
Medu | Mother's education | Numeric | 0 - none 1 - primary education (4th grade) 2 - 5th to 9th grade 3 - secondary education 4 - higher education |
Fedu | Father's education | Numeric | 0 - none 1 - primary education (4th grade) 2 - 5th to 9th grade 3 - secondary education 4 - higher education |
Mjob | Mother's job | Nominal | 'teacher' 'health' care related 'services' (e.g. administrative or police) 'at_home' 'other' |
Fjob | Father's job | Nominal | 'teacher' 'health' care related 'services' (e.g. administrative or police) 'at_home' 'other' |
reason | Reason to choose this school | Nominal | close to 'home' school 'reputation' 'course' preference 'other' |
guardian | Student's guardian | Nominal | 'mother' 'father' 'other' |
traveltime | Home to school travel time | Numeric | 1 - <15 min. 2 - 15 to 30 min. 3 - 30 min. to 1 hour 4 - >1 hour |
studytime | Weekly study time | Numeric | 1 - <2 hours 2 - 2 to 5 hours 3 - 5 to 10 hours 4 - >10 hours |
failures | Number of past class failures | Numeric | n if 1<=n<3, else 4 |
schoolsup | Extra educational support | Binary | 'yes' or 'no' |
famsup | Family educational support | Binary | 'yes' or 'no' |
paid | Extra paid classes within the course subject | Binary | 'yes' or 'no' |
activities | Extra-curricular activities | Binary | 'yes' or 'no' |
nursery | Attended nursery school | Binary | 'yes' or 'no' |
higher | Wants to take higher education | Binary | 'yes' or 'no' |
internet | Internet access at home | Binary | 'yes' or 'no' |
romantic | With a romantic relationship | Binary | 'yes' or 'no' |
famrel | Quality of family relationships | Numeric | From 1 - very bad to 5 - excellent |
freetime | Free time after school | Numeric | From 1 - very low to 5 - very high |
goout | Going out with friends | Numeric | From 1 - very low to 5 - very high |
Dalc | Workday alcohol consumption | Numeric | From 1 - very low to 5 - very high |
Walc | Weekend alcohol consumption | Numeric | From 1 - very low to 5 - very high |
health | Current health status | Numeric | From 1 - very bad to 5 - very good |
absences | Number of school absences | Numeric | From 0 to 93 |
G1 | First period grade (Math or Portuguese) | Numeric | From 0 to 20 |
G2 | Second period grade (Math or Portuguese) | Numeric | From 0 to 20 |
G3 | Final grade (Math or Portuguese) | Numeric | From 0 to 20 |
This table provides a summary of the attributes in the dataset, their descriptions, types, and possible values.