Analysis of the Hitters Dataset, in which baseball players' statistics are used to predict their salary.
- Determined the most important features in predicting baseball players' salaries using:
  - Linear Regression
  - Best Subsets
  - Step-wise approaches (forward and backward)
  - Lasso
  - Elastic Net
  - Adaptive Lasso
- Fit and visualized regularization paths for:
  - Lasso
  - Elastic Net at α = 0.33, 0.66
  - Adaptive Lasso
- Determined the average prediction mean squared error (MSE) for:
  - Least Squares
  - Ridge Regression
  - Best Subsets
  - Step-wise approaches (forward and backward)
  - Lasso
  - Elastic Net
  - Adaptive Lasso
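
As a rough illustration of the regularization-path and feature-selection steps listed above, the following is a minimal sketch and not the code from this repository. It assumes the Hitters data comes from the `ISLR` package and that the penalized models are fit with `glmnet`; the seed, the ridge-based adaptive weights, and the choice of `lambda.1se` are all assumptions made for the example.

```r
# Assumption: the ISLR package (Hitters data) and glmnet are installed.
library(ISLR)
library(glmnet)

set.seed(1)  # arbitrary seed so the cross-validation folds are reproducible

# Drop players with a missing Salary, then build a numeric design matrix.
hitters <- na.omit(Hitters)
x <- model.matrix(Salary ~ ., data = hitters)[, -1]  # drop the intercept column
y <- hitters$Salary

# Regularization paths: lasso (alpha = 1) and elastic net (alpha = 0.33).
lasso_fit <- glmnet(x, y, alpha = 1)
enet_fit  <- glmnet(x, y, alpha = 0.33)

# Adaptive lasso (one common recipe, assumed here): penalize each coefficient
# by the inverse of its absolute ridge estimate.
ridge_cv   <- cv.glmnet(x, y, alpha = 0)
ridge_coef <- as.matrix(coef(ridge_cv, s = "lambda.min"))[-1, 1]
adaptive_fit <- glmnet(x, y, alpha = 1, penalty.factor = 1 / abs(ridge_coef))

# Visualize how the coefficients shrink as the penalty changes.
plot(lasso_fit, xvar = "lambda")
plot(enet_fit, xvar = "lambda")
plot(adaptive_fit, xvar = "lambda")

# Features kept by the lasso at a cross-validated penalty.
lasso_cv   <- cv.glmnet(x, y, alpha = 1)
lasso_coef <- as.matrix(coef(lasso_cv, s = "lambda.1se"))[, 1]
selected   <- setdiff(names(lasso_coef)[lasso_coef != 0], "(Intercept)")
print(selected)
```

Changing `alpha` to 0.66 gives the second elastic net setting mentioned above; the selected features can vary with the seed because the cross-validation folds are random.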
To replicate the results described above, open the relevant script and run it.
hittersfeatureselection.R performs feature selection on the Hitters dataset.
mse_analysis.R determines the average prediction MSE for each model on the Hitters dataset.
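
One way an average prediction MSE could be computed is by repeating a random train/test split and averaging the test error, sketched below for least squares, ridge, and lasso. This is an illustrative assumption rather than the procedure in `mse_analysis.R`: the number of splits (`n_splits`), the training fraction (`train_frac`), the seed, and the use of `ISLR` and `glmnet` are all hypothetical choices.

```r
# Assumption: the ISLR package (Hitters data) and glmnet are installed.
library(ISLR)
library(glmnet)

set.seed(1)  # arbitrary seed for reproducible splits

hitters <- na.omit(Hitters)
x <- model.matrix(Salary ~ ., data = hitters)[, -1]
y <- hitters$Salary

n_splits   <- 20   # hypothetical number of random splits
train_frac <- 0.7  # hypothetical share of rows used for training

mse <- function(actual, predicted) mean((actual - predicted)^2)

results <- matrix(NA_real_, nrow = n_splits, ncol = 3,
                  dimnames = list(NULL, c("least_squares", "ridge", "lasso")))

for (i in seq_len(n_splits)) {
  train <- sample(nrow(x), size = floor(train_frac * nrow(x)))

  # Least squares on the training rows, evaluated on the held-out rows.
  ols <- lm(Salary ~ ., data = hitters[train, ])
  results[i, "least_squares"] <- mse(y[-train],
                                     predict(ols, newdata = hitters[-train, ]))

  # Ridge (alpha = 0) and lasso (alpha = 1), with lambda chosen by
  # cross-validation on the training rows only.
  ridge <- cv.glmnet(x[train, ], y[train], alpha = 0)
  lasso <- cv.glmnet(x[train, ], y[train], alpha = 1)
  results[i, "ridge"] <- mse(y[-train],
                             predict(ridge, newx = x[-train, ], s = "lambda.min"))
  results[i, "lasso"] <- mse(y[-train],
                             predict(lasso, newx = x[-train, ], s = "lambda.min"))
}

# Average test MSE per model across all splits.
print(colMeans(results))
```

Averaging over several splits reduces the dependence of the comparison on any single partition of the data, which matters for a dataset of this size.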