by Ellen Hsu, Samita Limbu, and Lindsey McKenna
For this project, we analyzed the World Development Indicators (WDI), which the World Bank has tracked for more than 50 years to help achieve global development goals. Alongside these, the U.N. developed the Human Development Index (HDI) to focus on human achievement beyond economic development alone. We wanted to discover which WDIs most affect HDI and which WDI values are associated with a higher HDI. We charted individual WDIs and created a form that lets a user enter any number of WDI values. The form then calculates a machine learning model score and a predicted HDI value.
We primarily used a multiple linear regression model because HDI is numerical (as opposed to categorical) and we had a few features (indicators) to consider. Some WDIs weighed more heavily in the model than others because they had strong linear relationships with HDI. For more moderately weighted WDIs, the model allowed us to combine indicators to find the most predictive combination and to calculate HDI from known or estimated values.
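As a rough illustration of the approach described above, the sketch below fits a multiple linear regression on only the indicators a user has entered, then reports a score and a predicted HDI. It is not the project's actual code: the data is synthetic, the column names (`life_expectancy`, `school_years`, `internet_users_pct`) and the helper `predict_hdi` are hypothetical, and it assumes the "model score" is the in-sample R² returned by scikit-learn's `LinearRegression.score`.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Synthetic stand-in data; the column names are hypothetical, not real WDI codes.
rng = np.random.default_rng(0)
n = 200
data = pd.DataFrame({
    "life_expectancy": rng.uniform(50, 85, n),
    "school_years": rng.uniform(2, 14, n),
    "internet_users_pct": rng.uniform(0, 100, n),
})
# Made-up linear relationship plus noise, only so the example has something to fit.
data["hdi"] = (
    0.006 * data["life_expectancy"]
    + 0.02 * data["school_years"]
    + 0.001 * data["internet_users_pct"]
    + rng.normal(0, 0.01, n)
)


def predict_hdi(df, entered, target="hdi"):
    """Fit a linear model on only the indicators present in `entered`.

    `entered` maps indicator column names to user-supplied values.
    Returns (in-sample R^2, predicted target value).
    """
    features = list(entered)
    subset = df[features + [target]].dropna()
    model = LinearRegression().fit(subset[features], subset[target])
    score = model.score(subset[features], subset[target])
    query = pd.DataFrame([entered], columns=features)
    prediction = float(model.predict(query)[0])
    return score, prediction


# The user filled in two of the three available indicators.
score, predicted = predict_hdi(
    data, {"life_expectancy": 72.0, "school_years": 9.5}
)
print(f"R^2 = {score:.3f}, predicted HDI = {predicted:.3f}")
```

Refitting on just the entered columns is one simple way to support "any number" of inputs; a held-out test split would give a more honest score than the in-sample R² used here.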
- World Development Indicators
If you git clone the project and run the notebook Predict_HDI.ipynb, please download the original bulk CSV from Bulk Downloads and extract the file 'WDIData.csv' to the resources folder. To save storage space, we did not upload this file to GitHub because it is ~198MB.
- Human Development Index
From this page, we downloaded the data from Dimensions:
- "Human Development Index (HDI)"
- "Education > Education Index"
- "Gender > Gender Inequality Index (GII)"
- Plus several other secondary indicators
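For readers working with the 'WDIData.csv' file mentioned above, the following is a minimal sketch of reshaping a WDI-style bulk file into one row per country and year with pandas. It assumes the file has `Country Code` and `Indicator Code` columns followed by one column per year; the helper `load_wdi`, the sample rows, and the indicator codes `LIFE.EXP` and `SCHOOL.YRS` are hypothetical placeholders, not real WDI codes.

```python
import io

import pandas as pd


def load_wdi(source, indicator_codes):
    """Read a WDI-style CSV and return one row per (country, year).

    `source` is a file path or file-like object; `indicator_codes` lists
    the indicators to keep. Each kept indicator becomes its own column.
    """
    raw = pd.read_csv(source)
    raw = raw[raw["Indicator Code"].isin(indicator_codes)]
    # Year columns are the ones whose header is purely numeric, e.g. "2000".
    year_cols = [c for c in raw.columns if c.isdigit()]
    long = raw.melt(
        id_vars=["Country Code", "Indicator Code"],
        value_vars=year_cols,
        var_name="Year",
        value_name="Value",
    )
    long["Year"] = long["Year"].astype(int)
    wide = long.pivot_table(
        index=["Country Code", "Year"],
        columns="Indicator Code",
        values="Value",
    )
    return wide.reset_index()


# Tiny made-up sample in the assumed layout, so the sketch runs without the download.
sample = io.StringIO(
    "Country Name,Country Code,Indicator Name,Indicator Code,2000,2001\n"
    "Aland,AAA,Life expectancy,LIFE.EXP,70.0,71.0\n"
    "Aland,AAA,School years,SCHOOL.YRS,8.0,8.5\n"
    "Borduria,BBB,Life expectancy,LIFE.EXP,60.0,61.0\n"
    "Borduria,BBB,School years,SCHOOL.YRS,5.0,5.5\n"
)
demo = load_wdi(sample, ["LIFE.EXP", "SCHOOL.YRS"])
print(demo)
```

With the real download in place, the same helper could be called as `load_wdi("resources/WDIData.csv", [...])` with whichever indicator codes are of interest.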
- Data Processing - Python Pandas
- Machine Learning - scikit-learn
- Visualizations - Matplotlib
- Publication - Flask App hosted on Heroku
2 weeks