The nonprofit foundation Alphabet Soup is looking for a tool to predict whether the applicants it funds will be successful.
The neural network model being built is based on a CSV containing data on organizations that have received funding. Included columns:
- `EIN` and `NAME` - Identification columns
- `APPLICATION_TYPE` - Alphabet Soup application type
- `AFFILIATION` - Affiliated sector of industry
- `CLASSIFICATION` - Government organization classification
- `USE_CASE` - Use case for funding
- `ORGANIZATION` - Organization type
- `STATUS` - Active status
- `INCOME_AMT` - Income classification
- `SPECIAL_CONSIDERATIONS` - Special considerations for application
- `ASK_AMT` - Funding amount requested
- `IS_SUCCESSFUL` - Was the money used effectively
Steps 1 and 2 take place in the `fund_predictor` notebook. For preprocessing the data, the specified steps were followed (a code sketch follows the list):
- Read in the CSV and remove the `EIN` and `NAME` columns
- Determine the number of unique values in each column
- Using cutoff values, bin "rare" categorical values into an "Other" category for the `APPLICATION_TYPE` and `CLASSIFICATION` columns
- Encode categorical variables with `pandas.get_dummies()`
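
The sketch below is a rough illustration of these preprocessing steps, not the notebook's exact code; the file name `charity_data.csv` and the cutoff values of 500 and 1000 are assumptions.

```python
import pandas as pd

# Read in the source data and drop the identification columns
df = pd.read_csv("charity_data.csv")
df = df.drop(columns=["EIN", "NAME"])

# Determine the number of unique values in each column
print(df.nunique())

# Bin rare categorical values into an "Other" category; the cutoffs here
# are illustrative, not necessarily the notebook's actual values
for column, cutoff in [("APPLICATION_TYPE", 500), ("CLASSIFICATION", 1000)]:
    counts = df[column].value_counts()
    rare_values = counts[counts < cutoff].index
    df[column] = df[column].replace(list(rare_values), "Other")

# One-hot encode the categorical variables, then split off the target
encoded_df = pd.get_dummies(df)
X = encoded_df.drop(columns=["IS_SUCCESSFUL"])
y = encoded_df["IS_SUCCESSFUL"]
```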
For building, training, and evaluating the model, the stated steps were followed (a sketch follows the list):
- Create a neural network using `tensorflow.keras`
- Add two hidden layers with the "ReLU" activation function
- Add an output layer with the "sigmoid" activation function
- Compile and train the model, saving the model's weights every 5 epochs
- Evaluate the model's loss and accuracy with test data
- Save the model to an HDF5 file
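
A minimal sketch of this stage, assuming the `X` and `y` produced by the preprocessing sketch above; the layer sizes, batch size, epoch count, and file names are illustrative rather than taken from the notebook.

```python
import math
import os

import tensorflow as tf
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Split and scale the features (X and y come from the preprocessing sketch)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)
scaler = StandardScaler().fit(X_train)
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Two hidden ReLU layers and a sigmoid output for the binary target
model = tf.keras.Sequential([
    tf.keras.Input(shape=(X_train_scaled.shape[1],)),
    tf.keras.layers.Dense(80, activation="relu"),
    tf.keras.layers.Dense(30, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(loss="binary_crossentropy", optimizer="adam",
              metrics=["accuracy"])

# ModelCheckpoint counts save_freq in batches, so "every 5 epochs" is
# converted into a batch count
batch_size = 32
steps_per_epoch = math.ceil(len(X_train_scaled) / batch_size)
os.makedirs("checkpoints", exist_ok=True)
checkpoint = tf.keras.callbacks.ModelCheckpoint(
    filepath="checkpoints/weights.{epoch:02d}.hdf5",
    save_weights_only=True,
    save_freq=5 * steps_per_epoch,
)

model.fit(X_train_scaled, y_train, epochs=100,
          batch_size=batch_size, callbacks=[checkpoint])

# Evaluate on the held-out test data, then save the full model to HDF5
loss, accuracy = model.evaluate(X_test_scaled, y_test, verbose=2)
model.save("AlphabetSoupCharity.h5")
```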
Optimization takes place in a separate notebook called `AlphabetSoupCharity_Optimization`.
The goal of optimization was to reach a target predictive accuracy higher than 75%. This goal was achieved by reintroducing the `NAME` column and reducing the number of nodes in each hidden layer, as sketched below.
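
A hedged sketch of that approach: keep `NAME` as a feature by binning rarely occurring names into "Other", then rebuild the network with fewer nodes per hidden layer. The cutoff of 5 occurrences and the reduced layer sizes are assumptions, not values confirmed by the notebook.

```python
import pandas as pd
import tensorflow as tf

df = pd.read_csv("charity_data.csv").drop(columns=["EIN"])

# Keep NAME as a feature: repeat applicants carry signal, so bin only the
# rarely occurring names into "Other" (the cutoff of 5 is an assumption)
name_counts = df["NAME"].value_counts()
rare_names = name_counts[name_counts < 5].index
df["NAME"] = df["NAME"].replace(list(rare_names), "Other")

# Encode as before, then rebuild the model with fewer nodes per hidden layer
encoded_df = pd.get_dummies(df)
X = encoded_df.drop(columns=["IS_SUCCESSFUL"])
y = encoded_df["IS_SUCCESSFUL"]

model = tf.keras.Sequential([
    tf.keras.Input(shape=(X.shape[1],)),
    tf.keras.layers.Dense(10, activation="relu"),  # reduced hidden layers
    tf.keras.layers.Dense(8, activation="relu"),   # (sizes illustrative)
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(loss="binary_crossentropy", optimizer="adam",
              metrics=["accuracy"])
```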
Through the optimization process, the accuracy of the model increased from around 72% to just under 79%.
The full analysis report is available at the link.