Unit 11—Risky Business

Background

Auto loans, mortgages, student loans, debt consolidation ... these are just a few examples of the credit and loans that people seek online. Peer-to-peer lending services such as LendingClub or Prosper allow investors to lend money to other people without going through a bank. However, investors always want to mitigate risk, so a client has asked you to help them use machine learning techniques to predict credit risk.

In this assignment, you will build and evaluate several machine-learning models to predict credit risk using free data from LendingClub. Credit risk is an inherently imbalanced classification problem (the number of good loans is much larger than the number of at-risk loans), so you will need to employ different techniques for training and evaluating models with imbalanced classes. You will use the imbalanced-learn and Scikit-learn libraries to build and evaluate models using the following two techniques:

Resampling
Ensemble Learning

Files

Resampling Starter Notebook

Ensemble Starter Notebook

Lending Club Loans Data

Instructions

1. Resampling

Use the Resampling Starter Notebook referenced above to answer the following:

Which model had the best balanced accuracy score?

The RandomOverSampler model had the highest balanced accuracy score. Balanced accuracy is the average of the recall obtained on each class, so this model was the most reliable at classifying both good and at-risk loans correctly.

Which model had the best recall score?

The RandomOverSampler model had the best recall score. Recall is the fraction of actual positive samples that the model identifies correctly, so this model missed the fewest of them.

Which model had the best geometric mean score?
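
The three scores asked about above can all be derived from a binary confusion matrix. The sketch below shows the arithmetic in plain Python; the function name `imbalance_metrics` and the counts passed to it are hypothetical and are not results from the notebooks. Here recall is computed for the positive (at-risk) class only.

```python
from math import sqrt


def imbalance_metrics(tn, fp, fn, tp):
    """Compute three imbalance-aware scores from binary confusion-matrix counts."""
    recall = tp / (tp + fn)        # share of actual positives that were caught
    specificity = tn / (tn + fp)   # share of actual negatives that were caught
    return {
        "balanced_accuracy": (recall + specificity) / 2,
        "recall": recall,
        "geometric_mean": sqrt(recall * specificity),
    }


# Hypothetical counts for illustration only.
metrics = imbalance_metrics(tn=900, fp=100, fn=10, tp=40)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because the geometric mean multiplies the two per-class rates, it drops sharply when a model does well on one class and poorly on the other, whereas balanced accuracy averages them.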

2. Ensemble Learning

Use the Ensemble Starter Notebook referenced above to answer the following:

Which model had the best balanced accuracy score?

The model with the highest balanced accuracy score is the one that, averaged across the good-loan and at-risk classes, classifies the largest share of loans correctly.

Which model had the best recall score?

The model with the highest recall score correctly identifies the largest share of the samples that are actually positive.

Which model had the best geometric mean score?

What are the top three features?

Hints and Considerations

For the ensemble learners, use 100 estimators for both models.
