Colab Notebook: Mortgage Case Study
https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3619715
This is a holistic approach to implementing fair outputs at the individual and group level. Some of the methods developed or used include quantitative monotonic measures, residual explanations, benchmark competition, adversarial attacks, disparate error analysis, model-agnostic pre- and post-processing, reasoning codes, counterfactuals, contrastive explanations, and prototypical examples.
FairPut is a lightweight, open framework that describes a preferred process at the end of the machine learning pipeline to enhance model fairness. The aim is to simultaneously enhance model interpretability, robustness, and fairness while maintaining a reasonable level of accuracy. FairPut unifies various recent machine learning constructs in a practical manner. The method is model-agnostic, but this particular development instance uses LightGBM.
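As a flavour of the model-constraint step, the sketch below fits a LightGBM classifier with built-in monotone constraints. It is a minimal illustration rather than the notebook's actual configuration: the synthetic data, column names, constraint directions, and hyperparameters are all assumptions.

```python
import numpy as np
import pandas as pd
import lightgbm as lgb
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 5000

# Illustrative mortgage-style features; names and relationships are assumptions.
X = pd.DataFrame({
    "income": rng.lognormal(10, 0.5, n),
    "debt_to_income": rng.uniform(0, 1, n),
    "loan_amount": rng.lognormal(11, 0.6, n),
})
y = ((X["income"] / X["loan_amount"]) - X["debt_to_income"]
     + rng.normal(0, 0.1, n) > -0.1).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Monotone constraints follow the column order of X:
# +1 forces a non-decreasing effect on the score, -1 non-increasing, 0 unconstrained.
model = lgb.LGBMClassifier(
    monotone_constraints=[1, -1, 0],
    n_estimators=300,
    learning_rate=0.05,
    max_depth=4,
)
model.fit(X_train, y_train)
print("test accuracy:", model.score(X_test, y_test))
```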
1. Model Explainability (Colab)
- Model Respecification
  - Protected Values Prediction
  - Model Constraints
  - Hyperparameter Modelling
  - Interpretable Model
  - Global Explanations
  - Monotonicity Feature Explanations
- Quantitative Validation
  - Level Two Monotonicity
  - Relationship Analysis
  - Partial Dependence (LV1) Monotonicity
  - Feature Interactions
  - Metrics and Cut-off
2. Model Robustness (Colab)
- Residual Deviation
- Residual Explanations
- Benchmark Competition
- Adversarial Attack (see the perturbation sketch after this outline)
3. Regulatory Fairness (Colab)
- Group
  - Disparate Error Analysis
    - Parity Indicators
    - Fair Lending Measures
  - Model Agnostic Processing
    - Reweighing Preprocessing
    - Disparate Impact Preprocessing
    - Calibrate Equalized Odds
  - Feature Decomposition
    - Disparate Error Analysis
- Individual
  - Reasoning
    - Individual Disparity
    - Reasoning Codes
  - Example Base
    - Prototypical
    - Counterfactual
    - Contrastive
    - Reasoning
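The Adversarial Attack step under Model Robustness can be conveyed with a simple perturbation probe: nudge a value of interest for one individual and check whether the predicted outcome flips. The sketch below is a deliberately simple, library-free stand-in, and everything in it, from the synthetic data to the perturbation grid, is an assumption.

```python
import numpy as np
import pandas as pd
import lightgbm as lgb

rng = np.random.default_rng(1)
n = 3000

# Illustrative applicants and model; all names are placeholders.
X = pd.DataFrame({
    "income": rng.lognormal(10, 0.5, n),
    "debt_to_income": rng.uniform(0, 1, n),
    "loan_amount": rng.lognormal(11, 0.6, n),
})
y = ((X["income"] / X["loan_amount"]) - X["debt_to_income"] > -0.1).astype(int)
model = lgb.LGBMClassifier(n_estimators=200).fit(X, y)

def minimal_flip(model, row, feature, grid):
    """Smallest tested perturbation of `feature` that flips the prediction, or None."""
    base = model.predict(row.to_frame().T)[0]
    for delta in sorted(grid, key=abs):
        perturbed = row.copy()
        perturbed[feature] += delta
        if model.predict(perturbed.to_frame().T)[0] != base:
            return delta
    return None  # prediction is stable over the tested grid

grid = np.linspace(-0.5, 0.5, 41)  # perturbation range for debt_to_income
print(minimal_flip(model, X.iloc[0], "debt_to_income", grid))
```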
If you end up using any of the novel techniques, or the framework as a whole, you can cite the following.
BibTeX entry:
@software{fairput,
  title   = {{FairPut}: Fair Machine Learning Framework},
  author  = {Snow, Derek},
  url     = {https://github.com/firmai/fairput/},
  version = {1.15},
  date    = {2020-03-31},
}
Stack: Alibi, AIF360, AIX360, SHAP, PDPbox
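Of that stack, SHAP supplies the global and local explanations. The sketch below shows the idea on an illustrative LightGBM classifier; the data and feature names are placeholders, and because the return type of `shap_values` for binary classifiers differs across SHAP versions, the handling is deliberately defensive.

```python
import numpy as np
import pandas as pd
import lightgbm as lgb
import shap

rng = np.random.default_rng(2)
n = 3000

# Illustrative applicants and model; all names are placeholders.
X = pd.DataFrame({
    "income": rng.lognormal(10, 0.5, n),
    "debt_to_income": rng.uniform(0, 1, n),
    "loan_amount": rng.lognormal(11, 0.6, n),
})
y = ((X["income"] / X["loan_amount"]) - X["debt_to_income"] > -0.1).astype(int)
model = lgb.LGBMClassifier(n_estimators=200).fit(X, y)

explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)
# Some SHAP versions return per-class arrays for binary classifiers;
# keep the positive-class contributions either way.
if isinstance(shap_values, list):
    shap_values = shap_values[1]
elif shap_values.ndim == 3:
    shap_values = shap_values[:, :, 1]

# Global view: mean absolute contribution of each feature across the data.
print(pd.Series(np.abs(shap_values).mean(axis=0), index=X.columns).sort_values(ascending=False))

# Local view: the contributions pushing one applicant's score up or down,
# which double as candidate reason codes.
print(pd.Series(shap_values[0], index=X.columns).sort_values())
```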
- Can the model predict the outcome using just protected values? (Protected Value Prediction)
- Is the model monotonic and are variables randomly selected? (Model Constraints, LV1 & LV2 Monotonicity)
- Is the model explainable? (Model Selection, Feature Interactions)
- Can you explain the predictions globally and locally? (SHAP)
- Does the model perform well? (Metrics)
- What individuals have received the most and least accurate predictions? (Residual Deviation)
- Can you point to the feature responsible for large individual residuals? (Residual Explanations)
- What feature values could potentially be outliers due to their misprediction? (Residual Explanations)
- Do some models perform better at predicting the outcomes for a certain type of individual? (Benchmark Competition)
- Can the model outcome be changed by artificially perturbing certain values of interest? (Adversarial Attack)
- Do certain groups suffer relative to others as measured through group statistics? (Parity Indicators, Fair Lending Measures)
- Can various data and prediction processing techniques improve these group statistics? (Model Agnostic Processing; see the reweighing sketch after this list)
- What features are driving the structural differences between groups controlling for demographic factors? (Feature Decomposition)
- What individuals have received the most unfair prediction or treatment by the model? (Individual Disparity)
- Why did the model decide to predict a specific outcome for a particular individual or sub-group of individuals? (Reasoning Codes)
- What individuals are most similar to those receiving unfair treatment, and were these individuals treated similarly? (Prototypical)
- What individual is the closest related instance to a sample individual but has a different predicted outcome? (Counterfactual; see the nearest-unlike-neighbour sketch after this list)
- What is the minimal feature perturbation necessary to switch an individual's prediction to another category? (Contrastive)
- What is the maximum perturbation possible while the model prediction remains the same? (Contrastive)
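For the group-level questions, a disparate-impact check followed by reweighing can be sketched with AIF360 on a toy dataset. The protected attribute, the privileged/unprivileged coding, and the biased base rates below are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from aif360.datasets import BinaryLabelDataset
from aif360.metrics import BinaryLabelDatasetMetric
from aif360.algorithms.preprocessing import Reweighing

rng = np.random.default_rng(3)
n = 4000

# Toy data with a deliberately biased approval rate; `gender` is the assumed
# protected attribute, with 1 coded as the privileged group.
df = pd.DataFrame({
    "gender": rng.integers(0, 2, n).astype(float),
    "income": rng.lognormal(10, 0.5, n),
})
df["approved"] = (rng.random(n) < 0.45 + 0.25 * df["gender"]).astype(float)

privileged, unprivileged = [{"gender": 1.0}], [{"gender": 0.0}]
dataset = BinaryLabelDataset(
    df=df, label_names=["approved"], protected_attribute_names=["gender"],
    favorable_label=1.0, unfavorable_label=0.0,
)

before = BinaryLabelDatasetMetric(
    dataset, unprivileged_groups=unprivileged, privileged_groups=privileged)
print("disparate impact before reweighing:", before.disparate_impact())

# Reweighing assigns instance weights that balance group/label proportions;
# a downstream model would consume dataset_rw.instance_weights as sample weights.
rw = Reweighing(unprivileged_groups=unprivileged, privileged_groups=privileged)
dataset_rw = rw.fit_transform(dataset)

after = BinaryLabelDatasetMetric(
    dataset_rw, unprivileged_groups=unprivileged, privileged_groups=privileged)
print("disparate impact after reweighing:", after.disparate_impact())
```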
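For the counterfactual question, a nearest-unlike-neighbour search conveys the core idea without any specialised library: find the most similar individual whose predicted outcome differs. This is only a rough stand-in for the framework's counterfactual tooling, and the data, model, and distance choice below are assumptions.

```python
import numpy as np
import pandas as pd
import lightgbm as lgb

rng = np.random.default_rng(4)
n = 3000

# Illustrative applicants and model; all names are placeholders.
X = pd.DataFrame({
    "income": rng.lognormal(10, 0.5, n),
    "debt_to_income": rng.uniform(0, 1, n),
    "loan_amount": rng.lognormal(11, 0.6, n),
})
y = ((X["income"] / X["loan_amount"]) - X["debt_to_income"] > -0.1).astype(int)
model = lgb.LGBMClassifier(n_estimators=200).fit(X, y)
preds = model.predict(X)

def nearest_unlike_neighbour(X, preds, idx):
    """Closest individual (in standardised feature space) with a different predicted outcome."""
    Z = (X - X.mean()) / X.std()    # scale so no single column dominates the distance
    mask = preds != preds[idx]      # candidates with the opposite prediction
    distances = np.linalg.norm(Z[mask].values - Z.iloc[idx].values, axis=1)
    return X[mask].iloc[np.argmin(distances)]

print("query individual:\n", X.iloc[0])
print("nearest counterfactual:\n", nearest_unlike_neighbour(X, preds, 0))
```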