This package implements a basic version of cost-sensitive calibration for binary classifiers, based on this Medium article.
pip install git+https://github.com/Tokukawa/cost-sensitive-calibration.git
Suppose you have a binary classifier that must be calibrated against a utility matrix. How do you use this package? Here is a working example:
from cost_sensitive_calibration.calibrate import BinaryCalibration, AcceptReviewReject
import numpy as np
EXAMPLES = 1000
preds = np.random.uniform(size=EXAMPLES)
labels = (preds > np.random.rand(EXAMPLES)) * 1
`preds` contains the model predictions and `labels` the true labels. Now suppose our utility matrix looks like this:
| | Actual positive | Actual negative |
|---|---|---|
| Model positive | 0.0 | -0.1 |
| Model negative | -1.0 | +0.5 |
The utility matrix must be expressed as return per dollar, so -1 means a -100% return, 0.5 means a +50% return, and so on. You can then use ROC-based optimization to find the optimal threshold:
utility_matrix = {'tp': 0., 'fp': -0.1, 'tn': .5, 'fn': -1.}
caliber = BinaryCalibration(utility_matrix)
threshold, max_utility = caliber.calibrate(labels, preds, plot_roc=False)
print("Optimal Threshold:{} \nMax Utility: {}".format(threshold, max_utility))
>Optimal Threshold:0.316255844096
>Max Utility: 0.0975
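To make the idea of a utility-driven threshold concrete, here is a minimal pure-Python sketch. It is not the package's implementation: the helper names `mean_utility` and `best_threshold` are hypothetical, and it assumes the reported utility is the average utility per example, with a positive prediction whenever the score is at or above the threshold.

```python
def mean_utility(labels, preds, threshold, utility):
    """Average utility per example when scores >= threshold are predicted positive."""
    total = 0.0
    for y, p in zip(labels, preds):
        predicted_positive = p >= threshold
        if predicted_positive and y == 1:
            total += utility["tp"]
        elif predicted_positive and y == 0:
            total += utility["fp"]
        elif y == 1:
            total += utility["fn"]
        else:
            total += utility["tn"]
    return total / len(labels)


def best_threshold(labels, preds, utility):
    """Try every distinct score (plus 'predict nothing positive') as a threshold."""
    candidates = sorted(set(preds)) + [float("inf")]
    return max(
        ((t, mean_utility(labels, preds, t, utility)) for t in candidates),
        key=lambda pair: pair[1],
    )


# Toy data, made up for illustration only.
toy_labels = [0, 0, 1, 0, 1, 1]
toy_preds = [0.05, 0.20, 0.35, 0.40, 0.70, 0.90]
toy_utility = {"tp": 0.0, "fp": -0.1, "tn": 0.5, "fn": -1.0}

threshold, utility_value = best_threshold(toy_labels, toy_preds, toy_utility)
print(threshold, round(utility_value, 4))
```

On this toy data the search settles on the lowest score belonging to a true positive, because a missed positive (-1.0) costs far more than a false alarm (-0.1).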
If you want to use an Accept-Review-Reject approach, you can use the class `AcceptReviewReject`. Example:
from cost_sensitive_calibration.calibrate import AcceptReviewReject
two_thresholds_caliber = AcceptReviewReject(utility_matrix={'PR': 0., 'NR': 0., 'PM': -0.1, 'NM': 0.025 , 'PA': -1, 'NA': 0.025}, steps=1000)
lower_threshold, higher_threshold, utility_per_dollar = two_thresholds_caliber.calibrate(labels, preds)
where PR, NR, PM, NM, PA and NA mean:
PR -> Positive Rejected
NR -> Negative Rejected
PM -> Positive to Manual Review
NM -> Negative to Manual Review
PA -> Positive Accepted
NA -> Negative Accepted
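The two-threshold idea can be sketched as an exhaustive grid search. This is only an illustration under stated assumptions, not the package's code: the functions `arr_utility` and `grid_search_two_thresholds` are hypothetical, and the sketch assumes that scores below the lower threshold are accepted, scores at or above the higher threshold are rejected, and everything in between goes to manual review.

```python
def arr_utility(labels, preds, lower, upper, utility):
    """Mean utility for one (lower, upper) pair of thresholds."""
    total = 0.0
    for y, p in zip(labels, preds):
        outcome = "P" if y == 1 else "N"
        if p < lower:
            decision = "A"  # accepted
        elif p >= upper:
            decision = "R"  # rejected
        else:
            decision = "M"  # sent to manual review
        total += utility[outcome + decision]
    return total / len(labels)


def grid_search_two_thresholds(labels, preds, utility, steps=20):
    """Try every pair lower <= upper on an evenly spaced grid over [0, 1]."""
    grid = [i / steps for i in range(steps + 1)]
    best = (0.0, 0.0, float("-inf"))
    for i, lower in enumerate(grid):
        for upper in grid[i:]:
            value = arr_utility(labels, preds, lower, upper, utility)
            if value > best[2]:
                best = (lower, upper, value)
    return best


# Toy data, made up for illustration only.
toy_labels = [0, 0, 0, 1, 0, 1, 1]
toy_preds = [0.03, 0.12, 0.22, 0.47, 0.58, 0.81, 0.93]
arr_matrix = {"PR": 0.0, "NR": 0.0, "PM": -0.1, "NM": 0.025, "PA": -1.0, "NA": 0.025}

lower, upper, value = grid_search_two_thresholds(toy_labels, toy_preds, arr_matrix)
print(lower, upper, value)
```

The search costs on the order of `steps**2` passes over the data, so a finer grid gives more precise thresholds at a quadratic price. Several threshold pairs can tie for the best utility; this sketch keeps the first one it finds.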
Alternatively, you can use a Bayesian approach to choose an action without a threshold:
multiple_options_utility_matrix = np.array([[0., -0.1], [-1., .5]])
binary_bayesian_classifier = BinaryBayesianMinimumRisk(multiple_options_utility_matrix)
my_pred = 0.12345
action = {0: 'Take action 1', 1: 'Take action 2'}
print(action[binary_bayesian_classifier.predict(my_pred)])
>Take action 2
The class `BinaryBayesianMinimumRisk` can be initialized with more than two possible options.
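The decision rule behind a minimum-risk classifier can be sketched in a few lines. This is an assumption-laden illustration rather than the package's implementation: `expected_utilities` and `choose_action` are hypothetical names, and the sketch assumes each row of the matrix is one action while the two columns hold the utility when the example is actually positive and actually negative. Under that reading it reproduces the second action for a prediction of 0.12345, as in the example above.

```python
def expected_utilities(utility_matrix, class_probs):
    """Expected utility of each action (one row per action, one column per class)."""
    return [
        sum(u * p for u, p in zip(row, class_probs))
        for row in utility_matrix
    ]


def choose_action(utility_matrix, p_positive):
    """Return the index of the action with the highest expected utility."""
    class_probs = [p_positive, 1.0 - p_positive]
    scores = expected_utilities(utility_matrix, class_probs)
    return max(range(len(scores)), key=scores.__getitem__)


two_actions = [[0.0, -0.1], [-1.0, 0.5]]
print(choose_action(two_actions, 0.12345))  # index of the second row

# A third, made-up action (for example "send to manual review") with milder outcomes.
three_actions = [[0.0, -0.1], [-1.0, 0.5], [-0.2, 0.2]]
for p in (0.1, 0.5, 0.9):
    print(p, choose_action(three_actions, p))
```

With three rows, the intermediate action wins when the prediction is uncertain, which is why no explicit threshold is needed: the utility matrix alone determines where the decision switches.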