On Adaptive Prediction Sets #9

Answered by aangelopoulos
PaulScemama asked this question in Q&A
You're not missing anything!

The one thing I'd say is that it's not totally clear that the second prediction is "better". The model might be better, but they may be equally calibrated from the perspective of the score function. In both yhat1 and yhat2, you need to take 80% of the probability mass before you contain the true label. In that sense, they're the same.

As another example, consider

yhat1 = [0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1] # label_idx = 8
yhat2 = [0.4, 0.3, 0.1, 0.2/7, 0.2/7, 0.2/7, 0.2/7, 0.2/7, 0.2/7, 0.2/7] # label_idx = 3

In this one, it's less clear which prediction is "better".
In both cases, you need to take qhat = 0.8, i.e., they require you to take …
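To make the point above concrete, here is a small sketch (my own illustration, not code from this thread; the helper name mass_before_label is made up) computing the probability mass you must take, greedily from most to least probable class, before the set contains the true label. Note the standard APS score would additionally include the label's own mass; the strictly-before version is used here to match the phrasing above, and ties are broken by class index.

```python
import numpy as np

def mass_before_label(probs, label_idx):
    """Probability mass accumulated (most probable classes first)
    strictly before the true label is included in the set.
    Ties are broken by class index via a stable sort."""
    probs = np.asarray(probs, dtype=float)
    order = np.argsort(-probs, kind="stable")   # classes from most to least probable
    rank = int(np.nonzero(order == label_idx)[0][0])
    return probs[order][:rank].sum()

yhat1 = [0.1] * 10                              # label_idx = 8
yhat2 = [0.4, 0.3, 0.1] + [0.2 / 7] * 7         # label_idx = 3

print(round(mass_before_label(yhat1, 8), 10))   # 0.8
print(round(mass_before_label(yhat2, 3), 10))   # 0.8
```

Both predictions give the same value, 0.8, which is the sense in which the score function treats them identically even though the distributions look very different.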

Answer selected by PaulScemama