249 clarification of the symmetry argument in cqr and more general documentation about cqr #443

Merged
1 change: 1 addition & 0 deletions HISTORY.rst
@@ -9,6 +9,7 @@ History
* Reduce precision for test in `MapieCalibrator`.
* Fix invalid certificate when downloading data.
* Add citations utility to the documentation.
* Add explanation and example for symmetry argument in CQR.

0.8.3 (2024-03-01)
------------------
47 changes: 30 additions & 17 deletions doc/theoretical_description_regression.rst
@@ -245,30 +245,43 @@ uncertainty is higher than :math:`CV+`, because the models' prediction spread
is then higher.


9. The conformalized quantile regression (CQR) method
9. The Conformalized Quantile Regression (CQR) Method
=====================================================

The conformalized quantile method allows for better interval widths with
heteroscedastic data. It uses quantile regressors with different quantile
values to estimate the prediction bounds and the residuals of these methods are
used to create the guaranteed coverage value.
The conformalized quantile regression (CQR) method allows for better interval widths with
heteroscedastic data. It uses quantile regressors with different quantile values to estimate
the prediction bounds. The residuals of these methods are used to create the guaranteed
coverage value.

Notations and Definitions
-------------------------
- :math:`\mathcal{I}_1` is the set of indices of the data in the training set.
- :math:`\mathcal{I}_2` is the set of indices of the data in the calibration set.
- :math:`\hat{q}_{\alpha_{\text{lo}}}`: Lower quantile model trained on :math:`\{(X_i, Y_i) : i \in \mathcal{I}_1\}`.
- :math:`\hat{q}_{\alpha_{\text{hi}}}`: Upper quantile model trained on :math:`\{(X_i, Y_i) : i \in \mathcal{I}_1\}`.
- :math:`E_i`: Residual of the :math:`i`-th sample in the calibration set.
- :math:`E_{\text{low}}`: Residuals from the lower quantile model.
- :math:`E_{\text{high}}`: Residuals from the upper quantile model.
- :math:`Q_{1-\alpha}(E, \mathcal{I}_2)`: The :math:`(1-\alpha)(1+1/|\mathcal{I}_2|)`-th empirical quantile of the set :math:`\{E_i : i \in \mathcal{I}_2\}`.
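
The quantile :math:`Q_{1-\alpha}(E, \mathcal{I}_2)` defined above can be sketched in a few lines of NumPy. This is a minimal illustration, not MAPIE's implementation. The helper name ``conformal_quantile`` is hypothetical, and the sketch assumes that the :math:`(1-\alpha)(1+1/n)`-th empirical quantile of :math:`n` residuals is the :math:`\lceil (1-\alpha)(n+1) \rceil`-th smallest residual, with an infinite value when that rank exceeds :math:`n`.

```python
import math

import numpy as np


def conformal_quantile(residuals, alpha):
    """Return the ceil((1 - alpha) * (n + 1))-th smallest residual.

    Hypothetical helper for illustration only (not part of MAPIE).
    """
    sorted_residuals = np.sort(np.asarray(residuals, dtype=float))
    n = sorted_residuals.size
    rank = math.ceil((1 - alpha) * (n + 1))
    if rank > n:
        # Too few calibration points for this alpha: no finite correction.
        return float("inf")
    return float(sorted_residuals[rank - 1])


# Nine calibration residuals 1, ..., 9 and alpha = 0.25:
# rank = ceil(0.75 * 10) = 8, so the 8th smallest residual is returned.
print(conformal_quantile(np.arange(1.0, 10.0), alpha=0.25))
```

With only three calibration residuals and ``alpha=0.1``, the rank is 4, which exceeds the sample size, so the helper returns infinity.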

Mathematical Formulation
------------------------
The prediction interval :math:`\hat{C}_{n, \alpha}^{\text{CQR}}(X_{n+1})` for a new sample :math:`X_{n+1}` is given by:

.. math::
.. math::

    \hat{C}_{n, \alpha}^{\text{CQR}}(X_{n+1}) =
    [\hat{q}_{\alpha_{\text{lo}}}(X_{n+1}) - Q_{1-\alpha}(E_{\text{low}}, \mathcal{I}_2),
    \hat{q}_{\alpha_{\text{hi}}}(X_{n+1}) + Q_{1-\alpha}(E_{\text{high}}, \mathcal{I}_2)]

\hat{C}_{n, \alpha}^{\rm CQR}(X_{n+1}) =
[\hat{q}_{\alpha_{lo}}(X_{n+1}) - Q_{1-\alpha}(E_{low}, \mathcal{I}_2),
\hat{q}_{\alpha_{hi}}(X_{n+1}) + Q_{1-\alpha}(E_{high}, \mathcal{I}_2)]
Where:

Where :math:`Q_{1-\alpha}(E, \mathcal{I}_2) := (1-\alpha)(1+1/ |\mathcal{I}_2|)`-th
empirical quantile of :math:`{E_i : i \in \mathcal{I}_2}` and :math:`\mathcal{I}_2` is the
residuals of the estimator fitted on the calibration set. Note that in the symmetric method,
:math:`E_{low}` and :math:`E_{high}` are equal.
- :math:`\hat{q}_{\alpha_{\text{lo}}}(X_{n+1})` is the predicted lower quantile for the new sample.
- :math:`\hat{q}_{\alpha_{\text{hi}}}(X_{n+1})` is the predicted upper quantile for the new sample.

As justified by [3], this method offers a theoretical guarantee of the target coverage
level :math:`1-\alpha`.
Note: In the symmetric method, the sets :math:`E_{\text{low}}` and :math:`E_{\text{high}}` are no longer treated separately. The union set :math:`E_{\text{all}} = E_{\text{low}} \cup E_{\text{high}}` is considered directly, and the empirical quantile is then computed on all the absolute (positive) residuals.

Note that only the split method has been implemented and that it will run three separate
regressions when using :class:`mapie.quantile_regression.MapieQuantileRegressor`.
As justified by the literature, this method offers a theoretical guarantee of the target coverage level :math:`1-\alpha`.
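
The formula and the note on symmetry can be illustrated with a small NumPy sketch. It is a reading of the description above under stated assumptions, not MAPIE's implementation. The function names are hypothetical. The residuals are assumed to be :math:`\hat{q}_{\alpha_{\text{lo}}}(X_i) - Y_i` for the lower model and :math:`Y_i - \hat{q}_{\alpha_{\text{hi}}}(X_i)` for the upper model. The symmetric branch simply pools both residual sets, and the quantile level is used as written in the formula.

```python
import math

import numpy as np


def conformal_quantile(residuals, alpha):
    """ceil((1 - alpha) * (n + 1))-th smallest residual (hypothetical helper)."""
    sorted_residuals = np.sort(np.asarray(residuals, dtype=float))
    n = sorted_residuals.size
    rank = math.ceil((1 - alpha) * (n + 1))
    return float("inf") if rank > n else float(sorted_residuals[rank - 1])


def cqr_interval(q_low_new, q_high_new, e_low, e_high, alpha, symmetry):
    """Shift the predicted quantiles by calibration residual quantiles."""
    if symmetry:
        # Symmetric variant: one correction computed on the pooled residuals
        # (assumption: the pooled set is treated as a single sample).
        pooled = np.concatenate([e_low, e_high])
        corr_low = corr_high = conformal_quantile(pooled, alpha)
    else:
        # Asymmetric variant: one correction per bound.
        corr_low = conformal_quantile(e_low, alpha)
        corr_high = conformal_quantile(e_high, alpha)
    return q_low_new - corr_low, q_high_new + corr_high


# Toy calibration residuals: the lower model errs more than the upper one.
e_low = [1.0, 2.0, 3.0, 4.0]
e_high = [0.0, 0.0, 0.0, 0.0]

print(cqr_interval(0.0, 10.0, e_low, e_high, alpha=0.5, symmetry=False))
print(cqr_interval(0.0, 10.0, e_low, e_high, alpha=0.5, symmetry=True))
```

In this toy case the asymmetric interval widens only the lower bound, while the symmetric interval applies the same correction to both bounds.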


10. The ensemble batch prediction intervals (EnbPI) method
114 changes: 114 additions & 0 deletions examples/regression/1-quickstart/plot_cqr_symmetry_difference.py
@@ -0,0 +1,114 @@
"""
====================================
Plotting CQR with symmetric argument
====================================
An example plot of :class:`~mapie.quantile_regression.MapieQuantileRegressor`
illustrating the impact of the symmetry parameter.
"""
import numpy as np
from matplotlib import pyplot as plt
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor

from mapie.metrics import regression_coverage_score
from mapie.quantile_regression import MapieQuantileRegressor

random_state = 2

##############################################################################
# We generate synthetic data.

X, y = make_regression(n_samples=500, n_features=1, noise=20, random_state=59)

# Define alpha level
alpha = 0.2

# Define a Gradient Boosting Regressor for quantile regression
gb_reg = GradientBoostingRegressor(
    loss="quantile", alpha=0.5, random_state=random_state
)

# MAPIE Quantile Regressor
mapie_qr = MapieQuantileRegressor(estimator=gb_reg, alpha=alpha)
mapie_qr.fit(X, y, random_state=random_state)
y_pred_sym, y_pis_sym = mapie_qr.predict(X, symmetry=True)
y_pred_asym, y_pis_asym = mapie_qr.predict(X, symmetry=False)
y_qlow = mapie_qr.estimators_[0].predict(X)
y_qup = mapie_qr.estimators_[1].predict(X)

# Calculate coverage scores
coverage_score_sym = regression_coverage_score(
    y, y_pis_sym[:, 0], y_pis_sym[:, 1]
)
coverage_score_asym = regression_coverage_score(
    y, y_pis_asym[:, 0], y_pis_asym[:, 1]
)

# Sort the values for plotting
order = np.argsort(X[:, 0])
X_sorted = X[order]
y_pred_sym_sorted = y_pred_sym[order]
y_pis_sym_sorted = y_pis_sym[order]
y_pred_asym_sorted = y_pred_asym[order]
y_pis_asym_sorted = y_pis_asym[order]
y_qlow = y_qlow[order]
y_qup = y_qup[order]

##############################################################################
# We will plot the predictions and prediction intervals for both symmetric
# and asymmetric intervals. The solid lines represent the lower and upper
# quantile estimates before conformalization, the dashed lines represent the
# bounds of the conformalized prediction intervals, and the shaded area
# represents the symmetric and asymmetric prediction intervals.

plt.figure(figsize=(14, 7))

# Plot symmetric prediction intervals
plt.subplot(1, 2, 1)
plt.xlabel("x")
plt.ylabel("y")
plt.scatter(X, y, alpha=0.3)
plt.plot(X_sorted, y_qlow, color="C1")
plt.plot(X_sorted, y_qup, color="C1")
plt.plot(X_sorted, y_pis_sym_sorted[:, 0], color="C1", ls="--")
plt.plot(X_sorted, y_pis_sym_sorted[:, 1], color="C1", ls="--")
plt.fill_between(
    X_sorted.ravel(),
    y_pis_sym_sorted[:, 0].ravel(),
    y_pis_sym_sorted[:, 1].ravel(),
    alpha=0.2,
)
plt.title(
    "Symmetric Intervals\n"
    "Target and effective coverages for "
    f"alpha={alpha:.2f}: ({1-alpha:.3f}, {coverage_score_sym:.3f})"
)

# Plot asymmetric prediction intervals
plt.subplot(1, 2, 2)
plt.xlabel("x")
plt.ylabel("y")
plt.scatter(X, y, alpha=0.3)
plt.plot(X_sorted, y_qlow, color="C2")
plt.plot(X_sorted, y_qup, color="C2")
plt.plot(X_sorted, y_pis_asym_sorted[:, 0], color="C2", ls="--")
plt.plot(X_sorted, y_pis_asym_sorted[:, 1], color="C2", ls="--")
plt.fill_between(
    X_sorted.ravel(),
    y_pis_asym_sorted[:, 0].ravel(),
    y_pis_asym_sorted[:, 1].ravel(),
    alpha=0.2,
)
plt.title(
    "Asymmetric Intervals\n"
    "Target and effective coverages for "
    f"alpha={alpha:.2f}: ({1-alpha:.3f}, {coverage_score_asym:.3f})"
)
plt.tight_layout()
plt.show()

##############################################################################
# The symmetric intervals (``symmetry=True``) use a single quantile computed
# on the combined set of residuals for both bounds, while the asymmetric
# intervals (``symmetry=False``) use a separate quantile of the residuals for
# each bound, so the lower and upper corrections can differ and the interval
# can adapt to asymmetric residual distributions. The resulting effective
# coverages illustrate the theoretical guarantee of the target coverage level
# :math:`1 - \alpha`.
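
To compare the two settings numerically rather than only visually, the effective coverage can be reported alongside the mean interval width. The sketch below uses plain NumPy. The helper ``coverage_and_width`` is hypothetical, and it assumes the interval array has shape ``(n_samples, 2, 1)``, with the lower bound at index 0 and the upper bound at index 1 of the second axis, which is consistent with how ``y_pis_sym`` and ``y_pis_asym`` are indexed in the example.

```python
import numpy as np


def coverage_and_width(y_true, y_pis):
    """Empirical coverage and mean width for intervals of shape (n, 2, 1).

    Hypothetical helper for illustration only.
    """
    lower = y_pis[:, 0, 0]
    upper = y_pis[:, 1, 0]
    covered = (y_true >= lower) & (y_true <= upper)
    return float(covered.mean()), float((upper - lower).mean())


# Four toy targets with their intervals; the third target is not covered.
y_true = np.array([0.0, 1.0, 2.0, 3.0])
y_pis = np.array(
    [[[-1.0], [1.0]], [[0.0], [1.5]], [[2.5], [3.0]], [[2.0], [4.0]]]
)
coverage, width = coverage_and_width(y_true, y_pis)
print(f"coverage={coverage:.2f}, mean width={width:.2f}")
```

Applied to both interval arrays of the example, this would show whether either setting buys its coverage with wider intervals.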