Improvements and changes to ECDF plots #2309

sethaxen · 2024-02-05T23:33:58Z

Tell us about it

Here are some proposed improvements to ECDF plots:

Currently the code is quite PIT-focused. For example, the simulated confidence bands are implemented assuming the target distribution is (discrete) uniform, but this is not documented anywhere. We can generalize this method to support other distributions if the user provides both a cdf and an rvs function for the assumed distribution.
It is the only function that takes a keyword fpr. Specifying prob=1-fpr instead would be more consistent with the rest of our API.
We should add the optimized confidence bands from https://doi.org/10.1007/s11222-022-10090-6, which are faster and more stable than the simulated ones.
We should consider removing values2. This is supposedly for ECDF comparison, but it's not as useful as the ECDF comparison plot from the paper, which we should consider making its own plot.
We should allow the user to specify the evaluation points. The theory behind the confidence bands assumes the evaluation points are independent of the sampled values, and the notebook linked below shows that setting the evaluation points based on the sampled values can make the bounds slightly too tight. Also, when plotting many ECDFs in the same plot, one commonly wants them to share evaluation points.
We should allow the user to provide a pre-computed confidence band. Alternatively, the confidence band could be its own plotting function. In cases like PIT and rank plots, where all subplots share the same comparison distribution and evaluation points, one would want to compute the confidence band once and reuse it for all subplots (even with optimization, computing the band is much more expensive than computing the ECDF).
When not plotting a band, we should default to using the sampled points as the evaluation points.
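
To make the first, second, and fifth points concrete, here is a minimal sketch of a simulated simultaneous band for an arbitrary distribution, evaluated at user-chosen points. It is not ArviZ's implementation: the names ecdf_counts and simulated_ecdf_band and their signatures are hypothetical, and it assumes cdf and rvs behave like the methods of a frozen SciPy distribution (rvs accepting size and random_state). It calibrates a pointwise binomial interval probability by simulation so that roughly a fraction prob of simulated ECDFs fall entirely inside the band.

```python
import numpy as np
from scipy import stats


def ecdf_counts(sample, eval_points):
    """Number of sample values <= each evaluation point."""
    return np.searchsorted(np.sort(sample), eval_points, side="right")


def simulated_ecdf_band(eval_points, cdf, rvs, ndraws, prob=0.95,
                        num_trials=500, seed=None):
    """Hypothetical helper: simulated simultaneous band around cdf(eval_points)."""
    rng = np.random.default_rng(seed)
    eval_points = np.asarray(eval_points, dtype=float)
    cdf_at_eval = cdf(eval_points)

    # For each simulated sample, find the pointwise interval probability
    # that is just wide enough to contain its ECDF at every evaluation point.
    pointwise_probs = np.empty(num_trials)
    for i in range(num_trials):
        counts = ecdf_counts(rvs(size=ndraws, random_state=rng), eval_points)
        lower_tail = stats.binom.cdf(counts, ndraws, cdf_at_eval).min()
        upper_tail = stats.binom.sf(counts - 1, ndraws, cdf_at_eval).min()
        pointwise_probs[i] = max(0.0, 1 - 2 * min(lower_tail, upper_tail))

    # Pick the pointwise probability that covers a fraction `prob` of trials.
    prob_pointwise = np.quantile(pointwise_probs, prob)
    lower, upper = stats.binom.interval(prob_pointwise, ndraws, cdf_at_eval)
    return lower / ndraws, upper / ndraws


# Evaluation points are chosen independently of any sample.
dist = stats.norm()
eval_points = np.linspace(-2, 2, 20)
fpr = 0.05
lower, upper = simulated_ecdf_band(
    eval_points, dist.cdf, dist.rvs, ndraws=100, prob=1 - fpr, seed=0
)
```

Because the band depends only on the distribution, the evaluation points, and the number of draws, the returned lower and upper arrays could be computed once and reused across subplots that share those three inputs.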

Thoughts on implementation

This notebook implements the bands and tests them on a few distributions. It also compares different methods of selecting the evaluation points.


sethaxen · 2024-02-06T09:39:11Z

Also, when the evaluation points are different from the sample points, it feels a little weird to use step plots. Step plots give the sense that we know what the function values are between the points, but unlike with the full ECDF, we don't. Line plots aren't much better, but we're more accustomed to the lines in line plots not necessarily implying interpolation. I wonder if we should reserve step plots for the cases where the eval points and sample points are the same (Edit: or at least make stepping configurable).
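
A small numerical sketch of this concern, using only NumPy (the variable names are illustrative, not from the ArviZ code): holding the ECDF value constant between coarse evaluation points, as a step plot would draw it, matches the true ECDF at the evaluation points but can be far off in between.

```python
import numpy as np

rng = np.random.default_rng(0)
sample = np.sort(rng.normal(size=50))


def ecdf(x):
    """Full ECDF of `sample`, evaluated at x."""
    return np.searchsorted(sample, x, side="right") / sample.size


# Coarse evaluation points, chosen independently of the sample.
eval_points = np.linspace(-3, 3, 7)
ecdf_at_eval = ecdf(eval_points)

# What a right-continuous ("post") step plot through the evaluation points
# would display on a fine grid: the value at the last evaluation point passed.
fine = np.linspace(-3, 3, 601)
idx = np.searchsorted(eval_points, fine, side="right") - 1
step_from_eval = ecdf_at_eval[idx]

# Largest difference between the drawn steps and the true ECDF.
max_gap = np.max(np.abs(step_from_eval - ecdf(fine)))
print(f"max gap between steps and true ECDF: {max_gap:.2f}")
```

When the evaluation points coincide with the sample points, the same construction reproduces the ECDF exactly, which is the case where a step plot is unambiguous.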

sethaxen mentioned this issue Feb 10, 2024

Refactor ECDF code #2311

Merged

6 tasks

sethaxen mentioned this issue Feb 23, 2024

Refactor plot_ecdf arguments #2316

Merged

6 tasks

OriolAbril mentioned this issue Jun 21, 2024

Add ecdf arviz-devs/arviz-stats#8

Open

sethaxen mentioned this issue Aug 19, 2024

Add optimized simultaneous ECDF bands #2368

Merged

6 tasks
