Here are some proposed improvements to ECDF plots:
- Currently the code is quite PIT-focused. For example, the simulated confidence bands are implemented assuming the target distribution is (discrete) uniform, but this is not documented anywhere. We can generalize this method to support other distributions if the user provides both a `cdf` and an `rvs` function for the assumed distribution.
- It is the only function that takes a keyword `fpr`. Specifying `prob=1-fpr` instead would be more consistent with the rest of our API.
- We should consider removing `values2`. This is supposedly for ECDF comparison, but it's not as useful as the ECDF comparison plot from the paper, which we should consider making its own plot.
- We should allow the user to specify the evaluation points. The theory behind the confidence bands assumes the evaluation points are independent of the sampled values, and the notebook below shows that setting the evaluation points based on the sampled values can make the bounds slightly too tight. Also, when plotting many ECDFs in the same plot, one would commonly want them to share evaluation points.
- We should allow the user to provide a pre-computed confidence band. Alternatively, the confidence band could be its own plotting function. In cases like PIT and rank plots, where all subplots share the same comparison distribution and evaluation points, one would want to compute the confidence band once and reuse it for all subplots (even with optimization, computing the band is much more expensive than computing the ECDF).
- When not plotting a band, we should default to using the sampled points as the evaluation points.
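To make the first, second, fourth, and fifth points concrete, here is a rough sketch of what a generalized band computation could look like. It is not the current implementation: the names `ecdf_at` and `simulate_ecdf_band` and all their parameters are hypothetical, `cdf` and `rvs` are assumed to follow the SciPy frozen-distribution conventions, and the calibration (pointwise binomial intervals whose level is tuned by simulation) is only one possible way to get a simultaneous band.

```python
import numpy as np
from scipy import stats


def ecdf_at(sample, eval_points):
    """Evaluate the ECDF of `sample` at fixed `eval_points` (hypothetical helper)."""
    sample = np.sort(np.asarray(sample))
    # Fraction of sampled values <= each evaluation point.
    return np.searchsorted(sample, eval_points, side="right") / sample.size


def simulate_ecdf_band(cdf, rvs, ndraws, eval_points, prob=0.95,
                       num_trials=500, seed=None):
    """Simulated simultaneous band for the ECDF of `ndraws` draws (sketch).

    Assumes `cdf(x)` and `rvs(size=..., random_state=...)` behave like the
    methods of a SciPy frozen distribution. `prob` replaces `1 - fpr`.
    """
    rng = np.random.default_rng(seed)
    eval_points = np.asarray(eval_points)
    p = cdf(eval_points)  # true CDF at the user-chosen evaluation points

    # ndraws * ECDF(z) ~ Binomial(ndraws, cdf(z)) at each fixed point z.
    counts = np.stack([
        np.rint(ndraws * ecdf_at(rvs(size=ndraws, random_state=rng), eval_points))
        for _ in range(num_trials)
    ])
    lower_tail = stats.binom.cdf(counts, ndraws, p)     # P(X <= k)
    upper_tail = stats.binom.sf(counts - 1, ndraws, p)  # P(X >= k)
    # Most extreme pointwise tail probability of each simulated ECDF.
    min_tail = np.minimum(lower_tail, upper_tail).min(axis=1)

    # Pointwise level such that roughly `prob` of the simulated ECDFs
    # stay inside the band at every evaluation point.
    gamma = np.quantile(min_tail, 1 - prob)
    lower = stats.binom.ppf(gamma, ndraws, p) / ndraws
    upper = stats.binom.isf(gamma, ndraws, p) / ndraws
    return lower, upper
```

Because the band depends only on `cdf`, `rvs`, `ndraws`, `eval_points`, and `prob`, it could be computed once, e.g. `simulate_ecdf_band(dist.cdf, dist.rvs, ndraws=100, eval_points=pts)` with `dist = stats.norm()`, and then reused for every subplot that shares the same comparison distribution and evaluation points.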
Thoughts on implementation
This notebook implements the bands and tests them on a few distributions. It also compares different methods of selecting the evaluation points.
Also, when the evaluation points are different from the sample points, it feels a little weird to use step plots. Step plots give the sense that we know the function values between the points, but unlike with the full ECDF, we don't. Line plots aren't much better, but we're more accustomed to the lines in line plots not necessarily implying interpolation. I wonder if we should reserve step plots for the cases where the evaluation points and sample points are the same (Edit: or at least make stepping configurable).
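As a minimal sketch of what "configurable stepping" could mean, the hypothetical helper below (`draw_ecdf` and its `step` argument are made up for illustration, not part of any existing API) draws steps by default only when the evaluation points are the sampled values themselves, and falls back to a plain line otherwise. It uses matplotlib's `drawstyle` line property.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np


def draw_ecdf(ax, sample, eval_points=None, step=None):
    """Draw an ECDF on `ax` (hypothetical helper, for illustration only).

    If `eval_points` is None, the sorted sample is used and steps are drawn
    by default; otherwise a plain line is the default. `step` overrides both.
    """
    sample = np.sort(np.asarray(sample))
    if eval_points is None:
        eval_points = sample
        default_step = True   # full ECDF: values between points are known
    else:
        eval_points = np.asarray(eval_points)
        default_step = False  # values between eval points are not known
    if step is None:
        step = default_step

    # ECDF evaluated at the chosen points.
    ecdf = np.searchsorted(sample, eval_points, side="right") / sample.size
    drawstyle = "steps-post" if step else "default"
    (line,) = ax.plot(eval_points, ecdf, drawstyle=drawstyle)
    return line
```

With this, `draw_ecdf(ax, sample)` gives the familiar staircase, `draw_ecdf(ax, sample, eval_points=grid)` gives a line through the evaluated points, and passing `step=True` explicitly restores the steps for anyone who prefers them.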