Additional LOO functionality #2059
Comments
Generally agree. Not so sure about the
I personally think it would be better to document this and have users do the groupby themselves instead of adding a logo function. If we add a function, I expect users to assume that only CV schemes directly compatible with loo or logo are supported, when in fact they can build any combination themselves, and I don't think we can support all possible cases. A draft example that already uses logo with loo is https://nbviewer.org/github/OriolAbril/Exploratory-Analysis-of-Bayesian-Models/blob/multi_obs_ic/content/Section_04/Multiple_likelihoods.ipynb#Predicting-team-level-performance.
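As a rough illustration of the "do the groupby themselves" idea, here is a minimal, dependency-free sketch that sums pointwise log-likelihoods sharing a group label, so each group can then be treated as one "observation" by LOO. The function name `sum_log_lik_by_group` and the list-of-lists layout are assumptions made for this example; they are not part of ArviZ, where the same step would more likely be a groupby on the log-likelihood array.

```python
def sum_log_lik_by_group(log_lik, groups):
    """Sum pointwise log-likelihoods that share a group label.

    log_lik: list of posterior draws, each a list with one log-likelihood
             per observation (hypothetical layout for this sketch).
    groups:  one group label per observation.
    Returns (labels, grouped), where grouped[s][g] is the summed
    log-likelihood of group labels[g] in draw s.
    """
    labels = sorted(set(groups))
    index = {label: i for i, label in enumerate(labels)}
    grouped = []
    for draw in log_lik:
        if len(draw) != len(groups):
            raise ValueError("each draw needs one value per observation")
        row = [0.0] * len(labels)
        for value, label in zip(draw, groups):
            # Log-likelihoods of independent observations add up
            # to the log-likelihood of the whole group.
            row[index[label]] += value
        grouped.append(row)
    return labels, grouped
```

The grouped values would then play the role of the pointwise log-likelihood in a regular LOO computation, which is why any grouping scheme the user can express this way is supported without a dedicated function.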
Good point about
Hi @sethaxen, I'd be willing to implement the
That would be one approach.
Yep, this is the right approach, with a few notes:
Namely, in addition to using summand = np.sign(values) * np.exp(np.log(np.abs(values)) - log_likelihood)
log_abs_summand = np.log(np.abs(values)) - log_likelihood
max_log_abs_summand = log_abs_summand.max()
summand_scaled = np.sign(values) * np.exp(log_abs_summand - max_log_abs_summand)
tail_right = ... # select right tail of summand_scaled using approach of https://github.com/arviz-devs/arviz/blob/6b1b2be60cca804d4b0ec43a3303820bfa8785c6/arviz/stats/stats.py#L974-L980
tail_left = ... # do the same but for `-summand_scaled`
k_hat_right, _ = _gpdfit(tail_right)
k_hat_left, _ = _gpdfit(tail_left)
k_hat = max(k_hat_right, k_hat_left)
It might be better to refactor @avehtari does this look right? FWIW, I would support adding
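To make the scaling step in the snippet above concrete, here is a small standalone sketch in plain Python. It works on the log scale and subtracts the largest log-magnitude before exponentiating, so the result stays finite even when `-log_likelihood` is large. The helper name `scaled_summand` and the handling of zero values are assumptions for this example, and the tail selection and `_gpdfit` steps are deliberately left out. Rescaling by a positive constant does not change the shape parameter of a generalized Pareto fit, which is why the scaled values can stand in for the raw ones when estimating k-hat.

```python
import math


def scaled_summand(values, log_likelihood):
    """Return sign(v) * exp(log|v| - log_lik - max), computed on the log scale."""
    # log|v| - log_lik per draw; a zero value has log-magnitude -inf.
    log_abs = [
        math.log(abs(v)) - ll if v != 0 else -math.inf
        for v, ll in zip(values, log_likelihood)
    ]
    max_log_abs = max(log_abs)
    if max_log_abs == -math.inf:  # every value is zero
        return [0.0 for _ in log_abs]
    # Subtracting the maximum keeps every exponent <= 0, so exp cannot overflow.
    return [
        math.copysign(math.exp(la - max_log_abs), v) if v != 0 else 0.0
        for la, v in zip(log_abs, values)
    ]
```

For example, with `log_likelihood` around -800 the unscaled summand would need `math.exp(800)`, which overflows a double, while the scaled version returns values in [-1, 1].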
Corresponding R functions are in https://github.com/stan-dev/loo/blob/master/R/E_loo.R, see specifically For the diagnostics,
@sethaxen Thanks for the quick reply 👍. I'll keep these points in mind and try to submit a draft PR implementing the
There are several useful functions for model assessment and comparison that could be added here. Some of them are already available in loo or soon will be:
- `loo_expectation(data, values, kwargs...)`: compute the expectation of `values` with respect to the leave-one-out posteriors using PSIS. For example, if `values` are likelihoods, this computes pointwise ELPD, though less efficiently and stably than `loo` does. If `values` are posterior predictions, then the result is the leave-one-out posterior predictive means. Exists in loo as `E_loo` and was previously proposed to be added to ArviZ in Is there a LOO R2? #1931 (comment).
- `logo(data, groups, var_name, kwargs...)`: compute leave-one-group-out cross-validation. `groups` is an array of the same shape as the log-likelihood corresponding to the variable specified by `var_name`, minus sample dimensions. The unique group values define new dimensions, and log-likelihoods that share the same group are summed together. The docstring should warn that this is more likely to produce Pareto shape errors than LOO, and that it will probably fail if there are group-specific parameters.
- `loo_predictive_error(data, method, kwargs...)`: use `loo_expectation` to compute point estimates of LOO posterior-predictive means, and then use the name or function `method` to compute pointwise and mean predictive error across the data points. Coming to loo: Additional loo utilities stan-dev/loo#202.
- `loo_crps(idata, scale=False, kwargs...)`: use `loo_expectation` to compute the continuous ranked probability score (CRPS) or its scaled variant SCRPS, which is another strictly proper scoring rule besides the log score (ELPD) to use for model comparison. Requires 2 posterior predictive draws for each posterior draw, so it is blocked by Supporting multiple posterior predictive draws for each posterior draw #2048. Coming to loo: CRPS and SCRPS stan-dev/loo#203.
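The core of the `loo_expectation` proposal is a self-normalized importance-weighted average. A minimal sketch for a single held-out observation follows, assuming the log importance weights have already been computed and Pareto-smoothed elsewhere; the function name `weighted_expectation` and its arguments are hypothetical, and this is not the loo or ArviZ implementation.

```python
import math


def weighted_expectation(values, log_weights):
    """Self-normalized importance-weighted mean of `values`.

    values:      one value per posterior draw.
    log_weights: log importance weights per draw (assumed to be
                 already smoothed, e.g. by PSIS, outside this sketch).
    """
    if len(values) != len(log_weights):
        raise ValueError("values and log_weights must have the same length")
    # Subtract the maximum before exponentiating for numerical stability;
    # the shift cancels in the ratio below.
    max_lw = max(log_weights)
    weights = [math.exp(lw - max_lw) for lw in log_weights]
    total = sum(weights)
    return sum(w * v for w, v in zip(weights, values)) / total
```

With equal weights this reduces to the ordinary posterior mean, and with posterior predictive draws as `values` it gives the kind of LOO predictive mean that `loo_predictive_error` would build on.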