diff --git a/vignettes/cmdstanr.Rmd b/vignettes/cmdstanr.Rmd
index dcd24d15a..781b9fdf1 100644
--- a/vignettes/cmdstanr.Rmd
+++ b/vignettes/cmdstanr.Rmd
@@ -187,7 +187,7 @@ first argument specifies the variables to summarize and any arguments after
 that are passed on to `posterior::summarise_draws()` to specify which summaries
 to compute, whether to use multiple cores, etc.
 
-```{r summary, eval=FALSE}
+```{r summary}
 fit$summary()
 fit$summary(variables = c("theta", "lp__"), "mean", "sd")
 
@@ -202,24 +202,6 @@ fit$summary(
 )
 ```
 
-```{r, echo=FALSE}
-# NOTE: the hack of using print.data.frame in chunks with echo=FALSE
-# is used because the pillar formatting of posterior draws_summary objects
-# isn't playing nicely with pkgdown::build_articles().
-options(digits = 2)
-print.data.frame(fit$summary())
-
-print.data.frame(fit$summary(variables = c("theta", "lp__"), "mean", "sd"))
-
-print.data.frame(fit$summary("theta", pr_lt_half = ~ mean(. <= 0.5)))
-
-print.data.frame(fit$summary(
-  variables = NULL,
-  posterior::default_summary_measures(),
-  extra_quantiles = ~posterior::quantile2(., probs = c(.0275, .975))
-))
-```
-
 #### CmdStan's stansummary utility
 
 CmdStan itself provides a `stansummary` utility that can be called using the
@@ -351,20 +333,11 @@ the `$sample()` method demonstrated above.
 We can find the (penalized) maximum likelihood estimate (MLE) using
 [`$optimize()`](https://mc-stan.org/cmdstanr/reference/model-method-optimize.html).
 
-```{r optimize, eval=FALSE}
+```{r optimize}
 fit_mle <- mod$optimize(data = data_list, seed = 123)
 fit_mle$summary() # includes lp__ (log prob calculated by Stan program)
 fit_mle$mle("theta")
 ```
-```{r, echo=FALSE}
-# NOTE: the hack of using print.data.frame in chunks with echo=FALSE
-# is used because the pillar formatting of posterior draws_summary objects
-# isn't playing nicely with pkgdown::build_articles().
-options(digits = 2)
-fit_mle <- mod$optimize(data = data_list, seed = 123)
-print.data.frame(fit_mle$summary()) # includes lp__ (log prob calculated by Stan program)
-fit_mle$mle("theta")
-```
 
 Here's a plot comparing the penalized MLE to the posterior distribution of
 `theta`.
@@ -380,18 +353,10 @@ We can run Stan's experimental variational Bayes algorithm (ADVI) using the
 [`$variational()`](https://mc-stan.org/cmdstanr/reference/model-method-variational.html)
 method.
 
-```{r variational, eval=FALSE}
+```{r variational}
 fit_vb <- mod$variational(data = data_list, seed = 123, output_samples = 4000)
 fit_vb$summary("theta")
 ```
-```{r, echo=FALSE}
-# NOTE: the hack of using print.data.frame in chunks with echo=FALSE
-# is used because the pillar formatting of posterior draws_summary objects
-# isn't playing nicely with pkgdown::build_articles().
-options(digits = 2)
-fit_vb <- mod$variational(data = data_list, seed = 123, output_samples = 4000)
-print.data.frame(fit_vb$summary("theta"))
-```
 
 The `$draws()` method can be used to access the approximate posterior draws.
 Let's extract the draws, make the same plot we made after MCMC, and compare the
diff --git a/vignettes/posterior.Rmd b/vignettes/posterior.Rmd
index cb54a14a7..526178346 100644
--- a/vignettes/posterior.Rmd
+++ b/vignettes/posterior.Rmd
@@ -15,62 +15,38 @@ vignette: >
 
 ```{r child="children/_settings-knitr.Rmd"}
 ```
-
-```{r, include=FALSE}
-options(digits=2)
-```
-
 ## Summary statistics
 
-We can easily customise the summary statistics reported by `$summary()` and `$print()`.
+We can easily customize the summary statistics reported by `$summary()` and `$print()`.
 
-```{r eval=FALSE}
+```{r}
 fit <- cmdstanr::cmdstanr_example("schools", method = "sample")
 fit$summary()
 ```
-```{r echo=FALSE}
-fit <- cmdstanr::cmdstanr_example("schools", method = "sample")
-print.data.frame(fit$summary())
-```
 
 By default all variables are summaries with the follow functions:
 ```{r}
 posterior::default_summary_measures()
 ```
 
-To change the variables summarised, we use the variables argument
-```{r eval=FALSE}
+To change the variables summarized, we use the variables argument
+```{r}
 fit$summary(variables = c("mu", "tau"))
 ```
-```{r echo=FALSE}
-print.data.frame(fit$summary(variables = c("mu", "tau")))
-```
 
 We can additionally change which functions are used
-```{r eval=FALSE}
+```{r}
 fit$summary(variables = c("mu", "tau"), mean, sd)
 ```
-```{r echo=FALSE}
-print.data.frame(fit$summary(variables = c("mu", "tau"), mean, sd))
-```
 
-To summarise all variables with non-default functions, it is necessary to set explicitly set the variables argument, either to `NULL` or the full vector of variable names.
-```{r eval=FALSE}
+To summarize all variables with non-default functions, it is necessary to set explicitly set the variables argument, either to `NULL` or the full vector of variable names.
+```{r}
 fit$metadata()$model_params
 fit$summary(variables = NULL, "mean", "median")
 ```
-```{r echo=FALSE}
-fit$metadata()$model_params
-print.data.frame(fit$summary(variables = NULL, "mean", "median"))
-```
 
 Summary functions can be specified by character string, function, or using a formula (or anything else supported by [rlang::as_function]). If these arguments are named, those names will be used in the tibble output. If the summary results are named they will take precedence.
-```{r eval=FALSE}
+```{r}
 my_sd <- function(x) c(My_SD = sd(x))
 fit$summary(
   c("mu", "tau"),
@@ -81,58 +57,31 @@ fit$summary(
   Minimum = function(x) min(x)
 )
 ```
-```{r echo=FALSE}
-my_sd <- function(x) c(My_SD = sd(x))
-print.data.frame(fit$summary(
-  c("mu", "tau"),
-  MEAN = mean,
-  "median",
-  my_sd,
-  ~quantile(.x, probs = c(0.1, 0.9)),
-  Minimum = function(x) min(x)
-))
-```
-
 Arguments to all summary functions can also be specified with `.args`.
-```{r eval=FALSE}
+```{r}
 fit$summary(c("mu", "tau"), quantile, .args = list(probs = c(0.025, .05, .95, .975)))
 ```
-```{r echo=FALSE}
-print.data.frame(fit$summary(c("mu", "tau"), quantile, .args = list(probs = c(0.025, .05, .95, .975))))
-```
 The summary functions are applied to the array of sample values, with dimension `iter_sampling`x`chains`.
-```{r eval=FALSE}
+```{r}
 fit$summary(variables = NULL, dim, colMeans)
 ```
-```{r echo=FALSE}
-print.data.frame(fit$summary(variables = NULL, dim, colMeans))
-```
 For this reason users may have unexpected results if they use `stats::var()` directly, as it will return a covariance matrix. An alternative is the `distributional::variance()` function, which can also be accessed via `posterior::variance()`.
-```{r eval=FALSE}
+```{r}
 fit$summary(c("mu", "tau"), posterior::variance, ~var(as.vector(.x)))
 ```
-```{r echo=FALSE}
-print.data.frame(fit$summary(c("mu", "tau"), posterior::variance, ~var(as.vector(.x))))
-```
-
 Summary functions need not be numeric, but these won't work with `$print()`.
-```{r eval=FALSE}
+```{r}
 strict_pos <- function(x) if (all(x > 0)) "yes" else "no"
 fit$summary(variables = NULL, "Strictly Positive" = strict_pos)
 # fit$print(variables = NULL, "Strictly Positive" = strict_pos)
 ```
-```{r echo=FALSE}
-strict_pos <- function(x) if (all(x > 0)) "yes" else "no"
-print.data.frame(fit$summary(variables = NULL, "Strictly Positive" = strict_pos))
-# fit$print(variables = NULL, "Strictly Positive" = strict_pos)
-```
 For more information, see `posterior::summarise_draws()`, which is called by `$summary()`.