Skip to content

Commit

Permalink
Add ggplot2::autoplot() for tune_* results (issue #36).
Browse files Browse the repository at this point in the history
  • Loading branch information
fabrice-rossi committed Sep 3, 2023
1 parent cba80a2 commit 3380ea8
Show file tree
Hide file tree
Showing 20 changed files with 1,188 additions and 17 deletions.
2 changes: 1 addition & 1 deletion DESCRIPTION
Original file line number Diff line number Diff line change
Expand Up @@ -32,6 +32,7 @@ LazyData: true
Imports:
assertthat,
butcher,
ggplot2,
nnet,
pROC,
Rcpp (>= 1.0.8.3),
Expand All @@ -45,7 +46,6 @@ RoxygenNote: 7.2.3
Suggests:
data.table,
geodist,
ggplot2,
knitr,
rmarkdown,
testthat (>= 3.0.0),
Expand Down
4 changes: 4 additions & 0 deletions NAMESPACE
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,8 @@
S3method(as_covlmc,tune_covlmc)
S3method(as_vlmc,ctx_tree)
S3method(as_vlmc,tune_vlmc)
S3method(autoplot,tune_covlmc)
S3method(autoplot,tune_vlmc)
S3method(coef,constant_model)
S3method(context_number,covlmc)
S3method(context_number,ctx_tree)
Expand Down Expand Up @@ -108,6 +110,8 @@ export(tune_vlmc)
export(vlmc)
import(Rcpp)
importFrom(Rcpp,evalCpp)
importFrom(ggplot2,autoplot)
importFrom(rlang,.data)
importFrom(stats,predict)
importFrom(stats,simulate)
useDynLib(mixvlmc)
6 changes: 3 additions & 3 deletions NEWS.md
Original file line number Diff line number Diff line change
Expand Up @@ -9,9 +9,9 @@
(`vignette("likelihood")`)
* `tune_vlmc()` and `tune_covlmc()` can be used with the different likelihood
function definitions
* results of `tune_vlmc()` and `tune_covlmc()` can be plotted using base R graphics
(issue #36)
* `tune_covlmc()` can trim the best model (and the initial one) is asked to
* results of `tune_vlmc()` and `tune_covlmc()` can be plotted using base R
graphics or ggplot2 (issue #36)
* `tune_covlmc()` can trim the best model (and the initial one) if asked to
* `cutoff()` uses a new `tolerance` parameter to avoid reporting cut off values
that are almost identical due to numerical imprecision
* `trim.covlmc()` implements simple trimming for VGAM based objects (issue #48)
Expand Down
110 changes: 110 additions & 0 deletions R/autoplot_tune.R
Original file line number Diff line number Diff line change
@@ -0,0 +1,110 @@
#' Create a complete ggplot for the results of automatic VLMC complexity
#' selection
#'
#' This function prepares a plot of the results of [tune_vlmc()] using ggplot2.
#' The result can be passed to [print()] to display the result.
#'
#' The graphical representation proposed by this function is complete, while the
#' one produced by [plot.tune_vlmc()] is minimalistic. We use here the faceting
#' capabilities of ggplot2 to combine on a single graphical representation the
#' evolution of multiple characteristics of the VLMC during the pruning process,
#' while [plot.tune_vlmc()] shows only the selection criterion or the log
#' likelihood. Each facet of the resulting plot shows a quantity as a function
#' of the cut off expressed in quantile or native scale.
#'
#' @param object a `tune_vlmc` object
#' @param cutoff the scale used for the cut off criterion (default "quantile")
#' @param ... additional parameters (not used currently)
#' @returns a ggplot object
#' @examples
#' pc <- powerconsumption[powerconsumption$week %in% 10:11, ]
#' dts <- cut(pc$active_power, breaks = c(0, quantile(pc$active_power, probs = c(0.5, 1))))
#' dts_best_model_tune <- tune_vlmc(dts, criterion = "BIC")
#' vlmc_plot <- ggplot2::autoplot(dts_best_model_tune)
#' print(vlmc_plot)
#' ## simple post customisation
#' print(vlmc_plot + ggplot2::geom_point())
#' @export
autoplot.tune_vlmc <- function(object, cutoff = c("quantile", "native"), ...) {
cutoff <- match.arg(cutoff)
if (cutoff == "quantile") {
x_lab <- "Cut off (quantile scale)"
x_var <- "alpha"
} else {
x_lab <- "Cut off (native scale)"
x_var <- "cutoff"
}
vars <- names(object$results)
vars[3] <- "Depth"
vars[4] <- "Context number"
vars[5] <- "Log likelihood"
names(object$results) <- vars
res_long <- stats::reshape(object$results,
direction = "long",
ids = row.names(object$results),
varying = list(vars[-(1:2)]),
times = vars[-(1:2)],
idvar = "id",
timevar = "variable"
)

ggplot2::ggplot(res_long, ggplot2::aes(
x = .data[[x_var]],
y = .data[["Depth"]]
)) +
ggplot2::geom_line() +
ggplot2::facet_wrap(~variable, scales = "free_y") +
ggplot2::ylab("") +
ggplot2::xlab(x_lab)
}

#' Create a complete ggplot for the results of automatic COVLMC complexity
#' selection
#'
#' This function prepares a plot of the results of [tune_covlmc()] using
#' ggplot2. The result can be passed to [print()] to display the result.
#'
#' The graphical representation proposed by this function is complete, while the
#' one produced by [plot.tune_covlmc()] is minimalistic. We use here the
#' faceting capabilities of ggplot2 to combine on a single graphical
#' representation the evolution of multiple characteristics of the VLMC during
#' the pruning process, while [plot.tune_covlmc()] shows only the selection
#' criterion or the log likelihood. Each facet of the resulting plot shows a
#' quantity as a function of the cut off expressed in quantile or native scale.
#'
#' @param object a `tune_civlmc` object
#' @param ... additional parameters (not used currently)
#' @returns a ggplot object
#' @export
#' @examples
#' pc <- powerconsumption[powerconsumption$week %in% 10:12, ]
#' dts <- cut(pc$active_power, breaks = c(0, quantile(pc$active_power, probs = c(0.5, 1))))
#' dts_cov <- data.frame(day_night = (pc$hour >= 7 & pc$hour <= 17))
#' dts_best_model_tune <- tune_covlmc(dts, dts_cov, criterion = "AIC")
#' covlmc_plot <- ggplot2::autoplot(dts_best_model_tune)
#' print(covlmc_plot)
#'
autoplot.tune_covlmc <- function(object, ...) {
vars <- names(object$results)
vars[2] <- "Depth"
vars[3] <- "Context number"
vars[4] <- "Log likelihood"
names(object$results) <- vars
res_long <- stats::reshape(object$results,
direction = "long",
ids = row.names(object$results),
varying = list(vars[-1]),
times = vars[-1],
idvar = "id",
timevar = "variable"
)

ggplot2::ggplot(res_long, ggplot2::aes(
x = .data[["alpha"]],
y = .data[["Depth"]]
)) +
ggplot2::geom_line() +
ggplot2::facet_wrap(~variable, scales = "free_y") +
ggplot2::ylab("") +
ggplot2::xlab("Cut off (quantile scale)")
}
2 changes: 2 additions & 0 deletions R/mixvlmc-package.R
Original file line number Diff line number Diff line change
Expand Up @@ -23,6 +23,8 @@
#' @importFrom Rcpp evalCpp
#' @importFrom stats simulate
#' @importFrom stats predict
#' @importFrom ggplot2 autoplot
#' @importFrom rlang .data
#' @useDynLib mixvlmc
## usethis namespace: end
NULL
Expand Down
11 changes: 7 additions & 4 deletions README.Rmd
Original file line number Diff line number Diff line change
Expand Up @@ -115,9 +115,7 @@ sun_model_tune
The results of the pruning process can be represented graphically:

```{r sunspots_bic}
ggplot(sun_model_tune$results, aes(x = alpha, y = BIC)) +
geom_line() +
geom_point()
print(autoplot(sun_model_tune) + geom_point())
```

The plot shows that simpler models are too simple as the BIC increases when pruning becomes strong enough. The best model remains rather complex (as expected based on the periodicity of the Solar cycle):
Expand Down Expand Up @@ -175,12 +173,17 @@ draw(elec_covlmc, time_sep = " | ", model = "full", p_value = FALSE)

The model appears a bit complex. To get a more adapted model, we use a BIC based model selection as follows:

```{r}
```{r elec_tune_process}
elec_covlmc_tune <- tune_covlmc(elec_dts, elec_cov)
print(autoplot(elec_covlmc_tune))
```

```{r}
best_elec_covlmc <- as_covlmc(elec_covlmc_tune)
draw(best_elec_covlmc, model = "full", time_sep = " | ", p_value = FALSE)
```


As in the VLMC case, the optimal model remains rather simple:

- the *high* context do not use the covariate and is equivalent to the vlmc context;
Expand Down
17 changes: 10 additions & 7 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -166,18 +166,16 @@ We adjust automatically an optimal VLMC as follows:
sun_model_tune <- tune_vlmc(sun_activity)
sun_model_tune
#> VLMC context tree on high, low
#> cutoff: 2.306 (quantile: 0.0317468917204769)
#> cutoff: 2.306 (quantile: 0.03175)
#> Number of contexts: 9
#> Maximum context length: 5
#> Selected by BIC (248.5057)
#> Selected by BIC (248.5057) with likelihood function "truncated" (-98.83247)
```

The results of the pruning process can be represented graphically:

``` r
ggplot(sun_model_tune$results, aes(x = alpha, y = BIC)) +
geom_line() +
geom_point()
print(autoplot(sun_model_tune) + geom_point())
```

<img src="man/figures/README-sunspots_bic-1.png" width="100%" />
Expand Down Expand Up @@ -287,6 +285,12 @@ BIC based model selection as follows:

``` r
elec_covlmc_tune <- tune_covlmc(elec_dts, elec_cov)
print(autoplot(elec_covlmc_tune))
```

<img src="man/figures/README-elec_tune_process-1.png" width="100%" />

``` r
best_elec_covlmc <- as_covlmc(elec_covlmc_tune)
draw(best_elec_covlmc, model = "full", time_sep = " | ", p_value = FALSE)
#> *
Expand Down Expand Up @@ -373,8 +377,7 @@ lh_covlmc <- sapply(covlmc_simul, longuest_high)
The average longest time spent in *high* consecutively is

- for the VLMC: 243.6 minutes with a standard error of 6.7337834;
- for the VLMC with covariate: 286.2 minutes with a standard error of
8.9157448;
- for the VLMC with covariate: 280 minutes with a standard error of 0;
- 410 minutes for the observed time series.

The following figure shows the distributions of the times obtained by
Expand Down
2 changes: 2 additions & 0 deletions _pkgdown.yml
Original file line number Diff line number Diff line change
Expand Up @@ -63,12 +63,14 @@ reference:
- contents:
- tune_vlmc
- plot.tune_vlmc
- autoplot.tune_vlmc
- cutoff
- prune
- subtitle: COVLMC
- contents:
- tune_covlmc
- plot.tune_covlmc
- autoplot.tune_covlmc
- cutoff.covlmc
- prune.covlmc
- title: Simulation
Expand Down
39 changes: 39 additions & 0 deletions man/autoplot.tune_covlmc.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

41 changes: 41 additions & 0 deletions man/autoplot.tune_vlmc.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Binary file added man/figures/README-elec_tune_process-1.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file modified man/figures/README-longest_time_in_high-1.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file modified man/figures/README-longest_time_in_high_ld-1.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file modified man/figures/README-sunspots_bic-1.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading

0 comments on commit 3380ea8

Please sign in to comment.