Three monitoring functions (for use in dashboards) #92

Merged
merged 25 commits into from
May 23, 2022
80521bc
First draft of three monitoring functions
juliasilge May 10, 2022
2839d1d
Need to use %>% now (in `expr()` for `vetiver_write_plumber()`)
juliasilge May 10, 2022
99d4761
Move curl to Suggests
juliasilge May 10, 2022
06c9b78
Update pkgdown with monitoring funcs
juliasilge May 10, 2022
04b6cad
Use slider for computing metrics instead
juliasilge May 10, 2022
737ea6b
Add tests
juliasilge May 10, 2022
570603f
Not using lubridate anymore
juliasilge May 10, 2022
c7982ca
Remove lubridate (not using anymore) and redocument
juliasilge May 10, 2022
ec5c326
Update docs
juliasilge May 11, 2022
ebf9059
Namespace functions to avoid `library()` in tests
juliasilge May 23, 2022
ba26008
Apply suggestions from code review
juliasilge May 23, 2022
b82d7bc
Oops more `library()` to remove
juliasilge May 23, 2022
90a9303
No more `initiate` option
juliasilge May 23, 2022
2d6b1d1
Refactor function with feedback from Davis
juliasilge May 23, 2022
ddb694c
Add more context for overwriting metrics in pin, plus `overwrite` arg…
juliasilge May 23, 2022
2b0463c
Try R CMD check "hard" to see if yardstick is a problem
juliasilge May 23, 2022
d67a743
Don't evaluate vignette if pkgs are not there
juliasilge May 23, 2022
422fa3f
Revert R CMD check change
juliasilge May 23, 2022
977070f
Make `vetiver_pin_metrics()` more like other pinning funcs
juliasilge May 23, 2022
9aed822
Remove `.` from slider args! :open_mouth:
juliasilge May 23, 2022
08f8c4a
Try `R-CMD-check-hard` again now that CRAN is up
juliasilge May 23, 2022
22bbccb
Skip if suggested pkgs not available
juliasilge May 23, 2022
b1daf0e
Use @examplesIf for suggested pkgs
juliasilge May 23, 2022
2e025dc
Revert to regular R CMD check, but also add `-hard` as an additional …
juliasilge May 23, 2022
f1b6bde
Update NEWS
juliasilge May 23, 2022
57 changes: 57 additions & 0 deletions .github/workflows/R-CMD-check-hard.yaml
@@ -0,0 +1,57 @@
# Workflow derived from https://github.com/r-lib/actions/tree/v2/examples
# Need help debugging build failures? Start at https://github.com/r-lib/actions#where-to-find-help
#
# NOTE: This workflow only directly installs "hard" dependencies, i.e. Depends,
# Imports, and LinkingTo dependencies. Notably, Suggests dependencies are never
# installed, with the exception of testthat, knitr, and rmarkdown. The cache is
# never used to avoid accidentally restoring a cache containing a suggested
# dependency.
on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

name: R-CMD-check-hard

jobs:
  R-CMD-check:
    runs-on: ${{ matrix.config.os }}

    name: ${{ matrix.config.os }} (${{ matrix.config.r }})

    strategy:
      fail-fast: false
      matrix:
        config:
          - {os: ubuntu-18.04, r: 'release'}

    env:
      GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }}
      R_KEEP_PKG_SOURCE: yes

    steps:
      - uses: actions/checkout@v2

      - uses: r-lib/actions/setup-pandoc@v2

      - uses: r-lib/actions/setup-r@v2
        with:
          r-version: ${{ matrix.config.r }}
          http-user-agent: ${{ matrix.config.http-user-agent }}
          use-public-rspm: true

      - uses: r-lib/actions/setup-r-dependencies@v2
        with:
          dependencies: '"hard"'
          cache: false
          extra-packages: |
            any::rcmdcheck
            any::testthat
            any::knitr
            any::rmarkdown
          needs: check

      - uses: r-lib/actions/check-r-package@v2
        with:
          upload-snapshots: true
13 changes: 10 additions & 3 deletions DESCRIPTION
@@ -20,14 +20,14 @@ Depends:
Imports:
butcher,
cli,
curl,
fs,
generics,
glue,
hardhat,
httr,
jsonlite,
lifecycle,
magrittr (>= 2.0.3),
pins (>= 1.0.0),
plumber (>= 1.0.0),
purrr,
@@ -42,6 +42,9 @@ Suggests:
callr,
caret,
covr,
curl,
dplyr,
Member Author

This PR adds A LOT of packages here. I put all the packages for monitoring in Suggests because I am up against 20 packages in Imports already and I can sort of make the argument that people may want to deploy a model but not monitor it. I am very open to ideas for reducing this somehow.

Member Author

I'm only using dplyr in examples and tests FWIW 🤷‍♀️

ggplot2,
httpuv,
knitr,
LiblineaR,
@@ -55,13 +58,17 @@ Suggests:
rmarkdown,
rpart,
rsconnect,
slider (>= 0.2.2),
testthat (>= 3.0.0),
tidyselect,
vdiffr,
workflows,
- xgboost
+ xgboost,
+ yardstick
VignetteBuilder:
knitr
Config/Needs/website: tidyverse/tidytemplate
Config/testthat/edition: 3
Encoding: UTF-8
Roxygen: list(markdown = TRUE)
- RoxygenNote: 7.1.2.9000
+ RoxygenNote: 7.2.0
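For context on the Suggests discussion in this file: code that relies on a suggested package generally needs a guard so that it degrades cleanly when the package is absent, which is the situation the `R-CMD-check-hard` workflow exercises. The PR itself uses `rlang::check_installed()` for this. The sketch below swaps in base R's `requireNamespace()` so it runs without extra packages; `with_optional_pkg()` and the fake package name are hypothetical and are not part of vetiver.

```r
# Hypothetical helper (not part of vetiver): run `code` only when the
# optional package `pkg` is installed, otherwise return `fallback`.
with_optional_pkg <- function(pkg, code, fallback = NULL) {
    if (!requireNamespace(pkg, quietly = TRUE)) {
        message("Package '", pkg, "' is not installed; skipping.")
        return(fallback)
    }
    code()
}

# "stats" ships with R, so the function body runs.
available <- with_optional_pkg("stats", function() stats::median(c(1, 2, 3)))

# A package name assumed not to exist, so the fallback is returned.
missing_pkg <- with_optional_pkg(
    "notARealPackage12345",
    function() 1,
    fallback = NA
)
```

Unlike this sketch, `rlang::check_installed()` raises an error (or prompts to install in interactive sessions) rather than returning a fallback, which is arguably the better fit for functions that cannot do anything useful without the package.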
22 changes: 14 additions & 8 deletions NAMESPACE
@@ -66,15 +66,18 @@ export(load_pkgs)
export(map_request_body)
export(new_vetiver_model)
export(vetiver_api)
export(vetiver_compute_metrics)
export(vetiver_create_description)
export(vetiver_create_meta)
export(vetiver_create_ptype)
export(vetiver_deploy_rsconnect)
export(vetiver_endpoint)
export(vetiver_meta)
export(vetiver_model)
export(vetiver_pin_metrics)
export(vetiver_pin_read)
export(vetiver_pin_write)
export(vetiver_plot_metrics)
export(vetiver_pr_docs)
export(vetiver_pr_post)
export(vetiver_pr_predict)
@@ -83,18 +86,21 @@ export(vetiver_ptype)
export(vetiver_type_convert)
export(vetiver_write_docker)
export(vetiver_write_plumber)
import(purrr)
import(rlang)
importFrom(generics,augment)
importFrom(generics,required_pkgs)
importFrom(glue,glue)
importFrom(glue,glue_collapse)
importFrom(rlang,abort)
importFrom(rlang,expr)
importFrom(rlang,expr_deparse)
importFrom(rlang,has_name)
importFrom(rlang,is_interactive)
importFrom(rlang,is_null)
importFrom(rlang,warn)
importFrom(magrittr,"%>%")
importFrom(purrr,compact)
importFrom(purrr,map)
importFrom(purrr,map_chr)
importFrom(purrr,map_lgl)
importFrom(purrr,pluck)
importFrom(purrr,pmap)
importFrom(purrr,safely)
importFrom(purrr,transpose)
importFrom(stats,predict)
importFrom(utils,head)
importFrom(vctrs,vec_slice)
importFrom(vctrs,vec_sort)
2 changes: 2 additions & 0 deletions NEWS.md
@@ -1,5 +1,7 @@
# vetiver (development version)

* Add functions for model monitoring (#92).

# vetiver 0.1.4

* Improve how Dockerfiles are generated.
226 changes: 226 additions & 0 deletions R/monitor.R
@@ -0,0 +1,226 @@
#' Aggregate, store, and plot model metrics over time for monitoring
#'
#' These three functions can be used for model monitoring (such as in a
#' monitoring dashboard):
#' - `vetiver_compute_metrics()` computes metrics (such as accuracy for a
#' classification model or RMSE for a regression model) at a chosen time
#' aggregation `period`
#' - `vetiver_pin_metrics()` updates an existing pin storing model metrics
#' over time
#' - `vetiver_plot_metrics()` creates a plot of metrics over time
#'
#' @inheritParams yardstick::metrics
#' @inheritParams pins::pin_read
#' @inheritParams slider::slide_period
#' @param date_var The column in `data` containing dates or date-times for
#' monitoring, to be aggregated with `period`
#' @param metric_set A [yardstick::metric_set()] function for computing metrics.
#' Defaults to [yardstick::metrics()].
#' @param df_metrics A tidy dataframe of metrics over time, such as created by
#' `vetiver_compute_metrics()`.
#' @param metrics_pin_name Pin name for where the *metrics* are stored (as
#' opposed to where the model object is stored with [vetiver_pin_write()]).
#' @param overwrite If `TRUE` (the default), overwrite any metrics for dates
#' that exist both in the existing pin and new metrics with the _new_ values.
#' If `FALSE`, error when the new metrics contain overlapping dates with the
#' existing pin.
#' @param .index The variable in `df_metrics` containing the aggregated dates
#' or date-times (from `date_var` in `data`). Defaults to `.index`.
#' @param .estimate The variable in `df_metrics` containing the metric estimate.
#' Defaults to `.estimate`.
#' @param .metric The variable in `df_metrics` containing the metric type.
#' Defaults to `.metric`.
#' @param .n The variable in `df_metrics` containing the number of observations
#' used for estimating the metric.
#'
#' @return Both `vetiver_compute_metrics()` and `vetiver_pin_metrics()` return
#' a dataframe of metrics. The `vetiver_plot_metrics()` function returns a
#' `ggplot2` object.
#'
#' @details Sometimes when you monitor a model at a given time aggregation, you
#' may end up with dates in your new metrics (like `new_metrics` in the example)
#' that are the same as dates in your existing aggregated metrics (like
#' `original_metrics` in the example). This can happen if you need to re-run a
#' monitoring report because something failed. With `overwrite = TRUE` (the
#' default), `vetiver_pin_metrics()` will replace such metrics with the new
#' values. With `overwrite = FALSE`, `vetiver_pin_metrics()` will error when
#' there are overlapping dates.
#'
#' For arguments used more than once in your monitoring dashboard,
#' such as `date_var`, consider using
#' [R Markdown parameters](https://bookdown.org/yihui/rmarkdown/parameterized-reports.html)
#' to reduce repetition and/or errors.
#'
#' @examplesIf rlang::is_installed(c("dplyr", "parsnip", "modeldata", "ggplot2"))
#' library(dplyr)
#' library(parsnip)
#' data(Chicago, package = "modeldata")
#' Chicago <- Chicago %>% select(ridership, date, all_of(stations))
#' training_data <- Chicago %>% filter(date < "2009-01-01")
#' testing_data <- Chicago %>% filter(date >= "2009-01-01", date < "2011-01-01")
#' monitoring <- Chicago %>% filter(date >= "2011-01-01", date < "2012-12-31")
#' lm_fit <- linear_reg() %>% fit(ridership ~ ., data = training_data)
#'
#' library(pins)
#' b <- board_temp()
#'
#' ## before starting monitoring, initiate the metrics and pin
#' ## (for example, with the testing data):
#' original_metrics <-
#'     augment(lm_fit, new_data = testing_data) %>%
#'     vetiver_compute_metrics(date, "week", ridership, .pred, every = 4L)
#' pin_write(b, original_metrics, "lm_fit_metrics")
#'
#' ## to continue monitoring with new data, compute metrics and update pin:
#' new_metrics <-
#'     augment(lm_fit, new_data = monitoring) %>%
#'     vetiver_compute_metrics(date, "week", ridership, .pred, every = 4L)
#' vetiver_pin_metrics(b, new_metrics, "lm_fit_metrics")
#'
#' library(ggplot2)
#' vetiver_plot_metrics(new_metrics) +
#'     scale_size(range = c(2, 4))
#'
#' @export
vetiver_compute_metrics <- function(data,
juliasilge marked this conversation as resolved.
                                    date_var,
                                    period,
                                    truth, estimate, ...,
                                    metric_set = yardstick::metrics,
Contributor

Might be an issue if yardstick is Suggests and you have this here

Member Author
juliasilge May 23, 2022

I checked with the R-CMD-check-hard action like in broom and it was fine, so I'm going to risk it (vs. the NOTE).

Member Author

I checked again (with CRAN back up this time) and it still looks OK to have yardstick::metrics() here. 🤞

                                    every = 1L,
                                    origin = NULL,
                                    before = 0L,
                                    after = 0L,
                                    complete = FALSE) {

    rlang::check_installed("slider")
    truth_quo <- enquo(truth)
    estimate_quo <- enquo(estimate)

    # Figure out which column in `data` corresponds to `date_var`
    date_var <- enquo(date_var)
    date_var <- eval_select_one(date_var, data, "date_var")

    index <- data[[date_var]]

    slider::slide_period_dfr(
        .x = data,
        .i = index,
        .period = period,
        .f = compute_metrics,
        date_var = date_var,
        metric_set = metric_set,
        truth_quo = truth_quo,
        estimate_quo = estimate_quo,
        ...,
        .every = every,
        .origin = origin,
        .before = before,
        .after = after,
        .complete = complete
    )

}

#' @rdname vetiver_compute_metrics
#' @export
vetiver_pin_metrics <- function(board,
                                df_metrics,
                                metrics_pin_name,
                                .index = .index,
                                overwrite = TRUE) {
    .index <- enquo(.index)
    .index <- eval_select_one(.index, df_metrics, "date_var")
    new_dates <- unique(df_metrics[[.index]])

    old_metrics <- pins::pin_read(board, metrics_pin_name)
    overlapping_dates <- old_metrics[[.index]] %in% new_dates
    if (overwrite) {
        old_metrics <- vec_slice(old_metrics, !overlapping_dates)
    } else {
        if (any(overlapping_dates))
            abort(c(
                glue("The new metrics overlap with dates \\
                     already stored in {glue::single_quote(metrics_pin_name)}"),
                i = "Check the aggregated dates or use `overwrite = TRUE`"
            ))
    }
    new_metrics <- vctrs::vec_rbind(old_metrics, df_metrics)
    new_metrics <- vec_slice(
        new_metrics,
        vctrs::vec_order(new_metrics[[.index]])
    )

    pins::pin_write(board, new_metrics, basename(metrics_pin_name))
juliasilge marked this conversation as resolved.
    new_metrics

}
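
The overlap handling in `vetiver_pin_metrics()` above can be sketched without a pin board. The following is a minimal base R illustration under stated assumptions: `merge_metrics()` is a hypothetical stand-in (not part of vetiver), it uses `rbind()` and `order()` in place of the vctrs calls, and it assumes the date column is literally named `.index` in both data frames. The example values are made up.

```r
# Hypothetical stand-in for the overwrite logic; not part of vetiver.
merge_metrics <- function(old_metrics, new_metrics, overwrite = TRUE) {
    # Flag rows of the stored metrics whose dates also appear in the new ones
    overlapping <- old_metrics$.index %in% unique(new_metrics$.index)
    if (overwrite) {
        # Drop the stale rows so the new values win
        old_metrics <- old_metrics[!overlapping, , drop = FALSE]
    } else if (any(overlapping)) {
        stop("The new metrics overlap with dates already stored")
    }
    combined <- rbind(old_metrics, new_metrics)
    # Keep the result sorted by date, as the real function does
    combined[order(combined$.index), , drop = FALSE]
}

# Made-up metrics: one date ("2011-01-30") appears in both data frames.
old <- data.frame(
    .index = as.Date(c("2011-01-02", "2011-01-30")),
    .metric = "rmse",
    .estimate = c(1.0, 1.2)
)
new <- data.frame(
    .index = as.Date(c("2011-01-30", "2011-02-27")),
    .metric = "rmse",
    .estimate = c(1.5, 1.1)
)

merged <- merge_metrics(old, new)
```

With `overwrite = TRUE`, the row for the shared date carries the new estimate; calling `merge_metrics(old, new, overwrite = FALSE)` on the same inputs raises an error instead.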

compute_metrics <- function(data,
                            date_var,
                            metric_set,
                            truth_quo,
                            estimate_quo,
                            ...) {
    index <- data[[date_var]]
    index <- min(index)

    n <- nrow(data)

    metrics <- metric_set(
        data = data,
        truth = !!truth_quo,
        estimate = !!estimate_quo,
        ...
    )

    tibble::tibble(
        .index = index,
        .n = n,
        metrics
    )
}
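
To make the shape of the per-period output concrete, here is a rough base R sketch of what `compute_metrics()` produces for each chunk of data. It is an approximation under loud assumptions: it groups by calendar month with `format()` and `split()` rather than using `slider::slide_period_dfr()`, it hard-codes RMSE instead of accepting a yardstick metric set, and `rmse_by_month()` plus the toy data are hypothetical.

```r
# Hypothetical illustration (not part of vetiver): one row per calendar
# month with the earliest date, the row count, and the RMSE.
rmse_by_month <- function(data, date_col, truth_col, estimate_col) {
    # Label each row with its year-month, then split into one piece per month
    period <- format(data[[date_col]], "%Y-%m")
    pieces <- split(data, period)
    rows <- lapply(pieces, function(piece) {
        residuals <- piece[[truth_col]] - piece[[estimate_col]]
        data.frame(
            .index = min(piece[[date_col]]),
            .n = nrow(piece),
            .metric = "rmse",
            .estimate = sqrt(mean(residuals^2))
        )
    })
    out <- do.call(rbind, rows)
    rownames(out) <- NULL
    out
}

# Made-up predictions: two observations in January, one in February.
toy <- data.frame(
    date = as.Date(c("2011-01-05", "2011-01-20", "2011-02-10")),
    ridership = c(1, 3, 5),
    .pred = c(2, 2, 5)
)

monthly <- rmse_by_month(toy, "date", "ridership", ".pred")
```

As in the real helper, `.index` is the minimum date observed in each period and `.n` records how many observations fed the estimate, which is what `vetiver_plot_metrics()` maps to point size.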

eval_select_one <- function(col, data, arg, ..., call = caller_env()) {
    rlang::check_installed("tidyselect")
    check_dots_empty()

    # `col` is a quosure that has its own environment attached
    env <- empty_env()

    loc <- tidyselect::eval_select(
        expr = col,
        data = data,
        env = env,
        error_call = call
    )

    if (length(loc) != 1L) {
        message <- glue::glue("`{arg}` must specify exactly one column from `data`.")
        abort(message, call = call)
    }

    loc
}

#' @rdname vetiver_compute_metrics
#' @export
vetiver_plot_metrics <- function(df_metrics,
                                 .index = .index,
                                 .estimate = .estimate,
                                 .metric = .metric,
Comment on lines +210 to +211
Contributor

Suggested change
-                                 .estimate = .estimate,
-                                 .metric = .metric,
+                                 estimate = .estimate,
+                                 metric = .metric,

Contributor
DavisVaughan May 12, 2022

Just because I think we only ever add `.` to arguments if they have the potential to conflict with named `...`

(I realize it is a little weird here because the column name is .estimate, so if you want to keep this then I wouldn't mind)

                                 .n = .n) {
    rlang::check_installed("ggplot2")
    .metric <- enquo(.metric)

    ggplot2::ggplot(data = df_metrics,
                    ggplot2::aes({{ .index }}, {{.estimate}})) +
        ggplot2::geom_line(ggplot2::aes(color = !!.metric), alpha = 0.7) +
        ggplot2::geom_point(ggplot2::aes(color = !!.metric,
                                         size = {{.n}}),
                            alpha = 0.9) +
        ggplot2::facet_wrap(ggplot2::vars(!!.metric),
                            scales = "free_y", ncol = 1) +
        ggplot2::guides(color = "none") +
        ggplot2::labs(x = NULL, y = NULL, size = NULL)
}