Add support for deploying recipes #179
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,52 @@ | ||
#' @rdname vetiver_create_description | ||
#' @export | ||
vetiver_create_description.recipe <- function(model) { | ||
num_steps <- length(model$steps) | ||
cli::pluralize("A feature engineering recipe with {num_steps} step{?s}") | ||
} | ||
|
||
#' @rdname vetiver_create_meta | ||
#' @export | ||
vetiver_create_meta.recipe <- function(model, metadata) { | ||
reqs <- required_pkgs(model) | ||
reqs <- sort(unique(c(reqs, "recipes"))) | ||
vetiver_meta(metadata, required_pkgs = reqs) | ||
} | ||
|
||
#' @rdname vetiver_create_ptype | ||
#' @export | ||
vetiver_ptype.recipe <- function(model, ...) { | ||
rlang::check_dots_used() | ||
dots <- list(...) | ||
check_ptype_data(dots) | ||
ptype <- vctrs::vec_ptype(dots$prototype_data) | ||
tibble::as_tibble(ptype) | ||
} | ||
|
||
#' @rdname vetiver_create_description | ||
#' @export | ||
vetiver_prepare_model.recipe <- function(model) { | ||
if (!recipes::fully_trained(model)) { | ||
rlang::abort("Your `model` object is not a trained recipe.") | ||
} | ||
ret <- butcher::butcher(model) | ||
ret <- bundle::bundle(ret) | ||
ret | ||
} | ||
|
||
#' @rdname handler_startup | ||
#' @export | ||
handler_startup.recipe <- function(vetiver_model) { | ||
attach_pkgs(vetiver_model$metadata$required_pkgs) | ||
} | ||
|
||
#' @rdname handler_startup | ||
#' @export | ||
handler_predict.recipe <- function(vetiver_model, ...) { | ||
|
||
function(req) { | ||
new_data <- req$body | ||
new_data <- vetiver_type_convert(new_data, vetiver_model$prototype) | ||
recipes::bake(vetiver_model$model, new_data = new_data, ...) | ||
} | ||
} |
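As a rough illustration of what the `handler_predict.recipe` closure above ends up doing for each request, the sketch below bakes a trained recipe on a small data frame. It assumes the same recipe used in this PR's tests; `new_data` here is a hypothetical stand-in for the parsed and type-converted request body, and this is not the handler itself.

```r
library(recipes)

# Train a recipe like the one in the tests; retain = FALSE drops the training data.
trained_rec <-
  recipe(mpg ~ disp + wt, mtcars) %>%
  step_ns(wt) %>%
  prep(retain = FALSE)

# Hypothetical stand-in for a request body: only the predictor columns.
new_data <- mtcars[1:3, c("disp", "wt")]

# The "prediction" for a recipe is the baked (feature-engineered) data.
baked <- bake(trained_rec, new_data = new_data)
baked
```

The result is a tibble of engineered features rather than model predictions, which is why the endpoint returns whatever `recipes::bake()` produces.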
Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,33 @@ | ||
# can print recipe | ||
|
||
Code | ||
v | ||
Output | ||
|
||
-- car-splines - <bundled_recipe> model for deployment | ||
A feature engineering recipe with 1 step using 2 features | ||
|
||
# create plumber.R for recipe | ||
|
||
Code | ||
cat(readr::read_lines(tmp), sep = "\n") | ||
Output | ||
# Generated by the vetiver package; edit with care | ||
|
||
library(pins) | ||
library(plumber) | ||
library(rapidoc) | ||
library(vetiver) | ||
|
||
# Packages needed to generate model predictions | ||
if (FALSE) { | ||
library(recipes) | ||
} | ||
b <- board_folder(path = "<redacted>") | ||
v <- vetiver_pin_read(b, "car-splines") | ||
|
||
#* @plumber | ||
function(pr) { | ||
pr %>% vetiver_api(v) | ||
} | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,70 @@ | ||
skip_if_not_installed("recipes") | ||
skip_if_not_installed("plumber") | ||
|
||
library(plumber) | ||
library(recipes) | ||
|
||
trained_rec <- | ||
recipe(mpg ~ disp + wt, mtcars) %>% | ||
step_ns(wt) %>% | ||
prep(retain = FALSE) | ||
|
||
v <- vetiver_model(trained_rec, "car-splines", prototype_data = mtcars[c("disp", "wt")]) | ||
|
||
test_that("can print recipe", { | ||
expect_snapshot(v) | ||
}) | ||
|
||
I would normally have a test here like this:

test_that("can predict recipe", {
  preds <- predict(v, mtcars)
  expect_equal(<<blah blah blah>>)
})

But I don't think that's possible for recipes. The

For more clarity, this is also separate from calling

@isabelizimm do you mind summing up here what the situation is for unsupervised models from scikit-learn as deployed by vetiver? These models typically have a

Looking at just the clustering algorithms from scikit-learn, most of them have a predict method. You can use these in a Pipeline (similar to workflow), same as other models. Vetiver Python doesn't check whether a model is supervised or unsupervised, only whether it comes from scikit-learn, so it will return the outputs of the predict method as expected. If one of the unsupervised learning models that do NOT have a predict method is used as the last element in a Pipeline, there will be an error along the lines of

FWIW, clustering algorithms with predict: k-means, bisecting k-means, affinity propagation, mean shift, BIRCH, Gaussian mixture. Clustering algorithms that do NOT have predict: spectral clustering, agglomerative clustering, DBSCAN, OPTICS.
||
test_that("can pin a recipe", { | ||
b <- board_temp() | ||
vetiver_pin_write(b, v) | ||
pinned <- pin_read(b, "car-splines") | ||
expect_equal( | ||
pinned, | ||
list( | ||
model = bundle::bundle(butcher::butcher(trained_rec)), | ||
prototype = vctrs::vec_slice(tibble::as_tibble(mtcars[c("disp", "wt")]), 0) | ||
) | ||
) | ||
expect_equal( | ||
pin_meta(b, "car-splines")$user$required_pkgs, | ||
c("recipes") | ||
) | ||
}) | ||
|
||
test_that("default endpoint for recipe", { | ||
p <- pr() %>% vetiver_api(v) | ||
p_routes <- p$routes[-1] | ||
expect_equal(names(p_routes), c("ping", "predict")) | ||
expect_equal(map_chr(p_routes, "verbs"), | ||
c(ping = "GET", predict = "POST")) | ||
}) | ||
|
||
test_that("default OpenAPI spec", { | ||
v$metadata <- list(url = "potatoes") | ||
p <- pr() %>% vetiver_api(v) | ||
car_spec <- p$getApiSpec() | ||
expect_equal(car_spec$info$description, | ||
"A feature engineering recipe with 1 step") | ||
post_spec <- car_spec$paths$`/predict`$post | ||
expect_equal(names(post_spec), c("summary", "requestBody", "responses")) | ||
expect_equal(as.character(post_spec$summary), | ||
"Return predictions from model using 2 features") | ||
get_spec <- car_spec$paths$`/pin-url`$get | ||
expect_equal(as.character(get_spec$summary), | ||
"Get URL of pinned vetiver model") | ||
|
||
}) | ||
|
||
test_that("create plumber.R for recipe", { | ||
skip_on_cran() | ||
b <- board_folder(path = tmp_dir) | ||
vetiver_pin_write(b, v) | ||
tmp <- tempfile() | ||
vetiver_write_plumber(b, "car-splines", file = tmp) | ||
expect_snapshot( | ||
cat(readr::read_lines(tmp), sep = "\n"), | ||
transform = redact_vetiver | ||
) | ||
}) | ||
|
Notice that we are requiring the user to pass in some `prototype_data` (check out the `vetiver_ptype.recipe` method). This is what we have to do for ranger because the info on the training data isn't in there anywhere. If I was understanding Max correctly, this is what he was recommending.

I want to note, though, that the original column names and types are stored in a list, at `trained_rec$var_info`. Would there be a way to reconstruct the needed info (i.e. a `ptype`)?

As it stands right now, there isn't a foolproof way of going from `trained_rec$var_info` to `ptype`s, since there is no guarantee that a 1-1 mapping can be found. This is most clearly seen in the fact that the type will be listed as `other` for any classes we don't currently specify.

I do however wish that this information was in recipes, as it is useful, even if we don't force the input checking. I will make a note of this and see if we can add such information in a future version.

Which is another thing. The variable checking in recipes is done on an optional per-step basis, and can at times be quite loose. Many steps don't care whether the input is double or integer. `step_dummy()`, as a gross outlier, doesn't do any type checking.
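
To make the comparison in this thread concrete, here is a small sketch that puts the two sources of column information side by side. It assumes the recipe from this PR's tests; the variable names `info` and `ptype` are only for illustration.

```r
library(recipes)

trained_rec <-
  recipe(mpg ~ disp + wt, mtcars) %>%
  step_ns(wt) %>%
  prep(retain = FALSE)

# What recipes records about the original columns: name, coarse type, role, source.
info <- trained_rec$var_info
info

# What vetiver stores when the user supplies prototype_data: a zero-row tibble
# that keeps the exact column classes.
ptype <- tibble::as_tibble(vctrs::vec_ptype(mtcars[c("disp", "wt")]))
ptype
```

The `type` column in `var_info` is a coarse label rather than a full class, which is the reason given above for not reconstructing a `ptype` from it.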