Add support for deploying recipes #179

juliasilge · 2023-02-22T22:32:04Z

Closes #177

This PR adds support in vetiver for deploying standalone recipes (not as part of workflows).

library(recipes)
#> Loading required package: dplyr
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
#> 
#> Attaching package: 'recipes'
#> The following object is masked from 'package:stats':
#> 
#>     step
library(embed)

split <- seq.int(1, 150, by = 9)
tr <- iris[-split, ]
te <- iris[split, ]

set.seed(11)
supervised <-
    recipe(Species ~ ., data = tr) %>%
    step_center(all_predictors()) %>%
    step_scale(all_predictors()) %>%
    step_umap(all_predictors(), outcome = vars(Species), num_comp = 2) %>%
    prep(training = tr)

library(vetiver)
v <- vetiver_model(supervised, "iris-umap", prototype_data = te[, -5])

library(plumber)
pr() %>%
    vetiver_api(v) ## next pipe to `pr_run()`
#> # Plumber router with 2 endpoints, 4 filters, and 1 sub-router.
#> # Use `pr_run()` on this object to start the API.
#> ├──[queryString]
#> ├──[body]
#> ├──[cookieParser]
#> ├──[sharedSecret]
#> ├──/logo
#> │  │ # Plumber static router serving from directory: /Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/library/vetiver
#> ├──/ping (GET)
#> └──/predict (POST)

Created on 2023-02-22 with reprex v2.0.2

juliasilge · 2023-02-22T22:34:40Z

tests/testthat/test-recipe.R

+test_that("can print recipe", {
+    expect_snapshot(v)
+})
+


I would normally have a test here like this:

test_that("can predict recipe", {
    preds <- predict(v, mtcars)
    expect_equal(<<blah blah blah>>)
})

But I don't think that's possible for recipes. The predict method for a vetiver model calls bundle::unbundle() and then calls predict() on what is inside, and a recipe has no predict() method (it uses bake() instead). I guess we could add a bake() method for a vetiver model if needed? This is separate from the API, where we can say exactly what to do at the endpoint.

For more clarity, this is also separate from calling predict() on a remote vetiver endpoint, which would also work. What we don't have a way to do right now is read the recipe back into memory from remote storage (a pin) and then call bake() on it, without the user manually getting out the recipe object themselves and unbundling it.
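The manual route described above can be sketched roughly as follows. This is only an illustration, not part of the PR: it assumes the bundled recipe lives in the `model` element of the vetiver model (as the description of the predict method suggests), it uses a temporary pins board in place of real remote storage, and the recipe formula is a guess, since the diff only shows the `step_ns()` and `prep()` lines of the car-splines test recipe.

```r
library(recipes)
library(vetiver)
library(pins)

# Assumed recipe: the formula is a guess; only step_ns() and prep() appear in the diff
trained_rec <-
    recipe(mpg ~ disp + wt, data = mtcars) %>%
    step_ns(wt) %>%
    prep(retain = FALSE)

v <- vetiver_model(trained_rec, "car-splines", prototype_data = mtcars[c("disp", "wt")])

# Store the model on a temporary board, then read it back as a user would
board <- board_temp(versioned = TRUE)
vetiver_pin_write(board, v)
v_remote <- vetiver_pin_read(board, "car-splines")

# Manually unbundle the recipe (assumes it is stored in `$model`), then bake new data
rec <- bundle::unbundle(v_remote$model)
baked <- bake(rec, new_data = mtcars[c("disp", "wt")])
head(baked)
```

A bake() method for vetiver models would presumably wrap the last two steps, so that users would not need to reach into the object themselves.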

@isabelizimm do you mind summing up here what the situation is for unsupervised models from scikit-learn as deployed by vetiver? These models typically have a predict method so this is not a problem in Python, right?

isabelizimm · 2023-03-01

Looking at just the clustering algorithms from scikit-learn, most of them have a predict method. You can use these in a Pipeline (similar to a workflow), the same as other models. Vetiver for Python doesn't check whether a model is supervised or unsupervised, only whether it comes from scikit-learn, so it returns the output of the predict method as expected.

If an unsupervised model that does NOT have a predict method is used as the last element in a Pipeline, there will be an error along the lines of "model has no predict method".

FWIW, clustering algorithms with predict: k-means, bisecting k-means, affinity propagation, mean shift, BIRCH, Gaussian mixture. Clustering algorithms that do NOT have predict: spectral clustering, agglomerative clustering, DBSCAN, OPTICS.

juliasilge · 2023-02-22T22:38:09Z

tests/testthat/test-recipe.R

+    step_ns(wt) %>%
+    prep(retain = FALSE)
+
+v <- vetiver_model(trained_rec, "car-splines", prototype_data = mtcars[c("disp", "wt")])


Notice that we are requiring the user to pass in some prototype_data (check out the vetiver_ptype.recipe method). This is what we have to do for ranger because the info on the training data isn't in there anywhere. If I was understanding Max correctly, this is what he was recommending.

I want to note, though, that the original column names and types are stored in a list, at trained_rec$var_info. Would there be a way to reconstruct the needed info (i.e. a ptype)?

EmilHvitfeldt · 2023-02-23

As it stands right now, there isn't a foolproof way of going from trained_rec$var_info to ptypes, since there is no guarantee that a 1-1 mapping can be found. This is most clearly seen in that the type is listed as other for any class we don't currently handle.

I do wish this information were in recipes, though, as it is useful even if we don't enforce input checking. I will make a note and see if we can add it in a future version.

Which brings up another thing: variable checking in recipes is done on an optional, per-step basis, and can at times be quite loose. Many steps don't care whether the input is double or integer, and step_dummy(), as a gross outlier, doesn't do any type checking.
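To make the gap concrete, here is a small sketch comparing what a trained recipe records in var_info with an explicit prototype. The recipe is an assumed reconstruction of the test recipe (the formula is not shown in the diff), and vctrs::vec_ptype() is used purely for illustration; it is not necessarily what vetiver does internally.

```r
library(recipes)

# Assumed recipe: the formula is a guess; only step_ns() and prep() appear in the diff
trained_rec <-
    recipe(mpg ~ disp + wt, data = mtcars) %>%
    step_ns(wt) %>%
    prep(retain = FALSE)

# What the recipe remembers: one row per original column, with a coarse type and a role
info <- trained_rec$var_info
info[, c("variable", "type", "role")]

# What a prototype needs: the exact column classes, as a zero-row data frame
ptype <- vctrs::vec_ptype(mtcars[c("disp", "wt")])
str(ptype)
```

The coarse type in var_info is enough for selectors such as all_numeric(), but it cannot always be turned back into a concrete class, which is why prototype_data is passed explicitly.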

R/recipe.R

EmilHvitfeldt

Overall looks good (insofar as I only looked at the recipes side of the PR). I think the main struggle right now is that a recipe object doesn't include a reliable way to generate a ptype-like object.

Co-authored-by: Emil Hvitfeldt <emilhhvitfeldt@gmail.com>

juliasilge · 2023-03-02T20:17:22Z

After bake() is added to generics, we can come back and add in some methods.

juliasilge added 3 commits February 22, 2023 15:26

Add methods for recipes

45efe99

Update tests

c527545

Redocument

ff5e703

juliasilge commented Feb 22, 2023

Namespace for recipes::bake()

37ac83e

juliasilge mentioned this pull request Feb 22, 2023

Using Vetiver for UMAPs #177

Closed

juliasilge requested a review from EmilHvitfeldt February 22, 2023 22:59

EmilHvitfeldt reviewed Feb 22, 2023

R/recipe.R (outdated)

EmilHvitfeldt reviewed Feb 23, 2023

Update R/recipe.R

f93e62d

Co-authored-by: Emil Hvitfeldt <emilhhvitfeldt@gmail.com>

juliasilge mentioned this pull request Feb 28, 2023

Add generic for bake() r-lib/generics#75

Open

juliasilge added 2 commits March 2, 2023 13:18

Update NEWS

c25088b

Merge branch 'add-recipes' of https://github.com/rstudio/vetiver-r into add-recipes

4bccc38

juliasilge merged commit 0df86e1 into main Mar 2, 2023

juliasilge deleted the add-recipes branch March 2, 2023 20:40

juliasilge commented Feb 22, 2023

juliasilge Feb 22, 2023

juliasilge Feb 22, 2023

juliasilge Feb 28, 2023

isabelizimm Mar 1, 2023

juliasilge Feb 22, 2023

EmilHvitfeldt Feb 23, 2023 (edited)

EmilHvitfeldt left a comment (edited)

juliasilge commented Mar 2, 2023
