Support for luz #187

Merged · 38 commits · Apr 14, 2023
Conversation

@dfalbel (Member) commented Mar 10, 2023

Hi 👋!

This is a first pass at supporting luz in vetiver.

There are a few things where I'd like to ask about the best way to proceed:

  1. Calling predict on the result of vetiver_model doesn't yield the same structure as calling predict on an endpoint containing a luz model, which can be confusing. I wonder if we should enforce consistency somehow; in this case, I think it would be nice if the same validations that happen in handler_predict.luz_module_fitted also happened for predict.vetiver_model.

  2. Outputs of luz models can be arrays with arbitrary dimensions, while vetiver enforces a data frame output. To handle this, we return a data frame with an array column, which helps preserve the output dimensions. However, the JSON serializer and de-serializer break the array column:

x <- tibble::tibble(.pred = array(1, dim = c(5, 2,2,2)))
str(x)
#> tibble [5 × 1] (S3: tbl_df/tbl/data.frame)
#>  $ .pred: num [1:5, 1:2, 1:2, 1:2] 1 1 1 1 1 1 1 1 1 1 ...

y <- jsonlite::fromJSON(jsonlite::toJSON(x))
str(y)
#> 'data.frame':    5 obs. of  1 variable:
#>  $ .pred:List of 5
#>   ..$ : int [1:2, 1:2, 1:2] 1 1 1 1 1 1 1 1
#>   ..$ : int [1:2, 1:2, 1:2] 1 1 1 1 1 1 1 1
#>   ..$ : int [1:2, 1:2, 1:2] 1 1 1 1 1 1 1 1
#>   ..$ : int [1:2, 1:2, 1:2] 1 1 1 1 1 1 1 1
#>   ..$ : int [1:2, 1:2, 1:2] 1 1 1 1 1 1 1 1

I wonder if there's a way to safely override the de-serializer for those models so the original structure is preserved.
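
As a rough sketch of what preserving the structure could look like, the array column can be rebuilt from the list of slices that fromJSON() produces above, assuming the slice dimensions are known up front (illustration only, not a proposed implementation):

# Sketch: recombine the five 2x2x2 slices into one array, dims 2 x 2 x 2 x 5
slices <- simplify2array(y$.pred)
# move the observation index back to the first dimension: 5 x 2 x 2 x 2
restored <- aperm(slices, c(4, 1, 2, 3))
str(restored)
#>  int [1:5, 1:2, 1:2, 1:2] 1 1 1 1 1 1 1 1 1 1 ...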

@juliasilge (Member):

Thank you so much for this contribution @dfalbel!

  1. The design here for what goes into a handler_predict method is about common failure modes for deployed models, for example, new data arriving with problems or in a different format. The use case for predict is intended to be much narrower: basically calling predict on the contained model object, as a user would expect to be able to do.
  2. We have been thinking about non-rectangular data for a while (see "Consider options for more flexible ptype specification" #55), but until we get some of that worked out, we are letting folks choose between rectangular data with strict, robust checking and turning the checking off (see the Details section of the docs), as sketched below. We'll want to take the same approach for luz as for keras in this respect.
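
A sketch of turning the checking off when creating the vetiver model (argument names may differ slightly across vetiver versions; older releases used save_ptype):

# Sketch: skip the input data prototype for non-rectangular inputs
v <- vetiver::vetiver_model(
    model = fitted_luz_model,    # hypothetical fitted luz module
    model_name = "cars-luz",
    save_prototype = FALSE       # turn off strict input checking
)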

@juliasilge (Member) left a review:

Thanks again for contributing this! Let me know if you have any questions on this feedback, and I would very much welcome your input and ideas on #55.

Review thread on R/luz.R (outdated):
#' @rdname vetiver_create_meta
#' @export
vetiver_create_meta.luz_module_fitted <- function(model, metadata) {
pkgs <- c("luz", model$model$required_pkgs)
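(The hunk is truncated by the review view. Presumably the method finishes by recording pkgs in the metadata, something along the lines of vetiver_meta(metadata, required_pkgs = pkgs), following the pattern of vetiver's other vetiver_create_meta methods; the verbatim remainder isn't shown here.)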
@juliasilge (Member):

Can you tell me more about how this works in luz? I have trained some of the example models, and they do require me to have, for example, torch and/or torchvision loaded, but those packages are not stored in this slot. Instead, I see:

model$model$required_pkgs
#> NULL

@dfalbel (Member, Author):

In luz we don't try to be smart about capturing the packages a model uses, but users can optionally set this field in the nn_module so it's available as metadata. E.g., one can do:

module <- torch::nn_module(
    initialize = function(in_features, out_features) {
        self$linear <- torch::nn_linear(in_features, out_features)
    },
    forward = function(x) {
        self$linear(x)
    },
    required_pkgs = c("torch", "torchvision")
)

We could try to traverse the forward expression and find function calls that come from other packages, but I feel that could still have many edge cases and would be error-prone.

@juliasilge (Member):

Would they always need torch? Should we include that there? I think this is about what needs to be installed and attached for predictions to work. Getting the right packages installed for the deployment is a big part of what vetiver aims to do.

@dfalbel (Member, Author):

torch doesn't necessarily need to be attached, but it definitely needs to be installed; it should already be, since it's a hard dependency of luz. In the example above, torch wouldn't need to be attached for predictions to work.

Review thread on R/luz.R:
#' @rdname handler_startup
#' @export
handler_predict.luz_module_fitted <- function(vetiver_model, ...) {
force(vetiver_model)
@juliasilge (Member):

Can you tell me a little more about this?

@dfalbel (Member, Author):

Since we return the closure directly, without evaluating vetiver_model, it will not be in the function environment until the first call of the closure; at that point vetiver_model could potentially have been garbage collected already. This might not actually be an issue for vetiver, but it feels like good practice to force() before returning the closure.

I'm trying to avoid something like this:

f <- function(a, force) {
    if (force) force(a)
    function() {
        a + 1
    }
}

b <- 1
fun_f <- f(a = b, force = TRUE)
fun_nf <- f(a = b, force = FALSE)
rm(b);gc()
#>           used (Mb) gc trigger (Mb) limit (Mb) max used (Mb)
#> Ncells  674744 36.1    1413787 75.6         NA   710975 38.0
#> Vcells 1201226  9.2    8388608 64.0      32768  1888973 14.5

fun_f()
#> [1] 2
fun_nf()
#> Error in fun_nf(): object 'b' not found

@juliasilge (Member):

Thanks for the details!

Review thread on tests/testthat/test-luz.R (outdated):
expect_error(predict(v, as.array(torch::torch_randn(10, 2))), regex = "dim error")
})

test_that("can call endpoints", {
@juliasilge (Member):

You can check out the approach for testing against a local plumber session that I have already set up in the package, and use that instead of this. Unfortunately it's not practical to set up APIs for all the model types in CI (it just takes too long for the API to come up on some architectures), so a test like this will need to skip on CI (as well, of course, as on CRAN).

@dfalbel (Member, Author):

I removed that test for now. I wasn't able to use local_plumber_session because an unbundled model is passed to the callr session, and that breaks the luz model. We could perhaps 'unbundle' on the first call to the API instead? But that should probably be part of another PR.

@dfalbel (Member, Author) commented Mar 16, 2023

@juliasilge Thank you very much for the review and suggestions. I simplified the PR to make the luz support very similar to the keras support. I'll think about multi-output support and post on #55.

@juliasilge (Member):

I made a little example for returning higher dimensional tensors and put it in inst/mtcars_luz.R. The output now looks like this, which I think is a pretty nice option:

library(vetiver)
endpoint <- vetiver_endpoint("http://127.0.0.1:8080/predict")
scaled_cars <- scale(as.matrix(mtcars))
x_test  <- scaled_cars[26:32, 2:ncol(scaled_cars)]
predict(endpoint, data.frame(x_test[1:2,]))
#> # A tibble: 2 × 1
#>   preds              
#>   <list>             
#> 1 <dbl [3 × 64 × 64]>
#> 2 <dbl [3 × 64 × 64]>

Created on 2023-04-03 with reprex v2.0.2

@dfalbel would you mind taking a look at this again and seeing if you have any feedback (other than, of course, how to extend the prototype checking to non-rectangular data, which we can handle separately)?

@juliasilge (Member):

Ah, I went to deploy one of these models on Connect and realized that we haven't set up the torch installation for the API. 🙈

What do you think is the best way to go about this, @dfalbel? The way we handle installing keras is via a requirements.txt file that gets bundled along to Connect (see how this is handled for keras elsewhere in the package).

What would be a good way to handle this for torch? What do you all do for installing torch on Connect typically?

@dfalbel (Member, Author) commented Apr 3, 2023

In theory, just setting the env var TORCH_INSTALL=1 is enough for the installation to succeed; it allows downloading the external files without a prompt. There are no Python dependencies or anything else involved. Is it possible to set an env var? Or, e.g., send a .Renviron file that would set it?
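
A sketch of how this could look in the API's startup code, using torch's exported helpers (this is not something the PR adds, just an illustration):

# Sketch: only download the libtorch/lantern binaries when they are missing,
# so a deployed API doesn't re-download them on every start
if (!torch::torch_is_installed()) {
    torch::install_torch()
}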

@juliasilge (Member):

Does that mean it will install torch every time the content starts, i.e. the API starts up? That's not ideal.

How do you all typically install torch into content when you are deploying on Connect? Do you have an example I can look at? We would want the install to happen when the content deploys, not each time it starts up.

@juliasilge juliasilge merged commit bc207a6 into rstudio:main Apr 14, 2023
juliasilge added a commit that referenced this pull request Apr 14, 2023