Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Are the packages of S3 methods being found? #46

Open
DavisVaughan opened this issue Jun 11, 2019 · 2 comments
Open

Are the packages of S3 methods being found? #46

DavisVaughan opened this issue Jun 11, 2019 · 2 comments
Labels

Comments

@DavisVaughan
Copy link
Contributor

I've had two reports over in furrr where packages are not loaded because the only thing used in the expression are S3 generics where the generic doesn't come from the package, but the method does (for example, [ is in base R, but [.data.table is in data table). Does globals find these?

Here is an example:

library(data.table)
library(future.apply)
#> Loading required package: future

dt_cars <- data.table(mtcars)

plan(multiprocess, workers = 2)

future_lapply(1:4, function(x) {head(dt_cars[carb %in% 1], 2)})
#> Error in carb %in% 1: object 'carb' not found

future_lapply(1:4, function(x) {library(data.table); head(dt_cars[carb %in% 1], 2)})
#> [[1]]
#>     mpg cyl disp  hp drat    wt  qsec vs am gear carb
#> 1: 22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
#> 2: 21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
#> 
#> [[2]]
#>     mpg cyl disp  hp drat    wt  qsec vs am gear carb
#> 1: 22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
#> 2: 21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
#> 
#> [[3]]
#>     mpg cyl disp  hp drat    wt  qsec vs am gear carb
#> 1: 22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
#> 2: 21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
#> 
#> [[4]]
#>     mpg cyl disp  hp drat    wt  qsec vs am gear carb
#> 1: 22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
#> 2: 21.4   6  258 110 3.08 3.215 19.44  1  0    3    1

Created on 2019-06-11 by the reprex package (v0.2.1)

Here are the two issues:
futureverse/furrr#72

futureverse/furrr#73

@DavisVaughan
Copy link
Contributor Author

You can technically find the methods registered for the generic with attr(utils::methods("["), "info"), although that gives them all, and not in the cleanest form (the package isn't cleanly separated out into its own column).

I wouldn't know how you could use that information to figure out that only the data table package is the one you need. I don't think trying to load every library with a method registered is the best way to go about this.

In the generics package, we have a function that cleans up this info a bit, although it has a small bug when trying to find [ / [[ due to some regex. This is the idea though, notice we found the package the method goes with:

head(generics:::methods_find("predict"))
#>                method       class package               topic visible
#> 1          predict.ar          ar   stats                  ar   FALSE
#> 2       predict.Arima       Arima   stats       predict.arima   FALSE
#> 3      predict.arima0      arima0   stats              arima0   FALSE
#> 4         predict.glm         glm   stats         predict.glm    TRUE
#> 5 predict.HoltWinters HoltWinters   stats predict.HoltWinters   FALSE
#> 6          predict.lm          lm   stats          predict.lm    TRUE
#>                source
#> 1 registered S3method
#> 2 registered S3method
#> 3 registered S3method
#> 4               stats
#> 5 registered S3method
#> 6               stats

Created on 2019-06-11 by the reprex package (v0.2.1)

https://github.com/r-lib/generics/blob/c15ac433450078b581cdfa5d9a3c10797f3a3fd5/R/docs.R#L2

@HenrikBengtsson
Copy link
Collaborator

Hi and thanks for the pointers to 'generics'. Correct, the 'globals' package does not identify registered S3 methods that are needed. Section 'Missing packages (false negatives)' in the 'Common Issues with Solutions' vignette gives an example of this based on 'data.table'.

I've added Issue #47 (migrated from some internal notes of mine, and closes #2) that outline code how to identify this. Three reasons why I haven't got around to roll that out is: (1) time, (2) deep-focus study the impact of automating this as well, (3) being conservative. I want to make sure this "magic" does not introduce side effects that are hard to roll back from.

But, the current behavior is annoying and confusing for people who don't understand S3 methods, parallel processing, how the future+globals work. So, this would spare some headache if fixed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants