setup-r-dependencies action always reinstalls site-packages if no cache is available #814
Comments
It defaults to whatever you set as your primary library. E.g. if you have a user library, that is used:
That is the expected behavior, pak does not consider other libraries, only the library it is installing to and the base and recommended packages in Whether this is always the desired behavior is a good question, possibly not. |
Since no Package library to install the packages to. Note that all dependent packages will be installed here, even if they are already installed in another library. The only exceptions are base and recommended packages installed in .Library. These are not duplicated in lib, unless a newer version of a recommended package is needed.
As specified here: https://pak.r-lib.org/reference/lockfile_create.html
Unless you're using caching (actions/cache@v4) and a cache file is found |
I've double checked this, and I can't get it to work. I.e. installed packages are always being ignored, even if they're installed in .Library next to the base packages. This is because the example with the pre installed yaml package:
The result for the
Which means it's going to be reinstalled.
When I change the pak::lockfile_create() call to:
the result for the
|
See issue: r-lib#814
As I said above:
I.e. it only considers the base and recommended packages in |
I see, so there's no way to use prebuilt base images to speed up deployments? |
You need to install them into the same library that pak uses to install packages to. |
I've tried that, but they're still being ignored and are being reinstalled. Because the Later on |
I'm losing my hair here: when I pull the base image locally (Docker Desktop) and run the scripts inside the container, the preinstalled packages seem to be detected. When I do the same in a GitHub workflow, they aren't.
produces:
Nevertheless in GitHub actions:
Which takes forever. Locally (Docker Desktop) in a pulled image (same base image, same DESCRIPTION file):
The /usr/local/lib/R/site-library directory contains all the preinstalled packages. It looks like GitHub Actions does not behave the same way when you run actions in a pulled image (a container on top of the runner). Or is it something with the pkgcache/cache user dir that is different for a GitHub actor? How can I check? |
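One way to answer the "How can I check?" question is to print the library and home settings in both environments and compare them. The following is a minimal sketch using base R only. The function name `lib_report` is hypothetical, and the assumption that `HOME`, `R_USER_CACHE_DIR` and `XDG_CACHE_HOME` are the relevant variables for the cache location is mine, not something confirmed in this thread.

```r
# Collect the library-related settings that can differ between a local
# container and a container started by a CI runner.
# NOTE: `lib_report` is a hypothetical helper, not part of pak or the action.
lib_report <- function() {
  list(
    lib_paths        = .libPaths(),                   # libraries R searches, in order
    base_lib         = .Library,                      # base and recommended packages
    site_libs        = .Library.site,                 # site libraries, may be empty
    r_libs_user      = Sys.getenv("R_LIBS_USER"),
    r_libs_site      = Sys.getenv("R_LIBS_SITE"),
    home             = Sys.getenv("HOME"),            # "~" expands relative to this
    r_user_cache_dir = Sys.getenv("R_USER_CACHE_DIR"),
    xdg_cache_home   = Sys.getenv("XDG_CACHE_HOME"),
    # Number of installed packages found in each library on the search path.
    pkgs_per_lib     = vapply(
      .libPaths(),
      function(p) nrow(installed.packages(lib.loc = p)),
      integer(1)
    )
  )
}

str(lib_report())
```

Running this once in the locally pulled image and once as a workflow step makes any difference in library paths or home directory visible side by side.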
In pulled image in Docker:
In GitHub actions container:
So could the GH action be looking in ~/.cache which is different in both cases? |
What I did now is:
Resulting in:
Of course this is not a really nice solution; I would still prefer to have |
Wait, you want to put packages into the pkgcache cache? I thought you wanted to pre-install them into the site library. In any case, if you have something that works, that's great. This is the issue that tracks being able to use multiple libraries: r-lib/pkgdepends#189 |
No, I don't want to put them in pkgcache per se, but if they're not in there, they are reinstalled and the site library packages are ignored. You said they shouldn't be ignored, but they are. Maybe not on a standard GitHub runner, but on a GitHub runner which uses a container on top, they are ignored, no matter what I try. That's basically what this bug is about |
I'm sorry, I can't provide you with an example as our code base and base images are private. If I use For example:
And with the
Although this shouldn't be a problem, since lockfile_install() should detect already installed packages in the target destination (in my case R_LIB_FOR_PAK = /usr/local/lib/site-library), it doesn't and reinstalls all packages anyway. So basically there are two bugs from my point of view:
|
That's not where packages are installed by default. That's the library where pak is installed, to make it separate from the user's default library. pak installs packages to Also, |
A lockfile is supposed to contain all dependencies of the project.
Here is an example:
FROM rhub/r-minimal
RUN installr -c
RUN echo -e 'Package: test\nVersion: 1.0.0\nImports: dplyr' \
> DESCRIPTION
RUN R -e 'source("https://pak.r-lib.org/install.R")'
RUN R -e 'pak::lockfile_create()'
RUN R -e 'pak::lockfile_install()'
# ----------------------------------------------------------------------------
# SAVE Dockerfile here
# ----------------------------------------------------------------------------
# Add data.table as well
RUN echo -e 'Package: test\nVersion: 1.0.0\nImports: dplyr, data.table' \
> DESCRIPTION
RUN R -e 'pak::lockfile_create()'
RUN R -e 'pak::lockfile_install()'
First we install dplyr, and then use the resulting image to add an extra data.table package to it. In real life the second part would happen in GitHub Actions of course. If you build this, you'll see that the second installation only adds the missing data.table package:
|
Could be due to the fact that we're using the good old |
Actually this is not true, as we're using:
in the base image |
Even if you are using
FROM rhub/r-minimal
RUN installr -c
RUN R -q -e 'install.packages("dplyr", repos = "https://cloud.r-project.org")'
# ----------------------------------------------------------------------------
# SAVE Dockerfile here
# ----------------------------------------------------------------------------
RUN R -e 'source("https://pak.r-lib.org/install.R")'
# Add data.table as well
RUN echo -e 'Package: test\nVersion: 1.0.0\nImports: dplyr, data.table' \
> DESCRIPTION
RUN R -e 'pak::lockfile_create()'
RUN R -e 'pak::lockfile_install()'
you'll get
|
This could be related to using Artifactory as the package manager.
If I use our Artifactory setup for package sources, it's:
As you can see, the fields missing when creating the lockfile with Artifactory as the source are:
Resulting in:
So it's using pkgcache as a fallback. I'm kind of stuck; implementing my PR (https://github.com//pull/815) would really be helpful. |
If you are using custom repositories, then the important thing is that you use the same exact repositories for the pre-installation and the installation. E.g. if you pre-install yaml from artifactory, but then use CRAN or PPM only for the installation, then pak will see that there is a yaml package that was installed from artifactory, but according to the lockfile it should be installed from CRAN/PPM, so it will (re)install it from there. |
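To see whether the pre-installed packages and the current session point at the same repositories, the recorded metadata can be listed with base R. This is a minimal sketch under assumptions: the helper name `installed_sources` is hypothetical, and the thread does not say which DESCRIPTION fields pak compares, so the `Repository` field shown here is only one piece of evidence, not the actual matching rule.

```r
# Summarise where the packages in one library claim to come from.
# NOTE: `installed_sources` is a hypothetical helper for illustration only.
installed_sources <- function(lib = .libPaths()[1]) {
  # `fields` asks for an extra DESCRIPTION field on top of the defaults.
  ip <- installed.packages(lib.loc = lib, fields = "Repository")
  data.frame(
    package    = unname(ip[, "Package"]),
    version    = unname(ip[, "Version"]),
    repository = unname(ip[, "Repository"]),  # NA when the field is absent
    stringsAsFactors = FALSE
  )
}

# Packages recorded in the first library on the search path.
print(head(installed_sources()))

# Repositories the current session would install from.
print(getOption("repos"))
```

If the `repository` column names a different source than the configured `repos`, that would be consistent with the reinstall behaviour described above.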
Unfortunately that PR breaks other things, so I cannot merge it as is. |
That's too bad, but maybe as an optional parameter? Thanks for your time and effort though! |
So can you run |
These kinds of issues are much easier to solve if you have a public repository that reproduces your problem. Otherwise it is guesswork. It does not have to be your real image if you want to keep that private. |
Too bad I can't; it would include a private Artifactory package manager (SaaS) as well |
I think we should clone the setup-r-dependencies action and change this one line of code ourselves then, at the risk of getting out of date. There's no way we can influence this in the current setup because too many things are happening in this one action. |
As mentioned in r-lib/pak#608 (comment) |
Can you try this? You need to use the
If it works I'll update the |
This seems to be working. In the lockfile there's, for example:
So it's detecting the site-library packages now, resulting in:
However, all are being marked as to be downloaded:
But in the end:
It used to be:
So we've saved 20+ minutes in installing dependencies with just this one parameter; we're very happy with that! |
OK, updated |
This issue has been automatically locked. If you believe you have found a related problem, please file a new issue and include a link to this issue |
Describe the bug
setup-r-dependencies uses pak::lockfile_create() with no lib parameter. Therefore it defaults to .Library, which means only installed base packages are detected and all (pre-)installed site packages (.Library.site) are ignored and will be reinstalled. See:
actions/setup-r-dependencies/action.yaml
Line 112 in 7171bbd
To Reproduce
We're using nightly built base images for GitHub Actions with the most-used R packages already included, so we can speed up the install-dependencies step in our workflows. As the cache is stored on the runner and not in the base image itself, it's not used unless you rerun the same workflow/branch. Initial runs have no cache, but the installed site packages should still be checked; this does not happen because of the finding above (.Library only contains R base packages).
Expected behavior
Installed site packages should be checked; this does not happen because of the finding above (.Library only contains R base packages). Therefore all site packages are downloaded from remote repositories like CRAN or RSPM.
Additional context
Possible workaround is to install the extra packages in the base images in the .Library folder (in our case it's /usr/local/lib/R/library) instead of the .Library.site folder (/usr/local/lib/R/site-library).
A nicer solution is to specify .Library.site for the lib parameter of the pak::lockfile_create() call here, like so:
relevant (tried) env-variables:
R_LIBS_USER="/usr/local/lib/R/site-library"
R_LIBS_SITE="/usr/local/lib/R/site-library"
R_LIB_FOR_PAK="/usr/local/lib/R/site-library"
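The proposal in this report, passing the site library as `lib`, could be sketched roughly as follows. This is an illustration of the reporter's idea only, not the code in action.yaml and not necessarily the fix the action ended up using. The wrapper name `create_lockfile_for_site_lib` is hypothetical, and taking the first entry of `.Library.site` is an assumption for the case where several site libraries are configured.

```r
# Sketch only: create a lockfile against the site library instead of the
# default library. Requires pak and a DESCRIPTION file in the working
# directory, because pak::lockfile_create() defaults to pkg = "deps::.".
# NOTE: `create_lockfile_for_site_lib` is a hypothetical wrapper.
create_lockfile_for_site_lib <- function(lockfile = "pkg.lock") {
  site_lib <- .Library.site
  if (length(site_lib) == 0L) {
    stop("No site library is configured in this R installation.")
  }
  if (!requireNamespace("pak", quietly = TRUE)) {
    stop("The pak package is required for this sketch.")
  }
  # `lib` is the package library pak installs to; assumption: use the
  # first site library when more than one is configured.
  pak::lockfile_create(lockfile = lockfile, lib = site_lib[1])
}
```

The function is only defined here, not called, since calling it resolves dependencies over the network and writes a lockfile.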