Update for RStudio Server and R/4.2.1
cgross95 committed Feb 20, 2024
1 parent 7eaca8e commit 0c3ea6f
Showing 10 changed files with 218 additions and 43 deletions.
3 changes: 2 additions & 1 deletion config.yaml
@@ -59,14 +59,15 @@ contact: 'grosscra@msu.edu'

# Order of episodes in your lesson
episodes:
- rstudio-ondemand.Rmd
- rstudio-server-ondemand.Rmd
- managing-r-env-packages.Rmd
- parallelizing-r-code.Rmd
- r-command-line.Rmd
- r-slurm-jobs.Rmd

# Information for Learners
learners:
- rstudio-ondemand.Rmd

# Information for Instructors
instructors:
Binary file added episodes/fig/rstudio-server-interface.png
Binary file added episodes/fig/rstudio-server-ondemand-options.png
Binary file added episodes/fig/running-rstudio-server-job.png
Binary file added episodes/fig/select-rstudio-server.png
28 changes: 18 additions & 10 deletions episodes/managing-r-env-packages.Rmd
@@ -31,15 +31,15 @@ We can do this by typing `.libPaths()` into the R console:
.libPaths()
```
```output
[1] "/mnt/ufs18/home-237/k0068027/R/x86_64-pc-linux-gnu-library/4.0"
[2] "/cvmfs/pub_software.icer.msu.edu/software/R/4.0.3-foss-2020a/lib64/R/library"
[1] "/mnt/ufs18/home-237/k0068027/R/x86_64-pc-linux-gnu-library/4.2"
[2] "/cvmfs/pub_software.icer.msu.edu/software/R/4.2.1-foss-2022a/lib64/R/library"
```

We see two directories.
The first is created for you in your home directory, and the second (or one like it, starting with `/cvmfs/pub_software.icer.msu.edu/software` or `/opt/software`) points to all of the packages that are pre-installed on the HPCC.
When you use `install.packages()` in the future, by default, it will install to the first entry in your `.libPaths()`.

One important point to note is that the library in your home directory is labeled with `4.0` for version 4.0(.3) of R.
One important point to note is that the library in your home directory is labeled with `4.2` for version 4.2(.1) of R, the default used by RStudio Server.
If you ever use different versions of R, it is important that the packages you use are consistent with those versions.
So, for example, if you choose to use R/3.6.2, you should make sure that the library in your home directory returned by `.libPaths()` ends in 3.6.
Mixing versions will likely cause your packages to stop working!
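
As a rough illustration of the version-matching rule above, here is a minimal shell sketch. The helper name `lib_matches_r_version` is hypothetical (it is not part of R or the HPCC), and it assumes the personal library path ends in the major.minor version, as in the `.libPaths()` output shown earlier.

```bash
# Hypothetical helper: succeed only if the last component of a library path
# (e.g. ".../x86_64-pc-linux-gnu-library/4.2") equals the major.minor part
# of an R version string (e.g. "4.2.1").
lib_matches_r_version() {
  local lib_dir="$1" r_version="$2"
  local lib_minor="${lib_dir##*/}"   # text after the final slash, e.g. "4.2"
  local r_minor="${r_version%.*}"    # drop the patch level, e.g. "4.2"
  [ "$lib_minor" = "$r_minor" ]
}

# Example with an illustrative path under the home directory
if lib_matches_r_version "$HOME/R/x86_64-pc-linux-gnu-library/4.2" "4.2.1"; then
  echo "library matches R version"
else
  echo "library does not match R version"
fi
```

A check like this only compares names; it does not verify that the packages inside the directory were actually built with that version of R.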
@@ -65,7 +65,7 @@ Luckily, R knows this, and if you try to install a package, you will be offered

``` output
Warning in install.packages :
'lib = "/cvmfs/pub_software.icer.msu.edu/software/R/4.0.3-foss-2020a/lib64/R/library"' is not writable
'lib = "/cvmfs/pub_software.icer.msu.edu/software/R/4.2.1-foss-2022a/lib64/R/library"' is not writable
Would you like to use a personal library instead? (yes/No/cancel)
```

@@ -122,7 +122,7 @@ Here are some general tips:

1. Read the documentation for the package you're using and take note of any dependencies you need and their versions. This information is also included under SystemRequirements on a package's [CRAN](https://cran.r-project.org/) page.
2. Make sure that software is available before you try to install/use the R package. This could involve:
- Loading it through the HPCC module system. **Note**: This is not possible (yet) in RStudio on the HPCC. You will have to [use R through the command line](r-command-line.Rmd#loading-external-dependencies).
- Loading it through the HPCC module system. To do this in OnDemand, click the "Advanced Options" checkbox when you start a new RStudio Server session. The first option will allow you to enter HPCC modules you'd like to load before RStudio starts. Otherwise, you can [load these modules and use R through the command line](r-command-line.Rmd#loading-external-dependencies).
- Installing it yourself in a way that R can find it.
3. If a package's setup instructions suggest something like `sudo apt-get ...` or `sudo dnf install ...` under the Linux instructions, this is a sign that it needs external dependencies.
These methods won't work for installation on the HPCC; instead, look for and load HPCC modules with similar names.
@@ -186,9 +186,9 @@ To make sure this runs every time we start R, we'll put it in the `.Rprofile` fi

Use RStudio to open a new Text File and type

```text
```r
local({
options(repos = c(CRAN="https://repo.miserver.it.umich.edu/cran/"))
options(repos = c(CRAN="https://repo.miserver.it.umich.edu/cran/"))
})
```

@@ -197,7 +197,15 @@ It's good practice to put any code you write in your `.Rprofile` in a call to `l

Save this in your `r_workshop` directory as `.Rprofile` (don't forget the leading `.`).
Any time R starts, it will look for a `.Rprofile` file in the current directory, and execute all of the code before doing anything else.
To make this take effect in RStudio, you can restart R by going to the Session menu, and select Restart R.
To make this take effect in RStudio, restart R by opening the Session menu and selecting Restart R. To check our work, run

```r
options()$repos
```
```output
CRAN
"https://repo.miserver.it.umich.edu/cran/"
```
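
If you would rather create the `.Rprofile` from a terminal than from RStudio, something like the following sketch may work. It assumes the project lives in `~/r_workshop` as in this lesson; `PROJECT_DIR` is a hypothetical variable introduced here only so the location can be overridden, and the script deliberately leaves an existing `.Rprofile` alone.

```bash
# Assumption: the project directory is ~/r_workshop unless PROJECT_DIR is set.
PROJECT_DIR="${PROJECT_DIR:-$HOME/r_workshop}"
mkdir -p "$PROJECT_DIR"

if [ -e "$PROJECT_DIR/.Rprofile" ]; then
  # Do not overwrite settings that are already there
  echo "$PROJECT_DIR/.Rprofile already exists; leaving it untouched"
else
  # Quoted 'EOF' keeps the shell from expanding anything inside the R code
  cat > "$PROJECT_DIR/.Rprofile" <<'EOF'
local({
  options(repos = c(CRAN="https://repo.miserver.it.umich.edu/cran/"))
})
EOF
fi
```

R only reads this file when it starts in that directory, so restart R afterwards just as described above.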

Now suppose that this project we're working on uses some very special packages that we don't want in the library in our home directory.
The right way to do this is with a package manager like [`packrat`](https://rstudio.github.io/packrat/) or the newer [`renv`](https://rstudio.github.io/renv/articles/renv.html).
@@ -229,7 +237,7 @@ Now, restart R using the Session menu, and check your `.libPaths()` in the R con
```
```output
[1] "/mnt/ufs18/home-237/k0068027/r_workshop/library"
[2] "/cvmfs/pub_software.icer.msu.edu/software/R/4.0.3-foss-2020a/lib64/R/library"
[2] "/cvmfs/pub_software.icer.msu.edu/software/R/4.2.1-foss-2022a/lib64/R/library"
```

Great! We can even check that we've isolated ourselves from the default home directory library by trying to load `cowsay`:
@@ -283,7 +291,7 @@ Double checking our library paths
```
```output
[1] "/mnt/ufs18/home-237/k0068027/r_workshop/library"
[2] "/cvmfs/pub_software.icer.msu.edu/software/R/4.0.3-foss-2020a/lib64/R/library"
[2] "/cvmfs/pub_software.icer.msu.edu/software/R/4.2.1-foss-2022a/lib64/R/library"
```

we see that our `r_workshop/library` directory is first.
46 changes: 23 additions & 23 deletions episodes/r-command-line.Rmd
@@ -77,42 +77,40 @@ module spider R
$ module -r spider '.*R.*'
```

We've abbreviated the output, but we can see that there are lots of different versions of R available! We'll try loading 4.0.3 since that version on the HPCC has a large number of packages pre-installed.
We've abbreviated the output, but we can see that there are lots of different versions of R available! We'll try loading 4.2.1 since that version matches the one used in the RStudio Server OnDemand app.

If you're familiar with the module system, you might try to load the module right away with `module load`:

```bash
module load R/4.0.3
module load R/4.2.1
```

```output
Lmod has detected the following error: These module(s) or extension(s) exist
but cannot be loaded as requested: "R/4.0.3"
Try: "module spider R/4.0.3" to see how to load the module(s).
but cannot be loaded as requested: "R/4.2.1"
Try: "module spider R/4.2.1" to see how to load the module(s).
```

But we get an error! Let's try the suggested fix to see what's going on:


```bash
module spider R/4.0.3
module spider R/4.2.1
```

```output
----------------------------------------------------------------------------
R: R/4.0.3
R: R/4.2.1
----------------------------------------------------------------------------
Description:
R is a free software environment for statistical computing and
graphics.
You will need to load all module(s) on any one of the lines below before the
"R/4.0.3" module is available to load.
"R/4.2.1" module is available to load.
GCC/10.2.0 OpenMPI/4.0.5
GCC/9.3.0 OpenMPI/4.0.3
iccifort/2020.1.217 impi/2019.7.217
GCC/11.3.0 OpenMPI/4.1.4
Help:
Description
@@ -129,7 +127,7 @@ Before we do that, it's good practice to purge any other modules that might be l

```bash
module purge
module load GCC/10.2.0 OpenMPI/4.0.5 R/4.0.3
module load GCC/11.3.0 OpenMPI/4.1.4 R/4.2.1
```

No error! Let's check that we can access R:
@@ -139,8 +137,8 @@ R
```

```output
R version 4.0.3 (2020-10-10) -- "Bunny-Wunnies Freak Out"
Copyright (C) 2020 The R Foundation for Statistical Computing
R version 4.2.1 (2022-06-23) -- "Funny-Looking Kid"
Copyright (C) 2022 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
@@ -158,6 +156,7 @@ Type 'demo()' for some demos, 'help()' for on-line help, or
Type 'q()' to quit R.
>
```

Great! We now have an R console where we can run short lines of code, just like from RStudio. As the output from `R` shows, type `q()` to quit and return to the command line.
@@ -174,36 +173,37 @@ We will use this option below to make sure we run our code in the cleanest envir
When you load the R module using the `module load` commands above (before you actually run `R`), this is also the time to load those external dependencies.

Note that these dependencies and R will all need to be compatible (e.g., use the same version of GCC and MPI).
For example, a spatial ecology workflow might require the use of GDAL and UDUNITS as dependencies for R packages.
For example, a Bayesian modeling workflow might require the use of JAGS as a dependency for R packages.
After loading R and its dependencies with

```bash
module purge
module load GCC/10.2.0 OpenMPI/4.0.5 R/4.0.3
module load GCC/11.3.0 OpenMPI/4.1.4 R/4.2.1
```

you can try loading a compatible JAGS without specifying a version:

```bash
module load GDAL UDUNITS
module load JAGS
```

Then check which versions got loaded:
Then check which version gets loaded:

```bash
module list
```
```output
Currently Loaded Modules:
1) GCCcore/10.2.0 27) zstd/1.4.5 53) FLAC/1.3.3
2) zlib/1.2.11 28) libdrm/2.4.102 54) libvorbis/1.3.7
1) GCCcore/11.3.0 34) zstd/1.5.2 67) ICU/71.1
2) zlib/1.2.12 35) libdrm/2.4.110 68) Szip/2.1.1
3) binutils/2.38 36) libglvnd/1.4.0 69) HDF5/1.12.2
...
24) X11/20201008 50) GMP/6.2.0 76) GDAL/3.3.2
25) gzip/1.10 51) NLopt/2.6.2 77) UDUNITS/2.2.26
26) lz4/1.9.2 52) libogg/1.3.4
31) X11/20220504 64) libopus/1.3.1 97) PROJ/9.0.0
32) gzip/1.12 65) LAME/3.100 98) JAGS/4.3.1
33) lz4/1.9.3 66) libsndfile/1.1.0
```

If these versions will work, then great!
If this version will work, then great!
If not, then you might try using a different version of R.
Usually, newer versions of dependencies for popular R packages get installed with newer versions of R.
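
To avoid scanning the full module list by eye, a small filter can pull out the version of one module. This is a minimal sketch: `loaded_version` is a hypothetical helper, and it assumes the `name/version` layout shown in the `module list` output above. The sample text stands in for real output so the snippet runs anywhere.

```bash
# Hypothetical helper: read `module list` style text on stdin and print the
# version of the named module (prints nothing if the module is not listed).
loaded_version() {
  local name="$1"
  # Put each whitespace-separated token on its own line, keep the first
  # token that starts with "name/", then print the part after the slash.
  tr -s '[:space:]' '\n' | grep "^${name}/" | head -n 1 | cut -d/ -f2
}

# Sample text copied from the abbreviated output above
sample='31) X11/20220504 64) libopus/1.3.1 97) PROJ/9.0.0
32) gzip/1.12 65) LAME/3.100 98) JAGS/4.3.1'

printf '%s\n' "$sample" | loaded_version JAGS   # prints 4.3.1
```

On the cluster you would pipe the real output instead, for example `module list 2>&1 | loaded_version JAGS`; the `2>&1` is there on the assumption that the module system writes its listing to standard error.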

18 changes: 9 additions & 9 deletions episodes/r-slurm-jobs.Rmd
@@ -65,7 +65,7 @@ This code is exactly what you would enter on the command line to run your R scri

# Load the R module
module purge
module load GCC/10.2.0 OpenMPI/4.0.5 R/4.0.3
module load GCC/11.3.0 OpenMPI/4.1.4 R/4.2.1

# Get to our project directory
cd ~/r_workshop
@@ -125,12 +125,12 @@ JobId=17815750 JobName=single_core.sh
MinCPUsNode=1 MinMemoryCPU=500M MinTmpDiskNode=0
Features=[intel14|intel16|intel18|(amr|acm)|nvf|nal|nif] DelayBoot=00:00:00
OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)
Command=/mnt/ufs18/home-045/grosscra/r_workshop/slurm/single_core.sh
WorkDir=/mnt/ufs18/home-045/grosscra/r_workshop
Comment=stdout=/mnt/ufs18/home-045/grosscra/r_workshop/slurm-17815750.out
StdErr=/mnt/ufs18/home-045/grosscra/r_workshop/slurm-17815750.out
Command=/mnt/ufs18/home-237/k0068027/r_workshop/slurm/single_core.sh
WorkDir=/mnt/ufs18/home-237/k0068027/r_workshop
Comment=stdout=/mnt/ufs18/home-237/k0068027/r_workshop/slurm-17815750.out
StdErr=/mnt/ufs18/home-237/k0068027/r_workshop/slurm-17815750.out
StdIn=/dev/null
StdOut=/mnt/ufs18/home-045/grosscra/r_workshop/slurm-17815750.out
StdOut=/mnt/ufs18/home-237/k0068027/r_workshop/slurm-17815750.out
Power=
```

@@ -159,7 +159,7 @@ Submit the job and compare the time it took to run with the single core job.

# Load the R module
module purge
module load GCC/10.2.0 OpenMPI/4.0.5 R/4.0.3
module load GCC/11.3.0 OpenMPI/4.1.4 R/4.2.1

# Get to our project directory
cd ~/r_workshop
@@ -220,7 +220,7 @@ For the steps below, you will need the [list of SLURM job specifications](https:

# Load the R module
module purge
module load GCC/10.2.0 OpenMPI/4.0.5 R/4.0.3
module load GCC/11.3.0 OpenMPI/4.1.4 R/4.2.1

# Get to our project directory
cd ~/r_workshop
@@ -275,7 +275,7 @@ Here is an example script:

# Load the R module
module purge
module load GCC/10.2.0 OpenMPI/4.0.5 R/4.0.3
module load GCC/11.3.0 OpenMPI/4.1.4 R/4.2.1

# Get to our project directory
cd ~/r_workshop