Releases: osg-bosco/GridR

Adding Debian 7 support

27 Feb 21:48
Pre-release

In this release, we add Debian 7 support (#15).

Bootstrap of condor.local service

30 Jan 04:25
Pre-release

In this release, we add a new feature to Bosco, bootstrapping for the condor.local service. This can be used when submitting to distributed systems, such as GlideinWMS. Full details of the change can be found in #14.

This addition has been tested with GlideinWMS running on the Open Science Grid.

Batch and Debian Bug Fix

29 Jan 04:45
Pre-release

In this release, we fixed two bugs and added a Debian URL:

  • #11 - Fixed batch processing for local executions.
  • #12 - Fixed UNKNOWN platforms being unable to override the URL.
  • #13 - Added a Debian URL for automatic download / install.

Python 2.4 Bug Fix

31 Aug 01:16
Pre-release

The previous release, v0.9.8, included new code that was not compatible with Python 2.4. This has been fixed in #8.

Custom R Edition

30 Aug 22:45
Pre-release

Summary

This release, easily the most complicated set of changes since the modifications began, adds several much-requested features:

  • #3 - Users can now specify custom packages to be installed on the remote machine.
  • #4 - A user may specify a custom R URL to download the R binary from.
  • #7 - The R bootstrap can now download a newer version of the R binaries if they are available.
  • #6 - When running a batched job, the outputs in the list will be updated asynchronously as they complete.

Updated Default Download

Since more people are using GridR, we have updated the default download URL to one that can handle more traffic.

Custom Packages

When the user initializes GridR, they are now able to specify packages that will be installed on the remote machine.

> grid.init(service="bosco.direct", localTmpDir="tmp", 
   remotePackages=c("<path to package.tar.gz>", "<package2>", "<package3>"))

The packages listed in the remotePackages argument will be sent with the jobs and installed on the remote cluster. They will automatically be available in the R environment when your processing begins.

The packages must be in source form. They will be installed with the command:

> R CMD INSTALL --build <package>
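
Because the packages must be source tarballs, it can be worth checking the paths locally before initializing GridR. The following is a minimal sketch in base R; `check_source_packages` is a hypothetical helper, not part of GridR, and the `.tar.gz` suffix check is an assumption based on the example path above.

```r
# Hypothetical helper (not part of GridR): verify that every entry is an
# existing .tar.gz file before handing the vector to grid.init().
check_source_packages <- function(paths) {
  missing <- paths[!file.exists(paths)]
  if (length(missing) > 0) {
    stop("package file(s) not found: ", paste(missing, collapse = ", "))
  }
  not_source <- paths[!grepl("\\.tar\\.gz$", paths)]
  if (length(not_source) > 0) {
    stop("not a .tar.gz source package: ", paste(not_source, collapse = ", "))
  }
  # Return absolute paths so they stay valid if the working directory changes.
  normalizePath(paths)
}

# Usage (needs GridR and a configured Bosco cluster; the path is a placeholder):
# pkgs <- check_source_packages("mypackage_1.0.tar.gz")
# grid.init(service = "bosco.direct", localTmpDir = "tmp", remotePackages = pkgs)
```

The helper only catches missing or misnamed files; it does not confirm that the tarball is a valid R source package.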

Custom R Installation

In addition to installing custom packages, users may also want a custom version of R itself. In this case, the user can specify the HTTP URL of another R binary tarball. It will be downloaded on the worker node, and the user's processing will be executed using this custom R installation.

The user passes the Rurl argument to grid.init. The Rurl is then used to download and start R on the worker node.

For example:

> library("GridR")
> grid.init(service="bosco.direct", localTmpDir="tmp", Rurl="http://asdf/R-new.tar.gz")
> grid.apply("x", a)

Creating the R tarball

The R tarball must be created so that it includes the user's custom packages. Additionally, the R executable needs to be made portable. Documentation can be found on the Wiki and in this blog post. A working example is available from Dropbox.

Installing custom packages

Custom packages can be installed into the R tarball as follows:

  1. Download or create a functional, portable R tarball.
  2. Install the custom package(s) into it.
  3. Tar the R installation back up.
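
The three steps above can be sketched in base R. This is an illustration under stated assumptions, not the procedure from the Wiki: `repack_r_tarball` and its `lib_subdir` parameter are hypothetical, the tarball is assumed to unpack to a single top-level directory, and the packages are installed with the R that runs the script, so its version and platform should match the R inside the tarball.

```r
# Hypothetical helper (not part of GridR). Assumptions: the tarball unpacks to
# a single top-level directory, and packages live in <top>/<lib_subdir>.
repack_r_tarball <- function(r_tarball, packages, out_tarball,
                             lib_subdir = "library") {
  # Resolve the output path now, because the working directory changes below.
  out_tarball <- file.path(normalizePath(dirname(out_tarball)),
                           basename(out_tarball))

  # Step 1: unpack the portable R tarball into a scratch directory.
  work <- tempfile("r-unpack-")
  dir.create(work)
  utils::untar(r_tarball, exdir = work)
  top <- basename(list.dirs(work, recursive = FALSE)[1])

  # Step 2: install the source packages into the unpacked library.
  if (length(packages) > 0) {
    utils::install.packages(packages, lib = file.path(work, top, lib_subdir),
                            repos = NULL, type = "source")
  }

  # Step 3: tar the directory back up.
  old <- setwd(work)
  on.exit(setwd(old))
  utils::tar(out_tarball, files = top, compression = "gzip")
  invisible(out_tarball)
}

# Usage (file names are placeholders):
# repack_r_tarball("R-portable.tar.gz", "mypackage_1.0.tar.gz", "R-new.tar.gz")
```

The resulting tarball would then need to be hosted at an HTTP URL and passed to grid.init as Rurl.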

Bug Fix release 0.9.7

02 Aug 21:51
Pre-release

This is a bug fix release. It includes a very important fix for a bug that affected users who submitted functions that call other user-defined functions. More details are in #1.

Bugs Fixed

  • Fixed compilation of functions before they are sent to the remote cluster for processing. This enables users to send very complex functions to remote clusters using Bosco. (#1)
  • Updated documentation to correct the wording for bosco.direct mode. (#2)
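
To make the affected case concrete, the sketch below defines a function that calls another user-defined function and lists that dependency with codetools, the package GridR loads on startup. It is not GridR's implementation, and whether GridR uses codetools this way is not claimed here; `user_deps` and `helper` are hypothetical names.

```r
library(codetools)

# Hypothetical helper (not part of GridR): names of the functions that `fun`
# calls and that are defined in the same environment as `fun` itself.
user_deps <- function(fun) {
  called <- findGlobals(fun, merge = FALSE)$functions
  env <- environment(fun)
  Filter(function(name) exists(name, envir = env, inherits = FALSE), called)
}

# `a` depends on the user-defined `helper`, the situation described in #1.
helper <- function(s) s + 1
a <- function(s) helper(s) * 2

user_deps(a)
```

Any function listed this way has to reach the remote cluster along with `a` for the job to run.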

Batch Apply Release

16 Jul 23:00
Pre-release

This release implements the proper batching of apply statements. This means you can parallelize a function by passing a vector to the grid.apply function.

Example

The following example runs a very simple function to illustrate batch apply. The function multiplies a number by 2:

> a<-function(s){return(s*2)}
> library("GridR", lib.loc="/Library/Frameworks/R.framework/Versions/2.15/Resources/library")
Loading required package: codetools
> grid.init(service="bosco.direct", localTmpDir="tmp")
> grid.apply("x", a, c(1:10), batch=c(1))

In this example, c(1:10) creates a vector containing the numbers 1 to 10. Next, grid.apply calls the function a on every element of the vector. It sends each of these function calls to a separate processor on the remote Bosco-connected cluster, parallelizing the execution. After the job is complete, we can access the result in the variable x:

> x
Grid job finished, result written to variable x 
[[1]]
[1] 2

[[2]]
[1] 4

[[3]]
[1] 6

[[4]]
[1] 8

[[5]]
[1] 10
...
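
Before submitting a batch, it can help to run the same function locally. The sketch below uses base R's lapply, on the assumption (suggested by the output above) that the batch result is a list with one element per input; `expected` is a name introduced here for illustration.

```r
# Same function as in the example above.
a <- function(s) { return(s * 2) }

# Local, sequential equivalent: one list element per input value.
expected <- lapply(c(1:10), a)

# Inspect a single element, for comparison with the remote result x[[5]].
expected[[5]]
```

If the function fails locally, it will likely fail on the cluster as well, so this is a cheap check before using remote resources.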

Fixed Issues

  • Fixed CAMPUS-120 - Add proper support for Apply functionality