shinybatch

This package provides a simple framework to create, automatically launch and retrieve time-consuming operations (tasks) in batch mode from a Shiny app.

Tasks are launched automatically by a scheduler, i.e. a timer that periodically launches a (batch) operation. The scheduler can be set up:

with cron on Linux/Mac, using the cronR package,
with the Windows Task Scheduler on Windows, using the taskscheduleR package,
or directly with OS tools.

Installation

devtools::install_github("datastorm-open/shinybatch")

Demo application

runApp(system.file("demo_app", package = "shinybatch"))

Main functions

configure_task : create a .yml file filled with the operation parameters (fun path, fun args, priority, ...),
run_task : run a selected task from a .yml file. Used by launcher,
launcher : select and run the task(s) with the highest priority, calling run_task in batch mode with the Rscript command,
scheduler_init : (optional) create the R script to be launched by the scheduler,
scheduler_add : (optional) create the cron file (and, by default, the R script), then start the scheduler, which reads the file at the given frequency to launch batch operations,
configure_task_server : define a task and call configure_task() in the Shiny app,
tasks_overview_server : display tasks & retrieve results within the Shiny app.

Definition of a task

configure_task()

A task is defined by its .yml file, which contains the following information:

run_info:
  date_creation: 2020-04-24 15:21:00
  date_start: N/A
  date_end: N/A
  priority: 1.0
  status: waiting
descriptive:
  title: my_title
  description: my_descr
function:
  path: /path/to/my_fun
  name: my_fun_name
args:
  x: 1.0
  'y':
    _path: /path/to/task/dir/inputs/y.RDS
  z:
    _path: /path/to/task/dir/inputs/z.RDS
dir: /path/to/task/dir/

The run_info part contains general information about the task.

Priority can be any number, with 0 as the default. The higher the priority, the sooner the task is launched.

Valid statuses are:

waiting
running
finished
timeout
error

The descriptive part contains free-form information given by the user. The title and description fields are only examples.

The function part contains the location of the function's R script (for sourcing) and the function's name (for calling). The script must provide all the resources (packages, variables, functions) needed to execute the main function:

# Load package(s) (if needed)
require(data.table)

# source script(s) (if needed)
source("/path/to/script")

# Load data (if needed)
data <- readRDS("/path/to/data.RDS")

# Define main function (needed !)
my_fun_name <- function(x, y, z){
  ...
}

The args part contains either the argument itself, if it is of length 1, or the location of the argument (in dir_conf/inputs/arg_name.RDS). Complex arguments are stored as .RDS files.
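The storage rule above can be sketched in base R. This is only an illustration under stated assumptions: split_args is a hypothetical helper, not a shinybatch function, and the package's actual implementation may differ. It keeps length-1 values inline and writes everything else to inputs/arg_name.RDS, recording the location under a _path key as in the .yml example.

```r
# Hypothetical helper (not part of shinybatch): mimics the storage rule described above.
split_args <- function(fun_args, dir_conf) {
  dir_inputs <- file.path(dir_conf, "inputs")
  dir.create(dir_inputs, recursive = TRUE, showWarnings = FALSE)

  arg_names <- stats::setNames(names(fun_args), names(fun_args))
  lapply(arg_names, function(arg_name) {
    value <- fun_args[[arg_name]]
    if (is.atomic(value) && length(value) == 1) {
      # Simple argument: kept as-is, to be written directly in the .yml file
      value
    } else {
      # Complex argument: saved as .RDS, only its location is kept
      path <- file.path(dir_inputs, paste0(arg_name, ".RDS"))
      saveRDS(value, path)
      list(`_path` = path)
    }
  })
}

dir_conf <- file.path(tempdir(), "task_sketch")
stored <- split_args(list(x = 1, y = 0:4, z = iris), dir_conf)
str(stored, max.level = 2)
```

Reading an argument back is then a matter of checking for the _path entry and calling readRDS() on it.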

The dir field contains the location of the directory in which the conf.yml file is stored.

When a task has been successfully run, some fields are updated:

date_start and date_end are filled,
status is set to 'running', then to 'finished'.

Some outputs are created:

if wanted, the result of the task, in other words the output of the main function (in dir_conf/output/res.RDS)
the log of the run (in dir_conf/output/log_run.txt)
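The life cycle described above (status updates, result file, log file) can be sketched in base R as follows. This is a minimal sketch, not the code of run_task(): run_sketch is a hypothetical function, run_info is a plain list standing in for the .yml section, and the [ERROR] log level is an assumption (the example log in this README only shows [INFO] and [WARN]).

```r
# Hypothetical sketch of one run (not shinybatch's run_task implementation).
run_sketch <- function(fun, args, run_info, dir_output) {
  dir.create(dir_output, recursive = TRUE, showWarnings = FALSE)
  log_file <- file.path(dir_output, "log_run.txt")
  now <- function() format(Sys.time(), "%Y-%m-%d %H:%M:%S")
  log_line <- function(level, text) {
    cat(sprintf("[%s] [%s] %s\n", now(), level, text), file = log_file, append = TRUE)
  }

  run_info$date_start <- now()
  run_info$status <- "running"
  log_line("INFO", "Starting task execution...")

  res <- withCallingHandlers(
    tryCatch(do.call(fun, args), error = function(e) {
      log_line("ERROR", conditionMessage(e))  # assumed log level
      run_info$status <<- "error"
      NULL
    }),
    message = function(m) {
      log_line("INFO", trimws(conditionMessage(m)))
      invokeRestart("muffleMessage")
    },
    warning = function(w) {
      log_line("WARN", conditionMessage(w))
      invokeRestart("muffleWarning")
    }
  )

  if (run_info$status == "running") {
    saveRDS(res, file.path(dir_output, "res.RDS"))
    run_info$status <- "finished"
  }
  run_info$date_end <- now()
  log_line("INFO", "... task terminated.")
  run_info
}

out <- file.path(tempdir(), "run_sketch_output")
demo_fun <- function(x, y) {
  message("Running !")
  warning("Something to report", call. = FALSE)
  x + y
}
info <- run_sketch(demo_fun, list(x = 1, y = 0:4), list(status = "waiting"), out)
info$status
readLines(file.path(out, "log_run.txt"))
```

Messages and warnings are intercepted with calling handlers so that the function keeps running after they are logged, while an error stops the run and is recorded in the status.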

Description of the launcher

launcher()

The launcher retrieves all the tasks in a main directory and builds a table of their run_info. Based on this table, it checks whether some tasks have a status that allows a run, i.e. any status not listed in ignore_status. Then, if the maximum number of simultaneously running tasks has not been reached, it launches new tasks according to their priority, in batch mode with the Rscript command (calling run_task()).

The task with the highest priority is defined as the one:

with a status not in ignore_status (default is all but waiting),
which has the highest priority,
and then the oldest creation date.
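The selection rule can be illustrated with a small base R sketch. It is not the code of launcher(): select_tasks and the example table are hypothetical, the column names are borrowed from the .yml fields shown earlier, and the ignore_status default is the one used in the scheduler script.

```r
# Hypothetical run_info table; column names follow the .yml fields shown earlier.
tasks <- data.frame(
  dir = c("task_a", "task_b", "task_c", "task_d"),
  priority = c(1, 2, 2, 5),
  status = c("waiting", "waiting", "waiting", "finished"),
  date_creation = as.POSIXct(
    c("2020-04-24 15:21:00", "2020-04-24 15:25:00",
      "2020-04-24 15:23:00", "2020-04-24 15:20:00"),
    tz = "UTC"
  ),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not shinybatch's launcher): returns the tasks to launch now.
select_tasks <- function(tasks, max_runs = 1,
                         ignore_status = c("running", "finished", "error")) {
  # Slots still available given the tasks already running
  n_free <- max_runs - sum(tasks$status == "running")
  eligible <- tasks[!tasks$status %in% ignore_status, , drop = FALSE]
  if (n_free <= 0 || nrow(eligible) == 0) {
    return(eligible[0, , drop = FALSE])
  }
  # Highest priority first, then oldest creation date
  ord <- order(-eligible$priority, eligible$date_creation)
  head(eligible[ord, , drop = FALSE], n_free)
}

select_tasks(tasks, max_runs = 2)$dir
# [1] "task_c" "task_b"
```

Here task_d is skipped despite its priority because it is already finished, and task_c goes before task_b because both have priority 2 and task_c was created earlier.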

Description of the scheduler

Before calling the scheduler, we first need to create the file that it will launch. We can use scheduler_init(). By default, it looks like this (scheduler_script.R) :

#!/usr/bin/env Rscript
args = commandArgs(trailingOnly = TRUE)

shinybatch::launcher(dir_path = '/path/to/main_directory/',
                     max_runs = 1,
                     ignore_status = c('running','finished','error'),
                     delay_reruns = TRUE
)

... but the header lines can be customized through the head_rows argument.

Once the file has been created, the cron job has to be defined. You can use cron directly on Linux, the Task Scheduler on Windows, or the scheduler_add function. The default command is:

Rscript /path/to/scheduler_script.R

N.B.: by default, scheduler_add creates the R scheduler script using scheduler_init.
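For readers who define the cron job by hand, the sketch below builds a crontab entry around the default command. It is an illustration only: cron_line is a hypothetical helper, and the every-minute schedule is an assumption, not a documented default of scheduler_add.

```r
# Hypothetical helper: builds a standard five-field crontab entry
# (minute hour day-of-month month day-of-week) for the scheduler script.
cron_line <- function(script_path, minute = "*", hour = "*") {
  sprintf("%s %s * * * Rscript %s", minute, hour, shQuote(script_path, type = "sh"))
}

# Assumed schedule: every minute
cron_line("/path/to/scheduler_script.R")

# Assumed schedule: at minute 0 of every hour
cron_line("/path/to/scheduler_script.R", minute = "0")
```

The resulting line can be added with crontab -e on Linux/Mac. On Windows, the same Rscript command would be registered in the Task Scheduler instead.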

Example

sb_fun_ex.R

sb_fun_ex <- function(x, y, z) {
  res <- x + y
  message('Running !')
  warning("Complex (or not) variable z is not used...!", call. = FALSE)
  res
}

Configure a task

require(shinybatch)

?configure_task

# create temporary directory
dir <- tempdir()

# create and save conf
conf <- configure_task(
  dir_path = dir,
  conf_descr = list(
    title = "my_title",
    description = "my_descr"
  ),
  fun_path = system.file("ex_fun/sb_fun_ex.R", package = "shinybatch"), # as an example,
  fun_name = "sb_fun_ex",
  fun_args = list(
    x = 1,
    y = 0:4,
    z = iris
  ),
  priority = 1)

# check results
list.files(conf$dir, recursive = T)
# [1] "conf.yml"     "inputs/y.RDS" "inputs/z.RDS"

read_conf <- yaml::read_yaml(paste0(conf$dir, "conf.yml"))

read_conf$run_info
# $date_creation
# [1] "2021-03-02 16:54:24"
# 
# $date_start
# [1] "N/A"
# 
# $date_end
# [1] "N/A"
# 
# $priority
# [1] 1
# 
# $status
# [1] "waiting"

read_conf$args$x
# [1] 1

y <- readRDS(paste0(conf$dir, "inputs/y.RDS"))
# [1] 0 1 2 3 4

z <- readRDS(paste0(conf$dir, "inputs/z.RDS"))

Run one given task (for demo/test)

?run_task

run_task(paste0(conf$dir, "conf.yml"))

# catch results
list.files(conf$dir, recursive = T)
# output directory with log & result
# [1] "conf.yml"                              "inputs/y.RDS"                         
# [3] "inputs/z.RDS"                          "output/log_run_20210302_1656_1080.txt"
# [5] "output/res.RDS" 

conf_update <- yaml::read_yaml(paste0(conf$dir, "conf.yml"))
conf_update$run_info
# $date_creation
# [1] "2021-03-02 16:54:24"
# 
# $date_start
# [1] "2021-03-02 16:56:10"
# 
# $date_end
# [1] "2021-03-02 16:56:10"
# 
# $priority
# [1] 1
# 
# $status
# [1] "finished"


output <- readRDS(paste0(conf$dir, "output/res.RDS"))
#[1] 1 2 3 4 5

log <- read.delim(list.files(paste0(conf$dir, "output/"), pattern = "log_run", full.names = T), header = F)
# [2021-03-02 16:56:10] [INFO] Starting task execution...
# [2021-03-02 16:56:10] [INFO] Running !
# [2021-03-02 16:56:10] [WARN] Complex (or not) variable z is not used...!
# [2021-03-02 16:56:10] [INFO] ... task terminated.

Use scheduler to launch the tasks

?scheduler_add

# create temporary directory for conf
dir_conf <- paste0(tempdir(), "/conf/")
dir.create(dir_conf, recursive = T)

# create 2 confs
conf_1 <- configure_task(
  dir_path = dir_conf,
  conf_descr = list(
    title_1 = "my_title_1",
    description_1 = "my_descr_1"
  ),
  fun_path = system.file("ex_fun/sb_fun_ex.R", package = "shinybatch"), # as an example,
  fun_name = "sb_fun_ex",
  fun_args = list(
    x = 0, 
    y = 0:4,  
    z = iris
  ),
  priority = 1)
  
conf_2 <- configure_task(
  dir_path = dir_conf,
  conf_descr = list(
    title_1 = "my_title_2",
    description_1 = "my_descr_2"
  ),
  fun_path = system.file("ex_fun/sb_fun_ex.R", package = "shinybatch"), # as an example,
  fun_name = "sb_fun_ex",
  fun_args = list(
    x = 0, 
    y = 0:4,  
    z = iris
  ),
  priority = 2)

# on LINUX -> Needs cronR package
# on Windows -> Needs taskscheduleR package

scheduler_add(
  dir_scheduler = tempdir(),
  dir_conf = dir_conf,
  max_runs = 1,
  create_file = T,
  head_rows = NULL, 
  taskname = "cron_script_ex"
)
           
scheduler_ls() # display existing crons

# wait up to 1 min for conf_2 and up to 2 mins for conf_1
yaml::read_yaml(paste0(conf_1$dir, "/conf.yml"))$run_info$status
yaml::read_yaml(paste0(conf_2$dir, "/conf.yml"))$run_info$status

scheduler_rm(id = "cron_script_ex") # kill selected cron

Shiny modules

These modules contain the basic framework to use all the previous functions in a Shiny app. Both are used in the demo app, which presents a simple use case.

Configure a new task

call:

?configure_task_server

# ui : just create an actionButton
actionButton("go_task", "Configure the task !")

# server
# call module to configure a task
# connect app inputs to the module
callModule(configure_task_server, "my_id_1",
           btn = reactive(input$go_task),
           dir_path = dir_conf,
           conf_descr = reactive(
            list(
              "title" = input$title,
              "description" = input$description
            )
           ),
           fun_path = paste0(dir_fun, "/fun_script.R"),
           fun_name = "my_fun",
           fun_args = reactive(
            list(
              n = input$fun_nb_points,
              mean = input$fun_mean,
              sd = input$fun_sd,
              sleep = input$sleep_time
            )
           ),
           priority = reactive(input$priority)
)

Display configured tasks

call:

?tasks_overview_UI

# ui
tasks_overview_UI("my_id_2")

# server
# call module to view tasks
sel_task <- callModule(
    tasks_overview_server, "my_id_2",
    dir_path = dir_conf,
    allowed_status = c("waiting", "running", "finished", "timeout", "error"),
    allowed_run_info_cols = NULL,
    allowed_function_cols = NULL,
    allow_descr = T,
    allow_args = T
)

This module returns :

the status of the selected line (one run) of the summary table,
the path to the directory in which its output is stored.

Thus we know when a run is finished, and we can load its result to reuse or display it: readRDS(paste0(path, "/res.RDS")).
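A guarded version of this loading step might look like the sketch below. load_result is a hypothetical helper, and status and path simply stand for the two values returned by the module. How they are extracted from sel_task in a real app is not shown here.

```r
# Hypothetical helper: returns the result of a run, or NULL if it is not available yet.
load_result <- function(status, path) {
  res_file <- file.path(path, "res.RDS")
  if (!identical(status, "finished") || !file.exists(res_file)) {
    return(NULL)
  }
  readRDS(res_file)
}

# Simulated output directory of a finished run
dir_output <- file.path(tempdir(), "load_result_sketch")
dir.create(dir_output, recursive = TRUE, showWarnings = FALSE)
saveRDS(1:5, file.path(dir_output, "res.RDS"))

load_result("running", dir_output)   # NULL: the run is not finished
load_result("finished", dir_output)  # the stored result
```

Returning NULL for unfinished runs makes it easy to combine the helper with shiny::req() inside a render function.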

Demo app

A demo app to create and automatically launch an example task: the generation of normally distributed observations.

global : the path to the confs directory, the path to the script of the function to be run, the call to scheduler_add();
ui : Shiny inputs (description args for the conf; parameters for the function to be called by the cron);
server : a renderPlot (a graph of the data created in a run).

As a credible use case, the results of the runs are retrieved and can be displayed.

runApp(system.file("demo_app", package = "shinybatch"))
