Cromwell+WDL Boilerplate

Outline

Setup
Hello World!
Structure
Naming Conventions
Modular Design
Development Steps
Your First Workflow
Testing
Known Issues

Setup

The boilerplate is a part of SCING (Single-Cell pIpeliNe Garden; pronounced "sing", /siŋ/). For setup, please refer to this page. All the instructions below assume that you have already configured SCING and a JRE or JDK in your environment.

However, with small changes in the instructions and code, you should be able to use this boilerplate for any Cromwell/WDL-based workflow system.

Hello World!

The boilerplate comes with an example workflow called HelloWorld. Let's run this workflow first on your workflow system to verify your environment is ready.

Download the boilerplate and extract it to a new directory called wdl-HelloWorld:

wget https://github.com/hisplan/wdl-boilerplate/archive/refs/tags/v0.0.13.tar.gz -O wdl-boilerplate.tar.gz
mkdir -p wdl-HelloWorld && tar xvzf wdl-boilerplate.tar.gz -C wdl-HelloWorld --strip-components 1

cd wdl-HelloWorld

Open configs/HelloWorld.inputs.json and change the HelloWorld.name value to your real name (e.g. Jaeyoung):

{
    "HelloWorld.name": "Jaeyoung"
}

Open configs/HelloWorld.labels.aws.json and change the destination value to s3://dp-lab-gwf-core/outputs/HelloWorld/$NAME where $NAME should be replaced with your real name (e.g. Jaeyoung):

{
    "pipelineType": "HelloWorld",
    "project": "Test",
    "sample": "Test",
    "owner": "chunj",
    "destination": "s3://dp-lab-gwf-core/outputs/HelloWorld/Jaeyoung/",
    "transfer": "-",
    "comment": ""
}
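If you prefer not to edit the file by hand, the destination can be filled in from a shell variable. The following is a minimal sketch, not part of the boilerplate: it works on a scratch copy created with mktemp, and the sed expression assumes the "destination" key sits on a single line, as in the example above.

```shell
#!/bin/sh
# Sketch: fill in the "destination" label from a NAME variable.
# It edits a scratch copy, so the real config file is left untouched.
NAME="Jaeyoung"   # replace with your own name

workdir="$(mktemp -d)"
labels="$workdir/HelloWorld.labels.aws.json"

# Scratch copy with a placeholder destination (same keys as the file above).
cat > "$labels" <<'EOF'
{
    "pipelineType": "HelloWorld",
    "project": "Test",
    "sample": "Test",
    "owner": "chunj",
    "destination": "s3://dp-lab-gwf-core/outputs/HelloWorld/$NAME",
    "transfer": "-",
    "comment": ""
}
EOF

# Replace whatever the destination currently is with the per-user path.
# -i.bak keeps a backup and is accepted by both GNU and BSD sed.
sed -i.bak -E \
    "s|(\"destination\": \")[^\"]*\"|\1s3://dp-lab-gwf-core/outputs/HelloWorld/${NAME}/\"|" \
    "$labels"

# Show the updated line.
grep '"destination"' "$labels"
```

To apply the same substitution to the real file, point `labels` at configs/HelloWorld.labels.aws.json instead of the scratch copy.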

Activate the scing conda environment:

conda activate scing

Submit a HelloWorld job to the workflow system:

./submit.sh \
    -k ~/keys/cromwell-secrets.json \
    -i ./configs/HelloWorld.inputs.json \
    -l ./configs/HelloWorld.labels.aws.json \
    -o HelloWorld.options.aws.json

where cromwell-secrets.json is your secrets file that contains your credentials and server address.
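Before submitting, it can help to confirm that every file passed to submit.sh actually exists. The sketch below is a hypothetical helper (`check_files` is not part of the boilerplate); it only checks for the presence of the paths used in the command above and makes no assumption about their contents.

```shell
#!/bin/sh
# check_files: report each missing path on stderr and
# return non-zero if at least one path is missing.
check_files() {
    missing=0
    for f in "$@"; do
        if [ ! -f "$f" ]; then
            echo "missing: $f" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example with the paths used in the submit command above.
if check_files \
    "$HOME/keys/cromwell-secrets.json" \
    ./configs/HelloWorld.inputs.json \
    ./configs/HelloWorld.labels.aws.json \
    ./HelloWorld.options.aws.json
then
    echo "all files present; ready to run ./submit.sh"
else
    echo "fix the paths listed above before submitting"
fi
```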

Structure

.
├── configs
│   ├── HelloWorld.inputs.json
│   ├── HelloWorld.labels.aws.json
│   ├── HelloWorld.labels.gcp.json
│   ├── template.inputs.json
│   └── template.labels.json
├── modules
│   └── Greeter.wdl
├── tests
│   ├── run-all-tests.sh
│   ├── run-test.sh
│   ├── test.Greeter.inputs.json
│   ├── test.Greeter.wdl
│   ├── test.labels.json
│   ├── validate.sh
│   └── zip-deps.sh
├── HelloWorld.deps.zip
├── HelloWorld.options.aws.json
├── HelloWorld.options.gcp.json
├── HelloWorld.wdl
├── README.md
├── init.sh
├── make-deployable.sh
├── submit.sh
└── validate.sh

| File/Directory | Description |
| --- | --- |
| `configs` | Directory where job configurations should be placed |
| `modules` | Directory where subworkflows should be placed |
| `tests` | Directory where tests for subworkflows should be placed |
| `HelloWorld.wdl` | Main workflow |
| `HelloWorld.deps.zip` | Packaged/compressed subworkflows (`modules/*`) |
| `submit.sh` | Script for submitting a job to the workflow system |

Naming Conventions

Use Pascal case for the main workflow name, subworkflow names, task names, and file names (e.g. HelloWorld).
Use camel case for variables (e.g. helloWorld).
Use the suffixes .inputs.json and .labels.json for job configuration files.
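The conventions above can be checked mechanically. The helpers below are a loose sketch with hypothetical names; the regular expressions only test the first letter and the allowed character set, and the platform-qualified pattern (e.g. .labels.aws.json) is inferred from the file names in the Structure section.

```shell
#!/bin/sh
# is_pascal_case NAME: starts with an uppercase letter, then letters/digits only.
is_pascal_case() {
    printf '%s\n' "$1" | LC_ALL=C grep -Eq '^[A-Z][A-Za-z0-9]*$'
}

# is_camel_case NAME: starts with a lowercase letter, then letters/digits only.
is_camel_case() {
    printf '%s\n' "$1" | LC_ALL=C grep -Eq '^[a-z][A-Za-z0-9]*$'
}

# has_config_suffix FILE: job configuration suffixes, including the
# platform-qualified label files such as HelloWorld.labels.aws.json.
has_config_suffix() {
    case "$1" in
        *.inputs.json|*.labels.json|*.labels.*.json) return 0 ;;
        *) return 1 ;;
    esac
}

# Classify a few names.
for name in HelloWorld helloWorld Greeter; do
    if is_pascal_case "$name"; then
        echo "$name: Pascal case"
    elif is_camel_case "$name"; then
        echo "$name: camel case"
    else
        echo "$name: neither"
    fi
done
```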

Modular Design

The boilerplate comes with the HelloWorld example which takes your name as input and outputs your name 1) as a string and 2) as a file. HelloWorld.wdl is the main workflow. ./modules/Greeter.wdl is the subworkflow. You can add additional subworkflows under the modules directory and call them from your main workflow (e.g. HelloWorld.wdl).

When you finish writing your subworkflows, you must run tests/zip-deps.sh, which packages all of them into a single deployable file (HelloWorld.deps.zip).
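Because the packaged file has to be regenerated whenever a subworkflow changes, a quick staleness check can catch a forgotten repackaging step. This is a sketch with a hypothetical helper name (`deps_are_stale`); it compares modification times only and does not inspect the contents of the zip file.

```shell
#!/bin/sh
# deps_are_stale ZIP MODULE_DIR
# Succeeds (returns 0) if ZIP does not exist, or if any .wdl file
# under MODULE_DIR was modified more recently than ZIP.
deps_are_stale() {
    zip_file="$1"
    module_dir="$2"
    [ -f "$zip_file" ] || return 0
    newer="$(find "$module_dir" -name '*.wdl' -newer "$zip_file" 2>/dev/null)"
    [ -n "$newer" ]
}

# Run from the repository root.
if deps_are_stale HelloWorld.deps.zip modules; then
    echo "HelloWorld.deps.zip is missing or out of date; re-run tests/zip-deps.sh"
fi
```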

Development Steps

Write subworkflows (under the modules directory)
Create a test workflow that can test your subworkflows (under the tests directory)
Validate each subworkflow (tests/validate.sh)
Test each subworkflow by actually running it on the workflow system (tests/run-test.sh)
Package subworkflows into a deployable file (tests/zip-deps.sh)
Write the main workflow
Validate the main workflow
Create a job input/label file (under the configs directory)

Your First Workflow

Before you do anything else, you should rename HelloWorld to something that fits your pipeline. For example, if you are building a Cell Hashing pipeline, you probably want to replace the name HelloWorld with CellHashing.

These are the files to be updated:

./validate.sh
./tests/zip-deps.sh
./tests/run-test.sh
./submit.sh
./HelloWorld.wdl
./configs/HelloWorld.labels.gcp.json
./configs/HelloWorld.labels.aws.json
./configs/HelloWorld.inputs.json

You should also rename the files themselves (e.g. HelloWorld.wdl to CellHashing.wdl).

Renaming by hand is tedious, so you can try the auto-rename tool (experimental):

./init.sh -n CellHashing

Without the -e flag, it runs in test mode (i.e. a dry run).
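To see what a manual rename involves, the sketch below lists the files that mention the old name and the file names that would change. It is an illustration only, not how init.sh is implemented; `plan_rename` is a hypothetical helper, and it runs against a small scratch tree so that nothing in your working copy is touched.

```shell
#!/bin/sh
# plan_rename ROOT OLD NEW
# Prints the files under ROOT whose contents mention OLD, then the
# files whose names contain OLD together with the proposed new path.
# Nothing is modified.
plan_rename() {
    root="$1"; old="$2"; new="$3"
    (
        cd "$root" || exit 1
        echo "would edit:"
        grep -rl -- "$old" . | sort
        echo "would rename:"
        find . -type f -name "*${old}*" | sort | while read -r path; do
            dir="$(dirname "$path")"
            base="$(basename "$path")"
            echo "$path -> $dir/$(printf '%s\n' "$base" | sed "s/${old}/${new}/g")"
        done
    )
}

# Scratch tree that mimics a few of the files listed above.
demo="$(mktemp -d)"
mkdir -p "$demo/configs"
echo 'workflow HelloWorld {}' > "$demo/HelloWorld.wdl"
echo '{ "HelloWorld.name": "Jaeyoung" }' > "$demo/configs/HelloWorld.inputs.json"
echo 'unrelated' > "$demo/README.md"

plan_rename "$demo" HelloWorld CellHashing
```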

Testing

This is not yet designed for unit testing (TBD). Instead, it lets you verify that your WDL files are syntactically and semantically valid.

cd tests
./validate.sh

If you have added new subworkflows, make sure to include them in validate.sh before running it:

modules="MyNewSubWorkflow Greeter"

You can also run your subworkflow(s) on the workflow system:

cd tests
./run-all-tests.sh -k ~/keys/cromwell-secrets.json

Again, if you have added new subworkflows, make sure to include them in run-all-tests.sh before running it:

modules="MyNewSubWorkflow Greeter"

You can also run an individual subworkflow separately:

cd tests
./run-test.sh -k ~/keys/cromwell-secrets.json -m Greeter
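The space-separated modules list and the per-module runner fit together naturally. The sketch below shows how such a list is typically iterated in shell; whether run-all-tests.sh does exactly this is an assumption. It only prints the commands (a dry run), using the -k and -m flags shown above.

```shell
#!/bin/sh
# Same format as the modules variable in validate.sh and run-all-tests.sh.
modules="MyNewSubWorkflow Greeter"
secrets="$HOME/keys/cromwell-secrets.json"

# print_test_commands: echo one run-test.sh invocation per module.
# Nothing is submitted; remove the echo to actually run each test.
print_test_commands() {
    # $modules is left unquoted on purpose so the shell splits it on whitespace.
    for module in $modules; do
        echo "./run-test.sh -k $secrets -m $module"
    done
}

print_test_commands
```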

Known Issues

The following three options (in HelloWorld.options.*.json) do not currently work:

{
    "final_workflow_outputs_dir": "s3://dp-lab-batch/cromwell-execution/_outputs/HelloWorld/results",
    "final_workflow_log_dir": "s3://dp-lab-batch/cromwell-execution/_outputs/HelloWorld/workflow-logs",
    "final_call_logs_dir": "s3://dp-lab-batch/cromwell-execution/_outputs/HelloWorld/call-logs"
}
