This is where the benchmarking workflows for APAeval live. The dedicated code for each benchmarking event resides in its own subdirectory.
- Overview
- Benchmarking workflow general description
- HOW TO: (File) naming requirements
- HOW TO: DEVELOP
- HOW TO: "PRODUCTION"
- Origin
APAeval consists of a number of benchmarking events that evaluate how well the methods of interest (= participants) perform different tasks: poly(A) site identification, absolute quantification, relative quantification and assessment of their differential usage (not implemented yet). A method can participate in one or several events, depending on its functions.
Within a benchmarking event, one or more challenges will be performed. A challenge is primarily defined by the ground truth dataset used for performance assessment (see the APAeval Zenodo snapshot). A challenge is evaluated within a benchmarking workflow, which can be run with either docker or singularity, locally or on an HPC infrastructure (currently only a profile for Slurm is included in the APAeval code, but both Snakemake and Nextflow offer various default profiles for submission to common HPCs). The benchmarking workflow will compute all metrics relevant for the benchmarking event. A list of challenge IDs and input files (= output files of one participant for all specified challenges) is passed to the workflow.
In order to compare the performance of participants within a challenge/event, the respective benchmarking workflow will be run on output files from all eligible participant method workflows. The calculated metrics will be written to `.json` files that can either be submitted to OEB for database storage and online visualisation, or transformed into a table format that can be used for creating custom plots with the help of scripts from the APAeval `utils` directory.
In a first step the provided input files are validated. Subsequently, all specified metrics are computed, using the matched ground truth files, if applicable. Finally, the results are gathered in OEB-specific `.json` files per participant.
Based on the created `.json` files, results can be visualized on OEB per challenge, such that the performance of participants can be compared for each metric.
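For the second route (custom plots), below is a minimal sketch of how the result `.json` files could be flattened into a table. It assumes a simplified, hypothetical record layout with `participant_id`, `challenge_id` and a `metrics` mapping; check `JSON_templates/` and the existing scripts in the `utils` directory for the actual data model fields and ready-made converters.

```python
"""Minimal sketch: collect benchmarking results into a flat table.

Assumes a simplified, hypothetical layout in which each assessment entry
exposes 'participant_id', 'challenge_id' and a 'metrics' -> {name: value}
mapping. Check JSON_templates/ and the utils/ scripts for the real schema.
"""
import json
from pathlib import Path

import pandas as pd


def results_to_table(results_dir: str) -> pd.DataFrame:
    rows = []
    for json_file in Path(results_dir).glob("*.json"):
        entries = json.loads(json_file.read_text())
        if isinstance(entries, dict):  # single object instead of a list
            entries = [entries]
        for entry in entries:
            for metric, value in entry.get("metrics", {}).items():
                rows.append(
                    {
                        "participant": entry.get("participant_id"),
                        "challenge": entry.get("challenge_id"),
                        "metric": metric,
                        "value": value,
                    }
                )
    return pd.DataFrame(rows)


if __name__ == "__main__":
    # Hypothetical results directory containing the workflow's .json outputs
    print(results_to_table("results/"))
```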
In order to eventually be compatible with the OEB infrastructure, benchmarking workflows are written in Nextflow and are structured in a predefined manner that will be described in the following sections.
DON'T FREAK OUT IF YOU'RE UNFAMILIAR WITH NEXTFLOW! MOST CHANGES YOU'LL MAKE ARE IN PYTHON! 😉
```
benchmarking_workflows/
|- JSON_templates/
|- [benchmarking_event]/
|  |- main.nf
|  |- nextflow.config
|  |- [participant]_[event].config
|  |- specification/
|  |  |- example_files/
|  |  |- specification.md
|  |- [benchmarking_event]_dockers/
|  |  |- validation/
|  |  |  |- Dockerfile
|  |  |  |- requirements.txt
|  |  |  |- validation.py
|  |  |  |- ...
|  |  |- metrics/
|  |  |  |- Dockerfile
|  |  |  |- requirements.txt
|  |  |  |- compute_metrics.py
|  |  |  |- ...
|  |  |- consolidation/
|  |  |  |- Dockerfile
|  |  |  |- requirements.txt
|  |  |  |- aggregation.py
|  |  |  |- merge_data_model_files.py
|  |  |  |- ...
|- ...
utils/
|- apaeval/
|  |- src/apaeval/main.py
```
Within such a directory we find the `main.nf` and `nextflow.config` files, which specify the workflow and all its event-specific parameters, respectively, as well as a `[participant]_[event].config`, which contains the input file and challenge names for a particular participant. `main.nf` ideally does NOT have to be changed (at least not much) between benchmarking events, as it simply connects the three steps `validation`, `metrics_computation` and `consolidation` inherent to the OEB workflow structure. In contrast, file and tool names have to be adapted in `[participant]_[event].config` for dedicated workflow runs.
ATTENTION: Keep `nextflow.config` unchanged within an event, in order to be able to directly compare the different participant runs.
Within the benchmarking event's directory resides a subdirectory `specification` with a detailed description of required input and output file formats, as well as of the metrics to be calculated for the respective benchmarking event. The actual code is hidden in the directory `[benchmarking_event]_dockers`; for each of the three benchmarking workflow steps required by OEB, a separate docker container will be built:
- Validation
- Metrics calculation
- Consolidation
The "dockers" directories contain Dockerfiles, requirements, and dedicated python scripts. In order to create datasets that are compatible with the Elixir Benchmarking Data Model, the JSON templates in the main benchmarking_workflows
directory are imported in the respective docker containers. The provided python scripts, as well as the module utils/apaeval
they import, are where the action happens: These scripts are where you most likely will have to make adjustments for different benchmarking events.
Challenge IDs have to be of the form `[SAMPLE_NAME].([ADDITIONAL_INFO].)[GENOME]`, where `[SAMPLE_NAME]` is a unique ID of the condition represented in the ground truth and assessed by the participant, as listed in the APAeval Zenodo snapshot. `[ADDITIONAL_INFO]` is optional; it can be used if several ground truths are obtained from the same condition but differ otherwise, e.g. one is a subset of the other. `[GENOME]` is the genome version used for creating the ground truth and MUST contain either "mm" or "hg", e.g. `mm10` or `hg38_v26`. Examples of valid challenge IDs:
- `MmusCortex_adult_R1.TE.mm10`
- `GTEXsim_R19.hg38_v26`
The gold standard file MUST be named in the format `[CHALLENGE].[EXT]`, where `[CHALLENGE]` is specified in `challenges_ids` in `[tool]_[event].config`. The extension `.bed` is hardcoded within `compute_metrics.py`, and `[CHALLENGE]` itself has to be of the format described above.
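For illustration only (directory and variable names are hypothetical), the gold standard path is thus resolved along these lines:

```python
# Illustrative sketch: how a gold standard path is assembled from a challenge
# ID plus the extension that is hardcoded in compute_metrics.py.
import os

challenge_id = "MmusCortex_adult_R1.TE.mm10"   # one entry of challenges_ids
gold_standard_dir = "ground_truths"            # hypothetical directory
gold_standard_file = os.path.join(gold_standard_dir, f"{challenge_id}.bed")
print(gold_standard_file)  # ground_truths/MmusCortex_adult_R1.TE.mm10.bed
```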
Participant outputs MUST contain the exact `[SAMPLE_NAME]` part of the challenge(s) they want to participate in (see requirements above). They have to be of the format `[PARTICIPANT].[SAMPLE_NAME].[EVENT_ID].[EXT]`, where `[PARTICIPANT]` is the unique name of the participant to be tested. If a tool is, for example, run in two different modes, that should be reflected here (like MYTOOL and MYTOOL_SPECIAL). `[EVENT_ID]` is a two-digit code as follows (a naming sanity-check sketch follows the list):
- 01 - Identification
- 02 - Absolute quantification
- 03 - Differential expression
- 04 - Relative quantification
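The following standalone sketch (not part of the workflow) shows one way to sanity-check names against these conventions; the regular expressions simply mirror the rules above and assume, for illustration, that names are built from letters, digits, `_` and `-`.

```python
"""Standalone sketch for sanity-checking APAeval naming conventions.

Not part of the benchmarking workflow; the regular expressions simply mirror
the rules described above.
"""
import re

# [SAMPLE_NAME].([ADDITIONAL_INFO].)[GENOME]; the genome must contain "mm" or "hg"
CHALLENGE_RE = re.compile(
    r"^(?P<sample>[A-Za-z0-9_-]+)\."
    r"(?:(?P<info>[A-Za-z0-9_-]+)\.)?"
    r"(?P<genome>\w*(?:mm|hg)\w*)$"
)

# [PARTICIPANT].[SAMPLE_NAME].[EVENT_ID].[EXT], e.g. MYTOOL.MmusCortex_adult_R1.02.bed
PARTICIPANT_FILE_RE = re.compile(
    r"^(?P<participant>[A-Za-z0-9_-]+)\."
    r"(?P<sample>[A-Za-z0-9_-]+)\."
    r"(?P<event>01|02|03|04)\."
    r"(?P<ext>\w+)$"
)


def check(name: str, pattern: re.Pattern) -> None:
    """Print whether a name matches the given pattern and which parts were found."""
    match = pattern.match(name)
    print(f"{name}: {'OK ' + str(match.groupdict()) if match else 'does NOT match'}")


check("MmusCortex_adult_R1.TE.mm10", CHALLENGE_RE)
check("GTEXsim_R19.hg38_v26", CHALLENGE_RE)
check("MYTOOL.MmusCortex_adult_R1.02.bed", PARTICIPANT_FILE_RE)
```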
Metric names MUST be exactly the same in the respective `compute_metrics.py` and `aggregation_template_X.json` files of a benchmarking workflow. These metric names will then appear in the result `.json` files of the workflow, and will appear on OEB plots after uploading the results there. Examples of metric names:
- `Jaccard_index:10nt`
- `percentage_genes_w_correct_nPAS`
For an example of a benchmarking workflow and further instructions, refer to the quantification benchmarking workflow.
If you have not done so already, copy the whole contents of the `quantification` directory into the directory for your new benchmarking event. Specify the objectives of your event by adapting the contents of `specification/`.
OEB requires all inputs to be validated. To check for correct input file formats for your benchmarking event, adapt the validation in `validation.py` (around line 50). Update the corresponding `requirements.txt`, `constraints.txt` and `Dockerfile` for the installation of additional packages, if necessary.
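As an illustration of the kind of check you might splice in there, here is a hedged sketch of a validator for a 6-column BED participant file; the function name and rules are hypothetical and should be adapted to the formats defined in `specification/`.

```python
"""Hypothetical sketch of an input-format check for validation.py.

The function name and the exact rules are illustrative; adapt them to the
file format specified in specification/specification.md for your event.
"""
import pandas as pd

BED_COLUMNS = ["chrom", "chromStart", "chromEnd", "name", "score", "strand"]


def validate_participant_bed(path: str) -> None:
    """Raise an AssertionError if the participant file is not 6-column BED."""
    df = pd.read_csv(path, sep="\t", header=None, comment="#")
    assert df.shape[1] == len(BED_COLUMNS), (
        f"{path}: expected {len(BED_COLUMNS)} columns, found {df.shape[1]}"
    )
    df.columns = BED_COLUMNS
    assert (df["chromStart"] >= 0).all(), f"{path}: negative start coordinates"
    assert (df["chromEnd"] > df["chromStart"]).all(), f"{path}: end <= start"
    assert df["strand"].isin(["+", "-"]).all(), f"{path}: invalid strand values"
```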
Adapt `compute_metrics.py` to compare the participant output to the community-provided gold standard file(s). You can define custom functions in the `utils/apaeval` module.
NOTE: the extension of the gold standard file is currently hardcoded in `compute_metrics.py` in line 56. Change this according to your gold standard file format.
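For orientation, a metric helper that could live in `utils/apaeval` and be called from `compute_metrics.py` might look roughly like the sketch below. The function and metric names are made up for illustration; the important point is that the dictionary keys under which results are collected must reappear verbatim in the aggregation template.

```python
"""Hypothetical sketch of a metric helper for utils/apaeval.

Function and metric names are illustrative; the keys of the metrics
dictionary assembled in compute_metrics.py must match the metric names
used in the aggregation template of the benchmarking workflow.
"""
import pandas as pd


def matched_site_fraction(participant: pd.DataFrame, gold: pd.DataFrame,
                          window: int = 10) -> float:
    """Fraction of ground-truth sites with a predicted site within `window` nt."""
    matched = 0
    for _, site in gold.iterrows():
        same_chrom = participant[
            (participant["chrom"] == site["chrom"])
            & (participant["strand"] == site["strand"])
        ]
        if ((same_chrom["chromEnd"] - site["chromEnd"]).abs() <= window).any():
            matched += 1
    return matched / len(gold) if len(gold) else float("nan")


# In compute_metrics.py, results would then be collected under the exact
# metric names expected by the aggregation template, e.g.:
# metrics = {"matched_site_fraction:10nt": matched_site_fraction(part_df, gold_df)}
```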
Update the corresponding `requirements.txt`, `constraints.txt` and `Dockerfile` for the installation of additional packages, if applicable.
The JSON outputs from the first two steps will be gathered here, and "aggregation objects" for OEB visualisation will be created based on the `aggregation_template.json`. Thus, this is the file you want to adapt in order to control which metrics are plotted in OEB. You can set visualization types for local plotting in `manage_assessment_data.py`. The current Python scripts have been copied from https://github.com/inab/TCGA_benchmarking_dockers, and only support 2D plots with x and y axes.
Finally, adapt `nextflow.config` and `main.nf`: in the former you'll have to adjust the docker container names and general workflow parameters, whereas in the latter you'll only have to make changes if you have introduced new workflow parameters (or want to change the wiring of the steps, which is not recommended for the sake of attempted OEB compatibility).
Describe the type of validation and metric calculation you perform in the `README.md` in your benchmarking event directory (see the example from the quantification benchmarking workflow).
ATTENTION: the apaeval module is installed inside the containers via a git URL specified in the respective `requirements.txt` (for q_validation and q_metrics). If you made changes to the module, don't forget to push your branch and adjust those URLs accordingly.
After making the necessary changes for your specific event, you will have to build the docker images locally by either of the following two methods:
- Go to the `[X]_dockers/` directory and run `./build.sh <tag_id>` (note: the `tag_id` should match the one in your `nextflow.config`).
- Go to the specific docker directory for each step in `[X]_dockers/` (`[X]_validation/`, `[X]_metrics/`, or `[X]_consolidation/`) and run the following:
docker build . -t apaeval/[X]_[validation OR metrics OR consolidation]:<tag_id>
If you want to update the docker containers, please remove your original images first:
docker image ls #look for the IMAGE_ID of your docker image
docker rmi [IMAGE_ID]
Then, you can rebuild the docker image locally (see above).
After having activated the APAeval conda environment, you can use the following command to run the quantification benchmarking workflow with the provided test files from the command line:
nextflow run main.nf -profile docker -c tool_event.config --participant_id tool1
# Or for running with singularity and slurm:
nextflow run main.nf -profile slurm -c tool_event.config --participant_id tool1
NOTE: Parameters from the `nextflow.config` file are read in addition to the ones specified with the `-c` flag, but the latter will override any parameters of the same name in the `nextflow.config`. Set the `participant_id` directly with `--participant_id TOOL`.
When you have completed the steps described above you can finally run the benchmarking workflow on real data. Below are some hints to help you get going.
Place the participant output into a directory like `DATA/PARTICIPANT_NAME/` and make sure the files are named as described in the section Participant output (=input) files.
You're going to run the workflow for one participant at a time, but you can specify multiple challenges for that participant. To do so, create a participant-specific `[participant]_[event].config` (copy `tool_event.config`). There you'll specify input files and challenge names.
Make sure you have the images appropriate for your system ready. If you're running docker you can use the images you built locally in the HOW TO: DEVELOP section. If you want to use singularity, you'll first have to push those images to a publicly accessible repo, ideally biocontainers. Make sure to rename the images (see bash command below) and adjust the paths in the `nextflow.config` accordingly.
docker tag apaeval/q_consolidation:1.0 your_docker_repo/q_consolidation:1.0
docker push your_docker_repo/q_consolidation:1.0
ATTENTION: always make sure you're using up-to-date versions of the images. More specifically: DO make sure you have removed old local images and/or cleared singularity caches on your system, and have checked the image versions specified in your `nextflow.config`.
After having activated the APAeval conda environment, from the root directory of your benchmarking event, you can use the following command to run a benchmarking workflow:
nextflow -bg run main.nf -profile docker -c [TOOL]_[EVENT].config --participant_id [TOOL] >> stdout_err_[TOOL]_[EVENT].log 2>&1
# Or for running with singularity and slurm:
nextflow -bg run main.nf -profile slurm -c [TOOL]_[EVENT].config --participant_id [TOOL] >> stdout_err_[TOOL]_[EVENT].log 2>&1
The APAeval OEB benchmarking workflow is an adaptation of the TCGA_benchmarking_workflow with the corresponding docker declarations. The structure of output files is compatible with the ELIXIR Benchmarking Data Model. The current version of the workflow is not compatible with the OEB VRE setup; however, only minor changes should be needed to re-establish compatibility.