WARDEN

The WARDEN (Workflow for the Analysis of RNA-Seq Differential ExpressioN) software uses RNA-Seq sequence files to perform alignment, coverage analysis, gene counts and differential expression analysis.

The WARDEN workflow has three entry points. The start-to-end workflow begins with FastQ files, which are aligned by STAR. Alternatively, WARDEN can be entered after the alignment stage with user-aligned RNA-Seq BAMs. Aligned BAMs are then run through HTSeq-count to determine the number of reads mapping to features. Finally, WARDEN can be entered with user-derived count files at the last stage, where differential expression analysis is performed on the defined cohorts.
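As a rough illustration of how the three entry points relate, the sketch below maps each one to the stages that run after it. The stage names are hypothetical labels chosen for this example, not WARDEN's actual applet names, and the optional coverage step is left out.

```python
# Hypothetical stage labels for illustration; not WARDEN's real applet names.
STAGES = ["star_alignment", "htseq_count", "differential_expression"]

# Each entry point (the app directory names used in this repo) starts at one stage.
ENTRY_POINTS = {
    "stjude_warden_fastq": "star_alignment",
    "stjude_warden_bam": "htseq_count",
    "stjude_warden_counts": "differential_expression",
}


def stages_for(entry_point):
    """Return the stages that run when the workflow is entered at entry_point."""
    start = STAGES.index(ENTRY_POINTS[entry_point])
    return STAGES[start:]


if __name__ == "__main__":
    for name in ENTRY_POINTS:
        print(name, "->", ", ".join(stages_for(name)))
```

Entering later in the workflow simply skips the earlier stages, which is why the three apps share so much code.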

For the full usage documentation, visit the St. Jude Cloud University.

Workflow Steps

FastQ files generated by RNA-Seq are mapped to a reference genome using STAR.
HTSeq-count is used to assign mapped reads to features (default feature is gene).
Differential expression analysis is performed using VOOM normalization of counts and LIMMA analysis.
Coverage plots of mapped reads are optionally generated as interactive visualizations.
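To give a feel for what the VOOM step does to the counts, here is a minimal sketch of the log-CPM transformation that voom applies before estimating precision weights. This is an illustration only, not WARDEN's code; the function name `log_cpm` is made up for this example, and the weighting and LIMMA model fitting are not shown.

```python
import math


def log_cpm(counts):
    """Transform one sample's per-gene read counts to log2 counts per million.

    Uses the offsets from the voom transformation: 0.5 is added to each count
    and 1 to the library size so that zero counts remain finite.
    """
    library_size = sum(counts)
    return [math.log2((c + 0.5) / (library_size + 1.0) * 1e6) for c in counts]


if __name__ == "__main__":
    # Made-up counts for three genes in a single sample.
    print(log_cpm([0, 9, 990]))
```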

Architecture

WARDEN's three entry points exist as their own apps in the directories stjude_warden_fastq, stjude_warden_bam, and stjude_warden_counts. Each app's resources/app_data/internal_source/ directory contains the source code for DNAnexus applets, which are built dynamically when the main app runs. Those applets are linked together by each app's resources/usr/bin/create_workflow.py script to create a workflow, which is then built and run by the main app.

There is a very large amount of code duplication between these three main directories because the dx build process can't handle symlinks or imports. CI has been built to ensure that files which should be exact copies of each other are. This has weak points: there is a large amount of duplication in the create_workflow.py, warden.sh, and dxapp.json files, which are not exact copies, so shared updates must be kept in sync manually. Similarly, the subapplet warden_genome_coverage_bed differs slightly between stjude_warden_fastq and stjude_warden_bam and also requires manual maintenance. While developing for this repo, project-wide "find and replace" is your friend.
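When working locally, a quick check along the lines of the CI can help catch copies that have drifted apart. The following is a minimal sketch, not the repository's actual CI script: the helper names `file_digest` and `find_mismatches` are hypothetical, and which relative paths are supposed to be identical is left to the caller.

```python
import hashlib
from pathlib import Path

# The three app directories named in this README.
APP_DIRS = ["stjude_warden_fastq", "stjude_warden_bam", "stjude_warden_counts"]


def file_digest(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def find_mismatches(repo_root, relative_path, app_dirs=APP_DIRS):
    """Return the app dirs whose copy of relative_path differs from the first copy found.

    App directories that do not contain the file are skipped, since
    stjude_warden_counts holds only a subset of the other apps' files.
    """
    digests = {}
    for app in app_dirs:
        candidate = Path(repo_root) / app / relative_path
        if candidate.is_file():
            digests[app] = file_digest(candidate)
    if not digests:
        return []
    reference = next(iter(digests.values()))
    return [app for app, digest in digests.items() if digest != reference]
```

Calling `find_mismatches(".", some_relative_path)` from the repository root returns an empty list when every existing copy matches.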

For the most part, stjude_warden_counts is a subset of the code in stjude_warden_bam, which in turn is a subset of the code in stjude_warden_fastq.
