SnakeMake pipeline for ichorCNA
This workflow will run the ichorCNA pipeline starting from the BAM files and generating ichorCNA outputs.
- R-3.3
- ichorCNA
- HMMcopy
- optparse
- Python 3.4
- snakemake-3.12.0
- PyYAML-3.12
- HMMcopy Suite (http://compbio.bccrc.ca/software/hmmcopy/). In particular, readCounter is used.
- readCounter (C++ executable; HMMcopy Suite)
- runIchorCNA.R
The list of cfDNA samples should be defined in a YAML file. See config/samples.yaml for an example. The field samples must be provided.
samples:
  tumor_sample_1: /path/to/bam/tumor.bam
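As a rough illustration of the requirement above, the following sketch checks a parsed sample list before the workflow is launched. The function name validate_samples and the .bam suffix check are hypothetical and not part of the workflow; the dictionary stands in for what a YAML parser would return for config/samples.yaml.

```python
def validate_samples(config):
    """Return the sample-to-BAM mapping, or raise ValueError if it is malformed."""
    samples = config.get("samples")
    if not isinstance(samples, dict) or not samples:
        raise ValueError("config must define a non-empty 'samples' mapping")
    for name, bam in samples.items():
        # Assumption for this sketch: every entry is a path ending in .bam
        if not isinstance(bam, str) or not bam.endswith(".bam"):
            raise ValueError(f"sample {name!r} does not point to a .bam file: {bam!r}")
    return samples


# Mirrors the YAML example: one sample name mapped to one BAM path
config = {"samples": {"tumor_sample_1": "/path/to/bam/tumor.bam"}}
samples = validate_samples(config)
print(sorted(samples))
```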
ichorCNA.snakefile
Invoking the full snakemake workflow for ichorCNA
# show commands and workflow
snakemake -s ichorCNA.snakefile -np
# run the workflow locally using 5 cores
snakemake -s ichorCNA.snakefile --cores 5
# run the workflow on qsub using a maximum of 50 jobs.
# Broad UGER cluster parameters can be set directly in config/cluster.sh.
snakemake -s ichorCNA.snakefile --cluster-sync "qsub" -j 50 --jobscript config/cluster.sh
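If the workflow is launched from a script rather than typed by hand, the dry-run and local invocations above could be assembled as argument lists. This is a minimal sketch; the helper snakemake_command is hypothetical, and it only uses the flags already shown above (-s, -np, --cores).

```python
import shlex


def snakemake_command(snakefile, dry_run=False, cores=None):
    """Build the argument list for a snakemake call (hypothetical helper)."""
    cmd = ["snakemake", "-s", snakefile]
    if dry_run:
        cmd.append("-np")  # show commands and workflow without running them
    if cores is not None:
        cmd += ["--cores", str(cores)]
    return cmd


dry = snakemake_command("ichorCNA.snakefile", dry_run=True)
local = snakemake_command("ichorCNA.snakefile", cores=5)
print(shlex.join(dry))
print(shlex.join(local))
```

Either list can be passed to subprocess.run(cmd, check=True) to execute it.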
For hg38, please use config_hg38.yaml. It has paths to reference files specific to hg38. The chromosome naming style is set to UCSC (e.g. "chr1"). Users can set this so that the output is in UCSC or NCBI style. The input files, including the tumor and normal wigs and the ichorCNA_normalPanel, ichorCNA_gcWig, ichorCNA_mapWig, ichorCNA_centromere, and ichorCNA_exons files, can be in any style. The ichorCNA_chrs and ichorCNA_chrTrain settings in the config file can also be in any style.
ichorCNA_genomeStyle: UCSC # sets output chromosome naming style
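The difference between the two naming styles can be sketched as follows. This is a simplified illustration, not the conversion ichorCNA performs internally: the function to_style is hypothetical and only adds or removes the "chr" prefix, so it ignores special cases such as the mitochondrial chromosome.

```python
def to_style(chrom, style):
    """Convert a chromosome name to 'UCSC' (chr1) or 'NCBI' (1) style."""
    bare = chrom[3:] if chrom.startswith("chr") else chrom
    if style == "UCSC":
        return "chr" + bare
    if style == "NCBI":
        return bare
    raise ValueError(f"unknown genome style: {style!r}")


print([to_style(c, "UCSC") for c in ["1", "22", "X"]])
print([to_style(c, "NCBI") for c in ["chr1", "chr22", "chrX"]])
```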
Setting chromosomes and bin size to analyze. The bin size should be adjusted to account for sequencing coverage - larger bin sizes for lower coverage. Currently, 1Mb is used for 0.1x coverage.
chrs:
  1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,X,Y
binSize: 1000000 # set window size to compute coverage
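To see why lower coverage calls for larger bins, a back-of-the-envelope estimate of reads per bin can help. The sketch below assumes a read length of 100 bp, which is an illustrative value and not a setting of this workflow.

```python
def expected_reads_per_bin(coverage, bin_size, read_length=100):
    """Approximate number of reads in one bin: coverage * bin_size / read_length."""
    return coverage * bin_size / read_length


# 0.1x coverage with 1Mb bins (read_length=100 is an assumption)
print(expected_reads_per_bin(0.1, 1_000_000))
# The same coverage with 10kb bins leaves far fewer reads per bin
print(expected_reads_per_bin(0.1, 10_000))
```

Under these assumptions, shrinking the bin by a factor of 100 shrinks the read count per bin by the same factor, which makes per-bin coverage estimates noisier.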
Include paths to the main ichorCNA R script and the normal panel used to help normalize the data. The normal panel is optional, but if included, it should correspond to the same bin size.
# included in GitHub repo
ichorCNA_rscript: ../runIchorCNA.R
# use panel matching same bin size (optional)
ichorCNA_normalPanel: ../../inst/extdata/HD_ULP_PoN_1Mb_median_normAutosome_mapScoreFiltered_median.rds
The GC and mappability wig files must be provided. These files should correspond to the same bin size.
# must use gc wig file corresponding to same binSize (required)
ichorCNA_gcWig: ../../inst/extdata/gc_hg19_1000kb.wig
# must use map wig file corresponding to same binSize (required)
ichorCNA_mapWig: ../../inst/extdata/map_hg19_1000kb.wig
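A quick sanity check that a wig file matches the configured bin size might look like the sketch below. It assumes the wig files use fixedStep declaration lines (as in the UCSC wiggle format); the function wig_step and the sample header line are illustrative only.

```python
def wig_step(declaration_line):
    """Return the step= value of a fixedStep declaration line, or None if absent."""
    fields = declaration_line.split()
    if not fields or fields[0] != "fixedStep":
        return None
    for field in fields[1:]:
        key, _, value = field.partition("=")
        if key == "step":
            return int(value)
    return None


# Illustrative header line; a real check would read the first line of the wig file
header = "fixedStep chrom=1 start=1 step=1000000 span=1000000"
bin_size = 1000000
matches = wig_step(header) == bin_size
print(matches)
```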
Targeted intervals (e.g. exons) and centromere file. Both are optional.
# use bed file if sample has targeted regions, e.g. exome data (optional)
ichorCNA_exons: NULL
ichorCNA_centromere: ../../inst/extdata/GRCh37.p13_centromere_UCSC-gapTable.txt
Various settings for ichorCNA model parameters. The normal (non-tumor) fraction setting should include several restart values. For cfDNA, the non-tumor fraction tends to be higher, so including higher values is recommended.
ichorCNA_chrs: c(1:22, \"X\")
# chrs used for training ichorCNA parameters, e.g. tumor fraction.
ichorCNA_chrTrain: c(1:22)
# non-tumor fraction parameter restart values; higher values should be included for cfDNA
ichorCNA_normal: c(0.5,0.6,0.7,0.8,0.9,0.95)
# ploidy parameter restart values
ichorCNA_ploidy: c(2,3)
ichorCNA_estimateNormal: TRUE
ichorCNA_estimatePloidy: TRUE
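The restart values above define a grid of starting points. The sketch below enumerates that grid, assuming each pairing of a non-tumor fraction and a ploidy value seeds one initialization; how the best solution is then chosen is left to ichorCNA and is not shown here.

```python
from itertools import product

# Values copied from the settings above
normal_restarts = [0.5, 0.6, 0.7, 0.8, 0.9, 0.95]
ploidy_restarts = [2, 3]

# Assumption: one initialization per (normal, ploidy) pair
restarts = list(product(normal_restarts, ploidy_restarts))
print(len(restarts))
print(restarts[:3])
```

Adding more restart values widens the search but multiplies the number of initializations accordingly.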
scStates refers to subclonal copy number states - 1 (deletion) and 3 (gain) subclonal states are included. If you do not wish to model subclonal events, then use ichorCNA_scStates: c() and ichorCNA_estimateClonality: TRUE.
# states to use for subclonal CN
ichorCNA_scStates: c(1,3)
ichorCNA_estimateClonality: TRUE
Settings for copy number. When coverage is low (e.g. 0.1x) and a large bin size (e.g. 1Mb) is therefore used, homozygous deletions should not be included (i.e. ichorCNA_includeHOMD: FALSE). For higher coverage data (e.g. >10x), modeling homozygous deletions can be turned on.
# set maximum copy number to use
ichorCNA_maxCN: 5
# TRUE/FALSE to include homozygous deletion state
ichorCNA_includeHOMD: FALSE
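The effect of these two settings on the modeled states can be pictured with the sketch below. It assumes the states are the integers from 1 up to the maximum copy number, with 0 (homozygous deletion) added only when requested; the function copy_number_states is hypothetical and not taken from ichorCNA.

```python
def copy_number_states(max_cn, include_homd):
    """List integer copy number states; 0 is included only if include_homd is True."""
    lowest = 0 if include_homd else 1
    return list(range(lowest, max_cn + 1))


print(copy_number_states(5, include_homd=False))
print(copy_number_states(5, include_homd=True))
```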
Segmentation settings including adjusting sensitivity for events and controlling number of segments.
# higher (e.g. 0.9999999) leads to higher specificity and fewer segments
# lower (e.g. 0.99) leads to higher sensitivity and more segments
ichorCNA_txnE: 0.9999
# higher (e.g. 10000000) leads to higher specificity and fewer segments
# lower (e.g. 100) leads to higher sensitivity and more segments
ichorCNA_txnStrength: 10000
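To build intuition for why a higher txnE yields fewer segments, the sketch below treats txnE as the self-transition probability of a hidden Markov model and spreads the remaining probability evenly over the other states. This is a generic HMM illustration under that assumption, not ichorCNA's own code; both function names are hypothetical.

```python
def transition_matrix(num_states, txn_e):
    """Square matrix with txn_e on the diagonal and the rest split evenly."""
    off_diagonal = (1.0 - txn_e) / (num_states - 1)
    return [
        [txn_e if i == j else off_diagonal for j in range(num_states)]
        for i in range(num_states)
    ]


def expected_run_length(txn_e):
    """Mean number of consecutive bins spent in one state (geometric dwell time)."""
    return 1.0 / (1.0 - txn_e)


matrix = transition_matrix(5, 0.9999)
print(sum(matrix[0]))
print(expected_run_length(0.99), expected_run_length(0.9999))
```

Under this simplified view, moving txnE closer to 1 lengthens the expected run of bins in a single state, which corresponds to fewer, longer segments.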