Runs Sentieon STAR-Aligner to align RNA sequence data to a reference
--fastqs
: (array of files) Array of gzipped FASTQs for one sample--sentieon_tar
: (file) Tarballed Sentieon package. Currently defaults to use Sentieon 202112.05--ctat_genome_indices_read_length
: (int) The length of the genomic sequence around the annotated junction to be used in constructing the splice junctions database. Ideally, this length should be equal to the ReadLength-1, where ReadLength is the length of the reads used in construction of the indices. The reference genome and genome indices from the CTAT library used reads of length 151 bp, so this is the default--genome_lib
:(file) A CTAT genome library, which is a reference file bundle required by Trinity CTAT tools. This contains the genome indices and reference genome that are needed to run Sentieon STAR aligner--opt_parameters
: (string) The parameters with which to run Sentieon STAR aligner. Should be space-delimited, and in the format--parameter-name value
e.g.--alignIntronMax 100000 --alignSJstitchMismatchNmax 5 -1 5 5
. Has default parameters set if not configured by the user. Note that you cannot useopt_parameters
to set the following parameters, as they are already set within the app:--runThreadN
: the app calculates this from the spec of the machine it is running on--genomeDir
: this is extracted from the--genome_lib
input--readFilesIn
: the app sets this using the files in the--fastqs
input--readFilesCommand
: hardcoded in the app aszcat
as the fastqs have to be .gz compressed--readFilesManifest
: the manifest file is generated in the app, and the app currently is not set up to run with a manifest input--outSAMattrRGline
: this is set within the app from the read group info in the fastq files--sjdbOverhang
: this is set by the input--ctat_genome_indices_read_length
--outStd
: this is set in the app to ensure the bam file is the main output--outSAMtype
: this is set in the app to ensure it outputs a bam file--sysShell
--genomeFastaFiles
--genomeChainFiles
--genomeConsensusFile
--parametersFiles
--limitBAMsortRAM
--outFileNamePrefix
--outTmpDir
--readFilesPrefix
--readFilesIn
--readFilesCommand
--readNameSeparator
--readFiles
--readFilesType
--varVCFfile
--inputBAMfile
eggd_staraligner
takes an input array of fastq.gz files and runs sentieon STAR aligner on the fastqs to align them to the reference genome.
eggd_staraligner
outputs a .bam and .bam.bai file for the alignment of the sample and a Chimeric.out.junctions file with junction information. It also outputs several Sentieon STAR aligner log files, a .bam file with the duplicate reads marked, and a corresponding .bam.bai index file.