alnsl

A Nextflow pipeline for aligning short whole-genome sequencing (WGS) reads.

Preparing index sequences

Prepare the reference index for bwa-mem2:
```
 bwa-mem2 index ref.fa
```

Prepare the elprep files using the following commands:

#reference file
elprep fasta-to-elfasta ref.fa ref.fa.elfasta
#variant files from the GATK4 bundle, for variant calibration
elprep vcf-to-elsites <vcf-file> <elsites-file>
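
As an illustration, the two known-sites files referenced in the example params file could be converted in a loop. This is only a sketch: the input VCF names are assumptions derived from the `.elsites` names used in this README, `elsites_cmds` is a hypothetical helper, and the commands are printed rather than executed.

```shell
# Dry run: print one elprep conversion command per known-sites VCF.
# The VCF file names passed below are assumed; adjust them to your local bundle.
elsites_cmds() {
  local vcf
  for vcf in "$@"; do
    # Strip the .vcf suffix and append .elsites to build the output name.
    echo "elprep vcf-to-elsites ${vcf} ${vcf%.vcf}.elsites"
  done
}

elsites_cmds Homo_sapiens_assembly38.dbsnp138.vcf \
             Mills_and_1000G_gold_standard.indels.hg38.vcf
```

Once the printed commands look right, they can be piped to `sh` or the `echo` can be removed.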

Conda Environment

Build a micromamba environment with the required software:

micromamba create -f conda.yml

Param files

The following example maps short reads to hg38. The content of the params file is:

dbsnp: /mnt/beegfs/labs/DiGenomaLab/databases/references/human/GATK_Bundle/Homo_sapiens_assembly38.dbsnp138.elsites
dbindel: /mnt/beegfs/labs/DiGenomaLab/databases/references/human/GATK_Bundle/Mills_and_1000G_gold_standard.indels.hg38.elsites
ref: /mnt/beegfs/labs/DiGenomaLab/databases/references/human/bwa2/hs38DH.fa
elpre_ref: /mnt/beegfs/labs/DiGenomaLab/databases/references/human/hs38DH.fa.elfasta
alt_js: /mnt/beegfs/home/adigenova/micromamba/envs/aln/bin/bwa-postalt.js
bqsr: true

Save the above content in a file, e.g. aln-params.yml
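
Because the params file consists mostly of absolute paths, a typo in any of them only surfaces once the pipeline starts. The following sketch assumes the flat `key: value` layout of the example params file and lists entries whose value looks like an absolute path but does not exist on disk. The function name `check_params_paths` is hypothetical and not part of the pipeline.

```shell
# Report params entries whose value starts with "/" but is missing on disk.
# Assumes one "key: value" pair per line, as in the example params file.
check_params_paths() {
  local params_file="$1" line key value missing=0
  while IFS= read -r line || [ -n "$line" ]; do
    case "$line" in
      *:*) ;;            # looks like "key: value"
      *) continue ;;     # skip blank or malformed lines
    esac
    key="${line%%:*}"
    value="${line#*:}"
    value="${value# }"   # drop the space after the colon
    case "$value" in
      /*)
        if [ ! -e "$value" ]; then
          echo "missing: ${key} -> ${value}"
          missing=1
        fi
        ;;
    esac
  done < "$params_file"
  return "$missing"
}

# Usage: check_params_paths aln-params.yml
```

Non-path values such as `bqsr: true` are ignored, and the function returns a non-zero status if any path is missing.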

Read file

Provide a CSV file with the following columns:

sampleId,read1,read2
test2,./test_reads/test2.R1.fq.gz,./test_reads/test2.R2.fq.gz
test3,./test_reads/test3.R1.fq.gz,./test_reads/test3.R2.fq.gz
test,./test_reads/test.R1.fq.gz,./test_reads/test.R2.fq.gz

Currently, if a sample is split across several files, the reads must be merged before running the pipeline.

Save the CSV content above in a file, e.g. reads.csv
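
For a sample sequenced across several lanes, one way to merge the reads is to concatenate the gzipped files, since a concatenation of gzip streams is itself a valid gzip file. The sketch below assumes a hypothetical lane naming scheme (`<sample>_L001.R1.fq.gz`, `<sample>_L002.R1.fq.gz`, and so on) and a hypothetical helper `merge_lanes`. R1 and R2 must be concatenated in the same lane order so that read pairs stay in sync.

```shell
# Merge the per-lane FASTQ files of one sample into a single R1/R2 pair.
# The lane naming scheme (<sample>_L00N.R{1,2}.fq.gz) is an assumption.
merge_lanes() {
  local sample="$1" dir="$2" mate
  for mate in R1 R2; do
    # The glob expands in sorted order, so both mates use the same lane order.
    cat "$dir/${sample}"_L*."${mate}".fq.gz > "$dir/${sample}.${mate}.fq.gz"
  done
}

# Usage: merge_lanes test ./test_reads
```

The merged files (`<sample>.R1.fq.gz` and `<sample>.R2.fq.gz`) are the ones to list in the CSV file.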

Running the pipeline

nextflow run main.nf --csv reads.csv -profile uoh -params-file aln-params.yml

In case of failure, resume the run with:

nextflow run main.nf --csv reads.csv -profile uoh -params-file aln-params.yml -resume

The run generates a directory called results.
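
A common cause of an early failure is a wrong path in the sample sheet. As a pre-flight check before launching, the sketch below verifies that each row has exactly three fields and that both read files exist. It assumes the `sampleId,read1,read2` header used in this README; `check_reads_csv` is a hypothetical helper and not part of the pipeline.

```shell
# Check that every row of the sample sheet has three fields and existing files.
# Assumes a header line followed by "sampleId,read1,read2" rows.
check_reads_csv() {
  local csv="$1" header id r1 r2 extra f bad=0
  {
    read -r header                      # skip the header line
    while IFS=, read -r id r1 r2 extra || [ -n "$id" ]; do
      [ -z "$id" ] && continue          # ignore blank lines
      if [ -z "$r2" ] || [ -n "$extra" ]; then
        echo "malformed row: ${id}"
        bad=1
        continue
      fi
      for f in "$r1" "$r2"; do
        [ -e "$f" ] || { echo "missing: ${f} (sample ${id})"; bad=1; }
      done
    done
  } < "$csv"
  return "$bad"
}

# Usage: check_reads_csv reads.csv && echo "sample sheet looks fine"
```

Relative paths are resolved against the current directory, so the check should be run from the directory where the pipeline will be launched.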

Creating an aggregated report

To create an aggregated report across all samples, run MultiQC on the results directory:

Load the environment:
```
micromamba activate aln
```
Run MultiQC:

 multiqc .
