Update to DSL2 Best Practices #379

Merged Jun 11, 2021
53 commits
1e0edf8
feat: update main script and lib from DSL2 Best Practices
maxulysse May 20, 2021
b120e90
fix: add more file to lint ignore
maxulysse May 20, 2021
b5fc96c
feat: strip out unused functions
maxulysse May 20, 2021
34a0b4d
feat: update some modules
maxulysse May 20, 2021
14d5b97
fix: HaplotypeCaller
maxulysse May 20, 2021
bcc8385
feat: update msisensor
maxulysse May 25, 2021
f68fa78
feat: update msisensor_msi
maxulysse May 25, 2021
b86fc4f
fix: Gstring.toString()
maxulysse May 25, 2021
7f55703
fix: output
maxulysse May 25, 2021
7be61c8
remove unused file
maxulysse May 25, 2021
6291ec5
remove white space
maxulysse May 25, 2021
8f6e461
fix: output
maxulysse May 25, 2021
ce7b5ee
fix: markduplicates subworkflow test
maxulysse May 26, 2021
4141f64
fix: subworkflow md finally fixed
maxulysse May 26, 2021
99dd4ea
fix: nextflow version
maxulysse May 26, 2021
b1d14e0
moving files and functions around
maxulysse May 27, 2021
8db1c6f
feat: update modules and move around functions, and possibly break ev…
maxulysse May 28, 2021
151d600
fix: add .gitignore to .nf-core-lint.yml
maxulysse May 28, 2021
712e29d
fix: add new csv files
maxulysse May 28, 2021
b83e31f
feat: simplify mapping script, update module, default test working
maxulysse May 28, 2021
ea0b43e
feat: cleanup files
maxulysse May 28, 2021
266772b
cleanup
maxulysse May 28, 2021
c371418
feat: simplify variant_calling
maxulysse May 28, 2021
6c3299d
fix: output
maxulysse May 30, 2021
77a5c65
feat: simplify tests
maxulysse May 30, 2021
4fbbf80
fix: tests
maxulysse May 30, 2021
7750ce9
feat: simplify test profiles
maxulysse Jun 3, 2021
55852ba
feat: refactor + reorganize modules
maxulysse Jun 3, 2021
d140635
feat: refactor - forgot to commit file
maxulysse Jun 3, 2021
7e6b232
fix: gatk4/markduplicatesspark
maxulysse Jun 4, 2021
ad88088
fix: subworfklow tests
maxulysse Jun 7, 2021
980b702
fix: markduplicats tests
maxulysse Jun 7, 2021
4332455
feat: update bwa/bwamem2 modules
maxulysse Jun 7, 2021
2b3b55c
feat: update trimgalore
maxulysse Jun 7, 2021
7ed1a92
feat: update tabix/tabix
maxulysse Jun 7, 2021
7b2a7a3
feat: update and add strelka modules
maxulysse Jun 7, 2021
61eee21
fix: typo
maxulysse Jun 7, 2021
b214c5d
fix: path to module
maxulysse Jun 7, 2021
0f1fba5
fix: typo
maxulysse Jun 7, 2021
04d97f4
feat: update samtools/merge
maxulysse Jun 7, 2021
b4a903c
feat: update multiqc modules
maxulysse Jun 7, 2021
7dc1c86
feat: replace msisensor by msisensor-pro
maxulysse Jun 7, 2021
8d10fc4
fix: path to modules
maxulysse Jun 7, 2021
54bd1d4
fix: msisensor-pro test
maxulysse Jun 7, 2021
7d4547b
fix: msisensorpro execution7
maxulysse Jun 7, 2021
28b232b
feat: code polishing
maxulysse Jun 7, 2021
214a28b
fix: improve csv files generation
maxulysse Jun 7, 2021
6b23e8e
feat: code polishing
maxulysse Jun 7, 2021
f52c318
fix: update CI
maxulysse Jun 7, 2021
dacc326
fix: add csv for recalibrated bam files
maxulysse Jun 7, 2021
b092ea3
feat: restart from all steps
maxulysse Jun 7, 2021
0dd07ea
fix: gatk4spark tests
maxulysse Jun 8, 2021
68e1cc4
feat: code polishing
maxulysse Jun 8, 2021
2 changes: 1 addition & 1 deletion .github/workflows/local_modules.yml
@@ -23,7 +23,7 @@ jobs:
  strategy:
  fail-fast: false
  matrix:
- nxf_version: ['20.11.0-edge']
+ nxf_version: ['21.04.0']
  tags: ['${{ fromJson(needs.changes.outputs.modules) }}']
  profile: ['docker', 'singularity'] ## 'conda'
  env:
1 change: 0 additions & 1 deletion .gitignore
@@ -4,7 +4,6 @@ work/
  data/
  results/
  .DS_Store
- tests/
  testing/
  testing*
  *.pyc
2 changes: 2 additions & 0 deletions .nf-core-lint.yml
@@ -2,5 +2,7 @@ files_unchanged:
- .github/ISSUE_TEMPLATE/bug_report.md
- .github/ISSUE_TEMPLATE/feature_request.md
- .github/PULL_REQUEST_TEMPLATE.md
- .gitignore
- assets/nf-core-sarek_logo.png
- docs/images/nf-core-sarek_logo.png
- lib/NfcoreSchema.groovy
Empty file added assets/dummy_file.txt
Empty file.
64 changes: 32 additions & 32 deletions bin/concatenateVCFs.sh
@@ -8,27 +8,27 @@ usage() { echo "Usage: $0 [-i genome_index_file] [-o output.file.no.gz.extension

  while [[ $# -gt 0 ]]
  do
- key=$1
- case $key in
+ key=$1
+ case $key in
  -i)
  genomeIndex=$2
  shift # past argument
- shift # past value
+ shift # past value
  ;;
  -c)
  cpus=$2
  shift # past argument
- shift # past value
+ shift # past value
  ;;
  -o)
  outputFile=$2
  shift # past argument
- shift # past value
+ shift # past value
  ;;
  -t)
  targetBED=$2
  shift # past argument
- shift # past value
+ shift # past value
  ;;
  -n)
  noInt=1
@@ -46,7 +46,7 @@ if [ -z ${cpus} ]; then echo "No CPUs defined: setting to 1"; cpus=1; fi
  if [ -z ${outputFile} ]; then echo "Missing output file name"; usage; fi


- if [ -z ${noInt+x} ]
+ if [ -z ${noInt+x} ]
  then
  # First make a header from one of the VCF
  # Remove interval information from the GATK command-line, but leave the rest
@@ -62,36 +62,36 @@ then

  # Concatenate VCFs in the correct order
  (
- cat header
+ cat header

- for chr in "${CONTIGS[@]}"; do
- # Skip if globbing would not match any file to avoid errors such as
- # "ls: cannot access chr3_*.vcf: No such file or directory" when chr3
- # was not processed.
- pattern="${chr}_*.vcf"
- if ! compgen -G "${pattern}" > /dev/null; then continue; fi
+ for chr in "${CONTIGS[@]}"; do
+ # Skip if globbing would not match any file to avoid errors such as
+ # "ls: cannot access chr3_*.vcf: No such file or directory" when chr3
+ # was not processed.
+ pattern="${chr}_*.vcf"
+ if ! compgen -G "${pattern}" > /dev/null; then continue; fi

- # ls -v sorts by numeric value ("version"), which means that chr1_100_
- # is sorted *after* chr1_99_.
- for vcf in $(ls -v ${pattern}); do
- # Determine length of header.
- # The 'q' command makes sed exit when it sees the first non-header
- # line, which avoids reading in the entire file.
- L=$(sed -n '/^[^#]/q;p' ${vcf} | wc -l)
-
- # Then print all non-header lines. Since tail is very fast (nearly as
- # fast as cat), this is way more efficient than using a single sed,
- # awk or grep command.
- tail -n +$((L+1)) ${vcf}
- done
- done
+ # ls -v sorts by numeric value ("version"), which means that chr1_100_
+ # is sorted *after* chr1_99_.
+ for vcf in $(ls -v ${pattern}); do
+ # Determine length of header.
+ # The 'q' command makes sed exit when it sees the first non-header
+ # line, which avoids reading in the entire file.
+ L=$(sed -n '/^[^#]/q;p' ${vcf} | wc -l)
+
+ # Then print all non-header lines. Since tail is very fast (nearly as
+ # fast as cat), this is way more efficient than using a single sed,
+ # awk or grep command.
+ tail -n +$((L+1)) ${vcf}
+ done
+ done
  ) | bgzip -@${cpus} > rawcalls.vcf.gz
  tabix rawcalls.vcf.gz
  else
- VCF=$(ls no_intervals*.vcf)
- cp $VCF rawcalls.vcf
- bgzip -@${cpus} rawcalls.vcf
- tabix rawcalls.vcf.gz
+ VCF=$(ls no_intervals*.vcf)
+ cp $VCF rawcalls.vcf
+ bgzip -@${cpus} rawcalls.vcf
+ tabix rawcalls.vcf.gz
  fi

  set +u
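Note on the header-skipping step in bin/concatenateVCFs.sh: the comments in the script describe counting the `#` header lines with `sed` and then printing the remaining records with `tail`. A minimal standalone sketch of that idea follows. The function name `print_vcf_records`, the variable `demo_vcf`, and the two-record demo file are assumptions made for illustration and are not part of the PR.

```shell
#!/usr/bin/env bash
# Sketch only: the function name and the demo input are hypothetical.
set -euo pipefail

# Print the records of a VCF, skipping its leading '#' header lines.
print_vcf_records() {
    local vcf=$1
    local header_lines
    # sed prints each line until the first one that does not start with '#',
    # then quits, so only the header part of the file is read.
    header_lines=$(sed -n '/^[^#]/q;p' "${vcf}" | wc -l)
    # tail -n +K starts printing at line K, i.e. the first record.
    tail -n +"$((header_lines + 1))" "${vcf}"
}

# Hypothetical demo input: two header lines followed by two records.
demo_vcf=$(mktemp)
printf '##fileformat=VCFv4.2\n#CHROM\tPOS\tID\tREF\tALT\nchr1\t100\t.\tA\tT\nchr1\t200\t.\tG\tC\n' > "${demo_vcf}"

print_vcf_records "${demo_vcf}"
```

Unlike the script in the diff, the sketch quotes `"${vcf}"` so that paths containing spaces would still work; that is a choice of this example, not a change made in the PR.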
10 changes: 10 additions & 0 deletions conf/genomes.config
@@ -79,5 +79,15 @@ params {
  'custom' {
  fasta = null
  }
+ 'small_hg38' {
+ dbsnp = "${params.genomes_base}/data/genomics/homo_sapiens/genome/vcf/dbsnp_146.hg38.vcf.gz"
+ fasta = "${params.genomes_base}/data/genomics/homo_sapiens/genome/genome.fasta"
+ fasta_fai = "${params.genomes_base}/data/genomics/homo_sapiens/genome/genome.fasta.fai"
+ germline_resource = "${params.genomes_base}/data/genomics/homo_sapiens/genome/vcf/gnomAD.r2.1.1.vcf.gz"
+ known_indels = "${params.genomes_base}/data/genomics/homo_sapiens/genome/vcf/mills_and_1000G.indels.vcf.gz"
+ snpeff_db = 'GRCh38.86'
+ species = 'homo_sapiens'
+ vep_cache_version = '99'
+ }
  }
  }