GATK4 first round without MuTect1 and indel realignment #607
…ow error. Maybe NXF bug?
annotate.nf
Outdated
@@ -73,23 +73,23 @@ vcfToAnnotate = Channel.create()
vcfNotToAnnotate = Channel.create()

if (annotateVCF == []) {
// by default we annotate both germline and somatic results that we can find in the VariantCalling directory |
In fact, we annotate all available VCFs by default, so it's not really a question of germline/somatic, but more a question of which tools were run.
containers/sarek/Dockerfile
Outdated
LABEL \
author="Maxime Garcia" \
authors="Maxime.Gracia@scilifelab.se, Szilveszter.Juhos@scilifelab.se" \ |
typo in my name.
containers/sarek/Dockerfile
Outdated
description="Image with tools used in Sarek" \
maintainer="maxime.garcia@scilifelab.se"
maintainers="Maxime.Gracia@scilifelab.se, Szilveszter.Juhos@scilifelab.se" |
If I remember correctly, we can use whichever label we want, but maintainer is meant to stay that way, because it's a port of the deprecated MAINTAINER instruction.
I think we can use:
maintainer="Maxime Garcia <maxime.garcia@scilifelab.se>, Szilveszter Juhos <Szilveszter.Juhos@scilifelab.se>"
containers/vcfanno/to_build
Outdated
docker build -t szilvajuhos/sarek-vcfanno:latest .
docker images
docker push szilvajuhos/sarek-vcfanno:latest
singularity pull docker://szilvajuhos/sarek-vcfanno:latest |
Do we really need this script in this repo?
No, of course not
germlineVC.nf
Outdated
-L ${intervalBed} \
--dbsnp ${dbsnp} \
-O ${intervalBed.baseName}_${idSample}.g.vcf \
--emit-ref-confidence GVCF |
strange indentation
lib/QC.groovy
Outdated
@@ -59,14 +59,14 @@ class QC {
// Get GATK version
static def getVersionGATK() {
"""
echo "GATK version"\$(java -jar \$GATK_HOME/GenomeAnalysisTK.jar --version 2>&1) > v_gatk.txt
gatk-launch ApplyBQSR --help 2>&1| awk -F/ '/java/{for(i=1;i<=NF;i++){if(\$i~/gatk4/){sub("gatk4-","",\$i);print \$i>"v_gatk.txt"}}}' |
I think we can handle the regex in the Python script instead of doing it here; it'll make more sense.
gatk-launch is provided by GATK, and I do not want to fiddle with it. OTOH, it would be nice if they had a --version option :/
I'll look further into whether there's something similar in the new GATK.
My feeling is that it is still the easiest way to get the version :/
I was thinking more of removing the awk part and doing the regex in the Python script.
I see. Sure, we can also refactor it for the rest of the software later.
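The suggestion in this thread (drop the awk step and do the regex on the Python side) could look roughly like the sketch below. It assumes, as the awk one-liner does, that the launcher output contains a line mentioning java with a path segment of the form gatk4-<version>. The function name, the pattern, and the sample output are hypothetical and not taken from the pipeline's actual script.

```python
import re

# Hypothetical pattern: matches whatever follows "gatk4-" up to the next
# slash or whitespace, e.g. "4.0.3.0-0" in ".../gatk4-4.0.3.0-0/...".
GATK_VERSION_RE = re.compile(r"gatk4-([^/\s]+)")


def parse_gatk_version(help_output):
    """Return the version-like text after 'gatk4-' from the first line
    that mentions java, or None if no such line is found."""
    for line in help_output.splitlines():
        if "java" not in line:
            continue  # the awk version also only inspects lines matching /java/
        match = GATK_VERSION_RE.search(line)
        if match:
            return match.group(1)
    return None


# Made-up sample resembling a launcher banner; the real output may differ.
sample_output = (
    "Running:\n"
    "    java -jar /opt/conda/share/gatk4-4.0.3.0-0/gatk-package-4.0.3.0-local.jar ApplyBQSR --help\n"
)
print(parse_gatk_version(sample_output))
```

With this approach the process would only need to redirect the raw help output to a file, and the parsing would live next to the handling of the other tools' versions.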
Quite impressive work.
containers/sarek/environment.yml
Outdated
- conda-forge::openjdk=8.0.144 # Needed for FastQC docker - see bioconda/bioconda-recipes#5026
- fastqc=0.11.7
- freebayes=1.2.0
- gatk4=4.0.3.0 |
You can use 4.0.4.0; the executable is back to being gatk and not gatk-launch anymore.
Fine, I will change the name in the processes as well. In fact, we have 4.0.6.0 as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even better ;-)
containers/sarek/environment.yml
Outdated
- fastqc=0.11.7
- freebayes=1.2.0
- gatk4=4.0.3.0
- htslib=1.7 |
You should use 1.8 here.
If I remember correctly, htslib, bcftools and samtools can all have the same version.
OK, done, but I will check since I have a feeling that 1.8 has compatibility issues.
OK, strange, but good to know if you can confirm that
There is even 1.9 out already! You could already skip 1.8 ...
https://github.com/samtools/samtools/releases/
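If the three packages are meant to share one version, as recalled above, a small check over environment.yml could flag pins that drift apart. This is only a sketch under that assumption: the helper names and the sample pins for bcftools and samtools are hypothetical, and the parsing is a plain regex over "- name=version" lines, not a full YAML parser.

```python
import re

# Matches dependency lines such as "- htslib=1.7" or
# "- conda-forge::openjdk=8.0.144 # comment" (channel prefix optional).
PIN_RE = re.compile(r"^\s*-\s*(?:[\w.-]+::)?([\w.-]+)=([\w.]+)")


def parse_pins(environment_text):
    """Collect {package: version} from '- name=version' lines."""
    pins = {}
    for line in environment_text.splitlines():
        match = PIN_RE.match(line)
        if match:
            pins[match.group(1)] = match.group(2)
    return pins


def mismatched_versions(pins, packages=("htslib", "bcftools", "samtools")):
    """Return the pins of the given packages if they disagree, else {}."""
    found = {name: pins[name] for name in packages if name in pins}
    if len(set(found.values())) <= 1:
        return {}
    return found


# Sample text: only the htslib pin comes from the diff above; the other
# two lines are made up for illustration.
sample_env = """
dependencies:
  - htslib=1.7
  - bcftools=1.8
  - samtools=1.8
"""
print(mismatched_versions(parse_pins(sample_env)))
```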
containers/sarek/environment.yml
Outdated
- gatk4=4.0.3.0
- htslib=1.7
- igvtools=2.3.93
- manta=1.3.0 |
Since we're updating, we can try 1.4.0.
OK, done
@@ -14,7 +14,7 @@ env {
params {
genome_base = params.genome == 'GRCh37' ? '/sw/data/uppnex/ToolBox/ReferenceAssemblies/hg38make/bundle/2.8/b37' : params.genome == 'GRCh38' ? '/sw/data/uppnex/ToolBox/hg38bundle' : 'References/smallGRCh37'
singleCPUMem = 8.GB
totalMemory = 104.GB // change to 240 on irma
totalMemory = 92.GB // change to 240 on irma |
Why did you change to 92?
by mistake
build.sh
COPY environment.yml /
RUN conda env update -n root -f /environment.yml && conda clean -a
ENV PATH /opt/conda/bin:$PATH |
We can get rid of this last line:
https://gitter.im/nf-core/Lobby?at=5b59f41bd2f0934551d30d5d
Tried it, and it was not working as expected, so I prefer to leave it as it is for now and improve it when needed.
We will keep the ENV line for now.
containers/sarek/environment.yml
Outdated
@@ -0,0 +1,24 @@
# You can use this file to create a conda environment for this pipeline:
# conda env create -f environment.yml
name: sarek-core |
We should specify a version here, so I would go for sarek-core-dev or sarek-core-2.1.
Can we just leave it as sarek?
GATK4 first round without MuTect1 and indel realignment
Also have a look at the new container structure. I am trying to follow the nf-core guidelines. alleleCount and ASCAT need new bioconda recipes, but most of the other tools are in a collated (relatively big) container including GATK4, igvtools, etc.