From 768efb345010dfa6d40a21e6a1e051e7dcc59013 Mon Sep 17 00:00:00 2001 From: Maxime Garcia Date: Tue, 21 Aug 2018 10:19:25 +0200 Subject: [PATCH 01/25] move files --- .../PULL_REQUEST_TEMPLATE.md | 0 containers/sarek/Dockerfile => Dockerfile | 0 {configuration => conf}/aws-batch.config | 0 {configuration => conf}/base.config | 2 +- {configuration => conf}/binac.config | 0 {configuration => conf}/containers.config | 0 {configuration => conf}/docker.config | 0 {configuration => conf}/genomes.config | 0 .../singularity-path.config | 0 {configuration => conf}/singularity.config | 0 {configuration => conf}/travis.config | 0 .../uppmax-localhost.config | 0 {configuration => conf}/uppmax-slurm.config | 0 containers/gatk/Dockerfile | 8 -- containers/gatk4/Dockerfile | 15 ---- containers/gatk4/environment.yml | 9 --- containers/igvtools/Dockerfile | 25 ------ containers/picard/Dockerfile | 25 ------ containers/qctools/Dockerfile | 15 ---- containers/qctools/environment.yml | 13 ---- {doc => docs}/ASCAT.md | 0 {doc => docs}/Abstracts/2016-09-KICR.md | 0 {doc => docs}/Abstracts/2017-05-ESHG.md | 0 {doc => docs}/Abstracts/2018-05-PMC.md | 0 {doc => docs}/Abstracts/2018-06-EACR25.md | 0 {doc => docs}/Abstracts/2018-06-NPMI.md | 0 {doc => docs}/Abstracts/2018-07-JOBIM.md | 0 {doc => docs}/BUILD.md | 0 {doc => docs}/CONFIG.md | 0 {doc => docs}/CONTAINERS.md | 73 ++---------------- {doc => docs}/FOLDER.md | 0 {doc => docs}/INSTALL.md | 4 +- {doc => docs}/INSTALL_BIANCA.md | 0 {doc => docs}/INSTALL_RACKHAM.md | 0 {doc => docs}/INTERVALS.md | 0 {doc => docs}/PROCESS.md | 0 {doc => docs}/Posters/ESHG_2017_Mgarcia.pdf | Bin {doc => docs}/Posters/ESHG_2017_Mgarcia.svg | 0 {doc => docs}/Posters/PMC_2018_Mgarcia.pdf | Bin {doc => docs}/Posters/PMC_2018_Mgarcia.svg | 0 .../2018-07-04-MGarcia-JOBIM.pdf | Bin docs/README.md | 19 +++++ {doc => docs}/REFERENCES.md | 2 +- {doc => docs}/SELECTROI.md | 0 .../SprintReview/2016-08-18/ASCAT short.pptx | Bin .../2016-08-18/General_schema.png | Bin 
.../2016-08-18/GitHub_contribution.png | Bin .../2016-08-18/SprintReview20160818.tex | 0 .../2016-08-18/VCs_integrated.png | Bin .../SprintReview/2016-08-18/dogEating.jpg | Bin .../SprintReview/2016-08-18/memory_usage.png | Bin .../multi-megabase-phase-blocks-large.jpg | Bin .../SprintReview/2016-08-18/wall_clock.png | Bin .../SprintReview/2016-10-13/Irma_timings.png | Bin .../2016-10-13/SprintReview20161013.tex | 0 .../SprintReview/2016-11-14/DREAM_Clones.png | Bin .../SprintReview/2016-11-14/DREAM_SNPs.png | Bin .../SprintReview/2016-11-14/DREAM_SVs.png | Bin .../SprintReview/2016-11-14/DREAM_indels.png | Bin .../2016-11-14/HCC1143_purity.png | Bin .../2016-11-14/HCC1143_subclones.png | Bin .../2016-11-14/HCC1954_purity.png | Bin .../2016-11-14/HCC1954_subclones.png | Bin .../2016-11-14/MGarciaRefactoring.pdf | Bin {doc => docs}/SprintReview/2016-11-14/S1.jpg | Bin {doc => docs}/SprintReview/2016-11-14/S1.xcf | Bin {doc => docs}/SprintReview/2016-11-14/S2.jpg | Bin {doc => docs}/SprintReview/2016-11-14/S2.xcf | Bin {doc => docs}/SprintReview/2016-11-14/S3.jpg | Bin {doc => docs}/SprintReview/2016-11-14/S3.xcf | Bin .../2016-11-14/SprintReview20161114.pdf | Bin .../2016-11-14/SprintReview20161114.tex | 0 {doc => docs}/TESTS.md | 2 +- {doc => docs}/TSV.md | 0 {doc => docs}/USE_CASES.md | 0 {doc => docs}/Various/tumor_genes.bed | 0 {doc => docs}/images/CAW_icon.png | Bin {doc => docs}/images/CAW_icon.svg | 0 {doc => docs}/images/CAW_logo.png | Bin {doc => docs}/images/CAW_logo.svg | 0 {doc => docs}/images/CPU_usage.pdf | Bin {doc => docs}/images/CPU_usage.png | Bin {doc => docs}/images/CPU_usage.svg | 0 {doc => docs}/images/General_schema.graphml | 0 {doc => docs}/images/GitHub.QR.png | Bin {doc => docs}/images/NBIS_logo.png | Bin {doc => docs}/images/NGI_logo.png | Bin .../images/Preprocessing_bubble.graphml | 0 {doc => docs}/images/Preprocessing_bubble.jpg | Bin {doc => docs}/images/SNV_indel_bubble.graphml | 0 {doc => docs}/images/SNV_indel_bubble.jpg | Bin {doc 
=> docs}/images/SV_bubble.graphml | 0 {doc => docs}/images/SV_bubble.jpg | Bin {doc => docs}/images/Sarek_germline_icon.png | Bin {doc => docs}/images/Sarek_germline_logo.png | Bin {doc => docs}/images/Sarek_icon.png | Bin {doc => docs}/images/Sarek_icon.svg | 0 {doc => docs}/images/Sarek_logo.png | Bin {doc => docs}/images/Sarek_logo.svg | 0 {doc => docs}/images/Sarek_no_Border.png | Bin {doc => docs}/images/Sarek_somatic_icon.png | Bin {doc => docs}/images/Sarek_somatic_logo.png | Bin {doc => docs}/images/Sarek_workflow.pdf | Bin {doc => docs}/images/Sarek_workflow.png | Bin {doc => docs}/images/Sarek_workflow.svg | 0 {doc => docs}/images/SciLifeLab_logo.png | Bin {doc => docs}/images/ascat.graphml | 0 {doc => docs}/images/ascat.jpg | Bin {doc => docs}/images/folder_structure.graphml | 0 {doc => docs}/images/folder_structure.jpg | Bin .../images/preprocessing_simplified.graphml | 0 .../images/preprocessing_simplified.jpg | Bin {doc => docs}/images/workflow_schema.graphml | 0 doc/USAGE.md => docs/usage.md | 0 .../sarek/environment.yml => environment.yml | 0 115 files changed, 30 insertions(+), 182 deletions(-) rename PULL_REQUEST_TEMPLATE.md => .github/PULL_REQUEST_TEMPLATE.md (100%) rename containers/sarek/Dockerfile => Dockerfile (100%) rename {configuration => conf}/aws-batch.config (100%) rename {configuration => conf}/base.config (99%) rename {configuration => conf}/binac.config (100%) rename {configuration => conf}/containers.config (100%) rename {configuration => conf}/docker.config (100%) rename {configuration => conf}/genomes.config (100%) rename {configuration => conf}/singularity-path.config (100%) rename {configuration => conf}/singularity.config (100%) rename {configuration => conf}/travis.config (100%) rename {configuration => conf}/uppmax-localhost.config (100%) rename {configuration => conf}/uppmax-slurm.config (100%) delete mode 100644 containers/gatk/Dockerfile delete mode 100644 containers/gatk4/Dockerfile delete mode 100644 
containers/gatk4/environment.yml delete mode 100644 containers/igvtools/Dockerfile delete mode 100644 containers/picard/Dockerfile delete mode 100644 containers/qctools/Dockerfile delete mode 100644 containers/qctools/environment.yml rename {doc => docs}/ASCAT.md (100%) rename {doc => docs}/Abstracts/2016-09-KICR.md (100%) rename {doc => docs}/Abstracts/2017-05-ESHG.md (100%) rename {doc => docs}/Abstracts/2018-05-PMC.md (100%) rename {doc => docs}/Abstracts/2018-06-EACR25.md (100%) rename {doc => docs}/Abstracts/2018-06-NPMI.md (100%) rename {doc => docs}/Abstracts/2018-07-JOBIM.md (100%) rename {doc => docs}/BUILD.md (100%) rename {doc => docs}/CONFIG.md (100%) rename {doc => docs}/CONTAINERS.md (57%) rename {doc => docs}/FOLDER.md (100%) rename {doc => docs}/INSTALL.md (97%) rename {doc => docs}/INSTALL_BIANCA.md (100%) rename {doc => docs}/INSTALL_RACKHAM.md (100%) rename {doc => docs}/INTERVALS.md (100%) rename {doc => docs}/PROCESS.md (100%) rename {doc => docs}/Posters/ESHG_2017_Mgarcia.pdf (100%) rename {doc => docs}/Posters/ESHG_2017_Mgarcia.svg (100%) rename {doc => docs}/Posters/PMC_2018_Mgarcia.pdf (100%) rename {doc => docs}/Posters/PMC_2018_Mgarcia.svg (100%) rename {doc => docs}/Presentations/2018-07-04-MGarcia-JOBIM.pdf (100%) create mode 100644 docs/README.md rename {doc => docs}/REFERENCES.md (99%) rename {doc => docs}/SELECTROI.md (100%) rename {doc => docs}/SprintReview/2016-08-18/ASCAT short.pptx (100%) rename {doc => docs}/SprintReview/2016-08-18/General_schema.png (100%) rename {doc => docs}/SprintReview/2016-08-18/GitHub_contribution.png (100%) rename {doc => docs}/SprintReview/2016-08-18/SprintReview20160818.tex (100%) rename {doc => docs}/SprintReview/2016-08-18/VCs_integrated.png (100%) rename {doc => docs}/SprintReview/2016-08-18/dogEating.jpg (100%) rename {doc => docs}/SprintReview/2016-08-18/memory_usage.png (100%) rename {doc => docs}/SprintReview/2016-08-18/multi-megabase-phase-blocks-large.jpg (100%) rename {doc => 
docs}/SprintReview/2016-08-18/wall_clock.png (100%) rename {doc => docs}/SprintReview/2016-10-13/Irma_timings.png (100%) rename {doc => docs}/SprintReview/2016-10-13/SprintReview20161013.tex (100%) rename {doc => docs}/SprintReview/2016-11-14/DREAM_Clones.png (100%) rename {doc => docs}/SprintReview/2016-11-14/DREAM_SNPs.png (100%) rename {doc => docs}/SprintReview/2016-11-14/DREAM_SVs.png (100%) rename {doc => docs}/SprintReview/2016-11-14/DREAM_indels.png (100%) rename {doc => docs}/SprintReview/2016-11-14/HCC1143_purity.png (100%) rename {doc => docs}/SprintReview/2016-11-14/HCC1143_subclones.png (100%) rename {doc => docs}/SprintReview/2016-11-14/HCC1954_purity.png (100%) rename {doc => docs}/SprintReview/2016-11-14/HCC1954_subclones.png (100%) rename {doc => docs}/SprintReview/2016-11-14/MGarciaRefactoring.pdf (100%) rename {doc => docs}/SprintReview/2016-11-14/S1.jpg (100%) rename {doc => docs}/SprintReview/2016-11-14/S1.xcf (100%) rename {doc => docs}/SprintReview/2016-11-14/S2.jpg (100%) rename {doc => docs}/SprintReview/2016-11-14/S2.xcf (100%) rename {doc => docs}/SprintReview/2016-11-14/S3.jpg (100%) rename {doc => docs}/SprintReview/2016-11-14/S3.xcf (100%) rename {doc => docs}/SprintReview/2016-11-14/SprintReview20161114.pdf (100%) rename {doc => docs}/SprintReview/2016-11-14/SprintReview20161114.tex (100%) rename {doc => docs}/TESTS.md (98%) rename {doc => docs}/TSV.md (100%) rename {doc => docs}/USE_CASES.md (100%) rename {doc => docs}/Various/tumor_genes.bed (100%) rename {doc => docs}/images/CAW_icon.png (100%) rename {doc => docs}/images/CAW_icon.svg (100%) rename {doc => docs}/images/CAW_logo.png (100%) rename {doc => docs}/images/CAW_logo.svg (100%) rename {doc => docs}/images/CPU_usage.pdf (100%) rename {doc => docs}/images/CPU_usage.png (100%) rename {doc => docs}/images/CPU_usage.svg (100%) rename {doc => docs}/images/General_schema.graphml (100%) rename {doc => docs}/images/GitHub.QR.png (100%) rename {doc => docs}/images/NBIS_logo.png 
(100%) rename {doc => docs}/images/NGI_logo.png (100%) rename {doc => docs}/images/Preprocessing_bubble.graphml (100%) rename {doc => docs}/images/Preprocessing_bubble.jpg (100%) rename {doc => docs}/images/SNV_indel_bubble.graphml (100%) rename {doc => docs}/images/SNV_indel_bubble.jpg (100%) rename {doc => docs}/images/SV_bubble.graphml (100%) rename {doc => docs}/images/SV_bubble.jpg (100%) rename {doc => docs}/images/Sarek_germline_icon.png (100%) rename {doc => docs}/images/Sarek_germline_logo.png (100%) rename {doc => docs}/images/Sarek_icon.png (100%) rename {doc => docs}/images/Sarek_icon.svg (100%) rename {doc => docs}/images/Sarek_logo.png (100%) rename {doc => docs}/images/Sarek_logo.svg (100%) rename {doc => docs}/images/Sarek_no_Border.png (100%) rename {doc => docs}/images/Sarek_somatic_icon.png (100%) rename {doc => docs}/images/Sarek_somatic_logo.png (100%) rename {doc => docs}/images/Sarek_workflow.pdf (100%) rename {doc => docs}/images/Sarek_workflow.png (100%) rename {doc => docs}/images/Sarek_workflow.svg (100%) rename {doc => docs}/images/SciLifeLab_logo.png (100%) rename {doc => docs}/images/ascat.graphml (100%) rename {doc => docs}/images/ascat.jpg (100%) rename {doc => docs}/images/folder_structure.graphml (100%) rename {doc => docs}/images/folder_structure.jpg (100%) rename {doc => docs}/images/preprocessing_simplified.graphml (100%) rename {doc => docs}/images/preprocessing_simplified.jpg (100%) rename {doc => docs}/images/workflow_schema.graphml (100%) rename doc/USAGE.md => docs/usage.md (100%) rename containers/sarek/environment.yml => environment.yml (100%) diff --git a/PULL_REQUEST_TEMPLATE.md b/.github/PULL_REQUEST_TEMPLATE.md similarity index 100% rename from PULL_REQUEST_TEMPLATE.md rename to .github/PULL_REQUEST_TEMPLATE.md diff --git a/containers/sarek/Dockerfile b/Dockerfile similarity index 100% rename from containers/sarek/Dockerfile rename to Dockerfile diff --git a/configuration/aws-batch.config b/conf/aws-batch.config 
similarity index 100% rename from configuration/aws-batch.config rename to conf/aws-batch.config diff --git a/configuration/base.config b/conf/base.config similarity index 99% rename from configuration/base.config rename to conf/base.config index 5f2e145408..64aff09cf5 100644 --- a/configuration/base.config +++ b/conf/base.config @@ -50,7 +50,7 @@ params { } process { - $ConcatVCF { + withName:ConcatVCF { // For unknown reasons, ConcatVCF sometimes fails with SIGPIPE // (exit code 141). Rerunning the process will usually work. errorStrategy = {task.exitStatus == 141 ? 'retry' : 'terminate'} diff --git a/configuration/binac.config b/conf/binac.config similarity index 100% rename from configuration/binac.config rename to conf/binac.config diff --git a/configuration/containers.config b/conf/containers.config similarity index 100% rename from configuration/containers.config rename to conf/containers.config diff --git a/configuration/docker.config b/conf/docker.config similarity index 100% rename from configuration/docker.config rename to conf/docker.config diff --git a/configuration/genomes.config b/conf/genomes.config similarity index 100% rename from configuration/genomes.config rename to conf/genomes.config diff --git a/configuration/singularity-path.config b/conf/singularity-path.config similarity index 100% rename from configuration/singularity-path.config rename to conf/singularity-path.config diff --git a/configuration/singularity.config b/conf/singularity.config similarity index 100% rename from configuration/singularity.config rename to conf/singularity.config diff --git a/configuration/travis.config b/conf/travis.config similarity index 100% rename from configuration/travis.config rename to conf/travis.config diff --git a/configuration/uppmax-localhost.config b/conf/uppmax-localhost.config similarity index 100% rename from configuration/uppmax-localhost.config rename to conf/uppmax-localhost.config diff --git a/configuration/uppmax-slurm.config 
b/conf/uppmax-slurm.config similarity index 100% rename from configuration/uppmax-slurm.config rename to conf/uppmax-slurm.config diff --git a/containers/gatk/Dockerfile b/containers/gatk/Dockerfile deleted file mode 100644 index ae8ad2ed0c..0000000000 --- a/containers/gatk/Dockerfile +++ /dev/null @@ -1,8 +0,0 @@ -FROM broadinstitute/gatk3:3.8-0 - -LABEL \ - author="Maxime Garcia" \ - description="GATK image for use in Sarek" \ - maintainer="maxime.garcia@scilifelab.se" - -ENV GATK_HOME=/usr diff --git a/containers/gatk4/Dockerfile b/containers/gatk4/Dockerfile deleted file mode 100644 index ad3cfc8aef..0000000000 --- a/containers/gatk4/Dockerfile +++ /dev/null @@ -1,15 +0,0 @@ -FROM nfcore/base:latest - -LABEL \ - author="Maxime Garcia" \ - description="GATK4 Image for Sarek" \ - maintainer="maxime.garcia@scilifelab.se" - -COPY environment.yml / - -RUN \ - conda env create -f /environment.yml && \ - conda clean -a - - # Export PATH -ENV PATH /opt/conda/envs/sarek-gatk4-2.0/bin:$PATH diff --git a/containers/gatk4/environment.yml b/containers/gatk4/environment.yml deleted file mode 100644 index 041d2a2b92..0000000000 --- a/containers/gatk4/environment.yml +++ /dev/null @@ -1,9 +0,0 @@ -# You can use this file to create a conda environment: -# conda env create -f environment.yml -name: sarek-gatk4-2.0 -channels: - - bioconda - - conda-forge - - defaults -dependencies: - - gatk4=4.0.4.0 diff --git a/containers/igvtools/Dockerfile b/containers/igvtools/Dockerfile deleted file mode 100644 index 1629b64cf0..0000000000 --- a/containers/igvtools/Dockerfile +++ /dev/null @@ -1,25 +0,0 @@ -FROM openjdk:8-slim - -LABEL \ - author="Maxime Garcia" \ - description="IGVtools 2.3.98 image for use in Sarek" \ - maintainer="maxime.garcia@scilifelab.se" - -# Install libraries -RUN \ - apt-get update && apt-get install -y --no-install-recommends \ - wget \ - && rm -rf /var/lib/apt/lists/* - -# Setup ENV variables -ENV \ - IGVTOOLS_HOME=/opt/IGVTools \ - IGVTOOLS_VERSION=2.3.98 - -# 
Install IGVTools -RUN \ - wget --quiet -O igvtools_${IGVTOOLS_VERSION}.zip \ - http://data.broadinstitute.org/igv/projects/downloads/2.3/igvtools_${IGVTOOLS_VERSION}.zip \ - && unzip igvtools_${IGVTOOLS_VERSION}.zip \ - && rm igvtools_${IGVTOOLS_VERSION}.zip \ - && mv IGVTools $IGVTOOLS_HOME diff --git a/containers/picard/Dockerfile b/containers/picard/Dockerfile deleted file mode 100644 index 8a558102d5..0000000000 --- a/containers/picard/Dockerfile +++ /dev/null @@ -1,25 +0,0 @@ -FROM openjdk:8-slim - -LABEL \ - author="Maxime Garcia" \ - description="Picard image for use in Sarek" \ - maintainer="maxime.garcia@scilifelab.se" - -# Install libraries -RUN \ - apt-get update && apt-get install -y --no-install-recommends \ - wget \ - && rm -rf /var/lib/apt/lists/* - -# Setup ENV variables -ENV \ - PICARD_HOME=/opt/picard \ - PICARD_VERSION=2.0.1 - -# Install PicardTools -RUN \ - wget --quiet -O picard-tools-${PICARD_VERSION}.zip \ - https://github.com/broadinstitute/picard/releases/download/${PICARD_VERSION}/picard-tools-${PICARD_VERSION}.zip \ - && unzip picard-tools-${PICARD_VERSION}.zip \ - && mv picard-tools-${PICARD_VERSION} ${PICARD_HOME} \ - && rm picard-tools-${PICARD_VERSION}.zip diff --git a/containers/qctools/Dockerfile b/containers/qctools/Dockerfile deleted file mode 100644 index 4b018cf723..0000000000 --- a/containers/qctools/Dockerfile +++ /dev/null @@ -1,15 +0,0 @@ -FROM nfcore/base:latest - -LABEL \ - author="Maxime Garcia" \ - description="Image with QC tools used in Sarek" \ - maintainer="maxime.garcia@scilifelab.se" - -COPY environment.yml / - -RUN \ - conda env create -f /environment.yml && \ - conda clean -a - - # Export PATH -ENV PATH /opt/conda/envs/sarek-qctools-2.0/bin:$PATH diff --git a/containers/qctools/environment.yml b/containers/qctools/environment.yml deleted file mode 100644 index d8c969924e..0000000000 --- a/containers/qctools/environment.yml +++ /dev/null @@ -1,13 +0,0 @@ -# You can use this file to create a conda environment: -# 
conda env create -f environment.yml -name: sarek-qctools-2.0 -channels: - - bioconda - - conda-forge - - defaults -dependencies: - - conda-forge::openjdk=8.0.144 - - fastqc=0.11.7 - - multiqc=1.5 - - qualimap=2.2.2a - - vcftools=0.1.15 diff --git a/doc/ASCAT.md b/docs/ASCAT.md similarity index 100% rename from doc/ASCAT.md rename to docs/ASCAT.md diff --git a/doc/Abstracts/2016-09-KICR.md b/docs/Abstracts/2016-09-KICR.md similarity index 100% rename from doc/Abstracts/2016-09-KICR.md rename to docs/Abstracts/2016-09-KICR.md diff --git a/doc/Abstracts/2017-05-ESHG.md b/docs/Abstracts/2017-05-ESHG.md similarity index 100% rename from doc/Abstracts/2017-05-ESHG.md rename to docs/Abstracts/2017-05-ESHG.md diff --git a/doc/Abstracts/2018-05-PMC.md b/docs/Abstracts/2018-05-PMC.md similarity index 100% rename from doc/Abstracts/2018-05-PMC.md rename to docs/Abstracts/2018-05-PMC.md diff --git a/doc/Abstracts/2018-06-EACR25.md b/docs/Abstracts/2018-06-EACR25.md similarity index 100% rename from doc/Abstracts/2018-06-EACR25.md rename to docs/Abstracts/2018-06-EACR25.md diff --git a/doc/Abstracts/2018-06-NPMI.md b/docs/Abstracts/2018-06-NPMI.md similarity index 100% rename from doc/Abstracts/2018-06-NPMI.md rename to docs/Abstracts/2018-06-NPMI.md diff --git a/doc/Abstracts/2018-07-JOBIM.md b/docs/Abstracts/2018-07-JOBIM.md similarity index 100% rename from doc/Abstracts/2018-07-JOBIM.md rename to docs/Abstracts/2018-07-JOBIM.md diff --git a/doc/BUILD.md b/docs/BUILD.md similarity index 100% rename from doc/BUILD.md rename to docs/BUILD.md diff --git a/doc/CONFIG.md b/docs/CONFIG.md similarity index 100% rename from doc/CONFIG.md rename to docs/CONFIG.md diff --git a/doc/CONTAINERS.md b/docs/CONTAINERS.md similarity index 57% rename from doc/CONTAINERS.md rename to docs/CONTAINERS.md index e6027acb22..3ffd0916ca 100644 --- a/doc/CONTAINERS.md +++ b/docs/CONTAINERS.md @@ -2,18 +2,7 @@ Subsets of all containers can be dowloaded: -For processing + germline variant calling + 
Reports: - - [gatk](#gatk-) - - [picard](#picard-) - - [sarek](#sarek-) - -For processing + somatic variant calling + Reports: - - [freebayes](#freebayes-) - - [gatk](#gatk-) - - [mutect1](#mutect1-) - - [picard](#picard-) - - [r-base](#r-base-) - - [runallelecount](#runallelecount-) +For processing, germline and somatic variant calling and Reports: - [sarek](#sarek-) For annotation for GRCh37, you will need: @@ -24,40 +13,10 @@ For annotation for GRCh38, you will need: - [snpeffgrch38](#snpeffgrch38-) - [vepgrch38](#vepgrch38-) -A container named after the process is made for each process. If a container can be reused, it will be named after the tool used. - -## freebayes [![freebayes-docker status][freebayes-docker-badge]][freebayes-docker-link] - -- Based on `debian:8.6` -- Contain **[FreeBayes][freebayes-link]** 1.1.0 - -## gatk [![gatk-docker status][gatk-docker-badge]][gatk-docker-link] - -- Based on `broadinstitute/gatk3:3.8-0` -- Contain **[GATK][gatk-link]** 3.8 - -## igvtools [![igvtools-docker status][igvtools-docker-badge]][igvtools-docker-link] - -- Based on `openjdk:8-slim` -- Contain **[IGVTools][igvtools-link]** 2.3.98 - -## mutect1 [![mutect1-docker status][mutect1-docker-badge]][mutect1-docker-link] - -- Based on `openjdk:7-slim` -- Contain **[MuTect1][mutect1-link]** 1.5 - -## picard [![picard-docker status][picard-docker-badge]][picard-docker-link] - -- Based on `openjdk:8-slim` -- Contain **[Picard][picard-link]** 2.0.1 - -## qctools [![qctools-docker status][qctools-docker-badge]][qctools-docker-link] +## r-base [![r-base-docker status][r-base-docker-badge]][r-base-docker-link] -- Based on `nfcore/base:latest` -- Contain **[FastQC][fastqc-link]** 0.11.7 -- Contain **[MultiQC][multiqc-link]** 1.5 -- Contain **[qualimap][qualimap-link]** 2.2.1 -- Contain **[vcftools][vcftools-link]** 0.1.15 + - Based on `debian:8.9` + - Contain **[AlleleCount][allelecount-link]** 2.2.0 ## runallelecount [![runallelecount-docker 
status][runallelecount-docker-badge]][runallelecount-docker-link] @@ -111,39 +70,23 @@ A container named after the process is made for each process. If a container can [allelecount-link]: https://github.com/cancerit/alleleCount [bcftools-link]: https://github.com/samtools/bcftools [bwa-link]: https://github.com/lh3/bwa -[fastqc-docker-badge]: https://img.shields.io/docker/automated/maxulysse/fastqc.svg -[fastqc-docker-link]: https://hub.docker.com/r/maxulysse/fastqc [fastqc-link]: http://www.bioinformatics.babraham.ac.uk/projects/fastqc/ -[freebayes-docker-badge]: https://img.shields.io/docker/automated/maxulysse/freebayes.svg -[freebayes-docker-link]: https://hub.docker.com/r/maxulysse/freebayes [freebayes-link]: https://github.com/ekg/freebayes -[gatk-docker-badge]: https://img.shields.io/docker/automated/maxulysse/gatk.svg -[gatk-docker-link]: https://hub.docker.com/r/maxulysse/gatk [gatk-link]: https://github.com/broadgsa/gatk-protected [htslib-link]: https://github.com/samtools/htslib -[igvtools-docker-badge]: https://img.shields.io/docker/automated/maxulysse/igvtools.svg -[igvtools-docker-link]: https://hub.docker.com/r/maxulysse/igvtools [igvtools-link]: http://software.broadinstitute.org/software/igv/ [manta-link]: https://github.com/Illumina/manta -[multiqc-docker-badge]: https://img.shields.io/docker/automated/maxulysse/multiqc.svg -[multiqc-docker-link]: https://hub.docker.com/r/maxulysse/multiqc [multiqc-link]: https://github.com/ewels/MultiQC/ -[mutect1-docker-badge]: https://img.shields.io/docker/automated/maxulysse/mutect1.svg -[mutect1-docker-link]: https://hub.docker.com/r/maxulysse/mutect1 [mutect1-link]: https://github.com/broadinstitute/mutect [nbis-link]: https://www.nbis.se/ [ngi-link]: https://ngisweden.scilifelab.se/ -[picard-docker-badge]: https://img.shields.io/docker/automated/maxulysse/picard.svg -[picard-docker-link]: https://hub.docker.com/r/maxulysse/picard [picard-link]: https://github.com/broadinstitute/picard 
-[qctools-docker-badge]: https://img.shields.io/docker/automated/maxulysse/qctools.svg -[qctools-docker-link]: https://hub.docker.com/r/maxulysse/qctools [qualimap-link]: http://qualimap.bioinfo.cipf.es [rcolorbrewer-link]: https://CRAN.R-project.org/package=RColorBrewer [runallelecount-docker-badge]: https://img.shields.io/docker/automated/maxulysse/runallelecount.svg [runallelecount-docker-link]: https://hub.docker.com/r/maxulysse/runallelecount -[runascat-docker-badge]: https://img.shields.io/docker/automated/maxulysse/runascat.svg -[runascat-docker-link]: https://hub.docker.com/r/maxulysse/runascat +[r-base-docker-badge]: https://img.shields.io/docker/automated/maxulysse/r-base.svg +[r-base-docker-link]: https://hub.docker.com/r/maxulysse/r-base [samtools-link]: https://github.com/samtools/samtools [sarek-docker-badge]: https://img.shields.io/docker/automated/maxulysse/sarek.svg [sarek-docker-link]: https://hub.docker.com/r/maxulysse/sarek @@ -156,11 +99,7 @@ A container named after the process is made for each process. 
If a container can [snpeffgrch38-docker-badge]: https://img.shields.io/docker/automated/maxulysse/snpeffgrch38.svg [snpeffgrch38-docker-link]: https://hub.docker.com/r/maxulysse/snpeffgrch38 [strelka-link]: https://github.com/Illumina/strelka -[vcftools-docker-badge]: https://img.shields.io/docker/automated/maxulysse/vcftools.svg -[vcftools-docker-link]: https://hub.docker.com/r/maxulysse/vcftools [vcftools-link]: https://vcftools.github.io/index.html -[vep-docker-badge]: https://img.shields.io/docker/automated/maxulysse/vep.svg -[vep-docker-link]: https://hub.docker.com/r/maxulysse/vep [vep-link]: https://github.com/Ensembl/ensembl-vep [vepgrch37-docker-badge]: https://img.shields.io/docker/automated/maxulysse/vepgrch37.svg [vepgrch37-docker-link]: https://hub.docker.com/r/maxulysse/vepgrch37 diff --git a/doc/FOLDER.md b/docs/FOLDER.md similarity index 100% rename from doc/FOLDER.md rename to docs/FOLDER.md diff --git a/doc/INSTALL.md b/docs/INSTALL.md similarity index 97% rename from doc/INSTALL.md rename to docs/INSTALL.md index 4dccd9a327..4aafadfe89 100644 --- a/doc/INSTALL.md +++ b/docs/INSTALL.md @@ -26,13 +26,13 @@ export NXF_SINGULARITY_CACHEDIR=$HOME/.singularity Docker can also be used as a container technology. 
-You can [Test Sarek with small dataset and small reference](https://github.com/SciLifeLab/Sarek/blob/master/doc/TESTS.md) +You can [Test Sarek with small dataset and small reference](https://github.com/SciLifeLab/Sarek/blob/master/docs/TESTS.md) ## Update To update Sarek, it's also very simple: - + ```bash # Connect to your system > ssh -AX [USER]@[system]REFERENCES diff --git a/doc/INSTALL_BIANCA.md b/docs/INSTALL_BIANCA.md similarity index 100% rename from doc/INSTALL_BIANCA.md rename to docs/INSTALL_BIANCA.md diff --git a/doc/INSTALL_RACKHAM.md b/docs/INSTALL_RACKHAM.md similarity index 100% rename from doc/INSTALL_RACKHAM.md rename to docs/INSTALL_RACKHAM.md diff --git a/doc/INTERVALS.md b/docs/INTERVALS.md similarity index 100% rename from doc/INTERVALS.md rename to docs/INTERVALS.md diff --git a/doc/PROCESS.md b/docs/PROCESS.md similarity index 100% rename from doc/PROCESS.md rename to docs/PROCESS.md diff --git a/doc/Posters/ESHG_2017_Mgarcia.pdf b/docs/Posters/ESHG_2017_Mgarcia.pdf similarity index 100% rename from doc/Posters/ESHG_2017_Mgarcia.pdf rename to docs/Posters/ESHG_2017_Mgarcia.pdf diff --git a/doc/Posters/ESHG_2017_Mgarcia.svg b/docs/Posters/ESHG_2017_Mgarcia.svg similarity index 100% rename from doc/Posters/ESHG_2017_Mgarcia.svg rename to docs/Posters/ESHG_2017_Mgarcia.svg diff --git a/doc/Posters/PMC_2018_Mgarcia.pdf b/docs/Posters/PMC_2018_Mgarcia.pdf similarity index 100% rename from doc/Posters/PMC_2018_Mgarcia.pdf rename to docs/Posters/PMC_2018_Mgarcia.pdf diff --git a/doc/Posters/PMC_2018_Mgarcia.svg b/docs/Posters/PMC_2018_Mgarcia.svg similarity index 100% rename from doc/Posters/PMC_2018_Mgarcia.svg rename to docs/Posters/PMC_2018_Mgarcia.svg diff --git a/doc/Presentations/2018-07-04-MGarcia-JOBIM.pdf b/docs/Presentations/2018-07-04-MGarcia-JOBIM.pdf similarity index 100% rename from doc/Presentations/2018-07-04-MGarcia-JOBIM.pdf rename to docs/Presentations/2018-07-04-MGarcia-JOBIM.pdf diff --git a/docs/README.md b/docs/README.md new 
file mode 100644 index 0000000000..628ebeccac --- /dev/null +++ b/docs/README.md @@ -0,0 +1,19 @@ +## Documentation + +The Sarek pipeline comes with the following documentation: + +01. [Installation documentation](https://github.com/SciLifeLab/Sarek/blob/master/docs/INSTALL.md) +02. [Installation documentation specific for UPPMAX `rackham`](https://github.com/SciLifeLab/Sarek/blob/master/docs/INSTALL_RACKHAM.md) +03. [Installation documentation specific for UPPMAX `bianca`](https://github.com/SciLifeLab/Sarek/blob/master/docs/INSTALL_BIANCA.md) +04. [Tests documentation](https://github.com/SciLifeLab/Sarek/blob/master/docs/TESTS.md) +05. [Reference files documentation](https://github.com/SciLifeLab/Sarek/blob/master/docs/REFERENCES.md) +06. [Configuration and profiles documentation](https://github.com/SciLifeLab/Sarek/blob/master/docs/CONFIG.md) +07. [Intervals documentation](https://github.com/SciLifeLab/Sarek/blob/master/docs/INTERVALS.md) +08. [Running the pipeline](https://github.com/SciLifeLab/Sarek/blob/master/docs/USAGE.md) +09. [Examples](https://github.com/SciLifeLab/Sarek/blob/master/docs/USE_CASES.md) +10. [TSV file documentation](https://github.com/SciLifeLab/Sarek/blob/master/docs/TSV.md) +11. [Processes documentation](https://github.com/SciLifeLab/Sarek/blob/master/docs/PROCESS.md) +12. [Documentation about containers](https://github.com/SciLifeLab/Sarek/blob/master/docs/CONTAINERS.md) +13. [Documentation about building](https://github.com/SciLifeLab/Sarek/blob/master/docs/BUILD.md) +14. [More information about ASCAT](https://github.com/SciLifeLab/Sarek/blob/master/docs/ASCAT.md) +15. 
[Folder structure](https://github.com/SciLifeLab/Sarek/blob/master/docs/FOLDER.md) diff --git a/doc/REFERENCES.md b/docs/REFERENCES.md similarity index 99% rename from doc/REFERENCES.md rename to docs/REFERENCES.md index cba49fd615..3a3fda85f7 100644 --- a/doc/REFERENCES.md +++ b/docs/REFERENCES.md @@ -21,7 +21,7 @@ The following files need to be downloaded: From our repo, get the [`intervals` list file](https://raw.githubusercontent.com/SciLifeLab/Sarek/master/repeats/wgs_calling_regions.grch37.list). More information about this file in the [intervals documentation](INTERVALS.md) -Description of how to generate the Loci file used in the ASCAT process is described [here](https://github.com/SciLifeLab/Sarek/blob/master/doc/ASCAT.md). +Description of how to generate the Loci file used in the ASCAT process is described [here](https://github.com/SciLifeLab/Sarek/blob/master/docs/ASCAT.md). You can create your own cosmic reference for any human reference as specified below. diff --git a/doc/SELECTROI.md b/docs/SELECTROI.md similarity index 100% rename from doc/SELECTROI.md rename to docs/SELECTROI.md diff --git a/doc/SprintReview/2016-08-18/ASCAT short.pptx b/docs/SprintReview/2016-08-18/ASCAT short.pptx similarity index 100% rename from doc/SprintReview/2016-08-18/ASCAT short.pptx rename to docs/SprintReview/2016-08-18/ASCAT short.pptx diff --git a/doc/SprintReview/2016-08-18/General_schema.png b/docs/SprintReview/2016-08-18/General_schema.png similarity index 100% rename from doc/SprintReview/2016-08-18/General_schema.png rename to docs/SprintReview/2016-08-18/General_schema.png diff --git a/doc/SprintReview/2016-08-18/GitHub_contribution.png b/docs/SprintReview/2016-08-18/GitHub_contribution.png similarity index 100% rename from doc/SprintReview/2016-08-18/GitHub_contribution.png rename to docs/SprintReview/2016-08-18/GitHub_contribution.png diff --git a/doc/SprintReview/2016-08-18/SprintReview20160818.tex b/docs/SprintReview/2016-08-18/SprintReview20160818.tex 
similarity index 100% rename from doc/SprintReview/2016-08-18/SprintReview20160818.tex rename to docs/SprintReview/2016-08-18/SprintReview20160818.tex diff --git a/doc/SprintReview/2016-08-18/VCs_integrated.png b/docs/SprintReview/2016-08-18/VCs_integrated.png similarity index 100% rename from doc/SprintReview/2016-08-18/VCs_integrated.png rename to docs/SprintReview/2016-08-18/VCs_integrated.png diff --git a/doc/SprintReview/2016-08-18/dogEating.jpg b/docs/SprintReview/2016-08-18/dogEating.jpg similarity index 100% rename from doc/SprintReview/2016-08-18/dogEating.jpg rename to docs/SprintReview/2016-08-18/dogEating.jpg diff --git a/doc/SprintReview/2016-08-18/memory_usage.png b/docs/SprintReview/2016-08-18/memory_usage.png similarity index 100% rename from doc/SprintReview/2016-08-18/memory_usage.png rename to docs/SprintReview/2016-08-18/memory_usage.png diff --git a/doc/SprintReview/2016-08-18/multi-megabase-phase-blocks-large.jpg b/docs/SprintReview/2016-08-18/multi-megabase-phase-blocks-large.jpg similarity index 100% rename from doc/SprintReview/2016-08-18/multi-megabase-phase-blocks-large.jpg rename to docs/SprintReview/2016-08-18/multi-megabase-phase-blocks-large.jpg diff --git a/doc/SprintReview/2016-08-18/wall_clock.png b/docs/SprintReview/2016-08-18/wall_clock.png similarity index 100% rename from doc/SprintReview/2016-08-18/wall_clock.png rename to docs/SprintReview/2016-08-18/wall_clock.png diff --git a/doc/SprintReview/2016-10-13/Irma_timings.png b/docs/SprintReview/2016-10-13/Irma_timings.png similarity index 100% rename from doc/SprintReview/2016-10-13/Irma_timings.png rename to docs/SprintReview/2016-10-13/Irma_timings.png diff --git a/doc/SprintReview/2016-10-13/SprintReview20161013.tex b/docs/SprintReview/2016-10-13/SprintReview20161013.tex similarity index 100% rename from doc/SprintReview/2016-10-13/SprintReview20161013.tex rename to docs/SprintReview/2016-10-13/SprintReview20161013.tex diff --git a/doc/SprintReview/2016-11-14/DREAM_Clones.png 
b/docs/SprintReview/2016-11-14/DREAM_Clones.png similarity index 100% rename from doc/SprintReview/2016-11-14/DREAM_Clones.png rename to docs/SprintReview/2016-11-14/DREAM_Clones.png diff --git a/doc/SprintReview/2016-11-14/DREAM_SNPs.png b/docs/SprintReview/2016-11-14/DREAM_SNPs.png similarity index 100% rename from doc/SprintReview/2016-11-14/DREAM_SNPs.png rename to docs/SprintReview/2016-11-14/DREAM_SNPs.png diff --git a/doc/SprintReview/2016-11-14/DREAM_SVs.png b/docs/SprintReview/2016-11-14/DREAM_SVs.png similarity index 100% rename from doc/SprintReview/2016-11-14/DREAM_SVs.png rename to docs/SprintReview/2016-11-14/DREAM_SVs.png diff --git a/doc/SprintReview/2016-11-14/DREAM_indels.png b/docs/SprintReview/2016-11-14/DREAM_indels.png similarity index 100% rename from doc/SprintReview/2016-11-14/DREAM_indels.png rename to docs/SprintReview/2016-11-14/DREAM_indels.png diff --git a/doc/SprintReview/2016-11-14/HCC1143_purity.png b/docs/SprintReview/2016-11-14/HCC1143_purity.png similarity index 100% rename from doc/SprintReview/2016-11-14/HCC1143_purity.png rename to docs/SprintReview/2016-11-14/HCC1143_purity.png diff --git a/doc/SprintReview/2016-11-14/HCC1143_subclones.png b/docs/SprintReview/2016-11-14/HCC1143_subclones.png similarity index 100% rename from doc/SprintReview/2016-11-14/HCC1143_subclones.png rename to docs/SprintReview/2016-11-14/HCC1143_subclones.png diff --git a/doc/SprintReview/2016-11-14/HCC1954_purity.png b/docs/SprintReview/2016-11-14/HCC1954_purity.png similarity index 100% rename from doc/SprintReview/2016-11-14/HCC1954_purity.png rename to docs/SprintReview/2016-11-14/HCC1954_purity.png diff --git a/doc/SprintReview/2016-11-14/HCC1954_subclones.png b/docs/SprintReview/2016-11-14/HCC1954_subclones.png similarity index 100% rename from doc/SprintReview/2016-11-14/HCC1954_subclones.png rename to docs/SprintReview/2016-11-14/HCC1954_subclones.png diff --git a/doc/SprintReview/2016-11-14/MGarciaRefactoring.pdf 
b/docs/SprintReview/2016-11-14/MGarciaRefactoring.pdf similarity index 100% rename from doc/SprintReview/2016-11-14/MGarciaRefactoring.pdf rename to docs/SprintReview/2016-11-14/MGarciaRefactoring.pdf diff --git a/doc/SprintReview/2016-11-14/S1.jpg b/docs/SprintReview/2016-11-14/S1.jpg similarity index 100% rename from doc/SprintReview/2016-11-14/S1.jpg rename to docs/SprintReview/2016-11-14/S1.jpg diff --git a/doc/SprintReview/2016-11-14/S1.xcf b/docs/SprintReview/2016-11-14/S1.xcf similarity index 100% rename from doc/SprintReview/2016-11-14/S1.xcf rename to docs/SprintReview/2016-11-14/S1.xcf diff --git a/doc/SprintReview/2016-11-14/S2.jpg b/docs/SprintReview/2016-11-14/S2.jpg similarity index 100% rename from doc/SprintReview/2016-11-14/S2.jpg rename to docs/SprintReview/2016-11-14/S2.jpg diff --git a/doc/SprintReview/2016-11-14/S2.xcf b/docs/SprintReview/2016-11-14/S2.xcf similarity index 100% rename from doc/SprintReview/2016-11-14/S2.xcf rename to docs/SprintReview/2016-11-14/S2.xcf diff --git a/doc/SprintReview/2016-11-14/S3.jpg b/docs/SprintReview/2016-11-14/S3.jpg similarity index 100% rename from doc/SprintReview/2016-11-14/S3.jpg rename to docs/SprintReview/2016-11-14/S3.jpg diff --git a/doc/SprintReview/2016-11-14/S3.xcf b/docs/SprintReview/2016-11-14/S3.xcf similarity index 100% rename from doc/SprintReview/2016-11-14/S3.xcf rename to docs/SprintReview/2016-11-14/S3.xcf diff --git a/doc/SprintReview/2016-11-14/SprintReview20161114.pdf b/docs/SprintReview/2016-11-14/SprintReview20161114.pdf similarity index 100% rename from doc/SprintReview/2016-11-14/SprintReview20161114.pdf rename to docs/SprintReview/2016-11-14/SprintReview20161114.pdf diff --git a/doc/SprintReview/2016-11-14/SprintReview20161114.tex b/docs/SprintReview/2016-11-14/SprintReview20161114.tex similarity index 100% rename from doc/SprintReview/2016-11-14/SprintReview20161114.tex rename to docs/SprintReview/2016-11-14/SprintReview20161114.tex diff --git a/doc/TESTS.md b/docs/TESTS.md 
similarity index 98% rename from doc/TESTS.md rename to docs/TESTS.md index 61756256d1..aabc59d6d0 100644 --- a/doc/TESTS.md +++ b/docs/TESTS.md @@ -77,7 +77,7 @@ nextflow run runMultiQC.nf -profile singularity ## Testing on a secure cluster On a secure cluster as bianca, with no internet access, you will need to download and transfer Sarek and the test data first. -Follow the [installation guide for `bianca`](https://github.com/SciLifeLab/Sarek/blob/master/doc/INSTALL_BIANCA.md). +Follow the [installation guide for `bianca`](https://github.com/SciLifeLab/Sarek/blob/master/docs/INSTALL_BIANCA.md). And then start the test at the `Build the references for the test data` step. diff --git a/doc/TSV.md b/docs/TSV.md similarity index 100% rename from doc/TSV.md rename to docs/TSV.md diff --git a/doc/USE_CASES.md b/docs/USE_CASES.md similarity index 100% rename from doc/USE_CASES.md rename to docs/USE_CASES.md diff --git a/doc/Various/tumor_genes.bed b/docs/Various/tumor_genes.bed similarity index 100% rename from doc/Various/tumor_genes.bed rename to docs/Various/tumor_genes.bed diff --git a/doc/images/CAW_icon.png b/docs/images/CAW_icon.png similarity index 100% rename from doc/images/CAW_icon.png rename to docs/images/CAW_icon.png diff --git a/doc/images/CAW_icon.svg b/docs/images/CAW_icon.svg similarity index 100% rename from doc/images/CAW_icon.svg rename to docs/images/CAW_icon.svg diff --git a/doc/images/CAW_logo.png b/docs/images/CAW_logo.png similarity index 100% rename from doc/images/CAW_logo.png rename to docs/images/CAW_logo.png diff --git a/doc/images/CAW_logo.svg b/docs/images/CAW_logo.svg similarity index 100% rename from doc/images/CAW_logo.svg rename to docs/images/CAW_logo.svg diff --git a/doc/images/CPU_usage.pdf b/docs/images/CPU_usage.pdf similarity index 100% rename from doc/images/CPU_usage.pdf rename to docs/images/CPU_usage.pdf diff --git a/doc/images/CPU_usage.png b/docs/images/CPU_usage.png similarity index 100% rename from 
doc/images/CPU_usage.png rename to docs/images/CPU_usage.png diff --git a/doc/images/CPU_usage.svg b/docs/images/CPU_usage.svg similarity index 100% rename from doc/images/CPU_usage.svg rename to docs/images/CPU_usage.svg diff --git a/doc/images/General_schema.graphml b/docs/images/General_schema.graphml similarity index 100% rename from doc/images/General_schema.graphml rename to docs/images/General_schema.graphml diff --git a/doc/images/GitHub.QR.png b/docs/images/GitHub.QR.png similarity index 100% rename from doc/images/GitHub.QR.png rename to docs/images/GitHub.QR.png diff --git a/doc/images/NBIS_logo.png b/docs/images/NBIS_logo.png similarity index 100% rename from doc/images/NBIS_logo.png rename to docs/images/NBIS_logo.png diff --git a/doc/images/NGI_logo.png b/docs/images/NGI_logo.png similarity index 100% rename from doc/images/NGI_logo.png rename to docs/images/NGI_logo.png diff --git a/doc/images/Preprocessing_bubble.graphml b/docs/images/Preprocessing_bubble.graphml similarity index 100% rename from doc/images/Preprocessing_bubble.graphml rename to docs/images/Preprocessing_bubble.graphml diff --git a/doc/images/Preprocessing_bubble.jpg b/docs/images/Preprocessing_bubble.jpg similarity index 100% rename from doc/images/Preprocessing_bubble.jpg rename to docs/images/Preprocessing_bubble.jpg diff --git a/doc/images/SNV_indel_bubble.graphml b/docs/images/SNV_indel_bubble.graphml similarity index 100% rename from doc/images/SNV_indel_bubble.graphml rename to docs/images/SNV_indel_bubble.graphml diff --git a/doc/images/SNV_indel_bubble.jpg b/docs/images/SNV_indel_bubble.jpg similarity index 100% rename from doc/images/SNV_indel_bubble.jpg rename to docs/images/SNV_indel_bubble.jpg diff --git a/doc/images/SV_bubble.graphml b/docs/images/SV_bubble.graphml similarity index 100% rename from doc/images/SV_bubble.graphml rename to docs/images/SV_bubble.graphml diff --git a/doc/images/SV_bubble.jpg b/docs/images/SV_bubble.jpg similarity index 100% rename from 
doc/images/SV_bubble.jpg rename to docs/images/SV_bubble.jpg diff --git a/doc/images/Sarek_germline_icon.png b/docs/images/Sarek_germline_icon.png similarity index 100% rename from doc/images/Sarek_germline_icon.png rename to docs/images/Sarek_germline_icon.png diff --git a/doc/images/Sarek_germline_logo.png b/docs/images/Sarek_germline_logo.png similarity index 100% rename from doc/images/Sarek_germline_logo.png rename to docs/images/Sarek_germline_logo.png diff --git a/doc/images/Sarek_icon.png b/docs/images/Sarek_icon.png similarity index 100% rename from doc/images/Sarek_icon.png rename to docs/images/Sarek_icon.png diff --git a/doc/images/Sarek_icon.svg b/docs/images/Sarek_icon.svg similarity index 100% rename from doc/images/Sarek_icon.svg rename to docs/images/Sarek_icon.svg diff --git a/doc/images/Sarek_logo.png b/docs/images/Sarek_logo.png similarity index 100% rename from doc/images/Sarek_logo.png rename to docs/images/Sarek_logo.png diff --git a/doc/images/Sarek_logo.svg b/docs/images/Sarek_logo.svg similarity index 100% rename from doc/images/Sarek_logo.svg rename to docs/images/Sarek_logo.svg diff --git a/doc/images/Sarek_no_Border.png b/docs/images/Sarek_no_Border.png similarity index 100% rename from doc/images/Sarek_no_Border.png rename to docs/images/Sarek_no_Border.png diff --git a/doc/images/Sarek_somatic_icon.png b/docs/images/Sarek_somatic_icon.png similarity index 100% rename from doc/images/Sarek_somatic_icon.png rename to docs/images/Sarek_somatic_icon.png diff --git a/doc/images/Sarek_somatic_logo.png b/docs/images/Sarek_somatic_logo.png similarity index 100% rename from doc/images/Sarek_somatic_logo.png rename to docs/images/Sarek_somatic_logo.png diff --git a/doc/images/Sarek_workflow.pdf b/docs/images/Sarek_workflow.pdf similarity index 100% rename from doc/images/Sarek_workflow.pdf rename to docs/images/Sarek_workflow.pdf diff --git a/doc/images/Sarek_workflow.png b/docs/images/Sarek_workflow.png similarity index 100% rename from 
doc/images/Sarek_workflow.png rename to docs/images/Sarek_workflow.png diff --git a/doc/images/Sarek_workflow.svg b/docs/images/Sarek_workflow.svg similarity index 100% rename from doc/images/Sarek_workflow.svg rename to docs/images/Sarek_workflow.svg diff --git a/doc/images/SciLifeLab_logo.png b/docs/images/SciLifeLab_logo.png similarity index 100% rename from doc/images/SciLifeLab_logo.png rename to docs/images/SciLifeLab_logo.png diff --git a/doc/images/ascat.graphml b/docs/images/ascat.graphml similarity index 100% rename from doc/images/ascat.graphml rename to docs/images/ascat.graphml diff --git a/doc/images/ascat.jpg b/docs/images/ascat.jpg similarity index 100% rename from doc/images/ascat.jpg rename to docs/images/ascat.jpg diff --git a/doc/images/folder_structure.graphml b/docs/images/folder_structure.graphml similarity index 100% rename from doc/images/folder_structure.graphml rename to docs/images/folder_structure.graphml diff --git a/doc/images/folder_structure.jpg b/docs/images/folder_structure.jpg similarity index 100% rename from doc/images/folder_structure.jpg rename to docs/images/folder_structure.jpg diff --git a/doc/images/preprocessing_simplified.graphml b/docs/images/preprocessing_simplified.graphml similarity index 100% rename from doc/images/preprocessing_simplified.graphml rename to docs/images/preprocessing_simplified.graphml diff --git a/doc/images/preprocessing_simplified.jpg b/docs/images/preprocessing_simplified.jpg similarity index 100% rename from doc/images/preprocessing_simplified.jpg rename to docs/images/preprocessing_simplified.jpg diff --git a/doc/images/workflow_schema.graphml b/docs/images/workflow_schema.graphml similarity index 100% rename from doc/images/workflow_schema.graphml rename to docs/images/workflow_schema.graphml diff --git a/doc/USAGE.md b/docs/usage.md similarity index 100% rename from doc/USAGE.md rename to docs/usage.md diff --git a/containers/sarek/environment.yml b/environment.yml similarity index 100% 
rename from containers/sarek/environment.yml rename to environment.yml From 6cd1ebab2b72df86d7488c6554827510fd7680e2 Mon Sep 17 00:00:00 2001 From: Maxime Garcia Date: Tue, 21 Aug 2018 10:20:19 +0200 Subject: [PATCH 02/25] update path to newly moved files --- README.md | 93 +++++++++++++++++++++++++++------------------- buildContainers.nf | 3 +- nextflow.config | 60 +++++++++++++++--------------- 3 files changed, 87 insertions(+), 69 deletions(-) diff --git a/README.md b/README.md index bae514464f..f8fabe09da 100644 --- a/README.md +++ b/README.md @@ -1,29 +1,40 @@ -# [![Sarek](https://raw.githubusercontent.com/SciLifeLab/Sarek/master/doc/images/Sarek_logo.png "Sarek")](http://opensource.scilifelab.se/projects/sarek/) +# [![Sarek](https://raw.githubusercontent.com/SciLifeLab/Sarek/master/docs/images/Sarek_logo.png "Sarek")](http://opensource.scilifelab.se/projects/sarek/) -#### An open-source analysis pipeline to detect germline or somatic variants from whole genome sequencing. +#### An open-source analysis pipeline to detect germline or somatic variants from whole genome sequencing -[![sarek version][version-badge]][version-link] -[![Travis status][travis-badge]][travis-link] -[![nextflow version][nextflow-badge]][nextflow-link] -[![License][license-badge]][license-link] +[![Nextflow version][nextflow-badge]][nextflow-link] +[![Travis build status][travis-badge]][travis-link] +[![Join the chat at [gitter](gitter-link)][gitter-badge]][gitter-link] + +[![MIT License][license-badge]][license-link] +[![Sarek version][version-badge]][version-link] [![DOI][zenodo-badge]][zenodo-link] -[![Join the chat at https://gitter.im/SciLifeLab/Sarek][gitter-badge]][gitter-link] + +[![Install with bioconda][bioconda-badge]][bioconda-link] +[![Docker Container available][docker-badge]][docker-link] ## Introduction - + -Previously known as the Cancer Analysis Workflow (CAW), Sarek is a workflow tool designed to run analyses on WGS data from regular samples or tumour / normal pairs, 
including relapse samples if required. +Previously known as the Cancer Analysis Workflow (CAW), +Sarek is a workflow designed to run analyses on WGS data from regular samples or tumour / normal pairs, including relapse samples if required. -It's built using [Nextflow][nextflow-link], a bioinformatics domain specific language for workflow building. Software dependencies are handled using [Docker](https://www.docker.com) or [Singularity](http://singularity.lbl.gov) - container technologies that provide excellent reproducibility and ease of use. Singularity has been designed specifically for high-performance computing environments. This means that although Sarek has been primarily designed for use with the Swedish [UPPMAX HPC systems](https://www.uppmax.uu.se), it should be able to run on any system that supports these two tools. +It's built using [Nextflow][nextflow-link], a bioinformatics domain specific language for workflow building. Software dependencies are handled using [Docker](https://www.docker.com) or [Singularity](http://singularity.lbl.gov) - container technologies that provide excellent reproducibility and ease of use. +Singularity has been designed specifically for high-performance computing environments. +This means that although Sarek has been primarily designed for use with the Swedish [UPPMAX HPC systems](https://www.uppmax.uu.se), it should be able to run on any system that supports these two tools. -Sarek was developed at the [National Genomics Infastructure][ngi-link] and [National Bioinformatics Infastructure Sweden][nbis-link] which are both platforms at [SciLifeLab][scilifelab-link]. It is listed on the [Elixir - Tools and Data Services Registry](https://bio.tools/Sarek). +Sarek was developed at the [National Genomics Infastructure][ngi-link] and [National Bioinformatics Infastructure Sweden][nbis-link] which are both platforms at [SciLifeLab][scilifelab-link]. 
+It is listed on the [Elixir - Tools and Data Services Registry](https://bio.tools/Sarek). ## Workflow steps -Sarek is built with several workflow scripts. A wrapper script contained within the repository makes it easy to run the different workflow scripts as a single job. To test your installation, follow the [tests documentation.](https://github.com/SciLifeLab/Sarek/blob/master/doc/TESTS.md) +Sarek is built with several workflow scripts. +A wrapper script contained within the repository makes it easy to run the different workflow scripts as a single job. +To test your installation, follow the [tests documentation.](https://github.com/SciLifeLab/Sarek/blob/master/docs/TESTS.md) -Raw FastQ files or aligned BAM files (with or without realignment & recalibration) can be used as inputs. You can choose which variant callers to use, plus the pipeline is capable of accommodating additional variant calling software or CNV callers if required. +Raw FastQ files or aligned BAM files (with or without realignment & recalibration) can be used as inputs. +You can choose which variant callers to use, plus the pipeline is capable of accommodating additional variant calling software or CNV callers if required. The worflow steps and tools used are as follows: @@ -40,7 +51,6 @@ The worflow steps and tools used are as follows: * [Manta](https://github.com/Illumina/manta) 3. **Somatic variant calling** - `somaticVC.nf` _(optional)_ * SNVs and small indels - * [MuTect1](https://github.com/broadinstitute/mutect) * [MuTect2](https://github.com/broadgsa/gatk-protected) * [Freebayes](https://github.com/ekg/freebayes) * [Strelka](https://github.com/Illumina/strelka) @@ -58,23 +68,23 @@ The worflow steps and tools used are as follows: ## Documentation -The Sarek pipeline comes with documentation in the `doc/` directory: - -01. [Installation documentation](https://github.com/SciLifeLab/Sarek/blob/master/doc/INSTALL.md) -02. 
[Installation documentation specific for UPPMAX `rackham`](https://github.com/SciLifeLab/Sarek/blob/master/doc/INSTALL_RACKHAM.md) -03. [Installation documentation specific for UPPMAX `bianca`](https://github.com/SciLifeLab/Sarek/blob/master/doc/INSTALL_BIANCA.md) -04. [Tests documentation](https://github.com/SciLifeLab/Sarek/blob/master/doc/TESTS.md) -05. [Reference files documentation](https://github.com/SciLifeLab/Sarek/blob/master/doc/REFERENCES.md) -06. [Configuration and profiles documentation](https://github.com/SciLifeLab/Sarek/blob/master/doc/CONFIG.md) -07. [Intervals documentation](https://github.com/SciLifeLab/Sarek/blob/master/doc/INTERVALS.md) -08. [Running the pipeline](https://github.com/SciLifeLab/Sarek/blob/master/doc/USAGE.md) -09. [Examples](https://github.com/SciLifeLab/Sarek/blob/master/doc/USE_CASES.md) -10. [TSV file documentation](https://github.com/SciLifeLab/Sarek/blob/master/doc/TSV.md) -11. [Processes documentation](https://github.com/SciLifeLab/Sarek/blob/master/doc/PROCESS.md) -12. [Documentation about containers](https://github.com/SciLifeLab/Sarek/blob/master/doc/CONTAINERS.md) -13. [Documentation about building](https://github.com/SciLifeLab/Sarek/blob/master/doc/BUILD.md) -14. [More information about ASCAT](https://github.com/SciLifeLab/Sarek/blob/master/doc/ASCAT.md) -15. [Folder structure](https://github.com/SciLifeLab/Sarek/blob/master/doc/FOLDER.md) +The Sarek pipeline comes with documentation in the `docs/` directory: + +01. [Installation documentation](https://github.com/SciLifeLab/Sarek/blob/master/docs/INSTALL.md) +02. [Installation documentation specific for UPPMAX `rackham`](https://github.com/SciLifeLab/Sarek/blob/master/docs/INSTALL_RACKHAM.md) +03. [Installation documentation specific for UPPMAX `bianca`](https://github.com/SciLifeLab/Sarek/blob/master/docs/INSTALL_BIANCA.md) +04. [Tests documentation](https://github.com/SciLifeLab/Sarek/blob/master/docs/TESTS.md) +05. 
[Reference files documentation](https://github.com/SciLifeLab/Sarek/blob/master/docs/REFERENCES.md) +06. [Configuration and profiles documentation](https://github.com/SciLifeLab/Sarek/blob/master/docs/CONFIG.md) +07. [Intervals documentation](https://github.com/SciLifeLab/Sarek/blob/master/docs/INTERVALS.md) +08. [Running the pipeline](https://github.com/SciLifeLab/Sarek/blob/master/docs/USAGE.md) +09. [Examples](https://github.com/SciLifeLab/Sarek/blob/master/docs/USE_CASES.md) +10. [TSV file documentation](https://github.com/SciLifeLab/Sarek/blob/master/docs/TSV.md) +11. [Processes documentation](https://github.com/SciLifeLab/Sarek/blob/master/docs/PROCESS.md) +12. [Documentation about containers](https://github.com/SciLifeLab/Sarek/blob/master/docs/CONTAINERS.md) +13. [Documentation about building](https://github.com/SciLifeLab/Sarek/blob/master/docs/BUILD.md) +14. [More information about ASCAT](https://github.com/SciLifeLab/Sarek/blob/master/docs/ASCAT.md) +15. [Folder structure](https://github.com/SciLifeLab/Sarek/blob/master/docs/FOLDER.md) ## Contributions & Support @@ -86,13 +96,16 @@ For further information or help, don't hesitate to get in touch on [Gitter][gitt - [CHANGELOG](https://github.com/SciLifeLab/Sarek/blob/master/CHANGELOG.md) -## Authors +## Credits +Main authors: +* [Maxime Garcia](https://github.com/MaxUlysse) +* [Szilveszter Juhos](https://github.com/szilvajuhos) + +Helpful contributors: * [Sebastian DiLorenzo](https://github.com/Sebastian-D) * [Jesper Eisfeldt](https://github.com/J35P312) * [Phil Ewels](https://github.com/ewels) -* [Maxime Garcia](https://github.com/MaxUlysse) -* [Szilveszter Juhos](https://github.com/szilvajuhos) * [Max Käller](https://github.com/gulfshores) * [Malin Larsson](https://github.com/malinlarsson) * [Marcel Martin](https://github.com/marcelm) @@ -101,11 +114,15 @@ For further information or help, don't hesitate to get in touch on [Gitter][gitt 
-------------------------------------------------------------------------------- -[![SciLifeLab](https://raw.githubusercontent.com/SciLifeLab/Sarek/master/doc/images/SciLifeLab_logo.png "SciLifeLab")][scilifelab-link] -[![NGI](https://raw.githubusercontent.com/SciLifeLab/Sarek/master/doc/images/NGI_logo.png "NGI")][ngi-link] -[![NBIS](https://raw.githubusercontent.com/SciLifeLab/Sarek/master/doc/images/NBIS_logo.png "NBIS")][nbis-link] +[![SciLifeLab](https://raw.githubusercontent.com/SciLifeLab/Sarek/master/docs/images/SciLifeLab_logo.png "SciLifeLab")][scilifelab-link] +[![NGI](https://raw.githubusercontent.com/SciLifeLab/Sarek/master/docs/images/NGI_logo.png "NGI")][ngi-link] +[![NBIS](https://raw.githubusercontent.com/SciLifeLab/Sarek/master/docs/images/NBIS_logo.png "NBIS")][nbis-link] -[gitter-badge]: https://badges.gitter.im/SciLifeLab/Sarek.svg +[bioconda-badge]:https://img.shields.io/badge/install%20with-bioconda-brightgreen.svg +[bioconda-link]:http://bioconda.github.io/ +[docker-badge]: https://img.shields.io/docker/automated/maxulysse/sarek.svg +[docker-link]: https://hub.docker.com/r/maxulysse/sarek +[gitter-badge]: https://img.shields.io/badge/gitter-%20join%20chat%20%E2%86%92-4fb99a.svg [gitter-link]: https://gitter.im/SciLifeLab/Sarek [license-badge]: https://img.shields.io/github/license/SciLifeLab/Sarek.svg [license-link]: https://github.com/SciLifeLab/Sarek/blob/master/LICENSE diff --git a/buildContainers.nf b/buildContainers.nf index 92294ff432..b7f016d2d7 100644 --- a/buildContainers.nf +++ b/buildContainers.nf @@ -87,8 +87,9 @@ process BuildDockerContainers { when: params.docker script: + path = container == "sarek" ? "${baseDir}" : "${baseDir}/containers/${container}/." """ - docker build -t ${params.repository}/${container}:${params.tag} ${baseDir}/containers/${container}/. 
+ docker build -t ${params.repository}/${container}:${params.tag} ${path} """ } diff --git a/nextflow.config b/nextflow.config index 4f6196a9fe..4525c76ac0 100644 --- a/nextflow.config +++ b/nextflow.config @@ -22,66 +22,66 @@ profiles { // Runs the pipeline locally on a single 16-core node // Singularity images need to be set up standard { - includeConfig 'configuration/base.config' - includeConfig 'configuration/uppmax-localhost.config' - includeConfig 'configuration/singularity-path.config' + includeConfig 'conf/base.config' + includeConfig 'conf/uppmax-localhost.config' + includeConfig 'conf/singularity-path.config' } // slurm profile for UPPMAX secure clusters // Runs the pipeline using the job scheduler // Singularity images need to be set up slurm { - includeConfig 'configuration/base.config' - includeConfig 'configuration/uppmax-slurm.config' - includeConfig 'configuration/singularity-path.config' + includeConfig 'conf/base.config' + includeConfig 'conf/uppmax-slurm.config' + includeConfig 'conf/singularity-path.config' } // slurm profile for UPPMAX clusters // Runs the pipeline using the job scheduler // Singularity images will be pulled automatically slurmDownload { - includeConfig 'configuration/base.config' - includeConfig 'configuration/uppmax-slurm.config' - includeConfig 'configuration/singularity.config' - includeConfig 'configuration/containers.config' + includeConfig 'conf/base.config' + includeConfig 'conf/uppmax-slurm.config' + includeConfig 'conf/singularity.config' + includeConfig 'conf/containers.config' } // Small testing with Docker profile // Docker images will be pulled automatically docker { - includeConfig 'configuration/base.config' - includeConfig 'configuration/travis.config' - includeConfig 'configuration/docker.config' - includeConfig 'configuration/containers.config' + includeConfig 'conf/base.config' + includeConfig 'conf/travis.config' + includeConfig 'conf/docker.config' + includeConfig 'conf/containers.config' } // AWS Batch 
with Docker profile // Docker images will be pulled automatically awsbatch { - includeConfig 'configuration/base.config' - includeConfig 'configuration/aws-batch.config' - includeConfig 'configuration/docker.config' - includeConfig 'configuration/containers.config' + includeConfig 'conf/base.config' + includeConfig 'conf/aws-batch.config' + includeConfig 'conf/docker.config' + includeConfig 'conf/containers.config' } // Small testing with Singularity profile // Singularity images will be pulled automatically singularity { - includeConfig 'configuration/base.config' - includeConfig 'configuration/travis.config' - includeConfig 'configuration/singularity.config' - includeConfig 'configuration/containers.config' + includeConfig 'conf/base.config' + includeConfig 'conf/travis.config' + includeConfig 'conf/singularity.config' + includeConfig 'conf/containers.config' } // Small testing with Singularity profile // Singularity images need to be set up singularityPath { - includeConfig 'configuration/base.config' - includeConfig 'configuration/travis.config' - includeConfig 'configuration/singularity-path.config' + includeConfig 'conf/base.config' + includeConfig 'conf/travis.config' + includeConfig 'conf/singularity-path.config' } - // Default config for german BinAC cluster + // Default config for german BinAC cluster // Runs the pipeline using the pbs executor // Singularity images will be pulled automatically binac { - includeConfig 'configuration/base.config' - includeConfig 'configuration/binac.config' - includeConfig 'configuration/singularity.config' - includeConfig 'configuration/containers.config' + includeConfig 'conf/base.config' + includeConfig 'conf/binac.config' + includeConfig 'conf/singularity.config' + includeConfig 'conf/containers.config' } } From 1b2f51514e3f94a55aeced9e5b4472d56fdc6caa Mon Sep 17 00:00:00 2001 From: Maxime Garcia Date: Tue, 21 Aug 2018 10:24:22 +0200 Subject: [PATCH 03/25] fix gitter badge [skip ci] --- README.md | 2 +- 1 file changed, 
1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index f8fabe09da..a77ada24b5 100644 --- a/README.md +++ b/README.md @@ -4,7 +4,7 @@ [![Nextflow version][nextflow-badge]][nextflow-link] [![Travis build status][travis-badge]][travis-link] -[![Join the chat at [gitter](gitter-link)][gitter-badge]][gitter-link] +[![Join the chat on https://gitter.im/SciLifeLab/Sarek][gitter-badge]][gitter-link] [![MIT License][license-badge]][license-link] [![Sarek version][version-badge]][version-link] From b55862ce852d62c56a49c9f1a438724993fe7ac7 Mon Sep 17 00:00:00 2001 From: Maxime Garcia Date: Tue, 21 Aug 2018 10:47:18 +0200 Subject: [PATCH 04/25] update README [skip ci] --- README.md | 21 ++++++++++++--------- 1 file changed, 12 insertions(+), 9 deletions(-) diff --git a/README.md b/README.md index a77ada24b5..33dfaa5a5e 100644 --- a/README.md +++ b/README.md @@ -20,7 +20,8 @@ Previously known as the Cancer Analysis Workflow (CAW), Sarek is a workflow designed to run analyses on WGS data from regular samples or tumour / normal pairs, including relapse samples if required. -It's built using [Nextflow][nextflow-link], a bioinformatics domain specific language for workflow building. Software dependencies are handled using [Docker](https://www.docker.com) or [Singularity](http://singularity.lbl.gov) - container technologies that provide excellent reproducibility and ease of use. +It's built using [Nextflow][nextflow-link], a domain specific language for workflow building. +Software dependencies are handled using [Docker](https://www.docker.com) or [Singularity](http://singularity.lbl.gov) - container technologies that provide excellent reproducibility and ease of use. Singularity has been designed specifically for high-performance computing environments. This means that although Sarek has been primarily designed for use with the Swedish [UPPMAX HPC systems](https://www.uppmax.uu.se), it should be able to run on any system that supports these two tools. 
@@ -39,21 +40,23 @@ You can choose which variant callers to use, plus the pipeline is capable of acc The worflow steps and tools used are as follows: 1. **Preprocessing** - `main.nf` _(based on [GATK best practices](https://software.broadinstitute.org/gatk/best-practices/))_ - * Read alignment + * Map reads to Reference * [BWA](http://bio-bwa.sourceforge.net/) - * Read realignment and recalibration of short-read data - * [GATK](https://github.com/broadgsa/gatk-protected) + * Mark Duplicates + * [GATK](https://github.com/broadinstitute/gatk) + * Base (Quality Score) Recalibration + * [GATK](https://github.com/broadinstitute/gatk) 2. **Germline variant calling** - `germlineVC.nf` * SNVs and small indels - * [GATK HaplotyeCaller](https://github.com/broadgsa/gatk-protected) - * [Strelka](https://github.com/Illumina/strelka) + * [GATK HaplotyeCaller](https://github.com/broadinstitute/gatk) + * [Strelka2](https://github.com/Illumina/strelka) * Structural variants * [Manta](https://github.com/Illumina/manta) 3. **Somatic variant calling** - `somaticVC.nf` _(optional)_ * SNVs and small indels - * [MuTect2](https://github.com/broadgsa/gatk-protected) + * [MuTect2](https://github.com/broadinstitute/gatk) * [Freebayes](https://github.com/ekg/freebayes) - * [Strelka](https://github.com/Illumina/strelka) + * [Strelka2](https://github.com/Illumina/strelka) * Structural variants * [Manta](https://github.com/Illumina/manta) * Sample heterogeneity, ploidy and CNVs @@ -61,7 +64,7 @@ The worflow steps and tools used are as follows: 4. **Annotation** - `annotate.nf` _(optional)_ * Variant annotation * [SnpEff](http://snpeff.sourceforge.net/) - * [VEP](https://www.ensembl.org/info/docs/tools/vep/index.html) (Variant Effect Predictor) + * [VEP (Variant Effect Predictor)](https://www.ensembl.org/info/docs/tools/vep/index.html) 5. 
**Reporting** - `runMultiQC.nf` * Reporting * [MultiQC](http://multiqc.info) From 3954b9a01e9d12ccfe591035169082ef6af92866 Mon Sep 17 00:00:00 2001 From: Maxime Garcia Date: Tue, 21 Aug 2018 10:51:19 +0200 Subject: [PATCH 05/25] update README [skip ci] --- README.md | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/README.md b/README.md index 33dfaa5a5e..1a32b10333 100644 --- a/README.md +++ b/README.md @@ -43,9 +43,10 @@ The worflow steps and tools used are as follows: * Map reads to Reference * [BWA](http://bio-bwa.sourceforge.net/) * Mark Duplicates - * [GATK](https://github.com/broadinstitute/gatk) + * [GATK MarkDuplicates](https://github.com/broadinstitute/gatk) * Base (Quality Score) Recalibration - * [GATK](https://github.com/broadinstitute/gatk) + * [GATK BaseRecalibrator](https://github.com/broadinstitute/gatk) + * [GATK ApplyBQSR](https://github.com/broadinstitute/gatk) 2. **Germline variant calling** - `germlineVC.nf` * SNVs and small indels * [GATK HaplotyeCaller](https://github.com/broadinstitute/gatk) From 37c13fee2fb9a25221cb4d04beab44a56ede6b61 Mon Sep 17 00:00:00 2001 From: Maxime Garcia Date: Wed, 22 Aug 2018 11:08:31 +0200 Subject: [PATCH 06/25] add logo to badges [skip ci] --- README.md | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/README.md b/README.md index 1a32b10333..6f2b6f3b67 100644 --- a/README.md +++ b/README.md @@ -122,22 +122,22 @@ Helpful contributors: [![NGI](https://raw.githubusercontent.com/SciLifeLab/Sarek/master/docs/images/NGI_logo.png "NGI")][ngi-link] [![NBIS](https://raw.githubusercontent.com/SciLifeLab/Sarek/master/docs/images/NBIS_logo.png "NBIS")][nbis-link] -[bioconda-badge]:https://img.shields.io/badge/install%20with-bioconda-brightgreen.svg +[bioconda-badge]:https://img.shields.io/badge/install%20with-bioconda-brightgreen.svg?logo= [bioconda-link]:http://bioconda.github.io/ -[docker-badge]: https://img.shields.io/docker/automated/maxulysse/sarek.svg +[docker-badge]: 
https://img.shields.io/docker/automated/maxulysse/sarek.svg?logo=docker [docker-link]: https://hub.docker.com/r/maxulysse/sarek -[gitter-badge]: https://img.shields.io/badge/gitter-%20join%20chat%20%E2%86%92-4fb99a.svg +[gitter-badge]: https://img.shields.io/gitter/room/SciLifeLab/Sarek.svg?logo=gitter&logoColor=white&colorB=4fb99a [gitter-link]: https://gitter.im/SciLifeLab/Sarek [license-badge]: https://img.shields.io/github/license/SciLifeLab/Sarek.svg [license-link]: https://github.com/SciLifeLab/Sarek/blob/master/LICENSE [nbis-link]: https://www.nbis.se/ -[nextflow-badge]: https://img.shields.io/badge/nextflow-%E2%89%A50.31.0-brightgreen.svg +[nextflow-badge]: https://img.shields.io/badge/nextflow-%E2%89%A50.31.0-brightgreen.svg?logo= [nextflow-link]: https://www.nextflow.io/ [ngi-link]: https://ngisweden.scilifelab.se/ [scilifelab-link]: https://www.scilifelab.se/ -[travis-badge]: https://api.travis-ci.org/SciLifeLab/Sarek.svg +[travis-badge]: https://img.shields.io/travis/SciLifeLab/Sarek.svg?logo=travis [travis-link]: https://travis-ci.org/SciLifeLab/Sarek -[version-badge]: https://img.shields.io/github/release/SciLifeLab/Sarek.svg +[version-badge]: https://img.shields.io/github/release/SciLifeLab/Sarek.svg?logo=github&logoColor=white [version-link]: https://github.com/SciLifeLab/Sarek/releases/latest [zenodo-badge]: https://zenodo.org/badge/54024046.svg [zenodo-link]: https://zenodo.org/badge/latestdoi/54024046 From 683c378ede6526ab1defb132aff9fd71d95b330c Mon Sep 17 00:00:00 2001 From: Maxime Garcia Date: Wed, 22 Aug 2018 14:54:52 +0200 Subject: [PATCH 07/25] use inline svg bas64 image in badges [skip ci] --- README.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/README.md b/README.md index 6f2b6f3b67..38d874a5f6 100644 --- a/README.md +++ b/README.md @@ -122,7 +122,7 @@ Helpful contributors: [![NGI](https://raw.githubusercontent.com/SciLifeLab/Sarek/master/docs/images/NGI_logo.png "NGI")][ngi-link] 
[![NBIS](https://raw.githubusercontent.com/SciLifeLab/Sarek/master/docs/images/NBIS_logo.png "NBIS")][nbis-link] -[bioconda-badge]:https://img.shields.io/badge/install%20with-bioconda-brightgreen.svg?logo= +[bioconda-badge]:https://img.shields.io/badge/install%20with-bioconda-brightgreen.svg?logo= [bioconda-link]:http://bioconda.github.io/ [docker-badge]: https://img.shields.io/docker/automated/maxulysse/sarek.svg?logo=docker [docker-link]: https://hub.docker.com/r/maxulysse/sarek @@ -131,7 +131,7 @@ Helpful contributors: [license-badge]: https://img.shields.io/github/license/SciLifeLab/Sarek.svg [license-link]: https://github.com/SciLifeLab/Sarek/blob/master/LICENSE [nbis-link]: https://www.nbis.se/ -[nextflow-badge]: https://img.shields.io/badge/nextflow-%E2%89%A50.31.0-brightgreen.svg?logo= +[nextflow-badge]: https://img.shields.io/badge/nextflow-%E2%89%A50.31.0-brightgreen.svg?logo= [nextflow-link]: https://www.nextflow.io/ [ngi-link]: https://ngisweden.scilifelab.se/ [scilifelab-link]: https://www.scilifelab.se/ From 2d4de2b7be84b7e62b507cb4c4d2a05030b3c5ad Mon Sep 17 00:00:00 2001 From: Maxime Garcia Date: Wed, 22 Aug 2018 15:32:13 +0200 Subject: [PATCH 08/25] enhance bioconda inline svg [skip ci] --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 38d874a5f6..47faeb8d4a 100644 --- a/README.md +++ b/README.md @@ -122,7 +122,7 @@ Helpful contributors: [![NGI](https://raw.githubusercontent.com/SciLifeLab/Sarek/master/docs/images/NGI_logo.png "NGI")][ngi-link] [![NBIS](https://raw.githubusercontent.com/SciLifeLab/Sarek/master/docs/images/NBIS_logo.png "NBIS")][nbis-link] -[bioconda-badge]:https://img.shields.io/badge/install%20with-bioconda-brightgreen.svg?logo= +[bioconda-badge]:https://img.shields.io/badge/install%20with-bioconda-brightgreen.svg?logo= [bioconda-link]:http://bioconda.github.io/ [docker-badge]: https://img.shields.io/docker/automated/maxulysse/sarek.svg?logo=docker [docker-link]: 
https://hub.docker.com/r/maxulysse/sarek From adba143cb1aa7a67caf120df5c4af6dbac726124 Mon Sep 17 00:00:00 2001 From: Maxime Garcia Date: Wed, 22 Aug 2018 15:36:56 +0200 Subject: [PATCH 09/25] use inline png for bioconda [skip ci] --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 47faeb8d4a..4e3d72609d 100644 --- a/README.md +++ b/README.md @@ -122,7 +122,7 @@ Helpful contributors: [![NGI](https://raw.githubusercontent.com/SciLifeLab/Sarek/master/docs/images/NGI_logo.png "NGI")][ngi-link] [![NBIS](https://raw.githubusercontent.com/SciLifeLab/Sarek/master/docs/images/NBIS_logo.png "NBIS")][nbis-link] -[bioconda-badge]:https://img.shields.io/badge/install%20with-bioconda-brightgreen.svg?logo= +[bioconda-badge]:https://img.shields.io/badge/install%20with-bioconda-brightgreen.svg?logo= [bioconda-link]:http://bioconda.github.io/ [docker-badge]: https://img.shields.io/docker/automated/maxulysse/sarek.svg?logo=docker [docker-link]: https://hub.docker.com/r/maxulysse/sarek From 7cb3c3173af47195afde0bd32e74f4b657597811 Mon Sep 17 00:00:00 2001 From: Maxime Garcia Date: Thu, 30 Aug 2018 13:13:34 +0200 Subject: [PATCH 10/25] add logo for dark background --- docs/images/Sarek_logo_dark_background.svg | 374 +++++++++++++++++++++ 1 file changed, 374 insertions(+) create mode 100644 docs/images/Sarek_logo_dark_background.svg diff --git a/docs/images/Sarek_logo_dark_background.svg b/docs/images/Sarek_logo_dark_background.svg new file mode 100644 index 0000000000..212539c56c --- /dev/null +++ b/docs/images/Sarek_logo_dark_background.svg @@ -0,0 +1,374 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + image/svg+xml + + + + + + + + + + Sarek + + + + + + From 8dbdbd58e6dc8e58d3033cd8177d64ef3dc65916 Mon Sep 17 00:00:00 2001 From: Maxime Garcia Date: Thu, 30 Aug 2018 13:14:48 +0200 Subject: [PATCH 11/25] update PR Template and Contributing guidelines --- .github/CONTRIBUTING.md | 2 +- 
.github/PULL_REQUEST_TEMPLATE.md | 1 + 2 files changed, 2 insertions(+), 1 deletion(-) diff --git a/.github/CONTRIBUTING.md b/.github/CONTRIBUTING.md index 302e8876c9..88b77a940e 100644 --- a/.github/CONTRIBUTING.md +++ b/.github/CONTRIBUTING.md @@ -18,7 +18,7 @@ is as follows: * Feel free to add a new issue here for the same reason. 2. Fork the Sarek repository to your GitHub account 3. Make the necessary changes / additions within your forked repository -4. Submit a Pull Request against the master branch and wait for the code to be reviewed and merged. +4. Submit a Pull Request against the `dev` branch and wait for the code to be reviewed and merged. If you're not used to this workflow with git, you can start with some [basic docs from GitHub](https://help.github.com/articles/fork-a-repo/) or even their [excellent interactive tutorial](https://try.github.io/). diff --git a/.github/PULL_REQUEST_TEMPLATE.md b/.github/PULL_REQUEST_TEMPLATE.md index bde080a138..777f2f0b4e 100644 --- a/.github/PULL_REQUEST_TEMPLATE.md +++ b/.github/PULL_REQUEST_TEMPLATE.md @@ -4,6 +4,7 @@ Please fill in the appropriate checklist below (delete whatever is not relevant) These are the most common things requested on pull requests (PRs). ## PR checklist + - [ ] PR is made againt `dev` branch - [ ] This comment contains a description of changes (with reason) - [ ] If you've fixed a bug or added code that should be tested, add tests! - [ ] Ensure the test suite passes (`./scripts/test.sh -p docker -t ALL`). 
From 770e395fde406e56f6c22e2ca141d639e8389a14 Mon Sep 17 00:00:00 2001 From: Maxime Garcia Date: Thu, 30 Aug 2018 13:34:15 +0200 Subject: [PATCH 12/25] spacing [skip ci] --- .github/PULL_REQUEST_TEMPLATE.md | 3 ++- .github/RELEASE_CHECKLIST.md | 27 ++++++++++++++------------- 2 files changed, 16 insertions(+), 14 deletions(-) diff --git a/.github/PULL_REQUEST_TEMPLATE.md b/.github/PULL_REQUEST_TEMPLATE.md index 777f2f0b4e..f6c91da38e 100644 --- a/.github/PULL_REQUEST_TEMPLATE.md +++ b/.github/PULL_REQUEST_TEMPLATE.md @@ -4,7 +4,8 @@ Please fill in the appropriate checklist below (delete whatever is not relevant) These are the most common things requested on pull requests (PRs). ## PR checklist - - [ ] PR is made againt `dev` branch + - [ ] PR is made against `dev` branch + - [ ] PR is a hotfix against `master` branch - [ ] This comment contains a description of changes (with reason) - [ ] If you've fixed a bug or added code that should be tested, add tests! - [ ] Ensure the test suite passes (`./scripts/test.sh -p docker -t ALL`). diff --git a/.github/RELEASE_CHECKLIST.md b/.github/RELEASE_CHECKLIST.md index 748a4c5a33..58ffe43635 100644 --- a/.github/RELEASE_CHECKLIST.md +++ b/.github/RELEASE_CHECKLIST.md @@ -2,21 +2,22 @@ This checklist is for our own reference 1. Check that everything is up to date and ready to go - - Travis test is passing - - Manual testing on Bianca is passing -2. Increase version numbers. + - Travis tests are passing + - Manual tests on Bianca are passing +2. Increase version numbers 3. Update version numbers in code: `configuration/base.config` 4. Build, and get the containers. - - `./scripts/do_all.sh --push --tag ` - - `./scripts/do_all.sh --pull --tag ` + - `./scripts/do_all.sh --push --tag ` + - `./scripts/do_all.sh --pull --tag ` 5. Test against sample data. 
- - Check for any command line errors - - Check version numbers are printed correctly - - `./scripts/test.sh -p docker --tag ` - - `./scripts/test.sh -p singularity --tag ` - - `./scripts/test.sh -p singularityPath --tag ` + - Check for any command line errors + - Check version numbers are printed correctly + - `./scripts/test.sh -p docker --tag ` + - `./scripts/test.sh -p singularity --tag ` + - `./scripts/test.sh -p singularityPath --tag ` 6. Commit and push version updates 7. Make a [release](https://github.com/SciLifeLab/Sarek/releases) on GitHub -8. Tweet that new version is released -9. Commit and push. Continue making more awesome :metal: -10. Have fika :cake: +8. Choose an appropriate codename for the release +9. Tweet that new version is released +10. Commit and push. Continue making more awesome :metal: +11. Have fika :cake: From ace85183bac76c83867f092da1a3cbd6cb69016e Mon Sep 17 00:00:00 2001 From: Maxime Garcia Date: Thu, 30 Aug 2018 13:34:27 +0200 Subject: [PATCH 13/25] update version [skip ci] --- conf/base.config | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/conf/base.config b/conf/base.config index 64aff09cf5..6a1a6bceae 100644 --- a/conf/base.config +++ b/conf/base.config @@ -46,7 +46,7 @@ params { test = false // Not testing by default tools = '' // List of tools to use verbose = false // Enable for more verbose information - version = '2.0.0' // Workflow version + version = '2.1.0' // Workflow version } process { From 87e90c762415fabe00f042e5eeb30a2c7df509b0 Mon Sep 17 00:00:00 2001 From: Maxime Garcia Date: Thu, 30 Aug 2018 13:37:22 +0200 Subject: [PATCH 14/25] update contributors list [skip ci] --- README.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/README.md b/README.md index 4e3d72609d..ee26669da0 100644 --- a/README.md +++ b/README.md @@ -107,6 +107,7 @@ Main authors: * [Szilveszter Juhos](https://github.com/szilvajuhos) Helpful contributors: +* [Johannes Alneberg](https://github.com/alneberg) * [Sebastian 
DiLorenzo](https://github.com/Sebastian-D) * [Jesper Eisfeldt](https://github.com/J35P312) * [Phil Ewels](https://github.com/ewels) @@ -115,6 +116,7 @@ Helpful contributors: * [Marcel Martin](https://github.com/marcelm) * [Björn Nystedt](https://github.com/bjornnystedt) * [Pall Olason](https://github.com/pallolason) +* [Aron Skaftason](https://github.com/arontommi) -------------------------------------------------------------------------------- From 1f419b0d51604718263d107c460068463aa05946 Mon Sep 17 00:00:00 2001 From: Maxime Garcia Date: Thu, 30 Aug 2018 17:23:25 +0200 Subject: [PATCH 15/25] add Singularity Recipe [skip ci] --- Singularity | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+) create mode 100644 Singularity diff --git a/Singularity b/Singularity new file mode 100644 index 0000000000..a22fcaa520 --- /dev/null +++ b/Singularity @@ -0,0 +1,18 @@ +From:nfcore/base +Bootstrap:docker + +%labels + MAINTAINER Maxime Garcia + DESCRIPTION Singularity image containing all requirements for the Sarek pipeline + VERSION 2.1.0 + +%environment + PATH=/opt/conda/envs/sarek-2.1.0/bin:$PATH + export PATH + +%files + environment.yml / + +%post + /opt/conda/bin/conda env create -f /environment.yml + /opt/conda/bin/conda clean -a From 267bdcb0521c404e745d238ccbbc9a95fff152a8 Mon Sep 17 00:00:00 2001 From: Maxime Garcia Date: Thu, 30 Aug 2018 17:24:12 +0200 Subject: [PATCH 16/25] move files --- .../CODE_OF_CONDUCT.md | 0 DELIVERY.README.md => docs/OUTPUT.md | 101 ++++++++---------- docs/{usage.md => USAGE.md} | 10 -- 3 files changed, 43 insertions(+), 68 deletions(-) rename CODE_OF_CONDUCT.md => .github/CODE_OF_CONDUCT.md (100%) rename DELIVERY.README.md => docs/OUTPUT.md (57%) rename docs/{usage.md => USAGE.md} (93%) diff --git a/CODE_OF_CONDUCT.md b/.github/CODE_OF_CONDUCT.md similarity index 100% rename from CODE_OF_CONDUCT.md rename to .github/CODE_OF_CONDUCT.md diff --git a/DELIVERY.README.md b/docs/OUTPUT.md similarity index 57% rename from DELIVERY.README.md 
rename to docs/OUTPUT.md index 8e337be990..f9a256a19c 100644 --- a/DELIVERY.README.md +++ b/docs/OUTPUT.md @@ -1,40 +1,40 @@ -![](doc/images/Sarek_logo.png) -# Sarek - Cancer Analysis Workflow Results Delivery -This README describes the delivery directory structure for files passed to users at [NGI][ngi-link] +# Sarek - output delivery +This README describes the output delivery directory structure. -There are four sections dedicated for different results: Annotation, Preprocessing, Reports and -VariantCalling. All the four sections can have sub-directories containing results from different software. +There are four sections dedicated for different results: Annotation, Preprocessing, Reports and VariantCalling. +All the four sections can have sub-directories containing results from different software. -## Annotation: +## Annotation: -This directory contains results from the final annotation steps: two software are used for annotation, [VEP][vep-link] and [snpEff][snpeff-link]. -Only a subset of the VCF files are annotated, and only variants that have a PASS filter. FreeBayes results are not annotated in the moment yet as -we are lacking a decent somatic filter. For HaplotypeCaller the germline variations are annotated for both the tumour and the normal sample. +This directory contains results from the final annotation steps: two software are used for annotation, [VEP][vep-link] and [snpEff][snpeff-link]. +Only a subset of the VCF files are annotated, and only variants that have a PASS filter. +FreeBayes results are not annotated in the moment yet as we are lacking a decent somatic filter. +For HaplotypeCaller the germline variations are annotated for both the tumour and the normal sample. -All the VCFs annotated have an `ann.vcf` extension, and a summary HTML file associated. +All the VCFs annotated have an `ann.vcf` extension, and a summary HTML file associated. 
### SnpEff -[SnpEff][snpeff-link] can add annotations for many sort of variants not only SNPs, and is using multiple databases for annotations. SnpEff prints out -not only the annotated VCF files, but a summary HTML and CSV, also a list of affected genes with the actual changes and impact is included in a text file. -The generated VCF header contains the software version and the used command line. +[SnpEff][snpeff-link] can add annotations for many sort of variants not only SNPs, and is using multiple databases for annotations. +SnpEff prints out not only the annotated VCF files, but a summary HTML and CSV, also a list of affected genes with the actual changes and impact is included in a text file. +The generated VCF header contains the software version and the used command line. -Annotations added are in [cancer mode][snpeff-cancer-mode] are very rich, Sarek is using the software in a single-sample mode. VCF files containing germline -calls are annotated in [regular mode][snpeff-regular-mode] of SnpEff. +Annotations added are in [cancer mode][snpeff-cancer-mode] are very rich, Sarek is using the software in a single-sample mode. +VCF files containing germline calls are annotated in [regular mode][snpeff-regular-mode] of SnpEff. ### VEP -The [Variant Effect Predictor][vep-link] is based on Ensembl, and can determine the effects of all sorts of variants, including SNPs, indels, structural variants, -CNVs. Some of the Manta VCF files are not always succeed in going through the VEP filtering though: there can be missing annotations for these variant calls. - -The HTML summary files show general statistics and quality-related measures. In the header of the annotated VCF files one can find the VEP/Ensembl version used -for annotation, also the version numbers for additional databases like Clinvar or dbSNP used in the "VEP" line. The format of the [consequence annotations][VEP-predictions] is also -in the VCF header describing the INFO field. 
In the moment it contains +The [Variant Effect Predictor][vep-link] is based on Ensembl, and can determine the effects of all sorts of variants, including SNPs, indels, structural variants, CNVs. +Some of the Manta VCF files are not always succeed in going through the VEP filtering though: there can be missing annotations for these variant calls. +The HTML summary files show general statistics and quality-related measures. +In the header of the annotated VCF files one can find the VEP/Ensembl version used for annotation, also the version numbers for additional databases like Clinvar or dbSNP used in the "VEP" line. +The format of the [consequence annotations][VEP-predictions] is also in the VCF header describing the INFO field. +In the moment it contains: * Consequence: impact of the variation, if there is any * Codons: the codon change, i.e. cGt/cAt * Amino\_acids: change in amino acids, i.e. R/H if there is any -* Gene: ENSEMBL gene name +* Gene: ENSEMBL gene name * SYMBOL: gene symbol * Feature: actual transcript name * EXON: affected exon @@ -46,65 +46,52 @@ in the VCF header describing the INFO field. In the moment it contains --- ## Preprocessing: -The preprocessing is following the [GATK Best Practices][GATK-BP] to obtain aligned BAM files used for whole-genome germline analysis. - -### NonRealigned: - -This directory is usually empty, as it is a placeholder for the original mapped, merged and duplicate marked BAM files. After these steps the BAM files are -processed further, reads are realigned around known indels, and recalibrated. +The preprocessing is following the [GATK Best Practices][GATK-BP] to obtain aligned BAM files used for whole-genome germline analysis. -### NonRecalibrated: +### DuplicateMarked: -This is the place for the BAM file delivered to users: besides the realigned files the recalibration tables are also stored (`*.recal.table`), these can be -used to create base recalibrated files. 
The `.tsv` file is autogenerated also, these can be used by Sarek for further processing and/or variant calling. +This is the place for the BAM file delivered to users: besides the duplicatemarked files the recalibration tables are also stored (`*.recal.table`), these can be used to create base recalibrated files. +The `.tsv` file is autogenerated also, these can be used by Sarek for further processing and/or variant calling. -The BAM file headers contain the details about the actual command-line arguments for mapping, merging, use `samtools view -H ` to view the used -reference, read groups etc. +The BAM file headers contain the details about the actual command-line arguments for mapping, merging, use `samtools view -H ` to view the used reference, read groups etc. ### Recalibrated: -This directory is usually empty, it is the location for the final recalibrated files in the preprocessing pipeline: recalibrated BAMs are usually 2-3 times -larger than the realigned files, and are needed only by MuTect1 and MuTect2 (considering GATK 3.8). To re-generate recalibrated BAMs you have to apply the -recalibration table delivered to the `NonRecalibrated` directory either by calling Sarek, or doing this [recalibration step][BQSR-link] yourself. +This directory is usually empty, it is the location for the final recalibrated files in the preprocessing pipeline: recalibrated BAMs are usually 2-3 times larger than the duplicatemarked files. To re-generate recalibrated BAMs you have to apply the recalibration table delivered to the `NonRecalibrated` directory either by calling Sarek, or doing this [recalibration step][BQSR-link] yourself. --- ## Reports: -The `Reports` directory is the place for collecting outputs for different quality control (QC) software; going through these files can help us to decide -whether the sequencing and the workflow was successful, or further steps are needed to get meaningful results. 
The main entry point it the [MultiQC][multiqc-link] -directory: the HTML index file aggregates and visualizes all the software use for QC. - +The `Reports` directory is the place for collecting outputs for different quality control (QC) software; going through these files can help us to decide whether the sequencing and the workflow was successful, or further steps are needed to get meaningful results. +The main entry point it the [MultiQC][multiqc-link] directory: the HTML index file aggregates and visualizes all the software use for QC. + ### MultiQC -To assess the quality of the sequencing and workflow the best start is to view at the Reports\/MultiQC\/multiqc\_report.html file of the MultiQC directory, where the -statistics and graphics of all the software below should be presented. The actual graphs and the tables are configurable, and generally much easier to view than the -raw output of the individual software. The subsequent QC compartments are: +To assess the quality of the sequencing and workflow the best start is to view at the `Reports/MultiQC/multiqc_report.html` file of the `MultiQC` directory, where the statistics and graphics of all the software below should be presented. +The actual graphs and the tables are configurable, and generally much easier to view than the raw output of the individual software. 
+The subsequent QC compartments are: -* bamQC: [Qualimap][qualimap-link] examines sequencing alignment data in SAM/BAM files according to the features of the mapped reads and provides an overall view - of the data provides quality control statistics about aligned BAM files +* bamQC: [Qualimap][qualimap-link] examines sequencing alignment data in SAM/BAM files according to the features of the mapped reads and provides an overall view of the data provides quality control statistics about aligned BAM files * BCFToolsStats: [bcftools][bcftools] measuring non-reference allele frequency, depth distribution, stats by quality and per-sample counts, singleton stats, etc. of VCF files. -* [FastQC][fastqc]: provides statistics about the raw FASTQ files only. +* [FastQC][fastqc]: provides statistics about the raw FASTQ files only. * MarkDuplicates: a [Picard][picard-md] tool to tag PCR/optical duplicates from aligned BAM data * SamToolsStats: [samtools][samtools] collection of statistics from BAM files --- ## VariantCallings: -All the raw results regarding variant-calling are collected in this directory. Not all the software below are producing VCF files, also both somatic and germline -variants are collected in this directory. +All the raw results regarding variant-calling are collected in this directory. Not all the software below are producing VCF files, also both somatic and germline +variants are collected in this directory. -* [Ascat][ascat]: is a method to derive copy number profiles of tumour cells, accounting for normal cell admixture and tumour aneuploidy. This direcory contains the -graphical output of the software, CNV, ploidy and sample purity estimations. +* [Ascat][ascat]: is a method to derive copy number profiles of tumour cells, accounting for normal cell admixture and tumour aneuploidy. This direcory contains the graphical output of the software, CNV, ploidy and sample purity estimations. 
* [FreeBayes][freebayes]: is for Bayesian haplotype-based genetic polymorphism discovery and genotyping. The single VCF file generated by FreeBayes is huge, it is recommended to flatten and filter this VCF, i.e. using the provided [SpeedSeq][speedseq] filter * [HaplotypeCaller][haplotypecaller] is the in-house germline caller of the Broad Institute, the non-recalibrated variant files are there to check the germline variations and compare the two samples (tumour and normal) for possible mixup * HaplotypeCallerGVCF: germline calls in [gVCF format][genomicvcf] even for the tumour sample: this format makes possible the joint analysis of a cohort -* [Manta][manta]: is a structural variant caller supported by Illumina. There are several output files, corresponding to germline (diploid) calls, candidate calls -and somatic files. Manta provides a candidate list for small indels also that can be fed to Strelka, but this feature is not incorporated yet. -* [MuTect1][mutect1] is a now-defunct GATK-based somatic SNP-only caller - going to be left out for analysis in the future. It is sensitive, recommended to keep only -lines with "PASS" filter. +* [Manta][manta]: is a structural variant caller supported by Illumina. There are several output files, corresponding to germline (diploid) calls, candidate calls and somatic files. +Manta provides a candidate list for small indels also that can be fed to Strelka. * [MuTect2][mutect2] is the current somatic caller of GATK for both SNPs and indels. Recommended to keep only lines with the "PASS" filter. -* [Strelka][strelka] is somatic SNP and indel caller supported by Illumina. Strelka gives filtered and unfiltered calls for SNPs and indels separately, together with germline calls. +* [Strelka2][strelka2] is somatic SNP and indel caller supported by Illumina. Strelka gives filtered and unfiltered calls for SNPs and indels separately, together with germline calls. 
[ascat]:https://www.crick.ac.uk/research/a-z-researchers/researchers-v-y/peter-van-loo/software/ [bcftools]: http://www.htslib.org/doc/bcftools.html @@ -116,7 +103,6 @@ lines with "PASS" filter. [genomicvcf]: https://gatkforums.broadinstitute.org/gatk/discussion/4017/what-is-a-gvcf-and-how-is-it-different-from-a-regular-vcf [manta]: https://github.com/Illumina/manta/blob/master/docs/userGuide/README.md#structural-variant-predictions [multiqc-link]: http://multiqc.info/ -[mutect1]: https://software.broadinstitute.org/gatk/download/mutect [mutect2]: https://software.broadinstitute.org/gatk/documentation/tooldocs/current/org_broadinstitute_gatk_tools_walkers_cancer_m2_MuTect2.php [ngi-link]: https://ngisweden.scilifelab.se/ [picard-md]: http://broadinstitute.github.io/picard/command-line-overview.html#MarkDuplicates @@ -128,8 +114,7 @@ lines with "PASS" filter. [snpeff-cancer-mode]: http://snpeff.sourceforge.net/SnpEff_manual.html#cancer [snpeff-regular-mode]: http://snpeff.sourceforge.net/SnpEff_manual.html#input [speedseq]: https://github.com/SciLifeLab/Sarek/blob/master/scripts/speedseq.filter.awk -[strelka]: https://github.com/Illumina/strelka +[strelka2]: https://github.com/Illumina/strelka [vep-link]: http://www.ensembl.org/Tools/VEP [VEP-predictions]: https://www.ensembl.org/info/genome/variation/predicted_data.html [logo]: https://img.shields.io/github/release/SciLifeLab/Sarek.svg - diff --git a/docs/usage.md b/docs/USAGE.md similarity index 93% rename from docs/usage.md rename to docs/USAGE.md index 5dea8d6dc8..16e46f975f 100644 --- a/docs/usage.md +++ b/docs/USAGE.md @@ -175,13 +175,3 @@ If there is a feature or bugfix you want to use in a resumed or re-analyzed run, ```bash nextflow run -latest SciLifeLab/Sarek/main.nf ... 
-resume ``` - --------------------------------------------------------------------------------- - -[![](images/SciLifeLab_logo.png "SciLifeLab")][scilifelab-link] -[![](images/NGI_logo.png "NGI")][ngi-link] -[![](images/NBIS_logo.png "NBIS")][nbis-link] - -[nbis-link]: https://www.nbis.se/ -[ngi-link]: https://ngisweden.scilifelab.se/ -[scilifelab-link]: https://www.scilifelab.se/ From d49752eb4179f12f21de19070d6c2842cd1ac220 Mon Sep 17 00:00:00 2001 From: Maxime Garcia Date: Thu, 30 Aug 2018 17:24:30 +0200 Subject: [PATCH 17/25] merge two docs --- docs/BUILD.md | 62 ----------------------- docs/CONTAINERS.md | 122 ++++++++++++++++++++++++++++++--------------- 2 files changed, 83 insertions(+), 101 deletions(-) delete mode 100644 docs/BUILD.md diff --git a/docs/BUILD.md b/docs/BUILD.md deleted file mode 100644 index d5b7a3a12c..0000000000 --- a/docs/BUILD.md +++ /dev/null @@ -1,62 +0,0 @@ -# Building - -Use the Nextflow script to build and/or push containers from Docker and/or Singularity. - -All the containers have built in UPPMAX directories, so there is no need to add them for use on UPPMAX clusters. -- See the [Singularity UPPMAX guide](https://www.uppmax.uu.se/support-sv/user-guides/singularity-user-guide/) - -## Usage - -```bash -nextflow run . [--docker] [--singularity] [--containerPath ] [--push] [--containers ] [--repository ] [--tag tag] -``` - -- `--containers`: Choose which containers to build. Default: `all`. Possible values (to separate by commas): - - `all` - Build all available containers. 
- - `fastqc` - - `freebayes` - - `gatk` - - `igvtools` - - `multiqc` - - `mutect1` - - `picard` - - `qualimap` - - `r-base` - - `runallelecount` - - `sarek` - - `snpeff` this container serves as a base for `snpeffgrch37` and `snpeffgrch38` - - `snpeffgrch37` - - `snpeffgrch38` - - `vcftools` - - `vepgrch37` - - `vepgrch38` - -- `--docker`: Build containers using `Docker` -- `--push`: Push containers to `DockerHub` -- `--repository`: Build containers under given repository. Default: `maxulysse` -- `--singularity`: Build containers using `Singularity`. -- `--containerPath`: Select where to download containers. Default: `$PWD` -- `--tag`: Build containers using given tag. Default is version number. - -## Example - -```bash -nextflow run . --docker --singularity --push --containers multiqc,fastqc -``` - -## For lazy users -We provide script to build/push or pull all containers -```bash -./scripts/do_all.sh # Build all docker containers -./scripts/do_all.sh --push # Build and push all Docker containers into DockerHub -./scripts/do_all.sh --pull # Pull all containers from DockerHub into Singularity -``` - ---- -[![](images/SciLifeLab_logo.png "SciLifeLab")][scilifelab-link] -[![](images/NGI_logo.png "NGI")][ngi-link] -[![](images/NBIS_logo.png "NBIS")][nbis-link] - -[nbis-link]: https://www.nbis.se/ -[ngi-link]: https://ngisweden.scilifelab.se/ -[scilifelab-link]: https://www.scilifelab.se/ diff --git a/docs/CONTAINERS.md b/docs/CONTAINERS.md index 3ffd0916ca..092888626e 100644 --- a/docs/CONTAINERS.md +++ b/docs/CONTAINERS.md @@ -1,96 +1,139 @@ # Containers -Subsets of all containers can be dowloaded: +Subsets of all containers can be downloaded: -For processing, germline and somatic variant calling and Reports: - - [sarek](#sarek-) +- For processing, germline and somatic variant calling and Reports: + - [sarek](#sarek-) +- For annotation for GRCh37, you will need: + - [snpeffgrch37](#snpeffgrch37-) + - [vepgrch37](#vepgrch37-) +- For annotation for GRCh38, you will 
need: + - [snpeffgrch38](#snpeffgrch38-) + - [vepgrch38](#vepgrch38-) -For annotation for GRCh37, you will need: - - [snpeffgrch37](#snpeffgrch37-) - - [vepgrch37](#vepgrch37-) +## Building -For annotation for GRCh38, you will need: - - [snpeffgrch38](#snpeffgrch38-) - - [vepgrch38](#vepgrch38-) +Use the Nextflow script to build and/or push containers from Docker and/or Singularity. -## r-base [![r-base-docker status][r-base-docker-badge]][r-base-docker-link] +All the containers have built in UPPMAX directories, so there is no need to add them for use on UPPMAX clusters. +- See the [Singularity UPPMAX guide](https://www.uppmax.uu.se/support-sv/user-guides/singularity-user-guide/) - - Based on `debian:8.9` - - Contain **[AlleleCount][allelecount-link]** 2.2.0 +### Usage -## runallelecount [![runallelecount-docker status][runallelecount-docker-badge]][runallelecount-docker-link] +```bash +nextflow run . [--docker] [--singularity] [--containerPath ] [--push] [--containers ] [--repository ] [--tag tag] +``` + +- `--containers`: Choose which containers to build. Default: `all`. Possible values (to separate by commas): + - `all` - all available containers. + - `r-base` - the [r-base](#r-base-) container. + - `runallelecount` - the [runallelecount](#runallelecount-) container. + - `sarek` - the [sarek](#sarek-) container. + - `snpeff` - the [snpeff](#snpeff-) container, that serves as a base for `snpeffgrch37` and `snpeffgrch38`. + - `snpeffgrch37` - the [snpeffgrch37](#snpeffgrch37-) container. + - `snpeffgrch38` - the [snpeffgrch38](#snpeffgrch38-) container. + - `vepgrch37` - the [vepgrch37](#vepgrch37-) container. + - `vepgrch38` - the [vepgrch38](#vepgrch38-) container. + +- `--docker`: Build containers using `Docker` +- `--push`: Push containers to `DockerHub` +- `--repository`: Build containers under given repository. Default: `maxulysse` +- `--singularity`: Build containers using `Singularity`. +- `--containerPath`: Select where to download containers. 
Default: `$PWD` +- `--tag`: Build containers using given tag. Default is version number. + +### Example + +```bash +nextflow run . --docker --singularity --push --containers multiqc,fastqc +``` + +### For lazy users +We provide script to build/push or pull all containers +```bash +./scripts/do_all.sh # Build all docker containers +./scripts/do_all.sh --push # Build and push all Docker containers into DockerHub +./scripts/do_all.sh --pull # Pull all containers from DockerHub into Singularity +``` + +## What is actually inside the containers + +### r-base [![r-base-docker status][r-base-docker-badge]][r-base-docker-link] + + - Based on `r-base:3.3.2` + - Contain **RColorBrewer** + +### runallelecount [![runallelecount-docker status][runallelecount-docker-badge]][runallelecount-docker-link] - Based on `debian:8.9` - Contain **[AlleleCount][allelecount-link]** 2.2.0 -## sarek [![sarek-docker status][sarek-docker-badge]][sarek-docker-link] +### sarek [![sarek-docker status][sarek-docker-badge]][sarek-docker-link] - Based on `debian:8.9` -- Contain **[BCFTools][bcftools-link]** 1.5 -- Contain **[BWA][bwa-link]** 0.7.16 -- Contain **[HTSlib][htslib-link]** 1.5 -- Contain **[Manta][manta-link]** 1.1.1 -- Contain **[samtools][samtools-link]** 1.5 -- Contain **[Strelka][strelka-link]** 2.8.2 - -## snpeff [![snpeff-docker status][snpeff-docker-badge]][snpeff-docker-link] +- Contain **[BCFTools][bcftools-link]** 1.8 +- Contain **[BWA][bwa-link]** 0.7.17 +- Contain **[FastQC][fastqc-link]** 0.11.7 +- Contain **[FreeBayes][freebayes-link]** 1.2.0 +- Contain **[GATK4][gatk4-link]** 4.0.6.0 +- Contain **[HTSlib][htslib-link]** 1.9 +- Contain **[IGVtools][igvtools-link]** 2.3.93 +- Contain **[Manta][manta-link]** 1.4.0 +- Contain **[MultiQC][multiqc-link]** 1.5 +- Contain **[Qualimap][qualimap-link]** 2.2.2a +- Contain **[samtools][samtools-link]** 1.8 +- Contain **[Strelka2][strelka-link]** 2.9.3 +- Contain **[VCFanno][vcfanno-link]** 0.2.8 +- Contain **[VCFtools][vcftools-link]** 
0.1.15 + +### snpeff [![snpeff-docker status][snpeff-docker-badge]][snpeff-docker-link] - Based on `openjdk:8-slim` - Contain **[snpEff][snpeff-link]** 4.3i -## snpeffgrch37 [![snpeffgrch37-docker status][snpeffgrch37-docker-badge]][snpeffgrch37-docker-link] +### snpeffgrch37 [![snpeffgrch37-docker status][snpeffgrch37-docker-badge]][snpeffgrch37-docker-link] - Based on `maxulysse/snpeff` - Contain **[snpEff][snpeff-link]** 4.3i - Contain GRCh37.75 -## snpeffgrch38 [![snpeffgrch38-docker status][snpeffgrch38-docker-badge]][snpeffgrch38-docker-link] +### snpeffgrch38 [![snpeffgrch38-docker status][snpeffgrch38-docker-badge]][snpeffgrch38-docker-link] - Based on `maxulysse/snpeff` - Contain **[snpEff][snpeff-link]** 4.3i - Contain GRCh38.86 -## vepgrch37 [![vepgrch37-docker status][vepgrch37-docker-badge]][vepgrch37-docker-link] +### vepgrch37 [![vepgrch37-docker status][vepgrch37-docker-badge]][vepgrch37-docker-link] - Based on `willmclaren/ensembl-vep:release_90.6` - Contain **[VEP][vep-link]** 90.5 - Contain GRCh37 -## vepgrch38 [![vepgrch38-docker status][vepgrch38-docker-badge]][vepgrch38-docker-link] +### vepgrch38 [![vepgrch38-docker status][vepgrch38-docker-badge]][vepgrch38-docker-link] - Based on `willmclaren/ensembl-vep:release_90.6` - Contain **[VEP][vep-link]** 90.5 - Contain GRCh38 ---- -[![](images/SciLifeLab_logo.png "SciLifeLab")][scilifelab-link] -[![](images/NGI_logo.png "NGI")][ngi-link] -[![](images/NBIS_logo.png "NBIS")][nbis-link] - [allelecount-link]: https://github.com/cancerit/alleleCount [bcftools-link]: https://github.com/samtools/bcftools [bwa-link]: https://github.com/lh3/bwa [fastqc-link]: http://www.bioinformatics.babraham.ac.uk/projects/fastqc/ [freebayes-link]: https://github.com/ekg/freebayes -[gatk-link]: https://github.com/broadgsa/gatk-protected +[gatk4-link]: https://github.com/broadinstitute/gatk [htslib-link]: https://github.com/samtools/htslib [igvtools-link]: http://software.broadinstitute.org/software/igv/ [manta-link]: 
https://github.com/Illumina/manta [multiqc-link]: https://github.com/ewels/MultiQC/ -[mutect1-link]: https://github.com/broadinstitute/mutect -[nbis-link]: https://www.nbis.se/ -[ngi-link]: https://ngisweden.scilifelab.se/ -[picard-link]: https://github.com/broadinstitute/picard [qualimap-link]: http://qualimap.bioinfo.cipf.es +[r-base-docker-badge]: https://img.shields.io/docker/automated/maxulysse/r-base.svg +[r-base-docker-link]: https://hub.docker.com/r/maxulysse/r-base [rcolorbrewer-link]: https://CRAN.R-project.org/package=RColorBrewer [runallelecount-docker-badge]: https://img.shields.io/docker/automated/maxulysse/runallelecount.svg [runallelecount-docker-link]: https://hub.docker.com/r/maxulysse/runallelecount -[r-base-docker-badge]: https://img.shields.io/docker/automated/maxulysse/r-base.svg -[r-base-docker-link]: https://hub.docker.com/r/maxulysse/r-base [samtools-link]: https://github.com/samtools/samtools [sarek-docker-badge]: https://img.shields.io/docker/automated/maxulysse/sarek.svg [sarek-docker-link]: https://hub.docker.com/r/maxulysse/sarek -[scilifelab-link]: https://www.scilifelab.se/ [snpeff-docker-badge]: https://img.shields.io/docker/automated/maxulysse/snpeff.svg [snpeff-docker-link]: https://hub.docker.com/r/maxulysse/snpeff [snpeff-link]: http://snpeff.sourceforge.net/ @@ -99,6 +142,7 @@ For annotation for GRCh38, you will need: [snpeffgrch38-docker-badge]: https://img.shields.io/docker/automated/maxulysse/snpeffgrch38.svg [snpeffgrch38-docker-link]: https://hub.docker.com/r/maxulysse/snpeffgrch38 [strelka-link]: https://github.com/Illumina/strelka +[vcfanno-link]: https://github.com/brentp/vcfanno [vcftools-link]: https://vcftools.github.io/index.html [vep-link]: https://github.com/Ensembl/ensembl-vep [vepgrch37-docker-badge]: https://img.shields.io/docker/automated/maxulysse/vepgrch37.svg From a4d878498a3171f460e94b8c35516a5c2ca1a776 Mon Sep 17 00:00:00 2001 From: Maxime Garcia Date: Thu, 30 Aug 2018 17:25:04 +0200 Subject: [PATCH 
18/25] remove logos --- docs/ASCAT.md | 43 +++++++++++++++++++++++------------------ docs/CONFIG.md | 19 ++++++------------ docs/FOLDER.md | 10 ---------- docs/INSTALL.md | 11 ----------- docs/INSTALL_BIANCA.md | 10 ---------- docs/INSTALL_RACKHAM.md | 11 ----------- docs/INTERVALS.md | 31 +++++++++++++++-------------- docs/PROCESS.md | 10 ---------- docs/REFERENCES.md | 12 +----------- docs/TESTS.md | 10 ---------- docs/TSV.md | 10 ---------- docs/USE_CASES.md | 10 ---------- 12 files changed, 47 insertions(+), 140 deletions(-) diff --git a/docs/ASCAT.md b/docs/ASCAT.md index c2b326235c..8b8b675482 100644 --- a/docs/ASCAT.md +++ b/docs/ASCAT.md @@ -2,13 +2,19 @@ ## Introduction -Ascat is a software for performing allele-specific copy number analysis of tumor samples and for estimating tumor ploidy and purity (normal contamination). Ascat is written in R and available here: [github.com/Crick-CancerGenomics/ascat](https://github.com/Crick-CancerGenomics/ascat). +ASCAT is a software for performing allele-specific copy number analysis of tumor samples and for estimating tumor ploidy and purity (normal contamination). +ASCAT is written in R and available here: [github.com/Crick-CancerGenomics/ascat](https://github.com/Crick-CancerGenomics/ascat). -To run Ascat on NGS data we need .bam files for the tumor and normal samples, as well as a loci file with SNP positions. If Ascat is run on SNP array data, the loci file contains the SNPs on the chip. When runnig Ascat on NGS data we can use the same loci file, for exampe the one corresponding to the AffymetrixGenome-Wide Human SNP Array 6.0, but we can also choose a loci file of our choice with i.e. SNPs detected in the 1000 Genomes project. +To run ASCAT on NGS data we need .bam files for the tumor and normal samples, as well as a loci file with SNP positions. +If ASCAT is run on SNP array data, the loci file contains the SNPs on the chip. 
+When running ASCAT on NGS data we can use the same loci file, for example the one corresponding to the Affymetrix Genome-Wide Human SNP Array 6.0, but we can also use a loci file of our choice with, e.g., SNPs detected in the 1000 Genomes project. ### BAF and LogR values -Running Ascat on NGS data requires that the .bam files are converted into BAF and LogR values. This can be done using the software [AlleleCount](https://github.com/cancerit/alleleCount) followed by a simple R script. AlleleCount extracts the number of reads in a bam file supporting each allele at specified SNP positions. Based on this, BAF and logR can be calculated for every SNP position i as: +Running ASCAT on NGS data requires that the .bam files are converted into BAF and LogR values. +This can be done using the software [AlleleCount](https://github.com/cancerit/alleleCount) followed by a simple R script. +AlleleCount extracts the number of reads in a bam file supporting each allele at specified SNP positions. +Based on this, BAF and logR can be calculated for every SNP position i as: ``` BAFi(tumor)=countsBi(tumor)/(countsAi(tumor)+countsBi(tumor)) @@ -35,19 +41,25 @@ Calculation of LogR and BAF based on AlleleCount output is done as in [runASCAT. ### Loci file -The loci file was created based on the 1000Genomes latest release (phase 3, releasedate 20130502), available [here](ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp//release/20130502/ALL.wgs.phase3_shapeit2_mvncall_integrated_v5b.20130502.sites.vcf.gz). The following filter was applied: Only bi-allelc SNPs with minor allele frequencies > 0.3. 
The filtered file can be found on [export.uppmax.uu.se](https://export.uppmax.uu.se/b2015110/caw-references/b37/1000G_phase3_20130502_SNP_maf0.3.loci.tar.bz2) and is stored on Milou in: +The loci file was created based on the latest 1000 Genomes release (phase 3, release date 20130502), available [here](ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp//release/20130502/ALL.wgs.phase3_shapeit2_mvncall_integrated_v5b.20130502.sites.vcf.gz). +The following filter was applied: only bi-allelic SNPs with minor allele frequencies > 0.3. +The filtered file can be found on [export.uppmax.uu.se](https://export.uppmax.uu.se/b2015110/caw-references/b37/1000G_phase3_20130502_SNP_maf0.3.loci.tar.bz2) and is stored on Milou in: ``` /sw/data/uppnex/ToolBox/ReferenceAssemblies/hg38make/bundle/2.8/b37/1000G_phase3_20130502_SNP_maf0.3.loci ``` -The loci file was originally generated for GRCh37. It was translated into GRCh38 using the tool liftOver available at the UCSC Genome Browser. To run liftOver the loci file was first written in bed format: +The loci file was originally generated for GRCh37. +It was translated into GRCh38 using the tool liftOver available at the UCSC Genome Browser. +To run liftOver the loci file was first written in bed format: ``` awk '{print "chr"$1":"$2"-"$2}' 1000G_phase3_20130502_SNP_maf0.3.loci > 1000G_phase3_20130502_SNP_maf0.3.bed ``` -Using the web interface to liftOver at [genome.ucsc.edu](https://genome.ucsc.edu/cgi-bin/hgLiftOver) the file was translated into GRCh38 coordinates. LiftOver was possible for 3261270 out of 3268043 SNPs. The converted SNP positions were printed in the format required by AlleleCounter by: +Using the web interface to liftOver at [genome.ucsc.edu](https://genome.ucsc.edu/cgi-bin/hgLiftOver) the file was translated into GRCh38 coordinates. +LiftOver was possible for 3261270 out of 3268043 SNPs. 
+The converted SNP positions were printed in the format required by AlleleCounter by: ``` more hglft_genome_5834_13aba0.bed | awk 'BEGIN{FS="chr"} {print $2}' | awk 'BEGIN{FS="-"} {print $1}' | awk 'BEGIN{FS=":";OFS="\t"} {print $1,$2}' > 1000G_phase3_GRCh38_maf0.3.loci @@ -63,7 +75,8 @@ The loci file in GRCh38 coordinates is stored on Milou in: ### Run AlleleCount -AlleleCount is installed as part of the `bioinfo-tools` module on Milou. It runs on single bam files (tumor and normal separately) with the command below: +AlleleCount is installed as part of the `bioinfo-tools` module on Milou. +It runs on single bam files (tumor and normal separately) with the command below: ```bash $ module load bioinfo-tools alleleCount @@ -72,7 +85,8 @@ $ alleleCounter -l /sw/data/uppnex/ToolBox/ReferenceAssemblies/hg38make/bundle/2 ### Convert allele counts to LogR and BAF values -The allele counts can then be converted into LogR and BAF values using the script `convertAlleleCounts.r`. Usage for a male sample (`Gender = "XY"`, replace with `Gender = "XX"` for a female sample): +The allele counts can then be converted into LogR and BAF values using the script `convertAlleleCounts.r`. +Usage for a male sample (`Gender = "XY"`, replace with `Gender = "XX"` for a female sample): ```bash sbatch -A PROJID -p core -n 1 -t 240:00:00 -J convertAllelecounts -e convertAllelecounts.err -o convertAllelecounts.out /path/to/your/Sarek-fork/convertAlleleCounts.r tumor_sample tumor.allelecount normal_sample normal.allelecount XY @@ -82,7 +96,8 @@ This creates the BAF and LogR data for the tumor and normal samples, to be used ### Run ASCAT -The script "run_ascat.r" can be used to run ASCAT in the simplest possible way without compensating for the local CG content across the genome. It calls the main ASCAT R script [ascat.R](https://github.com/Crick-CancerGenomics/ascat/tree/master/ASCAT/R/ascat.R). 
+The script "run_ascat.r" can be used to run ASCAT in the simplest possible way without compensating for the local GC content across the genome. +It calls the main ASCAT R script [ascat.R](https://github.com/Crick-CancerGenomics/ascat/tree/master/ASCAT/R/ascat.R). ```bash sbatch -A PROJID -p core -n 1 -t 240:00:00 -J ascat -e ascat.err -o ascat.out run_ascat.r tumor_baf tumor_logr normal_baf normal_logr @@ -91,13 +106,3 @@ sbatch -A PROJID -p core -n 1 -t 240:00:00 -J ascat -e ascat.err -o ascat.out ru ## Flowchart ![Overview of ASCAT process](images/ascat.jpg "ASCAT") - --------------------------------------------------------------------------------- - -[![](images/SciLifeLab_logo.png "SciLifeLab")][scilifelab-link] -[![](images/NGI_logo.png "NGI")][ngi-link] -[![](images/NBIS_logo.png "NBIS")][nbis-link] - -[nbis-link]: https://www.nbis.se/ -[ngi-link]: https://ngisweden.scilifelab.se/ -[scilifelab-link]: https://www.scilifelab.se/ diff --git a/docs/CONFIG.md b/docs/CONFIG.md index 368324d1c4..01b6a6e4fe 100644 --- a/docs/CONFIG.md +++ b/docs/CONFIG.md @@ -4,11 +4,13 @@ For more informations on how to use configuration files, have a look at the [Nex For more informations about profiles, have a look at the [Nextflow documentation](https://www.nextflow.io/docs/latest/config.html#config-profiles) -We provides several configuration files and profiles for Sarek. The standard ones are designed to work on a Swedish UPPMAX clusters, and can be modified and tailored to your own need. +We provide several configuration files and profiles for Sarek. +The standard ones are designed to work on Swedish UPPMAX clusters, and can be modified and tailored to your own needs. ## Configuration files -Every configuration file can be modified for your own use. If you want you can specify the use of a config file using `-c ` +Every configuration file can be modified for your own use. 
+If you want you can specify the use of a config file using `-c ` ### [`containers.config`](https://github.com/SciLifeLab/Sarek/blob/master/configuration/containers.config) @@ -51,7 +53,8 @@ Will run the workflow on `/scratch` using the Nextflow [`scratch`](https://www.n ## profiles -Every profile can be modified for your own use. To use a profile, you'll need to specify `-profile ` +Every profile can be modified for your own use. +To use a profile, you'll need to specify `-profile ` ### `docker` @@ -79,13 +82,3 @@ Singularity images will be pulled automatically. This is the profile for Singularity testing on a small machine, or on Travis CI. Singularity images will be pulled automatically. - --------------------------------------------------------------------------------- - -[![](images/SciLifeLab_logo.png "SciLifeLab")][scilifelab-link] -[![](images/NGI_logo.png "NGI")][ngi-link] -[![](images/NBIS_logo.png "NBIS")][nbis-link] - -[nbis-link]: https://www.nbis.se/ -[ngi-link]: https://ngisweden.scilifelab.se/ -[scilifelab-link]: https://www.scilifelab.se/ diff --git a/docs/FOLDER.md b/docs/FOLDER.md index b3ff148e73..342df282a4 100644 --- a/docs/FOLDER.md +++ b/docs/FOLDER.md @@ -47,13 +47,3 @@ Read group information will be parsed from fastq file names according to this: - `RGPL` = "Illumina" - `PU` = sample - `RGLB` = lib - --------------------------------------------------------------------------------- - -[![](images/SciLifeLab_logo.png "SciLifeLab")][scilifelab-link] -[![](images/NGI_logo.png "NGI")][ngi-link] -[![](images/NBIS_logo.png "NBIS")][nbis-link] - -[nbis-link]: https://www.nbis.se/ -[ngi-link]: https://ngisweden.scilifelab.se/ -[scilifelab-link]: https://www.scilifelab.se/ diff --git a/docs/INSTALL.md b/docs/INSTALL.md index 4aafadfe89..cf76921b15 100644 --- a/docs/INSTALL.md +++ b/docs/INSTALL.md @@ -47,14 +47,3 @@ To update Sarek, it's also very simple: Follow the [references documentation](REFERENCES.md) on how to download/build the references 
files. Follow the [configuration and profile documentation](CONFIG.md) on how to modify and use the configuration files and profiles. - --------------------------------------------------------------------------------- - -[![](images/SciLifeLab_logo.png "SciLifeLab")][scilifelab-link] -[![](images/NGI_logo.png "NGI")][ngi-link] -[![](images/NBIS_logo.png "NBIS")][nbis-link] - -[nbis-link]: https://www.nbis.se/ -[nextflow-link]: https://www.nextflow.io/ -[ngi-link]: https://ngisweden.scilifelab.se/ -[scilifelab-link]: https://www.scilifelab.se/ diff --git a/docs/INSTALL_BIANCA.md b/docs/INSTALL_BIANCA.md index cbe9825e85..5f10e7247e 100644 --- a/docs/INSTALL_BIANCA.md +++ b/docs/INSTALL_BIANCA.md @@ -201,13 +201,3 @@ Repeat the same steps as for installing Sarek, and once the tar has been extract ``` You can for example keep a `default` version that you are sure is working, an make a link for a `testing` or `development` - --------------------------------------------------------------------------------- - -[![](images/SciLifeLab_logo.png "SciLifeLab")][scilifelab-link] -[![](images/NGI_logo.png "NGI")][ngi-link] -[![](images/NBIS_logo.png "NBIS")][nbis-link] - -[nbis-link]: https://www.nbis.se/ -[ngi-link]: https://ngisweden.scilifelab.se/ -[scilifelab-link]: https://www.scilifelab.se/ diff --git a/docs/INSTALL_RACKHAM.md b/docs/INSTALL_RACKHAM.md index 360fe8e6fe..ab79efb4b3 100644 --- a/docs/INSTALL_RACKHAM.md +++ b/docs/INSTALL_RACKHAM.md @@ -75,14 +75,3 @@ To use Sarek on rackham you will need to use the `slurmDownload` profile. 
# Run the workflow directly on the login node > nextflow run SciLifeLab/Sarek/main.nf --project [PROJECT] -profile slurmDownload ``` - --------------------------------------------------------------------------------- - -[![](images/SciLifeLab_logo.png "SciLifeLab")][scilifelab-link] -[![](images/NGI_logo.png "NGI")][ngi-link] -[![](images/NBIS_logo.png "NBIS")][nbis-link] - -[nbis-link]: https://www.nbis.se/ -[nextflow-link]: https://www.nextflow.io/ -[ngi-link]: https://ngisweden.scilifelab.se/ -[scilifelab-link]: https://www.scilifelab.se/ diff --git a/docs/INTERVALS.md b/docs/INTERVALS.md index e773898306..9520c077ea 100644 --- a/docs/INTERVALS.md +++ b/docs/INTERVALS.md @@ -1,18 +1,29 @@ # Intervals -To speed up the variant calling processes, the reference is chopped into smaller pieces. The variant calling is done by this intervals, and the different resulting VCFs are then merged. This can parallelize the variant calling processes, and push down the variant calling wall clock time significantly. +To speed up the variant calling processes, the reference is chopped into smaller pieces. +The variant calling is done per interval, and the resulting VCFs are then merged. +This parallelizes the variant calling processes and significantly reduces the variant calling wall-clock time. -The calling intervals can be defined using a `.list` or a `.bed` file. A `.list` file contains one interval per line in the format `chromosome:start-end` (1-based coordinates). +The calling intervals can be defined using a `.list` or a `.bed` file. +A `.list` file contains one interval per line in the format `chromosome:start-end` (1-based coordinates). -When the intervals file is in BED format, the file must be a tab-separated text file with one interval per line. There must be at least three columns: chromosome, start, and end. In BED format, the coordinates are 0-based, so the interval `chrom:1-10` becomes `chrom010`. 
+When the intervals file is in BED format, the file must be a tab-separated text file with one interval per line. +There must be at least three columns: chromosome, start, and end. +In BED format, the coordinates are 0-based, so the interval `chrom:1-10` becomes `chrom010`. -Additionally, the "score" column of the BED file can be used to provide an estimate of how many seconds it will take to call variants on that interval. The fourth column remains unused. Example (the fields would actually be tab-separated, this is not shown here): +Additionally, the "score" column of the BED file can be used to provide an estimate of how many seconds it will take to call variants on that interval. +The fourth column remains unused. +Example (the fields would actually be tab-separated, this is not shown here): `chr1 10000 207666 NA 47.3` This indicates that variant calling on the interval chr1:10001-207666 takes approximately 47.3 seconds. -The runtime estimate is used in two different ways. First, when there are multiple consecutive intervals in the file that take little time to compute, they are processed as a single job, thus reducing the number of processes that needs to be spawned. Second, the jobs with largest processing time are started first, which reduces wall-clock time. If no runtime is given, a time of 1000 nucleotides per second is assumed. Actual figures vary from 2 nucleotides/second to 30000 nucleotides/second. +The runtime estimate is used in two different ways. +First, when there are multiple consecutive intervals in the file that take little time to compute, they are processed as a single job, thus reducing the number of processes that needs to be spawned. +Second, the jobs with largest processing time are started first, which reduces wall-clock time. +If no runtime is given, a time of 1000 nucleotides per second is assumed. +Actual figures vary from 2 nucleotides/second to 30000 nucleotides/second. 
## GRCh37 @@ -38,13 +49,3 @@ samtools faidx /sw/data/uppnex/ToolBox/ReferenceAssemblies/hg38make/bundle/2.8/b bwa mem /sw/data/uppnex/ToolBox/hg38bundle/Homo_sapiens_assembly38.fasta intervals.fasta > intervals.sam grep -v '^@' intervals.sam | awk '{printf("%s:%d-%d\n", $3, $4, $4+$6-1)}' > tiny-GRCh38.list ``` - --------------------------------------------------------------------------------- - -[![](images/SciLifeLab_logo.png "SciLifeLab")][scilifelab-link] -[![](images/NGI_logo.png "NGI")][ngi-link] -[![](images/NBIS_logo.png "NBIS")][nbis-link] - -[nbis-link]: https://www.nbis.se/ -[ngi-link]: https://ngisweden.scilifelab.se/ -[scilifelab-link]: https://www.scilifelab.se/ diff --git a/docs/PROCESS.md b/docs/PROCESS.md index b37f3da14c..dbd045a4e3 100644 --- a/docs/PROCESS.md +++ b/docs/PROCESS.md @@ -48,13 +48,3 @@ We divide them for the moment into 5 main steps: - RunSnpeff - Run snpEff for annotation of vcf files - RunVEP - Run VEP for annotation of vcf files - --------------------------------------------------------------------------------- - -[![](images/SciLifeLab_logo.png "SciLifeLab")][scilifelab-link] -[![](images/NGI_logo.png "NGI")][ngi-link] -[![](images/NBIS_logo.png "NBIS")][nbis-link] - -[nbis-link]: https://www.nbis.se/ -[ngi-link]: https://ngisweden.scilifelab.se/ -[scilifelab-link]: https://www.scilifelab.se/ diff --git a/docs/REFERENCES.md b/docs/REFERENCES.md index 3a3fda85f7..084eb14696 100644 --- a/docs/REFERENCES.md +++ b/docs/REFERENCES.md @@ -21,7 +21,7 @@ The following files need to be downloaded: From our repo, get the [`intervals` list file](https://raw.githubusercontent.com/SciLifeLab/Sarek/master/repeats/wgs_calling_regions.grch37.list). More information about this file in the [intervals documentation](INTERVALS.md) -Description of how to generate the Loci file used in the ASCAT process is described [here](https://github.com/SciLifeLab/Sarek/blob/master/docs/ASCAT.md). 
+Description of how to generate the Loci file used in the ASCAT process is described [here](https://github.com/SciLifeLab/Sarek/blob/master/docs/ASCAT.md). You can create your own cosmic reference for any human reference as specified below. @@ -95,13 +95,3 @@ Same parameter used for other scripts. - GRCh37 - GRCh38 (not yet available) - smallGRCh37 - --------------------------------------------------------------------------------- - -[![](images/SciLifeLab_logo.png "SciLifeLab")][scilifelab-link] -[![](images/NGI_logo.png "NGI")][ngi-link] -[![](images/NBIS_logo.png "NBIS")][nbis-link] - -[nbis-link]: https://www.nbis.se/ -[ngi-link]: https://ngisweden.scilifelab.se/ -[scilifelab-link]: https://www.scilifelab.se/ diff --git a/docs/TESTS.md b/docs/TESTS.md index aabc59d6d0..458ffbad82 100644 --- a/docs/TESTS.md +++ b/docs/TESTS.md @@ -115,13 +115,3 @@ Four optional arguments are supported: # Will perform all tests using Singularity on manta test data ./scripts/test.sh -s Sarek-data/testdata/tsv/tiny-manta.tsv ``` - --------------------------------------------------------------------------------- - -[![](images/SciLifeLab_logo.png "SciLifeLab")][scilifelab-link] -[![](images/NGI_logo.png "NGI")][ngi-link] -[![](images/NBIS_logo.png "NBIS")][nbis-link] - -[nbis-link]: https://www.nbis.se/ -[ngi-link]: https://ngisweden.scilifelab.se/ -[scilifelab-link]: https://www.scilifelab.se/ diff --git a/docs/TSV.md b/docs/TSV.md index fdb6313a5c..084adb1c93 100644 --- a/docs/TSV.md +++ b/docs/TSV.md @@ -57,13 +57,3 @@ All the files will be in he Preprocessing/Recalibrated/ directory, and by defaul ```bash nextflow run SciLifeLab/Sarek/somaticVC.nf --sample Preprocessing/Recalibrated/mysample.tsv --tools Mutect2,Strelka ``` - --------------------------------------------------------------------------------- - -[![](images/SciLifeLab_logo.png "SciLifeLab")][scilifelab-link] -[![](images/NGI_logo.png "NGI")][ngi-link] -[![](images/NBIS_logo.png "NBIS")][nbis-link] - -[nbis-link]: 
https://www.nbis.se/ -[ngi-link]: https://ngisweden.scilifelab.se/ -[scilifelab-link]: https://www.scilifelab.se/ diff --git a/docs/USE_CASES.md b/docs/USE_CASES.md index 8dabca0fbc..b894d79498 100644 --- a/docs/USE_CASES.md +++ b/docs/USE_CASES.md @@ -189,13 +189,3 @@ SUBJECT_ID XX 1 SAMPLEIDR /samples/SAMPLEIDR.bam /samples/SAMPLEIDR ``` If you want to restart a previous run of the pipeline, you may not have a recalibrated BAM file. This is the case if HaplotypeCaller was the only tool (recalibration is done on-the-fly with HaplotypeCaller to improve performance and save space). In this case, you need to start with `--step=recalibrate` (see previous section). - --------------------------------------------------------------------------------- - -[![](images/SciLifeLab_logo.png "SciLifeLab")][scilifelab-link] -[![](images/NGI_logo.png "NGI")][ngi-link] -[![](images/NBIS_logo.png "NBIS")][nbis-link] - -[nbis-link]: https://www.nbis.se/ -[ngi-link]: https://ngisweden.scilifelab.se/ -[scilifelab-link]: https://www.scilifelab.se/ From df4cb0530e61bcb8deb787a3cac3c81d51a0874e Mon Sep 17 00:00:00 2001 From: Maxime Garcia Date: Thu, 30 Aug 2018 17:25:19 +0200 Subject: [PATCH 19/25] update docs --- README.md | 5 ++--- docs/README.md | 5 ++--- 2 files changed, 4 insertions(+), 6 deletions(-) diff --git a/README.md b/README.md index ee26669da0..9f5301e456 100644 --- a/README.md +++ b/README.md @@ -86,9 +86,8 @@ The Sarek pipeline comes with documentation in the `docs/` directory: 10. [TSV file documentation](https://github.com/SciLifeLab/Sarek/blob/master/docs/TSV.md) 11. [Processes documentation](https://github.com/SciLifeLab/Sarek/blob/master/docs/PROCESS.md) 12. [Documentation about containers](https://github.com/SciLifeLab/Sarek/blob/master/docs/CONTAINERS.md) -13. [Documentation about building](https://github.com/SciLifeLab/Sarek/blob/master/docs/BUILD.md) -14. [More information about ASCAT](https://github.com/SciLifeLab/Sarek/blob/master/docs/ASCAT.md) -15. 
[Folder structure](https://github.com/SciLifeLab/Sarek/blob/master/docs/FOLDER.md) +13. [More information about ASCAT](https://github.com/SciLifeLab/Sarek/blob/master/docs/ASCAT.md) +14. [Output documentation structure](https://github.com/SciLifeLab/Sarek/blob/master/docs/OUTPUT.md) ## Contributions & Support diff --git a/docs/README.md b/docs/README.md index 628ebeccac..be93b67e25 100644 --- a/docs/README.md +++ b/docs/README.md @@ -14,6 +14,5 @@ The Sarek pipeline comes with the following documentation: 10. [TSV file documentation](https://github.com/SciLifeLab/Sarek/blob/master/docs/TSV.md) 11. [Processes documentation](https://github.com/SciLifeLab/Sarek/blob/master/docs/PROCESS.md) 12. [Documentation about containers](https://github.com/SciLifeLab/Sarek/blob/master/docs/CONTAINERS.md) -13. [Documentation about building](https://github.com/SciLifeLab/Sarek/blob/master/docs/BUILD.md) -14. [More information about ASCAT](https://github.com/SciLifeLab/Sarek/blob/master/docs/ASCAT.md) -15. [Folder structure](https://github.com/SciLifeLab/Sarek/blob/master/docs/FOLDER.md) +13. [More information about ASCAT](https://github.com/SciLifeLab/Sarek/blob/master/docs/ASCAT.md) +14. 
[Output documentation structure](https://github.com/SciLifeLab/Sarek/blob/master/docs/OUTPUT.md) From 649af667a885e70c24140362a2829f125043c6cf Mon Sep 17 00:00:00 2001 From: Maxime Garcia Date: Thu, 30 Aug 2018 17:26:05 +0200 Subject: [PATCH 20/25] update base config and manifest --- conf/base.config | 16 +++++----------- nextflow.config | 5 ++++- 2 files changed, 9 insertions(+), 12 deletions(-) diff --git a/conf/base.config b/conf/base.config index 6a1a6bceae..db4b0891f6 100644 --- a/conf/base.config +++ b/conf/base.config @@ -11,40 +11,29 @@ wf_repository = 'maxulysse' params { // set up default params - annotateTools = '' // Tools to annotate by annotate.nf - annotateVCF = '' // Files to annotate by annotate.nf containerPath = '' // Path to Singularity images - containers = '' // List of containers to build in buildContainers.nf docker = false // Don't use Docker to build buildContainers.nf download = false // Don't download reference files in buildReferences.nf explicitBqsrNeeded = true // Enable recalibration in main.nf genome = 'GRCh38' // Default reference genome is GRCh38 - genome_base = '' // Path to the reference files help = false // Don't give help information max_cpus = 16 // Base specifications max_memory = 128.GB // Base specifications max_time = 240.h // Base specifications more = false // Don't give version information - nfRequiredVersion = '0.25.0' // Minimum version of nextflow required noBAMQC = false // Use BAMQC noGVCF = false // HaplotypeCaller will output gVCF as well noReports = false // Reports are made by default nucleotidesPerSecond = 1000.0 // To estimate interval size by default onlyQC = false // All process will be run and not only the QC tools outDir = "${PWD}" // Path to output directory - project = '' // UPPMAX project number push = false // Don't push container to DockerHub - refDir = '' // Path to the references to build repository = wf_repository // DockerHub containers repository - sample = '' // sample files in tsv format - 
sampleDir = '' // samples directory (for Germline only) - sequencing_center = '' // CN field in BAM files singularity = false // Don't use singularity to build buildContainers.nf step = 'mapping' // Default step is mapping strelkaBP = false // Don't use Manta's candidate indels as input to Strelka tag = 'latest' // Default tag is latest, to be overwritten by --tag test = false // Not testing by default - tools = '' // List of tools to use verbose = false // Enable for more verbose information version = '2.1.0' // Workflow version } @@ -69,6 +58,11 @@ timeline { // Turning on timeline tracking by default file = "${params.outDir}/Reports/Sarek_timeline.html" } +dag { // Turning on dag by default + enabled = true + file = "${params.outDir}/Reports/Sarek_DAG.svg" +} + trace { // Turning on trace tracking by default enabled = true fields = 'process,task_id,hash,name,attempt,status,exit,realtime,%cpu,vmem,rss,submit,start,complete,duration,realtime,rchar,wchar' diff --git a/nextflow.config b/nextflow.config index 4525c76ac0..449e5a4001 100644 --- a/nextflow.config +++ b/nextflow.config @@ -9,8 +9,11 @@ */ manifest { - homePage = 'http://opensource.scilifelab.se/projects/sarek/' description = 'Sarek - Workflow For Somatic And Germline Variations' + homePage = 'http://sarek.scilifelab.se' + mainScript = 'main.nf' + name = 'Sarek' + nextflowVersion = '>=0.31.0' } env { From 8c6b4fc2083b05941e1e73c7e4d45361d1d3dc5a Mon Sep 17 00:00:00 2001 From: Maxime Garcia Date: Thu, 30 Aug 2018 17:26:39 +0200 Subject: [PATCH 21/25] update CHANGELOG --- CHANGELOG.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/CHANGELOG.md b/CHANGELOG.md index 274ea46ac3..0ed16e20e0 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -19,6 +19,8 @@ and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0. 
- [#615](https://github.com/SciLifeLab/Sarek/pull/615) - Use `splitCsv` instead of `readlines` - [#621](https://github.com/SciLifeLab/Sarek/pull/621) - Improve install script - [#621](https://github.com/SciLifeLab/Sarek/pull/621) - Simplify tests +- [#627](https://github.com/SciLifeLab/Sarek/pull/627), [#629](https://github.com/SciLifeLab/Sarek/pull/629) - Refactor docs +- [#629](https://github.com/SciLifeLab/Sarek/pull/629) - Refactor config ### `Removed` - [#616](https://github.com/SciLifeLab/Sarek/pull/616) - Remove old Issue Template From afac39520a1b79b07135152d92e646fd97fe54ed Mon Sep 17 00:00:00 2001 From: Maxime Garcia Date: Mon, 3 Sep 2018 10:47:47 +0200 Subject: [PATCH 22/25] better docs --- docs/CONTAINERS.md | 23 +++++++++++++------ docs/FOLDER.md | 49 ---------------------------------------- docs/INSTALL.md | 5 +++-- docs/OUTPUT.md | 2 +- docs/USAGE.md | 56 ++++++++++++++++++++++++++++++++++++++++++---- 5 files changed, 72 insertions(+), 63 deletions(-) delete mode 100644 docs/FOLDER.md diff --git a/docs/CONTAINERS.md b/docs/CONTAINERS.md index 092888626e..9766a4419c 100644 --- a/docs/CONTAINERS.md +++ b/docs/CONTAINERS.md @@ -4,6 +4,8 @@ Subsets of all containers can be downloaded: - For processing, germline and somatic variant calling and Reports: - [sarek](#sarek-) + - [r-base](#r-base-) + - [runallelecount](#runallelecount-) - For annotation for GRCh37, you will need: - [snpeffgrch37](#snpeffgrch37-) - [vepgrch37](#vepgrch37-) @@ -21,10 +23,14 @@ All the containers have built in UPPMAX directories, so there is no need to add ### Usage ```bash -nextflow run . [--docker] [--singularity] [--containerPath ] [--push] [--containers ] [--repository ] [--tag tag] +nextflow run buildContainers.nf [--docker] [--singularity] / +[--containerPath ] [--push] [--containers ] / +[--repository ] [--tag tag] ``` -- `--containers`: Choose which containers to build. Default: `all`. 
Possible values (to separate by commas): +- `--containers`: Choose which containers to build. +Default: `all`. +Possible values (to separate by commas): - `all` - all available containers. - `r-base` - the [r-base](#r-base-) container. - `runallelecount` - the [runallelecount](#runallelecount-) container. @@ -37,15 +43,18 @@ nextflow run . [--docker] [--singularity] [--containerPath ] [--push] [--c - `--docker`: Build containers using `Docker` - `--push`: Push containers to `DockerHub` -- `--repository`: Build containers under given repository. Default: `maxulysse` +- `--repository`: Build containers under given repository. +Default: `maxulysse` - `--singularity`: Build containers using `Singularity`. -- `--containerPath`: Select where to download containers. Default: `$PWD` -- `--tag`: Build containers using given tag. Default is version number. +- `--containerPath`: Select where to download containers. +Default: `$PWD` +- `--tag`: Build containers using given tag. +Default is version number. ### Example ```bash -nextflow run . --docker --singularity --push --containers multiqc,fastqc +nextflow run buildContainers.nf --docker --singularity --push --containers sarek ``` ### For lazy users @@ -53,7 +62,7 @@ We provide script to build/push or pull all containers ```bash ./scripts/do_all.sh # Build all docker containers ./scripts/do_all.sh --push # Build and push all Docker containers into DockerHub -./scripts/do_all.sh --pull # Pull all containers from DockerHub into Singularity +./scripts/do_all.sh --pull # Pull all containers from DockerHub with Singularity ``` ## What is actually inside the containers diff --git a/docs/FOLDER.md b/docs/FOLDER.md deleted file mode 100644 index 342df282a4..0000000000 --- a/docs/FOLDER.md +++ /dev/null @@ -1,49 +0,0 @@ -# Project folder structure - -The workflow is started for a sample, or a set of samples from the same person. -Each different physical samples is identified by its own ID. 
-For example in a Tumor/Normal settings, this ID could correspond to "Normal", "Tumor 1", "Tumor 2" etc corresponding to all physical samples from the same cancer patient. - -Below is an overview of the intended folder structure for an analyzed project. - -![Project folder structure](images/folder_structure.jpg "Folder structure") - -# Input Fastq file name conventions - -The input folder, containing the fastq files for one ID (Individual) should be organized into one subfolder for every sample. All fastq files for that sample should be collected here. - -``` -ID -+--sample1 -+------sample1_lib_flowcell-index_lane_R1_1000.fastq.gz -+------sample1_lib_flowcell-index_lane_R2_1000.fastq.gz -+------sample1_lib_flowcell-index_lane_R1_1000.fastq.gz -+------sample1_lib_flowcell-index_lane_R2_1000.fastq.gz -+--sample2 -+------sample2_lib_flowcell-index_lane_R1_1000.fastq.gz -+------sample2_lib_flowcell-index_lane_R2_1000.fastq.gz -+--sample3 -+------sample3_lib_flowcell-index_lane_R1_1000.fastq.gz -+------sample3_lib_flowcell-index_lane_R2_1000.fastq.gz -+------sample3_lib_flowcell-index_lane_R1_1000.fastq.gz -+------sample3_lib_flowcell-index_lane_R2_1000.fastq.gz -``` - -Fastq filename structure: - -- `sample_lib_flowcell-index_lane_R1_1000.fastq.gz` and -- `sample_lib_flowcell-index_lane_R2_1000.fastq.gz` - -Where: - -- `sample` = sample id -- `lib` = indentifier of libaray preparation -- `flowcell` = identifyer of flow cell for the sequencing run -- `lane` = identifier of the lane of the sequencing run - -Read group information will be parsed from fastq file names according to this: - -- `RGID` = "sample_lib_flowcell_index_lane" -- `RGPL` = "Illumina" -- `PU` = sample -- `RGLB` = lib diff --git a/docs/INSTALL.md b/docs/INSTALL.md index cf76921b15..c07c442658 100644 --- a/docs/INSTALL.md +++ b/docs/INSTALL.md @@ -2,7 +2,8 @@ This small tutorial will explain to you how to install and run Sarek on a small sample test data on any POSIX compatible system (Linux, Solaris, OS 
X, etc). -To use this pipeline, you need to have a working version of Nextflow installed, Reference files and Docker or Singularity to facilitate the use of other tools. You can use a small reference genome as testing +To use this pipeline, you need to have a working version of Nextflow installed, Reference files and Docker or Singularity as a container engine. +You can use a small reference genome as testing. - See the [Install Nextflow documentation](https://www.nextflow.io/docs/latest/getstarted.html#installation) - See the [Reference files documentation](REFERENCES.md) @@ -28,7 +29,7 @@ Docker can also be used as a container technology. You can [Test Sarek with small dataset and small reference](https://github.com/SciLifeLab/Sarek/blob/master/docs/TESTS.md) -## Update +## Update To update Sarek, it's also very simple: diff --git a/docs/OUTPUT.md b/docs/OUTPUT.md index f9a256a19c..422ed59a1e 100644 --- a/docs/OUTPUT.md +++ b/docs/OUTPUT.md @@ -1,5 +1,5 @@ # Sarek - output delivery -This README describes the output delivery directory structure. +This document describes the output delivery directory structure. There are four sections dedicated for different results: Annotation, Preprocessing, Reports and VariantCalling. All the four sections can have sub-directories containing results from different software. diff --git a/docs/USAGE.md b/docs/USAGE.md index 16e46f975f..b1474d0f31 100644 --- a/docs/USAGE.md +++ b/docs/USAGE.md @@ -1,7 +1,57 @@ # Usage -I would recommand to run Nextflow within a [screen](https://www.gnu.org/software/screen/) or [tmux](https://tmux.github.io/) session. -It is recommended to run only one instance of Sarek for one patient in the same directory. +I would recommend to run Nextflow within a [screen](https://www.gnu.org/software/screen/) or [tmux](https://tmux.github.io/) session. + +## Project folder structure + +The workflow is started for a sample, or a set of samples from the same Individual. 
+ +Each different physical samples is identified by its own ID. +For example in a Tumour/Normal settings, this ID could correspond to "Normal", "Tumour_1", "Tumour_2" etc. corresponding to all physical samples from the same patient. + +## Input FASTQ file name best practices + +The input folder, containing the FASTQ files for one individual (ID) should be organized into one subfolder for every sample. +All fastq files for that sample should be collected here. + +``` +ID ++--sample1 ++------sample1_lib_flowcell-index_lane_R1_1000.fastq.gz ++------sample1_lib_flowcell-index_lane_R2_1000.fastq.gz ++------sample1_lib_flowcell-index_lane_R1_1000.fastq.gz ++------sample1_lib_flowcell-index_lane_R2_1000.fastq.gz ++--sample2 ++------sample2_lib_flowcell-index_lane_R1_1000.fastq.gz ++------sample2_lib_flowcell-index_lane_R2_1000.fastq.gz ++--sample3 ++------sample3_lib_flowcell-index_lane_R1_1000.fastq.gz ++------sample3_lib_flowcell-index_lane_R2_1000.fastq.gz ++------sample3_lib_flowcell-index_lane_R1_1000.fastq.gz ++------sample3_lib_flowcell-index_lane_R2_1000.fastq.gz +``` + +Fastq filename structure: + +- `sample_lib_flowcell-index_lane_R1_1000.fastq.gz` and +- `sample_lib_flowcell-index_lane_R2_1000.fastq.gz` + +Where: + +- `sample` = sample id +- `lib` = indentifier of libaray preparation +- `flowcell` = identifyer of flow cell for the sequencing run +- `lane` = identifier of the lane of the sequencing run + +Read group information will be parsed from fastq file names according to this: + +- `RGID` = "sample_lib_flowcell_index_lane" +- `RGPL` = "Illumina" +- `PU` = sample +- `RGLB` = lib + +## Scripts + Sarek uses several scripts, a wrapper is currently being made to simplify the command lines. Currently the typical reduced command lines are: @@ -70,7 +120,6 @@ Choose which tools will be used in the workflow. 
Different tools to be separated
 - manta (use `Manta` for SV) (germlineVC,somaticVC)
 - strelka (use `Strelka` for VC) (germlineVC,somaticVC)
 - ascat (use `ASCAT` for CNV) (somaticVC)
-- mutect1 (use `MuTect1` for VC) (somaticVC)
 - mutect2 (use `MuTect2` for VC) (somaticVC)
 - snpeff (use `snpEff` for Annotation) (annotate)
 - vep (use `VEP` for Annotation) (annotate)
@@ -82,7 +131,6 @@ Choose which tools will be used in the workflow. Different tools to be separated
 Choose which tools to annotate. Different tools to be separated by commas. Possible values are:
 - haplotypecaller (Annotate `HaplotypeCaller` output)
 - manta (Annotate `Manta` output)
-- mutect1 (Annotate `MuTect1` output)
 - mutect2 (Annotate `MuTect2` output)
 - strelka (Annotate `Strelka` output)

From b1191ae5034506b8999634a42b8f4d6293631d99 Mon Sep 17 00:00:00 2001
From: Maxime Garcia
Date: Wed, 5 Sep 2018 16:50:43 +0200
Subject: [PATCH 23/25] add logo for exome seq [skip ci]

---
 docs/images/Sarek_exome_logo.png | Bin 0 -> 10775 bytes
 1 file changed, 0 insertions(+), 0 deletions(-)
 create mode 100644 docs/images/Sarek_exome_logo.png

diff --git a/docs/images/Sarek_exome_logo.png b/docs/images/Sarek_exome_logo.png
new file mode 100644
index 0000000000000000000000000000000000000000..b1e11e37a832f7cfd48176a8819688552b0e93d7
GIT binary patch
literal 10775
[binary PNG data omitted]

Date: Thu, 6 Sep 2018 12:01:35 +0200
Subject: [PATCH 24/25] rework version gathering

---
 annotate.nf | 28 ----------------
 bin/scrape_tool_versions.py | 26 +++++++--------
 buildContainers.nf | 14 --------
 buildReferences.nf | 14 --------
 germlineVC.nf | 49 ----------------------------
 lib/QC.groovy | 34 --------------------
 main.nf | 58 +--------------------------------
 runMultiQC.nf | 25 +++++++--------
 somaticVC.nf | 64 ++-----------------------------------
 9 files changed, 27 insertions(+), 285 deletions(-)

diff --git a/annotate.nf b/annotate.nf
index d3571a15cd..8fab6e3b0a 100644 --- a/annotate.nf +++ b/annotate.nf @@ -34,20 +34,6 @@ kate: syntax groovy; space-indent on; indent-width 2; ================================================================================ */ -// Check that Nextflow version is up to date enough -// try / throw / catch works for NF versions < 0.25 when this was implemented -try { - if( ! nextflow.version.matches(">= ${params.nfRequiredVersion}") ){ - throw GroovyException('Nextflow version too old') - } -} catch (all) { - log.error "====================================================\n" + - " Nextflow version ${params.nfRequiredVersion} required! You are running v${workflow.nextflow.version}.\n" + - " Pipeline execution will continue, but things may break.\n" + - " Please update Nextflow.\n" + - "============================================================" -} - if (params.help) exit 0, helpMessage() if (!SarekUtils.isAllowedParams(params)) exit 1, "params unknown, see --help for more information" if (!checkUppmaxProject()) exit 1, "No UPPMAX project ID found! 
Use --project " @@ -279,13 +265,6 @@ if (params.verbose) vcfCompressedoutput = vcfCompressedoutput.view { "Index : ${it[3].fileName}" } -process GetVersionBCFtools { - publishDir directoryMap.version, mode: 'link' - output: file("v_*.txt") - when: !params.noReports - script: QC.getVersionBCFtools() -} - process GetVersionSnpEFF { publishDir directoryMap.version, mode: 'link' output: file("v_*.txt") @@ -293,13 +272,6 @@ process GetVersionSnpEFF { script: QC.getVersionSnpEFF() } -process GetVersionVCFtools { - publishDir directoryMap.version, mode: 'link' - output: file("v_*.txt") - when: !params.noReports - script: QC.getVersionVCFtools() -} - process GetVersionVEP { publishDir directoryMap.version, mode: 'link' output: file("v_*.txt") diff --git a/bin/scrape_tool_versions.py b/bin/scrape_tool_versions.py index fcc61c4c7f..afecaf112d 100755 --- a/bin/scrape_tool_versions.py +++ b/bin/scrape_tool_versions.py @@ -9,7 +9,7 @@ 'bcftools': ['v_bcftools.txt', r"bcftools (\S+)"], 'BWA': ['v_bwa.txt', r"Version: (\S+)"], 'FastQC': ['v_fastqc.txt', r"FastQC v(\S+)"], - 'GATK': ['v_gatk.txt', r"GATK version(\S+)"], + 'GATK': ['v_gatk.txt', r"Version:(\S+)"], 'htslib': ['v_samtools.txt', r"htslib (\S+)"], 'Manta': ['v_manta.txt', r"([0-9.]+)"], 'MultiQC': ['v_multiqc.txt', r"multiqc, version (\S+)"], @@ -28,24 +28,24 @@ results = OrderedDict() results['Sarek'] = 'N/A' results['Nextflow'] = 'N/A' +results['AlleleCount'] = 'N/A' +results['ASCAT'] = 'N/A' +results['bcftools'] = 'N/A' results['BWA'] = 'N/A' -results['samtools'] = 'N/A' -results['htslib'] = 'N/A' +results['FastQC'] = 'N/A' +results['FreeBayes'] = 'N/A' results['GATK'] = 'N/A' -results['Picard'] = 'N/A' +results['htslib'] = 'N/A' results['Manta'] = 'N/A' -results['Strelka'] = 'N/A' -results['FreeBayes'] = 'N/A' -results['AlleleCount'] = 'N/A' +results['MultiQC'] = 'N/A' +results['Picard'] = 'N/A' +results['Qualimap'] = 'N/A' results['R'] = 'N/A' -results['ASCAT'] = 'N/A' +results['samtools'] = 'N/A' 
results['SnpEff'] = 'N/A' -results['VEP'] = 'N/A' -results['FastQC'] = 'N/A' -results['Qualimap'] = 'N/A' -results['bcftools'] = 'N/A' +results['Strelka'] = 'N/A' results['vcftools'] = 'N/A' -results['MultiQC'] = 'N/A' +results['VEP'] = 'N/A' # Search each file using its regex for k, v in regexes.items(): diff --git a/buildContainers.nf b/buildContainers.nf index b7f016d2d7..8aa546c657 100644 --- a/buildContainers.nf +++ b/buildContainers.nf @@ -34,20 +34,6 @@ kate: syntax groovy; space-indent on; indent-width 2; ================================================================================ */ -// Check that Nextflow version is up to date enough -// try / throw / catch works for NF versions < 0.25 when this was implemented -try { - if( ! nextflow.version.matches(">= ${params.nfRequiredVersion}") ){ - throw GroovyException('Nextflow version too old') - } -} catch (all) { - log.error "====================================================\n" + - " Nextflow version ${params.nfRequiredVersion} required! You are running v${workflow.nextflow.version}.\n" + - " Pipeline execution will continue, but things may break.\n" + - " Please update Nextflow.\n" + - "============================================================" -} - if (params.help) exit 0, helpMessage() if (!SarekUtils.isAllowedParams(params)) exit 1, "params unknown, see --help for more information" if (!checkUppmaxProject()) exit 1, "No UPPMAX project ID found! Use --project " diff --git a/buildReferences.nf b/buildReferences.nf index cfc7a567a2..8403d3de06 100644 --- a/buildReferences.nf +++ b/buildReferences.nf @@ -37,20 +37,6 @@ kate: syntax groovy; space-indent on; indent-width 2; ================================================================================ */ -// Check that Nextflow version is up to date enough -// try / throw / catch works for NF versions < 0.25 when this was implemented -try { - if( ! 
nextflow.version.matches(">= ${params.nfRequiredVersion}") ){ - throw GroovyException('Nextflow version too old') - } -} catch (all) { - log.error "====================================================\n" + - " Nextflow version ${params.nfRequiredVersion} required! You are running v${workflow.nextflow.version}.\n" + - " Pipeline execution will continue, but things may break.\n" + - " Please update Nextflow.\n" + - "============================================================" -} - if (params.help) exit 0, helpMessage() if (!SarekUtils.isAllowedParams(params)) exit 1, "params unknown, see --help for more information" if (!checkUppmaxProject()) exit 1, "No UPPMAX project ID found! Use --project " diff --git a/germlineVC.nf b/germlineVC.nf index 4dd71fb528..1fb96776cb 100644 --- a/germlineVC.nf +++ b/germlineVC.nf @@ -40,20 +40,6 @@ kate: syntax groovy; space-indent on; indent-width 2; ================================================================================ */ -// Check that Nextflow version is up to date enough -// try / throw / catch works for NF versions < 0.25 when this was implemented -try { - if( ! nextflow.version.matches(">= ${params.nfRequiredVersion}") ){ - throw GroovyException('Nextflow version too old') - } -} catch (all) { - log.error "====================================================\n" + - " Nextflow version ${params.nfRequiredVersion} required! You are running v${workflow.nextflow.version}.\n" + - " Pipeline execution will continue, but things may break.\n" + - " Please update Nextflow.\n" + - "============================================================" -} - if (params.help) exit 0, helpMessage() if (!SarekUtils.isAllowedParams(params)) exit 1, "params unknown, see --help for more information" if (!checkUppmaxProject()) exit 1, "No UPPMAX project ID found! 
Use --project " @@ -590,41 +576,6 @@ if (params.verbose) vcfReport = vcfReport.view { vcfReport.close() -process GetVersionGATK { - publishDir directoryMap.version, mode: 'link' - output: file("v_*.txt") - when: 'haplotypecaller' in tools && !params.onlyQC - script: QC.getVersionGATK() -} - -process GetVersionStrelka { - publishDir directoryMap.version, mode: 'link' - output: file("v_*.txt") - when: 'strelka' in tools && !params.onlyQC - script: QC.getVersionStrelka() -} - -process GetVersionManta { - publishDir directoryMap.version, mode: 'link' - output: file("v_*.txt") - when: 'manta' in tools && !params.onlyQC - script: QC.getVersionManta() -} - -process GetVersionBCFtools { - publishDir directoryMap.version, mode: 'link' - output: file("v_*.txt") - when: !params.noReports - script: QC.getVersionBCFtools() -} - -process GetVersionVCFtools { - publishDir directoryMap.version, mode: 'link' - output: file("v_*.txt") - when: !params.noReports - script: QC.getVersionVCFtools() -} - /* ================================================================================ = F U N C T I O N S = diff --git a/lib/QC.groovy b/lib/QC.groovy index 2f6885d84b..6e6ef83eb9 100644 --- a/lib/QC.groovy +++ b/lib/QC.groovy @@ -49,26 +49,6 @@ class QC { """ } -// Get BCFtools version - static def getVersionBCFtools() { - """ - bcftools version > v_bcftools.txt - """ - } - -// Get GATK version - static def getVersionGATK() { - """ - gatk ApplyBQSR --help 2>&1| awk -F/ '/java/{for(i=1;i<=NF;i++){if(\$i~/gatk4/){sub("gatk4-","",\$i);print \$i>"v_gatk.txt"}}}' - """ - } - -// Get Manta version - static def getVersionManta() { - """ - configManta.py --version > v_manta.txt - """ - } // Get SnpEFF version static def getVersionSnpEFF() { @@ -77,20 +57,6 @@ class QC { """ } -// Get Strelka version - static def getVersionStrelka() { - """ - configureStrelkaGermlineWorkflow.py --version > v_strelka.txt - """ - } - -// Get VCFtools version - static def getVersionVCFtools() { - """ - vcftools 
--version > v_vcftools.txt - """ - } - // Get VEP version static def getVersionVEP() { """ diff --git a/main.nf b/main.nf index c967555402..67039d860a 100644 --- a/main.nf +++ b/main.nf @@ -40,20 +40,6 @@ kate: syntax groovy; space-indent on; indent-width 2; ================================================================================ */ -// Check that Nextflow version is up to date enough -// try / throw / catch works for NF versions < 0.25 when this was implemented -try { - if( ! nextflow.version.matches(">= ${params.nfRequiredVersion}") ){ - throw GroovyException('Nextflow version too old') - } -} catch (all) { - log.error "====================================================\n" + - " Nextflow version ${params.nfRequiredVersion} required! You are running v${workflow.nextflow.version}.\n" + - " Pipeline execution will continue, but things may break.\n" + - " Please update Nextflow.\n" + - "============================================================" -} - if (params.help) exit 0, helpMessage() if (!SarekUtils.isAllowedParams(params)) exit 1, "params unknown, see --help for more information" if (!checkUppmaxProject()) exit 1, "No UPPMAX project ID found! 
Use --project "
@@ -393,7 +379,7 @@ process RecalibrateBam {
     --output ${idSample}.recal.bam \
     -L ${intervals} \
     --create-output-bam-index true \
-    --bqsr-recal-file ${recalibrationReport} 
+    --bqsr-recal-file ${recalibrationReport}
   """
 }
 // Creating a TSV file to restart from this step
@@ -452,48 +438,6 @@ if (params.verbose) bamQCreport = bamQCreport.view {
   Dir : [${it.fileName}]"
 }
 
-process GetVersionBamQC {
-  publishDir directoryMap.version, mode: 'link'
-  output: file("v_*.txt")
-  when: !params.noReports && !params.noBAMQC
-
-  script:
-  """
-  qualimap --version &> v_qualimap.txt
-  """
-}
-
-process GetVersionBWAsamtools {
-  publishDir directoryMap.version, mode: 'link'
-  output: file("v_*.txt")
-  when: step == 'mapping' && !params.onlyQC
-
-  script:
-  """
-  bwa &> v_bwa.txt 2>&1 || true
-  samtools --version &> v_samtools.txt
-  """
-}
-
-process GetVersionFastQC {
-  publishDir directoryMap.version, mode: 'link'
-  output:
-    file("v_fastqc.txt")
-  when: step == 'mapping' && !params.noReports
-
-  script:
-  """
-  fastqc -v > v_fastqc.txt
-  """
-}
-
-process GetVersionGATK {
-  publishDir directoryMap.version, mode: 'link'
-  output: file("v_*.txt")
-  when: !params.onlyQC
-  script: QC.getVersionGATK()
-}
-
 /*
 ================================================================================
 = F U N C T I O N S =
diff --git a/runMultiQC.nf b/runMultiQC.nf
index 59ed74a3d1..1781cee5fb 100644
--- a/runMultiQC.nf
+++ b/runMultiQC.nf
@@ -33,20 +33,6 @@ kate: syntax groovy; space-indent on; indent-width 2;
 ================================================================================
 */
 
-// Check that Nextflow version is up to date enough
-// try / throw / catch works for NF versions < 0.25 when this was implemented
-try {
-  if( ! nextflow.version.matches(">= ${params.nfRequiredVersion}") ){
-    throw GroovyException('Nextflow version too old')
-  }
-} catch (all) {
-  log.error "====================================================\n" +
-    " Nextflow version ${params.nfRequiredVersion} required! You are running v${workflow.nextflow.version}.\n" +
-    " Pipeline execution will continue, but things may break.\n" +
-    " Please update Nextflow.\n" +
-    "============================================================"
-}
-
 if (params.help) exit 0, helpMessage()
 if (!SarekUtils.isAllowedParams(params)) exit 1, "params unknown, see --help for more information"
 if (!checkUppmaxProject()) exit 1, "No UPPMAX project ID found! Use --project "
@@ -73,9 +59,20 @@ process GetVersionAll {
 
   script:
   """
+  bcftools version > v_bcftools.txt
+  bwa &> v_bwa.txt 2>&1 || true
+  configManta.py --version > v_manta.txt
+  configureStrelkaGermlineWorkflow.py --version > v_strelka.txt
   echo "${params.version}" &> v_sarek.txt
   echo "${workflow.nextflow.version}" &> v_nextflow.txt
+  fastqc -v > v_fastqc.txt
+  freebayes --version > v_freebayes.txt
+  gatk ApplyBQSR --help 2>&1 | grep Version: > v_gatk.txt
   multiqc --version &> v_multiqc.txt
+  qualimap --version &> v_qualimap.txt
+  samtools --version &> v_samtools.txt
+  vcftools --version > v_vcftools.txt
+  scrape_tool_versions.py &> tool_versions_mqc.yaml
   """
 }
 
diff --git a/somaticVC.nf b/somaticVC.nf
index 0e2347bb0e..5801a7a3c3 100644
--- a/somaticVC.nf
+++ b/somaticVC.nf
@@ -44,20 +44,6 @@ kate: syntax groovy; space-indent on; indent-width 2;
 ================================================================================
 */
 
-// Check that Nextflow version is up to date enough
-// try / throw / catch works for NF versions < 0.25 when this was implemented
-try {
-  if( ! nextflow.version.matches(">= ${params.nfRequiredVersion}") ){
-    throw GroovyException('Nextflow version too old')
-  }
-} catch (all) {
-  log.error "====================================================\n" +
-    " Nextflow version ${params.nfRequiredVersion} required! You are running v${workflow.nextflow.version}.\n" +
-    " Pipeline execution will continue, but things may break.\n" +
-    " Please update Nextflow.\n" +
-    "============================================================"
-}
-
 if (params.help) exit 0, helpMessage()
 if (!SarekUtils.isAllowedParams(params)) exit 1, "params unknown, see --help for more information"
 if (!checkUppmaxProject()) exit 1, "No UPPMAX project ID found! Use --project "
@@ -197,7 +183,7 @@ bamsTumor = bamsTumor.map { idPatient, status, idSample, bam, bai -> [idPatient,
 // We know that MuTect2 (and other somatic callers) are notoriously slow.
 // To speed them up we are chopping the reference into smaller pieces.
 // Do variant calling by this intervals, and re-merge the VCFs.
-// Since we are on a cluster or a multi-CPU machine, this can parallelize the 
+// Since we are on a cluster or a multi-CPU machine, this can parallelize the
 // variant call processes and push down the variant call wall clock time significanlty.
 
 process CreateIntervalBeds {
@@ -283,7 +269,7 @@ bamsTumorNormalIntervals = bamsAll.spread(bedIntervals)
 // MuTect2, FreeBayes
 ( bamsFMT2, bamsFFB) = bamsTumorNormalIntervals.into(3)
 
-// This will give as a list of unfiltered calls for MuTect2. 
+// This will give as a list of unfiltered calls for MuTect2.
 process RunMutect2 {
   tag {idSampleTumor + "_vs_" + idSampleNormal + "-" + intervalBed.baseName}
 
@@ -807,24 +793,6 @@ if (params.verbose) vcfReport = vcfReport.view {
 
 vcfReport.close()
 
-process GetVersionGATK {
-  publishDir directoryMap.version, mode: 'link'
-  output: file("v_*.txt")
-  when: !params.onlyQC
-  script: QC.getVersionGATK()
-}
-
-process GetVersionFreeBayes {
-  publishDir directoryMap.version, mode: 'link'
-  output: file("v_*.txt")
-  when: 'freebayes' in tools && !params.onlyQC
-
-  script:
-  """
-  freebayes --version > v_freebayes.txt
-  """
-}
-
 process GetVersionAlleleCount {
   publishDir directoryMap.version, mode: 'link'
   output: file("v_*.txt")
@@ -848,34 +816,6 @@ process GetVersionASCAT {
   """
 }
 
-process GetVersionStrelka {
-  publishDir directoryMap.version, mode: 'link'
-  output: file("v_*.txt")
-  when: 'strelka' in tools && !params.onlyQC
-  script: QC.getVersionStrelka()
-}
-
-process GetVersionManta {
-  publishDir directoryMap.version, mode: 'link'
-  output: file("v_*.txt")
-  when: 'manta' in tools && !params.onlyQC
-  script: QC.getVersionManta()
-}
-
-process GetVersionBCFtools {
-  publishDir directoryMap.version, mode: 'link'
-  output: file("v_*.txt")
-  when: !params.noReports
-  script: QC.getVersionBCFtools()
-}
-
-process GetVersionVCFtools {
-  publishDir directoryMap.version, mode: 'link'
-  output: file("v_*.txt")
-  when: !params.noReports
-  script: QC.getVersionVCFtools()
-}
-
 /*
 ================================================================================
 = F U N C T I O N S =
From 04e31238de76e400390dc722fa951edc01b9bf45 Mon Sep 17 00:00:00 2001
From: Maxime Garcia
Date: Thu, 6 Sep 2018 13:37:04 +0200
Subject: [PATCH 25/25] update CHANGELOG and CONTRIBUTING guide [skip ci]

---
 .github/CONTRIBUTING.md | 12 +++++++++---
 CHANGELOG.md | 2 ++
 2 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/.github/CONTRIBUTING.md b/.github/CONTRIBUTING.md
index 88b77a940e..46febab2c8 100644
--- a/.github/CONTRIBUTING.md
+++ b/.github/CONTRIBUTING.md
@@ -15,10 +15,16 @@ is as follows:
 1. Check that there isn't already an issue about your idea in the
    [Sarek issues](https://github.com/SciLifeLab/Sarek/issues) to avoid
    duplicating work.
-    * Feel free to add a new issue here for the same reason.
+    * Feel free to add a [new issue here](https://github.com/SciLifeLab/Sarek/issues/new/choose) for the same reason.
 2. Fork the Sarek repository to your GitHub account
-3. Make the necessary changes / additions within your forked repository
-4. Submit a Pull Request against the `dev` branch and wait for the code to be reviewed and merged.
+3. [Configure a remote for your fork](https://help.github.com/articles/configuring-a-remote-for-a-fork/)
+```
+git remote add upstream https://github.com/SciLifeLab/Sarek.git
+```
+
+4. [Sync your fork](https://help.github.com/articles/syncing-a-fork/)
+5. Make the necessary changes / additions within your forked repository
+6. Submit a [Pull Request](https://github.com/SciLifeLab/Sarek/compare) against the `dev` branch and wait for the code to be reviewed and merged.
 
 If you're not used to this workflow with git, you can start with some
 [basic docs from GitHub](https://help.github.com/articles/fork-a-repo/) or even their [excellent interactive tutorial](https://try.github.io/).
diff --git a/CHANGELOG.md b/CHANGELOG.md
index 0ed16e20e0..52b7f981da 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -21,9 +21,11 @@ and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.
 - [#621](https://github.com/SciLifeLab/Sarek/pull/621) - Simplify tests
 - [#627](https://github.com/SciLifeLab/Sarek/pull/627), [#629](https://github.com/SciLifeLab/Sarek/pull/629) - Refactor docs
 - [#629](https://github.com/SciLifeLab/Sarek/pull/629) - Refactor config
+- [#632](https://github.com/SciLifeLab/Sarek/pull/632) - Use 2 threads and 2 cpus FastQC processes
 
 ### `Removed`
 - [#616](https://github.com/SciLifeLab/Sarek/pull/616) - Remove old Issue Template
+- [#629](https://github.com/SciLifeLab/Sarek/pull/629) - Remove old Dockerfiles
 
 ### `Fixed`
 - [#621](https://github.com/SciLifeLab/Sarek/pull/621) - Fix VEP tests