Automatic restart & General csv file updates #562

Merged on Jun 9, 2022 (32 commits)

Conversation

@FriederikeHanssen (Contributor) commented May 25, 2022

This PR adds an automatic restart when no --input is given and any step other than mapping is specified. In addition, the way the CSV files are written is changed:

  • After variant calling, a CSV is now also written so that the pipeline can be started from annotation.

When multiple variant callers are selected, it looks something like this:

patient,sample,variantcaller,vcf
test3,sample3,strelka,results/variant_calling/sample3/strelka/sample3.variants.vcf.gz
test3,sample4_vs_sample3,manta,results/variant_calling/sample4_vs_sample3/manta/sample4_vs_sample3.diploid_sv.vcf.gz
test3,sample4_vs_sample3,manta,results/variant_calling/sample4_vs_sample3/manta/sample4_vs_sample3.somatic_sv.vcf.gz
test3,sample4_vs_sample3,strelka,results/variant_calling/sample4_vs_sample3/strelka/sample4_vs_sample3.somatic_indels.vcf.gz
test3,sample4_vs_sample3,strelka,results/variant_calling/sample4_vs_sample3/strelka/sample4_vs_sample3.somatic_snvs.vcf.gz

  • The file names within the CSV files are retrieved using file.name. This keeps them correct even when prefixes in modules.config or the modules themselves change (e.g. through custom configs or upstream updates), which would otherwise require adapting several places.
  • CSV files are now only written for the complete cohort
  • CSV files are all stored under outdir/csv/
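
Because every CSV now lives under outdir/csv/, the automatic restart can be pictured as a lookup from the requested step to the CSV written by the preceding stage. The following is a minimal Python sketch of that idea, not the pipeline's actual Nextflow code. The function name `resolve_input` and the CSV file names in `RESTART_CSV` are assumptions made for illustration only.

```python
from pathlib import Path
from typing import Optional

# Hypothetical mapping from the step being restarted to the CSV written by the
# preceding stage. The real file names used by the pipeline may differ.
RESTART_CSV = {
    "variant_calling": "recalibrated.csv",
    "annotate": "variantcalled.csv",
}


def resolve_input(step: str, outdir: str, input_csv: Optional[str] = None) -> Path:
    """Return the sample sheet to use, falling back to outdir/csv/ on restart."""
    if input_csv is not None:
        # An explicitly provided --input always takes precedence.
        return Path(input_csv)
    if step == "mapping":
        # Mapping is the first step, so there is no earlier CSV to restart from.
        raise ValueError("--input is required when starting from mapping")
    # No --input and a later step: pick up the CSV from a previous run.
    return Path(outdir) / "csv" / RESTART_CSV[step]
```

For example, `resolve_input("annotate", "results")` would point at `results/csv/variantcalled.csv` under these assumed names, while passing an explicit `input_csv` bypasses the lookup entirely.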

Other changes:

  • The type field is no longer needed. The file names for annotation are retrieved from the VCF directly rather than custom built. This also ensures that annotated and called VCF files have a similar file name structure, which eases postprocessing when searching for files.
  • Remove unused parameter target_bed and adapt docs for intervals
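
To illustrate CSV rows that carry no type field and take the file name from the VCF itself, here is a small Python sketch. It is an illustration under assumptions, not the pipeline's code: the helper names `variantcalled_rows` and `to_csv` are hypothetical, and the output layout (outdir/variant_calling/sample/caller/file) is inferred from the example CSV in this description.

```python
import csv
import io
from pathlib import PurePosixPath

FIELDS = ["patient", "sample", "variantcaller", "vcf"]  # note: no "type" column


def variantcalled_rows(records, outdir="results"):
    """Yield one CSV row per (patient, sample, variantcaller, vcf_path) record."""
    for patient, sample, caller, vcf in records:
        # Use the name of the file as it was actually emitted, instead of
        # rebuilding it from a prefix convention that might change.
        name = PurePosixPath(vcf).name
        yield {
            "patient": patient,
            "sample": sample,
            "variantcaller": caller,
            "vcf": f"{outdir}/variant_calling/{sample}/{caller}/{name}",
        }


def to_csv(rows):
    """Serialise the rows for the complete cohort into a single CSV string."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```

Since the row only copies the emitted file name, a changed prefix in a custom config would flow into the CSV without any further adaptation in this sketch.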

PR checklist

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
    • If you've added a new tool - add to the software_versions process and a regex to scrape_software_versions.py
    • If you've added a new tool - have you followed the pipeline conventions in the contribution docs
    • If necessary, also make a PR on the nf-core/sarek branch on the nf-core/test-datasets repository.
  • Make sure your code lints (nf-core lint .).
  • Ensure the test suite passes (nextflow run . -profile test,docker).
  • Usage Documentation in docs/usage.md is updated.
  • Output Documentation in docs/output.md is updated.
  • CHANGELOG.md is updated.
  • README.md is updated (including new tool citations and authors/contributors).

github-actions bot commented Jun 9, 2022

nf-core lint overall result: Passed ✅ ⚠️

Posted for pipeline commit a84b939

✅ 144 tests passed
❔   4 tests were ignored
❗   8 tests had warnings

❗ Test warnings:

  • readme - README did not have a Nextflow minimum version badge.
  • pipeline_todos - TODO string in awsfulltest.yml: You can customise AWS full pipeline tests as required
  • pipeline_todos - TODO string in test_full.config: Specify the paths to your full test data ( on nf-core/test-datasets or directly in repositories, e.g. SRA)
  • pipeline_todos - TODO string in test_full.config: Give any required params for the test so that command line flags are not needed
  • pipeline_todos - TODO string in test_full.config: Specify the paths to your full test data ( on nf-core/test-datasets or directly in repositories, e.g. SRA)
  • pipeline_todos - TODO string in test_full.config: Give any required params for the test so that command line flags are not needed
  • schema_description - No description provided in schema for parameter: umi_read_structure
  • schema_description - No description provided in schema for parameter: group_by_umi_strategy

❔ Tests ignored:

  • files_unchanged - File ignored due to lint config: assets/nf-core-sarek_logo_light.png
  • files_unchanged - File ignored due to lint config: docs/images/nf-core-sarek_logo_light.png
  • files_unchanged - File ignored due to lint config: docs/images/nf-core-sarek_logo_dark.png
  • files_unchanged - File ignored due to lint config: lib/NfcoreTemplate.groovy

✅ Tests passed:

Run details

  • nf-core/tools version 2.4.1
  • Run at 2022-06-09 13:36:03

conf/modules.config
@@ -988,7 +988,7 @@ process{
     // VCF QC
     withName: 'BCFTOOLS_STATS'{
         ext.when = { !(params.skip_tools && params.skip_tools.contains('bcftools')) }
-        ext.prefix = { meta.type ? "${meta.variantcaller}_${vcf.baseName.minus(".vcf")}_${meta.type}" : "${meta.variantcaller}_${vcf.baseName.minus(".vcf")}" }
+        ext.prefix = { "${vcf.baseName.minus(".vcf")}" }
Member

then why minus(".vcf") if basename here?

Contributor Author

Here baseName removes .gz, so .vcf is still left.

Using simpleName would also remove anything after a dot in the regular file name.
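
The difference described in this reply can be emulated in a few lines of Python. This is only a sketch of the behaviour as stated above (baseName drops the last extension, simpleName drops everything after the first dot). The function names are illustrative, and `minus` mimics removing the first occurrence of a substring.

```python
def base_name(filename: str) -> str:
    """Drop only the last extension (e.g. the trailing .gz)."""
    return filename.rsplit(".", 1)[0] if "." in filename else filename


def simple_name(filename: str) -> str:
    """Drop everything from the first dot onwards."""
    return filename.split(".", 1)[0]


def minus(text: str, part: str) -> str:
    """Remove the first occurrence of part from text."""
    return text.replace(part, "", 1)


name = "sample4_vs_sample3.somatic_snvs.vcf.gz"
print(base_name(name))                 # sample4_vs_sample3.somatic_snvs.vcf
print(minus(base_name(name), ".vcf"))  # sample4_vs_sample3.somatic_snvs
print(simple_name(name))               # sample4_vs_sample3
```

With the simple-name approach, the somatic_snvs and somatic_indels files of one sample pair would collapse to the same prefix, which is why the extra minus(".vcf") step is kept.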

Contributor Author

will comment this better for future us

@FriederikeHanssen changed the title from "Automatic restart" to "Automatic restart & General csv file updates" on Jun 9, 2022
@FriederikeHanssen marked this pull request as ready for review on June 9, 2022, 11:54
@maxulysse (Member) left a comment

Love it

@FriederikeHanssen (Contributor Author)

Test on actual data ran through

@FriederikeHanssen merged commit 226263b into nf-core:dev on Jun 9, 2022
2 participants