Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dsl2 refactor #338

Merged
merged 41 commits into from
Feb 12, 2021
Merged

Dsl2 refactor #338

merged 41 commits into from
Feb 12, 2021

Conversation

maxulysse
Copy link
Member

@maxulysse maxulysse commented Jan 28, 2021

PR checklist

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
    • If you've added a new tool - add to the software_versions process and a regex to scrape_software_versions.py
    • If you've added a new tool - have you followed the pipeline conventions in the contribution docs
    • If necessary, also make a PR on the nf-core/sarek branch on the nf-core/test-datasets repository.
  • Make sure your code lints (nf-core lint .).
  • Ensure the test suite passes (nextflow run . -profile test,docker).
  • Usage Documentation in docs/usage.md is updated.
  • Output Documentation in docs/output.md is updated.
  • CHANGELOG.md is updated.
  • README.md is updated (including new tool citations and authors/contributors).

@abhi18av
Copy link
Member

abhi18av commented Feb 8, 2021

Hi @maxulysse 👋

What's the plan to add DeepVariant to the DSL2 version?

This current PR looks pretty extensive, any plans on merging this one to dsl2 branch soon? I can use that one (after the merge) as a foundation.

@maxulysse
Copy link
Member Author

Hi @abhi18av
You got the main idea.
I do need a couple of more stuff to be fully up to date, but it should arrive soon.
Current plan is to merge this PR to dsl2 by the end of the week, that way we will have caught up with the current DSL2 best practices.
And from there, start to work on deepvariant and other issues.

@abhi18av
Copy link
Member

abhi18av commented Feb 8, 2021

Ah, cool 👍

Also, I wanted to ask whether you'd like to include the stub feature as well while we are at it? This should make the dev iterations faster.

@maxulysse
Copy link
Member Author

That's a point I need to check with the @nf-core/core team.
But I do like the idea

@maxulysse maxulysse marked this pull request as ready for review February 11, 2021 12:23
@maxulysse maxulysse requested review from FriederikeHanssen and a team February 11, 2021 12:23
@maxulysse
Copy link
Member Author

OK, so I think current state is good enough for me for now.
and from that we can really start working on finishing the port.
There's still some linting issue to figure out, but that'll be done when tools is updated and ready for DSL2

@maxulysse
Copy link
Member Author

There will also be more modules update, but I really think that we can merge that before, and then go crazy with the modules and (sub)workflows.
I think there's room for improvement in the current sarek workflow (in the workflow folder).
I'll try to put more code in lib/ so that we can have dedicated workflow depending on what we want to do.
-> thinking about regular sarek, sarek with minimal genome, sarek with sentieon, etc...

Copy link
Contributor

@FriederikeHanssen FriederikeHanssen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So beautiful 😍 I just had some minor comments. One thing, I am wondering though is, whether it would make sense to tag the conda versions more specifically as you did in the latest environment.yml: bwa-mem2=2.0=he513fc3_1 and so on

@@ -2,8 +2,6 @@ name: nf-core CI
# This workflow runs the pipeline with the minimal test dataset to check that it completes without any syntax errors
on:
push:
branches:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this now running on all branches? or none?

@@ -21,6 +19,7 @@ jobs:
matrix:
# Nextflow versions: check pipeline minimum
nxf_ver: ['20.11.0-edge']
engine: ['docker']
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any reason to not add singularity and conda? To keep the tests short? I am just thinking that given we pull from 3 different sources now, this could catch incorrect urls and such quickly

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

random spaces accidentally added somewhere in between has caused me a few issues here and there 😄

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Plan is to do that, but not implemented yet

@@ -22,12 +25,28 @@ Sarek can also handle tumour / normal pairs and could include additional relapse

The pipeline is built using [`Nextflow`](https://www.nextflow.io), a workflow tool to run tasks across multiple compute infrastructures in a very portable manner. It comes with `Docker` containers making installation trivial and results highly reproducible.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It comes with 'Docker'... is this still up-to date or should we add the other container engines, since we don't have a monolithic Docker file anymore?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, we might need to update the TEMPLATE

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that was fast 🚒


* Sequencing quality control (`FastQC`)
* Map Reads to Reference (`BWA mem`)
* Mark Duplicates (`GATK MarkDuplicatesSpark`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be MarkDuplicates due to param changes

Suggested change
* Mark Duplicates (`GATK MarkDuplicatesSpark`)
* Mark Duplicates (`GATK MarkDuplicates`)


If you would like to contribute to this pipeline, please see the [contributing guidelines](.github/CONTRIBUTING.md).

For further information or help, don't hesitate to get in touch on the [Slack `#sarek` channel](https://nfcore.slack.com/channels/sarek) (you can join with [this invite](https://nf-co.re/join/slack)), or contact us: [Maxime Garcia](mailto:maxime.garcia@scilifelab.se?subject=[GitHub]%20nf-core/sarek), [Szilvester Juhos](mailto:szilveszter.juhos@scilifelab.se?subject=[GitHub]%20nf-core/sarek)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
For further information or help, don't hesitate to get in touch on the [Slack `#sarek` channel](https://nfcore.slack.com/channels/sarek) (you can join with [this invite](https://nf-co.re/join/slack)), or contact us: [Maxime Garcia](mailto:maxime.garcia@scilifelab.se?subject=[GitHub]%20nf-core/sarek), [Szilvester Juhos](mailto:szilveszter.juhos@scilifelab.se?subject=[GitHub]%20nf-core/sarek)
For further information or help, don't hesitate to get in touch on the [Slack `#sarek` channel](https://nfcore.slack.com/channels/sarek) (you can join with [this invite](https://nf-co.re/join/slack)), or contact us: [Maxime Garcia](mailto:maxime.garcia@scilifelab.se?subject=[GitHub]%20nf-core/sarek), [Szilvester Juhos](mailto:szilveszter.juhos@scilifelab.se?subject=[GitHub]%20nf-core/sarek), [Friederike Hanssen](mailto:friederike.hanssen@qbic.uni-tuebingen.de?subject=[GitHub]%20nf-core/sarek), [Gisela Gabernet](mailto:gisela.gabernet@qbic.uni-tuebingen.de?subject=[GitHub]%20nf-core/sarek)

@@ -15,8 +12,12 @@ process CONCAT_VCF {
publishDir params.outdir, mode: params.publish_dir_mode,
saveAs: { filename -> saveFiles(filename:filename, options:params.options, publish_dir:getSoftwareName(task.process), publish_id:meta.id) }

conda environment
container container
conda (params.enable_conda ? "bioconda::htslib=1.11" : null)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this also have a build number? --hd3b49d5_0 I mean?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, I'll look into it

mode: params.publish_dir_mode,
saveAs: { filename -> saveFiles(filename:filename, options:params.options, publish_dir:getSoftwareName(task.process), publish_id:meta.id) }

conda (params.enable_conda ? "bioconda::bwa=0.7.17 bioconda::samtools=1.10" : null)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not use samtoll 1.11? as in the other steps? Ah I guess, it is from the modules repo and not a local one, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because I did not check if there was a mulled container for it actually...
I just used the modules from nf-core/modules when there was one.
Was not planning to update tools in this PR.
I'm saving that for the future PRs

if (ioptions.publish_by_id) path_list.add(args.publish_id)
if (ioptions.publish_by_id) {
path_list.add(args.publish_id)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the functions file is now within each tool/module folder, is this one still needed?

@maxulysse maxulysse merged commit a51856e into nf-core:dsl2 Feb 12, 2021
@maxulysse maxulysse deleted the dsl2_refactor branch February 12, 2021 09:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants