Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dsl2 first draft #162

Merged
merged 67 commits into from
Mar 23, 2021
Merged

Dsl2 first draft #162

merged 67 commits into from
Mar 23, 2021

Conversation

skrakau
Copy link
Member

@skrakau skrakau commented Mar 9, 2021

OK, here is a first running DSL2 version:

  • ...
  • Compared results to latest DSL1 version
  • Updated docs
  • Updated CHANGELOG
  • Added new nf-core/test-datasets in new format (Add mag test data samplesheets for dsl2 branch test-datasets#226)
  • Replaced local fastp module by nf-core fastp module
  • fix CAT db problem (built Biocontainer containing CAT and suitable Diamond version for latest CAT database)
  • solve S3 nf-amazon plugin problem (will be solved with next nextflow release)
  • make use of containers consistent (when only using tar etc.)
  • run and check full-size test
  • update required nextflow version (nextflowVersion >= 20.10.0 for now, assuming release, although singularity requires >=20.11.0-edge)
  • fix non-reproducible centrifuge error (on CFC and locally), any changed dependencies?

Use of local Bowtie2 modules instead of using existing nf-core modules:

  1. bowtie2_removal_build/align: different behaviour -> mapped reads are discarded for contamination removal
  2. bowtie2_assembly_build/align: since the index need to be build for all assemblies, I needed to hand over a meta map to the index building process as well to avoid complicated channel logic before and after calling the module

Use of local quast modules instead of using nf-core modules:

  1. quast, quast_bins: metaQUAST is used (not QUAST), with different processing for bins

I observed a non-reproducible centrifuge error (which disappears when using -resume) on different systems, which I did not observe with previous pipeline versions. However, since I could not reproduce it recently I will postpone this.

Since there are so many changes, I don't know what else to describe here. Let me know, if I should add something.

PR checklist

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool - add to the software_versions process and a regex to scrape_software_versions.py
  • If you've added a new tool - have you followed the pipeline conventions in the contribution docs
  • If necessary, also make a PR on the nf-core/mag branch on the nf-core/test-datasets repository.
  • Make sure your code lints (nf-core lint .).
  • Ensure the test suite passes (nextflow run . -profile test,docker).
  • Usage Documentation in docs/usage.md is updated.
  • Output Documentation in docs/output.md is updated.
  • CHANGELOG.md is updated.
  • README.md is updated (including new tool citations and authors/contributors).

@skrakau
Copy link
Member Author

skrakau commented Mar 19, 2021

Thanks a lot for reviewing @d4straub ! The optional output seems to work in both cases so far, at least with the current Nextflow versions.

Copy link
Contributor

@KevinMenden KevinMenden left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me!
I've been mainly just looking at the modules and DSL2 stuff. Most of the modules are local so you can basically write them however you want. Probably a good idea though to make them similar to nf-core modules, so it's easier to put them on nf-core/modules later.

Mostly I found that a process label is missing for I think basically all local modules, and sometimes emit statements are missing.

lib/CustomSchema.groovy Outdated Show resolved Hide resolved
main.nf Outdated Show resolved Hide resolved
modules/local/bowtie2_assembly_align.nf Show resolved Hide resolved
modules/local/bowtie2_assembly_build.nf Show resolved Hide resolved
modules/local/busco.nf Show resolved Hide resolved
modules/local/busco_db_preparation.nf Show resolved Hide resolved
modules/local/krona_db.nf Show resolved Hide resolved
@skrakau
Copy link
Member Author

skrakau commented Mar 22, 2021

Hi @KevinMenden ,

I added the missing emit statements (apart from one exception where it doesn't make sense, see above).

Regarding the labels, we didn't use them for mag because it doesn't make sense to squeeze all those processes with very different requirements in these few categories. Thus I would also prefer adding the labels to the modules only when porting them to nf-core later.

@github-actions
Copy link

github-actions bot commented Mar 22, 2021

nf-core lint overall result: Passed ✅ ⚠️

Posted for pipeline commit b9f220e

+| ✅ 116 tests passed       |+
#| ❔   4 tests were ignored |#
!| ❗   4 tests had warnings |!
### ❗ Test warnings:

❔ Tests ignored:

  • files_unchanged - File does not exist: .github/workflows/push_dockerhub_dev.yml
  • files_unchanged - File does not exist: .github/workflows/push_dockerhub_release.yml
  • conda_env_yaml - No environment.yml file found - skipping conda_env_yaml test
  • conda_dockerfile - No environment.yml / Dockerfile file found - skipping conda_dockerfile test

✅ Tests passed:

Run details

  • nf-core/tools version 1.13.1
  • Run at 2021-03-23 12:16:05

@skrakau skrakau requested a review from d4straub March 22, 2021 15:30
@skrakau
Copy link
Member Author

skrakau commented Mar 22, 2021

I additionally changed some defaults in nextflow.config (and nextflow_schema.json) to clean up the params which are listed in the summary (-> only params != default).

@KevinMenden
Copy link
Contributor

Okay, but the modules are always the same processes, so requirements should be similar right?
Anyway it's only mandatory when putting them on nf-core/modules :) so no need to change here

@skrakau
Copy link
Member Author

skrakau commented Mar 23, 2021

Thanks a lot for reviewing @KevinMenden !

Regarding labels: it is just that for mag there are for various processes error strategies applied for specific exit codes, different maxRetries applied, exponential increases of mem and time requirements applied when re-running, and in general the requirements where adjusted over time by people using the pipeline, and I don't want to mess with it right now (maybe later) :)

Copy link
Collaborator

@d4straub d4straub left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could't find any issues any more, looks good!

@skrakau skrakau merged commit ecab713 into nf-core:dsl2 Mar 23, 2021
@skrakau
Copy link
Member Author

skrakau commented Mar 23, 2021

Parallelised GitHub CI test to address #170.

This was referenced Mar 23, 2021
@skrakau skrakau deleted the dsl2_first_draft branch May 31, 2021 13:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants