Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Release 0.1.0 #90

Merged
merged 312 commits into from
May 8, 2024
Merged

Release 0.1.0 #90

merged 312 commits into from
May 8, 2024

Conversation

fellen31
Copy link
Collaborator

@fellen31 fellen31 commented Apr 22, 2024

This PR aims to make a pre-release of the pipeline, which is on the whole, functional.

There's still lots of things I would like to fix before a 1.0 release though. Including going through each subworkflow and updating/tidying it. At the same time, it would be good to have a version of the pipeline since some data will be generated with it before 1.0.

So there are a number of things this PR does not concern itself with, that I aim to adress in future releases:

General:

  • Outstanding bugs are not fixed.
  • Citations list is not updated.
  • Config does not use [‘ ‘].join(‘ ‘) everywhere.

Subworkflows:

  • Subworkflows are not always matching naming of config and/or output dir
  • Most sort and index processes that can be merged have not been so.
  • Most sort and concat indexes that can be merged have not been so.
  • The short_variant_calling process naming is bad and confusing, and the above also apply.
  • Inputs and outputs are not explained with // channel: [ val(meta), reads ]`
  • Specifically for the ALIGN_READS subworkflow, single_end will be added to meta when processing samplesheet instead of doing that here.
  • Specifically for the GENOME_ASSEMBLY workfow, the trio-calling is not well explained and hard to understand.

Modules:

  • Local modules have not been added to nf-core.
  • Local modules are not tested.
  • Local modules that use meta.id instead of prefix have not been fixed.
  • Specifically for the DIPCALL module it is awful (a deconstructed make script that could not be run well within a nextflow process), this will most likely be replaced with another software.
  • Specifically for the HIPHASE module I could not get functions working.

PR checklist

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool - have you followed the pipeline conventions in the contribution docs
  • Make sure your code lints (nf-core lint).
  • Ensure the test suite passes (nf-test test main.nf.test -profile test,docker).
  • Check for unexpected warnings in debug mode (nextflow run . -profile debug,test,docker --outdir <OUTDIR>).
  • Usage Documentation in docs/usage.md is updated.
  • Output Documentation in docs/output.md is updated.
  • CHANGELOG.md is updated.
  • README.md is updated (including new tool citations and authors/contributors).

runs-on: ubuntu-latest
strategy:
matrix:
parameters:
- ""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A test with no parameters?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, so to run the -profile test with no extra arguments.

CHANGELOG.md Outdated

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/)
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## v1.0dev - [date]
## v0.1.0 - [date]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update date once ready

}
withLabel:process_high_memory {
memory = { check_max( 200.GB * task.attempt, 'memory' ) }
memory = { check_max( 218.GB * task.attempt, 'memory' ) }
}
withLabel:error_ignore {
errorStrategy = 'ignore'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does nallo processes need that much more than the standard compute? I would usually not meddle with the base config but specify per module in modules.config if some modules need extra.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some need a lot to merge many samples, but I should keep this label 200.

rannick
rannick previously approved these changes Apr 30, 2024
Copy link
Contributor

@rannick rannick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work!

Copy link
Collaborator

@jemten jemten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Massive work @fellen31
As you write in the PR description their are some workflows that would benefit from a refactoring. Also, quite a few of the local modules uses your private dockerhub, let's see if we can shift a few of those in the next release.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be good to rework some of the more deeply nested if statements.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes! General preference between using if/switch statements in workflow vs ext.when?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here I we could branch out into more subworkflows. But otherwise, it's a balance... I kind of prefer the ext.when from an aesthetic point of view, but then I had people complaining that it wasn't clear from the code what was happening 😅


// Index and normalize single sample vcfs
BCFTOOLS_INDEX_SINGLESAMPLE(ch_single_sample_vcf)
ss = ch_single_sample_vcf.join(BCFTOOLS_INDEX_SINGLESAMPLE.out.csi)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

guessing that ss is "single sample" but it would be good to have a more explicit channel name.

// Collect GVCFs
ch_snp_calls_gvcf = ch_snp_calls_gvcf.mix(DEEPVARIANT.out.gvcf)

// TODO: This only works with DeepVariant for now (remove PEPPER_MARGIN_DEEPVARIANT/Deeptrio?)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove?

Co-authored-by: Anders Jemt <jemten@users.noreply.github.com>
* Add review suggestions + full test changes
jemten
jemten previously approved these changes May 3, 2024
Copy link
Collaborator

@jemten jemten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me! We can work on the remaining issues in coming PRs/releases

* Update CODEOWNERS

Change the code-owners to the GitHub team. This way we can more easily change the team and not having to update the CODEOWNERS

* Fixed org
jemten
jemten previously approved these changes May 3, 2024
* Update whatshap stats version to avoid ZeroDivisionError

* Update release date
jemten
jemten previously approved these changes May 8, 2024
Copy link
Collaborator

@jemten jemten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

@fellen31 fellen31 merged commit eda623d into master May 8, 2024
25 of 27 checks passed
fellen31 added a commit that referenced this pull request May 8, 2024
fellen31 added a commit that referenced this pull request May 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants