Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding support for fastq.gz.spring-files as input #1534

Merged
merged 23 commits into from
Jun 19, 2024

Conversation

asp8200
Copy link
Contributor

@asp8200 asp8200 commented May 21, 2024

In this PR, I'm trying to add support for fastq.gz.spring-files as input. (This PR replaces the former #1522 )

As agreed with @maxulysse, I've extended the sample-sheet schema with spring_1 and spring_2.

The spring-input can be one file containing both R1 and R2 (goes in spring_1 while spring_2 is left undefined or empty) or two files - one with R1 (goes in spring_1). and one with R2 (goes in spring_2).

I added the pytest alignment_from_everything which tests that the pipeline is able to start from a "samplesheet" containing fastq.gz-files, fastq.gz.spring-files and bam-files (all in the same csv). I didn't include cram-files in that csv, but that can be also done if desired.

I think this new pytest would also have caught this bug.

TO-DO:
The subway-map needs to include an icon for the spring-format as input. However, the task of updating the subway-map has been relegated to this issue.

PR checklist

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool - have you followed the pipeline conventions in the contribution docs
  • If necessary, also make a PR on the nf-core/sarek branch on the nf-core/test-datasets repository.
  • Make sure your code lints (nf-core lint).
  • Ensure the test suite passes (nf-test test tests/ --verbose --profile +docker).
  • Check for unexpected warnings in debug mode (nextflow run . -profile debug,test,docker --outdir <OUTDIR>).
  • Usage Documentation in docs/usage.md is updated.
  • Output Documentation in docs/output.md is updated.
  • CHANGELOG.md is updated.
  • README.md is updated (including new tool citations and authors/contributors).

Copy link

github-actions bot commented May 21, 2024

nf-core lint overall result: Passed ✅ ⚠️

Posted for pipeline commit 67e3f02

+| ✅ 200 tests passed       |+
#| ❔  12 tests were ignored |#
!| ❗   3 tests had warnings |!

❗ Test warnings:

  • pipeline_todos - TODO string in main.nf: Optionally add in-text citation tools to this list.
  • pipeline_todos - TODO string in main.nf: Optionally add bibliographic entries to this list.
  • pipeline_todos - TODO string in main.nf: Only uncomment below if logic in toolCitationText/toolBibliographyText has been filled!

❔ Tests ignored:

  • files_exist - File is ignored: .github/workflows/awsfulltest.yml
  • files_exist - File is ignored: .github/workflows/awstest.yml
  • files_exist - File is ignored: conf/modules.config
  • files_unchanged - File ignored due to lint config: .github/PULL_REQUEST_TEMPLATE.md
  • files_unchanged - File ignored due to lint config: assets/nf-core-sarek_logo_light.png
  • files_unchanged - File ignored due to lint config: docs/images/nf-core-sarek_logo_light.png
  • files_unchanged - File ignored due to lint config: docs/images/nf-core-sarek_logo_dark.png
  • files_unchanged - File ignored due to lint config: .gitignore or .prettierignore
  • actions_ci - actions_ci
  • actions_awstest - 'awstest.yml' workflow not found: /home/runner/work/sarek/sarek/.github/workflows/awstest.yml
  • template_strings - template_strings
  • modules_config - modules_config

✅ Tests passed:

Run details

  • nf-core/tools version 2.14.1
  • Run at 2024-06-18 13:20:11

workflows/sarek/main.nf Outdated Show resolved Hide resolved
@asp8200
Copy link
Contributor Author

asp8200 commented May 22, 2024

Something wrong here :-/

The test

nextflow run main.nf -profile test,alignment_from_fastq_and_spring,docker --outdir results -resume --save_mapped --save_output_as_bam

results in bam and bai in different subfolders:

results/preprocessing/mapped/test/test-test_L1.sorted.bam.bai
results/preprocessing/mapped/test-test_L1/test-test_L1.sorted.bam

etc.

The bug must have something to do with meta.id being set to ${meta.sample}-${meta.lane}.

EDIT: It also seems to be a problem on the dev-branch and on the master-branch 😬 🤨 When I run some very similar test of the dev-branch or master-branch:

nextflow run main.nf -profile test,alignment_to_fastq,docker --outdir results -resume --save_mapped --save_output_as_bam

I get:

results/preprocessing/mapped/test/test-1.sorted.bam.bai
results/preprocessing/mapped/test-1/test-1.sorted.bam

@asp8200 asp8200 mentioned this pull request May 23, 2024
11 tasks
@asp8200
Copy link
Contributor Author

asp8200 commented May 24, 2024

Something wrong here :-/

The test

nextflow run main.nf -profile test,alignment_from_fastq_and_spring,docker --outdir results -resume --save_mapped --save_output_as_bam

results in bam and bai in different subfolders:

results/preprocessing/mapped/test/test-test_L1.sorted.bam.bai
results/preprocessing/mapped/test-test_L1/test-test_L1.sorted.bam

etc.

The bug must have something to do with meta.id being set to ${meta.sample}-${meta.lane}.

EDIT: It also seems to be a problem on the dev-branch and on the master-branch 😬 🤨 When I run some very similar test of the dev-branch or master-branch:

nextflow run main.nf -profile test,alignment_to_fastq,docker --outdir results -resume --save_mapped --save_output_as_bam

I get:

results/preprocessing/mapped/test/test-1.sorted.bam.bai
results/preprocessing/mapped/test-1/test-1.sorted.bam

I expect that problem to be solved by #1541

@maxulysse maxulysse marked this pull request as ready for review May 27, 2024 11:19
nextflow.config Outdated Show resolved Hide resolved
@FriederikeHanssen
Copy link
Contributor

I think publishing should be disabled by default. It bloats the output directory massively and likely folks don't need the converted fastq files in most cases

@asp8200
Copy link
Contributor Author

asp8200 commented May 30, 2024

I think publishing should be disabled by default. It bloats the output directory massively and likely folks don't need the converted fastq files in most cases

Ok. I'll try to disable that.

Comment on lines +161 to +162
one_fastq_gz_spring: it[0].data_type == "one_fastq_gz_spring"
two_fastq_gz_spring: it[0].data_type == "two_fastq_gz_spring"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are the spring fles split, but not the normal fastq files?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question. Again, the variables could perhaps have been named more appropriately. Anyways, here is my attempt at explaining what is going on:

"one_fastq_gz_spring" is when you have just one spring-file containing both R1 and R2. (In that case, the spring should be listed under spring_1 in the samplesheet.)

"two_fastq_gz_spring" is when you have two spring-files containing R1 and R2, respectively. (In that case, the R1-spring and R2-spring should be listed under spring_1 and spring_2, respectively.)

For fastq.gz-input there is always two files.

Copy link
Contributor

@FriederikeHanssen FriederikeHanssen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we need an update to the metromap, overview map and readme docs

@asp8200 asp8200 changed the title WIP: Adding support for fastq.gz.spring-files as input Adding support for fastq.gz.spring-files as input Jun 10, 2024
docs/usage.md Outdated Show resolved Hide resolved
workflows/sarek/main.nf Outdated Show resolved Hide resolved
Copy link
Member

@maxulysse maxulysse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor comments, but it is looking good to me

@maxulysse
Copy link
Member

@FriederikeHanssen what do you mean by overview map?

@maxulysse
Copy link
Member

subway map will be done on a separate issue as I'll do it rather than @asp8200

@asp8200 asp8200 merged commit 7684a6d into nf-core:dev Jun 19, 2024
22 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants