Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Trim3p nextflex #386

Merged
merged 14 commits into from
Sep 7, 2024
Merged

Trim3p nextflex #386

merged 14 commits into from
Sep 7, 2024

Conversation

lpantano
Copy link
Contributor

@lpantano lpantano commented Aug 23, 2024

This PR is trying to add the right 3 end trimming after adapter removal for next flex protocol, where some NT needs to be removed from the end, but after adapter removal.

I couldn't figure out a way to reuse the fastp twice in row because the symlinks input/outputs, so if somebody has an idea, happy to implement, for now I added it as local module to run only for next flex.

I have test data to add, if we all are ok with this approach.

PR checklist

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool - have you followed the pipeline conventions in the contribution docs
  • If necessary, also make a PR on the nf-core/smrnaseq branch on the nf-core/test-datasets repository.
  • Make sure your code lints (nf-core lint).
  • Ensure the test suite passes (nf-test test main.nf.test -profile test,docker).
  • Check for unexpected warnings in debug mode (nextflow run . -profile debug,test,docker --outdir <OUTDIR>).
  • Usage Documentation in docs/usage.md is updated.
  • Output Documentation in docs/output.md is updated.
  • CHANGELOG.md is updated.
  • README.md is updated (including new tool citations and authors/contributors).

@lpantano lpantano changed the base branch from master to dev August 23, 2024 22:01

This comment was marked as outdated.

Copy link

github-actions bot commented Aug 23, 2024

nf-core lint overall result: Passed ✅ ⚠️

Posted for pipeline commit 7a15447

+| ✅ 217 tests passed       |+
#| ❔   1 tests were ignored |#
!| ❗   3 tests had warnings |!

❗ Test warnings:

  • pipeline_todos - TODO string in main.nf: Optionally add in-text citation tools to this list.
  • pipeline_todos - TODO string in main.nf: Optionally add bibliographic entries to this list.
  • pipeline_todos - TODO string in main.nf: Only uncomment below if logic in toolCitationText/toolBibliographyText has been filled!

❔ Tests ignored:

  • nextflow_config - Config default ignored: params.fastp_known_mirna_adapters

✅ Tests passed:

Run details

  • nf-core/tools version 2.14.1
  • Run at 2024-09-06 21:19:30

@lpantano lpantano marked this pull request as ready for review August 26, 2024 20:25
@apeltzer
Copy link
Member

Could you upload the testdata and prepare a testcase for this already? That way we could test more easily if the results match expectations, even when trying to solve this without a local module 👍🏻

@lpantano
Copy link
Contributor Author

sounds good, I will do that ASAP, I checked that sequences were matching reference miRNA, so that is good. I will ping again when is ready.

@apeltzer
Copy link
Member

One idea would be to run the fastp module multiple times and use the stageIn directive to copy instead of symlink. That should be possible via the modules.config without having to worry about the data being changed.

@lpantano
Copy link
Contributor Author

I added the test data, not running in ci right now, but we can add it if we want. With this subsampled test data, the majority of the reads map to reference, if you don't running with next flex then all are isomiRs, because of the lack of trimming. So, I think is working.

Screenshot 2024-08-29 at 12 13 44 PM

conf/modules.config Outdated Show resolved Hide resolved
conf/test_no_genome_nextflex.config Outdated Show resolved Hide resolved
modules/local/trim3p.nf Outdated Show resolved Hide resolved
workflows/smrnaseq.nf Outdated Show resolved Hide resolved
workflows/smrnaseq.nf Outdated Show resolved Hide resolved
@lpantano
Copy link
Contributor Author

lpantano commented Sep 5, 2024 via email

@apeltzer
Copy link
Member

apeltzer commented Sep 5, 2024

I think we can just use your branch and add the fix, we can likely also help with that to try if the strategy works indeed. The main fix will stay untouched - its just not using a local module to achieve the same thing :)

@apeltzer apeltzer reopened this Sep 5, 2024
@apeltzer
Copy link
Member

apeltzer commented Sep 5, 2024

Ran the pipeline like this now:

nextflow run . -profile docker,nextflex --outdir nextflex --input https://raw.githubusercontent.com/nf-core/test-datasets/smrnaseq/samplesheet/v2.0/samplesheet_test_nextflex.csv  --
genome 'GRCh37' --mirtrace_species 'hsa'

@lpantano
Copy link
Contributor Author

lpantano commented Sep 5, 2024

@apeltzer , isn't that command need the -profile docker,nextflex ?

@apeltzer
Copy link
Member

apeltzer commented Sep 5, 2024

Absolutely, added that in

@nschcolnicov
Copy link
Contributor

@lpantano @apeltzer @atrigila Should be good to go now, please review.

Copy link
Contributor

@atrigila atrigila left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@lpantano
Copy link
Contributor Author

lpantano commented Sep 6, 2024

I will run it later (give me a couple of hours) to check results, and make sure we have the right trimming. I am sure it is right, but good to double check, thanks!

@apeltzer
Copy link
Member

apeltzer commented Sep 6, 2024

Sounds good - we could also add a test 👍

@apeltzer
Copy link
Member

apeltzer commented Sep 6, 2024

Ah sorry there is one, let's test if trimming looks good and then merge if ok

@lpantano
Copy link
Contributor Author

lpantano commented Sep 6, 2024

yes, I will mainly do that, need to look a little closely, mainly checking that plot in multiqc looks like that

@lpantano
Copy link
Contributor Author

lpantano commented Sep 6, 2024

When I ran:

nextflow run main.nf -profile docker,test_nextflex --outdir

Mirtop doesn't run anymore. It should be after NFCORE_SMRNASEQ:MIRNA_QUANT:BOWTIE_MAP_MATURE. Are you getting Mirtop running with this?

Copy link
Contributor Author

@lpantano lpantano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line

ch_mirna_gtf = val_mirna_gtf ? Channel.empty() : ( mirna_gtf_from_species ? Channel.fromPath(mirna_gtf_from_species, checkIfExists: true).collect() : Channel.empty() ) //TODO for ch_mirna_gtf, shouldn't it try to build a channel.fromPath with params.mirna_gtf, if true? (instead of setting it to empty). Is this parameter used for non mirgenedb runs?

is making not to setup the ch_mirna_gtf, when the user gives param.mirna_gtf, I don't think that is what we want right? it is not related to this PR, actually affecting everything.

There is a comment actually there, that we should implement:

//TODO for ch_mirna_gtf, shouldn't it try to build a channel.fromPath with params.mirna_gtf,  if true? (instead of setting it to empty). Is this parameter used for non mirgenedb runs?

@lpantano
Copy link
Contributor Author

lpantano commented Sep 6, 2024

Otherwise, the trimming is working. If we fix that TODO it would be great, right now is skipping the miRNA quantification if people set that parameter.

Screenshot 2024-09-06 at 3 00 42 PM

@nschcolnicov
Copy link
Contributor

This line

ch_mirna_gtf = val_mirna_gtf ? Channel.empty() : ( mirna_gtf_from_species ? Channel.fromPath(mirna_gtf_from_species, checkIfExists: true).collect() : Channel.empty() ) //TODO for ch_mirna_gtf, shouldn't it try to build a channel.fromPath with params.mirna_gtf, if true? (instead of setting it to empty). Is this parameter used for non mirgenedb runs?

is making not to setup the ch_mirna_gtf, when the user gives param.mirna_gtf, I don't think that is what we want right? it is not related to this PR, actually affecting everything.

There is a comment actually there, that we should implement:

//TODO for ch_mirna_gtf, shouldn't it try to build a channel.fromPath with params.mirna_gtf,  if true? (instead of setting it to empty). Is this parameter used for non mirgenedb runs?

Thanks for looking into this, I agree, it doesnt look right

@atrigila
Copy link
Contributor

atrigila commented Sep 6, 2024

@lpantano I think the change introduced in the last commit fixed this issue, as I can now see that mirtop run. Would you mind checking from your side? I can then focus on updating the CI tests.

image

@lpantano
Copy link
Contributor Author

lpantano commented Sep 6, 2024

yes, that is true. That change makes mirtop to run.

@lpantano
Copy link
Contributor Author

lpantano commented Sep 6, 2024

please, somebody go a merge this!!!!!!!!!!!!!!!!! what a way to end the week :)

@apeltzer apeltzer merged commit f7cd5ba into dev Sep 7, 2024
6 checks passed
@apeltzer apeltzer deleted the trim3p_nextflex branch September 20, 2024 08:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants