Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clean -fw from sample names in MultiQC #375

Merged
merged 3 commits into from
Feb 11, 2020

Conversation

ewels
Copy link
Member

@ewels ewels commented Feb 4, 2020

The new gzipping / rRNA code with --removeRiboRNA in the pipeline now appends -fw and -rv to the FastQ file names:

rnaseq/main.nf

Lines 987 to 988 in 3b6df9b

gzip < non-rRNA-reads-fw.fq > ${name}-fw.fq.gz
gzip < non-rRNA-reads-rv.fq > ${name}-rv.fq.gz

These are not currently trimmed off by MultiQC and result in the sample rows being split:

Screen-Shot-2020-02-04-at-9 57 43-AM

This PR adds -fw to the MultiQC config, so that the sample names are cleaned and will merge into a single row again.

PR checklist

  • PR is to dev rather than master
  • This comment contains a description of changes (with reason)
  • If you've fixed a bug or added code that should be tested, add tests!
  • If necessary, also make a PR on the nf-core/rnaseq branch on the nf-core/test-datasets repo
  • Ensure the test suite passes (nextflow run . -profile test,docker).
  • Make sure your code lints (nf-core lint .).
  • Documentation in docs is updated
  • CHANGELOG.md is updated
  • README.md is updated

@ewels ewels requested a review from a team February 4, 2020 09:28
@ewels ewels mentioned this pull request Feb 4, 2020
9 tasks
Copy link
Member

@maxulysse maxulysse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good

@kviljoen
Copy link

kviljoen commented Feb 5, 2020

@ewels as discussed on gitter this was for nfcore-rnaseq v 1.4.2 (multiqc v1.7). I have made some changes for implementation on our cluster but mainly to resource settings so should affect the multiqc part (https://github.com/kviljoen/RNAseq)

@ewels
Copy link
Member Author

ewels commented Feb 7, 2020

Hi @kviljoen,

You should not need to fork and edit the pipeline code to change configuration settings. The pipelines (and Nextflow) are designed so that you can overwrite just about everything in your own files. Then the codebase is kept "clean" and the pipeline is more reproducible, plus you can easily update without overwriting your changes.

See https://nf-co.re/usage/configuration for some documentation (recently revamped) and https://nf-co.re/usage/configuration as the place to go to add your config profile (which will then be instantly available to all nf-core pipelines).

Phil

Fixes new lint tests to standardise all command line parameters to use snake_case instead of camelCase
@ewels
Copy link
Member Author

ewels commented Feb 7, 2020

I'm not sure why the GitHub actions lint test is failing here.. @ggabernet - any ideas?

- name: Check PRs
run: |
[[ $(git remote get-url origin) == *nf-core/rnaseq ]] && [[ ${GITHUB_BASE_REF} = "master" ]] && { [[ ${GITHUB_HEAD_REF} = "dev" ]] || [[ ${GITHUB_BASE_REF} = "patch" ]]; }
{ [[ $(git remote get-url origin) == *{{cookiecutter.name}} ]] && [[ ${GITHUB_HEAD_REF} = "dev" ]]; } || [[ ${GITHUB_HEAD_REF} == "patch" ]]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
{ [[ $(git remote get-url origin) == *{{cookiecutter.name}} ]] && [[ ${GITHUB_HEAD_REF} = "dev" ]]; } || [[ ${GITHUB_HEAD_REF} == "patch" ]]
{ [[ $(git remote get-url origin) == *nf-core/rnaseq ]] && [[ ${GITHUB_HEAD_REF} = "dev" ]]; } || [[ ${GITHUB_HEAD_REF} == "patch" ]]

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

facepalm

@apeltzer apeltzer changed the base branch from dev to merge-stuff February 11, 2020 19:08
@apeltzer apeltzer merged commit b4748ad into nf-core:merge-stuff Feb 11, 2020
@kviljoen
Copy link

Hi @ewels I'm still getting sample rows being split by '-fw' in the multiqc report even after inclusion of '-fw' in the multiqc_config.yaml for stripping.

@ewels
Copy link
Member Author

ewels commented Jun 17, 2020

Hi @kviljoen - sorry to hear that. This is definitely with the dev version of the pipeline? I don't think that this change is in a stable release yet..

@ewels ewels deleted the multiqc-clean-fw branch June 17, 2020 07:55
@kviljoen
Copy link

kviljoen commented Jun 17, 2020 via email

@ewels
Copy link
Member Author

ewels commented Jun 17, 2020

Ok, we should make a new issue about the ch_fasta_for_rsem_reference problem 😅

The change you mentioned should be all that is required. You can see that block of the MultiQC config in the pipeline here:

extra_fn_clean_exts:
- '_R1'
- '_R2'
- '.hisat'
- '_subsamp'
- '.sorted'
- '-fw'

It seems to work fine in my testing though 🤔

@kviljoen
Copy link

kviljoen commented Jun 17, 2020 via email

@ewels
Copy link
Member Author

ewels commented Jun 17, 2020

If you're using the development code, you should be using the development container - the dev tag instead of 1.4.2. This tag should be hardcoded in the pipeline though, usually there's not a need to specify this yourself?

The automated tests are passing fine so it must be something specific to the way that you're running the pipeline. Let's discuss on the other issue thread though.

@kviljoen
Copy link

kviljoen commented Jun 17, 2020

Hi Phil, yes, dev container specified for dev branch (I was referring to my master branch where I'm on 1.4.2 and had only changed -fw in multiqc_config.yaml). Attached log file from dev run. Please note that this is a result of merging with my own version of the pipeline because we've had to make some small tweaks particularly with java-related process memory settings in the main.nf to get it to run optimally on our cluster so may be specific to my version of pipeline. Log file attached
testrun.log

@ewels
Copy link
Member Author

ewels commented Jun 17, 2020

Hi @kviljoen,

I can't see any MultiQC config edits when comparing your fork?

Will move discussion of the log over to the other issue to try to keep these two separate 😅

Phil

@kviljoen
Copy link

Hi Phil,

The test I ran was on my master branch not dev, sorry (since dev wasn't working because of the other ch_fasta_for_rsem_reference issue). It's here

Regards,
Katie

@ewels
Copy link
Member Author

ewels commented Jun 17, 2020

Ok, the change looks fine. And you're not supplying a custom MultiQC config or anything? Is it possible to attach a MultiQC report where this is a problem?

@kviljoen
Copy link

Hi Phil, no custom MultiQC config no.
image

@kviljoen
Copy link

Hi @ewels I'm now running the nf-core/dev pipeline but am still having this issue, unfortunately.
image

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants