Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ASSESS_SIGNIFICANCE fails with Rscript error fix #1400

Merged
merged 17 commits into from
Feb 29, 2024

Conversation

nschcolnicov
Copy link
Contributor

@nschcolnicov nschcolnicov commented Feb 9, 2024

-Description of change:

  1. These changes are to adress the bug reported here: ASSESS_SIGNIFICANCE fails with Rscript error #1239:
    The assess_significance.R script is currently being called on cnv and cnv normal files instead of just the cnv and ratio file. And the assess_significance.R script fails in line 13:
    ratio$Ratio[which(ratio$Ratio==-1)]=NA Because the ratio dataframe doesn't contain any columns labeled "Ratio"
  2. Also the makegraph script fails, so it is needed to update to the makegraph2 script.
  3. I had originally modified the subworkflow so that the assess_significance module would run on both the normal and regular files, but I saw your comment @FriederikeHanssen on the issue that it should only be run on the non normal ones, so I modified it. Let me know if it should be run on both instead
  4. I tested it with this command and worked ok: nextflow run nschcolnicov/sarek --tools strelka,controlfreec --input 'https://raw.githubusercontent.com/nf-core/test-datasets/sarek/testdata/csv/HCC1395_WXS_somatic_full_test.csv' --outdir . -r dev -latest -resume -profile cluster,bi

PR checklist

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool - have you followed the pipeline conventions in the contribution docs
  • If necessary, also make a PR on the nf-core/sarek branch on the nf-core/test-datasets repository.
  • Make sure your code lints (nf-core lint).
  • Ensure the test suite passes (nextflow run . -profile test,docker --outdir <OUTDIR>).
  • Usage Documentation in docs/usage.md is updated.
  • Output Documentation in docs/output.md is updated.
  • CHANGELOG.md is updated.
  • README.md is updated (including new tool citations and authors/contributors).

@maxulysse
Copy link
Member

Can you remove the makegraph module since this new module is replacing it

Copy link

github-actions bot commented Feb 12, 2024

nf-core lint overall result: Passed ✅ ⚠️

Posted for pipeline commit 8514c54

+| ✅ 188 tests passed       |+
#| ❔   9 tests were ignored |#
!| ❗   2 tests had warnings |!

❗ Test warnings:

  • files_exist - File not found: .github/workflows/awstest.yml
  • pipeline_todos - TODO string in WorkflowSarek.groovy: Optionally add in-text citation tools to this list.

❔ Tests ignored:

  • files_exist - File is ignored: .github/workflows/awsfulltest.yml
  • files_exist - File is ignored: conf/modules.config
  • files_unchanged - File ignored due to lint config: assets/nf-core-sarek_logo_light.png
  • files_unchanged - File ignored due to lint config: docs/images/nf-core-sarek_logo_light.png
  • files_unchanged - File ignored due to lint config: docs/images/nf-core-sarek_logo_dark.png
  • files_unchanged - File ignored due to lint config: lib/NfcoreTemplate.groovy
  • actions_ci - actions_ci
  • actions_awstest - 'awstest.yml' workflow not found: /home/runner/work/sarek/sarek/.github/workflows/awstest.yml
  • template_strings - template_strings

✅ Tests passed:

Run details

  • nf-core/tools version 2.12.1
  • Run at 2024-02-16 12:53:32

@nschcolnicov
Copy link
Contributor Author

nschcolnicov commented Feb 14, 2024

Can you remove the makegraph module since this new module is replacing it

@maxulysse I left it there because I only replaced the one on the bam_variant_calling_somatic_controlfreec subworkflow, but the bam_variant_calling_tumor_only_controlfreec subworkflow is still the other version of makegraph. I didn't modified this one, since It didn't cause me any issues

@nschcolnicov
Copy link
Contributor Author

@FriederikeHanssen @maxulysse I updated the makegraph module on the tumor only subworkflow too, as discussed. I tested it by running:

nextflow run nschcolnicov/sarek -r dev -latest  --tools strelka,controlfreec --input ../samplesheet_test.csv --outdir .  -resume -profile cluster,bi

The samplesheet I used contained this:

patient,sex,status,sample,lane,fastq_1,fastq_2
HCC1395,XX,1,HCC1395T,1,s3://ngi-igenomes/test-data/sarek/SRR7890918_WES_HCC1395-EA_tumor_1.fastq.gz,s3://ngi-igenomes/test-data/sarek/SRR7890918_WES_HCC1395-EA_tumor_2.fastq.gz

image

maxulysse
maxulysse previously approved these changes Feb 16, 2024
Copy link
Member

@maxulysse maxulysse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, what do you think @FriederikeHanssen ?

@asp8200
Copy link
Contributor

asp8200 commented Feb 22, 2024

Do we now have a CI-test which covers the bug?

@nschcolnicov
Copy link
Contributor Author

nschcolnicov commented Feb 22, 2024

I created a branch to test the changes suggested, and ran it with:

nextflow run nschcolnicov/sarek --tools strelka,controlfreec --input 'https://raw.githubusercontent.com/nf-core/test-datasets/sarek/testdata/csv/HCC1395_WXS_somatic_full_test.csv' --outdir . -r tmp -latest -resume -profile cluster,bi

No visible changes in the execution of the pipeline, all ran OK:
image

@asp8200 Changes have already been added to this PR

@asp8200
Copy link
Contributor

asp8200 commented Feb 22, 2024

I created a branch to test the changes suggested, and ran it with:

nextflow run nschcolnicov/sarek --tools strelka,controlfreec --input 'https://raw.githubusercontent.com/nf-core/test-datasets/sarek/testdata/csv/HCC1395_WXS_somatic_full_test.csv' --outdir . -r tmp -latest -resume -profile cluster,bi

No visible changes in the execution of the pipeline, all ran OK: image

@asp8200 Changes have already been added to this PR

Seems like every job except the last two (multiqc and dumpversions) were cached on your rerun, so the adjusted code did not get executed?

@nschcolnicov
Copy link
Contributor Author

I created a branch to test the changes suggested, and ran it with:

nextflow run nschcolnicov/sarek --tools strelka,controlfreec --input 'https://raw.githubusercontent.com/nf-core/test-datasets/sarek/testdata/csv/HCC1395_WXS_somatic_full_test.csv' --outdir . -r tmp -latest -resume -profile cluster,bi

No visible changes in the execution of the pipeline, all ran OK: image
@asp8200 Changes have already been added to this PR

Seems like every job except the last two (multiqc and dumpversions) were cached on your rerun, so the adjusted code did not get executed?

@asp8200
That is because nothing really changed in the work directory of the module, or the content of the input and outoput channels, it is the best way of knowing that the change had no impact, in my opinion.

If you want to verify this, you can just cancel your pipeline execution, and re run it using the "-latest" argument (so the new dev version gets pulled) and the "-resume" and you should see it use cache for all the modules it had previously ran.
And if you are still not convinced, you can revert the changes in the bam_variant_calling_somatic_controlfreec subworkflow in your local version to the ones currently displayed in the dev branch, and you will see that it doesn't cache, and the error triggers.

CHANGELOG.md Outdated Show resolved Hide resolved
@nschcolnicov
Copy link
Contributor Author

All the comments were addressed, I added a bigger samplesheet for testing these changes, because I wanted to see how it would handle two pairs of control/tumor samples.

This is how the samplesheet I used looks like (its the same pair of fastqs with a different name):

patient,sex,status,sample,lane,fastq_1,fastq_2
HCC1395,XX,0,HCC1395N,1,s3://ngi-igenomes/test-data/sarek/SRR7890919_WES_HCC1395BL-EA_normal_1.fastq.gz,s3://ngi-igenomes/test-data/sarek/SRR7890919_WES_HCC1395BL-EA_normal_2.fastq.gz
HCC1395,XX,1,HCC1395T,1,s3://ngi-igenomes/test-data/sarek/SRR7890918_WES_HCC1395-EA_tumor_1.fastq.gz,s3://ngi-igenomes/test-data/sarek/SRR7890918_WES_HCC1395-EA_tumor_2.fastq.gz
test,XX,0,testN,1,s3://ngi-igenomes/test-data/sarek/SRR7890919_WES_HCC1395BL-EA_normal_1.fastq.gz,s3://ngi-igenomes/test-data/sarek/SRR7890919_WES_HCC1395BL-EA_normal_2.fastq.gz
test,XX,1,testT,1,s3://ngi-igenomes/test-data/sarek/SRR7890918_WES_HCC1395-EA_tumor_1.fastq.gz,s3://ngi-igenomes/test-data/sarek/SRR7890918_WES_HCC1395-EA_tumor_2.fastq.gz

I ran it with this command:

nextflow run nschcolnicov/sarek --tools strelka,controlfreec --input samplesheet.csv --outdir . -r tmp -resume -profile cluster,bi -latest

Everything ran OK:
image

@FriederikeHanssen FriederikeHanssen self-requested a review February 23, 2024 21:02
Copy link
Contributor

@FriederikeHanssen FriederikeHanssen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work! Thank you 🚀

nschcolnicov and others added 3 commits February 23, 2024 18:04
…in.nf

Co-authored-by: Friederike Hanssen <Friederike.hanssen@qbic.uni-tuebingen.de>
…in.nf

Co-authored-by: Friederike Hanssen <Friederike.hanssen@qbic.uni-tuebingen.de>
@FriederikeHanssen
Copy link
Contributor

@maxulysse you are already investigating the serializer issue and we can merge with failing tests ?

@maxulysse maxulysse merged commit 03c04ac into nf-core:dev Feb 29, 2024
82 of 85 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants