Implement GATK-gCNV tool for CNV detection #415

Closed
egenomics opened this issue Aug 22, 2023 · 5 comments
Labels
bug Something isn't working

Comments

@egenomics

Description of feature

Hi,
It would be awesome if GATK-gCNV, the new tool for CNV detection in WES, targeted panels, and WGS, could be implemented in the pipeline.
The original paper is here: https://www.nature.com/articles/s41588-023-01449-0

@egenomics egenomics added the enhancement Improvement for existing functionality label Aug 22, 2023
@ramprasadn
Collaborator

Hi @egenomics! Correct me if I am wrong, but this looks like a feature we already implemented in release 1.1.0. Relevant PR here #362.

@ramprasadn
Collaborator

Closing this issue for now, but feel free to open it back up if you think this feature is lacking. 😄

@a113n

a113n commented Oct 12, 2024

In release 2.2.0, it seems that the per-sample GATK CNV segment calls are not merged into a family-based VCF before they reach SVDB_MERGE:
`### name: 'NFCORE_RAREDISEASE:RAREDISEASE:CALL_STRUCTURAL_VARIANTS:SVDB_MERGE (fam_1)'

container: 'quay.io/biocontainers/mulled-v2-c8daa8f9d69d3c5a1a4ff08283a166c18edb0000:511069f65a53621c5503e5cfee319aa3c735abfa-0'

###BASH output
drwxr-xr-x 2 allen allen 4096 Oct 12 12:05 ./
drwxr-xr-x 4 allen allen 4096 Oct 5 11:24 ../
-rw-r--r-- 1 allen allen 0 Oct 5 11:24 .command.begin
-rw-r--r-- 1 allen allen 0 Oct 5 11:24 .command.err
-rw-r--r-- 1 allen allen 0 Oct 5 11:24 .command.log
-rw-r--r-- 1 allen allen 0 Oct 5 11:24 .command.out
-rw-r--r-- 1 allen allen 12434 Oct 5 11:24 .command.run
-rw-r--r-- 1 allen allen 602 Oct 12 11:58 .command.sh
-rw-r--r-- 1 allen allen 260 Oct 5 11:24 .command.trace
-rw-r--r-- 1 allen allen 1 Oct 5 11:24 .exitcode
lrwxrwxrwx 1 allen allen 101 Oct 5 11:24 sample1_gatkcnv_segments_refiltered.vcf.gz -> /mnt/data/wgs/work/e7/22f995203d96241c7692dc821f2c85/sample1_gatkcnv_segments_refiltered.vcf.gz
lrwxrwxrwx 1 allen allen 101 Oct 5 11:24 sample2_gatkcnv_segments_refiltered.vcf.gz -> /mnt/data/wgs/work/11/fc0c2445409f85f6a9f6b559685882/sample2_gatkcnv_segments_refiltered.vcf.gz
lrwxrwxrwx 1 allen allen 101 Oct 5 11:24 sample3_gatkcnv_segments_refiltered.vcf.gz -> /mnt/data/wgs/work/10/626e9f99ae2a6f70cc92b3a0d08e02/sample3_gatkcnv_segments_refiltered.vcf.gz
lrwxrwxrwx 1 allen allen 74 Oct 5 11:24 fam_1_cnvnator.vcf.gz -> /mnt/data/wgs/work/7d/6c9f89a9b60391bf328f71f8337819/fam_1_cnvnator.vcf.gz
lrwxrwxrwx 1 allen allen 82 Oct 5 11:24 fam_1_manta.diploid_sv.vcf.gz -> /mnt/data/wgs/work/fe/8bfb6f1fda184eccc08f79c9937cce/fam_1_manta.diploid_sv.vcf.gz
-rw-r--r-- 1 allen allen 0 Oct 12 12:05 fam_1_sv.vcf.gz
lrwxrwxrwx 1 allen allen 72 Oct 5 11:24 fam_1_tiddit.vcf.gz -> /mnt/data/wgs/work/fe/d528ed5cbaf1941ab1bafcb24e92e4/fam_1_tiddit.vcf.gz
-rw-r--r-- 1 allen allen 108 Oct 12 12:05 versions.yml
`
Expected: sample1_gatkcnv_segments_refiltered.vcf.gz, sample2_gatkcnv_segments_refiltered.vcf.gz, and sample3_gatkcnv_segments_refiltered.vcf.gz are merged into fam_1_gcnvcaller.vcf.gz before entering the NFCORE_RAREDISEASE:RAREDISEASE:CALL_STRUCTURAL_VARIANTS:SVDB_MERGE step.
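To make the expectation concrete, here is a rough sketch of what a per-family pre-merge might look like. It only uses the `--merge` and `--vcf` options that SVDB_MERGE itself is called with; the function name `build_gcnv_family_merge` and the uncompressed output name are hypothetical, and the function prints the command rather than running it. Whether a plain `svdb --merge` is the right way to combine per-sample gCNV calls is for the maintainers to judge.

```shell
#!/usr/bin/env bash
# Sketch only: build (but do not run) an svdb command that would merge the
# per-sample gCNV VCFs of one family into a single family-level VCF.
# The function name and the output file name are hypothetical.
build_gcnv_family_merge() {
    local family="$1"
    shift
    # "$*" joins the remaining arguments (the per-sample VCFs) with spaces.
    printf 'svdb --merge --vcf %s > %s_gcnvcaller.vcf\n' "$*" "$family"
}

build_gcnv_family_merge fam_1 \
    sample1_gatkcnv_segments_refiltered.vcf.gz \
    sample2_gatkcnv_segments_refiltered.vcf.gz \
    sample3_gatkcnv_segments_refiltered.vcf.gz
```

The resulting family-level file could then be passed to SVDB_MERGE as a single gcnvcaller input, alongside the tiddit, manta, and cnvnator family VCFs.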

The content of .command.sh may give some insight into why the output fam_1_sv.vcf.gz is empty:
`
#.command.sh
#!/bin/bash -euo pipefail
svdb \
    --merge \
    --pass_only --same_order \
    --priority tiddit,manta,gcnvcaller,cnvnator \
    --vcf fam_1_tiddit.vcf.gz:tiddit fam_1_manta.diploid_sv.vcf.gz:manta sample1_gatkcnv_segments_refiltered.vcf.gz:gcnvcaller sample2_gatkcnv_segments_refiltered.vcf.gz:cnvnator sample3_gatkcnv_segments_refiltered.vcf.gz:null fam_1_cnvnator.vcf.gz:null \
    > fam_1_sv.vcf
bgzip fam_1_sv.vcf

cat <<-END_VERSIONS > versions.yml
"NFCORE_RAREDISEASE:RAREDISEASE:CALL_STRUCTURAL_VARIANTS:SVDB_MERGE":
    svdb: $( echo $(svdb) | head -1 | sed 's/usage: SVDB-\([0-9]\.[0-9]\.[0-9]\).*/\1/' )
    samtools: $(echo $(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*$//')
END_VERSIONS`

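Note that in the --vcf line, the caller labels are shifted: sample2_gatkcnv_segments_refiltered.vcf.gz is labelled cnvnator, while sample3_gatkcnv_segments_refiltered.vcf.gz and fam_1_cnvnator.vcf.gz are both labelled null. This is potentially because there are six input VCF files but only four entries in the priority list, so the labels run out.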
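As a guess at the mechanism (not the pipeline's actual code), the same labels appear if each VCF is simply tagged with the priority entry at the same position, with null as the fallback once the list is exhausted. The helper `tag_by_position` below is hypothetical and only illustrates that assumption:

```shell
#!/usr/bin/env bash
# Hypothetical illustration, not the pipeline's code: tag each VCF with the
# priority entry at the same index, falling back to "null" when the
# priority list has fewer entries than there are files.
tag_by_position() {
    local priority_csv="$1"
    shift
    local -a tags
    IFS=',' read -r -a tags <<< "$priority_csv"
    local i=0 vcf
    for vcf in "$@"; do
        printf '%s:%s\n' "$vcf" "${tags[$i]:-null}"
        i=$((i + 1))
    done
}

# Same priority list and file order as in .command.sh above.
tag_by_position tiddit,manta,gcnvcaller,cnvnator \
    fam_1_tiddit.vcf.gz \
    fam_1_manta.diploid_sv.vcf.gz \
    sample1_gatkcnv_segments_refiltered.vcf.gz \
    sample2_gatkcnv_segments_refiltered.vcf.gz \
    sample3_gatkcnv_segments_refiltered.vcf.gz \
    fam_1_cnvnator.vcf.gz
```

With six files and four tags, the fifth and sixth files come out as null and the second gCNV file picks up the cnvnator tag, which matches the labels seen in .command.sh. If the three gCNV files were merged into one family VCF first, the four files and four tags would line up.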

@a113n

a113n commented Oct 12, 2024

@ramprasadn Should the GATK SV workflow include an SVDB merge step as well, similar to line 39 of the cnvnator workflow (https://github.com/nf-core/raredisease/blob/fa61a657257a14a6433e3d751c577ab4c9d2eda4/subworkflows/local/variant_calling/call_sv_cnvnator.nf)?

@jemten jemten reopened this Oct 16, 2024
@jemten jemten added bug Something isn't working and removed enhancement Improvement for existing functionality labels Oct 16, 2024
@jemten
Collaborator

jemten commented Oct 16, 2024

Tracked in issue #634

@jemten jemten closed this as completed Oct 16, 2024