Duplicates in fusion-annoFuse.tsv.gz file #558

Open
jharenza opened this issue Mar 9, 2024 · 2 comments
Labels
bug Something isn't working

Comments

@jharenza
Member

jharenza commented Mar 9, 2024

What data file(s) does this issue pertain to?

fusion-annoFuse.tsv.gz

What release are you using?

v13-v15

Put your question or report your issue here.

Per the comment below, some fusions are duplicated in the fusion-annoFuse.tsv.gz OPC release file. The only difference between the duplicate rows seems to be Gene1A_anno. I started using this file to derive putative-oncogenic.tsv as of v13, and it was not used before then, so I am not sure whether this has been happening since the beginning. I am also not sure of the extent (how many fusions are affected) or whether this happens at the patient level or upon merge. I suspect the patient level, since I do not think there is any analysis prior to merge. The arriba file for the patient below has only one entry for this fusion. Can someone investigate the cause of the duplicate gene annotation rows?

@jharenza, not sure what to do about this. In the fusions file, there are repeated entries that differ slightly. For example:
BS_0HW7W7SD NSD3--TRIP12 8:38347497 2:229785855 NA NA NSD3 NA TRIP12 NA CosmicCensus, Oncogene NA NA NA NA ARRIBA 1 FALSE PT_C73C5BBZ [INTERCHROMOSOMAL[chr8--chr2]], translocation Genic in-frame
BS_0HW7W7SD NSD3--TRIP12 8:38347497 2:229785855 NA NA NSD3 NA TRIP12 NA TumorSuppressorGene NA NA NA NA ARRIBA 1 FALSE PT_C73C5BBZ [INTERCHROMOSOMAL[chr8--chr2]], translocation Genic in-frame
This is the exact same call from the same caller, but Gene1A_anno is somehow different. The cbio validation script reports 810 instances of this.

Originally posted by @migbro in https://github.com/d3b-center/bixu-tracker/issues/2248#issuecomment-1985959937
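
To help gauge how many fusions are affected, here is a minimal sketch that groups rows which are identical in every column except Gene1A_anno. It is only an illustration, not the cbio validation script. The column names `Sample`, `FusionName`, and `Caller` and the helper names are assumptions made for the example; only `Gene1A_anno` is named in this issue. The first two example rows are trimmed from the rows quoted above, and the third is a made-up control row.

```python
import csv
import gzip
from collections import defaultdict


def find_annotation_only_duplicates(rows, ignore=("Gene1A_anno",)):
    """Return groups of rows that match on every column except those in `ignore`."""
    groups = defaultdict(list)
    for row in rows:
        # Build a hashable key from all columns that are not ignored.
        key = tuple(sorted((k, v) for k, v in row.items() if k not in ignore))
        groups[key].append(row)
    return [group for group in groups.values() if len(group) > 1]


def read_tsv_gz(path):
    """Yield one dict per row from a gzipped, tab-separated file with a header line."""
    with gzip.open(path, "rt", newline="") as handle:
        yield from csv.DictReader(handle, delimiter="\t")


# Column names other than Gene1A_anno are assumed for illustration.
example_rows = [
    {"Sample": "BS_0HW7W7SD", "FusionName": "NSD3--TRIP12",
     "Caller": "ARRIBA", "Gene1A_anno": "CosmicCensus, Oncogene"},
    {"Sample": "BS_0HW7W7SD", "FusionName": "NSD3--TRIP12",
     "Caller": "ARRIBA", "Gene1A_anno": "TumorSuppressorGene"},
    # Hypothetical control row that has no duplicate.
    {"Sample": "BS_HYPOTHETICAL", "FusionName": "GENEA--GENEB",
     "Caller": "ARRIBA", "Gene1A_anno": "NA"},
]

duplicate_groups = find_annotation_only_duplicates(example_rows)
for group in duplicate_groups:
    annotations = sorted(row["Gene1A_anno"] for row in group)
    print(group[0]["Sample"], group[0]["FusionName"], annotations)
```

On the real file, something like `find_annotation_only_duplicates(read_tsv_gz("fusion-annoFuse.tsv.gz"))` should give a count to compare against the 810 instances reported, assuming the file has a header row and no other column varies between the duplicate rows.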

Duplicate of https://github.com/d3b-center/bixu-tracker/issues/2325

jharenza added the bug label Mar 9, 2024
@jharenza
Member Author

Jira ticket: https://d3b.atlassian.net/browse/BIXU-2325

@jharenza
Member Author

jharenza commented Aug 2, 2024

This is fixed in https://github.com/d3b-center/annoFuseData/tree/v1.0.0. We will need to update the Docker image and rerun the annoFuse annotation for the OPC.
