Increased number of SVs in versions after 9.0.1 #1118
I spoke to Jesper about TIDDIT and there were two main conclusions, each with a fairly simple implementation that would probably reduce the number of variants significantly:
Nice find! 🕵️
Fixed with #1120
Is your feature request related to a problem? Please describe.
Not sure if this is a relevant issue or not. But I thought I would bring it up as a discussion.
Context: In a GMS-BT meeting it was noted that for case chiefgull (run with 11.2.0, a re-analysis of masterflea, which was run with 9.0.1) the number of PASS variants in the final SV VCF uploaded to Scout had increased from 197 to 8404.
This raised the question of why the numbers had increased so significantly, and I learned that 8032 of the unique variants in this re-analysis came from TIDDIT, which was added to the WGS flow in version 10.0.0 (https://github.com/Clinical-Genomics/BALSAMIC/pull/947).
To see whether this was just an outlier, I checked a few other cases from before and after the addition of TIDDIT. Below is a table summarising the number of variants in the final SV VCF with filter PASS (column 1) and PASS + TIDDIT (column 2), for a few cases in versions 9.0.1, 10.0.5 and 11.2.0 (the current latest version).
In summary, TIDDIT seems to add a large number of SVs in many cases.
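For anyone who wants to repeat this kind of count on other cases, here is a rough sketch in plain Python. It assumes an uncompressed, tab-separated VCF and that the merged file records the contributing callers in an INFO key, called `set` here. That key name and the function names are assumptions for illustration and should be checked against the header of the actual file.

```python
from collections import Counter


def parse_info(info_field):
    """Turn a VCF INFO string into a dict; flag entries map to True."""
    info = {}
    for item in info_field.split(";"):
        if not item or item == ".":
            continue
        key, sep, value = item.partition("=")
        info[key] = value if sep else True
    return info


def count_pass_variants(vcf_lines, caller="tiddit", caller_key="set"):
    """Count PASS records, and PASS records that mention `caller`.

    `caller_key` is an assumed INFO key listing the callers behind a record.
    """
    counts = Counter()
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        fields = line.rstrip("\n").split("\t")
        if fields[6] != "PASS":  # column 7 of a VCF record is FILTER
            continue
        counts["pass"] += 1
        callers = str(parse_info(fields[7]).get(caller_key, "")).lower()
        if caller in callers:
            counts["pass_from_caller"] += 1
    return counts
```

Calling `count_pass_variants(open("case.vcf"))` on a decompressed file (the file name is a placeholder) would give the two numbers per case.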
In the VCF there is a value per variant indicating how many files the variant was observed in, probably taken from the SVDB merge step. But neither this value nor any other quality-based metric is available for filtering in Scout, so there is no way to reduce the number of variants to a manageable amount for interpretation.
Describe the solution you'd like
Either more filtering of the SV variants before upload to Scout, or more options for manual filtration in Scout, in which case we need to identify good parameters to filter by.
SOMATICSCORE which we're planning to introduce to Scout (#1107) is only available for variants called with Manta, and would not enable us to filter TIDDIT variants.
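As a starting point for the pre-upload option, here is a minimal sketch of a caller-independent filter that uses the "number of files" value mentioned earlier. It assumes that value sits in an integer INFO key, named `FOUNDBY` here, and the threshold of 2 is only a placeholder, not a recommendation. Both the key name and the function names are assumptions to be checked against the real VCF header.

```python
def get_info_value(info_field, key):
    """Return the value of `key` in a VCF INFO string, or None if absent."""
    for item in info_field.split(";"):
        name, sep, value = item.partition("=")
        if sep and name == key:
            return value
    return None


def keep_record(line, min_files=2, count_key="FOUNDBY"):
    """Decide whether a VCF record is supported by at least `min_files` files.

    Records without a usable `count_key` value are kept, so nothing is
    dropped silently because of a missing or malformed annotation.
    """
    fields = line.rstrip("\n").split("\t")
    value = get_info_value(fields[7], count_key)  # column 8 is INFO
    if value is None:
        return True
    try:
        return int(value) >= min_files
    except ValueError:
        return True


def filter_vcf(vcf_lines, min_files=2, count_key="FOUNDBY"):
    """Yield header lines unchanged and only the records that pass."""
    for line in vcf_lines:
        if line.startswith("#") or keep_record(line, min_files, count_key):
            yield line
```

Note that with a threshold of 2, every variant seen by only one caller is removed, including all TIDDIT-only calls, which may well be too aggressive. The point is only that this metric, unlike SOMATICSCORE, exists for variants from every caller.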
Describe alternatives you've considered
Is TIDDIT necessary? Why was it introduced?
Additional context
If possible, add any other context or screenshots about the feature request here.
Expected output for the feature
If possible, an example of expected output
Current BALSAMIC version
balsamic --version
11.2.0