feat: replace filter alt_allele_in_normal #1289

Merged
merged 28 commits into from
Feb 29, 2024

Conversation

mathiasbio
Collaborator

@mathiasbio mathiasbio commented Oct 19, 2023

Description

This issue describes the problem with this filter: #1254
In short, it is a very strict TNscope filter that sometimes removes somatic variants even when only a single read supports the variant in the normal sample. The goal here is to remove it and replace it with a less strict filter.

Added

  • high_normal_tumor_af_frac filter in bcftools for TNscope WGS T+N and UMI T+N analysis, which allows the AF in the normal to be up to 30% of the AF in the tumor.

Changed

  • [Description]

Fixed

  • [Description]

Removed

  • alt_allele_in_normal set by TNscope for WGS T+N analysis, and UMI T+N analysis

Documentation

  • N/A
  • Updated Balsamic documentation to reflect the changes as needed for this PR.
    • [Document Name]

Tests

Detailed test-results and evaluations here: https://docs.google.com/spreadsheets/d/1_lNTCywOtEikJHXVhndfFzD1ieN21WSmQRDZVxVbAcY/edit#gid=0

Among other things:

  • tumor_af vs normal_af plots comparing alt_allele_in_normal to the replacement filter high_normal_tumor_af_frac showing much more reasonable results for the new filter.
  • sensitivity and precision analysis comparing these filters in TWISTpancancer reference with the following results:

From steadybedbug: In the end filtering type 6 is used as it is more readable than type 5.

| Filtering type | Step1 | Step2 | Threshold | True-pos-baseline | True-pos-call | False-pos | False-neg | Precision | Sensitivity | F-measure |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | alt_allele_in_normal removed | `--e '(FORMAT/AF[1] / FORMAT/AF[0]) >= 0.3' --soft-filter high_normal_frac --mode +` | None | 74 | 75 | 350 | 11 | 0.1765 | 0.8706 | 0.2935 |
| 2 | alt_allele_in_normal removed | `-e sum(FORMAT/AF[0])<sum(FORMAT/AF[1])*4 -s high_normal_af_ratio -m +` | None | 84 | 85 | 359 | 1 | 0.1914 | 0.9882 | 0.3207 |
| 3 | alt_allele_in_normal removed | | None | 85 | 86 | 640 | 0 | 0.1185 | 1 | 0.2118 |
| 4 | standard | | None | 71 | 72 | 330 | 14 | 0.1791 | 0.8353 | 0.295 |
| 5 | alt_allele_in_normal removed | `-e sum(FORMAT/AF[0])<sum(FORMAT/AF[1])*3.33 -s high_normal_af_ratio -m +` | None | 84 | 85 | 359 | 1 | 0.1914 | 0.9882 | 0.3207 |
| 6 | alt_allele_in_normal removed | `filter -e sum(FORMAT/AF[1])/sum(FORMAT/AF[0])>0.3 --soft-filter high_normal_af_ratio -m +` | None | 84 | 85 | 359 | 1 | 0.1914 | 0.9882 | 0.3207 |

  • Changes in number of variants between this PR and 14.0.0, in only quality-filtered VCF, and in the final clinical VCF:

| case | workflow | # variants in vep/research VCF (v14.0.0) | # variants in vep/research VCF (this PR) | # variants in final VCF (v14.0.0) | # variants in final VCF (this PR) |
|---|---|---|---|---|---|
| A 30% TINC | UMI T+N | 302 | 331 | 143 | 198 |
| A 25% TINC | UMI T+N | 302 | 329 | 143 | 196 |
| B 30% TINC | WGS T+N | 9098 | 9330 | 7009 | 7197 |
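
For anyone checking the Precision, Sensitivity and F-measure columns in the filter comparison above, here is a minimal Python sketch. It assumes the usual benchmarking definitions (precision from the call-side true positives, sensitivity from the baseline-side true positives, F-measure as their harmonic mean). The PR does not state which tool produced the numbers, so the function name `benchmark_metrics` and the formulas are my assumptions; they do reproduce the reported values for the rows I checked.

```python
def benchmark_metrics(tp_baseline, tp_call, false_pos, false_neg):
    """Return (precision, sensitivity, f_measure) from benchmark counts.

    Assumed definitions (not confirmed by the PR):
      precision   = TP_call / (TP_call + FP)
      sensitivity = TP_baseline / (TP_baseline + FN)
      f_measure   = harmonic mean of precision and sensitivity
    """
    precision = tp_call / (tp_call + false_pos)
    sensitivity = tp_baseline / (tp_baseline + false_neg)
    f_measure = 2 * precision * sensitivity / (precision + sensitivity)
    return precision, sensitivity, f_measure


# Counts taken from the filtering type 6 row of the comparison table.
precision, sensitivity, f_measure = benchmark_metrics(84, 85, 359, 1)
print(f"{precision:.4f} {sensitivity:.4f} {f_measure:.4f}")
# prints: 0.1914 0.9882 0.3207
```

The sketch does not guard against zero denominators, which would need handling for an empty call set.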

Feature Tests

alt_allele_in_normal removal

Verify that filter alt_allele_in_normal is removed after quality-filtering in TNscope VCFs of tumor + normal analysis of WGS and UMI cases.

  • Verified

Results in UMI TN-case: vcf/*.tnscope_umi.research.vcf.gz

  • header contains bcftools command to remove alt_allele_in_normal
  • variant labelled with alt_allele_in_normal in raw VCF does not have filter in vcf/*.tnscope_umi.research.vcf.gz

Results in WGS TN-case: vcf/*.tnscope.research.vcf.gz

  • header contains bcftools command to remove alt_allele_in_normal
  • variant labelled with alt_allele_in_normal in raw VCF does not have filter in vcf/*.tnscope.research.vcf.gz
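
To illustrate what the removal check above is looking for, here is a small Python sketch of stripping one label from a VCF FILTER value. This is not the pipeline code (the pipeline uses a bcftools command, as noted in the header check); the function name `remove_filter_label`, the label `other_filter`, and the choice to fall back to `PASS` when no label remains are assumptions made for illustration only.

```python
def remove_filter_label(filter_field: str, label: str = "alt_allele_in_normal") -> str:
    """Drop `label` from a semicolon-separated VCF FILTER value.

    Assumption for this sketch: if no label is left, the value becomes "PASS".
    """
    kept = [name for name in filter_field.split(";") if name and name != label]
    return ";".join(kept) if kept else "PASS"


# "other_filter" is a hypothetical label used only for this example.
print(remove_filter_label("alt_allele_in_normal;other_filter"))
print(remove_filter_label("alt_allele_in_normal"))
```

A variant whose only filter was alt_allele_in_normal would then no longer carry that label in the research VCF, which is the behaviour verified in the two cases above.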

new high_normal_tumor_af_frac filter

Verify that new filter high_normal_tumor_af_frac is set correctly in example WGS T+N and UMI T+N cases.

  • Verified

Results in UMI TN-case: vcf/*.tnscope_umi.research.vcf.gz

  • header contains bcftools command to set high_normal_tumor_af_frac
  • variant with N-AF / T-AF smaller than 0.3 does not have filter set
  • variant with N-AF / T-AF greater than 0.3 has filter set

Results in WGS TN-case: vcf/*.tnscope.research.vcf.gz

  • header contains bcftools command to set high_normal_tumor_af_frac
  • variant with N-AF / T-AF smaller than 0.3 does not have filter set
  • variant with N-AF / T-AF greater than 0.3 has filter set
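
The checks above boil down to a single ratio test, sketched here in Python. It assumes, as the bcftools expressions in the filter comparison suggest, that the tumor is the first sample and the normal is the second. The function name `has_high_normal_tumor_af_frac` and the parameter names are hypothetical, and the comparison is written without a division so that a tumor AF of 0 does not raise an error; how bcftools itself treats that case is not covered in this PR.

```python
def has_high_normal_tumor_af_frac(tumor_af: float, normal_af: float,
                                  max_frac: float = 0.3) -> bool:
    """Return True when the filter label would be set.

    Equivalent to normal_af / tumor_af > max_frac for tumor_af > 0,
    rewritten as a multiplication to avoid dividing by zero (assumption).
    """
    return normal_af > max_frac * tumor_af


# N-AF / T-AF = 0.05 / 0.40 = 0.125, below 0.3: filter not set.
print(has_high_normal_tumor_af_frac(tumor_af=0.40, normal_af=0.05))
# N-AF / T-AF = 0.05 / 0.10 = 0.5, above 0.3: filter set.
print(has_high_normal_tumor_af_frac(tumor_af=0.10, normal_af=0.05))
```

Filtering type 5 in the comparison expresses nearly the same cut-off from the other direction (tumor AF below 3.33 times the normal AF, i.e. a ratio above roughly 0.3003), which is why the ratio form of type 6 is easier to read.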

Pipeline Integrity Tests

  • Report delivery (generation of the .hk file)
    • N/A
    • Verified
  • TGA T/O Workflow
    • N/A
    • Verified
  • TGA T/N Workflow
    • N/A
    • Verified
  • UMI T/O Workflow
    • N/A
    • Verified
  • UMI T/N Workflow
    • N/A
    • Verified
  • WGS T/O Workflow
    • N/A
    • Verified
  • WGS T/N Workflow
    • N/A
    • Verified
  • QC Workflow
    • N/A
    • Verified
  • PON Workflow
    • N/A
    • Verified

Clinical Genomics Stockholm

Documentation

  • Atlas documentation
    • N/A
    • Updated: [Link]
  • Web portal for Clinical Genomics
    • N/A
    • Updated: [Link]

User Changes

  • N/A
  • This PR affects the output files or results.

Infrastructure Changes

  • Stored files in Housekeeper
    • N/A
    • Updated: [Link]
  • CG (CLI and delivered/uploaded files)
    • N/A
    • Updated: [Link]
  • Servers (configuration files on Hasta)
    • N/A
    • Updated: [Link]
  • Scout interface
    • N/A
    • Updated: [Link]

Checklist

Important

Ensure that all checkboxes below are ticked before merging.

For Developers

  • PR Description
    • Provided a comprehensive description of the PR.
    • Linked relevant user stories or issues to the PR.
  • Documentation
    • Verified and updated documentation if necessary.
  • Tests
    • Described and tested the functionality addressed in the PR.
    • Ensured integration of the new code with existing workflows.
    • Confirmed that meaningful unit tests were added for the changes introduced.
    • Checked that the PR has successfully passed all relevant code smells and coverage checks.
  • Review
    • Addressed and resolved all the feedback provided during the code review process.
    • Obtained final approval from designated reviewers.

For Reviewers

  • Code
    • Code implements the intended features or fixes the reported issue.
    • Code follows the project's coding standards and style guide.
  • Documentation
    • Pipeline changes are well-documented in the CHANGELOG and relevant documentation.
  • Tests
    • The author provided a description of their manual testing, including consideration of edge cases and boundary
      conditions where applicable, with satisfactory results.
  • Review
    • Confirmed that the developer has addressed all the comments during the code review.

@sonarqubecloud

Kudos, SonarCloud Quality Gate passed!

  • 0 Bugs (rating A)
  • 0 Vulnerabilities (rating A)
  • 0 Security Hotspots (rating A)
  • 0 Code Smells (rating A)
  • No Coverage information
  • 0.0% Duplication

@codecov

codecov bot commented Oct 19, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 99.44%. Comparing base (22ba6c4) to head (8157302).

Additional details and impacted files
@@           Coverage Diff            @@
##           develop    #1289   +/-   ##
========================================
  Coverage    99.44%   99.44%           
========================================
  Files           40       40           
  Lines         1983     1984    +1     
========================================
+ Hits          1972     1973    +1     
  Misses          11       11           
Flag Coverage Δ
unittests 99.44% <100.00%> (+<0.01%) ⬆️


@mathiasbio mathiasbio linked an issue Oct 20, 2023 that may be closed by this pull request
@mathiasbio mathiasbio changed the title feat: replace filter alt_alelle_in_normal feat: replace filter alt_allele_in_normal Oct 27, 2023
@mathiasbio mathiasbio changed the base branch from develop to release_v13.0.0 January 2, 2024 18:26
Base automatically changed from release_v13.0.0 to master January 19, 2024 15:28
@mathiasbio mathiasbio changed the base branch from master to develop January 22, 2024 14:01
@mathiasbio mathiasbio marked this pull request as ready for review February 26, 2024 12:07
@mathiasbio mathiasbio requested a review from a team as a code owner February 26, 2024 12:07
@mathiasbio mathiasbio linked an issue Feb 26, 2024 that may be closed by this pull request
@mathiasbio mathiasbio added this to the Release 15 milestone Feb 26, 2024
Contributor

@ivadym ivadym left a comment


Looks good! 💯

  • The bcftools annotate command is used to exclude the FILTER/alt_allele_in_normal annotation from the VCF
  • Variants meeting the condition AF[normal] / AF[tumor] > 0.3 will be excluded from further analysis as they are less likely to be relevant somatic mutations


Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
No data about Coverage
0.0% Duplication on New Code

See analysis details on SonarCloud

Contributor

@ivadym ivadym left a comment


👍

@mathiasbio mathiasbio merged commit edc672d into develop Feb 29, 2024
8 checks passed
@mathiasbio mathiasbio deleted the replace_alt_allele_in_normal branch February 29, 2024 08:39
mathiasbio added a commit that referenced this pull request Apr 10, 2024
This is a relatively small update to Balsamic with some significant changes to VarDict filters and settings, which influences all TGA workflows and in particular exome-analyses, as well as a new feature with a custom solution for detecting IGH::DUX4 rearrangements in WGS. 

#### Added

* new option for exome samples --exome with modified bcftools filters compared to standard targeted workflow #1414
* custom samtools rule for detection of reads supporting IGH::DUX4 rearrangements #1397
* high_normal_tumor_af_frac filter in bcftools for TNscope WGS T+N analysis, and UMI T+N analysis which allows for 30% of tumor in the normal. #1289

#### Changed

* reduced stringency of targeted non-exome bcftools filters for min MQ #1414
* removed -u flag from VarDict T+N and T only rules #1414

#### Removed

* removed -U flag from VarDict T+N rule to start calling SVs in VarDict (required for FLT3) #1414
* alt_allele_in_normal set by TNscope for WGS T+N analysis, and UMI T+N analysis #1289
Successfully merging this pull request may close these issues.

[User Story] Replace alt_allele_in_normal filter in TNscope
2 participants