Invert qvalue calculation #2327

Merged: 23 commits, Jan 11, 2024

trishorts (Contributor) commented Dec 26, 2023

We have never used the correct calculation for q-value, which is (decoy count + 1)/(target count), because with that formula the highest-scoring PSM would have q-value = 0.5. Bill Noble alerted me to an alternate strategy for computing q-value that eliminates this problem and allows us to use the correct formula. The q-value is computed from the lowest-scoring PSM to the highest-scoring PSM, subject to the following rule: whenever the current q-value is greater than the previous q-value, we keep the previous q-value. Now the highest-scoring PSM will have a q-value equal to the q-value of the highest-scoring decoy PSM. It will never be zero. C'est la vie.
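As a rough illustration of the sweep described above, here is a minimal Python sketch. It is not the code in FdrAnalysisEngine.cs; the function name `monotone_q_values`, the input layout (a list of decoy flags sorted best score first), and the clamping of the target count to 1 are all assumptions made for the example.

```python
from typing import List


def monotone_q_values(is_decoy: List[bool]) -> List[float]:
    """Return one q-value per PSM; input is sorted best score first.

    Hypothetical helper for illustration only.
    """
    raw = []
    targets = 0
    decoys = 0
    for decoy in is_decoy:
        if decoy:
            decoys += 1
        else:
            targets += 1
        # (decoy count + 1) / (target count), counted at or above this score.
        # Assumption: clamp the denominator to 1 so a leading decoy
        # does not cause a division by zero.
        raw.append((decoys + 1) / max(targets, 1))

    # Sweep from the lowest-scoring PSM to the highest-scoring PSM.
    # Whenever the current q-value exceeds the previous one (the PSM
    # just below it in score), keep the previous q-value.
    q = list(raw)
    for i in range(len(q) - 2, -1, -1):
        if q[i] > q[i + 1]:
            q[i] = q[i + 1]
    return q


# Toy ranking: False = target, True = decoy, best score first.
flags = [False, False, False, True, False, False, True, False]
q_values = monotone_q_values(flags)
```

In this toy ranking the raw values for the top three targets are 1, 1/2, and 1/3; after the sweep all three share 1/3, so the list is non-decreasing from best to worst score and the top PSM gets a small but non-zero q-value.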

Overall, we expect to see a higher yield of both peptides and PSMs with q-value < 0.01.

An unfortunate byproduct of this update is that some unit tests with a low PSM count no longer have enough PSMs with a low q-value. In certain instances, this prevents PEP from being calculated. This problem was solved by adding another mzML file that boosts the PSM count to a level high enough for all unit tests to complete successfully while still testing what they were intended to test. Please note that some test assertions had to be changed to accommodate the new results.

codecov bot commented Dec 26, 2023

Codecov Report

Attention: 1 lines in your changes are missing coverage. Please review.

Comparison is base (5aafd10) 92.69% compared to head (0f2f031) 92.67%.

@@            Coverage Diff             @@
##           master    #2327      +/-   ##
==========================================
- Coverage   92.69%   92.67%   -0.02%     
==========================================
  Files         136      136              
  Lines       21280    21321      +41     
  Branches     2926     2930       +4     
==========================================
+ Hits        19726    19760      +34     
- Misses       1072     1081       +9     
+ Partials      482      480       -2     
Files Coverage Δ
...pheus/EngineLayer/FdrAnalysis/FdrAnalysisEngine.cs 95.51% <98.64%> (+1.01%) ⬆️

... and 7 files with indirect coverage changes

nbollis (Contributor) left a comment

How do Target/Decoy curves change based upon this implementation?

trishorts merged commit 20da6fc into smith-chem-wisc:master on Jan 11, 2024
3 checks passed