Kdn adding and diagnostic filtering #2383

Merged

RayMSMS (Contributor) commented Jul 3, 2024

  1. Add the Kdn sugar type to the dictionary.
  2. Add a new function, "DiagnosticIon filter", which uses diagnostic ions to distinguish isobaric glycans.
    If a 274/292 peak is found in the scan, glycans without A (NeuAc) are removed from the candidate list.
  3. Add the corresponding test for the filter function. Isobaric case: H1N1A1 and N2K1.
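
For readers unfamiliar with the isobaric case, the idea above can be sketched in C# as follows. This is not the code from this PR: the class and method names, the composition-string parser, the tolerance, the rule that either diagnostic ion is enough, and the Kdn mass (computed here from the residue formula C9H14O8) are all assumptions made for illustration. The m/z values 274.092 and 292.103 are the commonly cited NeuAc oxonium ions.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DiagnosticFilterSketch
{
    // Monoisotopic residue masses scaled by 1e5 and stored as integers.
    // The Kdn value is an assumption derived from the formula C9H14O8.
    public static readonly Dictionary<char, int> ResidueMass = new Dictionary<char, int>
    {
        { 'H', 16205282 }, // Hexose
        { 'N', 20307937 }, // HexNAc
        { 'A', 29109542 }, // NeuAc
        { 'K', 25006887 }, // Kdn (assumed value)
    };

    // NeuAc oxonium ions: with water loss (274) and intact (292).
    public static readonly double[] NeuAcDiagnosticMz = { 274.092, 292.103 };

    // Sums the scaled masses of a composition string such as "H1N1A1".
    public static int Mass(string composition)
    {
        int total = 0;
        int i = 0;
        while (i < composition.Length)
        {
            char sugar = composition[i++];
            int start = i;
            while (i < composition.Length && char.IsDigit(composition[i])) i++;
            int count = int.Parse(composition.Substring(start, i - start));
            total += ResidueMass[sugar] * count;
        }
        return total;
    }

    public static bool HasNeuAc(string composition)
    {
        return composition.IndexOf('A') >= 0;
    }

    // If either NeuAc diagnostic ion is present in the scan, keep only the
    // candidates that contain NeuAc; otherwise return the list unchanged.
    public static List<string> Filter(
        IEnumerable<string> candidates,
        IReadOnlyCollection<double> peakMz,
        double toleranceDa = 0.01)
    {
        bool neuAcSeen = NeuAcDiagnosticMz.Any(
            target => peakMz.Any(mz => Math.Abs(mz - target) <= toleranceDa));

        return neuAcSeen
            ? candidates.Where(c => HasNeuAc(c)).ToList()
            : candidates.ToList();
    }
}
```

With these assumed values, `Mass("H1N1A1")` and `Mass("N2K1")` are both 65622761 (656.22761 Da), so the precursor mass alone cannot separate the two candidates. Calling `Filter` on both candidates with a peak list that contains 292.103 keeps only H1N1A1.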

RayMSMS added the WIP (work in progress, not ready for review) label on Jul 3, 2024

codecov bot commented Jul 3, 2024

Codecov Report

Attention: Patch coverage is 89.90536% with 32 lines in your changes missing coverage. Please review.

Project coverage is 93.69%. Comparing base (7dab370) to head (48267d9).

| Files | Patch % | Lines |
|---|---|---|
| ...yer/GlycoSearchTask/PostGlycoSearchAnalysisTask.cs | 73.58% | 3 Missing and 11 partials ⚠️ |
| ...pheus/EngineLayer/GlycoSearch/GlycoSearchEngine.cs | 86.79% | 2 Missing and 5 partials ⚠️ |
| ...Morpheus/EngineLayer/GlycoSearch/GlycanDatabase.cs | 89.28% | 1 Missing and 5 partials ⚠️ |
| ...heus/EngineLayer/GlycoSearch/GlycoSpectralMatch.cs | 92.30% | 0 Missing and 2 partials ⚠️ |
| MetaMorpheus/EngineLayer/GlycoSearch/Glycan.cs | 97.56% | 0 Missing and 1 partial ⚠️ |
| ...aMorpheus/EngineLayer/GlycoSearch/GlycoPeptides.cs | 96.66% | 1 Missing ⚠️ |
| ...eus/EngineLayer/ModernSearch/ModernSearchEngine.cs | 85.71% | 1 Missing ⚠️ |

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #2383      +/-   ##
==========================================
+ Coverage   92.97%   93.69%   +0.72%     
==========================================
  Files         139      139              
  Lines       21668    21720      +52     
  Branches     2983     3004      +21     
==========================================
+ Hits        20146    20351     +205     
+ Misses       1043      910     -133     
+ Partials      479      459      -20     

| Files | Coverage Δ |
|---|---|
| MetaMorpheus/EngineLayer/GlycoSearch/AdjNode.cs | 80.00% <ø> (ø) |
| MetaMorpheus/EngineLayer/GlycoSearch/GlycanBox.cs | 96.59% <100.00%> (+14.77%) ⬆️ |
| ...pheus/EngineLayer/GlycoSearch/LocalizationGraph.cs | 100.00% <100.00%> (ø) |
| MetaMorpheus/EngineLayer/GlycoSearch/ModBox.cs | 90.90% <ø> (ø) |
| MetaMorpheus/EngineLayer/GlycoSearch/Node.cs | 88.00% <ø> (+21.33%) ⬆️ |
| MetaMorpheus/EngineLayer/PsmTsv/PsmFromTsv.cs | 97.74% <100.00%> (+0.04%) ⬆️ |
| ...pheus/TaskLayer/GlycoSearchTask/GlycoSearchTask.cs | 97.59% <100.00%> (ø) |
| ...rpheus/TaskLayer/GlycoSearchTask/WriteGlycoFile.cs | 95.18% <ø> (ø) |
| MetaMorpheus/EngineLayer/GlycoSearch/Glycan.cs | 97.04% <97.56%> (+0.25%) ⬆️ |
| ...aMorpheus/EngineLayer/GlycoSearch/GlycoPeptides.cs | 88.39% <96.66%> (+0.35%) ⬆️ |

... and 5 more

... and 3 files with indirect coverage changes

RayMSMS and others added 11 commits July 3, 2024 18:15
1. Add a new test model for "OGlycanCompositionFragments"
1. Add tests for the writing function under different search types
2. Add a glycoBox test for the decoy glycanBox
Improve the coverage
1. Build the new function "DiagonsticFilter" and the corresponding test "IsobaricCase"
2. Fix "SpectralRecoveryTest": the setup function now automatically cleans the folder when it finds extra files in it
1. Delete the bin and set the file to "copy always"
2. Revise "MetaDrawSettingAndViewsTest"; details are in the comments

nbollis (Member) left a comment

Looks good overall. I had a few questions and concerns that I left as comments throughout the PR. I also checked the code coverage and did not find an area where it would be easy to test the additions that you made.

Member

Does this need to be a separate file or should it be added to one of the existing glyco files?

Contributor Author

I added it to the existing glyco files. Glycan_Mods is a default glycan database collection provided for the user.

Member

Does this need to be a separate file or should it be added to one of the existing .gdb files?


//TO DO: Decoy O-glycan can be created, but the results need to be reasoned.
//public static int[] SugarShift = new int[]{ -16205282, -20307937, -29109542, -14605791, -30709033, -15005282, -36513219, -40615874, 16205282, 20307937, 29109542, 14605791, 30709033, 15005282, 36513219, 40615874 };
- private readonly static int[] SugarShift = new int[]
+ private readonly static int[] SugarShift = new int[] //still unclear about the shift...

Member

Are we still unclear about this shift? Can you add a comment about what these numbers are and how they are used?
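
As a possible starting point for such a comment: the magnitudes in the commented-out SugarShift list look like monosaccharide residue masses in daltons multiplied by 100000 and stored as integers, with each value listed once as a negative and once as a positive shift. The sketch below decodes them under that reading. The class name, the helper names, and the sugar labels are assumptions, not taken from the PR, and 15005282 is deliberately left unlabeled. The TO DO comment above suggests the shifts relate to decoy O-glycans, but the PR does not say how they are applied.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class SugarShiftSketch
{
    // Assumed scale factor: masses stored as integers = daltons * 1e5.
    public const int Scale = 100000;

    // Assumed labels for the magnitudes in the commented-out SugarShift list.
    public static readonly Dictionary<int, string> Labels = new Dictionary<int, string>
    {
        { 16205282, "Hex" },
        { 20307937, "HexNAc" },
        { 29109542, "NeuAc" },
        { 14605791, "Fuc" },
        { 30709033, "NeuGc" },
        { 36513219, "Hex + HexNAc" }, // 16205282 + 20307937
        { 40615874, "2 HexNAc" },     // 2 * 20307937
    };

    // Converts a scaled integer mass back to daltons.
    public static double ToDaltons(int scaled)
    {
        return scaled / (double)Scale;
    }

    // Renders a signed shift as text with its assumed label.
    public static string Describe(int shift)
    {
        int magnitude = Math.Abs(shift);
        string label = Labels.TryGetValue(magnitude, out string name) ? name : "unlabeled";
        string sign = shift < 0 ? "-" : "+";
        return sign
            + ToDaltons(magnitude).ToString("F5", CultureInfo.InvariantCulture)
            + " Da (" + label + ")";
    }
}
```

For example, `Describe(-20307937)` would read as a shift of 203.07937 Da downward, labeled HexNAc under the assumptions above.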

/// Constructor of GlycanBox.
/// </summary>
/// <param name="ids"> The glycanBox composition, each number represent one glycan index in the database</param>
/// <param name="targetDecoy"></param>
public GlycanBox(int[] ids, bool targetDecoy = true):base(ids)

Member

bool targetDecoy should be renamed to isTarget to make it clear whether true means target or decoy.

Contributor Author

Corrected it. Thanks.

@@ -207,7 +240,7 @@ public static List<GlycanIon> NGlycanCompositionFragments(byte[] kind)

for (int add_fuc_count = 2; add_fuc_count <= fuc_count; add_fuc_count++)
{
- GlycanIon add_fuc_glycanIon = ExtendGlycanIon(hexose_glycanIon, 0, 0, (byte)add_fuc_count, 0, glycan_mass);
+ GlycanIon add_fuc_glycanIon = ExtendGlycanIon(hexose_glycanIon, 0, 0, 1, 0, glycan_mass);

Member

Calculated values are changed to heuristics and vice versa. Is this okay? Do we have tests for glycan database reading?

Contributor Author

This fragment code is for N-glycan fragmentation, which we don't use in O-Pair search right now. I still made a small revision to correct the fragment issue, and I added a test for it.

- private readonly int OxoniumIon204Index = 9; //Check Glycan.AllOxoniumIons
- protected readonly List<GlycoSpectralMatch>[] GlobalCsms;
+ private readonly int OxoniumIon204Index = 9; // Check Glycan.AllOxoniumIons
+ protected readonly List<GlycoSpectralMatch>[] GlobalCsms; // Why don't we call it GlobalGsms?

Member

You should make this change to Gsms. It was likely named Csms because the structure was copied from the cross-link search.

Contributor Author

Got it. I have changed it.

public static List<Tuple<int, int, bool>> GetLocalizedGlycan(List<Route> OGlycanBoxLocalization, out LocalizationLevel localizationLevel)
{
List<Tuple<int, int, bool>> localizedGlycan = new List<Tuple<int, int, bool>>();

//Dictionary<string, int>: modsite-id, count
- Dictionary<string, int> seenModSite = new Dictionary<string, int>();
+ Dictionary<string, int> ModSiteSeenCount = new Dictionary<string, int>(); // all possible glycan-sites pair, Dictionary<string, int>: site-glycan pair, count

Member

ModSiteSeenCount should start with a lowercase letter, since it is a local variable accessible only inside this function.

Contributor Author

Got it. Corrected it.


Address Nic's comments
RayMSMS changed the title from "[WIP] Kdn adding and diagnostic filtering" to "Kdn adding and diagnostic filtering" on Jul 30, 2024
RayMSMS added the ready for review label and removed the WIP (work in progress, not ready for review) label on Jul 30, 2024
1. Delete the Bullfrog glycan database (data from collaborator)
2. Add a summary comment to the "WriteProteinGlycoLocalization" function
nbollis merged commit f48d118 into smith-chem-wisc:master on Aug 2, 2024
2 of 3 checks passed
4 participants