Semi colon in protein group explanation #728

Merged


trishorts
Contributor

This PR explains the appearance of the semi-colon in a protein group accession for the chromatographic peak.

Occasionally one sees a semi-colon in the protein accession column of allQuantifiedPeptides in MetaMorpheus. Here are some examples.

  • P00001|P00002;P00003|P00004
  • P0000A;P0000B

During protein parsimony, you can get situations where all peptides are shared between two or more proteins. In other words, there is no unique peptide that could tell the proteins apart, so they are reported together as one protein group. In this case you would see something like P00001|P00002.

That’s the easy part and you already understand that.

Now imagine another scenario where you have some other peptides (that are not in either P00001 or P00002) that give you a second group of the same kind. Let’s call it P00003|P00004. Everything is still fine here.

Now you have two protein groups each with two proteins.

Here is where the semi-colon comes in.
Imagine you now find a new peptide (totally different from any of the peptides used to create the two original protein groups) that is shared across all four proteins. The original peptides require that two different protein groups exist, but this new peptide could come from either group or from both. We don’t know. So, the quantification of that peptide must be attributable to either or both groups. For this peptide, the protein accession in the output will be P00001|P00002;P00003|P00004.
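
The labeling rule described above can be sketched in a few lines of Python. This is only an illustration of the output format, not the MetaMorpheus implementation: the function name `accession_label` and its inputs are hypothetical, and the rule that a group is listed when it contains at least one of the peptide's proteins is an assumption drawn from the description.

```python
def accession_label(peptide_proteins, protein_groups):
    """Build an accession label for one peptide (hypothetical helper).

    peptide_proteins: set of accessions the peptide maps to.
    protein_groups: list of groups from parsimony, each a list of accessions.
    """
    # Assumption: a group is listed if the peptide could have come from it,
    # i.e. the group contains at least one protein the peptide maps to.
    matching = [
        group for group in protein_groups
        if any(protein in peptide_proteins for protein in group)
    ]
    # "|" joins indistinguishable proteins inside one group;
    # ";" joins the separate groups the peptide is shared between.
    return ";".join("|".join(group) for group in matching)


groups = [["P00001", "P00002"], ["P00003", "P00004"]]

# A peptide found only in the first group keeps the plain group label.
print(accession_label({"P00001", "P00002"}, groups))
# prints: P00001|P00002

# A peptide shared across all four proteins is assigned to both groups.
print(accession_label({"P00001", "P00002", "P00003", "P00004"}, groups))
# prints: P00001|P00002;P00003|P00004
```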

You can also see output such as P0000A;P0000B. Here each protein group contains only one protein (as decided by parsimony), and the peptide is shared between the two groups. This would never be reported as P0000A|P0000B, because each protein has a unique peptide that confirms its existence.
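
For anyone reading the protein accession column in a script, the two separators can be unpacked as follows. This is a minimal sketch that assumes the field uses only `;` between protein groups and `|` between proteins within a group, as in the examples here; `parse_accession` is a hypothetical helper, not part of MetaMorpheus.

```python
def parse_accession(field):
    """Split an accession field into protein groups (hypothetical helper).

    Returns a list of groups, each a list of protein accessions.
    """
    groups = []
    for group_text in field.split(";"):      # ";" separates protein groups
        proteins = [p.strip() for p in group_text.split("|")]  # "|" separates proteins
        groups.append(proteins)
    return groups


print(parse_accession("P00001|P00002;P00003|P00004"))
# prints: [['P00001', 'P00002'], ['P00003', 'P00004']]

print(parse_accession("P0000A;P0000B"))
# prints: [['P0000A'], ['P0000B']]
```

A result with more than one group means the peptide is shared between protein groups and cannot be assigned to just one of them.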

MICHAEL SHORTREED and others added 30 commits November 18, 2021 12:30
@trishorts trishorts merged commit 26e4eef into smith-chem-wisc:master Aug 28, 2023
1 check passed
4 participants