Same peptide has different razor proteins from Philosopher #218
Comments
Hi @jd690764 The warnings from IonQuant look odd. Can you share all the psm.tsv files with us? I can send you a link to upload them. Thanks, Fengchao |
Hi Fengchao,
Thank you for your help. Yes, I can do that. Please, send me the link to upload.
Janos
|
Hi Janos, Thanks. Please upload your files to https://umich.app.box.com/f/bc3183ac9ec44c318c2cd43fa3f36e32 Best, Fengchao |
Hi Fengchao,
I uploaded them in a zip file.
Thank you,
Janos
|
Thanks Janos, I checked one peptide. If you processed this data in one batch, this points to a bug in Philosopher. @prvst Can you take a look when you have time? The psm files are at https://umich.box.com/s/1iw3iumyf1uq489ygz3dh9l3aoa33m3i Thanks, Fengchao |
How were these tables generated? Are these from the same experiment? Different data sets? Is it a combined analysis? |
There is a log here (#218 (comment)). It looks like all runs were processed together. |
I suggest using the latest version of Philosopher |
Also, can you try with a Philosopher-generated UniProt database? You can download it with FragPipe. Philosopher sometimes has issues with custom databases (although RefSeq should be fine).
|
Yes, these were run as a single batch. I actually ran this twice and it
gave the exact same error. I have used the same library in the past and had
no issues with it.
I'll repeat the run with the newest version of philosopher and
report back to you probably tomorrow or Friday.
Thank you,
Janos
|
Hi,
I updated philosopher to the latest version (3.2.9) and recreated the
protein library. I used a smaller dataset that doesn't take as much time to
run, but it produced the same error.
Here is one peptide that produced these conflicts:
1: TFNQVEIKPEMIGHYLGEFSITYKPVK NP_001009922 NP_001009562
2: TFNQVEIKPEMIGHYLGEFSITYKPVK NP_001009562 NP_001009608
3: TFNQVEIKPEMIGHYLGEFSITYKPVK NP_001009562 NP_001009608
4: TFNQVEIKPEMIGHYLGEFSITYKPVK NP_001009922 NP_001009608
5: TFNQVEIKPEMIGHYLGEFSITYKPVK NP_001009562 NP_001009925
6: TFNQVEIKPEMIGHYLGEFSITYKPVK NP_001009562 NP_001009925
7: TFNQVEIKPEMIGHYLGEFSITYKPVK NP_001009562 NP_001009182
8: TFNQVEIKPEMIGHYLGEFSITYKPVK NP_001009562 NP_001009182
9: TFNQVEIKPEMIGHYLGEFSITYKPVK NP_001009562 NP_001009922
10: TFNQVEIKPEMIGHYLGEFSITYKPVK NP_001009562 NP_001009922
> t[V1 == 'TFNQVEIKPEMIGHYLGEFSITYKPVK', 2:3] %>% unlist %>% unique %>% sort
[1] "NP_001009182" "NP_001009562" "NP_001009608" "NP_001009922" "NP_001009925"
All these ids are valid, they are in the library. But the peptide is only
present in 2 proteins:
grep -n 'TFNQVEIKPEMIGHYLGEFSITYKPVK' 2020-08-14-decoys-contam-covid2_human_proteome.fa
3936:MAEVEQKKKRTFRKFTYRGVDLDQLLDMSYEQLMQLYSARQRRRLNRGLRRKQHSLLKRLRKAKKEAPPMEKPEVVKTHLRDMIILPEMVGSMVGVYNGKTFNQVEIKPEMIGHYLGEFSITYKPVKHGRPGIGATHSSRFIPLK
40048:MLGRGADLAEVEQKKKRTFRKFTYRGVDLDQLLDMSYEQLMQLYSARQRRRLNRGLRRKQHSLLKRLRKAKKEAPPMEKPEVVKTHLRDMIILPEMVGSMVGVYNGKTFNQVEIKPEMIGHYLGEFSITYKPVKHGRPGIGATHSSRFIPLK
more +3935 2020-08-14-decoys-contam-covid2_human_proteome.fa|head -1
>NP_001009 | 40S ribosomal protein S15 isoform 2 | 6209 | RPS15 | 145 | P62841 | sp
more +40047 2020-08-14-decoys-contam-covid2_human_proteome.fa|head -1
>NP_001295155 | 40S ribosomal protein S15 isoform 1 | 6209 | RPS15 | 152 | K7ELC2 | tr
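For libraries whose sequences are still wrapped across several lines, a plain grep can miss a peptide that spans a line break. The following is a minimal Python sketch of the same lookup, assuming a standard FASTA layout; the function names are made up for illustration, and the third record in the sample is a hypothetical non-matching entry.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs; sequences may span several lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif line:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def proteins_containing(peptide, lines):
    """Return the first '|'-separated header field of every record containing the peptide."""
    return [
        header.split("|")[0].strip()
        for header, sequence in read_fasta(lines)
        if peptide in sequence
    ]


# Two records quoted in this thread plus one hypothetical record that should not match.
sample = [
    ">NP_001009 | 40S ribosomal protein S15 isoform 2 | 6209 | RPS15 | 145 | P62841 | sp",
    "MAEVEQKKKRTFRKFTYRGVDLDQLLDMSYEQLMQLYSARQRRRLNRGLRRKQHSLLKRLRKAKKEAPPMEKPEVVKTHLRDMIILPEMVGSMVGVYNGKTFNQVEIKPEMIGHYLGEFSITYKPVKHGRPGIGATHSSRFIPLK",
    ">NP_001295155 | 40S ribosomal protein S15 isoform 1 | 6209 | RPS15 | 152 | K7ELC2 | tr",
    "MLGRGADLAEVEQKKKRTFRKFTYRGVDLDQLLDMSYEQLMQLYSARQRRRLNRGLRRKQHSLLKRLRKAKKEAPPMEKPEVVKTHLRDMIILPEMVGSMVGVYNGKTFNQVEIKPEMIGHYLGEFSITYKPVKHGRPGIGATHSSRFIPLK",
    ">example_no_match | hypothetical record",
    "MKTAYIAKQR",
]

hits = proteins_containing("TFNQVEIKPEMIGHYLGEFSITYKPVK", sample)
print(hits)
```

To run it on a real library, pass an open file handle instead of `sample`.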
It seems the real protein id should be NP_001009, but for some reason
these 5 other ids - of the form NP_001009nnn - are assigned (there are a
total of 65 ids of this form in the library, plus the correct NP_001009).
I looked at a few other sequences giving warnings, they all have this same
issue of a shorter id replaced by ids with 3 additional digits appended to
the correct one.
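One way to see how many entries in a library are exposed to this kind of confusion is to list every accession that is a strict prefix of another accession. This is only a sketch: it assumes the accession is the text between `>` and the first space or `|`, and the descriptions of the two longer IDs in the sample are placeholders, not real annotations.

```python
def accession(header):
    """Return the text after '>' up to the first space or '|'."""
    body = header[1:].strip()
    for i, ch in enumerate(body):
        if ch in " |":
            return body[:i]
    return body


def prefix_collisions(headers):
    """Map each accession to the longer accessions that start with it."""
    ids = sorted({accession(h) for h in headers if h.startswith(">")})
    collisions = {}
    for i, short in enumerate(ids):
        longer = []
        j = i + 1
        # In sorted order, every ID that extends `short` directly follows it.
        while j < len(ids) and ids[j].startswith(short):
            longer.append(ids[j])
            j += 1
        if longer:
            collisions[short] = longer
    return collisions


headers = [
    ">NP_001009 | 40S ribosomal protein S15 isoform 2 | 6209 | RPS15 | 145 | P62841 | sp",
    ">NP_001295155 | 40S ribosomal protein S15 isoform 1 | 6209 | RPS15 | 152 | K7ELC2 | tr",
    ">NP_001009922 | placeholder description",
    ">NP_001009562 | placeholder description",
]

print(prefix_collisions(headers))
```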
I am not sure what this means, but hopefully it makes some sense to you.
Thank you for looking into this.
Janos
|
Could you share with me all your reports and the database? I also need you to confirm the commands you executed in the last run. |
Hi Felipe,
here are the files:
https://office365stanford-my.sharepoint.com/:f:/g/personal/jdemeter_stanford_edu/Ej3jarHuaatHlpvqr9f2OGEBC_zKorCAR5Z8XFRMECwj2A?e=LHwa2b
Let me know if you need anything else.
Thank you,
Janos
|
How did you get this database? |
I have my own library that I made manually and then ran the latest
philosopher according to the instructions online.
Janos
|
I suspect that there might be an issue with the parsing of the FASTA sequences. I suggest that you take a look at our documentation on how to work with the pre-existing database, and how to format it for Philosopher properly. |
@prvst I think one possible reason is that Philosopher matches protein IDs by substring rather than by whole string. That would explain why it confuses NP_001009 with other proteins whose IDs start with NP_001009 (e.g., NP_001009922, NP_001009562, NP_001009608). |
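To make the hypothesis concrete, here is a small Python illustration of how the two lookup styles differ. It is not Philosopher's code and makes no claim about how Philosopher is implemented; the function names are hypothetical and the ID list simply reuses accessions quoted in this thread.

```python
# Accessions reported in this thread, used here only as sample data.
LIBRARY_IDS = [
    "NP_001009",
    "NP_001009182",
    "NP_001009562",
    "NP_001009608",
    "NP_001009922",
    "NP_001009925",
    "NP_001295155",
]


def match_by_substring(query, ids):
    """Hypothesized faulty lookup: any ID that contains the query is a hit."""
    return [pid for pid in ids if query in pid]


def match_exact(query, ids):
    """Whole-string lookup: only the identical ID is a hit."""
    return [pid for pid in ids if pid == query]


print(match_by_substring("NP_001009", LIBRARY_IDS))
print(match_exact("NP_001009", LIBRARY_IDS))
```

With substring matching, the short ID pulls in every longer ID that extends it, which is consistent with the five unexpected accessions reported above. Exact matching returns only the intended entry.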
Yes, it could be. But we need to be sure first that the formatting is OK, that way we can eliminate it from the possible problems. |
Hi Felipe,
I looked at the library again - before adding the contaminants - and it
seems correct to me. (The library I sent you is a derivative of a library I
successfully used before, but which is also giving the same error now - see
my original report.)
After adding the contaminants it also looks good, the only differences are
the added contaminants and reformatting of the protein sequences (removing
the line breaks that were present after every 60 amino acids). Also, it is
recognized as correct by fragpipe.
Thanks,
Janos
|
Hi Felipe,
Indeed, the error was due to the library. Description lines like this:
NP_060876 | mucin-4 isoform a precursor | 4585 | MUC4 | 5412 | Q99102 | sp
produce errors, but lines like this:
NP_060876|4585|MUC4 mucin-4 isoform a precursor
don't. The first form of description used to work with earlier versions of
msfragger/philosopher.
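For anyone with a library in the first form, a rewrite along these lines may help. It is a sketch only: it assumes every header has exactly seven `|`-separated fields in the order accession, description, gene ID, gene symbol, length, UniProt accession, and source tag (my reading of the examples, not a documented format), and it leaves any other header untouched.

```python
def rewrite_header(line):
    """Convert 'ACC | description | gene_id | symbol | ...' to 'ACC|gene_id|symbol description'."""
    if not line.startswith(">"):
        return line  # sequence lines pass through unchanged
    fields = [field.strip() for field in line[1:].split("|")]
    if len(fields) != 7:
        return line  # unknown layout: do not guess
    accession, description, gene_id, symbol, _length, _uniprot, _source = fields
    return f">{accession}|{gene_id}|{symbol} {description}"


old = ">NP_060876 | mucin-4 isoform a precursor | 4585 | MUC4 | 5412 | Q99102 | sp"
print(rewrite_header(old))
```

Headers that are already in the second form have a different field count and are returned as they are.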
Thank you for your help,
Janos
|
I'm glad that you were able to find the issue. |
Describe the bug
Hi,
I am uploading the log file.
log_2020-08-08_09-19-30.txt
Any help would be greatly appreciated.
Thanks,
Janos