Possible bug (v1.0 and 1.2) #389
Comments
Thank you for your feedback. I guess it does not affect AGAT's work; I will add a fix so that the message does not appear again.
I am working on silencing the message, but it actually reflects a deeper problem in the file you are working with.
It comes from the feature:
pHB9 Manual exon 38748 38784 . + . ID=Qp-exon;Name=Qp;locus_tag=EBNAs
which has a locus_tag to define which mRNA it has to be linked to (the priority is ID/gene_id relationship > locus_tag > sequential). This locus_tag is used in many places: many of your genes share the same locus tag, which is awkward, so AGAT does not know which one to attach the feature to. Worse, locus_tag appears only on level1 (gene) and level3 (CDS/exon) features, so with many mRNAs AGAT does not know which one it must link the feature to.
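The ambiguity described in this comment can be spotted before running AGAT. The following is a minimal Python sketch, not AGAT's own logic: it assumes a plain tab-separated GFF3 with `key=value` attributes, and the names `parse_attributes` and `audit_locus_tags` as well as the two `gene` lines in the example are hypothetical. It reports locus_tag values carried by more than one gene, and lower-level features that have a locus_tag but no Parent.

```python
from collections import defaultdict


def parse_attributes(column9):
    """Turn 'ID=x;Name=y' into {'ID': 'x', 'Name': 'y'}."""
    attrs = {}
    for field in column9.strip().split(";"):
        if "=" in field:
            key, value = field.split("=", 1)
            attrs[key.strip()] = value.strip()
    return attrs


def audit_locus_tags(gff_lines, level1_types=("gene",)):
    """Return (shared, unparented).

    shared: locus_tag -> IDs of level1 features, only for tags used more than once.
    unparented: IDs of non-level1 features that carry a locus_tag but no Parent.
    """
    tag_to_genes = defaultdict(list)
    unparented = []
    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, directives and comments
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9:
            continue  # not a feature line
        attrs = parse_attributes(cols[8])
        if "locus_tag" not in attrs:
            continue
        feature_id = attrs.get("ID", "<no ID>")
        if cols[2] in level1_types:
            tag_to_genes[attrs["locus_tag"]].append(feature_id)
        elif "Parent" not in attrs:
            unparented.append(feature_id)
    shared = {tag: ids for tag, ids in tag_to_genes.items() if len(ids) > 1}
    return shared, unparented


example = [
    # The two gene lines are made up for illustration only.
    "\t".join(["pHB9", "Manual", "gene", "100", "900", ".", "+", ".", "ID=geneA;locus_tag=EBNAs"]),
    "\t".join(["pHB9", "Manual", "gene", "200", "950", ".", "+", ".", "ID=geneB;locus_tag=EBNAs"]),
    # The exon line is the one quoted in this thread.
    "\t".join(["pHB9", "Manual", "exon", "38748", "38784", ".", "+", ".", "ID=Qp-exon;Name=Qp;locus_tag=EBNAs"]),
]
shared, unparented = audit_locus_tags(example)
print("locus_tag shared by several genes:", shared)
print("features with locus_tag but no Parent:", unparented)
```

Any feature listed in the second result relies on the locus_tag alone to find its parent, so it becomes ambiguous as soon as its tag also shows up in the first result.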
Hi Jacques
Thanks for the reply, and for looking into this.
I cannot tell if this issue is because I am misusing the annotation terms, or if the biology I am trying to represent is too complicated.
Essentially, the locus I am working with makes seven different proteins, with seven or eight different polyA sites and three or eight different promoters (depending on how you count the one that is repeated in a repetitive genome region). These proteins are separated by alternative splicing (and in some cases may be bicistronic). This is why I used locus tag rather than gene.
As it happens, this promoter is only used for one protein [EBNA1] (and one mRNA, with which the promoter should be contiguous), so I can perhaps alter this annotation.
Any recommendations on how we should annotate this locus (or the promoters) to avoid this sort of unanticipated glitch? For gene-level estimates, we really want an annotation that defines which protein is made from which transcript, despite the high degree of overlap.
Cheers
Rob
Dr Rob White
Senior Lecturer In Virology
Imperial College London
Section of Virology
St Mary's Hospital Medical School Building
Norfolk Place
London W2 1PG
tel: 0207 594 1124
www.ebv.org.uk
www.imperial.ac.uk/people/robert.e.white/
From: Jacques Dainat ***@***.***>
Date: Friday, 13 October 2023 at 14:04
To: NBISweden/AGAT ***@***.***>
Cc: White, Rob ***@***.***>, Author ***@***.***>
Subject: Re: [NBISweden/AGAT] Possible bug (v1.0 and 1.2) (Issue #389)
I am working on silencing the message, but it actually reflects a deeper problem in the file you are working with.
It comes from the feature:
pHB9 Manual exon 38748 38784 . + . ID=Qp-exon;Name=Qp;locus_tag=EBNAs
which has a locus_tag to define which mRNA it has to be linked to (the priority is ID/gene_id relationship > locus_tag > sequential). This locus_tag is used in many places: many of your genes share the same locus tag, which is awkward, so AGAT does not know which one to attach the feature to. Worse, locus_tag appears only on level1 (gene) and level3 (CDS/exon) features, so with many mRNAs AGAT does not know which one it must link the feature to.
Hi, take care to stay consistent in how your features are defined.
do not have Parent feature but has a
pHB9 Manual exon 21605 21665 . + . ID=W1pr6;label=Exon W1'.6;locus_tag=EBNAs;Parent=mRNA_Wp1E2_LP1W
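One way to look for this kind of inconsistency is to count, per feature type, how many features carry a Parent attribute and how many do not. The Python sketch below assumes a tab-separated GFF3 with `key=value` attributes; the function names `parent_usage` and `mixed_types` are hypothetical, and the check is only an illustration, not something AGAT provides.

```python
from collections import Counter, defaultdict


def attribute_keys(column9):
    """Return the set of attribute keys found in a GFF3 ninth column."""
    keys = set()
    for field in column9.strip().split(";"):
        if "=" in field:
            keys.add(field.split("=", 1)[0].strip())
    return keys


def parent_usage(gff_lines):
    """Count, for each feature type, features with and without a Parent attribute."""
    usage = defaultdict(Counter)
    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, directives and comments
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9:
            continue  # not a feature line
        label = "with Parent" if "Parent" in attribute_keys(cols[8]) else "without Parent"
        usage[cols[2]][label] += 1
    return usage


def mixed_types(usage):
    """Feature types where some features have a Parent and others do not."""
    return sorted(ftype for ftype, counts in usage.items() if len(counts) > 1)


# Both exon lines are quoted in this thread.
example = [
    "\t".join(["pHB9", "Manual", "exon", "38748", "38784", ".", "+", ".",
               "ID=Qp-exon;Name=Qp;locus_tag=EBNAs"]),
    "\t".join(["pHB9", "Manual", "exon", "21605", "21665", ".", "+", ".",
               "ID=W1pr6;label=Exon W1'.6;locus_tag=EBNAs;Parent=mRNA_Wp1E2_LP1W"]),
]
usage = parent_usage(example)
for ftype in mixed_types(usage):
    print(ftype, dict(usage[ftype]))
```

Feature types reported by `mixed_types` are the ones worth reviewing by hand, since a parser then has to fall back on locus_tag or file order for some of their features.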
Discussed in #388
Originally posted by robertewhite August 4, 2023
Hi
I have been using AGAT to assemble GFF3 files for Epstein-Barr virus, and have had some odd error messages in the log file. As far as I can tell, the output GFF3 looks fine, but it is a lot to analyse in forensic detail.
The message is:
"Use of uninitialized value in lc at /Users/xxx/miniconda3/envs/agatenv3/lib/perl5/site_perl/AGAT/OmniscientI.pm line 1069, line 511." [and the same on many other lines]
I have used AGAT a lot on a string of files (iteratively correcting my input). Initially this message did not appear, but it then appeared more and more frequently in the output, so I thought it might be a corruption in my installation. I installed 1.2 in a new environment, and the error persisted.
Attaching the log and the input file so you can see if you can reproduce this issue. [the orphan features in the GFF3 file are deliberate]
Cheers
Rob
WTw_withWp.agat.log
WTw_withWp.gff3 copy.txt