Possible bug (v1.0 and 1.2) #389

Closed
robertewhite opened this issue Aug 4, 2023 Discussed in #388 · 4 comments

@robertewhite

Discussed in #388

Originally posted by robertewhite August 4, 2023
Hi

I have been using AGAT to assemble GFF3 files for Epstein-Barr virus, and have had some odd error messages in the log file. As far as I can tell, the output GFF3 looks fine, but it is a lot to analyse in forensic detail.

The message is:
"Use of uninitialized value in lc at /Users/xxx/miniconda3/envs/agatenv3/lib/perl5/site_perl/AGAT/OmniscientI.pm line 1069, line 511." [and the same on many other lines]

I have used AGAT a lot on a string of files (iteratively correcting my input). Initially this message did not appear, but it then appeared more and more frequently in the output, so I thought my installation might be corrupted. I installed 1.2 in a new environment, and the error persisted.

Attaching the log and the input file so you can see if you can reproduce this issue. [the orphan features in the GFF3 file are deliberate]

Cheers
Rob
WTw_withWp.agat.log
WTw_withWp.gff3 copy.txt

@Juke34
Collaborator

Juke34 commented Aug 8, 2023

Thank you for your feedback. I guess it does not affect AGAT's work. I will add a fix so that the message does not appear again.

@Juke34
Collaborator

Juke34 commented Oct 13, 2023

I am working on silencing the message, but it actually reflects a deeper problem in the file you are working with.
It comes from the feature:

pHB9	Manual	exon	38748	38784	.	+	.	ID=Qp-exon;Name=Qp;locus_tag=EBNAs

which has a locus_tag to define which mRNA it must be linked to (priority is ID/gene_id relationship > locus_tag > sequential). But this locus_tag is used in many places: many of your genes share the same locus tag, which is awkward, so AGAT does not know which one to attach the feature to. Worse, you have locus_tag only on level1 (gene) and level3 (cds/exon) features, so with many mRNAs AGAT does not know which mRNA to link it to.
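
The shared locus tag situation described above can be checked for before running AGAT. The following is a minimal Python sketch, not AGAT code: the helper names `parse_attributes` and `shared_locus_tags` are hypothetical, and the gene lines in `EXAMPLE` are made-up placeholders rather than lines from the attached file. It assumes a tab-separated, 9-column GFF3 with `key=value;` attributes.

```python
from collections import defaultdict


def parse_attributes(column9):
    """Split a GFF3 attribute column (key=value;key=value) into a dict."""
    attrs = {}
    for field in column9.strip().split(";"):
        if "=" in field:
            key, value = field.split("=", 1)
            attrs[key.strip()] = value.strip()
    return attrs


def shared_locus_tags(gff_lines, feature_type="gene"):
    """Return {locus_tag: [feature IDs]} for tags carried by more than one feature."""
    ids_by_tag = defaultdict(list)
    for line in gff_lines:
        # Skip blank lines and comment/directive lines.
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != feature_type:
            continue
        attrs = parse_attributes(cols[8])
        if "locus_tag" in attrs:
            ids_by_tag[attrs["locus_tag"]].append(attrs.get("ID", "(no ID)"))
    return {tag: ids for tag, ids in ids_by_tag.items() if len(ids) > 1}


# Hypothetical input: two genes share TAG1, one gene has its own tag.
EXAMPLE = [
    "##gff-version 3",
    "seq1\tManual\tgene\t100\t900\t.\t+\t.\tID=geneA;locus_tag=TAG1",
    "seq1\tManual\tgene\t500\t1500\t.\t+\t.\tID=geneB;locus_tag=TAG1",
    "seq1\tManual\tgene\t2000\t2500\t.\t+\t.\tID=geneC;locus_tag=TAG2",
]

print(shared_locus_tags(EXAMPLE))  # {'TAG1': ['geneA', 'geneB']}
```

To check a real file, pass an open file handle instead of the list, e.g. `shared_locus_tags(open("my.gff3"))`. Any tag reported here is one that a Parent-less exon or CDS could not be resolved from unambiguously.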

@robertewhite
Author

robertewhite commented Oct 13, 2023 via email

Juke34 closed this as completed in ecef089 Oct 19, 2023
@Juke34
Collaborator

Juke34 commented Oct 19, 2023

Hi,
It is up to you whether to define one gene or several. Overlaps between genes are allowed.
In any case, each defined gene (i.e. locus) can have multiple mRNAs/RNAs (which can be considered isoforms). They may differ by start/stop position and/or by splicing events.

Take care to define your features consistently.
For example, some exon features have a `locus_tag` attribute, others do not, and some have both `locus_tag` and `Parent`:

pHB9	Manual	exon	38748	38784	.	+	.	ID=Qp-exon;Name=Qp;locus_tag=EBNAs

has no `Parent` attribute but has a `locus_tag`, while other exons like this one

pHB9	Manual	exon	38556	38784	.	+	.	ID=Fp-exon;Name=Fp;Parent=mRNA_Fp_EBNA1
have a `Parent` attribute but no `locus_tag`, and other exons like this one

pHB9	Manual	exon	21605	21665	.	+	.	ID=W1pr6;label=Exon W1'.6;locus_tag=EBNAs;Parent=mRNA_Wp1E2_LP1W

have both `locus_tag` and `Parent` attributes.
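
One way to spot this kind of inconsistency across a whole file is to count which linking attributes each exon carries. Below is a rough Python sketch under the assumption of tab-separated GFF3 lines; the function names `linking_attributes` and `pattern_counts` are hypothetical and are not part of AGAT. The three input lines are the exon features quoted above.

```python
from collections import Counter

# The three exon lines quoted in this comment, with tab-separated columns.
EXONS = [
    "pHB9\tManual\texon\t38748\t38784\t.\t+\t.\tID=Qp-exon;Name=Qp;locus_tag=EBNAs",
    "pHB9\tManual\texon\t38556\t38784\t.\t+\t.\tID=Fp-exon;Name=Fp;Parent=mRNA_Fp_EBNA1",
    "pHB9\tManual\texon\t21605\t21665\t.\t+\t.\tID=W1pr6;label=Exon W1'.6;locus_tag=EBNAs;Parent=mRNA_Wp1E2_LP1W",
]


def linking_attributes(gff_line, keys=("Parent", "locus_tag")):
    """Return the subset of `keys` present in the attribute column of one line."""
    column9 = gff_line.rstrip("\n").split("\t")[8]
    present = {f.split("=", 1)[0].strip() for f in column9.split(";") if "=" in f}
    return tuple(k for k in keys if k in present)


def pattern_counts(gff_lines, feature_type="exon"):
    """Count how many features of one type use each combination of linking attributes."""
    counts = Counter()
    for line in gff_lines:
        cols = line.rstrip("\n").split("\t")
        if len(cols) == 9 and cols[2] == feature_type:
            counts[linking_attributes(line)] += 1
    return counts


for pattern, n in pattern_counts(EXONS).items():
    print(n, pattern)
```

A consistently defined file would report a single pattern for exons; here the three quoted lines give three different ones (`locus_tag` only, `Parent` only, and both).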
