OmniscientTool error #165

Closed
cmatKhan opened this issue Aug 26, 2021 · 2 comments · Fixed by #170

Comments

@cmatKhan

I have a correctly formatted gff3 with the following level1 features:

protein_coding_gene
ncRNA_gene
pseudogene

Note that there is no 'gene' feature, but from what I understand these are correct children of 'gene'.

protein_coding_gene is not included in the level1 json output by agat_convert_sp_gxf2gxf.pl, so I have tried both adding it as its own entry and putting it in place of 'gene'. In both cases, I get this error when running other agat scripts:

=> OmniscientI total time: 52 seconds
liftoff_h99_to_kn99.gff file parsed
These two features overlap without same id ! :
CP022335.1	Liftoff	protein_coding_gene	23466	23846	.	+	.	ID CNAG_09011 ; copy_num_ID CNAG_09011_0 ; coverage "1.0"  ; description "NADH dehydrogenase subunit 3"  ; extra_copy_number 0 ; sequence_ID "1.0"  ; valid_ORFs 1
CP022335.1	Liftoff	protein_coding_gene	21967	23466	.	+	.	ID CNAG_09010 ; copy_num_ID CNAG_09010_0 ; coverage "1.0"  ; description "NADH dehydrogenase subunit 2"  ; extra_copy_number 0 ; sequence_ID "1.0"  ; valid_ORFs 1
1 overlapping feature found ! We will treat them now:
We decided to keep that one: CP022335.1	Liftoff	protein_coding_gene	23466	23846	.	+	.ID CNAG_09011 ; copy_num_ID CNAG_09011_0 ; coverage "1.0"  ; description "NADH dehydrogenase subunit 3"  ; extra_copy_number 0 ; sequence_ID "1.0"  ; valid_ORFs 1
Can't call method "start" on an undefined value at /home/chasem/.conda/envs/agat_env/lib/site_perl/5.26.2/AGAT/OmniscientTool.pm line 2052.

If I replace all instances of protein_coding_gene in the gff with gene, then all the agat scripts do work.
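For readers who need this workaround, here is a minimal sketch of the renaming step. It is written in Python purely for illustration and is not part of AGAT. The function name `rename_feature_type` and the sample records are hypothetical. Unlike a blind search-and-replace, it only touches the type column (column 3), so attribute values that happen to contain the same string are left alone.

```python
def rename_feature_type(lines, old_type="protein_coding_gene", new_type="gene"):
    """Yield GFF3 lines with the type column (column 3) renamed.

    Hypothetical helper for illustration; not an AGAT function.
    """
    for line in lines:
        stripped = line.rstrip("\n")
        # Leave blank lines, comments, and ## directives untouched.
        if not stripped or stripped.startswith("#"):
            yield stripped
            continue
        cols = stripped.split("\t")
        # A GFF3 feature line has nine tab-separated columns; type is the third.
        if len(cols) == 9 and cols[2] == old_type:
            cols[2] = new_type
        yield "\t".join(cols)


# Illustrative records only; the attribute column is simplified.
example = [
    "##gff-version 3",
    "CP022335.1\tLiftoff\tprotein_coding_gene\t23466\t23846\t.\t+\t.\tID=CNAG_09011",
    "CP022335.1\tLiftoff\tmRNA\t23466\t23846\t.\t+\t.\tID=CNAG_09011-t1;Parent=CNAG_09011",
]
converted = list(rename_feature_type(example))
```

Note that renaming discards the more specific type, so this is a stopgap rather than a substitute for the tool accepting any level1 type.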

The offending lines in OmniscientTool.pm are these:

 my $gene_feature=$hash_omniscient->{'level1'}{'gene'}{lc($gene_id)};
  if ($gene_feature->start != $geneExtremStart){

I have tried replacing 'gene' with 'protein_coding_gene' in both instances -- lines 2438 and 2052 -- where it occurs. This also works, but obviously changing the gff is a better solution than this.

I wonder if 'gene' shouldn't be hard coded into the script, but rather somehow set from the level1 json? Or, maybe that is the way it is supposed to work, and I am not adding "protein_coding_gene" to the json correctly?
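To make the suggestion concrete, here is a rough sketch of looking a feature up under every level1 type instead of only under 'gene'. It is written in Python for illustration, with a nested dict standing in for the Perl `$hash_omniscient` structure quoted above. It is not AGAT's actual code or the fix that was later merged, and the function name `find_level1_feature` and the sample data are assumptions.

```python
def find_level1_feature(omniscient, feature_id):
    """Return (feature_type, feature) for feature_id under any level1 type.

    Hypothetical helper; returns None when the ID is not found.
    """
    key = feature_id.lower()  # the quoted Perl code lowercases the ID with lc()
    for feature_type, features in omniscient.get("level1", {}).items():
        if key in features:
            return feature_type, features[key]
    return None


# Stand-in for the parsed annotation: level1 -> feature type -> lowercased ID.
omniscient = {
    "level1": {
        "protein_coding_gene": {"cnag_09011": {"start": 23466, "end": 23846}},
        "pseudogene": {},
    }
}

# Hard-coded lookup: finds nothing because there is no 'gene' type at all.
hard_coded = omniscient["level1"].get("gene", {}).get("cnag_09011")

# Type-agnostic lookup: finds the record under 'protein_coding_gene'.
found = find_level1_feature(omniscient, "CNAG_09011")
```

In the hard-coded case the lookup yields an undefined value, which matches the "Can't call method "start" on an undefined value" error in the log above.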

@Juke34
Collaborator

Juke34 commented Aug 26, 2021

Thank you for using AGAT and for your feedback.
I have pushed a fix in branch 165. Could you give it a try? I had hard-coded the 'gene' feature type because of specific needs at the time, and I did not expect other level1 feature types to be used. It should now work as expected with any level1 feature type.
The issue affected agat_sp_fix_fusion.pl and agat_sp_fix_overlaping_genes.pl. Was it one of these scripts you were using?

@cmatKhan
Author

I updated my agat version today, and the fix worked.

Thank you very much -- I appreciate the prompt response and patch. My apologies that I didn't respond sooner.
