Is it necessary to de-redundant the resulting transcript model? #257

wjcre2023 · 2024-11-06T10:20:21Z

Dear andrewprzh
I ran all the samples together and ended up with a single transcript model .gtf file. However, in IGV I see many novel transcripts that are very similar to one another: they have the same number of exons, and only the lengths of some exons differ, by a few hundred bp. Do these require further redundancy removal? I suspect this is most likely due to degradation of the 5' end. How should I deal with this situation? Should redundancy be removed from the reads before quantification and transcript modeling?

best wishes
jie

andrewprzh · 2024-11-15T14:59:16Z

Dear @wjcre2023

Could you send the command line you used?
Normally, IsoQuant does not report multiple transcripts with the same intron chain unless strong evidence is found, for example, multiple distinct polyadenylation sites.
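
One way to check how often this happens in an output GTF is to group transcripts by intron chain. The following is a minimal sketch, not part of IsoQuant: the function names are hypothetical, and it assumes a standard GTF with `exon` records carrying a `transcript_id` attribute. It lists groups of multi-exon transcripts that share chromosome, strand, and every intron, so they can differ only in their start and end positions.

```python
import re
from collections import defaultdict


def intron_chains(gtf_lines):
    """Map each transcript_id to (chrom, strand, introns) using GTF exon records."""
    exons = defaultdict(list)
    location = {}
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue
        match = re.search(r'transcript_id "([^"]+)"', fields[8])
        if match is None:
            continue
        tid = match.group(1)
        exons[tid].append((int(fields[3]), int(fields[4])))
        location[tid] = (fields[0], fields[6])
    chains = {}
    for tid, coords in exons.items():
        coords.sort()
        # An intron spans the gap between two consecutive exons (1-based, inclusive).
        introns = tuple(
            (coords[i][1] + 1, coords[i + 1][0] - 1) for i in range(len(coords) - 1)
        )
        chains[tid] = location[tid] + (introns,)
    return chains


def shared_chain_groups(gtf_lines):
    """Return sorted lists of transcript ids that share an identical intron chain."""
    groups = defaultdict(list)
    for tid, key in intron_chains(gtf_lines).items():
        if key[2]:  # skip mono-exonic transcripts, which have no introns
            groups[key].append(tid)
    return [sorted(tids) for tids in groups.values() if len(tids) > 1]
```

Passing an open file handle for the transcript model GTF to `shared_chain_groups` returns the candidate groups, which can then be inspected in IGV.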

Best
Andrey

wjcre2023 · 2024-11-16T02:21:04Z

Dear @andrewprzh
My parameters are as follows:

The input file was a .fq of full-length transcripts identified by pychopper.

Also, it seems that the .gff file is still not available in 3.6.1. I have uploaded a log file:
isoquant_log.txt

andrewprzh · 2024-11-18T00:20:45Z

Dear @wjcre2023

Yes, I think the main reason is the --fl_data option. It assumes that every read corresponds to a full-length transcript and that its 5' and 3' ends are correctly detected. As a result, you get transcripts with the same intron chain but different TSS and TES positions. I suggest re-running IsoQuant without any options.
Also, it is possible to run IsoQuant on raw ONT data without any pre-processing.
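
To compare the two runs side by side, the command line can be assembled with and without `--fl_data`. This is only a sketch under assumptions: the helper `isoquant_command` and all file paths are hypothetical, and the exact flag set should be checked against `isoquant.py --help` for the installed version.

```python
import shlex


def isoquant_command(reference, genedb, reads, output_dir,
                     data_type="nanopore", fl_data=False):
    """Build an IsoQuant argument list; --fl_data is added only on request."""
    cmd = [
        "isoquant.py",
        "--reference", reference,
        "--genedb", genedb,
        "--fastq", reads,
        "--data_type", data_type,
        "-o", output_dir,
    ]
    if fl_data:
        # Tells IsoQuant to treat every read as a full-length transcript.
        cmd.append("--fl_data")
    return cmd


# Hypothetical paths, shown only to illustrate the suggested re-run.
rerun = isoquant_command("genome.fa", "annotation.gtf", "reads.fq", "isoquant_out")
print(shlex.join(rerun))
```

The returned list can be passed to `subprocess.run` directly, which avoids shell quoting problems with unusual file names.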

P.S. Your log shows an error caused by duplicated IDs in your reference annotation.

Best
Andrey

wjcre2023 · 2024-11-18T12:29:04Z

Dear @andrewprzh
Thank you for your reply! I will delete this parameter and try again. In addition, this duplication seems to be correct, because one protein ID corresponds to multiple CDS. I don't know why the error was reported. Can you give me some advice?

andrewprzh · 2024-11-18T16:13:49Z

Dear @wjcre2023

The ID should be unique for all features, even for exons belonging to a single CDS.
From the GFF documentation:

ID
Indicates the unique identifier of the feature. IDs must be unique within the scope of the GFF file.

That is why the gffutils library, which IsoQuant uses to convert the GFF into a gene database, complains about this. I think it is better to modify your annotation.

It is possible to ignore these warnings (i.e. convert GFF to database with other options), but then the outcome is not predictable.
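
Before modifying the annotation, it may help to list which IDs are repeated. The following is a minimal, standard-library sketch and is not what gffutils does internally: the function name `duplicated_gff_ids` is hypothetical, and it assumes a tab-separated GFF3 file whose ninth column holds `key=value` attributes separated by semicolons. It reports every ID that appears on more than one line, including CDS segments that share one protein ID.

```python
from collections import Counter


def duplicated_gff_ids(gff_lines):
    """Return {ID: count} for every ID attribute seen on more than one line."""
    counts = Counter()
    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a feature line (for example, embedded sequence)
        for attribute in fields[8].split(";"):
            key, sep, value = attribute.strip().partition("=")
            if sep and key == "ID":
                counts[value] += 1
    return {feature_id: n for feature_id, n in counts.items() if n > 1}
```

Calling it on an open file handle for the annotation returns a dictionary of the offending IDs and how many lines use each one, which shows how much of the file would need renaming.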

Best
Andrey

wjcre2023 · 2024-11-19T03:40:25Z

Dear @andrewprzh
Ok, thank you very much. I think I see what you mean.
Best
Jie

wjcre2023 · 2024-11-20T02:23:12Z

Dear @andrewprzh
Unfortunately, the results of my re-run have not changed much from before. To save time, I re-ran IsoQuant with the BAM file from the previous alignment. Is the outcome related to these two parameters?

Here are my results:

andrewprzh · 2024-12-03T01:22:57Z

Dear @wjcre2023

The parameters look OK. Could you send me the GTF records of these two transcripts?
I can take a look, but, of course, it may be hard to understand the real reason behind it without having the data.

Best
Andrey

wjcre2023 · 2024-12-04T02:18:42Z

Dear @andrewprzh
Thank you for your reply!
Here are my two examples:
example.txt

Best
jie

andrewprzh added the question Further information is requested label Nov 18, 2024

andrewprzh added the weird results Something looks odd in the resulting files label Dec 3, 2024
