RSeqQC: parse Transcript Integrity Number #737

drbecavin · 2018-04-20T08:04:56Z

Integration of Transcript Integrity Number from RSeqQC

Name of tool:
- Transcript Integrity Number - tin.py
Tool description:
- This program is designed to evaluate RNA integrity at the transcript level. TIN (transcript integrity number) is named by analogy with RIN (RNA integrity number), the most widely used metric for evaluating RNA integrity at the sample (or transcriptome) level. RIN is a very useful preventive measure to ensure good RNA quality and robust, reproducible RNA sequencing.
Tool homepage:
- http://rseqc.sourceforge.net/#tin-py
Complete log file output:
- New_abx_ZT13_IP_4.summary.txt
Log filename pattern:
- .summary.txt and .tin.xls
Most interesting data for General Stats table:
- TIN(median)
Data suitable for MultiQC plot(s):
- TIN(median)

Thanks for your amazing job on MultiQC. I love this softwareeeeee ! It changed my life !


ewels · 2018-04-20T08:07:59Z

Hah, thanks for the comment @drbecavin - I should add that quote to the website testimonials 😉

Should be pretty easy to add this. I'll take a look into it when I get a chance.

drbecavin · 2018-04-20T08:11:57Z

No problem, I really enjoy the way you manage all these logs, and the quality of the html report created. Thanks!
Maybe I should create another issue for read_quality.py, another tool of RSeQC?

ewels · 2018-04-20T08:12:46Z

Sure 👍 It's good to have multiple issues where possible to break things up.

EngineerReversed · 2020-09-22T10:38:10Z

Has this been included in MultiQC_report?

guidohooiveld · 2021-06-10T14:30:58Z

Hi, I also kindly second this request to include the results of tin.py in a MultiQC report.
An example figure (box plot; Figure 1B) showing the data for multiple samples can be found in this paper, but other representations of the results (median TIN score + SD or IQR of all transcripts in a sample) in a table or graph may be more appropriate.

For completeness, the command used to run the TIN module is shown below, and the results are attached. The txt file *out.summary.txt contains the per-sample summary (i.e. median + SD); the xls file *out.tin.xls (actually also a tab-delimited text file) contains the TIN score per transcript.

[guidoh@localhost P15-1-6h]$ tin.py -i P15-1-6h_Aligned.sortedByCoord.out.bam -r /mnt/files/guido/INDEX/STAR/Housekeeping_TranscriptsHuman2158.bed
@ 2021-06-10 11:44:04: Get BAM file(s) ...
Total 1 BAM file(s):
        P15-1-6h_Aligned.sortedByCoord.out.bam
@ 2021-06-10 11:44:04: Processing P15-1-6h_Aligned.sortedByCoord.out.bam
[guidoh@localhost P15-1-6h]$
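
The per-sample statistics suggested above (median plus SD or IQR over all transcripts) could be derived from the per-transcript file roughly as follows. This is only a sketch: the helper name `summarise_tin`, the column layout and the example rows are assumptions made for illustration, and tin.py's own summary may filter transcripts or compute the standard deviation differently.

```python
import csv
import io
import statistics


def summarise_tin(handle, tin_column="TIN"):
    """Return median, sample stdev and IQR of per-transcript TIN scores.

    Assumes a tab-delimited file with a header row that contains a
    column named `tin_column` (an assumption about the *.tin.xls layout).
    """
    reader = csv.DictReader(handle, delimiter="\t")
    scores = [float(row[tin_column]) for row in reader]
    # statistics.quantiles with n=4 returns the three quartile cut points
    q1, _, q3 = statistics.quantiles(scores, n=4)
    return {
        "median": statistics.median(scores),
        "stdev": statistics.stdev(scores),
        "iqr": q3 - q1,
    }


# Made-up rows, for illustration only (not taken from the attached files)
example = io.StringIO(
    "geneID\tchrom\ttx_start\ttx_end\tTIN\n"
    "tx1\tchr1\t100\t900\t40.0\n"
    "tx2\tchr1\t1000\t1900\t50.0\n"
    "tx3\tchr2\t100\t900\t60.0\n"
    "tx4\tchr2\t1000\t1900\t70.0\n"
    "tx5\tchr3\t100\t900\t80.0\n"
)
summary = summarise_tin(example)
```

With real data, the same function could be called on an open file handle for each sample's *out.tin.xls file.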

Thank you for having a look at this!
G

output_tin.py.zip


ewels · 2021-07-02T18:46:45Z

Hi all,

Apologies for the very (very, very) long time it's taken to get this added. @ErikDanielsson has just put together a new RSeQC submodule to support this output in #1481 and it will be part of the v1.11 release any day now.

In the end, I decided that we should keep it simple. It adds two columns to the General Statistics table: the median TIN and the stdev. The latter is hidden by default; it can be shown via the Configure Columns button or at report generation time via a config (see docs).

I hope this is still helpful to you all, despite coming over 3 years late! 😁 Shout if you hit any problems with it.

Many thanks,

Phil

guidohooiveld · 2021-07-05T20:31:29Z

Thanks Phil and Erik for creating the MultiQC RSeQC TIN submodule; much appreciated!

Earlier today I updated MultiQC to the latest development version, and ran it again on a folder containing various QC output files, including TIN.
Et voila, the 2 columns (of which one is hidden) were indeed added to the General Statistics table. Nice & thanks!

One comment/question, though, regarding the sample names used for the TIN values in the General Statistics table: these are not the same as used for the other RSeQC modules. This makes the table 'less nice' and more difficult to read. See 1st screenshot below.

I think this is due to the fact that within the TIN "summary" file (the txt file *out.summary.txt) the full name of the BAM file is returned (used) by RSeQC (see its copied content below), which is then extracted (parsed) by the MultiQC TIN module, and subsequently used in the General Statistics table.

Therefore: would you have any suggestion to prevent this from happening, so that only the 'base name' is used in the table? Maybe by somehow applying the function fn_clean_sample_names on the fly?
Note that I am not an expert on how to do this and this may be too naive a thought... but since the 'other' files seem to be correctly recognized and name-cleaned (see 2nd screenshot), this may be feasible.

Thus, in summary: the General Statistics table uses the full name present in the TIN summary file (*out.summary.txt), e.g. "P26-1-6h_Aligned.sortedByCoord.out.bam", whereas only the sample ID (base name) "P26-1-6h" would be preferred.

Content TIN summary file (P26-1-6h_Aligned.sortedByCoord.out.summary.txt):

Bam_file	TIN(mean)	TIN(median)	TIN(stdev)
P26-1-6h_Aligned.sortedByCoord.out.bam	53.72327495737302	53.34221052273402	18.530355596890026
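
To make the naming issue concrete, here is a minimal sketch of reading a summary file like the one above. It is not the MultiQC module's code; the function name `parse_tin_summary` is made up for illustration. It shows that the key taken from the Bam_file column is the full BAM file name unless it is cleaned afterwards.

```python
import csv
import io


def parse_tin_summary(handle):
    """Map each Bam_file entry to its TIN mean, median and stdev as floats."""
    results = {}
    for row in csv.DictReader(handle, delimiter="\t"):
        results[row["Bam_file"]] = {
            "mean": float(row["TIN(mean)"]),
            "median": float(row["TIN(median)"]),
            "stdev": float(row["TIN(stdev)"]),
        }
    return results


# Content copied from the summary file shown above
summary_text = (
    "Bam_file\tTIN(mean)\tTIN(median)\tTIN(stdev)\n"
    "P26-1-6h_Aligned.sortedByCoord.out.bam\t"
    "53.72327495737302\t53.34221052273402\t18.530355596890026\n"
)
parsed = parse_tin_summary(io.StringIO(summary_text))
```

The only key in `parsed` is the full BAM file name, which is what ends up as the sample name when no cleaning step is applied.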

An example file is present in my previous post in this thread (#737 (comment)).

Below is a screenshot of a folder containing, for one sample, the output of STAR as well as RSeQC and Picard. All relevant files are nicely recognized by MultiQC, and their names are properly 'cleaned' when used in the MultiQC report. Hence my (naive) thought above...

ewels · 2021-07-06T05:36:01Z

Ah yes, our first v1.11 release bug! You're totally right, we missed passing the sample name through the self.clean_s_name() function (docs). It's a one-line fix, I'll try to get to it later today.
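
For readers unfamiliar with the idea, the kind of name cleaning meant here can be illustrated with a stand-alone sketch. This is not the implementation of self.clean_s_name(); the helper `strip_known_suffixes` and its suffix list are hypothetical, chosen only to match the example file name from this thread.

```python
def strip_known_suffixes(
    name, suffixes=(".bam", ".out", ".sortedByCoord", "_Aligned")
):
    """Repeatedly remove any listed suffix from the end of `name`.

    The suffix list is an assumption for illustration, not MultiQC's
    configured list of file extensions.
    """
    changed = True
    while changed:
        changed = False
        for suffix in suffixes:
            # Never strip the whole string down to nothing
            if name.endswith(suffix) and len(name) > len(suffix):
                name = name[: -len(suffix)]
                changed = True
    return name


cleaned = strip_known_suffixes("P26-1-6h_Aligned.sortedByCoord.out.bam")
```

Applying a step like this to the value read from the Bam_file column before storing the results is what brings the TIN sample names in line with the other RSeQC submodules.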

ewels · 2021-07-06T05:37:32Z

Moved into a dedicated issue: #1484

ewels · 2021-07-06T17:44:27Z

(fixed in v1.12dev)


ewels changed the title ~~Integration of Transcript Integrity Number from RSeqQC~~ RSeqQC: parse Transcript Integrity Number Apr 20, 2018

ewels added the module: change label Apr 20, 2018

ewels added the priority: high label Jun 13, 2021

ewels assigned ErikDanielsson Jun 29, 2021

ewels added this to the MultiQC v1.11 milestone Jul 2, 2021

ErikDanielsson mentioned this issue Jul 2, 2021

Add new module tin #1481

Merged

11 tasks

ewels added a commit to MultiQC/test-data that referenced this issue Jul 2, 2021

RSeQC Tin - more test data.

b6f3374

See MultiQC/MultiQC#737

ewels closed this as completed Jul 2, 2021

ewels mentioned this issue Jul 6, 2021

RSeQC TIN module - sample name cleaning #1484

Closed

vladsavelyev pushed a commit to vladsavelyev/MultiQC_TestData that referenced this issue Apr 16, 2022

RSeQC Tin - more test data.

46c4d0c

See MultiQC/MultiQC#737
