Result Table Column Explanations #14

Open
DarioS opened this issue Mar 4, 2020 · 1 comment

DarioS commented Mar 4, 2020

The output table in the Gene View tab has 22 columns. Could the vignette include an explanation of each? For example, I see BSJ_vs_FSJ is always 0 and I wonder what that means. The last few columns are a mystery.

Collaborator

davhum commented Mar 5, 2020

Hi Dario,

There are a number of tables that can be built through the Gene View tab. I am assuming you are referring to the "Grouped" table output built from STAR chimeric outputs. To have 22 columns you would have a data set of 6 samples (3 columns per sample plus 4 generic columns). Note that each sample is automatically assigned to a "Group", which can be viewed under the Projects tab (i.e. if you select items under "List of all groups", the associated sample ID is shown in the main panel).

Each sample (group) contributes 3 columns to the grouped table under Gene View. These three columns are: (i) a BSJ count, (ii) a RAD score (group_II_III) and (iii) an FSJ score (group_FSJ). There are also 4 generic columns built into the table. These are BSjuncName (unique identifier), strandDonor, Gene, and juncType. juncType is copied from column 7 of the STAR chimeric output (possible values are: -1=encompassing junction (between the mates), 1=GT/AG, 2=CT/AC).
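As a rough illustration of the column arithmetic above, here is a minimal Python sketch (not Ularcirc code). The four generic column names and the juncType codes come from the description above; the group names and the `_BSJ` suffix are hypothetical placeholders, and the exact per-group labels in the real table may differ.

```python
# Generic columns named in the reply above.
GENERIC_COLUMNS = ["BSjuncName", "strandDonor", "Gene", "juncType"]

# Per-group suffixes. "_BSJ" is a hypothetical placeholder for the BSJ count
# column; "II_III" (RAD score) and "FSJ" follow the labels mentioned above.
PER_GROUP_SUFFIXES = ["BSJ", "II_III", "FSJ"]

# juncType codes as listed for column 7 of the STAR chimeric output.
JUNC_TYPE = {
    -1: "encompassing junction (between the mates)",
    1: "GT/AG",
    2: "CT/AC",
}


def grouped_table_columns(groups):
    """Return the expected column names for a grouped table."""
    columns = list(GENERIC_COLUMNS)
    for group in groups:
        columns.extend(f"{group}_{suffix}" for suffix in PER_GROUP_SUFFIXES)
    return columns


# Six samples, each assigned to its own (hypothetically named) group.
groups = [f"Group{i}" for i in range(1, 7)]
columns = grouped_table_columns(groups)
print(len(columns))  # 4 generic + 6 * 3 per-group columns
```

With six groups this gives 4 + 6 × 3 = 22 columns, matching the table size reported in the question.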

The RAD score is the ratio of type II to type III alignments, reported on a 0 to 1 scale, and is labelled "Group__II_III". A value close to 0.5 indicates a good balance between type II and type III alignments. A value of 0 or 1 means a strong bias towards one alignment type, which suggests the circRNA might be a false positive.
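To make the interpretation concrete, here is a minimal Python sketch. It assumes the score is the fraction of type II alignments among all type II and type III alignments, which is one reading consistent with the 0, 0.5 and 1 values described above; it is not taken from the Ularcirc source. The function names and the `margin` cutoff are hypothetical and chosen only for illustration.

```python
def rad_score(type_ii, type_iii):
    """Fraction of type II alignments among type II + type III alignments.

    Assumed formula; returns None when there are no alignments to compare.
    """
    total = type_ii + type_iii
    if total == 0:
        return None
    return type_ii / total


def rad_flag(score, margin=0.1):
    """Label a score as 'balanced' or 'biased'.

    `margin` is an arbitrary illustrative cutoff: scores within `margin`
    of 0 or 1 are treated as strongly biased towards one alignment type.
    """
    if score is None:
        return "no data"
    if score <= margin or score >= 1 - margin:
        return "biased"
    return "balanced"


for counts in [(5, 5), (0, 8), (9, 0)]:
    score = rad_score(*counts)
    print(counts, score, rad_flag(score))
```

A junction supported by equal numbers of each alignment type scores 0.5, while one supported by only a single type scores 0 or 1 and would be worth a closer look.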

The forward splice junction (FSJ) score identifies the presence of reads that support canonical splice junctions using the splice donor/acceptor of the BSJ coordinates. So if both the splice donor and acceptor of a BSJ are used in the parental transcript, a score of 2 is given. If only one is used, a score of 1 is given. This score is most useful for BSJs that don't align with current gene models.
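The scoring rule above can be sketched in Python as follows. This is only an illustration of the 0/1/2 logic, not how Ularcirc implements it: the tuple layout `(chrom, position, position)` and the simple coordinate matching are assumptions, and real data would also need strand and coordinate-convention handling.

```python
def fsj_score(bsj, forward_junctions):
    """Count how many BSJ splice sites (donor, acceptor) are also used by
    forward splice junctions on the same chromosome.

    bsj: (chrom, donor_pos, acceptor_pos)       -- assumed layout
    forward_junctions: iterable of (chrom, start, end) -- assumed layout
    Returns 0, 1 or 2.
    """
    chrom, donor, acceptor = bsj
    used_positions = set()
    for f_chrom, f_start, f_end in forward_junctions:
        if f_chrom == chrom:
            used_positions.add(f_start)
            used_positions.add(f_end)
    return int(donor in used_positions) + int(acceptor in used_positions)


bsj = ("chr1", 100, 500)
both = [("chr1", 50, 100), ("chr1", 500, 800)]
print(fsj_score(bsj, both))      # both sites seen in forward junctions
print(fsj_score(bsj, both[:1]))  # only one site seen
print(fsj_score(bsj, []))        # no supporting forward junctions
```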

Note that for some data sets (e.g. RNase R-treated) you would expect an FSJ score of 0.

Regards,
D
