Collapsed RNA-seq matrices with unique gene symbols #248
Comments
Updated with #273 |
I am reopening this issue. Version 10 includes the following files:
These files are the tables that report whether or not a gene was dropped, not the summarized matrices themselves which would have been:
See
I think it's fine to continue to include the tables above, but we need to include the summarized matrices as well. |
@jaclyn-taroni Actually, that was exactly why I had created this issue - to include the summarized matrices (not the dropped tables). Should I upload those files here: https://cavatica.sbgenomics.com/u/cavatica/pbta/files/#q?path=processed-data-merge%2FV10-data |
That was my understanding of this issue when you filed it too @komalsrathi. I am better equipped to speak to the |
Ahhhh, sorry I thought the summarized matrices were the ones we put into CAVATICA and the data download - yes, we should swap those out, but I will make a V11 folder @komalsrathi - can you add them here: https://cavatica.sbgenomics.com/u/cavatica/pbta/files/#q?path=processed-data-merge%2FV11-data |
Thinking back, I think this was a miscommunication. I asked @kgaonkar6 to grab the collapsed files but never realized there were also tables in the analysis from that PR. We will swap them out. Thanks! |
@jharenza done! |
thanks! |
@jharenza I have also added the new collapsed tables with correlations in V11 (discussed here). @jaclyn-taroni will submit a pull request with updated code tomorrow. |
@jaclyn-taroni @komalsrathi do you want the collapsed tables in the release or just the summarized matrices? Was thinking just the latter... |
I think it would make sense to make them available along with the collapsed
matrices as they are not really part of the results but more like processed
data? Let me know what you think.
On Mon, Nov 25, 2019 at 7:54 PM Jo Lynne ***@***.***> wrote:
> @jaclyn-taroni @komalsrathi do you want the collapsed tables in the release or just the summarized matrices? Was thinking just the latter...
--
*Komal S Rathi* | Bioinformatics Scientist II, DBHi, The Children's
Hospital of Philadelphia | rathik@email.chop.edu
|
Hmm, I don't know if we want to add all processed data to releases, just data people will need for downstream work. If we do release those tables, I would suggest we rename to something like "collapsed matrices-genes removed" or something. Thoughts, @jaclyn-taroni ? |
The tables have info on all genes, not just removed genes, i.e. which ones
were expressed, which were multi-mapped, and which were dropped/kept. But
yes, you guys decide if you want to keep them or not.
On Tue, Nov 26, 2019 at 7:57 AM Jo Lynne ***@***.***> wrote:
> Hmm, I don't know if we want to add *all* processed data to releases, just data people will need for downstream work. If we do release those tables, I would suggest we rename to something like "collapsed matrices-genes removed" or something. Thoughts, @jaclyn-taroni?
|
Ahh, ok. Sounds like a good thing to reference in the paper, just don't want people to use that by accident which would defeat the purpose of the collapsing. |
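I feel like there is some miscommunication happening here. There are two kinds of files we are discussing:
1. Tables of what genes were kept and removed, with no expression data, i.e. ‘pbta-gene-expression-rsem-fpkm-collapsed_table.stranded.rds’
2. Collapsed expression matrices with only one of each duplicated set of genes, i.e. ‘pbta-gene-expression-rsem-fpkm-collapsed.stranded.rds’
What was included in the data download was (1). What I think we want is (2), but we may be using different words. The reasoning here is that we want people not to have to recreate the logic of the collapsing that @komalsrathi did, or run the scripts repeatedly. Having them in the data repository also makes them “approved” in a way that is helpful to new contributors. |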
Yes I meant exactly this but was undecided if (1) should also go under
data. But if you say so then the annotation tables i.e. (1) would remain
under the analyses/collapse-rnaseq folder and (2) would go under data/
On Tue, Nov 26, 2019 at 8:30 AM jashapiro ***@***.***> wrote:
> I feel like there is some miscommunication happening here. There are two kinds of files we are discussing:
> 1. Tables of what genes were kept and removed, with no expression data. i.e. ‘pbta-gene-expression-rsem-fpkm-collapsed_table.stranded.rds’
> 2. Collapsed expression matrixes with only one of each duplicated set of genes. i.e. ‘pbta-gene-expression-rsem-fpkm-collapsed.stranded.rds’
> What was included in the data download was (1). I think we all agree we want is (2), but we are using different words.
|
closed with #293 |
File(s)
Collapsed RNA-seq matrices (merging multiple Ensembl identifiers to get unique gene symbols)
Release
v9
Link to OpenPBTA-manuscript
Put a link to the relevant section of the OpenPBTA manuscript here.
Question/issue
Reg. #198, should we make the collapsed RNA-seq matrices available in v10?
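For anyone reading this thread later, here is a minimal sketch of the difference between the two kinds of files discussed above: the annotation table of kept/dropped genes and the collapsed matrix itself. It is not the code from the analysis module. The function name, the field names, the placeholder identifiers, and the rule of keeping the identifier with the highest mean expression are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean


def collapse_by_symbol(rows):
    """Collapse rows that share a gene symbol.

    rows: list of (ensembl_id, gene_symbol, [expression values]).
    Returns (collapsed, table):
      collapsed: gene symbol -> expression values (the matrix-style output)
      table: one record per input row saying whether it was kept
    """
    by_symbol = defaultdict(list)
    for ensembl_id, symbol, values in rows:
        by_symbol[symbol].append((ensembl_id, values))

    collapsed = {}
    table = []
    for symbol, candidates in by_symbol.items():
        # Assumed rule (hypothetical): keep the identifier with the
        # highest mean expression across samples.
        kept_id, kept_values = max(candidates, key=lambda c: mean(c[1]))
        collapsed[symbol] = kept_values
        for ensembl_id, _ in candidates:
            table.append({
                "ensembl_id": ensembl_id,
                "gene_symbol": symbol,
                "multi_mapped": len(candidates) > 1,
                "keep": ensembl_id == kept_id,
            })
    return collapsed, table


# Toy input with placeholder identifiers (not real Ensembl IDs).
rows = [
    ("ENSG_A", "GENE1", [1.0, 2.0]),
    ("ENSG_B", "GENE1", [5.0, 7.0]),
    ("ENSG_C", "GENE2", [0.5, 0.5]),
]
collapsed, table = collapse_by_symbol(rows)
```

In this sketch `table` carries no expression values, only the keep/drop decision for every input row, while `collapsed` has exactly one row per gene symbol and is the object downstream analyses would read.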