Proposed Analysis: Molecularly subtype ATRT tumors #244
Comments
I'm adding what I think the table summarizing the results would contain here. From a cursory look, there are 30 samples that are classified as ATRT in the histologies file. That is a large enough sample size for what I'll suggest below. I agree that one of the first analyses would be unsupervised clustering or dimension reduction.
Tabular format
The goal of the table is to summarize all of the information above in a manner that would allow someone with domain expertise to quickly make relatively easy calls and to identify cases where more information is needed. So it should contain everything mentioned above.
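As a starting point for the table, the ATRT samples first need to be pulled out of the histologies file. Here is a minimal sketch of that step using only the Python standard library. The toy data, the column names (`sample_id`, `histology`) and the helper `select_samples` are assumptions for illustration and are not the real file's schema.

```python
import csv
import io

# Toy stand-in for the histologies file. The column names and values are
# assumptions for illustration only, not the schema of the real file.
TOY_HISTOLOGIES = """sample_id\thistology\tprimary_site
S1\tATRT\tCerebellum
S2\tMedulloblastoma\tCerebellum
S3\tATRT\tFrontal Lobe
S4\tATRT\tSpinal Cord
"""


def select_samples(tsv_text, histology, column="histology"):
    """Return the rows (as dicts) whose `column` value equals `histology`."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [row for row in reader if row[column] == histology]


atrt_rows = select_samples(TOY_HISTOLOGIES, "ATRT")
atrt_ids = [row["sample_id"] for row in atrt_rows]
print(len(atrt_ids), atrt_ids)
```

With the real file, the same function could be pointed at the file contents and whichever column holds the ATRT label, and the resulting IDs would define the rows of the summary table.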
Notes on columns
I am going to begin the work on this analysis by implementing the suggestions above.
To be more specific, my plan is as follows:
@jaclyn-taroni do you and @cbethell want to see if the results from gistic for CNVkit:
@jharenza yes, I believe the gistic results would be good/useful for this analysis so I would like to see them in the next data release, if possible.
### release-v12-20191217
- release date: 2019-12-17
- status: available
- changes:
  - Add `data-file-descriptions.md` with data release to better track file types, origins, and workflows per [#334](#334) and [#336](#336)
  - Add stranded RNA-Seq for 23 PNOC samples and 21 CBTTC samples previously sequenced using a polyA library prep. Files updated:
    - pbta-fusion-arriba.tsv.gz
    - pbta-fusion-starfusion.tsv.gz
    - pbta-gene-expression-rsem-tpm.stranded.rds
    - pbta-gene-expression-rsem-fpkm.stranded.rds
    - pbta-isoform-expression-rsem-tpm.stranded.rds
    - pbta-isoform-counts-rsem-expected_count.stranded.rds
    - pbta-gene-counts-rsem-expected_count.stranded.rds
    - pbta-gene-expression-kallisto.stranded.rds
    - pbta-gene-expression-rsem-fpkm-collapsed.stranded.rds
  - Add recurrently-fused genes by histology and matrix of recurrently-fused genes by biospecimen from [fusion filtering and prioritization analysis](https://github.com/AlexsLemonade/OpenPBTA-analysis/tree/master/analyses/fusion_filtering)
  - Update consensus TMB files and MAF [#333](#333)
  - Add RNA-Seq [collapsed matrices](#287) - wrong files (tables of transcripts removed) were included with [V10](#273)
  - Rename `WGS.hg38.mutect2.unpadded.bed` to `WGS.hg38.mutect2.vardict.unpadded.bed` to better reflect usage
  - Update `pbta-histologies.tsv` to add new RNA-Seq samples listed above, [#222 harmonize disease separators](#222), and reran [medulloblastoma classifier](https://github.com/d3b-center/medullo-classifier-package) using V12 RSEM fpkm collapsed files
    - BS_2Z1MKS84, BS_5VQP0E6K re-classified from Group4 to WNT and BS_3BDAG9YN, BS_8T7DZV2F, and BS_JTMXAMB7 re-classified from Group3 to WNT
  - Add CNVkit GISTIC results focal CN analyses, eg: [#244](#244) and [#8](#8)
* Release V12 data
* Update release-notes.md fix link
* Update data-files-description.md fix GISTIC table sectioning
* Update data-files-description.md fix spacing on data description table
* Update data-files-description.md fix more spacing in data file description file
* Update download-data.sh add new release date to download script
* Update the TMB file descriptions
* Update TMB file formats section
* Update fusion section of data formats. Also more specific description of the by sample file
* Add GISTIC file to data-formats
* Update download-data.sh
* Update download-data.sh
* data description md is also included in md5sum
* TMB exon -> coding sequence
* Coding TMB CDS, not exon
My understanding of what is left on this ticket:
@jaclyn-taroni was just looking at this today as well. Once the table is done, we have a clinician (probably via email) who can check out the data. It may be nice to point him to a notebook of the table with the final subtypes and columns of criteria used.
@cbethell - started going through these results a bit. Can I request that you also add a column for
@cbethell sorry for the separate comment - would you be able to also include expression of HES5, HES6, DLL1, and DLL3 and GSEA for Looks like the other genes mentioned are all included. Thanks!
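For anyone picking this up, here is a rough sketch of how per-gene expression columns for the requested genes could be computed for the table: log2-transform the values and z-score each gene across the ATRT samples. The TPM values below are made up, and the helper names (`zscore`, `gene_zscores`) and the choice of log2(TPM + 1) followed by a z-score are assumptions, not necessarily what the analysis uses.

```python
import math
from statistics import mean, pstdev

# Genes requested in the comment above.
GENES_OF_INTEREST = ["HES5", "HES6", "DLL1", "DLL3"]

# Hypothetical TPM values for four samples; not real data.
toy_tpm = {
    "HES5": [1.0, 3.0, 7.0, 15.0],
    "HES6": [0.0, 0.0, 0.0, 0.0],
    "DLL1": [2.0, 2.0, 8.0, 8.0],
    "DLL3": [4.0, 1.0, 1.0, 0.5],
}


def zscore(values):
    """Z-score a list of numbers; a constant gene gets all zeros."""
    mu = mean(values)
    sd = pstdev(values)
    if sd == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sd for v in values]


def gene_zscores(tpm_by_gene, genes):
    """Return {gene: z-scores of log2(TPM + 1) across samples}."""
    result = {}
    for gene in genes:
        logged = [math.log2(v + 1.0) for v in tpm_by_gene[gene]]
        result[gene] = zscore(logged)
    return result


z = gene_zscores(toy_tpm, GENES_OF_INTEREST)
for gene in GENES_OF_INTEREST:
    print(gene, [round(v, 2) for v in z[gene]])
```

Z-scoring within the ATRT samples only (rather than across the whole cohort) keeps the columns comparable between candidate subgroups, but that is a design choice worth confirming.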
I'm going to pick this up right now @jharenza.
Thanks @jaclyn-taroni !
Here is a minimal gene list for expression/GSEA for each subgroup:
Even using the minimal set of genes, the subtyping is not clear-cut based on these genes' expression, and after discussing with @jaclyn-taroni, I think it would be a better approach to develop a classifier for ATRT subtyping, similar to what was done for MB here. Since this may not make it into the first submission of the paper, I will close this for now.
Scientific goals
What are the scientific goals of the analysis?
Subtype ATRTs into SHH, MYC, TYR.
Proposed methods
What methods do you plan to use to accomplish the scientific goals?
Suggestions:
Summarize results in a table, which can be added to a notebook, with molecular subtype designated.
ATRT TYR
ATRT SHH
ATRT MYC
May be able to determine brain regions using the `primary_site` from `pbta-histologies.tsv` - Ref:
Required input data
What input data will you use for this analysis?
Proposed timeline
What is the timeline for the analysis?
1 week
Relevant literature
If there is relevant scientific literature, put links to those items here.
Atypical Teratoid/Rhabdoid Tumors Are Comprised of Three Epigenetic Subgroups with Distinct Enhancer Landscapes. Specifically, Table S3 has a nice summary of genotyping the SMARCB1 locus.