Analysis Modules

This directory contains the analysis modules of the OpenPedCan project. See the README of an individual analysis module for more information about that module.

Modules at a glance

The table below is intended to help project organizers quickly get an idea of what files (and therefore types of data) are consumed by each analysis module, what the module does, and what output files it produces that can be consumed by other analysis modules. This is in service of documenting interdependent analyses. Note that nearly all modules use the harmonized clinical data file (histologies.tsv) even when it is not explicitly included in the table below.
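As a rough illustration of how the table documents interdependent analyses, the sketch below derives module-to-module dependencies from a few rows and computes one valid run order. It is a minimal sketch, not project tooling: the `MODULES` dictionary is a hand-copied excerpt of the table with paths reduced to file basenames, and the names `MODULES`, `build_dependencies`, `DEPENDENCIES`, and `RUN_ORDER` are hypothetical. It assumes Python 3.9 or later for `graphlib`.

```python
from graphlib import TopologicalSorter

# Hypothetical excerpt of the table below: module -> input and output files.
# Paths are reduced to basenames so inputs and outputs can be matched by name.
MODULES = {
    "chromosomal-instability": {
        "inputs": {"histologies.tsv", "sv-manta.tsv.gz", "cnv-cnvkit.seg.gz"},
        "outputs": {"union_of_breaks_densities.tsv"},
    },
    "fusion-summary": {
        "inputs": {"histologies.tsv", "fusion-putative-oncogenic.tsv",
                   "fusion-arriba.tsv.gz", "fusion-starfusion.tsv.gz"},
        "outputs": {"fusion_summary_embryonal_foi.tsv",
                    "fusion_summary_ependymoma_foi.tsv",
                    "fusion_summary_ewings_foi.tsv"},
    },
    "gene-set-enrichment-analysis": {
        "inputs": {"gene-expression-rsem-tpm-collapsed.rds", "histologies.tsv"},
        "outputs": {"gsva_scores.tsv"},
    },
    "molecular-subtyping-EPN": {
        "inputs": {"histologies-base.tsv",
                   "gene-expression-rsem-tpm-collapsed.rds",
                   "union_of_breaks_densities.tsv",
                   "fusion_summary_ependymoma_foi.tsv",
                   "gsva_scores.tsv"},
        "outputs": {"EPN_all_data_withsubgroup.tsv"},
    },
    "molecular-subtyping-EWS": {
        "inputs": {"histologies-base.tsv", "fusion_summary_ewings_foi.tsv"},
        "outputs": {"EWS_samples.tsv"},
    },
}


def build_dependencies(modules):
    """Map each module to the set of modules that produce one of its inputs."""
    # Index every output file by the module that produces it.
    producer = {}
    for name, spec in modules.items():
        for filename in spec["outputs"]:
            producer[filename] = name
    # A module depends on the producer of each input that another module produces.
    return {
        name: {
            producer[filename]
            for filename in spec["inputs"]
            if filename in producer and producer[filename] != name
        }
        for name, spec in modules.items()
    }


DEPENDENCIES = build_dependencies(MODULES)
# static_order() yields every module after all of the modules it depends on.
RUN_ORDER = list(TopologicalSorter(DEPENDENCIES).static_order())

if __name__ == "__main__":
    for module in RUN_ORDER:
        upstream = ", ".join(sorted(DEPENDENCIES[module])) or "none"
        print(f"{module}: depends on {upstream}")
```

Input files that no listed module produces, such as the data release files, are treated as external and create no dependency edge. Extending the dictionary to more rows of the table would surface longer dependency chains in the same way.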

Module | Input Files | Brief Description | Produces files for data release? | Output Files Consumed by Other Analyses | Adapted for OPC? | Run Platform | Action Plan
chromosomal-instability histologies.tsv sv-manta.tsv.gz cnv-cnvkit.seg.gz Evaluates chromosomal instability by calculating chromosomal breakpoint densities and by creating circular plot visuals No breakpoint-data/union_of_breaks_densities.tsv No N/A Will Adapt for OT
chromothripsis sv-manta.tsv.gz cnv-consensus.seg.gz independent-specimens.wgs.primary-plus.tsv chromothripsis analysis per #1007 No N/A No N/A N/A
cnv-chrom-plot cnv-consensus-gistic.zip cnv-consensus.seg Plots genome wide visualizations relating to copy number results No N/A No N/A N/A
cnv-frequencies (DEPRECATED) histologies.tsv consensus_wgs_plus_cnvkit_wxs.tsv.gz independent-specimens.wgswxspanel.primary.eachcohort.tsv independent-specimens.wgswxspanel.relapse.eachcohort.tsv independent-specimens.wgswxspanel.primary.tsv independent-specimens.wgswxspanel.relapse.tsv Annotate CNV table with mutation frequencies No results/gene-level-cnv-consensus-annotated-mut-freq.jsonl.gz results/gene-level-cnv-consensus-annotated-mut-freq.tsv.gz Yes GitHub N/A
collapse-rnaseq (DEPRECATED) gene-expression-rsem-tpm.rds gencode.v39.primary_assembly.annotation.gtf.gz Collapses RSEM FPKM matrices such that gene symbols are de-duplicated. Yes results/gene-expression-rsem-fpkm-collapsed.rds included in data download; too large for tracking via GitHub Yes CAVATICA N/A
comparative-RNASeq-analysis (DEPRECATED) gene-expression-rsem-tpm.rds histologies.tsv mend-qc-manifest.tsv mend-qc-results.tar.gz In progress; will produce expression outlier profiles per #229 No N/A No N/A N/A
compare-gistic (DEPRECATED) cnv-consensus-gistic.zip analyses/run-gistic/results/cnv-consensus-hgat-gistic.zip analyses/run-gistic/results/cnv-consensus-lgat-gistic.zip analyses/run-gistic/results/cnv-consensus-medulloblastoma-gistic.zip Comparison of the GISTIC results of the entire cohort with the GISTIC results of three individual histologies, namely LGAT, HGAT, and medulloblastoma #547 No N/A No N/A N/A
copy_number_consensus_call cnv-cnvkit.seg.gz cnv-controlfreec.tsv.gz sv-manta.tsv.gz Produces consensus copy number calls per #128 and a set of excluded regions where CNV calls are not made Yes results/cnv_consensus.tsv 'results/uncalled_samples.tsv' results/cnv-consensus.seg.gz included in data download ref/cnv_excluded_regions.bed ref/cnv_callable.bed Yes CAVATICA N/A
create-subset-files All files This module contains the code to create the subset files used in continuous integration No All subset files for continuous integration No N/A Will set up for OT ticket in
data-pre-release-qc histologies-base.tsv gene-counts-rsem-expected_count-collapsed.rds gene-expression-rsem-tpm-collapsed.rds tcga-gene-counts-rsem-expected_count-collapsed.rds tcga-gene-expression-rsem-tpm-collapsed.rds cnv-cnvkit.seg.gz cnvkit_with_status.tsv consensus_wgs_plus_cnvkit_wxs_autosomes.tsv.gz consensus_wgs_plus_cnvkit_wxs_x_and_y.tsv.gz snv-mutation-tmb-all.tsv fusion_summary_embryonal_foi.tsv fusion_summary_ependymoma_foi.tsv fusion_summary_lgg_hgg_foi.tsv fusion_summary_ewings_foi.tsv biospecimen_id_to_bed_map.txt Performs QC on pre-release data files against requirements that should pass before handoff from the BIXU Engineering team to the OpenPedCan team Yes 'data-pre-release-qc.nb.html' No N/A N/A
efo-mondo-mapping (DEPRECATED) histologies.tsv efo-mondo-map.tsv This module contains a file with EFO, MONDO, and NCIT codes for every cancer_group found in histologies.tsv and runs a QC script to check that no cancer_group is missed Yes efo-mondo-mapping.tsv Yes N/A Yes
filter-mtp-tables (DEPRECATED) gencode.v39.primary_assembly.annotation.gtf.gz PMTL_v1.1.tsv histologies.tsv gene-level-snv-consensus-annotated-mut-freq.tsv.gz snv-consensus-plus-hotspots.maf.tsv.gz variant-level-snv-consensus-annotated-mut-freq.tsv.gz gene-level-cnv-consensus-annotated-mut-freq.tsv.gz consensus_wgs_plus_cnvkit_wxs.tsv.gz putative-oncogene-fusion-freq.tsv.gz fusion-putative-oncogenic.tsv putative-oncogene-fused-gene-freq.tsv.gz long_n_tpm_mean_sd_quantile_gene_wise_zscore.tsv.gz long_n_tpm_mean_sd_quantile_group_wise_zscore.tsv.gz Removes Ensembl (ENSG) gene identifiers that are not in GENCODE v39 and Ensembl package 104 from the OpenPedCan mutation frequency tables, including the SNV, CNV, fusion, and TPM expression tables. No All files from module results directory Yes N/A Yes
focal-cn-file-preparation cnv-cnvkit.seg.gz cnv-controlfreec.tsv.gz gene-expression-rsem-tpm-collapsed.rds cnv-consensus.seg.gz Maps from copy number variant caller segments to gene identifiers; will be updated to take into account changes that affect entire cytobands, chromosome arms #186 Yes cnvkit_annotated_cn_wxs_autosomes.tsv.gz cnvkit_annotated_cn_wxs_x_and_y.tsv.gz consensus_seg_annotated_cn_autosomes.tsv.gz consensus_seg_annotated_cn_x_and_y.tsv.gz consensus_seg_most_focal_fn_status.tsv.gz consensus_seg_recurrent_focal_cn_units.tsv consensus_seg_with_ucsc_cytoband_status.tsv.gz consensus_wgs_plus_cnvkit_wxs_autosomes.tsv.gz included in data download consensus_wgs_plus_cnvkit_wxs_x_and_y.tsv.gz included in data download Yes CAVATICA N/A
fusion_filtering fusion-arriba.tsv.gz fusion-starfusion.tsv.gz independent-specimens.rnaseq.primary.tsv independent-specimens.rnaseq.relapse.tsv Standardizes, filters, and prioritizes fusion calls Yes results/fusion-putative-oncogenic.tsv included in data download results/fusion-recurrent-fusion-bycancergroup.tsv results/fusion-recurrent-fusion-bysample.tsv results/fusion-recurrently-fused-genes-bycancergroup.tsv results/fusion-recurrently-fused-genes-bysample.tsv Yes GitHub N/A
fusion-frequencies (DEPRECATED) histologies.tsv fusion-putative-oncogenic.tsv fusion-dgd.tsv.gz independent-specimens.rnaseqpanel.primary.tsv independent-specimens.rnaseqpanel.relapse.tsv independent-specimens.rnaseqpanel.primary.eachcohort.tsv independent-specimens.rnaseqpanel.relapse.eachcohort.tsv Gather counts and frequencies for fusion per cancer_group and cohort results/putative-oncogene-fused-gene-freq.jsonl.gz results/putative-oncogene-fused-gene-freq.tsv.gz results/putative-oncogene-fusion-freq.jsonl.gz results/putative-oncogene-fusion-freq.tsv.gz N/A Yes GitHub N/A
fusion-summary histologies.tsv fusion-putative-oncogenic.tsv fusion-arriba.tsv.gz fusion-starfusion.tsv.gz Generate summary tables from fusion files (#398; #623) Yes results/fusion_summary_embryonal_foi.tsv results/fusion_summary_ependymoma_foi.tsv results/fusion_summary_ewings_foi.tsv Yes GitHub N/A
gene_match (DEPRECATED) GTF file sources: gencode v28 gencode v38 open_ped_can_v7_ensg-hugo-rmtl-mapping.tsv This module reads a GTF file and formats its attributes to extract gene symbols with their Ensembl gene IDs. Yes ensg-hugo-pmtl-mapping.tsv Yes GitHub N/A
gene-set-enrichment-analysis gene-expression-rsem-tpm-collapsed.rds histologies.tsv Updated gene set enrichment analysis with appropriate RNA-seq expression data No results/gsva_scores.tsv combined file for all RNA library types Yes GitHub Move to CAVATICA
hotspots-detection (DEPRECATED) snv-strelka2.vep.maf.gz snv-mutect2.vep.maf.gz snv-vardict.vep.maf.gz snv-lancet.vep.maf.gz Scavenges any cancer hotspot calls from each caller and merges them with the consensus (3/3) calls if they were missed in the snv-caller workflow. No snv-hotspots-mutation.maf.tsv.gz No CAVATICA N/A
immune-deconv gene-expression-rsem-tpm-collapsed.rds data/histologies.tsv Immune/Stroma characterization across PBTA part of #15 No xcell_output.rds quantiseq_output.rds No N/A N/A
independent-samples histologies.tsv Generates independent specimen lists for WGS/WXS samples Yes results/independent-specimens.wgswxspanel.primary.tsv included in data download results/independent-specimens.wgswxspanel.relapse.tsv included in data download results/independent-specimens.wgswxspanel.primary.eachcohort.tsv included in data download results/independent-specimens.wgswxspanel.relapse.eachcohort.tsv included in data download results/independent-specimens.wgswxspanel.primary.prefer.wxs.tsv included in data download results/independent-specimens.wgswxspanel.relapse.prefer.wxs.tsv included in data download results/independent-specimens.wgswxspanel.primary.eachcohort.prefer.wxs.tsv included in data download results/independent-specimens.wgswxspanel.relapse.eachcohort.prefer.wxs.tsv included in data download results/independent-specimens.rnaseq.primary.tsv included in data download results/independent-specimens.rnaseq.relapse.tsv included in data download results/independent-specimens.rnaseq.primary.eachcohort.tsv included in data download results/independent-specimens.rnaseq.relapse.eachcohort.tsv included in data download Yes GitHub N/A
interaction-plots independent-specimens.wgs.primary-plus.tsv snv-consensus-mutation.maf.tsv.gz Creates interaction plots for mutation mutual exclusivity/co-occurrence #13; may be updated to include other data types e.g., fusions No N/A No N/A N/A
long-format-table-utils (DEPRECATED) ensg-hugo-rmtl-mapping.tsv analyses/fusion_filtering/references/genelistreference.txt efo-mondo-map.tsv uberon-map-gtex-group.tsv uberon-map-gtex-subgroup.tsv Functions and scripts for handling long-format tables No annotator/annotation-data/ensg-gene-full-name-refseq-protein.tsv annotator/annotation-data/oncokb-cancer-gene-list.tsv Yes GitHub N/A
methylation-preprocessing (DEPRECATED) TARGET_Normal_MethylationArray_20160812.sdrf.txt TARGET_NBL_MethylationArray_20160812.sdrf.1.txt TARGET_NBL_MethylationArray_20160812.sdrf.2.txt TARGET_CCSK_MethylationArray_20160819.sdrf.txt TARGET_OS_MethylationArray_20161103.sdrf.txt TARGET_WT_MethylationArray_20160831.sdrf.txt TARGET_AML_MethylationArray_20160812_450k.sdrf.1.txt TARGET_AML_MethylationArray_20160812_450k.sdrf.2.txt TARGET_AML_MethylationArray_20160812_27k.sdrf.1.txt TARGET_AML_MethylationArray_20160812_27k.sdrf.2.txt TARGET_AML_MethylationArray_20160812_27k.sdrf.3.txt manifest_methylation_CBTN_20220410.1.csv manifest_methylation_CBTN_20220410.2.csv manifest_methylation_CBTN_20220410.3.csv manifest_methylation_CBTN_20220410.4.csv Preprocess probe hybridization intensity values of selected methylated and unmethylated cytosine (CpG) loci into usable methylation measurements for the Pediatric Open Targets, OpenPedCan-analysis raw DNA methylation array datasets. No N/A Yes CAVATICA N/A
methylation-summary (DEPRECATED) infinium.gencode.v39.probe.annotations.tsv.gz independent-specimens.rnaseqpanel.eachcohort.tsv independent-specimens.methyl.eachcohort.tsv gene-expression-rsem-tpm-collapsed.rds rna-isoform-expression-rsem-tpm.rds methyl-beta-values.rds efo-mondo-map.tsv histologies.tsv Summarize preprocessed Illumina Infinium Human Methylation array measurements produced by the OpenPedCan methylation preprocessing module and Illumina Infinium methylation array CpG probe coordinates. No N/A No AWS N/A
molecular-subtyping-ATRT histologies-base.tsv Molecular subtyping of ATRT samples No NA GitHub N/A
molecular-subtyping-CRANIO histologies-base.tsv snv-consensus-plus-hotspots.maf.tsv.gz Molecular subtyping of craniopharyngioma samples #810 No results/CRANIO_molecular_subtype.tsv No N/A Prepare for scaling
molecular-subtyping-EPN histologies-base.tsv gene-expression-rsem-tpm-collapsed.rds analyses/chromosomal-instability/breakpoint-data/union_of_breaks_densities.tsv analyses/fusion-summary/results/fusion_summary_ependymoma_foi.tsv analyses/gene-set-enrichment-analysis/results/gsva_scores.tsv molecular subtyping of ependymoma tumors No results/EPN_all_data_withsubgroup.tsv No N/A Will Adapt for OT
molecular-subtyping-EWS histologies-base.tsv analyses/fusion-summary/results/fusion_summary_ewings_foi.tsv Reclassification of tumors based on the presence of defining fusions for Ewing Sarcoma per #623 No results/EWS_samples.tsv No N/A Will Adapt for OT
molecular-subtyping-HGG histologies-base.tsv snv-consensus-plus-hotspots.maf.tsv.gz consensus_wgs_plus_cnvkit_wxs.tsv.gz fusion-putative-oncogenic.tsv cnv-consensus-gistic.zip gene-expression-rsem-tpm-collapsed.rds tp53_altered_status.tsv Molecular subtyping of high-grade glioma samples #249 No results/HGG_molecular_subtype.tsv Yes GitHub N/A
molecular-subtyping-LGAT histologies-base.tsv snv-consensus-plus-hotspots.maf.tsv.gz fusion-putative-oncogenic.tsv analyses/fusion_filtering/results/fusion-recurrently-fused-genes-bysample.tsv Molecular subtyping of Low-grade astrocytic tumor samples #631 No results/lgat_subtyping.tsv Yes GitHub N/A
molecular-subtyping-MB histologies-base.tsv gene-expression-rsem-tpm-collapsed.rds Molecular classification of Medulloblastoma subtypes part of #116 No results/MB_molecular_subtype.tsv Yes GitHub N/A
molecular-subtyping-SHH-tp53 histologies snv-consensus-plus-hotspots.maf.tsv.gz Deprecated; Identify the SHH-classified medulloblastoma samples that have TP53 mutations #247 No N/A No N/A N/A
molecular-subtyping-chordoma analyses/focal-cn-file-preparation/results/consensus_seg_annotated_cn_autosomes.tsv.gz gene-expression-rsem-fpkm-collapsed.stranded.rds identifying poorly-differentiated chordoma samples per #250 No N/A No N/A Will Adapt for OT
molecular-subtyping-embryonal histologies-base.tsv analyses/fusion-summary/fusion_summary_embryonal_foi.tsv sv-manta.tsv.gz consensus_wgs_plus_cnvkit_wxs.tsv.gz analyses/focal-cn-file-preparation/cnvkit_annotated_cn_x_and_y.tsv.gz analyses/focal-cn-file-preparation/controlfreec_annotated_cn_x_and_y.tsv.gz gene-expression-rsem-tpm-collapsed.rds Molecular subtyping of non-medulloblastoma, non-ATRT embryonal tumors #251 No results/embryonal_tumor_molecular_subtypes.tsv No N/A Will Adapt for OT
molecular-subtyping-integrate histologies-base.tsv results/compiled_molecular_subtypes_with_clinical_pathology_feedback.tsv Add molecular subtype information to base histology No results/histologies.tsv Yes GitHub N/A
molecular-subtyping-NBL histologies-base.tsv consensus_wgs_plus_cnvkit_wxs.tsv.gz cnv-cnvkit.seg.gz cnv-controlfreec.tsv.gz gene-expression-rsem-tpm-collapsed.rds analyses/molecular-subtyping-NBL/input/gmkf_patient_clinical_mycn_status.tsv analyses/molecular-subtyping-NBL/input/target_patient_clinical_mycn_status.tsv molecular subtyping of NBL tumors #417 No results/NBL_MYCN_Subtype.tsv results/Alteration_Table.tsv results/Subtypes_Based_On_Cutoff.tsv results/QC_table.tsv Yes EC2 N/A
molecular-subtyping-neurocytoma histologies-base.tsv Molecular subtyping of Neurocytoma samples #805 No results/neurocytoma_subtyping.tsv No N/A Will Adapt for OT
molecular-subtyping-pathology analyses/molecular-subtyping-CRANIO/results/CRANIO_molecular_subtype.tsv analyses/molecular-subtyping-EPN/results/CRANIO_molecular_subtype.tsv analyses/molecular-subtyping-MB/results/MB_molecular_subtype.tsv analyses/molecular-subtyping-neurocytoma/results/neurocytoma_subtyping.tsv analyses/molecular-subtyping-EWS/results/EWS_samples.tsv analyses/molecular-subtyping-HGG/results/HGG_molecular_subtype.tsv analyses/molecular-subtyping-LGAT/results/lgat_subtyping.tsv analyses/molecular-subtyping-embryonal/results/embryonal_tumor_molecular_subtypes.tsv Compile output from other molecular subtyping modules and incorporate pathology feedback #645 No choroid_plexus_papilloma_subtypes.tsv cns-lymphoma-subtypes.tsv compiled_molecular_subtypes.tsv compiled_molecular_subtypes_and_report_info.tsv compiled_molecular_subtypes_with_clinical_feedback_and_report_info.tsv compiled_molecular_subtypes_with_clinical_pathology_feedback_and_report_info.tsv cranio_adam_subtypes.tsv glialneuronal_tumor_subtypes.tsv juvenile-xanthogranuloma-subtypes.tsv lgat-pathology-free-text-subtypes.tsv meningioma_subtypes.tsv Yes GitHub N/A
molecular-subtyping-PB histologies-base.tsv Molecular subtyping of Pineoblastoma samples PR #476 No results/pineo-molecular-subtypes.tsv Yes GitHub N/A
mtp-annotations (DEPRECATED) scratch/mtp-json/targets/ scratch/mtp-json/diseases/ This module transforms OpenTargetsPlatform Target (core annotations for targets) and Disease/Phenotype (core annotations for diseases and phenotypes) tables into mapping files utilized in filtering MTP designated tables and OPC data release files for plotting API development No N/A local N/A N/A
mtp-tables-qc-checks (DEPRECATED) gene-level-cnv-consensus-annotated-mut-freq.tsv.gz gene-level-snv-consensus-annotated-mut-freq.tsv.gz gene-variant-snv-consensus-annotated-mut-freq.tsv.gz putative-oncogene-fused-gene-freq.tsv.gz putative-oncogene-fusion-freq.tsv.gz long_n_tpm_mean_sd_quantile_gene_wise_zscore.tsv.gz long_n_tpm_mean_sd_quantile_group_wise_zscore.tsv.gz Performs summary and QC checks comparing the current and previous OPC mutation frequency tables No N/A No N/A N/A
mutational-signatures snv-consensus-plus-hotspots.maf.tsv.gz Performs COSMIC and Alexandrov et al. mutational signature analysis using the consensus SNV data No N/A No N/A N/A
mutect2-vs-strelka2 (DEPRECATED) snv-mutect2.vep.maf.gz snv-strelka2.vep.maf.gz Deprecated; comparison of only two SNV callers, subsumed by snv-callers No N/A No N/A N/A
oncoprint-landscape snv-consensus-plus-hotspots.maf.tsv.gz fusion-putative-oncogenic.tsv analyses/focal-cn-file-preparation/results/controlfreec_annotated_cn_autosomes.tsv.gz independent-specimens.* Combines mutation, copy number, and fusion data into an OncoPrint plot #6; will need to be updated as all data types are refined No N/A No N/A N/A
pedcbio-cnv-prepare consensus_wgs_plus_cnvkit_wxs_autosomes.tsv.gz consensus_wgs_plus_cnvkit_wxs_x_and_y.tsv.gz Generate annotated CNV files that are similar to seg files for PedCBio uploads to include all samples with neutral CNV calls Yes Upload to PedCBio S3 bucket for ingestion GitHub N/A N/A
pedcbio-sample-name histologies.tsv input\cbtn_cbio_sample.csv input\dgd_cbio_sample.csv input\oligo_nation_cbio_sample.csv input\x01_fy16_nbl_maris_cbio_sample.csv For some of the samples, when multiple DNA or RNA specimens are associated with the same sample, there is no column that would distinguish between different aliquots while still tying DNA and RNA together. Yes Upload to PedCBio S3 bucket for ingestion GitHub N/A N/A
pedot-table-column-display-order-name analyses/snv-frequencies/results/gene-level-snv-consensus-annotated-mut-freq.tsv analyses/snv-frequencies/results/variant-level-snv-consensus-annotated-mut-freq.tsv.gz analyses/cnv-frequencies/results/gene-level-cnv-consensus-annotated-mut-freq.tsv.gz analyses/fusion-frequencies/results/putative-oncogene-fused-gene-freq.tsv.gz analyses/fusion-frequencies/results/putative-oncogene-fusion-freq.tsv.gz analyses/rna-seq-expression-summary-stats/results/long_n_tpm_mean_sd_quantile_gene_wise_zscore.tsv.gz analyses/rna-seq-expression-summary-stats/results/long_n_tpm_mean_sd_quantile_group_wise_zscore.tsv.gz Generate and validate an Excel spreadsheet for Pediatric Open Targets PedOT website table display orders and names No Upload to FNL BOX Yes GitHub N/A
rna-seq-composition (DEPRECATED) gene-expression-rsem-tpm.rds histologies.tsv mend-qc-results.tar.gz mend-qc-manifest.tsv star-log-manifest.tsv star-log-final.tar.gz Analyzes the fraction of read types that comprise each RNA-Seq sample; flags samples with unusual composition No N/A No N/A N/A
rnaseq-batch-correct gene-counts-rsem-expected_count-collapsed.rds histologies.tsv hk_genes_normals.rds [positive_control_genes].rds RUVseq-DESeq2 batch-corrected DGE analysis Yes N/A Yes GitHub N/A
rna-seq-expression-summary-stats (DEPRECATED) gene-expression-rsem-tpm-collapsed.rds histologies.tsv Calculate TPM summary statistics within each cancer group and cohort. #51. No Upload to FNL Box Yes GitHub N/A
run-gistic histologies.tsv cnv-consensus.seg.gz Runs GISTIC 2.0 on SEG files Yes cnv-consensus-gistic.zip included in data download Yes GitHub Move to CAVATICA
sample-distribution-analysis (DEPRECATED) histologies.tsv Produces plots and tables that illustrate the distribution of different histologies in the PBTA data No N/A No N/A N/A
sex-prediction-from-RNASeq (DEPRECATED) gene-expression-kallisto.stranded.rds histologies.tsv predicts genetic sex using RNA-seq data #84 No N/A No N/A N/A
snv-frequencies (DEPRECATED) histologies.tsv snv-consensus-plus-hotspots.maf.tsv.gz snv-dgd.maf.tsv.gz independent-specimens.wgswxspanel.primary.eachcohort.prefer.wxs.tsv independent-specimens.wgswxspanel.relapse.eachcohort.prefer.wxs.tsv independent-specimens.wgswxspanel.primary.prefer.wxs.tsv independent-specimens.wgswxspanel.relapse.prefer.wxs.tsv Annotate SNV table with mutation frequencies No results/gene-level-snv-consensus-annotated-mut-freq.jsonl.gz results/gene-level-snv-consensus-annotated-mut-freq.tsv.gz variant-level-snv-consensus-annotated-mut-freq.jsonl.gz variant-level-snv-consensus-annotated-mut-freq.tsv.gz Yes GitHub N/A
survival-analysis TBD In progress; will eventually contain functions for various types of survival analysis #18 No N/A No N/A N/A
telomerase-activity-prediction gene-expression-rsem-tpm-collapsed.rds gene-counts-rsem-expected_count-collapsed.rds Quantify telomerase activity across pediatric brain tumors part of #148 No results/TelomeraseScores_PTBAPolya_counts results/TelomeraseScores_PTBAPolya_FPKM.txt results/TelomeraseScores_PTBAStranded_counts.txt results/TelomeraseScores_PTBAStranded_FPKM.txt No N/A N/A
tmb-calculation gencode.v27.primary_assembly.annotation.bed intersect_strelka_mutect2_vardict_WGS.bed snv-consensus-plus-hotspots.maf.tsv.gz biospecimen_id_to_bed_map.tsv histologies-base.tsv hg38_strelka.bed wgs_canonical_calling_regions.hg38.bed gencode.v39.primary_assembly.annotation.gtf.gz The Tumor Mutation Burden calculation is adapted from the snv-callers module of OpenPBTA-analyses, but uses the consensus SNV calls from 2 of the 4 callers (Mutect2, Strelka2, Lancet, and Vardict). Yes snv-mutation-tmb-all.tsv snv-mutation-tmb-coding.tsv Yes GitHub N/A
tmb-compare (DEPRECATED) snv-consensus-mutation-tmb-coding.tsv Compares PBTA tumor mutation burden to adult TCGA data. The D3B TMB calculations TMB_d3b_code and its comparison notebook compare-tmb-calculations.Rmd are deprecated. No N/A No N/A N/A
tp53_nf1_score snv-consensus-plus-hotspots.maf.tsv gene-expression-rsem-tpm-collapsed.rds consensus_wgs_plus_cnvkit_wxs.tsv.gz Applies TP53 inactivation, NF1 inactivation, and Ras activation classifiers to RNA-seq data #165 No TP53_NF1_snv_alteration.tsv gene-expression-rsem-tpm-collapsed_classifier_scores.tsv loss_overlap_domains_tp53.tsv poly-A_TP53.png stranded_TP53.png sv_overlap_tp53.tsv tp53_altered_status.tsv Yes GitHub N/A
transcriptomic-dimension-reduction gene-expression-rsem-tpm.rds gene-expression-kallisto.rds Dimension reduction and visualization of RNA-seq data part of #9 No N/A No N/A N/A
tcga-capture-kit-investigation (DEPRECATED) snv-lancet.vep.maf.gz snv-mutect2.vep.maf.gz snv-strelka2.vep.maf.gz tcga-snv-lancet.vep.maf.gz tcga-snv-mutect2.vep.maf.gz tcga-snv-strelka2.vep.maf.gz histologies.tsv tcga-manifest.tsv WGS.hg38.lancet.unpadded.bed WGS.hg38.strelka2.unpadded.bed WGS.hg38.mutect2.vardict.unpadded.bed Investigation of the TMB discrepancy between PBTA and TCGA data No results/*.bed No GitHub N/A
tumor-gtex-plots (DEPRECATED) gene-expression-rsem-tpm-collapsed.rds histologies.tsv In progress #38; tumor vs normal and tumor only expression plots No results/pan_cancer_plots_cancer_group_level.{tsv, jsonl.gz} results/pan_cancer_plots_cohort_cancer_group_level.{tsv, jsonl.gz} results/tumor_normal_gtex_plots_cancer_group_level.{tsv, jsonl.gz} results/tumor_normal_gtex_plots_cohort_cancer_group_level.{tsv, jsonl.gz} results/metadata.tsv plots/*.png Yes GitHub N/A
tumor-normal-differential-expression (DEPRECATED) histologies.tsv gene-counts-rsem-expected_count-collapsed.rds independent-specimens.rnaseq.primary.tsv independent-specimens.rnaseq.primary.eachcohort.tsv gene-expression-rsem-tpm-collapsed.rds ensg-hugo-pmtl-mapping.tsv efo-mondo-map.tsv uberon-map-gtex-subgroup.tsv This module takes as input histologies and the RNA-Seq expression matrices data, and performs differential expression analysis for all combinations of GTEx subgroup normal and cancer histology type tumor. No N/A