Merge pull request #101 from d3b-center/as_suggestions
combine sections that refer to the previous manuscript
jharenza committed Jun 14, 2024
2 parents 5471ce9 + 254de43 commit 2a9bea9
Showing 1 changed file with 4 additions and 11 deletions.
15 changes: 4 additions & 11 deletions content/03.Data_Description_Methods.md
@@ -3,7 +3,7 @@

<!-- A brief statement providing background and purpose for collection of these data should be presented for readers without specialist knowledge in that area. A clear, concise, description of the data, the protocol(s) for data collection, data curation and quality control, as well as potential uses should then follow. -->

-The Open Pediatric Cancer (OpenPedCan) project at the Children’s Hospital of Philadelphia is an open analysis effort in which we harmonize pediatric cancer data from multiple sources, perform downstream cancer analyses on these data, and provide them on PedcBioPortal and v2.1 of NCI's [Pediatric Molecular Targets Platform (MTP)](https://moleculartargets.ccdi.cancer.gov/).
+The Open Pediatric Cancer (OpenPedCan) project at the Children’s Hospital of Philadelphia (CHOP) is an open analysis effort in which we harmonize pediatric cancer data from multiple sources, perform downstream cancer analyses on these data, and provide them on PedcBioPortal and v2.1 of NCI's [Pediatric Molecular Targets Platform (MTP)](https://moleculartargets.ccdi.cancer.gov/).
We harmonized, aggregated, and analyzed data from multiple pediatric and adult data sources, building upon the work of the OpenPBTA (**Figure {@fig:Fig1}**).

![**OpenPedCan Data.** OpenPedCan contains multi-omic data from seven cohorts of pediatric tumors (A-B), RNA-Seq from adult tumors from The Cancer Genome Atlas (TCGA) Program (C-D) and RNA-Seq from normal adult tissues from the Genotype-Tissue Expression (GTEx) project (E). (Abbreviations: TARGET = Therapeutically Applicable Research to Generate Effective Treatments, PPTC = Pediatric Preclinical Testing Consortium, PBTA = Pediatric Brain Tumor Atlas, Maris = Neuroblastoma cell lines from the Maris Laboratory at CHOP, GMKF = Gabriella Miller Kids First, DGD = Division of Genomic Diagnostics at CHOP, CPTAC = Clinical Proteomic Tumor Analysis Consortium)](https://raw.githubusercontent.com/d3b-center/OpenPedCan-analysis/66840de10c21494445c3fbd3e3098646e7b048d5/figures/manuscript_OPC/figure1/Figure1.png?sanitize=true){#fig:Fig1 width="7in"}
@@ -150,21 +150,14 @@ Libraries were sequenced using an Illumina Nextseq 500 per manufacturer guidelin
FASTQ files were generated from raw sequencing data using Illumina BaseSpace and analyzed with the HTG EdgeSeq Parser software v5.4.0.7543 to generate an Excel file containing quantification of 2083 miRNAs per sample.
Any sample that did not pass the quality control set by the HTG REVEAL software version 2.0.1 (Tucson, AZ, USA) was excluded from the analysis.

-#### DNA WGS Alignment
-Please refer to the OpenPBTA manuscript for details [@doi:10.1016/j.xgen.2023.100340].
-
-#### Prediction of participants' genetic sex
-Please refer to the OpenPBTA manuscript for details [@doi:10.1016/j.xgen.2023.100340].
+#### DNA WGS Alignment, Quality Control, and SNP Calling
+Please refer to the OpenPBTA manuscript for details on DNA WGS alignment, prediction of participants’ genetic sex, SNP calling for B-allele Frequency (BAF) generation, and initial quality control steps [@doi:10.1016/j.xgen.2023.100340].

-#### Quality Control of Sequencing Data
-Please refer to the OpenPBTA manuscript for details [@doi:10.1016/j.xgen.2023.100340].
+#### Additional Quality Control of Sequencing Data
We also ran `somalier relate` [@doi:10.1186/s13073-020-00761-2] to identify potential mismatched samples.
We required at least 20M total reads, with at least 50% of RNA-Seq reads mapped to the human reference, for samples to be included in analysis.
We required at least 20X coverage for tumor DNA samples to be included in this analysis.
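
The inclusion thresholds above can be expressed as a simple filter. The following is a minimal Python sketch, not the project's actual pipeline code: the `SampleQC` record, its field names, and the `passes_qc` helper are hypothetical, and it assumes both RNA-Seq thresholds (20M total reads and 50% mapped) are minimums that must be met together.

```python
from dataclasses import dataclass
from typing import Optional

# Thresholds taken from the text; treating both RNA-Seq values as minimums
# that must hold jointly is an assumption of this sketch.
MIN_TOTAL_READS = 20_000_000
MIN_MAPPED_FRACTION = 0.50
MIN_TUMOR_DNA_COVERAGE = 20.0


@dataclass
class SampleQC:
    """Hypothetical per-sample QC record (field names are illustrative)."""
    sample_id: str
    assay: str  # "RNA-Seq" or "DNA"
    total_reads: Optional[int] = None
    mapped_fraction: Optional[float] = None  # fraction of reads mapped to the human reference
    mean_coverage: Optional[float] = None    # mean depth for tumor DNA samples


def passes_qc(sample: SampleQC) -> bool:
    """Return True if the sample meets the inclusion thresholds for its assay."""
    if sample.assay == "RNA-Seq":
        if sample.total_reads is None or sample.mapped_fraction is None:
            return False  # missing metrics are treated as a failure
        return (sample.total_reads >= MIN_TOTAL_READS
                and sample.mapped_fraction >= MIN_MAPPED_FRACTION)
    if sample.assay == "DNA":
        if sample.mean_coverage is None:
            return False
        return sample.mean_coverage >= MIN_TUMOR_DNA_COVERAGE
    raise ValueError(f"Unknown assay type: {sample.assay}")


samples = [
    SampleQC("S1", "RNA-Seq", total_reads=35_000_000, mapped_fraction=0.82),
    SampleQC("S2", "RNA-Seq", total_reads=12_000_000, mapped_fraction=0.90),
    SampleQC("S3", "DNA", mean_coverage=31.5),
    SampleQC("S4", "DNA", mean_coverage=14.0),
]
included = [s.sample_id for s in samples if passes_qc(s)]
print(included)
```

In this sketch, samples with missing metrics are excluded rather than passed through, which is a conservative choice and not something the text specifies.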

-##### SNP calling for B-allele Frequency (BAF) generation
-Please refer to the OpenPBTA manuscript for details [@doi:10.1016/j.xgen.2023.100340].

#### Somatic Mutation and INDEL Calling
For matched tumor/normal samples, we used the same mutation calling methods as described in the OpenPBTA manuscript [@doi:10.1016/j.xgen.2023.100340].
For tumor-only samples, we ran Mutect2 from GATK v4.2.2.0 using the following [workflow](https://github.com/kids-first/kf-tumor-workflow/tree/v0.3.0-beta).