This project compiles a list of recurrently mutated, clinically relevant genes in which variants may be missed by automated variant callers because of low sequence coverage and/or low variant allele fraction (VAF). Such variants should be manually reviewed when supporting sequence alignment data exist.
Our initial target was a short list of pediatric cancer significantly mutated genes (SMGs) from the following sources:
In addition, medically actionable genes from the following sources should be considered:
- CIViC
- Others?
- CIViC (notes): 5 clinically relevant pediatric genes with generalized mutations / deletions / loss.
- Gröbner and Worst et al. (notes): 77 pediatric cancer-relevant SMGs (23 are also adult SMGs).
- PeCAN (notes): Summarized SMGs / hotspots are not available or reproducible from the provided downloads. However, several published studies from major datasets used by PeCAN are listed on their site:
  - Gröbner and Worst et al. (already evaluated)
  - Ma et al.
  - Rusch et al.

  As a first pass, then, we can collect the gene lists reported for these datasets at the time of publication in lieu of recomputing and calling our own.
- Ma et al. (notes): 138 pediatric cancer SMGs after some added filtering.
- Rusch et al. (notes): No SMG / hotspot analysis reported.
After initial evaluation, this leaves us with three sources of genes: CIViC, Gröbner and Worst et al., and Ma et al. Restricting the SMGs to those that appear in both landscape studies and then merging the CIViC genes into that list gives us an initial list of 40 genes. Of these, 20 are not reported as adult SMGs in the Kandoth et al. evaluation of TCGA adult cancers (notes).
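
The refinement described above comes down to a few set operations. The following is a minimal sketch of that logic, not the analysis code used in this project: the function name `build_review_list` and the `GENE_*` symbols are hypothetical placeholders, and the real gene lists come from the sources evaluated above.

```python
def build_review_list(landscape_a, landscape_b, actionable, adult_smgs):
    """Combine gene lists into a manual-review candidate list.

    landscape_a, landscape_b: SMGs from the two landscape studies
    actionable: clinically relevant genes (e.g. from CIViC)
    adult_smgs: SMGs reported for adult cancers
    """
    # Keep only SMGs reported by both landscape studies.
    shared = set(landscape_a) & set(landscape_b)
    # Merge in the clinically actionable genes.
    candidates = shared | set(actionable)
    # Subset of candidates not reported as adult SMGs.
    not_adult = candidates - set(adult_smgs)
    return sorted(candidates), sorted(not_adult)


# Toy placeholder symbols (assumption: not real gene names or results).
study_one = {"GENE_A", "GENE_B", "GENE_C"}
study_two = {"GENE_B", "GENE_C", "GENE_D"}
civic = {"GENE_E"}
adult = {"GENE_C"}

candidates, not_adult = build_review_list(study_one, study_two, civic, adult)
print(candidates)  # ['GENE_B', 'GENE_C', 'GENE_E']
print(not_adult)   # ['GENE_B', 'GENE_E']
```

Sorting the output keeps the resulting list stable between runs, which makes it easier to diff when a source list is updated.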
To aid in reproducibility, all data generated are stored in the data/ directory. Analyses (notes) are individually linked in this document and can also be found in the analyses/ directory.