Contains tools used by cBioPortal data curators when preparing data for upload into the datahub repos.

dfci/datahub-study-curation-tools

 
 

Introduction

Data curation scripts for cBioPortal. Curated data sets can be found in Datahub.

Directory layout

.
├── archive                         # legacy tools 
├── GN-annotation-wrapper           # MAF genome-nexus annotation 
├── add-clinical-header             # attach header lines to clinical files  
├── gene-table-update               # tools for updating seedDB gene tables  
├── generate-case-lists             # generate case lists  
├── hugo-symbol-corrector           # correct hugo symbols converted to dates by Microsoft Excel
├── TMB                             # tumor mutation burden calculation
├── validation                      # validator wrapper and tools for validator 
├── oncotree-code-converter         # add `cancer_type` and `cancer_type_detailed` based on oncotree code
├── subset-data                     # subset study based on sample/patient ID list
├── zscores                         # calculate z-scores
├── LICENSE  
└── README.md  
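
As a rough illustration of the kind of calculation the `zscores` directory refers to, the sketch below standardizes one gene's values across samples as `(x - mean) / std`. It is not the code in this repository: the function name `zscores`, the use of the sample standard deviation, and the handling of missing values are all assumptions made for the example, and the actual tool may use a different reference population or options.

```python
from statistics import mean, stdev


def zscores(values):
    """Standardize one gene's values across samples (hypothetical helper).

    Missing values (None) are skipped when computing the mean and the
    sample standard deviation, and stay None in the output. If fewer than
    two values are present, or all present values are equal, no z-score
    is defined and every entry is returned as None.
    """
    present = [v for v in values if v is not None]
    if len(present) < 2:
        return [None] * len(values)

    mu = mean(present)
    sigma = stdev(present)  # sample standard deviation (n - 1 denominator)
    if sigma == 0:
        return [None] * len(values)

    return [None if v is None else (v - mu) / sigma for v in values]


if __name__ == "__main__":
    # Expression values for one gene across four samples, one missing.
    print(zscores([1.0, 2.0, None, 3.0]))
```

Returning `None` for undefined cases keeps the output aligned with the input sample order, so a caller can write the result back into the same columns.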

Languages

  • Python 68.0%
  • Shell 30.7%
  • Jinja 1.3%