Detection of possible SUMOylation sites that emerge through mutations in different cancers, where the mutations are mapped onto the protein sequence and analyzed

sonurdogan/tlmsa

tlmsa

Detection of SUMOylation sites that emerge through mutations in cancer. The pipeline uses SUMOnet to predict possible SUMOylation sites.

Running the Pipeline

The pipeline consists of three main parts:

  • Retrieve mutation data from the GDC database and filter it per patient: keep the genes that carry a mutation resulting in a lysine (K), then collect all mutations of those genes for that patient, since a mutation near the mutated K may affect SUMOylation (R code).
  • Get the wild-type sequence and map the mutations onto it to obtain the mutated sequence of each protein for each patient.
  • Construct subsequences of length 21, with the mutated K in the middle, as input for SUMOnet.
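Parts 2 and 3 can be illustrated with a minimal Python sketch. This is not the tlmsa package's code: the function names `apply_missense` and `lysine_windows`, the `(position, ref, alt)` tuple format, and the toy sequence are hypothetical, and only single-residue substitutions are handled. Padding windows near the sequence ends with `X` is also an assumption; the README does not say how such cases are treated.

```python
from typing import Dict, List, Tuple

WINDOW = 21          # subsequence length stated in the README
FLANK = WINDOW // 2  # 10 residues on each side of the central K

# Hypothetical representation: (1-based position, reference residue, new residue)
Substitution = Tuple[int, str, str]


def apply_missense(wild_type: str, mutations: List[Substitution]) -> str:
    """Return the sequence obtained by applying each substitution to wild_type."""
    residues = list(wild_type)
    for pos, ref, alt in mutations:
        if not 1 <= pos <= len(residues):
            raise ValueError(f"position {pos} is outside the sequence")
        if residues[pos - 1] != ref:
            raise ValueError(f"expected {ref} at {pos}, found {residues[pos - 1]}")
        residues[pos - 1] = alt
    return "".join(residues)


def lysine_windows(mutated: str, mutations: List[Substitution],
                   pad: str = "X") -> Dict[int, str]:
    """Build one 21-residue window per substitution that introduces a K."""
    # Assumption: pad both ends so that a K near a terminus still sits in the middle.
    padded = pad * FLANK + mutated + pad * FLANK
    windows = {}
    for pos, _ref, alt in mutations:
        if alt == "K":
            # Residue `pos` sits at index pos - 1 + FLANK in the padded string,
            # so the window centred on it starts at index pos - 1.
            windows[pos] = padded[pos - 1 : pos - 1 + WINDOW]
    return windows


# Toy example with a made-up sequence (not a real protein).
wild_type = "ACDEFGHIRLMNPQRSTVWYACDEFGHI"
mutations = [(9, "R", "K"), (12, "N", "D")]

mutated = apply_missense(wild_type, mutations)
windows = lysine_windows(mutated, mutations)
print(windows)
```

Only the R9K substitution yields a window here. The neighbouring N12D change does not create a K, but it does appear inside the window, which is why all mutations of the gene are mapped before the subsequence is cut.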

The whole pipeline can be run with a bash script that takes a path and a project name as input.

./bash_script/tlmsa.sh

Part 1 is done by retrieveData.R, with the TCGA project name defined in the code. Once the data is retrieved, Parts 2 and 3 are carried out by the tlmsa Python package.

Parts 2 and 3 can also be run on data other than TCGA. Detailed instructions can be found in tutorial.py.
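For data that does not come from TCGA, the selection rule of Part 1 may need to be applied before Parts 2 and 3. The sketch below shows that rule in Python under stated assumptions: the `Mutation` record, its field names, and `select_lysine_gene_mutations` are hypothetical and are not taken from retrieveData.R or the tlmsa package, and the patient and gene labels are placeholders.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Tuple


class Mutation(NamedTuple):
    """Hypothetical record for one protein-level substitution."""
    patient: str
    gene: str
    position: int  # 1-based protein position
    ref: str       # wild-type residue
    alt: str       # mutated residue


def select_lysine_gene_mutations(
    mutations: List[Mutation],
) -> Dict[Tuple[str, str], List[Mutation]]:
    """Keep all mutations of every (patient, gene) pair that gains at least one K."""
    grouped = defaultdict(list)
    for m in mutations:
        grouped[(m.patient, m.gene)].append(m)
    # A pair qualifies if any of its mutations changes a non-K residue into K.
    return {
        key: muts
        for key, muts in grouped.items()
        if any(m.alt == "K" and m.ref != "K" for m in muts)
    }


# Placeholder records, not real data.
records = [
    Mutation("P1", "GENE_A", 45, "R", "K"),
    Mutation("P1", "GENE_A", 50, "S", "F"),
    Mutation("P1", "GENE_B", 10, "A", "T"),
    Mutation("P2", "GENE_A", 50, "S", "F"),
]

selected = select_lysine_gene_mutations(records)
print(selected)
```

Grouping by patient and gene before filtering keeps the S50F change for patient P1 alongside R45K, while the same S50F change in patient P2 is dropped because that patient has no lysine-gaining mutation in the gene.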

Pipeline workflow

[Figure: pipeline workflow]
