ESA-UbiSite

ESA-UbiSite: accurate prediction of human ubiquitination sites by identifying a set of effective negatives. Ubiquitination, part of the conserved ubiquitin-proteasome system, is a post-translational modification (PTM) involved in numerous biological processes, such as protein degradation, endocytosis, and the cell cycle. Numerous ubiquitination sites remain undiscovered because of the limitations of mass spectrometry-based methods. Because some sites that do undergo ubiquitination have not yet been identified, machine learning-based prediction methods lack a reliable database of non-ubiquitination sites. Existing prediction methods use randomly selected non-validated sites as non-ubiquitination sites to train ubiquitination site prediction models. In this work, we propose an evolutionary screening algorithm (ESA) to select effective negatives from among non-validated sites, and an ESA-based prediction method, ESA-UbiSite, to identify human ubiquitination sites. The ESA selects the non-validated sites that are least likely to be ubiquitination sites as training negatives. Experimental results show that ESA-UbiSite with effective negatives achieved a test accuracy of 0.92, higher than that of existing prediction methods.
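The selection criterion described above (keep the non-validated sites least likely to be ubiquitination sites) can be illustrated with a minimal Python sketch. This is not the ESA itself, whose evolutionary search is described in the paper; it only shows the ranking idea. The function name `select_negatives`, the site labels, and the scores are all hypothetical.

```python
from typing import Dict, List


def select_negatives(scores: Dict[str, float], n_negatives: int) -> List[str]:
    """Return the n non-validated sites with the lowest predicted score.

    `scores` maps a site label to a predicted likelihood of being a
    ubiquitination site (hypothetical values, higher = more likely).
    Ties are broken by site label so the result is deterministic.
    """
    ranked = sorted(scores, key=lambda site: (scores[site], site))
    return ranked[:n_negatives]


# Hypothetical scores for four non-validated lysine sites.
scores = {"P1:K12": 0.81, "P1:K40": 0.07, "P2:K5": 0.33, "P2:K19": 0.02}
print(select_negatives(scores, 2))
```

Sites with high scores are left out of the negative set because they may be true ubiquitination sites that simply have not been validated yet.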

Input Data

FASTA format (e.g., example.fa)
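Before running the tool, it can help to check that the input parses as FASTA and to see which residues are candidate sites. The sketch below is an illustration only and does not reproduce how ESA-UbiSite reads its input. It assumes that the record identifier is the first whitespace-delimited token after `>` and that candidate sites are lysine (K) residues, which is where ubiquitin is typically attached. The accession `P12345` and the sequence are made up.

```python
from typing import Dict, Iterable, List, Optional


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA lines into {identifier: sequence}."""
    records: Dict[str, List[str]] = {}
    current: Optional[str] = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            fields = line[1:].split()
            if not fields:
                raise ValueError("FASTA header line has no identifier")
            current = fields[0]
            records[current] = []
        elif current is None:
            raise ValueError("sequence data found before the first '>' header")
        else:
            records[current].append(line.upper())
    # Join the wrapped sequence lines of each record into one string.
    return {name: "".join(parts) for name, parts in records.items()}


def lysine_positions(sequence: str) -> List[int]:
    """Return the 1-based positions of lysine (K) residues."""
    return [i for i, aa in enumerate(sequence, start=1) if aa == "K"]


# Made-up record, wrapped over two sequence lines.
example = """>P12345 hypothetical protein
MKTAYIAKQR
QISFVKSHFS
"""
sequences = read_fasta(example.splitlines())
for name, seq in sequences.items():
    print(name, lysine_positions(seq))
```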

Getting started

git clone https://github.com/NctuICLab/ESA-UbiSite.git
cd ESA-UbiSite

Build AAIndex

cd src/aaindex
make

Build LIBSVM (from the repository root)

cd src/libsvm_320
make

Example of running ESA-UbiSite

Create a new folder for the analysis output, then run the prediction:

mkdir example_output
perl ESAUbiSite_main.pl example.fa example_output
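To run several FASTA files in a row, the two commands above can be wrapped in a small script. The following Python sketch assumes it is started from the repository root, that `perl` is on the `PATH`, and that the script takes exactly the two arguments shown above (input FASTA, output folder). The helper names `build_command` and `run_prediction` are hypothetical.

```python
import subprocess
from pathlib import Path
from typing import List


def build_command(fasta: str, out_dir: str,
                  script: str = "ESAUbiSite_main.pl") -> List[str]:
    """Build the argument list for one run, mirroring the command above."""
    return ["perl", script, fasta, out_dir]


def run_prediction(fasta: str, out_dir: str) -> int:
    """Create the output folder if needed, run the tool, return its exit code."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    completed = subprocess.run(build_command(fasta, out_dir), check=False)
    return completed.returncode


# Show the command for the README example without executing it.
print(" ".join(build_command("example.fa", "example_output")))
```

Calling `run_prediction("example.fa", "example_output")` would then be the scripted equivalent of the `mkdir` and `perl` lines.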

Result of ESA-UbiSite

Results for example.fa are written to example_output/ESAUbiSite_prediction.html

Dataset

Each dataset record has three fields:

  • 1st: protein accession number
  • 2nd: residue
  • 3rd: position of the residue in the protein sequence
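A record in this three-field layout can be read with a few lines of Python. The sketch below assumes the fields are separated by whitespace (tabs or spaces) and that positions are 1-based. Neither is stated above, so check both against the actual dataset files. The `Site` type, the helper names, and the sample record are hypothetical.

```python
from typing import NamedTuple


class Site(NamedTuple):
    accession: str  # 1st field: protein accession number
    residue: str    # 2nd field: residue
    position: int   # 3rd field: position (assumed 1-based)


def parse_site(line: str) -> Site:
    """Parse one dataset record, assuming whitespace-separated fields."""
    fields = line.split()
    if len(fields) < 3:
        raise ValueError(f"expected 3 fields, got {len(fields)}: {line!r}")
    accession, residue, position = fields[:3]
    return Site(accession, residue, int(position))


def check_site(site: Site, sequence: str) -> bool:
    """Return True if the sequence has the stated residue at that position."""
    in_range = 1 <= site.position <= len(sequence)
    return in_range and sequence[site.position - 1] == site.residue


# Made-up record and sequence for illustration.
site = parse_site("P12345\tK\t8")
print(site, check_site(site, "MKTAYIAKQR"))
```

Checking each record against its protein sequence is a cheap way to catch off-by-one position errors before training or evaluation.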

Precalculating Dataset

Citing ESA-UbiSite

ESA-UbiSite: accurate prediction of human ubiquitination sites by identifying a set of effective negatives. Bioinformatics, 2017 [PMID:28062441]

Contact