To address the limitations of current tools when applied
to proteomes and to make better use of the large number of experimentally
verified phosphorylation sites, we developed Musite, a standalone application
designed specifically for large-scale prediction of both general
and kinase-specific phosphorylation sites.
Musite combines local sequence similarity patterns (KNN scores) with generic features
(disorder scores and amino acid frequencies) of phosphorylation sites and uses
a comprehensive machine learning approach to make predictions.
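
As a rough illustration of two of the feature types named above, the following minimal sketch computes a KNN score and amino acid frequencies for a sequence window centered on a candidate site. It is not Musite's implementation: the function names, the window representation, and the use of a normalized Hamming distance as the sequence distance are all assumptions made for this example, and disorder scores are omitted because they require an external predictor.

```python
from collections import Counter

# The 20 standard amino acids (one-letter codes).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def hamming_distance(a, b):
    """Fraction of mismatched positions between two equal-length windows.

    Assumption: a simple stand-in for whatever sequence distance a real
    predictor would use (for example, one based on a substitution matrix).
    """
    if len(a) != len(b):
        raise ValueError("windows must have the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def knn_score(query, positives, negatives, k):
    """Fraction of the k nearest training windows that are known sites."""
    labelled = [(hamming_distance(query, w), 1) for w in positives]
    labelled += [(hamming_distance(query, w), 0) for w in negatives]
    labelled.sort(key=lambda pair: pair[0])  # closest windows first
    nearest = labelled[:k]
    return sum(label for _, label in nearest) / len(nearest)


def amino_acid_frequencies(window):
    """Relative frequency of each standard amino acid in the window."""
    counts = Counter(window)
    return {aa: counts.get(aa, 0) / len(window) for aa in AMINO_ACIDS}
```

In a sketch like this, the KNN score and the frequency values would be concatenated into one feature vector per candidate site and passed to a classifier; the choice of k and of the window length are tuning decisions not specified here.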
Musite is the first tool that provides utility for training a phosphorylation-site
prediction model from users' own data and supports continuous adjustment of
Musite provides a user-friendly graphical user interface, which makes it easy
for biologists to perform predictions in an automated fashion.
Application of Musite to six proteomes yielded tens of thousands of putative
phosphorylation sites at high stringency. These predictions provide useful
hypotheses for experimental validation.
Cross-validation tests show that Musite significantly outperforms existing
tools for predicting general phosphorylation sites and is at least comparable
to those for predicting kinase-specific phosphorylation sites.
Moreover, as open-source software, Musite can also serve as an open
platform for building machine learning applications for phosphorylation-site