Abstract
We present CoPhosK, a tool that predicts kinase-substrate associations (KSAs) for phosphopeptide substrates detected by mass spectrometry (MS). The tool uses a Naïve Bayes framework with priors from known KSAs to generate its predictions. By mining MS data for the collective dynamic signatures of each kinase's substrates, revealed by correlation analysis of phosphopeptide intensity data, it infers KSAs for the considerable body of substrates that lack such annotations. We benchmarked the tool against existing KSA prediction approaches that rely on static information (e.g., sequences, structures, and interactions) using publicly available MS data, including breast, colon, and ovarian cancer models. The benchmarking shows that co-phosphorylation analysis can significantly improve prediction performance when static information is available (about 35% of sites) while providing reliable predictions for the remainder. This triples the number of KSAs available from the experimental MS data and provides a comprehensive and reliable characterization of the landscape of kinase-substrate interactions well beyond current limitations.
Author Summary
Kinases play an important role in cellular regulation and have emerged as an important class of drug targets for many diseases, particularly cancers. Comprehensive identification of the links between kinases and their substrates enhances our ability to understand signalling networks and the mechanisms underlying disease, and thereby to drive drug discovery. Most current computational methods for predicting kinase-substrate associations use static information, such as sequence motifs and physical interactions, to generate predictions. However, phosphorylation is a dynamic process, and these static predictions may overlook unique features of the cellular context, where kinases may be rewired. In this manuscript, we propose a computational method, CoPhosK, which uses mass spectrometry-based phosphoproteomics data to predict the kinase for every phosphosite identified in the experiment. We show that our approach complements and extends existing approaches.
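To make the co-phosphorylation idea concrete, the following is a minimal sketch and not the CoPhosK implementation. It assumes a site-by-sample intensity matrix and a dictionary of known KSAs, and it scores each kinase for an unannotated site by combining a prior (the kinase's share of known substrates) with a crude likelihood derived from the Pearson correlation between the query site and that kinase's known substrates. The function name `rank_kinases`, the mapping of correlations to likelihoods, and the length-normalized (mean) log-likelihood are all illustrative assumptions.

```python
import numpy as np

def rank_kinases(intensity, known_ksa, query, eps=1e-6):
    """Illustrative Naive Bayes-style kinase scoring for one query site.

    intensity : 2D array, rows = phosphosites, columns = samples
    known_ksa : dict mapping kinase name -> list of row indices of known substrates
                (assumed not to contain the query site)
    query     : row index of the unannotated site
    """
    corr = np.corrcoef(intensity)  # site-by-site Pearson correlation
    total = sum(len(subs) for subs in known_ksa.values())
    log_scores = {}
    for kinase, subs in known_ksa.items():
        if not subs:
            continue
        prior = len(subs) / total  # assumed prior: share of known substrates
        # Assumed likelihood: rescale each correlation from [-1, 1] to (0, 1)
        lik = np.clip((corr[query, subs] + 1.0) / 2.0, eps, 1.0 - eps)
        # Treat substrates as independent evidence; average the log-likelihoods
        # so kinases with many known substrates are not penalized for size
        log_scores[kinase] = np.log(prior) + np.mean(np.log(lik))
    # Normalize to posterior-like weights that sum to 1
    names = list(log_scores)
    logs = np.array([log_scores[n] for n in names])
    weights = np.exp(logs - logs.max())
    weights /= weights.sum()
    return dict(zip(names, weights))

# Toy data: sites 0-1 rise across samples, sites 2-3 fall, site 4 is the query
intensity = np.array([
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 4.0, 6.0, 8.0, 10.5],
    [5.0, 4.0, 3.0, 2.0, 1.0],
    [10.0, 8.0, 6.0, 4.0, 2.5],
    [1.1, 2.0, 3.2, 3.9, 5.1],
])
known_ksa = {"KinaseA": [0, 1], "KinaseB": [2, 3]}
posterior = rank_kinases(intensity, known_ksa, query=4)
best = max(posterior, key=posterior.get)
```

In this toy example the query site tracks the substrates of the hypothetical `KinaseA`, so that kinase receives the larger weight. A real analysis would need a calibrated likelihood model and handling of missing intensity values, neither of which is attempted here.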