Standfirst
A new integrative computational approach, NetworKIN, predicts which protein kinases may target experimentally identified phosphorylation sites.

Thousands of in vivo phosphorylation sites have been identified via high-throughput proteome mapping; however, several limitations preclude matching of these sites to specific kinases. Reporting in Cell, Linding and colleagues now present NetworKIN, an integrative computational approach that combines consensus sequence motifs and protein-association networks to predict which protein kinases target experimentally identified phosphorylation sites in vivo.
The NetworKIN algorithm consists of two steps. In the first step, each phosphorylation site is assigned to one or more kinase families, based on the intrinsic preference of kinases for consensus substrate motifs. In the second step, the STRING database is used to construct a context network for each substrate — this network integrates information from curated pathway databases, literature mining, physical protein-interaction assays, mRNA expression studies and genomic data.
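The two-step logic described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the regular-expression motifs, the family membership table, the association scores and the sequence windows are invented, and the real NetworKIN scoring is more elaborate than a regular-expression match followed by a table lookup.

```python
import re

# Hypothetical, simplified consensus motifs, matched against a short sequence
# window that starts at the phosphorylated residue. Not NetworKIN's real models.
FAMILY_MOTIFS = {
    "CDK": re.compile(r"[ST]P.[KR]"),
    "PIKK": re.compile(r"[ST]Q"),
}

# Hypothetical membership of each kinase family.
FAMILY_MEMBERS = {
    "CDK": ["CDK1", "CDK2"],
    "PIKK": ["ATM", "ATR"],
}

# Invented context-network association scores: (substrate, kinase) -> score.
# In the article this contextual evidence comes from the STRING database.
CONTEXT_SCORES = {
    ("RAD50", "ATM"): 0.9,
    ("RAD50", "ATR"): 0.4,
    ("53BP1", "CDK1"): 0.8,
    ("53BP1", "CDK2"): 0.3,
}


def candidate_families(window):
    """Step 1: return the kinase families whose motif matches the site window."""
    return [fam for fam, motif in FAMILY_MOTIFS.items() if motif.match(window)]


def best_kinase(substrate, window):
    """Step 2: among kinases in the matching families, pick the one with the
    strongest contextual association to the substrate (None if there is none)."""
    candidates = [
        (CONTEXT_SCORES.get((substrate, kinase), 0.0), kinase)
        for fam in candidate_families(window)
        for kinase in FAMILY_MEMBERS[fam]
    ]
    if not candidates:
        return None
    score, kinase = max(candidates)
    return kinase if score > 0 else None
```

With these invented inputs, `best_kinase("RAD50", "SQEE")` selects ATM over ATR: both belong to the family whose motif matches, and only the context score separates them. That is the point of the second step, since sequence motifs alone often cannot distinguish between kinases of the same family.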
To test this approach, Linding et al. analysed the ability of NetworKIN to correctly predict which kinases are responsible for modifying each of 282 known in vivo phosphorylation sites from four well-studied kinase families (INSR, PIKK, CDK, PKC). They obtained a prediction accuracy of 64% (148 of 233 predictions) and a sensitivity of 52% (148 of 282 sites). By contrast, motif-based methods had an accuracy of 25% and a sensitivity of 61%. The improved predictive power gained with this method underlines the importance of contextual information on kinase–substrate interactions in the specificity of protein phosphorylation within cells.
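The two benchmark figures follow directly from the counts quoted above, as the short calculation below shows. The variable names are chosen here for illustration. Note that "accuracy" in this context means the fraction of predictions that were correct, a quantity often called precision.

```python
# Counts quoted in the benchmark above.
correct_predictions = 148   # predictions that matched the known kinase
total_predictions = 233     # predictions NetworKIN made
known_sites = 282           # known in vivo sites in the benchmark

# Fraction of predictions that were correct (often called precision).
accuracy = correct_predictions / total_predictions

# Fraction of known sites recovered by a correct prediction.
sensitivity = correct_predictions / known_sites

print(f"accuracy:    {accuracy:.1%}")
print(f"sensitivity: {sensitivity:.1%}")
```

Rounded to whole percentages, these values give the 64% and 52% reported in the text.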
This algorithm, which is based on 112 human kinases from 20 families, was applied to the complete curated in vivo human phosphoproteome in Phospho.ELM, which contains 7,207 sites on 2,540 proteins. This resulted in a human phosphorylation network (HPN) that consists of 7,143 site-specific kinase–substrate interactions between 1,759 substrates and 68 kinases, with predictions for 4,488 phosphorylation sites.
Using the HPN as a resource, the authors investigated protein phosphorylation within the DNA-damage response network. Several predictions within this network are of potential interest. For example, the algorithm correctly predicted 39 of the 45 sites in 16 known ataxia-telangiectasia mutated (ATM) substrates, and also predicted 12 new ATM sites. Among the novel substrates, RAD50, a component of the MRE11 complex, was shown to be phosphorylated by ATM in response to genotoxic stress. In addition, the cell-cycle kinase CDK1 was shown to phosphorylate 53BP1 during mitosis. Combining NetworKIN with small interfering RNA or selective inhibitor approaches and quantitative mass spectrometry could accelerate the validation of individual in vivo kinase–substrate interactions. Using NetworKIN and specific inhibitors of glycogen synthase kinase-3 (GSK3), Linding et al. identified the BCL2-interacting transcriptional repressor BCLAF1 as a novel substrate of GSK3.
The identification of new and more accurate motifs, improved techniques for determining kinase specificity and increased information on protein interactions are likely to strengthen the predictions made by NetworKIN. Whether this algorithm can be extended to include other post-translational modifications, and therefore enable the construction of comprehensive signalling networks, remains to be seen.
