
Title: DiffSLc: A graph centrality method to detect essential proteins of a protein-protein interaction network

Author
MISTRY, DIVYA - Iowa State University
Wise, Roger
DICKERSON, JULIE - Iowa State University

Submitted to: PLOS ONE
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 10/15/2017
Publication Date: 11/9/2017
Citation: Mistry, D., Wise, R.P., Dickerson, J. 2017. DiffSLc: A graph centrality method to detect essential proteins of a protein-protein interaction network. PLoS One. https://doi.org/10.1371/journal.pone.0187091.

Interpretive Summary: The proper functioning of interconnected genetic networks is essential for an organism to survive. In computational biology, network centrality measures have been successfully applied to detect important genes in gene coexpression networks and important proteins in protein-protein interaction networks. Previous research has proposed various computational techniques for finding such network elements, although these measures have not improved the prediction of essential genes. Finding essential genes and proteins can help prioritize experimental validation and ascertain biological function. We would like to perform this kind of genome-wide determination for genes involved in plant disease resistance; the results could then be used in modern plant breeding programs to increase yield and sustainability. The proposed method, DiffSLc, uses gene expression data in conjunction with a protein-protein interaction network to bias the calculation of degree centrality. The method is verified using a yeast gene expression dataset with a known yeast protein-protein interaction network. DiffSLc shows measurable improvement over other centrality methods: it detects more verified essential genes than are predicted by degree, closeness, betweenness, eigenvector, or subgraph centrality alone. Impact: Once Hordeum vulgare (barley) protein-protein interaction data are available, or a verifiable network can be predicted from proteomics data, existing barley gene expression experiments can be used to predict essential genes in the barley interaction network. In the future, we would like to verify whether different gene expression datasets can be used to predict context-dependent essential genes and proteins using DiffSLc. This would be a critical step toward understanding biomolecular systems as dynamic, context-dependent data discovery environments.
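
To make the idea of biasing degree centrality with gene expression data concrete, here is a minimal sketch. It is not the published DiffSLc formula. It assumes that the bias is the absolute Pearson correlation between the expression profiles of two interacting proteins, and the function names and toy data are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a flat profile carries no coexpression signal
    return cov / sqrt(vx * vy)

def coexpression_biased_degree(edges, expression):
    """Sum |Pearson r| over each protein's interactions.

    Plain degree adds 1 per interaction; here each interaction
    contributes its coexpression strength instead (an assumption
    made for this illustration).
    """
    score = {}
    for u, v in edges:
        w = abs(pearson(expression[u], expression[v]))
        score[u] = score.get(u, 0.0) + w
        score[v] = score.get(v, 0.0) + w
    return score

# Toy data, made up for illustration (not from the paper).
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")]
expression = {
    "A": [1.0, 2.0, 3.0, 4.0],
    "B": [2.0, 4.0, 6.0, 8.0],  # perfectly correlated with A
    "C": [4.0, 3.0, 2.0, 1.0],  # perfectly anti-correlated with A
    "D": [1.0, 1.0, 1.0, 1.0],  # flat profile
}
scores = coexpression_biased_degree(edges, expression)
ranking = sorted(scores, key=scores.get, reverse=True)
```

In this toy network protein A has three interactions, but its interaction with the flat-profile protein D contributes nothing, so A's biased score is lower than its plain degree. Taking the absolute value treats strong negative correlation as evidence of a functional relationship, which is one possible design choice among several.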

Technical Abstract: Network centrality measures prioritize nodes and edges based on their importance to the network topology. These measures have been helpful in identifying critical genes and proteins in biomolecular networks. The proposed centrality measure, DiffSLc, uses the number of interactions of a protein, together with the coexpression values of the genes from which the interacting proteins were translated, as a weighting factor to bias the identification of essential proteins in a protein interaction network. Potentially essential proteins with low node degree are promoted through eigenvector centrality. Thus, the gene coexpression values are used in conjunction with the eigenvector of the network's adjacency matrix and the edge clustering coefficient to improve essentiality prediction. The outcome of this prediction is shown using three variations: (1) inclusion or exclusion of gene coexpression data, (2) the impact of the choice of coexpression measure, and (3) the impact of different gene expression datasets. For a total of seven networks, DiffSLc is compared to other centrality measures using Saccharomyces cerevisiae protein interaction networks and gene expression data. The top-ranked proteins are also compared against the known essential genes from the Saccharomyces Gene Deletion Project; these comparisons show that DiffSLc detects more essential proteins and has a higher area under the ROC curve than the other methods compared. This makes DiffSLc a stronger alternative to other methods for detecting essential genes using a protein-protein interaction network that obeys the centrality-lethality principle. DiffSLc is implemented using the igraph package in R and is available via http://git.io/diffslc.
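
The ingredients named in the abstract (coexpression, edge clustering coefficient, eigenvector centrality) can be combined as in the following sketch. This is an illustration under stated assumptions, not the published DiffSLc definition or the R/igraph implementation: the mixing weights `beta` and `omega`, the function names, and the toy data are hypothetical, the edge clustering coefficient is taken as triangles through an edge divided by min(d_u - 1, d_v - 1), and coexpression values are assumed to be precomputed and scaled to [0, 1].

```python
def edge_clustering_coefficient(adj, u, v):
    """Triangles through edge (u, v) over the maximum possible number."""
    possible = min(len(adj[u]) - 1, len(adj[v]) - 1)
    if possible <= 0:
        return 0.0
    return len(adj[u] & adj[v]) / possible

def eigenvector_centrality(adj, iterations=200):
    """Power iteration on A + I, normalized so the largest entry is 1.

    Adding the identity keeps the same eigenvectors and avoids
    oscillation on bipartite graphs.
    """
    x = {n: 1.0 for n in adj}
    for _ in range(iterations):
        nxt = {n: x[n] + sum(x[m] for m in adj[n]) for n in adj}
        top = max(nxt.values())
        x = {n: val / top for n, val in nxt.items()}
    return x

def combined_score(edges, coexpr, beta=0.5, omega=0.5):
    """Blend a global term (eigenvector) with a local, edge-weighted term."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    local = {n: 0.0 for n in adj}
    for u, v in edges:
        ecc = edge_clustering_coefficient(adj, u, v)
        w = omega * coexpr[(u, v)] + (1 - omega) * ecc
        local[u] += w
        local[v] += w
    top = max(local.values()) or 1.0  # guard against an all-zero local term
    ev = eigenvector_centrality(adj)
    return {n: beta * ev[n] + (1 - beta) * local[n] / top for n in adj}

def essential_in_top_k(scores, essential, k):
    """Count known essential proteins among the k highest-scoring ones."""
    top = sorted(scores, key=scores.get, reverse=True)[:k]
    return sum(1 for n in top if n in essential)

# Toy network and coexpression values, made up for illustration.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("D", "E")]
coexpr = {("A", "B"): 0.9, ("A", "C"): 0.8, ("B", "C"): 0.7,
          ("C", "D"): 0.2, ("D", "E"): 0.1}
scores = combined_score(edges, coexpr)
hits = essential_in_top_k(scores, essential={"A", "C"}, k=2)
```

The eigenvector term lets a protein with few interactions score well when its neighbors are themselves central, which is the role the abstract assigns to eigenvector centrality. Counting known essential proteins among the top-ranked ones mirrors, in simplified form, the comparison against gene deletion data described above.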