Author
WALSH, JESSE - Iowa State University
Schaeffer, Mary
ZHANG, PEIFEN - Carnegie Institute - Stanford
RHEE, SEUNG - Carnegie Institute - Stanford
DICKERSON, JULIE - Iowa State University
Sen, Taner
Submitted to: BMC Systems Biology
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 11/9/2016
Publication Date: 11/29/2016
Publication URL: https://handle.nal.usda.gov/10113/5695347
Citation: Walsh, J.R., Schaeffer, M.L., Zhang, P., Rhee, S.Y., Dickerson, J.A., Sen, T.Z. 2016. The quality of metabolic pathway resources depends on initial enzymatic function assignments: a case for maize. BMC Systems Biology. 10:129. doi: 10.1186/s12918-016-0369-x.
Interpretive Summary: In order to manipulate agronomically important traits, we need to understand how proteins act in the cell. A large number of proteins are enzymes, and these operate in the various pathways used by cells to create the different molecules (lipids, carbohydrates, amino acids, hormones, vitamins, etc.). A great deal of information is available in various species about the genes for each step of the pathways, but a complete story is generally not experimentally defined for a particular species or crop. Computational methods have been developed to fill in these gaps, and two different methods have been used for maize. Here we compare two metabolic pathway resources (CornCyc and MaizeCyc), created by two different computational methods, with a focus on the differences in quality of each resource. Our work will especially help maize researchers, including plant breeders, geneticists, and developmental biologists, to discern the quality of available metabolic pathway information, as they can use this knowledge to further our understanding of how genes influence traits important to improving the maize crop.
Technical Abstract: As metabolic pathway resources become more commonly available, researchers have unprecedented access to information about their organism of interest. Despite efforts to ensure consistency between various resources, information content and quality can vary widely. Two maize metabolic pathway resources for the B73 inbred line, CornCyc4.0 and MaizeCyc2.2, are based on the same gene model set and were developed using Pathway Tools software. These resources differ in their initial enzymatic function assignments and in the extent of manual curation. We present an in-depth comparison between CornCyc and MaizeCyc to demonstrate the effect of initial computational enzymatic function assignments on the final quality and content of metabolic pathway resources. MaizeCyc contains over twice as many annotated genes and more proteins than CornCyc. CornCyc contains on average 1.6 transcripts per gene, while MaizeCyc contains almost no alternate splicing. MaizeCyc does not match CornCyc’s breadth in representing the metabolic domain, having fewer compounds, fewer reactions, and fewer pathways than CornCyc. CornCyc predictions are more accurate than those in MaizeCyc when compared to experimentally determined function assignments, demonstrating the relative strength of the enzymatic function assignment pipeline used to generate CornCyc. Our results show that the quality of initial enzymatic function assignments primarily determines the quality of the final metabolic pathway resource. Therefore, biologists should pay close attention to the methods and information sources used to develop a metabolic pathway resource to gauge the utility of using such functional assignments to construct hypotheses for experimental studies.
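The abstract describes scoring predicted enzymatic function assignments against experimentally determined ones. The following is a minimal sketch of one common way such a comparison could be scored (precision and recall over gene-to-function pairs); it is not taken from the paper, and the function name, gene identifiers, EC numbers, and resource names are all hypothetical placeholders rather than CornCyc or MaizeCyc data.

```python
from typing import Dict, Set, Tuple


def precision_recall(
    predicted: Dict[str, Set[str]],
    gold: Dict[str, Set[str]],
) -> Tuple[float, float]:
    """Score predicted (gene, function) pairs against a gold standard.

    Only genes present in the gold standard are scored, because a
    prediction for a gene with no experimental evidence cannot be
    judged right or wrong.
    """
    tp = fp = fn = 0
    for gene, gold_funcs in gold.items():
        pred_funcs = predicted.get(gene, set())
        tp += len(pred_funcs & gold_funcs)   # predicted and confirmed
        fp += len(pred_funcs - gold_funcs)   # predicted, not confirmed
        fn += len(gold_funcs - pred_funcs)   # confirmed, not predicted
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical toy data: gene IDs mapped to sets of EC numbers.
gold = {
    "geneA": {"1.1.1.1"},
    "geneB": {"2.7.1.1"},
    "geneC": {"4.2.1.11"},
}
resource_x = {
    "geneA": {"1.1.1.1"},
    "geneB": {"2.7.1.1", "2.7.1.2"},  # one extra, unconfirmed assignment
    # geneC is missing entirely
}
resource_y = {
    "geneA": {"1.1.1.1"},
    "geneB": {"2.7.1.1"},
    "geneC": {"4.2.1.11"},
    "geneD": {"3.1.1.1"},  # not in the gold standard, so not scored
}

px, rx = precision_recall(resource_x, gold)
py, ry = precision_recall(resource_y, gold)
```

Restricting the score to genes with experimental evidence is a design choice in this sketch: it keeps a resource from being penalized for annotations that simply have not been tested yet.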