United States Department of Agriculture

Agricultural Research Service

Research Project: GENETIC ENHANCEMENT OF SOYBEAN SEED VALUE BY BIOTECHNOLOGY

Title: Leveraging non-targeted metabolite profiling via statistical genomics

Authors:
Shen, Miaoqing
Broeckling, Corey
Chu, Elly Yiyi
Ziegler, Gregory
Baxter, Ivan
Prenni, Jessica
Hoekenga, Owen

Submitted to: PLoS One
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: January 15, 2013
Publication Date: February 28, 2013
Repository URL:
Citation: Shen, M., Broeckling, C., Chu, E., Ziegler, G., Baxter, I.R., Prenni, J., Hoekenga, O. 2013. Leveraging non-targeted metabolite profiling via statistical genomics. PLoS One. 8(2):e57667. Available:

Interpretive Summary: Collecting data is becoming ever easier, while analyzing and visualizing those data is becoming correspondingly more difficult. We present a procedure that lets scientists combine multiple sources of data relating to maize grain composition so that a comprehensive view of grain quality can be achieved. The procedure has been designed primarily for the integration of metabolite data (organic compounds in the grain) with data derived from genetic and genomic studies. The procedure is a linear series of statistical and bioinformatic tools that can be applied not only to maize but to any model organism, and thus it represents an important step forward for biology in general rather than just plant genetics in particular. Application of the procedure within agriculture will enhance the development of breeding strategies for grain improvement for both human and animal consumption.

Technical Abstract: One of the challenges of systems biology is to integrate multiple sources of data in order to build a cohesive view of the system under study. Here we describe the mass spectrometry-based profiling of maize kernels, a model system for genomic studies and a cornerstone of the agroeconomy. Using a network analysis, we can include 83% of the 8,710 compounds detected from 210 varieties in a single framework. More conservatively, 47.1% of the compounds detected can be organized into a network with 48 distinct modules. Weighted averages were calculated for each module and then used as inputs for genome-wide association studies. Nineteen modules returned significant results, illustrating the genetic control of biochemical networks within the maize kernel. Our approach leverages genetic and genomic information to enhance the analysis of metabolomic and proteomic datasets. This method is applicable to any organism with sufficient bioinformatic resources.
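
The step in which each module is collapsed to a weighted average per variety can be illustrated with a minimal sketch. The abstract does not say how the weights were derived, so the per-compound weights here are an assumption, as are the function name, the data layout, and the toy values; this is not the authors' implementation. The resulting per-variety module scores are the kind of quantitative trait that could then be passed to a genome-wide association tool.

```python
from collections import defaultdict


def module_weighted_averages(intensities, module_of, weight_of):
    """Return {module: {variety: weighted mean intensity}}.

    intensities: {variety: {compound: intensity}}
    module_of:   {compound: module label}; unassigned compounds are absent
    weight_of:   {compound: non-negative weight} (hypothetical weighting;
                 compounds without a weight default to 1.0)
    """
    result = defaultdict(dict)
    for variety, profile in intensities.items():
        weighted_sum = defaultdict(float)
        weight_total = defaultdict(float)
        for compound, value in profile.items():
            module = module_of.get(compound)
            if module is None:
                continue  # compound not placed in any module
            weight = weight_of.get(compound, 1.0)
            weighted_sum[module] += weight * value
            weight_total[module] += weight
        for module, total in weight_total.items():
            if total > 0:  # skip modules whose weights sum to zero
                result[module][variety] = weighted_sum[module] / total
    return dict(result)


# Toy data (invented for illustration): two varieties, four compounds,
# one of which ("c4") is not assigned to a module.
intensities = {
    "variety_A": {"c1": 10.0, "c2": 20.0, "c3": 5.0, "c4": 7.0},
    "variety_B": {"c1": 12.0, "c2": 16.0, "c3": 9.0, "c4": 1.0},
}
module_of = {"c1": "M1", "c2": "M1", "c3": "M2"}
weight_of = {"c1": 3.0, "c2": 1.0, "c3": 1.0}

scores = module_weighted_averages(intensities, module_of, weight_of)
for module in sorted(scores):
    print(module, scores[module])
```

Each module then contributes one value per variety, so 48 modules would yield 48 traits across the 210 varieties instead of thousands of individual compound traits.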

Last Modified: 8/28/2016