Submitted to: PLoS One
Publication Type: Peer reviewed journal
Publication Acceptance Date: 8/22/2011
Publication Date: 10/21/2011
Citation: Dileo, M., Strahan, G.D., Hoekenga, O. 2011. Weighted correlation network analysis (WGCNA) applied to the tomato fruit metabolome. PLoS One. 6(10):e26683. DOI: 10.1371/journal.pone.0026683.
Interpretive Summary: High-throughput analytical chemistry is a collection of technologies useful to a wide range of scientific disciplines, including food science and plant biology. This collection includes nuclear magnetic resonance and mass spectrometry-based technologies, which generate very large datasets that are often difficult to analyze. However, these experiments produce valuable information about food quality and composition that is useful to producers, consumers and regulators. Using multivariate statistics, one can identify patterns and relationships between chemical compounds and other quality traits. We compared two styles of multivariate statistics to analyze the chemical composition data we collected from greenhouse-grown tomato fruits. We found that a correlation network-based approach was a powerful tool to analyze and visualize the complexity of tomato fruit composition.
Technical Abstract: One of the challenges for systems biology approaches is that hundreds to thousands of variables are often measured for treatments with low replication, thus creating a multiple testing problem. Principal component analysis (PCA) and weighted correlation network analysis (WGCNA) are two complementary approaches to reduce data complexity and identify the most informative variables hidden within the data. Although PCA is routinely used, it offers limited information in comparison to other approaches, such as ANOVA, because the goal of a PCA is to summarize the factors that drive differences between datasets. A second commonly used method is the construction of correlation networks, which link pairs of metabolites or transcripts using either quantitative or qualitative tests. Correlation analyses allow the visualization of a network of metabolites and/or transcripts. WGCNA is one such correlation analysis; it obviates the multiple testing problem while more accurately modeling the structure and function of molecular networks, and it provides more informative interpretations. Here we compare PCA and WGCNA analyses of tomato fruit metabolite data collected by nuclear magnetic resonance and liquid chromatography-mass spectrometry from greenhouse-grown plants. We examine the effect of the pleiotropic ripening mutant rin on the fruit metabolome and demonstrate that WGCNA is a powerful tool to combine datasets, describe complex systems and generate new hypotheses.
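To make the contrast between the two approaches concrete, the following is a minimal sketch in Python with NumPy, not the authors' pipeline (WGCNA itself is distributed as an R package). It assumes an unsigned network, in which the weighted adjacency between two metabolites is the absolute Pearson correlation raised to a soft-threshold power, and it computes PCA scores by singular value decomposition of the centered data. The function names, the power `beta=6`, and the synthetic data matrix are illustrative assumptions and are not taken from the study.

```python
import numpy as np


def soft_threshold_adjacency(data, beta=6):
    """Unsigned weighted adjacency |cor(x_i, x_j)|**beta.

    data: array of shape (samples, metabolites).
    beta: soft-threshold power (illustrative value, not from the study).
    """
    corr = np.corrcoef(data, rowvar=False)  # metabolite-by-metabolite Pearson r
    adjacency = np.abs(corr) ** beta        # weak correlations shrink toward 0
    np.fill_diagonal(adjacency, 0.0)        # ignore self-connections
    return adjacency


def pca_scores(data, n_components=2):
    """Project samples onto the leading principal components via SVD."""
    centered = data - data.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components] * s[:n_components]


# Synthetic placeholder data: 12 samples by 5 metabolites. The first three
# metabolites share a common signal; the last two are independent noise.
rng = np.random.default_rng(0)
shared = rng.normal(size=(12, 1))
data = np.hstack([
    shared + 0.1 * rng.normal(size=(12, 3)),
    rng.normal(size=(12, 2)),
])

adjacency = soft_threshold_adjacency(data)
connectivity = adjacency.sum(axis=0)  # weighted degree of each metabolite
scores = pca_scores(data)             # one row of PC scores per sample
```

In this sketch the network view keeps a weight for every metabolite pair instead of applying a pass/fail significance test to each pair, which is one way to read the abstract's point about avoiding the multiple testing problem. The PCA scores, by contrast, summarize differences between samples rather than relationships between metabolites.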