Submitted to: BioMed Central (BMC) Genetics
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 11/18/2008
Publication Date: 11/18/2008
Citation: Druka, A., Druka, I., Centeno, A.G., Li, H., Sun, Z., Thomas, W.T., Bonar, N., Steffenson, B.J., Ullrich, S.E., Kleinhofs, A., Wise, R.P., Close, T.J., Potokina, E., Luo, Z., Wagner, C., Schweizer, G.F., Marshall, D.F., Kearsey, M.J., Williams, R.W., Waugh, R. 2008. Towards Systems Genetic Analyses in Barley: Integration of Phenotypic, Expression and Genotype Data into GeneNetwork. BioMed Central (BMC) Genetics. 9:73.
Interpretive Summary: Identifying and mapping key regulatory genes in grain crops requires three things: 1) experimental germplasm that harbors genetic variation, 2) high-throughput expression profiling, and 3) high-throughput genotyping. This “systems biology” approach leverages the complementary strengths of classical genetics and transcriptomics to connect loci that confer important agronomic traits with the gene expression networks that influence them. We integrated these three key data sets for each individual in a segregating population of 150 doubled haploid barley lines. They comprise two mRNA profiling data sets, a Transcript Derived Marker (TDM)-based barley genetic linkage map, and a set of new trait data obtained from over 4 years of field and glasshouse experiments. We also compiled publicly available trait segregation data that the barley genetics community has collected on this reference population over the last 15 years. Here we make these data openly accessible by integrating them into GeneNetwork, a web-based analytical tool designed for multiscale integration of networks of genes, transcripts, and traits and optimized for on-line analysis of traits controlled by a combination of allelic variants and environmental factors. The results of this study will enable scientists to infer map positions and trait association networks, promoting the identification of genes that can be used to improve yield and quality of small grain crops.
Technical Abstract: A typical genetical genomics experiment results in three separate data sets: genotype, gene expression, and higher-order phenotypic data. Used in concert, these data sets provide the opportunity to perform genetic analysis at a systems level. The predictive power of these experiments is largely determined by the gene expression data set, in which tens of millions of data points can be generated using currently available mRNA profiling technologies. Such large, multidimensional data sets often have value beyond that extracted during their initial analysis and interpretation, particularly when the experiments are conducted on widely distributed reference genetic materials. Besides quality and scale, access to the data is of primary importance, because accessibility allows the wider research community to extract considerable added value from the same primary data set. Although the number of genetical genomics experiments in different plant species is rapidly increasing, none to date has been presented in a form that allows an entire research community to test quickly and efficiently on-line for possible associations between genes, loci, and traits of interest. Using a reference population of 150 recombinant doubled haploid barley lines, we generated novel phenotypic, mRNA abundance, and SNP (single nucleotide polymorphism)-based genotyping data sets, added them to a considerable volume of "legacy" trait data provided by several of the authors, and entered them into GeneNetwork (www.genenetwork.org). GeneNetwork is a unified on-line analytical environment that enables the user to test genetic hypotheses about how component traits, such as mRNA abundance in this case, may interact to condition more complex biological phenotypes (higher-order traits). Here we describe these barley data sets and demonstrate some of the functionality GeneNetwork provides as an easily accessible and integrated analytical environment for exploring these complex data sets.
By integrating barley genotypic, phenotypic, and mRNA abundance data sets directly within GeneNetwork’s analytical environment, we give the community simple desktop access to them. In this environment, a combination of correlation analysis and linkage mapping provides the potential to identify and substantiate gene targets for saturation mapping and positional cloning. By integrating data sets from an as-yet-unsequenced crop plant (barley) into a database that was designed for mouse, we support the feasibility of "sustainable programming" practice for biological data sets.
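
The combination of correlation analysis and linkage mapping described above can be illustrated with a minimal sketch. It is not GeneNetwork's implementation and uses no data from this study: the 150 lines, the five markers (m1 to m5), the transcript abundances, and the trait values are all simulated, and the marker names and effect sizes are hypothetical. Linkage mapping is approximated here by single-marker regression, where the LOD score for a marker follows from the correlation r between genotype and phenotype as LOD = -(n/2) log10(1 - r^2).

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def marker_lod(genotypes, values):
    """LOD score from single-marker regression: -(n/2) * log10(1 - r^2)."""
    r = pearson(genotypes, values)
    return -(len(values) / 2) * math.log10(1 - r * r)


def peak_marker(markers, values):
    """Name of the marker with the highest LOD score for the given values."""
    return max(markers, key=lambda name: marker_lod(markers[name], values))


# Simulated data (assumption): doubled haploid lines are homozygous, so each
# marker genotype is coded 0 or 1 according to the parent of origin.
rng = random.Random(0)
n_lines = 150
markers = {
    f"m{i}": [rng.randint(0, 1) for _ in range(n_lines)] for i in range(1, 6)
}

# Hypothetical causal chain: marker m3 -> transcript abundance -> trait.
transcript = [2.0 * g + rng.gauss(0, 0.5) for g in markers["m3"]]
trait = [1.5 * t + rng.gauss(0, 1.0) for t in transcript]

# Step 1: correlation analysis between transcript abundance and the trait.
r_expr_trait = pearson(transcript, trait)

# Step 2: linkage scans of both; a shared peak marker supports the candidate.
expr_peak = peak_marker(markers, transcript)
trait_peak = peak_marker(markers, trait)

print(f"transcript-trait correlation: {r_expr_trait:.2f}")
print(f"transcript maps to {expr_peak}; trait maps to {trait_peak}")
```

In this simulated case the transcript and the trait are correlated and both map to the same marker, which is the kind of converging evidence that would nominate a gene target for closer mapping. Real analyses would additionally need marker positions, interval mapping, and permutation-based significance thresholds, none of which are sketched here.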