Location: Plant, Soil and Nutrition Research
2011 Annual Report
1a. Objectives (from AD-416)
Genomic information, particularly information about DNA sequence polymorphism, has great potential to increase the rate of improvement in small grains breeding programs. As genotyping costs fall and genotyping services are made available through the USDA small grains genotyping labs, breeders will need new methods to apply those resources effectively to their selection programs. The overall objectives of this program are to develop effective methods for identification of quantitative trait loci (QTL) and marker-assisted selection (MAS), and to deliver those methods to breeders and geneticists in publications and software.
1: Develop methods for identifying plant breeding quantitative trait loci.
2: Integrate methods for QTL identification into strategies that enable geneticists and breeders to design more efficient experiments and make better selection decisions.
3: Develop breeder-friendly tools for genomic and genetic data access and analysis, with a specific focus on optimum analysis and use of molecular marker and agronomic data for small grains breeders and geneticists.
1b. Approach (from AD-416)
Objective 1: Develop methods for identifying plant breeding quantitative trait loci. Experimental Design. Populations under a Wright-Fisher neutral model will be simulated with a standard coalescent approach under different parameter values to compare three analysis methods: 1. Single-marker regression, 2. A random-effect haplotype method, and 3. A coalescent-based haplotype method. These methods will be applied using different haplotype block identification methods.
Objective 2: Integrate methods for QTL identification into strategies that enable geneticists and breeders to design more efficient experiments and make better selection decisions. Experimental Design. To predict specific untested haplotype-environment effects, the covariance matrix of haplotype-within-environment effects will be modeled in two ways. First, the covariance of haplotype main effects can be modeled on the basis of the sequence similarity of the haplotypes. Second, the covariance of haplotype effects across environments can be modeled much as the covariance of genotype effects in multi-environment trials is modeled. We will also explore a combination of these two options. Simulations of MAS will be applied to data from the Barley CAP for spring, six-row barley. The form of the distribution of QTL effects obtained from the real data will be maintained. Mixed model and whole-genome selection methods will be applied.
Objective 3: Develop breeder-friendly tools for genomic and genetic data access and analysis, with a specific focus on optimum analysis and use of molecular marker and agronomic data for small grains breeders and geneticists. Experimental Design. In collaboration with GrainGenes, displays currently available in TASSEL and Haploview will be scoped, resource requirements estimated, and priorities established.
In addition, this project will provide association analyses based on diversity data stored in the GrainGenes database, with significant markers to be displayed on a genetic map. Methods developed in the preceding two objectives will be implemented as plugins to the TASSEL software package. TASSEL already handles most of the data input, data management, and output functions. Connections will be established between GrainGenes, The Hordeum Toolbox (THT), and USDA small grains genotyping labs by implementing a GDPC (Genomic Diversity and Phenotype Connection) web-service for each database.
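The two covariance options described under Objective 2 can be illustrated with a minimal sketch. It assumes, purely for illustration, that sequence similarity is the fraction of matching marker positions and that the "combination of these two options" is a Kronecker product of an environment covariance matrix with the haplotype similarity matrix. Neither choice is stated in this report, and the haplotype strings, the environment covariance values, and the function names are hypothetical.

```python
def haplotype_similarity(haplotypes):
    """Pairwise fraction of marker positions at which two haplotypes match.

    Assumption: identity-by-state matching stands in for "sequence similarity".
    """
    return [
        [sum(a == b for a, b in zip(h1, h2)) / len(h1) for h2 in haplotypes]
        for h1 in haplotypes
    ]


def kronecker(a, b):
    """Kronecker product of two matrices given as nested lists."""
    return [
        [a_ij * b_kl for a_ij in a_row for b_kl in b_row]
        for a_row in a
        for b_row in b
    ]


# Hypothetical three-marker haplotypes observed in one block.
haplotypes = ["AAG", "AAT", "CGT"]
hap_sim = haplotype_similarity(haplotypes)

# Hypothetical covariance of effects across two environments.
env_cov = [[1.0, 0.5], [0.5, 1.0]]

# Rows and columns are ordered environment-major: (env 1, hap 1..3), (env 2, hap 1..3).
cov = kronecker(env_cov, hap_sim)

for row in cov:
    print(" ".join(f"{value:.2f}" for value in row))
```

Under these assumptions, an untested haplotype-environment combination borrows information both from similar haplotypes in the same environment and from the same haplotype in correlated environments.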
3. Progress Report
In FY 2011, we continued work on methods for using high-density DNA markers to accelerate crop improvement. We compared different methods of genomic selection (GS), assessed the value of using sets of adjacent markers (i.e., haplotype blocks) rather than single markers, and developed multi-trait GS methods. Numerous GS models have been proposed but not compared. Using eight datasets from four species, we evaluated the predictive ability of available GS models and several machine learning methods. Many models reached a similar level of accuracy, though computation times and effect estimates differed. We concluded that GS in plant breeding programs could be based on a small set of four models. We assessed the power to identify associations between markers and traits when haplotype blocks are used as predictors. Simulations were performed to mimic different population histories of the experimental lines being evaluated. We found that, except under the simplest (and unrealistic) population history, the haplotype methods outperformed the single-marker methods, though the difference was small. We are developing multi-trait methods to predict performance using DNA marker data only. With those methods, we hope to leverage information from traits that are strongly affected by genotype and that are correlated with traits more strongly affected by environment.
Work on barley: During the past year, work fell into three categories: implementing and supporting software for GS as a plug-in to the TASSEL software package, contributing to the association analysis for all trait data collected under the Barley CAP grant, and adapting The Hordeum Toolbox (THT) for its transition to The Triticeae Toolbox (T3). We have assisted in empirical barley selection against Fusarium head blight (FHB) susceptibility. Phenotypic selection against FHB is expensive, and the trait is well suited to GS.
We selected optimal markers for genotyping, developed a GS model, and applied the model to 2,000 barley lines. Analyses showed that GS models predict FHB, deoxynivalenol (DON), yield, and other traits in barley with an accuracy of approximately 0.6 to 0.7. This level of accuracy is expected to substantially increase the rate of genetic gain.
Work on oat: We have obtained oat accessions from the National Small Grains Collection that are divergent for the soluble fiber beta-glucan. We produced new trait and DNA marker data for these accessions. We are using genome-wide marker data on elite oat to compare selection methods for beta-glucan content. We have completed two cycles of selection and will compare gains based on the 2011 field season. Finally, we assessed a number of prediction models to determine how accurately oat agronomic and grain quality traits can be predicted on the basis of marker data.
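As a rough illustration of how a GS model is trained and how its accuracy can be measured, the sketch below fits ridge regression on simulated marker data and reports the correlation between predicted and observed values in held-out lines. This is not the model or data used in this project: the ridge penalty, the simulated marker scores, the sample sizes, and the use of Pearson correlation as "accuracy" are all assumptions made for the example.

```python
import numpy as np


def ridge_marker_effects(X, y, lam):
    """Ridge estimates of marker effects: solve (X'X + lam*I) b = X'y on centered data."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    n_markers = X.shape[1]
    return np.linalg.solve(Xc.T @ Xc + lam * np.eye(n_markers), Xc.T @ yc)


def prediction_accuracy(X_train, y_train, X_test, y_test, lam):
    """Correlation between predicted and observed phenotypes in held-out lines."""
    effects = ridge_marker_effects(X_train, y_train, lam)
    predicted = (X_test - X_train.mean(axis=0)) @ effects + y_train.mean()
    return np.corrcoef(predicted, y_test)[0, 1]


# Simulated data (hypothetical): 300 inbred lines scored 0/1 at 200 markers.
rng = np.random.default_rng(0)
n_lines, n_markers = 300, 200
X = rng.integers(0, 2, size=(n_lines, n_markers)).astype(float)
true_effects = rng.normal(0.0, 1.0, n_markers)
genetic_value = X @ true_effects
# Noise with the same spread as the genetic values, i.e. heritability near 0.5.
y = genetic_value + rng.normal(0.0, genetic_value.std(), n_lines)

# Train on the first 200 lines, validate on the remaining 100.
# lam=50.0 is an illustrative penalty, not a tuned value.
accuracy = prediction_accuracy(X[:200], y[:200], X[200:], y[200:], lam=50.0)
print(f"prediction accuracy: {accuracy:.2f}")
```

In practice the penalty would be estimated from the data, for example by cross-validation or from variance components in a mixed model.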
1. Historical data from multiple breeding programs can be used to predict crop performance from DNA marker data. Genomic selection (GS) is a statistical method that predicts the agronomic performance of new crop breeding lines using DNA markers spread throughout the plant genome. ARS researchers at the Robert W. Holley Center for Agriculture & Health at Ithaca, NY, evaluated the prediction accuracy of GS using historical data from the USDA-ARS organized Uniform Oat Performance Nursery. All US oat breeding programs have participated in this nursery over many years. The study showed that the large number of breeding lines included in these trials improved the accuracy with which future oat performance was predicted. Including older lines in the statistical model increased or maintained its accuracy, suggesting that older generations provided useful information. This empirical validation of GS methods indicates that GS will play a role in accelerating the delivery of improved varieties via plant breeding through the use of inexpensive and abundant DNA markers available to the public sector.
2. Joint analysis of adjacent DNA markers helps identify important genes. An important unresolved question in analyses to find genes that affect traits like crop yield and grain quality is whether groups of adjacent DNA markers (also called "haplotypes") detect the effects of a gene on a crop trait better than single markers do. ARS researchers at the Robert W. Holley Center for Agriculture & Health at Ithaca, NY, in collaboration with Cornell University researchers, used computer simulations to compare the utility of single-marker versus haplotype analyses for identifying genes. Because the power of the methods may depend on how genes affect the trait and on the breeding history of the varieties studied, a number of different scenarios were simulated. Across a range of plausible scenarios, the average power of methods using 2- and 3-marker haplotypes to detect genes exceeded that of the single-marker method. Scenarios that favored the multiple-marker haplotype methods are common for small grain breeding programs in the United States. Because of these discoveries, small grain breeders and geneticists will be more successful in identifying important genes, which will accelerate crop improvement.
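The contrast between single-marker and haplotype analysis can be shown on a toy example. The sketch below uses a plain one-way ANOVA F statistic as a simple fixed-effect stand-in; it is not the random-effect or coalescent-based haplotype method studied in this project. The marker scores and phenotypes are hypothetical, constructed so that only lines carrying the (1, 0) two-marker haplotype have elevated trait values.

```python
from collections import defaultdict


def anova_f(labels, y):
    """One-way ANOVA F statistic for phenotypes y grouped by class label."""
    groups = defaultdict(list)
    for label, value in zip(labels, y):
        groups[label].append(value)
    n, k = len(y), len(groups)
    grand_mean = sum(y) / n
    means = {label: sum(vals) / len(vals) for label, vals in groups.items()}
    ss_between = sum(
        len(vals) * (means[label] - grand_mean) ** 2 for label, vals in groups.items()
    )
    ss_within = sum(
        (value - means[label]) ** 2 for label, vals in groups.items() for value in vals
    )
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical scores at two adjacent markers for 12 inbred lines.
m1 = [0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 1, 1]
m2 = [0, 0, 1, 1, 0, 0, 1, 1, 0, 1, 0, 1]
# Hypothetical phenotypes: lines with haplotype (1, 0) score about 2 units higher.
y = [10.1, 9.8, 10.3, 9.9, 12.2, 11.9, 10.0, 10.2, 10.0, 9.7, 12.1, 10.1]

f_single = anova_f(m1, y)                # classes defined by marker 1 alone
f_hap = anova_f(list(zip(m1, m2)), y)    # classes defined by the 2-marker haplotype

print(f"single-marker F: {f_single:.1f}")
print(f"haplotype F:     {f_hap:.1f}")
```

The two statistics have different degrees of freedom, so each would be compared against its own F distribution rather than against the other directly. The haplotype test spends more degrees of freedom, which is one reason its advantage depends on the scenario.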
3. Genomic selection (GS) accurately predicted experimental line performance in a wheat breeding program. GS uses genome-wide DNA marker data to predict the performance of new breeding lines in crop improvement programs. In crops, many breeding lines can come from a single family, and performance prediction using DNA markers within such families has been shown to be accurate. However, this approach requires lines from each family to be field tested before GS is conducted. This requirement slows the selection cycle and may result in lower gains in crop performance per year than approaches that estimate marker effects with multiple families. ARS researchers at the Robert W. Holley Center for Agriculture & Health at Ithaca, NY, in collaboration with Cornell University researchers, compared multi-family prediction accuracies of phenotypic selection (PS, measuring the actual plant trait), conventional marker-assisted selection (MAS), and GS in a winter wheat breeding program. For individual traits, the average prediction accuracies from GS were 28% higher than from MAS and 5% lower than from PS. For a combination of traits designed to give an overall assessment of the value of a line, the average accuracy from GS was 14% higher than from PS. These results will impact small grains improvement by encouraging breeders to try multi-family GS to increase genetic gain per unit time and cost.
Jannink, J. 2010. Dynamics of long-term genomic selection. Genetics Selection Evolution. 42:35.