2012 Annual Report
1a. Objectives (from AD-416):
Genomic information, particularly information about DNA sequence polymorphism, has great potential to increase the rate of improvement in small grains breeding programs. As genotyping costs fall and genotyping services are made available through the USDA small grains genotyping labs, breeders will need new methods to apply those resources effectively to their selection programs. The overall objectives of this program are to develop effective methods for identification of quantitative trait loci (QTL) and marker-assisted selection (MAS), and to deliver those methods to breeders and geneticists in publications and software.
1: Develop methods for identifying plant breeding quantitative trait loci.
2: Integrate methods for QTL identification into strategies that enable geneticists and breeders to design more efficient experiments and make better selection decisions.
3: Develop breeder-friendly tools for genomic and genetic data access and analysis, with a specific focus on optimum analysis and use of molecular marker and agronomic data for small grains breeders and geneticists.
1b. Approach (from AD-416):
Objective 1: Develop methods for identifying plant breeding quantitative trait loci.
Experimental Design. Populations under a Wright-Fisher neutral model will be simulated with a standard coalescent approach under a range of parameter values to compare three analysis methods: (1) single-marker regression, (2) a random-effect haplotype method, and (3) a coalescent-based haplotype method. Each method will be applied with several haplotype block identification methods.
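As a minimal illustration of the first of the three methods, the sketch below regresses a phenotype on each marker separately and reports the slope and R-squared per marker. It is not the project's implementation. The function name, the 0/1 allele coding, and the toy data (independent random markers rather than a coalescent simulation, with one marker acting as the QTL) are all assumptions made for the example.

```python
import random

def single_marker_regression(genotypes, phenotypes):
    """Regress the phenotype on each marker separately.

    genotypes: list of individuals, each a list of 0/1 allele codes.
    phenotypes: list of floats, one per individual.
    Returns one (slope, r_squared) tuple per marker.
    """
    n = len(phenotypes)
    y_mean = sum(phenotypes) / n
    ss_y = sum((y - y_mean) ** 2 for y in phenotypes)
    results = []
    for j in range(len(genotypes[0])):
        x = [ind[j] for ind in genotypes]
        x_mean = sum(x) / n
        ss_x = sum((xi - x_mean) ** 2 for xi in x)
        if ss_x == 0 or ss_y == 0:
            # Monomorphic marker (or constant phenotype): nothing to estimate.
            results.append((0.0, 0.0))
            continue
        s_xy = sum((xi - x_mean) * (yi - y_mean)
                   for xi, yi in zip(x, phenotypes))
        slope = s_xy / ss_x
        r_squared = s_xy ** 2 / (ss_x * ss_y)
        results.append((slope, r_squared))
    return results

# Hypothetical toy data: 200 inbred lines, 10 biallelic markers.
# Marker 3 is given an effect of 2.0; residual standard deviation is 1.0.
rng = random.Random(1)
genotypes = [[rng.randint(0, 1) for _ in range(10)] for _ in range(200)]
phenotypes = [2.0 * ind[3] + rng.gauss(0.0, 1.0) for ind in genotypes]

scan = single_marker_regression(genotypes, phenotypes)
best_marker = max(range(len(scan)), key=lambda j: scan[j][1])
print("marker with largest R-squared:", best_marker, scan[best_marker])
```

A real comparison would also need a significance threshold that accounts for testing many markers. The sketch leaves that out.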
Objective 2: Integrate methods for QTL identification into strategies that enable geneticists and breeders to design more efficient experiments and make better selection decisions.
Experimental Design. To predict specific untested haplotype-environment effects, the covariance matrix of haplotype-within-environment effects will be modeled in two ways. First, the covariance of haplotype main effects can be modeled on the basis of the sequence similarity of the haplotypes. Second, the covariance of haplotype effects across environments can be modeled much as the covariance of genotype effects in multi-environment trials can be modeled. We will also explore a combination of these two options. Simulations of MAS will be applied to data from the Barley CAP for spring, six-row barley. The form of the distribution of QTL effects obtained from the real data will be maintained. Mixed model and whole-genome selection methods will be applied.
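The two covariance options and their combination can be sketched as follows. Sequence similarity is taken here to be the fraction of matching sites between two haplotypes. The combination is written as a Kronecker product of an environment covariance matrix and the haplotype similarity matrix. Both choices are assumptions made for illustration, as are the haplotype strings and the environment covariance values. The report does not specify the similarity measure or the form of the combination.

```python
def haplotype_similarity(h1, h2):
    """Fraction of sites at which two equal-length haplotypes match."""
    matches = sum(a == b for a, b in zip(h1, h2))
    return matches / len(h1)

def similarity_matrix(haplotypes):
    """Pairwise similarity, used as a covariance of haplotype main effects."""
    return [[haplotype_similarity(a, b) for b in haplotypes]
            for a in haplotypes]

def kronecker(a, b):
    """Kronecker product of two matrices given as lists of lists."""
    return [
        [a_ij * b_kl for a_ij in a_row for b_kl in b_row]
        for a_row in a
        for b_row in b
    ]

# Hypothetical haplotypes at one block and a hypothetical covariance
# between two environments.
haplotypes = ["ACGT", "ACGA", "TCGA"]
hap_cov = similarity_matrix(haplotypes)
env_cov = [[1.0, 0.6],
           [0.6, 1.0]]

# Rows and columns are ordered environment-major:
# (env 0, hap 0), (env 0, hap 1), (env 0, hap 2), (env 1, hap 0), ...
combined = kronecker(env_cov, hap_cov)
for row in combined:
    print(["%.3f" % v for v in row])
```

Under this separable structure, the covariance between haplotype i in environment s and haplotype j in environment t is the product of the environment covariance (s, t) and the haplotype similarity (i, j). An untested haplotype-environment combination therefore borrows information both from similar haplotypes and from correlated environments.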
Objective 3: Develop breeder-friendly tools for genomic and genetic data access and analysis, with a specific focus on optimum analysis and use of molecular marker and agronomic data for small grains breeders and geneticists.
Experimental Design. In collaboration with GrainGenes, displays currently available in TASSEL and Haploview will be scoped, resource requirements estimated, and priorities established. In addition, this project will provide association analyses based on diversity data stored in the GrainGenes database, with significant markers to be displayed on a genetic map. Methods developed in the preceding two objectives will be implemented as plugins to the TASSEL software package. TASSEL already handles most of the data input, data management, and output functions. Connections will be established between GrainGenes, The Hordeum Toolbox (THT), and USDA small grains genotyping labs by implementing a GDPC (Genomic Diversity and Phenotype Connection) web-service for each database.
We continued work on methods for using high-density DNA markers to accelerate crop improvement.
The first task is to ensure that the best prediction methods are used. In that area, we have completed coding and testing of multi-trait methods that predict performance from DNA marker data alone. These methods leverage information from traits that are strongly affected by genotype and that are correlated with traits more strongly affected by environment. We have also tested multiple ensemble methods and identified one that performed well across a number of diverse datasets.
We have taken part in several empirical tests of genomic selection. In oat, we showed that using genotype information could increase the accuracy of estimates of beta-glucan concentration and therefore improve selection decisions. In barley, we have initiated genomic selection for increased winter-hardiness, with the aim of enabling facultative barley to survive the winter in the Upper Midwest. A second round of genomic selection for improved Fusarium head blight resistance is also underway.
To increase use of these methods, however, we cannot always be directly involved. We are therefore also working to incorporate them into an online data management and analysis resource for the Triticeae Coordinated Agricultural Project, called The Triticeae Toolbox (T3). We have been actively improving and expanding the capacity of the database itself and making the user interface faster and more intuitive. We stay close to the needs of breeders by regularly convening a User Group to provide direction.
Genome-wide markers coupled with phenotypic measurements can be used to develop statistical prediction models. Such model building has been done on single traits. Genetic correlations between multiple traits, however, are pervasive. These correlations indicate that measurements of one trait carry information on other traits. ARS researchers at Ithaca, New York developed three multivariate linear models to take advantage of this information and compared these models to univariate models using simulated and real quantitative traits. We also extended these models to optimize them for traits controlled by different genetic architectures. We showed that the prediction accuracy for a low heritability trait can be increased by multivariate genomic selection when a correlated high heritability trait is available. Further, multiple trait genomic selection exhibits greater prediction performance than single trait genomic selection when phenotypes are not available on all individuals and traits. We explored additional factors affecting the performance of multiple trait genomic selection to guide users of these new models.
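For orientation, one common way to write a two-trait genomic prediction model is given below. It is a generic textbook formulation, not necessarily any of the three models developed here, and all symbols are assumptions introduced for this illustration. It is stated for the balanced case in which both traits are observed on all n individuals.

```latex
\begin{aligned}
\begin{bmatrix} y_1 \\ y_2 \end{bmatrix}
  &= \begin{bmatrix} X_1 & 0 \\ 0 & X_2 \end{bmatrix}
     \begin{bmatrix} \beta_1 \\ \beta_2 \end{bmatrix}
   + \begin{bmatrix} u_1 \\ u_2 \end{bmatrix}
   + \begin{bmatrix} e_1 \\ e_2 \end{bmatrix}, \\[4pt]
\operatorname{Var}\begin{bmatrix} u_1 \\ u_2 \end{bmatrix}
  &= G_0 \otimes K, \qquad
G_0 = \begin{bmatrix}
        \sigma^2_{g_1} & \sigma_{g_{12}} \\
        \sigma_{g_{12}} & \sigma^2_{g_2}
      \end{bmatrix}, \\[4pt]
\operatorname{Var}\begin{bmatrix} e_1 \\ e_2 \end{bmatrix}
  &= R_0 \otimes I_n, \qquad
r_g = \frac{\sigma_{g_{12}}}{\sigma_{g_1}\,\sigma_{g_2}} .
\end{aligned}
```

Here y_t is the vector of phenotypes for trait t, X_t and beta_t are the fixed-effect design matrix and coefficients, u_t is the vector of genetic values, K is a marker-based relationship matrix, and G_0 and R_0 are the 2 x 2 genetic and residual covariance matrices among traits. When the genetic correlation r_g is nonzero, the predictor of u_1 depends on y_2 as well as y_1, which is how a high heritability trait can inform predictions for a correlated low heritability trait.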
The soluble fiber beta-glucan underlies several of the mechanisms that make oat a healthy food. Associations between beta-glucan concentration and specific marker alleles were identified in a worldwide oat collection maintained by the National Small Grains Collection of the National Plant Germplasm System and in elite North American oat cultivars. The value of these associations was further tested in a two-cycle oat selection program by ARS researchers at Ithaca, New York. Sequences of the associated oat markers were compared to the sequenced rice genome to identify candidate genes that might cause the shifts in beta-glucan concentration. Two approaches to using marker associations in selection for beta-glucan concentration (marker-assisted selection and genomic selection) were compared to phenotypic selection. Selection using markers gave greater improvement, and a smaller reduction in genetic variation, than traditional phenotypic selection.
Many approaches to prediction have been developed in statistics. One broad class of approaches is called "ensemble methods," in which data are used to construct many base predictors (or "learners") whose output is then processed into a final prediction. The key questions in improving ensemble methods are (1) how to construct the base learners and (2) how to process their output to maximize prediction accuracy. ARS researchers at Ithaca, New York tested many methods and identified one (conjunctive rule base learners combined into one prediction using partial least squares regression) that performed uniformly well.
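The two questions above can be made concrete with a small sketch. It is a deliberately simplified stand-in for the method named in the text, not the method itself. The base learners here are conjunctive rules drawn at random rather than learned from data, and their outputs are combined with a single-component partial least squares regression. All function names, the 0/1 marker coding, and the toy data are assumptions made for the example.

```python
import random

def make_rule(rng, n_markers, n_conditions=2):
    """A conjunctive rule: a list of (marker_index, required_allele) pairs."""
    markers = rng.sample(range(n_markers), n_conditions)
    return [(m, rng.randint(0, 1)) for m in markers]

def rule_fires(rule, genotype):
    """True only if every condition of the conjunction holds."""
    return all(genotype[m] == allele for m, allele in rule)

def rule_outputs(rules, genotype):
    """Base-learner outputs: 1.0 where a rule fires, otherwise 0.0."""
    return [1.0 if rule_fires(r, genotype) else 0.0 for r in rules]

def fit_pls1(features, y):
    """One-component PLS regression of y on the columns of `features`."""
    n, p = len(y), len(features[0])
    col_means = [sum(row[j] for row in features) / n for j in range(p)]
    y_mean = sum(y) / n
    xc = [[row[j] - col_means[j] for j in range(p)] for row in features]
    yc = [yi - y_mean for yi in y]
    # Weights: covariance of each base learner's output with the response.
    w = [sum(xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = sum(wj * wj for wj in w) ** 0.5
    if norm == 0:
        return col_means, [0.0] * p, 0.0, y_mean
    w = [wj / norm for wj in w]
    # Scores on the single latent component, then regress y on the scores.
    t = [sum(xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    q = sum(ti * yi for ti, yi in zip(t, yc)) / sum(ti * ti for ti in t)
    return col_means, w, q, y_mean

def predict(model, rules, genotype):
    col_means, w, q, y_mean = model
    x = rule_outputs(rules, genotype)
    score = sum((xj - mj) * wj for xj, mj, wj in zip(x, col_means, w))
    return y_mean + q * score

def pearson(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

# Hypothetical toy data: the trait depends on markers 0 and 1 jointly.
rng = random.Random(7)
genos = [[rng.randint(0, 1) for _ in range(8)] for _ in range(300)]
phenos = [3.0 * (g[0] == 1 and g[1] == 1) + rng.gauss(0.0, 0.5)
          for g in genos]
train_g, test_g = genos[:200], genos[200:]
train_y, test_y = phenos[:200], phenos[200:]

rules = [make_rule(rng, n_markers=8) for _ in range(50)]
model = fit_pls1([rule_outputs(rules, g) for g in train_g], train_y)
accuracy = pearson([predict(model, rules, g) for g in test_g], test_y)
print("correlation of predicted and observed in held-out lines:", accuracy)
```

Each rule on its own is a weak predictor. The combining step gives more weight to rules whose output covaries with the trait, which is where the gain over any single base learner comes from.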
Newell, M., Franco, A., Scott, M.P., White, P., Beavis, W., Jannink, J. 2012. Genome-wide association study for oat (Avena sativa L.) beta-glucan concentration using germplasm of worldwide origin. Theoretical and Applied Genetics. 125:1687-1696.
Blake, V.C., Kling, J.G., Hayes, P.M., Jannink, J., Jillella, S.R., Lee, J., Matthews, D.E., Chao, S., Close, T.J., Muehlbauer, G.J., Smith, K.P., Wise, R.P., Dickerson, J.A. 2012. The Hordeum Toolbox - the barley CAP genotype and phenotype resource. The Plant Genome. DOI: 10.3835/plantgenome2012.03.0002.
Lipka, A.E., Feng, T., Wang, Q., Peiffer, J., Li, M., Bradbury, P., Gore, M.A., Buckler IV, E.S., Zhang, Z. 2012. GAPIT: genome association and prediction integrated tool. Bioinformatics. 28(18):2397-2399.
Tian, F., Bradbury, P., Brown, P., Sun, Q., Flint Garcia, S.A., Rocheford, T.R., McMullen, M.D., Holland, J.B., Buckler IV, E.S. 2011. Genome-wide association study of maize identifies genes affecting leaf architecture. Nature Genetics. 43:159-162.
Chia, J., Song, C., Bradbury, P., Costich, D., De Leon, N., Doebley, J., Elshire, R., Gaut, B., Geller, L., Glaubitz, J., Gore, M.A., Guill, K., Holland, J., Hufford, M., Lai, J., Li, M., Liu, X., Lu, Y., McCombie, R., Nelson, R., Poland, J.A., Prasanna, B., Pyhajarvi, T., Rong, T., Sekhon, R., Sun, Q., Tenaillon, M., Tian, F., Wang, J., Xu, X., Zhang, Z., Kaeppler, S.M., Ross-Ibarra, J., McMullen, M.D., Buckler IV, E.S., Zhang, G., Xu, Y., Ware, D. 2012. Maize HapMap2 identifies extant variation from a genome in flux. Nature Genetics. 44:803-807. DOI: 10.1038/ng.2313.
Hung, H., Shannon, L.M., Tian, F., Bradbury, P., Chen, C., Flint Garcia, S.A., McMullen, M.D., Ware, D., Buckler IV, E.S., Doebley, J.F., Holland, J.B. 2012. ZmCCT and the genetic basis of day-length adaptation underlying the postdomestication spread of maize. Proceedings of the National Academy of Sciences. 109:E1913–E1921.
Cook, J.P., McMullen, M.D., Holland, J.B., Tian, F., Bradbury, P., Ross-Ibarra, J., Buckler IV, E.S., Flint-Garcia, S.A. 2012. Genetic architecture of maize kernel composition in the nested association mapping and inbred association panels. Plant Physiology. 158:824-834.