
Research Project: Increasing Sugar Beet Productivity and Sustainability through Genetic and Physiological Approaches

Location: Sugarbeet and Potato Research

Title: A new strategy for using historical imbalanced yield data to conduct genome-wide association studies and develop genomic prediction models for wheat breeding

Authors:
Chu, Chenggen
RUDD, JACKIE - Texas A&M Agrilife
Chen, Ming-Shun
WANG, SHICHEN - Texas A&M Agrilife
IBRAHIM, AMIR - Texas A&M Agrilife
XUE, QINGWU - Texas A&M Agrilife
DEVKOTA, RAVINDRA - Texas A&M Agrilife
BAKER, JASON - Texas A&M Agrilife
BAKER, SHANNON - Texas A&M Agrilife
SIMONEAUX, BRYAN - Texas A&M Agrilife
OPENA, GERALDINE - Texas A&M Agrilife
DONG, HAIXIAO - Washington State University

Submitted to: Molecular Breeding
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 3/2/2022
Publication Date: 3/22/2022
Citation: Chu, C.N., Rudd, J.C., Chen, M., Wang, S., Ibrahim, A.M., Xue, Q., Devkota, R.N., Baker, J.A., Baker, S., Simoneaux, B., Opena, G., Dong, H. 2022. A new strategy for using historical imbalanced yield data to conduct genome-wide association studies and develop genomic prediction models for wheat breeding. Molecular Breeding. 42. Article e18.

Interpretive Summary: Crop yield is one of the most important factors for farmers worldwide. However, yield tests are very costly because they must be run as replicated trials in many locations over multiple years. Plant breeders typically collect yield data on genetic lines during the breeding process. Unfortunately, those yield data cannot be used directly to identify yield-associated regions in the genome because they were collected from a different set of genetic lines each year under varying environmental conditions. In this study, we used statistical methods to evaluate yield data that breeders had collected from different sets of Texas wheat lines in different years and at different locations, and then selected the data appropriate for genetic study. Using this strategy, we identified many genomic regions associated with grain yield, most of which were consistent with results from previous studies. We then developed statistical models to predict the grain yield of each breeding line based solely on its genetic information. Using a similar approach, we also identified genomic regions associated with insect resistance in those breeding lines. Taken together, this research established a new way of using historical breeding data for genetic studies aimed at crop improvement. This technique will provide a cost-effective way for breeders to efficiently select lines with high yield potential as well as disease and pest resistance.

Technical Abstract: Using imbalanced historical yield data to predict performance and select new lines is an arduous breeding task. Genome-wide association studies (GWAS) and high-throughput genotyping based on sequencing techniques can increase prediction accuracy. An association mapping panel of 227 Texas elite (TXE) wheat breeding lines was used for GWAS and as a training population to develop prediction models for grain yield selection. The yield data for the TXE lines were imbalanced: they were collected from 102 environments (year-by-location combinations) over ten years, with 40 - 66 lines tested each year at 6 - 14 locations and 38 - 41 lines repeated in any two consecutive years. Based on correlations among yield data collected from different environments within two adjacent years, and on the genetic variance estimated in each environment, yield data from 87 environments were selected and assigned to two correlation-based groups. Best linear unbiased estimates (BLUEs) of yield from each group, along with each line's reaction to greenbug and Hessian fly, were used for GWAS to reveal genomic regions associated with yield and insect resistance. A total of 74 genomic regions were associated with grain yield, two of which were detected in both correlation-based groups. Greenbug resistance in TXE lines was mainly controlled by Gb3 on chromosome 7DL, in addition to two novel regions on 3DL and 6DS, and Hessian fly resistance was conferred by a region on 1AS. Genomic prediction models developed in the two correlation-based groups were validated using a set of 105 recently developed advanced breeding lines; the model from correlation-based group G2 was more reliable for prediction. This research not only identified genomic regions associated with wheat grain yield but also established a method of using historical imbalanced breeding data for genetic analysis to develop a genomic prediction model for improving grain yield.
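
The correlation-based screening of environments described above can be illustrated with a minimal Python sketch. It is not the authors' procedure: the abstract does not state the correlation threshold, the grouping rule, or how genetic variance entered the decision, so the function names, the `min_r` and `min_shared` parameters, the "correlates with at least one other environment" rule, and the toy yield values are all assumptions made for illustration. The sketch only shows the general idea of correlating environments over the lines they share when each environment tested a different set of lines.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists; None if either is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None
    return sxy / sqrt(sxx * syy)


def pairwise_env_correlations(yields, min_shared=3):
    """Correlate each pair of environments using only the lines tested in both.

    yields: {environment: {line: yield}}. Pairs sharing fewer than
    min_shared lines (a hypothetical cutoff) are skipped.
    """
    result = {}
    for a, b in combinations(sorted(yields), 2):
        shared = sorted(set(yields[a]) & set(yields[b]))
        if len(shared) < min_shared:
            continue
        r = pearson([yields[a][l] for l in shared],
                    [yields[b][l] for l in shared])
        if r is not None:
            result[(a, b)] = r
    return result


def select_environments(yields, min_r=0.3, min_shared=3):
    """Keep environments correlated at >= min_r with at least one other.

    The threshold and the selection rule are illustrative assumptions,
    not criteria taken from the publication.
    """
    keep = set()
    for (a, b), r in pairwise_env_correlations(yields, min_shared).items():
        if r >= min_r:
            keep.update((a, b))
    return sorted(keep)


# Toy, made-up data: three environments with partially overlapping lines.
yields = {
    "2015_LocA": {"L1": 50, "L2": 55, "L3": 60, "L4": 58},
    "2016_LocA": {"L2": 52, "L3": 59, "L4": 57, "L5": 61},
    "2016_LocB": {"L2": 60, "L3": 50, "L4": 53, "L5": 49},
}
selected = select_environments(yields)
```

With this toy data, the two `LocA` environments rank the shared lines similarly and are retained, while `2016_LocB` ranks them in roughly the opposite order and is dropped. A fuller treatment would also restrict comparisons to adjacent years and check genetic variance within each environment, as the abstract describes.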