STEWART-BROWN, BENJAMIN - University Of Georgia
LI, ZENGLU - University Of Georgia
Submitted to: G3, Genes/Genomes/Genetics
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 5/10/2019
Publication Date: 5/14/2019
Citation: Stewart-Brown, B.B., Song, Q., Vaughn, J., Li, Z. 2019. Genomic selection for yield and seed composition traits within an applied soybean breeding program. G3, Genes/Genomes/Genetics. https://doi.org/10.1534/g3.118.200917.
Interpretive Summary: Plant breeders have had trouble improving yield and seed quality in crops like soybean because these traits are often controlled by many genes. Now that the genes across whole genomes have been identified, scientists are better able to follow many genes at once during the breeding process. Previous reports for maize and wheat showed that factors such as the traits themselves, the population size, and how frequently particular genes occur influence the accuracy of using genome information for breeding. Researchers at the University of Georgia and USDA-ARS explored the effects of these factors on the accuracy of predicting seed yield, seed protein, and oil content in soybean, using about 500 breeding lines with different genetic backgrounds that were planted at different locations and in different years. Through this work, the scientists proposed models and statistics that can greatly improve the predictive ability of genomic information when breeding for protein and oil content in soybean. The results will help breeders in industry, government, and universities use genomic DNA information to develop better soybeans.
Technical Abstract: Genomic selection (GS) has become viable for selection of quantitative traits for which marker-assisted selection has often proven less effective. The potential of GS for soybean was characterized using 483 elite breeding lines genotyped with BARCSoySNP6K iSelect BeadChips. Cross validation was performed using ridge regression best linear unbiased prediction (RR-BLUP), and predictive abilities (rMP) of 0.81, 0.71, and 0.26 for protein, oil, and yield, respectively, were achieved at the largest tested training set size. Minimal differences were observed when comparing different marker densities, and there appeared to be inflation in rMP due to population structure. For comparison purposes, two additional methods to predict breeding values for lines of four bi-parental populations within the GS dataset were tested. The first method predicted within each bi-parental population (WP method) and utilized a training set of full-sibs of the validation set. The second method predicted across populations (AP method) and utilized a training set of all remaining breeding lines except for full-sibs of the validation set. The AP method is more practical, as the WP method would likely delay the breeding cycle and leverage smaller training sets. Averaging across populations for protein and oil content, rMP for the AP method (0.55, 0.30) approached rMP for the WP method (0.60, 0.52). Though comparable, rMP for yield was low for both the AP and WP methods (0.12, 0.13). Based on increases in rMP as training sets increased and the effectiveness of the WP versus the AP method, the AP method could potentially improve with larger training sets and increased relatedness between training and validation sets.
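The WP and AP comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline. It makes several assumptions: RR-BLUP marker effects are computed as a ridge regression with a fixed shrinkage parameter `lam` (in practice the variance components would be estimated, for example by REML), the intercept is approximated by the training mean, and rMP is taken to be the Pearson correlation between predicted values and observed phenotypes. All function names, the synthetic genotypes and phenotypes, and the population sizes are hypothetical and chosen only for illustration.

```python
import numpy as np

def rrblup_fit(Z, y, lam):
    """Ridge-type marker effects: u = Zc' (Zc Zc' + lam*I)^-1 (y - mean(y)).

    lam plays the role of sigma_e^2 / sigma_u^2; it is fixed here (assumption).
    """
    mu = y.mean()
    center = Z.mean(axis=0)
    Zc = Z - center
    K = Zc @ Zc.T + lam * np.eye(Zc.shape[0])
    u = Zc.T @ np.linalg.solve(K, y - mu)
    return mu, center, u

def rrblup_predict(model, Z):
    """Predicted value = intercept + centered genotypes times marker effects."""
    mu, center, u = model
    return mu + (Z - center) @ u

def predictive_ability(Z, y, train_idx, valid_idx, lam):
    """Correlation between predictions and observed phenotypes in the validation set."""
    model = rrblup_fit(Z[train_idx], y[train_idx], lam)
    pred = rrblup_predict(model, Z[valid_idx])
    return np.corrcoef(pred, y[valid_idx])[0, 1]

def wp_ap_splits(family, target, rng, valid_frac=0.5):
    """Build one validation set plus WP and AP training sets for a target population.

    WP trains on full-sibs of the validation lines; AP trains on every line
    outside the target population, so no full-sibs enter the AP training set.
    """
    members = rng.permutation(np.flatnonzero(family == target))
    n_valid = int(len(members) * valid_frac)
    valid = members[:n_valid]
    wp_train = members[n_valid:]
    ap_train = np.flatnonzero(family != target)
    return valid, wp_train, ap_train

# Synthetic data only: 4 hypothetical populations of 50 lines, 500 markers coded -1/0/1.
rng = np.random.default_rng(0)
family = np.repeat(np.arange(4), 50)
Z = (rng.integers(0, 3, size=(200, 500)) - 1).astype(float)
beta = rng.normal(0.0, 0.1, size=500)            # simulated marker effects
y = Z @ beta + rng.normal(0.0, 1.0, size=200)    # simulated phenotypes
lam = 1.0 / 0.1**2                               # sigma_e^2 / sigma_u^2 used in the simulation

valid, wp_train, ap_train = wp_ap_splits(family, target=0, rng=rng)
r_wp = predictive_ability(Z, y, wp_train, valid, lam)
r_ap = predictive_ability(Z, y, ap_train, valid, lam)
print(f"WP training size {len(wp_train)}, predictive ability {r_wp:.2f}")
print(f"AP training size {len(ap_train)}, predictive ability {r_ap:.2f}")
```

Because the synthetic genotypes carry no family structure, the numbers printed here say nothing about soybean; the sketch only shows how the two training sets differ in composition and size for the same validation lines.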