Publication #375270

Research Project: Gene Discovery and Designing Soybeans for Food, Feed, and Industrial Applications

Location: Plant Genetics Research

Title: Genomic prediction using training population design in interspecific soybean populations

Authors:
BECHE, EDUARDO - University of Missouri
Gillman, Jason
Song, Qijian
Nelson, Randall
BEISSINGER, TIMOTHY - Georg August University
DECKER, JARED - University of Missouri
SHANNON, GROVER - University of Missouri
SCABOO, ANDREW - University of Missouri

Submitted to: Molecular Breeding
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 1/11/2021
Publication Date: 2/10/2021
Citation: Beche, E., Gillman, J.D., Song, Q., Nelson, R.L., Beissinger, T., Decker, J., Shannon, G., Scaboo, A.M. 2021. Genomic prediction using training population design in interspecific soybean populations. Molecular Breeding. 41. Article e15.

Interpretive Summary: Breeding and genetic work over the last few decades has revealed that important crop traits (e.g., seed yield, seed quality, and seed composition) are controlled by a large number of genes, each of which has a relatively small effect. Marker-assisted breeding, which targets a small number of genes, has largely failed to make substantive gains for such traits. An alternative approach, Genomic Prediction, builds and applies mathematical models that relate whole-genome genetic data directly to traits of interest. This technique has revolutionized plant and animal breeding and is now widely applied. All such models developed for soybean have focused only on domesticated soybean, completely omitting the genetic potential present in wild relatives. This omission matters because domesticated soybean has remarkably low genetic diversity, which may limit long-term breeding gains. In this study, we tested three unique populations derived from crosses between wild and domesticated soybean lines in a large number of field environments. We optimized Genomic Prediction for these populations, and our results both confirm and extend previous studies. More importantly, our results will facilitate long-term improvement of the soybean crop through the use of novel genetic diversity derived from wild soybeans.

Technical Abstract: Agronomically important traits generally have a complex genetic architecture in which many genes each have a small and largely additive effect. Genomic prediction has been demonstrated to increase genetic gain and efficiency in plant breeding programs beyond what marker-assisted selection and phenotypic selection achieve. The objective of this study was to evaluate the impact of allelic origin, marker density, training population size, and cross-validation scheme on the accuracy of genomic prediction models in an interspecific soybean nested association mapping (NAM) panel. Three cross-validation schemes were used: (a) Within-Family (WF): training and prediction are performed exclusively within each family; (b) Across All Families (AF): all individuals from the three families are randomly assigned to either the training or the validation set; (c) Leave One Family Out (LFO): each family is predicted using a training set that contains the other two families. Predictive abilities increased with training population size up to 350 individuals, but no significant gains were noted beyond 250 individuals in the training population. The number of markers had a limited impact on the observed predictive ability across traits; increasing the number of markers used in the model above 1000 produced no significant increase in prediction accuracy. Predictive abilities for the AF method were not significantly different from those for the WF method, and predictive abilities across populations for the WF method ranged from 0.58 to 0.70 for maturity, protein, meal, and oil. Our results also showed encouraging prediction accuracies for grain yield (0.58–0.69) using the WF method. Partitioning genomic prediction between G. max and G. soja alleles revealed useful information for selecting material with a larger allele contribution from both parents and could accelerate allele introgression from exotic germplasm into the elite soybean gene pool.
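
The three cross-validation schemes differ only in how individuals are assigned to the training and validation sets. The Python sketch below illustrates that difference on simulated data. It is not the authors' pipeline: the abstract does not name the prediction model, so ridge regression on marker genotypes is used here as a stand-in, and predictive ability is assumed to mean the Pearson correlation between observed and predicted phenotypes. The family labels (F1, F2, F3), the 80/20 split, the penalty value, and all function names are hypothetical.

```python
import numpy as np


def within_family_splits(family, train_frac=0.8, seed=0):
    """WF: train and validate inside each family separately."""
    rng = np.random.default_rng(seed)
    splits = {}
    for fam in np.unique(family).tolist():
        idx = rng.permutation(np.flatnonzero(family == fam))
        n_train = int(round(train_frac * idx.size))
        splits[fam] = (idx[:n_train], idx[n_train:])
    return splits


def across_families_split(family, train_frac=0.8, seed=0):
    """AF: pool all families, then assign individuals at random."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(family.size)
    n_train = int(round(train_frac * idx.size))
    return idx[:n_train], idx[n_train:]


def leave_family_out_splits(family):
    """LFO: predict each family from a model trained on the other families."""
    return {
        fam: (np.flatnonzero(family != fam), np.flatnonzero(family == fam))
        for fam in np.unique(family).tolist()
    }


def ridge_predict(X_train, y_train, X_test, lam=1.0):
    """Ridge regression on centered markers (assumed stand-in model)."""
    mu = X_train.mean(axis=0)
    Xc = X_train - mu
    y_mean = y_train.mean()
    # Dual form: solves an n x n system, cheaper when markers >> individuals.
    K = Xc @ Xc.T
    alpha = np.linalg.solve(K + lam * np.eye(K.shape[0]), y_train - y_mean)
    beta = Xc.T @ alpha
    return y_mean + (X_test - mu) @ beta


def predictive_ability(y_obs, y_pred):
    """Pearson correlation between observed and predicted phenotypes."""
    return float(np.corrcoef(y_obs, y_pred)[0, 1])


def simulate(n_per_family=100, n_markers=500, seed=1):
    """Simulated genotypes (0/1/2) and an additive trait; illustration only."""
    rng = np.random.default_rng(seed)
    family = np.repeat(["F1", "F2", "F3"], n_per_family)
    X = rng.integers(0, 3, size=(family.size, n_markers)).astype(float)
    effects = rng.normal(0.0, 1.0, n_markers)
    g = X @ effects
    y = g + rng.normal(0.0, g.std(), family.size)
    return X, y, family


X, y, family = simulate()
for fam, (tr, te) in leave_family_out_splits(family).items():
    r = predictive_ability(y[te], ridge_predict(X[tr], y[tr], X[te], lam=100.0))
    print(f"LFO {fam}: r = {r:.2f}")
```

Swapping `leave_family_out_splits` for `within_family_splits` in the loop runs the WF scheme instead, and `across_families_split` returns the single pooled split used for AF. Because the data are simulated, the printed correlations say nothing about the values reported in the abstract.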