Title: Genomic predictability of interconnected bi-parental maize populations

Author
RIEDELSHEIMER, CHRISTIAN - University Of Hohenheim
ENDELMAN, JEFFREY - Cornell University
STANGE, MICHAEL - University Of Hohenheim
SORRELLS, MARK - Cornell University
Jannink, Jean-Luc
MELCHINGER, ALBRECHT - University Of Hohenheim

Submitted to: Genetics
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 3/25/2013
Publication Date: 6/1/2013
Publication URL: http://DOI: 10.1534/genetics.113.150227/-/DC1
Citation: Riedelsheimer, C., Endelman, J., Stange, M., Sorrells, M., Jannink, J., Melchinger, A. 2013. Genomic predictability of interconnected bi-parental maize populations. Genetics. 194:493-503.

Interpretive Summary: In plant breeding populations there are often groups of individuals that are more closely related to each other than they are to individuals outside their group. One such type of group is a large set of progeny from a single family. This group structure is important for genomic selection (GS), the process of predicting the performance of individuals from their marker data. GS requires a training population (TP) of individuals with both genotypes and phenotypes to calculate parameters for the prediction model. The research here used a set of five connected maize (Zea mays L.) families derived from four parents to systematically investigate how the composition of the TP affects the prediction accuracy within families. A total of 635 doubled haploid (DH) progeny genotyped with 56,110 DNA markers were evaluated for five traits, including Gibberella ear rot severity and three kernel yield component traits. Prediction accuracies for siblings that share both parents closely followed expectations based on analytical results regarding the influence of sample size and heritability of the trait. Prediction accuracies declined strongly if the TP consisted of half-sibs, which share only one parent with the predicted individuals, but better results could be achieved if the TP contained half-sibs related through both parents instead of only one. Once both parents were represented in the TP, including more crosses was not favorable when the size of the TP was held constant. Adding unrelated individuals to the TP resulted in negative or reduced prediction accuracies.
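The training and validation workflow described in the summary can be sketched in a few lines. This is a minimal illustration, not the authors' method: it assumes a ridge-regression marker model (one common choice for GS), and every name, the simulated data, the heritability of 0.5, and the penalty `lam` are hypothetical. Prediction accuracy is taken here as the correlation between predicted and observed values in the validation set.

```python
import numpy as np

def ridge_marker_effects(X_train, y_train, lam):
    """Estimate marker effects by ridge regression on the training population."""
    x_mean = X_train.mean(axis=0)
    y_mean = y_train.mean()
    Xc = X_train - x_mean
    n = Xc.shape[0]
    # Dual form: solve an n x n system, since markers usually outnumber lines.
    alpha = np.linalg.solve(Xc @ Xc.T + lam * np.eye(n), y_train - y_mean)
    beta = Xc.T @ alpha
    return beta, x_mean, y_mean

def predict(X_new, beta, x_mean, y_mean):
    """Predict phenotypes of new lines from their marker genotypes."""
    return y_mean + (X_new - x_mean) @ beta

def prediction_accuracy(y_pred, y_obs):
    """Correlation between predicted and observed values."""
    return float(np.corrcoef(y_pred, y_obs)[0, 1])

# Simulated toy data (assumption): homozygous lines coded -1/+1 at each marker.
rng = np.random.default_rng(0)
n_lines, n_markers = 200, 500
X = rng.choice([-1.0, 1.0], size=(n_lines, n_markers))
true_effects = rng.normal(0.0, 1.0, n_markers)
g = X @ true_effects
y = g + rng.normal(0.0, g.std(), n_lines)  # noise variance ~ genetic variance

h2 = 0.5                          # assumed heritability of the simulated trait
lam = n_markers * (1 - h2) / h2   # assumed shrinkage penalty

train, valid = slice(0, 150), slice(150, 200)
beta, x_mean, y_mean = ridge_marker_effects(X[train], y[train], lam)
accuracy = prediction_accuracy(predict(X[valid], beta, x_mean, y_mean), y[valid])
print(f"prediction accuracy in the validation set: {accuracy:.2f}")
```

To mimic the comparisons in the summary, the rows assigned to `train` would be chosen by family (full sibs, half-sibs, or unrelated crosses) while the validation family is held fixed.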

Technical Abstract: Intense structuring of plant breeding populations leads to new challenges for genomic selection (GS) not encountered in animal breeding. One important open question is how the training population (TP) should be constructed from multiple related or unrelated small bi-parental families. Knowing the predictability for progeny from individual crosses with such a TP represents a key element for implementing genomic prediction in plant breeding. Here, we used a set of five interconnected maize (Zea mays L.) populations of doubled haploid (DH) lines derived from four parents to systematically investigate how the composition of the TP affects the prediction accuracy for lines from individual crosses. A total of 635 DH lines genotyped with 56,110 SNPs were evaluated for five traits, including Gibberella ear rot severity and three kernel yield component traits. Prediction accuracies within full-sib families of DH lines closely followed expectations based on analytical results regarding the influence of sample size and heritability of the trait. Prediction accuracies declined strongly if full-sib DH lines were replaced by half-sib DH lines, but statistically significantly better results could be achieved if half-sib DH lines were available from both parents of the validation population instead of only one. Once both parents of the validation population were represented in the TP, including more crosses was not favorable when the size of the TP was held constant. Unrelated crosses showing negative linkage phase similarities with the validation population resulted in negative prediction accuracies when used alone and in reduced prediction accuracies when used in combination with related families. We therefore suggest identifying and excluding such crosses from the TP. Moreover, the observed variability among populations and traits suggests that these uncertainties should be accounted for in models optimizing the allocation of resources in GS.
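The abstract does not state which analytical results were used for the expected accuracy. One widely cited deterministic expectation, due to Daetwyler et al., relates accuracy to training population size N, heritability h2, and the number of independently segregating chromosome segments M_e as r = sqrt(N h2 / (N h2 + M_e)). The sketch below assumes that formula purely for illustration; the function name and the example values of `m_e` are hypothetical and are not taken from the study.

```python
import math

def expected_accuracy(n_train, h2, m_e):
    """Expected prediction accuracy r = sqrt(N*h2 / (N*h2 + M_e)).

    n_train: number of phenotyped and genotyped lines in the TP
    h2:      heritability of the trait (between 0 and 1)
    m_e:     effective number of independent chromosome segments (assumed value)
    """
    if n_train < 0 or not 0.0 <= h2 <= 1.0 or m_e <= 0:
        raise ValueError("need n_train >= 0, 0 <= h2 <= 1, m_e > 0")
    return math.sqrt(n_train * h2 / (n_train * h2 + m_e))

# Accuracy rises with TP size and heritability; m_e = 50 is a made-up value.
for n in (25, 50, 100, 200):
    print(n, round(expected_accuracy(n, h2=0.5, m_e=50), 3))
```

Under this expression, doubling the TP size has the same effect on expected accuracy as doubling the heritability, which is one way to read the stated joint influence of sample size and heritability.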