
Research Project: Using Genomic Selection (GS) to Optimize Prediction of Sclerotinia and Agronomic Phenotypes for More Efficient Breeding

Location: Sunflower and Plant Biology Research

Project Number: 3060-21000-043-02-S
Project Type: Non-Assistance Cooperative Agreement

Start Date: Sep 1, 2014
End Date: Aug 31, 2019

The primary goal of this work is to better balance the intensity and efficiency of selection for Sclerotinia resistance and other sunflower traits, so that each generation makes progress on all traits in proportion to their actual value to the producer. The current problem in sunflower breeding is that easy-to-phenotype traits are often given disproportionately high weights in selection relative to hard-to-phenotype traits such as Sclerotinia resistance. The objectives for this year are: (1) Generate genotyping-by-sequencing (GBS) data, with reference to whole-genome scaffolds of progenitor lines, to obtain information on at least 60,000 high-quality polymorphic sites in sunflower in a training population derived from the USDA sunflower breeding program. (2) Conduct model cross-validation to determine the accuracy of GS and the associated prediction intervals. The models will be trained with markers as random effects for each of the traits in the phenotypic data set from 2009 to 2012.
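As an illustration of what "markers as random effects" can mean in practice, the sketch below fits ridge-regression estimates of marker effects, which coincide with BLUP of the marker effects when the shrinkage parameter equals the ratio of residual to marker variance. It is a minimal sketch and not the project's actual model: the function names, the 0/1/2 marker coding, and the fixed shrinkage value `lam` are assumptions made for the example (in practice the variance ratio would typically be estimated, for instance by REML).

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_marker_effects(Z, y, lam):
    """Ridge estimates of marker effects, u = Zc'(Zc Zc' + lam*I)^-1 (y - mean).

    Z   : list of lines, each a list of marker scores (assumed coded 0/1/2)
    y   : phenotypes, one per line
    lam : shrinkage parameter (assumed known here; hypothetical value)
    """
    n, p = len(Z), len(Z[0])
    mu = sum(y) / n
    # Centre each marker column so the intercept is the phenotypic mean.
    col_means = [sum(Z[i][j] for i in range(n)) / n for j in range(p)]
    Zc = [[Z[i][j] - col_means[j] for j in range(p)] for i in range(n)]
    # n x n system: cheaper than p x p when markers far outnumber lines.
    K = [[sum(Zc[i][k] * Zc[j][k] for k in range(p)) + (lam if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    alpha = solve(K, [yi - mu for yi in y])
    u = [sum(Zc[i][j] * alpha[i] for i in range(n)) for j in range(p)]
    return mu, col_means, u


def predict(model, Z_new):
    """Predict phenotypes of new lines from their marker scores."""
    mu, col_means, u = model
    return [mu + sum((z[j] - col_means[j]) * u[j] for j in range(len(u)))
            for z in Z_new]
```

Solving the system in the space of lines rather than markers is a design choice that suits the setting described here, where tens of thousands of markers are scored on a much smaller training population.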

To improve breeding program efficiency with new technology, the new paradigm must be thoroughly vetted and shown to produce better results than the old paradigm. We plan to accomplish this in year 1 of this work. First, we must ensure that the relatively new technology of GBS yields enough high-quality polymorphic sites in our breeding program materials to provide an adequate sampling of the sunflower genome for GS. We expect to far exceed the baseline goal of 60,000 markers by using progenitor lines with whole-genome sequence to impute missing markers in the GBS data. Next, we plan to test all applicable statistical methods for GS using cross-validation of subsampled training and selection data sets, to simulate the accuracy that could be achieved in an actual breeding program. These data sets will be stratified so that the training and selection sets do not come from the same pedigree, which leads to more realistic estimates of accuracy. Phenotypic data for these sets have already been gathered. The individual results of the subsamples can be pooled and summarized as an average accuracy and prediction intervals. The latter statistic, in particular, will tell the breeder how much statistical certainty to expect when real selection candidates are evaluated in future years of the project. The long-term plan for genomic selection in the USDA breeding program is to reduce the size of F4 (preliminary) trials for yield and Sclerotinia evaluation through genomic prediction of performance prior to testcross development and evaluation. The development of preliminary testcrosses is the most laborious work in our nurseries because of the large number of lines at that development stage every year. In year 2 and beyond, we will use the trained model from year 1 to conduct selection. We will analyze how resources need to be reallocated to sustain the effort indefinitely in the breeding program, and what opportunities are afforded by more efficient selection in the F4. This information will be published in appropriate scientific journals and passed on to commercial sunflower breeders as a model to follow to better balance selection for agronomic traits and Sclerotinia resistance in sunflower.
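The pedigree-stratified cross-validation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's procedure: whole families are assigned to folds so that training and selection sets never share a pedigree, accuracy is taken to be the correlation between predicted and observed phenotypes, and the pooled fold results are summarized as a mean and a crude empirical percentile interval (one plausible reading of the interval statistic). All names (`family_folds`, `cross_validate`, `summarize`, `fit_predict`) are hypothetical.

```python
import random
from statistics import mean


def pearson(a, b):
    """Pearson correlation, or None when either input has no variance."""
    ma, mb = mean(a), mean(b)
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((x - mb) ** 2 for x in b)
    if va == 0 or vb == 0:
        return None
    cov = sum((x - ma) * (z - mb) for x, z in zip(a, b))
    return cov / (va * vb) ** 0.5


def family_folds(families, k, seed=0):
    """Assign whole families to k folds so no family is split across folds."""
    ids = sorted(set(families))
    random.Random(seed).shuffle(ids)
    fold_of = {fam: i % k for i, fam in enumerate(ids)}
    return [fold_of[f] for f in families]


def cross_validate(X, y, families, fit_predict, k=5, repeats=10):
    """Return one accuracy per held-out fold, over repeated random fold assignments.

    fit_predict(X_train, y_train, X_test) is any user-supplied model
    that returns predictions for X_test.
    """
    accuracies = []
    for rep in range(repeats):
        folds = family_folds(families, k, seed=rep)
        for held_out in range(k):
            train = [i for i, f in enumerate(folds) if f != held_out]
            test = [i for i, f in enumerate(folds) if f == held_out]
            preds = fit_predict([X[i] for i in train], [y[i] for i in train],
                                [X[i] for i in test])
            r = pearson(preds, [y[i] for i in test])
            if r is not None:  # skip folds where correlation is undefined
                accuracies.append(r)
    return accuracies


def summarize(accuracies, level=0.95):
    """Pool fold accuracies into a mean and an empirical percentile interval
    (nearest lower rank, no interpolation)."""
    s = sorted(accuracies)
    lo = s[int((1 - level) / 2 * (len(s) - 1))]
    hi = s[int((1 + level) / 2 * (len(s) - 1))]
    return mean(s), (lo, hi)
```

Grouping by family rather than by individual line matters because close relatives in both sets would let the model exploit pedigree relationships, which tends to overstate the accuracy a breeder would see on genuinely new material.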