Location: Livestock and Range Research Laboratory
Title: Increasing accuracy of genomic selection in presence of high density marker panels through the prioritization of relevant polymorphisms
Author
|Chang, Liny-yun - Abs Global|
|Aggrey, Samuel - University Of Georgia|
|Rekaya, Romdhane - University Of Georgia|
Submitted to: BMC Genetics
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 2/4/2019
Publication Date: 2/4/2019
Citation: Chang, L., Toghiani, S., Aggrey, S.E., Rekaya, R. 2019. Increasing accuracy of genomic selection in presence of high density marker panels through the prioritization of relevant polymorphisms. BMC Genetics. 1-10. https://doi.org/10.1186/s12863-019-0720-5.
DOI: https://doi.org/10.1186/s12863-019-0720-5
Interpretive Summary: Implementing genetic evaluation of livestock using high-throughput genomic information becomes a challenge when the data are too large. This high dimensionality of genomic data precludes us from fully utilizing high-density marker panels. The objective of this study was to reduce the dimensionality of high-throughput data by prioritizing the informative markers from high-density marker panels using the fixation index score. Applying this criterion in the presence of high-density marker panels helps track the most influential markers in the genome for a given trait. This approach makes better use of current statistical methods for genomic evaluation, resulting in an increase in the accuracy of breeding values.
Technical Abstract: It is becoming clear that the increase in the density of marker panels and even the use of sequence data did not result in any meaningful increase in the accuracy of genomic selection (GS) using either regression (RM) or variance component (VC) approaches. This is in part due to the limitations of current methods. Association models are heavily over-parameterized and suffer from severe co-linearity and lack of statistical power. Even when the variant effects are not directly estimated, as in VC-based approaches, the genomic relationships did not improve after the marker density exceeded a certain threshold. SNP prioritization based on fixation index (FST) scores was used to track the majority of significant QTL and to reduce the dimensionality of the association model. Two populations with average LD between adjacent markers of 0.3 (P1) and 0.7 (P2) were simulated. In both populations, the genomic data consisted of 400 K SNP markers distributed on 10 chromosomes. The density of simulated genomic data mimics roughly 1.2 million SNP markers in the bovine genome. The genomic relationship matrix (G) was calculated for each set of SNPs selected based on their FST scores, and similar numbers of SNPs were selected randomly for comparison. Using all 400 K SNPs, 46% of the off-diagonal elements (OD) were between -0.01 and 0.01. The same portion was 31, 23 and 16% when 80 K, 40 K and 20 K SNPs were selected based on FST scores. For randomly selected 20 K SNP subsets, around 33% of the OD fell within the same range. Genomic similarity computed using SNPs selected based on FST scores was always higher than using the same number of SNPs selected randomly. Maximum accuracies of 0.741 and 0.828 were achieved when 20 K and 10 K SNPs were selected based on FST scores in P1 and P2, respectively.
Genomic similarity can be increased by reducing the number of selected SNPs, but doing so also decreases the percentage of genetic variation explained by the selected markers. Finding the balance between these two parameters could optimize the accuracy of GS in the presence of high density marker panels.
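
The workflow described in the abstract (score each SNP by FST, keep the top-ranked SNPs, build G from them, and check how many off-diagonal elements fall between -0.01 and 0.01) can be illustrated with a minimal sketch. Everything specific in it is an assumption, since the abstract does not give the formulas: FST is computed here as (H_T - H_S) / H_T from expected heterozygosities, G follows the common centered and scaled form ZZ' / (2 Σ p(1 - p)), and the subpopulations are simply supplied by the caller as hypothetical lists of animal indices. All function names are illustrative and are not taken from the paper.

```python
def allele_freq(genotypes, rows, j):
    """Frequency of the counted allele at SNP j among the animals in rows.
    Genotypes are coded as 0, 1 or 2 copies of that allele."""
    return sum(genotypes[i][j] for i in rows) / (2.0 * len(rows))


def fst_scores(genotypes, groups):
    """Per-SNP FST = (H_T - H_S) / H_T (assumed definition).
    groups is a list of lists of animal indices (hypothetical subpopulations)."""
    total = sum(len(g) for g in groups)
    weights = [len(g) / total for g in groups]
    scores = []
    for j in range(len(genotypes[0])):
        freqs = [allele_freq(genotypes, g, j) for g in groups]
        p_bar = sum(w * p for w, p in zip(weights, freqs))
        h_t = 2.0 * p_bar * (1.0 - p_bar)          # total expected heterozygosity
        h_s = sum(w * 2.0 * p * (1.0 - p)          # mean within-group heterozygosity
                  for w, p in zip(weights, freqs))
        scores.append((h_t - h_s) / h_t if h_t > 0 else 0.0)
    return scores


def top_snps(scores, k):
    """Indices of the k SNPs with the highest FST, returned in map order."""
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


def grm(genotypes, snps):
    """Genomic relationship matrix from the selected SNPs (assumed form):
    G = ZZ' / (2 * sum(p * (1 - p))), with Z the genotypes centered by 2p."""
    n = len(genotypes)
    p = [allele_freq(genotypes, range(n), j) for j in snps]
    denom = 2.0 * sum(q * (1.0 - q) for q in p)
    if denom == 0:
        raise ValueError("all selected SNPs are monomorphic")
    z = [[genotypes[i][j] - 2.0 * q for j, q in zip(snps, p)] for i in range(n)]
    return [[sum(a * b for a, b in zip(z[i], z[k])) / denom for k in range(n)]
            for i in range(n)]


def offdiag_fraction(g, bound=0.01):
    """Share of off-diagonal elements of G lying in [-bound, bound]."""
    n = len(g)
    vals = [g[i][k] for i in range(n) for k in range(i + 1, n)]
    return sum(1 for v in vals if -bound <= v <= bound) / len(vals)


# Toy data: 4 animals x 3 SNPs, split into two hypothetical subpopulations.
genotypes = [[2, 0, 1], [2, 1, 1], [0, 1, 1], [0, 0, 1]]
groups = [[0, 1], [2, 3]]
scores = fst_scores(genotypes, groups)
selected = top_snps(scores, 1)
g_matrix = grm(genotypes, selected)
near_zero_share = offdiag_fraction(g_matrix)
```

Repeating the last four steps for several values of k, and for random subsets of the same size, gives the kind of comparison the abstract reports. With real panels of hundreds of thousands of SNPs, an array library would be needed in place of these nested lists.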