DEFINING THE GENETIC DIVERSITY AND STRUCTURE OF THE SOYBEAN GENOME AND APPLICATIONS TO GENE DISCOVERY IN SOYBEAN AND WHEAT GERMPLASM
Location: Soybean Genomics and Improvement
Title: Development and evaluation of a high-density infinium iSelect Beadchip SoySNP50K
Submitted to: PLoS Genetics
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: December 18, 2012
Publication Date: January 25, 2013
Citation: Song, Q., Hyten, D., Jia, G., Quigley, C.V., Fickus, E.W., Nelson, R.L., Cregan, P.B. 2013. Development and evaluation of a high-density infinium iSelect Beadchip SoySNP50K. PLoS Genetics. 8(1):e54985.
Interpretive Summary: DNA markers are based upon specific differences in the DNA sequence of different individuals of a species. Our objective was to identify single nucleotide polymorphism (SNP) DNA markers distributed across the 20 pairs of soybean chromosomes so that we could analyze the differences in the DNA of different sets of soybean accessions from the USDA Soybean Germplasm Collection. The resulting SNP DNA marker data would allow us to identify regions of the soybean genome that were under selection during the domestication of cultivated soybean from wild soybean, the process that gave rise to the Asian soybean landraces maintained in the USDA Soybean Germplasm Collection. In addition, we can identify regions of selection that have resulted from soybean breeding in North America over the past 70 years. We determined the DNA sequence of a small set of diverse cultivated and wild soybean accessions and identified a set of 60,800 SNP DNA markers that were spread across the 20 pairs of soybean chromosomes. These 60,800 SNP DNA markers were incorporated into an Illumina Inc. beadchip, which permitted the rapid analysis of many soybean DNA samples in parallel. We analyzed 96 wild soybean accessions, 96 Asian landrace accessions and 96 North American elite soybean cultivars and determined that the beadchip contained 47,337 SNP DNA markers that were useful for distinguishing among and/or within the sets of 96 accessions. Many regions of the soybean chromosomes appeared to differ between the wild soybeans and the Asian landraces. These are likely to be regions associated with the domestication of cultivated soybean from wild soybean. A smaller number of regions were found that distinguished the landrace accessions from the elite North American cultivars.
These data will be useful to soybean breeders and geneticists who want to use the diversity in wild soybean to find useful genes that were left behind during the domestication of soybean, which occurred in Asia 3,000-5,000 years ago. Similarly, soybean breeders and geneticists will use these data to identify regions of the soybean genome that are critical to producing the most productive and disease-resistant elite cultivars.
Technical Abstract: The objective of this research was to identify single nucleotide polymorphisms (SNPs) and to efficiently develop an Infinium iSelect beadchip containing over 50,000 SNPs from soybean. A total of 498,921,777 reads, 35-45 bp in length, were obtained from DNA sequence analysis of reduced representation libraries created from several soybean accessions, which included the wild soybean genotypes PI468916 and PI479752 and the cultivated soybean lines Essex, Evans, Archer, Minsoy, Noir 1, and Peking. These reads were mapped to the soybean whole genome sequence and 209,903 SNPs were identified. After applying several filters, 146,161 of the 209,903 SNPs were determined to be ideal candidates for Illumina Infinium II beadchip design. To equalize the distance between selected SNPs along each chromosome, increase the assay success rate, and minimize the number of SNPs with low minor allele frequency, an iterative selection algorithm based on a selection index was developed and used to select 60,800 SNPs for Infinium beadchip design. Of the 60,800 SNPs, 50,701 were targeted to euchromatic regions and 10,000 to heterochromatic regions of the 20 soybean chromosomes; the remaining 99 SNPs were targeted to unanchored sequence scaffolds. Of the 60,800 SNPs, 52,041 passed Illumina’s manufacturing phase to produce the SoySNP50K iSelect SNP beadchip. Validation of the SoySNP50K chip with 96 landrace genotypes, 96 elite cultivars and 96 wild soybean accessions showed that 47,337 SNPs were polymorphic and generated successful SNP allele calls. In addition, 40,841 of the 47,337 SNPs (86%) had minor allele frequencies greater than 10% among the landraces, elite cultivars and wild soybean accessions. A total of 620 candidate regions that may be associated with domestication and 42 candidate regions that may be associated with recent selection were identified.
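The abstract does not give the form of the selection index or the details of the iteration, so the following is only a minimal sketch of how such a step might work, not the authors' method. It assumes a greedy loop that repeatedly places a SNP in the largest remaining gap on a chromosome, scoring candidates with a hypothetical index that combines how central the SNP is in the gap, an assay design score, and an estimated minor allele frequency. The names `Candidate`, `selection_index`, `select_snps`, and all weights are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    pos: int             # position on the chromosome, in bp
    design_score: float  # assumed assay design score in [0, 1]
    maf: float           # assumed minor allele frequency estimate in [0, 0.5]


def selection_index(c, gap_left, gap_right, w_gap=1.0, w_score=1.0, w_maf=1.0):
    """Hypothetical index: favors SNPs near the gap midpoint with a high
    design score and a high minor allele frequency."""
    half_span = (gap_right - gap_left) / 2
    # 1.0 when the SNP sits at the gap midpoint, approaching 0.0 at the edges
    centred = 1.0 - abs((c.pos - gap_left) - half_span) / half_span
    return w_gap * centred + w_score * c.design_score + w_maf * (c.maf / 0.5)


def select_snps(candidates, chrom_length, n_select):
    """Greedy iteration: split the largest gap that still holds a candidate,
    using the candidate with the highest selection index in that gap."""
    chosen = []
    remaining = sorted(candidates, key=lambda c: c.pos)
    while remaining and len(chosen) < n_select:
        bounds = [0] + sorted(c.pos for c in chosen) + [chrom_length]
        gaps = sorted(zip(bounds, bounds[1:]),
                      key=lambda g: g[1] - g[0], reverse=True)
        for left, right in gaps:
            inside = [c for c in remaining if left < c.pos < right]
            if inside:
                best = max(inside, key=lambda c: selection_index(c, left, right))
                chosen.append(best)
                remaining.remove(best)
                break
        else:
            break  # no gap contains a usable candidate
    return sorted(chosen, key=lambda c: c.pos)
```

With equal weights, a well-designed, high-frequency SNP slightly off the midpoint of a gap can outrank a poorly designed SNP exactly at the midpoint. That is the kind of trade-off between spacing, assay success, and allele frequency the abstract describes.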
The SoySNP50K iSelect SNP beadchip will be a powerful tool for characterizing soybean genetic diversity and linkage disequilibrium, and for constructing high-resolution linkage maps to improve the soybean whole genome sequence assembly (Glyma1.01).
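As a small illustration of the diversity summaries mentioned above, the minor allele frequency of a biallelic SNP can be computed from diploid genotype calls. The sketch below assumes genotypes encoded as two-letter strings such as "AG", with "--" or None marking a failed call. This encoding and the function names are assumptions for illustration and do not represent the beadchip's actual output format.

```python
from collections import Counter


def minor_allele_frequency(genotypes):
    """Return the minor allele frequency for one SNP, 0.0 if the SNP is
    monomorphic, or None if no genotype was called.

    genotypes: iterable of two-letter calls such as 'AA', 'AG', 'GG';
    None or '--' is treated as a missing call (assumed encoding).
    """
    alleles = Counter()
    for call in genotypes:
        if call is None or call == "--":
            continue
        alleles.update(call)  # counts each of the two allele letters
    total = sum(alleles.values())
    if total == 0:
        return None
    if len(alleles) > 2:
        raise ValueError("expected a biallelic SNP")
    if len(alleles) < 2:
        return 0.0
    return min(alleles.values()) / total


def is_polymorphic(genotypes):
    """A SNP counts as polymorphic here if both alleles were observed."""
    maf = minor_allele_frequency(genotypes)
    return maf is not None and maf > 0.0
```

For example, the calls "AA", "AG", "GG", "GG" contain three A alleles out of eight, giving a minor allele frequency of 0.375, which would pass a 10% threshold. The 86% figure reported in the abstract corresponds to 40,841 / 47,337 ≈ 0.863.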