Submitted to: Genome
Publication Type: Peer reviewed journal
Publication Acceptance Date: 1/18/2008
Publication Date: 3/18/2008
Citation: Shoemaker, R.C., Grant, D.M., Olson, T., Warren, W.C., Wing, R., Yu, Y., Kim, H., Cregan, P.B., Joseph, B., Futrell-Griggs, M., Nelson, W., Davito, J., Walker, J., Wallis, J., Kremitski, C., Scheer, D., Clifton, S., Graves, T., Nguyen, H., Wu, X., Luo, M., Dvorak, J., Nelson, R., Cannon, S.B., Thomkins, J., Schmutz, J., Stacey, G., Jackson, S. 2008. Microsatellite Discovery from BAC End Sequences and Genetic Mapping to Anchor the Soybean Physical and Genetic Maps. Genome. 51:294-302.
Interpretive Summary: A whole-genome sequence is a text of all the hereditary code for an organism. Physical maps are important for accurate assembly of whole genome sequences. For physical maps to be useful they must be accurately oriented to the genetic map. In this paper the authors report on the development of genetic markers useful for mapping and anchoring the physical map to the soybean genetic map. They discovered the markers directly from individual pieces of the physical map. Then, when they genetically mapped the markers they concurrently oriented the physical map to the genetic map. At the same time they demonstrated how to use the preliminary whole-genome sequence to assess the quality of the physical map, and vice versa. This information will be useful to a growing number of researchers who will undertake the sequencing and assembly of entire genomes. It will also be useful to geneticists and breeders who require genetic markers to facilitate crop improvement activities.
Technical Abstract: Physical maps can be an invaluable resource for improving and assessing the quality of a whole-genome sequence assembly. Here we report the identification and screening of 3,290 microsatellites (SSRs) identified from BAC end sequences of clones comprising the physical map of the cultivar Williams 82. SSRs were screened for length polymorphisms against three mapping populations. We found that the AAT and ACT classes of repeats produced the greatest frequencies of length polymorphisms, ranging from 17.2% to 32.3% and from 11.8% to 33.3%, respectively. Other useful repeat classes include the dinucleotide repeats AT, AG and GT, with frequencies of length polymorphisms ranging from 11.2% to 18.4% (AT), 12.4% to 20.6% (AG), and 11.3% to 16.4% (GT). We found that repeat lengths less than 16 bp were generally less useful than repeat lengths of 40 to 60 bp. A surge in the frequency of length polymorphisms was noted in two populations for repeat lengths greater than 100 bp. Two hundred sixty-five SSRs were genetically mapped in at least one population. Of the 265 mapped SSRs, 60 came from BAC singletons not yet placed into contigs of the physical map. One hundred ten of the 265 originated in BACs thought to be located in 90 distinct contigs for which no genetic map location was previously known. Another 95 SSRs came from BACs within contigs for which one or more other BACs had already been mapped. Of the 67 contigs represented in this group, new SSRs mapping to 15 (22.3%) perfectly matched the linkage group associated with that contig. The remainder of the newly mapped SSRs came from BACs in contigs that were already associated with a corresponding linkage group and a non-matching anchor, or for which the FPC contig contained only a non-matching anchor.
A strategy is introduced by which physical/genetic map inconsistencies can be resolved using the preliminary 4X assembly of the whole-genome sequence of soybean, and by which genome sequence assemblies can concurrently be improved using the genetic/physical map information.
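As an illustration of what identifying SSRs in a BAC end sequence involves, the following minimal Python sketch scans a sequence for perfect di- and trinucleotide tandem repeats. It is not the authors' pipeline: the function name `find_ssrs`, its parameters, and the example sequence are hypothetical, and the 16 bp default is borrowed from the length the abstract describes as a rough lower bound of usefulness, not from any stated discovery threshold. Motifs are reported as found, so rotations and reverse complements of one repeat class (for example AT and TA) are not merged.

```python
import re


def find_ssrs(seq, min_unit=2, max_unit=3, min_length=16):
    """Return (start, motif, copies, length_bp) for perfect tandem repeats.

    Hypothetical helper for illustration only; all thresholds are assumptions.
    """
    seq = seq.upper()
    hits = []
    for unit in range(min_unit, max_unit + 1):
        # A motif of `unit` bases followed by one or more exact copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1+" % unit)
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip homopolymer runs such as "AA" or "TTT".
            if len(set(motif)) == 1:
                continue
            length = match.end() - match.start()
            if length >= min_length:
                hits.append((match.start(), motif, length // unit, length))
    return sorted(hits)


# Made-up sequence: a 20 bp AT repeat, a 21 bp AAT repeat, and a short AG run
# that falls below the assumed 16 bp cutoff.
example = "GGC" + "AT" * 10 + "CCG" + "AAT" * 7 + "GG" + "AG" * 4 + "T"
for start, motif, copies, length in find_ssrs(example):
    print(start, motif, copies, length)
```

The sketch detects only uninterrupted repeats and uses non-overlapping regex matches, so imperfect or compound microsatellites, and repeats directly abutting a homopolymer run, may be missed or clipped.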