|HULSE-KEMP, AMANDA - Texas A&M University|
|HAMID, ASHRAFI - Texas A&M University|
|STOFFEL, KEVIN - University Of California|
|PEPPER, ALAN - Texas A&M University|
|SASKI, CHRISTOPHER - Clemson University|
|CHEN, Z.JEFFREY - University Of Texas|
|VAN DEYNZE, ALLEN - University Of California|
|ZHENG, XIUTING - Texas A&M University|
|STELLY, DAVID - Texas A&M University|
Submitted to: Genes, Genomes, and Genomics
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 3/30/2015
Publication Date: 6/1/2015
Citation: Hulse-Kemp, A.M., Hamid, A., Stoffel, K., Pepper, A.E., Saski, C., Scheffler, B.E., Fang, D.D., Chen, Z., Van Deynze, A., Zheng, X., Stelly, D.M. 2015. BAC-end sequence-based SNP mining in Allotetraploid Cotton (Gossypium) utilizing re-sequencing data, phylogenetic inferences and perspectives for genetic mapping. Genes, Genomes, and Genomics. 5:1095-1105. doi:10.1534/g3.115.017749/-/DC1.
Interpretive Summary: DNA markers are very valuable for achieving faster gains in breeding programs. A major challenge in developing DNA markers is associating a DNA marker with any given trait, like yield or disease resistance. This problem is amplified in a crop like cotton, where the genetic diversity within advanced cotton germplasm is limited, making it more difficult to uncover useful DNA markers that can distinguish between two parents. In order to have DNA markers evenly distributed across the cotton genome, a two-pronged approach was taken. The type of DNA marker developed was based on SNPs (single nucleotide polymorphisms), which are more common than other types of DNA markers. To ensure distribution across the whole cotton genome, the SNPs were developed based on a genome map constructed from a BAC (Bacterial Artificial Chromosome) physical map of cotton in which the BAC ends were also sequenced. These BAC-end DNA sequences were then compared to whole-genome DNA sequences from three species of tetraploid cotton to uncover the SNPs. Because the derived SNPs were based on the linear chromosome map provided by the BAC physical map, the SNPs were also assured to be arranged in linear order. The validity of the SNP calling was confirmed by an in-depth analysis of the homeologous chromosomes 12 and 26. The data set of BAC-end sequences will allow for future construction of high-density integrated physical and genetic maps.
Technical Abstract: A bacterial artificial chromosome (BAC) library and BAC-end sequences for Gossypium hirsutum L. have recently been developed. Here we report on genomic-based genome-wide SNP mining utilizing re-sequencing data with a BAC-end sequence reference for twelve G. hirsutum L. lines, one G. barbadense L. line, and one G. longicalyx Hutch & Lee line. A total of 132,262 intraspecific SNPs have been developed for G. hirsutum while 223,138 and 470,631 interspecific SNPs have been developed that are applicable to G. hirsutum crosses with G. barbadense and G. longicalyx, respectively. Using a set of 96 interspecific SNPs putatively associated with the homeologous chromosome pair 12/26 as a tester set, 90 SNPs were mapped into the 2 linkage groups representing these chromosomes, spanning 236.2 cM in an interspecific F2 population (G. hirsutum TM-1 x G. barbadense 3-79). The mapping results validated the approach for reliably producing large numbers of both intraspecific and interspecific genomic-based SNPs that are BAC-associated. This will allow for future construction of high-density integrated physical and genetic maps for cotton and other complex polyploid genomes. The developed pipeline will allow for future Gossypium re-sequencing data to be automatically genotyped for identified polymorphic positions along the BAC-end sequence reference for comparative studies.
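The distinction the abstract draws between intraspecific and interspecific SNPs can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `classify_snps`, the input layout (one base call per line at each BAC-end position), and the sample identifiers `BES001`, `BES002`, `BES003`, and `Line2` are all hypothetical, and the sketch ignores heterozygous calls and the homeolog ambiguity that a real allotetraploid analysis would have to handle.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical types: a position is (BAC-end sequence id, offset along it);
# each position maps line names to a single base call, or None if missing.
Position = Tuple[str, int]
Calls = Dict[Position, Dict[str, Optional[str]]]


def classify_snps(
    calls: Calls, hirsutum_lines: List[str], other_line: str
) -> Tuple[List[Position], List[Position]]:
    """Split positions into intraspecific and interspecific SNP candidates.

    Intraspecific: the G. hirsutum lines disagree among themselves.
    Interspecific: the G. hirsutum lines agree, and the other species differs.
    Positions with missing data on the deciding side are skipped.
    """
    intraspecific: List[Position] = []
    interspecific: List[Position] = []
    for pos, by_line in calls.items():
        # Distinct non-missing calls among the G. hirsutum lines.
        gh = {by_line[l] for l in hirsutum_lines if by_line.get(l) is not None}
        other = by_line.get(other_line)
        if len(gh) > 1:
            intraspecific.append(pos)
        elif len(gh) == 1 and other is not None and other not in gh:
            interspecific.append(pos)
    return intraspecific, interspecific


# Toy data, invented for illustration only.
calls: Calls = {
    ("BES001", 120): {"TM-1": "A", "Line2": "G", "3-79": "A"},
    ("BES001", 305): {"TM-1": "C", "Line2": "C", "3-79": "T"},
    ("BES002", 48): {"TM-1": "G", "Line2": "G", "3-79": "G"},
    ("BES003", 10): {"TM-1": "T", "Line2": None, "3-79": None},
}
intra, inter = classify_snps(calls, ["TM-1", "Line2"], "3-79")
print(intra)  # positions polymorphic within G. hirsutum
print(inter)  # positions fixed in G. hirsutum but different in the other line
```

Keying every call to a BAC-end position is what would tie each SNP back to the physical map in a scheme like this; the classification itself is just a comparison of call sets.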