
Title: Development of EST-based SNP and indel markers and their utilization in tetraploid cotton genetic mapping

Author
LI, XIMEI - Huazhong Agricultural University
GAO, WENHUI - Huazhong Agricultural University
GUO, HUANLE - Huazhong Agricultural University
ZHANG, XIANLONG - Huazhong Agricultural University
Fang, David
LIN, ZHONGXU - Huazhong Agricultural University

Submitted to: BMC Genomics
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 12/1/2014
Publication Date: 12/1/2014
Citation: Li, X., Gao, W., Guo, H., Zhang, X., Fang, D.D., Lin, Z. 2014. Development of EST-based SNP and indel markers and their utilization in tetraploid cotton genetic mapping. Biomed Central (BMC) Genomics. 15:1046.

Interpretive Summary: Molecular markers are the foundation of modern molecular plant breeding. In this research, expressed sequence tags (ESTs) from public databases were analyzed to identify single nucleotide polymorphisms (SNPs) and insertion-deletion polymorphisms (InDels) in cotton. A total of 1349 EST-based SNP and InDel markers were developed by comparing ESTs between upland cotton (Gossypium hirsutum) and pima cotton (G. barbadense), mining upland cotton unigenes, and analyzing 3’ untranslated region (3’UTR) sequences. Marker polymorphism was assessed in the two parents of the mapping population, upland cotton Emian 22 and pima cotton 3-79, using single-strand conformation polymorphism analysis. Of all the markers, 137 (10.16%) were polymorphic and revealed 142 loci. Linkage analysis using a BC1 population of 141 progeny mapped 133 loci to the 26 chromosomes, leaving the remaining 9 loci unmapped. Base transitions accounted for 55.78% of all base variations in the SNPs. Gene ontology analysis indicated that cotton genes varied greatly in the number of SNPs they harbor, from 1 to 24 SNPs per gene. Sanger sequencing of three randomly selected SNP markers revealed discrepancies between the in silico predicted sequences and the actual sequencing results. Nonetheless, our results demonstrate that in silico analysis of ESTs is a valuable tool for developing cotton SNP and InDel markers. The markers developed here will be useful in cotton genetics and breeding research.

Technical Abstract: Expressed sequence tags (ESTs) were analyzed in silico to identify single nucleotide polymorphisms (SNPs) and insertion-deletion polymorphisms (InDels) in cotton. A total of 1349 EST-based SNP and InDel markers were developed by comparing ESTs between Gossypium hirsutum and G. barbadense, mining G. hirsutum unigenes, and analyzing 3’ untranslated region (3’UTR) sequences. Marker polymorphism was assessed in the two parents of the mapping population, G. hirsutum cv. Emian 22 and G. barbadense acc. 3-79, using single-strand conformation polymorphism (SSCP) analysis. Of all the markers, 137 (10.16%) were polymorphic and revealed 142 loci. Linkage analysis using a BC1 population of 141 progeny mapped 133 loci to the 26 chromosomes, leaving the remaining 9 loci unmapped. Base transitions accounted for 55.78% of all base variations in the SNPs. Gene ontology analysis indicated that cotton genes varied greatly in the number of SNPs they harbor, from 1 to 24 SNPs per gene. Sanger sequencing of three randomly selected SNP markers revealed discrepancies between the in silico predicted sequences and the actual sequencing results. Nonetheless, our results demonstrate that in silico analysis of ESTs is a valuable tool for developing cotton SNP and InDel markers. The markers developed here will be useful in cotton genetics and breeding research.
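
The two percentages reported above (the polymorphism rate and the share of base transitions) are simple proportions. The sketch below shows one way such figures could be computed. It is an illustration only, not the authors' pipeline: the function names and the four example SNPs are hypothetical, and only the counts 137 and 1349 come from the abstract. A transition is a substitution within the purines (A, G) or within the pyrimidines (C, T); any other substitution is a transversion.

```python
# Illustrative sketch only; function names and example SNPs are hypothetical.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_snp(ref, alt):
    """Label a single-base substitution as a transition or a transversion."""
    ref, alt = ref.upper(), alt.upper()
    if ref == alt or not {ref, alt} <= PURINES | PYRIMIDINES:
        raise ValueError(f"not a valid SNP: {ref!r} -> {alt!r}")
    # A transition keeps the base within the same chemical class.
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def transition_percentage(snps):
    """Percentage of (ref, alt) pairs that are transitions."""
    kinds = [classify_snp(ref, alt) for ref, alt in snps]
    return 100.0 * kinds.count("transition") / len(kinds)


def polymorphism_rate(polymorphic, total):
    """Percentage of markers that were polymorphic between the parents."""
    return 100.0 * polymorphic / total


# Made-up SNPs for demonstration: two transitions and two transversions.
example_snps = [("A", "G"), ("C", "T"), ("A", "C"), ("G", "T")]
print(transition_percentage(example_snps))

# Counts taken from the abstract: 137 polymorphic markers out of 1349.
print(round(polymorphism_rate(137, 1349), 2))
```

With the counts from the abstract, the second call reproduces the reported 10.16%. The 55.78% transition figure cannot be reproduced here because the underlying SNP list is not part of this record.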