Title: SNP marker discovery in Pima cotton (Gossypium barbadense L.) leaf transcriptomes

Author
KOTTAPALLI, PRATIBHA - Texas Tech University
Ulloa, Mauricio
KOTTAPALLI, KAMESWARA - Texas Tech University
Payton, Paxton
Burke, John

Submitted to: Genomics Insights
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 8/24/2016
Publication Date: 10/3/2016
Citation: Kottapalli, P., Ulloa, M., Kottapalli, K.R., Payton, P.R., Burke, J.J. 2016. SNP marker discovery in Pima cotton (Gossypium barbadense L.) leaf transcriptomes. Genomics Insights. 9:51-60. doi:10.4137/GEI.S40377.

Interpretive Summary: As new-generation DNA and RNA (cDNA) sequencing technology lowers the cost of sequencing, it is becoming possible to obtain large amounts of sequence information for identifying differences among cottons and for directly mapping genes responsible for important traits. These differences are revealed by differentially expressed genes and by single nucleotide polymorphism (SNP) molecular markers, or biomarkers. Plant breeders find these biomarkers useful as a selection tool for monitoring alien genome introgression in cotton breeding programs. As an initial step to explore the known narrow genetic diversity and to discover SNP biomarkers for marker-assisted breeding within Pima cotton, leaf cDNA from 25-day-old plants of three diverse cottons [Pima-S6 (PS6), Pima-S7 (PS7), and Pima 3-79 (P3-79)] was sequenced. Differential gene expression analysis between PS6 vs. PS7, PS6 vs. P3-79, and PS7 vs. P3-79 cottons resulted in 5,080; 5,738; and 5,399 differential sequences with greater than two-fold change, respectively. These differentially expressed genes, which represent major metabolic pathways, might explain the gene pool diversity of these Pima genotypes. Additionally, more than 10,000 SNPs were identified between the cotton types. The differentially expressed genes identified in this study will help advance applied genomic research in cotton. The SNP biomarkers can be used for characterizing genetic diversity, for genotyping, and eventually in breeding through marker-assisted selection.

Technical Abstract: The vast information generated by next generation sequencing (NGS) technology will continue to benefit the development of new strategies to study and characterize genetic diversity, the improvement of existing tools for molecular breeding, and the discovery of genes underlying important traits in crop plants. As an initial step to explore the known narrow genetic diversity and to discover single nucleotide polymorphism (SNP) biomarkers for marker-assisted breeding within Pima (Gossypium barbadense L.) cotton, leaf cDNA from 25-day-old plants of three diverse genotypes [Pima-S6 (PS6), Pima-S7 (PS7), and Pima 3-79 (P3-79)] was sequenced on an Illumina MiSeq sequencer. A total of 28.9 million reads (average read length of 138 bp) were generated by sequencing cDNA libraries of these three genotypes. The de novo assembly of reads generated transcriptome sets of 26,369 contigs for PS6; 25,870 contigs for PS7; and 24,796 contigs for P3-79. A Pima leaf reference transcriptome consisting of 42,695 contigs was generated. Differential gene expression analysis between PS6 vs. PS7, PS6 vs. P3-79, and PS7 vs. P3-79 genotypes resulted in 5,080; 5,738; and 5,399 differential contigs with a log2 fold change greater than 2, respectively. These differentially expressed genes, which represent major metabolic pathways, might explain the gene pool diversity of these Pima genotypes. Additionally, more than 10,000 SNPs were identified between genotypes with 100 percent SNP consistency, based on the alignment of reference contigs from the Pima reference transcriptome and a minimum of four sequence reads. The most prevalent SNP substitutions were A - G and T - C, while the most prevalent indels involved A and T nucleotides. The differentially expressed genes identified in this study will help advance applied genomic research in tetraploid cotton. The SNP biomarkers can be used for characterizing genetic diversity, for genotyping, and eventually in breeding through marker-assisted selection.
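
The two thresholds named in the abstract (a log2 fold change greater than 2, and SNPs supported by at least four reads with 100 percent consistency) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the pseudocount, and the reading of "100 percent consistency" as "every read from a genotype at that position carries the same non-reference base" are assumptions made here for illustration only.

```python
import math


def log2_fold_change(expr_a, expr_b, pseudocount=1.0):
    """Return log2 of the expression ratio between two genotypes.

    The pseudocount is an assumption of this sketch; it only avoids
    division by zero when a contig has no reads in one genotype.
    """
    return math.log2((expr_a + pseudocount) / (expr_b + pseudocount))


def is_differential(expr_a, expr_b, threshold=2.0):
    """Flag a contig whose absolute log2 fold change exceeds the threshold."""
    return abs(log2_fold_change(expr_a, expr_b)) > threshold


def is_consistent_snp(base_counts, reference_base, min_reads=4):
    """Apply a hypothetical depth and consistency filter at one position.

    base_counts maps each base to the number of reads from one genotype
    that show it. The position passes only if at least min_reads reads
    cover it and all of them agree on a single non-reference base.
    """
    total = sum(base_counts.values())
    if total < min_reads:
        return False
    observed = [base for base, n in base_counts.items() if n > 0]
    return len(observed) == 1 and observed[0] != reference_base
```

For example, `is_differential(39, 7)` compares (39 + 1) / (7 + 1) = 5, whose log2 is about 2.32, so the contig passes the threshold; `is_consistent_snp({"G": 5, "A": 1}, "A")` fails because the reads disagree.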