
Research Project: Developing Genomic and Genetic Tools for Exploiting Cotton Genetic Variation

Location: Crop Germplasm Research

2016 Annual Report

The goal of this project is to develop the genomic and genetic tools, materials, and information that are critically lacking for effectively exploiting cotton genetic variation in Gossypium germplasm characterization and cotton genetic improvement programs. Objective 1 reflects our commitment to continue developing portable DNA markers (simple sequence repeat, or SSR, and single nucleotide polymorphism, or SNP) and molecular descriptors (core sets of well-defined DNA markers) and to make them available to the cotton research community. Objective 2 reflects our unique participation in the development of an Upland cotton genome sequence and community database resources. A complete reference genome sequence of the Upland cotton genetic standard (G. hirsutum acc. TM-1) will greatly facilitate gene mining in Gossypium germplasm for commercial improvement. A centralized public database (CottonGen) with user-friendly bioinformatic tools will make coordinated analysis and dissemination of research data and information more effective for the cotton research community. Work under Objective 3 will identify genes or novel alleles, genomic regions, or quantitative trait loci (QTLs) for value-added priority traits, using the genomic and genetic tools developed under the first two objectives. Superior cotton lines will be identified for developing breeding populations with novel variability for traits of interest. Collaborative work with key members and organizations of the cotton research community is necessary and will be undertaken; all such work will be of mutual benefit and will be conducted so as to ensure complementarity and avoid duplication. Specifically, during the next five years the project will focus on the following three objectives.
Objective 1: Develop new genetic markers to augment current core sets of mapped SSR and SNP markers for high-throughput characterization of the genetic diversity within and among Gossypium germplasm accessions in the National Cotton Germplasm Collection.
Sub-obj. 1A: Develop new SSR and SNP primers, and evaluate for polymorphism.
Sub-obj. 1B: Identify and validate core cotton SSR and SNP markers.
Objective 2: Collaborate with other public national and international researchers to sequence and analyze the tetraploid genome of G. hirsutum genetic standard genotype Texas Marker-1 (TM-1), and coordinate the activities of a public database to maintain and disseminate sequence and other genetic information to the research community.
Sub-obj. 2A: Develop and analyze TM-1 genome sequence.
Sub-obj. 2B: Coordinate the activities of CottonGen.
Objective 3: Identify key genes and genomic regions of cotton that govern or are closely linked with priority traits, including fiber yield and quality, as well as biotic and abiotic stress tolerance.
Sub-obj. 3A: Apply cotton genomic tools to identify and characterize QTLs or alleles from cotton genetic resources, maintained under the sister project, that govern key agronomic or fiber traits.
Sub-obj. 3B: Apply the preceding information to identify superior parents for developing breeding populations with novel sources of variability for traits of interest.

New genetic markers will be created to augment current core sets of mapped SSR and SNP markers developed for high-throughput characterization of the genetic diversity within the National Cotton Germplasm Collection (objective 1). Genomic DNA will be isolated from G. hirsutum, G. barbadense, G. arboreum, and G. raimondii, and cotton sequence reads will be generated using next-generation sequencing (NGS) technologies (Illumina GAIIx or HiSeq system). A high-throughput simplified one-enzyme system will be used to simultaneously discover and genotype SNP loci. The information will be used to develop A and D genome-specific SNP markers that will be made available to members of the research community via CottonGen (sub-objective 1A). New polymorphic SSR and SNP markers will be mapped to the 26 chromosomes of the tetraploid cotton genome (sub-objective 1B). Genetic mapping of SSR and SNP markers will be conducted using the 186 recombinant inbred lines (RILs) of the publicly available TM-1 x 3-79 mapping population. The G. hirsutum genome will be sequenced in collaboration with national and international researchers (sub-objective 2A). In close cooperation with BGI and the Cotton Research Institute (CRI), an integrated assembly strategy that includes large-insert BAC libraries, sub-genome alignments, and chromosomal anchoring through restriction site associated DNA (RAD)-seq analysis is being developed and tested. The TM-1 reference genome sequence will be made available to the broader plant research community via GenBank and CottonGen. The activities of the database CottonGen will be supported and coordinated through a cooperative agreement with Cotton Incorporated (sub-objective 2B). CottonGen is being built using the open-source Tripal database infrastructure to incorporate new datasets such as annotated transcriptome, genome sequence, marker-trait-locus, and breeding data, as well as enhanced tools for easy querying and visualizing of research data. Most technical aspects of building and maintaining CottonGen are handled by the database team at Washington State University. The ARS group at College Station is responsible for determining what functionality, content, and data integration are desired of the database.
Genomic tools will be used to identify QTLs or alleles governing key agronomic or fiber traits (sub-objective 3A). The growing number of QTLs reported in cotton will be validated by aligning their genomic locations and comparing their genetic effects, so that the QTLs become useful for breeding. Once it is determined that specific chromosomal regions contain genes that contribute significantly to the expression of a trait, fine-mapping of the most promising genomic regions will be used to identify polymorphisms in coding and/or regulatory regions. Diagnostic DNA markers that are associated with traits of interest will be used to screen the Collection, and lines possessing combinations of desirable QTLs will be used for developing breeding populations with novel sources of variability (sub-objective 3B).
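The QTL validation step described above (aligning genomic locations and comparing genetic effects) can be illustrated with a minimal Python sketch. It is not the project's actual pipeline: the record layout, the study names, the trait names, the coordinates, and the effect values are all hypothetical, and the sketch assumes every QTL has already been placed on a common reference coordinate system. It groups QTLs reported for the same trait whose intervals overlap on the same chromosome, then checks whether their additive effects point in the same direction.

```python
from collections import defaultdict
from typing import NamedTuple


class QTL(NamedTuple):
    study: str     # hypothetical study label
    trait: str
    chrom: str
    start: int     # interval start on a shared reference (bp)
    end: int       # interval end on a shared reference (bp)
    effect: float  # additive effect; the sign gives the direction


def overlapping_qtls(qtls):
    """Return clusters of two or more QTLs for the same trait whose
    intervals overlap on the same chromosome."""
    by_key = defaultdict(list)
    for q in qtls:
        by_key[(q.trait, q.chrom)].append(q)

    clusters = []
    for group in by_key.values():
        group.sort(key=lambda q: q.start)
        current, current_end = [group[0]], group[0].end
        for q in group[1:]:
            if q.start <= current_end:
                # Interval overlaps the running cluster: extend it.
                current.append(q)
                current_end = max(current_end, q.end)
            else:
                if len(current) > 1:
                    clusters.append(current)
                current, current_end = [q], q.end
        if len(current) > 1:
            clusters.append(current)
    return clusters


def consistent_direction(cluster):
    """True if every QTL in the cluster has an effect of the same sign."""
    return len({q.effect > 0 for q in cluster}) == 1


# Hypothetical example records (not data from this project).
qtls = [
    QTL("study_A", "fiber_length", "chr11", 100, 500, 0.4),
    QTL("study_B", "fiber_length", "chr11", 400, 900, 0.3),
    QTL("study_C", "fiber_length", "chr21", 100, 200, -0.2),
    QTL("study_D", "fiber_strength", "chr11", 450, 600, 0.1),
]
clusters = overlapping_qtls(qtls)
for cluster in clusters:
    print([q.study for q in cluster], consistent_direction(cluster))
```

In this sketch, a cluster that is supported by several studies and shows a consistent effect direction would be a natural candidate for fine-mapping; QTLs reported in only one study remain unvalidated.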

Progress Report
The objective of sequencing and analyzing the tetraploid genome of Upland cotton was substantively realized through cooperative efforts with the Chinese Academy of Agricultural Sciences' Institute of Cotton Research. The genome sequence of the cultivated Upland cotton species, Gossypium hirsutum, provided new insights into Gossypium genome evolution and cotton fiber biology. This major genome resource made possible high-throughput analysis of molecular tools, namely single nucleotide polymorphism (SNP) markers and annotated genetic elements, in cotton. Project scientists made significant progress in developing new SNP markers for high-throughput characterization of the genetic diversity of cotton. Working in collaboration with national and international cooperators, project scientists constructed high-resolution genetic maps with informative SNP markers (39K and 10K, respectively) using an interspecific mapping population. These SNP maps are being updated and evaluated for applications that include exploring genetic diversity and gene discovery. In collaborative efforts, significant progress was made in identifying key genes and genomic regions of cotton that govern or are closely linked with priority traits, including fiber and seed quality, biotic and abiotic stress tolerance, phytochrome, and photoperiodism. A dominant glandless gene Gl2^e was mapped to a chromosome 12 fragment using new SNP markers, and the gene was confirmed to encode a transcription factor regulating pigment gland production. Gene-based cleaved amplified polymorphic sequence (CAPS) markers from three phytochrome and flowering genes were developed and mapped to cotton chromosomes 10, 11, and 24, respectively. These gene-based markers were also associated with fiber quality and other cotton traits identified with previously published SSR and SNP markers. Several candidate genes involved in resistance to multiple soil-borne pathogens were identified on chromosomes 11 and 21.
A comparative genome-wide association study (GWAS) of 440 G. hirsutum and 219 G. barbadense cultivars and landraces revealed a common set of SNP markers associated with several seedling root traits. Progress was also made on the chromosomal survey of the distribution of fiber development Unigenes and on the comparative analysis of functional gene loss in the cotton genomes. All the diagnostic DNA markers linked to the traits of interest will be useful in facilitating genetic improvement of the cotton plant. Project scientists coordinated with the cooperators at Washington State University to incorporate a whole-genome sequence assembly and annotation of G. barbadense into CottonGen, a consolidated cotton genomics, genetics, and breeding database for the cotton research community.
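As a rough illustration of how a common set of trait-associated SNP markers might be extracted from two separate GWAS result sets, the following Python sketch keeps the markers that pass a significance threshold in each species and intersects the two sets. The marker IDs, p-values, and threshold are hypothetical, and the report does not state the statistical procedure actually used; the sketch also assumes the two analyses share marker identifiers.

```python
def significant_markers(results, alpha):
    """Return the set of marker IDs whose p-value is below alpha."""
    return {marker for marker, p_value in results.items() if p_value < alpha}


def shared_markers(results_a, results_b, alpha=1e-5):
    """Return, in sorted order, the markers significant in both result sets."""
    common = significant_markers(results_a, alpha) & significant_markers(results_b, alpha)
    return sorted(common)


# Hypothetical marker -> p-value tables for one seedling root trait.
results_hirsutum = {"snp_001": 2e-7, "snp_002": 5e-6, "snp_003": 0.04}
results_barbadense = {"snp_002": 1e-6, "snp_003": 3e-8, "snp_004": 0.2}

shared = shared_markers(results_hirsutum, results_barbadense)
print(shared)
```

A fixed threshold is used here only for simplicity; in practice the cutoff would depend on the number of markers tested and the multiple-testing correction chosen.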

1. Fine mapping of a dominant glandless gene in cotton. Cottonseed is an excellent nutritional source of oil and protein, but its utilization is greatly limited by the presence of pigment glands containing toxic gossypol. ARS researchers at College Station, Texas, working in collaboration with international cooperators, mapped a dominant glandless gene Gl2^e to a 15-kb fragment of chromosome 12. One candidate gene was identified in this fragment; it encodes an MYC transcription factor that likely serves as a vital positive regulator of pigment gland production. Sequence and expression analysis of the gene showed that its 475-amino-acid protein product is present in glanded plants but almost absent in glandless plants. This accomplishment indicates that manipulation of the Gl2^e gene with a tissue-specific promoter could effectively inhibit the formation of pigment glands in cottonseed. As the most important pigment gland-related gene identified in cotton so far, it should facilitate research on the glandless trait, cotton MYC proteins, and low-gossypol cotton breeding.


Review Publications
Abdurakhmonov, I.Y., Ayubov, M., Ubaydullaeva, K.A., Buriev, B.T., Shermatov, S.E., Ruziboev, H., Shapulatov, U.M., Saha, S., Ulloa, M., Yu, J., Percy, R.G., Devor, E.J., Govind, S.C., Sripathi, V.R., Kumpatla, S.P., Van De Kroll, A., Hake, K.D., Khamidov, K., Salikhov, S.I., Jenkins, J.N., Abdukarimov, A., Pepper, A.E. 2016. RNA interference for functional genomics and improvement of cotton (Gossypium species). Frontiers in Plant Science. 7:202.
Wang, C., Ulloa, M., Shi, X., Yuan, X., Saski, C., Yu, J., Roberts, P. 2015. Sequence composition of BAC clones and SSR markers mapped to Upland cotton chromosomes 11 and 21 targeting resistance to soil-borne pathogens. Frontiers in Plant Science. 6:791.