
Research Project: Advanced Genomic and Bioinformatic Tools for Accelerated Cotton Genetic Improvement

Location: Crop Germplasm Research

2019 Annual Report

Objective 1: Evaluate the cotton primary and secondary gene pools, as well as natural and synthetic cotton populations that are maintained in the USDA NPGS and cotton research community to identify useful genetic variability for industry-relevant traits, and provide information to breeders, along with augmented and/or improved core sets of effective DNA markers.
Sub-objective 1A: Augment and improve core sets of cotton SSR and SNP markers to effectively exploit the genetic variation of cotton germplasm and populations.
Sub-objective 1B: Develop a core set of SSR markers for G. thurberi to allow for improved molecular characterization of this wild diploid Gossypium species.
Objective 2: Sequence, refine, and annotate priority genomes of cotton species and accessions that contain genes controlling traits important to the cotton industry, and work with breeders to use these and previously identified cotton sequences to identify genomic regions for effective selections.
Objective 3: Develop, improve, and manage an efficient and effective database and bioinformatics system, CottonGen, for exploiting cotton genetic variation.
Objective 4: Identify key genes and genetic elements in cotton genomes, and use the information in selecting and verifying a range of priority agronomic traits, including biotic and abiotic stress resistance, and fiber and seed properties from materials contained in the USDA NPGS and cotton research community.

This project will provide the cotton industry with advanced genomic information and bioinformatic tools to enhance and accelerate the analysis and exploitation of genetic variability in the complex Gossypium genus. Current information suggests that genetic variation in cultivated cotton is limited, and that the overall structure of genetic variation in the Gossypium genus is not adequately resolved. More powerful tools are required to exploit the genetic potential of wild or uncultivated genotypes. Our recently completed genome assemblies of the Upland cotton genetic standard TM-1 and its probable progenitors provide a template for further sequencing efforts. Resequencing other cultivated and wild cotton species and/or accessions will allow comparative exploration for effective identification and manipulation of beneficial genes otherwise buried within Gossypium germplasm collections. In the current project, we will specifically develop and improve core sets of DNA markers tailored to individual cotton species, generate novel genome sequence information, and identify key genes or genetic elements linked to priority traits for improving agronomics, fiber and/or seed quality, and resistance to biotic/abiotic stresses. In cooperation with Cotton Incorporated, this project will provide support, coordination, and oversight to CottonGen, a database of genomic, genetic, and breeding resources managed by Washington State University. A primary goal of this project is to provide effective tools and information to identify and elucidate genetic variation within the U.S. National Cotton Germplasm Collection that is maintained by our sister germplasm project. New biological information developed by the project will be made publicly available in the GenBank and CottonGen databases.

Progress Report
Project work in FY 2019 made significant progress in sequencing and resequencing cotton genomes. The A1 genome of the diploid cotton Gossypium herbaceum was newly assembled (diploid plants carry two complete sets of chromosomes in each cell), and the A2 genome of the diploid cotton G. arboreum and the AtDt genome of the tetraploid (four sets of chromosomes) cotton G. hirsutum were improved significantly. These genomes contained 43,952, 43,278, and 74,350 protein-coding genes, respectively. Compelling evidence showed that all existing A-genomes (A1, A2, and At) may have originated from a common ancestor referred to as A0, and that AtDt formation may have preceded the speciation of A1 and A2, which evolved independently. This work confirms the origin and evolutionary relationship of all existing A-genomes in cultivated cottons and provides valuable genomic resources for cotton genetic improvement (Objectives 1 and 2). The project continued support of the CottonGen database (managed by Cotton Inc.), which serves the broad cotton community worldwide. Several new cotton genomes, 15,000 molecular markers (genetic tools), and 2,600 quantitative trait loci (QTL) were added to the database. The CottonGen Breeding Information Management System (BIMS) was significantly enhanced and is now being tested and used by several cotton breeders. During FY 2019, CottonGen served 311,711 pages to 17,311 cotton researchers from more than 130 countries (Objective 3). The project continued to make progress in defining the genetic control of fiber development and abiotic stress in tetraploid cotton. Recently developed molecular tools allowed identification of genetic types (genotypes) based on DNA variation. In fine mapping of the virescent-1 locus on cotton chromosome 20, one single nucleotide polymorphism (SNP) was identified that causes an amino acid residue change (Objective 4).
A detailed review of the application of marker-assisted breeding in cotton was completed in cooperation with colleagues at ARS-Mississippi State and the Uzbekistan Academy of Sciences.
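
The progress report above mentions a SNP that causes an amino acid residue change. As a minimal illustration of what that means, the Python sketch below compares a reference codon with a single-base variant. The codons and the function name `classify_snp` are hypothetical and chosen only for illustration; they are not the actual virescent-1 variant, which is not specified in this report. The three codon assignments follow the standard genetic code.

```python
# Partial codon table (standard genetic code); only the codons used below.
CODON_TABLE = {
    "GAA": "Glu",
    "GAG": "Glu",
    "GTA": "Val",
}


def classify_snp(codon: str, position: int, alt_base: str) -> str:
    """Classify a single-base substitution within one codon.

    codon    -- reference codon (hypothetical example, not from the report)
    position -- 0-based index of the substituted base within the codon
    alt_base -- the alternative base introduced by the SNP
    """
    mutated = codon[:position] + alt_base + codon[position + 1:]
    ref_aa = CODON_TABLE[codon]
    alt_aa = CODON_TABLE[mutated]
    # A SNP that changes the encoded residue is nonsynonymous.
    return "synonymous" if ref_aa == alt_aa else "nonsynonymous"


if __name__ == "__main__":
    print(classify_snp("GAA", 2, "G"))  # GAA -> GAG, both encode Glu
    print(classify_snp("GAA", 1, "T"))  # GAA -> GTA, Glu becomes Val
```

Only a nonsynonymous change of the second kind alters the protein sequence, which is why such a SNP is a natural candidate when fine mapping a locus.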

1. Origin and relationship of A-genomes in cultivated cottons. Uncertainty regarding the actual A-genome donor of the most widely cultivated cotton, Gossypium hirsutum, has been the source of much speculation. ARS scientists at College Station, Texas, working with Chinese collaborators, sequenced and assembled the A1 genome of G. herbaceum while updating and improving the A2 genome of G. arboreum and the AtDt genome of G. hirsutum. In-depth analysis and assays provided compelling evidence that all existing A-genomes, including A1, A2, and the At in G. hirsutum, originated from a common ancestor named A0. The formation of AtDt cotton preceded the speciation of the A1 and A2 diploid cottons, which evolved independently, with no ancestor-progeny relationship as previously suggested. This accomplishment confirms the origin and evolutionary relationship of all existing A-genomes in cultivated cottons and provides valuable information and genomic resources for cotton genetic improvement.

Review Publications
Han, M., Lu, X., Yu, J., Chen, X., Wang, X., Malik, W., Wang, J., Wang, D., Wang, S., Guo, L., Chen, C., Cui, R., Yang, X., Ye, W. 2019. Transcriptome analysis reveals cotton (Gossypium hirsutum) genes that are differentially expressed in cadmium (Cd) stress tolerance. International Journal of Molecular Sciences. 20(6):1479.
Yu, J., Gervers, K. 2019. Genomic analysis of marker-associated fiber development genes in Upland cotton (Gossypium hirsutum L). Euphytica. 215:74.
Lu, X., Fu, X., Wang, D., Wang, J., Chen, X., Hao, M., Wang, J., Gervers, K.A., Guo, L., Wang, S., Yin, Z., Fan, W., Shi, C., Wang, X., Peng, J., Chen, C., Cui, R., Shu, N., Zhang, B., Han, M., Zhao, X., Mu, M., Yu, J., Ye, W. 2019. Resequencing of cv CRI-12 family reveals haplotype block inheritance and recombination of agronomically important genes in artificial selection. Plant Biotechnology Journal. 17(5):945-955.
Zhang, Y., Wang, Q., Zuo, D., Cheng, H., Liu, K., Ashraf, J., Li, S., Feng, X., Yu, J., Song, G. 2018. Map-based cloning of a recessive gene v1 for virescent leaf expression in cotton (Gossypium spp). Journal of Cotton Research. 1:10.