Developing Genomic and Genetic Tools for Exploiting Cotton Genetic Variation
Crop Germplasm Research
Project Number: 6202-21000-038-00
Start Date: Mar 28, 2013
End Date: Mar 27, 2018
The goal of this project is to develop the genomic and genetic tools, materials, and information that are critically lacking for effectively exploiting cotton genetic variation in Gossypium germplasm characterization and cotton genetic improvement programs. Objective 1 reflects our commitment to continue developing portable DNA markers (simple sequence repeat, or SSR, and single nucleotide polymorphism, or SNP) and molecular descriptors (core sets of well-defined DNA markers) and to make them available to the cotton research community. Objective 2 reflects our unique participation in the development of an Upland cotton genome sequence and community database resources. A complete reference genome sequence of the Upland cotton genetic standard (G. hirsutum acc. TM-1) will greatly facilitate gene mining in Gossypium germplasm for commercial improvement. A centralized public database (CottonGen) with user-friendly bioinformatic tools will make the coordinated analysis and dissemination of research data and information more effective for the cotton research community. Work under Objective 3 will identify genes or novel alleles, genomic regions, or quantitative trait loci (QTLs) for value-added priority traits, using the genomic and genetic tools developed under the first two objectives. Superior cotton lines will be identified for developing breeding populations with novel variability for traits of interest. Collaborative work with key members and organizations of the cotton research community is necessary and will be undertaken; all such work will be of mutual benefit and will be conducted to ensure complementarity and avoid duplication. Specifically, during the next five years the project will focus on the following three objectives.
Objective 1: Develop new genetic markers to augment current core sets of mapped SSR and SNP markers for high-throughput characterization of the genetic diversity within and among Gossypium germplasm accessions in the National Cotton Germplasm Collection.
Sub-obj. 1A: Develop new SSR and SNP primers, and evaluate for polymorphism.
Sub-obj. 1B: Identify and validate core cotton SSR and SNP markers.
Objective 2: Collaborate with other public national and international researchers to sequence and analyze the tetraploid genome of the G. hirsutum genetic standard genotype Texas Marker-1 (TM-1), and coordinate the activities of a public database to maintain and disseminate sequence and other genetic information to the research community.
Sub-obj. 2A: Develop and analyze TM-1 genome sequence.
Sub-obj. 2B: Coordinate the activities of CottonGen.
Objective 3: Identify key genes and genomic regions of cotton that govern or are closely linked with priority traits, including fiber yield and quality, as well as biotic and abiotic stress tolerance.
Sub-obj. 3A: Apply cotton genomic tools to identify and characterize QTLs or alleles from cotton genetic resources, maintained under the sister project, that govern key agronomic or fiber traits.
Sub-obj. 3B: Apply the preceding information to identify superior parents for developing breeding populations with novel sources of variability for traits of interest.
New genetic markers will be created to augment current core sets of mapped SSR and SNP markers developed for high-throughput characterization of the genetic diversity within the National Cotton Germplasm Collection (Objective 1). Genomic DNA will be isolated from G. hirsutum, G. barbadense, G. arboreum, and G. raimondii, and cotton sequence reads will be generated using next-generation sequencing (NGS) technologies (Illumina GAIIx or HiSeq system). A high-throughput, simplified one-enzyme system will be used to discover and genotype SNP loci simultaneously. The information will be used to develop A and D genome-specific SNP markers that will be made available to members of the research community via CottonGen (Sub-objective 1A). New polymorphic SSR and SNP markers will be mapped to the 26 chromosomes of the tetraploid cotton genome (Sub-objective 1B). Genetic mapping of SSR and SNP markers will be conducted using the 186 recombinant inbred lines (RILs) of the publicly available TM-1 x 3-79 mapping population. The G. hirsutum genome will be sequenced in collaboration with national and international researchers (Sub-objective 2A). Working closely with BGI and the Cotton Research Institute (CRI), an integrated assembly strategy is being developed and tested; it includes large-insert BAC libraries, sub-genome alignments, and chromosomal anchoring through restriction site-associated DNA sequencing (RAD-seq) analysis. The TM-1 reference genome sequence will be made available to the broader plant research community via GenBank and CottonGen. The activities of the CottonGen database will be supported and coordinated through a cooperative agreement with Cotton Incorporated (Sub-objective 2B). CottonGen is being built on the open-source Tripal database infrastructure to incorporate new datasets such as annotated transcriptome, genome sequence, marker-trait-locus, and breeding data, as well as enhanced tools for easily querying and visualizing research data.
Most technical aspects of building and maintaining CottonGen are handled by the database team at Washington State. The ARS group at College Station is responsible for determining what functionality, content, and data integration are desired of the database. Genomic tools will be used to identify QTLs or alleles governing key agronomic or fiber traits (Sub-objective 3A). The growing number of QTLs reported in cotton will be validated by aligning their genomic locations and comparing their genetic effects, so that the QTLs can be put to practical use. Once specific chromosomal regions are shown to contain genes that contribute significantly to the expression of a trait, the most promising genomic regions will be fine-mapped to identify polymorphisms in coding and/or regulatory regions. Diagnostic DNA markers associated with traits of interest will be used to screen the Collection, and lines possessing combinations of desirable QTLs will be used to develop breeding populations with novel sources of variability (Sub-objective 3B).
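The QTL validation step described above (aligning genomic locations and comparing genetic effects) can be illustrated with a minimal Python sketch. It is not the project's actual pipeline. It assumes each reported QTL has already been placed on a common reference as a chromosome interval with a signed additive effect, and it treats two QTLs from different studies as mutually supporting when they concern the same trait, their intervals overlap, and their effects have the same sign. The `QTL` record, the `supporting_pairs` function, and all study names, positions, and effect values are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QTL:
    """One reported QTL placed on a common reference (hypothetical record layout)."""
    study: str
    trait: str
    chromosome: str
    start: float   # interval start; units (e.g. Mb) are an assumption
    end: float     # interval end, same units as start
    effect: float  # signed additive effect; the sign gives the direction


def overlaps(a: QTL, b: QTL) -> bool:
    """True when both QTLs lie on the same chromosome and their intervals intersect."""
    return a.chromosome == b.chromosome and a.start <= b.end and b.start <= a.end


def supporting_pairs(qtls: list[QTL]) -> list[tuple[QTL, QTL]]:
    """Return pairs of QTLs from different studies that agree on trait,
    location (overlapping intervals), and direction of effect."""
    pairs = []
    for i, a in enumerate(qtls):
        for b in qtls[i + 1:]:
            if (
                a.study != b.study
                and a.trait == b.trait
                and overlaps(a, b)
                and a.effect * b.effect > 0  # same sign of effect
            ):
                pairs.append((a, b))
    return pairs


# Invented example data, not real cotton QTL results.
qtls = [
    QTL("study_1", "fiber_strength", "chr07", 10.0, 14.5, 0.8),
    QTL("study_2", "fiber_strength", "chr07", 13.0, 18.0, 0.5),
    QTL("study_3", "fiber_strength", "chr07", 30.0, 33.0, 0.6),
    QTL("study_2", "fiber_length", "chr07", 12.0, 15.0, 0.4),
    QTL("study_3", "fiber_strength", "chr07", 12.0, 16.0, -0.3),
]

for a, b in supporting_pairs(qtls):
    print(f"{a.trait} on {a.chromosome}: {a.study} and {b.study} agree")
```

In practice, interval overlap and sign agreement are only a first filter; a real comparison would also need to account for differences in mapping populations, marker density, and how effects were estimated in each study.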