2013 Annual Report
1a. Objectives (from AD-416):
Objective 1: Identify a core set of molecular markers tailored for systematic characterization of the genetic diversity within and among Gossypium germplasm accessions that will be maintained under the sister project 6202-21000-032-00D.
Objective 2: Maintain and enhance CottonDB as a user-friendly tool for the cotton research community.
Sub-objective 2.A: Maintain and enhance CottonDB, including development of user friendly public interfaces.
Sub-objective 2.B: Develop bioinformatic software and tools to assist both users and curators of CottonDB.
Objective 3: Collaborate with other public sector researchers to construct and integrate physical and genetic maps of G. hirsutum.
Sub-objective 3.A: Develop cotton genetic maps that contain PCR-based DNA markers.
Sub-objective 3.B: Develop cotton physical maps that contain large-insert BAC clones.
Sub-objective 3.C: Integrate cotton genetic and physical maps with EST unigene information.
Objective 4: Identify key genes and genomic regions of cotton for use in developing cotton germplasm resources that exhibit desirable/improved agronomic and fiber traits.
Sub-objective 4.A: Apply genomic and bioinformatic tools to identify and characterize QTLs or alleles from cotton genetic resources, maintained under the sister project 6202-21000-032-00D, that govern key agronomic or fiber traits.
Sub-objective 4.B: Apply the preceding information to identify superior parents for developing breeding populations with novel sources of variability for traits of interest.
Sub-objective 4.C: Recombine and select the preceding breeding populations to accumulate desirable QTLs and alleles in enhanced cotton breeding lines.
1b. Approach (from AD-416):
To develop a portable core set of markers for cotton (Objective 1), new SSR and SNP markers will be developed from cotton BAC libraries and other genomic DNA templates. From the markers created, a core set of 208 markers will be carefully selected from the saturated genome map of tetraploid cotton (TM-1 x 3-79), with 8 markers from each of the 26 chromosomes. Each of these core markers will have a high polymorphism information content (PIC) value, to be determined on a standardized core germplasm panel consisting of 12 diverse Gossypium genotypes. These markers will be evenly distributed on the cotton genome, with every chromosome arm having 4 core markers at approximately 15-cM intervals. Data from marker development will be stored and made available in the CottonDB database. CottonDB, a tool for the research community, will be enhanced through continued migration of its information content to a relational structure, improved display pages, and direct record-to-record links between internet databases to integrate information into a larger virtual database (Objective 2). To enrich the delivered content and streamline users' searches for specific information, related data from multiple databases will be integrated. Solutions developed by other genome databases will be adapted and implemented in this project's databases where appropriate. To construct and integrate physical and genetic maps, genetic mapping of TM-1 BAC-derived and other markers will be conducted using the TM-1 x 3-79 RI population. Diagnostic DNA markers will be identified that are capable of detecting polymorphism in intraspecific populations, and these markers will be used to genotype the entire TM-1 x 3-79 RI mapping population. A score matrix will be generated from the genotyping experiments and merged with the existing mapping database to perform linkage analysis via MapMaker and/or JOINMAP software programs. Recombination frequencies will be converted into map distances (cM).
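As an illustration of the last step above, the conversion of recombination frequencies into map distances can be sketched with the two standard mapping functions, Haldane and Kosambi. This is a minimal sketch, not the project's procedure: the report does not say which mapping function was chosen in MapMaker or JOINMAP, and the function names below are hypothetical. The helper for recombinant inbred (RI) data assumes lines derived by selfing and uses the Haldane-Waddington relation R = 2r / (1 + 2r).

```python
import math


def ril_to_meiotic_r(big_r):
    """Convert the recombinant fraction observed among selfed RI lines (R)
    to the per-meiosis recombination frequency (r).

    Assumes RI lines produced by selfing: R = 2r / (1 + 2r).
    """
    if not 0.0 <= big_r < 0.5:
        raise ValueError("R must be in [0, 0.5) for selfed RI lines")
    return big_r / (2.0 * (1.0 - big_r))


def haldane_cm(r):
    """Haldane mapping function (no crossover interference), result in cM."""
    if not 0.0 <= r < 0.5:
        raise ValueError("r must be in [0, 0.5)")
    return -50.0 * math.log(1.0 - 2.0 * r)


def kosambi_cm(r):
    """Kosambi mapping function (allows for interference), result in cM."""
    if not 0.0 <= r < 0.5:
        raise ValueError("r must be in [0, 0.5)")
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


if __name__ == "__main__":
    # Hypothetical marker pair: 10% recombination per meiosis.
    r = 0.10
    print(f"Haldane: {haldane_cm(r):.2f} cM, Kosambi: {kosambi_cm(r):.2f} cM")
```

For small recombination frequencies the two functions give nearly the same distance; they diverge as r approaches 0.5, which is one reason closely spaced markers are preferred when building a saturated map.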
Approximately 500 SSR and 500 SNP markers will be added to the existing genetic map that contains 1,200 SSR markers to obtain an average resolution of 1-2 cM per marker. Integration of cotton genetic and physical maps will be achieved by anchoring framework genetic markers to TM-1 BAC contigs, and locating BAC-derived markers to the TM-1 x 3-79 RI map (Objective 3). Comparisons of genetic and physical map tools (CMap and IntegratedMap) will allow for consolidation of all structural and physical genomic information. In order to utilize the growing number of QTLs reported in cotton, those QTLs will be validated by aligning genomic locations and comparing genetic effects (Objective 4). Information for QTLs of interest will be related among comparable studies in cotton and will be obtained from a variety of sources, including published accounts and database records. Once specific chromosomal regions containing genes that make a significant contribution to the expression of a complex phenotype of interest are identified, fine-mapping of the most promising genomic regions will be used to identify polymorphisms in coding and/or regulatory regions.
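The polymorphism information content (PIC) criterion used in the approach above to select core markers can be illustrated with the widely used formula of Botstein et al. (1980), PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2, where p_i is the frequency of allele i. The sketch below is an assumption about how such a value might be computed on a panel; the report does not state the exact formula or software used, and the function name and example allele calls are hypothetical.

```python
from collections import Counter


def pic(allele_calls):
    """Polymorphism information content of one marker (Botstein et al. 1980).

    allele_calls: iterable of allele labels, one per scored genotype
    (appropriate for inbred lines scored with a single allele each).
    """
    counts = Counter(allele_calls)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no allele calls supplied")
    p = [n / total for n in counts.values()]
    # Sum of squared allele frequencies.
    sum_sq = sum(x * x for x in p)
    # Correction term over all pairs of distinct alleles.
    cross = sum(
        2.0 * p[i] ** 2 * p[j] ** 2
        for i in range(len(p))
        for j in range(i + 1, len(p))
    )
    return 1.0 - sum_sq - cross


if __name__ == "__main__":
    # Hypothetical SSR fragment sizes scored on a 12-genotype panel.
    calls = [180, 180, 184, 184, 184, 190, 190, 176, 176, 180, 184, 190]
    print(f"PIC = {pic(calls):.3f}")
```

A marker with a single allele across the panel has a PIC of 0, and a marker with two equally frequent alleles has a PIC of 0.375, so multi-allelic SSR markers tend to score higher than biallelic SNP markers under this measure.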
In FY 2013, project scientists working with national and international cooperators developed a strategy for using modern molecular tools (known as SNP markers) to identify and map important cotton genes. A total of 762 SNP markers were discovered among eight different wild cotton germplasm lines under study, with populations identified for photoperiod independence. Significant progress was made in sequencing a cultivated A-genome diploid cotton species, G. arboreum, the descendant of the A-subgenome contributor to tetraploid cotton species. More than 41,000 protein-coding genes were identified from the sequence assembly. Over the life of this project, significant cotton genomic resources and new knowledge were developed and disseminated to the plant research community. Among the most important were the mapping and sequencing of cotton genomes. Using the cotton genetic standard, the TM-1 x 3-79 recombinant inbred line (RIL) population, the tetraploid cotton map was saturated with 2,280 simple sequence repeat (SSR) and 247 single nucleotide polymorphism (SNP) genetic markers. From this map, 208 SSR markers were selected that are evenly distributed across the 26 cotton chromosomes, and from these a set of 105 core SSR markers was selected to characterize the genetic diversity of the National Cotton Germplasm Collection. Significant progress was made in sequencing the D-genome species, Gossypium raimondii. In close work with Cotton Inc., the databases CottonDB (http://www.cottondb.org/) and CMD (http://www.cottonmarker.org/) were merged to create CottonGen (http://www.cottongen.org/). CottonGen released new versions 0.9 and 1.0 that contain data from the published D-genome cotton sequence and many other datasets, including cotton genes, DNA markers, and breeding records. This project expired in FY 2013 but was replaced by 6202-21000-038-00D, which is expanding upon the work of the precursor project.
Release of CottonGen version 1.0. The cotton research community needs a centralized portal to access important information and scientific data for more effective genetic improvement of the cotton crop. ARS scientists at College Station, Texas, working with national cooperators, and particularly with Cotton Inc., made major progress in this area with release of the new version of the cotton database CottonGen (version 1.0) that increases access, functionality, and content. The new data added to CottonGen included the published diploid cotton G. raimondii genome sequence, genome maps, DNA markers, quantitative trait loci (QTL) traits, expressed sequence tag (EST) genes, and breeding records. The functionality included the creation of a new search site for QTLs and for cotton publications. This accomplishment will significantly benefit the cotton research community by helping cotton researchers better understand and improve the cotton crop for higher productivity, fiber quality, and agronomic performance.
Sequencing of the A-genome of cotton - a first sequence for cotton. Understanding the genetic control and the underlying genetic code of various aspects of cotton growth and development is essential for continued improvement of this important fiber crop. ARS scientists at College Station, Texas, working with a large and scientifically diverse group of international cooperators, completely sequenced a cultivated diploid cotton G. arboreum A-genome, with more than 98% of the A-genome assembled and over 90% of the assembled sequences oriented to 13 chromosomes that harbor at least 41,330 genes. As the first cultivated cotton sequenced in the world, the G. arboreum genome will greatly assist in sequencing and assembling the more complex genomes of the most widely grown commercial tetraploid cottons G. hirsutum and G. barbadense. The A-genome sequence also facilitates various biological studies, including fiber evolution and development. This accomplishment, within a few years, will lead to a complete sequence map of the tetraploid cotton genomes, which is much needed for cotton researchers to develop better cottons with enhanced yield and quality traits.
Wang, K., Wang, Z., Li, F., Ye, W., Wang, J., Song, G., Yue, Z., Cong, L., Shang, H., Zhu, S., Zou, C., Li, Q., Yuan, Y., Lu, C., Wei, H., Gou, C., Zheng, Z., Yin, Y., Zhang, X., Liu, K., Wang, B., Song, C., Shi, N., Kohel, R.J., Percy, R.G., Yu, J., Zhu, Y., Wang, J., Yu, S. 2012. The draft genome of a diploid cotton Gossypium raimondii. Nature Genetics. 44(10):1098-1103.
Fang, D.D., Yu, J. 2012. Addition of four-hundred fifty-five microsatellite marker loci to the high density Gossypium hirsutum TM-1 x G. barbadense 3-79 genetic map. Journal of Cotton Science. 16:229-248.