Genomic Analysis of Cultivated Cotton and Related Wild Species
Genomics and Bioinformatics Research Unit
2012 Annual Report
1a. Objectives (from AD-416):
The genome sequence of Gossypium (G.) raimondii (D-genome cotton) is scheduled to be released in late 2011, while the G. arboreum (A-genome cotton) genome is slated for release in 2012. These two diploid species are believed to be the progenitors of the tetraploid commercial cotton species G. barbadense and G. hirsutum (both AADD). G. arboreum is cultivated in some parts of the world, but it lacks the superior characteristics of the two cultivated tetraploid species. Although G. raimondii does not produce lint, many of the traits associated with fiber productivity and quality in the tetraploid commercial species appear to be largely accounted for by G. raimondii genes. With two reference genomes available, it is now possible to begin large-scale whole-genome comparisons among all Gossypium species. To that end, we will use genome resequencing to explore the genetic diversity of the genus Gossypium.
1b. Approach (from AD-416):
We will use short-read sequencing technology to produce 20X-50X genome coverage of approximately 26 different cotton species/cultivars. The sequence data will then be mapped back to the reference genomes to identify differences between and within species. Approximately eight diploid species representing the D, C, G, K, A, F, E, and B genomes and five AD tetraploid species will be resequenced. Within cultivated tetraploid cotton, 4 accessions of G. barbadense and 19 accessions of G. hirsutum will be resequenced. This approach is expected to uncover relevant single nucleotide polymorphisms (SNPs) that can be used in genetic mapping and marker-assisted selection. Allelic and gene variation will also be examined for future use in improving cotton fiber quality and yield.
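The 20X-50X coverage target implies a raw sequencing yield per accession once a genome size is fixed. The following is a minimal sketch of that arithmetic, not part of the project's actual pipeline. The function names are hypothetical, and the genome sizes are rounded, assumed values chosen for illustration only; they are not taken from this report.

```python
# Illustrative coverage arithmetic: coverage = total sequenced bases / genome size.
# Genome sizes below are ASSUMED round figures for illustration, not report data.

ASSUMED_GENOME_SIZES_BP = {
    "diploid, D-genome-like (assumed ~0.9 Gb)": 900_000_000,
    "tetraploid, AD-genome-like (assumed ~2.5 Gb)": 2_500_000_000,
}


def bases_required(genome_size_bp: int, coverage: float) -> float:
    """Return the number of sequenced bases needed to reach a target coverage."""
    return genome_size_bp * coverage


def coverage_achieved(total_bases: float, genome_size_bp: int) -> float:
    """Return the average fold coverage given the total sequenced bases."""
    return total_bases / genome_size_bp


if __name__ == "__main__":
    for label, size_bp in ASSUMED_GENOME_SIZES_BP.items():
        low = bases_required(size_bp, 20)   # lower bound of the 20X-50X target
        high = bases_required(size_bp, 50)  # upper bound of the 20X-50X target
        print(f"{label}: {low / 1e9:.1f} to {high / 1e9:.1f} Gb of raw sequence")
```

Because the required yield scales linearly with genome size, a tetraploid accession needs proportionally more sequence than a diploid one to reach the same fold coverage.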
More than 500 billion base pairs of DNA sequence data have been generated for various Gossypium (G.) species and G. hirsutum lines. This is roughly one-quarter of the sequence data slated to be produced in the project. The sequence data in hand have been aligned to the public G. raimondii genome sequence. Computational detection of single nucleotide polymorphisms and characterization of repetitive elements have been initiated. Some of the data from species related to G. raimondii have been used to explore the evolution of the cotton D genome. The project is proceeding on schedule and on budget.