Location: Crop Germplasm Research
Title: The draft genome of a diploid cotton Gossypium raimondii
Submitted to: Nature Genetics
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 7/5/2012
Publication Date: 8/26/2012
Citation: Wang, K., Wang, Z., Li, F., Ye, W., Wang, J., Song, G., Yue, Z., Cong, L., Shang, H., Zhu, S., Zou, C., Li, Q., Yuan, Y., Lu, C., Wei, H., Gou, C., Zheng, Z., Yin, Y., Zhang, X., Liu, K., Wang, B., Song, C., Shi, N., Kohel, R.J., Percy, R.G., Yu, J., Zhu, Y., Wang, J., Yu, S. 2012. The draft genome of a diploid cotton Gossypium raimondii. Nature Genetics. 44(10):1098-1103.
Interpretive Summary: Cotton is the leading fiber crop contributing to the world economy. Due to the unique nature of its fiber, cotton is also an excellent model for studying basic biological questions, such as cell elongation and differentiation. Like a number of cultivated crops, cotton is the product of the hybridization of two different species, followed by a doubling of the plant's chromosomes. To gain insights into the fusion and rearrangement of genetic material in cultivated cotton, we have sequenced the DNA of one of its putative parent species, Gossypium raimondii. High levels of rearrangement among and within chromosomes, as well as duplication of genetic material, were observed in the species. This research is a major step towards fully sequencing and analyzing modern cultivated cotton for accelerated identification and enhancement of genetic systems contributing to cotton productivity, quality and environmental stability.
Technical Abstract: We have sequenced and assembled the draft genome of Gossypium raimondii, whose progenitor is considered the contributor of the D-subgenome to the economically important natural textile fiber producer, G. hirsutum. Next-generation Illumina paired-end (PE) sequencing strategies were employed to obtain 103.6-fold genome coverage of cleaned DNA sequence from various shotgun libraries with insert sizes ranging from 170 bp to 40 kbp. Over 73% of the assembled sequences were anchored on 13 G. raimondii chromosomes or linkage groups. The genome was predicted to contain 40,976 protein-coding genes, 92.2% of which were further confirmed by transcriptome data. We observed two whole genome duplication (WGD) events, one occurring about 56-63 million years ago (MYA) and the other about 13-20 MYA. We also identified 2,355 synteny blocks in the G. raimondii genome. About 40% of the gene models are present in more than one block, suggesting that the G. raimondii genome has undergone substantial chromosome rearrangements. Nearly 57% of the genome is composed of transposable elements, most of which may come from the expansion of long terminal repeats (LTRs) from 4 MYA to the present. Comparing the G. hirsutum and G. raimondii transcriptomes reveals qualitative differences in the expression patterns of gene families of key importance in cotton fiber initiation, elongation and cell wall synthesis. Genes involved in gossypol biosynthesis can be found only in cotton and in T. cacao, but not in V. vinifera, which indicates that gossypol production evolved after the separation of the last two closely related species. The G. raimondii genome not only provides a major source of candidate genes for cotton research, but may also serve as a platform for assembly of the tetraploid G. hirsutum genome.