
United States Department of Agriculture

Agricultural Research Service


Location: Crop Germplasm Research

Title: Genome sequence of the cultivated cotton Gossypium arboreum

Authors:
Li, Fuguang
Fan, Guangyi
Wang, Kunbo
Sun, Fengming
Yuan, Youlu
Song, Guoli
Ma, Zhiying
Li, Qin
Lu, Cairui
Zou, Changsong
Chen, Wenbin
Liang, Xinming
Shang, Haihong
Liu, Weiqing
Xiao, Guanghui
Gou, Caiyun
Ye, Wuwei
Xu, Xun
Zhang, Xueyan
Wei, Hengling
Li, Zhifang
Zhang, Guiyin
Wang, Junyi
Liu, Kun
Kohel, Russell
Percy, Richard
Yu, John
Zhu, Yu-xian
Wang, Jun
Yu, Shuxun

Submitted to: Nature Genetics
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 4/24/2014
Publication Date: 6/1/2014
Citation: Li, F., Fan, G., Wang, K., Sun, F., Yuan, Y., Song, G., Ma, Z., Li, Q., Lu, C., Zou, C., Chen, W., Liang, X., Shang, H., Liu, W., Xiao, G., Gou, C., Ye, W., Xu, X., Zhang, X., Wei, H., Li, Z., Zhang, G., Wang, J., Liu, K., Kohel, R.J., Percy, R.G., Yu, J., Zhu, Y., Wang, J., Yu, S. 2014. Genome sequence of the cultivated cotton Gossypium arboreum. Nature Genetics. 46(6):567-574.

Interpretive Summary: Upland cotton originated from the hybridization of two species and therefore has a very complex genetic makeup. Sequencing Upland cotton will greatly aid researchers in characterizing and exploiting cotton germplasm for agronomic traits, but its two-species origin greatly complicates sequencing efforts. As the first step toward sequencing Upland cotton, we had to sequence both of its putative parents. Here we report the complete sequencing and successful assembly of one of those parents. Over 90 percent of the assembled sequences, covering more than 98 percent of the parent species genome, were anchored and oriented to 13 chromosomes. A total of 41,330 genes were predicted with 92 percent being confirmed. The sequencing of the parent species A-genome, along with the previously published sequence of Upland cotton's other parent D-genome, lays the foundation for fully sequencing and assembling the more genetically complex commercial Upland cotton varieties. The parent species sequence provides the research community with critical resources and information for accelerated identification and enhancement of genetic systems contributing to cotton productivity, quality and environmental stability.

Technical Abstract: Cotton is one of the most economically important natural fiber crops in the world, and the complex tetraploid nature of its genome (AADD, 2n = 52) makes genetic, genomic and functional analyses extremely challenging. Here we sequenced and assembled 98.3% of the 1.7-gigabase G. arboreum (AA, 2n = 26) genome, whose progenitor is a putative contributor of the diploid A-subgenome to tetraploid cottons. Paired-end sequencing of 10 libraries with insert sizes ranging from 180 bp to 40 kb yielded 193.6 Gb of clean sequence, covering the genome 112.6-fold. Using a set of 24,569 single-nucleotide polymorphism (SNP) markers obtained from 154 F2 restriction-site-associated DNA (RAD) lines, we were able to anchor and orient 90.4% of the assembly on 13 pseudochromosomes. The majority of the genome (68.5%) is occupied by repetitive DNA sequences, most of which are long terminal repeats (LTRs). We predicted 41,330 protein-coding genes in G. arboreum, a number similar to that of G. raimondii. One ancient whole-genome duplication (WGD; about 115 to 146 million years ago, MYA) and one recent WGD (approximately 13 to 20 MYA) were shared by both species before the speciation event around 2 to 13 MYA. The twofold difference in size between these otherwise highly collinear genomes resulted from LTR insertions during the past five million years. Expansion and contraction of nucleotide-binding site (NBS) gene family sizes in different cotton species may be responsible for their differing resistance to Verticillium dahliae. The ethylene-centered regulatory pathway may fundamentally determine the fate of cotton fiber cell development.
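The sequencing depth quoted in the abstract follows from simple arithmetic: average fold coverage is the total amount of clean sequence divided by the genome size. The short Python sketch below illustrates that relationship using only the figures given in the abstract (193.6 Gb of clean sequence, 112.6-fold coverage, and a genome of roughly 1.7 Gb). The function names are hypothetical and chosen for illustration; this is not the authors' analysis code, and the genome size it back-calculates is an inference from the rounded published numbers rather than a reported value.

```python
def fold_coverage(total_bases_gb: float, genome_size_gb: float) -> float:
    """Average sequencing depth: total clean bases divided by genome size."""
    if genome_size_gb <= 0:
        raise ValueError("genome size must be positive")
    return total_bases_gb / genome_size_gb


def implied_genome_size_gb(total_bases_gb: float, coverage: float) -> float:
    """Genome size implied by a given data volume and a reported fold coverage."""
    if coverage <= 0:
        raise ValueError("coverage must be positive")
    return total_bases_gb / coverage


# Figures taken from the abstract.
CLEAN_SEQUENCE_GB = 193.6
REPORTED_COVERAGE = 112.6
ROUNDED_GENOME_SIZE_GB = 1.7

if __name__ == "__main__":
    # Depth computed against the rounded 1.7 Gb genome size.
    depth = fold_coverage(CLEAN_SEQUENCE_GB, ROUNDED_GENOME_SIZE_GB)
    # Genome size that would reproduce the reported 112.6-fold figure exactly.
    size = implied_genome_size_gb(CLEAN_SEQUENCE_GB, REPORTED_COVERAGE)
    print(f"Depth against 1.7 Gb: {depth:.1f}-fold")
    print(f"Genome size implied by 112.6-fold: {size:.2f} Gb")
```

Dividing by the rounded 1.7 Gb gives a depth slightly above the reported 112.6-fold, which suggests the reported figure was computed against an estimated genome size a little larger than 1.7 Gb (about 1.72 Gb under this sketch's assumptions).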

Last Modified: 10/16/2017