
Research Project: Advanced Genomic and Bioinformatic Tools for Accelerated Cotton Genetic Improvement

Location: Crop Germplasm Research

Title: De novo genome assemblies of Gossypium raimondii and G. turneri

Author
Udall, Joshua - Josh
LONG, EVAN - Brigham Young University
HANSON, CHRIS - Brigham Young University
YUAN, DAOJUN - Iowa State University
RAMARAJ, THIRUVARANGAN - Brigham Young University
CONOVER, JUSTIN - Iowa State University
GONG, LEI - National Center For Genome Resources
ARICK, MARK - Northeast Normal University
GROVER, CORRINNE - Iowa State University
PETERSON, DANIEL - Northeast Normal University
WENDEL, JONATHAN - Iowa State University

Submitted to: G3, Genes/Genomes/Genetics
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 8/26/2019
Publication Date: 10/1/2019
Citation: Udall, J.A., Long, E., Hanson, C., Yuan, D., Ramaraj, T., Conover, J.L., Gong, L., Arick, M.A., Grover, C.E., Peterson, D.G., Wendel, J.F. 2019. De novo genome assemblies of Gossypium raimondii and G. turneri. G3, Genes/Genomes/Genetics. 9(10):3079-3085. https://doi.org/10.1534/g3.119.400392.
DOI: https://doi.org/10.1534/g3.119.400392

Interpretive Summary: In this report, we describe the new assembly and annotation of two genome sequences (Gossypium raimondii and G. turneri) using PacBio, HiC, and Bionano technologies. The integrity of these assemblies is excellent, and both genomes have chromosome-level continuity with large underlying contig N50s. The G. raimondii genome has been previously sequenced. It has been used as "the cotton reference genome sequence" for the last 7 years, as evidenced by its 641 citations since its publication in 2012. Even though this original assembly was of great significance, it contained a few errors because of technical limitations that existed 10 years ago. In this report, we briefly compare the previous and new genome assemblies, and we correct several assembly errors contained in the original genome sequence. These errors included inversions, translocations, and a mitochondrial genome insertion event. We also validate the G. raimondii genome assembly by comparing it to the new genome assembly of G. turneri, a close D-genome relative of diploid G. raimondii. The analysis described is brief but well substantiated. It is meant to be a report of the genome sequences that are publicly released for the cotton community. The genome sequences are available on NCBI and on CottonGen for use by other researchers. We believe this report will be of great value to the cotton community because of the prominence of this genome as a genetic resource and the high quality of the new assemblies.

Technical Abstract: Cotton is an agriculturally important crop. Because of its importance, a genome sequence of a diploid cotton species (Gossypium raimondii, D-genome) was first assembled using Sanger sequencing data in 2012. Advances in DNA sequencing technology have since improved the accuracy and correctness of assembled genome sequences. Here we report new de novo genome assemblies of G. raimondii and its close relative G. turneri. The two genomes were assembled to a chromosome level using PacBio long-read technology, HiC, and Bionano optical mapping. This report corrects some minor assembly errors found in the Sanger assembly of G. raimondii. We also compare the genome sequences of these two species for gene composition, repetitive element composition, and collinearity. Most of the identified structural rearrangements between these two species are due to intra-chromosomal inversions. More inversions were found in the G. turneri genome sequence than in the G. raimondii genome sequence. These findings and updates to the D-genome sequence will improve the accuracy and translation of genomics to cotton breeding and genetics.