Publication #306143

Title: New whole genome de novo assemblies of three divergent strains of rice (O. sativa) document novel gene space of aus and indica

Author
item SCHATZ, MICHAEL - Cold Spring Harbor Laboratory
item MARON, LYZA - Cornell University
item STEIN, JOSHUA - Cold Spring Harbor Laboratory
item HERNANDEZ, WENCES - Cold Spring Harbor Laboratory
item GURTOWSKI, JAMES - Cold Spring Harbor Laboratory
item BIGGERS, ERIC - Macalester College
item LEE, HAYAN - Cold Spring Harbor Laboratory
item KRAMER, MELISSA - Cold Spring Harbor Laboratory
item ANTONIOU, ERIC - Cold Spring Harbor Laboratory
item GHIBAN, ELENA - Cold Spring Harbor Laboratory
item WRIGHT, MARK - Cornell University
item CHIA, JER-MING - Cold Spring Harbor Laboratory
item Ware, Doreen
item MCCOUCH, SUSAN - Cornell University
item MCCOMBIE, WILLIAM - Cold Spring Harbor Laboratory

Submitted to: Genome Biology
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 6/20/2014
Publication Date: 12/3/2014
Citation: Schatz, M.C., Maron, L.G., Stein, J.C., Hernandez, W.A., Gurtowski, J., Biggers, E., Lee, H., Kramer, M., Antoniou, E., Ghiban, E., Wright, M.H., Chia, J., Ware, D., McCouch, S.R., McCombie, W.R. 2014. New whole genome de novo assemblies of three divergent strains of rice (O. sativa) document novel gene space of aus and indica. Genome Biology. 15:506-521.

Interpretive Summary: This manuscript reports a comparison of genomes sequenced and assembled from three strains of cultivated Asian rice, representing the diverse sub-populations japonica, indica, and aus. The research demonstrated the feasibility of using next-generation sequencing technology to produce high-quality reference assemblies that are amenable to detailed annotation of protein-coding genes. Comparison of the three genomes revealed core conserved genes as well as genes unique to individual strains. Detailed analysis of several loci associated with agriculturally important traits illustrated the utility of this approach for biological discovery.

Technical Abstract: The use of high-throughput genome-sequencing technologies has uncovered a large extent of structural variation in eukaryotic genomes that makes important contributions to genomic diversity and phenotypic variation. Currently, when the genomes of different strains of a given organism are compared, whole-genome resequencing data are aligned to an established reference sequence. However, when the reference differs in significant structural ways from the individuals under study, the analysis is often incomplete or inaccurate. Here, we use rice as a model to explore the extent of structural variation among strains adapted to different ecologies and geographies, and show that this variation can be significant, often matching or exceeding the variation present in closely related human populations or other mammals. We demonstrate how improvements in sequencing and assembly technology allow rapid and inexpensive de novo assembly of next-generation sequence data into high-quality assemblies that can be directly compared to provide an unbiased assessment. Using this approach, we are able to accurately assess the pan-genome of three divergent rice varieties and document several megabases of each genome absent in the other two. Many of the genome-specific loci are annotated to contain genes, reflecting the potential for new biological properties that would be missed by standard resequencing approaches. We further provide a detailed analysis of several loci associated with agriculturally important traits, illustrating the utility of our approach for biological discovery. All of the data and software are openly available to support further breeding and functional studies of rice and other species.