Research Project: Enhancing Plant Genome Function Maps Through Genomic, Genetic, Computational and Collaborative Research

Location: Plant, Soil and Nutrition Research

Title: High-throughput comparison, functional annotation, and metabolic modeling of plant genomes using the PlantSEED resource

Author
SEAVER, SAMUEL - Argonne National Laboratory
GERDES, SVETLANA - Argonne National Laboratory
FRELIN, OCEANE - University Of Florida
LERMA-ORTIZ, CLAUDIA - University Of Florida
BRADBURY, LOUIS - University Of Florida
ZALLOT, REMI - University Of Florida
HASNAIN, GHULAM - University Of Florida
NIEHAUS, THOMAS - University Of Florida
EL YACOUBI, BASMA - University Of Florida
PASTERNAK, SHIRAN - Cold Spring Harbor Laboratory
OLSON, ROBERT - Argonne National Laboratory
PUSCH, GORDON - Argonne National Laboratory
OVERBEEK, ROSS - Argonne National Laboratory
STEVENS, RICK - Argonne National Laboratory
DE CRECY-LAGARDE, VALERIE - Cold Spring Harbor Laboratory
Ware, Doreen
HANSON, ANDREW - University Of Florida
HENRY, CHRISTOPHER - Argonne National Laboratory

Submitted to: Proceedings of the National Academy of Sciences (PNAS)
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 4/7/2014
Publication Date: 7/1/2014
Publication URL: https://doi.org/10.1073/pnas.1401329111
Citation: Seaver, S.M., Gerdes, S., Frelin, O., Lerma-Ortiz, C., Bradbury, L.M., Zallot, R., Hasnain, G., Niehaus, T.D., El Yacoubi, B., Pasternak, S., Olson, R., Pusch, G., Overbeek, R., Stevens, R., De Crecy-Lagarde, V., Ware, D., Hanson, A.D., Henry, C.S. 2014. High-throughput comparison, functional annotation, and metabolic modeling of plant genomes using the PlantSEED resource. Proceedings of the National Academy of Sciences. 111(26):9645-9650.

Interpretive Summary: PlantSEED is a new tool that helps plant scientists label (or, in scientific terms, “annotate”) plant genes more quickly and easily. It is a free, public information system that takes data from scientists around the world, places it on a common platform, and provides plant models that everyone can use. As a result, identifying genes important for productivity, stress resistance, and other traits will become much easier for traditional and non-traditional breeders alike. The approach resembles the way aeronautical engineers test newly designed equipment in computer models before building aircraft. Because the average plant has 20,000 to 30,000 genes, such models should allow researchers to plan very specific alterations by first testing their potential effects on the whole plant system before proceeding with breeding by design.

Technical Abstract: The increasing number of sequenced plant genomes is placing new demands on the methods applied to analyze, annotate, and model these genomes. Today's annotation pipelines result in inconsistent gene assignments that complicate comparative analyses and prevent efficient construction of metabolic models. To overcome these problems, we have developed the PlantSEED, an integrated, metabolism-centric database to support subsystems-based annotation and metabolic model reconstruction for plant genomes. PlantSEED combines SEED subsystems technology, first developed for microbial genomes, with refined protein families and biochemical data to assign fully consistent functional annotations to orthologous genes, particularly those encoding primary metabolic pathways. Seamless integration with its parent, the prokaryotic SEED database, makes PlantSEED a unique environment for cross-kingdom comparative analysis of plant and bacterial genomes. The consistent annotations imposed by PlantSEED permit rapid reconstruction and modeling of primary metabolism for all plant genomes in the database. This feature opens the unique possibility of model-based assessment of the completeness and accuracy of gene annotation and thus allows computational identification of genes and pathways that are restricted to certain genomes or need better curation. We demonstrate the PlantSEED system by producing consistent annotations for 10 reference genomes. We also produce a functioning metabolic model for each genome, gap filling to identify missing annotations and proposing gene candidates for missing annotations. Models are built around an extended biomass composition representing the most comprehensive published to date. To our knowledge, our models are the first to be published for seven of the genomes analyzed.
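
To make the gap-filling idea in the abstract concrete, the following is a minimal toy sketch, not the PlantSEED implementation. It assumes a simplified view in which a model is a set of reactions, a biomass component counts as "producible" if it can be reached from the nutrients by repeatedly firing reactions whose substrates are all available, and gap filling means finding the smallest set of candidate reactions that makes every biomass component producible. All reaction names (R1, R2, R3, R9), metabolite names (A, B, C, D, X, Y), and function names are hypothetical and chosen only for illustration; published gap-filling methods are typically formulated as optimization problems over flux-balance models rather than as this exhaustive search.

```python
from itertools import combinations


def reachable(reactions, nutrients):
    """Return every metabolite producible from the nutrients.

    `reactions` maps a reaction name to (substrates, products), both sets.
    A reaction can fire once all of its substrates are available.
    """
    available = set(nutrients)
    changed = True
    while changed:
        changed = False
        for substrates, products in reactions.values():
            if substrates <= available and not products <= available:
                available |= products
                changed = True
    return available


def gap_fill(model, candidates, nutrients, biomass):
    """Return the smallest list of candidate reaction names that, added to
    the model, makes every biomass component producible (None if impossible).

    Exhaustive search: only suitable for toy-sized candidate sets.
    """
    for size in range(len(candidates) + 1):
        for names in combinations(sorted(candidates), size):
            trial = dict(model)
            trial.update({name: candidates[name] for name in names})
            if biomass <= reachable(trial, nutrients):
                return list(names)
    return None


# Hypothetical draft model built from annotated genes: the step B -> C is missing.
model = {
    "R1": ({"A"}, {"B"}),
    "R3": ({"C"}, {"D"}),
}
# Hypothetical reference reactions that have no annotated gene in this genome.
candidates = {
    "R2": ({"B"}, {"C"}),
    "R9": ({"X"}, {"Y"}),
}
nutrients = {"A"}
biomass = {"D"}

missing = gap_fill(model, candidates, nutrients, biomass)
print(missing)
```

In this toy case the search proposes R2, the one reaction that connects the two annotated steps. In a workflow like the one the abstract describes, each proposed reaction would then point to a function for which a gene candidate should be sought in the genome.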