Location: Plant, Soil and Nutrition Research
2013 Annual Report
1a. Objectives (from AD-416):
Biological research benefits extraordinarily from the integration of many different types of data, both within and between species. The first specific objective of the proposal builds on existing and emerging data sets, providing resources to characterize, track, and ultimately identify sequences associated with agronomically important traits. The second objective addresses infrastructure to manage, visualize, and distribute complex data sets. The research makes use of four methodologies: data integration, software development, genome annotation, and evolutionary analysis. Throughout the proposal the objectives build upon one another. Combined, they hold greater potential for providing a knowledge base for improving agricultural varieties.
1. Enhance our knowledge of plant genome structure, organization, and evolution through computational and experimental approaches.
2. Develop and implement standards for plant genome databases. This includes development of vocabulary, methods, database structures, and visualization software to facilitate data integration and interoperability.
1b. Approach (from AD-416):
We propose to leverage computational and experimental approaches, building on existing and newly developed resources, to create standardized baseline comparative maps and genome annotations across plant genomes, with an emphasis on crop grasses and other agriculturally important species as well as model genomes. As part of this work we will build upon existing infrastructure to deliver data management and visualization tools for sequence, map, diversity, and phenotype data sets.
3. Progress Report:
In an effort to provide data standards and interpretation of plant genomes, we expanded our DNA and protein comparative whole-genome analysis from 19 to 25 genomes. We updated the B73 maize (corn) reference assembly and annotations and submitted these to the reference archives at GenBank. We began work with the Genome Reference Consortium (GRC) to support the maize community's access to the genome through GRC resources. We also contributed to additional draft assemblies in wheat, maize, and rice. In addition to developing draft assemblies and structural annotations, we are exploring approaches that make use of expression and chromatin methylation data in sorghum and maize to improve biological models and functional annotations. While the genomes provide the parts list, our work on gene networks supports the characterization of genes involved in development and response to stress. We have extended the Arabidopsis miRNA gene root network by identifying potential upstream regulators for an additional 57 genes and are developing genetic resources to validate these networks in plants. In maize we have characterized grass-specific gene networks associated with flower development and architecture, using expression and DNA binding profiles of maize flower development mutants. We continue to refine workflows for understanding the architecture of the sequence that determines when and where a gene is expressed (core promoter motifs) and have applied these workflows to 8 eukaryote genomes. Recent technological advancements have led to an expansion of data in the biological sciences. Managing, accessing, integrating, and interpreting these data is now one of the major challenges in agricultural science.
In the past year, this project has contributed to the scientific leadership, development, and community outreach for three collaborative infrastructure projects that support open data and service initiatives for agriculture: Gramene/Ensembl (NSF, EBI), iPlant (NSF), and Systems Biology Knowledge Base (DOE). Progress has been made on infrastructure for hosting data and for access to high-performance computing from the command line or a graphical user interface (Web). The resources have focused on high-priority targets that support data integration, genome annotation, phenotype association, and network analyses. We have also organized or participated in more than 10 meetings to support training, vision, and standards for the agricultural and broader life science community. This work was done in collaboration with USDA ARS scientists at several locations, as well as public- and private-sector scientists at Cold Spring Harbor Laboratory, Texas A&M, UC Davis, Oregon State University, European Bioinformatics Institute, DOE national laboratories, and Pioneer/DuPont.
1. How and where genes are expressed in a cell is controlled by the information encoded in the DNA sequence. We now know that there are a number of novel switches that can be set to tell the cell whether the instructions in the DNA sequence should be modified. Recently we have begun to understand that these switches are associated with markers on the genome. An ARS researcher at Cold Spring Harbor Laboratory at Ithaca, New York recently reviewed the status of these switches in the genomes of two different varieties of corn to identify the differences in these markers between them, and found that the markers are inherited. These findings are providing insights into how differences in expression may be regulated and inherited from one generation to the next.
Monaco, M.K., Sen, T.Z., Dharmawardhana, P., Ren, L., Schaeffer, M.L., Amarasinghe, V., Thomason, J., Harper, E.C., Gardiner, J.M., Lawrence, C.J., Ware, D., Jaiswal, P., Naithani, S., Cannon, E. 2013. Maize metabolic network construction and transcriptome analysis. The Plant Genome. 6(1). DOI: 10.3835/plantgenome2012.09.0025.