Project : USDA ARS

Research Project #434521

Research Project: MaizeGDB: Enabling Access to Basic, Translational, and Applied Research Information

Location: Corn Insects and Crop Genetics Research

2019 Annual Report

Objectives
Objective 1: Accelerate maize trait analysis, germplasm analysis, genetic studies, and breeding through stewardship of maize genomes, genetic data, genotype data, and phenotype data.
Objective 2: Develop an infrastructure to curate, integrate, query, and visualize the genetic, genomic, and phenotypic relationships in maize germplasm.
Objective 3: Identify and curate key datasets for benchmarking genomic discovery tools for the functional annotation of maize genomes, for agronomic trait analyses, for breeding (including genome editing), and for improving database interoperability.
Objective 4: Provide community support services, training and documentation, meeting coordination, support for community elections and surveys, and support for the crop genome database community.
Objective 5: Collaborate with database developers and plant researchers to develop improved methods and mechanisms for open, standardized data and knowledge exchange to enhance database utility and interoperability.

Approach
The Maize Genetics and Genomics Database (MaizeGDB – http://www.maizegdb.org) is the model organism database for maize. MaizeGDB’s overall aim is to provide long-term storage, support, and stability to the maize research community’s data and to provide informatics services for access, integration, visualization, and knowledge discovery. The MaizeGDB website, database, and underlying resources allow plant researchers to understand basic plant biology, make genetic enhancements, facilitate breeding efforts, and translate those findings into products that increase crop quality and production. To accelerate research and breeding progress, generated data must be made freely and easily accessible. Curation of high-quality and high-impact datasets has been the foundation of the MaizeGDB project since its inception over 25 years ago. MaizeGDB serves as a two-way conduit for getting maize research data to and from our stakeholders. The maize research community uses data at MaizeGDB to facilitate their research, and in return, their published data is curated at MaizeGDB. The information and data provided at MaizeGDB and facilitated through outreach have been used directly in research that has had broad commercial, social, and academic impacts. The MaizeGDB team will make accessible high-quality, actively curated, and reliable genetic, genomic, and phenotypic description datasets. At the root of high-quality genome annotation lie well-supported assemblies and annotations. For this reason, we focus our efforts on benefitting researchers by developing a system to ensure long-term stewardship of both a representative reference genome sequence assembly with associated structural and functional annotations and additional reference-quality genomes that help represent the diversity of maize. In addition, we will enable researchers to access data in a customized and flexible manner by deploying tools that enable direct interaction with the MaizeGDB database.
Continued efforts to engage in education, outreach, and organizational needs of the maize research community will involve the creation and deployment of video and one-on-one tutorials, updating maize Cooperators on developments of interest to the community, and supporting the information technology needs of the Maize Genetics Executive Committee and Annual Maize Genetics Conference Steering Committee.

Progress Report
The Maize Genetics and Genomics Database (MaizeGDB) provides tools and resources that make the maize genome sequence useful for investigative research and crop improvement. MaizeGDB’s objectives are to provide stewardship of key datasets related to maize genetics, genomics, and breeding; develop robust infrastructure to store, query, integrate, and visualize data; curate high-quality, high-impact datasets; interact with the maize research community to identify needs and priorities; and work with outside communities and databases to coordinate on data standards and interoperability. MaizeGDB’s stewardship efforts have focused on high-quality genome sequences. MaizeGDB is working closely with the University of Georgia, Cold Spring Harbor Laboratory, and Iowa State University to steward 26 high-quality, diverse maize genomes and expression datasets. As part of an agreement with Iowa State University, a pan-genome representation of the 26 genomes is being generated and integrated into MaizeGDB. Data for thousands of other public maize lines that represent the broad diversity of maize are also available. MaizeGDB has ongoing efforts to curate datasets (over 150 datasets to date) that have been associated with functional regions in the genome, which can be visually explored through a genome browser tool. MaizeGDB now supports 14 genome browsers for recently released maize genomes and over 90 datasets that can be used as targets for sequence similarity searches. Data is being generated and collected to represent how genes and proteins are expressed in different conditions and across multiple genome assemblies. This allows researchers to leverage research outcomes from many different sources and data types, all within the context of the maize reference assemblies. A pathway viewer helps researchers determine which genes and pathways to select for targeted crop improvement.
MaizeGDB also identifies key datasets to include in a tool, co-developed under an agreement with the University of Missouri, that enables researchers to integrate their data with publicly available data and perform meta-analysis. MaizeGDB has updated its infrastructure so that it can host community-developed applications. These applications have easy-to-use interfaces that provide access to data, analysis, and visualization. Additional curation efforts have targeted datasets and tools to support maize trait analysis, genetic studies, and breeding. MaizeGDB has coordinated with a consortium of over 25 agricultural biological databases to develop best practices that ensure data adheres to community-defined standards. MaizeGDB continues to be the community hub for maize research, coordinating activities and providing technical support to the maize research community. Work carried out by the MaizeGDB team has resulted in improved communication among maize researchers worldwide, increased ability to document the results of experiments, and increased availability of information related to high-impact research.

Accomplishments
1. MaizeGDB completed a major expansion of resources to improve data access and facilitate crop improvement. In genetics and genomics research, model organism databases are critical: they act both as a data repository and as a resource that crop breeders and researchers use to search, integrate, analyze, and visualize the data essential to their work. The needs of research communities change continually as the size, scale, and types of available data grow quickly. The genetics and genomics database for the maize research community is MaizeGDB. ARS researchers in Ames, Iowa, have expanded its capabilities to meet the needs of the maize research community, including the ability to host multiple reference-quality assemblies, support datasets related to gene expression, and provide resources to better understand the relationships between genes and phenotypes. MaizeGDB now hosts reference assemblies for 14 maize genomes, including three additions in the past year. MaizeGDB now provides a tool that compares how different genes are expressed in a plant under more than 150 different conditions and a curation tool to link images of traits to genes. These new resources will lead to improved crop performance by helping researchers better understand how the genes in a plant define the potential traits that will be observed in farmers’ fields.

Review Publications
Andorf, C.M., Beavis, W.D., Hufford, M., Lubberstedt, T., Smith, S., Suza, W., Wang, K., Woodhouse, M., Yu, J. 2019. Technological advances in maize breeding: Past, present and future. Journal of Theoretical and Applied Genetics. 132(3):817-849. https://doi.org/10.1007/s00122-019-03306-3.
Schott, D.A., Vinnakota, A.G., Portwood II, J.L., Andorf, C.M., Sen, T.Z. 2018. SNPversity: A web-based tool for visualizing diversity. Database: The Journal of Biological Databases and Curation. https://doi.org/10.1093/database/bay037.
Zhou, N., Siegel, Z.D., Zarecor, S., Lee, N., Campbell, D.A., Andorf, C.M., Nettleton, D., Lawrence-Dill, C.J., Ganapathysubramanian, B., Kelly, J.W., Friedberg, I. 2018. Crowdsourcing image analysis for plant phenomics to generate ground truth data for machine learning. PLoS Computational Biology. 14(7):e1006337. https://doi.org/10.1371/journal.pcbi.1006337.
Springer, N., Anderson, S., Andorf, C.M., Ahern, K., Bai, F., Barad, O., Barbazuk, W., Bass, H.W., Baruch, K., Ben-Zvi, G., Buckler IV, E.S., Bukowski, R., Campbell, M.S., Cannon, E.K., Chomet, P., Dawe, R., Davenport, R., Dooner, H.K., He Du, L., Du, C., Easterling, K., Gault, C., Guan, J., Jander, G., Hunter III, C.T., Jiao, Y., Koch, K.E., Kol, G., Kudo, T., Li, Q., Lu, F., Mayfield-Jones, D., Mei, W., McCarty, D.R., Noshay, J., Portwood II, J.L., Ronen, G., Settles, M.A., Shem-Tov, D., Shi, J., Soifer, I., Stein, J.C., Suzuki, M., Vera, D.L., Vollbrecht, E., Vrebalov, J.T., Ware, D., Wei, X., Wimalanathan, K., Woodhouse, M.R., Xiong, W., Brutnell, T.P. 2018. The maize W22 genome provides a foundation for functional genomics and transposon biology. Nature Genetics. 50:1282-1288. https://doi.org/10.1038/s41588-018-0158-0.
Harper, E.C., Campbell, J., Cannon, E.K., Jung, S., Main, D., Poelchau, M.F., Walls, R.L., Andorf, C.M., Arnaud, E., Berardini, T.Z., Birkett, C.L., Cannon, S.B., Carson, J., Condon, B., Cooper, L., Dunn, N., Elsik, C., Farmer, A., Ficklin, S., Grant, D.M., Grau, E., Hendon, N., Hu, Z., Humann, J., Jaiswal, P., Jonquet, C., Laporte, M., Larmande, P., Lazo, G.R., McCarthy, F., Menda, N., Mungall, C., Munoz-Torres, M., Naithani, S., Nelson, R., Nesdill, D., Park, C., Reecy, J., Reiser, L., Sanderson, L., Sen, T.Z., Staton, M., Subramaniam, S., Karey, T., Unda, V., Unni, D., Wang, L., Ware, D., Wegrzyn, J., Williams, J., Woodhouse, M. 2018. AgBioData consortium recommendations for sustainable genomics and genetics databases for agriculture. Database: The Journal of Biological Databases and Curation. 2018(1):1-32. https://doi.org/10.1093/database/bay088.
