Research Project: Mapping Crop Genome Functions for Biology-Enabled Germplasm Improvement

Location: Plant, Soil and Nutrition Research

Title: Management, analyses, and distribution of the MaizeCODE Data on the Cloud

Author
WANG, LIYA - COLD SPRING HARBOR LABORATORY
LU, ZHENYUAN - COLD SPRING HARBOR LABORATORY
DELABASTIDE, MELISSA - COLD SPRING HARBOR LABORATORY
VAN BUREN, PETER - COLD SPRING HARBOR LABORATORY
WANG, XIAOFEI - COLD SPRING HARBOR LABORATORY
GHIBAN, CORNEL - COLD SPRING HARBOR LABORATORY
REGULSKI, MICHAEL - COLD SPRING HARBOR LABORATORY
DRENKOW, JORG - COLD SPRING HARBOR LABORATORY
Ware, Doreen
GINGERAS, THOMAS - COLD SPRING HARBOR LABORATORY
XU, XIAOSA - COLD SPRING HARBOR LABORATORY
RAMIREZ, CARLOS ORTIZ - NEW YORK UNIVERSITY
FERNANDEZ MARCO, CHRISTINA - COLD SPRING HARBOR LABORATORY
WILLIAMS, JASON - COLD SPRING HARBOR LABORATORY
DOBIN, ALEXANDER - COLD SPRING HARBOR LABORATORY
BIRNBAUM, KEN - NEW YORK UNIVERSITY
JACKSON, DAVID - COLD SPRING HARBOR LABORATORY
MARTIENSSEN, ROBERT - COLD SPRING HARBOR LABORATORY
MCCOMBIE, RICHARD - COLD SPRING HARBOR LABORATORY
MICKLOS, DAVID - COLD SPRING HARBOR LABORATORY
SCHATZ, MICHAEL - COLD SPRING HARBOR LABORATORY

Submitted to: Frontiers in Plant Science
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 2/26/2020
Publication Date: 3/31/2020
Citation: Wang, L., Lu, Z., Delabastide, M., Van Buren, P., Wang, X., Ghiban, C., Regulski, M., Drenkow, J., Ware, D., Gingeras, T., Xu, X., Ramirez, C., Fernandez Marco, C., Williams, J., Dobin, A., Birnbaum, K., Jackson, D., Martienssen, R., McCombie, R.W., Micklos, D., Schatz, M. 2020. Management, analyses, and distribution of the MaizeCODE Data on the Cloud. Frontiers in Plant Science. 31. https://doi.org/10.3389/fpls.2020.00289.
DOI: https://doi.org/10.3389/fpls.2020.00289

Interpretive Summary: MaizeCODE, a project for the analysis of functional elements in the maize genome, has been generating data from three different corn varieties and one variety of teosinte, the ancestor of domesticated corn. To process, analyze, and provide access to these data in a reproducible way, we have extended the SciApps portal. The SciApps workflow platform has been improved to handle data management for the MaizeCODE project. The platform supports accessible and reproducible scientific workflows through a programmatic interface that enables large-scale processing and distribution of the primary data, information about the experiments, and the data needed to reproduce the analyses. The SciApps portal is a flexible platform that allows integration of new analysis tools, workflows, and genomic data from multiple projects. The portal experience is designed to improve access both for the scientists who produce the data and for those who want to use the project resources.

Technical Abstract: MaizeCODE, a project for the analysis of functional elements in the maize genome, has assayed up to five tissues from four maize genomes (B73, NC350, W22, TIL11) using RNA-Seq, ChIP-Seq, RAMPAGE, and small RNA sequencing in its initial phase. To facilitate reproducible science and provide both human and machine access to the MaizeCODE data, a cloud-based portal, SciApps, is used and has been further developed for analysis and distribution of both raw data and analysis results. Building on the SciApps workflow platform, new components have been developed to support the complete cycle of MaizeCODE data management, including publicly accessible scientific workflows for reproducible and shareable analysis of various functional data, a RESTful API for batch processing and distribution of both data and metadata, a searchable data page that lists each MaizeCODE experiment as a reproducible workflow, and Genome Browser tracks that are linked to workflows and metadata. The SciApps portal is a flexible platform that allows integration of new analysis tools, workflows, and genomic data from multiple projects. The portal experience is designed to improve both access to and analysis of the MaizeCODE data by relying on both metadata and a ready-to-compute cloud-based platform.