Location: Corn Insects and Crop Genetics Research
Title: MaizeMine: A data mining warehouse for the maize genetics and genomics database (MaizeGDB)
|SHAMIMUZZAMAN, MD - US Department Of Agriculture (USDA)|
|GARDINER, JACK - University Of Missouri|
|WALSH, AMY - University Of Missouri|
|TRIANT, DEBORAH - University Of Missouri|
|LE TOURNEAU, JUSTIN - University Of Missouri|
|TAYAL, ADITI - University Of Missouri|
|UNNI, DEEPAK - Lawrence Berkeley National Laboratory|
|NGUYEN, HUNG - University Of Missouri|
|ELSIK, CHRISTINE - University Of Missouri|
Submitted to: Frontiers in Plant Science
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 10/1/2020
Publication Date: 10/22/2020
Citation: Shamimuzzaman, M., Gardiner, J.M., Walsh, A.T., Triant, D.A., Le Tourneau, J.J., Tayal, A., Unni, D.R., Nguyen, H.H., Portwood II, J.L., Cannon, E.K., Andorf, C.M., Elsik, C.G. 2020. MaizeMine: A data mining warehouse for the maize genetics and genomics database (MaizeGDB). Frontiers in Plant Science. 11. Article 592730. https://doi.org/10.3389/fpls.2020.592730.
Interpretive Summary: The availability of maize genomic, genetic, and breeding datasets has accelerated both maize breeding and genetics research. As the size and complexity of these datasets continue to increase, making the data easily accessible to researchers becomes more challenging. MaizeMine is an online resource for mining data at the Maize Genetics and Genome Database (MaizeGDB). It enables researchers to combine their own data with publicly available maize datasets. MaizeMine provides search tools, built-in templates for common data requests, and a QueryBuilder tool for creating custom queries. These resources facilitate meta-analysis by allowing researchers to access, integrate, and analyze a wide variety of data from the two most recent maize genome assemblies. This work will benefit public and private maize breeders in both academia and commercial seed companies.
Technical Abstract: MaizeMine is the data mining resource of the Maize Genetics and Genome Database (MaizeGDB). It enables researchers to create and export customized annotation datasets that can be merged with their own research data for use in downstream analyses. MaizeMine uses the InterMine data warehousing system to integrate genomic sequences and gene annotations from the B73_RefGen_v3 and B73_RefGen_v4 genome assemblies, Gene Ontology (GO) annotations, dbSNP variation, protein annotations (UniProt), protein families and domains (InterPro), homologs (Ensembl Compara) and pathways (CornCyc, KEGG, Plant Reactome). MaizeMine also provides database cross-references between genes of the AGPv3.21, AGPv4 and RefSeq gene sets, as well as pre-computed expression levels for all three gene sets based on RNA-seq data from the Zea mays B73 Gene Expression Atlas (NCBI BioProject PRJNA171684). MaizeMine provides several search tools, including a keyword search, built-in template queries with intuitive search menus, and a QueryBuilder tool for creating custom queries. The Genomic Regions search tool executes queries based on lists of genome coordinates, and supports both the B73_RefGen_v3 and B73_RefGen_v4 assemblies. The List tool allows users to upload identifiers to create custom lists, perform set operations such as unions and intersections, and execute template queries with lists. When used with gene identifiers, the List tool automatically provides gene set enrichment for GO and pathways, with a choice of statistical parameters and background gene sets. MaizeMine is particularly useful for tracking gene identifiers across gene sets to facilitate meta-analysis.
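The list operations and gene set enrichment described in the abstract can be illustrated with a minimal Python sketch. This is not MaizeMine code and does not call the MaizeMine or InterMine API. The gene identifiers, the annotation set, and the function name hypergeom_enrichment_p are hypothetical, and the use of a one-sided hypergeometric test is an assumption about how this kind of enrichment is commonly computed, not a statement of MaizeMine's exact method.

```python
from math import comb  # requires Python 3.8+


def hypergeom_enrichment_p(k, n, K, N):
    """One-sided hypergeometric p-value P(X >= k).

    k: genes in the query list that carry the annotation
    n: size of the query list
    K: genes in the background that carry the annotation
    N: size of the background gene set
    """
    upper = min(n, K)
    favorable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favorable / comb(N, n)


# Hypothetical identifier lists (placeholders, not real maize gene IDs).
list_a = {"gene_01", "gene_02", "gene_03", "gene_04"}
list_b = {"gene_03", "gene_04", "gene_05"}

# Set operations analogous to those the List tool is described as offering.
union = list_a | list_b
intersection = list_a & list_b

# Hypothetical background gene set and hypothetical annotation (e.g. one GO term).
background = {f"gene_{i:02d}" for i in range(1, 21)}
annotated = {"gene_03", "gene_04", "gene_05", "gene_06"}

# Enrichment of the annotation within list_a relative to the background.
hits = len(list_a & annotated)
p_value = hypergeom_enrichment_p(hits, len(list_a), len(annotated), len(background))

print(sorted(union))
print(sorted(intersection))
print(hits, p_value)
```

In this sketch, changing the background set changes the p-value, which is why a choice of background gene set matters for enrichment results. A real analysis over many terms would also apply a multiple-testing correction, which is omitted here.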