Location: Plant, Soil and Nutrition Research
Title: Automated update, revision, and quality control of the maize genome annotations using MAKER-P improves the B73 RefGen_v3 gene models and identifies new genes
|LAW, MEIYEE - The Jackson Laboratory|
|CHILDS, KEVIN - Michigan State University|
|CAMPBELL, MICHAEL - University Of Utah|
|STEIN, JOSHUA - Cold Spring Harbor Laboratory|
|OLSON, ANDREW - Cold Spring Harbor Laboratory|
|HOLT, CARSON - University Of Utah|
|PANCHY, NICHOLAS - Michigan State University|
|LEI, JIKAI - Michigan State University|
|JIAO, DIAN - University Of Texas|
|ANDORF, CARSON - Iowa State University|
|LAWRENCE, CAROLYN - Iowa State University|
|SHIU, SHIN-HAN - Michigan State University|
|SUN, YANNI - Michigan State University|
|JIANG, NING - Michigan State University|
|YANDELL, MARK - University Of Utah|
Submitted to: Plant Physiology
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 8/7/2014
Publication Date: 1/1/2015
Publication URL: https://doi.org/10.1104/pp.114.245027
Citation: Law, M., Childs, K.L., Campbell, M.S., Stein, J.C., Olson, A.J., Holt, C., Panchy, N., Lei, J., Jiao, D., Andorf, C.M., Lawrence, C.J., Ware, D., Shiu, S., Sun, Y., Jiang, N., Yandell, M. 2015. Automated update, revision, and quality control of the maize genome annotations using MAKER-P improves the B73 RefGen_v3 gene models and identifies new genes. Plant Physiology. 167(1):25-39.
Interpretive Summary: MAKER is a software pipeline (tool kit) that allows researchers to annotate the genes in a reference genome sequence. Its purpose is to allow relatively small genome projects to annotate their genomes independently and create genome databases. MAKER-P is an implementation of MAKER for rapidly creating, managing, and quality-controlling plant genome annotations. This publication describes the use of MAKER-P to improve the annotation of the maize reference genome (B73 RefGen_v3) in less than 3 hours, using iPlant, another publicly available infrastructure for bioinformatics analysis in plants. As a result, 4,466 new genes (not present in the existing annotation build) were identified, 1,393 existing gene models were extended, 2,647 gene models were found to lack supporting evidence, 104,215 pseudogene fragments were identified, and additional non-coding gene annotations were created. In addition, a new method for de novo training of MAKER-P for the annotation of newly sequenced grass genomes is described.
Technical Abstract: The large size and relative complexity of many plant genomes make creation, quality control, and dissemination of high-quality gene structure annotations challenging. In response, we have developed MAKER-P, a fast and easy-to-use genome annotation engine for plants. Here, we report the use of MAKER-P to update and revise the maize (Zea mays) B73 RefGen_v3 annotation build (5b+) in less than 3 h using the iPlant Cyberinfrastructure. MAKER-P identified and annotated 4,466 additional, well-supported protein-coding genes not present in the 5b+ annotation build, added additional untranslated regions to 1,393 5b+ gene models, identified 2,647 5b+ gene models that lack any supporting evidence (despite the use of large and diverse evidence data sets), identified 104,215 pseudogene fragments, and created an additional 2,522 noncoding gene annotations. We also describe a method for de novo training of MAKER-P for the annotation of newly sequenced grass genomes. Collectively, these results lead to the 6a maize genome annotation and demonstrate the utility of MAKER-P for rapid annotation, management, and quality control of grasses and other difficult-to-annotate plant genomes.