Research Project: Small Grains Database and Bioinformatics Resources

Location: Crop Improvement and Genetics Research

2015 Annual Report

Over the next 5 years the project will focus on the following specific objectives as part of the long-term purpose to synthesize, display, and provide access to small grains genomics and genetics data for the research community and applied users.
Objective 1: Annotate wheat, barley and oat whole genome sequences in collaboration with the crop research communities and integrate with genetic, physical, and trait maps.
• Sub-objective 1.A. - Contribute to wheat genome annotations and incorporation of small grains annotations into GrainGenes.
• Sub-objective 1.B. - Collaborate in integrating small grains genetic, physical, and trait maps.
• Sub-objective 1.C. - Modify GrainGenes with enhanced user tools for accessing genomic and mapping data.
Objective 2: Integrate genotyping and phenotyping results from the Triticeae Coordinated Agricultural Project (T-CAP) including the T3 database, the National Small Grains Collection and GRIN database, and Gramene, to enhance support for trait analysis by association mapping and trait improvement by genomic selection.
• Sub-objective 2.A. - Collaborate in developing common standards describing phenotypes and traits across species.
• Sub-objective 2.B. - Convert data from GRIN, ARS Genotyping Laboratories, and the small grains Regional Field Nurseries to GrainGenes database formats.
• Sub-objective 2.C. - Modify the GrainGenes schema to accommodate increased data volume and utilization.
Objective 3: Collate, analyze, and present trait data from wheat, barley and oat communities to facilitate the genetic improvement of target traits and trait gene isolation.
• Sub-objective 3.A. - Collate data on target traits.
• Sub-objective 3.B. - Implement tools and interfaces for map displays.
Objective 4: Maintain existing and develop new user community outreach.
• Sub-objective 4.A. - Solicitation of user community input.
• Sub-objective 4.B. - Training and education for use of GrainGenes resources.
Objective 5: Facilitate the use of genomic and genetic data, information, and tools for germplasm improvement, thus empowering ARS scientists and partners to use a new generation of computational tools and resources [NP301, C2, PS2A].

1) Contribute to the annotation of whole genome sequences of wheat, barley, and oats in collaboration with the research community along with other national and international small grains genomics efforts.
2) Incorporate genomic sequences and maps (genetic, physical, trait) into GrainGenes. This includes integrating maps from multiple sources and related data sets already represented within GrainGenes.
3) Integrate genotyping and phenotyping data into GrainGenes. This includes collaborating with GRIN, Gramene, and the Triticeae T-CAP project.
4) Modify the GrainGenes web site with enhanced user tools for accessing data, implement tools and interfaces for enhanced map displays, and modify the GrainGenes database schema to accommodate larger data sets. This includes a complete rewrite and redesign of the GrainGenes web site and databases.
5) Enhance research community outreach through regular solicitation of user community input, development of social media tools for data access and user training, and development of formal training manuals for GrainGenes users.

Progress Report
First and foremost, the GrainGenes project is addressing recommendations by the liaison committee project review (June 2014) to incorporate the latest research publications of most utility to the small grains community. Over the course of the year, to meet Objective 1, mapping data have been added for wheat, barley, oats, rye and Leymus.
For wheat, data from a significant 90K Single Nucleotide Polymorphism (SNP) marker project were added to build bridges between mapping populations and the International Wheat Genome Sequencing Consortium (IWGSC) reference maps. Probe data and maps were made available with links to the scaffolds provided by IWGSC. In parallel, mapping studies concentrating on rust resistance loci were added to GrainGenes. These resources include markers and map locations for genes essential for addressing the international threat to food security posed by the Ug99 fungus. In addition to the reference hexaploid wheat T. aestivum or ABD genome provided by IWGSC, data from the sequencing and mapping of a progenitor wheat species related to the ancestor of the D-genome of wheat were added; this data set contained 6732 molecular markers arrayed on the physical map.
For barley, data have been added from a reference map that is a consensus of detailed genetic maps published in 2012. This map is being used to close the gaps in the barley genome sequence. As a start, the GrainGenes project has prepared links and maps for the 15,718 reference genes. Work is in progress to connect genetic and physical maps to BAC clones from the cultivars ‘Morex’, ‘Barke’, and ‘Bowman’. Mapping data for barley yield QTLs were also added to the database.
For oat, data from the molecular markers used to build the first physically anchored hexaploid oat map were added. These markers were discovered in the six mapping populations that were used to build the first complete genetic reference maps. Also added to GrainGenes were data from a mapping study that employed probes from a 6000-bead microarray. The new array is the platform the oat community is using to build new maps with additional populations from crosses between breeding lines currently used for crop improvement.
Project scientists added genetic maps published in 2009 for rye and in 2012 for Leymus spp. to GrainGenes. These can serve as references for now and as templates for soon-to-be available genetic maps featuring the newer molecular markers being generated with next-generation sequencing technology.
To meet Sub-objectives 1.C and 2.C, modifications were made to the website. GrainGenes 3.0 utilizes a content management system (CMS) based on Drupal that expedites some of the routine administrative tasks such as updates of events and news. The CMS is now built around the relational database structure currently underlying GrainGenes. Tests are underway to utilize and improve upon relational and visualization platforms developed within the Generic Model Organism Database (GMOD) initiative (CMap, Chado, Tripal, and JBrowse), and apply them to the GrainGenes environment.
To meet Sub-objectives 2.A and 2.B, one of the GrainGenes curators participated in a week-long workshop, as well as weekly web-conferences, on the Plant Breeding Application Programming Interface (API). This project is building a shared public API to databases worldwide that will enable exchange of data related to crop breeding. Besides making GrainGenes and the National Institute of Food and Agriculture (NIFA)-funded Triticeae Toolbox (T3) data available to external applications, the API will also provide a mechanism for data transfer between GrainGenes and T3. To meet Objectives 2 and 3, a programmer was hired to help build links between the GrainGenes database and the T3 database. The latter houses germplasm, genotype and phenotype data collected by wheat, barley and oat breeders.
To meet Objective 5, several newly developed tools for analyses of genomic and genetic data are now hosted at the GrainGenes website. New tools developed by the project are NetVenn, which uses protein sequence data to facilitate genome-wide comparison of orthologous clusters across multiple species; Arabidopsis Interactome Module (AIM), a plant protein interaction database that can provide valuable insights into the function of a protein of interest; and OrthoVenn, an interactive Web application for evolutionary and functional comparisons of genes from different plant species. Also added to GrainGenes is WheatExp, a tool developed by a stakeholder group at University of California, Davis, for analyzing a wheat transcriptome/expression database. Such tools will allow researchers to discriminate among individual genes and orthologues in studies of the expression of genes that underlie traits.
Over the course of the last year, the GrainGenes website, database, and all other hosted websites were moved to a new hardware platform. The new hardware provides a substantial improvement in processing power, storage capacity, and memory. This permits data to be accessed more quickly and allows for larger, more complex data sets to be hosted. The new hardware will also have better reliability and a reduced maintenance footprint.
GrainGenes maintains a close relationship with the U.S. Wheat and Barley Scab Initiative (USWBSI), providing the server hardware that hosts the USWBSI website and associated web services. The two projects share a full-time systems administrator who provides programming expertise and monitors network and system security. This year, a secondary site associated with the USWBSI was redesigned to use the Drupal content management system employed by the GrainGenes and USWBSI websites.
This year, there have been both physical and administrative changes that have had significant impacts on the GrainGenes project. It has become part of a larger, newly amalgamated unit, Crop Improvement and Genetics (CIG), with a new Research Leader. The computers that support GrainGenes were physically relocated to a remodeled room with improved environmental stability. A project scientist participated in the design of the new location. Project scientists have also been participating in the Big Data initiative; two are members of the Big Data Implementation Team that meets by conference call every other week. They are serving as liaisons and consultants as the Albany location is outfitted to serve as one of the nodes that will provide Internet2 connectivity to the Pacific West Area (PWA). Information Technology (IT) security claims an increasing share of project members’ time, as each potential security breach must be investigated and must often be addressed with improved protection systems.
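The comparison of orthologous clusters across species that NetVenn and OrthoVenn are described as supporting can be pictured with a small sketch. This is an illustration only, not the implementation of either tool. It assumes the clusters have already been formed by some upstream sequence-similarity step, which is not shown, and the cluster identifiers, gene identifiers, and the function name `venn_regions` are hypothetical. The sketch counts how many clusters fall into each Venn region, that is, each exact combination of species.

```python
from collections import Counter


def venn_regions(clusters):
    """Count clusters per exact species combination (one Venn region each).

    `clusters` maps a cluster id to a list of (species, gene_id) pairs.
    This input shape is an assumption made for the illustration.
    """
    regions = Counter()
    for members in clusters.values():
        # The region a cluster belongs to is the set of species it contains.
        species = frozenset(sp for sp, _gene in members)
        regions[species] += 1
    return regions


# Hypothetical toy data: five clusters over three small grains species.
clusters = {
    "c1": [("wheat", "w1"), ("barley", "b1"), ("oat", "o1")],
    "c2": [("wheat", "w2"), ("barley", "b2")],
    "c3": [("wheat", "w3"), ("wheat", "w4")],  # wheat-only cluster
    "c4": [("barley", "b3"), ("oat", "o2")],
    "c5": [("wheat", "w5"), ("barley", "b4"), ("oat", "o3")],
}

regions = venn_regions(clusters)

# Print regions from the most widely shared to the species-specific ones.
for species, count in sorted(
    regions.items(), key=lambda kv: (-len(kv[0]), sorted(kv[0]))
):
    print(", ".join(sorted(species)), count)
```

Clusters present in every species suggest conserved gene families, while single-species clusters point to lineage-specific genes; a Venn diagram is a drawing of these counts.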

1. GrainGenes database gets a new web interface. GrainGenes is a publicly accessible database that houses information and data for researchers seeking to improve wheat, barley, rye and oats. This year ARS scientists in Albany, California, developed and installed a new web interface, GrainGenes 3.0. The upgrade entailed improvements in the webpage display and, behind the scenes, a new content management system (Drupal) that provides built-in services for curating the website pages. In particular, the Drupal online editor and automatic expiration features are reducing the workload of maintaining the most active pages, such as Job Listings and Calendar. New datasets loaded into the database are highlighted in the GrainGenes Updates scrolling sidebar. The changes make it easy to keep GrainGenes current and visually interesting for its users worldwide.

2. New web resources for oat researchers and their stakeholders. Internet-based informational resources are needed to facilitate exchange of news and progress among oat researchers and stakeholders. To meet these needs, ARS scientists in Albany, California, and Ithaca, New York, collaborated to create two new websites. One, called "T3/Oat", serves oat breeders by bringing together genetic mapping and field trial research results under one umbrella. The other, "Oat Global", was created to aggregate information sources useful to the entire oat community within a content management system housed in Albany, California. Both websites provide forums for communication. The Oat Global website promotes interactions and discussions among scientists, farmers, stakeholders in the food industry, and consumers.

Review Publications
Tinker, N.A., Chao, S., Lazo, G.R., Oliver, R.E., Huang, Y.-F., Poland, J.A., Jellen, E.N., Maughan, P.J., Kilian, A., Jackson, E.W. 2014. A SNP genotyping array for hexaploid oat. The Plant Genome. 7(3). doi: 10.3835/plantgenome2014.03.0010
Wang, Y., Coleman-Derr, D.A., Chen, G., Gu, Y.Q. 2015. OrthoVenn: a web server for genome wide comparison and annotation of orthologous clusters across multiple species. Nucleic Acids Research. 43(W1):W78-W84. doi: 10.1093/nar/gkv487.
Wang, Y., Thilmony, R.L., Zhao, Y., Chen, G., Gu, Y.Q. 2014. AIM: A comprehensive Arabidopsis Interactome Module database and related interologs in plants. Database: The Journal of Biological Databases and Curation. doi: 10.1093/database/bau117.