
Research Project: SoyBase and the Legume Clade Database

Location: Corn Insects and Crop Genetics Research

2017 Annual Report

1a. Objectives (from AD-416):
Objective 1: Support stewardship of soybean and other major reference legume genetic, genomic, and phenotypic datasets.
Sub-objective 1.A: Develop and deploy infrastructure to support both the current reference soybean genome sequence, improved versions of that sequence, and new re-sequenced soybean genomes and haplotype data.
Sub-objective 1.B: Develop processes and tools to provide access to soybean gene model structural and functional annotations as these are revised over time.
Sub-objective 1.C: Provide standardized access to reference genome and affiliated sequences for the major crop and model legume species.
Sub-objective 1.D: Curate high-quality soybean datasets created by the community at large. These may include expression, mutant, phenotype, epigenetic, haplotype, small-RNA, QTL, and other data types.
Sub-objective 1.E: Maintain infrastructure to enable acquisition, storage, and community access to major public data sets for various legume species.
Objective 2: Cooperate with other database developers and plant researchers to develop gene and trait ontologies and open, standardized data exchange mechanisms to enhance database interoperability.
Objective 3: Provide community support and research coordination services for the research and breeding communities for soybean and other legumes. Expand outreach activities through workshops, web-based tutorials, and other communications.
Objective 4: Facilitate the use of genomic and genetic data, information, and tools for germplasm improvement, thus empowering ARS scientists and partners to use a new generation of computational tools and resources.

1b. Approach (from AD-416):
Incorporate revised primary reference genome sequence for soybean into SoyBase. House and provide access to genome sequences for other soybean accessions, haplotype data, and related annotations. Incorporate revised gene models and annotations into SoyBase. Install or implement web-based tools for curation and improvement of soybean gene models and gene annotations. Incorporate available legume genome sequences and annotations. Working with collaborators, collect and add genetic map and QTL data for crop legumes. Extend web-based tools for navigation among biological sequence data across the legumes. Extend and develop methods and storage capacity for accepting genomic data sets for soybean and other legume species. Develop a complete set of descriptors (ontologies) for soybean biology (anatomy, traits, and development), and for other significant crop legumes as needed. Work with the relevant ontology communities-of-practice to incorporate these descriptors into broadly accessible ontologies. Develop web tutorials for important typical uses of SoyBase and the Legume Clade Database. Present features and provide training at relevant conferences and workshops. Regularly seek feedback from users about desired features and usability.

3. Progress Report:
Soybean genomics and SoyBase. The U.S. soybean crop, valued in excess of $35 billion (USDA-NASS), depends on continued breeding improvements in order to achieve yield gains and avoid losses due to pathogens and environmental stresses. The USDA-ARS soybean genetics database, SoyBase, provides access to the complete genome sequence for soybean, as well as to predicted genes, markers, valuable traits and their locations, and many other genetic features. The SoyBase database continues to be actively extended through the addition of publications that describe the locations of traits, genes, and features of interest. Updated data for transposable element-generated mutants were added to SoyBase. The user-friendly search page in SoyBase for USDA Germplasm Resources Information Network (GRIN) data and other community-submitted data was substantially upgraded; search results are presented in the context of other data in SoyBase. Additional gene expression data sets were added to SoyBase and to the interface that allows users to select data sets to view by tissue, cultivar, and experimental conditions. Genome methylation data sets were added to SoyBase, and a user interface was developed that allows users to select data based on experiment details. Pedigrees were added for all of the cultivars newly included in the Soybean Uniform Trials and for all newly registered soybean cultivars. SoyBase and Legume Information System (LIS) staff participated in the AgBioData Working Group on large data sets, metadata, and data retention. They also participated in the National Agricultural Library working group on web hosting and large-scale data storage. The SoyBase video tutorial page has been updated with additional and improved tutorials. Poster and oral presentations about SoyBase were made at the Plant and Animal Genome meeting, the Soybean Precision Genomics and Mutant Finder Workshop, and the Soybean Breeders Workshop.
A SoyBase tutorial was held at the Soybean Breeders Workshop and the 2017 World Soybean Research Conference.
In the past year we have worked with international collaborators to assemble and analyze the genome sequences of narrow-leafed lupin (cultivated as a high-protein seed crop, much like soybean), and have added the genome sequences for red clover to the LIS. We have also continued improving other aspects of LIS and PeanutBase for improved user experience and data-handling capacity. These Web resources now provide access to the genome and gene sequences of eleven legume species: common bean, pigeonpea, soybean (via SoyBase), chickpea, Medicago truncatula and Lotus japonicus (two forage and model research species), Arachis duranensis and Arachis ipaensis (two wild relatives of peanut), red clover, mungbean, and adzuki bean. LIS also has a new viewer for interactively displaying wild peanut species and accessions on a geographical map. These resources will enable plant breeders and researchers to more rapidly develop new crop varieties with favorable yield, disease resistance, or stress tolerance characteristics. Work on LIS in the past year has included: improved viewers for genes in gene families; an improved viewer and interface for exploring conserved genomic regions between sequenced legume genomes; new search capabilities for features such as plant traits, genetic markers, and genes; and a new viewer for plant accessions in the USDA-ARS GRIN-Global database, visualized on an interactive geographic map background. Work on PeanutBase in the last year has included: new genetic trait and marker information, a viewer for a 22-tissue gene expression atlas, an interactive tour for genetic marker and trait information, and the addition of several thousand genetic markers and associated information.
Both the PeanutBase and LIS projects have also included substantial outreach to other software and database developers by sharing data-collection templates and software modules that can be used in other contexts, such as modules for sequence search and display, and viewers for visualizing evolutionary relationships among related genes from different species. In the past year, this outreach has increased through the project (NSF-funded, with ARS participation).

4. Accomplishments
1. Genetic characterization of a collection of breeding lines for Apios americana. A North American bean relative, Apios americana (also called “potato bean” or “ground nut”), was once a staple crop of Native Americans and has economic potential once agronomic improvements are made. This plant produces high-protein, potato-like tubers, which grow along underground stolons. ARS researchers in Ames, Iowa, described a large set of gene sequences for this plant, as well as a set of several thousand genetic markers that can be used for crop improvement in Apios. This research describes associations between particular genetic markers and valuable plant traits in Apios, including the size of the edible tubers. These genetic markers can be used to speed varietal improvement in this promising but under-utilized native North American species.

Review Publications
Martin, K., Jugpreet, S., Hill, J.H., Whitham, S., Cannon, S.B. 2016. Dynamic transcriptome profiling of Bean Common Mosaic Virus (BCMV) infection in Common Bean (Phaseolus vulgaris L.). Biomed Central (BMC) Genomics. 17:613. doi:10.1186/s12864-016-2976-8.

Assefa, T., Rao, I.M., Cannon, S.B., Wu, J., Gutema, Z., Blair, M., Otyama, P., Alemayehu, F., Dagne, B. 2017. Improving adaptation to drought stress in white pea bean (Phaseolus vulgaris L.): genotypic effects on grain yield, yield components and pod harvest index. Plant Breeding. 136(4):548-561. doi:10.1111/pbr.12496.

Belamkar, V., Farmer, A.D., Weeks, N.T., Kalberer, S.R., Blackmon, W.J., Cannon, S.B. 2016. Genomics-assisted characterization of a breeding collection of Apios americana, an edible tuberous legume. Nature Scientific Reports. 6:34908. doi:10.1038/srep34908.