Research Project: SoyBase and the Legume Clade Database

Location: Corn Insects and Crop Genetics Research

2018 Annual Report

1a. Objectives (from AD-416):
Objective 1: Support stewardship of soybean and other major reference legume genetic, genomic, and phenotypic datasets.
Sub-objective 1.A: Develop and deploy infrastructure to support both the current reference soybean genome sequence, improved versions of that sequence, and new re-sequenced soybean genomes and haplotype data.
Sub-objective 1.B: Develop processes and tools to provide access to soybean gene model structural and functional annotations as these are revised over time.
Sub-objective 1.C: Provide standardized access to reference genome and affiliated sequences for the major crop and model legume species.
Sub-objective 1.D: Curate high-quality soybean datasets created by the community at large. These may include expression, mutant, phenotype, epigenetic, haplotype, small-RNA, QTL, and other data types.
Sub-objective 1.E: Maintain infrastructure to enable acquisition, storage, and community access to major public data sets for various legume species.
Objective 2: Cooperate with other database developers and plant researchers to develop gene and trait ontologies and open, standardized data exchange mechanisms to enhance database interoperability.
Objective 3: Provide community support and research coordination services for the research and breeding communities for soybean and other legumes. Expand outreach activities through workshops, web-based tutorials, and other communications.
Objective 4: Facilitate the use of genomic and genetic data, information, and tools for germplasm improvement, thus empowering ARS scientists and partners to use a new generation of computational tools and resources.

1b. Approach (from AD-416):
Incorporate revised primary reference genome sequence for soybean into SoyBase. House and provide access to genome sequences for other soybean accessions, haplotype data, and related annotations. Incorporate revised gene models and annotations into SoyBase. Install or implement web-based tools for curation and improvement of soybean gene models and gene annotations. Incorporate available legume genome sequences and annotations. Working with collaborators, collect and add genetic map and QTL data for crop legumes. Extend web-based tools for navigation among biological sequence data across the legumes. Extend and develop methods and storage capacity for accepting genomic data sets for soybean and other legume species. Develop a complete set of descriptors (ontologies) for soybean biology (anatomy, traits, and development), and for other significant crop legumes as needed. Work with the relevant ontology communities-of-practice to incorporate these descriptors into broadly accessible ontologies. Develop web tutorials for important typical uses of SoyBase and the Legume Clade Database. Present database features and provide training on them at relevant conferences and workshops. Regularly seek feedback from users about desired features and usability.

3. Progress Report:
The SoyBase and Legume Clade Database project serves plant breeders and other researchers working on the many crops in the legume plant family, such as soybean, peanut, common bean, lentil, and forage crops such as clover and alfalfa. These crops play critical roles in U.S. and global agriculture through direct consumption of pulses, vegetables, and oilseeds, through animal forages, and through processed products for animal and human consumption and industrial uses. The combined annual value of these crops in the U.S. is on the order of $45-50 billion (USDA-NASS). This project provides vital agricultural services by giving researchers access to genetic and genomic data about these species, including genome sequences, gene sequences and functional information, and genetic markers for important traits. The project provides these services through the SoyBase and LegumeInfo (Legume Information System) online databases. Work on SoyBase in the last five years has included continuing curation and incorporation of literature on mapped traits and gene function, as well as work to house and integrate information from several large research projects: (1) improvement of gene predictions for the primary reference genome assembly for soybean; (2) a collection of fast neutron- and transposable element-generated mutant lines used by researchers to determine gene function; (3) genetic information from the SoyNAM Project, which is a multi-institution, multi-year nested association study designed to identify genomic regions of agronomic interest; (4) pedigrees for all of the entries in the Northern and Southern Uniform Testing Trials; (5) several RNA-seq gene atlases, contextualized relative to all predicted genes; (6) an atlas of micro-RNAs (miRNAs) in soybean, which are involved in regulation of gene expression and function; and (7) genome browser displays of methylation patterns in soybean, which also contribute to gene expression and function.
Work on LegumeInfo over the project period has included collaboration on several important crop genomes as well as many improvements to the database and web platform components and tools. Major accomplishments for LegumeInfo in the last five years include: (1) transition to a new genomic web and database framework (Tripal and Chado), in order to better integrate with other genomic databases and make use of a large group of developers for these tools; (2) contributions to the genome assemblies and gene predictions for five species: common bean; peanut and its two closest wild relatives (Arachis duranensis and Arachis ipaensis); and narrow-leafed lupin; (3) incorporation of the genome assemblies and genes for ten species into genome browsers and associated search and display tools: chickpea, adzuki bean, common bean, lupin, mungbean, red clover, cowpea, wild peanut (two species), cultivated peanut; (4) a new tool for visualizing the location, against high-resolution geographic maps, of any legume germplasm in the USDA GRIN-Global collection; (5) new visualizations for gene families in the legumes, to show evolutionary relationships among all sequenced crop legumes; (6) a new tool for exploring the genomic organization of genes from all sequenced crop legumes (useful for determining gene conservation, function and regulation); (7) gene expression atlases for peanut, common bean, and chickpea; (8) incorporation of curated literature for several hundred agricultural traits and associated genetic markers for peanut and common bean.

4. Accomplishments
1. Assembly of peanut genome sequences. Global demand continues to increase for protein-rich, nutrient-dense foods and oil crops. Peanut is unique in being both very nutrient-dense and consumable with minimal or no processing. The farm value of the U.S. crop is in excess of $1 billion annually (USDA NASS). Over the project period, ARS researchers at Ames, Iowa, have worked with other U.S. and international researchers to help assemble the genome sequences of the two closest wild relatives of peanut, and the genome of cultivated peanut itself. These genome sequences and related genetic resources are available at PeanutBase and LegumeInfo. The availability of the genome sequences for peanut will make it possible for researchers to more rapidly breed varieties that have improved yield, disease resistance, and stress tolerance.

2. Publication of the common bean genome sequence, with web access to the bean genome and genes. Common bean is the most important grain legume for human consumption worldwide, and plays an important role in agriculture due to its ability to utilize atmospheric nitrogen as a natural fertilizer. The farm value of the U.S. common bean crop (dry bean and snap bean) is approximately $1 billion annually (USDA NASS). ARS researchers at Ames, Iowa have worked with U.S. and international partners to report the genome sequence of common bean. This research confirms that common bean was independently domesticated in South America and Mesoamerica, and also reports that important traits such as seed size are based on different genetic factors in the two distinct domestications. The South American domestication produced generally larger-seeded, slower-maturing varieties, and the Mesoamerican domestication produced generally smaller-seeded, more drought-tolerant varieties. These results mean that plant breeders have a larger and more diverse collection of genetic material to draw from as they develop new bean varieties to address challenges in global agriculture.

3. First published description of the genome sequences of the two wild ancestors of cultivated peanut. Peanut is very important in human nutrition, providing a calorie-dense, versatile, high-protein food source - one that is especially unusual in that it is palatable without cooking or preparation. ARS researchers from Ames, Iowa, Tifton and Athens, Georgia, and Stoneville, Mississippi, participated in an international consortium of researchers to sequence and analyze the genomes of the two closest wild ancestors of cultivated peanut. Those ancestors merged to form a new species which was domesticated to become modern cultivated peanut. An important finding of this research is that the unusual hybridization of these two species was likely the direct result of early agriculturalists in South America. The genome sequences from these wild species thus comprise essentially all of the genetic material from the modern cultivated peanut. This research will be used by plant researchers and breeders to more efficiently select improved peanut varieties, and to speed development of varieties that are well suited for growing in various regions of the world. The genome sequence has already been useful in helping identify mechanisms for resistance to root-knot nematodes and rust (a fungal disease), which are serious challenges for many peanut farmers.

4. Genetic characterization of a collection of breeding lines for Apios americana. A North American bean relative, Apios americana (also called “potato bean” or “ground nut”) was once a staple crop of Native Americans and has economic potential, once agronomic improvements are made. This plant produces high-protein, potato-like tubers, which grow along underground stolons. ARS researchers in Ames, Iowa, describe a large set of gene sequences for this plant, as well as a set of several thousand genetic markers that can be used for crop improvement in Apios. This research describes associations between particular genetic markers and valuable plant traits in Apios, including the size of the edible tubers. This information can be used to speed varietal improvement in this promising but under-utilized native North American species. This research is being used by breeders who are working to develop Apios into a new crop – one that is already well adapted to environments throughout the eastern half of North America, with several valuable characteristics, including a good nutritional profile and tolerance of flooding and wet soils, as well as tolerance to intermittent drought. The potential impacts include a more robust, diverse food production system in the U.S. and worldwide.

5. Characterization of a group of genes that help soybean and other plants respond to drought and salinity. Drought and salinity are major concerns for farmers throughout the world. Salinity often comes with irrigation, so it occurs in many arid parts of the world. ARS researchers in Ames, Iowa, characterized a family of related genes in soybean, and identified several of these that are particularly active during soybean response to salt and desiccation stress. The corresponding genes in other species, including rice and the model plant Arabidopsis thaliana, have also been shown to help those plants respond to salt and drought. These genes are candidates for enhancement in soybean and other crops, in particular as the basis for the design of molecular markers for breeding stress-resistant varieties. One of the key genes identified in this study has also been identified in peanut and has been used as a marker for selecting drought-tolerant lines in that crop.

Review Publications
Singh, J., Kalberer, S.R., Belamkar, V., Assefa, T., Nelson, M.N., Farmer, A.D., Blackmon, W.J., Cannon, S.B. 2017. A transcriptome-SNP-derived linkage map of Apios americana (potato bean) provides insights about genome re-organization and synteny conservation in the phaseoloid legumes. Theoretical and Applied Genetics. 131(2):333-351.