
Research Project: SoyBase and the Legume Information System - Information Infrastructure and Research for Legume Crop Improvement

Location: Corn Insects and Crop Genetics Research

Title: The SoyBase, LegumeInfo, and PeanutBase databases in support of legume research and crop improvement

Author
Campbell, Jacqueline
CAMERON, CONNOR - National Center For Genome Resources
CLEARY, ALAN - National Center For Genome Resources
DASH, SUDHANSU - National Center For Genome Resources
LAVELLE, EVAN - National Center For Genome Resources
FARMER, ANDREW - National Center For Genome Resources
Huang, Wei
NOVAK, SIMON - Michigan Technological University
PROM, CHEN - Collaborator
WEEKS, NATHAN - Harvard University
Cannon, Steven
Nelson, Rex

Submitted to: Genetics
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 11/20/2025
Publication Date: N/A
Citation: N/A

Interpretive Summary: This paper describes a collection of websites with associated genomic databases: SoyBase (https://soybase.org), the Legume Information System (https://legumeinfo.org), and PeanutBase (https://peanutbase.org). These websites support breeding and research in the legume plant family by organizing and linking published legume genomic research data. Legume crops are critical to U.S. agriculture and the economy, and the websites support plant breeding in soybean and other legume crops that supply products including livestock feed, oilseeds, and food sold in U.S. grocery stores. The paper describes a genomic Data Store that provides bulk data access for breeders and researchers and for Artificial Intelligence (AI) models. It also describes the architecture of these websites, which is designed for rapid, modular, flexible development, is well suited to genomic data, and allows new data to be added quickly. This architecture allows both code sharing and customization to serve the unique needs of each research community. To showcase the architecture, the paper describes a set of online tools, some shared across all three sites and some customized for the respective research communities.

Technical Abstract: Here, we describe a collection of genomic database portals, SoyBase (https://soybase.org), the Legume Information System (https://legumeinfo.org), and PeanutBase (https://peanutbase.org), that support breeding and research in the legume plant family. The legume family includes important crops such as soybean, peanut, common bean, lentil, and chickpea, as well as approximately 20,000 other species that are important in all terrestrial ecosystems. Beyond the value of the portals for species in this large clade (and for plant biology more generally), their database and site architecture will be of interest to developers of similar genomic sites, because the data management and software solutions are generic and should be applicable to a wide variety of organisms. The architecture has been designed for rapid, modular, flexible development that is well suited to genomic data and to the rapid incorporation of new data. Website content is handled with a static site generator (Jekyll). Interactive applications are developed in JavaScript, encapsulated as Web Components that access back-end data via APIs for stability and flexibility. This architecture allows both code portability and customization to serve the unique needs of each research community.
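
As an illustration of the general pattern named in the abstract (an interactive tool packaged as a Web Component that reads back-end data through an API), the following is a minimal TypeScript sketch. It is not taken from the SoyBase, LegumeInfo, or PeanutBase code. The `gene-list` element name, the `api-base` and `query` attributes, the `GeneRecord` shape, and the assumption that the endpoint returns a JSON array are all hypothetical. The data-access and rendering logic is kept in plain functions so that it can run outside a browser, and the custom element is registered only where the browser APIs exist.

```typescript
// Hypothetical record shape; a real API response may look different.
interface GeneRecord {
  id: string;
  description: string;
}

// Minimal fetch-like signature, so the sketch does not depend on browser typings.
type Fetcher = (url: string) => Promise<{ json(): Promise<unknown> }>;

// Build a query URL with sorted, percent-encoded parameters.
function buildQueryUrl(apiBase: string, params: Record<string, string>): string {
  const query = Object.keys(params)
    .sort()
    .map((k) => encodeURIComponent(k) + "=" + encodeURIComponent(String(params[k])))
    .join("&");
  return query ? apiBase + "?" + query : apiBase;
}

// Escape text before placing it into HTML.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Render records as an HTML list (pure function, no DOM required).
function renderGeneList(genes: GeneRecord[]): string {
  if (genes.length === 0) return "<p>No genes found.</p>";
  const items = genes.map(
    (g) => "<li><b>" + escapeHtml(g.id) + "</b>: " + escapeHtml(g.description) + "</li>"
  );
  return "<ul>" + items.join("") + "</ul>";
}

// Fetch records; anything other than a JSON array is treated as "no results".
function loadGenes(fetcher: Fetcher, url: string): Promise<GeneRecord[]> {
  return fetcher(url)
    .then((response) => response.json())
    .then((data) => (Array.isArray(data) ? (data as GeneRecord[]) : []));
}

// Register the custom element only when browser APIs are available.
const env = globalThis as any;
if (env.HTMLElement && env.customElements) {
  class GeneListElement extends env.HTMLElement {
    connectedCallback(): void {
      const url = buildQueryUrl(this.getAttribute("api-base") ?? "", {
        q: this.getAttribute("query") ?? "",
      });
      loadGenes((u: string) => env.fetch(u), url).then((genes: GeneRecord[]) => {
        this.innerHTML = renderGeneList(genes);
      });
    }
  }
  env.customElements.define("gene-list", GeneListElement);
}
```

On a statically generated page, such an element could be embedded as `<gene-list api-base="..." query="..."></gene-list>`. Because the component reaches the back end only through the URL it is given, the same code could in principle be reused on several sites, each pointing it at its own API.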