Location: Corn Insects and Crop Genetics Research
Title: The SoyBase, LegumeInfo, and PeanutBase databases in support of legume research and crop improvement
Authors:
Campbell, Jacqueline
CAMERON, CONNOR - National Center For Genome Resources
CLEARY, ALAN - National Center For Genome Resources
DASH, SUDHANSU - National Center For Genome Resources
LAVELLE, EVAN - National Center For Genome Resources
FARMER, ANDREW - National Center For Genome Resources
Huang, Wei
NOVAK, SIMON - Michigan Technological University
PROM, CHEN - Collaborator
WEEKS, NATHAN - Harvard University
Cannon, Steven
Nelson, Rex
Submitted to: Genetics
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 11/20/2025
Publication Date: N/A
Citation: N/A
Interpretive Summary: This paper describes a collection of websites with associated genomic databases: SoyBase (https://soybase.org), Legume Information System (https://legumeinfo.org), and PeanutBase (https://peanutbase.org). These websites support breeding and research work in the legume crop plant family by organizing and linking published legume genomic research data. Legume crops are critical to U.S. agriculture and the economy, and the websites support plant breeding in soybean and other legume crops that supply products including livestock feed, oilseeds, and food in U.S. grocery stores. This paper describes a genomic Data Store that allows bulk data access by breeders and researchers and by Artificial Intelligence (AI) models. The paper also describes the architecture for these websites, which has been designed for rapid, modular, flexible development that is well suited to genomic data and to rapid incorporation of new data. This architecture allows both code sharing and customization to serve the unique needs of each research community. Showcasing this website architecture, the paper describes a set of online tools, some shared across all three sites and some customized for the respective research communities.
Technical Abstract: Here, we describe a collection of genomic database portals, SoyBase (https://soybase.org), Legume Information System (https://legumeinfo.org), and PeanutBase (https://peanutbase.org), that support breeding and research work in the legume plant family. The legume family includes important crops such as soybean, peanut, common bean, lentil, and chickpea, as well as approximately 20,000 other species that are important in all terrestrial ecosystems.
Beyond the value of the portals for species in this large clade (as well as for plant biology more generally), the database and site architecture of these portals will be of interest to developers of similar genomic sites, as the data management and software solutions are generic and should be applicable to a wide variety of organisms. The architecture for these sites has been designed for rapid, modular, flexible development well suited to genomic data and to rapid incorporation of new data. Website content is handled with a static site generator (Jekyll). Interactive applications are developed using JavaScript encapsulated as Web Components that access back-end data via APIs for stability and flexibility. This architecture allows for both code portability and for customization to serve the unique needs of each research community.
