United States Department of Agriculture

Agricultural Research Service

Research Project: CURATION AND DEVELOPMENT OF THE SOYBEAN BREEDER'S TOOLBOX AND ITS INTEGRATION WITH OTHER PLANT GENOME DATABASES

Location: Corn Insects and Crop Genetics Research

2009 Annual Report


1a. Objectives (from AD-416)
Objective 1: Implement web-accessible computational and visualization tools, including semantic web technologies, to enable comparison and transfer of agronomically important genetic information among soybean and other legume and related dicot species. Objective 2: Continue to curate and enhance SoyBase and the Soybean Breeder’s Toolbox (SBT), more fully integrating the genetic, phenotypic, physical map, and whole-genome sequence data from soybean and other legumes. Objective 3: Coordinate the quality assembly and annotation of the soybean whole-genome sequence.


1b. Approach (from AD-416)
Soybean ontologies will be prepared to describe selected data types from the Soybean Breeder’s Toolbox (SBT). Data exchange descriptions (“RDF graphs”) will be developed to allow integration of the data into the Virtual Plant Information Network (VPIN). Web services will be developed so that researchers can transparently find, retrieve, and apply analytical methods to data contained in the SBT through a single portal. SoyBase and the SBT will be maintained and updated with new data classes as needed. The Williams 82 physical map and the soybean whole-genome sequence, new sequence-based data types in SoyBase, and comparative data from other legumes will be integrated and displayed. The project works closely with DOE-JGI to enhance the quality of the soybean whole-genome sequence assembly. This work will include analysis of sequence-based genetic markers, comparative analyses with other genomes, and various informatic analyses.
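As an illustration of the “RDF graph” idea mentioned above, the following minimal Python sketch represents data as subject-predicate-object statements and queries them by predicate. It is a sketch under stated assumptions, not the project's implementation: the `http://example.org/sbt/` namespace, the identifiers, and the predicate names are all hypothetical placeholders and are not taken from SoyBase, the SBT, or the VPIN.

```python
# Illustrative only: the namespace, identifiers, and predicate names below are
# hypothetical placeholders, not real SoyBase, SBT, or VPIN identifiers.
from collections import namedtuple

# One RDF-style statement: subject --predicate--> object.
Triple = namedtuple("Triple", ["subject", "predicate", "obj"])

EX = "http://example.org/sbt/"  # placeholder namespace (assumption)

# A tiny graph describing one made-up QTL record.
graph = {
    Triple(EX + "qtl/example-1", EX + "hasTrait", EX + "trait/seed-protein"),
    Triple(EX + "qtl/example-1", EX + "onLinkageGroup", "A1"),
    Triple(EX + "trait/seed-protein", EX + "label", "seed protein content"),
}


def objects(graph, subject, predicate):
    """Return, sorted, every object linked to `subject` by `predicate`."""
    return sorted(
        t.obj for t in graph if t.subject == subject and t.predicate == predicate
    )


def describe(graph, subject):
    """Return, sorted, all (predicate, object) pairs stated about `subject`."""
    return sorted((t.predicate, t.obj) for t in graph if t.subject == subject)
```

Because every statement has the same three-part shape, a consumer can follow links (here, from the QTL to its trait and then to the trait's label) without knowing the column names of the database that produced them.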


3. Progress Report
The dissemination of information about phenotypic characteristics is often hampered by imprecise descriptions of phenotypic characters using “field” terms, which makes the information difficult for computer programs to discover. Applying a numerical labeling system, ARS scientists at Ames, Iowa, have constructed a terminology to describe the development of soybean plants. This descriptive vocabulary covers the growth and development of both the vegetative and reproductive components of a soybean plant. It comprises approximately 1000 terms in SoyBase, which have been submitted to the Plant Ontology Consortium. Two hundred forty-three soybean trait terms were identified and linked to their nearest synonym in the greater plant trait ontology. Common names for the same phenotype have been collected and compiled into a searchable database, available on the SoyBase website (soybase.org), that allows researchers to identify phenotypic traits using common or field terms. This controlled vocabulary will be important in associating genetically mapped phenotypes (i.e., Quantitative Trait Loci (QTL)) with genes identified in the genome sequence.
Semantic web technologies provide a way of exchanging data based on meaning rather than on the labels attached to the data. Thus, databases that describe similar data in different ways can make their data available to others based on common semantics rather than a common vocabulary. This approach will facilitate the discovery of pertinent data and greatly reduce the time curatorial staff and researchers spend in literature analysis. We have developed and deployed six Simple Semantic Web Architecture and Protocol (SSWAP) semantic web services that allow researchers and programs to systematically discover and transfer all data in the Soybean Breeder’s Toolbox (SBT) QTL class.
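The common-name lookup described above can be pictured as a mapping from field terms to controlled-vocabulary identifiers. The Python sketch below is a minimal illustration under stated assumptions, not the SoyBase implementation: the `EX:` identifiers, the term labels, and the synonyms are hypothetical placeholders chosen for the example.

```python
# Illustrative only: identifiers, labels, and synonyms are hypothetical
# placeholders, not entries from SoyBase or the Plant Ontology.

def normalize(term):
    """Lower-case a term and collapse runs of whitespace to single spaces."""
    return " ".join(term.lower().split())


# Controlled vocabulary: identifier -> preferred label (made-up entries).
CONTROLLED_TERMS = {
    "EX:0001": "pod dehiscence",
    "EX:0002": "plant height",
}

# Common or "field" names -> controlled identifier (made-up entries).
SYNONYMS = {
    "pod shatter": "EX:0001",
    "shattering": "EX:0001",
    "height": "EX:0002",
}


def build_index(controlled_terms, synonyms):
    """Build one lookup table covering preferred labels and their synonyms."""
    index = {normalize(label): term_id for term_id, label in controlled_terms.items()}
    for name, term_id in synonyms.items():
        index[normalize(name)] = term_id
    return index


def resolve(index, query):
    """Return the controlled identifier for a query term, or None if unknown."""
    return index.get(normalize(query))
```

With an index built by `build_index(CONTROLLED_TERMS, SYNONYMS)`, a query such as `resolve(index, "Pod Shatter")` and a query for the preferred label resolve to the same identifier, which is what lets differently worded phenotype descriptions be associated with the same mapped trait.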
The SBT database and display engine have been modified to provide links to the Germplasm Resources Information Network (GRIN) database based on both GRIN accession numbers and germplasm common names. This allows researchers to incorporate GRIN data into the context of SoyBase genetic and genomic information. As the amount of available data has increased, it has been necessary to add new web pages to accommodate it. We developed a two-tier system of navigational tabs that both briefly summarize the contents of each of the major sections in SoyBase and allow rapid movement between them. Access to the soybean genomic sequence permits the visualization of both the soybean physical map and the soybean genomic sequence in the context of the mature soybean genetic map. In response to stakeholder requests, SoyBase displays have been modified through the use of contextual menus to allow a seamless transition between the SBT data and displays of the soybean physical, sequence, and genetic maps, as well as among the map displays themselves. In cooperation with ARS-BARC personnel, we have increased the density of the soybean genetic map by including 1600 single nucleotide polymorphism (SNP) markers identified by BARC personnel. The new genetic markers were positioned in the soybean genome sequence.


4. Accomplishments

Last Modified: 10/16/2017