Lightfoot, David - SOUTHERN ILLINOIS UNIV
Submitted to: Biomed Central (BMC) Genomics
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: December 12, 2006
Publication Date: January 8, 2007
Citation: Shultz, J.L., Ray, J.D., Lightfoot, D.A. 2007. A Sequence Based Synteny Map Between Soybean and Arabidopsis thaliana. BMC Genomics. DOI:10.1186/1471-2164-8-8
Interpretive Summary: Molecular markers are small DNA segments used to tag specific useful genes so they can be more easily incorporated into new soybean varieties. However, there are often not enough molecular markers to identify the desired genes. One limitation in identifying new genes is that the DNA sequence of soybean has not been completed. It has, however, been completed for Arabidopsis, which is frequently used as a model plant DNA system. In this study, Arabidopsis was used as a framework to align soybean gene sequences and identify a specific type of molecular marker. A set of computer programs was written and used to compare known gene sequences from soybean with those from Arabidopsis. The resulting comparison allows easy visual identification of similar genes in the two plants. The soybean genes are placed at an exact location in reference to those in Arabidopsis, allowing identification and alignment of similar soybean DNA segments. The alignment of these soybean DNA sequences can be used to identify and design new molecular markers. These new markers can be used to identify beneficial genes related to problems such as disease or environmental stress. All programs and maps created in this study are available on the World Wide Web.
Technical Abstract: In an effort to identify conserved sequences between soybean (Glycine max (L.) Merr.) and the model organism Arabidopsis thaliana (thale cress), a series of Java-based programs was created that processed and compared 341,619 soybean DNA sequences against A. thaliana chromosomal DNA. A. thaliana DNA was probed for short, exact matches (15 bp) to each soybean sequence, then checked for the number of additional 7 bp matches in the adjacent 400 bp region. The positions of these matches were used to order soybean sequences in relation to the A. thaliana genome. Reported associations between soybean sequences and A. thaliana were within a 95% confidence interval of e-30 – e-100. In addition, the clustering of soybean expressed sequence tags (ESTs) based on A. thaliana sequence was accurate enough to identify potential single nucleotide polymorphisms (SNPs) within the soybean sequence clusters. A synteny map of soybean and A. thaliana, built from EST, bacterial artificial chromosome (BAC) end, and marker amplicon sequences, is presented. In addition, all Java programs used to create this map are available upon request and on the Web.
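The two-stage matching described in the abstract (an exact 15 bp match, followed by a count of additional 7 bp matches in the adjacent 400 bp region) might be sketched roughly as follows. This is an illustrative sketch only, not the authors' programs. The class name `SeedMatchSketch`, both method names, the choice to take the seed from the first 15 bases of the soybean sequence, and the choice to count distinct 7-mers in the region immediately downstream of the seed are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; names and details are assumptions, not the published code.
class SeedMatchSketch {
    static final int SEED_LEN = 15; // exact-match length stated in the abstract
    static final int SUB_LEN = 7;   // secondary match length stated in the abstract
    static final int WINDOW = 400;  // adjacent region size stated in the abstract

    // Returns every reference position where the first seedLen bases
    // of the query occur exactly (assumption: the seed is the query prefix).
    static List<Integer> findSeedHits(String query, String reference, int seedLen) {
        List<Integer> hits = new ArrayList<>();
        if (query.length() < seedLen) {
            return hits;
        }
        String seed = query.substring(0, seedLen);
        int pos = reference.indexOf(seed);
        while (pos >= 0) {
            hits.add(pos);
            pos = reference.indexOf(seed, pos + 1);
        }
        return hits;
    }

    // Counts distinct subLen-mers from the rest of the query that also occur
    // in the reference window immediately after the seed hit (assumption:
    // "adjacent" means downstream, and repeated k-mers are counted once).
    static int countSupportingMatches(String query, String reference, int hitPos,
                                      int seedLen, int subLen, int window) {
        int start = Math.min(reference.length(), hitPos + seedLen);
        int end = Math.min(reference.length(), start + window);

        Set<String> windowKmers = new HashSet<>();
        for (int i = start; i + subLen <= end; i++) {
            windowKmers.add(reference.substring(i, i + subLen));
        }

        Set<String> matched = new HashSet<>();
        for (int i = seedLen; i + subLen <= query.length(); i++) {
            String kmer = query.substring(i, i + subLen);
            if (windowKmers.contains(kmer)) {
                matched.add(kmer);
            }
        }
        return matched.size();
    }

    public static void main(String[] args) {
        // Toy sequences, invented for illustration only.
        String query = "ACGTACGTTAGCCGAGATTACAT";
        String reference = "TTTTACGTACGTTAGCCGACCGATTACAGG";

        for (int hit : findSeedHits(query, reference, SEED_LEN)) {
            int support = countSupportingMatches(
                    query, reference, hit, SEED_LEN, SUB_LEN, WINDOW);
            System.out.println("seed hit at " + hit + ", supporting 7-mers: " + support);
        }
    }
}
```

In a sketch like this, the seed position gives the coordinate used to order a soybean sequence along the A. thaliana chromosome, and the supporting count serves as a crude measure of how well the surrounding region agrees. How the original programs scored or filtered hits is not stated in the abstract.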