Van Der Hoeven, Rutger - CORNELL UNIVERSITY
Ronning, Cathy - INST FOR GENOME RES
Martin, Greg - CORNELL UNIVERSITY
Tanksley, Steve - CORNELL UNIVERSITY
Submitted to: The Plant Cell
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: July 10, 2002
Publication Date: September 1, 2002
Citation: VAN DER HOEVEN, R., RONNING, C., GIOVANNONI, J.J., MARTIN, G., TANKSLEY, S. DEDUCTIONS ABOUT THE NUMBER, ORGANIZATION AND EVOLUTION OF GENES IN THE TOMATO GENOME BASED ON ANALYSIS OF LARGE EST COLLECTION AND SELECTIVE GENOMIC SEQUENCING. THE PLANT CELL. 2002. V. 14. P. 1441-1456.
Interpretive Summary: We have generated a database for tomato comprising more than 120,000 ESTs. In addition, BAC clones corresponding to six selected regions of the tomato genome were sequenced. In this report we describe the analysis of both the tomato EST database and the BAC sequences. Computational comparisons are made against the Arabidopsis genomic sequence and a similar high-density EST database from another dicot species, Medicago truncatula. These analyses allowed us to address several questions: the content, number and organization of genes in the tomato genome, and the degree to which genes have diverged since tomato, Arabidopsis and M. truncatula split from their last common ancestor.
Technical Abstract: Analysis of a collection of 120,892 single-pass ESTs, derived from 26 different tomato cDNA libraries and reduced to a set of 27,274 unique consensus sequences (unigenes), reveals that 70% of the unigenes have identifiable homologs in the Arabidopsis genome. Many of the most highly conserved multigene families share similar copy numbers between tomato and Arabidopsis, suggesting that the multiplicity of these families may have arisen prior to the divergence of these two species. Finally, six BAC clones from different parts of the tomato genome were sequenced and annotated. The combined analysis of the EST database and these six sequenced BACs leads to the prediction that the tomato genome encodes approximately 35,000 genes.