Publication #202844

Title: INCREASING THE DIVERSITY OF EST SEQUENCES FOR FRAGARIA

Author
Slovin, Janet
Rabinowicz, Pablo - TIGR

Submitted to: Plant and Animal Genome Conference
Publication Type: Abstract Only
Publication Acceptance Date: 11/7/2006
Publication Date: N/A
Citation: N/A

Interpretive Summary:

Technical Abstract: This project aims to substantially increase the amount of strawberry expressed sequence tag (EST) data available to the community, and to increase the diversity of EST sequences for the family Rosaceae. Currently, there are approximately 19,000 Fragaria ESTs in GenBank, 50% of which have been generated by this project. The project’s goal is to produce 35,000-40,000 high-quality EST sequences from 5 different cDNA libraries from in vitro- or greenhouse-grown plants treated with cold, heat, salt, drought, and a combination of salt and heat. The use of these libraries will increase the chances of capturing stress-induced genes, which are under-represented among the Rosaceae EST sequences currently available in the public databases. We are using a diploid species of strawberry, Fragaria vesca, a useful model system for the family Rosaceae. It is a small plant with a small genome (164 Mbp) and a seed-to-seed cycle of 3.5-4 months, and inbred lines are available. The chosen genotype, PI551574 (Hawaii 4), is day-neutral, produces runners and abundant seed, and is easily transformed. The first cDNA library (from cold-stressed tissue) has been constructed and 9,600 high-quality sequences produced. An assembly (~2,000 contigs and ~6,000 singletons) of all F. vesca ESTs and other cDNA sequences in GenBank is now available at http://plantta.tigr.org. TIGR plant transcript assemblies (TAs) are built for individual species (NCBI taxon ID) and exclude predicted transcript sequences from genome sequencing projects. They are freely available to the community through user-friendly web interfaces. Each TA entry shows the component reads of the assembly and their orientation, as well as its sequence and annotation. Users can search the TAs by accession number or annotation keywords. A BLAST server that allows selecting one or more taxa and an FTP server for data download are also available through the web pages. A TA has been constructed for F. x ananassa, with 350 contigs and 4,800 singletons, as well as for several other members of the Rosaceae. A special TA generated for our 9,600 F. vesca cold-treated seedling ESTs contains a total of 5,800 assemblies and singletons. Sequences were aligned to Arabidopsis proteins (TAIR) using BLASTX with a cutoff E value <10^-5; alignments were required to span at least 30% and at least 100 nt of the TA sequence. Over 90% of these TAs matched Arabidopsis proteins. A simplified version of the gene ontology annotation (GOslim) was transferred to the TAs in order to assign GO annotation to the strawberry sequences. Over 1,000 stress-related sequences were identified. F. vesca sequences produced in this project that are not represented in existing strawberry or other Rosaceae ESTs were identified by aligning our 5,800 F. vesca TAs to strawberry or Rosaceae unigenes in the Genome Database for Rosaceae (GDR; http://www.mainlab.clemson.edu/gdr), which does not yet include sequences from our project. Using a conservative, low-stringency cutoff (BLASTN E value <10^-10), we found that ~73% of our TAs do not show similarity to other strawberry unigenes in GDR, and ~21% do not show similarity to any GDR Rosaceae unigene. Among the 73% of our strawberry sequences not present in previous strawberry EST sets, 765 (or 13% of the 5,800 TAs) are associated with stress-related GO annotation. Also, 75 (or 1.3%) of those not represented in the whole Rosaceae unigene set are associated with stress-related GO annotation. In conclusion, our project has already yielded a substantial number of new stress-related gene sequences that had not been previously identified in strawberry and, as expected, a smaller but significant number of new stress-related genes not previously isolated in Rosaceae. We predict that additional ESTs to be produced under this project from tissues subjected to other stresses will continue to deliver new stress-related genes.
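
The BLASTX acceptance criteria stated in the abstract (E value below 10^-5, with the alignment spanning at least 30% and at least 100 nt of the TA sequence) can be illustrated with a small filter. This is only a sketch, not the project's actual pipeline: the `Hit` record, the function names, and the choice to measure the span on query (TA) coordinates are assumptions made for illustration, since the abstract does not say how the span was computed.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One BLAST hit (hypothetical record; field names are illustrative)."""
    query_id: str    # TA identifier
    subject_id: str  # matched protein identifier
    q_start: int     # alignment start on the TA, 1-based, inclusive
    q_end: int       # alignment end on the TA, 1-based, inclusive
    q_len: int       # full length of the TA sequence in nt
    evalue: float


def passes_filter(hit, max_evalue=1e-5, min_fraction=0.30, min_span_nt=100):
    """Return True if the hit meets all three thresholds.

    Assumption: the span is the number of TA nucleotides covered by the
    alignment, taken from query coordinates (which may be reversed).
    """
    span = abs(hit.q_end - hit.q_start) + 1
    return (
        hit.evalue < max_evalue
        and span >= min_span_nt
        and span >= min_fraction * hit.q_len
    )


def matched_queries(hits, **thresholds):
    """Collect the IDs of TAs that have at least one accepted hit."""
    return {h.query_id for h in hits if passes_filter(h, **thresholds)}


if __name__ == "__main__":
    # Made-up hits that each exercise one of the thresholds.
    example_hits = [
        Hit("TA1", "protA", 1, 300, 600, 1e-20),  # meets all criteria
        Hit("TA2", "protB", 1, 90, 200, 1e-30),   # span shorter than 100 nt
        Hit("TA3", "protC", 1, 150, 900, 1e-8),   # covers less than 30% of the TA
        Hit("TA4", "protD", 10, 400, 500, 1e-3),  # E value too high
    ]
    print(sorted(matched_queries(example_hits)))
```

The same function could be reused for the BLASTN comparison against GDR unigenes by passing `max_evalue=1e-10`, though the abstract does not state whether span requirements were also applied in that step.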