AN INTEGRATED WEB-BASED RELATIONAL DATABASE FOR THE CURATION OF CACAO GENETIC AND GENOMIC DATA
Location: Subtropical Horticulture Research
Project Number: 6631-21000-021-02
Specific Cooperative Agreement
Start Date: Jan 26, 2009
End Date: Jul 06, 2013
To create and maintain a curated and integrated web-based relational database of cacao genetic and genomic data.
The cacao genome sequencing project will generate a large amount of sequence data, physical map data, and single-nucleotide polymorphism (SNP) data. These data will be produced by USDA and other scientists and will require a website for the deposition, curation, manipulation and distribution of the data.
The cacao genome database will contain comprehensive data on the genetically anchored cacao physical map, annotated cacao expressed sequence tag (EST) databases, cacao maps and markers, all publicly available cacao sequences, and the raw and assembled output of the ongoing genome sequencing project. Annotations of ESTs and genomic sequence will include contig assembly, putative function, simple sequence repeats (SSRs), open reading frames (ORFs), Gene Ontology terms and, where applicable, anchored position on the cacao physical map. The integrated map viewer will provide a graphical interface to the genetic, transcriptome and physical mapping information. New cacao map data will be added to CMap, a web-based tool that allows users to view comparisons of genetic and physical maps. ESTs, bacterial artificial chromosomes (BACs) and markers will be searchable by various categories, and the search result pages will link to the integrated map viewer or to the WebFPC physical map sites. In addition to browsing and querying the database, users will be able to compare their sequences with the annotated cacao sequences via a dedicated sequence similarity server running either the BLAST or FASTA algorithm, search their sequences for microsatellites using the SSR server, or assemble their ESTs using the CAP3 server.
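To illustrate the kind of microsatellite search an SSR server performs, the following is a minimal sketch in Python. The project description does not state how the SSR server detects repeats, so the regular-expression approach, the function name find_ssrs, the SSR record type, and the default thresholds (repeat units of 2 to 6 bases, at least 4 tandem copies) are all assumptions made for illustration only.

```python
import re
from typing import List, NamedTuple


class SSR(NamedTuple):
    """Hypothetical record for one simple sequence repeat."""
    start: int   # 0-based position where the repeat tract begins
    motif: str   # repeat unit, e.g. "GA"
    copies: int  # number of tandem copies of the unit


def find_ssrs(seq: str, min_unit: int = 2, max_unit: int = 6,
              min_copies: int = 4) -> List[SSR]:
    """Report perfect tandem repeats in a DNA sequence.

    Thresholds are illustrative assumptions, not the settings of any
    particular SSR server.
    """
    seq = seq.upper()
    # Lazy unit match so the shortest repeat unit is tried first,
    # followed by at least (min_copies - 1) further copies of it.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_copies - 1)
    )
    hits = []
    for m in pattern.finditer(seq):
        motif = m.group(1)
        # Skip homopolymer runs such as "AAAAAAAA" reported as "AA" units.
        if len(set(motif)) == 1:
            continue
        hits.append(SSR(m.start(), motif, len(m.group(0)) // len(motif)))
    return hits


if __name__ == "__main__":
    example = "TTC" + "GA" * 5 + "TTC" + "ACG" * 4 + "TT"
    for ssr in find_ssrs(example):
        print(ssr.start, ssr.motif, ssr.copies)
```

This sketch finds only perfect repeats; imperfect or compound microsatellites, which production tools often report, would need additional logic.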