United States Department of Agriculture

Agricultural Research Service

Research Project: An Integrated Web-Based Relational Database for the Curation of Cacao Genetic and Genomic Data

Location: Subtropical Horticulture Research

2010 Annual Report


1a. Objectives (from AD-416)
To create and maintain a curated and integrated web-based relational database of cacao genetic and genomic data.


1b. Approach (from AD-416)
The cacao genome sequencing project will generate a large amount of sequence data, physical map data, and single-nucleotide polymorphism (SNP) data. These data will be produced by USDA and other scientists and will require a website for their deposition, curation, manipulation, and distribution. The cacao genome database will contain comprehensive data on the genetically anchored cacao physical map, annotated cacao EST databases, cacao maps and markers, all publicly available cacao sequences, and the raw and assembled output of the ongoing genome sequencing project. Annotations of ESTs and genomic sequence will include contig assembly, putative function, simple sequence repeats (SSRs), open reading frames (ORFs), Gene Ontology terms, and anchored position on the cacao physical map where applicable. The integrated map viewer will provide a graphical interface to the genetic, transcriptome, and physical mapping information. New cacao map data will be added to CMap, a web-based tool that allows users to view comparisons of genetic and physical maps. ESTs, BACs, and markers can be queried by various categories, and the search result pages will link to the integrated map viewer or to the WebFPC physical map sites. In addition to browsing and querying the database, users will be able to compare their sequences with the annotated cacao sequences via a dedicated sequence similarity server running either the BLAST or FASTA algorithm, search their sequences for microsatellites using the SSR server, or assemble their ESTs using the CAP3 server.
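As an illustration of the kind of microsatellite search an SSR server performs, the following is a minimal Python sketch. It is not the project's SSR server: the function name find_ssrs, the SSR record type, and the default thresholds (motifs of 2 to 6 bases, at least 4 tandem copies) are assumptions chosen for the example, and a production tool would typically apply motif-specific thresholds and handle ambiguous bases.

```python
import re
from typing import List, NamedTuple


class SSR(NamedTuple):
    """One microsatellite hit (hypothetical record type for this sketch)."""
    start: int    # 0-based offset of the repeat tract in the sequence
    motif: str    # repeated unit, e.g. "AG"
    repeats: int  # number of tandem copies of the motif


def find_ssrs(sequence: str,
              min_motif: int = 2,
              max_motif: int = 6,
              min_repeats: int = 4) -> List[SSR]:
    """Return perfect tandem repeats found in a DNA sequence.

    Thresholds are illustrative assumptions, not values from the report.
    """
    seq = sequence.upper()
    # Lazy motif group so the shortest repeating unit is tried first,
    # followed by at least (min_repeats - 1) further copies of it.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_motif, max_motif, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq):
        motif = match.group(1)
        # Skip single-base runs such as "AAAAAAAA" reported as motif "AA".
        if len(set(motif)) == 1:
            continue
        hits.append(SSR(match.start(), motif, len(match.group(0)) // len(motif)))
    return hits


if __name__ == "__main__":
    for hit in find_ssrs("TTGACAGAGAGAGAGTCC"):
        print(hit)
```

Called on a short test sequence, the function reports the start offset, repeat unit, and copy number of each tract, which is the minimum information needed to flag a candidate SSR marker for primer design.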


3. Progress Report

This research relates to the following in-house objective: the development and implementation of an international Marker Assisted Selection (MAS) program for cacao is the major objective of this project. This objective involves a combination of hypothesis-driven and non-hypothesis-driven research and includes the training of scientists from cacao-producing countries in plant breeding, genetics, and the use of molecular markers in a MAS program.

The objective of this agreement is to create and maintain a curated and integrated web-based relational database of cacao genetic and genomic data. The cacao genome database currently contains comprehensive data on the genetically anchored cacao physical map, annotated cacao EST databases, cacao maps and markers, all publicly available cacao sequences, and the raw and assembled output of the ongoing genome and transcriptome sequencing project. The website architecture is based on Chado and Drupal, which allows the site to be ported to an ARS server upon completion of the project.

Monitoring Activities: Project management has been accomplished through regular conference calls, GoToMeeting webinars, e-mails, and two meetings per year.


Last Modified: 8/29/2014