2011 Annual Report
1a. Objectives (from AD-416)
To provide analytical bioinformatic support to the MARS-USDA Theobroma cacao L. sequencing project in various research activities: a) SNP diversity and association mapping analyses of cacao populations, b) gene expression analysis to study self-incompatibility in cacao, with special emphasis on the effect of microRNAs, c) comparative genomics of Theobroma grandiflorum vs. Theobroma cacao, d) sequencing of Phytophthora megakarya, and e) functional annotation of the T. cacao, T. grandiflorum, and P. megakarya genomes and polymorphisms based on bioinformatic approaches as well as integration of RNAseq data.
1b. Approach (from AD-416)
The research activities outlined in the objectives are not listed in order of priority and will be implemented as the data become available.
The cacao genome sequencing project will generate a large amount of Single Nucleotide Polymorphism (SNP) data, from which SNPs will be selected to establish a high-throughput SNP genotyping platform. The selection of the SNPs and their use in diversity and association mapping studies require bioinformatic analytical support. A postdoctoral fellow (or later research associate, depending on rank) will be hired to work in the laboratory and the Center for Genomics and Personalized Medicine (CGPM) at the Stanford University School of Medicine. The postdoctoral fellow will perform the required analyses for the SNP studies. Other scientific activities within the context of the cacao genome project will also benefit from the bioinformatics expertise in the lab and the CGPM. Specifically, the department's and center's expertise in the study of gene expression and the action of microRNAs will be of great value for the analysis of the gene expression data that have been generated to understand the genetic basis of self-incompatibility in cacao.
The genome of Theobroma grandiflorum, a related species with important traits such as disease resistance, will be sequenced and compared to the cacao genome. This comparison will help identify key genes involved in the domestication of cacao. The sequencing project also includes the study of important diseases affecting cacao. The lab, and specifically the postdoctoral fellow hired to support the activities outlined in this SCA, will contribute to the assembly of the genomes of the pathogens that cause the main diseases affecting cacao. For example, P. megakarya has a detrimental impact on cacao production, causing hundreds of millions of dollars in damages to West African cacao-producing countries.
This project is related to the in-house objective: The development and implementation of an international Marker Assisted Selection (MAS) program for cacao is the major objective of this project. This objective involves a combination of hypothesis-driven and non-hypothesis-driven research and includes the training of scientists from cacao-producing countries in plant breeding, genetics, and the use of molecular markers in a MAS program.
The Bioinformatics lab cooperator has focused on providing bioinformatic resources to the cacao genome project at several levels: a) the development of a platform for the genotyping by sequencing (GBS) of cacao clones; b) the re-sequencing and assembly of genotypes in addition to Matina 1-6, by generating the sequences themselves and assembling the sequence data using the Matina 1-6 reference assembly; c) the identification of Single Nucleotide Polymorphism (SNP) markers among the sequenced genotypes, for validating the GBS approach or for developing SNP chips; and d) the comparison of the Matina 1-6 and Criollo genomes. Currently we are re-sequencing 90 genotypes from the 10 cacao genetic groups. Ten re-sequenced genotypes have been used to identify SNPs against the Matina 1-6 reference genome.
Progress was monitored through phone calls and emails.