Location: Subtropical Horticulture Research
Title: A genetically anchored physical framework for Theobroma cacao cv. Matina 1-6
Author: Schnell II, Raymond
Submitted to: Biomed Central (BMC) Genomics
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 8/4/2011
Publication Date: 8/16/2011
Citation: Saski, C.A., Feltus, F.A., Staton, M.E., Blackmon, B.P., Ficklin, S.P., Kuhn, D.N., Schnell II, R.J., Shapiro, H., Motamayor, J.C. 2011. A genetically anchored physical framework for Theobroma cacao cv. Matina 1-6. Biomed Central (BMC) Genomics. 12:413. DOI: 10.1186/1471-2164-12-413.
Interpretive Summary: The fermented, dried seeds of Theobroma cacao (chocolate tree) are the main flavor ingredient in chocolate. World cocoa production was estimated at 7 million tons in 2010, with an estimated average annual growth rate of 2.2%. The cacao bean production industry is currently under threat from a rise in fungal diseases, including black pod, frosty pod, and witch’s broom. To address these issues, the entire genome of cacao has been sequenced to identify genetic markers and genes that can accelerate the release of robust T. cacao cultivars. To link the genome sequence to previous genetic recombination mapping of favorable traits such as disease resistance, we have produced a physical map (framework) of cacao. The physical map improves the assembly of the cacao genome and makes it easier to identify markers linked to traits. Identifying markers linked to disease resistance improves cacao breeding programs and accelerates the availability of disease-resistant, higher-yielding cultivars to farmers. The result is a stable, abundant supply of cocoa for the US chocolate manufacturing industry.
Technical Abstract:
Background: The seeds of Theobroma cacao (cacao tree) are the main ingredient in chocolate. World cocoa production was estimated at 7 million tons in 2010, with an estimated average annual growth rate of 2.2%. The cacao bean production industry is currently under threat from a rise in fungal diseases, including black pod, frosty pod, and witch’s broom. To address these issues, two genome-sequencing efforts have recently been initiated to identify genetic markers and genes that can accelerate the release of robust T. cacao cultivars. However, the selected sequencing strategies cannot avoid inherent problems with the assembly and resolution of distal regions of complex eukaryotic genomes, such as gaps, chimeric joins, and unresolvable repeat-induced compressions. Here, we describe the construction of a BAC-based integrated genetic-physical map of the T. cacao genome, which aims to augment and enhance these efforts.
Results: Three 10X coverage BAC libraries were constructed and fingerprinted. Two hundred thirty genetic markers from the high-resolution genetic recombination map and 96 Arabidopsis-derived conserved ortholog set (COS) II markers were anchored by pooled overgo hybridization. A dense tile path of 29,383 BACs was selected and end sequenced. The physical map consists of 154 contigs and 4,268 singletons. Forty-nine contigs are genetically anchored and ordered to chromosomes, for a total span of 307.2 Mbp. The 105 unanchored contigs span 67.4 Mbp, giving an estimated genome size of 374.7 Mbp. A comparative analysis with A. thaliana, V. vinifera, and P. trichocarpa suggests that these distant genome assemblies can provide insight into genome structure, evolutionary history, and conserved functional sites, and can help improve a physical map assembly. A comparison between the two cultivars (Matina 1-6 and Criollo) indicates a high degree of colinearity, although rearrangements were observed.
Conclusions: The genomic resources presented in this study are a standalone resource for functional exploitation and enhancement of Theobroma cacao and are expected to complement and augment the two ongoing genome-sequencing efforts. In the face of two genome sequences, these resources will serve as the template for genome sequence refinement through gap-filling, targeted resequencing, and resolution of repetitive DNA arrays by ordered long-range contiguity.
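The contig counts and spans reported in the Results can be cross-checked with a short calculation. The sketch below is illustrative only: the function name `summarize_map` and its parameter names are hypothetical and not part of the study. Summing the two rounded spans gives 374.6 Mbp rather than the reported 374.7 Mbp; this is presumably a rounding difference in the individual spans, which is an assumption and not something the abstract states.

```python
def summarize_map(anchored_contigs, unanchored_contigs,
                  anchored_span_mbp, unanchored_span_mbp):
    """Summarize a physical map from anchored and unanchored contig totals.

    Hypothetical helper for illustration; inputs are the rounded figures
    quoted in the abstract.
    """
    total_contigs = anchored_contigs + unanchored_contigs
    # Round to one decimal place to match the precision of the reported spans.
    total_span_mbp = round(anchored_span_mbp + unanchored_span_mbp, 1)
    # Fraction of the estimated genome covered by genetically anchored contigs.
    anchored_fraction = anchored_span_mbp / total_span_mbp
    return {
        "total_contigs": total_contigs,
        "total_span_mbp": total_span_mbp,
        "anchored_fraction": anchored_fraction,
    }


# Figures taken from the Results paragraph of the abstract.
summary = summarize_map(
    anchored_contigs=49,
    unanchored_contigs=105,
    anchored_span_mbp=307.2,
    unanchored_span_mbp=67.4,
)
print(summary["total_contigs"])
print(summary["total_span_mbp"])
print(f"{summary['anchored_fraction']:.1%}")
```

Under these figures, the 49 anchored contigs account for roughly four fifths of the estimated genome span, and the anchored and unanchored counts add up to the 154 contigs reported for the map.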