Title: The genome sequence of the most widely cultivated cacao type and its use to identify candidate genes regulating pod color

Author
MOTAMAYOR, JUAN - M & M Mars Company - United States
MOCKAITIS, KEITHANNE - Indiana University
SCHMUTZ, JEREMY - Hudsonalpha Institute For Biotechnology
HAIMINEN, NIINA - International Business Machines Corporation (IBM)
LIVINGSTONE III, DONALD - M & M Mars Company - United States
CORNEJO, OMAR - Stanford University
FINDLEY, SETH - M & M Mars Company - United States
ZHENG, PING - Washington State University
UTRO, FILIPPO - International Business Machines Corporation (IBM)
ROYAERT, STEFAN - M & M Mars Company - United States
SASKI, CHRISTOPHER - Clemson University
JENKINS, JERRY - Hudsonalpha Institute For Biotechnology
PODICHETI, RAM - Indiana University
ZHAO, MEIXIA - Purdue University
Scheffler, Brian
STACK, JOSEPH - M & M Mars Company - United States
FELTUS, ALEX - Clemson University
MUSTIGA, GUILIANA - M & M Mars Company - United States
AMORES, FREDDY - National Institute For Agricultural Research (INIAP)
PHILLIPS, WILBERT - Catie Tropical Agricultural Research
MARELLI, JEAN PHILIPPE - M & M Mars Company - Brazil
MAY, GREGORY - National Center For Genome Research
SHAPIRO, HOWARD - M & M Mars Company - United States
MA, JIANXIN - Purdue University
BUSTAMANTE, CARLOS - Stanford University
SCHNELL, RAYMOND - M & M Mars Company - United States
MAIN, DORRIE - Washington State University
GILBERT, DON - Indiana University
PARIDA, LAXMI - International Business Machines Corporation (IBM)
Kuhn, David

Submitted to: Genome Biology
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 4/11/2013
Publication Date: 6/14/2013
Citation: Motamayor, J.C., Mockaitis, K., Schmutz, J., Haiminen, N., Livingstone III, D., Cornejo, O., Findley, S., Zheng, P., Utro, F., Royaert, S., Saski, C., Jenkins, J., Podicheti, R., Zhao, M., Scheffler, B.E., Stack, J.C., Feltus, A., Mustiga, G., Amores, F., Phillips, W., Marelli, J., May, G.D., Shapiro, H., Ma, J., Bustamante, C.D., Schnell, R.J., Main, D., Gilbert, D., Parida, L., Kuhn, D.N. 2013. The genome sequence of the most widely cultivated cacao type and its use to identify candidate genes regulating pod color. Genome Biology. 14:R53. doi:10.1186/gb-2013-14-6-r53.

Interpretive Summary: Theobroma cacao, whose seeds are the source of cocoa, the raw material for the multibillion-dollar US chocolate industry, is an important tropical agricultural commodity that is affected by a number of diseases, including Black Pod disease caused by Phytophthora spp. We are trying to find molecular genetic markers that are linked to disease resistance in Theobroma cacao to aid a marker-assisted selection (MAS) breeding program and ensure a reliable supply of cocoa for the US confectionery industry. As a means to generate large numbers of markers and to identify candidate genes for traits such as pod color and disease resistance, we have completed the genome sequence of the most widely cultivated cacao type. Our results are important to scientists trying to understand the mechanism of disease resistance and, eventually, to cacao farmers who will benefit from superior disease-resistant cultivars produced through our MAS breeding program.

Technical Abstract: Background: Theobroma cacao L. cultivar Matina 1-6 belongs to the most cultivated cacao type. The availability of its genome sequence and methods for identifying genes responsible for important cacao traits will aid cacao researchers and breeders. Results: We describe the sequencing and assembly of the genome of Theobroma cacao L. cultivar Matina 1-6. The genome of the Matina 1-6 cultivar is 445 Mbp, which is significantly larger than that of a sequenced Criollo cultivar and more typical of other cultivars. The chromosome-scale assembly, version 1.1, contains 711 scaffolds covering 346.0 Mbp, with a contig N50 of 84.4 kbp, a scaffold N50 of 34.4 Mbp, and an evidence-based gene set of 29,408 loci. Version 1.1 has 10x the scaffold N50 and 4x the contig N50 of the Criollo assembly, and includes 111 Mbp more anchored sequence. The version 1.1 assembly has 4.4% gap sequence, while the Criollo assembly has 10.9%. Through a combination of haplotype, association mapping, and gene expression analyses, we leverage this robust reference genome to identify a promising candidate gene responsible for pod color variation. We demonstrate that green/red pod color in cacao is likely regulated by the R2R3 MYB transcription factor TcMYB113, homologs of which determine pigmentation in Rosaceae, Solanaceae, and Brassicaceae. One SNP within TcMYB113, located in the target site for a trans-acting siRNA that is highly conserved in dicots, seems to affect transcript levels of this gene and therefore pod color variation. Conclusions: We report a high-quality sequence and annotation of Theobroma cacao L. and demonstrate its utility in identifying candidate genes regulating traits. Keywords: Theobroma cacao L., genome, Matina 1-6, haplotype phasing, genetic mapping, pod color, MYB113
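
Note on the N50 figures quoted in the abstract: N50 is conventionally defined as the length L such that sequences of length L or longer together make up at least half of the total assembled length. The following is a minimal Python sketch of that standard definition, not the authors' own pipeline; the function name `n50` and the toy lengths are illustrative assumptions and are not taken from the Matina 1-6 assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Standard definition (assumed here): sort lengths from longest to
    shortest and return the length at which the running total first
    reaches half of the summed length.
    """
    if not lengths:
        raise ValueError("lengths must be non-empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy, made-up scaffold lengths (arbitrary units), for illustration only.
toy_scaffolds = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_scaffolds))  # 70: 80 + 70 = 150, half of the 300 total
```

The same function applies to contigs and scaffolds alike; only the input list changes, which is why an assembly reports a contig N50 and a scaffold N50 separately.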