
Title: A core set of SNPs for cacao genotyping

Author
Kuhn, David
Livingstone III, Donald - Mars, Inc
Mockaitis, Keithanne - Indiana University
Farmer, Andrew - National Center For Genome Resources
May, Gregory - National Center For Genome Resources
Schnell, Raymond - Mars, Inc
Motamayor, Juan Carlos - Mars, Inc

Submitted to: Meeting Abstract
Publication Type: Abstract Only
Publication Acceptance Date: 1/15/2012
Publication Date: 1/15/2012
Citation: Kuhn, D.N., Livingstone III, D., Mockaitis, K., Farmer, A., May, G.D., Schnell, R., Motamayor, J. 2012. A core set of SNPs for cacao genotyping. Available: https://www.researchgate.net/publication/268120512_A_Core_Set_of_SNPs_for_Cacao_Genotyping.

Interpretive Summary: Theobroma cacao, the source of cocoa beans for chocolate, is an important tropical agricultural commodity that is affected by a number of fungal pathogens and insect pests, as well as by concerns about yield and quality. We are trying to find molecular genetic markers linked to disease resistance and other economically important traits to support a marker-assisted selection (MAS) breeding program for cacao and help ensure a reliable supply of cocoa for the US confectionery industry. Theobroma cacao L. (cacao) is an understory tropical tree whose fermented seeds are the source of cocoa butter and cocoa for chocolate production. Breeding programs to improve disease resistance and productivity in cacao depend on the correct genotyping of breeding material and of accessions in germplasm collections, which has been done with microsatellite markers. Because such markers are platform dependent, it has not been possible to produce widely agreed-upon reference genotypes, and this has significantly hindered breeding progress. We produced a 6k Illumina Infinium SNP chip from SNPs called from transcriptome sequence data (exons) of genetically diverse individuals. We genotyped 19 individuals that represented the breadth of genetic diversity in cacao and filtered these data to identify SNPs that occurred in single-copy genes, had high minor allele frequencies, were evenly distributed across the 10 linkage groups of cacao based on mapping to the Matina1-6 genome sequence, and, when possible, caused an amino acid change in the coding sequence. From this filtering process, we propose a core set of 100 SNPs, 10 per linkage group, for genotyping germplasm collections and the material used in breeding programs around the world, so that a single, verifiable reference genotype will be available for the majority of identified cultivars. Because SNPs are platform independent, both low- and high-throughput methods can be used to assay the complete core set or informative subsets, depending on the needs of the germplasm curator or breeder. Our results are important to scientists trying to understand the mechanism of disease resistance and, eventually, to cacao farmers, who will benefit from the more productive, disease-resistant cultivars produced through our MAS breeding program.

Technical Abstract: Juan Carlos Motamayor, MARS Inc., Miami, FL. Theobroma cacao L. (cacao) is an understory tropical tree whose fermented seeds are the source of cocoa butter and cocoa for chocolate production. Breeding programs to improve disease resistance and productivity in cacao depend on the correct genotyping of breeding material and of accessions in germplasm collections, which has been done with microsatellite markers. Because such markers are platform dependent, it has not been possible to produce widely agreed-upon reference genotypes, and this has significantly hindered breeding progress. We produced a 6k Illumina Infinium SNP chip from SNPs called from transcriptome sequence data (exons) of genetically diverse individuals. We genotyped 19 individuals that represented the breadth of genetic diversity in cacao and filtered these data to identify SNPs that occurred in single-copy genes, had high minor allele frequencies, were evenly distributed across the 10 linkage groups of cacao based on mapping to the Matina1-6 genome sequence, and, when possible, caused an amino acid change in the coding sequence. From this filtering process, we propose a core set of 100 SNPs, 10 per linkage group, for genotyping germplasm collections and the material used in breeding programs around the world, so that a single, verifiable reference genotype will be available for the majority of identified cultivars. Because SNPs are platform independent, both low- and high-throughput methods can be used to assay the complete core set or informative subsets, depending on the needs of the germplasm curator or breeder.
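
The filtering criteria described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the `Snp` record, the `min_maf` threshold of 0.3, and the equal-width position bins used to approximate "evenly distributed" are all assumptions made for illustration, since the abstract gives no numeric cutoffs or spacing rule. The sketch keeps SNPs in single-copy genes with a high minor allele frequency, then picks up to 10 per linkage group, preferring SNPs that change an amino acid.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass
class Snp:
    """Hypothetical record for one SNP assayed on the chip."""
    name: str            # illustrative marker identifier
    linkage_group: int   # 1..10, from mapping to the reference genome
    position: int        # base-pair position on that linkage group
    single_copy: bool    # True if the SNP lies in a single-copy gene
    nonsynonymous: bool  # True if the SNP changes an amino acid
    genotypes: list      # one call per individual, e.g. "AG"; None if missing


def minor_allele_frequency(genotypes):
    """Frequency of the less common allele, assuming a biallelic SNP."""
    counts = Counter(allele for call in genotypes if call for allele in call)
    if len(counts) < 2:
        return 0.0  # monomorphic or entirely uncalled
    return min(counts.values()) / sum(counts.values())


def select_core_set(snps, per_group=10, min_maf=0.3):
    """Pick up to `per_group` SNPs per linkage group (min_maf is an assumed cutoff)."""
    by_group = defaultdict(list)
    for snp in snps:
        if snp.single_copy and minor_allele_frequency(snp.genotypes) >= min_maf:
            by_group[snp.linkage_group].append(snp)

    core = []
    for group in sorted(by_group):
        candidates = sorted(by_group[group], key=lambda s: s.position)
        if len(candidates) <= per_group:
            core.extend(candidates)
            continue
        # Split the covered span into equal-width bins and keep one SNP per bin,
        # so the chosen markers are spread along the linkage group.
        lo, hi = candidates[0].position, candidates[-1].position
        width = (hi - lo) / per_group or 1
        bins = defaultdict(list)
        for snp in candidates:
            index = min(int((snp.position - lo) / width), per_group - 1)
            bins[index].append(snp)
        for index in sorted(bins):
            # Prefer amino-acid-changing SNPs, then the highest MAF.
            best = max(
                bins[index],
                key=lambda s: (s.nonsynonymous, minor_allele_frequency(s.genotypes)),
            )
            core.append(best)
    return core
```

With equal-width bins, a linkage group whose candidates cluster in a few regions can yield fewer than 10 markers; a real selection would likely need to relax the spacing rule or the frequency cutoff in that case.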