Submitted to: PLoS One
Publication Type: Peer reviewed journal
Publication Acceptance Date: 6/12/2012
Publication Date: 7/13/2012
Publication URL: http://handle.nal.usda.gov/10113/58750
Citation: Barthe, S., Gugerli, F., Barkley, N.L., Maggia, L., Cardi, C., Scotti, I. 2012. Always look on both sides: Phylogenetic information conveyed by simple sequence repeat allele sequences. PLoS One. 7(7):e40699. doi:10.1371/journal.pone.0040699.
Interpretive Summary: Simple sequence repeat (SSR) markers are widely used for assessing genetic diversity, determining relationships between individuals, and parentage analysis. Two main models, the infinite alleles model and the stepwise mutation model, are used to analyze SSR data. Each makes different assumptions about how SSR alleles change, and these assumptions affect the analysis, the results, and the interpretation of the data. Allele size alone, however, cannot reveal how these repeat elements evolve and change, because alleles of the same size do not always have identical sequence content. An appropriate model can be chosen only by sequencing the alleles and analyzing the data to determine whether the mechanism of change is more consistent with one model than the other. The main goal of this work was to determine which model is most appropriate for analyzing SSR alleles in three divergent angiosperms and to model the different forms of variation detected when characterizing these alleles. Additionally, homoplasy, in which alleles are identical in state but not in sequence content, has been reported in other plants and can be a problem in SSR marker analysis. This study therefore also aimed to determine the level of homoplasy in SSR alleles from three divergent tree genera: Citrus, Jacaranda, and Quercus.
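As a rough illustration of why the choice of model matters, the sketch below contrasts how the two models are commonly translated into a distance between two alleles. It is not taken from the study: the function names are hypothetical, and the squared difference in repeat number is used here only as one familiar stepwise-style measure.

```python
def iam_distance(repeats_a: int, repeats_b: int) -> int:
    """Infinite alleles view: two alleles are either the same (0) or different (1).

    The size of the difference carries no information.
    """
    return 0 if repeats_a == repeats_b else 1


def smm_distance(repeats_a: int, repeats_b: int) -> int:
    """Stepwise view: distance grows with the squared difference in repeat number.

    Assumption: alleles are described only by their repeat counts.
    """
    return (repeats_a - repeats_b) ** 2


# Two hypothetical alleles with 10 and 14 repeats of the same motif.
print(iam_distance(10, 14))  # different, regardless of how far apart
print(smm_distance(10, 14))  # weighted by the number of repeat steps
```

Both functions take repeat counts as input, which in practice are inferred from amplicon size. That inference is exactly what sequencing the alleles puts to the test.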
Technical Abstract: Simple sequence repeat (SSR) markers are widely used tools for inferences about genetic diversity, phylogeography and spatial genetic structure. These applications assume that variation among alleles is caused essentially by expansion or contraction of the number of repeats and, secondarily, that mutations in the target sequences follow the stepwise mutation model (SMM). In general, PCR amplicon sizes are used as direct indicators of the number of SSR repeats composing an allele, and the data analysis either ignores the extent of allele size differences or assumes a direct correlation between differences in amplicon size and evolutionary distance. However, without knowing precisely the kind and distribution of polymorphism within an allele (the SSR and its associated flanking region (FR) sequences), it is hard to say what evolutionary message is conveyed by a descriptor of polymorphism as synthetic as DNA amplicon size. In this study, we sequenced several SSR alleles in multiple populations of three divergent tree genera and disentangled the types of polymorphism contained in each portion of the DNA amplicon containing an SSR. We compared the patterns of diversity provided by amplicon size variation, by SSR variation itself, and by the insertions/deletions (indels) and single nucleotide polymorphisms (SNPs) observed in the FRs. The amount of variation was as large in the FRs as in the SSR itself. The FRs contributed significantly to the phylogenetic information and were sometimes the main source of the differentiation among individuals and populations detected by SSR markers. The presence of mutations occurring at different rates within a marker's sequence offers the opportunity to analyse evolutionary events occurring on various timescales, but at the same time calls for caution in the interpretation of SSR marker data when the distribution of within-locus polymorphism is not known.
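The idea of separating an amplicon into its SSR and flanking regions can be sketched in a few lines. The example below is only an illustration under stated assumptions: the sequences, the GA motif, and the helper `split_amplicon` are invented for this sketch and do not come from the study's data or methods. It shows two alleles of equal amplicon size whose repeat numbers differ because flanking-region indels compensate for one extra repeat.

```python
import re


def split_amplicon(seq: str, motif: str):
    """Return (left_flank, repeat_count, right_flank) around the longest motif run.

    Assumption: the SSR is the longest uninterrupted run of `motif` in `seq`.
    """
    runs = list(re.finditer(f"(?:{re.escape(motif)})+", seq))
    if not runs:
        return seq, 0, ""
    longest = max(runs, key=lambda m: m.end() - m.start())
    repeat_count = (longest.end() - longest.start()) // len(motif)
    return seq[:longest.start()], repeat_count, seq[longest.end():]


# Hypothetical alleles: same total length, different internal structure.
allele_1 = "ACGTT" + "GA" * 6 + "CCTAG"  # 6 repeats, 5 bp flanks
allele_2 = "ACGT" + "GA" * 7 + "CTAG"    # 7 repeats, 1 bp deleted in each flank

print(len(allele_1), len(allele_2))      # identical amplicon size
print(split_amplicon(allele_1, "GA"))
print(split_amplicon(allele_2, "GA"))
```

Sized on a gel or capillary, these two alleles would be scored as the same allele. Only the sequence reveals that they differ in both repeat number and flanking regions.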