|Ruby, J Graham|
|Garcia, Jose Fernando|
|Smith, Timothy - Tim|
Submitted to: Genome Biology
Publication Type: Peer reviewed journal
Publication Acceptance Date: 1/8/2013
Publication Date: 1/30/2013
Citation: Melters, D.P., Bradnam, K.R., Young, H.P., Telis, N., May, M.R., Ruby, J., Sebra, R., Peluso, P., Eid, J., Rank, D., Garcia, J., Derisi, J.L., Smith, T.P., Tobias, C.M., Ross-Ibarra, J., Korf, I., Chan, S.W. 2013. Comparative analysis of tandem repeats from hundreds of species reveals unique insights into centromere evolution. Genome Biology. 14(1):R10. doi:10.1186/gb-2013-14-1-r10.
Interpretive Summary: During cell division, chromosomes are replicated and the resulting sister chromatids are divided equally between the two nascent cells. The movement of chromosomes to their proper positions in the new cells is normally carried out by attachment of spindle microtubules to the chromosomes at structural features in the DNA called centromeres. In most species that have been studied, centromere DNA is composed of short sequence units that are repeated in tandem, with repeats sometimes spanning millions of copies. Surprisingly, although clustered tandem repeat units are very common in species of both the plant and animal kingdoms, the DNA sequences of the repeats themselves are quite divergent. Little is known about the evolution of these centromeric repeat units. The present study describes the results of a survey of centromeric repeats among more than 280 species, and provides the most comprehensive look at the evolution of centromere structure attempted so far. The study found that bovid species such as cattle, yak, and buffalo have some of the longest core repeats of any species examined. These 1,400-base repeat units are so long that only the latest technology for obtaining very long sequence reads, like the sequencing machine at the U.S. Meat Animal Research Center in Clay Center, Nebraska, is able to correctly identify these repeats.
Technical Abstract: Centromeres are essential for chromosome segregation, yet their DNA sequences evolve rapidly. In most animals and plants that have been studied, centromeres consist of megabase-scale arrays of tandem repeats. The true prevalence of centromere tandem repeats, and whether they exhibit conserved sequence properties, are not known. Bioinformatic methods were used to identify high-copy tandem repeats from 255 species with shotgun genomic sequence in public archives. The assumption that the most abundant tandem repeat is the centromere DNA was borne out by comparison to previously characterized species. Our methods are compatible with all current sequencing technologies. Pacific Biosciences’ long sequence reads allowed us to find tandem repeat monomers up to 1,418 bp. High-copy centromere tandem repeats were found in almost all animal and plant genomes. The repeated monomers were highly variable in composition and length. Furthermore, the repeat sequences themselves were not conserved beyond approximately 47 million years of evolution. Despite sharing only the property of being tandem repeats, centromere arrays showed similar modes of evolution over a wide phylogenetic range, including the appearance of higher order repeat structures in which several polymorphic monomers make up a larger repeating unit. Centromere identity in most eukaryotes is epigenetically determined, but our results show that tandem repeats are highly prevalent at centromeres of animals and plants. This suggests that they confer a subtle selective advantage, possibly by promoting centromere DNA on all chromosomes to evolve in concert.
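Illustrative sketch: the link between read length and detectable monomer length can be shown with a small Python example. This is not the pipeline used in the study. The function names (find_period, estimate_copies) and the 0.9 identity threshold are assumptions chosen for illustration. The sketch shifts a read against itself and reports the shortest shift at which most bases still match. Because a shift can only be tested when the read holds at least two copies of the monomer, a 1,418 bp monomer cannot be detected in a read shorter than about 2,836 bp.

```python
def find_period(read, min_period=2, min_identity=0.9):
    """Return (period, identity) for the shortest self-matching shift, or None.

    identity is the fraction of positions i where read[i] == read[i + period].
    Only periods up to half the read length are tried, so the read must
    contain at least two full copies of a monomer for it to be found.
    """
    n = len(read)
    max_period = n // 2
    for period in range(min_period, max_period + 1):
        compared = n - period
        matches = sum(1 for i in range(compared) if read[i] == read[i + period])
        identity = matches / compared
        if identity >= min_identity:
            return period, identity
    return None


def estimate_copies(read, period):
    """Approximate number of monomer copies in the read (may be fractional)."""
    return len(read) / period


if __name__ == "__main__":
    # Hypothetical 8 bp monomer repeated five times, with one substitution
    # to mimic the polymorphism seen between neighbouring monomers.
    bases = list("ACGTTGCA" * 5)
    bases[20] = "A"  # was "T"
    read = "".join(bases)

    result = find_period(read)
    if result is not None:
        period, identity = result
        print(period, round(identity, 4), estimate_copies(read, period))
```

The identity threshold tolerates a small number of substitutions between copies. A real analysis would also need to handle insertions, deletions, and higher order repeat units, which this sketch does not attempt.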