Location: Vegetable Crops Research Unit
Title: Genome Wide Characterization of Simple Sequence Repeats in Cucumber
Submitted to: International Symposium on Cucurbits Proceedings
Publication Type: Abstract Only
Publication Acceptance Date: June 8, 2009
Publication Date: September 20, 2009
Citation: Weng, Y., Cavagnaro, P., Senalik, D.A., Harkins, T., Simon, P.W. 2009. Genome Wide Characterization of Simple Sequence Repeats in Cucumber [abstract]. International Symposium on Cucurbits Proceedings. p. 80.
Technical Abstract: The whole genome of the cucumber cultivar Gy14 was recently sequenced at 15× coverage with Roche 454 Titanium technology. The microsatellite DNA sequences (simple sequence repeats, SSRs) in the assembled scaffolds were computationally explored and characterized. A total of 112,073 SSRs with basic motif lengths of 2 to 8 nucleotides were identified in ~203 Mb of cucumber genomic DNA sequence, representing ~55.3% of its 367 Mbp nuclear genome. This translates to an average of 1 SSR per 1.81 kb. Tetra- (29.8%), di- (26.5%) and trinucleotides (25.6%) were the most abundant repeat types, with the dinucleotide “AT” being the single most frequent SSR motif. AT-rich motifs predominated over GC-rich repeats for all SSR motif lengths. To investigate the type and distribution of SSRs in different genome fractions, SSR-containing sequence regions were first masked for microsatellites and then compared against EST and repeat databases, and positive hits were recorded. Substantial variation in the relative frequencies of SSR types and motifs was found across the two genome fractions analyzed. The SSR dataset associated with the transcribed fraction of the C. sativus genome (Cs-est) had a higher frequency of trinucleotides and hexanucleotides than the overall genomic sequence (Cs-gen). Cs-est also had a higher frequency of AAG, AGG and AG motifs than Cs-gen. Compared to Cs-gen and Cs-est, the microsatellites associated with repetitive elements in the cucumber genome were particularly rich in dinucleotides, predominantly those with the “AT” motif. Several other published plant genomes were also analyzed, and a comparable distribution of SSR types in transcribed and genomic regions was observed, suggesting that this unequal distribution is not random.
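Illustrative sketch: the abstract does not name the software or the minimum repeat thresholds used for SSR detection, so the following Python example is only a minimal illustration of how perfect SSRs with motif lengths of 2 to 8 nucleotides might be located, and how the reported spacing of 1 SSR per 1.81 kb follows from the stated totals. The function names (find_ssrs, ssr_spacing_kb), the threshold of four repeat units, and the demo sequence are all assumptions made for this example, not details taken from the study.

```python
import re

def find_ssrs(seq, min_motif=2, max_motif=8, min_repeats=4):
    """Return (start, motif, repeat_count) for perfect tandem repeats.

    min_repeats=4 is an assumed threshold; the abstract does not state one.
    Motifs are reported as found, without merging cyclic or reverse-complement
    equivalents (for example AT and TA).
    """
    seq = seq.upper()
    # Lazy quantifier: at each position the shortest qualifying motif is tried first.
    pattern = re.compile(r"([ACGT]{%d,%d}?)\1{%d,}" % (min_motif, max_motif, min_repeats - 1))
    hits = []
    for m in pattern.finditer(seq):
        motif = m.group(1)
        # Skip motifs that are themselves repeats of a shorter unit,
        # e.g. "TT" (a mononucleotide run) or "ATAT".
        if motif in (motif + motif)[1:-1]:
            continue
        hits.append((m.start(), motif, len(m.group(0)) // len(motif)))
    return hits

def ssr_spacing_kb(total_bp, n_ssrs):
    """Average distance between SSRs in kilobases."""
    return total_bp / n_ssrs / 1000

# Hypothetical demo sequence: an AT repeat, an AAG repeat, and a poly-T run.
demo = "GGC" + "AT" * 6 + "CCT" + "AAG" * 5 + "C" + "T" * 10 + "ACGT"
print(find_ssrs(demo))

# Totals reported in the abstract: ~203 Mb scanned, 112,073 SSRs, 367 Mbp genome.
print(round(ssr_spacing_kb(203_000_000, 112_073), 2), "kb per SSR")
print(round(100 * 203 / 367, 1), "% of the nuclear genome")
```

A production analysis would normally use a dedicated SSR-detection tool and motif-length-specific thresholds; the single regular expression here is chosen only for brevity.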