Submitted to: Plant and Animal Genome XV Conference Abstracts
Publication Type: Abstract Only
Publication Acceptance Date: December 2, 2006
Publication Date: January 13, 2007
Citation: Crane, C.F., Quevillon, E., Goodwin, S.B. 2007. Characterization of Tandem Repeats and Repetitive Elements in the Genomes of Mycosphaerella graminicola and M. fijiensis. Plant and Animal Genome XV Conference Abstracts. Available: http://www.intl-pag.org/15/abstracts/PAG15_P03b_144.html.
Interpretive Summary: Mycosphaerella graminicola is the fungus that causes septoria leaf blotch, a destructive disease of wheat. The related M. fijiensis causes black Sigatoka disease of banana. The entire DNA (genome) of these fungi has been sequenced this year, and it is now possible to characterize different parts of the genome and investigate their function. One noteworthy characteristic of DNA sequence is repeated subsequences, which fall into three classes: microsatellites, where the repeated subsequence is one to six bases long and the repeats are tandem; minisatellites, where the tandemly repeated subsequence is more than six bases long; and repetitive elements, where the subsequence is typically hundreds or thousands of bases long and the element occurs at many separate positions in the genome. We have developed software to identify microsatellite and minisatellite sequences in genomic DNA sequence, and we have used standard software (tblastx and RECON) to identify known and novel repetitive elements. The most frequent microsatellite patterns are C, AG, AAG, AAAC, AAGTG, and AACCCT, in their respective length categories. There are 2200 groups of repetitive elements overall; 38 of the groups are novel and occur in at least 20 instances in the genome. Our findings contribute to basic knowledge about the genomes of these two troublesome plant pathogens.
Technical Abstract: In the genome of Mycosphaerella graminicola, the causal agent of septoria leaf blotch of wheat, a total of 22745 tandem repeat sites at least 15 bp in length were identified for motifs from two to 250 bases long, both by the same processing pipeline that we have used to analyze such repeats in plant EST collections and, independently, by Tandem Repeats Finder. Among canonical microsatellite motifs, the mononucleotide C (0.57 sites/Mb) was more frequent than A (0.31/Mb); dinucleotide frequencies ranked AG (2.01/Mb) > AC > AT > CG (0.03/Mb); and trinucleotide frequencies ranked AAG (4.70/Mb) > AGG > AAC > ACG > AGC > ACT > ACC > CCG > ATC > AAT (0.36/Mb). The top five tetranucleotide motifs were AAAC (4.70/Mb), AGGG (0.34/Mb), AATG, AAGC, and ATCC (0.23/Mb). The top seven pentanucleotide motifs were AAGTG (0.67/Mb), ACCTC, ATGCC, ATCGC, AAGGG, AACAC, and AATAC (the last three at 0.26/Mb each), and the top five hexanucleotide motifs were AACCCT (0.85/Mb), AAGAGG, AGGATG, ACGAGG, and ACGGCG (0.31/Mb). Known repetitive elements were identified by tblastx of the scaffolds against RepBase3.0 (version as of 19 April 2006), and a comprehensive set of 2200 families of known and novel repetitive elements was identified by self-blastn followed by RECON. Of the latter, 38 families contained members at 20 or more distinct loci in the genome, and nine exceeded 100 loci. Parallel results will be available for the genome of Mycosphaerella fijiensis.
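Illustrative sketch: the abstract does not describe how the authors' pipeline works internally, so the following Python is only a minimal, hypothetical illustration of the kind of analysis reported above, not the authors' software and not Tandem Repeats Finder. It assumes that only perfect tandem repeats of 1 to 6 bases are wanted, that a site must span at least 15 bp, and that a "canonical" motif is the alphabetically smallest string among all rotations of the motif and of its reverse complement. That last assumption is consistent with the motifs listed (C and A, but no G or T; AG, AC, AT, and CG), but the abstract does not state it. The function names canonical_motif, find_microsatellites, and sites_per_megabase are invented for this example.

```python
import re
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical_motif(motif):
    """Return the smallest rotation of the motif or of its reverse complement.

    Assumed convention (not stated in the abstract): CT, GA, and TC all map to AG.
    """
    motif = motif.upper()
    revcomp = motif.translate(_COMPLEMENT)[::-1]
    rotations = [s[i:] + s[:i] for s in (motif, revcomp) for i in range(len(s))]
    return min(rotations)


def find_microsatellites(seq, max_motif=6, min_length=15):
    """Find perfect tandem repeats; return sorted (start, end, canonical motif) tuples.

    Coordinates are 0-based and end-exclusive. Trailing partial copies and
    imperfect repeats are ignored in this simplified sketch.
    """
    seq = seq.upper()
    hits = []
    for unit in range(1, max_motif + 1):
        # Fewest whole copies that span min_length bases (never fewer than two).
        min_copies = max(2, -(-min_length // unit))
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit, min_copies - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are just a shorter motif repeated (e.g. AGAG = AG x 2),
            # because the shorter unit length already reports that site.
            if any(unit % k == 0 and motif == motif[:k] * (unit // k)
                   for k in range(1, unit)):
                continue
            hits.append((match.start(), match.end(), canonical_motif(motif)))
    return sorted(hits)


def sites_per_megabase(hits, genome_length):
    """Convert per-motif site counts into sites per megabase of sequence."""
    counts = Counter(motif for _, _, motif in hits)
    return {motif: n / (genome_length / 1e6) for motif, n in counts.items()}


# Small made-up sequence containing three repeat tracts.
demo = "TT" + "AG" * 8 + "TT" + "C" * 15 + "TT" + "AAG" * 5 + "T"
for start, end, motif in find_microsatellites(demo):
    print(start, end, motif)
```

Applied to real scaffolds, the per-motif densities from sites_per_megabase would be comparable in form to the sites-per-megabase figures quoted above, although a perfect-repeat search like this one will generally find fewer sites than a tool that tolerates mismatches and indels.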