Publication #255473

Title: Identification and annotation of repetitive sequences in fungal genomes

Authors
Dhillon, Braham - Purdue University
Goodwin, Stephen

Submitted to: Methods in Molecular Biology
Publication Type: Book / Chapter
Publication Acceptance Date: 3/19/2011
Publication Date: 6/3/2011
Citation: Dhillon, B., Goodwin, S.B. 2011. Identification and annotation of repetitive sequences in fungal genomes. In: Xu, J.R., Bluhm, B.H., editors. Methods in Molecular Biology: Fungal Genomics. New York, NY: Humana Press. p. 33-50.

Interpretive Summary: Advances in DNA sequencing technologies have greatly increased the amount of genomic data available for many species of fungi and other microorganisms. This has been paralleled by an increase in the computational power and resources available to process raw sequence data and translate it into meaningful information. In addition to protein-coding regions, all genomes analyzed so far contain repetitive sequences, which can arise by many mechanisms. Identifying and analyzing these repetitive sequences has been a problem for many genome sequencing projects. To facilitate this process, the available literature on the analysis of movable genetic elements and other repetitive sequences was reviewed. Many computer programs are available to identify or characterize specific types of repetitive sequences, and each approach has its own strengths and weaknesses. Sequential analysis with multiple programs is necessary for a complete accounting of the repetitive elements within a genome. One possible pipeline for the analysis of repetitive elements is suggested. This information will be useful to plant pathologists, mycologists and bioinformaticists who are trying to analyze the repetitive fractions of fungal genomes.

Technical Abstract: Cheaper and faster sequencing technologies have fundamentally changed the pace of genome sequencing projects and have contributed to the ever-increasing volume of genomic data. This has been paralleled by an increase in the computational power and resources available to process raw sequence data and translate it into meaningful information. In addition to protein-coding regions, repetitive sequences have been an integral part of all the genomes studied so far. Although repetitive sequences were previously considered ‘junk’, numerous studies have implicated them in important biological and structural roles in the genome. The identification and characterization of these sequences have therefore become an indispensable part of genome sequencing projects. Numerous similarity-based and de novo methods have been developed to search for and annotate repeats in the genome, and many of them are discussed in this chapter.