Submitted to: Bioinformatics
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: September 20, 2004
Publication Date: November 5, 2004
Citation: Nelson, R., Grant, D.M., Shoemaker, R.C. 2004. ESTminer: a suite of programs for gene and allele identification. Bioinformatics. 21(5):691-693.
Interpretive Summary: The hereditary material of many plants has undergone a doubling at some time during the plants' evolution, which doubles every gene. Even though the duplicated genes gradually diverge over millions of years, distinguishing them is not always easy. In this study, the authors devised methods to distinguish the hereditary code of closely related genes. They used a variety of pre-existing computer programs and relied upon hereditary sequences isolated from a single cultivar. The method they developed is adaptable by research groups with limited computer access. This information will be important to scientists studying the evolution of plant DNA.
Technical Abstract: ESTminer is a collection of programs that use expressed sequence tag (EST) data from inbred genomes to identify polymorphisms that define genes within gene families. The algorithm uses CAP3 to perform an initial clustering of related EST sequences, producing a consensus sequence for each gene family. These consensus sequences are then used with BLAST to collect all related ESTs from the original EST collection. A redundancy-based criterion is applied to each polymorphism observed in an EST to identify reliable candidate gene-defining polymorphisms. Because the ESTs come from a highly inbred genome, there is no need to compute covariance on each polymorphism to identify alleles of the same gene, which makes this algorithm computationally less intensive than the alternatives.
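Illustrative sketch (not part of ESTminer): the redundancy-based criterion described in the abstract can be pictured as follows. The abstract does not give the actual rule or threshold, so this Python example makes several assumptions: the ESTs are already aligned to equal length, a variant counts as "reliable" when at least `min_support` ESTs carry it, and the function name `reliable_polymorphisms` and its parameters are hypothetical.

```python
from collections import Counter


def reliable_polymorphisms(aligned_ests, min_support=2, gap="-"):
    """Return {column: {base: count}} for alignment columns where at least
    two different bases are each observed in >= min_support ESTs.

    Assumption: min_support is a hypothetical threshold, not ESTminer's.
    """
    if not aligned_ests:
        return {}
    length = len(aligned_ests[0])
    if any(len(seq) != length for seq in aligned_ests):
        raise ValueError("all aligned ESTs must have the same length")

    result = {}
    for col in range(length):
        # Count each base seen in this column across all ESTs.
        counts = Counter(seq[col].upper() for seq in aligned_ests)
        # Ignore gaps and ambiguous calls; they carry no variant evidence.
        counts.pop(gap, None)
        counts.pop("N", None)
        # Keep only variants seen often enough to be unlikely sequencing errors.
        supported = {base: n for base, n in counts.items() if n >= min_support}
        if len(supported) >= 2:
            result[col] = supported
    return result


# Toy alignment: column 2 has G (3 ESTs) and C (2 ESTs); column 7 has a
# single-EST A that is treated as a probable sequencing error.
ests = [
    "ACGTACGT",
    "ACGTACGT",
    "ACCTACGT",
    "ACCTACGA",
    "ACGTACGT",
]
hits = reliable_polymorphisms(ests, min_support=2)
```

With an inbred source, each supported variant at such a column can be read as marking a distinct gene family member rather than an allele, which is why no covariance step appears in this sketch.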