Submitted to: Genome
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 4/22/2004
Publication Date: 10/14/2004
Citation: Schlueter, J.A., Dixon, P., Granger, C., Grant, D.M., Clark, L., Doyle, J.J., Shoemaker, R.C. 2004. Mining EST databases to resolve evolutionary events in major plant species. Genome. 47:868-876.
Interpretive Summary: Duplication of the hereditary material plays a major role in generating genetic diversity in most plants. Until recently, these events have been very difficult to study because gathering the necessary data was very expensive. Recent national research programs have generated large amounts of genetic data that are now available in public databases. The authors used these data to identify gene duplication events that occurred in a wide range of plant species. They identified ancient events that probably occurred as much as 60 million years ago. They also showed that expression of duplicated genes changes very quickly after duplication and that most genes are under selection pressure to maintain their structure. These data may have important implications for understanding the evolution of agronomically important traits.
Technical Abstract: Utilizing plant EST collections, we obtained 1392 potential gene duplicates across eight plant species: Zea mays, Oryza sativa, Sorghum bicolor, Hordeum vulgare, Solanum tuberosum, Lycopersicon esculentum, Medicago truncatula, and Glycine max. We estimated the synonymous and nonsynonymous distances between each gene pair and identified mixtures of two to three normal distributions. From these we inferred two to three rounds of genome duplication in each species. Within the Poaceae we found a conserved duplication event among all four species at approximately 50-60 million years ago (Mya), an event that probably occurred prior to the major radiation of the grasses. In the Solanaceae, we found evidence for a conserved duplication event approximately 46-48 Mya. A soybean duplication occurred approximately 42 Mya and a Medicago duplication about 55 Mya. Comparing synonymous and nonsynonymous distances allowed us to determine that most duplicate gene pairs are under purifying (negative) selection. We calculated Pearson correlation coefficients as a measure of how gene expression patterns have changed between duplicate pairs, and compared these across evolutionary distances. Gene expression of duplicates seems to move rapidly to fixation. We have found that duplicate gene analysis from large sequence resources can answer evolutionary questions.
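Illustrative note: the three quantities mentioned in the abstract (the comparison of nonsynonymous to synonymous distance, the age of a duplication, and the Pearson correlation of expression profiles) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' pipeline. The function names, the neutrality tolerance, and the substitution rate used in the example are hypothetical; the abstract does not state which rate or software was used. The dating formula T = Ks / (2r) is the standard molecular-clock conversion and is assumed here rather than taken from the abstract.

```python
import math


def selection_mode(ka, ks, tol=0.05):
    """Classify a duplicate pair by its Ka/Ks ratio.

    ka: nonsynonymous distance, ks: synonymous distance.
    tol is a hypothetical tolerance around 1.0 for calling a pair neutral.
    """
    if ks <= 0:
        raise ValueError("ks must be positive to form a Ka/Ks ratio")
    ratio = ka / ks
    if ratio < 1.0 - tol:
        return "purifying"   # negative selection: amino acid changes removed
    if ratio > 1.0 + tol:
        return "positive"    # amino acid changes favored
    return "neutral"


def divergence_time_mya(ks, rate_per_site_per_year):
    """Convert synonymous distance to an age in millions of years.

    Assumes a molecular clock: T = Ks / (2 * r), where r is the synonymous
    substitution rate per site per year (an assumed input, not from the source).
    """
    if rate_per_site_per_year <= 0:
        raise ValueError("rate must be positive")
    return ks / (2.0 * rate_per_site_per_year) / 1e6


def pearson(x, y):
    """Pearson correlation coefficient between two expression profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("a profile with zero variance has no correlation")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical duplicate pair; all values are made up for illustration.
    print(selection_mode(ka=0.08, ks=0.65))
    # The rate below is an assumed example value, not one reported in the abstract.
    print(divergence_time_mya(ks=0.65, rate_per_site_per_year=6.5e-9))
    # EST counts per library for the two copies (hypothetical data).
    print(pearson([12, 3, 0, 7], [10, 4, 1, 9]))
```

A correlation near 1 would suggest that the two copies are still expressed in the same tissues, while a value near 0 would suggest that their expression patterns have diverged; comparing that value with Ks across many pairs is one way to relate expression change to evolutionary distance.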