
Title: Genome-wide computational prediction and analysis of core promoter elements across plant monocots and dicots

Author: Ware, Doreen

Submitted to: PLOS ONE
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 9/18/2013
Publication Date: 10/29/2013
Publication URL: https://doi.org/10.1371/journal.pone.0079011
Citation: Kumari, S., Ware, D. 2013. Genome-wide computational prediction and analysis of core promoter elements across plant monocots and dicots. PLoS One. 8(10):e79011.

Interpretive Summary: Despite numerous technological advances in the biological and computational sciences in the post-genome era, a basic understanding of gene regulatory mechanisms remains elusive. Unraveling RNA polymerase II (polII) mediated transcription initiation is now widely recognized as fundamental to developing a system-level understanding of condition-specific gene regulatory networks (GRNs). Experimental discovery and confirmatory studies, though important, are not sufficient on their own to meet this challenge. For example, it is now well known that the TATA-box motif, once thought to be necessary for assembly of the polII pre-initiation complex (PIC), accounts for only a small fraction of the expressed genome. Furthermore, it remains challenging to accurately identify the transcription start site (TSS) and to predict the functional genomic elements in the promoter region. As a result, integrating TSS and cis-regulatory element identification tools into genome annotation pipelines has yet to become common practice, and very little is known about the cis-regulatory elements of transcription control in plants. Here, we report the first genome-wide distribution of known core promoter elements (CPEs) across monocots and dicots. We studied the genome-wide frequency distribution profile of each motif, along with sequence-based and structure-based characteristics of promoter elements, such as those derived from comparative genomics and DNA free energy profiles. The binding sites of thirteen known CPEs in the promoter regions of eight plant species were predicted based on sequence content and cross-species conservation. The putative core promoter region boundaries were ascertained from structural content using DNA free energy profiles. Promoter sequences were then compared between monocots and dicots to identify their similarities and differences.

Technical Abstract: Transcription initiation, essential to the regulation of gene expression, involves recruitment of basal transcription factors to the core promoter elements (CPEs). The distribution of currently known CPEs across plant genomes is largely unknown. This is the first large-scale, genome-wide report on the computational prediction of CPEs across eight plant genomes, intended to help better understand assembly of the transcription initiation complex. The distribution of thirteen known CPEs across four monocots (Brachypodium distachyon, Oryza sativa ssp. japonica, Sorghum bicolor, Zea mays) and four dicots (Arabidopsis thaliana, Populus trichocarpa, Vitis vinifera, Glycine max) reveals the structural organization of the core promoter in relation to the TATA-box as well as to other CPEs. The distribution of known CPE motifs with respect to the transcription start site (TSS) exhibited positional conservation within monocots and within dicots, with slight differences across all eight genomes. Further, a more refined subset of annotated genes, based on orthologs in the model monocot (O. sativa ssp. japonica) and dicot (A. thaliana) genomes, supported the positional distribution of these thirteen known CPEs. DNA free energy profiles provided evidence that the structural properties of promoter regions are distinctly different from those of non-regulatory genome sequence. The profiles also showed that monocot core promoters have lower DNA free energy than dicot core promoters. The comparison of monocot and dicot promoter sequences highlights both the similarities and the differences in core promoter architecture, irrespective of species-specific nucleotide bias. This study will be useful for future genome annotation projects and can inspire research efforts aimed at better understanding the regulatory mechanisms of transcription.
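
As an illustration of what a positional distribution of a CPE motif relative to the TSS means, the following is a minimal sketch and not the method used in the study. It assumes a deliberately simplified TATA-box consensus (TATAWAW, where W is A or T), a hypothetical set of equal-length promoter sequences, and a known 0-based index of the TSS base in each sequence. The function names `motif_positions` and `position_histogram` are made up for this example.

```python
import re
from collections import Counter

# Assumption: a simplified TATA-box consensus (TATAWAW, W = A or T).
# The lookahead lets overlapping occurrences be reported as well.
TATA_PATTERN = re.compile(r"(?=(TATA[AT]A[AT]))")


def motif_positions(promoter, tss_index, pattern=TATA_PATTERN):
    """Return motif start positions relative to the TSS.

    The TSS base is numbered +1 and the base before it -1,
    so there is no position 0.
    """
    positions = []
    for match in pattern.finditer(promoter.upper()):
        offset = match.start() - tss_index
        positions.append(offset + 1 if offset >= 0 else offset)
    return positions


def position_histogram(promoters, tss_index):
    """Count motif start positions across many promoters of equal layout."""
    counts = Counter()
    for seq in promoters:
        counts.update(motif_positions(seq, tss_index))
    return counts


# Toy sequences (invented for illustration, not real promoter data).
upstream_tata = "G" * 20 + "TATAAAT" + "C" * 23 + "A" + "G" * 10
tss_tata = "C" * 50 + "TATATAT" + "G" * 5
histogram = position_histogram([upstream_tata, upstream_tata, tss_tata], tss_index=50)
```

A real analysis would more likely score sites with position weight matrices and compare against background sequence, but the same idea applies: collect motif positions relative to the TSS across all genes, then examine where the counts peak.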