Location: Plant, Soil and Nutrition Research
Title: Conserved noncoding sequences provide insights into regulatory sequence and loss of gene expression in maize
|SONG, BAOXING - CORNELL UNIVERSITY - NEW YORK|
|Buckler, Edward - Ed|
|WANG, HAI - CHINA AGRICULTURE UNIVERSITY|
|WU, YAOYAO - CORNELL UNIVERSITY - NEW YORK|
|REES, EVAN - CORNELL UNIVERSITY - NEW YORK|
|KELLOGG, ELIZABETH - DANFORTH PLANT SCIENCE CENTER|
|GATES, DANIEL - UNIVERSITY OF CALIFORNIA, DAVIS|
|KHAIPHO-BURCH, MERRITT - CORNELL UNIVERSITY - NEW YORK|
|ROSS-IBARRA, JEFF - UNIVERSITY OF CALIFORNIA, DAVIS|
|HUFFORD, MATTHEW - IOWA STATE UNIVERSITY|
|ROMAY, M. CINTA - CORNELL UNIVERSITY - NEW YORK|
Submitted to: Genome Research
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 5/21/2021
Publication Date: 5/21/2021
Citation: Song, B., Buckler IV, E.S., Wang, H., Wu, Y., Rees, E., Kellogg, E.A., Gates, D.J., Khaipho-Burch, M., Bradbury, P., Ross-Ibarra, J., Hufford, M.B., Romay, M. 2021. Conserved noncoding sequences provide insights into regulatory sequence and loss of gene expression in maize. Genome Research. 31:1245-1257.
Interpretive Summary: In living cells, the genome sequence governs which proteins are made, and those proteins in turn shape how each organism looks. Which regions of the genome encode specific proteins is well established. Much less is known about the genome sequences that determine which proteins are produced, where, and in what quantity. Sequences that are conserved across species are presumed to be functionally important. In this study, we developed new software and sequenced the genomes of four species related to corn and sorghum. We identified 106.52 million base pairs of genome sequence that are present in multiple species and functionally important in maize. Our results provide a quantitative understanding of the functionally important genomic elements in maize, allowing us to target regions for improving crop performance using genetic technology.
Technical Abstract: Thousands of species will be sequenced in the next few years; however, understanding how their genomes work without an unlimited budget requires both molecular and novel evolutionary approaches. We developed a sensitive sequence alignment pipeline to identify conserved noncoding sequences (CNSs) in the Andropogoneae tribe (multiple crop species descended from a common ancestor ~18 million years ago). The Andropogoneae share similar physiology while being tremendously genomically diverse, harboring a broad range of ploidy levels, structural variation, and transposons. These contribute to the potential of Andropogoneae as a powerful system for studying CNSs and are factors we leverage to understand the function of maize CNSs. We found that 86% of CNSs were composed of annotated features, including introns, UTRs, putative cis-regulatory elements, chromatin loop anchors, noncoding RNA genes, and several transposable element superfamilies. CNSs were enriched in active regions of DNA replication in the early S phase of the mitotic cell cycle and showed different DNA methylation ratios compared to the genome-wide background. More than half of putative cis-regulatory sequences (identified via other methods) overlapped with CNSs detected in this study. Variants in CNSs were associated with gene expression levels, and CNS absence contributed to loss of gene expression. Furthermore, the evolution of CNSs was associated with the functional diversification of duplicated genes in the context of maize subgenomes. Our results provide a quantitative understanding of the molecular processes governing the evolution of CNSs in maize.