Location: Plant, Soil and Nutrition Research
Title: Constrained non-coding sequence provides insights into regulatory elements and loss of gene expression in maize
Author:
SONG, BAOXING - Cornell University
WANG, HAI - China Agricultural University
WU, YAOYAO - Cornell University
REES, EVAN - Cornell University
GATES, DANIEL - University Of California, Davis
BURCH, MERRITT - Cornell University
Bradbury, Peter
ROSS-IBARRA, JEFF - University Of California, Davis
KELLOGG, ELIZABETH - Danforth Plant Science Center
HUFFORD, MATTHEW - Iowa State University
ROMAY, CINTA - Cornell University
Buckler, Edward - Ed
Submitted to: bioRxiv
Publication Type: Other
Publication Acceptance Date: 7/13/2020
Publication Date: 7/13/2020
Citation: Song, B., Wang, H., Wu, Y., Rees, E., Gates, D.J., Burch, M., Bradbury, P., Ross-Ibarra, J., Kellogg, E.A., Hufford, M.B., Romay, C., Buckler IV, E.S. 2020. Constrained non-coding sequence provides insights into regulatory elements and loss of gene expression in maize. bioRxiv.
DOI: https://doi.org/10.1101/2020.07.11.192575
Interpretive Summary: In biological cells, the genome sequence regulates protein content, which determines how each individual looks. Research identifying which regions of the genome produce specific proteins is well established. Less is known, however, about the genome sequences that determine which proteins are produced, as well as where and in what quantity. Sequences that are conserved across different species are presumed to be functionally important. In this study, we developed new software and determined the genome sequences of four relatives of corn and sorghum. We identified 106.52 million base pairs of genome sequence that are present in multiple species and functionally important in maize. Our results provide a quantitative understanding of the functionally important genomic elements in maize, which allows us to target regions for improving crop performance using genetic technology.
Technical Abstract: DNA sequencing technology has advanced so quickly that identifying key functional regions using evolutionary approaches is now required to understand how the sequenced genomes work. This research develops a sensitive sequence alignment approach to identify functionally constrained non-coding sequences in the Andropogoneae tribe. The grass tribe Andropogoneae contains several crop species descended from a common ancestor ~18 million years ago.
Despite broadly similar phenotypes, they have tremendous genomic diversity with a broad range of ploidy levels and transposons. These features make Andropogoneae a powerful system for studying conserved non-coding sequence (CNS); here we used it to understand the function of CNS in maize. We find that 86% of CNS comprise known genomic elements, e.g., cis-regulatory elements, chromosome interactions, introns, and several transposable element superfamilies, and are linked to genomic regions related to DNA replication initiation, DNA methylation and histone modification. In maize, we show that CNSs regulate gene expression, that variants in CNS are associated with phenotypic variance, and that rare CNS absence contributes to loss of gene expression. Furthermore, we find that the evolution of CNS is associated with the functional diversification of duplicated genes in the context of the maize subgenomes. Our results provide a quantitative understanding of constrained non-coding elements and identify functional non-coding variation in maize.