
Research Project: Improving Crop Efficiency Using Genomic Diversity and Computational Modeling

Location: Plant, Soil and Nutrition Research

Title: Constrained non-coding sequence provides insights into regulatory elements and loss of gene expression in maize

Author
SONG, BAOXING - Cornell University
WANG, HAI - China Agricultural University
WU, YAOYAO - Cornell University
REES, EVAN - Cornell University
GATES, DANIEL - University Of California, Davis
BURCH, MERRITT - Cornell University
Bradbury, Peter
ROSS-IBARRA, JEFF - University Of California, Davis
KELLOGG, ELIZABETH - Danforth Plant Science Center
HUFFORD, MATTHEW - Iowa State University
ROMAY, CINTA - Cornell University
Buckler, Edward - Ed

Submitted to: bioRxiv
Publication Type: Other
Publication Acceptance Date: 7/13/2020
Publication Date: 7/13/2020
Citation: Song, B., Wang, H., Wu, Y., Rees, E., Gates, D.J., Burch, M., Bradbury, P., Ross-Ibarra, J., Kellogg, E.A., Hufford, M.B., Romay, C., Buckler IV, E.S. 2020. Constrained non-coding sequence provides insights into regulatory elements and loss of gene expression in maize. bioRxiv. https://doi.org/10.1101/2020.07.11.192575.
DOI: https://doi.org/10.1101/2020.07.11.192575

Interpretive Summary: In biological cells, the genome sequence regulates protein content, which determines how each individual organism looks. Which regions of the genome encode specific proteins is well established. Less well known are the genome sequences that are functionally important in determining which proteins are produced, as well as where and in what quantity. Sequences that are conserved across different species are presumed to be functionally important. In this study, we developed new software and sequenced the genomes of four species related to corn and sorghum. We identified 106.52 million base pairs of genome sequence that are present in multiple species and functionally important in maize. Our results provide a quantitative understanding of the functionally important genomic elements in maize, which allows us to target regions for improving crop performance using genetic technology.

Technical Abstract: Because DNA sequencing technology has advanced so quickly, identifying key functional regions using evolutionary approaches is required to understand how the sequenced genomes work. This research develops a sensitive sequence alignment approach to identify functionally constrained non-coding sequences in the Andropogoneae tribe. The grass tribe Andropogoneae contains several crop species descended from a common ancestor ~18 million years ago. Despite broadly similar phenotypes, these species have tremendous genomic diversity, with a broad range of ploidy levels and transposons. These features make Andropogoneae a powerful system for studying conserved non-coding sequence (CNS), and here we used it to understand the function of CNS in maize. We find that 86% of CNS comprise known genomic elements (e.g., cis-regulatory elements, chromosome interactions, introns, and several transposable element superfamilies) and are linked to genomic regions related to DNA replication initiation, DNA methylation, and histone modification. In maize, we show that CNS regulate gene expression, that variants in CNS are associated with phenotypic variance, and that rare CNS absence contributes to loss of gene expression. Furthermore, we find that the evolution of CNS is associated with the functional diversification of duplicated genes in the context of the maize subgenomes. Our results provide a quantitative understanding of constrained non-coding elements and identify functional non-coding variation in maize.