Research Project: Improving Crop Efficiency Using Genomic Diversity and Computational Modeling

Location: Plant, Soil and Nutrition Research

Title: Haplotype associated RNA expression (HARE) improves prediction of complex traits in maize

Author
GIRI, ANJU - Cornell University
KHAIPHO-BURCH, MERRITT - Cornell University
Buckler, Edward - Ed
RAMSTEIN, GUILLAUME - Aarhus University

Submitted to: bioRxiv
Publication Type: Pre-print Publication
Publication Acceptance Date: 5/4/2021
Publication Date: 5/4/2021
Citation: Giri, A., Khaipho-Burch, M., Buckler IV, E.S., Ramstein, G.P. 2021. Haplotype associated RNA expression (HARE) improves prediction of complex traits in maize. bioRxiv. https://doi.org/10.1101/2021.04.30.442099.
DOI: https://doi.org/10.1101/2021.04.30.442099

Interpretive Summary: Genome-wide prediction has mostly been carried out within populations using single-site polymorphisms. Such markers cannot capture the complex effects that arise from combinations of alleles; as a result, prediction accuracy across populations has usually been low. In this study, we explored the prediction of field traits within and across populations using estimated RNA expression attributable only to the DNA sequence around a gene. We showed that RNA expression estimated with our method was more transferable than overall measured RNA expression. Using estimated gene expression improved the prediction of field traits by up to 15% compared with observed expression or gene sequence alone. We present a novel and cost-effective method of imputing RNA expression attributable to the DNA sequence around a gene. Imputing expression was not only cheaper than measuring it, but also yielded genetic information that was stable and transferable across tissues, and it improved the prediction of many complex traits. The results can be used to strengthen crop improvement efforts through applications in genomic prediction and biological inference.

Technical Abstract: Genomic prediction typically relies on associations between single-site polymorphisms and traits of interest. This representation of genomic variability has been successful for prediction within populations. However, it usually cannot capture the complex effects due to combinations of alleles in haplotypes, so accuracy across populations has usually been low. Here we present a novel and cost-effective method for imputing cis haplotype associated RNA expression (HARE, RNA expression of genes by haplotype), study its transferability across tissues, and evaluate genomic prediction models within and across populations. HARE focuses on tightly linked cis-acting causal variants in the immediate vicinity of the gene, while excluding trans effects from diffusion and metabolism, so it is expected to be more transferable across tissues and populations. We showed that HARE estimates captured one-third of the variation in gene expression and were more transferable across diverse tissues than measured transcript expression. HARE estimates were used in genomic prediction models evaluated within and across two diverse maize panels, a diverse association panel (Goodman Association panel) and a large half-sib panel (Nested Association Mapping panel), for predicting 26 complex traits. HARE resulted in up to 15% higher prediction accuracy than control approaches that preserved haplotype structure, suggesting that HARE carried functional information in addition to information about haplotype structure. The largest increase was observed when the model was trained in the Nested Association Mapping panel and tested in the Goodman Association panel. Additionally, HARE yielded higher within-population prediction accuracy than measured expression values. The accuracy achieved with measured expression varied across tissues, whereas accuracy using HARE was more stable across tissues. Therefore, imputing RNA expression of genes by haplotype is stable, cost-effective, and transferable across populations.
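
The across-population evaluation described in the abstract (train in one panel, test in another) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not specify the prediction model, so ridge regression is used as a stand-in; the data are simulated random numbers, not maize data; and the names `ridge_fit`, `prediction_accuracy`, and `simulate_panel` are hypothetical. Accuracy is taken to be the Pearson correlation between predicted and observed trait values, a common convention that the abstract itself does not define.

```python
import numpy as np


def ridge_fit(X, y, lam=1.0):
    """Closed-form ridge regression; returns (intercept, coefficients)."""
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean  # center predictors so the intercept is not penalized
    yc = y - y_mean
    # Solve (Xc'Xc + lam * I) beta = Xc'yc
    beta = np.linalg.solve(Xc.T @ Xc + lam * np.eye(X.shape[1]), Xc.T @ yc)
    intercept = y_mean - x_mean @ beta
    return intercept, beta


def prediction_accuracy(y_true, y_pred):
    """Pearson correlation between observed and predicted trait values."""
    return float(np.corrcoef(y_true, y_pred)[0, 1])


def simulate_panel(rng, n_lines, effects, noise_sd=1.0):
    """Simulated panel: rows are lines, columns are per-gene expression estimates."""
    X = rng.normal(size=(n_lines, effects.shape[0]))
    y = X @ effects + rng.normal(scale=noise_sd, size=n_lines)
    return X, y


rng = np.random.default_rng(0)
n_genes = 50                           # hypothetical number of genes
effects = rng.normal(size=n_genes)     # shared simulated gene effects

# Hypothetical "training" and "testing" panels (sizes chosen arbitrarily)
X_train, y_train = simulate_panel(rng, 400, effects)
X_test, y_test = simulate_panel(rng, 150, effects)

intercept, beta = ridge_fit(X_train, y_train, lam=10.0)
y_pred = intercept + X_test @ beta
accuracy = prediction_accuracy(y_test, y_pred)
print(f"across-panel prediction accuracy (simulated): {accuracy:.2f}")
```

Because both simulated panels share the same gene effects, the accuracy printed here is optimistic; it only shows the mechanics of fitting in one panel and scoring in another, and says nothing about the accuracies reported for HARE.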