Research Project: MaizeGDB - Database and Computational Resources for Maize Genetics, Genomics, and Breeding Research

Location: Corn Insects and Crop Genetics Research

Title: PanEffect: A pan-genome visualization tool for variant effects in maize

Authors:
Andorf, Carson
Haley, Olivia - ORISE Fellow
Hayford, Rita - ORISE Fellow
Portwood, John
Harding, Stephen
Sen, Shatabdi - Iowa State University
Cannon, Ethalinda
Gardiner, Jack - University of Missouri
Kim, Hye-Seon
Woodhouse, Margaret

Submitted to: Bioinformatics
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 2/6/2024
Publication Date: 2/9/2024
Citation: Andorf, C.M., Haley, O., Hayford, R.K., Portwood II, J.L., Harding, S.F., Sen, S., Cannon, E.K., Gardiner, J.M., Kim, H., Woodhouse, M.H. 2024. PanEffect: A pan-genome visualization tool for variant effects in maize. Bioinformatics. 40(2). Article btae073.

Interpretive Summary: Understanding how changes in a plant's genes affect its characteristics is important for improving traits like yield or resilience under stress conditions. Recently, scientists have been using artificial intelligence to look at all the possible changes in genes that make proteins and to score each change as either benign or likely to make a noticeable difference in the plant. However, there is still a need to study these changes across many different versions of a plant's entire set of genes. A new tool called PanEffect was developed to address this challenge. This tool is available at the Maize Genetics and Genomics Database. The genes in maize produce approximately 40,000 different proteins. PanEffect helps users see and understand more than 550 million possible changes in the amino acids of those proteins. PanEffect also lets a user compare the effects of 2.3 million naturally occurring amino acid changes that exist across 50 different types of maize. The strength of PanEffect lies in its potential to identify genetic targets that can enhance crop breeding strategies. Thus, in an era where food security and sustainable agriculture are paramount, tools like PanEffect can assist researchers and breeders seeking to harness genetic potential in crops like maize.

Technical Abstract: Understanding the effects of genetic variants is crucial for accurately predicting traits and phenotypic outcomes. Recent advances have utilized protein language models to score all possible missense variant effects at the proteome level for a single genome, but a reliable tool is needed to explore these effects at the pan-genome level. To address this gap, we introduce a new tool called PanEffect. We implemented PanEffect at MaizeGDB to enable a comprehensive examination of the potential effects of coding variants across 51 maize genomes. The tool allows users to visualize over 550 million possible amino acid substitutions in the B73 maize reference genome and also to observe the effects of the 2.3 million natural variations in the maize pan-genome. Each variant effect score, calculated from the Evolutionary Scale Modeling (ESM) protein language model, shows the log-likelihood ratio difference between B73 and all variants in the pan-genome. These scores are shown using heatmaps spanning benign outcomes to strong phenotypic consequences. Additionally, PanEffect displays secondary structures and functional domains along with the variant effects, offering additional functional and structural context. Using PanEffect, researchers now have a platform to explore protein variants and identify genetic targets for crop enhancement.
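
As a rough illustration of the log-likelihood ratio scoring described above, the sketch below scores every possible amino acid substitution in a toy sequence. It assumes the common formulation in which a substitution's score is the model log-probability of the alternate amino acid minus that of the reference amino acid at the same position. The function names (`llr_scores`, `toy_log_probs_for`) and the toy probabilities are hypothetical and are not taken from PanEffect or from the ESM library; in practice the per-position log-probabilities would come from a protein language model.

```python
import math

# The 20 standard amino acids (one-letter codes).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def llr_scores(sequence, log_probs):
    """Score every single amino acid substitution in `sequence`.

    `log_probs[i][aa]` is assumed to hold a model's log-probability of
    amino acid `aa` at position i (hypothetical input, not real ESM output).
    Returns a dict mapping (1-based position, ref, alt) to
    log p(alt) - log p(ref); more negative means less favored by the model.
    """
    scores = {}
    for i, ref in enumerate(sequence):
        for alt in AMINO_ACIDS:
            if alt == ref:
                continue  # skip the reference residue itself
            scores[(i + 1, ref, alt)] = log_probs[i][alt] - log_probs[i][ref]
    return scores


def toy_log_probs_for(favored, p_favored=0.62):
    """Build a made-up distribution that favors one amino acid (illustration only)."""
    rest = (1.0 - p_favored) / (len(AMINO_ACIDS) - 1)
    return {aa: math.log(p_favored if aa == favored else rest) for aa in AMINO_ACIDS}


# Toy two-residue "protein" with made-up probabilities favoring the reference.
toy_sequence = "MK"
toy_log_probs = [toy_log_probs_for(aa) for aa in toy_sequence]
scores = llr_scores(toy_sequence, toy_log_probs)
```

Each position yields 19 scores, one per alternate amino acid, which is the kind of position-by-substitution grid a heatmap can display. Restricting the output to substitutions actually observed in other genomes would give the natural-variation view.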