Publication #408241

Research Project: Genetic Resource and Information Management for Pulse, Temperate Forage Legume, Oilseed, Vegetable, Grasses, Sugar, Ornamental, and Other Crops

Location: Plant Germplasm Introduction and Testing Research

Title: Linking phenotypes to protein characteristics in 3D structures predicted by AlphaFold

Authors:
PARAJULI, ATIT - Washington State University
BRUEGGEMAN, ROBERT - Washington State University
WAGNER, STEVEN - Corteva Agriscience
Warburton, Marilyn
Peel, Michael
Yu, Long-Xi
See, Deven
ZHANG, ZHIWU - Washington State University

Submitted to: Preprints
Publication Type: Other
Publication Acceptance Date: 5/16/2023
Publication Date: 5/16/2023
Citation: Parajuli, A., Brueggeman, R., Wagner, S., Warburton, M.L., Peel, M., Yu, L., See, D.R., Zhang, Z. 2023. Linking phenotypes to protein characteristics in 3D structures predicted by AlphaFold. Preprints.

Interpretive Summary: Plant breeders develop new cultivars with traits useful to farmers and consumers. These traits are measured as phenotypes, which generally requires expensive and time-consuming analyses of plants in labs or fields. Examining the genes responsible for these traits would let breeders choose the best plants based on DNA sequence rather than phenotype, which would be faster, cheaper, and less error-prone. Determining which genes are responsible for each trait is itself an expensive and time-consuming process, and for some traits it has not been possible. This study presents a new method that uses protein structure to identify and confirm genes responsible for traits of interest, giving geneticists and breeders a new tool to speed the creation of improved crop varieties.

Technical Abstract: Plant breeding aims to develop elite crop varieties suited to various environments, with higher quality and quantity of production. Breeders mostly depend on quantitative trait loci (QTL) mapping and association studies to locate regions in the genome responsible for variation in the quantitative traits of interest. However, mapped regions do not always translate to functional proteins, which makes it challenging to identify the genes associated with those traits. Alternatively, if proteins can be linked directly to phenotypes, the effect of mutations on phenotypic changes can be assessed, because the biological functions of proteins depend strongly on their 3D structure. Advances in deep learning models for biology open new avenues of exploration. AlphaFold, an AI system that predicts the 3D structure of a protein from its amino acid sequence with near-experimental accuracy, was used in this study. The phenotypic effect of point mutations that substantially alter a protein's 3D structure can be captured through association studies, which provides insight into the regions that are functionally significant. In the current study, 534 plants were selected based on plant vigor, and 168 missense variants that change amino acid sequences were characterized in these plants. The changes in protein 3D structure were assessed and associated with the phenotype.
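
The abstract does not state how structural change was quantified or which association test was used. As a minimal illustrative sketch only, the Python below assumes one possible approach: score each variant by the root-mean-square deviation (RMSD) between already-superimposed reference and variant coordinates, then correlate the per-plant score with a vigor measurement. The function names, the choice of RMSD and Pearson correlation, and all coordinates and vigor values are hypothetical and are not taken from the study.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equal-length lists of (x, y, z) points.

    Assumption: the two structures are already superimposed, so no
    alignment step is performed here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical toy coordinates (e.g. three C-alpha atoms), not real predictions.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
variant = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 1.0, 0.0)]

# Structural change score for the variant relative to the reference.
variant_score = rmsd(reference, variant)

# Hypothetical plants: two carry the reference allele (score 0.0) and two
# carry the variant; the vigor values are invented for illustration.
structure_scores = [0.0, 0.0, variant_score, variant_score]
vigor = [5.0, 6.0, 2.0, 3.0]

association = pearson_r(structure_scores, vigor)
print(f"structural change (RMSD): {variant_score:.3f}")
print(f"correlation with vigor: {association:.3f}")
```

A real analysis would need a superposition step before computing RMSD and a statistical model appropriate to the population, both of which this sketch leaves out.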