Location: Plant Genetics Research
Title: Global genotype by environment prediction competition reveals that diverse modeling strategies can deliver satisfactory maize yield estimates
Authors:
Washburn, Jacob
VARELA, JOSE - University Of Wisconsin
XAVIER, ALENCAR - Corteva Agriscience
CHEN, QIUYUE - North Carolina State University
ERTL, DAVID - Iowa Corn Promotion Board
GAGE, JOSEPH - North Carolina State University
Holland, James
LIMA, DAYANE - University Of Wisconsin
ROMAY, MARIA - Cornell University
LOPEZ-CRUZ, MARCO - Michigan State University
Kick, Daniel
Submitted to: Genetics
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 11/13/2024
Publication Date: 11/22/2024
Citation: Washburn, J.D., Varela, J.I., Xavier, A., Chen, Q., Ertl, D., Gage, J.L., Holland, J.B., Lima, D.C., Romay, M.C., Lopez-Cruz, M., Kick, D.R. 2024. Global genotype by environment prediction competition reveals that diverse modeling strategies can deliver satisfactory maize yield estimates. Genetics. 229(2).
DOI: https://doi.org/10.1093/genetics/iyae195
Interpretive Summary: Predicting crop yield ahead of time is critical to decision making for farmers, plant breeders, and researchers. A corn yield prediction competition was organized using publicly available data generated by the Genomes to Fields (G2F) initiative. Contestants from across the globe formed teams and submitted predictive models using diverse approaches and strategies. The winning team used a combination of machine learning and quantitative genetic statistics to beat out the competition, but many other viable methods were submitted. The methods and results represent a wealth of potential avenues for future research and improvement of crop yield modeling.
Technical Abstract: Predicting phenotypes from a combination of genetic and environmental factors is a grand challenge of modern biology. Slight improvements in this area have the potential to save lives, improve food and fuel security, permit better care of the planet, and create other positive outcomes. In 2022 and 2023, the first open-to-the-public Genomes to Fields initiative Genotype by Environment prediction competition was held using a large dataset including genomic variation, phenotype and weather measurements, and field management notes gathered by the project over 9 years. The competition attracted registrants from around the world with representation from academic, government, industry, and nonprofit institutions as well as unaffiliated individuals.
These participants came from diverse disciplines, including plant science, animal science, breeding, statistics, computational biology, and others. Some participants had no formal genetics or plant-related training, and some were just beginning their graduate education. The teams applied varied methods and strategies, providing a wealth of modeling knowledge based on a common dataset. The winner's strategy involved 2 models combining machine learning and traditional breeding tools: 1 model emphasized environment using features extracted by random forest, ridge regression, and least squares, and 1 focused on genetics. Other high-performing teams’ methods included quantitative genetics, machine learning/deep learning, mechanistic models, and model ensembles. The dataset factors used, such as genetics, weather, and management data, were also diverse, demonstrating that no single model or strategy is far superior to all others within the context of this competition.