Location: Plant Genetics Research
Title: Leveraging data from the genomes-to-fields initiative to investigate genotype-by-environment interactions in maize in North America
Author:
LOPEZ-CRUZ, MARCO - Michigan State University
AGUATE, FERNANDO - Michigan State University
Washburn, Jacob
DE LEON, NATALIA - University of Wisconsin
KAEPPLER, SHAWN - University of Wisconsin
LIMA, DAYANE - University of Wisconsin
TAN, RUIJUAN - Michigan State University
THOMPSON, ADDIE - Michigan State University
DE LA BRETONNE, LAWRENCE - University of Wisconsin
DE LOS CAMPOS, GUSTAVO - Michigan State University
Submitted to: Nature Communications
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 10/18/2023
Publication Date: 10/30/2023
Citation: Lopez-Cruz, M., Aguate, F.M., Washburn, J.D., de Leon, N., Kaeppler, S.M., Lima, D., Tan, R., Thompson, A., De La Bretonne, L.W., de los Campos, G. 2023. Leveraging data from the genomes-to-fields initiative to investigate genotype-by-environment interactions in maize in North America. Nature Communications. 14. Article 6904. https://doi.org/10.1038/s41467-023-42687-4.
DOI: https://doi.org/10.1038/s41467-023-42687-4
Interpretive Summary: Agricultural crop yields are significantly influenced by the genetics of the crops being grown and the environment in which they are grown (weather, soil, fertilizer, etc.). Understanding and predicting how these factors interact to determine crop yields is critical to increasing agricultural efficiency and farmer profits, but it requires the collection of large datasets that encompass many environments and large amounts of genetic variation. The Genomes to Fields (G2F) initiative has spent the past decade gathering such data, but organizing that data into useful formats and processing it through different types of models has lagged behind, leaving the datasets inaccessible to many researchers. This study filtered through the entire G2F database to create the most comprehensive, high-quality, and broadly accessible version of the dataset possible, then developed and tested a number of modeling approaches to validate the data. This publication and the associated repositories and workflows are designed to serve as a simple starting point for broad use of the datasets by researchers across many fields.
Technical Abstract: Genotype-by-environment (G×E) interactions can significantly affect crop performance and stability. Investigating G×E requires extensive data sets with diverse cultivars tested over multiple locations and years.
The Genomes-to-Fields (G2F) Initiative has tested maize hybrids in more than 130 year-locations in North America since 2014. Here, we curate and expand this data set by generating environmental covariates (using a crop model) for each of the trials. The resulting data set includes DNA genotypes and environmental data linked to more than 70,000 phenotypic records of grain yield and flowering traits for more than 4000 hybrids. We show how this valuable data set can serve as a benchmark in agricultural modeling and prediction, paving the way for countless G×E investigations in maize. We use multivariate analyses to characterize the data set’s genetic and environmental structure, study the association of key environmental factors with traits, and provide benchmarks using genomic prediction models.