DHALIWAL, DALJEET - University Of Illinois
Submitted to: Precision Agriculture
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 7/21/2023
Publication Date: 7/29/2023
Citation: Dhaliwal, D., Williams II, M.M. 2023. Sweet corn yield prediction using machine learning models and field-level data. Precision Agriculture. https://doi.org/10.1007/s11119-023-10057-1.
Interpretive Summary: Gains in computing resources may improve the accuracy of crop yield prediction. This study used a rich spatio-temporal sweet corn dataset, with field-level yield observations, to evaluate the prediction capabilities of various machine learning models. Our research identified the model that provides the most accurate yield predictions, along with the environmental and management variables most influential on sweet corn yield. The impact of this work is new insight into commercial sweet corn production, particularly as it relates to climate change, adverse weather, and sweet corn yield.
Technical Abstract: The advent of modern technologies, acquisition of large amounts of crop management and weather data, and advances in computing are reshaping modern agriculture. Coupled with machine learning and predictive analytics, these advancements have unlocked the power of data by providing valuable insights and more accurate yield predictions. This study utilizes a historic US sweet corn dataset to (a) evaluate the performance of machine learning models on sweet corn yield prediction, and (b) identify the most influential variables for crop yield predictions. The sweet corn data comprised field-level data for over a quarter-century period (1992-2018) from two primary regions of commercial sweet corn production for processing, namely the Upper Midwest and the Pacific Northwest. Several machine learning models were trained to predict field-level sweet corn yield from 67 variables of crop genetics, management, weather, and soil factors. The random forest model outperformed all other trained models, with the lowest RMSE (3.29 Mt/ha) and the highest Pearson’s correlation coefficient (0.77) between predicted and observed yields. Variable importance plots revealed the top three most influential predictor variables as year (time), location (space), and seed source (genetics). Season-long total precipitation and average minimum temperature during anthesis were the two most important weather variables in yield prediction. This is the first report of a symbiotic association between fine-scale (time and space) crop data and advanced data analytics to leverage insights into commercial sweet corn production.
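The two evaluation metrics named in the abstract, RMSE and Pearson's correlation coefficient between predicted and observed yields, can be illustrated with a minimal sketch. The function names `rmse` and `pearson_r` and the four yield values are hypothetical, chosen only for illustration; they are not data or code from the study.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return covariance / math.sqrt(var_x * var_y)


# Hypothetical field-level yields (illustrative values, not from the study).
observed_yield = [15.0, 18.0, 20.0, 22.0]
predicted_yield = [16.0, 17.0, 21.0, 21.0]

model_rmse = rmse(observed_yield, predicted_yield)
model_r = pearson_r(observed_yield, predicted_yield)
print(f"RMSE = {model_rmse:.2f}, Pearson r = {model_r:.2f}")
```

A lower RMSE means predictions sit closer to the observed yields in the original yield units, while a Pearson coefficient closer to 1 means predictions track the ranking and spread of observed yields more faithfully. The abstract reports both because a model can do well on one and poorly on the other.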