Publication #389937

Research Project: Towards Resilient Agricultural Systems to Enhance Water Availability, Quality, and Other Ecosystem Services under Changing Climate and Land Use

Title: Using artificial neural network (ANN) for short-range prediction of cotton yield in data-scarce regions

YILDIRIM, TUGBA - Oklahoma State University
Moriasi, Daniel
CHAKRABORTY, DEBADITYA - University of Texas at San Antonio
MIRCHI, ALI - Oklahoma State University
Starks, Patrick
TAGHVAQEIAN, SALEH - Oklahoma State University

Submitted to: Agronomy Journal
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 3/24/2022
Publication Date: 3/29/2022
Citation: Yildirim, T., Moriasi, D.N., Chakraborty, D., Mirchi, A., Starks, P.J., Taghvaqeian, S. 2022. Using artificial neural network (ANN) for short-range prediction of cotton yield in data-scarce regions. Agronomy Journal. 12(4):828.

Interpretive Summary: Predicting crop yields months before harvest helps producers manage the economics of the farm enterprise and the use of farm inputs such as fertilizers and herbicides. Such predictions are difficult to achieve in regions that lack extensive observational records. In this study, we used readily available climate and remote sensing data as inputs to four machine learning (ML) models to predict the yield of irrigated cotton months ahead in the Menemen Plain, Turkey. The Light Gradient Boosting, Random Forest, and extreme Gradient Boosting (XGBoost) models predicted cotton yield much better than the Artificial Neural Network model. The selected XGBoost model predicted cotton yield up to 5 months before harvest to within less than 1 kg ha⁻¹ of the average measured value of 5220 kg ha⁻¹. The XGBoost model therefore has potential as a cotton production management tool that producers can use to forecast yield in data-scarce and other regions. This data-driven modeling framework is expected to lay a foundation for assessing the effects of climate variability on crop yield and management, which contributes to the goals of two USDA initiatives: the Conservation Effects Assessment Project and the Long-Term Agroecosystem Research network.

Technical Abstract: Short-range predictions of crop yield provide valuable insights for agricultural resource management and for anticipating the likely economic impacts of low yield. Such predictions are difficult to achieve where extensive observational records are lacking. Herein, we demonstrate how months-ahead predictions of rainfed and irrigated cotton yield can be provided in data-scarce regions using machine learning (ML) models trained with a small set of basic, readily available input data from the Menemen Plain, Turkey. We applied an innovative preprocessing technique to increase the number of data points available from the limited reported yield record (13 years), and used cumulative precipitation and cumulative heat units as basic inputs for four ML models: Artificial Neural Networks (ANN), Light Gradient Boosting (LGBoost), Random Forest (RF), and extreme Gradient Boosting (XGBoost). We also used two meteorologically based drought indices, the Standardized Precipitation Index (SPI) and the Standardized Precipitation Evapotranspiration Index (SPEI), and three remotely sensed vegetation indices, the Normalized Difference Vegetation Index (NDVI), the Enhanced Vegetation Index (EVI), and the Land Surface Water Index (LSWI), as indicators of agricultural water stress. The LGBoost, RF, and XGBoost models predicted cotton yield much better than the off-the-shelf ANN model. The selected XGBoost model accurately predicted cotton yield 4 to 5 months prior to harvest (R² = 0.99) based on prevailing climatic conditions prior to irrigation in July.
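
The basic inputs and vegetation indices named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' preprocessing code. The index formulas are the commonly used definitions of NDVI, EVI, and LSWI; the heat-unit base temperature of 15.6 °C, the function names, and the sample daily records are assumptions made only for illustration.

```python
from itertools import accumulate


def heat_units(t_max, t_min, t_base=15.6):
    """Daily heat units (degree days, deg C) from max/min air temperature.

    t_base = 15.6 deg C is an assumed base temperature for cotton,
    not a value taken from the abstract.
    """
    return max(0.0, (t_max + t_min) / 2.0 - t_base)


def cumulative(values):
    """Running total of a sequence, e.g. season-to-date precipitation."""
    return list(accumulate(values))


def ndvi(nir, red):
    """Normalized Difference Vegetation Index from surface reflectances."""
    return (nir - red) / (nir + red)


def evi(nir, red, blue):
    """Enhanced Vegetation Index with the commonly used coefficients."""
    return 2.5 * (nir - red) / (nir + 6.0 * red - 7.5 * blue + 1.0)


def lswi(nir, swir):
    """Land Surface Water Index from near- and shortwave-infrared bands."""
    return (nir - swir) / (nir + swir)


# Hypothetical daily records: (t_max deg C, t_min deg C, precipitation mm)
daily = [(30.0, 18.0, 0.0), (28.0, 16.0, 5.0), (20.0, 9.0, 12.5)]

cum_heat_units = cumulative(heat_units(tx, tn) for tx, tn, _ in daily)
cum_precip = cumulative(p for _, _, p in daily)
```

Cumulative series like these, together with the drought and vegetation indices, would form the feature columns passed to a regression model; the abstract does not describe how the authors assembled them.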