Publication #361660

Research Project: Sustainable Intensification of Cropping Systems on Spatially Variable Landscapes and Soils

Location: Cropping Systems and Water Quality Research

Title: Statistical and machine learning methods evaluated for incorporating soil and weather into corn nitrogen recommendations

Ransom, C - University of Missouri
Kitchen, Newell
Camberato, J - Purdue University
Carter, P - Corteva Agriscience
Ferguson, R - University of Nebraska
Fernandez, F - University of Minnesota
Franzen, D - North Dakota State University
Laboski, A - University of Wisconsin
Myers, D - Corteva Agriscience
Nafziger, E - University of Illinois
Sawyer, J - Iowa State University
Shanahan, J - Fortigen

Submitted to: Computers and Electronics in Agriculture
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 6/23/2019
Publication Date: 8/1/2019
Citation: Ransom, C.J., Kitchen, N.R., Camberato, J.J., Carter, P.R., Ferguson, R.B., Fernandez, F.G., Franzen, D.W., Laboski, A.M., Myers, D.B., Nafziger, E.D., Sawyer, J.E., Shanahan, J.F. 2019. Statistical and machine learning methods evaluated for incorporating soil and weather into corn nitrogen recommendations. Computers and Electronics in Agriculture. 164:104872.

Interpretive Summary: Nitrogen (N) management could be improved if the decision tools used to determine the correct N fertilizer rate included soil and weather information. However, it is not well known which soil and weather information to include or how best to incorporate it. This research compared eight analytical methods for how well they integrated soil and weather information into three different corn N recommendation tools. The investigation used studies across 49 fields located within eight U.S. Midwest states. The traditionally used stepwise regression performed well at improving all three tools, but it required excessive and redundant use (= 39) of weather and soil variables, which makes the models vulnerable to “over-fitting”. After redundant variables were removed, stepwise regression models performed poorly. Other methods that allow the computer to automate analytical model building (often referred to as “machine learning methods”) did not perform much better, but several used far fewer variables. A model that uses only a few variables is easier to explain. The random forest model best improved all three N recommendation tools for predicting N rate, but it too used many variables, making it difficult to interpret. On the other hand, a decision tree method performed quite well in recommending N rate and used only a few variables, which were easy to interpret. These results show that some machine learning tools could help improve current N recommendation tools when used to incorporate soil and weather information. These findings could be used to help farmers improve N fertilizer management decisions, which would decrease N over-applications.

Technical Abstract: Nitrogen (N) fertilizer recommendation tools could be improved for estimating corn (Zea mays L.) N needs by incorporating site-specific soil and weather information. However, an evaluation of analytical methods is needed to determine how successfully this information can be incorporated. The objective of this research was to evaluate statistical and machine learning (ML) algorithms for using soil and weather information to improve corn N recommendation tools. Eight algorithms [stepwise, ridge regression, least absolute shrinkage and selection operator (Lasso), elastic net regression, principal component regression (PCR), partial least squares regression (PLS), decision tree, and random forest] were evaluated using a dataset containing measured soil and weather variables from a regional database. Performance was evaluated based on how well these algorithms predicted corn economically optimal N rates (EONR) from 49 sites in the U.S. Midwest. Multiple algorithm modeling scenarios were examined, with and without adjustment for multicollinearity and with and without two-way interaction terms, to identify the soil and weather variables that could improve three dissimilar N recommendation tools. Results showed the out-of-sample error was significantly greater for stepwise regression than for all other models. The best method for adjusting N recommendation tools was the random forest approach (r^2 increased between 0.72 and 0.84 and the root-mean-square error (RMSE) decreased between 41 and 94 kg N/ha); however, this method was difficult to interpret agronomically and produced a cumbersome model. In contrast, the method judged best based on both statistical metrics and agronomic interpretation was the decision tree method. This method was simple, needing only one or two variables (regardless of modeling scenario), and provided moderate improvement: r^2 values increased between 0.15 and 0.51 and RMSE decreased between 16 and 66 kg N/ha.
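
To make the idea of adjusting a recommendation tool with a small decision tree concrete, the following is a minimal sketch and not the authors' implementation. It assumes that the adjustment is a depth-one regression tree fitted to the difference between EONR and the tool's recommendation, and that r^2 is the squared correlation between observed and predicted values. All numbers, the variable `rain_mm`, and the function names are hypothetical. The metrics here are computed in-sample on toy data, whereas the study reports out-of-sample error.

```python
import math

def rmse(observed, predicted):
    """Root-mean-square error, in the units of the inputs (kg N/ha here)."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def r_squared(observed, predicted):
    """Squared Pearson correlation (an assumed definition of r^2)."""
    n = len(observed)
    mo, mp = sum(observed) / n, sum(predicted) / n
    cov = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted))
    vo = sum((o - mo) ** 2 for o in observed)
    vp = sum((p - mp) ** 2 for p in predicted)
    return 0.0 if vo == 0 or vp == 0 else cov * cov / (vo * vp)

def fit_stump(x, target):
    """Depth-one regression tree: pick the split on x with the lowest squared error."""
    best = None
    values = sorted(set(x))
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        left = [t for xi, t in zip(x, target) if xi <= threshold]
        right = [t for xi, t in zip(x, target) if xi > threshold]
        left_mean, right_mean = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((t - left_mean) ** 2 for t in left)
               + sum((t - right_mean) ** 2 for t in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1], best[2], best[3]

def apply_stump(stump, x):
    """Return the leaf mean for each value of x."""
    threshold, left_mean, right_mean = stump
    return [left_mean if xi <= threshold else right_mean for xi in x]

# Hypothetical toy data for eight sites (not from the study).
rain_mm = [50, 60, 70, 80, 120, 130, 140, 150]       # assumed weather variable
tool_rec = [150, 160, 155, 165, 150, 160, 155, 165]  # tool recommendation, kg N/ha
eonr = [140, 155, 150, 160, 190, 205, 195, 210]      # observed EONR, kg N/ha

# Fit the stump to the tool's error, then add the leaf means back to the tool.
residual = [e - t for e, t in zip(eonr, tool_rec)]
stump = fit_stump(rain_mm, residual)
adjusted = [t + a for t, a in zip(tool_rec, apply_stump(stump, rain_mm))]

print("split threshold:", stump[0])
print("RMSE before/after:", rmse(eonr, tool_rec), rmse(eonr, adjusted))
print("r^2 before/after:", r_squared(eonr, tool_rec), r_squared(eonr, adjusted))
```

A single split yields a rule that can be read directly (one threshold on one variable, one adjustment per side), which is the kind of interpretability the abstract attributes to the decision tree method. A realistic evaluation would hold out sites when fitting so that the reported error is out-of-sample.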