|RANSOM, C - University Of Missouri|
|CAMBERATO, J - Purdue University|
|CARTER, P - Corteva Agriscience|
|FERGUSON, R - University Of Nebraska|
|FERNANDEZ, F - University Of Minnesota|
|FRANZEN, D - North Dakota State University|
|LABOSKI, A - University Of Wisconsin|
|MYERS, D - Corteva Agriscience|
|NAFZIGER, E - University Of Illinois|
|SAWYER, J - Iowa State University|
|SHANAHAN, J - Fortigen|
Submitted to: Computers and Electronics in Agriculture
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 6/23/2019
Publication Date: 8/1/2019
Citation: Ransom, C.J., Kitchen, N.R., Camberato, J.J., Carter, P.R., Ferguson, R.B., Fernandez, F.G., Franzen, D.W., Laboski, A.M., Myers, D.B., Nafziger, E.D., Sawyer, J.E., Shanahan, J.F. 2019. Statistical and machine learning methods evaluated for incorporating soil and weather into corn nitrogen recommendations. Computers and Electronics in Agriculture. 164:104872. https://doi.org/10.1016/j.compag.2019.104872.
Interpretive Summary: Nitrogen (N) management could be improved if decision tools used to determine the correct N fertilizer rate included soil and weather information. However, it is not well known which soil and weather information to include or how best to incorporate it. This research compared eight analytical methods for how well they integrated soil and weather information into three different corn N recommendation tools. The investigation utilized studies across 49 fields located within eight U.S. Midwest states. The traditionally used stepwise regression performed well at improving all three tools, but it required excessive and redundant use (= 39) of weather and soil variables, which makes the resulting models vulnerable to “over-fitting”. After redundant variables were removed, stepwise regression models performed poorly. Other methods that allow the computer to automate analytical model building (often referred to as “machine learning methods”) did not perform much better, but several used far fewer variables. A model that uses only a few variables is easier to explain. The random forest model best improved all three N recommendation tools for predicting N rate, but it too used many variables, making it difficult to interpret. In contrast, a decision tree method performed quite well in recommending N rate and used only a few variables, which were easy to interpret. These results show that some machine learning tools could help improve current N recommendation tools when used to incorporate soil and weather information. These findings could help farmers improve N fertilizer management decisions and thereby decrease N over-applications.
Technical Abstract: Nitrogen (N) fertilizer recommendation tools could be improved for estimating corn (Zea mays L.) N needs by incorporating site-specific soil and weather information. However, an evaluation of analytical methods is needed to determine how successfully this information can be incorporated. The objective of this research was to evaluate statistical and machine learning (ML) algorithms for utilizing soil and weather information to improve corn N recommendation tools. Eight algorithms [stepwise regression, ridge regression, least absolute shrinkage and selection operator (Lasso), elastic net regression, principal component regression (PCR), partial least squares regression (PLS), decision tree, and random forest] were evaluated using a dataset containing measured soil and weather variables from a regional database. Performance was evaluated based on how well these algorithms predicted corn economically optimal N rates (EONR) from 49 sites in the U.S. Midwest. Multiple algorithm modeling scenarios were examined, with and without adjustment for multicollinearity and inclusion of two-way interaction terms, to identify the soil and weather variables that could improve three dissimilar N recommendation tools. Results showed the out-of-sample error was significantly greater for stepwise regression than for all other models. The best method for adjusting N recommendation tools was the random forest approach (r^2 increased between 0.72 and 0.84 and the root-mean-square error (RMSE) decreased between 41 and 94 kg N/ha); however, this method was difficult to interpret agronomically and produced a cumbersome model. In contrast, the method judged best based on both statistical metrics and agronomic interpretation was the decision tree method. This method was simple, needing only one or two variables (regardless of modeling scenario), and provided moderate improvement, as r^2 values increased between 0.15 and 0.51 and RMSE decreased between 16 and 66 kg N/ha.
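To make the decision tree idea and the two reported metrics concrete, the following minimal sketch adjusts a tool's N recommendation with a one-split regression tree fitted to the difference between EONR and the recommendation, then compares r^2 and RMSE before and after. Everything in it is an assumption for illustration: the data are synthetic and not from the study, the single predictor `rainfall` and the names `tool_rec`, `fit_stump`, and `predict_stump` are hypothetical, and a one-split tree is a simplification rather than the authors' implementation.

```python
import math
import random


def rmse(observed, predicted):
    """Root-mean-square error between two equal-length sequences."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def fit_stump(x, y):
    """Fit a one-split regression tree: pick the threshold on x that
    minimizes the summed squared error of y around the two leaf means."""
    values = sorted(set(x))
    if len(values) < 2:
        raise ValueError("need at least two distinct x values to split")
    best = None
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2  # candidate split halfway between neighbors
        left = [yi for xi, yi in zip(x, y) if xi <= threshold]
        right = [yi for xi, yi in zip(x, y) if xi > threshold]
        mean_l = sum(left) / len(left)
        mean_r = sum(right) / len(right)
        sse = (sum((v - mean_l) ** 2 for v in left)
               + sum((v - mean_r) ** 2 for v in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, mean_l, mean_r)
    return best[1:]  # (threshold, left-leaf mean, right-leaf mean)


def predict_stump(stump, x):
    """Return the leaf mean for each value in x."""
    threshold, mean_l, mean_r = stump
    return [mean_l if xi <= threshold else mean_r for xi in x]


# Synthetic, hypothetical data for 49 sites (NOT the study's data).
rng = random.Random(0)
n_sites = 49
rainfall = [rng.uniform(150, 450) for _ in range(n_sites)]  # mm, assumed variable
tool_rec = [rng.uniform(120, 200) for _ in range(n_sites)]  # kg N/ha, assumed tool output
eonr = [rec + (40 if rain > 300 else -30) + rng.gauss(0, 15)
        for rec, rain in zip(tool_rec, rainfall)]

# Fit the one-split tree to the tool's error, then add the correction back.
residual = [e - r for e, r in zip(eonr, tool_rec)]
stump = fit_stump(rainfall, residual)
adjusted = [r + a for r, a in zip(tool_rec, predict_stump(stump, rainfall))]

print("split threshold (mm):", round(stump[0], 1))
print("tool alone: r^2 =", round(r_squared(eonr, tool_rec), 2),
      " RMSE =", round(rmse(eonr, tool_rec), 1))
print("adjusted:   r^2 =", round(r_squared(eonr, adjusted), 2),
      " RMSE =", round(rmse(eonr, adjusted), 1))
```

The comparison printed here is in-sample, so it is optimistic by construction. The study evaluated out-of-sample error, which would require fitting the split on one subset of sites and scoring it on sites held out from the fit.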