United States Department of Agriculture

Agricultural Research Service

Title: Utility of Non-Parametric 'k-Nearest Neighbor' Algorithms to Estimate Soil Hydraulic Properties

Authors
Nemes, Attila - UNIV. OF CA, RIVERSIDE
Rawls, Walter
Pachepsky, Yakov

Submitted to: Meeting Abstract
Publication Type: Proceedings
Publication Acceptance Date: August 1, 2005
Publication Date: October 13, 2005
Citation: Nemes, A., Rawls, W.J., Pachepsky, Y.A. 2005. Utility of non-parametric ‘k-nearest neighbor’ algorithms to estimate soil hydraulic properties. In: Proceedings of the International Scientific Conference-Innovation and Utility in the Visegrad Fours, October 13-15, 2005, Nyiregyhaza, Hungary. p. 103-108.

Technical Abstract: Non-parametric approaches are used in various fields to address classification problems as well as to estimate continuous variables. A k-Nearest Neighbor (k-NN) algorithm, one type of non-parametric lazy learning algorithm, has been developed to estimate soil water retention at –33 and –1500 kPa matric potentials. Different design settings, analogous to parameters in parametric models, have been optimized. Performance of the algorithm has subsequently been tested against estimations made by a neural network (NNet) model developed using the same data and input soil attributes. We used a hierarchical set of inputs based on soil texture, bulk density, and organic matter content to avoid possible bias towards one set of inputs, and varied the size of the data set used to make estimations. The k-NN technique shows little sensitivity to potentially sub-optimal settings for how many nearest soils are selected and how those are weighted when formulating the output of the algorithm, as long as extremes are avoided. The optimal settings do, however, depend on the size of the development/reference data set. The novel non-parametric k-NN technique performed comparably to, or equally well as, the NNet models in terms of root-mean-squared residuals, mean residuals, and Akaike’s Information Criterion, a measure of model efficiency. Gradual reduction of the development data set size from 1600 to 100 resulted in only a slight loss of estimation accuracy and reliability for both the k-NN and NNet approaches. This is encouraging for potential users of a k-NN technique who do not possess an abundance of data. The k-NN technique could be a competitive alternative to other techniques for estimating soil hydraulic properties. The literature lists advantages of such non-parametric approaches over parametric approaches. Our study shows that, to obtain those advantages, the user would not necessarily have to compromise estimation accuracy and reliability.
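
The two design settings named in the abstract, how many nearest soils are selected and how they are weighted, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function name `knn_estimate`, the range-based scaling of attributes, the Euclidean distance, the inverse-distance weighting with exponent `p`, and the four reference soils are all assumptions made up for illustration.

```python
import math


def knn_estimate(ref_inputs, ref_outputs, target, k=3, p=1.0, eps=1e-9):
    """Estimate an output for `target` from its k nearest reference samples.

    Hypothetical helper: the scaling, distance measure and weighting used
    here are illustrative assumptions, not the settings from the study.
    """
    if not 1 <= k <= len(ref_inputs):
        raise ValueError("k must be between 1 and the number of reference samples")

    n_attr = len(target)
    # Scale each attribute by its range in the reference set so that no
    # single attribute dominates the distance (assumed normalisation).
    lows = [min(row[j] for row in ref_inputs) for j in range(n_attr)]
    highs = [max(row[j] for row in ref_inputs) for j in range(n_attr)]
    spans = [(hi - lo) or 1.0 for lo, hi in zip(lows, highs)]

    def distance(row):
        return math.sqrt(
            sum(((row[j] - target[j]) / spans[j]) ** 2 for j in range(n_attr))
        )

    # Keep the k reference samples closest to the target.
    pairs = [(distance(row), y) for row, y in zip(ref_inputs, ref_outputs)]
    nearest = sorted(pairs, key=lambda pair: pair[0])[:k]

    # Inverse-distance weights; p = 0 gives a plain average of the k neighbours,
    # larger p gives more influence to the closest ones.
    weights = [(1.0 / (d + eps)) ** p for d, _ in nearest]
    return sum(w * y for w, (_, y) in zip(weights, nearest)) / sum(weights)


# Made-up reference soils: (sand %, clay %, bulk density in g/cm^3).
reference = [
    (80.0, 5.0, 1.60),
    (60.0, 15.0, 1.50),
    (40.0, 25.0, 1.40),
    (20.0, 45.0, 1.30),
]
# Made-up water contents at -33 kPa (cm^3/cm^3) for those soils.
theta_33 = [0.12, 0.20, 0.28, 0.38]

estimate = knn_estimate(reference, theta_33, (50.0, 20.0, 1.45), k=2, p=1.0)
print(f"Estimated water content at -33 kPa: {estimate:.3f}")
```

In this sketch, `k` and `p` play the role of the design settings that the abstract describes as analogous to parameters in parametric models. Because the method is a lazy learner, no model is fitted in advance: the reference data set is searched each time an estimate is requested.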

Last Modified: 4/21/2014