Submitted to: Soil Science Society of America Journal
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: March 23, 2008
Publication Date: July 1, 2008
Citation: Lamorski, K., Pachepsky, Y.A., Slawinski, C., Walczak, R. 2008. Using support vector machines to develop pedotransfer functions for water retention of soils in Poland. Soil Science Society of America Journal. 72:1243-1247.
Interpretive Summary: For large-scale projects, the ability of soils to retain and transmit water is often estimated rather than measured. The estimation uses soil survey data as the input and soil water retention and hydraulic conductivity as the output. Currently, estimation procedures relate the inputs to the outputs using artificial intelligence, or machine learning. Artificial neural networks (ANNs), which crudely mimic the functioning of the brain, are the tools of choice. Although preferable to statistical regression techniques, ANNs have an important disadvantage: there is no guarantee that the ANN produces the best estimate. Recently, a different, more rigorous estimation method called support vector machines (SVMs) has been introduced and is gaining popularity because of the belief that it produces the best estimates. We compared SVM and ANN estimators developed on the national database called Soil Profiles Bank of Polish Mineral Soils. The ANN gave the same accuracy as the SVM. This suggests that existing ANN-based hydraulic property estimators provide the best estimates, although this hypothesis needs further testing.
Pedotransfer functions (PTFs), which estimate soil hydraulic parameters from better-known soil properties, are an important data source for hydrologic modeling. Recently, artificial neural networks (ANNs) have become the tool of choice in PTF development. Training an ANN can be viewed as finding the minimum of the mean-squared error as a function of the neuron weights. No training algorithm can guarantee that the global rather than a local minimum will be found. Recent developments in machine learning include growing research on and application of an alternative data-driven method called support vector machines (SVMs). SVMs have gained popularity in many fields traditionally dominated by ANNs. Using an SVM eliminates the local minimum issue: the minimum found is always the global one. The objective of this work was to see whether using SVMs to develop PTFs has advantages over using ANNs. We used the Soil Profiles Bank of Polish Mineral Soils, which includes hydraulic properties for about 1000 soil samples taken from 290 soil profiles. This database was repeatedly and randomly split into training and testing datasets, and both SVMs and ANNs were trained and tested for each split with bulk density, sand content, and clay content as input variables and water contents at 11 soil water potentials as output variables. PTF performance was evaluated by using the test datasets to compute the determination coefficient, the root-mean-squared error, and the slope and intercept of the linear regression “simulated vs. measured water contents.” There was no statistically significant difference (P < 0.05) between the average ANN and SVM determination coefficients and root-mean-squared errors for most of the 11 matric potential measurement levels. The SVM performed slightly better where a significant difference was found, but this difference should not be important for many practical purposes.
Overall, for the database used in this work, the ANN showed no tendency to generate worse predictions as a result of becoming stuck in local minima.
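
The evaluation procedure described above (repeated random splits, then four test-set statistics) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: the function names `evaluate_ptf` and `random_split` are hypothetical, the determination coefficient is assumed to be the squared Pearson correlation, the regression is assumed to treat simulated water content as the dependent variable and measured water content as the independent variable, and the test fraction is an arbitrary choice because the abstract does not state one.

```python
import math
import random


def evaluate_ptf(measured, simulated):
    """Return (r2, rmse, slope, intercept) for paired water contents.

    Assumption: r2 is the squared Pearson correlation, and the regression
    line is simulated = slope * measured + intercept.
    """
    n = len(measured)
    mean_m = sum(measured) / n
    mean_s = sum(simulated) / n
    # Sums of squares and cross-products about the means.
    sxx = sum((m - mean_m) ** 2 for m in measured)
    syy = sum((s - mean_s) ** 2 for s in simulated)
    sxy = sum((m - mean_m) * (s - mean_s) for m, s in zip(measured, simulated))
    slope = sxy / sxx
    intercept = mean_s - slope * mean_m
    r2 = sxy ** 2 / (sxx * syy)
    # Root-mean-squared error of simulated against measured values.
    rmse = math.sqrt(
        sum((s - m) ** 2 for m, s in zip(measured, simulated)) / n
    )
    return r2, rmse, slope, intercept


def random_split(n_samples, test_fraction, rng):
    """Return (train_indices, test_indices) for one random split."""
    indices = list(range(n_samples))
    rng.shuffle(indices)
    n_test = int(round(n_samples * test_fraction))
    return indices[n_test:], indices[:n_test]


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical water contents (cm3/cm3) standing in for test-set data.
    measured = [0.12, 0.18, 0.25, 0.31, 0.40]
    simulated = [0.14, 0.17, 0.27, 0.30, 0.38]
    print(evaluate_ptf(measured, simulated))
    train_idx, test_idx = random_split(1000, 0.25, rng)
    print(len(train_idx), len(test_idx))
```

In a full comparison, each split would be used to train both an SVM and an ANN, `evaluate_ptf` would be applied to each model's test-set predictions at every matric potential, and the resulting statistics would be averaged over the splits before testing for significant differences.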