Submitted to: World Congress of Soil Science
Publication Type: Abstract Only
Publication Acceptance Date: February 21, 2006
Publication Date: July 9, 2006
Citation: Lilly, A., Nemes, A., Rawls, W.J., Pachepsky, Y.A. 2006. A probabilistic approach to the identification of soil textural and structural input variables for the estimation of saturated hydraulic conductivity [abstract]. 18th ISSS World Congress of Soil Science Conference. Paper No. 137-30.
Technical Abstract: The performance of pedotransfer functions (PTFs) depends on the properties of the development and application databases. The database effect has usually been studied by changing the database while keeping the applied predictors the same. It may happen, however, that the set of input variables that is optimal for estimating the desired soil properties from one data set is not optimal for another. Pedotransfer function development is usually guided by the availability of data on potentially useful input attributes. Soil structure is known to have a significant impact on soil hydraulic properties, but is rarely represented directly in soil hydraulic PTFs. Most PTFs use input parameters that are indirectly related to soil structure, such as bulk density, organic matter content and topsoil/subsoil distinctions. This is in part because quantified soil structure data are difficult to collect, but also because collecting such information may not have been part of the original project(s), so the data are not available. However, morphological descriptions of soil structure are often routinely collected during field sampling and represent an underutilized resource. No clear recommendation exists on which structural indicators might be the most significant for the estimation of soil hydraulic properties.
We used regression trees to examine which types of soil texture- and structure-related data provide the most useful information for estimating soil hydraulic properties, and thus could be used either to improve current PTFs or to allow PTFs to be developed in areas where measured data such as particle-size distribution, organic matter content and bulk density are lacking. The regression tree technique performs input attribute selection. We coupled this technique with ‘bagging’, meaning that alternative realizations of the input data set were generated to provide data for tree development. Two hundred realizations were generated from a European data set (n=502) using randomized subset selection, otherwise known as the jackknife technique. In each case, samples not selected for tree development were used to test the performance of the tree models. Input variables were: the presence of peds in any of 7 ped-size classes (i.e. 1-2, 2-5, 5-10, 10-20, 20-50, 50-100 and >100 mm); the orientation of any structural cracks (horizontal/vertical); a classification of apedal soils (massive, single grain, structureless); a dummy variable to indicate topsoil/subsoil; USDA texture classes; sand, silt and clay content; bulk density; and organic matter content. The probability of appearance of input attributes at the branch splits and their split values were evaluated. Sand and silt content were the primary splitting variables, with probabilities of 93% and 7%, respectively. The topsoil/subsoil distinction (59%) and bulk density (26%) were most dominant at the secondary split level, regardless of the primary splitting variable. Organic matter content (9%) and the largest ped-size class (>100 mm) (6%) also appeared at this level. The importance of this ped-size class was confirmed at the tertiary split level, where it was the most frequent splitting variable.
The listed variables dominated the first 3 split levels, with the presence of both vertical and horizontal cracks and massive aggregates as the dominant aggregation type appearing at the tertiary level only once (0.5%). We developed additional tree models using less input information, simulating cases where certain types of data are not available. The above approach helps (1) to identify which textural and structural input attributes are preferable in PTF development; (2) to define which variables should be measured and/or recorded to improve the reliability of estimates of soil hydraulic properties; and (3) to understand whether and how the grouping of soils may enhance the accuracy and reliability of PTFs.
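The counting of splitting variables over repeated random subsets can be illustrated with a minimal Python sketch. It is not the authors' implementation: it fits only the first (primary) split of a regression tree by minimizing the summed squared error of the two child nodes, repeats this on random subsets drawn without replacement, and reports how often each input variable is chosen. The function names, the 70% development fraction, and the variable names `sand`, `silt`, `bulk_density` and `log_ks` are assumptions made for illustration, and the data are synthetic, not the European data set. Testing on the samples left out of each subset is omitted for brevity.

```python
import random
from collections import Counter


def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_root_split(rows, target, features):
    """Return (feature, threshold) of the single split minimizing child SSE."""
    best_feature, best_threshold, best_score = None, None, float("inf")
    for feature in features:
        ordered = sorted(rows, key=lambda r: r[feature])
        for i in range(1, len(ordered)):
            low, high = ordered[i - 1][feature], ordered[i][feature]
            if low == high:
                continue  # no threshold can separate identical values
            left = [r[target] for r in ordered[:i]]
            right = [r[target] for r in ordered[i:]]
            score = sse(left) + sse(right)
            if score < best_score:
                best_feature, best_threshold, best_score = (
                    feature, (low + high) / 2, score)
    return best_feature, best_threshold


def root_split_frequencies(rows, target, features,
                           n_realizations=200, fraction=0.7, seed=0):
    """Fraction of realizations in which each feature is the primary split.

    Each realization is a random subset drawn without replacement;
    the subset fraction is an assumed value, not taken from the abstract.
    """
    rng = random.Random(seed)
    n_dev = int(round(fraction * len(rows)))
    counts = Counter()
    for _ in range(n_realizations):
        development = rng.sample(rows, n_dev)
        feature, _threshold = best_root_split(development, target, features)
        counts[feature] += 1
    return {f: counts[f] / n_realizations for f in features}


def make_synthetic_rows(n, seed=0):
    """Synthetic soils (illustration only): log_ks is driven mainly by sand."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        sand = rng.uniform(5.0, 90.0)
        silt = rng.uniform(0.0, 100.0 - sand)
        rows.append({
            "sand": sand,
            "silt": silt,
            "bulk_density": rng.uniform(1.0, 1.8),
            "log_ks": 0.05 * sand + rng.gauss(0.0, 0.2),
        })
    return rows


if __name__ == "__main__":
    soils = make_synthetic_rows(100, seed=1)
    print(root_split_frequencies(
        soils, "log_ks", ["sand", "silt", "bulk_density"], n_realizations=50))
```

Extending the sketch to secondary and tertiary split levels would mean growing the tree recursively on each child node and keeping one counter per level.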