Title: Neighborhood size of training data influences soil map disaggregation

Author: Levi, Matthew

Submitted to: Soil Science Society of America Journal
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 12/28/2016
Publication Date: 4/20/2017
Publication URL: https://handle.nal.usda.gov/10113/5695464
Citation: Levi, M.R. 2017. Neighborhood size of training data influences soil map disaggregation. Soil Science Society of America Journal. 81:354-368. doi:10.2136/sssaj2016.08.0258

Interpretive Summary: Predicting soil types is an important focus of digital soil mapping (DSM). Most DSM approaches intersect sample locations with one raster pixel per covariate layer regardless of pixel size, and so fail to account for adjacent landscape information. My objective was to disaggregate a soil map in a semiarid Arizona rangeland (78,569 ha) by exploring different neighborhood sizes for extracting covariate data to points. Eight machine learning algorithms were compared to assess the influence of aggregating covariate data in neighborhood sizes from 0 to 180 m radius and in a multi-scale model. Model performance showed fair to moderate agreement and increased with buffer radius up to 150 m. Support vector machine and random forest algorithms performed best across all scales. The best model used 150 m aggregations of covariates and produced a more generalized map than the best multi-resolution model, which resulted in a mix of general and detailed soil features. Evaluating a range of neighborhood sizes for aggregating covariate data provides a way to account for the multi-scale processes that are important for predicting soil patterns, without modifying the pixel size of final maps. Incorporating polypedon concepts from traditional soil survey into DSM approaches can strengthen ties between the two and optimize the extraction of landscape information for predicting soil properties.
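
The idea of extracting covariate data to a point from a neighborhood, rather than from a single pixel, can be illustrated with a minimal sketch. It assumes the covariate is a NumPy array on a square grid, that the neighborhood is a circle summarized by its mean, and that the pixel size is 30 m; none of these details are stated in this summary, and the function name `neighborhood_mean` is hypothetical rather than taken from the publication.

```python
import numpy as np


def neighborhood_mean(raster, row, col, radius_m, pixel_size_m):
    """Mean of all pixels whose centers lie within radius_m of pixel (row, col).

    A radius of 0 reduces to the usual single-pixel extraction.
    """
    r = int(radius_m // pixel_size_m)  # search radius in whole pixels
    # Clip the search window to the raster extent.
    r0, r1 = max(row - r, 0), min(row + r + 1, raster.shape[0])
    c0, c1 = max(col - r, 0), min(col + r + 1, raster.shape[1])
    window = raster[r0:r1, c0:c1]
    # Distance (in meters) from every window pixel center to the sample pixel.
    rr, cc = np.ogrid[r0:r1, c0:c1]
    dist = np.hypot(rr - row, cc - col) * pixel_size_m
    # Keep only pixels inside the circular neighborhood; ignore NaN (no data).
    return float(np.nanmean(window[dist <= radius_m]))


# Toy covariate layer (assumed 30 m pixels): one high value among zeros.
covariate = np.zeros((5, 5))
covariate[2, 2] = 10.0

single_pixel = neighborhood_mean(covariate, 2, 2, radius_m=0, pixel_size_m=30)
buffered_30 = neighborhood_mean(covariate, 2, 2, radius_m=30, pixel_size_m=30)
buffered_60 = neighborhood_mean(covariate, 2, 2, radius_m=60, pixel_size_m=30)
```

In this toy case the single-pixel value is 10, while the 30 m and 60 m neighborhoods average it with 4 and 12 surrounding zeros respectively. Larger radii smooth local detail in this way, and the output grid keeps its original pixel size because only the values attached to training points change.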

Technical Abstract: Soil class mapping relies on the ability of sample locations to represent portions of the landscape with similar soil types; however, most digital soil mapping (DSM) approaches intersect sample locations with one raster pixel per covariate layer regardless of pixel size. This approach does not account for the variability of covariate information adjacent to the training data, which represent the polypedon. My objective was to disaggregate a soil map in a semiarid Arizona rangeland (78,569 ha) by exploring different neighborhood sizes for extracting covariate data to points. Eight machine learning algorithms were compared to assess the influence of aggregating covariate data in neighborhood sizes from 0 to 180 m radius and in a multi-scale model. Kappa values of all models ranged from 0.24 to 0.44 and increased with buffer radius up to 150 m. Support vector machine and random forest algorithms performed best across all scales. The radial support vector machine model using 150 m aggregations of covariates had the highest kappa and produced a more generalized map than the best multi-resolution model (random forest), which resulted in a mix of general and detailed soil features. Evaluating a range of neighborhood sizes for aggregating covariate data provides a way to account for the multi-scale processes that are important for predicting soil patterns, without modifying the pixel size of final maps. Incorporating polypedon concepts from traditional soil survey into DSM approaches can strengthen ties between the two and optimize the extraction of landscape information for predicting soil properties.
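
For readers unfamiliar with the kappa statistic reported above, the following is a minimal sketch of Cohen's kappa, which measures agreement between observed and predicted classes after subtracting the agreement expected by chance. The abstract does not say how kappa was computed in the study, so the function name `cohens_kappa` and the class labels are hypothetical and serve only as an illustration.

```python
from collections import Counter


def cohens_kappa(observed, predicted):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e) for two equal-length label lists."""
    n = len(observed)
    if n == 0 or n != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    # Observed agreement: fraction of points where the labels match.
    p_o = sum(o == p for o, p in zip(observed, predicted)) / n
    # Chance agreement: sum over classes of the product of marginal proportions.
    obs_counts = Counter(observed)
    pred_counts = Counter(predicted)
    p_e = sum(obs_counts[c] * pred_counts[c] for c in obs_counts) / n**2
    if p_e == 1.0:
        return float("nan")  # undefined when both lists hold one identical class
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical soil class labels at four validation points.
observed = ["A", "A", "B", "B"]
predicted = ["A", "B", "B", "B"]
kappa = cohens_kappa(observed, predicted)  # p_o = 0.75, p_e = 0.5
```

Here three of four points agree (p_o = 0.75) and chance agreement is 0.5, giving a kappa of 0.5. A value of 0 means agreement no better than chance and 1 means perfect agreement.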