Location: Grassland Soil and Water Research Laboratory
Title: Empirical relationships between environmental factors and soil organic carbon produce comparable prediction accuracy to machine learning
Author
MISHRA, UMAKANT - Sandia National Laboratory
YEO, KYONGMIN - Computational Biology Center, IBM, TJ Watson Research
Adhikari, Kabindra
RILEY, WILLIAM - Lawrence Berkeley National Laboratory
HOFFMAN, FOREST - Oak Ridge National Laboratory
HUDSON, COREY - Sandia National Laboratory
GAUTAM, SAGAR - Sandia National Laboratory
Submitted to: Soil Science Society of America Journal
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 5/20/2022
Publication Date: 6/7/2022
Citation: Mishra, U., Yeo, K., Adhikari, K., Riley, W.J., Hoffman, F.M., Hudson, C., Gautam, S. 2022. Empirical relationships between environmental factors and soil organic carbon produce comparable prediction accuracy to machine learning. Soil Science Society of America Journal. 86(6):1611-1624. https://doi.org/10.1002/saj2.20453.
DOI: https://doi.org/10.1002/saj2.20453
Interpretive Summary: Machine learning techniques are robust in quantifying and modeling spatial relationships between environmental variables and soil organic carbon (SOC) distribution across multiple scales. This study applied Random Forest and Generalized Additive Modeling approaches to derive functional relationships between environmental factors and SOC stocks using more than 6,200 point SOC observations from across the US mainland. The models quantified these relationships and identified evapotranspiration, soil drainage, and vegetation index as important controllers of variation in SOC stocks. The study recommended that the derived relationships be used to benchmark land model representations of SOC.
Technical Abstract: Accurate representation of environmental controllers of soil organic carbon (SOC) stocks in Earth System Model (ESM) land models could reduce uncertainties in future carbon-climate feedback projections. Using functional relationships between environmental factors and SOC stocks to evaluate land models can help modelers understand prediction biases beyond what can be achieved with the observed SOC stocks alone.
In this study, we used a dataset of 31 environmental factors, field SOC observations (n = 6,213) from the continental US, and the Random Forest (RF) and Generalized Additive Modeling (GAM) approaches to (1) select important environmental predictors of SOC stocks, (2) derive functional relationships between environmental factors and SOC stocks, and (3) use the derived functional relationships to predict SOC stocks and compare the prediction accuracy of the two approaches. Out of the 31 environmental factors we investigated, 12 were identified as important predictors of SOC stocks by the RF approach. In contrast, the GAM approach identified six (of those 12) environmental factors as important controllers of SOC stocks: potential evapotranspiration, normalized difference vegetation index, soil drainage condition, precipitation, elevation, and net primary productivity. The GAM approach showed minimal SOC predictive importance for the remaining six environmental factors identified by the RF approach. Nevertheless, the functional relationships derived for the six GAM-selected environmental factors explained 52% of the observed variability of SOC stocks, compared to 56% explained by the RF approach using 12 environmental factors. The functional relationships we derived using the GAM approach can serve as important benchmarks to evaluate environmental control representations of SOC stocks in ESMs, which could reduce uncertainty in predicting future carbon-climate feedbacks.
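The comparison of "explained variability" between two models can be illustrated with a minimal sketch. It assumes the metric is the coefficient of determination, R² = 1 - SS_res / SS_tot, which this record does not confirm as the study's exact measure. The function name `r_squared` and all numbers are hypothetical toy values, not data or results from the study.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Assumption: this is one common way to express the share of observed
    variability a model explains; the study's exact metric is not stated here.
    """
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_obs = sum(observed) / len(observed)
    # Total variability of the observations around their mean
    ss_tot = sum((y - mean_obs) ** 2 for y in observed)
    # Variability left unexplained by the model predictions
    ss_res = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    if ss_tot == 0:
        raise ValueError("observed values have zero variance")
    return 1.0 - ss_res / ss_tot


# Toy, made-up values standing in for observed SOC stocks and two model outputs
observed = [2.0, 4.0, 6.0, 8.0]
model_a = [2.5, 3.5, 6.5, 7.5]  # hypothetical predictions from a more flexible model
model_b = [3.0, 3.0, 7.0, 7.0]  # hypothetical predictions from a simpler model

r2_a = r_squared(observed, model_a)  # by hand: 1 - 1/20
r2_b = r_squared(observed, model_b)  # by hand: 1 - 4/20
print(f"model A R^2: {r2_a:.2f}, model B R^2: {r2_b:.2f}")
```

Computing the same statistic on the same held-out observations for both models is what makes figures such as 52% and 56% directly comparable.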