1a. Objectives (from AD-416):
1) Develop empirical statistical models of biofuel and forage species distribution and abundance using field trial data and incorporating edaphic, topographic, and climatic factors; 2) use those models to describe potential forage and biofuels production across the northeastern United States; and 3) characterize the effects of past and potential land use change scenarios for grass-based agriculture in this region. The topo-climatic and edaphic data will contribute to development and testing of Forage Suitability Groups for the Northeast. Data provided by the ARS Northern Great Plains Research Laboratory, Mandan, ND, will be used to test the generality of the methods and models developed.
1b. Approach (from AD-416):
Empirical statistical models will be calibrated and tested using field trial data collected in the northeastern United States. Regions of similar climate and soils may also be included. Existing soils, climate, and elevation geographic information systems (GIS) data layers will be used to extrapolate those predictive models across the landscape for the Northeast and the northern Great Plains. Scenario development will use current and historic land cover data and will address previous and future tensions between forage and biofuels uses, as well as development and abandonment of lands.
3. Progress Report:
A postdoctoral research associate was hired in February 2013. The research team is developing empirical statistical methods for simulating forage species distribution and abundance using field-based point measurements of species combined with available gridded databases of climatic, topographic, and soil layers. To date, the team has reviewed the advantages and disadvantages of existing statistical models and the environmental factors that could ecologically influence grass species occurrence. Because the number of targeted forage species is in the range of 20 to 50, we have selected a broad set of ten statistical models for species prediction; mathematically, these models span different categories of statistical approaches. Currently the environmental factors (explanatory variables) used as input to the statistical models comprise 18 bioclimatic variables, 12 topographical variables and 10 soil variables. The bioclimatic variables are generated from monthly temperature and precipitation data and include, for example, mean annual temperature, minimum temperature of the coldest month, and precipitation in the driest quarter. Topographic variables such as elevation, aspect, slope, and surface curvature have also been included. Work is ongoing to add soil variables such as soil texture, depth to water table, and chemical composition, which are extracted from the newly published gridded Soil Survey Geographic Database (gSSURGO).
The statistical models are fitted separately for each forage species, using presence/absence data collected across the Northeast region over the past ten years. Because an independent dataset with suitable forage species is not currently available, the available data have been divided into a “training” subset for calibration (e.g., 70% of the data) and a “testing” subset for validation (the remaining 30% of the data). This split-sample evaluation process is a form of cross-validation.
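The 70%/30% split described above can be sketched as follows. This is a minimal illustration in Python rather than the project's R/BIOMOD2 code; the function name `split_train_test`, the fixed random seed, and the synthetic site records are assumptions made for the example, and a simple random (unstratified) split is assumed because the report does not say how the subsets were drawn.

```python
import random


def split_train_test(records, train_fraction=0.7, seed=42):
    """Randomly partition records into calibration and validation subsets.

    Hypothetical helper: the report does not specify how its split was drawn.
    """
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    rng = random.Random(seed)      # fixed seed so the split is reproducible
    shuffled = list(records)       # copy so the caller's data is left untouched
    rng.shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


# Synthetic stand-in for field records: (site_id, species_present)
sites = [(site_id, site_id % 3 == 0) for site_id in range(100)]
train, test = split_train_test(sites)
print(len(train), "training records;", len(test), "testing records")
```

Repeating the split with different seeds and averaging the evaluation scores would give a more stable estimate than a single split, at the cost of refitting each model several times.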
Quantitative criteria have been developed for comparing the predictive ability of the ten statistical models and for assessing model performance for different forage species. All procedures are implemented in the R language using functions from the BIOMOD2 software package. The R program code accesses the necessary input datasets, calibrates a set of the statistical models, performs cross-validation, and generates a set of outputs for analysis. The R program is also used to produce ensemble forecasts that combine the statistical models when it is difficult to select a single most appropriate model. When a dataset contains no absence information (presence-only data), virtual absence (pseudo-absence) data can be constructed. We have also calculated variable importance using the method provided in BIOMOD2. Preliminary results indicate that mathematically including 30 environmental factors (18 climatic and 12 soil factors) as input in the models can make the best predictions of presence/absence for the orchardgrass and white clover species. Some of the models demonstrate 100% correct predictive performance. Once completed, the optimized models will be used to project potential forage and biofuels production across the northeastern United States and the northern Great Plains, describing the suitability of grass-based species for these regions and determining the role of environmental factors in species occurrence. These models will also be used to explore scenarios based on historical, current, and future land use change and climate change across the landscape of these regions, addressing previous and future allocations between forage and biofuels uses.
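To illustrate how an evaluation criterion and an ensemble forecast can fit together, the sketch below scores each model with the true skill statistic (TSS = sensitivity + specificity - 1) and then averages the models' predicted probabilities, weighted by score. It is written in Python, not the project's R code, and it is not the BIOMOD2 API. The choice of TSS, the 0.5 threshold, the rule of dropping models with non-positive scores, the function names, and all data values are assumptions made for the example; the report does not state which criteria or ensemble rule were used.

```python
def true_skill_statistic(observed, predicted):
    """Return sensitivity + specificity - 1 for binary observations and predictions."""
    pairs = list(zip(observed, predicted))
    tp = sum(1 for o, p in pairs if o and p)
    tn = sum(1 for o, p in pairs if not o and not p)
    fn = sum(1 for o, p in pairs if o and not p)
    fp = sum(1 for o, p in pairs if not o and p)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("observations must include both presences and absences")
    return tp / (tp + fn) + tn / (tn + fp) - 1.0


def weighted_ensemble(probabilities, scores, min_score=0.0):
    """Average per-model probabilities, weighting each retained model by its score."""
    kept = {name: s for name, s in scores.items() if s > min_score}
    if not kept:
        raise ValueError("no model exceeds min_score")
    total = sum(kept.values())
    n_sites = len(next(iter(probabilities.values())))
    return [
        sum(probabilities[name][i] * w for name, w in kept.items()) / total
        for i in range(n_sites)
    ]


# Synthetic test-set observations (1 = present) and model outputs (hypothetical values)
observed = [1, 1, 1, 0, 0, 0]
probabilities = {
    "model_a": [0.9, 0.8, 0.7, 0.2, 0.1, 0.3],
    "model_b": [0.6, 0.4, 0.7, 0.4, 0.2, 0.6],
    "model_c": [0.2, 0.3, 0.4, 0.6, 0.7, 0.8],
}
threshold = 0.5  # assumed cutoff for converting probabilities to presence/absence
scores = {
    name: true_skill_statistic(observed, [p >= threshold for p in probs])
    for name, probs in probabilities.items()
}
ensemble = weighted_ensemble(probabilities, scores)
print(scores)
print([round(p, 3) for p in ensemble])
```

Weighting by score lets well-performing models dominate the forecast without discarding the others entirely; a model that performs no better than chance (TSS at or below zero) contributes nothing under the assumed cutoff.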