Author: Meek, David - RETIRED ARS EMPLOYEE
Submitted to: Transactions of the ASABE
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 6/7/2013
Publication Date: 6/28/2013
Citation: Tomer, M.D., Beeson, P.C., Meek, D.W., Moriasi, D.N., Rossi, C.G., Sadeghi, A.M. 2013. Evaluating simulations of daily discharge from large watersheds using autoregression and an index of flashiness. Transactions of the ASABE. 56(4):1317-1326.
Interpretive Summary: Watershed models are calibrated to simulate stream discharge as accurately as possible, but model accuracy is usually assessed on monthly or annual time steps, whereas the models themselves operate on a daily time step. A consistent approach is needed to evaluate watershed modeling efforts at the same time scale the models operate on, but there are statistical challenges to overcome. In particular, daily hydrologic data vary over orders of magnitude and are highly skewed. We approached this problem by transforming rather than aggregating the data. Statistical "autoregressive" models were used to generate estimates of daily discharge that could be evaluated as if they were output by a simulation model. We found that the approach can be used to develop performance criteria to evaluate watershed models like the Soil and Water Assessment Tool (SWAT). The statistical modeling approach provides a mimic of observed daily stream data with confidence intervals for daily estimates. The approach offers possible utility for developing watershed-specific modeling goals and a potential framework for comparing modeling results across watersheds. One major recommendation is to include a measure of stream flashiness when assessing watershed model performance at the daily time step. This research is of greatest interest to the watershed modeling community and to those who hope that watershed models can become a more consistently useful tool for watershed management.
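As a rough illustration of what a "measure of stream flashiness" can look like, the sketch below computes the Richards-Baker flashiness index, a commonly used measure defined as the sum of absolute day-to-day changes in discharge divided by the sum of daily discharge. The summary does not say which index the authors used, so the choice of index, the function name `flashiness_index`, and the sample values are all assumptions made for illustration.

```python
import numpy as np

def flashiness_index(q):
    """Richards-Baker style flashiness index for a daily discharge series.

    Assumed convention: the first day has no predecessor, so it is left
    out of both the numerator and the denominator.
    """
    q = np.asarray(q, dtype=float)
    if q.size < 2:
        raise ValueError("need at least two daily values")
    day_to_day_change = np.abs(np.diff(q))
    return day_to_day_change.sum() / q[1:].sum()

# Hypothetical daily discharge values, not data from the study.
steady_stream = [5.0, 5.0, 5.0, 5.0]
flashy_stream = [1.0, 10.0, 1.0, 10.0]

print(flashiness_index(steady_stream))
print(flashiness_index(flashy_stream))
```

A constant series gives an index of zero, and larger values indicate a stream whose discharge changes more abruptly from one day to the next.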
Technical Abstract: Watershed models are calibrated to simulate stream discharge as accurately as possible. Modelers often calculate model validation statistics on aggregate (often monthly) time periods, rather than the daily step at which models typically operate. This is because daily hydrologic data exhibit large variability and skewness, while aggregating to a coarser temporal scale provides a near-normal distribution and hence a straightforward performance target. Statistical modeling of skewed data commonly employs transformation, which avoids the loss of information caused by aggregation. We empirically simulated the natural log of daily discharge (ln[Q]) for four South Fork Iowa River (SFIR) stream gages using autoregressive models, transformed results back to the original scale, and calculated model performance statistics for both the autoregressive models and Soil and Water Assessment Tool (SWAT) output. The autoregressive models captured 93-97% of the variation in ln(Q), with near-zero bias as a condition for model convergence. Back-transformed autoregressive model results, for three of the four stations, had Nash-Sutcliffe efficiencies (NSE) of 0.77-0.82 and residual-standard-error (RSR) ratios of 0.37-0.41, both of which were better than SWAT performance. The fourth gage, on Beaver Creek (BC), showed flashier hydrology and weaker autocorrelation. Consequently, SWAT generally outperformed autoregression at BC. Results highlighted hydrologic variation among SFIR tributary watersheds and consequences for modeling success. Stream flashiness should be considered when assessing watershed model performance at the daily time step. Autoregressive modeling provides a statistical mimic of observed daily stream data and carries potential as an internally consistent approach to benchmark performance of watershed models.
In this case, if transformed SWAT-simulated data could achieve NSE>0.9 and RSR<0.2, they would be indistinguishable from a statistically generated estimate of the time series. Autoregressive models can also provide confidence intervals for estimates of daily measurements, allowing uncertainty metrics to be generated from the measured dataset itself.
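The workflow described in the abstract (fit an autoregressive model to ln[Q], back-transform, then score the result) can be sketched roughly as follows. This is a minimal illustration under several assumptions, not the authors' procedure: the abstract does not give the model order or fitting method, so a first-order model fitted by ordinary least squares is used; the discharge series is synthetic; the back-transformation is a plain exponential with no bias correction; and RSR is computed under the common definition of root-mean-square error divided by the standard deviation of the observations. All function names are hypothetical.

```python
import numpy as np

def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 minus error sum of squares over
    the sum of squared deviations of the observations from their mean."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)

def rsr(obs, sim):
    """Assumed definition: RMSE divided by the standard deviation of obs."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return np.sqrt(np.sum((obs - sim) ** 2)) / np.sqrt(np.sum((obs - obs.mean()) ** 2))

def fit_ar1_log(q):
    """Least-squares fit of ln(Q_t) = intercept + slope * ln(Q_{t-1})."""
    y = np.log(np.asarray(q, float))
    design = np.column_stack([np.ones(y.size - 1), y[:-1]])
    coef, *_ = np.linalg.lstsq(design, y[1:], rcond=None)
    return coef[0], coef[1]

def predict_ar1(q, intercept, slope):
    """One-step-ahead estimates for days 2..n, back-transformed with exp."""
    y = np.log(np.asarray(q, float))
    return np.exp(intercept + slope * y[:-1])

# Synthetic, strongly autocorrelated "discharge" (not data from the study).
rng = np.random.default_rng(0)
n_days = 1000
log_q = np.empty(n_days)
log_q[0] = 1.0
for t in range(1, n_days):
    log_q[t] = 0.1 + 0.9 * log_q[t - 1] + rng.normal(scale=0.2)
q_obs = np.exp(log_q)

a, b = fit_ar1_log(q_obs)
q_sim = predict_ar1(q_obs, a, b)

# Score in the log domain and on the original (back-transformed) scale.
print("ln(Q) NSE:", nse(np.log(q_obs[1:]), np.log(q_sim)))
print("Q NSE:", nse(q_obs[1:], q_sim), "Q RSR:", rsr(q_obs[1:], q_sim))
```

The same `nse` and `rsr` functions could be applied to simulated discharge from a watershed model, which is how an autoregressive fit might serve as a benchmark for that model on the same observed series.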