
Title: Reconstructing missing daily precipitation data using regression trees and artificial neural networks

Kim, Jung Woo
Pachepsky, Yakov

Submitted to: Journal of Hydrology
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 10/5/2010
Publication Date: 12/1/2010
Citation: Kim, J., Pachepsky, Y.A. 2010. Reconstructing missing daily precipitation data using regression trees and artificial neural networks. Journal of Hydrology. 394:305-314.

Interpretive Summary: Complete meteorological data series are a precondition for any environmental modeling project. Recurring malfunctions of weather sensing and recording equipment are unavoidable, so procedures are needed to reconstruct missing data from the records of neighboring weather stations. Previous research has shown that the relations between precipitation data series at neighboring stations are very complex and are best represented with machine learning methods. Faced with incomplete on-site weather data for simulations of bacterial water quality in an Appalachian creek, we developed a new precipitation reconstruction method that combines two machine learning methods: regression trees and artificial neural networks. The new method performed better than either individual method, both in the accuracy and reliability of the reconstructed precipitation data series and in the accuracy and reliability of stream flow simulations run with the USDA-ARS water quality model SWAT using the reconstructed precipitation data. The method is a useful improvement that can be widely used by scientists and industry practitioners working with environmental models.

Technical Abstract: Incomplete meteorological data have long been a problem in environmental modeling studies. The objective of this work was to develop a technique to reconstruct missing daily precipitation data in the central part of the Chesapeake Bay Watershed using regression trees (RT) and artificial neural networks (ANN). We also applied a two-step reconstruction method (RT+ANN) in which the ANN received inputs only from stations that were found to be influential in bootstrap applications of RT. Besides characterizing the reconstruction accuracy with statistics and the reconstruction uncertainty with bootstrap, we performed functional testing of the technique by evaluating the propagation of precipitation error in streamflow simulations with the Soil and Water Assessment Tool (SWAT) model. RT provided a transparent visual representation of the similarity between the stations in their daily precipitation time series. Seven years of data from 39 weather stations showed that both RT and ANN provided reconstruction accuracy comparable to or better than previously published results of ANN applications to precipitation reconstruction. The RT+ANN method significantly improved accuracy and was more robust than RT or ANN alone. This method was also more accurate and robust in SWAT streamflow predictions with reconstructed precipitation.
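
The two-step RT+ANN idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The use of scikit-learn, the function names `influential_stations` and `reconstruct`, the importance threshold, the tree depth, the network size, and the synthetic data are all assumptions made for illustration. Step 1 fits regression trees to bootstrap resamples of the days and keeps the neighboring stations the trees rely on; step 2 trains a small neural network on those stations only and uses it to fill the gaps.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.tree import DecisionTreeRegressor


def influential_stations(X, y, n_boot=30, max_depth=4, min_importance=0.05, seed=0):
    """Step 1 (assumed selection rule): average the regression-tree feature
    importances over bootstrap resamples and keep the stations above a threshold."""
    rng = np.random.default_rng(seed)
    n_days, n_stations = X.shape
    importance = np.zeros(n_stations)
    for _ in range(n_boot):
        idx = rng.integers(0, n_days, size=n_days)  # resample days with replacement
        tree = DecisionTreeRegressor(max_depth=max_depth, random_state=0)
        tree.fit(X[idx], y[idx])
        importance += tree.feature_importances_
    importance /= n_boot
    keep = np.flatnonzero(importance >= min_importance)
    if keep.size == 0:  # fall back to the single most important station
        keep = np.array([int(np.argmax(importance))])
    return keep


def reconstruct(X_train, y_train, X_missing):
    """Step 2: train an ANN on the influential stations only, then predict
    the target station's precipitation on the days where it is missing."""
    keep = influential_stations(X_train, y_train)
    ann = MLPRegressor(hidden_layer_sizes=(8,), max_iter=2000, random_state=0)
    ann.fit(X_train[:, keep], y_train)
    predicted = ann.predict(X_missing[:, keep])
    return keep, np.clip(predicted, 0.0, None)  # precipitation cannot be negative


# Synthetic stand-in for daily precipitation (mm) at 6 neighboring stations.
# By construction, the target station depends only on stations 0 and 2.
rng = np.random.default_rng(42)
neighbors = rng.gamma(shape=0.5, scale=8.0, size=(400, 6))
target = 0.6 * neighbors[:, 0] + 0.4 * neighbors[:, 2] + rng.normal(0.0, 0.5, 400)
target = np.clip(target, 0.0, None)

# Treat the last 100 days of the target record as missing.
keep, filled = reconstruct(neighbors[:300], target[:300], neighbors[300:])
print("stations kept:", keep)
print("first reconstructed values (mm):", np.round(filled[:5], 2))
```

Restricting the network inputs to the stations picked in step 1 keeps the ANN small when many neighbors are available, such as the 39 stations in the study. Repeating the whole procedure on bootstrap resamples would give a spread of reconstructed values that could serve as a rough uncertainty estimate.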