
Research Project: Development of a Monitoring Network, Engineering Tools, and Guidelines for the Design, Analysis, and Rehabilitation of Embankment Dams, Hydraulic Structures, and Channels

Location: Agroclimate and Hydraulics Research Unit

Title: Predicting flood stages in watersheds with different scales using hourly rainfall dataset: A machine learning approach

Author
QIAO, LEI - Oklahoma State University
Livsey, Daniel
WISE, JARRETT - Department Of Energy
KADAVY, KEM - Retired ARS Employee
Hunt, Sherry
WAGNER, KEVIN - Oklahoma State University

Submitted to: Meeting Abstract
Publication Type: Abstract Only
Publication Acceptance Date: 10/17/2024
Publication Date: 11/19/2024
Citation: Qiao, L., Livsey, D.N., Wise, J., Kadavy, K., Hunt, S., Wagner, K. 2024. Predicting flood stages in watersheds with different scales using hourly rainfall dataset: A machine learning approach. Meeting Abstract. 2024 Governor's Water Conference and Research Symposium, Nov 19-20, 2024, Norman, Oklahoma.

Interpretive Summary:

Technical Abstract: Accurate prediction of instantaneous high lake water levels and flood flows (flood stages), from micro-catchments to large river basins, is critical for flood forecasting. Lake Carl Blackwell, a dammed small-watershed reservoir in the south-central USA, served as the primary case study because of its rich historical dataset. Because both current and antecedent rainfall contribute to the reservoir's water level, a series of hourly rainfall features was created to maximize predictive power. These features comprise total rainfall amounts in the current hour, the past 2 hours, 3 hours, …, 600 hours (25 days), in addition to previous-day lake levels. The machine learning algorithm Random Forest Regression (RFR) was used to score feature importance and to predict flood stages. It was compared with Support Vector Regression (SVR), Extreme Gradient Boosting (XGBoost), ordinary multivariate linear regression (MLR), and two dimension-reduced linear models, Principal Component Regression (PCR) and Partial Least Squares Regression (PLSR). For the testing dataset (hold-out validation), the prediction accuracy of RFR for hourly lake flood stages was as high as 0.95 in R², 0.11 ft in mean absolute error (MAE), and 0.21 ft in root mean square error (RMSE), with small decreases in accuracy for the other two non-linear algorithms, XGBoost and SVR. The linear regressions, which had the lowest accuracy, had R² values of 0.83, MAE < 0.23 ft, and RMSE < 0.37 ft. Furthermore, we extended this study to surface runoff and streamflow prediction in three watersheds of different sizes (from micro-catchments to large river basins), where the approach showed high accuracy and broad applicability in the region. The rainfall features emerged as the dominant predictors for all the watersheds, with the importance of earlier rainfall increasing for larger watersheds and decreasing for smaller ones.
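
The trailing rainfall totals and the three reported accuracy metrics can be illustrated with a minimal sketch in plain Python. This is not the authors' code: the function names `rolling_rain_totals` and `regression_scores`, the zero-rain padding before the start of the record, and the toy numbers are all assumptions made for illustration. The abstract's feature set also includes previous-day lake levels and windows up to 600 hours, which are not reproduced here.

```python
from itertools import accumulate
import math


def rolling_rain_totals(hourly_rain, windows):
    """Return, for each hour t, a dict {window: rain total over the last `window` hours}.

    A window of 1 is the current hour alone. Hours before the start of the
    record are treated as zero rainfall (an assumption of this sketch).
    """
    # prefix[i] holds the total rainfall of the first i hours.
    prefix = [0.0] + list(accumulate(hourly_rain))
    rows = []
    for t in range(len(hourly_rain)):
        end = t + 1  # exclusive end index, so the current hour is included
        rows.append({w: prefix[end] - prefix[max(0, end - w)] for w in windows})
    return rows


def regression_scores(observed, predicted):
    """Compute R^2, MAE, and RMSE for paired observed and predicted values."""
    n = len(observed)
    errors = [o - p for o, p in zip(observed, predicted)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return {"r2": 1.0 - ss_res / ss_tot, "mae": mae, "rmse": rmse}


# Toy hourly rainfall record (hypothetical values, arbitrary units).
rain = [0.0, 0.5, 1.0, 0.0, 0.25]
features = rolling_rain_totals(rain, windows=[1, 2, 3])

# Toy observed and predicted stages (hypothetical values, in ft).
scores = regression_scores([1.0, 2.0, 3.0], [1.0, 2.0, 4.0])
```

In a setup like the one the abstract describes, each row of `features` would form part of the input vector for one hourly stage prediction, and `regression_scores` would be applied to the hold-out portion of the record. The prefix-sum approach keeps the cost of each window total constant regardless of window length, which matters when the windows run to hundreds of hours.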