Research Project: Development of a Monitoring Network, Engineering Tools, and Guidelines for the Design, Analysis, and Rehabilitation of Embankment Dams, Hydraulic Structures, and Channels

Location: Agroclimate and Hydraulics Research Unit

Title: Predicting flood stages in watersheds with different scales using hourly rainfall dataset: A high-volume rainfall features empowered machine learning approach

Authors:
QIAO, LEI - Oklahoma State University
Livsey, Daniel
Wise, Jarrett
Kadavy, Kem
Hunt, Sherry
WAGNER, KEVIN - Oklahoma State University

Submitted to: Science of the Total Environment
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 7/31/2024
Publication Date: 8/3/2024
Citation: Qiao, L., Livsey, D., Wise, J.L., Kadavy, K.C., Hunt, S., Wagner, K. 2024. Predicting flood stages in watersheds with different scales using hourly rainfall dataset: A high-volume rainfall features empowered machine learning approach. Science of the Total Environment. 950. Article 175231. https://doi.org/10.1016/j.scitotenv.2024.175231.
DOI: https://doi.org/10.1016/j.scitotenv.2024.175231

Interpretive Summary: Accurate forecasting of lake and river flood levels is needed to protect human lives and property. Predicting lake levels is difficult because they depend on numerous weather and soil conditions. Therefore, artificial intelligence models, e.g., machine learning, are used to identify patterns and relationships between rainfall and lake level. Lake Carl Blackwell, located in north-central Oklahoma, served as the primary case study due to its extensive historical records. In addition, streamflow forecasts were developed for three different-sized watersheds in Oklahoma ranging from 0.01 square miles (0.03 square kilometers) to 3,705 square miles (9,595 square kilometers). The prediction error averaged 0.11 ft for the lake level and 0.21 ft for the river streamflow level. Overall, this approach showed high accuracy with broad applicability in the region. USDA is an equal opportunity provider and employer.

Technical Abstract: Accurate prediction of instantaneous high lake water levels and flood flows (flood stages), from micro-catchments to large river basins, is critical for flood forecasting. Lake Carl Blackwell, a dammed small-watershed reservoir in the south-central USA, served as the primary case study due to its rich historical dataset. Because both current and previous rainfall contribute to the reservoir's water body, a series of hourly rainfall features was created to maximize predictive power. These features comprise the total rainfall amounts in the current hour, the past 2 hours, 3 hours, …, 600 hours (25 days), in addition to previous-day lake levels. The machine learning algorithm Random Forest Regression (RFR) was used to score the features' importance and predict the flood stages, along with Support Vector Regression (SVR), Extreme Gradient Boosting (XGBoost), ordinary multivariate linear regression (MLR), and the dimension-reduced linear models Principal Component Regression (PCR) and Partial Least Squares Regression (PLSR). For the hourly lake flood stages, RFR reached an R² of up to 0.95, with a mean absolute error (MAE) as low as 0.11 ft and a root mean square error (RMSE) as low as 0.21 ft on the testing dataset (hold-out validation); the other two non-linear algorithms, XGBoost and SVR, were only slightly less accurate. The linear regressions were the least accurate, with R² values of 0.83, MAE < 0.23 ft, and RMSE < 0.37 ft. Furthermore, we extended this study to three different-sized watersheds for surface runoff and streamflow predictions (from micro-catchments to large river basins), and the approach showed high accuracy and broad applicability in the region. The rainfall features emerged as the dominant predictors for all the watersheds, with the importance of earlier rainfall increasing for larger watersheds and decreasing for smaller ones.
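
The rainfall features described in the abstract (trailing rainfall totals over 1, 2, 3, …, 600 hours, plus a previous-day lake level) could be built roughly as follows. This is only a minimal sketch and not the authors' code: the function name `rainfall_features`, its arguments, the reading of "previous-day" as a level series that has already been lagged, and the sample numbers are all assumptions made for illustration.

```python
from itertools import accumulate


def rainfall_features(hourly_rain, prev_day_level, windows):
    """Build one feature row per hour: trailing rainfall totals plus a lagged level.

    hourly_rain    -- rainfall depth per hour, oldest first (hypothetical input)
    prev_day_level -- lake level already lagged by one day, same length (assumption)
    windows        -- trailing window lengths in hours, e.g. range(1, 601)
    """
    if len(hourly_rain) != len(prev_day_level):
        raise ValueError("inputs must have the same length")
    windows = list(windows)
    # prefix[i] is the total rainfall in hours 0..i-1, so each trailing
    # window total is a single subtraction instead of a fresh sum.
    prefix = [0.0] + list(accumulate(hourly_rain))
    longest = max(windows)
    rows = []
    # Start at the first hour that has a full history for the longest window.
    for t in range(longest - 1, len(hourly_rain)):
        totals = [prefix[t + 1] - prefix[t + 1 - w] for w in windows]
        rows.append(totals + [prev_day_level[t]])
    return rows


# Made-up data: six hours of rainfall (inches) and lagged lake levels (ft).
rain = [0.0, 0.25, 0.5, 0.0, 0.125, 0.0]
level = [940.0, 940.0, 940.25, 940.5, 940.5, 940.25]
features = rainfall_features(rain, level, windows=[1, 2, 3])
```

With `windows=range(1, 601)` each row would hold 600 rainfall totals plus the lagged level. A matrix of this shape is what a regressor such as scikit-learn's `RandomForestRegressor` could be trained on, with the observed hourly stage as the target.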