Location: Agroclimate and Hydraulics Research Unit
Title: Predicting flood stages in watersheds with different scales using hourly rainfall dataset: A machine learning approach
Author:
QIAO, LEI - Oklahoma State University
Livsey, Daniel
WISE, JARRETT - Department Of Energy
KADAVY, KEM - Retired ARS Employee
Hunt, Sherry
WAGNER, KEVIN - Oklahoma State University
Submitted to: Meeting Abstract
Publication Type: Abstract Only
Publication Acceptance Date: 10/17/2024
Publication Date: 11/19/2024
Citation: Qiao, L., Livsey, D.N., Wise, J., Kadavy, K., Hunt, S., Wagner, K. 2024. Predicting flood stages in watersheds with different scales using hourly rainfall dataset: A machine learning approach. Meeting Abstract. 2024 Governor's Water Conference and Research Symposium, Nov 19-20, 2024, Norman, Oklahoma.
Interpretive Summary:
Technical Abstract: Accurate prediction of instantaneous high lake water levels and flood flows (flood stages), from micro-catchments to large river basins, is critical for flood forecasting. Lake Carl Blackwell, a dammed small-watershed reservoir in the south-central USA, served as the primary case study because of its rich historical dataset. Because both current and previous rainfall contribute to the reservoir’s water body, a series of hourly rainfall features was created to maximize predictive power: total rainfall in the current hour and in the past 2 hours, 3 hours, …, 600 hours (25 days), in addition to previous-day lake levels. Random Forest Regression (RFR) was used to score feature importance and to predict the flood stages, alongside Support Vector Regression (SVR), Extreme Gradient Boosting (XGBoost), ordinary multiple linear regression (MLR), and two dimension-reduced linear models, Principal Component Regression (PCR) and Partial Least Squares Regression (PLSR). On the testing dataset (hold-out validation), the prediction accuracy of RFR for the hourly lake flood stages was as high as R² = 0.95, with a mean absolute error (MAE) of 0.11 ft and a root mean square error (RMSE) of 0.21 ft; the other two nonlinear algorithms, XGBoost and SVR, were only slightly less accurate. The linear regressions had the lowest accuracy, with R² = 0.83, MAE < 0.23 ft, and RMSE < 0.37 ft.
Furthermore, we extended this study to three watersheds of different sizes (from micro-catchments to large river basins) for surface runoff and streamflow prediction, and the approach showed high accuracy and broad applicability in the region. The rainfall features were the dominant predictors for all the watersheds, with the importance of earlier rainfall increasing for larger watersheds and decreasing for smaller ones.
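
The rainfall features and accuracy metrics described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names `rolling_rainfall_sums` and `regression_scores`, the sample rainfall values, and the choice to truncate windows that reach before the start of the record are all assumptions made for illustration. The sketch uses only the Python standard library; a model such as Random Forest Regression would then be trained on the resulting feature rows.

```python
import math
from itertools import accumulate


def rolling_rainfall_sums(rain, windows):
    """Return one feature row per hour: total rainfall over the last w hours.

    rain    -- hourly rainfall amounts, oldest first
    windows -- window lengths in hours (1 = current hour only)
    Windows that reach before the start of the record are truncated
    to the data available (an assumption of this sketch).
    """
    # prefix[i] holds the sum of rain[:i], so any window sum is one subtraction.
    prefix = [0.0] + list(accumulate(rain))
    rows = []
    for t in range(len(rain)):
        row = []
        for w in windows:
            start = max(0, t + 1 - w)
            row.append(prefix[t + 1] - prefix[start])
        rows.append(row)
    return rows


def regression_scores(observed, predicted):
    """Compute R^2, mean absolute error, and root mean square error."""
    n = len(observed)
    mean_obs = sum(observed) / n
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "mae": sum(abs(o - p) for o, p in zip(observed, predicted)) / n,
        "rmse": math.sqrt(ss_res / n),
    }


# Hypothetical hourly rainfall record (units arbitrary), not data from the study.
sample_rain = [0.0, 0.5, 1.0, 0.0, 0.25]

# The abstract describes windows of 1, 2, 3, ..., 600 hours; three are used here.
sample_features = rolling_rainfall_sums(sample_rain, windows=[1, 2, 3])

# Hypothetical observed and predicted lake stages in feet.
sample_scores = regression_scores([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 5.0])
```

For the full feature set described in the abstract, `windows` would be `range(1, 601)`, with previous-day lake level appended to each row as an additional predictor.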
