A hybrid rainfall-runoff model: integrating initial loss and LSTM for improved forecasting

Wang, Wei; Gao, Jie; Liu, Zheng; Li, Chuanqi

doi:10.3389/fenvs.2023.1261239

ORIGINAL RESEARCH article

Front. Environ. Sci., 18 October 2023

Sec. Water and Wastewater Management

Volume 11 - 2023 | https://doi.org/10.3389/fenvs.2023.1261239

This article is part of the Research Topic Vegetation-soil-hydrology Interactions and Ecohydrological Processes

Wei Wang¹

Jie Gao¹

Zheng Liu²

Chuanqi Li¹*

¹School of Civil Engineering, Shandong University, Jinan, China
²Jinan Water Resources Engineering Service Center, Jinan, China

Accurate rainfall-runoff modeling is crucial for disaster prevention, mitigation, and water resource management. This study aims to enhance precision and reliability in predicting runoff patterns by integrating physically-based models such as HEC-HMS with data-driven models such as LSTM. We present a novel hybrid model, Ia-LSTM, which combines the strengths of HEC-HMS and LSTM to improve hydrological modeling. By optimizing the “initial loss” (Ia) with HEC-HMS and utilizing LSTM to capture the effective rainfall-runoff relationship, the model achieves a substantial improvement in precision. Tested in the Yufuhe basin in Jinan City, Shandong province, the Ia-LSTM consistently outperforms individual HEC-HMS and LSTM models, achieving notable average Nash-Sutcliffe Efficiency (NSE) values of 0.873 and 0.829, and average R² values of 0.916 and 0.870 for calibration and validation, respectively. The study shows the potential of integrating physical mechanisms to enhance the efficiency of data-driven rainfall-runoff modeling. The Ia-LSTM model holds promise for more accurate runoff estimation, with wide applications in flood forecasting, water resource management, and infrastructure planning.

1 Introduction

Rainfall-runoff modeling is essential in hydrology, especially for tasks like reservoir management, flood forecasting, and water resource planning (Chen and Adams, 2006; Young and Liu, 2015). Despite significant progress, accurately predicting runoff remains a major challenge due to the complex, nonlinear, and dynamic nature of the rainfall-runoff process (Wang et al., 2006; Xie et al., 2019). This complexity is further compounded by various influencing factors, including rainfall patterns, initial soil moisture, terrain, land cover, and infiltration (Wang and Ding, 2003; Perera et al., 2019). Sudden rainstorms further emphasize the need for a comprehensive understanding of primary rainfall patterns (Xie et al., 2023a; Xie et al., 2023b). The impact of urban imperviousness on runoff and flooding dynamics has also emerged as a crucial factor in recent studies (Shukla et al., 2020; Mehr and Akdegirmen, 2021).

Vegetation and soil properties play a significant role in regulating the hydrological cycle, impacting various processes such as interception, infiltration, evaporation, and surface depression storage (Shukla et al., 2018). Notably, initial loss or initial abstraction (Ia) represents the rainfall occurring before the initiation of surface runoff. Ia is influenced by factors like vegetation cover, soil infiltration capacity, and the antecedent moisture condition of the soil. Its magnitude is closely tied to both climatic conditions and moisture levels in the watershed, making accurate estimation of Ia essential for runoff determination and flood management (Zheng et al., 2020).

Rainfall-runoff models are broadly classified into physically-based models and data-driven models (Devia et al., 2015; Bartoletti et al., 2018; Mohammadi et al., 2022). Physically-based models, such as the Hydrologic Engineering Center-Hydrologic Modeling System (HEC-HMS) (Feldman, 2000), Xinanjiang (XAJ) model (Zhao, 1992), soil and water assessment tool (SWAT) (Arnold et al., 1998), MIKE-SHE (Jaber and Shukla, 2012), and HSPF (Bicknell et al., 1997), employ mathematical equations to represent hydrological processes. While these models provide valuable insights, their development demands a deep understanding of hydrological processes and extensive basin parameters, leading to a complex and time-consuming development process (Fenicia et al., 2008; Chen et al., 2022). The Hydrologic Modeling System (HMS), designed by the Hydrologic Engineering Center (HEC) of the United States Army Corps of Engineers, is a rainfall-runoff analysis tool widely adopted worldwide. However, the physical processes represented in hydrological models are so complex that it is difficult to extract the relevant information from the available inputs.

Data-driven models offer a compelling alternative, establishing relationships between input and output data without the need for detailed understanding of underlying physical processes (Noori and Kalin, 2016; Yaseen et al., 2016; Lees et al., 2021). These models rely on historical rainfall and runoff data, making them suitable for handling non-linear and stochastic systems (Hu et al., 2018; Kratzert et al., 2018; Gao et al., 2020). Prominent data-driven methods for rainfall-runoff modeling include artificial neural networks (ANN) (Haykin and Network, 2004), support vector machines (SVM) (Cortes and Vapnik, 1995), genetic programming (Savic et al., 1999; Danandeh and Nourani, 2018), random forests (Breiman, 2001), fuzzy logic (Hundecha et al., 2001), and regression in the reproducing kernel Hilbert space (RRKHS) (Safari et al., 2020). These models use historical data to identify patterns and associations, enabling them to make predictions or estimates based on observed data patterns.

In recent years, deep learning, as a type of data-driven modeling, has gained substantial attention in hydrology due to its adaptability and minimal data requirements (Beven, 2020; Gu et al., 2020; Zhou et al., 2023). Among various deep learning approaches, Long Short-Term Memory (LSTM) networks (Hochreiter and Schmidhuber, 1997) have proven their effectiveness in various hydrological applications, including rainfall prediction (Barrera-Animas et al., 2022), flood forecasting (Hu et al., 2018; Rahimzad et al., 2021), and river water table prediction (Kim et al., 2022). As emphasized by Kratzert et al. (2018), the strength of LSTM models lies in their capacity to capture long-term dependencies between the input and output.

The integration of physically-based and data-driven models in rainfall-runoff modeling has received considerable interest, driven by their complementary strengths (Tian et al., 2018; Sun et al., 2019; Zhou et al., 2022). Several hybrid models have exhibited promise in this domain. For instance, the XAJ-LSTM model, proposed by Cui et al. (2021), combines the Xinanjiang (XAJ) conceptual model with LSTM neural networks for multistep-ahead flood forecasting. This hybrid model utilizes the forecast results of XAJ as input variables for LSTM, thus enhancing the physical mechanisms of hydrological simulation. By incorporating discharge forecasts from the XAJ model, the XAJ-LSTM hybrid model overcomes the limitations of LSTM’s input variables, resulting in notably improved performance. Similarly, Gholami and Khaleghi (2021) conducted a comparative analysis of ANN and HEC-HMS models in rainfall-runoff simulation. Narayana Reddy and Pramada (2022) integrated HEC-HMS with ANN to enhance daily discharge simulation and yearly peak discharge prediction. Farfan et al. (2020) used streamflow series forecasts from a conceptual model as input for back-propagation neural networks, leading to markedly improved streamflow predictions. Hitokoto and Sakuraba (2020) successfully integrated a rainfall-runoff model with a feed-forward artificial neural network to predict real-time water level processes. These instances highlight the effectiveness of hybrid models in enhancing predictive accuracy.

While previous research has made significant progress in rainfall-runoff modeling, there remains a critical need for innovative approaches to address the limitations of current models. Notably, the absence of physical mechanisms poses a substantial obstacle in applying machine learning methods, which typically rely on labeled observations (Xie et al., 2021). The consideration of initial loss (Ia), a crucial stage in the rainfall-runoff process, within a deep learning network for rainfall-runoff simulation has received limited attention. To address these challenges, this study proposes a hybrid rainfall-runoff model that integrates initial loss and LSTM. This integration harnesses the strengths of both physically-based and data-driven approaches, offering the potential for substantial advancements in accurately predicting and managing rainfall-induced runoff events.

The main objectives of this study are: 1) to develop the Ia-LSTM hybrid model, combining the advantages of the widely used hydrologic model, HEC-HMS, with the predictive capabilities of LSTM; 2) to conduct a comprehensive evaluation of the performance of the proposed hybrid model against the individual HEC-HMS and LSTM models. To assess the model’s effectiveness, a case study is undertaken in the Yufuhe Basin, located in Jinan City, Shandong Province. The integration of the HEC-HMS model with LSTM enables a more comprehensive representation of the rainfall-runoff process, considering both the physical processes and historical data patterns. The incorporation of initial loss estimation and LSTM aims to improve the accuracy and reliability of runoff forecasting.

The contributions of this paper can be summarized as follows. First, it introduces the Ia-LSTM model, a novel rainfall-runoff model based on the integration of initial loss and LSTM. Second, the model is applied to the tasks of individual rainfall-runoff modeling in the Yufuhe basin, demonstrating its effectiveness.

The paper is organized as follows: Section 2 provides an overview of the study area and the data utilized. It also briefly describes the HEC-HMS model, LSTM network, and Ia-LSTM hybrid model. Section 3 presents the research results and discussions. Finally, Section 4 concludes the paper by summarizing the key findings.

2 Materials and methods

This section provides an overview of the study area and data (Section 2.1), introduces the HEC-HMS model (Section 2.2), explains the LSTM model structure (Section 2.3), presents the proposed framework based on the LSTM (Section 2.4), and outlines the evaluation metrics of model performance (Section 2.5).

2.1 Study area and data

This study focuses on the Yufuhe basin, located upstream of the Wohushan Reservoir in Jinan city, Shandong Province, China. Encompassing an area of 557 km², the basin exhibits vulnerability to floods and droughts due to its unique natural and geographical conditions. Notably, both 2007 and 2013 witnessed large-scale floods resulting in significant economic losses in Jinan (Zhang et al., 2016). The basin plays a critical role in flood control and water management, featuring diverse topography including mountains, hills, and a complex river network.

The study area is characterized by a sub-humid continental monsoon climate, with an annual average temperature of 14.3°C and an average annual precipitation of 670.0 mm. Rainfall is concentrated within the flood season from June to September, marked by intense, short-duration rainfall events. The flood season accounts for approximately 70% of the annual precipitation, posing flood risks in the basin.

Within the Yufuhe basin, there are seven rain-gauge stations and one Wohushan stream flow gauge station located at the basin outlet. Figure 1 illustrates the location of the watershed, elevation, distribution of rainfall and flow gauging stations, as well as the streams. The land use and land cover (LULC) map for the Yufuhe basin in 2020 was sourced from the Institute of Geographic Sciences and Resources of the Chinese Academy of Sciences (http://www.resdc.cn/), offering a detailed representation at a 30-m resolution. The basin is characterized by abundant vegetation, with agricultural land accounting for approximately 38% and forests covering 35% of the total area (Figure 2).

FIGURE 1. Elevation and distribution of rainfall stations in the study area.

FIGURE 2. Land use map in the study area.

Hourly runoff data from the Wohushan hydrological station and hourly precipitation data from seven gauges were collected from 1973 to 2020. After data preprocessing, 30 rainfall and runoff events, comprising 6,136 hourly rainfall and runoff records, were selected for this study. Among these flood events, 20 were used for model calibration, and the remaining 10 were used for model validation.

2.2 HEC-HMS model

The HEC-HMS model, developed by the U.S. Army Corps of Engineers (USACE), can accurately predict streamflow, runoff volume, and other hydrologic parameters. It incorporates inputs such as land use, soil types, channel networks, and rainfall data. HEC-HMS offers various hydrologic modeling methods, including the Soil Conservation Service (SCS) curve number method, unit hydrograph method, Snyder unit hydrograph method, and others (USACE, 2000). These methods are selected based on the specific characteristics of the modeled watershed.

The HEC-HMS model comprises four main components: the basin model, meteorological model, control specifications, and time series model. The rainfall runoff process is delineated through four modules: loss, transformation, routing, and baseflow. Detailed information on the model’s structure and processes can be found in the Technical Reference Manual (USACE-HEC, 2000) and the User’s Manual of HEC-HMS.

2.2.1 Initial and constant loss method

The initial and constant loss method estimates surface losses in rainfall runoff modeling and is suitable for watersheds with limited soil data. This method requires two parameters: initial loss and constant rate. Initially, all rainfall is absorbed until the specified initial loss volume is attained, after which rainfall is lost at a constant rate. It considers antecedent moisture conditions and losses prior to reaching ultimate infiltration capacity. This method assumes a single soil layer for estimating moisture content changes, making it ideal for event simulation, particularly in data-scarce watersheds. The initial loss is influenced by antecedent moisture conditions and losses before reaching the ultimate infiltration capacity. It is worth noting that the initial loss parameter should be calibrated using observed data, although it is often estimated based on the soil moisture state at the beginning of the simulation and an assumed active layer depth. Throughout the simulation, a constant maximum potential rate of precipitation loss, f_c, is assumed.

The net rainfall, P_et, at time t, is calculated using the following equation (USACE, 2000b):

P_{et} = \begin{cases} 0 & \text{if } \sum P_{i} < I_{a} \\ P_{i} - f_{c} Δt & \text{if } \sum P_{i} > I_{a} \text{ and } P_{i} > f_{c} \\ 0 & \text{if } \sum P_{i} > I_{a} \text{ and } P_{i} < f_{c} \end{cases} (1)

where P_et represents the net rainfall (mm), I_a denotes the initial loss (mm), P_i represents cumulative rainfall from time t to t+Δt (mm), and f_c represents the average infiltration rate (mm/h).

Optimal values of the initial loss and the constant loss rate are determined during the calibration of HEC-HMS model, primarily to match the depths of effective precipitation and direct runoff.
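As an illustration of Eq. 1, the following minimal Python sketch applies the initial and constant loss rule to one event. The function name, argument names, and the example values are hypothetical and are not taken from the study. The sketch follows Eq. 1 literally: the interval in which cumulative rainfall first exceeds I_a is not split, and the case where cumulative rainfall exactly equals I_a (which Eq. 1 leaves undefined) is treated like the "greater than" case.

```python
def net_rainfall(rain_mm, initial_loss_mm, constant_rate_mm_per_h, dt_h=1.0):
    """Net rainfall per interval following Eq. 1 (illustrative sketch).

    rain_mm: rainfall depth in each interval (mm)
    initial_loss_mm: initial loss I_a (mm)
    constant_rate_mm_per_h: constant loss rate f_c (mm/h)
    dt_h: interval length (h)
    """
    excess = []
    cumulative = 0.0
    loss_per_step = constant_rate_mm_per_h * dt_h
    for p in rain_mm:
        cumulative += p
        if cumulative < initial_loss_mm:
            # Initial loss not yet satisfied: all rainfall is absorbed.
            excess.append(0.0)
        elif p > loss_per_step:
            # Initial loss satisfied and rainfall exceeds the constant loss.
            excess.append(p - loss_per_step)
        else:
            # Rainfall is fully consumed by the constant loss.
            excess.append(0.0)
    return excess


# Hypothetical event: I_a = 12 mm, f_c = 1 mm/h, hourly data.
print(net_rainfall([5.0, 10.0, 8.0, 0.5, 0.0], 12.0, 1.0))
```

In practice both parameters would come from the HEC-HMS calibration described above rather than being set by hand.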

2.2.2 Direct runoff calculation

The Snyder unit hydrograph method is used to estimate surface direct runoff resulting from excess precipitation. It utilizes a standardized unit hydrograph incorporating parameters like peak lag time, peak flow, and total duration. These parameters play a crucial role in understanding the hydrological response of a watershed to rainfall events.

The standard unit hydrograph relates rainfall duration (t_r) to basin lag time (t_p) as follows:

t_{p} = 5.5 t_{r} (2)

The Snyder Unit hydrograph method requires specifying input parameters such as the basin lag time (t_p) and peak coefficient (C_p). Peak lag time is calculated using the following formula:

t_{p} = C C_{t} {(L L_{C})}^{0.3} (3)

where L is the length of the main stream from the outlet to the divide (km); L_C is the length along the main stream from the outlet to the point nearest the watershed centroid (km); C_t is a coefficient (usually 1.8–2.2); and C is a conversion constant (0.75 for SI units).
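A short Python sketch of Eqs 2 and 3 may help make the parameter roles concrete. The function names and the numerical inputs are hypothetical; only the formulas and the coefficient ranges come from the text.

```python
def snyder_lag_hours(length_km, centroid_length_km, ct=2.0, c=0.75):
    """Basin lag t_p from Eq. 3: t_p = C * C_t * (L * L_C) ** 0.3."""
    return c * ct * (length_km * centroid_length_km) ** 0.3


def standard_rain_duration_hours(lag_hours):
    """Rainfall duration t_r of the standard unit hydrograph (Eq. 2: t_p = 5.5 t_r)."""
    return lag_hours / 5.5


# Hypothetical sub-basin: L = 30 km, L_C = 14 km, C_t = 2.0.
t_p = snyder_lag_hours(30.0, 14.0, ct=2.0)
t_r = standard_rain_duration_hours(t_p)
print(round(t_p, 2), round(t_r, 2))
```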

2.2.3 Baseflow calculation

Baseflow calculation involves accounting for the flow through a channel or the influence of groundwater in a hydrological system. HEC-HMS offers two methods for baseflow calculation: recession and constant monthly. The recession method, utilized in this study, represents the drainage process from natural storage within a watershed. It employs an exponential decay function (Knebl et al., 2005) to relate the baseflow (Q_t) at a specific time (t) to an initial value (Q₀). The equation is defined as:

Q_{t} = Q_{0} K^{t} (4)

where K represents the exponential decay constant.
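Eq. 4 can be sketched in a few lines of Python. The function name and the sample values are illustrative assumptions, and t is counted in the same time unit for which K is defined.

```python
def recession_baseflow(q0, k, n_steps):
    """Baseflow series Q_t = Q_0 * K**t for t = 0 .. n_steps - 1 (Eq. 4)."""
    return [q0 * k ** t for t in range(n_steps)]


# Hypothetical values: initial baseflow of 10 m3/s and K = 0.5 per step.
print(recession_baseflow(10.0, 0.5, 3))
```

With K below 1 the series decays monotonically, which is the drainage behavior the recession method is meant to represent.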

2.2.4 Flood routing

Flood routing in HEC-HMS provides various options for routing flood hydrographs through different reaches. The Muskingum method is commonly used for general flood routing.

In this study, the Muskingum method is adopted to compute the outflow from each reach during flood routing. This method is based on the following equation:

Q_{j + 1} = C_{1} I_{j + 1} + C_{2} I_{j} + C_{3} Q_{j} (5a)

where

C_{1} = \frac{Δ t - 2 K X}{2 K (1 - X) + Δ t} (5b)

C_{2} = \frac{Δ t + 2 K X}{2 K (1 - X) + Δ t} (5c)

C_{3} = \frac{2 K (1 - X) - Δ t}{2 K (1 - X) + Δ t} (5d)

where C₁, C₂, and C₃ are the routing coefficients for the reach concerned; I_j and I_{j+1} are the inflows to the reach at the beginning and end of the computation interval Δt, respectively; and Q_j and Q_{j+1} are the outflows from the reach at the beginning and end of the computation interval, respectively. K denotes the travel time through the reach, and X is the Muskingum weighting factor (0 ≤ X ≤ 0.5). The coefficients C₁, C₂, and C₃ must satisfy the condition that their sum equals 1.0.
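The routing recursion in Eqs 5a to 5d can be sketched as follows. This is a minimal illustration, not the HEC-HMS implementation: the function names are hypothetical, and the initial outflow is assumed equal to the first inflow unless given.

```python
def muskingum_coefficients(k_h, x, dt_h):
    """Routing coefficients C1, C2, C3 from Eqs 5b to 5d."""
    denom = 2.0 * k_h * (1.0 - x) + dt_h
    c1 = (dt_h - 2.0 * k_h * x) / denom
    c2 = (dt_h + 2.0 * k_h * x) / denom
    c3 = (2.0 * k_h * (1.0 - x) - dt_h) / denom
    return c1, c2, c3


def muskingum_route(inflow, k_h, x, dt_h, initial_outflow=None):
    """Route an inflow hydrograph through one reach using Eq. 5a."""
    c1, c2, c3 = muskingum_coefficients(k_h, x, dt_h)
    # Assumption: the reach starts in steady state unless told otherwise.
    outflow = [inflow[0] if initial_outflow is None else initial_outflow]
    for j in range(len(inflow) - 1):
        outflow.append(c1 * inflow[j + 1] + c2 * inflow[j] + c3 * outflow[j])
    return outflow


# Hypothetical reach: K = 2 h, X = 0.2, hourly time step.
print(sum(muskingum_coefficients(2.0, 0.2, 1.0)))
print(muskingum_route([10.0, 30.0, 60.0, 40.0, 20.0, 10.0], 2.0, 0.2, 1.0))
```

Because the three coefficients sum to 1, a constant inflow is passed through unchanged, which is a quick sanity check on any implementation.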

2.2.5 Parameter optimization methods

Calibrating the parameters of HEC-HMS model is a crucial step for improving the agreement between model results and observed data. The primary objective is to determine the most appropriate parameter values that yield the closest match between computed and observed hydrographs. This involves quantifying the match using an objective function, which compares the simulated and observed flow data. The objective function serves to assess the accuracy of the model’s performance.

To execute parameter calibration, HEC-HMS provides two search methods: the Univariate Gradient algorithm (UG) and the Nelder-Mead algorithm (NM). These algorithms assist in minimizing the objective functions and determining the parameter values that provide the best fit.

In this study, the Peak-Weighted Root Mean Square Error (PWRMSE) function is chosen as the objective function for parameter calibration. The Nelder-Mead algorithm is employed to optimize the model parameters and obtain the most suitable values, ensuring accurate simulation results.
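For orientation, the sketch below shows one way to compute a peak-weighted RMSE. The weighting used here, (O_i + mean(O)) / (2 mean(O)), is the form commonly attributed to the HEC-HMS Technical Reference Manual; it is stated here as an assumption, since the objective function is not written out in this paper. The function name is hypothetical.

```python
import math


def pwrmse(observed, simulated):
    """Peak-weighted RMSE (assumed form): flows above the mean observed
    flow receive weights above 1, flows below the mean receive weights below 1."""
    n = len(observed)
    mean_obs = sum(observed) / n
    total = 0.0
    for o, s in zip(observed, simulated):
        weight = (o + mean_obs) / (2.0 * mean_obs)
        total += weight * (o - s) ** 2
    return math.sqrt(total / n)


# The same 1-unit error is penalized more at the peak than at low flow.
obs = [1.0, 3.0]
print(pwrmse(obs, [1.0, 4.0]), pwrmse(obs, [2.0, 3.0]))
```

An optimizer such as Nelder-Mead would repeatedly run the hydrologic model with trial parameters and minimize this value.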

2.3 Long short-term memory (LSTM) network

The Long Short-Term Memory (LSTM) network was selected due to its exceptional ability to handle extended data sequences, a challenge commonly faced by conventional Recurrent Neural Networks (RNNs) (Hochreiter and Schmidhuber, 1997). In hydrological modeling, where processes like rainfall-runoff relationships exhibit complex temporal patterns, LSTM’s capability to capture long-term dependencies is crucial.

Specifically, LSTM excels in preserving vital information over extended periods, allowing it to accurately model complex water-related processes. This type of deep learning model is designed to address challenges encountered by traditional RNNs, such as gradient exploding or vanishing problems. It achieves this through specialized gate mechanisms that control information flow, proving highly effective in processing sequential data.

The basic unit of the LSTM network includes a memory cell and three types of gates: input gate, forget gate, and output gate. These gates play a crucial role in managing memory and capturing relevant features by controlling information flow within the LSTM unit. Figure 3 provides a visual representation of the structure of an LSTM cell.

FIGURE 3. The structure of an LSTM cell.

The forget gate, represented by f_t, determines how much of the previous cell state c_{t-1} to discard, based on the current input x_t and the previous hidden state h_{t-1}. The input gate, represented by i_t, controls the information to be stored in the cell state c_t. The output gate, represented by o_t, filters the output variable h_t. The equations for the gates are given as follows (Kratzert et al., 2018):

f_{t} = σ (W_{h f} h_{t - 1} + W_{x f} x_{t} + b_{f}) (6)

i_{t} = σ (W_{h i} h_{t - 1} + W_{x i} x_{t} + b_{i}) (7)

o_{t} = σ (W_{h o} h_{t - 1} + W_{x o} x_{t} + b_{o}) (8)

{\tilde{c}}_{t} = \tanh (W_{h c} h_{t - 1} + W_{x c} x_{t} + b_{c}) (9)

c_{t} = f_{t} ⊙ c_{t - 1} + i_{t} ⊙ {\tilde{c}}_{t} (10)

h_{t} = o_{t} ⊙ \tanh (c_{t}) (11)

where x_t denotes the input at time t; f_t, i_t, and o_t are the forget, input, and output gates, respectively; \tilde{c}_t is the candidate cell state; c_t is the cell state at time t; σ is the sigmoid function; ⊙ denotes the element-wise multiplication of two vectors; b_f, b_i, b_o, and b_c are the corresponding biases; W_hf, W_xf, W_hi, W_xi, W_ho, W_xo, W_hc, and W_xc are the network weight matrices; tanh is the hyperbolic tangent function; and h_{t-1} is the hidden state output of the previous step.
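To make Eqs 6 to 11 concrete, the following Python sketch performs one forward step of an LSTM cell with a single input and a single hidden unit, so that every weight is a scalar. This is an illustrative simplification and not the network trained in this study; the dictionary keys mirror the symbols in the equations, and the weight values are arbitrary.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x_t, h_prev, c_prev, w):
    """One LSTM step for scalar input and scalar hidden state (Eqs 6 to 11)."""
    f_t = sigmoid(w["W_hf"] * h_prev + w["W_xf"] * x_t + w["b_f"])        # Eq. 6
    i_t = sigmoid(w["W_hi"] * h_prev + w["W_xi"] * x_t + w["b_i"])        # Eq. 7
    o_t = sigmoid(w["W_ho"] * h_prev + w["W_xo"] * x_t + w["b_o"])        # Eq. 8
    c_tilde = math.tanh(w["W_hc"] * h_prev + w["W_xc"] * x_t + w["b_c"])  # Eq. 9
    c_t = f_t * c_prev + i_t * c_tilde                                    # Eq. 10
    h_t = o_t * math.tanh(c_t)                                            # Eq. 11
    return h_t, c_t


# Arbitrary illustrative weights (not trained values).
weights = {
    "W_hf": 0.1, "W_xf": 0.5, "b_f": 1.0,
    "W_hi": 0.2, "W_xi": 0.4, "b_i": 0.0,
    "W_ho": 0.3, "W_xo": 0.6, "b_o": 0.0,
    "W_hc": 0.1, "W_xc": 0.9, "b_c": 0.0,
}
h, c = 0.0, 0.0
for x in [0.0, 0.2, 0.8, 0.3]:  # a short normalized rainfall sequence
    h, c = lstm_step(x, h, c, weights)
print(h, c)
```

In a full network the scalars become matrices and vectors and the products become matrix-vector products, but the order of operations is the same.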

To train the LSTM model, it is crucial to configure the hyperparameters that govern the training process (Tian et al., 2018). Several hyperparameters, including learning rate, loss function, optimizer, dropout rate, batch size, and number of epochs, were tested and evaluated to determine the optimal values that give the best evaluation metrics. The final selected hyperparameters were as follows: a time step of 10, 256 neurons in the hidden layer, dropout rate of 0.20, and a batch size of 32. The Root Mean Square prop (RMSprop) optimizer with a decay coefficient of 0.8 and a learning rate of 0.0001 was utilized for model training. The training process involved 1000 iterations. The mean squared error (MSE) served as the loss function, measuring the average squared difference between the predicted values and the actual values.

To ensure accurate data analysis and enhance the efficiency and performance of the model, it is essential to preprocess the input data and map their attribute values to the range [0, 1]. Normalizing the input variables eliminates the influence of magnitude, thereby improving the accuracy and efficiency of network learning.

In this study, the rainfall and runoff data were preprocessed using the min-max normalization method, defined by Eq. 12:

x_{norm} = \frac{x_{i} - x_{\min}}{x_{\max} - x_{\min}} (12)

where $x_{norm}$, x_i, x_min, and x_max represent the normalized, observed, minimum, and maximum values of rainfall or runoff, respectively. This normalization process ensures that the input variables are scaled appropriately and enables effective analysis and learning by the network.
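A minimal Python sketch of Eq. 12 and its inverse is given below. The function names are hypothetical, and mapping a constant series to zeros is an assumption made here to avoid division by zero; the paper does not discuss that case.

```python
def min_max_normalize(values):
    """Scale values to [0, 1] using Eq. 12; returns (scaled, minimum, maximum)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant series carries no information, map it to 0.
        return [0.0 for _ in values], lo, hi
    return [(v - lo) / (hi - lo) for v in values], lo, hi


def denormalize(scaled, lo, hi):
    """Map network outputs back to physical units."""
    return [s * (hi - lo) + lo for s in scaled]


scaled, lo, hi = min_max_normalize([2.0, 4.0, 6.0])
print(scaled)
print(denormalize(scaled, lo, hi))
```

The minimum and maximum should be taken from the calibration data and reused for validation data so that both are scaled consistently.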

2.4 Ia-LSTM hybrid model

This study proposes an Ia-LSTM model to improve the accuracy of hourly runoff discharge predictions using LSTM. The model incorporates the HEC-HMS model for dataset generation, using the effective rainfall data series obtained by subtracting the initial loss (Ia) from the total rainfall data. By considering the influence of Ia, the LSTM model is trained to predict flow discharge sequences, resulting in improved precision in rainfall-runoff predictions. Figure 4 illustrates the overall workflow of the Ia-LSTM hybrid model. The Ia-LSTM hybrid model optimizes the determination of Ia using HEC-HMS and considers factors such as infiltration, vegetation interception, and evaporation that impact rainfall-runoff dynamics.

FIGURE 4. The flowchart of Ia-LSTM Hybrid Model.

The development of the Ia-LSTM hybrid model involves the following steps:

(1) Data preparation: Historical rainfall-runoff data for the study area are collected and organized into rainfall-runoff data sequences.

(2) Dataset generation: The HEC-HMS model is used to optimize and accurately estimate the initial loss (Ia) by considering factors such as rainfall-runoff, land use, soil type, and DEM data. The effective rainfall data is derived by subtracting Ia from the total rainfall. This step involves generating a dataset comprising effective rainfall-runoff pairs.

(3) LSTM model construction and training: The LSTM model is constructed, with effective rainfall data serving as the input variable and the corresponding runoff data as the output. The model is trained to capture the hidden mapping relationship between the inputs and outputs. Throughout the training process, various parameter combinations are explored to identify the optimal settings that enhance performance and efficiency.

(4) LSTM model forecasting: The trained LSTM model is used to predict runoff by inputting the effective rainfall data sequence. As a reference, the LSTM model is also trained on the original rainfall-runoff sequence. The performance of the LSTM rainfall-runoff prediction model, accounting for Ia, is evaluated and compared.
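Steps (2) and (3) can be sketched in Python as follows. The details are assumptions made for illustration: Ia is assumed to be removed from the start of the event until it is exhausted, and each training sample is assumed to pair a window of effective rainfall with the runoff at the end of that window. The window length of 10 matches the time step reported in Section 2.3; the function names and data are hypothetical.

```python
def subtract_initial_loss(rain_mm, initial_loss_mm):
    """Effective rainfall after removing Ia from the beginning of the event."""
    remaining = initial_loss_mm
    effective = []
    for p in rain_mm:
        absorbed = min(p, remaining)   # rainfall taken up by the initial loss
        remaining -= absorbed
        effective.append(p - absorbed)
    return effective


def make_windows(inputs, targets, time_steps=10):
    """Build (input window, target) pairs for sequence-model training."""
    samples = []
    for end in range(time_steps, len(inputs) + 1):
        samples.append((inputs[end - time_steps:end], targets[end - 1]))
    return samples


# Hypothetical event with Ia = 12 mm.
rain = [5.0, 10.0, 8.0, 0.0, 2.0, 6.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
flow = [1.0, 1.0, 2.0, 5.0, 9.0, 8.0, 7.0, 6.0, 4.0, 3.0, 2.0, 1.5]
effective = subtract_initial_loss(rain, 12.0)
pairs = make_windows(effective, flow, time_steps=10)
print(effective[:3], len(pairs))
```

The reference LSTM mentioned in step (4) would use the same windowing on the original rainfall series, so the two models differ only in their input.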

2.5 Evaluation metrics of model performance

The performance of the developed models is assessed using four metrics widely used in hydrological studies: Nash-Sutcliffe efficiency (NSE), root mean square error (RMSE), relative error of peak discharge (REP), and coefficient of determination (R²).

NSE is extensively used for evaluating rainfall-runoff simulation (Kumar et al., 2016). It quantifies the agreement between simulated and observed data by comparing their variances. NSE is calculated using the following formula:

\mathrm{NSE} = 1 - \frac{\sum_{i = 1}^{N} {(O_{i} - P_{i})}^{2}}{\sum_{i = 1}^{N} {(O_{i} - \bar{O})}^{2}} (13)

where O_i and P_i represent the observed and predicted runoff at time step i, respectively; $\bar{O}$ is the average observed runoff; and N is the total number of observations. NSE ranges from −∞ to 1, with 1 indicating a perfect match between the predictions and observations.
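Eq. 13 translates directly into a few lines of Python. The function name and sample series are illustrative only.

```python
def nse(observed, predicted):
    """Nash-Sutcliffe efficiency (Eq. 13)."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance


obs = [1.0, 2.0, 3.0]
print(nse(obs, obs))              # perfect prediction
print(nse(obs, [2.0, 2.0, 2.0]))  # predicting the observed mean everywhere
```

A model that always predicts the observed mean scores exactly 0, so positive values indicate skill beyond that baseline.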

Root mean square error (RMSE) measures the effectiveness of the model and is the square root of the average squared difference between simulated and observed values. RMSE is used to represent the model’s ability to predict flood events. RMSE can be calculated by:

\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i = 1}^{N} {(O_{i} - P_{i})}^{2}} (14)

A lower RMSE indicates better model simulation performance, with an RMSE of 0 indicating an exact match between simulated and observed values.
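A corresponding Python sketch of Eq. 14 follows; the function name and inputs are illustrative.

```python
import math


def rmse(observed, predicted):
    """Root mean square error (Eq. 14), in the units of the input series."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


print(rmse([0.0, 0.0], [3.0, 4.0]))
```

Unlike NSE, RMSE carries the units of discharge, so values are only comparable between events of similar magnitude.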

REP assesses the accuracy and uncertainty associated with peak discharge estimation. It is calculated as:

\mathrm{REP} = \frac{|O_{p} - P_{p}|}{O_{p}} \times 100\% (15)

where O_p and P_p represent the observed and predicted peak river flow discharge, respectively. A lower REP value indicates better performance, indicating that the model’s predictions are closer to the actual observed results.
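Eq. 15 can be sketched as below. Taking the peak as the maximum of each series is an assumption that fits single-peak events; the function names and numbers are hypothetical.

```python
def rep(observed_peak, predicted_peak):
    """Relative error of peak discharge in percent (Eq. 15)."""
    return abs(observed_peak - predicted_peak) / observed_peak * 100.0


def rep_from_series(observed, predicted):
    """REP using the maximum of each hydrograph as its peak."""
    return rep(max(observed), max(predicted))


print(rep(100.0, 92.0))
print(rep_from_series([5.0, 40.0, 20.0], [6.0, 30.0, 44.0]))
```

Note that REP compares peak magnitudes only; a peak that arrives at the wrong time is captured by the peak time error instead.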

The coefficient of determination (R²) quantifies the degree of correlation between the simulated and observed runoff (Kumarasamy and Belmont, 2018). It is calculated using the formula:

R^{2} = {(\frac{\sum_{i = 1}^{N} (O_{i} - \bar{O}) (P_{i} - \bar{P})}{\sqrt{\sum_{i = 1}^{N} {(O_{i} - \bar{O})}^{2}} \sqrt{\sum_{i = 1}^{N} {(P_{i} - \bar{P})}^{2}}})}^{2} (16)

where $\bar{O}$ and $\bar{P}$ represent the average values of observed and predicted runoff, and O_i and P_i are the observed and predicted runoff at time step i, respectively. R² values range from 0 to 1, with higher values indicating a better fit between the model outputs and the target outputs and greater predictive power. An R² equal to 1 denotes an ideal fit. Model performance is categorized as very good (0.7 < R² < 1), good (0.6 < R² < 0.7), satisfactory (0.5 < R² < 0.6), or unsatisfactory (R² < 0.5) (Ayele et al., 2017).
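Eq. 16 is the squared Pearson correlation, which the following Python sketch computes directly. The function name and data are illustrative.

```python
import math


def r_squared(observed, predicted):
    """Coefficient of determination as the squared Pearson correlation (Eq. 16)."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return (cov / (math.sqrt(var_o) * math.sqrt(var_p))) ** 2


print(r_squared([1.0, 2.0, 3.0, 4.0], [1.0, 3.0, 2.0, 4.0]))
print(r_squared([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
```

The second call shows that predictions that are exactly twice the observations still give an R² of essentially 1, because this form measures linear association rather than agreement. That is one reason to report it alongside NSE and RMSE.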

These metrics collectively provide a comprehensive assessment of the model’s performance, encompassing simulation quality, accuracy of peak discharge predictions, and the correlation with observed data.

3 Results and discussion

3.1 Estimation of initial loss

To accurately estimate initial losses, the Yufuhe basin was divided into sub-basins (S1, S2, S3, S4, S5) as shown in Figure 5. Table 1 provides key characteristics of these sub-basins, including their areas, average slopes, and stream lengths. This subdivision allowed for a precise assessment of initial losses. The initial and constant loss method requires the specification of parameters including the percent impervious area, initial loss (Ia), and constant loss rate. The Thiessen polygon method was employed to estimate the average rainfall for the entire watershed, and specific runoff parameters for each sub-basin were determined.
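The Thiessen polygon estimate is an area-weighted mean of the gauge readings, which can be sketched as follows. The gauge values and polygon areas are hypothetical and do not correspond to the seven stations in the Yufuhe basin.

```python
def thiessen_mean_rainfall(gauge_rain_mm, polygon_area_km2):
    """Area-weighted mean rainfall, one Thiessen polygon per gauge."""
    total_area = sum(polygon_area_km2)
    weighted = sum(p * a for p, a in zip(gauge_rain_mm, polygon_area_km2))
    return weighted / total_area


# Two hypothetical gauges whose polygons cover 1 km2 and 3 km2.
print(thiessen_mean_rainfall([10.0, 20.0], [1.0, 3.0]))
```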

FIGURE 5. Divisions of the Yufuhe basin.

TABLE 1. Key characteristics of sub-basins in the Yufuhe basin.

The value of initial loss (Ia) depends on the topography and land use conditions within the watershed. Typically, it is set at 10%–20% of the total rainfall for forested areas. In this study, based on the soil and land use characteristics, the Ia value was determined as 30 mm, while the constant loss rate ranged from 0.30 mm/h to 1.16 mm/h.

The optimization procedure involved using a search method to minimize an objective function and find optimal parameters. To determine the optimal values of Ia for different flood events, the parameters were optimized using the Nelder-Mead optimization algorithm, with the peak-weighted root mean square error (PWRMSE) as the objective function. The resulting optimized values are presented in Table 2. Analysis of Table 2 leads to the following conclusions regarding the relationship between rainfall and initial loss values:

TABLE 2. Optimization of initial loss (Ia) values for different flood events.

Different sub-basins exhibit varying initial loss values for the same flood event. For example, in the flood event on 19730715 with a rainfall of 101 mm, the corresponding initial loss values for the sub-basins are as follows: S1-19.1 mm, S2-21.5 mm, S3-18.5 mm, S4-21.3 mm, and S5-18.3 mm.

The magnitude of initial loss values is not solely determined by the amount of rainfall. Other factors, such as the surface condition, rainfall characteristics, topography and slope, soil type and moisture content, and antecedent rainfall, play a significant role in determining initial loss values. These factors interact and collectively influence the extent of initial rainfall loss.

There is no clear linear relationship between rainfall and initial loss values in Table 2. This suggests that the estimation of initial loss values cannot solely rely on the amount of rainfall. Instead, a comprehensive understanding and consideration of the various factors influencing initial loss is necessary for accurate estimation.

3.2 Performance comparison

Flood events No. 20000809 and 20050918 were selected for model calibration, while flood events 20130918 and 20190815 were used for model validation. Figure 6 presents a comparison of simulated and observed discharges for four flood events in the Yufuhe basin using three models: HEC-HMS, LSTM, and Ia-LSTM. The figure shows that these models can generally capture the overall runoff process during the rainfall-runoff forecasting. However, some discrepancies exist in accurately simulating localized peak values. Despite this, the predicted values exhibit consistent trends with the observed values.

FIGURE 6. Comparison of observed and simulated discharge using three models: (A) flood No. 20000809, (B) flood No. 20050918, (C) flood No. 20130723, (D) flood No. 20190815.

Regarding the comparison of relative error of peak discharge (REP) for the three models, Table 3 shows that different models exhibit varying performance in simulating peak discharge for each flood event. The LSTM model exhibits relatively large relative errors, particularly exceeding 20% for the flood event on 20130918. In contrast, both the Ia-LSTM and HEC-HMS models demonstrate significantly smaller relative errors, with all four flood events falling within the acceptable range. Notably, the Ia-LSTM model outperforms the other models, with a mere 1.3% error for the peak discharge during the flood event on 20190815. On average, the HEC-HMS model has a relative error of 9.8%, while the Ia-LSTM model has 8.1% for the peak discharge across all four flood events. These findings highlight the superior performance of the Ia-LSTM model in simulating peak discharge.

TABLE 3. Comparison of relative error of peak discharge (REP) for three models.
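As a rough illustration of how a peak discharge error of this kind can be computed, the sketch below assumes REP is the absolute difference between simulated and observed peak discharge, expressed as a percentage of the observed peak; the definition actually used in the study is the one given in Section 2.5. The function name and the sample series are hypothetical and are not data from the Yufuhe basin.

```python
def relative_error_of_peak(observed, simulated):
    """Relative error of peak discharge (REP), in percent.

    Assumed definition: |Qp_sim - Qp_obs| / Qp_obs * 100.
    """
    peak_obs = max(observed)
    peak_sim = max(simulated)
    if peak_obs == 0:
        raise ValueError("observed peak discharge must be non-zero")
    return abs(peak_sim - peak_obs) / peak_obs * 100.0


# Hypothetical hourly discharges in m³/s (illustrative only).
observed = [5.0, 12.0, 40.0, 25.0, 10.0]
simulated = [6.0, 14.0, 38.0, 27.0, 9.0]
rep = relative_error_of_peak(observed, simulated)
print(f"REP = {rep:.1f}%")
```

Because only the two peak values enter the calculation, REP says nothing about when the peak occurs, which is why peak time is reported separately.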

Table 4 presents the errors in peak time for different flood events predicted by the HEC-HMS, LSTM, and Ia-LSTM models. For the flood event on 20190815, the LSTM model exhibited a peak time error of 2 h, and for the flood event on 20130918, both the HEC-HMS and LSTM models had a peak time error of 1 h. In contrast, the Ia-LSTM model achieved accurate peak time predictions for three out of the four flood events, with a maximum peak time error of 1 h. Notably, the Ia-LSTM model outperformed the other models in reproducing the timing of the peak discharge.

TABLE 4. Comparison of errors in peak time.
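A peak time error like those in Table 4 can be sketched as the offset between the time steps at which the observed and simulated hydrographs reach their maxima. The snippet below is a minimal sketch under the assumption of a regular time step (1 h by default); the function name, the sign convention (simulated minus observed), and the sample values are illustrative choices, not taken from the study.

```python
def peak_time_error(observed, simulated, step_hours=1.0):
    """Signed peak time error in hours (simulated minus observed).

    Assumes both series share the same regular time step.
    """
    # Index of the first maximum in each series.
    t_obs = max(range(len(observed)), key=observed.__getitem__)
    t_sim = max(range(len(simulated)), key=simulated.__getitem__)
    return (t_sim - t_obs) * step_hours


# Hypothetical hourly discharges in m³/s (illustrative only).
observed = [5.0, 12.0, 40.0, 25.0, 10.0]
simulated = [5.0, 10.0, 30.0, 39.0, 12.0]
lag = peak_time_error(observed, simulated)
print(f"peak time error = {lag:.0f} h")
```

A positive value means the simulated peak arrives late; taking the absolute value gives an unsigned error.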

Table 5 presents a comprehensive comparison of the three models (HEC-HMS, LSTM, and Ia-LSTM) based on key performance metrics: Nash-Sutcliffe Efficiency (NSE), Root Mean Square Error (RMSE), and Coefficient of Determination (R²). During the flood event on 20000809, all models performed well, with NSE coefficients above 0.86, RMSE values ranging from 7.234 to 14.503 m³/s, and R² coefficients exceeding 0.90. The HEC-HMS model showed heightened accuracy in predicting the flood event on 20190815, potentially due to its detailed consideration of the rainfall process. The Ia-LSTM model performed well across the flood events, with NSE coefficients ranging from 0.755 to 0.923, RMSE values between 2.314 and 7.234 m³/s, and R² coefficients from 0.798 to 0.941. Importantly, the Ia-LSTM model consistently outperformed the LSTM model, highlighting its effectiveness in flood prediction and modeling.

TABLE 5. Comparison of NSE, RMSE and R².
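For readers who want to reproduce metrics of this kind on their own series, the following sketch uses the standard textbook definitions of NSE and RMSE and assumes R² is the squared Pearson correlation between observed and simulated discharge; the exact formulations used in the study are those of Section 2.5. Function names and the sample data are hypothetical.

```python
import math


def nse(obs, sim):
    """Nash-Sutcliffe Efficiency: 1 - SSE / variance of observations."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
    spread = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / spread


def rmse(obs, sim):
    """Root mean square error, in the units of the input series."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))


def r_squared(obs, sim):
    """Squared Pearson correlation (assumed definition of R²)."""
    mean_obs = sum(obs) / len(obs)
    mean_sim = sum(sim) / len(sim)
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim))
    var_obs = sum((o - mean_obs) ** 2 for o in obs)
    var_sim = sum((s - mean_sim) ** 2 for s in sim)
    return cov ** 2 / (var_obs * var_sim)


# Hypothetical series: the simulation is the observation shifted by +1 m³/s.
obs = [1.0, 2.0, 3.0, 4.0]
sim = [2.0, 3.0, 4.0, 5.0]
print(nse(obs, sim), rmse(obs, sim), r_squared(obs, sim))
```

In this toy case the constant bias lowers NSE and raises RMSE while leaving R² at 1, which illustrates why the three metrics are usually reported together.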

Across the flood events, the Ia-LSTM model generally shows lower RMSE and higher NSE and R² values. This highlights its effectiveness in flood prediction, especially compared to the LSTM model, and emphasizes the importance of incorporating initial loss for accurate simulations.

3.3 Impact of initial loss

The analysis of initial loss in the proposed hybrid model provides valuable insights for improving rainfall-runoff predictions. By integrating initial loss estimation with LSTM neural networks, the Ia-LSTM model captures the complex interactions among various hydrological components, including rainfall, vegetation, soil, and runoff. This integration allows for a more comprehensive representation of the rainfall-runoff process, leveraging the strengths of physically-based and data-driven modeling approaches.

The results consistently demonstrate the superiority of the Ia-LSTM hybrid model over the individual HEC-HMS and LSTM models in estimating peak discharge, predicting peak time, and achieving higher NSE, lower RMSE, and greater R² values. The incorporation of initial loss estimation enhances the model’s ability to simulate runoff dynamics, which leads to improved accuracy and reliability. In the Yufuhe basin case study, the Ia-LSTM model demonstrates an average improvement of 6.05% and 13.7% in peak discharge estimation compared to the HEC-HMS and LSTM models, respectively.

These findings emphasize the importance of accurate initial loss estimation in rainfall-runoff modeling, particularly for flood management and forecasting. Accurate initial loss estimation provides a clearer understanding of the initial loss processes and their impact on runoff generation. Through the optimization of initial loss values obtained from the HEC-HMS model, the Ia-LSTM model achieves heightened accuracy and reliability in simulating rainfall-runoff dynamics.
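The role of initial loss as a preprocessing step can be sketched as follows. The study states that effective rainfall is obtained by subtracting the initial loss from total rainfall; how that loss is distributed over time is an assumption here, namely that all rainfall is absorbed until the accumulated depth equals the initial loss, in the spirit of an initial-and-constant loss scheme with the constant loss rate ignored. The function name and the sample hyetograph are hypothetical.

```python
def effective_rainfall(rainfall, initial_loss):
    """Remove the initial loss from the start of a rainfall series.

    rainfall: depths per time step (mm); initial_loss: total depth (mm).
    Assumption: rainfall is fully absorbed until the initial loss is met.
    """
    remaining = initial_loss
    effective = []
    for depth in rainfall:
        absorbed = min(depth, remaining)  # portion taken by initial loss
        remaining -= absorbed
        effective.append(depth - absorbed)
    return effective


# Hypothetical hourly rainfall (mm) and initial loss (mm).
rain = [2.0, 5.0, 10.0, 4.0]
effective = effective_rainfall(rain, initial_loss=8.0)
print(effective)
```

The resulting series, rather than the raw rainfall, would then serve as the input sequence to the LSTM network.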

3.4 Comparison with previous studies

The Ia-LSTM hybrid model represents a significant advancement in rainfall-runoff modeling. Previous studies in the Yufuhe basin have employed various methodologies and models. For instance, Zhang et al. (2016) developed a distributed flood forecasting model based on sub-basins, river reaches, and reservoirs, achieving high performance with a Nash-Sutcliffe Efficiency (NSE) exceeding 0.70 and a Relative Error of Peak Discharge (REP) below 10%. Similarly, Yang et al. (2013) focused on the application of the SWAT distributed hydrological model, yielding satisfactory results with NSE and R² exceeding 0.70, and a relative error in peak flow below 15%. Their work highlights the effectiveness of their model in capturing key influencing factors of floods within the Yufuhe basin.

In recent years, machine learning models, particularly those based on LSTM, have exhibited promise in runoff forecasting. For instance, Xiang et al. (2020) proposed an LSTM-sequence-to-sequence rainfall-runoff model, demonstrating notable predictive power for short-term flood predictions. The LSTM model produced NSE values of 0.72, 0.80, and 0.93 for the Tripoli, Independence, and Anamosa stations, respectively. Additionally, an LSTM network was applied to build a data-driven model for streamflow prediction in an urban watershed.

While deep learning algorithms may not fully capture the rainfall-runoff process, they can be used to discern streamflow patterns and to identify effective variables, making them the preferred choice for modeling in data-poor catchments.

In our study, the Ia-LSTM model outperforms previous models, exhibiting NSE coefficients ranging from 0.755 to 0.923, RMSE values between 2.314 and 7.234 m³/s, and R² coefficients from 0.798 to 0.941. These results signify substantial advancements in rainfall-runoff modeling.

This research builds upon earlier works by incorporating initial loss estimation and utilizing the powerful Ia-LSTM hybrid model. This approach significantly enhances accuracy and reliability in simulating rainfall-runoff dynamics, particularly in terms of estimating peak discharge and predicting peak time.

The findings of this study have important implications for flood forecasting and water resource management. The Ia-LSTM hybrid model demonstrates superior performance in simulating peak discharge and predicting peak time compared to individual HEC-HMS and LSTM models. This suggests its potential for accurate and reliable rainfall-runoff modeling, which is crucial for disaster prevention, mitigation, and water resource management.

Additionally, the integration of initial loss estimation with LSTM neural networks represents a significant advancement in rainfall-runoff modeling. This approach captures complex interactions among various hydrological components, providing a more comprehensive representation of the rainfall-runoff process.

The Ia-LSTM hybrid model shows promise for a wide range of applications, including flood forecasting, water resource management, and infrastructure planning. Its effectiveness in data-driven rainfall-runoff modeling with integrated physical mechanisms can significantly enhance the efficiency of flood prediction and management.

4 Conclusion

This study presents a hybrid rainfall-runoff model combining initial loss estimation with LSTM networks, significantly enhancing runoff forecasting accuracy. Effective rainfall, obtained by subtracting the initial loss optimized through HEC-HMS simulations from total rainfall, was used as the input for the LSTM network. The Ia-LSTM hybrid model, integrating physically-based and data-driven modeling approaches, outperforms both individual HEC-HMS and LSTM models, as evidenced by case studies in the Yufuhe basin.

The integration of physically-based and data-driven modeling techniques in the Ia-LSTM hybrid model offers a comprehensive representation of the rainfall-runoff process. This integration significantly improves the model’s ability to capture the complex dynamics of rainfall-runoff, resulting in enhanced peak discharge estimation. The optimized initial loss values derived from the HEC-HMS model contribute to the increased accuracy of the Ia-LSTM model.

The case studies conducted in the Yufuhe basin demonstrate the effectiveness of the Ia-LSTM model in simulating peak discharge and accurately predicting peak time for the flood events. The performance of the Ia-LSTM model was evaluated with Nash-Sutcliffe Efficiency (NSE), root mean square error (RMSE), relative error of peak discharge (REP) and coefficient of determination (R²). The Ia-LSTM model shows an average improvement of 6.05% and 13.7% in peak discharge estimation compared to the HEC-HMS and LSTM models, respectively. The model achieves NSE values ranging from 0.755 to 0.923, RMSE values between 2.314 and 7.234 m³/s, and R² coefficients from 0.798 to 0.941. This demonstrates the strong performance of the Ia-LSTM model across the flood events, as indicated by lower RMSE and higher NSE and R² values.

These findings highlight the importance of accurate initial loss estimation and the potential of hybrid modeling approaches in improving rainfall-runoff predictions. Accurate estimation of initial loss enables a better understanding of the runoff generation process and its influence on peak discharge. The integration of initial loss estimation with LSTM in the hybrid model contributes to its superior performance in simulating peak discharge and capturing the temporal pattern of peak flow propagation. These findings offer promise for enhancing the accuracy and reliability of hydrological forecasting models.

While LSTM has been effective in rainfall-runoff forecasting, there is room for improvement. Extending the output sequence length using historical rainfall-runoff data could enhance longer-term predictions.

Simplifying the complex process of initial loss estimation, which currently relies on HEC-HMS, is crucial. Future research can explore efficient techniques like the SCS curve number method, considering factors such as soil type, pre-rainfall soil moisture, and the CN parameter. Such a streamlined approach would make initial loss estimation more practical and applicable in real-world scenarios.
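To indicate what such a simplification might look like, the sketch below applies the conventional SCS curve number relations in metric units: potential maximum retention S = 25400/CN − 254 (mm) and initial abstraction Ia = λS. The ratio λ = 0.2 is the commonly quoted default and is an assumption here, as are the function name and the example CN value; none of these were calibrated or tested in this study.

```python
def scs_initial_abstraction(cn, ratio=0.2):
    """Initial abstraction Ia (mm) from a curve number.

    S = 25400 / CN - 254 (mm); Ia = ratio * S.
    ratio = 0.2 is the conventional default (assumption, not calibrated).
    """
    if not 0 < cn <= 100:
        raise ValueError("curve number must be in (0, 100]")
    retention = 25400.0 / cn - 254.0  # potential maximum retention S, mm
    return ratio * retention


# Hypothetical curve number for illustration only.
ia = scs_initial_abstraction(80)
print(f"Ia = {ia:.1f} mm")
```

In practice the curve number would need adjusting for antecedent moisture conditions, which is one of the factors mentioned above.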

Data availability statement

The raw data supporting the conclusion of this article will be made available by the authors, without undue reservation.

Author contributions

WW: Writing–original draft. JG: Data curation, Formal Analysis, Writing–review and editing. ZL: Investigation, Writing–review and editing. CL: Funding acquisition, Methodology, Writing–review and editing.

Funding

The author(s) declare financial support was received for the research, authorship, and/or publication of this article. Natural Science Foundation of Shandong Province, China (ZR2021ME030), Special Project for Sustainable Development of Shenzhen Science and Technology Innovation Committee (KCXFZ20201221173407021), and Jinan Water Science and Technology Project (JNSWKJ202105) provided support for this study.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

Arnold, J. G., Srinivasan, R., Muttiah, R. S., and Williams, J. R. (1998). Large area hydrologic modeling and assessment part I: model development 1. JAWRA J. Am. Water Resour. Assoc. 34 (1), 73–89. doi:10.1111/j.1752-1688.1998.tb05961.x

Barrera-Animas, A. Y., Oyedele, L. O., Bilal, M., Akinosho, T. D., Delgado, J. M. D., and Akanbi, L. A. (2022). Rainfall prediction: a comparative analysis of modern machine learning algorithms for time-series forecasting. Mach. Learn. Appl. 7, 100204. doi:10.1016/j.mlwa.2021.100204

Bartoletti, N., Casagli, F., Marsili-Libelli, S., Nardi, A., and Palandri, L. (2018). Data-driven rainfall/runoff modelling based on a neuro-fuzzy inference system. Environ. Model. Softw. 106, 35–47. doi:10.1016/j.envsoft.2017.11.026

Beven, K. (2020). Deep learning, hydrological processes and the uniqueness of place. Hydrol. Process. 34 (16), 3608–3613. doi:10.1002/hyp.13805

Bicknell, B. R., Imhoff, J. C., Kittle, J. L., Donigian, A. S., and Johanson, R. C. (1997). Hydrological simulation program—FORTRAN user’s manual for version 11. Report No. EPA/600/R-97/080. Athens, GA, USA: US Environmental Protection Agency.

Breiman, L. (2001). Random forests. Mach. Learn. 45, 5–32. doi:10.1023/a:1010933404324

Chen, C., Jiang, J., Liao, Z., Zhou, Y., Wang, H., and Pei, Q. (2022). A short-term flood prediction based on spatial deep learning network: a case study for Xi County, China. J. Hydrology 607, 127535. doi:10.1016/j.jhydrol.2022.127535

Chen, J., and Adams, B. J. (2006). Integration of artificial neural networks with conceptual models in rainfall-runoff modeling. J. Hydrology 318 (1-4), 232–249. doi:10.1016/j.jhydrol.2005.06.017

Cortes, C., and Vapnik, V. (1995). Support-vector networks. Mach. Learn. 20, 273–297. doi:10.1007/bf00994018

Cui, Z., Zhou, Y., Guo, S., Wang, J., Ba, H., and He, S. (2021). A novel hybrid XAJ-LSTM model for multi-step-ahead flood forecasting. Hydrology Res. 52 (6), 1436–1454. doi:10.2166/nh.2021.016

Danandeh Mehr, A., and Nourani, V. (2018). Season algorithm-multigene genetic programming: a new approach for rainfall-runoff modelling. Water Resour. Manag. 32, 2665–2679. doi:10.1007/s11269-018-1951-3

Devia, G. K., Ganasri, B. P., and Dwarakish, G. S. (2015). A review on hydrological models. Aquat. procedia 4, 1001–1007. doi:10.1016/j.aqpro.2015.02.126

Farfán, J. F., Palacios, K., Ulloa, J., and Avilés, A. (2020). A hybrid neural network-based technique to improve the flow forecasting of physical and data-driven models: methodology and case studies in Andean watersheds. J. Hydrology Regional Stud. 27, 100652. doi:10.1016/j.ejrh.2019.100652

Feldman, A. D. (2000). Hydrologic modeling system HEC-HMS: technical reference manual. Washington, D.C, USA: US Army Corps of Engineers, Hydrologic Engineering Center.

Fenicia, F., Savenije, H. H., Matgen, P., and Pfister, L. (2008). Understanding catchment behavior through stepwise model concept improvement. Water Resour. Res. 44 (1), 1–13. doi:10.1029/2006wr005563

Gholami, V., and Khaleghi, M. R. (2021). A simulation of the rainfall-runoff process using artificial neural network and HEC-HMS model in forest lands. J. For. Sci. 67 (4), 165–174. doi:10.17221/90/2020-jfs

Gu, H., Xu, Y. P., Ma, D., Xie, J., Liu, L., and Bai, Z. (2020). A surrogate model for the Variable Infiltration Capacity model using deep learning artificial neural network. J. Hydrology 588, 125019. doi:10.1016/j.jhydrol.2020.125019

Haykin, S., and Network, N. (2004). A comprehensive foundation. Neural Netw. 2, 41.

Hitokoto, M., and Sakuraba, M. (2020). Hybrid deep neural network and distributed rainfall-runoff model for real-time river-stage prediction. J. JSCE 8 (1), 46–58. doi:10.2208/journalofjsce.8.1_46

Hochreiter, S., and Schmidhuber, J. (1997). Long short-term memory. Neural Comput. 9 (8), 1735–1780. doi:10.1162/neco.1997.9.8.1735

Hu, C., Wu, Q., Li, H., Jian, S., Li, N., and Lou, Z. (2018). Deep learning with a long short-term memory networks approach for rainfall-runoff simulation. Water 10 (11), 1543. doi:10.3390/w10111543

Hundecha, Y., Bardossy, A., and Werner, H. W. (2001). Development of a fuzzy logic-based rainfall-runoff model. Hydrological Sci. J. 46 (3), 363–376. doi:10.1080/02626660109492832

Jaber, F. H., and Shukla, S. (2012). MIKE SHE: model use, calibration, and validation. Trans. ASABE 55 (4), 1479–1489. doi:10.13031/2013.42255

Kim, D., Lee, J., Kim, J., Lee, M., Wang, W., and Kim, H. S. (2022). Comparative analysis of long short-term memory and storage function model for flood water level forecasting of Bokha stream in NamHan River, Korea. J. Hydrology 606, 127415. doi:10.1016/j.jhydrol.2021.127415

Knebl, M. R., Yang, Z. L., Hutchison, K., and Maidment, D. R. (2005). Regional scale flood modeling using NEXRAD rainfall, GIS, and HEC-HMS/RAS: a case study for the San Antonio River Basin Summer 2002 storm event. J. Environ. Manag. 75 (4), 325–336. doi:10.1016/j.jenvman.2004.11.024

Kratzert, F., Klotz, D., Brenner, C., Schulz, K., and Herrnegger, M. (2018). Rainfall–runoff modelling using long short-term memory (LSTM) networks. Hydrology Earth Syst. Sci. 22 (11), 6005–6022. doi:10.5194/hess-22-6005-2018

Kumar, P. S., Praveen, T. V., and Prasad, M. A. (2016). Artificial neural network model for rainfall-runoff-A case study. Int. J. Hybrid Inf. Technol. 9 (3), 263–272. doi:10.14257/ijhit.2016.9.3.24

Kumarasamy, K., and Belmont, P. (2018). Calibration parameter selection and watershed hydrology model evaluation in time and frequency domains. Water 10 (6), 710. doi:10.3390/w10060710

Lees, T., Buechel, M., Anderson, B., Slater, L., Reece, S., Coxon, G., et al. (2021). Benchmarking data-driven rainfall–runoff models in Great Britain: a comparison of long short-term memory (LSTM)-based models with four lumped conceptual models. Hydrology Earth Syst. Sci. 25 (10), 5517–5534. doi:10.5194/hess-25-5517-2021

Mehr, A. D., and Akdegirmen, O. (2021). Estimation of urban imperviousness and its impacts on flashfloods in Gazipaşa, Turkey. Knowledge-Based Eng. Sci. 2 (1), 9–17. doi:10.51526/kbes.2021.2.1.9-17

Mohammadi, B., Safari, M. J. S., and Vazifehkhah, S. (2022). IHACRES, GR4J and MISD-based multi conceptual-machine learning approach for rainfall-runoff modeling. Sci. Rep. 12 (1), 12096. doi:10.1038/s41598-022-16215-1

Narayana Reddy, B. S., and Pramada, S. K. (2022). A hybrid artificial intelligence and semi-distributed model for runoff prediction. Water Supply 22 (7), 6181–6194. doi:10.2166/ws.2022.239

Noori, N., and Kalin, L. (2016). Coupling SWAT and ANN models for enhanced daily streamflow prediction. J. Hydrology 533, 141–151. doi:10.1016/j.jhydrol.2015.11.050

Perera, T., McGree, J., Egodawatta, P., Jinadasa, K. B. S. N., and Goonetilleke, A. (2019). Taxonomy of influential factors for predicting pollutant first flush in urban stormwater runoff. Water Res. 166, 115075. doi:10.1016/j.watres.2019.115075

Rahimzad, M., Moghaddam Nia, A., Zolfonoon, H., Soltani, J., Danandeh Mehr, A., and Kwon, H. H. (2021). Performance comparison of an LSTM-based deep learning model versus conventional machine learning algorithms for streamflow forecasting. Water Resour. Manag. 35 (12), 4167–4187. doi:10.1007/s11269-021-02937-w

Safari, M. J. S., Arashloo, S. R., and Mehr, A. D. (2020). Rainfall-runoff modeling through regression in the reproducing kernel Hilbert space algorithm. J. Hydrology 587, 125014. doi:10.1016/j.jhydrol.2020.125014

Savic, D. A., Walters, G. A., and Davidson, J. W. (1999). A genetic programming approach to rainfall-runoff modelling. Water Resour. Manag. 13, 219–231. doi:10.1023/a:1008132509589

Shukla, A. K., Ojha, C. S. P., Garg, R. D., Shukla, S., and Pal, L. (2020). Influence of spatial urbanization on hydrological components of the upper ganga river basin, India. J. Hazard. Toxic, Radioact. Waste 24 (4), 04020028. doi:10.1061/(asce)hz.2153-5515.0000508

Shukla, A. K., Pathak, S., Pal, L., Ojha, C. S. P., Mijic, A., and Garg, R. D. (2018). Spatio-temporal assessment of annual water balance models for upper Ganga Basin. Hydrology Earth Syst. Sci. 22 (10), 5357–5371. doi:10.5194/hess-22-5357-2018

Sun, A. Y., Scanlon, B. R., Zhang, Z., Walling, D., Bhanja, S. N., Mukherjee, A., et al. (2019). Combining physically based modeling and deep learning for fusing GRACE satellite data: can we learn from mismatch? Water Resour. Res. 55 (2), 1179–1195. doi:10.1029/2018wr023333

Tian, Y., Xu, Y. P., Yang, Z., Wang, G., and Zhu, Q. (2018). Integration of a parsimonious hydrological model with recurrent neural networks for improved streamflow forecasting. Water 10 (11), 1655. doi:10.3390/w10111655

Wang, W., and Ding, J. (2003). Purification of boiling-soluble antifreeze protein from the legume Ammopiptanthus mongolicus. Nat. Sci. 1 (1), 67–80. doi:10.1081/PB-120018370

Wang, W., Vrijling, J. K., Van Gelder, P. H., and Ma, J. (2006). Testing for nonlinearity of streamflow processes at different timescales. J. Hydrology 322 (1-4), 247–268. doi:10.1016/j.jhydrol.2005.02.045

Xiang, Z., Yan, J., and Demir, I. (2020). A rainfall-runoff model with LSTM-based sequence-to-sequence learning. Water Resour. Res. 56 (1). doi:10.1029/2019wr025326

Xie, K., Liu, P., Zhang, J., Han, D., Wang, G., and Shen, C. (2021). Physics-guided deep learning for rainfall-runoff modeling by considering extreme events and monotonic relationships. J. Hydrology 603, 127043. doi:10.1016/j.jhydrol.2021.127043

Xie, T., Zhang, G., Hou, J., Xie, J., Lv, M., and Liu, F. (2019). Hybrid forecasting model for non-stationary daily runoff series: a case study in the Han River Basin, China. J. Hydrology 577, 123915. doi:10.1016/j.jhydrol.2019.123915

Xie, X., Huang, L., Marson, S. M., and Wei, G. (2023b). Emergency response process for sudden rainstorm and flooding: scenario deduction and Bayesian network analysis using evidence theory and knowledge meta-theory. Nat. Hazards 117 (3), 3307–3329. doi:10.1007/s11069-023-05988-x

Xie, X., Tian, Y., and Wei, G. (2023a). Deduction of sudden rainstorm scenarios: integrating decision makers' emotions, dynamic Bayesian network and DS evidence theory. Nat. Hazards 116 (3), 2935–2955. doi:10.1007/s11069-022-05792-z

Yang, S., Xu, Z., Kong, Ke., Miao, S., and Zhang, S. (2013). A flow simulation based on SWAT model in Wohushan reservoir basin. China Rural Water and Hydropower (5), 11–18. doi:10.3969/j.issn.1007-2284.2013.05.003

Yaseen, Z. M., Jaafar, O., Deo, R. C., Kisi, O., Adamowski, J., Quilty, J., et al. (2016). Stream-flow forecasting using extreme learning machines: a case study in a semi-arid region in Iraq. J. Hydrology 542, 603–614. doi:10.1016/j.jhydrol.2016.09.035

Young, C. C., and Liu, W. C. (2015). Prediction and modelling of rainfall–runoff during typhoon events using a physically-based and artificial neural network hybrid model. Hydrological Sci. J. 60 (12), 2102–2116. doi:10.1080/02626667.2014.959446

Zheng, Y., Li, J., Dong, L., Rong, Y., Kang, A., and Feng, P. (2020). Estimation of initial abstraction for hydrological modeling based on global land data assimilation system–simulated datasets. J. Hydrometeorol. 21 (5), 1051–1072. doi:10.1175/jhm-d-19-0202.1

Zhang, L., Yang, Z., and Liu, G. (2016). A forecast model of distributed flood in Yufuhe basin and its application. J. Water Resour. Water Eng. 27 (3), 66–72. doi:10.11705/j.issn.1672-643X.2016.03.13

Zhao, R. J. (1992). The Xinanjiang model applied in China. J. Hydrology 135, 371–381. doi:10.1016/0022-1694(92)90096-e

Zhou, Q., Teng, S., Situ, Z., Liao, X., Feng, J., Chen, G., et al. (2023). A deep-learning-technique-based data-driven model for accurate and rapid flood predictions in temporal and spatial dimensions. Hydrology Earth Syst. Sci. 27 (9), 1791–1808. doi:10.5194/hess-27-1791-2023

Zhou, Y., Cui, Z., Lin, K., Sheng, S., Chen, H., Guo, S., et al. (2022). Short-term flood probability density forecasting using a conceptual hydrological model with machine learning techniques. J. Hydrology 604, 127255. doi:10.1016/j.jhydrol.2021.127255

Keywords: rainfall-runoff modeling, hybrid model, initial loss (Ia), HEC-HMS, LSTM

Citation: Wang W, Gao J, Liu Z and Li C (2023) A hybrid rainfall-runoff model: integrating initial loss and LSTM for improved forecasting. Front. Environ. Sci. 11:1261239. doi: 10.3389/fenvs.2023.1261239

Received: 20 July 2023; Accepted: 06 October 2023;
Published: 18 October 2023.

Edited by:

Buddhi Wijesiri, Queensland University of Technology, Australia

Reviewed by:

Ali Danandeh Mehr, Antalya Bilim University, Türkiye
Anoop Kumar Shukla, Manipal Academy of Higher Education, India

Copyright © 2023 Wang, Gao, Liu and Li. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Chuanqi Li, lichuanqi@sdu.edu.cn
