1
Pinthong S, Ditthakit P, Salaeh N, Hasan MA, Son CT, Linh NTT, Islam S, Yadav KK. Imputation of missing monthly rainfall data using machine learning and spatial interpolation approaches in Thale Sap Songkhla River Basin, Thailand. Environ Sci Pollut Res Int 2022:10.1007/s11356-022-23022-8. [PMID: 36173524 DOI: 10.1007/s11356-022-23022-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/16/2022] [Accepted: 09/10/2022] [Indexed: 06/16/2023]
Abstract
Missing rainfall data is a prevalent issue and a primary concern in hydrology and meteorology. This research examined the capability of machine learning (ML) and spatial interpolation (SI) methods to estimate missing monthly rainfall data. Six ML algorithms (multiple linear regression (MLR), M5 model tree (M5), random forest (RF), support vector regression (SVR), multilayer perceptron (MLP), and genetic programming (GP)) and four SI methods (arithmetic average (AA), inverse distance weighting (IDW), correlation coefficient weighted (CCW), and normal ratio (NR)) were investigated and their performance compared. Twelve rainfall stations, located in the Thale Sap Songkhla river basin and nearby basins, served as the case study. Hyper-parameters were tuned for each ML method to obtain the most suitable model for the data sets considered. Three performance criteria (NSE, OI, and r) were chosen, and the sum of the three was introduced to compare the methods' performance. The experimental results showed that selecting neighbouring stations was essential when applying SI methods, but not when applying ML methods. Overall, ML imputed missing monthly rainfall better than SI because it overcame spatial constraints. GP provided the highest performance, giving NSE = 0.825, OI = 0.877, and r = 0.909 for the training stage and 0.796, 0.852, and 0.902, respectively, for the testing stage. It was followed by SVR-rbf, SVR-poly, and RF. NR performed best among the four SI methods, followed by CCW, AA, and IDW. When applying SI methods, the correlation between the target and neighbouring stations should be greater than 0.80.
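The four SI methods named in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the formulas, so the textbook forms are assumed here: an unweighted mean for AA, inverse-distance weights with an exponent of 2 for IDW, correlation coefficients used directly as weights for CCW, and the ratio of long-term mean ("normal") rainfall for NR. The function and parameter names are hypothetical and are not taken from the paper.

```python
import numpy as np


def arithmetic_average(p):
    """AA: unweighted mean of the neighbouring stations' rainfall."""
    return float(np.mean(np.asarray(p, dtype=float)))


def inverse_distance_weighting(p, d, power=2.0):
    """IDW: weights are 1 / distance**power (power=2 is an assumed default)."""
    p = np.asarray(p, dtype=float)
    w = 1.0 / np.asarray(d, dtype=float) ** power
    return float(np.sum(w * p) / np.sum(w))


def correlation_coefficient_weighted(p, r):
    """CCW: weights are each neighbour's correlation with the target station."""
    p = np.asarray(p, dtype=float)
    r = np.asarray(r, dtype=float)
    return float(np.sum(r * p) / np.sum(r))


def normal_ratio(p, neighbour_normals, target_normal):
    """NR: scale each neighbour by the ratio of long-term mean rainfall."""
    p = np.asarray(p, dtype=float)
    ratios = target_normal / np.asarray(neighbour_normals, dtype=float)
    return float(np.mean(ratios * p))


# Made-up rainfall (mm) at two neighbouring stations for one missing month.
neighbours = [100.0, 200.0]
aa = arithmetic_average(neighbours)
idw = inverse_distance_weighting(neighbours, d=[1.0, 2.0])
ccw = correlation_coefficient_weighted(neighbours, r=[0.9, 0.6])
nr = normal_ratio(neighbours, neighbour_normals=[100.0, 200.0], target_normal=150.0)
```

All four estimators depend only on the neighbouring stations' values and a fixed weighting rule, which is consistent with the abstract's observation that the choice of neighbouring stations matters for SI methods.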
Affiliation(s)
- Sirimon Pinthong
  - Center of Excellence in Sustainable Disaster Management, School of Engineering and Technology, Walailak University, Nakhon Si Thammarat, 80161, Thailand
- Pakorn Ditthakit
  - Center of Excellence in Sustainable Disaster Management, School of Engineering and Technology, Walailak University, Nakhon Si Thammarat, 80161, Thailand
- Nureehan Salaeh
  - Center of Excellence in Sustainable Disaster Management, School of Engineering and Technology, Walailak University, Nakhon Si Thammarat, 80161, Thailand
- Mohd Abul Hasan
  - Civil Engineering Department, College of Engineering, King Khalid University, Abha, 61421, Saudi Arabia
- Cao Truong Son
  - Faculty of Natural Resources and Environment, Vietnam National University of Agriculture, Hanoi, 100000, Vietnam
- Nguyen Thi Thuy Linh
  - Institute of Applied Technology, Thu Dau Mot University, Thủ Dầu Một, Binh Duong province, Vietnam
- Saiful Islam
  - Civil Engineering Department, College of Engineering, King Khalid University, Abha, 61421, Saudi Arabia
- Krishna Kumar Yadav
  - Faculty of Science and Technology, Madhyanchal Professional University, Ratibad, Bhopal, 462044, India
2
Ditthakit P, Pinthong S, Salaeh N, Binnui F, Khwanchum L, Pham QB. Using machine learning methods for supporting GR2M model in runoff estimation in an ungauged basin. Sci Rep 2021; 11:19955. [PMID: 34620910 PMCID: PMC8497588 DOI: 10.1038/s41598-021-99164-5] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 03/21/2021] [Accepted: 09/14/2021] [Indexed: 02/08/2023] Open
Abstract
Estimating monthly runoff variation, especially in ungauged basins, is essential for water resource planning and management. The present study aimed to evaluate regionalization methods for determining regional parameters of a rainfall-runoff model (the GR2M model). Two categories of regionalization methods (regression-based and distance-based) were investigated. Three regression-based methods were selected, namely Multiple Linear Regression (MLR), Random Forest (RF), and M5 Model Tree (M5), along with two distance-based methods, the Spatial Proximity Approach (SPA) and the Physical Similarity Approach (PSA). Hydrological data and the basins' physical attributes were analyzed from 37 runoff stations in Thailand's southern basin. The results showed that using hydrological data to estimate the GR2M model parameters is better than using the basins' physical attributes. RF was the most accurate in estimating the regional GR2M model parameters, giving the lowest error, followed by M5, MLR, SPA, and PSA. These regional parameters were then applied to estimate monthly runoff using the GR2M model, and their performance was evaluated using three criteria: Nash-Sutcliffe Efficiency (NSE), Correlation Coefficient (r), and Overall Index (OI). The regionalized monthly runoff with RF performed best, followed by SPA, M5, MLR, and PSA. The Taylor diagram was also used to evaluate the results graphically; it indicated that RF provided the results closest to GR2M's, followed by SPA, M5, PSA, and MLR. Our findings revealed the applicability of machine learning for estimating monthly runoff in ungauged basins. However, SPA is recommended in areas lacking both the basin's physical attributes and hydrological information.
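The three performance criteria named in the abstract can be sketched in Python as follows. NSE and r use their standard definitions. The abstract does not define OI, so the form below is an assumption based on one commonly cited definition, OI = 0.5 * (1 - RMSE / (O_max - O_min) + NSE). Function names and sample values are hypothetical.

```python
import numpy as np


def nse(obs, sim):
    """Nash-Sutcliffe Efficiency: 1 - SSE / variance of observations about their mean."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    return float(1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2))


def pearson_r(obs, sim):
    """Pearson correlation coefficient between observed and simulated series."""
    return float(np.corrcoef(obs, sim)[0, 1])


def overall_index(obs, sim):
    """Overall Index (assumed form): combines range-normalized RMSE with NSE."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    rmse = np.sqrt(np.mean((obs - sim) ** 2))
    return float(0.5 * (1.0 - rmse / (obs.max() - obs.min()) + nse(obs, sim)))


# Made-up monthly runoff values; the simulation is biased high by one unit.
observed = [1.0, 2.0, 3.0, 4.0]
simulated = [2.0, 3.0, 4.0, 5.0]
nse_value = nse(observed, simulated)
r_value = pearson_r(observed, simulated)
oi_value = overall_index(observed, simulated)
```

In this example the simulated series is perfectly correlated with the observations (r is 1) yet has a low NSE because of the constant bias, which illustrates why studies of this kind tend to report several criteria together.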
Affiliation(s)
- Pakorn Ditthakit
  - School of Engineering and Technology, Walailak University, Nakhon Si Thammarat, 80161, Thailand; Center of Excellence in Sustainable Disaster Management, Walailak University, Nakhon Si Thammarat, 80161, Thailand
- Sirimon Pinthong
  - School of Engineering and Technology, Walailak University, Nakhon Si Thammarat, 80161, Thailand; Center of Excellence in Sustainable Disaster Management, Walailak University, Nakhon Si Thammarat, 80161, Thailand
- Nureehan Salaeh
  - School of Engineering and Technology, Walailak University, Nakhon Si Thammarat, 80161, Thailand; Center of Excellence in Sustainable Disaster Management, Walailak University, Nakhon Si Thammarat, 80161, Thailand
- Fadilah Binnui
  - School of Engineering and Technology, Walailak University, Nakhon Si Thammarat, 80161, Thailand; Center of Excellence in Sustainable Disaster Management, Walailak University, Nakhon Si Thammarat, 80161, Thailand
- Laksanara Khwanchum
  - School of Languages and General Education, Walailak University, Nakhon Si Thammarat, 80161, Thailand; Center of Excellence in Sustainable Disaster Management, Walailak University, Nakhon Si Thammarat, 80161, Thailand
- Quoc Bao Pham
  - Institute of Applied Technology, Thu Dau Mot University, Thu Dau Mot City, Binh Duong Province, 821389, Vietnam