1
Jia Z, Zhang Q, Shi B, Xu C, Liu D, Yang Y, Xi B, Li R. A new strategy for groundwater level prediction using a hybrid deep learning model under Ecological Water Replenishment. ENVIRONMENTAL SCIENCE AND POLLUTION RESEARCH INTERNATIONAL 2024; 31:23951-23967. [PMID: 38436858 DOI: 10.1007/s11356-024-32330-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/16/2023] [Accepted: 01/30/2024] [Indexed: 03/05/2024]
Abstract
Accurate prediction of the groundwater level (GWL) is crucial for sustainable groundwater resource management. Ecological water replenishment (EWR) involves artificially diverting water to replenish the ecological flow and water resources of both surface water and groundwater within the basin. However, GWL fluctuations during EWR are highly nonlinear and complex, making it challenging for single data-driven models to predict the trend of GWL changes under EWR. This study introduced a new GWL prediction strategy based on a hybrid deep learning model, STL-IWOA-GRU, which integrates the LOESS-based seasonal-trend decomposition algorithm (STL), an improved whale optimization algorithm (IWOA), and the gated recurrent unit (GRU). The aim was to accurately predict GWLs in the context of EWR. This study gathered GWL, precipitation, and surface runoff data from 21 monitoring wells in the Yongding River Basin (Beijing Section) over a period of 731 days. The results demonstrate that the improvement strategy implemented for the IWOA enhances the convergence speed and global search capabilities of the algorithm. In the case analysis, the evaluation metrics were the root mean square error (RMSE), mean absolute error (MAE), mean absolute percentage error (MAPE), and Nash-Sutcliffe efficiency (NSE). STL-IWOA-GRU achieved the best MAE, averaging 0.266. Compared with Variational Mode Decomposition-Gated Recurrent Unit (VMD-GRU), Ant Lion Optimizer-Support Vector Machine (ALO-SVM), STL-Particle Swarm Optimization-GRU (STL-PSO-GRU), and STL-Sine Cosine Algorithm-GRU (STL-SCA-GRU), MAE was reduced by 18%, 26%, 11%, and 29%, respectively. This indicates that the proposed model exhibited high prediction accuracy and robust versatility, making it a strong choice for forecasting GWL changes in the context of EWR.
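The four evaluation metrics named in this abstract have standard textbook definitions, which can be sketched as follows. This is a minimal illustration, not the authors' code; the function names and the sample series are assumptions made for the example.

```python
import math

def rmse(obs, pred):
    """Root mean square error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

def mae(obs, pred):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)

def mape(obs, pred):
    """Mean absolute percentage error (in percent); undefined if any observation is 0."""
    return 100.0 * sum(abs((o - p) / o) for o, p in zip(obs, pred)) / len(obs)

def nse(obs, pred):
    """Nash-Sutcliffe efficiency: 1 minus squared error over variance around the observed mean."""
    mean_obs = sum(obs) / len(obs)
    num = sum((o - p) ** 2 for o, p in zip(obs, pred))
    den = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - num / den

# Hypothetical groundwater levels (m), for illustration only.
observed = [10.0, 12.0, 14.0, 16.0]
predicted = [11.0, 12.0, 13.0, 16.0]

print(rmse(observed, predicted), mae(observed, predicted),
      mape(observed, predicted), nse(observed, predicted))
```

An NSE of 1 indicates a perfect fit, while an NSE of 0 means the model does no better than predicting the observed mean.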
Affiliation(s)
- Zihao Jia
- School of Environmental Science and Engineering, Guilin University of Technology, Guilin, 541004, China
- The Nuclear and Radiation Safety Center of Ministry of Ecology and Environment of China, Beijing, 100082, China
- State Key Laboratory of Environmental Criteria and Risk Assessment, Chinese Research Academy of Environmental Sciences, Beijing, 100012, China
- Qin Zhang
- State Key Laboratory of Environmental Criteria and Risk Assessment, Chinese Research Academy of Environmental Sciences, Beijing, 100012, China
- Bowen Shi
- The Nuclear and Radiation Safety Center of Ministry of Ecology and Environment of China, Beijing, 100082, China
- State Key Laboratory of Environmental Criteria and Risk Assessment, Chinese Research Academy of Environmental Sciences, Beijing, 100012, China
- Congchao Xu
- The Nuclear and Radiation Safety Center of Ministry of Ecology and Environment of China, Beijing, 100082, China
- State Key Laboratory of Environmental Criteria and Risk Assessment, Chinese Research Academy of Environmental Sciences, Beijing, 100012, China
- School of Water Resources and Environment, China University of Geosciences (Beijing), Beijing, 100083, China
- Di Liu
- The Nuclear and Radiation Safety Center of Ministry of Ecology and Environment of China, Beijing, 100082, China
- State Key Laboratory of Environmental Criteria and Risk Assessment, Chinese Research Academy of Environmental Sciences, Beijing, 100012, China
- Yihong Yang
- The Nuclear and Radiation Safety Center of Ministry of Ecology and Environment of China, Beijing, 100082, China
- State Key Laboratory of Environmental Criteria and Risk Assessment, Chinese Research Academy of Environmental Sciences, Beijing, 100012, China
- Beidou Xi
- The Nuclear and Radiation Safety Center of Ministry of Ecology and Environment of China, Beijing, 100082, China
- State Key Laboratory of Environmental Criteria and Risk Assessment, Chinese Research Academy of Environmental Sciences, Beijing, 100012, China
- Rui Li
- The Nuclear and Radiation Safety Center of Ministry of Ecology and Environment of China, Beijing, 100082, China.
- State Key Laboratory of Environmental Criteria and Risk Assessment, Chinese Research Academy of Environmental Sciences, Beijing, 100012, China.
2
Yin J, Huang Y, Lu C, Liu Z. Uncertainty-based saltwater intrusion prediction using integrated Bayesian machine learning modeling (IBMLM) in a deep aquifer. JOURNAL OF ENVIRONMENTAL MANAGEMENT 2024; 354:120252. [PMID: 38394869 DOI: 10.1016/j.jenvman.2024.120252] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/18/2023] [Revised: 01/25/2024] [Accepted: 01/27/2024] [Indexed: 02/25/2024]
Abstract
Data-driven machine learning approaches are promising substitutes for physically based groundwater numerical models, capturing input-output relationships while reducing the computational burden. However, their performance and reliability are strongly influenced by different sources of uncertainty. Conventional studies generally rely on a stand-alone machine learning surrogate approach and fail to account for errors in model outputs resulting from structural deficiencies. To overcome this issue, this study proposes a flexible integrated Bayesian machine learning modeling (IBMLM) method to explicitly quantify uncertainties originating from the structures and parameters of machine learning surrogate models. An Expectation-Maximization (EM) algorithm is combined with Bayesian model averaging (BMA) to find the maximum likelihood estimates and construct the posterior predictive distribution. Three machine learning approaches representing different model complexity are incorporated in the framework: artificial neural network (ANN), support vector machine (SVM), and random forest (RF). The proposed IBMLM method is demonstrated in a field-scale real-world "1500-foot" sand aquifer, Baton Rouge, USA, where overexploitation caused serious saltwater intrusion (SWI) issues. This study adds to the understanding of how chloride concentration transport responds to multi-dimensional extraction-injection remediation strategies in a sophisticated saltwater intrusion model. Results show that most IBMLM models exhibit r values above 0.98 and NSE values above 0.93, both slightly higher than those of individual machine learning models, confirming that the IBMLM provides better predictions than individual machine learning models while maintaining the advantage of high computing efficiency. The IBMLM is found useful to predict saltwater intrusion without running the physically based numerical simulation model. We conclude that an explicit consideration of machine learning model structure uncertainty along with parameters improves accuracy and reliability of predictions, and also corrects uncertainty bounds. The applicability of the IBMLM framework can be extended to regions where a physical hydrogeologic model is difficult to build due to lack of subsurface information.
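The combination of EM with BMA mentioned in this abstract can be illustrated with the common Gaussian-mixture formulation, in which each member model contributes a normal density centred on its own prediction. The sketch below is not the IBMLM implementation: the shared variance, the function names, and the toy data are all assumptions made for the example.

```python
import math

def gaussian_pdf(y, mean, var):
    return math.exp(-((y - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def bma_em(obs, forecasts, n_iter=100):
    """Estimate BMA weights and a shared variance by EM.

    obs: list of T observations.
    forecasts: list of K lists, each holding one member model's T predictions.
    """
    k, t = len(forecasts), len(obs)
    weights = [1.0 / k] * k
    var = sum((obs[i] - forecasts[j][i]) ** 2 for j in range(k) for i in range(t)) / (k * t)
    for _ in range(n_iter):
        var = max(var, 1e-12)  # guard against a degenerate variance
        # E-step: responsibility of model j for observation i.
        resp = []
        for i in range(t):
            dens = [weights[j] * gaussian_pdf(obs[i], forecasts[j][i], var) for j in range(k)]
            total = sum(dens)
            resp.append([d / total for d in dens] if total > 0.0 else list(weights))
        # M-step: update the weights and the shared variance.
        weights = [sum(resp[i][j] for i in range(t)) / t for j in range(k)]
        var = sum(resp[i][j] * (obs[i] - forecasts[j][i]) ** 2
                  for i in range(t) for j in range(k)) / t
    return weights, var

def bma_mean(weights, member_predictions):
    """BMA point prediction: weighted average of the member predictions."""
    return sum(w * p for w, p in zip(weights, member_predictions))

# Toy data: model_a tracks the observations closely, model_b does not.
obs = [1.0, 2.0, 3.0, 4.0, 5.0]
model_a = [1.1, 1.9, 3.1, 3.9, 5.1]
model_b = [2.0, 3.0, 2.0, 5.0, 4.0]
weights, var = bma_em(obs, [model_a, model_b])
print(weights, var, bma_mean(weights, [3.0, 3.5]))
```

On this toy data the weight concentrates on the more accurate member, which is the behaviour BMA is meant to show.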
Affiliation(s)
- Jina Yin
- The National Key Laboratory of Water Disaster Prevention, Hohai University, Nanjing, China; Yangtze Institute for Conservation and Development, Hohai University, Nanjing, China
- Yulu Huang
- The National Key Laboratory of Water Disaster Prevention, Hohai University, Nanjing, China; Yangtze Institute for Conservation and Development, Hohai University, Nanjing, China
- Chunhui Lu
- The National Key Laboratory of Water Disaster Prevention, Hohai University, Nanjing, China; Yangtze Institute for Conservation and Development, Hohai University, Nanjing, China.
- Zhu Liu
- The National Key Laboratory of Water Disaster Prevention, Hohai University, Nanjing, China; Department of Hydrology and Water Resources, Hohai University, Nanjing, China
3
Ji Y, Liu Z, Cui Y, Liu R, Chen Z, Zong X, Yang T. Faba bean and pea harvest index estimations using aerial-based multimodal data and machine learning algorithms. PLANT PHYSIOLOGY 2024; 194:1512-1526. [PMID: 37935623 PMCID: PMC10904323 DOI: 10.1093/plphys/kiad577] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/29/2023] [Accepted: 10/13/2023] [Indexed: 11/09/2023]
Abstract
Early and high-throughput estimations of the crop harvest index (HI) are essential for crop breeding and field management in precision agriculture; however, traditional methods for measuring HI are time-consuming and labor-intensive. The development of unmanned aerial vehicles (UAVs) with onboard sensors offers an alternative strategy for crop HI research. In this study, we explored the potential of using low-cost, UAV-based multimodal data for HI estimation using red-green-blue (RGB), multispectral (MS), and thermal infrared (TIR) sensors at 4 growth stages to estimate faba bean (Vicia faba L.) and pea (Pisum sativum L.) HI values within the framework of ensemble learning. The average estimates of RGB (faba bean: coefficient of determination [R2] = 0.49, normalized root-mean-square error [NRMSE] = 15.78%; pea: R2 = 0.46, NRMSE = 20.08%) and MS (faba bean: R2 = 0.50, NRMSE = 15.16%; pea: R2 = 0.46, NRMSE = 19.43%) were superior to those of TIR (faba bean: R2 = 0.37, NRMSE = 16.47%; pea: R2 = 0.38, NRMSE = 19.71%), and the fusion of multisensor data exhibited a higher estimation accuracy than those obtained using each sensor individually. Ensemble Bayesian model averaging provided the most accurate estimations (faba bean: R2 = 0.64, NRMSE = 13.76%; pea: R2 = 0.74, NRMSE = 15.20%) for whole growth stage, and the estimation accuracy improved with advancing growth stage. These results indicate that the combination of low-cost, UAV-based multimodal data and machine learning algorithms can be used to estimate crop HI reliably, therefore highlighting a promising strategy and providing valuable insights for high spatial precision in agriculture, which can help breeders make early and efficient decisions.
Affiliation(s)
- Yishan Ji
- National Key Facility for Crop Gene Resources and Genetic Improvement/Institute of Crop Sciences, Chinese Academy of Agricultural Sciences, Beijing 100081, China
- Zehao Liu
- National Key Facility for Crop Gene Resources and Genetic Improvement/Institute of Crop Sciences, Chinese Academy of Agricultural Sciences, Beijing 100081, China
- Yuxing Cui
- National Key Facility for Crop Gene Resources and Genetic Improvement/Institute of Crop Sciences, Chinese Academy of Agricultural Sciences, Beijing 100081, China
- Rong Liu
- National Key Facility for Crop Gene Resources and Genetic Improvement/Institute of Crop Sciences, Chinese Academy of Agricultural Sciences, Beijing 100081, China
- Zhen Chen
- Institute of Farmland Irrigation, Chinese Academy of Agricultural Sciences, Xinxiang 453002, China
- Xuxiao Zong
- National Key Facility for Crop Gene Resources and Genetic Improvement/Institute of Crop Sciences, Chinese Academy of Agricultural Sciences, Beijing 100081, China
- Tao Yang
- National Key Facility for Crop Gene Resources and Genetic Improvement/Institute of Crop Sciences, Chinese Academy of Agricultural Sciences, Beijing 100081, China
4
Ataei P, Takhtravan A, Gheibi M, Chahkandi B, Faramarz MG, Wacławek S, Fathollahi-Fard AM, Behzadian K. An intelligent decision support system for groundwater supply management and electromechanical infrastructure controls. Heliyon 2024; 10:e25036. [PMID: 38317976 PMCID: PMC10840003 DOI: 10.1016/j.heliyon.2024.e25036] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/11/2023] [Revised: 01/06/2024] [Accepted: 01/18/2024] [Indexed: 02/07/2024] Open
Abstract
This study presents an intelligent Decision Support System (DSS) aimed at bridging the theoretical-practical gap in groundwater management. The ongoing demand for sophisticated systems capable of interpreting extensive data to inform sustainable groundwater decision-making underscores the critical nature of this research. To meet this challenge, telemetry data from six randomly selected wells were used to establish a comprehensive database of groundwater pumping parameters, including flow rate, pressure, and current intensity. Statistical analysis of these parameters led to the determination of threshold values for critical factors such as water pressure and electrical current. Additionally, a soft sensor was developed using a Random Forest (RF) machine learning algorithm, enabling real-time forecasting of key variables. This was achieved by continuously comparing live telemetry data to pump design specifications and results from regular field testing. The proposed machine learning model ensures robust empirical monitoring of well and pump health. Furthermore, expert operational knowledge from water management professionals, gathered through a Classical Delphi (CD) technique, was seamlessly integrated. This collective expertise culminated in a data-driven framework for sustainable groundwater facilities monitoring. In conclusion, this innovative DSS not only addresses the theory-application gap but also leverages the power of data analytics and expert knowledge to provide high-precision online insights, thereby optimizing groundwater management practices.
Affiliation(s)
- Parisa Ataei
- Department of Civil Engineering, Birjand University of Technology, Birjand, Iran
- Amir Takhtravan
- Department of Civil Engineering, Birjand University of Technology, Birjand, Iran
- Mohammad Gheibi
- Institute for Nanomaterials, Advanced Technologies and Innovation, Technical University of Liberec, 46117, Liberec, Czech Republic
- Faculty of Mechatronics, Informatics, and Interdisciplinary Studies, Technical University of Liberec, Liberec, Czech Republic
- Mahdieh G. Faramarz
- Department of Building, Civil, and Environmental Engineering, Concordia University, Montreal, QC, H3G1M8, Canada
- Stanisław Wacławek
- Institute for Nanomaterials, Advanced Technologies and Innovation, Technical University of Liberec, 46117, Liberec, Czech Republic
- Faculty of Mechatronics, Informatics, and Interdisciplinary Studies, Technical University of Liberec, Liberec, Czech Republic
- Amir M. Fathollahi-Fard
- Département d’Analytique, Opérations et Technologies de l’Information, Université Du Québec à Montréal, B.P. 8888, Succ. Centre-ville, Montréal, QC, H3C 3P8, Canada
- New Era and Development in Civil Engineering Research Group, Scientific Research Center, Al-Ayen University, Nasiriyah, Thi-Qar 64001, Iraq
- Kourosh Behzadian
- School of Computing and Engineering, University of West London, England, UK
5
Li G, Liu Z, Zhang J, Han H, Shu Z. Bayesian model averaging by combining deep learning models to improve lake water level prediction. THE SCIENCE OF THE TOTAL ENVIRONMENT 2024; 906:167718. [PMID: 37832688 DOI: 10.1016/j.scitotenv.2023.167718] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/19/2023] [Revised: 09/25/2023] [Accepted: 10/08/2023] [Indexed: 10/15/2023]
Abstract
Water level (WL) is an essential indicator of lakes and is sensitive to climate change. Fluctuations of lake WL may significantly affect water supply security and ecosystem stability. Accurate prediction of lake WL is, therefore, crucial for water resource management and eco-environmental protection. In this study, three deep learning (DL) models, namely long short-term memory (LSTM), the gated recurrent unit (GRU), and the temporal convolutional network (TCN), were used to predict WLs at five stations of Poyang Lake for different forecast periods (1-day ahead, 3-day ahead, and 7-day ahead). The forecast results of the three DL models were synthesized through Bayesian model averaging (BMA) to improve prediction accuracy, and the Monte Carlo sampling method was used to calculate the 90% confidence intervals to analyze the model uncertainty. All three DL models achieved satisfactory prediction accuracy. GRU performed best in most forecast scenarios, followed by TCN and LSTM. None of the models, however, consistently provided the optimal results in all forecast scenarios. BMA further improved lake WL prediction accuracy in terms of NSE and R2 in 80% of the forecast scenarios and ranked in the top two in all forecast scenarios. The uncertainty analysis showed that the containing ratio (CR) values were above 84%, while the relative bandwidth (RB) maintained reliable performance over the 7-day ahead prediction. The proposed framework can achieve satisfactory WL forecast accuracy while avoiding complex comparison and selection of DL models, and it can also be easily applied to the prediction of other hydrological variables.
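The Monte Carlo interval and the containing ratio described in this abstract can be sketched as below. The abstract does not give the sampling details, so this example assumes a Gaussian-mixture BMA predictive distribution with a shared standard deviation; `bma_interval`, `containing_ratio`, and every number are hypothetical.

```python
import random

def bma_interval(member_forecasts, weights, sigma, n_samples=5000, level=0.90, seed=0):
    """Monte Carlo interval for one time step of a Gaussian-mixture BMA forecast."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_samples):
        # Pick a member model according to its BMA weight, then perturb its forecast.
        centre = rng.choices(member_forecasts, weights=weights, k=1)[0]
        draws.append(rng.gauss(centre, sigma))
    draws.sort()
    tail = (1.0 - level) / 2.0
    lower = draws[int(tail * (n_samples - 1))]
    upper = draws[int((1.0 - tail) * (n_samples - 1))]
    return lower, upper

def containing_ratio(obs, lower, upper):
    """Fraction of observations that fall inside their prediction interval."""
    inside = sum(1 for o, lo, hi in zip(obs, lower, upper) if lo <= o <= hi)
    return inside / len(obs)

# Hypothetical member forecasts of lake water level (m) for a single day.
low, high = bma_interval([10.0, 10.2, 9.9], [0.5, 0.3, 0.2], sigma=0.1)
print(low, high)
print(containing_ratio([1.0, 2.0, 3.0], [0.5, 2.5, 2.0], [1.5, 3.0, 4.0]))
```

A containing ratio close to the nominal level (here 90%) suggests the interval is neither too narrow nor too wide.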
Affiliation(s)
- Gang Li
- Jiangxi Academy of Water Science and Engineering, Nanchang 330029, China; Jiangxi Provincial Technology Innovation Center for Ecological Water Engineering in Poyang Lake Basin, Nanchang 330029, China
- Zhangjun Liu
- Jiangxi Academy of Water Science and Engineering, Nanchang 330029, China; Jiangxi Provincial Technology Innovation Center for Ecological Water Engineering in Poyang Lake Basin, Nanchang 330029, China.
- Jingwen Zhang
- Jiangxi Academy of Water Science and Engineering, Nanchang 330029, China; Jiangxi Provincial Technology Innovation Center for Ecological Water Engineering in Poyang Lake Basin, Nanchang 330029, China
- Huiming Han
- Jiangxi Academy of Water Science and Engineering, Nanchang 330029, China; Jiangxi Provincial Technology Innovation Center for Ecological Water Engineering in Poyang Lake Basin, Nanchang 330029, China
- Zhangkang Shu
- State Key Laboratory of Hydrology-Water Resources and Hydraulic Engineering, Nanjing Hydraulic Research Institute, Nanjing 210029, China
6
Wang Z, Kuerban K, Zhou Z, Hailati M, Aihematiniyazi R, Wang X, Yan C. HCEs-Net: Hepatic cystic echinococcosis classification ensemble model based on tree-structured Parzen estimator and snap-shot approach. Med Phys 2023. [PMID: 37183479 DOI: 10.1002/mp.16444] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/04/2022] [Revised: 02/15/2023] [Accepted: 04/18/2023] [Indexed: 05/16/2023] Open
Abstract
BACKGROUND Hepatic cystic echinococcosis (HCE) still has a high misdiagnosis rate, and misdiagnosis may lead to wrong treatments that seriously harm patients. Precise diagnosis of HCE relies heavily on the experience of clinical experts, supported by auxiliary diagnostic tools based on medical images. PURPOSE This paper aims to improve the diagnostic accuracy for HCE by combining deep learning with an ensemble method. METHODS We proposed a method, namely HCEs-Net, for classification of five HCE subtypes using ultrasound images. It first uses the snap-shot strategy to obtain sub-models from the pre-trained VGG19, ResNet18, ViT-Base, and ConvNeXt-T models, and then a stacking process to ensemble those sub-models. Afterwards, it uses the tree-structured Parzen estimator (TPE) to optimize the hyperparameters. The experiments were evaluated by five-fold cross-validation. RESULTS A total of 3083 abdominal ultrasound images from 972 patients covering five subtypes of HCE were utilized in this study. The experiments were conducted to predict the HCE subtype, and model performance was reported in terms of precision, recall, F1-score, and AUC. The stacking model based on three ConvNeXt-T sub-models showed the best performance, with precision 85.9%, recall 85.5%, F1-score 85.7%, and AUC 0.971, which are higher than those of the compared state-of-the-art models. CONCLUSION The stacking model of three ConvNeXt-T sub-models shows comparable or superior performance to the other methods, including VGG19, ResNet18 and ViT-Base. It has the potential to enhance clinical diagnosis for HCE.
Affiliation(s)
- Zhengye Wang
- College of Public Health, Xinjiang Medical University, Urumqi, China
- Kadiliya Kuerban
- College of Public Health, Xinjiang Medical University, Urumqi, China
- Zihang Zhou
- School of Computer Science and Engineering, Beihang University, Beijing, China
- Xiaorong Wang
- The First Affiliated Hospital of Xinjiang Medical University, Urumqi, China
- State Key Laboratory of Pathogenesis, Prevention and Treatment of High Incidence Diseases in Central Asia, Department of Abdominal Ultrasound, The First Affiliated Hospital of Xinjiang Medical University, Urumqi, China
- Chuanbo Yan
- College of Medical Engineering Technology, Xinjiang Medical University, Urumqi, China
7
Bayesian inference for survival prediction of childhood Leukemia. Comput Biol Med 2023; 156:106713. [PMID: 36863191 DOI: 10.1016/j.compbiomed.2023.106713] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/02/2022] [Revised: 02/09/2023] [Accepted: 02/26/2023] [Indexed: 03/03/2023]
Abstract
BACKGROUND Childhood Leukemia is the most common type of cancer among children. Nearly 39% of cancer-induced childhood deaths are attributable to Leukemia. Nevertheless, early intervention has long been underdeveloped. Moreover, there is still a group of children succumbing to their cancer due to disparities in cancer care resources. An accurate predictive approach is therefore needed to improve childhood Leukemia survival and mitigate these disparities. Existing survival predictions rely on a single best model, which fails to consider model uncertainties in predictions. Prediction from a single model is brittle because model uncertainty is neglected, and inaccurate prediction could lead to serious ethical and economic consequences. METHODS To address these challenges, we develop a Bayesian survival model to predict patient-specific survival by taking model uncertainty into account. Specifically, we first develop a survival model to predict time-varying survival probabilities. Second, we place different prior distributions over various model parameters and estimate their posterior distribution with full Bayesian inference. Third, we predict the patient-specific survival probabilities over time by considering the model uncertainty induced by the posterior distribution. RESULTS The concordance index of the proposed model is 0.93. Moreover, the standardized survival probability of the censored group is higher than that of the deceased group. CONCLUSIONS Experimental results indicate that the proposed model is robust and accurate in predicting patient-specific survival. It can also help clinicians track the contribution of multiple clinical attributes, thereby enabling well-informed intervention and timely medical care for childhood Leukemia.
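The concordance index reported in this abstract is commonly computed in the form known as Harrell's C. The sketch below shows that standard definition on made-up data; it is not the paper's evaluation code, and the function name and inputs are assumptions.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C: share of comparable pairs ranked correctly by predicted risk.

    times: observed follow-up times.
    events: 1 if the event (death) was observed, 0 if the record is censored.
    risk_scores: higher score means the model predicts shorter survival.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when subject i had an observed event
            # strictly before subject j's observed time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # ties in predicted risk count as half
    return concordant / comparable

# Hypothetical cohort of four patients; the third record is censored.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risk = [0.9, 0.7, 0.4, 0.2]
print(concordance_index(times, events, risk))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking.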
8
Sarkar S, Mukherjee A, Chakraborty M, Quamar MT, Duttagupta S, Bhattacharya A. Prediction of elevated groundwater fluoride across India using multi-model approach: insights on the influence of geologic and environmental factors. ENVIRONMENTAL SCIENCE AND POLLUTION RESEARCH INTERNATIONAL 2023; 30:31998-32013. [PMID: 36459318 DOI: 10.1007/s11356-022-24328-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/08/2022] [Accepted: 11/16/2022] [Indexed: 06/17/2023]
Abstract
Elevated fluoride in groundwater is a severe problem in India due to its extensive occurrence and detrimental health impacts on the large population that depends on groundwater. Although fluoride is primarily a geogenic pollutant, existing model-based studies do not incorporate the influence of geologic factors, specifically tectonics, when identifying groundwater fluoride distribution. This drawback motivates the present study to investigate the association of the tectonic framework with fluoride in a multi-model approach. We applied three machine learning models (random forest, boosted regression tree, and logistic regression) to predict elevated groundwater fluoride based on fluoride measurements across India. The random forest model outperformed the other models with an accuracy of 93%. Tectonics was found to be one of the most important predictors alongside "depth to water table." The two major high-risk areas identified were the northwest parts and the south-southeast cratonic peninsular region. The random forest model also performed well on the validation dataset. We estimate that nearly 257 million people are exposed to elevated fluoride risk in India. We hope that the findings of our study will be an effective tool for identifying the areas at risk of elevated fluoride and will assist in undertaking effective groundwater management strategies.
Affiliation(s)
- Soumyajit Sarkar
- School of Environmental Science and Engineering, Indian Institute of Technology Kharagpur, Kharagpur, 721302, West Bengal, India
- Abhijit Mukherjee
- School of Environmental Science and Engineering, Indian Institute of Technology Kharagpur, Kharagpur, 721302, West Bengal, India.
- Department of Geology and Geophysics, Indian Institute of Technology Kharagpur, Kharagpur, 721302, West Bengal, India.
- Madhumita Chakraborty
- Department of Geology and Geophysics, Indian Institute of Technology Kharagpur, Kharagpur, 721302, West Bengal, India
- Md Tahseen Quamar
- Department of Geology and Geophysics, Indian Institute of Technology Kharagpur, Kharagpur, 721302, West Bengal, India
- Srimanti Duttagupta
- Graduate School of Public Health, San Diego State University, San Diego, CA, 92182, USA
- Animesh Bhattacharya
- School of Environmental Science and Engineering, Indian Institute of Technology Kharagpur, Kharagpur, 721302, West Bengal, India
9
Samani S, Vadiati M, Nejatijahromi Z, Etebari B, Kisi O. Groundwater level response identification by hybrid wavelet-machine learning conjunction models using meteorological data. ENVIRONMENTAL SCIENCE AND POLLUTION RESEARCH INTERNATIONAL 2023; 30:22863-22884. [PMID: 36308648 DOI: 10.1007/s11356-022-23686-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/18/2022] [Accepted: 10/13/2022] [Indexed: 06/16/2023]
Abstract
Due to its heterogeneous and complex nature, groundwater modeling needs great effort to quantify the aquifer, a crucial tool for policymakers and hydrogeologists to understand the variations in groundwater levels (GWL). This study proposed a set of supervised machine learning (ML) models to delineate the GWL changes in the Zarand-Saveh complex aquifer in Iran using 15-year (2005-2020) monthly dataset. The wavelet transform (WT) procedure was also used to improve the GWL prediction ability of ML models for 3-month horizons using input datasets of precipitation, evapotranspiration, temperature, and GWL. The four well-accepted standalone ML methods, i.e., artificial neural network (ANN), adaptive neuro-fuzzy inference system (ANFIS), group method of data handling (GMDH), and least square support vector machine (LSSVM), were implemented and compared with the hybrid wavelet conjunction models. The methods were compared based on root mean square error (RMSE), mean absolute error (MAE), correlation coefficient (R), and Nash-Sutcliffe efficiency (NSE). Comparison outcomes showed that the hybrid wavelet-ML considerably improved the standalone model results. The wavelet transform-least square support vector machine (WT-LSSVM) model was superior to other standalone and hybrid wavelet-ML methods to predict GWL. The best GWL predictions were acquired from the WT-LSSVM model with input scenario 5 involving all influential variables, and this model produced RMSE, MAE, R, and NSE as 0.05, 0.04, 0.99, and 0.99 for 1 month ahead of GWL prediction, while the corresponding values were obtained as 0.18, 0.14, 0.95, and 0.90 for 3 months ahead of GWL prediction, respectively.
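The wavelet preprocessing step described in this abstract splits a series into a smooth approximation and a detail component before the ML model is trained on them. The abstract does not name the wavelet family or decomposition level, so the sketch below assumes a single-level Haar transform purely for illustration; the function names and the sample series are hypothetical.

```python
import math

def haar_dwt(series):
    """One-level Haar transform of an even-length series: (approximation, detail)."""
    if len(series) % 2:
        raise ValueError("series length must be even")
    s = math.sqrt(2.0)
    approx = [(series[i] + series[i + 1]) / s for i in range(0, len(series), 2)]
    detail = [(series[i] - series[i + 1]) / s for i in range(0, len(series), 2)]
    return approx, detail

def haar_idwt(approx, detail):
    """Inverse of haar_dwt: rebuild the original series from both components."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / s, (a - d) / s])
    return out

# Hypothetical monthly groundwater levels (m).
gwl = [12.0, 12.4, 12.1, 11.8, 11.5, 11.9]
approx, detail = haar_dwt(gwl)
rebuilt = haar_idwt(approx, detail)
print(approx, detail, rebuilt)
```

In a hybrid wavelet-ML setup, the approximation and detail sub-series would typically be fed to the regressor as separate inputs.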
Affiliation(s)
- Saeideh Samani
- Department of Water Resources Study and Research, Water Research Institute (WRI), Tehran Province, District 4, Bahar Blvd, Tehran, Iran
- Meysam Vadiati
- Global Affairs, Hubert H. Humphrey Fellowship Program, University of California, 10 College Park, Davis, CA, 95616, USA.
- Zohre Nejatijahromi
- Department of Minerals and Hydrogeology, Faculty of Earth Sciences, Shahid Beheshti University, Evin Ave, Tehran, Iran
- Behrooz Etebari
- CalNRA/Dept. of Water Resources/ Sustainable Groundwater Management Office, 715 P Street, Sacramento, CA, USA
- Ozgur Kisi
- Department of Civil Engineering, Technical University of Lübeck, 23562, Lübeck, Germany
- Department of Civil Engineering, Ilia State University, 0162, Tbilisi, Georgia
10
Zhu X, Hu J, Xiao T, Huang S, Wen Y, Shang D. An interpretable stacking ensemble learning framework based on multi-dimensional data for real-time prediction of drug concentration: The example of olanzapine. Front Pharmacol 2022; 13:975855. [PMID: 36238557 PMCID: PMC9552071 DOI: 10.3389/fphar.2022.975855] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 06/22/2022] [Accepted: 09/05/2022] [Indexed: 11/13/2022] Open
Abstract
Background and Aim: Therapeutic drug monitoring (TDM) has evolved over the years as an important tool for personalized medicine. Nevertheless, some limitations are associated with traditional TDM. Emerging data-driven model forecasting [e.g., through machine learning (ML)-based approaches] has been used for individualized therapy. This study proposes an interpretable stacking-based ML framework to predict concentrations in real time after olanzapine (OLZ) treatment. Methods: The TDM-OLZ dataset, consisting of 2,142 OLZ measurements and 472 features, was formed by collecting electronic health records during the TDM of 927 patients who had received OLZ treatment. We compared the performance of ML algorithms by using 10-fold cross-validation and the mean absolute error (MAE). The optimal subset of features was analyzed by a random forest-based sequential forward feature selection method in the context of the top five heterogeneous regressors as base models to develop a stacked ensemble regressor, which was then optimized via the grid search method. Its predictions were explained by using local interpretable model-agnostic explanations (LIME) and partial dependence plots (PDPs). Results: A state-of-the-art stacking ensemble learning framework that integrates optimized extra trees, XGBoost, random forest, bagging, and gradient-boosting regressors was developed for nine selected features [i.e., daily dose (OLZ), gender_male, age, valproic acid_yes, ALT, K, BW, MONO#, and time of blood sampling after first administration]. It outperformed other base regressors that were considered, with an MAE of 0.064, R-square value of 0.5355, mean squared error of 0.0089, mean relative error of 13%, and ideal rate (the percentages of predicted TDM within ± 30% of actual TDM) of 63.40%. Predictions at the individual level were illustrated by LIME plots, whereas the global interpretation of associations between features and outcomes was illustrated by PDPs. 
Conclusion: This study highlights the feasibility of the real-time estimation of drug concentrations by using stacking-based ML strategies without losing interpretability, thus facilitating model-informed precision dosing.
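The evaluation metrics named in this abstract can be illustrated with a minimal Python sketch. It is not the authors' code. The function names and the toy numbers are hypothetical, and the formula used for mean relative error (mean of |predicted - actual| / |actual|) is an assumption, since the abstract does not define it. The ideal rate follows the definition given in the abstract: the percentage of predictions within ±30% of the actual value.

```python
from typing import Sequence


def _check(actual: Sequence[float], predicted: Sequence[float]) -> None:
    # Both series must be non-empty and aligned one-to-one.
    if len(actual) == 0 or len(actual) != len(predicted):
        raise ValueError("actual and predicted must be non-empty and equal in length")


def mean_absolute_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Average of |predicted - actual|."""
    _check(actual, predicted)
    return sum(abs(p - a) for a, p in zip(actual, predicted)) / len(actual)


def mean_relative_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Average of |predicted - actual| / |actual| (assumed definition)."""
    _check(actual, predicted)
    return sum(abs(p - a) / abs(a) for a, p in zip(actual, predicted)) / len(actual)


def ideal_rate(actual: Sequence[float], predicted: Sequence[float],
               tolerance: float = 0.30) -> float:
    """Percentage of predictions within +/- tolerance of the actual value."""
    _check(actual, predicted)
    hits = sum(1 for a, p in zip(actual, predicted)
               if abs(p - a) <= tolerance * abs(a))
    return 100.0 * hits / len(actual)


if __name__ == "__main__":
    # Toy values for illustration only; they are not data from the study.
    actual = [1.0, 2.0, 4.0]
    predicted = [1.2, 2.0, 2.0]
    print(f"MAE: {mean_absolute_error(actual, predicted):.3f}")
    print(f"Mean relative error: {mean_relative_error(actual, predicted):.1%}")
    print(f"Ideal rate: {ideal_rate(actual, predicted):.2f}%")
```

In the toy example, two of the three predictions fall within ±30% of the actual value, so the ideal rate is two thirds.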
Affiliation(s)
- Xiuqing Zhu
  - Department of Pharmacy, The Affiliated Brain Hospital of Guangzhou Medical University, Guangzhou, China
  - Guangdong Engineering Technology Research Center for Translational Medicine of Mental Disorders, Guangzhou, China
- Jinqing Hu
  - Department of Pharmacy, The Affiliated Brain Hospital of Guangzhou Medical University, Guangzhou, China
  - Guangdong Engineering Technology Research Center for Translational Medicine of Mental Disorders, Guangzhou, China
- Tao Xiao
  - Department of Pharmacy, The Affiliated Brain Hospital of Guangzhou Medical University, Guangzhou, China
  - Department of Clinical Research, Guangdong Second Provincial General Hospital, Guangzhou, China
- Shanqing Huang
  - Department of Pharmacy, The Affiliated Brain Hospital of Guangzhou Medical University, Guangzhou, China
  - Guangdong Engineering Technology Research Center for Translational Medicine of Mental Disorders, Guangzhou, China
- Yuguan Wen
  - Department of Pharmacy, The Affiliated Brain Hospital of Guangzhou Medical University, Guangzhou, China
  - Guangdong Engineering Technology Research Center for Translational Medicine of Mental Disorders, Guangzhou, China
  - *Correspondence: Yuguan Wen; Dewei Shang
- Dewei Shang
  - Department of Pharmacy, The Affiliated Brain Hospital of Guangzhou Medical University, Guangzhou, China
  - Guangdong Engineering Technology Research Center for Translational Medicine of Mental Disorders, Guangzhou, China
  - *Correspondence: Yuguan Wen; Dewei Shang
|
11
|
Shu M, Fei S, Zhang B, Yang X, Guo Y, Li B, Ma Y. Application of UAV Multisensor Data and Ensemble Approach for High-Throughput Estimation of Maize Phenotyping Traits. PLANT PHENOMICS 2022; 2022:9802585. [PMID: 36158531 PMCID: PMC9489231 DOI: 10.34133/2022/9802585] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 05/25/2022] [Accepted: 08/07/2022] [Indexed: 12/03/2022]
Abstract
High-throughput estimation of phenotypic traits from UAV (unmanned aerial vehicle) images is helpful to improve the screening efficiency of breeding maize. Accurately estimating phenotyping traits of breeding maize at plot scale helps to promote gene mining for specific traits and provides a guarantee for accelerating the breeding of superior varieties. Constructing an efficient and accurate estimation model is the key to the application of UAV-based multi-sensor data. This study aims to apply the ensemble learning model to improve the feasibility and accuracy of estimating maize phenotypic traits using UAV-based red-green-blue (RGB) and multispectral sensors. UAV images were obtained at four growth stages. The reflectance of visible light bands, canopy coverage, plant height (PH), and texture information were extracted from RGB images, and the vegetation indices were calculated from multispectral images. We compared and analyzed the estimation accuracy of single-type features and multiple features for LAI (leaf area index), fresh weight (FW), and dry weight (DW) of maize. The basic models included ridge regression (RR), support vector machine (SVM), random forest (RF), Gaussian process (GP), and K-neighbor network (K-NN). The ensemble learning models included stacking and Bayesian model averaging (BMA). The results showed that the ensemble learning model improved the accuracy and stability of maize phenotypic traits estimation. Among the features extracted from UAV RGB images, the highest accuracy was obtained by the combination of spectrum, structure, and texture features. The model constructed using all features from the two sensors had the best accuracy. The estimation accuracies of ensemble learning models, including stacking and BMA, were higher than those of the basic models. The coefficients of determination (R²) of the optimal validation results were 0.852, 0.888, and 0.929 for LAI, FW, and DW, respectively.
Therefore, the combination of UAV-based multisource data and ensemble learning model could accurately estimate phenotyping traits of breeding maize at plot scale.
Affiliation(s)
- Meiyan Shu
  - College of Land Science and Technology, China Agricultural University, Beijing 100091, China
- Shuaipeng Fei
  - College of Land Science and Technology, China Agricultural University, Beijing 100091, China
- Bingyu Zhang
  - College of Land Science and Technology, China Agricultural University, Beijing 100091, China
- Xiaohong Yang
  - State Key Laboratory of Plant Physiology and Biochemistry, National Maize Improvement Center of China, China Agricultural University, Beijing 100091, China
- Yan Guo
  - College of Land Science and Technology, China Agricultural University, Beijing 100091, China
- Baoguo Li
  - College of Land Science and Technology, China Agricultural University, Beijing 100091, China
- Yuntao Ma
  - College of Land Science and Technology, China Agricultural University, Beijing 100091, China
|
12
|
Fei S, Hassan MA, Xiao Y, Su X, Chen Z, Cheng Q, Duan F, Chen R, Ma Y. UAV-based multi-sensor data fusion and machine learning algorithm for yield prediction in wheat. PRECISION AGRICULTURE 2022; 24:187-212. [PMID: 35967193 PMCID: PMC9362526 DOI: 10.1007/s11119-022-09938-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Accepted: 06/30/2022] [Indexed: 05/31/2023]
Abstract
Early prediction of grain yield helps scientists to make better breeding decisions for wheat. Use of machine learning (ML) methods for fusion of unmanned aerial vehicle (UAV)-based multi-sensor data can improve the prediction accuracy of crop yield. For this, five ML algorithms including Cubist, support vector machine (SVM), deep neural network (DNN), ridge regression (RR) and random forest (RF) were used for multi-sensor data fusion and ensemble learning for grain yield prediction in wheat. A set of thirty wheat cultivars and breeding lines were grown under three irrigation treatments, i.e., light, moderate and high irrigation, to evaluate the yield prediction capabilities of a low-cost multi-sensor (RGB, multi-spectral and thermal infrared) UAV platform. Multi-sensor data fusion-based yield prediction showed higher accuracy compared to individual-sensor data in each ML model. The coefficient of determination (R²) values for Cubist, SVM, DNN and RR models regarding grain yield prediction ranged from 0.527 to 0.670. Moreover, the results of ensemble learning through integrating the above models illustrated a further increase in accuracy. The predictions of ensemble learning showed high R² values up to 0.692, which was higher than those of individual ML models across the multi-sensor data. Root mean square error (RMSE), residual prediction deviation (RPD) and ratio of prediction performance to inter-quartile range (RPIQ) were calculated to be 0.916 t ha⁻¹, 1.771 and 2.602, respectively. The results proved that low altitude UAV-based multi-sensor data can be used for early grain yield prediction using data fusion and an ensemble learning framework with high accuracy. This high-throughput phenotyping approach is valuable for improving the efficiency of selection in large breeding activities. Supplementary information: The online version contains supplementary material available at 10.1007/s11119-022-09938-8.
Affiliation(s)
- Shuaipeng Fei
  - Institute of Farmland Irrigation, Chinese Academy of Agricultural Sciences, Xinxiang, 453002 China
- Muhammad Adeel Hassan
  - National Wheat Improvement Centre, Institute of Crop Sciences, Chinese Academy of Agricultural Sciences, Beijing, 100081 China
  - Dezhou Academy of Agricultural Sciences, Dezhou, 253050 China
- Yonggui Xiao
  - National Wheat Improvement Centre, Institute of Crop Sciences, Chinese Academy of Agricultural Sciences, Beijing, 100081 China
- Xin Su
  - Water Diversion and Irrigation Engineering Technology Center, Yellow River Institute of Hydraulic Research, Zhengzhou, 450003 China
- Zhen Chen
  - Institute of Farmland Irrigation, Chinese Academy of Agricultural Sciences, Xinxiang, 453002 China
- Qian Cheng
  - Institute of Farmland Irrigation, Chinese Academy of Agricultural Sciences, Xinxiang, 453002 China
- Fuyi Duan
  - Institute of Farmland Irrigation, Chinese Academy of Agricultural Sciences, Xinxiang, 453002 China
- Riqiang Chen
  - School of Information Science and Technology, Beijing Forestry University, Beijing, 100083 China
- Yuntao Ma
  - College of Land Science and Technology, China Agricultural University, Beijing, 100193 China
|
13
|
Chen S, Zhang Z, Lin J, Huang J. Machine learning-based estimation of riverine nutrient concentrations and associated uncertainties caused by sampling frequencies. PLoS One 2022; 17:e0271458. [PMID: 35830456 PMCID: PMC9278742 DOI: 10.1371/journal.pone.0271458] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/06/2022] [Accepted: 06/30/2022] [Indexed: 11/23/2022]
Abstract
Accurate and sufficient water quality data is essential for watershed management and sustainability. Machine learning models have shown great potential for estimating water quality with the development of online sensors. However, accurate estimation is challenging because of uncertainties related to the models used and the data input. In this study, random forest (RF), support vector machine (SVM), and back-propagation neural network (BPNN) models are developed with three sampling frequency datasets (i.e., 4-hourly, daily, and weekly) and five conventional indicators (i.e., water temperature (WT), hydrogen ion concentration (pH), electrical conductivity (EC), dissolved oxygen (DO), and turbidity (TUR)) as surrogates to individually estimate riverine total phosphorus (TP), total nitrogen (TN), and ammonia nitrogen (NH₄⁺-N) in a small-scale coastal watershed. The results show that the RF model outperforms the SVM and BPNN machine learning models in terms of estimative performance, explaining much of the variation in TP (79 ± 1.3%), TN (84 ± 0.9%), and NH₄⁺-N (75 ± 1.3%) when using the 4-hourly sampling frequency dataset. A higher sampling frequency helped the RF obtain significantly better performance for the three nutrient estimates (4-hourly > daily > weekly) in terms of R² and NSE values. WT, EC, and TUR were the three key input indicators for nutrient estimations in RF. Our study highlights the importance of high-frequency data as input to machine learning model development. The RF model is shown to be viable for riverine nutrient estimation in small-scale watersheds that are important for local water security.
Affiliation(s)
- Shengyue Chen
  - Fujian Key Laboratory of Coastal Pollution Prevention and Control, Xiamen University, Xiamen, China
- Zhenyu Zhang
  - Fujian Key Laboratory of Coastal Pollution Prevention and Control, Xiamen University, Xiamen, China
- Juanjuan Lin
  - Xiamen Environmental Publicity and Education Center, Xiamen, China
- Jinliang Huang
  - Fujian Key Laboratory of Coastal Pollution Prevention and Control, Xiamen University, Xiamen, China
  - * E-mail:
|
14
|
Ahamed A, Knight R, Alam S, Pauloo R, Melton F. Assessing the utility of remote sensing data to accurately estimate changes in groundwater storage. THE SCIENCE OF THE TOTAL ENVIRONMENT 2022; 807:150635. [PMID: 34606871 DOI: 10.1016/j.scitotenv.2021.150635] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 07/21/2021] [Revised: 09/09/2021] [Accepted: 09/23/2021] [Indexed: 06/13/2023]
Abstract
Accurate and timely estimates of groundwater storage changes are critical to the sustainable management of aquifers worldwide, but are hindered by the lack of in-situ groundwater measurements in most regions. Hydrologic remote sensing measurements provide a potential pathway to quantify groundwater storage changes by closing the water balance, but the degree to which remote sensing data can accurately estimate groundwater storage changes is unclear. In this study, we quantified groundwater storage changes in California's Central Valley at two spatial scales for the period 2002 through 2020 using remote sensing data and an ensemble water balance method. To evaluate performance, we compared estimates of groundwater storage changes to three independent estimates: GRACE satellite data, groundwater wells and a groundwater flow model. Results suggest evapotranspiration has the highest uncertainty among water balance components, while precipitation has the lowest. We found that remote sensing-based groundwater storage estimates correlated well with independent estimates; annual trends during droughts fall within 15% of trends calculated using wells and groundwater models within the Central Valley. Remote sensing-based estimates also reliably estimated the long-term trend, seasonality, and rate of groundwater depletion during major drought events. Additionally, our study suggests that the proposed method can estimate changes in groundwater at sub-annual latencies, which is not currently possible using other methods. The findings have implications for improving the understanding of aquifer dynamics and can inform regional water managers about the status of groundwater systems during droughts.
Affiliation(s)
- Aakash Ahamed
  - Department of Geophysics, Stanford University, 397 Panama Mall, Stanford, CA 94305, United States of America
- Rosemary Knight
  - Department of Geophysics, Stanford University, 397 Panama Mall, Stanford, CA 94305, United States of America
- Sarfaraz Alam
  - Department of Geophysics, Stanford University, 397 Panama Mall, Stanford, CA 94305, United States of America
- Rich Pauloo
  - Hydrologic Sciences, University of California, Davis, One Shields Avenue, Davis, CA 95616, United States of America
- Forrest Melton
  - Department of Applied Environmental Sciences, California State University, Monterey Bay, 100 Campus Center, Seaside, CA 93955, United States of America; Biospheric Sciences Branch, NASA Ames Research Center, Mail Stop 245, Moffett Field, CA 94035, United States of America
|