1
Bang GH, Gwon NH, Cho MJ, Park JY, Baek SS. Developing a real-time water quality simulation toolbox using machine learning and application programming interface. Journal of Environmental Management 2025; 377:124719. [PMID: 40022793 DOI: 10.1016/j.jenvman.2025.124719] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/31/2024] [Revised: 02/21/2025] [Accepted: 02/22/2025] [Indexed: 03/04/2025]
Abstract
Rivers are vital for sustaining human life as they foster social development, provide drinking water, maintain aquatic ecosystems, and offer recreational spaces. However, most rivers are being increasingly contaminated by pollutants from non-point sources, urbanization, and other sources. Consequently, real-time river water quality modeling is essential for managing and protecting rivers from contamination, and its significance is growing across various sectors, including public health, agriculture, and water treatment systems. Therefore, a real-time river water quality simulation toolbox was developed using machine learning (ML) and an application program interface (API). To create the toolbox, models that simulated water quality parameters such as chlorophyll a (Chl-a), dissolved oxygen (DO), total nitrogen (TN), total organic carbon (TOC), and total phosphorus (TP) at each point in the Nakdong River were constructed. The models were constructed using Artificial neural network (ANN), Random Forest (RF), support vector machines (SVM), and data from API. Subsequently, hyperparameter optimization was conducted to enhance the model's performance. During training, the models' performances were evaluated and compared based on the data sampling method and ML algorithms. Models trained with random sampling data outperformed those trained with time-series data. Among the algorithm models that used random sampling data, the RF exhibited the best performance. The average coefficient of determination (R2) values for each water quality simulation with randomly sampled data using RF for DO, TN, TP, Chl-a, and TOC were 0.79, 0.65, 0.74, 0.45, and 0.48, respectively. For ANN, they were 0.7, 0.51, 0.64, 0.35, and 0.35, respectively, and for SVM, they were 0.73, 0.51, 0.59, 0.21, and 0.3, respectively. The Chl-a and TOC models exhibited relatively poor performance, whereas the DO, TN, and TP models demonstrated superior performance. 
Diversifying the input data variables is necessary to improve the performance of the Chl-a and TOC models. Sensitivity and uncertainty analyses were conducted to evaluate the models and improve understanding of their behavior. Furthermore, a graphical user interface (GUI) toolbox was provided to maximize user convenience.
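The two data sampling methods compared in this abstract can be illustrated with a minimal sketch. It assumes an 80/20 train/test ratio and a fixed random seed, neither of which is stated in the abstract, and the function names `chronological_split` and `random_split` are hypothetical; they are not taken from the toolbox.

```python
import random


def chronological_split(records, train_fraction=0.8):
    """Time-series split: the earliest records train, the latest test."""
    cut = int(len(records) * train_fraction)
    return records[:cut], records[cut:]


def random_split(records, train_fraction=0.8, seed=0):
    """Random sampling split: records are shuffled before cutting."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Illustrative stand-in for time-ordered observations (not real data).
records = list(range(10))
train_ts, test_ts = chronological_split(records)
train_rnd, test_rnd = random_split(records)
```

With a random split, test samples can fall between training samples in time, which often makes the test set easier to predict than in a chronological split. That may contribute to the difference reported above, although the abstract does not say so.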
Affiliation(s)
- Gi-Hun Bang
- Department of Integrated Water Management, Yeungnam University, Daehak-ro 280, Gyeongsan-si, Water Campus, Korea Water Cluster, Gukgasandan-daero 40-gil, Guji-myeon, Dalseong-gun, Gyeongsangbuk-do, Daegu, Republic of Korea
- Na-Hyeon Gwon
- Department of Environmental Engineering, Yeongnam University, 280 Daehak-Ro, Gyeonsan-Si, Gyeongbuk, 38541, Republic of Korea
- Min-Jeong Cho
- Department of Environmental Engineering, Yeongnam University, 280 Daehak-Ro, Gyeonsan-Si, Gyeongbuk, 38541, Republic of Korea
- Ji-Ye Park
- Department of Environmental Engineering, Yeongnam University, 280 Daehak-Ro, Gyeonsan-Si, Gyeongbuk, 38541, Republic of Korea
- Sang-Soo Baek
- Department of Environmental Engineering, Yeongnam University, 280 Daehak-Ro, Gyeonsan-Si, Gyeongbuk, 38541, Republic of Korea.
2
Salubi EA, Gizaw Z, Schuster-Wallace CJ, Pietroniro A. Climate change and waterborne diseases in temperate regions: a systematic review. Journal of Water and Health 2025; 23:58-78. [PMID: 39882854 DOI: 10.2166/wh.2024.314] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/28/2024] [Accepted: 11/29/2024] [Indexed: 01/31/2025]
Abstract
Risk of waterborne diseases (WBDs) persists in temperate regions. The extent of influence of climate-related factors on the risk of specific WBDs in a changing climate and the projections of future climate scenarios on WBDs in temperate regions are unclear. A systematic review was conducted to identify specific waterborne pathogens and diseases prevalent in temperate region literature and transmission cycle associations with a changing climate. Projections of WBD risk based on future climate scenarios and models used to assess future disease risk were identified. Seventy-five peer-reviewed full-text articles for temperate regions published in the English language were included in this review after a search of Scopus and Web of Science databases from 2010 to 2023. Using thematic analysis, climate-related drivers impacting WBD risk were identified. Risk of WBDs was influenced mostly by weather (rainfall: 22% and heavy rainfall: 19%) across the majority of temperate regions and hydrological (streamflow: 50%) factors in Europe. Future climate scenarios suggest that WBD risk is likely to increase in temperate regions. Given the need to understand changes and potential feedback across fate, transport and exposure pathways, more studies should combine data-driven and process-based models to better assess future risks using model simulations.
Affiliation(s)
- Eunice A Salubi
- Department of Geography and Planning, University of Saskatchewan, 117 Science Place, Saskatoon, Saskatchewan S7N 5C8, Canada; Global Institute for Water Security, University of Saskatchewan, 11 Innovation Boulevard, Saskatoon, Saskatchewan S7N 3H5, Canada
- Zemichael Gizaw
- Department of Geography and Planning, University of Saskatchewan, 117 Science Place, Saskatoon, Saskatchewan S7N 5C8, Canada; Global Institute for Water Security, University of Saskatchewan, 11 Innovation Boulevard, Saskatoon, Saskatchewan S7N 3H5, Canada; Department of Environmental and Occupational Health and Safety, Institute of Public Health, College of Medicine and Health Sciences, University of Gondar, Gondar, Ethiopia
- Corinne J Schuster-Wallace
- Department of Geography and Planning, University of Saskatchewan, 117 Science Place, Saskatoon, Saskatchewan S7N 5C8, Canada; Global Institute for Water Security, University of Saskatchewan, 11 Innovation Boulevard, Saskatoon, Saskatchewan S7N 3H5, Canada
- Alain Pietroniro
- Global Institute for Water Security, University of Saskatchewan, 11 Innovation Boulevard, Saskatoon, Saskatchewan S7N 3H5, Canada; Schulich School of Engineering, University of Calgary, 622 Collegiate Pl NW, Calgary, Alberta T2N 4V8, Canada
3
Suh S, Moon J, Jung S, Pyo J. Improving fecal bacteria estimation using machine learning and explainable AI in four major rivers, South Korea. The Science of the Total Environment 2024; 957:177459. [PMID: 39536862 DOI: 10.1016/j.scitotenv.2024.177459] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/16/2024] [Revised: 10/27/2024] [Accepted: 11/06/2024] [Indexed: 11/16/2024]
Abstract
This study addresses the critical public health issue of fecal coliform contamination in the four major rivers in South Korea (Han, Nakdong, Geum, and Yeongsan rivers) by applying advanced machine learning (ML) algorithms combined with Explainable Artificial Intelligence to enhance both prediction accuracy and interpretability. Both traditional and machine learning models often face challenges in accurately estimating fecal coliform levels due to the complexity of environmental variables and data limitations. To address this limitation, we employed two tree-based models (i.e., random forest [RF] and extreme gradient boosting [XGBoost]) and two neural network models (i.e., deep neural network and convolutional neural network [CNN]). We also used Shapley Additive Explanations (SHAP) to provide a more comprehensive understanding of the influence of each variable on the model's predictions. Based on a comprehensive dataset collected from the National Institute of Environmental Research covering 16 water quality parameters and meteorological data from 2014 to 2022, our study improved the accuracy of fecal coliform estimation using XGBoost and CNN models. The optimal result was obtained using XGBoost, which had a validation Nash-Sutcliffe efficiency of 0.597 in the Han River. In addition, this study provides insights into the significant factors influencing fecal coliform concentrations across different river environments using the SHAP model. The results indicated that the XGBoost model provided superior estimation accuracy and explanations for the contributions of variables. The SHAP results provided the precise contribution of each water quality variable that affected the fecal estimation results using the XGBoost model. The study facilitates an improved understanding of the relationship between water quality variables and fecal coliform contamination mechanisms in the four major rivers in South Korea.
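The Nash-Sutcliffe efficiency (NSE) quoted in this abstract has a standard definition: one minus the ratio of the squared model error to the variance of the observations about their mean. A minimal sketch follows; the function name `nash_sutcliffe` and the sample values are illustrative assumptions, not taken from the study.

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 - sum((obs - sim)^2) / sum((obs - mean_obs)^2).

    A value of 1 means a perfect match; 0 means the model is no better
    than always predicting the mean of the observations.
    """
    if len(observed) != len(simulated):
        raise ValueError("observed and simulated must have the same length")
    mean_obs = sum(observed) / len(observed)
    error = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    spread = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - error / spread


# Made-up values for illustration only.
perfect = nash_sutcliffe([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
mean_only = nash_sutcliffe([1.0, 2.0, 3.0], [2.0, 2.0, 2.0])
```

On this scale, the reported validation value of 0.597 indicates a model that performs better than the mean of the observations but still leaves part of the variance unexplained.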
Affiliation(s)
- SungMin Suh
- Department of Environmental Engineering, Pusan National University, Busan 46241, Republic of Korea
- JunGi Moon
- Department of Environmental Engineering, Pusan National University, Busan 46241, Republic of Korea
- Sangjin Jung
- Department of Environmental Engineering, Pusan National University, Busan 46241, Republic of Korea
- JongCheol Pyo
- Department of Environmental Engineering, Pusan National University, Busan 46241, Republic of Korea.
4
Volf G, Sušanj Čule I, Zorko S. Influence of the physiochemical parameters on the occurrence of E. coli bacteria in a small and shallow reservoir. Journal of Water and Health 2024; 22:2206-2217. [PMID: 39611679 DOI: 10.2166/wh.2024.394] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/24/2024] [Accepted: 10/10/2024] [Indexed: 11/30/2024]
Abstract
The microbiological quality of water plays a crucial role in the relationship among human, animal, and environmental health. This research gives insight into the relationship between concentrations of Escherichia coli bacteria and physiochemical parameters in water that is captured from the Butoniga reservoir and then treated in the Butoniga drinking water treatment plant. The analysis was carried out using the Pearson correlation coefficient and supported with PCA. It revealed that turbidity and Fe have the highest correlation coefficients with E. coli bacteria. Turbidity was also identified as a potential indicator for E. coli bacteria. Additionally, Mn and UV 254 were found to be closely related to E. coli bacteria, alongside turbidity and Fe. Furthermore, the relationship between E. coli bacteria and the different water intakes was examined. This showed that higher concentrations of E. coli bacteria were present when water was captured from lower water intakes, which are characterized by increased water turbidity. Thus, the research results provide important information on influential water quality parameters related to E. coli bacteria, especially in the Butoniga reservoir and the related drinking water treatment plant, creating a foundation for future water quality management.
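The Pearson correlation coefficient used in this study has a standard form, sketched below in plain Python. The function name `pearson_r` and the sample turbidity and E. coli values are made up for illustration; they are not data from the Butoniga reservoir.

```python
import math


def pearson_r(x, y):
    """Pearson correlation: covariance of x and y over the product of their spreads."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Hypothetical paired measurements (illustrative only).
turbidity = [1.0, 2.0, 3.0, 4.0]
e_coli = [10.0, 20.0, 30.0, 40.0]
r = pearson_r(turbidity, e_coli)
```

A coefficient near 1 or -1 indicates a strong linear association, but it does not by itself show that one parameter causes the other.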
Affiliation(s)
- Goran Volf
- Department of Hydrotehnics, Faculty of Civil Engineering, University of Rijeka, Radmile Matejčić 3, 51000, Rijeka, The Republic of Croatia
- Ivana Sušanj Čule
- Department of Hydrotehnics, Faculty of Civil Engineering, University of Rijeka, Radmile Matejčić 3, 51000, Rijeka, The Republic of Croatia
- Sonja Zorko
- Istarski vodovod d.o.o., Drinking Water Treatment Plant Butoniga, Sv. Ivan 8, 52420, Buzet, The Republic of Croatia
5
Greenwood EE, Lauber T, van den Hoogen J, Donmez A, Bain RES, Johnston R, Crowther TW, Julian TR. Mapping safe drinking water use in low- and middle-income countries. Science 2024; 385:784-790. [PMID: 39146419 DOI: 10.1126/science.adh9578] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/14/2023] [Accepted: 06/26/2024] [Indexed: 08/17/2024]
Abstract
Safe drinking water access is a human right, but data on safely managed drinking water services (SMDWS) is lacking for more than half of the global population. We estimate SMDWS use in 135 low- and middle-income countries (LMICs) at subnational levels with a geospatial modeling approach, combining existing household survey data with available global geospatial datasets. We estimate that only one in three people used SMDWS in LMICs in 2020 and identified fecal contamination as the primary limiting factor affecting almost half of the population of LMICs. Our results are relevant for raising awareness about the challenges and limitations of current global monitoring approaches and demonstrating how globally available geospatial data can be leveraged to fill data gaps and identify priority areas in LMICs.
Affiliation(s)
- Esther E Greenwood
- Department of Environmental Systems Science, ETH Zurich, Swiss Federal Institute of Technology, Zurich, Switzerland
- Eawag, Swiss Federal Institute of Aquatic Science and Technology, Dübendorf, Switzerland
- Thomas Lauber
- Department of Environmental Systems Science, ETH Zurich, Swiss Federal Institute of Technology, Zurich, Switzerland
- Johan van den Hoogen
- Department of Environmental Systems Science, ETH Zurich, Swiss Federal Institute of Technology, Zurich, Switzerland
- Ayca Donmez
- Division of Data, Analytics, Planning and Monitoring, United Nations Children's Fund, New York, NY, USA
- Robert E S Bain
- Division of Data, Analytics, Planning and Monitoring, United Nations Children's Fund, New York, NY, USA
- Regional Office for the Middle East and North Africa, United Nations Children's Fund, Amman, Jordan
- Richard Johnston
- Department of Environment, Climate Change and Health, World Health Organization, Geneva, Switzerland
- Thomas W Crowther
- Department of Environmental Systems Science, ETH Zurich, Swiss Federal Institute of Technology, Zurich, Switzerland
- Timothy R Julian
- Eawag, Swiss Federal Institute of Aquatic Science and Technology, Dübendorf, Switzerland
- Swiss Tropical and Public Health Institute, Allschwill, Switzerland
- University of Basel, Basel, Switzerland
6
Hong SM, Morgan BJ, Stocker MD, Smith JE, Kim MS, Cho KH, Pachepsky YA. Using machine learning models to estimate Escherichia coli concentration in an irrigation pond from water quality and drone-based RGB imagery data. Water Research 2024; 260:121861. [PMID: 38875854 DOI: 10.1016/j.watres.2024.121861] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/11/2024] [Revised: 05/29/2024] [Accepted: 05/30/2024] [Indexed: 06/16/2024]
Abstract
The rapid and efficient quantification of Escherichia coli concentrations is crucial for monitoring water quality. Remote sensing techniques and machine learning algorithms have been used to detect E. coli in water and estimate its concentrations. The application of these approaches, however, is challenged by limited sample availability and unbalanced water quality datasets. In this study, we estimated the E. coli concentration in an irrigation pond in Maryland, USA, during the summer season using demosaiced natural color (red, green, and blue: RGB) imagery in the visible and infrared spectral ranges, and a set of 14 water quality parameters. We did this by deploying four machine learning models - Random Forest (RF), Gradient Boosting Machine (GBM), Extreme Gradient Boosting (XGB), and K-nearest Neighbor (KNN) - under three data utilization scenarios: water quality parameters only, combined water quality and small unmanned aircraft system (sUAS)-based RGB data, and RGB data only. To select the training and test datasets, we applied two data-splitting methods: ordinary and quantile data splitting. These methods provided a constant splitting ratio in each decile of the E. coli concentration distribution. Quantile data splitting resulted in better model performance metrics and smaller differences between the metrics for both the training and testing datasets. When trained with quantile data splitting after hyperparameter optimization, models RF, GBM, and XGB had R2 values above 0.847 for the training dataset and above 0.689 for the test dataset. The combination of water quality and RGB imagery data resulted in a higher R2 value (>0.896) for the test dataset. Shapley additive explanations (SHAP) of the relative importance of variables revealed that the visible blue spectrum intensity and water temperature were the most influential parameters in the RF model. Demosaiced RGB imagery served as a useful predictor of E. coli concentration in the studied irrigation pond.
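The quantile data splitting described in this abstract, which keeps a constant train/test ratio within each decile of the target distribution, can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the 80/20 ratio, the seed, and the function name `quantile_split` are all hypothetical.

```python
import random


def quantile_split(targets, train_fraction=0.8, n_bins=10, seed=0):
    """Split sample indices so each quantile bin of the target keeps the same ratio.

    Samples are ranked by target value, cut into n_bins equal-count bins
    (deciles by default), and each bin is split separately.
    """
    rng = random.Random(seed)
    order = sorted(range(len(targets)), key=lambda i: targets[i])
    train_idx, test_idx = [], []
    for b in range(n_bins):
        start = b * len(order) // n_bins
        stop = (b + 1) * len(order) // n_bins
        bin_idx = order[start:stop]
        rng.shuffle(bin_idx)  # random choice of test samples inside the bin
        cut = round(len(bin_idx) * train_fraction)
        train_idx.extend(bin_idx[:cut])
        test_idx.extend(bin_idx[cut:])
    return sorted(train_idx), sorted(test_idx)


# Synthetic target values standing in for E. coli concentrations.
targets = [float(i) for i in range(100)]
train_idx, test_idx = quantile_split(targets)
```

Compared with an ordinary random split, this guarantees that both low and high concentrations appear in the training and test sets in the same proportion, which matters when the dataset is small or unbalanced.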
Affiliation(s)
- Seok Min Hong
- USDA-ARS Environmental Microbial and Food Safety Laboratory, 10300 Baltimore Ave, Bldg. 173, Beltsville, MD, 20705, USA; Department of Civil Urban Earth and Environmental Engineering, Ulsan National Institute of Science and Technology, UNIST-gil 50, Ulsan, 44919, South Korea
- Billie J Morgan
- USDA-ARS Environmental Microbial and Food Safety Laboratory, 10300 Baltimore Ave, Bldg. 173, Beltsville, MD, 20705, USA
- Matthew D Stocker
- USDA-ARS Environmental Microbial and Food Safety Laboratory, 10300 Baltimore Ave, Bldg. 173, Beltsville, MD, 20705, USA
- Jaclyn E Smith
- USDA-ARS Environmental Microbial and Food Safety Laboratory, 10300 Baltimore Ave, Bldg. 173, Beltsville, MD, 20705, USA
- Moon S Kim
- USDA-ARS Environmental Microbial and Food Safety Laboratory, 10300 Baltimore Ave, Bldg. 173, Beltsville, MD, 20705, USA
- Kyung Hwa Cho
- School of Civil, Environmental and Architectural Engineering, Korea University, Seoul, 02841, South Korea.
- Yakov A Pachepsky
- USDA-ARS Environmental Microbial and Food Safety Laboratory, 10300 Baltimore Ave, Bldg. 173, Beltsville, MD, 20705, USA.
7
Mansouri S. Recent developments of (bio)-sensors for detection of main microbiological and non-biological pollutants in plastic bottled water samples: A critical review. Talanta 2024; 274:125962. [PMID: 38537355 DOI: 10.1016/j.talanta.2024.125962] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/21/2024] [Revised: 02/27/2024] [Accepted: 03/20/2024] [Indexed: 05/04/2024]
Abstract
The importance of water in all biological processes is undeniable. Ensuring access to clean and safe drinking water is crucial for maintaining sustainable water resources, because consuming water of inadequate quality can harm human health. Furthermore, owing to the instability of tap water quality, global consumption of bottled water is increasing every day. Although most people believe bottled water is safe, it can also be contaminated by microbiological or chemical pollutants, which can increase the risk of disease. Over recent decades, several conventional analytical tools have been applied to analyze the contamination of bottled water. However, some limitations restrict their application in this field. Therefore, biosensors, as an emerging analytical method, have attracted tremendous attention for detecting both microbial and chemical contamination of bottled water. Biosensors offer several advantages, including selectivity, affordability, and sensitivity. In this review, the biosensors developed for analyzing contamination of bottled water are highlighted, along with their working strategies and the pros and cons of the studies. Challenges and prospects are also examined.
Affiliation(s)
- Sofiene Mansouri
- Department of Biomedical Technology, College of Applied Medical Sciences in Al-Kharj, Prince Sattam bin Abdulaziz University, Al-Kharj, 11942, Saudi Arabia; University of Tunis El Manar, Higher Institute of Medical Technologies of Tunis, Laboratory of Biophysics and Medical Technologies, Tunis, Tunisia.
8
Fooladi M, Nikoo MR, Mirghafari R, Madramootoo CA, Al-Rawas G, Nazari R. Robust clustering-based hybrid technique enabling reliable reservoir water quality prediction with uncertainty quantification and spatial analysis. Journal of Environmental Management 2024; 362:121259. [PMID: 38830281 DOI: 10.1016/j.jenvman.2024.121259] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/13/2023] [Revised: 05/13/2024] [Accepted: 05/25/2024] [Indexed: 06/05/2024]
Abstract
Machine learning methodology has recently been considered a smart and reliable way to monitor water quality parameters in aquatic environments like reservoirs and lakes. This study employs both individual and hybrid-based techniques to boost the accuracy of dissolved oxygen (DO) and chlorophyll-a (Chl-a) predictions in the Wadi Dayqah Dam located in Oman. At first, an AAQ-RINKO device (CTD+ sensor) was used to collect water quality parameters from different locations and varying depths in the reservoir. Second, the dataset is segmented into homogeneous clusters based on DO and Chl-a parameters by leveraging an optimized K-means algorithm, facilitating precise estimations. Third, ten sophisticated variational-individual data-driven models, namely generalized regression neural network (GRNN), random forest (RF), gaussian process regression (GPR), decision tree (DT), least-squares boosting (LSB), bayesian ridge (BR), support vector regression (SVR), K-nearest neighbors (KNN), multilayer perceptron (MLP), and group method of data handling (GMDH) are employed to estimate DO and Chl-a concentrations. Fourth, to improve prediction accuracy, bayesian model averaging (BMA), entropy weighted (EW), and a new enhanced clustering-based hybrid technique called Entropy-ORNESS are employed to combine model outputs. The Entropy-ORNESS method incorporates a Genetic Algorithm (GA) to determine optimal weights and then combine them with EW weights. Finally, the inclusion of bootstrapping techniques introduces a stochastic assessment of model uncertainty, resulting in a robust estimator model. In the validation phase, the Entropy-ORNESS technique outperforms the independent models among the three fusion-based methods, yielding R2 values of 0.92 and 0.89 for DO and Chl-a clusters, respectively. 
The proposed hybrid-based methodology demonstrates reduced uncertainty compared to single data-driven models and the two other combination frameworks, with uncertainty levels of 0.24% and 1.16% for cluster 1 of DO and cluster 2 of Chl-a, respectively. Notably, according to the comparison of field measurements with the best hybrid technique, the spatial analysis of DO and Chl-a concentrations exhibits similar patterns of variation across varying depths of the dam, with DO concentrations decreasing notably during warmer seasons. These findings collectively underscore the potential of the upgraded weighted-based hybrid approach to provide more accurate estimations of DO and Chl-a concentrations in dynamic aquatic environments.
Affiliation(s)
- Mahmood Fooladi
- Department of Civil Engineering, Isfahan University of Technology, Isfahan, Iran.
- Mohammad Reza Nikoo
- Department of Civil and Architectural Engineering, Sultan Qaboos University, Muscat, Oman.
- Rasoul Mirghafari
- School of Engineering, Computing and Mathematics, Oxford Brookes University, Oxford, United Kingdom.
- Chandra A Madramootoo
- Department of Bioresource Engineering, McGill University, Sainte-Anne-de-Bellevue, Quebec, H9X 3V9, Canada.
- Ghazi Al-Rawas
- Department of Civil and Architectural Engineering, Sultan Qaboos University, Muscat, Oman.
- Rouzbeh Nazari
- School of Engineering, Department of Civil, Construction, and Environmental Engineering, Sustainable Smart Cities Research Center, University of Alabama at Birmingham, Birmingham, AL, United States.
9
Ortiz-Lopez C, Bouchard C, Rodriguez MJ. Ensemble machine learning using hydrometeorological information to improve modeling of quality parameter of raw water supplying treatment plants. Journal of Environmental Management 2024; 362:121378. [PMID: 38838533 DOI: 10.1016/j.jenvman.2024.121378] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/13/2024] [Revised: 05/03/2024] [Accepted: 06/02/2024] [Indexed: 06/07/2024]
Abstract
Source and raw water quality may deteriorate due to rainfall and river flow events that occur in watersheds. The effects on raw water quality are normally detected in drinking water treatment plants (DWTPs) with a time-lag after these events in the watersheds. Early warning systems (EWSs) in DWTPs require models with high accuracy in order to anticipate changes in raw water quality parameters. Ensemble machine learning (EML) techniques have recently been used for water quality modeling to improve accuracy and decrease variance in the outcomes. We used three decision-tree-based EML models (random forest [RF], gradient boosting [GB], and eXtreme Gradient Boosting [XGB]) to predict two critical parameters for DWTPs, raw water turbidity and UV absorbance (UV254), using rainfall and river flow time series as predictors. When modeling raw water turbidity, the three EML models (r² = 0.87 for RF, 0.80 for GB, and 0.81 for XGB) showed very good performance metrics. For raw water UV254, the three models (r² = 0.89 for RF, 0.85 for GB, and 0.88 for XGB) again showed very good performance metrics. Results from this study suggest that EML approaches could be used in EWSs to anticipate changes in the quality parameters of raw water and enhance decision-making in DWTPs.
Affiliation(s)
- Christian Ortiz-Lopez
- Centre de Recherche en Aménagement et Développement (CRAD), Université Laval, 2325 Allée des Bibliothèques, Québec City, QC, G1V 0A6, Canada.
- Christian Bouchard
- Centre de Recherche en Aménagement et Développement (CRAD), Université Laval, 2325 Allée des Bibliothèques, Québec City, QC, G1V 0A6, Canada
- Manuel J Rodriguez
- École Supérieure d'Aménagement du Territoire et de Développement Régional (ESAD), Université Laval, 2325 Allée des Bibliothèques, Québec City, QC, G1V 0A6, Canada
10
Serna-Carrizales JC, Zárate-Guzmán AI, Flores-Ramírez R, Díaz de León-Martínez L, Aguilar-Aguilar A, Warren-Vega WM, Bailón-García E, Ocampo-Pérez R. Application of artificial intelligence for the optimization of advanced oxidation processes to improve the water quality polluted with pharmaceutical compounds. Chemosphere 2024; 351:141216. [PMID: 38224748 DOI: 10.1016/j.chemosphere.2024.141216] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/06/2023] [Revised: 12/29/2023] [Accepted: 01/12/2024] [Indexed: 01/17/2024]
Abstract
Sulfamethoxazole and metronidazole are emerging pollutants commonly found in surface water and wastewater. These compounds have a significant environmental impact, which makes it necessary to design technologies for their removal. Recently, advanced oxidation processes have proven successful in eliminating this kind of compound. Accordingly, the present work discusses the application of UV/H2O2 and ozonation for the degradation of both molecules in single and binary systems. Experimental kinetic data from the O3 and UV/H2O2 processes were adequately described by first- and second-order kinetic models, respectively. From the ANOVA, it was determined that the most statistically significant variables for the UV/H2O2 system were the initial concentration of the drugs (0.03 mmol L⁻¹) and the pH (8), whereas only the pH (optimal value of 6) was significant for degradation with O3. Results showed that both molecules were eliminated with high degradation efficiencies (88-94% for UV/H2O2 and 79-98% for O3) in short reaction times (around 30-90 min). The modeling was performed using a quadratic regression model through response surface methodology, which adequately represented 90% of the experimental data. In addition, an artificial neural network was used to evaluate the non-linear multi-variable system, and a 98% fit between the model and the experimental data was obtained. The identification of degradation byproducts was performed by high-performance liquid chromatography coupled to a time mass detector. After each process, at least four to five stable byproducts were found in the treated water, reducing the mineralization percentage to 20% for both molecules.
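For reference, the textbook first-order and second-order rate laws and their integrated forms are given below, on the assumption that the kinetic models mentioned in this abstract are of these standard types. The abstract does not give the equations, and the symbols C (concentration at time t), C_0 (initial concentration), k_1, and k_2 (rate constants) are notation introduced here.

```latex
% First-order kinetics
\frac{dC}{dt} = -k_1 C
\quad\Longrightarrow\quad
C(t) = C_0 \, e^{-k_1 t}

% Second-order kinetics
\frac{dC}{dt} = -k_2 C^2
\quad\Longrightarrow\quad
\frac{1}{C(t)} = \frac{1}{C_0} + k_2 t
```

In practice, a first-order fit appears as a straight line when ln(C_0/C) is plotted against t, and a second-order fit as a straight line when 1/C is plotted against t.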
Affiliation(s)
- Juan Carlos Serna-Carrizales
- Centro de Investigación y Estudios de Posgrado, Facultad de Ciencias Químicas, Universidad Autónoma de San Luis Potosí, Av. Dr. Manuel Nava 6, San Luis Potosí, 78210, Mexico
- Ana I Zárate-Guzmán
- Centro de Investigación y Estudios de Posgrado, Facultad de Ciencias Químicas, Universidad Autónoma de San Luis Potosí, Av. Dr. Manuel Nava 6, San Luis Potosí, 78210, Mexico; Grupo de Investigación en Materiales y Fenómenos de Superficie, Departamento de Biotecnológicas y Ambientales, Universidad Autónoma de Guadalajara, Av. Patria 1201, C.P, 45129, Zapopan, Jalisco, Mexico.
- Rogelio Flores-Ramírez
- Programa Multidisciplinario de Posgrado en Ciencias Ambientales, Universidad Autónoma de San Luis Potosí, Av. Manuel Nava No. 201, San Luis Potosí, 78210, Mexico
- Angélica Aguilar-Aguilar
- Centro de Investigación y Estudios de Posgrado, Facultad de Ciencias Químicas, Universidad Autónoma de San Luis Potosí, Av. Dr. Manuel Nava 6, San Luis Potosí, 78210, Mexico
- Walter M Warren-Vega
- Grupo de Investigación en Materiales y Fenómenos de Superficie, Departamento de Biotecnológicas y Ambientales, Universidad Autónoma de Guadalajara, Av. Patria 1201, C.P, 45129, Zapopan, Jalisco, Mexico
- Esther Bailón-García
- Grupo de Investigación en Materiales de Carbón, Departamento de Química Inorgánica, Facultad de Ciencias, Universidad de Granada, Campus Fuente Nueva S/n, 18071, Granada, Spain
- Raúl Ocampo-Pérez
- Centro de Investigación y Estudios de Posgrado, Facultad de Ciencias Químicas, Universidad Autónoma de San Luis Potosí, Av. Dr. Manuel Nava 6, San Luis Potosí, 78210, Mexico
11
Uddin MG, Nash S, Rahman A, Dabrowski T, Olbert AI. Data-driven modelling for assessing trophic status in marine ecosystems using machine learning approaches. Environmental Research 2024; 242:117755. [PMID: 38008200 DOI: 10.1016/j.envres.2023.117755] [Citation(s) in RCA: 16] [Impact Index Per Article: 16.0] [Received: 07/17/2023] [Revised: 10/05/2023] [Accepted: 11/20/2023] [Indexed: 11/28/2023]
Abstract
Assessing eutrophication in coastal and transitional waters is of utmost importance, yet existing Trophic Status Index (TSI) models face challenges like multicollinearity, data redundancy, inappropriate aggregation methods, and complex classification schemes. To tackle these issues, we developed a novel tool that harnesses machine learning (ML) and artificial intelligence (AI), enhancing the reliability and accuracy of trophic status assessments. Our research introduces an improved data-driven methodology specifically tailored for transitional and coastal (TrC) waters, with a focus on Cork Harbour, Ireland, as a case study. Our innovative approach, named the Assessment Trophic Status Index (ATSI) model, comprises three main components: the selection of pertinent water quality indicators, the computation of ATSI scores, and the implementation of a new classification scheme. To optimize input data and minimize redundancy, we employed ML techniques, including advanced deep learning methods. Specifically, we developed a CHL prediction model utilizing ten algorithms, among which XGBoost demonstrated exceptional performance, showcasing minimal errors during both training (RMSE = 0.0, MSE = 0.0, MAE = 0.01) and testing (RMSE = 0.0, MSE = 0.0, MAE = 0.01) phases. Utilizing a novel linear rescaling interpolation function, we calculated ATSI scores and evaluated the model's sensitivity and efficiency across diverse application domains, employing metrics such as R2, the Nash-Sutcliffe efficiency (NSE), and the model efficiency factor (MEF). The results consistently revealed heightened sensitivity and efficiency across all application domains. Additionally, we introduced a brand new classification scheme for ranking the trophic status of transitional and coastal waters. 
To assess spatial sensitivity, we applied the ATSI model to four distinct waterbodies in Ireland, comparing trophic assessment outcomes with the Assessment of Trophic Status of Estuaries and Bays in Ireland (ATSEBI) System. Remarkably, significant disparities between the ATSI and ATSEBI System were evident in all domains, except for Mulroy Bay. Overall, our research significantly enhances the accuracy of trophic status assessments in marine ecosystems. The ATSI model, combined with cutting-edge ML techniques and our new classification scheme, represents a promising avenue for evaluating and monitoring trophic conditions in TrC waters. The study also demonstrated the effectiveness of ATSI in assessing trophic status across various waterbodies, including lakes, rivers, and more. These findings make substantial contributions to the field of marine ecosystem management and conservation.
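The error and efficiency metrics named in the abstract above (RMSE, MAE, R2, NSE) can be illustrated with a minimal sketch. The function names and the sample values below are hypothetical and are not taken from the study. R2 is computed here as the squared Pearson correlation, which is one common convention and an assumption about what the authors used.

```python
import math


def rmse(obs, sim):
    """Root mean squared error between observed and simulated values."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))


def mae(obs, sim):
    """Mean absolute error between observed and simulated values."""
    return sum(abs(o - s) for o, s in zip(obs, sim)) / len(obs)


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 minus error variance over observed variance."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
    sst = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / sst


def r_squared(obs, sim):
    """Squared Pearson correlation (assumed definition of R2)."""
    mean_obs = sum(obs) / len(obs)
    mean_sim = sum(sim) / len(sim)
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim))
    var_obs = sum((o - mean_obs) ** 2 for o in obs)
    var_sim = sum((s - mean_sim) ** 2 for s in sim)
    return cov ** 2 / (var_obs * var_sim)


# Invented illustrative values, not data from the paper.
observed = [2.0, 4.0, 6.0, 8.0]
simulated = [2.5, 3.5, 6.5, 7.5]

print(rmse(observed, simulated))
print(mae(observed, simulated))
print(nse(observed, simulated))
print(r_squared(observed, simulated))
```

An NSE of 1 indicates a perfect match, while a value at or below 0 means the simulation is no better than predicting the observed mean.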
Affiliation(s)
- Md Galal Uddin
- School of Engineering, University of Galway, Ireland; Ryan Institute, University of Galway, Ireland; MaREI Research Centre, University of Galway, Ireland; Eco-HydroInformatics Research Group (EHIRG), Civil Engineering, University of Galway, Ireland.
- Stephen Nash
- School of Engineering, University of Galway, Ireland; Ryan Institute, University of Galway, Ireland; MaREI Research Centre, University of Galway, Ireland
- Azizur Rahman
- School of Computing, Mathematics and Engineering, Charles Sturt University, Wagga Wagga, Australia; The Gulbali Institute of Agriculture, Water and Environment, Charles Sturt University, Wagga Wagga, Australia
- Agnieszka I Olbert
- School of Engineering, University of Galway, Ireland; Ryan Institute, University of Galway, Ireland; MaREI Research Centre, University of Galway, Ireland; Eco-HydroInformatics Research Group (EHIRG), Civil Engineering, University of Galway, Ireland
12
Dong J, Wang Z, Wu J, Huang J, Zhang C. A water quality prediction model based on signal decomposition and ensemble deep learning techniques. WATER SCIENCE AND TECHNOLOGY : A JOURNAL OF THE INTERNATIONAL ASSOCIATION ON WATER POLLUTION RESEARCH 2023; 88:2611-2632. [PMID: 38017681 PMCID: wst_2023_357 DOI: 10.2166/wst.2023.357] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 11/30/2023]
Abstract
Accurate water quality predictions are critical for water resource protection, and dissolved oxygen (DO) reflects overall river water quality and ecosystem health. This study proposes a hybrid model based on the fusion of signal decomposition and deep learning for predicting river water quality. Initially, complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) is employed to split the internal series of DO into numerous intrinsic mode functions (IMFs). Subsequently, multi-scale fuzzy entropy (MFE) is used to compute the entropy values for each IMF component. Time-varying filtered empirical mode decomposition (TVFEMD) is used to further extract features in high-frequency subsequences after linearly aggregating the high-frequency sequences. Finally, support vector machine (SVM) and long short-term memory (LSTM) neural networks are used to predict low- and high-frequency subsequences. Moreover, by comparing it with single models, models based on 'single layer decomposition-prediction-ensemble' and combination models using different methods, the feasibility of the proposed model in predicting water quality data for the Xinlian section of Fuhe River and the Chucha section of Ganjiang River was verified. As a result, the combined prediction approach developed in this work has improved generalizability and prediction accuracy, and it may be used to forecast water quality in complicated waters.
Affiliation(s)
- Jinghan Dong
- College of Marine Ecology and Environment, Shanghai Ocean University, Shanghai 201306, China
- Zhaocai Wang
- College of Information, Shanghai Ocean University, Shanghai 201306, China
- Junhao Wu
- State Key Laboratory of Estuarine and Coastal Research, East China Normal University, Shanghai 200241, China
- Jinghan Huang
- College of Economics and Management, Shanghai Ocean University, Shanghai 201306, China
- Can Zhang
- College of Information, Shanghai Ocean University, Shanghai 201306, China
13
Zhang C, Nong X, Shao D, Chen L. An integrated risk assessment framework using information theory-based coupling methods for basin-scale water quality management: A case study in the Danjiangkou Reservoir Basin, China. THE SCIENCE OF THE TOTAL ENVIRONMENT 2023; 884:163731. [PMID: 37142036 DOI: 10.1016/j.scitotenv.2023.163731] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 02/01/2023] [Revised: 03/27/2023] [Accepted: 04/21/2023] [Indexed: 05/06/2023]
Abstract
As the second largest reservoir in China, the Danjiangkou Reservoir (DJKR) has served for more than eight years as the water source of the Middle Route of the South-to-North Water Diversion Project of China (MRSNWDPC), currently the longest (1273 km) inter-basin water diversion project in the world. The water quality status of the DJKR basin has been receiving worldwide attention because it is related to the health and safety of >100 million people and the integrity of an ecosystem covering >92,500 km². In this study, basin-scale water quality sampling campaigns were conducted monthly at 47 monitoring sites in river systems of the DJKRB from 2020 to 2022, covering nine water quality indicators, i.e., water temperature (WT), pH, dissolved oxygen (DO), permanganate index (CODMn), five-day biochemical oxygen demand (BOD₅), ammonia nitrogen (NH₃-N), total phosphorus (TP), total nitrogen (TN), and fluoride (F⁻). The water quality index (WQI) and multivariate statistical techniques were introduced to comprehensively evaluate water quality status and understand the corresponding driving factors of water quality variations. An integrated risk assessment framework that simultaneously considers intra- and inter-regional factors, using information theory-based and Set-Pair Analysis (SPA) methods, was proposed for basin-scale water quality management. The results showed that the water quality of the DJKR and its tributaries stably maintained a "good" status, with the average WQI of every river system above 60 during the monitoring period. The WQIs varied significantly across the basin (Kruskal-Wallis tests, P < 0.01), while no seasonal differences were found. 
The increase in built-up land use and agricultural water consumption revealed the highest contributions (Mantel's r > 0.5, P < 0.05) to the rise of nutrient loadings of all river systems, showing the intensive anthropogenic activities can eclipse the power of natural processes on water quality variations to some extent. The risks of specific sub-basins that may cause water quality degradation on the MRSNWDPC were effectively quantified and identified into five classifications based on transfer entropy and the SPA methods. This study provides an informative risk assessment framework that was relatively easy to be applied by professionals and non-experts for basin-scale water quality management, thus providing a valuable and reliable reference for the administrative department to conduct effective pollution control in the future.
Affiliation(s)
- Chi Zhang
- State Key Laboratory of Water Resources and Hydropower Engineering Science, Wuhan University, Wuhan 430072, China
- Xizhi Nong
- State Key Laboratory of Water Resources and Hydropower Engineering Science, Wuhan University, Wuhan 430072, China; College of Civil Engineering and Architecture, Guangxi University, Nanning 530004, China
- Dongguo Shao
- State Key Laboratory of Water Resources and Hydropower Engineering Science, Wuhan University, Wuhan 430072, China.
- Lihua Chen
- College of Civil Engineering and Architecture, Guangxi University, Nanning 530004, China
14
Meng X, Yip Y, Valiyaveettil S. Understanding the aggregation, consumption, distribution and accumulation of nanoparticles of polyvinyl chloride and polymethyl methacrylate in Ruditapes philippinarum. THE SCIENCE OF THE TOTAL ENVIRONMENT 2023; 871:161955. [PMID: 36737013 DOI: 10.1016/j.scitotenv.2023.161955] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 11/22/2022] [Revised: 01/27/2023] [Accepted: 01/28/2023] [Indexed: 06/18/2023]
Abstract
Plastic products have become an integral part of our life. A widespread usage, high stability, uncontrolled disposal and slow degradation of plastics in the environment led to the generation and accumulation of nanoparticles of polymers (NPs) in the marine environment. However, little is known about the aggregation, consumption and distribution of NPs from common polymers such as polyvinyl chloride (NP-PVC) and polymethyl methacrylate (NP-PMMA) inside marine animal physiologies. In the current study, two types of polymers (PVC and PMMA) × four exposure concentrations (1, 5, 15 and 25 mg/L) × four times (4, 8, 12 and 24 h) exposure studies were conducted to understand the consumption and distribution of luminescent NP-PVC (98.6 ± 17.6 nm) and NP-PMMA (111.9 ± 37.1 nm) in R. philippinarum. Under laboratory conditions, NP-PVC showed a higher aggregation rate than NP-PMMA in seawater within a period of 24 h. Aggregations of NPs increased with an increase in initial NP concentrations, leading to significant settling of nanoparticles within 24 h exposure. Such aggregation and settling of particles enhanced the consumption of NPs by benthic filter-feeding R. philippinarum at all exposure concentrations during 4 h exposure. More interestingly, NP-PVC and NP-PMMA were observed in large amounts in both liver and gills (22.6 % - 29.1 %) of the clams. Furthermore, NP-PVC was detected in most organs of R. philippinarum as compared to NP-PMMA. This study demonstrates that different polymers distribute and accumulate differently in the same biological model under laboratory exposure conditions based on their chemical nature.
Affiliation(s)
- Xingliang Meng
- Department of Chemistry, National University of Singapore, 3 Science Drive 3, Singapore 117543, Singapore
- Yongjie Yip
- Department of Chemistry, National University of Singapore, 3 Science Drive 3, Singapore 117543, Singapore
- Suresh Valiyaveettil
- Department of Chemistry, National University of Singapore, 3 Science Drive 3, Singapore 117543, Singapore.
15
Shahid N. A proficiency assessment of integrating machine learning (ML) schemes on Lahore water ensemble. Sci Rep 2023; 13:5130. [PMID: 36991152 PMCID: PMC10060416 DOI: 10.1038/s41598-023-32280-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/19/2022] [Accepted: 03/25/2023] [Indexed: 03/31/2023] Open
Abstract
A synthesis of statistical inference and machine learning (ML) tools has been employed to establish comprehensive insight into coarse data. Water components' data for 16 central distributing locations of Lahore, the capital of the second most populated province of Pakistan, has been analyzed to gauge the current water stature of the city. Moreover, a classification of surplus-response variables through tolerance manipulation was incorporated to debrief the dimension aspect of the data. By the same token, the influence of supererogatory variables' renouncement through identification of clustering movement of constituents is inquired. The approach of building a spectrum of colluding results through application of comparable methods has been experimented. To test the propriety of each statistical method prior to its execution on a large dataset, a faction of ML schemes has been proposed. The supervised learning tools pca, factoran and clusterdata were implemented to establish an elemental character of water at elected locations. A location 'LAH-13' was highlighted for containing an out of normal range Total Dissolved Solids (TDS) concentration in the water. The classification of lower and higher variability parameters carried out by Sample Mean (XBAR) control identified a set of least correlated variables: pH, As, Total Coliforms and E. coli. The analysis provided four locations, LAH-06, LAH-10, LAH-13 and LAH-14, for extreme concentration propensity. An execution of factoran demonstrated that a specific tolerance of independent variability '0.005' could be employed to reduce the dimension of a system without loss of fundamental data information. A higher value of cophenetic coefficient, c = 0.9582, provided the validation for an accurate cluster division of similar characteristics' variables. The current approach of mutually validating ML and SA (statistical analysis) schemes will assist in preparing the groundwork for state-of-the-art (SOTA) analysis. 
The advantage of our approach can be examined through the fact that the related SOTA will further refine the predictive precision between two comparable methods, unlike the SOTA analysis between two random ML methods. Conclusively, this study featured the locations LAH-03, LAH-06, LAH-12, LAH-13, LAH-14 and LAH-15 with compromised water quality in the region.
Affiliation(s)
- Nazish Shahid
- Department of Mathematics, Forman Christian College (A Chartered University), Lahore, Pakistan.
16
Zhu M, Wang J, Yang X, Zhang Y, Zhang L, Ren H, Wu B, Ye L. A review of the application of machine learning in water quality evaluation. ECO-ENVIRONMENT & HEALTH (ONLINE) 2022; 1:107-116. [PMID: 38075524 PMCID: PMC10702893 DOI: 10.1016/j.eehl.2022.06.001] [Citation(s) in RCA: 69] [Impact Index Per Article: 23.0] [Received: 02/01/2022] [Revised: 05/19/2022] [Accepted: 06/01/2022] [Indexed: 12/31/2023]
Abstract
With the rapid increase in the volume of data on the aquatic environment, machine learning has become an important tool for data analysis, classification, and prediction. Unlike traditional models used in water-related research, data-driven models based on machine learning can efficiently solve more complex nonlinear problems. In water environment research, models and conclusions derived from machine learning have been applied to the construction, monitoring, simulation, evaluation, and optimization of various water treatment and management systems. Additionally, machine learning can provide solutions for water pollution control, water quality improvement, and watershed ecosystem security management. In this review, we describe the cases in which machine learning algorithms have been applied to evaluate the water quality in different water environments, such as surface water, groundwater, drinking water, sewage, and seawater. Furthermore, we propose possible future applications of machine learning approaches to water environments.
Affiliation(s)
- Mengyuan Zhu
- State Key Laboratory of Pollution Control and Resource Reuse, School of Environment, Nanjing University, Nanjing 210023, China
- Jiawei Wang
- State Key Laboratory of Pollution Control and Resource Reuse, School of Environment, Nanjing University, Nanjing 210023, China
- Xiao Yang
- State Key Laboratory of Pollution Control and Resource Reuse, School of Environment, Nanjing University, Nanjing 210023, China
- Yu Zhang
- State Key Laboratory of Pollution Control and Resource Reuse, School of Environment, Nanjing University, Nanjing 210023, China
- Linyu Zhang
- State Key Laboratory of Pollution Control and Resource Reuse, School of Environment, Nanjing University, Nanjing 210023, China
- Hongqiang Ren
- State Key Laboratory of Pollution Control and Resource Reuse, School of Environment, Nanjing University, Nanjing 210023, China
- Bing Wu
- State Key Laboratory of Pollution Control and Resource Reuse, School of Environment, Nanjing University, Nanjing 210023, China
- Lin Ye
- State Key Laboratory of Pollution Control and Resource Reuse, School of Environment, Nanjing University, Nanjing 210023, China