1
Reshadi MAM, Rezanezhad F, Shahvaran AR, Ghajari A, Kaykhosravi S, Slowinski S, Van Cappellen P. Assessment of environmental and socioeconomic drivers of urban stormwater microplastics using machine learning. Sci Rep 2025; 15:6299. [PMID: 39984553; PMCID: PMC11845695; DOI: 10.1038/s41598-025-90612-0] [Received: 11/23/2024] [Accepted: 02/14/2025] [Indexed: 02/23/2025]
Abstract
Microplastics (MPs) are ubiquitous environmental contaminants. Urban landscapes are major source areas of MPs, and stormwater runoff is an important transport pathway to receiving aquatic environments. To better delineate the drivers of urban stormwater MP loads, we created a global dataset of stormwater MP concentrations extracted from 107 stormwater catchments (SWCs). Using this dataset, we trained and tested three optimized gradient boosting machine learning (ML) models. Twenty hydrometeorological and socioeconomic variables, as well as the MP size definitions considered in the individual SWCs, were included as potential predictors of the observed MP concentrations. CatBoost emerged as the best-performing ML model. Shapley additive explanations revealed that features related to hydrometeorological conditions, to watershed characteristics and human activity, and to plastic waste management practices contributed 34, 25, and 4.8%, respectively, to the model's predictive performance. The MP size definition, that is, the lower size limit and the width of the size range, accounted for the remaining 36% of the variability in the predicted MP concentrations. The lack of a consistent definition of the MP size range among studies therefore represents a major source of uncertainty in the comparative analysis of urban stormwater MP concentrations. The proposed ML modeling approach can generate first estimates of MP concentrations in urban stormwater when data are sparse. It can also serve as a quantitative tool for benchmarking the added value of including further data layers and of applying uniform definitions of size classes of environmental MPs.
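The Shapley attribution idea behind the percentages quoted in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the three feature names, the toy value function `toy_value`, and its numbers are all hypothetical, and the exact enumeration over orderings shown here is only practical for a handful of features. It shows how marginal contributions are averaged and then turned into percentage shares.

```python
from itertools import permutations


def shapley_values(value_fn, features):
    """Exact Shapley values: average each feature's marginal contribution
    over every possible ordering of the features."""
    totals = {f: 0.0 for f in features}
    orderings = list(permutations(features))
    for order in orderings:
        coalition = set()
        for f in order:
            before = value_fn(frozenset(coalition))
            coalition.add(f)
            after = value_fn(frozenset(coalition))
            totals[f] += after - before
    return {f: t / len(orderings) for f, t in totals.items()}


def toy_value(coalition):
    """Hypothetical payoff for a set of known features (illustration only)."""
    score = 0.0
    if "rainfall" in coalition:
        score += 3.0
    if "size_limit" in coalition:
        score += 4.0
    # Interaction term: it is split equally between the two features involved.
    if "rainfall" in coalition and "size_limit" in coalition:
        score += 2.0
    if "population" in coalition:
        score += 1.0
    return score


phi = shapley_values(toy_value, ["rainfall", "size_limit", "population"])

# Convert absolute attributions into percentage shares of the total.
total = sum(abs(v) for v in phi.values())
shares = {f: 100.0 * abs(v) / total for f, v in phi.items()}
print(phi)
print(shares)
```

In practice, attributions of this kind are computed per sample by an approximation method and then aggregated by feature group; the sketch only illustrates the averaging principle.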
Affiliation(s)
- Mir Amir Mohammad Reshadi
  - Ecohydrology Research Group, Department of Earth and Environmental Sciences, University of Waterloo, 200 University Avenue West, Waterloo, ON, N2L 3G1, Canada
- Fereidoun Rezanezhad
  - Ecohydrology Research Group, Department of Earth and Environmental Sciences, University of Waterloo, 200 University Avenue West, Waterloo, ON, N2L 3G1, Canada
  - Water Institute, University of Waterloo, Waterloo, ON, N2L 3G1, Canada
- Ali Reza Shahvaran
  - Ecohydrology Research Group, Department of Earth and Environmental Sciences, University of Waterloo, 200 University Avenue West, Waterloo, ON, N2L 3G1, Canada
- Amirhossein Ghajari
  - Department of Civil, Construction, and Environmental Engineering, North Carolina State University, Raleigh, NC, 27695, USA
- Stephanie Slowinski
  - Ecohydrology Research Group, Department of Earth and Environmental Sciences, University of Waterloo, 200 University Avenue West, Waterloo, ON, N2L 3G1, Canada
- Philippe Van Cappellen
  - Ecohydrology Research Group, Department of Earth and Environmental Sciences, University of Waterloo, 200 University Avenue West, Waterloo, ON, N2L 3G1, Canada
  - Water Institute, University of Waterloo, Waterloo, ON, N2L 3G1, Canada
2
Yin H, Chen Y, Zhou J, Xie Y, Wei Q, Xu Z. A probabilistic deep learning approach to enhance the prediction of wastewater treatment plant effluent quality under shocking load events. Water Research X 2025; 26:100291. [PMID: 39720317; PMCID: PMC11667701; DOI: 10.1016/j.wroa.2024.100291] [Received: 09/13/2024] [Revised: 12/01/2024] [Accepted: 12/02/2024] [Indexed: 12/26/2024]
Abstract
Sudden shocking load events, which feature significant increases in the inflow quantities or concentrations at wastewater treatment plants (WWTPs), are a major threat to keeping treated effluents within discharge quality standards. To aid real-time decision-making for stable WWTP operations, this study developed a probabilistic deep learning model that comprises encoder-decoder long short-term memory (LSTM) networks with the added capacity of producing probability predictions, in order to enhance the robustness of real-time WWTP effluent quality prediction under such events. The developed probabilistic encoder-decoder LSTM (P-ED-LSTM) model was tested in an actual WWTP, where bihourly effluent quality prediction of total nitrogen was performed and compared with classical deep learning models, including LSTM, gated recurrent unit (GRU) and Transformer. Under shocking load events, the P-ED-LSTM achieved a 49.7% improvement in prediction accuracy for bihourly real-time predictions of effluent concentration compared to the LSTM, GRU, and Transformer. A higher quantile of the probability data from the P-ED-LSTM model output indicated a predicted value closer to the real effluent quality. The P-ED-LSTM model also exhibited higher predictive power for multiple future time steps under shocking load scenarios. It captured approximately 90% of the actual over-limit discharges up to 6 hours ahead, significantly outperforming the other deep learning models. Therefore, the P-ED-LSTM model, with its robust adaptability to significant fluctuations, has the potential for broader application across WWTPs with different processes, as well as for providing strategies for wastewater system regulation under emergency conditions.
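One common way to train and score quantile predictions is the pinball (quantile) loss. The abstract does not state which loss the P-ED-LSTM uses, so the sketch below is an assumption for illustration only; the observed values, the two prediction series, and the discharge limit of 20 are hypothetical. It shows why an upper quantile can track a sudden spike better than a median forecast.

```python
def pinball_loss(y_true, y_pred, q):
    """Mean pinball (quantile) loss for a quantile level q in (0, 1).

    Under-predictions are weighted by q and over-predictions by (1 - q),
    so a high q penalises forecasts that fall below the observation.
    """
    if not 0.0 < q < 1.0:
        raise ValueError("q must lie strictly between 0 and 1")
    total = 0.0
    for y, p in zip(y_true, y_pred):
        diff = y - p
        total += q * diff if diff >= 0 else (q - 1.0) * diff
    return total / len(y_true)


# Hypothetical effluent total nitrogen series with a spike in the last two steps.
observed = [10.0, 12.0, 25.0, 30.0]
median_pred = [10.0, 11.0, 15.0, 16.0]   # hypothetical 0.5-quantile forecast
q90_pred = [12.0, 14.0, 24.0, 28.0]      # hypothetical 0.9-quantile forecast
limit = 20.0                             # hypothetical discharge limit

# Flag a possible over-limit discharge whenever the forecast exceeds the limit.
flagged_median = [p > limit for p in median_pred]
flagged_q90 = [p > limit for p in q90_pred]

print(pinball_loss(observed, median_pred, 0.5))
print(pinball_loss(observed, q90_pred, 0.9))
print(flagged_median, flagged_q90)
```

In this toy series only the upper-quantile forecast flags the two over-limit steps, which is the kind of behaviour the abstract attributes to higher quantiles during shock events.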
Affiliation(s)
- Hailong Yin
  - Shanghai Institute of Pollution Control and Ecological Security, Shanghai, 200092, China
  - Key Laboratory of Urban Water Supply, Water Saving and Water Environment Governance in the Yangtze River Delta of Ministry of Water Resources, State Key Laboratory of Pollution Control and Resource Reuse, Tongji University, Shanghai, 200092, China
- Yongqi Chen
  - Shanghai Institute of Pollution Control and Ecological Security, Shanghai, 200092, China
  - Key Laboratory of Urban Water Supply, Water Saving and Water Environment Governance in the Yangtze River Delta of Ministry of Water Resources, State Key Laboratory of Pollution Control and Resource Reuse, Tongji University, Shanghai, 200092, China
- Jingshu Zhou
  - Shanghai Institute of Pollution Control and Ecological Security, Shanghai, 200092, China
  - Key Laboratory of Urban Water Supply, Water Saving and Water Environment Governance in the Yangtze River Delta of Ministry of Water Resources, State Key Laboratory of Pollution Control and Resource Reuse, Tongji University, Shanghai, 200092, China
- Yifan Xie
  - School of Environment, Tsinghua University, Beijing, 100084, China
- Qing Wei
  - Shanghai Institute of Pollution Control and Ecological Security, Shanghai, 200092, China
  - Key Laboratory of Urban Water Supply, Water Saving and Water Environment Governance in the Yangtze River Delta of Ministry of Water Resources, State Key Laboratory of Pollution Control and Resource Reuse, Tongji University, Shanghai, 200092, China
- Zuxin Xu
  - Shanghai Institute of Pollution Control and Ecological Security, Shanghai, 200092, China
  - Key Laboratory of Urban Water Supply, Water Saving and Water Environment Governance in the Yangtze River Delta of Ministry of Water Resources, State Key Laboratory of Pollution Control and Resource Reuse, Tongji University, Shanghai, 200092, China
3
Roohi AM, Nazif S, Ramazi P. Tackling data challenges in forecasting effluent characteristics of wastewater treatment plants. Journal of Environmental Management 2024; 354:120324. [PMID: 38364537; DOI: 10.1016/j.jenvman.2024.120324] [Received: 11/22/2023] [Revised: 01/21/2024] [Accepted: 02/08/2024] [Indexed: 02/18/2024]
Abstract
In wastewater treatment plants (WWTPs), the stochastic nature of the influent wastewater and of operational and weather conditions causes fluctuations in effluent quality. Data-driven models can forecast effluent quality a few hours ahead in response to the influent characteristics, providing enough time to adjust system operations and avoid undesired consequences. However, the existing data for training such models are often incomplete and contain missing values, while collecting additional data by installing new sensors is costly. The trade-off between using existing incomplete data and collecting costly new data results in three data challenges when developing data-driven WWTP effluent forecasters: determining the important variables to be measured, the minimum number of required data instances, and the maximum percentage of missing values that can be tolerated without impeding the development of an accurate model. As these issues have not been discussed in previous studies, this research provides, for the first time, a comprehensive analysis that answers them. Another issue that arises in all data-driven modeling is how to select an appropriate forecasting model. This paper addresses these issues by first testing nine machine learning models on data collected from three wastewater treatment plants located in Iran, Australia, and Spain. The most accurate forecaster, a Bayesian network, was then used to address the articulated challenges. Key variables in forecasting effluent characteristics were flow rate, total suspended solids, electrical conductivity, phosphorus compounds, wastewater temperature, and air temperature. A minimum of 250 samples was needed during model training to achieve a substantial reduction in the forecasting error. Moreover, a steep increase in the error was observed when the proportion of missing values exceeded 10%. The results assist plant managers in estimating the data collection effort necessary to obtain an accurate forecaster, contributing to the quality of the effluent.
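The two numeric findings in the abstract (about 250 training samples and a 10% ceiling on missing values) can be turned into a simple screening check. The sketch below is a hypothetical helper, not part of the study: the function names and the row layout are assumptions, and the default thresholds are the values reported for that study's three plants rather than general rules.

```python
import math


def is_missing(value):
    """Treat None and float NaN as missing values."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def data_readiness(rows, min_samples=250, max_missing_fraction=0.10):
    """Summarise sample count and missing-value fraction for a list of rows.

    Each row is a list of sensor readings. The default thresholds are
    taken from the abstract above and are assumptions for any other plant.
    """
    n_samples = len(rows)
    n_cells = sum(len(row) for row in rows)
    n_missing = sum(1 for row in rows for value in row if is_missing(value))
    missing_fraction = n_missing / n_cells if n_cells else 0.0
    return {
        "n_samples": n_samples,
        "missing_fraction": missing_fraction,
        "enough_samples": n_samples >= min_samples,
        "missing_ok": missing_fraction <= max_missing_fraction,
    }


# Hypothetical two-row dataset: flow rate and total suspended solids.
report = data_readiness([[120.5, None], [118.0, 35.2]])
print(report)
```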
Affiliation(s)
- Ali Mohammad Roohi
  - School of Civil Engineering, College of Engineering, University of Tehran, Tehran, Iran
- Sara Nazif
  - School of Civil Engineering, College of Engineering, University of Tehran, Tehran, Iran
- Pouria Ramazi
  - Department of Mathematics and Statistics, Brock University, St. Catharines, ON, L2S 3A1, Canada
4
Sakizadeh M, Zhang C, Milewski A. Spatial distribution pattern and health risk of groundwater contamination by cadmium, manganese, lead and nitrate in groundwater of an arid area. Environmental Geochemistry and Health 2024; 46:80. [PMID: 38367130; DOI: 10.1007/s10653-023-01845-9] [Received: 09/23/2023] [Accepted: 12/21/2023] [Indexed: 02/19/2024]
Abstract
Stacking is an ensemble approach that combines the results of base models to create a meta-model. In this study, stacking of five base learners, namely eXtreme gradient boosting, random forest, feed-forward neural networks, generalized linear models with Lasso or Elastic Net regularization, and support vector machines, was used to study the spatial variation of Mn, Cd, Pb, and nitrate in the Qom-Kahak aquifers, Iran. The stacking strategy proved to be an effective substitute for existing machine learning approaches owing to its high accuracy and stability compared with the individual learners. In contrast, no single base model performed best for all of the parameters involved. For instance, in the case of cadmium, random forest produced the best results among the base learners, with an adjusted R² and RMSE of 0.108 and 0.014, as opposed to 0.337 and 0.013 obtained by the stacking method. Redundancy analysis (RDA) showed a tight link between Mn and Cd and phosphate, which points to the effect of phosphate fertilizers used in agricultural operations. To analyze the causes of groundwater pollution, spatial methodologies can be combined with multivariate analytic techniques such as RDA to help uncover hidden sources of contamination that would otherwise go undetected. According to the probabilistic health risk assessment, lead poses a larger health risk than nitrate: 34.4% and 6.3% of the simulated values for children and adults, respectively, were higher than HQ = 1. Furthermore, cadmium exposure risk affected 84% of children and 47% of adults in the research area.
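A probabilistic health risk assessment of the kind summarised above typically simulates a hazard quotient (HQ), the ratio of an estimated daily dose to a reference dose, and reports the share of simulations with HQ > 1. The sketch below is a simplified illustration under stated assumptions: the ingestion-only dose formula, the truncated normal concentration distribution, and every numeric input (concentrations, intake rates, body weights, reference dose) are hypothetical and are not taken from the study.

```python
import random


def simulate_hq_exceedance(conc_mean, conc_sd, intake_l_per_day,
                           body_weight_kg, rfd, n=10000, seed=0):
    """Fraction of Monte Carlo draws in which the hazard quotient exceeds 1.

    dose (mg/kg/day) = concentration (mg/L) * intake (L/day) / body weight (kg)
    HQ = dose / reference dose (rfd, mg/kg/day)
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    exceed = 0
    for _ in range(n):
        # Assumed distribution: normal, truncated at zero concentration.
        conc = max(0.0, rng.gauss(conc_mean, conc_sd))
        dose = conc * intake_l_per_day / body_weight_kg
        if dose / rfd > 1.0:
            exceed += 1
    return exceed / n


# Hypothetical exposure parameters for two receptor groups.
child_fraction = simulate_hq_exceedance(0.02, 0.01, 1.0, 15.0, rfd=0.0014)
adult_fraction = simulate_hq_exceedance(0.02, 0.01, 2.0, 70.0, rfd=0.0014)
print(child_fraction, adult_fraction)
```

Because children have a higher intake per unit body weight in this setup, their exceedance fraction cannot be lower than the adult one for the same concentration draws, which mirrors the direction of the child and adult percentages quoted in the abstract.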
Affiliation(s)
- Mohamad Sakizadeh
  - Department of Environmental Sciences, Shahid Rajaee Teacher Training University, Lavizan, 1678815811, Tehran, Iran
- Chaosheng Zhang
  - International Network for Environment and Health (INEH), School of Geography, Archaeology and Irish Studies, University of Galway, Galway, Ireland
- Adam Milewski
  - Department of Geology, University of Georgia, Athens, USA
5
Xie Y, Chen Y, Wei Q, Yin H. A hybrid deep learning approach to improve real-time effluent quality prediction in wastewater treatment plant. Water Research 2024; 250:121092. [PMID: 38171177; DOI: 10.1016/j.watres.2023.121092] [Received: 09/24/2023] [Revised: 12/11/2023] [Accepted: 12/28/2023] [Indexed: 01/05/2024]
Abstract
Wastewater treatment plant (WWTP) operation is usually intricate because of large variations in influent characteristics and nonlinear sewage treatment processes. Effective modeling of WWTP effluent water quality can provide valuable decision-making support for plant operation and management. In this study, we developed a novel hybrid deep learning model by combining the temporal convolutional network (TCN) model with the long short-term memory (LSTM) network model to improve the simulation of hourly total nitrogen (TN) concentration in WWTP effluent. The developed model was tested in a WWTP in Jiangsu Province, China, where the prediction results of the hybrid TCN-LSTM model were compared with those of single deep learning models (TCN and LSTM) and a traditional machine learning model (feedforward neural network, FFNN). The hybrid TCN-LSTM model achieved 33.1% higher accuracy than the single TCN or LSTM model, and its performance improved by 63.6% compared with the traditional FFNN model. The developed hybrid model also exhibited higher predictive power for WWTP effluent TN over multiple future time steps within eight hours, compared with the standalone TCN, LSTM, and FFNN models. Finally, the Shapley additive explanation model interpretation approach was used to identify the key parameters influencing the behavior of WWTP effluent water quality. It was found that removing variables that did not contribute to the model output could further improve modeling efficiency while optimizing monitoring and management strategies.
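The core operation of a temporal convolutional network is a causal dilated convolution, in which each output depends only on the current and earlier inputs. The sketch below is a plain-Python illustration of that single operation, not the authors' TCN-LSTM architecture; the function name, the weights, and the input series are hypothetical.

```python
def causal_dilated_conv1d(x, weights, dilation=1):
    """Causal dilated 1-D convolution with zero padding on the left.

    y[t] = sum over k of weights[k] * x[t - k * dilation],
    where terms reaching before the start of the series count as zero.
    """
    y = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(weights):
            idx = t - k * dilation
            if idx >= 0:  # skip inputs that would lie before the series start
                acc += w * x[idx]
        y.append(acc)
    return y


# Hypothetical hourly readings; with dilation 2 each output adds the value
# from two steps earlier to the current value.
series = [1.0, 2.0, 3.0, 4.0]
output = causal_dilated_conv1d(series, weights=[1.0, 1.0], dilation=2)
print(output)
```

Stacking such layers with growing dilation widens the time window each output can see, and in a hybrid design the resulting features would then be passed to a recurrent layer such as an LSTM.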
Affiliation(s)
- Yifan Xie
  - School of Environment, Tsinghua University, Beijing 100084, China
- Yongqi Chen
  - Shanghai Institute of Pollution Control and Ecological Security, Shanghai 200092, China
  - Key Laboratory of Urban Water Supply, Water Saving and Water Environment Governance in the Yangtze River Delta of Ministry of Water Resources, State Key Laboratory of Pollution Control and Resource Reuse, Tongji University, Shanghai 200092, China
- Qing Wei
  - Shanghai Institute of Pollution Control and Ecological Security, Shanghai 200092, China
  - Key Laboratory of Urban Water Supply, Water Saving and Water Environment Governance in the Yangtze River Delta of Ministry of Water Resources, State Key Laboratory of Pollution Control and Resource Reuse, Tongji University, Shanghai 200092, China
- Hailong Yin
  - Shanghai Institute of Pollution Control and Ecological Security, Shanghai 200092, China
  - Key Laboratory of Urban Water Supply, Water Saving and Water Environment Governance in the Yangtze River Delta of Ministry of Water Resources, State Key Laboratory of Pollution Control and Resource Reuse, Tongji University, Shanghai 200092, China