1
Fu B, Li S, Lao Z, Wei Y, Song K, Deng T, Wang Y. A novel hierarchical approach to insight to spectral characteristics in surface water of karst wetlands and estimate its non-optically active parameters using field hyperspectral data. Water Res 2024; 257:121673. [PMID: 38688189 DOI: 10.1016/j.watres.2024.121673] [Received: 02/07/2024] [Revised: 04/22/2024] [Accepted: 04/23/2024] [Indexed: 05/02/2024]
Abstract
Wetlands cover only around 6% of the Earth's land surface, yet they are recognized as one of the three major ecosystems, alongside forests and oceans. The ecological structure and function of karst wetlands are unique because of the influence of geologic structure. At present, the poorly understood spectral morphology of surface water in karst wetlands poses a significant challenge for remote sensing estimation of non-optically active water quality parameters (NAWQPs). This study proposed a novel multi-scale spectral morphology feature extraction (MSFE) method to gain insight into the spectral characteristics of surface water in karst wetlands and to screen the features sensitive to NAWQPs. We then constructed three remote sensing inversion strategies for NAWQPs (TN, TP, NH3_N, DO): direct estimation, indirect estimation, and auxiliary estimation. Finally, we constructed a novel pH-based hierarchical analysis framework (pH_HA) to explore, using in-situ hyperspectral data, how the alkalinity-biased character of karst water influences the sensitive spectral domains of NAWQPs and their estimation accuracy. We found that the spectral characteristics of karst waters at the first reflectance peak (580 nm) differed significantly from those of other water body types. The MSFE successfully captured the sensitive spectral domains for NAWQPs, which were concentrated at 500-600 nm and 900-960 nm. The sensitive features captured by MSFE improved the estimation accuracy of NAWQPs (R² > 0.9). Direct estimation performed more stably than auxiliary estimation (average RMSE of 0.366 mg/L), while the auxiliary estimation model further improved the retrieval accuracy of TN compared with the direct estimation model (R² increasing from 0.43 to 0.56). The hierarchical framework clearly revealed notable changes in the sensitive spectral domains of NAWQPs under different pH values, enabled more precise determination of the spectral subdomains of NAWQPs, and identified the optimal spectral features. The pH_HA framework effectively improved the estimation accuracy of NAWQPs (R² increased from 0.514 to over 0.9), and the R² values of all four NAWQPs exceeded 0.9 when the pH value was over 8.5. Our work provides an effective approach for monitoring water quality in karst wetlands.
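The core idea of a pH-based hierarchy, fitting a separate model inside each pH stratum instead of one pooled model, can be illustrated with a small sketch. This is not the authors' pH_HA implementation: the pH bin edges, the single-feature linear model, and the synthetic samples are all assumptions made for illustration, and the R² values are computed in-sample.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x (a stand-in for the paper's models)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def r_squared(ys, preds):
    """Coefficient of determination of predictions against observations."""
    my = mean(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

def ph_stratum(ph, edges=(7.5, 8.5)):
    """Index of the pH stratum a sample falls in; the edges are hypothetical."""
    return sum(ph >= e for e in edges)

def pooled_r2(samples):
    """Fit one line to all samples, ignoring pH."""
    xs = [s[1] for s in samples]
    ys = [s[2] for s in samples]
    a, b = fit_line(xs, ys)
    return r_squared(ys, [a + b * x for x in xs])

def stratified_r2(samples):
    """Fit one line per pH stratum and score all predictions together."""
    groups = {}
    for ph, x, y in samples:
        groups.setdefault(ph_stratum(ph), []).append((x, y))
    ys, preds = [], []
    for points in groups.values():
        gx = [p[0] for p in points]
        gy = [p[1] for p in points]
        a, b = fit_line(gx, gy)
        ys += gy
        preds += [a + b * x for x in gx]
    return r_squared(ys, preds)

# Synthetic samples: (pH, spectral feature, concentration in mg/L).
# The feature-concentration relation is made to differ between pH strata.
samples = [
    (7.1, 0.10, 1.0), (7.2, 0.20, 1.5), (7.3, 0.30, 2.0), (7.4, 0.40, 2.5),
    (8.0, 0.10, 3.0), (8.1, 0.20, 2.6), (8.2, 0.30, 2.2), (8.3, 0.40, 1.8),
    (8.6, 0.10, 0.5), (8.7, 0.20, 0.9), (8.8, 0.30, 1.3), (8.9, 0.40, 1.7),
]

print("pooled R2:    ", round(pooled_r2(samples), 3))
print("stratified R2:", round(stratified_r2(samples), 3))
```

When the relation between a spectral feature and a parameter changes with pH, the pooled fit averages conflicting trends, while the per-stratum fits recover each one.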
Affiliation(s)
- Bolin Fu
  - College of Geomatics and Geoinformation, Guilin University of Technology, Guilin 541006, China
- Sunzhe Li
  - College of Geomatics and Geoinformation, Guilin University of Technology, Guilin 541006, China
- Zhinan Lao
  - College of Geomatics and Geoinformation, Guilin University of Technology, Guilin 541006, China
- Yingying Wei
  - College of Geomatics and Geoinformation, Guilin University of Technology, Guilin 541006, China
- Kaishan Song
  - Northeast Institute of Geography and Agroecology, CAS, Changchun 130102, China
- Tengfang Deng
  - College of Geomatics and Geoinformation, Guilin University of Technology, Guilin 541006, China
- Yeqiao Wang
  - Department of Natural Resources Science, University of Rhode Island, Kingston, RI 02881, USA

2
Thanh NN, Chotpantarat S, Ngu NH, Thunyawatcharakul P, Kaewdum N. Integrating machine learning models with cross-validation and bootstrapping for evaluating groundwater quality in Kanchanaburi province, Thailand. Environ Res 2024; 252:118952. [PMID: 38636644 DOI: 10.1016/j.envres.2024.118952] [Received: 05/05/2023] [Revised: 03/10/2024] [Accepted: 04/14/2024] [Indexed: 04/20/2024]
Abstract
Exploring the potential of new models for mapping groundwater quality presents a major challenge in water resource management, particularly in Kanchanaburi Province, Thailand, where groundwater faces contamination risks. This study aimed to explore the applicability of random forest (RF) and artificial neural network (ANN) models for predicting groundwater quality. In particular, these two models were combined with cross-validation (CV) and bootstrapping (B) techniques to build four predictive models: RF-CV, RF-B, ANN-CV, and ANN-B. The entropy groundwater quality index (EWQI) was converted to a normalized EWQI, which was then classified into five levels from very poor to very good. A total of twelve physicochemical parameters from 180 groundwater wells (potassium, sodium, calcium, magnesium, chloride, sulfate, bicarbonate, nitrate, pH, electrical conductivity, total dissolved solids, and total hardness) were investigated to characterize groundwater quality in the eastern part of Kanchanaburi Province, Thailand. Our results indicated that groundwater quality in the study area was primarily affected by calcium, magnesium, and bicarbonate, and that the RF-CV model (RMSE = 0.06, R² = 0.87, MAE = 0.04) outperformed the RF-B (RMSE = 0.07, R² = 0.80, MAE = 0.04), ANN-CV (RMSE = 0.09, R² = 0.70, MAE = 0.06), and ANN-B (RMSE = 0.10, R² = 0.67, MAE = 0.06) models. Our findings highlight the superiority of the RF models over the ANN models under both the CV and B techniques. In addition, the contribution of each groundwater parameter to the normalized EWQI in the various machine learning models was determined. The groundwater quality map created by the RF-CV model can be used to guide groundwater use.
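The two resampling schemes compared in this abstract can be sketched in a few lines. The example below is only an illustration under stated assumptions: a one-variable least-squares line stands in for the RF and ANN models, the data are synthetic, and the fold count, number of bootstrap rounds, and seeds are arbitrary choices rather than the study's settings.

```python
import math
import random
from statistics import mean

def fit_line(xs, ys):
    """Least-squares line y = a + b*x, used here as a stand-in model."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def metrics(ys, preds):
    """Return (RMSE, MAE, R2) for paired observations and predictions."""
    n = len(ys)
    my = mean(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - my) ** 2 for y in ys)
    mae = sum(abs(y - p) for y, p in zip(ys, preds)) / n
    return math.sqrt(ss_res / n), mae, 1.0 - ss_res / ss_tot

def evaluate(xs, ys, train, test):
    """Fit on the training indices and predict the test indices."""
    a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
    return [ys[i] for i in test], [a + b * xs[i] for i in test]

def cross_validate(xs, ys, k=5, seed=0):
    """k-fold CV: every sample is held out exactly once."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    obs, preds = [], []
    for fold in (idx[i::k] for i in range(k)):
        held = set(fold)
        train = [i for i in idx if i not in held]
        o, p = evaluate(xs, ys, train, fold)
        obs += o
        preds += p
    return metrics(obs, preds)

def bootstrap_validate(xs, ys, rounds=50, seed=0):
    """Bootstrap: train on a resample drawn with replacement, test out-of-bag."""
    rng = random.Random(seed)
    n = len(xs)
    obs, preds = [], []
    for _ in range(rounds):
        train = [rng.randrange(n) for _ in range(n)]
        chosen = set(train)
        oob = [i for i in range(n) if i not in chosen]
        if not oob:
            continue  # no held-out samples in this round
        o, p = evaluate(xs, ys, train, oob)
        obs += o
        preds += p
    return metrics(obs, preds)

# Synthetic data: a linear signal plus Gaussian noise.
noise = random.Random(42)
xs = [i / 10 for i in range(40)]
ys = [2.0 + 0.5 * x + noise.gauss(0.0, 0.1) for x in xs]

cv_scores = cross_validate(xs, ys)
boot_scores = bootstrap_validate(xs, ys)
print("CV        RMSE, MAE, R2:", [round(v, 3) for v in cv_scores])
print("Bootstrap RMSE, MAE, R2:", [round(v, 3) for v in boot_scores])
```

Cross-validation tests each sample exactly once, whereas the bootstrap trains on resamples that repeat some samples and leave roughly a third out of bag, so the two schemes can give slightly different error estimates for the same model.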
Affiliation(s)
- Nguyen Ngoc Thanh
  - University of Agriculture and Forestry, Hue University, 102 Phung Hung Str, Hue City, Thua Thien Hue, 53000, Viet Nam
- Srilert Chotpantarat
  - Department of Geology, Faculty of Science, Chulalongkorn University, Bangkok, 10330, Thailand; Center of Excellence in Environmental Innovation and Management of Metals (EnvIMM), Environmental Research Institute, Chulalongkorn University, Phayathai Road, Pathumwan, Bangkok 10330, Thailand
- Nguyen Huu Ngu
  - University of Agriculture and Forestry, Hue University, 102 Phung Hung Str, Hue City, Thua Thien Hue, 53000, Viet Nam
- Pongsathorn Thunyawatcharakul
  - International Postgraduate Program in Hazardous Substance and Environmental Management, Graduate School, Chulalongkorn University, Bangkok, 10330, Thailand
- Narongsak Kaewdum
  - Geoscience Program, Mahidol University Kanchanaburi Campus, Kanchanaburi, 71150, Thailand

3
Hong W, Mei H, Shi X, Lin X, Wang S, Ni R, Wang Y, Song L. Viral community distribution, assembly mechanism, and associated hosts in an industrial park wastewater treatment plant. Environ Res 2024; 247:118156. [PMID: 38199475 DOI: 10.1016/j.envres.2024.118156] [Received: 09/21/2023] [Revised: 12/02/2023] [Accepted: 01/06/2024] [Indexed: 01/12/2024]
Abstract
Viruses manipulate bacterial community composition and impact wastewater treatment efficiency. Some viruses pose threats to the environment and human populations through infection. Improving the efficiency of wastewater treatment and ensuring the health of the effluent and receptor pools requires an understanding of how viral communities assemble and interact with hosts in wastewater treatment plants (WWTPs). We used metagenomic analysis to study the distribution, assembly mechanism, and sensitive hosts for the viral communities in raw water, anaerobic tanks, and returned activated sludge units of a large-scale industrial park WWTP. Uroviricota (53.42% ± 0.14%) and Nucleocytoviricota (26.1% ± 0.19%) were dominant in all units. Viral community composition significantly differed between units, as measured by β diversity (P = 0.005). Compared to raw water, the relative viral abundance decreased by 29.8% in the anaerobic tank but increased by 9.9% in the activated sludge. Viral community assembly in raw water and anaerobic tanks was predominantly driven by deterministic processes (MST <0.5) versus stochastic processes (MST >0.5) in the activated sludge, indicating that differences in diffusion limits may fundamentally alter the assembly mechanisms of viral communities between the solid and liquid-phase environments. Acidobacteria was identified as the sensitive host contributing to viral abundance, exhibiting strong interactions and a mutual dependence (degree = 59). These results demonstrate the occurrence and prevalence of viruses in WWTPs, their different assembly mechanism, and sensitive hosts. These observations require further study of the mechanisms of viral community succession, ecological function, and roles in the successive wastewater treatment units.
Affiliation(s)
- Wenqing Hong
  - School of Resources and Environmental Engineering, Anhui University, Hefei, 230601, China; Anhui Shengjin Lake Wetland Ecology National Long-term Scientific Research Base, Dongzhi, 247230, China
- Hong Mei
  - East China Engineering Science and Technology Co., Ltd, Hefei, 230024, China
- Xianyang Shi
  - School of Resources and Environmental Engineering, Anhui University, Hefei, 230601, China; Anhui Shengjin Lake Wetland Ecology National Long-term Scientific Research Base, Dongzhi, 247230, China
- Xiaoxing Lin
  - School of Resources and Environmental Engineering, Anhui University, Hefei, 230601, China; Anhui Shengjin Lake Wetland Ecology National Long-term Scientific Research Base, Dongzhi, 247230, China
- Shuijing Wang
  - School of Resources and Environmental Engineering, Anhui University, Hefei, 230601, China; Anhui Shengjin Lake Wetland Ecology National Long-term Scientific Research Base, Dongzhi, 247230, China
- Renjie Ni
  - School of Resources and Environmental Engineering, Anhui University, Hefei, 230601, China; Anhui Shengjin Lake Wetland Ecology National Long-term Scientific Research Base, Dongzhi, 247230, China
- Yan Wang
  - East China Engineering Science and Technology Co., Ltd, Hefei, 230024, China
- Liyan Song
  - School of Resources and Environmental Engineering, Anhui University, Hefei, 230601, China; Anhui Shengjin Lake Wetland Ecology National Long-term Scientific Research Base, Dongzhi, 247230, China

4
Wu J, Chen X, Li R, Wang A, Huang S, Li Q, Qi H, Liu M, Cheng H, Wang Z. A novel framework for high resolution air quality index prediction with interpretable artificial intelligence and uncertainties estimation. J Environ Manage 2024; 357:120785. [PMID: 38583378 DOI: 10.1016/j.jenvman.2024.120785] [Received: 09/08/2023] [Revised: 02/02/2024] [Accepted: 03/27/2024] [Indexed: 04/09/2024]
Abstract
Accurate air quality index (AQI) prediction is essential in environmental monitoring and management. Given that previous studies neglected the importance of uncertainty estimation and the necessity of constraining the output during prediction, we proposed a new hybrid model, namely TMSSICX, to forecast the AQI of multiple cities. First, time-varying filter-based empirical mode decomposition (TVFEMD) was adopted to decompose the AQI sequence into multiple intrinsic mode function (IMF) components. Second, multi-scale fuzzy entropy (MFE) was applied to evaluate the complexity of each IMF component, and the components were clustered into high- and low-frequency portions. In addition, the high-frequency portion was decomposed a second time by successive variational mode decomposition (SVMD) to reduce volatility. Then, six air pollutant concentrations, namely CO, SO2, PM2.5, PM10, O3, and NO2, were used as inputs. The secondary decomposition and preliminary portion were employed as the outputs for the bidirectional long short-term memory network optimized by the snake optimization algorithm (SOABiLSTM) and improved Catboost (ICatboost), respectively. Furthermore, extreme gradient boosting (XGBoost) was applied to ensemble the sub-model predictions and obtain the final result. Finally, we introduced adaptive kernel density estimation (AKDE) for interval estimation. The empirical results indicated that the TMSSICX model outperformed the 23 other models across all datasets. Moreover, using XGBoost to ensemble the sub-model predictions led to 8.73%, 8.94%, and 0.19% reductions in RMSE compared with SVM. Additionally, SHapley Additive exPlanations (SHAP) were used to assess the impact of the six pollutant concentrations on AQI, and the results revealed that PM2.5 and PM10 had the most notable positive effects on the long-term trend of AQI. We hope this model can provide guidance for air quality management.
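Interval estimation from a kernel density estimate can be sketched as follows. Note the swap: the abstract uses adaptive kernel density estimation, whereas this minimal example uses a fixed-bandwidth Gaussian kernel with the normal-reference rule of thumb. The residuals, the point forecast, and the coverage level are made-up values for illustration.

```python
import math
from statistics import stdev

def kde_cdf(samples, bandwidth):
    """CDF of a Gaussian kernel density estimate with one fixed bandwidth."""
    scale = bandwidth * math.sqrt(2.0)
    def cdf(x):
        total = sum(0.5 * (1.0 + math.erf((x - s) / scale)) for s in samples)
        return total / len(samples)
    return cdf

def quantile(cdf, q, lo, hi, tol=1e-6):
    """Invert a monotone CDF by bisection on [lo, hi]."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if cdf(mid) < q:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def prediction_interval(point_forecast, residuals, coverage=0.90):
    """Shift the KDE quantiles of past residuals onto a point forecast."""
    n = len(residuals)
    # Normal-reference rule of thumb; an adaptive KDE would instead vary
    # the bandwidth from sample to sample.
    h = 1.06 * stdev(residuals) * n ** (-1.0 / 5.0)
    cdf = kde_cdf(residuals, h)
    lo, hi = min(residuals) - 10.0 * h, max(residuals) + 10.0 * h
    alpha = (1.0 - coverage) / 2.0
    lower = point_forecast + quantile(cdf, alpha, lo, hi)
    upper = point_forecast + quantile(cdf, 1.0 - alpha, lo, hi)
    return lower, upper

# Hypothetical validation residuals (observed AQI minus predicted AQI).
residuals = [-3.0, -2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 2.0, 3.0]
lower, upper = prediction_interval(80.0, residuals)
print(f"90% interval around a forecast of 80: [{lower:.2f}, {upper:.2f}]")
```

The interval width comes entirely from the spread of past residuals, so it is only as trustworthy as those residuals are representative of future errors.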
Affiliation(s)
- Junhao Wu
  - State Key Laboratory of Estuarine and Coastal Research, East China Normal University, Shanghai, 200062, China
- Xi Chen
  - School of Geographic Sciences, East China Normal University, Shanghai, 200241, China; Key Laboratory of Geographic Information Science, Ministry of Education, East China Normal University, Shanghai, 200241, China; Key Laboratory of Spatial-Temporal Big Data Analysis and Application of Natural Resources in Megacities, Ministry of Natural Resources, Shanghai, 200241, China
- Rui Li
  - School of Geographic Sciences, East China Normal University, Shanghai, 200241, China
- Anqi Wang
  - Department of Mathematics, The University of Manchester, Manchester, M13 9PL, UK
- Shutong Huang
  - School of Geographic Sciences, East China Normal University, Shanghai, 200241, China
- Qingli Li
  - Shanghai Key Laboratory of Multidimensional Information Processing, East China Normal University, Shanghai, 200241, China
- Honggang Qi
  - School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing, 100049, China
- Min Liu
  - School of Geographic Sciences, East China Normal University, Shanghai, 200241, China; Key Laboratory of Geographic Information Science, Ministry of Education, East China Normal University, Shanghai, 200241, China
- Heqin Cheng
  - State Key Laboratory of Estuarine and Coastal Research, East China Normal University, Shanghai, 200062, China
- Zhaocai Wang
  - College of Information, Shanghai Ocean University, Shanghai, 201306, China

5
Zhou M, Li Y. Spatial distribution and source identification of potentially toxic elements in Yellow River Delta soils, China: An interpretable machine-learning approach. Sci Total Environ 2024; 912:169092. [PMID: 38056655 DOI: 10.1016/j.scitotenv.2023.169092] [Received: 09/29/2023] [Revised: 11/15/2023] [Accepted: 12/02/2023] [Indexed: 12/08/2023]
Abstract
Identifying the driving factors and quantifying the sources of potentially toxic elements (PTEs) are essential for protecting the ecological environment of the Yellow River Delta. In this study, data from 201 surface soil samples and 16 environmental variables were collected, and the random forest (RF) and Shapley additive explanations (SHAP) methods were then combined to explore the key factors affecting soil PTEs. An innovative t-distributed random neighbor embedding-RF-SHAP model was then constructed, based on the absolute principal component score and multivariate linear regression model, to quantitatively determine PTE sources. Although average PTE concentrations did not exceed the risk control values, PTE distributions exhibited significant differences. It was found that sodium, soil organic matter, and phosphorus contents were the three most important factors affecting PTEs, and human activities and natural environmental factors both influence PTE contents by altering the soil properties. The proposed model successfully determined PTE sources in the soil, outperforming the original linear regression model with a significantly lower RMSE. Source analysis revealed that the parent material was the main contributor to soil PTEs, accounting for more than half of the total PTE content. Industrial and agricultural activities also contributed to an increase in soil PTEs, with average contributions of 19.91 % and 17.44 %, respectively. Unknown sources accounted for 10.83 % of the total PTE content. Thus, the proposed model provides innovative perspectives on source parsing. These findings provide valuable scientific insights for policymakers seeking to develop effective environmental protection measures and improve the quality of saline-alkali land in the Yellow River Delta.
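The attribution idea behind Shapley additive explanations can be shown with an exact computation on a tiny model. This sketch is an illustration only: the toy response function, its coefficients, and the baseline are hypothetical, features absent from a coalition are replaced by a single baseline value (one of several possible conventions), and practical SHAP tooling for random forests relies on faster specialised algorithms rather than enumerating orderings.

```python
from itertools import permutations

def shapley_values(model, x, baseline):
    """Exact Shapley values: average each feature's marginal contribution
    over every ordering in which features switch from baseline to x."""
    names = list(x)
    totals = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for order in orderings:
        current = dict(baseline)
        previous = model(current)
        for name in order:
            current[name] = x[name]
            value = model(current)
            totals[name] += value - previous
            previous = value
    return {name: total / len(orderings) for name, total in totals.items()}

def toy_model(f):
    """Hypothetical response with one interaction term (not the paper's RF)."""
    return (2.0 * f["sodium"]
            + 1.0 * f["organic_matter"]
            + 0.5 * f["sodium"] * f["phosphorus"])

x = {"sodium": 3.0, "organic_matter": 2.0, "phosphorus": 4.0}
baseline = {"sodium": 1.0, "organic_matter": 1.0, "phosphorus": 1.0}

phi = shapley_values(toy_model, x, baseline)
for name, value in phi.items():
    print(f"{name}: {value:+.2f}")
print("sum of contributions:", sum(phi.values()))
print("f(x) - f(baseline):  ", toy_model(x) - toy_model(baseline))
```

The contributions always sum to the difference between the prediction and the baseline prediction, and the interaction between sodium and phosphorus is split between the two features.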
Affiliation(s)
- Mengge Zhou
  - Key Laboratory of Land Surface Pattern and Simulation, Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Yonghua Li
  - Key Laboratory of Land Surface Pattern and Simulation, Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing 100101, China

6
Xu R, Hu S, Wan H, Xie Y, Cai Y, Wen J. A unified deep learning framework for water quality prediction based on time-frequency feature extraction and data feature enhancement. J Environ Manage 2024; 351:119894. [PMID: 38154219 DOI: 10.1016/j.jenvman.2023.119894] [Received: 09/06/2023] [Revised: 11/02/2023] [Accepted: 12/19/2023] [Indexed: 12/30/2023]
Abstract
Deep learning methods exhibit significant advantages in mapping highly nonlinear relationships at acceptable computational speed and have been widely used to predict water quality. However, differing model selection and construction methods have resulted in differences in prediction accuracy and performance. Hence, a unified deep learning framework for water quality prediction was established in this paper, comprising a data processing module, a feature enhancement module, and a data prediction module. In the established model, the data processing module, based on the wavelet transform, was applied to decompose complex nonlinear meteorology, hydrology, and water quality data into multiple frequency-domain signals in order to extract the intrinsic cyclic and fluctuation characteristics of the data. The feature enhancement module, based on the Informer encoder, was used to enhance the feature encoding of time series data in different frequency domains and to discover global time-dependent features of the variables. Finally, the data prediction module, based on the stacked bidirectional long short-term memory network (SBiLSTM), was employed to strengthen the local correlation of feature sequences and predict the water quality. The established framework was applied to the Lijiang River in Guilin, China. The maximum relative errors between the predicted and observed values for dissolved oxygen (DO) and chemical oxygen demand (CODMn) were 12.4% and 20.7%, respectively, suggesting satisfactory prediction performance. The validation results showed that the established model was superior to all other models in terms of prediction accuracy, with RMSE values of 0.329 and 0.121, MAE values of 0.217 and 0.057, and SMAPE values of 0.022 and 0.063 for DO and CODMn, respectively. Ablation tests confirmed the necessity and rationality of each module in the established framework. The established method provides a unified deep learning framework for water quality prediction.
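The decomposition step in the data processing module can be illustrated with the simplest discrete wavelet transform. The abstract does not say which wavelet or how many levels were used, so the Haar wavelet, the two-level depth, and the dissolved-oxygen values below are assumptions chosen to keep the sketch short.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_dwt(signal):
    """One Haar level: pairwise scaled sums (low frequency, the approximation)
    and pairwise scaled differences (high frequency, the detail)."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    pairs = [(signal[i], signal[i + 1]) for i in range(0, len(signal), 2)]
    approx = [(a + b) / SQRT2 for a, b in pairs]
    detail = [(a - b) / SQRT2 for a, b in pairs]
    return approx, detail

def haar_idwt(approx, detail):
    """Invert one Haar level, recovering the original samples."""
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / SQRT2, (a - d) / SQRT2])
    return out

def decompose(signal, levels):
    """Repeatedly split the approximation, collecting one detail band per level."""
    approx, details = list(signal), []
    for _ in range(levels):
        approx, detail = haar_dwt(approx)
        details.append(detail)
    return approx, details

# Hypothetical dissolved-oxygen series (mg/L), eight consecutive readings.
series = [7.9, 8.1, 7.6, 7.4, 8.3, 8.6, 7.8, 7.5]

approx, details = decompose(series, levels=2)
print("trend band:", [round(v, 3) for v in approx])
print("detail bands:", [[round(v, 3) for v in d] for d in details])

# A single level is exactly invertible, so no information is lost.
a1, d1 = haar_dwt(series)
reconstructed = haar_idwt(a1, d1)
```

Each band can then be passed to a downstream model as its own input sequence, which is the role the frequency-domain signals play in the framework described above.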
Affiliation(s)
- Rui Xu
  - School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin, 541004, China
- Shengri Hu
  - School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin, 541004, China
- Hang Wan
  - Research Centre of Ecology & Environment for Coastal Area and Deep Sea, Southern Marine Science and Engineering Guangdong Laboratory (Guangzhou), Guangzhou, 511458, China; Guangdong Provincial Key Laboratory of Water Quality Improvement and Ecological Restoration for Watersheds, School of Ecology, Environment and Resources, Guangdong University of Technology, Guangzhou, 510006, China
- Yulei Xie
  - Research Centre of Ecology & Environment for Coastal Area and Deep Sea, Southern Marine Science and Engineering Guangdong Laboratory (Guangzhou), Guangzhou, 511458, China; Guangdong Provincial Key Laboratory of Water Quality Improvement and Ecological Restoration for Watersheds, School of Ecology, Environment and Resources, Guangdong University of Technology, Guangzhou, 510006, China
- Yanpeng Cai
  - Research Centre of Ecology & Environment for Coastal Area and Deep Sea, Southern Marine Science and Engineering Guangdong Laboratory (Guangzhou), Guangzhou, 511458, China; Guangdong Provincial Key Laboratory of Water Quality Improvement and Ecological Restoration for Watersheds, School of Ecology, Environment and Resources, Guangdong University of Technology, Guangzhou, 510006, China
- Jianhui Wen
  - Ecological and Environmental Monitoring Center of Guangxi, Guilin, 541002, China

7
Xue J, Yuan C, Ji X, Zhang M. Predictive modeling of nitrogen and phosphorus concentrations in rivers using a machine learning framework: A case study in an urban-rural transitional area in Wenzhou China. Sci Total Environ 2024; 910:168521. [PMID: 37981147 DOI: 10.1016/j.scitotenv.2023.168521] [Received: 08/26/2023] [Revised: 11/04/2023] [Accepted: 11/10/2023] [Indexed: 11/21/2023]
Abstract
Rapid urbanization in China since 1980 has generated environmental pressure from non-point source pollution (NPSP) and raised wide public concern. Excessive quantities of nitrogen (N) and phosphorus (P) are a significant source of aquatic pollution, despite their roles as essential nutrients for aquatic life. In this study, we present a new framework that uses random forest (RF), a powerful machine learning algorithm, driven by geo-datasets to estimate and map the concentrations of total nitrogen (TN) and total phosphorus (TP) at high spatial resolution for the Wen-Rui Tang River (WRTR) watershed, a typical urban-rural transitional area in the eastern coastal region of China. A comprehensive GIS database of 26 in-house built environmental variables was adopted to build the predictive models of TN and TP in open waters over the watershed. The performance of the RF regression models was evaluated against in-situ measurements, and the results indicated that the models can accurately predict the spatiotemporal distribution of N and P concentrations in rivers. Characterizing the explanatory variable importance measures in the calibrated RF regression models identified the most significant variables affecting N and P contamination in open waters across the urban-rural transitional area: aquaculture, direct domestic sewage, industrial wastewater discharges, and changing meteorological variables. In addition, the maps of TN and TP concentrations along the continuous river at high spatiotemporal resolution (daily, 1 km × 1 km) were informative. The results of this study provide valuable data to stakeholders for managing water quality and controlling pollution in similar regions with rapid urbanization and a lack of water quality monitoring datasets.
Affiliation(s)
- Jingyuan Xue
  - Institute for Disaster Management and Reconstruction, Sichuan University, Chengdu 610041, China; College of Water Resource and Civil Engineering, China Agricultural University, Beijing 100083, China
- Can Yuan
  - Key Laboratory of Watershed Science and Health of Zhejiang Province, School of Public Health and Management, Wenzhou Medical University, Wenzhou 325035, China
- Xiaoliang Ji
  - Key Laboratory of Watershed Science and Health of Zhejiang Province, School of Public Health and Management, Wenzhou Medical University, Wenzhou 325035, China
- Minghua Zhang
  - Key Laboratory of Watershed Science and Health of Zhejiang Province, School of Public Health and Management, Wenzhou Medical University, Wenzhou 325035, China; Department of Land Air & Water Resources, University of California Davis, Davis, CA 95616, USA

8
Kikuchi T, Anzai T, Ouchi T. Assessing spatiotemporal variability in the concentration and composition of dissolved organic matter and its impact on iron solubility in tropical freshwater systems through a machine learning approach. Sci Total Environ 2023; 904:166892. [PMID: 37683858 DOI: 10.1016/j.scitotenv.2023.166892] [Received: 06/25/2023] [Revised: 09/04/2023] [Accepted: 09/05/2023] [Indexed: 09/10/2023]
Abstract
Dissolved organic matter (DOM) plays important roles not only in maintaining the productivity and functioning of aquatic ecosystems but also in the global carbon cycle, although the sources and biogeochemical functions of terrestrially derived DOM have not been fully elucidated, particularly in the tropics and subtropics. This study aimed to evaluate the factors influencing spatiotemporal variability in (i) the concentration and composition of DOM, including dissolved organic carbon (DOC), ultraviolet absorption coefficient at 254-nm wavelength (a254), and components identified by fluorescence excitation-emission matrix coupled with parallel factor analysis (EEM-PARAFAC), and (ii) the concentration of dissolved iron (DFe) across freshwater systems (rivers, forested streams, and dam reservoirs) on a tropical island (Ishigaki Island, Japan) based on the results of water quality monitoring at 2-month intervals over a 2-year period. Random forests (RF) machine learning algorithm was employed, with the catchment characteristics (land use, soil type) and water temperature as the predictor variables for DOM and the composition of DOM (EEM-PARAFAC components) and hydrochemistry (water temperature, pH, and concentrations of divalent cations) as the predictor variables for DFe. The RF models for DOC, a254, and three humic-like components exhibited excellent predictive performance, indicating that these DOM properties are not only seasonally variable but also strongly influenced by the compositions of land uses and soil types in the upstream watershed. Poorly drained riparian lowland soil (Gleyic Fluvisols) was identified as the most important catchment parameter that positively influences these DOM variables. The RF model also explained a large portion of the variation in DFe, while terrestrial humic-like components were the most important parameters, emphasizing their significance as organic ligands for iron. 
These results improve our understanding of the impacts of terrestrial DOM and iron loadings on tropical and subtropical coastal ecosystems as well as on regional and global carbon budgets.
Affiliation(s)
- Tetsuro Kikuchi
  - Crop, Livestock and Environment Division, Japan International Research Center for Agricultural Sciences (JIRCAS), 1-1 Ohwashi, Tsukuba, Ibaraki 305-8686, Japan
- Toshihiko Anzai
  - Tropical Agriculture Research Front, JIRCAS, 1091-1 Maezato-Kawarabaru, Ishigaki, Okinawa 907-0002, Japan
- Takao Ouchi
  - Ibaraki Kasumigaura Environmental Science Center, 1853 Okijuku-machi, Tsuchiura, Ibaraki 300-0023, Japan

9
Tselemponis A, Stefanis C, Giorgi E, Kalmpourtzi A, Olmpasalis I, Tselemponis A, Adam M, Kontogiorgis C, Dokas IM, Bezirtzoglou E, Constantinidis TC. Coastal Water Quality Modelling Using E. coli, Meteorological Parameters and Machine Learning Algorithms. Int J Environ Res Public Health 2023; 20:6216. [PMID: 37444064 PMCID: PMC10341787 DOI: 10.3390/ijerph20136216] [Received: 05/12/2023] [Revised: 06/19/2023] [Accepted: 06/21/2023] [Indexed: 07/15/2023]
Abstract
In this study, machine learning models were implemented to predict the classification of coastal waters in the region of Eastern Macedonia and Thrace (EMT) concerning Escherichia coli (E. coli) concentration and weather variables in the framework of the Directive 2006/7/EC. Six sampling stations of EMT, located on beaches of the regional units of Kavala, Xanthi, Rhodopi, Evros, Thasos and Samothraki, were selected. All 1039 samples were collected from May to September within a 14-year follow-up period (2009-2021). The weather parameters were acquired from nearby meteorological stations. The samples were analysed according to the ISO 9308-1 for the detection and the enumeration of E. coli. The vast majority of the samples fall into category 1 (Excellent), which is a mark of the high quality of the coastal waters of EMT. The experimental results disclose, additionally, that two-class classifiers, namely Decision Forest, Decision Jungle and Boosted Decision Tree, achieved high Accuracy scores over 99%. In addition, comparing our performance metrics with those of other researchers, diversity is observed in using algorithms for water quality prediction, with algorithms such as Decision Tree, Artificial Neural Networks and Bayesian Belief Networks demonstrating satisfactory results. Machine learning approaches can provide critical information about the dynamic of E. coli contamination and, concurrently, consider the meteorological parameters for coastal waters classification.
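For context, the classification that such models are trained to reproduce can be sketched directly from the bathing-water rules. The thresholds and percentile formulas below follow our reading of Annexes I and II of Directive 2006/7/EC for E. coli in coastal waters and should be checked against the legal text before any real use; the use of the sample standard deviation, the detection limit of 1 cfu/100 mL, and the example counts are assumptions.

```python
import math
from statistics import mean, stdev

def upper_percentile(counts, z, detection_limit=1.0):
    """Log-normal percentile estimate: 10 ** (mean + z * sd) of log10 counts.
    Counts below the (hypothetical) detection limit are raised to it."""
    logs = [math.log10(max(c, detection_limit)) for c in counts]
    return 10 ** (mean(logs) + z * stdev(logs))

def classify_coastal_ecoli(counts):
    """Quality class from E. coli counts (cfu/100 mL) for coastal waters.
    Limits as we read them from the directive: 250 and 500 on the
    95th percentile, 500 on the 90th percentile."""
    p95 = upper_percentile(counts, 1.65)
    p90 = upper_percentile(counts, 1.282)
    if p95 <= 250:
        return "Excellent"
    if p95 <= 500:
        return "Good"
    if p90 <= 500:
        return "Sufficient"
    return "Poor"

# Hypothetical sample sets for two beaches (cfu/100 mL).
clean_beach = [10, 20, 15, 30, 12, 25, 18, 22]
polluted_beach = [400, 900, 1200, 700, 1500, 800]

print("clean beach:   ", classify_coastal_ecoli(clean_beach))
print("polluted beach:", classify_coastal_ecoli(polluted_beach))
```

Because the class depends on a percentile of the whole sample set rather than on any single reading, a few high counts on an otherwise clean beach can still change its class.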
Affiliation(s)
- Athanasios Tselemponis
  - Laboratory of Hygiene and Environmental Protection, Medical School, Democritus University of Thrace, 68100 Alexandroupoli, Greece
- Christos Stefanis
  - Laboratory of Hygiene and Environmental Protection, Medical School, Democritus University of Thrace, 68100 Alexandroupoli, Greece
- Elpida Giorgi
  - Laboratory of Hygiene and Environmental Protection, Medical School, Democritus University of Thrace, 68100 Alexandroupoli, Greece
- Aikaterini Kalmpourtzi
  - Laboratory of Hygiene and Environmental Protection, Medical School, Democritus University of Thrace, 68100 Alexandroupoli, Greece
- Ioannis Olmpasalis
  - Laboratory of Hygiene and Environmental Protection, Medical School, Democritus University of Thrace, 68100 Alexandroupoli, Greece
- Antonios Tselemponis
  - Laboratory of Hygiene and Environmental Protection, Medical School, Democritus University of Thrace, 68100 Alexandroupoli, Greece
- Maria Adam
  - Laboratory of Hygiene and Environmental Protection, Medical School, Democritus University of Thrace, 68100 Alexandroupoli, Greece
- Christos Kontogiorgis
  - Laboratory of Hygiene and Environmental Protection, Medical School, Democritus University of Thrace, 68100 Alexandroupoli, Greece
- Ioannis M. Dokas
  - Department of Civil Engineering, Democritus University of Thrace, 69100 Komotini, Greece
- Eugenia Bezirtzoglou
  - Laboratory of Hygiene and Environmental Protection, Medical School, Democritus University of Thrace, 68100 Alexandroupoli, Greece
- Theodoros C. Constantinidis
  - Laboratory of Hygiene and Environmental Protection, Medical School, Democritus University of Thrace, 68100 Alexandroupoli, Greece

10
|
Cheng Q, Chunhong Z, Qianglin L. Development and application of random forest regression soft sensor model for treating domestic wastewater in a sequencing batch reactor. Sci Rep 2023; 13:9149. [PMID: 37277429 DOI: 10.1038/s41598-023-36333-8] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 02/22/2023] [Accepted: 06/01/2023] [Indexed: 06/07/2023] Open
Abstract
Small-scale distributed water treatment equipment such as the sequencing batch reactor (SBR) is widely used in rural domestic sewage treatment because of its rapid installation and construction, low operating cost and strong adaptability. However, owing to the non-linearity and hysteresis of the SBR process, it is difficult to construct a simulation model of the wastewater treatment. In this study, a methodology was developed using an artificial intelligence and automatic control system that can save energy and correspondingly reduce carbon emissions. The methodology leverages a random forest model to determine a suitable soft sensor for the prediction of COD trends. This study uses pH and temperature sensors as the basis for the COD soft sensor. In the proposed method, data were pre-processed into 12 input variables, and the top 7 variables were selected as the variables of the optimized model. Each cycle was ended by the artificial intelligence and automatic control system rather than by fixed-time control, which served as the uncontrolled scenario. In 12 test cases, COD removal was about 91.075%, while on average 24.25% of time or energy was saved. The proposed soft sensor selection methodology can be applied in rural domestic sewage treatment, with the advantages of saving time and energy. The time saved increases treatment capacity, and the energy saved makes this a low-carbon technology. The proposed methodology provides a framework for investigating ways to reduce costs associated with data collection by replacing costly and unreliable sensors with affordable and reliable alternatives. By adopting this approach, energy conservation can be maintained while meeting emission standards.
Affiliation(s)
- Qiu Cheng
  - Department of Material and Environmental Engineering, Chengdu Technological University, Chengdu, China
- Zhan Chunhong
  - Huicai Environmental Technology Co., Ltd., De Yuan Zhen, Pidu District, Chengdu, Sichuan, China
- Li Qianglin
  - Department of Material and Environmental Engineering, Chengdu Technological University, Chengdu, China

11
Tong S, Li W, Chen J, Xia R, Lin J, Chen Y, Xu CY. A novel framework to improve the consistency of water quality attribution from natural and anthropogenic factors. J Environ Manage 2023; 342:118077. [PMID: 37209643 DOI: 10.1016/j.jenvman.2023.118077] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 12/20/2022] [Revised: 03/31/2023] [Accepted: 04/30/2023] [Indexed: 05/22/2023]
Abstract
One critical question for water security and sustainable development is how water quality responds to changes in natural factors and human activities, especially in light of the expected exacerbation of water scarcity. Although machine learning models have shown noticeable advances in water quality attribution analysis, they have limited interpretability in explaining feature importance with theoretical guarantees of consistency. To fill this gap, this study built a modelling framework that employed the inverse distance weighting method and the extreme gradient boosting model to simulate water quality at grid scale, and adapted the Shapley additive explanation to interpret the contributions of the drivers to water quality over the Yangtze River basin. Unlike previous studies, we calculated the contribution of features to water quality at each grid within the river basin and aggregated the contributions from all grids as the feature importance. Our analysis revealed dramatic changes in the response magnitudes of water quality to drivers within the river basin. Air temperature had high importance in the variability of key water quality indicators (i.e. ammonia-nitrogen, total phosphorus, and chemical oxygen demand), and dominated the changes of water quality in the Yangtze River basin, especially in the upstream region. In the mid- and downstream regions, water quality was mainly affected by human activities. This study provides a modelling framework for robustly identifying feature importance by explaining the contribution of features to water quality at each grid.
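The gridding step named in this abstract, inverse distance weighting (IDW), can be illustrated with a minimal sketch of the generic technique. The function name `idw_estimate`, the power parameter of 2, and the sample coordinates and values are illustrative assumptions; the abstract does not state the settings the authors used.

```python
import math


def idw_estimate(samples, target, power=2.0):
    """Estimate a value at `target` from (x, y, value) samples.

    Generic inverse distance weighting: each sample is weighted by
    1 / distance**power. The power of 2 is an assumed default.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, value in samples:
        distance = math.hypot(x - target[0], y - target[1])
        if distance == 0.0:
            # The target coincides with a sample point: return it directly.
            return value
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical monitoring stations: (x, y, concentration)
stations = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]
midpoint_value = idw_estimate(stations, (1.0, 0.0))
```

A grid cell equidistant from two stations receives the plain average of their values, and a cell that coincides with a station receives that station's value.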
Affiliation(s)
- Shanlin Tong
  - State Key Laboratory of Water Resources and Hydropower Engineering Science, Wuhan University, Wuhan, 430072, China
- Wenpan Li
  - China National Environmental Monitoring Center, Beijing, 100012, China
- Jie Chen
  - State Key Laboratory of Water Resources and Hydropower Engineering Science, Wuhan University, Wuhan, 430072, China
- Rui Xia
  - State Key Laboratory of Environmental Criteria and Risk Assessment, Chinese Research Academy of Environmental Sciences, Beijing, 100012, China
- Jingyu Lin
  - Guangdong Provincial Key Laboratory of Water Quality Improvement and Ecological Restoration for Watersheds, School of Ecology, Environment and Resources, Guangdong University of Technology, Guangzhou, 510006, China
- Yan Chen
  - State Key Laboratory of Environmental Criteria and Risk Assessment, Chinese Research Academy of Environmental Sciences, Beijing, 100012, China
- Chong-Yu Xu
  - Department of Geosciences, University of Oslo, Oslo, N-0316, Norway

12
Sumdang N, Chotpantarat S, Cho KH, Thanh NN. The risk assessment of arsenic contamination in the urbanized coastal aquifer of Rayong groundwater basin, Thailand using the machine learning approach. Ecotoxicol Environ Saf 2023; 253:114665. [PMID: 36863158 DOI: 10.1016/j.ecoenv.2023.114665] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 07/13/2022] [Revised: 12/26/2022] [Accepted: 02/15/2023] [Indexed: 06/18/2023]
Abstract
The rapid expansion of urbanization has resulted in insufficient groundwater resources. To use groundwater more efficiently, a risk assessment of groundwater pollution is needed. The present study used three machine learning algorithms, Random Forest (RF), Support Vector Machine (SVM), and Artificial Neural Network (ANN), to locate areas at risk of arsenic contamination in the Rayong coastal aquifers, Thailand, and selected the most suitable model for risk assessment based on model performance and uncertainty. The parameters of 653 groundwater wells (Deep=236, Shallow=417) were selected based on the correlation of each hydrochemical parameter with arsenic concentration in deep and shallow aquifer environments. The models were validated with arsenic concentrations collected from 27 wells in the field. The RF algorithm showed the highest performance compared with SVM and ANN in both deep and shallow aquifers (Deep: AUC=0.72, Recall=0.61, F1=0.69; Shallow: AUC=0.81, Recall=0.79, F1=0.68). In addition, the uncertainty from the quantile regression of each model confirmed that the RF algorithm has the lowest uncertainty (Deep: PICP=0.20; Shallow: PICP=0.34). The risk map obtained from the RF reveals that, in the deep aquifer, the northern part of the Rayong basin poses a higher risk of exposure to As. In contrast, in the shallow aquifer the southern part of the basin has a higher risk, which is also supported by the location of the landfill and industrial estates in the area. Therefore, health surveillance is important in monitoring the toxic effects on residents who use groundwater from these contaminated wells. The outcome of this study can help policymakers manage the quality of groundwater resources and enhance their sustainable use. The novel process of this research can be used to study other contaminated groundwater aquifers and to increase the effectiveness of groundwater quality management.
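The abstract reports PICP values without defining the metric. Under its common definition, the prediction interval coverage probability is the fraction of observations that fall inside their predicted interval. The sketch below follows that definition as an assumption; the function name `picp` and the numbers are made up for illustration and are not the study's data.

```python
def picp(observed, lower, upper):
    """Prediction interval coverage probability.

    Fraction of observations lying within [lower_i, upper_i], where the
    bounds would typically come from two quantile-regression predictions.
    """
    if not (len(observed) == len(lower) == len(upper)):
        raise ValueError("observed, lower and upper must have equal length")
    covered = sum(1 for o, lo, hi in zip(observed, lower, upper) if lo <= o <= hi)
    return covered / len(observed)


# Hypothetical arsenic observations with hypothetical interval bounds
coverage = picp(
    observed=[1.0, 2.0, 3.0, 4.0],
    lower=[0.0, 0.0, 4.0, 3.0],
    upper=[2.0, 1.0, 5.0, 5.0],
)
```

Here two of the four observations fall inside their intervals, so the coverage is one half.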
Affiliation(s)
- Narongpon Sumdang
  - International Postgraduate Program in Hazardous Substance and Environmental Management, Graduate School, Chulalongkorn University, Bangkok 10330, Thailand
- Srilert Chotpantarat
  - Department of Geology, Faculty of Science, Chulalongkorn University, Bangkok 10330, Thailand; Center of Excellence in Environmental Innovation and Management of Metals (EnvIMM), Chulalongkorn University, Phayathai Road, Pathumwan, Bangkok 10330, Thailand
- Kyung Hwa Cho
  - Department of Urban and Environmental Engineering, Ulsan National Institute of Science and Technology, 50, UNIST-gil, Ulsan 44919, Republic of Korea
- Nguyen Ngoc Thanh
  - University of Agriculture and Forestry, Hue University, 102 Phung Hung Str, Hue City, Viet Nam

13
Zhao X, Song Y, Zhang Y, Cai G, Xue G, Liu Y, Chen K, Zhang F, Wang K, Zhang M, Gao Y, Sun D, Wang X, Li J. Predictions of Milk Fatty Acid Contents by Mid-Infrared Spectroscopy in Chinese Holstein Cows. Molecules 2023; 28:666. [PMID: 36677723 PMCID: PMC9864415 DOI: 10.3390/molecules28020666] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/03/2022] [Revised: 01/01/2023] [Accepted: 01/04/2023] [Indexed: 01/11/2023]
Abstract
Genetic improvement of milk fatty acid content traits in dairy cattle is of great significance. However, chromatography-based methods to measure milk fatty acid content have several disadvantages. Thus, quick and accurate predictions of various milk fatty acid contents based on the mid-infrared spectrum (MIRS) from dairy herd improvement (DHI) data are essential and meaningful to expand the amount of phenotypic data available. In this study, 24 kinds of milk fatty acid concentrations were measured from the milk samples of 336 Holstein cows in Shandong Province, China, using the gas chromatography (GC) technique, which simultaneously produced MIRS values for the prediction of fatty acids. After quantification by the GC technique, milk fatty acid contents expressed as g/100 g of milk (milk-basis) and g/100 g of fat (fat-basis) were processed by five spectral pre-processing algorithms: first-order derivative (DER1), second-order derivative (DER2), multiple scattering correction (MSC), standard normal transform (SNV), and Savitzky-Golay convolution smoothing (SG), and four regression models: random forest regression (RFR), partial least square regression (PLSR), least absolute shrinkage and selection operator regression (LassoR), and ridge regression (RidgeR). Two ranges of wavebands (4000 to 400 cm⁻¹ and 3017 to 2823 cm⁻¹/1805 to 1734 cm⁻¹) were also used in the above analysis. The prediction accuracy was evaluated using a 10-fold cross validation procedure, with a 3:1 ratio of training set to test set, where the determination coefficient (R²) and residual predictive deviation (RPD) were used for evaluation. The results showed that 17 out of 31 milk fatty acids were accurately predicted using MIRS, with RPD values higher than 2 and R² values higher than 0.75. In addition, 16 out of 31 fatty acids were accurately predicted by RFR, indicating that the ensemble learning model potentially resulted in a higher prediction accuracy. Meanwhile, DER1, DER2 and SG pre-processing algorithms led to high prediction accuracy for most fatty acids. In summary, these results imply that the application of MIRS to predict the fatty acid contents of milk is feasible.
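The acceptance rule in this abstract (RPD above 2 and R² above 0.75) can be made concrete with a short sketch. It assumes the common chemometrics convention that RPD is the sample standard deviation of the reference values divided by the RMSE of prediction; the abstract does not give the formula. The function names and sample values are illustrative only.

```python
import math
import statistics


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = statistics.fmean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rpd(observed, predicted):
    """RPD, assumed here to be SD(observed) / RMSE(prediction)."""
    n = len(observed)
    rmse = math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)
    return statistics.stdev(observed) / rmse


def accurately_predicted(observed, predicted):
    """Apply the thresholds quoted in the abstract: RPD > 2 and R² > 0.75."""
    return rpd(observed, predicted) > 2 and r_squared(observed, predicted) > 0.75


# Hypothetical GC reference values and MIRS predictions (g/100 g of milk)
reference = [1.0, 2.0, 3.0, 4.0]
mirs_pred = [1.1, 1.9, 3.2, 3.8]
passes = accurately_predicted(reference, mirs_pred)
```

Both metrics compare prediction error against the spread of the reference values, so a trait with little variation in the sample can fail the rule even when its absolute error is small.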
Affiliation(s)
- Xiuxin Zhao
  - Institute of Animal Science and Veterinary Medicine, Shandong Academy of Agricultural Sciences, Jinan 250100, China
  - Shandong OX Livestock Breeding Co., Ltd., Jinan 250100, China
- Yuetong Song
  - Institute of Animal Science and Veterinary Medicine, Shandong Academy of Agricultural Sciences, Jinan 250100, China
  - Department of Animal Genetics and Breeding, College of Animal Science and Technology, China Agricultural University, Beijing 100193, China
  - Yantai Institute, China Agricultural University, Yantai 264670, China
- Yuanpei Zhang
  - Shandong OX Livestock Breeding Co., Ltd., Jinan 250100, China
- Gaozhan Cai
  - Shandong OX Livestock Breeding Co., Ltd., Jinan 250100, China
- Guanghui Xue
  - Shandong OX Livestock Breeding Co., Ltd., Jinan 250100, China
- Yan Liu
  - Shandong OX Livestock Breeding Co., Ltd., Jinan 250100, China
- Kewei Chen
  - Yantai Institute, China Agricultural University, Yantai 264670, China
- Fan Zhang
  - Institute of Animal Science and Veterinary Medicine, Shandong Academy of Agricultural Sciences, Jinan 250100, China
- Kun Wang
  - Institute of Animal Science and Veterinary Medicine, Shandong Academy of Agricultural Sciences, Jinan 250100, China
  - Yantai Institute, China Agricultural University, Yantai 264670, China
- Miao Zhang
  - Institute of Animal Science and Veterinary Medicine, Shandong Academy of Agricultural Sciences, Jinan 250100, China
  - Yantai Institute, China Agricultural University, Yantai 264670, China
- Yundong Gao
  - Institute of Animal Science and Veterinary Medicine, Shandong Academy of Agricultural Sciences, Jinan 250100, China
- Dongxiao Sun
  - Department of Animal Genetics and Breeding, College of Animal Science and Technology, China Agricultural University, Beijing 100193, China
  - Correspondence: (D.S.); (X.W.); (J.L.)
- Xiao Wang
  - Institute of Animal Science and Veterinary Medicine, Shandong Academy of Agricultural Sciences, Jinan 250100, China
  - Correspondence: (D.S.); (X.W.); (J.L.)
- Jianbin Li
  - Institute of Animal Science and Veterinary Medicine, Shandong Academy of Agricultural Sciences, Jinan 250100, China
  - Correspondence: (D.S.); (X.W.); (J.L.)

14
Yan J, Jia S. A global gridded municipal water withdrawal estimation method using aggregated data and artificial neural network. Water Sci Technol 2023; 87:251-274. [PMID: 36640036 DOI: 10.2166/wst.2022.399] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/17/2023]
Abstract
Municipal water withdrawal (MWW) information is of great significance for water supply planning, including the planning, optimization and management of water supply pipeline networks. Currently, most MWW data are reported as spatial aggregates over large survey regions, or are lacking entirely, and so cannot meet the growing demand for spatially detailed data in many applications. In this paper, six different models are constructed and evaluated for estimating global MWW using aggregated MWW data and gridded raster covariates. Among the models, the artificial neural network-based indirect model (NNM) shows the best accuracy, with higher R² and lower NMAE and NRMSE at different spatial scales. The estimates obtained from the NNM model are consistent with census and survey data, and outperform the existing global gridded MWW dataset. Finally, the NNM model is applied to map global gridded MWW for the year 2015 at 0.1° × 0.1° resolution. The proposed method can be applied to a wider class of aggregated-output learning problems, and the high-resolution global gridded MWW data can be used in hydrological models and water resources management.
Affiliation(s)
- Jiabao Yan
  - Key Laboratory of Water Cycle and Related Land Surface Processes, Institute of Geographic Science and Natural Resources Research, Chinese Academy of Sciences, Beijing 100101, China
- Shaofeng Jia
  - Key Laboratory of Water Cycle and Related Land Surface Processes, Institute of Geographic Science and Natural Resources Research, Chinese Academy of Sciences, Beijing 100101, China

15
Kikuchi T, Anzai T, Ouchi T, Okamoto K, Terajima Y. Assessing the impact of watershed characteristics and management on nutrient concentrations in tropical rivers using a machine learning method. Environ Pollut 2023; 316:120599. [PMID: 36343855 DOI: 10.1016/j.envpol.2022.120599] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/07/2022] [Revised: 10/28/2022] [Accepted: 11/02/2022] [Indexed: 06/16/2023]
Abstract
Excessive loadings of terrestrial nitrogen and phosphorus, as well as their imbalances with silicon, have been recognized as one of the major causes of water quality and ecosystem deterioration in receiving waters. In this study, periodic water quality monitoring was conducted in the rivers and streams of a tropical island (Ishigaki Island, Japan) to identify the factors controlling the concentrations of dissolved inorganic nitrogen (DIN), total phosphorus (TP) and dissolved silicon (DSi), with a special focus on catchment characteristics (e.g., land use, surface geology, topography). The Random Forest (RF) machine learning algorithm was employed to develop predictive models for nutrient concentrations from the catchment properties. The developed models could predict nutrient concentrations with sufficient accuracy, demonstrating that the studied nutrients are strongly affected by catchment properties. Agricultural land uses (e.g., livestock barn, sugarcane field) were ranked as the most important parameters for DIN and TP, while broadleaf forest was the most influential factor for DSi. Using the RF models, the contributions of DIN originating from sugarcane fields (i.e., fertilizers) and barns (i.e., manure) to riverine DIN were estimated, which were up to 60% in total in the studied river basins. Furthermore, the yield of DIN from sugarcane fields, calculated as the concentration of DIN derived from sugarcane fields divided by the percent area of sugarcane fields, correlated strongly and positively with the areal coverage of limestone, suggesting that fertilizer-derived DIN is more prone to leaching from cropland soil to groundwater and rivers in catchments with a higher dominance of calcareous geology. These results, including the methodology employed, have implications for water quality assessment and management in inland and coastal waters, not only at the study site but also in other regions.
Affiliation(s)
- Tetsuro Kikuchi
  - Crop, Livestock and Environment Division, Japan International Research Center for Agricultural Sciences (JIRCAS), 1-1 Ohwashi, Tsukuba, Ibaraki, 305-8686, Japan
- Toshihiko Anzai
  - Tropical Agriculture Research Front, JIRCAS, 1091-1 Maezato-Kawarabaru, Ishigaki, Okinawa, 907-0002, Japan
- Takao Ouchi
  - Ibaraki Kasumigaura Environmental Science Center, 1853, Okijuku-machi, Tsuchiura, Ibaraki, 300-0023, Japan
- Ken Okamoto
  - Tropical Agriculture Research Front, JIRCAS, 1091-1 Maezato-Kawarabaru, Ishigaki, Okinawa, 907-0002, Japan
- Yoshifumi Terajima
  - Tropical Agriculture Research Front, JIRCAS, 1091-1 Maezato-Kawarabaru, Ishigaki, Okinawa, 907-0002, Japan

16
Wang S, Wang Y, Wang Y, Wang Z. Comparison of multi-objective evolutionary algorithms applied to watershed management problem. J Environ Manage 2022; 324:116255. [PMID: 36352707 DOI: 10.1016/j.jenvman.2022.116255] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/12/2022] [Revised: 09/09/2022] [Accepted: 09/09/2022] [Indexed: 06/16/2023]
Abstract
Simulation-based optimization (S-O) frameworks are effective in developing cost-effective watershed management strategies, where optimization algorithms have a substantial effect on the quality of strategies. Although the development and improvement of multi-objective evolutionary algorithms (MOEAs) provide more robust alternatives for optimization, they typically have limited applications in real-world decision contexts. In this study, three advanced MOEAs, NSGA-II, MOEA/D and NSGA-III, were introduced into the S-O framework and applied to a real-world watershed management problem, and their performance and characteristics were quantified through performance metrics. Results show that a higher crossover or mutation probability does not necessarily promote convergence and diversity of solutions, while larger generation numbers and population sizes help MOEAs find high-quality solutions. Compared to the other two MOEAs, NSGA-II consistently exhibits robust performance in finding solutions with good convergence and high diversity, and provides more options at the same computational cost, while the degenerate Pareto front of the proposed watershed management problem may account for the poor performance of MOEA/D and NSGA-III in terms of diversity. For a 10% TN or TP reduction target, the average cost of the NSGA-II optimized strategies is 32.22% or 47.83% of that of the commonly used strategies. In addition, this study discussed the development of resilient watershed management to buffer the impacts of climate change on aquatic systems, the incorporation of fuzzy programming into the S-O framework to develop robust watershed management strategies under uncertainty, and the application of machine learning-based surrogate models to reduce the computational cost of the S-O framework. These results can contribute to the understanding of MOEAs and provide useful guidance to decision makers.
Affiliation(s)
- Shuhui Wang
  - Three-gorges Reservoir Area (Chongqing) Forest Ecosystem Research Station, School of Soil and Water Conservation, Beijing Forestry University, Beijing, 100083, China
- Yunqi Wang
  - Three-gorges Reservoir Area (Chongqing) Forest Ecosystem Research Station, School of Soil and Water Conservation, Beijing Forestry University, Beijing, 100083, China
- Yujie Wang
  - Three-gorges Reservoir Area (Chongqing) Forest Ecosystem Research Station, School of Soil and Water Conservation, Beijing Forestry University, Beijing, 100083, China
- Zhen Wang
  - Three-gorges Reservoir Area (Chongqing) Forest Ecosystem Research Station, School of Soil and Water Conservation, Beijing Forestry University, Beijing, 100083, China

17
Xu G, Fan H, Oliver DM, Dai Y, Li H, Shi Y, Long H, Xiong K, Zhao Z. Decoding river pollution trends and their landscape determinants in an ecologically fragile karst basin using a machine learning model. Environ Res 2022; 214:113843. [PMID: 35931190 DOI: 10.1016/j.envres.2022.113843] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 02/05/2022] [Revised: 04/27/2022] [Accepted: 07/04/2022] [Indexed: 06/15/2023]
Abstract
Karst watersheds accommodate high landscape complexity and are influenced by both human-induced and natural activity, which affects the formation and process of runoff, sediment connectivity and contaminant transport, and alters natural hydrological and nutrient cycling. However, physical monitoring stations are costly and labor-intensive, which has limited the assessment of water quality impairments at the spatial scale. The geographical characteristics of catchments are potential influencing factors of water quality, often overlooked in previous studies of highly heterogeneous karst landscapes. To solve this problem, we developed a machine learning method and applied Extreme Gradient Boosting (XGBoost) to predict the spatial distribution of water quality in the world's most ecologically fragile karst watershed. We used Shapley Additive exPlanations (SHAP) to explain the potential determinants. Before this process, we first used the water quality damage index (WQI-DET) to evaluate the water quality impairment status and determined that CODMn, TN and TP were causing river water quality impairments in the WRB. Second, we selected 46 watershed features based on the three key processes (sources-mobilization-transport) that affect the temporal and spatial variation of river pollutants, in order to predict water quality in unmonitored reaches and decipher the potential determinants of river impairments. The predicted CODMn values ranged from 1.39 mg/L to 17.40 mg/L. The predictions of TP and TN ranged from 0.02 to 1.31 mg/L and from 0.25 to 5.72 mg/L, respectively. In general, the XGBoost model performs well in predicting pollutant concentrations in the WRB. SHAP indicated that pollutant levels may be driven by three factors: anthropogenic sources (agricultural pollution inputs), fragile soils (low organic carbon content and high soil permeability to water flow), and pollutant transport mechanisms (TWI, carbonate rocks). Our study provides key data to support decision-making for water quality restoration projects in the WRB and information to help bridge the science-policy gap.
Affiliation(s)
- Guoyu Xu
  - Key Laboratory of Watershed Geographic Sciences, Nanjing Institute of Geography and Limnology, Chinese Academy of Sciences, Nanjing, 210008, China; University of Chinese Academy of Sciences, Beijing, 100049, China
- Hongxiang Fan
  - Key Laboratory of Watershed Geographic Sciences, Nanjing Institute of Geography and Limnology, Chinese Academy of Sciences, Nanjing, 210008, China; University of Chinese Academy of Sciences, Beijing, 100049, China
- David M Oliver
  - Biological & Environmental Sciences, Faculty of Natural Sciences, University of Stirling, Stirling FK9 4LA, UK
- Yibin Dai
  - Institute of Surface-Earth System Science, School of Earth System Science, Tianjin University, Tianjin, 300072, China
- Hengpeng Li
  - Key Laboratory of Watershed Geographic Sciences, Nanjing Institute of Geography and Limnology, Chinese Academy of Sciences, Nanjing, 210008, China
- Yuejie Shi
  - Key Laboratory of Watershed Geographic Sciences, Nanjing Institute of Geography and Limnology, Chinese Academy of Sciences, Nanjing, 210008, China; University of Chinese Academy of Sciences, Beijing, 100049, China
- Haifei Long
  - Guizhou Provincial Bureau of Hydrological Resources, Guiyang, 550002, China
- Kangning Xiong
  - School of Karst Science / State Engineering Technology Institute for Karst Desertification Control, Guizhou Normal University, Guiyang, 550001, China
- Zhongming Zhao
  - Department of Geography, King's College London, London, WC2R 2LS, UK

18
Chang HM, Xu Y, Chen SS, He Z. Enhanced understanding of osmotic membrane bioreactors through machine learning modeling of water flux and salinity. Sci Total Environ 2022; 838:156009. [PMID: 35595138 DOI: 10.1016/j.scitotenv.2022.156009] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 03/12/2022] [Revised: 05/12/2022] [Accepted: 05/12/2022] [Indexed: 06/15/2023]
Abstract
Mathematical modeling can help to understand and optimize osmotic membrane bioreactors (OMBR), a promising technology for sustainable wastewater treatment with simultaneous water recovery. Herein, seven machine learning (ML) algorithms were employed to model both water flux and salinity of a lab-scale OMBR. Through hyperparameter tuning and 5-fold cross-validation, the ML models achieved more accurate results without obvious overfitting or bias. The median R² scores of water flux modeling were all above 0.95, and most of the median R² scores from total dissolved solids (TDS) modeling were higher than 0.90. During model testing, the random forest (RF) algorithm presented the highest R² score of 0.987 with the lowest root mean square error (RMSE = 0.044) for water flux modeling, and the extreme gradient boosting (XGB) algorithm exhibited the best results (R² = 0.97; RMSE = 0.234) in TDS modeling. The Shapley Additive exPlanations (SHAP) analysis found that the phosphorus concentration was a critical input feature for both water flux and TDS modeling. Finally, the selected ML models were used to predict water flux and salinity as affected by two input features, and the prediction results confirmed the importance of the phosphate concentration. The results of this study demonstrate the promise of ML modeling for investigating OMBR systems.
Affiliation(s)
- Hau-Ming Chang
  - Institute of Environmental Engineering and Management, National Taipei University of Technology, Taipei, Taiwan; Department of Energy, Environmental and Chemical Engineering, Washington University in St. Louis, St. Louis, MO, USA
- Yanran Xu
  - Department of Energy, Environmental and Chemical Engineering, Washington University in St. Louis, St. Louis, MO, USA
- Shiao-Shing Chen
  - Institute of Environmental Engineering and Management, National Taipei University of Technology, Taipei, Taiwan
- Zhen He
  - Department of Energy, Environmental and Chemical Engineering, Washington University in St. Louis, St. Louis, MO, USA

19
Park J, Ahn J, Kim J, Yoon Y, Park J. Prediction and Interpretation of Water Quality Recovery after a Disturbance in a Water Treatment System Using Artificial Intelligence. Water 2022; 14:2423. [DOI: 10.3390/w14152423] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/01/2023]
Abstract
In this study, an ensemble machine learning model was developed to predict the recovery rate of water quality in a water treatment plant after a disturbance. XGBoost, one of the most popular ensemble machine learning models, was used as the main framework of the model. Water quality and operational data observed in a pilot plant were used to train and test the model. A disturbance was identified when the observed turbidity was higher than the given turbidity criterion. The recovery rate of water quality at time t was therefore defined over the falling limb of the turbidity recovery period, as the ratio of the difference between the peak turbidity and the observed turbidity at time t to the difference between the peak turbidity and the turbidity criterion. The root mean square error–observation standard deviation ratio of the XGBoost model improved from 0.730 to 0.373 through pretreatment, which removed the observations on the rising limb of the disturbance from the training data. Moreover, Shapley value analysis, a novel explainable artificial intelligence method, was used to provide a reasonable interpretation of the model's performance.
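The recovery rate described in this abstract can be written as R(t) = (T_peak − T(t)) / (T_peak − T_crit). The sketch below encodes that ratio together with the RMSE-observation standard deviation ratio (RSR) in its usual form, RMSE divided by the standard deviation of the observations. The function names and turbidity values are illustrative assumptions, not code or data from the study.

```python
import math


def recovery_rate(peak, observed, criterion):
    """Recovery rate at time t on the falling limb of a disturbance.

    0 means turbidity is still at its peak; 1 means it has fallen back
    to the turbidity criterion.
    """
    if peak <= criterion:
        raise ValueError("peak turbidity must exceed the turbidity criterion")
    return (peak - observed) / (peak - criterion)


def rsr(observed, predicted):
    """RMSE divided by the standard deviation of the observations."""
    mean_obs = sum(observed) / len(observed)
    squared_error = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    squared_dev = sum((o - mean_obs) ** 2 for o in observed)
    return math.sqrt(squared_error / squared_dev)


# Hypothetical turbidities (NTU): peak 10, criterion 2, currently 6
halfway = recovery_rate(peak=10.0, observed=6.0, criterion=2.0)
```

An RSR of 0 indicates a perfect fit, while a model that always predicts the mean of the observations scores 1, which gives context for the reported improvement from 0.730 to 0.373.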
|
20
|
Qi C, Wu M, Lu X, Zhang Q, Chen Q. Comparison and Determination of Optimal Machine Learning Model for Predicting Generation of Coal Fly Ash. Crystals 2022; 12:556. [DOI: 10.3390/cryst12040556] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
Abstract
The rapid development of industry keeps increasing the demand for energy. Coal, as the main energy source, is consumed in huge quantities, resulting in the continuous generation of its combustion byproduct, coal fly ash (CFA). Accumulated CFA not only occupies a large amount of land but also causes serious environmental pollution and personal injury, so the resource utilization of CFA is receiving growing attention. However, given the variability in the amount of CFA generated, predicting it in advance is the basis for effective disposal and rational utilization. In this study, CFA generation was taken as the target variable, three machine learning (ML) algorithms were used to construct the model, and four evaluation indices were used to evaluate its performance. The results showed that the DNN model, with R = 0.89 and R2 = 0.77 on the testing set, performed better than the traditional multiple linear regression equation and the other ML algorithms, demonstrating the feasibility of DNN as the optimal model framework. Applying this model framework in engineering practice enables managers to identify the next disposal step in advance and to allocate recycling and utilization routes rationally, maximizing the use and sales benefits of CFA while minimizing its disposal costs. In addition, sensitivity analysis further explains the ML model’s internal decisions and verifies that coal consumption is more important than installed capacity, which provides a reference for ensuring the rational utilization of CFA.
|
21
|
Ji X, Chen J, Guo Y. A Multi-Dimensional Investigation on Water Quality of Urban Rivers with Emphasis on Implications for the Optimization of Monitoring Strategy. Sustainability 2022; 14:4174. [DOI: 10.3390/su14074174] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 02/01/2023]
Abstract
Water quality monitoring (WQM) of urban rivers has been a reliable method to supervise the urban water environment. Indiscriminate WQM strategies can hardly target the pollution of concern and usually incur high costs in money, time, and manpower. To tackle these issues, this work carried out a multi-dimensional study (large spatial scale, multiple monitoring parameters, and long time scale) on the water quality of two urban rivers in Jiujiang City, China, which can provide indicative information for the optimization of WQM. Of note, the spatial distribution of NH3-N concentration varied significantly both between the two rivers and among their sections (i.e., it was much higher in the northern section), with a maximal difference that was, on average, greater than fivefold. Statistical methods and machine learning algorithms were applied to optimize the monitoring objects, parameters, and frequency. The sharp decrease in water quality between adjacent sections was identified by applying the Analytical Hierarchy Process to water quality assessment indexes. After correlation analysis, principal component analysis, and cluster analysis, the various WQM parameters could be divided into three principal components and four clusters. With the Random Forest machine learning algorithm, the relation between pollutant concentration and rainfall depth was fitted using quadratic functions (calculated Pearson correlation coefficients ≥ 0.89), which could help predict pollution after precipitation and further determine the appropriate WQM frequency. Overall, this work provides a novel approach for efficient, smart, and low-cost water quality investigation and for determining monitoring strategies, which contributes to the construction of smart water systems and sustainable water source management.
|
22
|
Cheng L, Feng R, Wang L, Yan J, Liang D. An Assessment of Electric Power Consumption Using Random Forest and Transferable Deep Model with Multi-Source Data. Remote Sensing 2022; 14:1469. [DOI: 10.3390/rs14061469] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/04/2022]
Abstract
Reliable and fine-resolution electric power consumption (EPC) data are essential for effective urban electricity allocation and planning. Currently, EPC data exist mainly as low-resolution statistics. Many studies estimate fine-resolution EPC based on the positive correlation between stable nighttime light and EPC distribution. However, EPC is related to various factors other than nighttime light and is spatially non-stationary, which has been ignored in current research. This study developed a novel method to estimate EPC at 500 m resolution that accounts for spatial non-stationarity by fusing geospatial data and high-resolution satellite images. Deep transfer learning and statistical methods were used to extract socio-economic, population density, and landscape features that describe EPC distribution from multi-source geospatial data. Finally, a random forest regression (RFR) model using these features and EPC statistics was established to estimate fine-resolution EPC. Shenzhen city, China, was used as the study area to evaluate the proposed method. The R2 between predicted EPC and statistical EPC was 0.82 at the sub-district level in 2013, which is higher than that of an existing EPC product (Shi’s product, R2 = 0.46), illustrating the effectiveness of the proposed method. Moreover, the EPC distribution for Shenzhen from 2013 to 2019 was estimated, and the spatiotemporal dynamics of EPC were analyzed at the pixel and sub-district levels.
|
23
|
Abstract
River sediments often contain potentially harmful pollutants such as metals. Much research has been conducted to identify factors involved in sediment concentrations of metals. While most metal pollution studies focus on smaller scales, it has been shown that basin-scale parameters are powerful predictors of river water quality. The present study focused on basin-scale factors of metal concentrations in river sediments. The study was performed on the contiguous USA using Random Forest (R.F.) to analyze the importance of different factors of the metal pollution potential of river sediments and evaluate the possibility of assessing this potential from basin characteristics. Results indicated that the most important factors belonged to the groups Geology, Dams, and Land cover. Rock characteristics (contents of K2O, CaO, and SiO2) and reservoir drainage area were strong factors. Vegetation indices were more important than land cover types. The response of different metals to basin-scale factors varied greatly. The R.F. models performed well with prediction errors of 16.5% to 28.1%, showing that basin-scale parameters hold sufficient information for predicting potential metal concentrations. The results contribute to research and policymaking dependent on understanding large-scale factors of metal pollution.
|
24
|
Ghimire S, Deo RC, Wang H, Al-Musaylh MS, Casillas-Pérez D, Salcedo-Sanz S. Stacked LSTM Sequence-to-Sequence Autoencoder with Feature Selection for Daily Solar Radiation Prediction: A Review and New Modeling Results. Energies 2022; 15:1061. [DOI: 10.3390/en15031061] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Indexed: 12/10/2022]
Abstract
We review the latest modeling techniques and propose a new hybrid SAELSTM framework based on Deep Learning (DL) to construct prediction intervals for daily Global Solar Radiation (GSR), using Manta Ray Foraging Optimization (MRFO) feature selection to select model parameters. The selected features are employed as potential inputs for a Long Short-Term Memory and seq2seq SAELSTM autoencoder DL system in the final GSR prediction. Six solar energy farms in Queensland, Australia are considered to evaluate the method, with predictors from Global Climate Models and ground-based observations. Comparisons are carried out among DL models (i.e., Deep Neural Network) and conventional Machine Learning algorithms (i.e., Gradient Boosting Regression, Random Forest Regression, Extremely Randomized Trees, and Adaptive Boosting Regression). The hyperparameters are determined by grid search, and simulations demonstrate that the DL hybrid SAELSTM model is more accurate than the other models as well as the persistence methods. The SAELSTM model obtains quality solar energy prediction intervals with high coverage probability and low interval errors. The review and new modelling results utilising an autoencoder deep learning method show that our approach is suitable for predicting solar radiation and is therefore useful in solar energy monitoring systems to capture the stochastic variations in solar power generation due to cloud cover, aerosols, ozone changes, and other atmospheric attenuation factors.
|
25
|
Elzain HE, Chung SY, Senapathi V, Sekar S, Lee SY, Roy PD, Hassan A, Sabarathinam C. Comparative study of machine learning models for evaluating groundwater vulnerability to nitrate contamination. Ecotoxicol Environ Saf 2022; 229:113061. [PMID: 34902776 DOI: 10.1016/j.ecoenv.2021.113061] [Citation(s) in RCA: 23] [Impact Index Per Article: 11.5] [Received: 08/31/2021] [Revised: 12/02/2021] [Accepted: 12/04/2021] [Indexed: 06/14/2023]
Abstract
The accurate evaluation of groundwater contamination vulnerability is essential for the management and prevention of groundwater contamination in the watershed. In this study, three advanced machine learning (ML) models, namely Radial Basis Neural Networks (RBNN), Support Vector Regression (SVR), and ensemble Random Forest Regression (RFR), were compared to identify the most accurate one for evaluating groundwater contamination vulnerability. Eight vulnerability factors of DRASTIC-L were rated based on the modified DRASTIC model (MDM) and were used as input data. The adjusted vulnerability index (AVI) with nitrate values was used as output data for the modeling process. The performance of the three models was verified using the statistical performance criteria of MAE, RMSE, r2, and ROC/AUC values. The ensemble RFR model showed the highest performance in comparison with the standalone SVR and RBNN models. Specifically, ensemble RFR kept all promising solutions during the model performance due to its flexibility and robustness, and the vulnerability map obtained by the RFR model was more accurate for predicting the areas most vulnerable to contamination. It was concluded that ensemble RFR was a robust tool to enhance the evaluation of groundwater contamination vulnerability, and that it could contribute to environmental safety against groundwater contamination.
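As a rough illustration of three of the performance criteria named in this abstract, the following Python sketch computes MAE, RMSE, and r2 for paired observed and predicted values. Reading "r2" as the squared Pearson correlation is an assumption, since the abstract does not define it; the function names and sample values are hypothetical, and ROC/AUC is not covered.

```python
import math


def mae(obs, pred):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def rmse(obs, pred):
    """Root mean square error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def pearson_r2(obs, pred):
    """Squared Pearson correlation (assumed meaning of 'r2' here)."""
    mean_o = sum(obs) / len(obs)
    mean_p = sum(pred) / len(pred)
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(obs, pred))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_p = sum((p - mean_p) ** 2 for p in pred)
    return cov ** 2 / (var_o * var_p)


# Hypothetical index values: every prediction is offset by exactly +1.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [2.0, 3.0, 4.0, 5.0]
scores = (mae(observed, predicted),
          rmse(observed, predicted),
          pearson_r2(observed, predicted))
```

The example shows why the criteria are usually reported together: a constant offset leaves the squared correlation at 1 while MAE and RMSE both register the error of 1.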
Affiliation(s)
- Hussam Eldin Elzain
- Department of Earth & Environmental Sciences, Pukyong National University, Busan 48513, Republic of Korea
- Sang Yong Chung
- Department of Earth & Environmental Sciences, Pukyong National University, Busan 48513, Republic of Korea
- Selvam Sekar
- Department of Geology, V. O. Chidambaram College, Tuticorin, Tamil Nadu 628008, India
- Seung Yeop Lee
- High Level Waste Disposal Research Center, Korea Atomic Energy Research Institute (KAERI), Daejeon 34057, Republic of Korea
- Priyadarsi D Roy
- Instituto de Geología, Universidad Nacional Autónoma de México, Ciudad Universitaria, Ciudad de México CP 04510, Mexico
- Amjed Hassan
- College of Petroleum Engineering & Geosciences, King Fahd University of Petroleum & Minerals, Dhahran 31261, Saudi Arabia
|