1
Wang Y, Zhang Z, Cheng C, Liang C, Wang H, He M, Huang H, Wang K. Ensemble learning-assisted quantitative identifying influencing factors of cadmium and arsenic concentration in rice grain based multiplexed data. JOURNAL OF HAZARDOUS MATERIALS 2025; 485:136869. [PMID: 39675080 DOI: 10.1016/j.jhazmat.2024.136869] [Received: 10/10/2024] [Revised: 12/06/2024] [Accepted: 12/11/2024] [Indexed: 12/17/2024]
Abstract
Rapid and accurate prediction of rice Cd (rCd) and rice As (rAs) bioaccumulation is important for assessing the safe utilization of rice. Currently, there is a lack of comprehensive and systematic exploration of the factors influencing rCd and rAs. Herein, ensemble learning (EL) was first used to analyze 23 factors in 8 categories (heavy metal pollution characteristics, soil properties, geographical characteristics, meteorological factors, socio-economic factors, environmental factors, rice type, and nutrient elements) in typical regions of China, based on the results of 193 research papers from 2000 to 2024 in the Web of Science database. Three machine learning methods were then used to predict rCd and rAs concentrations, identify the key factors in each region, and explore the mechanisms of Cd and As uptake in rice. The results showed large regional differences in the factors affecting rice enrichment of the same heavy metal. For Cd, rice type (48.30 %), soil characteristics (28.14 %), and environmental factors (61.30 %) were the most important factors in Central South, East China, and Southwest China, respectively. For As, soil properties (34.01 %) and geographical characteristics (50.22 %) had the greatest influence in Central South and East China, respectively. Our study provides valuable insights into the prediction of rCd and rAs, thus contributing to ensuring food safety and preventing health risks associated with Cd and As exposure.
Affiliation(s)
- Yakun Wang
- School of Land Science and Technology, China University of Geosciences (Beijing), Beijing 100083, China
- Zhuo Zhang
- School of Land Science and Technology, China University of Geosciences (Beijing), Beijing 100083, China; Key Laboratory of Land Consolidation and Rehabilitation, Ministry of Natural Resources, Beijing 100035, China.
- Cheng Cheng
- PipeChina north Pipeline company, Langfang 065000, China
- Chouyuan Liang
- School of Land Science and Technology, China University of Geosciences (Beijing), Beijing 100083, China
- Hejing Wang
- Technical Center for Soil,Agriculture and Rural Ecology and Environment Ministry of Ecology and Environment, Beijing 100012, China
- Mengsi He
- School of Land Science and Technology, China University of Geosciences (Beijing), Beijing 100083, China
- Haochong Huang
- School of Science, China University of Geosciences (Beijing), Beijing 100083, China
- Kai Wang
- School of Earth sciences and Resources, China University of Geosciences (Beijing), Beijing 100083, China
2
Zhang RJ, Ji XH, Xie YH, Xue T, Liu SH, Tian FX, Pan SF. A novel graph convolutional neural network model for predicting soil Cd and As pollution: Identification of influencing factors and interpretability. ECOTOXICOLOGY AND ENVIRONMENTAL SAFETY 2025; 292:117926. [PMID: 39978104 DOI: 10.1016/j.ecoenv.2025.117926] [Received: 11/03/2024] [Revised: 01/23/2025] [Accepted: 02/17/2025] [Indexed: 02/22/2025]
Abstract
Soil pollution caused by toxic metals poses serious threats to the ecological environment and human well-being. Accurately predicting toxic metal concentrations is critical for safeguarding soil environmental security. However, the distribution of soil toxic metal concentrations often exhibits significant spatial heterogeneity and intricate correlations with other environmental influencing factors, posing substantial challenges to accurate prediction. This study explores the application of a novel graph convolutional neural network model, DistNet-GCN, which uses the spatial relationships among sampling points to predict cadmium (Cd) and arsenic (As) concentrations in soil. The distinctive feature of this model is its capacity to mimic how relationships between soil Cd/As concentrations and environmental influencing factors are transmitted within a local spatial scope, drawing on the ability of GCNs to extract inter-node dependencies in complex networks. It extracts the critical features of the dataset from a spatial relationship graph in which the spatial positions of sampling points are the network nodes, the concentrations of toxic metals are the node labels, and environmental factors are the node attributes. Compared with traditional models, the DistNet-GCN model achieves the highest prediction accuracy for soil Cd and As concentrations. Specifically, the R² values reach 0.91 and 0.94, respectively, which are improvements of 21.33 % and 9.30 % over those of Multiple Linear Regression (MLR). The interpretability analysis shows that urban human activities, mining operations, pH, and soil organic matter (SOM) are the most important environmental factors affecting the spatial distribution of soil Cd/As concentrations in the study area. Additionally, the local spatial autocorrelation findings reveal that the Moran's I values for Cd and As are 0.796 and 0.897, respectively, which validate the structural soundness and rationality of the DistNet-GCN model. This study offers a novel approach to predicting soil Cd/As concentrations by integrating spatial graph structures into deep learning models, and it is significant for uncovering the complex correlations between toxic metal concentrations in soil and various environmental factors.
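The graph structure described in this abstract (sampling points as nodes, environmental factors as node attributes) can be illustrated with a minimal sketch. It is not the DistNet-GCN model: the k-nearest-neighbour graph, the function names, and the toy values are all assumptions made for illustration, and the learned weights and nonlinearity of a real GCN layer are left out. It shows only the standard normalized-adjacency propagation step that mixes each node's attributes with those of its spatial neighbours.

```python
import math

def knn_adjacency(coords, k):
    """Symmetric 0/1 adjacency linking each point to its k nearest neighbours (assumed graph rule)."""
    n = len(coords)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        dists = sorted((math.dist(coords[i], coords[j]), j) for j in range(n) if j != i)
        for _, j in dists[:k]:
            adj[i][j] = 1.0
            adj[j][i] = 1.0  # symmetrise so the graph is undirected
    return adj

def normalized_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2}, the propagation matrix of a standard GCN layer."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]

def propagate(norm_adj, features):
    """One propagation step: each node's attributes become a weighted mix with its neighbours'."""
    n, d = len(features), len(features[0])
    return [[sum(norm_adj[i][j] * features[j][c] for j in range(n)) for c in range(d)]
            for i in range(n)]

# Hypothetical sampling points (x, y) and one node attribute (e.g. soil pH).
coords = [(0, 0), (1, 0), (10, 0), (11, 0)]
ph = [[5.0], [7.0], [6.0], [8.0]]
smoothed = propagate(normalized_adjacency(knn_adjacency(coords, k=1)), ph)
print(smoothed)
```

In this toy case the two nearby pairs of points form separate neighbourhoods, so each node's attribute is averaged only with its close neighbour and not with distant sites.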
Affiliation(s)
- Ren-Jie Zhang
- Longping Branch, College of Biology, Hunan University, Changsha 410125, China; Key Lab of Prevention, Control and Remediation of Soil Heavy Metal Pollution, Hunan Institute of Agro-Environment and Ecology, Hunan Academy of Agricultural Sciences, Changsha 410125, China; Ministry of Agriculture Key Lab of Agri-Environment in the Midstream of Yangtze River Plain, Changsha 410125, China
- Xiong-Hui Ji
- Longping Branch, College of Biology, Hunan University, Changsha 410125, China; Key Lab of Prevention, Control and Remediation of Soil Heavy Metal Pollution, Hunan Institute of Agro-Environment and Ecology, Hunan Academy of Agricultural Sciences, Changsha 410125, China; Ministry of Agriculture Key Lab of Agri-Environment in the Midstream of Yangtze River Plain, Changsha 410125, China.
- Yun-He Xie
- Key Lab of Prevention, Control and Remediation of Soil Heavy Metal Pollution, Hunan Institute of Agro-Environment and Ecology, Hunan Academy of Agricultural Sciences, Changsha 410125, China; Ministry of Agriculture Key Lab of Agri-Environment in the Midstream of Yangtze River Plain, Changsha 410125, China
- Tao Xue
- Key Lab of Prevention, Control and Remediation of Soil Heavy Metal Pollution, Hunan Institute of Agro-Environment and Ecology, Hunan Academy of Agricultural Sciences, Changsha 410125, China; Ministry of Agriculture Key Lab of Agri-Environment in the Midstream of Yangtze River Plain, Changsha 410125, China
- Sai-Hua Liu
- Key Lab of Prevention, Control and Remediation of Soil Heavy Metal Pollution, Hunan Institute of Agro-Environment and Ecology, Hunan Academy of Agricultural Sciences, Changsha 410125, China; Ministry of Agriculture Key Lab of Agri-Environment in the Midstream of Yangtze River Plain, Changsha 410125, China
- Fa-Xiang Tian
- Key Lab of Prevention, Control and Remediation of Soil Heavy Metal Pollution, Hunan Institute of Agro-Environment and Ecology, Hunan Academy of Agricultural Sciences, Changsha 410125, China; Ministry of Agriculture Key Lab of Agri-Environment in the Midstream of Yangtze River Plain, Changsha 410125, China
- Shu-Fang Pan
- Key Lab of Prevention, Control and Remediation of Soil Heavy Metal Pollution, Hunan Institute of Agro-Environment and Ecology, Hunan Academy of Agricultural Sciences, Changsha 410125, China; Ministry of Agriculture Key Lab of Agri-Environment in the Midstream of Yangtze River Plain, Changsha 410125, China.
3
Chen R, Liu Z, Yang J, Ma T, Guo A, Shi R. Predicting cadmium enrichment in crops/vegetables and identifying the effects of soil factors based on transfer learning methods. ECOTOXICOLOGY AND ENVIRONMENTAL SAFETY 2025; 291:117823. [PMID: 39904259 DOI: 10.1016/j.ecoenv.2025.117823] [Received: 11/27/2024] [Revised: 01/27/2025] [Accepted: 01/27/2025] [Indexed: 02/06/2025]
Abstract
Cadmium (Cd) is present in soils and can easily migrate into plants due to its various forms. This mobility allows it to be absorbed by plant roots and accumulate in edible parts, entering the food chain and posing health risks. In some regions, insufficient sampling and research, or the limited cultivation of specific vegetables and crops, make it challenging to gather adequate data for modeling. A total of 353 pairs of soil and crop/vegetable samples were collected across three regions using a unified measurement method. These samples were used to build predictive models of the relationship between soil factors and Cd absorption in six different crops/vegetables, followed by a unified comparison. This study compares regression and probability models and determines the best feature combination, one that retains enough information for accurate prediction while preventing the over-fitting caused by too many features. The best feature combination is then used to apply transfer learning to Cd enrichment in crops/vegetables. The results show that the best accuracy of the random forest probability model on the rice dataset is 0.89. The best feature combination was found by feature optimization, and it performs very well when transfer learning is used to predict Cd in crops/vegetables: the accuracy for corn, rape, and radish is 0.93, 0.89, and 0.81, respectively. Where transfer learning predicts well, available Cd is the most critical feature, and it is positively correlated with Cd in plants. This suggests that available heavy metals significantly influence predictions in crops/vegetables. In areas with less sampling and research, selecting relevant features and using transfer learning methods is more appropriate for constructing predictive models.
Affiliation(s)
- Rui Chen
- Engineering Research Center of Clean and Low-carbon Technology for Intelligent Transportation, Ministry of Education, School of Environment, Beijing Jiaotong University, Beijing 100044, China
- Zean Liu
- Engineering Research Center of Clean and Low-carbon Technology for Intelligent Transportation, Ministry of Education, School of Environment, Beijing Jiaotong University, Beijing 100044, China
- Jingyan Yang
- Engineering Research Center of Clean and Low-carbon Technology for Intelligent Transportation, Ministry of Education, School of Environment, Beijing Jiaotong University, Beijing 100044, China
- Tiantian Ma
- Agro-Environmental Protection Institute, Ministry of Agriculture and Rural Affairs, Tianjin 300191, China
- Aihong Guo
- College of Chemical Engineering, North China University of Science and Technology, Tangshan 063210, China
- Rongguang Shi
- Agro-Environmental Protection Institute, Ministry of Agriculture and Rural Affairs, Tianjin 300191, China.
4
Proshad R, Asharaful Abedin Asha SM, Tan R, Lu Y, Abedin MA, Ding Z, Zhang S, Li Z, Chen G, Zhao Z. Machine learning models with innovative outlier detection techniques for predicting heavy metal contamination in soils. JOURNAL OF HAZARDOUS MATERIALS 2025; 481:136536. [PMID: 39566457 DOI: 10.1016/j.jhazmat.2024.136536] [Received: 09/10/2024] [Revised: 10/31/2024] [Accepted: 11/14/2024] [Indexed: 11/22/2024]
Abstract
Machine learning (ML) models for predicting heavy metals have improved, but dataset outliers can still produce inconsistent outputs and reduce model reliability and accuracy. A comprehensive technique that combines machine learning and advanced statistical methods was applied to assess the effects of data outliers on ML models. Ten ML models with three outlier detection methods predicted Cr, Ni, Cd, and Pb in Narayanganj soils. XGBoost with density-based spatial clustering of applications with noise (DBSCAN) improved model efficacy (R²). The R² of Cr, Ni, Cd, and Pb was considerably enhanced by 11.11 %, 6.33 %, 14.47 %, and 5.68 %, respectively, indicating that outliers affected the model's HM prediction. Soil factors affected Cr (80 %), Ni (72.61 %), Cd (53.35 %), and Pb (63.47 %) concentrations based on feature importance. Contamination factor prediction showed considerable contamination for Cr, Ni, and Cd. LISA revealed Cd (55.4 %), Cr (49.3 %), and Pb (47.3 %) as the significant pollutants (p < 0.05). Moran's I index values for Cr, Ni, Cd, and Pb were 0.65, 0.58, 0.60, and 0.66, respectively, indicating strong positive spatial autocorrelation and clusters with similar contamination. Finally, this work successfully assessed the influence of data outliers on the ML model for soil HM contamination prediction, identifying crucial regions that require rapid conservation measures.
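For readers unfamiliar with the Moran's I index reported in this abstract, the sketch below computes the global statistic from its textbook definition, I = (n / S0) · Σ w_ij (x_i − x̄)(x_j − x̄) / Σ (x_i − x̄)². The spatial weight matrix and the values are hypothetical; the abstract does not say how the weights were built, so the adjacent-neighbour chain used here is purely an assumption.

```python
def morans_i(values, weights):
    """Global Moran's I for observations `values` and an n x n spatial weight matrix `weights`."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # total weight
    num = sum(weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / s0) * num / den

# Hypothetical example: four sites along a line, each linked to its adjacent sites.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
print(morans_i([1.0, 2.0, 3.0, 4.0], chain))  # smooth gradient: positive autocorrelation
print(morans_i([1.0, 4.0, 1.0, 4.0], chain))  # alternating values: negative autocorrelation
```

Values near +1 indicate that similar concentrations cluster together in space, which is how the abstract interprets its results; values near −1 indicate a checkerboard-like pattern.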
Affiliation(s)
- Ram Proshad
- State Key Laboratory of Mountain Hazards and Engineering Safety, Institute of Mountain Hazards and Environment, Chinese Academy of Sciences, Chengdu 610041, Sichuan, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Rong Tan
- College of Earth and Environmental Sciences, Lanzhou University, Lanzhou 730000, China
- Yineng Lu
- College of Earth and Environmental Sciences, Lanzhou University, Lanzhou 730000, China
- Md Anwarul Abedin
- Laboratory of Environment and Sustainable Development, Department of Soil Science, Bangladesh Agricultural University, Mymensingh 2202, Bangladesh
- Zihao Ding
- College of Earth and Environmental Sciences, Lanzhou University, Lanzhou 730000, China
- Shuangting Zhang
- College of Earth and Environmental Sciences, Lanzhou University, Lanzhou 730000, China
- Ziyi Li
- College of Earth and Environmental Sciences, Lanzhou University, Lanzhou 730000, China
- Geng Chen
- College of Earth and Environmental Sciences, Lanzhou University, Lanzhou 730000, China
- Zhuanjun Zhao
- State Key Laboratory of Mountain Hazards and Engineering Safety, Institute of Mountain Hazards and Environment, Chinese Academy of Sciences, Chengdu 610041, Sichuan, China.
5
Zhou B, Wang F, Li H, Zhao Y, Yang R, Huang H, Wang Y, Xiao Z, Tian K, Pang W. Evaluating heavy metals-related risk in staple crops and making financing strategy for corresponding soil remediation across China. JOURNAL OF HAZARDOUS MATERIALS 2024; 480:136135. [PMID: 39405717 DOI: 10.1016/j.jhazmat.2024.136135] [Received: 07/09/2024] [Revised: 09/13/2024] [Accepted: 10/08/2024] [Indexed: 12/01/2024]
Abstract
China's staple crops face contamination by heavy metals (HMs), a widespread issue that lacks a national assessment. We used machine learning (ML) to assess the risks of 8 HMs in rice, wheat, and maize, and estimated a financing strategy for soil remediation via linear optimization and a computable general equilibrium (CGE) model. The accumulation of HMs in crops depends on Soil-HMs, climate, soil properties, and crop types. Cd and Hg pose major soil pollution risks, while Cr, Pb, and Cd are the most threatening in crops. High-risk zones are located in the warm temperate and subtropical zones, with wheat most vulnerable. Over a quarter (26.77 %) of the nation's croplands are classified as high-risk and 60.89 % fall into the medium-risk category, leaving only 12.34 % of the agricultural land in a safe condition. The estimated remediation cost is 58596.73 billion RMB and the crop loss is 808.03 billion RMB over a ten-year remediation period in the context of secure crop supply. Reallocating social investment, rather than raising new taxes for the remediation, benefits GDP growth and social welfare despite some loss in household and enterprise income. This study provides a comprehensive evaluation of Crop-HMs risk and remediation policy, which is crucial for national crop security.
Affiliation(s)
- Baiqin Zhou
- Gansu Academy of Eco-environmental Sciences, Lanzhou 730030, China; School of Civil and Environmental Engineering, Harbin Institute of Technology (Shenzhen), Shenzhen 518055, China
- Fangjun Wang
- College of Architecture & Civil Engineering, Faculty of Urban Construction, Beijing University of Technology, Beijing 100124, China
- Huiping Li
- Key Laboratory of Yangtze River Water Environment, Ministry of Education, College of Environmental Science and Engineering, Tongji University, Shanghai 200092, China.
- Yuantian Zhao
- College of Architecture & Civil Engineering, Faculty of Urban Construction, Beijing University of Technology, Beijing 100124, China
- Ruichun Yang
- National Engineering Laboratory for Advanced Municipal Wastewater Treatment and Reuse Technology, Beijing University of Technology, Beijing 100124, China
- Hui Huang
- Gansu Academy of Eco-environmental Sciences, Lanzhou 730030, China
- Yujun Wang
- Yongdeng County Bureau of Industry and Information Technology, Lanzhou 730300, China
- Zijie Xiao
- School of Civil and Environmental Engineering, Harbin Institute of Technology (Shenzhen), Shenzhen 518055, China; Department of Chemical Engineering, KU Leuven, 3001 Leuven, Belgium
- Kun Tian
- State Key Laboratory of Soil & Sustainable Agriculture, Institute of Soil Science, Chinese Academy of Sciences, Nanjing 210008, China
- Weihai Pang
- Key Laboratory of Yangtze River Water Environment, Ministry of Education, College of Environmental Science and Engineering, Tongji University, Shanghai 200092, China
6
Lu X, Sun L, Zhang Y, Du J, Wang G, Huang X, Li X, Wang X. Predicting Cd accumulation in crops and identifying nonlinear effects of multiple environmental factors based on machine learning models. THE SCIENCE OF THE TOTAL ENVIRONMENT 2024; 951:175787. [PMID: 39187091 DOI: 10.1016/j.scitotenv.2024.175787] [Received: 06/09/2024] [Revised: 08/21/2024] [Accepted: 08/23/2024] [Indexed: 08/28/2024]
Abstract
The traditional prediction of the Cd content in grains (Cdg) of crops primarily relies on multiple linear regression models based on soil Cd content (Cds) and pH, neglecting inter-factorial interactions and nonlinear causal links between external environmental factors and Cdg. In this study, a comprehensive index system of multi-type environmental factors, including soil properties, geology, climate, and anthropogenic activity, was constructed. Machine learning models (tree-based ensembles, support vector regression, and artificial neural networks) that predict the Cdg of rice and wheat from these environmental factor indexes were significantly more accurate than traditional linear regression models based on soil properties. Among them, the tree-based ensemble models XGBoost and random forest exhibited the highest accuracies for predicting Cdg of rice and wheat, with R² in the test dataset of 0.349 and 0.546, respectively. This study found that soil properties, including Cds, pH, and clay, have greater impacts on Cdg of rice and wheat, with combined contribution rates of 65.2 % and 29.7 %, respectively. Since wheat sampling areas are located in central and northern China, they are more constrained by precipitation and temperature than rice sampling areas in the south. Geologic and climate factors have a greater impact on Cdg of wheat, with a combined contribution rate of 49.9 %, which is higher than the corresponding rate of 20.9 % in rice. Furthermore, the Cdg of rice and wheat did not exhibit an absolute linear relationship with Cds, and excessively high Cds can reduce the bioconcentration factor of Cd accumulation in crops. Meanwhile, other environmental factors such as temperature, precipitation, and elevation have marginal effects on the increase of Cdg of crops. This study provides a novel framework to optimize traditional soil-plant transfer models and offers a step towards high-precision prediction of Cd content in crops.
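The bioconcentration factor mentioned in this abstract is conventionally the ratio of the concentration in the plant tissue to the concentration in the soil, here Cdg / Cds. The sketch below uses that conventional definition with hypothetical concentrations (not data from the study) to show how grain Cd can keep rising while the factor itself falls, which is the kind of nonlinearity the abstract describes.

```python
def bioconcentration_factor(cd_grain, cd_soil):
    """Ratio of grain Cd to soil Cd (both in the same units, e.g. mg/kg)."""
    if cd_soil <= 0:
        raise ValueError("soil Cd must be positive")
    return cd_grain / cd_soil

# Hypothetical (soil Cd, grain Cd) pairs in mg/kg, chosen only for illustration.
samples = [(0.5, 0.2), (1.5, 0.3)]
for cd_soil, cd_grain in samples:
    bcf = bioconcentration_factor(cd_grain, cd_soil)
    print(f"Cds={cd_soil} Cdg={cd_grain} BCF={bcf:.2f}")
```

In this made-up pair, tripling soil Cd raises grain Cd by only half, so the factor drops from 0.40 to 0.20; a linear model in Cds alone would not capture that saturation.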
Affiliation(s)
- Xiaosong Lu
- State Environmental Protection Key Laboratory of Soil Environmental Management and Pollution Control, Nanjing Institute of Environmental Sciences, Ministry of Ecology and Environment, Nanjing 210042, China
- Li Sun
- State Environmental Protection Key Laboratory of Soil Environmental Management and Pollution Control, Nanjing Institute of Environmental Sciences, Ministry of Ecology and Environment, Nanjing 210042, China
- Ya Zhang
- State Environmental Protection Key Laboratory of Soil Environmental Management and Pollution Control, Nanjing Institute of Environmental Sciences, Ministry of Ecology and Environment, Nanjing 210042, China
- Junyang Du
- State Environmental Protection Key Laboratory of Soil Environmental Management and Pollution Control, Nanjing Institute of Environmental Sciences, Ministry of Ecology and Environment, Nanjing 210042, China
- Guoqing Wang
- State Environmental Protection Key Laboratory of Soil Environmental Management and Pollution Control, Nanjing Institute of Environmental Sciences, Ministry of Ecology and Environment, Nanjing 210042, China.
- Xinghua Huang
- State Environmental Protection Key Laboratory of Soil Environmental Management and Pollution Control, Nanjing Institute of Environmental Sciences, Ministry of Ecology and Environment, Nanjing 210042, China; College of Environmental Science and Engineering, Yangzhou University, Yangzhou 225127, China
- Xuzhi Li
- State Environmental Protection Key Laboratory of Soil Environmental Management and Pollution Control, Nanjing Institute of Environmental Sciences, Ministry of Ecology and Environment, Nanjing 210042, China.
- Xiaozhi Wang
- College of Environmental Science and Engineering, Yangzhou University, Yangzhou 225127, China
7
Chen J, Wan J, Ye G, Wang Y. Prediction and optimization of wastewater treatment process effluent chemical oxygen demand and energy consumption based on typical ensemble learning models. BIORESOURCE TECHNOLOGY 2024; 411:131362. [PMID: 39197664 DOI: 10.1016/j.biortech.2024.131362] [Received: 07/08/2024] [Revised: 08/14/2024] [Accepted: 08/25/2024] [Indexed: 09/01/2024]
Abstract
Pollution integration and carbon reduction has become a primary focus in wastewater treatment processes. In this study, water quality and control indicators were used as input features and the dataset was extended using the moving average method. Random Forest, eXtreme Gradient Boosting (XGBoost), and Light Gradient Boosting Machine algorithms were used to predict the effluent chemical oxygen demand (COD) and total energy consumption (TEC). The results indicated that the model prediction performance could be effectively improved when the data were amplified by two times and that the XGBoost model exhibited the best prediction performance for effluent COD and TEC. The Non-dominated Sorting Genetic Algorithm II model was employed for the multi-objective optimization of effluent COD and TEC, resulting in reductions of 15% and 18%, respectively. The ensemble learning model proposed in this study to achieve synergy between water quality improvement and energy saving is practical.
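The abstract does not specify how the moving average method was used to extend the dataset, so the following is only one plausible reading: rows of a time-ordered table are averaged over a sliding window and the smoothed rows are appended to the originals, which roughly doubles the data. The function names, the window size, and the sample values are assumptions.

```python
def moving_average_rows(rows, window):
    """Average each column over a sliding window of consecutive rows."""
    out = []
    for start in range(len(rows) - window + 1):
        chunk = rows[start:start + window]
        out.append([sum(col) / window for col in zip(*chunk)])
    return out

def augment(rows, window=2):
    """Return the original rows followed by their moving-average rows."""
    return [list(r) for r in rows] + moving_average_rows(rows, window)

# Hypothetical time-ordered records: [influent COD, aeration energy].
records = [[1.0, 10.0], [3.0, 20.0], [5.0, 30.0]]
print(augment(records))
```

With n rows and a window of 2, this adds n − 1 synthetic rows. The smoothed rows are not independent of the originals, so any train/test split should be made before augmentation to avoid leakage.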
Affiliation(s)
- Jian Chen
- College of Environment and Energy, South China University of Technology, Guangzhou 510006, China
- Jinquan Wan
- College of Environment and Energy, South China University of Technology, Guangzhou 510006, China.
- Gang Ye
- General Water of China Co., Ltd., Xiangtan 411100, China
- Yan Wang
- College of Environment and Energy, South China University of Technology, Guangzhou 510006, China
8
Li S, Shen Y, Gao M, Song H, Ge Z, Zhang Q, Xu J, Wang Y, Sun H. Machine Learning Models for Predicting Bioavailability of Traditional and Emerging Aromatic Contaminants in Plant Roots. TOXICS 2024; 12:737. [PMID: 39453157 PMCID: PMC11511036 DOI: 10.3390/toxics12100737] [Received: 09/10/2024] [Revised: 10/08/2024] [Accepted: 10/10/2024] [Indexed: 10/26/2024]
Abstract
To predict the behavior of aromatic contaminants (ACs) in complex soil-plant systems, this study developed machine learning (ML) models to estimate the root concentration factor (RCF) of both traditional (e.g., polycyclic aromatic hydrocarbons, polychlorinated biphenyls) and emerging ACs (e.g., phthalate acid esters, aryl organophosphate esters). Four ML algorithms were employed, trained on a unified RCF dataset comprising 878 data points, covering 6 features of soil-plant cultivation systems and 98 molecular descriptors of 55 chemicals, including 29 emerging ACs. The gradient-boosted regression tree (GBRT) model demonstrated strong predictive performance, with a coefficient of determination (R²) of 0.75, a mean absolute error (MAE) of 0.11, and a root mean square error (RMSE) of 0.22, as validated by five-fold cross-validation. Multiple explanatory analyses highlighted the significance of soil organic matter (SOM), plant protein and lipid content, exposure time, and molecular descriptors related to electronegativity distribution pattern (GATS8e) and double-ring structure (fr_bicyclic). An increase in SOM was found to decrease the overall RCF, while other variables showed strong correlations within specific ranges. This GBRT model provides an important tool for assessing the environmental behaviors of ACs in soil-plant systems, thereby supporting further investigations into their ecological and human exposure risks.
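The three metrics quoted in this abstract follow standard definitions, and five-fold cross-validation simply evaluates them on five disjoint held-out subsets. The sketch below is a plain-Python illustration under those standard definitions; the helper names and toy numbers are hypothetical, and the contiguous fold split is a simplification (in practice the indices would usually be shuffled first).

```python
import math

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def k_fold_indices(n, k=5):
    """Split indices 0..n-1 into k contiguous folds; return a list of (train, test) index lists."""
    folds, start = [], 0
    for i in range(k):
        size = n // k + (1 if i < n % k else 0)  # spread the remainder over the first folds
        test = list(range(start, start + size))
        train = [j for j in range(n) if j < start or j >= start + size]
        folds.append((train, test))
        start += size
    return folds

# Toy observed vs. predicted values, for illustration only.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 3.0, 3.5]
print(r2_score(y_true, y_pred), mae(y_true, y_pred), rmse(y_true, y_pred))

# Fold sizes for a dataset of 878 points, the size reported in the abstract.
print([len(test) for _, test in k_fold_indices(878, k=5)])
```

Each data point appears in exactly one test fold, so the reported metrics reflect predictions on data the model did not see during that round of training.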
Affiliation(s)
- Yu Wang
- MOE Key Laboratory of Pollution Processes and Environmental Criteria, College of Environmental Science and Engineering, Nankai University, Tianjin 300350, China; (S.L.); (Y.S.); (M.G.); (H.S.); (Z.G.); (Q.Z.); (J.X.)
- Hongwen Sun
- MOE Key Laboratory of Pollution Processes and Environmental Criteria, College of Environmental Science and Engineering, Nankai University, Tianjin 300350, China; (S.L.); (Y.S.); (M.G.); (H.S.); (Z.G.); (Q.Z.); (J.X.)
9
Bi Z, Sun J, Xie Y, Gu Y, Zhang H, Zheng B, Ou R, Liu G, Li L, Peng X, Gao X, Wei N. Machine learning-driven source identification and ecological risk prediction of heavy metal pollution in cultivated soils. JOURNAL OF HAZARDOUS MATERIALS 2024; 476:135109. [PMID: 38972204 DOI: 10.1016/j.jhazmat.2024.135109] [Received: 04/01/2024] [Revised: 06/07/2024] [Accepted: 07/04/2024] [Indexed: 07/09/2024]
Abstract
To overcome challenges in assessing the impact of environmental factors on heavy metal accumulation in soil due to limited comprehensive data, our study in Yangxin County, Hubei Province, China, analyzed 577 soil samples in combination with extensive big data. We used machine learning techniques, the potential ecological risk index, and the bivariate local Moran's index (BLMI) to predict Cr, Pb, Cd, As, and Hg concentrations in cultivated soil to assess ecological risks and identify pollution sources. The random forest model was selected for its superior performance among various machine learning models, and results indicated that heavy metal accumulation was substantially influenced by environmental factors such as climate, elevation, industrial activities, soil properties, railways, and population. Our ecological risk assessment highlighted areas of concern, where Cd and Hg were identified as the primary threats. BLMI was used to analyze spatial clustering and autocorrelation patterns between ecological risk and environmental factors, pinpointing areas that require targeted interventions. Additionally, redundancy analysis revealed the dynamics of heavy metal transfer to crops. This detailed approach mapped the spatial distribution of heavy metals, highlighted the ecological risks, identified their sources, and provided essential data for effective land management and pollution mitigation.
Affiliation(s)
- Zihan Bi
- Key Laboratory of the Three Gorges Reservoir Region's Eco-Environment, Ministry of Education, College of Environment and Ecology, Chongqing University, Chongqing 400045, China
- Jian Sun
- Key Laboratory of the Three Gorges Reservoir Region's Eco-Environment, Ministry of Education, College of Environment and Ecology, Chongqing University, Chongqing 400045, China; School of Public Policy and Administration, Chongqing University, Chongqing 400045, China
- Yutong Xie
- Key Laboratory of the Three Gorges Reservoir Region's Eco-Environment, Ministry of Education, College of Environment and Ecology, Chongqing University, Chongqing 400045, China
- Yilu Gu
- Key Laboratory of the Three Gorges Reservoir Region's Eco-Environment, Ministry of Education, College of Environment and Ecology, Chongqing University, Chongqing 400045, China
- Hongzhen Zhang
- Center for Soil Protection and Landscape Design, Chinese Academy of Environmental Planning, Beijing 100041, China
- Bowen Zheng
- School of Engineering, Hong Kong University of Science and Technology, Clear water bay, Sai Kung, New Territories, Hong Kong 999077, China
- Rongtao Ou
- Key Laboratory of the Three Gorges Reservoir Region's Eco-Environment, Ministry of Education, College of Environment and Ecology, Chongqing University, Chongqing 400045, China
- Gaoyuan Liu
- Key Laboratory of the Three Gorges Reservoir Region's Eco-Environment, Ministry of Education, College of Environment and Ecology, Chongqing University, Chongqing 400045, China
- Lei Li
- Key Laboratory of the Three Gorges Reservoir Region's Eco-Environment, Ministry of Education, College of Environment and Ecology, Chongqing University, Chongqing 400045, China
- Xuya Peng
- Key Laboratory of the Three Gorges Reservoir Region's Eco-Environment, Ministry of Education, College of Environment and Ecology, Chongqing University, Chongqing 400045, China
- Xiaofeng Gao
- Key Laboratory of the Three Gorges Reservoir Region's Eco-Environment, Ministry of Education, College of Environment and Ecology, Chongqing University, Chongqing 400045, China.
- Nan Wei
- Center for Soil Protection and Landscape Design, Chinese Academy of Environmental Planning, Beijing 100041, China.
10
Bai B, Wang L, Guan F, Cui Y, Bao M, Gong S. Prediction models for bioavailability of Cu and Zn during composting: Insights into machine learning. JOURNAL OF HAZARDOUS MATERIALS 2024; 471:134392. [PMID: 38669932 DOI: 10.1016/j.jhazmat.2024.134392] [Received: 02/12/2024] [Revised: 04/18/2024] [Accepted: 04/21/2024] [Indexed: 04/28/2024]
Abstract
Bioavailability assessment of heavy metals in compost products is crucial for evaluating associated environmental risks. However, existing experimental methods are time-consuming and inefficient. The machine learning (ML) method has demonstrated excellent performance in predicting heavy metal fractions. In this study, based on the conventional physicochemical properties of 260 compost samples, including compost time, temperature, electrical conductivity (EC), pH, organic matter (OM), total phosphorus (TP), total nitrogen, and total heavy metal contents, back propagation neural network, gradient boosting regression, and random forest (RF) models were used to predict the dynamic changes in bioavailable fractions of Cu and Zn during composting. All three models could be used for effective prediction of the variation trend in bioavailable fractions of Cu and Zn; the RF model showed the best prediction performance, with the prediction level higher than that reported in related studies. Although the key factors affecting changes among fractions were different, OM, EC, and TP were important for the accurate prediction of bioavailable fractions of Cu and Zn. This study provides simple and efficient ML models for predicting bioavailable fractions of Cu and Zn during composting, and offers a rapid evaluation method for the safe application of compost products.
Affiliation(s)
- Bing Bai
- State Key Laboratory of Black Soils Conservation and Utilization, Northeast Institute of Geography and Agroecology, Chinese Academy of Sciences, Changchun 130102, China; University of Chinese Academy of Sciences, Beijing 101408, China
- Lixia Wang
- State Key Laboratory of Black Soils Conservation and Utilization, Northeast Institute of Geography and Agroecology, Chinese Academy of Sciences, Changchun 130102, China.
- Fachun Guan
- Jilin Academy of Agricultural Sciences, Changchun 130033, China
- Yanru Cui
- Jilin Academy of Agricultural Sciences, Changchun 130033, China
- Meiwen Bao
- State Key Laboratory of Black Soils Conservation and Utilization, Northeast Institute of Geography and Agroecology, Chinese Academy of Sciences, Changchun 130102, China; University of Chinese Academy of Sciences, Beijing 101408, China
- Shuxin Gong
- State Key Laboratory of Black Soils Conservation and Utilization, Northeast Institute of Geography and Agroecology, Chinese Academy of Sciences, Changchun 130102, China