1
Sun R, Li Y, Kang Y, Xu X, Zhu J, Fu H, Zhang Y, Lin J, Liu Y. Interpretable machine learning models to predict decline in intrinsic capacity among older adults in China: a prospective cohort study. Maturitas 2025; 198:108594. [PMID: 40344939 DOI: 10.1016/j.maturitas.2025.108594] [Received: 03/06/2025] [Revised: 04/21/2025] [Accepted: 05/06/2025] [Indexed: 05/11/2025]
Abstract
BACKGROUND Monitoring intrinsic capacity and implementing appropriate interventions can support healthy aging. There are, though, few tools available for predicting decline in intrinsic capacity among older adults. This study aimed to develop and validate an interpretable machine learning model designed to identify populations at elevated risk of a decline in intrinsic capacity. METHODS Using data from the China Health and Retirement Longitudinal Study baseline (2011) and 4-year follow-up (2015), a total of 822 participants were randomly allocated to a training set and a testing set at a 7:3 ratio. Five machine learning methods were employed to train the model and assess its performance through various metrics. The SHapley Additive exPlanation method was subsequently used to interpret the optimal model. RESULTS The 4-year incidence of decline in intrinsic capacity among the older adults in the sample was 44.6 % (n = 367). Nine variables were screened for model construction, among which eXtreme gradient boosting demonstrated the best predictive performance, achieving an area under the receiver operating characteristic curve (AUC) of 0.715 (95 % CI 0.651-0.780) in the testing set. The SHapley Additive exPlanation method identified educational level, smoking, handgrip strength, self-rated health, and residence as the top five significant predictors. CONCLUSIONS The developed model can serve as a highly effective tool for primary care teams to identify older adults with early signs of decline in intrinsic capacity, facilitating the provision of subsequent screening and tailored interventions for intrinsic capacity.
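The 7:3 random allocation and the AUC metric mentioned in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' pipeline: the function names `split_7_3` and `roc_auc`, the seed, the rounding rule for the split, and the toy labels and scores are all assumptions made for illustration. Only the sample size (822) and the number of cases (367) come from the abstract.

```python
import random


def split_7_3(ids, seed=0):
    """Shuffle participant IDs and split them 70/30 into train and test sets.

    Assumption: the training size is floor(0.7 * n); the abstract does not
    state how the split was rounded.
    """
    rng = random.Random(seed)
    shuffled = list(ids)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * 0.7)
    return shuffled[:cut], shuffled[cut:]


def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required to compute AUC")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Sample size and case count taken from the abstract.
train_ids, test_ids = split_7_3(range(822))
incidence = 367 / 822

# Hypothetical labels and predicted risks, for illustration only.
toy_auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

With these assumptions the split yields 575 training and 247 testing participants, and 367/822 reproduces the reported 44.6 % incidence.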
Affiliation(s)
- Runjie Sun
  - School of Nursing School of Public Health, Yangzhou University, Yangzhou 225009, China
- Yijing Li
  - School of Nursing School of Public Health, Yangzhou University, Yangzhou 225009, China
- Yanru Kang
  - School of Nursing School of Public Health, Yangzhou University, Yangzhou 225009, China
- Xinqi Xu
  - School of Nursing School of Public Health, Yangzhou University, Yangzhou 225009, China
- Jie Zhu
  - School of Nursing School of Public Health, Yangzhou University, Yangzhou 225009, China
- Haiyan Fu
  - School of Nursing School of Public Health, Yangzhou University, Yangzhou 225009, China
- Yining Zhang
  - School of Nursing School of Public Health, Yangzhou University, Yangzhou 225009, China
- Jingwen Lin
  - School of Nursing School of Public Health, Yangzhou University, Yangzhou 225009, China
- Yongbing Liu
  - School of Nursing School of Public Health, Yangzhou University, Yangzhou 225009, China
2
Zhao A, Bai H, Bao X, Liao K, Ren H, Hu H. Model-driven high-throughput zebrafish embryo assay for evaluating whole effluent toxicity variation across 100 full-scale wastewater treatment plants. WATER RESEARCH 2025; 281:123675. [PMID: 40273605 DOI: 10.1016/j.watres.2025.123675] [Received: 01/25/2025] [Revised: 03/26/2025] [Accepted: 04/17/2025] [Indexed: 04/26/2025]
Abstract
The zebrafish embryo is a valuable model for evaluating whole effluent toxicity (WET). However, the widely recognized acute toxicity indicator, based on International Organization for Standardization (ISO) methods, requires large numbers of embryos and is often time-consuming due to its complex experimental procedures. In this study, we propose an alternative to the conventional reliance on ISO standards by developing a model-driven high-throughput assay that utilizes actual wastewater, enabling rapid LC10 (the lethal concentration at which 10% of the test organisms are affected) prediction through machine learning techniques and multidimensional indicators derived from streamlined experimental procedures. We compared three streamlined toxicity assays (developmental toxicity, behavioral toxicity, and vascular toxicity) along with five different models. Among these, the Lasso model based on behavioral toxicity emerged as the most effective, achieving an R² value of 0.893 while reducing experimental time by 5- to 8-fold. Furthermore, fivefold cross-validation confirmed its robust predictive accuracy. The application of this model-driven high-throughput assay across 100 wastewater treatment plants in China highlights the crucial role of biological treatment, particularly aerobic processes and secondary sedimentation, in reducing toxicity, thereby providing valuable insights into their functions. This high-throughput assay not only surpasses the ISO standard method in efficiency but also substantially decreases embryo usage, facilitating rapid WET assessments of actual wastewater with larger sample sizes.
Affiliation(s)
- Aixia Zhao
  - State Key Laboratory of Pollution Control and Resource Reuse, School of Environment, Nanjing University, Nanjing 210023, Jiangsu, PR China
- Hongwei Bai
  - State Key Laboratory of Pollution Control and Resource Reuse, School of Environment, Nanjing University, Nanjing 210023, Jiangsu, PR China
- Xingchen Bao
  - State Key Laboratory of Pollution Control and Resource Reuse, School of Environment, Nanjing University, Nanjing 210023, Jiangsu, PR China
- Kewei Liao
  - State Key Laboratory of Pollution Control and Resource Reuse, School of Environment, Nanjing University, Nanjing 210023, Jiangsu, PR China
- Hongqiang Ren
  - State Key Laboratory of Pollution Control and Resource Reuse, School of Environment, Nanjing University, Nanjing 210023, Jiangsu, PR China
- Haidong Hu
  - State Key Laboratory of Pollution Control and Resource Reuse, School of Environment, Nanjing University, Nanjing 210023, Jiangsu, PR China
3
Wang H, Li Y, Xuan X, Wang K, Yao YF, Pan L. Machine Learning Accelerated Discovery of Covalent Organic Frameworks for Environmental and Energy Applications. ENVIRONMENTAL SCIENCE & TECHNOLOGY 2025; 59:6361-6378. [PMID: 40159087 DOI: 10.1021/acs.est.5c00390] [Indexed: 04/02/2025]
Abstract
Covalent organic frameworks (COFs) are porous crystalline materials obtained by linking organic ligands covalently. Their high surface area and adjustable pore sizes make them ideal for a range of applications, including CO2 capture, CH4 storage, gas separation, catalysis, etc. Traditional methods of material research, which mainly rely on manual experimentation, are not particularly efficient, while with advancements in computer science, high-throughput computational screening methods based on molecular simulation have become crucial in material discovery, yet they face limitations in terms of computational resources and time. Currently, machine learning (ML) has emerged as a transformative tool in many fields, capable of analyzing large data sets, identifying underlying patterns, and predicting material performance efficiently and accurately. This approach, termed "materials genomics", combines high-throughput computational screening with ML to predict and design high-performance materials, significantly speeding up the discovery process compared to traditional methods. This review discusses the functions of ML in the screening, design, and performance prediction of COFs and highlights their applications across various domains like CO2 capture, CH4 storage, gas separation, and catalysis, thereby providing new research directions and enhancing the understanding of COF materials and their applications.
Affiliation(s)
- Hao Wang
  - Shanghai Key Laboratory of Magnetic Resonance, School of Physics and Electronic Science, East China Normal University, Shanghai 200241, China
  - Institute of Magnetic Resonance and Molecular Imaging in Medicine, East China Normal University, Shanghai 200241, China
- Yuquan Li
  - College of Environmental Science and Engineering, Yangzhou University, Yangzhou, Jiangsu 225127, China
- Xiaoyang Xuan
  - College of Chemistry and Chemical Engineering, Taishan University, Taian, Shandong 271000, China
- Kai Wang
  - Inner Mongolia Key Laboratory of Environmental Chemistry, College of Chemistry and Environmental Science, Inner Mongolia Normal University, Hohhot 010022, China
- Ye-Feng Yao
  - Shanghai Key Laboratory of Magnetic Resonance, School of Physics and Electronic Science, East China Normal University, Shanghai 200241, China
  - Institute of Magnetic Resonance and Molecular Imaging in Medicine, East China Normal University, Shanghai 200241, China
- Likun Pan
  - Shanghai Key Laboratory of Magnetic Resonance, School of Physics and Electronic Science, East China Normal University, Shanghai 200241, China
  - Institute of Magnetic Resonance and Molecular Imaging in Medicine, East China Normal University, Shanghai 200241, China
4
Yin WX, Chen KH, Lv JQ, Chen JJ, Liu S, Song YP, Zhao YW, Huang F, Bao HX, Wang HC, Wang AJ, Ren NQ. Deciphering and Mitigating of Dynamic Greenhouse Gas Emission in Urban Drainage Systems with Knowledge-Infused Graph Neural Network. ENVIRONMENTAL SCIENCE & TECHNOLOGY 2025; 59:3592-3602. [PMID: 39936390 DOI: 10.1021/acs.est.4c10644] [Indexed: 02/13/2025]
Abstract
Deciphering and mitigating dynamic greenhouse gas (GHG) emissions under environmental fluctuation in urban drainage systems (UDGSs) is challenging due to the absence of a high-accuracy prediction model that quantifies the contributions of biological production pathways. Here we infused biological production pathways into the graph neural network (GNN) model architecture, developing ecological knowledge-infused GNN (EcoGNN-GHG) models to evaluate methane (CH4) and nitrous oxide (N2O) production in sewers and wastewater treatment plants (WWTPs). The EcoGNN-GHG model demonstrated high predictive accuracy, achieving an R² of 0.96 for CH4 in sewers and 0.82 for N2O in WWTPs. Model interpretability analysis revealed fluctuations in contributions of the anaerobic hydrolysis acidification pathway to CH4 production and the nitrification-denitrification pathway to N2O production under dynamic environmental conditions, guiding the formulation of a precise dissolved oxygen (DO) control strategy targeting critical water quality parameters (acetate for CH4 production and nitrite for N2O production). Implementing this strategy to control DO, and thereby regulate biological production pathway contributions, reduced CH4 production in sewers and N2O production in WWTPs by 35.50% and 29.94%, respectively. Our findings offer a robust, accurate method for predicting GHG emissions, quantifying production pathway contributions, and developing effective control strategies in UDGSs.
Affiliation(s)
- Wan-Xin Yin
  - College of the Environment, Liaoning University, Shenyang 110036, China
- Ke-Hua Chen
  - Division of Emerging Interdisciplinary Areas (EMIA), Academy of Interdisciplinary Studies, The Hong Kong University of Science and Technology, Hong Kong 999077, China
- Jia-Qiang Lv
  - State Key Laboratory of Urban Water Resource and Environment, School of Eco-Environment, Harbin Institute of Technology, Shenzhen 518055, China
- Jia-Ji Chen
  - CAS Key Laboratory of Environmental Biotechnology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, China
- Shuai Liu
  - State Key Laboratory of Urban Water Resource and Environment, School of Eco-Environment, Harbin Institute of Technology, Shenzhen 518055, China
- Yun-Peng Song
  - State Key Laboratory of Urban Water Resource and Environment, School of Eco-Environment, Harbin Institute of Technology, Shenzhen 518055, China
- Yi-Wei Zhao
  - State Key Laboratory of Urban Water Resource and Environment, School of Eco-Environment, Harbin Institute of Technology, Shenzhen 518055, China
- Fang Huang
  - State Key Laboratory of Urban Water Resource and Environment, School of Eco-Environment, Harbin Institute of Technology, Shenzhen 518055, China
- Hong-Xu Bao
  - College of the Environment, Liaoning University, Shenyang 110036, China
- Hong-Cheng Wang
  - State Key Laboratory of Urban Water Resource and Environment, School of Eco-Environment, Harbin Institute of Technology, Shenzhen 518055, China
- Ai-Jie Wang
  - State Key Laboratory of Urban Water Resource and Environment, School of Eco-Environment, Harbin Institute of Technology, Shenzhen 518055, China
- Nan-Qi Ren
  - State Key Laboratory of Urban Water Resource and Environment, School of Eco-Environment, Harbin Institute of Technology, Shenzhen 518055, China
5
Wang H, Liu W, Chen J, Ji S. Transfer Learning with a Graph Attention Network and Weighted Loss Function for Screening of Persistent, Bioaccumulative, Mobile, and Toxic Chemicals. ENVIRONMENTAL SCIENCE & TECHNOLOGY 2025; 59:578-590. [PMID: 39680085 DOI: 10.1021/acs.est.4c11085] [Indexed: 12/17/2024]
Abstract
In silico methods for screening hazardous chemicals are necessary for sound management. Persistent, bioaccumulative, mobile, and toxic (PBMT) chemicals persist in the environment and have high mobility in aquatic environments, posing risks to human and ecological health. However, the lack of experimental data for the vast number of chemicals hinders identification of PBMT chemicals. Through an extensive search of measured chemical mobility data, as well as persistent, bioaccumulative, and toxic (PBT) chemical inventories, this study constructed comprehensive data sets on PBMT chemicals. To address the limited volume of the PBMT chemical data set, a transfer learning (TL) framework based on graph attention network (GAT) architecture was developed to construct models for screening PBMT chemicals, designating the PBT chemical inventories as source domains and the PBMT chemical data set as the target domain. A weighted loss (LW) function was proposed and shown to mitigate the negative impact of imbalanced data on model performance. Results indicate that the TL-GAT models outperformed GAT models while offering large coverage of applicability domains and interpretability. The constructed models were employed to identify PBMT chemicals from inventories consisting of about 1 × 10⁶ chemicals. The developed TL-GAT framework with the LW function holds broad applicability across diverse tasks, especially those involving small and imbalanced data sets.
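To make the idea of a weighted loss for imbalanced data concrete, here is a minimal sketch of a class-weighted binary cross-entropy. It is a generic textbook formulation, not the LW function proposed in the paper, whose exact form the abstract does not give. The inverse-frequency weighting rule, the function names, and the toy labels are assumptions.

```python
import math


def class_weights(labels):
    """Inverse-frequency weights: n / (2 * count of that class).

    Assumption: a common heuristic, not the weighting used in the paper.
    """
    n = len(labels)
    n_pos = sum(labels)
    n_neg = n - n_pos
    return {1: n / (2 * n_pos), 0: n / (2 * n_neg)}


def weighted_bce(labels, probs, weights, eps=1e-12):
    """Weighted mean binary cross-entropy over (label, probability) pairs."""
    total = 0.0
    weight_sum = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        loss = -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
        total += weights[y] * loss
        weight_sum += weights[y]
    return total / weight_sum


# Hypothetical imbalanced data set: 1 positive, 9 negatives.
labels = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
# A model that scores every chemical as unlikely to be positive.
probs = [0.1] * 10

weights = class_weights(labels)
unweighted = weighted_bce(labels, probs, {0: 1.0, 1: 1.0})
weighted = weighted_bce(labels, probs, weights)
```

In this toy case the weighted loss is larger than the unweighted one, because the single missed positive counts as much as all nine negatives combined. That is the general mechanism by which class weighting discourages a model from ignoring the minority class.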
Affiliation(s)
- Haobo Wang
  - Key Laboratory of Industrial Ecology and Environmental Engineering (Ministry of Education), Dalian Key Laboratory on Chemicals Risk Control and Pollution Prevention Technology, School of Environmental Science and Technology, Dalian University of Technology, Dalian 116024, China
- Wenjia Liu
  - Key Laboratory of Industrial Ecology and Environmental Engineering (Ministry of Education), Dalian Key Laboratory on Chemicals Risk Control and Pollution Prevention Technology, School of Environmental Science and Technology, Dalian University of Technology, Dalian 116024, China
- Jingwen Chen
  - Key Laboratory of Industrial Ecology and Environmental Engineering (Ministry of Education), Dalian Key Laboratory on Chemicals Risk Control and Pollution Prevention Technology, School of Environmental Science and Technology, Dalian University of Technology, Dalian 116024, China
- Shengshe Ji
  - Key Laboratory of Industrial Ecology and Environmental Engineering (Ministry of Education), Dalian Key Laboratory on Chemicals Risk Control and Pollution Prevention Technology, School of Environmental Science and Technology, Dalian University of Technology, Dalian 116024, China
6
Liu W, Chen J, Wang H, Fu Z, Peijnenburg WJGM, Hong H. Perspectives on Advancing Multimodal Learning in Environmental Science and Engineering Studies. ENVIRONMENTAL SCIENCE & TECHNOLOGY 2024. [PMID: 39226136 DOI: 10.1021/acs.est.4c03088] [Indexed: 09/05/2024]
Abstract
The environment faces increasing anthropogenic impacts, resulting in a rapid increase in environmental issues that undermine the natural capital essential for human wellbeing. These issues are complex and often influenced by various factors represented by data with different modalities. While machine learning (ML) provides data-driven tools for addressing the environmental issues, the current ML models in environmental science and engineering (ES&E) often neglect the utilization of multimodal data. With the advancement in deep learning, multimodal learning (MML) holds promise for comprehensive descriptions of the environmental issues by harnessing data from diverse modalities. This advancement has the potential to significantly elevate the accuracy and robustness of prediction models in ES&E studies, providing enhanced solutions for various environmental modeling tasks. This perspective summarizes MML methodologies and proposes potential applications of MML models in ES&E studies, including environmental quality assessment, prediction of chemical hazards, and optimization of pollution control techniques. Additionally, we discuss the challenges associated with implementing MML in ES&E and propose future research directions in this domain.
Affiliation(s)
- Wenjia Liu
  - Key Laboratory of Industrial Ecology and Environmental Engineering (Ministry of Education), Dalian Key Laboratory on Chemicals Risk Control and Pollution Prevention Technology, School of Environmental Science and Technology, Dalian University of Technology, Dalian 116024, China
- Jingwen Chen
  - Key Laboratory of Industrial Ecology and Environmental Engineering (Ministry of Education), Dalian Key Laboratory on Chemicals Risk Control and Pollution Prevention Technology, School of Environmental Science and Technology, Dalian University of Technology, Dalian 116024, China
- Haobo Wang
  - Key Laboratory of Industrial Ecology and Environmental Engineering (Ministry of Education), Dalian Key Laboratory on Chemicals Risk Control and Pollution Prevention Technology, School of Environmental Science and Technology, Dalian University of Technology, Dalian 116024, China
- Zhiqiang Fu
  - Key Laboratory of Industrial Ecology and Environmental Engineering (Ministry of Education), Dalian Key Laboratory on Chemicals Risk Control and Pollution Prevention Technology, School of Environmental Science and Technology, Dalian University of Technology, Dalian 116024, China
- Willie J G M Peijnenburg
  - Institute of Environmental Sciences (CML), Leiden University, Leiden 2300 RA, The Netherlands
  - Centre for Safety of Substances and Products, National Institute of Public Health and the Environment (RIVM), Bilthoven 3720 BA, The Netherlands
- Huixiao Hong
  - Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, Arkansas 72079, United States
7
Jiang J, Xiang X, Zhou Q, Zhou L, Bi X, Khanal SK, Wang Z, Chen G, Guo G. Optimization of a Novel Engineered Ecosystem Integrating Carbon, Nitrogen, Phosphorus, and Sulfur Biotransformation for Saline Wastewater Treatment Using an Interpretable Machine Learning Approach. ENVIRONMENTAL SCIENCE & TECHNOLOGY 2024; 58:12989-12999. [PMID: 38982970 DOI: 10.1021/acs.est.4c03160] [Indexed: 07/11/2024]
Abstract
The denitrifying sulfur (S) conversion-associated enhanced biological phosphorus removal (DS-EBPR) process for treating saline wastewater is characterized by its unique microbial ecology that integrates carbon (C), nitrogen (N), phosphorus (P), and S biotransformation. However, operational instability arises due to the numerous parameters and intricate bacterial interactions. This study introduces a two-stage interpretable machine learning approach to predict S conversion-driven P removal efficiency and optimize the DS-EBPR process. Stage one utilized the XGBoost regression model, achieving an R² value of 0.948 for predicting sulfate reduction (SR) intensity from anaerobic parameters with feature engineering. Stage two involved the CatBoost classification and regression models integrating anoxic parameters with the predicted SR values for predicting P removal, reaching an accuracy of 94% and an R² value of 0.93, respectively. This study identified key environmental factors, including SR intensity (20-45 mg S/L), influent P concentration (<9.0 mg P/L), mixed liquor volatile suspended solids (MLVSS)/mixed liquor suspended solids (MLSS) ratio (0.55-0.72), influent C/S ratio (0.5-1.0), anoxic reaction time (5-6 h), and MLSS concentration (>6.50 g/L). A user-friendly graphic interface was developed to facilitate easier optimization and control. This approach streamlines the determination of optimal conditions for enhancing P removal in the DS-EBPR process.
Affiliation(s)
- Jinqi Jiang
  - Hubei Key Laboratory of Multi-media Pollution Cooperative Control in Yangtze Basin, School of Environmental Science & Engineering, Huazhong University of Science and Technology (HUST), 1037 Luoyu Road, Wuhan, Hubei 430074, China
  - School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, Wuhan 430074, China
- Xiang Xiang
  - School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, Wuhan 430074, China
- Qinhao Zhou
  - School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, Wuhan 430074, China
- Lichang Zhou
  - Hubei Key Laboratory of Multi-media Pollution Cooperative Control in Yangtze Basin, School of Environmental Science & Engineering, Huazhong University of Science and Technology (HUST), 1037 Luoyu Road, Wuhan, Hubei 430074, China
- Xinqi Bi
  - Hubei Key Laboratory of Multi-media Pollution Cooperative Control in Yangtze Basin, School of Environmental Science & Engineering, Huazhong University of Science and Technology (HUST), 1037 Luoyu Road, Wuhan, Hubei 430074, China
- Samir Kumar Khanal
  - Department of Molecular Biosciences and Bioengineering, University of Hawai'i at Mānoa, 1955 East-West Road, Honolulu, Hawaii 96822, United States
- Zongping Wang
  - Hubei Key Laboratory of Multi-media Pollution Cooperative Control in Yangtze Basin, School of Environmental Science & Engineering, Huazhong University of Science and Technology (HUST), 1037 Luoyu Road, Wuhan, Hubei 430074, China
- Guanghao Chen
  - Civil & Environmental Engineering and Hong Kong Branch of the Chinese National Engineering Research Center for Control & Treatment of Heavy Metal Pollution, The Hong Kong University of Science and Technology, Hong Kong 999077, PR China
- Gang Guo
  - Hubei Key Laboratory of Multi-media Pollution Cooperative Control in Yangtze Basin, School of Environmental Science & Engineering, Huazhong University of Science and Technology (HUST), 1037 Luoyu Road, Wuhan, Hubei 430074, China
8
Huang Y, Zhong S, Gan L, Chen Y. Development of Machine Learning Models for Ion-Selective Electrode Cation Sensor Design. ACS ES&T ENGINEERING 2024; 4:1702-1711. [PMID: 39021402 PMCID: PMC11250033 DOI: 10.1021/acsestengg.4c00087] [Received: 02/19/2024] [Revised: 03/15/2024] [Accepted: 03/15/2024] [Indexed: 07/20/2024]
Abstract
Polyvinyl chloride (PVC) membrane-based ion-selective electrode (ISE) sensors are common tools for water assessments, but their development relies on time-consuming and costly experimental investigations. To address this challenge, this study combines machine learning (ML), Morgan fingerprint, and Bayesian optimization technologies with experimental results to develop high-performance PVC-based ISE cation sensors. By using 1745 data sets collected from 20 years of literature, appropriate ML models are trained to enable accurate prediction and a deep understanding of the relationship between ISE components and sensor performance (R² = 0.75). Rapid ionophore screening is achieved using the Morgan fingerprint based on atomic groups derived from ML model interpretation. Bayesian optimization is then applied to identify optimal combinations of ISE materials with the potential to deliver desirable ISE sensor performance. Na⁺, Mg²⁺, and Al³⁺ sensors fabricated from Bayesian optimization results exhibit excellent Nernst slopes with less than 8.2% deviation from the ideal value and superb detection limits at the 10⁻⁷ M level based on experimental validation results. This approach can potentially transform sensor development into a more time-efficient, cost-effective, and rational design process, guided by ML-based techniques.
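The "ideal value" of the Nernst slope referred to in the abstract follows from the Nernst equation: slope = (R·T·ln 10)/(z·F) per decade of ion activity. The sketch below computes it for the three cations named in the abstract and the percentage deviation of a measured slope. The temperature of 25 °C, the function names, and the example reading of 55.0 mV/decade are assumptions for illustration. None of them is taken from the paper.

```python
import math

R = 8.314462618   # molar gas constant, J/(mol*K)
F = 96485.33212   # Faraday constant, C/mol


def ideal_nernst_slope_mv(charge, temp_k=298.15):
    """Ideal Nernst slope in mV per decade for an ion of the given charge.

    Assumption: 25 degrees C (298.15 K) unless another temperature is passed.
    """
    return 1000.0 * R * temp_k * math.log(10) / (charge * F)


def percent_deviation(measured_mv, charge, temp_k=298.15):
    """Absolute deviation of a measured slope from the ideal slope, in percent."""
    ideal = ideal_nernst_slope_mv(charge, temp_k)
    return 100.0 * abs(measured_mv - ideal) / ideal


# Cations named in the abstract with their charges.
ions = {"Na+": 1, "Mg2+": 2, "Al3+": 3}
slopes = {ion: ideal_nernst_slope_mv(z) for ion, z in ions.items()}

# Hypothetical measured slope for a Na+ sensor (not a value from the paper).
example_deviation = percent_deviation(55.0, charge=1)
```

Under these assumptions the ideal slopes are roughly 59.2, 29.6, and 19.7 mV/decade for Na⁺, Mg²⁺, and Al³⁺, so the hypothetical 55.0 mV/decade reading would fall within the 8.2% band quoted in the abstract.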
Affiliation(s)
- Yuankai Huang
  - School of Civil and Environmental Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
- Shifa Zhong
  - School of Civil and Environmental Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
  - Department of Environmental Science, School of Ecological and Environmental Sciences, East China Normal University, Shanghai 200241, China
- Lan Gan
  - School of Civil and Environmental Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
- Yongsheng Chen
  - School of Civil and Environmental Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
9
Qing S, Li C. Data-driven prediction on critical mechanical properties of engineered cementitious composites based on machine learning. Sci Rep 2024; 14:15322. [PMID: 38961183 PMCID: PMC11222503 DOI: 10.1038/s41598-024-66123-9] [Received: 05/03/2024] [Accepted: 06/27/2024] [Indexed: 07/05/2024] Open
Abstract
The present study introduces a novel approach utilizing machine learning techniques to predict the crucial mechanical properties of engineered cementitious composites (ECCs), spanning from typical to exceptionally high strength levels. These properties, including compressive strength, flexural strength, tensile strength, and tensile strain capacity, can not only be predicted but also precisely estimated. The investigation encompassed a meticulous compilation and examination of 1532 datasets sourced from pertinent research. Four machine learning algorithms, linear regression (LR), K nearest neighbors (KNN), random forest (RF), and extreme gradient boosting (XGB), were used to establish the prediction model of ECC mechanical properties and determine the optimal model. SHapley Additive exPlanations (SHAP) were then applied to the optimal model to scrutinize feature importance and conduct an in-depth parametric analysis. Subsequently, a comprehensive control strategy was devised for ECC mechanical properties. This strategy can provide actionable guidance for ECC design, equipping engineers and professionals in civil engineering and material science to make informed decisions throughout their design endeavors. The results show that the RF model demonstrated the highest prediction accuracy for compressive strength and flexural strength, with R² values of 0.92 and 0.91 on the test set, respectively. The XGB model performed best in predicting tensile strength and tensile strain capacity, with R² values of 0.87 and 0.80 on the test set, respectively. The prediction of tensile strain capacity was the least accurate. Meanwhile, the MAE of the tensile strain capacity was a mere 0.84%, smaller than the variability (1.77%) of the test results in previous research. Compressive strength and tensile strength demonstrated high sensitivity to variations in both water-cement ratio (W) and water reducer (WR). In contrast, flexural strength exhibited high sensitivity solely to changes in W. Conversely, the sensitivity of tensile strain capacity to input features was moderate and consistent. The mechanical attributes of ECC emerged from the combined effects of multiple positive and negative features. Notably, WR exerted the most significant influence on compressive strength among all features, whereas polyethylene (PE) fiber emerged as the primary driver affecting flexural strength, tensile strength, and tensile strain capacity.
Affiliation(s)
- Shuangquan Qing
  - Department of Civil Engineering, Changsha University of Science & Technology, Changsha, 410114, China
- Chuanxi Li
  - Department of Civil Engineering, Changsha University of Science & Technology, Changsha, 410114, China
  - State Key Laboratory of Featured Metal Materials and Life-Cycle Safety for Composite Structures, Nanning, 530004, China
10
Xie W, Yu Q, Fang W, Zhang X, Geng J, Tang J, Jing W, Liu M, Ma Z, Yang J, Bi J. Data-driven approaches linking wastewater and source estimation hazardous waste for environmental management. Nat Commun 2024; 15:5432. [PMID: 38926394 PMCID: PMC11208539 DOI: 10.1038/s41467-024-49817-6] [Received: 09/26/2023] [Accepted: 06/19/2024] [Indexed: 06/28/2024] Open
Abstract
Industrial enterprises are major sources of contaminants, making their regulation vital for sustainable development. Tracking contaminant generation at the firm-level is challenging due to enterprise heterogeneity and the lack of a universal estimation method. This study addresses the issue by focusing on hazardous waste (HW), which is difficult to monitor automatically. We developed a data-driven methodology to predict HW generation using wastewater big data which is grounded in the availability of this data with widespread application of automatic sensors and the logical assumption that a correlation exists between wastewater and HW generation. We created a generic framework that used representative variables from diverse sectors, exploited a data-balance algorithm to address long-tail data distribution, and incorporated causal discovery to screen features and improve computation efficiency. Our method was tested on 1024 enterprises across 10 sectors in Jiangsu, China, demonstrating high fidelity (R² = 0.87) in predicting HW generation with 4,260,593 daily wastewater data.
Affiliation(s)
- Wenjun Xie
- State Key Laboratory of Pollution Control and Resource Reuse, School of the Environment, Nanjing University, Nanjing, Jiangsu, China
| | - Qingyuan Yu
- State Key Laboratory of Pollution Control and Resource Reuse, School of the Environment, Nanjing University, Nanjing, Jiangsu, China
| | - Wen Fang
- State Key Laboratory of Pollution Control and Resource Reuse, School of the Environment, Nanjing University, Nanjing, Jiangsu, China.
| | - Xiaoge Zhang
- Department of Industrial and Systems Engineering, The Hong Kong Polytechnic University, Kowloon, Hong Kong
| | - Jinghua Geng
- State Key Laboratory of Pollution Control and Resource Reuse, School of the Environment, Nanjing University, Nanjing, Jiangsu, China
| | - Jiayi Tang
- State Key Laboratory of Pollution Control and Resource Reuse, School of the Environment, Nanjing University, Nanjing, Jiangsu, China
| | - Wenfei Jing
- State Key Laboratory of Pollution Control and Resource Reuse, School of the Environment, Nanjing University, Nanjing, Jiangsu, China
| | - Miaomiao Liu
- State Key Laboratory of Pollution Control and Resource Reuse, School of the Environment, Nanjing University, Nanjing, Jiangsu, China.
| | - Zongwei Ma
- State Key Laboratory of Pollution Control and Resource Reuse, School of the Environment, Nanjing University, Nanjing, Jiangsu, China
| | - Jianxun Yang
- State Key Laboratory of Pollution Control and Resource Reuse, School of the Environment, Nanjing University, Nanjing, Jiangsu, China
| | - Jun Bi
- State Key Laboratory of Pollution Control and Resource Reuse, School of the Environment, Nanjing University, Nanjing, Jiangsu, China.
11
Yang M, Zhu JJ, McGaughey AL, Priestley RD, Hoek EMV, Jassby D, Ren ZJ. Machine Learning for Polymer Design to Enhance Pervaporation-Based Organic Recovery. ENVIRONMENTAL SCIENCE & TECHNOLOGY 2024; 58:10128-10139. [PMID: 38743597 DOI: 10.1021/acs.est.4c00060] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/16/2024]
Abstract
Pervaporation (PV) is an effective membrane separation process for organic dehydration, recovery, and upgrading. However, it is crucial to improve membrane materials beyond the current permeability-selectivity trade-off. In this research, we introduce machine learning (ML) models to identify high-potential polymers, greatly improving the efficiency and reducing cost compared to conventional trial-and-error approach. We utilized the largest PV data set to date and incorporated polymer fingerprints and features, including membrane structure, operating conditions, and solute properties. Dimensionality reduction, missing data treatment, seed randomness, and data leakage management were employed to ensure model robustness. The optimized LightGBM models achieved RMSE of 0.447 and 0.360 for separation factor and total flux, respectively (logarithmic scale). Screening approximately 1 million hypothetical polymers with ML models resulted in identifying polymers with a predicted permeation separation index >30 and synthetic accessibility score <3.7 for acetic acid extraction. This study demonstrates the promise of ML to accelerate tailored membrane designs.
Affiliation(s)
- Meiqi Yang
- Department of Civil and Environmental Engineering, Princeton University, Princeton, New Jersey 08544, United States
- Andlinger Center for Energy and the Environment, Princeton University, Princeton, New Jersey 08544, United States
| | - Jun-Jie Zhu
- Department of Civil and Environmental Engineering, Princeton University, Princeton, New Jersey 08544, United States
- Andlinger Center for Energy and the Environment, Princeton University, Princeton, New Jersey 08544, United States
| | - Allyson L McGaughey
- Department of Civil and Environmental Engineering, Princeton University, Princeton, New Jersey 08544, United States
- Andlinger Center for Energy and the Environment, Princeton University, Princeton, New Jersey 08544, United States
- Department of Chemical and Biological Engineering, Princeton University, Princeton, New Jersey 08544, United States
| | - Rodney D Priestley
- Department of Chemical and Biological Engineering, Princeton University, Princeton, New Jersey 08544, United States
| | - Eric M V Hoek
- Department of Civil & Environmental Engineering, University of California Los Angeles, Los Angeles, California 90095, United States
| | - David Jassby
- Department of Civil & Environmental Engineering, University of California Los Angeles, Los Angeles, California 90095, United States
| | - Zhiyong Jason Ren
- Department of Civil and Environmental Engineering, Princeton University, Princeton, New Jersey 08544, United States
- Andlinger Center for Energy and the Environment, Princeton University, Princeton, New Jersey 08544, United States
12
Liu B, Xi F, Zhang H, Peng J, Sun L, Zhu X. Coupling machine learning and theoretical models to compare key properties of biochar in adsorption kinetics rate and maximum adsorption capacity for emerging contaminants. BIORESOURCE TECHNOLOGY 2024; 402:130776. [PMID: 38701979 DOI: 10.1016/j.biortech.2024.130776] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/04/2024] [Revised: 04/28/2024] [Accepted: 04/29/2024] [Indexed: 05/06/2024]
Abstract
Insight into the key properties of biochar that give a fast adsorption rate and high adsorption capacity is urgently needed to design biochar as an adsorbent for pollution emergency treatment. Machine learning (ML) incorporating classical theoretical adsorption models was applied to build prediction models for the adsorption kinetics rate (K) and maximum adsorption capacity (Qm) of emerging contaminants (ECs) on biochar. Results demonstrated that the prediction performance of the adaptive boosting algorithm improved significantly after data preprocessing (i.e., log-transformation) on the small unbalanced datasets, with R² of 0.865 and 0.874 for K and Qm, respectively. The surface chemistry of biochar, primarily driven by ash content, significantly influenced K, while the surface porous structure of biochar played a dominant role in predicting Qm. An interactive platform was deployed for relevant scientists to predict K and Qm of new biochar for ECs. The research provides practical references for future engineered biochar design for EC removal.
Affiliation(s)
- Bingyou Liu
- School of Environmental Science and Engineering, Sun Yat-sen University, Guangzhou 510275, China
| | - Feiyu Xi
- School of Environmental Science and Engineering, Sun Yat-sen University, Guangzhou 510275, China
| | - Huanjing Zhang
- School of Environmental Science and Engineering, Sun Yat-sen University, Guangzhou 510275, China
| | - Jiangtao Peng
- School of Environmental Science and Engineering, Sun Yat-sen University, Guangzhou 510275, China
| | - Lianpeng Sun
- School of Environmental Science and Engineering, Sun Yat-sen University, Guangzhou 510275, China; Guangdong Provincial Key Laboratory of Environmental Pollution Control and Remediation Technology, Sun Yat-sen University, Guangzhou 510275, China
| | - Xinzhe Zhu
- School of Environmental Science and Engineering, Sun Yat-sen University, Guangzhou 510275, China; Guangdong Provincial Key Laboratory of Environmental Pollution Control and Remediation Technology, Sun Yat-sen University, Guangzhou 510275, China.
13
Varga D. Critical Analysis of Data Leakage in WiFi CSI-Based Human Action Recognition Using CNNs. SENSORS (BASEL, SWITZERLAND) 2024; 24:3159. [PMID: 38794015 PMCID: PMC11124867 DOI: 10.3390/s24103159] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/13/2024] [Revised: 05/12/2024] [Accepted: 05/13/2024] [Indexed: 05/26/2024]
Abstract
WiFi Channel State Information (CSI)-based human action recognition using convolutional neural networks (CNNs) has emerged as a promising approach for non-intrusive activity monitoring. However, the integrity and reliability of the reported performance metrics are susceptible to data leakage, wherein information from the test set inadvertently influences the training process, leading to inflated accuracy rates. In this paper, we conduct a critical analysis of a notable IEEE Sensors Journal study on WiFi CSI-based human action recognition, uncovering instances of data leakage resulting from the absence of subject-based data partitioning. Empirical investigation corroborates the lack of exclusivity of individuals across dataset partitions, underscoring the importance of rigorous data management practices. Furthermore, we demonstrate that employing data partitioning with respect to humans results in significantly lower precision rates than the reported 99.9% precision, highlighting the exaggerated nature of the original findings. Such inflated results could potentially discourage other researchers and impede progress in the field by fostering a sense of complacency.
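The subject-based partitioning that this abstract calls for can be sketched in a few lines. The following is a minimal illustration only, not the procedure used in the paper: the function name `subject_wise_split`, the sample labels, and the split fraction are all hypothetical. It assigns whole subjects, rather than individual recordings, to either the training or the test partition, so that no person contributes data to both.

```python
import random
from collections import defaultdict


def subject_wise_split(subject_ids, test_fraction=0.3, seed=0):
    """Split sample indices so that no subject appears in both partitions.

    subject_ids: one subject label per sample (hypothetical input format).
    Returns (train_idx, test_idx) as sorted lists of sample indices.
    """
    # Group sample indices by the subject who produced them.
    by_subject = defaultdict(list)
    for idx, sid in enumerate(subject_ids):
        by_subject[sid].append(idx)

    # Shuffle the subjects (not the samples) reproducibly.
    subjects = sorted(by_subject)
    random.Random(seed).shuffle(subjects)

    # Hold out a fraction of the subjects, at least one.
    n_test = max(1, round(len(subjects) * test_fraction))
    test_idx = [i for s in subjects[:n_test] for i in by_subject[s]]
    train_idx = [i for s in subjects[n_test:] for i in by_subject[s]]
    return sorted(train_idx), sorted(test_idx)


# Hypothetical data: ten recordings from five people, two recordings each.
subject_ids = ["p1", "p1", "p2", "p2", "p3", "p3", "p4", "p4", "p5", "p5"]
train_idx, test_idx = subject_wise_split(subject_ids, test_fraction=0.4, seed=42)
train_subjects = {subject_ids[i] for i in train_idx}
test_subjects = {subject_ids[i] for i in test_idx}
```

A random split over individual recordings would instead tend to place recordings of the same person on both sides, which is the leakage pattern the abstract describes.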
14
Bian Y, Leininger A, May HD, Ren ZJ. H2 mediated mixed culture microbial electrosynthesis for high titer acetate production from CO2. ENVIRONMENTAL SCIENCE AND ECOTECHNOLOGY 2024; 19:100324. [PMID: 37961049 PMCID: PMC10637882 DOI: 10.1016/j.ese.2023.100324] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/07/2023] [Revised: 09/26/2023] [Accepted: 09/27/2023] [Indexed: 11/15/2023]
Abstract
Microbial electrosynthesis (MES) converts CO2 into value-added products such as volatile fatty acids (VFAs) with minimal energy use, but low production titer has limited scale-up and commercialization. Mediated electron transfer via H2 on the MES cathode has shown a higher conversion rate than the direct biofilm-based approach, as it is tunable via cathode potential control and accelerates electrosynthesis from CO2. Here we report that high acetate titers can be achieved via improved in situ H2 supply by a nickel foam decorated carbon felt cathode in mixed community MES systems. An acetate concentration of 12.5 g L⁻¹ was observed in 14 days with the nickel-carbon cathode at a poised potential of -0.89 V (vs. standard hydrogen electrode, SHE), which was much higher than with cathodes using stainless steel (5.2 g L⁻¹) or carbon felt alone (1.7 g L⁻¹) with the same projected surface area. A higher acetate concentration of 16.0 g L⁻¹ in the cathode was achieved over long-term operation for 32 days, but crossover was observed in batch operation, as additional acetate (5.8 g L⁻¹) was also found in the abiotic anode chamber. We observed low Faradaic efficiencies in acetate production, attributed to partial H2 utilization for electrosynthesis. The selective acetate production with high titer demonstrated in this study shows that H2-mediated electron transfer with common cathode materials carries good promise in MES development.
Affiliation(s)
- Yanhong Bian
- Department of Civil and Environmental Engineering, Princeton University, 86 Olden St, Princeton, NJ, 08544, United States
- Andlinger Center for Energy and the Environment, Princeton University, 86 Olden St., Princeton, NJ, 08544, United States
| | - Aaron Leininger
- Department of Civil and Environmental Engineering, Princeton University, 86 Olden St, Princeton, NJ, 08544, United States
- Andlinger Center for Energy and the Environment, Princeton University, 86 Olden St., Princeton, NJ, 08544, United States
| | - Harold D. May
- Andlinger Center for Energy and the Environment, Princeton University, 86 Olden St., Princeton, NJ, 08544, United States
| | - Zhiyong Jason Ren
- Department of Civil and Environmental Engineering, Princeton University, 86 Olden St, Princeton, NJ, 08544, United States
- Andlinger Center for Energy and the Environment, Princeton University, 86 Olden St., Princeton, NJ, 08544, United States
15
Wang H, Zeng J, Dai R, Wang Z. Understanding Rejection Mechanisms of Trace Organic Contaminants by Polyamide Membranes via Data-Knowledge Codriven Machine Learning. ENVIRONMENTAL SCIENCE & TECHNOLOGY 2024; 58:5878-5888. [PMID: 38498471 DOI: 10.1021/acs.est.3c08523] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/20/2024]
Abstract
Data-driven machine learning (ML) provides a promising approach to understanding and predicting the rejection of trace organic contaminants (TrOCs) by polyamide (PA). However, various confounding variables, coupled with data scarcity, restrict the direct application of data-driven ML. In this study, we developed a data-knowledge codriven ML model via domain-knowledge embedding and explored its application in comprehending TrOC rejection by PA membranes. Domain-knowledge embedding enhanced both the predictive performance and the interpretability of the ML model. The contribution of key mechanisms, including size exclusion, charge effect, hydrophobic interaction, etc., that dominate the rejections of the three TrOC categories (neutral hydrophilic, neutral hydrophobic, and charged TrOCs) was quantified. Log D and molecular charge emerge as key factors contributing to the discernible variations in the rejection among the three TrOC categories. Furthermore, we quantitatively compared the TrOC rejection mechanisms between nanofiltration (NF) and reverse osmosis (RO) PA membranes. The charge effect and hydrophobic interactions possessed higher weights for NF to reject TrOCs, while the size exclusion in RO played a more important role. This study demonstrated the effectiveness of the data-knowledge codriven ML method in understanding TrOC rejection by PA membranes, providing a methodology to formulate a strategy for targeted TrOC removal.
Affiliation(s)
- Hejia Wang
- State Key Laboratory of Pollution Control and Resource Reuse, Shanghai Institute of Pollution Control and Ecological Security, School of Environmental Science and Engineering, Tongji University, Shanghai 200092, China
| | - Jin Zeng
- School of Software Engineering, Tongji University, Shanghai 201804, China
| | - Ruobin Dai
- State Key Laboratory of Pollution Control and Resource Reuse, Shanghai Institute of Pollution Control and Ecological Security, School of Environmental Science and Engineering, Tongji University, Shanghai 200092, China
| | - Zhiwei Wang
- State Key Laboratory of Pollution Control and Resource Reuse, Shanghai Institute of Pollution Control and Ecological Security, School of Environmental Science and Engineering, Tongji University, Shanghai 200092, China
16
Schroer HW, Just CL. Feature Engineering and Supervised Machine Learning to Forecast Biogas Production during Municipal Anaerobic Co-Digestion. ACS ES&T ENGINEERING 2024; 4:660-672. [PMID: 38481751 PMCID: PMC10928704 DOI: 10.1021/acsestengg.3c00435] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/19/2023] [Revised: 12/12/2023] [Accepted: 12/12/2023] [Indexed: 01/19/2025]
Abstract
Municipalities with excess anaerobic digestion capacity accept offsite wastes for co-digestion to meet sustainability goals and create more biogas. Despite the benefits inherent to co-digestion, the temporal and compositional heterogeneity of external waste streams creates operational challenges that lead to upsets or conservative co-digestion. Given the complex microbial bioprocesses occurring during anaerobic digestion, prediction and modeling of the outcomes can be challenging, and machine learning has the potential to improve understanding and control of co-digestion processes. Biogas flows are a surrogate for process health, and here, we predicted biogas production from historical data collected by a water resource recovery facility (WRRF) during normal operation. We tested a daily lab and operational data set (n = 1089 after cleaning) and a minute-by-minute supervisory control and data acquisition (SCADA) operational data set (n = 491,761 after cleaning) to determine if forecasting biogas flow for a 24 h time horizon is feasible without collecting additional data. We found that a multilayer perceptron (MLP) neural network model outperformed tree-based and multiple linear regression models. Using a high-resolution SCADA data set for the first time, we showed that MLP neural networks could predict biogas production with an adjusted coefficient of determination (R2) of 0.78 and a mean absolute percentage error of 13.4% on a holdout test set. Adding daily laboratory analyses to the model did not appreciably improve the prediction of biogas flows. Feature engineering was essential to an accurate prediction, and 11 of the 15 most important features in the SCADA model were calculated from raw SCADA outputs. In summary, this paper demonstrates that minute-scale SCADA information collected at a municipal co-digestion facility can forecast biogas production, as a first step toward a digital twin model, without additional data collection.
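The two holdout metrics quoted in this abstract, the adjusted coefficient of determination and the mean absolute percentage error, follow from standard definitions. The sketch below is a plain-Python illustration under stated assumptions: the function names and the toy flow values are hypothetical and are not taken from the paper, and `n_features` stands for the number of predictors in the fitted model.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(y_true, y_pred, n_features):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1).

    Penalizes R^2 for the number of predictors p given n samples.
    """
    n = len(y_true)
    r2 = r_squared(y_true, y_pred)
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_features - 1)


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (y_true must be nonzero)."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical observed and predicted values, for illustration only.
observed = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 300.0, 400.0]

r2 = r_squared(observed, predicted)
adj_r2 = adjusted_r_squared(observed, predicted, n_features=1)
error_pct = mape(observed, predicted)
```

Because the adjustment depends on the number of predictors, the two R² variants diverge most on small data sets with many engineered features.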
Affiliation(s)
- Hunter W. Schroer
- IIHR – Hydroscience and Engineering, University of Iowa, Iowa City, Iowa 52242, United States
| | - Craig L. Just
- IIHR – Hydroscience and Engineering, University of Iowa, Iowa City, Iowa 52242, United States
- Department of Civil & Environmental Engineering, University of Iowa, Iowa City, Iowa 52242, United States
17
Wu CD, Zhu JJ, Hsu CY, Shie RH. Quantifying source contributions to ambient NH3 using Geo-AI with time lag and parcel tracking functions. ENVIRONMENT INTERNATIONAL 2024; 185:108520. [PMID: 38412565 DOI: 10.1016/j.envint.2024.108520] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/06/2023] [Revised: 01/26/2024] [Accepted: 02/19/2024] [Indexed: 02/29/2024]
Abstract
Ambient ammonia (NH3) is an important compound in the formation of particulate matter (PM), and it is therefore crucial to understand NH3's properties in order to better reduce PM. However, this goal is not easy to achieve due to the limited range/real-time NH3 data monitored by air quality stations. While other studies have predicted NH3 and its source apportionment, this manuscript provides a novel method (i.e., Geo-AI) to examine NH3 predictions and their contributing sources. This study represents a pioneering effort in the application of a novel geospatial-artificial intelligence (Geo-AI) base model with parcel tracking functions. This approach integrates various machine learning algorithms and geographic predictor variables to estimate NH3 concentrations, marking the first instance of such a comprehensive methodology. The Shapley additive explanation (SHAP) was used to further analyze the source contributions of NH3 with domain knowledge. From 2016 to 2018, Taichung's hourly average NH3 values were predicted with total variance up to 96%. SHAP values revealed that waterbody, traffic, and agriculture emissions were the most significant factors affecting NH3 concentrations in Taichung among all the characteristics. Our methodology is a vital first step for shaping future policies and regulations and is adaptable to regions with limited monitoring sites.
Affiliation(s)
- Chih-Da Wu
- Department of Geomatics, National Cheng Kung University, Tainan, Taiwan; National Institute of Environmental Health Sciences, National Health Research Institutes, Miaoli, Taiwan; Innovation and Development Center of Sustainable Agriculture, National Chung-Hsing University, Taichung, Taiwan
| | - Jun-Jie Zhu
- Department of Civil and Environmental Engineering and Andlinger Center for Energy and the Environment, Princeton University, Princeton, NJ 08544, USA
| | - Chin-Yu Hsu
- Department of Safety, Health and Environmental Engineering, Ming Chi University of Technology, New Taipei City, Taiwan; Center for Environmental Sustainability and Human Health, Ming Chi University of Technology, New Taipei City, Taiwan.
| | - Ruei-Hao Shie
- Green Energy and Environment Research Laboratories, Industrial Technology Research Institute, 321 Guangfu Road, East District, Hsinchu City 30011, Taiwan
18
Guo S, Zhou J, Li Z, Zheng L, Wang X, Cheng S, Li K. End-to-end machine-learning for high-gravity ammonia stripping: Bridging the gap between scientific research and user-friendly applications. WATER RESEARCH 2024; 248:120790. [PMID: 37988805 DOI: 10.1016/j.watres.2023.120790] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/22/2023] [Revised: 10/13/2023] [Accepted: 10/26/2023] [Indexed: 11/23/2023]
Abstract
The removal and recovery of ammonia from wastewater are critical processes for achieving global environmental sustainability and promoting circular economic development. High-gravity technology is an advanced solution for stripping ammonia from wastewater. This study used machine-learning (ML) techniques to provide more comprehensive insights into various influencing factors, including the operating parameters, wastewater characteristics, and design parameters of rotating packed beds. Bayesian auto-optimization combined with a boosting algorithm effectively overcame the challenges of modeling complex datasets with small sample sizes, multidimensional data, missing values, and skewed distributions. Accurate ML-based predictive models for the ammonia removal efficiency (η) and mass transfer coefficient (KLa) were developed; performance was R² = 0.98 and R² = 0.89 on the training set and R² = 0.98 and R² = 0.82 on the testing set. The developed model revealed that the stripping stage and gas-liquid ratio were the most influential features for predicting η, whereas the liquid flow and high-gravity factor were the most important features for predicting KLa. The well-trained model was then deployed in an online software application that provides both predictive and auto-update functions for operators and managers, ensuring that practitioners can use the model. The end-to-end machine-learning approach used in this study (that is, covering data collection, model development, and application) could improve the availability of research results, providing valuable references for the further advancement of technology in the environmental field.
Affiliation(s)
- Shaomin Guo
- School of Energy and Environmental Engineering, Beijing Key Laboratory of Resource-oriented Treatment of Industrial Pollutants, University of Science and Technology Beijing, Beijing 100083, PR China
| | - Junwen Zhou
- School of Energy and Environmental Engineering, Beijing Key Laboratory of Resource-oriented Treatment of Industrial Pollutants, University of Science and Technology Beijing, Beijing 100083, PR China
| | - Zifu Li
- School of Energy and Environmental Engineering, Beijing Key Laboratory of Resource-oriented Treatment of Industrial Pollutants, University of Science and Technology Beijing, Beijing 100083, PR China.
| | - Lei Zheng
- School of Energy and Environmental Engineering, Beijing Key Laboratory of Resource-oriented Treatment of Industrial Pollutants, University of Science and Technology Beijing, Beijing 100083, PR China
| | - Xuemei Wang
- School of Energy and Environmental Engineering, Beijing Key Laboratory of Resource-oriented Treatment of Industrial Pollutants, University of Science and Technology Beijing, Beijing 100083, PR China
| | - Shikun Cheng
- School of Energy and Environmental Engineering, Beijing Key Laboratory of Resource-oriented Treatment of Industrial Pollutants, University of Science and Technology Beijing, Beijing 100083, PR China
| | - Kang Li
- Department of Geotechnical Engineering, College of Civil Engineering, Tongji University, Shanghai 200092, PR China
19
Jin Y, Ma M, Yan Y, Guo Y, Feng Y, Chen C, Zhong Y, Huang K, Xia H, Libo Y, Si Y, Zou J. A convenient machine learning model to predict full stomach and evaluate the safety and comfort improvements of preoperative oral carbohydrate in patients undergoing elective painless gastrointestinal endoscopy. Ann Med 2023; 55:2292778. [PMID: 38109932 PMCID: PMC10732178 DOI: 10.1080/07853890.2023.2292778] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/30/2023] [Accepted: 12/04/2023] [Indexed: 12/20/2023]
Abstract
BACKGROUND AND AIMS Assessment of the patient's gastric contents is the key to avoiding aspiration incidents; however, there is no effective method to determine whether elective painless gastrointestinal endoscopy (GIE) patients have a full stomach or an empty stomach. Previous studies have shown that preoperative oral carbohydrates (POCs) can relieve the discomfort induced by fasting, but there are different perspectives on their safety. This study aimed to develop a convenient, accurate machine learning (ML) model to predict full stomach and, based on the model outcomes, to evaluate the safety and comfort improvements of POCs in empty- and full-stomach groups. METHODS We enrolled 1386 painless GIE patients between October 2022 and January 2023 in Nanjing First Hospital, and 1090 patients without POCs were used to construct five different ML models to identify full stomach. Metrics of discrimination and calibration validated the robustness of the models. We further interpreted the best-performing model through SHapley Additive exPlanations (SHAP) and constructed a web calculator to facilitate clinical use. We evaluated the safety and comfort improvements of POCs by propensity score matching (PSM) in the two groups, respectively. RESULTS The Random Forest (RF) model showed the greatest discrimination, with an area under the receiver operating characteristic curve (AUROC) of 0.837 [95% confidence interval (CI): 79.1-88.2] and F1 of 71.5%, and the best calibration, with a Brier score of 15.2%. The web calculator can be visited at https://medication.shinyapps.io/RF_model/. PSM results demonstrated that POCs significantly reduced the incidence of full stomach in the empty stomach group (p < 0.05) but made no difference in the full stomach group (p > 0.05). Comfort improved in both groups, more significantly in the empty stomach group. CONCLUSIONS The developed convenient RF model predicted full stomach with high accuracy and interpretability. POCs were safe and improved comfort in both groups, with more benefit in the empty stomach group. These findings may guide patients' gastrointestinal preparation.
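The calibration metric reported above, the Brier score, is the mean squared difference between predicted probabilities and observed binary outcomes, so lower values indicate better calibration. The sketch below illustrates that standard definition only; the function name and the toy probabilities are hypothetical and are not data from the study.

```python
def brier_score(probabilities, outcomes):
    """Mean squared difference between predicted probabilities and 0/1 outcomes."""
    if len(probabilities) != len(outcomes):
        raise ValueError("probabilities and outcomes must have the same length")
    return sum((p - y) ** 2 for p, y in zip(probabilities, outcomes)) / len(outcomes)


# Hypothetical predicted probabilities of a full stomach and observed labels.
predicted = [0.9, 0.2, 0.6, 0.1]
observed = [1, 0, 1, 0]
score = brier_score(predicted, observed)
```

A model that always predicts 0.5 scores 0.25 on any binary data set, which gives a convenient reference point when reading a reported value.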
Affiliation(s)
- Yuzhan Jin
- School of Basic Medicine and Clinical Pharmacy, China Pharmaceutical University, Nanjing, China
- Department of Clinical Pharmacology, Nanjing First Hospital, Nanjing Medical University, Nanjing, China
| | - Mingtao Ma
- Department of Anesthesiology, Perioperative and Pain Medicine, Nanjing First Hospital, Nanjing Medical University, Nanjing, China
- Department of Anesthesiology, Leping People’s Hospital, Jiangxi, China
| | - Yuqing Yan
- School of Basic Medicine and Clinical Pharmacy, China Pharmaceutical University, Nanjing, China
- Department of Clinical Pharmacology, Nanjing First Hospital, Nanjing Medical University, Nanjing, China
| | - Yaoyi Guo
- Department of Anesthesiology, Perioperative and Pain Medicine, Nanjing First Hospital, Nanjing Medical University, Nanjing, China
| | - Yue Feng
- Department of Anesthesiology, Perioperative and Pain Medicine, Nanjing First Hospital, Nanjing Medical University, Nanjing, China
| | - Chen Chen
- Department of Clinical Pharmacology, Nanjing First Hospital, Nanjing Medical University, Nanjing, China
- Department of Pharmacy, Nanjing First Hospital, China Pharmaceutical University, Nanjing, China
| | - Yi Zhong
- Department of Anesthesiology, Perioperative and Pain Medicine, Nanjing First Hospital, Nanjing Medical University, Nanjing, China
| | - Kaizong Huang
- Department of Clinical Pharmacology, Nanjing First Hospital, Nanjing Medical University, Nanjing, China
- Department of Pharmacy, Nanjing First Hospital, China Pharmaceutical University, Nanjing, China
| | - Huaming Xia
- Nanjing Xiaheng Network System Co., Ltd., Nanjing, China
| | - Yan Libo
- Jiangsu Kaiyuan Pharmaceutical Co., Ltd., Nanjing, China
| | - Yanna Si
- Department of Anesthesiology, Perioperative and Pain Medicine, Nanjing First Hospital, Nanjing Medical University, Nanjing, China
| | - Jianjun Zou
- Department of Clinical Pharmacology, Nanjing First Hospital, Nanjing Medical University, Nanjing, China
- Department of Pharmacy, Nanjing First Hospital, China Pharmaceutical University, Nanjing, China
20
Zhu JJ, Yang M, Ren ZJ. Machine Learning in Environmental Research: Common Pitfalls and Best Practices. ENVIRONMENTAL SCIENCE & TECHNOLOGY 2023; 57:17671-17689. [PMID: 37384597 DOI: 10.1021/acs.est.3c00026] [Citation(s) in RCA: 107] [Impact Index Per Article: 53.5] [Indexed: 07/01/2023]
Abstract
Machine learning (ML) is increasingly used in environmental research to process large data sets and decipher complex relationships between system variables. However, due to the lack of familiarity and methodological rigor, inadequate ML studies may lead to spurious conclusions. In this study, we synthesized literature analysis with our own experience and provided a tutorial-like compilation of common pitfalls along with best practice guidelines for environmental ML research. We identified more than 30 key items and provided evidence-based data analysis based on 148 highly cited research articles to exhibit the misconceptions of terminologies, proper sample size and feature size, data enrichment and feature selection, randomness assessment, data leakage management, data splitting, method selection and comparison, model optimization and evaluation, and model explainability and causality. By analyzing good examples on supervised learning and reference modeling paradigms, we hope to help researchers adopt more rigorous data preprocessing and model development standards for more accurate, robust, and practicable model uses in environmental research and applications.
Affiliation(s)
- Jun-Jie Zhu
- Department of Civil and Environmental Engineering and Andlinger Center for Energy and the Environment, Princeton University, Princeton, New Jersey 08544, United States
| | - Meiqi Yang
- Department of Civil and Environmental Engineering and Andlinger Center for Energy and the Environment, Princeton University, Princeton, New Jersey 08544, United States
| | - Zhiyong Jason Ren
- Department of Civil and Environmental Engineering and Andlinger Center for Energy and the Environment, Princeton University, Princeton, New Jersey 08544, United States
21
Wang M, Shi GM, Zhao D, Liu X, Jiang J. Machine Learning-Assisted Design of Thin-Film Composite Membranes for Solvent Recovery. ENVIRONMENTAL SCIENCE & TECHNOLOGY 2023; 57:15914-15924. [PMID: 37814603 DOI: 10.1021/acs.est.3c04773] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Indexed: 10/11/2023]
Abstract
Organic solvents are extensively utilized in industries as raw materials, reaction media, and cleaning agents. It is crucial to efficiently recover solvents for environmental protection and sustainable manufacturing. Recently, organic solvent nanofiltration (OSN) has emerged as an energy-efficient membrane technology for solvent recovery; however, current OSN membranes are largely fabricated by trial-and-error methods. In this study, for the first time, we develop a machine learning (ML) approach to design new thin-film composite membranes for solvent recovery. The monomers used in interfacial polymerization, along with membrane, solvent and solute properties, are featurized to train ML models via gradient boosting regression. The ML models demonstrate high accuracy in predicting OSN performance including solvent permeance and solute rejection. Subsequently, 167 new membranes are designed from 40 monomers and their OSN performance is predicted by the ML models for common solvents (methanol, acetone, dimethylformamide, and n-hexane). New top-performing membranes are identified with methanol permeance superior to that of existing membranes. Particularly, nitrogen-containing heterocyclic monomers are found to enhance microporosity and contribute to higher permeance. Finally, one new membrane is experimentally synthesized and tested to validate the ML predictions. Based on the chemical structures of monomers, the ML approach developed here provides a bottom-up strategy toward the rational design of new membranes for high-performance solvent recovery and many other technologically important applications.
Affiliation(s)
- Mao Wang
- Department of Chemical and Biomolecular Engineering, National University of Singapore, Singapore 117576, Singapore
| | - Gui Min Shi
- Department of Chemical and Biomolecular Engineering, National University of Singapore, Singapore 117576, Singapore
| | - Daohui Zhao
- Department of Chemical and Biomolecular Engineering, National University of Singapore, Singapore 117576, Singapore
| | - Xinyi Liu
- Department of Chemical and Biomolecular Engineering, National University of Singapore, Singapore 117576, Singapore
| | - Jianwen Jiang
- Department of Chemical and Biomolecular Engineering, National University of Singapore, Singapore 117576, Singapore