1
Liu P, Xu H, Jin P, Zhu X, Zheng J, Liu Y, Yang J, Xu D, Liang H. DFT-assisted machine learning for polyester membrane design in textile wastewater recovery applications. Water Research 2025; 279:123438. [PMID: 40073492 DOI: 10.1016/j.watres.2025.123438] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/20/2024] [Revised: 02/11/2025] [Accepted: 03/03/2025] [Indexed: 03/14/2025]
Abstract
Resource recovery from textile wastewater has attracted increasing interest because it simultaneously addresses wastewater treatment and maximizes the utilization of the residual dyes. Although polyester membranes have demonstrated great potential for textile wastewater recovery, tailoring high-performance polyester membranes remains a multidimensional challenge because of the complex nonlinear relationships between the membrane materials and their performance. Here, we developed density functional theory (DFT)-assisted machine learning models that integrate DFT descriptors with fabrication and operation parameters to facilitate the generative design of polyester membranes. The developed machine learning model demonstrated the ability to accurately predict permeance and separation performance. The contribution analysis revealed that the fabrication parameters emerged as the critical factors influencing permeance, whereas the DFT descriptors played important roles in determining the dye and salt rejection. Additionally, optimal combinations of monomer, fabrication, and operation conditions were identified from a chemical space of 8,000 candidates using the developed model combined with Bayesian optimization, targeting dye/salt and dye/dye selectivity. Five polyester membranes were then fabricated under these identified combinations. These membranes surpassed the current performance upper bound and achieved efficient recovery of the dyes from textile wastewater. Overall, a feasible and universal machine learning model aimed at driving a paradigm shift in the inverse design of polyester membranes was developed.
Affiliation(s)
- Peng Liu
  - State Key Laboratory of Urban Water Resource and Environment, School of Environment, Harbin Institute of Technology, Harbin, 150090, China
- Hangbin Xu
  - State Key Laboratory of Urban Water Resource and Environment, School of Environment, Harbin Institute of Technology, Harbin, 150090, China
- Pengrui Jin
  - Department of Chemical Engineering, KU Leuven, Celestijnenlaan 200F, Leuven, B-3001 Belgium
- Xuewu Zhu
  - School of Municipal and Environmental Engineering, Shandong Jianzhu University, Jinan, 250101, China
- Junfeng Zheng
  - School of Carbon Neutrality Future Technology, Sichuan University, Chengdu, 610065, China
- Yanling Liu
  - State Key Laboratory of Pollution Control and Resources Reuse, Advanced Membrane Technology Center, Tongji University, Shanghai, 200092, China
- Jiaxuan Yang
  - State Key Laboratory of Urban Water Resource and Environment, School of Environment, Harbin Institute of Technology, Harbin, 150090, China
- Daliang Xu
  - State Key Laboratory of Urban Water Resource and Environment, School of Environment, Harbin Institute of Technology, Harbin, 150090, China
- Heng Liang
  - State Key Laboratory of Urban Water Resource and Environment, School of Environment, Harbin Institute of Technology, Harbin, 150090, China
2
Özönder Ş, Küçükkartal HK. Rapid Discovery of Graphene Nanoflakes with Desired Absorption Spectra Using DFT and Bayesian Optimization with Neural Network Kernel. J Phys Chem A 2025; 129:4591-4600. [PMID: 40338138 DOI: 10.1021/acs.jpca.5c00405] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/09/2025]
Abstract
Grid searching a large and high-dimensional chemical space with density functional theory (DFT) to discover new materials with desired properties is prohibitive due to the high computational cost. We propose an approach utilizing Bayesian optimization (BO) with an artificial neural network kernel to enable an efficient and low-cost guided search on the chemical space, avoiding costly brute-force grid search. This method leverages the BO algorithm, where the kernel neural network trained on a limited number of DFT results determines the most promising regions of the chemical space to explore in subsequent iterations. This approach aims to discover new materials with target properties while minimizing the number of DFT calculations required. To demonstrate the effectiveness of this method, we investigated 63 doped graphene quantum dots (GQDs) with sizes ranging from 1 to 2 nm to find the structure with the highest light absorption. Using time-dependent DFT (TDDFT) only 12 times, we achieved a significant reduction in computational cost, approximately 20% of what would be required for a full grid search. Considering that TDDFT calculations for a single GQD require about half a day of wall time on high-performance computing nodes, this reduction is substantial. Our approach can be generalized to the discovery of new drugs, chemicals, crystals, and alloys in high-dimensional and large chemical spaces, offering a scalable solution enabled by the neural network kernel.
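The cost figure quoted in this abstract can be checked with simple arithmetic: 12 TDDFT evaluations out of 63 candidates is roughly 19% of a full grid search, consistent with the stated "approximately 20%". The following is a minimal sketch, assuming every TDDFT run costs exactly half a day and that runs are executed one after another; the variable and function names are illustrative and do not come from the paper.

```python
# Back-of-the-envelope cost comparison for the guided search described above.
# Assumption: every TDDFT run costs the same 0.5 day of wall time and runs
# are sequential. Names below are hypothetical, chosen for this sketch only.
N_CANDIDATES = 63     # doped GQDs in the search space (from the abstract)
N_TDDFT_RUNS = 12     # TDDFT evaluations used by the guided search
DAYS_PER_RUN = 0.5    # "about half a day" per GQD, taken here as exactly 0.5


def search_cost_days(n_runs: int, days_per_run: float = DAYS_PER_RUN) -> float:
    """Total wall time in days for n_runs sequential TDDFT calculations."""
    return n_runs * days_per_run


grid_days = search_cost_days(N_CANDIDATES)    # exhaustive grid search
guided_days = search_cost_days(N_TDDFT_RUNS)  # Bayesian-optimization-guided search
fraction = N_TDDFT_RUNS / N_CANDIDATES        # share of the grid-search cost

print(f"grid search:   {grid_days:.1f} days")
print(f"guided search: {guided_days:.1f} days")
print(f"cost fraction: {fraction:.1%}")
```

Under these assumptions the exhaustive search would take 31.5 days of wall time against 6 days for the guided search.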
Affiliation(s)
- Şener Özönder
  - Institute for Data Science & Artificial Intelligence, Boğaziçi University, İstanbul 34342, Turkey
- Hatice Kübra Küçükkartal
  - Computer Engineering Department, Eskişehir Osmangazi University, Eskişehir 26040, Turkey
  - ArtificaX Technologies, Boğaziçi Tecnopark, İstanbul 34470, Turkey
3
Tao W, Zhao W, Zhao Q, Xiao Y. Ensemble-Learning-Guided Optimization Design for Metal-Organic Framework Adsorbents toward CO Adsorption. Inorg Chem 2025; 64:9237-9250. [PMID: 40314500 DOI: 10.1021/acs.inorgchem.5c00994] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/03/2025]
Abstract
Metal-organic frameworks (MOFs) hold great potential for carbon monoxide (CO) adsorption owing to their large pore volume, diverse periodic network structures, and designability. Machine learning is anticipated to provide optimization parameters for designing high-efficiency MOF adsorbents, avoiding time-consuming experiments. Here, we proposed an ensemble-learning strategy accounting for multidimensional analysis of features to rationally design pore geometries, structural properties, and synthesis conditions of MOFs toward high performance for CO adsorption. The extreme gradient boosting model exhibited the best predictive performance (R² > 0.95) under limited data set size. Porous characteristics were identified as the dominant factor in pristine MOFs. Prediction results illustrated that MOFs featuring one-dimensional, two-dimensional, microporous, and isolated pores were optimal for CO adsorption, with 0.4-0.6 cm³/g total pore volume. This enhanced adsorption capacity can be attributed to the shortened molecular diffusion pathways. The relative significance of structural parameters followed the order space group > geometry > topology. The optimal structural configuration involved a space group of R3m, binuclear paddle wheel geometry, and scorpionate-like topology. Regarding transition metal-modified MOFs, incorporated Cu(I) demonstrated the strongest binding affinity toward CO, while Fe(II) and Ni(II) could serve as effective binding sites. This work offers theoretical guidance for designing efficient adsorbents for CO adsorption.
Affiliation(s)
- Wenyuan Tao
  - School of Energy and Materials, Shanghai Polytechnic University, Shanghai 201209, China
  - School of Petrochemical Engineering, Shenyang University of Technology, Liaoyang 111003, China
  - Panjin Institute of Industrial Technology, Dalian University of Technology, Panjin 124221, China
- Wenkai Zhao
  - School of Petrochemical Engineering, Shenyang University of Technology, Liaoyang 111003, China
- Qidong Zhao
  - School of Chemical Engineering, Ocean and Life Sciences, Dalian University of Technology, Panjin 124221, China
- Yonghou Xiao
  - School of Energy and Materials, Shanghai Polytechnic University, Shanghai 201209, China
  - Panjin Institute of Industrial Technology, Dalian University of Technology, Panjin 124221, China
4
Wang M, Ji Z, Dong Y. Machine learning-guided performance prediction of forward osmosis polymeric membranes for boron recovery. Water Research 2025; 281:123700. [PMID: 40305914 DOI: 10.1016/j.watres.2025.123700] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/08/2025] [Revised: 04/12/2025] [Accepted: 04/21/2025] [Indexed: 05/02/2025]
Abstract
Efficient recovery of boron is one of the crucial strategies for sustainably extracting valuable resources from water. However, efficiently predicting boron recovery from unconventional water resources such as underground water, geothermal water, and seawater remains a key technological challenge that has received little attention in the open literature. To address this issue, we propose an efficient strategy to precisely predict boron recovery performance and then explore the mechanism in the forward osmosis process via advanced machine learning techniques with better model performance. Specifically, to explore the complex relationships among various boron recovery factors, we compare different advanced machine learning regression models to provide valuable insights into how these key factors impact system performance. We find that three key driving factors (i.e., pH, boron concentration, and membrane orientation) significantly affect boron recovery performance in the forward osmosis process. The best prediction accuracy, with a high coefficient of determination (R² = 95.4 %), is achieved via the XGBoost model combined with the particle swarm optimization algorithm, demonstrating its remarkable ability for precise boron recovery prediction. By employing this hybrid model to optimize the search space, the overall performance of the forward osmosis system was significantly enhanced, with a predicted boron rejection rate as high as 98.28 %, outperforming the reported values. Our work demonstrates the powerful potential of advanced machine learning for efficiently predicting boron recovery for water quality improvement and resource recovery applications.
Affiliation(s)
- Meng Wang
  - School of Science and Engineering, The Chinese University of Hong Kong, Shenzhen, Guangdong, 518172, PR China
- Zhanlin Ji
  - College of Mathematics and Computer Science, Zhejiang A&F University, Hangzhou 311300, PR China
- Yingchao Dong
  - School of Science and Engineering, The Chinese University of Hong Kong, Shenzhen, Guangdong, 518172, PR China
5
Akbari A, Maleki M, Kazemzadeh Y, Ranjbar A. Calculation of hydrogen dispersion in cushion gases using machine learning. Sci Rep 2025; 15:13718. [PMID: 40258984 PMCID: PMC12012074 DOI: 10.1038/s41598-025-98613-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/10/2025] [Accepted: 04/14/2025] [Indexed: 04/23/2025] Open
Abstract
Hydrogen storage is a crucial technology for ensuring a sustainable energy transition. Underground Hydrogen Storage (UHS) in depleted hydrocarbon reservoirs, aquifers, and salt caverns provides a viable large-scale solution. However, hydrogen dispersion in cushion gases such as nitrogen (N2), methane (CH4), and carbon dioxide (CO2) leads to contamination, reduced purity, and increased purification costs. Existing experimental and numerical methods for predicting hydrogen dispersion coefficients (KL) are often limited by high costs, lengthy processing times, and insufficient accuracy in dynamic reservoir conditions. This study addresses these challenges by integrating experimental data with advanced machine learning (ML) techniques to model hydrogen dispersion. Various ML models, including Random Forest (RF), Least Squares Boosting (LSBoost), Bayesian Regression, Linear Regression (LR), Artificial Neural Networks (ANNs), and Support Vector Machines (SVMs), were employed to quantify KL as a function of pressure (P) and displacement velocity (Um). Among these methods, RF outperformed the others, achieving an R² of 0.9965 for test data and 0.9999 for training data, with RMSE values of 0.023 and 0.001, respectively. The findings highlight the potential of ML-driven approaches in optimizing UHS operations by enhancing predictive accuracy, reducing computational costs, and mitigating hydrogen contamination risks.
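The two metrics reported in this abstract, the coefficient of determination and the root-mean-squared error, are computed from paired observed and predicted values. The sketch below uses their standard definitions in plain Python; the four data points are made up for illustration and are not taken from the study.

```python
import math


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root-mean-squared error, in the same units as the target."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)


# Hypothetical observed and predicted values, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]

r2 = r2_score(observed, predicted)
err = rmse(observed, predicted)
print(f"R^2 = {r2:.4f}, RMSE = {err:.4f}")
```

Comparing both metrics on training and test data, as the abstract does, is one common way to judge how far a model has fitted noise in the training set.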
Affiliation(s)
- Ali Akbari
  - Department of Petroleum Engineering, Faculty of Petroleum, Gas, and Petrochemical Engineering, Persian Gulf University, Bushehr, Iran
- Mehdi Maleki
  - Department of Petroleum Engineering, Faculty of Petroleum, Gas, and Petrochemical Engineering, Persian Gulf University, Bushehr, Iran
- Yousef Kazemzadeh
  - Department of Petroleum Engineering, Faculty of Petroleum, Gas, and Petrochemical Engineering, Persian Gulf University, Bushehr, Iran
- Ali Ranjbar
  - Department of Petroleum Engineering, Faculty of Petroleum, Gas, and Petrochemical Engineering, Persian Gulf University, Bushehr, Iran
6
Dangayach R, Jeong N, Demirel E, Uzal N, Fung V, Chen Y. Machine Learning-Aided Inverse Design and Discovery of Novel Polymeric Materials for Membrane Separation. Environmental Science & Technology 2025; 59:993-1012. [PMID: 39680111 PMCID: PMC11755723 DOI: 10.1021/acs.est.4c08298] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/10/2024] [Revised: 12/03/2024] [Accepted: 12/04/2024] [Indexed: 12/17/2024]
Abstract
Polymeric membranes have been widely used for liquid and gas separation in various industrial applications over the past few decades because of their exceptional versatility and high tunability. Traditional trial-and-error methods for material synthesis are inadequate to meet the growing demands for high-performance membranes. Machine learning (ML) has demonstrated huge potential to accelerate the design and discovery of membrane materials. In this review, we cover the strengths and weaknesses of the traditional methods, followed by a discussion on the emergence of ML for developing advanced polymeric membranes. We describe methodologies for data collection, data preparation, the commonly used ML models, and the explainable artificial intelligence (XAI) tools implemented in membrane research. Furthermore, we explain the experimental and computational validation steps to verify the results provided by these ML models. Subsequently, we showcase successful case studies of polymeric membranes and emphasize inverse design methodology within an ML-driven structured framework. Finally, we conclude by highlighting the recent progress, challenges, and future research directions to advance ML research for next-generation polymeric membranes. With this review, we aim to provide a comprehensive guideline that assists researchers, scientists, and engineers in applying ML to membrane research and accelerates the membrane design and material discovery process.
Affiliation(s)
- Raghav Dangayach
  - School of Civil & Environmental Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
- Nohyeong Jeong
  - School of Civil & Environmental Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
- Elif Demirel
  - School of Civil & Environmental Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
- Nigmet Uzal
  - School of Civil & Environmental Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
  - Department of Civil Engineering, Abdullah Gul University, 38039 Kayseri, Turkey
- Victor Fung
  - School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
- Yongsheng Chen
  - School of Civil & Environmental Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
7
Wu Y, Wang Z, Yu G, Zhao Y, Chen C, Xie Y, Cao H. Interpretable Machine Learning Models Delivering a New Perspective for the Reaction Mechanism between Organic Pollutants and Oxidative Radicals. Environmental Science & Technology 2025; 59:1264-1273. [PMID: 39772452 DOI: 10.1021/acs.est.4c11504] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/11/2025]
Abstract
Machine learning (ML) is expected to bring new insights into the impact of organic structures on the reaction mechanisms in reactive oxygen species oxidation. However, understanding the underlying chemical mechanisms still faces challenges due to the limited interpretability of the ML models. In this study, interpretable ML models were established to predict the second-order rate constants between hydroxyl radicals (•OH) and organics (k•OH). It was found that the energy of the highest occupied molecular orbital (EHOMO), the number of aromatic rings (NAR), and the number of carbon atoms of organics (NC) have important impacts on k•OH. The positive correlation between k•OH and EHOMO can be explained by the regularity of electrophilic reactions, while the relationship of k•OH with NAR and NC seems to be related to reactive sites. Furthermore, a rapid judgment method for the reaction mechanism was developed based on an unsupervised learning approach that automatically divided organics into three clusters. Additionally, this methodology was applied to the reaction between organics and sulfate radicals. This study offers a rational model for predicting reaction mechanisms and provides more insights into the impact of organic structures on the reaction mechanism from the perspective of big data.
Affiliation(s)
- Yiqiu Wu
  - Chemistry & Chemical Engineering Data Center, Institute of Process Engineering, Chinese Academy of Sciences, Beijing 100190, China
  - School of Chemical Engineering, University of Chinese Academy of Sciences, Beijing 100049, China
- Zhixiang Wang
  - Chemistry & Chemical Engineering Data Center, Institute of Process Engineering, Chinese Academy of Sciences, Beijing 100190, China
  - School of Chemical Engineering, University of Chinese Academy of Sciences, Beijing 100049, China
- Guangfei Yu
  - MOE Key Laboratory of Resources and Environmental Systems Optimization, College of Environmental Science and Engineering, North China Electric Power University, Beijing 102206, China
- Yuehong Zhao
  - Chemistry & Chemical Engineering Data Center, Institute of Process Engineering, Chinese Academy of Sciences, Beijing 100190, China
  - School of Chemical Engineering, University of Chinese Academy of Sciences, Beijing 100049, China
- Chuncheng Chen
  - Beijing National Laboratory for Molecular Sciences, Key Laboratory of Photochemistry, CAS Research/Education Center for Excellence in Molecular Sciences, Institute of Chemistry, Chinese Academy of Sciences, Beijing 100190, P. R. China
- Yongbing Xie
  - Chemistry & Chemical Engineering Data Center, Institute of Process Engineering, Chinese Academy of Sciences, Beijing 100190, China
  - School of Chemical Engineering, University of Chinese Academy of Sciences, Beijing 100049, China
- Hongbin Cao
  - Chemistry & Chemical Engineering Data Center, Institute of Process Engineering, Chinese Academy of Sciences, Beijing 100190, China
  - School of Chemical Engineering, University of Chinese Academy of Sciences, Beijing 100049, China
8
Jeong N, Park S, Mahajan S, Zhou J, Blotevogel J, Li Y, Tong T, Chen Y. Elucidating governing factors of PFAS removal by polyamide membranes using machine learning and molecular simulations. Nat Commun 2024; 15:10918. [PMID: 39738140 DOI: 10.1038/s41467-024-55320-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/19/2024] [Accepted: 12/09/2024] [Indexed: 01/01/2025] Open
Abstract
Per- and polyfluoroalkyl substances (PFASs) have recently garnered considerable concern regarding their impacts on human and ecological health. Despite the important roles of polyamide membranes in remediating PFAS-contaminated water, the governing factors influencing PFAS transport across these membranes remain elusive. In this study, we investigate PFAS rejection by polyamide membranes using two machine learning (ML) models, namely XGBoost and multimodal transformer models. Interpreting the XGBoost model with the Shapley additive explanation method unveils the impacts of both PFAS characteristics and membrane properties on model predictions. To examine the impacts of chemical structure, we interpret the multimodal transformer model, which incorporates simplified molecular input line entry system strings, through heat maps that visualize the attention score assigned to each atom of PFAS molecules. Both ML interpretation methods highlight the dominance of electrostatic interaction in governing PFAS transport across polyamide membranes. The roles of functional groups in altering PFAS transport across membranes are further revealed by molecular simulations. The combination of ML with computer simulations not only advances our knowledge of PFAS removal by polyamide membranes, but also provides an innovative approach to facilitate data-driven feature selection for the development of high-performance membranes with improved PFAS removal efficiency.
Affiliation(s)
- Nohyeong Jeong
  - School of Civil & Environmental Engineering, Georgia Institute of Technology, Atlanta, GA, 30332, USA
- Shinyun Park
  - Department of Civil and Environmental Engineering, Colorado State University, Fort Collins, CO, 80523, USA
  - School of Sustainable Engineering and the Built Environment, Arizona State University, Tempe, AZ, 85287, USA
- Subhamoy Mahajan
  - Department of Mechanical Engineering, University of Wisconsin-Madison, Madison, WI, 53706, USA
- Ji Zhou
  - Department of Mechanical Engineering, University of Wisconsin-Madison, Madison, WI, 53706, USA
- Jens Blotevogel
  - Department of Civil and Environmental Engineering, Colorado State University, Fort Collins, CO, 80523, USA
  - Commonwealth Scientific and Industrial Research Organisation (CSIRO), Environment, Waite Campus, Urrbrae, 5064, Australia
- Ying Li
  - Department of Mechanical Engineering, University of Wisconsin-Madison, Madison, WI, 53706, USA
- Tiezheng Tong
  - Department of Civil and Environmental Engineering, Colorado State University, Fort Collins, CO, 80523, USA
  - School of Sustainable Engineering and the Built Environment, Arizona State University, Tempe, AZ, 85287, USA
- Yongsheng Chen
  - School of Civil & Environmental Engineering, Georgia Institute of Technology, Atlanta, GA, 30332, USA
9
Zheng R, Xu S, Zhong S, Tong X, Yu X, Zhao Y, Chen Y. Enhancing Ion Selectivity of Nanofiltration Membranes via Heterogeneous Charge Distribution. Environmental Science & Technology 2024; 58:22818-22828. [PMID: 39671316 DOI: 10.1021/acs.est.4c08841] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/15/2024]
Abstract
Nanofiltration technology holds significant potential for precisely separating monovalent and multivalent ions, such as lithium (Li) and magnesium (Mg) ions, during lithium extraction from salt lakes. This study bridges a crucial gap in understanding the impact of the membrane spatial charge distribution on ion-selective separation. We developed two types of mixed-charge membranes with similar pore sizes but distinct longitudinal and horizontal distributions of oppositely charged domains. The charge-mosaic membrane, synthesized and utilized for ion fractionation for the first time, achieved an exceptional water permeance of 15.4 LMH/bar and a Li/Mg selectivity of 108, outperforming the majority of published reports. Through comprehensive characterization, mathematical modeling, and machine learning methods, we provide evidence that the spatial charge distribution dominantly determines ion selectivity. The charge-mosaic structure excels by substantially promoting ion selectivity through locally enhanced Donnan effects while remaining unaffected by variations in feedwater concentration. Our findings not only demonstrate the applicability of charge-mosaic membranes to precise nanofiltration but also have profound implications for technologies demanding advanced ion selectivity, including those in the sustainable water treatment and energy storage industries.
Affiliation(s)
- Ruiqi Zheng
  - Fujian Key Laboratory of Coastal Pollution Prevention and Control, College of the Environment and Ecology, Xiamen University, Xiamen 361102, China
- Shuyi Xu
  - Fujian Key Laboratory of Coastal Pollution Prevention and Control, College of the Environment and Ecology, Xiamen University, Xiamen 361102, China
- Shifa Zhong
  - Department of Environmental Science, Institute of Eco-Chongming, School of Ecological and Environmental Sciences, East China Normal University, Shanghai 200241, China
- Xin Tong
  - State Key Laboratory of Pollution Control and Resource Reuse, School of Environmental Science and Engineering, Tongji University, Shanghai 200092, China
- Xin Yu
  - Fujian Key Laboratory of Coastal Pollution Prevention and Control, College of the Environment and Ecology, Xiamen University, Xiamen 361102, China
- Yangying Zhao
  - Fujian Key Laboratory of Coastal Pollution Prevention and Control, College of the Environment and Ecology, Xiamen University, Xiamen 361102, China
- Yongsheng Chen
  - School of Civil & Environmental Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
10
Xu Z, Ding Y, Han SC, Zhang C. Predicting the performance of lithium adsorption and recovery from unconventional water sources with machine learning. Water Research 2024; 266:122374. [PMID: 39260198 DOI: 10.1016/j.watres.2024.122374] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/11/2024] [Revised: 08/25/2024] [Accepted: 09/01/2024] [Indexed: 09/13/2024]
Abstract
Selective lithium (Li) recovery from unconventional water sources (UWS) (e.g., shale gas waters, geothermal brines, and rejected seawater desalination brines) using inorganic lithium-ion sieve (LIS) materials can address Li supply shortages and distribution issues. However, the development of high-performance LIS materials and the optimization of recovery-related operating parameters are hampered by the variety of production methods, intricate procedures, and experimental expenses. Machine learning (ML) techniques offer potential solutions for enhancing LIS material development. We collected literature data on Li adsorption, categorizing 16 parameters into adsorbent parameters, operating parameters, and solution components. Three tree-based algorithms, Random Forest (RF), Gradient Boosting Decision Trees (GBDT), and Extreme Gradient Boosting (XGBoost), were used to evaluate the impact of these parameters on lithium adsorption. The grouped random splitting method limited data leakage and mitigated overfitting. XGBoost demonstrated the best performance, with an R² of 0.98 and a root-mean-squared error (RMSE) of 1.72. The SHAP values highlighted that operating parameters were the most influential, followed by adsorbent parameters and coexisting ion concentrations. Therefore, focusing on optimizing operating parameters or making targeted improvements to LIS materials based on operating conditions will enhance LIS performance in UWS. These insights are crucial for optimizing Li adsorption processes and designing effective inorganic LIS materials.
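Grouped random splitting, mentioned in this abstract, keeps all records that share a group label on the same side of the train/test split, so that closely related measurements cannot leak from training into testing. The abstract does not say which grouping key the authors used; the sketch below assumes, purely for illustration, that records are grouped by the publication they were collected from. The function name and data are hypothetical.

```python
import random


def grouped_split(groups, test_fraction=0.2, seed=0):
    """Split record indices so that no group appears in both train and test.

    groups: one group label per record (here: an assumed source-paper ID).
    Returns (train_indices, test_indices).
    """
    unique_groups = sorted(set(groups))
    rng = random.Random(seed)          # local RNG keeps the split reproducible
    rng.shuffle(unique_groups)
    n_test = max(1, round(len(unique_groups) * test_fraction))
    held_out = set(unique_groups[:n_test])
    train_idx = [i for i, g in enumerate(groups) if g not in held_out]
    test_idx = [i for i, g in enumerate(groups) if g in held_out]
    return train_idx, test_idx


# Hypothetical data set: ten adsorption records drawn from five papers.
source_paper = ["A", "A", "A", "B", "B", "C", "D", "D", "E", "E"]
train_idx, test_idx = grouped_split(source_paper, test_fraction=0.4, seed=42)

train_groups = {source_paper[i] for i in train_idx}
test_groups = {source_paper[i] for i in test_idx}
print("train groups:", sorted(train_groups))
print("test groups: ", sorted(test_groups))
```

Because whole groups are held out, the test fraction is measured in groups rather than records, so the resulting record counts can deviate from the nominal fraction when group sizes are uneven.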
Affiliation(s)
- Ziyang Xu
  - CAS Key Laboratory of Urban Pollutant Conversion, Department of Environmental Science and Engineering, University of Science and Technology of China, Hefei, 230026, China
- Yihao Ding
  - School of Computing and Information Systems, The University of Melbourne, Melbourne, VIC 3010, Australia
- Soyeon Caren Han
  - School of Computing and Information Systems, The University of Melbourne, Melbourne, VIC 3010, Australia
- Changyong Zhang
  - CAS Key Laboratory of Urban Pollutant Conversion, Department of Environmental Science and Engineering, University of Science and Technology of China, Hefei, 230026, China
11
Meng L, Sheng A, Cao L, Li M, Zheng G, Li S, Chen J, Wu X, Shen Z, Wang L. Contribution assessment and accumulation prediction of heavy metals in wheat grain in a smelting-affected area using machine learning methods. The Science of the Total Environment 2024; 951:175461. [PMID: 39137845 DOI: 10.1016/j.scitotenv.2024.175461] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/03/2024] [Revised: 08/06/2024] [Accepted: 08/10/2024] [Indexed: 08/15/2024]
Abstract
Due to the diverse controlling factors and their uneven spatial distribution, especially atmospheric deposition from smelters, assessing and predicting the accumulation of heavy metals (HM) in crops across smelting-affected areas is challenging. In this study, by integrating HM influx from atmospheric deposition, a boosted regression tree model with an average R² > 0.8 was obtained to predict the accumulation of Pb, As, and Cd in wheat grain across a smelting region. Atmospheric deposition serves as the dominant factor influencing the accumulation of Pb (28.2 %) and As (31.2 %) in wheat grain, but shows a weak influence on Cd accumulation (12.1 %). The contents of available HM in soil affect HM accumulation in wheat grain more significantly than their total contents in soil, with relative importance rates for Pb (14.4 % > 8.2 %), As (30.9 % > 4.0 %), and Cd (55.0 % > 16.9 %), respectively. Marginal effect analysis illustrates that HM accumulation in wheat grain begins to intensify when Pb content in atmospheric dust reaches 5140 mg/kg and available Cd content in soil exceeds 1.15 mg/kg. The path analysis rationalizes the cascading effects of distances from study sites to smelting factories on HM accumulation in wheat grain via negatively influencing atmospheric HM deposition. The study provides data support and a theoretical basis for the sustainable development of the non-ferrous metal smelting industry, as well as for the restoration and risk management of HM-contaminated soils.
Affiliation(s)
- Lingkun Meng
  - School of Environmental Science and Engineering, Huazhong University of Science and Technology, Wuhan 430074, China
  - Hubei Key Laboratory of Multi-media Pollution Cooperative Control in Yangtze Basin, School of Environmental Science and Engineering, Huazhong University of Science and Technology, Wuhan 430074, China
- Anxu Sheng
  - School of Environmental Science and Engineering, Huazhong University of Science and Technology, Wuhan 430074, China
  - Hubei Key Laboratory of Multi-media Pollution Cooperative Control in Yangtze Basin, School of Environmental Science and Engineering, Huazhong University of Science and Technology, Wuhan 430074, China
- Liu Cao
  - Environmental Protection Agency of Jiyuan Production City Integration Demonstration Area, Jiyuan 459000, China
- Mingyue Li
  - School of Environmental Science and Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
- Gang Zheng
  - Nanoscale Organisation and Dynamics Group, School of Science, Western Sydney University, Penrith, NSW 2751, Australia
- Sen Li
  - School of Environmental Science and Engineering, Huazhong University of Science and Technology, Wuhan 430074, China
  - Hubei Key Laboratory of Multi-media Pollution Cooperative Control in Yangtze Basin, School of Environmental Science and Engineering, Huazhong University of Science and Technology, Wuhan 430074, China
- Jing Chen
  - School of Environmental Science and Engineering, Huazhong University of Science and Technology, Wuhan 430074, China
  - Hubei Key Laboratory of Multi-media Pollution Cooperative Control in Yangtze Basin, School of Environmental Science and Engineering, Huazhong University of Science and Technology, Wuhan 430074, China
- Xiaohui Wu
  - School of Environmental Science and Engineering, Huazhong University of Science and Technology, Wuhan 430074, China
  - Hubei Key Laboratory of Multi-media Pollution Cooperative Control in Yangtze Basin, School of Environmental Science and Engineering, Huazhong University of Science and Technology, Wuhan 430074, China
- Zhemin Shen
  - School of Environmental Science and Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
- Linling Wang
  - School of Environmental Science and Engineering, Huazhong University of Science and Technology, Wuhan 430074, China
  - Hubei Key Laboratory of Multi-media Pollution Cooperative Control in Yangtze Basin, School of Environmental Science and Engineering, Huazhong University of Science and Technology, Wuhan 430074, China
12
|
White GM, Siegel AP, Tovar A. Optimizing Thermoplastic Starch Film with Heteroscedastic Gaussian Processes in Bayesian Experimental Design Framework. MATERIALS (BASEL, SWITZERLAND) 2024; 17:5345. [PMID: 39517615 PMCID: PMC11547296 DOI: 10.3390/ma17215345] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/11/2024] [Revised: 10/26/2024] [Accepted: 10/27/2024] [Indexed: 11/16/2024]
Abstract
The development of thermoplastic starch (TPS) films is crucial for fabricating sustainable and compostable plastics with desirable mechanical properties. However, traditional design of experiments (DOE) methods used in TPS development are often inefficient. They require extensive time and resources while frequently failing to identify optimal material formulations. As an alternative, adaptive experimental design methods based on Bayesian optimization (BO) principles have been recently proposed to streamline material development by iteratively refining experiments based on prior results. However, most implementations are not suited to manage the heteroscedastic noise inherently present in physical experiments. This work introduces a heteroscedastic Gaussian process (HGP) model within the BO framework to account for varying levels of uncertainty in the data, improve the accuracy of the predictions, and increase the overall experimental efficiency. The aim is to find the optimal TPS film composition that maximizes its elongation at break and tensile strength. To demonstrate the effectiveness of this approach, TPS films were prepared by mixing potato starch, distilled water, glycerol as a plasticizer, and acetic acid as a catalyst. After gelation, the mixture was degassed via centrifugation and molded into films, which were dried at room temperature. Tensile tests were conducted according to ASTM D638 standards. After five iterations and 30 experiments, the films containing 4.5 wt% plasticizer and 2.0 wt% starch exhibited the highest elongation at break (M = 96.7%, SD = 5.6%), while the films with 0.5 wt% plasticizer and 7.0 wt% starch demonstrated the highest tensile strength (M = 2.77 MPa, SD = 1.54 MPa). These results demonstrate the potential of the HGP model within a BO framework to improve material development efficiency and performance in TPS film and other potential material formulations.
Affiliation(s)
- Gracie M. White
- Luddy School of Informatics, Computing, and Engineering, Integrative Nanosystems Development Institute (INDI), Indiana University Indianapolis, Indianapolis, IN 46202, USA
- Amanda P. Siegel
- Department of Chemistry and Chemical Biology, Indiana University Indianapolis, Indianapolis, IN 46202, USA
- Andres Tovar
- School of Mechanical Engineering, Purdue University, Indianapolis, IN 46202, USA
13
|
Liu H, Yin H, Luo Z, Wang X. Integrating chemistry knowledge in large language models via prompt engineering. Synth Syst Biotechnol 2024; 10:23-38. [PMID: 39206087 PMCID: PMC11350497 DOI: 10.1016/j.synbio.2024.07.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/08/2024] [Revised: 07/08/2024] [Accepted: 07/20/2024] [Indexed: 09/04/2024]
Abstract
This paper presents a study on the integration of domain-specific knowledge in prompt engineering to enhance the performance of large language models (LLMs) in scientific domains. The proposed domain-knowledge embedded prompt engineering method outperforms traditional prompt engineering strategies on various metrics, including capability, accuracy, F1 score, and hallucination drop. The effectiveness of the method is demonstrated through case studies on complex materials including the MacMillan catalyst, paclitaxel, and lithium cobalt oxide. The results suggest that domain-knowledge prompts can guide LLMs to generate more accurate and relevant responses, highlighting the potential of LLMs as powerful tools for scientific discovery and innovation when equipped with domain-specific prompts. The study also discusses limitations and future directions for domain-specific prompt engineering development.
Affiliation(s)
- Hongxuan Liu
- Department of Chemical Engineering, Tsinghua University, Beijing, 100084, China
- Haoyu Yin
- Department of Chemical Engineering, Tsinghua University, Beijing, 100084, China
- Zhiyao Luo
- Institute of Biomedical Engineering, Department of Engineering Science, University of Oxford, Old Road Campus Research Building, Headington, Oxford, OX3 7DQ, United Kingdom
- Xiaonan Wang
- Department of Chemical Engineering, Tsinghua University, Beijing, 100084, China
- Key Laboratory for Industrial Biocatalysis, Ministry of Education, Tsinghua University, Beijing, 100084, China
14
|
Huang Y, Zhong S, Gan L, Chen Y. Development of Machine Learning Models for Ion-Selective Electrode Cation Sensor Design. ACS ES&T ENGINEERING 2024; 4:1702-1711. [PMID: 39021402 PMCID: PMC11250033 DOI: 10.1021/acsestengg.4c00087] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/19/2024] [Revised: 03/15/2024] [Accepted: 03/15/2024] [Indexed: 07/20/2024]
Abstract
Polyvinyl chloride (PVC) membrane-based ion-selective electrode (ISE) sensors are common tools for water assessments, but their development relies on time-consuming and costly experimental investigations. To address this challenge, this study combines machine learning (ML), Morgan fingerprint, and Bayesian optimization technologies with experimental results to develop high-performance PVC-based ISE cation sensors. By using 1745 data sets collected from 20 years of literature, appropriate ML models are trained to enable accurate prediction and a deep understanding of the relationship between ISE components and sensor performance (R² = 0.75). Rapid ionophore screening is achieved using the Morgan fingerprint based on atomic groups derived from ML model interpretation. Bayesian optimization is then applied to identify optimal combinations of ISE materials with the potential to deliver desirable ISE sensor performance. Na+, Mg2+, and Al3+ sensors fabricated from Bayesian optimization results exhibit excellent Nernst slopes with less than 8.2% deviation from the ideal value and superb detection limits at the 10⁻⁷ M level based on experimental validation results. This approach can potentially transform sensor development into a more time-efficient, cost-effective, and rational design process, guided by ML-based techniques.
Affiliation(s)
- Yuankai Huang
- School of Civil and Environmental Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
- Shifa Zhong
- School of Civil and Environmental Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
- Department of Environmental Science, School of Ecological and Environmental Sciences, East China Normal University, Shanghai 200241, China
- Lan Gan
- School of Civil and Environmental Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
- Yongsheng Chen
- School of Civil and Environmental Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
15
|
Wang R, He Z, Chen H, Guo S, Zhang S, Wang K, Wang M, Ho SH. Enhancing biomass conversion to bioenergy with machine learning: Gains and problems. THE SCIENCE OF THE TOTAL ENVIRONMENT 2024; 927:172310. [PMID: 38599406 DOI: 10.1016/j.scitotenv.2024.172310] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/18/2024] [Revised: 03/20/2024] [Accepted: 04/06/2024] [Indexed: 04/12/2024]
Abstract
Growing concerns about environmental sustainability and energy security, such as the exhaustion of traditional fossil fuels and global carbon footprint growth, have led to increasing interest in alternative energy sources, especially bioenergy. Recently, numerous scenarios have been proposed regarding the use of bioenergy from different sources in future energy systems. In this regard, one of the biggest challenges for scientists is the management, modeling, decision-making, and forecasting of bioenergy systems. The development of machine learning (ML) techniques can provide new opportunities for modeling, optimizing, and managing the production, consumption, and environmental effects of bioenergy. However, researchers in bioenergy fields have not widely adopted ML concepts and practices. Therefore, this paper presents a comparative review of the current ML techniques used for bioenergy production. The review summarizes the common issues and difficulties in integrating ML with bioenergy studies and proposes possible solutions. Additionally, appropriate ML application scenarios are discussed in detail for every sector of the bioenergy chain. The discussion indicates that modernized conversion processes supported by ML techniques are imperative for accurately capturing process-level subtleties and thus for improving the techno-economic resilience and socio-ecological integrity of bioenergy production. These efforts are expected to support sustainable bioenergy production with ML technologies in the future.
Affiliation(s)
- Rupeng Wang
- State Key Laboratory of Urban Water Resource and Environment, School of Environment, Harbin Institute of Technology, Harbin 150040, PR China
- Zixiang He
- State Key Laboratory of Urban Water Resource and Environment, School of Environment, Harbin Institute of Technology, Harbin 150040, PR China
- Honglin Chen
- State Key Laboratory of Urban Water Resource and Environment, School of Environment, Harbin Institute of Technology, Harbin 150040, PR China
- Silin Guo
- School of Medicine and Health, Harbin Institute of Technology, Harbin 150040, PR China
- Shiyu Zhang
- State Key Laboratory of Urban Water Resource and Environment, School of Environment, Harbin Institute of Technology, Harbin 150040, PR China
- Ke Wang
- State Key Laboratory of Urban Water Resource and Environment, School of Environment, Harbin Institute of Technology, Harbin 150040, PR China
- Meng Wang
- State Key Laboratory of Urban Water Resource and Environment, School of Environment, Harbin Institute of Technology, Harbin 150040, PR China
- Shih-Hsin Ho
- State Key Laboratory of Urban Water Resource and Environment, School of Environment, Harbin Institute of Technology, Harbin 150040, PR China
16
|
Zhu H, Szymczyk A, Ghoufi A. Multiscale modelling of transport in polymer-based reverse-osmosis/nanofiltration membranes: present and future. DISCOVER NANO 2024; 19:91. [PMID: 38771417 PMCID: PMC11109084 DOI: 10.1186/s11671-024-04020-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2023] [Accepted: 04/22/2024] [Indexed: 05/22/2024]
Abstract
Nanofiltration (NF) and reverse osmosis (RO) processes are physical separation technologies used to remove contaminants from liquid streams by employing dense polymer-based membranes with nanometric voids that confine fluids at the nanoscale. At this level, physical properties such as solvent and solute permeabilities are intricately linked to molecular interactions. Initially, numerous studies focused on developing macroscopic transport models to gain insights into separation properties at the nanometer scale. However, continuum-based models have limitations in nanoconfined situations that can be overcome by force field molecular simulations. Continuum-based models heavily rely on bulk properties, often neglecting critical factors like liquid structuring, pore geometry, and molecular/chemical specifics. Molecular/mesoscale simulations, while encompassing these details, often face limitations in time and spatial scales. Therefore, achieving a comprehensive understanding of transport requires a synergistic integration of both approaches through a multiscale approach that effectively combines and merges both scales. This review aims to provide a comprehensive overview of the state-of-the-art in multiscale modeling of transport through NF/RO membranes, spanning from the nanoscale to continuum media.
Affiliation(s)
- Haochen Zhu
- State Key Laboratory of Pollution Control and Resources Reuse, Key Laboratory of Yangtze River Water Environment, College of Environmental Science and Engineering, Tongji University, 1239 Siping Rd., Shanghai, 200092, China
- Anthony Szymczyk
- CNRS, ISCR (Institut des Sciences Chimiques de Rennes) - UMR 6226, Univ Rennes, 35000, Rennes, France
- Aziz Ghoufi
- CNRS, ICMPE (Institut de Chimie et des Matériaux Paris-Est) - UMR 7182, Univ Paris-East Creteil, 94320, Thiais, France
- CNRS, IPR (Institut de Physique de Rennes) - UMR 6251, Univ Rennes, 35000, Rennes, France
17
|
Yang Q, Fan L, Hao E, Hou X, Deng J, Xia Z, Du Z. Construction of An Oral Bioavailability Prediction Model Based on Machine Learning for Evaluating Molecular Modifications. J Pharm Sci 2024; 113:1155-1167. [PMID: 38430955 DOI: 10.1016/j.xphs.2024.02.026] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2023] [Revised: 02/26/2024] [Accepted: 02/26/2024] [Indexed: 03/05/2024]
Abstract
OBJECTIVE This study aims to explore the impact of ADME on the oral bioavailability (OB) of drugs and to construct a machine learning model for OB prediction. The model is then applied to predict the OB of modified berberine and atenolol molecules to obtain structures with higher OB. METHODS Initially, a drug OB database was established, and corresponding ADME characteristics were obtained. The relationship between ADME and OB was analyzed using machine learning, with Morgan fingerprints serving as molecular descriptors. Compounds from the database were input into Random Forest, XGBoost, CatBoost, and LightGBM machine learning models to train the OB prediction model and evaluate its performance. Subsequently, berberine and atenolol were modified using Chemdraw software with ten different substituents for mono-substitution, and chlorine atoms for a full range of double substitutions. The modified molecular structures were converted into the same format as the training set for OB prediction. The predicted OB values of the modified structures of berberine and atenolol were compared. RESULTS An OB database of 386 drugs was obtained. It was found that smaller molecular weight and a higher number of rotatable bonds (ten or fewer) could potentially lead to higher OB. The four machine learning models were evaluated using MSE, R² score, MAE, and MFE as metrics, with Random Forest performing the best. The models' predictions for the test set were particularly accurate when OB ranged from 30% to 90%. After mono-substitution and double substitution of berberine and atenolol, the OB of both drugs was significantly improved. CONCLUSIONS This study found that some ADME properties of molecules do not have an absolute impact on OB. The database played a decisive role in the process of the machine learning OB prediction model, and the performance of the model was evaluated based on predictions within a range of strong generalization ability. In most cases, mono-substitution and double substitution were beneficial for enhancing the OB of berberine and atenolol. In summary, this study successfully constructed a machine learning regression prediction model that can accurately predict drug OB, which can guide drug design to achieve higher OB to some extent.
Affiliation(s)
- Qi Yang
- School of Pharmacy, Guangxi University of Chinese Medicine, Nanning 530200, China
- Lili Fan
- School of Pharmacy, Guangxi University of Chinese Medicine, Nanning 530200, China
- Erwei Hao
- Guangxi Key Laboratory of Efficacy Study on Chinese Materia Medica, Guangxi University of Chinese Medicine, Nanning 530200, China; Guangxi Collaborative Innovation Center for Research on Functional Ingredients of Agricultural Residues, Guangxi University of Chinese Medicine, Nanning 530200, China; Guangxi Key Laboratory of Traditional Chinese Medicine Formulas Theory and Transformation for Damp Diseases, Guangxi University of Chinese Medicine, Nanning 530200, China
- Xiaotao Hou
- Guangxi Key Laboratory of Efficacy Study on Chinese Materia Medica, Guangxi University of Chinese Medicine, Nanning 530200, China; Guangxi Collaborative Innovation Center for Research on Functional Ingredients of Agricultural Residues, Guangxi University of Chinese Medicine, Nanning 530200, China; Guangxi Key Laboratory of Traditional Chinese Medicine Formulas Theory and Transformation for Damp Diseases, Guangxi University of Chinese Medicine, Nanning 530200, China
- Jiagang Deng
- Guangxi Key Laboratory of Efficacy Study on Chinese Materia Medica, Guangxi University of Chinese Medicine, Nanning 530200, China; Guangxi Collaborative Innovation Center for Research on Functional Ingredients of Agricultural Residues, Guangxi University of Chinese Medicine, Nanning 530200, China; Guangxi Key Laboratory of Traditional Chinese Medicine Formulas Theory and Transformation for Damp Diseases, Guangxi University of Chinese Medicine, Nanning 530200, China
- Zhongshang Xia
- Guangxi Key Laboratory of Efficacy Study on Chinese Materia Medica, Guangxi University of Chinese Medicine, Nanning 530200, China; Guangxi Collaborative Innovation Center for Research on Functional Ingredients of Agricultural Residues, Guangxi University of Chinese Medicine, Nanning 530200, China; Guangxi Key Laboratory of Traditional Chinese Medicine Formulas Theory and Transformation for Damp Diseases, Guangxi University of Chinese Medicine, Nanning 530200, China
- Zhengcai Du
- Guangxi Key Laboratory of Efficacy Study on Chinese Materia Medica, Guangxi University of Chinese Medicine, Nanning 530200, China; Guangxi Collaborative Innovation Center for Research on Functional Ingredients of Agricultural Residues, Guangxi University of Chinese Medicine, Nanning 530200, China; Guangxi Key Laboratory of Traditional Chinese Medicine Formulas Theory and Transformation for Damp Diseases, Guangxi University of Chinese Medicine, Nanning 530200, China; Guangxi Scientific Research Center of Traditional Chinese Medicine, Guangxi University of Chinese Medicine, Nanning 530200, China
18
|
Yuan X, Suvarna M, Lim JY, Pérez-Ramírez J, Wang X, Ok YS. Active Learning-Based Guided Synthesis of Engineered Biochar for CO2 Capture. ENVIRONMENTAL SCIENCE & TECHNOLOGY 2024; 58:6628-6636. [PMID: 38497595 PMCID: PMC11025117 DOI: 10.1021/acs.est.3c10922] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/25/2023] [Revised: 02/21/2024] [Accepted: 02/22/2024] [Indexed: 03/19/2024]
Abstract
Biomass waste-derived engineered biochar for CO2 capture presents a viable route for climate change mitigation and sustainable waste management. However, optimally synthesizing them for enhanced performance is time- and labor-intensive. To address these issues, we devise an active learning strategy to guide and expedite their synthesis with improved CO2 adsorption capacities. Our framework learns from experimental data and recommends optimal synthesis parameters, aiming to maximize the narrow micropore volume of engineered biochar, which exhibits a linear correlation with its CO2 adsorption capacity. We experimentally validate the active learning predictions, and these data are iteratively leveraged for subsequent model training and revalidation, thereby establishing a closed loop. Over three active learning cycles, we synthesized 16 property-specific engineered biochar samples such that the CO2 uptake nearly doubled by the final round. We demonstrate a data-driven workflow to accelerate the development of high-performance engineered biochar with enhanced CO2 uptake and broader applications as a functional material.
Affiliation(s)
- Xiangzhou Yuan
- Ministry of Education of Key Laboratory of Energy Thermal Conversion and Control, School of Energy and Environment, Southeast University, Nanjing 210096, China
- Korea Biochar Research Center, APRU Sustainable Waste Management Program & Division of Environmental Science and Ecological Engineering, Korea University, Seoul 02841, Republic of Korea
- Manu Suvarna
- Institute for Chemical and Bioengineering, Department of Chemistry and Applied Biosciences, ETH Zurich, Vladimir-Prelog-Weg 1, 8093 Zurich, Switzerland
- Juin Yau Lim
- Korea Biochar Research Center, APRU Sustainable Waste Management Program & Division of Environmental Science and Ecological Engineering, Korea University, Seoul 02841, Republic of Korea
- Javier Pérez-Ramírez
- Institute for Chemical and Bioengineering, Department of Chemistry and Applied Biosciences, ETH Zurich, Vladimir-Prelog-Weg 1, 8093 Zurich, Switzerland
- Xiaonan Wang
- Department of Chemical Engineering, Tsinghua University, Beijing 100084, China
- Yong Sik Ok
- Korea Biochar Research Center, APRU Sustainable Waste Management Program & Division of Environmental Science and Ecological Engineering, Korea University, Seoul 02841, Republic of Korea
19
|
Usman J, Abba SI, Baig N, Abu-Zahra N, Hasan SW, Aljundi IH. Design and Machine Learning Prediction of In Situ Grown PDA-Stabilized MOF (UiO-66-NH2) Membrane for Low-Pressure Separation of Emulsified Oily Wastewater. ACS APPLIED MATERIALS & INTERFACES 2024; 16:16271-16289. [PMID: 38514254 DOI: 10.1021/acsami.4c00752] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/23/2024]
Abstract
Significant progress has been made in designing advanced membranes; however, persistent challenges remain due to their reduced permeation rates and a propensity for substantial fouling. These factors continue to pose significant barriers to the effective utilization of membranes in the separation of oil-in-water emulsions. Metal-organic frameworks (MOFs) are considered promising materials for such applications; however, they encounter three key challenges when applied to the separation of oil from water: (a) lack of water stability; (b) difficulty in producing defect-free membranes; and (c) unresolved issue of stabilizing the MOF separating layer on the ceramic membrane (CM) support. In this study, a defect-free hydrolytically stable zirconium-based MOF separating layer was formed through a two-step method: first, by in situ growth of UiO-66-NH2 MOF into the voids of polydopamine (PDA)-functionalized CM during the solvothermal process, and then by facilitating the self-assembly of UiO-66-NH2 with PDA using a pressurized dead-end assembly. A stable MOF separating layer was attained by enriching the ceramic support with amines and hydroxyl groups using PDA, which assisted in the assembly and stabilization of UiO-66-NH2. The PDA-s-UiO-66-NH2-CM membrane displayed air superhydrophilicity and underwater superoleophobicity, demonstrating its oil resistance and high antifouling behavior. The PDA-s-UiO-66-NH2-CM membrane has shown exceptionally high permeability and separation capacity for challenging oil-in-water emulsions. This is attributed to numerous nanochannels from the membrane and its high resistance to oil adhesion. The membranes showed excellent stability over 15 continuous test cycles, which indicates that the developed MOFs separating layers have a low tendency to be clogged by oil droplets during separation. 
Machine learning-based Gaussian process regression (GPR) models, as nonparametric kernel-based probabilistic models, were employed to predict the performance efficiency of the PDA-s-UiO-66-NH2-CM membrane in oil-in-water separation. The outcomes were compared with the support vector machine (SVM) and decision tree (DT) algorithms. This efficiency includes various metrics related to its separation accuracy, and the models were developed through feature engineering to identify and utilize the most significant factors affecting the membrane's performance. The results proved the reliability of GPR optimization, which had the highest prediction accuracy in the validation phase. The average percentage increase of the GPR model compared to the SVM and DT models was 6.11% and 42.94%, respectively.
Affiliation(s)
- Jamilu Usman
- Interdisciplinary Research Centre for Membranes and Water Security (IRC-MWS), King Fahd University of Petroleum and Minerals, Dhahran 31261, Saudi Arabia
- Sani I Abba
- Interdisciplinary Research Centre for Membranes and Water Security (IRC-MWS), King Fahd University of Petroleum and Minerals, Dhahran 31261, Saudi Arabia
- Nadeem Baig
- Interdisciplinary Research Centre for Membranes and Water Security (IRC-MWS), King Fahd University of Petroleum and Minerals, Dhahran 31261, Saudi Arabia
- Nidal Abu-Zahra
- Materials Science and Engineering Department, University of Wisconsin-Milwaukee, 3200 North Cramer Street, Milwaukee, Wisconsin 53201, United States
- Shadi W Hasan
- Center for Membranes and Advanced Water Technology (CMAT), Department of Chemical and Petroleum Engineering, Khalifa University of Science and Technology, P.O. Box 127788 Abu Dhabi, United Arab Emirates
- Isam H Aljundi
- Interdisciplinary Research Centre for Membranes and Water Security (IRC-MWS), King Fahd University of Petroleum and Minerals, Dhahran 31261, Saudi Arabia
- Chemical Engineering Department, King Fahd University of Petroleum and Minerals, Dhahran 31261, Saudi Arabia
20
|
Wang H, Zeng J, Dai R, Wang Z. Understanding Rejection Mechanisms of Trace Organic Contaminants by Polyamide Membranes via Data-Knowledge Codriven Machine Learning. ENVIRONMENTAL SCIENCE & TECHNOLOGY 2024; 58:5878-5888. [PMID: 38498471 DOI: 10.1021/acs.est.3c08523] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/20/2024]
Abstract
Data-driven machine learning (ML) provides a promising approach to understanding and predicting the rejection of trace organic contaminants (TrOCs) by polyamide (PA). However, various confounding variables, coupled with data scarcity, restrict the direct application of data-driven ML. In this study, we developed a data-knowledge codriven ML model via domain-knowledge embedding and explored its application in comprehending TrOC rejection by PA membranes. Domain-knowledge embedding enhanced both the predictive performance and the interpretability of the ML model. The contribution of key mechanisms, including size exclusion, charge effect, hydrophobic interaction, etc., that dominate the rejections of the three TrOC categories (neutral hydrophilic, neutral hydrophobic, and charged TrOCs) was quantified. Log D and molecular charge emerge as key factors contributing to the discernible variations in the rejection among the three TrOC categories. Furthermore, we quantitatively compared the TrOC rejection mechanisms between nanofiltration (NF) and reverse osmosis (RO) PA membranes. The charge effect and hydrophobic interactions possessed higher weights for NF to reject TrOCs, while the size exclusion in RO played a more important role. This study demonstrated the effectiveness of the data-knowledge codriven ML method in understanding TrOC rejection by PA membranes, providing a methodology to formulate a strategy for targeted TrOC removal.
Affiliation(s)
- Hejia Wang
- State Key Laboratory of Pollution Control and Resource Reuse, Shanghai Institute of Pollution Control and Ecological Security, School of Environmental Science and Engineering, Tongji University, Shanghai 200092, China
- Jin Zeng
- School of Software Engineering, Tongji University, Shanghai 201804, China
- Ruobin Dai
- State Key Laboratory of Pollution Control and Resource Reuse, Shanghai Institute of Pollution Control and Ecological Security, School of Environmental Science and Engineering, Tongji University, Shanghai 200092, China
- Zhiwei Wang
- State Key Laboratory of Pollution Control and Resource Reuse, Shanghai Institute of Pollution Control and Ecological Security, School of Environmental Science and Engineering, Tongji University, Shanghai 200092, China
21
|
Jin L, Li X. Materials Science in the Quest for Sustainability. ACS ENVIRONMENTAL AU 2024; 4:54-55. [PMID: 38525018 PMCID: PMC10958651 DOI: 10.1021/acsenvironau.4c00014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/27/2024] [Indexed: 03/26/2024]
Affiliation(s)
- Ling Jin
- Department of Civil and Environmental Engineering and Department of Health Technology and Informatics, The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong, China
- Department of Civil and Environmental Engineering, The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong, China
22
|
Cao Z, Barati Farimani O, Ock J, Barati Farimani A. Machine Learning in Membrane Design: From Property Prediction to AI-Guided Optimization. NANO LETTERS 2024; 24:2953-2960. [PMID: 38436240 PMCID: PMC10941251 DOI: 10.1021/acs.nanolett.3c05137] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/29/2023] [Revised: 02/26/2024] [Accepted: 02/27/2024] [Indexed: 03/05/2024]
Abstract
Porous membranes, either polymeric or two-dimensional materials, have been extensively studied because of their outstanding performance in many applications such as water filtration. Recently, inspired by the significant success of machine learning (ML) in many areas of scientific discovery, researchers have started to tackle the problem in the field of membrane design using data-driven ML tools. In this Mini Review, we summarize research efforts on three types of applications of machine learning in membrane design, including (1) membrane property prediction using ML, (2) gaining physical insight and drawing quantitative relationships between membrane properties and performance using explainable artificial intelligence, and (3) ML-guided design, optimization, or virtual screening of membranes. On top of the review of previous research, we discuss the challenges associated with applying ML for membrane design and potential future directions.
Affiliation(s)
- Zhonglin Cao
- Department of Mechanical Engineering, Carnegie Mellon University, Pittsburgh Pennsylvania 15213, United States
- Omid Barati Farimani
- Department of Mechanical Engineering, Carnegie Mellon University, Pittsburgh Pennsylvania 15213, United States
- Janghoon Ock
- Department of Chemical Engineering, Carnegie Mellon University, Pittsburgh Pennsylvania 15213, United States
- Amir Barati Farimani
- Department of Mechanical Engineering, Carnegie Mellon University, Pittsburgh Pennsylvania 15213, United States
- Department of Chemical Engineering, Carnegie Mellon University, Pittsburgh Pennsylvania 15213, United States
- Machine Learning Department, Carnegie Mellon University, Pittsburgh Pennsylvania 15213, United States

23
Jiang BN, Zhang YY, Zhang ZY, Yang YL, Song HL. Tree-structured parzen estimator optimized-automated machine learning assisted by meta-analysis for predicting biochar-driven N2O mitigation effect in constructed wetlands. JOURNAL OF ENVIRONMENTAL MANAGEMENT 2024; 354:120335. [PMID: 38368804 DOI: 10.1016/j.jenvman.2024.120335] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/30/2023] [Revised: 01/29/2024] [Accepted: 02/08/2024] [Indexed: 02/20/2024]
Abstract
Biochar is a carbon-neutral tool for combating climate change. Artificial intelligence applications to estimate the biochar mitigation effect on greenhouse gases (GHGs) can assist scientists in making more informed solutions. However, there is also evidence indicating that biochar promotes, rather than reduces, N2O emissions. Thus, the effect of biochar on N2O remains uncertain in constructed wetlands (CWs), and there is not a characterization metric for this effect, which increases the difficulty and inaccuracy of biochar-driven alleviation effect projections. Here, we provide new insight by utilizing machine learning-based, tree-structured Parzen Estimator (TPE) optimization assisted by a meta-analysis to estimate the potency of biochar-driven N2O mitigation. We first synthesized datasets that contained 80 studies on global biochar-amended CWs. The mitigation effect size was then calculated and further introduced as a new metric. TPE optimization was then applied to automatically tune the hyperparameters of the built extreme gradient boosting (XGBoost) and random forest (RF), and the optimum TPE-XGBoost obtained adequately achieved a satisfactory prediction accuracy for N2O flux (R2 = 91.90%, RPD = 3.57) and the effect size (R2 = 92.61%, RPD = 3.59). Results indicated that a high influent chemical oxygen demand/total nitrogen (COD/TN) ratio and the COD removal efficiency interpreted by the Shapley value significantly enhanced the effect size contribution. COD/TN ratio made the most and the second greatest positive contributions among 22 input variables to N2O flux and to the effect size that were up to 18% and 14%, respectively. By combining with a structural equation model analysis, NH4+-N removal rate had significant negative direct effects on the N2O flux. This study implied that the application of granulated biochar derived from C-rich feedstocks would maximize the net climate benefit of N2O mitigation driven by biochar for future biochar-based CWs.
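For readers comparing the R2 and RPD figures quoted in the abstract above, the following is a minimal sketch of how such metrics are commonly computed. It assumes RPD is the ratio of the sample standard deviation of the observations to the RMSE, which is a common definition that the abstract itself does not spell out. The function name and the toy data are illustrative and are not taken from the study.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R2, RMSE, and RPD for paired observations and predictions.

    Assumption: RPD = sample standard deviation of y_true / RMSE.
    """
    n = len(y_true)
    mean_y = sum(y_true) / n
    # Residual and total sums of squares.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    rmse = math.sqrt(ss_res / n)
    sd = math.sqrt(ss_tot / (n - 1))  # sample standard deviation
    return {"r2": 1.0 - ss_res / ss_tot, "rmse": rmse, "rpd": sd / rmse}


# Toy values, purely illustrative.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.1, 1.9, 3.2, 3.8, 5.0]
metrics = regression_metrics(observed, predicted)
print(metrics)
```

Under this definition RPD grows as R2 approaches 1, so the two metrics carry overlapping information and are usually reported together rather than as independent evidence.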
Affiliation(s)
- Bi-Ni Jiang
- School of Environment, Nanjing Normal University, Jiangsu Province Engineering Research Center of Environmental Risk Prevention and Emergency Response Technology, Jiangsu Engineering Lab of Water and Soil Eco-remediation, Wenyuan Road 1, Nanjing 210023, China; Institute of Agricultural Resources and Environment, Jiangsu Academy of Agricultural Sciences, Ministry of Agriculture and Rural Affairs, Liuhe Observation and Experimental Station of National Agro-Environment, Nanjing, 210014, China
- Ying-Ying Zhang
- Institute of Agricultural Resources and Environment, Jiangsu Academy of Agricultural Sciences, Ministry of Agriculture and Rural Affairs, Liuhe Observation and Experimental Station of National Agro-Environment, Nanjing, 210014, China
- Zhi-Yong Zhang
- Institute of Agricultural Resources and Environment, Jiangsu Academy of Agricultural Sciences, Ministry of Agriculture and Rural Affairs, Liuhe Observation and Experimental Station of National Agro-Environment, Nanjing, 210014, China.
- Yu-Li Yang
- School of Environment, Nanjing Normal University, Jiangsu Province Engineering Research Center of Environmental Risk Prevention and Emergency Response Technology, Jiangsu Engineering Lab of Water and Soil Eco-remediation, Wenyuan Road 1, Nanjing 210023, China
- Hai-Liang Song
- School of Environment, Nanjing Normal University, Jiangsu Province Engineering Research Center of Environmental Risk Prevention and Emergency Response Technology, Jiangsu Engineering Lab of Water and Soil Eco-remediation, Wenyuan Road 1, Nanjing 210023, China.

24
Ye G, Wan J, Deng Z, Wang Y, Chen J, Zhu B, Ji S. Prediction of effluent total nitrogen and energy consumption in wastewater treatment plants: Bayesian optimization machine learning methods. BIORESOURCE TECHNOLOGY 2024; 395:130361. [PMID: 38286171 DOI: 10.1016/j.biortech.2024.130361] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/02/2023] [Revised: 01/18/2024] [Accepted: 01/18/2024] [Indexed: 01/31/2024]
Abstract
The control of effluent total nitrogen (TN) and total energy consumption (TEC) is a key issue in managing wastewater treatment plants. In this study, effluent TN and TEC predictive models were established by selecting influent water quality and process control indicators as input features. The prediction performance of machine learning methods under different random seeds was explored, the moving average method was used for data amplification, and the Bayesian algorithm was used for hyperparameter optimization. The results showed that compared with the traditional hyperparameter optimization method for effluent TN prediction, the coefficient of determination (R2) increased by 0.092 and 0.067, reaching 0.725, and the root mean square error (RMSE) decreased by 0.262 and 0.215 mg/L, reaching 1.673 mg/L, respectively, after Bayesian optimization and data amplification. During TEC prediction, R2 increased by 0.068 and 0.042, reaching 0.884, and the RMSE decreased by 232.444 and 197.065 kWh, reaching 1305.829 kWh, respectively.
Affiliation(s)
- Gang Ye
- College of Environment and Energy, South China University of Technology, Guangzhou 510006, China
- Jinquan Wan
- College of Environment and Energy, South China University of Technology, Guangzhou 510006, China.
- Zhicheng Deng
- College of Environment and Energy, South China University of Technology, Guangzhou 510006, China
- Yan Wang
- College of Environment and Energy, South China University of Technology, Guangzhou 510006, China
- Jian Chen
- College of Environment and Energy, South China University of Technology, Guangzhou 510006, China
- Bin Zhu
- Guangdong Shunkong Zihua Technology Co, Ltd, Foshan 528300, China
- Shiming Ji
- Guangdong Shunkong Zihua Technology Co, Ltd, Foshan 528300, China

25
Yang K, Liu L, Wen Y. The impact of Bayesian optimization on feature selection. Sci Rep 2024; 14:3948. [PMID: 38366092 PMCID: PMC10873405 DOI: 10.1038/s41598-024-54515-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/15/2023] [Accepted: 02/13/2024] [Indexed: 02/18/2024] Open
Abstract
Feature selection is an indispensable step for the analysis of high-dimensional molecular data. Despite its importance, consensus is lacking on how to choose the most appropriate feature selection methods, especially when the performance of the feature selection methods itself depends on hyper-parameters. Bayesian optimization has demonstrated its advantages in automatically configuring the settings of hyper-parameters for various models. However, it remains unclear whether Bayesian optimization can benefit feature selection methods. In this research, we conducted extensive simulation studies to compare the performance of various feature selection methods, with a particular focus on the impact of Bayesian optimization on those where hyper-parameters tuning is needed. We further utilized the gene expression data obtained from the Alzheimer's Disease Neuroimaging Initiative to predict various brain imaging-related phenotypes, where various feature selection methods were employed to mine the data. We found through simulation studies that feature selection methods with hyper-parameters tuned using Bayesian optimization often yield better recall rates, and the analysis of transcriptomic data further revealed that Bayesian optimization-guided feature selection can improve the accuracy of disease risk prediction models. In conclusion, Bayesian optimization can facilitate feature selection methods when hyper-parameter tuning is needed and has the potential to substantially benefit downstream tasks.
Affiliation(s)
- Kaixin Yang
- Department of Health Statistics, School of Public Health, Shanxi Medical University, No 56 Xinjian South Road, Yingze District, Taiyuan, Shanxi, China
- Long Liu
- Department of Health Statistics, School of Public Health, Shanxi Medical University, No 56 Xinjian South Road, Yingze District, Taiyuan, Shanxi, China.
- Yalu Wen
- Department of Statistics, University of Auckland, 38 Princes Street, Auckland Central, Auckland, 1010, New Zealand.

26
Mohammadrezaei D, Podina L, Silva JD, Kohandel M. Cell viability prediction and optimization in extrusion-based bioprinting via neural network-based Bayesian optimization models. Biofabrication 2024; 16:025016. [PMID: 38128119 DOI: 10.1088/1758-5090/ad17cf] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/27/2023] [Accepted: 12/21/2023] [Indexed: 12/23/2023]
Abstract
The fields of regenerative medicine and cancer modeling have witnessed tremendous growth in the application of 3D bioprinting. Maintaining high cell viability throughout the bioprinting process is crucial for the success of this technology, as it directly affects the accuracy of the 3D bioprinted models, the validity of experimental results, and the discovery of new therapeutic approaches. Therefore, optimizing bioprinting conditions, which include numerous variables influencing cell viability during and after the procedure, is of utmost importance to achieve desirable results. So far, these optimizations have been accomplished primarily through trial and error and repeating multiple time-consuming and costly experiments. To address this challenge, we initiated the process by creating a dataset of these parameters for gelatin and alginate-based bioinks and the corresponding cell viability by integrating data obtained in our laboratory and those derived from the literature. Then, we developed machine learning models to predict cell viability based on different bioprinting variables. The trained neural network yielded a regression R2 value of 0.71 and classification accuracy of 0.86. Compared to models that have been developed so far, the performance of our models is superior and shows great prediction results. The study further introduces a novel optimization strategy that employs the Bayesian optimization model in combination with the developed regression neural network to determine the optimal combination of the selected bioprinting parameters to maximize cell viability and eliminate trial-and-error experiments. Finally, we experimentally validated the optimization model's performance.
Affiliation(s)
- Dorsa Mohammadrezaei
- Department of Applied Mathematics, University of Waterloo, Waterloo, Ontario, Canada
- Lena Podina
- Cheriton School of Computer Science, University of Waterloo, Waterloo, Ontario, Canada
- Johanna De Silva
- Department of Applied Mathematics, University of Waterloo, Waterloo, Ontario, Canada
- Mohammad Kohandel
- Department of Applied Mathematics, University of Waterloo, Waterloo, Ontario, Canada

27
Wang Z, Chen A, Tao K, Han Y, Li J. MatGPT: A Vane of Materials Informatics from Past, Present, to Future. ADVANCED MATERIALS (DEERFIELD BEACH, FLA.) 2024; 36:e2306733. [PMID: 37813548 DOI: 10.1002/adma.202306733] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/10/2023] [Revised: 09/05/2023] [Indexed: 10/17/2023]
Abstract
Combining materials science, artificial intelligence (AI), physical chemistry, and other disciplines, materials informatics is continuously accelerating the vigorous development of new materials. The emergence of "GPT (Generative Pre-trained Transformer) AI" shows that the scientific research field has entered the era of intelligent civilization with "data" as the basic factor and "algorithm + computing power" as the core productivity. The continuous innovation of AI will impact the cognitive laws and scientific methods, and reconstruct the knowledge and wisdom system. This leads to think more about materials informatics. Here, a comprehensive discussion of AI models and materials infrastructures is provided, and the advances in the discovery and design of new materials are reviewed. With the rise of new research paradigms triggered by "AI for Science", the vane of materials informatics: "MatGPT", is proposed and the technical path planning from the aspects of data, descriptors, generative models, pretraining models, directed design models, collaborative training, experimental robots, as well as the efforts and preparations needed to develop a new generation of materials informatics, is carried out. Finally, the challenges and constraints faced by materials informatics are discussed, in order to achieve a more digital, intelligent, and automated construction of materials informatics with the joint efforts of more interdisciplinary scientists.
Affiliation(s)
- Zhilong Wang
- National Key Laboratory of Science and Technology on Micro/Nano Fabrication, Shanghai Jiao Tong University, Shanghai, 200240, China
- Key Laboratory of Thin Film and Microfabrication of Ministry of Education, Department of Micro/Nano Electronics, Shanghai Jiao Tong University, Shanghai, 200240, China
- An Chen
- National Key Laboratory of Science and Technology on Micro/Nano Fabrication, Shanghai Jiao Tong University, Shanghai, 200240, China
- Key Laboratory of Thin Film and Microfabrication of Ministry of Education, Department of Micro/Nano Electronics, Shanghai Jiao Tong University, Shanghai, 200240, China
- Kehao Tao
- National Key Laboratory of Science and Technology on Micro/Nano Fabrication, Shanghai Jiao Tong University, Shanghai, 200240, China
- Key Laboratory of Thin Film and Microfabrication of Ministry of Education, Department of Micro/Nano Electronics, Shanghai Jiao Tong University, Shanghai, 200240, China
- Yanqiang Han
- National Key Laboratory of Science and Technology on Micro/Nano Fabrication, Shanghai Jiao Tong University, Shanghai, 200240, China
- Key Laboratory of Thin Film and Microfabrication of Ministry of Education, Department of Micro/Nano Electronics, Shanghai Jiao Tong University, Shanghai, 200240, China
- Jinjin Li
- National Key Laboratory of Science and Technology on Micro/Nano Fabrication, Shanghai Jiao Tong University, Shanghai, 200240, China
- Key Laboratory of Thin Film and Microfabrication of Ministry of Education, Department of Micro/Nano Electronics, Shanghai Jiao Tong University, Shanghai, 200240, China

28
Lei L, Zhang L, Han Z, Chen Q, Liao P, Wu D, Tai J, Xie B, Su Y. Advancing chronic toxicity risk assessment in freshwater ecology by molecular characterization-based machine learning. ENVIRONMENTAL POLLUTION (BARKING, ESSEX : 1987) 2024; 342:123093. [PMID: 38072027 DOI: 10.1016/j.envpol.2023.123093] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/04/2023] [Revised: 11/30/2023] [Accepted: 12/02/2023] [Indexed: 01/26/2024]
Abstract
The continuously increased production of various chemicals and their release into environments have raised potential negative effects on ecological health. However, traditional labor-intensive assessment methods cannot effectively and rapidly evaluate these hazards, especially for chronic risk. In this study, machine learning (ML) was employed to construct quantitative structure-activity relationship (QSAR) models, enabling the prediction of chronic toxicity to aquatic organisms by leveraging the molecular characteristics of pollutants, namely, the molecular descriptors, fingerprints, and graphs. The limited dataset size hindered the notable advantages of the graph attention network (GAT) model for the molecular graphs. Considering computational efficiency and performance (R2 = 0.78; RMSE = 0.77), XGBoost (XGB) was used for reliable QSAR-ML models predicting chronic toxicity using small- or medium-sized tabular data and the molecular descriptors. Further kernel density estimation analysis confirmed the high accuracy of the model for pollutant concentrations ranging from 10^-3 to 10^2 mg/L, effectively aligning with most environmental scenarios. Model interpretation showed SlogP and exposure duration as the primary influential factors. SlogP, representing the distribution coefficient of a molecule between lipophilic and hydrophilic environments, had a negative effect on the toxicity outcomes. Additionally, the exposure duration played a crucial role in determining the chronic toxicity. Finally, the chronic toxicity data of bisphenol A validated the robustness and reliability of the model established in this research. Our study provided a robust and feasible methodology for chronic ecological risk evaluation of various types of pollutants and could facilitate and increase the use of ML applications in environmental fields.
Affiliation(s)
- Lang Lei
- Shanghai Engineering Research Center of Biotransformation of Organic Solid Waste, School of Ecological and Environmental Sciences, East China Normal University, Shanghai, 200241, China
- Liangmao Zhang
- Shanghai Engineering Research Center of Biotransformation of Organic Solid Waste, School of Ecological and Environmental Sciences, East China Normal University, Shanghai, 200241, China
- Zhibang Han
- Shanghai Engineering Research Center of Biotransformation of Organic Solid Waste, School of Ecological and Environmental Sciences, East China Normal University, Shanghai, 200241, China
- Qirui Chen
- Shanghai Engineering Research Center of Biotransformation of Organic Solid Waste, School of Ecological and Environmental Sciences, East China Normal University, Shanghai, 200241, China
- Pengcheng Liao
- Shanghai Engineering Research Center of Biotransformation of Organic Solid Waste, School of Ecological and Environmental Sciences, East China Normal University, Shanghai, 200241, China
- Dong Wu
- Shanghai Engineering Research Center of Biotransformation of Organic Solid Waste, School of Ecological and Environmental Sciences, East China Normal University, Shanghai, 200241, China; Chongqing Key Laboratory of Precision Optics, Chongqing Institute of East China Normal University, Chongqing, 401120, China; Shanghai Institute of Pollution Control and Ecological Security, Shanghai, 200092, China
- Jun Tai
- Shanghai Environmental Sanitation Engineering Design Institute Co., Ltd., Shanghai, 200232, China
- Bing Xie
- Shanghai Engineering Research Center of Biotransformation of Organic Solid Waste, School of Ecological and Environmental Sciences, East China Normal University, Shanghai, 200241, China; Shanghai Institute of Pollution Control and Ecological Security, Shanghai, 200092, China
- Yinglong Su
- Shanghai Engineering Research Center of Biotransformation of Organic Solid Waste, School of Ecological and Environmental Sciences, East China Normal University, Shanghai, 200241, China; Chongqing Key Laboratory of Precision Optics, Chongqing Institute of East China Normal University, Chongqing, 401120, China; Shanghai Institute of Pollution Control and Ecological Security, Shanghai, 200092, China.

29
Li Y, Tao C, Fu D, Jafvert CT, Zhu T. Integrating molecular descriptors for enhanced prediction: Shedding light on the potential of pH to model hydrated electron reaction rates for organic compounds. CHEMOSPHERE 2024; 349:140984. [PMID: 38122944 DOI: 10.1016/j.chemosphere.2023.140984] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/03/2023] [Revised: 12/13/2023] [Accepted: 12/14/2023] [Indexed: 12/23/2023]
Abstract
Hydrated electron reaction rate constant (k_e-aq) is an important parameter to determine reductive degradation efficiency and to mitigate the ecological risk of organic compounds (OCs). However, OC species morphology and the concentration of hydrated electrons (e-aq) in water vary with pH, complicating OC fate assessment. This study introduced the environmental variable of pH, to develop models for k_e-aq for 701 data points using 3 descriptor types: (i) molecular descriptors (MD), (ii) quantum chemical descriptors (QCD), and (iii) the combination of both (MD + QCD). Models were screened using 2 descriptor screening methods (MLR and RF) and 14 machine learning (ML) algorithms. The introduction of QCDs that characterized the electronic structure of OCs greatly improved the performance of models while ensuring the need for fewer descriptors. The optimal model MLR-XGBoost(MD + QCD), which included pH, achieved the most satisfactory prediction: R2_tra = 0.988, Q2_boot = 0.861, R2_test = 0.875 and Q2_test = 0.873. The mechanistic interpretation using the SHAP method further revealed that QCDs, polarizability, volume, and pH had a great influence on the reductive degradation of OCs by e-aq. Overall, the electrochemical parameters (QCDs, pH) related to the solvent and solute are of significance and should be considered in any future ML modeling that assesses the fate of OCs in aquatic environment.
Affiliation(s)
- Yi Li
- School of Environmental Science and Engineering, Yangzhou University, Yangzhou, 225127, Jiangsu, China
- Cuicui Tao
- School of Environmental Science and Engineering, Yangzhou University, Yangzhou, 225127, Jiangsu, China
- Dafang Fu
- School of Civil Engineering, Southeast University, Nanjing, 210096, China
- Chad T Jafvert
- Lyles School of Civil Engineering, and Environmental & Ecological Engineering, Purdue University, West Lafayette, IN, 47907, USA
- Tengyi Zhu
- School of Environmental Science and Engineering, Yangzhou University, Yangzhou, 225127, Jiangsu, China.

30
Wang C, Wang L, Yu H, Seo A, Wang Z, Rajabzadeh S, Ni BJ, Shon HK. Machine learning for layer-by-layer nanofiltration membrane performance prediction and polymer candidate exploration. CHEMOSPHERE 2024; 350:140999. [PMID: 38151066 DOI: 10.1016/j.chemosphere.2023.140999] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/09/2023] [Revised: 12/18/2023] [Accepted: 12/19/2023] [Indexed: 12/29/2023]
Abstract
In this study, machine learning-based models were established for layer-by-layer (LBL) nanofiltration (NF) membrane performance prediction and polymer candidate exploration. Four different models, i.e., linear, random forest (RF), boosted tree (BT), and eXtreme Gradient Boosting (XGBoost), were formed, and membrane performance prediction was determined in terms of membrane permeability and selectivity. The XGBoost exhibited optimal prediction accuracy for membrane permeability (coefficient of determination (R2): 0.99) and membrane selectivity (R2: 0.80). The Shapley Additive exPlanation (SHAP) method was utilized to evaluate the effects of different LBL NF membrane fabrication conditions on membrane performances. The SHAP method was also used to identify the relationships between polymer structure and membrane performance. Polymers were represented by Morgan fingerprint, which is an effective description approach for developing modeling. Based on the SHAP value results, two reference Morgan fingerprints were constructed containing atomic groups with positive contributions to membrane permeability and selectivity. According to the reference Morgan fingerprint, 204 potential polymers were explored from the largest polymer database (PoLyInfo). By calculating the similarities between each potential polymer and both reference Morgan fingerprints, 23 polymer candidates were selected and could be further used for LBL NF membrane fabrication with the potential for providing good membrane performance. Overall, this work provided new ways both for LBL NF membrane performance prediction and high-performance polymer candidate exploration. The source code for the models and algorithms used in this study is publicly available to facilitate replication and further research. https://github.com/wangliwfsd/LLNMPP/.
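The similarity-based screening step described in the abstract above can be illustrated with a short sketch. It makes several assumptions that the abstract does not confirm: fingerprints are modeled as sets of "on" bit indices, similarity is the Tanimoto coefficient, and a candidate is kept only if it meets a threshold against every reference fingerprint. The names `tanimoto` and `screen`, the threshold, and the toy fingerprints are all hypothetical; the authors' actual code is in the repository linked in the abstract.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of fingerprint bit indices."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def screen(candidates, references, threshold):
    """Keep candidates whose similarity to every reference meets the threshold.

    Returns (name, scores) pairs sorted by their weakest similarity, best first.
    The selection rule is an assumption made for illustration.
    """
    kept = []
    for name, bits in candidates.items():
        scores = [tanimoto(bits, ref) for ref in references]
        if min(scores) >= threshold:
            kept.append((name, scores))
    return sorted(kept, key=lambda item: -min(item[1]))


# Toy fingerprints: hypothetical bit indices, not real Morgan fingerprints.
ref_permeability = {1, 2, 3, 4}
ref_selectivity = {3, 4, 5, 6}
candidates = {
    "polymer_A": {1, 2, 3, 4, 5, 6},
    "polymer_B": {1, 2},
    "polymer_C": {3, 4},
}
selected = screen(candidates, [ref_permeability, ref_selectivity], threshold=0.5)
print([name for name, _ in selected])
```

Requiring the minimum similarity to pass the threshold favors candidates that resemble both reference fingerprints at once, rather than scoring highly on only one of permeability or selectivity.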
Affiliation(s)
- Chen Wang
- School of Civil and Environmental Engineering, University of Technology Sydney, Sydney, New South Wales, 2007, Australia
- Li Wang
- CSIRO Space and Astronomy, PO Box 1130, Bentley, WA, 6102, Australia
- Hanwei Yu
- School of Civil and Environmental Engineering, University of Technology Sydney, Sydney, New South Wales, 2007, Australia
- Allan Seo
- School of Civil and Environmental Engineering, University of Technology Sydney, Sydney, New South Wales, 2007, Australia
- Zhining Wang
- Shandong Provincial Key Laboratory of Water Pollution Control and Resource Reuse, School of Environmental Science and Engineering, Shandong University, Qingdao, 266237, China
- Saeid Rajabzadeh
- School of Civil and Environmental Engineering, University of Technology Sydney, Sydney, New South Wales, 2007, Australia
- Bing-Jie Ni
- School of Civil and Environmental Engineering, University of New South Wales, Sydney, New South Wales, 2052, Australia
- Ho Kyong Shon
- School of Civil and Environmental Engineering, University of Technology Sydney, Sydney, New South Wales, 2007, Australia.

31
Igou T, Zhong S, Reid E, Chen Y. Real-Time Sensor Data Profile-Based Deep Learning Method Applied to Open Raceway Pond Microalgal Productivity Prediction. ENVIRONMENTAL SCIENCE & TECHNOLOGY 2023; 57:17981-17989. [PMID: 37234045 PMCID: PMC10666538 DOI: 10.1021/acs.est.2c07578] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 10/14/2022] [Revised: 05/10/2023] [Accepted: 05/11/2023] [Indexed: 05/27/2023]
Abstract
Microalgal biotechnology holds the potential for renewable biofuels, bioproducts, and carbon capture applications due to unparalleled photosynthetic efficiency and diversity. Outdoor open raceway pond (ORP) cultivation enables utilization of sunlight and atmospheric carbon dioxide to drive microalgal biomass synthesis for production of bioproducts including biofuels; however, environmental conditions are highly dynamic and fluctuate both diurnally and seasonally, making ORP productivity prediction challenging without time-intensive physical measurements and location-specific calibrations. Here, for the first time, we present an image-based deep learning method for the prediction of ORP productivity. Our method is based on parameter profile plot images of sensor parameters, including pH, dissolved oxygen, temperature, photosynthetically active radiation, and total dissolved solids. These parameters can be remotely monitored without physical interaction with ORPs. We apply the model to data we generated during the Unified Field Studies of the Algae Testbed Public-Private-Partnership (ATP3 UFS), the largest publicly available ORP data set to date, which includes millions of sensor records and 598 productivities from 32 ORPs operated in 5 states in the United States. We demonstrate that this approach significantly outperforms an average value based traditional machine learning method (R2 = 0.77 ≫ R2 = 0.39) without considering bioprocess parameters (e.g., biomass density, hydraulic retention time, and nutrient concentrations). We then evaluate the sensitivity of image and monitoring data resolutions and input parameter variations. Our results demonstrate ORP productivity can be effectively predicted from remote monitoring data, providing an inexpensive tool for microalgal production and operational forecasting.
Affiliation(s)
- Thomas Igou
- School of Civil & Environmental Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
- Shifa Zhong
- Department of Environmental Science, School of Ecological and Environmental Sciences, East China Normal University, Shanghai 200241, PR China
- Elliot Reid
- School of Civil & Environmental Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
- Yongsheng Chen
- School of Civil & Environmental Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332, United States

32
Zhu JJ, Yang M, Ren ZJ. Machine Learning in Environmental Research: Common Pitfalls and Best Practices. ENVIRONMENTAL SCIENCE & TECHNOLOGY 2023; 57:17671-17689. [PMID: 37384597 DOI: 10.1021/acs.est.3c00026] [Citation(s) in RCA: 107] [Impact Index Per Article: 53.5] [Indexed: 07/01/2023]
Abstract
Machine learning (ML) is increasingly used in environmental research to process large data sets and decipher complex relationships between system variables. However, due to the lack of familiarity and methodological rigor, inadequate ML studies may lead to spurious conclusions. In this study, we synthesized literature analysis with our own experience and provided a tutorial-like compilation of common pitfalls along with best practice guidelines for environmental ML research. We identified more than 30 key items and provided evidence-based data analysis based on 148 highly cited research articles to exhibit the misconceptions of terminologies, proper sample size and feature size, data enrichment and feature selection, randomness assessment, data leakage management, data splitting, method selection and comparison, model optimization and evaluation, and model explainability and causality. By analyzing good examples on supervised learning and reference modeling paradigms, we hope to help researchers adopt more rigorous data preprocessing and model development standards for more accurate, robust, and practicable model uses in environmental research and applications.
Affiliation(s)
- Jun-Jie Zhu
- Department of Civil and Environmental Engineering and Andlinger Center for Energy and the Environment, Princeton University, Princeton, New Jersey 08544, United States
- Meiqi Yang
- Department of Civil and Environmental Engineering and Andlinger Center for Energy and the Environment, Princeton University, Princeton, New Jersey 08544, United States
- Zhiyong Jason Ren
- Department of Civil and Environmental Engineering and Andlinger Center for Energy and the Environment, Princeton University, Princeton, New Jersey 08544, United States
33
Gao H, Zhong S, Dangayach R, Chen Y. Understanding and Designing a High-Performance Ultrafiltration Membrane Using Machine Learning. ENVIRONMENTAL SCIENCE & TECHNOLOGY 2023; 57:17831-17840. [PMID: 36790106 PMCID: PMC10666290 DOI: 10.1021/acs.est.2c05404] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 07/26/2022] [Revised: 02/04/2023] [Accepted: 02/06/2023] [Indexed: 06/18/2023]
Abstract
Ultrafiltration (UF) as one of the mainstream membrane-based technologies has been widely used in water and wastewater treatment. Increasing demand for clean and safe water requires the rational design of UF membranes with antifouling potential, while maintaining high water permeability and removal efficiency. This work employed a machine learning (ML) method to establish and understand the correlation of five membrane performance indices as well as three major performance-determining membrane properties with membrane fabrication conditions. The loading of additives, specifically nanomaterials (A_wt %), at loading amounts of >1.0 wt % was found to be the most significant feature affecting all of the membrane performance indices. The polymer content (P_wt %), molecular weight of the pore maker (M_Da), and pore maker content (M_wt %) also made considerable contributions to predicting membrane performance. Notably, M_Da was more important than M_wt % for predicting membrane performance. The feature analysis of ML models in terms of membrane properties (i.e., mean pore size, overall porosity, and contact angle) provided an unequivocal explanation of the effects of fabrication conditions on membrane performance. Our approach can provide practical aid in guiding the design of fit-for-purpose separation membranes through data-driven virtual experiments.
Affiliation(s)
- Haiping Gao
- School of Civil and Environmental Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
- Shandong Provincial Key Laboratory of Water Pollution Control and Resource Reuse, School of Environmental Science and Engineering, Shandong University, Qingdao, Shandong 266237, China
- Shifa Zhong
- School of Civil and Environmental Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
- School of Ecological and Environmental Sciences, East China Normal University, Shanghai 200241, China
- Raghav Dangayach
- School of Civil and Environmental Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
- Yongsheng Chen
- School of Civil and Environmental Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
34
Cao H, Peng J, Zhou Z, Yang Z, Wang L, Sun Y, Wang Y, Liang Y. Investigation of the Binding Fraction of PFAS in Human Plasma and Underlying Mechanisms Based on Machine Learning and Molecular Dynamics Simulation. ENVIRONMENTAL SCIENCE & TECHNOLOGY 2023; 57:17762-17773. [PMID: 36282672 DOI: 10.1021/acs.est.2c04400] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Indexed: 06/16/2023]
Abstract
More than 7000 per- and polyfluorinated alkyl substances (PFAS) have been documented in the U.S. Environmental Protection Agency's CompTox Chemicals database. These PFAS can be used in a broad range of industrial and consumer applications but may pose potential environmental issues and health risks. However, little is known about emerging PFAS bioaccumulation to assess their chemical safety. This study focuses specifically on the large and high-quality data set of fluorochemicals from the related environmental and pharmaceutical chemicals databases, and machine learning (ML) models were developed for the classification prediction of the unbound fraction of compounds in plasma. A comprehensive evaluation of the ML models shows that the best blending model yields an accuracy of 0.901 for the test set. The predictions suggest that most PFAS (∼92%) have a high binding fraction in plasma. Introduction of alkaline amino groups is likely to reduce the binding affinities of PFAS with plasma proteins. Molecular dynamics simulations indicate a clear distinction between the high and low binding fractions of PFAS. These computational workflows can be used to predict the bioaccumulation of emerging PFAS and are also helpful for the molecular design of PFAS to prevent the release of high-bioaccumulation compounds into the environment.
Affiliation(s)
- Huiming Cao
- Hubei Key Laboratory of Environmental and Health Effects of Persistent Toxic Substances, School of Environment and Health, Jianghan University, Wuhan 430056, China
- Jianhua Peng
- Hubei Key Laboratory of Environmental and Health Effects of Persistent Toxic Substances, School of Environment and Health, Jianghan University, Wuhan 430056, China
- Zhen Zhou
- Hubei Key Laboratory of Environmental and Health Effects of Persistent Toxic Substances, School of Environment and Health, Jianghan University, Wuhan 430056, China
- Zeguo Yang
- Hubei Key Laboratory of Environmental and Health Effects of Persistent Toxic Substances, School of Environment and Health, Jianghan University, Wuhan 430056, China
- Ling Wang
- Hubei Key Laboratory of Environmental and Health Effects of Persistent Toxic Substances, School of Environment and Health, Jianghan University, Wuhan 430056, China
- Yuzhen Sun
- Hubei Key Laboratory of Environmental and Health Effects of Persistent Toxic Substances, School of Environment and Health, Jianghan University, Wuhan 430056, China
- Yawei Wang
- State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, China
- Yong Liang
- Hubei Key Laboratory of Environmental and Health Effects of Persistent Toxic Substances, School of Environment and Health, Jianghan University, Wuhan 430056, China
35
Deng H, Luo Z, Imbrogno J, Swenson TM, Jiang Z, Wang X, Zhang S. Machine Learning Guided Polyamide Membrane with Exceptional Solute-Solute Selectivity and Permeance. ENVIRONMENTAL SCIENCE & TECHNOLOGY 2023; 57:17841-17850. [PMID: 36576929 DOI: 10.1021/acs.est.2c05571] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Indexed: 06/17/2023]
Abstract
Designing polymeric membranes with high solute-solute selectivity and permeance is important but technically challenging. The existing industrial interfacial polymerization (IP) process for fabricating polyamide-based polymeric membranes is largely empirical and requires extensive trial-and-error experimentation to identify optimal fabrication conditions from a wide candidate space for separating a given solute pair. Herein, we developed a novel multitask machine learning (ML) model based on an artificial neural network (ANN) with skip connections and selectivity regularization to guide the design of polyamide membranes. We used limited sets of lab-collected data to obtain satisfactory model performance over four iterations by introducing human expert experience in the online learning process. Four membranes fabricated under conditions guided by the model exceeded the present upper bound for mono/divalent ion selectivity and permeance of polymeric membranes. Moreover, we obtained new mechanistic insights into membrane design through feature analysis of the model. Our work demonstrates an ML approach that represents a paradigm shift for high-performance polymeric membrane design.
Affiliation(s)
- Hao Deng
- Department Joint School of National University of Singapore and Tianjin University, International Campus of Tianjin University, Binhai New City, Fuzhou 350207, China
- Department of Chemical and Biomolecular Engineering, National University of Singapore, Singapore 117576, Singapore
- Zhiyao Luo
- Department of Chemical and Biomolecular Engineering, National University of Singapore, Singapore 117576, Singapore
- Joe Imbrogno
- Pfizer Inc., 235 East 42nd Street, New York, New York 10017, United States
- Tim M Swenson
- Pfizer Inc., 235 East 42nd Street, New York, New York 10017, United States
- Zhongyi Jiang
- Department Joint School of National University of Singapore and Tianjin University, International Campus of Tianjin University, Binhai New City, Fuzhou 350207, China
- Xiaonan Wang
- Department of Chemical Engineering, Tsinghua University, Beijing 100084, China
- Sui Zhang
- Department of Chemical and Biomolecular Engineering, National University of Singapore, Singapore 117576, Singapore
36
Jeong N, Epsztein R, Wang R, Park S, Lin S, Tong T. Exploring the Knowledge Attained by Machine Learning on Ion Transport across Polyamide Membranes Using Explainable Artificial Intelligence. ENVIRONMENTAL SCIENCE & TECHNOLOGY 2023; 57:17851-17862. [PMID: 36917705 DOI: 10.1021/acs.est.2c08384] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 06/18/2023]
Abstract
Recent studies have increasingly applied machine learning (ML) to aid in performance and material design associated with membrane separation. However, whether the knowledge attained by ML with a limited number of available data is enough to capture and validate the fundamental principles of membrane science remains elusive. Herein, we applied explainable artificial intelligence (XAI) to thoroughly investigate the knowledge learned by ML on the mechanisms of ion transport across polyamide reverse osmosis (RO) and nanofiltration (NF) membranes by leveraging 1,585 data from 26 membrane types. The Shapley additive explanation method based on cooperative game theory was used to unveil the influences of various ion and membrane properties on the model predictions. XAI shows that the ML can capture the important roles of size exclusion and electrostatic interaction in regulating membrane separation properly. XAI also identifies that the mechanisms governing ion transport possess different relative importance to cation and anion rejections during RO and NF filtration. Overall, we provide a framework to evaluate the knowledge underlying the ML model prediction and demonstrate that ML is able to learn fundamental mechanisms of ion transport across polyamide membranes, highlighting the importance of elucidating model interpretability for more reliable and explainable ML applications to membrane selection and design.
Affiliation(s)
- Nohyeong Jeong
- Department of Civil and Environmental Engineering, Colorado State University, Fort Collins, Colorado 80523, United States
- Razi Epsztein
- Department of Civil and Environmental Engineering, Technion - Israel Institute of Technology, Haifa 32000, Israel
- Ruoyu Wang
- Department of Civil and Environmental Engineering, Vanderbilt University, Nashville, Tennessee 37235-1831, United States
- Shinyun Park
- Department of Civil and Environmental Engineering, Colorado State University, Fort Collins, Colorado 80523, United States
- Shihong Lin
- Department of Civil and Environmental Engineering, Vanderbilt University, Nashville, Tennessee 37235-1831, United States
- Department of Chemical and Biomolecular Engineering, Vanderbilt University, Nashville, Tennessee 37235-1831, United States
- Tiezheng Tong
- Department of Civil and Environmental Engineering, Colorado State University, Fort Collins, Colorado 80523, United States
37
Wang M, Shi GM, Zhao D, Liu X, Jiang J. Machine Learning-Assisted Design of Thin-Film Composite Membranes for Solvent Recovery. ENVIRONMENTAL SCIENCE & TECHNOLOGY 2023; 57:15914-15924. [PMID: 37814603 DOI: 10.1021/acs.est.3c04773] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Indexed: 10/11/2023]
Abstract
Organic solvents are extensively utilized in industries as raw materials, reaction media, and cleaning agents. It is crucial to efficiently recover solvents for environmental protection and sustainable manufacturing. Recently, organic solvent nanofiltration (OSN) has emerged as an energy-efficient membrane technology for solvent recovery; however, current OSN membranes are largely fabricated by trial-and-error methods. In this study, for the first time, we develop a machine learning (ML) approach to design new thin-film composite membranes for solvent recovery. The monomers used in interfacial polymerization, along with membrane, solvent and solute properties, are featurized to train ML models via gradient boosting regression. The ML models demonstrate high accuracy in predicting OSN performance including solvent permeance and solute rejection. Subsequently, 167 new membranes are designed from 40 monomers and their OSN performance is predicted by the ML models for common solvents (methanol, acetone, dimethylformamide, and n-hexane). New top-performing membranes are identified with methanol permeance superior to that of existing membranes. Particularly, nitrogen-containing heterocyclic monomers are found to enhance microporosity and contribute to higher permeance. Finally, one new membrane is experimentally synthesized and tested to validate the ML predictions. Based on the chemical structures of monomers, the ML approach developed here provides a bottom-up strategy toward the rational design of new membranes for high-performance solvent recovery and many other technologically important applications.
Affiliation(s)
- Mao Wang
- Department of Chemical and Biomolecular Engineering, National University of Singapore, Singapore 117576, Singapore
- Gui Min Shi
- Department of Chemical and Biomolecular Engineering, National University of Singapore, Singapore 117576, Singapore
- Daohui Zhao
- Department of Chemical and Biomolecular Engineering, National University of Singapore, Singapore 117576, Singapore
- Xinyi Liu
- Department of Chemical and Biomolecular Engineering, National University of Singapore, Singapore 117576, Singapore
- Jianwen Jiang
- Department of Chemical and Biomolecular Engineering, National University of Singapore, Singapore 117576, Singapore
38
Xin QY, Pei YC, Luo MY, Wang ZQ, He L, Liu JY, Wang B, Lu H. A generalized precision measuring mechanism and efficient signal processing algorithm for the eccentricity of rotary parts. IEEE TRANSACTIONS ON INDUSTRIAL ELECTRONICS (1982) 2023; 70:10385-10395. [PMID: 37663405 PMCID: PMC7615000 DOI: 10.1109/tie.2022.3222655] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/05/2023]
Abstract
Rotary parts are widely used in transmission equipment, and the precision of the rotary parts determines the performance of the equipment. The accurate measurement and modification of eccentricity is a premise to ensure the parts' quality. However, the existing measurement methods have the shortcomings of low efficiency, complex operation, high costs and restricted applicability. To accurately and efficiently identify the rotary part's eccentricity parameters, including attitude angle, eccentricity and eccentric angle, a novel generalized precision measurement method is proposed in this study. Our method includes a lever measuring mechanism with a spherical probe, and a corresponding efficient signal processing algorithm to fit the measurement signal. The generalized measuring mechanism has a simple structure and can effectively measure arbitrary cross-sections, and its design and optimization principles are investigated and given thoroughly. The signal processing algorithm, based on Fourier expansion and least squares, can efficiently extract the eccentricity parameters of measured cross-sections. The proposed generalized precision measurement method has proven to overcome the limitations of existing methods, exhibits strong resistance to interference and enables batch inspection of rotary parts with a single adjustment. Its effectiveness, efficiency, applicability and repeatability are evaluated by simulation calculation and experimental verification. The proposed method has great potential in broad applications, such as detecting eccentricity and correcting errors for mechanical measurements, aerospace, equipment manufacturing, and other related fields.
Affiliation(s)
- Qing-Yuan Xin
- School of Mechanical and Aerospace Engineering, Jilin University, Nanling Campus, Changchun, 130025, People’s Republic of China
- Yong-Chen Pei
- School of Mechanical and Aerospace Engineering, Jilin University, Nanling Campus, Changchun, 130025, People’s Republic of China
- Meng-Yan Luo
- School of Mechanical and Aerospace Engineering, Jilin University, Nanling Campus, Changchun, 130025, People’s Republic of China
- Zhi-Qiong Wang
- School of Mechanical and Aerospace Engineering, Jilin University, Nanling Campus, Changchun, 130025, People’s Republic of China
- Ling He
- College of Automobile Engineering, Jilin University, Nanling Campus, Changchun, 130025, People’s Republic of China
- Jian-Yao Liu
- School of Mechanical and Aerospace Engineering, Jilin University, Nanling Campus, Changchun, 130025, People’s Republic of China
- Bin Wang
- School of Mechanical and Aerospace Engineering, Jilin University, Nanling Campus, Changchun, 130025, People’s Republic of China
- Huiqi Lu
- Institute of Biomedical Engineering, Department of Engineering Science, University of Oxford, Oxford, OX3 7DQ, United Kingdom
39
Zou X, Guo H, Jiang C, Nguyen DV, Chen GH, Wu D. Physics-informed neural network-based serial hybrid model capturing the hidden kinetics for sulfur-driven autotrophic denitrification process. WATER RESEARCH 2023; 243:120331. [PMID: 37454462 DOI: 10.1016/j.watres.2023.120331] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/31/2023] [Revised: 06/04/2023] [Accepted: 07/09/2023] [Indexed: 07/18/2023]
Abstract
Sulfur-driven autotrophic denitrification (SdAD) is a biological process that can remove nitrate from low carbon/nitrogen (C/N) ratio wastewater. Although this process has been intensively researched, the mechanism whereby its intermediates (i.e., elemental sulfur and nitrite ions) are generated and accumulated remains elusive. Existing mathematical models developed for SdAD cannot accurately predict the intermediates in SdAD because of the incomplete knowledge of process kinetics resulting from changes in the environmental conditions and electron competition during SdAD. To address this limitation, we proposed a novel serial hybrid model structure based on a physics-informed neural network (PINN) to capture the dynamics of the process kinetics and predict the substrate concentrations in SdAD. In this study, we evaluated the model through numerical experiments and applied it to real case studies involving batch and continuous-flow reactor scenarios. By leveraging the PINN approach, the hybrid model yielded accurate predictions at both the state (i.e., substrate concentration) and kinetic levels in the numerical experiments and performed better than both mechanistic and purely data-driven models in the case studies. Furthermore, we used the trained hybrid model to design control strategies for SdAD and a novel integrated process involving SdAD and anammox for energy-efficient nitrogen removal. Finally, we discuss the advantages and application scope of the PINN-based hybrid model.
Affiliation(s)
- Xu Zou
- Department of Civil and Environmental Engineering, Water Technology Center, Hong Kong Branch of Chinese National Engineering Research Center for Control & Treatment of Heavy Metal Pollution, The Hong Kong University of Science and Technology, Hong Kong, China
- Hongxiao Guo
- Department of Civil and Environmental Engineering, Water Technology Center, Hong Kong Branch of Chinese National Engineering Research Center for Control & Treatment of Heavy Metal Pollution, The Hong Kong University of Science and Technology, Hong Kong, China
- Chukuan Jiang
- Department of Civil and Environmental Engineering, Water Technology Center, Hong Kong Branch of Chinese National Engineering Research Center for Control & Treatment of Heavy Metal Pollution, The Hong Kong University of Science and Technology, Hong Kong, China
- Duc Viet Nguyen
- Centre for Environmental and Energy Research, Ghent University Global Campus, Incheon, Republic of Korea; Department of Green Chemistry and Technology, Centre for Advanced Process Technology for Urban REsource recovery (CAPTURE), Ghent University, Ghent, Belgium
- Guang-Hao Chen
- Department of Civil and Environmental Engineering, Water Technology Center, Hong Kong Branch of Chinese National Engineering Research Center for Control & Treatment of Heavy Metal Pollution, The Hong Kong University of Science and Technology, Hong Kong, China.
- Di Wu
- Department of Civil and Environmental Engineering, Water Technology Center, Hong Kong Branch of Chinese National Engineering Research Center for Control & Treatment of Heavy Metal Pollution, The Hong Kong University of Science and Technology, Hong Kong, China; Centre for Environmental and Energy Research, Ghent University Global Campus, Incheon, Republic of Korea; Department of Green Chemistry and Technology, Centre for Advanced Process Technology for Urban REsource recovery (CAPTURE), Ghent University, Ghent, Belgium.
40
Guo S, Ao X, Ma X, Cheng S, Men C, Harada H, Saroj DP, Mang HP, Li Z, Zheng L. Machine-learning-aided application of high-gravity technology to enhance ammonia recovery of fresh waste leachate. WATER RESEARCH 2023; 235:119891. [PMID: 36965295 DOI: 10.1016/j.watres.2023.119891] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/20/2022] [Revised: 02/27/2023] [Accepted: 03/17/2023] [Indexed: 06/18/2023]
Abstract
Stripping is widely applied for the removal of ammonia from fresh waste leachate. However, the development of air stripping technology is restricted by the requirements for large-scale equipment and long operation periods. This paper describes a high-gravity technology that improves ammonia stripping from actual fresh waste leachate and a machine learning approach that predicts the stripping performance under different operational parameters. The high-gravity field is implemented in a co-current-flow rotating packed bed in multi-stage cycle series mode. The eXtreme Gradient Boosting algorithm is applied to the experimental data to predict the liquid volumetric mass transfer coefficient (KLa) and removal efficiency (η) for various rotation speeds, numbers of stripping stages, gas flow rates, and liquid flow rates. Ammonia stripping under a high-gravity field achieves η = 82.73% and KLa = 5.551 × 10⁻⁴ s⁻¹ at a pH value of 10 and ambient temperature. The results suggest that the eXtreme Gradient Boosting model provides good accuracy and predictive performance, with R² values of 0.9923 and 0.9783 for KLa and η, respectively. The machine learning models developed in this study are combined with experimental results to provide more comprehensive information on rotating packed bed operations and more accurate predictions of KLa and η. The information mining behind the model is an important reference for the rational design of high-gravity-field-coupled ammonia stripping projects.
Affiliation(s)
- Shaomin Guo
- School of Energy and Environmental Engineering, Beijing Key Laboratory of Resource-oriented Treatment of Industrial Pollutants, University of Science and Technology Beijing, Beijing 100083, PR China
- Xiuwei Ao
- School of Energy and Environmental Engineering, Beijing Key Laboratory of Resource-oriented Treatment of Industrial Pollutants, University of Science and Technology Beijing, Beijing 100083, PR China
- Xin Ma
- School of Energy and Environmental Engineering, Beijing Key Laboratory of Resource-oriented Treatment of Industrial Pollutants, University of Science and Technology Beijing, Beijing 100083, PR China
- Shikun Cheng
- School of Energy and Environmental Engineering, Beijing Key Laboratory of Resource-oriented Treatment of Industrial Pollutants, University of Science and Technology Beijing, Beijing 100083, PR China
- Cong Men
- School of Energy and Environmental Engineering, Beijing Key Laboratory of Resource-oriented Treatment of Industrial Pollutants, University of Science and Technology Beijing, Beijing 100083, PR China
- Hidenori Harada
- Graduate School of Asian and African Area Studies, Kyoto University, Kyoto 606-8501, Japan
- Devendra P Saroj
- Department of Civil and Environmental Engineering, Centre for Environmental Health Engineering (CEHE), Faculty of Engineering and Physical Sciences, University of Surrey, Surrey GU27XH, United Kingdom
- Heinz-Peter Mang
- School of Energy and Environmental Engineering, Beijing Key Laboratory of Resource-oriented Treatment of Industrial Pollutants, University of Science and Technology Beijing, Beijing 100083, PR China
- Zifu Li
- School of Energy and Environmental Engineering, Beijing Key Laboratory of Resource-oriented Treatment of Industrial Pollutants, University of Science and Technology Beijing, Beijing 100083, PR China.
- Lei Zheng
- School of Energy and Environmental Engineering, Beijing Key Laboratory of Resource-oriented Treatment of Industrial Pollutants, University of Science and Technology Beijing, Beijing 100083, PR China.
41
Tao H, Jawad AH, Shather AH, Al-Khafaji Z, Rashid TA, Ali M, Al-Ansari N, Marhoon HA, Shahid S, Yaseen ZM. Machine learning algorithms for high-resolution prediction of spatiotemporal distribution of air pollution from meteorological and soil parameters. ENVIRONMENT INTERNATIONAL 2023; 175:107931. [PMID: 37119651 DOI: 10.1016/j.envint.2023.107931] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/13/2022] [Revised: 03/18/2023] [Accepted: 04/11/2023] [Indexed: 05/22/2023]
Abstract
This study uses machine learning (ML) models for a high-resolution prediction (0.1°×0.1°) of the concentration of fine particulate matter (PM2.5) in air, the pollutant most harmful to human health, from meteorological and soil data. Iraq was considered the study area to implement the method. Different lags and the changing patterns of four European Reanalysis (ERA5) meteorological variables (rainfall, mean temperature, wind speed and relative humidity) and one soil parameter (soil moisture) were used to select a suitable set of predictors using a non-greedy algorithm known as simulated annealing (SA). The selected predictors were used to simulate the temporal and spatial variability of air PM2.5 concentration over Iraq during the early summer (May-July), the most polluted months, using three advanced ML models: extremely randomized trees (ERT), stochastic gradient descent backpropagation (SGD-BP) and long short-term memory (LSTM) integrated with a Bayesian optimizer. The spatial distribution of the annual average PM2.5 revealed that the population of the whole of Iraq is exposed to a pollution level above the standard limit. The changes in temperature and soil moisture and the mean wind speed and humidity of the month before the early summer can predict the temporal and spatial variability of PM2.5 over Iraq during May-July. Results revealed the higher performance of LSTM, with normalized root-mean-square error and Kling-Gupta efficiency of 13.4% and 0.89, compared to 16.02% and 0.81 for SGD-BP and 17.9% and 0.74 for ERT. The LSTM could also reconstruct the observed spatial distribution of PM2.5 with MapCurve and Cramer's V values of 0.95 and 0.91, compared to 0.9 and 0.86 for SGD-BP and 0.83 and 0.76 for ERT. The study provided a methodology for forecasting spatial variability of PM2.5 concentration at high resolution during the peak pollution months from freely available data, which can be replicated in other regions for generating high-resolution PM2.5 forecasting maps.
Affiliation(s)
- Hai Tao
- School of Computer and Information, Qiannan Normal University for Nationalities, Duyun, Guizhou 558000, China; State Key Laboratory of Public Big Data, Guizhou University, Guizhou, Guiyang 550025, China; Institute for Big Data Analytics and Artificial Intelligence (IBDAAI), Universiti Teknologi MARA, 40450 Shah Alam, Selangor, Malaysia.
- Ali H Jawad
- Faculty of Applied Sciences, Universiti Teknologi MARA, 40450 Shah Alam, Selangor, Malaysia.
- A H Shather
- Dep of Computer Technology Engineering, Engineering Technical College, University of Alkitab, Iraq.
- Zainab Al-Khafaji
- Department of Building and Construction Technologies Engineering, AL-Mustaqbal University College, Hillah 51001, Iraq.
- Tarik A Rashid
- Computer Science and Engineering Department, University of Kurdistan Hewler, Erbil, KR, Iraq.
- Mumtaz Ali
- UniSQ College, University of Southern Queensland, QLD 4350, Australia.
- Nadhir Al-Ansari
- Dept. of Civil, Environmental and Natural Resources Engineering, Lulea Univ. of Technology, Lulea T3334, Sweden.
- Haydar Abdulameer Marhoon
- Information and Communication Technology Research Group, Scientific Research Center, Al-Ayen University, Thi-Qar, Iraq; College of Computer Sciences and Information Technology, University of Kerbala, Karbala, Iraq.
- Shamsuddin Shahid
- Department of Hydraulics and Hydrology, School of Civil Engineering, Faculty of Engineering, Universiti Teknologi Malaysia (UTM), 81310 Skudai, Johor, Malaysia.
- Zaher Mundher Yaseen
- Civil and Environmental Engineering Department, King Fahd University of Petroleum & Minerals, Dhahran 31261, Saudi Arabia; Interdisciplinary Research Center for Membranes and Water Security, King Fahd University of Petroleum & Minerals, Dhahran 31261, Saudi Arabia.
42
Reid E, Igou T, Zhao Y, Crittenden J, Huang CH, Westerhoff P, Rittmann B, Drewes JE, Chen Y. The Minus Approach Can Redefine the Standard of Practice of Drinking Water Treatment. ENVIRONMENTAL SCIENCE & TECHNOLOGY 2023; 57:7150-7161. [PMID: 37074125 PMCID: PMC10173460 DOI: 10.1021/acs.est.2c09389] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Indexed: 05/03/2023]
Abstract
Chlorine-based disinfection for drinking water treatment (DWT) was one of the 20th century's great public health achievements, as it substantially reduced the risk of acute microbial waterborne disease. However, today's chlorinated drinking water is not unambiguously safe; trace levels of regulated and unregulated disinfection byproducts (DBPs), and other known, unknown, and emerging contaminants (KUECs), present chronic risks that make them essential removal targets. Because conventional chemical-based DWT processes do little to remove DBPs or KUECs, alternative approaches are needed to minimize risks by removing DBP precursors and KUECs that are ubiquitous in water supplies. We present the "Minus Approach" as a toolbox of practices and technologies to mitigate KUECs and DBPs without compromising microbiological safety. The Minus Approach reduces problem-causing chemical addition treatment (i.e., the conventional "Plus Approach") by producing biologically stable water containing pathogens at levels having negligible human health risk and substantially lower concentrations of KUECs and DBPs. Aside from ozonation, the Minus Approach avoids primary chemical-based coagulants, disinfectants, and advanced oxidation processes. The Minus Approach focuses on bank filtration, biofiltration, adsorption, and membranes to biologically and physically remove DBP precursors, KUECs, and pathogens; consequently, water purveyors can use ultraviolet light at key locations in conjunction with smaller dosages of secondary chemical disinfectants to minimize microbial regrowth in distribution systems. We describe how the Minus Approach contrasts with the conventional Plus Approach, integrates with artificial intelligence, and can ultimately improve the sustainability performance of water treatment. Finally, we consider barriers to adoption of the Minus Approach.
Affiliation(s)
- Elliot Reid
- School of Civil and Environmental Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
- Thomas Igou
- School of Civil and Environmental Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
- Yangying Zhao
- School of Civil and Environmental Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
- John Crittenden
- School of Civil and Environmental Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
- Brook Byers Institute for Sustainable Systems, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
- Ching-Hua Huang
- School of Civil and Environmental Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
- Paul Westerhoff
- Nanosystems Engineering Research Center for Nanotechnology-Enabled Water Treatment, School of Sustainable Engineering and The Built Environment, Ira A. Fulton Schools of Engineering, Arizona State University, Tempe, Arizona 85287, United States
- Bruce Rittmann
- Nanosystems Engineering Research Center for Nanotechnology-Enabled Water Treatment, School of Sustainable Engineering and The Built Environment, Ira A. Fulton Schools of Engineering, Arizona State University, Tempe, Arizona 85287, United States
- Biodesign Swette Center for Environmental Biotechnology, Arizona State University, Tempe, Arizona 85287, United States
- Jörg E Drewes
- Chair of Urban Water Systems Engineering, Technical University of Munich, 85748 Garching, Germany
- Yongsheng Chen
- School of Civil and Environmental Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
|
43
|
Yang M, Zhu JJ, McGaughey A, Zheng S, Priestley RD, Ren ZJ. Predicting Extraction Selectivity of Acetic Acid in Pervaporation by Machine Learning Models with Data Leakage Management. ENVIRONMENTAL SCIENCE & TECHNOLOGY 2023; 57:5934-5946. [PMID: 36972410 DOI: 10.1021/acs.est.2c06382] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Indexed: 06/18/2023]
Abstract
The extraction of acetic acid and other carboxylic acids from water is an emerging separation need as they are increasingly produced from waste organics and CO2 during carbon valorization. However, the traditional experimental approach can be slow and expensive, and machine learning (ML) may provide new insights and guidance in membrane development for organic acid extraction. In this study, we collected extensive literature data and developed the first ML models for predicting separation factors between acetic acid and water in pervaporation with polymers' properties, membrane morphology, fabrication parameters, and operating conditions. Importantly, we assessed seed randomness and data leakage problems during model development, which have been overlooked in ML studies but will result in over-optimistic results and misinterpreted variable importance. With proper data leakage management, we established a robust model and achieved a root-mean-square error of 0.515 using the CatBoost regression model. In addition, the prediction model was interpreted to elucidate the variables' importance, where the mass ratio was the topmost significant variable in predicting separation factors. In addition, polymers' concentration and membranes' effective area contributed to information leakage. These results demonstrate ML models' advances in membrane design and fabrication and the importance of vigorous model validation.
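One common way to manage the kind of data leakage described above is to split by source rather than by row, so that records from the same publication or membrane never land on both sides of the split. The abstract does not state which splitting scheme the authors used, so the following is only a minimal sketch under that assumption; the `group_split` helper, the `study_id` field, and the toy records are all hypothetical.

```python
import random


def group_split(records, group_key, test_fraction=0.25, seed=0):
    """Split records so that no group appears in both train and test."""
    # Collect the distinct groups and shuffle them reproducibly.
    groups = sorted({r[group_key] for r in records})
    rng = random.Random(seed)
    rng.shuffle(groups)

    # Hold out whole groups, never individual rows.
    n_test = max(1, round(len(groups) * test_fraction))
    test_groups = set(groups[:n_test])

    train = [r for r in records if r[group_key] not in test_groups]
    test = [r for r in records if r[group_key] in test_groups]
    return train, test


# Hypothetical toy data: three measurements from each of four sources.
records = [
    {"study_id": s, "feed_temp_c": 30 + 5 * i, "separation_factor": 1.0 + 0.1 * i}
    for s in ("A", "B", "C", "D")
    for i in range(3)
]

train, test = group_split(records, "study_id", test_fraction=0.25, seed=42)
train_ids = {r["study_id"] for r in train}
test_ids = {r["study_id"] for r in test}
print(sorted(train_ids), sorted(test_ids))
```

Repeating the split over several values of `seed` and comparing the resulting scores gives a rough sense of how much a reported error depends on the random seed.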
Affiliation(s)
- Meiqi Yang
- Department of Civil and Environmental Engineering and Andlinger Center for Energy and the Environment, Princeton University, Princeton, New Jersey 08544, United States
- Jun-Jie Zhu
- Department of Civil and Environmental Engineering and Andlinger Center for Energy and the Environment, Princeton University, Princeton, New Jersey 08544, United States
- Allyson McGaughey
- Department of Civil and Environmental Engineering and Andlinger Center for Energy and the Environment, Princeton University, Princeton, New Jersey 08544, United States
- Department of Chemical and Biological Engineering, Princeton University, Princeton, New Jersey 08544, United States
- Sunxiang Zheng
- Department of Civil and Environmental Engineering and Andlinger Center for Energy and the Environment, Princeton University, Princeton, New Jersey 08544, United States
- Rodney D Priestley
- Department of Chemical and Biological Engineering, Princeton University, Princeton, New Jersey 08544, United States
- Zhiyong Jason Ren
- Department of Civil and Environmental Engineering and Andlinger Center for Energy and the Environment, Princeton University, Princeton, New Jersey 08544, United States
|
44
|
Zhu T, Zhang Y, Tao C, Chen W, Cheng H. Prediction of organic contaminant rejection by nanofiltration and reverse osmosis membranes using interpretable machine learning models. THE SCIENCE OF THE TOTAL ENVIRONMENT 2023; 857:159348. [PMID: 36228787 DOI: 10.1016/j.scitotenv.2022.159348] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Received: 07/22/2022] [Revised: 09/21/2022] [Accepted: 10/06/2022] [Indexed: 06/16/2023]
Abstract
Efficiency improvement in contaminant removal by nanofiltration (NF) and reverse osmosis (RO) membranes is a multidimensional process involving membrane material selection and experimental condition optimization. It is unrealistic to explore the contributions of diverse influencing factors to the removal rate by trial-and-error experimentation. However, the advanced machine learning (ML) method is a powerful tool to simulate this complex decision-making process. Here, 4 traditional learning algorithms (MLR, SVM, ANN, kNN) and 4 ensemble learning algorithms (RF, GBDT, XGBoost, LightGBM) were applied to predict the removal efficiency of contaminants. Results reported here demonstrate that ensemble models showed significantly better predictive performance than traditional models. More importantly, this study achieved a compelling tradeoff between accuracy and interpretability for ensemble models with an effective model interpretation approach, which revealed the mutual interaction mechanism between the membrane material, contaminants and experimental conditions in membrane separation. Additionally, feature selection was for the first time achieved based on the aforementioned model interpretation method to determine the most important variable influencing the contaminant removal rate. Ultimately, the four ensemble models retrained by the selected variables achieved distinguished prediction performance (R²adj = 92.4%-99.5%). MWCO (membrane molecular weight cut-off), McGowan volume of solute (V) and molecular weight (MW) of the compound were demonstrated to be the most important influencing factors in contaminant removal by the NF and RO processes. Overall, the proposed methods in this study can facilitate versatile complex decision-making processes in the environmental field, particularly in contaminant removal by advanced physicochemical separation processes.
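The adjusted coefficient of determination reported above penalizes R² for the number of predictors, which matters when models are retrained on a reduced feature set. A minimal sketch of the standard formula follows; the function name and the sample numbers are illustrative assumptions and are not taken from the study.

```python
def adjusted_r2(r2, n_samples, n_features):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    dof = n_samples - n_features - 1
    if dof <= 0:
        raise ValueError("need n_samples > n_features + 1")
    return 1.0 - (1.0 - r2) * (n_samples - 1) / dof


# Illustrative values only: the same R^2 is penalized more with more features.
few = adjusted_r2(0.95, n_samples=101, n_features=5)
many = adjusted_r2(0.95, n_samples=101, n_features=20)
print(few, many)
```

With the sample size held fixed, dropping uninformative features raises the adjusted value even when the raw R² is unchanged.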
Affiliation(s)
- Tengyi Zhu
- School of Environmental Science and Engineering, Yangzhou University, Yangzhou 225127, Jiangsu, China
- Yu Zhang
- School of Environmental Science and Engineering, Yangzhou University, Yangzhou 225127, Jiangsu, China
- Cuicui Tao
- School of Environmental Science and Engineering, Yangzhou University, Yangzhou 225127, Jiangsu, China
- Wenxuan Chen
- School of Civil Engineering, Southeast University, Nanjing 210096, China
- Haomiao Cheng
- School of Environmental Science and Engineering, Yangzhou University, Yangzhou 225127, Jiangsu, China
|
45
|
Wang Z, Yu Y, Roy K, Gao C, Huang L. The Application of Machine Learning: Controlling the Preparation of Environmental Materials and Carbon Neutrality. INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH 2023; 20:1871. [PMID: 36767237 PMCID: PMC9915388 DOI: 10.3390/ijerph20031871] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/14/2023] [Accepted: 01/17/2023] [Indexed: 06/18/2023]
Abstract
The greenhouse effect is a severe global problem [...].
Affiliation(s)
- Zhenxing Wang
- South China Institute of Environmental Sciences, Ministry of Ecology and Environment of the People’s Republic of China, Guangzhou 510655, China
- Yunjun Yu
- South China Institute of Environmental Sciences, Ministry of Ecology and Environment of the People’s Republic of China, Guangzhou 510655, China
- Kallol Roy
- Institute of Computer Science, Faculty of Science and Technology, University of Tartu, 51009 Tartu, Estonia
- Cheng Gao
- College of Hydrology and Water Resources, Hohai University, Nanjing 210098, China
- Lei Huang
- School of Environmental Science and Engineering, Guangzhou University, Guangzhou 510006, China
|
46
|
Li M, Chen H, Zhang H, Zeng M, Chen B, Guan L. Prediction of the Aqueous Solubility of Compounds Based on Light Gradient Boosting Machines with Molecular Fingerprints and the Cuckoo Search Algorithm. ACS OMEGA 2022; 7:42027-42035. [PMID: 36440111 PMCID: PMC9685740 DOI: 10.1021/acsomega.2c03885] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/21/2022] [Accepted: 10/18/2022] [Indexed: 06/16/2023]
Abstract
Aqueous solubility is one of the most important physicochemical properties in drug discovery. At present, the prediction of aqueous solubility of compounds is still a challenging problem. Machine learning has shown great potential in solubility prediction. Most machine learning models largely rely on the setting of hyperparameters, and their performance can be improved by setting the hyperparameters in a better way. In this paper, we used MACCS fingerprints to represent the structural features and optimized the hyperparameters of the light gradient boosting machine (LightGBM) with the cuckoo search algorithm (CS). Based on the above representation and optimization, the CS-LightGBM model was established to predict the aqueous solubility of 2446 organic compounds and the obtained prediction results were compared with those obtained with the other six different machine learning models (RF, GBDT, XGBoost, LightGBM, SVR, and BO-LightGBM). The comparison results showed that the CS-LightGBM model had a better prediction performance than the other six different models. RMSE, MAE, and R² of the CS-LightGBM model were, respectively, 0.7785, 0.5117, and 0.8575. In addition, this model has good scalability and can be used to solve solubility prediction problems in other fields such as solvent selection and drug screening.
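The three scores quoted above are standard regression metrics. As a minimal sketch of how they are conventionally computed (the `regression_metrics` helper and the toy values are hypothetical and unrelated to the study's data):

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (RMSE, MAE, R^2) for paired observed and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    # Residual and total sums of squares.
    ss_res = sum(e * e for e in errors)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when y_true is constant")

    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(e) for e in errors) / n
    r2 = 1.0 - ss_res / ss_tot
    return rmse, mae, r2


# Toy values for illustration only.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 2.5, 4.0]
rmse, mae, r2 = regression_metrics(y_true, y_pred)
print(rmse, mae, r2)
```

RMSE weights large errors more heavily than MAE, so a gap between the two, as in the reported 0.7785 versus 0.5117, suggests that a minority of compounds are predicted much less accurately than the rest.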
|
47
|
A critical review on thin-film nanocomposite membranes enabled by nanomaterials incorporated in different positions and with diverse dimensions: Performance comparison and mechanisms. J Memb Sci 2022. [DOI: 10.1016/j.memsci.2022.120952] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/23/2022]
|
48
|
Tao L, Byrnes J, Varshney V, Li Y. Machine learning strategies for the structure-property relationship of copolymers. iScience 2022; 25:104585. [PMID: 35789847 PMCID: PMC9249671 DOI: 10.1016/j.isci.2022.104585] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 03/28/2022] [Revised: 05/26/2022] [Accepted: 06/07/2022] [Indexed: 11/15/2022]
Abstract
Establishing the structure-property relationship is extremely valuable for the molecular design of copolymers. However, machine learning (ML) models that can incorporate both the chemical composition and the sequence distribution of monomers, and that can generalize across copolymer types (e.g., alternating, random, block, and gradient copolymers) with a unified approach, are missing. To address this challenge, we formulate four different ML models for investigation, including a feedforward neural network (FFNN) model, a convolutional neural network (CNN) model, a recurrent neural network (RNN) model, and a combined FFNN/RNN (Fusion) model. We use various copolymer types to systematically validate the performance and generalizability of different models. We find that the RNN architecture that processes the monomer sequence information both forward and backward is a more suitable ML model for copolymers with better generalizability. As a supplement to polymer informatics, our proposed approach provides an efficient way for the evaluation of copolymers.
Affiliation(s)
- Lei Tao
- Department of Mechanical Engineering, University of Connecticut, Storrs, CT 06269, USA
- Vikas Varshney
- Materials and Manufacturing Directorate, Air Force Research Laboratory, Wright-Patterson Air Force Base, Ohio 45433, USA
- Ying Li
- Department of Mechanical Engineering, University of Connecticut, Storrs, CT 06269, USA
- Polymer Program, Institute of Materials Science, University of Connecticut, Storrs, CT 06269, USA
|