1
|
Vishnyakov A. Machine Learning in Computational Design and Optimization of Disordered Nanoporous Materials. MATERIALS (BASEL, SWITZERLAND) 2025; 18:534. [PMID: 39942200 PMCID: PMC11818078 DOI: 10.3390/ma18030534] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/04/2024] [Revised: 12/30/2024] [Accepted: 12/31/2024] [Indexed: 02/16/2025]
Abstract
This review analyzes the current practices in the data-driven characterization, design and optimization of disordered nanoporous materials with pore sizes ranging from angstroms (active carbon and polymer membranes for gas separation) to tens of nm (aerogels). While the machine learning (ML)-based prediction and screening of crystalline, ordered porous materials are conducted frequently, materials with disordered porosity receive much less attention, although ML is expected to excel in the field, which is rich with ill-posed problems, non-linear correlations and a large volume of experimental results. For micro- and mesoporous solids (active carbons, mesoporous silica, aerogels, etc.), the obstacles are mostly related to the navigation of the available data with transferrable and easily interpreted features. The majority of published efforts are based on the experimental data obtained in the same work, and the datasets are often very small. Even with limited data, machine learning helps discover non-evident correlations and serves in material design and production optimization. The development of comprehensive databases for micro- and mesoporous materials with low-level structural and sorption characteristics, as well as automated synthesis/characterization protocols, is seen as the direction of efforts for the immediate future. This paper is written in a language readable by a chemist unfamiliar with the data science specifics.
Affiliation(s)
- Aleksey Vishnyakov
- Aramco Innovations LLC, 119234 Moscow, Russia;
- Department of Physics, Moscow State University, 119134 Moscow, Russia
| |
|
2
|
Dangayach R, Jeong N, Demirel E, Uzal N, Fung V, Chen Y. Machine Learning-Aided Inverse Design and Discovery of Novel Polymeric Materials for Membrane Separation. ENVIRONMENTAL SCIENCE & TECHNOLOGY 2025; 59:993-1012. [PMID: 39680111 PMCID: PMC11755723 DOI: 10.1021/acs.est.4c08298] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/10/2024] [Revised: 12/03/2024] [Accepted: 12/04/2024] [Indexed: 12/17/2024]
Abstract
Polymeric membranes have been widely used for liquid and gas separation in various industrial applications over the past few decades because of their exceptional versatility and high tunability. Traditional trial-and-error methods for material synthesis are inadequate to meet the growing demands for high-performance membranes. Machine learning (ML) has demonstrated huge potential to accelerate design and discovery of membrane materials. In this review, we cover strengths and weaknesses of the traditional methods, followed by a discussion on the emergence of ML for developing advanced polymeric membranes. We describe methodologies for data collection, data preparation, the commonly used ML models, and the explainable artificial intelligence (XAI) tools implemented in membrane research. Furthermore, we explain the experimental and computational validation steps to verify the results provided by these ML models. Subsequently, we showcase successful case studies of polymeric membranes and emphasize inverse design methodology within a ML-driven structured framework. Finally, we conclude by highlighting the recent progress, challenges, and future research directions to advance ML research for next generation polymeric membranes. With this review, we aim to provide a comprehensive guideline to researchers, scientists, and engineers assisting in the implementation of ML to membrane research and to accelerate the membrane design and material discovery process.
Affiliation(s)
- Raghav Dangayach
- School of Civil & Environmental Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
| | - Nohyeong Jeong
- School of Civil & Environmental Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
| | - Elif Demirel
- School of Civil & Environmental Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
| | - Nigmet Uzal
- School of Civil & Environmental Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
- Department of Civil Engineering, Abdullah Gul University, 38039 Kayseri, Turkey
| | - Victor Fung
- School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
| | - Yongsheng Chen
- School of Civil & Environmental Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
| |
|
3
|
Xie C, Li R, Li Y, Xie H, Liu Q. Imputation of Missing Data in Materials Science through Nearest Neighbors and Iterative Predictions. J Chem Theory Comput 2025; 21:70-78. [PMID: 39723676 DOI: 10.1021/acs.jctc.4c01237] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/28/2024]
Abstract
Missing data in tabular data sets are ubiquitous in statistical analysis, big data analysis, and machine learning studies. Many strategies have been proposed to impute missing data, but their reliability has not been stringently assessed in materials science. Here, we carried out a benchmark test of six imputation strategies (Mean, MissForest, HyperImpute, Gain, Sinkhorn, and a newly proposed MatImpute) on seven representative data sets in materials science. The imputation-induced errors (IIEs) were evaluated through the difference between imputed and original values using the root mean square error (RMSE), the Wasserstein distance (WD), and a newly introduced metric, data set correlation convergence (DCC), which measure the difference at three levels: individual data, column-wise distribution, and correlation stability of a data set. MatImpute outperformed the others with the lowest RMSE and WD and the highest DCC. The IIE increases with the data missing ratio and in the order missing at random < missing completely at random ≤ missing not at random, considering inherent correlations among missing data. A similar trend was observed for the increase of IIE with the central departure distance in units of the standard deviation, which is consistent with the increase in difficulty from interpolation to extrapolation. In further tests of IIE in regression and classification machine learning predictive models, MatImpute also preserved the highest data recovery fidelity. We released the code of MatImpute to facilitate the construction of high-quality data sets in materials science.
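As an illustration of how an imputation-induced error can be measured, the following minimal sketch masks entries completely at random, fills them with the column mean (the simplest of the benchmarked strategies), and computes the RMSE over the masked entries. It is not MatImpute; the function names and the data column are hypothetical and chosen only for this example.

```python
import math
import random

def mask_completely_at_random(column, missing_ratio, seed=0):
    """Return a copy of `column` with a fraction of entries set to None (MCAR)."""
    rng = random.Random(seed)
    n_missing = int(round(missing_ratio * len(column)))
    missing_idx = set(rng.sample(range(len(column)), n_missing))
    return [None if i in missing_idx else v for i, v in enumerate(column)]

def mean_impute(column):
    """Fill every missing entry with the mean of the observed entries."""
    observed = [v for v in column if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in column]

def imputation_rmse(original, masked, imputed):
    """RMSE between imputed and original values, over masked entries only."""
    sq_errors = [
        (imp - orig) ** 2
        for orig, m, imp in zip(original, masked, imputed)
        if m is None
    ]
    return math.sqrt(sum(sq_errors) / len(sq_errors)) if sq_errors else 0.0

# Hypothetical property column; a real benchmark would use a full data set.
original = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
masked = mask_completely_at_random(original, missing_ratio=0.25)
imputed = mean_impute(masked)
rmse = imputation_rmse(original, masked, imputed)
```

Repeating the calculation for increasing `missing_ratio` values would give the kind of error-versus-missing-ratio trend the abstract describes, here only for the mean baseline.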
Affiliation(s)
- Chunhui Xie
- Department of Polymer Materials and Engineering, College of Materials and Metallurgy, Guizhou University, Guiyang 550025, P. R. China
| | - Rui Li
- Department of Polymer Materials and Engineering, College of Materials and Metallurgy, Guizhou University, Guiyang 550025, P. R. China
| | - Yunqi Li
- Department of Polymer Materials and Engineering, College of Materials and Metallurgy, Guizhou University, Guiyang 550025, P. R. China
| | - Haibo Xie
- Department of Polymer Materials and Engineering, College of Materials and Metallurgy, Guizhou University, Guiyang 550025, P. R. China
| | - Qibin Liu
- Department of Polymer Materials and Engineering, College of Materials and Metallurgy, Guizhou University, Guiyang 550025, P. R. China
| |
|
4
|
Olayiwola T, Kumar R, Romagnoli JA. Empowering Capacitive Devices: Harnessing Transfer Learning for Enhanced Data-Driven Optimization. Ind Eng Chem Res 2024; 63:11971-11981. [PMID: 39015815 PMCID: PMC11247430 DOI: 10.1021/acs.iecr.4c01171] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/18/2024]
Abstract
Developing data-driven models has found successful applications in engineering tasks, such as material design, process modeling, and process monitoring. In capacitive devices like deionization and supercapacitors, there exists potential for applying this data-driven machine learning (ML) model in optimizing its potential use in energy-efficient separations or energy generation. However, these models are faced with limited datasets, and even in large quantities, the datasets are incomplete, limiting their potential use for successful data-driven modeling. Here, the success of transfer learning in resolving the challenges with limited datasets was exploited. A two-step data-driven ML modeling framework named ImputeNet involving training with ML-imputed datasets and then with clean datasets was explored. Through data imputation and transfer learning, it is possible to develop a data-driven model with acceptable metrics mirroring experimental measurements. By using the model, optimization studies using the genetic algorithm were implemented to analyze the solution under the Pareto optimality. This early insight can be used in the initial stage of experimental measurements to rapidly identify experimental conditions worthy of further investigation. Moreover, we expect that the insights from these results will drive accurate predictive modeling in other fields including healthcare, genomic data analysis, and environmental monitoring with incomplete datasets.
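The Pareto analysis mentioned above can be illustrated with a short sketch of non-dominated filtering. It covers only the final selection step, not the genetic algorithm or the ImputeNet model; the two objectives and the candidate values are hypothetical, and both objectives are assumed to be minimized.

```python
def dominates(a, b):
    """True if `a` is no worse than `b` on every objective and strictly
    better on at least one (all objectives are minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(points):
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical candidates: (energy consumption, residual salt concentration).
candidates = [(1.0, 5.0), (2.0, 3.0), (3.0, 4.0), (4.0, 1.0), (2.5, 3.5)]
front = pareto_front(candidates)
# (3.0, 4.0) and (2.5, 3.5) are dominated by (2.0, 3.0) and are dropped.
```

The surviving points are the trade-off conditions that cannot be improved on one objective without worsening the other, which is what makes them candidates for follow-up experiments.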
Affiliation(s)
- Teslim Olayiwola
- Cain Department of Chemical Engineering, Louisiana State University, Baton Rouge, Louisiana 70803, United States
| | - Revati Kumar
- Department of Chemistry, Louisiana State University, Baton Rouge, Louisiana 70803, United States
| | - Jose A. Romagnoli
- Cain Department of Chemical Engineering, Louisiana State University, Baton Rouge, Louisiana 70803, United States
| |
|
5
|
Al-Sakkari EG, Ragab A, Dagdougui H, Boffito DC, Amazouz M. Carbon capture, utilization and sequestration systems design and operation optimization: Assessment and perspectives of artificial intelligence opportunities. THE SCIENCE OF THE TOTAL ENVIRONMENT 2024; 917:170085. [PMID: 38224888 DOI: 10.1016/j.scitotenv.2024.170085] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/14/2023] [Revised: 12/10/2023] [Accepted: 01/09/2024] [Indexed: 01/17/2024]
Abstract
Carbon capture, utilization, and sequestration (CCUS) is a promising solution to decarbonize the energy and industrial sectors to mitigate climate change. An integrated assessment of technological options is required for the effective deployment of CCUS large-scale infrastructure between CO2 production and utilization/sequestration nodes. However, developing cost-effective strategies from engineering and operation perspectives to implement CCUS is challenging. This is due to the diversity of upstream emitting processes located in different geographical areas, available downstream utilization technologies, storage sites capacity/location, and current/future energy/emissions/economic conditions. This paper identifies the need to achieve a robust hybrid assessment tool for CCUS modeling, simulation, and optimization based mainly on artificial intelligence (AI) combined with mechanistic methods. Thus, a critical literature review is conducted to assess CCUS technologies and their related process modeling/simulation/optimization techniques, while evaluating the needs for improvements or new developments to reduce overall CCUS systems design and operation costs. These techniques include first principles-based and data-driven ones, i.e. AI and related machine learning (ML) methods. In addition, the paper gives an overview of the role of life cycle assessment (LCA) in evaluating CCUS systems, where the combined LCA-AI approach is assessed. Other advanced methods based on AI/ML capabilities/algorithms can be developed to optimize the whole CCUS value chain. Interpretable ML combined with explainable AI can accelerate optimum materials selection by giving strong rules that accelerate the subsequent design of capture/utilization plants. Furthermore, deep reinforcement learning (DRL) coupled with process simulations will accelerate process design/operation optimization by considering simultaneous optimization of equipment sizing and operating conditions. Moreover, generative deep learning (GDL) is a key solution to optimum capture/utilization materials design/discovery. The developed AI methods can be generalizable, where the extracted knowledge can be transferred to future works to help cut the costs of the CCUS value chain.
Affiliation(s)
- Eslam G Al-Sakkari
- Department of Mathematics and Industrial Engineering, Polytechnique Montréal, 2500 Chemin de Polytechnique, Montréal, Québec H3T 1J4, Canada; CanmetENERGY, 1615 Lionel-Boulet Blvd, P.O. Box 4800, Varennes, Québec J3X 1P7, Canada.
| | - Ahmed Ragab
- Department of Mathematics and Industrial Engineering, Polytechnique Montréal, 2500 Chemin de Polytechnique, Montréal, Québec H3T 1J4, Canada; CanmetENERGY, 1615 Lionel-Boulet Blvd, P.O. Box 4800, Varennes, Québec J3X 1P7, Canada
| | - Hanane Dagdougui
- Department of Mathematics and Industrial Engineering, Polytechnique Montréal, 2500 Chemin de Polytechnique, Montréal, Québec H3T 1J4, Canada
| | - Daria C Boffito
- Department of Chemical Engineering, Polytechnique Montréal, 2500 Chemin de Polytechnique, Montréal, Québec H3T 1J4, Canada; Canada Research Chair in Engineering Process Intensification and Catalysis (EPIC), Canada
| | - Mouloud Amazouz
- CanmetENERGY, 1615 Lionel-Boulet Blvd, P.O. Box 4800, Varennes, Québec J3X 1P7, Canada
| |
|
6
|
Mobili R, La Cognata S, Monteleone M, Longo M, Fuoco A, Serapian SA, Vigani B, Milanese C, Armentano D, Jansen JC, Amendola V. Gas Permeation through Mechanically Resistant Self-Standing Membranes of a Neat Amorphous Organic Cage. Chemistry 2023; 29:e202301437. [PMID: 37433050 DOI: 10.1002/chem.202301437] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/05/2023] [Revised: 06/27/2023] [Accepted: 07/11/2023] [Indexed: 07/13/2023]
Abstract
The synthesis and characterization of a novel film-forming organic cage and of its smaller analogue are here described. While the small cage produced single crystals suitable for X-ray diffraction studies, the large one was isolated as a dense film. Due to its remarkable film-forming properties, this latter cage could be solution processed into transparent thin-layer films and mechanically stable dense self-standing membranes of controllable thickness. Thanks to these peculiar features, the membranes were also successfully tested for gas permeation, reporting a behavior similar to that found with stiff glassy polymers such as polymers of intrinsic microporosity or polyimides. Given the growing interest in the development of molecular-based membranes, for example for separation technologies and functional coatings, the properties of this organic cage were investigated by thorough analysis of their structural, thermal, mechanical and gas transport properties, and by detailed atomistic simulations.
Affiliation(s)
- Riccardo Mobili
- Department of Chemistry, University of Pavia, viale Torquato Taramelli 12, 27100, Pavia, Italy
| | - Sonia La Cognata
- Department of Chemistry, University of Pavia, viale Torquato Taramelli 12, 27100, Pavia, Italy
| | - Marcello Monteleone
- Institute on Membrane Technology, National Research Council of Italy (CNR-ITM), via P. Bucci 17/C, Rende (CS), 87036, Italy
| | - Mariagiulia Longo
- Institute on Membrane Technology, National Research Council of Italy (CNR-ITM), via P. Bucci 17/C, Rende (CS), 87036, Italy
| | - Alessio Fuoco
- Institute on Membrane Technology, National Research Council of Italy (CNR-ITM), via P. Bucci 17/C, Rende (CS), 87036, Italy
| | - Stefano A Serapian
- Department of Chemistry, University of Pavia, viale Torquato Taramelli 12, 27100, Pavia, Italy
| | - Barbara Vigani
- Department of Drug Sciences, University of Pavia, viale Torquato Taramelli 12, 27100, Pavia, Italy
| | - Chiara Milanese
- Department of Chemistry, University of Pavia, viale Torquato Taramelli 12, 27100, Pavia, Italy
| | - Donatella Armentano
- Department of Chemistry & Chemical Technologies, University of Calabria, Via P. Bucci, 13/C, 87036, Rende (CS), Italy
| | - Johannes C Jansen
- Institute on Membrane Technology, National Research Council of Italy (CNR-ITM), via P. Bucci 17/C, Rende (CS), 87036, Italy
| | - Valeria Amendola
- Department of Chemistry, University of Pavia, viale Torquato Taramelli 12, 27100, Pavia, Italy
| |
|
7
|
Behnoudfar D, Simon CM, Schrier J. Data-Driven Imputation of Miscibility of Aqueous Solutions via Graph-Regularized Logistic Matrix Factorization. J Phys Chem B 2023; 127:7964-7973. [PMID: 37682958 DOI: 10.1021/acs.jpcb.3c03789] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/10/2023]
Abstract
Aqueous, two-phase systems (ATPSs) may form upon mixing two solutions of independently water-soluble compounds. Many separation, purification, and extraction processes rely on ATPSs. Predicting the miscibility of solutions can accelerate and reduce the cost of the discovery of new ATPSs for these applications. Whereas previous machine learning approaches to ATPS prediction used physicochemical properties of each solute as a descriptor, in this work, we show how to impute missing miscibility outcomes directly from an incomplete collection of pairwise miscibility experiments. We use graph-regularized logistic matrix factorization (GR-LMF) to learn a latent vector of each solution from (i) the observed entries in the pairwise miscibility matrix and (ii) a graph where each node is a solution and edges are relationships indicating the general category of the solute (i.e., polymer, surfactant, salt, protein). For an experimental data set of the pairwise miscibility of 68 solutions from Peacock et al. [ACS Appl. Mater. Interfaces 2021, 13, 11449-11460], we find that GR-LMF more accurately predicts missing (im)miscibility outcomes of pairs of solutions than ordinary logistic matrix factorization and random forest classifiers that use physicochemical features of the solutes. GR-LMF obviates the need for features of the solutes and solutions to impute missing miscibility outcomes, but it cannot predict the miscibility of a new solution without some observations of its miscibility with other solutions in the training data set.
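The idea of learning latent vectors from observed pairs plus a category graph can be sketched as follows. This is a minimal illustration under assumptions not stated in the abstract: the miscibility probability is modeled as a sigmoid of the dot product of two latent vectors plus a bias, the graph term is a squared-difference penalty between linked solutions, and training is plain gradient descent. The toy data, the label convention (1 = miscible), and all parameter values are hypothetical.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def predict(U, bias, i, j):
    """Predicted probability that solutions i and j are miscible."""
    return sigmoid(dot(U[i], U[j]) + bias)

def fit(n, observed, edges, dim=2, gamma=0.1, lr=0.05, steps=500, seed=0):
    """Gradient descent on cross-entropy over observed pairs plus
    gamma * ||u_i - u_j||^2 summed over graph edges (assumed objective)."""
    rng = random.Random(seed)
    U = [[rng.gauss(0.0, 0.1) for _ in range(dim)] for _ in range(n)]
    bias = 0.0
    for _ in range(steps):
        grad_U = [[0.0] * dim for _ in range(n)]
        grad_b = 0.0
        # Data term: only observed entries of the pairwise matrix contribute.
        for (i, j), y in observed.items():
            err = predict(U, bias, i, j) - y
            grad_b += err
            for k in range(dim):
                grad_U[i][k] += err * U[j][k]
                grad_U[j][k] += err * U[i][k]
        # Graph term: pull latent vectors of same-category solutions together.
        for i, j in edges:
            for k in range(dim):
                diff = 2.0 * gamma * (U[i][k] - U[j][k])
                grad_U[i][k] += diff
                grad_U[j][k] -= diff
        for i in range(n):
            for k in range(dim):
                U[i][k] -= lr * grad_U[i][k]
        bias -= lr * grad_b
    return U, bias

# Toy data: solutions 0 and 1 share a category, as do 2 and 3.
observed = {(0, 1): 1, (2, 3): 1, (0, 2): 0, (1, 3): 0}
edges = [(0, 1), (2, 3)]
U, bias = fit(4, observed, edges)
p_missing = predict(U, bias, 0, 3)  # imputed outcome for an unobserved pair
```

In line with the limitation stated in the abstract, a solution that appears in no observed pair would keep its random initial vector apart from the graph pull, so this kind of model has little to say about it.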
Affiliation(s)
- Diba Behnoudfar
- School of Chemical, Biological, and Environmental Engineering, Oregon State University, Corvallis, Oregon 97331, United States
| | - Cory M Simon
- School of Chemical, Biological, and Environmental Engineering, Oregon State University, Corvallis, Oregon 97331, United States
| | - Joshua Schrier
- Department of Chemistry, Fordham University, The Bronx, New York 10458, United States
| |
|
8
|
Chen H, Zheng Y, Li J, Li L, Wang X. AI for Nanomaterials Development in Clean Energy and Carbon Capture, Utilization and Storage (CCUS). ACS NANO 2023. [PMID: 37267448 DOI: 10.1021/acsnano.3c01062] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 06/04/2023]
Abstract
Zero-carbon energy and negative emission technologies are crucial for achieving a carbon neutral future, and nanomaterials have played critical roles in advancing such technologies. More recently, due to the explosive growth in data, the adoption and exploitation of artificial intelligence (AI) as part of the materials research framework have had a tremendous impact on the development of nanomaterials. AI has enabled revolutionary next-generation paradigms to significantly accelerate all stages of material discovery and facilitate the exploration of the enormous design space. In this review, we summarize recent advancements of AI applications in nanomaterials discovery, with a special emphasis on the selected applications of AI and nanotechnology for the net-zero emission future including the development of solar cells, hydrogen energy, battery materials for renewable energy, and CO2 capture and conversion materials for carbon capture, utilization and storage (CCUS) technologies. In addition, we discuss the limitations and challenges of current AI applications in this area by identifying the gaps that exist in current development. Finally, we present the prospect for future research directions in order to facilitate the large-scale applications of artificial intelligence for advancements in nanomaterials.
Affiliation(s)
- Honghao Chen
- Department of Chemical Engineering, Tsinghua University, Beijing 100084, China
| | - Yingzhe Zheng
- Department of Chemical and Biomolecular Engineering, National University of Singapore, 117585, Singapore
| | - Jiali Li
- Department of Chemical and Biomolecular Engineering, National University of Singapore, 117585, Singapore
| | - Lanyu Li
- Department of Chemical Engineering, Tsinghua University, Beijing 100084, China
| | - Xiaonan Wang
- Department of Chemical Engineering, Tsinghua University, Beijing 100084, China
- Department of Chemical and Biomolecular Engineering, National University of Singapore, 117585, Singapore
| |
|
9
|
Wang J, Tian K, Li D, Chen M, Feng X, Zhang Y, Wang Y, Van der Bruggen B. Machine learning in gas separation membrane developing: ready for prime time. Sep Purif Technol 2023. [DOI: 10.1016/j.seppur.2023.123493] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/03/2023]
|
10
|
Wang C, Wang L, Soo A, Bansidhar Pathak N, Kyong Shon H. Machine learning based prediction and optimization of thin film nanocomposite membranes for organic solvent nanofiltration. Sep Purif Technol 2023. [DOI: 10.1016/j.seppur.2022.122328] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/04/2023]
|
11
|
Cheng X, Liao Y, Lei Z, Li J, Fan X, Xiao X. Multi-scale design of MOF-based membrane separation for CO2/CH4 mixture via integration of molecular simulation, machine learning and process modeling and simulation. J Memb Sci 2023. [DOI: 10.1016/j.memsci.2023.121430] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/26/2023]
|
12
|
Abstract
Computational modeling is increasingly used to assist in the discovery of supramolecular materials. Supramolecular materials are typically primarily built from organic components that are self-assembled through noncovalent bonding and have potential applications, including in selective binding, sorption, molecular separations, catalysis, optoelectronics, sensing, and as molecular machines. In this review, the key areas where computational prediction can assist in the discovery of supramolecular materials, including in structure prediction, property prediction, and the prediction of how to synthesize a hypothetical material are discussed, before exploring the potential impact of artificial intelligence techniques on the field. Throughout, the importance of close integration with experimental materials discovery programs will be highlighted. A series of case studies from the author's work across some different supramolecular material classes will be discussed, before finishing with a discussion of the outlook for the field.
Affiliation(s)
- Kim E. Jelfs
- Department of Chemistry, Molecular Sciences Research Hub, Imperial College London, London, UK
| |
|
13
|
Yang J, Tao L, He J, McCutcheon JR, Li Y. Machine learning enables interpretable discovery of innovative polymers for gas separation membranes. SCIENCE ADVANCES 2022; 8:eabn9545. [PMID: 35857839 PMCID: PMC9299556 DOI: 10.1126/sciadv.abn9545] [Citation(s) in RCA: 37] [Impact Index Per Article: 12.3] [Received: 01/04/2022] [Accepted: 06/07/2022] [Indexed: 05/21/2023]
Abstract
Polymer membranes perform innumerable separations with far-reaching environmental implications. Despite decades of research, design of new membrane materials remains a largely Edisonian process. To address this shortcoming, we demonstrate a generalizable, accurate machine learning (ML) implementation for the discovery of innovative polymers with ideal performance. Specifically, multitask ML models are trained on experimental data to link polymer chemistry to gas permeabilities of He, H2, O2, N2, CO2, and CH4. We interpret the ML models and extract valuable insights into the contributions of different chemical moieties to permeability and selectivity. We then screen over 9 million hypothetical polymers and identify thousands that lie well above current performance upper bounds, including hundreds of never-before-seen ultrapermeable polymer membranes with O2 and CO2 permeability greater than 10^4 and 10^5 Barrers, respectively. High-fidelity molecular dynamics simulations confirm the ML-predicted gas permeabilities of the promising candidates, which suggests that many can be translated to reality.
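The screening step can be pictured with a small sketch that filters predicted permeabilities against the ultrapermeability thresholds, read here as 10^4 Barrer for O2 and 10^5 Barrer for CO2, and computes the ideal selectivity as the ratio of two pure-gas permeabilities. The candidate names and values are hypothetical placeholders for model predictions; the comparison with performance upper bounds is not reproduced.

```python
def ideal_selectivity(permeability_a, permeability_b):
    """Ideal selectivity of gas A over gas B from pure-gas permeabilities."""
    return permeability_a / permeability_b

def is_ultrapermeable(perm):
    """Thresholds in Barrer, taken from the abstract: O2 > 1e4 and CO2 > 1e5."""
    return perm["O2"] > 1e4 and perm["CO2"] > 1e5

# Hypothetical predicted permeabilities (Barrer) for three candidate polymers.
predicted = {
    "polymer_A": {"O2": 2.0e4, "CO2": 3.0e5, "N2": 8.0e3},
    "polymer_B": {"O2": 5.0e3, "CO2": 3.0e5, "N2": 1.0e3},
    "polymer_C": {"O2": 1.5e4, "CO2": 9.0e4, "N2": 5.0e3},
}

hits = [name for name, perm in predicted.items() if is_ultrapermeable(perm)]
o2_n2 = {name: ideal_selectivity(p["O2"], p["N2"]) for name, p in predicted.items()}
```

Only `polymer_A` passes both thresholds in this toy set; the other two each fail one of them.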
Affiliation(s)
- Jason Yang
- Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, CA 91125, USA
| | - Lei Tao
- Department of Mechanical Engineering, University of Connecticut, Storrs, CT 06269, USA
| | - Jinlong He
- Department of Mechanical Engineering, University of Connecticut, Storrs, CT 06269, USA
| | - Jeffrey R. McCutcheon
- Department of Chemical & Biomolecular Engineering, Center for Environmental Sciences and Engineering, University of Connecticut, Storrs, CT 06269, USA
- Polymer Program, Institute of Materials Science, University of Connecticut, Storrs, CT 06269, USA
| | - Ying Li
- Department of Mechanical Engineering, University of Connecticut, Storrs, CT 06269, USA
- Polymer Program, Institute of Materials Science, University of Connecticut, Storrs, CT 06269, USA
- Corresponding author.
| |
|
14
|
Tao L, Byrnes J, Varshney V, Li Y. Machine learning strategies for the structure-property relationship of copolymers. iScience 2022; 25:104585. [PMID: 35789847 PMCID: PMC9249671 DOI: 10.1016/j.isci.2022.104585] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 03/28/2022] [Revised: 05/26/2022] [Accepted: 06/07/2022] [Indexed: 11/15/2022]
Abstract
Establishing the structure-property relationship is extremely valuable for the molecular design of copolymers. However, machine learning (ML) models that can incorporate both the chemical composition and the sequence distribution of monomers, and that have the generalization ability to process various copolymer types (e.g., alternating, random, block, and gradient copolymers) with a unified approach, are missing. To address this challenge, we formulate four different ML models for investigation, including a feedforward neural network (FFNN) model, a convolutional neural network (CNN) model, a recurrent neural network (RNN) model, and a combined FFNN/RNN (Fusion) model. We use various copolymer types to systematically validate the performance and generalizability of different models. We find that the RNN architecture that processes the monomer sequence information both forward and backward is a more suitable ML model for copolymers with better generalizability. As a supplement to polymer informatics, our proposed approach provides an efficient way for the evaluation of copolymers.
Affiliation(s)
- Lei Tao
- Department of Mechanical Engineering, University of Connecticut, Storrs, CT 06269, USA
| | | | - Vikas Varshney
- Materials and Manufacturing Directorate, Air Force Research Laboratory, Wright-Patterson Air Force Base, Ohio 45433, USA
| | - Ying Li
- Department of Mechanical Engineering, University of Connecticut, Storrs, CT 06269, USA
- Polymer Program, Institute of Materials Science, University of Connecticut, Storrs, CT 06269, USA
| |
|
15
|
Anstine DM, Sholl DS, Siepmann JI, Snurr RQ, Aspuru-Guzik A, Colina CM. In silico design of microporous polymers for chemical separations and storage. Curr Opin Chem Eng 2022. [DOI: 10.1016/j.coche.2022.100795] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 11/26/2022]
|
16
|
Yokoyama D, Suzuki S, Asakura T, Kikuchi J. Chemometric Analysis of NMR Spectra and Machine Learning to Investigate Membrane Fouling. ACS OMEGA 2022; 7:12654-12660. [PMID: 35474825 PMCID: PMC9025983 DOI: 10.1021/acsomega.1c06891] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 12/06/2021] [Accepted: 03/02/2022] [Indexed: 05/26/2023]
Abstract
Efficient membrane filtration requires the understanding of the membrane foulants and the functional properties of different membrane types in water purification. In this study, dead-end filtration of aquaculture system effluents was performed and the membrane foulants were investigated via nuclear magnetic resonance (NMR) spectroscopy. Several machine learning models (Random Forest; RF, Extreme Gradient Boosting; XGBoost, Support Vector Machine; SVM, and Neural Network; NN) were constructed, one to predict the maximum transmembrane pressure, for revealing the chemical compounds causing fouling, and the other to classify the membrane materials based on chemometric analysis of NMR spectra, for determining their effect on the properties of the different membrane types tested. Especially, RF models exhibited high accuracy; the important chemical shifts observed in both the regression and classification models suggested that the proportional patterns of sugars and proteins are key factors in the fouling progress and the classification of membrane types. Therefore, the proposed strategy of chemometric analysis of NMR spectra is suitable for membrane research, which aims at investigating comprehensively the fouling phenomenon and how the foulants and environmental conditions vary according to the filtration systems.
Affiliation(s)
- Daiki Yokoyama
- RIKEN Center for Sustainable Resource Science, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
- Graduate School of Medical Life Science, Yokohama City University, 1-7-29 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
| | - Sosei Suzuki
- Graduate School of Medical Life Science, Yokohama City University, 1-7-29 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
| | - Taiga Asakura
- RIKEN Center for Sustainable Resource Science, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
- Graduate School of Medical Life Science, Yokohama City University, 1-7-29 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
| | - Jun Kikuchi
- RIKEN Center for Sustainable Resource Science, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
- Graduate School of Medical Life Science, Yokohama City University, 1-7-29 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
- Graduate School of Bioagricultural Sciences, Nagoya University, 1 Furo-cho, Chikusa-ku, Nagoya, Aichi 464-0810, Japan
| |
|
17
|
Missing Data Imputation – A Survey. INTERNATIONAL JOURNAL OF DECISION SUPPORT SYSTEM TECHNOLOGY 2022. [DOI: 10.4018/ijdsst.292446] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
Abstract
Many real-world datasets may contain missing values for various reasons. These incomplete datasets can pose severe issues to the underlying machine learning algorithms and decision support systems. This may result in high computational cost, skewed output and invalid deductions. Various solutions exist to mitigate this issue; the most popular strategy is to estimate the missing values by applying inferential techniques such as linear regression, decision trees or Bayesian inference. In this paper, the missing data problem is discussed in detail with a comprehensive review of the approaches to tackle it. The paper concludes with a discussion on the effectiveness of three imputation methods, namely imputation based on Multiple Linear Regression (MLR), Predictive Mean Matching (PMM) and Classification And Regression Tree (CART), in the context of subspace clustering. The experimental results obtained on real benchmark datasets and high-dimensional synthetic datasets highlight that the MLR-based imputation method is more efficient on high-dimensional incomplete datasets.
|
18
|
Alamoodi A, Zaidan B, Zaidan A, Albahri O, Chen J, Chyad M, Garfan S, Aleesa A. Machine learning-based imputation soft computing approach for large missing scale and non-reference data imputation. CHAOS, SOLITONS & FRACTALS 2021; 151:111236. [DOI: 10.1016/j.chaos.2021.111236] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Indexed: 01/04/2025]
|
19
|
Comparative Study of Machine Learning Approaches for Predicting Creep Behavior of Polyurethane Elastomer. Polymers (Basel) 2021; 13:polym13111768. [PMID: 34071349 PMCID: PMC8198355 DOI: 10.3390/polym13111768] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 05/05/2021] [Revised: 05/25/2021] [Accepted: 05/25/2021] [Indexed: 11/28/2022]
Abstract
The long-term mechanical properties of viscoelastic polymers are among their most important aspects. In the present research, machine learning approaches were proposed for predicting the creep properties of a polyurethane elastomer, considering the effect of creep time, creep temperature, creep stress and the hardness of the material. The approaches are based on a multilayer perceptron network, random forest and support vector machine regression, respectively. The genetic algorithm and k-fold cross-validation were used to tune the hyper-parameters. The results showed that all three models exhibited excellent fitting ability on the training set. Moreover, the three models had different prediction capabilities on the testing set when focusing on different changing factors. The correlation coefficient values between the predicted and experimental strains were larger than 0.913 (mostly larger than 0.998) on the testing set when a reasonable model was chosen.
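For readers unfamiliar with k-fold cross-validation, the sketch below shows the bookkeeping behind it: the samples are split into k contiguous folds, each fold is held out once, and the error is pooled over all held-out samples. The mean predictor stands in for the perceptron, random forest, or support vector models, and the data are hypothetical. In practice the samples would usually be shuffled before splitting, and the hyper-parameter search (here, the genetic algorithm) would wrap around this scoring function.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds whose
    sizes differ by at most one."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size

def cross_val_mse(xs, ys, k, fit, predict):
    """Mean squared error pooled over all held-out samples."""
    sq_errors = []
    for train_idx, test_idx in k_fold_indices(len(ys), k):
        model = fit([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        sq_errors += [(predict(model, xs[i]) - ys[i]) ** 2 for i in test_idx]
    return sum(sq_errors) / len(sq_errors)

# Placeholder model: always predicts the mean of the training targets.
def fit_mean(train_x, train_y):
    return sum(train_y) / len(train_y)

def predict_mean(model, x):
    return model

# Hypothetical data: creep time as input, strain as target.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 2.0, 3.0, 4.0]
mse = cross_val_mse(xs, ys, k=2, fit=fit_mean, predict=predict_mean)
```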
|