1
Ge W, De Silva R, Fan Y, Sisson SA, Stenzel MH. Machine Learning in Polymer Research. Advanced Materials (Deerfield Beach, Fla.) 2025; 37:e2413695. [PMID: 39924835 PMCID: PMC11923530 DOI: 10.1002/adma.202413695] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/11/2024] [Revised: 12/21/2024] [Indexed: 02/11/2025]
Abstract
Machine learning is increasingly being applied in polymer chemistry to link chemical structures to macroscopic properties of polymers and to identify chemical patterns in the polymer structures that help improve specific properties. To facilitate this, a chemical dataset needs to be translated into machine-readable descriptors. However, limited and inadequately curated datasets, broad molecular weight distributions, and irregular polymer configurations pose significant challenges. Off-the-shelf mathematical models often need refinement for specific applications. Addressing these challenges demands close collaboration between chemists and mathematicians: chemists must formulate research questions in mathematical terms, while mathematicians are required to refine models for specific applications. This review unites both disciplines to address dataset curation hurdles and highlight advances in polymer synthesis and modeling that enhance data availability. It then surveys ML approaches used to predict solid-state properties, solution behavior, composite performance, and emerging applications such as drug delivery and the polymer-biology interface. The review concludes with a perspective on the field, discussing the importance of FAIR (findability, accessibility, interoperability, and reusability) data and the integration of polymer theory and data, and sharing thoughts on the machine-human interface.
Affiliation(s)
- Wei Ge
  - School of Chemistry, University of New South Wales, Sydney, 2052, Australia
  - School of Mathematics and Statistics and UNSW Data Science Hub, University of New South Wales, Sydney, 2052, Australia
- Ramindu De Silva
  - School of Chemistry, University of New South Wales, Sydney, 2052, Australia
  - School of Mathematics and Statistics and UNSW Data Science Hub, University of New South Wales, Sydney, 2052, Australia
  - Data61, CSIRO, Sydney, NSW, 2015, Australia
- Yanan Fan
  - School of Mathematics and Statistics and UNSW Data Science Hub, University of New South Wales, Sydney, 2052, Australia
  - Data61, CSIRO, Sydney, NSW, 2015, Australia
- Scott A Sisson
  - School of Mathematics and Statistics and UNSW Data Science Hub, University of New South Wales, Sydney, 2052, Australia
- Martina H Stenzel
  - School of Chemistry, University of New South Wales, Sydney, 2052, Australia

2
Curtis KA, Statt A, Reinhart WF. Predicting self-assembly of sequence-controlled copolymers with stochastic sequence variation. Soft Matter 2025. [PMID: 39989378 DOI: 10.1039/d4sm01219d] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/25/2025]
Abstract
Sequence-controlled copolymers can self-assemble into a wide assortment of complex architectures, with exciting applications in nanofabrication and personalized medicine. However, polymer synthesis is notoriously imprecise, and stochasticity in both chemical synthesis and self-assembly poses a significant challenge to tight control over these systems. While it is increasingly viable to design "protein-like" sequences, specifying each individual monomer in a chain, the effect of variability within those sequences has not been well studied. In this work, we performed nearly 15 000 molecular dynamics simulations of sequence-controlled copolymer aggregates with varying levels of sequence stochasticity. We utilized unsupervised learning to characterize the resulting morphologies and found that sequence variation leads to relatively smooth and predictable changes in morphology compared to ensembles of identical chains. Furthermore, structural response to sequence variation was accurately modeled using supervised learning, revealing several interesting trends in how specific families of sequences break down as monomer sequences become more variable. Our work presents a way forward in understanding and controlling the effect of sequence variation in sequence-controlled copolymer systems, which can hopefully be used to design advanced copolymer systems for technological applications in the future.
Affiliation(s)
- Kaleigh A Curtis
  - Department of Materials Science and Engineering, Pennsylvania State University, University Park, PA, USA
  - Institute for Computational and Data Sciences, Pennsylvania State University, University Park, PA, USA
- Antonia Statt
  - Department of Materials Science and Engineering, Grainger College of Engineering, University of Illinois Urbana-Champaign, Champaign, IL, USA
- Wesley F Reinhart
  - Department of Materials Science and Engineering, Pennsylvania State University, University Park, PA, USA
  - Institute for Computational and Data Sciences, Pennsylvania State University, University Park, PA, USA

3
Hore MJA. Analysis of the internal motions of thermoresponsive polymers and single chain nanoparticles. Soft Matter 2025; 21:770-780. [PMID: 39792065 DOI: 10.1039/d4sm01308e] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/12/2025]
Abstract
Data-driven techniques, such as proper orthogonal decomposition (POD) and uniform manifold approximation & projection (UMAP), are powerful methods for understanding polymer behavior in complex systems that extend beyond ideal conditions. They are based on the principle that low-dimensional behaviors are often embedded within the structure and dynamics of complex systems. Here, the internal motions of a thermoresponsive, LCST polymer are investigated for two cases: (1) the coil-to-globule transition that occurs as the system is heated above its critical temperature and (2) intramolecularly crosslinked, single chain nanoparticles (SCNPs) both above and below the critical temperature (TC). Our results demonstrate that POD can successfully extract the key features of the dynamics for both polymer globules and SCNPs. In the globular state, our results show that the relaxation modes are distorted relative to the coil state and relaxation times decrease upon chain collapse. After randomly crosslinking a globule to produce a SCNP, we observe a further distortion of the relaxation modes that depends strongly upon the particular set of monomers that are crosslinked. Yet, different sets of crosslinked monomers produce similar relaxation times for the SCNP. We observe that for SCNPs below the critical temperature, the relaxation times decrease with increasing crosslink density while above the critical temperature, they increase as crosslink density increases. Finally, using UMAP we categorize the local structure of SCNPs and examine the influence of the local structure on SCNP relaxation dynamics.
Affiliation(s)
- Michael J A Hore
  - Department of Macromolecular Science and Engineering, Case Western Reserve University, 10900 Euclid Ave., Cleveland, OH 44122, USA

4
Lizano-Villalobos A, Namikas B, Tang X. Siamese neural network improves the performance of a convolutional neural network in colloidal self-assembly state classification. J Chem Phys 2024; 161:204905. [PMID: 39588832 DOI: 10.1063/5.0244337] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2024] [Accepted: 11/12/2024] [Indexed: 11/27/2024]
Abstract
Identifying the state of the colloidal self-assembly process is critical to monitoring and controlling the system into desired configurations. Recent application of convolutional neural networks with unsupervised clustering has shown a comparable performance to conventional approaches, in representing and classifying the states of a simulated 2D colloidal batch assembly system. Despite the early success, capturing the subtle differences among similar configurations still presents a challenge. To address this issue, we leverage a Siamese neural network to improve the accuracy of the state classification. Results from a Brownian dynamics-simulated electric field-mediated colloidal self-assembly system and a magnetic field-mediated colloidal self-assembly system demonstrate significant improvement from the original convolutional neural network-based approach. We anticipate the proposed improvement to further pave the way for automated monitoring and control of colloidal self-assembly processes in real time and real space.
Affiliation(s)
- Andres Lizano-Villalobos
  - Cain Department of Chemical Engineering, Louisiana State University, Baton Rouge, Louisiana 70803, USA
- Benjamin Namikas
  - Baton Rouge Magnet High School, Baton Rouge, Louisiana, 70806, USA
- Xun Tang
  - Cain Department of Chemical Engineering, Louisiana State University, Baton Rouge, Louisiana 70803, USA

5
de Jager M, Kolbeck PJ, Vanderlinden W, Lipfert J, Filion L. Exploring protein-mediated compaction of DNA by coarse-grained simulations and unsupervised learning. Biophys J 2024; 123:3231-3241. [PMID: 39044429 PMCID: PMC11427786 DOI: 10.1016/j.bpj.2024.07.023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/28/2024] [Revised: 06/18/2024] [Accepted: 07/18/2024] [Indexed: 07/25/2024]
Abstract
Protein-DNA interactions and protein-mediated DNA compaction play key roles in a range of biological processes. The length scales typically involved in DNA bending, bridging, looping, and compaction (≥1 kbp) are challenging to address experimentally or by all-atom molecular dynamics simulations, making coarse-grained simulations a natural approach. Here, we present a simple and generic coarse-grained model for DNA-protein and protein-protein interactions and investigate the role of the latter in the protein-induced compaction of DNA. Our approach models the DNA as a discrete worm-like chain. The proteins are treated in the grand canonical ensemble, and the protein-DNA binding strength is taken from experimental measurements. Protein-DNA interactions are modeled as an isotropic binding potential with an imposed binding valency without specific assumptions about the binding geometry. To systematically and quantitatively classify DNA-protein complexes, we present an unsupervised machine learning pipeline that receives a large set of structural order parameters as input, reduces the dimensionality via principal-component analysis, and groups the results using a Gaussian mixture model. We apply our method to recent data on the compaction of viral genome-length DNA by HIV integrase and find that protein-protein interactions are critical to the formation of looped intermediate structures seen experimentally. Our methodology is broadly applicable to DNA-binding proteins and protein-induced DNA compaction and provides a systematic and semi-quantitative approach for analyzing their mesoscale complexes.
Affiliation(s)
- Marjolein de Jager
  - Soft Condensed Matter and Biophysics, Debye Institute for Nanomaterials Science, Utrecht University, Utrecht, the Netherlands
- Pauline J Kolbeck
  - Soft Condensed Matter and Biophysics, Debye Institute for Nanomaterials Science, Utrecht University, Utrecht, the Netherlands
  - Department of Physics and Center for NanoScience, LMU, Munich, Germany
- Willem Vanderlinden
  - Soft Condensed Matter and Biophysics, Debye Institute for Nanomaterials Science, Utrecht University, Utrecht, the Netherlands
  - Department of Physics and Center for NanoScience, LMU, Munich, Germany
  - School of Physics and Astronomy, University of Edinburgh, Scotland, United Kingdom
- Jan Lipfert
  - Soft Condensed Matter and Biophysics, Debye Institute for Nanomaterials Science, Utrecht University, Utrecht, the Netherlands
  - Department of Physics and Center for NanoScience, LMU, Munich, Germany
- Laura Filion
  - Soft Condensed Matter and Biophysics, Debye Institute for Nanomaterials Science, Utrecht University, Utrecht, the Netherlands

6
Shi J, Walsh D, Zou W, Rebello NJ, Deagen ME, Fransen KA, Gao X, Olsen BD, Audus DJ. Calculating Pairwise Similarity of Polymer Ensembles via Earth Mover's Distance. ACS Polymers Au 2024; 4:66-76. [PMID: 38371731 PMCID: PMC10870752 DOI: 10.1021/acspolymersau.3c00029] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/21/2023] [Revised: 11/28/2023] [Accepted: 11/29/2023] [Indexed: 02/20/2024]
Abstract
Synthetic polymers, in contrast to small molecules and deterministic biomacromolecules, are typically ensembles composed of polymer chains with varying numbers, lengths, sequences, chemistry, and topologies. While numerous approaches exist for measuring pairwise similarity among small molecules and sequence-defined biomacromolecules, accurately determining the pairwise similarity between two polymer ensembles remains challenging. This work proposes the earth mover's distance (EMD) metric to calculate the pairwise similarity score between two polymer ensembles. EMD offers a greater resolution of chemical differences between polymer ensembles than the averaging method and provides a quantitative numeric value representing the pairwise similarity between polymer ensembles in alignment with chemical intuition. The EMD approach for assessing polymer similarity enhances the development of accurate chemical search algorithms within polymer databases and can improve machine learning techniques for polymer design, optimization, and property prediction.
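To make the contrast between the earth mover's distance and a simple averaging comparison concrete, here is a minimal sketch in Python. It assumes, purely for illustration, that each ensemble is summarized as a histogram of chain lengths on a shared, sorted grid; the abstract does not specify the ground distance or the ensemble representation used in the paper, and the function name `emd_1d` is hypothetical. In one dimension the EMD reduces to the area between the two cumulative distributions.

```python
from itertools import accumulate


def emd_1d(weights_a, weights_b, positions):
    """Earth mover's distance between two histograms on a shared, sorted 1-D grid.

    Assumption for illustration: `positions` are chain lengths and the weights
    are (unnormalized) fractions of chains at each length.
    """
    if not (len(weights_a) == len(weights_b) == len(positions)):
        raise ValueError("weights and positions must have the same length")
    total_a, total_b = sum(weights_a), sum(weights_b)
    if total_a <= 0 or total_b <= 0:
        raise ValueError("each histogram needs positive total weight")

    # Normalize each histogram and build its cumulative distribution.
    cdf_a = list(accumulate(w / total_a for w in weights_a))
    cdf_b = list(accumulate(w / total_b for w in weights_b))

    # In 1-D, the EMD is the area between the two CDFs.
    distance = 0.0
    for i in range(len(positions) - 1):
        gap = positions[i + 1] - positions[i]
        distance += abs(cdf_a[i] - cdf_b[i]) * gap
    return distance


def mean_position(weights, positions):
    """Weighted mean, i.e. what an averaging comparison would look at."""
    return sum(w * x for w, x in zip(weights, positions)) / sum(weights)


lengths = [10, 20, 30]
bimodal = [0.5, 0.0, 0.5]   # half short chains, half long chains
uniform = [0.0, 1.0, 0.0]   # every chain has the middle length

mean_gap = abs(mean_position(bimodal, lengths) - mean_position(uniform, lengths))
emd_gap = emd_1d(bimodal, uniform, lengths)
print(mean_gap, emd_gap)
```

In this toy case the two ensembles share the same mean chain length, so an averaging comparison cannot tell them apart, while the EMD is nonzero because half of the weight must be moved by ten units in each direction.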
Affiliation(s)
- Jiale Shi
  - Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Dylan Walsh
  - Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Weizhong Zou
  - Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Nathan J. Rebello
  - Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Michael E. Deagen
  - Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Katharina A. Fransen
  - Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Xian Gao
  - Department of Chemical and Biomolecular Engineering, University of Notre Dame, Notre Dame, Indiana 46556, United States
- Bradley D. Olsen
  - Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Debra J. Audus
  - Materials Science and Engineering Division, National Institute of Standards and Technology, Gaithersburg, Maryland 20899, United States

7
Himanshu, Chakraborty K, Patra TK. Developing efficient deep learning model for predicting copolymer properties. Phys Chem Chem Phys 2023; 25:25166-25176. [PMID: 37712405 DOI: 10.1039/d3cp03100d] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/16/2023]
Abstract
Deep learning models are gaining popularity and potency in predicting polymer properties. These models can be built using pre-existing data and are useful for the rapid prediction of polymer properties. However, the performance of a deep learning model is intricately connected to its topology and the volume of training data. There is no facile protocol available to select a deep learning architecture, and there is a lack of a large volume of homogeneous sequence-property data of polymers. These two factors are the primary bottleneck for the efficient development of deep learning models for polymers. Here we assess the severity of these factors and propose strategies to address them. We show that a linear layer-by-layer expansion of a neural network can help in identifying the best neural network topology for a given problem. Moreover, we map the discrete sequence space of a polymer to a continuous one-dimensional latent space using a feature extraction technique to identify minimal data points for training a deep learning model. We implement these approaches for two representative cases of building sequence-property surrogate models, viz., the single-molecule radius of gyration of a copolymer and copolymer compatibilizer. This work demonstrates efficient methods for building deep learning models with minimal data and hyperparameters for predicting sequence-defined properties of polymers.
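The radius of gyration named above as one of the two surrogate-model targets is commonly defined, for N beads of equal mass, as R_g^2 = (1/N) Σ |r_i − r_cm|^2, where r_cm is the center of mass. The following is a minimal sketch of that standard definition, assuming equal bead masses and coordinates given as (x, y, z) tuples; the function name is illustrative and is not taken from the paper.

```python
import math


def radius_of_gyration(coords):
    """Radius of gyration of a chain of equal-mass beads.

    `coords` is a sequence of (x, y, z) tuples, one per bead (an assumed
    input format for this sketch).
    """
    n = len(coords)
    if n == 0:
        raise ValueError("need at least one bead")

    # Center of mass for equal masses is the plain coordinate average.
    center = [sum(p[k] for p in coords) / n for k in range(3)]

    # Mean squared distance of the beads from the center of mass.
    mean_sq = sum(
        sum((p[k] - center[k]) ** 2 for k in range(3)) for p in coords
    ) / n
    return math.sqrt(mean_sq)


# A straight three-bead rod with unit bond length along x.
rod = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
rg_rod = radius_of_gyration(rod)
print(rg_rod)
```

For the three-bead rod the squared distances from the center are 1, 0, and 1, so R_g equals the square root of 2/3.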
Affiliation(s)
- Himanshu
  - Department of Chemical Engineering and Center for Atomistic Modeling and Materials Design, Indian Institute of Technology Madras, Chennai, TN 600036, India
- Kaushik Chakraborty
  - Department of Chemical Engineering and Center for Atomistic Modeling and Materials Design, Indian Institute of Technology Madras, Chennai, TN 600036, India
- Tarak K Patra
  - Department of Chemical Engineering and Center for Atomistic Modeling and Materials Design, Indian Institute of Technology Madras, Chennai, TN 600036, India

8
Shi J, Albreiki F, Colón YJ, Srivastava S, Whitmer JK. Transfer Learning Facilitates the Prediction of Polymer-Surface Adhesion Strength. J Chem Theory Comput 2023; 19:4631-4640. [PMID: 37068204 DOI: 10.1021/acs.jctc.2c01314] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 04/19/2023]
Abstract
Machine learning (ML) accelerates the exploration of material properties and their links to the structure of the underlying molecules. In previous work [Shi et al. ACS Applied Materials & Interfaces 2022, 14, 37161-37169.], ML models were applied to predict the adhesive free energy of polymer-surface interactions with high accuracy from the knowledge of the sequence data, demonstrating success in the inverse design of polymer sequences for known surface compositions. While the method was shown to be successful in designing polymers for a known surface, extensive data sets were needed for each specific surface in order to train the surrogate models. Ideally, one should be able to infer information about similar surfaces without having to regenerate a full complement of adhesion data for each new case. In the current work, we demonstrate a transfer learning (TL) technique using a deep neural network to improve the accuracy of ML models trained on small data sets by pretraining on a larger database from a related system and fine-tuning the weights of all layers with a small amount of additional data. The shared knowledge from the pretrained model significantly improves prediction accuracy on small data sets. We also explore the limits of database size on accuracy and the optimal tuning of network architecture and parameters for our learning tasks. While applied to a relatively simple coarse-grained (CG) polymer model, the general lessons of this study apply to detailed modeling studies and the broader problems of inverse materials design.
Affiliation(s)
- Jiale Shi
  - Department of Chemical and Biomolecular Engineering, University of Notre Dame, Notre Dame, Indiana 46556, United States
- Fahed Albreiki
  - Department of Chemical and Biomolecular Engineering, University of California, Los Angeles, Los Angeles, California 90095, United States
- Yamil J Colón
  - Department of Chemical and Biomolecular Engineering, University of Notre Dame, Notre Dame, Indiana 46556, United States
- Samanvaya Srivastava
  - Department of Chemical and Biomolecular Engineering, University of California, Los Angeles, Los Angeles, California 90095, United States
  - California NanoSystems Institute, Center for Biological Physics, University of California, Los Angeles, Los Angeles, California 90095, United States
  - Institute for Carbon Management, University of California, Los Angeles, Los Angeles, California 90095, United States
  - Center for Biological Physics, University of California, Los Angeles, Los Angeles, California 90095, United States
- Jonathan K Whitmer
  - Department of Chemical and Biomolecular Engineering, University of Notre Dame, Notre Dame, Indiana 46556, United States
  - Department of Chemistry and Biochemistry, University of Notre Dame, Notre Dame, Indiana 46556, United States

9
Martin TB, Audus DJ. Emerging Trends in Machine Learning: A Polymer Perspective. ACS Polymers Au 2023; 3:239-258. [PMID: 37334191 PMCID: PMC10273415 DOI: 10.1021/acspolymersau.2c00053] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/16/2022] [Revised: 12/20/2022] [Accepted: 12/21/2022] [Indexed: 01/19/2023]
Abstract
In the last five years, there has been tremendous growth in machine learning and artificial intelligence as applied to polymer science. Here, we highlight the unique challenges presented by polymers and how the field is addressing them. We focus on emerging trends with an emphasis on topics that have received less attention in the review literature. Finally, we provide an outlook for the field, outline important growth areas in machine learning and artificial intelligence for polymer science and discuss important advances from the greater material science community.
Affiliation(s)
- Tyler B. Martin
  - National Institute of Standards and Technology, Gaithersburg, Maryland 20899, United States
- Debra J. Audus
  - National Institute of Standards and Technology, Gaithersburg, Maryland 20899, United States

10
Patel R, Colmenares S, Webb MA. Sequence Patterning, Morphology, and Dispersity in Single-Chain Nanoparticles: Insights from Simulation and Machine Learning. ACS Polymers Au 2023; 3:284-294. [PMID: 37334192 PMCID: PMC10273411 DOI: 10.1021/acspolymersau.3c00007] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 03/02/2023] [Revised: 05/15/2023] [Accepted: 05/15/2023] [Indexed: 06/20/2023]
Abstract
Single-chain nanoparticles (SCNPs) are intriguing materials inspired by proteins that consist of a single precursor polymer chain that has collapsed into a stable structure. In many prospective applications, such as catalysis, the utility of a single-chain nanoparticle will intricately depend on the formation of a mostly specific structure or morphology. However, it is not generally well understood how to reliably control the morphology of single-chain nanoparticles. To address this knowledge gap, we simulate the formation of 7680 distinct single-chain nanoparticles from precursor chains that span a wide range of, in principle, tunable patterning characteristics of cross-linking moieties. Using a combination of molecular simulation and machine learning analyses, we show how the overall fraction of functionalization and blockiness of cross-linking moieties biases the formation of certain local and global morphological characteristics. Importantly, we illustrate and quantify the dispersity of morphologies that arise due to the stochastic nature of collapse from a well-defined sequence as well as from the ensemble of sequences that correspond to a given specification of precursor parameters. Moreover, we also examine the efficacy of precise sequence control in achieving morphological outcomes in different regimes of precursor parameters. Overall, this work critically assesses how precursor chains might be feasibly tailored to achieve given SCNP morphologies and provides a platform to pursue future sequence-based design.

11
Robinson Brown DC, Webber TR, Jiao S, Rivera Mirabal DM, Han S, Shell MS. Relationships between Molecular Structural Order Parameters and Equilibrium Water Dynamics in Aqueous Mixtures. J Phys Chem B 2023; 127:4577-4594. [PMID: 37171393 DOI: 10.1021/acs.jpcb.3c00826] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Indexed: 05/13/2023]
Abstract
Water's unique thermophysical properties and how it mediates aqueous interactions between solutes have long been interpreted in terms of its collective molecular structure. The seminal work of Errington and Debenedetti [Nature 2001, 409, 318-321] revealed a striking hierarchy of relationships among the thermodynamic, dynamic, and structural properties of water, motivating many efforts to understand (1) what measures of water structure are connected to different experimentally accessible macroscopic responses and (2) how many such structural metrics are adequate to describe the collective structural behavior of water. Diffusivity constitutes a particularly interesting experimentally accessible equilibrium property to investigate such relationships because advanced NMR techniques allow the measurement of bulk and local water dynamics in nanometer proximity to molecules and interfaces, suggesting the enticing possibility of measuring local diffusivities that report on water structure. Here, we apply statistical learning methods to discover persistent structure-dynamic correlations across a variety of simulated aqueous mixtures, from alcohol-water to polypeptoid-water systems. We investigate a variety of molecular water structure metrics and find that an unsupervised statistical learning algorithm (namely, sequential feature selection) identifies only two or three independent structural metrics that are sufficient to predict water self-diffusivity accurately. Surprisingly, the translational diffusivity of water across all mixed systems studied here is strongly correlated with a measure of tetrahedral order given by water's triplet angle distribution. We also identify a separate small number of structural metrics that well predict an important thermodynamic property, the excess chemical potential of an idealized methane-sized hydrophobe in water. 
Ultimately, we offer a Bayesian method of inferring water structure by using only structure-dynamics linear regression models with experimental Overhauser dynamic nuclear polarization (ODNP) measurements of water self-diffusivity. This study thus quantifies the relationships among several distinct structural order parameters in water and, through statistical learning, reveals the potential to leverage molecular structure to predict fundamental thermophysical properties. In turn, these findings suggest a framework for solving the inverse problem of inferring water's molecular structure using experimental measurements such as ODNP studies that probe local water properties.
Affiliation(s)
- Thomas R Webber
  - Department of Chemical Engineering, University of California, Santa Barbara, California 93106, United States
- Sally Jiao
  - Department of Chemical Engineering, University of California, Santa Barbara, California 93106, United States
- Daniela M Rivera Mirabal
  - Department of Chemical Engineering, University of California, Santa Barbara, California 93106, United States
- Songi Han
  - Department of Chemical Engineering, University of California, Santa Barbara, California 93106, United States
  - Department of Chemistry and Biochemistry, University of California, Santa Barbara, California 93106, United States
- M Scott Shell
  - Department of Chemical Engineering, University of California, Santa Barbara, California 93106, United States

12
Lizano A, Tang X. Convolutional neural network-based colloidal self-assembly state classification. Soft Matter 2023; 19:3450-3457. [PMID: 37129254 DOI: 10.1039/d3sm00139c] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/03/2023]
Abstract
Colloidal self-assembly is a viable solution to making advanced metamaterials. While the physicochemical properties of the particles affect the properties of the assembled structures, particle configuration is also a critical determinant factor. Colloidal self-assembly state classification is typically achieved with order parameters, which are aggregate variables normally defined with nontrivial exploration and validation. Here, we present an image-based framework to classify the state of a 2-D colloidal self-assembly system. The framework leverages deep learning algorithms with unsupervised learning for state classification and a supervised learning-based convolutional neural network for state prediction. The neural network models are developed using data from an experimentally validated Brownian dynamics simulation. Our results demonstrate that the proposed approach gives a satisfying performance, comparable to and even outperforming the commonly used order parameters in distinguishing void defective states from ordered states. Given the data-based nature of the approach, we anticipate its general applicability and potential automatability to different and complex systems where image or particle coordination acquisition is feasible.
Affiliation(s)
- Andres Lizano
  - Cain Department of Chemical Engineering, Louisiana State University, Baton Rouge, LA 70803, USA
- Xun Tang
  - Cain Department of Chemical Engineering, Louisiana State University, Baton Rouge, LA 70803, USA

13
Panagiotopoulos AZ. Phase separation and aggregation in multiblock chains. J Chem Phys 2023; 158:2882254. [PMID: 37094002 DOI: 10.1063/5.0146673] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/15/2023] [Accepted: 03/30/2023] [Indexed: 04/26/2023]
Abstract
This paper focuses on phase and aggregation behavior for linear chains composed of blocks of hydrophilic and hydrophobic segments. Phase and conformational transitions of patterned chains are relevant for understanding liquid-liquid separation of biomolecular condensates, which play a prominent role in cellular biophysics and for surfactant and polymer applications. Previous studies of simple models for multiblock chains have shown that, depending on the sequence pattern and chain length, such systems can fall into one of two categories: displaying either phase separation or aggregation into finite-size clusters. The key new result of this paper is that both formation of finite-size aggregates and phase separation can be observed for certain chain architectures at appropriate conditions of temperature and concentration. For such systems, a bulk dense liquid condenses from a dilute phase that already contains multi-chain finite-size aggregates. The computational approach used in this study involves several distinct steps using histogram-reweighting grand canonical Monte Carlo simulations, which are described in some level of detail.

14
de Las Heras D, Zimmermann T, Sammüller F, Hermann S, Schmidt M. Perspective: How to overcome dynamical density functional theory. Journal of Physics. Condensed Matter: An Institute of Physics Journal 2023; 35:271501. [PMID: 37023762 DOI: 10.1088/1361-648x/accb33] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 01/28/2023] [Accepted: 04/06/2023] [Indexed: 06/19/2023]
Abstract
We argue in favour of developing a comprehensive dynamical theory for rationalizing, predicting, designing, and machine learning nonequilibrium phenomena that occur in soft matter. To give guidance for navigating the theoretical and practical challenges that lie ahead, we discuss and exemplify the limitations of dynamical density functional theory (DDFT). Instead of the implied adiabatic sequence of equilibrium states that this approach provides as a makeshift for the true time evolution, we posit that the pending theoretical tasks lie in developing a systematic understanding of the dynamical functional relationships that govern the genuine nonequilibrium physics. While static density functional theory gives a comprehensive account of the equilibrium properties of many-body systems, we argue that power functional theory is the only present contender to shed similar insights into nonequilibrium dynamics, including the recognition and implementation of exact sum rules that result from the Noether theorem. As a demonstration of the power functional point of view, we consider an idealized steady sedimentation flow of the three-dimensional Lennard-Jones fluid and machine-learn the kinematic map from the mean motion to the internal force field. The trained model is capable of both predicting and designing the steady state dynamics universally for various target density modulations. This demonstrates the significant potential of using such techniques in nonequilibrium many-body physics and overcomes both the conceptual constraints of DDFT as well as the limited availability of its analytical functional approximations.
Affiliation(s)
- Daniel de Las Heras
- Theoretische Physik II, Physikalisches Institut, Universität Bayreuth, D-95447 Bayreuth, Germany
- Toni Zimmermann
- Theoretische Physik II, Physikalisches Institut, Universität Bayreuth, D-95447 Bayreuth, Germany
- Florian Sammüller
- Theoretische Physik II, Physikalisches Institut, Universität Bayreuth, D-95447 Bayreuth, Germany
- Sophie Hermann
- Theoretische Physik II, Physikalisches Institut, Universität Bayreuth, D-95447 Bayreuth, Germany
- Matthias Schmidt
- Theoretische Physik II, Physikalisches Institut, Universität Bayreuth, D-95447 Bayreuth, Germany
|
15
|
Ricci E, Vergadou N. Integrating Machine Learning in the Coarse-Grained Molecular Simulation of Polymers. J Phys Chem B 2023; 127:2302-2322. [PMID: 36888553 DOI: 10.1021/acs.jpcb.2c06354] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 03/09/2023]
Abstract
Machine learning (ML) is having an increasing impact on the physical sciences, engineering, and technology, and its integration into molecular simulation frameworks holds great potential to expand their scope of applicability to complex materials and facilitate fundamental knowledge and reliable property predictions, contributing to the development of efficient materials design routes. The application of ML in materials informatics in general, and polymer informatics in particular, has led to interesting results; however, great untapped potential lies in the integration of ML techniques into the multiscale molecular simulation methods for the study of macromolecular systems, specifically in the context of coarse-grained (CG) simulations. In this Perspective, we aim at presenting the pioneering recent research efforts in this direction and discussing how these new ML-based techniques can contribute to critical aspects of the development of multiscale molecular simulation methods for bulk complex chemical systems, especially polymers. Prerequisites for the implementation of such ML-integrated methods and open challenges that need to be met toward the development of general systematic ML-based coarse graining schemes for polymers are discussed.
Affiliation(s)
- Eleonora Ricci
- Institute of Nanoscience and Nanotechnology, National Center for Scientific Research "Demokritos", GR-15341 Agia Paraskevi, Athens, Greece
- Institute of Informatics and Telecommunications, National Center for Scientific Research "Demokritos", GR-15341 Agia Paraskevi, Athens, Greece
- Niki Vergadou
- Institute of Nanoscience and Nanotechnology, National Center for Scientific Research "Demokritos", GR-15341 Agia Paraskevi, Athens, Greece
|
16
|
Zhu Q, Tree DR. Simulations of morphology control of self‐assembled amphiphilic surfactants. JOURNAL OF POLYMER SCIENCE 2023. [DOI: 10.1002/pol.20220771] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/19/2023]
Affiliation(s)
- Qinyu Zhu
- Department of Chemical Engineering, Brigham Young University, Provo, Utah, USA
- Douglas R. Tree
- Department of Chemical Engineering, Brigham Young University, Provo, Utah, USA
|
17
|
Smith A, Runde S, Chew AK, Kelkar AS, Maheshwari U, Van Lehn RC, Zavala VM. Topological Analysis of Molecular Dynamics Simulations using the Euler Characteristic. J Chem Theory Comput 2023; 19:1553-1567. [PMID: 36812112 DOI: 10.1021/acs.jctc.2c00766] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/24/2023]
Abstract
Molecular dynamics (MD) simulations are used in diverse scientific and engineering fields such as drug discovery, materials design, separations, biological systems, and reaction engineering. These simulations generate highly complex data sets that capture the 3D spatial positions, dynamics, and interactions of thousands of molecules. Analyzing MD data sets is key for understanding and predicting emergent phenomena and in identifying key drivers and tuning design knobs of such phenomena. In this work, we show that the Euler characteristic (EC) provides an effective topological descriptor that facilitates MD analysis. The EC is a versatile, low-dimensional, and easy-to-interpret descriptor that can be used to reduce, analyze, and quantify complex data objects that are represented as graphs/networks, manifolds/functions, and point clouds. Specifically, we show that the EC is an informative descriptor that can be used for machine learning and data analysis tasks such as classification, visualization, and regression. We demonstrate the benefits of the proposed approach through case studies that aim to understand and predict the hydrophobicity of self-assembled monolayers and the reactivity of complex solvent environments.
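To make the descriptor named in this abstract concrete, the following minimal sketch computes the Euler characteristic of an undirected graph as the number of vertices minus the number of edges, and traces it across edge-weight thresholds. The function names, the toy weighted graph, and the thresholding scheme are assumptions chosen for illustration; they are not taken from the cited paper.

```python
def euler_characteristic(num_vertices, edges):
    """Euler characteristic of an undirected simple graph: chi = V - E.

    Self-loops are ignored and duplicate edges are counted once.
    """
    unique_edges = {frozenset(edge) for edge in edges if edge[0] != edge[1]}
    return num_vertices - len(unique_edges)


def ec_curve(num_vertices, weighted_edges, thresholds):
    """Euler characteristic at each threshold (hypothetical helper).

    At threshold t, only edges with weight <= t are kept, so the curve
    records how the graph topology changes as edges are added.
    """
    curve = []
    for t in thresholds:
        kept = [(u, v) for u, v, w in weighted_edges if w <= t]
        curve.append(euler_characteristic(num_vertices, kept))
    return curve


# Toy example: three vertices joined by edges of increasing weight.
toy_edges = [(0, 1, 0.2), (1, 2, 0.5), (0, 2, 0.9)]
toy_curve = ec_curve(3, toy_edges, [0.0, 0.3, 0.6, 1.0])
```

For a graph, this value also equals the number of connected components minus the number of independent cycles, which is why a single integer per threshold can summarize connectivity.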
Affiliation(s)
- Alexander Smith
- Department of Chemical and Biological Engineering, University of Wisconsin, Madison, Wisconsin 53706, United States
- Spencer Runde
- Department of Chemical and Biological Engineering, University of Wisconsin, Madison, Wisconsin 53706, United States
- Alex K Chew
- Department of Chemical and Biological Engineering, University of Wisconsin, Madison, Wisconsin 53706, United States
- Atharva S Kelkar
- Department of Chemical and Biological Engineering, University of Wisconsin, Madison, Wisconsin 53706, United States
- Utkarsh Maheshwari
- Department of Electrical and Computer Engineering, University of Wisconsin, Madison, Wisconsin 53706, United States
- Reid C Van Lehn
- Department of Chemical and Biological Engineering, University of Wisconsin, Madison, Wisconsin 53706, United States
- Victor M Zavala
- Department of Chemical and Biological Engineering, University of Wisconsin, Madison, Wisconsin 53706, United States
|
18
|
Gavrilov AA, Potemkin II. Copolymers with Nonblocky Sequences as Novel Materials with Finely Tuned Properties. J Phys Chem B 2023; 127:1479-1489. [PMID: 36790352 DOI: 10.1021/acs.jpcb.2c07689] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/16/2023]
Abstract
The copolymer sequence can be considered as a new tool to shape the resulting system properties on demand. This perspective is devoted to copolymers with "partially segregated" (or nonblocky) sequences. Such copolymers include gradient copolymers and copolymers with random sequences as well as copolymers with precisely controlled sequences. We overview recent developments in the synthesis of these systems as well as new findings regarding their properties, in particular, self-assembly in solutions and in melts. An emphasis is put on how the microscopic behavior of polymer chains is influenced by the chain sequences. In addition to that, a novel class of approaches allowing one to efficiently tackle the problem of copolymer chain sequence design, namely data-driven methods (artificial intelligence and machine learning), is discussed.
Affiliation(s)
- Alexey A Gavrilov
- Physics Department, Lomonosov Moscow State University, Moscow 119991, Russian Federation
- Semenov Federal Research Center for Chemical Physics, Moscow 119991, Russian Federation
- Igor I Potemkin
- Physics Department, Lomonosov Moscow State University, Moscow 119991, Russian Federation
|
19
|
Chew PY, Reinhardt A. Phase diagrams-Why they matter and how to predict them. J Chem Phys 2023; 158:030902. [PMID: 36681642 DOI: 10.1063/5.0131028] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 12/23/2022]
Abstract
Understanding the thermodynamic stability and metastability of materials can help us to, for example, gauge whether crystalline polymorphs in pharmaceutical formulations are likely to be durable. It can also help us to design experimental routes to novel phases with potentially interesting properties. In this Perspective, we provide an overview of how thermodynamic phase behavior can be quantified both in computer simulations and machine-learning approaches to determine phase diagrams, as well as combinations of the two. We review the basic workflow of free-energy computations for condensed phases, including some practical implementation advice, ranging from the Frenkel-Ladd approach to thermodynamic integration and to direct-coexistence simulations. We illustrate the applications of such methods on a range of systems from materials chemistry to biological phase separation. Finally, we outline some challenges, questions, and practical applications of phase-diagram determination which we believe are likely to be possible to address in the near future using such state-of-the-art free-energy calculations, which may provide fundamental insight into separation processes using multicomponent solvents.
Affiliation(s)
- Pin Yu Chew
- Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
- Aleks Reinhardt
- Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
|
20
|
Ramesh PS, Patra TK. Polymer sequence design via molecular simulation-based active learning. SOFT MATTER 2023; 19:282-294. [PMID: 36519427 DOI: 10.1039/d2sm01193j] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 06/17/2023]
Abstract
Molecular-scale interactions and chemical structures offer an enormous opportunity to tune material properties. However, designing materials from their molecular scale is a grand challenge owing to the practical limitations in exploring astronomically large design spaces using traditional experimental or computational methods. Advancements in data science and machine learning have produced a host of tools and techniques that can address this problem and facilitate the efficient exploration of large search spaces. In this work, a blended approach integrating physics-based methods, machine learning techniques and uncertainty quantification is implemented to effectively screen a macromolecular sequence space and design target structures. Here, we survey and assess the efficacy of data-driven methods within the framework of active learning for a challenging design problem, viz., sequence optimization of a copolymer. We report the impact of surrogate models, kernels, and initial conditions on the convergence of the active learning method for the sequence design problem. This work establishes optimal strategies and hyperparameters for efficient inverse design of polymer sequences via active learning.
Affiliation(s)
- Praneeth S Ramesh
- Department of Chemical Engineering, Center for Atomistic Modeling and Materials Design and Center for Carbon Capture Utilization and Storage, Indian Institute of Technology Madras, Chennai, TN 600036, India
- Tarak K Patra
- Department of Chemical Engineering, Center for Atomistic Modeling and Materials Design and Center for Carbon Capture Utilization and Storage, Indian Institute of Technology Madras, Chennai, TN 600036, India
|
21
|
Abstract
The application of machine learning to the materials domain has traditionally struggled with two major challenges: a lack of large, curated data sets and the need to understand the physics behind the machine-learning prediction. The former problem is particularly acute in the polymers domain. Here we aim to simultaneously tackle these challenges through the incorporation of scientific knowledge, thus, providing improved predictions for smaller data sets, both under interpolation and extrapolation, and a degree of explainability. We focus on imperfect theories, as they are often readily available and easier to interpret. Using a system of a polymer in different solvent qualities, we explore numerous methods for incorporating theory into machine learning using different machine-learning models, including Gaussian process regression. Ultimately, we find that encoding the functional form of the theory performs best followed by an encoding of the numeric values of the theory.
Affiliation(s)
- Debra J Audus
- Materials Science and Engineering Division, National Institute of Standards and Technology, Gaithersburg, Maryland 20899, United States
- Austin McDannald
- Materials Measurement Science Division, National Institute of Standards and Technology, Gaithersburg, Maryland 20899, United States
- Brian DeCost
- Materials Measurement Science Division, National Institute of Standards and Technology, Gaithersburg, Maryland 20899, United States
|
22
|
Shi J, Quevillon MJ, Amorim Valença PH, Whitmer JK. Predicting Adhesive Free Energies of Polymer-Surface Interactions with Machine Learning. ACS APPLIED MATERIALS & INTERFACES 2022; 14:37161-37169. [PMID: 35917495 DOI: 10.1021/acsami.2c08891] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 06/15/2023]
Abstract
Polymer-surface interactions are crucial to many biological processes and industrial applications. Here we propose a machine learning method to connect a model polymer's sequence with its adhesion to decorated surfaces. We simulate the adhesive free energies of 20000 unique coarse-grained one-dimensional polymer sequences interacting with functionalized surfaces and build support vector regression models that demonstrate inexpensive and reliable prediction of the adhesive free energy as a function of sequence. Our work highlights the promising integration of coarse-grained simulation with data-driven machine learning methods for the design of functional polymers and represents an important step toward linking polymer compositions with polymer-surface interactions.
Affiliation(s)
- Jiale Shi
- Department of Chemical and Biomolecular Engineering, University of Notre Dame, Notre Dame, Indiana 46556, United States
- Michael J Quevillon
- Department of Chemical and Biomolecular Engineering, University of Notre Dame, Notre Dame, Indiana 46556, United States
- Pedro H Amorim Valença
- Department of Chemical and Biomolecular Engineering, University of Notre Dame, Notre Dame, Indiana 46556, United States
- Jonathan K Whitmer
- Department of Chemical and Biomolecular Engineering, University of Notre Dame, Notre Dame, Indiana 46556, United States
- Department of Chemistry and Biochemistry, University of Notre Dame, Notre Dame, Indiana 46556, United States
|
23
|
Tao L, Byrnes J, Varshney V, Li Y. Machine learning strategies for the structure-property relationship of copolymers. iScience 2022; 25:104585. [PMID: 35789847 PMCID: PMC9249671 DOI: 10.1016/j.isci.2022.104585] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 03/28/2022] [Revised: 05/26/2022] [Accepted: 06/07/2022] [Indexed: 11/15/2022]
Abstract
Establishing the structure-property relationship is extremely valuable for the molecular design of copolymers. However, machine learning (ML) models that can incorporate both chemical composition and sequence distribution of monomers, and that have the generalization ability to process various copolymer types (e.g., alternating, random, block, and gradient copolymers) with a unified approach, are missing. To address this challenge, we formulate four different ML models for investigation, including a feedforward neural network (FFNN) model, a convolutional neural network (CNN) model, a recurrent neural network (RNN) model, and a combined FFNN/RNN (Fusion) model. We use various copolymer types to systematically validate the performance and generalizability of different models. We find that the RNN architecture that processes the monomer sequence information both forward and backward is a more suitable ML model for copolymers with better generalizability. As a supplement to polymer informatics, our proposed approach provides an efficient way for the evaluation of copolymers.
Affiliation(s)
- Lei Tao
- Department of Mechanical Engineering, University of Connecticut, Storrs, CT 06269, USA
- Vikas Varshney
- Materials and Manufacturing Directorate, Air Force Research Laboratory, Wright-Patterson Air Force Base, Ohio 45433, USA
- Ying Li
- Department of Mechanical Engineering, University of Connecticut, Storrs, CT 06269, USA
- Polymer Program, Institute of Materials Science, University of Connecticut, Storrs, CT 06269, USA
|
24
|
Bhattacharya D, Kleeblatt DC, Statt A, Reinhart WF. Predicting aggregate morphology of sequence-defined macromolecules with recurrent neural networks. SOFT MATTER 2022; 18:5037-5051. [PMID: 35748651 DOI: 10.1039/d2sm00452f] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Indexed: 06/15/2023]
Abstract
Self-assembly of dilute sequence-defined macromolecules is a complex phenomenon in which the local arrangement of chemical moieties can lead to the formation of long-range structure. The dependence of this structure on the sequence necessarily implies that a mapping between the two exists, yet it has been difficult to model so far. Predicting the aggregation behavior of these macromolecules is challenging due to the lack of effective order parameters, a vast design space, inherent variability, and high computational costs associated with currently available simulation techniques. Here, we accurately predict the morphology of aggregates self-assembled from sequence-defined macromolecules using supervised machine learning. We find that regression models with implicit representation learning perform significantly better than those based on engineered features such as k-mer counting, and a recurrent-neural-network-based regressor performs the best out of nine model architectures we tested. Furthermore, we demonstrate the high-throughput screening of monomer sequences using the regression model to identify candidates for self-assembly into selected morphologies. Our strategy is shown to successfully identify multiple suitable sequences in every test we performed, so we hope the insights gained here can be extended to other increasingly complex design scenarios in the future, such as the design of sequences under polydispersity and at varying environmental conditions.
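The engineered-feature baseline mentioned in this abstract, k-mer counting, can be illustrated with a short sketch. It assumes a sequence is written as a string over a small monomer alphabet; the function names, the two-letter alphabet "AB", and the example sequence are hypothetical and are not drawn from the cited paper.

```python
from collections import Counter
from itertools import product


def kmer_counts(sequence, k):
    """Count every overlapping substring of length k in the sequence."""
    if k <= 0 or k > len(sequence):
        return Counter()
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def kmer_features(sequence, k, alphabet):
    """Fixed-length feature vector: one count per possible k-mer.

    k-mers are ordered by itertools.product over the alphabet, so every
    sequence maps to a vector of length len(alphabet) ** k.
    """
    counts = kmer_counts(sequence, k)
    return [counts["".join(kmer)] for kmer in product(alphabet, repeat=k)]


# Hypothetical two-monomer sequence; features ordered as AA, AB, BA, BB.
example_features = kmer_features("AABAB", 2, "AB")
```

A vector of this kind discards where each k-mer sits along the chain, which is one plausible reason a model that learns its own representation from the full sequence could outperform it.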
Affiliation(s)
- Debjyoti Bhattacharya
- Materials Science and Engineering, Pennsylvania State University, University Park, PA 16802, USA
- Devon C Kleeblatt
- Materials Science and Engineering, Pennsylvania State University, University Park, PA 16802, USA
- Antonia Statt
- Materials Science and Engineering, Grainger College of Engineering, University of Illinois, Urbana-Champaign, IL 61801, USA
- Wesley F Reinhart
- Materials Science and Engineering, Pennsylvania State University, University Park, PA 16802, USA
- Institute for Computational and Data Sciences, Pennsylvania State University, University Park, PA 16802, USA
|
25
|
Bale AA, Gautham SMB, Patra TK. Sequence‐defined Pareto frontier of a copolymer structure. JOURNAL OF POLYMER SCIENCE 2022. [DOI: 10.1002/pol.20220088] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/17/2022]
Affiliation(s)
- Ashwin A. Bale
- Department of Chemical Engineering, Birla Institute of Technology and Science Pilani‐Hyderabad, Hyderabad, India
- Sachin M. B. Gautham
- Department of Chemical Engineering, Center for Atomistic Modeling and Materials Design and Center for Carbon Capture Utilization and Storage, Indian Institute of Technology Madras, Chennai, India
- Tarak K. Patra
- Department of Chemical Engineering, Center for Atomistic Modeling and Materials Design and Center for Carbon Capture Utilization and Storage, Indian Institute of Technology Madras, Chennai, India
|
26
|
Quach CD, Gilmer JB, Pert D, Mason-Hogans A, Iacovella CR, Cummings PT, McCabe C. High-throughput screening of tribological properties of monolayer films using molecular dynamics and machine learning. J Chem Phys 2022; 156:154902. [PMID: 35459321 DOI: 10.1063/5.0080838] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 12/14/2022]
Abstract
Monolayer films have shown promise as a lubricating layer to reduce friction and wear of mechanical devices with separations on the nanoscale. These films have a vast design space with many tunable properties that can affect their tribological effectiveness. For example, terminal group chemistry, film composition, and backbone chemistry can all lead to films with significantly different tribological properties. This design space, however, is very difficult to explore without a combinatorial approach and an automatable, reproducible, and extensible workflow to screen for promising candidate films. Using the Molecular Simulation Design Framework (MoSDeF), a combinatorial screening study was performed to explore 9747 unique monolayer films (116 964 total simulations), and a machine learning (ML) model using a random forest regressor, an ensemble learning technique, was used to explore the role of terminal group chemistry and its effect on tribological effectiveness. The most promising films were found to contain small terminal groups such as cyano and ethylene. The ML model was subsequently applied to screen terminal group candidates identified from the ChEMBL small molecule library. Approximately 193 131 unique film candidates were screened with a speed-up in analysis of approximately five orders of magnitude compared to simulation alone. The ML model could thus be used as a predictive tool to greatly speed up the initial screening of promising candidate films for future simulation studies, suggesting that computational screening in combination with ML can greatly increase the throughput in combinatorial approaches to generate in silico data and then train ML models in a controlled, self-consistent fashion.
Affiliation(s)
- Co D Quach
- Department of Chemical and Biomolecular Engineering, Vanderbilt University, Nashville, Tennessee 37235, USA
- Justin B Gilmer
- Interdisciplinary Materials Science, Vanderbilt University, Nashville, Tennessee 37235, USA
- Daniel Pert
- Department of Chemical and Biomolecular Engineering, Vanderbilt University, Nashville, Tennessee 37235, USA
- Akanke Mason-Hogans
- Department of Chemical and Biomolecular Engineering, Vanderbilt University, Nashville, Tennessee 37235, USA
- Christopher R Iacovella
- Department of Chemical and Biomolecular Engineering, Vanderbilt University, Nashville, Tennessee 37235, USA
- Peter T Cummings
- Department of Chemical and Biomolecular Engineering, Vanderbilt University, Nashville, Tennessee 37235, USA
- Clare McCabe
- Department of Chemical and Biomolecular Engineering, Vanderbilt University, Nashville, Tennessee 37235, USA
27
|
Abstract
Optimal design of polymers is a challenging task due to their enormous chemical and configurational space. Recent advances in computations, machine learning, and increasing trends in data and software availability can potentially address this problem and accelerate the molecular-scale design of polymers. Here, the central problem of polymer design is reviewed, and the general ideas of data-driven methods and their working principles in the context of polymer design are discussed. This Review provides a historical perspective and a summary of current trends and outlines future scopes of data-driven methods for polymer research. A few representative case studies on the use of such data-driven methods for discovering new polymers with exceptional properties are presented. Moreover, attempts are made to highlight how data-driven strategies aid in establishing new correlations and advancing the fundamental understanding of polymers. This Review posits that the combination of machine learning, rapid computational characterization of polymers, and availability of large open-sourced homogeneous data will transform polymer research and development over the coming decades. It is hoped that this Review will serve as a useful reference to researchers who wish to develop and deploy data-driven methods for polymer research and education.
Affiliation(s)
- Tarak K. Patra
- Department of Chemical Engineering, Center for Atomistic Modeling and Materials Design and Center for Carbon Capture Utilization and Storage, Indian Institute of Technology Madras, Chennai, TN 600036, India
28
|
Nguyen D, Tao L, Li Y. Integration of Machine Learning and Coarse-Grained Molecular Simulations for Polymer Materials: Physical Understandings and Molecular Design. Front Chem 2022; 9:820417. [PMID: 35141207 PMCID: PMC8819075 DOI: 10.3389/fchem.2021.820417] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 11/24/2021] [Accepted: 12/31/2021] [Indexed: 12/21/2022]
Abstract
In recent years, the synthesis of monomer sequence-defined polymers has expanded into broad-spectrum applications in biomedical, chemical, and materials science fields. Pursuing the characterization and inverse design of these polymer systems requires a fundamental understanding not only at the individual monomer level but also at the chain scale, including polymer configuration, self-assembly, and phase separation. However, progress in this field is still rudimentary because of the limitations of traditional design approaches and the complexity of chemical space, together with cost and time burdens that prevent us from unveiling the underlying monomer sequence-structure-property relationships. Fortunately, thanks to the recent advancements in molecular dynamics simulations and machine learning (ML) algorithms, the bottlenecks in establishing the structure-function correlation of the polymer chains can be overcome. In this review, we will discuss the applications of the integration between ML techniques and coarse-grained molecular dynamics (CGMD) simulations to solve the current issues in polymer science at the chain level. In particular, we focus on case studies in three important topics (polymeric configuration characterization, feed-forward property prediction, and inverse design) in which CGMD simulations are leveraged to generate training datasets to develop ML-based surrogate models for specific polymer systems and designs. By doing so, this computational hybridization allows us to establish the relationship between monomer sequence and functional behavior of the polymers, and guides us toward the best polymer chain candidates for inverse design in undiscovered chemical space at reasonable computational cost and time. Even though there are still limitations and challenges ahead in this field, we conclude that this CGMD/ML integration is very promising, not only for bridging the monomeric and macroscopic characterizations of polymer materials, but also for enabling further tailored designs of sequence-specific polymers with superior properties in many practical applications.
Affiliation(s)
- Danh Nguyen
- Department of Mechanical Engineering, University of Connecticut, Mansfield, CT, United States
- Lei Tao
- Department of Mechanical Engineering, University of Connecticut, Mansfield, CT, United States
- Ying Li
- Department of Mechanical Engineering, University of Connecticut, Mansfield, CT, United States
- Polymer Program, Institute of Materials Science, University of Connecticut, Mansfield, CT, United States
|