1
Ahmad S, Raza K. An extensive review on lung cancer therapeutics using machine learning techniques: state-of-the-art and perspectives. J Drug Target 2024; 32:635-646. [PMID: 38662768 DOI: 10.1080/1061186x.2024.2347358] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/10/2024] [Accepted: 04/18/2024] [Indexed: 05/07/2024]
Abstract
There are over 100 types of human cancer, accounting for millions of deaths every year. Lung cancer alone claims over 1.8 million lives per year and is expected to surpass 3.2 million by 2050, which underscores the urgent need for rapid drug development and repurposing initiatives. The application of AI emerges as a pivotal solution to developing anti-cancer therapeutics. This state-of-the-art review aims to explore the various applications of AI in lung cancer therapeutics. Predictive models can analyse large datasets, including clinical data, genetic information, and treatment outcomes, for novel drug design and to generate personalised treatment recommendations, potentially optimising therapeutic strategies, enhancing treatment efficacy, and minimising adverse effects. A thorough literature review was conducted based on articles indexed in PubMed and Scopus. We compiled the use of various machine learning approaches, including CNN, RNN, GAN, VAEs, and other AI techniques, enhancing efficiency with accuracy exceeding 95%, which is validated through a computer-aided drug design process. AI can revolutionise lung cancer therapeutics, streamlining processes and saving biological scientists' time and effort. However, further research is needed to overcome challenges and fully unlock AI's potential in lung cancer therapeutics.
Affiliation(s)
- Shaban Ahmad
- Department of Computer Science, Jamia Millia Islamia, New Delhi, India
- Khalid Raza
- Department of Computer Science, Jamia Millia Islamia, New Delhi, India
2
Singh RK, Nayak NP, Behl T, Arora R, Anwer MK, Gulati M, Bungau SG, Brisc MC. Exploring the Intersection of Geophysics and Diagnostic Imaging in the Health Sciences. Diagnostics (Basel) 2024; 14:139. [PMID: 38248016 PMCID: PMC11154438 DOI: 10.3390/diagnostics14020139] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/10/2023] [Revised: 01/03/2024] [Accepted: 01/05/2024] [Indexed: 01/23/2024]
Abstract
To develop diagnostic imaging approaches, this paper emphasizes the transformational potential of merging geophysics with health sciences. Diagnostic imaging technology improvements have transformed the health sciences by enabling earlier and more precise disease identification, individualized therapy, and improved patient care. This review article examines the connection between geophysics and diagnostic imaging in the field of health sciences. Geophysics, which is typically used to explore Earth's subsurface, has provided new uses of its methodology in the medical field, providing innovative solutions to pressing medical problems. The article examines the different geophysical techniques like electrical imaging, seismic imaging, and geophysics and their corresponding imaging techniques used in health sciences like tomography, magnetic resonance imaging, ultrasound imaging, etc. The examination includes the description, similarities, differences, and challenges associated with these techniques and how modified geophysical techniques can be used in imaging methods in health sciences. The progression of each method from geophysics to medical imaging and its contributions to illness diagnosis, treatment planning, and monitoring are highlighted. Also, the use of geophysical data analysis techniques like signal processing and inversion techniques in image processing in health sciences is briefly explained, along with different mathematical and computational tools in geophysics and how they can be implemented for image processing in health sciences. The key findings include the development of machine learning and artificial intelligence in geophysics-driven medical imaging, demonstrating the revolutionary effects of data-driven methods on precision, speed, and predictive modeling.
Affiliation(s)
- Rahul Kumar Singh
- Energy Cluster, University of Petroleum and Energy Studies, Dehradun 248007, Uttarakhand, India
- Nirlipta Priyadarshini Nayak
- Energy Cluster, University of Petroleum and Energy Studies, Dehradun 248007, Uttarakhand, India
- Tapan Behl
- Amity School of Pharmaceutical Sciences, Amity University, Mohali 140306, Punjab, India
- Rashmi Arora
- Chitkara College of Pharmacy, Chitkara University, Rajpura 140401, Punjab, India
- Md. Khalid Anwer
- Department of Pharmaceutics, College of Pharmacy, Prince Sattam Bin Abdulaziz University, Alkharj 11942, Saudi Arabia
- Monica Gulati
- School of Pharmaceutical Sciences, Lovely Professional University, Phagwara 1444411, Punjab, India
- Australian Research Centre in Complementary and Integrative Medicine, Faculty of Health, University of Technology Sydney, Ultimo, NSW 20227, Australia
- Simona Gabriela Bungau
- Department of Pharmacy, Faculty of Medicine and Pharmacy, University of Oradea, 410028 Oradea, Romania
- Doctoral School of Biological and Biomedical Sciences, University of Oradea, 410087 Oradea, Romania
- Mihaela Cristina Brisc
- Department of Medical Disciplines, Faculty of Medicine and Pharmacy, University of Oradea, 410073 Oradea, Romania
3
Xue X, Sun H, Yang M, Liu X, Hu HY, Deng Y, Wang X. Advances in the Application of Artificial Intelligence-Based Spectral Data Interpretation: A Perspective. Anal Chem 2023; 95:13733-13745. [PMID: 37688541 DOI: 10.1021/acs.analchem.3c02540] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 09/11/2023]
Abstract
The interpretation of spectral data, including mass, nuclear magnetic resonance, infrared, and ultraviolet-visible spectra, is critical for obtaining molecular structural information. The development of advanced sensing technology has multiplied the amount of available spectral data. Chemical experts must use basic principles corresponding to the spectral information generated by molecular fragments and functional groups. This is a time-consuming process that requires a solid professional knowledge base. In recent years, the rapid development of computer science and its applications in cheminformatics and the emergence of computer-aided expert systems have greatly reduced the difficulty in analyzing large quantities of data. For expert systems, however, the problem-solving strategy must be known in advance or extracted by human experts and translated into algorithms. Gratifyingly, the development of artificial intelligence (AI) methods has shown great promise for solving such problems. Traditional algorithms, including the latest neural network algorithms, have shown great potential for both extracting useful information and processing massive quantities of data. This Perspective highlights recent innovations covering all of the emerging AI-based spectral interpretation techniques. In addition, the main limitations and current obstacles are presented, and the corresponding directions for further research are proposed. Moreover, this Perspective gives the authors' personal outlook on the development and future applications of spectral interpretation.
Affiliation(s)
- Xi Xue
- State Key Laboratory of Bioactive Substances and Functions of Natural Medicines, Institute of Materia Medica, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing 100050, China
- Beijing Key Laboratory of Active Substances Discovery and Drugability Evaluation, Department of Medicinal Chemistry, Institute of Materia Medica, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing 100050, P. R. China
- Hanyu Sun
- State Key Laboratory of Bioactive Substances and Functions of Natural Medicines, Institute of Materia Medica, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing 100050, China
- Beijing Key Laboratory of Active Substances Discovery and Drugability Evaluation, Department of Medicinal Chemistry, Institute of Materia Medica, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing 100050, P. R. China
- Minjian Yang
- State Key Laboratory of Bioactive Substances and Functions of Natural Medicines, Institute of Materia Medica, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing 100050, China
- Beijing Key Laboratory of Active Substances Discovery and Drugability Evaluation, Department of Medicinal Chemistry, Institute of Materia Medica, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing 100050, P. R. China
- Xue Liu
- State Key Laboratory of Bioactive Substances and Functions of Natural Medicines, Institute of Materia Medica, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing 100050, China
- Hai-Yu Hu
- State Key Laboratory of Bioactive Substances and Functions of Natural Medicines, Institute of Materia Medica, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing 100050, China
- Yafeng Deng
- CarbonSilicon AI Technology Co., Ltd. Beijing 100080, China
- Department of Automation, Tsinghua University, Beijing 100084, China
- Xiaojian Wang
- State Key Laboratory of Bioactive Substances and Functions of Natural Medicines, Institute of Materia Medica, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing 100050, China
- CarbonSilicon AI Technology Co., Ltd. Beijing 100080, China
4
Dou B, Zhu Z, Merkurjev E, Ke L, Chen L, Jiang J, Zhu Y, Liu J, Zhang B, Wei GW. Machine Learning Methods for Small Data Challenges in Molecular Science. Chem Rev 2023; 123:8736-8780. [PMID: 37384816 PMCID: PMC10999174 DOI: 10.1021/acs.chemrev.3c00189] [Citation(s) in RCA: 21] [Impact Index Per Article: 21.0] [Indexed: 07/01/2023]
Abstract
Small data are often used in scientific and engineering research due to the presence of various constraints, such as time, cost, ethics, privacy, security, and technical limitations in data acquisition. Big data have been the focus for the past decade, whereas small data and their challenges have received little attention, even though they are technically more severe in machine learning (ML) and deep learning (DL) studies. Overall, the small data challenge is often compounded by issues such as data diversity, imputation, noise, imbalance, and high dimensionality. Fortunately, the current big data era is characterized by technological breakthroughs in ML, DL, and artificial intelligence (AI), which enable data-driven scientific discovery, and many advanced ML and DL technologies developed for big data have inadvertently provided solutions for small data problems. As a result, significant progress has been made in ML and DL for small data challenges in the past decade. In this review, we summarize and analyze several emerging potential solutions to small data challenges in molecular science, including chemical and biological sciences. We review both basic machine learning algorithms, such as linear regression, logistic regression (LR), k-nearest neighbor (KNN), support vector machine (SVM), kernel learning (KL), random forest (RF), and gradient boosting trees (GBT), and more advanced techniques, including artificial neural network (ANN), convolutional neural network (CNN), U-Net, graph neural network (GNN), generative adversarial network (GAN), long short-term memory (LSTM), autoencoder, transformer, transfer learning, active learning, graph-based semi-supervised learning, combining deep learning with traditional machine learning, and physical model-based data augmentation. We also briefly discuss the latest advances in these methods. Finally, we conclude the survey with a discussion of promising trends in small data challenges in molecular science.
Affiliation(s)
- Bozheng Dou
- Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430200, P. R. China
- Zailiang Zhu
- Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430200, P. R. China
- Ekaterina Merkurjev
- Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
- Lu Ke
- Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430200, P. R. China
- Long Chen
- Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430200, P. R. China
- Jian Jiang
- Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430200, P. R. China
- Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
- Yueying Zhu
- Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430200, P. R. China
- Jie Liu
- Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430200, P. R. China
- Bengong Zhang
- Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430200, P. R. China
- Guo-Wei Wei
- Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
- Department of Electrical and Computer Engineering, Michigan State University, East Lansing, Michigan 48824, United States
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan 48824, United States
5
Zhou L, Wang Y, Peng L, Li Z, Luo X. Identifying potential drug-target interactions based on ensemble deep learning. Front Aging Neurosci 2023; 15:1176400. [PMID: 37396659 PMCID: PMC10309650 DOI: 10.3389/fnagi.2023.1176400] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/28/2023] [Accepted: 05/10/2023] [Indexed: 07/04/2023]
Abstract
Introduction: Drug-target interaction (DTI) prediction is an important step in drug research and development. Experimental methods are time-consuming and laborious. Methods: In this study, we developed a novel DTI prediction method called EnGDD by combining initial feature acquisition, dimensional reduction, and DTI classification based on a gradient boosting neural network, a deep neural network, and deep forest. Results: EnGDD was compared with seven state-of-the-art DTI prediction methods (BLM-NII, NRLMF, WNNGIP, NEDTP, DTi2Vec, RoFDT, and MolTrans) on the nuclear receptor, GPCR, ion channel, and enzyme datasets under cross validations on drugs, targets, and drug-target pairs, respectively. EnGDD computed the best recall, accuracy, F1-score, AUC, and AUPR under the majority of conditions, demonstrating its powerful DTI identification performance. EnGDD predicted that D00182 and hsa2099, D07871 and hsa1813, DB00599 and hsa2562, and D00002 and hsa10935 have higher interaction probabilities among unknown drug-target pairs and may be potential DTIs on the four datasets, respectively. In particular, D00002 (Nadide) was identified to interact with hsa10935 (Mitochondrial peroxiredoxin3), whose up-regulation might be used to treat neurodegenerative diseases. Finally, EnGDD was used to find possible drug targets for Parkinson's disease and Alzheimer's disease after confirming its DTI identification performance. The results show that D01277, D04641, and D08969 may be applied to the treatment of Parkinson's disease through targeting hsa1813 (dopamine receptor D2), and that D02173, D02558, and D03822 may offer clues for treating patients with Alzheimer's disease through targeting hsa5743 (prostaglandin-endoperoxide synthase 2). The above prediction results need further biomedical validation. Discussion: We anticipate that our proposed EnGDD model can help discover potential therapeutic clues for various diseases, including neurodegenerative diseases.
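The abstract above evaluates under cross validation "on drugs", which is usually taken to mean that every drug in a test fold is absent from the corresponding training fold. A minimal sketch of such a split follows, assuming pairs are stored as (drug, target) tuples; the function name `split_by_drug` and the identifiers `drugA`, `t1`, and so on are hypothetical and are not taken from the EnGDD paper or its datasets.

```python
import random
from typing import Dict, List, Tuple

Pair = Tuple[str, str]  # (drug_id, target_id); illustrative representation


def split_by_drug(
    pairs: List[Pair], n_folds: int = 5, seed: int = 0
) -> List[Tuple[List[Pair], List[Pair]]]:
    """Build (train, test) folds so that no drug appears on both sides."""
    drugs = sorted({drug for drug, _ in pairs})
    rng = random.Random(seed)
    rng.shuffle(drugs)
    # Assign whole drugs, not individual pairs, to folds.
    fold_of: Dict[str, int] = {d: i % n_folds for i, d in enumerate(drugs)}
    folds = []
    for f in range(n_folds):
        test = [p for p in pairs if fold_of[p[0]] == f]
        train = [p for p in pairs if fold_of[p[0]] != f]
        folds.append((train, test))
    return folds


# Hypothetical toy data: four drugs, three targets.
pairs = [
    ("drugA", "t1"), ("drugA", "t2"),
    ("drugB", "t1"),
    ("drugC", "t2"), ("drugC", "t3"),
    ("drugD", "t3"),
]
folds = split_by_drug(pairs, n_folds=3)
```

Splitting on targets would follow the same pattern with `p[1]` as the grouping key, while splitting on pairs would shuffle the pairs themselves.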
Affiliation(s)
- Liqian Zhou
- School of Computer Science, Hunan University of Technology, Zhuzhou, China
- Yuzhuang Wang
- School of Computer Science, Hunan University of Technology, Zhuzhou, China
- Lihong Peng
- School of Computer Science, Hunan University of Technology, Zhuzhou, China
- Zejun Li
- School of Computer Science, Hunan Institute of Technology, Hengyang, China
- Xueming Luo
- School of Computer Science, Hunan University of Technology, Zhuzhou, China
6
Yang M, Sun H, Liu X, Xue X, Deng Y, Wang X. CMGN: a conditional molecular generation net to design target-specific molecules with desired properties. Brief Bioinform 2023:7165252. [PMID: 37193672 DOI: 10.1093/bib/bbad185] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/24/2022] [Revised: 04/06/2023] [Accepted: 04/23/2023] [Indexed: 05/18/2023]
Abstract
The rational design of chemical entities with desired properties for a specific target is a long-standing challenge in drug design. Generative neural networks have emerged as a powerful approach to sample novel molecules with specific properties, termed as inverse drug design. However, generating molecules with biological activity against certain targets and predefined drug properties still remains challenging. Here, we propose a conditional molecular generation net (CMGN), the backbone of which is a bidirectional and autoregressive transformer. CMGN applies large-scale pretraining for molecular understanding and navigates the chemical space for specified targets by fine-tuning with corresponding datasets. Additionally, fragments and properties were trained to recover molecules to learn the structure-properties relationships. Our model crisscrosses the chemical space for specific targets and properties that control fragment-growth processes. Case studies demonstrated the advantages and utility of our model in fragment-to-lead processes and multi-objective lead optimization. The results presented in this paper illustrate that CMGN has the potential to accelerate the drug discovery process.
Affiliation(s)
- Minjian Yang
- State Key Laboratory of Bioactive Substances and Functions of Natural Medicines, Department of Medicinal Chemistry, Beijing Key Laboratory of Active Substances Discovery and Druggability Evaluation, Institute of Materia Medica, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing 100050, China
- Hanyu Sun
- State Key Laboratory of Bioactive Substances and Functions of Natural Medicines, Institute of Materia Medica, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing 100050, China
- Xue Liu
- State Key Laboratory of Bioactive Substances and Functions of Natural Medicines, Institute of Materia Medica, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing 100050, China
- Xi Xue
- State Key Laboratory of Bioactive Substances and Functions of Natural Medicines, Institute of Materia Medica, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing 100050, China
- Yafeng Deng
- CarbonSilicon AI Technology Co., Ltd., China
- Xiaojian Wang
- State Key Laboratory of Bioactive Substances and Functions of Natural Medicines, Department of Medicinal Chemistry, Beijing Key Laboratory of Active Substances Discovery and Druggability Evaluation, Institute of Materia Medica, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing 100050, China
7
Desmedt E, Smets D, Woller T, Alonso M, De Vleeschouwer F. Designing hexaphyrins for high-potential NLO switches: the synergy of core-modifications and meso-substitutions. Phys Chem Chem Phys 2023. [PMID: 37162298 DOI: 10.1039/d3cp01240a] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 05/11/2023]
Abstract
Due to the enormous size of the chemical compound space, usually only small regions are traversed with traditional direct molecular design approaches, making the discovery of novel functionalized molecules for nonlinear optical (NLO) applications challenging. By applying inverse molecular design algorithms, we aim to efficiently explore larger regions of the compound space in search of promising hexaphyrin-based molecular switches as measured by their first-hyperpolarizability (βHRS) contrast. We focus on the 28R → 30R switch with a functionalization pattern allowing for centrosymmetric OFF states yielding zero βHRS response. This switch is particularly challenging as full meso-substitution with a single type of functional group or core-modifications result in almost no contrast enhancement. We carried out four inverse design procedures during which two sets of core-modifications and three sets of meso-substitution sites were systematically optimized. All four optimal switches are characterized by a mix of meso-substitutions and core-modifications, of which the best performing switch yields a 10-fold improvement over the parent macrocycle. Throughout the inverse design procedures, we collected and analyzed a database biased towards high NLO contrasts that contains 277 different patterns for hexaphyrin-based switches. We derived three design rules to obtain highly functional 28R → 30R NLO switches: (I) A combination of two strong electron-withdrawing groups (EWG) and one electron-donating group (EDG) is the ideal recipe for increasing the NLO contrast, though their position also plays an important role. (II) The type of core-modification is less important when only the diagonal positions are core-modified. Switches with four core-modifications show a clear preference for oxygen. (III) Keeping centrosymmetry in the OFF state remains highly beneficial given the investigated functionalization pattern. Finally, we have demonstrated that combining meso-substitutions with core-modifications can synergistically improve the NLO contrast.
Affiliation(s)
- Eline Desmedt
- Department of General Chemistry Algemene Chemie (ALGC), Vrije Universiteit Brussel, Pleinlaan 2, 1050 Brussel, Belgium.
- David Smets
- Department of General Chemistry Algemene Chemie (ALGC), Vrije Universiteit Brussel, Pleinlaan 2, 1050 Brussel, Belgium.
- Tatiana Woller
- Department of General Chemistry Algemene Chemie (ALGC), Vrije Universiteit Brussel, Pleinlaan 2, 1050 Brussel, Belgium.
- Mercedes Alonso
- Department of General Chemistry Algemene Chemie (ALGC), Vrije Universiteit Brussel, Pleinlaan 2, 1050 Brussel, Belgium.
- Freija De Vleeschouwer
- Department of General Chemistry Algemene Chemie (ALGC), Vrije Universiteit Brussel, Pleinlaan 2, 1050 Brussel, Belgium.
8
Yao L, Yang M, Song J, Yang Z, Sun H, Shi H, Liu X, Ji X, Deng Y, Wang X. Conditional Molecular Generation Net Enables Automated Structure Elucidation Based on 13C NMR Spectra and Prior Knowledge. Anal Chem 2023; 95:5393-5401. [PMID: 36926883 DOI: 10.1021/acs.analchem.2c05817] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/18/2023]
Abstract
Structure elucidation of unknown compounds based on nuclear magnetic resonance (NMR) remains a challenging problem in both synthetic organic and natural product chemistry. Library matching has been an efficient method to assist structure elucidation. However, it is limited by the coverage of libraries. In addition, prior knowledge such as molecular fragments is neglected. To solve the problem, we propose a conditional molecular generation net (CMGNet) that accepts multiple sources of information as input. CMGNet takes not only 13C NMR spectral data as input but also molecular formulas and molecular fragments as input conditions. Our model applies large-scale pretraining for molecular understanding and fine-tuning on two NMR spectral data sets of different granularity levels to accommodate structure elucidation tasks. CMGNet generates structures based on 13C NMR data, molecular formula, and fragment information, with a recovery rate of 94.17% in the top 10 recommendations. In addition, the generative model performed well in the generation of various classes of compounds and in the structural revision task. CMGNet has a deep understanding of molecular connectivities from 13C NMR, molecular formula, and fragments, paving the way for a new paradigm of deep learning-assisted inverse problem-solving.
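A "recovery rate in the top 10 recommendations" is commonly computed as the fraction of test compounds whose true structure appears among the first k ranked candidates. The sketch below illustrates that reading under the assumption that structures are compared as canonical strings (for example canonical SMILES); the function name `top_k_recovery` and the toy data are hypothetical and do not reproduce the paper's evaluation code.

```python
from typing import List, Sequence


def top_k_recovery(
    ranked_predictions: Sequence[Sequence[str]],
    references: Sequence[str],
    k: int = 10,
) -> float:
    """Fraction of cases whose reference structure is in the top-k candidates.

    Assumes every structure is already canonicalized, so that string
    equality implies structural identity.
    """
    hits = sum(
        1
        for candidates, reference in zip(ranked_predictions, references)
        if reference in candidates[:k]
    )
    return hits / len(references)


# Hypothetical ranked candidate lists for three spectra.
predictions: List[List[str]] = [["CCO", "CC"], ["CCN", "CCC"], ["C"]]
truths = ["CC", "CCC", "CO"]

rate_top1 = top_k_recovery(predictions, truths, k=1)
rate_top2 = top_k_recovery(predictions, truths, k=2)
```

In this toy case no reference is ranked first, and two of the three references are recovered once the second-ranked candidate is admitted.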
Affiliation(s)
- Lin Yao
- CarbonSilicon AI Technology Co., Ltd., Beijing 100080, China
- Minjian Yang
- State Key Laboratory of Bioactive Substances and Functions of Natural Medicines, Institute of Materia Medica, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing 100050, China
- Jianfei Song
- CarbonSilicon AI Technology Co., Ltd., Beijing 100080, China
- Zhuo Yang
- CarbonSilicon AI Technology Co., Ltd., Beijing 100080, China
- Hanyu Sun
- State Key Laboratory of Bioactive Substances and Functions of Natural Medicines, Institute of Materia Medica, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing 100050, China
- Hui Shi
- CarbonSilicon AI Technology Co., Ltd., Beijing 100080, China
- Xue Liu
- State Key Laboratory of Bioactive Substances and Functions of Natural Medicines, Institute of Materia Medica, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing 100050, China
- Xiangyang Ji
- Department of Automation, Tsinghua University, Beijing 100084, China
- Yafeng Deng
- CarbonSilicon AI Technology Co., Ltd., Beijing 100080, China
- Department of Automation, Tsinghua University, Beijing 100084, China
- Xiaojian Wang
- State Key Laboratory of Bioactive Substances and Functions of Natural Medicines, Institute of Materia Medica, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing 100050, China
- CarbonSilicon AI Technology Co., Ltd., Beijing 100080, China
9
Shen SC, Khare E, Lee NA, Saad MK, Kaplan DL, Buehler MJ. Computational Design and Manufacturing of Sustainable Materials through First-Principles and Materiomics. Chem Rev 2023; 123:2242-2275. [PMID: 36603542 DOI: 10.1021/acs.chemrev.2c00479] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Indexed: 01/07/2023]
Abstract
Engineered materials are ubiquitous throughout society and are critical to the development of modern technology, yet many current material systems are inexorably tied to widespread deterioration of ecological processes. Next-generation material systems can address goals of environmental sustainability by providing alternatives to fossil fuel-based materials and by reducing destructive extraction processes, energy costs, and accumulation of solid waste. However, development of sustainable materials faces several key challenges including investigation, processing, and architecting of new feedstocks that are often relatively mechanically weak, complex, and difficult to characterize or standardize. In this review paper, we outline a framework for examining sustainability in material systems and discuss how recent developments in modeling, machine learning, and other computational tools can aid the discovery of novel sustainable materials. We consider these through the lens of materiomics, an approach that considers material systems holistically by incorporating perspectives of all relevant scales, beginning with first-principles approaches and extending through the macroscale to consider sustainable material design from the bottom-up. We follow with an examination of how computational methods are currently applied to select examples of sustainable material development, with particular emphasis on bioinspired and biobased materials, and conclude with perspectives on opportunities and open challenges.
Affiliation(s)
- Sabrina C Shen
- Laboratory for Atomistic and Molecular Mechanics (LAMM), Massachusetts Institute of Technology, 77 Massachusetts Avenue 1-165, Cambridge, Massachusetts 02139, United States
- Department of Materials Science and Engineering, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, Massachusetts 02139, United States
- Eesha Khare
- Laboratory for Atomistic and Molecular Mechanics (LAMM), Massachusetts Institute of Technology, 77 Massachusetts Avenue 1-165, Cambridge, Massachusetts 02139, United States
- Department of Materials Science and Engineering, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, Massachusetts 02139, United States
- Nicolas A Lee
- Laboratory for Atomistic and Molecular Mechanics (LAMM), Massachusetts Institute of Technology, 77 Massachusetts Avenue 1-165, Cambridge, Massachusetts 02139, United States
- School of Architecture and Planning, Media Lab, Massachusetts Institute of Technology, 75 Amherst Street, Cambridge, Massachusetts 02139, United States
- Michael K Saad
- Department of Biomedical Engineering, Tufts University, 4 Colby Street, Medford, Massachusetts 02155, United States
- David L Kaplan
- Department of Biomedical Engineering, Tufts University, 4 Colby Street, Medford, Massachusetts 02155, United States
- Markus J Buehler
- Laboratory for Atomistic and Molecular Mechanics (LAMM), Massachusetts Institute of Technology, 77 Massachusetts Avenue 1-165, Cambridge, Massachusetts 02139, United States
- Center for Computational Science and Engineering, Schwarzman College of Computing, 77 Massachusetts Avenue, Cambridge, Massachusetts 02139, United States
10
Fromer JC, Coley CW. Computer-aided multi-objective optimization in small molecule discovery. Patterns (New York, N.Y.) 2023; 4:100678. [PMID: 36873904 PMCID: PMC9982302 DOI: 10.1016/j.patter.2023.100678] [Citation(s) in RCA: 21] [Impact Index Per Article: 21.0] [Indexed: 02/12/2023]
Abstract
Molecular discovery is a multi-objective optimization problem that requires identifying a molecule or set of molecules that balance multiple, often competing, properties. Multi-objective molecular design is commonly addressed by combining properties of interest into a single objective function using scalarization, which imposes assumptions about relative importance and uncovers little about the trade-offs between objectives. In contrast to scalarization, Pareto optimization does not require knowledge of relative importance and reveals the trade-offs between objectives. However, it introduces additional considerations in algorithm design. In this review, we describe pool-based and de novo generative approaches to multi-objective molecular discovery with a focus on Pareto optimization algorithms. We show how pool-based molecular discovery is a relatively direct extension of multi-objective Bayesian optimization and how the plethora of different generative models extend from single-objective to multi-objective optimization in similar ways using non-dominated sorting in the reward function (reinforcement learning) or to select molecules for retraining (distribution learning) or propagation (genetic algorithms). Finally, we discuss some remaining challenges and opportunities in the field, emphasizing the opportunity to adopt Bayesian optimization techniques into multi-objective de novo design.
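The contrast between scalarization and Pareto optimization drawn in this abstract can be illustrated with a small sketch. It assumes every objective is to be maximized; the candidate scores, the equal weights, and the function names are hypothetical and are not taken from the review.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b on every objective and better on at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(scores: List[Sequence[float]]) -> List[int]:
    """Indices of candidates that no other candidate dominates."""
    return [
        i
        for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


def scalarize(scores: List[Sequence[float]], weights: Sequence[float]) -> List[float]:
    """Weighted-sum scalarization: collapses each candidate to a single number."""
    return [sum(w * x for w, x in zip(weights, s)) for s in scores]


# Hypothetical molecules scored on two objectives, e.g. (potency, solubility).
candidates = [(0.9, 0.2), (0.6, 0.6), (0.2, 0.9), (0.5, 0.5), (0.1, 0.1)]

front = pareto_front(candidates)                # keeps every trade-off point
weighted = scalarize(candidates, (0.5, 0.5))    # requires choosing weights up front
best_weighted = max(range(len(candidates)), key=weighted.__getitem__)
```

Here the weighted sum returns a single winner that depends on the chosen weights, while the non-dominated set retains the candidates at both extremes as well as the balanced one, which is the trade-off information the abstract says scalarization hides.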
Affiliation(s)
- Jenna C Fromer
- Department of Chemical Engineering, MIT, Cambridge, MA 02139, USA
- Connor W Coley
- Department of Chemical Engineering, MIT, Cambridge, MA 02139, USA
- Department of Electrical Engineering and Computer Science, MIT, Cambridge, MA 02139, USA
11
Sridharan B, Mehta S, Pathak Y, Priyakumar UD. Deep Reinforcement Learning for Molecular Inverse Problem of Nuclear Magnetic Resonance Spectra to Molecular Structure. J Phys Chem Lett 2022; 13:4924-4933. [PMID: 35635003 DOI: 10.1021/acs.jpclett.2c00624] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 06/15/2023]
Abstract
Spectroscopy is the study of how matter interacts with electromagnetic radiation. The spectra of any molecule are highly information-rich, yet the inverse relation of spectra to the corresponding molecular structure is still an unsolved problem. Nuclear magnetic resonance (NMR) spectroscopy is one such critical technique in the scientist's toolkit to characterize molecules. In this work, a novel machine learning framework is proposed that attempts to solve this inverse problem by navigating the chemical space to find the correct structure given an NMR spectrum. The proposed framework uses a combination of online Monte Carlo tree search (MCTS) and a set of graph convolution networks to build a molecule iteratively. Our method can predict the structure of the molecule ∼80% of the time in its top 3 guesses for molecules with <10 heavy atoms. We believe that the proposed framework is a significant step in solving the inverse design problem of NMR spectra.
Affiliation(s)
- Bhuvanesh Sridharan
- Centre for Computational Natural Science and Bioinformatics, International Institute of Information Technology, Hyderabad 500032, India
- Sarvesh Mehta
- Centre for Computational Natural Science and Bioinformatics, International Institute of Information Technology, Hyderabad 500032, India
- Yashaswi Pathak
- Centre for Computational Natural Science and Bioinformatics, International Institute of Information Technology, Hyderabad 500032, India
- U Deva Priyakumar
- Centre for Computational Natural Science and Bioinformatics, International Institute of Information Technology, Hyderabad 500032, India