1
Krishnan SR, Bung N, Srinivasan R, Roy A. Target-specific novel molecules with their recipe: Incorporating synthesizability in the design process. J Mol Graph Model 2024; 129:108734. [PMID: 38442440 DOI: 10.1016/j.jmgm.2024.108734] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/03/2023] [Revised: 02/14/2024] [Accepted: 02/15/2024] [Indexed: 03/07/2024]
Abstract
Application of artificial intelligence (AI) in drug discovery has led to several success stories in recent times. While traditional methods mostly relied upon screening large chemical libraries for early-stage drug design, de novo design can help identify novel target-specific molecules by sampling from a much larger chemical space. Although this has increased the possibility of finding diverse and novel molecules from previously unexplored chemical space, it has also posed a great challenge for medicinal chemists, who must synthesize at least some of the de novo designed molecules for experimental validation. To address this challenge, we propose a novel forward synthesis-based generative AI method that explores the synthesizable chemical space. The method uses a structure-based drug design framework, where the target protein structure and a target-specific seed fragment from co-crystal structures can be the initial inputs. A random fragment from a purchasable fragment library can also be the input if a target-specific fragment is unavailable. Template-based forward synthesis route prediction and molecule generation are then performed in parallel using the Monte Carlo Tree Search (MCTS) method, where the subsequent fragments for molecule growth can again be obtained from a purchasable fragment library. The rewards for each iteration of MCTS are computed using a drug-target affinity (DTA) model based on the docking pose of the generated reaction intermediates at the binding site of the target protein of interest. The proposed method thereby addresses one of the major obstacles facing AI-based drug design approaches by designing novel, target-specific, synthesizable molecules.
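The four MCTS phases that this abstract relies on (selection, expansion, simulation, backpropagation) can be illustrated with a minimal sketch. This is not the authors' implementation: the fragment list, the step limit, the random rollout, and `toy_reward` (a stand-in for the DTA model) are all hypothetical, and UCB1 is assumed as the selection rule.

```python
import math
import random

FRAGMENTS = ["A", "B", "C"]  # hypothetical stand-ins for purchasable fragments
MAX_STEPS = 3                # hypothetical cap on the number of growth steps


def toy_reward(state):
    """Hypothetical stand-in for a DTA score: fraction of 'B' fragments."""
    return state.count("B") / MAX_STEPS


class Node:
    def __init__(self, state, parent=None):
        self.state = state      # tuple of fragments added so far
        self.parent = parent
        self.children = {}      # fragment -> child Node
        self.visits = 0
        self.value = 0.0        # sum of rewards seen through this node

    def is_terminal(self):
        return len(self.state) >= MAX_STEPS

    def fully_expanded(self):
        return len(self.children) == len(FRAGMENTS)


def ucb1(child, parent_visits, c=1.4):
    """Mean reward plus an exploration bonus (UCB1)."""
    if child.visits == 0:
        return float("inf")
    mean = child.value / child.visits
    return mean + c * math.sqrt(math.log(parent_visits) / child.visits)


def mcts(iterations, seed=0):
    rng = random.Random(seed)
    root = Node(())
    for _ in range(iterations):
        node = root
        # Selection: descend while every child of the node already exists.
        while not node.is_terminal() and node.fully_expanded():
            parent = node
            node = max(parent.children.values(),
                       key=lambda ch: ucb1(ch, parent.visits))
        # Expansion: attach one untried fragment.
        if not node.is_terminal():
            untried = [f for f in FRAGMENTS if f not in node.children]
            frag = rng.choice(untried)
            child = Node(node.state + (frag,), parent=node)
            node.children[frag] = child
            node = child
        # Simulation: finish the molecule at random, then score it.
        state = node.state
        while len(state) < MAX_STEPS:
            state = state + (rng.choice(FRAGMENTS),)
        reward = toy_reward(state)
        # Backpropagation: update statistics up to the root.
        while node is not None:
            node.visits += 1
            node.value += reward
            node = node.parent
    # Read out the most-visited path as the proposed fragment sequence.
    path, node = [], root
    while node.children:
        frag, node = max(node.children.items(), key=lambda kv: kv[1].visits)
        path.append(frag)
    return path


print(mcts(2000))
```

In the method described above, the reward would instead come from scoring docked reaction intermediates, and each expansion would be restricted to fragments compatible with a reaction template; both are outside the scope of this toy.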
Affiliation(s)
- Navneet Bung
  - TCS Research (Life Sciences Division), Tata Consultancy Services Limited, Hyderabad, 500081, India
- Rajgopal Srinivasan
  - TCS Research (Life Sciences Division), Tata Consultancy Services Limited, Hyderabad, 500081, India
- Arijit Roy
  - TCS Research (Life Sciences Division), Tata Consultancy Services Limited, Hyderabad, 500081, India
2
Shao L, Fu T, Lin Y, Xiao D, Ai D, Zhang T, Fan J, Song H, Yang J. Facial augmented reality based on hierarchical optimization of similarity aspect graph. Comput Methods Programs Biomed 2024; 248:108108. [PMID: 38461712 DOI: 10.1016/j.cmpb.2024.108108] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/16/2022] [Revised: 02/05/2024] [Accepted: 02/29/2024] [Indexed: 03/12/2024]
Abstract
BACKGROUND: Existing face matching methods require a point cloud to be drawn on the real face for registration. Irregular deformation of the patient's skin introduces many outlier points into the point cloud, which results in low registration accuracy.
METHODS: This work proposes a non-contact pose estimation method based on hierarchical optimization of a similarity aspect graph. The proposed method automatically identifies the 2D and 3D feature points of the face and constructs a distance-weighted, triangular-constrained similarity measure to describe the similarity between views. A mutual similarity clustering method is proposed to construct a hierarchical aspect graph with 3D poses as nodes. A Monte Carlo tree search strategy searches the hierarchical aspect graph to determine the optimal pose of the facial 3D model, so as to accurately register the facial 3D model with the real face.
RESULTS: Accuracy verification experiments were conducted on phantoms and volunteers, and the proposed method was compared with four advanced pose calibration methods. It obtained average fusion errors of 1.13 ± 0.20 mm and 0.92 ± 0.08 mm in the head phantom and volunteer experiments, respectively, the best fusion performance among all compared methods.
CONCLUSIONS: Our experiments demonstrated the effectiveness of the proposed pose estimation method in facial augmented reality.
Affiliation(s)
- Long Shao
  - School of Computer Science & Technology, Beijing Institute of Technology, Beijing 100081, China
- Tianyu Fu
  - School of Medical Technology, Beijing Institute of Technology, Beijing 100081, China
- Yucong Lin
  - School of Medical Technology, Beijing Institute of Technology, Beijing 100081, China
- Deqiang Xiao
  - Beijing Engineering Research Center of Mixed Reality and Advanced Display, School of Optics and Photonics, Beijing Institute of Technology, Beijing 100081, China
- Danni Ai
  - Beijing Engineering Research Center of Mixed Reality and Advanced Display, School of Optics and Photonics, Beijing Institute of Technology, Beijing 100081, China
- Tao Zhang
  - Department of Stomatology, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100730, China
- Jingfan Fan
  - Beijing Engineering Research Center of Mixed Reality and Advanced Display, School of Optics and Photonics, Beijing Institute of Technology, Beijing 100081, China
- Hong Song
  - School of Computer Science & Technology, Beijing Institute of Technology, Beijing 100081, China
- Jian Yang
  - Beijing Engineering Research Center of Mixed Reality and Advanced Display, School of Optics and Photonics, Beijing Institute of Technology, Beijing 100081, China
3
Westerlund AM, Barge B, Mervin L, Genheden S. Data-driven approaches for identifying hyperparameters in multi-step retrosynthesis. Mol Inform 2023; 42:e202300128. [PMID: 37679293 DOI: 10.1002/minf.202300128] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/26/2023] [Revised: 08/24/2023] [Accepted: 09/07/2023] [Indexed: 09/09/2023]
Abstract
The multi-step retrosynthesis problem can be solved by a search algorithm, such as Monte Carlo tree search (MCTS). The performance of multi-step retrosynthesis, as measured by a trade-off between search time and route solvability, therefore depends on the hyperparameters of the search algorithm. In this paper, we demonstrated the effect of three MCTS hyperparameters (number of iterations, tree depth, and tree width) on metrics such as the linear integrated speed-accuracy score (LISAS) and the inverse efficiency score, which consider both route solvability and search time. This exploration was conducted by employing three data-driven approaches, namely a systematic grid search, Bayesian optimization over an ensemble of molecules to obtain static MCTS hyperparameters, and a machine learning approach to dynamically predict optimal MCTS hyperparameters given an input target molecule. With the results obtained on the internal dataset, we demonstrated that it is possible to identify a hyperparameter set which outperforms the current AiZynthFinder default setting. It appeared optimal across a variety of target input molecules, on both proprietary and public datasets. The settings identified with the in-house dataset reached a solvability of 93 % and a median search time of 151 s for the in-house dataset, and a solvability of 74 % and a median search time of 114 s for the ChEMBL dataset. By comparison, the current default settings solved 85 % and 73 % with median search times of 110 s and 84 s for in-house and ChEMBL, respectively.
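For readers unfamiliar with the two combined metrics named in this abstract, the sketch below computes them using their standard definitions from the speed-accuracy literature: the inverse efficiency score is mean time divided by the proportion of successes, and LISAS is mean time plus the proportion of failures scaled by the ratio of the time and failure standard deviations. How the paper maps search time and route solvability onto these quantities is not stated in the abstract, so the mapping and the function names here are assumptions.

```python
def inverse_efficiency_score(mean_time, proportion_solved):
    """IES = mean time / proportion of successes (lower is better).

    Assumed mapping: mean_time is a search time in seconds and
    proportion_solved is the fraction of targets with a solved route.
    """
    if proportion_solved <= 0:
        return float("inf")  # nothing solved: efficiency is unbounded
    return mean_time / proportion_solved


def lisas(mean_time, proportion_failed, sd_time, sd_failed):
    """LISAS = mean time + (sd_time / sd_failed) * proportion of failures.

    The standard-deviation ratio rescales the failure rate into time
    units so that both terms carry comparable weight. Lower is better.
    """
    if sd_failed == 0:
        return mean_time  # no variation in failures: the term is dropped
    return mean_time + (sd_time / sd_failed) * proportion_failed


# Hypothetical numbers, chosen only to make the arithmetic easy to follow.
print(inverse_efficiency_score(100.0, 0.8))
print(lisas(100.0, 0.2, sd_time=10.0, sd_failed=0.1))
```

Both scores fold a time and an accuracy measure into one number, which is what allows a single objective to be handed to a grid search or a Bayesian optimizer.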
Affiliation(s)
- Bente Barge
  - Molecular AI, Discovery Sciences, R&D, AstraZeneca, Gothenburg, Sweden
  - Hylleraas Centre for Quantum Molecular Sciences, Department of Chemistry, UiT The Arctic University of Norway, N9037, Tromsø, Norway
- Lewis Mervin
  - Molecular AI, Discovery Sciences, R&D, AstraZeneca, Cambridge, UK
- Samuel Genheden
  - Molecular AI, Discovery Sciences, R&D, AstraZeneca, Gothenburg, Sweden
4
Zhang B, Zhang X, Du W, Song Z, Zhang G, Zhang G, Wang Y, Chen X, Jiang J, Luo Y. Chemistry-informed molecular graph as reaction descriptor for machine-learned retrosynthesis planning. Proc Natl Acad Sci U S A 2022; 119:e2212711119. [PMID: 36191228 DOI: 10.1073/pnas.2212711119] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Indexed: 11/18/2022] Open
Abstract
Infusing "chemical wisdom" should improve the data-driven approaches that rely exclusively on historical synthetic data for automatic retrosynthesis planning. For this purpose, we designed a chemistry-informed molecular graph (CIMG) to describe chemical reactions. A collection of key information that is most relevant to chemical reactions is integrated in CIMG: NMR chemical shifts as vertex features, bond dissociation energies as edge features, and solvent/catalyst information as global features. For any given compound as a target, a product CIMG is generated and exploited by a graph neural network (GNN) model to choose reaction template(s) leading to this product. A reactant CIMG is then inferred and used in two GNN models to select an appropriate catalyst and solvent, respectively. Finally, a fourth GNN model compares the two CIMG descriptors to check the plausibility of the proposed reaction. A reaction vector is obtained for every molecule in training these models. The chemical wisdom of reaction propensity contained in the pretrained reaction vectors is exploited to autocategorize molecules/reactions and to accelerate Monte Carlo tree search (MCTS) for multistep retrosynthesis planning. Full synthetic routes with recommended catalysts/solvents are predicted efficiently using this CIMG-based approach.
5
Kleinerman A, Rosenfeld A, Rosemarin H. Machine-learning based routing of callers in an Israeli mental health hotline. Isr J Health Policy Res 2022; 11:25. [PMID: 35659290 PMCID: PMC9164346 DOI: 10.1186/s13584-022-00534-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/14/2021] [Accepted: 05/19/2022] [Indexed: 11/15/2022] Open
Abstract
Background: Mental health contact centers (also known as hotlines) offer crisis intervention and counselling by phone calls and online chats. These mental health helplines have shown great success in improving the mental state of the callers, and are becoming increasingly popular in Israel and worldwide. Unfortunately, our knowledge about how to conduct successful routing of callers to counselling agents has been limited due to the lack of large-scale data with labeled outcomes of the interactions. To date, many of these contact centers are overwhelmed by chat requests and operate under a simple first-come-first-served (FCFS) scheduling policy. Combined, these factors may lead to many callers receiving suboptimal counselling or abandoning the service before being treated. In this work our goal is to improve the efficiency of mental health contact centers by using a novel machine-learning based routing policy.
Methods: We present a large-scale machine learning-based analysis of real-world data from the online contact center of ERAN, the Israeli Association for Emotional First Aid. The data includes over 35,000 conversations over a 2-year period. Based on this analysis, we present a novel call routing method that integrates advanced AI techniques, including the Monte Carlo tree search algorithm. We conducted an experiment that included various realistic simulations of incoming calls to contact centers, based on data from ERAN. We divided the simulations into two common settings: standard call flow and heavy call flow. We compared our proposed solution to two baseline methods: (1) the FCFS method; and (2) a greedy solution based on machine learning predictions. Our comparison focuses on two metrics: the number of calls served and the average feedback of the callers (i.e., quality of the chats).
Results: In the preliminary analysis, we identify indicative features that significantly contribute to the effectiveness of a conversation and demonstrate high accuracy in predicting the expected duration and the callers' feedback. In the routing methods evaluation, we find that in heavy call flow settings, our proposed method significantly outperforms the other methods in both the quantity of served calls and average feedback. Most notably, in the heavy call flow settings, our method improves the average feedback by 24% compared to FCFS and by 4% compared to the greedy solution. In the standard-flow setting, our proposed method significantly outperforms the FCFS method in the callers' average feedback, with a 12% improvement. However, in this setting, we did not find a significant difference between any of the methods in the quantity of served calls, and no significant difference was found between our proposed method and the greedy solution.
Conclusion: The proposed routing policy has the potential to significantly improve the performance of mental health contact centers, especially in peak hours. Leveraging artificial intelligence techniques, such as machine learning algorithms, combined with real-world data can bring about a significant and necessary leap forward in the way mental health hotlines operate and consequently reduce the burden of mental illnesses on health systems. However, implementation and evaluation in an operational contact center is necessary in order to verify that the results replicate in practice.
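The difference between the two baseline policies can be made concrete with a toy sketch. It is an illustration only: the caller and agent names, the `predicted_feedback` table, and the exact greedy rule (repeatedly take the best-scoring free pair) are assumptions rather than the paper's formulation, and the MCTS-based policy itself is not shown.

```python
def fcfs_assign(callers, agents):
    """First-come-first-served: pair callers with agents in arrival order."""
    return list(zip(callers, agents))


def greedy_assign(callers, agents, predicted_feedback):
    """Repeatedly match the free (caller, agent) pair with the best prediction."""
    free_callers, free_agents = list(callers), list(agents)
    pairs = []
    while free_callers and free_agents:
        caller, agent = max(
            ((c, a) for c in free_callers for a in free_agents),
            key=lambda pair: predicted_feedback[pair],
        )
        pairs.append((caller, agent))
        free_callers.remove(caller)
        free_agents.remove(agent)
    return pairs


def total_feedback(pairs, predicted_feedback):
    """Sum of predicted feedback over all assigned pairs."""
    return sum(predicted_feedback[pair] for pair in pairs)


# Hypothetical predictions from a feedback model, keyed by (caller, agent).
predicted_feedback = {
    ("c1", "a1"): 0.5, ("c1", "a2"): 0.9,
    ("c2", "a1"): 0.6, ("c2", "a2"): 0.4,
}
callers, agents = ["c1", "c2"], ["a1", "a2"]

fcfs = fcfs_assign(callers, agents)
greedy = greedy_assign(callers, agents, predicted_feedback)
print(fcfs, total_feedback(fcfs, predicted_feedback))
print(greedy, total_feedback(greedy, predicted_feedback))
```

A greedy rule optimizes only the current match; a tree search can additionally look ahead to callers who are about to arrive, which is presumably where the reported gains under heavy call flow come from.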
6
Banik S, Loeffler TD, Batra R, Singh H, Cherukara MJ, Sankaranarayanan SKRS. Learning with Delayed Rewards-A Case Study on Inverse Defect Design in 2D Materials. ACS Appl Mater Interfaces 2021; 13:36455-36464. [PMID: 34288661 DOI: 10.1021/acsami.1c07545] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/13/2023]
Abstract
Defect dynamics in materials are of central importance to a broad range of technologies, from catalysis to energy storage systems to microelectronics. Material functionality depends strongly on the nature and organization of defects, whose arrangements often involve intermediate or transient states that present a high barrier for transformation. The lack of knowledge of these intermediate states and the presence of this energy barrier present a serious challenge for inverse defect design, especially for gradient-based approaches. Here, we present a reinforcement learning (RL) approach [Monte Carlo Tree Search (MCTS)] based on delayed rewards that allows for efficient search of the defect configurational space and allows us to identify optimal defect arrangements in low-dimensional materials. Using a representative case of two-dimensional MoS2, we demonstrate that the use of delayed rewards allows us to efficiently sample the defect configurational space and overcome the energy barrier for a wide range of defect concentrations (from 1.5 to 8% S vacancies): the system evolves from initially randomly distributed S vacancies to one with extended S line defects, consistent with previous experimental studies. Detailed analysis in the feature space allows us to identify the optimal pathways for this defect transformation and arrangement. Comparison with other global optimization schemes such as genetic algorithms suggests that MCTS with delayed rewards takes fewer evaluations and arrives at a better-quality solution. The implications of the various sampled defect configurations for the 2H to 1T phase transitions in MoS2 are discussed. Overall, we introduce an RL strategy employing delayed rewards that can accelerate the inverse design of defects in materials for achieving targeted functionality.
Affiliation(s)
- Suvo Banik
  - Center for Nanoscale Materials, Argonne National Laboratory, Lemont, Illinois 60439, United States
  - Department of Mechanical and Industrial Engineering, University of Illinois, Chicago, Illinois 60607, United States
- Troy David Loeffler
  - Center for Nanoscale Materials, Argonne National Laboratory, Lemont, Illinois 60439, United States
  - Department of Mechanical and Industrial Engineering, University of Illinois, Chicago, Illinois 60607, United States
- Rohit Batra
  - Center for Nanoscale Materials, Argonne National Laboratory, Lemont, Illinois 60439, United States
- Harpal Singh
  - Research and Development, Sentient Science Corporation, West Lafayette, Indiana 47906, United States
- Mathew J Cherukara
  - Advanced Photon Source, Argonne National Laboratory, Lemont, Illinois 60439, United States
- Subramanian K R S Sankaranarayanan
  - Center for Nanoscale Materials, Argonne National Laboratory, Lemont, Illinois 60439, United States
  - Department of Mechanical and Industrial Engineering, University of Illinois, Chicago, Illinois 60607, United States
7
Barciś M, Barciś A, Tsiogkas N, Hellwagner H. Information Distribution in Multi-Robot Systems: Generic, Utility-Aware Optimization Middleware. Front Robot AI 2021; 8:685105. [PMID: 34386524 PMCID: PMC8353533 DOI: 10.3389/frobt.2021.685105] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/24/2021] [Accepted: 06/21/2021] [Indexed: 11/13/2022] Open
Abstract
This work addresses the problem of what information is worth sending in a multi-robot system under generic constraints, e.g., limited throughput or energy. Our decision method is based on Monte Carlo Tree Search. It is designed as a transparent middleware that can be integrated into existing systems to optimize communication among robots. Furthermore, we introduce techniques to reduce the decision space of this problem to further improve the performance. We evaluate our approach using a simulation study and demonstrate its feasibility in a real-world environment by realizing a proof of concept in ROS 2 on mobile robots.
Affiliation(s)
- Michał Barciś
  - Karl Popper Kolleg on Networked Autonomous Aerial Vehicles (KPK NAV), University of Klagenfurt, Klagenfurt, Austria
- Agata Barciś
  - Karl Popper Kolleg on Networked Autonomous Aerial Vehicles (KPK NAV), University of Klagenfurt, Klagenfurt, Austria
- Nikolaos Tsiogkas
  - Department of Mechanical Engineering, Division RAM, KU Leuven, Leuven, Belgium
  - FlandersMake@KULeuven, Core Lab ROB, Leuven, Belgium
- Hermann Hellwagner
  - Karl Popper Kolleg on Networked Autonomous Aerial Vehicles (KPK NAV), University of Klagenfurt, Klagenfurt, Austria
8
Lin CH, Chen YS, Lin JT, Wu HC, Kuo HT, Lin CF, Chen P, Wu PC. Automatic Inverse Design of High-Performance Beam-Steering Metasurfaces via Genetic-type Tree Optimization. Nano Lett 2021; 21:4981-4989. [PMID: 34110156 DOI: 10.1021/acs.nanolett.1c00720] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/12/2023]
Abstract
We introduce a genetic-type tree search (GTTS) algorithm combined with unsupervised clustering for the automatic inverse design of high-performance metasurfaces. With the proposed method, we realize highly directive beam-steering metasurfaces via the cooptimization of the amplitude and phase. In comparison with previous topology optimization approaches, the developed GTTS algorithm optimizes the organization of subwavelength nanoantennas and, thus, is applicable to the design of both passive and active metasurfaces. The optimized beam-steering metasurface specifically exhibits a nearly constant directivity when the steering angle varies from 5° to 30°. Furthermore, the optimized nonintuitive reflectance and phase profiles assist in achieving highly directive beam steering when the phase modulation range is <180°, which was previously challenging. Our approach can diminish the requirements of scattering light properties with substantially enhanced angular resolution of beam-steering metasurfaces, which enables the realization of high-performance metasurfaces that will be promising for a wide range of advanced nanophotonic applications.
Affiliation(s)
- Chia-Hsiang Lin
  - Department of Electrical Engineering, National Cheng Kung University, Tainan 70101, Taiwan
  - Miin Wu School of Computing, National Cheng Kung University, Tainan 70101, Taiwan
- Yu-Sheng Chen
  - Department of Photonics, National Cheng Kung University, Tainan 70101, Taiwan
- Jhao-Ting Lin
  - Department of Electrical Engineering, National Cheng Kung University, Tainan 70101, Taiwan
- Hao Chung Wu
  - Department of Photonics, National Cheng Kung University, Tainan 70101, Taiwan
- Hsuan-Ting Kuo
  - Department of Photonics, National Cheng Kung University, Tainan 70101, Taiwan
- Chen-Fu Lin
  - Department of Photonics, National Cheng Kung University, Tainan 70101, Taiwan
- Peter Chen
  - Department of Photonics, National Cheng Kung University, Tainan 70101, Taiwan
- Pin Chieh Wu
  - Department of Photonics, National Cheng Kung University, Tainan 70101, Taiwan
9
Yamada K, Chen ZZ, Wang L. Improved Practical Algorithms for Rooted Subtree Prune and Regraft (rSPR) Distance and Hybridization Number. J Comput Biol 2020; 27:1422-1432. [PMID: 32048865 DOI: 10.1089/cmb.2019.0432] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/13/2022] Open
Abstract
The problem of computing the rooted subtree prune and regraft (rSPR) distance of two phylogenetic trees is computationally hard, and so is the problem of computing the hybridization number of two phylogenetic trees (denoted by Hybridization Number Computation [HNC]). Since they are important problems in phylogenetics, they have been studied extensively in the literature. Indeed, quite a number of exact or approximation algorithms have been designed and implemented for them. In this article, we design and implement several approximation algorithms for them and one exact algorithm for HNC. Our experimental results show that the resulting exact program is much faster than the previous best (namely, more than 80 times faster for the easiest dataset used in the experiments), and its superiority in speed becomes even more significant for more difficult instances. Moreover, the resulting approximation program produces much better results than the previous best; indeed, the outputs are always nearly optimal and often optimal. Of particular interest is the usage of the Monte Carlo tree search (MCTS) method in the design of our approximation algorithms. Our experimental results show that with MCTS, we can often solve HNC exactly within a short time.
Affiliation(s)
- Kohei Yamada
  - Division of Information System Design, Tokyo Denki University, Hatoyama, Saitama 350-0394, Japan
- Zhi-Zhong Chen
  - Division of Information System Design, Tokyo Denki University, Hatoyama, Saitama 350-0394, Japan
- Lusheng Wang
  - Department of Computer Science, City University of Hong Kong, Kowloon, Hong Kong SAR
10
Zhang X, Zhang K, Lee Y. Machine Learning Enabled Tailor-Made Design of Application-Specific Metal-Organic Frameworks. ACS Appl Mater Interfaces 2020; 12:734-743. [PMID: 31820913 DOI: 10.1021/acsami.9b17867] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Indexed: 06/10/2023]
Abstract
In the development of advanced nanoporous materials, one clear and unavoidable challenge at hand is the sheer size (in principle, infinite) of the materials space to be explored. While high-throughput screening techniques allow us to narrow down the enormous database of nanoporous materials, there are still practical limitations stemming from the costly molecular simulations needed to estimate a material's performance and the necessity of a sophisticated descriptor for identifying materials. In an attempt to transition away from screening-based approaches, this paper presents a computational approach combining Monte Carlo tree search and recurrent neural networks for the tailor-made design of metal-organic frameworks toward the desired target applications. In the demonstration cases for methane-storage and carbon-capture applications, our approach showed significant efficiency in designing promising and novel metal-organic frameworks. We expect that this approach can easily be extended to other applications by simply adjusting the reward function according to the target performance property.
Affiliation(s)
- Xiangyu Zhang
  - School of Physical Science and Technology, ShanghaiTech University, Shanghai 201210, China
- Kexin Zhang
  - School of Physical Science and Technology, ShanghaiTech University, Shanghai 201210, China
- Yongjin Lee
  - School of Physical Science and Technology, ShanghaiTech University, Shanghai 201210, China
11
Yang X, Zhang J, Yoshizoe K, Terayama K, Tsuda K. ChemTS: an efficient python library for de novo molecular generation. Sci Technol Adv Mater 2017; 18:972-976. [PMID: 29435094 PMCID: PMC5801530 DOI: 10.1080/14686996.2017.1401424] [Citation(s) in RCA: 102] [Impact Index Per Article: 14.6] [Received: 09/29/2017] [Revised: 11/01/2017] [Accepted: 11/02/2017] [Indexed: 05/23/2023]
Abstract
Automatic design of organic materials requires black-box optimization in a vast chemical space. In conventional molecular design algorithms, a molecule is built as a combination of predetermined fragments. Recently, deep neural network models such as variational autoencoders and recurrent neural networks (RNNs) have been shown to be effective in de novo design of molecules without any predetermined fragments. This paper presents a novel Python library, ChemTS, that explores the chemical space by combining Monte Carlo tree search and an RNN. In a benchmarking problem of optimizing the octanol-water partition coefficient and synthesizability, our algorithm showed superior efficiency in finding high-scoring molecules. ChemTS is available at https://github.com/tsudalab/ChemTS.
Affiliation(s)
- Xiufeng Yang
  - Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa, Japan
- Jinzhe Zhang
  - Department of Biosciences, INSA Lyon, Villeurbanne Cedex, France
- Kazuki Yoshizoe
  - RIKEN, Center for Advanced Intelligence Project, Tokyo, Japan
- Kei Terayama
  - Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa, Japan
- Koji Tsuda
  - Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa, Japan
  - RIKEN, Center for Advanced Intelligence Project, Tokyo, Japan
  - National Institute for Materials Science, Tsukuba, Japan
12
Abstract
Background: Artificially synthesized RNA molecules provide important ways of creating a variety of novel functional molecules. State-of-the-art RNA inverse folding algorithms can design simple and short RNA sequences of specific GC content that fold into the target RNA structure. However, their performance is not satisfactory in complicated cases.
Result: We present a new inverse folding algorithm called MCTS-RNA, which uses Monte Carlo tree search (MCTS), a technique that has recently shown exceptional performance in computer Go, to represent and discover the essential part of the sequence space. To obtain high accuracy, initial sequences generated by MCTS are further improved by a series of local updates. Our algorithm can control the GC content precisely and can deal with pseudoknot structures. Using common benchmark datasets for evaluation, MCTS-RNA showed a lot of promise as a standard method of RNA inverse folding.
Conclusion: MCTS-RNA is available at https://github.com/tsudalab/MCTS-RNA.
Electronic supplementary material: The online version of this article (doi:10.1186/s12859-017-1882-7) contains supplementary material, which is available to authorized users.
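The GC content that this abstract refers to is simply the fraction of G and C nucleotides in a sequence. The sketch below shows that calculation plus a tolerance check of the kind a design loop might apply; the function names and the tolerance value are illustrative assumptions and are not taken from MCTS-RNA.

```python
def gc_content(sequence):
    """Return the fraction of G and C nucleotides in an RNA sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return (seq.count("G") + seq.count("C")) / len(seq)


def meets_gc_target(sequence, target, tolerance=0.01):
    """Check whether the GC content is within a tolerance of the target.

    The default tolerance is a hypothetical choice for illustration.
    """
    return abs(gc_content(sequence) - target) <= tolerance


print(gc_content("GCAU"))
print(meets_gc_target("GGCCAAUU", target=0.5))
```

A design loop could reject or penalize candidate sequences for which `meets_gc_target` is false, alongside checking that the sequence folds into the target structure.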
Affiliation(s)
- Xiufeng Yang
  - Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa, 277-8561, Japan
- Kazuki Yoshizoe
  - RIKEN Center for Advanced Intelligence Project, 1-4-1 Nihombashi Chuo-ku, Tokyo, 103-0027, Japan
- Akito Taneda
  - Graduate School of Science and Technology, Hirosaki University, 3 Bunkyo-cho, Hirosaki, 036-8561, Japan
- Koji Tsuda
  - Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa, 277-8561, Japan
  - Center for Materials Research by Information Integration, National Institute for Materials Science, 1-2-1 Sengen, Tsukuba, 305-0047, Japan
  - RIKEN Center for Advanced Intelligence Project, 1-4-1 Nihombashi Chuo-ku, Tokyo, 103-0027, Japan
13
M. Dieb T, Ju S, Yoshizoe K, Hou Z, Shiomi J, Tsuda K. MDTS: automatic complex materials design using Monte Carlo tree search. Sci Technol Adv Mater 2017; 18:498-503. [PMID: 28804525 PMCID: PMC5532970 DOI: 10.1080/14686996.2017.1344083] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Received: 04/25/2017] [Revised: 05/29/2017] [Accepted: 06/15/2017] [Indexed: 05/26/2023]
Abstract
Complex materials design is often represented as a black-box combinatorial optimization problem. In this paper, we present a novel Python library called MDTS (Materials Design using Tree Search). Our algorithm employs a Monte Carlo tree search approach, which has shown exceptional performance in the game of computer Go. Unlike evolutionary algorithms that require user intervention to set parameters appropriately, MDTS has no tuning parameters and works autonomously in various problems. In comparison to a Bayesian optimization package, our algorithm showed competitive search efficiency and superior scalability. We succeeded in designing large silicon-germanium (Si-Ge) alloy structures that Bayesian optimization could not deal with due to excessive computational cost. MDTS is available at https://github.com/tsudalab/MDTS.
Affiliation(s)
- Thaer M. Dieb
  - National Institute for Materials Science, Tsukuba, Japan
  - Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa, Japan
- Shenghong Ju
  - Department of Mechanical Engineering, The University of Tokyo, Tokyo, Japan
- Kazuki Yoshizoe
  - RIKEN, Center for Advanced Intelligence Project, Tokyo, Japan
- Zhufeng Hou
  - National Institute for Materials Science, Tsukuba, Japan
- Junichiro Shiomi
  - National Institute for Materials Science, Tsukuba, Japan
  - Department of Mechanical Engineering, The University of Tokyo, Tokyo, Japan
- Koji Tsuda
  - National Institute for Materials Science, Tsukuba, Japan
  - Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa, Japan
  - RIKEN, Center for Advanced Intelligence Project, Tokyo, Japan
14
Abstract
Despite many debates in the first half of the twentieth century, it is now largely a truism that humans and other animals build models of their environments and use them for prediction and control. However, model-based (MB) reasoning presents severe computational challenges. Alternative, computationally simpler, model-free (MF) schemes have been suggested in the reinforcement learning literature, and have afforded influential accounts of behavioural and neural data. Here, we study the realization of MB calculations, and the ways that this might be woven together with MF values and evaluation methods. There are as yet mostly only hints in the literature as to the resulting tapestry, so we offer more preview than review.
Affiliation(s)
- Nathaniel D Daw
  - Department of Psychology and Center for Neural Science, New York University, 4 Washington Place Suite 888, New York, NY 10003, USA
- Peter Dayan
  - Gatsby Computational Neuroscience Unit, UCL, 17 Queen Square, London WC1N 3AR, UK