1
Li J, Zhou Y, Chen SJ. Embracing exascale computing in nucleic acid simulations. Curr Opin Struct Biol 2024; 87:102847. [PMID: 38815519 DOI: 10.1016/j.sbi.2024.102847] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/17/2023] [Revised: 04/18/2024] [Accepted: 05/09/2024] [Indexed: 06/01/2024]
Abstract
This mini-review reports recent advances in biomolecular simulations, particularly for nucleic acids, and discusses the potential effects of emerging exascale computing on nucleic acid simulations, emphasizing the need for advanced computational strategies to fully exploit this technological frontier. Specifically, we introduce recent breakthroughs in computer architectures for large-scale biomolecular simulations and review the simulation protocols for nucleic acids regarding force fields, enhanced sampling methods, coarse-grained models, and interactions with ligands. We also explore the integration of machine learning methods into simulations, which promises to significantly enhance the predictive modeling of biomolecules and the analysis of complex data generated by exascale simulations. Finally, we discuss the challenges and perspectives for biomolecular simulations as we enter the dawning exascale computing era.
Affiliation(s)
- Jun Li
- Department of Physics, Department of Biochemistry and Institute for Data Science and Informatics, University of Missouri, 223 Physics Bldg., Columbia, 65211, MO, USA
- Yuanzhe Zhou
- Department of Physics, Department of Biochemistry and Institute for Data Science and Informatics, University of Missouri, 223 Physics Bldg., Columbia, 65211, MO, USA
- Shi-Jie Chen
- Department of Physics, Department of Biochemistry and Institute for Data Science and Informatics, University of Missouri, 223 Physics Bldg., Columbia, 65211, MO, USA.
2
Cheng F, Wang F, Tang J, Zhou Y, Fu Z, Zhang P, Haines JL, Leverenz JB, Gan L, Hu J, Rosen-Zvi M, Pieper AA, Cummings J. Artificial intelligence and open science in discovery of disease-modifying medicines for Alzheimer's disease. Cell Rep Med 2024; 5:101379. [PMID: 38382465 PMCID: PMC10897520 DOI: 10.1016/j.xcrm.2023.101379] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/23/2022] [Revised: 08/15/2023] [Accepted: 12/19/2023] [Indexed: 02/23/2024]
Abstract
The high failure rate of clinical trials in Alzheimer's disease (AD) and AD-related dementia (ADRD) is due to a lack of understanding of the pathophysiology of disease, and this deficit may be addressed by applying artificial intelligence (AI) to "big data" to rapidly and effectively expand therapeutic development efforts. Recent accelerations in computing power and availability of big data, including electronic health records and multi-omics profiles, have converged to provide opportunities for scientific discovery and treatment development. Here, we review the potential utility of applying AI approaches to big data for discovery of disease-modifying medicines for AD/ADRD. We illustrate how AI tools can be applied to the AD/ADRD drug development pipeline through collaborative efforts among neurologists, gerontologists, geneticists, pharmacologists, medicinal chemists, and computational scientists. AI and open data science expedite drug discovery and development of disease-modifying therapeutics for AD/ADRD and other neurodegenerative diseases.
Affiliation(s)
- Feixiong Cheng
- Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, OH 44195, USA; Cleveland Clinic Genome Center, Lerner Research Institute, Cleveland Clinic, Cleveland, OH 44195, USA; Department of Molecular Medicine, Cleveland Clinic Lerner College of Medicine, Case Western Reserve University, Cleveland, OH 44195, USA.
- Fei Wang
- Department of Population Health Sciences, Weill Cornell Medical College, Cornell University, New York, NY 10065, USA
- Jian Tang
- Mila-Quebec Institute for Learning Algorithms and CIFAR AI Research Chair, HEC Montreal, Montréal, QC H3T 2A7, Canada
- Yadi Zhou
- Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, OH 44195, USA
- Zhimin Fu
- Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, OH 44195, USA; College of Pharmacy, Northeast Ohio Medical University, Rootstown, OH 44272, USA
- Pengyue Zhang
- Department of Biostatistics and Health Data Science, Indiana University, Indianapolis, IN 46037, USA
- Jonathan L Haines
- Cleveland Institute for Computational Biology, and Department of Population & Quantitative Health Sciences, Case Western Reserve University, Cleveland, OH 44106, USA
- James B Leverenz
- Lou Ruvo Center for Brain Health, Neurological Institute, Cleveland Clinic, Cleveland, OH 44195, USA
- Li Gan
- Helen and Robert Appel Alzheimer's Disease Research Institute, Brain and Mind Research Institute, Weill Cornell Medicine, New York, NY 10021, USA
- Jianying Hu
- IBM Research, Yorktown Heights, New York, NY 10598, USA
- Michal Rosen-Zvi
- AI for Accelerated Healthcare and Life Sciences Discovery, IBM Research Labs, Haifa 3498825, Israel; Faculty of Medicine, The Hebrew University of Jerusalem, Jerusalem 9190500, Israel
- Andrew A Pieper
- Brain Health Medicines Center, Harrington Discovery Institute, University Hospitals Cleveland Medical Center, Cleveland, OH, 44106, USA; Department of Psychiatry, Case Western Reserve University, Cleveland, OH 44106, USA; Geriatric Psychiatry, GRECC, Louis Stokes Cleveland VA Medical Center, Cleveland, OH 44106, USA; Institute for Transformative Molecular Medicine, School of Medicine, Case Western Reserve University, Cleveland OH 44106, USA; Department of Pathology, Case Western Reserve University, School of Medicine, Cleveland, OH, 44106, USA; Department of Neurosciences, Case Western Reserve University, School of Medicine, Cleveland, OH 44106, USA
- Jeffrey Cummings
- Chambers-Grundy Center for Transformative Neuroscience, Department of Brain Health, School of Integrated Health Sciences, UNLV, Las Vegas, NV 89154, USA
3
Jin Z, Wei Z. Molecular simulation for food protein-ligand interactions: A comprehensive review on principles, current applications, and emerging trends. Compr Rev Food Sci Food Saf 2024; 23:e13280. [PMID: 38284571 DOI: 10.1111/1541-4337.13280] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/07/2023] [Revised: 11/19/2023] [Accepted: 11/22/2023] [Indexed: 01/30/2024]
Abstract
In recent years, investigations on molecular interaction mechanisms between food proteins and ligands have attracted much interest. The interaction mechanisms can supply much useful information for many fields in the food industry, including nutrient delivery, food processing, auxiliary detection, and others. Molecular simulation has offered extraordinary insights into the interaction mechanisms. It can reveal binding conformation, interaction forces, binding affinity, key residues, and other information that physicochemical experiments cannot reveal in a fast and detailed manner. The simulation results have proven to be consistent with the results of physicochemical experiments. Molecular simulation holds great potential for future applications in the field of food protein-ligand interactions. This review elaborates on the principles of molecular docking and molecular dynamics simulation. Their applications in food protein-ligand interactions are then summarized. Furthermore, challenges, perspectives, and trends in molecular simulation of food protein-ligand interactions are proposed. Based on the results of molecular simulation, the mechanisms of interfacial behavior, enzyme-substrate binding, and structural changes during food processing can be revealed, and strategies for hazardous substance detection and food flavor adjustment can be generated. Moreover, molecular simulation can accelerate food development and reduce animal experiments. However, there are still several challenges to applying molecular simulation to food protein-ligand interaction research. The future trends will be a combination of international cooperation and data sharing, quantum mechanics/molecular mechanics, advanced computational techniques, and machine learning, which together will advance food protein-ligand interaction simulation. Overall, the use of molecular simulation to study food protein-ligand interactions holds considerable promise.
Affiliation(s)
- Zihan Jin
- State Key Laboratory of Marine Food Processing & Safety Control, College of Food Science and Engineering, Ocean University of China, Qingdao, China
- Zihao Wei
- State Key Laboratory of Marine Food Processing & Safety Control, College of Food Science and Engineering, Ocean University of China, Qingdao, China
4
Hsiung SY, Deng SX, Li J, Huang SY, Liaw CK, Huang SY, Wang CC, Hsieh YSY. Machine learning-based monosaccharide profiling for tissue-specific classification of Wolfiporia extensa samples. Carbohydr Polym 2023; 322:121338. [PMID: 37839831 DOI: 10.1016/j.carbpol.2023.121338] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2023] [Revised: 08/09/2023] [Accepted: 08/26/2023] [Indexed: 10/17/2023]
Abstract
Machine learning (ML) has been used for many clinical decision-making processes and diagnostic procedures in bioinformatics applications. We examined eight algorithms, including linear discriminant analysis (LDA), logistic regression (LR), k-nearest neighbor (KNN), random forest (RF), gradient boosting machine (GBM), support vector machine (SVM), Naïve Bayes classifier (NB), and artificial neural network (ANN) models, to evaluate their classification and prediction capabilities for four tissue types in Wolfiporia extensa using their monosaccharide composition profiles. All eight ML-based models were assessed as exemplary models with AUC exceeding 0.8. Five models, namely LDA, KNN, RF, GBM, and ANN, performed excellently in the four-tissue-type classification (AUC > 0.9). Additionally, all eight models were evaluated as good predictive models with AUC > 0.8 in the three-tissue-type classification. Notably, all eight ML-based methods outperformed the single linear discriminant analysis (LDA) plotting method. For large sample sizes, the ML-based methods perform better than traditional regression techniques and could potentially increase the accuracy in identifying tissue samples of W. extensa.
Affiliation(s)
- Shih-Yi Hsiung
- School of Pharmacy, College of Pharmacy, Taipei Medical University, Taipei, Taiwan
- Shun-Xin Deng
- School of Pharmacy, College of Pharmacy, Taipei Medical University, Taipei, Taiwan; Graduate Institute of Pharmacognosy, Taipei Medical University, Taipei, Taiwan
- Jing Li
- College of Life Science, Shanghai Normal University, Shanghai, China
- Sheng-Yao Huang
- Institute of Statistical Science, Academia Sinica, Taipei, Taiwan
- Chen-Kun Liaw
- Department of Orthopedics, School of Medicine, College of Medicine, Taipei Medical University, Taipei, Taiwan
- Su-Yun Huang
- Institute of Statistical Science, Academia Sinica, Taipei, Taiwan
- Ching-Chiung Wang
- School of Pharmacy, College of Pharmacy, Taipei Medical University, Taipei, Taiwan; Graduate Institute of Pharmacognosy, Taipei Medical University, Taipei, Taiwan
- Yves S Y Hsieh
- School of Pharmacy, College of Pharmacy, Taipei Medical University, Taipei, Taiwan; Graduate Institute of Pharmacognosy, Taipei Medical University, Taipei, Taiwan; Division of Glycoscience, Department of Chemistry, School of Engineering Sciences in Chemistry, Biotechnology and Health, KTH Royal Institute of Technology, AlbaNova University Centre, Stockholm SE106 91, Sweden.
5
Raghavan B, Paulikat M, Ahmad K, Callea L, Rizzi A, Ippoliti E, Mandelli D, Bonati L, De Vivo M, Carloni P. Drug Design in the Exascale Era: A Perspective from Massively Parallel QM/MM Simulations. J Chem Inf Model 2023. [PMID: 37319347 DOI: 10.1021/acs.jcim.3c00557] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/17/2023]
Abstract
The initial phases of drug discovery - in silico drug design - could benefit from first principle Quantum Mechanics/Molecular Mechanics (QM/MM) molecular dynamics (MD) simulations in explicit solvent, yet many applications are currently limited by the short time scales that this approach can cover. Developing scalable first principle QM/MM MD interfaces fully exploiting current exascale machines - so far an unmet and crucial goal - will help overcome this problem, opening the way to the study of the thermodynamics and kinetics of ligand binding to protein with first principle accuracy. Here, taking two relevant case studies involving the interactions of ligands with rather large enzymes, we showcase the use of our recently developed massively scalable Multiscale Modeling in Computational Chemistry (MiMiC) QM/MM framework (currently using DFT to describe the QM region) to investigate reactions and ligand binding in enzymes of pharmacological relevance. We also demonstrate for the first time strong scaling of MiMiC-QM/MM MD simulations with parallel efficiency of ∼70% up to >80,000 cores. Thus, among many others, the MiMiC interface represents a promising candidate toward exascale applications by combining machine learning with statistical mechanics based algorithms tailored for exascale supercomputers.
Affiliation(s)
- Bharath Raghavan
- Computational Biomedicine, Institute of Advanced Simulations IAS-5/Institute for Neuroscience and Medicine INM-9, Forschungszentrum Jülich GmbH, Jülich 52428, Germany
- Department of Physics, RWTH Aachen University, Aachen 52074, Germany
- Mirko Paulikat
- Computational Biomedicine, Institute of Advanced Simulations IAS-5/Institute for Neuroscience and Medicine INM-9, Forschungszentrum Jülich GmbH, Jülich 52428, Germany
- Katya Ahmad
- Computational Biomedicine, Institute of Advanced Simulations IAS-5/Institute for Neuroscience and Medicine INM-9, Forschungszentrum Jülich GmbH, Jülich 52428, Germany
- Lara Callea
- Department of Earth and Environmental Sciences, University of Milano-Bicocca, Piazza della Scienza 1, 20126 Milan, Italy
- Andrea Rizzi
- Computational Biomedicine, Institute of Advanced Simulations IAS-5/Institute for Neuroscience and Medicine INM-9, Forschungszentrum Jülich GmbH, Jülich 52428, Germany
- Atomistic Simulations, Italian Institute of Technology, Genova 16163, Italy
- Emiliano Ippoliti
- Computational Biomedicine, Institute of Advanced Simulations IAS-5/Institute for Neuroscience and Medicine INM-9, Forschungszentrum Jülich GmbH, Jülich 52428, Germany
- Davide Mandelli
- Computational Biomedicine, Institute of Advanced Simulations IAS-5/Institute for Neuroscience and Medicine INM-9, Forschungszentrum Jülich GmbH, Jülich 52428, Germany
- Laura Bonati
- Department of Earth and Environmental Sciences, University of Milano-Bicocca, Piazza della Scienza 1, 20126 Milan, Italy
- Marco De Vivo
- Molecular Modelling and Drug Discovery, Italian Institute of Technology, Genova 16163, Italy
- Paolo Carloni
- Computational Biomedicine, Institute of Advanced Simulations IAS-5/Institute for Neuroscience and Medicine INM-9, Forschungszentrum Jülich GmbH, Jülich 52428, Germany
- Department of Physics and Universitätsklinikum, RWTH Aachen University, Aachen 52074, Germany
6
Conformational ensemble of the NSP1 CTD in SARS-CoV-2: Perspectives from the free energy landscape. Biophys J 2023:S0006-3495(23)00102-9. [PMID: 36793215 PMCID: PMC9928668 DOI: 10.1016/j.bpj.2023.02.010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/21/2022] [Revised: 01/13/2023] [Accepted: 02/10/2023] [Indexed: 02/17/2023]
Abstract
The nonstructural protein-1 (NSP1) of the severe acute respiratory syndrome-associated coronavirus 2 plays a crucial role in the translational shutdown and immune evasion inside host cells. Despite its known intrinsic disorder, the C-terminal domain (CTD) of NSP1 has been reported to form a double α-helical structure and block the 40S-ribosomal channel for mRNA translation. Experimental studies indicate that NSP1 CTD functions independently from the globular N-terminal region separated with a long linker domain, underscoring the necessity of exploring the standalone conformational ensemble. In this contribution, we utilize exascale computing resources to yield unbiased molecular dynamics simulation of NSP1 CTD in all-atom resolution starting from multiple initial seed structures. A data-driven approach elicits collective variables (CVs) that are significantly superior to conventional descriptors in capturing the conformational heterogeneity. The free energy landscape as a function of the CV space is estimated using the modified expectation maximized molecular dynamics. Originally developed by us for small peptides, here, we establish the efficacy of expectation maximized molecular dynamics in conjunction with data-driven CV space for a more complex and relevant biomolecular system. The results reveal the existence of two disordered metastable populations in the free energy landscape that are separated from the conformation resembling ribosomal subunit bound state by high kinetic barriers. Chemical shift correlation and secondary structure analysis capture significant differences among key structures of the ensemble. Altogether, these insights can underpin drug development studies and mutational experiments that help induce population shifts to alter translational blocking and understand its molecular basis in further detail.
7
Banerjee A, Saha S, Tvedt NC, Yang LW, Bahar I. Mutually beneficial confluence of structure-based modeling of protein dynamics and machine learning methods. Curr Opin Struct Biol 2023; 78:102517. [PMID: 36587424 PMCID: PMC10038760 DOI: 10.1016/j.sbi.2022.102517] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Received: 08/18/2022] [Revised: 11/19/2022] [Accepted: 11/22/2022] [Indexed: 12/31/2022]
Abstract
Proteins sample an ensemble of conformers under physiological conditions, having access to a spectrum of modes of motion, also called intrinsic dynamics. These motions ensure the adaptation to various interactions in the cell, and largely assist in, if not determine, viable mechanisms of biological function. In recent years, machine learning frameworks have proven uniquely useful in structural biology, and recent studies further provide evidence for the utility and/or necessity of considering intrinsic dynamics for increasing their predictive ability. Efficient quantification of dynamics-based attributes by recently developed physics-based theories and models such as elastic network models provides a unique opportunity to generate data on dynamics for training ML models towards inferring mechanisms of protein function, assessing pathogenicity, or estimating binding affinities.
Affiliation(s)
- Anupam Banerjee
- Computational and Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh PA 15261, USA
- Satyaki Saha
- Computational and Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh PA 15261, USA
- Nathan C Tvedt
- Computational and Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh PA 15261, USA; Computational and Applied Mathematics and Statistics, The College of William and Mary, Williamsburg, VA 23185, USA
- Lee-Wei Yang
- Institute of Bioinformatics and Structural Biology, and PhD Program in Biomedical Artificial Intelligence, National Tsing Hua University, Hsinchu 300044, Taiwan; Physics Division, National Center for Theoretical Sciences, Taipei 106319, Taiwan
- Ivet Bahar
- Computational and Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh PA 15261, USA.
8
Abstract
AlphaFold has burst into our lives. A powerful algorithm that underscores the strength of biological sequence data and artificial intelligence (AI). AlphaFold has appended projects and research directions. The database it has been creating promises an untold number of applications with vast potential impacts that are still difficult to surmise. AI approaches can revolutionize personalized treatments and usher in better-informed clinical trials. They promise to make giant leaps toward reshaping and revamping drug discovery strategies, selecting and prioritizing combinations of drug targets. Here, we briefly overview AI in structural biology, including in molecular dynamics simulations and prediction of microbiota–human protein–protein interactions. We highlight the advancements accomplished by the deep-learning-powered AlphaFold in protein structure prediction and their powerful impact on the life sciences. At the same time, AlphaFold does not resolve the decades-long protein folding challenge, nor does it identify the folding pathways. The models that AlphaFold provides do not capture conformational mechanisms like frustration and allostery, which are rooted in ensembles, and controlled by their dynamic distributions. Allostery and signaling are properties of populations. AlphaFold also does not generate ensembles of intrinsically disordered proteins and regions, instead describing them by their low structural probabilities. Since AlphaFold generates single ranked structures, rather than conformational ensembles, it cannot elucidate the mechanisms of allosteric activating driver hotspot mutations nor of allosteric drug resistance. However, by capturing key features, deep learning techniques can use the single predicted conformation as the basis for generating a diverse ensemble.
Affiliation(s)
- Ruth Nussinov
- Computational Structural Biology Section, Frederick National Laboratory for Cancer Research, Frederick, Maryland 21702, United States; Department of Human Molecular Genetics and Biochemistry, Sackler School of Medicine, Tel Aviv University, Tel Aviv 69978, Israel
- Mingzhen Zhang
- Computational Structural Biology Section, Frederick National Laboratory for Cancer Research, Frederick, Maryland 21702, United States
- Yonglan Liu
- Cancer Innovation Laboratory, National Cancer Institute, Frederick, Maryland 21702, United States
- Hyunbum Jang
- Computational Structural Biology Section, Frederick National Laboratory for Cancer Research, Frederick, Maryland 21702, United States
9
Cheng F, Tuncbag N. Editorial overview: Artificial intelligence (AI) methodologies in structural biology. Curr Opin Struct Biol 2022; 74:102387. [PMID: 35589509 DOI: 10.1016/j.sbi.2022.102387] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 12/30/2022]
Affiliation(s)
- Feixiong Cheng
- Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, OH 44195, USA; Department of Molecular Medicine, Cleveland Clinic Lerner College of Medicine, Case Western Reserve University, Cleveland, OH 44195, USA; Case Comprehensive Cancer Center, Case Western Reserve University School of Medicine, Cleveland, OH 44106, USA.
- Nurcan Tuncbag
- Department of Chemical and Biological Engineering, College of Engineering, Koc University, Istanbul, 34450, Turkey; School of Medicine, Koc University, Istanbul, 34450, Turkey.