1
|
Singh S, Gapsys V, Aldeghi M, Schaller D, Rangwala AM, White JB, Bluck JP, Scheen J, Glass WG, Guo J, Hayat S, de Groot BL, Volkamer A, Christ CD, Seeliger MA, Chodera JD. Prospective Evaluation of Structure-Based Simulations Reveal Their Ability to Predict the Impact of Kinase Mutations on Inhibitor Binding. J Phys Chem B 2025; 129:2882-2902. [PMID: 40053698 PMCID: PMC12038917 DOI: 10.1021/acs.jpcb.4c07794] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/09/2025]
Abstract
Small molecule kinase inhibitors are critical in the modern treatment of cancers, evidenced by the existence of over 80 FDA-approved small-molecule kinase inhibitors. Unfortunately, intrinsic or acquired resistance, often causing therapy discontinuation, is frequently caused by mutations in the kinase therapeutic target. The advent of clinical tumor sequencing has opened additional opportunities for precision oncology to improve patient outcomes by pairing optimal therapies with tumor mutation profiles. However, modern precision oncology efforts are hindered by lack of sufficient biochemical or clinical evidence to classify each mutation as resistant or sensitive to existing inhibitors. Structure-based methods show promising accuracy in retrospective benchmarks at predicting whether a kinase mutation will perturb inhibitor binding, but comparisons are made by pooling disparate experimental measurements across different conditions. We present the first prospective benchmark of structure-based approaches on a blinded dataset of in-cell kinase inhibitor affinities to Abl kinase mutants using a NanoBRET reporter assay. We compare NanoBRET results to structure-based methods and their ability to estimate the impact of mutations on inhibitor binding (measured as ΔΔG). Comparing physics-based simulations, Rosetta, and previous machine learning models, we find that structure-based methods accurately classify kinase mutations as inhibitor-resistant or inhibitor-sensitizing, and each approach has a similar degree of accuracy. We show that physics-based simulations are best suited to estimate ΔΔG of mutations that are distal to the kinase active site. To probe modes of failure, we retrospectively investigate two clinically significant mutations poorly predicted by our methods, T315A and L298F, and find that starting configurations and protonation states significantly alter the accuracy of our predictions. 
Our experimental and computational measurements provide a benchmark for estimating the impact of mutations on inhibitor binding affinity for future methods and structure-based models. These structure-based methods have potential utility in identifying optimal therapies for tumor-specific mutations, predicting resistance mutations in the absence of clinical data, and identifying potential sensitizing mutations to established inhibitors.
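The ΔΔG quantity named in the abstract can be illustrated with a minimal sketch. It assumes that affinities are available as dissociation constants (or a comparable affinity measure) for the wild-type and mutant kinase and uses the standard relation ΔΔG = RT ln(K_mut / K_wt). The function names and the 1.36 kcal/mol cutoff (roughly a 10-fold affinity change near room temperature) are hypothetical choices for illustration, not values taken from the paper.

```python
import math

# Gas constant in kcal/(mol*K).
R_KCAL = 1.987204e-3


def ddg_from_affinities(k_mutant, k_wildtype, temperature_k=298.15):
    """Return ddG = RT * ln(K_mut / K_wt) in kcal/mol.

    A positive value means the inhibitor binds the mutant more weakly
    than the wild type. Both constants must share the same units.
    """
    if k_mutant <= 0 or k_wildtype <= 0:
        raise ValueError("affinity constants must be positive")
    return R_KCAL * temperature_k * math.log(k_mutant / k_wildtype)


def classify_mutation(ddg, threshold=1.36):
    """Label a mutation from its ddG.

    The threshold is an illustrative assumption (about a 10-fold change
    in affinity at 298 K), not the cutoff used by the authors.
    """
    if ddg > threshold:
        return "resistant"
    if ddg < -threshold:
        return "sensitizing"
    return "neutral"


if __name__ == "__main__":
    # Hypothetical example: the mutant binds 10-fold more weakly.
    ddg = ddg_from_affinities(k_mutant=100.0, k_wildtype=10.0)
    print(f"ddG = {ddg:.2f} kcal/mol -> {classify_mutation(ddg)}")
```

Because only the ratio of the two constants enters the formula, any consistent unit (nM, µM) works, which is one reason ΔΔG is a convenient common scale for comparing experiment and prediction.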
Affiliation(s)
- Sukrit Singh
  - Computational and Systems Biology Program, Memorial Sloan Kettering Cancer Center, New York, NY, 10065, USA
- Vytautas Gapsys
  - Computational Chemistry, Janssen Research & Development, Turnhoutseweg 30, Beerse 2340, Belgium
- Matteo Aldeghi
  - Computational Biomolecular Dynamics Group, Department of Theoretical and Computational Biophysics, Max Planck Institute for Multidisciplinary Sciences, D-37077 Göttingen, Germany
- David Schaller
  - Computational and Systems Biology Program, Memorial Sloan Kettering Cancer Center, New York, NY, 10065, USA
  - In Silico Toxicology and Structural Bioinformatics, Institute of Physiology, Charité – Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Augustenburger Platz 1, 13353 Berlin, Germany
- Aziz M. Rangwala
  - Department of Pharmacological Sciences, Stony Brook University Medical School, Stony Brook, NY 11794, United States
- Jessica B. White
  - Tri-Institutional PhD Program in Computational Biology and Medicine, Weill Cornell Graduate School of Medical Sciences, Cornell University, New York, NY 10065, United States
- Joseph P. Bluck
  - Structural Biology & Computational Design, Research and Development, Pharmaceuticals, Bayer AG, 13342 Berlin, Germany
- Jenke Scheen
  - Open Molecular Software Foundation, Davis, CA 95618, USA
- William G. Glass
  - Computational and Systems Biology Program, Memorial Sloan Kettering Cancer Center, New York, NY, 10065, USA
- Jiaye Guo
  - Computational and Systems Biology Program, Memorial Sloan Kettering Cancer Center, New York, NY, 10065, USA
- Sikander Hayat
  - Department of Medicine II, University Hospital Aachen, Pauwelsstraße 30, 52074 Aachen, Germany
- Bert L. de Groot
  - Computational Biomolecular Dynamics Group, Department of Theoretical and Computational Biophysics, Max Planck Institute for Multidisciplinary Sciences, D-37077 Göttingen, Germany
- Andrea Volkamer
  - In Silico Toxicology and Structural Bioinformatics, Institute of Physiology, Charité – Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Augustenburger Platz 1, 13353 Berlin, Germany
  - Data Driven Drug Design, Faculty of Mathematics and Computer Sciences, Saarland University, 66123 Saarbrücken, Germany
- Clara D. Christ
  - Structural Biology & Computational Design, Research and Development, Pharmaceuticals, Bayer AG, 13342 Berlin, Germany
- Markus A. Seeliger
  - Department of Pharmacological Sciences, Stony Brook University Medical School, Stony Brook, NY 11794, United States
- John D. Chodera
  - Computational and Systems Biology Program, Memorial Sloan Kettering Cancer Center, New York, NY, 10065, USA
|
2
|
Singh S, Gapsys V, Aldeghi M, Schaller D, Rangwala AM, White JB, Bluck JP, Scheen J, Glass WG, Guo J, Hayat S, de Groot BL, Volkamer A, Christ CD, Seeliger MA, Chodera JD. Prospective evaluation of structure-based simulations reveal their ability to predict the impact of kinase mutations on inhibitor binding. bioRxiv 2025:2024.11.15.623861. [PMID: 40060600 PMCID: PMC11888192 DOI: 10.1101/2024.11.15.623861] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 03/16/2025]
|
3
|
Deng W, Niu X, He P, Yan Q, Liang H, Wang Y, Ning L, Lin Z, Zhang Y, Zhao X, Feng L, Qu L, Chen L. An allelic atlas of immunoglobulin heavy chain variable regions reveals antibody binding epitope preference resilient to SARS-CoV-2 mutation escape. Front Immunol 2025; 15:1471396. [PMID: 39840032 PMCID: PMC11746035 DOI: 10.3389/fimmu.2024.1471396] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/27/2024] [Accepted: 12/04/2024] [Indexed: 01/23/2025] Open
Abstract
Background: Although immunoglobulin (Ig) alleles play a pivotal role in the antibody response to pathogens, research to understand their role in the humoral immune response is still limited. Methods: We retrieved the germline sequences for the IGHV from the IMGT database to illustrate the amino acid polymorphism present within germline sequences of IGHV genes. We assembled the sequences of the IgM and IgD repertoire from 130 people to investigate the genetic variations in the population. A dataset comprising 10,643 SARS-CoV-2 spike-specific antibodies, obtained from COV-AbDab, was compiled to assess the impact of SARS-CoV-2 infection on allelic gene utilization. Binding affinity and neutralizing activity were determined using bio-layer interferometry and pseudovirus neutralization assays. Primary docking was performed using ZDOCK (3.0.2) to generate the initial conformation of the antigen-antibody complex, followed by simulations of the complete conformations using Rosetta SnugDock software. The original and simulated structural conformations were visualized and presented using ChimeraX (v1.5). Results: We present an allelic atlas of immunoglobulin heavy chain (IgH) variable regions, illustrating the diversity of allelic variants across 33 IGHV family germline sequences by sequencing the IgH repertoire in the population. Our comprehensive analysis of SARS-CoV-2 spike-specific antibodies revealed the preferential use of specific Ig alleles among these antibodies. We observed an association between Ig alleles and antibody binding epitopes. Different allelic genotypes binding to the same RBD epitope on the spike show different neutralizing potency and breadth. We found that antibodies carrying the IGHV1-69*02 allele tended to bind to the RBD E2.2 epitope. The antibodies carrying G50 and L55 amino acid residues exhibit potential enhancements in binding affinity and neutralizing potency to SARS-CoV-2 variants containing the L452R mutation on RBD, whereas R50 and F55 amino acid residues tend to have reduced binding affinity and neutralizing potency. IGHV2-5*02 antibodies using the D56 allele bind to the RBD D2 epitope with greater binding and neutralizing potency due to the interaction between D56 on HCDR2 and K444 on RBD of most Omicron subvariants. In contrast, IGHV2-5*01 antibodies using the N56 allele show increased binding resistance to the K444T mutation on RBD. Discussion: This study provides valuable insights into humoral immune responses from the perspective of Ig alleles and population genetics. These findings underscore the importance of Ig alleles in vaccine design and therapeutic antibody development.
Affiliation(s)
- Weiqi Deng
  - State Key Laboratory of Respiratory Disease, Guangdong Laboratory of Computational Biomedicine, Center for Cell Lineage Research, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou, China
  - University of Chinese Academy of Science, Beijing, China
- Xuefeng Niu
  - State Key Laboratory of Respiratory Disease, Guangzhou Institute of Respiratory Health, the First Affiliated Hospital of Guangzhou Medical University, Guangzhou, China
- Ping He
  - State Key Laboratory of Respiratory Disease, Guangdong Laboratory of Computational Biomedicine, Center for Cell Lineage Research, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou, China
  - Guangzhou National Laboratory, Guangzhou, China
- Qihong Yan
  - State Key Laboratory of Respiratory Disease, Guangdong Laboratory of Computational Biomedicine, Center for Cell Lineage Research, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou, China
  - State Key Laboratory of Respiratory Disease, Guangzhou Institute of Respiratory Health, the First Affiliated Hospital of Guangzhou Medical University, Guangzhou, China
- Huan Liang
  - State Key Laboratory of Respiratory Disease, Guangzhou Institute of Respiratory Health, the First Affiliated Hospital of Guangzhou Medical University, Guangzhou, China
- Yongping Wang
  - State Key Laboratory of Respiratory Disease, Guangdong Laboratory of Computational Biomedicine, Center for Cell Lineage Research, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou, China
  - University of Chinese Academy of Science, Beijing, China
- Lishan Ning
  - State Key Laboratory of Respiratory Disease, Guangdong Laboratory of Computational Biomedicine, Center for Cell Lineage Research, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou, China
  - University of Chinese Academy of Science, Beijing, China
- Zihan Lin
  - State Key Laboratory of Respiratory Disease, Guangdong Laboratory of Computational Biomedicine, Center for Cell Lineage Research, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou, China
  - University of Chinese Academy of Science, Beijing, China
- Yudi Zhang
  - State Key Laboratory of Respiratory Disease, Guangdong Laboratory of Computational Biomedicine, Center for Cell Lineage Research, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou, China
- Xinwei Zhao
  - State Key Laboratory of Respiratory Disease, Guangdong Laboratory of Computational Biomedicine, Center for Cell Lineage Research, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou, China
  - Guangzhou National Laboratory, Guangzhou, China
- Liqiang Feng
  - State Key Laboratory of Respiratory Disease, Guangdong Laboratory of Computational Biomedicine, Center for Cell Lineage Research, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou, China
- Linbing Qu
  - State Key Laboratory of Respiratory Disease, Guangdong Laboratory of Computational Biomedicine, Center for Cell Lineage Research, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou, China
- Ling Chen
  - State Key Laboratory of Respiratory Disease, Guangdong Laboratory of Computational Biomedicine, Center for Cell Lineage Research, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou, China
  - Guangzhou National Laboratory, Guangzhou, China
|
4
|
Das T, Bhattacharya A, Jha T, Gayen S. Exploration of Fingerprints and Data Mining-based Prediction of Some Bioactive Compounds from Allium sativum as Histone Deacetylase 9 (HDAC9) Inhibitors. Curr Comput Aided Drug Des 2025; 21:270-284. [PMID: 38321909 DOI: 10.2174/0115734099282303240126061624] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/03/2023] [Revised: 01/12/2024] [Accepted: 01/18/2024] [Indexed: 02/08/2024]
Abstract
BACKGROUND: Histone deacetylase 9 (HDAC9) is an important member of the class IIa family of histone deacetylases. It is well established that over-expression of HDAC9 causes various types of cancers including gastric cancer, breast cancer, ovarian cancer, liver cancer, lung cancer, lymphoblastic leukaemia, etc. The important role of HDAC9 is also recognized in the development of bone, cardiac muscles, and innate immunity. Thus, it will be beneficial to find out the important structural attributes of HDAC9 inhibitors for developing selective HDAC9 inhibitors with higher potency. METHODS: Classification QSAR-based methods, namely Bayesian classification and recursive partitioning, were applied to a dataset of HDAC9 inhibitors. The structural features strongly suggested that sulphur-containing compounds can be a good choice for HDAC9 inhibition. For this reason, these models were applied further to screen some natural compounds from Allium sativum. The screened compounds were further assessed for ADME properties and docked in the homology-modelled structure of HDAC9 in order to find important amino acids for the interaction. The best-docked compound was considered for molecular dynamics (MD) simulation study. RESULTS: The classification models identified good and bad fingerprints for HDAC9 inhibition. The screened compounds ajoene, 1,2 vinyl dithiine, diallyl disulphide and diallyl trisulphide were identified as compounds having potent HDAC9 inhibitory activity. The results from the ADME and molecular docking study of these compounds show the binding interaction inside the active site of HDAC9. The best-docked compound, ajoene, shows satisfactory results in terms of different validation parameters of the MD simulation study. CONCLUSION: This in-silico modelling study has identified natural potential lead(s) from Allium sativum. Specifically, ajoene, which showed the best in-silico features, can be considered for further in-vitro and in-vivo investigation to establish it as a potential HDAC9 inhibitor.
Affiliation(s)
- Totan Das
  - Department of Pharmaceutical Technology, Laboratory of Drug Design and Discovery, Jadavpur University, Kolkata, 700032, India
- Arijit Bhattacharya
  - Department of Pharmaceutical Technology, Laboratory of Drug Design and Discovery, Jadavpur University, Kolkata, 700032, India
- Tarun Jha
  - Department of Pharmaceutical Technology, Natural Science Laboratory, Division of Medicinal and Pharmaceutical Chemistry, Jadavpur University, Kolkata, 700032, India
- Shovanlal Gayen
  - Department of Pharmaceutical Technology, Laboratory of Drug Design and Discovery, Jadavpur University, Kolkata, 700032, India
|
5
|
Chiera F, Costa G, Alcaro S, Artese A. An overview on olfaction in the biological, analytical, computational, and machine learning fields. Arch Pharm (Weinheim) 2025; 358:e2400414. [PMID: 39439128 PMCID: PMC11704061 DOI: 10.1002/ardp.202400414] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/24/2024] [Revised: 09/24/2024] [Accepted: 09/26/2024] [Indexed: 10/25/2024]
Abstract
Recently, the comprehension of odor perception has advanced, unveiling the mysteries of the molecular receptors within the nasal passages and the intricate mechanisms governing signal transmission between these receptors, the olfactory bulb, and the brain. This review provides a comprehensive panorama of odors, encompassing various topics ranging from the structural and molecular underpinnings of odorous substances to the physiological intricacies of olfactory perception. It extends to elucidate the analytical methods used for their identification and explores the frontiers of computational methodologies.
Affiliation(s)
- Federica Chiera
  - Dipartimento di Scienze della Salute, Campus “S. Venuta”, Università degli Studi “Magna Græcia” di Catanzaro, Catanzaro, Italy
- Giosuè Costa
  - Dipartimento di Scienze della Salute, Campus “S. Venuta”, Università degli Studi “Magna Græcia” di Catanzaro, Catanzaro, Italy
  - Net4Science S.r.l., Università degli Studi “Magna Græcia” di Catanzaro, Catanzaro, Italy
- Stefano Alcaro
  - Dipartimento di Scienze della Salute, Campus “S. Venuta”, Università degli Studi “Magna Græcia” di Catanzaro, Catanzaro, Italy
  - Net4Science S.r.l., Università degli Studi “Magna Græcia” di Catanzaro, Catanzaro, Italy
  - Associazione CRISEA - Centro di Ricerca e Servizi Avanzati per l'Innovazione Rurale, Loc. Condoleo, Belcastro, Italy
- Anna Artese
  - Dipartimento di Scienze della Salute, Campus “S. Venuta”, Università degli Studi “Magna Græcia” di Catanzaro, Catanzaro, Italy
  - Net4Science S.r.l., Università degli Studi “Magna Græcia” di Catanzaro, Catanzaro, Italy
|
6
|
Asediya VS, Anjaria PA, Mathakiya RA, Koringa PG, Nayak JB, Bisht D, Fulmali D, Patel VA, Desai DN. Vaccine development using artificial intelligence and machine learning: A review. Int J Biol Macromol 2024; 282:136643. [PMID: 39426778 DOI: 10.1016/j.ijbiomac.2024.136643] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/01/2024] [Revised: 09/30/2024] [Accepted: 10/15/2024] [Indexed: 10/21/2024]
Abstract
The COVID-19 pandemic has underscored the critical importance of effective vaccines, yet their development is a challenging and demanding process. It requires identifying antigens that elicit protective immunity, selecting adjuvants that enhance immunogenicity, and designing delivery systems that ensure optimal efficacy. Artificial intelligence (AI) can facilitate this process by using machine learning methods to analyze large and diverse datasets, suggest novel vaccine candidates, and refine their design and predict their performance. This review explores how AI can be applied to various aspects of vaccine development, such as predicting immune response from protein sequences, discovering adjuvants, optimizing vaccine doses, modeling vaccine supply chains, and predicting protein structures. We also address the challenges and ethical issues that emerge from the use of AI in vaccine development, such as data privacy, algorithmic bias, and health data sensitivity. We contend that AI has immense potential to accelerate vaccine development and respond to future pandemics, but it also requires careful attention to the quality and validity of the data and methods used.
Affiliation(s)
- Deepanker Bisht
  - Indian Veterinary Research Institute, Izatnagar, U.P., India
|
7
|
Gnanaolivu R, Hart SN. Using AI-predicted protein structures as a reference to predict loss-of-function activity in tumor suppressor breast cancer genes. Comput Struct Biotechnol J 2024; 23:3472-3480. [PMID: 39430403 PMCID: PMC11490748 DOI: 10.1016/j.csbj.2024.10.008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/08/2024] [Revised: 10/03/2024] [Accepted: 10/03/2024] [Indexed: 10/22/2024] Open
Abstract
Background: Most missense variants in the tumor suppressor breast cancer genes BRCA1, BRCA2, PALB2, and RAD51C remain unclassified with respect to loss of function (LOF), which confounds clinical actionability. Classifying these variants is challenging due to their rarity, leading clinicians to rely on in silico predictive methods. Protein stability changes are associated with function, making stability predictors valuable. Stability predictions upon missense variant perturbations require high-resolution protein structures. However, such high-resolution structures are often unavailable. This study explores using generative AI to predict high-resolution protein structures, which can then be analyzed with in silico protein stability prediction methods to assess LOF activity in ordered regions of the protein. This study also determines the appropriate in silico protein stability and dedicated in silico missense prediction methods in the dbNSFP v4.7 database to predict LOF activity in ordered regions of these four genes. Functional classifications from homology recombination DNA repair (HDR) assays and variant classifications from the ClinVar database provide a reliable dataset for evaluating the performance of these in silico prediction methods. Results: Complex AlphaFold2 structures of the BRCA1 C-terminal (BRCT) domain and the DNA-binding (DB) domain of BRCA2, analyzed using the protein stability tool FoldX, predict LOF activity from missense variants significantly better than experimentally-derived structures in ordered regions. The BRCT domain achieved an Area Under the Curve (AUC) = 0.861 (95% CI: 0.858-0.863) and AUC = 0.842 (95% CI: 0.840-0.845), while the DB domain achieved an AUC = 0.836 (95% CI: 0.8322-0.841), compared to AUC = 0.847 (95% CI: 0.844-0.850) and AUC = 0.835 (95% CI: 0.832-0.837) from the BRCT domain, and AUC = 0.830 (95% CI: 0.821-0.8320) from the DB domain from experimentally-derived structures. Protein stability does not predict LOF activity from missense variants better than dedicated in silico missense predictors. Overall, we find that AlphaMissense ranks highly, with an average AUC = 0.890 (95% CI: 0.886-0.895) from ordered regions across these four cancer genes, compared to all other in silico missense predictors present in the dbNSFP database. Conclusions: The study reveals that generative AI protein predicted structures can outperform experimentally-derived structures in evaluating LOF activity from predicted protein stability in ordered regions of genes BRCA1, BRCA2, PALB2 and RAD51C. The study also highlights the predictive performance of AlphaMissense as the premier in silico missense prediction method to predict LOF activity from missense variants in these four tumor suppressor breast cancer genes. The code for this study can be downloaded for free on GitHub (https://github.com/rohandavidg/CarePred).
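The AUC values with 95% confidence intervals reported above can be made concrete with a small sketch. It computes the ROC AUC as the probability that a randomly chosen LOF variant scores higher than a randomly chosen non-LOF variant, and attaches a percentile bootstrap interval. The abstract does not state how the intervals were obtained, so the bootstrap here is an assumption, and the function names, scores, and labels are hypothetical.

```python
import random


def roc_auc(scores, labels):
    """ROC AUC via pairwise comparison (Mann-Whitney formulation).

    labels use 1 for loss-of-function and 0 for functional variants;
    higher scores are assumed to indicate LOF. Ties count as half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(scores, labels, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUC (illustrative assumption)."""
    roc_auc(scores, labels)  # validates that both classes are present
    rng = random.Random(seed)
    n = len(scores)
    aucs = []
    while len(aucs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # skip resamples that contain a single class
        aucs.append(roc_auc([scores[i] for i in idx], ys))
    aucs.sort()
    lo = aucs[int((alpha / 2) * n_boot)]
    hi = aucs[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lo, hi


if __name__ == "__main__":
    # Hypothetical stability scores (e.g. predicted ddG) and LOF labels.
    scores = [3.1, 2.4, 0.2, 1.9, 0.5, 0.1, 2.8, 0.9]
    labels = [1, 1, 0, 1, 0, 0, 1, 0]
    print("AUC:", roc_auc(scores, labels))
    print("95% CI:", bootstrap_auc_ci(scores, labels, n_boot=500))
```

The pairwise formulation is quadratic in the number of variants, which is acceptable for a few thousand variants; a rank-based implementation would be preferable for larger sets.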
Affiliation(s)
- Rohan Gnanaolivu
  - Department of Quantitative Health Sciences, Mayo Clinic, Rochester, MN, United States
- Steven N. Hart
  - Department of Quantitative Health Sciences, Mayo Clinic, Rochester, MN, United States
  - Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, MN, United States
|
8
|
Sinclair M, Stein RA, Sheehan JH, Hawes EM, O’Brien RM, Tajkhorshid E, Claxton DP. Integrative analysis of pathogenic variants in glucose-6-phosphatase based on an AlphaFold2 model. PNAS Nexus 2024; 3:pgae036. [PMID: 38328777 PMCID: PMC10849595 DOI: 10.1093/pnasnexus/pgae036] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 09/01/2023] [Accepted: 01/09/2024] [Indexed: 02/09/2024]
Abstract
Mediating the terminal reaction of gluconeogenesis and glycogenolysis, the integral membrane protein glucose-6-phosphate catalytic subunit 1 (G6PC1) regulates hepatic glucose production by catalyzing hydrolysis of glucose-6-phosphate (G6P) within the lumen of the endoplasmic reticulum. Consistent with its vital contribution to glucose homeostasis, inactivating mutations in G6PC1 causes glycogen storage disease (GSD) type 1a characterized by hepatomegaly and severe hypoglycemia. Despite its physiological importance, the structural basis of G6P binding to G6PC1 and the molecular disruptions induced by missense mutations within the active site that give rise to GSD type 1a are unknown. In this study, we determine the atomic interactions governing G6P binding as well as explore the perturbations imposed by disease-linked missense variants by subjecting an AlphaFold2 G6PC1 structural model to molecular dynamics simulations and in silico predictions of thermodynamic stability validated with robust in vitro and in situ biochemical assays. We identify a collection of side chains, including conserved residues from the signature phosphatidic acid phosphatase motif, that contribute to a hydrogen bonding and van der Waals network stabilizing G6P in the active site. The introduction of GSD type 1a mutations modified the thermodynamic landscape, altered side chain packing and substrate-binding interactions, and induced trapping of catalytic intermediates. Our results, which corroborate the high quality of the AF2 model as a guide for experimental design and to interpret outcomes, not only confirm the active-site structural organization but also identify previously unobserved mechanistic contributions of catalytic and noncatalytic side chains.
Affiliation(s)
- Matt Sinclair
- Theoretical and Computational Biophysics Group, NIH Center for Macromolecular Modeling and Visualization, Beckman Institute for Advanced Science and Technology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
- Department of Biochemistry, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
- Richard A Stein
- Department of Molecular Physiology and Biophysics, Vanderbilt University, Nashville, TN 37232, USA
- Center for Applied Artificial Intelligence in Protein Dynamics, Vanderbilt University, Nashville, TN 37240, USA
- Jonathan H Sheehan
- Center for Structural Biology, Vanderbilt University, Nashville, TN 37240, USA
- Division of Infectious Diseases, Department of Internal Medicine, Washington University School of Medicine, St Louis, MO 63110, USA
- Emily M Hawes
- Department of Molecular Physiology and Biophysics, Vanderbilt University, Nashville, TN 37232, USA
- Richard M O’Brien
- Department of Molecular Physiology and Biophysics, Vanderbilt University, Nashville, TN 37232, USA
- Emad Tajkhorshid
- Theoretical and Computational Biophysics Group, NIH Center for Macromolecular Modeling and Visualization, Beckman Institute for Advanced Science and Technology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
- Department of Biochemistry, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
- Center for Biophysics and Quantitative Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
- Derek P Claxton
- Department of Molecular Physiology and Biophysics, Vanderbilt University, Nashville, TN 37232, USA
- Center for Applied Artificial Intelligence in Protein Dynamics, Vanderbilt University, Nashville, TN 37240, USA
- Center for Structural Biology, Vanderbilt University, Nashville, TN 37240, USA
|
9
|
Nourbakhsh M, Degn K, Saksager A, Tiberti M, Papaleo E. Prediction of cancer driver genes and mutations: the potential of integrative computational frameworks. Brief Bioinform 2024; 25:bbad519. [PMID: 38261338 PMCID: PMC10805075 DOI: 10.1093/bib/bbad519] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/09/2023] [Revised: 11/27/2023] [Accepted: 12/11/2023] [Indexed: 01/24/2024] Open
Abstract
The vast amount of available sequencing data allows the scientific community to explore different genetic alterations that may drive cancer or favor cancer progression. Software developers have proposed a myriad of predictive tools, allowing researchers and clinicians to compare and prioritize driver genes and mutations and their relative pathogenicity. However, there is little consensus on the computational approach or a gold standard for comparison. Hence, benchmarking the different tools depends highly on the input data, indicating that overfitting is still a massive problem. One of the solutions is to limit the scope and usage of specific tools. However, such limitations force researchers to walk a tightrope between creating and using high-quality tools for a specific purpose and describing the complex alterations driving cancer. While the knowledge of cancer development increases daily, many bioinformatic pipelines rely on single nucleotide variants or alterations in a vacuum without accounting for cellular compartments, mutational burden or disease progression. Even within bioinformatics and computational cancer biology, the research fields work in silos, risking overlooking potential synergies or breakthroughs. Here, we provide an overview of databases and datasets for building or testing predictive cancer driver tools. Furthermore, we introduce predictive tools for driver genes, driver mutations, and the impact of these based on structural analysis. Additionally, we recommend directions for the field to avoid siloed research and move towards integrative frameworks.
Affiliation(s)
- Mona Nourbakhsh
- Cancer Systems Biology, Section for Bioinformatics, Department of Health Technology, Technical University of Denmark, 2800 Lyngby, Denmark
- Kristine Degn
- Cancer Systems Biology, Section for Bioinformatics, Department of Health Technology, Technical University of Denmark, 2800 Lyngby, Denmark
- Astrid Saksager
- Cancer Systems Biology, Section for Bioinformatics, Department of Health Technology, Technical University of Denmark, 2800 Lyngby, Denmark
- Matteo Tiberti
- Cancer Structural Biology, Danish Cancer Institute, 2100 Copenhagen, Denmark
- Elena Papaleo
- Cancer Systems Biology, Section for Bioinformatics, Department of Health Technology, Technical University of Denmark, 2800 Lyngby, Denmark
- Cancer Structural Biology, Danish Cancer Institute, 2100 Copenhagen, Denmark
|
10
|
Alderson TR, Pritišanac I, Kolarić Đ, Moses AM, Forman-Kay JD. Systematic identification of conditionally folded intrinsically disordered regions by AlphaFold2. Proc Natl Acad Sci U S A 2023; 120:e2304302120. [PMID: 37878721 PMCID: PMC10622901 DOI: 10.1073/pnas.2304302120] [Citation(s) in RCA: 65] [Impact Index Per Article: 32.5] [Received: 03/22/2023] [Accepted: 08/30/2023] [Indexed: 10/27/2023] Open
Abstract
The AlphaFold Protein Structure Database contains predicted structures for millions of proteins. For the majority of human proteins that contain intrinsically disordered regions (IDRs), which do not adopt a stable structure, it is generally assumed that these regions have low AlphaFold2 confidence scores that reflect low-confidence structural predictions. Here, we show that AlphaFold2 assigns confident structures to nearly 15% of human IDRs. By comparison to experimental NMR data for a subset of IDRs that are known to conditionally fold (i.e., upon binding or under other specific conditions), we find that AlphaFold2 often predicts the structure of the conditionally folded state. Based on databases of IDRs that are known to conditionally fold, we estimate that AlphaFold2 can identify conditionally folding IDRs at a precision as high as 88% at a 10% false positive rate, which is remarkable considering that conditionally folded IDR structures were minimally represented in its training data. We find that human disease mutations are nearly fivefold enriched in conditionally folded IDRs over IDRs in general and that up to 80% of IDRs in prokaryotes are predicted to conditionally fold, compared to less than 20% of eukaryotic IDRs. These results indicate that a large majority of IDRs in the proteomes of human and other eukaryotes function in the absence of conditional folding, but the regions that do acquire folds are more sensitive to mutations. We emphasize that the AlphaFold2 predictions do not reveal functionally relevant structural plasticity within IDRs and cannot offer realistic ensemble representations of conditionally folded IDRs.
Affiliation(s)
- T. Reid Alderson
- Department of Biochemistry, University of Toronto, Toronto, ON M5S 1A8, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, ON M5S 1A8, Canada
- Iva Pritišanac
- Department of Cell and Systems Biology, University of Toronto, Toronto, ON M5S 35G, Canada
- Molecular Medicine Program, The Hospital for Sick Children, Toronto, ON M5G 0A4, Canada
- Department of Molecular Biology and Biochemistry, Gottfried Schatz Research Center for Cell Signaling, Metabolism and Aging, Medical University of Graz, Graz 8010, Austria
- Đesika Kolarić
- Department of Molecular Biology and Biochemistry, Gottfried Schatz Research Center for Cell Signaling, Metabolism and Aging, Medical University of Graz, Graz 8010, Austria
- Alan M. Moses
- Department of Cell and Systems Biology, University of Toronto, Toronto, ON M5S 35G, Canada
- Julie D. Forman-Kay
- Department of Biochemistry, University of Toronto, Toronto, ON M5S 1A8, Canada
- Molecular Medicine Program, The Hospital for Sick Children, Toronto, ON M5G 0A4, Canada
|
11
|
Sinclair M, Stein RA, Sheehan JH, Hawes EM, O'Brien RM, Tajkhorshid E, Claxton DP. Molecular mechanisms of catalytic inhibition for active site mutations in glucose-6-phosphatase catalytic subunit 1 linked to glycogen storage disease. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.03.13.532485. [PMID: 36993754 PMCID: PMC10054992 DOI: 10.1101/2023.03.13.532485] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/19/2023]
Abstract
Mediating the terminal reaction of gluconeogenesis and glycogenolysis, the integral membrane protein G6PC1 regulates hepatic glucose production by catalyzing hydrolysis of glucose-6-phosphate (G6P) within the lumen of the endoplasmic reticulum. Consistent with its vital contribution to glucose homeostasis, inactivating mutations in G6PC1 cause glycogen storage disease (GSD) type 1a characterized by hepatomegaly and severe hypoglycemia. Despite its physiological importance, the structural basis of G6P binding to G6PC1 and the molecular disruptions induced by missense mutations within the active site that give rise to GSD type 1a are unknown. Exploiting a computational model of G6PC1 derived from the groundbreaking structure prediction algorithm AlphaFold2 (AF2), we combine molecular dynamics (MD) simulations and computational predictions of thermodynamic stability with a robust in vitro screening platform to define the atomic interactions governing G6P binding as well as explore the energetic perturbations imposed by disease-linked variants. We identify a collection of side chains, including conserved residues from the signature phosphatidic acid phosphatase motif, that contribute to a hydrogen bonding and van der Waals network stabilizing G6P in the active site. Introduction of GSD type 1a mutations into the G6PC1 sequence elicits changes in G6P binding energy, thermostability and structural properties, suggesting multiple pathways of catalytic impairment. Our results, which corroborate the high quality of the AF2 model as a guide for experimental design and to interpret outcomes, not only confirm active site structural organization but also suggest novel mechanistic contributions of catalytic and non-catalytic side chains.
|
12
|
Blaabjerg LM, Kassem MM, Good LL, Jonsson N, Cagiada M, Johansson KE, Boomsma W, Stein A, Lindorff-Larsen K. Rapid protein stability prediction using deep learning representations. eLife 2023; 12:e82593. [PMID: 37184062 PMCID: PMC10266766 DOI: 10.7554/elife.82593] [Citation(s) in RCA: 57] [Impact Index Per Article: 28.5] [Received: 08/10/2022] [Accepted: 05/12/2023] [Indexed: 05/16/2023] Open
Abstract
Predicting the thermodynamic stability of proteins is a common and widely used step in protein engineering, and when elucidating the molecular mechanisms behind evolution and disease. Here, we present RaSP, a method for making rapid and accurate predictions of changes in protein stability by leveraging deep learning representations. RaSP performs on par with biophysics-based methods and enables saturation mutagenesis stability predictions in less than a second per residue. We use RaSP to calculate ∼230 million stability changes for nearly all single amino acid changes in the human proteome, and examine variants observed in the human population. We find that variants that are common in the population are substantially depleted for severe destabilization, and that there are substantial differences between benign and pathogenic variants, highlighting the role of protein stability in genetic diseases. RaSP is freely available (including via a web interface) and enables large-scale analyses of stability in experimental and predicted protein structures.
Affiliation(s)
- Lasse M Blaabjerg
- Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Maher M Kassem
- Center for Basic Machine Learning Research in Life Science, Department of Computer Science, University of Copenhagen, Copenhagen, Denmark
- Lydia L Good
- Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Nicolas Jonsson
- Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Matteo Cagiada
- Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Kristoffer E Johansson
- Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Wouter Boomsma
- Center for Basic Machine Learning Research in Life Science, Department of Computer Science, University of Copenhagen, Copenhagen, Denmark
- Amelie Stein
- Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Kresten Lindorff-Larsen
- Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
|