For: Strokach A, Becerra D, Corbi-Verge C, Perez-Riba A, Kim PM. Fast and Flexible Protein Design Using Deep Graph Neural Networks. Cell Syst 2020;11:402-411.e4. [DOI: 10.1016/j.cels.2020.08.016] [Citation(s) in RCA: 56] [Impact Index Per Article: 11.2] [Received: 04/02/2020] [Revised: 06/27/2020] [Accepted: 08/26/2020] [Indexed: 11/15/2022]

Cited by Other Article(s)

Gu C, Ghasemi SM, Cai Y, Fahrmann JF, Long JP, Katayama H, Wu C, Vykoukal J, Dennison JB, Hanash S, Do KA, Irajizad E. Grape-Pi: graph-based neural networks for enhanced protein identification in proteomics pipelines. Bioinform Adv 2025;5:vbaf095. [PMID: 40406669 PMCID: PMC12096076 DOI: 10.1093/bioadv/vbaf095] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/17/2024] [Revised: 04/02/2025] [Accepted: 04/24/2025] [Indexed: 05/26/2025]

Affiliation(s)

Chunhui Gu Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, United States Department of Clinical Cancer Prevention, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, United States
Seyyed Mahmood Ghasemi Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, United States Department of Clinical Cancer Prevention, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, United States
Yining Cai Department of Clinical Cancer Prevention, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, United States
Johannes F Fahrmann Department of Clinical Cancer Prevention, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, United States
James P Long Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, United States
Hiroyuki Katayama Department of Clinical Cancer Prevention, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, United States
Chong Wu Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, United States
Jody Vykoukal Department of Clinical Cancer Prevention, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, United States
Jennifer B Dennison Department of Clinical Cancer Prevention, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, United States
Samir Hanash Department of Clinical Cancer Prevention, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, United States
Kim-Anh Do Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, United States
Ehsan Irajizad Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, United States Department of Clinical Cancer Prevention, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, United States

Dauparas J, Lee GR, Pecoraro R, An L, Anishchenko I, Glasscock C, Baker D. Atomic context-conditioned protein sequence design using LigandMPNN. Nat Methods 2025;22:717-723. [PMID: 40155723 PMCID: PMC11978504 DOI: 10.1038/s41592-025-02626-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/02/2023] [Accepted: 02/10/2025] [Indexed: 04/01/2025]

Høie MH, Hummer AM, Olsen TH, Aguilar-Sanjuan B, Nielsen M, Deane CM. AntiFold: improved structure-based antibody design using inverse folding. Bioinform Adv 2025;5:vbae202. [PMID: 40170886 PMCID: PMC11961221 DOI: 10.1093/bioadv/vbae202] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/17/2024] [Revised: 11/28/2024] [Accepted: 03/19/2025] [Indexed: 04/03/2025]

Zhong J, Zou Z, Qiu J, Wang S. ScFold: a GNN-based model for efficient inverse folding of short-chain proteins via spatial reduction. Brief Bioinform 2025;26:bbaf156. [PMID: 40205854 PMCID: PMC11982017 DOI: 10.1093/bib/bbaf156] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/04/2024] [Revised: 02/24/2025] [Accepted: 03/19/2025] [Indexed: 04/11/2025]

Turina P, Petrosino M, Enriquez Sandoval CA, Novak L, Pasquo A, Alexov E, Alladin MA, Ascher DB, Babbi G, Bakolitsa C, Casadio R, Cheng J, Fariselli P, Folkman L, Kamandula A, Katsonis P, Li M, Li D, Lichtarge O, Mahmud S, Martelli PL, Pal D, Panday SK, Pires DEV, Portelli S, Pucci F, Rodrigues CHM, Rooman M, Savojardo C, Schwersensky M, Shen Y, Strokach AV, Sun Y, Woo J, Radivojac P, Brenner SE, Chiaraluce R, Consalvi V, Capriotti E. Assessing the predicted impact of single amino acid substitutions in MAPK proteins for CAGI6 challenges. Hum Genet 2025;144:265-280. [PMID: 39976676 PMCID: PMC11975483 DOI: 10.1007/s00439-024-02724-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/30/2024] [Accepted: 12/27/2024] [Indexed: 03/05/2025]

Abstract

New thermodynamic and functional studies have recently been conducted to evaluate the impact of amino acid substitutions on the Mitogen Activated Protein Kinases 1 and 3 (MAPK1/3). The Critical Assessment of Genome Interpretation (CAGI) data provider, at Sapienza University of Rome, measured the unfolding free energy and the enzymatic activity of a set of variants (MAPK challenge dataset). Thermodynamic measurements for the denaturant-induced equilibrium unfolding of the phosphorylated and unphosphorylated forms of the MAPKs were obtained by monitoring the far-UV circular dichroism and intrinsic fluorescence changes as a function of denaturant concentration. These values were used to calculate the change in unfolding free energy between the variant and wild-type proteins at zero concentration of denaturant ($\Delta\Delta G^{\mathrm{H_2O}}$). The enzymatic activity of the phosphorylated MAPK variants was also measured using Chelation-Enhanced Fluorescence to monitor the phosphorylation of a peptide substrate. The MAPK challenge dataset, composed of a total of 23 single amino acid substitutions (11 and 12 for MAPK1 and MAPK3, respectively), was used to assess the effectiveness of the computational methods in predicting the $\Delta\Delta G^{\mathrm{H_2O}}$ values associated with the variants and in categorizing them as destabilizing or not destabilizing. The data on the enzymatic activity of the MAPK mutants were used to assess the performance of the methods for predicting the functional impact of the variants. For the sixth edition of CAGI, thirteen independent research groups from four continents (Asia, Australia, Europe and North America) submitted more than 80 sets of predictions, obtained from different approaches. In this manuscript, we summarize the results of our assessment to highlight the possible limitations of the available algorithms.
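
As a rough illustration of the stability measure described in this abstract, the sketch below computes the change in unfolding free energy as the variant value minus the wild-type value and labels a variant as destabilizing when that change falls at or below a cutoff. The function names, the sign convention, the -0.5 kcal/mol cutoff, and all numeric values are assumptions made for illustration; none of them are taken from the CAGI6 assessment.

```python
from typing import Dict


def ddg_unfolding(dg_variant: float, dg_wild_type: float) -> float:
    """Change in unfolding free energy (variant minus wild type).

    Assumed sign convention: negative values mean the variant unfolds
    more easily than the wild type.
    """
    return dg_variant - dg_wild_type


def classify(ddg: float, cutoff: float = -0.5) -> str:
    """Label a variant using a hypothetical cutoff (same units as ddg)."""
    return "destabilizing" if ddg <= cutoff else "not destabilizing"


# Made-up unfolding free energies (kcal/mol), for illustration only.
dg_wild_type = 8.0
dg_variants: Dict[str, float] = {"variant_A": 6.5, "variant_B": 7.9}

labels = {
    name: classify(ddg_unfolding(dg, dg_wild_type))
    for name, dg in dg_variants.items()
}
print(labels)
```

A real assessment would also need to account for experimental uncertainty near the cutoff, which this sketch ignores.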

Affiliation(s)

Paola Turina Department of Pharmacy and Biotechnology, University of Bologna, 40126, Bologna, Italy
Maria Petrosino Department of Biochemical Sciences "A. Rossi Fanelli", Sapienza University of Roma, 00185, Rome, Italy
Carlos A Enriquez Sandoval Department of Pharmacy and Biotechnology, University of Bologna, 40126, Bologna, Italy
Leonore Novak Department of Biochemical Sciences "A. Rossi Fanelli", Sapienza University of Roma, 00185, Rome, Italy
Alessandra Pasquo Diagnostics and Metrology Laboratory FSN-TECFIS-DIM, ENEA CR Frascati, 00044, Frascati, Italy
Emil Alexov Department of Physics and Astronomy, Clemson University, Clemson, SC, 29634, USA
Muttaqi Ahmad Alladin Department of Computational and Data Sciences, Indian Institute of Science, Bengaluru, 560012, India
David B Ascher Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, VIC, 3004, Australia School of Chemistry and Molecular Biosciences, Australian Centre for Ecogenomics, University of Queensland, St Lucia, QLD, 4072, Australia
Giulia Babbi Department of Pharmacy and Biotechnology, University of Bologna, 40126, Bologna, Italy
Constantina Bakolitsa Department of Plant and Microbial Biology and Center for Computational Biology, University of California, Berkeley, CA, 94720, USA
Rita Casadio Department of Pharmacy and Biotechnology, University of Bologna, 40126, Bologna, Italy
Jianlin Cheng Department of Electrical Engineering and Computer Science, NextGen Precision Health Institute, University of Missouri, Columbia, MO, 65211, USA
Piero Fariselli Department of Medical Sciences, University of Torino, 10126, Torino, Italy
Lukas Folkman Institute for Integrated and Intelligent Systems, Griffith University, Southport, QLD, 4222, Australia
Akash Kamandula Khoury College of Computer Sciences, Northeastern University, Boston, MA, 02115, USA
Panagiotis Katsonis Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
Minghui Li School of Biology and Basic Medical Sciences, Suzhou Medical College of Soochow University, Suzhou, 215123, Jiangsu, China
Dong Li Computational Biology and Bioinformatics, Université Libre de Bruxelles, 1050, Brussels, Belgium
Olivier Lichtarge Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
Sajid Mahmud Department of Electrical Engineering and Computer Science, NextGen Precision Health Institute, University of Missouri, Columbia, MO, 65211, USA
Pier Luigi Martelli Department of Pharmacy and Biotechnology, University of Bologna, 40126, Bologna, Italy
Debnath Pal Department of Computational and Data Sciences, Indian Institute of Science, Bengaluru, 560012, India
Shailesh Kumar Panday Department of Physics and Astronomy, Clemson University, Clemson, SC, 29634, USA
Douglas E V Pires School of Computing and Information Systems, The University of Melbourne, Melbourne, VIC, 3053, Australia
Stephanie Portelli Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, VIC, 3004, Australia School of Chemistry and Molecular Biosciences, Australian Centre for Ecogenomics, University of Queensland, St Lucia, QLD, 4072, Australia
Fabrizio Pucci Computational Biology and Bioinformatics, Université Libre de Bruxelles, 1050, Brussels, Belgium
Carlos H M Rodrigues Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, VIC, 3004, Australia
Marianne Rooman Computational Biology and Bioinformatics, Université Libre de Bruxelles, 1050, Brussels, Belgium
Castrense Savojardo Department of Pharmacy and Biotechnology, University of Bologna, 40126, Bologna, Italy
Martin Schwersensky Computational Biology and Bioinformatics, Université Libre de Bruxelles, 1050, Brussels, Belgium
Yang Shen Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX, 77843, USA
Alexey V Strokach Department of Computer Science, University of Toronto, Toronto, ON, M5S 2E4, Canada
Yuanfei Sun Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX, 77843, USA
Junwoo Woo 3 Billion, Seoul, South Korea
Predrag Radivojac Khoury College of Computer Sciences, Northeastern University, Boston, MA, 02115, USA
Steven E Brenner Department of Plant and Microbial Biology and Center for Computational Biology, University of California, Berkeley, CA, 94720, USA Biophysics Graduate Group, University of California, Berkeley, Berkeley, CA, 94720, USA Center for Computational Biology, University of California, Berkeley, Berkeley, CA, 94720, USA
Roberta Chiaraluce Department of Biochemical Sciences "A. Rossi Fanelli", Sapienza University of Roma, 00185, Rome, Italy.
Valerio Consalvi Department of Biochemical Sciences "A. Rossi Fanelli", Sapienza University of Roma, 00185, Rome, Italy.
Emidio Capriotti Department of Pharmacy and Biotechnology, University of Bologna, 40126, Bologna, Italy. Computational Genomics Platform, IRCCS University Hospital of Bologna, 40138, Bologna, Italy.

Turina P, Dal Cortivo G, Enriquez Sandoval CA, Alexov E, Ascher DB, Babbi G, Bakolitsa C, Casadio R, Fariselli P, Folkman L, Kamandula A, Katsonis P, Li D, Lichtarge O, Martelli PL, Panday SK, Pires DEV, Portelli S, Pucci F, Rodrigues CHM, Rooman M, Savojardo C, Schwersensky M, Shen Y, Strokach AV, Sun Y, Woo J, Radivojac P, Brenner SE, Dell'Orco D, Capriotti E. Assessing the predicted impact of single amino acid substitutions in calmodulin for CAGI6 challenges. Hum Genet 2025;144:113-125. [PMID: 39714488 PMCID: PMC11975486 DOI: 10.1007/s00439-024-02720-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/30/2024] [Accepted: 12/02/2024] [Indexed: 12/24/2024]

Affiliation(s)

Paola Turina Department of Pharmacy and Biotechnology, University of Bologna, 40126, Bologna, Italy
Giuditta Dal Cortivo Department of Neurosciences, Biomedicine, and Movement Sciences, Section of Biological Chemistry, University of Verona, 37134, Verona, Italy
Carlos A Enriquez Sandoval Department of Pharmacy and Biotechnology, University of Bologna, 40126, Bologna, Italy
Emil Alexov Department of Physics and Astronomy, Clemson University, Clemson, SC, 29634, USA
David B Ascher Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, VIC, 3004, Australia School of Chemistry and Molecular Biosciences, Australian Centre for Ecogenomics, University of Queensland, St Lucia, QLD, 4072, Australia
Giulia Babbi Department of Pharmacy and Biotechnology, University of Bologna, 40126, Bologna, Italy
Constantina Bakolitsa Department of Plant and Microbial Biology and Center for Computational Biology, University of California, Berkeley, CA, USA
Rita Casadio Department of Pharmacy and Biotechnology, University of Bologna, 40126, Bologna, Italy
Piero Fariselli Department of Medical Sciences, University of Torino, Turin, Italy
Lukas Folkman Institute for Integrated and Intelligent Systems, Griffith University, Southport, QLD, Australia
Akash Kamandula Khoury College of Computer Sciences, Northeastern University, Boston, MA, 02115, USA
Panagiotis Katsonis Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
Dong Li Computational Biology and Bioinformatics, Université Libre de Bruxelles, 50 Roosevelt Ave, 1050, Brussels, Belgium
Olivier Lichtarge Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
Pier Luigi Martelli Department of Pharmacy and Biotechnology, University of Bologna, 40126, Bologna, Italy
Shailesh Kumar Panday Department of Physics and Astronomy, Clemson University, Clemson, SC, 29634, USA
Douglas E V Pires School of Computing and Information Systems, The University of Melbourne, Melbourne, VIC, 3053, Australia
Stephanie Portelli Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, VIC, 3004, Australia School of Chemistry and Molecular Biosciences, Australian Centre for Ecogenomics, University of Queensland, St Lucia, QLD, 4072, Australia
Fabrizio Pucci Computational Biology and Bioinformatics, Université Libre de Bruxelles, 50 Roosevelt Ave, 1050, Brussels, Belgium
Carlos H M Rodrigues Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, VIC, 3004, Australia
Marianne Rooman Computational Biology and Bioinformatics, Université Libre de Bruxelles, 50 Roosevelt Ave, 1050, Brussels, Belgium
Castrense Savojardo Department of Pharmacy and Biotechnology, University of Bologna, 40126, Bologna, Italy
Martin Schwersensky Computational Biology and Bioinformatics, Université Libre de Bruxelles, 50 Roosevelt Ave, 1050, Brussels, Belgium
Yang Shen Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX, USA
Alexey V Strokach Department of Computer Science, University of Toronto, Toronto, ON, Canada
Yuanfei Sun Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX, USA
Junwoo Woo 3 Billion, Seoul, South Korea
Predrag Radivojac Khoury College of Computer Sciences, Northeastern University, Boston, MA, 02115, USA
Steven E Brenner Department of Plant and Microbial Biology and Center for Computational Biology, University of California, Berkeley, CA, USA Biophysics Graduate Group, University of California, Berkeley, CA, 94720, USA Center for Computational Biology, University of California, Berkeley, CA, 94720, USA
Daniele Dell'Orco Department of Neurosciences, Biomedicine, and Movement Sciences, Section of Biological Chemistry, University of Verona, 37134, Verona, Italy.
Emidio Capriotti Department of Pharmacy and Biotechnology, University of Bologna, 40126, Bologna, Italy. Computational Genomics Platform, IRCCS University Hospital of Bologna, 40138, Bologna, Italy.

Zhang J, Kinch L, Katsonis P, Lichtarge O, Jagota M, Song YS, Sun Y, Shen Y, Kuru N, Dereli O, Adebali O, Alladin MA, Pal D, Capriotti E, Turina MP, Savojardo C, Martelli PL, Babbi G, Casadio R, Pucci F, Rooman M, Cia G, Tsishyn M, Strokach A, Hu Z, van Loggerenberg W, Roth FP, Radivojac P, Brenner SE, Cong Q, Grishin NV. Assessing predictions on fitness effects of missense variants in HMBS in CAGI6. Hum Genet 2025;144:173-189. [PMID: 39110250 PMCID: PMC12085147 DOI: 10.1007/s00439-024-02680-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/18/2023] [Accepted: 05/17/2024] [Indexed: 02/21/2025]

Affiliation(s)

Jing Zhang Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, TX, 75390, USA Department of Biochemistry, University of Texas Southwestern Medical Center, Dallas, TX, 75390, USA Eugene McDermott Center for Human Growth and Development, University of Texas Southwestern Medical Center, Dallas, TX, 75390, USA Harold C. Simmons Comprehensive Cancer Center, University of Texas Southwestern Medical Center, Dallas, TX, 75390, USA
Lisa Kinch Howard Hughes Medical Institute, University of Texas Southwestern Medical Center, Dallas, TX, 75390, USA Department of Molecular Biology, University of Texas Southwestern Medical Center, Dallas, TX, 75390, USA
Panagiotis Katsonis Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
Olivier Lichtarge Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
Milind Jagota Computer Science Division, University of California, Berkeley, CA, 94720, USA
Yun S Song Computer Science Division, University of California, Berkeley, CA, 94720, USA Department of Statistics, University of California, Berkeley, Berkeley, CA, 94720, USA
Yuanfei Sun Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX, 77843, USA
Yang Shen Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX, 77843, USA
Nurdan Kuru Faculty of Engineering and Natural Sciences, Sabanci University, Tuzla, Turkey
Onur Dereli Faculty of Engineering and Natural Sciences, Sabanci University, Tuzla, Turkey
Ogun Adebali Faculty of Engineering and Natural Sciences, Sabanci University, Tuzla, Turkey
Muttaqi Ahmad Alladin Department of Computational and Data Sciences, Indian Institute of Science, Bengaluru, 560012, India
Debnath Pal Department of Computational and Data Sciences, Indian Institute of Science, Bengaluru, 560012, India
Emidio Capriotti Department of Pharmacy and Biotechnology, University of Bologna, Via Selmi 3, 40126, Bologna, Italy
Maria Paola Turina Department of Pharmacy and Biotechnology, University of Bologna, Via Selmi 3, 40126, Bologna, Italy
Castrense Savojardo Department of Pharmacy and Biotechnology, University of Bologna, Via Selmi 3, 40126, Bologna, Italy
Pier Luigi Martelli Department of Pharmacy and Biotechnology, University of Bologna, Via Selmi 3, 40126, Bologna, Italy
Giulia Babbi Department of Pharmacy and Biotechnology, University of Bologna, Via Selmi 3, 40126, Bologna, Italy
Rita Casadio Department of Pharmacy and Biotechnology, University of Bologna, Via Selmi 3, 40126, Bologna, Italy
Fabrizio Pucci Computational Biology and Bioinformatics, Université Libre de Bruxelles, 50 Roosevelt Ave, 1050, Brussels, Belgium
Marianne Rooman Computational Biology and Bioinformatics, Université Libre de Bruxelles, 50 Roosevelt Ave, 1050, Brussels, Belgium
Gabriel Cia Computational Biology and Bioinformatics, Université Libre de Bruxelles, 50 Roosevelt Ave, 1050, Brussels, Belgium
Matsvei Tsishyn Computational Biology and Bioinformatics, Université Libre de Bruxelles, 50 Roosevelt Ave, 1050, Brussels, Belgium
Alexey Strokach Department of Computer Science, University of Toronto, Toronto, ON, M5S 2E4, Canada
Zhiqiang Hu Department of Plant and Microbial Biology, University of California, Berkeley, CA, 94720, USA Center for Computational Biology, University of California, Berkeley, Berkeley, CA, 94720, USA
Warren van Loggerenberg Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh, PA, 15213, USA Donnelly Centre, University of Toronto, Toronto, ON, M5S 3E1, Canada Department of Molecular Genetics, University of Toronto, Toronto, ON, M5S 1A8, Canada Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, ON, M5G 1X5, Canada
Frederick P Roth Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh, PA, 15213, USA Donnelly Centre, University of Toronto, Toronto, ON, M5S 3E1, Canada Department of Molecular Genetics, University of Toronto, Toronto, ON, M5S 1A8, Canada Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, ON, M5G 1X5, Canada
Predrag Radivojac Khoury College of Computer Sciences, Northeastern University, Boston, MA, 02115, USA
Steven E Brenner Department of Plant and Microbial Biology, University of California, Berkeley, CA, 94720, USA Center for Computational Biology, University of California, Berkeley, Berkeley, CA, 94720, USA Biophysics Graduate Group, University of California, Berkeley, Berkeley, CA, 94720, USA
Qian Cong Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, TX, 75390, USA. Department of Biochemistry, University of Texas Southwestern Medical Center, Dallas, TX, 75390, USA. Eugene McDermott Center for Human Growth and Development, University of Texas Southwestern Medical Center, Dallas, TX, 75390, USA. Harold C. Simmons Comprehensive Cancer Center, University of Texas Southwestern Medical Center, Dallas, TX, 75390, USA.
Nick V Grishin Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, TX, 75390, USA. Department of Biochemistry, University of Texas Southwestern Medical Center, Dallas, TX, 75390, USA.

Dong Z, Jin J, Xiao Y, Xiao B, Wang S, Liu X, Zhu E. Subgraph Propagation and Contrastive Calibration for Incomplete Multiview Data Clustering. IEEE Trans Neural Netw Learn Syst 2025;36:3218-3230. [PMID: 38236668 DOI: 10.1109/tnnls.2024.3350671] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 03/06/2025]

Abstract

The success of multiview raw data mining relies on the integrity of attributes. However, each view faces various noises and collection failures, which leads to a condition that attributes are only partially available. To make matters worse, the attributes in multiview raw data are composed of multiple forms, which makes it more difficult to explore the structure of the data especially in multiview clustering task. Due to the missing data in some views, the clustering task on incomplete multiview data confronts the following challenges, namely: 1) mining the topology of missing data in multiview is an urgent problem to be solved; 2) most approaches do not calibrate the complemented representations with common information of multiple views; and 3) we discover that the cluster distributions obtained from incomplete views have a cluster distribution unaligned problem (CDUP) in the latent space. To solve the above issues, we propose a deep clustering framework based on subgraph propagation and contrastive calibration (SPCC) for incomplete multiview raw data. First, the global structural graph is reconstructed by propagating the subgraphs generated by the complete data of each view. Then, the missing views are completed and calibrated under the guidance of the global structural graph and contrast learning between views. In the latent space, we assume that different views have a common cluster representation in the same dimension. However, in the unsupervised condition, the fact that the cluster distributions of different views do not correspond affects the information completion process to use information from other views. Finally, the complemented cluster distributions for different views are aligned by contrastive learning (CL), thus solving the CDUP in the latent space. Our method achieves advanced performance on six benchmarks, which validates the effectiveness and superiority of our SPCC.

Hui WH, Chen YL, Chang SW. GraphLOGIC: Lethality prediction of osteogenesis imperfecta on type I collagen by a mechanics-informed graph neural network. Int J Biol Macromol 2025;291:139001. [PMID: 39706395 DOI: 10.1016/j.ijbiomac.2024.139001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/11/2024] [Revised: 12/16/2024] [Accepted: 12/17/2024] [Indexed: 12/23/2024]

Sun J, Zhu T, Cui Y, Wu B. Structure-based self-supervised learning enables ultrafast protein stability prediction upon mutation. Innovation (N Y) 2025;6:100750. [PMID: 39872490 PMCID: PMC11763918 DOI: 10.1016/j.xinn.2024.100750] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/04/2024] [Accepted: 12/02/2024] [Indexed: 01/30/2025]

Bochtler M. How the technologies behind self-driving cars, social networks, ChatGPT, and DALL-E2 are changing structural biology. Bioessays 2025;47:e2400155. [PMID: 39404756 DOI: 10.1002/bies.202400155] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/30/2024] [Revised: 09/08/2024] [Accepted: 09/26/2024] [Indexed: 12/22/2024]

Cohn R, Holm EA. Graph convolutional network for predicting abnormal grain growth in Monte Carlo simulations of microstructural evolution. Sci Rep 2024;14:30259. [PMID: 39632876 PMCID: PMC11618464 DOI: 10.1038/s41598-024-81349-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2024] [Accepted: 11/26/2024] [Indexed: 12/07/2024]

Soleymani F, Paquet E, Viktor HL, Michalowski W. Structure-based protein and small molecule generation using EGNN and diffusion models: A comprehensive review. Comput Struct Biotechnol J 2024;23:2779-2797. [PMID: 39050782 PMCID: PMC11268121 DOI: 10.1016/j.csbj.2024.06.021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/19/2024] [Revised: 06/13/2024] [Accepted: 06/18/2024] [Indexed: 07/27/2024]

Jiao Z, Liu Y, Wang Z. Application of graph neural network in computational heterogeneous catalysis. J Chem Phys 2024;161:171001. [PMID: 39484893 DOI: 10.1063/5.0227821] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/10/2024] [Accepted: 10/11/2024] [Indexed: 11/03/2024]

Liu J, Guo Z, You H, Zhang C, Lai L. All-Atom Protein Sequence Design Based on Geometric Deep Learning. Angew Chem Int Ed Engl 2024:e202411461. [PMID: 39295564 DOI: 10.1002/anie.202411461] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/18/2024] [Revised: 09/09/2024] [Accepted: 09/18/2024] [Indexed: 09/21/2024]

Satalkar V, Degaga GD, Li W, Pang YT, McShan AC, Gumbart JC, Mitchell JC, Torres MP. Generative β-hairpin design using a residue-based physicochemical property landscape. Biophys J 2024;123:2790-2806. [PMID: 38297834 PMCID: PMC11393682 DOI: 10.1016/j.bpj.2024.01.029] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/23/2023] [Revised: 12/20/2023] [Accepted: 01/25/2024] [Indexed: 02/02/2024]

Ghafarollahi A, Buehler MJ. ProtAgents: protein discovery via large language model multi-agent collaborations combining physics and machine learning. Digit Discov 2024;3:1389-1409. [PMID: 38993729 PMCID: PMC11235180 DOI: 10.1039/d4dd00013g] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/19/2024] [Accepted: 05/13/2024] [Indexed: 07/13/2024]

Abstract

Designing de novo proteins beyond those found in nature holds significant promise for advancements in both scientific and engineering applications. Current methodologies for protein design often rely on AI-based models, such as surrogate models that address end-to-end problems by linking protein structure to material properties or vice versa. However, these models frequently focus on specific material objectives or structural properties, limiting their flexibility when incorporating out-of-domain knowledge into the design process or comprehensive data analysis is required. In this study, we introduce ProtAgents, a platform for de novo protein design based on Large Language Models (LLMs), where multiple AI agents with distinct capabilities collaboratively address complex tasks within a dynamic environment. The versatility in agent development allows for expertise in diverse domains, including knowledge retrieval, protein structure analysis, physics-based simulations, and results analysis. The dynamic collaboration between agents, empowered by LLMs, provides a versatile approach to tackling protein design and analysis problems, as demonstrated through diverse examples in this study. The problems of interest encompass designing new proteins, analyzing protein structures and obtaining new first-principles data - natural vibrational frequencies - via physics simulations. The concerted effort of the system allows for powerful automated and synergistic design of de novo proteins with targeted mechanical properties. The flexibility in designing the agents, on one hand, and their capacity in autonomous collaboration through the dynamic LLM-based multi-agent environment on the other hand, unleashes great potentials of LLMs in addressing multi-objective materials problems and opens up new avenues for autonomous materials discovery and design.

Chu SKS, Narang K, Siegel JB. Protein stability prediction by fine-tuning a protein language model on a mega-scale dataset. PLoS Comput Biol 2024;20:e1012248. [PMID: 39038042 PMCID: PMC11293664 DOI: 10.1371/journal.pcbi.1012248] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/11/2023] [Revised: 08/01/2024] [Accepted: 06/13/2024] [Indexed: 07/24/2024] Open

Gurusinghe SNS, Wu Y, DeGrado W, Shifman JM. ProBASS - a language model with sequence and structural features for predicting the effect of mutations on binding affinity. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.06.21.600041. [PMID: 38979193 PMCID: PMC11230163 DOI: 10.1101/2024.06.21.600041] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Indexed: 07/10/2024]

Abstract

Protein-protein interactions (PPIs) govern virtually all cellular processes. Even a single mutation within PPI can significantly influence overall protein functionality and potentially lead to various types of diseases. To date, numerous approaches have emerged for predicting the change in free energy of binding (ΔΔGbind) resulting from mutations, yet the majority of these methods lack precision. In recent years, protein language models (PLMs) have been developed and shown powerful predictive capabilities by leveraging both sequence and structural data from protein-protein complexes. Yet, PLMs have not been optimized specifically for predicting ΔΔGbind. We developed an approach to predict effects of mutations on PPI binding affinity based on two most advanced protein language models ESM2 and ESM-IF1 that incorporate PPI sequence and structural features, respectively. We used the two models to generate embeddings for each PPI mutant and subsequently fine-tuned our model by training on a large dataset of experimental ΔΔGbind values. Our model, ProBASS (Protein Binding Affinity from Structure and Sequence) achieved a correlation with experimental ΔΔGbind values of 0.83 ± 0.05 for single mutations and 0.69 ± 0.04 for double mutations when model training and testing was done on the same PDB. Moreover, ProBASS exhibited very high correlation (0.81 ± 0.02) between prediction and experiment when training and testing was performed on a dataset containing 2325 single mutations in 132 PPIs. ProBASS surpasses the state-of-the-art methods in correlation with experimental data and could be further trained as more experimental data becomes available. Our results demonstrate that the integration of extensive datasets containing ΔΔGbind values across multiple PPIs to refine the pre-trained PLMs represents a successful approach for achieving a precise and broadly applicable model for ΔΔGbind prediction, greatly facilitating future protein engineering and design studies.

Lv S, Dong J, Wang C, Wang X, Bao Z. RB-GAT: A Text Classification Model Based on RoBERTa-BiGRU with Graph ATtention Network. SENSORS (BASEL, SWITZERLAND) 2024;24:3365. [PMID: 38894157 PMCID: PMC11175149 DOI: 10.3390/s24113365] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 04/21/2024] [Revised: 05/17/2024] [Accepted: 05/21/2024] [Indexed: 06/21/2024]

Tang X, Dai H, Knight E, Wu F, Li Y, Li T, Gerstein M. A survey of generative AI for de novo drug design: new frontiers in molecule and protein generation. Brief Bioinform 2024;25:bbae338. [PMID: 39007594 PMCID: PMC11247410 DOI: 10.1093/bib/bbae338] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/10/2024] [Revised: 05/21/2024] [Accepted: 06/27/2024] [Indexed: 07/16/2024] Open

Song Y, Wang F, Chen L, Zhang W. Engineering Fatty Acid Biosynthesis in Microalgae: Recent Progress and Perspectives. Mar Drugs 2024;22:216. [PMID: 38786607 PMCID: PMC11122798 DOI: 10.3390/md22050216] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/17/2024] [Revised: 05/06/2024] [Accepted: 05/07/2024] [Indexed: 05/25/2024] Open

Song C, Zhang L. Intelligent Design of Antithrombotic Peptide Targeting Collagen. LANGMUIR : THE ACS JOURNAL OF SURFACES AND COLLOIDS 2024;40:9661-9668. [PMID: 38664943 DOI: 10.1021/acs.langmuir.4c00543] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/08/2024]

Kim Y, Wang K, Lock RI, Nash TR, Fleischer S, Wang BZ, Fine BM, Vunjak-Novakovic G. BeatProfiler: Multimodal In Vitro Analysis of Cardiac Function Enables Machine Learning Classification of Diseases and Drugs. IEEE OPEN JOURNAL OF ENGINEERING IN MEDICINE AND BIOLOGY 2024;5:238-249. [PMID: 38606403 PMCID: PMC11008807 DOI: 10.1109/ojemb.2024.3377461] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/04/2023] [Revised: 02/13/2024] [Accepted: 03/10/2024] [Indexed: 04/13/2024] Open

Abstract

Goal: Contractile response and calcium handling are central to understanding cardiac function and physiology, yet existing methods of analysis to quantify these metrics are often time-consuming, prone to mistakes, or require specialized equipment/license. We developed BeatProfiler, a suite of cardiac analysis tools designed to quantify contractile function, calcium handling, and force generation for multiple in vitro cardiac models and apply downstream machine learning methods for deep phenotyping and classification. Methods: We first validate BeatProfiler's accuracy, robustness, and speed by benchmarking against existing tools with a fixed dataset. We further confirm its ability to robustly characterize disease and dose-dependent drug response. We then demonstrate that the data acquired by our automatic acquisition pipeline can be further harnessed for machine learning (ML) analysis to phenotype a disease model of restrictive cardiomyopathy and profile cardioactive drug functional response. To accurately classify between these biological signals, we apply feature-based ML and deep learning models (temporal convolutional-bidirectional long short-term memory model or TCN-BiLSTM). Results: Benchmarking against existing tools revealed that BeatProfiler detected and analyzed contraction and calcium signals better than existing tools through improved sensitivity in low signal data, reduction in false positives, and analysis speed increase by 7 to 50-fold. Of signals accurately detected by published methods (PMs), BeatProfiler's extracted features showed high correlations to PMs, confirming that it is reliable and consistent with PMs. The features extracted by BeatProfiler classified restrictive cardiomyopathy cardiomyocytes from isogenic healthy controls with 98% accuracy and identified relax90 as a top distinguishing feature in congruence with previous findings. We also show that our TCN-BiLSTM model was able to classify drug-free control and 4 cardiac drugs with different mechanisms of action at 96% accuracy. We further apply Grad-CAM on our convolution-based models to identify signature regions of perturbations by these drugs in calcium signals. Conclusions: We anticipate that the capabilities of BeatProfiler will help advance in vitro studies in cardiac biology through rapid phenotyping, revealing mechanisms underlying cardiac health and disease, and enabling objective classification of cardiac disease and responses to drugs.

Mu J, Li Z, Zhang B, Zhang Q, Iqbal J, Wadood A, Wei T, Feng Y, Chen HF. Graphormer supervised de novo protein design method and function validation. Brief Bioinform 2024;25:bbae135. [PMID: 38557677 PMCID: PMC10982952 DOI: 10.1093/bib/bbae135] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/07/2023] [Revised: 01/31/2024] [Accepted: 03/12/2024] [Indexed: 04/04/2024] Open

Affiliation(s)

Junxi Mu State Key Laboratory of Microbial metabolism, Joint International Research Laboratory of Metabolic Developmental Sciences, Department of Bioinformatics and Biostatistics, National Experimental Teaching Center for Life Sciences and Biotechnology, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai, 200240, China Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Peking University, No.5 Yiheyuan Road, Beijing, 100871, China
Zhengxin Li State Key Laboratory of Microbial metabolism, Joint International Research Laboratory of Metabolic Developmental Sciences, Department of Bioinformatics and Biostatistics, National Experimental Teaching Center for Life Sciences and Biotechnology, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai, 200240, China
Bo Zhang State Key Laboratory of Microbial metabolism, Joint International Research Laboratory of Metabolic Developmental Sciences, Department of Bioinformatics and Biostatistics, National Experimental Teaching Center for Life Sciences and Biotechnology, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai, 200240, China
Qi Zhang State Key Laboratory of Microbial metabolism, Joint International Research Laboratory of Metabolic Developmental Sciences, Department of Bioinformatics and Biostatistics, National Experimental Teaching Center for Life Sciences and Biotechnology, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai, 200240, China
Jamshed Iqbal Centre for Advanced Drug Research, COMSATS University Islamabad, Abbottabad Campus, Abbottabad, 22060, Pakistan
Abdul Wadood Department of Biochemistry, Abdul Wali Khan University Mardan, Mardan, 23200, Pakistan
Ting Wei State Key Laboratory of Microbial metabolism, Joint International Research Laboratory of Metabolic Developmental Sciences, Department of Bioinformatics and Biostatistics, National Experimental Teaching Center for Life Sciences and Biotechnology, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai, 200240, China
Yan Feng State Key Laboratory of Microbial metabolism, Joint International Research Laboratory of Metabolic Developmental Sciences, Department of Bioinformatics and Biostatistics, National Experimental Teaching Center for Life Sciences and Biotechnology, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai, 200240, China
Hai-Feng Chen State Key Laboratory of Microbial metabolism, Joint International Research Laboratory of Metabolic Developmental Sciences, Department of Bioinformatics and Biostatistics, National Experimental Teaching Center for Life Sciences and Biotechnology, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai, 200240, China

Jänes J, Beltrao P. Deep learning for protein structure prediction and design-progress and applications. Mol Syst Biol 2024;20:162-169. [PMID: 38291232 PMCID: PMC10912668 DOI: 10.1038/s44320-024-00016-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/26/2023] [Revised: 12/21/2023] [Accepted: 01/11/2024] [Indexed: 02/01/2024] Open

Chu AE, Lu T, Huang PS. Sparks of function by de novo protein design. Nat Biotechnol 2024;42:203-215. [PMID: 38361073 PMCID: PMC11366440 DOI: 10.1038/s41587-024-02133-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/05/2023] [Accepted: 01/09/2024] [Indexed: 02/17/2024]

Yu J, Mu J, Wei T, Chen HF. Multi-indicator comparative evaluation for deep learning-based protein sequence design methods. Bioinformatics 2024;40:btae037. [PMID: 38261649 PMCID: PMC10868333 DOI: 10.1093/bioinformatics/btae037] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/24/2023] [Revised: 12/20/2023] [Accepted: 01/18/2024] [Indexed: 01/25/2024] Open

Jones RD. Information Transmission in G Protein-Coupled Receptors. Int J Mol Sci 2024;25:1621. [PMID: 38338905 PMCID: PMC10855935 DOI: 10.3390/ijms25031621] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/30/2023] [Revised: 01/19/2024] [Accepted: 01/22/2024] [Indexed: 02/12/2024] Open

Bravi B. Development and use of machine learning algorithms in vaccine target selection. NPJ Vaccines 2024;9:15. [PMID: 38242890 PMCID: PMC10798987 DOI: 10.1038/s41541-023-00795-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/04/2023] [Accepted: 12/07/2023] [Indexed: 01/21/2024] Open

Krokidis MG, Dimitrakopoulos GN, Vrahatis AG, Exarchos TP, Vlamos P. Challenges and limitations in computational prediction of protein misfolding in neurodegenerative diseases. Front Comput Neurosci 2024;17:1323182. [PMID: 38250244 PMCID: PMC10796696 DOI: 10.3389/fncom.2023.1323182] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/17/2023] [Accepted: 12/19/2023] [Indexed: 01/23/2024] Open

Xu B, Chen Y, Xue W. Computational Protein Design - Where it goes? Curr Med Chem 2024;31:2841-2854. [PMID: 37272467 DOI: 10.2174/0929867330666230602143700] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/19/2022] [Revised: 02/18/2023] [Accepted: 03/15/2023] [Indexed: 06/06/2023]

Wang J, Chen C, Yao G, Ding J, Wang L, Jiang H. Intelligent Protein Design and Molecular Characterization Techniques: A Comprehensive Review. Molecules 2023;28:7865. [PMID: 38067593 PMCID: PMC10707872 DOI: 10.3390/molecules28237865] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/21/2023] [Revised: 11/13/2023] [Accepted: 11/23/2023] [Indexed: 12/18/2023] Open

Lategan FA, Schreiber C, Patterton HG. SeqPredNN: a neural network that generates protein sequences that fold into specified tertiary structures. BMC Bioinformatics 2023;24:373. [PMID: 37789284 PMCID: PMC10546711 DOI: 10.1186/s12859-023-05498-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/14/2022] [Accepted: 09/25/2023] [Indexed: 10/05/2023] Open

Abstract

BACKGROUND

The relationship between the sequence of a protein, its structure, and the resulting connection between its structure and function, is a foundational principle in biological science. Only recently has the computational prediction of protein structure based only on protein sequence been addressed effectively by AlphaFold, a neural network approach that can predict the majority of protein structures with X-ray crystallographic accuracy. A question that is now of acute relevance is the "inverse protein folding problem": predicting the sequence of a protein that folds into a specified structure. This will be of immense value in protein engineering and biotechnology, and will allow the design and expression of recombinant proteins that can, for instance, fold into specified structures as a scaffold for the attachment of recombinant antigens, or enzymes with modified or novel catalytic activities. Here we describe the development of SeqPredNN, a feed-forward neural network trained with X-ray crystallographic structures from the RCSB Protein Data Bank to predict the identity of amino acids in a protein structure using only the relative positions, orientations, and backbone dihedral angles of nearby residues.

RESULTS

We predict the sequence of a protein expected to fold into a specified structure and assess the accuracy of the prediction using both AlphaFold and RoseTTAFold to computationally generate the fold of the derived sequence. We show that the sequences predicted by SeqPredNN fold into a structure with a median TM-score of 0.638 when compared to the crystal structure according to AlphaFold predictions, yet these sequences are unique and only 28.4% identical to the sequence of the crystallized protein.

CONCLUSIONS

We propose that SeqPredNN will be a valuable tool to generate proteins of defined structure for the design of novel biomaterials, pharmaceuticals, catalysts, and reporter systems. The low sequence identity of its predictions compared to the native sequence could prove useful for developing proteins with modified physical properties, such as water solubility and thermal stability. The speed and ease of use of SeqPredNN offers a significant advantage over physics-based protein design methods.

Wu F, Wu L, Radev D, Xu J, Li SZ. Integration of pre-trained protein language models into geometric deep learning networks. Commun Biol 2023;6:876. [PMID: 37626165 PMCID: PMC10457366 DOI: 10.1038/s42003-023-05133-1] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/23/2023] [Accepted: 07/11/2023] [Indexed: 08/27/2023] Open

Ichikawa DM, Abdin O, Alerasool N, Kogenaru M, Mueller AL, Wen H, Giganti DO, Goldberg GW, Adams S, Spencer JM, Razavi R, Nim S, Zheng H, Gionco C, Clark FT, Strokach A, Hughes TR, Lionnet T, Taipale M, Kim PM, Noyes MB. A universal deep-learning model for zinc finger design enables transcription factor reprogramming. Nat Biotechnol 2023;41:1117-1129. [PMID: 36702896 PMCID: PMC10421740 DOI: 10.1038/s41587-022-01624-4] [Citation(s) in RCA: 36] [Impact Index Per Article: 18.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/05/2021] [Accepted: 11/17/2022] [Indexed: 01/27/2023]

Affiliation(s)

David M Ichikawa Institute for Systems Genetics, NYU Grossman School of Medicine, New York, NY, USA Department of Biochemistry and Molecular Pharmacology, NYU Grossman School of Medicine, New York, NY, USA
Osama Abdin Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
Nader Alerasool Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario, Canada
Manjunatha Kogenaru Institute for Systems Genetics, NYU Grossman School of Medicine, New York, NY, USA
April L Mueller Institute for Systems Genetics, NYU Grossman School of Medicine, New York, NY, USA
Han Wen Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario, Canada
David O Giganti Institute for Systems Genetics, NYU Grossman School of Medicine, New York, NY, USA
Gregory W Goldberg Institute for Systems Genetics, NYU Grossman School of Medicine, New York, NY, USA
Samantha Adams Institute for Systems Genetics, NYU Grossman School of Medicine, New York, NY, USA
Jeffrey M Spencer Institute for Systems Genetics, NYU Grossman School of Medicine, New York, NY, USA
Rozita Razavi Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario, Canada
Satra Nim Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario, Canada
Hong Zheng Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario, Canada
Courtney Gionco Institute for Systems Genetics, NYU Grossman School of Medicine, New York, NY, USA
Finnegan T Clark Institute for Systems Genetics, NYU Grossman School of Medicine, New York, NY, USA
Alexey Strokach Department of Computer Science, University of Toronto, Toronto, Ontario, Canada
Timothy R Hughes Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario, Canada
Timothee Lionnet Institute for Systems Genetics, NYU Grossman School of Medicine, New York, NY, USA
Mikko Taipale Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario, Canada
Philip M Kim Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario, Canada Department of Computer Science, University of Toronto, Toronto, Ontario, Canada
Marcus B Noyes Institute for Systems Genetics, NYU Grossman School of Medicine, New York, NY, USA Department of Biochemistry and Molecular Pharmacology, NYU Grossman School of Medicine, New York, NY, USA

Jin W, Brannan KW, Kapeli K, Park SS, Tan HQ, Gosztyla ML, Mujumdar M, Ahdout J, Henroid B, Rothamel K, Xiang JS, Wong L, Yeo GW. HydRA: Deep-learning models for predicting RNA-binding capacity from protein interaction association context and protein sequence. Mol Cell 2023;83:2595-2611.e11. [PMID: 37421941 PMCID: PMC11098078 DOI: 10.1016/j.molcel.2023.06.019] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/03/2023] [Revised: 03/20/2023] [Accepted: 06/13/2023] [Indexed: 07/10/2023]

Affiliation(s)

Wenhao Jin Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, CA, USA; Institute for Genomic Medicine and UCSD Stem Cell Program, University of California, San Diego, La Jolla, CA, USA; Stem Cell Program, University of California, San Diego, La Jolla, CA, USA
Kristopher W Brannan Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, CA, USA; Institute for Genomic Medicine and UCSD Stem Cell Program, University of California, San Diego, La Jolla, CA, USA; Stem Cell Program, University of California, San Diego, La Jolla, CA, USA
Katannya Kapeli Department of Physiology, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
Samuel S Park Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, CA, USA; Institute for Genomic Medicine and UCSD Stem Cell Program, University of California, San Diego, La Jolla, CA, USA; Stem Cell Program, University of California, San Diego, La Jolla, CA, USA
Hui Qing Tan Department of Physiology, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
Maya L Gosztyla Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, CA, USA; Institute for Genomic Medicine and UCSD Stem Cell Program, University of California, San Diego, La Jolla, CA, USA; Stem Cell Program, University of California, San Diego, La Jolla, CA, USA
Mayuresh Mujumdar Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, CA, USA; Institute for Genomic Medicine and UCSD Stem Cell Program, University of California, San Diego, La Jolla, CA, USA; Stem Cell Program, University of California, San Diego, La Jolla, CA, USA
Joshua Ahdout Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, CA, USA; Institute for Genomic Medicine and UCSD Stem Cell Program, University of California, San Diego, La Jolla, CA, USA; Stem Cell Program, University of California, San Diego, La Jolla, CA, USA
Bryce Henroid Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, CA, USA; Institute for Genomic Medicine and UCSD Stem Cell Program, University of California, San Diego, La Jolla, CA, USA; Stem Cell Program, University of California, San Diego, La Jolla, CA, USA
Katherine Rothamel Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, CA, USA; Institute for Genomic Medicine and UCSD Stem Cell Program, University of California, San Diego, La Jolla, CA, USA; Stem Cell Program, University of California, San Diego, La Jolla, CA, USA
Joy S Xiang Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, CA, USA; Institute for Genomic Medicine and UCSD Stem Cell Program, University of California, San Diego, La Jolla, CA, USA; Stem Cell Program, University of California, San Diego, La Jolla, CA, USA
Limsoon Wong Department of Computer Science, National University of Singapore, Singapore, Singapore
Gene W Yeo Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, CA, USA; Institute for Genomic Medicine and UCSD Stem Cell Program, University of California, San Diego, La Jolla, CA, USA; Stem Cell Program, University of California, San Diego, La Jolla, CA, USA

Yan J, Li S, Zhang Y, Hao A, Zhao Q. ZetaDesign: an end-to-end deep learning method for protein sequence design and side-chain packing. Brief Bioinform 2023;24:bbad257. [PMID: 37429578 DOI: 10.1093/bib/bbad257] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/28/2023] [Revised: 06/05/2023] [Accepted: 06/21/2023] [Indexed: 07/12/2023] Open

McFee M, Kim PM. GDockScore: a graph-based protein-protein docking scoring function. BIOINFORMATICS ADVANCES 2023;3:vbad072. [PMID: 37359726 PMCID: PMC10290236 DOI: 10.1093/bioadv/vbad072] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 04/14/2023] [Revised: 05/30/2023] [Accepted: 06/10/2023] [Indexed: 06/28/2023]

Zhang H, Li X, Li Z, Huang D, Zhang L. Estimation of Particle Location in Granular Materials Based on Graph Neural Networks. MICROMACHINES 2023;14:714. [PMID: 37420946 DOI: 10.3390/mi14040714] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 02/21/2023] [Revised: 03/20/2023] [Accepted: 03/21/2023] [Indexed: 07/09/2023]

Omar SI, Keasar C, Ben-Sasson AJ, Haber E. Protein Design Using Physics Informed Neural Networks. Biomolecules 2023;13:457. [PMID: 36979392 PMCID: PMC10046838 DOI: 10.3390/biom13030457] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/18/2023] [Revised: 02/16/2023] [Accepted: 02/27/2023] [Indexed: 03/06/2023] Open

Yuan Y, Xin K, Liu J, Zhao P, Lu MP, Yan Y, Hu Y, Huo H, Li Z, Fang T. A GNN-based model for capturing spatio-temporal changes in locomotion behaviors of aging C. elegans. Comput Biol Med 2023;155:106694. [PMID: 36812812 DOI: 10.1016/j.compbiomed.2023.106694] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/01/2022] [Revised: 01/27/2023] [Accepted: 02/14/2023] [Indexed: 02/17/2023]

Li AJ, Lu M, Desta I, Sundar V, Grigoryan G, Keating AE. Neural network-derived Potts models for structure-based protein design using backbone atomic coordinates and tertiary motifs. Protein Sci 2023;32:e4554. [PMID: 36564857 PMCID: PMC9854172 DOI: 10.1002/pro.4554] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/18/2022] [Revised: 11/15/2022] [Accepted: 12/20/2022] [Indexed: 12/25/2022]

Castorina LV, Petrenas R, Subr K, Wood CW. PDBench: evaluating computational methods for protein-sequence design. Bioinformatics 2023;39:btad027. [PMID: 36637198 PMCID: PMC9869650 DOI: 10.1093/bioinformatics/btad027] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/30/2022] [Revised: 11/14/2022] [Accepted: 01/12/2023] [Indexed: 01/14/2023] Open

Nallasamy V, Seshiah M. Energy Profile Bayes and Thompson Optimized Convolutional Neural Network protein structure prediction. Neural Comput Appl 2023;35:1983-2006. [PMID: 36245797 PMCID: PMC9542649 DOI: 10.1007/s00521-022-07868-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/06/2021] [Accepted: 09/21/2022] [Indexed: 01/12/2023]

Abstract

In living organisms, proteins are considered as the executants of biological functions. Owing to its pivotal role played in protein folding patterns, comprehension of protein structure is a challenging issue. Moreover, owing to numerous protein sequence exploration in protein data banks and complication of protein structures, experimental methods are found to be inadequate for protein structural class prediction. Hence, it is very much advantageous to design a reliable computational method to predict protein structural classes from protein sequences. In the recent few years there has been an elevated interest in using deep learning to assist protein structure prediction as protein structure prediction models can be utilized to screen a large number of novel sequences. In this regard, we propose a model employing Energy Profile for atom pairs in conjunction with the Legion-Class Bayes function called Energy Profile Legion-Class Bayes Protein Structure Identification model. Followed by this, we use a Thompson Optimized convolutional neural network to extract features between amino acids and then the Thompson Optimized SoftMax function is employed to extract associations between protein sequences for predicting secondary protein structure. The proposed Energy Profile Bayes and Thompson Optimized Convolutional Neural Network (EPB-OCNN) method tested distinct unique protein data and was compared to the state-of-the-art methods, the Template-Based Modeling, Protein Design using Deep Graph Neural Networks, a deep learning-based S-glutathionylation sites prediction tool called a Computational Framework, the Deep Learning and a distance-based protein structure prediction using deep learning. The results obtained when applied with the Biopython tool with respect to protein structure prediction time, protein structure prediction accuracy, specificity, recall, F-measure, and precision, respectively, are measured. The proposed EPB-OCNN method outperformed the state-of-the-art methods, thereby corroborating the objective.

Durairaj J, de Ridder D, van Dijk AD. Beyond sequence: Structure-based machine learning. Comput Struct Biotechnol J 2022;21:630-643. [PMID: 36659927 PMCID: PMC9826903 DOI: 10.1016/j.csbj.2022.12.039] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/26/2022] [Revised: 12/21/2022] [Accepted: 12/21/2022] [Indexed: 12/31/2022]

Liu J, Zhang C, Lai L. GeoPacker: A novel deep learning framework for protein side-chain modeling. Protein Sci 2022;31:e4484. [PMID: 36309961 PMCID: PMC9667900 DOI: 10.1002/pro.4484] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/11/2022] [Revised: 10/23/2022] [Accepted: 10/26/2022] [Indexed: 12/13/2022]

Ferruz N, Heinzinger M, Akdel M, Goncearenco A, Naef L, Dallago C. From sequence to function through structure: Deep learning for protein design. Comput Struct Biotechnol J 2022;21:238-250. [PMID: 36544476 PMCID: PMC9755234 DOI: 10.1016/j.csbj.2022.11.014] [Citation(s) in RCA: 44] [Impact Index Per Article: 14.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/31/2022] [Revised: 11/05/2022] [Accepted: 11/05/2022] [Indexed: 11/20/2022]

Liu H, Chen Q. Computational protein design with data‐driven approaches: Recent developments and perspectives. WIRES COMPUTATIONAL MOLECULAR SCIENCE 2022. [DOI: 10.1002/wcms.1646] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/18/2022]

Gill ML. The rise of the machines in chemistry. MAGNETIC RESONANCE IN CHEMISTRY : MRC 2022;60:1044-1051. [PMID: 35976263 DOI: 10.1002/mrc.5304] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 12/27/2021] [Revised: 08/07/2022] [Accepted: 08/09/2022] [Indexed: 06/15/2023]