1. Liu J, Neupane P, Cheng J. Improving AlphaFold2- and AlphaFold3-Based Protein Complex Structure Prediction With MULTICOM4 in CASP16. Proteins 2025. [PMID: 40452318 DOI: 10.1002/prot.26850] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/09/2025] [Revised: 05/06/2025] [Accepted: 05/23/2025] [Indexed: 06/11/2025]
Abstract
With AlphaFold achieving high-accuracy tertiary structure prediction for most single-chain proteins (monomers), the next major challenge in protein structure prediction is to accurately model multichain protein complexes (multimers). We developed MULTICOM4, the latest version of the MULTICOM system, to improve protein complex structure prediction by integrating transformer-based AlphaFold2, diffusion model-based AlphaFold3, and our in-house techniques. These include protein complex stoichiometry prediction, diverse multiple sequence alignment (MSA) generation leveraging both sequence and structure comparison, modeling exception handling, and deep learning-based protein model quality assessment. MULTICOM4 was blindly evaluated in the 16th Critical Assessment of Techniques for Protein Structure Prediction (CASP16) in 2024. In Phase 0 of CASP16, where stoichiometry information was unavailable, MULTICOM predictors performed best, with MULTICOM_human achieving a TM-score of 0.752 and a DockQ score of 0.584 for top-ranked predictions on average. In Phase 1 of CASP16, with stoichiometry information provided, MULTICOM_human remained among the top predictors, attaining a TM-score of 0.797 and a DockQ score of 0.558 on average. The CASP16 results demonstrate that integrating complementary AlphaFold2 and AlphaFold3 with enhanced MSA inputs, comprehensive model ranking, exception handling, and accurate stoichiometry prediction can effectively improve protein complex structure prediction.
Affiliation(s)
- Jian Liu
  - Department of Electrical Engineering & Computer Science, NextGen Precision Health, University of Missouri, Columbia, Missouri, USA
- Pawan Neupane
  - Department of Electrical Engineering & Computer Science, NextGen Precision Health, University of Missouri, Columbia, Missouri, USA
- Jianlin Cheng
  - Department of Electrical Engineering & Computer Science, NextGen Precision Health, University of Missouri, Columbia, Missouri, USA
2. Dumas N, Portelli G, Ji Y, Dupont F, Jendoubi M, Lalli E. Detection of protein structural hotspots using AI distillation and explainability: application to the DAX-1 protein. NAR Genom Bioinform 2025; 7:lqaf047. [PMID: 40264682 PMCID: PMC12012785 DOI: 10.1093/nargab/lqaf047] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/13/2024] [Revised: 03/26/2025] [Accepted: 04/10/2025] [Indexed: 04/24/2025] [Open Access]
Abstract
AlphaMissense is a valuable resource for discerning important functional regions within proteins, providing pathogenicity heatmaps that highlight the pathogenic risk of specific mutations along the protein sequence. However, due to protein folding and long-range interactions, the actual structural alterations with functional implications may be occurring at a distance from the mutation site. As a result, the identification of the most sensitive structural regions for protein function may be hampered by the presence of mutations that indirectly affect the critical regions from a distance. In this study, we illustrate how the use of AlphaMissense predictions to train an XGBoost regression model on structural features extracted from the structures of protein variants predicted by OmegaFold enables the definition of a new explainability metric: a residue-based importance score that highlights the most critical structural domains within a protein sequence. To verify the accuracy of our approach, we applied it to the extensively studied protein DAX-1 and successfully identified critical structural domains. Notably, as this score only requires knowledge of the protein's amino acid sequence, it is valuable in guiding experimental investigations aimed at discovering functionally crucial regions in proteins that have been poorly characterized.
Affiliation(s)
- Noé Dumas
  - Thales SA, Thales Services Numériques, 06560 Valbonne—Sophia Antipolis, France
- Geoffrey Portelli
  - Thales SA, Thales Services Numériques, 06560 Valbonne—Sophia Antipolis, France
- Yang Ji
  - Thales SA, Thales Services Numériques, 06560 Valbonne—Sophia Antipolis, France
- Florent Dupont
  - Thales SA, Thales Services Numériques, 06560 Valbonne—Sophia Antipolis, France
- Mehdi Jendoubi
  - Thales SA, Thales Services Numériques, 06560 Valbonne—Sophia Antipolis, France
- Enzo Lalli
  - Centre National de la Recherche Scientifique, Institut de Pharmacologie Moléculaire et Cellulaire, 06560 Valbonne—Sophia Antipolis, France
  - Institut national de la santé et de la recherche médicale, Institut de Pharmacologie Moléculaire et Cellulaire, 06560 Valbonne—Sophia Antipolis, France
  - Université Côte d’Azur, Institut de Pharmacologie Moléculaire et Cellulaire, 06560 Valbonne—Sophia Antipolis, France
3. d'Errico A, Vonk PJ, Wösten HAB, Lugones LG. Transposition of a non-autonomous element into the Gβ gene of Schizophyllum commune causes the streak mutation. Fungal Genet Biol 2025; 179:104007. [PMID: 40447071 DOI: 10.1016/j.fgb.2025.104007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/31/2025] [Revised: 05/21/2025] [Accepted: 05/24/2025] [Indexed: 06/11/2025]
Abstract
Streak mutants of Schizophyllum commune are characterized by ropy, hyperbranching hyphae, suppressed aerial hyphae formation, and the production of pigments. Additionally, these mutants dikaryotize unilaterally, with the mutant fertilizing its compatible mating partner but not accepting its nucleus. Here we show that a 512 bp non-autonomous transposable element had integrated in the Gβ gene of a streak mutant of S. commune. This element has the same 50 bp inverted repeat as an autonomous element, dubbed the Bike transposon. Its transposase has homologues in various Agaricomycetes. Introducing the Gβ gene in the streak mutant restored the wild-type phenotype, showing that the integration of the 512 bp element in the Gβ gene is responsible for the streak phenotype.
Affiliation(s)
- Antonio d'Errico
  - Microbiology, Department of Biology, Utrecht University, Padualaan 8, 3584 CH, Utrecht, the Netherlands
- Peter Jan Vonk
  - Microbiology, Department of Biology, Utrecht University, Padualaan 8, 3584 CH, Utrecht, the Netherlands
- Han A B Wösten
  - Microbiology, Department of Biology, Utrecht University, Padualaan 8, 3584 CH, Utrecht, the Netherlands
- Luis G Lugones
  - Microbiology, Department of Biology, Utrecht University, Padualaan 8, 3584 CH, Utrecht, the Netherlands
4. Dennler O, Ryan CJ. Evaluating sequence and structural similarity metrics for predicting shared paralog functions. NAR Genom Bioinform 2025; 7:lqaf051. [PMID: 40290317 PMCID: PMC12034104 DOI: 10.1093/nargab/lqaf051] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/13/2024] [Revised: 03/07/2025] [Accepted: 04/15/2025] [Indexed: 04/30/2025] [Open Access]
Abstract
Gene duplication is the primary source of new genes, resulting in most genes having identifiable paralogs. Over time, paralog pairs may diverge in some respects but many retain the ability to perform the same functional role. Protein sequence identity is often used as a proxy for functional similarity and can predict shared functions between paralogs as revealed by synthetic lethal experiments. However, the advent of alternative protein representations, including embeddings from protein language models (PLMs) and predicted structures from AlphaFold, raises the possibility that alternative similarity metrics could better capture functional similarity between paralogs. Here, using two species (budding yeast and human) and two different definitions of shared functionality (shared protein-protein interactions and synthetic lethality), we evaluated a variety of alternative similarity metrics. For some tasks, predicted structural similarity or PLM similarity outperform sequence identity, but more importantly these similarity metrics are not redundant with sequence identity, i.e. combining them with sequence identity leads to improved predictions of shared functionality. By adding contextual features, representing similarity to homologous proteins within and across species, we can significantly enhance our predictions of shared paralog functionality. Overall, our results suggest that alternative similarity metrics capture complementary aspects of functional similarity beyond sequence identity alone.
Affiliation(s)
- Olivier Dennler
  - School of Medicine, University College Dublin, Dublin 4, D04 V1W8, Ireland
  - School of Computer Science, University College Dublin, Dublin 4, D04 V1W8, Ireland
  - Conway Institute, University College Dublin, Dublin 4, D04 V1W8, Ireland
- Colm J Ryan
  - School of Medicine, University College Dublin, Dublin 4, D04 V1W8, Ireland
  - School of Computer Science, University College Dublin, Dublin 4, D04 V1W8, Ireland
  - Conway Institute, University College Dublin, Dublin 4, D04 V1W8, Ireland
5. Ma Z, Yang J. DeepUSPS: Deep Learning-Empowered Unconstrained-Structural Protein Sequence Design. Proteins 2025. [PMID: 40448386 DOI: 10.1002/prot.26847] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/08/2025] [Revised: 04/23/2025] [Accepted: 05/16/2025] [Indexed: 06/02/2025]
Abstract
Current unconstrained-structural protein sequence design models suffer from low optimization efficiency, and the proteins they generate exhibit significant similarity to natural proteins and low thermal stability. To address these challenges, we propose the Deep Learning-Empowered Unconstrained-Structural Protein Sequence Design (DeepUSPS) model. To address the inadequate thermal stability, we employ the Inverted Dense Residual Network (IDRNet). To mitigate the similarity of the designed proteins to natural proteins, we construct the Sequence-Pairwise Features Extraction Synthetic Network (SPFESN). Furthermore, we introduce the Warm Restart AngularGrad (WRA) optimizer to optimize the 3D Position-Specific Scoring Matrix (3Dpssm) for unconstrained-structural protein sequences, requiring only 2100 update iterations (140.36 min) to generate idealization (IDE) protein sequences. We obtained a total of 1000 IDE protein sequences and evaluated them in silico in terms of similarity, clarity and iterations, thermal stability, spatial distribution of similarity, and predicted local-distance difference test (pLDDT) confidence. Notably, the mean lg(E-value) for IDE protein sequences reached -0.051, the mean TM-score for IDE protein structures reached 0.594, only 2100 iterations were needed, and the mean Tm (melting point) for thermal stability reached 74.78°C. The average pLDDT value for 3D structures reached 76. Additionally, the 3D structures of the IDE proteins are of diverse types. These in silico results conclusively demonstrate the superior performance of DeepUSPS compared with Hallucinate.
Affiliation(s)
- Zhichong Ma
  - College of Publishing, University of Shanghai for Science and Technology, Shanghai, China
- Jiawen Yang
  - College of Publishing, University of Shanghai for Science and Technology, Shanghai, China
6. Techasen A, Worasith C, Muengsaen D, Ponglong J, Mahalapbutr P, Kongtaworn N, Rungrotmongkol T, Khongsukwiwat K, Wongphutorn P, Wangboon C, Homwonk C, Chaiyadet S, Laha T, Suttiprapa S, Sakonsinsiri C, Sithithaworn P, Thanan R. Identification and characterization of a target antigen recognized by the monoclonal antibody against Opisthorchis viverrini. PLoS One 2025; 20:e0324137. [PMID: 40440420 PMCID: PMC12121735 DOI: 10.1371/journal.pone.0324137] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/13/2024] [Accepted: 04/22/2025] [Indexed: 06/02/2025] [Open Access]
Abstract
Opisthorchis viverrini (Ov) infection causes opisthorchiasis, which poses an important risk for the development of cholangiocarcinoma (CCA). Therefore, it is crucial to focus on the primary prevention and control of opisthorchiasis in order to control CCA effectively in Thailand and other endemic regions. A recent diagnostic method of antigen detection using monoclonal antibody-based enzyme-linked immunosorbent assay (mAb-ELISA) has the potential for rapid mass screening of opisthorchiasis. Nevertheless, the specific antigen(s) in Ov adult worms recognized by the mAb have not been determined. In this study, we aimed to identify and characterize the target molecule of our in-house Ov-specific monoclonal antibody (mAb KKU505). The specific antigenic band formed by the reaction of Ov adult worm extract and mAb KKU505 was detected using western blot analysis. The protein band was identified as the myosin heavy chain of Ov using LC-MS/MS analysis. The reactivity of the recombinant full-length myosin heavy chain (rMHC) was comparable to that of the crude Ov antigen when evaluated using mAb-ELISA at similar protein concentrations. Moreover, the binding ability between the Ov myosin head domain and mAb KKU505 was confirmed using in silico analysis. The results reported here indicate that rMHC could potentially substitute for Ov crude antigen in antigen detection by mAb-ELISA and as a positive control for Ov-strip in lateral flow assays, thereby avoiding the use of laboratory animals for the production of Ov adult worms.
Affiliation(s)
- Anchalee Techasen
  - Cholangiocarcinoma Research Institute, Khon Kaen University, Khon Kaen, Thailand
  - Faculty of Associated Medical Sciences, Khon Kaen University, Khon Kaen, Thailand
- Chanika Worasith
  - Cholangiocarcinoma Research Institute, Khon Kaen University, Khon Kaen, Thailand
  - Department of Adult Nursing, Faculty of Nursing, Khon Kaen University, Khon Kaen, Thailand
- Duangkamon Muengsaen
  - Cholangiocarcinoma Research Institute, Khon Kaen University, Khon Kaen, Thailand
- Jiraprapa Ponglong
  - Cholangiocarcinoma Research Institute, Khon Kaen University, Khon Kaen, Thailand
- Panupong Mahalapbutr
  - Department of Biochemistry, Faculty of Medicine, Khon Kaen University, Khon Kaen, Thailand
- Napat Kongtaworn
  - Program in Bioinformatics and Computational Biology, Graduate School, Chulalongkorn University, Bangkok, Thailand
- Thanyada Rungrotmongkol
  - Program in Bioinformatics and Computational Biology, Graduate School, Chulalongkorn University, Bangkok, Thailand
  - Center of Excellence in Structural and Computational Biology, Department of Biochemistry, Faculty of Science, Chulalongkorn University, Bangkok, Thailand
- Kanoknan Khongsukwiwat
  - Cholangiocarcinoma Research Institute, Khon Kaen University, Khon Kaen, Thailand
  - Department of Parasitology, Faculty of Medicine, Khon Kaen University, Khon Kaen, Thailand
- Phattharaphon Wongphutorn
  - Cholangiocarcinoma Research Institute, Khon Kaen University, Khon Kaen, Thailand
  - Department of Parasitology, Faculty of Medicine, Khon Kaen University, Khon Kaen, Thailand
- Chompunoot Wangboon
  - School of Preclinic, Institute of Science, Suranaree University of Technology, Nakhon Ratchasima, Thailand
- Chutima Homwonk
  - Cholangiocarcinoma Research Institute, Khon Kaen University, Khon Kaen, Thailand
  - Department of Parasitology, Faculty of Medicine, Khon Kaen University, Khon Kaen, Thailand
- Sujittra Chaiyadet
  - Department of Tropical Medicine, Faculty of Medicine, Khon Kaen University, Khon Kaen, Thailand
- Thewarach Laha
  - Department of Parasitology, Faculty of Medicine, Khon Kaen University, Khon Kaen, Thailand
- Sutas Suttiprapa
  - Department of Tropical Medicine, Faculty of Medicine, Khon Kaen University, Khon Kaen, Thailand
- Chadamas Sakonsinsiri
  - Cholangiocarcinoma Research Institute, Khon Kaen University, Khon Kaen, Thailand
  - Department of Biochemistry, Faculty of Medicine, Khon Kaen University, Khon Kaen, Thailand
- Paiboon Sithithaworn
  - Cholangiocarcinoma Research Institute, Khon Kaen University, Khon Kaen, Thailand
  - Department of Parasitology, Faculty of Medicine, Khon Kaen University, Khon Kaen, Thailand
- Raynoo Thanan
  - Cholangiocarcinoma Research Institute, Khon Kaen University, Khon Kaen, Thailand
  - Department of Biochemistry, Faculty of Medicine, Khon Kaen University, Khon Kaen, Thailand
7. Macdonald JR, Arnold MS, Luth MR, Cihalova D, Quinn RJ, Winzeler EA, Lee MC, van Dooren GG, Maier AG, Skinner-Adams TS, Andrews KT, Fisher GM. Inner-mitochondrial membrane protein PfMPV17 is linked to P. falciparum in vitro resistance to the indoloquinolizidine alkaloid alstonine. J Antimicrob Chemother 2025:dkaf141. [PMID: 40432501 DOI: 10.1093/jac/dkaf141] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/29/2024] [Accepted: 04/27/2025] [Indexed: 05/29/2025] [Open Access]
Abstract
BACKGROUND: There are an estimated 260 million malaria cases and ∼600 000 deaths annually. Challenges to malaria eradication include the lack of highly effective and broadly applicable vaccines and parasite drug resistance. This is driving the need for new tools, including novel drugs and drug targets. The indoloquinolizidine alkaloid alstonine was previously shown to have in vitro activity against Plasmodium falciparum malaria parasites and a slow-action activity that is different from other slow-action antiplasmodial compounds such as clindamycin.
OBJECTIVES: To investigate the action of the antiplasmodial compound alstonine by validating a putative resistance mutation and determining whether the activity of alstonine is linked to the mitochondrial electron transport chain.
MATERIALS AND METHODS: In vitro evolution of resistance was used to generate alstonine-resistant P. falciparum, followed by whole-genome sequencing and CRISPR/Cas9 gene editing of wild-type parasites to validate a putative resistance-associated mutation. Links to mitochondrial function were assessed using oxygen consumption rate measurements and activity of alstonine in P. falciparum expressing the yeast dihydroorotate dehydrogenase.
RESULTS: P. falciparum parasites were selected with ∼20-fold reduced sensitivity to alstonine compared to wild-type parasites. Whole-genome sequencing of alstonine-resistant P. falciparum sub-clones identified several mutations including a copy number variation and point mutation (A318P) in a gene encoding a putative inner-mitochondrial membrane protein (PfMPV17). Introduction of the A318P mutation into the PfMPV17 gene in wild-type P. falciparum yielded parasites with reduced alstonine sensitivity. While a direct link between alstonine action and mitochondrial respiratory function was not found, a transgenic P. falciparum line resistant to the cytochrome bc1 inhibitor atovaquone and pyrimidine synthesis inhibitor DSM265 had reduced sensitivity to alstonine.
CONCLUSIONS: These data demonstrate that PfMPV17 is linked to alstonine resistance and suggest that alstonine action is linked to the mitochondria and/or pyrimidine biosynthesis pathways.
Affiliation(s)
- J R Macdonald
  - Institute for Biomedicine and Glycomics, Griffith University, Brisbane, Queensland, Australia
- M S Arnold
  - Institute for Biomedicine and Glycomics, Griffith University, Brisbane, Queensland, Australia
- M R Luth
  - Department of Pediatrics, University of California, San Diego, USA
- D Cihalova
  - Research School of Biology, Australian National University, Canberra, ACT, Australia
- R J Quinn
  - Institute for Biomedicine and Glycomics, Griffith University, Brisbane, Queensland, Australia
- E A Winzeler
  - Department of Pediatrics, School of Medicine, and Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, USA
- M C Lee
  - Division of Biological Chemistry and Drug Discovery, University of Dundee, Dundee DD1 5EH, UK
- G G van Dooren
  - Research School of Biology, Australian National University, Canberra, ACT, Australia
- A G Maier
  - Research School of Biology, Australian National University, Canberra, ACT, Australia
- T S Skinner-Adams
  - Institute for Biomedicine and Glycomics, Griffith University, Brisbane, Queensland, Australia
- K T Andrews
  - Institute for Biomedicine and Glycomics, Griffith University, Brisbane, Queensland, Australia
- G M Fisher
  - Institute for Biomedicine and Glycomics, Griffith University, Brisbane, Queensland, Australia
8. Harmalkar A, Lyskov S, Gray JJ. Reliable protein-protein docking with AlphaFold, Rosetta, and replica exchange. eLife 2025; 13:RP94029. [PMID: 40424178 PMCID: PMC12113263 DOI: 10.7554/elife.94029] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/29/2025] [Open Access]
Abstract
Despite the recent breakthrough of AlphaFold (AF) in the field of protein sequence-to-structure prediction, modeling protein interfaces and predicting protein complex structures remains challenging, especially when there is a significant conformational change in one or both binding partners. Prior studies have demonstrated that AF-multimer (AFm) can predict accurate protein complexes in only up to 43% of cases (Yin et al., 2022). In this work, we combine AF as a structural template generator with a physics-based replica exchange docking algorithm to better sample conformational changes. Using a curated collection of 254 available protein targets with both unbound and bound structures, we first demonstrate that AF confidence measures (pLDDT) can be repurposed for estimating protein flexibility and docking accuracy for multimers. We incorporate these metrics within our ReplicaDock 2.0 protocol to complete a robust in silico pipeline for accurate protein complex structure prediction. AlphaRED (AlphaFold-initiated Replica Exchange Docking) successfully docks failed AF predictions, including 97 failure cases in Docking Benchmark Set 5.5. AlphaRED generates CAPRI acceptable-quality or better predictions for 63% of benchmark targets. Further, on a subset of antigen-antibody targets, which is challenging for AFm (20% success rate), AlphaRED demonstrates a success rate of 43%. This new strategy demonstrates the success possible by integrating deep learning-based architectures trained on evolutionary information with physics-based enhanced sampling. The pipeline is available at https://github.com/Graylab/AlphaRED.
Affiliation(s)
- Ameya Harmalkar
  - Department of Chemical and Biomolecular Engineering, The Johns Hopkins University, Baltimore, United States
- Sergey Lyskov
  - Department of Chemical and Biomolecular Engineering, The Johns Hopkins University, Baltimore, United States
- Jeffrey J Gray
  - Department of Chemical and Biomolecular Engineering, The Johns Hopkins University, Baltimore, United States
  - Program in Molecular Biophysics, The Johns Hopkins University, Baltimore, United States
  - Data Science and AI Institute, Johns Hopkins University, Baltimore, United States
9. Lin X, Chen Z, Li Y, Ma Z, Fan C, Cao Z, Feng S, Zhang J, Gao YQ. Unifying sequence-structure coding for advanced protein engineering via a multimodal diffusion transformer. Chem Sci 2025:d5sc02055g. [PMID: 40417294 PMCID: PMC12096517 DOI: 10.1039/d5sc02055g] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/16/2025] [Accepted: 05/14/2025] [Indexed: 05/27/2025] [Open Access]
Abstract
Modern protein engineering demands integrated sequence-structure representations to tackle key challenges in designing, modifying, and evolving proteins for specific functions. While sequence-based methods are promising for generating novel proteins, incorporating structure-oriented information improves the success rate and helps target corresponding functions. Therefore, rather than relying solely on sequence or structure-based approaches, a consensus strategy is essential. Here, we introduce ProTokens, machine-learned "amino acids" derived from structural databases via self-supervised learning, providing a compact yet information-rich representation that bridges sequence and structure modalities. Instead of treating sequences and structures separately, we build PT-DiT, a multimodal diffusion transformer-based model that integrates both into a unified representation, enabling protein engineering in a joint sequence-structure space, streamlining the design process and facilitating the efficient encoding of 3D folds, contextual protein design, sampling of metastable states, and directed evolution for diverse objectives. Therefore, as a unified solution for in silico protein engineering, PT-DiT leverages sequence and structure insights to realize functional protein design.
Affiliation(s)
- Xiaohan Lin
  - Beijing National Laboratory for Molecular Sciences, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
- Zhenyu Chen
  - Beijing National Laboratory for Molecular Sciences, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
- Yanheng Li
  - Beijing National Laboratory for Molecular Sciences, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
- Zicheng Ma
  - Changping Laboratory, Beijing 102200, China
  - Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
- Chuanliu Fan
  - Institute of Artificial Intelligence, Soochow University, Suzhou 215006, China
- Ziqiang Cao
  - Institute of Artificial Intelligence, Soochow University, Suzhou 215006, China
- Jun Zhang
  - Changping Laboratory, Beijing 102200, China
- Yi Qin Gao
  - Beijing National Laboratory for Molecular Sciences, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
  - Changping Laboratory, Beijing 102200, China
10. Balasco N, Modjtahedi N, Monti A, Ruvo M, Vitagliano L, Doti N. CHCHD4 Oxidoreductase Activity: A Comprehensive Analysis of the Molecular, Functional, and Structural Properties of Its Redox-Regulated Substrates. Molecules 2025; 30:2117. [PMID: 40430290 PMCID: PMC12114033 DOI: 10.3390/molecules30102117] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/21/2025] [Revised: 04/24/2025] [Accepted: 05/06/2025] [Indexed: 05/29/2025] [Open Access]
Abstract
The human CHCHD4 protein, which is a prototypical family member, carries a coiled-coil-helix-coiled-coil-helix motif that is stabilized by two disulfide bonds. Using its CPC sequence motif, CHCHD4 plays a key role in mitochondrial metabolism, cell survival, and response to stress conditions, controlling the mitochondrial import of diversified protein substrates that are specifically recognized through an interplay between covalent and non-covalent interactions. In the present review, we provide an updated and comprehensive analysis of CHCHD4 substrates controlled by its redox activities. A particular emphasis has been placed on the molecular and structural aspects of these partnerships. The literature survey has been integrated with the mining of structural databases reporting either experimental structures (Protein Data Bank) or structures predicted by AlphaFold, which provide protein three-dimensional models using machine learning-based approaches. In providing an updated view of the thirty-four CHCHD4 substrates that have been experimentally validated, our analyses highlight the notion that this protein can operate on a variety of structurally diversified substrates. Although in most cases, CHCHD4 plays a crucial role in the formation of disulfide bridges that stabilize helix-coil-helix motifs of its substrates, significant variations on this common theme are observed, especially for substrates that have been more recently identified.
Affiliation(s)
- Nicole Balasco
  - Institute of Molecular Biology and Pathology, National Research Council (CNR), Department of Chemistry, University of Rome Sapienza, Piazzale Aldo Moro 5, 00185 Rome, Italy
- Nazanine Modjtahedi
  - Unité Physiopathologie et Génétique du Neurone et du Muscle, UMR CNRS 5261, Inserm U1315, Université Claude Bernard Lyon 1, 69008 Lyon, France
- Alessandra Monti
  - Institute of Biostructures and Bioimaging, National Research Council (CNR), Via P. Castellino 111, 80131 Naples, Italy
- Menotti Ruvo
  - Institute of Biostructures and Bioimaging, National Research Council (CNR), Via P. Castellino 111, 80131 Naples, Italy
- Luigi Vitagliano
  - Institute of Biostructures and Bioimaging, National Research Council (CNR), Via P. Castellino 111, 80131 Naples, Italy
- Nunzianna Doti
  - Institute of Biostructures and Bioimaging, National Research Council (CNR), Via P. Castellino 111, 80131 Naples, Italy
11. Dos Santos TG, Melgarejo AS, Ligabue-Braun R, de Oliveira DL. Phylogenetic and Structural Analyses of Vesicular Glutamate Transporters. Mol Neurobiol 2025:10.1007/s12035-025-05012-2. [PMID: 40338457 DOI: 10.1007/s12035-025-05012-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/01/2024] [Accepted: 04/29/2025] [Indexed: 05/09/2025]
Abstract
Vesicular glutamate transporters are members of the solute carrier 17 (SLC17) family, and mammals express three closely related isoforms: vGluT1-3. While vGluT genes have been identified across various species in the Animalia kingdom, the evolutionary relationships and the natural history of vGluT members remain poorly understood. This study aimed to address these gaps by presenting a phylogenetic analysis of vGluTs across the animal kingdom. The study also included a detailed sequence analysis and structural modeling of vGluT isoforms among species. The phylogenetic tree revealed distinct clusters corresponding to vGluT isoforms 1, 2, and 3, with functional amino acid residues highly conserved among them. Invertebrate vGluTs emerged as the most divergent proteins, serving as the root of the tree. Sequence analysis confirmed the high conservation of the vGluT transmembrane core regions but identified high variation in the N- and C-terminal regions. Structural analysis revealed that AlphaFold2-predicted models had high confidence in the transmembrane domains, but exhibited limited local similarity in the N-terminal, C-terminal, and loop regions. On the other hand, the expected topology of these helices was accurately captured and positioned in the Swiss-Model-generated structures, with the functionally relevant residues precisely positioned in three-dimensional space. In conclusion, we expect that our findings will contribute to a deeper understanding of vesicular glutamate transporter structure and function, as well as their roles across distinct species and biological contexts.
Affiliation(s)
- Thainá Garbino Dos Santos
  - Laboratory of Neural Development, Department of Biochemistry, Instituto de Ciências Básicas da Saúde, Universidade Federal do Rio Grande do Sul (UFRGS), Rua Ramiro Barcelos 2600, Anexo, Porto Alegre, RS, 90035003, Brazil
- Alanis Silva Melgarejo
  - Laboratory of Neural Development, Department of Biochemistry, Instituto de Ciências Básicas da Saúde, Universidade Federal do Rio Grande do Sul (UFRGS), Rua Ramiro Barcelos 2600, Anexo, Porto Alegre, RS, 90035003, Brazil
- Rodrigo Ligabue-Braun
  - Department of Pharmacosciences and Graduate Program in Biosciences (PPGBio), Universidade Federal de Ciências da Saúde de Porto Alegre (UFCSPA), Porto Alegre, RS, Brazil
- Diogo Losch de Oliveira
  - Laboratory of Neural Development, Department of Biochemistry, Instituto de Ciências Básicas da Saúde, Universidade Federal do Rio Grande do Sul (UFRGS), Rua Ramiro Barcelos 2600, Anexo, Porto Alegre, RS, 90035003, Brazil
|
12
|
Balasco N, Esposito L, Vitagliano L. Structural Biology in the AlphaFold Era: How Far Is Artificial Intelligence from Deciphering the Protein Folding Code? Biomolecules 2025; 15:674. [PMID: 40427567 PMCID: PMC12109453 DOI: 10.3390/biom15050674] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/31/2025] [Revised: 04/24/2025] [Accepted: 05/02/2025] [Indexed: 05/29/2025] [Open Access]
Abstract
Proteins are biomolecules characterized by uncommon chemical and physicochemical complexities coupled with extreme responsiveness to even minor chemical modifications or environmental variations. Since the shape that proteins assume is fundamental for their function, understanding the chemical and structural bases that drive their three-dimensional structures represents the central problem for an atomic-level interpretation of biology. Not surprisingly, this question has progressively become the Holy Grail of structural biology (the folding problem). From this perspective, we initially describe and discuss the different formulations of the folding problem. In the present manuscript, the folding problem is framed from a historical perspective, effectively highlighting the progress made in the last lustrum. We chronologically summarize the major contributions that traditional methodologies provide in approaching this multifaceted problem. We then describe the recent advent and evolution of predictive approaches based on machine learning techniques that are revolutionizing the field by pointing out the potentialities and limitations of this approach. In the final part of the perspective, we illustrate the contribution that computational approaches will make in current structural biology to overcome the limitations of the reductionist approach of studying individual molecules to afford the atomic-level characterization of entire cellular compartments.
Affiliation(s)
- Nicole Balasco
  - Institute of Molecular Biology and Pathology, National Research Council (CNR), c/o Department Chemistry, Sapienza University of Rome, 00185 Rome, Italy
- Luciana Esposito
  - Institute of Biostructure and Bioimaging, Department of Biomedical Sciences, National Research Council (CNR), 80131 Naples, Italy
- Luigi Vitagliano
  - Institute of Biostructure and Bioimaging, Department of Biomedical Sciences, National Research Council (CNR), 80131 Naples, Italy
|
13
|
Maity D, Qiao B. AlloBench: A Data Set Pipeline for the Development and Benchmarking of Allosteric Site Prediction Tools. ACS Omega 2025; 10:17973-17982. [PMID: 40352555 PMCID: PMC12059942 DOI: 10.1021/acsomega.5c01263] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/10/2025] [Revised: 04/14/2025] [Accepted: 04/17/2025] [Indexed: 05/14/2025]
Abstract
Allostery refers to the activity regulation of biological macromolecules originating from the binding of an effector molecule at the allosteric site that is distant from the active site. The few existing allosteric data sets have not been updated with recent discoveries of allosteric proteins and are challenging to use for data-intensive tasks. Instead of providing another data set bound to become outdated, we present the AlloBench pipeline to create high-quality data sets of biomolecules with allosteric and active site information suitable for computational and data-driven studies of protein allostery. The pipeline produces a data set of 2141 allosteric sites from 2034 protein structures with 418 unique protein chains by integrating information from AlloSteric Database, UniProt, Mechanism and Catalytic Site Atlas, and Protein Data Bank. Furthermore, we use a subset of 100 proteins from the AlloBench data set to quantitatively compare the performance of currently available allosteric site prediction tools: APOP, PASSer, Ohm, ALLO, Allosite, STRESS, and AlloPred. Such a large-scale benchmarking of these programs has not been undertaken on a common test set. The results show a significant need for improvement, as the accuracy for all programs is well below 60%, with PASSer (Ensemble) outperforming the rest. The AlloBench pipeline will not only promote the development of improved allosteric site prediction tools but also serve as a reference for studying allostery in general.
Affiliation(s)
- Dibyajyoti Maity
  - Department of Natural Sciences, Baruch College, City University of New York, New York, New York 10010, United States
- Baofu Qiao
  - Department of Natural Sciences, Baruch College, City University of New York, New York, New York 10010, United States
|
14
|
Raghuraman P, Park S. Exploring the modulation of phosphorylation and SUMOylation-dependent NPR1 conformational switching on immune regulators TGA3 and WRKY70 through molecular simulation. Plant Physiology and Biochemistry: PPB 2025; 222:109711. [PMID: 40056739 DOI: 10.1016/j.plaphy.2025.109711] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/22/2024] [Revised: 02/12/2025] [Accepted: 02/24/2025] [Indexed: 03/10/2025]
Abstract
NPR1 (Nonexpressor of pathogenesis-related genes 1) is regulated by multisite phosphorylation and SUMOylation and serves as a master switch for effector-triggered plant immunity through a transcriptional activator (TGA3) and a repressor (WRKY70), both of which are experimentally well studied. However, the conformational relationship between the various phosphorylated and unphosphorylated states, and the role of SUMOylation in the functional switch, remain unclear. Using deep learning-based molecular modeling, docking, and multi-nanosecond simulations (totaling 2 μs) with end-state free energy calculations, we unveil how different phosphorylation states impact the dynamic stability of NPR1's four phospho-serine residues (Ser11, Ser15, Ser55, and Ser59) and the binding of TGA3-WRKY70 over SUMOylation. Results from our simulations show that the salicylic acid-induced P-Ser11/15NPR1-SUMO3 stabilizes helices and the flexible activation loop (α22Lys423 - α1Arg50 & L35Asp467-Arg51α51, and Gly27L3), thereby switching association with TGA3. The inter-helix salt bridges (L10Arg99-Glu323α9 and α14Glu280-Pro264L6) formed in the phosphorylated NPR1-SUMO3-TGA3 complex, which engage in tight control of conformational regulation, were disengaged in the unphosphorylated system. P-Ser55/59NPR1-SUMO3-WRKY70 reorients itself and forms electrostatic and hydrogen-bond interactions involving Lys145α7 - L2Asp26, L6Arg99 - Leu293L18 and Lys262L15 - Glu241L15, α13Val239 (α310), & L17Leu267, which keep the complex stable and quiescent compared to unphosphorylated NPR1-WRKY70. Subsequently, the essential dynamics and secondary structure analyses reveal that phosphorylation inhibits the α516 (long helix) formation and reduces the communication space between the 460α25-βturn3-α30-L42590 (NPR1) and 90L9-L10107 (SUMO3), making the binding more suitable for TGA3 (260βturn-L6270) and WRKY70 (230L15-L16265) via the activation loop. These findings, which reveal the atomic and structural details of NPR1's post-translational modification, will illuminate future investigations into enhancing plant immunity.
Affiliation(s)
- P Raghuraman
  - Department of Life Sciences, Yeungnam University, Gyeongsan, Gyeongsangbuk-do, 38541, Republic of Korea
- SeonJoo Park
  - Department of Life Sciences, Yeungnam University, Gyeongsan, Gyeongsangbuk-do, 38541, Republic of Korea
|
15
|
Jia F, Wang Y, Chen Z, Jin J, Zeng L, Zhang L, Tang H, Wang Y, Fan P. 10-Hydroxydec-2-enoic acid reduces vascular smooth muscle cell inflammation via interacting with Toll-like receptor 4. Phytomedicine: International Journal of Phytotherapy and Phytopharmacology 2025; 140:156534. [PMID: 40054182 DOI: 10.1016/j.phymed.2025.156534] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/22/2024] [Revised: 01/09/2025] [Accepted: 02/15/2025] [Indexed: 03/25/2025]
Abstract
BACKGROUND: 10-Hydroxydec-2-enoic acid (10-HDA), a unique marker compound in royal jelly, has a wide range of bioactivities. However, its role in regulating inflammation of vascular smooth muscle cells (VSMCs), which is central to a range of vascular diseases, is still unknown. PURPOSE: Our study aimed to investigate whether 10-HDA exerts an effect on VSMC inflammation by interacting with toll-like receptor 4 (TLR4), a pivotal inflammatory initiator. METHODS: A panel of proteins that might participate in TLR4-mediated signaling and are influenced by 10-HDA was analyzed in mouse VSMCs stimulated with angiotensin Ⅱ (Ang Ⅱ) or lipopolysaccharide (LPS). Accordingly, pro- and anti-inflammatory cytokines, reactive oxygen species (ROS), and antioxidants closely relevant to the inflammatory process were determined. The possible mode of 10-HDA interaction with TLR4 was also characterized. Moreover, the involvement of a key miRNA in 10-HDA regulation of VSMC inflammation was identified. RESULTS: In the presence of Ang Ⅱ, 10-HDA inhibited TLR4 expression in a dose-dependent manner. Under this condition, 10-HDA hindered the up-regulation of specificity protein 1 (SP1) and serine/threonine-protein phosphatase 6 catalytic subunit (PPP6C), the phosphorylation of extracellular signal-regulated kinase 1/2, TGF-β-activated kinase 1, and nuclear factor-κB p56, as well as the enhancement of myeloid differentiation primary response gene 88. Apart from SP1 and PPP6C, the changes in the levels of these proteins caused by 10-HDA were similar under LPS stimulation. The effect might result from 10-HDA blocking TLR4 through multiple atomic interactions. 10-HDA mitigated the increase in the pro-inflammatory cytokines tumor necrosis factor-alpha, interleukin-2 (IL-2), and IL-6, and increased the anti-inflammatory cytokine IL-10, in Ang Ⅱ- or LPS-induced VSMCs. Correspondingly, the level of ROS was attenuated and antioxidants such as glutathione and superoxide dismutase were fortified. The data indicated the anti-inflammatory potential of 10-HDA in VSMCs, which was associated with its capability of relieving oxidative stress. Additionally, the expression of miR-17-5p in Ang Ⅱ- or LPS-treated VSMCs was rescued by 10-HDA, which might be relevant to SP1 and PPP6C targeting. CONCLUSION: The present work revealed, for the first time, the ability of 10-HDA to alleviate VSMC inflammation by targeting TLR4 and thereby modulating the downstream inflammatory participants. Our data will shed light on the use of 10-HDA in VSMC inflammation-related vascular disorders.
MESH Headings
- Toll-Like Receptor 4/metabolism
- Animals
- Muscle, Smooth, Vascular/drug effects
- Muscle, Smooth, Vascular/cytology
- Muscle, Smooth, Vascular/metabolism
- Mice
- Lipopolysaccharides/pharmacology
- Inflammation/drug therapy
- Inflammation/metabolism
- Myocytes, Smooth Muscle/drug effects
- Myocytes, Smooth Muscle/metabolism
- Reactive Oxygen Species/metabolism
- Cytokines/metabolism
- Angiotensin II/pharmacology
- Signal Transduction/drug effects
- Cells, Cultured
- Anti-Inflammatory Agents/pharmacology
- NF-kappa B/metabolism
- Fatty Acids, Monounsaturated
Affiliation(s)
- Feng Jia
  - School of Biological Engineering, Henan University of Technology, Zhengzhou 450001, China
- Yongqing Wang
  - School of Biological Engineering, Henan University of Technology, Zhengzhou 450001, China
- Zhiqiang Chen
  - School of Biological Engineering, Henan University of Technology, Zhengzhou 450001, China
- Jingxian Jin
  - School of Biological Engineering, Henan University of Technology, Zhengzhou 450001, China
- Lei Zeng
  - School of Biological Engineering, Henan University of Technology, Zhengzhou 450001, China
- Li Zhang
  - School of Biological Engineering, Henan University of Technology, Zhengzhou 450001, China
- Huaijian Tang
  - School of Food and Strategic Reserves, Henan University of Technology, Zhengzhou 450001, China
- Yanyan Wang
  - School of Food and Strategic Reserves, Henan University of Technology, Zhengzhou 450001, China
- Pei Fan
  - School of Biological Engineering, Henan University of Technology, Zhengzhou 450001, China
|
16
|
Kim TD, Pretorius D, Murray JW, Cardona T. Exploring the Structural Diversity and Evolution of the D1 Subunit of Photosystem II Using AlphaFold and Foldtree. Physiologia Plantarum 2025; 177:e70284. [PMID: 40401773 PMCID: PMC12096807 DOI: 10.1111/ppl.70284] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/06/2025] [Revised: 04/24/2025] [Accepted: 04/29/2025] [Indexed: 05/23/2025]
Abstract
Although our knowledge of photosystem II has expanded to include time-resolved atomic details, the diversity of experimental structures of the enzyme remains limited. Recent advances in protein structure prediction with AlphaFold offer a promising approach to fill this gap in structural diversity in non-model systems. This study used AlphaFold to predict the structures of the D1 protein, the core subunit of photosystem II, across a broad range of photosynthetic organisms. The prediction produced high-confidence structures, and structural alignment analyses highlighted conserved regions across the different D1 groups, which were in line with high pLDDT scoring regions. In contrast, varying pLDDT in the DE loop and terminal regions appears to correlate with different degrees of structural flexibility or disorder. Subsequent structural phylogenetic analysis using Foldtree provided a tree that is in good agreement with previous sequence-based studies. Moreover, the phylogeny supports a parsimonious scenario in which far-red D1 and D1INT evolved from an ancestral form of G4 D1. This work demonstrates the potential of AlphaFold and Foldtree to study the molecular evolution of photosynthesis.
Affiliation(s)
- Tom Dongmin Kim
  - School of Biological and Behavioural Sciences, Queen Mary University of London, London, UK
  - Department of Life Sciences, Imperial College London, London, UK
- Tanai Cardona
  - School of Biological and Behavioural Sciences, Queen Mary University of London, London, UK
  - Department of Life Sciences, Imperial College London, London, UK
|
17
|
Ragucci S, Campanile MG, Russo V, Landi N, Hussain HZF, Canonico E, Russo R, Russo M, Arcella A, Chambery A, Di Maro A. Hortensin 4, main type 1 ribosome inactivating protein from red mountain spinach seeds: Structural characterization and biological action. Int J Biol Macromol 2025; 307:142085. [PMID: 40086539 DOI: 10.1016/j.ijbiomac.2025.142085] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/23/2025] [Revised: 03/07/2025] [Accepted: 03/12/2025] [Indexed: 03/16/2025]
Abstract
Here, we report the primary structure of hortensin 4, the main type 1 ribosome inactivating protein (RIP) isolated from Atriplex hortensis seeds. The complete sequence was obtained by mass spectrometry combined with Edman degradation. The amino acid sequence of hortensin 4 matches that of Atriplex patens type 1 RIP, deduced from the cDNA sequence (AC: ABJ90432.1). The protein consists of 254 amino acid residues, has no cysteinyl residues, and carries an N-acetylhexosamine chain at position Asn231. Structural studies (CD spectrum and 3D model) show a protein core typical of RIPs, and the amino acid residues of the active site are conserved. In addition, to gain insight into the protective effects of hortensin 4 against pathogens and its putative biotechnological applications, we evaluated its (i) N-glycosylase activity against tobacco mosaic virus (TMV) RNA; (ii) antifungal activity towards Trichoderma harzianum and Botrytis cinerea, exerted by damaging fungal ribosomes; and (iii) inhibition of human primary glioblastoma NULU cell proliferation, with cytotoxicity enhanced in the presence of temozolomide, used as a chemotherapeutic agent. Altogether, the multiple biological activities of hortensin 4 could be exploited both to improve resistance to various pathogens by engineering transgenic plants and to develop useful tools for cancer therapy.
Affiliation(s)
- Sara Ragucci
  - Department of Environmental, Biological and Pharmaceutical Sciences and Technologies (DiSTABiF), University of Campania 'Luigi Vanvitelli', Via Vivaldi 43, 81100 Caserta, Italy
- Maria Giuseppina Campanile
  - Department of Environmental, Biological and Pharmaceutical Sciences and Technologies (DiSTABiF), University of Campania 'Luigi Vanvitelli', Via Vivaldi 43, 81100 Caserta, Italy
- Veronica Russo
  - IRCCS Istituto Neurologico Mediterraneo 'NEUROMED', Via Atinense 18, 86077 Pozzilli, Italy
- Nicola Landi
  - Department of Environmental, Biological and Pharmaceutical Sciences and Technologies (DiSTABiF), University of Campania 'Luigi Vanvitelli', Via Vivaldi 43, 81100 Caserta, Italy
  - Institute of Crystallography, National Research Council of Italy, Via Vivaldi 43, 81100 Caserta, Italy
- Hafiza Z F Hussain
  - Department of Environmental, Biological and Pharmaceutical Sciences and Technologies (DiSTABiF), University of Campania 'Luigi Vanvitelli', Via Vivaldi 43, 81100 Caserta, Italy
- Enza Canonico
  - Department of Environmental, Biological and Pharmaceutical Sciences and Technologies (DiSTABiF), University of Campania 'Luigi Vanvitelli', Via Vivaldi 43, 81100 Caserta, Italy
- Rosita Russo
  - Department of Environmental, Biological and Pharmaceutical Sciences and Technologies (DiSTABiF), University of Campania 'Luigi Vanvitelli', Via Vivaldi 43, 81100 Caserta, Italy
- Miriam Russo
  - IRCCS Istituto Neurologico Mediterraneo 'NEUROMED', Via Atinense 18, 86077 Pozzilli, Italy
- Antonietta Arcella
  - IRCCS Istituto Neurologico Mediterraneo 'NEUROMED', Via Atinense 18, 86077 Pozzilli, Italy
- Angela Chambery
  - Department of Environmental, Biological and Pharmaceutical Sciences and Technologies (DiSTABiF), University of Campania 'Luigi Vanvitelli', Via Vivaldi 43, 81100 Caserta, Italy
- Antimo Di Maro
  - Department of Environmental, Biological and Pharmaceutical Sciences and Technologies (DiSTABiF), University of Campania 'Luigi Vanvitelli', Via Vivaldi 43, 81100 Caserta, Italy
|
18
|
dos Santos IC, de Souza RDS, Tolstoy I, Oliveira LS, Gruber A. Integrating Sequence- and Structure-Based Similarity Metrics for the Demarcation of Multiple Viral Taxonomic Levels. Viruses 2025; 17:642. [PMID: 40431654 PMCID: PMC12115509 DOI: 10.3390/v17050642] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/02/2025] [Revised: 04/23/2025] [Accepted: 04/25/2025] [Indexed: 05/29/2025] [Open Access]
Abstract
Viruses exhibit significantly greater diversity than cellular organisms, posing a complex challenge to their taxonomic classification. While primary sequences may diverge considerably, protein functional domains can maintain conserved 3D structures throughout evolution. Consequently, structural homology of viral proteins can reveal deep taxonomic relationships, overcoming limitations inherent in sequence-based methods. In this work, we introduce MPACT (Multimetric Pairwise Comparison Tool), an integrated tool that utilizes both sequence- and structure-based metrics. The program incorporates five metrics: sequence identity, similarity, maximum likelihood distance, TM-score, and 3Di-character similarity. MPACT generates heatmaps and distance trees to visualize viral relationships across multiple levels, enabling users to substantiate viral taxa demarcation. Taxa delineation can be achieved by specifying appropriate score cutoffs for each metric, facilitating the definition of viral groups, and storing their corresponding sequence data. By analyzing diverse viral datasets spanning various levels of divergence, we demonstrate MPACT's capability to reveal viral relationships, even among distantly related taxa. This tool provides a comprehensive approach to assist viral classification, exceeding the current methods by integrating multiple metrics and uncovering deeper evolutionary connections.
Affiliation(s)
- Igor C. dos Santos
  - Escola de Artes, Ciências e Humanidades, Universidade de São Paulo, São Paulo 038288-000, Brazil
- Igor Tolstoy
  - Argentys Informatics, LLC, 12 South Summit Avenue Suite 200, Gaithersburg, MD 20877, USA
- Liliane S. Oliveira
  - Department of Computer Science, Federal University of Technology of Paraná (UTFPR), Alberto Carazzai Avenue, 1640, Cornélio Procópio 86300-000, Brazil
- Arthur Gruber
  - Department of Parasitology, Instituto de Ciências Biomédicas, Universidade de São Paulo, São Paulo 05508-000, Brazil
  - European Virus Bioinformatics Center, Leutragraben 1, 07743 Jena, Germany
|
19
|
Walker DR, Fujimura G, Vanegas JM, Barbar EJ. Successful prediction of LC8 binding to intrinsically disordered proteins sheds light on AlphaFold's black box. Front Mol Biosci 2025; 12:1531793. [PMID: 40337642 PMCID: PMC12057147 DOI: 10.3389/fmolb.2025.1531793] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/20/2024] [Accepted: 02/24/2025] [Indexed: 05/09/2025] [Open Access]
Abstract
Introduction: LC8 is a hub protein involved in many processes, from tumor suppression and cell cycle regulation to neurotransmission and viral infection. Despite recent progress, prediction of binding sites for LC8 is plagued by motif variability and a multitude of weakly binding motifs, especially when binding depends on multivalency. Our binding site prediction algorithm, LC8Pred, has proven useful for uncovering new LC8 binders but is insufficient for finding all LC8 binding sites. Methods: To address this, we probed the ability of a general structure predictor, AlphaFold, to predict whether a given sequence binds to LC8. Certain combinations of built-in AlphaFold scores were extracted, and the score distributions of binders were compared with those of nonbinders. Results: AlphaFold successfully places proteins at the correct interface of LC8. A set of threshold values of built-in AlphaFold scores enables differentiation between known binders and nonbinders with a minimal false positive rate (8%) and an acceptable false negative rate (20%). This cutoff, along with a more inclusive cutoff, was used to predict elusive LC8 binding sites in proteins known to bind LC8. Discussion: Correlations between binding affinities and AlphaFold scores provide insight into the black box and indicate that AlphaFold learned an inaccurate energy function that nevertheless is useful for making inferences and conclusions about physical systems. Binding sites predicted by this method can be prioritized for investigation by comparing them with results from LC8Pred, local structure, and evolutionary conservation.
Affiliation(s)
- Elisar J. Barbar
  - Department of Biochemistry and Biophysics, Oregon State University, Corvallis, OR, United States
|
20
|
Yan L, Zhang Q, Liu D, Zhao W, Yu Z. Identification and molecular mechanism of novel salt-enhancing peptide in crocodile hemoglobin: a combined E-tongue, molecular docking, and dynamic simulation. Journal of the Science of Food and Agriculture 2025. [PMID: 40251916 DOI: 10.1002/jsfa.14289] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/04/2025] [Revised: 02/28/2025] [Accepted: 03/28/2025] [Indexed: 04/21/2025]
Abstract
BACKGROUND: This study aimed to reduce salt intake without compromising food sensory properties. Novel salt-enhancing peptides were identified from crocodile hemoglobin via virtual screening and evaluated for their salt-reducing effects using molecular docking, electronic tongue analysis, and molecular dynamics simulations. RESULTS: A total of 24 water-soluble and non-toxic peptides were obtained by virtual enzymolysis. The protein structure of human transmembrane channel-like 4 (TMC4), a novel salt taste receptor, was constructed using AlphaFold2 and used as the receptor. The salt-reducing effect of these peptides was verified using electronic tongue analysis, in which the peptide SSDDK had a significant salt-reducing effect. Molecular docking results showed that the main force for peptide binding to the TMC4 receptor was conventional hydrogen bonding, and Arg583, Arg330, and Glu284 were the key amino acid residues for its binding. Molecular dynamics simulations also verified the stability of peptide-receptor binding. CONCLUSION: This study demonstrates that the peptide SSDDK, derived from crocodile hemoglobin, can be used to enhance salty taste and reduce sodium salt use. © 2025 Society of Chemical Industry.
Affiliation(s)
- Linyuezhi Yan
  - School of Food Science and Engineering, Hainan University, Haikou, PR China
- Qian Zhang
  - College of Food Science and Engineering, Bohai University, Jinzhou, PR China
- Di Liu
  - College of Food Science and Engineering, Bohai University, Jinzhou, PR China
- Wenzhu Zhao
  - School of Food Science and Engineering, Hainan University, Haikou, PR China
- Zhipeng Yu
  - School of Food Science and Engineering, Hainan University, Haikou, PR China
|
21
|
Liu J, Neupane P, Cheng J. Improving AlphaFold2 and 3-based protein complex structure prediction with MULTICOM4 in CASP16. bioRxiv: The Preprint Server for Biology 2025:2025.03.06.641913. [PMID: 40161604 PMCID: PMC11952293 DOI: 10.1101/2025.03.06.641913] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/02/2025]
Abstract
With AlphaFold achieving high-accuracy tertiary structure prediction for most single-chain proteins (monomers), the next major challenge in protein structure prediction is accurately modeling multi-chain protein complexes (multimers). We developed MULTICOM4, the latest version of the MULTICOM system, to improve protein complex structure prediction by integrating transformer-based AlphaFold2, diffusion model-based AlphaFold3, and our in-house techniques. These include protein complex stoichiometry prediction, diverse multiple sequence alignment (MSA) generation leveraging both sequence and structure comparison, modeling exception handling, and deep learning-based model quality assessment. MULTICOM4 was blindly evaluated in the 16th community-wide Critical Assessment of Techniques for Protein Structure Prediction (CASP16) in 2024. In Phase 0 of CASP16, where stoichiometry information was unavailable, MULTICOM predictors performed best, with MULTICOM_human achieving a TM-score of 0.752 and a DockQ score of 0.584 for top-ranked predictions on average. In Phase 1 of CASP16, with stoichiometry information provided, MULTICOM_human remained among the top predictors, attaining a TM-score of 0.797 and a DockQ score of 0.558 on average. The CASP16 results demonstrate that integrating complementary AlphaFold2 and 3 with enhanced MSA inputs, comprehensive model ranking, exception handling, and accurate stoichiometry prediction can effectively improve protein complex structure prediction.
|
22
|
Rennie ML, Oliver MR. Emerging frontiers in protein structure prediction following the AlphaFold revolution. J R Soc Interface 2025; 22:20240886. [PMID: 40233800 PMCID: PMC11999738 DOI: 10.1098/rsif.2024.0886] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/12/2024] [Revised: 02/04/2025] [Accepted: 03/10/2025] [Indexed: 04/17/2025] [Open Access]
Abstract
Models of protein structures enable molecular understanding of biological processes. Current protein structure prediction tools lie at the interface of biology, chemistry and computer science. Millions of protein structure models have been generated in a very short space of time through a revolution in protein structure prediction driven by deep learning, led by AlphaFold. This has provided a wealth of new structural information. Interpreting these predictions is critical to determining where and when this information is useful. But proteins are not static nor do they act alone, and structures of proteins interacting with other proteins and other biomolecules are critical to a complete understanding of their biological function at the molecular level. This review focuses on the application of state-of-the-art protein structure prediction to these advanced applications. We also suggest a set of guidelines for reporting AlphaFold predictions.
|
23
|
Wlodawer A, Dauter Z, Rubach P, Minor W, Jaskolski M, Jiang Z, Jeffcott W, Anosova O, Kurlin V. Duplicate entries in the Protein Data Bank: how to detect and handle them. Acta Crystallogr D Struct Biol 2025; 81:170-180. [PMID: 40056147 PMCID: PMC11966240 DOI: 10.1107/s2059798325001883] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/13/2025] [Accepted: 02/27/2025] [Indexed: 03/10/2025] [Open Access]
Abstract
A global analysis of protein crystal structures in the Protein Data Bank (PDB) using a newly developed computational approach reveals many pairs with (nearly) identical main-chain coordinates. Such cases are identified and analyzed, showing that duplication is possible since the PDB does not currently have tools or mechanisms that would detect potentially duplicate submissions. Some duplicated entries represent modeling efforts of ligand binding that masquerade as experimentally determined structures. We propose that duplicate entries should either be obsoleted by the PDB or, as a minimum, marked with a clear 'CAVEAT' record that would alert potential users to the presence of such problems. We also suggest that using a tool for verifying the uniqueness of the deposited structure, such as that presented in this work, should become part of the routine validation procedure for new depositions.
Affiliation(s)
- Alexander Wlodawer
- Center for Structural Biology, Center for Cancer Research, National Cancer Institute, Frederick, MD 21702, USA
| | - Zbigniew Dauter
- Center for Structural Biology, Center for Cancer Research, National Cancer Institute, Frederick, MD 21702, USA
| | - Pawel Rubach
- Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, VA 22908, USA
- Institute of Information Systems and Digital Economy, Warsaw School of Economics, Warsaw, Poland
| | - Wladek Minor
- Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, VA 22908, USA
| | - Mariusz Jaskolski
- Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznań, Poland
- Department of Crystallography, Faculty of Chemistry, Adam Mickiewicz University, Poznań, Poland
| | - Ziqiu Jiang
- Department of Surgery and Cancer, Imperial College London, London, United Kingdom
| | - William Jeffcott
- Computer Science, University of Liverpool, Liverpool L69 3BX, United Kingdom
| | - Olga Anosova
- Computer Science, University of Liverpool, Liverpool L69 3BX, United Kingdom
| | - Vitaliy Kurlin
- Computer Science, University of Liverpool, Liverpool L69 3BX, United Kingdom
- Materials Innovation Factory, University of Liverpool, Liverpool L69 3NY, United Kingdom
|
24
|
Wali MH, Naif HM, Abdul Rahim NA, Yunus MA. Genetic Diversity in the Fusion Gene of Respiratory Syncytial Virus (RSV) Isolated From Iraqi Patients: A First Report. Adv Virol 2025; 2025:8864776. [PMID: 40191805 PMCID: PMC11971507 DOI: 10.1155/av/8864776] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/06/2024] [Accepted: 03/08/2025] [Indexed: 04/09/2025] Open
Abstract
Molecular evaluation of the respiratory syncytial virus (RSV) genome is a common strategy for understanding viral pathogenicity and controlling viral spread. In this study, we carried out a molecular evaluation of the targeted fusion (F) gene region in RSV-positive samples from Iraqi patients during the autumn and winter of 2022/2023. One hundred and fifty patients with lower respiratory tract infections were screened for RSV using reverse transcription-quantitative polymerase chain reaction (RT-qPCR). Sanger sequencing was performed on the RSV-positive samples targeting 1061 nucleotides (from nucleotide 6168 to 7228 within the RSV genome) and 1000 nucleotides (from nucleotide 6122 to 7121 within the RSV genome) of the F gene region for RSV-A and RSV-B, respectively. The results showed some nucleotide changes within the targeted F gene; phylogenetic analysis grouped the sequences in a distinct clade closely related to isolates from Austria, Argentina, Finland, and France. In silico protein modeling with the SWISS-MODEL and I-TASSER web tools, based on nonsynonymous changes in the amino acid sequence, yielded some well-predicted models that can be used for antiviral screening. In summary, the identified nucleotide variations in the F gene could influence vaccine development as the F protein is the primary target for the major antigen of RSV. Molecular surveillance data on local RSV isolates are also essential for studying new genomic changes and for enabling the prediction of potential new antiviral agents.
Affiliation(s)
- Mohammed Hussein Wali
- Department of Biomedical Sciences, Advanced Medical and Dental Institute, Universiti Sains Malaysia, Penang, Malaysia
- Department of Molecular and Medical Biotechnology, College of Biotechnology, Al-Nahrain University, Baghdad, Iraq
| | - Hassan Mohammad Naif
- Department of Molecular and Medical Biotechnology, College of Biotechnology, Al-Nahrain University, Baghdad, Iraq
| | - Nur Arzuar Abdul Rahim
- Department of Clinical Medicine, Advanced Medical and Dental Institute, Universiti Sains Malaysia, Penang, Malaysia
| | - Muhammad Amir Yunus
- Department of Biomedical Sciences, Advanced Medical and Dental Institute, Universiti Sains Malaysia, Penang, Malaysia
|
25
|
Santos TNF, Moreira RO, Rodrigues JDB, Rojas LAC, Souza JAM, Desidério JA. Isolation and in silico analysis of a new subclass of parasporin 4 from Bacillus thuringiensis coreanensis. PeerJ 2025; 13:e19061. [PMID: 40151459 PMCID: PMC11949118 DOI: 10.7717/peerj.19061] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/10/2024] [Accepted: 02/06/2025] [Indexed: 03/29/2025] Open
Abstract
Background Bacillus thuringiensis (Bt) is a Gram-positive bacterium whose strains have been studied mainly for the control of insect pests, due to the insecticidal capacity of its Cry and Vip proteins. However, recent studies indicate the presence of other proteins with no known insecticidal action. These proteins, denominated "parasporins" (PS), have cytotoxic activity and are divided into six classes, namely PS1, PS2, PS3, PS4, PS5, and PS6. Among these, parasporin 4 (PS4) has only one described subclass, present in the Bacillus thuringiensis shandongiensis strain. Given the importance of and limited knowledge about the actions of PS4 proteins and the existence of only one described subclass, the present work aimed to characterize the Bacillus thuringiensis coreanensis strain as a potential source of PS4 protein. Methods A preliminary screening to detect the ps4 gene was conducted in a bank of standard strains and isolates of Bacillus thuringiensis from the Laboratory of Bacterial Genetics and Applied Biotechnology, FCAV/UNESP. Genomic DNA was extracted from the strain positive for this gene, the ps4 gene was isolated and cloned, and in silico analyses of its sequence were performed. Tools such as BioEdit, BLAST, Clustal Omega, Geneious, IQ-TREE, and iTOL were used in these analyses. For the structural analysis of the detected PS4 in comparison with the database PS4 (BAD22577), the tools AlphaFold2, PyMOL, and InterPro were used. Sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) analyses allowed visualization of the inactive and active PS4 protein from the positive strain after solubilization and activation with Proteinase K. Results Previous screening of Bt standard strains revealed the presence of a partial ps4 gene in the Bacillus thuringiensis coreanensis strain. The BLAST alignment revealed 100% identity between the fragment detected in this work and a hypothetical protein (ANN35810.1) from the genome of that same strain. Considering this, the complete gene present in this strain was isolated by polymerase chain reaction (PCR), using the hypothetical sequence as the basis for primer design. The in silico analysis of the obtained sequence revealed 92.03% similarity with the ps4 sequence in the database (AB180980). Protein modeling studies and comparison of the structures revealed that B. thuringiensis coreanensis has a new subclass of PS4, denominated PS4Ab1, making it an important source of parasporin to be explored in biotechnological applications.
Affiliation(s)
- Thais N. F. Santos
- Biology Department, São Paulo State University, Jaboticabal, São Paulo, Brazil
| | - Raquel O. Moreira
- Biology Department, São Paulo State University, Jaboticabal, São Paulo, Brazil
| | | | - Luis A. C. Rojas
- Department of Agricultural and Environmental Biotechnology, São Paulo State University, Jaboticabal, São Paulo, Brazil
| | - Jackson A. M. Souza
- Biology Department, São Paulo State University, Jaboticabal, São Paulo, Brazil
| | - Janete A. Desidério
- Biology Department, São Paulo State University, Jaboticabal, São Paulo, Brazil
|
26
|
Rodríguez-Fernández MA, Tristán-Flores FE, Casique-Aguirre D, Negrete-Rodríguez MDLLX, Cervantes-Montelongo JA, Conde-Barajas E, Acosta-García G, Silva-Martínez GA. Virtual Screening and Molecular Dynamics of Cytokine-Drug Complexes for Atherosclerosis Therapy. Int J Mol Sci 2025; 26:2931. [PMID: 40243563 PMCID: PMC11988346 DOI: 10.3390/ijms26072931] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/01/2025] [Revised: 03/19/2025] [Accepted: 03/21/2025] [Indexed: 04/18/2025] Open
Abstract
Cardiovascular disease remains the leading global cause of mortality, largely driven by atherosclerosis, a chronic inflammatory condition characterized by lipid accumulation and immune-cell infiltration in arterial walls. Macrophages play a central role by forming foam cells and secreting pro-atherogenic cytokines, such as TNF-α, IFN-γ, and IL-1β, which destabilize atherosclerotic plaques, expanding the lipid core and increasing the risk of thrombosis and ischemia. Despite the significant health burden of subclinical atherosclerosis, few targeted therapies exist. Current treatments, including monoclonal antibodies, are limited by high costs and immunosuppressive side effects, underscoring the urgent need for alternative therapeutic strategies. In this study, we employed in silico drug repositioning to identify multitarget inhibitors against TNF-α, IFN-γ, and IL-1β, leveraging a virtual screening of 2750 FDA-approved drugs followed by molecular dynamics simulations to assess the stability of selected cytokine-ligand complexes. This computational approach provides structural insights into potential inhibitors. Additionally, we highlight nutraceutical options, such as fatty acids (oleic, linoleic and eicosapentaenoic acid), which exhibited strong and stable interactions with key cytokine targets. Our study suggests that these bioactive compounds could serve as effective new therapeutic approaches for atherosclerosis.
Affiliation(s)
- María Angélica Rodríguez-Fernández
- Posgrado de Ingeniería Bioquímica, Tecnológico Nacional de México/IT de Celaya, Celaya 38010, Guanajuato, Mexico; (M.A.R.-F.); (F.E.T.-F.); (M.d.l.L.X.N.-R.); (E.C.-B.); (G.A.-G.)
| | - Fabiola Estefanía Tristán-Flores
- Posgrado de Ingeniería Bioquímica, Tecnológico Nacional de México/IT de Celaya, Celaya 38010, Guanajuato, Mexico; (M.A.R.-F.); (F.E.T.-F.); (M.d.l.L.X.N.-R.); (E.C.-B.); (G.A.-G.)
- Departamento de Ciencias Básicas, Tecnológico Nacional de México/IT de Celaya, Celaya 38010, Guanajuato, Mexico
| | - Diana Casique-Aguirre
- Laboratorio de Citómica del Cáncer Infantil, Centro de Investigación Biomédica de Oriente, Instituto Mexicano del Seguro Social, Delegación Puebla, Puebla 06600, Mexico;
- Secretaría de Ciencia, Humanidades, Tecnología e Innovación (SECIHTI), Ciudad de México 03940, Mexico
| | - María de la Luz Xochilt Negrete-Rodríguez
- Posgrado de Ingeniería Bioquímica, Tecnológico Nacional de México/IT de Celaya, Celaya 38010, Guanajuato, Mexico; (M.A.R.-F.); (F.E.T.-F.); (M.d.l.L.X.N.-R.); (E.C.-B.); (G.A.-G.)
- Departamento de Ingeniería Bioquímica y Ambiental, Tecnológico Nacional de México/IT de Celaya, Celaya 38010, Guanajuato, Mexico;
| | - Juan Antonio Cervantes-Montelongo
- Departamento de Ingeniería Bioquímica y Ambiental, Tecnológico Nacional de México/IT de Celaya, Celaya 38010, Guanajuato, Mexico;
- Escuela de Medicina, Universidad de Celaya, Celaya 38080, Guanajuato, Mexico
| | - Eloy Conde-Barajas
- Posgrado de Ingeniería Bioquímica, Tecnológico Nacional de México/IT de Celaya, Celaya 38010, Guanajuato, Mexico; (M.A.R.-F.); (F.E.T.-F.); (M.d.l.L.X.N.-R.); (E.C.-B.); (G.A.-G.)
- Departamento de Ingeniería Bioquímica y Ambiental, Tecnológico Nacional de México/IT de Celaya, Celaya 38010, Guanajuato, Mexico;
| | - Gerardo Acosta-García
- Posgrado de Ingeniería Bioquímica, Tecnológico Nacional de México/IT de Celaya, Celaya 38010, Guanajuato, Mexico; (M.A.R.-F.); (F.E.T.-F.); (M.d.l.L.X.N.-R.); (E.C.-B.); (G.A.-G.)
- Departamento de Ingeniería Bioquímica y Ambiental, Tecnológico Nacional de México/IT de Celaya, Celaya 38010, Guanajuato, Mexico;
| | - Guillermo Antonio Silva-Martínez
- Posgrado de Ingeniería Bioquímica, Tecnológico Nacional de México/IT de Celaya, Celaya 38010, Guanajuato, Mexico; (M.A.R.-F.); (F.E.T.-F.); (M.d.l.L.X.N.-R.); (E.C.-B.); (G.A.-G.)
- Secretaría de Ciencia, Humanidades, Tecnología e Innovación (SECIHTI), Ciudad de México 03940, Mexico
- Departamento de Ingeniería Bioquímica y Ambiental, Tecnológico Nacional de México/IT de Celaya, Celaya 38010, Guanajuato, Mexico;
|
27
|
Li XZ, Li YL, Zhu JS. Three-Dimensional Structural Heteromorphs of Mating-Type Proteins in Hirsutella sinensis and the Natural Cordyceps sinensis Insect-Fungal Complex. J Fungi (Basel) 2025; 11:244. [PMID: 40278065 PMCID: PMC12028455 DOI: 10.3390/jof11040244] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/13/2025] [Revised: 02/21/2025] [Accepted: 03/18/2025] [Indexed: 04/26/2025] Open
Abstract
The MAT1-1-1 and MAT1-2-1 proteins are essential for the sexual reproduction of Ophiocordyceps sinensis. Although Hirsutella sinensis has been postulated to be the sole anamorph of O. sinensis and to undergo self-fertilization under homothallism or pseudohomothallism, little is known about the three-dimensional (3D) structures of the mating proteins in the natural Cordyceps sinensis insect-fungal complex, which is a valuable therapeutic agent in traditional Chinese medicine. However, the alternative splicing and differential occurrence and translation of the MAT1-1-1 and MAT1-2-1 genes have been revealed in H. sinensis, negating the self-fertilization hypothesis but rather suggesting the occurrence of self-sterility under heterothallic or hybrid outcrossing. In this study, the MAT1-1-1 and MAT1-2-1 proteins in 173 H. sinensis strains and wild-type C. sinensis isolates were clustered into six and five clades in the Bayesian clustering trees and belonged to 24 and 21 diverse AlphaFold-predicted 3D structural morphs, respectively. Over three-quarters of the strains/isolates contained either MAT1-1-1 or MAT1-2-1 proteins but not both. The diversity of the heteromorphic 3D structures of the mating proteins suggested functional alterations of the proteins and provided additional evidence supporting the self-sterility hypothesis under heterothallism and hybridization for H. sinensis, Genotype #1 of the 17 genome-independent O. sinensis genotypes. The heteromorphic stereostructures and mutations of the MAT1-1-1 and MAT1-2-1 proteins in the wild-type C. sinensis isolates and natural C. sinensis insect-fungal complex suggest that there are various sources of the mating proteins produced by two or more cooccurring heterospecific fungal species in natural C. sinensis that have been discovered in mycobiotic, molecular, metagenomic, and metatranscriptomic studies, which may inspire future studies on the biochemistry of mating and pheromone receptor proteins and the reproductive physiology of O. sinensis.
Affiliation(s)
| | | | - Jia-Shi Zhu
- State Key Laboratory of Plateau Ecology and Agriculture, Qinghai Academy of Animal and Veterinary Sciences, Qinghai University, Xining 810016, China; (X.-Z.L.); (Y.-L.L.)
|
28
|
Spanke VA, Egger-Hoerschinger VJ, Ruzsanyi V, Liedl KR. From closed to open: three dynamic states of membrane-bound cytochrome P450 3A4. J Comput Aided Mol Des 2025; 39:12. [PMID: 40095179 PMCID: PMC11913904 DOI: 10.1007/s10822-025-00589-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/18/2024] [Accepted: 03/01/2025] [Indexed: 03/19/2025]
Abstract
Cytochrome P450 3A4 (CYP3A4) is a membrane-bound monooxygenase. It metabolizes the largest proportion of all orally ingested drugs. Ligands can enter and exit the enzyme through flexible tunnels, which co-determine CYP3A4's ligand promiscuity. The flexibility can be represented by distinct conformational states of the enzyme. However, previous state definitions relied solely on crystal structures. We employed conventional molecular dynamics (cMD) simulations to sample the conformational space of CYP3A4. Five conformationally different crystal structures embedded in a membrane were simulated for 1 µs each. A Markov state model (MSM) coupled with spectral clustering (Robust Perron Cluster Analysis, PCCA+) resulted in three distinct states: two open conformations and an intermediate conformation. The tunnels inside CYP3A4 were calculated with CAVER 3.0. Notably, we observed variations in bottleneck radii compared to those derived from crystallographic data, which underlines the importance of simulations for characterizing the dynamic behaviour. Moreover, we identified a mechanism in which the membrane supports the opening of a tunnel. Therefore, CYP3A4 must be investigated in its membrane-bound state.
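As a rough illustration of the counting step behind a Markov state model, the sketch below estimates a row-normalized transition matrix from a discretized trajectory. It is not the authors' pipeline: the state labels, the toy trajectory, and the function name `transition_matrix` are hypothetical, and real MSM workflows typically add reversible estimation, lag-time validation, and PCCA+ coarse-graining on top of this.

```python
from typing import List, Sequence


def transition_matrix(traj: Sequence[int], n_states: int, lag: int = 1) -> List[List[float]]:
    """Estimate transition probabilities by counting i -> j jumps at a given lag."""
    if lag < 1:
        raise ValueError("lag must be at least 1")
    counts = [[0] * n_states for _ in range(n_states)]
    # Pair each frame with the frame `lag` steps later and count the transition.
    for start, end in zip(traj[:-lag], traj[lag:]):
        counts[start][end] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # Rows with no observed exits are left as zeros (assumption for this sketch).
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


# Hypothetical discretized trajectory over three states (0, 1, 2).
toy_trajectory = [0, 0, 1, 2, 2, 1, 0, 0]
toy_matrix = transition_matrix(toy_trajectory, n_states=3)
for row in toy_matrix:
    print(row)
```

Each row of the resulting matrix sums to one whenever the corresponding state is left at least once in the trajectory.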
Affiliation(s)
- Vera A Spanke
- Department of Theoretical Chemistry, Universität Innsbruck, Innsbruck, Austria
| | | | - Veronika Ruzsanyi
- Department of Breath Research, Universität Innsbruck, Innsbruck, Austria
| | - Klaus R Liedl
- Department of Theoretical Chemistry, Universität Innsbruck, Innsbruck, Austria.
|
29
|
Xing E, Zhang J, Wang S, Cheng X. Leveraging Sequence Purification for Accurate Prediction of Multiple Conformational States with AlphaFold2. RESEARCH SQUARE 2025:rs.3.rs-6087969. [PMID: 40092441 PMCID: PMC11908349 DOI: 10.21203/rs.3.rs-6087969/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/19/2025]
Abstract
AlphaFold2 (AF2) has transformed protein structure prediction by harnessing co-evolutionary constraints embedded in multiple sequence alignments (MSAs). MSAs not only encode static structural information, but also hold critical details about protein dynamics, which underpin biological functions. However, these subtle coevolutionary signatures, which dictate conformational state preferences, are often obscured by noise within MSA data and thus remain challenging to decipher. Here, we introduce AF-ClaSeq, a systematic framework that isolates these co-evolutionary signals through sequence purification and iterative enrichment. By extracting sequence subsets that preferentially encode distinct structural states, AF-ClaSeq enables high-confidence predictions of alternative conformations. Our findings reveal that the successful sampling of alternative states depends not on MSA depth but on sequence purity. Intriguingly, purified sequences encoding specific structural states are distributed across phylogenetic clades and superfamilies, rather than confined to specific lineages. Expanding upon AF2's transformative capabilities, AF-ClaSeq provides a powerful approach for uncovering hidden structural plasticity, advancing allosteric protein and drug design, and facilitating dynamics-based protein function annotation.
Affiliation(s)
- Enming Xing
- Division of Medicinal Chemistry and Pharmacognosy, College of Pharmacy, The Ohio State University, Columbus OH, 43210, USA
| | - Junjie Zhang
- Division of Medicinal Chemistry and Pharmacognosy, College of Pharmacy, The Ohio State University, Columbus OH, 43210, USA
| | - Shen Wang
- Division of Medicinal Chemistry and Pharmacognosy, College of Pharmacy, The Ohio State University, Columbus OH, 43210, USA
| | - Xiaolin Cheng
- Division of Medicinal Chemistry and Pharmacognosy, College of Pharmacy, The Ohio State University, Columbus OH, 43210, USA
- Translational Data Analytics Institute, The Ohio State University, Columbus, OH 43210, USA
|
30
|
Han B, Zhang Y, Li L, Gong X, Xia K. TopoQA: a topological deep learning-based approach for protein complex structure interface quality assessment. Brief Bioinform 2025; 26:bbaf083. [PMID: 40062613 PMCID: PMC11891663 DOI: 10.1093/bib/bbaf083] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/19/2024] [Revised: 01/11/2025] [Accepted: 02/17/2025] [Indexed: 05/13/2025] Open
Abstract
Even with the significant advances of AlphaFold-Multimer (AF-Multimer) and AlphaFold3 (AF3) in protein complex structure prediction, their accuracy is still not comparable with that of monomer structure prediction. Efficient and effective quality assessment (QA), or estimation of model accuracy, methods that can evaluate the quality of predicted protein complexes without knowing their native structures are of key importance for protein structure generation and model selection. In this paper, we leverage persistent homology (PH) to capture the atomic-level topological information around residues and design a topological deep learning-based QA method, TopoQA, to assess the accuracy of protein complex interfaces. We integrate PH from topological data analysis into graph neural networks (GNNs) to characterize complex higher-order structures that GNNs might overlook, enhancing the learning of the relationship between the topological structure of complex interfaces and quality scores. Our TopoQA model is extensively validated on the two most widely used benchmark datasets, Docking Benchmark5.5 AF2 (DBM55-AF2) and Heterodimer-AF2 (HAF2), along with our newly constructed ABAG-AF3 dataset to facilitate comparisons with AF3. For all three datasets, TopoQA outperforms AF-Multimer-based AF2Rank and shows an advantage over AF3 in nearly half of the targets. In particular, on the DBM55-AF2 dataset, TopoQA obtains a ranking loss 73.6% lower than that of AF-Multimer-based AF2Rank. Further, beyond AF-Multimer and AF3, we have extensively compared TopoQA with nearly all the state-of-the-art models (as far as we know) and found that it achieves the highest Top 10 Hit-rate on the DBM55-AF2 dataset and the lowest ranking loss on the HAF2 dataset. Ablation experiments show that our topological features significantly improve the model's performance. At the same time, our method also provides a new paradigm for protein structure representation learning.
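The ranking loss mentioned above can be illustrated with a small sketch. It assumes the definition commonly used in model quality assessment (the true quality of the best model in the pool minus the true quality of the model that the QA method ranks first); the abstract does not spell out the definition, and the scores and the function name `ranking_loss` below are made up for illustration.

```python
from typing import Sequence


def ranking_loss(predicted: Sequence[float], true_quality: Sequence[float]) -> float:
    """True quality of the best model minus true quality of the top-ranked model."""
    if len(predicted) != len(true_quality) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Index of the model the QA method would select (highest predicted score).
    top_ranked = max(range(len(predicted)), key=lambda i: predicted[i])
    return max(true_quality) - true_quality[top_ranked]


# Hypothetical pool of three decoys: predicted QA scores and true quality (e.g. DockQ).
predicted_scores = [0.9, 0.4, 0.7]
true_scores = [0.5, 0.8, 0.6]
example_loss = ranking_loss(predicted_scores, true_scores)
print(f"ranking loss = {example_loss:.2f}")
```

A loss of zero means the method picked the best available model; larger values mean a better model was present in the pool but was not ranked first.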
Affiliation(s)
- Bingqing Han
- Institute for Mathematical Sciences, Renmin University of China, Beijing 100872, China
- Division of Mathematical Sciences, School of Physical and Mathematical Sciences, Nanyang Technological University, Singapore 637371, Singapore
| | - Yipeng Zhang
- Division of Mathematical Sciences, School of Physical and Mathematical Sciences, Nanyang Technological University, Singapore 637371, Singapore
| | - Longlong Li
- Division of Mathematical Sciences, School of Physical and Mathematical Sciences, Nanyang Technological University, Singapore 637371, Singapore
- School of Mathematics, Shandong University, Jinan 250100, China
- Data Science Institute, Shandong University, Jinan 250100, China
| | - Xinqi Gong
- Institute for Mathematical Sciences, Renmin University of China, Beijing 100872, China
| | - Kelin Xia
- Division of Mathematical Sciences, School of Physical and Mathematical Sciences, Nanyang Technological University, Singapore 637371, Singapore
|
31
|
Quast NP, Abanades B, Guloglu B, Karuppiah V, Harper S, Raybould MIJ, Deane CM. T-cell receptor structures and predictive models reveal comparable alpha and beta chain structural diversity despite differing genetic complexity. Commun Biol 2025; 8:362. [PMID: 40038394 PMCID: PMC11880327 DOI: 10.1038/s42003-025-07708-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/03/2024] [Accepted: 02/09/2025] [Indexed: 03/06/2025] Open
Abstract
T-cell receptor (TCR) structures are currently under-utilised in early-stage drug discovery and repertoire-scale informatics. Here, we leverage a large dataset of solved TCR structures from Immunocore to evaluate the current state-of-the-art for TCR structure prediction, and identify which regions of the TCR remain challenging to model. Through clustering analyses and the training of a TCR-specific model capable of large-scale structure prediction, we find that the alpha chain VJ-recombined loop (CDR3α) is as structurally diverse and correspondingly difficult to predict as the beta chain VDJ-recombined loop (CDR3β). This differentiates TCR variable domain loops from the genetically analogous antibody loops and supports the conjecture that both TCR alpha and beta chains are deterministic of antigen specificity. We hypothesise that the larger number of alpha chain joining genes compared to beta chain joining genes compensates for the lack of a diversity gene segment. We also provide over 1.5M predicted TCR structures to enable repertoire structural analysis and elucidate strategies towards improving the accuracy of future TCR structure predictors. Our observations reinforce the importance of paired TCR sequence information and capture the current state-of-the-art for TCR structure prediction, while our model and 1.5M structure predictions enable the use of structural TCR information at an unprecedented scale.
MESH Headings
- Receptors, Antigen, T-Cell, alpha-beta/genetics
- Receptors, Antigen, T-Cell, alpha-beta/chemistry
- Humans
- Models, Molecular
- Genetic Variation
- Receptors, Antigen, T-Cell/chemistry
- Receptors, Antigen, T-Cell/genetics
- Protein Conformation
- Complementarity Determining Regions/genetics
- Complementarity Determining Regions/chemistry
Affiliation(s)
- Nele P Quast
- Department of Statistics, University of Oxford, Oxford, UK
| | | | - Bora Guloglu
- Department of Statistics, University of Oxford, Oxford, UK
|
32
|
Feldman J, Skolnick J. AF3Complex Yields Improved Structural Predictions of Protein Complexes. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2025:2025.02.27.640585. [PMID: 40093092 PMCID: PMC11908126 DOI: 10.1101/2025.02.27.640585] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/19/2025]
Abstract
Motivation Accurate structures of protein complexes are essential for understanding biological pathway function. A previous study showed how downstream modifications to AlphaFold 2 could yield AF2Complex, a model better suited for protein complexes. Here, we introduce AF3Complex, a model equipped with the same improvements as AF2Complex, along with a novel method for excluding ligands, built on AlphaFold 3. Results Benchmarking AF3Complex and AlphaFold 3 on a large dataset of protein complexes, it was shown that AF3Complex outperforms AlphaFold 3 to a significant degree. Moreover, by evaluating the structures generated by AF3Complex on a dataset of protein-peptide complexes and antibody-antigen complexes, it was established that AF3Complex could create high-fidelity structures for these challenging complex types. Additionally, when deployed to generate structural predictions for the two antibody-antigen and seven protein-protein complexes used in the recent CASP16 competition, AF3Complex yielded structures that would have placed it among the top models in the competition. Availability The AF3Complex code is freely available at https://github.com/Jfeldman34/AF3Complex.git. Contact Please contact skolnick@gatech.edu.
Affiliation(s)
- Jonathan Feldman
- Center for the Study of Systems Biology/School of Biological Sciences, Georgia Institute of Technology, 950 Atlantic Drive, 30332, Georgia
- School of Computer Science, Georgia Institute of Technology, 266 Ferst Dr, Atlanta, 30332, Georgia
| | - Jeffrey Skolnick
- School of Computer Science, Georgia Institute of Technology, 266 Ferst Dr, Atlanta, 30332, Georgia
|
33
|
Kim G, Lee S, Levy Karin E, Kim H, Moriwaki Y, Ovchinnikov S, Steinegger M, Mirdita M. Easy and accurate protein structure prediction using ColabFold. Nat Protoc 2025; 20:620-642. [PMID: 39402428 DOI: 10.1038/s41596-024-01060-5] [Citation(s) in RCA: 14] [Impact Index Per Article: 14.0] [Received: 11/21/2023] [Accepted: 08/07/2024] [Indexed: 03/12/2025]
Abstract
Since its public release in 2021, AlphaFold2 (AF2) has made investigating biological questions, by using predicted protein structures of single monomers or full complexes, a common practice. ColabFold-AF2 is an open-source Jupyter Notebook inside Google Colaboratory and a command-line tool that makes it easy to use AF2 while exposing its advanced options. ColabFold-AF2 shortens turnaround times of experiments because of its optimized usage of AF2's models. In this protocol, we guide the reader through ColabFold best practices by using three scenarios: (i) monomer prediction, (ii) complex prediction and (iii) conformation sampling. The first two scenarios cover classic static structure prediction and are demonstrated on the human glycosylphosphatidylinositol transamidase protein. The third scenario demonstrates an alternative use case of the AF2 models by predicting two conformations of the human alanine serine transporter 2. Users can run the protocol without computational expertise via Google Colaboratory or in a command-line environment for advanced users. Using Google Colaboratory, it takes <2 h to run each procedure. The data and code for this protocol are available at https://protocol.colabfold.com.
Affiliation(s)
- Gyuri Kim
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, South Korea
| | - Sewon Lee
- School of Biological Sciences, Seoul National University, Seoul, South Korea
| | | | - Hyunbin Kim
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, South Korea
| | - Yoshitaka Moriwaki
- Department of Biotechnology, Graduate School of Agricultural and Life Sciences, The University of Tokyo, Tokyo, Japan
- Collaborative Research Institute for Innovative Microbiology, The University of Tokyo, Tokyo, Japan
- Department of Computational Drug Discovery and Design, Medical Research Institute, Tokyo Medical and Dental University, Tokyo, Japan
| | | | - Martin Steinegger
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, South Korea.
- School of Biological Sciences, Seoul National University, Seoul, South Korea.
- Artificial Intelligence Institute, Seoul National University, Seoul, South Korea.
- Institute of Molecular Biology and Genetics, Seoul National University, Seoul, South Korea.
| | - Milot Mirdita
- School of Biological Sciences, Seoul National University, Seoul, South Korea.
|
34
|
Jung N, Vellozo-Echevarría T, Barrett K, Meyer AS. Analysis of enzyme kinetics of fungal methionine synthases in an optimized colorimetric microscale assay for measuring cobalamin-independent methionine synthase activity. Enzyme Microb Technol 2025; 184:110581. [PMID: 39824044 DOI: 10.1016/j.enzmictec.2025.110581] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/18/2024] [Revised: 01/01/2025] [Accepted: 01/02/2025] [Indexed: 01/20/2025]
Abstract
Aspergillus spp. and Rhizopus spp., used in solid-state plant food fermentations, encode cobalamin-independent methionine synthase activity (MetE, EC 2.1.1.14). Here, we examine the enzyme kinetics, reaction activation energies (Ea), thermal robustness, and structural folds of three MetEs from three different food-fermentation relevant fungi, Aspergillus sojae, Rhizopus delemar, and Rhizopus microsporus, and compare them to the MetE from Escherichia coli. We also downscaled and optimized a colorimetric assay to allow direct MetE activity measurements in microplates. The catalytic rates, kcat, of the three fungal MetE enzymes on the methyl donor (6S)-5-methyl-tetrahydropteroyl-L-glutamate₃ ranged from 1.2 to 3.3 min⁻¹ and KM values varied from 0.8 to 6.8 µM. The kcat was lowest for the R. delemar MetE, but this enzyme also had the lowest KM, thus resulting in the highest kcat/KM of ∼1.4 min⁻¹ µM⁻¹ among the three fungal enzymes. The kcat was higher for the E. coli enzyme, 12 min⁻¹, but KM was 6.4 µM, resulting in kcat/KM of ∼1.9 min⁻¹ µM⁻¹. The Ea values of the fungal MetEs ranged from 52 to 97 kJ mol⁻¹ and were higher than that of the E. coli MetE (38.7 kJ mol⁻¹). The predicted structural folds of the MetEs were very similar. Tm values of the fungal MetEs ranged from 41 to 54 °C, highest for the A. sojae enzyme (54 °C), lowest for the R. delemar (41 °C). At 30 °C, the half-lives of the three fungal enzymes varied significantly, with MetE from A. sojae having the longest (> 600 min, kD=0), and R. delemar the shortest (17 min). Knowledge of the kinetics of these enzymes is important for understanding methionine synthesis in fungi and a first step in promoting methionine synthesis in fungally fermented plant foods.
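The catalytic efficiency quoted for the E. coli enzyme follows directly from the reported kcat and KM. The short sketch below reproduces that arithmetic; the function name `catalytic_efficiency` is a hypothetical helper, and only the two values stated in the abstract (12 min⁻¹ and 6.4 µM) are used.

```python
def catalytic_efficiency(kcat_per_min: float, km_micromolar: float) -> float:
    """Return kcat/KM in units of min^-1 uM^-1."""
    if km_micromolar <= 0:
        raise ValueError("KM must be positive")
    return kcat_per_min / km_micromolar


# E. coli MetE values as reported in the abstract: kcat = 12 min^-1, KM = 6.4 uM.
e_coli_efficiency = catalytic_efficiency(12.0, 6.4)
print(f"E. coli MetE kcat/KM = {e_coli_efficiency:.3f} min^-1 uM^-1")
```

The result, 12 / 6.4 = 1.875, is consistent with the value of about 1.9 min⁻¹ µM⁻¹ given in the abstract.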
Affiliation(s)
- Noël Jung
  - Protein Chemistry and Enzyme Technology, Department of Biotechnology and Biomedicine, Building 221, Technical University of Denmark, Lyngby DK-2800 Kgs, Denmark
- Tomás Vellozo-Echevarría
  - Protein Chemistry and Enzyme Technology, Department of Biotechnology and Biomedicine, Building 221, Technical University of Denmark, Lyngby DK-2800 Kgs, Denmark
- Kristian Barrett
  - Protein Chemistry and Enzyme Technology, Department of Biotechnology and Biomedicine, Building 221, Technical University of Denmark, Lyngby DK-2800 Kgs, Denmark
- Anne S Meyer
  - Protein Chemistry and Enzyme Technology, Department of Biotechnology and Biomedicine, Building 221, Technical University of Denmark, Lyngby DK-2800 Kgs, Denmark.
35
Vangaru S, Bhattacharya D. To pack or not to pack: revisiting protein side-chain packing in the post-AlphaFold era. bioRxiv 2025:2025.02.22.639681. [PMID: 40060396 PMCID: PMC11888329 DOI: 10.1101/2025.02.22.639681] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/20/2025]
Abstract
Motivation: Protein side-chain packing (PSCP), the problem of predicting side-chain conformation given a fixed backbone structure, has important implications in modeling of structures and interactions. However, despite the groundbreaking progress in protein structure prediction pioneered by AlphaFold, the existing PSCP methods still rely on experimental inputs, and do not leverage AlphaFold-predicted backbone coordinates to enable PSCP at scale. Results: Here, we perform a large-scale benchmarking of the predictive performance of various PSCP methods on public datasets from multiple rounds of the Critical Assessment of Structure Prediction (CASP) challenges using a diverse set of evaluation metrics. Empirical results demonstrate that the PSCP methods perform well in packing the side-chains with experimental inputs, but they fail to generalize in repacking AlphaFold-generated structures. We additionally explore the effectiveness of leveraging the self-assessment confidence scores from AlphaFold by implementing a backbone confidence-aware integrative approach. While such a protocol often leads to performance improvement by attaining modest yet statistically significant accuracy gains over the AlphaFold baseline, it does not yield consistent and pronounced improvements. Our study highlights the recent advances and remaining challenges in PSCP in the post-AlphaFold era. Availability: The code and raw data are freely available at https://github.com/Bhattacharya-Lab/PackBench.
Affiliation(s)
- Sriniketh Vangaru
  - Department of Computer Science, Virginia Tech, Blacksburg, 24061, Virginia, USA
36
Kumar BH, Kabekkodu SP, Pai KSR. Structural insights of AKT and its activation mechanism for drug development. Mol Divers 2025:10.1007/s11030-025-11132-7. [PMID: 40009150 DOI: 10.1007/s11030-025-11132-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/07/2025] [Accepted: 02/09/2025] [Indexed: 02/27/2025]
Abstract
AKT1, a serine/threonine kinase, is pivotal in signaling and regulating cell survival, proliferation, and metabolism. This review focuses on the structural insights and the essential features required for its active conformation. AKT belongs to the AGC kinase group and has three isoforms: AKT1, AKT2, and AKT3. AKT has three functional regions: PH domain, kinase domain, and hydrophobic motif. AKT1 activation involves intricate conformational changes, including transitions in the αC-in, DFG-in, G-loop, activation loop, and PH domain out, S-spine and R-spine formation, as well as phosphorylation at Thr 308 and Ser 473, which enable AKT1 to adopt an active conformation. The analysis highlights the limitations of the AlphaFold-predicted AKT1 structure, which lacks key elements of the active state, including ATP, magnesium ion coordination, phosphatidylinositol-(1,3,4,5)-tetraphosphate, substrate peptide, and phosphorylation at Thr 308 and Ser 473. This study underscores the necessity of these features for stabilizing the kinase domain and facilitating efficient substrate phosphorylation. By consolidating structural insights and activation mechanisms, this review aims to inform the development of computational models and targeted therapeutics for AKT1 activators in diseases such as hepatic ischemia-reperfusion injury, cerebral ischemia, acute hepatic failure, subarachnoid hemorrhage, and Alzheimer's disease.
Affiliation(s)
- B Harish Kumar
  - Department of Pharmacology, Manipal College of Pharmaceutical Sciences, Manipal Academy of Higher Education, Manipal, Karnataka, 576104, India
  - Department of Applied Biology, CSIR-Indian Institute of Chemical Technology (IICT), Hyderabad, 500007, India
- Shama Prasada Kabekkodu
  - Department of Cell and Molecular Biology, Manipal School of Life Sciences, Manipal Academy of Higher Education, Manipal, Karnataka, 576104, India
- K Sreedhara Ranganath Pai
  - Department of Pharmacology, Manipal College of Pharmaceutical Sciences, Manipal Academy of Higher Education, Manipal, Karnataka, 576104, India.
37
Zhou L, Ortega-Rodriguez U, Flores MJ, Matsumoto Y, Bettinger JQ, Wu WW, Zhang Y, Kim SR, Biel TG, Pritts JD, Shen RF, Rao VA, Ju T. Dual functional POGases from bacteria encompassing broader O-glycanase and adhesin activities. Nat Commun 2025; 16:1960. [PMID: 40000644 PMCID: PMC11861894 DOI: 10.1038/s41467-025-57143-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/05/2024] [Accepted: 02/11/2025] [Indexed: 02/27/2025] Open
Abstract
Mucin-type O-glycans on glycoproteins are pivotal for biology and impact the quality of biotherapeutics. Furthermore, glycans on host cells serve as ligands for lectins/adhesins on bacteria for bacterium-host interactions in the colonization or attachment/invasion of bacteria. Defining the structure-function relationship of O-glycans is hindered by a lack of enzyme(s) to release sialylated O-glycans from glycoproteins. Here we show identification of endo-α-N-acetylgalactosaminidases (O-glycanases, GH101) with broad substrate specificities, termed Peptide:O-Glycosidase (POGase). In 5 POGase orthologs identified, we characterize one that releases sialylated O-glycans from glycopeptides, glycoproteins and biotherapeutics. Three peptide motifs differentiate the POGase existing in phylum Actinomycetota from known O-glycanases in other bacteria. While the GH101 domain classifies POGases, other domains confer the efficient enzyme activity and binding to major glycans decorating epithelial cells. The dual functional POGases encompassing broader O-glycanase and adhesin activities will facilitate the study of O-glycomics, quality assessment of biotherapeutics, and development of microbiology and medicine.
Affiliation(s)
- Linjiao Zhou
  - Office of Pharmaceutical Quality, Center for Drug Evaluation and Research, Food and Drug Administration, Silver Spring, MD, USA
- Uriel Ortega-Rodriguez
  - Office of Pharmaceutical Quality, Center for Drug Evaluation and Research, Food and Drug Administration, Silver Spring, MD, USA
- Matthew J Flores
  - Office of Pharmaceutical Quality, Center for Drug Evaluation and Research, Food and Drug Administration, Silver Spring, MD, USA
- Yasuyuki Matsumoto
  - Office of Pharmaceutical Quality, Center for Drug Evaluation and Research, Food and Drug Administration, Silver Spring, MD, USA
- John Q Bettinger
  - Office of Pharmaceutical Quality, Center for Drug Evaluation and Research, Food and Drug Administration, Silver Spring, MD, USA
- Wells W Wu
  - Facility for Biotechnology Resources, Center for Biologics Evaluation and Research, Food and Drug Administration, Silver Spring, MD, USA
- Yaqin Zhang
  - Office of Pharmaceutical Quality, Center for Drug Evaluation and Research, Food and Drug Administration, Silver Spring, MD, USA
- Su-Ryun Kim
  - Office of Pharmaceutical Quality, Center for Drug Evaluation and Research, Food and Drug Administration, Silver Spring, MD, USA
- Thomas G Biel
  - Office of Pharmaceutical Quality, Center for Drug Evaluation and Research, Food and Drug Administration, Silver Spring, MD, USA
- Jordan D Pritts
  - Office of Pharmaceutical Quality, Center for Drug Evaluation and Research, Food and Drug Administration, Silver Spring, MD, USA
- Rong-Fong Shen
  - Facility for Biotechnology Resources, Center for Biologics Evaluation and Research, Food and Drug Administration, Silver Spring, MD, USA
- V Ashutosh Rao
  - Office of Pharmaceutical Quality, Center for Drug Evaluation and Research, Food and Drug Administration, Silver Spring, MD, USA
- Tongzhong Ju
  - Office of Pharmaceutical Quality, Center for Drug Evaluation and Research, Food and Drug Administration, Silver Spring, MD, USA.
38
Maisuradze GG, Thakur A, Khatri K, Haldane A, Levy RM. Using AlphaFold2 to Predict the Conformations of Side Chains in Folded Proteins. bioRxiv 2025:2025.02.10.637534. [PMID: 39990457 PMCID: PMC11844428 DOI: 10.1101/2025.02.10.637534] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/25/2025]
Abstract
AlphaFold has revolutionized protein structure prediction by accurately creating 3D structures from just the amino acid sequence. However, even with extensive research validating its overall accuracy, a key question remains: Can AlphaFold predict the conformation of individual amino acid residue side chains within a folded protein? This is important for the field of molecular modeling, particularly when predicting the effects of mutations on protein stability and ligand binding. AlphaFold generates a set of atomic coordinates not just for the mutated side chain but also for potential rearrangements across the entire protein structure. In this study we investigate the ability of ColabFold, an online implementation of AlphaFold2 (AF2), to predict the conformations of residue side chains in folded proteins. We find that over a set of 10 benchmark proteins, the side chain conformation prediction error of ColabFold is on average ~14% for χ₁ dihedral angles, and increases to ~48% for χ₃ dihedral angles. The prediction error is smaller for non-polar side chains and is somewhat improved using structural templates. ColabFold demonstrates a bias towards the most prevalent rotamer states in the protein data bank (PDB), potentially limiting its ability to capture rare side chain conformations effectively. As an application of AlphaFold to explore the structural consequences of strongly cooperative mutations on side chain rearrangements, we employ a Potts sequence-based statistical energy model to perform large scale mutational scans of two proteins ABL1 and PIM1 kinase, searching for the most strongly cooperative mutational pairs, and then use ColabFold to predict the structural signatures of this cooperativity on the interacting side chains. Our results demonstrate that integration of the sequence-based Potts model with AlphaFold into a single pipeline provides a new tool that can be used to explore the fundamental relationship between protein mutations, cooperative changes in structure, and fitness.
Affiliation(s)
- Gia G. Maisuradze
  - Center for Biophysics and Computational Biology, Temple University, Philadelphia, PA, USA
  - Department of Chemistry, Temple University, Philadelphia, PA, USA
  - Baker Laboratory of Chemistry and Chemical Biology, Cornell University, Ithaca, NY, USA
- Abhishek Thakur
  - Center for Biophysics and Computational Biology, Temple University, Philadelphia, PA, USA
  - Department of Chemistry, Temple University, Philadelphia, PA, USA
- Kisan Khatri
  - Center for Biophysics and Computational Biology, Temple University, Philadelphia, PA, USA
  - Department of Physics, Temple University, Philadelphia, PA, USA
- Allan Haldane
  - Center for Biophysics and Computational Biology, Temple University, Philadelphia, PA, USA
  - Department of Physics, Temple University, Philadelphia, PA, USA
- Ronald M. Levy
  - Center for Biophysics and Computational Biology, Temple University, Philadelphia, PA, USA
  - Department of Chemistry, Temple University, Philadelphia, PA, USA
39
Tauriello G, Waterhouse AM, Haas J, Behringer D, Bienert S, Garello T, Schwede T. ModelArchive: A Deposition Database for Computational Macromolecular Structural Models. J Mol Biol 2025:168996. [PMID: 39947281 DOI: 10.1016/j.jmb.2025.168996] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/30/2024] [Revised: 02/04/2025] [Accepted: 02/07/2025] [Indexed: 02/27/2025]
Abstract
A wide range of applications in life science research benefit from the availability of three-dimensional structures of biological macromolecules as they provide valuable insights into their molecular function. Recent advances in structure prediction techniques have made it possible to generate high quality computational macromolecular structural models for almost all known proteins. In this context, ModelArchive (https://modelarchive.org/) serves as a deposition database for computational models, complementing the Protein Data Bank (PDB) and PDB-IHM, which require experimental data, and specialised databases such as the AlphaFold DB. ModelArchive contains over 600,000 models contributed by researchers using a variety of modelling techniques. It supports single biological macromolecules and complexes, including any combination of polymers and small molecules. Each deposited model can be referenced in manuscripts using an immutable accession code provided by ModelArchive. Depositors are required to provide a minimal set of information about the modelling process and the expected accuracy of the resulting model, enabling scientific reproducibility and maximising the potential reuse of the models. The vast majority of models in ModelArchive use the ModelCIF format which includes coordinates and metadata, allows for programmatic validation of the models, and makes the models interoperable with structures obtained from other sources such as the PDB. The ModelArchive web service provides access to the models and search queries. Model findability is also provided in external services either through APIs or by importing data from ModelArchive.
Affiliation(s)
- Gerardo Tauriello
  - Biozentrum, University of Basel, Basel, Switzerland; Computational Structural Biology, SIB Swiss Institute of Bioinformatics, Basel, Switzerland
- Andrew M Waterhouse
  - Biozentrum, University of Basel, Basel, Switzerland; Computational Structural Biology, SIB Swiss Institute of Bioinformatics, Basel, Switzerland
- Juergen Haas
  - Biozentrum, University of Basel, Basel, Switzerland; Computational Structural Biology, SIB Swiss Institute of Bioinformatics, Basel, Switzerland
- Dario Behringer
  - Biozentrum, University of Basel, Basel, Switzerland; Computational Structural Biology, SIB Swiss Institute of Bioinformatics, Basel, Switzerland
- Stefan Bienert
  - Biozentrum, University of Basel, Basel, Switzerland; Computational Structural Biology, SIB Swiss Institute of Bioinformatics, Basel, Switzerland
- Thomas Garello
  - Biozentrum, University of Basel, Basel, Switzerland; Computational Structural Biology, SIB Swiss Institute of Bioinformatics, Basel, Switzerland
- Torsten Schwede
  - Biozentrum, University of Basel, Basel, Switzerland; Computational Structural Biology, SIB Swiss Institute of Bioinformatics, Basel, Switzerland.
40
Ghodrati F, Parivar K, Amiri I, Roodbari NH. Exploring miR-34a, miR-449, and ADAM2/ADAM7 Expressions as Potential Biomarkers in Male Infertility: A Combined In Silico and Experimental Approach. Biochem Genet 2025:10.1007/s10528-025-11050-1. [PMID: 39928278 DOI: 10.1007/s10528-025-11050-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/04/2024] [Accepted: 01/31/2025] [Indexed: 02/11/2025]
Abstract
miR-34a and miR-449 are key miRNAs involved in sperm function and male fertility, with their dysregulation potentially contributing to male infertility. ADAM proteins, specifically ADAM2 and ADAM7, are also implicated in sperm function. This study investigates the interactions between miR-34a, miR-449, and ADAM2/ADAM7, exploring their roles in male infertility through both experimental analyses and molecular docking. In this case-control study, 15 infertile males and 15 healthy controls were included. Gene expression levels of miR-34a, miR-449, and SOX30 were measured using real-time PCR, while protein levels of ADAM7 and ADAM2 in sperm were assessed through western blotting. Additionally, molecular docking was performed to analyze the binding affinities between miR-34a/miR-449 and ADAM2/ADAM7, with docking scores and confidence levels evaluated. Expression levels of ADAM7 and ADAM2 proteins in sperm from the infertile group showed significant differences compared with the control group (P ≤ 0.05). A significant difference was observed in the expression of miR-449, miR-34a, and SOX30 genes between the control and infertile groups (P < 0.05). A significant correlation between miR-34a expression, ADAM7 protein expression, and sperm morphology was observed. However, no statistically significant correlation was found between miR-34a expression and sperm motility, sperm count, blastocyst, or embryo rates in ICSI and IVF (P ≥ 0.05). Molecular docking and dynamics studies revealed strong interactions between miR-34a/miR-449 and ADAM proteins. The ADAM7/miR-34a complex showed the highest binding affinity with a docking score of -372.40 and a confidence score of 0.9884, followed by ADAM7/miR-449. Hydrogen bond analysis indicated stable binding, with 9 bonds for ADAM2/miR-34a and 7 for ADAM7/miR-34a. These interactions suggest a significant role in regulating sperm morphology and function. miR-34a, miR-449, ADAM7, and ADAM2 protein expression appear to be involved in the molecular mechanisms of male infertility. These parameters show potential as biomarkers in assisted reproductive technology techniques, particularly by influencing sperm morphology and function.
Affiliation(s)
- Fariba Ghodrati
  - Department of Biology, Science and Research Branch, Islamic Azad University, Tehran, Iran
- Kazem Parivar
  - Department of Biology, Science and Research Branch, Islamic Azad University, Tehran, Iran.
- Iraj Amiri
  - Department of Anatomy and Embryology, Hamedan University of Medical Sciences, Hamedan, Iran
- Nasim Hayati Roodbari
  - Department of Biology, Science and Research Branch, Islamic Azad University, Tehran, Iran
41
Morehead A, Giri N, Liu J, Neupane P, Cheng J. Deep Learning for Protein-Ligand Docking: Are We There Yet? arXiv 2025:arXiv:2405.14108v5. [PMID: 38827451 PMCID: PMC11142318] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/04/2024]
Abstract
The effects of ligand binding on protein structures and their in vivo functions carry numerous implications for modern biomedical research and biotechnology development efforts such as drug discovery. Although several deep learning (DL) methods and benchmarks designed for protein-ligand docking have recently been introduced, to date no prior works have systematically studied the behavior of the latest docking and structure prediction methods within the broadly applicable context of (1) using predicted (apo) protein structures for docking (e.g., for applicability to new proteins); (2) binding multiple (cofactor) ligands concurrently to a given target protein (e.g., for enzyme design); and (3) having no prior knowledge of binding pockets (e.g., for generalization to unknown pockets). To enable a deeper understanding of docking methods' real-world utility, we introduce PoseBench, the first comprehensive benchmark for broadly applicable protein-ligand docking. PoseBench enables researchers to rigorously and systematically evaluate DL methods for apo-to-holo protein-ligand docking and protein-ligand structure prediction using both primary ligand and multi-ligand benchmark datasets, the latter of which we introduce for the first time to the DL community. Empirically, using PoseBench, we find that (1) DL co-folding methods generally outperform comparable conventional and DL docking baselines, yet popular methods such as AlphaFold 3 are still challenged by prediction targets with novel protein sequences; (2) certain DL co-folding methods are highly sensitive to their input multiple sequence alignments, while others are not; and (3) DL methods struggle to strike a balance between structural accuracy and chemical specificity when predicting novel or multi-ligand protein targets. Code, data, tutorials, and benchmark results are available at https://github.com/BioinfoMachineLearning/PoseBench.
Affiliation(s)
- Alex Morehead
  - Electrical Engineering & Computer Science, NextGen Precision Health, University of Missouri, Columbia, Missouri, USA
- Nabin Giri
  - Electrical Engineering & Computer Science, NextGen Precision Health, University of Missouri, Columbia, Missouri, USA
- Jian Liu
  - Electrical Engineering & Computer Science, NextGen Precision Health, University of Missouri, Columbia, Missouri, USA
- Pawan Neupane
  - Electrical Engineering & Computer Science, NextGen Precision Health, University of Missouri, Columbia, Missouri, USA
- Jianlin Cheng
  - Electrical Engineering & Computer Science, NextGen Precision Health, University of Missouri, Columbia, Missouri, USA
42
Banna HA, Berg K, Sadat T, Das N, Paudel R, D'Souza V, Koirala D. Synthetic anti-RNA antibody derivatives for RNA visualization in mammalian cells. Nucleic Acids Res 2025; 53:gkae1275. [PMID: 39739875 PMCID: PMC11879077 DOI: 10.1093/nar/gkae1275] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/17/2024] [Revised: 11/15/2024] [Accepted: 12/12/2024] [Indexed: 01/02/2025] Open
Abstract
Although antibody derivatives, such as Fabs and scFvs, have revolutionized the cellular imaging, quantification and tracking of proteins, analogous tools and strategies are unavailable for cellular RNA visualization. Here, we developed four synthetic anti-RNA scFv (sarabody) probes and their green fluorescent protein (GFP) fusions and demonstrated their potential to visualize RNA in live mammalian cells. We expressed these sarabodies and sarabody-GFP modules, purified them as soluble proteins, characterized their binding interactions with their corresponding epitopes and finally employed two of the four modules, sara1-GFP and sara1c-GFP, to visualize a target messenger RNA in live U2OS cells. Our current RNA imaging strategy is analogous to the existing MCP-MS2 system for RNA visualization, but additionally, our approach provides robust flexibility for developing target RNA-specific imaging modules, as epitope-specific probes can be selected from a library generated by diversifying the sarabody complementarity determining regions. While we continue to optimize these probes, develop new probes for various target RNAs and incorporate other fluorescence proteins like mCherry and HaloTag, our groundwork results demonstrated that these first-of-a-kind immunofluorescent probes will have tremendous potential for tracking mature RNAs and may aid in visualizing and quantifying many cellular processes as well as examining the spatiotemporal dynamics of various RNAs.
Affiliation(s)
- Hasan Al Banna
  - Department of Chemistry and Biochemistry, University of Maryland, Baltimore County, Baltimore, MD 21250, USA
- Kimberley Berg
  - Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA 02138, USA
- Tasnia Sadat
  - Department of Chemistry and Biochemistry, University of Maryland, Baltimore County, Baltimore, MD 21250, USA
- Naba Krishna Das
  - Department of Chemistry and Biochemistry, University of Maryland, Baltimore County, Baltimore, MD 21250, USA
- Roshan Paudel
  - Department of Computer Science, Morgan State University, Baltimore, MD 21251, USA
- Victoria D'Souza
  - Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA 02138, USA
- Deepak Koirala
  - Department of Chemistry and Biochemistry, University of Maryland, Baltimore County, Baltimore, MD 21250, USA
43
Vincenzi M, Mercurio FA, Autiero I, Leone M. Sam-Sam Association Between EphA2 and SASH1: In Silico Studies of Cancer-Linked Mutations. Molecules 2025; 30:718. [PMID: 39942820 PMCID: PMC11820823 DOI: 10.3390/molecules30030718] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/23/2024] [Revised: 01/21/2025] [Accepted: 01/30/2025] [Indexed: 02/16/2025] Open
Abstract
Recently, SASH1 has emerged as a novel protein interactor of a few Eph tyrosine kinase receptors like EphA2. These interactions involve the first N-terminal Sam (sterile alpha motif) domain of SASH1 (SASH1-Sam1) and the Sam domain of Eph receptors. Currently, the functional meaning of the SASH1-Sam1/EphA2-Sam complex is unknown, but EphA2 is a well-established and crucial player in cancer onset and progression. Thus, herein, to investigate a possible correlation between the formation of the SASH1-Sam1/EphA2-Sam complex and EphA2 activity in cancer, cancer-linked mutations in SASH1-Sam1 were deeply analyzed. Our research plan relied first on searching the COSMIC database for cancer-related SASH1 variants carrying missense mutations in the Sam1 domain and then, through a variety of bioinformatic tools and molecular dynamic simulations, studying how these mutations could affect the stability of SASH1-Sam1 alone, leading eventually to a defective fold. Next, through docking studies, with the support of AlphaFold2 structure predictions, we investigated if/how mutations in SASH1-Sam1 could affect binding to EphA2-Sam. Our study, apart from presenting a solid multistep research protocol to analyze structural consequences related to cancer-associated protein variants with the support of cutting-edge artificial intelligence tools, suggests a few mutations that could more likely modulate the interaction between SASH1-Sam1 and EphA2-Sam.
Affiliation(s)
- Marilisa Leone
  - Institute of Biostructures and Bioimaging, National Research Council of Italy, Via Pietro Castellino 111, 80131 Naples, Italy; (M.V.); (F.A.M.); (I.A.)
44
Zhou Z, Riley R, Kautsar S, Wu W, Egan R, Hofmeyr S, Goldhaber-Gordon S, Yu M, Ho H, Liu F, Chen F, Morgan-Kiss R, Shi L, Liu H, Wang Z. GenomeOcean: An Efficient Genome Foundation Model Trained on Large-Scale Metagenomic Assemblies. bioRxiv 2025:2025.01.30.635558. [PMID: 39975405 PMCID: PMC11838515 DOI: 10.1101/2025.01.30.635558] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/21/2025]
Abstract
Genome foundation models hold transformative potential for precision medicine, drug discovery, and understanding complex biological systems. However, existing models are often inefficient, constrained by suboptimal tokenization and architectural design, and biased toward reference genomes, limiting their representation of low-abundance, uncultured microbes in the rare biosphere. To address these challenges, we developed GenomeOcean, a 4-billion-parameter generative genome foundation model trained on over 600 Gbp of high-quality contigs derived from 220 TB of metagenomic datasets collected from diverse habitats across Earth's ecosystems. A key innovation of GenomeOcean is training directly on large-scale co-assemblies of metagenomic samples, enabling enhanced representation of rare microbial species and improving generalizability beyond genome-centric approaches. We implemented a byte-pair encoding (BPE) tokenization strategy for genome sequence generation, alongside architectural optimizations, achieving up to 150× faster sequence generation while maintaining high biological fidelity. GenomeOcean excels in representing microbial species and generating protein-coding genes constrained by evolutionary principles. Additionally, its fine-tuned model demonstrates the ability to discover novel biosynthetic gene clusters (BGCs) in natural genomes and perform zero-shot synthesis of biochemically plausible, complete BGCs. GenomeOcean sets a new benchmark for metagenomic research, natural product discovery, and synthetic biology, offering a robust foundation for advancing these fields.
Affiliation(s)
- Robert Riley
  - Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Satria Kautsar
  - Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Weimin Wu
  - Northwestern University, Evanston, IL, USA
- Rob Egan
  - Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Steven Hofmeyr
  - Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Mutian Yu
  - Northwestern University, Evanston, IL, USA
- Harrison Ho
  - Lawrence Berkeley National Laboratory, Berkeley, CA, USA
  - University of California at Merced, Merced, CA, USA
- Fengchen Liu
  - Lawrence Berkeley National Laboratory, Berkeley, CA, USA
  - University of California at Berkeley, Berkeley, CA, USA
- Lizhen Shi
  - Northwestern University, Evanston, IL, USA
- Han Liu
  - Northwestern University, Evanston, IL, USA
- Zhong Wang
  - Lawrence Berkeley National Laboratory, Berkeley, CA, USA
  - University of California at Merced, Merced, CA, USA
45
Prabakaran R, Bromberg Y. Functional profiling of the sequence stockpile: a protein pair-based assessment of in silico prediction tools. Bioinformatics 2025; 41:btaf035. [PMID: 39854283 PMCID: PMC11821270 DOI: 10.1093/bioinformatics/btaf035] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/17/2024] [Revised: 11/04/2024] [Accepted: 01/22/2025] [Indexed: 01/26/2025] Open
Abstract
MOTIVATION: In silico functional annotation of proteins is crucial to narrowing the sequencing-accelerated gap in our understanding of protein activities. Numerous function annotation methods exist, and their ranks have been growing, particularly so with the recent deep learning-based developments. However, it is unclear if these tools are truly predictive. As we are not aware of any methods that can identify new terms in functional ontologies, we ask if they can, at least, identify molecular functions of proteins that are non-homologous to or far-removed from known protein families. RESULTS: Here, we explore the potential and limitations of the existing methods in predicting the molecular functions of thousands of such proteins. Lacking the "ground truth" functional annotations, we transformed the assessment of function prediction into evaluation of functional similarity of protein pairs that likely share function but are unlike any of the currently functionally annotated sequences. Notably, our approach transcends the limitations of functional annotation vocabularies, providing a means to assess different-ontology annotation methods. We find that most existing methods are limited to identifying functional similarity of homologous sequences and fail to predict the function of proteins lacking reference. Curiously, despite their seemingly unlimited by-homology scope, deep learning methods also have trouble capturing the functional signal encoded in protein sequence. We believe that our work will inspire the development of a new generation of methods that push boundaries and promote exploration and discovery in the molecular function domain. AVAILABILITY AND IMPLEMENTATION: The data underlying this article are available at https://doi.org/10.6084/m9.figshare.c.6737127.v3. The code used to compute siblings is available openly at https://bitbucket.org/bromberglab/siblings-detector/.
Affiliation(s)
- R Prabakaran
- Department of Biology, Emory University, Atlanta, GA 30322, United States
- Department of Computer Science, Emory University, Atlanta, GA 30322, United States
| | - Yana Bromberg
- Department of Biology, Emory University, Atlanta, GA 30322, United States
- Department of Computer Science, Emory University, Atlanta, GA 30322, United States
| |
|
46
|
Wang M, Robertson D, Zou J, Spanos C, Rappsilber J, Marston AL. Molecular mechanism targeting condensin for chromosome condensation. EMBO J 2025; 44:705-735. [PMID: 39690240 PMCID: PMC11791182 DOI: 10.1038/s44318-024-00336-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 05/22/2024] [Revised: 11/26/2024] [Accepted: 12/02/2024] [Indexed: 12/19/2024] Open
Abstract
Genomes are organised into DNA loops by the Structural Maintenance of Chromosomes (SMC) proteins. SMCs establish functional chromosomal sub-domains for DNA repair, gene expression and chromosome segregation, but how SMC activity is specifically targeted is unclear. Here, we define the molecular mechanism targeting the condensin SMC complex to specific chromosomal regions in budding yeast. A conserved pocket on the condensin HAWK subunit Ycg1 binds to chromosomal receptors carrying a related motif, CR1. In early mitosis, CR1 motifs in receptors Sgo1 and Lrs4 recruit condensin to pericentromeres and rDNA, to facilitate sister kinetochore biorientation and rDNA condensation, respectively. We additionally find that chromosome arm condensation begins as sister kinetochores come under tension, in a manner dependent on the Ycg1 pocket. We propose that multiple CR1-containing proteins recruit condensin to chromosomes and identify several additional candidates based on their sequence. Overall, we uncover the molecular mechanism that targets condensin to functionalise chromosomal domains to achieve accurate chromosome segregation during mitosis.
Affiliation(s)
- Menglu Wang
- Centre for Cell Biology, Institute of Cell Biology, University of Edinburgh, Edinburgh, EH9 3BF, United Kingdom
| | - Daniel Robertson
- Centre for Cell Biology, Institute of Cell Biology, University of Edinburgh, Edinburgh, EH9 3BF, United Kingdom
| | - Juan Zou
- Centre for Cell Biology, Institute of Cell Biology, University of Edinburgh, Edinburgh, EH9 3BF, United Kingdom
| | - Christos Spanos
- Centre for Cell Biology, Institute of Cell Biology, University of Edinburgh, Edinburgh, EH9 3BF, United Kingdom
| | - Juri Rappsilber
- Centre for Cell Biology, Institute of Cell Biology, University of Edinburgh, Edinburgh, EH9 3BF, United Kingdom
- Institute of Biotechnology, Technische Universität Berlin, Gustav-Meyer-Allee 25, 13355, Berlin, Germany
| | - Adele L Marston
- Centre for Cell Biology, Institute of Cell Biology, University of Edinburgh, Edinburgh, EH9 3BF, United Kingdom.
| |
|
47
|
Mi T, Xiao N, Gong H. GDFold2: A fast and parallelizable protein folding environment with freely defined objective functions. Protein Sci 2025; 34:e70041. [PMID: 39873342 PMCID: PMC11773392 DOI: 10.1002/pro.70041] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/28/2024] [Revised: 12/31/2024] [Accepted: 01/10/2025] [Indexed: 01/30/2025]
Abstract
An important step of mainstream protein structure prediction is to model the 3D protein structure based on the predicted 2D inter-residue geometric information. This folding step has been integrated into a unified neural network to allow end-to-end training in state-of-the-art methods like AlphaFold2, but is separately implemented using the Rosetta folding environment in some traditional methods like trRosetta. Despite the inferiority in prediction accuracy, the conventional approach allows for the sampling of various protein conformations compatible with the predicted geometric constraints, partially capturing the dynamic information. Here, we propose GDFold2, a novel protein folding environment, to address the limitations of Rosetta. On the one hand, GDFold2 is highly computationally efficient, capable of accomplishing multiple folding processes in parallel within the time scale of minutes for generic proteins. On the other hand, GDFold2 supports freely defined objective functions to fulfill diversified optimization requirements. Moreover, we propose a quality assessment (QA) model to provide reliable prediction on the quality of protein structures folded by GDFold2, thus substantially simplifying the selection of structural models. GDFold2 and the QA model could be combined to investigate the transition path between protein conformational states, and the online server is available at https://structpred.life.tsinghua.edu.cn/server_gdfold2.html.
Affiliation(s)
- Tianyu Mi
- MOE Key Laboratory of Bioinformatics, School of Life Sciences, Tsinghua University, Beijing, China
- Beijing Frontier Research Center for Biological Structure, Tsinghua University, Beijing, China
| | - Nan Xiao
- MOE Key Laboratory of Bioinformatics, School of Life Sciences, Tsinghua University, Beijing, China
- Beijing Frontier Research Center for Biological Structure, Tsinghua University, Beijing, China
| | - Haipeng Gong
- MOE Key Laboratory of Bioinformatics, School of Life Sciences, Tsinghua University, Beijing, China
- Beijing Frontier Research Center for Biological Structure, Tsinghua University, Beijing, China
| |
|
48
|
Simpson J, Kasson PM. Structural prediction of chimeric immunogen candidates to elicit targeted antibodies against betacoronaviruses. PLoS Comput Biol 2025; 21:e1012812. [PMID: 39908344 PMCID: PMC11809852 DOI: 10.1371/journal.pcbi.1012812] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/01/2024] [Revised: 02/10/2025] [Accepted: 01/20/2025] [Indexed: 02/07/2025] Open
Abstract
Betacoronaviruses pose an ongoing pandemic threat. Antigenic evolution of the SARS-CoV-2 virus has shown that much of the spontaneous antibody response is narrowly focused rather than broadly neutralizing against even SARS-CoV-2 variants, let alone future threats. One way to overcome this is by focusing the antibody response against better-conserved regions of the viral spike protein. This has been demonstrated empirically in prior work, but we posit that systematic design tools will further potentiate antigenic focusing approaches. Here, we present a design approach to predict stable chimeras between SARS-CoV-2 and other coronaviruses, creating synthetic spike proteins that display a desired conserved region, in this case S2, and vary other regions. We leverage AlphaFold to predict chimeric structures and create a new metric for scoring chimera stability based on AlphaFold outputs. We evaluated 114 candidate spike chimeras using this approach. Top chimeras were further evaluated using molecular dynamics simulation as an intermediate validation technique, showing good stability compared to low-scoring controls. Experimental testing of five predicted-stable and two predicted-unstable chimeras confirmed 5/7 predictions, with one intermediate result. This demonstrates the feasibility of the underlying approach, which can be used to design custom immunogens to focus the immune response against a desired viral glycoprotein epitope.
Affiliation(s)
- Jamel Simpson
- Program in Biophysics and Department of Biomedical Engineering, University of Virginia, Charlottesville, Virginia, United States of America
| | - Peter M. Kasson
- Program in Biophysics and Department of Biomedical Engineering, University of Virginia, Charlottesville, Virginia, United States of America
- Departments of Chemistry and Biochemistry and Biomedical Engineering, Georgia Institute of Technology, Atlanta, Georgia, United States of America
- Department of Cell and Molecular Biology, Uppsala University, Uppsala, Sweden
| |
|
49
|
Dey M, Gupta A, Badmalia MD, Ashish, Sharma D. Visualizing gaussian-chain like structural models of human α-synuclein in monomeric pre-fibrillar state: Solution SAXS data and modeling analysis. Int J Biol Macromol 2025; 288:138614. [PMID: 39674478 DOI: 10.1016/j.ijbiomac.2024.138614] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/23/2024] [Revised: 12/08/2024] [Accepted: 12/08/2024] [Indexed: 12/16/2024]
Abstract
Here, using a small-angle X-ray scattering (SAXS) data profile as reference, we attempted to visualize the conformational ensemble accessible to the prefibrillar monomeric state of α-synuclein in solution. In agreement with previous reports, our analysis confirmed that α-synuclein molecules adopted a disordered shape profile under non-associating conditions. A chain-ensemble modeling protocol with dummy residues provided two weighted-average clusters of semi-extended shapes. Further, the Ensemble Optimization Method (EOM) computed mole fractions of semi-extended "twisted" conformations that might co-exist in solution. Since these models were only Cα traces, the AlphaFold2 server was used to search for all-atom models. Comparison with experimental data showed that all predicted models, taken individually, disagreed equally. Finally, we employed molecular dynamics simulations and a normal mode analysis-based search coupled with SAXS data to seek better-agreeing models. Overall, our analysis concludes that a shifting equilibrium of curved models with low α-helical content best represents non-associating monomeric α-synuclein.
Affiliation(s)
- Madhumita Dey
- CSIR - Institute of Microbial Technology, Chandigarh, India
| | - Arpit Gupta
- CSIR - Institute of Microbial Technology, Chandigarh, India
| | | | - Ashish
- CSIR - Institute of Microbial Technology, Chandigarh, India.
| | - Deepak Sharma
- CSIR - Institute of Microbial Technology, Chandigarh, India.
| |
|
50
|
Bu F, Adam Y, Adamiak RW, Antczak M, de Aquino BRH, Badepally NG, Batey RT, Baulin EF, Boinski P, Boniecki MJ, Bujnicki JM, Carpenter KA, Chacon J, Chen SJ, Chiu W, Cordero P, Das NK, Das R, Dawson WK, DiMaio F, Ding F, Dock-Bregeon AC, Dokholyan NV, Dror RO, Dunin-Horkawicz S, Eismann S, Ennifar E, Esmaeeli R, Farsani MA, Ferré-D'Amaré AR, Geniesse C, Ghanim GE, Guzman HV, Hood IV, Huang L, Jain DS, Jaryani F, Jin L, Joshi A, Karelina M, Kieft JS, Kladwang W, Kmiecik S, Koirala D, Kollmann M, Kretsch RC, Kurciński M, Li J, Li S, Magnus M, Masquida B, Moafinejad SN, Mondal A, Mukherjee S, Nguyen THD, Nikolaev G, Nithin C, Nye G, Pandaranadar Jeyeram IPN, Perez A, Pham P, Piccirilli JA, Pilla SP, Pluta R, Poblete S, Ponce-Salvatierra A, Popenda M, Popenda L, Pucci F, Rangan R, Ray A, Ren A, Sarzynska J, Sha CM, Stefaniak F, Su Z, Suddala KC, Szachniuk M, Townshend R, Trachman RJ, Wang J, Wang W, Watkins A, Wirecki TK, Xiao Y, Xiong P, Xiong Y, Yang J, Yesselman JD, Zhang J, Zhang Y, Zhang Z, Zhou Y, Zok T, Zhang D, Zhang S, Żyła A, Westhof E, Miao Z. RNA-Puzzles Round V: blind predictions of 23 RNA structures. Nat Methods 2025; 22:399-411. [PMID: 39623050 PMCID: PMC11810798 DOI: 10.1038/s41592-024-02543-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/15/2024] [Accepted: 10/29/2024] [Indexed: 01/16/2025]
Abstract
RNA-Puzzles is a collective endeavor dedicated to the advancement and improvement of RNA three-dimensional structure prediction. With agreement from structural biologists, RNA structures are predicted by modeling groups before publication of the experimental structures. We report a large-scale set of predictions by 18 groups for 23 RNA-Puzzles: 4 RNA elements, 2 Aptamers, 4 Viral elements, 5 Ribozymes and 8 Riboswitches. We describe automatic assessment protocols for comparisons between prediction and experiment. Our analyses reveal some critical steps to be overcome to achieve good accuracy in modeling RNA structures: identification of helix-forming pairs and of non-Watson-Crick modules, correct coaxial stacking between helices and avoidance of entanglements. Three of the top four modeling groups in this round also ranked among the top four in the CASP15 contest.
Grants
- T32 GM066706 NIGMS NIH HHS
- NSFC T2225007 National Natural Science Foundation of China (National Science Foundation of China)
- R35 GM134919 NIGMS NIH HHS
- R35GM145409 Foundation for the National Institutes of Health (Foundation for the National Institutes of Health, Inc.)
- R35 GM145409 NIGMS NIH HHS
- 32270707 National Natural Science Foundation of China (National Science Foundation of China)
- R35 GM122579 NIGMS NIH HHS
- R35 GM134864 NIGMS NIH HHS
- T32 grant GM066706 Foundation for the National Institutes of Health (Foundation for the National Institutes of Health, Inc.)
- P20GM121342 Foundation for the National Institutes of Health (Foundation for the National Institutes of Health, Inc.)
- R21 CA219847 NCI NIH HHS
- 32171191 National Natural Science Foundation of China (National Science Foundation of China)
- P20 GM121342 NIGMS NIH HHS
- R35 GM152029 NIGMS NIH HHS
- R01 GM073850 NIGMS NIH HHS
- F32 GM112294 NIGMS NIH HHS
- ZIA DK075136 Intramural NIH HHS
- Z.M. is supported by Major Projects of Guangzhou National Laboratory, (Grant No. GZNL2023A01006, GZNL2024A01002, SRPG22-003, SRPG22-006, SRPG22-007, HWYQ23-003, YW-YFYJ0102), the National Key R&D Programs of China (2023YFF1204700, 2023YFF1204701, 2021YFF1200900, 2021YFF1200903). This work is part of the ITI 2021-2028 program and supported by IdEx Unistra (ANR-10-IDEX-0002 to E.W.), SFRI-STRAT’US project (ANR-20-SFRI-0012) and EUR IMCBio (IMCBio ANR-17-EURE-0023 to E.W.) under the framework of the French Investments for the Future Program.
- E.W. acknowledges also support from Wenzhou Institute, University of Chinese Academy of Sciences (WIUCASQD2024002).
- E.F.B. was additionally supported by European Molecular Biology Organization (EMBO) fellowship (ALTF 525-2022).
- Boniecki’s research was supported by the Polish National Science Center Poland (NCN) (grant 2016/23/B/ST6/03433 to Michal J. Boniecki). Predictions were performed using computational resources of the Interdisciplinary Centre for Mathematical and Computational Modelling of the University of Warsaw (ICM) (grant G66-9).
- J.M.B. is supported by the National Science Centre in Poland (NCN grants: 2017/26/A/NZ1/01083 to J.M.B., 2021/43/D/NZ1/03360 to S.M., 2020/39/B/NZ2/03127 to F.S., 2020/39/D/NZ2/02837 to T.K.W.). J.M.B. acknowledges the Polish high-performance computing infrastructure PLGrid (HPC Centers: ACK Cyfronet AGH, PCSS, CI TASK, WCSS) for providing computer facilities and support within the computational grant PLG/2023/016080.
- S.J.C. is supported by the National Institutes of Health under Grant R35-GM134919.
- R.D. is supported by Stanford Bio-X (to R.D., R.O.D., R.C.K., and S.E.); Stanford Gerald J. Lieberman Fellowship (to R.R.); the National Institutes of Health (R21 CA219847 and R35 GM122579 to R.D.), the Howard Hughes Medical Institute (HHMI, to R.D.); Consejo Nacional de Ciencia y Tecnología CONACyT Fellowship 312765 (P.C.); the Ruth L. Kirschstein National Research Service Award Postdoctoral Fellowships GM112294 (to J.D.Y.); National Science Foundation Graduate Research Fellowships (R.J.L.T. and R.R.); the National Library of Medicine T15 Training Grant (NLM T15007033 to K.A.C.); the U.S. Department of Energy, Office of Science Graduate Student Research program (R.J.L.T.).
- The National Institutes of Health grants 1R35 GM134864 and the Passan Foundation.
- R.O.D. is supported by the U.S. Department of Energy, Office of Science, Scientific Discovery through Advanced Computing (SciDAC) program (R.O.D.); Intel (R.O.D.).
- A.F.D. is supported, in part, by the intramural program of the National Heart, Lung and Blood Institute, National Institutes of Health, USA.
- Guangdong Science and Technology Department (2022A1515010328, 2023B1212060013, 2020B1212030004), Fundamental Research Funds for the Central Universities, Sun Yat-sen University (23ptpy41).
- D.K. is supported by the NSF CAREER award MCB-2236996, and start-up, SURFF, and START awards from the University of Maryland Baltimore County to D.K.
- BM is supported by the Interdisciplinary Thematic Institute IMCBio, as part of the ITI 2021-2028 program at the University of Strasbourg, CNRS and Inserm, by IdEx Unistra (ANR-10-IDEX-0002), and EUR (IMCBio ANR-17-EUR-0023), under the framework of the French Investments Program for the Future.
- T.H.D.N. is supported by UKRI-Medical Research Council grant MC_UP_1201/19.
- C.N. and M.K. acknowledge funding from the National Science Centre, Poland [OPUS 2019/33/B/NZ2/02100]; S.P.P. acknowledges funding from the National Science Centre, Poland [OPUS 2020/39/B/NZ2/01301]; S.K. acknowledges funding from the National Science Centre, Poland [Sheng 2021/40/Q/NZ2/00078]; C.N. acknowledges the Polish high-performance computing infrastructure PLGrid (HPC Centers: PCSS, ACK Cyfronet AGH, CI TASK, WCSS) for providing computer facilities and support within the computational grants PLG/2022/016043, PLG/2022/015327 and PLG/2020/013424.
- A.P. is supported by an NSF-CAREER award CHE-2235785.
- A.R. is supported by grants from the Natural Science Foundation of China (32325029, 32022039, 91940302, and 91640104), the National Key Research and Development Project of China (2021YFC2300300 and 2023YFC2604300).
- Marta Szachniuk is supported by the National Science Centre, Poland (2019/35/B/ST6/03074 to M.S.), the statutory funds of IBCH PAS and Poznan University of Technology.
- J.W. is supported by the Penn State College of Medicine’s Artificial Intelligence and Biomedical Informatics Program.
- J.Z. is supported by the Intramural Research Program of the NIH, the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK) (ZIADK075136 to J.Z.), and an NIH Deputy Director for Intramural Research (DDIR) Challenge Award to J.Z.
Affiliation(s)
- Fan Bu
- GMU-GIBH Joint School of Life Sciences, The Guangdong-Hong Kong-Macao Joint Laboratory for Cell Fate Regulation and Diseases, Guangzhou National Laboratory, Guangzhou Medical University, Guangzhou, China
- School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, China
| | - Yagoub Adam
- Inter-institutional Graduate Program on Bioinformatics, Department of Computer Science and Mathematics, FFCLRP, University of São Paulo, Ribeirão Preto, Brazil
- Covenant University Bioinformatics Research (CUBRe), Covenant University, Ota, Nigeria
| | - Ryszard W Adamiak
- Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan, Poland
- Institute of Computing Science, Poznan University of Technology, Poznan, Poland
| | - Maciej Antczak
- Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan, Poland
- Institute of Computing Science, Poznan University of Technology, Poznan, Poland
| | - Belisa Rebeca H de Aquino
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
| | - Nagendar Goud Badepally
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
| | - Robert T Batey
- Department of Biochemistry, University of Colorado at Boulder, Boulder, CO, USA
| | - Eugene F Baulin
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
| | - Pawel Boinski
- Institute of Computing Science, Poznan University of Technology, Poznan, Poland
| | - Michal J Boniecki
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
| | - Janusz M Bujnicki
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
| | - Kristy A Carpenter
- Department of Biomedical Data Science, Stanford University, Stanford, CA, USA
| | - Jose Chacon
- Department of Biochemistry, Stanford University, Stanford, CA, USA
- Department of Cell and Developmental Biology, University of California San Diego, San Diego, CA, USA
| | - Shi-Jie Chen
- Department of Physics, Department of Biochemistry and Institute for Data Science and Informatics, University of Missouri, Columbia, MO, USA
| | - Wah Chiu
- Department of Bioengineering and James H. Clark Center, Stanford University, Stanford, CA, USA
| | - Pablo Cordero
- Department of Biochemistry, Stanford University, Stanford, CA, USA
- Stripe, South San Francisco, CA, USA
| | - Naba Krishna Das
- Department of Chemistry and Biochemistry, University of Maryland Baltimore County, Baltimore, MD, USA
| | - Rhiju Das
- Department of Biochemistry, Stanford University, Stanford, CA, USA
- Howard Hughes Medical Institute, Stanford University, Stanford, CA, USA
- Biophysics program, Stanford University, Stanford, CA, USA
| | - Wayne K Dawson
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
| | - Frank DiMaio
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Feng Ding
- Department of Physics and Astronomy, Clemson University, Clemson, SC, USA
| | - Anne-Catherine Dock-Bregeon
- Laboratory of Integrative Biology of Marine Models (LBI2M), Sorbonne University-CNRS UMR8227, Roscoff, France
| | - Nikolay V Dokholyan
- Department of Pharmacology, Penn State College of Medicine, Hershey, PA, USA
| | - Ron O Dror
- Department of Computer Science, Stanford University, Stanford, CA, USA
- Department of Structural Biology, Stanford University, Stanford, CA, USA
- Department of Molecular and Cellular Physiology, Stanford University, Stanford, CA, USA
- Institute for Computational and Mathematical Engineering, Stanford University, Stanford, CA, USA
| | - Stanisław Dunin-Horkawicz
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
| | - Stephan Eismann
- Department of Applied Physics, Stanford University, Stanford, CA, USA
- Atomic AI, South San Francisco, CA, USA
| | - Eric Ennifar
- Architecture et Réactivité de l'ARN, Institut de Biologie Moléculaire et Cellulaire du CNRS, Université de Strasbourg, Strasbourg, France
| | - Reza Esmaeeli
- Department of Chemistry and Quantum Theory Project, University of Florida, Gainesville, FL, USA
| | - Masoud Amiri Farsani
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
| | - Adrian R Ferré-D'Amaré
- Laboratory of Nucleic Acids, National Heart, Lung and Blood Institute, Bethesda, MD, USA
| | - Caleb Geniesse
- Department of Biochemistry, Stanford University, Stanford, CA, USA
- Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - George E Ghanim
- Medical Research Council Laboratory of Molecular Biology, Cambridge, UK
| | - Horacio V Guzman
- Instituto de Ciencia de Materials de Barcelona, ICMAB-CSIC, Bellaterra E-08193, Spain & Departamento de Física Teórica de la Materia Condensada, Universidad Autónoma de Madrid, Madrid, Spain
| | - Iris V Hood
- Laboratory of Molecular Biology, National Institute of Diabetes and Digestive and Kidney Diseases, Bethesda, MD, USA
| | - Lin Huang
- Guangdong Provincial Key Laboratory of Malignant Tumor Epigenetics and Gene Regulation, Guangdong-Hong Kong Joint Laboratory for RNA Medicine, Sun Yat-Sen Memorial Hospital, Sun Yat-Sen University Guangzhou, Guangdong, China
| | - Dharm Skandh Jain
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
| | - Farhang Jaryani
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
| | - Lei Jin
- Department of Physics, Department of Biochemistry and Institute for Data Science and Informatics, University of Missouri, Columbia, MO, USA
| | - Astha Joshi
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
| | - Masha Karelina
- Biophysics program, Stanford University, Stanford, CA, USA
- Department of Computer Science, Stanford University, Stanford, CA, USA
| | - Jeffrey S Kieft
- Department of Biochemistry and Molecular Genetics, University of Colorado Denver School of Medicine, Aurora, CO, USA
- New York Structural Biology Center, New York, NY, USA
| | - Wipapat Kladwang
- Department of Biochemistry, Stanford University, Stanford, CA, USA
- Howard Hughes Medical Institute, Stanford University, Stanford, CA, USA
| | - Sebastian Kmiecik
- Laboratory of Computational Biology, Biological and Chemical Research Center, Faculty of Chemistry, University of Warsaw, Warsaw, Poland
| | - Deepak Koirala
- Department of Chemistry and Biochemistry, University of Maryland Baltimore County, Baltimore, MD, USA
| | - Markus Kollmann
- Department of Computer Science, Heinrich Heine University of Düsseldorf, Düsseldorf, Germany
| | | | - Mateusz Kurciński
- Laboratory of Computational Biology, Biological and Chemical Research Center, Faculty of Chemistry, University of Warsaw, Warsaw, Poland
| | - Jun Li
- Department of Physics, Department of Biochemistry and Institute for Data Science and Informatics, University of Missouri, Columbia, MO, USA
| | - Shuang Li
- Laboratory of Molecular Biology, National Institute of Diabetes and Digestive and Kidney Diseases, Bethesda, MD, USA
| | - Marcin Magnus
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA, USA
| | - Benoît Masquida
- UMR 7156, CNRS - Université de Strasbourg, IPCB, Strasbourg, France
| | - S Naeim Moafinejad
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
| | - Arup Mondal
- Department of Chemistry and Quantum Theory Project, University of Florida, Gainesville, FL, USA
| | - Sunandan Mukherjee
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
| | | | - Grigory Nikolaev
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
| | - Chandran Nithin
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Laboratory of Computational Biology, Biological and Chemical Research Center, Faculty of Chemistry, University of Warsaw, Warsaw, Poland
| | - Grace Nye
- Howard Hughes Medical Institute, Stanford University, Stanford, CA, USA
| | - Iswarya P N Pandaranadar Jeyeram
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
| | - Alberto Perez
- Department of Chemistry and Quantum Theory Project, University of Florida, Gainesville, FL, USA
| | - Phillip Pham
- Howard Hughes Medical Institute, Stanford University, Stanford, CA, USA
| | - Joseph A Piccirilli
- Department of Biochemistry and Molecular Biology, The University of Chicago, Chicago, IL, USA
- Department of Chemistry, The University of Chicago, Chicago, IL, USA
| | - Smita Priyadarshini Pilla
- Laboratory of Computational Biology, Biological and Chemical Research Center, University of Warsaw, Warsaw, Poland
| | - Radosław Pluta
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
| | - Simón Poblete
- Facultad de Ingeniería, Arquitectura y Diseño, Universidad San Sebastián, Santiago, Chile
- Centro BASAL Ciencia & Vida, Universidad San Sebastián, Santiago, Chile
| | - Almudena Ponce-Salvatierra
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
| | - Mariusz Popenda
- Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan, Poland
| | - Lukasz Popenda
- NanoBioMedical Centre, Adam Mickiewicz University, Poznan, Poland
| | - Fabrizio Pucci
- Computational Biology and Bioinformatics, Université Libre de Bruxelles, Brussels, Belgium
| | - Ramya Rangan
- Biophysics program, Stanford University, Stanford, CA, USA
- Atomic AI, South San Francisco, CA, USA
| | - Angana Ray
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
| | - Aiming Ren
- Life Sciences Institute, Zhejiang University, Hangzhou, China
| | - Joanna Sarzynska
- Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan, Poland
| | - Congzhou Mike Sha
- Department of Pharmacology, Penn State College of Medicine, Hershey, PA, USA
| | - Filip Stefaniak
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
| | - Zhaoming Su
- The State Key Laboratory of Biotherapy, West China Hospital, Chengdu, China
| | - Krishna C Suddala
- Laboratory of Molecular Biology, National Institute of Diabetes and Digestive and Kidney Diseases, Bethesda, MD, USA
| | - Marta Szachniuk
- Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan, Poland
- Institute of Computing Science, Poznan University of Technology, Poznan, Poland
| | - Raphael Townshend
- Department of Computer Science, Stanford University, Stanford, CA, USA
- Atomic AI, South San Francisco, CA, USA
| | - Robert J Trachman
- Laboratory of Nucleic Acids, National Heart, Lung and Blood Institute, Bethesda, MD, USA
| | - Jian Wang
- Department of Pharmacology, Penn State College of Medicine, Hershey, PA, USA
| | - Wenkai Wang
- MOE Frontiers Science Center for Nonlinear Expectations, Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao, China
| | - Andrew Watkins
- Department of Biochemistry, Stanford University, Stanford, CA, USA
- Prescient Design, Genentech Research and Early Development, South San Francisco, CA, USA
| | - Tomasz K Wirecki
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
| | - Yi Xiao
- School of Physics and Key Laboratory of Molecular Biophysics of the Ministry of Education, Huazhong University of Science and Technology, Wuhan, China
| | - Peng Xiong
- School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, China
- Department of Biomedical Engineering, Suzhou Institute for Advanced Research, University of Science and Technology of China, Suzhou, China
| | - Yiduo Xiong
- School of Physics and Key Laboratory of Molecular Biophysics of the Ministry of Education, Huazhong University of Science and Technology, Wuhan, China
| | - Jianyi Yang
- MOE Frontiers Science Center for Nonlinear Expectations, Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao, China
| | - Joseph David Yesselman
- Howard Hughes Medical Institute, Stanford University, Stanford, CA, USA
- Department of Chemistry, University of Nebraska, Lincoln, NE, USA
| | - Jinwei Zhang
- Laboratory of Molecular Biology, National Institute of Diabetes and Digestive and Kidney Diseases, Bethesda, MD, USA
| | - Yi Zhang
- School of Physics and Key Laboratory of Molecular Biophysics of the Ministry of Education, Huazhong University of Science and Technology, Wuhan, China
| | - Zhenzhen Zhang
- Department of Physics and Astronomy, Clemson University, Clemson, SC, USA
| | - Yuanzhe Zhou
- Department of Physics, Department of Biochemistry and Institute for Data Science and Informatics, University of Missouri, Columbia, MO, USA
| | - Tomasz Zok
- Institute of Computing Science, Poznan University of Technology, Poznan, Poland
| | - Dong Zhang
- Department of Physics, Department of Biochemistry and Institute for Data Science and Informatics, University of Missouri, Columbia, MO, USA
| | - Sicheng Zhang
- Department of Physics, Department of Biochemistry and Institute for Data Science and Informatics, University of Missouri, Columbia, MO, USA
| | - Adriana Żyła
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
| | - Eric Westhof
- Architecture et Réactivité de l'ARN, Institut de Biologie Moléculaire et Cellulaire du CNRS, Université de Strasbourg, Strasbourg, France
- Engineering Research Center of Clinical Functional Materials and Diagnosis & Treatment Devices of Zhejiang Province, Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou, China
| | - Zhichao Miao
- GMU-GIBH Joint School of Life Sciences, The Guangdong-Hong Kong-Macao Joint Laboratory for Cell Fate Regulation and Diseases, Guangzhou National Laboratory, Guangzhou Medical University, Guangzhou, China
- Shanghai Key Laboratory of Anesthesiology and Brain Functional Modulation, Clinical Research Center for Anesthesiology and Perioperative Medicine, Translational Research Institute of Brain and Brain-Like Intelligence, Shanghai Fourth People's Hospital, School of Medicine, Tongji University, Shanghai, China
- European Bioinformatics Institute, European Molecular Biology Laboratory, Wellcome Genome Campus, Cambridge, UK
| |
|