Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Heinzinger M, Weissenow K, Sanchez J, Henkel A, Mirdita M, Steinegger M, Rost B. Bilingual language model for protein sequence and structure. NAR Genom Bioinform 2024;6:lqae150. [PMID: 39633723 PMCID: PMC11616678 DOI: 10.1093/nargab/lqae150] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/21/2024] [Revised: 08/02/2024] [Accepted: 10/21/2024] [Indexed: 12/07/2024]

Cited by Other Article(s)

Chen SF, Steele RJ, Hocky GM, Lemeneh B, Lad SP, Oermann EK. Large-Scale Multi-omic Biosequence Transformers for Modeling Protein-Nucleic Acid Interactions. arXiv 2025:arXiv:2408.16245v4. [PMID: 40236839 PMCID: PMC11998858] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/17/2025]

Georgakis N, Premetis GE, Pantiora P, Varotsou C, Bodourian CS, Labrou NE. The impact of metagenomic analysis on the discovery of novel endolysins. Appl Microbiol Biotechnol 2025;109:126. [PMID: 40411603 PMCID: PMC12103483 DOI: 10.1007/s00253-025-13513-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/12/2025] [Revised: 05/04/2025] [Accepted: 05/05/2025] [Indexed: 05/26/2025]

Yurtseven A, Keller S, Hirsch P, Kalinina OV, Gress A. StructMAn 2.0 Web: a web server for structural annotation of protein sequences and mutations. Nucleic Acids Res 2025:gkaf381. [PMID: 40326516 DOI: 10.1093/nar/gkaf381] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/03/2025] [Revised: 04/11/2025] [Accepted: 04/25/2025] [Indexed: 05/07/2025]

Pokharel S, Barasa K, Pratyush P, KC DB. PLM-DBPs: enhancing plant DNA-binding protein prediction by integrating sequence-based and structure-aware protein language models. Brief Bioinform 2025;26:bbaf245. [PMID: 40439671 PMCID: PMC12121366 DOI: 10.1093/bib/bbaf245] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/06/2025] [Revised: 04/14/2025] [Accepted: 05/05/2025] [Indexed: 06/02/2025]

Ciuchcinski K, Kaczorowska AK, Biernacka D, Dorawa S, Kaczorowski T, Park Y, Piekarski K, Stanowski M, Ishikawa T, Stokke R, Steen IH, Dziewit L. Computational pipeline for sustainable enzyme discovery through (re)use of metagenomic data. J Environ Manage 2025;382:125381. [PMID: 40252419 DOI: 10.1016/j.jenvman.2025.125381] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/29/2024] [Revised: 04/03/2025] [Accepted: 04/13/2025] [Indexed: 04/21/2025]

Abstract

Enzymes derived from extremophilic organisms, also known as extremozymes, offer sustainable and efficient solutions for industrial applications. Valued for their resilience and low environmental impact, extremozymes have found use as catalysts in various processes, ranging from dairy production to pharmaceutical manufacturing. However, discovery of novel extremozymes is often hindered by challenges such as culturing difficulties, underrepresentation of extreme environments in reference databases, and limitations of traditional sequence-based screening methods. In this work, we present a computational pipeline designed to discover novel enzymes from metagenomic data derived from extreme environments. This pipeline represents a versatile and sustainable approach that promotes reuse and recycling of existing datasets and minimises the need for additional environmental sampling. In its core, the algorithm integrates both traditional bioinformatic techniques and recent advances in structural prediction, enabling rapid and accurate identification of enzymes. However, due to its design, the algorithm relies heavily on existing databases, which can limit its effectiveness in situations where reference data is scarce or when encountering novel protein families. As a proof-of-concept, we applied the pipeline to metagenomic data from deep-sea hydrothermal vents, with a focus on β-galactosidases. The pipeline identified 11 potential candidate proteins, out of which 10 showed in vitro activity. One of the selected enzymes, βGal_UW07, showed strong potential for industrial applications. The enzyme exhibited optimal activity at 70 °C and was exceptionally resistant to high pH and the presence of metal ions and reducing agents. Overall, our results indicate that the pipeline is highly accurate and can play a key role in sustainable bioprospecting, leveraging existing metagenomic datasets and minimising in situ interventions in pristine regions.

Affiliation(s)

Karol Ciuchcinski: Department of Environmental Microbiology and Biotechnology, Institute of Microbiology, Faculty of Biology, University of Warsaw, Miecznikowa 1, 02-096, Warsaw, Poland.
Anna-Karina Kaczorowska: Collection of Plasmids and Microorganisms | KPD, Department of Microbiology, Faculty of Biology, University of Gdańsk, Wita Stwosza 59, 80-308, Gdańsk, Poland.
Daria Biernacka: Collection of Plasmids and Microorganisms | KPD, Department of Microbiology, Faculty of Biology, University of Gdańsk, Wita Stwosza 59, 80-308, Gdańsk, Poland; Structural Biology Laboratory, Intercollegiate Faculty of Biotechnology of University of Gdańsk and Medical University of Gdańsk, Abrahama 58, 80-307, Gdańsk, Poland.
Sebastian Dorawa: Laboratory of Extremophiles Biology, Department of Microbiology, Faculty of Biology, University of Gdańsk, Wita Stwosza 59, 80-308, Gdańsk, Poland.
Tadeusz Kaczorowski: Laboratory of Extremophiles Biology, Department of Microbiology, Faculty of Biology, University of Gdańsk, Wita Stwosza 59, 80-308, Gdańsk, Poland.
Younginn Park: Department of Environmental Microbiology and Biotechnology, Institute of Microbiology, Faculty of Biology, University of Warsaw, Miecznikowa 1, 02-096, Warsaw, Poland.
Karol Piekarski: Department of Environmental Microbiology and Biotechnology, Institute of Microbiology, Faculty of Biology, University of Warsaw, Miecznikowa 1, 02-096, Warsaw, Poland.
Michal Stanowski: Department of Environmental Microbiology and Biotechnology, Institute of Microbiology, Faculty of Biology, University of Warsaw, Miecznikowa 1, 02-096, Warsaw, Poland.
Takao Ishikawa: Department of Environmental Microbiology and Biotechnology, Institute of Microbiology, Faculty of Biology, University of Warsaw, Miecznikowa 1, 02-096, Warsaw, Poland.
Runar Stokke: Department of Biological Sciences, Center for Deep Sea Research, University of Bergen, Postboks 7803, N-5020, Bergen, Norway.
Ida Helene Steen: Department of Biological Sciences, Center for Deep Sea Research, University of Bergen, Postboks 7803, N-5020, Bergen, Norway.
Lukasz Dziewit: Department of Environmental Microbiology and Biotechnology, Institute of Microbiology, Faculty of Biology, University of Warsaw, Miecznikowa 1, 02-096, Warsaw, Poland.

Zhang L, Liu T. ATP-Pred: Prediction of Protein-ATP Binding Residues via Fusion of Residue-Level Embeddings and Kolmogorov-Arnold Network. J Chem Inf Model 2025;65:3812-3826. [PMID: 40119803 DOI: 10.1021/acs.jcim.5c00016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/24/2025]

Bjerregaard A, Groth PM, Hauberg S, Krogh A, Boomsma W. Foundation models of protein sequences: A brief overview. Curr Opin Struct Biol 2025;91:103004. [PMID: 39983412 DOI: 10.1016/j.sbi.2025.103004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/16/2024] [Revised: 01/24/2025] [Accepted: 01/26/2025] [Indexed: 02/23/2025]

Song C, He S, Qian Y, Li X, Hu Y, Chen J, Wang J, Deng L. DeepMVD: A Novel Multiview Dynamic Feature Fusion Model for Accurate Protein Function Prediction. J Chem Inf Model 2025;65:3077-3089. [PMID: 40053671 DOI: 10.1021/acs.jcim.4c02216] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/09/2025]

Affiliation(s)

Chaolin Song: School of Software, Xinjiang University, Urumqi 830091, China; Xinjiang Engineering Research Center of Big Data and Intelligent Software, School of Software, Xinjiang University, Urumqi 830091, China; Key Laboratory of Software Engineering, Xinjiang University, Urumqi 830091, China.
Shiwen He: School of Software, Xinjiang University, Urumqi 830091, China; School of Computer Science and Engineering, Central South University, Changsha 410083, China.
Yurong Qian: Xinjiang Engineering Research Center of Big Data and Intelligent Software, School of Software, Xinjiang University, Urumqi 830091, China; Key Laboratory of Software Engineering, Xinjiang University, Urumqi 830091, China; School of Computer Science and Technology, Xinjiang University, Urumqi 830046, China; Joint International Research Laboratory of Silk Road Multilingual Cognitive Computing, Xinjiang University, Urumqi, Xinjiang 830046, China.
Xinhui Li: School of Computer Science and Technology, Xinjiang University, Urumqi 830046, China; Joint International Research Laboratory of Silk Road Multilingual Cognitive Computing, Xinjiang University, Urumqi, Xinjiang 830046, China.
Yue Hu: School of Computer Science and Technology, Xinjiang University, Urumqi 830046, China; Joint International Research Laboratory of Silk Road Multilingual Cognitive Computing, Xinjiang University, Urumqi, Xinjiang 830046, China.
Jiaying Chen: School of Software, Xinjiang University, Urumqi 830091, China; Xinjiang Engineering Research Center of Big Data and Intelligent Software, School of Software, Xinjiang University, Urumqi 830091, China; Key Laboratory of Software Engineering, Xinjiang University, Urumqi 830091, China.
Jingfu Wang: School of Software, Xinjiang University, Urumqi 830091, China; Xinjiang Engineering Research Center of Big Data and Intelligent Software, School of Software, Xinjiang University, Urumqi 830091, China; Key Laboratory of Software Engineering, Xinjiang University, Urumqi 830091, China.
Lei Deng: School of Software, Xinjiang University, Urumqi 830091, China; School of Computer Science and Engineering, Central South University, Changsha 410083, China.

Carmona OG, Kleinjung J, Anastasiou D, Oostenbrink C, Fraternali F. AllohubPy: Detecting Allosteric Signals Through An Information-theoretic Approach. J Mol Biol 2025:168969. [PMID: 39900284 DOI: 10.1016/j.jmb.2025.168969] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/15/2024] [Revised: 01/22/2025] [Accepted: 01/24/2025] [Indexed: 02/05/2025]

Abstract

Allosteric regulation is crucial for biological processes like signal transduction, transcriptional regulation, and metabolism, yet the mechanisms and macromolecular properties that govern it are still not well understood. Several methods have been developed over the years to study allosterism through different angles. Among the possible ways to study allosterism, information-theoretic approaches, like AlloHubMat or GSAtools, can be particularly effective due to their use of robust statistics and the possibility to be combined with graph analysis. These methods capture local conformational changes associated with global motions from molecular dynamics simulations through the use of a Structural Alphabet, which simplifies the complexity of the Cartesian space by reducing the dimensionality down to a string of encoded fragments, representing sets of internal coordinates that still capture the overall conformation changes. In this work, we present "AllohubPy," an improved and standardized methodology of AlloHubMat and GSAtools coded in Python. We analyse the performance, limitations and sampling requirements of AllohubPy by using extensive molecular dynamics simulations of model allosteric systems and apply convergence analysis techniques to estimate result reliability. Additionally, we expand the methodology to use different dimensionality reduction Structural Alphabets, such as the 3DI alphabet, and integrate Protein Language Models (PLMs) to refine allosteric hub communication detection by monitoring the detected evolutionary constraints. Overall, AllohubPy expands its preceding methods and simplifies the use and reliability of the method to effectively capture dynamic allosteric motions and residue pathways. AllohubPy is freely available on GitHub (https://github.com/Fraternalilab/AlloHubPy) as a package and as a Jupyter Notebook.

Majila K, Ullanat V, Viswanath S. A deep learning method for predicting interactions for intrinsically disordered regions of proteins. bioRxiv 2025:2024.12.19.629373. [PMID: 39763873 PMCID: PMC11702703 DOI: 10.1101/2024.12.19.629373] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/14/2025]

Chen JY, Wang JF, Hu Y, Li XH, Qian YR, Song CL. Evaluating the advancements in protein language models for encoding strategies in protein function prediction: a comprehensive review. Front Bioeng Biotechnol 2025;13:1506508. [PMID: 39906415 PMCID: PMC11790633 DOI: 10.3389/fbioe.2025.1506508] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/05/2024] [Accepted: 01/02/2025] [Indexed: 02/06/2025]

Johnson S, Weigele P, Fomenkov A, Ge A, Vincze A, Eaglesham J, Roberts R, Sun Z. Domainator, a flexible software suite for domain-based annotation and neighborhood analysis, identifies proteins involved in antiviral systems. Nucleic Acids Res 2025;53:gkae1175. [PMID: 39657740 PMCID: PMC11754643 DOI: 10.1093/nar/gkae1175] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/12/2024] [Revised: 11/07/2024] [Accepted: 11/15/2024] [Indexed: 12/12/2024]

Gonzales MEM, Ureta JC, Shrestha AMS. PHIStruct: improving phage-host interaction prediction at low sequence similarity settings using structure-aware protein embeddings. Bioinformatics 2024;41:btaf016. [PMID: 39804673 PMCID: PMC11783280 DOI: 10.1093/bioinformatics/btaf016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/27/2024] [Revised: 12/04/2024] [Accepted: 01/10/2025] [Indexed: 02/01/2025]

Tule S, Foley G, Bodén M. Do protein language models learn phylogeny? Brief Bioinform 2024;26:bbaf047. [PMID: 39987495 PMCID: PMC11847157 DOI: 10.1093/bib/bbaf047] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/24/2024] [Revised: 12/20/2024] [Accepted: 02/20/2025] [Indexed: 02/25/2025]

Greener JG, Jamali K. Fast protein structure searching using structure graph embeddings. Bioinform Adv 2024;5:vbaf042. [PMID: 40196750 PMCID: PMC11974391 DOI: 10.1093/bioadv/vbaf042] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 01/07/2025] [Revised: 02/11/2025] [Accepted: 03/03/2025] [Indexed: 04/09/2025]