1
Sutcliffe R, Doherty CPA, Morgan HP, Dunne NJ, McCarthy HO. Strategies for the design of biomimetic cell-penetrating peptides using AI-driven in silico tools for drug delivery. Biomaterials Advances 2025; 169:214153. [PMID: 39705787 DOI: 10.1016/j.bioadv.2024.214153] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/09/2024] [Revised: 12/08/2024] [Accepted: 12/14/2024] [Indexed: 12/23/2024]
Abstract
Cell-penetrating peptides (CPPs) have gained rapid attention over the last 25 years; this is attributed to their versatility, customisation, and 'Trojan horse' delivery that evades the immune system. However, the current CPP rational design process is limited, as it requires several rounds of peptide synthesis, prediction and wet-lab validation, which is expensive, time-consuming and requires extensive knowledge of peptide chemistry. Artificial intelligence (AI) has emerged as a promising alternative which can augment the design process, for example by determining physicochemical characteristics, secondary structure, solvent accessibility, disorder and flexibility, as well as predicting in vivo behaviour such as toxicity and peptidase degradation. Other more recent tools utilise supervised machine learning (ML) to predict the penetrative ability of an amino acid sequence. The use of AI in the CPP design process has the potential to reduce development costs and increase the chances of success with respect to delivery. This review provides a survey of in silico tools and AI platforms which can be utilised in the design process, and the key features that should be taken into consideration when designing next-generation CPPs.
Affiliation(s)
- Rebecca Sutcliffe
  - School of Pharmacy, Queen's University Belfast, 97 Lisburn Road, Belfast BT9 7BL, United Kingdom of Great Britain and Northern Ireland
- Ciaran P A Doherty
  - School of Pharmacy, Queen's University Belfast, 97 Lisburn Road, Belfast BT9 7BL, United Kingdom of Great Britain and Northern Ireland; Antigenesis Biologics, Crossgar, Northern Ireland, United Kingdom of Great Britain and Northern Ireland
- Hugh P Morgan
  - School of Pharmacy, Queen's University Belfast, 97 Lisburn Road, Belfast BT9 7BL, United Kingdom of Great Britain and Northern Ireland; Antigenesis Biologics, Crossgar, Northern Ireland, United Kingdom of Great Britain and Northern Ireland
- Nicholas J Dunne
  - School of Pharmacy, Queen's University Belfast, 97 Lisburn Road, Belfast BT9 7BL, United Kingdom of Great Britain and Northern Ireland; School of Mechanical and Manufacturing Engineering, Dublin City University, Dublin 9, Ireland
- Helen O McCarthy
  - School of Pharmacy, Queen's University Belfast, 97 Lisburn Road, Belfast BT9 7BL, United Kingdom of Great Britain and Northern Ireland
2
Distefano GL, D'Amico F. Deep Learning-Driven Computational Approaches for Studying Intrinsically Disordered Regions in S100-A9. Methods Mol Biol 2025. [PMID: 40106150 DOI: 10.1007/7651_2025_617] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/22/2025]
Abstract
Intrinsically disordered regions (IDRs) are flexible protein regions that lack a fixed three-dimensional structure, which makes them difficult to study using traditional structural methods. However, artificial intelligence can help in predicting, analyzing, and modeling these regions. This chapter provides a simple protocol for a preliminary approach to identifying protein IDRs. The S100-A9 protein serves as the case study: characterizing its IDRs could provide further details on the complex molecular interactions involved in psoriasis, particularly those related to inflammation, immune dysregulation, and keratinocyte behavior.
Affiliation(s)
- Gionathan L Distefano
  - Department of Mathematics and Computer Science, University of Catania, Catania, Italy
- Fabio D'Amico
  - Department of Biomedical and Biotechnological Sciences, University of Catania, Catania, Italy
3
Zhang L, Hodgins L, Sakib S, Verbeem A, Mahmood A, Perez-Romero C, Marmion RA, Dostatni N, Fradin C. Both the transcriptional activator, Bcd, and repressor, Cic, form small mobile oligomeric clusters. Biophys J 2025; 124:980-995. [PMID: 39164967 PMCID: PMC11947476 DOI: 10.1016/j.bpj.2024.08.011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/02/2024] [Revised: 07/11/2024] [Accepted: 08/15/2024] [Indexed: 08/22/2024] Open
Abstract
Transcription factors play an essential role in pattern formation during early embryo development, generating a strikingly fast and precise transcriptional response that results in sharp gene expression boundaries. To characterize the steps leading up to transcription, we performed a side-by-side comparison of the nuclear dynamics of two morphogens, a transcriptional activator, Bicoid (Bcd), and a transcriptional repressor, Capicua (Cic), both involved in body patterning along the anterior-posterior axis of the early Drosophila embryo. We used a combination of fluorescence recovery after photobleaching, fluorescence correlation spectroscopy, and single-particle tracking to access a wide range of dynamical timescales. Despite their opposite effects on gene transcription, we find that Bcd and Cic have very similar nuclear dynamics, characterized by the coexistence of a freely diffusing monomer population with a number of oligomeric clusters, which range from low stoichiometry and high mobility clusters to larger, DNA-bound hubs. Our observations are consistent with the inclusion of both Bcd and Cic into transcriptional hubs or condensates, while putting constraints on the mechanism by which these form. These results fit in with the recent proposal that many transcription factors might share a common search strategy for target gene regulatory regions that makes use of their large unstructured regions, and may eventually help explain how the transcriptional response they elicit can be at the same time so fast and so precise.
Affiliation(s)
- Lili Zhang
  - Department of Physics and Astronomy, McMaster University, Hamilton, ON, Canada
- Lydia Hodgins
  - Department of Physics and Astronomy, McMaster University, Hamilton, ON, Canada
- Shariful Sakib
  - Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, ON, Canada
- Alexander Verbeem
  - Department of Physics and Astronomy, McMaster University, Hamilton, ON, Canada
- Ahmad Mahmood
  - Department of Physics and Astronomy, McMaster University, Hamilton, ON, Canada
- Carmina Perez-Romero
  - Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, ON, Canada
- Robert A Marmion
  - The Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, New Jersey
- Nathalie Dostatni
  - Institut Curie, PSL University, CNRS, Sorbonne University, Nuclear Dynamics, Paris, France
- Cécile Fradin
  - Department of Physics and Astronomy, McMaster University, Hamilton, ON, Canada; Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, ON, Canada
4
Xie J, Jin X, Wei H, Sun S, Liu Y. IDP-EDL: enhancing intrinsically disordered protein prediction by combining protein language model and ensemble deep learning. Brief Bioinform 2025; 26:bbaf182. [PMID: 40254833 PMCID: PMC12009716 DOI: 10.1093/bib/bbaf182] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/10/2024] [Revised: 02/26/2025] [Accepted: 03/30/2025] [Indexed: 04/22/2025] Open
Abstract
Identification of intrinsically disordered regions (IDRs) in proteins is essential for understanding fundamental cellular processes. IDRs can be divided into long disordered regions (LDRs) and short disordered regions (SDRs) according to their lengths. Most previous computational methods ignored the differences between LDRs and SDRs and therefore failed to capture their distinct patterns. In this study, we propose IDP-EDL, an ensemble of three predictors. The component predictors were first built on a pretrained protein language model and then fine-tuned for three specific tasks: short, long, and generic disordered regions. A meta predictor was then trained to integrate the three task-specific predictors into the final predictor. Experimental results show that task-specific supervised fine-tuning can capture the different features of LDRs and SDRs, and that IDP-EDL achieves stable performance on datasets with different ratios of LDRs and SDRs. More importantly, IDP-EDL matches or even surpasses the performance of existing state-of-the-art predictors on independent test sets. IDP-EDL is available at https://github.com/joestarXjx/IDP-EDL.
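The length-based split into LDRs and SDRs described in this abstract can be illustrated with a minimal sketch. It is not taken from the IDP-EDL code base: the abstract gives no cutoff, so the 30-residue threshold below is an assumption, and the function names and the `'0'`/`'1'` label string are hypothetical.

```python
from typing import List, Tuple

# Assumed cutoff: the abstract does not state one; 30 residues is used here
# purely for illustration.
LONG_REGION_MIN_LENGTH = 30


def find_disordered_regions(labels: str) -> List[Tuple[int, int]]:
    """Return (start, end) pairs, end exclusive, for each run of '1' in labels."""
    regions = []
    start = None
    for i, ch in enumerate(labels):
        if ch == "1" and start is None:
            start = i                      # a disordered run begins
        elif ch != "1" and start is not None:
            regions.append((start, i))     # the run ended at the previous residue
            start = None
    if start is not None:                  # run extends to the C-terminus
        regions.append((start, len(labels)))
    return regions


def split_by_length(regions, min_long=LONG_REGION_MIN_LENGTH):
    """Separate regions into long (>= min_long residues) and short ones."""
    long_regions = [r for r in regions if r[1] - r[0] >= min_long]
    short_regions = [r for r in regions if r[1] - r[0] < min_long]
    return long_regions, short_regions


# Toy annotation: a 4-residue disordered stretch and a 35-residue one.
labels = "0" * 5 + "1" * 4 + "0" * 10 + "1" * 35
regions = find_disordered_regions(labels)
long_regions, short_regions = split_by_length(regions)
print("LDRs:", long_regions, "SDRs:", short_regions)
```

Changing `min_long` shifts regions between the two classes, which is one reason a dataset's LDR/SDR ratio depends on the definition used.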
Affiliation(s)
- Junxi Xie
  - College of Big Data and Internet, Shenzhen Technology University, 3002 Lantian Road, Pingshan District, Shenzhen, Guangdong 518118, China
- Xiaopeng Jin
  - College of Big Data and Internet, Shenzhen Technology University, 3002 Lantian Road, Pingshan District, Shenzhen, Guangdong 518118, China
- Hang Wei
  - School of Computer Science and Technology, Xidian University, South Campus: 266 Xinglong Section of Xifeng Road, Xi’an, Shaanxi 710126, North Campus: No. 2 South Taibai Road, Xi’an, Shaanxi 710071, China
- SaiSai Sun
  - School of Computer Science and Technology, Xidian University, South Campus: 266 Xinglong Section of Xifeng Road, Xi’an, Shaanxi 710126, North Campus: No. 2 South Taibai Road, Xi’an, Shaanxi 710071, China
- Yumeng Liu
  - College of Big Data and Internet, Shenzhen Technology University, 3002 Lantian Road, Pingshan District, Shenzhen, Guangdong 518118, China
5
Kotowski K, Roterman I, Stapor K. DisorderUnetLM: Validating ProteinUnet for efficient protein intrinsic disorder prediction. Comput Biol Med 2025; 185:109586. [PMID: 39708500 DOI: 10.1016/j.compbiomed.2024.109586] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/22/2024] [Revised: 12/03/2024] [Accepted: 12/14/2024] [Indexed: 12/23/2024]
Abstract
The prediction of intrinsically disordered regions has significant implications for understanding protein functions and dynamics. It can help to discover novel protein-protein interactions essential for designing new drugs and enzymes. A new generation of predictors based on protein language models (pLMs) has recently emerged. These algorithms reach state-of-the-art accuracy without calculating time-consuming multiple sequence alignments (MSAs). This article introduces the new DisorderUnetLM disorder predictor, which builds upon the idea of ProteinUnet. It uses the Attention U-Net convolutional network and incorporates features from the ProtTrans pLM. DisorderUnetLM achieves top results in direct comparison with recent predictors exploiting MSAs and pLMs. Moreover, among 43 predictors on the latest CAID-2 benchmark, it ranks 1st for the NOX subset in terms of the ROC-AUC metric (0.844) and 2nd for the AP metric (0.596). For the CAID-2 PDB subset, it ranks in the top 10 (ROC-AUC of 0.924 and AP of 0.862). The code and model are publicly available and fully reproducible at doi.org/10.24433/CO.7350682.v1.
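For readers unfamiliar with the two ranking metrics quoted above, the following sketch computes ROC-AUC and average precision (AP) from per-residue labels and scores. It is a plain-Python illustration, not the CAID evaluation code: the function names and toy data are hypothetical, the pairwise ROC-AUC loop is quadratic and only suitable for small inputs, and the AP function assumes there are no tied scores.

```python
from typing import Sequence


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative residue")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Mean of the precision values measured at the rank of each positive residue."""
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank   # precision among the top `rank` residues
    if hits == 0:
        raise ValueError("need at least one positive residue")
    return total / hits


# Toy example: 1 = disordered residue, scores = predicted disorder propensity.
y_true = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]
auc = roc_auc(y_true, scores)
ap = average_precision(y_true, scores)
print(f"ROC-AUC = {auc:.3f}, AP = {ap:.3f}")
```

ROC-AUC depends only on how positives rank relative to negatives, whereas AP is sensitive to class balance. That is one reason a predictor's ROC-AUC and AP rankings on the same dataset can differ.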
Affiliation(s)
- Krzysztof Kotowski
  - Department of Applied Informatics, Silesian University of Technology, Akademicka 16, 44-100, Gliwice, Poland
- Irena Roterman
  - Department of Bioinformatics and Telemedicine, Jagiellonian University Medical College, Medyczna 7, 30-688, Kraków, Poland
- Katarzyna Stapor
  - Department of Applied Informatics, Silesian University of Technology, Akademicka 16, 44-100, Gliwice, Poland
6
Yang W, Du Q, Zhou X, Wu C, Bao J. PDFll: Predictors of Disorder and Function of Proteins from the Language of Life. J Comput Biol 2025; 32:143-155. [PMID: 39246251 DOI: 10.1089/cmb.2024.0506] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/10/2024] Open
Abstract
The identification of intrinsically disordered proteins and their functional roles is largely dependent on the performance of computational predictors, necessitating a high standard of accuracy in these tools. In this context, we introduce a novel series of computational predictors, termed PDFll (Predictors of Disorder and Function of proteins from the Language of Life), which are designed to offer precise predictions of protein disorder and associated functional roles based on protein sequences. PDFll is developed through a two-step process. Initially, it leverages large-scale protein language models (pLMs), trained on an extensive dataset comprising billions of protein sequences. Subsequently, the embeddings derived from pLMs are integrated into streamlined, yet sophisticated, deep-learning models to generate predictions. These predictions notably surpass the performance of existing state-of-the-art predictors, particularly those that forecast disorder and function without utilizing evolutionary information.
Affiliation(s)
- Wanyi Yang
  - College of Life Sciences, Sichuan University, Chengdu, China
- Qingsong Du
  - College of Life Sciences, Sichuan University, Chengdu, China
- Xunyu Zhou
  - College of Life Sciences, Sichuan University, Chengdu, China
- Chuanfang Wu
  - College of Life Sciences, Sichuan University, Chengdu, China
- Jinku Bao
  - College of Life Sciences, Sichuan University, Chengdu, China
7
Wang K, Hu G, Wu Z, Kurgan L. Accurate and Fast Prediction of Intrinsic Disorder Using flDPnn. Methods Mol Biol 2025; 2867:201-218. [PMID: 39576583 DOI: 10.1007/978-1-0716-4196-5_12] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/24/2024]
Abstract
Intrinsically disordered proteins (IDPs) that include one or more intrinsically disordered regions (IDRs) are abundant across all domains of life and viruses and play numerous functional roles in various cellular processes. Due to the relatively low throughput and high cost of experimental techniques for identifying IDRs, there is a growing need for fast computational algorithms that accurately predict IDRs/IDPs from protein sequences. We describe one of the leading disorder predictors, flDPnn. Results from a recent community-organized Critical Assessment of Intrinsic Disorder (CAID) experiment show that flDPnn provides fast and state-of-the-art predictions of disorder, which are supplemented with predictions of several major disorder functions. This chapter provides a practical guide to flDPnn, which includes a brief explanation of its predictive model, descriptions of its web server and standalone versions, and a case study that showcases how to read and understand flDPnn's predictions.
Affiliation(s)
- Kui Wang
  - NITFID, School of Statistics and Data Science, LPMC and KLMDASR, Nankai University, Tianjin, China
- Gang Hu
  - NITFID, School of Statistics and Data Science, LPMC and KLMDASR, Nankai University, Tianjin, China
- Zhonghua Wu
  - School of Mathematical Sciences and LPMC, Nankai University, Tianjin, China
- Lukasz Kurgan
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA
8
Basu S, Kurgan L. Taxonomy-specific assessment of intrinsic disorder predictions at residue and region levels in higher eukaryotes, protists, archaea, bacteria and viruses. Comput Struct Biotechnol J 2024; 23:1968-1977. [PMID: 38765610 PMCID: PMC11098722 DOI: 10.1016/j.csbj.2024.04.059] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/05/2024] [Revised: 04/23/2024] [Accepted: 04/24/2024] [Indexed: 05/22/2024] Open
Abstract
Intrinsic disorder predictors have been evaluated in several studies, including the two large CAID experiments. However, these studies are biased towards eukaryotic proteins and focus primarily on residue-level predictions. We provide a first-of-its-kind assessment that comprehensively covers the taxonomy and evaluates predictions at the residue and disordered region levels. We curate a benchmark dataset that uniformly covers eukaryotic, archaeal, bacterial, and viral proteins. We find that predictive performance differs substantially across taxonomy: viruses are predicted most accurately, followed by protists and higher eukaryotes, while bacterial and archaeal proteins suffer lower levels of accuracy. These trends are consistent across predictors. We also find that current tools, except for flDPnn, struggle to reproduce the native distributions of the numbers and sizes of disordered regions. Moreover, analysis of two variants of disorder predictions derived from AlphaFold2-predicted structures reveals that they produce accurate residue-level propensities for archaea, bacteria and protists. However, they underperform for higher eukaryotes and generally struggle to accurately identify disordered regions. Our results motivate the development of new predictors that target bacteria and archaea and that produce accurate results at both residue and region levels. We also stress the need to include region-level evaluation in future assessments.
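The distinction between residue-level and region-level evaluation can be made concrete with a small sketch. It is only an illustration under stated assumptions, not the assessment protocol of the study: the 0.5 binarization threshold, the function names, and the toy data are all hypothetical.

```python
from itertools import groupby
from typing import List, Sequence


def binarize(propensities: Sequence[float], threshold: float = 0.5) -> List[int]:
    """Convert per-residue propensities to 0/1 labels (threshold is an assumption)."""
    return [1 if p >= threshold else 0 for p in propensities]


def region_lengths(binary: Sequence[int]) -> List[int]:
    """Lengths of consecutive runs of 1 (disordered residues), in sequence order."""
    return [sum(1 for _ in run) for value, run in groupby(binary) if value == 1]


# Toy protein: one native disordered region of four residues.
native = [0, 0, 1, 1, 1, 1, 0, 0, 0, 0]
propensity = [0.1, 0.2, 0.7, 0.6, 0.4, 0.8, 0.2, 0.1, 0.1, 0.3]
predicted = binarize(propensity)

# Residue level: fraction of residues labelled correctly.
residue_accuracy = sum(n == p for n, p in zip(native, predicted)) / len(native)

# Region level: number and sizes of disordered regions.
native_regions = region_lengths(native)
predicted_regions = region_lengths(predicted)

print("residue accuracy:", residue_accuracy)
print("native regions:", native_regions, "predicted regions:", predicted_regions)
```

In this toy case a single misclassified residue leaves the residue-level accuracy high but splits one native region into two predicted ones. Region-level assessment is designed to expose this kind of discrepancy.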
Affiliation(s)
- Sushmita Basu
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA
- Lukasz Kurgan
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA
9
Wang K, Hu G, Basu S, Kurgan L. flDPnn2: Accurate and Fast Predictor of Intrinsic Disorder in Proteins. J Mol Biol 2024; 436:168605. [PMID: 39237195 DOI: 10.1016/j.jmb.2024.168605] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 01/03/2024] [Revised: 04/16/2024] [Accepted: 05/04/2024] [Indexed: 09/07/2024]
Abstract
Prediction of intrinsic disorder in protein sequences is an active research area, with well over 100 predictors released to date. These efforts are motivated by the functional importance and high abundance of intrinsic disorder, combined with relatively low amounts of experimental annotations. Disorder predictors are periodically evaluated by independent assessors in the Critical Assessment of protein Intrinsic Disorder prediction (CAID) experiments. The recently completed CAID2 experiment assessed close to 40 state-of-the-art methods, demonstrating that some of them produce accurate results. In particular, the flDPnn2 method, the successor of flDPnn, which performed well in the CAID1 experiment, secured the most accurate overall results on the Disorder-NOX dataset in CAID2. flDPnn2 implements a number of improvements over its predecessor, including changes to the inputs, a larger deep network model that we retrained on a larger training set, and the addition of an alignment module. Using results from CAID2, we show that flDPnn2 produces accurate predictions very quickly, modestly improving over the accuracy of flDPnn and reducing the runtime by half, to about 27 s per protein. flDPnn2 is freely available as a convenient web server at http://biomine.cs.vcu.edu/servers/flDPnn2/.
Affiliation(s)
- Kui Wang
  - NITFID, School of Statistics and Data Science, LPMC and KLMDASR, Nankai University, Tianjin, China
- Gang Hu
  - NITFID, School of Statistics and Data Science, LPMC and KLMDASR, Nankai University, Tianjin, China
- Sushmita Basu
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA
- Lukasz Kurgan
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA
10
Nambiar A, Forsyth JM, Liu S, Maslov S. DR-BERT: A protein language model to annotate disordered regions. Structure 2024; 32:1260-1268.e3. [PMID: 38701796 DOI: 10.1016/j.str.2024.04.010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/07/2023] [Revised: 06/16/2023] [Accepted: 04/08/2024] [Indexed: 05/05/2024]
Abstract
Despite their lack of a rigid structure, intrinsically disordered regions (IDRs) in proteins play important roles in cellular functions, including mediating protein-protein interactions. Therefore, it is important to computationally annotate IDRs with high accuracy. In this study, we present Disordered Region prediction using Bidirectional Encoder Representations from Transformers (DR-BERT), a compact protein language model. Unlike most popular tools, DR-BERT is pretrained on unannotated proteins and trained to predict IDRs without relying on explicit evolutionary or biophysical data. Despite this, DR-BERT demonstrates significant improvement over existing methods on the Critical Assessment of protein Intrinsic Disorder (CAID) evaluation dataset and outperforms competitors on two out of four test cases in the CAID 2 dataset, while maintaining competitiveness in the others. This performance is due to the information learned during pretraining and DR-BERT's ability to use contextual information.
Affiliation(s)
- Ananthan Nambiar
  - Department of Bioengineering, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA; Carl R. Woese Institute for Genomic Biology, Urbana, IL 61801, USA
- John Malcolm Forsyth
  - Carl R. Woese Institute for Genomic Biology, Urbana, IL 61801, USA; Department of Computer Science, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA
- Simon Liu
  - Carl R. Woese Institute for Genomic Biology, Urbana, IL 61801, USA; Department of Computer Science, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA
- Sergei Maslov
  - Department of Bioengineering, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA; Carl R. Woese Institute for Genomic Biology, Urbana, IL 61801, USA; Department of Physics, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA; Computing, Environment and Life Sciences, Argonne National Laboratory, Lemont, IL 60439, USA
11
Chen J, Li Q, Xia S, Arsala D, Sosa D, Wang D, Long M. The Rapid Evolution of De Novo Proteins in Structure and Complex. Genome Biol Evol 2024; 16:evae107. [PMID: 38753069 PMCID: PMC11149777 DOI: 10.1093/gbe/evae107] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 05/10/2024] [Indexed: 06/06/2024] Open
Abstract
Recent genome-wide studies in rice have established that de novo genes, evolving from noncoding sequences, enhance protein diversity through a stepwise process. However, the pattern and rate of their evolution in protein structure over time remain unclear. Here, we addressed these issues within a surprisingly short evolutionary timescale (<1 million years for 97% of Oryza de novo genes) by comparing de novo genes with gene duplicates. We found that de novo genes evolve faster than gene duplicates in intrinsically disordered regions (such as random coils), secondary structure elements (such as α helix and β strand), hydrophobicity, and molecular recognition features. In de novo proteins, specifically, we observed an 8% to 14% decay in random coils and intrinsically disordered region lengths and a 2.3% to 6.5% increase in structured elements, hydrophobicity, and molecular recognition features, per million years on average. These patterns of structural evolution also align with changes in amino acid composition over time. We also found higher positive charges but smaller molecular weights for de novo proteins than for duplicates. Tertiary structure predictions showed that most de novo proteins, though not typically well folded on their own, readily form low-energy and compact complexes with other proteins, facilitated by extensive residue contacts and conformational flexibility, suggesting a faster-binding scenario in de novo proteins to promote interaction. These analyses illuminate a rapid evolution of protein structure in de novo genes in rice genomes, originating from noncoding sequences, highlighting their quick transformation into active, protein complex-forming components within a remarkably short evolutionary timeframe.
Affiliation(s)
- Jianhai Chen
  - Department of Ecology and Evolution, The University of Chicago, Chicago, IL 60637, USA
- Qingrong Li
  - Division of Pharmaceutical Sciences, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA 92093, USA
  - Department of Cellular & Molecular Medicine, School of Medicine, University of California San Diego, La Jolla, CA 92093, USA
- Shengqian Xia
  - Department of Ecology and Evolution, The University of Chicago, Chicago, IL 60637, USA
- Deanna Arsala
  - Department of Ecology and Evolution, The University of Chicago, Chicago, IL 60637, USA
- Dylan Sosa
  - Department of Ecology and Evolution, The University of Chicago, Chicago, IL 60637, USA
- Dong Wang
  - Division of Pharmaceutical Sciences, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA 92093, USA
  - Department of Cellular & Molecular Medicine, School of Medicine, University of California San Diego, La Jolla, CA 92093, USA
- Manyuan Long
  - Department of Ecology and Evolution, The University of Chicago, Chicago, IL 60637, USA
12
Cook AD, Carrington M, Higgins MK. Molecular mechanism of complement inhibition by the trypanosome receptor ISG65. eLife 2024; 12:RP88960. [PMID: 38655765 PMCID: PMC11042801 DOI: 10.7554/elife.88960] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/26/2024] Open
Abstract
African trypanosomes replicate within infected mammals where they are exposed to the complement system. This system centres around complement C3, which is present in a soluble form in serum but becomes covalently deposited onto the surfaces of pathogens after proteolytic cleavage to C3b. Membrane-associated C3b triggers different complement-mediated effectors which promote pathogen clearance. To counter complement-mediated clearance, African trypanosomes have a cell surface receptor, ISG65, which binds to C3b and which decreases the rate of trypanosome clearance in an infection model. However, the mechanism by which ISG65 reduces C3b function has not been determined. We reveal through cryogenic electron microscopy that ISG65 has two distinct binding sites for C3b, only one of which is available in C3 and C3d. We show that ISG65 does not block the formation of C3b or the function of the C3 convertase which catalyses the surface deposition of C3b. However, we show that ISG65 forms a specific conjugate with C3b, perhaps acting as a decoy. ISG65 also occludes the binding sites for complement receptors 2 and 3, which may disrupt recruitment of immune cells, including B cells, phagocytes, and granulocytes. This suggests that ISG65 protects trypanosomes by combining multiple approaches to dampen the complement cascade.
Affiliation(s)
- Alexander D Cook
  - Department of Biochemistry, University of Oxford, Oxford, United Kingdom
  - Kavli Institute for Nanoscience Discovery, Dorothy Crowfoot Hodgkin Building, University of Oxford, Oxford, United Kingdom
- Mark Carrington
  - Department of Biochemistry, University of Cambridge, Cambridge, United Kingdom
- Matthew K Higgins
  - Department of Biochemistry, University of Oxford, Oxford, United Kingdom
  - Kavli Institute for Nanoscience Discovery, Dorothy Crowfoot Hodgkin Building, University of Oxford, Oxford, United Kingdom
13
Nag S, Banerjee C, Goyal M, Siddiqui AA, Saha D, Mazumder S, Debsharma S, Pramanik S, Saha SJ, De R, Bandyopadhyay U. Plasmodium falciparum Alba6 exhibits DNase activity and participates in stress response. iScience 2024; 27:109467. [PMID: 38558939 PMCID: PMC10981135 DOI: 10.1016/j.isci.2024.109467] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/14/2023] [Revised: 12/12/2023] [Accepted: 03/07/2024] [Indexed: 04/04/2024] Open
Abstract
Alba domain proteins, owing to their functional plasticity, play a significant role in organisms. Here, we report an intrinsic DNase activity of PfAlba6 from Plasmodium falciparum, an etiological agent responsible for human malignant malaria. We identified that tyrosine 28 plays a critical role in the Mg2+-driven 5'-3' DNase activity of PfAlba6. PfAlba6 cleaves both dsDNA and ssDNA. We also characterized the PfAlba6-DNA interaction and observed concentration-dependent oligomerization in the presence of DNA, which is evident from size exclusion chromatography and single-molecule AFM imaging. The PfAlba6 mRNA expression level is up-regulated several-fold following heat stress and treatment with artemisinin, indicating a possible role in stress response. PfAlba6 has no human orthologs and is expressed in all intra-erythrocytic stages; thus, this protein can potentially be a new anti-malarial drug target.
Collapse
Affiliation(s)
- Shiladitya Nag
- Division of Infectious Diseases and Immunology, CSIR-Indian Institute of Chemical Biology, 4, Raja S. C. Mullick Road, Jadavpur, Kolkata 700032, West Bengal, India
| | - Chinmoy Banerjee
- Division of Infectious Diseases and Immunology, CSIR-Indian Institute of Chemical Biology, 4, Raja S. C. Mullick Road, Jadavpur, Kolkata 700032, West Bengal, India
| | - Manish Goyal
- Department of Molecular & Cell Biology, School of Dental Medicine, Boston University Medical Campus, Boston, MA, USA
| | - Asim Azhar Siddiqui
- Division of Infectious Diseases and Immunology, CSIR-Indian Institute of Chemical Biology, 4, Raja S. C. Mullick Road, Jadavpur, Kolkata 700032, West Bengal, India
| | - Debanjan Saha
- Division of Infectious Diseases and Immunology, CSIR-Indian Institute of Chemical Biology, 4, Raja S. C. Mullick Road, Jadavpur, Kolkata 700032, West Bengal, India
| | - Somnath Mazumder
- Division of Infectious Diseases and Immunology, CSIR-Indian Institute of Chemical Biology, 4, Raja S. C. Mullick Road, Jadavpur, Kolkata 700032, West Bengal, India
- Department of Zoology, Raja Peary Mohan College, 1 Acharya Dhruba Pal Road, Uttarpara, West Bengal 712258, India
| | - Subhashis Debsharma
- Division of Infectious Diseases and Immunology, CSIR-Indian Institute of Chemical Biology, 4, Raja S. C. Mullick Road, Jadavpur, Kolkata 700032, West Bengal, India
| | - Saikat Pramanik
- Division of Infectious Diseases and Immunology, CSIR-Indian Institute of Chemical Biology, 4, Raja S. C. Mullick Road, Jadavpur, Kolkata 700032, West Bengal, India
| | - Shubhra Jyoti Saha
- Division of Infectious Diseases and Immunology, CSIR-Indian Institute of Chemical Biology, 4, Raja S. C. Mullick Road, Jadavpur, Kolkata 700032, West Bengal, India
| | - Rudranil De
- Amity Institute of Biotechnology, Amity University, Kolkata, Plot No: 36, 37 & 38, Major Arterial Road, Action Area II, Kadampukur Village, Newtown, Kolkata, West Bengal 700135, India
| | - Uday Bandyopadhyay
- Division of Infectious Diseases and Immunology, CSIR-Indian Institute of Chemical Biology, 4, Raja S. C. Mullick Road, Jadavpur, Kolkata 700032, West Bengal, India
- Division of Molecular Medicine, Bose Institute, Unified Academic Campus, EN 80, Sector V, Bidhan Nagar, Kolkata, West Bengal 700091, India
|
14
|
Xu S, Onoda A. Accurate and Fast Prediction of Intrinsically Disordered Protein by Multiple Protein Language Models and Ensemble Learning. J Chem Inf Model 2024; 64:2901-2911. [PMID: 37883249 DOI: 10.1021/acs.jcim.3c01202] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Indexed: 10/28/2023]
Abstract
Intrinsically disordered proteins (IDPs) play a vital role in various biological processes and have attracted increasing attention in the past few decades. Predicting IDPs from the primary structures of proteins offers a rapid and facile means of protein analysis without necessitating crystal structures. In particular, machine learning methods have demonstrated their potential in this field. Recently, protein language models (PLMs) have emerged as a promising approach to extracting essential information from protein sequences and have been employed in protein modeling for their precision and efficiency. In this article, we developed a novel IDP prediction method named IDP-ELM to predict intrinsically disordered regions (IDRs) as well as their functions, including disordered flexible linkers and disordered protein binding. This method utilizes high-dimensional representations extracted from several state-of-the-art PLMs and predicts IDRs by ensemble learning based on bidirectional recurrent neural networks. The performance of the method was evaluated on two independent test data sets from CAID (critical assessment of protein intrinsic disorder prediction) and CAID2, indicating notable improvements in terms of area under the receiver operating characteristic curve (AUC), Matthews correlation coefficient (MCC), and F1 score. Moreover, IDP-ELM requires solely protein sequences as inputs and does not entail a time-consuming process of protein profile generation, which is a prerequisite for most existing state-of-the-art methods, enabling an accurate, fast, and convenient tool for proteome-level analysis. The corresponding reproducible source code and model weights are available at https://github.com/xu-shi-jie/idp-elm.
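The abstract above reports MCC and F1 score for per-residue disorder prediction. As a minimal sketch of how these two metrics are computed from binary per-residue labels (1 = disordered, 0 = ordered), the following Python uses only the standard definitions; the function names and the toy labels are illustrative assumptions and are not taken from IDP-ELM or the CAID evaluation code.

```python
import math

def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary residue labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; returns 0.0 when the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

def f1(tp, fp, fn):
    """F1 score (harmonic mean of precision and recall) for the disordered class."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

# Hypothetical six-residue example: annotated labels vs. thresholded predictions.
true_labels = [1, 1, 0, 0, 1, 0]
pred_labels = [1, 0, 0, 0, 1, 1]
tp, tn, fp, fn = confusion_counts(true_labels, pred_labels)
```

Unlike F1, MCC uses all four confusion-matrix cells, which is one reason it is commonly reported alongside F1 on data sets where ordered residues outnumber disordered ones.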
Affiliation(s)
- Shijie Xu
- Graduate School of Environmental Science, Hokkaido University, Sapporo 060-0810, Japan
| | - Akira Onoda
- Graduate School of Environmental Science, Hokkaido University, Sapporo 060-0810, Japan
- Faculty of Environmental Earth Science, Hokkaido University, Sapporo 060-0810, Japan
|
15
|
Singleton MD, Eisen MB. Evolutionary analyses of intrinsically disordered regions reveal widespread signals of conservation. PLoS Comput Biol 2024; 20:e1012028. [PMID: 38662765 PMCID: PMC11075841 DOI: 10.1371/journal.pcbi.1012028] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/08/2024] [Revised: 05/07/2024] [Accepted: 03/28/2024] [Indexed: 05/08/2024] Open
Abstract
Intrinsically disordered regions (IDRs) are segments of proteins without stable three-dimensional structures. As this flexibility allows them to interact with diverse binding partners, IDRs play key roles in cell signaling and gene expression. Despite the prevalence and importance of IDRs in eukaryotic proteomes and various biological processes, associating them with specific molecular functions remains a significant challenge due to their high rates of sequence evolution. However, by comparing the observed values of various IDR-associated properties against those generated under a simulated model of evolution, a recent study found most IDRs across the entire yeast proteome contain conserved features. Furthermore, it showed clusters of IDRs with common "evolutionary signatures," i.e. patterns of conserved features, were associated with specific biological functions. To determine if similar patterns of conservation are found in the IDRs of other systems, in this work we applied a series of phylogenetic models to over 7,500 orthologous IDRs identified in the Drosophila genome to dissect the forces driving their evolution. By comparing models of unconstrained and constrained continuous trait evolution using the Brownian motion and Ornstein-Uhlenbeck models, respectively, we identified signals of widespread constraint, indicating that conservation of distributed features is a mechanism of IDR evolution common to multiple biological systems. In contrast to the previous study in yeast, however, we observed limited evidence of IDR clusters with specific biological functions, which suggests a more complex relationship between evolutionary constraints and function in the IDRs of multicellular organisms.
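To illustrate why comparing Brownian motion (BM) with Ornstein-Uhlenbeck (OU) models can reveal constraint, the sketch below evaluates the textbook variance of a continuous trait after time t under each process, starting from a fixed ancestral value. Under BM the variance grows without bound, whereas under OU the pull toward an optimum caps it at sigma^2 / (2 * alpha). The parameter names (sigma, alpha, t) follow common convention and are assumptions; the abstract does not give the parameterization or fitting procedure used in the study.

```python
import math

def bm_variance(sigma, t):
    """Trait variance after time t under Brownian motion: sigma^2 * t."""
    return sigma ** 2 * t

def ou_variance(sigma, alpha, t):
    """Trait variance after time t under an Ornstein-Uhlenbeck process.

    Equals sigma^2 / (2 * alpha) * (1 - exp(-2 * alpha * t)); alpha > 0 is the
    strength of the pull toward the optimum. expm1 keeps small alpha accurate.
    """
    return sigma ** 2 / (2.0 * alpha) * -math.expm1(-2.0 * alpha * t)

# Hypothetical parameters: same diffusion rate, with and without constraint.
unconstrained = bm_variance(sigma=1.0, t=10.0)
constrained = ou_variance(sigma=1.0, alpha=0.5, t=10.0)
```

As alpha approaches zero the OU variance approaches the BM variance, so BM can be viewed as the unconstrained limiting case against which the OU fit is compared.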
Affiliation(s)
- Marc D. Singleton
- Howard Hughes Medical Institute, UC Berkeley, Berkeley, California, United States of America
| | - Michael B. Eisen
- Howard Hughes Medical Institute, UC Berkeley, Berkeley, California, United States of America
- Department of Molecular and Cell Biology, UC Berkeley, Berkeley, California, United States of America
|
16
|
Peng J, Zhao L. The origin and structural evolution of de novo genes in Drosophila. Nat Commun 2024; 15:810. [PMID: 38280868 PMCID: PMC10821953 DOI: 10.1038/s41467-024-45028-1] [Citation(s) in RCA: 24] [Impact Index Per Article: 24.0] [Received: 02/23/2023] [Accepted: 01/09/2024] [Indexed: 01/29/2024] Open
Abstract
Recent studies reveal that de novo gene origination from previously non-genic sequences is a common mechanism for gene innovation. These young genes provide an opportunity to study the structural and functional origins of proteins. Here, we combine high-quality base-level whole-genome alignments and computational structural modeling to study the origination, evolution, and protein structures of lineage-specific de novo genes. We identify 555 de novo gene candidates in D. melanogaster that originated within the Drosophilinae lineage. Sequence composition, evolutionary rates, and expression patterns indicate possible gradual functional or adaptive shifts with gene age. Surprisingly, we find little overall protein structural change in candidates from the Drosophilinae lineage. We identify several candidates with potentially well-folded protein structures. Ancestral sequence reconstruction analysis reveals that most potentially well-folded candidates are born well-folded. Single-cell RNA-seq analysis in testis shows that although most de novo gene candidates are enriched in spermatocytes, several young candidates are biased towards the early spermatogenesis stage, indicating potentially important but less emphasized roles of early germline cells in de novo gene origination in testis. This study provides a systematic overview of the origin, evolution, and protein structural changes of Drosophilinae-specific de novo genes.
Affiliation(s)
- Junhui Peng
- Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY, USA
| | - Li Zhao
- Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY, USA.
|
17
|
Pang Y, Liu B. DisoFLAG: accurate prediction of protein intrinsic disorder and its functions using graph-based interaction protein language model. BMC Biol 2024; 22:3. [PMID: 38166858 PMCID: PMC10762911 DOI: 10.1186/s12915-023-01803-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/25/2023] [Accepted: 12/15/2023] [Indexed: 01/05/2024] Open
Abstract
Intrinsically disordered proteins and regions (IDPs/IDRs) are functionally important proteins and regions that exist as highly dynamic conformations under natural physiological conditions. IDPs/IDRs exhibit a broad range of molecular functions, which involve binding interactions with partners and retaining native structural flexibility. The rapid increase in the number of proteins in sequence databases and the diversity of disordered functions challenge existing computational methods for predicting protein intrinsic disorder and disordered functions. A disordered region interacts with different partners to perform multiple functions, and these disordered functions exhibit different dependencies and correlations. In this study, we introduce DisoFLAG, a computational method that leverages a graph-based interaction protein language model (GiPLM) for jointly predicting disorder and its multiple potential functions. GiPLM integrates protein semantic information based on pre-trained protein language models into graph-based interaction units to enhance the correlation of the semantic representation of multiple disordered functions. The DisoFLAG predictor takes amino acid sequences as the only inputs and provides predictions of intrinsic disorder and six disordered functions for proteins, including protein-binding, DNA-binding, RNA-binding, ion-binding, lipid-binding, and flexible linker. We evaluated the predictive performance of DisoFLAG following the Critical Assessment of protein Intrinsic Disorder (CAID) experiments, and the results demonstrated that DisoFLAG offers accurate and comprehensive predictions of disordered functions, extending the current coverage of computationally predicted disordered function categories. The standalone package and web server of DisoFLAG have been established to provide accurate prediction tools for intrinsic disorder and its associated functions.
Affiliation(s)
- Yihe Pang
- School of Computer Science and Technology, Beijing Institute of Technology, No. 5, South Zhongguancun Street, Beijing, Haidian District, 100081, China
| | - Bin Liu
- School of Computer Science and Technology, Beijing Institute of Technology, No. 5, South Zhongguancun Street, Beijing, Haidian District, 100081, China.
- Advanced Research Institute of Multidisciplinary Science, Beijing Institute of Technology, No. 5, South Zhongguancun Street, Beijing, Haidian District, 100081, China.
|
18
|
Yang Z, Wang Y, Ni X, Yang S. DeepDRP: Prediction of intrinsically disordered regions based on integrated view deep learning architecture from transformer-enhanced and protein information. Int J Biol Macromol 2023; 253:127390. [PMID: 37827403 DOI: 10.1016/j.ijbiomac.2023.127390] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/19/2023] [Revised: 09/20/2023] [Accepted: 10/09/2023] [Indexed: 10/14/2023]
Abstract
Intrinsic disorder in proteins, a widely distributed phenomenon in nature, is related to many crucial biological processes and various diseases. Traditional determination methods tend to be costly and labor-intensive; it is therefore desirable to find an accurate method for identifying intrinsically disordered proteins (IDPs). In this paper, we propose a novel Deep learning model for Intrinsically Disordered Regions in Proteins named DeepDRP. DeepDRP employs a TimeDistributed strategy and a Bi-LSTM architecture to predict IDPs and is driven by integrated view features of PSSM, Energy-based encoding, AAindex, and transformer-enhanced embeddings including DR-BERT, OntoProtein, Prot-T5, and ESM-2. The comparison of different feature combinations indicates that the transformer-enhanced features contribute far more than traditional features to IDP prediction and that ESM-2 accounts for a larger contribution in the pre-trained fusion vectors. The ablation test verified that the TimeDistributed strategy increased model performance and is an efficient approach to IDP prediction. Compared with eight state-of-the-art methods on the DISORDER723, S1, and DisProt832 datasets, the Matthews correlation coefficient of DeepDRP outperformed competing methods by 4.90% to 36.20%, 11.80% to 26.33%, and 4.82% to 13.55%, respectively. In brief, DeepDRP is a reliable model for IDP prediction and is freely available at https://github.com/ZX-COLA/DeepDRP.
Affiliation(s)
- Zexi Yang
- School of Computer Science and Artificial Intelligence Aliyun School of Big Data School of Software, Changzhou University, Changzhou 213164, China
| | - Yan Wang
- Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun 130012, China; School of Artificial Intelligence, Jilin University, Changchun 130012, China
| | - Xinye Ni
- The Affiliated Changzhou No. 2 People's Hospital of Nanjing Medical University, Changzhou 213164, China
| | - Sen Yang
- School of Computer Science and Artificial Intelligence Aliyun School of Big Data School of Software, Changzhou University, Changzhou 213164, China; The Affiliated Changzhou No. 2 People's Hospital of Nanjing Medical University, Changzhou 213164, China.
|
19
|
Conte AD, Mehdiabadi M, Bouhraoua A, Miguel Monzon A, Tosatto SCE, Piovesan D. Critical assessment of protein intrinsic disorder prediction (CAID) - Results of round 2. Proteins 2023; 91:1925-1934. [PMID: 37621223 DOI: 10.1002/prot.26582] [Citation(s) in RCA: 31] [Impact Index Per Article: 15.5] [Received: 04/21/2023] [Revised: 06/22/2023] [Accepted: 08/08/2023] [Indexed: 08/26/2023]
Abstract
Protein intrinsic disorder (ID) is a complex and context-dependent phenomenon that covers a continuum between fully disordered states and folded states with long dynamic regions. The lack of a ground truth that fits all ID flavors and the potential for order-to-disorder transitions depending on specific conditions make ID prediction challenging. The CAID2 challenge aimed to evaluate the performance of different prediction methods across different benchmarks, leveraging the annotation provided by the DisProt database, which stores the coordinates of ID regions when there is experimental evidence in the literature. The CAID2 challenge demonstrated varying performance of different prediction methods across different benchmarks, highlighting the need for continued development of more versatile and efficient prediction software. Depending on the application, researchers may need to balance performance with execution time when selecting a predictor. Methods based on AlphaFold2 seem to be good ID predictors, but they are better at detecting absence of order than ID regions as defined in DisProt. The CAID2 predictors can be freely used through the CAID Prediction Portal, and CAID has been integrated into OpenEBench, which will become the official platform for running future CAID challenges.
Affiliation(s)
- Alessio Del Conte
- Department of Biomedical Sciences, University of Padova, Padova, Italy
| | - Mahta Mehdiabadi
- Department of Biomedical Sciences, University of Padova, Padova, Italy
| | - Adel Bouhraoua
- Department of Biomedical Sciences, University of Padova, Padova, Italy
| | | | | | - Damiano Piovesan
- Department of Biomedical Sciences, University of Padova, Padova, Italy
|
20
|
Kurgan L, Hu G, Wang K, Ghadermarzi S, Zhao B, Malhis N, Erdős G, Gsponer J, Uversky VN, Dosztányi Z. Tutorial: a guide for the selection of fast and accurate computational tools for the prediction of intrinsic disorder in proteins. Nat Protoc 2023; 18:3157-3172. [PMID: 37740110 DOI: 10.1038/s41596-023-00876-x] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 12/13/2022] [Accepted: 06/21/2023] [Indexed: 09/24/2023]
Abstract
Intrinsic disorder is instrumental for a wide range of protein functions, and its analysis, using computational predictions from primary structures, complements secondary and tertiary structure-based approaches. In this Tutorial, we provide an overview and comparison of 23 publicly available computational tools with complementary parameters useful for intrinsic disorder prediction, partly relying on results from the Critical Assessment of protein Intrinsic Disorder prediction experiment. We consider factors such as accuracy, runtime, availability and the need for functional insights. The selected tools are available as web servers and downloadable programs, offer state-of-the-art predictions and can be used in a high-throughput manner. We provide examples and instructions for the selected tools to illustrate practical aspects related to the submission, collection and interpretation of predictions, as well as the timing and their limitations. We highlight two predictors for intrinsically disordered proteins, flDPnn as accurate and fast and IUPred as very fast and moderately accurate, while suggesting ANCHOR2 and MoRFchibi as two of the best-performing predictors for intrinsically disordered region binding. We link these tools to additional resources, including databases of predictions and web servers that integrate multiple predictive methods. Altogether, this Tutorial provides a hands-on guide to comparatively evaluating multiple predictors, submitting and collecting their own predictions, and reading and interpreting results. It is suitable for experimentalists and computational biologists interested in accurately and conveniently identifying intrinsic disorder, facilitating the functional characterization of the rapidly growing collections of protein sequences.
Affiliation(s)
- Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA.
| | - Gang Hu
- School of Statistics and Data Science, LPMC and KLMDASR, Nankai University, Tianjin, China
| | - Kui Wang
- School of Statistics and Data Science, LPMC and KLMDASR, Nankai University, Tianjin, China
| | - Sina Ghadermarzi
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA
| | - Bi Zhao
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA
| | - Nawar Malhis
- Michael Smith Laboratories, University of British Columbia, Vancouver, British Columbia, Canada
| | - Gábor Erdős
- MTA-ELTE Momentum Bioinformatics Research Group, Department of Biochemistry, Eötvös Loránd University, Budapest, Hungary
| | - Jörg Gsponer
- Michael Smith Laboratories, University of British Columbia, Vancouver, British Columbia, Canada.
| | - Vladimir N Uversky
- Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, Tampa, FL, USA.
- Byrd Alzheimer's Center and Research Institute, Morsani College of Medicine, University of South Florida, Tampa, FL, USA.
| | - Zsuzsanna Dosztányi
- MTA-ELTE Momentum Bioinformatics Research Group, Department of Biochemistry, Eötvös Loránd University, Budapest, Hungary.
|
21
|
Tang YJ, Yan K, Zhang X, Tian Y, Liu B. Protein intrinsically disordered region prediction by combining neural architecture search and multi-objective genetic algorithm. BMC Biol 2023; 21:188. [PMID: 37674132 PMCID: PMC10483879 DOI: 10.1186/s12915-023-01672-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/19/2023] [Accepted: 07/31/2023] [Indexed: 09/08/2023] Open
Abstract
BACKGROUND Intrinsically disordered regions (IDRs) are widely distributed in proteins and related to many important biological functions. Accurately identifying IDRs is of great significance for protein structure and function analysis. Because long disordered regions (LDRs) and short disordered regions (SDRs) have different characteristics, existing predictors fail to achieve stable performance on datasets with different ratios of LDRs to SDRs. There are two main reasons. First, existing predictors use network structures chosen from their designers' experience, such as convolutional neural networks (CNNs), which extract features of neighboring residues, and long short-term memory (LSTM) networks, which extract long-distance dependencies between residues. These networks cannot capture the hidden length-dependent features of residues. Second, many deep learning algorithms have been proposed, but the complementarity of the existing predictors has not been fully explored and used. RESULTS In this study, the neural architecture search (NAS) algorithm was employed to automatically construct the network structures so as to capture the hidden features in protein sequences. In order to stably predict both LDRs and SDRs, the model constructed by NAS was combined with length-dependent models for capturing the unique features of SDRs or LDRs and general models for capturing the features common to LDRs and SDRs. A new predictor called IDP-Fusion was proposed. CONCLUSIONS Experimental results showed that IDP-Fusion can achieve more stable performance than the other existing predictors on independent test sets with different ratios of SDRs to LDRs.
Affiliation(s)
- Yi-Jun Tang
- School of Computer Science and Technology, Beijing Institute of Technology, Haidian District, No. 5, South Zhongguancun Street, Beijing, 100081, China
| | - Ke Yan
- School of Computer Science and Technology, Beijing Institute of Technology, Haidian District, No. 5, South Zhongguancun Street, Beijing, 100081, China
| | - Xingyi Zhang
- School of Artificial Intelligence, Anhui University, Hefei, 230601, China
| | - Ye Tian
- Institutes of Physical Science and Information Technology, Anhui University, Hefei, 230601, China
| | - Bin Liu
- School of Computer Science and Technology, Beijing Institute of Technology, Haidian District, No. 5, South Zhongguancun Street, Beijing, 100081, China.
- Advanced Research Institute of Multidisciplinary Science, Beijing Institute of Technology, Beijing, 100081, China.
|
22
|
Antonietti M, Gonzalez DJT, Djulbegovic M, Dayhoff GW, Uversky VN, Shields CL, Karp CL. Intrinsic disorder in PRAME and its role in uveal melanoma. Cell Commun Signal 2023; 21:222. [PMID: 37626310 PMCID: PMC10463658 DOI: 10.1186/s12964-023-01197-y] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 03/26/2023] [Accepted: 06/13/2023] [Indexed: 08/27/2023] Open
Abstract
INTRODUCTION The PReferentially expressed Antigen in MElanoma (PRAME) protein has been shown to be an independent biomarker for increased risk of metastasis in Class 1 uveal melanomas (UM). Intrinsically disordered proteins and regions of proteins (IDPs/IDPRs) do not have a well-defined three-dimensional structure and have been linked to neoplastic development. Our study aimed to evaluate the presence of intrinsic disorder in PRAME and the role these structureless regions have in PRAME(+) Class 1 UM. METHODS We conducted a bioinformatics study to characterize PRAME's propensity for intrinsic disorder. We first used the AlphaFold tool to qualitatively assess the protein structure of PRAME. Then we used the Compositional Profiler and a set of per-residue intrinsic disorder predictors to quantify the intrinsic disorder. The Database of Disordered Protein Prediction (D2P2) platform, IUPred, FuzDrop, flDPnn, AUCpred, SPOT-Disorder2, and metapredict V2 allowed us to evaluate the potential functional disorder of PRAME. Additionally, we used the Search Tool for the Retrieval of Interacting Genes (STRING) to analyze PRAME's potential interactions with other proteins. RESULTS Our structural analysis showed that PRAME contains intrinsically disordered protein regions (IDPRs), which are structureless and flexible. We found that PRAME is significantly enriched with serine (p-value < 0.05), a disorder-promoting amino acid. PRAME was found to have an average disorder score of 16.49% (i.e., moderately disordered) across six per-residue intrinsic disorder predictors. Our IUPred analysis revealed the presence of disorder-to-order transition (DOT) regions in PRAME near the C-terminus of the protein (residues 475-509). The D2P2 platform predicted a region from approximately residue 140 to residue 175 to be highly concentrated with post-translational modifications (PTMs). FuzDrop predicted the PTM hot spot of PRAME to be a droplet-promoting region and an aggregation hotspot. Finally, our analysis using the STRING tool revealed that PRAME has significantly more interactions with other proteins than expected for randomly selected proteins of the same size, with the ability to interact with 84 different partners (STRING analysis result: p-value < 1.0 × 10^-16; model confidence: 0.400). CONCLUSION Our study revealed that PRAME has IDPRs that are possibly linked to its functionality in the context of Class 1 UM. The regions of functionality (i.e., DOT regions, PTM sites, droplet-promoting regions, and aggregation hotspots) are localized to regions of high levels of disorder. PRAME has a complex protein-protein interaction (PPI) network that may be secondary to the structureless features of the polypeptide. Our findings contribute to our understanding of UM and suggest that IDPRs and DOT regions in PRAME may be targeted in developing new therapies for this aggressive cancer.
Affiliation(s)
- Michael Antonietti
- Bascom Palmer Eye Institute, University of Miami, 900 NW 17th Street, Miami, FL, 33136, USA
| | | | - Mak Djulbegovic
- Bascom Palmer Eye Institute, University of Miami, 900 NW 17th Street, Miami, FL, 33136, USA
| | - Guy W Dayhoff
- Department of Chemistry, College of Art and Sciences, University of South Florida, FL, 33612, Tampa, USA
| | - Vladimir N Uversky
- Department of Molecular Medicine and USF Health Byrd Alzheimer's Research Institute, Morsani College of Medicine, University of South Florida, FL, 33612, Tampa, USA
| | - Carol L Shields
- Ocular Oncology Service, Wills Eye Hospital, Thomas Jefferson University, PA, Philadelphia, USA
| | - Carol L Karp
- Bascom Palmer Eye Institute, University of Miami, 900 NW 17th Street, Miami, FL, 33136, USA.
|
23
|
Zhao B, Ghadermarzi S, Kurgan L. Comparative evaluation of AlphaFold2 and disorder predictors for prediction of intrinsic disorder, disorder content and fully disordered proteins. Comput Struct Biotechnol J 2023; 21:3248-3258. [PMID: 38213902 PMCID: PMC10782001 DOI: 10.1016/j.csbj.2023.06.001] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 02/28/2023] [Revised: 05/31/2023] [Accepted: 06/01/2023] [Indexed: 01/13/2024] Open
Abstract
We expand studies of AlphaFold2 (AF2) in the context of intrinsic disorder prediction by comparing it against a broad selection of 20 accurate, popular and recently released disorder predictors. We use a 25% larger benchmark dataset with 646 proteins and cover protein-level predictions of disorder content and fully disordered proteins. AF2-based disorder predictions secure a relatively high Area Under receiver operating characteristic Curve (AUC) of 0.77 and are statistically outperformed by several modern disorder predictors that secure AUCs around 0.8 with a median runtime of about 20 s compared to 1200 s for AF2. Moreover, AF2 provides modestly accurate predictions of fully disordered proteins (F1 = 0.59 vs. 0.91 for the best disorder predictor) and disorder content (mean absolute error of 0.21 vs. 0.15). AF2 also generates statistically more accurate disorder predictions for about 20% of proteins, which have relatively short sequences and a few disordered regions that tend to be located at the sequence termini, and which lack disordered protein-binding regions. Interestingly, AF2 and the most accurate disorder predictors rely on deep neural networks, suggesting that these models are useful for protein structure and disorder predictions.
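The protein-level quantities mentioned above can be sketched as follows: disorder content is the fraction of residues predicted as disordered, and the mean absolute error (MAE) compares predicted and annotated content across proteins. The 0.5 propensity cutoff, the function names and the toy values are assumptions for illustration; the benchmark's actual thresholds and data are not given in the abstract.

```python
def disorder_content(propensities, threshold=0.5):
    """Fraction of residues whose disorder propensity meets the threshold.

    The 0.5 default is an assumed cutoff; predictors typically define their own.
    """
    if not propensities:
        raise ValueError("propensities must not be empty")
    disordered = sum(1 for p in propensities if p >= threshold)
    return disordered / len(propensities)

def mean_absolute_error(predicted, observed):
    """Mean absolute difference between paired predicted and observed values."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(predicted)

# Hypothetical per-residue propensities for two short proteins.
predicted_content = [
    disorder_content([0.9, 0.8, 0.2, 0.1]),       # two of four residues
    disorder_content([0.7, 0.6, 0.9, 0.3, 0.2]),  # three of five residues
]
annotated_content = [0.25, 0.60]  # made-up reference values
mae = mean_absolute_error(predicted_content, annotated_content)
```

A protein could be called "fully disordered" when its content reaches some high cutoff; that cutoff would be another assumption and is deliberately left out here.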
Affiliation(s)
- Bi Zhao
- Genomics program, College of Public Health, University of South Florida, Tampa, FL, United States
| | - Sina Ghadermarzi
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA, United States
| | - Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA, United States
|
24
|
Redl I, Fisicaro C, Dutton O, Hoffmann F, Henderson L, Owens BJ, Heberling M, Paci E, Tamiola K. ADOPT: intrinsic protein disorder prediction through deep bidirectional transformers. NAR Genom Bioinform 2023; 5:lqad041. [PMID: 37138579 PMCID: PMC10150328 DOI: 10.1093/nargab/lqad041] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 08/10/2022] [Revised: 02/07/2023] [Accepted: 04/17/2023] [Indexed: 05/05/2023] Open
Abstract
Intrinsically disordered proteins (IDPs) are important for a broad range of biological functions and are involved in many diseases. An understanding of intrinsic disorder is key to developing compounds that target IDPs. Experimental characterization of IDPs is hindered by the very fact that they are highly dynamic. Computational methods that predict disorder from the amino acid sequence have been proposed. Here, we present ADOPT (Attention DisOrder PredicTor), a new predictor of protein disorder. ADOPT is composed of a self-supervised encoder and a supervised disorder predictor. The former is based on a deep bidirectional transformer, which extracts dense residue-level representations from Facebook's Evolutionary Scale Modeling library. The latter uses a database of nuclear magnetic resonance chemical shifts, constructed to ensure balanced amounts of disordered and ordered residues, as a training and a test dataset for protein disorder. ADOPT predicts whether a protein or a specific region is disordered with better performance than the best existing predictors and faster than most other proposed methods (a few seconds per sequence). We identify the features that are relevant for the prediction performance and show that good performance can already be gained with <100 features. ADOPT is available as a stand-alone package at https://github.com/PeptoneLtd/ADOPT and as a web server at https://adopt.peptone.io/.
Affiliation(s)
- Istvan Redl
- Peptone Ltd, 370 Grays Inn Road, London WC1X 8BB, UK
| | | | - Oliver Dutton
- Peptone Ltd, 370 Grays Inn Road, London WC1X 8BB, UK
| | - Falk Hoffmann
- Peptone Ltd, 370 Grays Inn Road, London WC1X 8BB, UK
| | | | | | | | - Emanuele Paci
- Peptone Ltd, 370 Grays Inn Road, London WC1X 8BB, UK
- Department of Physics and Astronomy ‘Augusto Righi’, University of Bologna, 40127 Bologna, Italy
| | - Kamil Tamiola
- To whom correspondence should be addressed. Tel: +41 79 609 7333;
| |
Collapse
|
25
|
Abstract
There are over 100 computational predictors of intrinsic disorder. These methods predict amino acid-level propensities for disorder directly from protein sequences. The propensities can be used to annotate putative disordered residues and regions. This unit provides a practical and holistic introduction to the sequence-based intrinsic disorder prediction. We define intrinsic disorder, explain the format of computational prediction of disorder, and identify and describe several accurate predictors. We also introduce recently released databases of intrinsic disorder predictions and use an illustrative example to provide insights into how predictions should be interpreted and combined. Lastly, we summarize key experimental methods that can be used to validate computational predictions. © 2023 Wiley Periodicals LLC.
Affiliation(s)
- Vladimir N Uversky
- Department of Molecular Medicine and USF Health Byrd Alzheimer's Research Institute, Morsani College of Medicine, University of South Florida, Tampa, Florida
- Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, Virginia
26
Basu S, Gsponer J, Kurgan L. DEPICTER2: a comprehensive webserver for intrinsic disorder and disorder function prediction. Nucleic Acids Res 2023:7151337. [PMID: 37140058 DOI: 10.1093/nar/gkad330] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 03/15/2023] [Revised: 04/12/2023] [Accepted: 04/18/2023] [Indexed: 05/05/2023]
Abstract
Intrinsic disorder in proteins is relatively abundant in nature and essential for a broad spectrum of cellular functions. While disorder can be accurately predicted from protein sequences, as it was empirically demonstrated in recent community-organized assessments, it is rather challenging to collect and compile a comprehensive prediction that covers multiple disorder functions. To this end, we introduce the DEPICTER2 (DisorderEd PredictIon CenTER) webserver that offers convenient access to a curated collection of fast and accurate disorder and disorder function predictors. This server includes a state-of-the-art disorder predictor, flDPnn, and five modern methods that cover all currently predictable disorder functions: disordered linkers and protein, peptide, DNA, RNA and lipid binding. DEPICTER2 allows selection of any combination of the six methods, batch predictions of up to 25 proteins per request and provides interactive visualization of the resulting predictions. The webserver is freely available at http://biomine.cs.vcu.edu/servers/DEPICTER2/.
Affiliation(s)
- Sushmita Basu
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA
- Jörg Gsponer
- Michael Smith Laboratories, University of British Columbia, Vancouver, British Columbia, Canada
- Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA
27
Pang Y, Liu B. TransDFL: Identification of Disordered Flexible Linkers in Proteins by Transfer Learning. GENOMICS, PROTEOMICS & BIOINFORMATICS 2023; 21:359-369. [PMID: 36272675 PMCID: PMC10626177 DOI: 10.1016/j.gpb.2022.10.004] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 12/13/2021] [Revised: 09/21/2022] [Accepted: 10/14/2022] [Indexed: 11/27/2022]
Abstract
Disordered flexible linkers (DFLs) are the functional disordered regions in proteins, which are the sub-regions of intrinsically disordered regions (IDRs) and play important roles in connecting domains and maintaining inter-domain interactions. Trained with the limited available DFLs, the existing DFL predictors based on machine learning techniques tend to predict the ordered residues as DFLs, leading to a high false-positive rate (FPR) and low prediction accuracy. Previous studies have shown that DFLs are extremely flexible disordered regions, which are usually predicted as disordered residues with high confidence [P(D) > 0.9] by an IDR predictor. Therefore, transferring an IDR predictor to an accurate DFL predictor is of great significance for understanding the functions of IDRs. In this study, we proposed a new predictor called TransDFL for identifying DFLs by transferring the RFPR-IDP predictor for IDR identification to the DFL prediction. The RFPR-IDP was pre-trained with IDR sequences to learn the general features between IDRs and DFLs, which is helpful to reduce the false positives in the ordered regions. RFPR-IDP was fine-tuned with the DFL sequences to capture the specific features of DFLs so as to be transferred into the TransDFL. Experimental results of two application scenarios (prediction of DFLs only in IDRs or prediction of DFLs in entire proteins) showed that TransDFL consistently outperformed other existing DFL predictors with higher accuracy. The corresponding web server of TransDFL can be freely accessed at http://bliulab.net/TransDFL/.
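The high-confidence criterion quoted in this abstract, P(D) > 0.9, can be illustrated with a short sketch. It is not part of TransDFL or RFPR-IDP: the function name `high_confidence_regions`, the `min_len` parameter and the example scores are hypothetical, and the code only shows how per-residue disorder probabilities from any predictor might be turned into candidate regions.

```python
from typing import List, Tuple


def high_confidence_regions(
    probs: List[float], threshold: float = 0.9, min_len: int = 1
) -> List[Tuple[int, int]]:
    """Return 0-based, inclusive (start, end) spans in which every
    per-residue disorder probability is strictly above `threshold`.

    `min_len` (an assumption, not taken from the abstract) discards
    spans shorter than the given number of residues.
    """
    regions: List[Tuple[int, int]] = []
    start = None  # index where the current high-confidence run began
    for i, p in enumerate(probs):
        if p > threshold:
            if start is None:
                start = i
        elif start is not None:
            if i - start >= min_len:
                regions.append((start, i - 1))
            start = None
    # Close a run that extends to the end of the sequence.
    if start is not None and len(probs) - start >= min_len:
        regions.append((start, len(probs) - 1))
    return regions


# Made-up scores for a seven-residue toy sequence.
scores = [0.2, 0.95, 0.97, 0.91, 0.4, 0.93, 0.99]
print(high_confidence_regions(scores, min_len=2))
```

The comparison is strict, so a residue scoring exactly 0.9 is not counted as high confidence, matching the "> 0.9" wording above.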
Affiliation(s)
- Yihe Pang
- School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China
- Bin Liu
- School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China; Advanced Research Institute of Multidisciplinary Science, Beijing Institute of Technology, Beijing 100081, China.
28
Mínguez-Toral M, Pacios LF, Sánchez F, Ponz F. Structural intrinsic disorder in a functionalized potyviral coat protein as a main viability determinant of its assembled nanoparticles. Int J Biol Macromol 2023; 236:123958. [PMID: 36906197 DOI: 10.1016/j.ijbiomac.2023.123958] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 11/16/2022] [Revised: 02/24/2023] [Accepted: 03/04/2023] [Indexed: 03/11/2023]
Abstract
The viability of viral-derived nanoparticles (virions and VLPs) intended for nanobiotechnological functionalization of the coat protein (CP) of turnip mosaic virus has been studied by means of advanced computational methodologies that include molecular dynamics. The study made it possible to model the structure of the complete CP and its functionalization with three different peptides, and to obtain essential structural features such as order/disorder, interactions, and electrostatic potentials of their constituent domains. The results provide for the first time a dynamic view of a complete potyvirus CP, since the experimental structures available so far lack N- and C-terminal segments. The relevance of disorder in the most distal N-terminal subdomain, and the interaction of the less distal N-terminal subdomain with the highly ordered CP core, stand out as crucial characteristics for a viable CP. Preserving them proved of utmost importance to obtain viable potyviral CPs presenting peptides at their N-terminus.
Affiliation(s)
- Marina Mínguez-Toral
- Department of Structural and Chemical Biology, Centro de Investigaciones Biológicas Margarita Salas, CIB-CSIC, 28040 Madrid, Spain
- Luis F Pacios
- Departamento de Biotecnología-Biología Vegetal, ETSIAAB, Universidad Politécnica de Madrid (UPM), 28040 Madrid, Spain; Centro de Biotecnología y Genómica de Plantas, Universidad Politécnica de Madrid (UPM) - Instituto Nacional de Investigación Agraria y Alimentaria (INIA/CSIC), Campus de Montegancedo UPM, 28223 Pozuelo de Alarcón, Madrid, Spain
- Flora Sánchez
- Centro de Biotecnología y Genómica de Plantas, Universidad Politécnica de Madrid (UPM) - Instituto Nacional de Investigación Agraria y Alimentaria (INIA/CSIC), Campus de Montegancedo UPM, 28223 Pozuelo de Alarcón, Madrid, Spain
- Fernando Ponz
- Centro de Biotecnología y Genómica de Plantas, Universidad Politécnica de Madrid (UPM) - Instituto Nacional de Investigación Agraria y Alimentaria (INIA/CSIC), Campus de Montegancedo UPM, 28223 Pozuelo de Alarcón, Madrid, Spain.
29
Computational prediction of disordered binding regions. Comput Struct Biotechnol J 2023; 21:1487-1497. [PMID: 36851914 PMCID: PMC9957716 DOI: 10.1016/j.csbj.2023.02.018] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 10/26/2022] [Revised: 02/08/2023] [Accepted: 02/08/2023] [Indexed: 02/12/2023]
Abstract
One of the key features of intrinsically disordered regions (IDRs) is their ability to interact with a broad range of partner molecules. Multiple types of interacting IDRs were identified including molecular recognition fragments (MoRFs), short linear sequence motifs (SLiMs), and protein-, nucleic acids- and lipid-binding regions. Prediction of binding IDRs in protein sequences is gaining momentum in recent years. We survey 38 predictors of binding IDRs that target interactions with a diverse set of partners, such as peptides, proteins, RNA, DNA and lipids. We offer a historical perspective and highlight key events that fueled efforts to develop these methods. These tools rely on a diverse range of predictive architectures that include scoring functions, regular expressions, traditional and deep machine learning and meta-models. Recent efforts focus on the development of deep neural network-based architectures and extending coverage to RNA, DNA and lipid-binding IDRs. We analyze availability of these methods and show that providing implementations and webservers results in much higher rates of citations/use. We also make several recommendations to take advantage of modern deep network architectures, develop tools that bundle predictions of multiple and different types of binding IDRs, and work on algorithms that model structures of the resulting complexes.
30
Kouros CE, Makri V, Ouzounis CA, Chasapi A. Disease association and comparative genomics of compositional bias in human proteins. F1000Res 2023; 12:198. [PMID: 37082000 PMCID: PMC10111144 DOI: 10.12688/f1000research.129929.1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 02/02/2023] [Indexed: 02/22/2023]
Abstract
Background: The evolutionary rate of disordered proteins varies greatly due to the lack of structural constraints. So far, few studies have investigated the presence/absence patterns of intrinsically disordered regions (IDRs) across phylogenies in conjunction with human disease. In this study, we report a genome-wide analysis of compositional bias association with disease in human proteins and their taxonomic distribution. Methods: The human genome protein set provided by the Ensembl database was annotated and analysed with respect to both disease associations and the detection of compositional bias. The Uniprot Reference Proteome dataset, containing 11297 proteomes was used as target dataset for the comparative genomics of a well-defined subset of the Human Genome, including 100 characteristic, compositionally biased proteins, some linked to disease. Results: Cross-evaluation of compositional bias and disease-association in the human genome reveals a significant bias towards low complexity regions in disease-associated genes, with charged, hydrophilic amino acids appearing as over-represented. The phylogenetic profiling of 17 disease-associated, low complexity proteins across 11297 proteomes captures characteristic taxonomic distribution patterns. Conclusions: This is the first time that a combined genome-wide analysis of low complexity, disease-association and taxonomic distribution of human proteins is reported, covering structural, functional, and evolutionary properties. The reported framework can form the basis for large-scale, follow-up projects, encompassing the entire human genome and all known gene-disease associations.
Affiliation(s)
- Christos E. Kouros
- BCCB-AIIA, School of Informatics, Aristotle University of Thessaloniki, Thessaloniki, Greece
- Vasiliki Makri
- BCCB-AIIA, School of Informatics, Aristotle University of Thessaloniki, Thessaloniki, Greece
- Christos A. Ouzounis
- BCCB-AIIA, School of Informatics, Aristotle University of Thessaloniki, Thessaloniki, Greece
- BCPL, Chemical Process & Energy Resources Institute, Centre for Research & Technology Hellas (CERTH), Thessaloniki, Greece
- Anastasia Chasapi
- BCPL, Chemical Process & Energy Resources Institute, Centre for Research & Technology Hellas (CERTH), Thessaloniki, Greece
31
Kouros CE, Makri V, Ouzounis CA, Chasapi A. Disease association and comparative genomics of compositional bias in human proteins. F1000Res 2023; 12:198. [PMID: 37082000 PMCID: PMC10111144 DOI: 10.12688/f1000research.129929.2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 04/12/2023] [Indexed: 04/25/2023]
Abstract
Background: The evolutionary rate of disordered protein regions varies greatly due to the lack of structural constraints. So far, few studies have investigated the presence/absence patterns of compositional bias, indicative of disorder, across phylogenies in conjunction with human disease. In this study, we report a genome-wide analysis of compositional bias association with disease in human proteins and their taxonomic distribution. Methods: The human genome protein set provided by the Ensembl database was annotated and analysed with respect to both disease associations and the detection of compositional bias. The Uniprot Reference Proteome dataset, containing 11297 proteomes was used as target dataset for the comparative genomics of a well-defined subset of the Human Genome, including 100 characteristic, compositionally biased proteins, some linked to disease. Results: Cross-evaluation of compositional bias and disease-association in the human genome reveals a significant bias towards biased regions in disease-associated genes, with charged, hydrophilic amino acids appearing as over-represented. The phylogenetic profiling of 17 disease-associated, proteins with compositional bias across 11297 proteomes captures characteristic taxonomic distribution patterns. Conclusions: This is the first time that a combined genome-wide analysis of compositional bias, disease-association and taxonomic distribution of human proteins is reported, covering structural, functional, and evolutionary properties. The reported framework can form the basis for large-scale, follow-up projects, encompassing the entire human genome and all known gene-disease associations.
Affiliation(s)
- Christos E. Kouros
- BCCB-AIIA, School of Informatics, Aristotle University of Thessaloniki, Thessaloniki, Greece
- Vasiliki Makri
- BCCB-AIIA, School of Informatics, Aristotle University of Thessaloniki, Thessaloniki, Greece
- Christos A. Ouzounis
- BCCB-AIIA, School of Informatics, Aristotle University of Thessaloniki, Thessaloniki, Greece
- BCPL, Chemical Process & Energy Resources Institute, Centre for Research & Technology Hellas (CERTH), Thessaloniki, Greece
- Anastasia Chasapi
- BCPL, Chemical Process & Energy Resources Institute, Centre for Research & Technology Hellas (CERTH), Thessaloniki, Greece
32
Han B, Ren C, Wang W, Li J, Gong X. Computational Prediction of Protein Intrinsically Disordered Region Related Interactions and Functions. Genes (Basel) 2023; 14:432. [PMID: 36833360 PMCID: PMC9956190 DOI: 10.3390/genes14020432] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 12/31/2022] [Revised: 02/02/2023] [Accepted: 02/05/2023] [Indexed: 02/11/2023]
Abstract
Intrinsically Disordered Proteins (IDPs) and Regions (IDRs) exist widely. Although without well-defined structures, they participate in many important biological processes. In addition, they are also widely related to human diseases and have become potential targets in drug discovery. However, there is a big gap between the experimental annotations related to IDPs/IDRs and their actual number. In recent decades, the computational methods related to IDPs/IDRs have been developed vigorously, including predicting IDPs/IDRs, the binding modes of IDPs/IDRs, the binding sites of IDPs/IDRs, and the molecular functions of IDPs/IDRs according to different tasks. In view of the correlation between these predictors, we have reviewed these prediction methods uniformly for the first time, summarized their computational methods and predictive performance, and discussed some problems and perspectives.
Affiliation(s)
- Bingqing Han
- Mathematical Intelligence Application Lab, Institute for Mathematical Sciences, Renmin University of China, Beijing 100872, China
- Chongjiao Ren
- Mathematical Intelligence Application Lab, Institute for Mathematical Sciences, Renmin University of China, Beijing 100872, China
- Wenda Wang
- Mathematical Intelligence Application Lab, Institute for Mathematical Sciences, Renmin University of China, Beijing 100872, China
- Jiashan Li
- Mathematical Intelligence Application Lab, Institute for Mathematical Sciences, Renmin University of China, Beijing 100872, China
- Xinqi Gong
- Mathematical Intelligence Application Lab, Institute for Mathematical Sciences, Renmin University of China, Beijing 100872, China
- Beijing Academy of Intelligence, Beijing 100083, China
33
Eicher JE, Brom JA, Wang S, Sheiko SS, Atkin JM, Pielak GJ. Secondary structure and stability of a gel-forming tardigrade desiccation-tolerance protein. Protein Sci 2022; 31:e4495. [PMID: 36335581 PMCID: PMC9679978 DOI: 10.1002/pro.4495] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 09/09/2022] [Revised: 10/26/2022] [Accepted: 11/02/2022] [Indexed: 11/08/2022]
Abstract
Protein-based pharmaceuticals are increasingly important, but their inherent instability necessitates a "cold chain" requiring costly refrigeration during production, shipment, and storage. Drying can overcome this problem, but most proteins need the addition of stabilizers, and some cannot be successfully formulated. Thus, there is a need for new, more effective protective molecules. Cytosolically abundant heat-soluble proteins from tardigrades are both fundamentally interesting and a promising source of inspiration; these disordered, monodisperse polymers form hydrogels whose structure may protect client proteins during drying. We used attenuated total reflectance Fourier transform infrared spectroscopy, differential scanning calorimetry, and small-amplitude oscillatory shear rheometry to characterize gelation. A 5% (wt/vol) gel has a strength comparable with human skin, and melts cooperatively and reversibly near body temperature with an enthalpy comparable with globular proteins. We suggest that the dilute protein forms α-helical coiled coils and increasing their concentration drives gelation via intermolecular β-sheet formation.
Affiliation(s)
- Jonathan E. Eicher
- Department of Chemistry, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Julia A. Brom
- Department of Chemistry, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Shikun Wang
- Department of Chemistry, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Sergei S. Sheiko
- Department of Chemistry, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Joanna M. Atkin
- Department of Chemistry, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Gary J. Pielak
- Department of Chemistry, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
34
Dayhoff GW, Uversky VN. Rapid prediction and analysis of protein intrinsic disorder. Protein Sci 2022; 31:e4496. [PMID: 36334049 PMCID: PMC9679974 DOI: 10.1002/pro.4496] [Citation(s) in RCA: 63] [Impact Index Per Article: 21.0] [Received: 08/15/2022] [Revised: 10/28/2022] [Accepted: 11/02/2022] [Indexed: 11/07/2022]
Abstract
Protein intrinsic disorder is found in all kingdoms of life and is known to underpin numerous physiological and pathological processes. Computational methods play an important role in characterizing and identifying intrinsically disordered proteins and protein regions. Herein, we present a new high-efficiency web-based disorder predictor named Rapid Intrinsic Disorder Analysis Online (RIDAO) that is designed to facilitate the application of protein intrinsic disorder analysis in genome-scale structural bioinformatics and comparative genomics/proteomics. RIDAO integrates six established disorder predictors into a single, unified platform that reproduces the results of individual predictors with near-perfect fidelity. To demonstrate the potential applications, we construct a test set containing more than one million sequences from one hundred organisms comprising over 420 million residues. Using this test set, we compare the efficiency and accessibility (i.e., ease of use) of RIDAO to five well-known and popular disorder predictors, namely: AUCpreD, IUPred3, metapredict V2, flDPnn, and SPOT-Disorder2. We show that RIDAO yields per-residue predictions at a rate two to six orders of magnitude greater than the other predictors and completely processes the test set in under an hour. RIDAO can be accessed free of charge at https://ridao.app.
Affiliation(s)
- Guy W. Dayhoff
- Department of Chemistry, University of South Florida, Tampa, Florida, USA
- Vladimir N. Uversky
- Department of Molecular Medicine and USF Health Byrd Alzheimer's Research Institute, University of South Florida, Tampa, Florida, USA
35
Ilzhöfer D, Heinzinger M, Rost B. SETH predicts nuances of residue disorder from protein embeddings. FRONTIERS IN BIOINFORMATICS 2022; 2:1019597. [PMID: 36304335 PMCID: PMC9580958 DOI: 10.3389/fbinf.2022.1019597] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 08/15/2022] [Accepted: 09/20/2022] [Indexed: 11/07/2022]
Abstract
Predictions for millions of protein three-dimensional structures are only a few clicks away since the release of AlphaFold2 results for UniProt. However, many proteins have so-called intrinsically disordered regions (IDRs) that do not adopt unique structures in isolation. These IDRs are associated with several diseases, including Alzheimer's Disease. We showed that three recent disorder measures of AlphaFold2 predictions (pLDDT, "experimentally resolved" prediction and "relative solvent accessibility") correlated to some extent with IDRs. However, expert methods predict IDRs more reliably by combining complex machine learning models with expert-crafted input features and evolutionary information from multiple sequence alignments (MSAs). MSAs are not always available, especially for IDRs, and are computationally expensive to generate, limiting the scalability of the associated tools. Here, we present the novel method SETH that predicts residue disorder from embeddings generated by the protein Language Model ProtT5, which explicitly only uses single sequences as input. Thereby, our method, relying on a relatively shallow convolutional neural network, outperformed much more complex solutions while being much faster, allowing to create predictions for the human proteome in about 1 hour on a consumer-grade PC with one NVIDIA GeForce RTX 3060. Trained on a continuous disorder scale (CheZOD scores), our method captured subtle variations in disorder, thereby providing important information beyond the binary classification of most methods. High performance paired with speed revealed that SETH's nuanced disorder predictions for entire proteomes capture aspects of the evolution of organisms. Additionally, SETH could also be used to filter out regions or proteins with probable low-quality AlphaFold2 3D structures to prioritize running the compute-intensive predictions for large data sets. SETH is freely publicly available at: https://github.com/Rostlab/SETH.
Affiliation(s)
- Dagmar Ilzhöfer
- Faculty of Informatics, TUM (Technical University of Munich), Munich, Germany
- Michael Heinzinger
- Faculty of Informatics, TUM (Technical University of Munich), Munich, Germany
- Center of Doctoral Studies in Informatics and Its Applications (CeDoSIA), TUM Graduate School, Garching, Germany
- Burkhard Rost
- Faculty of Informatics, TUM (Technical University of Munich), Munich, Germany
- Institute for Advanced Study (TUM-IAS), TUM (Technical University of Munich), Garching, Germany
- TUM School of Life Sciences Weihenstephan (WZW), TUM (Technical University of Munich), Freising, Germany
36
Yin K, Tong M, Sun F, Wu R. Quantitative Structural Proteomics Unveils the Conformational Changes of Proteins under the Endoplasmic Reticulum Stress. Anal Chem 2022; 94:13250-13260. [PMID: 36108266 PMCID: PMC9789690 DOI: 10.1021/acs.analchem.2c03076] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 12/26/2022]
Abstract
Protein structures are decisive for their activities and interactions with other molecules. Global analysis of protein structures and conformational changes cannot be achieved by commonly used abundance-based proteomics. Here, we integrated cysteine covalent labeling, selective enrichment, and quantitative proteomics to study protein structures and structural changes on a large scale. This method was applied to globally investigate protein structures in HEK293T cells and protein structural changes in the cells with the tunicamycin (Tm)-induced endoplasmic reticulum (ER) stress. We quantified several thousand cysteine residues, which contain unprecedented and valuable information of protein structures. Combining this method with pulsed stable isotope labeling by amino acids in cell culture, we further analyzed the folding state differences between pre-existing and newly synthesized proteins in cells under the Tm treatment. Besides newly synthesized proteins, unexpectedly, many pre-existing proteins were found to become unfolded upon ER stress, especially those related to gene transcription and protein translation. Furthermore, the current results reveal that N-glycosylation plays a more important role in the folding process of the tertiary and quaternary structures than the secondary structures for newly synthesized proteins. Considering the importance of cysteine in protein structures, this method can be extensively applied in the biological and biomedical research fields.
Affiliation(s)
- Kejun Yin
- School of Chemistry and Biochemistry and the Petit Institute for Bioengineering and Bioscience, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
- Ming Tong
- School of Chemistry and Biochemistry and the Petit Institute for Bioengineering and Bioscience, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
- Fangxu Sun
- School of Chemistry and Biochemistry and the Petit Institute for Bioengineering and Bioscience, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
- Ronghu Wu
- School of Chemistry and Biochemistry and the Petit Institute for Bioengineering and Bioscience, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
37
Mahmoud NA, Elshafei AM, Almofti YA. A novel strategy for developing vaccine candidate against Jaagsiekte sheep retrovirus from the envelope and gag proteins: an in-silico approach. BMC Vet Res 2022; 18:343. [PMID: 36085036 PMCID: PMC9463060 DOI: 10.1186/s12917-022-03431-0] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 06/09/2022] [Accepted: 08/29/2022] [Indexed: 11/26/2022]
Abstract
BACKGROUND: Ovine pulmonary adenocarcinoma (OPA) is a contagious lung cancer of sheep caused by the Jaagsiekte sheep retrovirus (JSRV). OPA typically has a serious economic impact worldwide. A vaccine has yet to be developed, even though the disease has spread globally, along with its complications. This study aimed to construct an effective multi-epitope vaccine against JSRV eliciting B and T lymphocytes using immunoinformatics tools. RESULTS: The designed vaccine was composed of 499 amino acids. Before the vaccine was computationally validated, all critical parameters were taken into consideration, including antigenicity, allergenicity, toxicity, and stability. The physiochemical properties of the vaccine displayed an isoelectric point of 9.88. According to the Instability Index (II), the vaccine was stable at 28.28. The vaccine scored 56.51 on the aliphatic index and -0.731 on the GRAVY, indicating that the vaccine was hydrophilic. The RaptorX server was used to predict the vaccine's tertiary structure, the GalaxyWEB server refined the structure, and the Ramachandran plot and the ProSA-web server validated the vaccine's tertiary structure. Protein-sol and the SOLPro servers showed the solubility of the vaccine. Moreover, the high mobile regions in the vaccine's structure were reduced and the vaccine's stability was improved by disulfide engineering. Also, the vaccine construct was docked with an ovine MHC-1 allele and showed efficient binding energy. Immune simulation remarkably showed high levels of immunoglobulins, T lymphocytes, and IFN-γ secretions. The molecular dynamic simulation supported the stability of the constructed vaccine. Finally, the vaccine was back-transcribed into a DNA sequence and cloned into a pET-30a(+) vector to affirm the potency of translation and microbial expression. CONCLUSION: A novel multi-epitope vaccine construct against JSRV was formed from B and T lymphocyte epitopes, and was produced with potential protection. This study might help in controlling and eradicating OPA.
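The two sequence-derived scores quoted in this abstract, GRAVY and the aliphatic index, follow simple closed-form definitions. The sketch below is illustrative only and does not reproduce the authors' workflow: it assumes the standard Kyte-Doolittle hydropathy scale for GRAVY and the usual aliphatic index formula with coefficients 2.9 and 3.9, and the function names and toy sequence are hypothetical.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(seq: str) -> float:
    """Grand average of hydropathy: mean Kyte-Doolittle value per residue.

    Negative values indicate an overall hydrophilic sequence.
    """
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def aliphatic_index(seq: str, a: float = 2.9, b: float = 3.9) -> float:
    """Aliphatic index: X(Ala) + a*X(Val) + b*(X(Ile) + X(Leu)),
    where X is the mole percentage of each residue in the sequence.
    """
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")

    def mole_percent(aa: str) -> float:
        return 100.0 * seq.count(aa) / len(seq)

    return (
        mole_percent("A")
        + a * mole_percent("V")
        + b * (mole_percent("I") + mole_percent("L"))
    )


# Toy sequence, made up for illustration (not the vaccine construct).
toy = "AKLVRE"
print(f"GRAVY = {gravy(toy):.3f}, aliphatic index = {aliphatic_index(toy):.2f}")
```

Under these definitions, a negative GRAVY such as the reported -0.731 means the residue-averaged hydropathy is below zero, which is why the construct is described as hydrophilic.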
Collapse
Affiliation(s)
- Nuha Amin Mahmoud
- Department of Biochemistry, Genetics and Molecular Biology/ Faculty of Medicine and Surgery, National University, Khartoum, Sudan
| | - Abdelmajeed M Elshafei
- Department of Biochemistry, Genetics and Molecular Biology/ Faculty of Medicine and Surgery, National University, Khartoum, Sudan
| | - Yassir A Almofti
- Department of Biochemistry, Genetics and Molecular Biology/ Faculty of Medicine and Surgery, National University, Khartoum, Sudan.
- Department of Molecular Biology and Bioinformatics, College of Veterinary Medicine, University of Bahri, Khartoum, Sudan.
|
38
|
Avery C, Patterson J, Grear T, Frater T, Jacobs DJ. Protein Function Analysis through Machine Learning. Biomolecules 2022; 12:1246. [PMID: 36139085 PMCID: PMC9496392 DOI: 10.3390/biom12091246] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 07/16/2022] [Revised: 08/22/2022] [Accepted: 08/31/2022] [Indexed: 11/16/2022]
Abstract
Machine learning (ML) has been an important part of the computational biology arsenal used to elucidate protein function for decades. With the recent burgeoning of novel ML methods and applications, new ML approaches have been incorporated into many areas of computational biology dealing with protein function. We examine how ML has been integrated into a wide range of computational models to improve prediction accuracy and gain a better understanding of protein function. The applications discussed are protein structure prediction; protein engineering using sequence modifications to achieve stability and druggability characteristics; molecular docking in terms of protein-ligand binding, including allosteric effects; protein-protein interactions; and protein-centric drug discovery. To quantify the mechanisms underlying protein function, a holistic approach that takes structure, flexibility, stability, and dynamics into account is required, as these aspects become inseparable through their interdependence. Another key component of protein function is conformational dynamics, which often manifest as protein kinetics. Computational methods that use ML to generate representative conformational ensembles and quantify differences in conformational ensembles important for function are included in this review. Future opportunities are highlighted for each of these topics.
Affiliation(s)
- Chris Avery
- Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, Charlotte, NC 28223, USA
| | - John Patterson
- Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, Charlotte, NC 28223, USA
| | - Tyler Grear
- Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, Charlotte, NC 28223, USA
- Department of Physics and Optical Science, University of North Carolina at Charlotte, Charlotte, NC 28223, USA
| | - Theodore Frater
- Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, Charlotte, NC 28223, USA
| | - Donald J. Jacobs
- Department of Physics and Optical Science, University of North Carolina at Charlotte, Charlotte, NC 28223, USA
|
39
|
Wang L, Zhong H, Xue Z, Wang Y. Res-Dom: predicting protein domain boundary from sequence using deep residual network and Bi-LSTM. Bioinformatics Advances 2022; 2:vbac060. [PMID: 36699417 PMCID: PMC9710680 DOI: 10.1093/bioadv/vbac060] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/22/2022] [Revised: 07/01/2022] [Accepted: 08/30/2022] [Indexed: 01/28/2023]
Abstract
Motivation: Protein domains are the basic units of proteins that can fold, function and evolve independently. Domain boundary partitioning plays an important role in protein structure prediction, in understanding biological function, in annotating evolutionary mechanisms and in protein design. Although many methods have been developed over the past two decades to predict domain boundaries from protein sequence, there is still much room for improvement. Results: In this article, a novel domain boundary prediction tool called Res-Dom was developed, which is based on a deep residual network, bidirectional long short-term memory (Bi-LSTM) and transfer learning. We used deep residual neural networks to extract higher-order residue-related information. In addition, we used a pre-trained protein language model called ESM to extract sequence embedding features, which capture richer sequence context information. To improve the global representation of these deep residual networks, a Bi-LSTM network was also designed to consider long-range interactions between residues. Res-Dom was then tested on an independent test set of 342 proteins and generated correct single-domain and multi-domain classifications with a Matthews correlation coefficient of 0.668, which was 17.6% higher than the second-best compared method. For domain boundaries, the normalized domain overlapping score of Res-Dom was 0.849, which was 5% higher than the second-best compared method. Furthermore, Res-Dom required significantly less time than most of the recently developed state-of-the-art domain prediction methods. Availability and implementation: All source code, datasets and model are available at http://isyslab.info/Res-Dom/.
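For readers unfamiliar with the metric used for the single-domain versus multi-domain classification, the Matthews correlation coefficient is computed from the four cells of a binary confusion matrix. The following is a minimal sketch of the standard formula; the function name and the counts in the usage example are hypothetical and are not taken from the Res-Dom evaluation.

```python
import math


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient for a binary confusion matrix.

    MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)).
    Ranges from -1 (total disagreement) to +1 (perfect prediction);
    by convention 0.0 is returned when the denominator is zero.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    # Hypothetical counts for a multi-domain (positive) vs single-domain
    # (negative) classifier; NOT the numbers behind the reported 0.668.
    print(round(matthews_corrcoef(tp=80, tn=200, fp=30, fn=32), 3))
```

Unlike plain accuracy, this score stays informative when the two classes are unbalanced, which is typically the case for single-domain versus multi-domain proteins.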
Affiliation(s)
- Lei Wang
- Institute of Medical Artificial Intelligence, Binzhou Medical University, Yantai, Shandong 264003, China.,School of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei 430074, China
| | - Haolin Zhong
- School of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei 430074, China
| | - Zhidong Xue
- Institute of Medical Artificial Intelligence, Binzhou Medical University, Yantai, Shandong 264003, China.,School of Software Engineering, Huazhong University of Science and Technology, Wuhan, Hubei 430074, China
| | - Yan Wang
- Institute of Medical Artificial Intelligence, Binzhou Medical University, Yantai, Shandong 264003, China.,School of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei 430074, China
|
40
|
Macleod OJS, Cook AD, Webb H, Crow M, Burns R, Redpath M, Seisenberger S, Trevor CE, Peacock L, Schwede A, Kimblin N, Francisco AF, Pepperl J, Rust S, Voorheis P, Gibson W, Taylor MC, Higgins MK, Carrington M. Invariant surface glycoprotein 65 of Trypanosoma brucei is a complement C3 receptor. Nat Commun 2022; 13:5085. [PMID: 36038546 PMCID: PMC9424271 DOI: 10.1038/s41467-022-32728-9] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 09/19/2021] [Accepted: 08/10/2022] [Indexed: 11/16/2022]
Abstract
African trypanosomes are extracellular pathogens of mammals and are exposed to the adaptive and innate immune systems. Trypanosomes evade the adaptive immune response through antigenic variation, but little is known about how they interact with components of the innate immune response, including complement. Here we demonstrate that an invariant surface glycoprotein, ISG65, is a receptor for complement component 3 (C3). We show how ISG65 binds to the thioester domain of C3b. We also show that C3 contributes to control of trypanosomes during early infection in a mouse model and provide evidence that ISG65 is involved in reducing trypanosome susceptibility to C3-mediated clearance. Deposition of C3b on pathogen surfaces, such as trypanosomes, is a central point in activation of the complement system. In ISG65, trypanosomes have evolved a C3 receptor which diminishes the downstream effects of C3 deposition on the control of infection. Trypanosomes evade the immune response through antigenic variation of a surface coat containing variant surface glycoproteins (VSG). They also express invariant surface glycoproteins (ISGs), which are less well understood. Here, Macleod et al. show that ISG65 of T. brucei is a receptor for complement component 3. They provide the crystal structure of T. brucei ISG65 in complex with complement C3d and show evidence that ISG65 is involved in reducing trypanosome susceptibility to C3-mediated clearance in vivo.
Affiliation(s)
- Olivia J S Macleod
- Department of Biochemistry, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QW, UK
| | - Alexander D Cook
- Department of Biochemistry, University of Oxford, South Parks Road, Oxford, OX1 3QU, UK.,Kavli Institute for Nanoscience Discovery, Dorothy Crowfoot Hodgkin Building, University of Oxford, South Parks Road, Oxford, OX1 3QU, UK
| | - Helena Webb
- Department of Biochemistry, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QW, UK
| | - Mandy Crow
- Department of Biochemistry, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QW, UK
| | - Roisin Burns
- Department of Biochemistry, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QW, UK
| | - Maria Redpath
- Department of Biochemistry, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QW, UK
| | - Stefanie Seisenberger
- Department of Biochemistry, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QW, UK
| | - Camilla E Trevor
- Department of Biochemistry, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QW, UK
| | - Lori Peacock
- Bristol Veterinary School and School of Biological Sciences, University of Bristol, Bristol, UK
| | - Angela Schwede
- Department of Biochemistry, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QW, UK
| | - Nicola Kimblin
- Department of Biochemistry, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QW, UK
| | - Amanda F Francisco
- Faculty of Infectious and Tropical Diseases, London School of Hygiene and Tropical Medicine, London, WC1E 7HT, UK
| | - Julia Pepperl
- Department of Biochemistry, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QW, UK
| | - Steve Rust
- Antibody Discovery and Protein Engineering, Biopharmaceuticals R&D, AstraZeneca, Cambridge, UK
| | - Paul Voorheis
- School of Biochemistry and Immunology, Trinity Biomedical Sciences Institute, Trinity College Dublin, Dublin, Ireland
| | - Wendy Gibson
- Bristol Veterinary School and School of Biological Sciences, University of Bristol, Bristol, UK
| | - Martin C Taylor
- Faculty of Infectious and Tropical Diseases, London School of Hygiene and Tropical Medicine, London, WC1E 7HT, UK
| | - Matthew K Higgins
- Department of Biochemistry, University of Oxford, South Parks Road, Oxford, OX1 3QU, UK.,Kavli Institute for Nanoscience Discovery, Dorothy Crowfoot Hodgkin Building, University of Oxford, South Parks Road, Oxford, OX1 3QU, UK
| | - Mark Carrington
- Department of Biochemistry, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QW, UK.
|
41
|
Hong Y, Song J, Ko J, Lee J, Shin WH. S-Pred: protein structural property prediction using MSA transformer. Sci Rep 2022; 12:13891. [PMID: 35974061 PMCID: PMC9381718 DOI: 10.1038/s41598-022-18205-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2022] [Accepted: 08/08/2022] [Indexed: 11/10/2022]
Abstract
Predicting the local structural features of a protein from its amino acid sequence helps to reveal its function and assists in three-dimensional structural modeling. As the sequence-structure gap increases, prediction methods have been developed to bridge this gap. Additionally, as the size of the structural database and computing power increase, the performance of these methods has also significantly improved. Herein, we present a powerful new tool called S-Pred, which can predict eight-state secondary structures (SS8), accessible surface areas (ASAs), and intrinsically disordered regions (IDRs) from a given sequence. For feature prediction, S-Pred uses a multiple sequence alignment (MSA) of a query sequence as input. The MSA input is converted to features by the MSA Transformer, which is a protein language model that uses an attention mechanism. A long short-term memory (LSTM) network was employed to produce the final prediction. The performance of S-Pred was evaluated on several test sets, and the program consistently provided accurate predictions. The accuracy of the SS8 prediction was approximately 76%, and the Pearson’s correlation between the experimental and predicted ASAs was 0.84. Additionally, an IDR could be accurately predicted with an F1-score of 0.514. The program is freely available at https://github.com/arontier/S_Pred_Paper and https://ad3.io as a code and a web server.
Affiliation(s)
- Yiyu Hong
- Arontier Co., Seoul, 06735, Republic of Korea
| | - Jinung Song
- Arontier Co., Seoul, 06735, Republic of Korea
| | - Junsu Ko
- Arontier Co., Seoul, 06735, Republic of Korea
| | - Juyong Lee
- Arontier Co., Seoul, 06735, Republic of Korea.,Division of Chemistry and Biochemistry, Department of Chemistry, Kangwon National University, Chuncheon, 24341, Republic of Korea
| | - Woong-Hee Shin
- Arontier Co., Seoul, 06735, Republic of Korea.,Department of Chemistry Education, Sunchon National University, Suncheon, 57922, Republic of Korea.,Department of Advanced Components and Materials Engineering, Sunchon National University, Suncheon, 57922, Republic of Korea
|
42
|
Mignon J, Mottet D, Leyder T, Uversky VN, Perpète EA, Michaux C. Structural characterisation of amyloidogenic intrinsically disordered zinc finger protein isoforms DPF3b and DPF3a. Int J Biol Macromol 2022; 218:57-71. [PMID: 35863661 DOI: 10.1016/j.ijbiomac.2022.07.102] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 06/15/2022] [Revised: 07/08/2022] [Accepted: 07/13/2022] [Indexed: 11/05/2022]
Abstract
Double PHD fingers 3 (DPF3) is a zinc finger protein, found in the BAF chromatin remodelling complex, and is involved in the regulation of gene expression. Two DPF3 isoforms have been identified, named DPF3b and DPF3a. Very limited structural information is available for these isoforms, and their specific functionality remains poorly studied. In previous work, we provided the first evidence that DPF3a is a disordered protein prone to amyloid fibrillation. Intrinsically disordered proteins (IDPs) lack a defined tertiary structure, existing as a dynamic conformational ensemble, allowing them to act as hubs in protein-protein interaction networks. In the present study, we have more thoroughly characterised the in vitro behaviour of DPF3a, and unravelled and compared the structural properties of the DPF3b isoform, using an array of predictors and biophysical techniques. Predictions, spectroscopy, and dynamic light scattering have revealed a high disorder content: prevalence of random coil, aromatic residues partially to fully exposed to the solvent, and large hydrodynamic diameters. DPF3a appears to be more disordered than DPF3b, and exhibits more expanded conformations. Furthermore, we have shown that both isoforms aggregate into amyloid fibrils in a time-dependent manner, as revealed by typical circular dichroism, deep-blue autofluorescence, and amyloid-dye binding assay fingerprints. Although spectroscopic and microscopic analyses have unveiled that they share a similar aggregation pathway, DPF3a fibrillates at a faster rate, likely through reordering of its C-terminal domain.
Affiliation(s)
- Julien Mignon
- Laboratoire de Chimie Physique des Biomolécules, UCPTS, University of Namur, 61 rue de Bruxelles, 5000 Namur, Belgium; Namur Institute of Structured Matter (NISM), University of Namur, Namur, Belgium; Namur Research Institute for Life Sciences (NARILIS), University of Namur, Namur, Belgium.
| | - Denis Mottet
- University of Liège, GIGA-Molecular Biology of Diseases, Gene Expression and Cancer Laboratory, B34, Avenue de l'Hôpital, 4000 Liège, Belgium.
| | - Tanguy Leyder
- Laboratoire de Chimie Physique des Biomolécules, UCPTS, University of Namur, 61 rue de Bruxelles, 5000 Namur, Belgium.
| | - Vladimir N Uversky
- Department of Molecular Medicine, USF Health Byrd Alzheimer's Research Institute, Morsani College of Medicine, University of South Florida, Tampa, FL, United States.
| | - Eric A Perpète
- Laboratoire de Chimie Physique des Biomolécules, UCPTS, University of Namur, 61 rue de Bruxelles, 5000 Namur, Belgium; Namur Research Institute for Life Sciences (NARILIS), University of Namur, Namur, Belgium; Institute of Life, Earth and Environment (ILEE), University of Namur, Namur, Belgium.
| | - Catherine Michaux
- Laboratoire de Chimie Physique des Biomolécules, UCPTS, University of Namur, 61 rue de Bruxelles, 5000 Namur, Belgium; Namur Institute of Structured Matter (NISM), University of Namur, Namur, Belgium; Namur Research Institute for Life Sciences (NARILIS), University of Namur, Namur, Belgium.
|
43
|
Compositional Bias of Intrinsically Disordered Proteins and Regions and Their Predictions. Biomolecules 2022; 12:biom12070888. [PMID: 35883444 PMCID: PMC9313023 DOI: 10.3390/biom12070888] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Received: 05/27/2022] [Revised: 06/10/2022] [Accepted: 06/10/2022] [Indexed: 11/17/2022]
Abstract
Intrinsically disordered regions (IDRs) carry out many cellular functions and vary in length and placement in protein sequences. This diversity leads to variations in the underlying compositional biases, which have been demonstrated for short vs. long IDRs. We analyze compositional biases across four classes of disorder: fully disordered proteins; short IDRs; long IDRs; and binding IDRs. We identify three distinct biases: one for the fully disordered proteins, one for the short IDRs, and one for the long and binding IDRs combined. We also investigate compositional bias for putative disorder produced by leading disorder predictors and find that it is similar to the bias of the native disorder. Interestingly, the accuracy of disorder predictions across different methods is correlated with the correctness of the compositional bias of their predictions, highlighting the importance of the compositional bias. The predictive quality is relatively low for the disorder classes with compositional bias that is the most different from the “generic” disorder bias, while being much higher for the classes with the most similar bias. We discover that different predictors perform best across different classes of disorder. This suggests that no single predictor is universally best and motivates the development of new architectures that combine models that target specific disorder classes.
|
44
|
Yan X, Lu Y, Li Z, Wei Q, Gao X, Wang S, Wu S, Cui S. PointSite: A Point Cloud Segmentation Tool for Identification of Protein Ligand Binding Atoms. J Chem Inf Model 2022; 62:2835-2845. [PMID: 35621730 DOI: 10.1021/acs.jcim.1c01512] [Citation(s) in RCA: 25] [Impact Index Per Article: 8.3] [Indexed: 11/28/2022]
Abstract
Accurate identification of ligand binding sites (LBS) on a protein structure is critical for understanding protein function and designing structure-based drugs. As the previous pocket-centric methods are usually based on the investigation of pseudo-surface-points outside the protein structure, they cannot fully take advantage of the local connectivity of atoms within the protein, or of the global 3D geometrical information from all the protein atoms. In this paper, we propose a novel point cloud segmentation method, PointSite, for accurate identification of protein ligand binding atoms, which performs protein LBS identification at the atom level in a protein-centric manner. Specifically, we first convert the original 3D protein structure into a point cloud and then perform segmentation with a U-Net based on Submanifold Sparse Convolution. With the fine-grained atom-level representation of binding atoms and enhanced feature learning, PointSite can outperform previous methods in atom Intersection over Union (atom-IoU) by a large margin. Furthermore, our segmented binding atoms, that is, atoms assigned a high probability by our model, can act as a filter on the predictions of previous pocket-centric approaches, which significantly reduces the number of false-positive LBS candidates. We also directly apply PointSite, trained on bound proteins, to LBS identification on unbound proteins, which demonstrates the superior generalization capacity of PointSite. Through cascaded filtering and reranking aided by the segmented atoms, state-of-the-art performance can be achieved over various canonical benchmarks, CAMEO hard targets, and unbound proteins in terms of the commonly used DCA criteria.
Affiliation(s)
- Xu Yan
- The Chinese University of Hongkong (Shenzhen) & Future Network of Intelligence Institute, Shenzhen 518172, China
| | - Yingfeng Lu
- The Chinese University of Hongkong (Shenzhen) & Future Network of Intelligence Institute, Shenzhen 518172, China
| | - Zhen Li
- The Chinese University of Hongkong (Shenzhen) & Future Network of Intelligence Institute, Shenzhen 518172, China
| | - Qing Wei
- The Chinese University of Hongkong (Shenzhen) & Future Network of Intelligence Institute, Shenzhen 518172, China
| | - Xin Gao
- King Abdullah University of Science and Technology, Thuwal 23955, Saudi Arabia
| | - Sheng Wang
- Shanghai Zelixir Biotech Company Ltd., Shanghai 200030, China.,CAS Key Laboratory of Quantitative Engineering Biology, Shenzhen Institute of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
| | - Song Wu
- Shenzhen University, Shenzhen 518060, China
| | - Shuguang Cui
- The Chinese University of Hongkong (Shenzhen) & Future Network of Intelligence Institute, Shenzhen 518172, China
|
45
|
Wilson CJ, Choy WY, Karttunen M. AlphaFold2: A Role for Disordered Protein/Region Prediction? Int J Mol Sci 2022; 23:4591. [PMID: 35562983 PMCID: PMC9104326 DOI: 10.3390/ijms23094591] [Citation(s) in RCA: 72] [Impact Index Per Article: 24.0] [Received: 03/27/2022] [Revised: 04/18/2022] [Accepted: 04/19/2022] [Indexed: 01/27/2023]
Abstract
The development of AlphaFold2 marked a paradigm shift in the structural biology community. Herein, we assess the ability of AlphaFold2 to predict disordered regions against traditional sequence-based disorder predictors. We find that AlphaFold2 performs well at discriminating disordered regions, but also note that the disorder predictor one constructs from an AlphaFold2 structure determines accuracy. In particular, a naïve, but non-trivial assumption that residues assigned to helices, strands, and H-bond stabilized turns are likely ordered and all other residues are disordered results in a dramatic overestimation of disorder; conversely, the predicted local distance difference test (pLDDT) provides an excellent measure of residue-wise disorder. Furthermore, by employing molecular dynamics (MD) simulations, we note an interesting relationship between the pLDDT and secondary structure that may explain our observations and suggests a broader application of the pLDDT for characterizing the local dynamics of intrinsically disordered proteins and regions (IDPs/IDRs).
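One simple way to turn per-residue pLDDT values into a disorder call, in the spirit of the approach this abstract describes, is to threshold the scores and merge consecutive low-confidence residues into regions. The sketch below is an assumption-laden illustration: the cutoff of 70, the `min_len` parameter, the function name and the toy scores are all hypothetical and are not taken from Wilson et al.

```python
from typing import List, Sequence, Tuple


def disordered_regions(
    plddt: Sequence[float],
    cutoff: float = 70.0,   # assumed threshold, not from the paper
    min_len: int = 1,       # shortest run of residues to report
) -> List[Tuple[int, int]]:
    """Return 1-based inclusive (start, end) runs where pLDDT < cutoff."""
    regions: List[Tuple[int, int]] = []
    start = None  # 0-based index where the current low-pLDDT run began
    for i, score in enumerate(plddt):
        if score < cutoff:
            if start is None:
                start = i
        elif start is not None:
            # The run ended at residue index i - 1.
            if i - start >= min_len:
                regions.append((start + 1, i))
            start = None
    # Close a run that extends to the C-terminus.
    if start is not None and len(plddt) - start >= min_len:
        regions.append((start + 1, len(plddt)))
    return regions


if __name__ == "__main__":
    toy_scores = [91.2, 88.0, 45.3, 39.9, 52.1, 93.4, 30.0]  # made-up values
    print(disordered_regions(toy_scores))
```

In AlphaFold2 model files the pLDDT is stored in the B-factor field, so the input list could be read from the C-alpha records of a predicted structure.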
Affiliation(s)
- Carter J. Wilson
- Department of Mathematics, The University of Western Ontario, 1151 Richmond Street, London, ON N6A 5B7, Canada
- Centre for Advanced Materials and Biomaterials Research, The University of Western Ontario, 1151 Richmond Street, London, ON N6A 5B7, Canada
| | - Wing-Yiu Choy
- Department of Biochemistry, The University of Western Ontario, 1151 Richmond Street, London, ON N6A 5C1, Canada
| | - Mikko Karttunen
- Centre for Advanced Materials and Biomaterials Research, The University of Western Ontario, 1151 Richmond Street, London, ON N6A 5B7, Canada
- Department of Physics and Astronomy, The University of Western Ontario, 1151 Richmond Street, London, ON N6A 5B7, Canada
- Department of Chemistry, The University of Western Ontario, 1151 Richmond Street, London, ON N6A 3K7, Canada
|
46
|
Orlando G, Raimondi D, Codice F, Tabaro F, Vranken W. Prediction of disordered regions in proteins with recurrent Neural Networks and protein dynamics. J Mol Biol 2022; 434:167579. [DOI: 10.1016/j.jmb.2022.167579] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 11/09/2021] [Revised: 03/21/2022] [Accepted: 03/31/2022] [Indexed: 10/18/2022]
|
47
|
Pei H, Guo W, Peng Y, Xiong H, Chen Y. Targeting key proteins involved in transcriptional regulation for cancer therapy: Current strategies and future prospective. Med Res Rev 2022; 42:1607-1660. [PMID: 35312190 DOI: 10.1002/med.21886] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Received: 01/12/2022] [Revised: 02/10/2022] [Accepted: 02/22/2022] [Indexed: 12/14/2022]
Abstract
The key proteins involved in transcriptional regulation play convergent roles in cellular homeostasis, and their dysfunction mediates aberrant gene expression that underlies the hallmarks of tumorigenesis. As tumor progression is dependent on such abnormal regulation of transcription, it is important to discover novel chemical entities as antitumor drugs that target key tumor-associated proteins involved in transcriptional regulation. Although most key proteins (especially transcription factors) involved in transcriptional regulation have historically been regarded as undruggable targets, multiple targeting approaches at diverse levels of transcriptional regulation, such as epigenetic intervention, inhibition of DNA binding by transcription factors, and inhibition of protein-protein interactions (PPIs), have been established in preclinical or clinical studies. In addition, several new approaches have recently been described, such as targeting proteasomal degradation and eliciting synthetic lethality. This review emphasizes these developing therapeutic approaches and provides a thorough conspectus of drug development targeting key proteins involved in transcriptional regulation and its impact on future oncotherapy.
Affiliation(s)
- Haixiang Pei
- Institute for Advanced Study, Shenzhen University and Guangdong Key Laboratory for Biomedical Measurements and Ultrasound Imaging, School of Biomedical Engineering, Shenzhen University Health Science Center, Shenzhen, China.,Shanghai Key Laboratory of Regulatory Biology, The Institute of Biomedical Sciences, School of Life Sciences, East China Normal University, Shanghai, China
| | - Weikai Guo
- Shanghai Key Laboratory of Regulatory Biology, The Institute of Biomedical Sciences, School of Life Sciences, East China Normal University, Shanghai, China.,Joint National Laboratory for Antibody Drug Engineering, School of Basic Medical Science, Henan University, Kaifeng, China
| | - Yangrui Peng
- Shanghai Key Laboratory of Regulatory Biology, The Institute of Biomedical Sciences, School of Life Sciences, East China Normal University, Shanghai, China
| | - Hai Xiong
- Institute for Advanced Study, Shenzhen University and Guangdong Key Laboratory for Biomedical Measurements and Ultrasound Imaging, School of Biomedical Engineering, Shenzhen University Health Science Center, Shenzhen, China
| | - Yihua Chen
- Shanghai Key Laboratory of Regulatory Biology, The Institute of Biomedical Sciences, School of Life Sciences, East China Normal University, Shanghai, China
|
48
|
Zhao B, Kurgan L. Deep learning in prediction of intrinsic disorder in proteins. Comput Struct Biotechnol J 2022; 20:1286-1294. [PMID: 35356546 PMCID: PMC8927795 DOI: 10.1016/j.csbj.2022.03.003] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Received: 01/10/2022] [Revised: 03/04/2022] [Accepted: 03/04/2022] [Indexed: 12/12/2022]
Abstract
Intrinsic disorder prediction is an active area that has produced over 100 predictors. We identify and investigate a recent trend towards the development of deep neural network (DNN)-based methods. The first DNN-based method was released in 2013, and since 2019 deep learners account for the majority of new disorder predictors. We find that the 13 currently available DNN-based predictors are diverse in their topologies, the sizes of their networks and the inputs that they utilize. We empirically show that the deep learners are statistically more accurate than other types of disorder predictors using the blind test dataset from the recent community assessment of intrinsic disorder predictions (CAID). We also identify several well-rounded DNN-based predictors that are accurate, fast and/or conveniently available. The popularity, favorable predictive performance and architectural flexibility suggest that deep networks are likely to fuel the development of future disorder predictors. Novel hybrid designs of deep networks could be used to adequately accommodate the diversity of types and flavors of intrinsic disorder. We also discuss the scarcity of DNN-based methods for the prediction of disordered binding regions and the need to develop more accurate methods for this prediction.
Affiliation(s)
- Bi Zhao
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA
| | - Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA
|
49
|
Kurgan L. Resources for computational prediction of intrinsic disorder in proteins. Methods 2022; 204:132-141. [DOI: 10.1016/j.ymeth.2022.03.018] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 01/28/2022] [Revised: 03/25/2022] [Accepted: 03/29/2022] [Indexed: 12/26/2022]
|
50
|
Beaudoin CA, Bartas M, Volná A, Pečinka P, Blundell TL. Are There Hidden Genes in DNA/RNA Vaccines? Front Immunol 2022; 13:801915. [PMID: 35211117 PMCID: PMC8860813 DOI: 10.3389/fimmu.2022.801915] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 10/26/2021] [Accepted: 01/14/2022] [Indexed: 02/02/2023]
Abstract
Due to the fast global spread of the Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2), prevention and treatment options are direly needed in order to control infection-related morbidity, mortality, and economic losses. Although drug and inactivated and attenuated virus vaccine development can require significant amounts of time and resources, DNA and RNA vaccines offer a quick, simple, and cheap treatment alternative, even when produced on a large scale. The spike protein, which has been shown to be the most antigenic SARS-CoV-2 protein, has been widely selected as the target of choice for DNA/RNA vaccines. Vaccination campaigns have reported high vaccination rates and protection, but numerous unintended effects, ranging from muscle pain to death, have led to concerns about the safety of RNA/DNA vaccines. In parallel to these studies, several open reading frames (ORFs) have been found to overlap SARS-CoV-2 accessory genes, two of which, ORF2b and ORF-Sh, overlap the spike protein sequence. Thus, the presence of these, and potentially other ORFs on SARS-CoV-2 DNA/RNA vaccines, could lead to the translation of undesired proteins during vaccination. Herein, we discuss the translation of overlapping genes in connection with DNA/RNA vaccines. Two mRNA vaccine spike protein sequences, which have been made publicly available, were compared to the wild-type sequence in order to uncover possible differences in putative overlapping ORFs. Notably, the Moderna mRNA-1273 vaccine sequence is predicted to contain no frameshifted ORFs on the positive sense strand, which highlights the utility of codon optimization in DNA/RNA vaccine design to remove undesired overlapping ORFs. Since little information is available on ORF2b or ORF-Sh, we use structural bioinformatics techniques to investigate the structure-function relationship of these proteins. The presence of putative ORFs on DNA/RNA vaccine candidates implies that overlapping genes may contribute to the translation of smaller peptides, potentially leading to unintended clinical outcomes, and that the protein-coding potential of DNA/RNA vaccines should be rigorously examined prior to administration.
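The idea of checking a coding sequence for frameshifted ORFs on the positive-sense strand can be illustrated with a small scan over the three forward reading frames. This is a minimal sketch and not the authors' pipeline: it assumes ORFs begin at ATG and end at the first in-frame stop codon, and the function name, the `min_codons` parameter and the toy sequences are hypothetical.

```python
from typing import List, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}


def forward_orfs(seq: str, min_codons: int = 2) -> List[Tuple[int, int, int]]:
    """Find ATG-to-stop ORFs in the three forward reading frames.

    Returns (frame, start, end) tuples with 0-based start and exclusive end
    (the end includes the stop codon). Frames 1 and 2 are frameshifted
    relative to a coding sequence that starts at position 0.
    """
    seq = seq.upper().replace("U", "T")  # accept RNA input as well
    orfs: List[Tuple[int, int, int]] = []
    for frame in range(3):
        start = None  # position of the first ATG of the currently open ORF
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                n_codons = (i - start) // 3  # sense codons, stop excluded
                if n_codons >= min_codons:
                    orfs.append((frame, start, i + 3))
                start = None
    return orfs


if __name__ == "__main__":
    toy = "CATGCCCTGAC"  # made-up sequence with one ORF in frame 1
    for frame, start, end in forward_orfs(toy):
        print(frame, start, end, toy[start:end])
```

Running the same scan on a wild-type and a codon-optimized version of a gene would show whether optimization removed the ORFs found in frames 1 and 2, which is the kind of comparison the abstract describes.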
Affiliation(s)
- Christopher A. Beaudoin
- Department of Biochemistry, Sanger Building, University of Cambridge, Cambridge, United Kingdom
| | - Martin Bartas
- Department of Biology and Ecology, University of Ostrava, Ostrava, Czechia
| | - Adriana Volná
- Department of Physics, University of Ostrava, Ostrava, Czechia
| | - Petr Pečinka
- Department of Biology and Ecology, University of Ostrava, Ostrava, Czechia
| | - Tom L. Blundell
- Department of Biochemistry, Sanger Building, University of Cambridge, Cambridge, United Kingdom
|