1
Lu C, Jiang J, Chen Q, Liu H, Ju X, Wang H. Analysis and prediction of interactions between transmembrane and non-transmembrane proteins. BMC Genomics 2024; 25:401. [PMID: 38658824 PMCID: PMC11040819 DOI: 10.1186/s12864-024-10251-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/22/2022] [Accepted: 03/25/2024] [Indexed: 04/26/2024] Open
Abstract
BACKGROUND Most of the important biological mechanisms and functions of transmembrane proteins (TMPs) are realized through their interactions with non-transmembrane proteins (nonTMPs). The interactions between TMPs and nonTMPs in cells play vital roles in intracellular signaling and energy metabolism, and they matter for investigating membrane-crossing mechanisms and the correlations between diseases and drugs.
RESULTS Despite the importance of TMP-nonTMP interactions, their study remains at the wet-lab experimental stage and lacks specific and comprehensive bioinformatics studies. To fill this gap, we performed a comprehensive statistical analysis of known TMP-nonTMP interactions and constructed a deep learning-based predictor to identify potential interactions. The statistical analysis describes known TMP-nonTMP interactions from various perspectives, such as distributions of species and protein families, enrichment of GO and KEGG pathways, as well as hub proteins and subnetwork modules in the PPI network. The predictor, implemented as an end-to-end deep learning model, can identify potential interactions from protein primary sequence information. The experimental results on the independent validation set demonstrated considerable prediction performance, with an MCC of 0.541.
CONCLUSIONS To our knowledge, we were the first to focus on TMP-nonTMP interactions. We comprehensively analyzed them using bioinformatics methods and predicted them via deep learning based solely on their sequences. This research completes a key link in the protein network, benefits the understanding of protein functions, and helps in pathogenesis studies of diseases and associated drug development.
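For readers unfamiliar with the metric reported in this abstract, the Matthews correlation coefficient (MCC) can be computed from the four confusion-matrix counts of a binary classifier. The sketch below is a minimal illustration only; the function name and the example counts are hypothetical and are not taken from the paper.

```python
import math


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Return the MCC for binary confusion-matrix counts.

    By convention this sketch returns 0.0 when the denominator is zero
    (for example, when one class is never predicted).
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


# Hypothetical counts for an interaction / non-interaction classifier.
example_mcc = matthews_corrcoef(tp=80, tn=75, fp=25, fn=20)
print(f"MCC = {example_mcc:.3f}")
```

MCC ranges from -1 to 1 and, unlike plain accuracy, stays informative when interacting and non-interacting pairs are imbalanced.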
Affiliation(s)
- Chang Lu
  - School of Psychology, School of Information Science and Technology, Institute of Computational Biology, Northeast Normal University, Changchun, China
- Jiuhong Jiang
  - School of Psychology, School of Information Science and Technology, Institute of Computational Biology, Northeast Normal University, Changchun, China
- Qiufen Chen
  - School of Psychology, School of Information Science and Technology, Institute of Computational Biology, Northeast Normal University, Changchun, China
- Huanhuan Liu
  - School of Psychology, School of Information Science and Technology, Institute of Computational Biology, Northeast Normal University, Changchun, China
- Xingda Ju
  - School of Psychology, School of Information Science and Technology, Institute of Computational Biology, Northeast Normal University, Changchun, China.
- Han Wang
  - School of Psychology, School of Information Science and Technology, Institute of Computational Biology, Northeast Normal University, Changchun, China.
2
Sun J, Kulandaisamy A, Ru J, Gromiha MM, Cribbs AP. TMKit: a Python interface for computational analysis of transmembrane proteins. Brief Bioinform 2023; 24:bbad288. [PMID: 37594311 PMCID: PMC10516361 DOI: 10.1093/bib/bbad288] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/17/2023] [Revised: 07/07/2023] [Accepted: 07/18/2023] [Indexed: 08/19/2023] Open
Abstract
Transmembrane proteins are receptors, enzymes, transporters and ion channels that are instrumental in regulating a variety of cellular activities, such as signal transduction and cell communication. Despite tremendous progress in computational capacities to support protein research, there is still a significant gap in the availability of specialized computational analysis toolkits for transmembrane protein research. Here, we introduce TMKit, an open-source Python programming interface that is modular, scalable and specifically designed for processing transmembrane protein data. TMKit is a one-stop computational analysis tool for transmembrane proteins, enabling users to perform database wrangling, engineer features at the mutational, domain and topological levels, and visualize protein-protein interaction interfaces. In addition, TMKit includes seqNetRR, a high-performance computing library that allows customized construction of a large number of residue connections. This library is particularly well suited for assigning correlation matrix-based features at a fast speed. TMKit should serve as a useful tool for researchers in assisting the study of transmembrane protein sequences and structures. TMKit is publicly available through https://github.com/2003100127/tmkit and https://tmkit-guide.herokuapp.com/doc/overview.
Affiliation(s)
- Jianfeng Sun
  - Nuffield Department of Orthopedics, Rheumatology, and Musculoskeletal Sciences, Botnar Research Centre, University of Oxford, Headington, Oxford OX3 7LD, UK
- Arulsamy Kulandaisamy
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of BioSciences, Indian Institute of Technology Madras, Chennai 600036, Tamil Nadu, India
- Jinlong Ru
  - Chair of Prevention of Microbial Diseases, School of Life Sciences Weihenstephan, Technical University of Munich, 85354 Freising, Germany
- M Michael Gromiha
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of BioSciences, Indian Institute of Technology Madras, Chennai 600036, Tamil Nadu, India
- Adam P Cribbs
  - Nuffield Department of Orthopedics, Rheumatology, and Musculoskeletal Sciences, Botnar Research Centre, University of Oxford, Headington, Oxford OX3 7LD, UK
3
Sun J, Kulandaisamy A, Liu J, Hu K, Gromiha MM, Zhang Y. Machine learning in computational modelling of membrane protein sequences and structures: From methodologies to applications. Comput Struct Biotechnol J 2023; 21:1205-1226. [PMID: 36817959 PMCID: PMC9932300 DOI: 10.1016/j.csbj.2023.01.036] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 11/16/2022] [Revised: 01/16/2023] [Accepted: 01/25/2023] [Indexed: 01/29/2023] Open
Abstract
Membrane proteins mediate a wide spectrum of biological processes, such as signal transduction and cell communication. Due to the arduous and costly nature inherent to the experimental process, membrane proteins have long been devoid of well-resolved atomic-level tertiary structures and, consequently, the understanding of their functional roles underlying a multitude of life activities has been hampered. Currently, computational tools dedicated to furthering the structure-function understanding are primarily focused on utilizing intelligent algorithms to address a variety of site-wise prediction problems (e.g., topology and interaction sites), but are scattered across different computing sources. Moreover, the recent advent of deep learning techniques has immensely expedited the development of computational tools for membrane protein-related prediction problems. Given the growing number of applications optimized particularly by manifold deep neural networks, we herein provide a review on the current status of computational strategies mainly in membrane protein type classification, topology identification, interaction site detection, and pathogenic effect prediction. Meanwhile, we provide an overview of how the entire prediction process proceeds, including database collection, data pre-processing, feature extraction, and method selection. This review is expected to be useful for developing more extendable computational tools specific to membrane proteins.
Affiliation(s)
- Jianfeng Sun
  - Botnar Research Centre, Nuffield Department of Orthopedics, Rheumatology, and Musculoskeletal Sciences, University of Oxford, Headington, Oxford OX3 7LD, UK
- Arulsamy Kulandaisamy
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of BioSciences, Indian Institute of Technology Madras, Chennai 600 036, Tamilnadu, India
- Jacklyn Liu
  - UCL Cancer Institute, University College London, 72 Huntley Street, London WC1E 6BT, UK
- Kai Hu
  - Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education, Xiangtan University, Xiangtan 411105, China
- M. Michael Gromiha (corresponding author)
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of BioSciences, Indian Institute of Technology Madras, Chennai 600 036, Tamilnadu, India
- Yuan Zhang (corresponding author)
  - Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education, Xiangtan University, Xiangtan 411105, China
4
Matos-Filipe P, Preto AJ, Koukos PI, Mourão J, Bonvin AMJJ, Moreira IS. MENSAdb: a thorough structural analysis of membrane protein dimers. Database (Oxford) 2021; 2021:baab013. [PMID: 33822911 PMCID: PMC8023553 DOI: 10.1093/database/baab013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/08/2020] [Revised: 01/19/2021] [Accepted: 03/01/2021] [Indexed: 11/14/2022]
Abstract
Membrane proteins (MPs) are key players in a variety of different cellular processes and constitute the target of around 60% of all Food and Drug Administration-approved drugs. Despite their importance, there is still a massive lack of relevant structural, biochemical and mechanistic information mainly due to their localization within the lipid bilayer. To help fill this gap, we developed the MEmbrane protein dimer Novel Structure Analyser database (MENSAdb). This interactive web application summarizes the evolutionary and physicochemical properties of dimeric MPs to expand the available knowledge on the fundamental principles underlying their formation. Currently, MENSAdb contains features of 167 unique MPs (63% homo- and 37% heterodimers) and brings insights into the conservation of residues, accessible solvent area descriptors, average B-factors, intermolecular contacts at 2.5 Å and 4.0 Å distance cut-offs, hydrophobic contacts, hydrogen bonds, salt bridges, π-π stacking, T-stacking and cation-π interactions. The regular update and organization of all these data into a unique platform will allow a broad community of researchers to collect and analyse a large number of features efficiently, thus facilitating their use in the development of prediction models associated with MPs. Database URL: http://www.moreiralab.com/resources/mensadb.
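To make the "intermolecular contacts at 2.5 Å and 4.0 Å distance cut-offs" descriptor concrete, the sketch below counts atom pairs from two chains that lie within a given distance. It is only an illustration of the general idea, not MENSAdb's actual pipeline; the coordinates and the function name are hypothetical.

```python
import math
from typing import List, Tuple

Atom = Tuple[float, float, float]  # Cartesian coordinates in ångström


def count_contacts(chain_a: List[Atom], chain_b: List[Atom], cutoff: float) -> int:
    """Count atom pairs (one atom per chain) separated by at most `cutoff` Å."""
    return sum(
        1
        for atom_a in chain_a
        for atom_b in chain_b
        if math.dist(atom_a, atom_b) <= cutoff
    )


# Hypothetical toy coordinates, two atoms per chain.
chain_a = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0)]
chain_b = [(2.0, 0.0, 0.0), (8.5, 0.0, 0.0)]

for cutoff in (2.5, 4.0):
    print(cutoff, count_contacts(chain_a, chain_b, cutoff))
```

The all-pairs loop is quadratic in the number of atoms; for real structures a spatial index (for example a k-d tree) would usually be preferred.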
Affiliation(s)
- Pedro Matos-Filipe
  - Center for Neuroscience and Cell Biology, University of Coimbra, Coimbra 3005-504, Portugal
- António J Preto
  - Center for Neuroscience and Cell Biology, University of Coimbra, Coimbra 3005-504, Portugal
  - PhD Programme in Experimental Biology and Biomedicine, Institute for Interdisciplinary Research, University of Coimbra, Coimbra, 3030-789, Portugal
- Panagiotis I Koukos
  - Bijvoet Centre for Biomolecular Research, Faculty of Science - Chemistry, Utrecht University, Utrecht 3584 CH, Netherlands
- Joana Mourão
  - Center for Neuroscience and Cell Biology, University of Coimbra, Coimbra 3005-504, Portugal
- Alexandre M J J Bonvin
  - Bijvoet Centre for Biomolecular Research, Faculty of Science - Chemistry, Utrecht University, Utrecht 3584 CH, Netherlands
- Irina S Moreira
  - Department of Life Sciences, University of Coimbra, Coimbra, 3000-456, Portugal
  - Center for Neuroscience and Cell Biology, Center for Innovative Biomedicine and Biotechnology, University of Coimbra, Coimbra, Portugal
5
Liu Z, Gong Y, Guo Y, Zhang X, Lu C, Zhang L, Wang H. TMP-SSurface2: A Novel Deep Learning-Based Surface Accessibility Predictor for Transmembrane Protein Sequence. Front Genet 2021; 12:656140. [PMID: 33790952 PMCID: PMC8006303 DOI: 10.3389/fgene.2021.656140] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 01/20/2021] [Accepted: 02/22/2021] [Indexed: 12/13/2022] Open
Abstract
Transmembrane proteins (TMPs) are an important class of membrane proteins involved in a wide range of membrane-related biological processes. Because TMPs are major drug targets, their surfaces are of great interest as the structural basis for binding drugs and other biological molecules. However, the number of determined TMP structures still falls far short of what is needed, while artificial intelligence technologies provide a promising way to identify TMP surfaces accurately from sequence alone, without any feature engineering. For this purpose, we present an updated TMP surface residue predictor, TMP-SSurface2, which achieves higher prediction accuracy than our previous version. The method uses an attention-enhanced Bidirectional Long Short Term Memory (BiLSTM) network. Benefiting from its efficient learning capability, the network abstracts useful latent information from protein sequences, improving the Pearson correlation coefficient (CC) from 0.58 for the old version to 0.66 on an independent test dataset. The results demonstrate that TMP-SSurface2 is efficient in predicting the surface of transmembrane proteins, representing new progress in transmembrane protein structure modeling based on primary sequences. TMP-SSurface2 is freely accessible at https://github.com/NENUBioCompute/TMP-SSurface-2.0.
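The Pearson correlation coefficient (CC) quoted in this abstract compares predicted and observed per-residue values. A minimal sketch of the computation follows; the function name and the toy accessibility values are hypothetical and are not taken from TMP-SSurface2.

```python
import math
from typing import Sequence


def pearson_cc(predicted: Sequence[float], observed: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length series."""
    n = len(predicted)
    if n != len(observed) or n < 2:
        raise ValueError("need two series of equal length with at least 2 values")
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    covariance = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    if var_p == 0 or var_o == 0:
        raise ValueError("correlation is undefined for a constant series")
    return covariance / math.sqrt(var_p * var_o)


# Hypothetical relative accessibility values for five residues.
predicted_rsa = [0.10, 0.35, 0.60, 0.20, 0.80]
observed_rsa = [0.05, 0.40, 0.55, 0.30, 0.90]
print(f"CC = {pearson_cc(predicted_rsa, observed_rsa):.3f}")
```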
Affiliation(s)
- Zhe Liu
  - School of Computer Science and Engineering, Changchun University of Technology, Changchun, China
  - School of Information Science and Technology, Institute of Computational Biology, Northeast Normal University, Changchun, China
  - Shanghai Mental Health Center, Shanghai Jiao Tong University School of Medicine, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China
- Yingli Gong
  - College of Intelligence and Computing, Tianjin University, Tianjin, China
- Yuanzhao Guo
  - School of Information Science and Technology, Institute of Computational Biology, Northeast Normal University, Changchun, China
- Xiao Zhang
  - College of Computing and Software Engineering, Kennesaw State University, Kennesaw, GA, United States
- Chang Lu
  - School of Information Science and Technology, Institute of Computational Biology, Northeast Normal University, Changchun, China
- Li Zhang
  - School of Computer Science and Engineering, Changchun University of Technology, Changchun, China
- Han Wang
  - School of Information Science and Technology, Institute of Computational Biology, Northeast Normal University, Changchun, China
6
Sun J, Frishman D. Improved sequence-based prediction of interaction sites in α-helical transmembrane proteins by deep learning. Comput Struct Biotechnol J 2021; 19:1512-1530. [PMID: 33815689 PMCID: PMC7985279 DOI: 10.1016/j.csbj.2021.03.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/13/2020] [Revised: 03/02/2021] [Accepted: 03/02/2021] [Indexed: 11/10/2022] Open
Abstract
Fast and accurate prediction of transmembrane protein interaction sites. First ever computational survey of interaction sites in membrane proteins. 10-30% of amino acid positions predicted to be involved in interactions.
Interactions between transmembrane (TM) proteins are fundamental for a wide spectrum of cellular functions, but precise molecular details of these interactions remain largely unknown due to the scarcity of experimentally determined three-dimensional complex structures. Computational techniques are therefore required for a large-scale annotation of interaction sites in TM proteins. Here, we present a novel deep-learning approach, DeepTMInter, for sequence-based prediction of interaction sites in α-helical TM proteins based on their topological, physicochemical, and evolutionary properties. Using a combination of ultra-deep residual neural networks with a stacked generalization ensemble technique, DeepTMInter significantly outperforms existing methods, achieving AUC/AUCPR values of 0.689/0.598. Across the main functional families of human transmembrane proteins, the percentage of amino acid sites predicted to be involved in interactions typically ranges between 10% and 25%, and up to 30% in ion channels. DeepTMInter is available as a standalone package at https://github.com/2003100127/deeptminter. The training and benchmarking datasets are available at https://data.mendeley.com/datasets/2t8kgwzp35.
Affiliation(s)
- Jianfeng Sun
  - Department of Bioinformatics, Wissenschaftzentrum Weihenstephan, Technical University of Munich, Maximus-von-Imhof-Forum 3, 85354 Freising, Germany
- Dmitrij Frishman
  - Department of Bioinformatics, Wissenschaftzentrum Weihenstephan, Technical University of Munich, Maximus-von-Imhof-Forum 3, 85354 Freising, Germany
7
Rosário-Ferreira N, Marques-Pereira C, Gouveia RP, Mourão J, Moreira IS. Guardians of the Cell: State-of-the-Art of Membrane Proteins from a Computational Point-of-View. Methods Mol Biol 2021; 2315:3-28. [PMID: 34302667 DOI: 10.1007/978-1-0716-1468-6_1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/06/2023]
Abstract
Membrane proteins (MPs) encompass a large family of proteins with distinct cellular functions, and although representing over 50% of existing pharmaceutical drug targets, their structural and functional information is still very scarce. Over the last years, in silico analysis and algorithm development were essential to characterize MPs and overcome some limitations of experimental approaches. The optimization and improvement of these methods remain an ongoing process, with key advances in MPs' structure, folding, and interface prediction being continuously tackled. Herein, we discuss the latest trends in computational methods toward a deeper understanding of the atomistic and mechanistic details of MPs.
Affiliation(s)
- Nícia Rosário-Ferreira
  - Coimbra Chemistry Center, Department of Chemistry, University of Coimbra, Coimbra, Portugal
  - Center for Neuroscience and Cell Biology, Center for Innovative Biomedicine and Biotechnology, University of Coimbra, Coimbra, Portugal
- Catarina Marques-Pereira
  - Center for Neuroscience and Cell Biology, Center for Innovative Biomedicine and Biotechnology, University of Coimbra, Coimbra, Portugal
  - PhD Programme in Experimental Biology and Biomedicine, Institute for Interdisciplinary Research (IIIUC), University of Coimbra, Coimbra, Portugal
- Raquel P Gouveia
  - Center for Neuroscience and Cell Biology, Center for Innovative Biomedicine and Biotechnology, University of Coimbra, Coimbra, Portugal
- Joana Mourão
  - Center for Neuroscience and Cell Biology, University of Coimbra, Coimbra, Portugal
- Irina S Moreira
  - Department of Life Sciences, University of Coimbra, Coimbra, Portugal.
8
Sharapova Y, Švedas V, Suplatov D. Catalytic and lectin domains in neuraminidase A from Streptococcus pneumoniae are capable of an intermolecular assembly: Implications for biofilm formation. FEBS J 2020; 288:3217-3230. [PMID: 33108702 DOI: 10.1111/febs.15610] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 04/28/2020] [Revised: 09/25/2020] [Accepted: 10/19/2020] [Indexed: 01/14/2023]
Abstract
Neuraminidase A from Streptococcus pneumoniae (NanA) is a cell wall-bound modular enzyme containing one lectin and one catalytic domain. Unlike homologous NanB and NanC expressed by the same bacterium, the two domains within one NanA molecule do not form a stable interaction and are spatially separated by a 16-amino acid-long flexible linker. In this work, the ability of NanA to form intermolecular assemblies was characterized using the methods of molecular modeling and bioinformatic analysis based on crystallographic data and by bringing together previously published experimental data. It was concluded that two catalytic domains, as well as one catalytic and one lectin domain, originating from two cell wall-bound NanA molecules, can interact through a previously uncharacterized interdomain interface to form complexes stabilized by a network of intermolecular hydrogen bonds and salt bridges. Supercomputer modeling strongly indicated that artocarpin, an earlier experimentally discovered inhibitor of the pneumococcal biofilm formation, is able to bind to a site located in the catalytic domain of one NanA entity and prevent its interaction with the lectin or catalytic domain of another NanA entity, thus directly precluding the generation of intermolecular assemblies. The revealed structural adaptation is discussed as one plausible mechanism of noncatalytic participation of this potentially key pathogenicity enzyme in pneumococcal biofilm formation.
Affiliation(s)
- Yana Sharapova
  - Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Moscow, Russia
  - Belozersky Institute of Physicochemical Biology, Lomonosov Moscow State University, Moscow, Russia
- Vytas Švedas
  - Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Moscow, Russia
  - Belozersky Institute of Physicochemical Biology, Lomonosov Moscow State University, Moscow, Russia
- Dmitry Suplatov
  - Belozersky Institute of Physicochemical Biology, Lomonosov Moscow State University, Moscow, Russia
9
Xiao Y, Zeng B, Berner N, Frishman D, Langosch D, George Teese M. Experimental determination and data-driven prediction of homotypic transmembrane domain interfaces. Comput Struct Biotechnol J 2020; 18:3230-3242. [PMID: 33209210 PMCID: PMC7649602 DOI: 10.1016/j.csbj.2020.09.035] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 03/08/2020] [Revised: 09/22/2020] [Accepted: 09/24/2020] [Indexed: 12/22/2022] Open
Abstract
Homotypic TMD interfaces identified by different techniques share strong similarities. The GxxxG motif is the feature most strongly associated with interfaces. Other features include conservation, polarity, coevolution, and depth in the membrane. The role of each feature strongly depends on the individual protein. Machine learning helps predict interfaces from evolutionary sequence data.
Interactions between their transmembrane domains (TMDs) frequently support the assembly of single-pass membrane proteins to non-covalent complexes. Yet, the TMD-TMD interactome remains largely uncharted. With a view to predicting homotypic TMD-TMD interfaces from primary structure, we performed a systematic analysis of their physical and evolutionary properties. To this end, we generated a dataset of 50 self-interacting TMDs. This dataset contains interfaces of nine TMDs from bitopic human proteins (Ire1, Armcx6, Tie1, ATP1B1, PTPRO, PTPRU, PTPRG, DDR1, and Siglec7) that were experimentally identified here and combined with literature data. We show that interfacial residues of these homotypic TMD-TMD interfaces tend to be more conserved, coevolved and polar than non-interfacial residues. Further, we suggest for the first time that interface positions are deficient in β-branched residues, and likely to be located deep in the hydrophobic core of the membrane. Overrepresentation of the GxxxG motif at interfaces is strong, but that of (small)xxx(small) motifs is weak. The multiplicity of these features and the individual character of TMD-TMD interfaces, as uncovered here, prompted us to train a machine learning algorithm. The resulting prediction method, THOIPA (www.thoipa.org), excels in the prediction of key interface residues from evolutionary sequence data.
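The GxxxG and (small)xxx(small) motifs mentioned in this abstract are simple sequence patterns: two glycines (or two small residues) separated by any three residues. The sketch below locates them with regular expressions. It assumes, for illustration only, that "small" means G, A, or S; the paper's exact definition may differ, and the sequence is a toy example.

```python
import re
from typing import List, Pattern

# Lookahead patterns so that overlapping motif occurrences are all reported.
GXXXG = re.compile(r"(?=G.{3}G)")
# Assumption: "small" residues are G, A, S (the source may define the set differently).
SMALL_XXX_SMALL = re.compile(r"(?=[GAS].{3}[GAS])")


def motif_starts(pattern: Pattern[str], sequence: str) -> List[int]:
    """Return 0-based start positions of every (possibly overlapping) match."""
    return [match.start() for match in pattern.finditer(sequence)]


# Toy transmembrane-like sequence, chosen only to exercise both patterns.
tmd = "LIIFGVMAGVIGTILLI"
print("GxxxG starts:", motif_starts(GXXXG, tmd))
print("(small)xxx(small) starts:", motif_starts(SMALL_XXX_SMALL, tmd))
```

Because every GxxxG is also a (small)xxx(small) under this definition, the second list is always a superset of the first.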
Affiliation(s)
- Yao Xiao
  - Center for Integrated Protein Science Munich (CIPSM) at the Lehrstuhl für Chemie der Biopolymere, Technische Universität München, Weihenstephaner Berg 3, 85354 Freising, Germany
- Bo Zeng
  - Department of Bioinformatics, Wissenschaftszentrum, Weihenstephan, Maximus-von-Imhof-Forum 3, Freising 85354, Germany
- Nicola Berner
  - Center for Integrated Protein Science Munich (CIPSM) at the Lehrstuhl für Chemie der Biopolymere, Technische Universität München, Weihenstephaner Berg 3, 85354 Freising, Germany
- Dmitrij Frishman
  - Department of Bioinformatics, Wissenschaftszentrum, Weihenstephan, Maximus-von-Imhof-Forum 3, Freising 85354, Germany
  - Department of Bioinformatics, Peter the Great Saint Petersburg Polytechnic University, St. Petersburg 195251, Russian Federation
- Dieter Langosch
  - Center for Integrated Protein Science Munich (CIPSM) at the Lehrstuhl für Chemie der Biopolymere, Technische Universität München, Weihenstephaner Berg 3, 85354 Freising, Germany
- Mark George Teese
  - Center for Integrated Protein Science Munich (CIPSM) at the Lehrstuhl für Chemie der Biopolymere, Technische Universität München, Weihenstephaner Berg 3, 85354 Freising, Germany
  - TNG Technology Consulting GmbH, Beta-Straße 13a, 85774 Unterföhring, Germany
10
Zaucha J, Heinzinger M, Kulandaisamy A, Kataka E, Salvádor ÓL, Popov P, Rost B, Gromiha MM, Zhorov BS, Frishman D. Mutations in transmembrane proteins: diseases, evolutionary insights, prediction and comparison with globular proteins. Brief Bioinform 2020; 22:5872174. [PMID: 32672331 DOI: 10.1093/bib/bbaa132] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 04/14/2020] [Revised: 05/26/2020] [Accepted: 05/28/2020] [Indexed: 12/18/2022] Open
Abstract
Membrane proteins are unique in that they interact with lipid bilayers, making them indispensable for transporting molecules and relaying signals between and across cells. Due to the significance of the protein's functions, mutations often have profound effects on the fitness of the host. This is apparent both from experimental studies, which implicated numerous missense variants in diseases, as well as from evolutionary signals that allow elucidating the physicochemical constraints that intermembrane and aqueous environments bring. In this review, we report on the current state of knowledge acquired on missense variants (referred to as single amino acid variants) affecting membrane proteins as well as the insights that can be extrapolated from data already available. This includes an overview of the annotations for membrane protein variants that have been collated within databases dedicated to the topic, bioinformatics approaches that leverage evolutionary information in order to shed light on previously uncharacterized membrane protein structures or interaction interfaces, tools for predicting the effects of mutations tailored specifically towards the characteristics of membrane proteins as well as two clinically relevant case studies explaining the implications of mutated membrane proteins in cancer and cardiomyopathy.
Affiliation(s)
- Jan Zaucha
  - Department of Bioinformatics of the TUM School of Life Sciences Weihenstephan in Freising, Germany
- Michael Heinzinger
  - Department of Informatics, Bioinformatics and Computational Biology of the TUM Faculty of Informatics in Garching, Germany
- A Kulandaisamy
  - Department of Biotechnology of the IIT Bhupat and Jyoti Mehta School of BioSciences in Madras, India
- Evans Kataka
  - Department of Bioinformatics of the TUM School of Life Sciences Weihenstephan in Freising, Germany
- Óscar Llorian Salvádor
  - Department of Informatics, Bioinformatics and Computational Biology of the TUM Faculty of Informatics in Garching, Germany
- Petr Popov
  - Center for Computational and Data-Intensive Science and Engineering of the Skolkovo Institute of Science and Technology in Moscow, Russia
- Burkhard Rost
  - Department of Informatics, Bioinformatics and Computational Biology at the TUM Faculty of Informatics in Garching, Germany
- Boris S Zhorov
  - Department of Biochemistry and Biomedical Sciences, McMaster University in Hamilton, Canada
- Dmitrij Frishman
  - Department of Bioinformatics at the TUM School of Life Sciences Weihenstephan in Freising, Germany
11
Sun J, Frishman D. DeepHelicon: Accurate prediction of inter-helical residue contacts in transmembrane proteins by residual neural networks. J Struct Biol 2020; 212:107574. [PMID: 32663598 DOI: 10.1016/j.jsb.2020.107574] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 04/10/2020] [Revised: 07/03/2020] [Accepted: 07/07/2020] [Indexed: 01/16/2023]
Abstract
Accurate prediction of amino acid residue contacts is an important prerequisite for generating high-quality 3D models of transmembrane (TM) proteins. While a large number of compositional, evolutionary, and structural properties of proteins can be used to train contact prediction methods, recent research suggests that coevolution between residues provides the strongest indication of their spatial proximity. We have developed a deep learning approach, DeepHelicon, to predict inter-helical residue contacts in TM proteins by considering only coevolutionary features. DeepHelicon comprises a two-stage supervised learning process by residual neural networks for a gradual refinement of contact maps, followed by variance reduction by an ensemble of models. We present a benchmark study of 12 contact predictors and conclude that DeepHelicon, together with the two other state-of-the-art methods DeepMetaPSICOV and Membrain2, outperforms the 10 remaining algorithms on all datasets and at all settings. On a set of 44 TM proteins with an average length of 388 residues, DeepHelicon achieves the best performance among all benchmarked methods in predicting the top L/5 and L/2 inter-helical contacts, with mean precisions of 87.42% and 77.84%, respectively. On a set of 57 relatively small TM proteins with an average length of 298 residues, DeepHelicon ranks second best after DeepMetaPSICOV. DeepHelicon produces the most accurate predictions for large proteins with more than 10 transmembrane helices. Coevolutionary features alone allow the prediction of inter-helical residue contacts with an accuracy sufficient for generating acceptable 3D models for up to 30% of proteins using a fully automated modeling method such as CONFOLD2.
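The "top L/5" and "top L/2" figures in this abstract follow the usual contact-prediction convention: rank predicted residue pairs by score, keep the best L/k pairs (L being the sequence length), and report the fraction that are true contacts. The sketch below illustrates that convention with hypothetical scores; it is not DeepHelicon's evaluation code, and the function and variable names are assumptions.

```python
from typing import Dict, Set, Tuple

Pair = Tuple[int, int]  # residue index pair (i, j)


def top_fraction_precision(
    scores: Dict[Pair, float],
    true_contacts: Set[Pair],
    seq_len: int,
    divisor: int,
) -> float:
    """Precision over the seq_len // divisor highest-scoring predicted pairs."""
    if not scores:
        raise ValueError("no predicted pairs to evaluate")
    k = max(1, seq_len // divisor)
    # If fewer than k pairs were predicted, all of them are evaluated.
    ranked = sorted(scores, key=scores.get, reverse=True)[:k]
    hits = sum(1 for pair in ranked if pair in true_contacts)
    return hits / len(ranked)


# Hypothetical predictions for a 10-residue toy protein.
predicted = {
    (1, 8): 0.9, (2, 7): 0.8, (3, 6): 0.7,
    (1, 5): 0.6, (4, 9): 0.5, (2, 9): 0.1,
}
observed = {(1, 8), (3, 6), (4, 9)}

print("top L/5:", top_fraction_precision(predicted, observed, seq_len=10, divisor=5))
print("top L/2:", top_fraction_precision(predicted, observed, seq_len=10, divisor=2))
```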
Affiliation(s)
- Jianfeng Sun
  - Department of Bioinformatics, Wissenschaftzentrum Weihenstephan, Technische Universität München, 85354 Freising, Germany
- Dmitrij Frishman
  - Department of Bioinformatics, Wissenschaftzentrum Weihenstephan, Technische Universität München, 85354 Freising, Germany.
12
|
Koukos P, Bonvin A. Integrative Modelling of Biomolecular Complexes. J Mol Biol 2020; 432:2861-2881. [DOI: 10.1016/j.jmb.2019.11.009] [Citation(s) in RCA: 31] [Impact Index Per Article: 6.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/14/2019] [Revised: 11/12/2019] [Accepted: 11/13/2019] [Indexed: 12/31/2022]
13
Computational Identification and Analysis of Ubiquinone-Binding Proteins. Cells 2020; 9:cells9020520. [PMID: 32102444 PMCID: PMC7072731 DOI: 10.3390/cells9020520] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 01/21/2020] [Revised: 02/21/2020] [Accepted: 02/21/2020] [Indexed: 12/15/2022] Open
Abstract
Ubiquinone is an important cofactor that plays vital and diverse roles in many biological processes. Ubiquinone-binding proteins (UBPs) are receptor proteins that dock with ubiquinones. Analyzing and identifying UBPs via a computational approach will provide insights into the pathways associated with ubiquinones. In this work, we were the first to propose a UBPs predictor (UBPs-Pred). The optimal feature subset selected from three categories of sequence-derived features was fed into the extreme gradient boosting (XGBoost) classifier, and the parameters of XGBoost were tuned by multi-objective particle swarm optimization (MOPSO). The experimental results over the independent validation demonstrated considerable prediction performance with a Matthews correlation coefficient (MCC) of 0.517. After that, we analyzed the UBPs using bioinformatics methods, including the statistics of the binding domain motifs and protein distribution, as well as an enrichment analysis of the gene ontology (GO) and the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway.
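Note: the Matthews correlation coefficient (MCC) reported in this abstract is computed from the four confusion-matrix counts of a binary classifier. The Python sketch below shows the standard formula; the function name `matthews_corrcoef_from_counts` and the convention of returning 0.0 when the denominator is zero are assumptions for illustration, and the counts in the example are made up rather than taken from the paper.

```python
import math


def matthews_corrcoef_from_counts(tp: int, tn: int, fp: int, fn: int) -> float:
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Assumed convention: report 0.0 when any marginal count is zero.
    return numerator / denominator if denominator else 0.0


# Made-up counts: 40 true positives, 40 true negatives, 10 of each error type.
example_mcc = matthews_corrcoef_from_counts(tp=40, tn=40, fp=10, fn=10)
```

MCC ranges from -1 (every prediction wrong) through 0 (no better than chance) to 1 (every prediction right), which is why it is often preferred over accuracy on imbalanced datasets.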
14
Barreto CAV, Baptista SJ, Preto AJ, Matos-Filipe P, Mourão J, Melo R, Moreira I. Prediction and targeting of GPCR oligomer interfaces. Prog Mol Biol Transl Sci 2020; 169:105-149. [PMID: 31952684 DOI: 10.1016/bs.pmbts.2019.11.007] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Indexed: 12/18/2022]
Abstract
GPCR oligomerization has emerged as a hot topic in the GPCR field in recent years. Receptors that are part of these oligomers can influence each other's function, although it is not yet entirely understood how these interactions work. The existence of such a highly complex network of interactions between GPCRs raises the possibility of alternative targets for new therapeutic approaches. However, challenges still exist in the characterization of these complexes, especially at the interface level. Different experimental approaches, such as FRET or BRET, are usually combined to study GPCR oligomer interactions. Computational methods have been applied as a useful tool for retrieving information from GPCR sequences and the few X-ray-resolved oligomeric structures that are accessible, as well as for predicting new and trustworthy GPCR oligomeric interfaces. Machine-learning (ML) approaches have recently helped to overcome some limitations of other methods. By combining and evaluating multiple structure-, sequence- and co-evolution-based features in the same algorithm, it is possible to dilute the issues of particular structures and residues that arise from the experimental methodology into all-encompassing algorithms capable of accurately predicting GPCR-GPCR interfaces. All these methods, used singly or in combination, provide useful information about GPCR oligomerization and its role in GPCR function and dynamics. Altogether, we present experimental, computational and machine-learning methods used to study oligomer interfaces, as well as strategies that have been used to target these dynamic complexes.
Affiliation(s)
- Carlos A V Barreto
- Center for Neuroscience and Cell Biology, University of Coimbra, Coimbra, Portugal
- Salete J Baptista
- Center for Neuroscience and Cell Biology, University of Coimbra, Coimbra, Portugal; Centro de Ciências e Tecnologias Nucleares, Instituto Superior Técnico, Universidade de Lisboa, CTN, LRS, Portugal
- António José Preto
- Center for Neuroscience and Cell Biology, University of Coimbra, Coimbra, Portugal
- Pedro Matos-Filipe
- Center for Neuroscience and Cell Biology, University of Coimbra, Coimbra, Portugal
- Joana Mourão
- Center for Neuroscience and Cell Biology, University of Coimbra, Coimbra, Portugal; Institute for Interdisciplinary Research, University of Coimbra, Coimbra, Portugal
- Rita Melo
- Center for Neuroscience and Cell Biology, University of Coimbra, Coimbra, Portugal; Centro de Ciências e Tecnologias Nucleares, Instituto Superior Técnico, Universidade de Lisboa, CTN, LRS, Portugal
- Irina Moreira
- Center for Neuroscience and Cell Biology, University of Coimbra, Coimbra, Portugal; Science and Technology Faculty, University of Coimbra, Coimbra, Portugal
15
Lu C, Liu Z, Zhang E, He F, Ma Z, Wang H. MPLs-Pred: Predicting Membrane Protein-Ligand Binding Sites Using Hybrid Sequence-Based Features and Ligand-Specific Models. Int J Mol Sci 2019; 20:ijms20133120. [PMID: 31247932 PMCID: PMC6651575 DOI: 10.3390/ijms20133120] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 05/08/2019] [Revised: 06/23/2019] [Accepted: 06/23/2019] [Indexed: 02/07/2023] Open
Abstract
Membrane proteins (MPs) are involved in many essential biomolecular mechanisms as a pivotal factor in enabling small-molecule and signal transport between the two sides of the biological membrane; this is why a large portion of modern medicinal drugs target MPs. Therefore, accurately identifying membrane protein-ligand binding sites (MPLs) will significantly improve drug discovery. In this paper, we propose a sequence-based MPLs predictor called MPLs-Pred, in which evolutionary profiles, topology structure, physicochemical properties, and primary sequence segment descriptors are combined as features for a random forest classifier, and an under-sampling scheme is used to enhance the classification capability with imbalanced samples. Additional ligand-specific models were taken into consideration in refining the prediction. Our method achieved an appreciable performance, with an overall MCC (Matthews correlation coefficient) of 0.63; the corresponding values were 0.604, 0.7, and 0.692 for the three main types of ligands: drugs, metal ions, and biomacromolecules, respectively. MPLs-Pred is freely accessible at http://icdtools.nenu.edu.cn/.
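Note: the abstract mentions an under-sampling scheme for imbalanced samples but does not specify it here. The Python sketch below shows plain random under-sampling of the majority class, which is one common variant and is not necessarily the scheme used in MPLs-Pred. The function name `random_undersample`, the 0/1 label encoding, and the fixed seed are assumptions for illustration.

```python
import random
from typing import List, Sequence, Tuple


def random_undersample(
    samples: Sequence, labels: Sequence[int], seed: int = 0
) -> Tuple[List, List[int]]:
    """Keep all minority-class samples and an equally sized random subset of the majority class."""
    rng = random.Random(seed)
    positives = [i for i, y in enumerate(labels) if y == 1]
    negatives = [i for i, y in enumerate(labels) if y == 0]
    # Whichever class is smaller is treated as the minority.
    minority, majority = sorted((positives, negatives), key=len)
    kept = minority + rng.sample(majority, len(minority))
    rng.shuffle(kept)  # avoid all minority samples appearing first
    return [samples[i] for i in kept], [labels[i] for i in kept]


# Toy data: 3 binding residues (label 1) versus 20 non-binding residues (label 0).
toy_samples = list(range(23))
toy_labels = [1] * 3 + [0] * 20
balanced_samples, balanced_labels = random_undersample(toy_samples, toy_labels)
```

The balanced subset would then be passed to the classifier for training, while evaluation is normally done on data with the original class ratio.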
Affiliation(s)
- Chang Lu
- School of Information Science and Technology, Northeast Normal University, Changchun 130117, China
- Institute of Computational Biology, Northeast Normal University, Changchun 130117, China
- Zhe Liu
- School of Information Science and Technology, Northeast Normal University, Changchun 130117, China
- Institute of Computational Biology, Northeast Normal University, Changchun 130117, China
- Enju Zhang
- School of Information Science and Technology, Northeast Normal University, Changchun 130117, China
- Institute of Computational Biology, Northeast Normal University, Changchun 130117, China
- Fei He
- School of Information Science and Technology, Northeast Normal University, Changchun 130117, China
- Institute of Computational Biology, Northeast Normal University, Changchun 130117, China
- Zhiqiang Ma
- School of Information Science and Technology, Northeast Normal University, Changchun 130117, China
- Institute of Computational Biology, Northeast Normal University, Changchun 130117, China
- Han Wang
- School of Information Science and Technology, Northeast Normal University, Changchun 130117, China
- Institute of Computational Biology, Northeast Normal University, Changchun 130117, China