1
Sawhney A, Li J, Liao L. Improving AlphaFold Predicted Contacts for Alpha-Helical Transmembrane Proteins Using Structural Features. Int J Mol Sci 2024; 25:5247. [PMID: 38791287 PMCID: PMC11121315 DOI: 10.3390/ijms25105247] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/12/2024] [Revised: 05/06/2024] [Accepted: 05/09/2024] [Indexed: 05/26/2024] Open
Abstract
Residue contact maps provide a condensed two-dimensional representation of three-dimensional protein structures, serving as a foundational framework in structural modeling but also as an effective tool in their own right in identifying inter-helical binding sites and drawing insights about protein function. Treating contact maps primarily as an intermediate step for 3D structure prediction, contact prediction methods have limited themselves exclusively to sequential features. Now that AlphaFold2 predicts 3D structures with good accuracy in general, we examine (1) how well predicted 3D structures can be directly used for deciding residue contacts, and (2) whether features from 3D structures can be leveraged to further improve residue contact prediction. With a well-known benchmark dataset, we tested predicting inter-helical residue contact based on AlphaFold2's predicted structures, which gave an 83% average precision, already outperforming a sequential features-based state-of-the-art model. We then developed a procedure to extract features from atomic structure in the neighborhood of a residue pair, hypothesizing that these features will be useful in determining if the residue pair is in contact, provided the structure is decently accurate, such as predicted by AlphaFold2. Training on features generated from experimentally determined structures, we leveraged knowledge from known structures to significantly improve residue contact prediction, when testing using the same set of features but derived using AlphaFold2 structures. Our results demonstrate a remarkable improvement over AlphaFold2, achieving over 91.9% average precision for a held-out subset and over 89.5% average precision in cross-validation experiments.
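The first question in the abstract, deciding residue contacts directly from a predicted 3D structure, can be illustrated with a minimal sketch. It assumes one representative coordinate per residue, a per-residue helix label, and an 8 Å distance cutoff; none of these details are stated in the abstract, and the function names `inter_helical_contacts` and `precision` are hypothetical.

```python
from itertools import combinations
from math import dist


def inter_helical_contacts(coords, helix_ids, cutoff=8.0):
    """Return residue-index pairs (i, j), i < j, that sit in different
    helices and are closer than `cutoff` angstroms.

    coords    : list of (x, y, z) tuples, one representative atom per residue
    helix_ids : helix label per residue (None = residue is not in a helix)
    cutoff    : distance threshold; 8.0 is an assumed placeholder value
    """
    contacts = []
    for i, j in combinations(range(len(coords)), 2):
        hi, hj = helix_ids[i], helix_ids[j]
        # Skip loop residues and pairs inside the same helix.
        if hi is None or hj is None or hi == hj:
            continue
        if dist(coords[i], coords[j]) < cutoff:
            contacts.append((i, j))
    return contacts


def precision(predicted, true):
    """Fraction of predicted contacts that are also true contacts."""
    predicted, true = set(predicted), set(true)
    return len(predicted & true) / len(predicted) if predicted else 0.0


# Toy coordinates (not from any real structure): two helices 6 angstroms
# apart and a third helix far away.
coords = [(0, 0, 0), (3.8, 0, 0), (0, 6, 0), (3.8, 6, 0), (30, 0, 0)]
helix_ids = ["H1", "H1", "H2", "H2", "H3"]
predicted = inter_helical_contacts(coords, helix_ids)
print(predicted)
print(precision(predicted, [(0, 2), (1, 3)]))
```

In practice the same function would be applied once to the predicted structure (to obtain predictions) and once to the experimental structure (to obtain reference contacts), with precision then averaged over proteins.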
Affiliation(s)
- Aman Sawhney
  - Department of Computer and Information Sciences, University of Delaware, Smith Hall, 18 Amstel Avenue, Newark, DE 19716, USA
- Jiefu Li
  - School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, 516 Jun Gong Road, Shanghai 200093, China
- Li Liao
  - Department of Computer and Information Sciences, University of Delaware, Smith Hall, 18 Amstel Avenue, Newark, DE 19716, USA
2
Roterman I, Stapor K, Konieczny L. Transmembrane proteins-Different anchoring systems. Proteins 2024; 92:593-609. [PMID: 38062872 DOI: 10.1002/prot.26646] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/13/2023] [Revised: 11/03/2023] [Accepted: 11/17/2023] [Indexed: 04/13/2024]
Abstract
Transmembrane proteins are active in amphipathic environments. To stabilize the protein in such surroundings, hydrophobic residues must be exposed on the protein surface. Transmembrane proteins are responsible for the transport of various molecules and therefore often form channel-like structures. This analysis focused on the stability and local flexibility of transmembrane proteins, particularly those related to their biological activity. Different forms of anchorage were identified using the fuzzy oil-drop model (FOD) and its modified form, FOD-M. The mainly helical and the β-barrel structural forms are compared with respect to the mechanism of stabilization in the cell membrane. The different anchoring systems were found to stabilize protein molecules with possible local fluctuation.
Affiliation(s)
- Irena Roterman
  - Department of Bioinformatics and Telemedicine, Jagiellonian University-Medical College, Krakow, Poland
- Katarzyna Stapor
  - Faculty of Automatic, Electronics and Computer Science, Department of Applied Informatics, Silesian University of Technology, Gliwice, Poland
- Leszek Konieczny
  - Chair of Medical Biochemistry, Jagiellonian University-Medical College, Krakow, Poland
3
Faiz M, Khan SJ, Azim F, Ejaz N. Disclosing the locale of transmembrane proteins within cellular alcove by machine learning approach: systematic review and meta analysis. J Biomol Struct Dyn 2023; 42:11133-11148. [PMID: 37768108 DOI: 10.1080/07391102.2023.2260490] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/21/2023] [Accepted: 09/13/2023] [Indexed: 09/29/2023]
Abstract
Protein subcellular localization is a promising research question in Proteomics and associated fields, including Biological Sciences, Biomedical Engineering, Computational Biology, Bioinformatics, Artificial Intelligence, and Biophysics. Computational techniques are preferred for exploring this attribute across a massive number of proteins, and their use has yielded a diverse set of protein location identifiers. These subcellular localization identifiers differ in the database used, the organisms covered, the machine learning technique, and the accuracy. Despite the availability of these identifiers, most of the work has addressed the subcellular localization of proteins in general, and less work has been done specifically on the locations of transmembrane proteins. This systematic review accounts for computational techniques applied to transmembrane protein localization. A literature search of the PubMed, Science Direct, and IEEE databases disclosed no prior systematic review or meta-analysis on the locale of transmembrane proteins in the cell. The systematic review was conducted under the PRISMA guidelines using the Science Direct, PubMed, and IEEE databases. Journal publications from 2000 to 2023 were taken into consideration and screened. The review focused only on computational studies rather than experimental techniques. In total, 1004 studies were reviewed and categorized as relevant or non-relevant according to inclusion and exclusion criteria. All screening was done in Endnote after importing citations. This systematic review characterizes the gap in targeting the locale of transmembrane proteins and will aid researchers in exploring new horizons.
Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Mehwish Faiz
  - Department of Biomedical Engineering, Ziauddin University (FESTM), Karachi, Pakistan
  - Department of Electrical Engineering, Ziauddin University (FESTM), Karachi, Pakistan
- Saad Jawaid Khan
  - Department of Biomedical Engineering, Ziauddin University (FESTM), Karachi, Pakistan
- Fahad Azim
  - Department of Electrical Engineering, Ziauddin University (FESTM), Karachi, Pakistan
- Nazia Ejaz
  - Balochistan University of Engineering and Technology, Khuzdar, Pakistan
4
Sun J, Kulandaisamy A, Ru J, Gromiha MM, Cribbs AP. TMKit: a Python interface for computational analysis of transmembrane proteins. Brief Bioinform 2023; 24:bbad288. [PMID: 37594311 PMCID: PMC10516361 DOI: 10.1093/bib/bbad288] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/17/2023] [Revised: 07/07/2023] [Accepted: 07/18/2023] [Indexed: 08/19/2023] Open
Abstract
Transmembrane proteins are receptors, enzymes, transporters and ion channels that are instrumental in regulating a variety of cellular activities, such as signal transduction and cell communication. Despite tremendous progress in computational capacities to support protein research, there is still a significant gap in the availability of specialized computational analysis toolkits for transmembrane protein research. Here, we introduce TMKit, an open-source Python programming interface that is modular, scalable and specifically designed for processing transmembrane protein data. TMKit is a one-stop computational analysis tool for transmembrane proteins, enabling users to perform database wrangling, engineer features at the mutational, domain and topological levels, and visualize protein-protein interaction interfaces. In addition, TMKit includes seqNetRR, a high-performance computing library that allows customized construction of a large number of residue connections. This library is particularly well suited for assigning correlation matrix-based features at a fast speed. TMKit should serve as a useful tool for researchers in assisting the study of transmembrane protein sequences and structures. TMKit is publicly available through https://github.com/2003100127/tmkit and https://tmkit-guide.herokuapp.com/doc/overview.
Affiliation(s)
- Jianfeng Sun
  - Nuffield Department of Orthopedics, Rheumatology, and Musculoskeletal Sciences, Botnar Research Centre, University of Oxford, Headington, Oxford OX3 7LD, UK
- Arulsamy Kulandaisamy
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of BioSciences, Indian Institute of Technology Madras, Chennai 600036, Tamil Nadu, India
- Jinlong Ru
  - Chair of Prevention of Microbial Diseases, School of Life Sciences Weihenstephan, Technical University of Munich, 85354 Freising, Germany
- M Michael Gromiha
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of BioSciences, Indian Institute of Technology Madras, Chennai 600036, Tamil Nadu, India
- Adam P Cribbs
  - Nuffield Department of Orthopedics, Rheumatology, and Musculoskeletal Sciences, Botnar Research Centre, University of Oxford, Headington, Oxford OX3 7LD, UK
5
Li J, Sawhney A, Lee JY, Liao L. Improving Inter-Helix Contact Prediction With Local 2D Topological Information. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2023; 20:3001-3012. [PMID: 37155404 DOI: 10.1109/tcbb.2023.3274361] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/10/2023]
Abstract
Inter-helix contact prediction aims to identify residue contacts across different helices in α-helical integral membrane proteins. Despite the progress made by various computational methods, contact prediction remains a challenging task, and to our knowledge there is no method that directly taps into the contact map in an alignment-free manner. We build 2D contact models from an independent dataset to capture the topological patterns in the neighborhood of a residue pair, depending on whether it is a contact or not, and apply the models to the state-of-the-art method's predictions to extract features reflecting 2D inter-helix contact patterns. A secondary classifier is trained on such features. Realizing that the achievable improvement intrinsically hinges on the quality of the original predictions, we devise a mechanism to deal with the issue by introducing (1) partial discretization of the original prediction scores to leverage useful information more effectively, and (2) a fuzzy score that assesses the quality of the original prediction to help select the residue pairs where improvement is more achievable. The cross-validation results show that the prediction from our method outperforms other methods, including the state-of-the-art method (DeepHelicon), by a notable degree even without using the refinement selection scheme. By applying the refinement selection scheme, our method outperforms the state-of-the-art method significantly in these selected sequences.
6
Sun J, Ru J, Ramos-Mucci L, Qi F, Chen Z, Chen S, Cribbs AP, Deng L, Wang X. DeepsmirUD: Prediction of Regulatory Effects on microRNA Expression Mediated by Small Molecules Using Deep Learning. Int J Mol Sci 2023; 24:1878. [PMID: 36768205 PMCID: PMC9915273 DOI: 10.3390/ijms24031878] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 11/28/2022] [Revised: 12/26/2022] [Accepted: 01/12/2023] [Indexed: 01/21/2023] Open
Abstract
Aberrant miRNA expression has been associated with a large number of human diseases. Therefore, targeting miRNAs to regulate their expression levels has become an important therapy against diseases that stem from the dysfunction of pathways regulated by miRNAs. In recent years, small molecules have demonstrated enormous potential as drugs to regulate miRNA expression (i.e., SM-miR). A clear understanding of the mechanism of action of small molecules on the upregulation and downregulation of miRNA expression allows precise diagnosis and treatment of oncogenic pathways. However, outside of a slow and costly process of experimental determination, computational strategies to assist this on an ad hoc basis have yet to be formulated. In this work, we developed, to the best of our knowledge, the first cross-platform prediction tool, DeepsmirUD, to infer small-molecule-mediated regulatory effects on miRNA expression (i.e., upregulation or downregulation). This method is powered by 12 cutting-edge deep-learning frameworks and achieved AUC values of 0.843/0.984 and AUCPR values of 0.866/0.992 on two independent test datasets. With a complementarily constructed network inference approach based on similarity, we report a significantly improved accuracy of 0.813 in determining the regulatory effects of nearly 650 associated SM-miR relations, each formed with either novel small molecule or novel miRNA. By further integrating miRNA-cancer relationships, we established a database of potential pharmaceutical drugs from 1343 small molecules for 107 cancer diseases to understand the drug mechanisms of action and offer novel insight into drug repositioning. Furthermore, we have employed DeepsmirUD to predict the regulatory effects of a large number of high-confidence associated SM-miR relations. Taken together, our method shows promise to accelerate the development of potential miRNA targets and small molecule drugs.
Affiliation(s)
- Jianfeng Sun
  - College of Animal Science and Technology, Northwest A&F University, Yangling 712100, China
  - Botnar Research Centre, Nuffield Department of Orthopedics, Rheumatology and Musculoskeletal Sciences, University of Oxford, Oxford OX3 7LD, UK
- Jinlong Ru
  - Institute of Virology, Helmholtz Centre Munich–German Research Center for Environmental Health, 85764 Neuherberg, Germany
  - Chair of Prevention of Microbial Diseases, School of Life Sciences Weihenstephan, Technical University of Munich, 85354 Freising, Germany
- Lorenzo Ramos-Mucci
  - Botnar Research Centre, Nuffield Department of Orthopedics, Rheumatology and Musculoskeletal Sciences, University of Oxford, Oxford OX3 7LD, UK
- Fei Qi
  - Institute of Genomics, School of Medicine, Huaqiao University, Xiamen 362021, China
- Zihao Chen
  - Department of Computational Biology for Drug Discovery, Biolife Biotechnology Ltd., Zhumadian 463200, China
- Suyuan Chen
  - Leibniz-Institut für Analytische Wissenschaften–ISAS–e.V., Otto-Hahn-Straße 6b, 44227 Dortmund, Germany
- Adam P. Cribbs
  - Botnar Research Centre, Nuffield Department of Orthopedics, Rheumatology and Musculoskeletal Sciences, University of Oxford, Oxford OX3 7LD, UK
- Li Deng
  - Institute of Virology, Helmholtz Centre Munich–German Research Center for Environmental Health, 85764 Neuherberg, Germany
  - Chair of Prevention of Microbial Diseases, School of Life Sciences Weihenstephan, Technical University of Munich, 85354 Freising, Germany
- Xia Wang
  - College of Animal Science and Technology, Northwest A&F University, Yangling 712100, China
  - Department of Molecular and Cellular Biology, University of Arizona, Tucson, AZ 85721, USA
7
De-Simone SG, Napoleão-Pêgo P, Gonçalves PS, Lechuga GC, Mandonado A, Graeff-Teixeira C, Provance DW. Angiostrongilus cantonensis an Atypical Presenilin: Epitope Mapping, Characterization, and Development of an ELISA Peptide Assay for Specific Diagnostic of Angiostrongyliasis. MEMBRANES 2022; 12:membranes12020108. [PMID: 35207030 PMCID: PMC8878667 DOI: 10.3390/membranes12020108] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/15/2021] [Revised: 01/06/2022] [Accepted: 01/06/2022] [Indexed: 12/10/2022]
Abstract
Background: Angiostrongyliasis, the leading cause of eosinophilic meningitis worldwide, is an emergent disease caused by Angiostrongylus cantonensis (rat lungworm) larvae, transmitted accidentally to humans. The diagnosis of human angiostrongyliasis is based on epidemiologic characteristics, clinical symptoms, medical history, and laboratory findings, particularly hypereosinophilia in blood and cerebrospinal fluid. Thus, the diagnosis is difficult and the disease is often confused with other parasitic diseases. The development of a fast and specific diagnostic test for angiostrongyliasis remains a challenge, mainly due to the lack of specificity of the described tests, and the characterization of a new target is therefore required. Material and Methods: Using bioinformatics tools, the putative presenilin (PS) protein C7BVX5-1 was characterized structurally and phylogenetically. A peptide microarray approach was employed to identify single and specific epitopes, and tetrameric epitope peptides were synthesized to evaluate their performance in an ELISA-peptide assay. Results: The data showed that the A. cantonensis PS protein presents nine transmembrane domains, the catalytic aspartyl domain [XD (aa 241) and GLGD (aa 332–335)] between TM6 and TM7, and the absence of the PALP and other characteristic domains of the class A22 and homologous presenilin (PSH). These features make it an atypical sub-branch of the PS family, located in a separate subgroup along with the enzyme of Haemonchus contortus and separated from other worm subclasses. Twelve linear B-cell epitopes were identified by peptide microarray and validated by ELISA using infected rat sera. In addition, their diagnostic performance was demonstrated by an ELISA-MAP4 peptide. Conclusions: Our data show that the putative AgPS is an atypical multi-pass transmembrane protein and indicate that the protein is an excellent immunological target with two (PsAg3 and PsAg9) A. costaricensis cross-reactive epitopes and eight (PsAg1, PsAg2, PsAg6, PsAg7, PsAg8, PsAg10, PsAg11, PsAg12) apparently unique A. cantonensis epitopes. These epitopes could be used in engineered receptacle proteins to develop a specific immunological diagnostic assay for angiostrongyliasis caused by A. cantonensis.
Affiliation(s)
- Salvatore G. De-Simone
  - Center of Technological Development in Health (CDTS), National Institute of Science and Technology for Innovation on Neglected Diseases (INCT-IDN), FIOCRUZ, Rio de Janeiro 21040-900, RJ, Brazil
  - Laboratory of Epidemiology and Molecular Systematics (LESM), Oswaldo Cruz Institute, FIOCRUZ, Rio de Janeiro 21040-900, RJ, Brazil
  - Department of Cellular and Molecular Biology, Biology Institute, Federal Fluminense University, Niterói 24220-900, RJ, Brazil
- Paloma Napoleão-Pêgo
  - Center of Technological Development in Health (CDTS), National Institute of Science and Technology for Innovation on Neglected Diseases (INCT-IDN), FIOCRUZ, Rio de Janeiro 21040-900, RJ, Brazil
- Priscila S. Gonçalves
  - Center of Technological Development in Health (CDTS), National Institute of Science and Technology for Innovation on Neglected Diseases (INCT-IDN), FIOCRUZ, Rio de Janeiro 21040-900, RJ, Brazil
  - Department of Cellular and Molecular Biology, Biology Institute, Federal Fluminense University, Niterói 24220-900, RJ, Brazil
- Guilherme C. Lechuga
  - Center of Technological Development in Health (CDTS), National Institute of Science and Technology for Innovation on Neglected Diseases (INCT-IDN), FIOCRUZ, Rio de Janeiro 21040-900, RJ, Brazil
- Arnaldo Mandonado
  - Laboratory of Biology and Parasitology of Wild Mammals Reservoirs, Oswaldo Cruz Institute, FIOCRUZ, Rio de Janeiro 21040-360, RJ, Brazil
- Carlos Graeff-Teixeira
  - Infectious Diseases Unit, Department of Pathology, Federal University of Espirito Santo, Vitória 29075-910, ES, Brazil
- David W. Provance
  - Center of Technological Development in Health (CDTS), National Institute of Science and Technology for Innovation on Neglected Diseases (INCT-IDN), FIOCRUZ, Rio de Janeiro 21040-900, RJ, Brazil
8
Sun J, Frishman D. Improved sequence-based prediction of interaction sites in α-helical transmembrane proteins by deep learning. Comput Struct Biotechnol J 2021; 19:1512-1530. [PMID: 33815689 PMCID: PMC7985279 DOI: 10.1016/j.csbj.2021.03.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/13/2020] [Revised: 03/02/2021] [Accepted: 03/02/2021] [Indexed: 11/10/2022] Open
Abstract
Fast and accurate prediction of transmembrane protein interaction sites. First ever computational survey of interaction sites in membrane proteins. 10-30% of amino acid positions predicted to be involved in interactions.
Interactions between transmembrane (TM) proteins are fundamental for a wide spectrum of cellular functions, but precise molecular details of these interactions remain largely unknown due to the scarcity of experimentally determined three-dimensional complex structures. Computational techniques are therefore required for a large-scale annotation of interaction sites in TM proteins. Here, we present a novel deep-learning approach, DeepTMInter, for sequence-based prediction of interaction sites in α-helical TM proteins based on their topological, physicochemical, and evolutionary properties. Using a combination of ultra-deep residual neural networks with a stacked generalization ensemble technique, DeepTMInter significantly outperforms existing methods, achieving AUC/AUCPR values of 0.689/0.598. Across the main functional families of human transmembrane proteins, the percentage of amino acid sites predicted to be involved in interactions typically ranges between 10% and 25%, and up to 30% in ion channels. DeepTMInter is available as a standalone package at https://github.com/2003100127/deeptminter. The training and benchmarking datasets are available at https://data.mendeley.com/datasets/2t8kgwzp35.
Affiliation(s)
- Jianfeng Sun
  - Department of Bioinformatics, Wissenschaftzentrum Weihenstephan, Technical University of Munich, Maximus-von-Imhof-Forum 3, 85354 Freising, Germany
- Dmitrij Frishman
  - Department of Bioinformatics, Wissenschaftzentrum Weihenstephan, Technical University of Munich, Maximus-von-Imhof-Forum 3, 85354 Freising, Germany
9
Lechuga GC, Napoleão-Pêgo P, Bottino CCG, Pinho RT, Provance-Jr DW, De-Simone SG. Trypanosoma cruzi Presenilin-Like Transmembrane Aspartyl Protease: Characterization and Cellular Localization. Biomolecules 2020; 10:biom10111564. [PMID: 33212923 PMCID: PMC7698364 DOI: 10.3390/biom10111564] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 09/30/2020] [Revised: 11/06/2020] [Accepted: 11/09/2020] [Indexed: 02/08/2023] Open
Abstract
The increasing detection of infections of Trypanosoma cruzi, the etiological agent of Chagas disease, in non-endemic regions beyond Latin America has become a major public health issue. Although millions of people are affected, current treatments rely on antiquated drugs that produce severe side effects and are considered nearly ineffective for the chronic phase. The minimal progress in the development of new drugs highlights the need for advances in basic research on crucial biochemical pathways in T. cruzi to identify new targets. Here, we report on the T. cruzi presenilin-like transmembrane aspartyl enzyme, a protease of the aspartic class in a unique phylogenetic subgroup with T. vivax separate from protozoans. Computational analyses suggest it contains nine transmembrane domains and an active site with the characteristic PALP motif of the A22 family. Multiple linear B-cell epitopes were identified by SPOT-synthesis analysis with Chagasic patient sera. Two were chosen to generate rabbit antisera, whose signal was primarily localized to the flagellar pocket, intracellular vesicles, and endoplasmic reticulum in parasites by whole-cell immunofluorescence. The results suggest that the parasitic presenilin-like enzyme could have a role in the secretory pathway and serve as a target for the generation of new therapeutics specific to T. cruzi.
Affiliation(s)
- Guilherme C. Lechuga
  - Center for Technological Development in Health/National Institute of Science and Technology for Innovation on Diseases of Neglected Population (INCT-IDPN), FIOCRUZ, Rio de Janeiro 21040-900, Brazil
  - Cellular Ultrastructure Laboratory, FIOCRUZ, Oswaldo Cruz Institute, Rio de Janeiro 21040-900, Brazil
- Paloma Napoleão-Pêgo
  - Center for Technological Development in Health/National Institute of Science and Technology for Innovation on Diseases of Neglected Population (INCT-IDPN), FIOCRUZ, Rio de Janeiro 21040-900, Brazil
- Carolina C. G. Bottino
  - Center for Technological Development in Health/National Institute of Science and Technology for Innovation on Diseases of Neglected Population (INCT-IDPN), FIOCRUZ, Rio de Janeiro 21040-900, Brazil
- Rosa T. Pinho
  - Clinical Immunology Laboratory, FIOCRUZ, Oswaldo Cruz Institute, Rio de Janeiro 21040-900, Brazil
- David W. Provance-Jr
  - Center for Technological Development in Health/National Institute of Science and Technology for Innovation on Diseases of Neglected Population (INCT-IDPN), FIOCRUZ, Rio de Janeiro 21040-900, Brazil
  - Interdisciplinary Medical Research Laboratory, FIOCRUZ, Oswaldo Cruz Institute, Rio de Janeiro 21040-900, Brazil
- Salvatore G. De-Simone
  - Center for Technological Development in Health/National Institute of Science and Technology for Innovation on Diseases of Neglected Population (INCT-IDPN), FIOCRUZ, Rio de Janeiro 21040-900, Brazil
  - Department of Molecular and Cellular Biology, Federal Fluminense University, Niterói 24220-008, Brazil
  - Correspondence: Tel.: +55-21-3865-8183
10
Zhang Q, Zhu J, Ju F, Kong L, Sun S, Zheng WM, Bu D. ISSEC: inferring contacts among protein secondary structure elements using deep object detection. BMC Bioinformatics 2020; 21:503. [PMID: 33153432 PMCID: PMC7643357 DOI: 10.1186/s12859-020-03793-y] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 07/11/2020] [Accepted: 09/30/2020] [Indexed: 11/12/2022] Open
Abstract
BACKGROUND The formation of contacts among protein secondary structure elements (SSEs) is an important step in protein folding, as it determines the topology of the protein tertiary structure; hence, inferring inter-SSE contacts is crucial to protein structure prediction. One existing strategy infers inter-SSE contacts directly from the predicted probabilities of inter-residue contacts without any preprocessing, and thus suffers from the excessive noise in the predicted inter-residue contacts. Another strategy first defines SSEs based on protein secondary structure prediction, and then judges whether each candidate SSE pair could form a contact. However, it is difficult to accurately determine the boundaries of SSEs due to errors in secondary structure prediction, and incorrectly deduced SSEs hinder the subsequent prediction of the contacts among them.
RESULTS We here report an accurate approach to infer inter-SSE contacts (hence named ISSEC) using the deep object detection technique. The design of ISSEC is based on the observation that, in the inter-residue contact map, contacting SSEs usually form rectangular regions with characteristic patterns. Therefore, ISSEC infers inter-SSE contacts by detecting such rectangular regions. Unlike the existing approach that directly uses the predicted probabilities of inter-residue contact, ISSEC applies the deep convolution technique to extract high-level features from the inter-residue contacts. More importantly, ISSEC does not rely on pre-defined SSEs. Instead, ISSEC enumerates multiple candidate rectangular regions in the predicted inter-residue contact map and, for each region, calculates a confidence score to measure whether it has the characteristic patterns. ISSEC employs a greedy strategy to select non-overlapping regions with high confidence scores, and finally infers inter-SSE contacts according to these regions.
CONCLUSIONS Comprehensive experimental results suggested that ISSEC outperformed the state-of-the-art approaches in predicting inter-SSE contacts. We further demonstrated the successful application of ISSEC to improve the prediction of both inter-residue contacts and tertiary structure.
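The final selection step described in the abstract, greedily keeping non-overlapping candidate regions with high confidence scores, can be sketched as follows. This is an illustration only: the rectangle encoding, the `min_score` threshold, and the function names `overlaps` and `greedy_select` are assumptions, not ISSEC's actual implementation.

```python
def overlaps(a, b):
    """True if two axis-aligned rectangles share at least one cell.

    Rectangles are (row_start, row_end, col_start, col_end) with
    inclusive bounds, indexing cells of the contact map.
    """
    return not (a[1] < b[0] or b[1] < a[0] or a[3] < b[2] or b[3] < a[2])


def greedy_select(candidates, min_score=0.5):
    """Pick non-overlapping rectangles in order of decreasing score.

    candidates : list of (rect, score) pairs
    min_score  : assumed confidence threshold (hypothetical value)
    """
    selected = []
    for rect, score in sorted(candidates, key=lambda c: c[1], reverse=True):
        if score < min_score:
            break  # all remaining candidates score even lower
        if all(not overlaps(rect, kept) for kept, _ in selected):
            selected.append((rect, score))
    return selected


# Toy candidates (made-up scores, not output of any real detector).
candidates = [
    ((10, 20, 40, 50), 0.95),
    ((12, 22, 42, 52), 0.90),  # overlaps the first rectangle
    ((60, 70, 5, 15), 0.80),
    ((0, 5, 0, 5), 0.30),      # below the assumed threshold
]
chosen = greedy_select(candidates)
print(chosen)
```

Sorting by score first means a lower-scoring region can never displace a higher-scoring one it overlaps with, which is the usual trade-off of a greedy scheme: simple and fast, but not guaranteed to maximize the total score.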
Affiliation(s)
- Qi Zhang
  - Key Lab of Intelligent Information Processing, Big Data Academy, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, 100190, China
  - School of Computer Science, University of Chinese Academy of Sciences, Beijing, China
- Jianwei Zhu
  - Key Lab of Intelligent Information Processing, Big Data Academy, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, 100190, China
  - School of Computer Science, University of Chinese Academy of Sciences, Beijing, China
- Fusong Ju
  - Key Lab of Intelligent Information Processing, Big Data Academy, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, 100190, China
  - School of Computer Science, University of Chinese Academy of Sciences, Beijing, China
- Lupeng Kong
  - Key Lab of Intelligent Information Processing, Big Data Academy, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, 100190, China
  - School of Computer Science, University of Chinese Academy of Sciences, Beijing, China
- Shiwei Sun
  - Key Lab of Intelligent Information Processing, Big Data Academy, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, 100190, China
  - School of Computer Science, University of Chinese Academy of Sciences, Beijing, China
- Wei-Mou Zheng
  - Institute of Theoretical Physics, Chinese Academy of Sciences, Beijing, 100190, China
- Dongbo Bu
  - Key Lab of Intelligent Information Processing, Big Data Academy, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, 100190, China
  - School of Computer Science, University of Chinese Academy of Sciences, Beijing, China
11
|
Xiao Y, Zeng B, Berner N, Frishman D, Langosch D, Teese MG. Experimental determination and data-driven prediction of homotypic transmembrane domain interfaces. Comput Struct Biotechnol J 2020; 18:3230-3242. [PMID: 33209210 PMCID: PMC7649602 DOI: 10.1016/j.csbj.2020.09.035] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 03/08/2020] [Revised: 09/22/2020] [Accepted: 09/24/2020] [Indexed: 12/22/2022] Open
Abstract
Homotypic TMD interfaces identified by different techniques share strong similarities. The GxxxG motif is the feature most strongly associated with interfaces. Other features include conservation, polarity, coevolution, and depth in the membrane. The role of each feature strongly depends on the individual protein. Machine learning helps predict interfaces from evolutionary sequence data.
Interactions between their transmembrane domains (TMDs) frequently support the assembly of single-pass membrane proteins to non-covalent complexes. Yet, the TMD-TMD interactome remains largely uncharted. With a view to predicting homotypic TMD-TMD interfaces from primary structure, we performed a systematic analysis of their physical and evolutionary properties. To this end, we generated a dataset of 50 self-interacting TMDs. This dataset contains interfaces of nine TMDs from bitopic human proteins (Ire1, Armcx6, Tie1, ATP1B1, PTPRO, PTPRU, PTPRG, DDR1, and Siglec7) that were experimentally identified here and combined with literature data. We show that interfacial residues of these homotypic TMD-TMD interfaces tend to be more conserved, coevolved and polar than non-interfacial residues. Further, we suggest for the first time that interface positions are deficient in β-branched residues, and likely to be located deep in the hydrophobic core of the membrane. Overrepresentation of the GxxxG motif at interfaces is strong, but that of (small)xxx(small) motifs is weak. The multiplicity of these features and the individual character of TMD-TMD interfaces, as uncovered here, prompted us to train a machine learning algorithm. The resulting prediction method, THOIPA (www.thoipa.org), excels in the prediction of key interface residues from evolutionary sequence data.
Affiliation(s)
- Yao Xiao
  - Center for Integrated Protein Science Munich (CIPSM) at the Lehrstuhl für Chemie der Biopolymere, Technische Universität München, Weihenstephaner Berg 3, 85354 Freising, Germany
- Bo Zeng
  - Department of Bioinformatics, Wissenschaftszentrum Weihenstephan, Maximus-von-Imhof-Forum 3, Freising 85354, Germany
- Nicola Berner
  - Center for Integrated Protein Science Munich (CIPSM) at the Lehrstuhl für Chemie der Biopolymere, Technische Universität München, Weihenstephaner Berg 3, 85354 Freising, Germany
- Dmitrij Frishman
  - Department of Bioinformatics, Wissenschaftszentrum Weihenstephan, Maximus-von-Imhof-Forum 3, Freising 85354, Germany
  - Department of Bioinformatics, Peter the Great Saint Petersburg Polytechnic University, St. Petersburg 195251, Russian Federation
- Dieter Langosch
  - Center for Integrated Protein Science Munich (CIPSM) at the Lehrstuhl für Chemie der Biopolymere, Technische Universität München, Weihenstephaner Berg 3, 85354 Freising, Germany
- Mark George Teese
  - Center for Integrated Protein Science Munich (CIPSM) at the Lehrstuhl für Chemie der Biopolymere, Technische Universität München, Weihenstephaner Berg 3, 85354 Freising, Germany
  - TNG Technology Consulting GmbH, Beta-Straße 13a, 85774 Unterföhring, Germany

12
Sun J, Frishman D. DeepHelicon: Accurate prediction of inter-helical residue contacts in transmembrane proteins by residual neural networks. J Struct Biol 2020; 212:107574. [PMID: 32663598 DOI: 10.1016/j.jsb.2020.107574] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 04/10/2020] [Revised: 07/03/2020] [Accepted: 07/07/2020] [Indexed: 01/16/2023]
Abstract
Accurate prediction of amino acid residue contacts is an important prerequisite for generating high-quality 3D models of transmembrane (TM) proteins. While a large number of compositional, evolutionary, and structural properties of proteins can be used to train contact prediction methods, recent research suggests that coevolution between residues provides the strongest indication of their spatial proximity. We have developed a deep learning approach, DeepHelicon, to predict inter-helical residue contacts in TM proteins by considering only coevolutionary features. DeepHelicon comprises a two-stage supervised learning process by residual neural networks for a gradual refinement of contact maps, followed by variance reduction by an ensemble of models. We present a benchmark study of 12 contact predictors and conclude that DeepHelicon, together with the two other state-of-the-art methods DeepMetaPSICOV and Membrain2, outperforms the 10 remaining algorithms on all datasets and at all settings. On a set of 44 TM proteins with an average length of 388 residues, DeepHelicon achieves the best performance among all benchmarked methods in predicting the top L/5 and L/2 inter-helical contacts, with mean precisions of 87.42% and 77.84%, respectively. On a set of 57 relatively small TM proteins with an average length of 298 residues, DeepHelicon ranks second after DeepMetaPSICOV. DeepHelicon produces the most accurate predictions for large proteins with more than 10 transmembrane helices. Coevolutionary features alone allow inter-helical residue contacts to be predicted with an accuracy sufficient for generating acceptable 3D models for up to 30% of proteins using a fully automated modeling method such as CONFOLD2.
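The "top L/5" and "top L/2" precision figures quoted in this abstract follow a common evaluation convention, which can be sketched as below. This is an illustrative sketch, not the authors' evaluation code: the function name `top_fraction_precision`, the dictionary and set data layout, and the choice of L as the full sequence length are all assumptions.

```python
from typing import Dict, List, Optional, Set, Tuple

Pair = Tuple[int, int]


def top_fraction_precision(
    scores: Dict[Pair, float],       # predicted contact probability per residue pair (i < j)
    contacts: Set[Pair],             # true contacting pairs (i < j)
    helix_of: List[Optional[int]],   # helix index per residue, None outside helices
    k: int = 5,
) -> float:
    """Precision over the top L/k scored inter-helical pairs.

    L is taken here to be the sequence length, which is an assumption.
    """
    length = len(helix_of)
    # Keep only pairs whose residues sit in two different helices.
    pairs = [
        (i, j)
        for i in range(length)
        for j in range(i + 1, length)
        if helix_of[i] is not None
        and helix_of[j] is not None
        and helix_of[i] != helix_of[j]
    ]
    if not pairs:
        raise ValueError("no inter-helical residue pairs to evaluate")
    n_top = max(1, length // k)
    # Pairs without a score are treated as score 0.0; ties keep enumeration order.
    ranked = sorted(pairs, key=lambda p: scores.get(p, 0.0), reverse=True)[:n_top]
    hits = sum(1 for p in ranked if p in contacts)
    return hits / len(ranked)


# Toy protein: residues 0-3 in helix 0, residues 6-9 in helix 1, a loop in between.
helix_of = [0, 0, 0, 0, None, None, 1, 1, 1, 1]
scores = {(0, 9): 0.9, (1, 8): 0.8, (2, 7): 0.7}
contacts = {(0, 9), (2, 7)}
print(top_fraction_precision(scores, contacts, helix_of, k=5))
print(top_fraction_precision(scores, contacts, helix_of, k=2))
```

Because the cutoff scales with protein length, the metric rewards getting the most confident predictions right rather than recovering every contact.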
Affiliation(s)
- Jianfeng Sun
  - Department of Bioinformatics, Wissenschaftszentrum Weihenstephan, Technische Universität München, 85354 Freising, Germany
- Dmitrij Frishman
  - Department of Bioinformatics, Wissenschaftszentrum Weihenstephan, Technische Universität München, 85354 Freising, Germany

13
Hönigschmid P, Breimann S, Weigl M, Frishman D. AllesTM: predicting multiple structural features of transmembrane proteins. BMC Bioinformatics 2020; 21:242. [PMID: 32532211 PMCID: PMC7291640 DOI: 10.1186/s12859-020-03581-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 12/19/2019] [Accepted: 06/03/2020] [Indexed: 12/04/2022] Open
Abstract
BACKGROUND This study is motivated by the following three considerations: a) the physico-chemical properties of transmembrane (TM) proteins are distinctly different from those of globular proteins, necessitating the development of specialized structure prediction techniques, b) for many structural features no specialized predictors for TM proteins are available at all, and c) deep learning algorithms make it possible to automate the feature engineering process and thus facilitate the development of multi-target methods for predicting several protein properties at once.
RESULTS We present AllesTM, an integrated tool to predict almost all structural features of transmembrane proteins that can be extracted from atomic coordinate data. It blends several machine learning algorithms: random forests and gradient boosting machines, convolutional neural networks in their original form as well as those enhanced by dilated convolutions and residual connections, and, finally, long short-term memory architectures. AllesTM outperforms other available methods in predicting residue depth in the membrane, flexibility, topology, and relative solvent accessibility in the bound state, while in the prediction of torsion angles, secondary structure, and monomer relative solvent accessibility it lags only slightly behind the currently leading technique, SPOT-1D. High accuracy on a multitude of prediction targets and easy installation make AllesTM a one-stop shop for many typical problems in the structural bioinformatics of transmembrane proteins.
CONCLUSIONS In addition to presenting a highly accurate prediction method and eliminating the need to install and maintain many different software tools, we also provide a comprehensive overview of the impact of different machine learning algorithms and parameter choices on the prediction performance. AllesTM is freely available at https://github.com/phngs/allestm.
Affiliation(s)
- Peter Hönigschmid
  - Department of Bioinformatics, Wissenschaftszentrum Weihenstephan, Technische Universität München, Maximus-von-Imhof-Forum 3, 85354, Freising, Germany
- Stephan Breimann
  - Department of Bioinformatics, Wissenschaftszentrum Weihenstephan, Technische Universität München, Maximus-von-Imhof-Forum 3, 85354, Freising, Germany
- Martina Weigl
  - Department of Bioinformatics, Wissenschaftszentrum Weihenstephan, Technische Universität München, Maximus-von-Imhof-Forum 3, 85354, Freising, Germany
- Dmitrij Frishman
  - Department of Bioinformatics, Wissenschaftszentrum Weihenstephan, Technische Universität München, Maximus-von-Imhof-Forum 3, 85354, Freising, Germany

14
Fang C, Jia Y, Hu L, Lu Y, Wang H. IMPContact: An Interhelical Residue Contact Prediction Method. BioMed Research International 2020; 2020:4569037. [PMID: 32309431 PMCID: PMC7140131 DOI: 10.1155/2020/4569037] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/01/2020] [Accepted: 03/09/2020] [Indexed: 11/17/2022]
Abstract
As an important category of proteins, alpha-helix transmembrane proteins (αTMPs) play an important role in various biological activities. Because few αTMP structures have been solved, predicting the residue contacts among the transmembrane segments of an αTMP reveals the basis of the protein fold, which can in turn be used to discover more protein functions. A few efforts have been devoted to predicting interhelical residue contacts (IHRCs) using machine learning methods based on prior knowledge of transmembrane protein structure. However, improving the prediction accuracy remains a challenge, while deep learning provides an opportunity to exploit structural knowledge from a different perspective. For this purpose, we proposed a novel αTMP residue-residue contact prediction method, IMPContact, in which a convolutional neural network (CNN) was applied to recognize the interhelical contacts in a TMP using its specific structural features. Four sequence-based TMP-specific features were selected to describe a pair of residues, namely, evolutionary covariation, predicted topology structure, residue relative position, and evolutionary conservation. An up-to-date dataset was used to train and test IMPContact; our method achieved better performance than peer methods. In the case studies, IHRCs in the regular transmembrane helices were better predicted than those in the irregular ones.
Affiliation(s)
- Chao Fang
  - School of Information Science and Technology, Northeast Normal University, Changchun 130117, China
- Yajie Jia
  - School of Information Science and Technology, Northeast Normal University, Changchun 130117, China
  - Institute of Computational Biology, Northeast Normal University, Changchun 130117, China
- Lihong Hu
  - School of Information Science and Technology, Northeast Normal University, Changchun 130117, China
- Yinghua Lu
  - School of Information Science and Technology, Northeast Normal University, Changchun 130117, China
  - Department of Computer Science, College of Humanities & Sciences of Northeast Normal University, Changchun 130117, China
- Han Wang
  - School of Information Science and Technology, Northeast Normal University, Changchun 130117, China
  - Institute of Computational Biology, Northeast Normal University, Changchun 130117, China
  - Department of Computer Science, College of Humanities & Sciences of Northeast Normal University, Changchun 130117, China

15
Feng SH, Zhang WX, Yang J, Yang Y, Shen HB. Topology Prediction Improvement of α-helical Transmembrane Proteins Through Helix-tail Modeling and Multiscale Deep Learning Fusion. J Mol Biol 2020; 432:1279-1296. [DOI: 10.1016/j.jmb.2019.12.007] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 04/16/2019] [Revised: 12/02/2019] [Accepted: 12/04/2019] [Indexed: 12/18/2022]

16
Batra V, Maheshwarappa A, Dagar K, Kumar S, Soni A, Kumaresan A, Kumar R, Datta TK. Unusual interplay of contrasting selective pressures on β-defensin genes implicated in male fertility of the Buffalo (Bubalus bubalis). BMC Evol Biol 2019; 19:214. [PMID: 31771505 PMCID: PMC6878701 DOI: 10.1186/s12862-019-1535-8] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 04/07/2019] [Accepted: 10/22/2019] [Indexed: 12/13/2022] Open
Abstract
BACKGROUND The buffalo, despite its superior milk-producing ability, suffers from reproductive limitations that constrain its lifetime productivity. Male sub-fertility, manifested as low conception rates (CRs), is a major concern in buffaloes. The epididymal sperm surface-binding proteins that participate in the sperm surface remodelling (SSR) events affect the survival and performance of the spermatozoa in the female reproductive tract (FRT). A mutation in an epididymal secreted protein, beta-defensin 126 (DEFB-126/BD-126), a class-A beta-defensin (CA-BD), resulted in decreased CRs in human cohorts across the globe. To better understand the role of CA-BDs in buffalo reproduction, this study aimed to identify the BD genes for characterization of the selection pressure(s) acting on them, and to identify the most abundant CA-BD transcript in the buffalo male reproductive tract (MRT) for predicting its reproductive functional significance.
RESULTS Despite the low protein sequence homology with their orthologs, the CA-BDs have maintained the molecular framework and the structural core vital to their biological functions. Their coding sequences in ruminants revealed evidence of pervasive purifying and episodic diversifying selection pressures. The buffalo CA-BD genes were expressed in the major reproductive and non-reproductive tissues, exhibiting spatial variations. The buffalo BD-129 (BuBD-129) was the most abundant and the longest CA-BD in the distal-MRT segments and was predicted to be heavily O-glycosylated.
CONCLUSIONS The maintenance of the structural core, despite the sequence divergence, indicated the conservation of the molecular functions of the CA-BDs. The expression of the buffalo CA-BDs in both the distal-MRT segments and non-reproductive tissues indicates the retention of the primordial microbicidal activity, which was also predicted by in silico sequence analyses. However, the observed spatial variations in their expression across the MRT hint at their region-specific roles. Their comparison across mammalian species revealed a pattern in which the various CA-BDs appeared to follow dissimilar evolutionary paths. This pattern appears to maintain only the highly efficacious CA-BD alleles and diversify their functional repertoire in the ruminants. Our preliminary results and analyses indicated that BuBD-129 could be the functional ortholog of the primate DEFB-126. Further studies are warranted to assess its molecular functions to elucidate its role in immunity, reproduction and fertility.
Affiliation(s)
- Vipul Batra
  - Animal Genomics Lab, National Dairy Research Institute, Karnal, 132001, India
- Komal Dagar
  - Animal Genomics Lab, National Dairy Research Institute, Karnal, 132001, India
- Sandeep Kumar
  - Animal Genomics Lab, National Dairy Research Institute, Karnal, 132001, India
- Apoorva Soni
  - Animal Genomics Lab, National Dairy Research Institute, Karnal, 132001, India
- A Kumaresan
  - Theriogenology Lab, SRS of NDRI, Bengaluru, 560030, India
- Rakesh Kumar
  - Animal Genomics Lab, National Dairy Research Institute, Karnal, 132001, India
- T K Datta
  - Animal Genomics Lab, National Dairy Research Institute, Karnal, 132001, India

17
Lu C, Liu Z, Zhang E, He F, Ma Z, Wang H. MPLs-Pred: Predicting Membrane Protein-Ligand Binding Sites Using Hybrid Sequence-Based Features and Ligand-Specific Models. Int J Mol Sci 2019; 20:3120. [PMID: 31247932 PMCID: PMC6651575 DOI: 10.3390/ijms20133120] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 05/08/2019] [Revised: 06/23/2019] [Accepted: 06/23/2019] [Indexed: 02/07/2023] Open
Abstract
Membrane proteins (MPs) are involved in many essential biomolecule mechanisms as a pivotal factor in enabling the small molecule and signal transport between the two sides of the biological membrane; this is the reason that a large portion of modern medicinal drugs target MPs. Therefore, accurately identifying the membrane protein-ligand binding sites (MPLs) will significantly improve drug discovery. In this paper, we propose a sequence-based MPLs predictor called MPLs-Pred, where evolutionary profiles, topology structure, physicochemical properties, and primary sequence segment descriptors are combined as features applied to a random forest classifier, and an under-sampling scheme is used to enhance the classification capability with imbalanced samples. Additional ligand-specific models were taken into consideration in refining the prediction. The corresponding experimental results based on our method achieved an appreciable performance, with 0.63 MCC (Matthews correlation coefficient) as the overall prediction precision, and those values were 0.604, 0.7, and 0.692, respectively, for the three main types of ligands: drugs, metal ions, and biomacromolecules. MPLs-Pred is freely accessible at http://icdtools.nenu.edu.cn/.
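For readers unfamiliar with the Matthews correlation coefficient (MCC) reported in this abstract, the standard definition for binary classification can be computed from confusion-matrix counts as in the sketch below. The function name `mcc_from_counts` and the example counts are illustrative assumptions, not values from the paper; returning 0.0 when the denominator vanishes is a common convention rather than part of the definition.

```python
import math


def mcc_from_counts(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from binary confusion-matrix counts.

    MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)).
    Returns 0.0 when the denominator is zero (a common convention).
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


# Illustrative counts only: 40 binding residues found, 10 missed,
# 900 non-binding residues correctly rejected, 50 false alarms.
print(round(mcc_from_counts(tp=40, tn=900, fp=50, fn=10), 3))
```

MCC ranges from -1 to 1 and stays informative when binding residues are much rarer than non-binding ones, which is why it is often preferred over plain accuracy for imbalanced data such as binding-site prediction.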
Affiliation(s)
- Chang Lu
  - School of Information Science and Technology, Northeast Normal University, Changchun 130117, China
  - Institute of Computational Biology, Northeast Normal University, Changchun 130117, China
- Zhe Liu
  - School of Information Science and Technology, Northeast Normal University, Changchun 130117, China
  - Institute of Computational Biology, Northeast Normal University, Changchun 130117, China
- Enju Zhang
  - School of Information Science and Technology, Northeast Normal University, Changchun 130117, China
  - Institute of Computational Biology, Northeast Normal University, Changchun 130117, China
- Fei He
  - School of Information Science and Technology, Northeast Normal University, Changchun 130117, China
  - Institute of Computational Biology, Northeast Normal University, Changchun 130117, China
- Zhiqiang Ma
  - School of Information Science and Technology, Northeast Normal University, Changchun 130117, China
  - Institute of Computational Biology, Northeast Normal University, Changchun 130117, China
- Han Wang
  - School of Information Science and Technology, Northeast Normal University, Changchun 130117, China
  - Institute of Computational Biology, Northeast Normal University, Changchun 130117, China

18
Zeng B, Hönigschmid P, Frishman D. Residue co-evolution helps predict interaction sites in α-helical membrane proteins. J Struct Biol 2019; 206:156-169. [DOI: 10.1016/j.jsb.2019.02.009] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 11/06/2018] [Revised: 01/30/2019] [Accepted: 02/13/2019] [Indexed: 11/29/2022]

19
Kulandaisamy A, Priya SB, Sakthivel R, Frishman D, Gromiha MM. Statistical analysis of disease‐causing and neutral mutations in human membrane proteins. Proteins 2019; 87:452-466. [DOI: 10.1002/prot.25667] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 08/20/2018] [Revised: 01/16/2019] [Accepted: 01/31/2019] [Indexed: 11/11/2022]
Affiliation(s)
- A. Kulandaisamy
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, Tamil Nadu, India
- S. Binny Priya
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, Tamil Nadu, India
- R. Sakthivel
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, Tamil Nadu, India
- Dmitrij Frishman
  - Department of Bioinformatics, Peter the Great St. Petersburg Polytechnic University, St. Petersburg, Russian Federation
  - Department of Bioinformatics, Technische Universität München, Wissenschaftszentrum Weihenstephan, Freising, Germany
- M. Michael Gromiha
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, Tamil Nadu, India
  - Advanced Computational Drug Discovery Unit (ACDD), Institute of Innovative Research, Tokyo Institute of Technology, Yokohama, Kanagawa, Japan

20
Xiong D, Mao W, Gong H. Predicting the helix-helix interactions from correlated residue mutations. Proteins 2017; 85:2162-2169. [PMID: 28833538 DOI: 10.1002/prot.25370] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 05/15/2017] [Revised: 08/03/2017] [Accepted: 08/13/2017] [Indexed: 12/30/2022]
Abstract
Helix-helix interactions are crucial in the structure assembly, stability and function of helix-rich proteins, including many membrane proteins. Despite remarkable progress over the past decades, the accuracy of predicting protein structures from their amino acid sequences is still far from satisfactory. In this work, we focused on a simpler problem, the prediction of helix-helix interactions, the results of which could facilitate practical protein structure prediction by constraining the sampling space. Specifically, we started from the noisy 2D residue contact maps derived from correlated residue mutations, and utilized ridge detection to identify the characteristic residue contact patterns for helix-helix interactions. The ridge information, together with a few additional features, was then fed into a machine learning model, HHConPred, to predict interactions between helix pairs. In an independent test, our method achieved an F-measure of ∼60% for predicting helix-helix interactions. Moreover, although the model was trained mainly using soluble proteins, it could be extended to membrane proteins with performance at least comparable to that of previous approaches developed purely using membrane proteins. All data and source code are available at http://166.111.152.91/Downloads.html or https://github.com/dpxiong/HHConPred.
Affiliation(s)
- Dapeng Xiong
  - MOE Key Laboratory of Bioinformatics, School of Life Sciences, Tsinghua University, Beijing, China
  - Beijing Innovation Center of Structural Biology, Tsinghua University, Beijing, China
- Wenzhi Mao
  - MOE Key Laboratory of Bioinformatics, School of Life Sciences, Tsinghua University, Beijing, China
  - Beijing Innovation Center of Structural Biology, Tsinghua University, Beijing, China
- Haipeng Gong
  - MOE Key Laboratory of Bioinformatics, School of Life Sciences, Tsinghua University, Beijing, China
  - Beijing Innovation Center of Structural Biology, Tsinghua University, Beijing, China

21
Simkovic F, Ovchinnikov S, Baker D, Rigden DJ. Applications of contact predictions to structural biology. IUCrJ 2017; 4:291-300. [PMID: 28512576 PMCID: PMC5414403 DOI: 10.1107/s2052252517005115] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.6] [Received: 12/12/2016] [Accepted: 04/03/2017] [Indexed: 06/07/2023]
Abstract
Evolutionary pressure on residue interactions, intramolecular or intermolecular, that are important for protein structure or function can lead to covariance between the two positions. Recent methodological advances allow much more accurate contact predictions to be derived from this evolutionary covariance signal. The practical application of contact predictions has largely been confined to structural bioinformatics, yet, as this work seeks to demonstrate, the data can be of enormous value to the structural biologist working in X-ray crystallography, cryo-EM or NMR. Integrative structural bioinformatics packages such as Rosetta can already exploit contact predictions in a variety of ways. The contribution of contact predictions begins at construct design, where structural domains may need to be expressed separately and contact predictions can help to predict domain limits. Structure solution by molecular replacement (MR) benefits from contact predictions in diverse ways: in difficult cases, more accurate search models can be constructed using ab initio modelling when predictions are available, while intermolecular contact predictions can allow the construction of larger, oligomeric search models. Furthermore, MR using supersecondary motifs or large-scale screens against the PDB can exploit information, such as the parallel or antiparallel nature of any β-strand pairing in the target, that can be inferred from contact predictions. Contact information will be particularly valuable in the determination of lower resolution structures by helping to assign sequence register. In large complexes, contact information may allow the identity of a protein responsible for a certain region of density to be determined and then assist in the orientation of an available model within that density. In NMR, predicted contacts can provide long-range information to extend the upper size limit of the technique in a manner analogous but complementary to experimental methods. Finally, predicted contacts can distinguish between biologically relevant interfaces and mere lattice contacts in a final crystal structure, and have potential in the identification of functionally important regions and in foreseeing the consequences of mutations.
Affiliation(s)
- Felix Simkovic
  - Institute of Integrative Biology, University of Liverpool, Liverpool L69 7ZB, England
- Sergey Ovchinnikov
  - Department of Biochemistry, University of Washington, Seattle, WA 98195, USA
  - Institute for Protein Design, University of Washington, Seattle, WA 98195, USA
  - Howard Hughes Medical Institute, University of Washington, Box 357370, Seattle, WA 98195, USA
- David Baker
  - Department of Biochemistry, University of Washington, Seattle, WA 98195, USA
  - Institute for Protein Design, University of Washington, Seattle, WA 98195, USA
  - Howard Hughes Medical Institute, University of Washington, Box 357370, Seattle, WA 98195, USA
- Daniel J. Rigden
  - Institute of Integrative Biology, University of Liverpool, Liverpool L69 7ZB, England