1
Roterman I, Stapor K, Konieczny L. Transmembrane proteins-Different anchoring systems. Proteins 2024; 92:593-609. [PMID: 38062872 DOI: 10.1002/prot.26646] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/13/2023] [Revised: 11/03/2023] [Accepted: 11/17/2023] [Indexed: 04/13/2024]
Abstract
Transmembrane proteins are active in amphipathic environments. To stabilize the protein in such surroundings, hydrophobic residues must be exposed on the protein surface. Transmembrane proteins are responsible for the transport of various molecules and therefore often take the form of channels. This analysis focused on the stability and local flexibility of transmembrane proteins, particularly those related to their biological activity. Different forms of anchorage were identified using the fuzzy oil-drop model (FOD) and its modified form, FOD-M. Mainly helical and β-barrel structural forms are compared with respect to the mechanism of stabilization in the cell membrane. The different anchoring systems were found to stabilize protein molecules while allowing possible local fluctuation.
Affiliation(s)
- Irena Roterman
  - Department of Bioinformatics and Telemedicine, Jagiellonian University-Medical College, Krakow, Poland
- Katarzyna Stapor
  - Faculty of Automatic, Electronics and Computer Science, Department of Applied Informatics, Silesian University of Technology, Gliwice, Poland
- Leszek Konieczny
  - Chair of Medical Biochemistry, Jagiellonian University-Medical College, Krakow, Poland
2
Ou YY, Ho QT, Chang HT. Recent advances in features generation for membrane protein sequences: From multiple sequence alignment to pre-trained language models. Proteomics 2023; 23:e2200494. [PMID: 37863817 DOI: 10.1002/pmic.202200494] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 03/21/2023] [Revised: 09/19/2023] [Accepted: 09/20/2023] [Indexed: 10/22/2023]
Abstract
Membrane proteins play a crucial role in various cellular processes and are essential components of cell membranes. Computational methods have emerged as a powerful tool for studying membrane proteins due to their complex structures and properties that make them difficult to analyze experimentally. Traditional features for protein sequence analysis based on amino acid types, composition, and pair composition have limitations in capturing higher-order sequence patterns. Recently, multiple sequence alignment (MSA) and pre-trained language models (PLMs) have been used to generate features from protein sequences. However, the significant computational resources required for MSA-based features generation can be a major bottleneck for many applications. Several methods and tools have been developed to accelerate the generation of MSAs and reduce their computational cost, including heuristics and approximate algorithms. Additionally, the use of PLMs such as BERT has shown great potential in generating informative embeddings for protein sequence analysis. In this review, we provide an overview of traditional and more recent methods for generating features from protein sequences, with a particular focus on MSAs and PLMs. We highlight the advantages and limitations of these approaches and discuss the methods and tools developed to address the computational challenges associated with features generation. Overall, the advancements in computational methods and tools provide a promising avenue for gaining deeper insights into the function and properties of membrane proteins, which can have significant implications in drug discovery and personalized medicine.
Affiliation(s)
- Yu-Yen Ou
  - Department of Computer Science and Engineering, Yuan Ze University, Chung-Li, Taiwan
  - Graduate Program in Biomedical Informatics, Yuan Ze University, Chung-Li, Taiwan
- Quang-Thai Ho
  - Department of Computer Science and Engineering, Yuan Ze University, Chung-Li, Taiwan
- Heng-Ta Chang
  - Department of Computer Science and Engineering, Yuan Ze University, Chung-Li, Taiwan
3
Su W, Qian X, Yang K, Ding H, Huang C, Zhang Z. Recognition of outer membrane proteins using multiple feature fusion. Front Genet 2023; 14:1211020. [PMID: 37351347 PMCID: PMC10284346 DOI: 10.3389/fgene.2023.1211020] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/24/2023] [Accepted: 05/24/2023] [Indexed: 06/24/2023]
Abstract
Introduction: Outer membrane proteins are crucial in maintaining the structural stability and permeability of the outer membrane. Outer membrane proteins exhibit several functions such as antigenicity and strong immunogenicity, which have potential applications in clinical diagnosis and disease prevention. However, wet experiments for studying OMPs are time and capital-intensive, thereby necessitating the use of computational methods for their identification. Methods: In this study, we developed a computational model to predict outer membrane proteins. The non-redundant dataset consists of a positive set of 208 outer membrane proteins and a negative set of 876 non-outer membrane proteins. In this study, we employed the pseudo amino acid composition method to extract feature vectors and subsequently utilized the support vector machine for prediction. Results and Discussion: In the Jackknife cross-validation, the overall accuracy and the area under receiver operating characteristic curve were observed to be 93.19% and 0.966, respectively. These results demonstrate that our model can produce accurate predictions, and could serve as a valuable guide for experimental research on outer membrane proteins.
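The Jackknife cross-validation reported in this abstract is commonly understood as leave-one-out evaluation: each protein is held out once, the model is trained on the rest, and the held-out prediction is scored. The sketch below illustrates only that evaluation loop. The nearest-centroid classifier is a simple stand-in for the support vector machine named in the abstract, and all function names are hypothetical; this is not the authors' code.

```python
def nearest_centroid_predict(train_x, train_y, query):
    """Stand-in classifier (assumption): assign the label of the closest class centroid."""
    sums, counts = {}, {}
    for x, y in zip(train_x, train_y):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, value in enumerate(x):
            acc[i] += value
        counts[y] = counts.get(y, 0) + 1
    best_label, best_dist = None, float("inf")
    for label, acc in sums.items():
        centroid = [value / counts[label] for value in acc]
        # Squared Euclidean distance between the query and this class centroid.
        dist = sum((q - c) ** 2 for q, c in zip(query, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def jackknife_accuracy(features, labels, predict=nearest_centroid_predict):
    """Leave-one-out accuracy: hold out each sample once and train on the others."""
    correct = 0
    for i in range(len(features)):
        train_x = features[:i] + features[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if predict(train_x, train_y, features[i]) == labels[i]:
            correct += 1
    return correct / len(features) if features else 0.0
```

Any classifier with the same `predict(train_x, train_y, query)` shape can be passed in place of the stand-in, which keeps the evaluation loop independent of the model being tested.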
Affiliation(s)
- Wenxia Su
  - College of Science, Inner Mongolia Agriculture University, Hohhot, China
- Xiaojun Qian
  - School of Life Science and Technology, Center for Information Biology, University of Electronic Science and Technology of China, Chengdu, China
- Keli Yang
  - Nonlinear Research Institute, Baoji University of Arts and Sciences, Baoji, China
- Hui Ding
  - School of Life Science and Technology, Center for Information Biology, University of Electronic Science and Technology of China, Chengdu, China
- Chengbing Huang
  - School of Computer Science and Technology, Aba Teachers University, Aba, China
- Zhaoyue Zhang
  - School of Life Science and Technology, Center for Information Biology, University of Electronic Science and Technology of China, Chengdu, China
  - School of Healthcare Technology, Chengdu Neusoft University, Chengdu, China
4
Mejias-Gomez O, Madsen AV, Skovgaard K, Pedersen LE, Morth JP, Jenkins TP, Kristensen P, Goletz S. A window into the human immune system: comprehensive characterization of the complexity of antibody complementary-determining regions in functional antibodies. MAbs 2023; 15:2268255. [PMID: 37876265 PMCID: PMC10601506 DOI: 10.1080/19420862.2023.2268255] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 06/22/2023] [Accepted: 10/04/2023] [Indexed: 10/26/2023]
Abstract
The human immune system uses antibodies to neutralize foreign antigens. They are composed of heavy and light chains, both with constant and variable regions. The variable region has six hypervariable loops, also known as complementary-determining regions (CDRs) that determine antibody diversity and antigen specificity. Knowledge of their significance, and certain residues present in these areas, is vital for antibody therapeutics development. This study includes an analysis of more than 11,000 human antibody sequences from the International Immunogenetics information system (IMGT). The analysis included parameters such as length distribution, overall amino acid diversity, amino acid frequency per CDR and residue position within antibody chains. Overall, our findings confirm existing knowledge, such as CDRH3's high length diversity and amino acid variability, increased aromatic residue usage, particularly tyrosine, charged and polar residues like aspartic acid, serine, and the flexible residue glycine. Specific residue positions within each CDR influence these occurrences, implying a unique amino acid type distribution pattern. We compared amino acid type usage in CDRs and non-CDR regions, both in globular and transmembrane proteins, which revealed distinguishing features, such as increased frequency of tyrosine, serine, aspartic acid, and arginine. These findings should prove useful for future optimization, improvement of affinity, synthetic antibody library design, or the creation of antibodies de-novo in silico.
Affiliation(s)
- Oscar Mejias-Gomez
  - Department of Biotechnology and Biomedicine, Technical University of Denmark, Kgs. Lyngby, Denmark
- Andreas V. Madsen
  - Department of Biotechnology and Biomedicine, Technical University of Denmark, Kgs. Lyngby, Denmark
- Kerstin Skovgaard
  - Department of Biotechnology and Biomedicine, Technical University of Denmark, Kgs. Lyngby, Denmark
- Lasse E. Pedersen
  - Department of Biotechnology and Biomedicine, Technical University of Denmark, Kgs. Lyngby, Denmark
- J. Preben Morth
  - Department of Biotechnology and Biomedicine, Technical University of Denmark, Kgs. Lyngby, Denmark
- Timothy P. Jenkins
  - Department of Biotechnology and Biomedicine, Technical University of Denmark, Kgs. Lyngby, Denmark
- Peter Kristensen
  - Department of Chemistry and Bioscience, Aalborg University, Aalborg, Denmark
- Steffen Goletz
  - Department of Biotechnology and Biomedicine, Technical University of Denmark, Kgs. Lyngby, Denmark
5
Jiang M, Zhao B, Luo S, Wang Q, Chu Y, Chen T, Mao X, Liu Y, Wang Y, Jiang X, Wei DQ, Xiong Y. NeuroPpred-Fuse: an interpretable stacking model for prediction of neuropeptides by fusing sequence information and feature selection methods. Brief Bioinform 2021; 22:6350884. [PMID: 34396388 DOI: 10.1093/bib/bbab310] [Citation(s) in RCA: 30] [Impact Index Per Article: 7.5] [Received: 04/16/2021] [Revised: 07/01/2021] [Accepted: 07/18/2021] [Indexed: 12/13/2022]
Abstract
Neuropeptides acting as signaling molecules in the nervous system of various animals play crucial roles in a wide range of physiological functions and hormone regulation behaviors. Neuropeptides offer many opportunities for the discovery of new drugs and targets for the treatment of neurological diseases. In recent years, there have been several data-driven computational predictors of various types of bioactive peptides, but little work has addressed neuropeptides so far. In this work, we developed an interpretable stacking model, named NeuroPpred-Fuse, for the prediction of neuropeptides through fusing a variety of sequence-derived features and feature selection methods. Specifically, we used six types of sequence-derived features to encode the peptide sequences and then combined them. In the first layer, we ensembled three base classifiers and four feature selection algorithms, which select non-redundant important features complementarily. In the second layer, the output of the first layer was merged and fed into a logistic regression (LR) classifier to train the model. Moreover, we analyzed the selected features and explained the feasibility of the selected features. Experimental results show that our model achieved 90.6% accuracy and 95.8% AUC on the independent test set, outperforming the state-of-the-art models. In addition, we exhibited the distribution of selected features by these tree models and compared the results on the training set to those on the test set. These results fully showed that our model has a certain generalization ability. Therefore, we expect that our model would provide important advances in the discovery of neuropeptides as new drugs for the treatment of neurological diseases.
Affiliation(s)
- Mingming Jiang
  - State Key Laboratory of Microbial Metabolism, and School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
- Bowen Zhao
  - State Key Laboratory of Microbial Metabolism, and School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
- Shenggan Luo
  - State Key Laboratory of Microbial Metabolism, and School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
- Qiankun Wang
  - State Key Laboratory of Microbial Metabolism, and School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
- Yanyi Chu
  - State Key Laboratory of Microbial Metabolism, and School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
- Tianhang Chen
  - State Key Laboratory of Microbial Metabolism, and School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
- Xueying Mao
  - State Key Laboratory of Microbial Metabolism, and School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
- Yatong Liu
  - State Key Laboratory of Microbial Metabolism, and School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
- Yanjing Wang
  - State Key Laboratory of Microbial Metabolism, and School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
- Xue Jiang
  - State Key Laboratory of Microbial Metabolism, and School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
- Dong-Qing Wei
  - State Key Laboratory of Microbial Metabolism, and School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
- Yi Xiong
  - State Key Laboratory of Microbial Metabolism, and School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
6
Ho QT, Nguyen TTD, Khanh Le NQ, Ou YY. FAD-BERT: Improved prediction of FAD binding sites using pre-training of deep bidirectional transformers. Comput Biol Med 2021; 131:104258. [PMID: 33601085 DOI: 10.1016/j.compbiomed.2021.104258] [Citation(s) in RCA: 25] [Impact Index Per Article: 6.3] [Received: 12/15/2020] [Revised: 01/16/2021] [Accepted: 02/03/2021] [Indexed: 02/07/2023]
Abstract
The electron transport chain is a series of protein complexes embedded in the process of cellular respiration, which is an important process to transfer electrons and other macromolecules throughout the cell. Identifying Flavin Adenine Dinucleotide (FAD) binding sites in the electron transport chain is vital since it helps biological researchers precisely understand how electrons are produced and are transported in cells. This study distills and analyzes the contextualized word embedding from pre-trained BERT models to explore similarities in natural language and protein sequences. Thereby, we propose a new approach based on Pre-training of Bidirectional Encoder Representations from Transformers (BERT), Position-specific Scoring Matrix profiles (PSSM), Amino Acid Index database (AAIndex) to predict FAD-binding sites from the transport proteins which are found in nature recently. Our proposed approach achieves 85.14% accuracy and improves accuracy by 11%, with Matthews correlation coefficient of 0.39 compared to the previous method on the same independent set. We also deploy a web server that identifies FAD-binding sites in electron transporters available for academics at http://140.138.155.216/fadbert/.
Affiliation(s)
- Quang-Thai Ho
  - Department of Computer Science and Engineering, Yuan Ze University, Chung-Li, 32003, Taiwan; College of Information & Communication Technology, Can Tho University, Viet Nam
- Nguyen Quoc Khanh Le
  - Professional Master Program in Artificial Intelligence in Medicine, College of Medicine, Taipei Medical University, Taipei City, 106, Taiwan; Research Center for Artificial Intelligence in Medicine, Taipei Medical University, Taipei City, 106, Taiwan
- Yu-Yen Ou
  - Department of Computer Science and Engineering, Yuan Ze University, Chung-Li, 32003, Taiwan.
7
Jeon J, Yau WM, Tycko R. Millisecond Time-Resolved Solid-State NMR Reveals a Two-Stage Molecular Mechanism for Formation of Complexes between Calmodulin and a Target Peptide from Myosin Light Chain Kinase. J Am Chem Soc 2020; 142:21220-21232. [PMID: 33280387 DOI: 10.1021/jacs.0c11156] [Citation(s) in RCA: 28] [Impact Index Per Article: 5.6] [Indexed: 12/13/2022]
Abstract
Calmodulin (CaM) mediates a wide range of biological responses to changes in intracellular Ca2+ concentrations through its calcium-dependent binding affinities to numerous target proteins. Binding of two Ca2+ ions to each of the two four-helix-bundle domains of CaM results in major conformational changes that create a potential binding site for the CaM binding domain of a target protein, which also undergoes major conformational changes to form the complex with CaM. Details of the molecular mechanism of complex formation are not well established, despite numerous structural, spectroscopic, thermodynamic, and kinetic studies. Here, we report a study of the process by which the 26-residue peptide M13, which represents the CaM binding domain of skeletal muscle myosin light chain kinase, forms a complex with CaM in the presence of excess Ca2+ on the millisecond time scale. Our experiments use a combination of selective 13C labeling of CaM and M13, rapid mixing of CaM solutions with M13/Ca2+ solutions, rapid freeze-quenching of the mixed solutions, and low-temperature solid state nuclear magnetic resonance (ssNMR) enhanced by dynamic nuclear polarization. From measurements of the dependence of 2D 13C-13C ssNMR spectra on the time between mixing and freezing, we find that the N-terminal portion of M13 converts from a conformationally disordered state to an α-helix and develops contacts with the C-terminal domain of CaM in about 2 ms. The C-terminal portion of M13 becomes α-helical and develops contacts with the N-terminal domain of CaM more slowly, in about 8 ms. The level of structural order in the CaM/M13/Ca2+ complexes, indicated by 13C ssNMR line widths, continues to increase beyond 27 ms.
Affiliation(s)
- Jaekyun Jeon
  - Laboratory of Chemical Physics, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, Maryland 20892-0520, United States
- Wai-Ming Yau
  - Laboratory of Chemical Physics, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, Maryland 20892-0520, United States
- Robert Tycko
  - Laboratory of Chemical Physics, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, Maryland 20892-0520, United States
8
MFSC: Multi-voting based feature selection for classification of Golgi proteins by adopting the general form of Chou's PseAAC components. J Theor Biol 2019; 463:99-109. [DOI: 10.1016/j.jtbi.2018.12.017] [Citation(s) in RCA: 39] [Impact Index Per Article: 6.5] [Received: 09/03/2018] [Revised: 12/02/2018] [Accepted: 12/14/2018] [Indexed: 12/29/2022]
9
Kikuchi N, Fujiwara K, Ikeguchi M. β-Strand twisting/bending in soluble and transmembrane β-barrel structures. Proteins 2018; 86:1231-1241. [DOI: 10.1002/prot.25576] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 03/02/2018] [Revised: 06/05/2018] [Accepted: 06/22/2018] [Indexed: 01/03/2023]
10
An in silico structural and physicochemical characterization of TonB-dependent copper receptor in A. baumannii. Microb Pathog 2018. [DOI: 10.1016/j.micpath.2018.03.009] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Indexed: 12/30/2022]
11
Bersimis S, Sachlas A, Bagos PG. Discriminating membrane proteins using the joint distribution of length sums of success and failure runs. STAT METHOD APPL-GER 2017. [DOI: 10.1007/s10260-016-0370-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
12
Le NQK, Nguyen TTD, Ou YY. Identifying the molecular functions of electron transport proteins using radial basis function networks and biochemical properties. J Mol Graph Model 2017; 73:166-178. [DOI: 10.1016/j.jmgm.2017.01.003] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.9] [Received: 11/04/2016] [Revised: 12/26/2016] [Accepted: 01/04/2017] [Indexed: 10/20/2022]
13
Ofer D, Linial M. ProFET: Feature engineering captures high-level protein functions. Bioinformatics 2015; 31:3429-36. [DOI: 10.1093/bioinformatics/btv345] [Citation(s) in RCA: 55] [Impact Index Per Article: 5.5] [Received: 03/04/2015] [Accepted: 05/29/2015] [Indexed: 11/13/2022]
14
Yamamoto-Tamura K, Kawagishi I, Ogawa N, Fujii T. A putative porin gene of Burkholderia sp. NK8 involved in chemotaxis toward β-ketoadipate. Biosci Biotechnol Biochem 2015; 79:926-36. [PMID: 25649919 DOI: 10.1080/09168451.2015.1006571] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 10/24/2022]
Abstract
Burkholderia sp. NK8 can utilize 3-chlorobenzoate (3CB) as a sole source of carbon because it has a megaplasmid (pNK8) that carries the gene cluster (tfdT-CDEF) encoding chlorocatechol-degrading enzymes. The expression of tfdT-CDEF is induced by 3CB. In this study, we found that NK8 cells were attracted to 3CB and its degradation products, 3- and 4-chlorocatechol, and β-ketoadipate. Capillary assays revealed that a pNK8-eliminated strain (NK82) was defective in chemotaxis toward β-ketoadipate. The introduction of a plasmid carrying a putative outer membrane porin gene, which we name ompNK8, into strain NK82 restored chemotaxis toward β-ketoadipate. RT-PCR analyses demonstrated that the transcription of the ompNK8 gene was enhanced in the presence of 3CB.
Affiliation(s)
- Kimiko Yamamoto-Tamura
  - Environmental Biofunction Division, National Institute for Agro-Environmental Sciences, Tsukuba, Japan
15
Abstract
The outer membrane (OM) is the front line of leptospiral interactions with their environment and the mammalian host. Unlike most invasive spirochetes, pathogenic leptospires must be able to survive in both free-living and host-adapted states. As organisms move from one set of environmental conditions to another, the OM must cope with a series of conflicting challenges. For example, the OM must be porous enough to allow nutrient uptake, yet robust enough to defend the cell against noxious substances. In the host, the OM presents a surface decorated with adhesins and receptors for attaching to, and acquiring, desirable host molecules such as the complement regulator, Factor H. On the other hand, the OM must enable leptospires to evade detection by the host's immune system on their way from sites of invasion through the bloodstream to the protected niche of the proximal tubule. The picture that is emerging of the leptospiral OM is that, while it shares many of the characteristics of the OMs of spirochetes and Gram-negative bacteria, it is also unique and different in ways that make it of general interest to microbiologists. For example, unlike most other pathogenic spirochetes, the leptospiral OM is rich in lipopolysaccharide (LPS). Leptospiral LPS is similar to that of Gram-negative bacteria but has a number of unique structural features that may explain why it is not recognized by the LPS-specific Toll-like receptor 4 of humans. As in other spirochetes, lipoproteins are major components of the leptospiral OM, though their roles are poorly understood. The functions of transmembrane outer membrane proteins (OMPs) in many cases are better understood, thanks to homologies with their Gram-negative counterparts and the emergence of improved genetic techniques. This chapter will review recent discoveries involving the leptospiral OM and its role in leptospiral physiology and pathogenesis.
Affiliation(s)
- David A Haake
  - Division of Infectious Diseases, VA Greater Los Angeles Healthcare System, Los Angeles, CA, 90073, USA
16
Tripathi V, Tripathi P, Gupta D. Statistical approach for lysosomal membrane proteins (LMPs) identification. SYSTEMS AND SYNTHETIC BIOLOGY 2014; 8:313-9. [PMID: 26396655 PMCID: PMC4571724 DOI: 10.1007/s11693-014-9153-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/11/2014] [Revised: 06/11/2014] [Accepted: 07/26/2014] [Indexed: 10/25/2022]
Abstract
Discrimination of lysosomal membrane proteins (LMPs) from folding types of globular (GPs) and other membrane proteins (OtMPs) is an important task both for identifying LMPs from genomic sequences and for the successful prediction of their secondary and tertiary structures. We have systematically analyzed the amino acid frequencies as well as dipeptide count of GPs, LMPs and OtMPs. Based on the above calculated single amino acid frequency combined with dipeptide count information, we statistically discriminated LMPs from GPs and OtMPs. This approach correctly classified the LMPs with an accuracy of 95%. On the other hand, the amino acid frequency alone can discriminate LMPs with an accuracy of only 79%. Similarly, dipeptide count alone has an accuracy of 87% for the discrimination of LMPs. Thus the combined information of both amino acid frequencies and dipeptide composition gives significantly more accurate results.
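The two feature types named in this abstract, single amino acid frequencies and dipeptide counts, can be computed directly from a sequence string. The following is a minimal sketch under stated assumptions: the 20-letter alphabet, the function names, and the choice to concatenate 20 frequencies with 400 raw pair counts are illustrative, and the abstract does not specify the authors' exact encoding or statistical test.

```python
from collections import Counter
from itertools import product

# Assumption: the 20 standard amino acids; other symbols (X, B, Z, ...) are ignored.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_frequencies(sequence):
    """Fraction of each standard residue among the standard residues present."""
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    return {aa: (counts[aa] / total if total else 0.0) for aa in AMINO_ACIDS}


def dipeptide_counts(sequence):
    """Count of each ordered pair of adjacent standard residues (400 pairs)."""
    seq = sequence.upper()
    counts = {a + b: 0 for a, b in product(AMINO_ACIDS, repeat=2)}
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:
            counts[pair] += 1
    return counts


def feature_vector(sequence):
    """Concatenate 20 frequencies and 400 dipeptide counts into one list."""
    freqs = amino_acid_frequencies(sequence)
    pairs = dipeptide_counts(sequence)
    return [freqs[aa] for aa in AMINO_ACIDS] + [pairs[p] for p in sorted(pairs)]
```

Each sequence then maps to a fixed-length vector of 420 values, which is the kind of input a statistical or machine-learning discriminator can consume regardless of protein length.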
Affiliation(s)
- Vijay Tripathi
  - Center of Bioinformatics, University of Allahabad, Allahabad, India
  - Genome Diversity Center, The Institute of Evolution, University of Haifa, Haifa, Israel
- Pooja Tripathi
  - Center of Bioinformatics, University of Allahabad, Allahabad, India
- Dwijendra Gupta
  - Center of Bioinformatics, University of Allahabad, Allahabad, India
  - Department of Biochemistry, University of Allahabad, Allahabad, India
17
RETRACTED: Identifying halophilic proteins based on random forests with preprocessing of the pseudo-amino acid composition. J Theor Biol 2014; 361:175-81. [DOI: 10.1016/j.jtbi.2014.07.017] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 03/21/2014] [Revised: 07/14/2014] [Accepted: 07/15/2014] [Indexed: 01/07/2023]
18
Guilvout I, Chami M, Disconzi E, Bayan N, Pugsley AP, Huysmans GHM. Independent domain assembly in a trapped folding intermediate of multimeric outer membrane secretins. Structure 2014; 22:582-9. [PMID: 24657091 DOI: 10.1016/j.str.2014.02.009] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.0] [Received: 11/11/2013] [Revised: 01/27/2014] [Accepted: 02/11/2014] [Indexed: 11/28/2022]
Abstract
The outer membrane portal of the Klebsiella oxytoca type II secretion system, PulD, is a prototype of a family of proteins, the secretins, which are essential components of many bacterial secretion and pilus assembly machines. PulD is a homododecamer with a periplasmic vestibule and an outer chamber on either side of a membrane-spanning region that is poorly resolved by electron microscopy. Membrane insertion involves the formation of a dodecameric membrane-embedded intermediate. Here, we describe an amino acid substitution in PulD that blocks its assembly at this intermediate "prepore" stage. Electron microscopy indicated that the prepore has an apparently normal periplasmic vestibule but a poorly organized outer chamber. A peptide loop around this amino acid appears to be important for the formation/stability of the fully folded complex. A similar assembly intermediate results from creation of the same amino acid substitution in the Pseudomonas aeruginosa secretin XcpQ.
Affiliation(s)
- Ingrid Guilvout
  - Molecular Genetics Unit, Departments of Microbiology and of Structural Biology and Chemistry, Institut Pasteur, rue du Dr. Roux, 75724 Paris Cedex 15, France; CNRS ERL3526, rue du Dr. Roux, 75724 Paris Cedex 15, France
- Mohamed Chami
  - C-CINA Center for Cellular Imaging and NanoAnalytics, Biozentrum, University of Basel, 4058 Basel, Switzerland
- Elena Disconzi
  - Molecular Genetics Unit, Departments of Microbiology and of Structural Biology and Chemistry, Institut Pasteur, rue du Dr. Roux, 75724 Paris Cedex 15, France; CNRS ERL3526, rue du Dr. Roux, 75724 Paris Cedex 15, France; Institut de Biochimie et de Biophysique Moléculaire et Cellulaire, Université de Paris-Sud, 91405 Orsay, France; CNRS UMR 8619, 91405 Orsay, France
- Nicolas Bayan
  - Institut de Biochimie et de Biophysique Moléculaire et Cellulaire, Université de Paris-Sud, 91405 Orsay, France; CNRS UMR 8619, 91405 Orsay, France
- Anthony P Pugsley
  - Molecular Genetics Unit, Departments of Microbiology and of Structural Biology and Chemistry, Institut Pasteur, rue du Dr. Roux, 75724 Paris Cedex 15, France; CNRS ERL3526, rue du Dr. Roux, 75724 Paris Cedex 15, France.
- Gerard H M Huysmans
  - Molecular Genetics Unit, Departments of Microbiology and of Structural Biology and Chemistry, Institut Pasteur, rue du Dr. Roux, 75724 Paris Cedex 15, France; CNRS ERL3526, rue du Dr. Roux, 75724 Paris Cedex 15, France.
19
Saravanan K, Krishnaswamy S. Analysis of dihedral angle preferences for alanine and glycine residues in alpha and beta transmembrane regions. J Biomol Struct Dyn 2014; 33:552-62. [DOI: 10.1080/07391102.2014.895678] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 02/03/2023]
20
Ni Q, Zou L. Accurate discrimination of outer membrane proteins using secondary structure element alignment and support vector machine. J Bioinform Comput Biol 2014; 12:1450003. [PMID: 24467761 DOI: 10.1142/s0219720014500036] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/18/2022]
Abstract
Outer membrane proteins (OMPs) play critical roles in many cellular processes, and discriminating OMPs from other types of proteins is very important for OMP identification in bacterial genomic proteins. In this study, a method, SSEA_SVM, is developed using secondary structure element alignment and support vector machine. Moreover, a novel kernel function is designed to utilize secondary structure information in the support vector machine classifier. A benchmark dataset, which consists of 208 OMPs, 673 globular proteins, and 206 α-helical membrane proteins, is used to evaluate the performance of SSEA_SVM. A high accuracy of 97.7% with 0.926 MCC is achieved when SSEA_SVM is applied to discriminating OMPs and non-OMPs. In comparison with existing methods in the literature, SSEA_SVM is also highly competitive. We suggest that SSEA_SVM is a much more promising method to identify OMPs in genomic proteins. A web server that implements SSEA_SVM is freely available at http://bioinfo.tmmu.edu.cn/SSEA_SVM/.
Affiliation(s)
- Qingshan Ni
- Department of Microbiology, College of Basic Medical Sciences, Third Military Medical University, No. 30 Gaotanyan Road, Shapingba District, Chongqing 400038, P. R. China
|
21
|
Ru B, 't Hoen PAC, Nie F, Lin H, Guo FB, Huang J. PhD7Faster: predicting clones propagating faster from the Ph.D.-7 phage display peptide library. J Bioinform Comput Biol 2014; 12:1450005. [PMID: 24467763 DOI: 10.1142/s021972001450005x] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/28/2022]
Abstract
Phage display can rapidly discover peptides binding to any given target; thus, it has been widely used in basic and applied research. Each round of panning consists of two basic processes: selection and amplification. However, recent studies have shown that the amplification step would decrease the diversity of phage display libraries due to the different propagation capacity of phage clones. This may induce phages with growth advantage rather than specific affinity to appear in the final experimental results. The peptides displayed by such phages are termed propagation-related target-unrelated peptides (PrTUPs). They would mislead further analysis and research if not removed. In this paper, we describe PhD7Faster, an ensemble predictor based on support vector machine (SVM) for predicting clones with growth advantage from the Ph.D.-7 phage display peptide library. By using reduced dipeptide composition (ReDPC) as features, an accuracy (Acc) of 79.67% and a Matthews correlation coefficient (MCC) of 0.595 were achieved in 5-fold cross-validation. In addition, the SVM-based model was demonstrated to perform better than several representative machine learning algorithms. We anticipate that PhD7Faster can assist biologists to exclude potential PrTUPs and accelerate the finding of specific binders from the popular Ph.D.-7 library. The web server of PhD7Faster can be freely accessed at http://immunet.cn/sarotup/cgi-bin/PhD7Faster.pl.
Affiliation(s)
- Beibei Ru
- Center of Bioinformatics (COBI), Key Laboratory for NeuroInformation of Ministry of Education, University of Electronic Science and Technology of China, Chengdu 610054, P. R. China
|
22
|
Yan R, Lin J, Chen Z, Wang X, Huang L, Cai W, Zhang Z. Prediction of outer membrane proteins by combining the position- and composition-based features of sequence profiles. Molecular BioSystems 2014; 10:1004-13. [DOI: 10.1039/c3mb70435a] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/21/2022]
|
23
|
AcalPred: a sequence-based tool for discriminating between acidic and alkaline enzymes. PLoS One 2013; 8:e75726. [PMID: 24130738 PMCID: PMC3794003 DOI: 10.1371/journal.pone.0075726] [Citation(s) in RCA: 81] [Impact Index Per Article: 6.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/19/2013] [Accepted: 08/16/2013] [Indexed: 11/19/2022] Open
Abstract
The structure and activity of enzymes are influenced by pH value of their surroundings. Although many enzymes work well in the pH range from 6 to 8, some specific enzymes have good efficiencies only in acidic (pH<5) or alkaline (pH>9) solution. Studies have demonstrated that the activities of enzymes correlate with their primary sequences. It is crucial to judge enzyme adaptation to acidic or alkaline environment from its amino acid sequence in molecular mechanism clarification and the design of high efficient enzymes. In this study, we developed a sequence-based method to discriminate acidic enzymes from alkaline enzymes. The analysis of variance was used to choose the optimized discriminating features derived from g-gap dipeptide compositions. And support vector machine was utilized to establish the prediction model. In the rigorous jackknife cross-validation, the overall accuracy of 96.7% was achieved. The method can correctly predict 96.3% acidic and 97.1% alkaline enzymes. Through the comparison between the proposed method and previous methods, it is demonstrated that the proposed method is more accurate. On the basis of this proposed method, we have built an online web-server called AcalPred which can be freely accessed from the website (http://lin.uestc.edu.cn/server/AcalPred). We believe that the AcalPred will become a powerful tool to study enzyme adaptation to acidic or alkaline environment.
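Note: the abstract above uses g-gap dipeptide compositions as features. The sketch below follows one commonly used definition (the frequency of residue pairs separated by g intervening positions) and is an assumption, not the AcalPred implementation. The function name and the handling of non-standard residues are hypothetical choices.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def g_gap_dipeptide_composition(sequence, gap=0):
    """Return a dict mapping each of the 400 residue pairs to its frequency.

    A pair is (sequence[i], sequence[i + gap + 1]); gap=0 gives the
    ordinary adjacent dipeptide composition. Pairs containing a
    non-standard residue are skipped (an assumption of this sketch).
    """
    all_pairs = [a + b for a, b in product(AMINO_ACIDS, repeat=2)]
    n_positions = len(sequence) - gap - 1
    observed = Counter(
        sequence[i] + sequence[i + gap + 1] for i in range(max(n_positions, 0))
    )
    valid_total = sum(observed[pair] for pair in all_pairs)
    if valid_total == 0:
        raise ValueError("sequence is too short for this gap or has no standard pairs")
    return {pair: observed[pair] / valid_total for pair in all_pairs}


# Toy sequence, for illustration only.
features = g_gap_dipeptide_composition("ACAC", gap=1)
print(features["AA"], features["CC"])
```

Each choice of gap yields a 400-dimensional vector, so a feature-selection step such as the analysis of variance mentioned in the abstract is a natural follow-up.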
|
24
|
Feher VA, Randall A, Baldi P, Bush RM, de la Maza LM, Amaro RE. A 3-dimensional trimeric β-barrel model for Chlamydia MOMP contains conserved and novel elements of Gram-negative bacterial porins. PLoS One 2013; 8:e68934. [PMID: 23935908 PMCID: PMC3723809 DOI: 10.1371/journal.pone.0068934] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/20/2013] [Accepted: 06/04/2013] [Indexed: 01/17/2023] Open
Abstract
Chlamydia trachomatis is the most prevalent cause of bacterial sexually transmitted diseases and the leading cause of preventable blindness worldwide. Global control of Chlamydia will best be achieved with a vaccine, a primary target for which is the major outer membrane protein, MOMP, which comprises ~60% of the outer membrane protein mass of this bacterium. In the absence of experimental structural information on MOMP, three previously published topology models presumed a 16-stranded barrel architecture. Here, we use the latest β-barrel prediction algorithms, previous 2D topology modeling results, and comparative modeling methodology to build a 3D model based on the 16-stranded, trimeric assumption. We find that while a 3D MOMP model captures many structural hallmarks of a trimeric 16-stranded β-barrel porin, and is consistent with most of the experimental evidence for MOMP, MOMP residues 320-334 cannot be modeled as β-strands that span the entire membrane, as is consistently observed in published 16-stranded β-barrel crystal structures. Given the ambiguous results for β-strand delineation found in this study, recent publications of membrane β-barrel structures breaking with the canonical rule for an even number of β-strands, findings of β-barrels with strand-exchanged oligomeric conformations, and alternate folds dependent upon the lifecycle of the bacterium, we suggest that although the MOMP porin structure incorporates canonical 16-stranded conformations, it may have novel oligomeric or dynamic structural changes accounting for the discrepancies observed.
Affiliation(s)
- Victoria A. Feher
- Department Chemistry and Biochemistry, University of California San Diego, San Diego, California, United States of America
| | - Arlo Randall
- School of Information and Computer Sciences, University of California Irvine, Irvine, California, United States of America
- Institute for Genomics and Bioinformatics, University of California Irvine, Irvine, California, United States of America
| | - Pierre Baldi
- School of Information and Computer Sciences, University of California Irvine, Irvine, California, United States of America
- Institute for Genomics and Bioinformatics, University of California Irvine, Irvine, California, United States of America
| | - Robin M. Bush
- Department of Ecology and Evolutionary Biology, University of California Irvine, Irvine, California, United States of America
| | - Luis M. de la Maza
- Department of Pathology and Laboratory Medicine, University of California Irvine, Irvine, California, United States of America
| | - Rommie E. Amaro
- Department Chemistry and Biochemistry, University of California San Diego, San Diego, California, United States of America
|
25
|
Ou YY, Chen SA, Chang YM, Velmurugan D, Fukui K, Michael Gromiha M. Identification of efflux proteins using efficient radial basis function networks with position-specific scoring matrices and biochemical properties. Proteins 2013; 81:1634-43. [DOI: 10.1002/prot.24322] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/05/2013] [Revised: 04/11/2013] [Accepted: 04/19/2013] [Indexed: 11/11/2022]
Affiliation(s)
- Yu-Yen Ou
- Department of Computer Science and Engineering; Yuan Ze University; Chung-Li Taiwan
| | - Shu-An Chen
- Department of Computer Science and Engineering; Yuan Ze University; Chung-Li Taiwan
| | - Yun-Min Chang
- Department of Computer Science and Engineering; Yuan Ze University; Chung-Li Taiwan
| | - Devadasan Velmurugan
- Department of Crystallography and Biophysics; University of Madras; Chennai 600025 Tamilnadu India
| | - Kazuhiko Fukui
- Computational Biology Research Center (CBRC), National Institute of Advanced Industrial Science and Technology (AIST); 2-43 Aomi Koto-ku Tokyo 135-0064 Japan
| | - M. Michael Gromiha
- Department of Biotechnology, Indian Institute of Technology (IIT) Madras; Chennai 600036 Tamilnadu India
|
26
|
Thangakani AM, Kumar S, Velmurugan D, Gromiha MM. Distinct position-specific sequence features of hexa-peptides that form amyloid-fibrils: application to discriminate between amyloid fibril and amorphous β-aggregate forming peptide sequences. BMC Bioinformatics 2013; 14 Suppl 8:S6. [PMID: 23815227 PMCID: PMC3654898 DOI: 10.1186/1471-2105-14-s8-s6] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/16/2023] Open
Abstract
Background: Comparison of short peptides which form amyloid-fibrils with their homologues that may form amorphous β-aggregates but not fibrils, can aid development of novel amyloid-containing nanomaterials with well defined morphologies and characteristics. The knowledge gained from the comparative analysis could also be applied towards identifying potential aggregation prone regions in proteins, which are important for biotechnology applications or have been implicated in neurodegenerative diseases. In this work we have systematically analyzed a set of 139 amyloid-fibril hexa-peptides along with a highly homologous set of 168 hexa-peptides that do not form amyloid fibrils for their position-wise as well as overall amino acid compositions and averages of 49 selected amino acid properties. Results: Amyloid-fibril forming peptides show distinct preferences and avoidances for amino acid residues to occur at each of the six positions. As expected, the amyloid fibril peptides are also more hydrophobic than non-amyloid peptides. We have used the results of this analysis to develop statistical potential energy values for the 20 amino acid residues to occur at each of the six different positions in the hexa-peptides. The distribution of the potential energy values in 139 amyloid and 168 non-amyloid fibrils are distinct and the amyloid-fibril peptides tend to be more stable (lower total potential energy values) than non-amyloid peptides. The average frequency of occurrence of these peptides with lower than specific cutoff energies at different positions is 72% and 50%, respectively. The potential energy values were used to devise a statistical discriminator to distinguish between amyloid-fibril and non-amyloid peptides. Our method could identify the amyloid-fibril forming hexa-peptides to an accuracy of 89%. On the other hand, the accuracy of identifying non-amyloid peptides was only 54%. Further attempts were made to improve the prediction accuracy via machine learning. This resulted in an overall accuracy of 82.7% with the sensitivity and specificity of 81.3% and 83.9%, respectively, in 10-fold cross-validation method. Conclusions: Amyloid-fibril forming hexa-peptides show position specific sequence features that are different from those which may form amorphous β-aggregates. These positional preferences are found to be important features for discriminating amyloid-fibril forming peptides from their homologues that don't form amyloid-fibrils.
Affiliation(s)
- A Mary Thangakani
- Department of Crystallography and Biophysics, University of Madras, Chennai 600025, India
|
27
|
Gromiha MM, Ou YY. Bioinformatics approaches for functional annotation of membrane proteins. Brief Bioinform 2013; 15:155-68. [DOI: 10.1093/bib/bbt015] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022] Open
|
28
|
MetaLocGramN: A meta-predictor of protein subcellular localization for Gram-negative bacteria. Biochimica et Biophysica Acta-Proteins and Proteomics 2012; 1824:1425-33. [PMID: 22705560 DOI: 10.1016/j.bbapap.2012.05.018] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Received: 03/01/2012] [Revised: 05/20/2012] [Accepted: 05/31/2012] [Indexed: 12/29/2022]
Abstract
Subcellular localization is a key functional characteristic of proteins. It is determined by signals encoded in the protein sequence. The experimental determination of subcellular localization is laborious. Thus, a number of computational methods have been developed to predict the protein location from sequence. However predictions made by different methods often disagree with each other and it is not always clear which algorithm performs best for the given cellular compartment. We benchmarked primary subcellular localization predictors for proteins from Gram-negative bacteria, PSORTb3, PSLpred, CELLO, and SOSUI-GramN, on a common dataset that included 1056 proteins. We found that PSORTb3 performs best on the average, but is outperformed by other methods in predictions of extracellular proteins. This motivated us to develop a meta-predictor, which combines the primary methods by using the logistic regression models, to take advantage of their combined strengths, and to eliminate their individual weaknesses. MetaLocGramN runs the primary methods, and based on their output classifies protein sequences into one of five major localizations of the Gram-negative bacterial cell: cytoplasm, plasma membrane, periplasm, outer membrane, and extracellular space. MetaLocGramN achieves the average Matthews correlation coefficient of 0.806, i.e. 12% better than the best individual primary method. MetaLocGramN is a meta-predictor specialized in predicting subcellular localization for proteins from Gram-negative bacteria. According to our benchmark, it performs better than all other tools run independently. MetaLocGramN is a web and SOAP server available for free use by all academic users at the URL http://iimcb.genesilico.pl/MetaLocGramN. This article is part of a Special Issue entitled: Computational Methods for Protein Interaction and Structural Prediction.
|
29
|
Chen Y, Chen F, Yang JY, Yang MQ. Ensemble voting system for multiclass protein fold recognition. Int J Pattern Recogn 2011. [DOI: 10.1142/s0218001408006454] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022]
Abstract
Protein structure classification is an important issue in understanding the associations between sequence and structure as well as possible functional and evolutionary relationships. Recently structural genomes initiatives and other high-throughput experiments have populated the biological databases at a rapid pace. In this paper, three types of classifiers, k nearest neighbors, class center and nearest neighbor and probabilistic neural networks and their homogenous ensemble for multiclass protein fold recognition problem are evaluated firstly, and then a heterogenous ensemble Voting System is designed for the same problem. The different features and/or their combinations extracted from the protein fold dataset are used in these classification models. The heterogenous classification results are then put into a voting system to get the final result. The experimental results show that the proposed method can improve prediction accuracy by 4%–10% on a benchmark dataset containing 27 SCOP folds.
Affiliation(s)
- Yuehui Chen
- School of Information Science and Engineering, University of Jinan, 106 Jiwei Road, 250022 Jinan, P. R. China
| | - Feng Chen
- School of Software, University of Electronic Science and Technology of China, Chengdu 610054, P. R. China
| | - Jack Y. Yang
- Harvard Medical School, Harvard University, P.O. Box 400888, Cambridge, MA 02140, USA
| | - Mary Qu Yang
- National Human Genome Research Institute, National Institutes of Health, US Department of Health and Human Services Bethesda, MD 20852, USA
|
30
|
Liang GZ, Ma XY, Li YC, Lv FL, Yang L. Toward an improved discrimination of outer membrane proteins using a sequence-based approach. Biosystems 2011; 105:101-6. [PMID: 21440034 DOI: 10.1016/j.biosystems.2011.03.008] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/29/2010] [Revised: 03/16/2011] [Accepted: 03/16/2011] [Indexed: 11/26/2022]
Abstract
This article offers a novel sequence-based approach to discriminate outer membrane proteins (OMPs). The first step is to use a new representation approach, factor analysis scales of generalized amino acid information (FASGAI) representing hydrophobicity, alpha and turn propensities, bulky properties, compositional characteristics, local flexibility and electronic properties, etc., to characterize sequences of OMPs and non-OMPs. The subsequent data is then transformed into a uniform matrix by the auto cross covariance (ACC). The second step is to develop discrimination predictors of OMPs from non-OMPs using a support vector machine (SVM). The SVM predictors thus successfully produce a high Matthews correlation coefficient (MCC) of 0.916 on 208 OMPs from non-OMPs including 206 α-helical membrane proteins and 673 globular proteins by a fivefold cross validation test. Meanwhile, overall MCC values of 0.923 and 0.930 are obtained for the discrimination OMPs from the α-helical membrane proteins and the globular proteins, respectively. The results demonstrate that the FASGAI-ACC-SVM combination approach shows great prospect of application in the field of bioinformatics or proteomics studies.
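Note: the abstract above converts per-residue descriptor values into a fixed-size representation with the auto cross covariance (ACC) transform, so that sequences of different lengths become comparable. The sketch below uses a widely cited formulation of ACC and is an assumption, not the authors' code. The function name and the toy descriptor values are hypothetical (they are not FASGAI scores).

```python
def auto_cross_covariance(matrix, max_lag):
    """Flatten a per-residue descriptor matrix into a fixed-length vector.

    matrix: list of rows, one per residue, each with the same number of
    descriptor values. For every lag in 1..max_lag and every ordered pair
    of descriptors (j1, j2), the mean-centred product of descriptor j1 at
    position i and descriptor j2 at position i + lag is averaged over i.
    j1 == j2 gives the auto covariance terms; j1 != j2 gives the cross
    covariance terms. Output length is max_lag * p * p for p descriptors.
    """
    n = len(matrix)
    if n <= max_lag:
        raise ValueError("sequence must be longer than max_lag")
    p = len(matrix[0])
    means = [sum(row[j] for row in matrix) / n for j in range(p)]
    features = []
    for lag in range(1, max_lag + 1):
        for j1 in range(p):
            for j2 in range(p):
                total = sum(
                    (matrix[i][j1] - means[j1]) * (matrix[i + lag][j2] - means[j2])
                    for i in range(n - lag)
                )
                features.append(total / (n - lag))
    return features


# Toy single-descriptor sequence, for illustration only.
print(auto_cross_covariance([[1.0], [3.0], [1.0], [3.0]], max_lag=2))
```

Because the output length depends only on max_lag and the number of descriptors, the resulting vectors can be passed directly to a fixed-input classifier such as the SVM described in the abstract.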
Affiliation(s)
- Gui-Zhao Liang
- Key Laboratory of Biorheological Science and Technology, Ministry of Education, Bioengineering College, Chongqing University, Shazheng Street 174#, Chongqing 400044, China.
|
31
|
Outer membrane proteins can be simply identified using secondary structure element alignment. BMC Bioinformatics 2011; 12:76. [PMID: 21414186 PMCID: PMC3072342 DOI: 10.1186/1471-2105-12-76] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/02/2010] [Accepted: 03/17/2011] [Indexed: 02/04/2023] Open
Abstract
Background: Outer membrane proteins (OMPs) are frequently found in the outer membranes of gram-negative bacteria, mitochondria and chloroplasts and have been found to play diverse functional roles. Computational discrimination of OMPs from globular proteins and other types of membrane proteins is helpful to accelerate new genome annotation and drug discovery. Results: Based on the observation that almost all OMPs consist of antiparallel β-strands in a barrel shape and that their secondary structure arrangements differ from those of other types of proteins, we propose a simple method called SSEA-OMP to identify OMPs using secondary structure element alignment. Through intensive benchmark experiments, the proposed SSEA-OMP method is better than some well-established OMP detection methods. Conclusions: The major advantage of SSEA-OMP is its good prediction performance considering its simplicity. The web server that implements the method is freely accessible at http://protein.cau.edu.cn/SSEA-OMP/index.html.
|
32
|
Lin H, Ding H. Predicting ion channels and their types by the dipeptide mode of pseudo amino acid composition. J Theor Biol 2011; 269:64-9. [DOI: 10.1016/j.jtbi.2010.10.019] [Citation(s) in RCA: 110] [Impact Index Per Article: 7.9] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/13/2010] [Revised: 08/31/2010] [Accepted: 10/15/2010] [Indexed: 12/11/2022]
|
33
|
Mizianty MJ, Kurgan L. Improved identification of outer membrane beta barrel proteins using primary sequence, predicted secondary structure, and evolutionary information. Proteins 2010; 79:294-303. [DOI: 10.1002/prot.22882] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/10/2022]
|
34
|
Tanaka T, Niwa H, Yutani K, Kuramitsu S, Yokoyama S, Kumarevel T. Crystal structure of TTHA0061, an uncharacterized protein from Thermus thermophilus HB8, reveals a novel fold. Biochem Biophys Res Commun 2010; 400:258-264. [PMID: 20728427 DOI: 10.1016/j.bbrc.2010.08.054] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/10/2010] [Accepted: 08/17/2010] [Indexed: 05/29/2023]
Abstract
The crystal structure of an uncharacterized protein TTHA0061 from Thermus thermophilus HB8 was determined and refined to 1.8 Å by a single wavelength anomalous dispersion (SAD) method. The structural analysis and comparison of TTHA0061 with other existing structures in the Protein Data Bank (PDB) revealed a novel fold, suggesting that this protein may belong to a translation initiation factor or ribosomal protein family. Differential scanning calorimetry analysis suggested that the thermostability of TTHA0061 increased at pH ranges of 5.8-6.2, perhaps due to the abundance of glutamic acid residues.
Affiliation(s)
- Tomoyuki Tanaka
- RIKEN SPring-8 Center, Harima Institute, 1-1-1 Kouto, Sayo, Hyogo 679-5148, Japan
|
35
|
iFC²: an integrated web-server for improved prediction of protein structural class, fold type, and secondary structure content. Amino Acids 2010; 40:963-73. [PMID: 20730460 DOI: 10.1007/s00726-010-0721-1] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/09/2010] [Accepted: 08/06/2010] [Indexed: 10/19/2022]
Abstract
Several descriptors of protein structure at the sequence and residue levels have been recently proposed. They are widely adopted in the analysis and prediction of structural and functional characteristics of proteins. Numerous in silico methods have been developed for sequence-based prediction of these descriptors. However, many of them do not have a public web-server and only a few integrate multiple descriptors to improve the predictions. We introduce iFC² (integrated prediction of fold, class, and content) server that is the first to integrate three modern predictors of sequence-level descriptors. They concern fold type (PFRES), structural class (SCEC), and secondary structure content (PSSC-core). The server exploits relations between the three descriptors to implement a cross-evaluation procedure that improves over the predictions of the individual methods. The iFC² annotates fold and class predictions as potentially correct/incorrect. When tested on datasets with low-similarity chains, for the fold prediction iFC² labels 82% of the PFRES predictions as correct and the accuracy of these predictions equals 72%. The accuracy of the remaining 28% of the PFRES predictions equals 38%. Similarly, our server assigns correct labels for over 79% of SCEC predictions, which are shown to be 98% accurate, while the remaining SCEC predictions are only 15% accurate. These results are shown to be competitive when contrasted against recent relevant web-servers. Predictions on CASP8 targets show that the content predicted by iFC² is competitive when compared with the content computed from the tertiary structures predicted by three best-performing methods in CASP8. The iFC² server is available at http://biomine.ece.ualberta.ca/1D/1D.html .
|
36
|
|
37
|
Ou YY, Chen SA, Gromiha MM. Classification of transporters using efficient radial basis function networks with position-specific scoring matrices and biochemical properties. Proteins 2010; 78:1789-97. [DOI: 10.1002/prot.22694] [Citation(s) in RCA: 53] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022]
|
38
|
Mizianty MJ, Kurgan L. Modular prediction of protein structural classes from sequences of twilight-zone identity with predicting sequences. BMC Bioinformatics 2009; 10:414. [PMID: 20003388 PMCID: PMC2805645 DOI: 10.1186/1471-2105-10-414] [Citation(s) in RCA: 71] [Impact Index Per Article: 4.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/06/2009] [Accepted: 12/13/2009] [Indexed: 11/13/2022] Open
Abstract
Background: Knowledge of structural class is used by numerous methods for identification of structural/functional characteristics of proteins and could be used for the detection of remote homologues, particularly for chains that share twilight-zone similarity. In contrast to existing sequence-based structural class predictors, which target four major classes and which are designed for high identity sequences, we predict seven classes from sequences that share twilight-zone identity with the training sequences. Results: The proposed MODular Approach to Structural class prediction (MODAS) method is unique as it allows for selection of any subset of the classes. MODAS is also the first to utilize a novel, custom-built feature-based sequence representation that combines evolutionary profiles and predicted secondary structure. The features quantify information relevant to the definition of the classes including conservation of residues and arrangement and number of helix/strand segments. Our comprehensive design considers 8 feature selection methods and 4 classifiers to develop Support Vector Machine-based classifiers that are tailored for each of the seven classes. Tests on 5 twilight-zone and 1 high-similarity benchmark datasets and comparison with over two dozen modern competing predictors show that MODAS provides the best overall accuracy that ranges between 80% and 96.7% (83.5% for the twilight-zone datasets), depending on the dataset. This translates into 19% and 8% error rate reduction when compared against the best performing competing method on two largest datasets. The proposed predictor provides accurate predictions at 58% accuracy for membrane proteins class, which is not considered by majority of existing methods, despite the fact that this class accounts for only 2% of the data. Our predictive model is analyzed to demonstrate how and why the input features are associated with the corresponding classes. Conclusions: The improved predictions stem from the novel features that express collocation of the secondary structure segments in the protein sequence and that combine evolutionary and secondary structure information. Our work demonstrates that conservation and arrangement of the secondary structure segments predicted along the protein chain can successfully predict structural classes which are defined based on the spatial arrangement of the secondary structures. A web server is available at http://biomine.ece.ualberta.ca/MODAS/.
Affiliation(s)
- Marcin J Mizianty
- Department of Electrical and Computer Engineering, University of Alberta, Edmonton, Canada.
|
39
|
Gao QB, Ye XF, Jin ZC, He J. Improving discrimination of outer membrane proteins by fusing different forms of pseudo amino acid composition. Anal Biochem 2009; 398:52-9. [PMID: 19874797 DOI: 10.1016/j.ab.2009.10.040] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/12/2009] [Revised: 10/21/2009] [Accepted: 10/22/2009] [Indexed: 10/20/2022]
Abstract
Integral membrane proteins are central to many cellular processes and constitute approximately 50% of potential targets for novel drugs. However, the number of outer membrane proteins (OMPs) present in the public structure database is very limited due to the difficulties in determining structure with experimental methods. Therefore, discriminating OMPs from non-OMPs with computational methods is of medical importance as well as genome sequencing necessity. In this study, some sequence-derived structural and physicochemical features of proteins were incorporated with amino acid composition to discriminate OMPs from non-OMPs using support vector machines. The discrimination performance of the proposed method is evaluated on a benchmark dataset of 208 OMPs, 673 globular proteins, and 206 alpha-helical membrane proteins. A high overall accuracy of 97.8% was observed in the 5-fold cross-validation test. In addition, the current method distinguished OMPs from globular proteins and alpha-helical membrane proteins with overall accuracies of 98.2 and 96.4%, respectively. The prediction performance is superior to the state-of-the-art methods in the literature. It is anticipated that the current method might be a powerful tool for the discrimination of OMPs.
Affiliation(s)
- Qing-Bin Gao
- Department of Health Statistics, Second Military Medical University, No. 800 Xiangyin Road, Shanghai 200433, China.
|
40
|
Using auto covariance method for functional discrimination of membrane proteins based on evolution information. Amino Acids 2009; 38:1497-503. [DOI: 10.1007/s00726-009-0362-4] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/24/2009] [Accepted: 09/24/2009] [Indexed: 11/29/2022]
|
41
|
Díaz-Mejía JJ, Babu M, Emili A. Computational and experimental approaches to chart the Escherichia coli cell-envelope-associated proteome and interactome. FEMS Microbiol Rev 2008; 33:66-97. [PMID: 19054114 PMCID: PMC2704936 DOI: 10.1111/j.1574-6976.2008.00141.x] [Citation(s) in RCA: 46] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/13/2023] Open
Abstract
The bacterial cell-envelope consists of a complex arrangement of lipids, proteins and carbohydrates that serves as the interface between a microorganism and its environment or, with pathogens, a human host. Escherichia coli has long been investigated as a leading model system to elucidate the fundamental mechanisms underlying microbial cell-envelope biology. This includes extensive descriptions of the molecular identities, biochemical activities and evolutionary trajectories of integral transmembrane proteins, many of which play critical roles in infectious disease and antibiotic resistance. Strikingly, however, only half of the c. 1200 putative cell-envelope-related proteins of E. coli currently have experimentally attributed functions, indicating an opportunity for discovery. In this review, we summarize the state of the art of computational and proteomic approaches for determining the components of the E. coli cell-envelope proteome, as well as exploring the physical and functional interactions that underlie its biogenesis and functionality. We also provide a comprehensive comparative benchmarking analysis on the performance of different bioinformatic and proteomic methods commonly used to determine the subcellular localization of bacterial proteins.
Affiliation(s)
- Juan Javier Díaz-Mejía
- Banting and Best Department of Medical Research, Terrence Donnelly Center for Cellular and Biomolecular Research, University of Toronto, Toronto, ON, Canada
|
42
|
Waldispühl J, O'Donnell CW, Devadas S, Clote P, Berger B. Modeling ensembles of transmembrane beta-barrel proteins. Proteins 2008; 71:1097-112. [PMID: 18004792 DOI: 10.1002/prot.21788] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.4] [Indexed: 12/13/2022]
Abstract
Transmembrane beta-barrel (TMB) proteins are embedded in the outer membrane of gram-negative bacteria, mitochondria, and chloroplasts. Despite their importance, very few nonhomologous TMB structures have been determined by X-ray diffraction because of the experimental difficulty encountered in crystallizing transmembrane proteins. We introduce the program partiFold to investigate the folding landscape of TMBs. By computing the Boltzmann partition function, partiFold estimates inter-beta-strand residue interaction probabilities, predicts contacts and per-residue X-ray crystal structure B-values, and samples conformations from the Boltzmann low-energy ensemble. This broad range of predictive capabilities is achieved using a single, parameterizable grammatical model to describe potential beta-barrel supersecondary structures, combined with a novel energy function of stacked amino acid pair statistical potentials. partiFold outperforms existing programs for inter-beta-strand residue contact prediction on TMB proteins, offering both higher average predictive accuracy and more consistent results. Moreover, the integration of these contact probabilities inside a stochastic contact map can be used to infer a more meaningful picture of the TMB folding landscape, which cannot be achieved with other methods. partiFold's predictions of B-values are competitive with recent methods specifically designed for this problem. Finally, we show that sampling TMBs from the Boltzmann ensemble matches the X-ray crystal structure better than single structure prediction methods. A webserver running partiFold is available at http://partiFold.csail.mit.edu/.
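The Boltzmann-ensemble idea in this abstract can be illustrated with a minimal sketch. It is not partiFold's algorithm: the abstract describes a grammatical model and a statistical-potential energy function, whereas the toy below simply assumes that a small list of conformation energies is already available. The function names, the RT value, and the energy units are assumptions made for illustration only.

```python
import math
import random

# Assumed thermal energy in kcal/mol near room temperature (illustrative value).
RT = 0.593


def boltzmann_probabilities(energies, rt=RT):
    """Return P(s) = exp(-E(s)/RT) / Z for each conformation energy E(s)."""
    e_min = min(energies)  # shift energies so the largest weight is exp(0)
    weights = [math.exp(-(e - e_min) / rt) for e in energies]
    z = sum(weights)  # partition function, up to the constant shift
    return [w / z for w in weights]


def sample_conformations(energies, n, rt=RT, seed=0):
    """Draw n conformation indices according to their Boltzmann probabilities."""
    rng = random.Random(seed)
    probs = boltzmann_probabilities(energies, rt)
    return rng.choices(range(len(energies)), weights=probs, k=n)
```

Lower-energy conformations receive higher probability, so repeated sampling visits them more often while still occasionally returning higher-energy alternatives. That is the sense in which an ensemble carries more information than a single predicted structure.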
|
43
|
Chen K, Kurgan LA, Ruan J. Prediction of protein structural class using novel evolutionary collocation-based sequence representation. J Comput Chem 2008; 29:1596-604. [PMID: 18293306 DOI: 10.1002/jcc.20918] [Citation(s) in RCA: 116] [Impact Index Per Article: 6.8] [Indexed: 11/11/2022]
Abstract
Knowledge of structural classes is useful in understanding folding patterns in proteins. Although existing structural class prediction methods applied virtually all state-of-the-art classifiers, many of them use a relatively simple protein sequence representation that often includes amino acid (AA) composition. To this end, we propose a novel sequence representation that incorporates evolutionary information encoded using PSI-BLAST profile-based collocation of AA pairs. We used six benchmark datasets and five representative classifiers to quantify and compare the quality of the structural class prediction with the proposed representation. The best classifier, the support vector machine, achieved 61-96% accuracy on the six datasets. These predictions were comprehensively compared with a wide range of recently proposed methods for prediction of structural classes. Our comprehensive comparison shows the superiority of the proposed representation, which results in error rate reductions that range between 14% and 26% when compared with predictions of the best-performing, previously published classifiers on the considered datasets. The study also shows that, for the benchmark datasets that include sequences characterized by low identity (i.e., 25%, 30%, and 40%), the prediction accuracies are 20-35% lower than for the other three datasets that include sequences with a higher degree of similarity. In conclusion, the proposed representation is shown to substantially improve the accuracy of the structural class prediction. A web server that implements the presented prediction method is freely available at http://biomine.ece.ualberta.ca/Structural_Class/SCEC.html.
Affiliation(s)
- Ke Chen
- Department of Electrical and Computer Engineering, ECERF, University of Alberta, Edmonton, Alberta, Canada
|
44
|
Ou YY, Gromiha M, Chen SA, Suwa M. TMBETADISC-RBF: Discrimination of β-barrel membrane proteins using RBF networks and PSSM profiles. Comput Biol Chem 2008; 32:227-31. [DOI: 10.1016/j.compbiolchem.2008.03.002] [Citation(s) in RCA: 63] [Impact Index Per Article: 3.7] [Received: 11/09/2007] [Revised: 03/11/2008] [Accepted: 03/11/2008] [Indexed: 10/22/2022]
|
45
|
Kurgan LA, Zhang T, Zhang H, Shen S, Ruan J. Secondary structure-based assignment of the protein structural classes. Amino Acids 2008; 35:551-64. [DOI: 10.1007/s00726-008-0080-3] [Citation(s) in RCA: 48] [Impact Index Per Article: 2.8] [Received: 01/31/2008] [Accepted: 02/27/2008] [Indexed: 11/24/2022]
|
46
|
Martin J, de Brevern AG, Camproux AC. In silico local structure approach: a case study on outer membrane proteins. Proteins 2008; 71:92-109. [PMID: 17932925 DOI: 10.1002/prot.21659] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Indexed: 11/10/2022]
Abstract
The detection of Outer Membrane Proteins (OMP) in whole genomes is a topical question; their sequence characteristics have thus been intensively studied. This class of protein displays a common beta-barrel architecture, formed by adjacent antiparallel strands. However, due to the lack of available structures, few structural studies have been made on this class of proteins. Here we propose a novel OMP local structure investigation, based on a structural alphabet approach, i.e., the decomposition of 3D structures using a library of four-residue protein fragments. The optimal decomposition of structures using a hidden Markov model results in a specific structural alphabet of 20 fragments, six of them dedicated to the decomposition of beta-strands. This optimal alphabet, called SA20-OMP, is analyzed in detail, in terms of local structures and transitions between fragments. It highlights a particular and strong organization of beta-strands as series of regular canonical structural fragments. The comparison with alphabets learned on globular structures indicates that the internal organization of OMP structures is more constrained than in globular structures. The analysis of OMP structures using SA20-OMP reveals some recurrent structural patterns. The preferred location of fragments in the distinct regions of the membrane is investigated. The study of pairwise specificity of fragments reveals that some contacts between structural fragments in beta-sheets are clearly favored whereas others are avoided. This contact specificity is stronger in OMP than in globular structures. Moreover, SA20-OMP also captured sequential information. This can be integrated in a scoring function for structural model ranking with very promising results.
Affiliation(s)
- Juliette Martin
- INSERM UMR-S 726/Université Denis Diderot Paris 7, Equipe de Bioinformatique Génomique et Moléculaire, F-75005 Paris
|
47
|
Sadovskaya NS, Gelfand MS. Benchmarking of programs that predict the position of transmembrane segments in beta-barrel proteins. Biophysics (Nagoya-shi) 2008. [DOI: 10.1134/s0006350908020036] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022]
|
48
|
Gromiha MM, Suresh MX. Discrimination of mesophilic and thermophilic proteins using machine learning algorithms. Proteins 2008; 70:1274-9. [PMID: 17876820 DOI: 10.1002/prot.21616] [Citation(s) in RCA: 53] [Impact Index Per Article: 3.1] [Indexed: 12/29/2022]
Abstract
Discriminating thermophilic proteins from their mesophilic counterparts is a challenging task that would help in designing stable proteins. In this work, we have systematically analyzed the amino acid compositions of 3075 mesophilic and 1609 thermophilic proteins belonging to 9 and 15 families, respectively. We found that the charged residues Lys, Arg, and Glu as well as the hydrophobic residues Val and Ile have higher occurrence in thermophiles than in mesophiles. Further, we have analyzed the performance of different methods, based on Bayes rules, logistic functions, neural networks, support vector machines, decision trees and so forth, for discriminating mesophilic and thermophilic proteins. We found that most of the machine learning techniques discriminate these classes of proteins with similar accuracy. The neural network-based method could discriminate the thermophiles from mesophiles with a five-fold cross-validation accuracy of 89% in a dataset of 4684 proteins. Moreover, this method was tested with 325 mesophiles in Xylella fastidiosa and 382 thermophiles in Aquifex aeolicus and successfully discriminated them with an accuracy of 91%. These accuracy levels are better than those of other methods in the literature and we suggest that this method could be effectively used to discriminate mesophilic and thermophilic proteins.
Affiliation(s)
- M Michael Gromiha
- Computational Biology Research Center, National Institute of Advanced Industrial Science and Technology, 2-42 Aomi, Koto-ku, Tokyo 135-0064, Japan.
|
49
|
Gromiha MM, Yabuki Y. Functional discrimination of membrane proteins using machine learning techniques. BMC Bioinformatics 2008; 9:135. [PMID: 18312695 PMCID: PMC2375119 DOI: 10.1186/1471-2105-9-135] [Citation(s) in RCA: 50] [Impact Index Per Article: 2.9] [Received: 11/06/2007] [Accepted: 03/03/2008] [Indexed: 11/30/2022]
Abstract
Background: Discriminating membrane proteins based on their functions is an important task in genome annotation. In this work, we have analyzed the characteristic features of amino acid residues in membrane proteins that perform major functions, such as channels/pores, electrochemical potential-driven transporters and primary active transporters. Results: We observed that the residues Asp, Asn and Tyr are dominant in channels/pores whereas the composition of the hydrophobic residues Phe, Gly, Ile, Leu and Val is high in electrochemical potential-driven transporters. The composition of all the amino acids in primary active transporters lies between those of the other two classes of proteins. We have utilized different machine learning algorithms, such as Bayes rule, logistic function, neural network, support vector machine and decision tree, for discriminating these classes of proteins. We observed that most of the algorithms discriminated them with similar accuracy. The neural network method discriminated the channels/pores, electrochemical potential-driven transporters and active transporters with a 5-fold cross-validation accuracy of 64% in a data set of 1718 membrane proteins. The application of amino acid occurrence improved the overall accuracy to 68%. In addition, we have discriminated transporters from other α-helical and β-barrel membrane proteins with an accuracy of 85% using the k-nearest neighbor method. The classification of transporters and all other proteins (globular and membrane) showed an accuracy of 82%. Conclusion: The performance of discrimination with amino acid occurrence is better than that with amino acid composition. We suggest that this method could be effectively used to discriminate transporters from all other globular and membrane proteins, and classify them into channels/pores, electrochemical and active transporters.
Affiliation(s)
- M Michael Gromiha
- Computational Biology Research Center, National Institute of Advanced Industrial Science and Technology, AIST Tokyo Waterfront Bio-IT Research Building, 2-42 Aomi, Koto-ku, Tokyo 135-0064, Japan.
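The distinction this abstract draws between amino acid occurrence and amino acid composition can be made concrete with a short sketch. It assumes that "occurrence" means raw residue counts and "composition" means those counts as percentages of sequence length; the abstract does not define either term, so this reading is an assumption. The nearest-neighbor helper, the toy sequences, and the class labels are hypothetical and are not the authors' data or implementation.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, fixed order


def occurrence(seq):
    """Raw count of each standard residue (length-dependent feature)."""
    counts = Counter(seq.upper())
    return [counts.get(a, 0) for a in AMINO_ACIDS]


def composition(seq):
    """Percentage of each standard residue (length-normalized feature)."""
    occ = occurrence(seq)
    total = sum(occ)  # non-standard symbols are ignored
    return [100.0 * c / total if total else 0.0 for c in occ]


def nearest_neighbor_label(query, training):
    """Label of the closest training vector (1-NN, squared Euclidean distance)."""
    def dist(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))
    return min(training, key=lambda item: dist(item[0], query))[1]


# Hypothetical toy training set: (feature vector, class label).
train = [
    (composition("DDNNYY"), "channel"),
    (composition("LLVVII"), "transporter"),
]
label = nearest_neighbor_label(composition("DNYD"), train)
```

Occurrence vectors retain sequence-length information that composition normalizes away, which is one plausible reason the two features could behave differently in a classifier.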
|
50
|
Lin H. The modified Mahalanobis Discriminant for predicting outer membrane proteins by using Chou's pseudo amino acid composition. J Theor Biol 2008; 252:350-6. [PMID: 18355838 DOI: 10.1016/j.jtbi.2008.02.004] [Citation(s) in RCA: 182] [Impact Index Per Article: 10.7] [Received: 11/08/2007] [Revised: 12/02/2007] [Accepted: 02/04/2008] [Indexed: 11/15/2022]
Abstract
The outer membrane proteins (OMPs) are beta-barrel membrane proteins that perform many biological functions. Discriminating OMPs from non-OMPs is an important task for understanding biochemical processes. In this study, a method that combines increment of diversity with a modified Mahalanobis Discriminant, called IDQD, is presented to predict 208 OMPs, 206 transmembrane helical proteins (TMHPs) and 673 globular proteins (GPs) by using Chou's pseudo amino acid compositions as parameters. The overall accuracy of jackknife cross-validation is 93.2% and 96.1%, respectively, for three datasets (OMPs, TMHPs and GPs) and two datasets (OMPs and non-OMPs). These predicted results suggest that the method can be effectively applied to discriminate OMPs, TMHPs and GPs. They also indicate that the pseudo amino acid composition can better reflect the core feature of membrane proteins than the classical amino acid composition.
Affiliation(s)
- Hao Lin
- School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China.
|