1. Abbass J, Parisi C. Machine learning-based prediction of proteins' architecture using sequences of amino acids and structural alphabets. J Biomol Struct Dyn 2024:1-16. [PMID: 38505995 DOI: 10.1080/07391102.2024.2328736] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2023] [Accepted: 03/05/2024] [Indexed: 03/21/2024]
Abstract
In addition to the growth of protein structures generated through wet laboratory experiments and deposited in the PDB repository, AlphaFold predictions have significantly contributed to the creation of a much larger database of protein structures. Annotating such a vast number of structures has become an increasingly challenging task. CATH is widely recognized as one of the most common platforms for addressing this challenge, as it classifies proteins based on their structural and evolutionary relationships, offering the scientific community an invaluable resource for uncovering various properties, including functional annotations. While CATH annotation involves, to some extent, human intervention, keeping up with the classification of the rapidly expanding repositories of protein structures has become exceedingly difficult. Therefore, there is a pressing need for a fully automated approach. On the other hand, the abundance of protein sequences stemming from next generation sequencing technologies, lacking structural annotations, presents an additional challenge to the scientific community. Consequently, 'pre-annotating' protein sequences with structural features, ensuring a high level of precision, could prove highly advantageous. In this paper, after a thorough investigation, we introduce a novel machine-learning model capable of classifying any protein domain, whether it has a known structure or not, into one of the 40 main CATH Architectures. We achieve an F1 Score of 0.92 using only the amino acid sequence and a score of 0.94 using both the sequence of amino acids and the sequence of structural alphabets. Communicated by Ramaswamy H. Sarma.
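The F1 scores quoted in this abstract combine precision and recall into one number per class. The abstract does not say how the per-class scores were averaged over the 40 Architectures, so the sketch below assumes a macro average purely for illustration. The function name `macro_f1` and the toy labels are hypothetical and are not taken from the paper.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Macro-averaged F1: the unweighted mean of per-class F1 scores.

    Assumption: macro averaging. The abstract does not state which
    averaging scheme the authors used.
    """
    classes = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed
    scores = []
    for c in classes:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs
        denom = 2 * tp[c] + fp[c] + fn[c]
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy labels written as CATH Architecture codes (illustrative only).
y_true = ["3.40", "3.40", "1.10", "2.60"]
y_pred = ["3.40", "1.10", "1.10", "2.60"]
print(round(macro_f1(y_true, y_pred), 4))
```

With many unevenly populated classes, a macro average weights rare Architectures as heavily as common ones, which is why the averaging scheme matters when comparing reported scores.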
Affiliation(s)
- Jad Abbass
  - School of Computer Science and Mathematics, Kingston University, London, UK
- Charles Parisi
  - School of Computer Science and Mathematics, Kingston University, London, UK
  - Telecom Physique Strasbourg, Strasbourg University, Strasbourg, France
2. Zhang G, Xu T, Chen Y, Xu W, Wang Y, Li Y, Zhu F, Liu H, Ruan H. Complete Mitochondrial Genomes of Nedyopus patrioticus: New Insights into the Color Polymorphism of Millipedes. Curr Issues Mol Biol 2024; 46:2514-2527. [PMID: 38534775 DOI: 10.3390/cimb46030159] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/19/2024] [Revised: 03/04/2024] [Accepted: 03/13/2024] [Indexed: 03/28/2024] Open
Abstract
There has been debate about whether individuals with different color phenotypes should have different taxonomic status. In order to determine whether the different color phenotypes of Nedyopus patrioticus require separate taxonomic status or are simply synonyms, here, the complete mitochondrial genomes (mitogenomes) of two different colored N. patrioticus, i.e., red N. patrioticus and white N. patrioticus, are presented. The two mitogenomes were 15,781 bp and 15,798 bp in length, respectively. Each mitogenome contained 13 PCGs, 19 tRNAs, 2 rRNAs, and 1 CR, with a lack of trnI, trnL2, and trnV compared to other Polydesmida species. All genes were located on a single strand in two mitogenomes. Mitochondrial DNA analyses revealed that red N. patrioticus and white N. patrioticus did not show clear evolutionary differences. Furthermore, no significant divergence was discovered by means of base composition analysis. As a result, we suggest that white N. patrioticus might be regarded as a synonym for red N. patrioticus. The current findings confirmed the existence of color polymorphism in N. patrioticus, which provides exciting possibilities for future research. It is necessary to apply a combination of molecular and morphological methods in the taxonomy of millipedes.
Affiliation(s)
- Gaoji Zhang
  - College of Life Sciences, Nanjing Forestry University, Nanjing 210037, China
- Tangjun Xu
  - College of Life Sciences, Nanjing Forestry University, Nanjing 210037, China
- Yukun Chen
  - College of Life Sciences, Nanjing Forestry University, Nanjing 210037, China
- Wei Xu
  - College of Life Sciences, Nanjing Forestry University, Nanjing 210037, China
- Yinuo Wang
  - College of Life Sciences, Nanjing Forestry University, Nanjing 210037, China
- Yuanyuan Li
  - College of Ecology and the Environment, Nanjing Forestry University, Nanjing 210037, China
- Fuyuan Zhu
  - College of Life Sciences, Nanjing Forestry University, Nanjing 210037, China
- Hongyi Liu
  - College of Life Sciences, Nanjing Forestry University, Nanjing 210037, China
  - College of Ecology and the Environment, Nanjing Forestry University, Nanjing 210037, China
- Honghua Ruan
  - College of Ecology and the Environment, Nanjing Forestry University, Nanjing 210037, China
3. Zafar Z, Wood MJ, Fatima S, Bhatti MF, Shah FA, Saud Z, Loveridge EJ, Karaca I, Butt TM. Identification of the odorant binding proteins of Western Flower Thrips (Frankliniella occidentalis), characterization and binding analysis of FoccOBP3 with molecular modelling, molecular dynamics simulations and a confirmatory field trial. J Biomol Struct Dyn 2024:1-16. [PMID: 38415377 DOI: 10.1080/07391102.2024.2317990] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/21/2023] [Accepted: 02/07/2024] [Indexed: 02/29/2024]
Abstract
Olfactory systems are indispensable for insects as they, including Western Flower Thrips (Frankliniella occidentalis), use olfactory cues for ovipositing and feeding. F. occidentalis use odorant binding proteins (OBPs) to transport semiochemicals to odorant receptors to induce a behavioural response from the sensillum lymph of the insect's antennae. This study identifies four OBPs of F. occidentalis and analyses their expression at three stages of growth: larvae, adult males and adult females. Further, it investigates the presence of conserved motifs and their phylogenetic relationship to other insect species. Moreover, FoccOBP3 was characterized in silico to analyse its structure, along with molecular docking and molecular dynamics simulations to understand its binding with semiochemicals of F. occidentalis. Molecular docking revealed the interactions of methyl isonicotinate, p-anisaldehyde and (S)-(-)-verbenone with FoccOBP3. Moreover, molecular dynamics simulations showed bonding stability of these ligands with FoccOBP3, and field trials validated that Lurem TR (commercial product) and p-anisaldehyde had greater attraction as compared to (S)-(-)-verbenone, given the compound's binding with FoccOBP3. The current study helps in understanding the tertiary structure and interaction of FoccOBP3 with lures using computational and field data and will help in the identification of novel lures of insects in the future, given the importance of binding with OBPs. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Zeeshan Zafar
  - Research and Development, Razbio Limited, Bridgend, UK
  - Atta-ur-Rahman School of Applied Biosciences, National University of Sciences and Technology, Islamabad, Pakistan
- Martyn J Wood
  - Research and Development, Razbio Limited, Bridgend, UK
  - Institute of Molecular Biology and Biotechnology, Foundation for Research and Technology-Hellas, Heraklion, Greece
- Sidra Fatima
  - Atta-ur-Rahman School of Applied Biosciences, National University of Sciences and Technology, Islamabad, Pakistan
- Muhammad Faraz Bhatti
  - Atta-ur-Rahman School of Applied Biosciences, National University of Sciences and Technology, Islamabad, Pakistan
- Farooq A Shah
  - Research and Development, Razbio Limited, Bridgend, UK
- Zack Saud
  - Department of Biosciences, Swansea University, Swansea, UK
- Ismail Karaca
  - Faculty of Agriculture, Department of Plant Protection, Isparta University of Applied Sciences, Isparta, Turkey
- Tariq M Butt
  - Department of Biosciences, Swansea University, Swansea, UK
4. Milchevskiy YV, Milchevskaya VY, Nikitin AM, Kravatsky YV. Effective Local and Secondary Protein Structure Prediction by Combining a Neural Network-Based Approach with Extensive Feature Design and Selection without Reliance on Evolutionary Information. Int J Mol Sci 2023; 24:15656. [PMID: 37958639 PMCID: PMC10648199 DOI: 10.3390/ijms242115656] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/19/2023] [Revised: 10/24/2023] [Accepted: 10/25/2023] [Indexed: 11/15/2023] Open
Abstract
Protein structure prediction continues to pose multiple challenges despite outstanding progress that is largely attributable to the use of novel machine learning techniques. One of the widely used representations of local 3D structure, protein blocks (PBs), can be treated in a similar way to secondary structure classes. Here, we present a new approach for predicting local conformation in terms of PB classes solely from amino acid sequences. We apply the RMSD metric to ensure unambiguous future 3D protein structure recovery. The selection of statistically assessed features is a key component of the proposed method. We suggest that ML input features should be created from the statistically significant predictors that are derived from the amino acids' physicochemical properties and the resolved structures' statistics. The statistical significance of the suggested features was assessed using a stepwise regression analysis that permitted the evaluation of the contribution and statistical significance of each predictor. We used the set of 380 statistically significant predictors as a learning model for the regression neural network that was trained using the PISCES30 dataset. When using the same dataset and metrics for benchmarking, our method outperformed all other methods reported in the literature for the CB513 nonredundant dataset (for the PBs, Q16 = 81.01%, and for the DSSP, Q3 = 85.99% and Q8 = 79.35%).
Affiliation(s)
- Yury V. Milchevskiy
  - Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Vavilov Str., 32, 119991 Moscow, Russia
- Vladislava Y. Milchevskaya
  - Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Vavilov Str., 32, 119991 Moscow, Russia
  - Institute of Medical Statistics and Bioinformatics, University of Cologne, 50931 Cologne, Germany
- Alexei M. Nikitin
  - Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Vavilov Str., 32, 119991 Moscow, Russia
- Yury V. Kravatsky
  - Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Vavilov Str., 32, 119991 Moscow, Russia
  - Center for Precision Genome Editing and Genetic Technologies for Biomedicine, Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, 119991 Moscow, Russia
5. Selvaraj MK, Kaur J. Computational method for aromatase-related proteins using machine learning approach. PLoS One 2023; 18:e0283567. [PMID: 36989252 PMCID: PMC10057777 DOI: 10.1371/journal.pone.0283567] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/30/2022] [Accepted: 03/12/2023] [Indexed: 03/30/2023] Open
Abstract
Human aromatase enzyme is a microsomal cytochrome P450 and catalyzes aromatization of androgens into estrogens during steroidogenesis. For breast cancer therapy, third-generation aromatase inhibitors (AIs) have proven to be effective; however, patients acquire resistance to current AIs. Thus, there is a need to predict aromatase-related proteins to develop efficacious AIs. A machine learning method was established to identify aromatase-related proteins using a five-fold cross validation technique. In this study, different SVM-based models were built using amino acid composition, dipeptide composition, a hybrid approach, and evolutionary profiles in the form of a position-specific scoring matrix (PSSM), with maximum accuracies of 87.42%, 84.05%, 85.12%, and 92.02%, respectively. Based on the primary sequence alone, the developed method predicts aromatase-related proteins with high accuracy. Prediction score graphs were developed using the known dataset to check the performance of the method. Based on the approach described above, a webserver for predicting aromatase-related proteins from primary sequence data was developed and implemented at https://bioinfo.imtech.res.in/servers/muthu/aromatase/home.html. We hope that the developed method will be useful for aromatase protein related research.
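The composition-based inputs named in this abstract are standard fixed-length encodings of a protein sequence. The sketch below shows one conventional way to compute them. The function names are hypothetical, and treating the "hybrid" input as the concatenation of the two composition vectors is an assumption, since the abstract does not define it.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(seq):
    """Fraction of each standard residue in the sequence (20 values)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return [seq.count(a) / len(seq) for a in AMINO_ACIDS]


def dipeptide_composition(seq):
    """Fraction of each ordered residue pair among adjacent pairs (400 values)."""
    seq = seq.upper()
    if len(seq) < 2:
        raise ValueError("sequence must contain at least two residues")
    counts = {}
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        counts[pair] = counts.get(pair, 0) + 1
    total = len(seq) - 1
    return [counts.get(a + b, 0) / total for a, b in product(AMINO_ACIDS, repeat=2)]


def hybrid_features(seq):
    """Assumed hybrid encoding: both composition vectors concatenated (420 values)."""
    return amino_acid_composition(seq) + dipeptide_composition(seq)


features = hybrid_features("MKTAYIAKQR")
print(len(features))
```

Vectors like these can be passed to any SVM implementation. Non-standard residue codes count towards the sequence length but not towards any feature, so the vectors for such sequences sum to less than one.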
Affiliation(s)
- Jasmeet Kaur
  - Department of Biophysics, Postgraduate Institute of Medical Education and Research (PGIMER), Chandigarh, India
6. Cretin G, Galochkina T, de Brevern AG, Gelly JC. PYTHIA: Deep Learning Approach for Local Protein Conformation Prediction. Int J Mol Sci 2021; 22:8831. [PMID: 34445537 PMCID: PMC8396346 DOI: 10.3390/ijms22168831] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 07/06/2021] [Revised: 08/09/2021] [Accepted: 08/10/2021] [Indexed: 02/07/2023] Open
Abstract
Protein Blocks (PBs) are a widely used structural alphabet describing local protein backbone conformation in terms of 16 possible conformational states, adopted by five consecutive amino acids. The representation of complex protein 3D structures as 1D PB sequences was previously successfully applied to protein structure alignment and protein structure prediction. In the current study, we present a new model, PYTHIA (predicting any conformation at high accuracy), for the prediction of the protein local conformations in terms of PBs directly from the amino acid sequence. PYTHIA is based on a deep residual inception-inside-inception neural network with convolutional block attention modules, predicting 1 of 16 PB classes from evolutionary information combined with physicochemical properties of individual amino acids. PYTHIA clearly outperforms the LOCUSTRA reference method for all PB classes and demonstrates great performance for PB prediction on particularly challenging proteins from the CASP14 free modelling category.
Affiliation(s)
- Gabriel Cretin
  - Biologie Intégrée du Globule Rouge, Université de Paris, UMR_S1134, BIGR, INSERM, 75015 Paris, France
  - Laboratoire d’Excellence GR-Ex, 75015 Paris, France
- Tatiana Galochkina
  - Biologie Intégrée du Globule Rouge, Université de Paris, UMR_S1134, BIGR, INSERM, 75015 Paris, France
  - Laboratoire d’Excellence GR-Ex, 75015 Paris, France
- Alexandre G. de Brevern
  - Biologie Intégrée du Globule Rouge, Université de Paris, UMR_S1134, BIGR, INSERM, 75015 Paris, France
  - Laboratoire d’Excellence GR-Ex, 75015 Paris, France
- Jean-Christophe Gelly
  - Biologie Intégrée du Globule Rouge, Université de Paris, UMR_S1134, BIGR, INSERM, 75015 Paris, France
  - Laboratoire d’Excellence GR-Ex, 75015 Paris, France
7. Guo L, Jiang Q, Jin X, Liu L, Zhou W, Yao S, Wu M, Wang Y. A Deep Convolutional Neural Network to Improve the Prediction of Protein Secondary Structure. Curr Bioinform 2020. [DOI: 10.2174/1574893615666200120103050] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 11/22/2022]
Abstract
Background: Protein secondary structure prediction (PSSP) is a fundamental task in bioinformatics that is helpful for understanding the three-dimensional structure and biological function of proteins. Many neural network-based prediction methods have been developed for protein secondary structures. Deep learning and multiple features are two obvious means to improve prediction accuracy.
Objective: To promote the development of PSSP, a deep convolutional neural network-based method is proposed to predict both the eight-state and three-state protein secondary structure.
Methods: In this model, sequence and evolutionary information of proteins are combined as multiple input features after preprocessing. A deep convolutional neural network with no pooling layer and connection layer is then constructed to predict the secondary structure of proteins. L2 regularization, batch normalization, and dropout techniques are employed to avoid over-fitting and obtain better prediction performance, and an improved cross-entropy is used as the loss function.
Results: Our proposed model can obtain Q3 prediction results of 86.2%, 84.5%, 87.8%, and 84.7%, respectively, on CullPDB, CB513, CASP10 and CASP11 datasets, with corresponding Q8 prediction results of 74.1%, 70.5%, 74.9%, and 71.3%.
Conclusion: We have proposed the DCNN-SS deep convolutional-network-based PSSP method, and experimental results show that DCNN-SS performs competitively with other methods.
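The Q3 and Q8 figures reported here are per-residue accuracies over three and eight secondary structure states. The sketch below shows how such scores are typically computed. The reduction from eight DSSP states to three (H, G, I to helix; E, B to strand; everything else to coil) is one common convention and is an assumption here, because the abstract does not say which mapping was used. The function names are hypothetical.

```python
# One common 8-to-3 state reduction (assumed, not taken from the paper).
DSSP8_TO_3 = {
    "H": "H", "G": "H", "I": "H",   # helices
    "E": "E", "B": "E",             # strands and bridges
    "T": "C", "S": "C", "-": "C",   # turns, bends and coil
}


def q_accuracy(true_states, pred_states):
    """Percentage of residues whose predicted state equals the true state."""
    if len(true_states) != len(pred_states) or not true_states:
        raise ValueError("state strings must be non-empty and of equal length")
    matches = sum(t == p for t, p in zip(true_states, pred_states))
    return 100.0 * matches / len(true_states)


def reduce_to_three(states):
    """Map an eight-state string to three states; unknown symbols become coil."""
    return "".join(DSSP8_TO_3.get(s, "C") for s in states)


true8 = "HHGGEETS"
pred8 = "HHHHEEBT"
q8 = q_accuracy(true8, pred8)
q3 = q_accuracy(reduce_to_three(true8), reduce_to_three(pred8))
print(q8, q3)
```

Q3 is always at least as high as Q8 for the same prediction when it is derived by reduction, because mistakes within one reduced class (for example G predicted as H) stop counting as errors.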
Affiliation(s)
- Lin Guo
  - School of Software, Yunnan University, Kunming, China
  - School of Information, Yunnan Normal University, Kunming, China
- Qian Jiang
  - School of Software, Yunnan University, Kunming, China
  - School of Information, Yunnan Normal University, Kunming, China
- Xin Jin
  - School of Software, Yunnan University, Kunming, China
  - School of Information, Yunnan Normal University, Kunming, China
- Lin Liu
  - School of Software, Yunnan University, Kunming, China
  - School of Information, Yunnan Normal University, Kunming, China
- Wei Zhou
  - School of Software, Yunnan University, Kunming, China
  - School of Information, Yunnan Normal University, Kunming, China
- Shaowen Yao
  - School of Software, Yunnan University, Kunming, China
  - School of Information, Yunnan Normal University, Kunming, China
- Min Wu
  - School of Software, Yunnan University, Kunming, China
  - School of Information, Yunnan Normal University, Kunming, China
- Yun Wang
  - School of Software, Yunnan University, Kunming, China
  - School of Information, Yunnan Normal University, Kunming, China
8. Bi J, Tian C, Jiang J, Zhang GL, Hao H, Hou HM. Antibacterial Activity and Potential Application in Food Packaging of Peptides Derived from Turbot Viscera Hydrolysate. J Agric Food Chem 2020; 68:9968-9977. [PMID: 32841003 DOI: 10.1021/acs.jafc.0c03146] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Indexed: 05/16/2023]
Abstract
As a good choice for food preservation, antimicrobial peptides (AMPs) have received much attention in recent years. In this paper, peptides derived from the turbot viscera hydrolysate were identified by ultraperformance liquid chromatography-quadrupole time-of-flight mass spectrometry (UPLC-Q-TOF-MS/MS), and the physicochemical properties and structural characteristics were analyzed by in silico tools. Furthermore, three cationic peptides with potential hydrophobicity and amphipathy were synthesized; their cytotoxicity, hemolysis, and antibacterial activities were investigated. In particular, Sm-A1 (GITDLRGMLKRLKKMK), a peptide with 16 amino acids, showed an outstanding antibacterial activity against both Gram-positive and Gram-negative bacteria by damaging the cell membrane integrity. Moreover, Sm-A1 was successfully loaded into hydroxyl-rich poly(vinyl alcohol) (PVA)/chitosan (CS) hydrogel to improve the antibacterial activity and biofilm inhibition effect. PVA/CS+7.5‰ Sm-A1 hydrogel can satisfactorily protect the salmon muscle from the microbiological contamination and texture deterioration.
Affiliation(s)
- Jingran Bi
  - School of Food Science and Technology, Dalian Polytechnic University, No. 1, Qinggongyuan, Ganjingzi District, Dalian, Liaoning 116034, People's Republic of China
  - Liaoning Key Lab for Aquatic Processing Quality and Safety, No. 1, Qinggongyuan, Ganjingzi District, Dalian, Liaoning 116034, People's Republic of China
- Chuan Tian
  - School of Food Science and Technology, Dalian Polytechnic University, No. 1, Qinggongyuan, Ganjingzi District, Dalian, Liaoning 116034, People's Republic of China
  - Liaoning Key Lab for Aquatic Processing Quality and Safety, No. 1, Qinggongyuan, Ganjingzi District, Dalian, Liaoning 116034, People's Republic of China
- Jinghui Jiang
  - School of Food Science and Technology, Dalian Polytechnic University, No. 1, Qinggongyuan, Ganjingzi District, Dalian, Liaoning 116034, People's Republic of China
  - Liaoning Key Lab for Aquatic Processing Quality and Safety, No. 1, Qinggongyuan, Ganjingzi District, Dalian, Liaoning 116034, People's Republic of China
- Gong-Liang Zhang
  - School of Food Science and Technology, Dalian Polytechnic University, No. 1, Qinggongyuan, Ganjingzi District, Dalian, Liaoning 116034, People's Republic of China
  - Liaoning Key Lab for Aquatic Processing Quality and Safety, No. 1, Qinggongyuan, Ganjingzi District, Dalian, Liaoning 116034, People's Republic of China
- Hongshun Hao
  - School of Food Science and Technology, Dalian Polytechnic University, No. 1, Qinggongyuan, Ganjingzi District, Dalian, Liaoning 116034, People's Republic of China
  - Liaoning Key Lab for Aquatic Processing Quality and Safety, No. 1, Qinggongyuan, Ganjingzi District, Dalian, Liaoning 116034, People's Republic of China
- Hong-Man Hou
  - School of Food Science and Technology, Dalian Polytechnic University, No. 1, Qinggongyuan, Ganjingzi District, Dalian, Liaoning 116034, People's Republic of China
  - Liaoning Key Lab for Aquatic Processing Quality and Safety, No. 1, Qinggongyuan, Ganjingzi District, Dalian, Liaoning 116034, People's Republic of China
9. Vetrivel I, de Brevern AG, Cadet F, Srinivasan N, Offmann B. Structural variations within proteins can be as large as variations observed across their homologues. Biochimie 2019; 167:162-170. [PMID: 31560932 DOI: 10.1016/j.biochi.2019.09.013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/25/2019] [Accepted: 09/18/2019] [Indexed: 10/26/2022]
Abstract
Understanding the structural plasticity of proteins is key to understanding the intricacies of their functions and mechanistic basis. In the current study, we analyzed the available multiple crystal structures of the same protein for structural differences. For this purpose, we used a previously established abstraction of protein structures referred to as Protein Blocks (PBs). We also characterized the nature of the structural variations for a few proteins using molecular dynamics simulations. In both cases, the structural variations were summarized in the form of substitution matrices of PBs. We show that certain conformational states are preferably replaced by other specific conformational states. Interestingly, these structural variations are highly similar to those previously observed across structures of homologous proteins (r² = 0.923) or across the ensemble of conformations from NMR data (r² = 0.919). Thus, our study quantitatively shows that overall trends of structural changes in a given protein are nearly identical to the trends of structural differences that occur in the topologically equivalent positions in homologous proteins. Specific case studies are used to illustrate the nature of these structural variations.
Affiliation(s)
- Iyanar Vetrivel
  - Université de Nantes, UFIP UMR 6286 CNRS, UFR Sciences et Techniques, 2 Chemin de La Houssinière, Nantes, France
- Alexandre G de Brevern
  - INSERM UMR_S 1134, DSIMB Team, Laboratory of Excellence, GR-Ex, Univ Paris Diderot, Univ Sorbonne Paris Cité, INTS, 6 Rue Alexandre Cabanel, Paris, France
- Frédéric Cadet
  - University of Paris, UMR_S1134, BIGR, Inserm, F-75015, Paris, France
  - DSIMB, UMR_S1134, BIGR, Inserm, Laboratory of Excellence GR-Ex, Faculty of Sciences and Technology, University of La Reunion, F-97715, Saint-Denis, France
  - PEACCEL, Protein Engineering Accelerator, 6 Square Albin Cachot, Box 42, 75013, Paris, France
- Bernard Offmann
  - Université de Nantes, UFIP UMR 6286 CNRS, UFR Sciences et Techniques, 2 Chemin de La Houssinière, Nantes, France
10. Vetrivel I, Mahajan S, Tyagi M, Hoffmann L, Sanejouand YH, Srinivasan N, de Brevern AG, Cadet F, Offmann B. Knowledge-based prediction of protein backbone conformation using a structural alphabet. PLoS One 2017; 12:e0186215. [PMID: 29161266 PMCID: PMC5697859 DOI: 10.1371/journal.pone.0186215] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 05/30/2017] [Accepted: 09/27/2017] [Indexed: 01/19/2023] Open
Abstract
Libraries of structural prototypes that abstract protein local structures are known as structural alphabets and have proven to be very useful in various aspects of protein structure analyses and predictions. One such library, Protein Blocks, is composed of 16 standard structural prototypes, each five residues long. This form of analysis involves representing a protein's structure as a string of Protein Blocks. Predicting the local structure of a protein in terms of protein blocks is the general objective of this work. A new approach, PB-kPRED, is proposed towards this aim. It involves (i) organizing the structural knowledge in the form of a database of pentapeptide fragments extracted from all protein structures in the PDB and (ii) applying a knowledge-based algorithm that does not rely on any secondary structure predictions and/or sequence alignment profiles, to scan this database and predict the most probable backbone conformations for the protein local structures. PB-kPRED does, however, preferentially use structural information from homologues when it is available. The predictions were evaluated rigorously on 15,544 query proteins representing a non-redundant subset of the PDB filtered at 30% sequence identity cut-off. We have shown that the kPRED method was able to achieve mean accuracies ranging from 40.8% to 66.3% depending on the availability of homologues. The impact of the different strategies for scanning the database on the prediction was evaluated and is discussed. Our results highlight the usefulness of the method in the context of proteins without any known structural homologues. A scoring function that gives a good estimate of the accuracy of prediction was further developed. This score estimates very well the accuracy of the algorithm (R² of 0.82). An online version of the tool is provided freely for non-commercial usage at http://www.bo-protscience.fr/kpred/.
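The two steps described in this abstract, a database of pentapeptide fragments and a lookup that returns the most probable local conformation, can be illustrated with a deliberately simplified sketch. This is not the PB-kPRED algorithm: it ignores homologue preference, scanning strategies and scoring, and the function names, the `-` placeholder and the toy training pairs are all hypothetical.

```python
from collections import Counter, defaultdict


def build_fragment_db(examples):
    """Count which PB letter is observed at the centre of each pentapeptide.

    `examples` is an iterable of (sequence, pb_string) pairs of equal length.
    """
    db = defaultdict(Counter)
    for seq, pbs in examples:
        if len(seq) != len(pbs):
            raise ValueError("sequence and PB string must have equal length")
        # A PB describes five residues, so the first and last two positions
        # have no complete window and are skipped.
        for i in range(2, len(seq) - 2):
            db[seq[i - 2:i + 3]][pbs[i]] += 1
    return db


def predict_pbs(seq, db, unknown="-"):
    """Assign each central residue the most frequent PB seen for its pentapeptide."""
    out = [unknown] * len(seq)
    for i in range(2, len(seq) - 2):
        counts = db.get(seq[i - 2:i + 3])
        if counts:
            out[i] = counts.most_common(1)[0][0]
    return "".join(out)


# Toy training data; the PB strings are invented for illustration.
db = build_fragment_db([
    ("ACDEFGH", "zzmmnzz"),
    ("ACDEF", "zzdzz"),
    ("ACDEF", "zzdzz"),
])
print(predict_pbs("ACDEFGH", db))
```

Pentapeptides that never occur in the database stay unassigned in this sketch, which is the situation where a real method needs a fallback strategy.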
Affiliation(s)
- Iyanar Vetrivel
  - Université de Nantes, Unité Fonctionnalité et Ingénierie des Protéines (UFIP), UMR 6286 CNRS, UFR Sciences et Techniques, 2, chemin de la Houssinière, France
- Swapnil Mahajan
  - Université de Nantes, Unité Fonctionnalité et Ingénierie des Protéines (UFIP), UMR 6286 CNRS, UFR Sciences et Techniques, 2, chemin de la Houssinière, France
  - DSIMB, INSERM, UMR S-1134, Laboratory of Excellence, GR-Ex, Université de La Réunion, Faculty of Sciences and Technology, Saint Denis Cedex, La Réunion, France
- Manoj Tyagi
  - Université de La Réunion, Saint Denis Cedex, La Réunion, France
- Lionel Hoffmann
  - Université de Nantes, Unité Fonctionnalité et Ingénierie des Protéines (UFIP), UMR 6286 CNRS, UFR Sciences et Techniques, 2, chemin de la Houssinière, France
- Yves-Henri Sanejouand
  - Université de Nantes, Unité Fonctionnalité et Ingénierie des Protéines (UFIP), UMR 6286 CNRS, UFR Sciences et Techniques, 2, chemin de la Houssinière, France
- Alexandre G. de Brevern
  - INSERM UMR_S 1134, DSIMB team, Laboratory of Excellence, GR-Ex, Univ Paris Diderot, Univ Sorbonne Paris Cité, INTS, rue Alexandre Cabanel, Paris, France
- Frédéric Cadet
  - DSIMB, INSERM, UMR S-1134, Laboratory of Excellence, GR-Ex, Université de La Réunion, Faculty of Sciences and Technology, Saint Denis Cedex, La Réunion, France
  - PEACCEL SAS, Paris, France
- Bernard Offmann
  - Université de Nantes, Unité Fonctionnalité et Ingénierie des Protéines (UFIP), UMR 6286 CNRS, UFR Sciences et Techniques, 2, chemin de la Houssinière, France
11. Barnoud J, Santuz H, Craveur P, Joseph AP, Jallu V, de Brevern AG, Poulain P. PBxplore: a tool to analyze local protein structure and deformability with Protein Blocks. PeerJ 2017; 5:e4013. [PMID: 29177113 PMCID: PMC5700758 DOI: 10.7717/peerj.4013] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Received: 08/28/2017] [Accepted: 10/19/2017] [Indexed: 11/20/2022] Open
Abstract
This paper describes the development and application of a suite of tools, called PBxplore, to analyze the dynamics and deformability of protein structures using Protein Blocks (PBs). Proteins are highly dynamic macromolecules, and a classical way to analyze their inherent flexibility is to perform molecular dynamics simulations. The advantage of using small structural prototypes such as PBs is to give a good approximation of the local structure of the protein backbone. More importantly, by reducing the conformational complexity of protein structures, PBs allow analysis of local protein deformability, which cannot be done with other methods, and have been used efficiently in different applications. PBxplore is able to process large amounts of data such as those produced by molecular dynamics simulations. It produces frequencies, entropy and information logo outputs as text and graphics. PBxplore is available at https://github.com/pierrepo/PBxplore and is released under the open-source MIT license.
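The frequency and entropy outputs mentioned in this abstract can be illustrated with a few lines of plain Python. This is a minimal sketch of the underlying calculation and not PBxplore's own code or API. The function names are hypothetical, and the input is assumed to be a list of equal-length PB strings, one per simulation frame. The exponential of the Shannon entropy is often reported in the Protein Blocks literature as the equivalent number of PBs (Neq).

```python
import math
from collections import Counter


def pb_frequencies(pb_sequences):
    """Per-position PB frequencies over a set of equal-length PB strings."""
    if not pb_sequences:
        raise ValueError("at least one PB sequence is required")
    length = len(pb_sequences[0])
    if any(len(s) != length for s in pb_sequences):
        raise ValueError("all PB sequences must have the same length")
    freqs = []
    for pos in range(length):
        counts = Counter(s[pos] for s in pb_sequences)
        total = sum(counts.values())
        freqs.append({pb: c / total for pb, c in counts.items()})
    return freqs


def neq(freq):
    """Equivalent number of PBs: exp of the Shannon entropy at one position."""
    entropy = -sum(f * math.log(f) for f in freq.values() if f > 0)
    return math.exp(entropy)


# Toy frames; the PB strings are invented for illustration.
frames = ["mmd", "mmd", "mnf", "mnk"]
for position, freq in enumerate(pb_frequencies(frames), start=1):
    print(position, round(neq(freq), 3))
```

A value of 1 means a single PB is observed at that position in every frame (a rigid region), while larger values indicate that several local conformations are sampled.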
Affiliation(s)
- Jonathan Barnoud
  - INSERM, U 1134, DSIMB, Paris, France
  - Univ. Paris Diderot, Sorbonne Paris Cité, Univ de la Réunion, Univ des Antilles, UMR-S 1134, Paris, France
  - Institut National de la Transfusion Sanguine (INTS), Paris, France
  - Laboratoire d'Excellence GR-Ex, Paris, France
  - Current affiliation: Groningen Biomolecular Sciences and Biotechnology Institute and Zernike Institute for Advanced Materials, University of Groningen, Groningen, The Netherlands
- Hubert Santuz
  - INSERM, U 1134, DSIMB, Paris, France
  - Univ. Paris Diderot, Sorbonne Paris Cité, Univ de la Réunion, Univ des Antilles, UMR-S 1134, Paris, France
  - Institut National de la Transfusion Sanguine (INTS), Paris, France
  - Laboratoire d'Excellence GR-Ex, Paris, France
  - Current affiliation: Laboratoire de Biochimie Théorique, CNRS UPR 9080, Institut de Biologie Physico-Chimique, Paris, France
- Pierrick Craveur
  - INSERM, U 1134, DSIMB, Paris, France
  - Univ. Paris Diderot, Sorbonne Paris Cité, Univ de la Réunion, Univ des Antilles, UMR-S 1134, Paris, France
  - Institut National de la Transfusion Sanguine (INTS), Paris, France
  - Laboratoire d'Excellence GR-Ex, Paris, France
  - Current affiliation: Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, United States of America
- Agnel Praveen Joseph
  - INSERM, U 1134, DSIMB, Paris, France
  - Univ. Paris Diderot, Sorbonne Paris Cité, Univ de la Réunion, Univ des Antilles, UMR-S 1134, Paris, France
  - Institut National de la Transfusion Sanguine (INTS), Paris, France
  - Laboratoire d'Excellence GR-Ex, Paris, France
  - Current affiliation: Birkbeck College, University of London, London, UK
- Alexandre G de Brevern
  - INSERM, U 1134, DSIMB, Paris, France
  - Univ. Paris Diderot, Sorbonne Paris Cité, Univ de la Réunion, Univ des Antilles, UMR-S 1134, Paris, France
  - Institut National de la Transfusion Sanguine (INTS), Paris, France
  - Laboratoire d'Excellence GR-Ex, Paris, France
- Pierre Poulain
  - INSERM, U 1134, DSIMB, Paris, France
  - Univ. Paris Diderot, Sorbonne Paris Cité, Univ de la Réunion, Univ des Antilles, UMR-S 1134, Paris, France
  - Institut National de la Transfusion Sanguine (INTS), Paris, France
  - Laboratoire d'Excellence GR-Ex, Paris, France
  - Current affiliation: Mitochondria, Metals and Oxidative Stress Group, Institut Jacques Monod, UMR 7592, Univ. Paris Diderot, CNRS, Sorbonne Paris Cité, Paris, France
| |
12
Xie S, Li Z, Hu H. Protein secondary structure prediction based on the fuzzy support vector machine with the hyperplane optimization. Gene 2017; 642:74-83. [PMID: 29104167 DOI: 10.1016/j.gene.2017.11.005] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.1] [Received: 09/08/2017] [Revised: 10/29/2017] [Accepted: 11/02/2017] [Indexed: 11/30/2022]
Abstract
Protein secondary structure prediction is a central problem in bioinformatics and related fields. In recent years, machine learning methods have become a valuable tool and have achieved satisfactory results, but prediction accuracy still needs to be improved. This paper proposes a new method based on an improved fuzzy support vector machine (FSVM) for predicting the secondary structure of proteins. Unlike traditional methods of setting the membership function, it first constructs an approximately optimal separating hyperplane by iterating the class centers in the feature space. Sample points close to this hyperplane are then assigned large membership values, while outliers are assigned small membership values according to the K-nearest neighbor method. Sample points with low membership values are removed, which reduces the training time and improves the prediction accuracy. To optimize the prediction results, our method also exploits information on sequence-based structural similarity. We tested the method on three datasets (RS126, CB513 and data1199), obtaining Q3 accuracies of 94.2%, 93.1% and 96.7% and SOV values of 91.7%, 89.7% and 94.1%, respectively. Overall, the results of our method are comparable to, or often better than, those of commonly used methods (Magnan & Baldi, 2014; Sheng et al., 2016) for secondary structure prediction.
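The idea of weighting samples by their distance to an approximate hyperplane can be sketched as follows. This is a simplified illustration, not the authors' algorithm: it places the hyperplane midway between the two class centers in a single step, without the iteration or the K-nearest neighbor outlier step described in the abstract. The linear membership function, the `floor` and `threshold` values, and both function names are assumptions made for this sketch.

```python
import numpy as np


def fuzzy_memberships(X, y, floor=0.05):
    """Membership per sample, larger when closer to a midpoint hyperplane.

    X: (n_samples, n_features) array; y: labels in {+1, -1}.
    Illustrative only: the hyperplane is the perpendicular bisector of
    the segment joining the two class centers.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    if not (np.any(y == 1) and np.any(y == -1)):
        raise ValueError("both classes (+1 and -1) must be present")

    center_pos = X[y == 1].mean(axis=0)
    center_neg = X[y == -1].mean(axis=0)
    normal = center_pos - center_neg
    length = np.linalg.norm(normal)
    if length == 0:
        raise ValueError("class centers coincide; no hyperplane can be placed")
    normal = normal / length
    # Offset chosen so that the hyperplane passes through the midpoint.
    offset = -normal @ (center_pos + center_neg) / 2.0

    distance = np.abs(X @ normal + offset)
    largest = distance.max()
    if largest == 0:
        return np.ones(len(X))
    # Assumed membership: decreases linearly with distance, never below floor.
    return np.maximum(floor, 1.0 - distance / largest)


def drop_low_membership(X, y, membership, threshold=0.1):
    """Remove samples whose membership falls below the (assumed) threshold."""
    keep = membership >= threshold
    return np.asarray(X)[keep], np.asarray(y)[keep], membership[keep]


if __name__ == "__main__":
    X = [[4, 0], [6, 0], [-4, 0], [-6, 0]]
    y = [1, 1, -1, -1]
    m = fuzzy_memberships(X, y)
    print(m)
    print(drop_low_membership(X, y, m)[0])
```

The retained memberships could then serve as per-sample weights when training an SVM, for example through the `sample_weight` argument of `fit` in scikit-learn's `SVC`.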
Affiliation(s)
- Shangxin Xie
  - School of Science, Zhejiang Sci-Tech University, Hangzhou, Zhejiang, 310018, China.
- Zhong Li
  - School of Science, Zhejiang Sci-Tech University, Hangzhou, Zhejiang, 310018, China.
- Hailong Hu
  - School of Science, Zhejiang Sci-Tech University, Hangzhou, Zhejiang, 310018, China; School of Science, Zhejiang A&F University, Lin'an, Zhejiang 311300, China.
13
Suresh V, Liu L, Adjeroh D, Zhou X. RPI-Pred: predicting ncRNA-protein interaction using sequence and structural information. Nucleic Acids Res 2015; 43:1370-9. [PMID: 25609700 PMCID: PMC4330382 DOI: 10.1093/nar/gkv020] [Citation(s) in RCA: 130] [Impact Index Per Article: 14.4] [Indexed: 12/19/2022] Open
Abstract
RNA-protein complexes are essential in mediating fundamental cellular processes, such as transport and localization. In particular, ncRNA-protein interactions play an important role in post-transcriptional gene regulation, including mRNA localization, mRNA stabilization, polyadenylation, splicing and translation. Experimental methods for identifying RNA-protein interactions remain expensive and time-consuming. Here, we present RPI-Pred (RNA-protein interaction predictor), a new support vector machine-based method that predicts protein-RNA interaction pairs from both sequences and structures. The results show that RPI-Pred correctly predicts RNA-protein interaction pairs with ∼94% accuracy when using sequences and experimentally determined protein and RNA structures, and with ∼83% accuracy when using sequences and predicted protein and RNA structures. Furthermore, RPI-Pred outperformed existing methods by predicting more experimentally validated ncRNA-protein interaction pairs from different organisms. Motivated by this improved performance, we further applied our method to the reliable construction of ncRNA-protein interaction networks. RPI-Pred is publicly available at: http://ctsb.is.wfubmc.edu/projects/rpi-pred.
Affiliation(s)
- V Suresh
  - Department of Radiology, Wake Forest University Health Science, Medical Center Boulevard, Winston-Salem, NC 27157, USA.
- Liang Liu
  - Department of Radiology, Wake Forest University Health Science, Medical Center Boulevard, Winston-Salem, NC 27157, USA.
- Donald Adjeroh
  - Lane Department of Computer Science and Electrical Engineering, West Virginia University, Morgantown, WV 26505, USA.
- Xiaobo Zhou
  - Department of Radiology, Wake Forest University Health Science, Medical Center Boulevard, Winston-Salem, NC 27157, USA.