1
Zhu L, Fang Y, Liu S, Shen HB, De Neve W, Pan X. ToxDL 2.0: Protein toxicity prediction using a pretrained language model and graph neural networks. Comput Struct Biotechnol J 2025; 27:1538-1549. [PMID: 40276117; PMCID: PMC12018212; DOI: 10.1016/j.csbj.2025.04.002] [Received: 12/26/2024] [Revised: 03/31/2025] [Accepted: 04/01/2025] [Indexed: 04/26/2025]
Abstract
Motivation: Assessing the potential toxicity of proteins is crucial for both therapeutic and agricultural applications. Traditional experimental methods for protein toxicity evaluation are time-consuming, expensive, and labor-intensive, highlighting the requirement for efficient computational approaches. Recent advancements in language models and deep learning have significantly improved protein toxicity prediction, yet current models often lack the ability to integrate evolutionary and structural information, which is crucial for accurate toxicity assessment of proteins. Results: In this study, we present ToxDL 2.0, a novel multimodal deep learning model for protein toxicity prediction that integrates both evolutionary and structural information derived from a pretrained language model and AlphaFold2. ToxDL 2.0 consists of three key modules: (1) a Graph Convolutional Network (GCN) module for generating protein graph embeddings based on AlphaFold2-predicted structures, (2) a domain embedding module for capturing protein domain representations, and (3) a dense module that combines these embeddings to predict the toxicity. After constructing a comprehensive toxicity benchmark dataset, we obtained experimental results on both an original non-redundant test set (comprising pre-2022 protein sequences) and an independent non-redundant test set (a holdout set of post-2022 protein sequences), demonstrating that ToxDL 2.0 outperforms existing state-of-the-art methods. Additionally, we utilized Integrated Gradients to discover known toxic motifs associated with protein toxicity. A web server for ToxDL 2.0 is publicly available at www.csbio.sjtu.edu.cn/bioinf/ToxDL2/.
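The three-module layout described in this abstract (graph embedding, domain embedding, dense fusion) can be illustrated with a toy forward pass. This is only a sketch under assumptions, not the ToxDL 2.0 implementation: the single graph-convolution layer uses the common symmetric normalization D^-1/2 (A + I) D^-1/2, mean pooling is assumed for the graph embedding, and every name and weight below (`gcn_layer`, `predict_toxicity`, `gcn_w`, `dense_w`) is hypothetical.

```python
import math

def normalized_adjacency(adj):
    """Return D^-1/2 (A + I) D^-1/2 for a dense 0/1 adjacency matrix."""
    n = len(adj)
    # Add self-loops so each residue keeps its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]

def matmul(a, b):
    """Plain dense matrix product of two lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def gcn_layer(adj_norm, feats, weight):
    """One graph-convolution step: ReLU(A_norm @ H @ W)."""
    out = matmul(matmul(adj_norm, feats), weight)
    return [[max(0.0, v) for v in row] for row in out]

def predict_toxicity(adj, residue_feats, domain_vec, gcn_w, dense_w, dense_b):
    """Toy fusion: graph embedding + domain embedding -> sigmoid score."""
    h = gcn_layer(normalized_adjacency(adj), residue_feats, gcn_w)
    graph_vec = [sum(col) / len(h) for col in zip(*h)]  # mean pooling (assumed)
    fused = graph_vec + domain_vec                      # concatenation
    logit = sum(w * x for w, x in zip(dense_w, fused)) + dense_b
    return 1.0 / (1.0 + math.exp(-logit))

# Two residues in contact, 2-d residue features, 1-d domain embedding.
adj = [[0, 1], [1, 0]]
score = predict_toxicity(adj, [[1.0, 0.0], [0.0, 1.0]], [0.5],
                         [[1.0, 0.0], [0.0, 1.0]], [0.3, -0.2, 0.1], 0.0)
print(round(score, 3))
```

In a real model the residue features would come from the pretrained language model and the weights would be learned; here they are fixed by hand purely to show how the pieces connect.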
Affiliation(s)
- Lin Zhu
  - School of Health Science and Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China
- Yi Fang
  - Department of Automation, Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, Shanghai 200240, China
  - Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai 200240, China
- Shuting Liu
  - School of Health Science and Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China
- Hong-Bin Shen
  - Department of Automation, Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, Shanghai 200240, China
  - Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai 200240, China
- Wesley De Neve
  - Department for Electronics and Information Systems, IDLab, Ghent University, Ghent 9000, Belgium
  - Department of Environmental Technology, Food Technology and Molecular Biotechnology, Center for Biotech Data Science, Ghent University Global Campus, Songdo, Incheon 305-701, South Korea
- Xiaoyong Pan
  - Department of Automation, Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, Shanghai 200240, China
  - Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai 200240, China
2
Shey RA, Nchanji GT, Stong TYA, Yaah NE, Shintouo CM, Yengo BN, Nebangwa DN, Efeti MT, Chick JA, Ayuk AB, Gwei KY, Lemoge AA, Vanhamme L, Ghogomu SM, Souopgui J. One Health Approach to the Computational Design of a Lipoprotein-Based Multi-Epitope Vaccine Against Human and Livestock Tuberculosis. Int J Mol Sci 2025; 26:1587. [PMID: 40004053; PMCID: PMC11855821; DOI: 10.3390/ijms26041587] [Received: 01/01/2025] [Revised: 01/29/2025] [Accepted: 02/11/2025] [Indexed: 02/27/2025]
Abstract
Tuberculosis (TB) remains a major cause of ill health and one of the leading causes of death worldwide, with about 1.25 million deaths estimated in 2023. Control measures have focused principally on early diagnosis, the treatment of active TB, and vaccination. However, the widespread emergence of anti-tuberculosis drug resistance remains the major public health threat to progress made in global TB care and control. Moreover, the Bacillus Calmette-Guérin (BCG) vaccine, the only licensed vaccine against TB in children, has been in use for over a century, and there have been considerable debates concerning its effectiveness in TB control. A multi-epitope vaccine against TB would be an invaluable tool to attain the Global Plan to End TB 2023-2030 target. A rational approach that combines several B-cell and T-cell epitopes from key lipoproteins was adopted to design a novel multi-epitope vaccine candidate. In addition, interactions with TLR4 were implemented to assess its ability to elicit an innate immune response. The conservation of the selected proteins suggests the possibility of cross-protection in line with the One Health approach to disease control. The vaccine candidate was predicted to be both antigenic and immunogenic, and immune simulation analyses demonstrated its ability to elicit both humoral and cellular immune responses. Protein-protein docking and normal-mode analyses of the vaccine candidate with TLR4 predicted efficient binding and stable interaction. This study provides a promising One Health approach for the design of multi-epitope vaccines against human and livestock tuberculosis. Overall, the designed vaccine candidate demonstrated immunogenicity and safety features that warrant further experimental validation in vitro and in vivo.
Affiliation(s)
- Robert Adamu Shey
  - Department of Biochemistry and Molecular Biology, Faculty of Science, University of Buea, Buea P.O. Box 63, Cameroon
  - Tropical Disease Interventions, Diagnostics, Vaccines and Therapeutics (TroDDIVaT) Initiative, Buea P.O. Box 1022, Cameroon
- Gordon Takop Nchanji
  - Tropical Disease Interventions, Diagnostics, Vaccines and Therapeutics (TroDDIVaT) Initiative, Buea P.O. Box 1022, Cameroon
  - Department of Microbiology and Parasitology, Faculty of Science, University of Buea, Buea P.O. Box 63, Cameroon
- Tangan Yanick Aqua Stong
  - Department of Biochemistry and Molecular Biology, Faculty of Science, University of Buea, Buea P.O. Box 63, Cameroon
- Ntang Emmaculate Yaah
  - Department of Biochemistry and Molecular Biology, Faculty of Science, University of Buea, Buea P.O. Box 63, Cameroon
- Cabirou Mounchili Shintouo
  - Department of Microbiology and Immunology, College of Medicine, Drexel University, 2900 W Queen Ln, Philadelphia, PA 19129, USA
- Bernis Neneyoh Yengo
  - Department of Microbiology and Immunology, College of Medicine, Drexel University, 2900 W Queen Ln, Philadelphia, PA 19129, USA
- Derrick Neba Nebangwa
  - Department of Biochemistry and Molecular Biology, Faculty of Science, University of Buea, Buea P.O. Box 63, Cameroon
- Mary Teke Efeti
  - Department of Biochemistry and Molecular Biology, Faculty of Science, University of Buea, Buea P.O. Box 63, Cameroon
  - Frailty in Ageing Research Group, Vrije Universiteit Brussel, Laarbeeklaan 103, B-1090 Brussels, Belgium
  - Department of Gerontology, Faculty of Medicine and Pharmacy, Vrije Universiteit Brussel, Laarbeeklaan 103, B-1090 Brussels, Belgium
- Joan Amban Chick
  - Department of Computer and Information Sciences, College of Science and Technology, Covenant University, PMB 1023, Ota 112233, Ogun State, Nigeria
- Abey Blessings Ayuk
  - Department of Biochemistry and Molecular Biology, Faculty of Science, University of Buea, Buea P.O. Box 63, Cameroon
- Ketura Yaje Gwei
  - Department of Biochemistry and Molecular Biology, Faculty of Science, University of Buea, Buea P.O. Box 63, Cameroon
- Luc Vanhamme
  - Department of Molecular Biology, Institute of Biology and Molecular Medicine, IBMM, Gosselies, Université Libre de Bruxelles, Rue des Professeurs Jeener et Brachet 12, B-6041 Charleroi, Belgium
- Stephen Mbigha Ghogomu
  - Department of Biochemistry and Molecular Biology, Faculty of Science, University of Buea, Buea P.O. Box 63, Cameroon
- Jacob Souopgui
  - Department of Molecular Biology, Institute of Biology and Molecular Medicine, IBMM, Gosselies, Université Libre de Bruxelles, Rue des Professeurs Jeener et Brachet 12, B-6041 Charleroi, Belgium
3
Rivera-Asencios D, Espinoza-Culupú A, Carmen-Sifuentes S, Ramirez P, García-de-la-Guarda R. Design of a multi-epitope vaccine candidate against carrion disease by immunoinformatics approach. Comput Biol Med 2025; 184:109397. [PMID: 39566279; DOI: 10.1016/j.compbiomed.2024.109397] [Received: 06/28/2024] [Revised: 10/09/2024] [Accepted: 11/07/2024] [Indexed: 11/22/2024]
Abstract
Carrion's disease, caused by the bacterium Bartonella bacilliformis, is a serious public health problem in Peru, Ecuador and Colombia. Currently there is no available vaccine against B. bacilliformis. While antibiotics are the standard treatment, resistant strains have been reported, and there is a potential spread of the vector that transmits the bacteria. This study aimed to design a multi-epitope vaccine candidate against the causative agent of Carrion's disease using immunoinformatics tools. Predictions of B-cell epitopes, as well as CD4+ and CD8+ T-cell epitopes, were performed from the entire proteome of B. bacilliformis KC583 using the most frequent alleles from Peru, Ecuador, Colombia, and worldwide. B-cell epitopes and T-cell nested epitopes from outer membrane and virulence-associated proteins were selected. Epitopes were filtered based on promiscuity, non-allergenicity, conservation, non-homology and non-toxicity. Two vaccine constructs were assembled using linkers. The tertiary structure of the constructs was predicted, and their stability was evaluated through molecular dynamics simulations. The most stable construct was selected for molecular docking with the TLR4 receptor. This study proposes a vaccine construct evaluated in silico as a potential vaccine candidate against Bartonella bacilliformis.
Affiliation(s)
- Damaris Rivera-Asencios
  - Molecular Microbiology and Biotechnology Laboratory, Faculty of Biological Sciences, Universidad Nacional Mayor de San Marcos, Lima, Peru
- Abraham Espinoza-Culupú
  - Molecular Microbiology and Biotechnology Laboratory, Faculty of Biological Sciences, Universidad Nacional Mayor de San Marcos, Lima, Peru
- Pablo Ramirez
  - Molecular Microbiology and Biotechnology Laboratory, Faculty of Biological Sciences, Universidad Nacional Mayor de San Marcos, Lima, Peru
- Ruth García-de-la-Guarda
  - Molecular Microbiology and Biotechnology Laboratory, Faculty of Biological Sciences, Universidad Nacional Mayor de San Marcos, Lima, Peru
4
Hessami A, Mogharari Z, Rahim F, Khalesi B, Jamal Nassrullah O, Reza Rahbar M, Khalili S, Jahangiri A. In silico design of a novel hybrid epitope-based antigen harboring highly exposed immunogenic peptides of BamA, OmpA, and Omp34 against Acinetobacter baumannii. Int Immunopharmacol 2024; 142:113066. [PMID: 39241518; DOI: 10.1016/j.intimp.2024.113066] [Received: 05/31/2024] [Revised: 08/07/2024] [Accepted: 08/30/2024] [Indexed: 09/09/2024]
Abstract
Acinetobacter baumannii is among the highest-priority bacteria according to the WHO categorization, which necessitates the exploration of alternative strategies such as vaccination. OmpA, BamA, and Omp34 are assigned as appropriate antigens to serve in vaccine development against this pathogen. Experimentally validated exposed epitopes of OmpA and Omp34 along with selected exposed epitopes predicted by an integrative in silico approach were represented by the barrel domain of BamA as a scaffold. Among the 8 external loops of BamA, 5 loops were replaced with selected loops of OmpA and Omp34. The designed antigen was analyzed regarding the physicochemical properties, antigenicity, epitope retrieval, topology, structure, and safety. BamA is a two-domain OMP with a 16-stranded barrel in which L4, L6, and L7 were the longest loops of BamA in order. The designed antigen consisted of 478 amino acids with an antigen probability of 0.7793. The novel antigen was a 16-stranded barrel. No identical 8-meric peptides were found in the human proteome against the designed antigen sequence. The designed construct was safe regarding the allergenicity, toxicity, and human proteome reactivity. The designed antigen could develop higher protection against A. baumannii in comparison to either OmpA, BamA, or Omp34 alone.
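The 8-mer identity check against the human proteome mentioned in this abstract amounts to a set intersection over fixed-length substrings. The sketch below is illustrative only: the abstract does not name the tool used, the function names `kmers` and `shared_kmers` are hypothetical, and the sequences are toy strings rather than real proteins.

```python
def kmers(seq, k=8):
    """Return the set of all length-k substrings of a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def shared_kmers(antigen, proteome, k=8):
    """Return k-mers of the antigen that occur verbatim in any proteome sequence.

    `proteome` is any iterable of protein sequences (toy strings here).
    """
    antigen_kmers = kmers(antigen, k)
    hits = set()
    for protein in proteome:
        hits |= antigen_kmers & kmers(protein, k)
    return hits

# Toy data: one 8-mer of the "antigen" is embedded in the second "protein".
antigen = "ACDEFGHIKLMN"
proteome = ["MMMMMMMMMMMM", "QQQCDEFGHIKQQ"]
print(shared_kmers(antigen, proteome))
```

An empty result corresponds to the "no identical 8-meric peptides" outcome reported for the designed antigen.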
Affiliation(s)
- Anahita Hessami
  - School of Pharmacy, Shiraz University of Medical Sciences, Shiraz, Iran
- Fatemeh Rahim
  - Department of Molecular Genetics, Faculty of Biological Sciences, Tarbiat Modares, P.O. Box: 14115-154, Tehran, Iran
- Bahman Khalesi
  - Department of Research and Production of Poultry Viral Vaccine, Razi Vaccine and Serum Research Institute, Agricultural Research Education and Extension Organization, Karaj, Iran
- Mohammad Reza Rahbar
  - Pharmaceutical Sciences Research Center, Shiraz University of Medical Sciences, Shiraz, Iran
- Saeed Khalili
  - Department of Biology Sciences, Shahid Rajaee Teacher Training University, Tehran, Iran
- Abolfazl Jahangiri
  - Applied Microbiology Research Center, Biomedicine Technologies Institute, Baqiyatallah University of Medical Sciences, Tehran, Iran
5
Wei Y, Qiu T, Ai Y, Zhang Y, Xie J, Zhang D, Luo X, Sun X, Wang X, Qiu J. Advances of computational methods enhance the development of multi-epitope vaccines. Brief Bioinform 2025; 26:bbaf055. [PMID: 39951549; PMCID: PMC11827616; DOI: 10.1093/bib/bbaf055] [Received: 09/12/2024] [Revised: 11/28/2024] [Accepted: 01/27/2025] [Indexed: 02/16/2025]
Abstract
Vaccine development is one of the most promising fields, and multi-epitope vaccines, which do not need laborious culture processes, are an attractive alternative to classical vaccines with the advantages of safety and efficiency. The rapid development of algorithms and the accumulation of immune data have facilitated the advancement of computer-aided vaccine design. Here we systematically review the in silico data and algorithm resources for the different steps of computational vaccine design, including immunogen selection, epitope prediction, vaccine construction, optimization, and evaluation. The performance of different available tools on epitope prediction and immunogenicity evaluation was tested and compared on benchmark datasets. Finally, we discuss the future research direction for the construction of a multi-epitope vaccine.
Affiliation(s)
- Yiwen Wei
  - School of Health Science and Engineering, University of Shanghai for Science and Technology, No. 334, Jungong Road, Yangpu District, Shanghai 200093, China
- Tianyi Qiu
  - Institute of Clinical Science, Zhongshan Hospital; Intelligent Medicine Institute; Shanghai Institute of Infectious Disease and Biosecurity, Shanghai Medical College, Fudan University, No. 180, Fenglin Road, Xuhui District, Shanghai 200032, China
- Yisi Ai
  - School of Health Science and Engineering, University of Shanghai for Science and Technology, No. 334, Jungong Road, Yangpu District, Shanghai 200093, China
- Yuxi Zhang
  - School of Health Science and Engineering, University of Shanghai for Science and Technology, No. 334, Jungong Road, Yangpu District, Shanghai 200093, China
- Junting Xie
  - School of Health Science and Engineering, University of Shanghai for Science and Technology, No. 334, Jungong Road, Yangpu District, Shanghai 200093, China
- Dong Zhang
  - School of Health Science and Engineering, University of Shanghai for Science and Technology, No. 334, Jungong Road, Yangpu District, Shanghai 200093, China
- Xiaochuan Luo
  - School of Health Science and Engineering, University of Shanghai for Science and Technology, No. 334, Jungong Road, Yangpu District, Shanghai 200093, China
- Xiulan Sun
  - State Key Laboratory of Food Science and Technology, School of Food Science and Technology, National Engineering Research Center for Functional Foods, Synergetic Innovation Center of Food Safety and Nutrition, Jiangnan University, Lihu Avenue 1800, Wuxi, Jiangsu 214122, China
- Xin Wang
  - School of Health Science and Engineering, University of Shanghai for Science and Technology, No. 334, Jungong Road, Yangpu District, Shanghai 200093, China
  - Shanghai Collaborative Innovation Center of Energy Therapy for Tumors, No. 334, Jungong Road, Yangpu District, Shanghai 200093, China
- Jingxuan Qiu
  - School of Health Science and Engineering, University of Shanghai for Science and Technology, No. 334, Jungong Road, Yangpu District, Shanghai 200093, China
  - Shanghai Collaborative Innovation Center of Energy Therapy for Tumors, No. 334, Jungong Road, Yangpu District, Shanghai 200093, China
6
Yu Q, Zhang Z, Liu G, Li W, Tang Y. ToxGIN: an In silico prediction model for peptide toxicity via graph isomorphism networks integrating peptide sequence and structure information. Brief Bioinform 2024; 25:bbae583. [PMID: 39530430; PMCID: PMC11555482; DOI: 10.1093/bib/bbae583] [Received: 08/14/2024] [Revised: 10/22/2024] [Accepted: 10/29/2024] [Indexed: 11/16/2024]
Abstract
Peptide drugs have demonstrated enormous potential in treating a variety of diseases, yet toxicity prediction remains a significant challenge in drug development. Existing models for prediction of peptide toxicity largely rely on sequence information and often neglect the three-dimensional (3D) structures of peptides. This study introduced a novel model for short peptide toxicity prediction, named ToxGIN. The model utilizes Graph Isomorphism Network (GIN), integrating the underlying amino acid sequence composition and the 3D structures of peptides. ToxGIN comprises three primary modules: (i) Sequence processing module, converting peptide 3D structures and sequences into information of nodes and edges; (ii) Feature extraction module, utilizing GIN to learn discriminative features from nodes and edges; (iii) Classification module, employing a fully connected classifier for toxicity prediction. ToxGIN performed well on the independent test set with F1 score = 0.83, AUROC = 0.91, and Matthews correlation coefficient = 0.68, better than existing models for prediction of peptide toxicity. These results validated the effectiveness of integrating 3D structural information with sequence data using GIN for peptide toxicity prediction. The proposed ToxGIN and data are freely accessible at https://github.com/cihebiyql/ToxGIN.
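For readers less familiar with two of the metrics quoted in this abstract, the F1 score and the Matthews correlation coefficient (MCC) can both be computed directly from confusion-matrix counts. The following is a minimal sketch using the standard definitions; the counts in the usage lines are invented for illustration and are not taken from the ToxGIN paper.

```python
import math

def f1_score(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 if undefined."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; returns 0.0 if any marginal is empty."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Invented counts for a 100-peptide test set (illustration only).
tp, tn, fp, fn = 40, 45, 5, 10
print(round(f1_score(tp, fp, fn), 3), round(mcc(tp, tn, fp, fn), 3))
```

Unlike F1, MCC also uses the true-negative count, which is one reason both are commonly reported together on imbalanced toxicity datasets.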
Affiliation(s)
- Qiule Yu
  - Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
- Zhixing Zhang
  - Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
- Guixia Liu
  - Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
- Weihua Li
  - Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
- Yun Tang
  - Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
7
Rathore AS, Choudhury S, Arora A, Tijare P, Raghava GPS. ToxinPred 3.0: An improved method for predicting the toxicity of peptides. Comput Biol Med 2024; 179:108926. [PMID: 39038391; DOI: 10.1016/j.compbiomed.2024.108926] [Received: 08/24/2023] [Revised: 05/17/2024] [Accepted: 07/17/2024] [Indexed: 07/24/2024]
Abstract
Toxicity emerges as a prominent challenge in the design of therapeutic peptides, causing the failure of numerous peptides during clinical trials. In 2013, our group developed ToxinPred, a computational method that has been extensively adopted by the scientific community for predicting peptide toxicity. In this paper, we propose a refined variant of ToxinPred that showcases improved reliability and accuracy in predicting peptide toxicity. Initially, we utilized a similarity/alignment-based approach employing BLAST to predict toxic peptides, which yielded satisfactory accuracy; however, the method suffered from inadequate coverage. Subsequently, we employed a motif-based approach using MERCI software to uncover specific patterns or motifs that are exclusively observed in toxic peptides. The search for these motifs in peptides allowed us to predict toxic peptides with a high level of specificity but poor sensitivity. To overcome the coverage limitations, we developed alignment-free methods using machine/deep learning techniques to balance sensitivity and specificity of prediction. A deep learning model (ANN - LSTM with fixed sequence length) developed using one-hot encoding achieved a maximum AUROC of 0.93 with MCC of 0.71 on an independent dataset. A machine learning model (extra tree) developed using compositional features of peptides achieved a maximum AUROC of 0.95 with MCC of 0.78. We also developed large language models and achieved a maximum AUC of 0.93 using ESM2-t33. Finally, we developed hybrid or ensemble methods combining two or more methods to enhance performance. Our specific hybrid method, which combines a motif-based approach with a machine learning-based model, achieved a maximum AUROC of 0.98 with MCC 0.81 on an independent dataset. In this study, all models were trained and tested on 80% of data using five-fold cross-validation and evaluated on the remaining 20% of data, called the independent dataset. The evaluation of all methods on an independent dataset revealed that the method proposed in this study exhibited better performance than existing methods. To cater to the needs of the scientific community, we have developed a standalone software, pip package and web-based server ToxinPred3 (https://github.com/raghavagps/toxinpred3 and https://webs.iiitd.edu.in/raghava/toxinpred3/).
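The evaluation protocol described in this abstract (an 80/20 split, with five-fold cross-validation inside the 80% portion and the 20% held out as an independent dataset) can be sketched with index bookkeeping alone. This is an assumption-laden illustration, not the authors' code: the function name `split_and_folds`, the fixed seed, and the absence of class stratification are all choices made here for brevity.

```python
import random

def split_and_folds(n_samples, test_fraction=0.2, n_folds=5, seed=0):
    """Return (independent_indices, [(fit_indices, val_indices), ...]).

    The independent set is never used during cross-validation.
    """
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)           # reproducible shuffle
    n_test = int(round(n_samples * test_fraction))
    independent, train = idx[:n_test], idx[n_test:]
    # Deal the training indices round-robin into n_folds folds.
    folds = [train[i::n_folds] for i in range(n_folds)]
    cv = []
    for i in range(n_folds):
        val = folds[i]
        fit = [j for k, fold in enumerate(folds) if k != i for j in fold]
        cv.append((fit, val))
    return independent, cv

independent, cv = split_and_folds(100)
print(len(independent), [(len(fit), len(val)) for fit, val in cv])
```

In practice a stratified split is usually preferable for toxicity data so that each fold keeps a similar ratio of toxic to non-toxic peptides.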
Affiliation(s)
- Anand Singh Rathore
  - Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi, 110020, India
- Shubham Choudhury
  - Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi, 110020, India
- Akanksha Arora
  - Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi, 110020, India
- Purva Tijare
  - Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi, 110020, India
- Gajendra P S Raghava
  - Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi, 110020, India
8
Singh S, Kaur N, Gehlot A. Application of artificial intelligence in drug design: A review. Comput Biol Med 2024; 179:108810. [PMID: 38991316; DOI: 10.1016/j.compbiomed.2024.108810] [Received: 03/18/2024] [Revised: 05/31/2024] [Accepted: 06/24/2024] [Indexed: 07/13/2024]
Abstract
Artificial intelligence (AI) is a field of computer science that involves acquiring information, developing rule bases, and mimicking human behaviour. The fundamental concept behind AI is to create intelligent computer systems that can operate with minimal human intervention or without any intervention at all. These rule-based systems are developed using various machine learning and deep learning models, enabling them to solve complex problems. AI is integrated with these models to learn, understand, and analyse provided data. The rapid advancement of AI is reshaping numerous industries, with the pharmaceutical sector experiencing a notable transformation. AI is increasingly being employed to automate, optimize, and personalize various facets of the pharmaceutical industry, particularly in pharmacological research. Traditional drug development methods are known for being time-consuming, expensive, and less efficient, often taking around a decade and costing billions of dollars. The integration of AI techniques addresses these challenges by enabling the examination of compounds with desired properties from a vast pool of input drugs. Furthermore, it plays a crucial role in drug screening by predicting toxicity, bioactivity, ADME properties (absorption, distribution, metabolism, and excretion), physicochemical properties, and more. AI enhances the drug design process by improving the efficiency and accuracy of predicting drug behaviour, interactions, and properties. These approaches further significantly improve the precision of drug discovery processes and decrease clinical trial costs, leading to the development of more effective drugs.
Affiliation(s)
- Simrandeep Singh
  - Department of Electronics & Communication Engineering, UCRD, Chandigarh University, Gharuan, Punjab, India
- Navjot Kaur
  - Department of Pharmacognosy, Amar Shaheed Baba Ajit Singh Jujhar Singh Memorial College of Pharmacy, Bela, Ropar, India
- Anita Gehlot
  - Uttaranchal Institute of Technology, Uttaranchal University, Dehradun, India
9
Oladipo EK, Ojo TO, Elegbeleye OE, Bolaji OQ, Oyewole MP, Ogunlana AT, Olalekan EO, Abiodun B, Adediran DA, Obideyi OA, Olufemi SE, Salamatullah AM, Bourhia M, Younous YA, Adelusi TI. Exploring the nuclear proteins, viral capsid protein, and early antigen protein using immunoinformatic and molecular modeling approaches to design a vaccine candidate against Epstein Barr virus. Sci Rep 2024; 14:16798. [PMID: 39039173; PMCID: PMC11263613; DOI: 10.1038/s41598-024-66828-x] [Received: 01/29/2024] [Accepted: 07/04/2024] [Indexed: 07/24/2024]
Abstract
The available Epstein Barr virus vaccine has tirelessly harnessed the gp350 glycoprotein as its target epitope, but the result has not been preventive. Here, we designed a global multi-epitope vaccine for EBV, with special attention to making sure all strains and preventive antigens are covered. Using a robust computational vaccine design approach, our proposed vaccine is armed with 6-16 mer linear B-cell epitopes, 4-9 mer CTL epitopes, and 8-15 mer HTL epitopes which are verified to induce interleukin 4, 10 & IFN-gamma. We employed deep computational mining coupled with expert intelligence in designing the vaccine, using human Beta defensin-3 (which has been reported to induce the same TLRs as EBV) as the adjuvant. The tendency of the vaccine to cause autoimmune disorder is quenched by the assurance that the construct contains no EBNA-1 homolog. The protein vaccine construct exhibited excellent physicochemical attributes such as an aliphatic index of 59.55, a GRAVY of -0.710, and a ProsaWeb Z score of -3.04. Further computational analysis revealed the vaccine docked favorably with EBV indicted TLR 1, 2, 4 & 9 with satisfactory interaction patterns. With global coverage of 85.75% and the stable molecular dynamics result obtained for the best two interactions, we are optimistic that our nontoxic, non-allergenic multi-epitope vaccine will help to ameliorate the EBV-associated diseases (which include various malignancies, tumors, and cancers) preventively.
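Two of the physicochemical attributes quoted in this abstract have simple closed-form definitions. GRAVY is the mean Kyte-Doolittle hydropathy over all residues, and the aliphatic index (Ikai) is X(Ala) + 2.9 X(Val) + 3.9 (X(Ile) + X(Leu)), where X is mole percent. The sketch below applies those standard formulas to toy sequences; it is not the tool the authors used, and the function names are hypothetical.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def gravy(seq):
    """Grand average of hydropathy: mean Kyte-Doolittle value per residue."""
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)

def aliphatic_index(seq):
    """Ikai aliphatic index from mole percentages of Ala, Val, Ile, Leu."""
    def mole_pct(aa):
        return 100.0 * seq.count(aa) / len(seq)
    return mole_pct("A") + 2.9 * mole_pct("V") + 3.9 * (mole_pct("I") + mole_pct("L"))

# Toy sequences (not the vaccine construct).
print(round(gravy("AVIL"), 3), round(aliphatic_index("AVIL"), 1))
print(round(gravy("KDEN"), 3), round(aliphatic_index("KDEN"), 1))
```

A negative GRAVY, such as the -0.710 reported above, indicates an overall hydrophilic sequence.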
Collapse
Affiliation(s)
- Elijah Kolawole Oladipo
  - Division of Vaccine Design and Development, Helix Biogen Institute, Ogbomoso, 210214, Nigeria
  - Department of Microbiology, Laboratory of Molecular Biology, Immunology and Bioinformatics, Adeleke University, Ede, 232104, Nigeria
- Taiwo Ooreoluwa Ojo
  - Division of Vaccine Design and Development, Helix Biogen Institute, Ogbomoso, 210214, Nigeria
  - Computational Biology and Drug Discovery Laboratory, Department of Biochemistry, Ladoke Akintola University of Technology (LAUTECH), Ogbomoso, 210214, Nigeria
- Oluwabamise Emmanuel Elegbeleye
  - Computational Biology and Drug Discovery Laboratory, Department of Biochemistry, Ladoke Akintola University of Technology (LAUTECH), Ogbomoso, 210214, Nigeria
- Olawale Quadri Bolaji
  - Computational Biology and Drug Discovery Laboratory, Department of Biochemistry, Ladoke Akintola University of Technology (LAUTECH), Ogbomoso, 210214, Nigeria
- Moyosoluwa Precious Oyewole
  - Division of Vaccine Design and Development, Helix Biogen Institute, Ogbomoso, 210214, Nigeria
  - Department of Biochemistry, Bowen University, Iwo, 232101, Nigeria
- Abdeen Tunde Ogunlana
  - Institute of Advanced Medical Research and Training (IAMRAT), College of Medicine, University of Ibadan, Ibadan, 200005, Nigeria
- Emmanuel Obanijesu Olalekan
  - Computational Biology and Drug Discovery Laboratory, Department of Biochemistry, Ladoke Akintola University of Technology (LAUTECH), Ogbomoso, 210214, Nigeria
- Bamidele Abiodun
  - Computational Biology and Drug Discovery Laboratory, Department of Biochemistry, Ladoke Akintola University of Technology (LAUTECH), Ogbomoso, 210214, Nigeria
- Daniel Adewole Adediran
  - Division of Vaccine Design and Development, Helix Biogen Institute, Ogbomoso, 210214, Nigeria
- Seun Elijah Olufemi
  - Division of Vaccine Design and Development, Helix Biogen Institute, Ogbomoso, 210214, Nigeria
- Ahmad Mohammad Salamatullah
  - Department of Food Science and Nutrition, College of Food and Agricultural Sciences, King Saud University, 11, P.O. Box 2460, 11451, Riyadh, Saudi Arabia
- Mohammed Bourhia
  - Laboratory of Therapeutic and Organic Chemistry, Faculty of Pharmacy, University of Montpellier, Montpellier, 34000, France
- Temitope Isaac Adelusi
  - Computational Biology and Drug Discovery Laboratory, Department of Biochemistry, Ladoke Akintola University of Technology (LAUTECH), Ogbomoso, 210214, Nigeria
  - Department of Surgery, School of Medicine, University of Connecticut Health, Farmington Ave, Farmington, CT, 06030, USA
10
Ebrahimikondori H, Sutherland D, Yanai A, Richter A, Salehi A, Li C, Coombe L, Kotkoff M, Warren RL, Birol I. Structure-aware deep learning model for peptide toxicity prediction. Protein Sci 2024; 33:e5076. [PMID: 39196703 PMCID: PMC11193153 DOI: 10.1002/pro.5076] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/09/2024] [Revised: 04/26/2024] [Accepted: 05/28/2024] [Indexed: 08/30/2024]
Abstract
Antimicrobial resistance is a critical public health concern, necessitating the exploration of alternative treatments. While antimicrobial peptides (AMPs) show promise, assessing their toxicity using traditional wet lab methods is both time-consuming and costly. We introduce tAMPer, a novel multi-modal deep learning model designed to predict peptide toxicity by integrating the underlying amino acid sequence composition and the three-dimensional structure of peptides. tAMPer adopts a graph-based representation for peptides, encoding ColabFold-predicted structures, where nodes represent amino acids and edges represent spatial interactions. Structural features are extracted using graph neural networks, and recurrent neural networks capture sequential dependencies. tAMPer's performance was assessed on a publicly available protein toxicity benchmark and an AMP hemolysis dataset we generated. On the latter, tAMPer achieves an F1-score of 68.7%, outperforming the second-best method by 23.4%. On the protein benchmark, tAMPer exhibited an improvement of over 3.0% in the F1-score compared to current state-of-the-art methods. We anticipate tAMPer to accelerate AMP discovery and development by reducing the reliance on laborious toxicity screening experiments.
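The graph representation described in this abstract (nodes as amino acids, edges as spatial interactions) can be illustrated with a minimal sketch. It is not tAMPer's code: the function name `contact_graph`, the use of one representative coordinate per residue, the 8 Å distance cutoff, and the toy coordinates are all assumptions chosen for illustration.

```python
import math
from itertools import combinations


def contact_graph(residues, coords, cutoff=8.0):
    """Build an undirected residue graph.

    Nodes are (index, amino_acid) pairs. An edge joins two residues when
    their representative coordinates lie within `cutoff` angstroms.
    The cutoff value is an illustrative assumption, not a published setting.
    """
    if len(residues) != len(coords):
        raise ValueError("need exactly one coordinate per residue")
    nodes = list(enumerate(residues))
    edges = [
        (i, j)
        for i, j in combinations(range(len(coords)), 2)
        if math.dist(coords[i], coords[j]) <= cutoff
    ]
    return nodes, edges


# Toy peptide: four residues placed on a straight line, 3.8 angstroms apart
# (hypothetical coordinates, not taken from a predicted structure).
toy_sequence = "GIGK"
toy_coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
nodes, edges = contact_graph(toy_sequence, toy_coords)
print(nodes)
print(edges)
```

In a real pipeline the coordinates would come from a predicted structure file, and the node and edge lists would be handed to a graph neural network library along with per-residue features.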
Affiliation(s)
- Hossein Ebrahimikondori
  - Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency, Vancouver, British Columbia, Canada
  - Bioinformatics Graduate Program, University of British Columbia, Vancouver, British Columbia, Canada
- Darcy Sutherland
  - Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency, Vancouver, British Columbia, Canada
  - Public Health Laboratory, British Columbia Centre for Disease Control, Vancouver, British Columbia, Canada
  - Department of Pathology and Laboratory Medicine, University of British Columbia, Vancouver, British Columbia, Canada
- Anat Yanai
  - Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency, Vancouver, British Columbia, Canada
  - Public Health Laboratory, British Columbia Centre for Disease Control, Vancouver, British Columbia, Canada
- Amelia Richter
  - Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency, Vancouver, British Columbia, Canada
  - Public Health Laboratory, British Columbia Centre for Disease Control, Vancouver, British Columbia, Canada
- Ali Salehi
  - Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency, Vancouver, British Columbia, Canada
  - Public Health Laboratory, British Columbia Centre for Disease Control, Vancouver, British Columbia, Canada
- Chenkai Li
  - Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency, Vancouver, British Columbia, Canada
  - Bioinformatics Graduate Program, University of British Columbia, Vancouver, British Columbia, Canada
- Lauren Coombe
  - Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency, Vancouver, British Columbia, Canada
- Monica Kotkoff
  - Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency, Vancouver, British Columbia, Canada
- René L. Warren
  - Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency, Vancouver, British Columbia, Canada
- Inanc Birol
  - Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency, Vancouver, British Columbia, Canada
  - Public Health Laboratory, British Columbia Centre for Disease Control, Vancouver, British Columbia, Canada
  - Department of Pathology and Laboratory Medicine, University of British Columbia, Vancouver, British Columbia, Canada
  - Department of Medical Genetics, University of British Columbia, Vancouver, British Columbia, Canada
11
Das R, Arora R, Nadar K, Saroj S, Singh AK, Patil SA, Raman SK, Misra A, Bajpai U. Insights into the genomic features and lifestyle of B1 subcluster mycobacteriophages. J Basic Microbiol 2024; 64:e2400027. [PMID: 38548701 DOI: 10.1002/jobm.202400027] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/20/2024] [Accepted: 02/24/2024] [Indexed: 06/06/2024]
Abstract
Bacteriophages infecting Mycobacterium smegmatis mc2155 are numerous and, hence, are classified into clusters based on nucleotide sequence similarity. Analyzing phages belonging to clusters/subclusters can help gain deeper insights into their biological features and potential therapeutic applications. In this study, for genomic characterization of B1 subcluster mycobacteriophages, a framework of online tools was developed, which enabled functional annotation of about 55% of the previously deemed hypothetical proteins in B1 phages. We also studied the phenotype, lysogeny status, and antimycobacterial activity of 10 B1 phages against biofilm and an antibiotic-resistant M. smegmatis strain (4XR1). All 10 phages belonged to the Siphoviridae family, appeared temperate based on their spontaneous release from the putative lysogens and showed antibiofilm activity. The highest inhibitory and disruptive effects on biofilm were 64% and 46%, respectively. This systematic characterization using a combination of genomic and experimental tools is a promising approach to furthering our understanding of viral dark matter.
Affiliation(s)
- Ritam Das
  - Department of Life Science, Acharya Narendra Dev College, University of Delhi, Kalkaji, New Delhi, India
  - Faculty of Biological Sciences, Friedrich Schiller University, Jena, Germany
- Ritu Arora
  - Department of Biomedical Science, Acharya Narendra Dev College, University of Delhi, Kalkaji, New Delhi, India
- Kanika Nadar
  - Department of Biomedical Science, Acharya Narendra Dev College, University of Delhi, Kalkaji, New Delhi, India
- Saroj Saroj
  - Department of Biomedical Science, Acharya Narendra Dev College, University of Delhi, Kalkaji, New Delhi, India
- Amit K Singh
  - Experimental Animal Facility, National JALMA Institute for Leprosy and Other Mycobacterial Diseases, Agra, India
- Shripad A Patil
  - Experimental Animal Facility, National JALMA Institute for Leprosy and Other Mycobacterial Diseases, Agra, India
- Sunil K Raman
  - Pharmaceutics and Pharmacokinetic Division, CSIR-Central Drug Research Institute, Lucknow, India
- Amit Misra
  - Pharmaceutics and Pharmacokinetic Division, CSIR-Central Drug Research Institute, Lucknow, India
  - Pharmaceutics and Pharmacokinetic Division, Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, India
- Urmi Bajpai
  - Department of Biomedical Science, Acharya Narendra Dev College, University of Delhi, Kalkaji, New Delhi, India
12
Mall R, Singh A, Patel CN, Guirimand G, Castiglione F. VISH-Pred: an ensemble of fine-tuned ESM models for protein toxicity prediction. Brief Bioinform 2024; 25:bbae270. [PMID: 38842509 PMCID: PMC11154842 DOI: 10.1093/bib/bbae270] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/29/2024] [Revised: 05/06/2024] [Accepted: 05/23/2024] [Indexed: 06/07/2024] Open
Abstract
Peptide- and protein-based therapeutics are becoming a promising treatment regimen for myriad diseases. Toxicity of proteins is the primary hurdle for protein-based therapies. Thus, there is an urgent need for accurate in silico methods for determining toxic proteins to filter the pool of potential candidates. At the same time, it is imperative to precisely identify non-toxic proteins to expand the possibilities for protein-based biologics. To address this challenge, we proposed an ensemble framework, called VISH-Pred, comprising models built by fine-tuning ESM2 transformer models on a large, experimentally validated, curated dataset of protein and peptide toxicities. The primary steps in the VISH-Pred framework are to efficiently estimate protein toxicities taking just the protein sequence as input, employing an undersampling technique to handle the large class imbalance in the data and learning representations from fine-tuned ESM2 protein language models, which are then fed to machine learning techniques such as LightGBM and XGBoost. The VISH-Pred framework is able to correctly identify both peptides/proteins with potential toxicity and non-toxic proteins, achieving a Matthews correlation coefficient of 0.737, 0.716 and 0.322 and F1-score of 0.759, 0.696 and 0.713 on three non-redundant blind tests, respectively, outperforming other methods by over 10% on these quality metrics. Moreover, VISH-Pred achieved the best accuracy and area under the receiver operating characteristic curve scores on these independent test sets, highlighting the robustness and generalization capability of the framework. By making VISH-Pred available as an easy-to-use web server, we expect it to serve as a valuable asset for future endeavors aimed at discerning the toxicity of peptides and enabling efficient protein-based therapeutics.
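For readers comparing the metrics quoted in this abstract, the following minimal sketch shows how the F1-score and the Matthews correlation coefficient (MCC) are computed from binary confusion-matrix counts, and why they are more informative than accuracy on imbalanced toxicity data. The counts are invented for illustration and are not results from VISH-Pred; the function names are likewise hypothetical.

```python
import math


def f1_score(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the denominator is 0."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; defined as 0.0 when undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical imbalanced test set: 80 toxic and 920 non-toxic proteins.
tp, fn = 40, 40    # toxic proteins found and missed
tn, fp = 900, 20   # non-toxic proteins kept and wrongly flagged

accuracy = (tp + tn) / (tp + tn + fp + fn)
print(f"accuracy = {accuracy:.3f}")
print(f"F1       = {f1_score(tp, fp, fn):.3f}")
print(f"MCC      = {mcc(tp, tn, fp, fn):.3f}")
```

With these made-up counts accuracy is high even though half of the toxic proteins are missed, which is the usual reason F1 and MCC are reported alongside it.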
Affiliation(s)
- Raghvendra Mall
  - Biotechnology Research Center, Technology Innovation Institute, P.O. Box 9639, Abu Dhabi, United Arab Emirates
- Ankita Singh
  - Biotechnology Research Center, Technology Innovation Institute, P.O. Box 9639, Abu Dhabi, United Arab Emirates
- Chirag N Patel
  - Biotechnology Research Center, Technology Innovation Institute, P.O. Box 9639, Abu Dhabi, United Arab Emirates
- Gregory Guirimand
  - Biotechnology Research Center, Technology Innovation Institute, P.O. Box 9639, Abu Dhabi, United Arab Emirates
  - Graduate School of Science, Technology and Innovation, Kobe University, 1-1 Rokkodai-cho, Nada-ku, Kobe, 657-8501, Japan
- Filippo Castiglione
  - Biotechnology Research Center, Technology Innovation Institute, P.O. Box 9639, Abu Dhabi, United Arab Emirates
  - Institute for Applied Computing, National Research Council of Italy, Via dei Taurini, 19, 00185, Rome, Italy
13
Beltrán JF, Herrera-Belén L, Parraguez-Contreras F, Farías JG, Machuca-Sepúlveda J, Short S. MultiToxPred 1.0: a novel comprehensive tool for predicting 27 classes of protein toxins using an ensemble machine learning approach. BMC Bioinformatics 2024; 25:148. [PMID: 38609877 PMCID: PMC11010298 DOI: 10.1186/s12859-024-05748-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/07/2023] [Accepted: 03/14/2024] [Indexed: 04/14/2024] Open
Abstract
Protein toxins are defense mechanisms and adaptations found in various organisms and microorganisms, and their use in scientific research as therapeutic candidates is gaining relevance due to their effectiveness and specificity against cellular targets. However, discovering these toxins is time-consuming and expensive. In silico tools, particularly those based on machine learning and deep learning, have emerged as valuable resources to address this challenge. Existing tools primarily focus on binary classification, determining whether a protein is a toxin or not, and occasionally identifying specific types of toxins. For the first time, we propose a novel approach capable of classifying protein toxins into 27 distinct categories based on their mode of action within cells. To accomplish this, we assessed multiple machine learning techniques and found that an ensemble model incorporating the Light Gradient Boosting Machine and Quadratic Discriminant Analysis algorithms exhibited the best performance. During the tenfold cross-validation on the training dataset, our model exhibited notable metrics: 0.840 accuracy, 0.827 F1 score, 0.836 precision, 0.840 sensitivity, and 0.989 AUC. In the testing stage, using an independent dataset, the model achieved 0.846 accuracy, 0.838 F1 score, 0.847 precision, 0.849 sensitivity, and 0.991 AUC. These results present a powerful next-generation tool called MultiToxPred 1.0, accessible through a web application. We believe that MultiToxPred 1.0 has the potential to become an indispensable resource for researchers, facilitating the efficient identification of protein toxins. By leveraging this tool, scientists can accelerate their search for these toxins and advance their understanding of their therapeutic potential.
Affiliation(s)
- Jorge F Beltrán
  - Department of Chemical Engineering, Faculty of Engineering and Science, Universidad de La Frontera, Ave. Francisco Salazar, 01145, Temuco, Chile
- Lisandra Herrera-Belén
  - Departamento de Ciencias Básicas, Facultad de Ciencias, Universidad Santo Tomas, Temuco, Chile
- Fernanda Parraguez-Contreras
  - Department of Chemical Engineering, Faculty of Engineering and Science, Universidad de La Frontera, Ave. Francisco Salazar, 01145, Temuco, Chile
- Jorge G Farías
  - Department of Chemical Engineering, Faculty of Engineering and Science, Universidad de La Frontera, Ave. Francisco Salazar, 01145, Temuco, Chile
- Jorge Machuca-Sepúlveda
  - Department of Chemical Engineering, Faculty of Engineering and Science, Universidad de La Frontera, Ave. Francisco Salazar, 01145, Temuco, Chile
- Stefania Short
  - Department of Chemical Engineering, Faculty of Engineering and Science, Universidad de La Frontera, Ave. Francisco Salazar, 01145, Temuco, Chile
14
Kumar N, Srivastava R. Deep learning in structural bioinformatics: current applications and future perspectives. Brief Bioinform 2024; 25:bbae042. [PMID: 38701422 PMCID: PMC11066934 DOI: 10.1093/bib/bbae042] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/17/2023] [Revised: 01/05/2024] [Accepted: 01/18/2024] [Indexed: 05/05/2024] Open
Abstract
In this review article, we explore the transformative impact of deep learning (DL) on structural bioinformatics, emphasizing its pivotal role in a scientific revolution driven by extensive data, accessible toolkits and robust computing resources. As big data continue to advance, DL is poised to become an integral component in healthcare and biology, revolutionizing analytical processes. Our comprehensive review provides detailed insights into DL, featuring specific demonstrations of its notable applications in bioinformatics. We address challenges tailored for DL, spotlight recent successes in structural bioinformatics and present a clear exposition of DL, from basic shallow neural networks to advanced models such as convolutional, recurrent, artificial and transformer neural networks. This paper discusses the emerging use of DL for understanding biomolecular structures, anticipating ongoing developments and applications in the realm of structural bioinformatics.
Affiliation(s)
- Niranjan Kumar
  - School of Computational and Integrative Sciences, Jawaharlal Nehru University, New Delhi, India
- Rakesh Srivastava
  - Center for Computational Natural Sciences and Bioinformatics, International Institute of Information Technology, Hyderabad, India
15
Ehsasatvatan M, Baghban Kohnehrouz B. A new trivalent recombinant protein for type 2 diabetes mellitus with oral delivery potential: design, expression, and experimental validation. J Biomol Struct Dyn 2024:1-16. [PMID: 38468545 DOI: 10.1080/07391102.2024.2329290] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/23/2023] [Accepted: 03/06/2024] [Indexed: 03/13/2024]
Abstract
Glucagon-like peptide-1 (GLP-1) receptor agonists are increasingly used in clinical practice for the management of type 2 diabetes mellitus. However, the extremely short half-life of GLP-1 and the need for subcutaneous administration limit its clinical application. Thus, half-life extension and alternative delivery methods are highly desired. DARPin domains with high affinity for human serum albumin (HSA) have been selected for the half-life extension of therapeutic peptides and proteins. In the present study, novel trivalent fusion proteins as long-acting GLP-1 receptor agonists with potential for oral delivery were computationally engineered by incorporating a protease-resistant modified GLP-1, an anti-human serum albumin DARPin, and an approved cell-penetrating peptide (Penetratin, Tat, and Polyarginine) linked either by rigid or flexible linkers. Theoretical studies and molecular dynamics simulation results suggested that mGLP1-DARPin-Pen has acceptable quality and stability. Moreover, the potential affinity of the selected fusion proteins for GLP-1 receptor and human serum albumin was explored by molecular docking. The recombinant construct was cloned into the pET28a vector and expressed in Escherichia coli. SDS-PAGE analysis of the purified fusion protein matched its molecular size and was confirmed by western blot analysis. The results demonstrated that the engineered fusion protein could bind HSA with high affinity. Importantly, insulin secretion assays using a mouse pancreatic β-cell line (β-TC6) revealed that the engineered trivalent fusion protein retained the ability to stimulate cellular insulin secretion. Immunofluorescence microscopy analysis indicated the CPP-dependent cellular uptake of mGLP1-DARPin-Pen. These findings demonstrated that mGLP1-DARPin-Pen is a highly potent oral drug candidate that could be particularly useful in the treatment of type 2 diabetes mellitus.
Affiliation(s)
- Maryam Ehsasatvatan
  - Department of Plant Breeding & Biotechnology, Faculty of Agriculture, University of Tabriz, Tabriz, Iran
- Bahram Baghban Kohnehrouz
  - Department of Plant Breeding & Biotechnology, Faculty of Agriculture, University of Tabriz, Tabriz, Iran
16
Liu Y, Wang H, Ding Y. The Dynamical Biomarkers in Functional Connectivity of Autism Spectrum Disorder Based on Dynamic Graph Embedding. Interdiscip Sci 2024; 16:141-159. [PMID: 38060171 DOI: 10.1007/s12539-023-00592-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/08/2023] [Revised: 11/02/2023] [Accepted: 11/02/2023] [Indexed: 12/08/2023]
Abstract
Autism spectrum disorder (ASD) is a neurological and developmental disorder and its early diagnosis is a challenging task. The dynamic brain network (DBN) offers a wealth of information for the diagnosis and treatment of ASD. Mining the spatio-temporal characteristics of DBN is critical for finding dynamic communication across brain regions and, ultimately, identifying the ASD diagnostic biomarker. We proposed the dgEmbed-KNN and the Aggregation-SVM diagnostic models, which use the spatio-temporal information from DBN and interactive information among brain regions represented by dynamic graph embedding. The classification accuracies show that dgEmbed-KNN model performs slightly better than traditional machine learning and deep learning methods, while the Aggregation-SVM model has a very good capacity to diagnose ASD using aggregation brain network connections as features. We discovered over- and under-connections in ASD at the level of dynamic connections, involving brain regions of the postcentral gyrus, the insula, the cerebellum, the caudate nucleus, and the temporal pole. We also found abnormal dynamic interactions associated with ASD within/between the functional subnetworks, including default mode network, visual network, auditory network and saliency network. These can provide potential DBN biomarkers for ASD identification.
Affiliation(s)
- Yanting Liu
  - School of Science, Jiangnan University, Wuxi, 214122, China
- Hao Wang
  - School of Science, Jiangnan University, Wuxi, 214122, China
- Yanrui Ding
  - School of Science, Jiangnan University, Wuxi, 214122, China
17
Zancolli G, von Reumont BM, Anderluh G, Caliskan F, Chiusano ML, Fröhlich J, Hapeshi E, Hempel BF, Ikonomopoulou MP, Jungo F, Marchot P, de Farias TM, Modica MV, Moran Y, Nalbantsoy A, Procházka J, Tarallo A, Tonello F, Vitorino R, Zammit ML, Antunes A. Web of venom: exploration of big data resources in animal toxin research. Gigascience 2024; 13:giae054. [PMID: 39250076 PMCID: PMC11382406 DOI: 10.1093/gigascience/giae054] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/14/2024] [Revised: 07/01/2024] [Accepted: 07/13/2024] [Indexed: 09/10/2024] Open
Abstract
Research on animal venoms and their components spans multiple disciplines, including biology, biochemistry, bioinformatics, pharmacology, medicine, and more. Manipulating and analyzing the diverse array of data required for venom research can be challenging, and relevant tools and resources are often dispersed across different online platforms, making them less accessible to nonexperts. In this article, we address the multifaceted needs of the scientific community involved in venom and toxin-related research by identifying and discussing web resources, databases, and tools commonly used in this field. We have compiled these resources into a comprehensive table available on the VenomZone website (https://venomzone.expasy.org/10897). Furthermore, we highlight the challenges currently faced by researchers in accessing and using these resources and emphasize the importance of community-driven interdisciplinary approaches. We conclude by underscoring the significance of enhancing standards, promoting interoperability, and encouraging data and method sharing within the venom research community.
Affiliation(s)
- Giulia Zancolli
  - Department of Ecology and Evolution, University of Lausanne, 1015 Lausanne, Switzerland
  - SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
- Björn Marcus von Reumont
  - Goethe University Frankfurt, Faculty of Biological Sciences, 60438 Frankfurt, Germany
  - LOEWE Centre for Translational Biodiversity Genomics, 60325 Frankfurt, Germany
- Gregor Anderluh
  - Department of Molecular Biology and Nanobiotechnology, National Institute of Chemistry, 1000 Ljubljana, Slovenia
- Figen Caliskan
  - Department of Biology, Faculty of Science, Eskisehir Osmangazi University, 26040 Eskişehir, Turkey
- Maria Luisa Chiusano
  - Department of Agricultural Sciences, University Federico II of Naples, 80055 Portici, Naples, Italy
  - Department of Research Infrastructures for Marine Biological Resources, Stazione Zoologica Anton Dohrn, Villa Comunale, 80121 Naples, Italy
- Jacob Fröhlich
  - Veterinary Center for Resistance Research (TZR), Freie Universität Berlin, 14163 Berlin, Germany
- Evroula Hapeshi
  - Department of Health Sciences, School of Life and Health Sciences, University of Nicosia, 1700 Nicosia, Cyprus
- Benjamin-Florian Hempel
  - Veterinary Center for Resistance Research (TZR), Freie Universität Berlin, 14163 Berlin, Germany
- Maria P Ikonomopoulou
  - Madrid Institute of Advanced Studies in Food, Precision Nutrition & Aging Program, 28049 Madrid, Spain
- Florence Jungo
  - SIB Swiss Institute of Bioinformatics, Swiss-Prot Group, 1211 Geneva, Switzerland
- Pascale Marchot
  - Laboratory Architecture et Fonction des Macromolécules Biologiques, Aix-Marseille University, Centre National de la Recherche Scientifique, Faculté des Sciences, Campus Luminy, 13288 Marseille, France
- Tarcisio Mendes de Farias
  - Department of Ecology and Evolution, University of Lausanne, 1015 Lausanne, Switzerland
  - SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
- Maria Vittoria Modica
  - Department of Biology and Evolution of Marine Organisms, Stazione Zoologica Anton Dohrn, 00198 Rome, Italy
- Yehu Moran
  - Department of Ecology, Evolution and Behavior, Alexander Silberman Institute of Life Sciences, Faculty of Science, The Hebrew University of Jerusalem, 9190401 Jerusalem, Israel
- Ayse Nalbantsoy
  - Engineering Faculty, Bioengineering Department, Ege University, 35100 Bornova-Izmir, Turkey
- Jan Procházka
  - Laboratory of Transgenic Models of Diseases, Institute of Molecular Genetics of the Czech Academy of Sciences, 252 50 Vestec, Czech Republic
- Andrea Tarallo
  - Institute of Research on Terrestrial Ecosystems (IRET), National Research Council (CNR), 73100 Lecce, Italy
- Fiorella Tonello
  - Neuroscience Institute, National Research Council (CNR), 35131 Padua, Italy
- Rui Vitorino
  - Department of Medical Sciences, iBiMED, University of Aveiro, 3810-193 Aveiro, Portugal
- Mark Lawrence Zammit
  - Department of Clinical Pharmacology & Therapeutics, Faculty of Medicine & Surgery, University of Malta, 2090 Msida, Malta
  - Malta National Poisons Centre, Malta Life Sciences Park, 3000 San Ġwann, Malta
- Agostinho Antunes
  - CIIMAR/CIMAR, Interdisciplinary Centre of Marine and Environmental Research, University of Porto, 4450-208 Porto, Portugal
  - Department of Biology, Faculty of Sciences, University of Porto, 4169-007 Porto, Portugal
18
Hessel SS, Dwivany FM, Zainuddin IM, Wikantika K, Celik I, Emran TB, Tallei TE. A computational simulation appraisal of banana lectin as a potential anti-SARS-CoV-2 candidate by targeting the receptor-binding domain. J Genet Eng Biotechnol 2023; 21:148. [PMID: 38015308 PMCID: PMC10684481 DOI: 10.1186/s43141-023-00569-8] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 12/10/2022] [Accepted: 10/26/2023] [Indexed: 11/29/2023]
Abstract
BACKGROUND: The ongoing concern surrounding coronavirus disease 2019 (COVID-19) primarily stems from continuous mutations in the genome of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), leading to the emergence of numerous variants. The receptor-binding domain (RBD) in the S1 subunit of the S protein of the virus plays a crucial role in recognizing the host's angiotensin-converting enzyme 2 (hACE2) receptor and facilitating cell membrane fusion processes, making it a potential target for preventing viral entrance into cells. This research aimed to determine the potential of banana lectin (BanLec) proteins to inhibit SARS-CoV-2 attachment to host cells by interacting with RBD through computational modeling. MATERIALS AND METHODS: The BanLecs were selected through a sequence analysis process. Subsequently, the genes encoding BanLec proteins were retrieved from the Banana Genome Hub database. The FGENESH online tool was then employed to predict protein sequences, while web-based tools were utilized to assess the physicochemical properties, allergenicity, and toxicity of BanLecs. The RBDs of SARS-CoV-2 were modeled using the SWISS-MODEL in the following step. Molecular docking procedures were conducted with the aid of ClusPro 2.0 and HDOCK web servers. The three-dimensional structures of the docked complexes were visualized using PyMOL. Finally, molecular dynamics simulations were performed to investigate and validate the interactions of the complexes exhibiting the highest interactions, facilitating the simulation of their dynamic properties. RESULTS: The BanLec proteins were successfully modeled based on the RNA sequences from two species of banana (Musa sp.). Moreover, an amino acid modification in the BanLec protein was made to reduce its mitogenicity. Theoretical allergenicity and toxicity predictions were conducted on the BanLecs, which suggested they were likely non-allergenic and contained no discernible toxic domains. Molecular docking analysis demonstrated that both altered and wild-type BanLecs exhibited strong affinity with the RBD of different SARS-CoV-2 variants. Further analysis of the molecular docking results showed that the BanLec proteins interacted with the active site of RBD, particularly the key amino acid residues responsible for RBD's binding to hACE2. Molecular dynamics simulation indicated a stable interaction between the Omicron RBD and BanLec, maintaining a root-mean-square deviation (RMSD) of approximately 0.2 nm for a duration of up to 100 ns. The individual proteins also had stable structural conformations, and the complex demonstrated a favorable binding-free energy (BFE) value. CONCLUSIONS: These results confirm that the BanLec protein is a promising candidate for developing a potential therapeutic agent for combating COVID-19. Furthermore, the results suggest the possibility of BanLec as a broad-spectrum antiviral agent and highlight the need for further studies to examine the protein's safety and effectiveness as a potent antiviral agent.
Affiliation(s)
- Sofia Safitri Hessel
  - School of Life Sciences and Technology, Institut Teknologi Bandung, Bandung, West Java, 40132, Indonesia
- Fenny Martha Dwivany
  - School of Life Sciences and Technology, Institut Teknologi Bandung, Bandung, West Java, 40132, Indonesia
- Ima Mulyama Zainuddin
  - Department of Biosystems, KU Leuven, Willem de Croylaan 42 box 2455, B-3001, Leuven, Belgium
- Ketut Wikantika
  - Remote Sensing and Geographical Information Science Research Group, Faculty of Earth Science and Technology (FITB), Institut Teknologi Bandung, Bandung, West Java, 40132, Indonesia
- Ismail Celik
  - Department of Pharmaceutical Chemistry, Faculty of Pharmacy, Erciyes University, 38039, Kayseri, Turkey
- Talha Bin Emran
  - Department of Pathology and Laboratory Medicine, Warren Alpert Medical School, Brown University, Providence, RI 02912, USA
  - Department of Pharmacy, Faculty of Allied Health Sciences, Daffodil International University, Dhaka, 1207, Bangladesh
  - Legorreta Cancer Center, Brown University, Providence, RI 02912, USA
- Trina Ekawati Tallei
  - Department of Biology, Faculty of Mathematics and Natural Sciences, Sam Ratulangi University, Manado, North Sulawesi, 95115, Indonesia
19
Parthiban S, Vijeesh T, Gayathri T, Shanmugaraj B, Sharma A, Sathishkumar R. Artificial intelligence-driven systems engineering for next-generation plant-derived biopharmaceuticals. Front Plant Sci 2023; 14:1252166. [PMID: 38034587 PMCID: PMC10684705 DOI: 10.3389/fpls.2023.1252166] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/03/2023] [Accepted: 10/17/2023] [Indexed: 12/02/2023]
Abstract
Recombinant biopharmaceuticals including antigens, antibodies, hormones, cytokines, single-chain variable fragments, and peptides have been used as vaccines, diagnostics and therapeutics. Plant molecular pharming is a robust platform that uses plants as an expression system to produce simple and complex recombinant biopharmaceuticals on a large scale. The plant system has several advantages over other host systems, such as humanized expression, glycosylation, scalability, reduced risk of human or animal pathogenic contaminants, and rapid and cost-effective production. Despite these advantages, the expression of recombinant proteins in plant systems is hindered by factors such as non-human post-translational modifications, protein misfolding, conformation changes and instability. Artificial intelligence (AI) plays a vital role in various fields of biotechnology, and in plant molecular pharming a significant increase in yield and stability can be achieved by using AI-based multi-approach interventions to overcome these hindrances. Current limitations of plant-based recombinant biopharmaceutical production can be circumvented with the aid of synthetic biology tools and AI algorithms in plant-based glycan engineering for protein folding, stability, viability, catalytic activity and organelle targeting. The AI models, including but not limited to neural networks, support vector machines, linear regression, Gaussian processes and regressor ensembles, work by learning from training and experimental data sets to design and validate protein structures, thereby optimizing properties such as thermostability, catalytic activity, antibody affinity, and protein folding. This review focuses on integrating systems engineering approaches and AI-based machine learning and deep learning algorithms in protein engineering and host engineering to augment protein production in plant systems to meet the ever-expanding therapeutics market.
Affiliation(s)
- Subramanian Parthiban
  - Plant Genetic Engineering Laboratory, Department of Biotechnology, Bharathiar University, Coimbatore, India
- Thandarvalli Vijeesh
  - Plant Genetic Engineering Laboratory, Department of Biotechnology, Bharathiar University, Coimbatore, India
- Thashanamoorthi Gayathri
  - Plant Genetic Engineering Laboratory, Department of Biotechnology, Bharathiar University, Coimbatore, India
- Balamurugan Shanmugaraj
  - Plant Genetic Engineering Laboratory, Department of Biotechnology, Bharathiar University, Coimbatore, India
- Ashutosh Sharma
  - Tecnologico de Monterrey, School of Engineering and Sciences, Centre of Bioengineering, Queretaro, Mexico
- Ramalingam Sathishkumar
  - Plant Genetic Engineering Laboratory, Department of Biotechnology, Bharathiar University, Coimbatore, India
20
|
Ehsasatvatan M, Baghban Kohnehrouz B. Designing and computational analyzing of chimeric long-lasting GLP-1 receptor agonists for type 2 diabetes. Sci Rep 2023; 13:17778. [PMID: 37853095 PMCID: PMC10584922 DOI: 10.1038/s41598-023-45185-1] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 06/19/2023] [Accepted: 10/17/2023] [Indexed: 10/20/2023] Open
Abstract
Glucagon-like peptide-1 (GLP-1) is an intestinally derived incretin that plays a vital role in engineering the biological circuit involved in treating type 2 diabetes. The exceedingly short half-life (1-2 min) of GLP-1 limits its therapeutic applicability, and the implication of its new variants is under question. Since albumin-binding DARPin as a mimetic molecule has been reported to increase the serum half-life of therapeutic compounds, the interaction of new variants of GLP-1 in fusion with DARPin needs to be examined against the GLP-1 receptor. This study aimed to design stable and functional fusion proteins consisting of new protease-resistant GLP-1 mutants (mGLP1) genetically fused to DARPin as a critical step toward developing long-acting GLP-1 receptor agonists. The stability and solubility of the engineered fusion proteins were analyzed, and their secondary and tertiary structures were predicted and satisfactorily validated. Molecular dynamics simulation studies revealed that the predicted structures of the engineered fusion proteins remained stable throughout the simulation. The relative binding affinity of the engineered fusion proteins' complex with human serum albumin and the GLP-1 receptor individually was assessed using molecular docking analyses. These analyses revealed a higher affinity compared to the interaction of the individual GLP-1 and HSA-binding DARPin with the GLP-1 receptor and human serum albumin, respectively. The present study suggests that engineered fusion proteins can be used as potential molecules in the treatment of type 2 diabetes, and it provides insight into further experimental use of mimetic complexes as alternative molecules to be evaluated as new bio-breaks in the engineering of biological circuits in the treatment of type 2 diabetes.
Affiliation(s)
- Maryam Ehsasatvatan
  - Department of Plant Breeding and Biotechnology, Faculty of Agriculture, University of Tabriz, Tabriz, 51666, Iran
- Bahram Baghban Kohnehrouz
  - Department of Plant Breeding and Biotechnology, Faculty of Agriculture, University of Tabriz, Tabriz, 51666, Iran
21
|
Fathollahi M, Motamedi H, Hossainpour H, Abiri R, Shahlaei M, Moradi S, Dashtbin S, Moradi J, Alvandi A. Designing a novel multi-epitopes pan-vaccine against SARS-CoV-2 and seasonal influenza: in silico and immunoinformatics approach. J Biomol Struct Dyn 2023; 42:10761-10784. [PMID: 37723861 DOI: 10.1080/07391102.2023.2258420] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/02/2023] [Accepted: 09/07/2023] [Indexed: 09/20/2023]
Abstract
The merger of COVID-19 and seasonal influenza infections is considered a potentially serious threat to public health. These two viral agents can cause extensive and severe lower and upper respiratory tract infections with lung damage with host factors. Today, the development of vaccination has been shown to reduce the risk of hospitalization and mortality from the COVID-19 virus and influenza epidemics. Therefore, this study contributes an immunoinformatics approach to producing a vaccine that can elicit strong and specific immune responses against COVID-19 and influenza A and B viruses. The NCBI, GISAID, and UniProt databases were used to retrieve sequences. Linear B cell, cytotoxic T lymphocyte, and helper T lymphocyte epitopes were predicted using online servers. Population coverage of MHC I epitopes worldwide for SARS-CoV-2, Influenza virus H3N2, H3N2, and Yamagata/Victoria was 99.93%, 68.67%, 68.38%, and 85.45%, respectively. Candidate epitopes were linked by GGGGS, GPGPG, and KK linkers. Different epitopes were permuted several times to form different peptides and then screened for antigenicity, allergenicity, and toxicity. The vaccine construct was analyzed for physicochemical properties, conformational B-cell epitopes, interaction with Toll-like receptors, and IFN-gamma induction. The immune stimulation response of the final construct was evaluated using C-IMMSIM. Eventually, the final construct sequence was codon-optimized for Escherichia coli K12 and Homo sapiens to design a multi-epitope vaccine and an mRNA vaccine. In conclusion, due to the variable nature of SARS-CoV-2 and influenza proteins, the design of a multi-epitope vaccine can protect against all their standard variants, but laboratory validation is required. Communicated by Ramaswamy H. Sarma.
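The permutation step described in this abstract, in which epitopes are joined by linkers in different orders before screening, can be sketched as follows. This is only an illustration: the epitope strings are placeholders rather than sequences from the study, the helper `build_constructs` is hypothetical, and a single linker is used throughout because the abstract does not state which linker joins which epitope class.

```python
from itertools import permutations

def build_constructs(epitopes, linker):
    """Yield every ordering of the epitopes, joined by one linker sequence."""
    for order in permutations(epitopes):
        yield linker.join(order)

# Placeholder peptides (hypothetical), joined with the GPGPG linker named in the abstract.
epitopes = ["AAAK", "CCDE", "FFGH"]
constructs = list(build_constructs(epitopes, "GPGPG"))

print(len(constructs))   # number of candidate orderings
print(constructs[0])     # first candidate construct
```

The number of orderings grows factorially with the number of epitopes, so a real pipeline would typically sample orderings and pass each candidate to antigenicity, allergenicity, and toxicity predictors rather than enumerate them all.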
Affiliation(s)
- Matin Fathollahi
  - Student Research Committee, School of Medicine, Kermanshah University of Medical Sciences, Kermanshah, Iran
  - Department of Microbiology, School of Medicine, Kermanshah University of Medical Sciences, Kermanshah, Iran
- Hamid Motamedi
  - Student Research Committee, School of Medicine, Kermanshah University of Medical Sciences, Kermanshah, Iran
  - Department of Microbiology, School of Medicine, Kermanshah University of Medical Sciences, Kermanshah, Iran
- Hadi Hossainpour
  - Student Research Committee, School of Medicine, Kermanshah University of Medical Sciences, Kermanshah, Iran
  - Department of Microbiology, School of Medicine, Kermanshah University of Medical Sciences, Kermanshah, Iran
- Ramin Abiri
  - Fertility and Infertility Research Center, Research Institute for Health Technology, Kermanshah University of Medical Sciences, Kermanshah, Iran
- Mohsen Shahlaei
  - Nano Drug Delivery Research Center, Health Technology Institute, Kermanshah University of Medical Sciences, Kermanshah, Iran
- Sajad Moradi
  - Nano Drug Delivery Research Center, Health Technology Institute, Kermanshah University of Medical Sciences, Kermanshah, Iran
- Shirin Dashtbin
  - Department of Microbiology, School of Medicine, Iran University of Medical Sciences, Tehran, Iran
- Jale Moradi
  - Department of Microbiology, School of Medicine, Kermanshah University of Medical Sciences, Kermanshah, Iran
- Amirhooshang Alvandi
  - Medical Technology Research Center, Research Institute for Health Technology, Kermanshah University of Medical Sciences, Kermanshah, Iran
22
|
Zheng J, Yang X, Huang Y, Yang S, Wuchty S, Zhang Z. Deep learning-assisted prediction of protein-protein interactions in Arabidopsis thaliana. The Plant Journal: for Cell and Molecular Biology 2023; 114:984-994. [PMID: 36919205 DOI: 10.1111/tpj.16188] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2022] [Revised: 02/20/2023] [Accepted: 03/09/2023] [Indexed: 05/27/2023]
Abstract
Currently, the experimentally identified interactome of Arabidopsis (Arabidopsis thaliana) is still far from complete, suggesting that computational prediction methods can complement experimental techniques. Motivated by the prosperity and success of deep learning algorithms and natural language processing techniques, we introduce an integrative deep learning framework, DeepAraPPI, allowing us to predict protein-protein interactions (PPIs) of Arabidopsis utilizing sequence, domain and Gene Ontology (GO) information. Our current DeepAraPPI comprises: (i) a word2vec encoding-based Siamese recurrent convolutional neural network (RCNN) model; (ii) a Domain2vec encoding-based multiple-layer perceptron (MLP) model; and (iii) a GO2vec encoding-based MLP model. Finally, DeepAraPPI combines the prediction results of the three individual predictors through a logistic regression model. Compiling high-quality positive and negative training and test samples by applying strict filtering strategies, DeepAraPPI shows superior performance compared with existing state-of-the-art Arabidopsis PPI prediction methods. DeepAraPPI also provides better cross-species predictive ability in rice (Oryza sativa) than traditional machine learning methods, although the overall performance in cross-species prediction remains to be improved. DeepAraPPI is freely accessible at http://zzdlab.com/deeparappi/. In the meantime, we have also made the source code and data sets of DeepAraPPI available at https://github.com/zjy1125/DeepAraPPI.
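The final step of this abstract, combining the outputs of three individual predictors through a logistic regression model, is a form of stacking. A minimal sketch of the idea follows, assuming each base predictor emits a score in [0, 1]; the toy scores, the helper names `fit_combiner` and `combine`, and the plain gradient-descent fit are illustrative assumptions and do not reproduce the DeepAraPPI implementation.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_combiner(scores, labels, lr=0.5, epochs=2000):
    """Fit logistic-regression weights over per-model scores by batch gradient descent."""
    n_models = len(scores[0])
    weights = [0.0] * n_models
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_models
        grad_b = 0.0
        for row, y in zip(scores, labels):
            # Prediction error for this sample under the current weights.
            err = sigmoid(sum(w * s for w, s in zip(weights, row)) + bias) - y
            for j, s in enumerate(row):
                grad_w[j] += err * s
            grad_b += err
        weights = [w - lr * g / len(scores) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(scores)
    return weights, bias

def combine(weights, bias, row):
    """Combined interaction probability from the three base scores."""
    return sigmoid(sum(w * s for w, s in zip(weights, row)) + bias)

# Hypothetical scores from three base predictors (sequence, domain, GO) per protein pair.
train_scores = [
    (0.9, 0.8, 0.7), (0.8, 0.6, 0.9), (0.7, 0.9, 0.8),
    (0.2, 0.3, 0.1), (0.1, 0.4, 0.2), (0.3, 0.2, 0.3),
]
train_labels = [1, 1, 1, 0, 0, 0]

weights, bias = fit_combiner(train_scores, train_labels)
print(combine(weights, bias, (0.85, 0.75, 0.8)))
print(combine(weights, bias, (0.15, 0.25, 0.2)))
```

A learned combiner of this kind lets the ensemble weight each information source by how reliable it proved on held-out data, instead of averaging the three scores equally.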
Affiliation(s)
- Jingyan Zheng
  - State Key Laboratory of Animal Biotech Breeding, College of Biological Sciences, China Agricultural University, Beijing, 100193, China
- Xiaodi Yang
  - Department of Hematology, Peking University First Hospital, Beijing, 100034, China
- Yan Huang
  - State Key Laboratory of Animal Biotech Breeding, College of Biological Sciences, China Agricultural University, Beijing, 100193, China
- Shiping Yang
  - State Key Laboratory of Plant Physiology and Biochemistry, College of Biological Sciences, China Agricultural University, Beijing, 100193, China
- Stefan Wuchty
  - Department of Computer Science, University of Miami, Miami, FL, 33146, USA
  - Department of Biology, University of Miami, Miami, FL, 33146, USA
  - Sylvester Comprehensive Cancer Center, University of Miami, Miami, FL, 33136, USA
  - Institute of Data Science and Computing, University of Miami, Miami, FL, 33146, USA
- Ziding Zhang
  - State Key Laboratory of Animal Biotech Breeding, College of Biological Sciences, China Agricultural University, Beijing, 100193, China
23
|
Fan Y, Sun G, Pan X. ELMo4m6A: A Contextual Language Embedding-Based Predictor for Detecting RNA N6-Methyladenosine Sites. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2023; 20:944-954. [PMID: 35536814 DOI: 10.1109/tcbb.2022.3173323] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 05/04/2023]
Abstract
N6-methyladenosine (m6A) is a universal post-transcriptional modification of RNAs, and it is widely involved in various biological processes. Identifying m6A modification sites accurately is indispensable to further investigate m6A-mediated biological functions. How to better represent RNA sequences is crucial for building effective computational methods for detecting m6A modification sites. However, traditional encoding methods require complex biological prior knowledge and are time-consuming. Furthermore, most of the existing m6A sites prediction methods are limited to single species, and few methods are able to predict m6A sites across different species and tissues. Thus, it is necessary to design a more efficient computational method to predict m6A sites across multiple species and tissues. In this paper, we proposed ELMo4m6A, a contextual language embedding-based method for predicting m6A sites from RNA sequences without any prior knowledge. ELMo4m6A first learns embeddings of RNA sequences using a language model ELMo, then uses a hybrid convolutional neural network (CNN) and long short-term memory (LSTM) to identify m6A sites. The results of 5-fold cross-validation and independent testing demonstrate that ELMo4m6A is superior to state-of-the-art methods. Moreover, we applied integrated gradients to find potential sequence patterns contributing to m6A sites.
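Integrated gradients, mentioned at the end of this abstract, attributes a prediction to input features by accumulating gradients along a straight path from a baseline to the input. The sketch below illustrates the method on a toy logistic model with an analytic gradient; the model, its weights, and the function names are assumptions made for illustration and are unrelated to the ELMo4m6A network.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def model(x, w):
    """Toy differentiable model: logistic function of a weighted sum."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)))

def gradient(x, w):
    """Analytic gradient of the toy model with respect to its input."""
    p = model(x, w)
    return [p * (1.0 - p) * wi for wi in w]

def integrated_gradients(x, baseline, w, steps=200):
    """Approximate the path integral of gradients with a midpoint Riemann sum."""
    avg_grad = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        grad = gradient(point, w)
        for i in range(len(x)):
            avg_grad[i] += grad[i] / steps
    # Scale the averaged gradient by the input's distance from the baseline.
    return [(xi - b) * g for xi, b, g in zip(x, baseline, avg_grad)]

w = [2.0, -1.0, 0.5]          # hypothetical model weights
x = [1.0, 0.5, 2.0]           # hypothetical input features
baseline = [0.0, 0.0, 0.0]    # all-zero reference input
attributions = integrated_gradients(x, baseline, w)
print(attributions)
print(sum(attributions), model(x, w) - model(baseline, w))
```

The two numbers on the last line should nearly coincide: the attributions sum to the difference between the model output at the input and at the baseline, which is the completeness property of the method.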
24
|
CSM-Toxin: A Web-Server for Predicting Protein Toxicity. Pharmaceutics 2023; 15:pharmaceutics15020431. [PMID: 36839752 PMCID: PMC9966851 DOI: 10.3390/pharmaceutics15020431] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Received: 12/19/2022] [Revised: 01/17/2023] [Accepted: 01/18/2023] [Indexed: 01/31/2023] Open
Abstract
Biologics are one of the most rapidly expanding classes of therapeutics, but can be associated with a range of toxic properties. In small-molecule drug development, early identification of potential toxicity has led to a significant reduction in clinical trial failures; however, we currently lack robust qualitative rules or predictive tools for peptide- and protein-based biologics. To address this, we have manually curated the largest set of high-quality experimental data on peptide and protein toxicities, and developed CSM-Toxin, a novel in-silico protein toxicity classifier, which relies solely on the protein primary sequence. Our approach encodes the protein sequence information using a deep learning natural language model to understand "biological" language, where residues are treated as words and protein sequences as sentences. CSM-Toxin was able to accurately identify peptides and proteins with potential toxicity, achieving an MCC of up to 0.66 across both cross-validation and multiple non-redundant blind tests, outperforming other methods and highlighting the robust and generalisable performance of our model. We strongly believe that CSM-Toxin will serve as a valuable platform to minimise potential toxicity in the biologic development pipeline. Our method is freely available as an easy-to-use webserver.
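The "residues as words, sequences as sentences" framing in this abstract can be illustrated with a simple tokenizer that turns a protein sequence into integer ids, the form a language model consumes. This is a sketch under stated assumptions: the vocabulary layout, the `<unk>` token, and the function names are hypothetical and are not taken from CSM-Toxin.

```python
# The 20 standard amino acids, one letter each; id 0 is reserved for unknown symbols.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
VOCAB = {"<unk>": 0, **{aa: i + 1 for i, aa in enumerate(AMINO_ACIDS)}}

def tokenize(sequence):
    """Split a protein 'sentence' into single-residue 'words'."""
    return list(sequence.strip().upper())

def encode(sequence):
    """Map each residue to an integer id; non-standard symbols map to <unk>."""
    return [VOCAB.get(token, VOCAB["<unk>"]) for token in tokenize(sequence)]

print(tokenize("MKTX"))
print(encode("MKTX"))
```

A pretrained protein language model would map these ids to contextual embeddings, which a downstream classifier then uses to predict toxicity.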
25
|
Ahmadi Moghaddam Y, Maroufi A, Zareei S, Irani M. Computational design of fusion proteins against ErbB2-amplified tumors inspired by ricin toxin. Front Mol Biosci 2023; 10:1098365. [PMID: 36936983 PMCID: PMC10018397 DOI: 10.3389/fmolb.2023.1098365] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/14/2022] [Accepted: 02/21/2023] [Indexed: 03/06/2023] Open
Abstract
Although the anti-cancer activity of ricin is well-known, its non-specific targeting challenges the development of ricin-derived medicines. In the present study, novel potential ribosome-inactivating fusion proteins (RIPs) were computationally engineered by incorporating an ErbB2-dependent penetrating peptide (KCCYSL, MARAKE, WYSWLL, MARSGL, MSRTMS, or WYAWML), a linker (either EAAAK or GGGGS) and chain A of ricin, which is responsible for ribosome inactivation. Molecular dynamics simulations helped ensure that the conformation and dynamic behavior of ricin chain A changed as little as possible in the selected chimeric proteins (CPs). Moreover, the potential affinity of the selected CPs for the ligand-uptaking ErbB2 domain was explored by molecular docking. The results showed that two CPs (CP2 and CP10) could bind the receptor with the greatest affinity.
Affiliation(s)
- Yasser Ahmadi Moghaddam
  - Department of Plant Production and Genetics, Faculty of Agriculture, University of Kurdistan, Sanandaj, Iran
  - *Correspondence: Yasser Ahmadi Moghaddam,
- Asad Maroufi
  - Department of Plant Production and Genetics, Faculty of Agriculture, University of Kurdistan, Sanandaj, Iran
- Sara Zareei
  - Department of Cell & Molecular Biology, Faculty of Biological Sciences, Kharazmi University, Tehran, Iran
- Mehdi Irani
  - Department of Chemistry, Faculty of Science, University of Kurdistan, Sanandaj, Iran
26
|
Shi H, Li Y, Chen Y, Qin Y, Tang Y, Zhou X, Zhang Y, Wu Y. ToxMVA: An end-to-end multi-view deep autoencoder method for protein toxicity prediction. Comput Biol Med 2022; 151:106322. [PMID: 36435057 DOI: 10.1016/j.compbiomed.2022.106322] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/29/2022] [Revised: 11/03/2022] [Accepted: 11/14/2022] [Indexed: 11/18/2022]
Abstract
Effectively predicting protein toxicity is an essential step in the early stage of protein-based drug discovery, helping to speed up novel drug screening and reduce costs. Recently, several relevant datasets have been designed, and machine learning-based methods have been proposed to predict protein toxicity and have shown satisfactory performance. However, previous studies generally concatenate different protein features directly, which may introduce irrelevant information and decrease model performance. In this study, we present a novel end-to-end deep learning-based method called ToxMVA to predict protein toxicity. Specifically, we first build comprehensive feature profiles of proteins based on primary sequences, including sequential, physicochemical, and contextual semantic information. Next, an autoencoder network is introduced to integrate the multi-view information and obtain a more concise and accurate feature representation. Extensive experimental results on three datasets demonstrate that ToxMVA has superior performance for protein toxicity prediction and shows better robustness across the three datasets.
Affiliation(s)
- Hua Shi
  - School of Opto-electronic and Communication Engineering, Xiamen University of Technology, Xiamen, 361024, Fujian, China
- Yan Li
  - School of Opto-electronic and Communication Engineering, Xiamen University of Technology, Xiamen, 361024, Fujian, China
- Yi Chen
  - School of Opto-electronic and Communication Engineering, Xiamen University of Technology, Xiamen, 361024, Fujian, China
- Yuming Qin
  - Anesthesiology Department, The Affiliated Hospital of Southwest Medical University, Luzhou, 646000, Sichuan, China
- Yifan Tang
  - Anesthesiology Department, The Affiliated Hospital of Southwest Medical University, Luzhou, 646000, Sichuan, China
- Xun Zhou
  - Beidahuang Industry Group General Hospital, Harbin, China
- Ying Zhang
  - Anesthesiology Department, The Affiliated Hospital of Southwest Medical University, Luzhou, 646000, Sichuan, China
- Yun Wu
  - College of Computer and Information Engineering, Xiamen University of Technology, Xiamen, 361024, Fujian, China
27
|
Zhao Z, Gui J, Yao A, Le NQK, Chua MCH. Improved Prediction Model of Protein and Peptide Toxicity by Integrating Channel Attention into a Convolutional Neural Network and Gated Recurrent Units. ACS Omega 2022; 7:40569-40577. [PMID: 36385847 PMCID: PMC9647964 DOI: 10.1021/acsomega.2c05881] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 09/11/2022] [Accepted: 10/19/2022] [Indexed: 06/16/2023]
Abstract
In recent times, the importance of peptides in the biomedical domain has received increasing attention in terms of their effect on multiple disease treatments. However, before successful large-scale implementation in the industry, accurate identification of peptide toxicity is a vital prerequisite. The existing computational methods have achieved good results in toxicity prediction, and we present an improved model based on different deep learning architectures. The modification mainly focuses on two aspects: sequence encoding and variational information bottlenecks. Consequently, one of our modified plans shows an obvious increase in sensitivity, while the rest show good performance and add novelty to the peptide toxicity prediction domain. In detail, our best model achieved accuracies of 97.38% and 95.03% in protein and peptide toxicity predictions, respectively. The performance was superior to previous predictors on the same datasets.
Affiliation(s)
- Zhengyun Zhao
  - Institute of Systems Science, National University of Singapore, 25 Heng Mui Keng Terrace, Singapore 119615, Singapore
- Jingyu Gui
  - Institute of Systems Science, National University of Singapore, 25 Heng Mui Keng Terrace, Singapore 119615, Singapore
- Anqi Yao
  - Institute of Systems Science, National University of Singapore, 25 Heng Mui Keng Terrace, Singapore 119615, Singapore
- Nguyen Quoc Khanh Le
  - Professional Master Program in Artificial Intelligence in Medicine, College of Medicine, Taipei Medical University, Taipei 106, Taiwan
  - Research Center for Artificial Intelligence in Medicine, Taipei Medical University, Taipei 106, Taiwan
  - Translational Imaging Research Center, Taipei Medical University Hospital, Taipei 110, Taiwan
- Matthew Chin Heng Chua
  - Institute of Systems Science, National University of Singapore, 25 Heng Mui Keng Terrace, Singapore 119615, Singapore
28
|
Ahn SY, Kim M, Bae JE, Bang IS, Lee SW. Reliability of the In Silico Prediction Approach to In Vitro Evaluation of Bacterial Toxicity. Sensors (Basel, Switzerland) 2022; 22:6557. [PMID: 36081016 PMCID: PMC9459819 DOI: 10.3390/s22176557] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/10/2022] [Revised: 08/26/2022] [Accepted: 08/26/2022] [Indexed: 06/15/2023]
Abstract
Several pathogens that spread through the air are highly contagious, and related infectious diseases are more easily transmitted through airborne transmission under indoor conditions, as observed during the COVID-19 pandemic. Indoor air contaminated by microorganisms, including viruses, bacteria, and fungi, or by derived pathogenic substances, can endanger human health. Thus, identifying and analyzing the potential pathogens residing in the air are crucial to preventing disease and maintaining indoor air quality. Here, we applied deep learning technology to analyze and predict the toxicity of bacteria in indoor air. We trained the ProtBert model on toxic bacterial and virulence factor proteins and applied it to predict the potential toxicity of some bacterial species by analyzing their protein sequences. The predictions were consistent with the results of the in vitro analysis of their toxicity in human cells. The in silico simulation and the obtained results demonstrated that it is plausible to find possible toxic sequences in unknown protein sequences.
Affiliation(s)
- Sung-Yoon Ahn
  - Pattern Recognition and Machine Learning Lab, Department of AI Software, Gachon University, Seongnam 13557, Korea
- Mira Kim
  - Department of Microbiology and Immunology, Chosun University School of Dentistry, Gwangju 61452, Korea
- Ji-Eun Bae
  - Department of Microbiology and Immunology, Chosun University School of Dentistry, Gwangju 61452, Korea
- Iel-Soo Bang
  - Department of Microbiology and Immunology, Chosun University School of Dentistry, Gwangju 61452, Korea
- Sang-Woong Lee
  - Pattern Recognition and Machine Learning Lab, Department of AI Software, Gachon University, Seongnam 13557, Korea
29
|
Sharma N, Naorem LD, Jain S, Raghava GPS. ToxinPred2: an improved method for predicting toxicity of proteins. Brief Bioinform 2022; 23:6590152. [PMID: 35595541 DOI: 10.1093/bib/bbac174] [Citation(s) in RCA: 117] [Impact Index Per Article: 39.0] [Received: 01/05/2022] [Revised: 03/31/2022] [Accepted: 04/18/2022] [Indexed: 12/13/2022] Open
Abstract
Proteins/peptides have been shown to be promising therapeutic agents for a variety of diseases. However, toxicity is one of the obstacles in protein/peptide-based therapy. The current study describes a web-based tool, ToxinPred2, developed for predicting the toxicity of proteins. This is an update of ToxinPred, which was developed mainly for predicting the toxicity of peptides and small proteins. The method has been trained, tested and evaluated on three datasets curated from the recent release of SwissProt. To provide unbiased evaluation, we performed internal validation on 80% of the data and external validation on the remaining 20% of the data. We have implemented the following techniques for predicting protein toxicity: (i) Basic Local Alignment Search Tool-based similarity, (ii) Motif-EmeRging and with Classes-Identification-based motif search and (iii) prediction models. Similarity and motif-based techniques achieved a high probability of correct prediction with poor sensitivity/coverage, whereas models based on machine-learning techniques achieved balanced sensitivity and specificity with reasonably high accuracy. Finally, we developed a hybrid method that combined all three approaches and achieved a maximum area under the receiver operating characteristic curve of around 0.99 with a Matthews correlation coefficient of 0.91 on the validation dataset. In addition, we developed models on alternate and realistic datasets. The best machine learning models have been implemented in the web server named 'ToxinPred2', which is available at https://webs.iiitd.edu.in/raghava/toxinpred2/ and as a standalone version at https://github.com/raghavagps/toxinpred2. This is a general method developed for predicting the toxicity of proteins regardless of their source of origin.
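The Matthews correlation coefficient reported in this abstract summarises a binary confusion matrix in a single value between -1 and 1 and stays informative when toxic and non-toxic classes are imbalanced. A minimal sketch of the standard formula follows; the confusion-matrix counts are invented for illustration and are not results from ToxinPred2.

```python
import math

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Conventional fallback when any marginal total is zero.
        return 0.0
    return (tp * tn - fp * fn) / denom

# Hypothetical counts: 45 true positives, 45 true negatives, 5 errors of each kind.
print(mcc(tp=45, tn=45, fp=5, fn=5))
```

A value of 1 corresponds to perfect prediction, 0 to performance no better than chance, and -1 to complete disagreement between predictions and labels.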
Affiliation(s)
- Neelam Sharma
  - Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi-110020, India
- Leimarembi Devi Naorem
  - Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi-110020, India
- Shipra Jain
  - Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi-110020, India
- Gajendra P S Raghava
  - Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi-110020, India
30
|
Yang X, Yang S, Ren P, Wuchty S, Zhang Z. Deep Learning-Powered Prediction of Human-Virus Protein-Protein Interactions. Front Microbiol 2022; 13:842976. [PMID: 35495666 PMCID: PMC9051481 DOI: 10.3389/fmicb.2022.842976] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/24/2021] [Accepted: 03/25/2022] [Indexed: 11/13/2022] Open
Abstract
Identifying human-virus protein-protein interactions (PPIs) is an essential step for understanding viral infection mechanisms and the antiviral response of the human host. Recent advances in high-throughput experimental techniques have enabled a significant accumulation of human-virus PPI data, which has further fueled the development of machine learning-based human-virus PPI prediction methods. Emerging as a very promising method to predict human-virus PPIs, deep learning shows a powerful ability to integrate large-scale datasets, learn complex sequence-structure relationships of proteins and convert the learned patterns into final prediction models with high accuracy. Focusing on the recent progress of deep learning-powered human-virus PPI predictions, we review technical details of these newly developed methods, including dataset preparation, deep learning architectures, feature engineering, and performance assessment. Moreover, we discuss the current challenges and potential solutions and provide future perspectives of human-virus PPI prediction in the coming post-AlphaFold2 era.
Affiliation(s)
- Xiaodi Yang
  - State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing, China
- Shiping Yang
  - State Key Laboratory of Plant Physiology and Biochemistry, College of Biological Sciences, China Agricultural University, Beijing, China
- Panyu Ren
  - State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing, China
- Stefan Wuchty
  - Department of Computer Science, University of Miami, Miami, FL, United States
  - Department of Biology, University of Miami, Miami, FL, United States
  - Sylvester Comprehensive Cancer Center, University of Miami, Miami, FL, United States
- Ziding Zhang
  - State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing, China
  - *Correspondence: Ziding Zhang,
31
|
Kumar A, Sharma P, Arun A, Meena LS. Development of peptide vaccine candidate using highly antigenic PE-PGRS family proteins to stimulate the host immune response against Mycobacterium tuberculosis H37Rv: an immuno-informatics approach. J Biomol Struct Dyn 2022; 41:3382-3404. [PMID: 35293852 DOI: 10.1080/07391102.2022.2048079] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
Abstract
Tuberculosis (TB) is a fast-spreading, transmissible disease caused by Mycobacterium tuberculosis (M. tuberculosis). M. tuberculosis has a high death rate in its endemic regions due to a lack of appropriate treatment and preventative measures. We have used a vaccinomics strategy to create an effective multi-epitope vaccine against M. tuberculosis. The antigenic proteins with the highest antigenicity were utilised to predict cytotoxic T-lymphocyte (CTL), helper T-lymphocyte (HTL), and linear B-lymphocyte (LBL) epitopes. CTL and HTL epitopes covered 99.97% of the population. Seven epitopes each of CTL, HTL, and LBL were ultimately selected and utilised to develop a multi-epitope vaccine. A vaccine design was developed by combining these epitopes with suitable linkers and the LprG adjuvant. The vaccine chimera was found to be highly immunogenic, non-allergenic, and non-toxic. To ensure better expression within the Escherichia coli K12 (E. coli K12) host system, codon adaptation and in silico cloning were carried out. Following that, various validation studies were conducted, including molecular docking, molecular dynamics simulation, and immunological simulation, all of which indicated that the designed vaccine would be stable in the biological environment and effective against M. tuberculosis infection. The immune simulation revealed higher levels of T-cell and B-cell activity, which corresponded to the actual immune response. Exposure simulations were repeated several times, resulting in increased clonal selection and faster antigen clearance. These results suggest that, if the proposed vaccine chimera is tested both in vitro and in vivo, it could be a viable treatment and preventive strategy for TB. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Ajit Kumar
- CSIR-Institute of Genomics and Integrative Biology, Delhi, India; Academy of Scientific and Innovative Research (AcSIR), CSIR-HRDC, Ghaziabad, Uttar Pradesh, India
| | - Priyanka Sharma
- CSIR-Institute of Genomics and Integrative Biology, Delhi, India
| | - Akanksha Arun
- CSIR-Institute of Genomics and Integrative Biology, Delhi, India; Academy of Scientific and Innovative Research (AcSIR), CSIR-HRDC, Ghaziabad, Uttar Pradesh, India
| | - Laxman S Meena
- CSIR-Institute of Genomics and Integrative Biology, Delhi, India; Academy of Scientific and Innovative Research (AcSIR), CSIR-HRDC, Ghaziabad, Uttar Pradesh, India
|
32
|
Wei L, Ye X, Sakurai T, Mu Z, Wei L. ToxIBTL: prediction of peptide toxicity based on information bottleneck and transfer learning. Bioinformatics 2022; 38:1514-1524. [PMID: 34999757 DOI: 10.1093/bioinformatics/btac006] [Citation(s) in RCA: 79] [Impact Index Per Article: 26.3] [Received: 10/29/2021] [Revised: 11/29/2021] [Accepted: 01/04/2022] [Indexed: 02/03/2023] Open
Abstract
MOTIVATION Recently, peptides have emerged as a promising class of pharmaceuticals for treating various diseases, positioned between traditional small-molecule drugs and therapeutic proteins. However, one of the key bottlenecks preventing their use as therapeutics is their toxicity toward human cells, and few of the available toxicity-prediction algorithms are specifically designed for short peptides. RESULTS We present ToxIBTL, a novel deep learning framework that uses the information bottleneck principle and transfer learning to predict the toxicity of peptides as well as proteins. Specifically, we use evolutionary information and physicochemical properties of peptide sequences and integrate the information bottleneck principle into a feature representation learning scheme, by which relevant information is retained and redundant information is minimized in the obtained features. Moreover, transfer learning is introduced to transfer the common knowledge contained in proteins to peptides, with the aim of improving the feature representation capability. Extensive experimental results demonstrate that ToxIBTL not only achieves higher prediction performance than state-of-the-art methods on the peptide dataset, but also performs competitively on the protein dataset. Furthermore, a user-friendly online web server is provided as an implementation of the proposed ToxIBTL. AVAILABILITY AND IMPLEMENTATION The proposed ToxIBTL and data are freely accessible at http://server.wei-group.net/ToxIBTL. Our source code is available at https://github.com/WLYLab/ToxIBTL. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Lesong Wei
- Department of Computer Science, University of Tsukuba, Tsukuba 3058577, Japan
| | - Xiucai Ye
- Department of Computer Science, University of Tsukuba, Tsukuba 3058577, Japan
| | - Tetsuya Sakurai
- Department of Computer Science, University of Tsukuba, Tsukuba 3058577, Japan
| | - Zengchao Mu
- School of Mathematics and Statistics, Shandong University, Weihai, China
| | - Leyi Wei
- School of Software, Shandong University, Jinan, China
|
33
|
Habehh H, Gohel S. Machine Learning in Healthcare. Curr Genomics 2021; 22:291-300. [PMID: 35273459 PMCID: PMC8822225 DOI: 10.2174/1389202922666210705124359] [Citation(s) in RCA: 129] [Impact Index Per Article: 32.3] [Received: 12/18/2020] [Revised: 05/12/2021] [Accepted: 06/04/2021] [Indexed: 11/22/2022] Open
Abstract
Recent advancements in Artificial Intelligence (AI) and Machine Learning (ML) technology have brought substantial strides in predicting and identifying health emergencies, disease populations, disease states, and immune responses, among others. Although skepticism remains regarding the practical application and interpretation of results from ML-based approaches in healthcare settings, the adoption of these approaches is increasing at a rapid pace. Here we provide a brief overview of machine learning-based approaches and learning algorithms, including supervised, unsupervised, and reinforcement learning, along with examples. We then discuss the application of ML in several healthcare fields, including radiology, genetics, electronic health records, and neuroimaging. We also briefly discuss the risks and challenges of applying ML to healthcare, such as system privacy and ethical concerns, and provide suggestions for future applications.
Affiliation(s)
- Hafsa Habehh
- Department of Health Informatics, Rutgers University School of Health Professions, 65 Bergen Street, Newark, NJ 07107, USA
| | - Suril Gohel
- Department of Health Informatics, Rutgers University School of Health Professions, 65 Bergen Street, Newark, NJ 07107, USA
|
34
|
Fathollahi M, Fathollahi A, Motamedi H, Moradi J, Alvandi A, Abiri R. In silico vaccine design and epitope mapping of New Delhi metallo-beta-lactamase (NDM): an immunoinformatics approach. BMC Bioinformatics 2021; 22:458. [PMID: 34563132 PMCID: PMC8465709 DOI: 10.1186/s12859-021-04378-z] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 03/10/2021] [Accepted: 09/17/2021] [Indexed: 02/07/2023] Open
Abstract
BACKGROUND Antibiotic resistance is a global health crisis. The adage that "prevention is better than cure" is especially true regarding antibiotic resistance because resistance appears and spreads much faster than new antibiotics are produced. Vaccination is an important strategy to fight infectious agents; however, this strategy has not attracted sufficient attention in antibiotic resistance prevention. New Delhi metallo-beta-lactamase (NDM) confers resistance to many beta-lactams, including important carbapenems like imipenem. Our goal in this study is to use an immunoinformatics approach to develop a vaccine that can elicit strong and specific immune responses against NDMs that prevent the development of antibiotic-resistant bacteria. RESULTS In this study, 2194 NDM sequences were aligned to obtain a conserved sequence. One continuous B cell epitope and three T cell CD4+ epitopes were selected from the conserved NDM sequence. Epitope conservancy for B cell and HLA-DR, HLA-DQ, and HLA-DP epitopes was 100.00%, 99.82%, 99.41%, and 99.86%, respectively, and population coverage of MHC II epitopes for the world was 99.91%. Permutation of the four epitope fragments resulted in 24 different peptides, of which 6 peptides were selected after toxicity, allergenicity, and antigenicity assessment. After primary vaccine design, the single vaccine sequence with the highest similarity to the discontinuous B cell epitope in NDMs was selected. The final vaccine can bind to various Toll-like receptors (TLRs). The prediction implied that the vaccine would be stable with a good half-life. An immune simulation performed by the C-IMMSIM server predicted that two doses of vaccine injection can induce a strong immune response to NDMs. Finally, the GC content of the vaccine was designed to be very similar to that of E. coli K12.
CONCLUSIONS In this study, immunoinformatics strategies were used to design a vaccine against different NDM variants that could produce an effective immune response against this antibiotic-resistant factor.
Affiliation(s)
- Matin Fathollahi
- Department of Microbiology, School of Medicine, Kermanshah University of Medical Sciences, Kermanshah, Iran
| | - Anwar Fathollahi
- Department of Immunology, School of Medicine, Shahid Beheshti University of Medical Sciences, Tehran, Iran
| | - Hamid Motamedi
- Department of Microbiology, School of Medicine, Kermanshah University of Medical Sciences, Kermanshah, Iran
| | - Jale Moradi
- Department of Microbiology, School of Medicine, Kermanshah University of Medical Sciences, Kermanshah, Iran
| | - Amirhooshang Alvandi
- Department of Microbiology, School of Medicine, Kermanshah University of Medical Sciences, Kermanshah, Iran
- Medical Technology Research Center, Health Technology Institute, Kermanshah University of Medical Sciences, Kermanshah, Iran
| | - Ramin Abiri
- Department of Microbiology, School of Medicine, Kermanshah University of Medical Sciences, Kermanshah, Iran.
- Fertility and Infertility Research Center, Health Technology Institute, Kermanshah University of Medical Sciences, Kermanshah, Iran.
|
35
|
Beg AZ, Farhat N, Khan AU. Designing multi-epitope vaccine candidates against functional amyloids in Pseudomonas aeruginosa through immunoinformatic and structural bioinformatics approach. Infection Genetics and Evolution 2021; 93:104982. [PMID: 34186254 DOI: 10.1016/j.meegid.2021.104982] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 03/31/2021] [Revised: 06/09/2021] [Accepted: 06/24/2021] [Indexed: 10/21/2022]
Abstract
Pseudomonas aeruginosa (P. aeruginosa) displays high drug resistance and biofilm-mediated adaptability, which makes its infections difficult to treat. Alternative intervention methods and targets have made the treatment of such infections manageable. One of the biofilm components, functional amyloids of Pseudomonas (Fap), is positively correlated with virulence and the mucoid phenotype found in infections in cystic fibrosis (CF) patients. Extracellular accessibility, conservation across P. aeruginosa isolates, and linkage with the lung infection phenotype in CF patients make Fap a promising intervention target. Furthermore, the reported effect of bacterial amyloid on neuronal function and immune response makes it a targetable candidate. In the current study, the FapC protein and its immediate interactions were explored to extract antigenic T-cell and B-cell epitopes. Combinations of epitopes and peptide adjuvants were linked to derive vaccine candidate structures. The vaccine candidates were validated for antigenicity, allergenicity, physicochemical properties, stability, and interactions with TLRs and MHC alleles. Immunosimulation studies demonstrated that the vaccines elicit a Th1-dominated response, which can support a good prognosis of infection in CF patients.
Affiliation(s)
- Ayesha Z Beg
- Medical Microbiology and Molecular Biology Lab., Interdisciplinary Biotechnology Unit, Aligarh Muslim University, Aligarh, India
| | - Nabeela Farhat
- Medical Microbiology and Molecular Biology Lab., Interdisciplinary Biotechnology Unit, Aligarh Muslim University, Aligarh, India
| | - Asad U Khan
- Medical Microbiology and Molecular Biology Lab., Interdisciplinary Biotechnology Unit, Aligarh Muslim University, Aligarh, India; Centre for Bioinformatic on Antimicrobial Resistance, IBU, Aligarh Muslim University, Aligarh, India.
|
36
|
Lin Y, Pan X, Shen HB. lncLocator 2.0: a cell-line-specific subcellular localization predictor for long non-coding RNAs with interpretable deep learning. Bioinformatics 2021; 37:2308-2316. [PMID: 33630066 DOI: 10.1093/bioinformatics/btab127] [Citation(s) in RCA: 63] [Impact Index Per Article: 15.8] [Received: 10/19/2020] [Revised: 01/26/2021] [Accepted: 02/23/2021] [Indexed: 12/14/2022] Open
Abstract
MOTIVATION Long non-coding RNAs (lncRNAs) are generally expressed in a tissue-specific way, and the subcellular localizations of lncRNAs depend on the tissues or cell lines in which they are expressed. Previous computational methods for predicting subcellular localizations of lncRNAs do not take this characteristic into account; they train a unified machine learning model on lncRNAs pooled from all available cell lines. It is therefore important to develop a cell-line-specific computational method to predict lncRNA locations in different cell lines. RESULTS In this study, we present an updated cell-line-specific predictor, lncLocator 2.0, which trains an end-to-end deep model per cell line for predicting lncRNA subcellular localization from sequences. We first construct benchmark datasets of lncRNA subcellular localizations for 15 cell lines. Then we learn word embeddings using natural language models, and these learned embeddings are fed into a convolutional neural network, long short-term memory, and a multilayer perceptron to classify subcellular localizations. lncLocator 2.0 achieves varying effectiveness for different cell lines and demonstrates the necessity of training cell-line-specific models. Furthermore, we adopt Integrated Gradients to explain the proposed model in lncLocator 2.0 and find some potential patterns that determine the subcellular localizations of lncRNAs, suggesting that the subcellular localization of lncRNAs is linked to some specific nucleotides. AVAILABILITY lncLocator 2.0 is available at www.csbio.sjtu.edu.cn/bioinf/lncLocator2 and the source code can be found at https://github.com/Yang-J-LIN/lncLocator2. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Yang Lin
- Department of Automation, Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, Shanghai, 200240, China
| | - Xiaoyong Pan
- Department of Automation, Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, Shanghai, 200240, China
| | - Hong-Bin Shen
- Department of Automation, Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, Shanghai, 200240, China
|
37
|
Martín-Galiano AJ, Escolano-Martínez MS, Corsini B, de la Campa AG, Yuste J. Immunization with SP_1992 (DiiA) Protein of Streptococcus pneumoniae Reduces Nasopharyngeal Colonization and Protects against Invasive Disease in Mice. Vaccines (Basel) 2021; 9:187. [PMID: 33668195 PMCID: PMC7995960 DOI: 10.3390/vaccines9030187] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/26/2021] [Revised: 02/19/2021] [Accepted: 02/19/2021] [Indexed: 11/16/2022] Open
Abstract
Knowledge-based vaccinology can reveal uncharacterized antigen candidates for a new generation of protein-based anti-pneumococcal vaccines. DiiA, encoded by the sp_1992 locus, is a surface protein containing either one or two repeats of a 37mer N-terminal motif that exhibits low interstrain variability. DiiA belongs to the core proteome, contains several conserved B-cell epitopes, and is associated with colonization and pathogenesis. Immunization with DiiA protein via the intraperitoneal route induced a strong IgG response, including different IgG subtypes. Vaccination with DiiA increased bacterial clearance and induced protection against sepsis, conferring 70% increased survival at 48 h post-infection when compared to the adjuvant control. The immunogenic response and survival rates in mice immunized with a truncated DiiA version lacking 119 N-terminal residues were remarkably lower, confirming the relevance of the repeat zone in the immunoprotection by DiiA. Intranasal immunization of mice with the entire recombinant protein elicited mucosal IgG and IgA responses that reduced bacterial colonization of the nasopharynx, confirming that this protein might be a vaccine candidate for reducing the carrier rate. DiiA constitutes an example of how functionally unannotated proteins may still represent promising candidates that can be used in prophylactic strategies against the pneumococcal carrier state and invasive disease.
Affiliation(s)
- Antonio J. Martín-Galiano
- Centro Nacional de Microbiología, Instituto de Salud Carlos III (ISCIII), 28220 Madrid, Spain; (M.S.E.-M.); (B.C.); (A.G.d.l.C.)
- Correspondence: (A.J.M.-G.); (J.Y.); Tel.: +34-918223976 (A.J.M.-G.); +34-918223620 (J.Y.)
| | - María S. Escolano-Martínez
- Centro Nacional de Microbiología, Instituto de Salud Carlos III (ISCIII), 28220 Madrid, Spain; (M.S.E.-M.); (B.C.); (A.G.d.l.C.)
| | - Bruno Corsini
- Centro Nacional de Microbiología, Instituto de Salud Carlos III (ISCIII), 28220 Madrid, Spain; (M.S.E.-M.); (B.C.); (A.G.d.l.C.)
| | - Adela G. de la Campa
- Centro Nacional de Microbiología, Instituto de Salud Carlos III (ISCIII), 28220 Madrid, Spain; (M.S.E.-M.); (B.C.); (A.G.d.l.C.)
- Presidencia Consejo Superior de Investigaciones Científicas, 28006 Madrid, Spain
| | - José Yuste
- Centro Nacional de Microbiología, Instituto de Salud Carlos III (ISCIII), 28220 Madrid, Spain; (M.S.E.-M.); (B.C.); (A.G.d.l.C.)
- CIBER de Enfermedades Respiratorias (CIBERES), 28029 Madrid, Spain
- Correspondence: (A.J.M.-G.); (J.Y.); Tel.: +34-918223976 (A.J.M.-G.); +34-918223620 (J.Y.)
|