1
Yin K, Li R, Zhang S, Sun Y, Huang L, Jiang M, Xu D, Xu W. Deep Learning Combined with Quantitative Structure‒Activity Relationship Accelerates De Novo Design of Antifungal Peptides. ADVANCED SCIENCE (WEINHEIM, BADEN-WURTTEMBERG, GERMANY) 2025; 12:e2412488. [PMID: 39921483 PMCID: PMC11967820 DOI: 10.1002/advs.202412488] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/07/2024] [Revised: 01/20/2025] [Indexed: 02/10/2025]
Abstract
Novel antifungal drugs that evade resistance are urgently needed for Candida infections. Antifungal peptides (AFPs) are potential candidates due to their specific mechanism of action, which makes them less prone to inducing drug resistance. An AFP de novo design method, Deep Learning-Quantitative Structure‒Activity Relationship Empirical Screening (DL-QSARES), is developed by integrating deep learning and quantitative structure‒activity relationship empirical screening. After candidate AFPs (c_AFPs) are generated through the recombination of dominant amino acids and dipeptide compositions, natural language processing models and quantitative structure‒activity relationship (QSAR) approaches based on physicochemical properties are used to screen for promising c_AFPs. Forty-nine promising c_AFPs are screened, and their minimum inhibitory concentrations (MICs) against C. albicans are determined to be 3.9-125 µg mL⁻¹, of which four leading c_AFPs (AFP-8, -10, -11, and -13) have MICs of <10 µg mL⁻¹ against the four tested pathogenic fungi, and AFP-13 has excellent therapeutic efficacy in the animal model.
Affiliation(s)
- Kedong Yin
- Zhengzhou Key Laboratory of Functional Molecules for Biomedical Research, Henan University of Technology, Zhengzhou, Henan 450001, P. R. China
- College of Information Science and Engineering, Henan University of Technology, Zhengzhou, Henan 450001, P. R. China
- Ruifang Li
- Zhengzhou Key Laboratory of Functional Molecules for Biomedical Research, Henan University of Technology, Zhengzhou, Henan 450001, P. R. China
- School of Biological Engineering, Henan University of Technology, Zhengzhou, Henan 450001, P. R. China
- Shaojie Zhang
- Zhengzhou Key Laboratory of Functional Molecules for Biomedical Research, Henan University of Technology, Zhengzhou, Henan 450001, P. R. China
- School of Biological Engineering, Henan University of Technology, Zhengzhou, Henan 450001, P. R. China
- Yiqing Sun
- Zhengzhou Key Laboratory of Functional Molecules for Biomedical Research, Henan University of Technology, Zhengzhou, Henan 450001, P. R. China
- School of Biological Engineering, Henan University of Technology, Zhengzhou, Henan 450001, P. R. China
- Liang Huang
- Zhengzhou Key Laboratory of Functional Molecules for Biomedical Research, Henan University of Technology, Zhengzhou, Henan 450001, P. R. China
- School of Biological Engineering, Henan University of Technology, Zhengzhou, Henan 450001, P. R. China
- Mengwan Jiang
- School of Artificial Intelligence and Big Data, Henan University of Technology, Zhengzhou, Henan 450001, P. R. China
- Degang Xu
- College of Information Science and Engineering, Henan University of Technology, Zhengzhou, Henan 450001, P. R. China
- Wen Xu
- Zhengzhou Key Laboratory of Functional Molecules for Biomedical Research, Henan University of Technology, Zhengzhou, Henan 450001, P. R. China
- Law College, Henan University of Technology, Zhengzhou, Henan 450001, P. R. China
2
Wei Z, Shen Y, Tang X, Wen J, Song Y, Wei M, Cheng J, Zhu X. AVPpred-BWR: antiviral peptides prediction via biological words representation. Bioinformatics 2025; 41:btaf126. [PMID: 40152250 PMCID: PMC11968319 DOI: 10.1093/bioinformatics/btaf126] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/11/2024] [Revised: 02/17/2025] [Accepted: 03/26/2025] [Indexed: 03/29/2025]
Abstract
MOTIVATION: Antiviral peptides (AVPs) are short chains of amino acids that show great potential as antiviral drugs. Traditional approaches (e.g. wet experiments) for identifying AVPs are time-consuming and laborious, while cutting-edge computational methods still predict them with limited accuracy. RESULTS: In this article, we propose an AVP prediction model via biological words representation, dubbed AVPpred-BWR. Based on the fact that the secondary structures of AVPs mainly consist of α-helix and loop, we explore the biological words of 1mer (corresponding to loops) and 4mer (4 continuous residues, corresponding to α-helix). That is, the peptide sequences are decomposed into biological words, and then the concealed sequential information is represented by training the Word2Vec models. Moreover, in order to extract multi-scale features, we leverage a CNN-Transformer framework to process the embeddings of 1mer and 4mer generated by Word2Vec models. To the best of our knowledge, this is the first work to segment protein primary structure sequences into words based on the regularity of protein secondary structure. AVPpred-BWR shows clear improvements over its competitors on the independent test set (e.g. improvements of 4.6% and 11.0% for AUROC and MCC, respectively, compared to UniDL4BioPep). AVAILABILITY AND IMPLEMENTATION: AVPpred-BWR is publicly available at: https://github.com/zyweizm/AVPpred-BWR or https://zenodo.org/records/14880447 (doi: 10.5281/zenodo.14880447).
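The decomposition of a peptide into 1mer and 4mer "biological words" described in this abstract can be illustrated with a minimal sketch. It is not taken from the AVPpred-BWR repository: the function name `to_words`, the `overlapping` switch, and the example sequence are hypothetical, and the abstract does not state whether the 4mers are taken with a sliding window or as non-overlapping chunks, so both are shown as assumptions. Training Word2Vec on the resulting word lists is left out.

```python
def to_words(sequence: str, k: int, overlapping: bool = True) -> list[str]:
    """Split a peptide sequence into k-residue words (k-mers).

    overlapping=True slides the window one residue at a time;
    overlapping=False cuts consecutive chunks and drops a short tail.
    Both modes are assumptions; the abstract does not specify one.
    """
    if k < 1:
        raise ValueError("k must be a positive integer")
    step = 1 if overlapping else k
    return [sequence[i:i + k] for i in range(0, len(sequence) - k + 1, step)]


peptide = "GLFDIVKK"  # hypothetical example sequence, not from the paper
print(to_words(peptide, 1))  # one word per residue
print(to_words(peptide, 4))  # sliding 4-residue words
```

Each word list could then serve as a "sentence" for an embedding model, which is the role the abstract assigns to Word2Vec.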
Affiliation(s)
- Zhuoyu Wei
- School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, Anhui 230036, China
- Yongqi Shen
- School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, Anhui 230036, China
- Xiang Tang
- School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, Anhui 230036, China
- Jian Wen
- School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, Anhui 230036, China
- Youyi Song
- School of Science, China Pharmaceutical University, Nanjing 210009, China
- Mingqiang Wei
- School of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China
- Jing Cheng
- School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, Anhui 230036, China
- Xiaolei Zhu
- School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, Anhui 230036, China
3
Wang Y, Song M, Liu F, Liang Z, Hong R, Dong Y, Luan H, Fu X, Yuan W, Fang W, Li G, Lou H, Chang W. Artificial intelligence using a latent diffusion model enables the generation of diverse and potent antimicrobial peptides. SCIENCE ADVANCES 2025; 11:eadp7171. [PMID: 39908380 PMCID: PMC11797553 DOI: 10.1126/sciadv.adp7171] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/09/2024] [Accepted: 01/07/2025] [Indexed: 02/07/2025]
Abstract
Artificial intelligence holds great promise for the design of antimicrobial peptides (AMPs); however, current models face limitations in generating AMPs with sufficient novelty and diversity, and they are rarely applied to the generation of antifungal peptides. Here, we develop an alternative pipeline grounded in a diffusion model and molecular dynamics for the de novo design of AMPs. The peptides generated by our pipeline have lower similarity and identity than those of other reported methodologies. Among the 40 peptides synthesized for an experimental validation, 25 exhibit either antibacterial or antifungal activity. AMP-29 shows selective antifungal activity against Candida glabrata and in vivo antifungal efficacy in a murine skin infection model. AMP-24 exhibits potent in vitro activity against Gram-negative bacteria and in vivo efficacy against both skin and lung Acinetobacter baumannii infection models. The proposed approach offers a pipeline for designing diverse AMPs to counteract the threat of antibiotic resistance.
Affiliation(s)
- Yeji Wang
- Department of Natural Product Chemistry, Key Laboratory of Chemical Biology (Ministry of Education), School of Pharmaceutical Sciences, Cheeloo College of Medicine, Shandong University, Jinan, Shandong Province, China
- Minghui Song
- Department of Natural Product Chemistry, Key Laboratory of Chemical Biology (Ministry of Education), School of Pharmaceutical Sciences, Cheeloo College of Medicine, Shandong University, Jinan, Shandong Province, China
- Fujing Liu
- Department of Natural Product Chemistry, Key Laboratory of Chemical Biology (Ministry of Education), School of Pharmaceutical Sciences, Cheeloo College of Medicine, Shandong University, Jinan, Shandong Province, China
- Zhen Liang
- Department of Natural Product Chemistry, Key Laboratory of Chemical Biology (Ministry of Education), School of Pharmaceutical Sciences, Cheeloo College of Medicine, Shandong University, Jinan, Shandong Province, China
- Rui Hong
- Department of Natural Product Chemistry, Key Laboratory of Chemical Biology (Ministry of Education), School of Pharmaceutical Sciences, Cheeloo College of Medicine, Shandong University, Jinan, Shandong Province, China
- Yuemei Dong
- Department of Natural Product Chemistry, Key Laboratory of Chemical Biology (Ministry of Education), School of Pharmaceutical Sciences, Cheeloo College of Medicine, Shandong University, Jinan, Shandong Province, China
- Huaizu Luan
- Department of Natural Product Chemistry, Key Laboratory of Chemical Biology (Ministry of Education), School of Pharmaceutical Sciences, Cheeloo College of Medicine, Shandong University, Jinan, Shandong Province, China
- Xiaojie Fu
- Department of Natural Product Chemistry, Key Laboratory of Chemical Biology (Ministry of Education), School of Pharmaceutical Sciences, Cheeloo College of Medicine, Shandong University, Jinan, Shandong Province, China
- Wenchang Yuan
- Guangzhou Key Laboratory for Clinical Rapid Diagnosis and Early Warning of Infectious Diseases, KingMed School of Laboratory Medicine, Guangzhou Medical University, Guangzhou, Guangdong, China
- Wenjie Fang
- Shanghai Key Laboratory of Molecular Medical Mycology, Shanghai Institute of Mycology, Shanghai Changzheng Hospital, Second Military Medical University, Shanghai, China
- Gang Li
- Department of Natural Medicinal Chemistry and Pharmacognosy, School of Pharmacy, Qingdao University, Qingdao, China
- Hongxiang Lou
- Department of Natural Product Chemistry, Key Laboratory of Chemical Biology (Ministry of Education), School of Pharmaceutical Sciences, Cheeloo College of Medicine, Shandong University, Jinan, Shandong Province, China
- Wenqiang Chang
- Department of Natural Product Chemistry, Key Laboratory of Chemical Biology (Ministry of Education), School of Pharmaceutical Sciences, Cheeloo College of Medicine, Shandong University, Jinan, Shandong Province, China
4
Barroso RA, Agüero-Chapin G, Sousa R, Marrero-Ponce Y, Antunes A. Unlocking Antimicrobial Peptides: In Silico Proteolysis and Artificial Intelligence-Driven Discovery from Cnidarian Omics. Molecules 2025; 30:550. [PMID: 39942653 PMCID: PMC11820242 DOI: 10.3390/molecules30030550] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/07/2024] [Revised: 01/20/2025] [Accepted: 01/21/2025] [Indexed: 02/16/2025]
Abstract
The growing challenge of antimicrobial resistance (AMR), which affects millions of people worldwide, has drawn attention to marine-derived antimicrobial peptides (AMPs) as a source of innovative solutions. Cnidarians, such as corals, sea anemones, and jellyfish, are a promising source of these bioactive peptides due to their robust innate immune systems, yet they are still poorly explored. Hence, we employed an in silico proteolysis strategy to search for novel AMPs from omics data of 111 Cnidaria species. Millions of peptides were retrieved and screened using shallow- and deep-learning models, prioritizing AMPs with reduced toxicity and structural distinctiveness from characterized AMPs. After complex network analysis, a final dataset of 3130 Cnidaria singular non-haemolytic and non-toxic AMPs was identified. These unique AMPs were mined for their putative antibacterial activity, revealing 20 favourable candidates for in vitro testing against important ESKAPEE pathogens, offering potential new avenues for antibiotic development.
Affiliation(s)
- Ricardo Alexandre Barroso
- Interdisciplinary Centre of Marine and Environmental Research (CIIMAR/CIMAR), University of Porto, Terminal de Cruzeiros do Porto de Leixões, Av. General Norton de Matos s/n, 4450-208 Porto, Portugal; (R.A.B.); (G.A.-C.); (R.S.)
- Department of Biology, Faculty of Sciences of University of Porto (FCUP), Rua do Campo Alegre s/n, 4169-007 Porto, Portugal
- Guillermin Agüero-Chapin
- Interdisciplinary Centre of Marine and Environmental Research (CIIMAR/CIMAR), University of Porto, Terminal de Cruzeiros do Porto de Leixões, Av. General Norton de Matos s/n, 4450-208 Porto, Portugal; (R.A.B.); (G.A.-C.); (R.S.)
- Department of Biology, Faculty of Sciences of University of Porto (FCUP), Rua do Campo Alegre s/n, 4169-007 Porto, Portugal
- Rita Sousa
- Interdisciplinary Centre of Marine and Environmental Research (CIIMAR/CIMAR), University of Porto, Terminal de Cruzeiros do Porto de Leixões, Av. General Norton de Matos s/n, 4450-208 Porto, Portugal; (R.A.B.); (G.A.-C.); (R.S.)
- Department of Biology, Faculty of Sciences of University of Porto (FCUP), Rua do Campo Alegre s/n, 4169-007 Porto, Portugal
- Yovani Marrero-Ponce
- Facultad de Ingeniería, Universidad Panamericana, Augusto Rodin No. 498, Insurgentes Mixcoac, Benito Juárez, Ciudad de Mexico 03920, Mexico;
- Grupo de Medicina Molecular y Traslacional (MeM&T), Colegio de Ciencias de la Salud (COCSA), Escuela de Medicina, Edificio de Especialidades Médicas, Instituto de Simulación Computacional (ISC-USFQ), Universidad San Francisco de Quito (USFQ), Diego de Robles y vía Interoceánica, Quito 170157, Ecuador
- Agostinho Antunes
- Interdisciplinary Centre of Marine and Environmental Research (CIIMAR/CIMAR), University of Porto, Terminal de Cruzeiros do Porto de Leixões, Av. General Norton de Matos s/n, 4450-208 Porto, Portugal; (R.A.B.); (G.A.-C.); (R.S.)
- Department of Biology, Faculty of Sciences of University of Porto (FCUP), Rua do Campo Alegre s/n, 4169-007 Porto, Portugal
5
Rodrigues T, Guardiola FA, Almeida D, Antunes A. Aquatic Invertebrate Antimicrobial Peptides in the Fight Against Aquaculture Pathogens. Microorganisms 2025; 13:156. [PMID: 39858924 PMCID: PMC11767717 DOI: 10.3390/microorganisms13010156] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/12/2024] [Revised: 01/07/2025] [Accepted: 01/11/2025] [Indexed: 01/27/2025]
Abstract
The intensification of aquaculture has escalated disease outbreaks and overuse of antibiotics, driving the global antimicrobial resistance (AMR) crisis. Antimicrobial peptides (AMPs) provide a promising alternative due to their rapid, broad-spectrum activity, low AMR risk, and additional bioactivities, including immunomodulatory, anticancer, and antifouling properties. AMPs derived from aquatic invertebrates, particularly marine-derived, are well-suited for aquaculture, offering enhanced stability in high-salinity environments. This study compiles and analyzes data from AMP databases and over 200 scientific sources, identifying approximately 350 AMPs derived from aquatic invertebrates, mostly cationic and α-helical, across 65 protein families. While in vitro assays highlight their potential, limited in vivo studies hinder practical application. These AMPs could serve as feed additives, therapeutic agents, or in genetic engineering approaches like CRISPR/Cas9-mediated transgenesis to enhance resilience of farmed species. Despite challenges such as stability, ecological impacts, and regulatory hurdles, advancements in peptidomimetics and genetic engineering hold significant promise. Future research should emphasize refining AMP enhancement techniques, expanding their diversity and bioactivity profiles, and prioritizing comprehensive in vivo evaluations. Harnessing the potential of AMPs represents a significant step forward on the path to aquaculture sustainability, reducing antibiotic dependency, and combating AMR, ultimately safeguarding public health and ecosystem resilience.
Affiliation(s)
- Tomás Rodrigues
- CIIMAR—Interdisciplinary Centre of Marine and Environmental Research, University of Porto, Terminal de Cruzeiros do Porto de Leixões, Av. General Norton de Matos, s/n, 4450-208 Porto, Portugal;
- Department of Biology, Faculty of Sciences, University of Porto, Rua do Campo Alegre 687, 4169-007 Porto, Portugal
- Francisco Antonio Guardiola
- Immunobiology for Aquaculture Group, Department of Cell Biology and Histology, Faculty of Biology, Regional Campus of International Excellence “Campus Mare Nostrum”, University of Murcia, 30100 Murcia, Spain;
- Daniela Almeida
- Department of Zoology and Physical Anthropology, Faculty of Biology, Regional Campus of International Excellence “Campus Mare Nostrum”, University of Murcia, 30100 Murcia, Spain;
- Agostinho Antunes
- CIIMAR—Interdisciplinary Centre of Marine and Environmental Research, University of Porto, Terminal de Cruzeiros do Porto de Leixões, Av. General Norton de Matos, s/n, 4450-208 Porto, Portugal;
- Department of Biology, Faculty of Sciences, University of Porto, Rua do Campo Alegre 687, 4169-007 Porto, Portugal
6
Brizuela CA, Liu G, Stokes JM, de la Fuente‐Nunez C. AI Methods for Antimicrobial Peptides: Progress and Challenges. Microb Biotechnol 2025; 18:e70072. [PMID: 39754551 PMCID: PMC11702388 DOI: 10.1111/1751-7915.70072] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/19/2024] [Revised: 11/18/2024] [Accepted: 12/16/2024] [Indexed: 01/06/2025]
Abstract
Antimicrobial peptides (AMPs) are promising candidates to combat multidrug-resistant pathogens. However, the high cost of extensive wet-lab screening has made AI methods for identifying and designing AMPs increasingly important, with machine learning (ML) techniques playing a crucial role. AI approaches have recently revolutionised this field by accelerating the discovery of new peptides with anti-infective activity, particularly in preclinical mouse models. Initially, classical ML approaches dominated the field, but recently there has been a shift towards deep learning (DL) models. Despite significant contributions, existing reviews have not thoroughly explored the potential of large language models (LLMs), graph neural networks (GNNs) and structure-guided AMP discovery and design. This review aims to fill that gap by providing a comprehensive overview of the latest advancements, challenges and opportunities in using AI methods, with a particular emphasis on LLMs, GNNs and structure-guided design. We discuss the limitations of current approaches and highlight the most relevant topics to address in the coming years for AMP discovery and design.
Affiliation(s)
- Gary Liu
- Department of Biochemistry and Biomedical Sciences, Michael G. DeGroote Institute for Infectious Disease Research, David Braley Centre for Antibiotic Discovery, McMaster University, Hamilton, Ontario, Canada
- Jonathan M. Stokes
- Department of Biochemistry and Biomedical Sciences, Michael G. DeGroote Institute for Infectious Disease Research, David Braley Centre for Antibiotic Discovery, McMaster University, Hamilton, Ontario, Canada
- Cesar de la Fuente‐Nunez
- Machine Biology Group, Department of Psychiatry and Microbiology, Institute for Biomedical Informatics, Institute for Translational Medicine and Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Department of Bioengineering and Chemical and Biomolecular Engineering, School of Engineering and Applied Science, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Department of Chemistry, School of Arts and Sciences, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Penn Institute for Computational Science, University of Pennsylvania, Philadelphia, Pennsylvania, USA
7
Castillo-Mendieta K, Agüero-Chapin G, Mora JR, Pérez N, Contreras-Torres E, Valdes-Martini JR, Martinez-Rios F, Marrero-Ponce Y. Unraveling the hemolytic toxicity tapestry of peptides using chemical space complex networks. Toxicol Sci 2024; 202:236-249. [PMID: 39254655 DOI: 10.1093/toxsci/kfae115] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/11/2024]
Abstract
Peptides have emerged as promising therapeutic agents. However, their potential is hindered by hemotoxicity. Understanding the hemotoxicity of peptides is crucial for developing safe and effective peptide-based therapeutics. Here, we employed chemical space complex networks (CSNs) to unravel the hemotoxicity tapestry of peptides. CSNs are powerful tools for visualizing and analyzing the relationships between peptides based on their physicochemical properties and structural features. We constructed CSNs from the StarPepDB database, encompassing 2,004 hemolytic peptides, and explored the impact of seven different (dis)similarity measures on network topology and cluster (communities) distribution. Our findings revealed that each CSN extracts orthogonal information, enhancing the motif discovery and enrichment process. We identified 12 consensus hemolytic motifs, whose amino acid composition unveiled a high abundance of lysine, leucine, and valine residues, whereas aspartic acid, methionine, histidine, asparagine, and glutamine were depleted. Additionally, physicochemical properties were used to characterize clusters/communities of hemolytic peptides. To predict hemolytic activity directly from peptide sequences, we constructed multi-query similarity searching models, which outperformed cutting-edge machine learning-based models, demonstrating robust hemotoxicity prediction capabilities. Overall, this novel in silico approach uses complex network science as its central strategy to develop robust model classifiers, characterize the chemical space, and discover new motifs from hemolytic peptides. This will help to enhance the design/selection of peptides with potential therapeutic activity and low toxicity.
Affiliation(s)
- Kevin Castillo-Mendieta
- School of Biological Sciences and Engineering, Yachay Tech University, Hda. San José s/n y Proyecto Yachay, Urcuquí 100119, Ecuador
- Guillermin Agüero-Chapin
- CIIMAR/CIMAR, Interdisciplinary Centre of Marine and Environmental Research, University of Porto, Terminal de Cruzeiros do Porto de Leixões, Porto 4450-208, Portugal
- Department of Biology, Faculty of Sciences, University of Porto, Porto 4169-007, Portugal
- José R Mora
- Universidad San Francisco de Quito (USFQ), Colegio de Ciencias e Ingenierías "El Politécnico", Diego de Robles y vía Interoceánica, Quito 170157, Pichincha, Ecuador
- Noel Pérez
- Universidad San Francisco de Quito (USFQ), Colegio de Ciencias e Ingenierías "El Politécnico", Diego de Robles y vía Interoceánica, Quito 170157, Pichincha, Ecuador
- Ernesto Contreras-Torres
- Universidad San Francisco de Quito (USFQ), Grupo de Medicina Molecular y Traslacional (MeM&T), Colegio de Ciencias de la Salud (COCSA), Escuela de Medicina, Edificio de Especialidades Médicas and Instituto de Simulación Computacional (ISC-USFQ), Diego de Robles y vía Interoceánica, Quito 170157, Pichincha, Ecuador
- Felix Martinez-Rios
- Facultad de Ingeniería, Universidad Panamericana, Benito Juárez, Ciudad de México 03920, México
- Yovani Marrero-Ponce
- Universidad San Francisco de Quito (USFQ), Colegio de Ciencias e Ingenierías "El Politécnico", Diego de Robles y vía Interoceánica, Quito 170157, Pichincha, Ecuador
- Universidad San Francisco de Quito (USFQ), Grupo de Medicina Molecular y Traslacional (MeM&T), Colegio de Ciencias de la Salud (COCSA), Escuela de Medicina, Edificio de Especialidades Médicas and Instituto de Simulación Computacional (ISC-USFQ), Diego de Robles y vía Interoceánica, Quito 170157, Pichincha, Ecuador
- Facultad de Ingeniería, Universidad Panamericana, Benito Juárez, Ciudad de México 03920, México
8
Agüero-Chapin G, Domínguez-Pérez D, Marrero-Ponce Y, Castillo-Mendieta K, Antunes A. Unveiling Encrypted Antimicrobial Peptides from Cephalopods' Salivary Glands: A Proteolysis-Driven Virtual Approach. ACS OMEGA 2024; 9:43353-43367. [PMID: 39494035 PMCID: PMC11525497 DOI: 10.1021/acsomega.4c01959] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 02/28/2024] [Revised: 04/26/2024] [Accepted: 04/30/2024] [Indexed: 11/05/2024]
Abstract
Antimicrobial peptides (AMPs) have potential against antimicrobial resistance and serve as templates for novel therapeutic agents. While most AMP databases focus on terrestrial eukaryotes, marine cephalopods represent a promising yet underexplored source. This study reveals the putative reservoir of AMPs encrypted within the proteomes of cephalopod salivary glands via in silico proteolysis. A composite protein database comprising 5,412,039 canonical and noncanonical proteins from the salivary apparatus of 14 cephalopod species was subjected to digestion by 5 proteases under three protocols, yielding over 9 million nonredundant peptides. These peptides were screened with a selection of 8 prediction and sequence-comparison tools, including machine learning, deep learning, multiquery similarity-based models, and complex networks. The screening prioritized antimicrobial activity while ensuring the absence of hemolytic and toxic properties and structural uniqueness compared to known AMPs. Five relevant AMP datasets were released, ranging from a comprehensive collection of 542,485 AMPs to a refined dataset of 68,694 nonhemolytic and nontoxic AMPs. Further comparative analyses and application of network science principles helped identify 5466 unique and 808 representative nonhemolytic and nontoxic AMPs. These datasets, along with the selected mining tools, provide valuable resources for peptide drug developers.
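The in silico proteolysis step described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not name the 5 proteases or the three protocols, so the sketch uses the commonly cited trypsin rule (cleave after K or R unless the next residue is P) as a stand-in, and the function name `digest`, the length bounds, and the example sequence are hypothetical.

```python
import re

# Assumed cleavage rule (trypsin-like): cut after K or R, but not before P.
# The proteases actually used in the study are not named in the abstract.
TRYPSIN_LIKE_SITE = re.compile(r"(?<=[KR])(?!P)")


def digest(protein: str, min_len: int = 5, max_len: int = 30) -> list[str]:
    """Cleave a protein in silico and return nonredundant peptides.

    Fragments outside [min_len, max_len] are discarded; the bounds are
    illustrative defaults, not values taken from the paper.
    """
    fragments = TRYPSIN_LIKE_SITE.split(protein)  # zero-width split, Python 3.7+
    kept = [f for f in fragments if min_len <= len(f) <= max_len]
    return list(dict.fromkeys(kept))  # drop duplicates, keep first-seen order


protein = "AAKGGRPCCKDD"  # hypothetical toy sequence
print(digest(protein, min_len=1))
```

Running each protein of a database through such a function and pooling the deduplicated fragments would give the kind of candidate peptide set that the downstream prediction tools then filter.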
Affiliation(s)
- Guillermin Agüero-Chapin
- CIIMAR—Centro Interdisciplinar de Investigação Marinha e Ambiental, Universidade do Porto, Terminal de Cruzeiros do Porto de Leixões, Av. General Norton de Matos, s/n, Porto 4450-208, Portugal
- Departamento de Biologia, Faculdade de Ciências, Universidade do Porto, Rua do Campo Alegre, s/n, Porto 4169-007, Portugal
- Dany Domínguez-Pérez
- Department of Biology and Evolution of Marine Organisms (BEOM), Stazione Zoologica Anton Dohrn, Località Torre Spaccata 87071, 87071 Amendolara, Italy
- PagBiOmicS—Personalised Academic Guidance and Biodiscovery-integrated OMICs Solutions, Porto 4200-603, Portugal
- Yovani Marrero-Ponce
- Universidad San Francisco de Quito (USFQ), Grupo de Medicina Molecular y Traslacional (MeM&T), Colegio de Ciencias de la Salud (COCSA), Escuela de Medicina, Edificio de Especialidades Médicas; and Instituto de Simulación Computacional (ISC-USFQ), Diego de Robles y vía Interoceánica, Quito 170157, Pichincha, Ecuador
- Facultad de Ingeniería, Universidad Panamericana, Augusto Rodin No. 498, Insurgentes Mixcoac, Benito Juárez 03920, Ciudad de México, Mexico
- Kevin Castillo-Mendieta
- School of Biological Sciences and Engineering, Yachay Tech University, Hda. San José s/n y Proyecto Yachay, Urcuquí 100119, Ecuador
- Agostinho Antunes
- CIIMAR—Centro Interdisciplinar de Investigação Marinha e Ambiental, Universidade do Porto, Terminal de Cruzeiros do Porto de Leixões, Av. General Norton de Matos, s/n, Porto 4450-208, Portugal
- Departamento de Biologia, Faculdade de Ciências, Universidade do Porto, Rua do Campo Alegre, s/n, Porto 4169-007, Portugal
9
Castillo-Mendieta K, Agüero-Chapin G, Marquez EA, Perez-Castillo Y, Barigye SJ, Vispo NS, García-Jacas CR, Marrero-Ponce Y. Peptide hemolytic activity analysis using visual data mining of similarity-based complex networks. NPJ Syst Biol Appl 2024; 10:115. [PMID: 39367008 PMCID: PMC11452708 DOI: 10.1038/s41540-024-00429-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/29/2023] [Accepted: 08/22/2024] [Indexed: 10/06/2024]
Abstract
Peptides are promising drug development frameworks that have been hindered by intrinsic undesired properties, including hemolytic activity. We aim to gain better insight into the chemical space of hemolytic peptides using a novel approach based on network science and data mining. Metadata networks (METNs) were used to characterize and find general patterns associated with hemolytic peptides, whereas Half-Space Proximal Networks (HSPNs) represented the hemolytic peptide space. The best candidate HSPNs were used to extract various subsets of hemolytic peptides (scaffolds) considering network centrality and peptide similarity. These scaffolds proved useful in developing robust similarity-based model classifiers. Finally, using an alignment-free approach, we reported 47 putative hemolytic motifs, which can be used as toxic signatures when developing novel peptide-based drugs. We provided evidence that the number of hemolytic motifs in a sequence might be related to the likelihood of the peptide being hemolytic.
Affiliation(s)
- Guillermin Agüero-Chapin
- CIIMAR-Centro Interdisciplinar de Investigação Marinha e Ambiental, Universidade do Porto, Terminal de Cruzeiros do Porto de Leixões, Porto, Portugal.
- Departamento de Biologia, Faculdade de Ciências, Universidade do Porto, Porto, Portugal.
- Edgar A Marquez
- Grupo de Investigaciones en Química y Biología, Departamento de Química y Biología, Facultad de Ciencias Básicas, Universidad del Norte, Universidad del Norte, Barranquilla, Colombia
- Yunierkis Perez-Castillo
- Bio-Chemoinformatics Research Group and Escuela de Ciencias Físicas y Matemáticas. Universidad de Las Américas, Quito, Ecuador
- Stephen J Barigye
- Departamento de Química Física Aplicada, Facultad de Ciencias, Universidad Autónoma de Madrid (UAM), Madrid, Spain
- Cesar R García-Jacas
- Investigador por México, Consejo Nacional de Humanidades, Ciencias y Tecnologías (Conahcyt), 03940, Ciudad de Mexico, Mexico
- Yovani Marrero-Ponce
- Facultad de Ingeniería, Universidad Panamericana, Augusto Rodin 498, 03920, Ciudad de México, CDMX, México.
- Universidad San Francisco de Quito (USFQ), Grupo de Medicina Molecular y Traslacional (MeM&T), Colegio de Ciencias de la Salud (COCSA), Escuela de Medicina, Edificio de Especialidades Médicas; and Instituto de Simulación Computacional (ISC-USFQ), Diego de Robles y vía Interoceánica, Quito, Pichincha, Ecuador.
10
Mazein I, Rougny A, Mazein A, Henkel R, Gütebier L, Michaelis L, Ostaszewski M, Schneider R, Satagopam V, Jensen LJ, Waltemath D, Wodke JAH, Balaur I. Graph databases in systems biology: a systematic review. Brief Bioinform 2024; 25:bbae561. [PMID: 39565895 PMCID: PMC11578065 DOI: 10.1093/bib/bbae561] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/18/2024] [Revised: 09/28/2024] [Accepted: 10/21/2024] [Indexed: 11/22/2024]
Abstract
Graph databases are becoming increasingly popular across scientific disciplines, being highly suitable for storing and connecting complex heterogeneous data. In systems biology, they are used as a backend solution for biological data repositories, ontologies, networks, pathways, and knowledge graph databases. In this review, we analyse all publications using or mentioning graph databases retrieved from PubMed and PubMed Central full-text search, focusing on the top 16 available graph databases. Publications are categorized according to their domain and application, focusing on pathway and network biology and relevant ontologies and tools. We detail different approaches and highlight the advantages of outstanding resources, such as UniProtKB, Disease Ontology, and Reactome, which provide graph-based solutions. We discuss ongoing efforts of the systems biology community to standardize and harmonize knowledge graph creation and the maintenance of integrated resources. Outlining prospects, including the use of graph databases as a way of communication between biological data repositories, we conclude that efficient design, querying, and maintenance of graph databases will be key for knowledge generation in systems biology and other research fields with heterogeneous data.
Affiliation(s)
- Ilya Mazein
- Medical Informatics Laboratory, University Medicine Greifswald, Walther-Rathenau-Straße 48, Greifswald 17475, Germany
| | - Adrien Rougny
- Luxembourg Centre for Systems Biology, University of Luxembourg, 6 Avenue du Swing, Belvaux L-4367, Luxembourg
| | - Alexander Mazein
- Luxembourg Centre for Systems Biology, University of Luxembourg, 6 Avenue du Swing, Belvaux L-4367, Luxembourg
| | - Ron Henkel
- Medical Informatics Laboratory, University Medicine Greifswald, Walther-Rathenau-Straße 48, Greifswald 17475, Germany
| | - Lea Gütebier
- Medical Informatics Laboratory, University Medicine Greifswald, Walther-Rathenau-Straße 48, Greifswald 17475, Germany
| | - Lea Michaelis
- Medical Informatics Laboratory, University Medicine Greifswald, Walther-Rathenau-Straße 48, Greifswald 17475, Germany
| | - Marek Ostaszewski
- Luxembourg Centre for Systems Biology, University of Luxembourg, 6 Avenue du Swing, Belvaux L-4367, Luxembourg
| | - Reinhard Schneider
- Luxembourg Centre for Systems Biology, University of Luxembourg, 6 Avenue du Swing, Belvaux L-4367, Luxembourg
| | - Venkata Satagopam
- Luxembourg Centre for Systems Biology, University of Luxembourg, 6 Avenue du Swing, Belvaux L-4367, Luxembourg
| | - Lars Juhl Jensen
- Department of Veterinary and Animal Sciences, Faculty of Health and Medical Sciences, University of Copenhagen, Grønnegårdsvej 15, 1870 Frederiksberg C, Denmark
| | - Dagmar Waltemath
- Medical Informatics Laboratory, University Medicine Greifswald, Walther-Rathenau-Straße 48, Greifswald 17475, Germany
| | - Judith A H Wodke
- Medical Informatics Laboratory, University Medicine Greifswald, Walther-Rathenau-Straße 48, Greifswald 17475, Germany
| | - Irina Balaur
- Luxembourg Centre for Systems Biology, University of Luxembourg, 6 Avenue du Swing, Belvaux L-4367, Luxembourg
|
11
|
Madhavi BGK, Wijethunga AM, Okagu OD, Sun X. Defatted Wheat Germ Protein-Derived Peptides Showed Multiple Biological Activities from the Stomach to Small Intestine: In Silico and In Vitro Approaches. Journal of Agricultural and Food Chemistry 2024; 72:20527-20536. [PMID: 39231371 DOI: 10.1021/acs.jafc.4c06539] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/06/2024]
Abstract
This study aimed to test the hypothesis that bioactive peptides can exert multiple bioactivities at different sites in the gastrointestinal tract. Our previous research identified 33 gastric-resistant peptides derived from wheat germ with potential antiadhesive activity against Helicobacter pylori in the stomach. In this work, in silico digestion of these peptides with trypsin, thermolysin, and chymotrypsin produced 67 peptide fragments. Molecular docking was conducted to predict their ACE and DPP-IV inhibitory activities in the small intestine. Three peptides (VPIPNPSGDR, VPY, and AR) were selected and synthesized for in vitro validation. Their generation in the gastrointestinal tract was verified via in vitro digestion, followed by mass spectrometry analysis. The IC50 values for ACE inhibition were 199.5 μM (VPIPNPSGDR), 316.3 μM (VPY), and 446.7 μM (AR). For DPP-IV inhibition, their IC50 values were 0.5, 1.6, and 4.0 mM, respectively. This research pioneers new directions in the emerging field of multifunctional peptides, providing scientific evidence to support the utilization of wheat germ as a value-added food ingredient.
Affiliation(s)
- Bolappa Gamage Kaushalya Madhavi
- Department of Plant, Food and Environmental Sciences, Faculty of Agriculture, Dalhousie University, Truro, Nova Scotia B2N 5E3, Canada
| | - Anushi Madushani Wijethunga
- Department of Plant, Food and Environmental Sciences, Faculty of Agriculture, Dalhousie University, Truro, Nova Scotia B2N 5E3, Canada
| | - Ogadimma D Okagu
- Department of Chemistry and Biomolecular Sciences, University of Ottawa, Ottawa, Ontario K1N 6N5, Canada
| | - Xiaohong Sun
- Department of Plant, Food and Environmental Sciences, Faculty of Agriculture, Dalhousie University, Truro, Nova Scotia B2N 5E3, Canada
|
12
|
de Llano García D, Marrero-Ponce Y, Agüero-Chapin G, Ferri FJ, Antunes A, Martinez-Rios F, Rodríguez H. Innovative Alignment-Based Method for Antiviral Peptide Prediction. Antibiotics (Basel) 2024; 13:768. [PMID: 39200068 PMCID: PMC11350826 DOI: 10.3390/antibiotics13080768] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/14/2024] [Revised: 08/08/2024] [Accepted: 08/09/2024] [Indexed: 09/01/2024] Open
Abstract
Antiviral peptides (AVPs) represent a promising strategy for addressing the global challenges of viral infections and their growing resistance to traditional drugs. Lab-based AVP discovery methods are resource-intensive, highlighting the need for efficient computational alternatives. In this study, we developed five non-trained but supervised multi-query similarity search models (MQSSMs) integrated into the StarPep toolbox. Rigorous testing and validation across diverse AVP datasets confirmed the models' robustness and reliability. The top-performing model, M13+, achieved an accuracy of 0.969 and a Matthews correlation coefficient of 0.71. To assess their competitiveness, the top five models were benchmarked against 14 publicly available machine-learning and deep-learning AVP predictors. The MQSSMs outperformed these predictors, highlighting their efficiency in terms of resource demand and public accessibility. Another significant achievement of this study is the creation of the most comprehensive dataset of antiviral sequences to date. In general, these results suggest that MQSSMs are promising tools for developing good alignment-based models that can be successfully applied in the screening of large datasets for new AVP discovery.
Affiliation(s)
- Daniela de Llano García
- School of Chemical Sciences and Engineering, Yachay Tech University, Hda. San José s/n y Proyecto Yachay, Urcuquí 100119, Imbabura, Ecuador
| | - Yovani Marrero-Ponce
- Universidad San Francisco de Quito (USFQ), Grupo de Medicina Molecular y Traslacional (MeM&T), Colegio de Ciencias de la Salud (COCSA), Escuela de Medicina, Edificio de Especialidades Médicas, Instituto de Simulación Computacional (ISC-USFQ), Diego de Robles y vía Interoceánica, Quito 170157, Pichincha, Ecuador
- Facultad de Ingeniería, Universidad Panamericana, Augusto Rodin 498, Benito Juárez 03920, Ciudad de México, Mexico
- Computer Science Department, Universitat de València, 46100 Valencia, Burjassot, Spain
| | - Guillermin Agüero-Chapin
- CIIMAR-Centro Interdisciplinar de Investigação Marinha e Ambiental, Universidade do Porto, Terminal de Cruzeiros do Porto de Leixões, Av. General Norton de Matos, s/n, 4450-208 Porto, Portugal
- Departamento de Biologia, Faculdade de Ciências, Universidade do Porto, Rua do Campo Alegre, 4169-007 Porto, Portugal
| | - Francesc J. Ferri
- Computer Science Department, Universitat de València, 46100 Valencia, Burjassot, Spain
| | - Agostinho Antunes
- CIIMAR-Centro Interdisciplinar de Investigação Marinha e Ambiental, Universidade do Porto, Terminal de Cruzeiros do Porto de Leixões, Av. General Norton de Matos, s/n, 4450-208 Porto, Portugal
- Departamento de Biologia, Faculdade de Ciências, Universidade do Porto, Rua do Campo Alegre, 4169-007 Porto, Portugal
| | - Felix Martinez-Rios
- Facultad de Ingeniería, Universidad Panamericana, Augusto Rodin 498, Benito Juárez 03920, Ciudad de México, Mexico
| | - Hortensia Rodríguez
- School of Chemical Sciences and Engineering, Yachay Tech University, Hda. San José s/n y Proyecto Yachay, Urcuquí 100119, Imbabura, Ecuador
|
13
|
Santos-Júnior CD, Torres MDT, Duan Y, Rodríguez Del Río Á, Schmidt TSB, Chong H, Fullam A, Kuhn M, Zhu C, Houseman A, Somborski J, Vines A, Zhao XM, Bork P, Huerta-Cepas J, de la Fuente-Nunez C, Coelho LP. Discovery of antimicrobial peptides in the global microbiome with machine learning. Cell 2024; 187:3761-3778.e16. [PMID: 38843834 PMCID: PMC11666328 DOI: 10.1016/j.cell.2024.05.013] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/14/2023] [Revised: 04/11/2024] [Accepted: 05/06/2024] [Indexed: 06/25/2024]
Abstract
Novel antibiotics are urgently needed to combat the antibiotic-resistance crisis. We present a machine-learning-based approach to predict antimicrobial peptides (AMPs) within the global microbiome and leverage a vast dataset of 63,410 metagenomes and 87,920 prokaryotic genomes from environmental and host-associated habitats to create the AMPSphere, a comprehensive catalog comprising 863,498 non-redundant peptides, few of which match existing databases. AMPSphere provides insights into the evolutionary origins of peptides, including by duplication or gene truncation of longer sequences, and we observed that AMP production varies by habitat. To validate our predictions, we synthesized and tested 100 AMPs against clinically relevant drug-resistant pathogens and human gut commensals both in vitro and in vivo. A total of 79 peptides were active, with 63 targeting pathogens. These active AMPs exhibited antibacterial activity by disrupting bacterial membranes. In conclusion, our approach identified nearly one million prokaryotic AMP sequences, an open-access resource for antibiotic discovery.
Affiliation(s)
- Célio Dias Santos-Júnior
- Institute of Science and Technology for Brain-Inspired Intelligence - ISTBI, Fudan University, Shanghai 200433, China; Laboratory of Microbial Processes & Biodiversity - LMPB, Department of Hydrobiology, Universidade Federal de São Carlos - UFSCar, São Carlos, São Paulo 13565-905, Brazil
| | - Marcelo D T Torres
- Machine Biology Group, Departments of Psychiatry and Microbiology, Institute for Biomedical Informatics, Institute for Translational Medicine and Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA; Departments of Bioengineering and Chemical and Biomolecular Engineering, School of Engineering and Applied Science, University of Pennsylvania, Philadelphia, PA, USA; Department of Chemistry, School of Arts and Sciences, University of Pennsylvania, Philadelphia, PA, USA; Penn Institute for Computational Science, University of Pennsylvania, Philadelphia, PA, USA
| | - Yiqian Duan
- Institute of Science and Technology for Brain-Inspired Intelligence - ISTBI, Fudan University, Shanghai 200433, China
| | - Álvaro Rodríguez Del Río
- Centro de Biotecnología y Genómica de Plantas, Universidad Politécnica de Madrid (UPM) - Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA-CSIC), Campus de Montegancedo-UPM, Pozuelo de Alarcón, 28223 Madrid, Spain
| | - Thomas S B Schmidt
- Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany; APC Microbiome & School of Medicine, University College Cork, Cork, Ireland
| | - Hui Chong
- Institute of Science and Technology for Brain-Inspired Intelligence - ISTBI, Fudan University, Shanghai 200433, China
| | - Anthony Fullam
- Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
| | - Michael Kuhn
- Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
| | - Chengkai Zhu
- Institute of Science and Technology for Brain-Inspired Intelligence - ISTBI, Fudan University, Shanghai 200433, China
| | - Amy Houseman
- Institute of Science and Technology for Brain-Inspired Intelligence - ISTBI, Fudan University, Shanghai 200433, China
| | - Jelena Somborski
- Institute of Science and Technology for Brain-Inspired Intelligence - ISTBI, Fudan University, Shanghai 200433, China
| | - Anna Vines
- Institute of Science and Technology for Brain-Inspired Intelligence - ISTBI, Fudan University, Shanghai 200433, China
| | - Xing-Ming Zhao
- Institute of Science and Technology for Brain-Inspired Intelligence - ISTBI, Fudan University, Shanghai 200433, China; Department of Neurology, Zhongshan Hospital, Fudan University, Shanghai, China; State Key Laboratory of Medical Neurobiology, Institutes of Brain Science, Fudan University, Shanghai, China; MOE Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence and MOE Frontiers Center for Brain Science, Fudan University, Shanghai, China
| | - Peer Bork
- Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany; Max Delbrück Centre for Molecular Medicine, Berlin, Germany; Department of Bioinformatics, Biocenter, University of Würzburg, Würzburg, Germany
| | - Jaime Huerta-Cepas
- Centro de Biotecnología y Genómica de Plantas, Universidad Politécnica de Madrid (UPM) - Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA-CSIC), Campus de Montegancedo-UPM, Pozuelo de Alarcón, 28223 Madrid, Spain
| | - Cesar de la Fuente-Nunez
- Machine Biology Group, Departments of Psychiatry and Microbiology, Institute for Biomedical Informatics, Institute for Translational Medicine and Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA; Departments of Bioengineering and Chemical and Biomolecular Engineering, School of Engineering and Applied Science, University of Pennsylvania, Philadelphia, PA, USA; Department of Chemistry, School of Arts and Sciences, University of Pennsylvania, Philadelphia, PA, USA; Penn Institute for Computational Science, University of Pennsylvania, Philadelphia, PA, USA.
| | - Luis Pedro Coelho
- Institute of Science and Technology for Brain-Inspired Intelligence - ISTBI, Fudan University, Shanghai 200433, China; Centre for Microbiome Research, School of Biomedical Sciences, Queensland University of Technology, Translational Research Institute, Woolloongabba, QLD, Australia.
|
14
|
Cordoves-Delgado G, García-Jacas CR. Predicting Antimicrobial Peptides Using ESMFold-Predicted Structures and ESM-2-Based Amino Acid Features with Graph Deep Learning. J Chem Inf Model 2024; 64:4310-4321. [PMID: 38739853 DOI: 10.1021/acs.jcim.3c02061] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/16/2024]
Abstract
Currently, antimicrobial resistance constitutes a serious threat to human health. Drugs based on antimicrobial peptides (AMPs) constitute one of the alternatives to address it. Shallow and deep learning (DL)-based models have mainly been built from amino acid sequences to predict AMPs. Recent advances in tertiary (3D) structure prediction have opened new opportunities in this field. In this sense, models based on graphs derived from predicted peptide structures have recently been proposed. However, these models do not use state-of-the-art approaches to encode evolutionary information and, in addition, they are memory- and time-consuming because they depend on multiple sequence alignments. Herein, we present a framework to create alignment-free models based on graph representations generated from ESMFold-predicted peptide structures, whose nodes are characterized with amino acid-level evolutionary information derived from the Evolutionary Scale Modeling (ESM-2) models. A graph attention network (GAT) was implemented to assess the usefulness of the framework in AMP classification. To this end, a set of 67,058 peptides was used. The proposed methodology made it possible to build GAT models with generalization abilities consistently better than those of 20 state-of-the-art non-DL-based and DL-based models. The best GAT models were developed using evolutionary information derived from the 36- and 33-layer ESM-2 models. Similarity studies showed that the best GAT models encoded different chemical spaces, and thus they were fused to significantly improve the classification. In general, the results suggest that esm-AxP-GDL is a promising tool for developing good, structure-dependent, and alignment-free models that can be successfully applied in the screening of large data sets. This framework should be useful not only for classifying AMPs but also for modeling other peptide and protein activities.
Affiliation(s)
- Greneter Cordoves-Delgado
- Departamento de Ciencias de la Computación, Centro de Investigación Científica y de Educación Superior de Ensenada (CICESE), 22860 Ensenada, Baja California, México
| | - César R García-Jacas
- Cátedras CONAHCYT - Departamento de Ciencias de la Computación, Centro de Investigación Científica y de Educación Superior de Ensenada (CICESE), 22860 Ensenada, Baja California, México
|
15
|
Castillo-Mendieta K, Agüero-Chapin G, Marquez E, Perez-Castillo Y, Barigye SJ, Pérez-Cárdenas M, Peréz-Giménez F, Marrero-Ponce Y. Multiquery Similarity Searching Models: An Alternative Approach for Predicting Hemolytic Activity from Peptide Sequence. Chem Res Toxicol 2024; 37:580-589. [PMID: 38501392 DOI: 10.1021/acs.chemrestox.3c00408] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/20/2024]
Abstract
Their desirable pharmacological properties and broad range of therapeutic activities have made peptides promising drugs compared with small organic molecules and antibody drugs. Nevertheless, toxic effects, such as hemolysis, have hampered the development of such promising drugs. Hence, a reliable computational tool to predict peptide hemolytic toxicity is enormously useful before synthesis and experimental evaluation. Currently, four web servers that predict hemolytic activity using machine learning (ML) algorithms are available; however, they exhibit some limitations, such as the need for a reliable negative set and a limited application domain. Hence, we developed a robust model based on a novel theoretical approach that combines network science and a multiquery similarity searching (MQSS) method. A total of 1152 initial models were constructed from 144 scaffolds generated in a previous report. These were evaluated on external data sets, and the best models were fused and improved. Our best MQSS model, I1, outperformed all state-of-the-art ML-based models and was used to characterize the prevalence of hemolytic toxicity in therapeutic peptides. Based on our model's estimation, the number of hemolytic peptides might be 3.9-fold higher than reported.
Affiliation(s)
- Kevin Castillo-Mendieta
- School of Biological Sciences and Engineering, Yachay Tech University, Hda. San José s/n y Proyecto Yachay, Urcuquí 100119, Ecuador
| | - Guillermin Agüero-Chapin
- CIIMAR/CIMAR, Interdisciplinary Centre of Marine and Environmental Research, Terminal de Cruzeiros do Porto de Leixões, University of Porto, Av. General Norton de Matos s/n, 4450-208 Porto, Portugal
- Department of Biology, Faculty of Sciences, University of Porto, Rua do Campo Alegre, 4169-007 Porto, Portugal
| | - Edgar Marquez
- Grupo de Investigaciones en Química y Biología, Departamento de Química y Biología, Facultad de Ciencias Básicas, Universidad del Norte, Carrera 51B, Km 5, vía Puerto Colombia, Barranquilla 081007, Colombia
| | - Yunierkis Perez-Castillo
- Bio-Chemoinformatics Research Group and Escuela de Ciencias Físicas y Matemáticas. Universidad de Las Américas, Quito 170504, Ecuador
| | - Stephen J Barigye
- Departamento de Química Física Aplicada, Facultad de Ciencias, Universidad Autónoma de Madrid (UAM), 28049 Madrid, Spain
| | - Mariela Pérez-Cárdenas
- School of Biological Sciences and Engineering, Yachay Tech University, Hda. San José s/n y Proyecto Yachay, Urcuquí 100119, Ecuador
| | - Facundo Peréz-Giménez
- Unidad de Investigación de Diseño de Fármacos y Conectividad Molecular, Departamento de Química Física, Facultad de Farmacia, Universitat de València, Valencia 46100, Spain
| | - Yovani Marrero-Ponce
- Unidad de Investigación de Diseño de Fármacos y Conectividad Molecular, Departamento de Química Física, Facultad de Farmacia, Universitat de València, Valencia 46100, Spain
- Facultad de Ingeniería, Universidad Panamericana, Augusto Rodin No. 498, Insurgentes Mixcoac, Benito Juárez, CDMX, Mexico 03920, Mexico
- Grupo de Medicina Molecular y Traslacional (MeM&T), Colegio de Ciencias de la Salud (COCSA), Escuela de Medicina, Edificio de Especialidades Médicas; and Instituto de Simulación Computacional (ISC-USFQ), Diego de Robles y vía Interoceánica, Universidad San Francisco de Quito (USFQ), Quito, Pichincha 170157, Ecuador
|
16
|
Martínez‐Mauricio KL, García‐Jacas CR, Cordoves‐Delgado G. Examining evolutionary scale modeling-derived different-dimensional embeddings in the antimicrobial peptide classification through a KNIME workflow. Protein Sci 2024; 33:e4928. [PMID: 38501511 PMCID: PMC10949403 DOI: 10.1002/pro.4928] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/07/2023] [Revised: 01/28/2024] [Accepted: 01/30/2024] [Indexed: 03/20/2024]
Abstract
Molecular features play an important role in different bio-chem-informatics tasks, such as Quantitative Structure-Activity Relationship (QSAR) modeling. Several pre-trained models have recently been created for use in downstream tasks, either by fine-tuning a specific model or by extracting features to feed traditional classifiers. In this regard, a new family of Evolutionary Scale Modeling models (termed ESM-2 models) was recently introduced, demonstrating outstanding results in protein structure prediction benchmarks. Herein, we studied the usefulness of the different-dimensional embeddings derived from the ESM-2 models to classify antimicrobial peptides (AMPs). To this end, we built a KNIME workflow to use the same modeling methodology across experiments in order to guarantee fair analyses. As a result, the 640- and 1280-dimensional embeddings derived from the 30- and 33-layer ESM-2 models, respectively, are the most valuable, since statistically better performances were achieved by the QSAR models built from them. We also fused features of the different ESM-2 models, and it was concluded that the fusion yields better QSAR models than using features of a single ESM-2 model. Frequency studies revealed that only a portion of the ESM-2 embeddings is valuable for modeling tasks, since between 43% and 66% of the features were never used. Comparisons with state-of-the-art deep learning (DL) models confirm that, when performing methodologically principled studies in the prediction of AMPs, non-DL-based QSAR models yield comparable-to-superior performances to DL-based QSAR models. The developed KNIME workflow is freely available at https://github.com/cicese-biocom/classification-QSAR-bioKom. This workflow can be valuable to avoid unfair comparisons regarding new computational methods, as well as to propose new non-DL-based QSAR models.
Affiliation(s)
- Karla L. Martínez‐Mauricio
- Departamento de Ciencias de la Computación, Centro de Investigación Científica y de Educación Superior de Ensenada (CICESE), Ensenada, Mexico
| | - César R. García‐Jacas
- Cátedras CONAHCYT – Departamento de Ciencias de la Computación, Centro de Investigación Científica y de Educación Superior de Ensenada (CICESE), Ensenada, Mexico
| | - Greneter Cordoves‐Delgado
- Departamento de Ciencias de la Computación, Centro de Investigación Científica y de Educación Superior de Ensenada (CICESE), Ensenada, Mexico
|
17
|
Goshisht MK. Machine Learning and Deep Learning in Synthetic Biology: Key Architectures, Applications, and Challenges. ACS Omega 2024; 9:9921-9945. [PMID: 38463314 PMCID: PMC10918679 DOI: 10.1021/acsomega.3c05913] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2023] [Revised: 01/19/2024] [Accepted: 01/30/2024] [Indexed: 03/12/2024]
Abstract
Machine learning (ML), particularly deep learning (DL), has made rapid and substantial progress in synthetic biology in recent years. Biotechnological applications of biosystems, including pathways, enzymes, and whole cells, are being explored with increasing frequency. The intricacy and interconnectedness of biosystems make it challenging to design them with the desired properties. ML and DL have a synergy with synthetic biology. Synthetic biology can be employed to produce large data sets for training models (for instance, by utilizing DNA synthesis), and ML/DL models can be employed to inform design (for example, by generating new parts or advising unrivaled experiments to perform). This potential has recently been brought to light by research at the intersection of engineering biology and ML/DL through achievements like the design of novel biological components, best experimental design, automated analysis of microscopy data, protein structure prediction, and biomolecular implementations of ANNs (Artificial Neural Networks). I have divided this review into three sections. In the first section, I describe the predictive potential and basics of ML along with myriad applications in synthetic biology, especially in engineering cells, activity of proteins, and metabolic pathways. In the second section, I describe fundamental DL architectures and their applications in synthetic biology. Finally, I describe the challenges hindering progress in ML/DL and synthetic biology, along with their solutions.
Affiliation(s)
- Manoj Kumar Goshisht
- Department of Chemistry, Natural and Applied Sciences, University of Wisconsin-Green Bay, Green Bay, Wisconsin 54311-7001, United States
|
18
|
Li Z, Jin J, He W, Long W, Yu H, Gao X, Nakai K, Zou Q, Wei L. CoraL: interpretable contrastive meta-learning for the prediction of cancer-associated ncRNA-encoded small peptides. Brief Bioinform 2023; 24:bbad352. [PMID: 37861173 DOI: 10.1093/bib/bbad352] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 05/30/2023] [Revised: 08/29/2023] [Accepted: 09/17/2023] [Indexed: 10/21/2023] Open
Abstract
NcRNA-encoded small peptides (ncPEPs) have recently emerged as promising targets and biomarkers for cancer immunotherapy. Therefore, identifying cancer-associated ncPEPs is crucial for cancer research. In this work, we propose CoraL, a novel supervised contrastive meta-learning framework for predicting cancer-associated ncPEPs. Specifically, the proposed meta-learning strategy enables our model to learn meta-knowledge from different types of peptides and train a promising predictive model even with few labeled samples. The results show that our model is capable of making high-confidence predictions on unseen cancer biomarkers with only five samples, potentially accelerating the discovery of novel cancer biomarkers for immunotherapy. Moreover, our approach remarkably outperforms existing deep learning models on 15 cancer-associated ncPEP datasets, demonstrating its effectiveness and robustness. Interestingly, our model exhibits outstanding performance when extended for the identification of short open reading frames derived from ncPEPs, demonstrating the strong prediction ability of CoraL at the transcriptome level. Importantly, our feature interpretation analysis discovers unique sequential patterns as the fingerprint for each cancer-associated ncPEP, revealing relationships among certain cancer biomarkers that are validated by relevant literature and motif comparison. Overall, we expect CoraL to be a useful tool to decipher the pathogenesis of cancer and provide valuable information for cancer research. The dataset and source code of our proposed method can be found at https://github.com/Johnsunnn/CoraL.
Affiliation(s)
- Zhongshen Li
- School of Software, Shandong University, Jinan 250101, China
- Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan 250101, China
| | - Junru Jin
- School of Software, Shandong University, Jinan 250101, China
- Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan 250101, China
| | - Wenjia He
- Computer Science Program, Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
- Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
| | - Wentao Long
- School of Software, Shandong University, Jinan 250101, China
- Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan 250101, China
| | - Haoqing Yu
- School of Software, Shandong University, Jinan 250101, China
- Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan 250101, China
| | - Xin Gao
- Computer Science Program, Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
- Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
| | - Kenta Nakai
- Department of Computational Biology and Medical Sciences, The University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa-shi, Chiba 277-8562, Japan
- Human Genome Center, The Institute of Medical Science, The University of Tokyo, 4-6-1 Shirokanedai Minato-ku, Tokyo 108-8639, Japan
| | - Quan Zou
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, 610054, China
| | - Leyi Wei
- School of Software, Shandong University, Jinan 250101, China
- Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan 250101, China
|
19
|
Santos-Júnior CD, Der Torossian Torres M, Duan Y, del Río ÁR, Schmidt TS, Chong H, Fullam A, Kuhn M, Zhu C, Houseman A, Somborski J, Vines A, Zhao XM, Bork P, Huerta-Cepas J, de la Fuente-Nunez C, Coelho LP. Computational exploration of the global microbiome for antibiotic discovery. bioRxiv 2023:2023.08.31.555663. [PMID: 37693522 PMCID: PMC10491242 DOI: 10.1101/2023.08.31.555663] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Indexed: 09/12/2023]
Abstract
Novel antibiotics are urgently needed to combat the antibiotic-resistance crisis. We present a machine learning-based approach to predict prokaryotic antimicrobial peptides (AMPs) by leveraging a vast dataset of 63,410 metagenomes and 87,920 microbial genomes. This led to the creation of AMPSphere, a comprehensive catalog comprising 863,498 non-redundant peptides, the majority of which were previously unknown. We observed that AMP production varies by habitat, with animal-associated samples displaying the highest proportion of AMPs compared to other habitats. Furthermore, within different human-associated microbiota, strain-level differences were evident. To validate our predictions, we synthesized and experimentally tested 50 AMPs, demonstrating their efficacy against clinically relevant drug-resistant pathogens both in vitro and in vivo. These AMPs exhibited antibacterial activity by targeting the bacterial membrane. Additionally, AMPSphere provides valuable insights into the evolutionary origins of peptides. In conclusion, our approach identified AMP sequences within prokaryotic microbiomes, opening up new avenues for the discovery of antibiotics.
Affiliation(s)
- Célio Dias Santos-Júnior
- Institute of Science and Technology for Brain-Inspired Intelligence - ISTBI, Fudan University, Shanghai, China
| | - Marcelo Der Torossian Torres
- Machine Biology Group, Departments of Psychiatry and Microbiology, Institute for Biomedical Informatics, Institute for Translational Medicine and Therapeutics, Perelman School of Medicine, University of Pennsylvania; Philadelphia, Pennsylvania, United States of America
- Departments of Bioengineering and Chemical and Biomolecular Engineering, School of Engineering and Applied Science, University of Pennsylvania; Philadelphia, Pennsylvania, United States of America
- Penn Institute for Computational Science, University of Pennsylvania; Philadelphia, Pennsylvania, United States of America
| | - Yiqian Duan
- Institute of Science and Technology for Brain-Inspired Intelligence - ISTBI, Fudan University, Shanghai, China
| | - Álvaro Rodríguez del Río
- Centro de Biotecnología y Genómica de Plantas, Universidad Politécnica de Madrid (UPM) - Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA-CSIC), Campus de Montegancedo-UPM, 28223 Pozuelo de Alarcón, Madrid, Spain
| | - Thomas S.B. Schmidt
- Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
| | - Hui Chong
- Institute of Science and Technology for Brain-Inspired Intelligence - ISTBI, Fudan University, Shanghai, China
| | - Anthony Fullam
- Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
| | - Michael Kuhn
- Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
| | - Chengkai Zhu
- Institute of Science and Technology for Brain-Inspired Intelligence - ISTBI, Fudan University, Shanghai, China
| | - Amy Houseman
- Institute of Science and Technology for Brain-Inspired Intelligence - ISTBI, Fudan University, Shanghai, China
| | - Jelena Somborski
- Institute of Science and Technology for Brain-Inspired Intelligence - ISTBI, Fudan University, Shanghai, China
| | - Anna Vines
- Institute of Science and Technology for Brain-Inspired Intelligence - ISTBI, Fudan University, Shanghai, China
| | - Xing-Ming Zhao
- Institute of Science and Technology for Brain-Inspired Intelligence - ISTBI, Fudan University, Shanghai, China
- Department of Neurology, Zhongshan Hospital, Fudan University, Shanghai, China
- State Key Laboratory of Medical Neurobiology, Institutes of Brain Science, Fudan University, Shanghai, China
- MOE Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence
- MOE Frontiers Center for Brain Science, Fudan University, Shanghai, China
- International Human Phenome Institute, Shanghai, China
| | - Peer Bork
- Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
- Max Delbrück Centre for Molecular Medicine, Berlin, Germany
- Department of Bioinformatics, Biocenter, University of Würzburg, Würzburg, Germany
| | - Jaime Huerta-Cepas
- Centro de Biotecnología y Genómica de Plantas, Universidad Politécnica de Madrid (UPM) - Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA-CSIC), Campus de Montegancedo-UPM, 28223 Pozuelo de Alarcón, Madrid, Spain
| | - Cesar de la Fuente-Nunez
- Machine Biology Group, Departments of Psychiatry and Microbiology, Institute for Biomedical Informatics, Institute for Translational Medicine and Therapeutics, Perelman School of Medicine, University of Pennsylvania; Philadelphia, Pennsylvania, United States of America
- Departments of Bioengineering and Chemical and Biomolecular Engineering, School of Engineering and Applied Science, University of Pennsylvania; Philadelphia, Pennsylvania, United States of America
- Penn Institute for Computational Science, University of Pennsylvania; Philadelphia, Pennsylvania, United States of America
| | - Luis Pedro Coelho
- Institute of Science and Technology for Brain-Inspired Intelligence - ISTBI, Fudan University, Shanghai, China
| |
|
20
|
Singh V, Singh SK. A separable temporal convolutional networks based deep learning technique for discovering antiviral medicines. Sci Rep 2023; 13:13722. [PMID: 37608092 PMCID: PMC10444765 DOI: 10.1038/s41598-023-40922-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/09/2023] [Accepted: 08/18/2023] [Indexed: 08/24/2023]
Abstract
An alarming number of fatalities caused by the COVID-19 pandemic has forced the scientific community to accelerate the process of therapeutic drug discovery. In this regard, the collaboration between biomedical scientists and experts in artificial intelligence (AI) has led to a number of in silico tools being developed for the initial screening of therapeutic molecules. All living organisms produce antiviral peptides (AVPs) as a part of their first line of defense against invading viruses. The Deep-AVPiden model proposed in this paper and its corresponding web app, deployed at https://deep-avpiden.anvil.app, is an effort toward discovering novel AVPs in proteomes of living organisms. Apart from Deep-AVPiden, a computationally efficient model called Deep-AVPiden (DS) has also been developed using the same underlying network but with point-wise separable convolutions. The Deep-AVPiden and Deep-AVPiden (DS) models show an accuracy of 90% and 88%, respectively, and both have a precision of 90%. The proposed models were also compared statistically using Student's t-test. A comparison with state-of-the-art classifiers found that the proposed models outperform them. To test the proposed model, we identified some AVPs in the natural defense proteins of plants, mammals, and fishes and found them to have appreciable sequence similarity with some experimentally validated antimicrobial peptides. These AVPs can be chemically synthesized and tested for their antiviral activity.
Affiliation(s)
- Vishakha Singh
- Department of Computer Science and Engineering, Indian Institute of Technology (BHU) Varanasi, Varanasi, Uttar Pradesh, 221005, India.
| | - Sanjay Kumar Singh
- Department of Computer Science and Engineering, Indian Institute of Technology (BHU) Varanasi, Varanasi, Uttar Pradesh, 221005, India.
| |
|
21
|
Aguilera-Mendoza L, Ayala-Ruano S, Martinez-Rios F, Chavez E, García-Jacas CR, Brizuela CA, Marrero-Ponce Y. StarPep Toolbox: an open-source software to assist chemical space analysis of bioactive peptides and their functions using complex networks. Bioinformatics 2023; 39:btad506. [PMID: 37603724 PMCID: PMC10469104 DOI: 10.1093/bioinformatics/btad506] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/29/2023] [Revised: 07/24/2023] [Accepted: 08/18/2023] [Indexed: 08/23/2023]
Abstract
MOTIVATION: Antimicrobial peptides (AMPs) are promising molecules to treat infectious diseases caused by multidrug-resistant pathogens, some types of cancer, and other conditions. Computer-aided strategies are efficient tools for the high-throughput screening of AMPs. RESULTS: This report highlights StarPep Toolbox, an open-source and user-friendly software to study the bioactive chemical space of AMPs using complex network-based representations, clustering, and similarity-searching models. The novelty of this research lies in the combination of network science and similarity-searching techniques, distinguishing it from conventional methods based on machine learning and other computational approaches. The network-based representation of the AMP chemical space presents promising opportunities for peptide drug repurposing, development, and optimization. This approach could serve as a baseline for the discovery of a new generation of therapeutic peptides. AVAILABILITY AND IMPLEMENTATION: All underlying code and installation files are accessible through GitHub (https://github.com/Grupo-Medicina-Molecular-y-Traslacional/StarPep) under the Apache 2.0 license.
Affiliation(s)
- Longendri Aguilera-Mendoza
- Grupo de Medicina Molecular y Translacional (MeM&T), Facultad de Medicina, Universidad San Francisco de Quito (USFQ), Quito, Ecuador
- Departamento de Ciencias de la Computación, Centro de Investigación Científica y de Educación Superior de Ensenada (CICESE), Ensenada, Baja California 22860, México
| | - Sebastián Ayala-Ruano
- Grupo de Medicina Molecular y Translacional (MeM&T), Facultad de Medicina, Universidad San Francisco de Quito (USFQ), Quito, Ecuador
| | - Felix Martinez-Rios
- Facultad de Ingeniería, Universidad Panamericana, CDMX, Benito Juárez 03920, México
| | - Edgar Chavez
- Departamento de Ciencias de la Computación, Centro de Investigación Científica y de Educación Superior de Ensenada (CICESE), Ensenada, Baja California 22860, México
| | - César R García-Jacas
- Departamento de Ciencias de la Computación, Centro de Investigación Científica y de Educación Superior de Ensenada (CICESE), Ensenada, Baja California 22860, México
- Cátedras CONAHCYT - Departamento de Ciencias de la Computación, Centro de Investigación Científica y de Educación Superior de Ensenada (CICESE), Ensenada, Baja California 22860, México
| | - Carlos A Brizuela
- Departamento de Ciencias de la Computación, Centro de Investigación Científica y de Educación Superior de Ensenada (CICESE), Ensenada, Baja California 22860, México
| | - Yovani Marrero-Ponce
- Grupo de Medicina Molecular y Translacional (MeM&T), Facultad de Medicina, Universidad San Francisco de Quito (USFQ), Quito, Ecuador
- Departamento de Ciencias de la Computación, Centro de Investigación Científica y de Educación Superior de Ensenada (CICESE), Ensenada, Baja California 22860, México
| |
|
22
|
Agüero-Chapin G, Antunes A, Mora JR, Pérez N, Contreras-Torres E, Valdes-Martini JR, Martinez-Rios F, Zambrano CH, Marrero-Ponce Y. Complex Networks Analyses of Antibiofilm Peptides: An Emerging Tool for Next-Generation Antimicrobials' Discovery. Antibiotics (Basel) 2023; 12:antibiotics12040747. [PMID: 37107109 PMCID: PMC10135022 DOI: 10.3390/antibiotics12040747] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 03/10/2023] [Revised: 04/04/2023] [Accepted: 04/11/2023] [Indexed: 04/29/2023]
Abstract
Microbial biofilms cause several environmental and industrial issues, even affecting human health. Although they have long represented a threat due to their resistance to antibiotics, there are currently no approved antibiofilm agents for clinical treatments. The multi-functionality of antimicrobial peptides (AMPs), including their antibiofilm activity and their potential to target multiple microbes, has motivated the synthesis of AMPs and their relatives for developing antibiofilm agents for clinical purposes. Antibiofilm peptides (ABFPs) have been organized in databases that have allowed the building of prediction tools which have assisted in the discovery/design of new antibiofilm agents. However, the complex network approach has not yet been explored as an assistant tool for this aim. Herein, a kind of similarity network called the half-space proximal network (HSPN) is applied to represent/analyze the chemical space of ABFPs, aiming to identify privileged scaffolds for the development of next-generation antimicrobials that are able to target both planktonic and biofilm microbial forms. Such analyses also considered the metadata associated with the ABFPs, such as origin, other activities, targets, etc., in which the relationships were projected by multilayer networks called metadata networks (METNs). From the complex networks' mining, a reduced but informative set of 66 ABFPs was extracted, representing the original antibiofilm space. This subset contained the most central to atypical ABFPs, some of them having the desired properties for developing next-generation antimicrobials. Therefore, this subset is advisable for assisting the search for/design of both new antibiofilms and antimicrobial agents. The provided ABFP motifs list, discovered within the HSPN communities, is also useful for the same purpose.
Affiliation(s)
- Guillermin Agüero-Chapin
- CIIMAR/CIMAR, Interdisciplinary Centre of Marine and Environmental Research, University of Porto, 4450-208 Porto, Portugal
- Department of Biology, Faculty of Sciences, University of Porto, 4169-007 Porto, Portugal
| | - Agostinho Antunes
- CIIMAR/CIMAR, Interdisciplinary Centre of Marine and Environmental Research, University of Porto, 4450-208 Porto, Portugal
- Department of Biology, Faculty of Sciences, University of Porto, 4169-007 Porto, Portugal
| | - José R Mora
- Universidad San Francisco de Quito (USFQ), Colegio de Ciencias e Ingenierías "El Politécnico", Instituto de Simulación Computacional (ISC-USFQ), Diego de Robles y vía Interoceánica, Quito 170157, Pichincha, Ecuador
| | - Noel Pérez
- Universidad San Francisco de Quito (USFQ), Colegio de Ciencias e Ingenierías "El Politécnico", Instituto de Simulación Computacional (ISC-USFQ), Diego de Robles y vía Interoceánica, Quito 170157, Pichincha, Ecuador
| | - Ernesto Contreras-Torres
- Universidad San Francisco de Quito (USFQ), Grupo de Medicina Molecular y Traslacional (MeM&T), Colegio de Ciencias de la Salud (COCSA), Escuela de Medicina, Edificio de Especialidades Médicas and Instituto de Simulación Computacional (ISC-USFQ), Diego de Robles y vía Interoceánica, Quito 170157, Pichincha, Ecuador
| | | | - Felix Martinez-Rios
- Facultad de Ingeniería, Universidad Panamericana, Augusto Rodin No. 498, Insurgentes Mixcoac, Benito Juárez, Ciudad de México 03920, Mexico
| | - Cesar H Zambrano
- Universidad San Francisco de Quito (USFQ), Colegio de Ciencias e Ingenierías "El Politécnico", Instituto de Simulación Computacional (ISC-USFQ), Diego de Robles y vía Interoceánica, Quito 170157, Pichincha, Ecuador
| | - Yovani Marrero-Ponce
- Universidad San Francisco de Quito (USFQ), Grupo de Medicina Molecular y Traslacional (MeM&T), Colegio de Ciencias de la Salud (COCSA), Escuela de Medicina, Edificio de Especialidades Médicas and Instituto de Simulación Computacional (ISC-USFQ), Diego de Robles y vía Interoceánica, Quito 170157, Pichincha, Ecuador
- Departamento de Ciencias de la Computación, Centro de Investigación Científica y de Educación Superior de Ensenada (CICESE), Ensenada 22860, Baja California, Mexico
| |
|
23
|
Tonolo F, Grinzato A, Bindoli A, Rigobello MP. From In Silico to a Cellular Model: Molecular Docking Approach to Evaluate Antioxidant Bioactive Peptides. Antioxidants (Basel) 2023; 12:antiox12030665. [PMID: 36978913 PMCID: PMC10045749 DOI: 10.3390/antiox12030665] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 02/02/2023] [Revised: 03/02/2023] [Accepted: 03/03/2023] [Indexed: 03/10/2023]
Abstract
The increasing need to counteract the redox imbalance in chronic diseases leads to focusing research on compounds with antioxidant activity. Among natural molecules with health-promoting effects on many body functions, bioactive peptides are gaining interest. They are protein fragments of 2–20 amino acids that can be released by various mechanisms, such as gastrointestinal digestion, food processing and microbial fermentation. Recent studies report the effects of bioactive peptides in the cellular environment, and there is evidence that these compounds can exert their action by modulating specific pathways. This review focuses on the newest approaches to the structure–function correlation of the antioxidant bioactive peptides, considering their molecular mechanism, by evaluating the activation of specific signaling pathways that are linked to antioxidant systems. The correlation between the results of in silico molecular docking analysis and the effects in a cellular model was highlighted. This knowledge is fundamental in order to propose the use of bioactive peptides as ingredients in functional foods or nutraceuticals.
Affiliation(s)
- Federica Tonolo
- Department of Biomedical Sciences, University of Padova, Via U. Bassi 58/b, 35131 Padova, Italy
- Department of Comparative Biomedicine and Food Science, University of Padova, Viale dell’Università, 35020 Padova, Italy
| | - Alessandro Grinzato
- European Synchrotron Radiation Facility, 71 Avenue des Martyrs, 38000 Grenoble, France
| | - Alberto Bindoli
- Institute of Neuroscience (CNR), Viale G. Colombo 3, 35131 Padova, Italy
| | - Maria Pia Rigobello
- Department of Biomedical Sciences, University of Padova, Via U. Bassi 58/b, 35131 Padova, Italy
| |
|
24
|
Li Z, Gao E, Zhou J, Han W, Xu X, Gao X. Applications of deep learning in understanding gene regulation. CELL REPORTS METHODS 2023; 3:100384. [PMID: 36814848 PMCID: PMC9939384 DOI: 10.1016/j.crmeth.2022.100384] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Indexed: 05/22/2023]
Abstract
Gene regulation is a central topic in cell biology. Advances in omics technologies and the accumulation of omics data have provided better opportunities for gene regulation studies than ever before. For this reason deep learning, as a data-driven predictive modeling approach, has been successfully applied to this field during the past decade. In this article, we aim to give a brief yet comprehensive overview of representative deep-learning methods for gene regulation. Specifically, we discuss and compare the design principles and datasets used by each method, creating a reference for researchers who wish to replicate or improve existing methods. We also discuss the common problems of existing approaches and prospectively introduce the emerging deep-learning paradigms that will potentially alleviate them. We hope that this article will provide a rich and up-to-date resource and shed light on future research directions in this area.
Affiliation(s)
- Zhongxiao Li
- Computer Science Program, Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
- KAUST Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
| | - Elva Gao
- The KAUST School, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
| | - Juexiao Zhou
- Computer Science Program, Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
- KAUST Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
| | - Wenkai Han
- Computer Science Program, Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
- KAUST Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
| | - Xiaopeng Xu
- Computer Science Program, Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
- KAUST Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
| | - Xin Gao
- Computer Science Program, Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
- KAUST Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
| |
|
25
|
Carballo GM, Vázquez KG, García-González LA, Rio GD, Brizuela CA. Embedded-AMP: A Multi-Thread Computational Method for the Systematic Identification of Antimicrobial Peptides Embedded in Proteome Sequences. Antibiotics (Basel) 2023; 12:antibiotics12010139. [PMID: 36671338 PMCID: PMC9854971 DOI: 10.3390/antibiotics12010139] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/26/2022] [Revised: 01/03/2023] [Accepted: 01/05/2023] [Indexed: 01/12/2023]
Abstract
Antimicrobial peptides (AMPs) have gained the attention of the research community as an alternative to conventional antimicrobials to fight antibiotic resistance and for displaying other pharmacologically relevant activities, such as cell penetration, autophagy induction, and immunomodulation. The identification of AMPs has been accomplished by combining computational and experimental approaches and has been mostly restricted to self-contained peptides, despite accumulated evidence indicating that AMPs may be found embedded within proteins whose functions are not necessarily associated with antimicrobial activity. To address this limitation, we propose a machine-learning (ML)-based pipeline to identify AMPs that are embedded in proteomes. Our method performs an in-silico digestion of every protein in the proteome to generate unique k-mers of different lengths, computes a set of molecular descriptors for each k-mer, and performs an antimicrobial activity prediction. To show the efficiency of the method, we used the shrimp proteome; the pipeline analyzed all k-mers between 10 and 60 amino acids in length and predicted all AMPs in less than 20 min. As an application example, we predicted AMPs in different rodents (common cuy, common rat, and naked mole rat) with different reported longevities and found a relation between species longevity and the number of predicted AMPs. The analysis shows that the higher the longevity of the species, the higher the number of predicted AMPs. The pipeline is available as a web service.
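The in-silico digestion step described in this abstract can be sketched as a plain substring enumeration. The sketch below is an illustration only, not the authors' multi-threaded implementation: the function name `unique_kmers`, its parameters, and the toy sequences are assumptions made here, and the descriptor calculation and activity prediction steps are not reproduced.

```python
def unique_kmers(proteins, k_min=10, k_max=60):
    """Return the set of distinct substrings with length k_min..k_max.

    `proteins` is an iterable of amino-acid strings (hypothetical input;
    a real pipeline would read these from a proteome FASTA file).
    """
    seen = set()
    for seq in proteins:
        n = len(seq)
        # A protein shorter than k cannot contribute k-mers of that length.
        for k in range(k_min, min(k_max, n) + 1):
            for start in range(n - k + 1):
                seen.add(seq[start:start + k])
    return seen


# Toy example: two overlapping 12-residue sequences (made-up data).
toy_proteome = ["ACDEFGHIKLMN", "DEFGHIKLMNPQ"]
kmers = unique_kmers(toy_proteome, k_min=10, k_max=12)
```

Collecting the k-mers in a set means a fragment shared by several proteins is scored only once. For a full proteome the set can become large, so a production pipeline would likely stream or shard it.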
Affiliation(s)
| | - Karen Guerrero Vázquez
- Computer Science Department, CICESE Research Center, Ensenada 22860, Mexico
- School of Mathematical & Statistical Sciences, University of Galway, H91 TK33 Galway, Ireland
| | | | - Gabriel Del Rio
- Department of Biochemistry and Structural Biology, Instituto de Fisiologia Celular, UNAM, Mexico City 04510, Mexico
- Correspondence: (G.D.R.); (C.A.B.)
| | - Carlos A. Brizuela
- Computer Science Department, CICESE Research Center, Ensenada 22860, Mexico
- Correspondence: (G.D.R.); (C.A.B.)
| |
|
26
|
Ayala-Ruano S, Marrero-Ponce Y, Aguilera-Mendoza L, Pérez N, Agüero-Chapin G, Antunes A, Aguilar AC. Network Science and Group Fusion Similarity-Based Searching to Explore the Chemical Space of Antiparasitic Peptides. ACS OMEGA 2022; 7:46012-46036. [PMID: 36570318 PMCID: PMC9773354 DOI: 10.1021/acsomega.2c03398] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 05/31/2022] [Accepted: 11/21/2022] [Indexed: 05/13/2023]
Abstract
Antimicrobial peptides (AMPs) have appeared as promising compounds to treat a wide range of diseases. Their clinical potentialities reside in the wide range of mechanisms they can use for both killing microbes and modulating immune responses. However, the hugeness of the AMPs' chemical space (AMPCS), represented by more than 10^65 unique sequences, has represented a big challenge for the discovery of new promising therapeutic peptides and for the identification of common structural motifs. Here, we introduce network science and a similarity searching approach to discover new promising AMPs, specifically antiparasitic peptides (APPs). We exploited the network-based representation of APPs' chemical space (APPCS) to retrieve valuable information by using three network types: chemical space (CSN), half-space proximal (HSPN), and metadata (METN). Some centrality measures were applied to identify in each network the most important and nonredundant peptides. Then, these central peptides were considered as queries (Qs) in group fusion similarity-based searches against a comprehensive collection of known AMPs, stored in the graph database StarPepDB, to propose new potential APPs. The performance of the resulting multiquery similarity-based search models (mQSSMs) was evaluated in five benchmarking data sets of APP/non-APPs. The predictions performed by the best mQSSM showed a strong-to-very-strong performance since their external Matthews correlation coefficient (MCC) values ranged from 0.834 to 0.965. Outstanding MCC values (>0.85) were attained by the mQSSM with 219 Qs from both networks CSN and HSPN with 0.5 as similarity threshold in external data sets. Then, the performance of our best mQSSM was compared with the APPs prediction servers AMPDiscover and AMPFun. The proposed model showed its relevance by outperforming state-of-the-art machine learning models to predict APPs.
After applying the best mQSSM and additional filters on the non-APP space from StarPepDB, 95 AMPs were repurposed as potential APP hits. Due to the high sequence diversity of these peptides, different computational approaches were applied to identify relevant motifs for searching and designing new APPs. Lastly, we identified 11 promising APP lead candidates by using our best mQSSMs together with diversity-based network analyses, and 24 web servers for activity/toxicity and drug-like properties. These results support that network-based similarity searches can be an effective and reliable strategy to identify APPs. The proposed models and pipeline are freely available through the StarPep toolbox software at http://mobiosd-hub.com/starpep.
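For readers unfamiliar with the Matthews correlation coefficient (MCC) quoted in this abstract, the standard definition from a binary confusion matrix is MCC = (TP·TN − FP·FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)). A minimal sketch follows; the function name and the confusion-matrix counts are illustrative assumptions and are not taken from the paper's benchmark data sets.

```python
import math


def matthews_corrcoef_from_counts(tp, tn, fp, fn):
    """Compute MCC from confusion-matrix counts (hypothetical helper).

    Returns 0.0 when any marginal total is zero, a common convention
    for the otherwise undefined 0/0 case.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Made-up counts for a balanced APP / non-APP test set of 100 peptides.
score = matthews_corrcoef_from_counts(tp=45, tn=45, fp=5, fn=5)
```

MCC ranges from -1 (every prediction wrong) through 0 (no better than chance) to +1 (perfect agreement), and unlike accuracy it stays informative when the two classes are imbalanced.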
Affiliation(s)
- Sebastián Ayala-Ruano
- Grupo de Medicina Molecular y Traslacional (MeM&T), Escuela de Medicina, Colegio de Ciencias de la Salud (COCSA), Universidad San Francisco de Quito, Av. Interoceánica Km 12 1/2 y Av. Florencia, Quito 17-1200-841, Ecuador
- Colegio de Ciencias e Ingenierías “El Politécnico”, Universidad San Francisco de Quito (USFQ), Quito 170901, Ecuador
| | - Yovani Marrero-Ponce
- Grupo de Medicina Molecular y Traslacional (MeM&T), Escuela de Medicina, Colegio de Ciencias de la Salud (COCSA), Universidad San Francisco de Quito, Av. Interoceánica Km 12 1/2 y Av. Florencia, Quito 17-1200-841, Ecuador
- Computer-Aided Molecular “Biosilico” Discovery and Bioinformatics Research International Network (CAMD-BIR IN), Cumbayá, Quito 170901, Ecuador
- Universidad San Francisco de Quito (USFQ), Instituto de Simulación Computacional (ISC-USFQ), Diego de Robles y vía Interoceánica, Quito 170157, Pichincha, Ecuador
- Departamento de Ciencias de la Computación, Centro de Investigación Científica y de Educación Superior de Ensenada (CICESE), Baja California 22860, Mexico
| | - Longendri Aguilera-Mendoza
- Departamento de Ciencias de la Computación, Centro de Investigación Científica y de Educación Superior de Ensenada (CICESE), Baja California 22860, Mexico
| | - Noel Pérez
- Colegio de Ciencias e Ingenierías “El Politécnico”, Universidad San Francisco de Quito (USFQ), Quito 170901, Ecuador
| | - Guillermin Agüero-Chapin
- CIIMAR/CIMAR, Interdisciplinary Centre of Marine and Environmental Research, University of Porto, Terminal de Cruzeiros do Porto de Leixões, Av. General Norton de Matos s/n, 4450-208 Porto, Portugal
- Department of Biology, Faculty of Sciences, University of Porto, Rua do Campo Alegre, 4169-007 Porto, Portugal
| | - Agostinho Antunes
- CIIMAR/CIMAR, Interdisciplinary Centre of Marine and Environmental Research, University of Porto, Terminal de Cruzeiros do Porto de Leixões, Av. General Norton de Matos s/n, 4450-208 Porto, Portugal
- Department of Biology, Faculty of Sciences, University of Porto, Rua do Campo Alegre, 4169-007 Porto, Portugal
| | - Ana Cristina Aguilar
- Grupo de Medicina Molecular y Traslacional (MeM&T), Escuela de Medicina, Colegio de Ciencias de la Salud (COCSA), Universidad San Francisco de Quito, Av. Interoceánica Km 12 1/2 y Av. Florencia, Quito 17-1200-841, Ecuador
| |
|
27
|
ABP-Finder: A Tool to Identify Antibacterial Peptides and the Gram-Staining Type of Targeted Bacteria. Antibiotics (Basel) 2022; 11:antibiotics11121708. [PMID: 36551365 PMCID: PMC9774453 DOI: 10.3390/antibiotics11121708] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 10/20/2022] [Revised: 11/16/2022] [Accepted: 11/17/2022] [Indexed: 11/29/2022]
Abstract
Multi-drug resistance in bacteria is a major health problem worldwide. To overcome this issue, new approaches allowing for the identification and development of antibacterial agents are urgently needed. Peptides, due to their binding specificity and low expected side effects, are promising candidates for a new generation of antibiotics. For over two decades, a large diversity of antimicrobial peptides (AMPs) has been discovered and annotated in public databases. The AMP family encompasses nearly 20 biological functions, thus representing a potentially valuable resource for data mining analyses. Nonetheless, despite the availability of machine learning-based approaches focused on AMPs, these tools lack evidence of successful application for AMP discovery, and many are not designed to predict a specific function for putative AMPs, such as antibacterial activity. Consequently, among the apparent variety of data mining methods to screen peptide sequences for antibacterial activity, only a few tools can handle this task consistently, and those do so with limited precision and generally provide no information about the possible targets. Here, we addressed this gap by introducing a tool specifically designed to identify antibacterial peptides (ABPs) with an estimation of which type of bacteria is susceptible to the action of these peptides, according to their response to the Gram-staining assay. Our tool is freely available via a web server named ABP-Finder. This new method ranks within the top state-of-the-art ABP predictors, particularly in terms of precision. Importantly, we showed the successful application of ABP-Finder for the screening of a large peptide library from the human urine peptidome and the identification of an antibacterial peptide.
28
García-Jacas CR, García-González LA, Martinez-Rios F, Tapia-Contreras IP, Brizuela CA. Handcrafted versus non-handcrafted (self-supervised) features for the classification of antimicrobial peptides: complementary or redundant? Brief Bioinform 2022; 23:6754757. [PMID: 36215083 DOI: 10.1093/bib/bbac428] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 06/22/2022] [Revised: 08/28/2022] [Accepted: 09/02/2022] [Indexed: 12/14/2022]
Abstract
Antimicrobial peptides (AMPs) have received a great deal of attention given their potential to become a plausible option to fight multi-drug resistant bacteria as well as other pathogens. Quantitative sequence-activity models (QSAMs) have been helpful in discovering new AMPs because they allow the exploration of a large universe of peptide sequences and help reduce the number of wet lab experiments. A main aspect in the building of QSAMs based on shallow learning is to determine an optimal set of protein descriptors (features) required to discriminate between sequences with different antimicrobial activities. These features are generally handcrafted from peptide sequence datasets that are labeled with specific antimicrobial activities. However, recent developments have shown that unsupervised approaches can be used to determine features that outperform human-engineered (handcrafted) features. Thus, knowing which of these two approaches contributes to a better classification of AMPs is a fundamental question for designing more accurate models. Here, we present a systematic and rigorous study to compare both types of features. Experimental outcomes show that non-handcrafted features lead to better performance than handcrafted features. However, the experiments also show that performance improves when both types of features are merged. A relevance analysis reveals that non-handcrafted features have higher information content than handcrafted features, while an interaction-based importance analysis reveals that handcrafted features are more important. These findings suggest that there is complementarity between both types of features. Comparisons with state-of-the-art deep models show that shallow models yield better performance both when fed with non-handcrafted features alone and when fed with non-handcrafted and handcrafted features together.
Affiliation(s)
- César R García-Jacas
  - Cátedras CONACYT - Departamento de Ciencias de la Computación, Centro de Investigación Científica y de Educación Superior de Ensenada (CICESE), 22860 Ensenada, Baja California, México
- Luis A García-González
  - Departamento de Ciencias de la Computación, Centro de Investigación Científica y de Educación Superior de Ensenada (CICESE), 22860 Ensenada, Baja California, México
- Issac P Tapia-Contreras
  - Departamento de Ciencias de la Computación, Centro de Investigación Científica y de Educación Superior de Ensenada (CICESE), 22860 Ensenada, Baja California, México
- Carlos A Brizuela
  - Departamento de Ciencias de la Computación, Centro de Investigación Científica y de Educación Superior de Ensenada (CICESE), 22860 Ensenada, Baja California, México
29
Antimicrobial peptides with cell-penetrating activity as prophylactic and treatment drugs. Biosci Rep 2022; 42:231731. [PMID: 36052730 PMCID: PMC9508529 DOI: 10.1042/bsr20221789] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 08/14/2022] [Revised: 08/31/2022] [Accepted: 09/01/2022] [Indexed: 01/18/2023]
Abstract
Health is fundamental for the development of individuals and the evolution of species. In that sense, it is relevant for human societies to understand how the human body has developed molecular strategies to maintain health. In the present review, we summarize diverse evidence that supports the role of peptides in this endeavor. Of particular interest to the present review are antimicrobial peptides (AMP) and cell-penetrating peptides (CPP). Different experimental evidence indicates that AMP/CPP are able to regulate autophagy, which in turn regulates the immune system response. AMP also assist in the establishment of the microbiota, which in turn is critical for different behavioral and health aspects of humans. Thus, AMP and CPP are multifunctional peptides that regulate two aspects of our bodies that are fundamental to our health: autophagy and microbiota. While the multifunctional nature of these peptides is now clear, we are still in the early stages of developing computational strategies to assist experimentalists in identifying selective multifunctional AMP/CPP to control nonhealthy conditions. For instance, both AMP and CPP are computationally characterized as amphipathic and cationic, yet none of these features are relevant to differentiate these peptides from non-AMP or non-CPP. The present review aims to highlight current knowledge that may facilitate the development of AMP design tools for preventing or treating illness.
30
Agüero-Chapin G, Galpert-Cañizares D, Domínguez-Pérez D, Marrero-Ponce Y, Pérez-Machado G, Teijeira M, Antunes A. Emerging Computational Approaches for Antimicrobial Peptide Discovery. Antibiotics (Basel) 2022; 11:antibiotics11070936. [PMID: 35884190 PMCID: PMC9311958 DOI: 10.3390/antibiotics11070936] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Received: 06/14/2022] [Revised: 07/01/2022] [Accepted: 07/08/2022] [Indexed: 02/05/2023]
Abstract
In the last two decades, many reports have addressed the application of artificial intelligence (AI) in the search and design of antimicrobial peptides (AMPs). AI has been represented by machine learning (ML) algorithms that use sequence-based features for the discovery of new peptidic scaffolds with promising biological activity. From an AI perspective, evolutionary algorithms have also been applied to the rational generation of peptide libraries aimed at the optimization/design of AMPs. However, the literature has scarcely addressed other emerging non-conventional in silico approaches for the search/design of such bioactive peptides. Thus, the first motivation here is to bring up some non-standard peptide features that have been used to build classical ML predictive models. Secondly, it is valuable to highlight emerging ML algorithms and alternative computational tools to predict/design AMPs as well as to explore their chemical space. Another point worthy of mention is the recent application of evolutionary algorithms that actually simulate sequence evolution to both the generation of diversity-oriented peptide libraries and the optimization of hit peptides. Last but not least, we include some new considerations in proteogenomic analyses currently incorporated into the computational workflow for unravelling AMPs in natural sources.
Affiliation(s)
- Guillermin Agüero-Chapin
  - CIIMAR - Centro Interdisciplinar de Investigação Marinha e Ambiental, Universidade do Porto, Terminal de Cruzeiros do Porto de Leixões, Av. General Norton de Matos, s/n, 4450-208 Porto, Portugal
  - Departamento de Biologia, Faculdade de Ciências, Universidade do Porto, Rua do Campo Alegre, 4169-007 Porto, Portugal
  - Correspondence: (G.A.-C.); (A.A.); Tel.: +351-22-340-1813 (G.A.-C. & A.A.)
- Deborah Galpert-Cañizares
  - Departamento de Ciencia de la Computación, Universidad Central Marta Abreu de Las Villas (UCLV), Santa Clara 54830, Cuba
- Dany Domínguez-Pérez
  - CIIMAR - Centro Interdisciplinar de Investigação Marinha e Ambiental, Universidade do Porto, Terminal de Cruzeiros do Porto de Leixões, Av. General Norton de Matos, s/n, 4450-208 Porto, Portugal
  - Proquinorte, Unipessoal, Lda, Avenida 5 de Outubro, 124, 7º Piso, Avenidas Novas, 1050-061 Lisboa, Portugal
- Yovani Marrero-Ponce
  - Universidad San Francisco de Quito (USFQ), Grupo de Medicina Molecular y Translacional (MeM&T), Colegio de Ciencias de la Salud (COCSA), Escuela de Medicina, Edificio de Especialidades Médicas and Instituto de Simulación Computacional (ISC-USFQ), Diego de Robles y vía Interoceánica, Quito 170157, Ecuador
- Gisselle Pérez-Machado
  - EpiDisease S.L - Spin-Off of Centro de Investigación Biomédica en Red de Enfermedades Raras (CIBERER), 46980 Valencia, Spain
- Marta Teijeira
  - Departamento de Química Orgánica, Facultade de Química, Universidade de Vigo, 36310 Vigo, Spain
  - Instituto de Investigación Sanitaria Galicia Sur, Hospital Álvaro Cunqueiro, 36213 Vigo, Spain
- Agostinho Antunes
  - CIIMAR - Centro Interdisciplinar de Investigação Marinha e Ambiental, Universidade do Porto, Terminal de Cruzeiros do Porto de Leixões, Av. General Norton de Matos, s/n, 4450-208 Porto, Portugal
  - Departamento de Biologia, Faculdade de Ciências, Universidade do Porto, Rua do Campo Alegre, 4169-007 Porto, Portugal
  - Correspondence: (G.A.-C.); (A.A.); Tel.: +351-22-340-1813 (G.A.-C. & A.A.)
31
A Novel Network Science and Similarity-Searching-Based Approach for Discovering Potential Tumor-Homing Peptides from Antimicrobials. Antibiotics (Basel) 2022; 11:antibiotics11030401. [PMID: 35326864 PMCID: PMC8944733 DOI: 10.3390/antibiotics11030401] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 02/01/2022] [Revised: 03/13/2022] [Accepted: 03/15/2022] [Indexed: 02/01/2023]
Abstract
Peptide-based drugs are promising anticancer candidates due to their biocompatibility and low toxicity. In particular, tumor-homing peptides (THPs) have the ability to bind specifically to cancer cell receptors and tumor vasculature. Despite their potential for developing antitumor drugs, there are few available prediction tools to assist the discovery of new THPs. Three webservers based on machine learning models are currently active: TumorHPD, THPep and, more recently, SCMTHP. Herein, a novel method based on network science and similarity searching implemented in the starPep toolbox is presented for THP discovery. The approach explores the structural space of THPs with Chemical Space Networks (CSNs) and applies centrality measures to identify the most relevant and non-redundant THP sequences within the CSN. Such THPs were considered as queries (Qs) for multi-query similarity searches that apply a group fusion (MAX-SIM rule) model. The resulting multi-query similarity searching models (SSMs) were validated with three benchmarking datasets of THPs/non-THPs. The predictions achieved accuracies that ranged from 92.64 to 99.18% and Matthews Correlation Coefficients between 0.894 and 0.98, outperforming state-of-the-art predictors. The best model was applied to repurpose AMPs from the starPep database as THPs, which were subsequently optimized for the TH activity. Finally, 54 promising THP leads were discovered, and their sequences were analyzed to find novel motifs. These results demonstrate the potential of CSNs and multi-query similarity searching for the rapid and accurate identification of THPs.
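The group fusion step mentioned in this abstract can be illustrated with a minimal sketch. It assumes a generic pairwise similarity (Python's `difflib.SequenceMatcher` ratio is used here as a stand-in, not the similarity measure of the starPep toolbox), a hypothetical cutoff of 0.6, and made-up query strings; none of these come from the cited paper.

```python
from difflib import SequenceMatcher
from typing import List


def similarity(a: str, b: str) -> float:
    """Stand-in pairwise similarity in [0, 1] (an assumption, not starPep's measure)."""
    return SequenceMatcher(None, a, b).ratio()


def max_sim_score(candidate: str, queries: List[str]) -> float:
    """Group fusion by the MAX-SIM rule: keep the best similarity over all queries."""
    return max(similarity(candidate, q) for q in queries)


def is_predicted_hit(candidate: str, queries: List[str], cutoff: float = 0.6) -> bool:
    """Flag the candidate as a hit when its fused score reaches the (hypothetical) cutoff."""
    return max_sim_score(candidate, queries) >= cutoff


# Made-up query strings for illustration only; they are not real THP sequences.
QUERIES = ["ACDEFGHIK", "KKLLKKLL", "GGSGGS"]

if __name__ == "__main__":
    for peptide in ["KKLLKKLL", "KKLLKKLA", "WWWWWWWW"]:
        print(peptide, round(max_sim_score(peptide, QUERIES), 3),
              is_predicted_hit(peptide, QUERIES))
```

Taking the maximum rather than the mean means a candidate only needs to resemble one query closely, which is why the choice of a small, non-redundant query set matters for this kind of search.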
32
Singh V, Shrivastava S, Kumar Singh S, Kumar A, Saxena S. Accelerating the discovery of antifungal peptides using deep temporal convolutional networks. Brief Bioinform 2022; 23:6526725. [DOI: 10.1093/bib/bbac008] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 11/01/2021] [Revised: 12/27/2021] [Accepted: 01/06/2022] [Indexed: 02/02/2023]
Abstract
The application of machine intelligence in biological sciences has led to the development of several automated tools, thus enabling rapid drug discovery. Adding to this development is the ongoing COVID-19 pandemic, due to which researchers working in the field of artificial intelligence have acquired an active interest in finding machine learning-guided solutions for diseases like mucormycosis, which has emerged as an important post-COVID-19 fungal complication, especially in immunocompromised patients. Along these lines, we have proposed a temporal convolutional network-based binary classification approach to discover new antifungal molecules in the proteome of plants and animals to accelerate the development of antifungal medications. Although these biomolecules, known as antifungal peptides (AFPs), are part of an organism’s intrinsic host defense mechanism, their identification and discovery by traditional biochemical procedures is arduous. The absence of a large dataset on AFPs is also a considerable impediment to building a robust automated classifier. To this end, we have employed the transfer learning technique to pre-train our model on antibacterial peptides. Subsequently, we have built a classifier that predicts AFPs with accuracy and precision of 94%. Our classifier outperforms several state-of-the-art models by a considerable margin. The results of its performance were shown to be statistically significant using the Kruskal–Wallis H test, followed by a post hoc analysis performed using the Tukey honestly significant difference (HSD) test. Furthermore, we identified potent AFPs in representative animal (Histatin) and plant (Snakin) proteins using our model. We also built and deployed a web app that is freely available at https://tcn-afppred.anvil.app/ for the identification of AFPs in protein sequences.
Affiliation(s)
- Vishakha Singh
  - Department of Computer Science and Engineering, Indian Institute of Technology (BHU), Varanasi, 221005, Uttar Pradesh, India
- Sameer Shrivastava
  - Division of Veterinary Biotechnology, ICAR-Indian Veterinary Research Institute, Izatnagar, 243122, Uttar Pradesh, India
- Sanjay Kumar Singh
  - Department of Computer Science and Engineering, Indian Institute of Technology (BHU), Varanasi, 221005, Uttar Pradesh, India
- Abhinav Kumar
  - Department of Computer Science and Engineering, Indian Institute of Technology (BHU), Varanasi, 221005, Uttar Pradesh, India
- Sonal Saxena
  - Division of Veterinary Biotechnology, ICAR-Indian Veterinary Research Institute, Izatnagar, 243122, Uttar Pradesh, India
33
Vidal-Limon A, Aguilar-Toalá JE, Liceaga AM. Integration of Molecular Docking Analysis and Molecular Dynamics Simulations for Studying Food Proteins and Bioactive Peptides. Journal of Agricultural and Food Chemistry 2022; 70:934-943. [PMID: 34990125 DOI: 10.1021/acs.jafc.1c06110] [Citation(s) in RCA: 175] [Impact Index Per Article: 58.3] [Indexed: 06/14/2023]
Abstract
In silico tools, such as molecular docking, are widely applied to study interactions and binding affinity of biological activity of proteins and peptides. However, restricted sampling of both ligand and receptor conformations and use of approximated scoring functions can produce results that do not correlate with actual experimental binding affinities. Molecular dynamics simulations (MDS) can provide valuable information in deciphering functional mechanisms of proteins/peptides and other biomolecules, overcoming the rigid sampling limitations in docking analysis. This review will discuss the information related to the traditional use of in silico models, such as molecular docking, and its application for studying food proteins and bioactive peptides, followed by an in-depth introduction to the theory of MDS and description of why these molecular simulation techniques are important in the theoretical prediction of structural and functional dynamics of food proteins and bioactive peptides. Applications, limitations, and future prospects of MDS will also be discussed.
Affiliation(s)
- Abraham Vidal-Limon
  - Red de Estudios Moleculares Avanzados, Clúster Científico y Tecnológico BioMimic, Instituto de Ecología A.C. (INECOL), Carretera Antigua a Coatepec 351, El Haya, Xalapa, Veracruz 91073, Mexico
- José E Aguilar-Toalá
  - Departamento de Ciencias de la Alimentación, División de Ciencias Biológicas y de la Salud, Universidad Autónoma Metropolitana Unidad Lerma, Avenida de las Garzas 10, Colonia El Panteón, Lerma de Villada, Estado de México 52005, Mexico
- Andrea M Liceaga
  - Protein Chemistry and Bioactive Peptides Laboratory, Department of Food Science, Purdue University, 745 Agriculture Mall Drive, West Lafayette, Indiana 47907, United States
34
Singh V, Shrivastava S, Kumar Singh S, Kumar A, Saxena S. StaBle-ABPpred: a stacked ensemble predictor based on biLSTM and attention mechanism for accelerated discovery of antibacterial peptides. Brief Bioinform 2021; 23:6423526. [PMID: 34750606 DOI: 10.1093/bib/bbab439] [Citation(s) in RCA: 25] [Impact Index Per Article: 6.3] [Received: 08/16/2021] [Revised: 09/06/2021] [Accepted: 09/24/2021] [Indexed: 01/29/2023]
Abstract
Due to the rapid emergence of multi-drug resistant (MDR) bacteria, existing antibiotics are becoming ineffective. Researchers are therefore looking for alternatives in the form of antibacterial peptide (ABP)-based medicines. The discovery of novel ABPs using wet-lab experiments is time-consuming and expensive. Many machine learning models have been proposed to search for new ABPs, but there is still scope to develop a robust model that has high accuracy and precision. In this work, we present StaBle-ABPpred, a stacked ensemble technique-based deep learning classifier that uses bidirectional long short-term memory (biLSTM) and an attention mechanism at base-level and an ensemble of random forest, gradient boosting and logistic regression at meta-level to classify peptides as antibacterial or otherwise. The performance of our model has been compared with several state-of-the-art classifiers, and the results were subjected to an analysis of variance (ANOVA) test and its post hoc analysis, which show that our model performs better than existing classifiers. Furthermore, a web app has been developed and deployed at https://stable-abppred.anvil.app to identify novel ABPs in protein sequences. Using this app, we identified novel ABPs in all the proteins of the Streptococcus phage T12 genome. These ABPs have shown amino acid similarities with experimentally tested antimicrobial peptides (AMPs) of other organisms. Hence, they could be chemically synthesized and experimentally validated for their activity against different bacteria. The model and app developed in this work can be further utilized to explore the protein diversity for identifying novel ABPs with broad-spectrum activity, especially against MDR bacterial pathogens.
Affiliation(s)
- Vishakha Singh
  - Department of Computer Science and Engineering, Indian Institute of Technology (BHU), Varanasi, 221005, Uttar Pradesh, India
- Sameer Shrivastava
  - Division of Veterinary Biotechnology, ICAR-Indian Veterinary Research Institute, Izatnagar, 243122, Uttar Pradesh, India
- Sanjay Kumar Singh
  - Department of Computer Science and Engineering, Indian Institute of Technology (BHU), Varanasi, 221005, Uttar Pradesh, India
- Abhinav Kumar
  - Department of Computer Science and Engineering, Indian Institute of Technology (BHU), Varanasi, 221005, Uttar Pradesh, India
- Sonal Saxena
  - Division of Veterinary Biotechnology, ICAR-Indian Veterinary Research Institute, Izatnagar, 243122, Uttar Pradesh, India
35
Sharma R, Shrivastava S, Kumar Singh S, Kumar A, Saxena S, Kumar Singh R. Deep-AFPpred: identifying novel antifungal peptides using pretrained embeddings from seq2vec with 1DCNN-BiLSTM. Brief Bioinform 2021; 23:6404058. [PMID: 34670278 DOI: 10.1093/bib/bbab422] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Received: 08/06/2021] [Revised: 09/03/2021] [Accepted: 09/12/2021] [Indexed: 01/04/2023]
Abstract
Fungal infections, or mycoses, cause a wide range of diseases in humans and animals. The incidence of community-acquired and nosocomial fungal infections has increased dramatically since the emergence of the COVID-19 pandemic. The increase in the number of patients with immunodeficiency- or immunosuppression-related diseases, resistance to existing antifungal compounds and the limited availability of therapeutic options have triggered the search for alternative antifungal molecules. In this direction, antifungal peptides (AFPs) have received a lot of interest as an alternative to currently available antifungal drugs. Although AFPs are produced by a diverse population of living organisms, identifying effective AFPs from natural sources is time-consuming and expensive. Therefore, there is a need to develop a robust in silico model capable of identifying novel AFPs in protein sequences. In this paper, we propose Deep-AFPpred, a deep learning classifier that can identify AFPs in protein sequences. We developed Deep-AFPpred using the concept of transfer learning with the 1DCNN-BiLSTM deep learning algorithm. The findings reveal that Deep-AFPpred beats other state-of-the-art AFP classifiers by a wide margin and achieves approximately 96% and 94% precision on validation and test data, respectively. Based on the proposed approach, an online prediction server was created and made publicly available at https://afppred.anvil.app/. Using this server, one can identify novel AFPs in protein sequences, and the results are provided as a report that includes predicted peptides, their physicochemical properties and motifs. By utilizing this model, we identified AFPs in different proteins, which can be chemically synthesized in the lab and experimentally validated for their antifungal activity.
Affiliation(s)
- Ritesh Sharma
  - Department of Computer Science and Engineering, Indian Institute of Technology (BHU), Varanasi, 221005, Uttar Pradesh, India
- Sameer Shrivastava
  - Division of Veterinary Biotechnology, ICAR-Indian Veterinary Research Institute, Izatnagar, 243122, Uttar Pradesh, India
- Sanjay Kumar Singh
  - Department of Computer Science and Engineering, Indian Institute of Technology (BHU), Varanasi, 221005, Uttar Pradesh, India
- Abhinav Kumar
  - Department of Computer Science and Engineering, Indian Institute of Technology (BHU), Varanasi, 221005, Uttar Pradesh, India
- Sonal Saxena
  - Division of Veterinary Biotechnology, ICAR-Indian Veterinary Research Institute, Izatnagar, 243122, Uttar Pradesh, India
- Raj Kumar Singh
  - Former Director & Vice Chancellor, ICAR-Indian Veterinary Research Institute, Izatnagar, 243122, Uttar Pradesh, India
36
Sharma R, Shrivastava S, Kumar Singh S, Kumar A, Saxena S, Kumar Singh R. AniAMPpred: artificial intelligence guided discovery of novel antimicrobial peptides in animal kingdom. Brief Bioinform 2021; 22:6320952. [PMID: 34259329 DOI: 10.1093/bib/bbab242] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Received: 04/24/2021] [Revised: 06/02/2021] [Accepted: 06/21/2021] [Indexed: 12/12/2022]
Abstract
With advancements in genomics, there has been a substantial reduction in the cost and time of genome sequencing, which has resulted in a large amount of data in genome databases. Antimicrobial host defense proteins provide protection against invading microbes, but confirming the antimicrobial function of host proteins by wet-lab experiments is expensive and time-consuming. Therefore, there is a need to develop an in silico tool to identify the antimicrobial function of proteins. In the current study, we developed a model, AniAMPpred, by considering all the available antimicrobial peptides (AMPs) with lengths in the range $[10, 200]$ from the animal kingdom. The model utilizes a support vector machine algorithm with deep learning-based features and identifies probable antimicrobial proteins (PAPs) in the genome of animals. The results show that our proposed model outperforms other state-of-the-art classifiers, has very high confidence in its predictions, is not biased and can classify both AMPs and non-AMPs over a diverse range of peptide lengths with high accuracy. By utilizing AniAMPpred, we identified 436 PAPs in the genome of Helobdella robusta. To further confirm the functional activity of PAPs, we performed BLAST analysis against known AMPs. On detailed analysis of five selected PAPs, we could observe their similarity with antimicrobial proteins of several animal species. Thus, our proposed model can help researchers identify PAPs in the genome of animals and provide insight into the functional identity of different proteins. An online prediction server was also developed based on the proposed approach and is freely accessible at https://aniamppred.anvil.app/.
Affiliation(s)
- Ritesh Sharma
  - Department of Computer Science and Engineering, Indian Institute of Technology (BHU), Varanasi, 221005, Uttar Pradesh, India
- Sameer Shrivastava
  - Division of Veterinary Biotechnology, ICAR-Indian Veterinary Research Institute, Izatnagar, 243122, Uttar Pradesh, India
- Sanjay Kumar Singh
  - Department of Computer Science and Engineering, Indian Institute of Technology (BHU), Varanasi, 221005, Uttar Pradesh, India
- Abhinav Kumar
  - Department of Computer Science and Engineering, Indian Institute of Technology (BHU), Varanasi, 221005, Uttar Pradesh, India
- Sonal Saxena
  - Division of Veterinary Biotechnology, ICAR-Indian Veterinary Research Institute, Izatnagar, 243122, Uttar Pradesh, India
- Raj Kumar Singh
  - Former Director & Vice Chancellor, ICAR-Indian Veterinary Research Institute, Izatnagar, 243122, Uttar Pradesh, India
37
Pinacho-Castellanos SA, García-Jacas CR, Gilson MK, Brizuela CA. Alignment-Free Antimicrobial Peptide Predictors: Improving Performance by a Thorough Analysis of the Largest Available Data Set. J Chem Inf Model 2021; 61:3141-3157. [PMID: 34081438 DOI: 10.1021/acs.jcim.1c00251] [Citation(s) in RCA: 35] [Impact Index Per Article: 8.8] [Indexed: 01/13/2023]
Abstract
In the last two decades, a large number of machine-learning-based predictors for the activities of antimicrobial peptides (AMPs) have been proposed. These predictors differ from one another in the learning method and in the training and testing data sets used. Unfortunately, the training data sets present several drawbacks, such as a low representativeness regarding the experimentally validated AMP space, and duplicated peptide sequences between negative and positive data sets. These limitations lower the confidence with which most of the approaches can be used in prospective studies. To address these weaknesses, we propose novel modeling and assessing data sets from the largest experimentally validated nonredundant peptide data set reported to date. From these novel data sets, alignment-free quantitative sequence-activity models (AF-QSAMs) based on Random Forest are created to identify general AMPs and their antibacterial, antifungal, antiparasitic, and antiviral functional types. An applicability domain analysis is carried out to determine the reliability of the predictions obtained, which, to the best of our knowledge, is performed for the first time for AMP recognition. A benchmarking is undertaken between the models proposed and several models from the literature that are freely available in 13 programs (ClassAMP, iAMP-2L, ADAM, MLAMP, AMPScanner v2.0, AntiFP, AMPfun, PEPred-suite, AxPEP, CAMPR3, iAMPpred, APIN, and Meta-iAVP). The models proposed are those with the best performance in all of the endpoints modeled, while most of the methods from the literature have weak-to-random predictive agreements. The models proposed are also assessed through Y-scrambling and repeated k-fold cross-validation tests, demonstrating that the outcomes obtained by them are not given by chance. Three chemometric analyses also confirmed the relevance of the peptide descriptors used in the modeling. Therefore, it can be concluded that the models built by fixing the drawbacks existing in the literature contribute to identifying antibacterial, antifungal, antiparasitic, and antiviral peptides with high effectiveness and reliability. Models are freely available via the AMPDiscover tool at https://biocom-ampdiscover.cicese.mx/.
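The Y-scrambling check named in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions: a one-feature threshold rule stands in for the Random Forest models of the cited paper, the score is plain training accuracy rather than a cross-validated one, and the descriptor values, labels, round count and seed are all made up.

```python
import random
from typing import List, Tuple


def fit_threshold(xs: List[float], ys: List[int]) -> Tuple[float, float]:
    """Fit the rule 'predict 1 if x >= t', choosing t from the data; return (t, accuracy)."""
    best_t, best_acc = xs[0], -1.0
    for t in sorted(set(xs)):
        acc = sum((x >= t) == bool(y) for x, y in zip(xs, ys)) / len(xs)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t, best_acc


def y_scrambling(xs: List[float], ys: List[int], n_rounds: int = 200, seed: int = 0):
    """Refit the model on shuffled labels and compare against the fit on the real labels."""
    rng = random.Random(seed)
    _, true_acc = fit_threshold(xs, ys)
    scrambled_accs = []
    for _ in range(n_rounds):
        shuffled = ys[:]          # copy so the real labels stay untouched
        rng.shuffle(shuffled)     # break any link between descriptors and labels
        scrambled_accs.append(fit_threshold(xs, shuffled)[1])
    # Fraction of scrambled runs that match or beat the real model.
    fraction_as_good = sum(a >= true_acc for a in scrambled_accs) / n_rounds
    return true_acc, scrambled_accs, fraction_as_good


# Toy descriptor values and labels (hypothetical, perfectly separable on purpose).
XS = [0.1, 0.2, 0.3, 0.4, 0.5, 1.1, 1.2, 1.3, 1.4, 1.5]
YS = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

if __name__ == "__main__":
    true_acc, accs, frac = y_scrambling(XS, YS)
    print("accuracy with real labels:", true_acc)
    print("mean accuracy with scrambled labels:", sum(accs) / len(accs))
    print("fraction of scrambled runs at least as good:", frac)
```

If scrambled-label models scored about as well as the real one, the original fit would likely reflect chance correlation rather than a real descriptor-activity relationship.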
Affiliation(s)
- Sergio A Pinacho-Castellanos
  - Departamento de Ciencias de la Computación, Centro de Investigación Científica y de Educación Superior de Ensenada (CICESE), 22860 Ensenada, Baja California, México
  - Centro de Investigación y Desarrollo de Tecnología Digital (CITEDI), Instituto Politécnico Nacional (IPN), 22435 Tijuana, Baja California, México
- César R García-Jacas
  - Cátedras CONACYT-Departamento de Ciencias de la Computación, Centro de Investigación Científica y de Educación Superior de Ensenada (CICESE), 22860 Ensenada, Baja California, México
- Michael K Gilson
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, California 92093, United States
- Carlos A Brizuela
  - Departamento de Ciencias de la Computación, Centro de Investigación Científica y de Educación Superior de Ensenada (CICESE), 22860 Ensenada, Baja California, México
38
Zhang Y, Lin J, Zhao L, Zeng X, Liu X. A novel antibacterial peptide recognition algorithm based on BERT. Brief Bioinform 2021; 22:6284370. [PMID: 34037687 DOI: 10.1093/bib/bbab200] [Citation(s) in RCA: 51] [Impact Index Per Article: 12.8] [Received: 12/08/2020] [Revised: 04/19/2021] [Accepted: 05/03/2021] [Indexed: 12/31/2022]
Abstract
As the best substitute for antibiotics, antimicrobial peptides (AMPs) have important research significance. Due to the high cost and difficulty of experimental methods for identifying AMPs, more and more research is focused on using computational methods to solve this problem. Most of the existing computational methods can identify AMPs from the sequence itself, but there is still room for improvement in recognition accuracy, and the constructed models do not generalize across datasets. The pre-training strategy has been applied to many tasks in natural language processing (NLP) and has achieved gratifying results. It also has great application prospects in the field of AMP recognition and prediction. In this paper, we apply the pre-training strategy to the model training of AMP classifiers and propose a novel recognition algorithm. Our model is constructed based on the BERT model, pre-trained with the protein data from UniProt, and then fine-tuned and evaluated on six AMP datasets with large differences. Our model is superior to the existing methods and achieves the goal of accurate identification on datasets with small sample sizes. We try different word segmentation methods for peptide chains and demonstrate the influence of pre-training steps and balancing datasets on the recognition effect. We find that pre-training on a large number of diverse AMP data, followed by fine-tuning on new data, is beneficial for capturing both the new data's specific features and the common features between AMP sequences. Finally, we construct a new AMP dataset, on which we train a general AMP recognition model.
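To make the idea of "word segmentation for peptide chains" concrete, here is a minimal sketch of k-mer segmentation, one common way to turn a peptide string into tokens for a language model. The function name, its parameters and the example string are illustrative assumptions; the abstract does not say which segmentation schemes the authors compared.

```python
from typing import List


def segment(sequence: str, k: int = 1, overlap: bool = True) -> List[str]:
    """Split a peptide string into k-mer 'words'.

    overlap=True slides the window one residue at a time;
    overlap=False jumps k residues and drops any trailing remainder shorter than k.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    step = 1 if overlap else k
    return [sequence[i:i + k] for i in range(0, len(sequence) - k + 1, step)]


if __name__ == "__main__":
    peptide = "ACDKKLLG"  # made-up sequence for illustration
    print(segment(peptide, k=1))                 # one token per residue
    print(segment(peptide, k=2))                 # overlapping dipeptides
    print(segment(peptide, k=2, overlap=False))  # non-overlapping dipeptides
```

Overlapping k-mers keep every local context at the cost of longer token sequences, while non-overlapping k-mers give shorter inputs but depend on where the first window starts.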
Affiliation(s)
- Yue Zhang
- Xiamen University, Xiamen 361005, China
39
Hashemi ZS, Zarei M, Fath MK, Ganji M, Farahani MS, Afsharnouri F, Pourzardosht N, Khalesi B, Jahangiri A, Rahbar MR, Khalili S. In silico Approaches for the Design and Optimization of Interfering Peptides Against Protein-Protein Interactions. Front Mol Biosci 2021; 8:669431. [PMID: 33996914 PMCID: PMC8113820 DOI: 10.3389/fmolb.2021.669431] [Citation(s) in RCA: 41] [Impact Index Per Article: 10.3] [Received: 02/18/2021] [Accepted: 04/06/2021] [Indexed: 01/01/2023]
Abstract
Large contact surfaces of protein-protein interactions (PPIs) remain an ongoing issue in the discovery and design of small molecule modulators. Peptides can intrinsically explore larger surfaces and are stable and bioavailable; they therefore bear high therapeutic value in the treatment of various diseases, including cancer, infectious diseases, and neurodegenerative diseases. Given these promising properties, considerable progress has been made in targeting PPIs via peptide design strategies. In silico tools have recently become indispensable for the design and optimization of these interfering peptides. Various algorithms have been developed to scrutinize the PPI interfaces. Moreover, different databases and software tools have been created to predict the peptide structures and their interactions with target protein complexes. High-throughput screening of large peptide libraries against PPIs; "hotspot" identification; structure-based and off-structure approaches of peptide design; 3D peptide modeling; peptide optimization strategies like cyclization; and peptide binding energy evaluation are among the capabilities of in silico tools. In the present study, the most recent advances in the field of in silico approaches for the design of interfering peptides against PPIs will be reviewed. The future perspective of the field and its advantages and limitations will also be pinpointed.
Affiliation(s)
- Zahra Sadat Hashemi
  - ATMP Department, Breast Cancer Research Center, Motamed Cancer Institute, Academic Center for Education, Culture and Research, Tehran, Iran
- Mahboubeh Zarei
  - Pharmaceutical Sciences Research Center, Shiraz University of Medical Sciences, Shiraz, Iran
- Mohsen Karami Fath
  - Department of Cellular and Molecular Biology, Faculty of Biological Sciences, Kharazmi University, Tehran, Iran
- Mahmoud Ganji
  - Department of Medical Biotechnology, Faculty of Medical Sciences, Tarbiat Modares University, Tehran, Iran
- Mahboube Shahrabi Farahani
  - Department of Medical Biotechnology, Faculty of Medical Sciences, Tarbiat Modares University, Tehran, Iran
- Fatemeh Afsharnouri
  - Department of Medical Biotechnology, Faculty of Medical Sciences, Tarbiat Modares University, Tehran, Iran
- Navid Pourzardosht
  - Cellular and Molecular Research Center, Faculty of Medicine, Guilan University of Medical Sciences, Rasht, Iran
  - Department of Biochemistry, Guilan University of Medical Sciences, Rasht, Iran
- Bahman Khalesi
  - Department of Research and Production of Poultry Viral Vaccine, Razi Vaccine and Serum Research Institute, Agricultural Research Education and Extension Organization, Karaj, Iran
- Abolfazl Jahangiri
  - Applied Microbiology Research Center, Systems Biology and Poisonings Institute, Baqiyatallah University of Medical Sciences, Tehran, Iran
- Mohammad Reza Rahbar
  - Pharmaceutical Sciences Research Center, Shiraz University of Medical Sciences, Shiraz, Iran
- Saeed Khalili
  - Department of Biology Sciences, Shahid Rajaee Teacher Training University, Tehran, Iran
40
Evolutionary algorithm-based generation of optimum peptide sequences with dengue virus inhibitory activity. Future Med Chem 2021; 13:993-1000. [PMID: 33890502 DOI: 10.4155/fmc-2020-0372] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 11/17/2022]
Abstract
Background: There is currently no effective dengue virus (DENV) therapeutic. We aim to develop a genetic algorithm-based framework for the design of peptides with possible DENV inhibitory activity. Methods & results: A Python-based tool (denominated AutoPepGEN) based on a DENV support vector machine classifier as the objective function was implemented. AutoPepGEN was applied to the design of three- to seven-amino acid sequences and ten peptides were selected. Peptide-protease (DENV) docking and Molecular Mechanics-Generalized Born Surface Area calculations were performed for the selected sequences and favorable binding energies were observed. Conclusion: It is hoped that AutoPepGEN will serve as an in silico alternative to the experimental design of positional scanning combinatorial libraries, known to be prone to a combinatorial explosion. AutoPepGEN is available at: https://github.com/sjbarigye/AutoPepGEN.
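The abstract describes a genetic algorithm whose objective function is a trained classifier, applied to sequences of three to seven amino acids. The sketch below illustrates that general loop only; it is not AutoPepGEN's code. `toy_objective` is a hypothetical stand-in for the support vector machine score, and all function names, operators, and parameters are assumptions.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
MIN_LEN, MAX_LEN = 3, 7  # sequence lengths mentioned in the abstract


def toy_objective(peptide: str) -> float:
    """Hypothetical stand-in for a classifier score (fraction of K/R/W)."""
    return sum(peptide.count(a) for a in "KRW") / len(peptide)


def random_peptide(rng: random.Random) -> str:
    length = rng.randint(MIN_LEN, MAX_LEN)
    return "".join(rng.choice(AMINO_ACIDS) for _ in range(length))


def mutate(peptide: str, rng: random.Random) -> str:
    """Replace one randomly chosen residue."""
    pos = rng.randrange(len(peptide))
    return peptide[:pos] + rng.choice(AMINO_ACIDS) + peptide[pos + 1:]


def crossover(a: str, b: str, rng: random.Random) -> str:
    """One-point crossover, then clamp the child to the allowed lengths."""
    child = a[:rng.randint(1, len(a) - 1)] + b[rng.randint(1, len(b) - 1):]
    child = child[:MAX_LEN]
    while len(child) < MIN_LEN:
        child += rng.choice(AMINO_ACIDS)
    return child


def evolve(objective, pop_size=30, generations=40, elite=2, seed=0):
    """Return the final population, sorted from best to worst score."""
    rng = random.Random(seed)
    population = [random_peptide(rng) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=objective, reverse=True)
        parents = population[: pop_size // 2]   # truncation selection
        next_population = population[:elite]    # elitism keeps the best
        while len(next_population) < pop_size:
            a, b = rng.sample(parents, 2)
            child = crossover(a, b, rng)
            if rng.random() < 0.3:               # assumed mutation rate
                child = mutate(child, rng)
            next_population.append(child)
        population = next_population
    return sorted(population, key=objective, reverse=True)


if __name__ == "__main__":
    for peptide in evolve(toy_objective, seed=1)[:10]:
        print(peptide, round(toy_objective(peptide), 2))
```

In a real setting the objective would wrap a fitted classifier's decision score for a featurized peptide; the toy score here only keeps the example self-contained.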
41
Sharma R, Shrivastava S, Kumar Singh S, Kumar A, Saxena S, Kumar Singh R. Deep-ABPpred: identifying antibacterial peptides in protein sequences using bidirectional LSTM with word2vec. Brief Bioinform 2021; 22:6204762. [PMID: 33784381 DOI: 10.1093/bib/bbab065] [Citation(s) in RCA: 60] [Impact Index Per Article: 15.0] [Received: 11/23/2020] [Revised: 02/04/2021] [Indexed: 12/13/2022]
Abstract
The overuse of antibiotics has led to the emergence of antimicrobial resistance, and as a result, antibacterial peptides (ABPs) are receiving significant attention as an alternative. Identifying effective ABPs from natural sources in the lab is a cost-intensive and time-consuming process. Therefore, there is a need for in silico models that can identify novel ABPs in protein sequences for chemical synthesis and testing. In this study, we propose a deep learning classifier named Deep-ABPpred that can identify ABPs in protein sequences. We developed Deep-ABPpred using a bidirectional long short-term memory algorithm with amino acid level features from word2vec. The results show that Deep-ABPpred outperforms other state-of-the-art ABP classifiers on both test and independent datasets. Our proposed model achieved a precision of approximately 97% and 94% on the test dataset and the independent dataset, respectively. The high precision suggests the applicability of Deep-ABPpred in proposing novel ABPs for synthesis and experimentation. Using Deep-ABPpred, we identified ABPs in the tail protein sequences of Streptococcus bacteriophages, chemically synthesized the identified peptides in the lab, and tested their activity in vitro. These ABPs showed potent antibacterial activity against selected Gram-positive and Gram-negative bacteria, which confirms the capability of Deep-ABPpred in identifying novel ABPs in protein sequences. Based on the proposed approach, an online prediction server was also developed, which is freely accessible at https://abppred.anvil.app/. This web server takes a protein sequence as input and provides ABPs with high probability (>0.95) as output.
Affiliation(s)
- Ritesh Sharma
  - Department of Computer Science and Engineering at IIT (BHU), Varanasi, India
- Sanjay Kumar Singh
  - Department of Computer Science and Engineering at IIT (BHU), Varanasi, India
- Abhinav Kumar
  - Department of Computer Science and Engineering at IIT (BHU), Varanasi, India
- Sonal Saxena
  - Division of Veterinary Biotechnology, IVRI, Izatnagar, India
42
Shotgun Proteomics and Protein-Based Bioinformatics for the Characterization of Food-Derived Bioactive Peptides. Methods Mol Biol 2021. [PMID: 33687718 DOI: 10.1007/978-1-0716-1178-4_14] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 09/13/2024]
Abstract
A workflow for the characterization of food-derived bioactive peptides is described in this chapter. The workflow integrates two consecutive steps: a discovery phase and a protein-based bioinformatic phase. In the first step (discovery phase), a shotgun bottom-up proteomics approach is used to create a reference data set for a selected food proteome. Afterward, in a second step (bioinformatic phase), the reference proteome is subjected to several in silico protein-based bioinformatic analyses to predict and characterize potential bioactive peptides after an in silico human gastrointestinal digestion. Using this workflow, bioactive collagen peptides, antihypertensive, antimicrobial, and antitumor peptides were predicted as potential valuable bioactive peptides from seafood and marine by-products. It is concluded that the combination of the global shotgun proteomic analysis and the analysis by protein-based bioinformatics can provide a rapid strategy for the characterization of new potential food-derived bioactive peptides.
43
Pirtskhalava M, Amstrong AA, Grigolava M, Chubinidze M, Alimbarashvili E, Vishnepolsky B, Gabrielian A, Rosenthal A, Hurt DE, Tartakovsky M. DBAASP v3: database of antimicrobial/cytotoxic activity and structure of peptides as a resource for development of new therapeutics. Nucleic Acids Res 2021; 49:D288-D297. [PMID: 33151284 PMCID: PMC7778994 DOI: 10.1093/nar/gkaa991] [Citation(s) in RCA: 307] [Impact Index Per Article: 76.8] [Received: 09/11/2020] [Revised: 10/09/2020] [Accepted: 10/14/2020] [Indexed: 12/30/2022]
Abstract
The Database of Antimicrobial Activity and Structure of Peptides (DBAASP) is an open-access, comprehensive database containing information on amino acid sequences, chemical modifications, 3D structures, bioactivities and toxicities of peptides that possess antimicrobial properties. DBAASP is updated continuously, and at present, version 3.0 (DBAASP v3) contains >15 700 entries (8000 more than the previous version), including >14 500 monomers and nearly 400 homo- and hetero-multimers. Of the monomeric antimicrobial peptides (AMPs), >12 000 are synthetic, about 2700 are ribosomally synthesized, and about 170 are non-ribosomally synthesized. Approximately 3/4 of the entries were added after the initial release of the database in 2014 reflecting the recent sharp increase in interest in AMPs. Despite the increased interest, adoption of peptide antimicrobials in clinical practice is still limited as a consequence of several factors including side effects, problems with bioavailability and high production costs. To assist in developing and optimizing de novo peptides with desired biological activities, DBAASP offers several tools including a sophisticated multifactor analysis of relevant physicochemical properties. Furthermore, DBAASP has implemented a structure modelling pipeline that automates the setup, execution and upload of molecular dynamics (MD) simulations of database peptides. At present, >3200 peptides have been populated with MD trajectories and related analyses that are both viewable within the web browser and available for download. More than 400 DBAASP entries also have links to experimentally determined structures in the Protein Data Bank. DBAASP v3 is freely accessible at http://dbaasp.org.
Affiliation(s)
- Malak Pirtskhalava
  - Ivane Beritashvili Center of Experimental Biomedicine, Tbilisi 0160, Georgia
- Anthony A Amstrong
  - Office of Cyber Infrastructure and Computational Biology, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD 20892, USA
- Maia Grigolava
  - Ivane Beritashvili Center of Experimental Biomedicine, Tbilisi 0160, Georgia
- Mindia Chubinidze
  - Ivane Beritashvili Center of Experimental Biomedicine, Tbilisi 0160, Georgia
- Boris Vishnepolsky
  - Ivane Beritashvili Center of Experimental Biomedicine, Tbilisi 0160, Georgia
- Andrei Gabrielian
  - Office of Cyber Infrastructure and Computational Biology, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD 20892, USA
- Alex Rosenthal
  - Office of Cyber Infrastructure and Computational Biology, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD 20892, USA
- Darrell E Hurt
  - Office of Cyber Infrastructure and Computational Biology, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD 20892, USA
- Michael Tartakovsky
  - Office of Cyber Infrastructure and Computational Biology, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD 20892, USA
44
Aguilera-Mendoza L, Marrero-Ponce Y, García-Jacas CR, Chavez E, Beltran JA, Guillen-Ramirez HA, Brizuela CA. Automatic construction of molecular similarity networks for visual graph mining in chemical space of bioactive peptides: an unsupervised learning approach. Sci Rep 2020; 10:18074. [PMID: 33093586 PMCID: PMC7583304 DOI: 10.1038/s41598-020-75029-1] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Received: 04/05/2020] [Accepted: 09/23/2020] [Indexed: 12/15/2022]
Abstract
The increasing interest in bioactive peptides with therapeutic potentials has been reflected in a large variety of biological databases published over the last years. However, the knowledge discovery process from these heterogeneous data sources is a nontrivial task, becoming the essence of our research endeavor. Therefore, we devise a unified data model based on molecular similarity networks for representing a chemical reference space of bioactive peptides, having an implicit knowledge that is currently not explicitly accessed in existing biological databases. Indeed, our main contribution is a novel workflow for the automatic construction of such similarity networks, enabling visual graph mining techniques to uncover new insights from the "ocean" of known bioactive peptides. The workflow presented here relies on the following sequential steps: (i) calculation of molecular descriptors by applying statistical and aggregation operators on amino acid property vectors; (ii) a two-stage unsupervised feature selection method to identify an optimized subset of descriptors using the concepts of entropy and mutual information; (iii) generation of sparse networks where nodes represent bioactive peptides, and edges between two nodes denote their pairwise similarity/distance relationships in the defined descriptor space; and (iv) exploratory analysis using visual inspection in combination with clustering and network science techniques. For practical purposes, the proposed workflow has been implemented in our visual analytics software tool (http://mobiosd-hub.com/starpep/), to assist researchers in extracting useful information from an integrated collection of 45120 bioactive peptides, which is one of the largest and most diverse data in its field. Finally, we illustrate the applicability of the proposed workflow for discovering central nodes in molecular similarity networks that may represent a biologically relevant chemical space known to date.
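Steps (i) and (iii) of the workflow in this abstract can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the authors' tool: it uses only two amino acid properties (the Kyte-Doolittle hydropathy scale and a nominal side-chain charge), skips the feature selection of step (ii), and connects peptides whose Euclidean distance falls under a hypothetical `cutoff`. All function names are made up for the example.

```python
import math
from itertools import combinations

# Kyte-Doolittle hydropathy values per residue.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}
# Nominal side-chain charge (assumption: K/R = +1, D/E = -1, others 0).
CHARGE = {"K": 1.0, "R": 1.0, "D": -1.0, "E": -1.0}


def descriptors(peptide: str) -> tuple[float, float, float, float]:
    """Step (i): aggregate per-residue property vectors into descriptors."""
    hyd = [KYTE_DOOLITTLE[a] for a in peptide]
    chg = [CHARGE.get(a, 0.0) for a in peptide]
    return (sum(hyd) / len(hyd), max(hyd), min(hyd), sum(chg))


def similarity_network(peptides: list[str], cutoff: float):
    """Step (iii): edges join peptides closer than `cutoff` in descriptor space."""
    vectors = {p: descriptors(p) for p in peptides}
    edges = []
    for a, b in combinations(peptides, 2):
        distance = math.dist(vectors[a], vectors[b])
        if distance <= cutoff:
            edges.append((a, b, distance))
    return edges


if __name__ == "__main__":
    for edge in similarity_network(["KKLL", "KLKL", "DDDD"], cutoff=0.5):
        print(edge)
```

Raising the cutoff gives a denser network; the sparse networks described in the abstract correspond to keeping only the closest pairs.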
Affiliation(s)
- Longendri Aguilera-Mendoza
  - Departamento de Ciencias de la Computación, Centro de Investigación Científica y de Educación Superior de Ensenada (CICESE), Baja California, 22860, Mexico
- Yovani Marrero-Ponce
  - Universidad San Francisco de Quito, Grupo de Medicina Molecular y Traslacional (MeM&T), Escuela de Medicina, Colegio de Ciencias de la Salud (COCSA), Av. Interoceánica Km 12 1/2 y Av. Florencia, 17-1200-841, Quito, Ecuador
  - Grupo GINUMED, Corporacion Universitaria Rafael Nuñez, Facultad de Salud, Programa de Medicina, Cartagena, Colombia
  - Unidad de Investigación de Diseño de Fármacos y Conectividad Molecular, Departamento de Química Física, Facultad de Farmacia, Universitat de València, Valencia, Spain
- César R García-Jacas
  - Cátedras Conacyt - Departamento de Ciencias de la Computación, Centro de Investigación Científica y de Educación Superior de Ensenada (CICESE), Ensenada, Baja California, Mexico
- Edgar Chavez
  - Departamento de Ciencias de la Computación, Centro de Investigación Científica y de Educación Superior de Ensenada (CICESE), Baja California, 22860, Mexico
- Jesus A Beltran
  - Department of Informatics, University of California, Irvine, Irvine, CA, USA
- Hugo A Guillen-Ramirez
  - Department of BioMedical Research (DBMR), University of Bern, Bern, 3008, Switzerland
  - Department of Medical Oncology, Inselspital, University Hospital and University of Bern, 3010, Bern, Switzerland
- Carlos A Brizuela
  - Departamento de Ciencias de la Computación, Centro de Investigación Científica y de Educación Superior de Ensenada (CICESE), Baja California, 22860, Mexico
45
Barigye SJ, Gómez-Ganau S, Serrano-Candelas E, Gozalbes R. PeptiDesCalculator: Software for computation of peptide descriptors. Definition, implementation and case studies for 9 bioactivity endpoints. Proteins 2020; 89:174-184. [PMID: 32881068 DOI: 10.1002/prot.26003] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 02/12/2020] [Revised: 08/05/2020] [Accepted: 08/27/2020] [Indexed: 11/09/2022]
Abstract
We present a novel Java-based program named PeptiDesCalculator for computing peptide descriptors. These descriptors include redefinitions of known protein parameters to suit the peptide domain, generalization schemes for the global description of peptide characteristics, and empirical descriptors based on experimental evidence on peptide stability and interaction propensity. The PeptiDesCalculator software provides a user-friendly Graphical User Interface (GUI) and is parallelized to maximize the use of the computational resources available in current workstations. The PeptiDesCalculator indices are employed in modeling 8 peptide bioactivity endpoints, demonstrating satisfactory behavior. Moreover, we compare the performance of a support vector machine (SVM) classifier built using 15 PeptiDesCalculator indices with that of a recently reported deep neural network (DNN) antimicrobial activity classifier, demonstrating comparable test set performance notwithstanding the remarkably lower degrees of freedom of the former. This software will facilitate the development of in silico models for the prediction of peptide properties.
Affiliation(s)
- Stephen J Barigye
  - ProtoQSAR SL, Centro Europeo de Empresas Innovadoras (CEEI), Parque Tecnológico de Valencia, Valencia, Spain
  - MolDrug AI Systems SL, Valencia, Spain
- Sergi Gómez-Ganau
  - ProtoQSAR SL, Centro Europeo de Empresas Innovadoras (CEEI), Parque Tecnológico de Valencia, Valencia, Spain
  - Eurofins Agroscience Services Regulatory Spain SL, Valencia, Spain
- Eva Serrano-Candelas
  - ProtoQSAR SL, Centro Europeo de Empresas Innovadoras (CEEI), Parque Tecnológico de Valencia, Valencia, Spain
- Rafael Gozalbes
  - ProtoQSAR SL, Centro Europeo de Empresas Innovadoras (CEEI), Parque Tecnológico de Valencia, Valencia, Spain
  - MolDrug AI Systems SL, Valencia, Spain
46
Nava Lara RA, Beltrán JA, Brizuela CA, Del Rio G. Relevant Features of Polypharmacologic Human-Target Antimicrobials Discovered by Machine-Learning Techniques. Pharmaceuticals (Basel) 2020; 13:ph13090204. [PMID: 32825532 PMCID: PMC7559829 DOI: 10.3390/ph13090204] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/11/2020] [Revised: 08/07/2020] [Accepted: 08/07/2020] [Indexed: 11/16/2022]
Abstract
Polypharmacologic human-targeted antimicrobials (polyHAM) are potentially useful in the treatment of complex human diseases where the microbiome is important (e.g., diabetes, hypertension). We previously reported a machine-learning approach to identify polyHAM from FDA-approved human-targeted drugs using a heterologous approach (training with peptides and non-peptide compounds). Here we find that polyHAM are more likely to be found among antimicrobials displaying broad-spectrum antibiotic activity and that topological, but not chemical, features are the most informative for classifying this activity. A heterologous machine-learning approach was trained with broad-spectrum antimicrobials and tested with human metabolites; these metabolites were labeled as antimicrobials or non-antimicrobials based on a naïve text-mining approach. Human metabolites are not commonly recognized as antimicrobials, yet they circulate in the human body where microbes are found, and our heterologous model was able to classify those with antimicrobial activity. These results provide the basis for developing applications aimed at designing human diets that purposely alter the proportions of metabolic compounds as a way to control the human microbiome.
Affiliation(s)
- Rodrigo A. Nava Lara
  - Department of Biochemistry and Structural Biology, Instituto de Fisiologia Celular, UNAM, Mexico City 04510, Mexico
- Jesús A. Beltrán
  - Department of Computer Science, CICESE Research Center, Ensenada 22860, Mexico
- Carlos A. Brizuela
  - Department of Computer Science, CICESE Research Center, Ensenada 22860, Mexico
- Gabriel Del Rio
  - Department of Biochemistry and Structural Biology, Instituto de Fisiologia Celular, UNAM, Mexico City 04510, Mexico
47
Carrera M, Ezquerra-Brauer JM, Aubourg SP. Characterization of the Jumbo Squid (Dosidicus gigas) Skin By-Product by Shotgun Proteomics and Protein-Based Bioinformatics. Mar Drugs 2019; 18:md18010031. [PMID: 31905758 PMCID: PMC7024357 DOI: 10.3390/md18010031] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Received: 11/26/2019] [Revised: 12/19/2019] [Accepted: 12/26/2019] [Indexed: 12/16/2022]
Abstract
Jumbo squid (Dosidicus gigas) is one of the largest cephalopods, and represents an important economic fishery in several regions of the Pacific Ocean, from southern California in the United States to southern Chile. Large and considerable discards of this species, such as skin, have been reported to constitute an important source of potential by-products. In this paper, a shotgun proteomics approach was applied for the first time to the characterization of the jumbo squid (Dosidicus gigas) skin proteome. A total of 1004 different peptides belonging to 219 different proteins were identified. The final proteome compilation was investigated by integrated in-silico studies, including gene ontology (GO) term enrichment, pathways, and networks studies. Potential new valuable bioactive peptides such as antimicrobial, bioactive collagen peptides, antihypertensive and antitumoral peptides were predicted to be present in the jumbo squid skin proteome. The integration of the global proteomics results and the bioinformatics analysis of the jumbo squid skin proteome show a comprehensive knowledge of this fishery discard and provide potential bioactive peptides of this marine by-product.
Affiliation(s)
- Mónica Carrera
  - Department of Food Technology, Marine Research Institute (IIM), Spanish National Research Council (CSIC), 36208 Vigo, Pontevedra, Spain
  - Correspondence: Tel.: +34-986-231930; Fax: +34-986-292762
- Santiago P. Aubourg
  - Department of Food Technology, Marine Research Institute (IIM), Spanish National Research Council (CSIC), 36208 Vigo, Pontevedra, Spain