51
Beyene SS, Ling T, Ristevski B, Chen M. A novel riboswitch classification based on imbalanced sequences achieved by machine learning. PLoS Comput Biol 2020; 16:e1007760. [PMID: 32687488 PMCID: PMC7392346 DOI: 10.1371/journal.pcbi.1007760] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 02/26/2020] [Revised: 07/30/2020] [Accepted: 05/13/2020] [Indexed: 11/24/2022]
Abstract
A riboswitch, a regulatory segment of mRNA (50-250 nt in length), has two main domains: an aptamer and an expression platform. One of the main challenges in riboswitch classification is imbalanced data, a circumstance in which the sequences recorded for one group are far fewer than those for the others. Such imbalance leads a classifier to ignore the minority group and emphasize the majority ones, which results in skewed classification. We considered sixteen riboswitch families containing imbalanced sequences, in accord with recent riboswitch classification work. The sequences were split into training and test sets using a newly developed pipeline. From the 5460 k-mers produced (k from 1 to 6), 156 features were selected using the CfsSubsetEval and BestFirst functions in WEKA 3.8. Statistical testing showed a significant difference between balanced and imbalanced sequences (p < 0.05). In addition, each algorithm showed a significant difference in sensitivity, specificity, accuracy, and macro F-score between the two groups (p < 0.05). Several k-mers clustered in the heat map were found to have biological functions and to correspond to motifs at different positions such as interior loops, terminal loops and helices; some were validated as riboswitch motifs. The analysis revealed the importance of addressing majority bias and overfitting. The presented results are a generalized evaluation of both balanced and imbalanced models, which indicates their ability to classify novel riboswitches. The Python source code is available at https://github.com/Seasonsling/riboswitch.
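The figure of 5460 k-mers is consistent with a four-letter nucleotide alphabet: 4 + 16 + 64 + 256 + 1024 + 4096 = 5460 for k from 1 to 6. The following is a minimal Python sketch of such a k-mer count feature vector. It is not the authors' pipeline (that is in the repository linked above); the function names, the `ACGU` alphabet, and the use of raw overlapping counts are assumptions made for illustration.

```python
from itertools import product

ALPHABET = "ACGU"  # assumed RNA alphabet, for illustration only


def all_kmers(k_min=1, k_max=6):
    """Enumerate every possible k-mer for k in [k_min, k_max]."""
    return [
        "".join(letters)
        for k in range(k_min, k_max + 1)
        for letters in product(ALPHABET, repeat=k)
    ]


def kmer_counts(seq, k_min=1, k_max=6):
    """Count overlapping occurrences of each k-mer in seq (hypothetical helper)."""
    counts = {kmer: 0 for kmer in all_kmers(k_min, k_max)}
    for k in range(k_min, k_max + 1):
        for i in range(len(seq) - k + 1):
            window = seq[i:i + k]
            if window in counts:  # windows with non-ACGU symbols are skipped
                counts[window] += 1
    return counts


features = kmer_counts("ACGUAC")
print(len(features))  # number of candidate k-mer features
```

A feature selector such as the one named in the abstract would then reduce this candidate set to a much smaller subset.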
Affiliation(s)
- Solomon Shiferaw Beyene
  - Department of Bioinformatics, College of Life Sciences, Zhejiang University, Hangzhou, China
- Tianyi Ling
  - Department of Bioinformatics, College of Life Sciences, Zhejiang University, Hangzhou, China
  - School of Medicine, Zhejiang University, Hangzhou, Zhejiang, China
- Blagoj Ristevski
  - Faculty of Information and Communication Technologies, Bitola, St. Kliment Ohridski University Bitola, ul. Partizanska Bitola, Republic of North Macedonia
- Ming Chen
  - Department of Bioinformatics, College of Life Sciences, Zhejiang University, Hangzhou, China

52
Qutb AM, Wei F, Dong W. Prediction and Characterization of Cationic Arginine-Rich Plant Antimicrobial Peptide SM-985 From Teosinte (Zea mays ssp. mexicana). Front Microbiol 2020; 11:1353. [PMID: 32636825 PMCID: PMC7318549 DOI: 10.3389/fmicb.2020.01353] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 03/28/2020] [Accepted: 05/26/2020] [Indexed: 12/17/2022]
Abstract
Antimicrobial peptides (AMPs) are effective against different plant pathogens and have recently been considered part of plant defense systems. AMPs exist in all forms of life, from prokaryotes to eukaryotes. SM-985 is a cationic AMP (CAMP) isolated from the cDNA library of Mexican teosinte (Zea mays ssp. mexicana). A computational prediction server running different algorithms was used to screen the teosinte cDNA library for AMPs, and the SM-985 peptide was predicted to be an AMP with high probability values. SM-985 is an arginine-rich peptide composed of 21 amino acids (MW: 2671.06 Da). The physicochemical properties of SM-985 are very promising for an AMP, including a net charge of +8, a hydrophobicity ratio of 23%, a Boman index of 5.19 kcal/mol, and an isoelectric point of 12.95. The SM-985 peptide has amphipathic α-helix conformations. The antimicrobial activity of SM-985 was confirmed against six bacterial plant pathogens; the MIC of SM-985 was 8 μM against Gram-positive indicators and 4 μM against Gram-negative indicators. The interaction of SM-985 with the bacterial membrane was examined by treating the bacterial indicators with FITC-SM-985 peptide, which showed a high binding affinity of SM-985 for the bacterial membrane (whether Gram-positive or Gram-negative). Scanning electron microscopy (SEM) and transmission electron microscopy (TEM) images of bacteria treated with SM-985 demonstrated cell membrane damage and cell lysis. In vivo antimicrobial activity was examined, and SM-985 prevented leaf spot disease infection caused by Pst DC3000 on Solanum lycopersicum. Moreover, SM-985 showed sensitivity to calcium chloride salt, which is a common feature of CAMPs.
Affiliation(s)
- Abdelrahman M. Qutb
  - Department of Plant Pathology, College of Plant Science and Technology and the Key Lab of Crop Disease Monitoring and Safety Control in Hubei Province, Huazhong Agricultural University, Wuhan, China
  - Department of Agricultural Botany, Faculty of Agriculture, Al-Azhar University, Cairo, Egypt
- Feng Wei
  - State Key Laboratory of Agricultural Microbiology, College of Life Science and Technology, Huazhong Agricultural University, Wuhan, China
- Wubei Dong
  - Department of Plant Pathology, College of Plant Science and Technology and the Key Lab of Crop Disease Monitoring and Safety Control in Hubei Province, Huazhong Agricultural University, Wuhan, China

53
Beltran JA, Del Rio G, Brizuela CA. An automatic representation of peptides for effective antimicrobial activity classification. Comput Struct Biotechnol J 2020; 18:455-463. [PMID: 32180904 PMCID: PMC7063200 DOI: 10.1016/j.csbj.2020.02.002] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 10/15/2019] [Revised: 01/30/2020] [Accepted: 02/01/2020] [Indexed: 01/19/2023]
Abstract
Antimicrobial peptides (AMPs) are a promising alternative to small-molecule-based antibiotics. These peptides are part of the innate defense system of most living organisms. To computationally identify new AMPs among the peptides these organisms produce, an automatic AMP/non-AMP classifier is required. An efficient classifier in turn requires a set of robust features that capture what differentiates an AMP from a non-AMP. However, the number of candidate descriptors (on the order of thousands) is too large to allow an exhaustive search of all possible combinations. Therefore, efficient and effective feature selection techniques are required. In this work, we propose an efficient wrapper technique to solve the feature selection problem for AMP identification. The method is based on a Genetic Algorithm that uses a variable-length chromosome to represent the selected features and an objective function that considers the Matthews correlation coefficient (MCC) and the number of selected features. Computational experiments show that the proposed method can produce competitive results regarding sensitivity, specificity, and MCC. Furthermore, the best classification results are achieved by using only 39 out of 272 molecular descriptors.
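The abstract does not give the exact form of the objective function, so the following Python sketch only illustrates one plausible way to combine the Matthews correlation coefficient with the number of selected features. The linear penalty, the weight `alpha`, and the function names are hypothetical and are not taken from the paper; only the definition of the MCC is standard.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal total is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def fitness(tp, tn, fp, fn, n_selected, n_total=272, alpha=0.1):
    """Hypothetical GA fitness: reward MCC, penalize large feature subsets.

    alpha is an assumed trade-off weight, not a value from the paper.
    """
    return mcc(tp, tn, fp, fn) - alpha * n_selected / n_total


# Same classification quality, fewer features: the smaller subset scores higher.
print(fitness(45, 45, 5, 5, n_selected=39) > fitness(45, 45, 5, 5, n_selected=272))
```

With a penalty of this kind, a genetic algorithm over variable-length chromosomes is pushed toward compact feature subsets whenever two candidates classify equally well.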
Affiliation(s)
- Jesus A Beltran
  - Computer Science Department, Cicese Research Center, Ensenada, Baja California 22860, Mexico
- Gabriel Del Rio
  - Department of Biochemistry and Structural Biology, Instituto de Fisiologia Celular, Universidad Nacional Autónoma de México, 04510, Mexico
- Carlos A Brizuela
  - Computer Science Department, Cicese Research Center, Ensenada, Baja California 22860, Mexico

54
Characterization and Identification of Natural Antimicrobial Peptides on Different Organisms. Int J Mol Sci 2020; 21:ijms21030986. [PMID: 32024233 PMCID: PMC7038045 DOI: 10.3390/ijms21030986] [Citation(s) in RCA: 47] [Impact Index Per Article: 9.4] [Received: 12/18/2019] [Revised: 01/18/2020] [Accepted: 01/30/2020] [Indexed: 12/30/2022]
Abstract
Because of the rapid development of multidrug resistance, conventional antibiotics cannot kill pathogenic bacteria efficiently. New antibiotic treatments such as antimicrobial peptides (AMPs) can provide a possible solution to the antibiotic-resistance crisis. However, the identification of AMPs using experimental methods is expensive and time-consuming. Meanwhile, few studies use amino acid compositions (AACs) and physicochemical properties with different sequence lengths against different organisms to predict AMPs. Therefore, the major purpose of this study is to identify AMPs on seven categories of organisms, including amphibians, humans, fish, insects, plants, bacteria, and mammals. According to the one-rule attribute evaluation, the selected features were used to construct the predictive models based on the random forest algorithm. Compared to the accuracies of iAMP-2L (a web-server for identifying AMPs and their functional types), ADAM (a database of AMP), and MLAMP (a multi-label AMP classifier), the proposed method yielded higher than 92% in predicting AMPs on each category. Additionally, the sensitivities of the proposed models in the prediction of AMPs of seven organisms were higher than that of all other tools. Furthermore, several physicochemical properties (charge, hydrophobicity, polarity, polarizability, secondary structure, normalized van der Waals volume, and solvent accessibility) of AMPs were investigated according to their sequence lengths. As a result, the proposed method is a practical means to complement the existing tools in the characterization and identification of AMPs in different organisms.

55
Settouti N, Douibi K, Bechar MEA, Daho MEH, Saidi M. Semi-Supervised learning with Collaborative Bagged Multi-label K-Nearest-Neighbors. OPEN COMPUTER SCIENCE 2019. [DOI: 10.1515/comp-2019-0017] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/15/2022]
Abstract
Over the last few years, multi-label classification has received significant attention from researchers as a way to solve problems in many fields. The manual annotation of available datasets is time-consuming and requires a huge effort from the expert, especially for multi-label applications in which each training example is associated with many labels at once. To overcome the drawback of manual annotation and to take advantage of the large amounts of unlabeled data, many semi-supervised approaches have been proposed in the literature to provide more sophisticated and faster solutions that support the automatic labeling of unlabeled data. In this paper, a Collaborative Bagged Multi-label K-Nearest-Neighbors (CobMLKNN) algorithm is proposed that extends the co-training paradigm with a Multi-label K-Nearest-Neighbors (MLKNN) algorithm. Experiments on ten real-world multi-label datasets show the effectiveness of the CobMLKNN algorithm in improving the performance of MLKNN when learning from a small number of labeled samples by exploiting unlabeled samples.
Affiliation(s)
- Nesma Settouti
  - Biomedical Engineering Laboratory, Faculty of Technology, Tlemcen University, 13000 Tlemcen, Algeria
- Khalida Douibi
  - Biomedical Engineering Laboratory, Faculty of Technology, Tlemcen University, 13000 Tlemcen, Algeria
- Mohammed El Amine Bechar
  - Biomedical Engineering Laboratory, Faculty of Technology, Tlemcen University, 13000 Tlemcen, Algeria
- Mostafa El Habib Daho
  - Biomedical Engineering Laboratory, Faculty of Technology, Tlemcen University, 13000 Tlemcen, Algeria
- Meryem Saidi
  - Biomedical Engineering Laboratory, Faculty of Technology, Tlemcen University, 13000 Tlemcen, Algeria

56
Zuo Y, Chang Y, Huang S, Zheng L, Yang L, Cao G. iDEF-PseRAAC: Identifying the Defensin Peptide by Using Reduced Amino Acid Composition Descriptor. Evol Bioinform Online 2019; 15:1176934319867088. [PMID: 31391777 PMCID: PMC6669840 DOI: 10.1177/1176934319867088] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 06/18/2019] [Accepted: 07/08/2019] [Indexed: 11/18/2022]
Abstract
Defensins, as one of the major classes of host defense peptides, play a significant role in innate immunity and are highly evolved in almost all living organisms. Developing accurate high-throughput computational methods can help in designing drugs or medical means of defense against pathogens. To take up this challenge, an up-to-date server based on a rigorous benchmark dataset, referred to as iDEF-PseRAAC, was designed in this study for predicting the defensin family. By extracting primary sequence compositions based on different types of reduced amino acid alphabets, the selected feature subset achieved a best overall accuracy of 92.38%. We therefore conclude that the information provided by the many types of amino acid reduction offers an efficient and rational methodology for defensin identification. An online server is freely available for academic users at http://bioinfor.imu.edu.cn/idpf. We expect that iDEF-PseRAAC may be a promising tool for the functional annotation of defensin proteins.
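As a rough illustration of what a reduced amino acid composition descriptor looks like, the Python sketch below maps the 20 standard residues onto a small set of clusters and computes the frequency of each cluster in a peptide. The six-cluster grouping, the helper names, and the example sequence are assumptions chosen for illustration; they are not the reduction schemes or feature set used by iDEF-PseRAAC.

```python
from collections import Counter

# Assumed, illustrative grouping of the 20 standard residues by broad
# physicochemical class; each cluster is represented by its first letter.
CLUSTERS = ["AVLIMC", "FWYH", "STNQ", "KR", "DE", "GP"]
REDUCE = {aa: cluster[0] for cluster in CLUSTERS for aa in cluster}


def reduce_sequence(seq):
    """Rewrite a peptide in the reduced alphabet, dropping unknown symbols."""
    return "".join(REDUCE[aa] for aa in seq.upper() if aa in REDUCE)


def reduced_composition(seq):
    """Fraction of residues falling into each cluster (hypothetical helper)."""
    reduced = reduce_sequence(seq)
    counts = Counter(reduced)
    n = len(reduced)
    return {c[0]: (counts[c[0]] / n if n else 0.0) for c in CLUSTERS}


# Made-up example peptide, not a sequence from the paper.
print(reduce_sequence("GRKCRDEW"))
print(reduced_composition("GRKCRDEW"))
```

Reducing the alphabet shrinks the feature space: a dipeptide composition over 6 clusters has 36 entries instead of 400 over the full 20-letter alphabet.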
Affiliation(s)
- Yongchun Zuo
  - College of Veterinary Medicine, Inner Mongolia Agricultural University, Hohhot, China
  - State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Hohhot, China
- Yu Chang
  - State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Hohhot, China
- Shenghui Huang
  - State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Hohhot, China
- Lei Zheng
  - State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Hohhot, China
- Lei Yang
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Guifang Cao
  - College of Veterinary Medicine, Inner Mongolia Agricultural University, Hohhot, China

57
Chung CR, Kuo TR, Wu LC, Lee TY, Horng JT. Characterization and identification of antimicrobial peptides with different functional activities. Brief Bioinform 2019; 21:bbz043. [PMID: 31155657 DOI: 10.1093/bib/bbz043] [Citation(s) in RCA: 96] [Impact Index Per Article: 16.0] [Received: 01/23/2019] [Revised: 03/20/2019] [Accepted: 03/20/2019] [Indexed: 02/28/2024]
Abstract
In recent years, antimicrobial peptides (AMPs) have become an emerging area of focus when developing therapeutics against infections. Importantly, AMPs are produced by virtually all known living organisms and are able to target a wide range of pathogenic microorganisms, including viruses, parasites, bacteria and fungi. Although several studies have proposed different machine learning methods to predict whether peptides are AMPs, most do not consider the diversity of AMP activities. On this basis, we specifically investigated the sequence features of AMPs with a range of functional activities, including anti-parasitic, anti-viral, anti-cancer and anti-fungal activities and those that target mammals, Gram-positive and Gram-negative bacteria. A new scheme is proposed to systematically characterize and identify AMPs and their functional activities. The first stage of the proposed approach is to identify the AMPs, while the second involves further characterization of their functional activities. Sequential forward selection was employed to extract potentially informative features that are possibly associated with the functional activities of the AMPs. These features include hydrophobicity, the normalized van der Waals volume, polarity, charge and solvent accessibility, all of which are essential attributes for distinguishing AMPs from non-AMPs. The results revealed that the first-stage AMP classifier achieved an area under the receiver operating characteristic curve (AUC) value of 0.9894. During the second stage, we found pseudo amino acid composition to be an informative attribute when differentiating between AMPs in terms of their functional activities. The independent testing results demonstrated that the AUCs of the multi-class models were 0.7773, 0.9404, 0.8231, 0.8578, 0.8648, 0.8745 and 0.8672 for anti-parasitic, anti-viral, anti-cancer, anti-fungal AMPs and those that target mammals, Gram-positive and Gram-negative bacteria, respectively. The proposed scheme helps facilitate biological experiments related to the functional analysis of AMPs. Additionally, it was implemented as a user-friendly web server (AMPfun, http://fdblab.csie.ncu.edu.tw/AMPfun/index.html) that allows individuals to explore the antimicrobial functions of peptides of interest.
Affiliation(s)
- Chia-Ru Chung
  - Department of Computer Science and Information Engineering, National Central University, Taoyuan, Taiwan
- Ting-Rung Kuo
  - Department of Computer Science and Information Engineering, National Central University, Taoyuan, Taiwan
- Li-Ching Wu
  - Department of Biomedical Sciences and Engineering, National Central University, Taoyuan, Taiwan
- Tzong-Yi Lee
  - School of Science and Engineering, The Chinese University of Hong Kong, Shenzhen, China
  - School of Life and Health Sciences, The Chinese University of Hong Kong, Shenzhen, China
  - Warshel Institute for Computational Biology, The Chinese University of Hong Kong, Shenzhen, China
- Jorng-Tzong Horng
  - Department of Computer Science and Information Engineering, National Central University, Taoyuan, Taiwan
  - Department of Bioinformatics and Medical Engineering, Asia University, Taichung, Taiwan

58
Gull S, Shamim N, Minhas F. AMAP: Hierarchical multi-label prediction of biologically active and antimicrobial peptides. Comput Biol Med 2019; 107:172-181. [DOI: 10.1016/j.compbiomed.2019.02.018] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Received: 12/07/2018] [Revised: 02/17/2019] [Accepted: 02/20/2019] [Indexed: 12/12/2022]

59
Ayala‐Ruano S, Santander‐Gordón D, Tejera E, Perez‐Castillo Y, Armijos-Jaramillo V. A putative antimicrobial peptide from Hymenoptera in the megaplasmid pSCL4 of Streptomyces clavuligerus ATCC 27064 reveals a singular case of horizontal gene transfer with potential applications. Ecol Evol 2019; 9:2602-2614. [PMID: 30891203 PMCID: PMC6406012 DOI: 10.1002/ece3.4924] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/24/2018] [Revised: 12/31/2018] [Accepted: 01/02/2019] [Indexed: 11/06/2022]
Abstract
Streptomyces clavuligerus is a Gram-positive bacterium that is a high producer of secondary metabolites with industrial applications. The production of antibiotics such as clavulanic acid or cephamycin has been extensively studied in this species; nevertheless, other aspects, such as evolution or ecology, have received less attention. Furthermore, genes that arise from ancient events of lateral transfer have been demonstrated to be implicated in important functions of host species. This approximation discovered relevant genes that genomic analyses overlooked. Thus, we studied the impact of horizontal gene transfer in the S. clavuligerus genome. To perform this task, we applied whole-genome analysis to identify a laterally transferred sequence from different domains. The most relevant result was a putative antimicrobial peptide (AMP) with a clear origin in the Hymenoptera order of insects. Next, we determined that two copies of these genes were present in the megaplasmid pSCL4 but absent in the S. clavuligerus ATCC 27064 chromosome. Additionally, we found that these sequences were exclusive to the ATCC 27064 strain (and so were not present in any other bacteria) and we also verified the expression of the genes using RNAseq data. Next, we used several AMP predictors to validate the original annotation extracted from Hymenoptera sequences and explored the possibility that these proteins had post-translational modifications using peptidase cleavage prediction. We suggest that Hymenoptera AMP-like proteins of S. clavuligerus ATCC 27064 may be useful for both species adaptation and as an antimicrobial molecule with industrial applications.
Affiliation(s)
- Sebastián Ayala‐Ruano
  - Universidad San Francisco de Quito, Colegio de Ciencias Biológicas y Ambientales (COCIBA‐USFQ), Quito, Ecuador
- Daniela Santander‐Gordón
  - Carrera de Ingeniería en Biotecnología, Facultad de Ingeniería y Ciencias Aplicadas, Universidad de Las Américas, Quito, Ecuador
- Eduardo Tejera
  - Carrera de Ingeniería en Biotecnología, Facultad de Ingeniería y Ciencias Aplicadas, Universidad de Las Américas, Quito, Ecuador
  - Grupo de Bio‐Quimioinformática, Universidad de Las Américas, Quito, Ecuador
- Yunierkis Perez‐Castillo
  - Grupo de Bio‐Quimioinformática, Universidad de Las Américas, Quito, Ecuador
  - Ciencias Físicas y Matemáticas‐Facultad de Formación General, Universidad de Las Américas, Quito, Ecuador
- Vinicio Armijos-Jaramillo
  - Carrera de Ingeniería en Biotecnología, Facultad de Ingeniería y Ciencias Aplicadas, Universidad de Las Américas, Quito, Ecuador
  - Grupo de Bio‐Quimioinformática, Universidad de Las Américas, Quito, Ecuador

60
Mirza B, Wang W, Wang J, Choi H, Chung NC, Ping P. Machine Learning and Integrative Analysis of Biomedical Big Data. Genes (Basel) 2019; 10:E87. [PMID: 30696086 PMCID: PMC6410075 DOI: 10.3390/genes10020087] [Citation(s) in RCA: 182] [Impact Index Per Article: 30.3] [Received: 12/02/2018] [Revised: 01/08/2019] [Accepted: 01/21/2019] [Indexed: 12/11/2022]
Abstract
Recent developments in high-throughput technologies have accelerated the accumulation of massive amounts of omics data from multiple sources: genome, epigenome, transcriptome, proteome, metabolome, etc. Traditionally, data from each source (e.g., genome) is analyzed in isolation using statistical and machine learning (ML) methods. Integrative analysis of multi-omics and clinical data is key to new biomedical discoveries and advancements in precision medicine. However, data integration poses new computational challenges as well as exacerbates the ones associated with single-omics studies. Specialized computational approaches are required to effectively and efficiently perform integrative analysis of biomedical data acquired from diverse modalities. In this review, we discuss state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: curse of dimensionality, data heterogeneity, missing data, class imbalance and scalability issues.
Affiliation(s)
- Bilal Mirza
  - NIH BD2K Center of Excellence for Biomedical Computing, University of California Los Angeles, Los Angeles, CA 90095, USA.
  - Department of Physiology, University of California Los Angeles, Los Angeles, CA 90095, USA.
- Wei Wang
  - NIH BD2K Center of Excellence for Biomedical Computing, University of California Los Angeles, Los Angeles, CA 90095, USA.
  - Department of Computer Science, University of California Los Angeles, Los Angeles, CA 90095, USA.
  - Scalable Analytics Institute (ScAi), University of California Los Angeles, Los Angeles, CA 90095, USA.
  - Department of Bioinformatics, University of California Los Angeles, Los Angeles, CA 90095, USA.
- Jie Wang
  - NIH BD2K Center of Excellence for Biomedical Computing, University of California Los Angeles, Los Angeles, CA 90095, USA.
  - Department of Physiology, University of California Los Angeles, Los Angeles, CA 90095, USA.
- Howard Choi
  - NIH BD2K Center of Excellence for Biomedical Computing, University of California Los Angeles, Los Angeles, CA 90095, USA.
  - Department of Physiology, University of California Los Angeles, Los Angeles, CA 90095, USA.
  - Department of Bioinformatics, University of California Los Angeles, Los Angeles, CA 90095, USA.
- Neo Christopher Chung
  - NIH BD2K Center of Excellence for Biomedical Computing, University of California Los Angeles, Los Angeles, CA 90095, USA.
  - Institute of Informatics, Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Banacha 2, 02-097 Warsaw, Poland.
- Peipei Ping
  - NIH BD2K Center of Excellence for Biomedical Computing, University of California Los Angeles, Los Angeles, CA 90095, USA.
  - Department of Physiology, University of California Los Angeles, Los Angeles, CA 90095, USA.
  - Scalable Analytics Institute (ScAi), University of California Los Angeles, Los Angeles, CA 90095, USA.
  - Department of Bioinformatics, University of California Los Angeles, Los Angeles, CA 90095, USA.
  - Department of Medicine (Cardiology), University of California Los Angeles, Los Angeles, CA 90095, USA.

61
Vishnepolsky B, Pirtskhalava M. Comment on: ‘Empirical comparison of web-based antimicrobial peptide prediction tools’. Bioinformatics 2018; 35:2692-2694. [DOI: 10.1093/bioinformatics/bty1023] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 12/04/2017] [Revised: 03/05/2018] [Accepted: 12/13/2018] [Indexed: 11/12/2022]
Abstract
Supplementary information: Supplementary data are available at Bioinformatics online.

62
Kalmykova SD, Arapidi GP, Urban AS, Osetrova MS, Gordeeva VD, Ivanov VT, Govorun VM. In Silico Analysis of Peptide Potential Biological Functions. RUSSIAN JOURNAL OF BIOORGANIC CHEMISTRY 2018. [DOI: 10.1134/s106816201804009x] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Indexed: 11/23/2022]

63
Liu S, Bao J, Lao X, Zheng H. Novel 3D Structure Based Model for Activity Prediction and Design of Antimicrobial Peptides. Sci Rep 2018; 8:11189. [PMID: 30046138 PMCID: PMC6060096 DOI: 10.1038/s41598-018-29566-5] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.0] [Received: 05/08/2018] [Accepted: 07/13/2018] [Indexed: 01/10/2023]
Abstract
The emergence and worldwide spread of multi-drug resistant bacteria poses an urgent challenge for the development of novel antibacterial agents. Antimicrobial peptides (AMPs) are a promising weapon against severe infections caused by drug-resistant microorganisms. AMPs are a diverse class of naturally occurring molecules that are produced as a first line of defense by all multi-cellular organisms. Because the number of experimentally determined 3D structures is limited, most prediction or classification methods for AMPs have been based on 2D descriptors, including sequence, amino acid composition, peptide net charge, hydrophobicity, amphiphilicity, etc. Owing to the rapid development of structural simulation methods, predicted models of proteins (or peptides) have been successfully applied in structure-based drug design, for example as targets of virtual ligand screening. Here, we establish an activity prediction model based on the predicted 3D structures of AMP molecules. To our knowledge, this is the first report of a prediction method based on 3D descriptors of AMPs. Novel AMPs were designed using the model, and their antibacterial effect was measured by in vitro experiments.
Affiliation(s)
- Shicai Liu
  - School of Life Science and Technology, China Pharmaceutical University, Nanjing, 210009, China
- Jingxiao Bao
  - School of Life Science and Technology, China Pharmaceutical University, Nanjing, 210009, China
- Xingzhen Lao
  - School of Life Science and Technology, China Pharmaceutical University, Nanjing, 210009, China
- Heng Zheng
  - School of Life Science and Technology, China Pharmaceutical University, Nanjing, 210009, China

64
Ning Q, Zhao X, Bao L, Ma Z, Zhao X. Detecting Succinylation sites from protein sequences using ensemble support vector machine. BMC Bioinformatics 2018; 19:237. [PMID: 29940836 PMCID: PMC6016146 DOI: 10.1186/s12859-018-2249-4] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.7] [Received: 09/26/2017] [Accepted: 06/14/2018] [Indexed: 12/14/2022]
Abstract
Background: Lysine succinylation is a new kind of post-translational modification that plays a key role in protein conformation regulation and cellular function control. To understand the mechanism of succinylation in depth, it is necessary to identify succinylation sites in proteins accurately. However, traditional experimental approaches are labor-intensive and time-consuming. Computational prediction methods have been proposed in recent years, and they are popular because of their convenience and high speed. In this study, we developed a new method to predict succinylation sites in proteins by combining multiple features, including amino acid composition, binary encoding, physicochemical properties and grey pseudo amino acid composition, with a feature selection scheme (information gain). The model was then trained using an SVM (Support Vector Machine) and an ensemble learning algorithm. Results: The method achieved an accuracy of 89.14% and an MCC (Matthews correlation coefficient) of 0.79 using 10-fold cross validation on the training dataset, and an accuracy of 84.5% and an MCC of 0.2 on the independent dataset. Conclusions: The conclusions drawn from this study can help in understanding more of the succinylation mechanism. These results suggest that our method is very promising for predicting succinylation sites. The source code and data of this paper are freely available at https://github.com/ningq669/PSuccE. Electronic supplementary material: The online version of this article (10.1186/s12859-018-2249-4) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Qiao Ning
- School of Information Science and Technology, Northeast Normal University, Changchun, 130117, China
- Xiaosa Zhao
- School of Information Science and Technology, Northeast Normal University, Changchun, 130117, China
- Lingling Bao
- School of Information Science and Technology, Northeast Normal University, Changchun, 130117, China
- Zhiqiang Ma
- School of Information Science and Technology, Northeast Normal University, Changchun, 130117, China
- Xiaowei Zhao
- Key Laboratory of Intelligent Information Processing of Jilin Universities, Northeast Normal University, Changchun, 130117, China
|
65
|
Gabere MN, Noble WS. Empirical comparison of web-based antimicrobial peptide prediction tools. Bioinformatics 2018; 33:1921-1929. [PMID: 28203715 DOI: 10.1093/bioinformatics/btx081] [Citation(s) in RCA: 79] [Impact Index Per Article: 11.3] [Received: 10/12/2016] [Accepted: 02/13/2017] [Indexed: 11/13/2022] Open
Abstract
Motivation: Antimicrobial peptides (AMPs) are innate immune molecules that exhibit activities against a range of microbes, including bacteria, fungi, viruses and protozoa. Recent increases in microbial resistance against current drugs have led to a concomitant increase in the need for novel antimicrobial agents. Over the last decade, a number of AMP prediction tools have been designed and made freely available online. These AMP prediction tools show potential to discriminate AMPs from non-AMPs, but the relative quality of the predictions produced by the various tools is difficult to quantify. Results: We compiled two sets of AMP and non-AMP peptides, separated into three categories: antimicrobial, antibacterial and bacteriocins. Using these benchmark data sets, we carried out a systematic evaluation of ten publicly available AMP prediction methods. Among the six general AMP prediction tools (ADAM, CAMPR3(RF), CAMPR3(SVM), MLAMP, DBAASP and MLAMP), we find that CAMPR3(RF) provides a statistically significant improvement in performance, as measured by the area under the receiver operating characteristic (ROC) curve, relative to the other five methods. Surprisingly, for antibacterial prediction, the original AntiBP method significantly outperforms its successor, AntiBP2, on one benchmark dataset. The two bacteriocin prediction tools, BAGEL3 and BACTIBASE, both provide very good performance, and BAGEL3 outperforms its predecessor, BACTIBASE, on the larger of the two benchmarks. Contact: gaberemu@ngha.med.sa or william-noble@uw.edu. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Musa Nur Gabere
- Department of Biostatistics and Bioinformatics, King Abdullah International Medical Research Center/King Saud bin Abdulaziz University for Health Sciences, Riyadh, Saudi Arabia
- William Stafford Noble
- Department of Genome Sciences, Department of Computer Science and Engineering, University of Washington, Seattle, WA, USA
|
66
|
Vishnepolsky B, Gabrielian A, Rosenthal A, Hurt DE, Tartakovsky M, Managadze G, Grigolava M, Makhatadze GI, Pirtskhalava M. Predictive Model of Linear Antimicrobial Peptides Active against Gram-Negative Bacteria. J Chem Inf Model 2018; 58:1141-1151. [DOI: 10.1021/acs.jcim.8b00118] [Citation(s) in RCA: 35] [Impact Index Per Article: 5.0] [Indexed: 01/07/2023]
Affiliation(s)
- Boris Vishnepolsky
- Ivane Beritashvili Center of Experimental Biomedicine, Tbilisi 0160, Georgia
- Andrei Gabrielian
- Office of Cyber Infrastructure and Computational Biology, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, Maryland 20892, United States
- Alex Rosenthal
- Office of Cyber Infrastructure and Computational Biology, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, Maryland 20892, United States
- Darrell E. Hurt
- Office of Cyber Infrastructure and Computational Biology, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, Maryland 20892, United States
- Michael Tartakovsky
- Office of Cyber Infrastructure and Computational Biology, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, Maryland 20892, United States
- Grigol Managadze
- Ivane Beritashvili Center of Experimental Biomedicine, Tbilisi 0160, Georgia
- Maya Grigolava
- Ivane Beritashvili Center of Experimental Biomedicine, Tbilisi 0160, Georgia
- Malak Pirtskhalava
- Ivane Beritashvili Center of Experimental Biomedicine, Tbilisi 0160, Georgia
|
67
|
Ding M, Yang Y, Lan Z. Multi-label imbalanced classification based on assessments of cost and value. Appl Intell 2018. [DOI: 10.1007/s10489-018-1156-8] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Indexed: 11/30/2022]
|
68
|
Waghu FH, Joseph S, Ghawali S, Martis EA, Madan T, Venkatesh KV, Idicula-Thomas S. Designing Antibacterial Peptides with Enhanced Killing Kinetics. Front Microbiol 2018. [PMID: 29527201 PMCID: PMC5829097 DOI: 10.3389/fmicb.2018.00325] [Citation(s) in RCA: 30] [Impact Index Per Article: 4.3] [Indexed: 02/01/2023] Open
Abstract
Antimicrobial peptides (AMPs) are gaining attention as substitutes for antibiotics in order to combat the risk posed by multi-drug resistant pathogens. Several research groups are engaged in the design of potent anti-infective agents using natural AMPs as templates. In this study, a library of peptides with high sequence similarity to the Myeloid Antimicrobial Peptide (MAP) family was screened using popular online prediction algorithms. These peptide variants were designed to retain the conserved residues within the MAP family. The prediction algorithms were found to effectively classify peptides based on their antimicrobial nature. In order to improve the activity of the identified peptides, molecular dynamics (MD) simulations using bilayer and micellar systems could be used to design peptides and predict the effect of residue substitution on the membranes of microbial and mammalian cells. The inferences from the MD simulation studies corroborated well with the wet-lab observations, indicating that MD-guided rational design could lead to the discovery of potent AMPs. The effect of residue substitution on membrane activity was studied in greater detail using killing kinetic analysis. Killing kinetics studies on Gram-positive bacteria, Gram-negative bacteria and human erythrocytes indicated that a single residue change has a drastic effect on the potency of AMPs. An interesting outcome was a switch from a monophasic to a biphasic death rate constant of Staphylococcus aureus due to a single residue mutation in the peptide.
Affiliation(s)
- Faiza H Waghu
- Biomedical Informatics Centre, ICMR-National Institute for Research in Reproductive Health, Mumbai, India
- Shaini Joseph
- Biomedical Informatics Centre, ICMR-National Institute for Research in Reproductive Health, Mumbai, India
- Sanket Ghawali
- Biomedical Informatics Centre, ICMR-National Institute for Research in Reproductive Health, Mumbai, India
- Elvis A Martis
- Molecular Simulation Group, Department of Pharmaceutical Chemistry, Bombay College of Pharmacy, Mumbai, India
- Taruna Madan
- Department of Innate Immunity, ICMR-National Institute for Research in Reproductive Health, Mumbai, India
- Susan Idicula-Thomas
- Biomedical Informatics Centre, ICMR-National Institute for Research in Reproductive Health, Mumbai, India
|
69
|
Riemenschneider M, Herbst A, Rasch A, Gorlatch S, Heider D. eccCL: parallelized GPU implementation of Ensemble Classifier Chains. BMC Bioinformatics 2017; 18:371. [PMID: 28818036 PMCID: PMC5561639 DOI: 10.1186/s12859-017-1783-9] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 11/02/2016] [Accepted: 08/08/2017] [Indexed: 11/30/2022] Open
Abstract
Background: Multi-label classification has recently gained great attention in diverse fields of research, e.g., in biomedical applications such as protein function prediction or drug resistance testing in HIV. In this context, the concept of Classifier Chains has been shown to improve prediction accuracy, especially when applied as Ensemble Classifier Chains. However, these techniques lack computational efficiency when applied to large amounts of data, e.g., derived from next-generation sequencing experiments. By adapting algorithms for use on graphics processing units, computational efficiency can be greatly improved through parallelization of computations. Results: Here, we provide a parallelized and optimized graphics processing unit implementation (eccCL) of Classifier Chains and Ensemble Classifier Chains. In addition to the OpenCL implementation, we provide an R package with an easy-to-use R interface for parallelized graphics processing unit usage. Conclusion: eccCL is a handy implementation of Classifier Chains on GPUs, which is able to process up to over 25,000 instances per second, and thus can be used efficiently in high-throughput experiments. The software is available at http://www.heiderlab.de.
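For readers unfamiliar with the technique named in this abstract, the following is a minimal CPU-only sketch of Classifier Chains and a simple ensemble of them. It is not the eccCL implementation (which targets GPUs via OpenCL and R). The class names, the nearest-centroid base learner and the toy data are all illustrative assumptions, and the per-chain subsampling of training data often used in Ensemble Classifier Chains is omitted for brevity.

```python
import random

class CentroidClassifier:
    """Stand-in binary base learner: predicts the class with the nearest mean vector."""
    def fit(self, X, y):
        self.centroids = {}
        for label in set(y):
            rows = [x for x, t in zip(X, y) if t == label]
            self.centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def predict_one(self, x):
        def sq_dist(c):
            return sum((a - b) ** 2 for a, b in zip(x, c))
        return min(self.centroids, key=lambda lab: sq_dist(self.centroids[lab]))

class SimpleChain:
    """One classifier chain: label j is predicted from the features plus earlier labels."""
    def __init__(self, order):
        self.order = list(order)

    def fit(self, X, Y):
        self.models = []
        for pos, j in enumerate(self.order):
            # Training uses the TRUE values of the labels earlier in the chain.
            Xj = [list(x) + [y[k] for k in self.order[:pos]] for x, y in zip(X, Y)]
            self.models.append(CentroidClassifier().fit(Xj, [y[j] for y in Y]))
        return self

    def predict_one(self, x):
        ext, out = list(x), {}
        for j, model in zip(self.order, self.models):
            out[j] = model.predict_one(ext)
            ext.append(out[j])  # prediction uses the PREDICTED earlier labels
        return [out[j] for j in range(len(self.order))]

class SimpleEnsemble:
    """Several chains with random label orders, combined by per-label majority vote."""
    def __init__(self, n_chains=5, seed=0):
        self.n_chains, self.seed = n_chains, seed

    def fit(self, X, Y):
        rng = random.Random(self.seed)
        self.chains = []
        for _ in range(self.n_chains):
            order = list(range(len(Y[0])))
            rng.shuffle(order)
            self.chains.append(SimpleChain(order).fit(X, Y))
        return self

    def predict_one(self, x):
        votes = [chain.predict_one(x) for chain in self.chains]
        return [1 if 2 * sum(v[j] for v in votes) >= len(votes) else 0
                for j in range(len(votes[0]))]

# Toy data (hypothetical): label 0 tracks feature 0, label 1 tracks feature 1.
X = [[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1], [0.0, 1.0], [0.1, 0.9]]
Y = [[0, 0], [0, 0], [1, 1], [1, 1], [0, 1], [0, 1]]
model = SimpleEnsemble(n_chains=5, seed=0).fit(X, Y)
print(model.predict_one([1.0, 0.9]))
```

Each chain's per-label models depend on one another sequentially, but the chains in the ensemble and the instances being scored are independent, which is presumably what makes the method a natural fit for the GPU parallelization the abstract describes.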
Affiliation(s)
- Mona Riemenschneider
- Department of Bioinformatics, Straubing Center of Science, Petersgasse 18, Straubing, 94315, Germany
- Alexander Herbst
- Institute of Computer Science, University of Münster, Einsteinstr. 62, Münster, 48149, Germany
- Ari Rasch
- Institute of Computer Science, University of Münster, Einsteinstr. 62, Münster, 48149, Germany
- Sergei Gorlatch
- Institute of Computer Science, University of Münster, Einsteinstr. 62, Münster, 48149, Germany
- Dominik Heider
- Department of Bioinformatics, Straubing Center of Science, Petersgasse 18, Straubing, 94315, Germany; Wissenschaftszentrum Weihenstephan, Technische Universität München, Alte Akademie 8, Freising, 85354, Germany; Present Address: Department of Mathematics and Computer Science, University of Marburg, Hans-Meerwein-Str. 6, Marburg, 35032, Germany
|
70
|
Cheng X, Zhao SG, Lin WZ, Xiao X, Chou KC. pLoc-mAnimal: predict subcellular localization of animal proteins with both single and multiple sites. Bioinformatics 2017; 33:3524-3531. [DOI: 10.1093/bioinformatics/btx476] [Citation(s) in RCA: 167] [Impact Index Per Article: 20.9] [Received: 04/18/2017] [Accepted: 07/22/2017] [Indexed: 12/24/2022] Open
Affiliation(s)
- Xiang Cheng
- College of Information Science and Technology, Donghua University, Shanghai, China
- Computer Department, Jingdezhen Ceramic Institute, Jingdezhen, China
- Shu-Guang Zhao
- College of Information Science and Technology, Donghua University, Shanghai, China
- Wei-Zhong Lin
- Computer Department, Jingdezhen Ceramic Institute, Jingdezhen, China
- Xuan Xiao
- Computer Department, Jingdezhen Ceramic Institute, Jingdezhen, China
- The Gordon Life Science Institute, Boston, MA, USA
- Kuo-Chen Chou
- The Gordon Life Science Institute, Boston, MA, USA
- Center of Excellence in Genomic Medicine Research (CEGMR), King Abdulaziz University, Jeddah, Saudi Arabia
- Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
|
71
|
Abstract
Understanding epigenetic processes holds immense promise for medical applications. Advances in Machine Learning (ML) are critical to realize this promise. Previous studies used epigenetic data sets associated with the germline transmission of epigenetic transgenerational inheritance of disease and novel ML approaches to predict genome-wide locations of critical epimutations. A combination of Active Learning (ACL) and Imbalanced Class Learning (ICL) was used to address past problems with ML to develop a more efficient feature selection process and address the imbalance problem in all genomic data sets. The power of this novel ML approach and our ability to predict epigenetic phenomena and associated disease is suggested. The current approach requires extensive computation of features over the genome. A promising new approach is to introduce Deep Learning (DL) for the generation and simultaneous computation of novel genomic features tuned to the classification task. This approach can be used with any genomic or biological data set applied to medicine. The application of molecular epigenetic data in advanced machine learning analysis to medicine is the focus of this review.
Affiliation(s)
- Lawrence B Holder
- School of Electrical Engineering and Computer Science, Washington State University, Pullman, WA, USA
- M Muksitul Haque
- School of Electrical Engineering and Computer Science, Washington State University, Pullman, WA, USA; Center for Reproductive Biology, School of Biological Sciences, Washington State University, Pullman, WA, USA
- Michael K Skinner
- Center for Reproductive Biology, School of Biological Sciences, Washington State University, Pullman, WA, USA
|