1
Abbas M, Sahibzada KI, Shahid S, Yousaf N, Hu Y, Wei DQ. ABP-Xplorer: A Machine Learning Approach for Prediction of Antibacterial Peptides Targeting Mycobacterium abscessus-tRNA-Methyltransferase (TrmD). J Chem Inf Model 2025. [PMID: 40377983 DOI: 10.1021/acs.jcim.5c00663] [Indexed: 05/18/2025]
Abstract
Mycobacterium abscessus (MAB) infections pose a significant treatment challenge due to their intrinsic resistance to antibiotics, requiring prolonged multidrug regimens with limited success and frequent relapses. tRNA (m1G37) methyltransferase (TrmD), an enzyme essential for maintaining the reading frame during protein synthesis in MAB and other mycobacteria, is a potential therapeutic target for identifying new inhibitors. This study introduces ABP-Xplorer, a machine learning (ML)-based model designed to predict the antibacterial potential of peptides targeting MAB-TrmD ribosomal sites. A systematic evaluation of 26 machine learning models identified the Random Forest (RF) classifier as the most effective, achieving 96% accuracy. To address data set imbalance and enhance predictive reliability, the Synthetic Minority Oversampling Technique (SMOTE) was applied, improving model generalization and reducing bias. An ABP-Xplorer Streamlit application was then developed to classify peptides as positive or negative antibacterial peptides (ABPs), enabling easy sequence input and classification based on predictive scoring. For validation, 12 positive peptides with high predictive scores were selected for molecular docking with HADDOCK. Docking analysis of the selected peptides confirmed strong binding to TrmD, with P1, P7, P8, and P9 as top candidates. Notably, P1 exhibited the best interaction with a HADDOCK score of -102.2, followed by P7 (-93.6) and P8 (-91.4), indicating their potential for further development as TrmD inhibitors. Moreover, Ramachandran plot analysis validated the structural reliability. Future research should focus on the experimental validation of these peptides and on optimizing their stability and bioavailability for therapeutic applications.
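The abstract names SMOTE without describing it. As a rough illustration only, the core idea is to create synthetic minority-class samples by interpolating between an existing minority sample and one of its nearest minority neighbours. The sketch below is not the authors' pipeline: the function name `smote_sketch`, the two-feature descriptors, and the parameter values are all assumptions chosen for the example.

```python
import random


def smote_sketch(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(base, neighbour))
        )
    return synthetic


# Hypothetical two-feature descriptors for the under-represented class.
minority = [(0.10, 1.0), (0.20, 1.2), (0.15, 0.9), (0.30, 1.1)]
new_points = smote_sketch(minority, n_new=4)
```

Because every synthetic point lies on a segment between two real minority samples, the new points stay inside the region already occupied by that class. In practice a maintained implementation would normally be preferred over a hand-written one.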
Affiliation(s)
- Munawar Abbas
  - College of Food Science and Technology, Henan University of Technology, Zhengzhou 450001, Henan, China
- Kashif Iqbal Sahibzada
  - College of Biological Engineering, Henan University of Technology, Zhengzhou 454001, Henan, P. R. China
  - Department of Health Professional Technologies, Faculty of Allied Health Sciences, The University of Lahore, Lahore 54570, Pakistan
- Shumaila Shahid
  - School of Biochemistry and Biotechnology, University of the Punjab, Lahore 54570, Pakistan
- Numan Yousaf
  - State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic & Developmental Sciences and School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200030, P. R. China
- Yuansen Hu
  - College of Biological Engineering, Henan University of Technology, Zhengzhou 454001, Henan, P. R. China
- Dong-Qing Wei
  - College of Food Science and Technology, Henan University of Technology, Zhengzhou 450001, Henan, China
  - State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic & Developmental Sciences and School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200030, P. R. China
  - Zhongjing Research and Industrialization Institute of Chinese Medicine, Zhongguancun Scientific Park, Meixi, Nanyang, Henan 473006, P. R. China
2
Rodríguez-Belenguer P, Soria-Olivas E, Pastor M. StreamChol: a web-based application for predicting cholestasis. J Cheminform 2025; 17:9. [PMID: 39838478 PMCID: PMC11752685 DOI: 10.1186/s13321-024-00943-9] [Received: 07/19/2024] [Accepted: 12/19/2024] [Indexed: 01/23/2025]
Abstract
This article introduces StreamChol, software for developing and applying mechanistic models to predict cholestasis. StreamChol is a Streamlit application, usable as a desktop application or as web-accessible software when installed on a server using a Docker container. StreamChol allows seamless integration of pharmacokinetic analyses with machine learning models. This integration not only enables cholestasis prediction but also opens avenues for predicting other toxicological endpoints requiring similar integrations. StreamChol's Docker containerization also streamlines deployment across diverse environments, addressing potential compatibility issues. StreamChol is distributed as open source under GNU GPL v3, reflecting our commitment to open science. Through StreamChol, researchers are offered a potent tool for predictive modelling in toxicology, harnessing its strengths within an intuitive and user-friendly interface, without the need for any programming knowledge. Scientific contribution: This work offers a user-friendly web-based tool for cholestasis prediction and a complete workflow for creating web platforms that require the combination of two programming languages, R and Python.
Affiliation(s)
- Pablo Rodríguez-Belenguer
  - Research Programme On Biomedical Informatics (GRIB), Department of Medicine and Life Sciences, Universitat Pompeu Fabra, Hospital del Mar Medical Research Institute, Barcelona, Spain
- Emilio Soria-Olivas
  - IDAL, Intelligent Data Analysis Laboratory, ETSE, Universitat de València, Valencia, Spain
- Manuel Pastor
  - Research Programme On Biomedical Informatics (GRIB), Department of Medicine and Life Sciences, Universitat Pompeu Fabra, Hospital del Mar Medical Research Institute, Barcelona, Spain
3
Xu L, Li C, Zhang J, Guan C, Zhao L, Shen X, Zhang N, Li T, Yang C, Zhou B, Bu Q, Xu Y. Personalized prediction of mortality in patients with acute ischemic stroke using explainable artificial intelligence. Eur J Med Res 2024; 29:341. [PMID: 38902792 PMCID: PMC11188208 DOI: 10.1186/s40001-024-01940-2] [Received: 03/08/2024] [Accepted: 06/17/2024] [Indexed: 06/22/2024]
Abstract
BACKGROUND: Research into acute kidney disease (AKD) after acute ischemic stroke (AIS) is rare, and how clinical features influence its prognosis remains unknown. We aim to employ interpretable machine learning (ML) models to study AIS and clarify their decision-making process in identifying the risk of mortality. METHODS: We conducted a retrospective cohort study involving AIS patients from January 2020 to June 2021. Patient data were randomly divided into training and test sets. Eight ML algorithms were employed to construct predictive models for mortality. The performance of the best model was evaluated using various metrics. Furthermore, we created an artificial intelligence (AI)-driven web application that leveraged the ten most important features for mortality prediction. RESULTS: The study cohort consisted of 1633 AIS patients, among whom 257 (15.74%) developed subacute AKD, 173 (10.59%) experienced acute kidney injury (AKI) recovery, and 65 (3.98%) met criteria for both AKI and AKD. The mortality rate was 4.84%. The LightGBM model displayed superior performance, with an AUROC of 0.96 for mortality prediction. The top five features linked to mortality were ACEI/ARB, renal function trajectories, neutrophil count, diuretics, and serum creatinine. Moreover, we designed a web application using the LightGBM model to estimate mortality risk. CONCLUSIONS: Complete renal function trajectories, including AKI and AKD, are vital for fitting mortality in AIS patients. An interpretable ML model effectively clarified its decision-making process for identifying AIS patients at risk of mortality. The AI-driven web application has the potential to contribute to the development of personalized early mortality prevention.
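The abstract reports model quality as AUROC. For readers unfamiliar with the metric, it can be read as the probability that a randomly chosen positive case receives a higher predicted risk than a randomly chosen negative case. The following minimal sketch computes it by direct pairwise comparison; the function name `auroc`, the labels, and the scores are made up for illustration and have nothing to do with the study data.

```python
def auroc(labels, scores):
    """Fraction of (positive, negative) pairs in which the positive case
    has the higher score; tied scores count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical outcomes (1 = died, 0 = survived) and predicted risks.
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.8]
value = auroc(labels, scores)  # 8 of 9 pairs are ranked correctly
```

The pairwise form is quadratic in the number of cases, so it suits small illustrations; library implementations use a rank-based calculation that gives the same number.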
Affiliation(s)
- Lingyu Xu
  - Department of Nephrology, The Affiliated Hospital of Qingdao University, 16 Jiangsu Road, Qingdao, 266003, China
- Chenyu Li
  - Department of Nephrology, The Affiliated Hospital of Qingdao University, 16 Jiangsu Road, Qingdao, 266003, China
  - Division of Nephrology, Medizinische Klinik Und Poliklinik IV, Klinikum der Universität, Munich, Germany
- Jiaqi Zhang
  - Yidu Central Hospital of Weifang, Weifang, China
- Chen Guan
  - Department of Nephrology, The Affiliated Hospital of Qingdao University, 16 Jiangsu Road, Qingdao, 266003, China
- Long Zhao
  - Department of Nephrology, The Affiliated Hospital of Qingdao University, 16 Jiangsu Road, Qingdao, 266003, China
- Xuefei Shen
  - Department of Nephrology, The Affiliated Hospital of Qingdao University, 16 Jiangsu Road, Qingdao, 266003, China
- Ningxin Zhang
  - Department of Nephrology, The Affiliated Hospital of Qingdao University, 16 Jiangsu Road, Qingdao, 266003, China
- Tianyang Li
  - Department of Nephrology, The Affiliated Hospital of Qingdao University, 16 Jiangsu Road, Qingdao, 266003, China
- Chengyu Yang
  - Department of Nephrology, The Affiliated Hospital of Qingdao University, 16 Jiangsu Road, Qingdao, 266003, China
- Bin Zhou
  - Department of Nephrology, The Affiliated Hospital of Qingdao University, 16 Jiangsu Road, Qingdao, 266003, China
- Quandong Bu
  - Department of Nephrology, The Affiliated Hospital of Qingdao University, 16 Jiangsu Road, Qingdao, 266003, China
- Yan Xu
  - Department of Nephrology, The Affiliated Hospital of Qingdao University, 16 Jiangsu Road, Qingdao, 266003, China
4
Castorina LV, Ünal SM, Subr K, Wood CW. TIMED-Design: flexible and accessible protein sequence design with convolutional neural networks. Protein Eng Des Sel 2024; 37:gzae002. [PMID: 38288671 PMCID: PMC10939383 DOI: 10.1093/protein/gzae002] [Received: 08/01/2023] [Revised: 12/12/2023] [Accepted: 01/12/2024] [Indexed: 02/18/2024]
Abstract
Sequence design is a crucial step in the process of designing or engineering proteins. Traditionally, physics-based methods have been used to solve for optimal sequences, with the main disadvantage being that they are computationally intensive for the end user. Deep learning-based methods offer an attractive alternative, outperforming physics-based methods at a significantly lower computational cost. In this paper, we explore the application of Convolutional Neural Networks (CNNs) for sequence design. We describe the development and benchmarking of a range of networks, as well as reimplementations of previously described CNNs. We demonstrate the flexibility of representing proteins in a three-dimensional voxel grid by encoding additional design constraints into the input data. Finally, we describe TIMED-Design, a web application and command line tool for exploring and applying the models described in this paper. The user interface will be available at https://pragmaticproteindesign.bio.ed.ac.uk/timed. The source code for TIMED-Design is available at https://github.com/wells-wood-research/timed-design.
Affiliation(s)
- Leonardo V Castorina
  - School of Informatics, University of Edinburgh, 10 Crichton Street, Edinburgh EH8 9AB, United Kingdom
- Suleyman Mert Ünal
  - School of Biological Sciences, University of Edinburgh, Roger Land Building, Edinburgh EH9 3FF, United Kingdom
- Kartic Subr
  - School of Informatics, University of Edinburgh, 10 Crichton Street, Edinburgh EH8 9AB, United Kingdom
- Christopher W Wood
  - School of Biological Sciences, University of Edinburgh, Roger Land Building, Edinburgh EH9 3FF, United Kingdom
5
Kim J, Yoon S, Kondakala S, Foley SL, Hart M, Baek DH, Wang W, Kim SK, Sutherland JB, Kim SJ, Kweon O. CPGminer: An Interactive Dashboard to Explore the Genomic Features and Taxonomy of Complete Prokaryotic Genomes. Microorganisms 2023; 11:2556. [PMID: 37894214 PMCID: PMC10609142 DOI: 10.3390/microorganisms11102556] [Received: 08/29/2023] [Revised: 09/22/2023] [Accepted: 10/08/2023] [Indexed: 10/29/2023]
Abstract
Prokaryotes, the earliest forms of life on Earth, play crucial roles in global biogeochemical processes in virtually all ecosystems. The ever-increasing amount of prokaryotic genome sequencing data provides a wealth of information to examine fundamental and applied questions through systematic genome comparison. Genomic features, such as genome size and GC content, and taxonomy-centric genomic features of complete prokaryotic genomes (CPGs) are crucial for various fields of microbial research and education, yet they are often overlooked. Additionally, creating systematically curated datasets that align with research concerns is an essential yet challenging task for wet-lab researchers. In this study, we introduce CPGminer, a user-friendly tool that allows researchers to quickly and easily examine the genomic features and taxonomy of CPGs and curate genome datasets. We also provide several examples to demonstrate its practical utility in addressing descriptive questions.
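The two genomic features named in the abstract, genome size and GC content, are simple to compute from a sequence, and a curated dataset is then a filter over those values. The sketch below illustrates this under stated assumptions: the genome names and toy sequences are hypothetical, the 50% threshold is arbitrary, and none of it reflects how CPGminer is actually implemented.

```python
def gc_percent(sequence):
    """Percentage of G and C bases in a nucleotide sequence."""
    seq = sequence.upper()
    if not seq:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


# Hypothetical toy sequences standing in for complete genomes.
genomes = {
    "genome_a": "ATGCGCGTTA",
    "genome_b": "GGGCCCGCAT",
    "genome_c": "ATATATTTAA",
}

# One row of features per genome: size in base pairs and GC content.
table = {
    name: {"size_bp": len(seq), "gc_percent": gc_percent(seq)}
    for name, seq in genomes.items()
}

# Curate a subset: keep genomes with at least 50% GC content.
high_gc = sorted(
    name for name, row in table.items() if row["gc_percent"] >= 50.0
)
```

Real genome files contain ambiguity codes such as N, which this sketch counts toward the length but not toward G or C; a production tool would need an explicit policy for them.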
Affiliation(s)
- Jaehyun Kim
  - Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA
- Sunghyun Yoon
  - Division of Microbiology, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA
- Sandeep Kondakala
  - Division of Microbiology, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA
- Steven L. Foley
  - Division of Microbiology, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA
- Mark Hart
  - Division of Microbiology, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA
- Dong-Heon Baek
  - Department of Oral Microbiology and Immunology, School of Dentistry, Dankook University, Cheonan 31116, Republic of Korea
- Wenjun Wang
  - Department of Management, Marketing, and Technology, University of Arkansas at Little Rock, Little Rock, AR 72204, USA
- Sung-Kwan Kim
  - Department of Management, Marketing, and Technology, University of Arkansas at Little Rock, Little Rock, AR 72204, USA
- John B. Sutherland
  - Division of Microbiology, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA
- Seong-Jae Kim
  - Division of Microbiology, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA
- Ohgew Kweon
  - Division of Microbiology, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA
6
Rampogu S, Balasubramaniyam T, Lee JH. Curcumin Chalcone Derivatives Database (CCDD): a Python framework for natural compound derivatives database. PeerJ 2023; 11:e15885. [PMID: 37605747 PMCID: PMC10440061 DOI: 10.7717/peerj.15885] [Received: 03/20/2023] [Accepted: 07/20/2023] [Indexed: 08/23/2023]
Abstract
We built the Curcumin Chalcone Derivatives Database (CCDD) to enable the effective virtual screening of highly potent curcumin and its analogs. The two-dimensional (2D) structures were drawn using the ChemBioOffice package and converted to 3D structures using Discovery Studio Visualizer V 2021 (DS). The database was built using different Python modules. For the 3D structures, different Python packages were used to obtain the data frame of compounds. This framework is also used to visualize the compounds. The webserver enables the users to screen the compounds according to Lipinski's rule of five. The structures can be downloaded in .sdf and .mol format. The data frame (df) can be downloaded in .csv format. Our webserver can help computational drug discovery researchers find new therapeutics and build new webservers. The CCDD is freely available at: https://srampogu-ccdd-ccdd-8uldk8.streamlit.app/.
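Screening by Lipinski's rule of five, as mentioned in the abstract, checks four descriptors against fixed limits: molecular weight at most 500 Da, logP at most 5, at most 5 hydrogen-bond donors, and at most 10 hydrogen-bond acceptors. The sketch below applies those limits to precomputed descriptors. The compound names and descriptor values are hypothetical, and allowing one violation is a common convention assumed here rather than something the abstract states; this is not the CCDD code.

```python
def lipinski_violations(mol_weight, logp, h_donors, h_acceptors):
    """Count how many of the four rule-of-five limits a compound exceeds."""
    return sum([
        mol_weight > 500,   # molecular weight in Da
        logp > 5,           # octanol-water partition coefficient
        h_donors > 5,       # hydrogen-bond donors
        h_acceptors > 10,   # hydrogen-bond acceptors
    ])


def passes_rule_of_five(descriptors, max_violations=1):
    """True if the compound exceeds at most max_violations limits."""
    return lipinski_violations(**descriptors) <= max_violations


# Hypothetical descriptor records for two compounds in a library.
library = {
    "compound_A": {"mol_weight": 368.4, "logp": 3.2,
                   "h_donors": 2, "h_acceptors": 6},
    "compound_B": {"mol_weight": 612.7, "logp": 6.1,
                   "h_donors": 3, "h_acceptors": 8},
}

drug_like = [name for name, d in library.items() if passes_rule_of_five(d)]
```

In a real workflow the descriptors would be computed from the downloaded .sdf or .mol structures with a cheminformatics toolkit instead of being typed in by hand.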
Affiliation(s)
- Joon-Hwa Lee
  - Department of Chemistry, Gyeongsang National University, Jinju, Gyeongnam, South Korea