1
Proshad R, Rahim MA, Rahman M, Asif MR, Dey HC, Khurram D, Al MA, Islam M, Idris AM. Utilizing machine learning to evaluate heavy metal pollution in the world's largest mangrove forest. THE SCIENCE OF THE TOTAL ENVIRONMENT 2024; 951:175746. [PMID: 39182771 DOI: 10.1016/j.scitotenv.2024.175746] [Received: 03/27/2024] [Revised: 07/24/2024] [Accepted: 08/22/2024] [Indexed: 08/27/2024]
Abstract
The world's largest mangrove forest (Sundarbans) is facing an imminent threat from heavy metal pollution, posing grave ecological and human health risks. Developing an accurate predictive model for heavy metal content in this area has been challenging. In this study, we used machine learning techniques to model sediment pollution by heavy metals in this vital ecosystem. We collected 199 standardized sediment samples to predict the accumulation of eleven heavy metals using ten different machine learning algorithms. Among them, the extremely randomized tree model exhibited the best performance in predicting Fe (0.87), Cr (0.89), Zn (0.85), Ni (0.83), Cu (0.87), Co (0.62), As (0.68), and V (0.90), as measured by R² values. On the other hand, the random forest performed best for Cd (0.72) and Mn (0.91), whereas the decision tree model showed the best performance for Pb (0.73). The feature attribute analysis identified Fe-V, Cr-V, Cu-Zn, Co-Mn, Pb-Cd, and As-Cd relationships that agreed with the correlation coefficients among these metals. Based on the established models, the prediction of the contamination factor of metals in sediments showed very high Cd contamination (CF ≥ 6). The Moran's I values for Cd, Cr, Pb, and As were 0.71, 0.81, 0.71, and 0.67, respectively, indicating strong positive spatial autocorrelation and suggesting clustering of similar contamination levels. In conclusion, this research provides a comprehensive framework for predicting heavy metal sediment pollution in the Sundarbans, identifying key areas needing urgent conservation. Our findings support the adoption of integrated management strategies and targeted remedial actions to mitigate the harmful effects of heavy metal contamination in this vital ecosystem.
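The contamination factor and Moran's I quoted in this abstract can be illustrated with a small sketch. This is not the authors' code: the site values, the background concentration, and the binary neighbour weights are made-up assumptions, and only the CF ≥ 6 threshold comes from the abstract. The contamination factor is taken in its usual form (measured concentration divided by a background concentration) and Moran's I in its standard global form.

```python
def contamination_factor(concentration, background):
    """CF = measured concentration / background (reference) concentration."""
    return concentration / background


def morans_i(values, weights):
    """Global Moran's I for site values x_i and a spatial weight matrix w_ij."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_sum = sum(weights[i][j] for i in range(n) for j in range(n))
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    var = sum(d * d for d in dev)
    return (n / w_sum) * (cross / var)


# Hypothetical data: four sites along a transect; adjacent sites are neighbours.
cd = [0.2, 0.3, 2.1, 2.4]  # made-up Cd concentrations (mg/kg)
w = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]

print(morans_i(cd, w) > 0)                    # similar values sit next to each other
print(contamination_factor(2.4, 0.3) >= 6)    # hypothetical background of 0.3 mg/kg
```

A positive Moran's I means neighbouring sites tend to have similar values, which is how the abstract reads its values of 0.67 to 0.81; an alternating high/low pattern would give a negative value instead.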
Affiliation(s)
- Ram Proshad
- State Key Laboratory of Mountain Hazards and Engineering Safety, Institute of Mountain Hazards and Environment, Chinese Academy of Sciences, Chengdu 610041, Sichuan, China; University of Chinese Academy of Sciences, Beijing 100049, China.
- Md Abdur Rahim
- State Key Laboratory of Mountain Hazards and Engineering Safety, Institute of Mountain Hazards and Environment, Chinese Academy of Sciences, Chengdu 610041, Sichuan, China; University of Chinese Academy of Sciences, Beijing 100049, China; Department of Disaster Resilience and Engineering, Patuakhali Science and Technology University, Dumki, Patuakhali 8602, Bangladesh
- Mahfuzur Rahman
- Department of Civil Engineering, International University of Business Agriculture and Technology (IUBAT), Dhaka 1230, Bangladesh; Renewable Energy Research Institute, Kunsan National University, 558 Daehakro, Gunsan, Jeollabugdo, 54150, Republic of Korea
- Maksudur Rahman Asif
- College of Environmental Science & Engineering, Taiyuan University of Technology, Jinzhong City, China
- Hridoy Chandra Dey
- Department of Agronomy, Patuakhali Science and Technology University, Dumki, Patuakhali 8602, Bangladesh
- Dil Khurram
- State Key Laboratory of Mountain Hazards and Engineering Safety, Institute of Mountain Hazards and Environment, Chinese Academy of Sciences, Chengdu 610041, Sichuan, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Mamun Abdullah Al
- Environmental Microbiomics Research Center, School of Environmental Science and Engineering, Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai), State Key Laboratory for Biocontrol, Sun Yat-sen University, Guangzhou 510275, China; Aquatic Eco-Health Group, Fujian Key Laboratory of Watershed Ecology, Key Laboratory of Urban Environment and Health, Institute of Urban Environment, Chinese Academy of Sciences, Xiamen 361021, China
- Maksudul Islam
- Department of Environmental Science, Patuakhali Science and Technology University, Dumki, Patuakhali 8602, Bangladesh
- Abubakr M Idris
- Department of Chemistry, College of Science, King Khalid University, Abha 62529, Saudi Arabia.
2
Tang LJ, Li XK, Huang Y, Zhang XZ, Li BQ. Accurate and visualiable discrimination of Chenpi age using 2D-CNN and Grad-CAM++ based on infrared spectral images. Food Chem X 2024; 23:101759. [PMID: 39280221 PMCID: PMC11401106 DOI: 10.1016/j.fochx.2024.101759] [Received: 07/30/2024] [Revised: 08/19/2024] [Accepted: 08/20/2024] [Indexed: 09/18/2024]
Abstract
Dried tangerine peel ("Chenpi") has numerous clinical and nutritional benefits, with its quality being significantly influenced by its storage age, referred to as "Chen Jiu Zhe Liang" in Chinese. Consequently, the rapid and accurate identification of Chenpi's age is important for consumers. In this study, Fourier transform infrared spectroscopy (FTIR) was employed to capture spectral images of Chenpi. These FTIR images were then analyzed using a two-dimensional convolutional neural network (2D-CNN) model, achieving a discrimination accuracy of 97.92%. To address the "black box" nature of the 2D-CNN, Gradient-weighted Class Activation Mapping Plus Plus (Grad-CAM++) was utilized to highlight the important regions contributing to the model's performance. Additionally, six other machine learning models were developed using features identified by the 2D-CNN to validate their effectiveness. The results demonstrated that the combination of FTIR spectral images and 2D-CNN provides a highly effective method for accurately determining the age of Chenpi.
Affiliation(s)
- Li Jun Tang
- School of Pharmacy and Food Engineering, Wuyi University, Jiangmen, 529020, PR China
- Xin Kang Li
- School of Pharmacy and Food Engineering, Wuyi University, Jiangmen, 529020, PR China
- Yue Huang
- School of Pharmacy and Food Engineering, Wuyi University, Jiangmen, 529020, PR China
- Xiang-Zhi Zhang
- School of Pharmacy and Food Engineering, Wuyi University, Jiangmen, 529020, PR China
- Bao Qiong Li
- School of Pharmacy and Food Engineering, Wuyi University, Jiangmen, 529020, PR China
3
Sun D, Macedonia C, Chen Z, Chandrasekaran S, Najarian K, Zhou S, Cernak T, Ellingrod VL, Jagadish HV, Marini B, Pai M, Violi A, Rech JC, Wang S, Li Y, Athey B, Omenn GS. Can Machine Learning Overcome the 95% Failure Rate and Reality that Only 30% of Approved Cancer Drugs Meaningfully Extend Patient Survival? J Med Chem 2024. [PMID: 39253942 DOI: 10.1021/acs.jmedchem.4c01684] [Indexed: 09/11/2024]
Abstract
Despite implementing hundreds of strategies, cancer drug development suffers from a 95% failure rate over 30 years, with only 30% of approved cancer drugs extending patient survival beyond 2.5 months. Adding more criteria without eliminating nonessential ones is impractical and may fall into the "survivorship bias" trap. Machine learning (ML) models may enhance efficiency by saving time and cost. Yet, they may not improve success rate without identifying the root causes of failure. We propose a "STAR-guided ML system" (structure-tissue/cell selectivity-activity relationship) to enhance success rate and efficiency by addressing three overlooked interdependent factors: potency/specificity to the on/off-targets determining efficacy in tumors at clinical doses, on/off-target-driven tissue/cell selectivity influencing adverse effects in the normal organs at clinical doses, and optimal clinical doses balancing efficacy/safety as determined by potency/specificity and tissue/cell selectivity. STAR-guided ML models can directly predict clinical dose/efficacy/safety from five features to design/select the best drugs, enhancing success and efficiency of cancer drug development.
Affiliation(s)
- Zhigang Chen
- LabBotics.ai, Palo Alto, California 94303, United States
- Simon Zhou
- Aurinia Pharmaceuticals Inc., Rockville, Maryland 20850, United States
- Yan Li
- Translational Medicine and Clinical Pharmacology, Bristol Myers Squibb, Summit, New Jersey 07901, United States
4
Forrester MT, Egol JR, Ozbay S, Singh R, Tata PR. Topology-Driven Discovery of Transmembrane Protein S-Palmitoylation. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.09.08.611865. [PMID: 39282397 PMCID: PMC11398512 DOI: 10.1101/2024.09.08.611865] [Indexed: 09/19/2024]
Abstract
Protein S-palmitoylation is a reversible lipophilic posttranslational modification regulating a diverse number of signaling pathways. Within transmembrane proteins (TMPs), S-palmitoylation is implicated in conditions from inflammatory disorders to respiratory viral infections. Many small-scale experiments have observed S-palmitoylation at juxtamembrane Cys residues. However, most large-scale S-palmitoyl discovery efforts rely on trypsin-based proteomics within which hydrophobic juxtamembrane regions are likely underrepresented. Machine learning, by virtue of its freedom from experimental constraints, is particularly well suited to address this discovery gap surrounding TMP S-palmitoylation. Utilizing a UniProt-derived feature set, a gradient boosted machine learning tool (TopoPalmTree) was constructed and applied to a holdout dataset of viral S-palmitoylated proteins. Upon application to the mouse TMP proteome, 1591 putative S-palmitoyl sites (i.e. not listed in SwissPalm or UniProt) were identified. Two lung-expressed S-palmitoyl candidates (synaptobrevin Vamp5 and water channel Aquaporin-5) were experimentally assessed. Finally, TopoPalmTree was used for rational design of an S-palmitoyl site on KDEL-Receptor 2. This readily interpretable model aligns the innumerable small-scale experiments observing juxtamembrane S-palmitoylation into a proteomic tool for TMP S-palmitoyl discovery and design, thus facilitating future investigations of this important modification.
Affiliation(s)
- Michael T Forrester
- Division of Pulmonary, Allergy and Critical Care Medicine, Duke University School of Medicine, Durham, NC 27710
- Jacob R Egol
- Department of Cell Biology, Duke University School of Medicine, Durham, NC 27710
- Sinan Ozbay
- Department of Biostatistics and Bioinformatics, Duke University School of Medicine, Durham, NC 27710
- Rohit Singh
- Department of Cell Biology, Duke University School of Medicine, Durham, NC 27710
- Department of Biostatistics and Bioinformatics, Duke University School of Medicine, Durham, NC 27710
- Purushothama Rao Tata
- Division of Pulmonary, Allergy and Critical Care Medicine, Duke University School of Medicine, Durham, NC 27710
- Department of Cell Biology, Duke University School of Medicine, Durham, NC 27710
- Duke Regeneration Center, Duke University School of Medicine, Durham, NC 27710
5
Xiao Z, Zhu M, Chen J, You Z. Integrated Transfer Learning and Multitask Learning Strategies to Construct Graph Neural Network Models for Predicting Bioaccumulation Parameters of Chemicals. ENVIRONMENTAL SCIENCE & TECHNOLOGY 2024; 58:15650-15660. [PMID: 39051472 DOI: 10.1021/acs.est.4c02421] [Indexed: 07/27/2024]
Abstract
Accurate prediction of parameters related to the environmental exposure of chemicals is crucial for the sound management of chemicals. However, the lack of large data sets for training models may result in poor prediction accuracy and robustness. Herein, integrated transfer learning (TL) and multitask learning (MTL) was proposed for constructing a graph neural network (GNN) model (abbreviated as TL-MTL-GNN model) using n-octanol/water partition coefficients as a source domain. The TL-MTL-GNN model was trained to predict three bioaccumulation parameters based on enlarged data sets that cover 2496 compounds with at least one bioaccumulation parameter. Results show that the TL-MTL-GNN model outperformed single-task GNN models with and without the TL, as well as conventional machine learning models trained with molecular descriptors or fingerprints. Applicability domains were characterized by a state-of-the-art structure-activity landscape-based (abbreviated as ADSAL) methodology. The TL-MTL-GNN model coupled with the optimal ADSAL was employed to predict bioaccumulation parameters for around 60,000 chemicals, with more than 13,000 compounds identified as bioaccumulative chemicals. The high predictive accuracy and robustness of the TL-MTL-GNN model demonstrate the feasibility of integrating the TL and MTL strategy in modeling small-sized data sets. The strategy holds significant potential for addressing small data challenges in modeling environmental chemicals.
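Training one model on compounds that have "at least one" of several parameters implies that some labels are missing for some compounds. The abstract does not say how this is handled; a common approach in multitask learning, sketched below as an assumption rather than as the authors' method, is to average the loss only over the labels that are present. The function name and the toy numbers are hypothetical.

```python
def masked_mse(predictions, targets):
    """Mean squared error over the labels that are actually present.

    predictions, targets: one row per compound, one column per task.
    A target of None marks a parameter that was not measured.
    """
    total, count = 0.0, 0
    for pred_row, target_row in zip(predictions, targets):
        for pred, target in zip(pred_row, target_row):
            if target is None:
                continue  # unlabeled task: contributes nothing to the loss
            total += (pred - target) ** 2
            count += 1
    if count == 0:
        raise ValueError("no labelled entries")
    return total / count


# Hypothetical example: two compounds, three bioaccumulation-type tasks.
preds = [[1.0, 2.0, 3.0],
         [0.5, 0.5, 0.5]]
labels = [[1.5, None, 3.0],
          [None, None, 1.5]]
print(masked_mse(preds, labels))
```

With this kind of loss, a compound with a single measured parameter still contributes a training signal for that task without penalizing the model on the unmeasured ones.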
Affiliation(s)
- Zijun Xiao
- Key Laboratory of Industrial Ecology and Environmental Engineering (Ministry of Education), Dalian Key Laboratory on Chemicals Risk Control and Pollution Prevention Technology, School of Environmental Science and Technology, Dalian University of Technology, Dalian 116024, China
- Minghua Zhu
- Key Laboratory of Industrial Ecology and Environmental Engineering (Ministry of Education), Dalian Key Laboratory on Chemicals Risk Control and Pollution Prevention Technology, School of Environmental Science and Technology, Dalian University of Technology, Dalian 116024, China
- Key Laboratory of Integrated Regulation and Resources Development of Shallow Lakes of Ministry of Education, College of Environment, Hohai University, Nanjing 210098, China
- Jingwen Chen
- Key Laboratory of Industrial Ecology and Environmental Engineering (Ministry of Education), Dalian Key Laboratory on Chemicals Risk Control and Pollution Prevention Technology, School of Environmental Science and Technology, Dalian University of Technology, Dalian 116024, China
- Zecang You
- Key Laboratory of Industrial Ecology and Environmental Engineering (Ministry of Education), Dalian Key Laboratory on Chemicals Risk Control and Pollution Prevention Technology, School of Environmental Science and Technology, Dalian University of Technology, Dalian 116024, China
6
Hou P, Tian Y, Meng X. Improving Molecular-Dynamics Simulations for Solid-Liquid Interfaces with Machine-Learning Interatomic Potentials. Chemistry 2024; 30:e202401373. [PMID: 38877181 DOI: 10.1002/chem.202401373] [Received: 04/07/2024] [Revised: 06/13/2024] [Accepted: 06/14/2024] [Indexed: 06/16/2024]
Abstract
Emerging developments in artificial intelligence have opened infinite possibilities for material simulation. Depending on the powerful fitting of machine learning algorithms to first-principles data, machine learning interatomic potentials (MLIPs) can effectively balance the accuracy and efficiency problems in molecular dynamics (MD) simulations, serving as powerful tools in various complex physicochemical systems. Consequently, this brings unprecedented enthusiasm for researchers to apply such novel technology in multiple fields to revisit the major scientific problems that have remained controversial owing to the limitations of previous computational methods. Herein, we introduce the evolution of MLIPs, provide valuable application examples for solid-liquid interfaces, and present current challenges. Driven by solving multitudinous difficulties in terms of the accuracy, efficiency, and versatility of MLIPs, this booming technique, combined with molecular simulation methods, will provide an underlying and valuable understanding of interdisciplinary scientific challenges, including materials, physics, and chemistry.
Affiliation(s)
- Pengfei Hou
- Key Laboratory of Physics and Technology for Advanced Batteries (Ministry of Education), College of Physics, Jilin University, Changchun, 130012, China
- Key Laboratory of Material Simulation Methods and Software of Ministry of Education, College of Physics, Jilin University, Changchun, 130012, China
- Yumiao Tian
- Key Laboratory of Physics and Technology for Advanced Batteries (Ministry of Education), College of Physics, Jilin University, Changchun, 130012, China
- Key Laboratory of Material Simulation Methods and Software of Ministry of Education, College of Physics, Jilin University, Changchun, 130012, China
- Xing Meng
- Key Laboratory of Physics and Technology for Advanced Batteries (Ministry of Education), College of Physics, Jilin University, Changchun, 130012, China
- Key Laboratory of Material Simulation Methods and Software of Ministry of Education, College of Physics, Jilin University, Changchun, 130012, China
7
Manen-Freixa L, Antolin AA. Polypharmacology prediction: the long road toward comprehensively anticipating small-molecule selectivity to de-risk drug discovery. Expert Opin Drug Discov 2024; 19:1043-1069. [PMID: 39004919 DOI: 10.1080/17460441.2024.2376643] [Received: 03/15/2024] [Accepted: 07/02/2024] [Indexed: 07/16/2024]
Abstract
INTRODUCTION: Small molecules often bind to multiple targets, a behavior termed polypharmacology. Anticipating polypharmacology is essential for drug discovery since unknown off-targets can modulate safety and efficacy, profoundly affecting drug discovery success. Unfortunately, experimental methods to assess selectivity present significant limitations and drugs still fail in the clinic due to unanticipated off-targets. Computational methods are a cost-effective, complementary approach to predict polypharmacology. AREAS COVERED: This review aims to provide a comprehensive overview of the state of polypharmacology prediction and discuss its strengths and limitations, covering both classical cheminformatics methods and bioinformatic approaches. The authors review available data sources, paying close attention to their different coverage. The authors then discuss major algorithms grouped by the types of data that they exploit using selected examples. EXPERT OPINION: Polypharmacology prediction has made impressive progress over the last decades and contributed to the identification of many off-targets. However, data incompleteness currently prevents most approaches from comprehensively predicting selectivity. Moreover, our limited agreement on model assessment makes it difficult to identify the best algorithms, which at present show modest performance in prospective real-world applications. Despite these limitations, the exponential increase of multidisciplinary Big Data and AI holds much potential to improve polypharmacology prediction and de-risk drug discovery.
Affiliation(s)
- Leticia Manen-Freixa
- Oncobell Division, Bellvitge Biomedical Research Institute (IDIBELL) and ProCURE Department, Catalan Institute of Oncology (ICO), Barcelona, Spain
- Albert A Antolin
- Oncobell Division, Bellvitge Biomedical Research Institute (IDIBELL) and ProCURE Department, Catalan Institute of Oncology (ICO), Barcelona, Spain
- Center for Cancer Drug Discovery, The Division of Cancer Therapeutics, The Institute of Cancer Research, London, UK
8
Sultan A, Sieg J, Mathea M, Volkamer A. Transformers for Molecular Property Prediction: Lessons Learned from the Past Five Years. J Chem Inf Model 2024; 64:6259-6280. [PMID: 39136669 DOI: 10.1021/acs.jcim.4c00747] [Indexed: 08/27/2024]
Abstract
Molecular Property Prediction (MPP) is vital for drug discovery, crop protection, and environmental science. Over the last decades, diverse computational techniques have been developed, from using simple physical and chemical properties and molecular fingerprints in statistical models and classical machine learning to advanced deep learning approaches. In this review, we aim to distill insights from current research on employing transformer models for MPP. We analyze the currently available models and explore key questions that arise when training and fine-tuning a transformer model for MPP. These questions encompass the choice and scale of the pretraining data, optimal architecture selections, and promising pretraining objectives. Our analysis highlights areas not yet covered in current research, inviting further exploration to enhance the field's understanding. Additionally, we address the challenges in comparing different models, emphasizing the need for standardized data splitting and robust statistical analysis.
Affiliation(s)
- Afnan Sultan
- Data Driven Drug Design, Center for Bioinformatics, Saarland University, Saarbrücken 66123, Germany
- Andrea Volkamer
- Data Driven Drug Design, Center for Bioinformatics, Saarland University, Saarbrücken 66123, Germany
9
Le K, Radović JR, MacCallum JL, Larter SR, Van Humbeck JF. Machine Learning in Complex Organic Mixtures: Applying Domain Knowledge Allows for Meaningful Performance with Small Data Sets. J Am Chem Soc 2024; 146:22563-22569. [PMID: 39082215 DOI: 10.1021/jacs.4c06595] [Indexed: 08/15/2024]
Abstract
The ability to quantify individual components of complex mixtures is a challenge found throughout the life and physical sciences. An improved capacity to generate large data sets along with the uptake of machine-learning (ML)-based analysis tools has allowed for various "omics" disciplines to realize exceptional advances. Other areas of chemistry that deal with complex mixtures often do not leverage these advances. Environmental samples, for example, can be more difficult to access, and the resulting small data sets are less appropriate for unconstrained ML approaches. Herein, we present an approach to address this latter issue. Using a very small environmental data set (35 high-resolution mass spectra gathered from various solvent extractions of Canadian petroleum fractions), we show that the application of specific domain knowledge can lead to ML models with notable performance.
Affiliation(s)
- Katelyn Le
- Department of Chemistry, University of Calgary, Calgary, Alberta T2N 1N4, Canada
- Jagoš R Radović
- Center for Petroleum Geochemistry (UH-CPG), Department of Earth and Atmospheric Sciences, University of Houston, Houston, Texas 77204-5007, United States
- Justin L MacCallum
- Department of Chemistry, University of Calgary, Calgary, Alberta T2N 1N4, Canada
- Stephen R Larter
- Department of Earth, Energy, and Environment, University of Calgary, Calgary, Alberta T2N 1N4, Canada
10
Singh S, Bhardwaj S, Choudhary N, Patgiri R, Teramoto Y, Maji PK. Stimuli-Responsive Chiral Cellulose Nanocrystals Based Self-Assemblies for Security Measures to Prevent Counterfeiting: A Review. ACS APPLIED MATERIALS & INTERFACES 2024; 16:41743-41765. [PMID: 39102587 DOI: 10.1021/acsami.4c08290] [Indexed: 08/07/2024]
Abstract
The proliferation of misleading information and counterfeit products in conjunction with technical progress presents substantial worldwide issues. To address the issue of counterfeiting, many tactics, such as the use of luminous anticounterfeiting systems, have been investigated. Nevertheless, traditional fluorescent compounds have a restricted effectiveness. Cellulose nanocrystals (CNCs), known for their renewable nature and outstanding qualities, present an excellent opportunity to develop intelligent, optically active materials formed due to their self-assembly behavior and stimuli response. CNCs and their derivatives-based self-assemblies allow for the creation of adaptable luminous materials that may be used to prevent counterfeiting. These materials integrate the photophysical characteristics of optically active components due to their stimuli-responsive behavior, enabling their use in fibers, labels, films, hydrogels, and inks. Despite substantial attention, existing materials frequently fall short of practical criteria due to limited knowledge and poor performance comparisons. This review aims to provide information on the latest developments in anticounterfeit materials based on stimuli-responsive CNCs and derivatives. It also includes the scope of artificial intelligence (AI) in the near future. It will emphasize the potential uses of these materials and encourage future investigation in this rapidly growing area of study.
Affiliation(s)
- Shiva Singh
- Department of Polymer and Process Engineering, Indian Institute of Technology Roorkee, Saharanpur Campus, Saharanpur 240071, India
- Shakshi Bhardwaj
- Department of Polymer and Process Engineering, Indian Institute of Technology Roorkee, Saharanpur Campus, Saharanpur 240071, India
- Nitesh Choudhary
- Department of Polymer and Process Engineering, Indian Institute of Technology Roorkee, Saharanpur Campus, Saharanpur 240071, India
- Rohan Patgiri
- Department of Polymer and Process Engineering, Indian Institute of Technology Roorkee, Saharanpur Campus, Saharanpur 240071, India
- Yoshikuni Teramoto
- Division of Forest & Biomaterials Science, Graduate School of Agriculture, Kyoto University, Kitashirakawa Oiwake-cho, Sakyo-ku, Kyoto 6068502, Japan
- Pradip K Maji
- Department of Polymer and Process Engineering, Indian Institute of Technology Roorkee, Saharanpur Campus, Saharanpur 240071, India
11
Zhou J, Huang M. Navigating the landscape of enzyme design: from molecular simulations to machine learning. Chem Soc Rev 2024; 53:8202-8239. [PMID: 38990263 DOI: 10.1039/d4cs00196f] [Indexed: 07/12/2024]
Abstract
Global environmental issues and sustainable development call for new technologies for fine chemical synthesis and waste valorization. Biocatalysis has attracted great attention as the alternative to the traditional organic synthesis. However, it is challenging to navigate the vast sequence space to identify those proteins with admirable biocatalytic functions. The recent development of deep-learning based structure prediction methods such as AlphaFold2 reinforced by different computational simulations or multiscale calculations has largely expanded the 3D structure databases and enabled structure-based design. While structure-based approaches shed light on site-specific enzyme engineering, they are not suitable for large-scale screening of potential biocatalysts. Effective utilization of big data using machine learning techniques opens up a new era for accelerated predictions. Here, we review the approaches and applications of structure-based and machine-learning guided enzyme design. We also provide our view on the challenges and perspectives on effectively employing enzyme design approaches integrating traditional molecular simulations and machine learning, and the importance of database construction and algorithm development in attaining predictive ML models to explore the sequence fitness landscape for the design of admirable biocatalysts.
Affiliation(s)
- Jiahui Zhou
- School of Chemistry and Chemical Engineering, Queen's University, David Keir Building, Stranmillis Road, Belfast BT9 5AG, Northern Ireland, UK.
- Meilan Huang
- School of Chemistry and Chemical Engineering, Queen's University, David Keir Building, Stranmillis Road, Belfast BT9 5AG, Northern Ireland, UK.
12
Tajima M, Nagai Y, Chen S, Pan Z, Katayama K. A robust methodology for PEC performance analysis of photoanodes using machine learning and analytical data. Analyst 2024; 149:4193-4207. [PMID: 38984992 DOI: 10.1039/d4an00439f] [Indexed: 07/11/2024]
Abstract
Machine learning (ML) is increasingly applied across various fields, including chemistry, for molecular design and optimizing reaction parameters. Yet, applying ML to experimental data is challenging due to the limited number of synthesized samples, which restricts its broader application in device development. In energy harvesting, photoanodes are crucial for solar-driven water splitting, generating hydrogen and oxygen. We explored electrodes like hematite and bismuth vanadate for photocatalytic uses, noting varied photoelectrochemical performances despite similar preparations. To understand this variability, we applied a data-driven ML approach, predicting photocurrent values and identifying key performance influencers even with limited experimental data in the research development of inorganic devices. We have utilized multiple machine learning algorithms to predict the target value in the calculation process, where the contributions of the dominant descriptors were unknown. We introduced a novel methodology, incorporating clustering to manage multicollinearity from correlated analytical data and Shapley analysis for clear interpretation of contributions to performance prediction. This method was validated on hematite and bismuth vanadate, showing superior predictability and factor identification, and then extended to tungsten oxide and bismuth vanadate heterojunction photoanodes. Despite their complexity, our approach achieved coefficients of determination (R²) above 0.85, successfully pinpointing performance-determining factors, demonstrating the robustness of the new scheme in advancing photodevice research.
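The idea of clustering correlated descriptors before interpreting a model can be sketched as follows. The abstract does not name the clustering algorithm, so this is an illustrative stand-in, not the authors' procedure: descriptors whose absolute Pearson correlation reaches a threshold are merged into one group, so that a later attribution step (such as Shapley analysis) can be read per group. The descriptor names, the data, and the 0.9 threshold are hypothetical.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if one is constant)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def group_correlated(features, threshold=0.9):
    """Group descriptors linked by |r| >= threshold (connected components)."""
    names = list(features)
    parent = {n: n for n in names}

    def find(n):
        # Follow parent links to the group representative, compressing the path.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if abs(pearson(features[a], features[b])) >= threshold:
                parent[find(a)] = find(b)

    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(groups.values())


# Hypothetical analytical descriptors measured on four samples.
features = {
    "absorbance": [1.0, 2.0, 3.0, 4.0],
    "film_thickness": [2.0, 4.0, 6.0, 8.1],   # nearly proportional to absorbance
    "peak_shift": [4.0, 1.0, 3.0, 2.0],
}
print(group_correlated(features))
```

Grouping by connected components is a deliberately simple choice; hierarchical clustering on the correlation matrix would be a natural alternative when chains of moderately correlated descriptors should not be merged into one group.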
Affiliation(s)
- Moeko Tajima
- Department of Applied Chemistry, Chuo University, Tokyo 112-8551, Japan.
- Yuya Nagai
- Department of Applied Chemistry, Chuo University, Tokyo 112-8551, Japan.
- Siyan Chen
- Department of Applied Chemistry, Chuo University, Tokyo 112-8551, Japan.
- Zhenhua Pan
- Department of Applied Chemistry, Chuo University, Tokyo 112-8551, Japan.
- Kenji Katayama
- Department of Applied Chemistry, Chuo University, Tokyo 112-8551, Japan.
13
Hong C, Wu X, Huang J, Dai H. Biomimetic fusion: Platyper's dual vision for predicting protein-surface interactions. MATERIALS HORIZONS 2024; 11:3528-3538. [PMID: 38916578 DOI: 10.1039/d4mh00066h] [Indexed: 06/26/2024]
Abstract
Predicting protein binding to material surfaces remains a challenge. Here, a novel approach, the platypus dual perception neural network (Platyper), was developed to describe the interactions in protein-surface systems involving bioceramics with BMPs. The resulting model integrates a graph convolutional neural network (GCN) based on interatomic potentials with a convolutional neural network (CNN) model based on images of molecular structures. This dual-vision approach, inspired by the platypus's adaptive sensory system, addresses the challenge of accurately predicting the complex binding and unbinding dynamics in steered molecular dynamics (SMD) simulations. The model's effectiveness is demonstrated through its application in predicting surface interactions in protein-ligand systems. Notably, Platyper improves computational efficiency compared to classical SMD-based methods and overcomes the limitations of GNN-based methods for large-scale atomic simulations. The incorporation of heat maps enhances the model's interpretability, providing valuable insights into its predictive capabilities. Overall, Platyper represents a promising advancement in the accurate and efficient prediction of protein-surface interactions in the context of bioceramics and growth factors.
Affiliation(s)
- Chuhang Hong
  - State Key Laboratory of Advanced Technology for Materials Synthesis and Processing, Biomedical Materials and Engineering Research Center of Hubei Province, Wuhan University of Technology, Wuhan 430070, China.
- Xiaopei Wu
  - State Key Laboratory of Advanced Technology for Materials Synthesis and Processing, Biomedical Materials and Engineering Research Center of Hubei Province, Wuhan University of Technology, Wuhan 430070, China.
- Jian Huang
  - Materials Genome Institute, Shanghai University, Shanghai, 200444, China.
- Honglian Dai
  - State Key Laboratory of Advanced Technology for Materials Synthesis and Processing, Biomedical Materials and Engineering Research Center of Hubei Province, Wuhan University of Technology, Wuhan 430070, China.
  - Foshan Xianhu Laboratory of the Advanced Energy Science and Technology Guangdong Laboratory, Xianhu Hydrogen Valley, Foshan 528200, China
|
14
|
Song S, Xu X, Lan H, Gao L, Lin J, Du L, Wang Y. Design of Co-Cured Multi-Component Thermosets with Enhanced Heat Resistance, Toughness, and Processability via a Machine Learning Approach. Macromol Rapid Commun 2024:e2400337. [PMID: 39018478 DOI: 10.1002/marc.202400337] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/10/2024] [Revised: 06/30/2024] [Indexed: 07/19/2024]
Abstract
Designing heat-resistant thermosets with excellent comprehensive performance has been a long-standing challenge. Co-curing of various high-performance thermosets is an effective strategy; however, traditional trial-and-error experiments entail long research cycles for discovering new materials. Herein, a two-step machine learning (ML) assisted approach is proposed to design heat-resistant co-cured resins composed of polyimide (PI) and silicon-containing arylacetylene (PSA), that is, poly(silicon-alkyne imide) (PSI). First, two ML prediction models are established to evaluate the processability of PIs and their compatibility with PSA. Then, another two ML models are developed to predict the thermal decomposition temperature and flexural strength of the co-cured PSI resins. The optimal molecular structures and compositions of PSI resins are identified through high-throughput screening. The screened PSI resins are experimentally verified to exhibit enhanced heat resistance, toughness, and processability. The research framework established in this work can be generalized to the rational design of other advanced multi-component polymeric materials.
Affiliation(s)
- Shuang Song
  - Shanghai Key Laboratory of Advanced Polymeric Materials, Key Laboratory for Ultrafine Materials of Ministry of Education, Frontiers Science Center for Materiobiology and Dynamic Chemistry, School of Materials Science and Engineering, East China University of Science and Technology, Shanghai, 200237, China
- Xinyao Xu
  - Shanghai Key Laboratory of Advanced Polymeric Materials, Key Laboratory for Ultrafine Materials of Ministry of Education, Frontiers Science Center for Materiobiology and Dynamic Chemistry, School of Materials Science and Engineering, East China University of Science and Technology, Shanghai, 200237, China
- Haoxiang Lan
  - Shanghai Key Laboratory of Advanced Polymeric Materials, Key Laboratory for Ultrafine Materials of Ministry of Education, Frontiers Science Center for Materiobiology and Dynamic Chemistry, School of Materials Science and Engineering, East China University of Science and Technology, Shanghai, 200237, China
- Liang Gao
  - Shanghai Key Laboratory of Advanced Polymeric Materials, Key Laboratory for Ultrafine Materials of Ministry of Education, Frontiers Science Center for Materiobiology and Dynamic Chemistry, School of Materials Science and Engineering, East China University of Science and Technology, Shanghai, 200237, China
- Jiaping Lin
  - Shanghai Key Laboratory of Advanced Polymeric Materials, Key Laboratory for Ultrafine Materials of Ministry of Education, Frontiers Science Center for Materiobiology and Dynamic Chemistry, School of Materials Science and Engineering, East China University of Science and Technology, Shanghai, 200237, China
- Lei Du
  - Shanghai Key Laboratory of Advanced Polymeric Materials, Key Laboratory for Ultrafine Materials of Ministry of Education, Frontiers Science Center for Materiobiology and Dynamic Chemistry, School of Materials Science and Engineering, East China University of Science and Technology, Shanghai, 200237, China
- Yuyuan Wang
  - Shanghai Key Laboratory of Advanced Polymeric Materials, Key Laboratory for Ultrafine Materials of Ministry of Education, Frontiers Science Center for Materiobiology and Dynamic Chemistry, School of Materials Science and Engineering, East China University of Science and Technology, Shanghai, 200237, China
|
15
|
Abrofarakh M, Moghadam H, Abdulrahim HK. Investigation of direct contact membrane distillation (DCMD) performance using CFD and machine learning approaches. CHEMOSPHERE 2024; 357:141969. [PMID: 38604515 DOI: 10.1016/j.chemosphere.2024.141969] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/18/2024] [Revised: 03/24/2024] [Accepted: 04/08/2024] [Indexed: 04/13/2024]
Abstract
Direct Contact Membrane Distillation (DCMD) is emerging as an effective method for water desalination, known for its efficiency and adaptability. This study delves into the performance of DCMD by integrating two powerful analytical tools: Computational Fluid Dynamics (CFD) and Artificial Neural Networks (ANN). The research thoroughly examines the impact of various factors, such as inlet temperatures, velocities, channel heights, salt concentration, and membrane characteristics, on the process's efficiency, specifically calculating the water vapor flux. In a rigorous validation, the CFD model aligns well with established studies, ensuring reliability. Subsequently, over 1000 data points reflecting variations in input factors are used to train and validate the ANN. The training phase demonstrated high accuracy, with near-zero mean squared errors and R² values close to one, indicating a strong predictive capability. Further analysis after ANN training shed light on key relationships: higher membrane porosity boosts water vapor flux, whereas thicker membranes reduce it. Additionally, the study details how salt concentration, channel dimensions, inlet temperatures, and velocities significantly influence the distillation process. Finally, a mathematical model was proposed for water vapor flux as a function of key input factors. The results highlighted that salt mole fraction and hot water inlet temperature have the greatest effect on the water vapor flux. This comprehensive investigation contributes to the understanding of DCMD and emphasizes the potential of combining CFD and ANN for optimizing and innovating water desalination technology.
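The two accuracy metrics cited in this abstract, mean squared error and R², can be computed as in the minimal sketch below. The flux values are made-up illustration numbers, not data from the study, and the function names are hypothetical.

```python
def mean_squared_error(y_true, y_pred):
    """Average squared difference between reference and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical water vapor flux values (illustrative only):
# "reference" stands in for CFD results, "predicted" for ANN outputs.
flux_reference = [10.0, 20.0, 30.0, 40.0]
flux_predicted = [11.0, 19.0, 31.0, 39.0]

mse = mean_squared_error(flux_reference, flux_predicted)
r2 = r_squared(flux_reference, flux_predicted)
print(f"MSE = {mse:.3f}, R2 = {r2:.3f}")
```

A near-zero MSE together with an R² close to one on training data says the network fits the data it has seen; held-out data would be needed to judge generalization.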
Affiliation(s)
- Moslem Abrofarakh
  - Department of Chemical Engineering, Faculty of Engineering, University of Sistan and Baluchestan, Zahedan, Iran
- Hamid Moghadam
  - Department of Chemical Engineering, Faculty of Engineering, University of Sistan and Baluchestan, Zahedan, Iran.
- Hassan K Abdulrahim
  - Water Research Center (WRC), Kuwait Institute for Scientific Research (KISR), P.O. Box 24885, 13109, Safat, Kuwait
|
16
|
Tan J, Yang R, Xiao L, Xia Y, Qin W. Personalized decision support system for tailoring IgA nephropathy treatment strategies. Eur J Intern Med 2024; 124:69-77. [PMID: 38443263 DOI: 10.1016/j.ejim.2024.02.014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2023] [Revised: 01/06/2024] [Accepted: 02/04/2024] [Indexed: 03/07/2024]
Abstract
BACKGROUND: The ongoing debate surrounding the use of immunosuppressive treatments for IgA nephropathy (IgAN) underscores the demand for personalized and effective strategies.
METHODS: We analyzed data from 807 IgAN patients over 5+ years using three methods: Random Forest with molecular biomarkers, network biomarkers with graph engineering, and an auto-encoder model. All models were trained using identical demographic, clinical, and pathological data, employing an 80-20 split for training and testing purposes.
RESULTS: In the comprehensive assessment of IgAN prognosis, the Random Forest model, employing molecular biomarkers, demonstrated strong performance metrics (AUC = 0.83, sensitivity = 0.51, specificity = 0.96). However, traditional graph feature engineering on patient-specific networks outperformed these results with an AUC of 0.90, sensitivity of 0.64, and specificity of 0.94. The Auto-encoder model showed the best accuracy (AUC = 0.91, sensitivity = 0.46, specificity = 0.96). The findings highlighted the superior predictive capabilities of network biomarkers over molecular biomarkers for adverse renal outcome prediction in IgAN. Consequently, we integrated Auto-encoder-derived Network Biomarkers with Random Forest Models to enhance prognostic precision in diverse IgAN treatment scenarios. The prediction for the prognosis of patients receiving supportive care, glucocorticoid therapy, and immunosuppressant treatment yielded AUC values of 0.95, 0.96, and 1, respectively, indicating high specificity. Drawing from these insights, we pioneered the development of an innovative decision support model for IgAN treatment. This model demonstrated the ability to make medical decisions comparable to those by experienced nephrologists, enabling the customization of personalized disease management strategies.
CONCLUSION: Our system accurately predicted IgAN prognosis and evaluated various treatment efficacies, aiding physicians in devising optimal therapeutic strategies for patients.
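The three metrics reported in this abstract (AUC, sensitivity, specificity) can be computed as in the sketch below. The labels and scores are invented for illustration and are not taken from the study; the function names and the 0.5 decision threshold are assumptions.

```python
def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical outcomes (1 = adverse renal outcome) and model scores.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.7, 0.2, 0.2, 0.1, 0.1, 0.05]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

sens, spec = sensitivity_specificity(y_true, y_pred)
model_auc = auc(y_true, scores)
print(f"AUC = {model_auc:.2f}, sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

The toy data reproduce the pattern seen in the reported numbers: a model can rank cases well (high AUC) and rarely raise false alarms (high specificity) while still missing many true positives at a given threshold (low sensitivity).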
Affiliation(s)
- Jiaxing Tan
  - Division of Nephrology, Department of Medicine, West China Hospital, Sichuan University, Chengdu, Sichuan, China
- Rongxin Yang
  - College of Computer Science, Sichuan University, Chengdu, Sichuan, China
- Liyin Xiao
  - College of Computer Science, Sichuan University, Chengdu, Sichuan, China
- Yuanlin Xia
  - School of Mechanical Engineering, Sichuan University College of Computer Science, Sichuan University, Chengdu, China
- Wei Qin
  - Division of Nephrology, Department of Medicine, West China Hospital, Sichuan University, Chengdu, Sichuan, China.
|
17
|
Zahra MA, Al-Taher A, Alquhaidan M, Hussain T, Ismail I, Raya I, Kandeel M. The synergy of artificial intelligence and personalized medicine for the enhanced diagnosis, treatment, and prevention of disease. Drug Metab Pers Ther 2024; 39:47-58. [PMID: 38997240 DOI: 10.1515/dmpt-2024-0003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/10/2024] [Accepted: 06/17/2024] [Indexed: 07/14/2024]
Abstract
INTRODUCTION: The completion of the Human Genome Project in 2003 marked the beginning of a transformative era in medicine. This milestone laid the foundation for personalized medicine, an innovative approach that customizes healthcare treatments.
CONTENT: Central to the advancement of personalized medicine is the understanding of genetic variations and their impact on drug responses. The integration of artificial intelligence (AI) into drug response trials has been pivotal in this domain. These technologies excel in handling large-scale genomic datasets and patient histories, significantly improving diagnostic accuracy, disease prediction and drug discovery. They are particularly effective in addressing complex diseases such as cancer and genetic disorders. Furthermore, the advent of wearable technology, when combined with AI, propels personalized medicine forward by offering real-time health monitoring, which is crucial for early disease detection and management.
SUMMARY: The integration of AI into personalized medicine represents a significant advancement in healthcare, promising more accurate diagnoses, effective treatment plans and innovative drug discoveries.
OUTLOOK: As technology continues to evolve, the role of AI in enhancing personalized medicine and transforming the healthcare landscape is expected to grow exponentially. This synergy between AI and healthcare holds great promise for the future, potentially revolutionizing the way healthcare is delivered and experienced.
Affiliation(s)
- Mohammad Abu Zahra
  - Department of Biomolecular Sciences, College of Veterinary Medicine, King Faisal University, Al-Hofuf, Al-Ahsa, Saudi Arabia
- Abdulla Al-Taher
  - Department of Biomolecular Sciences, College of Veterinary Medicine, King Faisal University, Al-Hofuf, Al-Ahsa, Saudi Arabia
- Mohamed Alquhaidan
  - Department of Biomolecular Sciences, College of Veterinary Medicine, King Faisal University, Al-Hofuf, Al-Ahsa, Saudi Arabia
- Tarique Hussain
  - Animal Sciences Division, Nuclear Institute for Agriculture and Biology (NIAB), Faisalabad, Pakistan
- Izzeldin Ismail
  - Department of Biomolecular Sciences, College of Veterinary Medicine, King Faisal University, Al-Hofuf, Al-Ahsa, Saudi Arabia
- Indah Raya
  - Department of Chemistry, Faculty of Mathematics, and Natural Science, Hasanuddin University, Makassar, Indonesia
- Mahmoud Kandeel
  - Department of Biomolecular Sciences, College of Veterinary Medicine, King Faisal University, Al-Hofuf, Al-Ahsa, Saudi Arabia
  - Department of Pharmacology, Faculty of Veterinary Medicine, Kafrelshikh University, Kafrelshikh, Egypt
|
18
|
Li P, Dong L, Li C, Li Y, Zhao J, Peng B, Wang W, Zhou S, Liu W. Machine Learning to Promote Efficient Screening of Low-Contact Electrode for 2D Semiconductor Transistor Under Limited Data. ADVANCED MATERIALS (DEERFIELD BEACH, FLA.) 2024; 36:e2312887. [PMID: 38606800 DOI: 10.1002/adma.202312887] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2023] [Revised: 03/09/2024] [Indexed: 04/13/2024]
Abstract
Low-barrier and high-injection electrodes are crucial for high-performance (HP) 2D semiconductor devices. Conventional trial-and-error methodologies for electrode material screening are impractical because of their low efficiency and arbitrary specificity. Although machine learning has emerged as a promising alternative to tackle this problem, its practical application in semiconductor devices is hindered by its substantial data requirements. In this paper, a comprehensive scheme combining an autoencoding regularized adversarial neural network and a feature-adaptive variational active learning algorithm for screening low-contact electrode materials for 2D semiconductor transistors with limited data is proposed. The proposed scheme exhibits exceptional performance by training with only 15% of the total data points, where the mean square errors are 0.17 and 0.27 eV for the vertical and lateral Schottky barrier, respectively, and 2.88% for tunneling probability. Further, it exhibits an optimal predictive performance for 100 randomly sampled training datasets, reveals the underlying physical insight based on the identified features, and realizes continual improvement by employing detailed density-of-states descriptors. Finally, the empirical evaluations of the transport characteristics are conducted and verified by constructing MOSFET devices. These findings demonstrate the considerable potential of machine-learning techniques for screening high-efficiency electrode materials and constructing HP 2D semiconductor devices.
Affiliation(s)
- Penghui Li
  - Shaanxi Province Key Laboratory of Thin Films Technology and Optical Test, Xi'an Technological University, Xi'an, 710032, China
  - School of Opto-electronical Engineering, Xi'an Technological University, Xi'an, 710032, China
- Linpeng Dong
  - Shaanxi Province Key Laboratory of Thin Films Technology and Optical Test, Xi'an Technological University, Xi'an, 710032, China
  - School of Opto-electronical Engineering, Xi'an Technological University, Xi'an, 710032, China
- Chong Li
  - Xi'an Xiangteng Microelectronics Technology Co., Ltd, Xi'an, 710075, China
- Yan Li
  - Shaanxi Province Key Laboratory of Thin Films Technology and Optical Test, Xi'an Technological University, Xi'an, 710032, China
  - School of Opto-electronical Engineering, Xi'an Technological University, Xi'an, 710032, China
- Jie Zhao
  - Shaanxi Province Key Laboratory of Thin Films Technology and Optical Test, Xi'an Technological University, Xi'an, 710032, China
  - School of Opto-electronical Engineering, Xi'an Technological University, Xi'an, 710032, China
- Bo Peng
  - Key Laboratory of Wide Band-Gap Semiconductor Materials and Devices, School of Microelectronics, Xidian University, Xi'an, 710071, China
- Wei Wang
  - School of Opto-electronical Engineering, Xi'an Technological University, Xi'an, 710032, China
- Shun Zhou
  - Shaanxi Province Key Laboratory of Thin Films Technology and Optical Test, Xi'an Technological University, Xi'an, 710032, China
  - School of Opto-electronical Engineering, Xi'an Technological University, Xi'an, 710032, China
- Weiguo Liu
  - Shaanxi Province Key Laboratory of Thin Films Technology and Optical Test, Xi'an Technological University, Xi'an, 710032, China
  - School of Opto-electronical Engineering, Xi'an Technological University, Xi'an, 710032, China
|
19
|
van Tilborg D, Brinkmann H, Criscuolo E, Rossen L, Özçelik R, Grisoni F. Deep learning for low-data drug discovery: Hurdles and opportunities. Curr Opin Struct Biol 2024; 86:102818. [PMID: 38669740 DOI: 10.1016/j.sbi.2024.102818] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/29/2024] [Revised: 03/27/2024] [Accepted: 03/29/2024] [Indexed: 04/28/2024]
Abstract
Deep learning is becoming increasingly relevant in drug discovery, from de novo design to protein structure prediction and synthesis planning. However, it is often challenged by the small data regimes typical of certain drug discovery tasks. In such scenarios, deep learning approaches, which are notoriously 'data-hungry', might fail to live up to their promise. Developing novel approaches to leverage the power of deep learning in low-data scenarios is attracting great attention, and future developments are expected to propel the field further. This mini-review provides an overview of recent low-data-learning approaches in drug discovery, analyzing their hurdles and advantages. Finally, we venture to provide a forecast of future research directions in low-data learning for drug discovery.
Affiliation(s)
- Derek van Tilborg
  - Institute for Complex Molecular Systems (ICMS), Department of Biomedical Engineering, Eindhoven University of Technology, PO Box 513, 5600 MB Eindhoven, the Netherlands; Centre for Living Technologies, Alliance TU/e, WUR, UU, UMC Utrecht, Princetonlaan 6, 3584 CB, Utrecht, the Netherlands. https://twitter.com/DerekvTilborg
- Helena Brinkmann
  - Institute for Complex Molecular Systems (ICMS), Department of Biomedical Engineering, Eindhoven University of Technology, PO Box 513, 5600 MB Eindhoven, the Netherlands. https://twitter.com/hlnbrkmnn
- Emanuele Criscuolo
  - Institute for Complex Molecular Systems (ICMS), Department of Biomedical Engineering, Eindhoven University of Technology, PO Box 513, 5600 MB Eindhoven, the Netherlands. https://twitter.com/emanuelecriscu9
- Luke Rossen
  - Institute for Complex Molecular Systems (ICMS), Department of Biomedical Engineering, Eindhoven University of Technology, PO Box 513, 5600 MB Eindhoven, the Netherlands. https://twitter.com/molecular_ml
- Rıza Özçelik
  - Institute for Complex Molecular Systems (ICMS), Department of Biomedical Engineering, Eindhoven University of Technology, PO Box 513, 5600 MB Eindhoven, the Netherlands; Centre for Living Technologies, Alliance TU/e, WUR, UU, UMC Utrecht, Princetonlaan 6, 3584 CB, Utrecht, the Netherlands. https://twitter.com/Rza_ozcelik
- Francesca Grisoni
  - Institute for Complex Molecular Systems (ICMS), Department of Biomedical Engineering, Eindhoven University of Technology, PO Box 513, 5600 MB Eindhoven, the Netherlands; Centre for Living Technologies, Alliance TU/e, WUR, UU, UMC Utrecht, Princetonlaan 6, 3584 CB, Utrecht, the Netherlands.
|
20
|
Khan MK, Raza M, Shahbaz M, Hussain I, Khan MF, Xie Z, Shah SSA, Tareen AK, Bashir Z, Khan K. The recent advances in the approach of artificial intelligence (AI) towards drug discovery. Front Chem 2024; 12:1408740. [PMID: 38882215 PMCID: PMC11176507 DOI: 10.3389/fchem.2024.1408740] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/29/2024] [Accepted: 04/26/2024] [Indexed: 06/18/2024] Open
Abstract
Artificial intelligence (AI) has recently emerged as a distinctive force that is playing an important role in the development of medicine. AI is showing potential for unprecedented advances in accuracy and efficiency, and it has the potential to revolutionize drug discovery. However, AI also has limitations, and experts should be aware of the associated data access and ethical issues. The use of AI techniques for drug discovery applications has increased considerably over the past few years, including combinatorial QSAR and QSPR, virtual screening, and de novo drug design. The purpose of this survey is to give a general overview of drug discovery based on artificial intelligence and its associated applications. We also highlight the gaps present in traditional drug design methods. In addition, potential strategies and approaches to overcome current challenges are discussed to address the constraints of AI within this field. We hope that this survey contributes to a comprehensive understanding of the potential of AI in drug discovery.
Affiliation(s)
- Mahroza Kanwal Khan
  - College of Chemistry and Environmental Engineering, Shenzhen University, Shenzhen, China
- Mohsin Raza
  - Additive Manufacturing Institute, Shenzhen University, Shenzhen, China
- Muhammad Shahbaz
  - Additive Manufacturing Institute, Shenzhen University, Shenzhen, China
- Iftikhar Hussain
  - Department of Mechanical Engineering, City University of Hong Kong, Kowloon, Hong Kong SAR, China
  - A. J. Drexel Nanomaterials Institute and Department of Materials Science and Engineering, Drexel University, Philadelphia, PA, United States
- Muhammad Farooq Khan
  - Department of Electrical Engineering, Sejong University, Seoul, Republic of Korea
- Zhongjian Xie
  - Shenzhen Children's Hospital, Clinical Medical College of Southern University of Science and Technology, Shenzhen, China
- Syed Shoaib Ahmad Shah
  - Department of Chemistry, School of Natural Sciences, National University of Sciences and Technology, Islamabad, Pakistan
- Ayesha Khan Tareen
  - School of Mechanical Engineering, Dongguan University of Technology, Dongguan, China
- Zoobia Bashir
  - College of Chemistry and Environmental Engineering, Shenzhen University, Shenzhen, China
- Karim Khan
  - Additive Manufacturing Institute, Shenzhen University, Shenzhen, China
|
21
|
Fooladi H, Hirte S, Kirchmair J. Quantifying the Hardness of Bioactivity Prediction Tasks for Transfer Learning. J Chem Inf Model 2024; 64:4031-4046. [PMID: 38739465 PMCID: PMC11134514 DOI: 10.1021/acs.jcim.4c00160] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/29/2024] [Revised: 04/24/2024] [Accepted: 04/24/2024] [Indexed: 05/16/2024]
Abstract
Today, machine learning methods are widely employed in drug discovery. However, the chronic lack of data continues to hamper their further development, validation, and application. Several modern strategies aim to mitigate the challenges associated with data scarcity by learning from data on related tasks. These knowledge-sharing approaches encompass transfer learning, multitask learning, and meta-learning. A key question remaining to be answered for these approaches is about the extent to which their performance can benefit from the relatedness of available source (training) tasks; in other words, how difficult ("hard") a test task is to a model, given the available source tasks. This study introduces a new method for quantifying and predicting the hardness of a bioactivity prediction task based on its relation to the available training tasks. The approach involves the generation of protein and chemical representations and the calculation of distances between the bioactivity prediction task and the available training tasks. In the example of meta-learning on the FS-Mol data set, we demonstrate that the proposed task hardness metric is inversely correlated with performance (Pearson's correlation coefficient r = -0.72). The metric will be useful in estimating the task-specific gain in performance that can be achieved through meta-learning.
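The inverse relationship reported in this abstract is a Pearson correlation between a per-task hardness score and model performance. The sketch below shows how such a coefficient is computed; the hardness and performance values are invented for illustration and do not come from the FS-Mol experiments, and `pearson_r` is a hypothetical helper name.

```python
import math


def pearson_r(x, y):
    """Pearson correlation: covariance divided by the product of spreads."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Hypothetical per-task values (illustrative only): a higher hardness score
# is paired with lower predictive performance on most tasks.
hardness = [0.1, 0.3, 0.5, 0.7, 0.9]
performance = [0.90, 0.85, 0.60, 0.65, 0.40]

r = pearson_r(hardness, performance)
print(f"Pearson r = {r:.2f}")
```

A negative coefficient means that tasks scored as harder tend to be predicted less well, which is the behavior a useful hardness metric should show.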
Affiliation(s)
- Hosein Fooladi
  - Department of Pharmaceutical Sciences, Division of Pharmaceutical Chemistry, Faculty of Life Sciences, University of Vienna, Josef-Holaubek-Platz 2, 1090 Vienna, Austria
  - Christian Doppler Laboratory for Molecular Informatics in the Biosciences, Department for Pharmaceutical Sciences, University of Vienna, 1090 Vienna, Austria
  - Vienna Doctoral School of Pharmaceutical, Nutritional and Sport Sciences (PhaNuSpo), University of Vienna, 1090 Vienna, Austria
- Steffen Hirte
  - Department of Pharmaceutical Sciences, Division of Pharmaceutical Chemistry, Faculty of Life Sciences, University of Vienna, Josef-Holaubek-Platz 2, 1090 Vienna, Austria
  - Vienna Doctoral School of Pharmaceutical, Nutritional and Sport Sciences (PhaNuSpo), University of Vienna, 1090 Vienna, Austria
- Johannes Kirchmair
  - Department of Pharmaceutical Sciences, Division of Pharmaceutical Chemistry, Faculty of Life Sciences, University of Vienna, Josef-Holaubek-Platz 2, 1090 Vienna, Austria
  - Christian Doppler Laboratory for Molecular Informatics in the Biosciences, Department for Pharmaceutical Sciences, University of Vienna, 1090 Vienna, Austria
|
22
|
Vecchi E, Bassetti D, Graziato F, Pospíšil L, Horenko I. Gauge-Optimal Approximate Learning for Small Data Classification. Neural Comput 2024; 36:1198-1227. [PMID: 38669692 DOI: 10.1162/neco_a_01664] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/01/2023] [Accepted: 01/16/2024] [Indexed: 04/28/2024]
Abstract
Small data learning problems are characterized by a significant discrepancy between the limited number of response variable observations and the large feature space dimension. In this setting, the common learning tools struggle to identify the features important for the classification task from those that bear no relevant information and cannot derive an appropriate learning rule that allows discriminating among different classes. As a potential solution to this problem, here we exploit the idea of reducing and rotating the feature space in a lower-dimensional gauge and propose the gauge-optimal approximate learning (GOAL) algorithm, which provides an analytically tractable joint solution to the dimension reduction, feature segmentation, and classification problems for small data learning problems. We prove that the optimal solution of the GOAL algorithm consists in piecewise-linear functions in the Euclidean space and that it can be approximated through a monotonically convergent algorithm that presents, under the assumption of a discrete segmentation of the feature space, a closed-form solution for each optimization substep and an overall linear iteration cost scaling. The GOAL algorithm has been compared to other state-of-the-art machine learning tools on both synthetic data and challenging real-world applications from climate science and bioinformatics (i.e., prediction of the El Niño Southern Oscillation and inference of epigenetically induced gene-activity networks from limited experimental data). The experimental results show that the proposed algorithm outperforms the reported best competitors for these problems in both learning performance and computational cost.
Affiliation(s)
- Edoardo Vecchi
  - Università della Svizzera Italiana, Faculty of Informatics, Institute of Computing, 6962 Lugano, Switzerland
- Davide Bassetti
  - Technical University of Kaiserslautern, Faculty of Mathematics, Group of Mathematics of AI, 67663 Kaiserslautern, Germany
- Lukáš Pospíšil
  - VSB Ostrava, Department of Mathematics, Ludvika Podeste 1875/17 708 33 Ostrava, Czech Republic
- Illia Horenko
  - Technical University of Kaiserslautern, Faculty of Mathematics, Group of Mathematics of AI, 67663 Kaiserslautern, Germany
|
23
|
Focke K, De Santis M, Wolter M, Martinez B JA, Vallet V, Pereira Gomes AS, Olejniczak M, Jacob CR. Interoperable workflows by exchanging grid-based data between quantum-chemical program packages. J Chem Phys 2024; 160:162503. [PMID: 38686818 DOI: 10.1063/5.0201701] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2024] [Accepted: 04/02/2024] [Indexed: 05/02/2024] Open
Abstract
Quantum-chemical subsystem and embedding methods require complex workflows that may involve multiple quantum-chemical program packages. Moreover, such workflows require the exchange of voluminous data that go beyond simple quantities, such as molecular structures and energies. Here, we describe our approach for addressing this interoperability challenge by exchanging electron densities and embedding potentials as grid-based data. We describe the approach that we have implemented to this end in a dedicated code, PyEmbed, currently part of a Python scripting framework. We discuss how it has facilitated the development of quantum-chemical subsystem and embedding methods and highlight several applications that have been enabled by PyEmbed, including wave-function theory (WFT) in density-functional theory (DFT) embedding schemes mixing non-relativistic and relativistic electronic structure methods, real-time time-dependent DFT-in-DFT approaches, the density-based many-body expansion, and workflows including real-space data analysis and visualization. Our approach demonstrates, in particular, the merits of exchanging (complex) grid-based data and, in general, the potential of modular software development in quantum chemistry, which hinges upon libraries that facilitate interoperability.
Affiliation(s)
- Kevin Focke
- Institute of Physical and Theoretical Chemistry, Technische Universität Braunschweig, Gaußstraße 17, 38106 Braunschweig, Germany
- Matteo De Santis
- CNRS, UMR 8523-PhLAM-Physique des Lasers Atomes et Molécules, Univ. Lille, F-59000 Lille, France
- Mario Wolter
- Institute of Physical and Theoretical Chemistry, Technische Universität Braunschweig, Gaußstraße 17, 38106 Braunschweig, Germany
- Jessica A Martinez B
- CNRS, UMR 8523-PhLAM-Physique des Lasers Atomes et Molécules, Univ. Lille, F-59000 Lille, France
- Department of Chemistry, Rutgers University, Newark, New Jersey 07102, USA
- Valérie Vallet
- CNRS, UMR 8523-PhLAM-Physique des Lasers Atomes et Molécules, Univ. Lille, F-59000 Lille, France
- Małgorzata Olejniczak
- Centre of New Technologies, University of Warsaw, S. Banacha 2c, 02-097 Warsaw, Poland
- Christoph R Jacob
- Institute of Physical and Theoretical Chemistry, Technische Universität Braunschweig, Gaußstraße 17, 38106 Braunschweig, Germany
|
24
|
Gou Q, Liu J, Su H, Guo Y, Chen J, Zhao X, Pu X. Exploring an accurate machine learning model to quickly estimate stability of diverse energetic materials. iScience 2024; 27:109452. [PMID: 38523799 PMCID: PMC10960145 DOI: 10.1016/j.isci.2024.109452] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/18/2023] [Revised: 01/27/2024] [Accepted: 03/06/2024] [Indexed: 03/26/2024] Open
Abstract
High energy and low sensitivity have been the focus of developing new energetic materials (EMs). However, a quick and accurate method for evaluating the stability of diverse EMs has been lacking. Here, we develop a high-accuracy machine learning model for predicting the bond dissociation energy (BDE) of EMs. A reliable and representative BDE dataset of EMs is constructed by collecting 778 experimental energetic compounds and performing quantum mechanics calculations. To sufficiently characterize the BDE of EMs, a hybrid feature representation is proposed that couples the local target bond with the global structure characteristics. To alleviate the limitation of the small dataset, pairwise difference regression is used as a data augmentation technique, with the advantage of reducing systematic errors and improving diversity. Benefiting from these improvements, the XGBoost model achieves the best prediction accuracy, with an R² of 0.98 and an MAE of 8.8 kJ mol⁻¹, significantly outperforming competing models.
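To make the pairwise difference regression idea concrete, here is a minimal Python sketch. It assumes a single hypothetical numeric descriptor per compound and a slope-only linear model on differences as a stand-in for the XGBoost model and hybrid features; the abstract does not give the exact formulation, and all values are illustrative rather than taken from the study.

```python
from itertools import permutations


def pairwise_differences(xs, ys):
    """Turn n labelled samples into n * (n - 1) ordered (dx, dy) pairs."""
    samples = list(zip(xs, ys))
    return [(xi - xj, yi - yj) for (xi, yi), (xj, yj) in permutations(samples, 2)]


def fit_difference_slope(pairs):
    """Least-squares slope through the origin for dy ~ slope * dx."""
    return sum(dx * dy for dx, dy in pairs) / sum(dx * dx for dx, _ in pairs)


def predict(x_new, xs, ys, slope):
    """Average the estimates anchored at each training sample."""
    estimates = [yj + slope * (x_new - xj) for xj, yj in zip(xs, ys)]
    return sum(estimates) / len(estimates)


# Hypothetical descriptor values and BDEs in kJ/mol (illustrative only).
descriptor = [1.0, 2.0, 4.0]
bde = [210.0, 230.0, 270.0]

pairs = pairwise_differences(descriptor, bde)
slope = fit_difference_slope(pairs)
estimate = predict(3.0, descriptor, bde, slope)
```

Three samples yield six ordered pairs, which is the augmentation effect. Any constant offset shared by all training labels cancels in the difference targets, which is one plausible reading of the reduced systematic error mentioned in the abstract.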
Affiliation(s)
- Qiaolin Gou
- College of Chemistry, Sichuan University, Chengdu 610064, China
- Jing Liu
- College of Chemistry, Sichuan University, Chengdu 610064, China
- Haoming Su
- College of Chemistry, Sichuan University, Chengdu 610064, China
- Yanzhi Guo
- College of Chemistry, Sichuan University, Chengdu 610064, China
- Jiayi Chen
- College of Chemistry, Sichuan University, Chengdu 610064, China
- Xueyan Zhao
- Institute of Chemical Materials, China Academy of Engineering Physics, Mianyang 621900, China
- Xuemei Pu
- College of Chemistry, Sichuan University, Chengdu 610064, China
|
25
|
Khiari Z. Recent Developments in Bio-Ink Formulations Using Marine-Derived Biomaterials for Three-Dimensional (3D) Bioprinting. Mar Drugs 2024; 22:134. [PMID: 38535475 PMCID: PMC10971850 DOI: 10.3390/md22030134] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2024] [Revised: 03/12/2024] [Accepted: 03/13/2024] [Indexed: 05/01/2024] Open
Abstract
3D bioprinting is a disruptive, computer-aided, additive manufacturing technology that builds complex 3D structures layer by layer. This technology is believed to offer tremendous opportunities in several fields, including the biomedical, pharmaceutical, and food industries. Several bioprinting processes and bio-ink materials have emerged recently. However, there is still a pressing need to develop low-cost sustainable bio-ink materials with superior qualities (excellent mechanical, viscoelastic and thermal properties, biocompatibility, and biodegradability). Marine-derived biomaterials, including polysaccharides and proteins, represent a viable and renewable source for bio-ink formulations. Therefore, this review focuses on the use of marine-derived biomaterials in bio-ink formulations. It starts with a general overview of 3D bioprinting processes, followed by a description of the most commonly used marine-derived biomaterials for 3D bioprinting, with special attention paid to chitosan, glycosaminoglycans, alginate, carrageenan, collagen, and gelatin. The challenges facing the application of marine-derived biomaterials in 3D bioprinting within the biomedical and pharmaceutical fields, along with future directions, are also discussed.
Affiliation(s)
- Zied Khiari
- National Research Council of Canada, Aquatic and Crop Resource Development Research Centre, 1411 Oxford Street, Halifax, NS B3H 3Z1, Canada
|
26
|
Mahato KD, Kumar U. Optimized Machine learning techniques Enable prediction of organic dyes photophysical Properties: Absorption Wavelengths, emission Wavelengths, and quantum yields. SPECTROCHIMICA ACTA. PART A, MOLECULAR AND BIOMOLECULAR SPECTROSCOPY 2024; 308:123768. [PMID: 38134661 DOI: 10.1016/j.saa.2023.123768] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/03/2023] [Revised: 12/05/2023] [Accepted: 12/12/2023] [Indexed: 12/24/2023]
Abstract
Applications of organic dyes, ranging from basic research to industry, are functions of their photophysical properties. Two important aspects need to be addressed: (1) knowledge of the photophysical properties of existing dyes long before real applications and (2) discovery of new organic dyes with desired photophysical properties, either to upgrade existing applications or to develop new ones. These two cases share the common goal of estimating photophysical properties with high accuracy, at minimum cost in time and money, long before laboratory experiments. For this purpose, machine learning-based techniques are the most suitable approach. In this study, we applied optimized machine-learning techniques to a dataset of 3066 organic dyes and evaluated the models using three metrics: root mean squared error (RMSE), mean absolute error (MAE), and the coefficient of determination (R²). The Quadratic Support Vector Machine (QSVM) was the best predictive model, with RMSE = 16.614, MAE = 10.837, and R² = 0.961 for absorption wavelengths and RMSE = 23.636, MAE = 16.278, and R² = 0.929 for emission wavelengths. These R² values are 0.7% and 0.4% greater than the Gradient Boost Regression Tree (GBRT) model's recently reported values of 0.954 and 0.925 for absorption and emission wavelengths, respectively. Furthermore, we estimated the quantum yield and found that the Coarse Gaussian Support Vector Machine (CGSVM) outperformed all examined models. To further validate these models, we compared the predicted results with experimental results for selected dyes. The proposed automated approach can be used for predicting photophysical properties without much computer programming knowledge.
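For readers unfamiliar with the three evaluation metrics named in this abstract, the following Python sketch computes them from their standard definitions. The wavelength values are hypothetical and chosen only for illustration; they are not data from the study.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical absorption wavelengths in nm (illustrative only).
measured = [450.0, 500.0, 550.0, 600.0]
predicted = [460.0, 495.0, 540.0, 605.0]

scores = {
    "RMSE": rmse(measured, predicted),
    "MAE": mae(measured, predicted),
    "R2": r2(measured, predicted),
}
```

RMSE and MAE carry the units of the target (nm here), whereas R² is dimensionless, so only R² is directly comparable across properties with different scales.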
Affiliation(s)
- Kapil Dev Mahato
- Department of Physics, National Institute of Technology Jamshedpur, Jharkhand 831014, India
- Uday Kumar
- Department of Physics, National Institute of Technology Jamshedpur, Jharkhand 831014, India
|
27
|
Novais Â, Gonçalves AB, Ribeiro TG, Freitas AR, Méndez G, Mancera L, Read A, Alves V, López-Cerero L, Rodríguez-Baño J, Pascual Á, Peixe L. Development and validation of a quick, automated, and reproducible ATR FT-IR spectroscopy machine-learning model for Klebsiella pneumoniae typing. J Clin Microbiol 2024; 62:e0121123. [PMID: 38284762 PMCID: PMC10865814 DOI: 10.1128/jcm.01211-23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/18/2023] [Accepted: 12/18/2023] [Indexed: 01/30/2024] Open
Abstract
The reliability of Fourier-transform infrared (FT-IR) spectroscopy for Klebsiella pneumoniae typing and outbreak control has been previously assessed, but issues remain in standardization and reproducibility. We developed and validated a reproducible FT-IR with attenuated total reflectance (ATR) workflow for the identification of K. pneumoniae lineages. We used 293 isolates representing multidrug-resistant K. pneumoniae lineages causing outbreaks worldwide (2002-2021) to train a random forest (RF) classification model based on capsular (KL)-type discrimination. This model was validated with 280 contemporaneous isolates (2021-2022), using wzi sequencing and whole-genome sequencing as references. Repeatability and reproducibility were tested in different culture media and instruments over time. Our RF model allowed the classification of 33 capsular (KL)-types and up to 36 clinically relevant K. pneumoniae lineages based on the discrimination of specific KL- and O-type combinations. We obtained high rates of accuracy (89%), sensitivity (88%), and specificity (92%), including from cultures obtained directly from the clinical sample, allowing typing information to be obtained the same day bacteria are identified. The workflow was reproducible in different instruments over time (>98% correct predictions). Direct colony application, spectral acquisition, and automated KL prediction through Clover MS Data analysis software allow a short time-to-result (5 min/isolate). We demonstrated that FT-IR ATR spectroscopy provides meaningful, reproducible, and accurate information at a very early stage (as soon as bacteria are identified) to support infection control and public health surveillance. The high robustness, together with automated and flexible workflows for data analysis, provides opportunities to consolidate real-time applications at a global level.
IMPORTANCE We created and validated a simple, automated workflow for the identification of clinically relevant Klebsiella pneumoniae lineages by FT-IR spectroscopy and machine learning, a method that can be extremely useful for providing quick and reliable typing information to support real-time decisions in outbreak management and infection control. This method and workflow are of interest to support clinical microbiology diagnostics and to aid public health surveillance.
Affiliation(s)
- Ângela Novais
- UCIBIO, Applied Molecular Biosciences Unit, Department of Biological Sciences, Faculty of Pharmacy, University of Porto, Porto, Portugal
- Associate Laboratory i4HB - Institute for Health and Bioeconomy, Faculty of Pharmacy, University of Porto, Porto, Portugal
- Ana Beatriz Gonçalves
- UCIBIO, Applied Molecular Biosciences Unit, Department of Biological Sciences, Faculty of Pharmacy, University of Porto, Porto, Portugal
- Associate Laboratory i4HB - Institute for Health and Bioeconomy, Faculty of Pharmacy, University of Porto, Porto, Portugal
- Teresa G. Ribeiro
- UCIBIO, Applied Molecular Biosciences Unit, Department of Biological Sciences, Faculty of Pharmacy, University of Porto, Porto, Portugal
- Associate Laboratory i4HB - Institute for Health and Bioeconomy, Faculty of Pharmacy, University of Porto, Porto, Portugal
- CCP, Culture Collection of Porto, Faculty of Pharmacy, University of Porto, Porto, Portugal
- Ana R. Freitas
- UCIBIO, Applied Molecular Biosciences Unit, Department of Biological Sciences, Faculty of Pharmacy, University of Porto, Porto, Portugal
- Associate Laboratory i4HB - Institute for Health and Bioeconomy, Faculty of Pharmacy, University of Porto, Porto, Portugal
- 1H-TOXRUN, One Health Toxicology Research Unit, University Institute of Health Sciences, CESPU, CRL, Gandra, Portugal
- Gema Méndez
- CLOVER Bioanalytical Software, Granada, Spain
- Antónia Read
- Clinical Microbiology Laboratory, Local Healthcare Unit, Matosinhos, Portugal
- Valquíria Alves
- Clinical Microbiology Laboratory, Local Healthcare Unit, Matosinhos, Portugal
- Lorena López-Cerero
- Unidad Clínica de Enfermedades Infecciosas y Microbiología, Hospital Universitario Vírgen Macarena, Instituto de Biomedicina de Sevilla (IBIS; CSIC/Hospital Virgen Macarena/Universidad de Sevilla), Sevilla, Spain
- Departamentos de Microbiología y Medicina, Universidad de Sevilla, Sevilla, Spain
- Jesús Rodríguez-Baño
- Unidad Clínica de Enfermedades Infecciosas y Microbiología, Hospital Universitario Vírgen Macarena, Instituto de Biomedicina de Sevilla (IBIS; CSIC/Hospital Virgen Macarena/Universidad de Sevilla), Sevilla, Spain
- Departamentos de Microbiología y Medicina, Universidad de Sevilla, Sevilla, Spain
- Álvaro Pascual
- Unidad Clínica de Enfermedades Infecciosas y Microbiología, Hospital Universitario Vírgen Macarena, Instituto de Biomedicina de Sevilla (IBIS; CSIC/Hospital Virgen Macarena/Universidad de Sevilla), Sevilla, Spain
- Departamentos de Microbiología y Medicina, Universidad de Sevilla, Sevilla, Spain
- Luísa Peixe
- UCIBIO, Applied Molecular Biosciences Unit, Department of Biological Sciences, Faculty of Pharmacy, University of Porto, Porto, Portugal
- Associate Laboratory i4HB - Institute for Health and Bioeconomy, Faculty of Pharmacy, University of Porto, Porto, Portugal
- CCP, Culture Collection of Porto, Faculty of Pharmacy, University of Porto, Porto, Portugal
|
28
|
Wang X, Xiong Z, Hong W, Liao X, Yang G, Jiang Z, Jing L, Huang S, Fu Z, Zhu F. Identification of cuproptosis-related gene clusters and immune cell infiltration in major burns based on machine learning models and experimental validation. Front Immunol 2024; 15:1335675. [PMID: 38410514 PMCID: PMC10894925 DOI: 10.3389/fimmu.2024.1335675] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/09/2023] [Accepted: 01/23/2024] [Indexed: 02/28/2024] Open
Abstract
Introduction: Burns are a global public health problem. Major burns can push the body into a stress state, thereby increasing the risk of infection and adversely affecting the patient's prognosis. Recently, it has been discovered that cuproptosis, a form of cell death, is associated with various diseases. Our research aims to explore the molecular clusters associated with cuproptosis in major burns and construct predictive models.
Methods: We analyzed the expression and immune infiltration characteristics of cuproptosis-related factors in major burns based on the GSE37069 dataset. Using 553 samples from major burn patients, we explored the molecular clusters based on cuproptosis-related genes and their associated immune cell infiltrates. WGCNA was used to identify cluster-specific genes. Subsequently, the performance of different machine learning models was compared to select the optimal model. The effectiveness of the predictive model was validated using a nomogram, calibration curves, decision curves, and an external dataset. Finally, five core genes related to cuproptosis and major burns were validated using RT-qPCR.
Results: In both major burn and normal samples, we determined the cuproptosis-related genes associated with major burns through WGCNA. Immune infiltrate profiling revealed significant immune differences between clusters. Clustering was most stable at K=2. GSVA shows that specific genes in cluster 2 are closely associated with various functions. After identifying the cross-core genes, machine learning models indicate that generalized linear models have better accuracy. Ultimately, a generalized linear model for five highly correlated genes was constructed, and validation with an external dataset showed an AUC of 0.982. The accuracy of the model was further verified through calibration curves, decision curves, and modal graphs. Further analysis of clinical relevance revealed that these correlated genes were closely related to time of injury.
Conclusion: This study has revealed the intricate relationship between cuproptosis and major burns. We identified 15 cuproptosis-related genes that are associated with major burns. Through a machine learning model, five core genes related to cuproptosis and major burns have been selected and validated.
Affiliation(s)
- Xin Wang
- Medical Center of Burn Plastic and Wound Repair, The First Affiliated Hospital of Nanchang University, Nanchang, Jiangxi, China
- Zhenfang Xiong
- Medical Center of Burn Plastic and Wound Repair, The First Affiliated Hospital of Nanchang University, Nanchang, Jiangxi, China
- Wangbing Hong
- Medical Center of Burn Plastic and Wound Repair, The First Affiliated Hospital of Nanchang University, Nanchang, Jiangxi, China
- Xincheng Liao
- Medical Center of Burn Plastic and Wound Repair, The First Affiliated Hospital of Nanchang University, Nanchang, Jiangxi, China
- Guangping Yang
- Medical Center of Burn Plastic and Wound Repair, The First Affiliated Hospital of Nanchang University, Nanchang, Jiangxi, China
- Zhengying Jiang
- Medical Center of Burn Plastic and Wound Repair, The First Affiliated Hospital of Nanchang University, Nanchang, Jiangxi, China
- Lanxin Jing
- Medical Center of Burn Plastic and Wound Repair, The First Affiliated Hospital of Nanchang University, Nanchang, Jiangxi, China
- Shengyu Huang
- Medical Center of Burn Plastic and Wound Repair, The First Affiliated Hospital of Nanchang University, Nanchang, Jiangxi, China
- Zhonghua Fu
- Medical Center of Burn Plastic and Wound Repair, The First Affiliated Hospital of Nanchang University, Nanchang, Jiangxi, China
- Feng Zhu
- Department of Critical Care Medicine, Shanghai East Hospital, Tongji University School of Medicine, Shanghai, China
- Department of Burns, The First Affiliated Hospital, Naval Medical University, Shanghai, China
|
29
|
Wu C, Luo J, Xiao Y. Multi-omics assists genomic prediction of maize yield with machine learning approaches. MOLECULAR BREEDING : NEW STRATEGIES IN PLANT IMPROVEMENT 2024; 44:14. [PMID: 38343399 PMCID: PMC10853138 DOI: 10.1007/s11032-024-01454-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/24/2023] [Accepted: 01/19/2024] [Indexed: 02/28/2024]
Abstract
With the improvement of high-throughput technologies in recent years, large multi-dimensional plant omics data have been produced, and big-data-driven yield prediction research has received increasing attention. Machine learning offers promising computational and analytical solutions to interpret the biological meaning of large amounts of data in crops. In this study, we utilized multi-omics datasets from 156 maize recombinant inbred lines, containing 2496 single nucleotide polymorphisms (SNPs), 46 image traits (i-traits) from 16 developmental stages obtained through an automatic phenotyping platform, and 133 primary metabolites. Based on benchmark tests with different types of prediction models, some machine learning methods, such as Partial Least Squares (PLS), Random Forest (RF), and Gaussian process with Radial basis function kernel (GaussprRadial), achieved better prediction for maize yield, albeit with slight differences in method preference among i-traits, genomic, and metabolic data. We found that better yield prediction may result from differing capabilities in ranking and filtering data features, which were found to be linked to biological processes such as photosynthesis-related or kernel development-related regulation. Finally, by integrating multiple omics data with the RF machine learning approach, we can further improve the prediction accuracy of grain yield from 0.32 to 0.43. Our research provides new ideas for the application of plant omics data and artificial intelligence approaches to facilitate crop genetic improvements. Supplementary Information: The online version contains supplementary material available at 10.1007/s11032-024-01454-z.
Affiliation(s)
- Chengxiu Wu
- National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, 430070 China
- Jingyun Luo
- National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, 430070 China
- Yingjie Xiao
- National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, 430070 China
- Hubei Hongshan Laboratory, Wuhan, 430070 China
|
30
|
Ao YF, Dörr M, Menke MJ, Born S, Heuson E, Bornscheuer UT. Data-Driven Protein Engineering for Improving Catalytic Activity and Selectivity. Chembiochem 2024; 25:e202300754. [PMID: 38029350 DOI: 10.1002/cbic.202300754] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/03/2023] [Revised: 11/28/2023] [Accepted: 11/29/2023] [Indexed: 12/01/2023]
Abstract
Protein engineering is essential for altering the substrate scope, catalytic activity and selectivity of enzymes for applications in biocatalysis. However, traditional approaches, such as directed evolution and rational design, face the challenge of experimentally screening a large protein mutation space. Machine learning methods allow the approximation of protein fitness landscapes and the identification of catalytic patterns using limited experimental data, thus providing a new avenue to guide protein engineering campaigns. In this concept article, we review machine learning models that have been developed to assess enzyme-substrate-catalysis performance relationships, with the aim of improving enzymes through data-driven protein engineering. Furthermore, we discuss prospects for the future development of this field to provide additional strategies and tools for achieving desired activities and selectivities.
Affiliation(s)
- Yu-Fei Ao
- Department of Biotechnology and Enzyme Catalysis, Institute of Biochemistry, University of Greifswald, Felix-Hausdorff-Str. 4, 17487, Greifswald, Germany
- Beijing National Laboratory for Molecular Sciences, CAS Key Laboratory of Molecular Recognition and Function, Institute of Chemistry, Chinese Academy of Sciences, Zhongguancun North First Street 2, Beijing, 100190, China
- University of Chinese Academy of Sciences, Yuquan Road 19(A), Beijing, 100049, China
- Mark Dörr
- Department of Biotechnology and Enzyme Catalysis, Institute of Biochemistry, University of Greifswald, Felix-Hausdorff-Str. 4, 17487, Greifswald, Germany
- Marian J Menke
- Department of Biotechnology and Enzyme Catalysis, Institute of Biochemistry, University of Greifswald, Felix-Hausdorff-Str. 4, 17487, Greifswald, Germany
- Stefan Born
- Technische Universität Berlin, Chair of Bioprocess Engineering, Ackerstraße 76, 13355, Berlin, Germany
- Egon Heuson
- Univ. Lille, CNRS, Centrale Lille, Univ. Artois, UMR 8181 UCCS, Unité de Catalyse et Chimie du Solide, 59000, Lille, France
- Uwe T Bornscheuer
- Department of Biotechnology and Enzyme Catalysis, Institute of Biochemistry, University of Greifswald, Felix-Hausdorff-Str. 4, 17487, Greifswald, Germany
|
31
|
Bass L, Elder LH, Folescu DE, Forouzesh N, Tolokh IS, Karpatne A, Onufriev AV. Improving the Accuracy of Physics-Based Hydration-Free Energy Predictions by Machine Learning the Remaining Error Relative to the Experiment. J Chem Theory Comput 2024; 20:396-410. [PMID: 38149593 PMCID: PMC10950260 DOI: 10.1021/acs.jctc.3c00981] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/28/2023]
Abstract
The accuracy of computational models of water is key to atomistic simulations of biomolecules. We propose a computationally efficient way to improve the accuracy of the prediction of hydration-free energies (HFEs) of small molecules: the remaining errors of the physics-based models relative to the experiment are predicted and mitigated by machine learning (ML) as a postprocessing step. Specifically, the trained graph convolutional neural network attempts to identify the "blind spots" in the physics-based model predictions, where the complex physics of aqueous solvation is poorly accounted for, and partially corrects for them. The strategy is explored for five classical solvent models representing various accuracy/speed trade-offs, from the fast analytical generalized Born (GB) to the popular TIP3P explicit solvent model; experimental HFEs of small neutral molecules from the FreeSolv set are used for the training and testing. For all of the models, the ML correction reduces the resulting root-mean-square error relative to the experiment for HFEs of small molecules, without significant overfitting and with negligible computational overhead. For example, on the test set, the relative accuracy improvement is 47% for the fast analytical GB, making it, after the ML correction, almost as accurate as uncorrected TIP3P. For the TIP3P model, the accuracy improvement is about 39%, bringing the ML-corrected model's accuracy below the 1 kcal/mol threshold. In general, the relative benefit of the ML corrections is smaller for more accurate physics-based models, reaching the lower limit of about 20% relative accuracy gain compared with that of the physics-based treatment alone. The proposed strategy of using ML to learn the remaining error of physics-based models offers a distinct advantage over training ML alone directly on reference HFEs: it preserves the correct overall trend, even well outside of the training set.
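The postprocessing strategy described in this abstract (learn the remaining error of a physics-based model, then add the learned correction back) can be sketched in a few lines of Python. This is a minimal illustration under loud assumptions: a one-feature least-squares line stands in for the graph convolutional neural network, the descriptor is hypothetical, and all energies are made-up values in kcal/mol rather than FreeSolv data.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y ~ a * x + b with a single feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, mean_y - a * mean_x


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


# Hypothetical training molecules: descriptor, physics-based HFE, experimental HFE.
train_x = [0.0, 1.0, 2.0, 3.0]
train_physics = [-1.0, -2.0, -3.0, -4.0]
train_exp = [-1.4, -3.1, -4.4, -6.1]

# Step 1: the learning target is the remaining error, not the HFE itself.
residual = [e - p for e, p in zip(train_exp, train_physics)]
a, b = fit_line(train_x, residual)

# Step 2: on held-out molecules, add the predicted residual to the physics value.
test_x = [1.5, 2.5]
test_physics = [-2.5, -3.5]
test_exp = [-3.7, -5.3]
test_corrected = [p + (a * x + b) for p, x in zip(test_physics, test_x)]

rmse_physics = rmse(test_exp, test_physics)
rmse_corrected = rmse(test_exp, test_corrected)
```

Because the physics-based value is always part of the final prediction, the corrected model inherits its overall trend even where the learned residual is unreliable, which matches the advantage the abstract claims over training ML directly on reference HFEs.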
Affiliation(s)
- Lewis Bass
- Department of Computer Engineering, Virginia Tech, Blacksburg, Virginia 24061, United States
- Luke H Elder
- Department of Computer Science, Virginia Tech, Blacksburg, Virginia 24061, United States
- Dan E Folescu
- Department of Computer Science, Virginia Tech, Blacksburg, Virginia 24061, United States
- Department of Mathematics, Virginia Tech, Blacksburg, Virginia 24061, United States
- Negin Forouzesh
- Department of Computer Science, California State University, Los Angeles, California 90032, United States
- Igor S Tolokh
- Department of Computer Science, Virginia Tech, Blacksburg, Virginia 24061, United States
- Anuj Karpatne
- Department of Computer Science, Virginia Tech, Blacksburg, Virginia 24061, United States
- Alexey V Onufriev
- Department of Computer Science, Virginia Tech, Blacksburg, Virginia 24061, United States
- Department of Physics, Virginia Tech, Blacksburg, Virginia 24061, United States
- Center for Soft Matter and Biological Physics, Virginia Tech, Blacksburg, Virginia 24061, United States
|
32
|
Mondello A, Dal Bo M, Toffoli G, Polano M. Machine learning in onco-pharmacogenomics: a path to precision medicine with many challenges. Front Pharmacol 2024; 14:1260276. [PMID: 38264526 PMCID: PMC10803549 DOI: 10.3389/fphar.2023.1260276] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 07/17/2023] [Accepted: 12/26/2023] [Indexed: 01/25/2024] Open
Abstract
Over the past two decades, Next-Generation Sequencing (NGS) has revolutionized the approach to cancer research. Applications of NGS include the identification of tumor specific alterations that can influence tumor pathobiology and also impact diagnosis, prognosis and therapeutic options. Pharmacogenomics (PGx) studies the role of inheritance of individual genetic patterns in drug response and has taken advantage of NGS technology as it provides access to high-throughput data that can, however, be difficult to manage. Machine learning (ML) has recently been used in the life sciences to discover hidden patterns from complex NGS data and to solve various PGx problems. In this review, we provide a comprehensive overview of the NGS approaches that can be employed and the different PGx studies implicating the use of NGS data. We also provide an excursus of the ML algorithms that can exert a role as fundamental strategies in the PGx field to improve personalized medicine in cancer.
Affiliation(s)
- Maurizio Polano
- Experimental and Clinical Pharmacology Unit, Centro di Riferimento Oncologico di Aviano (CRO), Istituto di Ricovero e Cura a Carattere Scientifico (IRCCS), Aviano, Italy
|
33
|
Li B, Wang Y, Yin Z, Xu L, Xie L, Xu X. Decision tree-based identification of important molecular fragments for protein-ligand binding. Chem Biol Drug Des 2024; 103:e14427. [PMID: 38230776 DOI: 10.1111/cbdd.14427] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/26/2023] [Revised: 11/16/2023] [Accepted: 12/11/2023] [Indexed: 01/18/2024]
Abstract
Fragment-based drug design is an emerging technology in pharmaceutical research and development. One of the key aspects of this technology is the identification and quantitative characterization of molecular fragments. This study presents a strategy for identifying important molecular fragments based on molecular fingerprints and decision tree algorithms and verifies its feasibility in predicting protein-ligand binding affinity. Specifically, the three-dimensional (3D) structures of protein-ligand complexes are encoded using extended-connectivity fingerprints (ECFP), and three decision tree models, namely Random Forest, XGBoost, and LightGBM, are used to quantitatively characterize the feature importance, thereby extracting important molecular fragments with high reliability. Few-shot learning reveals that the extracted molecular fragments contribute significantly and consistently to the binding affinity even with a small sample size. Despite the absence of location and distance information for molecular fragments in ECFP, 3D visualization, in combination with the reverse ECFP process, shows that the majority of the extracted fragments are located at the binding interface of the protein and the ligand. This alignment with the distance constraints critical for binding affinity further supports the reliability of the strategy for identifying important molecular fragments.
Affiliation(s)
- Baiyi Li
- Institute of Bioinformatics and Medical Engineering, Jiangsu University of Technology, Changzhou, China
- Yunsong Wang
- School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou, China
- Zuode Yin
- Institute of Bioinformatics and Medical Engineering, Jiangsu University of Technology, Changzhou, China
- Lei Xu
- Institute of Bioinformatics and Medical Engineering, Jiangsu University of Technology, Changzhou, China
- Liangxu Xie
- Institute of Bioinformatics and Medical Engineering, Jiangsu University of Technology, Changzhou, China
- Xiaojun Xu
- Institute of Bioinformatics and Medical Engineering, Jiangsu University of Technology, Changzhou, China
|
34
|
Fu M, He R, Zhang Z, Ma F, Shen L, Zhang Y, Duan M, Zhang Y, Wang Y, Zhu L, He J. Multinomial machine learning identifies independent biomarkers by integrated metabolic analysis of acute coronary syndrome. Sci Rep 2023; 13:20535. [PMID: 37996510 PMCID: PMC10667512 DOI: 10.1038/s41598-023-47783-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/25/2023] [Accepted: 11/18/2023] [Indexed: 11/25/2023] Open
Abstract
A multi-class classification model for acute coronary syndrome (ACS) remains to be constructed based on multi-fluid metabolomics. Major confounders may exert spurious effects on the relationship between metabolism and ACS. The study aims to identify an independent biomarker panel for the multiclassification of HC, UA, and AMI by integrating serum and urinary metabolomics. We performed a liquid chromatography-tandem mass spectrometry (LC-MS/MS)-based metabolomics study on 300 serum and urine samples from 44 patients with unstable angina (UA), 77 with acute myocardial infarction (AMI), and 29 healthy controls (HC). Multinomial machine learning approaches, including multinomial adaptive least absolute shrinkage and selection operator (LASSO) regression and random forest (RF), and assessment of the confounders were applied to integrate a multi-class classification biomarker panel for HC, UA and AMI. Different metabolic landscapes were portrayed during the transition from HC to UA and then to AMI. Glycerophospholipid metabolism and arginine biosynthesis were predominant during the progression from HC to UA and then to AMI. The multiclass metabolic diagnostic model (MDM) dependent on ACS, including 2-ketobutyric acid, LysoPC(18:2(9Z,12Z)), argininosuccinic acid, and cyclic GMP, demarcated HC, UA, and AMI, providing a C-index of 0.84 (HC vs. UA), 0.98 (HC vs. AMI), and 0.89 (UA vs. AMI). The diagnostic value of MDM largely derives from the contribution of 2-ketobutyric acid, and LysoPC(18:2(9Z,12Z)) in serum. Higher 2-ketobutyric acid and cyclic GMP levels were positively correlated with ACS risk and atherosclerosis plaque burden, while LysoPC(18:2(9Z,12Z)) and argininosuccinic acid showed the reverse relationship. An independent multiclass biomarker panel for HC, UA, and AMI was constructed using the multinomial machine learning methods based on serum and urinary metabolite signatures.
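For a two-class comparison such as HC vs. AMI, the C-index is the fraction of (case, control) pairs in which the case receives the higher model score, with ties counted as one half; this equals the area under the ROC curve. The sketch below illustrates that definition only. The scores are hypothetical and do not come from the study.

```python
# Minimal sketch of a pairwise C-index for two classes.
# Scores are invented for illustration; they are not the study's data.

def pairwise_c_index(scores_pos, scores_neg):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    concordant = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                concordant += 1.0
            elif p == n:
                concordant += 0.5
    return concordant / (len(scores_pos) * len(scores_neg))

ami_scores = [0.9, 0.8, 0.6, 0.4]  # hypothetical model scores for AMI patients
hc_scores = [0.5, 0.3, 0.2]        # hypothetical model scores for healthy controls

c_index = pairwise_c_index(ami_scores, hc_scores)
print(round(c_index, 3))  # 11 of 12 pairs are concordant
```

A multiclass panel reported as three pairwise values, as in the abstract, would repeat this calculation once per pair of classes.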
Affiliation(s)
- Meijiao Fu
  - Ningxia Medical University, Yinchuan, 750004, Ningxia, China
- Ruhua He
  - Department of Cardiology, General Hospital of Ningxia Medical University, Yinchuan, 750004, Ningxia, China
- Zhihan Zhang
  - Department of Cardiology, Hanzhong Central Hospital, Hanzhong, 723200, Shanxi, China
- Fuqing Ma
  - Department of Cardiology, The Fifth People's Hospital of Ningxia, Shizuishan, 753000, Ningxia, China
- Libo Shen
  - Center for Cardiovascular Diseases, People's Hospital of Ningxia Hui Autonomous Region, Yinchuan, 750002, Ningxia, China
- Yu Zhang
  - Ningxia Medical University, Yinchuan, 750004, Ningxia, China
- Mingyu Duan
  - Ningxia Medical University, Yinchuan, 750004, Ningxia, China
- Yameng Zhang
  - Department of Cardiology, The Second Affiliated Hospital of Henan University of Science and Technology, Luoyang, 471000, Henan, China
- Yifan Wang
  - Department of Radiology, General Hospital of Ningxia Medical University, Yinchuan, 750004, Ningxia, China
- Li Zhu
  - Department of Radiology, General Hospital of Ningxia Medical University, Yinchuan, 750004, Ningxia, China
- Jun He
  - Department of Cardiology, General Hospital of Ningxia Medical University, Yinchuan, 750004, Ningxia, China
35
Xia S, Chen E, Zhang Y. Integrated Molecular Modeling and Machine Learning for Drug Design. J Chem Theory Comput 2023; 19:7478-7495. [PMID: 37883810 PMCID: PMC10653122 DOI: 10.1021/acs.jctc.3c00814] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 07/26/2023] [Revised: 10/10/2023] [Accepted: 10/11/2023] [Indexed: 10/28/2023]
Abstract
Modern therapeutic development often involves several stages that are interconnected, and multiple iterations are usually required to bring a new drug to the market. Computational approaches have increasingly become an indispensable part of helping reduce the time and cost of the research and development of new drugs. In this Perspective, we summarize our recent efforts on integrating molecular modeling and machine learning to develop computational tools for modulator design, including a pocket-guided rational design approach based on AlphaSpace to target protein-protein interactions, delta machine learning scoring functions for protein-ligand docking as well as virtual screening, and state-of-the-art deep learning models to predict calculated and experimental molecular properties based on molecular mechanics optimized geometries. Meanwhile, we discuss remaining challenges and promising directions for further development and use a retrospective example of FDA approved kinase inhibitor Erlotinib to demonstrate the use of these newly developed computational tools.
Affiliation(s)
- Song Xia
  - Department of Chemistry, New York University, New York, New York 10003, United States
- Eric Chen
  - Department of Chemistry, New York University, New York, New York 10003, United States
- Yingkai Zhang
  - Department of Chemistry, New York University, New York, New York 10003, United States
  - Simons Center for Computational Physical Chemistry at New York University, New York, New York 10003, United States
  - NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai 200062, China
36
Shi Y, Zhang C, Pan S, Chen Y, Miao X, He G, Wu Y, Ye H, Weng C, Zhang H, Zhou W, Yang X, Liang C, Chen D, Hong L, Su F. The diagnosis of tuberculous meningitis: advancements in new technologies and machine learning algorithms. Front Microbiol 2023; 14:1290746. [PMID: 37942080 PMCID: PMC10628659 DOI: 10.3389/fmicb.2023.1290746] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/08/2023] [Accepted: 10/09/2023] [Indexed: 11/10/2023] Open
Abstract
Tuberculous meningitis (TBM) poses a diagnostic challenge, particularly impacting vulnerable populations such as infants and those with untreated HIV. Given the diagnostic intricacies of TBM, there's a pressing need for rapid and reliable diagnostic tools. This review scrutinizes the efficacy of up-and-coming technologies like machine learning in transforming TBM diagnostics and management. Advanced diagnostic technologies like targeted gene sequencing, real-time polymerase chain reaction (RT-PCR), miRNA assays, and metagenomic next-generation sequencing (mNGS) offer promising avenues for early TBM detection. The capabilities of these technologies are further augmented when paired with mass spectrometry, metabolomics, and proteomics, enriching the pool of disease-specific biomarkers. Machine learning algorithms, adept at sifting through voluminous datasets like medical imaging, genomic profiles, and patient histories, are increasingly revealing nuanced disease pathways, thereby elevating diagnostic accuracy and guiding treatment strategies. While these burgeoning technologies offer hope for more precise TBM diagnosis, hurdles remain in terms of their clinical implementation. Future endeavors should zero in on the validation of these tools through prospective studies, critically evaluating their limitations, and outlining protocols for seamless incorporation into established healthcare frameworks. Through this review, we aim to present an exhaustive snapshot of emerging diagnostic modalities in TBM, the current standing of machine learning in meningitis diagnostics, and the challenges and future prospects of converging these domains.
Affiliation(s)
- Yi Shi
  - Department of Infectious Diseases, Wenzhou Central Hospital, Wenzhou, China
  - The First School of Medicine, Wenzhou Medical University, Wenzhou, China
- Chengxi Zhang
  - School of Materials Science and Engineering, Shandong Jianzhu University, Jinan, China
- Shuo Pan
  - The First School of Medicine, Wenzhou Medical University, Wenzhou, China
- Yi Chen
  - The First School of Medicine, Wenzhou Medical University, Wenzhou, China
- Xingguo Miao
  - Department of Infectious Diseases, Wenzhou Central Hospital, Wenzhou, China
  - Department of Infectious Diseases, Wenzhou Sixth People’s Hospital, Wenzhou, China
  - Wenzhou Key Laboratory of Diagnosis and Treatment of Emerging and Recurrent Infectious Diseases, Wenzhou, China
- Guoqiang He
  - Postgraduate Training Base Alliance of Wenzhou Medical University, Wenzhou, China
  - Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou, China
- Yanchan Wu
  - School of Electrical and Information Engineering, Quzhou University, Quzhou, China
- Hui Ye
  - Department of Infectious Diseases, Wenzhou Central Hospital, Wenzhou, China
  - Department of Infectious Diseases, Wenzhou Sixth People’s Hospital, Wenzhou, China
  - Wenzhou Key Laboratory of Diagnosis and Treatment of Emerging and Recurrent Infectious Diseases, Wenzhou, China
- Chujun Weng
  - The Fourth Affiliated Hospital Zhejiang University School of Medicine, Yiwu, China
- Huanhuan Zhang
  - School and Hospital of Stomatology, Wenzhou Medical University, Wenzhou, China
- Wenya Zhou
  - School and Hospital of Stomatology, Wenzhou Medical University, Wenzhou, China
- Xiaojie Yang
  - Wenzhou Medical University Renji College, Wenzhou, China
- Chenglong Liang
  - The First School of Medicine, Wenzhou Medical University, Wenzhou, China
- Dong Chen
  - Wenzhou Key Laboratory of Diagnosis and Treatment of Emerging and Recurrent Infectious Diseases, Wenzhou, China
  - Wenzhou Central Blood Station, Wenzhou, China
- Liang Hong
  - Department of Infectious Diseases, The Third Affiliated Hospital of Wenzhou Medical University, Wenzhou, China
- Feifei Su
  - Department of Infectious Diseases, Wenzhou Central Hospital, Wenzhou, China
  - Department of Infectious Diseases, Wenzhou Sixth People’s Hospital, Wenzhou, China
  - Wenzhou Key Laboratory of Diagnosis and Treatment of Emerging and Recurrent Infectious Diseases, Wenzhou, China
37
Silva Junior HC, Menezes HNS, Ferreira GB, Guedes GP. Rapid and Accurate Prediction of the Axial Magnetic Anisotropy in Cobalt(II) Complexes Using a Machine-Learning Approach. Inorg Chem 2023; 62:14838-14842. [PMID: 37676736 DOI: 10.1021/acs.inorgchem.3c02569] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/09/2023]
Abstract
Estimating the magnetic anisotropy for single-ion magnets is complex due to its multireference nature. This study demonstrates that deep neural networks (DNNs) can provide accurate axial magnetic anisotropy (D) values, closely matching the complete-active-space self-consistent-field (CASSCF) quality using density functional theory (DFT) data. We curated an 86-parameter database (UFF1) with electronic data from over 33000 cobalt(II) compounds. The DNN achieved an R² of 0.906 and a mean absolute error of 18.1 cm⁻¹ in comparison to reference CASSCF D values. Remarkably, it is 11 times more accurate than DFT methods and 7700 times faster. This approach hints at DNNs predicting the anisotropy in larger molecules, even when trained on smaller ligands.
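The two metrics quoted in this abstract, the coefficient of determination and the mean absolute error, follow standard definitions that can be sketched in a few lines. The D values below are hypothetical and serve only to show the arithmetic; they are not drawn from the UFF1 database or from the paper's results.

```python
# Minimal sketch of R^2 and mean absolute error (MAE) for predicted D values.
# All numbers are invented for illustration, in cm^-1.
from statistics import mean

def r_squared(y_true, y_pred):
    """1 - (residual sum of squares) / (total sum of squares)."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def mean_absolute_error(y_true, y_pred):
    """Average absolute deviation between reference and prediction."""
    return mean([abs(t - p) for t, p in zip(y_true, y_pred)])

reference_d = [-80.0, -40.0, 10.0, 55.0, 90.0]   # hypothetical reference D values
predicted_d = [-70.0, -50.0, 20.0, 45.0, 100.0]  # hypothetical model predictions

r2 = r_squared(reference_d, predicted_d)
mae = mean_absolute_error(reference_d, predicted_d)
print(round(r2, 3), mae)
```

Because R² is normalized by the spread of the reference values while MAE is in the units of D, the two numbers answer different questions and are usually reported together, as in the abstract.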
Affiliation(s)
- Henrique C Silva Junior
  - Instituto de Química, Universidade Federal Fluminense, Niterói, Rio de Janeiro 24020-141, Brazil
- Heloisa N S Menezes
  - Instituto de Química, Universidade Federal Fluminense, Niterói, Rio de Janeiro 24020-141, Brazil
- Glaucio B Ferreira
  - Instituto de Química, Universidade Federal Fluminense, Niterói, Rio de Janeiro 24020-141, Brazil
- Guilherme P Guedes
  - Instituto de Química, Universidade Federal Fluminense, Niterói, Rio de Janeiro 24020-141, Brazil