1
Lin M, Lu HC, Lin HW, Pan SW, Cheng BM, Tseng TR, Feng JY, Ho ML. Fast Screening of Tuberculosis Patients Based on Analysis of Plasma by Infrared Spectroscopy Coupled with Machine Learning Approaches. ACS Omega 2025; 10:11817-11827. [PMID: 40191314 PMCID: PMC11966281 DOI: 10.1021/acsomega.4c07990] [Received: 08/30/2024] [Revised: 03/11/2025] [Accepted: 03/13/2025] [Indexed: 04/09/2025]
Abstract
Prompt diagnosis of tuberculosis (TB) enables timely treatment, limiting the spread of the disease and improving public health. Rapid, sensitive, accurate, and cost-effective detection of TB nevertheless remains a challenge. To address it, we used a transmission technique and an attenuated total reflectance (ATR) technique, each coupled with Fourier-transform infrared (FTIR) spectrometry, to study the IR spectra of plasma samples from TB patients (n = 10) and healthy individuals (n = 10). To ensure high-quality spectral data, spectra were collected in both transmission and ATR modes, with each measurement consisting of 256 scans at a resolution of 8 cm⁻¹. Transmission measurements were repeated five times per sample, and ATR-FTIR measurements were repeated three times per sample. These parameters were optimized through testing to achieve the highest possible signal-to-noise ratio for patient sample analysis. In this way, we obtained a total of 100 spectra from 20 samples in the transmission mode and 60 spectra in the ATR-FTIR mode, providing sufficient data for spectral analysis. We then applied machine learning techniques to analyze and classify the IR spectra, and thereby differentiated the spectra of TB patients from those of healthy individuals. In this work, we modified the transmission-FTIR setup to improve absorption sensitivity by focusing the IR light on the interface of the sample, whereas in the ATR scheme we used ZnSe, a high-refractive-index crystal, as the medium to reflect the signals. We compared the spectra obtained from both methods; in their second-derivative curves, we noted distinct spectral differences between the TB and healthy groups in the protein and lipid regions (3500-3000, 2900-2800, and 1700-1500 cm⁻¹).
Using three machine learning classifiers, Logistic Regression (LR), Random Forest (RF), and XGBoost (Xg), we found that Xg achieved an accuracy of 0.749, a precision of 0.703, a recall of 0.901, an F1 score of 0.790, and an area under the ROC curve (AUC) of 0.82 for absorption spectra in the 3500-2700 cm⁻¹ region; in addition, the models trained on ATR data reached an accuracy of about 80%. We randomly assigned participants (rather than individual scans) to 80% training and 20% test sets to train and validate the three machine learning models. Based on these results, we concluded that the absorption spectroscopic method showed superior performance in TB diagnosis. We have thus shown that absorption-FTIR spectroscopy is a valuable tool for identifying TB in patients. IR spectral analysis of plasma can complement clinical evidence and provide a rapid and accurate diagnosis of TB in the clinic.
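The abstract stresses that participants, not individual scans, were assigned to the training and test sets. A minimal Python sketch of such a participant-level split follows; it is an illustration only, not the authors' code. The function name `split_by_participant`, the participant IDs, the seed, and the data layout are all assumptions, and only the 20 participants, five transmission scans each, and the 80/20 ratio come from the abstract.

```python
import random


def split_by_participant(participant_ids, test_fraction=0.2, seed=0):
    """Split unique participant IDs (not scans) into train and test sets.

    Hypothetical helper: every scan of a given participant ends up on the
    same side of the split, so repeated scans cannot leak across it.
    """
    unique_ids = sorted(set(participant_ids))
    rng = random.Random(seed)
    rng.shuffle(unique_ids)
    n_test = max(1, round(len(unique_ids) * test_fraction))
    test_ids = set(unique_ids[:n_test])
    train_ids = set(unique_ids[n_test:])
    return train_ids, test_ids


# Assumed layout: 20 participants with 5 transmission scans each (100 spectra).
scans = [(f"P{p:02d}", rep) for p in range(20) for rep in range(5)]

train_ids, test_ids = split_by_participant([pid for pid, _ in scans])

# Route each scan to the side that its participant was assigned to.
train_scans = [s for s in scans if s[0] in train_ids]
test_scans = [s for s in scans if s[0] in test_ids]
```

With these assumed numbers, the split puts 16 participants (80 spectra) in the training set and 4 participants (20 spectra) in the test set. Splitting at the scan level instead would let repeated scans of one person appear on both sides, which tends to inflate test performance.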
Affiliation(s)
- Mei Lin
  - Department of Chemistry, Fu Jen Catholic University, New Taipei City 242, Taiwan
- Hsiao-Chi Lu
  - Department of Medical Research, Hualien Tzu Chi Hospital, Buddhist Tzu Chi Medical Foundation, No. 707, Sec. 3, Chung-Yang Rd., Hualien City 97002, Taiwan
- Hui-Wen Lin
  - Department of Mathematics, Soochow University, Taipei 111, Taiwan
- Sheng-Wei Pan
  - Department of Chest Medicine, Taipei Veterans General Hospital, Taipei 11217, Taiwan
  - School of Medicine, National Yang Ming Chiao Tung University, Taipei 12304, Taiwan
- Bing-Ming Cheng
  - Department of Medical Research, Hualien Tzu Chi Hospital, Buddhist Tzu Chi Medical Foundation, No. 707, Sec. 3, Chung-Yang Rd., Hualien City 97002, Taiwan
  - Center for General Education, Tzu Chi University, No. 880, Sec. 2, Chien-kuo Rd., Hualien City 97005, Taiwan
- Ton-Rong Tseng
  - Mastek Technologies, Inc., 4F-4, No. 13, Wuquan first Rd., Xinzhuang, New Taipei City 24892, Taiwan
- Jia-Yih Feng
  - Department of Chest Medicine, Taipei Veterans General Hospital, Taipei 11217, Taiwan
  - School of Medicine, National Yang Ming Chiao Tung University, Taipei 12304, Taiwan
- Mei-Lin Ho
  - Department of Chemistry, Fu Jen Catholic University, New Taipei City 242, Taiwan
2
Topalian R, Kavallaris L, Rosenau F, Mavoungou C. Safe-by-Design Strategies for Intranasal Drug Delivery Systems: Machine and Deep Learning Solutions to Differentiate Epithelial Tissues via Attenuated Total Reflection Fourier Transform Infrared Spectroscopy. ACS Pharmacol Transl Sci 2025; 8:762-773. [PMID: 40109738 PMCID: PMC11915033 DOI: 10.1021/acsptsci.4c00643] [Received: 11/05/2024] [Revised: 01/08/2025] [Accepted: 01/09/2025] [Indexed: 03/22/2025]
Abstract
The development of nasal drug delivery systems requires advanced analytical techniques and tools that distinguish between the nose-to-brain epithelial tissues with better precision, a task at which traditional bioanalytical methods frequently fail. In this study, attenuated total reflection Fourier transform infrared (ATR-FTIR) spectroscopy is coupled with machine learning (ML) and deep learning (DL) techniques to discriminate effectively between epithelial tissues. The primary goal of this work was to develop Safe-by-Design models for intranasal drug delivery using ex vivo experiments on pig tissues, which were analyzed by ML modeling. We compiled an ATR-FTIR spectral data set from olfactory epithelium (OE), respiratory epithelium (RE), and tracheal tissues. The data set was used to train and test different ML algorithms. Accuracy, sensitivity, specificity, and F1 score were used to evaluate the performance of the optimized models and their ability to identify specific spectral signatures relevant to each tissue type. The feedforward neural network (FNN) reached an accuracy of 0.99, indicating highly accurate discrimination without overfitting, unlike the support vector machine (SVM) model that was also built. Important spectral features detailing the assignment and site of two-dimensional (2D) protein structures per tissue type were determined by SHapley Additive exPlanations (SHAP) value analysis of the FNN model. Furthermore, a denoising autoencoder was built to improve spectral quality by reducing noise, as confirmed by higher Pearson correlation coefficients for the denoised spectra. The combination of spectroscopic analysis with ML modeling offers a promising strategy, called Safe-by-Design, for monitoring intranasal drug delivery systems and for designing tissue analyses for diagnostic purposes.
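The abstract reports that denoising was confirmed by higher Pearson correlation coefficients. The Python sketch below shows how such a check could be computed. It assumes that each spectrum is compared against a reference spectrum, which the abstract does not specify, and the intensity values are made up for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    numerator = sum(a * b for a, b in zip(dx, dy))
    denominator = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if denominator == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return numerator / denominator


# Made-up absorbance values at five wavenumbers (illustrative only).
reference = [0.10, 0.40, 0.90, 0.40, 0.10]
noisy = [0.18, 0.31, 0.95, 0.52, 0.02]
denoised = [0.11, 0.39, 0.91, 0.42, 0.09]

r_noisy = pearson(reference, noisy)
r_denoised = pearson(reference, denoised)
```

In this toy example `r_denoised` is larger than `r_noisy`, which is the pattern the abstract describes; real spectra would have hundreds of points per spectrum.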
Affiliation(s)
- Romain Topalian
  - Institute for Applied Biotechnology, Biberach University of Applied Sciences, Karlstraße 6-11, 88400 Biberach, Germany
  - Institute of Pharmaceutical Biotechnology, Ulm University, Albert-Einstein-Allee 11, 89081 Ulm, Germany
- Leo Kavallaris
  - Institute for Applied Biotechnology, Biberach University of Applied Sciences, Karlstraße 6-11, 88400 Biberach, Germany
- Frank Rosenau
  - Institute of Pharmaceutical Biotechnology, Ulm University, Albert-Einstein-Allee 11, 89081 Ulm, Germany
- Chrystelle Mavoungou
  - Institute for Applied Biotechnology, Biberach University of Applied Sciences, Karlstraße 6-11, 88400 Biberach, Germany
3
Yang Q, Xu W, Sun X, Chen Q, Niu B. The Application of Machine Learning in Doping Detection. J Chem Inf Model 2024; 64:8673-8683. [PMID: 39574320 DOI: 10.1021/acs.jcim.4c01234] [Indexed: 12/10/2024]
Abstract
Detecting doping agents in sports poses a significant challenge due to the continuous emergence of new prohibited substances and methods. Traditional detection methods primarily rely on targeted analysis, which is often labor-intensive and is susceptible to errors. In response, machine learning offers a transformative approach to enhancing doping screening and detection. With its powerful data analysis capabilities, machine learning enables the rapid identification of patterns and features in complex compound data, increasing both the efficiency and the accuracy of detection. Moreover, when integrated with nontargeted metabolomics, machine learning can predict unknown metabolites, aiding the discovery of long-lasting biomarkers of doping. It also excels in classifying novel compounds, thereby reducing false-negative rates. As instrumental analysis and machine learning technologies continue to advance, the development of rapid, scalable, and highly efficient doping detection methods becomes increasingly feasible, supporting the pursuit of fairness and integrity in sports competitions.
Affiliation(s)
- Qingqing Yang
  - School of Life Sciences, Shanghai University, Shanghai, 200444, China
- Wennuo Xu
  - School of Life Sciences, Shanghai University, Shanghai, 200444, China
- Xiaodong Sun
  - School of Life Sciences, Shanghai University, Shanghai, 200444, China
- Qin Chen
  - School of Life Sciences, Shanghai University, Shanghai, 200444, China
- Bing Niu
  - School of Life Sciences, Shanghai University, Shanghai, 200444, China
4
Yang Q, Luo L, Lin Z, Wen W, Zeng W, Deng H. A machine learning-based predictive model of causality in orthopaedic medical malpractice cases in China. PLoS One 2024; 19:e0300662. [PMID: 38630758 PMCID: PMC11023448 DOI: 10.1371/journal.pone.0300662] [Received: 08/03/2023] [Accepted: 02/27/2024] [Indexed: 04/19/2024]
Abstract
PURPOSE To explore the feasibility and validity of machine learning models in determining causality in medical malpractice cases, and to improve the scientific rigor and reliability of identification opinions. METHODS We collected 13,245 written judgments from PKULAW.COM, a public database. After initial screening, 963 cases were included. Twenty-one medical factors and ten patient factors were selected as characteristic variables by summarising previous literature and cases. Random Forest, eXtreme Gradient Boosting (XGBoost) and Light Gradient Boosting Machine (LightGBM) were used to establish prediction models of causality for the two data sets, respectively. Finally, the optimal model was obtained by hyperparameter tuning of the six models. RESULTS We built three real data set models and three virtual data set models with the three algorithms, and their confusion matrices differed. XGBoost performed best on the real data set, with a model accuracy of 66%. On the virtual data set, XGBoost and LightGBM performed essentially the same, with a model accuracy of 80%. The overall accuracy of external verification was 72.7%. CONCLUSIONS The optimal model of this study is expected to predict causality accurately.
Affiliation(s)
- Qingxin Yang
  - School of Forensic Medicine, Kunming Medical University, Kunming, China
- Li Luo
  - School of Forensic Medicine, Kunming Medical University, Kunming, China
- Zhangpeng Lin
  - School of Forensic Medicine, Kunming Medical University, Kunming, China
- Wei Wen
  - School of Forensic Medicine, Kunming Medical University, Kunming, China
- Wenbo Zeng
  - West China Hospital of Sichuan University, Chengdu, China
- Hong Deng
  - School of Forensic Medicine, Kunming Medical University, Kunming, China
5
Truong CM, Jair YC, Chen HP, Chen WC, Liu YH, Chen PC, Chen PS. Streamlining regular liquid chromatography with MALDI-TOF MS and NMR spectroscopy using automatic full-contact splitless spotting interface and flash-tap fractioning collection. Anal Chim Acta 2024; 1298:342401. [PMID: 38462340 DOI: 10.1016/j.aca.2024.342401] [Received: 11/03/2023] [Revised: 02/18/2024] [Accepted: 02/20/2024] [Indexed: 03/12/2024]
Abstract
BACKGROUND High-resolution matrix-assisted laser desorption/ionization-time of flight mass spectrometry (MALDI-TOF MS) and nuclear magnetic resonance (NMR) spectroscopy are powerful tools for identifying unknown psychoactive substances. However, in complex matrices, trace levels of unknown substances usually require additional fractionation and concentration. Both techniques require specialized liquid chromatography (LC) systems. The small flow rate of nano LC, typically paired with MALDI-TOF MS, often results in prolonged fractionation times. Conversely, the larger flow rate of semi-preparative LC, used for NMR analysis, makes concentrating samples time-consuming and labor-intensive. To address these issues, we developed an automatic system that integrates with regular LC. RESULT An automatic spot collector (ASC) and an automatic fraction collector (AFC) are presented in this study. The ASC used in-line matrix mixing, full-contact spotting, and real-time heating (50 °C), achieving a large droplet capacity of 5 μL on the MALDI plate, high recovery (76-116%), and rapid evaporation within 2 min. The analytes were concentrated 4-8 times and crystallized evenly, reaching a detection limit of 50 μg L⁻¹ for 12 psychoactive substances in urine. The AFC uses flexible tubing that flash-taps the upper rim of the microtube (3 mm depth) instead of reaching the bottom. This method prevents sample loss and minimizes the movement of the robotic arm, providing a high fractionating speed of 6 s. Twelve psychoactive compounds were fractionated in a single round of analysis (recovery: 81%-114%). Methamphetamine and nitrazepam obtained from drug-laced coffee samples were successfully analyzed with a photodiode array (PDA) detector after one AFC round and with NMR after five rounds.
SIGNIFICANCE The ASC device employed real-time heating, in-line matrix mixing, and full-contact spotting to facilitate sample spotting onto the MALDI target plate, thereby enhancing detection sensitivity in low-concentration and complex samples. The AFC device used the novel flash-tapping method to achieve rapid fractionation and a high recovery rate. These devices were assembled from commercially available components, making them affordable (400 USD) for most laboratories while still meeting the performance required of advanced commercial systems.
Affiliation(s)
- Chi-Minh Truong
  - Department of Mechanical Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan
- Yung-Cheng Jair
  - Institute of Toxicology, College of Medicine, National Taiwan University, Taipei, Taiwan
- Hong-Po Chen
  - Department of Chemistry, National Taiwan Normal University, Taipei, Taiwan
- Wei-Chih Chen
  - Department of Mechanical Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan
- Yi-Hsin Liu
  - Department of Chemistry, National Taiwan Normal University, Taipei, Taiwan
- Pin-Chuan Chen
  - Department of Mechanical Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan
- Pai-Shan Chen
  - Institute of Toxicology, College of Medicine, National Taiwan University, Taipei, Taiwan
6
Liu CM, Liu XY, Du Y, Hua ZD. Discrimination of opium from Afghanistan and Myanmar by infrared spectroscopy coupled with machine learning methods. Forensic Sci Int 2024; 357:111974. [PMID: 38447346 DOI: 10.1016/j.forsciint.2024.111974] [Received: 11/15/2023] [Revised: 01/23/2024] [Accepted: 02/29/2024] [Indexed: 03/08/2024]
Abstract
Afghanistan and Myanmar are two major opium-producing countries. In this study, rapid and efficient methods for distinguishing opium from Afghanistan and Myanmar were developed for the first time using infrared spectroscopy (IR) coupled with multiple machine learning (ML) methods. A total of 146 authentic opium samples were analyzed by mid-IR (MIR) and near-IR (NIR) spectroscopy; of these, 116 were used for model training and 30 for model validation. Six ML models, namely partial least squares discriminant analysis (PLS-DA), orthogonal PLS-DA (OPLS-DA), k-nearest neighbour (KNN), support vector machine (SVM), random forest (RF), and artificial neural networks (ANNs), were constructed and compared to find the best classification performance. For MIR data, the average precision, recall, and F1 score of all classification models were 1.0. For NIR data, the average precision, recall, and F1 score of the different classification models ranged from 0.90 to 0.94. Comparison of the six ML models on MIR and NIR data showed that MIR was more suitable for classifying opium by geographic origin. Compared with traditional chromatography and mass spectrometry profiling methods, MIR has the advantages of being simple, rapid, cost-effective, and environmentally friendly. The developed IR chemical profiling methodology may find wide application in classifying opium from Afghanistan and Myanmar, and in differentiating it from opium originating in other opium-producing countries. This study presents new insights into the application of IR and ML to rapid drug profiling analysis.
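The "average" precision, recall, and F1 score reported above can be read as per-class values averaged over the two origins. The Python sketch below computes such a macro average. The abstract does not say which averaging scheme was used, so the macro average is an assumption, as are the labels `AFG` and `MMR`, the 15/15 class balance, and the example predictions; only the validation set size of 30 comes from the abstract.

```python
def per_class_metrics(y_true, y_pred, label):
    """Precision, recall, and F1 for one class, treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def macro_average(y_true, y_pred):
    """Unweighted mean of per-class precision, recall, and F1."""
    labels = sorted(set(y_true))
    rows = [per_class_metrics(y_true, y_pred, label) for label in labels]
    return tuple(sum(column) / len(rows) for column in zip(*rows))


# Hypothetical 30-sample validation set with an assumed 15/15 class balance.
y_true = ["AFG"] * 15 + ["MMR"] * 15

# A classifier with no errors, and one that misclassifies three samples.
perfect = list(y_true)
imperfect = ["AFG"] * 13 + ["MMR"] * 2 + ["MMR"] * 14 + ["AFG"] * 1

perfect_scores = macro_average(y_true, perfect)
imperfect_scores = macro_average(y_true, imperfect)
```

A classifier with no validation errors gives 1.0 for all three macro-averaged metrics, matching the MIR figures in the abstract. In this toy case, three errors out of 30 give a macro recall of 0.90, the lower end of the range reported for NIR.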
Affiliation(s)
- Cui-Mei Liu
  - Key Laboratory of Drug Monitoring and Control, Drug Intelligence and Forensic Center, Ministry of Public Security, P.R.C., Beijing 100193, China
- Xue-Yan Liu
  - China Pharmaceutical University, Nanjing Jiangsu 210009, China
- Yu Du
  - China Pharmaceutical University, Nanjing Jiangsu 210009, China
- Zhen-Dong Hua
  - Key Laboratory of Drug Monitoring and Control, Drug Intelligence and Forensic Center, Ministry of Public Security, P.R.C., Beijing 100193, China
7
Shiammala PN, Duraimutharasan NKB, Vaseeharan B, Alothaim AS, Al-Malki ES, Snekaa B, Safi SZ, Singh SK, Velmurugan D, Selvaraj C. Exploring the artificial intelligence and machine learning models in the context of drug design difficulties and future potential for the pharmaceutical sectors. Methods 2023; 219:82-94. [PMID: 37778659 DOI: 10.1016/j.ymeth.2023.09.010] [Received: 08/07/2023] [Revised: 09/21/2023] [Accepted: 09/25/2023] [Indexed: 10/03/2023]
Abstract
Artificial intelligence (AI), particularly deep learning as a subcategory of AI, provides opportunities to accelerate and improve the process of discovering and developing new drugs. The use of AI in drug discovery is still in its early stages, but it has the potential to revolutionize the way new drugs are discovered and developed. As AI technology continues to evolve, it is likely to play an even greater role in the future of drug discovery. AI is used to identify new drug targets, design new molecules, and predict the efficacy and safety of potential drugs. AI-assisted drug discovery can screen millions of compounds in a matter of hours, identifying potential drug candidates that would have taken years to find using traditional methods. AI is also widely used in the pharmaceutical industry to optimize processes, reduce waste, and ensure quality control. This review covers the different types of machine-learning techniques, their applications in drug discovery, and the challenges and limitations of using machine learning in this field. The state of the art of AI-assisted pharmaceutical discovery is described, covering applications in structure- and ligand-based virtual screening, de novo drug creation, prediction of physicochemical and pharmacokinetic properties, drug repurposing, and related topics. Finally, the obstacles and limits of present approaches are outlined, with an eye on potential future avenues for AI-assisted drug discovery and design.
Affiliation(s)
- Baskaralingam Vaseeharan
  - Department of Animal Health and Management, Science Block, Alagappa University, Karaikudi, Tamil Nadu 630 003, India
- Abdulaziz S Alothaim
  - Department of Biology, College of Science in Zulfi, Majmaah University, Al-Majmaah 11952, Saudi Arabia
- Esam S Al-Malki
  - Department of Biology, College of Science in Zulfi, Majmaah University, Al-Majmaah 11952, Saudi Arabia
- Babu Snekaa
  - Laboratory for Artificial Intelligence and Molecular Modelling, Department of Pharmacology, Saveetha Dental College and Hospitals, Saveetha Institute of Medical and Technical Sciences (SIMATS), Saveetha University, Chennai, Tamil Nadu 600077, India
- Sher Zaman Safi
  - Faculty of Medicine, Bioscience and Nursing, MAHSA University, Jenjarom 42610, Selangor, Malaysia
- Sanjeev Kumar Singh
  - Computer Aided Drug Design and Molecular Modelling Lab, Department of Bioinformatics, Science Block, Alagappa University, Karaikudi-630 003, Tamil Nadu, India
- Devadasan Velmurugan
  - Department of Biotechnology, College of Engineering & Technology, SRM Institute of Science & Technology, Kattankulathur, Chennai, Tamil Nadu 603203, India
- Chandrabose Selvaraj
  - Laboratory for Artificial Intelligence and Molecular Modelling, Department of Pharmacology, Saveetha Dental College and Hospitals, Saveetha Institute of Medical and Technical Sciences (SIMATS), Saveetha University, Chennai, Tamil Nadu 600077, India
  - Laboratory for Artificial Intelligence and Molecular Modelling, Center for Global Health Research, Saveetha Medical College, Saveetha Institute of Medical and Technical Sciences, Saveetha Nagar, Thandalam, Chennai, Tamil Nadu 602105, India