1
Kim JY, Choi DH. Deep learning-based image classification and quantification models for tablet sticking. Int J Pharm 2025:125690. [PMID: 40339626 DOI: 10.1016/j.ijpharm.2025.125690]
Abstract
Sticking can significantly affect drug product quality, manufacturing efficiency, and therapeutic efficacy in pharmaceutical tablet manufacturing. This study presents a novel integrated model that combines a convolutional neural network (CNN) with gray-level co-occurrence matrix (GLCM)-based features and a support vector machine to classify and quantify tablet sticking. The classification model was developed and evaluated using CNN architectures including AlexNet, VGG16, ResNet50, and GoogLeNet. GoogLeNet showed the best performance in terms of accuracy (99.39%), precision (100.00%), recall (98.78%), F1-score (99.38%), and computational efficiency. GLCM features such as energy, homogeneity, contrast, and correlation were analyzed to develop an optimal quantification model, revealing a significant difference between the sticking and non-sticking regions. Based on these differences, the sticking regions were detected and quantified using a sticking index. To validate the final model, which integrated the classification and quantification models, 10 batches of tablets were produced using a rotary tablet press. The validation confirmed high measurement repeatability, with minimal sticking levels that were correctly classified. Tablet quality attributes such as assay, content uniformity, and weight were evaluated. Despite the occurrence of sticking, the tablet quality attributes met their acceptance criteria. These results suggest that measuring tablet quality attributes and visual inspection may not be sufficient to detect mild sticking. However, the integrated model proposed in this study could detect mild sticking even when the tablet quality attributes remained within the acceptance criteria. This study demonstrated that the proposed integrated model could improve pharmaceutical manufacturing efficiency, ensure consistent drug product quality, and overcome the limitations of visual inspection.
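As a rough illustration of the texture analysis described above, the sketch below computes the four GLCM features with scikit-image and scores patches against non-sticking reference statistics; the threshold-based sticking index is an illustrative stand-in, not the paper's exact formula.

```python
# Hypothetical sketch: GLCM texture features for tablet-surface patches,
# assuming 8-bit grayscale patches; the sticking index is an illustrative
# threshold rule, not the paper's exact definition.
import numpy as np
from skimage.feature import graycomatrix, graycoprops

def glcm_features(patch: np.ndarray) -> dict:
    """Compute the four GLCM features analyzed in the study."""
    glcm = graycomatrix(patch, distances=[1], angles=[0, np.pi / 2],
                        levels=256, symmetric=True, normed=True)
    return {prop: graycoprops(glcm, prop).mean()
            for prop in ("energy", "homogeneity", "contrast", "correlation")}

def sticking_index(patches, ref_stats, z_thresh=2.0):
    """Fraction of patches whose texture deviates from non-sticking references.
    ref_stats maps each feature name to its (mean, std) over non-sticking regions."""
    flagged = 0
    for patch in patches:
        feats = glcm_features(patch)
        if any(abs(feats[k] - m) / s > z_thresh for k, (m, s) in ref_stats.items()):
            flagged += 1
    return flagged / len(patches)
```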
Affiliation(s)
- Ji Yeon Kim
- College of Pharmacy, Daegu Catholic University, Gyeongsan-si, Gyeongbuk 38430, Republic of Korea
- Du Hyung Choi
- College of Pharmacy, Daegu Catholic University, Gyeongsan-si, Gyeongbuk 38430, Republic of Korea
2
Sasmal P, Kumar Panigrahi S, Panda SL, Bhuyan MK. Attention-guided deep framework for polyp localization and subsequent classification via polyp local and Siamese feature fusion. Med Biol Eng Comput 2025:10.1007/s11517-025-03369-z. [PMID: 40314710 DOI: 10.1007/s11517-025-03369-z]
Abstract
Colorectal cancer (CRC) is one of the leading causes of death worldwide. This paper proposes an automated diagnostic technique to detect, localize, and classify polyps in colonoscopy video frames. The proposed model adopts the deep YOLOv4 model and incorporates spatial and contextual information through spatial attention and channel attention blocks, respectively, for better localization of polyps. Finally, leveraging a fusion of deep and handcrafted features, the detected polyps are classified as adenoma or non-adenoma. Polyp shape and texture are essential features for discriminating polyp types. Therefore, the proposed work utilizes a pyramid histogram of oriented gradients (PHOG) and embedding features learned via a triplet Siamese architecture to extract these features. The PHOG extracts local shape information from each polyp class, whereas the Siamese network extracts intra-polyp discriminating features. The individual and cross-database performances on two databases suggest the robustness of our method in polyp localization. A competitive analysis based on significant clinical parameters against current state-of-the-art methods confirms that our method can be used for automated polyp localization in both real-time and offline colonoscopic video frames. Our method provides an average precision of 0.8971 and 0.9171 and an F1 score of 0.8869 and 0.8812 for the Kvasir-SEG and SUN databases, respectively. Similarly, the proposed classification framework for the detected polyps yields a classification accuracy of 96.66% on a publicly available UCI colonoscopy video dataset. Moreover, the classification framework provides an F1 score of 96.54%, validating the potential of the proposed framework in polyp localization and classification.
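The PHOG descriptor mentioned above can be sketched as a spatial pyramid of orientation histograms; the following minimal version uses scikit-image's HOG, with pyramid depth and orientation count as assumed example settings rather than the authors' configuration.

```python
# Illustrative PHOG sketch for grayscale polyp crops: one orientation
# histogram per pyramid cell (1x1, 2x2, 4x4), concatenated and normalized.
import numpy as np
from skimage.feature import hog

def phog(image: np.ndarray, levels: int = 3, orientations: int = 8) -> np.ndarray:
    """Concatenate HOG histograms over a spatial pyramid."""
    h, w = image.shape
    descriptor = []
    for level in range(levels):
        cells = 2 ** level
        ph, pw = h // cells, w // cells
        for i in range(cells):
            for j in range(cells):
                region = image[i * ph:(i + 1) * ph, j * pw:(j + 1) * pw]
                # One histogram per region: a single HOG cell spanning the region.
                descriptor.append(hog(region, orientations=orientations,
                                      pixels_per_cell=region.shape,
                                      cells_per_block=(1, 1)))
    descriptor = np.concatenate(descriptor)
    return descriptor / (np.linalg.norm(descriptor) + 1e-8)
```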
Affiliation(s)
- Pradipta Sasmal
- Department of Electrical Engineering, Indian Institute of Technology, Kharagpur, West Bengal, 721302, India
- Susant Kumar Panigrahi
- Department of Electrical Engineering, Indian Institute of Technology, Kharagpur, West Bengal, 721302, India
- Swarna Laxmi Panda
- Department of Electronics and Communication Engineering, National Institute of Technology, Rourkela, Odisha, 769008, India
- M K Bhuyan
- Department of Electronics and Electrical Engineering, Indian Institute of Technology, Guwahati, Assam, 781039, India
3
Lara-Abelenda FJ, Chushig-Muzo D, Peiro-Corbacho P, Gómez-Martínez V, Wägner AM, Granja C, Soguero-Ruiz C. Transfer learning for a tabular-to-image approach: A case study for cardiovascular disease prediction. J Biomed Inform 2025; 165:104821. [PMID: 40209918 DOI: 10.1016/j.jbi.2025.104821]
Abstract
OBJECTIVE Machine learning (ML) models have been extensively used for tabular data classification, but recent works have transformed tabular data into images, aiming to leverage the predictive performance of convolutional neural networks (CNNs). However, most of these approaches fail to convert data with a low number of samples and mixed-type features. This study aims to evaluate the performance of the tabular-to-image method named low mixed-image generator for tabular data (LM-IGTD) and to assess the effectiveness of transfer learning and fine-tuning for improving predictions on tabular data. METHODS We employed two public tabular datasets with patients diagnosed with cardiovascular diseases (CVDs): Framingham and Steno. First, both datasets were transformed into images using LM-IGTD. Then Framingham, which contains a larger set of samples than Steno, was used to train CNN-based models. Finally, we performed transfer learning and fine-tuning using the pretrained CNN on the Steno dataset to predict CVD risk. RESULTS The CNN-based model with transfer learning achieved the highest AUROC on Steno (0.855), outperforming ML models such as decision trees, K-nearest neighbors, least absolute shrinkage and selection operator (LASSO), support vector machine, and TabPFN. This approach improved accuracy by 2% over the best-performing traditional model, TabPFN. CONCLUSION To the best of our knowledge, this is the first study to evaluate the effectiveness of applying transfer learning and fine-tuning to tabular data using tabular-to-image approaches. By exploiting CNNs' predictive capabilities, our work also advances the diagnosis of CVD by providing a framework for early clinical intervention and decision-making support.
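A minimal sketch of the transfer-learning step is shown below, assuming the tabular rows have already been rendered as images (the LM-IGTD conversion itself is not reproduced); ResNet-18 and all hyperparameters are illustrative stand-ins for the paper's CNN.

```python
# Sketch of transfer learning on image-encoded tabular data; all names
# and settings are illustrative assumptions.
import torch
import torch.nn as nn
from torchvision.models import resnet18

def build_finetune_model(num_classes: int = 2, freeze_backbone: bool = True):
    model = resnet18(weights="IMAGENET1K_V1")      # pretrained backbone
    if freeze_backbone:
        for p in model.parameters():
            p.requires_grad = False                # transfer: reuse learned filters
    model.fc = nn.Linear(model.fc.in_features, num_classes)  # fresh head, trainable
    return model

model = build_finetune_model()
optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-3)
criterion = nn.CrossEntropyLoss()

x = torch.randn(8, 3, 224, 224)                    # stand-in image-encoded rows
y = torch.randint(0, 2, (8,))                      # dummy CVD labels
loss = criterion(model(x), y)
loss.backward()
optimizer.step()
```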
Affiliation(s)
- Francisco J Lara-Abelenda
- Department of Signal Theory and Communications, Telematics and Computing Systems, Rey Juan Carlos University, Madrid, Spain
- David Chushig-Muzo
- Department of Signal Theory and Communications, Telematics and Computing Systems, Rey Juan Carlos University, Madrid, Spain
- Pablo Peiro-Corbacho
- Department of Signal Theory and Communications, Telematics and Computing Systems, Rey Juan Carlos University, Madrid, Spain
- Vanesa Gómez-Martínez
- Department of Signal Theory and Communications, Telematics and Computing Systems, Rey Juan Carlos University, Madrid, Spain
- Ana M Wägner
- Instituto Universitario de Investigaciones Biomédicas y Sanitarias, Universidad de Las Palmas de Gran Canaria, Las Palmas de Gran Canaria, Spain
- Conceição Granja
- Norwegian Centre for E-health Research, University Hospital of North Norway, Tromsø, Norway
- Cristina Soguero-Ruiz
- Department of Signal Theory and Communications, Telematics and Computing Systems, Rey Juan Carlos University, Madrid, Spain
4
Gupta A, Bajaj S, Nema P, Purohit A, Kashaw V, Soni V, Kashaw SK. Potential of AI and ML in oncology research including diagnosis, treatment and future directions: A comprehensive prospective. Comput Biol Med 2025; 189:109918. [PMID: 40037170 DOI: 10.1016/j.compbiomed.2025.109918]
Abstract
Artificial intelligence (AI) and machine learning (ML) have emerged as transformative tools in cancer research, offering the ability to process huge volumes of data rapidly and to support precise therapeutic decisions. Over the last decade, AI, particularly deep learning (DL) and ML, has significantly enhanced cancer prediction, diagnosis, and treatment by leveraging algorithms such as convolutional neural networks (CNNs) and multi-layer perceptrons (MLPs). These technologies provide reliable, efficient solutions for managing aggressive diseases like cancer, which have high recurrence and mortality rates. This review highlights the applications of AI in oncology, along with FDA-approved technologies such as the EFAI RTSuite CT HN-Segmentation System, Quantib Prostate, and Paige Prostate, and explores their role in advancing cancer detection, personalized care, and treatment. Furthermore, we also explore broader applications of AI in healthcare, addressing challenges, limitations, regulatory considerations, and ethical implications. By presenting these advancements, we underscore AI's potential to revolutionize cancer care, management, and treatment.
Affiliation(s)
- Akanksha Gupta
- Integrated Drug Discovery Research Laboratory, Department of Pharmaceutical Sciences, Dr. Harisingh Gour University (A Central University), Sagar, Madhya Pradesh, 470003, India
- Samyak Bajaj
- Integrated Drug Discovery Research Laboratory, Department of Pharmaceutical Sciences, Dr. Harisingh Gour University (A Central University), Sagar, Madhya Pradesh, 470003, India
- Priyanshu Nema
- Integrated Drug Discovery Research Laboratory, Department of Pharmaceutical Sciences, Dr. Harisingh Gour University (A Central University), Sagar, Madhya Pradesh, 470003, India
- Arpana Purohit
- Integrated Drug Discovery Research Laboratory, Department of Pharmaceutical Sciences, Dr. Harisingh Gour University (A Central University), Sagar, Madhya Pradesh, 470003, India
- Varsha Kashaw
- Sagar Institute of Pharmaceutical Sciences, Sagar, M.P., India
- Vandana Soni
- Integrated Drug Discovery Research Laboratory, Department of Pharmaceutical Sciences, Dr. Harisingh Gour University (A Central University), Sagar, Madhya Pradesh, 470003, India
- Sushil K Kashaw
- Integrated Drug Discovery Research Laboratory, Department of Pharmaceutical Sciences, Dr. Harisingh Gour University (A Central University), Sagar, Madhya Pradesh, 470003, India
5
Subeesh A, Chauhan N. Deep learning based abiotic crop stress assessment for precision agriculture: A comprehensive review. J Environ Manage 2025; 381:125158. [PMID: 40203709 DOI: 10.1016/j.jenvman.2025.125158]
Abstract
Abiotic stresses are a leading cause of crop loss and a severe threat to global food security. Precise and prompt identification of abiotic stresses in crops is crucial for effective mitigation strategies. In recent years, deep learning (DL) techniques have demonstrated remarkable promise for high-throughput crop stress phenotyping using remote sensing and field data. This study offers a comprehensive review of the applications of DL models such as artificial neural networks (ANN), convolutional neural networks (CNN), recurrent neural networks (RNN), vision transformers (ViT), and other advanced deep learning architectures for abiotic crop stress assessment using different modalities, including IoT sensor data and thermal, spectral, and RGB imagery from field, UAV, and satellite platforms. The study comprehensively analyses abiotic stress conditions due to (a) water, (b) nutrients, (c) salinity, (d) temperature, and (e) heavy metals. Key contributions in the literature on stress classification, localization, and quantification using deep learning approaches are discussed in detail. The study also covers the principles of deep learning models and their unique capabilities for handling the complex, high-dimensional datasets inherent in abiotic crop stress assessment. The review also highlights important challenges and future directions in deep learning-based abiotic crop stress assessment, such as limited labelled data, model interpretability, and interoperability for robust stress phenotyping. This study critically examines the research pertaining to abiotic crop stress assessment and provides a comprehensive view of the role deep learning plays in advancing data-driven precision agriculture.
Affiliation(s)
- A Subeesh
- Department of Computer Science and Engineering, National Institute of Technology, Hamirpur, HP, 177005, India; Agricultural Mechanization Division, ICAR-Central Institute of Agricultural Engineering, Bhopal, 462038, MP, India
- Naveen Chauhan
- Department of Computer Science and Engineering, National Institute of Technology, Hamirpur, HP, 177005, India
6
Hosseinzadeh Taher MR, Haghighi F, Gotway MB, Liang J. Large-scale benchmarking and boosting transfer learning for medical image analysis. Med Image Anal 2025; 102:103487. [PMID: 40117988 DOI: 10.1016/j.media.2025.103487]
Abstract
Transfer learning, particularly fine-tuning models pretrained on photographic images for medical images, has proven indispensable for medical image analysis. There are numerous models with distinct architectures pretrained on various datasets using different strategies. However, there is a lack of up-to-date, large-scale evaluations of their transferability to medical imaging, posing a challenge for practitioners in selecting the most appropriate pretrained models for their tasks at hand. To fill this gap, we conduct a comprehensive systematic study, focusing on (i) benchmarking numerous conventional and modern convolutional neural network (ConvNet) and vision transformer architectures across various medical tasks; (ii) investigating the impact of fine-tuning data size on the performance of ConvNets compared with vision transformers in medical imaging; (iii) examining the impact of pretraining data granularity on transfer learning performance; (iv) evaluating the transferability of a wide range of recent self-supervised methods with diverse training objectives to a variety of medical tasks across different modalities; and (v) delving into the efficacy of domain-adaptive pretraining on both photographic and medical datasets to develop high-performance models for medical tasks. Our large-scale study (∼5,000 experiments) yields impactful insights: (1) ConvNets demonstrate higher transferability than vision transformers when fine-tuning for medical tasks; (2) ConvNets prove to be more annotation efficient than vision transformers when fine-tuning for medical tasks; (3) fine-grained representations, rather than high-level semantic features, prove pivotal for fine-grained medical tasks; (4) self-supervised models excel in learning holistic features compared with supervised models; and (5) domain-adaptive pretraining leads to performant models by harnessing knowledge acquired from ImageNet and enhancing it through readily accessible expert annotations associated with medical datasets. As open science, all code and pretrained models are available at GitHub.com/JLiangLab/BenchmarkTransferLearning (Version 2).
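A compact sketch of the kind of transfer experiment benchmarked here, using the timm library to pair a ConvNet and a vision transformer under a linear-probe setup; the two model names are examples, not the study's full roster.

```python
# Sketch of comparing ConvNet vs. vision-transformer transferability with
# timm; model names and the linear-probe setup are illustrative examples.
import timm
import torch

def finetune_head_only(name: str, num_classes: int):
    """Load a pretrained backbone and train only a fresh classification head."""
    model = timm.create_model(name, pretrained=True, num_classes=num_classes)
    for p in model.parameters():
        p.requires_grad = False
    for p in model.get_classifier().parameters():
        p.requires_grad = True               # linear-probe style transfer
    return model

for name in ("resnet50", "vit_base_patch16_224"):
    model = finetune_head_only(name, num_classes=2)
    x = torch.randn(2, 3, 224, 224)
    print(name, model(x).shape)              # torch.Size([2, 2])
```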
Affiliation(s)
- Fatemeh Haghighi
- School of Computing and Augmented Intelligence, Arizona State University, Tempe, AZ 85281, USA
- Jianming Liang
- School of Computing and Augmented Intelligence, Arizona State University, Tempe, AZ 85281, USA
7
Park J, Xu Z, Park GM, Luo T, Lee E. Inverse binary optimization of convolutional neural network in active learning efficiently designs nanophotonic structures. Sci Rep 2025; 15:15187. [PMID: 40307366 PMCID: PMC12043942 DOI: 10.1038/s41598-025-99570-z]
Abstract
Binary optimization using active learning schemes has gained attention for automating the discovery of optimal designs in nanophotonic structures and material configurations. Recently, active learning has utilized factorization machines (FM), which are usually second-order models, as surrogates to approximate the hypervolume of the design space, benefiting from rapid optimization by Ising machines such as quantum annealing (QA). However, due to their second-order nature, FM-based surrogate functions struggle to fully capture the complexity of the hypervolume. In this paper, we introduce an inverse binary optimization (IBO) scheme that optimizes a surrogate function based on a convolutional neural network (CNN) within an active learning framework. The IBO method employs backward error propagation to optimize the input binary vector, minimizing the output value while keeping the parameters of the pretrained CNN layers fixed. We conduct a benchmarking study of the CNN-based surrogate function within the CNN-IBO framework by optimizing nanophotonic designs (e.g., planar multilayer and stratified grating structures) as a testbed. Our results demonstrate that CNN-IBO achieves optimal designs with less actively accumulated training data than FM-QA, indicating its potential as a powerful and efficient method for binary optimization.
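The IBO idea, freezing the surrogate's weights and backpropagating to the binary input, can be sketched as follows; the toy surrogate, vector length, and optimizer settings are assumptions for illustration only.

```python
# Conceptual sketch of inverse binary optimization (IBO): the surrogate
# CNN's weights stay frozen while gradients flow back to a relaxed binary
# input, which is thresholded afterwards. The surrogate here is a toy model.
import torch
import torch.nn as nn

surrogate = nn.Sequential(                 # stand-in for the trained CNN surrogate
    nn.Conv1d(1, 8, 3, padding=1), nn.ReLU(),
    nn.Flatten(), nn.Linear(8 * 64, 1))
for p in surrogate.parameters():
    p.requires_grad = False                # fix pretrained layers

logits = torch.zeros(1, 1, 64, requires_grad=True)   # relaxed design variables
opt = torch.optim.Adam([logits], lr=0.05)

for _ in range(200):
    x = torch.sigmoid(logits)              # continuous relaxation of binary vector
    loss = surrogate(x).sum()              # objective predicted by the surrogate
    opt.zero_grad()
    loss.backward()                        # backprop to the *input*, not the weights
    opt.step()

design = (torch.sigmoid(logits) > 0.5).int()   # final binary design candidate
```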
Affiliation(s)
- Jaehyeon Park
- Department of Electronic Engineering, Kyung Hee University, Yongin-si, Gyeonggi-do, 17104, Republic of Korea
- Zhihao Xu
- Department of Aerospace and Mechanical Engineering, University of Notre Dame, Notre Dame, IN, 46556, USA
- Gyeong-Moon Park
- Department of Artificial Intelligence, Korea University, Seongbuk-gu, Seoul, 02841, Republic of Korea
- Tengfei Luo
- Department of Aerospace and Mechanical Engineering, University of Notre Dame, Notre Dame, IN, 46556, USA
- Eungkyu Lee
- Department of Electronic Engineering, Kyung Hee University, Yongin-si, Gyeonggi-do, 17104, Republic of Korea
8
Han X, Peng C, Ruan SM, Li L, He M, Shi M, Huang B, Luo Y, Liu J, Wen H, Wang W, Zhou J, Lu M, Chen X, Zou R, Liu Z. A Contrast-Enhanced Ultrasound Cine-Based Deep Learning Model for Predicting the Response of Advanced Hepatocellular Carcinoma to Hepatic Arterial Infusion Chemotherapy Combined With Systemic Therapies. Cancer Sci 2025. [PMID: 40302359 DOI: 10.1111/cas.70089]
Abstract
Recently, a hepatic arterial infusion chemotherapy (HAIC)-associated combination therapeutic regimen comprising HAIC and systemic therapies (molecular targeted therapy plus immunotherapy), referred to as HAIC combination therapy, has demonstrated promising anticancer effects. Identifying individuals who may potentially benefit from HAIC combination therapy could contribute to improved treatment decision-making for patients with advanced hepatocellular carcinoma (HCC). This dual-center study was a retrospective analysis of prospectively collected data from advanced HCC patients who underwent HAIC combination therapy and pretreatment contrast-enhanced ultrasound (CEUS) evaluations from March 2019 to March 2023. Two deep learning models, AE-3DNet and 3DNet, along with a time-intensity curve-based model, were developed to predict therapeutic responses from pretreatment CEUS cine images. Diagnostic metrics, including the area under the receiver-operating-characteristic curve (AUC), were calculated to compare the performance of the models. Survival analysis was used to assess the relationship between predicted responses and prognostic outcomes. AE-3DNet was built on top of 3DNet, with spatiotemporal attention modules incorporated to enhance its capacity for dynamic feature extraction. A total of 326 patients were included, 243 of whom formed the internal validation cohort, which was used for model development and fivefold cross-validation, while the rest formed the external validation cohort. Objective response (OR) and non-objective response (non-OR) were observed in 63% (206/326) and 37% (120/326) of the participants, respectively. Among the three efficacy prediction models assessed, AE-3DNet performed best, with AUC values of 0.84 and 0.85 in the internal and external validation cohorts, respectively. The survival curves for AE-3DNet's predicted responses closely resembled actual clinical outcomes. The AE-3DNet deep learning model developed from pretreatment CEUS cine performed satisfactorily in predicting the response of advanced HCC to HAIC combination therapy and may serve as a promising tool for guiding combined therapy and individualized treatment strategies. Trial Registration: NCT02973685.
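Since AE-3DNet's exact architecture is not reproduced here, the sketch below shows only the generic ingredients named in the abstract, a 3D convolutional classifier over cine clips with a crude spatiotemporal attention gate; all layer sizes are illustrative.

```python
# Generic stand-in for a cine-clip response classifier over input shaped
# (batch, channel, time, H, W); not the published AE-3DNet architecture.
import torch
import torch.nn as nn

class TinyCine3DNet(nn.Module):
    def __init__(self, num_classes: int = 2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool3d(2),
            nn.Conv3d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d((4, 4, 4)))
        self.attn = nn.Sequential(             # crude spatiotemporal gate
            nn.Conv3d(32, 1, kernel_size=1), nn.Sigmoid())
        self.head = nn.Linear(32 * 4 * 4 * 4, num_classes)

    def forward(self, x):
        f = self.features(x)
        f = f * self.attn(f)                   # reweight informative voxels/frames
        return self.head(f.flatten(1))

clip = torch.randn(2, 1, 16, 64, 64)           # 16-frame grayscale cine clips
print(TinyCine3DNet()(clip).shape)             # torch.Size([2, 2])
```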
Affiliation(s)
- Xu Han
- State Key Laboratory of Oncology in South China, Guangdong Provincial Clinical Research Center for Cancer, Department of Ultrasound, Sun Yat-Sen University Cancer Center, Guangzhou, China
- Chuan Peng
- State Key Laboratory of Oncology in South China, Guangdong Provincial Clinical Research Center for Cancer, Department of Ultrasound, Sun Yat-Sen University Cancer Center, Guangzhou, China
- Si-Min Ruan
- Department of Medical Ultrasonics, Ultrasomics Artificial Intelligence X-Lab, Institute of Diagnostic and Interventional Ultrasound, the First Affiliated Hospital of Sun Yat-Sen University, Guangzhou, China
- Lingling Li
- State Key Laboratory of Oncology in South China, Guangdong Provincial Clinical Research Center for Cancer, Department of Ultrasound, Sun Yat-Sen University Cancer Center, Guangzhou, China
- Minke He
- State Key Laboratory of Oncology in South China, Guangdong Provincial Clinical Research Center for Cancer, Department of Hepatobiliary Oncology, Sun Yat-Sen University Cancer Center, Guangzhou, China
- Ming Shi
- State Key Laboratory of Oncology in South China, Guangdong Provincial Clinical Research Center for Cancer, Department of Hepatobiliary Oncology, Sun Yat-Sen University Cancer Center, Guangzhou, China
- Bin Huang
- National-Regional Key Technology Engineering Laboratory for Medical Ultrasound, Guangdong Key Laboratory for Biomedical Measurements and Ultrasound Imaging, School of Biomedical Engineering, Shenzhen University Medical School, Shenzhen University, Shenzhen, China
- Yudi Luo
- National-Regional Key Technology Engineering Laboratory for Medical Ultrasound, Guangdong Key Laboratory for Biomedical Measurements and Ultrasound Imaging, School of Biomedical Engineering, Shenzhen University Medical School, Shenzhen University, Shenzhen, China
- Jingming Liu
- National-Regional Key Technology Engineering Laboratory for Medical Ultrasound, Guangdong Key Laboratory for Biomedical Measurements and Ultrasound Imaging, School of Biomedical Engineering, Shenzhen University Medical School, Shenzhen University, Shenzhen, China
- Huiying Wen
- National-Regional Key Technology Engineering Laboratory for Medical Ultrasound, Guangdong Key Laboratory for Biomedical Measurements and Ultrasound Imaging, School of Biomedical Engineering, Shenzhen University Medical School, Shenzhen University, Shenzhen, China
- Wei Wang
- Department of Medical Ultrasonics, Ultrasomics Artificial Intelligence X-Lab, Institute of Diagnostic and Interventional Ultrasound, the First Affiliated Hospital of Sun Yat-Sen University, Guangzhou, China
- Jianhua Zhou
- State Key Laboratory of Oncology in South China, Guangdong Provincial Clinical Research Center for Cancer, Department of Ultrasound, Sun Yat-Sen University Cancer Center, Guangzhou, China
- Minhua Lu
- National-Regional Key Technology Engineering Laboratory for Medical Ultrasound, Guangdong Key Laboratory for Biomedical Measurements and Ultrasound Imaging, School of Biomedical Engineering, Shenzhen University Medical School, Shenzhen University, Shenzhen, China
- Xin Chen
- National-Regional Key Technology Engineering Laboratory for Medical Ultrasound, Guangdong Key Laboratory for Biomedical Measurements and Ultrasound Imaging, School of Biomedical Engineering, Shenzhen University Medical School, Shenzhen University, Shenzhen, China
- Ruhai Zou
- State Key Laboratory of Oncology in South China, Guangdong Provincial Clinical Research Center for Cancer, Department of Ultrasound, Sun Yat-Sen University Cancer Center, Guangzhou, China
- Zhong Liu
- National-Regional Key Technology Engineering Laboratory for Medical Ultrasound, Guangdong Key Laboratory for Biomedical Measurements and Ultrasound Imaging, School of Biomedical Engineering, Shenzhen University Medical School, Shenzhen University, Shenzhen, China
9
M SD, Balasubaramanian S, S B, Shah MA. Restricted Boltzmann machine with Sobel filter dense adversarial noise secured layer framework for flower species recognition. Sci Rep 2025; 15:12315. [PMID: 40210949 PMCID: PMC11985983 DOI: 10.1038/s41598-025-95564-z]
Abstract
Recognition is a high-level computer vision task that primarily involves categorizing objects by identifying and evaluating their key distinguishing characteristics. Categorization is important in botany because it helps organize and clarify the relationships between various flower species. Because there is a great deal of variability among flower species and some species closely resemble one another, classifying flowers can be difficult. An appropriate classification technique based on deep learning is therefore vital for categorizing flower species effectively. This motivates the proposed Sobel Restricted Boltzmann VGG19 (SRB-VGG19) model, which is inspired by VGG19 and is highly effective at classifying flower species. This research contributes in three main ways. The first contribution concerns dataset preparation, with feature extraction performed using the Sobel filter and the Restricted Boltzmann Machine (RBM) neural network through unsupervised learning. The second contribution focuses on improving the VGG19 and DenseNet models for supervised learning, which are used to classify flower species into five groups. The third contribution addresses data poisoning attacks, in which adversarial input samples are generated with the Fast Gradient Sign Method (FGSM); the FGSM attack was mitigated by forming an Adversarial Noise Layer in the dense block. The Kaggle Flowers Recognition dataset was preprocessed to extract only the important features using the Sobel filter, which computes the image intensity gradient at every pixel. The Sobel-filtered images were then passed to the RBM to generate RBM Component Vectorized Flower images (RBMCV), which were divided into 3400 training and 850 testing images. To determine the best CNN, existing CNN models were fitted to the training images. According to the experimental results, VGG19 and DenseNet can classify flower species with an accuracy above 80%, so they were fine-tuned to design the proposed SRB-VGG19 model. The novelty of this research lies in designing two sub-models, the SRB-VGG FCL model and the SRB-VGG Dense model, and in validating the model's security countermeasure through an FGSM attack. The proposed SRB-VGG19 begins by forming the RBMCV input images, which include only the essential flower edges. The RBMCV flower images were trained with the SRB-VGG FCL and SRB-VGG Dense models, and a performance analysis was carried out. Compared with current deep learning models, the implementation results show that the proposed SRB-VGG Dense model classifies flower species with a high accuracy of 98.65%.
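The FGSM perturbation used in the robustness test is a standard one-step attack; a minimal PyTorch version, with an example epsilon, looks like this:

```python
# Standard FGSM perturbation for a generic PyTorch classifier; epsilon is
# an example value, and inputs are assumed to lie in [0, 1].
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, y, eps=0.03):
    """Return adversarial examples x + eps * sign(dL/dx)."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    x_adv = x + eps * x.grad.sign()        # one-step gradient-sign perturbation
    return x_adv.clamp(0, 1).detach()      # keep pixels in the valid range
```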
Affiliation(s)
- Shyamala Devi M
- Department of Computer Science and Engineering, Panimalar Engineering College, Chennai, Tamil Nadu, 600123, India
- Balasubramaniam S
- School of Computer Science and Engineering, Kerala University of Digital Sciences, Innovation and Technology (Formerly IIITM-K), Digital University Kerala, Thiruvananthapuram, Kerala, India
- Mohd Asif Shah
- Kardan University, Parwan e Du, Kabul, 1001, Afghanistan
- Division of Research and Development, Lovely Professional University, Phagwara, Punjab, 144001, India
10
Lin M, Guo J, Gu Z, Tang W, Tao H, You S, Jia D, Sun Y, Jia P. Machine learning and multi-omics integration: advancing cardiovascular translational research and clinical practice. J Transl Med 2025; 23:388. [PMID: 40176068 PMCID: PMC11966820 DOI: 10.1186/s12967-025-06425-2]
Abstract
The global burden of cardiovascular diseases continues to rise, making their prevention, diagnosis, and treatment increasingly critical. With advancements and breakthroughs in omics technologies such as high-throughput sequencing, multi-omics approaches can offer a closer reflection of the complex physiological and pathological changes in the body from a molecular perspective, providing new microscopic insights into cardiovascular disease research. However, due to the vast volume and complexity of the data, accurately describing, utilising, and translating these biomedical data demands substantial effort. Researchers and clinicians are actively developing artificial intelligence (AI) methods for data-driven knowledge discovery and causal inference using various omics data. These AI approaches, integrated with multi-omics research, have shown promising outcomes in cardiovascular studies. In this review, we outline the methods for integrating machine learning, one of the most successful applications of AI, with omics data and summarise representative AI models that leverage various omics data to facilitate the exploration of cardiovascular diseases from underlying mechanisms to clinical practice. Particular emphasis is placed on the effectiveness of using AI to extract potential molecular information to address current knowledge gaps. We discuss the challenges and opportunities of integrating omics with AI into routine diagnostic and therapeutic practices and anticipate the future development of novel AI models for wider application in the field of cardiovascular diseases.
Affiliation(s)
- Mingzhi Lin
- Department of Cardiology, The First Hospital of China Medical University, 155 Nanjing North Street, Heping District, Shenyang, 110001, People's Republic of China
- Jiuqi Guo
- Department of Cardiology, The First Hospital of China Medical University, 155 Nanjing North Street, Heping District, Shenyang, 110001, People's Republic of China
- Zhilin Gu
- Department of Cardiology, The First Hospital of China Medical University, 155 Nanjing North Street, Heping District, Shenyang, 110001, People's Republic of China
- Wenyi Tang
- Department of Cardiology, The First Hospital of China Medical University, 155 Nanjing North Street, Heping District, Shenyang, 110001, People's Republic of China
- Hongqian Tao
- Department of Cardiology, The First Hospital of China Medical University, 155 Nanjing North Street, Heping District, Shenyang, 110001, People's Republic of China
- Shilong You
- Department of Cardiology, The First Hospital of China Medical University, 155 Nanjing North Street, Heping District, Shenyang, 110001, People's Republic of China
- Dalin Jia
- Department of Cardiology, The First Hospital of China Medical University, 155 Nanjing North Street, Heping District, Shenyang, 110001, People's Republic of China
- Yingxian Sun
- Department of Cardiology, The First Hospital of China Medical University, 155 Nanjing North Street, Heping District, Shenyang, 110001, People's Republic of China
- Key Laboratory of Environmental Stress and Chronic Disease Control and Prevention, Ministry of Education, China Medical University, Shenyang, Liaoning, China
- Pengyu Jia
- Department of Cardiology, The First Hospital of China Medical University, 155 Nanjing North Street, Heping District, Shenyang, 110001, People's Republic of China
11
Nakabayashi D, Inui A, Mifune Y, Yamaura K, Kato T, Furukawa T, Hayashi S, Matsumoto T, Matsushita T, Kuroda R. Quantitative Evaluation of Tendon Gliding Sounds and Their Classification Using Deep Learning Models. Cureus 2025; 17:e81790. [PMID: 40330348 PMCID: PMC12054386 DOI: 10.7759/cureus.81790]
Abstract
This study aims to develop and evaluate a deep learning (DL) model for classifying tendon gliding sounds recorded using digital stethoscopes (Nexteto, ShareMedical, Nagoya, Japan). Specifically, we investigate whether differences in tendon excursion and biomechanics produce distinct acoustic signatures that can be identified through spectrogram analysis and machine learning (ML). Tendon disorders often present characteristic tactile and acoustic features, such as clicking or resistance during movement. In recent years, artificial intelligence (AI) and ML have achieved significant success in medical diagnostics, particularly through pattern recognition in medical imaging. Leveraging these advancements, we recorded tendon gliding sounds from the thumb and index finger of healthy volunteers and transformed the recordings into spectrograms for analysis. Although the sample size was small, we performed classification based on the frequency characteristics of the spectrograms using DL models, achieving high classification accuracy. These findings indicate that AI-based models can accurately distinguish between different tendon sounds and strongly suggest their potential as a non-invasive diagnostic tool for musculoskeletal disorders such as tenosynovitis or carpal tunnel syndrome, potentially aiding early diagnosis and treatment planning.
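A hedged sketch of the spectrogram preprocessing implied above, assuming mono recordings and illustrative short-time parameters:

```python
# Sketch of converting a stethoscope recording into a log-Mel spectrogram
# image suitable for a CNN; all parameter values are illustrative.
import librosa
import numpy as np

def tendon_sound_to_spectrogram(path: str) -> np.ndarray:
    y, sr = librosa.load(path, sr=None, mono=True)       # keep native sample rate
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_fft=1024,
                                         hop_length=256, n_mels=64)
    return librosa.power_to_db(mel, ref=np.max)          # log-scaled "image" input
```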
Affiliation(s)
- Daiji Nakabayashi
- Department of Orthopaedic Surgery, Kobe University Graduate School of Medicine, Kobe, JPN
- Atsuyuki Inui
- Department of Orthopaedic Surgery, Kobe University Graduate School of Medicine, Kobe, JPN
- Yutaka Mifune
- Department of Orthopaedic Surgery, Kobe University Graduate School of Medicine, Kobe, JPN
- Kohei Yamaura
- Department of Orthopaedic Surgery, Kobe University Graduate School of Medicine, Kobe, JPN
- Tatsuo Kato
- Department of Orthopaedic Surgery, Kobe University Graduate School of Medicine, Kobe, JPN
- Takahiro Furukawa
- Department of Orthopaedic Surgery, Kobe University Graduate School of Medicine, Kobe, JPN
- Shinya Hayashi
- Department of Orthopaedic Surgery, Kobe University Graduate School of Medicine, Kobe, JPN
- Tomoyuki Matsumoto
- Department of Orthopaedic Surgery, Kobe University Graduate School of Medicine, Kobe, JPN
- Takehiko Matsushita
- Department of Orthopaedic Surgery, Kobe University Graduate School of Medicine, Kobe, JPN
- Ryosuke Kuroda
- Department of Orthopaedic Surgery, Kobe University Graduate School of Medicine, Kobe, JPN
12
Kamachi M, Kamada K, Kanzaki N, Yamamoto T, Hoshino Y, Inui A, Nakanishi Y, Nishida K, Nagai K, Matsushita T, Kuroda R. Using deep learning for ultrasound images to diagnose chronic lateral ankle instability with high accuracy. Asia Pac J Sports Med Arthrosc Rehabil Technol 2025; 40:1-6. [PMID: 39911312 PMCID: PMC11791010 DOI: 10.1016/j.asmart.2025.01.001]
Abstract
The purpose of this study is to calculate the diagnostic accuracy of chronic lateral ankle instability (CLAI) from a confusion matrix using deep learning (DL) on ultrasound images of the anterior talofibular ligament (ATFL). The study included 30 ankles with no history of ankle sprains (control group) and 30 ankles diagnosed with CLAI (injury group). A total of 2000 images were prepared for each group by capturing ultrasound videos visualizing the fibers of the ATFL under anterior drawer stress. The images of 20 feet in each group were randomly selected and used as training data, and the images of the remaining 10 feet in each group were used as test data. Transfer learning was performed using three pretrained DL models, and the accuracy, precision, recall (sensitivity), specificity, F-measure, and area under the receiver operating characteristic curve (AUC) were calculated from the confusion matrix. The important features were visualized using occlusion sensitivity, a method for highlighting the areas that matter most to model predictions. DL was able to diagnose CLAI from ultrasound images with very high accuracy and AUC across the three learning models. In the visualization of the region of interest, the AI focused on the substance of the ATFL and its attachment to the fibula when diagnosing CLAI.
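Occlusion sensitivity, as used for the visualization above, can be sketched generically: slide a blank patch across the image and record the drop in the predicted class probability. The model, patch size, and stride below are placeholders.

```python
# Generic occlusion-sensitivity sketch for any PyTorch image classifier;
# patch, stride, and fill value are illustrative choices.
import torch

@torch.no_grad()
def occlusion_map(model, image, target_class, patch=16, stride=8, fill=0.0):
    """image: (1, C, H, W) tensor; returns a coarse sensitivity grid."""
    base = torch.softmax(model(image), dim=1)[0, target_class].item()
    _, _, H, W = image.shape
    rows, cols = (H - patch) // stride + 1, (W - patch) // stride + 1
    heat = torch.zeros(rows, cols)
    for i in range(rows):
        for j in range(cols):
            occluded = image.clone()
            occluded[:, :, i*stride:i*stride+patch, j*stride:j*stride+patch] = fill
            p = torch.softmax(model(occluded), dim=1)[0, target_class].item()
            heat[i, j] = base - p             # large drop = important region
    return heat
```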
Affiliation(s)
- Masamune Kamachi
- Department of Orthopaedic Surgery, Kobe University Graduate School of Medicine, Kobe, 650-0017, Japan
- Kohei Kamada
- Department of Orthopaedic Surgery, Kobe University Graduate School of Medicine, Kobe, 650-0017, Japan
- Noriyuki Kanzaki
- Department of Orthopaedic Surgery, Kobe University Graduate School of Medicine, Kobe, 650-0017, Japan
- Tetsuya Yamamoto
- Department of Orthopaedic Surgery, Kobe University Graduate School of Medicine, Kobe, 650-0017, Japan
- Yuichi Hoshino
- Department of Orthopaedic Surgery, Kobe University Graduate School of Medicine, Kobe, 650-0017, Japan
- Atsuyuki Inui
- Department of Orthopaedic Surgery, Kobe University Graduate School of Medicine, Kobe, 650-0017, Japan
- Yuta Nakanishi
- Department of Orthopaedic Surgery, Kobe University Graduate School of Medicine, Kobe, 650-0017, Japan
- Kyohei Nishida
- Department of Orthopaedic Surgery, Kobe University Graduate School of Medicine, Kobe, 650-0017, Japan
- Kanto Nagai
- Department of Orthopaedic Surgery, Kobe University Graduate School of Medicine, Kobe, 650-0017, Japan
- Takehiko Matsushita
- Department of Orthopaedic Surgery, Kobe University Graduate School of Medicine, Kobe, 650-0017, Japan
- Ryosuke Kuroda
- Department of Orthopaedic Surgery, Kobe University Graduate School of Medicine, Kobe, 650-0017, Japan
13
Kumar S, Earnest T, Yang B, Kothapalli D, Aschenbrenner AJ, Hassenstab J, Xiong C, Ances B, Morris J, Benzinger TLS, Gordon BA, Payne P, Sotiras A. Analyzing heterogeneity in Alzheimer disease using multimodal normative modeling on imaging-based ATN biomarkers. Alzheimers Dement 2025; 21:e70143. [PMID: 40235115 PMCID: PMC12000228 DOI: 10.1002/alz.70143]
Abstract
INTRODUCTION Previous studies have applied normative modeling to a single neuroimaging modality to investigate Alzheimer disease (AD) heterogeneity. We employed a deep learning-based multimodal normative framework to analyze individual-level variation across ATN (amyloid-tau-neurodegeneration) imaging biomarkers. METHODS We selected cross-sectional discovery (n = 665) and replication cohorts (n = 430) with available T1-weighted magnetic resonance imaging (MRI), amyloid, and tau positron emission tomography (PET). Normative modeling estimated individual-level abnormal deviations in amyloid-positive individuals relative to amyloid-negative controls. Regional abnormality patterns were mapped at different clinical group levels to assess intra-group heterogeneity. An individual-level disease severity index (DSI) was calculated using both the spatial extent and magnitude of abnormal deviations across ATN. RESULTS Greater intra-group heterogeneity in ATN abnormality patterns was observed in more severe clinical stages of AD. A higher DSI was associated with worse cognitive function and an increased risk of disease progression. DISCUSSION Subject-specific abnormality maps across ATN reveal the heterogeneous impact of AD on the brain. HIGHLIGHTS Normative modeling examined AD heterogeneity across multimodal imaging biomarkers. Heterogeneity in spatial patterns of gray matter atrophy, amyloid, and tau burden. Higher within-group heterogeneity for AD patients at advanced dementia stages. Patient-specific metric summarized extent of neurodegeneration and neuropathology. Metric is a marker of poor brain health and can monitor risk of disease progression.
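The normative-deviation idea can be sketched in a few lines: z-score each regional measure against the amyloid-negative controls, then summarize the extent and magnitude of abnormal deviations. The DSI combination shown is illustrative, not the paper's exact definition.

```python
# Simplified sketch of normative deviations across ATN measures; the DSI
# formula here is an illustrative extent-times-magnitude summary.
import numpy as np

def deviation_z(patient, ctrl_mean, ctrl_std):
    """patient, ctrl_mean, ctrl_std: arrays over (regions x modalities)."""
    return (patient - ctrl_mean) / ctrl_std

def disease_severity_index(z, thresh=1.96):
    abnormal = np.abs(z) > thresh
    extent = abnormal.mean()                          # fraction of abnormal regions
    magnitude = np.abs(z[abnormal]).mean() if abnormal.any() else 0.0
    return extent * magnitude                         # combines both aspects
```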
Affiliation(s)
- Sayantan Kumar
- Department of Computer Science and Engineering, Washington University in St. Louis, Saint Louis, Missouri, USA
- Institute for Informatics, Data Science & Biostatistics, Washington University School of Medicine in St. Louis, Saint Louis, Missouri, USA
- Mallinckrodt Institute of Radiology, Washington University School of Medicine in St. Louis, Saint Louis, Missouri, USA
- Tom Earnest
- Mallinckrodt Institute of Radiology, Washington University School of Medicine in St. Louis, Saint Louis, Missouri, USA
- Braden Yang
- Mallinckrodt Institute of Radiology, Washington University School of Medicine in St. Louis, Saint Louis, Missouri, USA
- Deydeep Kothapalli
- Mallinckrodt Institute of Radiology, Washington University School of Medicine in St. Louis, Saint Louis, Missouri, USA
- Jason Hassenstab
- Department of Neurology, Washington University School of Medicine, St. Louis, Missouri, USA
- Chengjie Xiong
- Institute for Informatics, Data Science & Biostatistics, Washington University School of Medicine in St. Louis, Saint Louis, Missouri, USA
- Beau Ances
- Department of Neurology, Washington University School of Medicine, St. Louis, Missouri, USA
- John Morris
- Department of Neurology, Washington University School of Medicine, St. Louis, Missouri, USA
- Tammie L. S. Benzinger
- Mallinckrodt Institute of Radiology, Washington University School of Medicine in St. Louis, Saint Louis, Missouri, USA
- Brian A. Gordon
- Mallinckrodt Institute of Radiology, Washington University School of Medicine in St. Louis, Saint Louis, Missouri, USA
- Philip Payne
- Department of Computer Science and Engineering, Washington University in St. Louis, Saint Louis, Missouri, USA
- Institute for Informatics, Data Science & Biostatistics, Washington University School of Medicine in St. Louis, Saint Louis, Missouri, USA
- Aristeidis Sotiras
- Institute for Informatics, Data Science & Biostatistics, Washington University School of Medicine in St. Louis, Saint Louis, Missouri, USA
- Mallinckrodt Institute of Radiology, Washington University School of Medicine in St. Louis, Saint Louis, Missouri, USA
14
Yu Y, Wu D, Yuan J, Yu L, Dai X, Yang W, Lan Z, Wang J, Tao Z, Zhan Y, Ling R, Zhu X, Xu Y, Li Y, Zhang J. Deep Learning-based Quantitative CT Myocardial Perfusion Imaging and Risk Stratification of Coronary Artery Disease. Radiology 2025; 315:e242570. [PMID: 40298595 DOI: 10.1148/radiol.242570]
Abstract
Background Precise assessment of myocardial ischemia burden and cardiovascular risk stratification based on dynamic CT myocardial perfusion imaging (MPI) is lacking. Purpose To develop and validate a deep learning (DL) model for automated quantification of myocardial blood flow (MBF) and ischemic myocardial volume (IMV) percentage and to explore the prognostic value for major adverse cardiovascular events (MACE). Materials and Methods This multicenter study comprised three cohorts of patients with clinically indicated CT MPI and coronary CT angiography (CCTA). Cohorts 1 and 2 were retrospective cohorts (May 2021 to June 2023 and January 2018 to December 2022, respectively). Cohort 3 was prospectively included (November 2016 to December 2021). The DL model was developed in cohort 1 (training set: 211 patients, validation set: 57 patients, test set: 90 patients). The diagnostic performance of MBF derived from the DL model (MBF_DL) for myocardial ischemia was evaluated in cohort 2 based on the area under the receiver operating characteristic curve (AUC). The prognostic value of the DL model-derived IMV percentage was assessed in cohort 3 using multivariable Cox regression analyses. Results Across three cohorts, 1108 patients (mean age: 61 years ± 12 [SD]; 667 men) were included. MBF_DL showed excellent agreement with manual measurements in the test set (segment-level intraclass correlation coefficient = 0.928; 95% CI: 0.921, 0.935). MBF_DL showed higher diagnostic performance (vessel-based AUC: 0.97) than CT-derived fractional flow reserve (FFR) (vessel-based AUC: 0.87; P = .006) and CCTA-derived diameter stenosis (vessel-based AUC: 0.79; P < .001) for hemodynamically significant lesions, compared with invasive FFR. Over a mean follow-up of 39 months, MACE occurred in 94 (14.2%) of 660 patients. IMV percentage was an independent predictor of MACE (hazard ratio = 1.12, P = .003), with incremental prognostic value (C index: 0.86; 95% CI: 0.84, 0.88) over conventional risk factors and CCTA parameters (C index: 0.84; 95% CI: 0.82, 0.86; P = .02). Conclusion A DL model enabled automated CT MBF quantification and accurate diagnosis of myocardial ischemia. The DL model-derived IMV percentage was an independent predictor of MACE and mildly improved cardiovascular risk stratification. © RSNA, 2025
Affiliation(s)
- Yarong Yu
- Department of Radiology, Shanghai General Hospital, Shanghai Jiao Tong University School of Medicine, #85 Wujin Rd, Shanghai, China 200080
- Dijia Wu
- Shanghai United Imaging Intelligence, Shanghai, China
- Jiajun Yuan
- Department of Radiology, Shanghai General Hospital, Shanghai Jiao Tong University School of Medicine, #85 Wujin Rd, Shanghai, China 200080
- Lihua Yu
- Department of Radiology, Shanghai General Hospital, Shanghai Jiao Tong University School of Medicine, #85 Wujin Rd, Shanghai, China 200080
- Xu Dai
- Department of Radiology, Shanghai General Hospital, Shanghai Jiao Tong University School of Medicine, #85 Wujin Rd, Shanghai, China 200080
- Wenli Yang
- Department of Radiology, Shanghai General Hospital, Shanghai Jiao Tong University School of Medicine, #85 Wujin Rd, Shanghai, China 200080
- Ziting Lan
- Department of Radiology, Shanghai General Hospital, Shanghai Jiao Tong University School of Medicine, #85 Wujin Rd, Shanghai, China 200080
- Jiayu Wang
- Shanghai United Imaging Intelligence, Shanghai, China
- Ze Tao
- Shanghai United Imaging Intelligence, Shanghai, China
- Yiqiang Zhan
- Shanghai United Imaging Intelligence, Shanghai, China
- Runjianya Ling
- Institute of Diagnostic and Interventional Radiology, Shanghai Jiao Tong University Affiliated Sixth People's Hospital, Shanghai, China
- Xiaomei Zhu
- Department of Radiology, The First Affiliated Hospital of Nanjing Medical University, Nanjing, China
- Yi Xu
- Department of Radiology, The First Affiliated Hospital of Nanjing Medical University, Nanjing, China
- Yuehua Li
- Institute of Diagnostic and Interventional Radiology, Shanghai Jiao Tong University Affiliated Sixth People's Hospital, Shanghai, China
- Jiayin Zhang
- Department of Radiology, Shanghai General Hospital, Shanghai Jiao Tong University School of Medicine, #85 Wujin Rd, Shanghai, China 200080
- Department of Medical Imaging Technology, College of Health Science and Technology, Shanghai Jiao Tong University School of Medicine, Shanghai, China
|
15
|
Olcay A, White PR, Bull JM, Risch D, Dell B, White EL. Sounds of the deep: How input representation, model choice, and dataset size influence underwater sound classification performance. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2025; 157:3017-3032. [PMID: 40249180 DOI: 10.1121/10.0036498] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/20/2024] [Accepted: 04/04/2025] [Indexed: 04/19/2025]
Abstract
Convolutional neural networks (CNNs) have proven highly effective in automatically identifying and classifying underwater sound sources, enabling efficient analysis of marine environments. This work examines two key design choices for a CNN classifier, input representation and network architecture, analyzing their importance as training data size varies and their effectiveness in generalizing between sites. Passive acoustic data from three offshore sites in Western Scotland were used for hierarchical classification, categorizing sounds into one of four classes: delphinid tonals, delphinid clicks, vessels, and ambient noise. Three different input representations of the acoustic signals were investigated along with four CNN architectures, including three pretrained for image classification tasks. Experiments show that a custom-built shallow CNN can outperform more complex architectures if the input representation is chosen appropriately. For example, a shallow CNN using Mel spectrograms normalized with per-channel energy normalization (MS-PCEN) achieved a 12.5% accuracy improvement over a ResNet model when small amounts of training data are available. Studying model performance across the three sites demonstrates that input representation is an important factor for achieving robust results between sites, with MS-PCEN achieving the best performance. However, the importance of the choice of input representation decreases as the training dataset size increases.
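A minimal sketch of the MS-PCEN input representation with librosa, using example short-time parameters (the scaling constant follows librosa's documented convention for floating-point input):

```python
# Sketch of the MS-PCEN representation: a Mel spectrogram passed through
# per-channel energy normalization; parameters are illustrative.
import librosa
import numpy as np

def ms_pcen(y: np.ndarray, sr: int, n_mels: int = 64) -> np.ndarray:
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_fft=2048,
                                         hop_length=512, n_mels=n_mels,
                                         power=1.0)          # magnitude Mel bands
    # PCEN adaptively gain-controls each frequency channel over time, which
    # helps robustness to site-dependent background noise levels.
    return librosa.pcen(mel * (2 ** 31), sr=sr, hop_length=512)
```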
Affiliation(s)
- Abdullah Olcay
- Institute of Sound and Vibration Research, University of Southampton, Southampton, SO17 1BJ, United Kingdom
- Paul R White
- Institute of Sound and Vibration Research, University of Southampton, Southampton, SO17 1BJ, United Kingdom
- Jonathan M Bull
- School of Ocean and Earth Science, University of Southampton, Southampton, SO14 3ZH, United Kingdom
- Denise Risch
- Marine Science Department, Scottish Association of Marine Science, Oban, PA37 1QA, United Kingdom
- Benedict Dell
- Institute of Sound and Vibration Research, University of Southampton, Southampton, SO17 1BJ, United Kingdom
- Ellen L White
- School of Ocean and Earth Science, University of Southampton, Southampton, SO14 3ZH, United Kingdom
|
16
|
Yang G, Xiao Q, Zhang Z, Yu Z, Wang X, Lu Q. Exploring AI in metasurface structures with forward and inverse design. iScience 2025; 28:111995. [PMID: 40104054 PMCID: PMC11914293 DOI: 10.1016/j.isci.2025.111995] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 03/20/2025] Open
Abstract
As an artificially manufactured planar device, a metasurface can produce unusual electromagnetic responses by harnessing four basic characteristics of the light wave. Traditional design processes rely on numerical algorithms combined with parameter optimization; however, such methods are often time-consuming and struggle to match actual responses. This paper offers a distinctive perspective for classifying artificial intelligence (AI)-enabled metasurface design, dividing it into forward and inverse design according to the mapping relationship between design variables and performance. Forward design is driven by intelligent algorithms, while neural networks are one of the principal ways to realize inverse design. This paper reviews recent progress in AI-enabled metasurface design, examining the principles, advantages, and potential applications. The rich content and detailed comparisons can help build a holistic understanding of metasurface design. Moreover, the authors believe that this systematic and detailed review will pave the way for future research and the selection of practical applications.
Affiliation(s)
- Guantai Yang
- Frontiers Science Center for Flexible Electronics (FSCFE), Institute of Flexible Electronics (IFE), Northwestern Polytechnical University, Xi'an 710072, China
- School of Automation, Northwestern Polytechnical University, 127 West Youyi Road, Beilin District, Xi'an 710072, China
- Qingxiong Xiao
- School of Automation, Northwestern Polytechnical University, 127 West Youyi Road, Beilin District, Xi'an 710072, China
- Zhilin Zhang
- School of Automation, Northwestern Polytechnical University, 127 West Youyi Road, Beilin District, Xi'an 710072, China
- Zhe Yu
- College of Optical Science and Engineering, Zhejiang University, Hangzhou 310027, China
- Xiaoxu Wang
- School of Automation, Northwestern Polytechnical University, 127 West Youyi Road, Beilin District, Xi'an 710072, China
- Qianbo Lu
- Frontiers Science Center for Flexible Electronics (FSCFE), Institute of Flexible Electronics (IFE), Northwestern Polytechnical University, Xi'an 710072, China
17
Pei YB, Yu ZY, Shen JS. Transfer learning for accelerated failure time model with microarray data. BMC Bioinformatics 2025; 26:84. [PMID: 40098088 PMCID: PMC11917065 DOI: 10.1186/s12859-025-06056-w]
Abstract
BACKGROUND In microarray prognostic studies, researchers aim to identify genes associated with disease progression. However, due to the rarity of certain diseases and the cost of sample collection, researchers often face the challenge of limited sample size, which may prevent accurate estimation and risk assessment. This challenge necessitates methods that can leverage information from external data (i.e., source cohorts) to improve gene selection and risk assessment based on the current sample (i.e., target cohort). METHOD We propose a transfer learning method for the accelerated failure time (AFT) model to enhance the fit on the target cohort by adaptively borrowing information from the source cohorts. We use a leave-one-out cross-validation-based procedure to evaluate the relative stability of selected genes and the overall predictive power. CONCLUSION In simulation studies, the transfer learning method for the AFT model correctly identifies a small number of genes, and its estimation error is smaller than that obtained without using the source cohorts. Furthermore, the proposed method demonstrates satisfactory accuracy and robustness in addressing heterogeneity across cohorts compared with the method that directly combines the target and source cohorts in the AFT model. We analyze the GSE88770 and GSE25055 data using the proposed method. The selected genes are relatively stable, and the proposed method yields an overall satisfactory risk prediction.
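For orientation, the AFT model regresses log survival time on covariates; one common way to formalize adaptive borrowing, shown here as an assumption rather than the authors' exact estimator, is a sparse correction to a source-cohort estimate:

```latex
% The AFT model with a sparse-shift transfer formulation (an illustrative
% assumption, not necessarily the paper's estimator):
\[
  \log T_i = \mathbf{x}_i^{\top}\boldsymbol{\beta} + \varepsilon_i, \qquad
  \boldsymbol{\beta}^{(\mathrm{target})} = \boldsymbol{\beta}^{(\mathrm{source})} + \boldsymbol{\delta},
\]
\[
  \hat{\boldsymbol{\delta}} = \arg\min_{\boldsymbol{\delta}}
  \; L_{\mathrm{AFT}}\!\bigl(\boldsymbol{\beta}^{(\mathrm{source})} + \boldsymbol{\delta};\,
  \text{target data}\bigr) + \lambda \lVert \boldsymbol{\delta} \rVert_{1},
\]
% so information is borrowed only insofar as the sparse shift delta allows.
```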
Affiliation(s)
- Yan-Bo Pei
- School of Statistics, Capital University of Economics and Business, Beijing, China
| | - Zheng-Yang Yu
- School of Statistics, Capital University of Economics and Business, Beijing, China
| | - Jun-Shan Shen
- School of Statistics, Capital University of Economics and Business, Beijing, China.
| |
|
18
|
Adebayo OE, Chatelain B, Trucu D, Eftimie R. Deep Learning Approaches for the Classification of Keloid Images in the Context of Malignant and Benign Skin Disorders. Diagnostics (Basel) 2025; 15:710. [PMID: 40150053 PMCID: PMC11940829 DOI: 10.3390/diagnostics15060710] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/11/2025] [Revised: 03/02/2025] [Accepted: 03/08/2025] [Indexed: 03/29/2025] Open
Abstract
Background/Objectives: Misdiagnosing skin disorders leads to the administration of wrong treatments, sometimes with life-impacting consequences. Deep learning algorithms are increasingly used for diagnosis. While many skin cancer/lesion image classification studies focus on datasets containing dermatoscopic images and do not include keloid images, in this study we focus on diagnosing keloid disorders amongst other skin lesions and combine two publicly available datasets containing non-dermatoscopic images: one dataset with keloid images and one with images of various other benign and malignant skin lesions (melanoma, basal cell carcinoma, squamous cell carcinoma, actinic keratosis, seborrheic keratosis, and nevus). Methods: Different Convolutional Neural Network (CNN) models are used to classify these disorders as either malignant or benign, to differentiate keloids amongst different benign skin disorders, and furthermore to differentiate keloids among other similar-looking malignant lesions. To this end, we use the transfer learning technique applied to nine different base models: VGG16, MobileNet, InceptionV3, DenseNet121, EfficientNetB0, Xception, InceptionResNetV2, EfficientNetV2L, and NASNetLarge. We explore and compare the results of these models using performance metrics such as accuracy, precision, recall, F1 score, and AUC-ROC. Results: We show that the VGG16 model (after fine-tuning) performs best in classifying keloid images among other benign and malignant skin lesion images, with the following keloid-class performance: an accuracy of 0.985, precision of 1.0, recall of 0.857, F1 score of 0.922, and AUC-ROC value of 0.996. VGG16 also has the best overall average performance (over all classes) in terms of the AUC-ROC and the other performance metrics. Using this model, we further attempt to classify three new non-dermatoscopic anonymised clinical images as either malignant, benign, or keloid, and in the process we identify some issues related to the collection and processing of such images. Finally, we also show that the DenseNet121 model has the best performance when differentiating keloids from other malignant disorders that have similar clinical presentations. Conclusions: The study emphasised the potential use of deep learning algorithms (and their drawbacks) to identify and classify benign skin disorders such as keloids, which are not usually investigated via these approaches (as opposed to cancers), mainly due to a lack of available data.
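The transfer-learning recipe the study applies is the standard Keras pattern sketched below; the seven-class head, input size, and two-stage fine-tuning schedule are illustrative assumptions rather than the authors' exact configuration.

```python
# A minimal transfer-learning sketch in Keras, assuming 7 lesion classes and
# 224x224 RGB inputs; the paper's exact head and schedule may differ.
import tensorflow as tf
from tensorflow.keras import layers, models

base = tf.keras.applications.VGG16(weights="imagenet", include_top=False,
                                   input_shape=(224, 224, 3))
base.trainable = False                      # first stage: train the new head only

model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(256, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(7, activation="softmax"),  # benign classes + keloid + malignant classes
])
model.compile(optimizer=tf.keras.optimizers.Adam(1e-4),
              loss="categorical_crossentropy", metrics=["accuracy"])

# Fine-tuning stage: unfreeze the last convolutional block with a lower LR.
base.trainable = True
for layer in base.layers[:-4]:
    layer.trainable = False
model.compile(optimizer=tf.keras.optimizers.Adam(1e-5),
              loss="categorical_crossentropy", metrics=["accuracy"])
```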
Affiliation(s)
- Olusegun Ekundayo Adebayo
- Laboratoire de Mathématiques de Besançon, Université Marie et Louis Pasteur, F-25000 Besançon, France;
| | - Brice Chatelain
- Service de Chirurgie Maxillo-Faciale, Stomatologie et Odontologie Hospitalière, CHU Besançon, F-25000 Besançon, France;
| | - Dumitru Trucu
- Division of Mathematics, University of Dundee, Dundee DD1 4HN, UK
| | - Raluca Eftimie
- Laboratoire de Mathématiques de Besançon, Université Marie et Louis Pasteur, F-25000 Besançon, France;
- Division of Mathematics, University of Dundee, Dundee DD1 4HN, UK
| |
|
19
|
Wu Y, Fong S, Yu J. Enhancing bone radiology images classification through appropriate preprocessing: a deep learning and explainable artificial intelligence approach. Quant Imaging Med Surg 2025; 15:2529-2546. [PMID: 40160653 PMCID: PMC11948367 DOI: 10.21037/qims-24-1745] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/21/2024] [Accepted: 12/16/2024] [Indexed: 04/02/2025]
Abstract
Background Medical image classification has been an important application of deep learning techniques for over a decade, and since the emergence of explainable artificial intelligence (XAI), researchers have begun using XAI to validate the results produced by these black-box models. It has become clear that accuracy and efficiency are not the only crucial factors in developing medical deep learning models; the authenticity of results and the accountability of the model and its creator also matter greatly. The objective of this study is to emphasize the importance of authentic results and accountable deep learning models in medical applications by proposing targeted preprocessing methods for medical datasets processed by deep learning models. Methods We conduct comparison experiments on two bone radiology image datasets using various deep learning neural networks, emphasizing the effect of appropriate dataset preprocessing on the models' prediction performance. Comparisons are conducted both horizontally, between the performance of different neural networks, and vertically, using the same models to process the datasets before and after appropriate preprocessing. Furthermore, we evaluate the experimental results not only quantitatively but also visually using XAI techniques, in order to determine the reasonableness and reliability of the predictions. Results For the bone radiology image dataset used in our experiment, DenseNet201 achieved the highest validation accuracy of 78% among the five comparison models. Using the same models on the dataset after appropriate preprocessing, performance for all models increased by an average of 0.06. Evaluating the before/after-preprocessing comparisons with XAI techniques, we observed that the appropriate preprocessing effectively helped the models concentrate on the abnormal areas of the radiology images compared with processing raw images. Conclusions The novelty of this paper lies in its application of extended preprocessing techniques-namely, the removal of background and irrelevant parts-to medical images to improve the performance of deep learning models in classification tasks. While image preprocessing has been explored by many researchers, applying such targeted preprocessing steps to medical images, combined with the use of XAI to validate and illustrate the benefits, is a novel approach. This paper highlights the unique requirements of medical image data and proposes a method to enhance model accuracy and reliability in medical diagnostics by removing background and redundant features from the images.
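A minimal sketch of the kind of background-removal preprocessing advocated here, using Otsu thresholding and simple morphology from scikit-image; the minimum object size, structuring-element radius, and file name are assumptions that would need tuning per dataset.

```python
# Sketch of "remove background / keep the anatomy" preprocessing using Otsu
# thresholding; real pipelines may need morphology tuned per dataset.
import numpy as np
from skimage import io, filters, morphology

def strip_background(path):
    img = io.imread(path, as_gray=True)            # radiograph as float array
    mask = img > filters.threshold_otsu(img)       # foreground vs. dark background
    mask = morphology.remove_small_objects(mask, min_size=500)
    mask = morphology.binary_closing(mask, morphology.disk(5))
    return img * mask                              # zero out irrelevant regions

# cleaned = strip_background("radiograph.png")     # hypothetical file name
```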
Affiliation(s)
- Yaoyang Wu
- Department of Computer and Information Science, University of Macau, Macau, China
| | - Simon Fong
- Department of Computer and Information Science, University of Macau, Macau, China
| | - Jiahui Yu
- Department of Computer and Information Science, University of Macau, Macau, China
| |
|
20
|
Qian YF, Guo WL. Development and validation of a deep learning algorithm for prediction of pediatric recurrent intussusception in ultrasound images and radiographs. BMC Med Imaging 2025; 25:67. [PMID: 40033220 DOI: 10.1186/s12880-025-01582-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/27/2024] [Accepted: 02/05/2025] [Indexed: 03/05/2025] Open
Abstract
PURPOSE To develop a predictive model for recurrent intussusception based on abdominal ultrasound (US) images and abdominal radiographs. METHODS A total of 3665 cases of intussusception were retrospectively collected from January 2017 to December 2022. The cohort was randomly assigned to training and validation sets at a 6:4 ratio. Two types of images were processed: abdominal grayscale US images and abdominal radiographs. These images served as inputs for the deep learning algorithm and were individually processed by five detection models for training, with each model predicting its respective categories and probabilities. The optimal model for each modality was then selected for decision fusion to obtain the final predicted categories and their probabilities. RESULTS With US, the VGG11 model showed the best performance, achieving an area under the receiver operating characteristic curve (AUC) of 0.669 (95% CI: 0.635-0.702). In contrast, with radiographs, the ResNet18 model excelled with an AUC of 0.809 (95% CI: 0.776-0.841). We then employed two fusion methods. In the averaging fusion method, the two models were combined to reach a diagnostic decision: a soft voting scheme averaged the probabilities predicted by each model, resulting in an AUC of 0.877 (95% CI: 0.846-0.908). In the stacking fusion method, a meta-model was built on the predictions of the two optimal models. This approach notably enhanced the overall predictive performance, with LightGBM emerging as the top performer, achieving an AUC of 0.897 (95% CI: 0.869-0.925). Both fusion methods demonstrated excellent performance. CONCLUSIONS Deep learning algorithms developed using multimodal medical imaging may help predict recurrent intussusception. CLINICAL TRIAL NUMBER Not applicable.
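The two fusion schemes can be sketched in a few lines. Below, p_us and p_xray stand in for the ultrasound (VGG11) and radiograph (ResNet18) recurrence probabilities, and the labels are synthetic placeholders; in practice the meta-model should be fit on out-of-fold predictions to avoid leakage.

```python
# Sketch of the two fusion steps: soft-voting averaging of per-modality
# probabilities, and a LightGBM meta-model stacked on top (toy data).
import numpy as np
from lightgbm import LGBMClassifier
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=400)                       # recurrence labels (toy)
p_us = np.clip(y * 0.3 + rng.random(400) * 0.7, 0, 1)  # toy model outputs
p_xray = np.clip(y * 0.5 + rng.random(400) * 0.5, 0, 1)

p_avg = (p_us + p_xray) / 2                            # averaging (soft voting) fusion
print("soft-voting AUC:", roc_auc_score(y, p_avg))

meta_X = np.column_stack([p_us, p_xray])               # stacking fusion
meta = LGBMClassifier(n_estimators=100).fit(meta_X, y) # fit on out-of-fold preds in practice
print("stacking AUC:", roc_auc_score(y, meta.predict_proba(meta_X)[:, 1]))
```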
Affiliation(s)
- Yu-Feng Qian
- Department of Radiology, Children's Hospital of Soochow University, Suzhou, China
| | - Wan-Liang Guo
- Department of Radiology, Children's Hospital of Soochow University, Suzhou, China.
| |
|
21
|
Bakshi A, Stetson J, Wang L, Shi J, Caragea D, Miller LC. Toward a rapid, sensitive, user-friendly, field-deployable artificial intelligence tool for enhancing African swine fever diagnosis and reporting. Am J Vet Res 2025; 86:S27-S37. [PMID: 40023145 PMCID: PMC11957874 DOI: 10.2460/ajvr.24.10.0305] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/15/2024] [Accepted: 02/07/2025] [Indexed: 03/04/2025]
Abstract
Objective African swine fever (ASF) is a lethal and highly contagious transboundary animal disease with the potential for rapid international spread. Lateral flow assays (LFAs) are sometimes hard for inexperienced users to read, mainly due to LFA sensitivity and reading ambiguities. Our objective was to develop and implement an AI-powered tool to enhance the accuracy of LFA reading, thereby improving rapid and early detection for ASF diagnosis and reporting. Methods We focus on the development of a deep learning-assisted, smartphone-based AI diagnostic tool that provides accurate decisions with higher sensitivity. The tool employs state-of-the-art You Only Look Once (YOLO) models for image classification. The YOLO models were trained and evaluated using a dataset of images in which the lateral flow assays were manually labeled as positive or negative. A prototype JavaScript web application for ASF reporting and visualization was created in Azure; it maintains the distribution of positive predictions on a map as positive cases are submitted by users. Results The performance of the models was evaluated using standard classification metrics, specifically accuracy, precision, recall, sensitivity, specificity, and F1 measure. We achieved 86.3 ± 7.9% average accuracy, 96.3 ± 2.04% average precision, 79 ± 13.20% average recall, and an average F1 score of 0.87 ± 0.088 across 3 different train/development/test splits of the dataset. Submitting a positive result from the deep learning model updates the map with a location marker. Conclusions Combining clinical data learning and 2-step algorithms enables a point-of-need assay with higher accuracy. Clinical Relevance A rapid, sensitive, user-friendly, and deployable deep learning tool was developed for classifying LFA test images to enhance diagnosis and reporting, particularly in settings with limited laboratory resources.
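A hedged sketch of how such a YOLO classification workflow might look with the Ultralytics API; the dataset layout, model variant, and file names are assumptions, since the abstract does not specify the training configuration.

```python
# Illustrative Ultralytics YOLO classification workflow; everything below is
# an assumption about setup, not the authors' exact configuration.
from ultralytics import YOLO

model = YOLO("yolov8n-cls.pt")                 # pretrained classification variant
# Assumed folder layout: lfa_data/train/{positive,negative}/..., lfa_data/val/...
model.train(data="lfa_data", epochs=50, imgsz=224)

results = model("strip_photo.jpg")             # hypothetical smartphone image
probs = results[0].probs                       # class probabilities
print(probs.top1, probs.top1conf)              # predicted class and confidence
```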
Affiliation(s)
- Aliva Bakshi
- Department of Computer Science, Carl R. Ice College of Engineering, Kansas State University, Manhattan, KS
| | - Jake Stetson
- Department of Computer Science, Carl R. Ice College of Engineering, Kansas State University, Manhattan, KS
| | - Lihua Wang
- Department of Anatomy and Physiology, College of Veterinary Medicine, Kansas State University, Manhattan, KS
| | - Jishu Shi
- Department of Anatomy and Physiology, College of Veterinary Medicine, Kansas State University, Manhattan, KS
| | - Doina Caragea
- Department of Computer Science, Carl R. Ice College of Engineering, Kansas State University, Manhattan, KS
| | - Laura C. Miller
- Department of Diagnostic Medicine/Pathobiology, College of Veterinary Medicine, Kansas State University, Manhattan, KS
| |
|
22
|
Abe M, Niioka H, Matsumoto A, Katsuma Y, Imai A, Okushima H, Ozaki S, Fujii N, Oka K, Sakaguchi Y, Inoue K, Isaka Y, Matsui I. Self-Supervised Learning for Feature Extraction from Glomerular Images and Disease Classification with Minimal Annotations. J Am Soc Nephrol 2025; 36:471-486. [PMID: 40029749 PMCID: PMC11888952 DOI: 10.1681/asn.0000000514] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/04/2024] [Accepted: 10/02/2024] [Indexed: 10/11/2024] Open
Abstract
Background Deep learning has great potential in digital kidney pathology. However, its effectiveness depends heavily on the availability of extensively labeled datasets, which are often limited because of the specialized knowledge and time required for their creation. This limitation hinders the widespread application of deep learning for the analysis of kidney biopsy images. Methods We applied self-distillation with no labels (DINO), a self-supervised learning method, to a dataset of 10,423 glomerular images obtained from 384 periodic acid–Schiff-stained kidney biopsy slides. Glomerular features extracted from the DINO-pretrained backbone were visualized using principal component analysis. We then performed classification tasks by adding either k-nearest neighbor classifiers or linear head layers to the DINO-pretrained or ImageNet-pretrained backbones. These models were trained on our labeled classification dataset. Performance was evaluated using metrics such as the area under the receiver operating characteristic curve (ROC-AUC). The classification tasks encompassed four disease categories (minimal change disease, mesangial proliferative GN, membranous nephropathy, and diabetic nephropathy) and clinical parameters such as hypertension, proteinuria, and hematuria. Results Principal component analysis visualization revealed distinct principal components corresponding to different glomerular structures, demonstrating the capability of the DINO-pretrained backbone to capture morphologic features. In disease classification, the DINO-pretrained transferred model (ROC-AUC=0.93) outperformed the ImageNet-pretrained fine-tuned model (ROC-AUC=0.89). When the labeled data were limited, the ImageNet-pretrained fine-tuned model's ROC-AUC dropped to 0.76 (95% confidence interval, 0.72 to 0.80), whereas the DINO-pretrained transferred model maintained superior performance (ROC-AUC, 0.88; 95% confidence interval, 0.86 to 0.90). The DINO-pretrained transferred model also exhibited higher AUCs for the classification of several clinical parameters. External validation using two independent datasets confirmed DINO pretraining's superiority, particularly when labeled data were limited. Conclusions The application of DINO to unlabeled periodic acid–Schiff-stained glomerular images facilitated the extraction of histologic features that could be effectively used for disease classification.
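The evaluation protocol (frozen self-supervised features plus a k-NN head) can be sketched as follows, using the public DINO ViT-S/16 release from torch.hub rather than the authors' kidney-specific weights; data loading is elided.

```python
# Sketch of the frozen-features + k-NN evaluation idea; the torch.hub entry
# point is the public DINO ViT-S/16 release, not the paper's trained weights.
import torch
from sklearn.neighbors import KNeighborsClassifier

backbone = torch.hub.load("facebookresearch/dino:main", "dino_vits16")
backbone.eval()

@torch.no_grad()
def embed(images):                       # images: (N, 3, 224, 224) tensor
    return backbone(images).cpu().numpy()

# X_train/X_test would be glomerular image batches; y_train the disease labels:
# feats_train, feats_test = embed(X_train), embed(X_test)
# knn = KNeighborsClassifier(n_neighbors=20).fit(feats_train, y_train)
# print(knn.score(feats_test, y_test))
```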
Affiliation(s)
- Masatoshi Abe
- Department of Nephrology, Graduate School of Medicine, Osaka University, Osaka, Japan
| | - Hirohiko Niioka
- Data-Driven Innovation Initiative, Kyushu University, Fukuoka, Japan
| | - Ayumi Matsumoto
- Department of Nephrology, Graduate School of Medicine, Osaka University, Osaka, Japan
| | - Yusuke Katsuma
- Department of Nephrology, Graduate School of Medicine, Osaka University, Osaka, Japan
| | - Atsuhiro Imai
- Department of Nephrology, Graduate School of Medicine, Osaka University, Osaka, Japan
| | - Hiroki Okushima
- Department of Nephrology, Graduate School of Medicine, Osaka University, Osaka, Japan
| | - Shingo Ozaki
- Department of Nephrology, Hyogo Prefectural Nishinomiya Hospital, Nishinomiya, Japan
| | - Naohiko Fujii
- Department of Nephrology, Hyogo Prefectural Nishinomiya Hospital, Nishinomiya, Japan
| | - Kazumasa Oka
- Department of Pathology, Hyogo Prefectural Nishinomiya Hospital, Nishinomiya, Japan
| | - Yusuke Sakaguchi
- Department of Nephrology, Graduate School of Medicine, Osaka University, Osaka, Japan
| | - Kazunori Inoue
- Department of Nephrology, Graduate School of Medicine, Osaka University, Osaka, Japan
| | - Yoshitaka Isaka
- Department of Nephrology, Graduate School of Medicine, Osaka University, Osaka, Japan
| | - Isao Matsui
- Department of Nephrology, Graduate School of Medicine, Osaka University, Osaka, Japan
- Transdimensional Life Imaging Division, Institute for Open and Transdisciplinary Research Initiatives, Osaka University, Osaka, Japan
| |
|
23
|
Li G, Liu H, Pan Z, Cheng L, Dai J. Predicting craniofacial fibrous dysplasia growth status: an exploratory study of a hybrid radiomics and deep learning model based on computed tomography images. Oral Surg Oral Med Oral Pathol Oral Radiol 2025; 139:364-376. [PMID: 39725588 DOI: 10.1016/j.oooo.2024.11.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/29/2024] [Revised: 10/11/2024] [Accepted: 11/02/2024] [Indexed: 12/28/2024]
Abstract
OBJECTIVE This study aimed to develop 3 models based on computed tomography (CT) images of patients with craniofacial fibrous dysplasia (CFD): a radiomics model (Model Rad), a deep learning (DL) model (Model DL), and a hybrid radiomics and DL model (Model Rad+DL), and evaluate the ability of these models to distinguish between adolescents with active lesion progression and adults with stable lesion progression. METHODS We retrospectively analyzed preoperative CT scans from 148 CFD patients treated at Shanghai Ninth People's Hospital. The images were processed using 3D-Slicer software to segment and extract regions of interest for radiomics and DL analysis. Feature selection was performed using t-tests, mutual information, correlation tests, and the least absolute shrinkage and selection operator algorithm to develop the 3 models. Model accuracy was evaluated using measurements including the area under the curve (AUC) derived from receiver operating characteristic analysis, sensitivity, specificity, and F1 score. Decision curve analysis (DCA) was conducted to evaluate clinical benefits. RESULTS In total, 1,130 radiomics features and 512 DL features were successfully extracted. Model Rad+DL demonstrated superior AUC values compared to Model Rad and Model DL in the training and validation sets. DCA revealed that Model Rad+DL offered excellent clinical benefits when the threshold probability exceeded 20%. CONCLUSIONS Model Rad+DL exhibits superior potential in evaluating CFD progression, determining the optimal surgical timing for adult CFD patients.
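A sketch of the Model Rad+DL idea: concatenate the two feature blocks, select with LASSO, and score a classifier by ROC-AUC. The feature counts follow the abstract, but the data below are synthetic placeholders with an artificial signal planted so the selection step has something to find.

```python
# Sketch of hybrid radiomics + DL feature selection with LASSO and AUC scoring;
# 1,130 radiomics + 512 DL features as in the abstract, synthetic data otherwise.
import numpy as np
from sklearn.linear_model import LassoCV, LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
X = np.hstack([rng.normal(size=(148, 1130)), rng.normal(size=(148, 512))])
signal = X[:, 0] + X[:, 1130]                    # one radiomics + one DL feature
y = (signal + 0.5 * rng.normal(size=148) > 0).astype(int)  # active vs. stable

lasso = LassoCV(cv=5).fit(X, y)
selected = np.flatnonzero(lasso.coef_)           # surviving hybrid features
clf = LogisticRegression(max_iter=1000).fit(X[:, selected], y)
print(len(selected), roc_auc_score(y, clf.predict_proba(X[:, selected])[:, 1]))
```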
Affiliation(s)
- Guozhi Li
- Department of Oral and Cranio-Maxillofacial Surgery, Shanghai Ninth People's Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China; College of Stomatology, Shanghai Jiao Tong University, Shanghai, China; National Center for Stomatology, Shanghai, China; National Clinical Research Center for Oral Diseases, Shanghai, China; Shanghai Key Laboratory of Stomatology, Shanghai, China
| | - Hao Liu
- Department of Oral and Cranio-Maxillofacial Surgery, Shanghai Ninth People's Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China; College of Stomatology, Shanghai Jiao Tong University, Shanghai, China; National Center for Stomatology, Shanghai, China; National Clinical Research Center for Oral Diseases, Shanghai, China; Shanghai Key Laboratory of Stomatology, Shanghai, China
| | - Zhiyuan Pan
- Department of Oral and Cranio-Maxillofacial Surgery, Shanghai Ninth People's Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China; College of Stomatology, Shanghai Jiao Tong University, Shanghai, China; National Center for Stomatology, Shanghai, China; National Clinical Research Center for Oral Diseases, Shanghai, China; Shanghai Key Laboratory of Stomatology, Shanghai, China
| | - Li Cheng
- Department of Oral and Cranio-Maxillofacial Surgery, Shanghai Ninth People's Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China; College of Stomatology, Shanghai Jiao Tong University, Shanghai, China; National Center for Stomatology, Shanghai, China; National Clinical Research Center for Oral Diseases, Shanghai, China; Shanghai Key Laboratory of Stomatology, Shanghai, China
| | - Jiewen Dai
- Department of Oral and Cranio-Maxillofacial Surgery, Shanghai Ninth People's Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China; College of Stomatology, Shanghai Jiao Tong University, Shanghai, China; National Center for Stomatology, Shanghai, China; National Clinical Research Center for Oral Diseases, Shanghai, China; Shanghai Key Laboratory of Stomatology, Shanghai, China.
| |
|
24
|
Zhang Z, Gu X, Zhu Y, Wang T, Gong Y, Shang Y. Data-driven available capacity estimation of lithium-ion batteries based on fragmented charge capacity. COMMUNICATIONS ENGINEERING 2025; 4:32. [PMID: 39994361 PMCID: PMC11850593 DOI: 10.1038/s44172-025-00372-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/24/2024] [Accepted: 02/12/2025] [Indexed: 02/26/2025]
Abstract
Efficient and accurate available capacity estimation of lithium-ion batteries is crucial for ensuring the safe and effective operation of electric vehicles. However, incomplete charging cycles in practical applications challenge conventional methods. Here we use fragmented charge capacity data to estimate available capacity without complete charging information. Considering correlation, charging time, and initial state of charge, 36 feature combinations are available for estimation. The basic machine learning model is established on 11,500 cyclic samples, and a transfer learning model is fine-tuned and validated on multiple datasets. The validation results indicate that the best root-mean-square error (RMSE) for the basic model is 0.012. Furthermore, the RMSE remains consistently stable across different datasets in the transfer learning model, with fluctuations within 0.5% when considering feature combinations across cycles with spacings of 5, 10, and 20. This work highlights the promise of available capacity estimation using actual, readily accessible fragmented charge capacity data.
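One simplified way to mimic the base-plus-fine-tune workflow is sketched below with scikit-learn: a regressor is fit on a large synthetic source set of 36 fragment features and then incrementally updated on a smaller target set, with RMSE as the metric. The feature construction from real charge fragments is dataset-specific and omitted.

```python
# Sketch of capacity estimation from fragmented charge features with a simple
# source-to-target transfer step (incremental refit on the new dataset).
import numpy as np
from sklearn.linear_model import SGDRegressor
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
X_src = rng.random((11500, 36))                  # 36 fragment feature combinations
y_src = 1.0 - 0.3 * X_src.mean(axis=1) + 0.01 * rng.normal(size=11500)
X_tgt = rng.random((500, 36))                    # smaller target dataset
y_tgt = 0.98 - 0.32 * X_tgt.mean(axis=1) + 0.01 * rng.normal(size=500)

base = SGDRegressor(max_iter=1000, tol=1e-4).fit(X_src, y_src)   # base model
for _ in range(20):                              # fine-tune on target fragments
    base.partial_fit(X_tgt, y_tgt)
rmse = mean_squared_error(y_tgt, base.predict(X_tgt)) ** 0.5
print(f"target RMSE: {rmse:.4f}")
```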
Affiliation(s)
- Zhen Zhang
- School of Control Science and Engineering, Shandong University, Jinan, China
| | - Xin Gu
- School of Control Science and Engineering, Shandong University, Jinan, China
| | - Yuhao Zhu
- School of Control Science and Engineering, Shandong University, Jinan, China
| | - Teng Wang
- School of Control Science and Engineering, Shandong University, Jinan, China
| | - Yichang Gong
- School of Control Science and Engineering, Shandong University, Jinan, China
| | - Yunlong Shang
- School of Control Science and Engineering, Shandong University, Jinan, China.
| |
|
25
|
McLean KA, Sgrò A, Brown LR, Buijs LF, Mountain KE, Shaw CA, Drake TM, Pius R, Knight SR, Fairfield CJ, Skipworth RJE, Tsaftaris SA, Wigmore SJ, Potter MA, Bouamrane MM, Harrison EM. Multimodal machine learning to predict surgical site infection with healthcare workload impact assessment. NPJ Digit Med 2025; 8:121. [PMID: 39988586 PMCID: PMC11847912 DOI: 10.1038/s41746-024-01419-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/26/2024] [Accepted: 12/21/2024] [Indexed: 02/25/2025] Open
Abstract
Remote monitoring is essential for healthcare digital transformation; however, it places greater burdens on healthcare providers, who must review and respond as the volume of collected data expands. This study developed a multimodal neural network to automate assessments of patient-generated data from remote postoperative wound monitoring. Two interventional studies including adult gastrointestinal surgery patients collected wound images and patient-reported outcome measures (PROMs) for 30 days postoperatively. Neural networks for PROMs and images were combined to predict surgical site infection (SSI) diagnosis within 48 h. The multimodal neural network's prediction of confirmed SSI within 48 h remained comparable to clinician triage (0.762 [0.690-0.835] vs 0.777 [0.721-0.832]), with excellent performance on external validation. Simulated usage indicated an 80% reduction in staff time (51.5 to 9.1 h) without compromising diagnostic accuracy. This multimodal approach can effectively support remote monitoring, alleviating provider burden while ensuring high-quality postoperative care.
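A minimal PyTorch sketch of the multimodal idea: a CNN branch embeds the wound image, an MLP branch embeds the PROMs vector, and the fused representation drives a binary SSI prediction. The branch widths and the 19-item PROMs dimension are illustrative assumptions, not the authors' architecture.

```python
# Minimal sketch of image + PROMs fusion for a binary SSI prediction.
import torch
import torch.nn as nn
from torchvision import models

class WoundFusionNet(nn.Module):
    def __init__(self, n_proms=19):                   # 19 is an assumed PROMs size
        super().__init__()
        cnn = models.resnet18(weights=None)
        cnn.fc = nn.Identity()                        # 512-d image embedding
        self.image_branch = cnn
        self.proms_branch = nn.Sequential(
            nn.Linear(n_proms, 64), nn.ReLU(), nn.Linear(64, 32), nn.ReLU())
        self.head = nn.Sequential(nn.Linear(512 + 32, 64), nn.ReLU(),
                                  nn.Linear(64, 1))

    def forward(self, image, proms):
        z = torch.cat([self.image_branch(image), self.proms_branch(proms)], dim=1)
        return torch.sigmoid(self.head(z)).squeeze(1)  # P(SSI within 48 h)

net = WoundFusionNet()
p = net(torch.rand(4, 3, 224, 224), torch.rand(4, 19))
print(p.shape)   # torch.Size([4])
```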
Affiliation(s)
- Kenneth A McLean
- Department of Clinical Surgery, University of Edinburgh, 51 Little France Crescent, Edinburgh, EH16 4SA, UK.
- Centre for Medical Informatics, Usher Institute, University of Edinburgh, 9 Little France Rd, Edinburgh, EH16 4UX, UK.
| | - Alessandro Sgrò
- Colorectal Unit, Western General Hospital, Edinburgh, EH4 2XU, UK
| | - Leo R Brown
- Department of Clinical Surgery, University of Edinburgh, 51 Little France Crescent, Edinburgh, EH16 4SA, UK
| | - Louis F Buijs
- Colorectal Unit, Western General Hospital, Edinburgh, EH4 2XU, UK
| | - Katie E Mountain
- Department of Clinical Surgery, University of Edinburgh, 51 Little France Crescent, Edinburgh, EH16 4SA, UK
| | - Catherine A Shaw
- Department of Clinical Surgery, University of Edinburgh, 51 Little France Crescent, Edinburgh, EH16 4SA, UK
- Centre for Medical Informatics, Usher Institute, University of Edinburgh, 9 Little France Rd, Edinburgh, EH16 4UX, UK
| | - Thomas M Drake
- Department of Clinical Surgery, University of Edinburgh, 51 Little France Crescent, Edinburgh, EH16 4SA, UK
- Centre for Medical Informatics, Usher Institute, University of Edinburgh, 9 Little France Rd, Edinburgh, EH16 4UX, UK
| | - Riinu Pius
- Department of Clinical Surgery, University of Edinburgh, 51 Little France Crescent, Edinburgh, EH16 4SA, UK
- Centre for Medical Informatics, Usher Institute, University of Edinburgh, 9 Little France Rd, Edinburgh, EH16 4UX, UK
| | - Stephen R Knight
- Department of Clinical Surgery, University of Edinburgh, 51 Little France Crescent, Edinburgh, EH16 4SA, UK
- Centre for Medical Informatics, Usher Institute, University of Edinburgh, 9 Little France Rd, Edinburgh, EH16 4UX, UK
| | - Cameron J Fairfield
- Department of Clinical Surgery, University of Edinburgh, 51 Little France Crescent, Edinburgh, EH16 4SA, UK
- Centre for Medical Informatics, Usher Institute, University of Edinburgh, 9 Little France Rd, Edinburgh, EH16 4UX, UK
| | - Richard J E Skipworth
- Department of Clinical Surgery, University of Edinburgh, 51 Little France Crescent, Edinburgh, EH16 4SA, UK
| | - Sotirios A Tsaftaris
- AI Hub for Causality in Healthcare AI with Real Data, University of Edinburgh, Edinburgh, EH9 3FG, UK
| | - Stephen J Wigmore
- Department of Clinical Surgery, University of Edinburgh, 51 Little France Crescent, Edinburgh, EH16 4SA, UK
| | - Mark A Potter
- Colorectal Unit, Western General Hospital, Edinburgh, EH4 2XU, UK
| | - Matt-Mouley Bouamrane
- Centre for Medical Informatics, Usher Institute, University of Edinburgh, 9 Little France Rd, Edinburgh, EH16 4UX, UK
| | - Ewen M Harrison
- Department of Clinical Surgery, University of Edinburgh, 51 Little France Crescent, Edinburgh, EH16 4SA, UK.
- Centre for Medical Informatics, Usher Institute, University of Edinburgh, 9 Little France Rd, Edinburgh, EH16 4UX, UK.
| |
|
26
|
Zeng S, Chen H, Jing R, Yang W, He L, Zou T, Liu P, Liang B, Shi D, Wu W, Lin Q, Ma Z, Zha J, Zhong Y, Zhang X, Shao G, Gong P. An assessment of breast cancer HER2, ER, and PR expressions based on mammography using deep learning with convolutional neural networks. Sci Rep 2025; 15:4826. [PMID: 39924532 PMCID: PMC11808088 DOI: 10.1038/s41598-024-83597-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/19/2024] [Accepted: 12/16/2024] [Indexed: 02/11/2025] Open
Abstract
Mammography is the recommended imaging modality for breast cancer screening. Expression of human epidermal growth factor receptor 2 (HER2), estrogen receptor (ER), and progesterone receptor (PR) is critical to the development of therapeutic strategies for breast cancer. In this study, a deep learning model (CBAM ResNet-18) was developed to predict the expression of these three receptors from mammography without manual segmentation of masses. Mammograms of patients with pathologically proven breast cancer were obtained from two centers. A deep learning-based model (CBAM ResNet-18) for predicting HER2, ER, and PR expression was trained and validated using five-fold cross-validation on a training dataset, and its performance was further tested using an external test dataset. The area under the receiver operating characteristic curve (AUC), accuracy (ACC), and F1-score were calculated to assess the ability of the model to predict each receptor. For comparison, we also developed the original ResNet-18 without the attention module and VGG-19 with and without the attention module. The AUC (95% CI), ACC, and F1-score were 0.708 (0.609, 0.808), 0.651, and 0.528, respectively, in the HER2 test dataset; 0.785 (0.673, 0.897), 0.845, and 0.905, respectively, in the ER test dataset; and 0.706 (0.603, 0.809), 0.678, and 0.773, respectively, in the PR test dataset. The proposed model demonstrates superior performance compared with these baselines. The model has the potential to predict HER2, PR, and especially ER expression, and thus serve as an adjunctive diagnostic tool for breast cancer.
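For readers unfamiliar with CBAM, the block below is a compact, self-contained PyTorch sketch of the channel-plus-spatial attention block of the kind attached to ResNet-18 here; the reduction ratio and placement are illustrative.

```python
# Compact CBAM-style block: channel attention (shared MLP over avg- and
# max-pooled descriptors) followed by a 7x7 spatial attention map.
import torch
import torch.nn as nn

class CBAM(nn.Module):
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.mlp = nn.Sequential(                 # shared MLP for channel attention
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels))
        self.spatial = nn.Conv2d(2, 1, kernel_size=7, padding=3)

    def forward(self, x):
        b, c, _, _ = x.shape
        avg = self.mlp(x.mean(dim=(2, 3)))        # channel attention: avg-pool path
        mx = self.mlp(x.amax(dim=(2, 3)))         # and max-pool path
        x = x * torch.sigmoid(avg + mx).view(b, c, 1, 1)
        s = torch.cat([x.mean(dim=1, keepdim=True),
                       x.amax(dim=1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.spatial(s)) # spatial attention map

feat = torch.rand(2, 64, 56, 56)                  # e.g. a ResNet-18 stage output
print(CBAM(64)(feat).shape)                       # torch.Size([2, 64, 56, 56])
```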
Affiliation(s)
- Shun Zeng
- Department of General Surgery, Institute of Precision Diagnosis and Treatment of Digestive System Tumors and Guangdong Provincial Key Laboratory of Chinese Medicine Ingredients and Gut Microbiomics, Shenzhen University General Hospital, Shenzhen University, Shenzhen, China
| | - Hongyu Chen
- Department of Health Outcomes & Biomedical Informatics, University of Florida, Gainesville, FL, USA
| | - Rui Jing
- Department of Radiology, Second Hospital of Shandong University, Jinan, Shandong, China
| | - Wenzhuo Yang
- Sun Yat-sen University Cancer Center, Sun Yat-sen University, Guangzhou, China
| | - Ligong He
- Sun Yat-sen University Cancer Center, Sun Yat-sen University, Guangzhou, China
| | - Tianle Zou
- Department of General Surgery, Institute of Precision Diagnosis and Treatment of Digestive System Tumors and Guangdong Provincial Key Laboratory of Chinese Medicine Ingredients and Gut Microbiomics, Shenzhen University General Hospital, Shenzhen University, Shenzhen, China
| | - Peng Liu
- Department of General Surgery, Institute of Precision Diagnosis and Treatment of Digestive System Tumors and Guangdong Provincial Key Laboratory of Chinese Medicine Ingredients and Gut Microbiomics, Shenzhen University General Hospital, Shenzhen University, Shenzhen, China
| | - Bo Liang
- Department of General Surgery, Institute of Precision Diagnosis and Treatment of Digestive System Tumors and Guangdong Provincial Key Laboratory of Chinese Medicine Ingredients and Gut Microbiomics, Shenzhen University General Hospital, Shenzhen University, Shenzhen, China
| | - Dan Shi
- Department of General Surgery, Institute of Precision Diagnosis and Treatment of Digestive System Tumors and Guangdong Provincial Key Laboratory of Chinese Medicine Ingredients and Gut Microbiomics, Shenzhen University General Hospital, Shenzhen University, Shenzhen, China
| | - Wenhao Wu
- Department of General Surgery, Institute of Precision Diagnosis and Treatment of Digestive System Tumors and Guangdong Provincial Key Laboratory of Chinese Medicine Ingredients and Gut Microbiomics, Shenzhen University General Hospital, Shenzhen University, Shenzhen, China
| | - Qiusheng Lin
- Department of Thyroid and Breast Surgery, Huazhong University of Science and Technology Union Shenzhen Hospital, 89 Taoyuan Road, Shenzhen, 518052, China
| | - Zhenyu Ma
- Department of Radiology, Second Hospital of Shandong University Zhaoyuan Branch, Zhaoyuan, Shandong, China
| | - Jinhui Zha
- Department of General Surgery, Institute of Precision Diagnosis and Treatment of Digestive System Tumors and Guangdong Provincial Key Laboratory of Chinese Medicine Ingredients and Gut Microbiomics, Shenzhen University General Hospital, Shenzhen University, Shenzhen, China
| | - Yonghao Zhong
- Department of General Surgery, Institute of Precision Diagnosis and Treatment of Digestive System Tumors and Guangdong Provincial Key Laboratory of Chinese Medicine Ingredients and Gut Microbiomics, Shenzhen University General Hospital, Shenzhen University, Shenzhen, China
| | - Xianbin Zhang
- Department of General Surgery, Institute of Precision Diagnosis and Treatment of Digestive System Tumors and Guangdong Provincial Key Laboratory of Chinese Medicine Ingredients and Gut Microbiomics, Shenzhen University General Hospital, Shenzhen University, Shenzhen, China
| | - Guangrui Shao
- Department of Radiology, Second Hospital of Shandong University, Jinan, Shandong, China
| | - Peng Gong
- Department of General Surgery, Institute of Precision Diagnosis and Treatment of Digestive System Tumors and Guangdong Provincial Key Laboratory of Chinese Medicine Ingredients and Gut Microbiomics, Shenzhen University General Hospital, Shenzhen University, Shenzhen, China.
| |
|
27
|
Bai X, Zhang X. Artificial Intelligence-Powered Materials Science. NANO-MICRO LETTERS 2025; 17:135. [PMID: 39912967 PMCID: PMC11803041 DOI: 10.1007/s40820-024-01634-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/31/2024] [Accepted: 12/11/2024] [Indexed: 02/07/2025]
Abstract
The advancement of materials has played a pivotal role in the progress of human civilization, and the emergence of artificial intelligence (AI)-empowered materials science heralds a new era with substantial potential to tackle the escalating challenges of energy, the environment, and biomedicine in a sustainable manner. The exploration and development of sustainable materials are poised to assume a critical role in attaining technologically advanced solutions that are environmentally friendly, energy-efficient, and conducive to human well-being. This review provides a comprehensive overview of current scholarly progress in AI-powered materials science and its cutting-edge applications. We anticipate that AI technology will be extensively utilized in materials research and development, thereby expediting the growth and implementation of novel materials. AI will serve as a catalyst for materials innovation, and in turn, advancements in materials innovation will further enhance the capabilities of AI and AI-powered materials science. Through the synergistic collaboration between AI and materials science, we stand to realize a future propelled by advanced AI-powered materials.
Affiliation(s)
- Xiaopeng Bai
- Department of Mechanical Engineering, The University of Hong Kong, Hong Kong, 999077, People's Republic of China
- Department of Physics, The Chinese University of Hong Kong, Shatin, Hong Kong, 999077, People's Republic of China
| | - Xingcai Zhang
- World Tea Organization, Cambridge, MA, 02139, USA.
- Department of Materials Science and Engineering, Stanford University, Stanford, CA, 94305, USA.
| |
|
28
|
Huang W, Hu J, Xiao J, Wei Y, Bi X, Xiao B. Prototype-Guided Graph Reasoning Network for Few-Shot Medical Image Segmentation. IEEE TRANSACTIONS ON MEDICAL IMAGING 2025; 44:761-773. [PMID: 39269802 DOI: 10.1109/tmi.2024.3459943] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 09/15/2024]
Abstract
Few-shot semantic segmentation (FSS) holds tremendous potential for data-scarce scenarios, particularly medical segmentation tasks with only a few labeled samples. Most existing FSS methods distinguish query objects with the guidance of support prototypes. However, the variances in appearance and scale between support and query objects of the same anatomical class are often considerable in practical clinical scenarios, resulting in undesirable query segmentation masks. To tackle this challenge, we propose a novel prototype-guided graph reasoning network (PGRNet) to explicitly explore potential contextual relationships in structured query images. Specifically, a prototype-guided graph reasoning module performs information interaction on the query graph under the guidance of support prototypes, fully exploiting the structural properties of query images to overcome intra-class variances. Moreover, instead of fixed support prototypes, a dynamic prototype generation mechanism yields a collection of dynamic support prototypes by mining rich contextual information from support images, further boosting the efficiency of information interaction between the support and query branches. Equipped with these two components, PGRNet can learn abundant contextual representations for query images and is therefore more resilient to object variations. We validate our method on three publicly available medical segmentation datasets, namely CHAOS-T2, MS-CMRSeg, and Synapse. Experiments indicate that the proposed PGRNet outperforms previous FSS methods by a considerable margin and establishes a new state-of-the-art performance.
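As background for the prototype-guided design, the sketch below shows the masked-average-pooling baseline such FSS methods build on: support features are pooled under the support mask into a class prototype, and cosine similarity to query features gives a coarse score map. PGRNet's graph reasoning and dynamic prototypes extend well beyond this baseline.

```python
# Baseline prototype step for few-shot segmentation: masked average pooling
# of support features, then cosine similarity against query features.
import torch
import torch.nn.functional as F

def masked_average_pooling(feat, mask):
    # feat: (B, C, H, W) support features; mask: (B, 1, H', W') binary label
    mask = F.interpolate(mask, size=feat.shape[-2:], mode="bilinear",
                         align_corners=False)
    return (feat * mask).sum(dim=(2, 3)) / (mask.sum(dim=(2, 3)) + 1e-6)  # (B, C)

support_feat = torch.rand(1, 256, 32, 32)
support_mask = (torch.rand(1, 1, 128, 128) > 0.5).float()
query_feat = torch.rand(1, 256, 32, 32)

prototype = masked_average_pooling(support_feat, support_mask)      # (1, 256)
score = F.cosine_similarity(query_feat, prototype[..., None, None], dim=1)
print(score.shape)   # (1, 32, 32) similarity map over query locations
```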
|
29
|
Stember JN, Dishner K, Jenabi M, Pasquini L, K Peck K, Saha A, Shah A, O'Malley B, Ilica AT, Kelly L, Arevalo-Perez J, Hatzoglou V, Holodny A, Shalu H. Evolutionary Strategies Enable Systematic and Reliable Uncertainty Quantification: A Proof-of-Concept Pilot Study on Resting-State Functional MRI Language Lateralization. JOURNAL OF IMAGING INFORMATICS IN MEDICINE 2025; 38:576-586. [PMID: 38980624 PMCID: PMC11810852 DOI: 10.1007/s10278-024-01188-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 11/14/2023] [Revised: 06/17/2024] [Accepted: 06/19/2024] [Indexed: 07/10/2024]
Abstract
Reliable and trustworthy artificial intelligence (AI), particularly in high-stakes medical diagnoses, necessitates effective uncertainty quantification (UQ). Existing UQ methods using model ensembles often introduce invalid variability or computational complexity, rendering them impractical and ineffective in clinical workflows. We propose a UQ approach based on deep neuroevolution (DNE), a data-efficient optimization strategy, with the goal of replicating trends observed in expert-based UQ. We focused on language lateralization maps from resting-state functional MRI (rs-fMRI). Fifty rs-fMRI maps were divided into training/testing (30:20) sets representing two labels: "left-dominant" and "co-dominant." DNE yielded an ensemble of 100 models with high training- and testing-set accuracy. Model uncertainty was derived from the entropy of the distribution over the 100 models' predictions, while expert reviewers provided user-based uncertainties for comparison. Model (epistemic) and user-based (aleatoric) uncertainties were consistent in the independently and identically distributed (IID) testing set, mainly indicating low uncertainty. In a mostly out-of-distribution (OOD) holdout set, model and user-based entropies correlated but displayed a bimodal distribution, with one peak representing low and the other high uncertainty. We also found a statistically significant positive correlation between epistemic and aleatoric uncertainties. DNE-based UQ effectively mirrored user-based uncertainties, particularly in highlighting increased uncertainty for OOD images. We conclude that DNE-based UQ correlates with expert assessments, making it reliable for our use case and potentially for other radiology applications.
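The uncertainty measure itself is simple to state: the entropy of the label distribution over the ensemble's hard predictions for a given scan. A small NumPy sketch:

```python
# Entropy of the label distribution over an ensemble's hard predictions.
import numpy as np

def ensemble_entropy(predictions, n_classes=2):
    counts = np.bincount(predictions, minlength=n_classes)
    p = counts / counts.sum()                 # distribution over the ensemble
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())     # 0 = unanimous, 1 = maximal (2 classes)

votes = np.array([0] * 87 + [1] * 13)         # 100 models: mostly "left-dominant"
print(ensemble_entropy(votes))                # low-ish entropy -> low uncertainty
```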
Affiliation(s)
- Joseph N Stember
- Department of Radiology, Memorial Sloan Kettering Cancer Center, New York, NY, 10021, USA.
| | - Katharine Dishner
- Department of Radiology, Memorial Sloan Kettering Cancer Center, New York, NY, 10021, USA
| | - Mehrnaz Jenabi
- Department of Radiology, Memorial Sloan Kettering Cancer Center, New York, NY, 10021, USA
| | - Luca Pasquini
- Department of Radiology, Memorial Sloan Kettering Cancer Center, New York, NY, 10021, USA
| | - Kyung K Peck
- Department of Radiology, Memorial Sloan Kettering Cancer Center, New York, NY, 10021, USA
| | - Atin Saha
- Department of Radiology, Memorial Sloan Kettering Cancer Center, New York, NY, 10021, USA
| | - Akash Shah
- Department of Radiology, Memorial Sloan Kettering Cancer Center, New York, NY, 10021, USA
| | - Bernard O'Malley
- Department of Radiology, Memorial Sloan Kettering Cancer Center, New York, NY, 10021, USA
| | - Ahmet Turan Ilica
- Department of Radiology, Memorial Sloan Kettering Cancer Center, New York, NY, 10021, USA
| | - Lori Kelly
- Department of Radiology, Memorial Sloan Kettering Cancer Center, New York, NY, 10021, USA
| | - Julio Arevalo-Perez
- Department of Radiology, Memorial Sloan Kettering Cancer Center, New York, NY, 10021, USA
| | - Vaios Hatzoglou
- Department of Radiology, Memorial Sloan Kettering Cancer Center, New York, NY, 10021, USA
| | - Andrei Holodny
- Department of Radiology, Memorial Sloan Kettering Cancer Center, New York, NY, 10021, USA
| | - Hrithwik Shalu
- Department of Aerospace Engineering, Indian Institute of Technology Madras, Chennai, Tamil Nadu 600036, India
| |
|
30
|
Mohit K, Gupta R, Kumar B. Contrastive Learned Self-Supervised Technique for Fatty Liver and Chronic Liver Identification. Biomed Signal Process Control 2025; 100:106950. [DOI: 10.1016/j.bspc.2024.106950] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/04/2025]
|
31
|
Hassan SU, Abdulkadir SJ, Zahid MSM, Al-Selwi SM. Local interpretable model-agnostic explanation approach for medical imaging analysis: A systematic literature review. Comput Biol Med 2025; 185:109569. [PMID: 39705792 DOI: 10.1016/j.compbiomed.2024.109569] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/18/2024] [Revised: 10/30/2024] [Accepted: 12/10/2024] [Indexed: 12/23/2024]
Abstract
BACKGROUND The interpretability and explainability of machine learning (ML) and artificial intelligence systems are critical for generating trust in their outcomes in fields such as medicine and healthcare. Errors generated by these systems, such as inaccurate diagnoses or treatments, can have serious and even life-threatening effects on patients. Explainable Artificial Intelligence (XAI) is emerging as an increasingly significant area of research, focusing on the black-box aspect of sophisticated and difficult-to-interpret ML algorithms. XAI techniques such as Local Interpretable Model-Agnostic Explanations (LIME) can explain these models, raising confidence in the systems and improving trust in their predictions. Numerous works have been published that address medical problems using ML models in conjunction with XAI algorithms to provide interpretability and explainability. The primary objective of this study is to evaluate the performance of the newly emerging LIME techniques within healthcare domains that require more attention in the realm of XAI research. METHOD A systematic search was conducted in numerous databases (Scopus, Web of Science, IEEE Xplore, ScienceDirect, MDPI, and PubMed) and identified 1614 peer-reviewed articles published between 2019 and 2023. RESULTS 52 articles were selected for detailed analysis, showing a growing trend in the application of LIME techniques in healthcare, with significant improvements in the interpretability of ML models used for diagnostic and prognostic purposes. CONCLUSION The findings suggest that the integration of XAI techniques, particularly LIME, enhances the transparency and trustworthiness of AI systems in healthcare, thereby potentially improving patient outcomes and fostering greater acceptance of AI-driven solutions among medical professionals.
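Typical LIME usage for an image classifier of the sort the review surveys looks like the sketch below; predict_fn is a stand-in model mapping a batch of images to class probabilities, and the random image is a placeholder for a medical image.

```python
# Sketch of LIME image explanations; predict_fn is a toy stand-in classifier.
import numpy as np
from lime import lime_image

def predict_fn(images):                        # batch (n, H, W, 3) -> (n, 2) probs
    scores = images.mean(axis=(1, 2, 3))
    return np.column_stack([1 - scores, scores])

explainer = lime_image.LimeImageExplainer()
image = np.random.rand(128, 128, 3)            # placeholder for a medical image
explanation = explainer.explain_instance(image, predict_fn,
                                         top_labels=2, num_samples=200)
overlay, mask = explanation.get_image_and_mask(explanation.top_labels[0],
                                               positive_only=True, num_features=5)
print(mask.shape)                              # superpixels supporting the class
```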
Affiliation(s)
- Shahab Ul Hassan
- Department of Computer and Information Sciences, Universiti Teknologi PETRONAS, Seri Iskandar, 32610, Perak, Malaysia; Centre for Intelligent Signal & Imaging Research (CISIR), Universiti Teknologi PETRONAS, Seri Iskandar, 32610, Perak, Malaysia.
| | - Said Jadid Abdulkadir
- Department of Computer and Information Sciences, Universiti Teknologi PETRONAS, Seri Iskandar, 32610, Perak, Malaysia; Center for Research in Data Science (CeRDaS), Universiti Teknologi PETRONAS, Seri Iskandar, 32610, Perak, Malaysia.
| | - M Soperi Mohd Zahid
- Department of Computer and Information Sciences, Universiti Teknologi PETRONAS, Seri Iskandar, 32610, Perak, Malaysia; Centre for Intelligent Signal & Imaging Research (CISIR), Universiti Teknologi PETRONAS, Seri Iskandar, 32610, Perak, Malaysia.
| | - Safwan Mahmood Al-Selwi
- Department of Computer and Information Sciences, Universiti Teknologi PETRONAS, Seri Iskandar, 32610, Perak, Malaysia; Center for Research in Data Science (CeRDaS), Universiti Teknologi PETRONAS, Seri Iskandar, 32610, Perak, Malaysia.
| |
|
32
|
Wu JY, Tsai YY, Chen YJ, Hsiao FC, Hsu CH, Lin YF, Liao LD. Digital transformation of mental health therapy by integrating digitalized cognitive behavioral therapy and eye movement desensitization and reprocessing. Med Biol Eng Comput 2025; 63:339-354. [PMID: 39400854 DOI: 10.1007/s11517-024-03209-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/03/2024] [Accepted: 09/17/2024] [Indexed: 10/15/2024]
Abstract
Digital therapy has gained popularity in the mental health field because of its convenience and accessibility. One major benefit of digital therapy is its ability to address therapist shortages. Posttraumatic stress disorder (PTSD) is a debilitating mental health condition that can develop after an individual experiences or witnesses a traumatic event. Digital therapy is an important resource for individuals with PTSD who may not have access to traditional in-person therapy. Cognitive behavioral therapy (CBT) and eye movement desensitization and reprocessing (EMDR) are two evidence-based psychotherapies that have shown efficacy in treating PTSD. This paper examines the mechanisms and clinical symptoms of PTSD as well as the principles and applications of CBT and EMDR. Additionally, the potential of digital therapy, including internet-based CBT, video conferencing-based therapy, and exposure therapy using augmented and virtual reality, is explored. This paper also discusses the engineering techniques employed in digital psychotherapy, such as emotion detection models and text analysis, for assessing patients' emotional states. Furthermore, it addresses the challenges faced in digital therapy, including regulatory issues, hardware limitations, privacy and security concerns, and effectiveness considerations. Overall, this paper provides a comprehensive overview of the current state of digital psychotherapy for PTSD treatment and highlights the opportunities and challenges in this rapidly evolving field.
Affiliation(s)
- Ju-Yu Wu
- Institute of Biomedical Engineering and Nanomedicine, National Health Research Institutes, 35 Keyan Road, Zhunan, Miaoli County, 35053, Taiwan
- Doctoral Program in Tissue Engineering and Regenerative Medicine, National Chung Hsing University, Taichung, Taiwan
| | - Ying-Ying Tsai
- Institute of Biomedical Engineering and Nanomedicine, National Health Research Institutes, 35 Keyan Road, Zhunan, Miaoli County, 35053, Taiwan
- Department of Biomedical Engineering & Environmental Sciences, National Tsing-Hua University, Hsinchu, Taiwan
| | - Yu-Jie Chen
- Institute of Biomedical Engineering and Nanomedicine, National Health Research Institutes, 35 Keyan Road, Zhunan, Miaoli County, 35053, Taiwan
| | - Fan-Chi Hsiao
- Department of Counseling, Clinical and Industrial/Organizational Psychology, Ming Chuan University, Taoyuan City, Taiwan
| | - Ching-Han Hsu
- Department of Biomedical Engineering & Environmental Sciences, National Tsing-Hua University, Hsinchu, Taiwan
| | - Yen-Feng Lin
- Center for Neuropsychiatric Research, National Health Research Institutes, 35, Keyan Road, Zhunan Town, Miaoli County, 350, Taiwan
| | - Lun-De Liao
- Institute of Biomedical Engineering and Nanomedicine, National Health Research Institutes, 35 Keyan Road, Zhunan, Miaoli County, 35053, Taiwan.
- Doctoral Program in Tissue Engineering and Regenerative Medicine, National Chung Hsing University, Taichung, Taiwan.
| |
|
33
|
Guo J, Li YM, Guo H, Hao DP, Xu JX, Huang CC, Han HW, Hou F, Yang SF, Cui JL, Wang HX. Parallel CNN-Deep Learning Clinical-Imaging Signature for Assessing Pathologic Grade and Prognosis of Soft Tissue Sarcoma Patients. J Magn Reson Imaging 2025; 61:807-819. [PMID: 38859600 DOI: 10.1002/jmri.29474] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/11/2024] [Revised: 05/22/2024] [Accepted: 05/23/2024] [Indexed: 06/12/2024] Open
Abstract
BACKGROUND Traditional biopsies pose risks and may not accurately reflect soft tissue sarcoma (STS) heterogeneity. MRI provides a noninvasive, comprehensive alternative. PURPOSE To assess the diagnostic accuracy of histological grading and prognosis in STS patients when integrating clinical-imaging parameters with deep learning (DL) features from preoperative MR images. STUDY TYPE Retrospective/prospective. POPULATION 354 pathologically confirmed STS patients (226 low-grade, 128 high-grade) from three hospitals and the Cancer Imaging Archive (TCIA), divided into training (n = 185), external test (n = 125), and TCIA cohorts (n = 44); 12 patients (6 low-grade, 6 high-grade) were enrolled in a prospective validation cohort. FIELD STRENGTH/SEQUENCE 1.5 T and 3.0 T/Unenhanced T1-weighted and fat-suppressed-T2-weighted. ASSESSMENT DL features were extracted from MR images using a parallel ResNet-18 model to construct a DL signature. Clinical-imaging characteristics included age, gender, tumor-node-metastasis stage, and MRI semantic features (depth, number, heterogeneity at T1WI/FS-T2WI, necrosis, and peritumoral edema). Logistic regression analysis identified significant risk factors for the clinical model. A DL clinical-imaging signature (DLCS) was constructed by incorporating the DL signature with the risk factors, evaluated for risk stratification, and assessed for progression-free survival (PFS) in the retrospective cohorts, with an average follow-up of 23 ± 22 months. STATISTICAL TESTS Logistic regression, Cox regression, Kaplan-Meier curves, log-rank test, area under the receiver operating characteristic curve (AUC), and decision curve analysis. A P-value <0.05 was considered significant. RESULTS The AUC values for the DLCS in the external test, TCIA, and prospective test cohorts (0.834, 0.838, 0.819) were superior to those of the clinical model (0.662, 0.685, 0.694). Decision curve analysis showed that the DLCS model provided greater clinical net benefit than the DL and clinical models, and the DLCS model was able to risk-stratify patients and assess PFS. DATA CONCLUSION The DLCS exhibited strong capabilities in histological grading and prognosis assessment for STS patients and may aid in the formulation of personalized treatment plans. LEVEL OF EVIDENCE 4 TECHNICAL EFFICACY Stage 2.
Affiliation(s)
- Jia Guo
- Department of Radiology, The Affiliated Hospital of Qingdao University, Qingdao, China
| | - Yi-Ming Li
- Department of Research Collaboration, Research and Development (R&D) center, Beijing Deepwise and League of Philosophy Doctor (PHD) Technology Co., Ltd, Beijing, China
| | - Hongwei Guo
- Operation center, Qingdao Women and Children's Hospital, Shandong, China
| | - Da-Peng Hao
- Department of Radiology, The Affiliated Hospital of Qingdao University, Qingdao, China
| | - Jing-Xu Xu
- Department of Research Collaboration, Research and Development (R&D) center, Beijing Deepwise and League of Philosophy Doctor (PHD) Technology Co., Ltd, Beijing, China
| | - Chen-Cui Huang
- Department of Research Collaboration, Research and Development (R&D) center, Beijing Deepwise and League of Philosophy Doctor (PHD) Technology Co., Ltd, Beijing, China
| | - Hua-Wei Han
- Department of Research Collaboration, Research and Development (R&D) center, Beijing Deepwise and League of Philosophy Doctor (PHD) Technology Co., Ltd, Beijing, China
| | - Feng Hou
- Department of Pathology, The Affiliated Hospital of Qingdao University, Qingdao, China
| | - Shi-Feng Yang
- Department of Radiology, Shandong Provincial Hospital Affiliated to Shandong First Medical University, Jinan, China
| | - Jian-Ling Cui
- Department of Radiology, Hebei Medical University Third Hospital, Shijiazhuang, China
- Key Laboratory of Biomechanics of Hebei Province, Shijiazhuang, China
| | - He-Xiang Wang
- Department of Radiology, The Affiliated Hospital of Qingdao University, Qingdao, China
| |
|
34
|
Lyu Y, Tian X. MWG-UNet++: Hybrid Transformer U-Net Model for Brain Tumor Segmentation in MRI Scans. Bioengineering (Basel) 2025; 12:140. [PMID: 40001660 PMCID: PMC11852190 DOI: 10.3390/bioengineering12020140] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/02/2025] [Revised: 01/22/2025] [Accepted: 01/27/2025] [Indexed: 02/27/2025] Open
Abstract
The accurate segmentation of brain tumors from medical images is critical for diagnosis and treatment planning. However, traditional segmentation methods struggle with complex tumor shapes and inconsistent image quality, which leads to suboptimal results. To address this challenge, we propose the multiple-tasking Wasserstein Generative Adversarial Network U-shape Network++ (MWG-UNet++) for brain tumor segmentation, integrating a U-Net architecture enhanced with transformer layers and combined with Wasserstein Generative Adversarial Networks (WGANs) for data augmentation. The proposed segmentation model, called Residual Attention U-shaped Network (RAUNet), leverages the robust feature extraction capabilities of U-Net and the global context awareness provided by transformers to improve segmentation accuracy. Incorporating WGANs for data augmentation addresses the challenge of limited medical imaging datasets by generating high-quality synthetic images that enhance model training and generalization. Our comprehensive evaluation demonstrates that this hybrid model significantly improves segmentation performance: RAUNet outperforms the compared approaches by capturing long-range dependencies and accounting for spatial variations, while the WGAN-augmented dataset results in robust training and improved resilience to overfitting. The average evaluation metric for brain tumor segmentation is 0.8965, outperforming the compared methods.
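A minimal sketch of the WGAN training step (weight-clipping variant) used for this kind of augmentation; the toy fully connected generator and critic, learning rates, and clipping bound are illustrative, not the paper's networks.

```python
# Minimal WGAN critic/generator update (weight-clipping variant) on toy data.
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(64, 256), nn.ReLU(), nn.Linear(256, 784), nn.Tanh())
D = nn.Sequential(nn.Linear(784, 256), nn.LeakyReLU(0.2), nn.Linear(256, 1))
opt_g = torch.optim.RMSprop(G.parameters(), lr=5e-5)
opt_d = torch.optim.RMSprop(D.parameters(), lr=5e-5)

real = torch.rand(32, 784)                        # stand-in for flattened slices
for _ in range(5):                                # several critic steps per G step
    fake = G(torch.randn(32, 64)).detach()
    loss_d = D(fake).mean() - D(real).mean()      # negated Wasserstein estimate
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()
    for p in D.parameters():                      # enforce the Lipschitz constraint
        p.data.clamp_(-0.01, 0.01)

loss_g = -D(G(torch.randn(32, 64))).mean()        # generator maximizes critic score
opt_g.zero_grad(); loss_g.backward(); opt_g.step()
print(float(loss_d), float(loss_g))
```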
Collapse
Affiliation(s)
| | - Xiaolin Tian
- School of Computer Science and Engineering, Faculty of Information Technology, Macau University of Science and Technology, Macao 999078, China;
| |
Collapse
|
35
|
Tomassini S, Duranti D, Zeggada A, Cosimo Quattrocchi C, Melgani F, Giorgini P. Multi-Branch CNN-LSTM Fusion Network-Driven System With BERT Semantic Evaluator for Radiology Reporting in Emergency Head CTs. IEEE JOURNAL OF TRANSLATIONAL ENGINEERING IN HEALTH AND MEDICINE 2025; 13:61-74. [PMID: 40035027 PMCID: PMC11875635 DOI: 10.1109/jtehm.2025.3535676] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 10/02/2024] [Revised: 01/05/2025] [Accepted: 01/22/2025] [Indexed: 03/05/2025]
Abstract
The high volume of emergency room patients often necessitates head CT examinations to rule out ischemic, hemorrhagic, or other organic pathologies. A system that enhances the diagnostic efficacy of head CT imaging in emergency settings through structured reporting would significantly improve clinical decision making. Currently, no AI solutions address this need. Thus, our research aims to develop an automatic radiology reporting system by directly analyzing brain anomalies in head CT data. We propose a multi-branch CNN-LSTM fusion network-driven system for enhanced radiology reporting in emergency settings. We preprocessed head CT scans by resizing all slices, selecting those with significant variability, and applying PCA to retain 95% of the original data variance, ultimately saving the five most representative slices for each scan. We linked the reports to their respective slice IDs, divided them into individual captions, and preprocessed each. We performed an 80-20 split of the dataset ten times, with 15% of the training set used for validation. Our model utilizes a pretrained VGG16, processing groups of five slices simultaneously, and features multiple end-to-end LSTM branches, each specialized in predicting one caption; these are subsequently combined to form the ordered reports after a BERT-based semantic evaluation. Our system demonstrates effectiveness and stability, with the postprocessing stage refining the syntax of the generated descriptions. However, there remains an opportunity to strengthen the evaluation framework to more accurately assess the clinical relevance of the automatically written reports. Future work will include transitioning to 3D and developing an improved version based on vision-language models.
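As a hedged illustration of the slice-reduction step described above (PCA configured to retain 95% of the variance), the sketch below uses scikit-learn; the array shapes are assumptions, not the authors' preprocessing code.

```python
# Sketch of variance-based compression of flattened CT slices with PCA.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
scan = rng.random((40, 64 * 64))      # 40 resized slices, flattened (assumed)

pca = PCA(n_components=0.95)          # keep components covering 95% of variance
compressed = pca.fit_transform(scan)
print(compressed.shape, pca.explained_variance_ratio_.sum())
```

Passing a float in (0, 1) to `n_components` tells scikit-learn to pick the smallest number of components whose cumulative explained variance reaches that fraction, which matches the 95% criterion the abstract names.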
Collapse
Affiliation(s)
- Selene Tomassini
- Department of Information Engineering and Computer Science, University of Trento, Trento 38121, Italy
| | - Damiano Duranti
- Department of Information Engineering and Computer Science, University of Trento, Trento 38121, Italy
| | - Abdallah Zeggada
- Department of Information Engineering and Computer Science, University of Trento, Trento 38121, Italy
| | | | - Farid Melgani
- Department of Information Engineering and Computer Science, University of Trento, Trento 38121, Italy
| | - Paolo Giorgini
- Department of Information Engineering and Computer Science, University of Trento, Trento 38121, Italy
| |
Collapse
|
36
|
Kim JW, Khan AU, Banerjee I. Systematic Review of Hybrid Vision Transformer Architectures for Radiological Image Analysis. JOURNAL OF IMAGING INFORMATICS IN MEDICINE 2025:10.1007/s10278-024-01322-4. [PMID: 39871042 DOI: 10.1007/s10278-024-01322-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/28/2024] [Revised: 10/11/2024] [Accepted: 10/25/2024] [Indexed: 01/29/2025]
Abstract
Vision transformers (ViTs) and convolutional neural networks (CNNs) each possess distinct strengths in medical imaging: ViTs excel at capturing long-range dependencies through self-attention, while CNNs are adept at extracting local features via spatial convolution filters. ViTs may struggle to capture detailed local spatial information, which is critical for tasks like anomaly detection in medical imaging, whereas shallow CNNs often fail to effectively abstract global context. This study aims to explore and evaluate hybrid architectures that integrate ViT and CNN to leverage their complementary strengths for enhanced performance in medical vision tasks such as segmentation, classification, reconstruction, and prediction. Following the PRISMA guidelines, a systematic review was conducted of 34 articles published between 2020 and September 2024. These articles proposed novel hybrid ViT-CNN architectures specifically for medical imaging tasks in radiology. The review focused on analyzing architectural variations, merging strategies between ViT and CNN, innovative applications of ViT, and efficiency metrics, including the number of parameters, inference time (GFLOPs), and performance benchmarks; from this comparison a ranked list of architectures was derived. The review identified that integrating ViT and CNN can mitigate the limitations of each architecture, offering comprehensive solutions that combine global context understanding with precise local feature extraction. By synthesizing the current literature, this review defines the fundamental concepts of hybrid vision transformers and highlights emerging trends in the field. It provides a clear direction for future research aimed at optimizing the integration of ViT and CNN for effective utilization in medical imaging, contributing to advancements in diagnostic accuracy and image analysis.
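To make the hybrid pattern the review surveys concrete, here is an assumed minimal PyTorch composition: a CNN stem extracts local features, which are flattened into tokens for a transformer encoder that models global context. The architecture and dimensions are illustrative, not drawn from any reviewed model.

```python
# Illustrative hybrid ViT-CNN: convolutional stem + transformer encoder.
import torch
import torch.nn as nn

class HybridViTCNN(nn.Module):
    def __init__(self, num_classes: int = 2, dim: int = 128):
        super().__init__()
        self.stem = nn.Sequential(                      # local feature extractor
            nn.Conv2d(1, dim, kernel_size=7, stride=4, padding=3),
            nn.ReLU(inplace=True),
            nn.Conv2d(dim, dim, kernel_size=3, stride=2, padding=1),
        )
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=dim, nhead=4, batch_first=True)     # global self-attention
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=2)
        self.head = nn.Linear(dim, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats = self.stem(x)                            # (B, dim, H', W')
        tokens = feats.flatten(2).transpose(1, 2)       # (B, H'*W', dim)
        tokens = self.encoder(tokens)
        return self.head(tokens.mean(dim=1))            # mean-pool tokens

model = HybridViTCNN()
print(model(torch.randn(2, 1, 128, 128)).shape)  # torch.Size([2, 2])
```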
Collapse
Affiliation(s)
- Ji Woong Kim
- School of Computing, Informatics, and Decision Systems Engineering, Arizona State University, Tempe, AZ, USA
| | | | - Imon Banerjee
- School of Computing, Informatics, and Decision Systems Engineering, Arizona State University, Tempe, AZ, USA.
- Department of Radiology, Mayo Clinic, Phoenix, AZ, USA.
- Department of Artificial Intelligence and Informatics (AI&I), Mayo Clinic, Scottsdale, AZ, USA.
| |
Collapse
|
37
|
Jong BK, Yu ZH, Hsu YJ, Chiang SF, You JF, Chern YJ. Deep learning algorithms for predicting pathological complete response in MRI of rectal cancer patients undergoing neoadjuvant chemoradiotherapy: a systematic review. Int J Colorectal Dis 2025; 40:19. [PMID: 39833443 PMCID: PMC11753312 DOI: 10.1007/s00384-025-04809-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Accepted: 01/08/2025] [Indexed: 01/22/2025]
Abstract
PURPOSE This systematic review examines the utility of deep learning algorithms in predicting pathological complete response (pCR) in rectal cancer patients undergoing neoadjuvant chemoradiotherapy (nCRT). The primary goal is to evaluate the performance of MRI-based artificial intelligence (AI) models and explore factors affecting their diagnostic accuracy. METHODS The review followed PRISMA guidelines and is registered with PROSPERO (CRD42024628017). Literature searches were conducted in PubMed, Embase, and Cochrane Library using keywords such as "artificial intelligence," "rectal cancer," "MRI," and "pathological complete response." Articles involving deep learning models applied to MRI for predicting pCR were included, excluding non-MRI data and studies without AI applications. Data on study characteristics, MRI sequences, AI model details, and performance metrics were extracted. Quality assessment was performed using the PROBAST tool. RESULTS Out of 512 initial records, 26 studies met the inclusion criteria. Most studies demonstrated promising diagnostic performance, with AUC values for external validation typically exceeding 0.8. The use of T2W and diffusion-weighted imaging (DWI) MRI phases enhanced model accuracy compared to T2W alone. Larger datasets generally correlated with improved model performance. However, heterogeneity in model designs, MRI protocols, and the limited integration of clinical data were noted as challenges. CONCLUSION AI-enhanced MRI demonstrates significant potential in predicting pCR in rectal cancer, particularly with T2W + DWI sequences and larger datasets. While integrating clinical data remains controversial, standardizing methodologies and expanding datasets will further enhance model robustness and clinical utility.
Collapse
Affiliation(s)
- Bor-Kang Jong
- Colorectal Section, Department of Surgery, Chang Gung Memorial Hospital, Taoyuan, Taiwan
- School of Medicine, Chang Gung University, Taoyuan, Taiwan
| | - Zhen-Hao Yu
- Colorectal Section, Department of Surgery, Chang Gung Memorial Hospital, Taoyuan, Taiwan
- School of Medicine, Chang Gung University, Taoyuan, Taiwan
| | - Yu-Jen Hsu
- Colorectal Section, Department of Surgery, Chang Gung Memorial Hospital, Taoyuan, Taiwan
- School of Medicine, Chang Gung University, Taoyuan, Taiwan
| | - Sum-Fu Chiang
- Colorectal Section, Department of Surgery, Chang Gung Memorial Hospital, Taoyuan, Taiwan
- School of Medicine, Chang Gung University, Taoyuan, Taiwan
| | - Jeng-Fu You
- Colorectal Section, Department of Surgery, Chang Gung Memorial Hospital, Taoyuan, Taiwan
- School of Medicine, Chang Gung University, Taoyuan, Taiwan
| | - Yih-Jong Chern
- Colorectal Section, Department of Surgery, Chang Gung Memorial Hospital, Taoyuan, Taiwan.
- School of Medicine, Chang Gung University, Taoyuan, Taiwan.
| |
Collapse
|
38
|
Yoon J, Doh J. A study on hybrid-architecture deep learning model for predicting pressure distribution in 2D airfoils. Sci Rep 2025; 15:2155. [PMID: 39820053 PMCID: PMC11739700 DOI: 10.1038/s41598-024-84940-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/03/2024] [Accepted: 12/30/2024] [Indexed: 01/19/2025] Open
Abstract
This study introduces a novel deep learning-based technique for predicting pressure distribution images, aimed at application in image-based approximate optimal design. The proposed approach integrates both unsupervised and supervised learning paradigms, employing autoencoders (AE) for the unsupervised component and fully connected neural networks (FNN) for the supervised component. A surrogate model based on 2D image data was developed, enabling a comparative analysis of three distinct methods: the conventional AE, the convolutional autoencoder (CAE), and a hybrid CAE, which combines the CAE with a conventional AE. Extensive experiments demonstrated that the CAE method achieved the highest learning capability and restoration rate for pressure distribution images of 2D airfoils. The compressed latent image data were utilized as inputs for the FNN, which was trained to predict latent features. These features were decoded to forecast the corresponding pressure distribution images. The results showed excellent concordance with those derived from computational fluid dynamics (CFD) simulations, achieving a match rate exceeding 99.99%. This methodology significantly simplifies and accelerates image prediction, rendering it feasible without requiring specialized CFD knowledge. Moreover, it enhances accuracy while streamlining the neural network structure. Consequently, it provides foundational technology for image data-based optimization, establishing a platform for future AI-driven design and optimization advancements.
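A rough sketch of the pipeline described above, under assumed image and parameter dimensions: a convolutional autoencoder (CAE) compresses pressure-distribution images into a latent vector, and a fully connected network (FNN) predicts that latent from design inputs so the decoder can reconstruct the predicted field. Names and sizes are illustrative, not the study's.

```python
# CAE compresses images; an FNN maps design parameters to the latent space.
import torch
import torch.nn as nn

class CAE(nn.Module):
    def __init__(self, latent: int = 32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Flatten(), nn.Linear(32 * 16 * 16, latent))
        self.decoder = nn.Sequential(
            nn.Linear(latent, 32 * 16 * 16), nn.Unflatten(1, (32, 16, 16)),
            nn.ConvTranspose2d(32, 16, 2, stride=2), nn.ReLU(),
            nn.ConvTranspose2d(16, 1, 2, stride=2), nn.Sigmoid())

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.decoder(self.encoder(x))

# FNN surrogate: hypothetical airfoil design parameters -> CAE latent code.
fnn = nn.Sequential(nn.Linear(4, 64), nn.ReLU(), nn.Linear(64, 32))

cae = CAE()
design = torch.randn(8, 4)                  # e.g. angle of attack, thickness...
predicted_image = cae.decoder(fnn(design))  # decode the predicted latent
print(predicted_image.shape)                # torch.Size([8, 1, 64, 64])
```

In this arrangement only the small FNN must be evaluated per design query, which is what makes the image prediction fast once the CAE is trained.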
Collapse
Affiliation(s)
- Jaehyun Yoon
- Department of Drone and Robot Convergence, Seoul Cyber University, Seoul, 01133, Republic of Korea
- Department of Mechanical Engineering, Gyeongsang National University, Jinju-si, 52725, Gyeongsangnam-do, Republic of Korea
| | - Jaehyeok Doh
- School of Aerospace Engineering, Gyeongsang National University, Jinju-si, 52828, Gyeongsangnam-do, Republic of Korea.
| |
Collapse
|
39
|
Mallat S, Hkiri E, Albarrak AM, Louhichi B. A Synergy of Convolutional Neural Networks for Sensor-Based EEG Brain-Computer Interfaces to Enhance Motor Imagery Classification. SENSORS (BASEL, SWITZERLAND) 2025; 25:443. [PMID: 39860813 PMCID: PMC11769250 DOI: 10.3390/s25020443] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 10/24/2024] [Revised: 01/02/2025] [Accepted: 01/07/2025] [Indexed: 01/27/2025]
Abstract
Enhancing motor disability assessment and its imagery classification is a significant concern in contemporary medical practice, necessitating reliable solutions to improve patient outcomes. One promising avenue is the use of brain-computer interfaces (BCIs), which establish a direct communication pathway between users and machines. This technology holds the potential to revolutionize human-machine interaction, especially for individuals diagnosed with motor disabilities. Despite this promise, extracting reliable control signals from noisy brain data remains a critical challenge. In this paper, we introduce a novel approach leveraging the collaborative synergy of five convolutional neural network (CNN) models to improve the classification accuracy of motor imagery tasks, which are essential components of BCI systems. Our method demonstrates exceptional performance, achieving an accuracy of 79.44% on the BCI Competition IV 2a dataset, surpassing existing state-of-the-art techniques in using multiple CNN models. This advancement offers significant promise for enhancing the efficacy and versatility of BCIs in a wide range of real-world applications, from assistive technologies to neurorehabilitation, thereby providing robust solutions for individuals with motor disabilities.
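The abstract does not detail the fusion rule; a common choice, shown below as an assumed sketch, is soft voting: five small CNNs average their softmax probabilities over EEG trials (BCI Competition IV 2a uses 22 EEG channels and four motor imagery classes). The CNN layout itself is hypothetical.

```python
# Soft-voting ensemble of five CNNs over EEG motor imagery trials.
import torch
import torch.nn as nn

def make_cnn(num_classes: int = 4) -> nn.Module:
    """One small CNN over EEG trials shaped (batch, channels, samples)."""
    return nn.Sequential(
        nn.Conv1d(22, 32, kernel_size=25, padding=12), nn.ReLU(),
        nn.AdaptiveAvgPool1d(1), nn.Flatten(),
        nn.Linear(32, num_classes))

models = [make_cnn() for _ in range(5)]      # five-member ensemble
eeg = torch.randn(8, 22, 1000)               # BCI IV 2a: 22 channels

with torch.no_grad():
    probs = torch.stack([m(eeg).softmax(dim=1) for m in models]).mean(dim=0)
prediction = probs.argmax(dim=1)             # soft-voting class decision
print(prediction.shape)                      # torch.Size([8])
```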
Collapse
Affiliation(s)
- Souheyl Mallat
- Department of Computer Science, Faculty of Sciences, Monastir University, Monastir 5019, Tunisia;
| | - Emna Hkiri
- Department of Computer Science, Higher Institute of Computer Science, Kairouan University, Kairouan 3100, Tunisia;
| | - Abdullah M. Albarrak
- Department of Computer Science, College of Computer and Information Sciences, Imam Mohammad Ibn Saud Islamic University, Riyadh 11432, Saudi Arabia
| | - Borhen Louhichi
- Department of Mechanical Engineering, College of Engineering, Imam Mohammad Ibn Saud Islamic University, Riyadh 11432, Saudi Arabia;
| |
Collapse
|
40
|
Ahmadieh H, Ghassemi F, Moradi MH. EEG Signals Classification Related to Visual Objects Using Long Short-Term Memory Network and Nonlinear Interval Type-2 Fuzzy Regression. Brain Topogr 2025; 38:20. [PMID: 39762447 DOI: 10.1007/s10548-024-01080-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/01/2023] [Accepted: 08/19/2024] [Indexed: 02/21/2025]
Abstract
By gaining insights into how brain activity is encoded and decoded, we enhance our understanding of brain function. This study introduces a method for classifying EEG signals related to visual objects, employing a combination of an LSTM network and nonlinear interval type-2 fuzzy regression (NIT2FR). Here, ResNet is utilized for feature extraction from images, the LSTM network for feature extraction from EEG signals, and NIT2FR for mapping image features to EEG signal features. The application of type-2 fuzzy logic addresses uncertainties arising from EEG signal nonlinearity, noise, limited sample size, and diverse mental states among participants. The Stanford database was used for implementation, and effectiveness was evaluated through metrics such as classification accuracy, precision, recall, and F1 score. According to the findings, the LSTM network achieved an accuracy of 55.83% in categorizing images using raw EEG data. When compared to other methods, such as linear type-2, linear/nonlinear type-1 fuzzy, neural network, and polynomial regression, NIT2FR coupled with an SVM classifier performed best, with 68.05% accuracy. Thus, NIT2FR demonstrates superiority in handling high-uncertainty environments. Moreover, the 6.03% improvement in accuracy over the best previous study using the same dataset underscores its effectiveness. Precision, recall, and F1 score results for NIT2FR were 68.93%, 68.08%, and 68.49%, respectively, surpassing outcomes from the linear type-2 and linear/nonlinear type-1 fuzzy regression methods.
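The skeleton below sketches only the LSTM-feature and SVM stages of such a pipeline under assumed data shapes; the interval type-2 fuzzy regression that maps image features to EEG features is not reproduced here, and the channel count and label set are hypothetical.

```python
# LSTM summarizes each EEG trial into a feature vector; an SVM classifies it.
import numpy as np
import torch
import torch.nn as nn
from sklearn.svm import SVC

lstm = nn.LSTM(input_size=62, hidden_size=64, batch_first=True)

def eeg_features(trials: torch.Tensor) -> np.ndarray:
    """Use the final LSTM hidden state as the per-trial feature vector."""
    with torch.no_grad():
        _, (h_n, _) = lstm(trials)
    return h_n.squeeze(0).numpy()

trials = torch.randn(20, 128, 62)     # 20 trials, 128 time steps, 62 channels
labels = np.random.randint(0, 6, 20)  # illustrative object-category labels

clf = SVC(kernel="rbf").fit(eeg_features(trials), labels)
print(clf.predict(eeg_features(trials[:3])))
```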
Collapse
Affiliation(s)
- Hajar Ahmadieh
- Department of Biomedical Engineering, Amirkabir University of Technology, Tehran, Iran
| | - Farnaz Ghassemi
- Department of Biomedical Engineering, Amirkabir University of Technology, Tehran, Iran.
| | | |
Collapse
|
41
|
Ahmad IS, Dai J, Xie Y, Liang X. Deep learning models for CT image classification: a comprehensive literature review. Quant Imaging Med Surg 2025; 15:962-1011. [PMID: 39838987 PMCID: PMC11744119 DOI: 10.21037/qims-24-1400] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/09/2024] [Accepted: 10/18/2024] [Indexed: 01/23/2025]
Abstract
Background and Objective Computed tomography (CT) imaging plays a crucial role in the early detection and diagnosis of life-threatening diseases, particularly in respiratory illnesses and oncology. The rapid advancement of deep learning (DL) has revolutionized CT image analysis, enhancing diagnostic accuracy and efficiency. This review explores the impact of advanced DL methodologies in CT imaging, with a particular focus on their applications in coronavirus disease 2019 (COVID-19) detection and lung nodule classification. Methods A comprehensive literature search was conducted, examining the evolution of DL architectures in medical imaging from conventional convolutional neural networks (CNNs) to sophisticated foundational models (FMs). We reviewed publications from major databases, focusing on developments in CT image analysis using DL from 2013 to 2023. Our search criteria included all types of articles, with a focus on peer-reviewed research papers and review articles in English. Key Content and Findings The review reveals that DL, particularly advanced architectures like FMs, has transformed CT image analysis by streamlining interpretation processes and enhancing diagnostic capabilities. We found significant advancements in addressing global health challenges, especially during the COVID-19 pandemic, and in ongoing efforts for lung cancer screening. The review also addresses technical challenges in CT image analysis, including data variability, the need for large high-quality datasets, and computational demands. Innovative strategies such as transfer learning, data augmentation, and distributed computing are explored as solutions to these challenges. Conclusions This review underscores the pivotal role of DL in advancing CT image analysis, particularly for COVID-19 and lung nodule detection. The integration of DL models into clinical workflows shows promising potential to enhance diagnostic accuracy and efficiency. However, challenges remain in areas of interpretability, validation, and regulatory compliance. The review advocates for continued research, interdisciplinary collaboration, and ethical considerations as DL technologies become integral to clinical practice. While traditional imaging techniques remain vital, the integration of DL represents a significant advancement in medical diagnostics, with far-reaching implications for future research, clinical practice, and healthcare policy.
Collapse
Affiliation(s)
- Isah Salim Ahmad
- Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
- University of Chinese Academy of Sciences, Beijing, China
| | - Jingjing Dai
- Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
- University of Chinese Academy of Sciences, Beijing, China
| | - Yaoqin Xie
- Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
- University of Chinese Academy of Sciences, Beijing, China
| | - Xiaokun Liang
- Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
- University of Chinese Academy of Sciences, Beijing, China
| |
Collapse
|
42
|
Su AY, Wu ML, Wu YH. Deep learning system for the differential diagnosis of oral mucosal lesions through clinical photographic imaging. J Dent Sci 2025; 20:54-60. [PMID: 39873061 PMCID: PMC11763237 DOI: 10.1016/j.jds.2024.10.019] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/07/2024] [Revised: 10/15/2024] [Indexed: 01/30/2025] Open
Abstract
Background/purpose Oral mucosal lesions are associated with a variety of pathological conditions. Most deep-learning-based convolutional neural network (CNN) systems for computer-aided diagnosis of oral lesions have concentrated on limited aspects of differential diagnosis. This study aimed to develop a CNN-based diagnostic model capable of classifying clinical photographs of oral ulcerative and associated lesions into five different diagnoses, thereby assisting clinicians in making accurate differential diagnoses. Materials and methods A set of clinical images was selected, including 506 images of five different diagnoses. The images were pre-processed and randomly divided into two sets for training and testing the CNN model. The model architecture was composed of convolutional layers, batch normalization layers, max pooling layers, a dropout layer, and fully-connected layers. Evaluation metrics included weighted precision, weighted recall, weighted F1 score, average specificity, Cohen's kappa coefficient, the normalized confusion matrix, and AUC. Results The overall performance for image classification showed a weighted precision of 88.8%, a weighted recall of 88.2%, a weighted F1 score of 0.878, an average specificity of 97.0%, a kappa coefficient of 0.851, and an average AUC of 0.985. Conclusion The model achieved a solid classification performance (overall AUC = 0.985), showing the capacity to discern between benign lesions and those with malignant potential, and lays the foundation for a novel tool that can support the clinical differential diagnosis of oral mucosal lesions. The main challenges were the small and imbalanced dataset. Enlarging the minority classes, incorporating more oral mucosal lesion diagnoses, and employing transfer learning and cross-validation could be pursued in future work to optimize the image classification model.
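As a hedged sketch of the architecture type described (convolution, batch normalization, max pooling, dropout, and fully connected layers, ending in a five-way softmax), the Keras model below uses assumed filter counts and input size, not the study's exact configuration.

```python
# Minimal Keras CNN for five-class lesion photo classification (illustrative).
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Input(shape=(224, 224, 3)),
    layers.Conv2D(32, 3, activation="relu"),
    layers.BatchNormalization(),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),
    layers.BatchNormalization(),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dropout(0.5),
    layers.Dense(128, activation="relu"),
    layers.Dense(5, activation="softmax"),   # five differential diagnoses
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```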
Collapse
Affiliation(s)
- An-Yu Su
- School of Dentistry, College of Medicine, National Cheng Kung University, Tainan, Taiwan
| | - Ming-Long Wu
- Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan, Taiwan
- Institute of Medical Informatics, National Cheng Kung University, Tainan, Taiwan
| | - Yu-Hsueh Wu
- School of Dentistry, College of Medicine, National Cheng Kung University, Tainan, Taiwan
- Institute of Oral Medicine, School of Dentistry, College of Medicine, National Cheng Kung University, Tainan, Taiwan
- Department of Stomatology, National Cheng Kung University Hospital, College of Medicine, National Cheng Kung University, Tainan, Taiwan
| |
Collapse
|
43
|
Zhang B, Zhu J, Xu R, Zou L, Lian Y, Xie X, Tian Y. A combined model integrating radiomics and deep learning based on multiparametric magnetic resonance imaging for classification of brain metastases. Acta Radiol 2025; 66:24-34. [PMID: 39552295 DOI: 10.1177/02841851241292528] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/19/2024]
Abstract
BACKGROUND Radiomics and deep learning (DL) can individually and efficiently identify the pathological type of brain metastases (BMs). PURPOSE To investigate the feasibility of utilizing multi-parametric MRI-based deep transfer learning radiomics (DTLR) for the classification of lung adenocarcinoma (LUAD) and non-LUAD BMs. MATERIAL AND METHODS A retrospective analysis was performed on 342 patients with 1389 BMs. These instances were randomly assigned to a training set of 273 patients (1179 BMs) and a testing set of 69 patients (210 BMs) in an 8:2 ratio. Eight machine learning algorithms were employed to construct the radiomics models. A DL model was developed using four pre-trained convolutional neural networks (CNNs). The DTLR model was formulated by integrating the best-performing radiomics model and the DL model using a classification probability averaging approach. The area under the curve (AUC), calibration curves, and decision curve analysis (DCA) were utilized to assess the performance and clinical utility of the models. RESULTS The AUCs for the optimal radiomics and DL models in the testing set were 0.824 (95% confidence interval [CI] = 0.726-0.923) and 0.775 (95% CI = 0.666-0.884), respectively. The DTLR model demonstrated superior discriminatory power, achieving an AUC of 0.880 (95% CI = 0.803-0.957). In addition, the DTLR model exhibited good consistency between actual and predicted probabilities based on the calibration curve and DCA, indicating its significant clinical value. CONCLUSION Our study's DTLR model demonstrated high diagnostic accuracy in distinguishing LUAD from non-LUAD BMs. This method shows potential for the non-invasive identification of the histological subtype of BMs.
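The fusion step is a classification probability average; the sketch below illustrates the idea with synthetic stand-in features and generic classifiers (the study's actual radiomics and DL models are not reproduced).

```python
# Soft fusion: average the class probabilities of two independent models.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(42)
X_radiomics = rng.random((100, 20))   # handcrafted radiomics features (assumed)
X_deep = rng.random((100, 50))        # deep transfer-learning features (assumed)
y = rng.integers(0, 2, 100)           # LUAD vs non-LUAD labels

radiomics_model = RandomForestClassifier().fit(X_radiomics, y)
deep_model = LogisticRegression(max_iter=500).fit(X_deep, y)

# Average the two models' predicted probabilities to form the fused score.
p_fused = (radiomics_model.predict_proba(X_radiomics)
           + deep_model.predict_proba(X_deep)) / 2.0
print(p_fused.argmax(axis=1)[:5])
```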
Collapse
Affiliation(s)
- Bo Zhang
- Department of Radiology, The Second Affiliated Hospital of Soochow University, Suzhou, PR China
| | - Jinling Zhu
- Department of Radiology, The Second Affiliated Hospital of Soochow University, Suzhou, PR China
| | - Ruizhe Xu
- Department of Radiation Oncology, The Second Affiliated Hospital of Soochow University, Suzhou, PR China
| | - Li Zou
- Department of Radiation Oncology, The Second Affiliated Hospital of Soochow University, Suzhou, PR China
| | - Yixin Lian
- Department of Respiratory & Critical Care Medicine, The Second Affiliated Hospital of Soochow University, Suzhou, PR China
| | - Xin Xie
- Department of Radiology, The Second Affiliated Hospital of Soochow University, Suzhou, PR China
| | - Ye Tian
- Department of Radiotherapy & Oncology, The Second Affiliated Hospital of Soochow University, Suzhou, PR China
- Institute of Radiotherapy & Oncology, Soochow University, Suzhou, PR China
- Suzhou Key Laboratory for Radiation Oncology, Suzhou, PR China
| |
Collapse
|
44
|
Selvam IJ, Madhavan M, Kumarasamy SK. Detection and classification of electrocardiography using hybrid deep learning models. Hellenic J Cardiol 2025; 81:75-84. [PMID: 39218394 DOI: 10.1016/j.hjc.2024.08.011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/22/2024] [Revised: 08/17/2024] [Accepted: 08/26/2024] [Indexed: 09/04/2024] Open
Abstract
OBJECTIVE Electrocardiography (ECG) has been a vital tool for cardiovascular disease (CVD) diagnosis, visually depicting the heart's electrical activity. To enhance automatic classification between normal and diseased ECGs, it is essential to extract consistent and qualitative features. METHODS The precision of ECG classification is improved through a hybrid deep learning (DL) approach that leverages both a Convolutional Neural Network (CNN) architecture and Variational Autoencoder (VAE) techniques. By combining these methods, we aim to achieve more accurate and robust ECG interpretation. The method is trained and tested on the PTB-XL dataset, which contains 21,799 12-lead ECGs from 18,869 patients, each spanning 10 s. Classification performance of the proposed CNN-VAE model is compared across five super-classes and 23 sub-classes of CVD. RESULTS The classification of various CVDs resulted in the highest accuracy of 98.51%, specificity of 98.12%, sensitivity of 97.9%, and F1-score of 97.95%. We also achieved minimum false positive and false negative rates of 2.07% and 1.87%, respectively, during validation. The results are validated against annotations given by individual cardiologists, who assigned potentially multiple ECG statements to each record. CONCLUSION When compared to other deep learning methods, our suggested CNN-VAE model performs significantly better in the testing phase. This study proposes a new architecture combining a CNN and a VAE for CVD classification from ECG data, which can help clinicians identify disease earlier and carry out further treatment. The CNN-VAE model can better characterize input signals due to its hybrid architecture.
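A rough sketch of the VAE half of such a hybrid, with assumed layer sizes: a 1D-CNN encoder over 12-lead ECG windows (PTB-XL records span 10 s, i.e., 1000 samples at 100 Hz) produces a reparameterized latent code that downstream layers can classify or decode.

```python
# CNN-VAE encoder for 12-lead ECG with the reparameterization trick.
import torch
import torch.nn as nn

class ECGEncoder(nn.Module):
    def __init__(self, latent: int = 16):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(12, 32, kernel_size=7, stride=2, padding=3), nn.ReLU(),
            nn.Conv1d(32, 64, kernel_size=7, stride=2, padding=3), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten())
        self.mu = nn.Linear(64, latent)
        self.logvar = nn.Linear(64, latent)

    def forward(self, x: torch.Tensor):
        h = self.conv(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)  # reparameterize
        return z, mu, logvar

ecg = torch.randn(4, 12, 1000)   # 4 windows, 12 leads, 10 s at 100 Hz
z, mu, logvar = ECGEncoder()(ecg)
print(z.shape)                   # torch.Size([4, 16])
```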
Collapse
Affiliation(s)
- Immaculate Joy Selvam
- Department of Electronics and Communication Engineering, Saveetha Engineering College, Thandalam, Chennai, 602105, India.
| | - Moorthi Madhavan
- Department of Biomedical Engineering, Saveetha Engineering College, Thandalam, Chennai, 602105, India.
| | - Senthil Kumar Kumarasamy
- Department of Electronics and Communication Engineering, Central Polytechnic College, Tharamani, Chennai, 600113, India.
| |
Collapse
|
45
|
Zhu L, Chen Y, Liu L, Xing L, Yu L. Multi-Sensor Learning Enables Information Transfer Across Different Sensory Data and Augments Multi-Modality Imaging. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2025; 47:288-304. [PMID: 39302777 PMCID: PMC11875987 DOI: 10.1109/tpami.2024.3465649] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 09/22/2024]
Abstract
Multi-modality imaging is widely used in clinical practice and biomedical research to gain a comprehensive understanding of an imaging subject. Currently, multi-modality imaging is accomplished by post hoc fusion of independently reconstructed images under the guidance of mutual information or spatially registered hardware, which limits the accuracy and utility of multi-modality imaging. Here, we investigate a data-driven multi-modality imaging (DMI) strategy for synergetic imaging of CT and MRI. We reveal two distinct types of features in multi-modality imaging, namely intra- and inter-modality features, and present a multi-sensor learning (MSL) framework to utilize the crossover inter-modality features for augmented multi-modality imaging. The MSL imaging approach breaks down the boundaries of traditional imaging modalities and allows for optimal hybridization of CT and MRI, which maximizes the use of sensory data. We showcase the effectiveness of our DMI strategy through synergetic CT-MRI brain imaging. The principle of DMI is quite general and holds enormous potential for various DMI applications across disciplines.
Collapse
|
46
|
Pérez-Núñez JR, Rodríguez C, Vásquez-Serpa LJ, Navarro C. The Challenge of Deep Learning for the Prevention and Automatic Diagnosis of Breast Cancer: A Systematic Review. Diagnostics (Basel) 2024; 14:2896. [PMID: 39767257 PMCID: PMC11675111 DOI: 10.3390/diagnostics14242896] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/17/2024] [Revised: 11/24/2024] [Accepted: 12/18/2024] [Indexed: 01/11/2025] Open
Abstract
OBJECTIVES This review aims to evaluate several convolutional neural network (CNN) models applied to breast cancer detection, to identify and categorize CNN variants in recent studies, and to analyze their specific strengths, limitations, and challenges. METHODS Using PRISMA methodology, this review examines studies that focus on deep learning techniques, specifically CNN, for breast cancer detection. Inclusion criteria encompassed studies from the past five years, with duplicates and those unrelated to breast cancer excluded. A total of 62 articles from the IEEE, SCOPUS, and PubMed databases were analyzed, exploring CNN architectures and their applicability in detecting this pathology. RESULTS The review found that CNN models with advanced architecture and greater depth exhibit high accuracy and sensitivity in image processing and feature extraction for breast cancer detection. CNN variants that integrate transfer learning proved particularly effective, allowing the use of pre-trained models with less training data required. However, challenges include the need for large, labeled datasets and significant computational resources. CONCLUSIONS CNNs represent a promising tool in breast cancer detection, although future research should aim to create models that are more resource-efficient and maintain accuracy while reducing data requirements, thus improving clinical applicability.
Collapse
Affiliation(s)
- Jhelly-Reynaluz Pérez-Núñez
- Facultad de Ingeniería de Sistemas e Informática, Universidad Nacional Mayor de San Marcos (UNMSM), Lima 15081, Peru; (C.R.); (L.-J.V.-S.); (C.N.)
| | | | | | | |
Collapse
|
47
|
Lu N, Liu Y, Cui J, Xiao X, Luo Y, Noori M. A Time-Frequency-Based Data-Driven Approach for Structural Damage Identification and Its Application to a Cable-Stayed Bridge Specimen. SENSORS (BASEL, SWITZERLAND) 2024; 24:8007. [PMID: 39771742 PMCID: PMC11678986 DOI: 10.3390/s24248007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 11/15/2024] [Revised: 12/11/2024] [Accepted: 12/13/2024] [Indexed: 01/11/2025]
Abstract
Structural damage identification based on structural health monitoring (SHM) data and machine learning (ML) is currently a rapidly developing research area in structural engineering. Traditional machine learning techniques rely heavily on feature extraction, where weak feature extraction can lead to suboptimal features and poor classification performance. In contrast, deep learning approaches such as convolutional neural networks (CNNs) automatically extract relevant features from raw data, improving the accuracy and adaptability of the damage identification process. This study developed a time-frequency-based data-driven approach aiming to improve the effectiveness of traditional data-driven structural damage identification approaches for large, complex structures. First, the structural acceleration signals in the time domain were converted into two-dimensional images via the Gramian angular difference field (GADF). Subsequently, the characteristic features in the image data were learned by CNNs to predict the structural damage conditions. An experimental study on a scale model of a cable-stayed bridge was conducted to identify damage to the stay cables under moving vehicle load on the main girders. The CNN was employed to extract characteristic features from the time-varying monitoring data of vehicle-bridge interactions, and its parameters were optimized for the structural damage classification task. The performance of the proposed method was evaluated by comparing it with various traditional pre-trained networks, and the effect of environmental noise on prediction accuracy was also investigated. The numerical results show that the ResNet model has the best performance in terms of damage identification accuracy and convergence speed, achieving higher accuracy and faster convergence than the other four traditional networks. The method can accurately identify damage on bridges using a limited number of sensors on the bridge deck, which has valuable potential for application to real-world bridges with monitoring data. As the signal-to-noise ratio (SNR) decreases from 20 dB to 2.5 dB, the prediction accuracy of ResNet decreases from 86.63% to 62.5%, quantifying the method's robustness and reliability in identifying structural damage under noisy measurements.
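The GADF transform is available off the shelf; the sketch below uses the pyts library as one possible implementation (an assumption, since the authors' tooling is not stated) to turn acceleration records into images for a CNN.

```python
# Gramian angular difference field: 1D acceleration signals -> 2D images.
import numpy as np
from pyts.image import GramianAngularField

rng = np.random.default_rng(1)
signals = rng.standard_normal((16, 512))      # 16 acceleration records (assumed)

gadf = GramianAngularField(image_size=64, method="difference")
images = gadf.fit_transform(signals)          # shape (16, 64, 64)
print(images.shape)
# Each image can then be fed to a CNN (e.g. a ResNet) for damage classification.
```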
Collapse
Affiliation(s)
- Naiwei Lu
- School of Civil Engineering, Changsha University of Science and Technology, Changsha 410114, China; (N.L.); (J.C.); (X.X.)
| | - Yiru Liu
- School of Civil Engineering, Changsha University of Science and Technology, Changsha 410114, China; (N.L.); (J.C.); (X.X.)
| | - Jian Cui
- School of Civil Engineering, Changsha University of Science and Technology, Changsha 410114, China; (N.L.); (J.C.); (X.X.)
| | - Xiangyuan Xiao
- School of Civil Engineering, Changsha University of Science and Technology, Changsha 410114, China; (N.L.); (J.C.); (X.X.)
| | - Yuan Luo
- College of Civil Engineering, Hunan University of Technology, Zhuzhou 412007, China;
| | - Mohammad Noori
- Department of Mechanical Engineering, California Polytechnic State University, San Luis Obispo, CA 93407, USA;
- School of Civil Engineering, University of Leeds, Leeds LS2 9JT, UK
| |
Collapse
|
48
|
Ramiah D, Mmereki D. Synthesizing Efficiency Tools in Radiotherapy to Increase Patient Flow: A Comprehensive Literature Review. Clin Med Insights Oncol 2024; 18:11795549241303606. [PMID: 39677332 PMCID: PMC11645725 DOI: 10.1177/11795549241303606] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/08/2024] [Accepted: 11/07/2024] [Indexed: 12/17/2024] Open
Abstract
The promise of novel technologies to increase access to radiotherapy in low- and middle-income countries (LMICs) is crucial, given that the cost of equipping new radiotherapy centres or upgrading existing machinery remains a major obstacle to expanding access to cancer treatment. The study aims to provide a thorough overview of how technological advancement may revolutionize radiotherapy (RT) and improve the level of care provided to cancer patients. A comprehensive literature review, following key steps of the systematic literature review (SLR) methodology, was performed using the Web of Science (WoS), PubMed, and Scopus databases. The study findings are classified by technology. Artificial intelligence (AI), knowledge-based planning, remote radiotherapy treatment planning, and scripting are all ways to increase patient flow across radiation oncology, including initial consultation, treatment planning, delivery, verification, and patient follow-up. This review found that these technologies improve the delineation of organs at risk (OARs) and considerably reduce waiting times when compared with conventional treatment planning in RT. A combination of these technologies may lower cancer patients' risk of disease progression through reduced workload, improved quality of therapy, and individualized treatment. Efficiency tools such as these are therefore urgently needed to reduce waiting times and improve OAR delineation accuracy relative to traditional treatment planning methods. The study's contribution is to present the potential of technological advancement to optimize the RT planning process, thereby improving patient care and resource utilization. The study may be extended in the future to include digital integration and technology's impact on patient safety, outcomes, and risk. Therefore, in radiotherapy, research on more efficient tools pioneers the development and implementation of high-precision radiotherapy for cancer patients.
Collapse
Affiliation(s)
- Duvern Ramiah
- Division of Radiation Oncology, Department of Radiation Sciences, School of Clinical Medicine, Faculty of Health Sciences, University of the Witwatersrand, Johannesburg, South Africa
| | - Daniel Mmereki
- Division of Radiation Oncology, Department of Radiation Sciences, School of Clinical Medicine, Faculty of Health Sciences, University of the Witwatersrand, Johannesburg, South Africa
| |
Collapse
|
49
|
Morikawa T, Shingyouchi M, Ariizumi T, Watanabe A, Shibahara T, Katakura A. Performance of image processing analysis and a deep convolutional neural network for the classification of oral cancer in fluorescence visualization. Int J Oral Maxillofac Surg 2024:S0901-5027(24)00444-2. [PMID: 39672733 DOI: 10.1016/j.ijom.2024.11.010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/06/2023] [Revised: 11/25/2024] [Accepted: 11/27/2024] [Indexed: 12/15/2024]
Abstract
The aim of this prospective study was to determine the effectiveness of screening using image processing analysis and a deep convolutional neural network (DCNN) to classify oral cancers using non-invasive fluorescence visualization. The study included 1076 patients with diseases of the oral mucosa (oral cancer, oral potentially malignant disorders (OPMDs), benign disease) or normal mucosa. For oral cancer, the rate of fluorescence visualization loss (FVL) was 96.9%. Regarding image processing, multivariate analysis identified FVL, the coefficient of variation of the G value (CV), and the G value ratio (VRatio) as factors significantly associated with oral cancer detection. The sensitivity and specificity for detecting oral cancer were 96.9% and 77.3% for FVL, 80.8% and 86.4% for CV, and 84.9% and 87.8% for VRatio, respectively. Regarding the performance of the DCNN for image classification, recall was 0.980 for oral cancer, 0.760 for OPMDs, 0.960 for benign disease, and 0.739 for normal mucosa. Precision was 0.803, 0.821, 0.842, and 0.941, respectively. The F-score was 0.883, 0.789, 0.897, and 0.828, respectively. Sensitivity and specificity for detecting oral cancer were 98.0% and 92.7%, respectively. The accuracy for all lesions was 0.851, average recall was 0.860, average precision was 0.852, and average F-score was 0.849.
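The exact definitions of CV and VRatio are not given in the abstract; the snippet below shows one plausible reading as an assumption, namely the coefficient of variation of the green (G) channel within a lesion mask and a lesion-to-background mean-G ratio, with all values and the mask purely hypothetical.

```python
# Illustrative green-channel statistics for a fluorescence visualization photo.
import numpy as np

rng = np.random.default_rng(7)
image = rng.integers(0, 256, size=(256, 256, 3)).astype(float)  # RGB stand-in
lesion_mask = np.zeros((256, 256), dtype=bool)
lesion_mask[100:150, 100:150] = True        # hypothetical lesion region

g = image[..., 1]                           # green channel
cv = g[lesion_mask].std() / g[lesion_mask].mean()          # coefficient of variation
v_ratio = g[lesion_mask].mean() / g[~lesion_mask].mean()   # lesion vs background
print(round(cv, 3), round(v_ratio, 3))
```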
Collapse
Affiliation(s)
- T Morikawa
- Department of Oral and Maxillofacial Surgery, Tokyo Dental College, Tokyo, Japan; Oral and Maxillofacial Surgery, Mitsuwadai General Hospital, Chiba, Japan.
| | - M Shingyouchi
- Department of Oral and Maxillofacial Surgery, Tokyo Dental College, Tokyo, Japan
| | - T Ariizumi
- Department of Oral and Maxillofacial Surgery, Tokyo Dental College, Tokyo, Japan
| | - A Watanabe
- Department of Oral and Maxillofacial Surgery, Tokyo Dental College, Tokyo, Japan
| | - T Shibahara
- Department of Oral and Maxillofacial Surgery, Tokyo Dental College, Tokyo, Japan
| | - A Katakura
- Department of Oral Pathobiological Science and Surgery, Tokyo Dental College, Tokyo, Japan
| |
Collapse
|
50
|
Shetty S, Talaat W, AlKawas S, Al-Rawi N, Reddy S, Hamdoon Z, Kheder W, Acharya A, Ozsahin DU, David LR. Application of artificial intelligence-based detection of furcation involvement in mandibular first molar using cone beam tomography images- a preliminary study. BMC Oral Health 2024; 24:1476. [PMID: 39633335 PMCID: PMC11619149 DOI: 10.1186/s12903-024-05268-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/07/2024] [Accepted: 11/27/2024] [Indexed: 12/07/2024] Open
Abstract
BACKGROUND Radiographs play a key role in the diagnosis of periodontal diseases. Deep learning models have been explored for image analysis in periodontal diseases; however, there is a lacuna of research on deep learning-based detection of furcation involvement (FI). The objective of this study was to determine the accuracy of a deep learning model in the detection of FI in axial CBCT images. METHODOLOGY We obtained an initial dataset of 285 axial CBCT images, of which 143 were normal (without FI) and 142 were abnormal (with FI). Data augmentation was used to create 600 images (300 normal and 300 abnormal) from 200 images of the training dataset. The remaining 85 images (43 normal and 42 abnormal) were kept for testing the model. ResNet101V2 with transfer learning was employed for the analysis of the images. RESULTS The training accuracy of the model was 98%, the validation accuracy was 97%, and the test accuracy was 91%. The precision and F1 score were 0.98 and 0.98, respectively. The area under the curve (AUC) was 0.98, and the test loss was 0.2170. CONCLUSION The deep learning model (ResNet101V2) can accurately detect FI in axial CBCT images. However, since our study was preliminary in nature and carried out with a relatively small dataset, a study with a larger dataset will be needed to further confirm the accuracy of deep learning models.
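A hedged sketch of ResNet101V2 transfer learning for the binary FI task described above; the input size, frozen-base strategy, and hyperparameters are illustrative assumptions, not the study's settings.

```python
# Transfer learning: frozen ImageNet ResNet101V2 base + binary classifier head.
from tensorflow import keras
from tensorflow.keras import layers

base = keras.applications.ResNet101V2(
    include_top=False, weights="imagenet", input_shape=(224, 224, 3))
base.trainable = False                       # freeze pretrained features

model = keras.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dropout(0.3),
    layers.Dense(1, activation="sigmoid"),   # FI present vs absent
])
model.compile(optimizer=keras.optimizers.Adam(1e-4),
              loss="binary_crossentropy",
              metrics=["accuracy", keras.metrics.AUC()])
```

Freezing the base and training only the small head is a common first stage when, as here, the labeled dataset is small; the base can be unfrozen later for fine-tuning at a lower learning rate.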
Collapse
Affiliation(s)
- Shishir Shetty
- Department of Oral and Craniofacial Health Sciences, College of Dental Medicine, University of Sharjah, Sharjah, United Arab Emirates
| | - Wael Talaat
- Department of Oral and Craniofacial Health Sciences, College of Dental Medicine, University of Sharjah, Sharjah, United Arab Emirates
| | - Sausan AlKawas
- Department of Oral and Craniofacial Health Sciences, College of Dental Medicine, University of Sharjah, Sharjah, United Arab Emirates
| | - Natheer Al-Rawi
- Department of Oral and Craniofacial Health Sciences, College of Dental Medicine, University of Sharjah, Sharjah, United Arab Emirates
| | - Sesha Reddy
- College of Dentistry, Gulf Medical University, Ajman, United Arab Emirates
| | - Zaid Hamdoon
- Department of Oral and Craniofacial Health Sciences, College of Dental Medicine, University of Sharjah, Sharjah, United Arab Emirates
| | - Waad Kheder
- Department of Preventive and Restorative Dentistry College of Dental medicine, University of Sharjah, Sharjah, United Arab Emirates
| | - Anirudh Acharya
- Department of Preventive and Restorative Dentistry College of Dental medicine, University of Sharjah, Sharjah, United Arab Emirates
| | - Dilber Uzun Ozsahin
- Department of Medical Diagnostic Imaging, College of Health Sciences, University of Sharjah, Sharjah, United Arab Emirates.
| | - Leena R David
- Department of Medical Diagnostic Imaging, College of Health Sciences, University of Sharjah, Sharjah, United Arab Emirates.
| |
Collapse
|