1. Nondestructive detection of SSC in multiple pear (Pyrus pyrifolia Nakai) cultivars using Vis-NIR spectroscopy coupled with the Grad-CAM method. Food Chem 2024; 450:139283. [PMID: 38615528] [DOI: 10.1016/j.foodchem.2024.139283] [Received: 01/16/2024] [Revised: 03/22/2024] [Accepted: 04/06/2024] [Indexed: 04/16/2024]
Abstract
Vis-NIR spectroscopy coupled with chemometric models is frequently used for predicting the soluble solid content (SSC) of pears. However, model robustness is challenged by variation across pear cultivars. This study explored the feasibility of developing universal models that predict SSC across multiple pear varieties, thereby improving generalizability. Mature fruits of six pear cultivars with green skin (Pyrus pyrifolia Nakai cv. 'Cuiyu', 'Sucui No.1' and 'Cuiguan') and brown skin (Pyrus pyrifolia Nakai cv. 'Hosui', 'Syusui' and 'Wakahikari') were used to establish single-cultivar models and multi-cultivar universal models using convolutional neural network (CNN), partial least squares (PLS), and support vector regression (SVR) approaches. Multi-cultivar universal models were built using full spectra and using important variables extracted by gradient-weighted class activation mapping (Grad-CAM), respectively. The universal models based on important variables achieved satisfactory performance, with RMSEPs of 0.76, 0.59, 0.80, 1.64, 0.98, and 1.03 °Brix on the six cultivars, respectively.
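The abstract does not give the paper's exact selection rule, but the idea of using Grad-CAM importance scores to pick informative spectral variables can be sketched as follows (the `keep` fraction and the rank-based threshold are assumptions for illustration, not the authors' method):

```python
import numpy as np

def select_important_variables(cam, keep=0.1):
    """Pick the spectral variables (wavelengths) with the largest Grad-CAM
    importance, keeping the top `keep` fraction of variables.
    cam: 1-D array of per-wavelength importance scores."""
    cam = np.asarray(cam, float)
    k = max(1, int(round(keep * cam.size)))
    # indices of the k highest-scoring variables, returned in spectral order
    return np.sort(np.argsort(cam)[::-1][:k])
```

The selected indices would then be used to train the reduced-input universal models.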
2. Enhancing brain tumor detection in MRI images through explainable AI using Grad-CAM with Resnet 50. BMC Med Imaging 2024; 24:107. [PMID: 38734629] [PMCID: PMC11088067] [DOI: 10.1186/s12880-024-01292-7] [Received: 04/13/2024] [Accepted: 05/06/2024] [Indexed: 05/13/2024]
Abstract
This study addresses the critical challenge of detecting brain tumors using MRI images, a pivotal task in medical diagnostics that demands high accuracy and interpretability. While deep learning has shown remarkable success in medical image analysis, there remains a substantial need for models that are not only accurate but also interpretable to healthcare professionals. The existing methodologies, predominantly deep learning-based, often act as black boxes, providing little insight into their decision-making process. This research introduces an integrated approach using ResNet50, a deep learning model, combined with Gradient-weighted Class Activation Mapping (Grad-CAM) to offer a transparent and explainable framework for brain tumor detection. We employed a dataset of MRI images, enhanced through data augmentation, to train and validate our model. The results demonstrate a significant improvement in model performance, with a testing accuracy of 98.52% and precision-recall metrics exceeding 98%, showcasing the model's effectiveness in distinguishing tumor presence. The application of Grad-CAM provides insightful visual explanations, illustrating the model's focus areas in making predictions. This fusion of high accuracy and explainability holds profound implications for medical diagnostics, offering a pathway towards more reliable and interpretable brain tumor detection tools.
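The Grad-CAM computation these studies rely on is simple at its core: each channel's weight is the spatially averaged gradient of the class score with respect to that feature map, and the heat map is the ReLU of the weighted sum of feature maps. A minimal NumPy sketch, independent of any particular network (a real pipeline would obtain the activations and gradients from a framework hook):

```python
import numpy as np

def grad_cam(activations, gradients):
    """Gradient-weighted Class Activation Mapping.
    activations: feature maps A_k, shape (C, H, W)
    gradients:   d(class score)/dA_k, same shape
    Returns an (H, W) map: ReLU(sum_k alpha_k * A_k), normalised to [0, 1]."""
    alpha = gradients.mean(axis=(1, 2))              # (C,) channel weights
    cam = np.tensordot(alpha, activations, axes=1)   # weighted sum over channels
    cam = np.maximum(cam, 0.0)                       # ReLU keeps positive evidence
    if cam.max() > 0:                                # normalise for display
        cam = cam / cam.max()
    return cam
```

The resulting map is upsampled to the input resolution and overlaid on the MRI slice to show where the model looked.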
3. Quantitative and Visual Analysis of Data Augmentation and Hyperparameter Optimization in Deep Learning-Based Segmentation of Low-Grade Glioma Tumors Using Grad-CAM. Ann Biomed Eng 2024; 52:1359-1377. [PMID: 38409433] [DOI: 10.1007/s10439-024-03461-9] [Received: 10/25/2023] [Accepted: 01/29/2024] [Indexed: 02/28/2024]
Abstract
This study presents a quantitative and visual investigation of the effect of data augmentation and hyperparameter optimization on the accuracy of deep learning-based segmentation of LGG tumors. The study employed MobileNetV2 and ResNet backbones with atrous convolution in the DeepLabV3+ structure. The Grad-CAM tool was used to interpret the effect of augmentation and network optimization on segmentation performance. An extensive search was performed to optimize the network hyperparameters, and 35 different models were examined to evaluate different data augmentation techniques. The results indicated that incorporating data augmentation and optimization can improve the performance of segmenting brain LGG tumors by up to 10%. Our investigation of the augmentation techniques indicated that enlarging the dataset with 90° and 225° rotations, together with up-to-down and left-to-right flipping, is the most effective approach. MobileNetV2 as the backbone, "Focal Loss" as the loss function, and "Adam" as the optimizer showed superior results. The optimal model (DLG-Net) achieved an overall accuracy of 96.1% with a loss value of 0.006. Specifically, the segmentation performance for Whole Tumor (WT), Tumor Core (TC), and Enhanced Tumor (ET) reached a Dice Similarity Coefficient (DSC) of 89.4%, 70.1%, and 49.9%, respectively. Simultaneous visual and quantitative assessment of data augmentation and network optimization can lead to an optimal model with reasonable performance in segmenting LGG tumors.
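The flips and right-angle rotation found effective above are one-line array operations; a minimal sketch (arbitrary angles such as 225° require interpolation, e.g. `scipy.ndimage.rotate`, and are omitted here):

```python
import numpy as np

def augment(image):
    """Yield simple geometric augmentations of a 2-D image array:
    a 90-degree rotation plus up-to-down and left-to-right flips."""
    yield np.rot90(image)   # 90° counter-clockwise rotation
    yield np.flipud(image)  # up-to-down flip
    yield np.fliplr(image)  # left-to-right flip
```

Applied to every training image, these transforms quadruple the effective dataset size without altering segmentation labels (which receive the same transforms).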
4. A Semi-Supervised Learning Framework for Classifying Colorectal Neoplasia Based on the NICE Classification. J Imaging Inform Med 2024:10.1007/s10278-024-01123-9. [PMID: 38653910] [DOI: 10.1007/s10278-024-01123-9] [Received: 01/21/2024] [Revised: 04/02/2024] [Accepted: 04/12/2024] [Indexed: 04/25/2024]
Abstract
Labelling medical images is an arduous and costly task that necessitates clinical expertise and large numbers of qualified images. Insufficient samples can lead to underfitting during training and poor performance of supervised learning models. In this study, we aim to develop a SimCLR-based semi-supervised learning framework to classify colorectal neoplasia based on the NICE classification. First, the proposed framework was trained under self-supervised learning using a large unlabelled dataset; subsequently, it was fine-tuned on a limited labelled dataset based on the NICE classification. The model was evaluated on an independent dataset and compared with models based on supervised transfer learning and with endoscopists, using accuracy, the Matthews correlation coefficient (MCC), and Cohen's kappa. Finally, Grad-CAM and t-SNE were applied to visualize the models' interpretations. A ResNet-backboned SimCLR model (accuracy of 0.908, MCC of 0.862, and Cohen's kappa of 0.896) outperformed supervised transfer learning-based models (means: 0.803, 0.698, and 0.742) and junior endoscopists (0.816, 0.724, and 0.863), while performing only slightly worse than senior endoscopists (0.916, 0.875, and 0.944). Moreover, t-SNE showed better clustering of ternary samples through self-supervised learning in SimCLR than through supervised transfer learning. Compared with traditional supervised learning, semi-supervised learning enables deep learning models to achieve improved performance with limited labelled endoscopic images.
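SimCLR's self-supervised stage trains with the NT-Xent contrastive loss: each image is augmented twice, and the two views must be more similar to each other than to every other sample in the batch. A small NumPy sketch of the loss (the temperature of 0.5 is an assumed default, not this paper's setting):

```python
import numpy as np

def nt_xent(z1, z2, tau=0.5):
    """NT-Xent (normalised temperature-scaled cross-entropy) loss for a batch
    of N paired views. z1, z2: (N, D) embeddings of two augmentations of the
    same N images."""
    z = np.concatenate([z1, z2], axis=0)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)   # unit vectors -> cosine sim
    sim = z @ z.T / tau
    n = len(z1)
    loss = 0.0
    for i in range(2 * n):
        j = (i + n) % (2 * n)                          # index of the positive pair
        logits = np.delete(sim[i], i)                  # drop self-similarity
        loss += -sim[i, j] + np.log(np.exp(logits).sum())  # -log softmax(positive)
    return loss / (2 * n)
```

After this pretraining, only the small NICE-labelled set is needed to fine-tune the classifier head.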
5. Harnessing ResNet50 and SENet for enhanced ankle fracture identification. BMC Musculoskelet Disord 2024; 25:250. [PMID: 38561697] [PMCID: PMC10983628] [DOI: 10.1186/s12891-024-07355-8] [Received: 10/29/2023] [Accepted: 03/13/2024] [Indexed: 04/04/2024]
Abstract
BACKGROUND Ankle fractures are prevalent injuries that necessitate precise diagnostic tools. Traditional diagnostic methods have limitations that can be addressed using machine learning techniques, with the potential to improve accuracy and expedite diagnoses. METHODS We trained various deep learning architectures, notably the Adapted ResNet50 with SENet capabilities, to identify ankle fractures using a curated dataset of radiographic images. Model performance was evaluated using common metrics like accuracy, precision, and recall. Additionally, Grad-CAM visualizations were employed to interpret model decisions. RESULTS The Adapted ResNet50 with SENet capabilities consistently outperformed other models, achieving an accuracy of 93%, AUC of 95%, and recall of 92%. Grad-CAM visualizations provided insights into areas of the radiographs that the model deemed significant in its decisions. CONCLUSIONS The Adapted ResNet50 model enhanced with SENet capabilities demonstrated superior performance in detecting ankle fractures, offering a promising tool to complement traditional diagnostic methods. However, continuous refinement and expert validation are essential to ensure optimal application in clinical settings.
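The SENet capability mentioned above is the squeeze-and-excitation block: globally pool each channel, pass the pooled vector through a small two-layer bottleneck, and rescale the channels with the resulting sigmoid gates. A NumPy sketch with externally supplied (normally learned) weights:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def se_block(x, w1, w2):
    """Squeeze-and-Excitation channel recalibration.
    x: feature maps (C, H, W); w1: (C//r, C) reduction FC; w2: (C, C//r) expansion FC.
    Returns x with each channel scaled by a learned gate in (0, 1)."""
    s = x.mean(axis=(1, 2))                        # squeeze: global average pool -> (C,)
    e = sigmoid(w2 @ np.maximum(w1 @ s, 0.0))      # excitation: FC-ReLU-FC-sigmoid
    return x * e[:, None, None]                    # scale each channel
```

Inserted after ResNet50 stages, such blocks let the network emphasise fracture-relevant channels at negligible parameter cost.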
Grants
- 2020AS0031 Science and Technology Projects in the Field of Agriculture and Social Development in Yinzhou District, Ningbo City, Zhejiang Province, China
6. Alzheimer's Disease Evaluation Through Visual Explainability by Means of Convolutional Neural Networks. Int J Neural Syst 2024; 34:2450007. [PMID: 38273799] [DOI: 10.1142/s0129065724500072] [Indexed: 01/27/2024]
Abstract
Background and Objective: Alzheimer's disease is nowadays the most common cause of dementia. It is a degenerative neurological pathology affecting the brain, progressively leading the patient to a state of total dependence and creating a very complex and difficult situation for the family that has to provide care. Early diagnosis is a primary objective and offers the hope of intervening during the development phase of the disease. Methods: In this paper, a method to automatically detect the presence of Alzheimer's disease by exploiting deep learning is proposed. Five different convolutional neural networks are considered: ALEX_NET, VGG16, FAB_CONVNET, STANDARD_CNN and FCNN. The first two networks are state-of-the-art models, while the last three are designed by the authors. We classify brain images into one of the following classes: non-demented, very mild demented and mild demented. Moreover, we highlight on the image the areas symptomatic of Alzheimer's disease, thus providing a visual explanation behind the model diagnosis. Results: The experimental analysis, conducted on more than 6000 magnetic resonance images, demonstrated the effectiveness of the proposed neural networks in comparison with state-of-the-art models for Alzheimer's disease diagnosis and localization. The best results in terms of metrics were obtained with STANDARD_CNN and FCNN, with accuracy, precision and recall between 95% and 98%. Excellent qualitative results were also obtained with Grad-CAM for localization and visual explainability. Conclusions: The analysis of the heatmaps produced by the Grad-CAM algorithm shows that in almost all cases they highlight regions such as the ventricles and the cerebral cortex. Future work will focus on the realization of a network capable of analyzing the three anatomical views simultaneously.
7. Deep learning diagnostic performance and visual insights in differentiating benign and malignant thyroid nodules on ultrasound images. Exp Biol Med (Maywood) 2023; 248:2538-2546. [PMID: 38279511] [PMCID: PMC10854474] [DOI: 10.1177/15353702231220664] [Received: 07/02/2023] [Accepted: 10/13/2023] [Indexed: 01/28/2024]
Abstract
This study aims to construct and evaluate a deep learning model, utilizing ultrasound images, to accurately differentiate benign and malignant thyroid nodules. The objective includes visualizing the model's process for interpretability and comparing its diagnostic precision with a cohort of 80 radiologists. We employed ResNet as the classification backbone for thyroid nodule prediction. The model was trained using 2096 ultrasound images of 655 distinct thyroid nodules. For performance evaluation, an independent test set comprising 100 cases of thyroid nodules was curated. In addition, to demonstrate the superiority of the artificial intelligence (AI) model over radiologists, a Turing test was conducted with 80 radiologists of varying clinical experience, to assess which group of radiologists' conclusions were in closer alignment with the AI predictions. Furthermore, to highlight the interpretability of the AI model, gradient-weighted class activation mapping (Grad-CAM) was employed to visualize the model's areas of focus during its prediction process. In this cohort, AI diagnostics demonstrated a sensitivity of 81.67%, a specificity of 60%, and an overall diagnostic accuracy of 73%. In comparison, the panel of radiologists on average exhibited a diagnostic accuracy of 62.9%. The AI's diagnostic process was significantly faster than that of the radiologists. The generated heat maps highlighted the model's focus on areas characterized by calcification, solid echo and higher echo intensity, suggesting these areas might be indicative of malignant thyroid nodules. Our study supports the notion that deep learning can be a valuable diagnostic tool with comparable accuracy to experienced senior radiologists in the diagnosis of malignant thyroid nodules. The interpretability of the AI model's process suggests that it could be clinically meaningful. Further studies are necessary to improve diagnostic accuracy and support auxiliary diagnoses in primary care settings.
8. Deep learning-based estimation of axial length using macular optical coherence tomography images. Front Med (Lausanne) 2023; 10:1308923. [PMID: 38046408] [PMCID: PMC10693454] [DOI: 10.3389/fmed.2023.1308923] [Received: 10/07/2023] [Accepted: 11/06/2023] [Indexed: 12/05/2023]
Abstract
Background This study aimed to develop deep learning models using macular optical coherence tomography (OCT) images to estimate axial lengths (ALs) in eyes without maculopathy. Methods A total of 2,664 macular OCT images from 444 patients' eyes without maculopathy, who visited Beijing Hospital between March 2019 and October 2021, were included. The dataset was divided into training, validation, and testing sets with a ratio of 6:2:2. Three pre-trained models (ResNet 18, ResNet 50, and ViT) were developed for a binary classification task (AL ≥ 26 mm) and a regression task. Ten-fold cross-validation was performed, and Grad-CAM analysis was employed to visualize AL-related macular features. Additionally, retinal thickness measurements were used to predict AL with linear and logistic regression models. Results ResNet 50 achieved an accuracy of 0.872 (95% Confidence Interval [CI], 0.840-0.899), with a high sensitivity of 0.804 (95% CI, 0.728-0.867) and specificity of 0.895 (95% CI, 0.861-0.923). The mean absolute error for AL prediction was 0.83 mm (95% CI, 0.72-0.95 mm). The best AUC and accuracy of AL estimation using macular OCT images (0.929, 87.2%) were superior to those obtained using retinal thickness measurements alone (0.747, 77.8%). AL-related macular features were located on the fovea and adjacent regions. Conclusion OCT images can be effectively utilized for estimating AL with good performance via deep learning. The AL-related macular features exhibit a localized pattern in the macula, rather than continuous alterations throughout the entire region. These findings can lay the foundation for future research into the pathogenesis of AL-related maculopathy.
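The sensitivity and specificity reported for the AL ≥ 26 mm task reduce to two ratios over confusion-matrix counts; a minimal helper makes the definitions explicit:

```python
def sens_spec(tp, fn, tn, fp):
    """Sensitivity (true-positive rate) and specificity (true-negative rate)
    from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # fraction of long eyes correctly flagged
    specificity = tn / (tn + fp)  # fraction of normal eyes correctly passed
    return sensitivity, specificity
```

For screening tasks like this, sensitivity is usually the metric to protect, since a missed long eye is costlier than a false alarm.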
9. Spatiotemporal cortical dynamics for visual scene processing as revealed by EEG decoding. Front Neurosci 2023; 17:1167719. [PMID: 38027518] [PMCID: PMC10646306] [DOI: 10.3389/fnins.2023.1167719] [Received: 02/16/2023] [Accepted: 10/16/2023] [Indexed: 12/01/2023]
Abstract
The human visual system rapidly recognizes the categories and global properties of complex natural scenes. The present study investigated the spatiotemporal dynamics of neural signals involved in visual scene processing using electroencephalography (EEG) decoding. We recorded visual evoked potentials from 11 human observers for 232 natural scenes, each of which belonged to one of 13 natural scene categories (e.g., a bedroom or open country) and had three global properties (naturalness, openness, and roughness). We trained a deep convolutional classification model of the natural scene categories and global properties using EEGNet. Having confirmed that the model successfully classified natural scene categories and the three global properties, we applied Grad-CAM to the EEGNet model to visualize the EEG channels and time points that contributed to the classification. The analysis showed that EEG signals in the occipital electrodes at short latencies (approximately 80 ms) contributed to the classifications, whereas those in the frontal electrodes at relatively long latencies (approximately 200 ms) contributed to the classification of naturalness and the individual scene category. These results suggest that different global properties are encoded in different cortical areas and with different timings, and that the combination of the EEGNet model and Grad-CAM can be a tool to investigate both temporal and spatial distribution of natural scene processing in the human brain.
10. Deep neural network technique for automated detection of ADHD and CD using ECG signal. Comput Methods Programs Biomed 2023; 241:107775. [PMID: 37651817] [DOI: 10.1016/j.cmpb.2023.107775] [Received: 05/24/2023] [Revised: 08/09/2023] [Accepted: 08/22/2023] [Indexed: 09/02/2023]
Abstract
BACKGROUND AND OBJECTIVE Attention Deficit Hyperactivity Disorder (ADHD) is a common neurodevelopmental disorder in children and adolescents that can lead to long-term challenges in life outcomes if left untreated. ADHD is frequently associated with Conduct Disorder (CD), and multiple studies have found similarities in clinical signs and behavioral symptoms between the two disorders, making differentiation between ADHD, ADHD comorbid with CD (ADHD+CD), and CD a subjective diagnosis. Therefore, the goal of this pilot study is to create the first explainable deep learning (DL) model for objective ECG-based ADHD/CD diagnosis, as having an objective biomarker may improve diagnostic accuracy. METHODS The dataset used in this study consists of ECG data collected from 45 ADHD, 62 ADHD+CD, and 16 CD patients at the Child Guidance Clinic in Singapore. The ECG data were segmented into 2 s epochs and used directly to train our 1-dimensional (1D) convolutional neural network (CNN) model. RESULTS The proposed model yielded 96.04% classification accuracy, 96.26% precision, 95.99% sensitivity, and 96.11% F1-score. The gradient-weighted class activation mapping (Grad-CAM) function was also used to highlight the ECG characteristics at specific time points that most influenced the classification score. CONCLUSION In addition to the model's performance, the Grad-CAM implementation offers temporal information that clinicians and other mental healthcare professionals can use to make informed medical judgments. We hope that this pilot study will encourage larger-scale research with larger biosignal datasets, allowing biosignal-based computer-aided diagnostic (CAD) tools to be implemented in healthcare and ambulatory settings, as ECG can be easily obtained via wearable devices such as smartwatches.
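The 2 s epoch segmentation described above is a simple windowing step; a sketch (the sampling rate `fs` is an assumed parameter, as the abstract does not state it):

```python
import numpy as np

def segment_epochs(signal, fs, epoch_s=2.0):
    """Split a 1-D ECG trace into non-overlapping fixed-length epochs.
    signal: 1-D sample array; fs: sampling rate in Hz; epoch_s: epoch length
    in seconds. Any trailing partial epoch is dropped."""
    n = int(fs * epoch_s)               # samples per epoch
    k = len(signal) // n                # number of complete epochs
    return np.asarray(signal[:k * n]).reshape(k, n)
```

Each row of the returned array becomes one training example for the 1D CNN.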
11. A Smartphone-Based Detection System for Tomato Leaf Disease Using EfficientNetV2B2 and Its Explainability with Artificial Intelligence (AI). Sensors (Basel) 2023; 23:8685. [PMID: 37960385] [PMCID: PMC10648786] [DOI: 10.3390/s23218685] [Received: 09/28/2023] [Revised: 10/19/2023] [Accepted: 10/20/2023] [Indexed: 11/15/2023]
Abstract
The occurrence of tomato diseases has substantially reduced agricultural output and caused financial losses. Timely detection of diseases is crucial to effectively manage and mitigate the impact of outbreaks. Early disease detection can improve yield, reduce chemical use, and boost a nation's economy. A complete system for plant disease detection using EfficientNetV2B2 and deep learning (DL) is presented in this paper. This research aims to develop a precise and effective automated system for identifying several diseases that affect tomato plants by analyzing tomato leaf photographs. A dataset of high-resolution photographs of healthy and diseased tomato leaves was created to achieve this goal. The EfficientNetV2B2 model is the foundation of the deep learning system and excels at image classification. Transfer learning (TL) trains the model on the tomato leaf disease dataset using EfficientNetV2B2's pre-existing weights and a 256-node dense layer. An appropriate loss function and optimizer are used to train and tune the model. The system is then deployed in smartphone and web applications, with which users can accurately diagnose tomato leaf diseases. Such an automated system facilitates the rapid identification of diseases, assisting informed decisions on disease management and promoting sustainable tomato cultivation practices. The 5-fold cross-validation method achieved 99.02% average weighted training accuracy, 99.22% average weighted validation accuracy, and 98.96% average weighted test accuracy. The split method achieved 99.93% training accuracy and 100% validation accuracy. Using the DL approach, tomato leaf disease identification achieves nearly 100% accuracy on a test dataset.
12. Explainable Image Similarity: Integrating Siamese Networks and Grad-CAM. J Imaging 2023; 9:224. [PMID: 37888331] [PMCID: PMC10606999] [DOI: 10.3390/jimaging9100224] [Received: 09/06/2023] [Revised: 10/03/2023] [Accepted: 10/12/2023] [Indexed: 10/28/2023]
Abstract
With the proliferation of image-based applications in various domains, the need for accurate and interpretable image similarity measures has become increasingly critical. Existing image similarity models often lack transparency, making it challenging to understand why two images are considered similar. In this paper, we propose the concept of explainable image similarity, where the goal is to develop an approach capable of providing similarity scores along with visual factual and counterfactual explanations. Along this line, we present a new framework that integrates Siamese Networks and Grad-CAM to provide explainable image similarity, and we discuss the potential benefits and challenges of adopting this approach. In addition, we provide a comprehensive discussion of the factual and counterfactual explanations provided by the proposed framework for assisting decision making. The proposed approach has the potential to enhance the interpretability, trustworthiness and user acceptance of image-based systems in real-world image similarity applications.
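A Siamese network scores a pair by comparing the two branch embeddings; cosine similarity is one common choice for that comparison (the paper's exact similarity function is not given in the abstract):

```python
import numpy as np

def siamese_similarity(e1, e2):
    """Cosine similarity between two embedding vectors produced by the two
    (weight-shared) branches of a Siamese network. Returns a value in [-1, 1]."""
    e1 = np.asarray(e1, float)
    e2 = np.asarray(e2, float)
    return float(e1 @ e2 / (np.linalg.norm(e1) * np.linalg.norm(e2)))
```

Grad-CAM can then be run on the similarity score itself, highlighting the regions of each image that pushed the score up (factual) or down (counterfactual).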
13. Explainable machine learning for diffraction patterns. J Appl Crystallogr 2023; 56:1494-1504. [PMID: 37791364] [PMCID: PMC10543671] [DOI: 10.1107/s1600576723007446] [Received: 12/05/2022] [Accepted: 08/24/2023] [Indexed: 10/05/2023]
Abstract
Serial crystallography experiments at X-ray free-electron laser facilities produce massive amounts of data but only a fraction of these data are useful for downstream analysis. Thus, it is essential to differentiate between acceptable and unacceptable data, generally known as 'hit' and 'miss', respectively. Image classification methods from artificial intelligence, or more specifically convolutional neural networks (CNNs), classify the data into hit and miss categories in order to achieve data reduction. The quantitative performance established in previous work indicates that CNNs successfully classify serial crystallography data into desired categories [Ke, Brewster, Yu, Ushizima, Yang & Sauter (2018). J. Synchrotron Rad. 25, 655-670], but no qualitative evidence on the internal workings of these networks has been provided. For example, there are no visualization methods that highlight the features contributing to a specific prediction while classifying data in serial crystallography experiments. Therefore, existing deep learning methods, including CNNs classifying serial crystallography data, are like a 'black box'. To this end, presented here is a qualitative study to unpack the internal workings of CNNs with the aim of visualizing information in the fundamental blocks of a standard network with serial crystallography data. The region(s) or part(s) of an image that mostly contribute to a hit or miss prediction are visualized.
14. Deep learning-based overall survival prediction model in patients with rare cancer: a case study for primary central nervous system lymphoma. Int J Comput Assist Radiol Surg 2023; 18:1849-1856. [PMID: 37083973] [PMCID: PMC10497660] [DOI: 10.1007/s11548-023-02886-2] [Received: 12/09/2022] [Accepted: 03/27/2023] [Indexed: 04/22/2023]
Abstract
PURPOSE Primary central nervous system lymphoma (PCNSL) is a rare, aggressive form of extranodal non-Hodgkin lymphoma. Predicting overall survival (OS) in advance is of utmost importance, as it has the potential to aid clinical decision-making. Though radiomics-based machine learning (ML) has demonstrated promising performance in PCNSL, it demands large amounts of manual feature extraction from magnetic resonance images beforehand. Deep learning (DL) overcomes this limitation. METHODS In this paper, we tailored a 3D ResNet to predict the OS of patients with PCNSL. To overcome the limitation of data sparsity, we introduced data augmentation and transfer learning, and we evaluated the results using stratified k-fold cross-validation. To explain the results of our model, gradient-weighted class activation mapping was applied. RESULTS We obtained the best performance (with standard errors) on post-contrast T1-weighted (T1Gd) images, with area under curve [Formula: see text], accuracy [Formula: see text], precision [Formula: see text], recall [Formula: see text] and F1-score [Formula: see text], compared with ML-based models on clinical data and radiomics data, respectively, further confirming the stability of our model. We also observed that PCNSL is a whole-brain disease, and that in cases where the OS is less than 1 year it is more difficult to distinguish the tumor boundary from the normal part of the brain, which is consistent with clinical outcomes. CONCLUSIONS All these findings indicate that T1Gd can improve prognosis predictions for patients with PCNSL. To the best of our knowledge, this is the first use of DL to explain model patterns in OS classification of patients with PCNSL. Future work will involve collecting more data from patients with PCNSL, or additional retrospective studies on different patient populations with rare diseases, to further promote the clinical role of our model.
15. FemurTumorNet: Bone tumor classification in the proximal femur using DenseNet model based on radiographs. J Bone Oncol 2023; 42:100504. [PMID: 37766930] [PMCID: PMC10520341] [DOI: 10.1016/j.jbo.2023.100504] [Received: 06/04/2023] [Revised: 08/31/2023] [Accepted: 09/03/2023] [Indexed: 09/29/2023]
Abstract
Background & purpose For the best possible outcomes from therapy, bone tumors of the proximal femur must be accurately classified. This work creates an artificial intelligence (AI) model based on plain radiographs to categorize bone tumors in the proximal femur. Materials and methods Standard anteroposterior hip radiographs from a tertiary referral center were employed. A dataset of 538 femur images, including malignant, benign, and tumor-free cases, was used for training the AI model; 214 of these images showed bone tumors. Pre-processing techniques were applied, and a DenseNet model was utilized for classification. The performance of the DenseNet model was compared to that of human doctors using cross-validation, further enhanced by incorporating Grad-CAM to visually indicate tumor locations. Results For the three-label classification task, the proposed method achieved an excellent area under the receiver operating characteristic curve (AUROC) of 0.953. Its accuracy (0.853) was considerably higher than the diagnostic accuracy of the human experts in manual classification (0.794). The AI model outperformed the mean values of the clinicians in terms of sensitivity, specificity, accuracy, and F1 score. Conclusion The developed DenseNet model demonstrated remarkable accuracy in classifying bone tumors in the proximal femur using plain radiographs. This technology has the potential to reduce misdiagnosis, particularly among non-specialists in musculoskeletal oncology. The utilization of advanced deep learning models provides a promising approach for improved classification and enhanced clinical decision-making in bone tumor detection.
16. Grad-CAM-Based Explainable Artificial Intelligence Related to Medical Text Processing. Bioengineering (Basel) 2023; 10:1070. [PMID: 37760173] [PMCID: PMC10525184] [DOI: 10.3390/bioengineering10091070] [Received: 07/31/2023] [Revised: 08/28/2023] [Accepted: 09/06/2023] [Indexed: 09/29/2023]
Abstract
The opacity of deep learning makes its application challenging in the medical field. There is therefore a need for explainable artificial intelligence (XAI) in medicine, so that models and their results can be explained in a manner humans can understand. This study transfers a high-accuracy computer vision model to medical text tasks and uses the visual explanation method known as gradient-weighted class activation mapping (Grad-CAM) to generate heat maps, so that the basis for the model's decisions can be presented intuitively. The system comprises four modules: pre-processing, word embedding, classifier, and visualization. We used Word2Vec and BERT to compare word embeddings, and ResNet and one-dimensional convolutional neural networks (1D CNNs) to compare classifiers. Finally, a Bi-LSTM was used to perform text classification for direct comparison. With 25 epochs, the model that used pre-trained ResNet on the formalized text achieved the best performance (weighted recall of 90.9%, precision of 91.1%, and F1 score of 90.2%). This study processes medical texts with ResNet through Grad-CAM-based explainable artificial intelligence and obtains a high-accuracy classification effect; at the same time, Grad-CAM visualization intuitively shows the words the model attends to when making predictions.
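The Grad-CAM step shared by most entries in this list reduces to a few lines once the last-layer feature maps and their gradients are available: gradients are global-average-pooled into channel weights, the feature maps are combined with those weights, and a ReLU keeps only positive evidence. A numpy sketch on hypothetical activations (an illustration, not any paper's code):

```python
import numpy as np

def grad_cam(activations, gradients):
    """Grad-CAM: weight each feature map by its average gradient, then ReLU.

    activations: (C, H, W) feature maps from the last conv layer
    gradients:   (C, H, W) d(class score)/d(activations)
    Returns an (H, W) map normalised to [0, 1].
    """
    weights = gradients.mean(axis=(1, 2))             # global-average-pool the gradients
    cam = np.tensordot(weights, activations, axes=1)  # weighted sum over channels
    cam = np.maximum(cam, 0)                          # keep only positive evidence
    if cam.max() > 0:
        cam /= cam.max()
    return cam

# Toy check: the channel with a positive gradient dominates the map.
acts = np.zeros((2, 4, 4)); acts[0, 1, 1] = 1.0; acts[1, 3, 3] = 1.0
grads = np.stack([np.ones((4, 4)), -np.ones((4, 4))])
cam = grad_cam(acts, grads)
print(cam[1, 1], cam[3, 3])  # -> 1.0 0.0
```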
|
17
|
CervicoXNet: an automated cervicogram interpretation network. Med Biol Eng Comput 2023; 61:2405-2416. [PMID: 37185967 DOI: 10.1007/s11517-023-02835-w] [Received: 10/26/2022] [Accepted: 04/05/2023] [Indexed: 05/17/2023]
Abstract
Visual inspection with acetic acid (VIA) is a cervical pre-cancer screening program for low- and middle-income countries (LMICs). Because oncology-gynecology clinicians are scarce in LMICs, VIA examinations are performed mainly by medical workers. However, since medical workers may fail to recognize significant patterns in cervicograms, VIA examination suffers from high inter-observer variance and a high false-positive rate. This study proposed an automated cervicogram interpretation system using explainable convolutional neural networks, named "CervicoXNet", to support medical workers' decisions. A total of 779 cervicograms was used for the learning process: 487 VIA-positive and 292 VIA-negative. We applied data augmentation under a geometric transformation scenario, producing 7325 VIA-negative and 7242 VIA-positive cervicograms. The proposed model outperformed other deep learning models, with 99.22% accuracy, 100% sensitivity, and 98.28% specificity. Moreover, to test the robustness of the proposed model, colposcope images were used to validate its generalization ability; the architecture still performed satisfactorily, with 98.11% accuracy, 98.33% sensitivity, and 98% specificity. To make the prediction results visually interpretable, they are localized with a fine-grained pixel heat map using a combination of Grad-CAM and guided backpropagation. CervicoXNet can serve as an alternative early screening tool to VIA alone.
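Geometric augmentation of the kind described, which multiplied 779 cervicograms to roughly 14,500, can be built from simple flips and rotations. The transforms below are illustrative, not the authors' exact pipeline:

```python
import numpy as np

def geometric_augment(image):
    """Return geometric variants of one image array (H, W[, C])."""
    return [
        np.fliplr(image),      # horizontal flip
        np.flipud(image),      # vertical flip
        np.rot90(image, k=1),  # 90-degree rotation
        np.rot90(image, k=2),  # 180-degree rotation
        np.rot90(image, k=3),  # 270-degree rotation
    ]

img = np.arange(12).reshape(3, 4)
augmented = geometric_augment(img)
print(len(augmented))  # -> 5
```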
|
18
|
Explainability agreement between dermatologists and five visual explanations techniques in deep neural networks for melanoma AI classification. Front Med (Lausanne) 2023; 10:1241484. [PMID: 37746081 PMCID: PMC10513767 DOI: 10.3389/fmed.2023.1241484] [Received: 06/16/2023] [Accepted: 08/14/2023] [Indexed: 09/26/2023]
Abstract
Introduction: The use of deep convolutional neural networks for analyzing skin lesion images has shown promising results. Identifying skin cancer by faster and less expensive means can lead to early diagnosis, saving lives and avoiding treatment costs. However, to implement this technology in a clinical context, it is important for specialists to understand why a model makes a given prediction; it must be explainable. Explainability techniques can be used to highlight the patterns of interest for a prediction. Methods: Our goal was to test five techniques: Grad-CAM, Grad-CAM++, Score-CAM, Eigen-CAM, and LIME, analyzing the agreement rate between the features highlighted by the visual explanation maps and three clinical criteria important for melanoma classification: asymmetry, border irregularity, and color heterogeneity (the ABC rule), in 100 melanoma images. Two dermatologists scored the visual maps and the clinical images using a semi-quantitative scale, and the results were compared. They also ranked their preferred techniques. Results: The techniques had different agreement rates and acceptance. In the overall analysis, Grad-CAM showed the best total+partial agreement rate (93.6%), followed by LIME (89.8%), Grad-CAM++ (88.0%), Eigen-CAM (86.4%), and Score-CAM (84.6%). The dermatologists ranked Grad-CAM and Grad-CAM++ as their favorite options, followed by Score-CAM, LIME, and Eigen-CAM. Discussion: Saliency maps are among the few methods available for visual explanations, and evaluating explainability with humans is the ideal way to assess the understanding and applicability of these methods. Our results demonstrate a significant agreement between the clinical features dermatologists use to diagnose melanomas and the visual explanation techniques, especially Grad-CAM.
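A "total+partial agreement rate" like the 93.6% quoted for Grad-CAM combines two levels of a semi-quantitative scale. A minimal tally, with an assumed 0/1/2 scoring that is only illustrative, might look like:

```python
# Assumed scale: 0 = no agreement, 1 = partial, 2 = total (illustrative only).
def agreement_rate(scores):
    """Percentage of images rated as total or partial agreement."""
    agreed = sum(1 for s in scores if s >= 1)
    return 100.0 * agreed / len(scores)

scores = [2] * 80 + [1] * 13 + [0] * 7  # 100 rated melanoma maps (hypothetical)
print(agreement_rate(scores))  # -> 93.0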
|
19
|
Explainable Automated TI-RADS Evaluation of Thyroid Nodules. Sensors (Basel) 2023; 23:7289. [PMID: 37631825 PMCID: PMC10459295 DOI: 10.3390/s23167289] [Received: 07/26/2023] [Revised: 08/09/2023] [Accepted: 08/18/2023] [Indexed: 08/27/2023]
Abstract
A thyroid nodule, a common abnormal growth within the thyroid gland, is often identified through ultrasound imaging of the neck. These growths may be solid- or fluid-filled, and their treatment is influenced by factors such as size and location. The Thyroid Imaging Reporting and Data System (TI-RADS) is a classification method that categorizes thyroid nodules into risk levels based on features such as size, echogenicity, margin, shape, and calcification. It guides clinicians in deciding whether a biopsy or other further evaluation is needed. Machine learning (ML) can complement TI-RADS classification, thereby improving the detection of malignant tumors. When combined with expert rules (TI-RADS) and explanations, ML models may uncover elements that TI-RADS misses, especially when TI-RADS training data are scarce. In this paper, we present an automated system for classifying thyroid nodules according to TI-RADS and assessing malignancy effectively. We use ResNet-101 and DenseNet-201 models to classify thyroid nodules according to TI-RADS and malignancy. By analyzing the models' last layer using the Grad-CAM algorithm, we demonstrate that these models can identify risk areas and detect nodule features relevant to the TI-RADS score. By integrating Grad-CAM results with feature probability calculations, we provide a precise heat map, visualizing specific features within the nodule and potentially assisting doctors in their assessments. Our experiments show that the utilization of ResNet-101 and DenseNet-201 models, in conjunction with Grad-CAM visualization analysis, improves TI-RADS classification accuracy by up to 10%. This enhancement, achieved through iterative analysis and re-training, underscores the potential of machine learning in advancing thyroid nodule diagnosis, offering a promising direction for further exploration and clinical application.
|
20
|
Tracking Therapy Response in Glioblastoma Using 1D Convolutional Neural Networks. Cancers (Basel) 2023; 15:4002. [PMID: 37568818 PMCID: PMC10417313 DOI: 10.3390/cancers15154002] [Received: 06/07/2023] [Revised: 07/26/2023] [Accepted: 08/05/2023] [Indexed: 08/13/2023]
Abstract
BACKGROUND Glioblastoma (GB) is a malignant brain tumour that is challenging to treat, often relapsing even after aggressive therapy. Evaluating therapy response relies on magnetic resonance imaging (MRI) following the Response Assessment in Neuro-Oncology (RANO) criteria. However, early assessment is hindered by phenomena such as pseudoprogression and pseudoresponse. Magnetic resonance spectroscopy (MRS/MRSI) provides metabolomic information but is underutilised owing to a lack of familiarity and standardisation. METHODS This study explores the potential of spectroscopic imaging (MRSI) in combination with several machine learning approaches, including one-dimensional convolutional neural networks (1D-CNNs), to improve therapy response assessment. Preclinical GB models (GL261-bearing mice) were studied for method optimisation and validation. RESULTS The proposed 1D-CNN models successfully identify the different regions of tumours sampled by MRSI, i.e., normal brain (N), control/unresponsive tumour (T), and tumour responding to treatment (R). Class activation maps computed with Grad-CAM enabled the study of the key areas relevant to the models, providing model explainability. The generated colour-coded maps showing the N, T and R regions were highly accurate (according to Dice scores) when compared against ground truth and outperformed our previous method. CONCLUSIONS The proposed methodology may provide new and better opportunities for therapy response assessment, potentially giving earlier hints of tumour relapse.
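A 1D-CNN in this setting slides learned kernels along each spectrum rather than across an image. A single valid-mode convolution layer with ReLU, written from scratch purely for illustration:

```python
import numpy as np

def conv1d(spectrum, kernels, bias):
    """Valid-mode 1D convolution + ReLU over a single spectrum.

    spectrum: (L,) sampled spectrum
    kernels:  (F, K) F filters of width K
    bias:     (F,)
    Returns (F, L - K + 1) feature maps.
    """
    K = kernels.shape[1]
    windows = np.lib.stride_tricks.sliding_window_view(spectrum, K)  # (L-K+1, K)
    return np.maximum((windows @ kernels.T + bias).T, 0)

spec = np.array([0., 1., 2., 3., 2., 1., 0.])
edge = np.array([[-1., 0., 1.]])  # one edge-detecting filter
fm = conv1d(spec, edge, np.zeros(1))
print(fm.shape)  # -> (1, 5)
```

Stacking several such layers, followed by pooling and a dense classifier, yields the kind of per-voxel N/T/R classifier the abstract describes.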
|
21
|
Interpretation of lung disease classification with light attention connected module. Biomed Signal Process Control 2023; 84:104695. [PMID: 36879856 PMCID: PMC9978539 DOI: 10.1016/j.bspc.2023.104695] [Received: 08/05/2022] [Revised: 12/21/2022] [Accepted: 02/11/2023] [Indexed: 03/06/2023]
Abstract
Lung diseases lead to complications from obstructive disease, and the COVID-19 pandemic has increased lung disease-related deaths. Medical practitioners use stethoscopes to diagnose lung disease, but because experience and the interpretation of respiratory sounds differ between practitioners, an artificial intelligence model capable of objective judgment is needed. In this study, we therefore propose a lung disease classification model that uses an attention module and deep learning. Respiratory sound features were extracted using log-Mel spectrograms and MFCCs. Normal sounds and five types of adventitious sounds were effectively classified by improving VGGish and adding a light attention connected module to which the efficient channel attention module (ECA-Net) was applied. The model was evaluated for accuracy, precision, sensitivity, specificity, F1-score, and balanced accuracy, obtaining 92.56%, 92.81%, 92.22%, 98.50%, 92.29%, and 95.4%, respectively, confirming the benefit of the attention mechanism. The basis for each lung disease classification was analyzed using gradient-weighted class activation mapping (Grad-CAM), and model performance was further compared on open lung sounds measured with a Littmann 3200 stethoscope; experts' opinions were also included. Our results will contribute to the early diagnosis and interpretation of disease in patients with lung disease by incorporating these algorithms into smart medical stethoscopes.
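ECA-Net-style channel attention rescales each feature channel with a gate derived from a small 1D convolution over the pooled channel descriptor. A numpy sketch of that idea, with an assumed kernel size of 3 (the paper's exact module is not reproduced):

```python
import numpy as np

def eca(features, kernel):
    """Efficient channel attention over (C, H, W) features.

    kernel: (k,) weights of the 1D conv applied across channels (k odd).
    """
    desc = features.mean(axis=(1, 2))                       # (C,) global average pool
    pad = len(kernel) // 2
    padded = np.pad(desc, pad)
    conv = np.convolve(padded, kernel[::-1], mode="valid")  # cross-correlation
    gate = 1.0 / (1.0 + np.exp(-conv))                      # sigmoid gate per channel
    return features * gate[:, None, None]

feats = np.ones((4, 2, 2))
out = eca(feats, np.array([0.0, 10.0, 0.0]))
print(out.shape)  # -> (4, 2, 2)
```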
|
22
|
Diagnosis of Alzheimer Disease and Tauopathies on Whole-Slide Histopathology Images Using a Weakly Supervised Deep Learning Algorithm. J Transl Med 2023; 103:100127. [PMID: 36889541 DOI: 10.1016/j.labinv.2023.100127] [Received: 12/28/2022] [Revised: 02/06/2023] [Accepted: 02/17/2023] [Indexed: 03/08/2023]
Abstract
Neuropathologic assessment during autopsy is the gold standard for diagnosing neurodegenerative disorders. Neurodegenerative conditions, such as Alzheimer disease (AD) neuropathologic change, form a continuum with normal aging rather than discrete categories; diagnosing them is therefore a complicated task. We aimed to develop a pipeline for diagnosing AD and other tauopathies, including corticobasal degeneration (CBD), globular glial tauopathy, Pick disease, and progressive supranuclear palsy. We used a weakly supervised deep learning-based approach called clustering-constrained-attention multiple-instance learning (CLAM) on whole-slide images (WSIs) from patients with AD (n = 30), CBD (n = 20), globular glial tauopathy (n = 10), Pick disease (n = 20), and progressive supranuclear palsy (n = 20), as well as nontauopathy controls (n = 21). Three sections (A: motor cortex; B: cingulate gyrus and superior frontal gyrus; C: corpus striatum) immunostained for phosphorylated tau were scanned and converted to WSIs. We evaluated 3 models (classic multiple-instance learning, single-attention-branch CLAM, and multiattention-branch CLAM) using 5-fold cross-validation. Attention-based interpretation analysis was performed to identify the morphologic features contributing to the classification. Within highly attended regions, we also applied gradient-weighted class activation mapping to visualize cellular-level evidence of the model's decisions. The multiattention-branch CLAM model using section B achieved the highest area under the curve (0.970 ± 0.037) and diagnostic accuracy (0.873 ± 0.087). A heatmap showed the highest attention in the gray matter of the superior frontal gyrus in patients with AD and in the white matter of the cingulate gyrus in patients with CBD. Gradient-weighted class activation mapping showed the highest attention on characteristic tau lesions for each disease (e.g., numerous tau-positive threads in the white matter inclusions of CBD). Our findings support the feasibility of deep learning-based approaches for classifying neurodegenerative disorders on WSIs. Further investigation of this method, focusing on clinicopathologic correlations, is warranted.
|
23
|
Saliency Map and Deep Learning in Binary Classification of Brain Tumours. Sensors (Basel) 2023; 23:4543. [PMID: 37177747 PMCID: PMC10181656 DOI: 10.3390/s23094543] [Received: 03/16/2023] [Revised: 04/24/2023] [Accepted: 05/05/2023] [Indexed: 05/15/2023]
Abstract
This paper is devoted to the application of saliency analysis methods in the performance analysis of deep neural networks used for the binary classification of brain tumours. We present the basic issues related to deep learning techniques. A significant challenge in using deep learning methods is explaining the network's decision-making process; to ensure accurate results, the deep network must undergo extensive training to produce high-quality predictions. Network architectures differ in their properties and number of parameters, so an intriguing question is how different networks arrive at similar or distinct decisions from the same inputs. We therefore discuss three widely used deep convolutional networks, VGG16, ResNet50 and EfficientNetB7, which served as backbone models; the output layer of these pre-trained models was customized with a softmax layer. In addition, we describe a further network used to assess the obtained saliency areas. For each of the above networks, many tests were performed using key metrics, including a statistical evaluation of the impact of class activation mapping (CAM) and gradient-weighted class activation mapping (Grad-CAM) on network performance, on a publicly available dataset of brain tumour X-ray images.
|
24
|
Recent Advances in Explainable Artificial Intelligence for Magnetic Resonance Imaging. Diagnostics (Basel) 2023; 13:1571. [PMID: 37174962 PMCID: PMC10178221 DOI: 10.3390/diagnostics13091571] [Received: 02/28/2023] [Revised: 03/29/2023] [Accepted: 04/26/2023] [Indexed: 05/15/2023]
Abstract
Advances in artificial intelligence (AI), especially deep learning (DL), have facilitated magnetic resonance imaging (MRI) data analysis, enabling AI-assisted medical image diagnoses and prognoses. However, most of the DL models are considered as "black boxes". There is an unmet need to demystify DL models so domain experts can trust these high-performance DL models. This has resulted in a sub-domain of AI research called explainable artificial intelligence (XAI). In the last decade, many experts have dedicated their efforts to developing novel XAI methods that are competent at visualizing and explaining the logic behind data-driven DL models. However, XAI techniques are still in their infancy for medical MRI image analysis. This study aims to outline the XAI applications that are able to interpret DL models for MRI data analysis. We first introduce several common MRI data modalities. Then, a brief history of DL models is discussed. Next, we highlight XAI frameworks and elaborate on the principles of multiple popular XAI methods. Moreover, studies on XAI applications in MRI image analysis are reviewed across the tissues/organs of the human body. A quantitative analysis is conducted to reveal the insights of MRI researchers on these XAI techniques. Finally, evaluations of XAI methods are discussed. This survey presents recent advances in the XAI domain for explaining the DL models that have been utilized in MRI applications.
|
25
|
Application of machine learning and deep learning methods for hydrated electron rate constant prediction. Environ Res 2023; 231:115996. [PMID: 37105290 DOI: 10.1016/j.envres.2023.115996] [Received: 08/06/2022] [Revised: 04/19/2023] [Accepted: 04/24/2023] [Indexed: 05/08/2023]
Abstract
Accurately determining the second-order rate constant with eaq- (keaq-) for organic compounds (OCs) is crucial in eaq--induced advanced reduction processes (ARPs). In this study, we collected 867 keaq- values at different pHs from peer-reviewed publications and applied the machine learning (ML) algorithm XGBoost and the deep learning (DL) algorithm convolutional neural network (CNN) to predict keaq-. Our results demonstrated that the CNN model with transfer learning and data augmentation (CNN-TL&DA) greatly improved the prediction results and overcame over-fitting. Furthermore, comparing the ML/DL modeling methods, we found that CNN-TL&DA, which used molecular images (MI), achieved the best overall performance (R2test = 0.896, RMSEtest = 0.362, MAEtest = 0.261) compared with XGBoost combined with Mordred descriptors (MD) (R2test = 0.692, RMSEtest = 0.622, MAEtest = 0.399) or Morgan fingerprints (MF) (R2test = 0.512, RMSEtest = 0.783, MAEtest = 0.520). Moreover, interpretation of the MD-XGBoost and MF-XGBoost models using the SHAP method revealed the significance of the MDs (e.g., molecular size, branching, electron distribution, polarizability, and bond types), MFs (e.g., aromatic carbon, carbonyl oxygen, nitrogen, and halogen) and environmental conditions (e.g., pH) that effectively influence keaq- prediction. Interpretation of the 2D molecular-image CNN (MI-CNN) models using the Grad-CAM method showed that they correctly identified key functional groups, such as -CN, -NO2, and -X, that can increase keaq- values; almost all electron-withdrawing groups, and a small portion of electron-donating groups, could be highlighted by the MI-CNN model when estimating keaq-. Overall, our results suggest that the CNN approach yields smaller errors than the ML algorithms, making it a promising candidate for predicting other rate constants.
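The R², RMSE and MAE figures quoted throughout can be computed in a few lines. This numpy helper is generic and not tied to the paper's data:

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Return (R2, RMSE, MAE) for a set of predictions."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    resid = y_true - y_pred
    ss_res = np.sum(resid ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot          # coefficient of determination
    rmse = np.sqrt(np.mean(resid ** 2)) # root-mean-square error
    mae = np.mean(np.abs(resid))        # mean absolute error
    return r2, rmse, mae

r2, rmse, mae = regression_metrics([1., 2., 3., 4.], [1., 2., 3., 4.])
print(r2, rmse, mae)  # -> 1.0 0.0 0.0
```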
|
26
|
An Endodontic forecasting model based on the analysis of preoperative dental radiographs: A pilot study on an endodontic predictive deep neural network. J Endod 2023:S0099-2399(23)00178-4. [PMID: 37019378 DOI: 10.1016/j.joen.2023.03.015] [Received: 10/20/2022] [Revised: 03/25/2023] [Accepted: 03/27/2023] [Indexed: 04/07/2023]
Abstract
INTRODUCTION This study aimed to evaluate deep convolutional neural network (DCNN) algorithms for detecting clinical features and predicting the three-year outcome of endodontic treatment from preoperative periapical radiographs. METHODS A database of single-root premolars that received endodontic treatment or retreatment by endodontists, with three-year outcomes available, was prepared (n = 598). We constructed a 17-layer DCNN with a self-attention layer (PRESSAN-17); the model was trained, validated, and tested to 1) detect seven clinical features, i.e., full coverage restoration (FCR), presence of proximal teeth (PRX), coronal defect (COD), root rest (RRS), canal visibility (CAV), previous root filling (PRF), and periapical radiolucency (PAR), and 2) predict the three-year endodontic prognosis, taking preoperative periapical radiographs as input. During the prognostication test, a conventional DCNN without a self-attention layer (RESNET-18) was tested for comparison. Accuracy and area under the receiver operating characteristic (ROC) curve (AUC) were the main performance measures. Gradient-weighted class activation mapping (Grad-CAM) was used to visualize weighted heatmaps. RESULTS PRESSAN-17 detected FCR (AUC = 0.975), PRX (0.866), COD (0.672), RRS (0.989), PRF (0.879) and PAR (0.690) significantly better than the no-information rate (p < 0.05). Comparing the mean accuracy of 5-fold validation, PRESSAN-17 (67.0%) differed significantly from RESNET-18 (63.4%, p < 0.05). The area under the average ROC of PRESSAN-17 was 0.638, also significantly different from the no-information rate. Grad-CAM demonstrated that PRESSAN-17 correctly identified clinical features. CONCLUSIONS Deep convolutional neural networks may aid in the prognostication of endodontic treatment outcomes.
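The "no-information rate" used as the significance benchmark here is simply the accuracy of always predicting the majority class. For illustration, with hypothetical labels:

```python
from collections import Counter

def no_information_rate(labels):
    """Accuracy of a classifier that always predicts the most frequent label."""
    counts = Counter(labels)
    return counts.most_common(1)[0][1] / len(labels)

# Hypothetical outcomes: 70 successes, 30 failures -> NIR = 0.7;
# a model must beat 0.7 to carry information.
labels = ["success"] * 70 + ["failure"] * 30
print(no_information_rate(labels))  # -> 0.7
```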
|
27
|
Classification of major depressive disorder using an attention-guided unified deep convolutional neural network and individual structural covariance network. Cereb Cortex 2023; 33:2415-2425. [PMID: 35641181 DOI: 10.1093/cercor/bhac217] [Received: 11/26/2021] [Revised: 05/06/2022] [Accepted: 05/07/2022] [Indexed: 11/12/2022]
Abstract
Major depressive disorder (MDD) is the second leading cause of disability worldwide. Current structural magnetic resonance imaging-based MDD diagnosis models mainly utilize local grayscale information or morphological characteristics from single-site, small-sample studies. Emerging evidence has demonstrated that different brain structures in different circuits have distinct developmental timing but mature coordinately within the same functional circuit. Thus, establishing an attention-guided unified classification framework with deep learning and individual structural covariance networks in a large multisite dataset could facilitate the development of an accurate diagnosis strategy. Our results showed that attention guidance improved the classification accuracy from 75.1% to 76.54%. Furthermore, the discriminative features of regional covariance connectivities and local structural characteristics were mainly located in the prefrontal cortex, insula, superior temporal cortex, and cingulate cortex, regions widely reported to be closely associated with depression. Our study demonstrates that this attention-guided unified deep learning framework may be an effective tool for MDD diagnosis; the identified covariance connectivities and structural features may serve as biomarkers for MDD.
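An individual structural covariance network can be approximated as an inter-regional correlation matrix over morphological features. This generic numpy sketch illustrates the idea on random data; the paper's exact construction is not reproduced:

```python
import numpy as np

def covariance_network(region_features):
    """Region-by-region correlation matrix.

    region_features: (R, F) array, one row of F morphological
    features (e.g. thickness, volume) per brain region.
    """
    return np.corrcoef(region_features)

rng = np.random.default_rng(0)
net = covariance_network(rng.normal(size=(5, 20)))
print(net.shape)  # -> (5, 5)
```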
|
28
|
Leaf-Counting in Monocot Plants Using Deep Regression Models. Sensors (Basel) 2023; 23:1890. [PMID: 36850487 PMCID: PMC9962473 DOI: 10.3390/s23041890] [Received: 01/05/2023] [Revised: 01/28/2023] [Accepted: 02/02/2023] [Indexed: 06/18/2023]
Abstract
Leaf numbers are vital in estimating crop yield, but traditional manual leaf-counting is tedious, costly, and laborious. Recent convolutional neural network-based approaches achieve promising results for rosette plants; however, effective solutions for leaf counting in monocot plants, such as sorghum and maize, are lacking. Existing approaches often require substantial training datasets and annotations, incurring significant labeling overhead, and can easily fail when leaf structures are occluded in images. To address these issues, we present a new deep neural network-based method that requires no explicit labeling of leaf structures and achieves superior performance even with severe leaf occlusions. Our method extracts leaf skeletons to gain more topological information and applies augmentation to enhance structural variety in the original images. We then feed the combination of original images, derived skeletons, and augmentations into a regression model, transferred from Inception-ResNet-V2, for leaf-counting. Using an input modification method and Grad-CAM, we find that leaf tips are important to our regression model. The superiority of the proposed method is validated by comparison with existing approaches on a similar dataset: our method not only improves leaf-counting accuracy under overlaps and occlusions but also lowers the training cost, requiring fewer annotations than previous state-of-the-art approaches. The robustness of the proposed method against noise is also verified by removing environmental noise during image preprocessing and reducing the noise introduced by skeletonization, with satisfactory outcomes.
|
29
|
DTLCx: An Improved ResNet Architecture to Classify Normal and Conventional Pneumonia Cases from COVID-19 Instances with Grad-CAM-Based Superimposed Visualization Utilizing Chest X-ray Images. Diagnostics (Basel) 2023; 13:551. [PMID: 36766662 PMCID: PMC9914155 DOI: 10.3390/diagnostics13030551] [Received: 11/08/2022] [Revised: 01/04/2023] [Accepted: 01/31/2023] [Indexed: 02/05/2023]
Abstract
COVID-19 is a severe, contagious respiratory disease that has spread all over the world, with terrible impacts on public health, daily life and the global economy. Although some developed countries have advanced well in detecting and managing the coronavirus, most developing countries struggle to detect COVID-19 cases across the mass population, and many face a scarcity of COVID-19 testing kits and other resources as infection rates rise. This deficit of testing resources and the increasing number of daily cases motivated us to develop a deep learning model to aid clinicians and radiologists and provide timely assistance to patients. In this article, an efficient deep learning-based model for detecting COVID-19 cases from chest X-ray images is proposed and investigated. The proposed model is based on the ResNet50V2 architecture, concatenated with six extra layers to make it more robust and efficient. Finally, Grad-CAM-based discriminative localization is used to readily interpret the detections on the radiological images. Two datasets with class labels normal, confirmed COVID-19, bacterial pneumonia and viral pneumonia were gathered from publicly available sources. Our proposed model obtained an overall accuracy of 99.51% for the four-class case (COVID-19/normal/bacterial pneumonia/viral pneumonia) on Dataset-2, 96.52% for the three-class case (normal/COVID-19/bacterial pneumonia), and 99.13% for the two-class case (COVID-19/normal) on Dataset-1. This level of accuracy may motivate radiologists to rapidly detect and diagnose COVID-19 cases.
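The "superimposed visualization" step blends an upsampled Grad-CAM map with the radiograph. A minimal nearest-neighbour upsample and alpha blend (illustrative, not the paper's code):

```python
import numpy as np

def superimpose(image, cam, alpha=0.4):
    """Blend a coarse (h, w) CAM onto a grayscale (H, W) image in [0, 1].

    Assumes H, W are integer multiples of h, w (nearest-neighbour upsample).
    """
    H, W = image.shape
    h, w = cam.shape
    upsampled = np.kron(cam, np.ones((H // h, W // w)))  # repeat each CAM cell
    return (1 - alpha) * image + alpha * upsampled

img = np.zeros((4, 4))
cam = np.array([[0.0, 1.0], [0.0, 0.0]])
overlay = superimpose(img, cam)
print(overlay[0, 2])  # -> 0.4
```

In practice the overlay would use bilinear upsampling and a colour map; the structure of the computation is the same.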
|
30
|
An Explainable Deep Learning Model to Prediction Dental Caries Using Panoramic Radiograph Images. Diagnostics (Basel) 2023; 13:226. [PMID: 36673036 PMCID: PMC9858273 DOI: 10.3390/diagnostics13020226] [Received: 12/15/2022] [Revised: 12/30/2022] [Accepted: 01/04/2023] [Indexed: 01/10/2023]
Abstract
Dental caries is the most frequent dental health issue in the general population. Dental caries can result in extreme pain or infections, lowering people's quality of life. Applying machine learning models to automatically identify dental caries can lead to earlier treatment. However, physicians frequently find the model results unsatisfactory due to a lack of explainability. Our study attempts to address this issue with an explainable deep learning model for detecting dental caries. We tested three prominent pre-trained models, EfficientNet-B0, DenseNet-121, and ResNet-50, to determine which is best for the caries detection task. These models take panoramic images as the input, producing a caries-non-caries classification result and a heat map, which visualizes areas of interest on the tooth. The model performance was evaluated using whole panoramic images of 562 subjects. All three models produced remarkably similar results. However, the ResNet-50 model exhibited a slightly better performance when compared to EfficientNet-B0 and DenseNet-121. This model obtained an accuracy of 92.00%, a sensitivity of 87.33%, and an F1-score of 91.61%. Visual inspection showed us that the heat maps were also located in the areas with caries. The proposed explainable deep learning model diagnosed dental caries with high accuracy and reliability. The heat maps help to explain the classification results by indicating a region of suspected caries on the teeth. Dentists could use these heat maps to validate the classification results and reduce misclassification.
Collapse
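The heat maps described in the entry above are produced by Grad-CAM-style class activation mapping. As an illustration only (not the authors' code), the core computation can be sketched in a few lines of NumPy, assuming the activations and gradients of the last convolutional layer have already been extracted from the network:

```python
import numpy as np

def grad_cam(activations, gradients):
    """Grad-CAM (Selvaraju et al., 2017): weight each feature map by the
    spatial mean of its gradient, sum the weighted maps, apply ReLU, and
    normalize. Both inputs have shape (K, H, W): K feature maps from the
    last conv layer and the gradients of the class score w.r.t. them."""
    weights = gradients.mean(axis=(1, 2))             # alpha_k, shape (K,)
    cam = np.tensordot(weights, activations, axes=1)  # sum_k alpha_k * A_k
    cam = np.maximum(cam, 0.0)                        # ReLU: keep positive evidence
    if cam.max() > 0:
        cam /= cam.max()                              # normalize to [0, 1]
    return cam
```

Upsampled to the input resolution and overlaid on the radiograph, this map is the kind of heat map a dentist would inspect to verify the classification.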
|
31
|
An Explainable AI-Enabled Framework for Interpreting Pulmonary Diseases from Chest Radiographs. Cancers (Basel) 2023; 15:cancers15010314. [PMID: 36612309 PMCID: PMC9818469 DOI: 10.3390/cancers15010314] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/29/2022] [Revised: 12/22/2022] [Accepted: 12/23/2022] [Indexed: 01/05/2023] Open
Abstract
Explainable Artificial Intelligence is a key component of artificially intelligent systems that aim to explain their classification results, and such explanations are essential for automatic disease diagnosis in healthcare. The human respiratory system is severely affected by various pulmonary diseases, which automatic classification and explanation can help detect. In this paper, we introduce a CNN-based transfer-learning approach for automatically classifying and explaining pulmonary diseases, i.e., edema, tuberculosis, nodules, and pneumonia, from chest radiographs. Among these diseases, the pneumonia caused by COVID-19 is deadly; therefore, COVID-19 radiographs are used for the explanation task. We used the ResNet50 neural network and trained it extensively on the COVID-CT and COVIDNet datasets. The interpretable model LIME is used to explain the classification results: LIME highlights the features of the input image that are important for generating the classification result. We evaluated the explanations against images highlighted by radiologists and found that our model highlights the same regions. Our fine-tuned model achieved improved classification accuracies of 93% and 97% on the two datasets, respectively. The analysis indicates that this research not only improves classification results but also explains pulmonary diseases with advanced deep learning methods. It would assist radiologists with automatic disease detection and explanations, which are used to make clinical decisions and to diagnose and treat pulmonary diseases at an early stage.
Collapse
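The region highlighting that LIME provides in the entry above works by perturbing interpretable components of the input and fitting a local linear surrogate to the black-box predictions; the surrogate's coefficients score each component's importance. A toy NumPy sketch under simplified assumptions (vertical strips instead of superpixels, unweighted least squares instead of proximity-weighted regression; not the study's implementation):

```python
import numpy as np

rng = np.random.default_rng(0)

def lime_patch_importance(image, predict, n_patches, n_samples=200):
    """Toy LIME-style explanation: randomly switch patches on/off, query
    the black-box `predict` on each perturbed image, and fit a linear
    surrogate whose coefficients score patch importance. The image is
    split into `n_patches` equal vertical strips for simplicity."""
    masks = rng.integers(0, 2, size=(n_samples, n_patches))
    strips = np.array_split(image, n_patches, axis=1)
    ys = []
    for m in masks:
        # zero out the strips the mask turns off, keep the rest
        perturbed = np.concatenate(
            [s if keep else np.zeros_like(s) for s, keep in zip(strips, m)],
            axis=1)
        ys.append(predict(perturbed))
    # linear surrogate: y ~ masks @ w + b, solved by least squares
    X = np.column_stack([masks, np.ones(n_samples)])
    w, *_ = np.linalg.lstsq(X, np.array(ys), rcond=None)
    return w[:-1]  # one importance score per patch (intercept dropped)
```

Real LIME additionally weights perturbed samples by their proximity to the original input and segments the image into superpixels, but the coefficients-as-importance idea is the same.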
|
32
|
Utilizing Deep Learning Models and Transfer Learning for COVID-19 Detection from X-Ray Images. SN COMPUTER SCIENCE 2023; 4:326. [PMID: 37089895 PMCID: PMC10105354 DOI: 10.1007/s42979-022-01655-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 09/06/2022] [Accepted: 12/28/2022] [Indexed: 04/25/2023]
Abstract
COVID-19 has been a global pandemic. Flattening the curve requires intensive testing, and the world has faced a shortage of testing equipment and medical personnel with expertise, so there is a need to automate and aid the detection process. Several diagnostic tools are currently used for COVID-19, including X-rays and CT scans. This study focuses on detecting COVID-19 from X-rays. We pursue two types of problems: binary classification (COVID-19 and No COVID-19) and multi-class classification (COVID-19, No COVID-19, and Pneumonia). We examine and evaluate several classic models, namely VGG19, ResNet50, MobileNetV2, InceptionV3, Xception, and DenseNet121, as well as specialized models such as DarkCOVIDNet and COVID-Net, and find that ResNet50 models perform best. We also propose a simple modification to the ResNet50 model, which gives a binary classification accuracy of 99.20% and a multi-class classification accuracy of 86.13%, confirming ResNet50's ability to detect COVID-19 and to differentiate it from pneumonia. The proposed model's decisions were interpreted via LIME, which provides contours, and Grad-CAM, which provides heat maps over the classifier's areas of interest, i.e., COVID-19-concentrated regions in the lungs; we find that LIME explains the results better. These explanations support our model's ability to generalize. The proposed model is intended to be deployed for free use.
Collapse
|
33
|
Explainable AI in Scene Understanding for Autonomous Vehicles in Unstructured Traffic Environments on Indian Roads Using the Inception U-Net Model with Grad-CAM Visualization. SENSORS (BASEL, SWITZERLAND) 2022; 22:9677. [PMID: 36560047 PMCID: PMC9785663 DOI: 10.3390/s22249677] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 10/10/2022] [Revised: 11/20/2022] [Accepted: 12/08/2022] [Indexed: 06/17/2023]
Abstract
The intelligent transportation system, especially autonomous vehicles, has attracted much research interest owing to advances in modern artificial intelligence (AI) techniques, especially deep learning. As road accidents have increased over the last few decades, major industries are moving to design and develop autonomous vehicles. Understanding the surrounding environment, and thereby the behavior of nearby vehicles, is essential for the safe navigation of autonomous vehicles in crowded traffic. Several autonomous-vehicle datasets exist, but they focus only on structured driving environments; developing an intelligent vehicle for real-world traffic, which is unstructured by nature, requires a dataset focused on unstructured traffic environments. The Indian Driving Lite dataset (IDD-Lite), focused on an unstructured driving environment, was released as an online competition at NCPPRIPG 2019. This study proposes an explainable inception-based U-Net model with Grad-CAM visualization for semantic segmentation, combining an inception-based encoder for automatic feature extraction with a decoder that reconstructs the segmentation feature map. Because the black-box nature of deep neural networks hinders consumer trust, Grad-CAM is used to interpret the inception U-Net model and increase that trust. The proposed inception U-Net with Grad-CAM achieves 0.622 intersection over union (IoU) on the Indian Driving Dataset (IDD-Lite), outperforming state-of-the-art (SOTA) deep-neural-network-based segmentation models.
Collapse
|
34
|
Strategies for Enhancing the Multi-Stage Classification Performances of HER2 Breast Cancer from Hematoxylin and Eosin Images. Diagnostics (Basel) 2022; 12:diagnostics12112825. [PMID: 36428885 PMCID: PMC9689487 DOI: 10.3390/diagnostics12112825] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/23/2022] [Revised: 11/10/2022] [Accepted: 11/11/2022] [Indexed: 11/18/2022] Open
Abstract
Breast cancer is a significant health concern among women; prompt diagnosis can diminish the mortality rate and direct patients toward cancer treatment. Recently, deep learning has been employed to diagnose breast cancer in digital pathology. To help in this area, a transfer-learning-based model called 'HE-HER2Net' is proposed to diagnose multiple stages of HER2 breast cancer (HER2-0, HER2-1+, HER2-2+, HER2-3+) on H&E (hematoxylin & eosin) images from the BCI dataset. HE-HER2Net is a modified version of the Xception model, augmented with global average pooling, several batch normalization layers, dropout layers, and dense layers with a swish activation function. The proposed model substantially exceeds all existing models in accuracy (0.87), precision (0.88), recall (0.86), and AUC score (0.98). In addition, the model is explained through a class-discriminative localization technique, Grad-CAM, to build trust and make it more transparent. Finally, nuclei segmentation is performed with the StarDist method.
Collapse
|
35
|
Explanatory classification of CXR images into COVID-19, Pneumonia and Tuberculosis using deep learning and XAI. Comput Biol Med 2022; 150:106156. [PMID: 36228463 PMCID: PMC9549800 DOI: 10.1016/j.compbiomed.2022.106156] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/02/2022] [Revised: 09/05/2022] [Accepted: 09/24/2022] [Indexed: 11/18/2022]
Abstract
Chest X-ray (CXR) images are considered useful for monitoring and investigating a variety of pulmonary disorders such as COVID-19, Pneumonia, and Tuberculosis (TB). With recent technological advancements, such diseases may now be recognized more precisely using computer-assisted diagnostics. This study proposes a deep learning (DL) model that predicts four different categories without compromising classification accuracy or feature extraction. The proposed model is validated on publicly available datasets of 7132 chest X-ray (CXR) images. Furthermore, the results are interpreted and explained using Gradient-weighted Class Activation Mapping (Grad-CAM), Local Interpretable Model-agnostic Explanations (LIME), and SHapley Additive exPlanations (SHAP) for better understandability. Initially, convolutional features are extracted to collect high-level object-based information. Next, Shapley values from SHAP, predictability results from LIME, and heat maps from Grad-CAM are used to explore the black-box behaviour of the DL model, which achieves an average test accuracy of 94.31 ± 1.01% and validation accuracy of 94.54 ± 1.33% under 10-fold cross-validation. Finally, to validate the model and qualify medical risk, medical opinions on the classifications are used to consolidate the explanations generated by the eXplainable Artificial Intelligence (XAI) framework. The results suggest that XAI and DL models give clinicians/medical professionals persuasive and coherent conclusions for the detection and categorization of COVID-19, Pneumonia, and TB.
Collapse
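The Shapley values that SHAP provides in the entry above average each feature's marginal contribution over all subsets of the remaining features. For a handful of features the exact computation is feasible; a minimal sketch (illustrative only, with "absent" features replaced by baseline values, not the study's DL pipeline):

```python
from itertools import combinations
from math import factorial
import numpy as np

def shapley_values(model, x, baseline):
    """Exact Shapley values for a model with few features: for each
    feature i, average its marginal contribution model(S u {i}) - model(S)
    over all subsets S of the other features, with the classic
    |S|!(n-|S|-1)!/n! weights. Absent features take baseline values."""
    n = len(x)
    phi = np.zeros(n)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            for S in combinations(others, size):
                weight = factorial(size) * factorial(n - size - 1) / factorial(n)
                z_with = baseline.copy()
                z_without = baseline.copy()
                for j in S:                 # features present in S
                    z_with[j] = x[j]
                    z_without[j] = x[j]
                z_with[i] = x[i]            # add feature i
                phi[i] += weight * (model(z_with) - model(z_without))
    return phi
```

The values satisfy the efficiency property SHAP relies on: they sum to model(x) − model(baseline). SHAP's KernelSHAP and DeepSHAP approximate this same quantity for models with many features.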
|
36
|
DeepClassPathway: Molecular pathway aware classification using explainable deep learning. Eur J Cancer 2022; 176:41-49. [PMID: 36191385 DOI: 10.1016/j.ejca.2022.08.033] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/07/2022] [Accepted: 08/25/2022] [Indexed: 12/15/2022]
Abstract
OBJECTIVE HPV-associated head and neck cancer is correlated with favorable prognosis; however, its underlying biology is not fully understood. We propose an explainable convolutional neural network (CNN) classifier, DeepClassPathway, that predicts HPV-status and allows patient-specific identification of molecular pathways driving classifier decisions. METHODS The CNN was trained to classify HPV-status on transcriptome data from 264 (13% HPV-positive) and tested on 85 (25% HPV-positive) head and neck squamous carcinoma patients after transformation into 2D-treemaps representing molecular pathways. Grad-CAM saliency was used to quantify pathways contribution to individual CNN decisions. Model stability was assessed by shuffling pathways within 2D-images. RESULTS The classification performance of the CNN-ensembles achieved ROC-AUC/PR-AUC of 0.96/0.90 for all treemap variants. Quantification of the averaged pathway saliency heatmaps consistently identified KRAS, spermatogenesis, bile acid metabolism, and inflammation signaling pathways as the four most informative for classifying HPV-positive patients and MYC targets, epithelial-mesenchymal transition, and protein secretion pathways for HPV-negative patients. CONCLUSION We have developed and applied an explainable CNN classification approach to transcriptome data from an oncology cohort with typical sample size that allows classification while accounting for the importance of molecular pathways in individual-level decisions.
Collapse
|
37
|
Computer-aided diagnostic for classifying chest X-ray images using deep ensemble learning. BMC Med Imaging 2022; 22:178. [PMID: 36243705 PMCID: PMC9568999 DOI: 10.1186/s12880-022-00904-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/10/2022] [Accepted: 09/05/2022] [Indexed: 11/16/2022] Open
Abstract
Background Nowadays doctors and radiologists are overwhelmed with a huge amount of work. This has led to efforts to design Computer-Aided Diagnosis (CAD) systems aimed at faster and more accurate diagnosis. The current development of deep learning is a big opportunity for new CADs. In this paper, we propose a novel convolutional neural network (CNN) ensemble architecture for classifying chest X-ray (CXR) images into four classes: viral Pneumonia, Tuberculosis, COVID-19, and Healthy. Although computed tomography (CT) is the best way to detect and diagnose pulmonary issues, CT is more expensive than CXR. Furthermore, CXR is commonly the first step in diagnosis, so accuracy in the early stages of diagnosis and treatment is very important. Results We applied transfer learning and data augmentation to all CNNs to obtain better performance, and designed and evaluated two different CNN ensembles: Stacking and Voting. The system is ready to be applied in a CAD pipeline for automated diagnosis, e.g., as a second or preliminary opinion before the physician's or radiologist's. Our results show a great improvement: 99% accuracy for the Stacking ensemble and 98% for the Voting ensemble. Conclusions To minimize misclassifications, we included six different base CNN models in our architecture (VGG16, VGG19, InceptionV3, ResNet101V2, DenseNet121, and CheXnet); the architecture can be extended to any number of base models, and we expect to extend the number of diseases detected. The proposed method has been validated using a large dataset created by mixing several public datasets with different image sizes and quality. The evaluation shows better results and generalization compared with previous works.
In addition, we make a first approach to explainable deep learning, with the objective of providing professionals with more information that may be valuable when evaluating CXRs.
Collapse
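Of the two ensembles evaluated in the entry above, Voting combines the base CNNs by averaging their class-probability vectors, while Stacking trains a meta-classifier on those vectors. A minimal soft-voting sketch (an illustration, not the paper's implementation):

```python
import numpy as np

def soft_vote(prob_list):
    """Soft-voting ensemble: average the per-class probabilities of the
    base models, then take the argmax as the ensemble prediction.
    `prob_list` has shape (n_models, n_samples, n_classes).
    (Stacking would instead feed the concatenated base probabilities to
    a trained meta-classifier rather than averaging them.)"""
    avg = np.mean(prob_list, axis=0)       # (n_samples, n_classes)
    return avg.argmax(axis=-1), avg
```

Averaging probabilities (soft voting) generally outperforms majority voting on hard labels because it lets confident base models outweigh uncertain ones.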
|
38
|
Detection and Visualisation of Pneumoconiosis Using an Ensemble of Multi-Dimensional Deep Features Learned from Chest X-rays. INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH 2022; 19:11193. [PMID: 36141457 PMCID: PMC9517617 DOI: 10.3390/ijerph191811193] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 08/11/2022] [Revised: 08/25/2022] [Accepted: 08/27/2022] [Indexed: 06/16/2023]
Abstract
Pneumoconiosis is a group of occupational lung diseases induced by mineral dust inhalation and subsequent lung tissue reactions. It can eventually cause irreparable lung damage, as well as gradual and permanent physical impairment. It has affected millions of workers in hazardous industries throughout the world and is a leading cause of occupational death. Early pneumoconiosis is difficult to diagnose because of the low sensitivity of chest radiographs, the wide variation in interpretation between and among readers, and the scarcity of B-readers. In recent years, deep machine learning algorithms have been extremely successful at classifying and localising abnormalities in medical images. In this study, we propose an ensemble learning approach to improve pneumoconiosis detection in chest X-rays (CXRs) using nine machine learning classifiers and multi-dimensional deep features extracted with the CheXNet-121 architecture. Eight evaluation metrics were computed for each high-level feature set on the associated cross-validation datasets in order to compare the ensemble performance with state-of-the-art techniques that used the same cross-validation datasets. The integrated ensemble exhibits promising results (92.68% accuracy, 85.66% Matthews correlation coefficient (MCC), and 0.9302 area under the precision-recall (PR) curve) compared with the individual CheXNet-121 and other state-of-the-art techniques. Finally, Grad-CAM was used to visualise the learned behaviour of individual dense blocks within CheXNet-121, and of their ensembles, in the three colour channels of the CXRs. We compared the Grad-CAM-indicated ROI to the ground-truth ROI using intersection over union (IoU) and average precision (AP) values for each classifier and their ensemble.
With the Grad-CAM visualisation in the blue channel, the average IoU passed 90% for pneumoconiosis detection in chest radiographs.
Collapse
|
39
|
Evaluating and Visualizing the Contribution of ECG Characteristic Waveforms for PPG-Based Blood Pressure Estimation. MICROMACHINES 2022; 13:1438. [PMID: 36144060 PMCID: PMC9502729 DOI: 10.3390/mi13091438] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 08/14/2022] [Revised: 08/27/2022] [Accepted: 08/27/2022] [Indexed: 06/16/2023]
Abstract
Non-invasive continuous blood pressure monitoring is of great significance for preventing, diagnosing, and treating cardiovascular diseases (CVDs). Studies have demonstrated that photoplethysmogram (PPG) and electrocardiogram (ECG) signals can effectively and continuously predict blood pressure (BP). However, most BP estimation models focus on the waveform features of the PPG signal, while the R-wave peak of the ECG is used only as a time reference, and few studies have investigated the ECG waveforms themselves. This paper evaluates the influence of three characteristic ECG waveforms on the improvement of BP estimation. PPG is the primary signal, and five input combinations are formed by adding the full ECG, the P wave, the QRS complex, the T wave, or nothing. We employ five common convolutional neural networks (CNNs) to validate the consistency of the contribution. Meanwhile, with Gradient-weighted class activation mapping (Grad-CAM), we generate heat maps and visualize the distribution of the CNNs' attention over each waveform of the PPG and ECG. The heat maps show that the networks pay more attention to the QRS complex and T wave. In the comparison results, the QRS complex and T wave contribute more to minimizing errors than the P wave. By separately adding the P wave, QRS complex, and T wave, the average MAE of these networks reaches 7.87 mmHg, 6.57 mmHg, and 6.21 mmHg for systolic blood pressure (SBP), and 4.27 mmHg, 3.65 mmHg, and 3.73 mmHg, respectively, for diastolic blood pressure (DBP). The results show that, in continuous BP estimation, the QRS complex and T wave deserve more attention and feature extraction, much like the PPG waveform features.
Collapse
|
40
|
Comparison of Eye and Face Features on Drowsiness Analysis. SENSORS (BASEL, SWITZERLAND) 2022; 22:6529. [PMID: 36080988 PMCID: PMC9460799 DOI: 10.3390/s22176529] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 07/28/2022] [Revised: 08/22/2022] [Accepted: 08/28/2022] [Indexed: 06/15/2023]
Abstract
Drowsiness is one of the leading causes of traffic accidents. For those who operate large machinery or motor vehicles, incidents due to lack of sleep can cause property damage and sometimes lead to grave injuries and fatalities. This study aims to design learning models that recognize drowsiness from human facial features. In addition, this work analyzes the attention of individual neurons in the learning model to understand how neural networks interpret drowsiness; for this analysis, gradient-weighted class activation mapping (Grad-CAM) is implemented in the neural networks to display the attention of neurons. The eye and face images are fed to the model separately for training. Initial results show that better performance can be obtained from eye images alone, and the Grad-CAM attention maps are also more reasonable when using eye images alone. Furthermore, this work proposes a feature analysis method, K-nearest neighbors Sigma (KNN-Sigma), to estimate the homogeneous concentration and heterogeneous separation of the extracted features. In the end, we found that the fusion of face and eye signals gave the best results for recognition accuracy and KNN-Sigma. The areas under the curve (AUC) using face, eye, and fusion images are 0.814, 0.897, and 0.935, respectively.
Collapse
|
41
|
SSPNet: An interpretable 3D-CNN for classification of schizophrenia using phase maps of resting-state complex-valued fMRI data. Med Image Anal 2022; 79:102430. [PMID: 35397470 DOI: 10.1016/j.media.2022.102430] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/31/2020] [Revised: 03/16/2022] [Accepted: 03/18/2022] [Indexed: 01/05/2023]
Abstract
Convolutional neural networks (CNNs) have shown promising results in classifying individuals with mental disorders such as schizophrenia using resting-state fMRI data. However, complex-valued fMRI data is rarely used since additional phase data introduces high-level noise though it is potentially useful information for the context of classification. As such, we propose to use spatial source phase (SSP) maps derived from complex-valued fMRI data as the CNN input. The SSP maps are not only less noisy, but also more sensitive to spatial activation changes caused by mental disorders than magnitude maps. We build a 3D-CNN framework with two convolutional layers (named SSPNet) to fully explore the 3D structure and voxel-level relationships from the SSP maps. Two interpretability modules, consisting of saliency map generation and gradient-weighted class activation mapping (Grad-CAM), are incorporated into the well-trained SSPNet to provide additional information helpful for understanding the output. Experimental results from classifying schizophrenia patients (SZs) and healthy controls (HCs) show that the proposed SSPNet significantly improved accuracy and AUC compared to CNN using magnitude maps extracted from either magnitude-only (by 23.4 and 23.6% for DMN) or complex-valued fMRI data (by 10.6 and 5.8% for DMN). SSPNet captured more prominent HC-SZ differences in saliency maps, and Grad-CAM localized all contributing brain regions with opposite strengths for HCs and SZs within SSP maps. These results indicate the potential of SSPNet as a sensitive tool that may be useful for the development of brain-based biomarkers of mental disorders.
Collapse
|
42
|
Intracerebral hemorrhage detection on computed tomography images using a residual neural network. Phys Med 2022; 99:113-119. [PMID: 35671679 DOI: 10.1016/j.ejmp.2022.05.015] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 01/08/2022] [Revised: 04/23/2022] [Accepted: 05/26/2022] [Indexed: 01/31/2023] Open
Abstract
Intracerebral hemorrhage (ICH) is a critical medical injury with a high mortality rate, produced by the rupture of a blood vessel inside the skull. ICH can lead to paralysis and even death; therefore, it is considered a clinically dangerous condition that needs to be treated quickly. Thanks to advances in machine learning and the computing power of today's microprocessors, deep learning has become an invaluable tool for detecting diseases, particularly from medical images. In this work, we are interested in differentiating computed tomography (CT) images of healthy brains and ICH using ResNet-18, a deep residual convolutional neural network. In addition, the gradient-weighted class activation mapping (Grad-CAM) technique was employed to visually explore and understand the network's decisions. The generalizability of the detector was assessed through a 100-iteration Monte Carlo cross-validation (80% of the data for training and 20% for testing). On a database of 200 brain CT images (100 with ICH and 100 without), the detector yielded, on average, 95.93% accuracy, 96.20% specificity, 95.65% sensitivity, 96.40% precision, and 95.91% F1-score, with an average computing time of 165.90 s to train the network (on 160 images) and 1.17 s to test it on 40 CT images. These results are comparable with the state of the art, using a simpler approach with a lower computational load. Our detector could assist physicians in their medical decisions, in resource optimization, and in reducing the time and error in ICH diagnosis.
Collapse
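The 100-iteration Monte Carlo cross-validation used in the entry above differs from k-fold in that each iteration draws an independent random 80/20 split, so test sets may overlap across iterations. A sketch of the splitting loop (illustrative, not the authors' code):

```python
import numpy as np

def monte_carlo_cv(n_samples, n_iter=100, test_frac=0.2, seed=0):
    """Monte Carlo cross-validation: repeatedly draw a random
    train/test split (as in the paper's 100-iteration, 80/20 protocol)
    and yield (train_idx, test_idx) pairs. Evaluation metrics are then
    computed per iteration and averaged."""
    rng = np.random.default_rng(seed)
    n_test = int(round(n_samples * test_frac))
    for _ in range(n_iter):
        perm = rng.permutation(n_samples)
        yield perm[n_test:], perm[:n_test]
```

For the 200-image database, each iteration would train on 160 images and test on the remaining 40, matching the split sizes reported in the abstract.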
|
43
|
Deep Residual Convolutional Neural Networks for Brain-Computer Interface to Visualize Neural Processing of Hand Movements in the Human Brain. Front Comput Neurosci 2022; 16:882290. [PMID: 35669388 PMCID: PMC9165810 DOI: 10.3389/fncom.2022.882290] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/23/2022] [Accepted: 04/19/2022] [Indexed: 11/29/2022] Open
Abstract
Concomitant with the development of deep learning, brain-computer interface (BCI) decoding technology has been rapidly evolving. Convolutional neural networks (CNNs), which are generally used as electroencephalography (EEG) classification models, are often deployed in BCI prototypes to improve the estimation accuracy of a participant's brain activity. However, because most BCI models are trained, validated, and tested via within-subject cross-validation, and there is no corresponding generalization model, their applicability to unknown participants is not guaranteed. In this study, to facilitate the generalization of BCI model performance to unknown participants, we trained a model comprising multiple layers of residual CNNs and visualized the reasons for BCI classification to reveal the location and timing of the neural activities that contribute to classification. Specifically, to develop a BCI that can distinguish between rest, left-hand movement, and right-hand movement tasks with high accuracy, we created multilayer CNNs, inserted residual networks into the multilayers, and used a larger dataset than in previous studies. The constructed model was analyzed with gradient-weighted class activation mapping (Grad-CAM). We evaluated the developed model via cross-subject validation and found that it achieved significantly improved accuracy (85.69 ± 1.10%) compared with conventional models or models without residual networks. Grad-CAM analysis of correctly classified cases showed localized activity near the premotor cortex. These results confirm the effectiveness of inserting residual networks into CNNs for tuning BCIs, and suggest that recording EEG signals over the premotor cortex and some other areas contributes to high classification accuracy.
Collapse
|
44
|
Cardiac Segmentation Method Based on Domain Knowledge. ULTRASONIC IMAGING 2022; 44:105-117. [PMID: 35574925 DOI: 10.1177/01617346221099435] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/15/2023]
Abstract
Echocardiography plays an important role in the clinical diagnosis of cardiovascular diseases, and cardiac function assessment by echocardiography is a crucial process in daily cardiology. However, cardiac segmentation in echocardiography is a challenging task due to shadows and speckle noise, and traditional manual segmentation is time-consuming and limited by inter-observer variability. In this paper, we present a fast and accurate automatic echocardiographic segmentation framework based on convolutional neural networks (CNNs). We propose FAUet, a segmentation method that serially integrates U-Net with a coordinate attention mechanism and a domain feature loss from VGG19 pre-trained on the ImageNet dataset. The coordinate attention mechanism can capture long-range dependencies along one spatial direction while preserving precise positional information along the other, and the domain feature loss attends to the topology of cardiac structures by exploiting their higher-level features. In this research, we use two-dimensional echocardiograms (2DE) of 88 patients from two devices, the Philips Epiq 7C and the Mindray Resona 7T, to segment the left ventricle (LV), interventricular septum (IVS), and posterior left ventricular wall (PLVW). We also draw gradient-weighted class activation maps (Grad-CAM) to improve the interpretability of the segmentation results. Compared with the traditional U-Net, the proposed method performs better: the mean Dice Score Coefficient (Dice) of FAUet on LV, IVS, and PLVW reaches 0.932, 0.848, and 0.868, respectively, with an average Dice of 0.883 over the three objects. Statistical analysis showed no significant difference between the segmentation results of the two devices. The proposed method achieves fast and accurate segmentation of 2DE at a low time cost.
Combining the coordinate attention module and the feature loss with the original U-Net framework significantly increases the performance of the algorithm.
Collapse
|
45
|
Dissecting Breeders' Sense via Explainable Machine Learning Approach: Application to Fruit Peelability and Hardness in Citrus. FRONTIERS IN PLANT SCIENCE 2022; 13:832749. [PMID: 35222489 PMCID: PMC8867066 DOI: 10.3389/fpls.2022.832749] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 12/10/2021] [Accepted: 01/17/2022] [Indexed: 06/14/2023]
Abstract
"Genomics-assisted breeding", which utilizes genomics-based methods, e.g., genome-wide association study (GWAS) and genomic selection (GS), has been attracting attention, especially in the field of fruit breeding. Low-cost genotyping technologies that support genomics-assisted breeding have already been established. However, efficient collection of large amounts of high-quality phenotypic data is essential for the success of such breeding. Most fruit quality traits have been evaluated sensorily and visually by professional breeders. However, the fruit morphological features that underlie such sensory and visual judgments are unclear, which makes it difficult to collect phenotypic data on fruit quality traits efficiently using image analysis. In this study, we developed a method to automatically measure the morphological features of citrus fruits by image analysis of their cross-sectional images. We applied explainable machine learning methods and Bayesian networks to determine the relationships between fruit morphological features and two sensorily evaluated fruit quality traits: easiness of peeling (Peeling) and fruit hardness (FruH). In all the methods applied in this study, the degradation area of the central core of the fruit was significantly and directly associated with both Peeling and FruH, while the seed area was significantly and directly related to FruH alone. The degradation area of the albedo and the area of the flavedo were also significantly and directly related to Peeling and FruH, respectively, except in one or two methods. These results suggest that an approach combining explainable machine learning methods, Bayesian networks, and image analysis can be effective in dissecting the experienced sense of a breeder. In breeding programs, collecting fruit images and efficiently measuring and documenting the fruit morphological features related to fruit quality traits may increase the amount of data available for analysis and improve the accuracy of GWAS and GS on the quality traits of citrus fruits.
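The explainable machine learning step described above, relating measured morphological features to a sensory trait, can be illustrated with a generic permutation-importance sketch. Everything below is synthetic: the feature names, the linear model, and the data are illustrative stand-ins, not the study's methods or measurements.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data: rows are fruits, columns are hypothetical
# morphological features; the target mimics a sensory score such as
# fruit hardness, driven mainly by the first two features.
features = ["core_degradation", "seed_area", "flavedo_area", "noise"]
n = 200
X = rng.normal(size=(n, len(features)))
y = 2.0 * X[:, 0] + 1.0 * X[:, 1] + 0.1 * rng.normal(size=n)

# Fit a plain least-squares model as the predictor.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

def mse(X, y):
    return float(np.mean((X @ coef - y) ** 2))

baseline = mse(X, y)

# Permutation importance: shuffle one column at a time and measure
# how much the prediction error grows; a larger increase means the
# model relied more heavily on that feature.
importance = {}
for j, name in enumerate(features):
    Xp = X.copy()
    Xp[:, j] = rng.permutation(Xp[:, j])
    importance[name] = mse(Xp, y) - baseline

ranking = sorted(importance, key=importance.get, reverse=True)
```

On this synthetic data the ranking recovers the construction: the heavily weighted feature dominates, the noise column contributes nothing.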
Collapse
|
46
|
PSCNN: PatchShuffle Convolutional Neural Network for COVID-19 Explainable Diagnosis. Front Public Health 2021; 9:768278. [PMID: 34778194 PMCID: PMC8585997 DOI: 10.3389/fpubh.2021.768278] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/31/2021] [Accepted: 09/29/2021] [Indexed: 12/11/2022] Open
Abstract
Objective: COVID-19 is an infectious disease caused by a new strain of coronavirus. This study aims to develop a more accurate COVID-19 diagnosis system. Methods: First, the n-conv module (nCM) is introduced. Then a 12-layer convolutional neural network (12l-CNN) is built as the backbone network. Afterwards, PatchShuffle is integrated with the 12l-CNN as a regularization term of the loss function; the resulting model is named PSCNN. Moreover, multiple-way data augmentation and Grad-CAM are employed to avoid overfitting and to locate lung lesions. Results: The mean and standard deviation values of the seven measures of our model were 95.28 ± 1.03 (sensitivity), 95.78 ± 0.87 (specificity), 95.76 ± 0.86 (precision), 95.53 ± 0.83 (accuracy), 95.52 ± 0.83 (F1 score), 91.7 ± 1.65 (MCC), and 95.52 ± 0.83 (FMI). Conclusion: Our PSCNN outperforms 10 state-of-the-art models. Further, we validate the optimal hyperparameters in our model and demonstrate the effectiveness of PatchShuffle.
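The core PatchShuffle idea, randomly permuting the pixels inside small non-overlapping patches, can be sketched in a few lines of NumPy. This is only a sketch of the underlying pixel-permutation transformation; the patch size, shuffle probability, and its use here as a plain image transform (rather than as the paper's loss-function regularization term) are illustrative assumptions.

```python
import numpy as np

def patch_shuffle(img, patch=2, prob=0.5, rng=None):
    """Permute the pixels inside each non-overlapping patch x patch
    block with probability `prob`. Local pixel statistics per block
    are preserved; only their arrangement is randomized."""
    if rng is None:
        rng = np.random.default_rng()
    out = img.copy()
    h, w = img.shape
    for i in range(0, h - patch + 1, patch):
        for j in range(0, w - patch + 1, patch):
            if rng.random() < prob:
                block = out[i:i + patch, j:j + patch].ravel()
                rng.shuffle(block)
                out[i:i + patch, j:j + patch] = block.reshape(patch, patch)
    return out

img = np.arange(16, dtype=float).reshape(4, 4)
shuffled = patch_shuffle(img, patch=2, prob=1.0,
                         rng=np.random.default_rng(0))
```

Because shuffling happens only within blocks, each 2x2 block of the output contains exactly the same pixel values as the corresponding input block.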
Collapse
|
47
|
Fine-Tuning Convolutional Neural Networks for COVID-19 Detection from Chest X-ray Images. Diagnostics (Basel) 2021; 11:1887. [PMID: 34679585 PMCID: PMC8535063 DOI: 10.3390/diagnostics11101887] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/24/2021] [Revised: 09/30/2021] [Accepted: 10/10/2021] [Indexed: 12/24/2022] Open
Abstract
As the COVID-19 pandemic continues to ravage the world, the use of chest X-ray (CXR) images as a complementary screening strategy to reverse transcription-polymerase chain reaction (RT-PCR) testing continues to grow owing to its routine clinical application to respiratory diseases. We performed extensive convolutional neural network (CNN) fine-tuning experiments and identified that models pretrained on larger out-of-domain datasets show improved performance. This suggests that the a priori knowledge gained by models from out-of-domain training also applies to X-ray images. With appropriate hyperparameter selection, we found that higher-resolution images carry more clinical information, and that the use of mixup in training improved the performance of the model. The experiments showed that our proposed transfer learning approach achieves state-of-the-art results. Furthermore, we evaluated the performance of our model with a small amount of downstream training data and found that the model still performed well in COVID-19 identification. We also explored the mechanism of model detection using a gradient-weighted class activation mapping (Grad-CAM) method for CXR imaging to interpret the detection of radiology images. The results helped us understand how the model detects COVID-19, which can be used to discover new visual features and assist radiologists in screening.
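Grad-CAM, as used here to interpret the CXR detections, reduces to a few array operations once a convolutional layer's feature maps and the gradients of the class score with respect to them are available. A minimal NumPy sketch, with random arrays standing in for real activations and gradients:

```python
import numpy as np

def grad_cam(feature_maps, gradients):
    """Minimal Grad-CAM: weight each feature map by its spatially
    averaged gradient (alpha_k), sum the weighted maps, apply ReLU,
    and normalize to [0, 1].
    feature_maps, gradients: arrays of shape (C, H, W)."""
    weights = gradients.mean(axis=(1, 2))              # alpha_k, shape (C,)
    cam = np.tensordot(weights, feature_maps, axes=1)  # shape (H, W)
    cam = np.maximum(cam, 0)                           # ReLU
    if cam.max() > 0:
        cam = cam / cam.max()                          # normalize
    return cam

rng = np.random.default_rng(1)
fmaps = rng.random((8, 7, 7))        # stand-in layer activations
grads = rng.normal(size=(8, 7, 7))   # stand-in class-score gradients
heatmap = grad_cam(fmaps, grads)
```

The resulting low-resolution heatmap is typically upsampled to the input image size and overlaid on the CXR to show which regions drove the prediction.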
Collapse
|
48
|
Interpreting convolutional neural network for real-time volatile organic compounds detection and classification using optical emission spectroscopy of plasma. Anal Chim Acta 2021; 1179:338822. [PMID: 34535253 DOI: 10.1016/j.aca.2021.338822] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/21/2021] [Revised: 06/28/2021] [Accepted: 06/30/2021] [Indexed: 01/02/2023]
Abstract
This study presents the investigation of optical emission spectroscopy of plasma using an interpretable convolutional neural network (CNN) for real-time volatile organic compound (VOC) classification. A microplasma-generation platform was developed to efficiently collect 64 k spectra from various types of VOCs at different concentrations, as training and testing sets for machine learning. A CNN model was trained to classify VOCs with an accuracy of 99.9%. To interpret the CNN model and its predictions, the spectral processing mechanism of the CNN was visualized by feature maps, and the critical spectral features were identified by gradient-weighted class activation mapping. These approaches brought insight into how the CNN analyzes the spectra and made its operation explainable. Finally, the CNN model was incorporated with the microplasma platform to demonstrate the application of real-time VOC monitoring. The type of VOC can be identified and reported via messages within 10 s once the microplasma is ignited. We believe that using a CNN offers a novel route for plasma spectroscopy analysis for VOC classification and impacts the fields of plasma, spectroscopy, and environmental monitoring.
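How a 1D convolutional channel responds to a spectral feature, the basis of the feature-map visualization described above, can be illustrated with a single hand-built filter. The Gaussian emission line and the matched kernel below are toy constructions for illustration, not the paper's trained filters or data.

```python
import numpy as np

def conv1d_feature_map(spectrum, kernel):
    """One channel of a 1D convolutional layer followed by ReLU.
    np.convolve flips its second argument, so passing kernel[::-1]
    yields cross-correlation: the output peaks where the spectrum
    locally matches the kernel's shape."""
    fm = np.convolve(spectrum, kernel[::-1], mode="valid")
    return np.maximum(fm, 0)

# Toy spectrum with a single Gaussian emission line centered at index 50.
x = np.arange(200)
spectrum = np.exp(-0.5 * ((x - 50) / 2.0) ** 2)

# A matched line-shaped kernel (length 9, centered at offset 4):
# the feature map peaks where the window center aligns with the line.
kernel = np.exp(-0.5 * ((np.arange(9) - 4) / 2.0) ** 2)
fm = conv1d_feature_map(spectrum, kernel)
peak = int(np.argmax(fm))  # window start 46 -> window center at 50
```

A trained CNN stacks many such learned channels; visualizing their feature maps shows which emission lines each one responds to.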
Collapse
|
49
|
Deep Learning-Based High-Frequency Ultrasound Skin Image Classification with Multicriteria Model Evaluation. SENSORS 2021; 21:s21175846. [PMID: 34502735 PMCID: PMC8434172 DOI: 10.3390/s21175846] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 07/27/2021] [Revised: 08/22/2021] [Accepted: 08/27/2021] [Indexed: 02/01/2023]
Abstract
This study presents the first application of convolutional neural networks to high-frequency ultrasound skin image classification. This type of imaging opens up new opportunities in dermatology, revealing inflammatory diseases such as atopic dermatitis and psoriasis, as well as skin lesions. We collected a database of 631 images of healthy skin and various skin pathologies to train and assess all stages of the methodology. The proposed framework starts with the segmentation of the epidermal layer using a DeepLab v3+ model with a pre-trained Xception backbone. We employ transfer learning to train the segmentation model for two purposes: to extract the region of interest for classification and to prepare the skin layer map for classification confidence estimation. For classification, we train five models in different input data modes and data augmentation setups. We also introduce a classification confidence level to evaluate the deep model’s reliability. The measure combines our skin layer map with the heatmap produced by the Grad-CAM technique designed to indicate image regions used by the deep model to make a classification decision. Moreover, we propose a multicriteria model evaluation measure to select the optimal model in terms of classification accuracy, confidence, and test dataset size. The experiments described in the paper show that the DenseNet-201 model fed with the extracted region of interest produces the most reliable and accurate results.
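The confidence idea described above, combining a segmented skin layer map with a Grad-CAM heatmap, can be sketched as the fraction of total heatmap activation that falls inside the segmented region. The tiny arrays below are illustrative toys, not the paper's data or its exact formula.

```python
import numpy as np

def classification_confidence(heatmap, roi_mask):
    """Fraction of Grad-CAM activation inside the segmented region.
    1.0 means the model based its decision entirely on the skin layer;
    values near 0 mean it attended to irrelevant regions."""
    total = heatmap.sum()
    if total == 0:
        return 0.0
    return float(heatmap[roi_mask].sum() / total)

heatmap = np.zeros((4, 4))
heatmap[1, 1] = 3.0   # activation inside the segmented skin layer
heatmap[3, 3] = 1.0   # activation outside it
roi = np.zeros((4, 4), dtype=bool)
roi[:2, :2] = True    # toy skin-layer mask
conf = classification_confidence(heatmap, roi)  # 3.0 / 4.0 = 0.75
```

A low score flags predictions that may be accurate by chance but untrustworthy, which is the reliability signal the multicriteria evaluation builds on.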
Collapse
|
50
|
Classification of Cardiomyopathies from MR Cine Images Using Convolutional Neural Network with Transfer Learning. Diagnostics (Basel) 2021; 11:diagnostics11091554. [PMID: 34573896 PMCID: PMC8470356 DOI: 10.3390/diagnostics11091554] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/14/2021] [Revised: 08/20/2021] [Accepted: 08/24/2021] [Indexed: 01/14/2023] Open
Abstract
The automatic classification of various types of cardiomyopathies is desirable but has never been performed using a convolutional neural network (CNN). The purpose of this study was to evaluate currently available CNN models to classify cine magnetic resonance (cine-MR) images of cardiomyopathies. Method: Diastolic and systolic frames of 1200 cine-MR sequences of three categories of subjects (395 normal, 411 hypertrophic cardiomyopathy, and 394 dilated cardiomyopathy) were selected, preprocessed, and labeled. Pretrained, fine-tuned deep learning models (VGG) were used for image classification (sixfold cross-validation and double split testing with hold-out data). The gradient-weighted class activation mapping (Grad-CAM) algorithm was applied to reveal the salient pixel areas leading to the classification. Results: The diastolic–systolic dual-input concatenated VGG model cross-validation accuracy was 0.982 ± 0.009. Summed confusion matrices showed that, for the 1200 inputs, the VGG model led to 22 errors. The classification of a 227-input validation group, carried out by an experienced radiologist and cardiologist, led to a similar number of discrepancies. The image preparation process led to a 5% accuracy improvement compared to nonprepared images. Grad-CAM heat activation maps showed that most misclassifications occurred when extracardiac locations caught the attention of the network. Conclusions: CNNs are very well suited to the classification of cardiomyopathies and are 98% accurate, regardless of the imaging plane, when both diastolic and systolic frames are incorporated. Misclassification is in the same range as inter-observer discrepancies among experienced human readers.
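The sixfold cross-validation used for model evaluation above can be sketched generically: shuffle the 1200 sequence indices, cut them into six folds, and rotate each fold through the validation role. This is a plain k-fold split for illustration, not the authors' code, and the seed is arbitrary.

```python
import numpy as np

def kfold_indices(n, k=6, seed=0):
    """Yield (train, val) index arrays for k-fold cross-validation:
    each of the k folds serves as the validation set exactly once."""
    idx = np.random.default_rng(seed).permutation(n)
    folds = np.array_split(idx, k)
    for i in range(k):
        val = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train, val

# Sixfold split over 1200 sequences: 1000 train / 200 validation per fold.
splits = list(kfold_indices(1200, k=6))
```

Reporting the mean and spread of the per-fold accuracies (here, 0.982 ± 0.009) is what makes the cross-validated estimate robust to one lucky or unlucky split.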
Collapse
|