1. Liu X, Li W, Miao S, Liu F, Han K, Bezabih TT. HAMMF: Hierarchical attention-based multi-task and multi-modal fusion model for computer-aided diagnosis of Alzheimer's disease. Comput Biol Med 2024;176:108564. PMID: 38744010. DOI: 10.1016/j.compbiomed.2024.108564.
Abstract
Alzheimer's disease (AD) is a progressive neurodegenerative condition, and early intervention can help slow its progression. However, integrating multi-dimensional information with deep convolutional networks increases the number of model parameters, affecting diagnostic accuracy and efficiency and hindering clinical deployment. Multi-modal neuroimaging can offer more precise diagnostic results, while multi-task modeling of classification and regression tasks can enhance the performance and stability of AD diagnosis. This study proposes a Hierarchical Attention-based Multi-task Multi-modal Fusion model (HAMMF) that leverages multi-modal neuroimaging data to concurrently learn an AD classification task, a cognitive-score regression task, and an age regression task using attention-based techniques. First, we preprocess MRI and PET image data to obtain two modalities, each containing distinct information. Next, we incorporate a novel Contextual Hierarchical Attention Module (CHAM) to aggregate multi-modal features. This module employs channel and spatial attention to extract fine-grained pathological features from unimodal image data across various dimensions. Using these attention mechanisms, the Transformer can effectively capture correlated features of multi-modal inputs. Lastly, we adopt multi-task learning to investigate the influence of different variables on diagnosis, with a primary classification task and secondary regression tasks for optimal multi-task prediction performance. Our experiments utilized MRI and PET images from 720 subjects in the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset. The results show that our proposed model achieves an overall accuracy of 93.15% for AD/NC recognition, and the visualization results demonstrate its strong pathological feature recognition performance.
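The multi-task objective described in this abstract, a primary classification loss combined with auxiliary cognitive-score and age regression losses, is commonly implemented as a weighted sum. A minimal sketch; the weighted-sum form, the weights, and all numbers are illustrative assumptions, not the paper's actual loss:

```python
import math

def cross_entropy(probs, label):
    # Negative log-likelihood of the true class
    return -math.log(probs[label])

def mse(pred, target):
    return (pred - target) ** 2

def multi_task_loss(cls_probs, cls_label, score_pred, score_true,
                    age_pred, age_true, w_score=0.3, w_age=0.02):
    # Primary classification loss plus down-weighted auxiliary regressions;
    # the weights w_score and w_age are hypothetical tuning knobs.
    return (cross_entropy(cls_probs, cls_label)
            + w_score * mse(score_pred, score_true)
            + w_age * mse(age_pred, age_true))

# Toy example: AD-vs-NC probabilities, a cognitive score, and an age
loss = multi_task_loss([0.8, 0.2], 0, 27.0, 26.0, 72.5, 70.0)
```

In practice the auxiliary weights control how strongly the regression tasks regularize the shared representation relative to the primary classification task.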
Affiliation(s)
- Xiao Liu: School of Computer Engineering and Science, Shanghai University, Shanghai, China
- Weimin Li: School of Computer Engineering and Science, Shanghai University, Shanghai, China
- Shang Miao: School of Computer Engineering and Science, Shanghai University, Shanghai, China
- Fangyu Liu: Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China; University of Chinese Academy of Sciences, Beijing, China; BGI-Shenzhen, Shenzhen, China
- Ke Han: Medical and Health Center, Liaocheng People's Hospital, Liaocheng, China
- Tsigabu T Bezabih: School of Computer Engineering and Science, Shanghai University, Shanghai, China
2. Alam MS, Wang D, Sowmya A. DLA-Net: dual lesion attention network for classification of pneumoconiosis using chest X-ray images. Sci Rep 2024;14:11616. PMID: 38773153. PMCID: PMC11109256. DOI: 10.1038/s41598-024-61024-3.
Abstract
Accurate and early detection of pneumoconiosis using chest X-rays (CXR) is important for preventing the progression of this incurable disease. It is also a challenging task due to large variations in the appearance, size and location of lesions in the lung regions, as well as inter-class similarity and intra-class variance. Compared to traditional methods, Convolutional Neural Network-based methods have shown improved results; however, these methods are still not applicable in clinical practice due to limited performance. In some cases, limited computing resources make it impractical to develop a model using whole CXR images. To address this problem, following state-of-the-art practice, the lung fields are divided into six zones, each zone is classified separately, and the zone classification results are then aggregated into an image-level classification score. In this study, we propose a dual lesion attention network (DLA-Net) for the classification of pneumoconiosis that can extract features from affected regions in a lung. This network consists of two main components: feature extraction and feature refinement. Feature extraction uses the pre-trained Xception model as the backbone to extract semantic information. To emphasise the lesion regions and improve the feature representation capability, the feature refinement component uses a DLA module that consists of two submodules: channel attention (CA) and spatial attention (SA). The CA module focuses on the most important channels in the feature maps extracted by the backbone model, and the SA module highlights the spatial details of the affected regions. Thus, the two attention modules combine to extract discriminative and rich contextual features and improve classification performance on pneumoconiosis. Experimental results show that the proposed DLA-Net outperforms state-of-the-art methods for pneumoconiosis classification.
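The CA/SA pattern described above can be sketched in NumPy. This is a simplified illustration of CBAM-style channel and spatial attention, not the paper's exact DLA module: the learned MLP and convolution layers are replaced by direct pooling and a sigmoid gate for brevity:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat):
    # feat: (C, H, W); squeeze spatial dims, then gate each channel
    avg = feat.mean(axis=(1, 2))            # (C,)
    mx = feat.max(axis=(1, 2))              # (C,)
    weights = sigmoid(avg + mx)             # learned shared MLP omitted
    return feat * weights[:, None, None]

def spatial_attention(feat):
    # feat: (C, H, W); pool across channels, then gate each location
    avg = feat.mean(axis=0)                 # (H, W)
    mx = feat.max(axis=0)                   # (H, W)
    weights = sigmoid(avg + mx)             # learned conv layer omitted
    return feat * weights[None, :, :]

def dla_block(feat):
    # Channel attention followed by spatial attention
    return spatial_attention(channel_attention(feat))

refined = dla_block(np.ones((4, 8, 8)))
```

The two gates multiply the feature map elementwise, so shapes are preserved and the block can be dropped between any two convolutional stages.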
Affiliation(s)
- Md Shariful Alam: School of Computer Science and Engineering, University of New South Wales, Sydney, Australia; CSIRO Data61, Sydney, Australia
- Arcot Sowmya: School of Computer Science and Engineering, University of New South Wales, Sydney, Australia
3. Bakasa W, Viriri S. Stacked ensemble deep learning for pancreas cancer classification using extreme gradient boosting. Front Artif Intell 2023;6:1232640. PMID: 37876961. PMCID: PMC10591225. DOI: 10.3389/frai.2023.1232640.
Abstract
Ensemble learning aims to improve prediction performance by combining several models or forecasts. However, determining which ensemble learning techniques are useful, and to what extent, in deep learning-based pipelines for pancreas computed tomography (CT) image classification remains a challenge. Ensemble approaches, which entail training multiple models and combining their predictions, are among the most advanced solutions for improving on the predictive performance of a single model. This article introduces Stacked Ensemble Deep Learning (SEDL), a pipeline for classifying pancreas CT medical images. The weak learners are Inception V3, VGG16, and ResNet34, combined in a stacking ensemble. Their first-level predictions are concatenated to form the input training set for Extreme Gradient Boosting (XGBoost), the ensemble model at the second level of prediction, which acts as a strong learner and makes the final classification. The Cancer Imaging Archive (TCIA) public-access dataset consists of 80 pancreas CT scans, at a resolution of 512 × 512 pixels, from 53 male and 27 female subjects; a sample of 222 images was used for training and testing. After hyperparameter tuning, SEDL achieved an ensemble accuracy of 98.8%. We conclude that the SEDL technique is an effective way to strengthen the robustness and increase the performance of a pipeline for classifying pancreas CT medical images. Interestingly, grouping similarly strong learners does not make a difference.
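The stacking scheme described above, first-level predictions concatenated into a training set for a second-level learner, can be sketched as follows. A hand-rolled logistic regression stands in for XGBoost, and all data are toy values; this only illustrates the stacking pattern, not SEDL itself:

```python
import numpy as np

def stack_features(*base_preds):
    # Concatenate per-model class probabilities into meta-features
    return np.hstack(base_preds)

def train_meta_learner(X, y, lr=0.5, epochs=500):
    # Logistic regression as the second-level learner (stand-in for XGBoost)
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        grad = p - y                     # gradient of the log loss
        w -= lr * X.T @ grad / len(y)
        b -= lr * grad.mean()
    return w, b

def predict_meta(X, w, b):
    return (1.0 / (1.0 + np.exp(-(X @ w + b))) > 0.5).astype(int)

# Hypothetical first-level probability outputs of two weak learners
m1 = np.array([[0.9, 0.1], [0.2, 0.8], [0.85, 0.15], [0.3, 0.7]])
m2 = np.array([[0.8, 0.2], [0.1, 0.9], [0.7, 0.3], [0.4, 0.6]])
y = np.array([0, 1, 0, 1])

X_meta = stack_features(m1, m2)          # (4 samples, 2 models x 2 classes)
w, b = train_meta_learner(X_meta, y)
pred = predict_meta(X_meta, w, b)
```

In a real pipeline the meta-features would come from out-of-fold predictions to avoid leaking the base learners' training labels into the second level.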
Affiliation(s)
- Serestina Viriri: School of Mathematics, Statistics & Computer Science, College of Agriculture, Engineering and Science, University of KwaZulu-Natal, Durban, South Africa
4. Guo S, Zhang J, Li H, Zhang J, Cheng CK. A multi-branch network to detect post-operative complications following hip arthroplasty on X-ray images. Front Bioeng Biotechnol 2023;11:1239637. PMID: 37840662. PMCID: PMC10569301. DOI: 10.3389/fbioe.2023.1239637.
Abstract
Background: Postoperative complications following total hip arthroplasty (THA) often require revision surgery. X-rays are usually used to detect such complications, but manually identifying the location of the problem and making an accurate assessment can be subjective and time-consuming. Therefore, in this study, we propose a multi-branch network to automatically detect postoperative complications on X-ray images. Methods: We developed a multi-branch network using ResNet as the backbone and two additional branches with a global feature stream and a channel feature stream for extracting features of interest. Additionally, inspired by our domain knowledge, we designed a multi-coefficient class-specific residual attention block to learn the correlations between different complications to improve the performance of the system. Results: Our proposed method achieved state-of-the-art (SOTA) performance in detecting multiple complications, with mean average precision (mAP) and F1 scores of 0.346 and 0.429, respectively. The network also showed excellent performance at identifying aseptic loosening, with recall and precision rates of 0.929 and 0.897, respectively. Ablation experiments were conducted on detecting multiple complications and single complications, as well as internal and external datasets, demonstrating the effectiveness of our proposed modules. Conclusion: Our deep learning method provides an accurate end-to-end solution for detecting postoperative complications following THA.
Affiliation(s)
- Sijia Guo: School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China; Engineering Research Center for Digital Medicine of the Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
- Jiping Zhang: School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China; Engineering Research Center for Digital Medicine of the Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
- Huiwu Li: Department of Orthopaedics, Ninth People's Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Jingwei Zhang: Department of Orthopaedics, Ninth People's Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Cheng-Kung Cheng: School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China; Engineering Research Center for Digital Medicine of the Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
5. Li A, Yang C. AGMG-Net: Leveraging multiscale and fine-grained features for improved cargo recognition. Math Biosci Eng 2023;20:16744-16761. PMID: 37920032. DOI: 10.3934/mbe.2023746.
Abstract
Security systems place great emphasis on the safety of stored cargo, as any loss or tampering can result in significant economic damage. The cargo identification module within a security system faces the challenge of achieving 99.99% recognition accuracy, yet current identification methods are limited in accuracy due to the lack of cargo data, insufficient utilization of image features, and minimal differences between actual cargo classes. First, we collected and created a cargo identification dataset named "Cargo" using industrial cameras. Subsequently, an Attention-guided Multi-granularity feature fusion model (AGMG-Net) is proposed for cargo identification. This model extracts both coarse-grained and fine-grained features of the cargo using two branch networks and fuses them to fully utilize the information contained in these features. Furthermore, the Attention-guided Multi-stage Attention Accumulation (AMAA) module is introduced for target localization, and the Multi-region Optimal Selection method Based on Confidence (MOSBC) module is used for target cropping. The features from the two branches are fused by a fusion branch in a Concat manner for multi-granularity feature fusion. Experimental results show that the proposed model achieves average recognition rates of 99.58%, 92.73%, and 88.57% on the self-built Cargo dataset and the publicly available Flower and Butterfly20 datasets, respectively, outperforming state-of-the-art models. Therefore, this method accurately identifies cargo categories and provides valuable assistance to security systems.
Affiliation(s)
- Aigou Li: College of Computer Science and Technology, Xi'an University of Science and Technology, Xi'an, China
- Chen Yang: College of Computer Science and Technology, Xi'an University of Science and Technology, Xi'an, China
6. Wang KN, Zhuang S, Ran QY, Zhou P, Hua J, Zhou GQ, He X. DLGNet: A dual-branch lesion-aware network with the supervised Gaussian Mixture model for colon lesions classification in colonoscopy images. Med Image Anal 2023;87:102832. PMID: 37148864. DOI: 10.1016/j.media.2023.102832.
Abstract
Colorectal cancer is among the malignant tumors with the highest mortality; owing to the lack of obvious early symptoms, it is usually discovered at an advanced stage. The automatic and accurate classification of early colon lesions is therefore of great significance for clinically estimating the status of colon lesions and formulating appropriate diagnostic programs. However, classifying full-stage colon lesions is challenging due to the large inter-class similarities and intra-class differences of the images. In this work, we propose a novel dual-branch lesion-aware neural network (DLGNet) that classifies intestinal lesions by exploring the intrinsic relationship between diseases. It is composed of four modules: a lesion location module, a dual-branch classification module, an attention guidance module, and an inter-class Gaussian loss function. Specifically, the dual-branch module integrates the original image with the lesion patch obtained by the lesion localization module to explore and interact with lesion-specific features from global and local perspectives. The attention guidance module directs the model toward disease-specific features by learning remote dependencies through spatial and channel attention after network feature learning. Finally, the inter-class Gaussian loss function assumes that each feature extracted by the network follows an independent Gaussian distribution and makes the intra-class clustering more compact, thereby improving the discriminative ability of the network. Extensive experiments on 2568 collected colonoscopy images achieve an average accuracy of 91.50%, and the proposed method surpasses state-of-the-art methods. This study is the first to classify colon lesions at each stage and achieves promising colon disease classification performance. To motivate the community, we have made our code publicly available at https://github.com/soleilssss/DLGNet.
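The idea behind the inter-class Gaussian loss, as far as the abstract describes it, is that per-class features are modeled as Gaussians whose clusters should be compact. A rough stand-in that penalizes variance around each class mean; the actual DLGNet loss lives in the linked repository:

```python
import numpy as np

def intra_class_compactness_loss(features, labels):
    # Assume each class's features follow an independent Gaussian and
    # penalize the spread around each class mean, so clusters tighten.
    # Illustrative stand-in, not the paper's exact formulation.
    classes = np.unique(labels)
    loss = 0.0
    for c in classes:
        cls_feats = features[labels == c]
        center = cls_feats.mean(axis=0)
        loss += ((cls_feats - center) ** 2).sum(axis=1).mean()
    return loss / len(classes)

# Two toy 2-D feature sets: tight clusters vs. spread-out clusters
tight = np.array([[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [1.1, 1.0]])
loose = np.array([[0.0, 0.0], [0.9, 0.0], [1.0, 1.0], [0.1, 1.0]])
labels = np.array([0, 0, 1, 1])
```

Minimizing such a term alongside cross-entropy pulls same-class embeddings together, which is what improves the network's discriminative ability.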
Affiliation(s)
- Kai-Ni Wang: School of Biological Science and Medical Engineering, Southeast University, Nanjing, China; State Key Laboratory of Digital Medical Engineering, Southeast University, Nanjing, China; Jiangsu Key Laboratory of Biomaterials and Devices, Southeast University, Nanjing, China
- Shuaishuai Zhuang: The First Affiliated Hospital of Nanjing Medical University, Nanjing, China
- Qi-Yong Ran: School of Biological Science and Medical Engineering, Southeast University, Nanjing, China; State Key Laboratory of Digital Medical Engineering, Southeast University, Nanjing, China; Jiangsu Key Laboratory of Biomaterials and Devices, Southeast University, Nanjing, China
- Ping Zhou: School of Biological Science and Medical Engineering, Southeast University, Nanjing, China; State Key Laboratory of Digital Medical Engineering, Southeast University, Nanjing, China; Jiangsu Key Laboratory of Biomaterials and Devices, Southeast University, Nanjing, China
- Jie Hua: The First Affiliated Hospital of Nanjing Medical University, Nanjing, China; Liyang People's Hospital, Liyang Branch Hospital of Jiangsu Province Hospital, Liyang, China
- Guang-Quan Zhou: School of Biological Science and Medical Engineering, Southeast University, Nanjing, China; State Key Laboratory of Digital Medical Engineering, Southeast University, Nanjing, China; Jiangsu Key Laboratory of Biomaterials and Devices, Southeast University, Nanjing, China
- Xiaopu He: The First Affiliated Hospital of Nanjing Medical University, Nanjing, China
7. Zhu S, Zhan H, Yan Z, Wu M, Zheng B, Xu S, Jiang Q, Yang W. Prediction of spherical equivalent refraction and axial length in children based on machine learning. Indian J Ophthalmol 2023;71:2115-2131. PMID: 37203092. PMCID: PMC10391375. DOI: 10.4103/ijo.ijo_2989_22.
Abstract
Purpose: Recently, the proportion of patients with high myopia has shown a continuous growing trend, particularly in younger age groups. This study aimed to predict changes in spherical equivalent refraction (SER) and axial length (AL) in children using machine learning methods. Methods: In this retrospective study, the collaborating ophthalmology hospital provided data from 179 sets of childhood myopia examinations, including AL and SER measurements from grades 1 to 6. Six machine learning models were used to predict AL and SER from these data, and six evaluation indicators were used to assess the prediction results. Results: For predicting SER in grades 6, 5, 4, 3, and 2, the best results were obtained with the multilayer perceptron (MLP), MLP, orthogonal matching pursuit (OMP), OMP, and OMP algorithms, respectively; the R2 values of the five models were 0.8997, 0.7839, 0.7177, 0.5118, and 0.1758. For predicting AL in grades 6, 5, 4, 3, and 2, the best results were obtained with the Extra Tree (ET), MLP, kernel ridge (KR), KR, and MLP algorithms, respectively; the R2 values of the five models were 0.7546, 0.5456, 0.8755, 0.9072, and 0.8534. Conclusion: In predicting SER, the OMP model performed better than the other models in most experiments; in predicting AL, the KR and MLP models performed best.
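The R2 values reported above are the coefficient of determination. A minimal implementation, with illustrative axial-length values rather than the study's data:

```python
def r2_score(y_true, y_pred):
    # Coefficient of determination: R^2 = 1 - SS_res / SS_tot
    mean_y = sum(y_true) / len(y_true)
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    return 1.0 - ss_res / ss_tot

# Perfect predictions give R^2 = 1; always predicting the mean gives 0.
al_true = [23.1, 23.8, 24.5, 25.2]   # hypothetical axial lengths (mm)
al_pred = [23.0, 23.9, 24.4, 25.3]
score = r2_score(al_true, al_pred)
```

An R2 of 0.1758, as in the grade-2 SER result, means the model explains under a fifth of the variance, which is why the early-grade predictions are the weakest.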
Affiliation(s)
- Shaojun Zhu: School of Information Engineering, Huzhou University, Huzhou, China; Zhejiang Province Key Laboratory of Smart Management and Application of Modern Agricultural Resources, Huzhou University, Huzhou, China
- Haodong Zhan: School of Information Engineering, Huzhou University, Huzhou, China
- Zhipeng Yan: Eye Hospital, Nanjing Medical University, Nanjing, China
- Maonian Wu: School of Information Engineering, Huzhou University, Huzhou, China; Zhejiang Province Key Laboratory of Smart Management and Application of Modern Agricultural Resources, Huzhou University, Huzhou, China
- Bo Zheng: School of Information Engineering, Huzhou University, Huzhou, China; Zhejiang Province Key Laboratory of Smart Management and Application of Modern Agricultural Resources, Huzhou University, Huzhou, China
- Shanshan Xu: Eye Hospital, Nanjing Medical University, Nanjing, China
- Qin Jiang: Eye Hospital, Nanjing Medical University, Nanjing, China
- Weihua Yang: Shenzhen Eye Hospital, Jinan University, Shenzhen, China
8. You H, Yu L, Tian S, Cai W. A stereo spatial decoupling network for medical image classification. Complex Intell Syst 2023;9:1-10. PMID: 37361963. PMCID: PMC10107597. DOI: 10.1007/s40747-023-01049-9.
Abstract
Deep convolutional neural networks (CNNs) have made great progress in medical image classification. However, they struggle to establish effective spatial associations and often extract similar low-level features, resulting in information redundancy. To address these limitations, we propose a stereo spatial decoupling network (TSDNets), which can leverage the multi-dimensional spatial details of medical images. We use an attention mechanism to progressively extract the most discriminative features from three directions: horizontal, vertical, and depth. A cross feature screening strategy is then used to divide the original feature maps into three levels: important, secondary, and redundant. Specifically, we design a cross feature screening module (CFSM) and a semantic-guided decoupling module (SGDM) to model multi-dimensional spatial relationships, thereby enhancing the feature representation capabilities. Extensive experiments on multiple open-source baseline datasets demonstrate that TSDNets outperforms previous state-of-the-art models.
Affiliation(s)
- Hongfeng You: School of Information Science and Engineering, Xinjiang University, Urumqi 830000, China
- Long Yu: Network Center, Xinjiang University, Urumqi 830000, China
- Shengwei Tian: Software College, Xinjiang University, Urumqi 830000, China
- Weiwei Cai: School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi 214122, China
9. Intelligent Diagnosis of Multiple Peripheral Retinal Lesions in Ultra-widefield Fundus Images Based on Deep Learning. Ophthalmol Ther 2023;12:1081-1095. PMID: 36692813. PMCID: PMC9872743. DOI: 10.1007/s40123-023-00651-x.
Abstract
INTRODUCTION: Compared with traditional fundus examination techniques, ultra-widefield fundus (UWF) imaging provides 200° panoramic images of the retina, which allows better detection of peripheral retinal lesions. However, UWF currently offers an effective solution for detection but still lacks efficient diagnostic support. This study proposed a retinal lesion detection model to automatically locate and identify six relatively typical and high-incidence peripheral retinal lesions from UWF images, enabling early screening and rapid diagnosis. METHODS: A total of 24,602 augmented UWF images, labelled by 5 ophthalmologists with 6 peripheral retinal lesion labels and a normal-manifestation label, were included in this study. An object detection model named You Only Look Once X (YOLOX) was modified and trained to locate and classify the six peripheral retinal lesions: rhegmatogenous retinal detachment (RRD), retinal breaks (RB), white without pressure (WWOP), cystic retinal tuft (CRT), lattice degeneration (LD), and paving-stone degeneration (PSD). We applied a coordinate attention block and the generalized intersection over union (GIoU) loss to YOLOX and evaluated it for accuracy, sensitivity, specificity, precision, F1 score, and mean average precision (mAP). The model shows the exact location and saliency map of detected retinal lesions, contributing to efficient screening and diagnosis. RESULTS: The model reached an average accuracy of 96.64%, sensitivity of 87.97%, specificity of 98.04%, precision of 87.01%, F1 score of 87.39%, and mAP of 86.03% on test dataset 1 (248 UWF images), and an average accuracy of 95.04%, sensitivity of 83.90%, specificity of 96.70%, precision of 78.73%, F1 score of 81.96%, and mAP of 80.59% on external test dataset 2 (586 UWF images), showing that the system performs well in distinguishing the six peripheral retinal lesions.
CONCLUSION: Focusing on peripheral retinal lesions, this work proposed a deep learning model that automatically recognizes multiple peripheral retinal lesions from UWF images and localizes the exact positions of lesions. It therefore has potential for early screening and intelligent diagnosis of peripheral retinal lesions.
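The GIoU loss mentioned in the methods extends IoU with a penalty based on the smallest enclosing box, which keeps gradients informative even when predicted and ground-truth boxes do not overlap. A minimal sketch (box coordinates are illustrative):

```python
def giou(box_a, box_b):
    # Boxes as (x1, y1, x2, y2). GIoU = IoU - |C \ (A U B)| / |C|,
    # where C is the smallest box enclosing both A and B.
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    iou = inter / union
    # Smallest enclosing box C
    c_area = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))
    return iou - (c_area - union) / c_area

def giou_loss(box_a, box_b):
    # Zero iff the boxes coincide; grows as the boxes drift apart
    return 1.0 - giou(box_a, box_b)
```

For disjoint boxes plain IoU is flat at zero, whereas GIoU still decreases as the boxes separate, which is exactly what a detector's regression head needs.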
10. Liu R, Wang T, Li H, Zhang P, Li J, Yang X, Shen D, Sheng B. TMM-Nets: Transferred Multi- to Mono-Modal Generation for Lupus Retinopathy Diagnosis. IEEE Trans Med Imaging 2023;42:1083-1094. PMID: 36409801. DOI: 10.1109/tmi.2022.3223683.
Abstract
Rare diseases, which are severely underrepresented in basic and clinical research, can particularly benefit from machine learning techniques. However, current learning-based approaches usually focus on either mono-modal image data or matched multi-modal data, whereas the diagnosis of rare diseases necessitates the aggregation of unstructured and unmatched multi-modal image data due to their rare and diverse nature. In this study, we therefore propose diagnosis-guided multi-to-mono modal generation networks (TMM-Nets), along with training and testing procedures. TMM-Nets can transfer data from multiple sources to a single modality for diagnostic data structurization. To demonstrate their potential in the context of rare diseases, TMM-Nets were deployed to diagnose lupus retinopathy (LR-SLE), leveraging unmatched regular and ultra-wide-field fundus images for transfer learning. The TMM-Nets encoded the transfer learning from diabetic retinopathy to LR-SLE based on the similarity of the fundus lesions. In addition, a lesion-aware multi-scale attention mechanism was developed for clinical alerts, enabling TMM-Nets not only to inform patient care but also to provide insights consistent with those of clinicians. An adversarial strategy was also developed to refine multi-to-mono modal image generation based on the diagnostic results and the data distribution, enhancing data augmentation performance. Compared to the baseline model, TMM-Nets showed F1 score improvements of 35.19% and 33.56% on the test and external validation sets, respectively. In addition, TMM-Nets can be used to develop diagnostic models for other rare diseases.
11. Huang P, He P, Tian S, Ma M, Feng P, Xiao H, Mercaldo F, Santone A, Qin J. A ViT-AMC Network With Adaptive Model Fusion and Multiobjective Optimization for Interpretable Laryngeal Tumor Grading From Histopathological Images. IEEE Trans Med Imaging 2023;42:15-28. PMID: 36018875. DOI: 10.1109/tmi.2022.3202248.
Abstract
The tumor grading of laryngeal cancer pathological images needs to be accurate and interpretable. Deep learning models based on attention mechanism-integrated convolution (AMC) blocks have good inductive bias capability but poor interpretability, whereas models based on vision transformer (ViT) blocks have good interpretability but weak inductive bias. We therefore propose an end-to-end ViT-AMC network (ViT-AMCNet) with adaptive model fusion and multiobjective optimization that integrates and fuses the ViT and AMC blocks. However, existing model fusion methods often suffer from negative fusion: (1) there is no guarantee that the ViT and AMC blocks will simultaneously have good feature representation capability, and (2) the difference between the feature representations learned by the ViT and AMC blocks is not obvious, so the two representations contain much redundant information. Accordingly, we first prove the feasibility of fusing the ViT and AMC blocks based on Hoeffding's inequality. Then, we propose a multiobjective optimization method to solve the problem that the ViT and AMC blocks cannot simultaneously have good feature representations. Finally, an adaptive model fusion method integrating a metrics block and a fusion block is proposed to increase the differences between feature representations and improve the deredundancy capability. Our methods improve the fusion ability of ViT-AMCNet, and experimental results demonstrate that ViT-AMCNet significantly outperforms state-of-the-art methods. Importantly, the visualized interpretive maps are closer to the regions of interest that pathologists focus on, and the generalization ability is also excellent. Our code is publicly available at https://github.com/Baron-Huang/ViT-AMCNet.
12. Ma L, Su X, Ma L, Gao X, Sun M. Deep learning for classification and localization of early gastric cancer in endoscopic images. Biomed Signal Process Control 2023. DOI: 10.1016/j.bspc.2022.104200.
13. Liang S, Dong X, Yang K, Chu Z, Tang F, Ye F, Chen B, Guan J, Zhang Y. A multi-perspective information aggregation network for automated T-staging detection of nasopharyngeal carcinoma. Phys Med Biol 2022;67. PMID: 36541557. DOI: 10.1088/1361-6560/aca516.
Abstract
Accurate T-staging is important when planning personalized radiotherapy. However, T-staging via manual slice-by-slice inspection is time-consuming, tumor sizes and shapes are heterogeneous, and junior physicians find such inspection challenging. With inspiration from oncological diagnostics, we developed a multi-perspective aggregation network that incorporates various diagnosis-oriented knowledge and allows automated nasopharyngeal carcinoma T-staging detection (TSD Net). Specifically, TSD Net is designed with a multi-branch architecture that captures tumor size and shape information (basic knowledge), strongly correlated contextual features, and associations between the tumor and surrounding tissues. We define the association between the tumor and surrounding tissues by a signed distance map, which embeds points and tumor contours in a higher-dimensional space, yielding valuable information about the locations of tissue associations. TSD Net finally outputs a T1-T4 stage prediction by aggregating data from the three branches. We evaluated TSD Net on a T1-weighted contrast-enhanced magnetic resonance imaging database of 320 patients in a three-fold cross-validation manner. The results show that the proposed method achieves a mean area under the curve (AUC) as high as 87.95%. We also compared our method with traditional classifiers and a deep learning-based method; TSD Net is efficient and accurate and outperforms the other methods.
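A signed distance map as described above assigns each pixel a distance to the opposite region, negative inside the contour and positive outside. A brute-force sketch on a toy mask; real implementations would use a fast distance transform such as scipy.ndimage.distance_transform_edt:

```python
import math

def signed_distance_map(mask):
    # mask: 2D list of 0/1 (1 = inside the tumor contour).
    # Each pixel gets the Euclidean distance (in pixels) to the nearest
    # pixel of the opposite region: negative inside, positive outside.
    # Brute force, fine for a sketch but far too slow for real MRI volumes.
    h, w = len(mask), len(mask[0])
    cells = [(i, j) for i in range(h) for j in range(w)]
    out = [[0.0] * w for _ in range(h)]
    for i, j in cells:
        opposite = [(a, b) for a, b in cells if mask[a][b] != mask[i][j]]
        d = min(math.hypot(i - a, j - b) for a, b in opposite)
        out[i][j] = -d if mask[i][j] else d
    return out

mask = [[0, 0, 0, 0],
        [0, 1, 1, 0],
        [0, 1, 1, 0],
        [0, 0, 0, 0]]
sdm = signed_distance_map(mask)
```

Because the map varies smoothly away from the contour, it encodes where surrounding tissue sits relative to the tumor, which is the information the tissue-association branch consumes.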
Affiliation(s)
- Shujun Liang: School of Biomedical Engineering; Guangdong Provincial Key Laboratory of Medical Image Processing; Guangdong Province Engineering Laboratory for Medical Imaging and Diagnostic Technology, Southern Medical University, Guangzhou, Guangdong, 510515, People's Republic of China
- Xiuyu Dong: School of Biomedical Engineering; Guangdong Provincial Key Laboratory of Medical Image Processing; Guangdong Province Engineering Laboratory for Medical Imaging and Diagnostic Technology, Southern Medical University, Guangzhou, Guangdong, 510515, People's Republic of China
- Kaifan Yang: Department of Medical Imaging Center, Nanfang Hospital, Southern Medical University, Guangzhou, Guangdong, 510515, People's Republic of China
- Zhiqin Chu: School of Biomedical Engineering; Guangdong Provincial Key Laboratory of Medical Image Processing; Guangdong Province Engineering Laboratory for Medical Imaging and Diagnostic Technology, Southern Medical University, Guangzhou, Guangdong, 510515, People's Republic of China
- Fan Tang: Department of Radiation Oncology, Nanfang Hospital, Southern Medical University, Guangzhou, Guangdong, 510515, People's Republic of China
- Feng Ye: Department of Radiation Oncology, Nanfang Hospital, Southern Medical University, Guangzhou, Guangdong, 510515, People's Republic of China
- Bei Chen: Department of Radiation Oncology, Nanfang Hospital, Southern Medical University, Guangzhou, Guangdong, 510515, People's Republic of China
- Jian Guan: Department of Radiation Oncology, Nanfang Hospital, Southern Medical University, Guangzhou, Guangdong, 510515, People's Republic of China
- Yu Zhang: School of Biomedical Engineering; Guangdong Provincial Key Laboratory of Medical Image Processing; Guangdong Province Engineering Laboratory for Medical Imaging and Diagnostic Technology, Southern Medical University, Guangzhou, Guangdong, 510515, People's Republic of China
14
Tang Q, Xu F, Zhang C, Li C, Liu F, Shen M, Liu X, Lin J, Zhu L, Lin T, Sun D. Two birds, one stone: host-guest complex of indocyanine green–β-cyclodextrin for fundus angiography. J INCL PHENOM MACRO 2022. [DOI: 10.1007/s10847-022-01154-1]
15
Multi-Label Fundus Image Classification Using Attention Mechanisms and Feature Fusion. Micromachines 2022; 13:mi13060947. [PMID: 35744561] [PMCID: PMC9230753] [DOI: 10.3390/mi13060947]
Abstract
Fundus diseases can cause irreversible vision loss in both eyes if not diagnosed and treated immediately. Because of the complexity of fundus diseases, the probability that a fundus image contains two or more diseases is extremely high, yet existing deep learning-based fundus image classification algorithms have low diagnostic accuracy on multi-labeled fundus images. In this paper, a multi-label classification of fundus disease from binocular fundus images is presented, using a neural network model based on attention mechanisms and feature fusion. The algorithm highlights detailed features in binocular fundus images and then feeds them into a ResNet50 network with attention mechanisms to extract fundus image lesion features. The model obtains global features of binocular images through feature fusion and uses Softmax to classify multi-label fundus images. The ODIR binocular fundus image dataset was used to evaluate the network's classification performance and to conduct ablation experiments; the model is implemented in the TensorFlow framework. On the test images, this method achieved accuracy, precision, recall, and F1 values of 94.23%, 99.09%, 99.23%, and 99.16%, respectively.
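The attention-then-fuse pattern this abstract describes can be illustrated with a toy NumPy sketch. Everything here (shapes, the squeeze-and-excitation-style gate, the fusion by concatenation, the sigmoid head) is a generic stand-in, not the paper's ResNet50-based network; the paper itself reports a Softmax classifier, while a sigmoid head is shown as the common multi-label choice:

```python
import numpy as np

def channel_attention(feat, w1, w2):
    """SE-style channel attention on a (C, H, W) feature map:
    global-average-pool, two projections, sigmoid gate, rescale."""
    squeeze = feat.mean(axis=(1, 2))                                 # (C,)
    excite = 1 / (1 + np.exp(-(w2 @ np.maximum(w1 @ squeeze, 0))))   # (C,)
    return feat * excite[:, None, None]

def fuse_binocular(left_feat, right_feat):
    """Pool attended left/right-eye features and concatenate them
    into one global binocular descriptor."""
    return np.concatenate([left_feat.mean(axis=(1, 2)),
                           right_feat.mean(axis=(1, 2))])

rng = np.random.default_rng(0)
C, H, W, n_labels = 8, 4, 4, 5
w1, w2 = rng.normal(size=(C // 2, C)), rng.normal(size=(C, C // 2))
left = channel_attention(rng.normal(size=(C, H, W)), w1, w2)
right = channel_attention(rng.normal(size=(C, H, W)), w1, w2)
fused = fuse_binocular(left, right)                  # (2C,) descriptor
logits = rng.normal(size=(n_labels, 2 * C)) @ fused  # per-label scores
probs = 1 / (1 + np.exp(-logits))                    # independent label probabilities
```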
16
Attention to region: Region-based integration-and-recalibration networks for nuclear cataract classification using AS-OCT images. Med Image Anal 2022; 80:102499. [DOI: 10.1016/j.media.2022.102499]
17
Dai W, Li X, Chiu WHK, Kuo MD, Cheng KT. Adaptive Contrast for Image Regression in Computer-Aided Disease Assessment. IEEE Trans Med Imaging 2022; 41:1255-1268. [PMID: 34941504] [DOI: 10.1109/tmi.2021.3137854]
Abstract
Image regression tasks for medical applications, such as bone mineral density (BMD) estimation and left-ventricular ejection fraction (LVEF) prediction, play an important role in computer-aided disease assessment. Most deep regression methods train the neural network with a single regression loss function like MSE or L1 loss. In this paper, we propose the first contrastive learning framework for deep image regression, namely AdaCon, which consists of a feature learning branch via a novel adaptive-margin contrastive loss and a regression prediction branch. Our method incorporates label distance relationships as part of the learned feature representations, which allows for better performance in downstream regression tasks. Moreover, it can be used as a plug-and-play module to improve performance of existing regression methods. We demonstrate the effectiveness of AdaCon on two medical image regression tasks, i.e., bone mineral density estimation from X-ray images and left-ventricular ejection fraction prediction from echocardiogram videos. AdaCon leads to relative improvements of 3.3% and 5.9% in MAE over state-of-the-art BMD estimation and LVEF prediction methods, respectively.
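The core idea of an adaptive-margin contrastive loss for regression, as described here, is that the margin separating a pair of samples in feature space grows with the distance between their regression labels. The following toy function sketches that general idea only; it is not the authors' exact AdaCon formulation, and `alpha` is an illustrative scaling hyperparameter:

```python
import numpy as np

def adacon_like_loss(feats, labels, alpha=1.0):
    """Toy adaptive-margin contrastive loss: penalize sample pairs whose
    feature distance undershoots a margin proportional to their label
    distance, so the feature space mirrors the label ordering."""
    n = len(labels)
    total, count = 0.0, 0
    for i in range(n):
        for j in range(i + 1, n):
            margin = alpha * abs(labels[i] - labels[j])  # label-dependent margin
            d = np.linalg.norm(np.asarray(feats[i]) - np.asarray(feats[j]))
            total += max(0.0, margin - d) ** 2           # hinge on the shortfall
            count += 1
    return total / count

labels = [0.0, 1.0, 2.0]
good = np.array([[0.0], [1.0], [2.0]])  # feature distances match label distances
bad = np.zeros((3, 1))                  # collapsed features ignore the labels
```

With these toy inputs, the collapsed embedding is penalized while the label-aligned one is not, which is the behaviour such a loss is meant to encourage.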
18
Wu Z, Cai W, Xie H, Chen S, Wang Y, Lei B, Zheng Y, Lu L. Predicting Optical Coherence Tomography-Derived High Myopia Grades From Fundus Photographs Using Deep Learning. Front Med (Lausanne) 2022; 9:842680. [PMID: 35308524] [PMCID: PMC8927672] [DOI: 10.3389/fmed.2022.842680]
Abstract
Purpose: To develop an artificial intelligence (AI) system that predicts optical coherence tomography (OCT)-derived high myopia grades from fundus photographs.
Methods: In this retrospective study, 1,853 qualified fundus photographs obtained from the Zhongshan Ophthalmic Center (ZOC) were selected to develop the AI system. Three retinal specialists assessed the corresponding OCT images to label the fundus photographs. We developed a novel deep learning model to detect and predict myopic maculopathy according to the atrophy (A), traction (T), and neovascularisation (N) classification and grading system, and we compared the performance of our model with that of ophthalmologists.
Results: On the test set, the deep learning model achieved an area under the receiver operating characteristic curve (AUC) of 0.969 for category A, 0.895 for category T, and 0.936 for category N, with average accuracies of 92.38% (A), 85.34% (T), and 94.21% (N). The performance of our AI system was superior to that of attending ophthalmologists and comparable to that of retinal specialists.
Conclusion: Our AI system achieved performance comparable to that of retinal specialists in predicting vision-threatening conditions in high myopia from simple fundus photographs rather than combined fundus and OCT images. The system can reduce patients' follow-up costs and is well suited to less developed areas that have only fundus photography.
Affiliation(s)
- Zhenquan Wu: State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China
- Wenjia Cai: State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China
- Hai Xie: Health Science Center, School of Biomedical Engineering, Shenzhen University, Shenzhen, China
- Shida Chen: State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China
- Yanbing Wang: State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China
- Baiying Lei: Health Science Center, School of Biomedical Engineering, Shenzhen University, Shenzhen, China
- Yingfeng Zheng: State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China
- Lin Lu: State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China
19
Channel separation-based network for the automatic anatomical site recognition using endoscopic images. Biomed Signal Process Control 2022. [DOI: 10.1016/j.bspc.2021.103167]