1. Zhang Y, Sun K, Liu Y, Xie F, Guo Q, Shen D. A Modality-Flexible Framework for Alzheimer's Disease Diagnosis Following Clinical Routine. IEEE J Biomed Health Inform 2025; 29:535-546. [PMID: 39352829] [DOI: 10.1109/jbhi.2024.3472011]
Abstract
Dementia has a high incidence among the elderly, and Alzheimer's disease (AD) is its most common form. AD diagnosis in clinics usually follows a standard routine consisting of different phases, from acquiring non-imaging tabular data in the screening phase, to MR imaging, and ultimately to PET imaging. Most existing AD diagnosis studies are dedicated to a specific phase using either single-modal or multi-modal data. In this paper, we introduce a modality-flexible classification framework that is applicable to the different AD diagnosis phases of the clinical routine. Specifically, our framework consists of three branches corresponding to the three diagnosis phases: 1) a tabular branch using only tabular data for the screening phase, 2) an MRI branch using both MRI and tabular data for uncertain cases from the screening phase, and 3) a PET branch for the most challenging cases using all modalities, including PET, MRI, and tabular data. To achieve effective fusion of imaging and non-imaging modalities, we introduce an image-tabular transformer block that adaptively scales and shifts the image and tabular features according to the modality importance determined by the network. The proposed framework is extensively validated on four cohorts containing 6495 subjects. Experiments demonstrate that our framework achieves better diagnostic performance than other representative methods across various AD diagnosis tasks and shows promising performance in all diagnosis phases, exhibiting great potential for clinical application.
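The abstract describes the image-tabular fusion only at a high level; the following is a minimal PyTorch sketch of the scale-and-shift idea it mentions (feature-wise modulation of image features by a tabular embedding, gated by a learned modality-importance weight). It is not the authors' implementation; all module names and dimensions are illustrative.

```python
import torch
import torch.nn as nn

class ScaleShiftFusion(nn.Module):
    """Hypothetical image-tabular fusion: the tabular embedding predicts a
    per-channel scale (gamma) and shift (beta) applied to the image features,
    plus a gate reflecting how much the tabular modality should matter."""

    def __init__(self, img_dim: int, tab_dim: int):
        super().__init__()
        self.to_gamma_beta = nn.Linear(tab_dim, 2 * img_dim)
        self.gate = nn.Sequential(nn.Linear(tab_dim, 1), nn.Sigmoid())

    def forward(self, img_feat: torch.Tensor, tab_feat: torch.Tensor) -> torch.Tensor:
        gamma, beta = self.to_gamma_beta(tab_feat).chunk(2, dim=-1)
        modulated = (1 + gamma) * img_feat + beta      # scale and shift
        alpha = self.gate(tab_feat)                    # learned modality importance
        return alpha * modulated + (1 - alpha) * img_feat

# Toy usage with random features (batch of 4).
fusion = ScaleShiftFusion(img_dim=128, tab_dim=16)
fused = fusion(torch.randn(4, 128), torch.randn(4, 16))
print(fused.shape)  # torch.Size([4, 128])
```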
2. Kwak MG, Mao L, Zheng Z, Su Y, Lure F, Li J. A Cross-Modal Mutual Knowledge Distillation Framework for Alzheimer's Disease Diagnosis: Addressing Incomplete Modalities. medRxiv 2024:2023.08.24.23294574. [PMID: 37662267] [PMCID: PMC10473798] [DOI: 10.1101/2023.08.24.23294574]
Abstract
Early detection of Alzheimer's Disease (AD) is crucial for timely intervention and optimal treatment outcomes. Despite the promise of integrating multimodal neuroimages such as MRI and PET, handling datasets with incomplete modalities remains under-researched. This situation is nevertheless common in real-world settings, as not every patient has all modalities due to practical constraints such as cost, access, and safety concerns. We propose a deep learning framework employing cross-modal Mutual Knowledge Distillation (MKD) to model different sub-cohorts of patients based on their available modalities. In MKD, the multimodal model (e.g., MRI and PET) serves as the teacher, while the single-modality model (e.g., MRI only) is the student. Our MKD framework features three components: a Modality-Disentangling Teacher (MDT) model designed through information disentanglement, a student model that learns from classification errors and the MDT's knowledge, and a teacher model enhanced by distilling the student's single-modal feature extraction capabilities. Moreover, we show the effectiveness of the proposed method through theoretical analysis and validate its performance with simulation studies. In addition, our method is demonstrated through a case study with Alzheimer's Disease Neuroimaging Initiative (ADNI) datasets, underscoring the potential of artificial intelligence in addressing incomplete multimodal neuroimaging datasets and advancing early AD detection.

Note to Practitioners: This paper was motivated by the challenge of early AD diagnosis, particularly in scenarios where clinicians encounter varied availability of patient imaging data, such as MRI and PET scans, often constrained by cost or accessibility. We propose an incomplete multimodal learning framework that produces tailored models for patients with only MRI and for patients with both MRI and PET. This approach improves the accuracy and effectiveness of early AD diagnosis, especially when imaging resources are limited, via bi-directional knowledge transfer. We introduce a teacher model that prioritizes extracting information shared between modalities, significantly enhancing the student model's learning process. The paper includes a theoretical analysis, a simulation study, and a real-world case study to illustrate the method's promising potential in early AD detection. However, practitioners should be mindful of the complexities involved in model tuning. Future work will focus on improving model interpretability and expanding the method's application, including developing approaches to discover the key brain regions driving predictions, enhancing clinical trust, and extending the framework to a broader range of imaging modalities, demographic information, and clinical data. These advancements aim to provide a more comprehensive view of patient health and improve diagnostic accuracy across various neurodegenerative diseases.
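As a rough sketch of the bi-directional (mutual) distillation idea described above: each network combines its own supervised loss with a KL term toward the other network's temperature-softened predictions. This is a generic formulation under assumed defaults, not the paper's exact losses.

```python
import torch
import torch.nn.functional as F

def mutual_distillation_losses(student_logits, teacher_logits, labels, T=2.0, lam=0.5):
    """One plausible form of mutual knowledge distillation: each model combines
    cross-entropy on the labels with a KL term that pulls it toward the other
    model's temperature-softened predictions (teacher <-> student)."""
    kl_s = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                    F.softmax(teacher_logits.detach() / T, dim=1),
                    reduction="batchmean") * T * T
    kl_t = F.kl_div(F.log_softmax(teacher_logits / T, dim=1),
                    F.softmax(student_logits.detach() / T, dim=1),
                    reduction="batchmean") * T * T
    loss_student = F.cross_entropy(student_logits, labels) + lam * kl_s
    loss_teacher = F.cross_entropy(teacher_logits, labels) + lam * kl_t
    return loss_student, loss_teacher

# Toy check with random logits for a 2-class problem (e.g., AD vs. CN).
s, t = torch.randn(8, 2), torch.randn(8, 2)
y = torch.randint(0, 2, (8,))
print(mutual_distillation_losses(s, t, y))
```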
Affiliation(s)
- Min Gu Kwak
- H. Milton Stewart School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, GA 30332, USA
- Lingchao Mao
- H. Milton Stewart School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, GA 30332, USA
- Zhiyang Zheng
- H. Milton Stewart School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, GA 30332, USA
- Yi Su
- Banner Alzheimer's Institute, Phoenix, AZ 85006, USA
- Fleming Lure
- MS Technologies Corporation, Rockville, MD 20850, USA
- Jing Li
- H. Milton Stewart School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, GA 30332, USA
3. Xu W, Li C, Bian Y, Meng Q, Zhu W, Shi F, Chen X, Shao C, Xiang D. Cross-Modal Consistency for Single-Modal MR Image Segmentation. IEEE Trans Biomed Eng 2024; 71:2557-2567. [PMID: 38512744] [DOI: 10.1109/tbme.2024.3380058]
Abstract
OBJECTIVE: Multi-modal magnetic resonance (MR) image segmentation is an important task in disease diagnosis and treatment, but it is usually difficult to obtain multiple modalities for a single patient in clinical applications. To address this issue, a cross-modal consistency framework is proposed for single-modal MR image segmentation.
METHODS: To enable single-modal MR image segmentation in the inference stage, a weighted cross-entropy loss and a pixel-level feature consistency loss are proposed to train the target network under the guidance of the teacher network and the auxiliary network. To fuse dual-modal MR images in the training stage, cross-modal consistency is measured with a Dice similarity entropy loss and a Dice similarity contrastive loss, so as to maximize the prediction similarity of the teacher and auxiliary networks. To reduce the difference in image contrast between different MR images of the same organs, a contrast alignment network is proposed to align input images with varying contrast to reference images with good contrast.
RESULTS: Comprehensive experiments were performed on a publicly available prostate dataset and an in-house pancreas dataset to verify the effectiveness of the proposed method. Compared with state-of-the-art methods, the proposed method achieves better segmentation.
CONCLUSION: The proposed segmentation method fuses dual-modal MR images in the training stage and needs only single-modal MR images in the inference stage.
SIGNIFICANCE: The proposed method can be used in routine clinical settings when only a single-modal MR image with variable contrast is available for a patient.
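To make the consistency idea concrete, here is a simplified PyTorch sketch combining a Dice-style agreement term between two networks' predictions with a pixel-level feature consistency (MSE) term. The exact weighted losses in the paper differ; this is only an illustration of the general construction.

```python
import torch
import torch.nn.functional as F

def soft_dice_similarity(p, q, eps=1e-6):
    """Soft Dice overlap between two probability maps (higher = more consistent)."""
    inter = (p * q).sum(dim=(2, 3))
    denom = p.sum(dim=(2, 3)) + q.sum(dim=(2, 3))
    return ((2 * inter + eps) / (denom + eps)).mean()

def cross_modal_consistency_loss(prob_teacher, prob_aux, feat_target, feat_teacher):
    """Illustrative combination: a Dice-based term encouraging the teacher and
    auxiliary networks to agree, plus a pixel-level feature consistency (MSE)
    term between target-network and teacher-network features."""
    dice_term = 1.0 - soft_dice_similarity(prob_teacher, prob_aux)
    feat_term = F.mse_loss(feat_target, feat_teacher.detach())
    return dice_term + feat_term

# Toy tensors: batch 2, one foreground channel, 32x32 maps, 16-channel features.
p1, p2 = torch.rand(2, 1, 32, 32), torch.rand(2, 1, 32, 32)
f1, f2 = torch.randn(2, 16, 32, 32), torch.randn(2, 16, 32, 32)
print(cross_modal_consistency_loss(p1, p2, f1, f2))
```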
4. Li Z, Yang X, Lan H, Wang M, Huang L, Wei X, Xie G, Wang R, Yu J, He Q, Zhang Y, Luo J. Knowledge fused latent representation from lung ultrasound examination for COVID-19 pneumonia severity assessment. Ultrasonics 2024; 143:107409. [PMID: 39053242] [DOI: 10.1016/j.ultras.2024.107409]
Abstract
COVID-19 pneumonia severity assessment is of great clinical importance, and lung ultrasound (LUS) plays a crucial role in aiding this assessment owing to its safety and portability. However, its reliance on qualitative and subjective observations by clinicians is a limitation. Moreover, LUS images often exhibit significant heterogeneity, emphasizing the need for more quantitative assessment methods. In this paper, we propose a knowledge fused latent representation framework tailored for COVID-19 pneumonia severity assessment using LUS examinations. The framework transforms the LUS examination into a latent representation and extracts knowledge from regions labeled by clinicians to improve accuracy. To fuse the knowledge into the latent representation, we employ a knowledge fusion with latent representation (KFLR) model. This model significantly reduces errors compared with approaches that lack prior knowledge integration. Experimental results demonstrate the effectiveness of our method, which achieves accuracies of 96.4% and 87.4% for binary-level and four-level COVID-19 pneumonia severity assessment, respectively. It is worth noting that only a limited number of studies have reported accuracy for clinically valuable exam-level assessment, and our method surpasses existing methods in this context. These findings highlight the potential of the proposed framework for monitoring disease progression and patient stratification in COVID-19 pneumonia.
Affiliation(s)
- Zhiqiang Li
- School of Biomedical Engineering, Tsinghua University, Beijing 100084, China
- Xueping Yang
- Department of Ultrasound, Beijing Ditan Hospital, Capital Medical University, Beijing 100015, China
- Hengrong Lan
- School of Biomedical Engineering, Tsinghua University, Beijing 100084, China
- Mixue Wang
- Department of Ultrasound, Beijing Ditan Hospital, Capital Medical University, Beijing 100015, China
- Lijie Huang
- School of Biomedical Engineering, Tsinghua University, Beijing 100084, China
- Xingyue Wei
- School of Biomedical Engineering, Tsinghua University, Beijing 100084, China
- Gangqiao Xie
- School of Biomedical Engineering, Tsinghua University, Beijing 100084, China
- Rui Wang
- School of Biomedical Engineering, Tsinghua University, Beijing 100084, China
- Jing Yu
- Department of Ultrasound, Beijing Ditan Hospital, Capital Medical University, Beijing 100015, China
- Qiong He
- School of Biomedical Engineering, Tsinghua University, Beijing 100084, China
- Yao Zhang
- Department of Ultrasound, Beijing Ditan Hospital, Capital Medical University, Beijing 100015, China
- Jianwen Luo
- School of Biomedical Engineering, Tsinghua University, Beijing 100084, China
5. Zhang H, Liu J, Liu W, Chen H, Yu Z, Yuan Y, Wang P, Qin J. MHD-Net: Memory-Aware Hetero-Modal Distillation Network for Thymic Epithelial Tumor Typing With Missing Pathology Modality. IEEE J Biomed Health Inform 2024; 28:3003-3014. [PMID: 38470599] [DOI: 10.1109/jbhi.2024.3376462]
Abstract
Fusing multi-modal radiology and pathology data with complementary information can improve the accuracy of tumor typing. However, collecting pathology data is difficult because it is costly and sometimes only obtainable after surgery, which limits the application of multi-modal methods in diagnosis. To address this problem, we propose to comprehensively learn from multi-modal radiology-pathology data during training while using only uni-modal radiology data in testing. Concretely, a Memory-aware Hetero-modal Distillation Network (MHD-Net) is proposed, which can distill well-learned multi-modal knowledge from the teacher to the student with the assistance of memory. In the teacher, to tackle the challenge of hetero-modal feature fusion, we propose a novel spatial-differentiated hetero-modal fusion module (SHFM) that models spatial-specific tumor information correlations across modalities. As only radiology data are accessible to the student, we store pathology features in the proposed contrast-boosted typing memory module (CTMM), which performs type-wise memory updating and stage-wise contrastive memory boosting to ensure the effectiveness and generalization of memory items. In the student, to improve cross-modal distillation, we propose a multi-stage memory-aware distillation (MMD) scheme that reads memory-aware pathology features from the CTMM to compensate for missing modal-specific information. Furthermore, we construct a Radiology-Pathology Thymic Epithelial Tumor (RPTET) dataset containing paired CT and WSI images with annotations. Experiments on the RPTET and CPTAC-LUAD datasets demonstrate that MHD-Net significantly improves tumor typing and outperforms existing multi-modal methods in missing-modality settings.
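The memory-reading step can be pictured as an attention lookup: the radiology feature queries a bank of stored pathology items and receives a weighted combination standing in for the missing modality. The sketch below is a generic attention read, not the CTMM/MMD modules themselves; all names are illustrative.

```python
import torch
import torch.nn.functional as F

def read_memory(radiology_feat, memory_bank, temperature=0.07):
    """Hypothetical memory read: attend over stored pathology prototypes using the
    radiology feature as a query, and return a weighted combination that stands in
    for the missing pathology features at test time."""
    q = F.normalize(radiology_feat, dim=-1)          # (B, D) queries
    m = F.normalize(memory_bank, dim=-1)             # (M, D) stored pathology items
    weights = F.softmax(q @ m.t() / temperature, dim=-1)
    return weights @ memory_bank                     # (B, D) memory-aware pathology feature

# Toy example: 4 queries against a bank of 10 stored items of dimension 64.
print(read_memory(torch.randn(4, 64), torch.randn(10, 64)).shape)
```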
6. Liu X. Incomplete Multiple Kernel Alignment Maximization for Clustering. IEEE Trans Pattern Anal Mach Intell 2024; 46:1412-1424. [PMID: 34596533] [DOI: 10.1109/tpami.2021.3116948]
Abstract
Multiple kernel alignment (MKA) maximization criterion has been widely applied into multiple kernel clustering (MKC) and many variants have been recently developed. Though demonstrating superior clustering performance in various applications, it is observed that none of them can effectively handle incomplete MKC, where parts or all of the pre-specified base kernel matrices are incomplete. To address this issue, we propose to integrate the imputation of incomplete kernel matrices and MKA maximization for clustering into a unified learning framework. The clustering of MKA maximization guides the imputation of incomplete kernel elements, and the completed kernel matrices are in turn combined to conduct the subsequent MKC. These two procedures are alternately performed until convergence. By this way, the imputation and MKC processes are seamlessly connected, with the aim to achieve better clustering performance. Besides theoretically analyzing the clustering generalization error bound, we empirically evaluate the clustering performance on several multiple kernel learning (MKL) benchmark datasets, and the results indicate the superiority of our algorithm over existing state-of-the-art counterparts. Our codes and data are publicly available at https://xinwangliu.github.io/.
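For readers unfamiliar with the alignment criterion, the standard kernel-alignment quantity that such methods maximize can be computed as below; the alternating imputation scheme itself is specific to the paper and is not reproduced here.

```python
import numpy as np

def kernel_alignment(K, Y):
    """Alignment between a kernel matrix K and the 'ideal' kernel Y @ Y.T induced by
    a cluster-indicator matrix Y:
        A(K, YY^T) = <K, YY^T>_F / (||K||_F * ||YY^T||_F).
    Maximizing this quantity over the partition (and, in the incomplete setting,
    over imputed kernel entries) is the criterion referred to in the abstract."""
    ideal = Y @ Y.T
    num = np.sum(K * ideal)
    den = np.linalg.norm(K) * np.linalg.norm(ideal)
    return num / den

# Toy example: a block-diagonal kernel aligns perfectly with the matching 2-cluster partition.
K = np.block([[np.ones((3, 3)), np.zeros((3, 3))],
              [np.zeros((3, 3)), np.ones((3, 3))]])
Y = np.array([[1, 0]] * 3 + [[0, 1]] * 3, dtype=float)
print(round(kernel_alignment(K, Y), 3))  # 1.0
```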
7. Chen Y, Pan Y, Xia Y, Yuan Y. Disentangle First, Then Distill: A Unified Framework for Missing Modality Imputation and Alzheimer's Disease Diagnosis. IEEE Trans Med Imaging 2023; 42:3566-3578. [PMID: 37450359] [DOI: 10.1109/tmi.2023.3295489]
Abstract
Multi-modality medical data provide complementary information and hence have been widely explored for computer-aided AD diagnosis. However, research is hindered by the unavoidable missing-data problem, i.e., one data modality was not acquired for some subjects due to various reasons. Although the missing data can be imputed using generative models, the imputation process may introduce unrealistic information into the classification process, leading to poor performance. In this paper, we propose the Disentangle First, Then Distill (DFTD) framework for AD diagnosis using incomplete multi-modality medical images. First, we design a region-aware disentanglement module to disentangle each image into an inter-modality relevant representation and an intra-modality specific representation, with emphasis on disease-related regions. To progressively integrate multi-modality knowledge, we then construct an imputation-induced distillation module, in which a lateral inter-modality transition unit is created to impute the representation of the missing modality. The proposed DFTD framework has been evaluated against six existing methods on an ADNI dataset with 1248 subjects. The results show that our method achieves superior performance in both AD-CN classification and MCI-to-AD prediction tasks, substantially outperforming all competing methods.
8. Chen Y, Guo X, Pan Y, Xia Y, Yuan Y. Dynamic feature splicing for few-shot rare disease diagnosis. Med Image Anal 2023; 90:102959. [PMID: 37757644] [DOI: 10.1016/j.media.2023.102959]
Abstract
Annotated images for rare disease diagnosis are extremely hard to collect. Therefore, identifying rare diseases under a few-shot learning (FSL) setting is of practical significance. Existing FSL methods transfer useful and global knowledge from base classes with abundant training samples to enrich the features of novel classes with few training samples, but they still face difficulties when applied to medical images due to complex lesion characteristics and large intra-class variance. In this paper, we propose a dynamic feature splicing (DNFS) framework for few-shot rare disease diagnosis. Under DNFS, both low-level features (i.e., the output of three convolutional blocks) and high-level features (i.e., the output of the last fully connected layer) of novel classes are dynamically enriched. We construct the position coherent DNFS (P-DNFS) module to perform low-level feature splicing, where a lesion-oriented Transformer is designed to detect lesion regions. Novel-class channels are then replaced by similar base-class channels within the detected lesion regions to achieve disease-related feature enrichment. We also devise a semantic coherent DNFS (S-DNFS) module to perform high-level feature splicing. It explores cross-image channel relations and selects base-class channels with semantic consistency for explicit knowledge transfer. Both low-level and high-level feature splicing are performed dynamically and iteratively. Consequently, abundant spliced features are generated for disease diagnosis, leading to a more accurate decision boundary and improved diagnosis performance. Extensive experiments have been conducted on three medical image classification datasets. The results suggest that the proposed DNFS achieves superior performance against state-of-the-art approaches.
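The core splicing operation (replacing novel-class channels with similar base-class channels) can be sketched as follows in PyTorch. This is a simplified stand-in using cosine similarity over whole channels; the paper's position- and semantic-coherent variants are more elaborate.

```python
import torch
import torch.nn.functional as F

def splice_channels(novel_feat, base_feat, num_replace=16):
    """Illustrative channel splicing: for a novel-class feature map, find the base-class
    channels most similar to the corresponding novel channels (cosine similarity over
    the spatial dimension) and copy them in to enrich the few-shot features."""
    c = novel_feat.shape[0]
    nf = F.normalize(novel_feat.view(c, -1), dim=1)      # (C, H*W)
    bf = F.normalize(base_feat.view(c, -1), dim=1)
    sim = (nf * bf).sum(dim=1)                           # per-channel similarity
    idx = sim.topk(num_replace).indices                  # most similar channels
    spliced = novel_feat.clone()
    spliced[idx] = base_feat[idx]
    return spliced

# Toy example: 64-channel 8x8 feature maps for a novel class and a base class.
out = splice_channels(torch.randn(64, 8, 8), torch.randn(64, 8, 8))
print(out.shape)  # torch.Size([64, 8, 8])
```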
Affiliation(s)
- Yuanyuan Chen
- National Engineering Laboratory for Integrated Aero-Space-Ground-Ocean Big Data Application Technology, School of Computer Science and Engineering, Northwestern Polytechnical University, Xi'an 710072, China
- Xiaoqing Guo
- Department of Engineering Science, University of Oxford, Oxford, UK
- Yongsheng Pan
- National Engineering Laboratory for Integrated Aero-Space-Ground-Ocean Big Data Application Technology, School of Computer Science and Engineering, Northwestern Polytechnical University, Xi'an 710072, China
- Yong Xia
- National Engineering Laboratory for Integrated Aero-Space-Ground-Ocean Big Data Application Technology, School of Computer Science and Engineering, Northwestern Polytechnical University, Xi'an 710072, China
- Yixuan Yuan
- Department of Electronic Engineering, Chinese University of Hong Kong, Hong Kong Special Administrative Region of China; CUHK Shenzhen Research Institute, Shenzhen 518172, China
9. Gu Y, Otake Y, Uemura K, Soufi M, Takao M, Talbot H, Okada S, Sugano N, Sato Y. Bone mineral density estimation from a plain X-ray image by learning decomposition into projections of bone-segmented computed tomography. Med Image Anal 2023; 90:102970. [PMID: 37774535] [DOI: 10.1016/j.media.2023.102970]
Abstract
Osteoporosis is a prevalent bone disease that causes fractures of fragile bones, leading to a decline in daily living activities. Dual-energy X-ray absorptiometry (DXA) and quantitative computed tomography (QCT) are highly accurate for diagnosing osteoporosis; however, these modalities require special equipment and scan protocols. To monitor bone health frequently, low-cost, low-dose, and ubiquitously available diagnostic methods are highly desirable. In this study, we aim to estimate bone mineral density (BMD) from a plain X-ray image for opportunistic screening, which is potentially useful for early diagnosis. Existing methods use multi-stage approaches consisting of region-of-interest extraction and simple regression to estimate BMD, which require a large amount of training data. Therefore, we propose an efficient method that learns decomposition into projections of bone-segmented QCT for BMD estimation under limited datasets. The proposed method achieved high accuracy in BMD estimation, with Pearson correlation coefficients of 0.880 and 0.920 for the DXA-measured and QCT-measured BMD estimation tasks, respectively, and a root mean square of the coefficient of variation ranging from 3.27% to 3.79% across four measurements with different poses. Furthermore, we conducted extensive validation experiments, including multi-pose, uncalibrated-CT, and compression experiments, toward actual application in routine clinical practice.
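The two evaluation metrics quoted above are standard and can be reproduced as follows; the numbers in the snippet are purely illustrative placeholders, not values from the paper.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation coefficient between predicted and reference BMD values."""
    return np.corrcoef(x, y)[0, 1]

def rms_coefficient_of_variation(repeated):
    """Root mean square of the per-subject coefficient of variation (CV, in %)
    across repeated measurements, e.g. the same subject imaged in several poses.
    `repeated` has shape (subjects, measurements)."""
    cv = repeated.std(axis=1, ddof=1) / repeated.mean(axis=1) * 100.0
    return np.sqrt(np.mean(cv ** 2))

# Purely illustrative numbers.
pred = np.array([0.71, 0.85, 0.93, 1.02, 0.78])
ref = np.array([0.70, 0.88, 0.95, 1.00, 0.80])
poses = np.array([[0.82, 0.84, 0.80, 0.83],
                  [1.01, 0.98, 1.03, 1.00]])
print(pearson_r(pred, ref), rms_coefficient_of_variation(poses))
```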
Affiliation(s)
- Yi Gu
- Graduate School of Science and Technology, Nara Institute of Science and Technology, Ikoma, Nara 630-0192, Japan; CentraleSupélec, Université Paris-Saclay, Inria, Gif-sur-Yvette 91190, France.
- Yoshito Otake
- Graduate School of Science and Technology, Nara Institute of Science and Technology, Ikoma, Nara 630-0192, Japan
- Keisuke Uemura
- Department of Orthopaedic Medical Engineering, Osaka University Graduate School of Medicine, Suita, Osaka 565-0871, Japan
- Mazen Soufi
- Graduate School of Science and Technology, Nara Institute of Science and Technology, Ikoma, Nara 630-0192, Japan
- Masaki Takao
- Department of Bone and Joint Surgery, Ehime University Graduate School of Medicine, Toon, Ehime 791-0295, Japan
- Hugues Talbot
- CentraleSupélec, Université Paris-Saclay, Inria, Gif-sur-Yvette 91190, France
- Seiji Okada
- Department of Orthopaedics, Osaka University Graduate School of Medicine, Suita, Osaka 565-0871, Japan
- Nobuhiko Sugano
- Department of Orthopaedic Medical Engineering, Osaka University Graduate School of Medicine, Suita, Osaka 565-0871, Japan
- Yoshinobu Sato
- Graduate School of Science and Technology, Nara Institute of Science and Technology, Ikoma, Nara 630-0192, Japan
10. Hou W, Lin C, Yu L, Qin J, Yu R, Wang L. Hybrid Graph Convolutional Network With Online Masked Autoencoder for Robust Multimodal Cancer Survival Prediction. IEEE Trans Med Imaging 2023; 42:2462-2473. [PMID: 37028064] [DOI: 10.1109/tmi.2023.3253760]
Abstract
Cancer survival prediction requires exploiting related multimodal information (e.g., pathological, clinical, and genomic features), and it is even more challenging in clinical practice due to the incompleteness of patients' multimodal data. Furthermore, existing methods lack sufficient intra- and inter-modal interactions and suffer from significant performance degradation caused by missing modalities. This manuscript proposes a novel hybrid graph convolutional network, entitled HGCN, which is equipped with an online masked autoencoder paradigm for robust multimodal cancer survival prediction. In particular, we pioneer modeling a patient's multimodal data as flexible and interpretable multimodal graphs with modality-specific preprocessing. HGCN integrates the advantages of graph convolutional networks (GCNs) and a hypergraph convolutional network (HCN) through node message passing and a hyperedge mixing mechanism to facilitate intra-modal and inter-modal interactions between multimodal graphs. Compared with prior methods, HGCN substantially increases the potential of multimodal data to yield reliable predictions of a patient's survival risk. Most importantly, to compensate for missing patient modalities in clinical scenarios, we incorporate an online masked autoencoder paradigm into HGCN, which can effectively capture the intrinsic dependence between modalities and seamlessly generate missing hyperedges for model inference. Extensive experiments and analysis on six cancer cohorts from TCGA show that our method significantly outperforms state-of-the-art methods in both complete and missing-modality settings. Our codes are made available at https://github.com/lin-lcx/HGCN.
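The masked-autoencoder idea for missing modalities can be pictured with a much simpler module than the paper's graph-based one: randomly mask a modality embedding during training and learn to reconstruct it from the others, so a genuinely missing modality can be imputed at inference. The sketch below is generic and hypothetical, not the HGCN implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ModalityMaskedAE(nn.Module):
    """Sketch of modality masking: one modality embedding is replaced by a learnable
    mask token and reconstructed from the remaining modalities."""

    def __init__(self, num_modalities: int, dim: int):
        super().__init__()
        self.mask_token = nn.Parameter(torch.zeros(dim))
        self.encoder = nn.Linear(num_modalities * dim, dim)
        self.decoder = nn.Linear(dim, num_modalities * dim)

    def forward(self, embeddings: torch.Tensor, masked_idx: int):
        # embeddings: (B, num_modalities, dim)
        masked = embeddings.clone()
        masked[:, masked_idx] = self.mask_token
        latent = self.encoder(masked.flatten(1))
        recon = self.decoder(latent).view_as(embeddings)
        loss = F.mse_loss(recon[:, masked_idx], embeddings[:, masked_idx])
        return recon, loss

# Toy run: 3 modalities (e.g., pathology, clinical, genomic), 32-d embeddings, mask modality 1.
mae = ModalityMaskedAE(num_modalities=3, dim=32)
recon, loss = mae(torch.randn(4, 3, 32), masked_idx=1)
print(recon.shape, float(loss))
```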
11. Steyaert S, Pizurica M, Nagaraj D, Khandelwal P, Hernandez-Boussard T, Gentles AJ, Gevaert O. Multimodal data fusion for cancer biomarker discovery with deep learning. Nat Mach Intell 2023; 5:351-362. [PMID: 37693852] [PMCID: PMC10484010] [DOI: 10.1038/s42256-023-00633-5]
Abstract
Technological advances now make it possible to study a patient from multiple angles with high-dimensional, high-throughput, multi-scale biomedical data. In oncology, massive amounts of data are being generated, ranging from molecular and histopathology data to radiology and clinical records. The introduction of deep learning has significantly advanced the analysis of biomedical data. However, most approaches focus on single data modalities, leading to slow progress in methods for integrating complementary data types. Developing effective multimodal fusion approaches is becoming increasingly important, as a single modality may not be consistent and sufficient to capture the heterogeneity of complex diseases, tailor medical care, and improve personalised medicine. Many initiatives now focus on integrating these disparate modalities to unravel the biological processes involved in multifactorial diseases such as cancer. However, many obstacles remain, including the lack of usable data and of methods for clinical validation and interpretation. Here, we cover these current challenges and reflect on opportunities through deep learning to tackle data sparsity and scarcity, multimodal interpretability, and standardisation of datasets.
Affiliation(s)
- Sandra Steyaert
- Stanford Center for Biomedical Informatics Research (BMIR), Department of Medicine, Stanford University
- Marija Pizurica
- Stanford Center for Biomedical Informatics Research (BMIR), Department of Medicine, Stanford University
- Tina Hernandez-Boussard
- Stanford Center for Biomedical Informatics Research (BMIR), Department of Medicine, Stanford University
- Department of Biomedical Data Science, Stanford University
- Andrew J Gentles
- Stanford Center for Biomedical Informatics Research (BMIR), Department of Medicine, Stanford University
- Department of Biomedical Data Science, Stanford University
- Olivier Gevaert
- Stanford Center for Biomedical Informatics Research (BMIR), Department of Medicine, Stanford University
- Department of Biomedical Data Science, Stanford University
12. El-Sappagh S, Alonso-Moral JM, Abuhmed T, Ali F, Bugarín-Diz A. Trustworthy artificial intelligence in Alzheimer's disease: state of the art, opportunities, and challenges. Artif Intell Rev 2023. [DOI: 10.1007/s10462-023-10415-5]
13. Gong W, Bai S, Zheng YQ, Smith SM, Beckmann CF. Supervised Phenotype Discovery From Multimodal Brain Imaging. IEEE Trans Med Imaging 2023; 42:834-849. [PMID: 36318559] [DOI: 10.1109/tmi.2022.3218720]
Abstract
Data-driven discovery of image-derived phenotypes (IDPs) from large-scale multimodal brain imaging data has enormous potential for neuroscientific and clinical research by linking IDPs to subjects' demographic, behavioural, clinical, and cognitive measures (i.e., non-imaging derived phenotypes, or nIDPs). However, current approaches are primarily unsupervised and make no use of the information in nIDPs. In this paper, we propose a semi-supervised, multimodal, multi-task fusion approach, termed SuperBigFLICA, for IDP discovery, which simultaneously integrates information from multiple imaging modalities as well as multiple nIDPs. SuperBigFLICA is computationally efficient and largely avoids the need for parameter tuning. Using the UK Biobank brain imaging dataset, with around 40,000 subjects and 47 modalities, along with more than 17,000 nIDPs, we show that SuperBigFLICA enhances the prediction power of nIDPs, benchmarked against IDPs derived by conventional expert-knowledge and unsupervised-learning approaches (with average nIDP prediction accuracy improvements of up to 46%). It also enables the learning of generic imaging features that can predict new nIDPs. Further empirical analysis of the SuperBigFLICA algorithm demonstrates its robustness across different prediction tasks and its ability to derive biologically meaningful IDPs for predicting health outcomes and cognitive nIDPs, such as fluid intelligence and hypertension.
14. Sun Y, Li Y, Zhang F, Zhao H, Liu H, Wang N, Li H. A deep network using coarse clinical prior for myopic maculopathy grading. Comput Biol Med 2023; 154:106556. [PMID: 36682177] [DOI: 10.1016/j.compbiomed.2023.106556]
Abstract
Pathological myopia (PM) is a globally prevalent eye disease and one of the main causes of blindness. In long-term clinical observation, myopic maculopathy is a main criterion for assessing PM severity. Grading myopic maculopathy provides a severity and progression prediction for PM, supporting timely treatment and helping to prevent myopia-related blindness. In this paper, we propose a feature fusion framework that uses the tessellated fundus and the brightest region in fundus images as prior knowledge. The proposed framework consists of a prior knowledge extraction module and a feature fusion module. The prior knowledge extraction module uses traditional image processing methods to extract priors that indicate coarse lesion positions in fundus images. The priors, namely the tessellated fundus and the brightest region, are then integrated into the deep learning network as global and local constraints, respectively, by the feature fusion module. In addition, a rank loss is designed to increase the continuity of the classification score. We collected a private color fundus dataset from Beijing TongRen Hospital containing 714 clinical images. The dataset covers all five grades of myopic maculopathy, labeled by experienced ophthalmologists. Our framework achieves a five-grade accuracy of 0.8921 on this private dataset. The Pathological Myopia (PALM) dataset is used for comparison with related algorithms; trained with 400 images, our framework achieves an AUC of 0.9981 for two-class grading. The results show that our framework achieves good performance for myopic maculopathy grading.
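The abstract does not spell out the rank loss; one plausible form for ordinal grading is a pairwise margin ranking term that forces samples with higher grades to receive higher scores. The sketch below is an assumption-labeled illustration of that idea, not the paper's exact loss.

```python
import torch
import torch.nn.functional as F

def pairwise_rank_loss(scores, grades, margin=0.5):
    """Hypothetical rank loss for ordinal grading: for every pair of samples with
    different grades, the sample with the higher grade should receive a score larger
    by at least `margin`, encouraging scores to vary monotonically with severity."""
    diff_score = scores.unsqueeze(0) - scores.unsqueeze(1)   # s_j - s_i
    diff_grade = grades.unsqueeze(0) - grades.unsqueeze(1)   # g_j - g_i
    sign = torch.sign(diff_grade.float())
    losses = F.relu(margin - sign * diff_score)
    mask = diff_grade != 0                                   # only pairs with different grades
    return losses[mask].mean()

# Toy example: scalar severity scores for five fundus images with grades 0-4.
scores = torch.tensor([0.1, 0.9, 1.7, 2.2, 3.5], requires_grad=True)
grades = torch.tensor([0, 1, 2, 3, 4])
print(pairwise_rank_loss(scores, grades))
```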
Affiliation(s)
- Yun Sun
- Beijing Institute of Technology, No. 5, Zhong Guan Cun South Street, Beijing, 100081, China
- Yu Li
- Beijing Tongren Hospital, Capital Medical University, No. 2, Chongwenmennei Street, Beijing, 100730, China
- Fengju Zhang
- Beijing Tongren Hospital, Capital Medical University, No. 2, Chongwenmennei Street, Beijing, 100730, China
- He Zhao
- Beijing Institute of Technology, No. 5, Zhong Guan Cun South Street, Beijing, 100081, China
- Hanruo Liu
- Beijing Institute of Technology, No. 5, Zhong Guan Cun South Street, Beijing, 100081, China; Beijing Tongren Hospital, Capital Medical University, No. 2, Chongwenmennei Street, Beijing, 100730, China
- Ningli Wang
- Beijing Tongren Hospital, Capital Medical University, No. 2, Chongwenmennei Street, Beijing, 100730, China
- Huiqi Li
- Beijing Institute of Technology, No. 5, Zhong Guan Cun South Street, Beijing, 100081, China
15. Liu F, Yuan S, Li W, Xu Q, Sheng B. Patch-based deep multi-modal learning framework for Alzheimer's disease diagnosis using multi-view neuroimaging. Biomed Signal Process Control 2023. [DOI: 10.1016/j.bspc.2022.104400]
16. Xiao Z, Zhang X, Liu Y, Geng L, Wu J, Wang W, Zhang F. RNN-combined graph convolutional network with multi-feature fusion for tuberculosis cavity segmentation. Signal Image Video Process 2023; 17:2297-2303. [PMID: 36624826] [PMCID: PMC9813881] [DOI: 10.1007/s11760-022-02446-2]
Abstract
Tuberculosis is a common infectious disease worldwide. Tuberculosis cavities are common and important imaging signs of the disease. Accurate segmentation of tuberculosis cavities has practical significance for indicating lesion activity and guiding clinical treatment. However, this task faces challenges such as blurred boundaries, irregular shapes, variable lesion location and size, and structures on computed tomography (CT) that resemble other lung diseases or tissues. To overcome these problems, we propose a novel RNN-combined graph convolutional network (R2GCN) method, which integrates bidirectional recurrent network (BRN) and graph convolution network (GCN) modules. First, feature extraction is performed on the input image by VGG-16 or ResNet-50 to obtain a feature map, which is then used as the input of the two modules. On the one hand, we adopt the BRN to retrieve contextual information from the feature map. On the other hand, we take the vector at each location of the feature map as an input node and utilize the GCN to extract node topology information. Finally, the two types of features are fused. Our strategy not only makes full use of node correlations and differences, but also obtains more precise segmentation boundaries. Extensive experiments on CT images of patients with cavitary tuberculosis show that our proposed method achieves better segmentation accuracy than the compared segmentation methods. Our method can be used for the diagnosis of tuberculosis cavities and the evaluation of their treatment.
Affiliation(s)
- Zhitao Xiao
- School of Life Sciences, Tiangong University, Tianjin, 300387 China
- Tianjin Key Laboratory of Optoelectronic Detection Technology and Systems, Tianjin, 300387 China
- Xiaomeng Zhang
- School of Artificial Intelligence, Tiangong University, Tianjin, 300387 China
- Yanbei Liu
- School of Life Sciences, Tiangong University, Tianjin, 300387 China
- Lei Geng
- School of Life Sciences, Tiangong University, Tianjin, 300387 China
- Jun Wu
- School of Electronic and Information Engineering, Tiangong University, Tianjin, 300387 China
- Wen Wang
- School of Life Sciences, Tiangong University, Tianjin, 300387 China
- Fang Zhang
- School of Life Sciences, Tiangong University, Tianjin, 300387 China
17. Zhang S, Zhang J, Tian B, Lukasiewicz T, Xu Z. Multi-modal contrastive mutual learning and pseudo-label re-learning for semi-supervised medical image segmentation. Med Image Anal 2023; 83:102656. [PMID: 36327656] [DOI: 10.1016/j.media.2022.102656]
Abstract
Semi-supervised learning has great potential for medical image segmentation tasks with few labeled data, but most existing methods consider only single-modal data. The complementary characteristics of multi-modal data can improve semi-supervised segmentation performance for each image modality. However, a shortcoming of most existing multi-modal solutions is that, because the processing models for the different modalities are highly coupled, multi-modal data are required not only in the training stage but also in the inference stage, which limits their use in clinical practice. Consequently, we propose a semi-supervised contrastive mutual learning (Semi-CML) segmentation framework, where a novel area-similarity contrastive (ASC) loss leverages cross-modal information and prediction consistency between different modalities to conduct contrastive mutual learning. Although Semi-CML can improve the segmentation performance of both modalities simultaneously, there is a performance gap between the two modalities, i.e., one modality's segmentation performance is usually better than the other's. Therefore, we further develop a soft pseudo-label re-learning (PReL) scheme to remedy this gap. We conducted experiments on two public multi-modal datasets. The results show that Semi-CML with PReL greatly outperforms state-of-the-art semi-supervised segmentation methods and achieves similar (and sometimes even better) performance to fully supervised segmentation with 100% labeled data, while reducing the cost of data annotation by 90%. We also conducted ablation studies to evaluate the effectiveness of the ASC loss and the PReL module.
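The ASC loss itself is specific to the paper; the sketch below only illustrates the underlying cross-modal contrastive idea, where features of the same case from the two modalities form positive pairs and other cases in the batch act as negatives (a standard InfoNCE formulation, assumed here as a simplified stand-in).

```python
import torch
import torch.nn.functional as F

def cross_modal_contrastive_loss(feat_mod_a, feat_mod_b, temperature=0.1):
    """Simplified contrastive mutual-learning term (not the paper's exact ASC loss):
    paired features from two modalities of the same case are pulled together while
    features of other cases in the batch are pushed apart."""
    a = F.normalize(feat_mod_a, dim=1)
    b = F.normalize(feat_mod_b, dim=1)
    logits = a @ b.t() / temperature                 # (B, B) cross-modal similarities
    targets = torch.arange(a.size(0))                # matching indices are positives
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))

# Toy example: 8 paired feature vectors (e.g., pooled per-region embeddings) of dim 128.
print(cross_modal_contrastive_loss(torch.randn(8, 128), torch.randn(8, 128)))
```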
Affiliation(s)
- Shuo Zhang
- State Key Laboratory of Reliability and Intelligence of Electrical Equipment, School of Health Sciences and Biomedical Engineering, Hebei University of Technology, China; Tianjin Key Laboratory of Bioelectromagnetic Technology and Intelligent Health, School of Health Sciences and Biomedical Engineering, Hebei University of Technology, China
- Jiaojiao Zhang
- State Key Laboratory of Reliability and Intelligence of Electrical Equipment, School of Health Sciences and Biomedical Engineering, Hebei University of Technology, China; Tianjin Key Laboratory of Bioelectromagnetic Technology and Intelligent Health, School of Health Sciences and Biomedical Engineering, Hebei University of Technology, China
- Biao Tian
- State Key Laboratory of Reliability and Intelligence of Electrical Equipment, School of Health Sciences and Biomedical Engineering, Hebei University of Technology, China; Tianjin Key Laboratory of Bioelectromagnetic Technology and Intelligent Health, School of Health Sciences and Biomedical Engineering, Hebei University of Technology, China
- Zhenghua Xu
- State Key Laboratory of Reliability and Intelligence of Electrical Equipment, School of Health Sciences and Biomedical Engineering, Hebei University of Technology, China; Tianjin Key Laboratory of Bioelectromagnetic Technology and Intelligent Health, School of Health Sciences and Biomedical Engineering, Hebei University of Technology, China
18. Xu C, Liu H, Guan Z, Wu X, Tan J, Ling B. Adversarial Incomplete Multiview Subspace Clustering Networks. IEEE Trans Cybern 2022; 52:10490-10503. [PMID: 33750730] [DOI: 10.1109/tcyb.2021.3062830]
Abstract
Multiview clustering aims to leverage information from multiple views to improve the clustering performance. Most previous works assumed that each view has complete data. However, in real-world datasets, it is often the case that a view may contain some missing data, resulting in the problem of incomplete multiview clustering (IMC). Previous approaches to this problem have at least one of the following drawbacks: 1) employing shallow models, which cannot well handle the dependence and discrepancy among different views; 2) ignoring the hidden information of the missing data; and 3) being dedicated to the two-view case. To eliminate all these drawbacks, in this work, we present the adversarial IMC (AIMC) framework. In particular, AIMC seeks the common latent representation of multiview data for reconstructing raw data and inferring missing data. The elementwise reconstruction and the generative adversarial network are integrated to evaluate the reconstruction. They aim to capture the overall structure and get a deeper semantic understanding, respectively. Moreover, the clustering loss is designed to obtain a better clustering structure. We explore two variants of AIMC, namely: 1) autoencoder-based AIMC (AAIMC) and 2) generalized AIMC (GAIMC), with different strategies to obtain the multiview common representation. Experiments conducted on six real-world datasets show that AAIMC and GAIMC perform well and outperform the baseline methods.
19. Zhang Y, Zhang H, Xiao L, Bai Y, Calhoun VD, Wang YP. Multi-Modal Imaging Genetics Data Fusion via a Hypergraph-Based Manifold Regularization: Application to Schizophrenia Study. IEEE Trans Med Imaging 2022; 41:2263-2272. [PMID: 35320094] [PMCID: PMC9661879] [DOI: 10.1109/tmi.2022.3161828]
Abstract
Recent studies show that multi-modal data fusion techniques combine information from diverse sources for comprehensive diagnosis and prognosis of complex brain disorders, often achieving better accuracy than single-modality approaches. However, many existing data fusion methods extract features from homogeneous networks, ignoring the heterogeneous structural information among multiple modalities. To this end, we propose a Hypergraph-based Multi-modal data Fusion algorithm, namely HMF. Specifically, we first generate a hypergraph similarity matrix to represent the high-order relationships among subjects and then enforce a regularization term based on both the inter- and intra-modality relationships of the subjects. Finally, we apply HMF to integrate imaging and genetics datasets. The proposed method is validated on both synthetic data and real samples from a schizophrenia study. Results show that our algorithm outperforms several competing methods and reveals significant interactions among risk genes, environmental factors, and abnormal brain regions.
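The hypergraph-based regularization referred to above is typically built on the normalized hypergraph Laplacian; a minimal NumPy sketch of that standard construction (Zhou et al.'s formulation, used here only as an illustration of the kind of penalty involved) follows.

```python
import numpy as np

def hypergraph_laplacian(H, w=None):
    """Normalized hypergraph Laplacian L = I - Dv^{-1/2} H W De^{-1} H^T Dv^{-1/2},
    where H is the vertex-hyperedge incidence matrix, W the hyperedge weights,
    and Dv/De the vertex and hyperedge degree matrices."""
    n, m = H.shape
    w = np.ones(m) if w is None else w
    dv = H @ w                      # vertex degrees
    de = H.sum(axis=0)              # hyperedge degrees
    Dv_inv_sqrt = np.diag(1.0 / np.sqrt(dv))
    theta = Dv_inv_sqrt @ H @ np.diag(w) @ np.diag(1.0 / de) @ H.T @ Dv_inv_sqrt
    return np.eye(n) - theta

def manifold_regularizer(F_repr, L):
    """Regularization term trace(F^T L F): small when subjects connected by the same
    hyperedges (i.e., with similar multi-modal profiles) have similar representations."""
    return np.trace(F_repr.T @ L @ F_repr)

# Toy example: 4 subjects, 2 hyperedges, 3-dimensional fused representation.
H = np.array([[1, 0], [1, 0], [0, 1], [1, 1]], dtype=float)
F_repr = np.random.randn(4, 3)
print(manifold_regularizer(F_repr, hypergraph_laplacian(H)))
```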
20. Xu L, Wu H, He C, Wang J, Zhang C, Nie F, Chen L. Multi-modal sequence learning for Alzheimer's disease progression prediction with incomplete variable-length longitudinal data. Med Image Anal 2022; 82:102643. [DOI: 10.1016/j.media.2022.102643]
21. Zhang H, Chen X, Zhang E, Wang L. Incomplete Multi-view Learning via Consensus Graph Completion. Neural Process Lett 2022. [DOI: 10.1007/s11063-022-10973-9]