1
Wang P, Zhang J. Prediction of Composite Clinical Outcomes for Childhood Neuroblastoma Using Multi-Omics Data and Machine Learning. Int J Mol Sci 2024; 26:136. [PMID: 39795994 PMCID: PMC11720239 DOI: 10.3390/ijms26010136] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/22/2024] [Revised: 05/20/2024] [Accepted: 05/22/2024] [Indexed: 01/13/2025] Open
Abstract
Neuroblastoma is a common malignant tumor in childhood that seriously endangers the health and lives of children, making it essential to find effective prognostic markers that accurately predict clinical outcomes. The development of high-throughput technology in the biomedical field has made it possible to obtain multi-omics data, whose integration can compensate for missing or unreliable information in a single data source. In this study, we integrated clinical data and two types of omics data, i.e., gene expression and DNA methylation data, to study the prognosis of neuroblastoma. Since the features in omics data are redundant, it is crucial to conduct feature selection on them. We proposed a two-step feature selection (TSFS) method to quickly and accurately select the optimal features, where the first step selects candidate features and the second step removes redundant features among them using our proposed maximal association coefficient (MAC). Our goal is to predict composite clinical outcomes for neuroblastoma patients, i.e., their survival time and vital status at the last follow-up, which were validated to be two inter-correlated tasks. We conducted a series of experiments and evaluated the results using accuracy and AUC (area under the ROC curve), which indicated that combining the integration of the three types of data, our proposed TSFS method, and a multi-task learning method can synergistically improve the reliability and accuracy of the prediction models.
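The two-step selection described in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not define MAC, so absolute Pearson correlation stands in for it, and the names `two_step_select`, `n_candidates`, and `redundancy_threshold`, as well as the toy data, are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def two_step_select(features, target, n_candidates, redundancy_threshold):
    """features: dict mapping feature name -> list of values; target: list of values."""
    # Step 1: keep the n_candidates features most associated with the target.
    ranked = sorted(
        features,
        key=lambda name: abs(pearson(features[name], target)),
        reverse=True,
    )
    candidates = ranked[:n_candidates]

    # Step 2: greedily drop any candidate that is too strongly associated
    # with a feature that has already been kept.
    # ASSUMPTION: |Pearson r| is a stand-in for the paper's MAC.
    selected = []
    for name in candidates:
        if all(
            abs(pearson(features[name], features[kept])) < redundancy_threshold
            for kept in selected
        ):
            selected.append(name)
    return selected


# Toy data (hypothetical): one duplicated feature, one distinct feature, one noise feature.
features = {
    "gene_a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "gene_a_copy": [2.0, 4.0, 6.0, 8.0, 10.0],  # perfectly correlated with gene_a
    "cpg_b": [5.0, 3.0, 4.0, 1.0, 2.0],
    "noise": [1.0, -1.0, 1.0, -1.0, 1.0],
}
target = [1.1, 1.9, 3.2, 3.9, 5.1]
selected = two_step_select(features, target, n_candidates=3, redundancy_threshold=0.9)
print(selected)
```

In this toy run the first step discards the noise feature and the second step keeps only one of the two perfectly correlated gene features, which is the behaviour the abstract attributes to the redundancy-removal step.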
Affiliation(s)
- Junying Zhang
- School of Computer Science and Technology, Xidian University, Xi’an 710126, China
2
Jin Z, Gong J, Deng M, Zheng P, Li G. Deep Learning-Based Diagnosis Algorithm for Alzheimer's Disease. J Imaging 2024; 10:333. [PMID: 39728230 PMCID: PMC11728444 DOI: 10.3390/jimaging10120333] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/08/2024] [Revised: 12/03/2024] [Accepted: 12/16/2024] [Indexed: 12/28/2024] Open
Abstract
Alzheimer's disease (AD), a degenerative condition affecting the central nervous system, has witnessed a notable rise in prevalence along with the increasing aging population. In recent years, the integration of cutting-edge medical imaging technologies with forefront theories in artificial intelligence has dramatically enhanced the efficiency of identifying and diagnosing brain diseases such as AD. This paper presents an innovative two-stage automatic auxiliary diagnosis algorithm for AD, based on an improved 3D DenseNet segmentation model and an improved MobileNetV3 classification model applied to brain MR images. In the segmentation network, the backbone network was simplified, the activation function and loss function were replaced, and the 3D GAM attention mechanism was introduced. In the classification network, firstly, the CA attention mechanism was added to enhance the model's ability to capture positional information of disease features; secondly, dilated convolutions were introduced to extract richer features from the input feature maps; and finally, the fully connected layer of MobileNetV3 was modified and the idea of transfer learning was adopted to improve the model's feature extraction capability. The results of the study showed that the proposed approach achieved classification accuracies of 97.85% for AD/NC, 95.31% for MCI/NC, 93.96% for AD/MCI, and 92.63% for AD/MCI/NC, respectively, which were 3.1, 2.8, 2.6, and 2.8 percentage points higher than before the improvement. Comparative and ablation experiments have validated the proposed classification performance of this method, demonstrating its capability to facilitate an accurate and efficient automated auxiliary diagnosis of AD, offering a deep learning-based solution for it.
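As a rough illustration of why dilated convolutions can capture wider context, the sketch below applies the same three-weight kernel with and without dilation. It is a simplified 1D example under stated assumptions, not the paper's network: the abstract refers to feature maps in a MobileNetV3-based classifier, while `dilated_conv1d` is a hypothetical helper that uses the cross-correlation convention with no padding.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """Valid (no padding) 1D cross-correlation with gaps of `dilation` between taps."""
    # Number of input positions covered by one application of the kernel.
    span = (len(kernel) - 1) * dilation + 1
    output = []
    for start in range(len(signal) - span + 1):
        output.append(
            sum(w * signal[start + k * dilation] for k, w in enumerate(kernel))
        )
    return output


signal = [1, 2, 3, 4, 5, 6, 7]
kernel = [1, 0, -1]  # same three weights in both calls

dense = dilated_conv1d(signal, kernel, dilation=1)    # each output sees 3 inputs
dilated = dilated_conv1d(signal, kernel, dilation=2)  # each output sees a span of 5 inputs
print(dense, dilated)
```

With `dilation=2` the kernel still has three weights but each output depends on inputs spread over five positions, so the receptive field grows without adding parameters.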
Affiliation(s)
- Minghui Deng
- College of Electrical and Information, Northeast Agricultural University, 600 Changjiang Road, Harbin 150038, China; (Z.J.); (J.G.); (P.Z.); (G.L.)
3
Iqbal MS, Belal Bin Heyat M, Parveen S, Ammar Bin Hayat M, Roshanzamir M, Alizadehsani R, Akhtar F, Sayeed E, Hussain S, Hussein HS, Sawan M. Progress and trends in neurological disorders research based on deep learning. Comput Med Imaging Graph 2024; 116:102400. [PMID: 38851079 DOI: 10.1016/j.compmedimag.2024.102400] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/02/2024] [Revised: 05/07/2024] [Accepted: 05/13/2024] [Indexed: 06/10/2024]
Abstract
In recent years, deep learning (DL) has emerged as a powerful tool in clinical imaging, offering unprecedented opportunities for the diagnosis and treatment of neurological disorders (NDs). This comprehensive review explores the multifaceted role of DL techniques in leveraging vast datasets to advance our understanding of NDs and improve clinical outcomes. Beginning with a systematic literature review, we delve into the utilization of DL, particularly focusing on multimodal neuroimaging data analysis, a domain that has witnessed rapid progress and garnered significant scientific interest. Our study categorizes and critically analyses numerous DL models, including Convolutional Neural Networks (CNNs), LSTM-CNN, GAN, and VGG, to understand their performance across different types of neurological diseases. Through detailed analysis, we identify key benchmarks and datasets utilized in training and testing DL models, shedding light on the challenges and opportunities in clinical neuroimaging research. Moreover, we discuss the effectiveness of DL in real-world clinical scenarios, emphasizing its potential to revolutionize ND diagnosis and therapy. By synthesizing existing literature and describing future directions, this review not only provides insights into the current state of DL applications in ND analysis but also paves the way for the development of more efficient and accessible DL techniques. Finally, our findings underscore the transformative impact of DL in reshaping the landscape of clinical neuroimaging, offering hope for enhanced patient care and groundbreaking discoveries in the field of neurology. This review paper is beneficial for neuropathologists and new researchers in this field.
Affiliation(s)
- Muhammad Shahid Iqbal
- Department of Computer Science and Information Technology, Women University of Azad Jammu & Kashmir, Bagh, Pakistan.
- Md Belal Bin Heyat
- CenBRAIN Neurotech Center of Excellence, School of Engineering, Westlake University, Hangzhou, Zhejiang, China.
- Saba Parveen
- College of Electronics and Information Engineering, Shenzhen University, Shenzhen 518060, China.
- Mohamad Roshanzamir
- Department of Computer Engineering, Faculty of Engineering, Fasa University, Fasa, Iran.
- Roohallah Alizadehsani
- Institute for Intelligent Systems Research and Innovation, Deakin University, VIC 3216, Australia.
- Faijan Akhtar
- School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, China.
- Eram Sayeed
- Kisan Inter College, Dhaurahara, Kushinagar, India.
- Sadiq Hussain
- Department of Examination, Dibrugarh University, Assam 786004, India.
- Hany S Hussein
- Electrical Engineering Department, Faculty of Engineering, King Khalid University, Abha 61411, Saudi Arabia; Electrical Engineering Department, Faculty of Engineering, Aswan University, Aswan 81528, Egypt.
- Mohamad Sawan
- CenBRAIN Neurotech Center of Excellence, School of Engineering, Westlake University, Hangzhou, Zhejiang, China.
4
Tchetchenian A, Zekelman L, Chen Y, Rushmore J, Zhang F, Yeterian EH, Makris N, Rathi Y, Meijering E, Song Y, O'Donnell LJ. Deep multimodal saliency parcellation of cerebellar pathways: Linking microstructure and individual function through explainable multitask learning. Hum Brain Mapp 2024; 45:e70008. [PMID: 39185598 PMCID: PMC11345609 DOI: 10.1002/hbm.70008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/29/2024] [Revised: 07/18/2024] [Accepted: 08/10/2024] [Indexed: 08/27/2024] Open
Abstract
Parcellation of human cerebellar pathways is essential for advancing our understanding of the human brain. Existing diffusion magnetic resonance imaging tractography parcellation methods have been successful in defining major cerebellar fibre tracts, while relying solely on fibre tract structure. However, each fibre tract may relay information related to multiple cognitive and motor functions of the cerebellum. Hence, it may be beneficial for parcellation to consider the potential importance of the fibre tracts for individual motor and cognitive functional performance measures. In this work, we propose a multimodal data-driven method for cerebellar pathway parcellation, which incorporates both measures of microstructure and connectivity, and measures of individual functional performance. Our method involves first training a multitask deep network to predict various cognitive and motor measures from a set of fibre tract structural features. The importance of each structural feature for predicting each functional measure is then computed, resulting in a set of structure-function saliency values that are clustered to parcellate cerebellar pathways. We refer to our method as Deep Multimodal Saliency Parcellation (DeepMSP), as it computes the saliency of structural measures for predicting cognitive and motor functional performance, with these saliencies being applied to the task of parcellation. Applying DeepMSP to a large-scale dataset from the Human Connectome Project Young Adult study (n = 1065), we found that it was feasible to identify multiple cerebellar pathway parcels with unique structure-function saliency patterns that were stable across training folds. We thoroughly experimented with all stages of the DeepMSP pipeline, including network selection, structure-function saliency representation, clustering algorithm, and cluster count. 
We found that a 1D convolutional neural network architecture and a transformer network architecture both performed comparably for the multitask prediction of endurance, strength, reading decoding, and vocabulary comprehension, with both architectures outperforming a fully connected network architecture. Quantitative experiments demonstrated that a proposed low-dimensional saliency representation with an explicit measure of motor versus cognitive category bias achieved the best parcellation results, while a parcel count of four was most successful according to standard cluster quality metrics. Our results suggested that motor and cognitive saliencies are distributed across the cerebellar white matter pathways. Inspection of the final k = 4 parcellation revealed that the highest-saliency parcel was most salient for the prediction of both motor and cognitive performance scores and included parts of the middle and superior cerebellar peduncles. Our proposed saliency-based parcellation framework, DeepMSP, enables multimodal, data-driven tractography parcellation. Through utilising both structural features and functional performance measures, this parcellation strategy may have the potential to enhance the study of structure-function relationships of the cerebellar pathways.
Affiliation(s)
- Ari Tchetchenian
- Biomedical Image Computing Group, School of Computer Science and Engineering, University of New South Wales (UNSW), Sydney, New South Wales, Australia
- Leo Zekelman
- Department of Radiology, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, USA
- Harvard University, Cambridge, Massachusetts, USA
- Yuqian Chen
- Department of Radiology, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, USA
- Jarrett Rushmore
- Department of Psychiatry, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts, USA
- Department of Neurology, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts, USA
- Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts, USA
- Department of Anatomy and Neurobiology, Boston University School of Medicine, Boston, Massachusetts, USA
- Fan Zhang
- School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu, China
- Nikos Makris
- Department of Psychiatry, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts, USA
- Department of Neurology, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts, USA
- Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts, USA
- Department of Psychiatry, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, USA
- Yogesh Rathi
- Department of Radiology, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, USA
- Department of Psychiatry, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, USA
- Erik Meijering
- Biomedical Image Computing Group, School of Computer Science and Engineering, University of New South Wales (UNSW), Sydney, New South Wales, Australia
- Yang Song
- Biomedical Image Computing Group, School of Computer Science and Engineering, University of New South Wales (UNSW), Sydney, New South Wales, Australia
- Lauren J. O'Donnell
- Department of Radiology, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, USA
5
Han T, Peng Y, Du Y, Li Y, Wang Y, Sun W, Cui L, Peng Q. Mining Alzheimer's disease clinical data: reducing effects of natural aging for predicting progression and identifying subtypes. Front Neurosci 2024; 18:1388391. [PMID: 39206114 PMCID: PMC11351280 DOI: 10.3389/fnins.2024.1388391] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/22/2024] [Accepted: 07/26/2024] [Indexed: 09/04/2024] Open
Abstract
Introduction: Because Alzheimer's disease (AD) has significant heterogeneity in encephalatrophy and clinical manifestations, AD research faces two critical challenges: eliminating the impact of natural aging and extracting valuable clinical data for patients with AD. Methods: This study attempted to address these challenges by developing a novel machine-learning model called tensorized contrastive principal component analysis (T-cPCA). The objectives of this study were to predict AD progression and identify clinical subtypes while minimizing the influence of natural aging. Results: We leveraged a clinical variable space of 872 features, including almost all AD clinical examinations, which is the most comprehensive AD feature description in current research. T-cPCA yielded the highest accuracy in predicting AD progression by effectively minimizing the confounding effects of natural aging. Discussion: The representative features and pathogenic circuits of the four primary AD clinical subtypes were discovered. As confirmed by clinical doctors at Tangdu Hospital, the plaque (18F-AV45) distribution of typical patients in the four clinical subtypes is consistent with the representative brain regions found in the four AD subtypes, which further offers novel insights into the underlying mechanisms of AD pathogenesis.
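For orientation, standard (non-tensorized) contrastive PCA can be written as the optimization below. This is the textbook cPCA objective, not the T-cPCA formulation, which the abstract does not give. The symbols $C_{\text{AD}}$, $C_{\text{aging}}$, and $\alpha$ are notation assumed here, with the target covariance taken from AD patients and the background covariance taken from data that reflects natural aging only.

```latex
v^{*} \;=\; \arg\max_{\lVert v \rVert_2 = 1} \; v^{\top}\!\left(C_{\text{AD}} - \alpha\, C_{\text{aging}}\right) v,
\qquad \alpha \ge 0
```

The maximizers are the leading eigenvectors of $C_{\text{AD}} - \alpha\, C_{\text{aging}}$. Setting $\alpha = 0$ recovers ordinary PCA on the target data, while larger $\alpha$ penalizes directions that also carry high variance in the background data, which is one way the influence of aging could be reduced.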
Affiliation(s)
- Tian Han
- Systems Engineering Institute, School of Automation, Xi’an Jiaotong University, Xi’an, China
- Yunhua Peng
- Center for Mitochondrial Biology and Medicine, School of Life Science and Technology, Xi’an Jiaotong University, Xi’an, China
- The Key Laboratory of Biomedical Information Engineering of Ministry of Education, Xi’an Jiaotong University, Xi’an, China
- Ying Du
- Department of Neurology, Tangdu Hospital, Fourth Military Medical University, Xi’an, China
- Yunbo Li
- Department of Nuclear Medicine, Tangdu Hospital, Fourth Military Medical University, Xi’an, China
- Ying Wang
- Systems Engineering Institute, School of Automation, Xi’an Jiaotong University, Xi’an, China
- Wentong Sun
- Systems Engineering Institute, School of Automation, Xi’an Jiaotong University, Xi’an, China
- Lanxin Cui
- Systems Engineering Institute, School of Automation, Xi’an Jiaotong University, Xi’an, China
- Qinke Peng
- Systems Engineering Institute, School of Automation, Xi’an Jiaotong University, Xi’an, China
- School of Future Technology, Xi’an Jiaotong University, Xi’an, China
6
Dhaygude AD, Ameta GK, Khan IR, Singh PP, Maaliw RR, Lakshmaiya N, Shabaz M, Khan MA, Hussein HS, Alshazly H. Knowledge-based deep learning system for classifying Alzheimer's disease for multi-task learning. CAAI Transactions on Intelligence Technology 2024; 9:805-820. [DOI: 10.1049/cit2.12291] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/28/2023] [Accepted: 06/21/2023] [Indexed: 08/25/2024] Open
Abstract
Deep learning has recently become a viable approach for classifying Alzheimer's disease (AD) in medical imaging. However, existing models struggle to efficiently extract features from medical images and may squander additional information resources for illness classification. To address these issues, a deep three-dimensional convolutional neural network incorporating multi-task learning and attention mechanisms is proposed. An upgraded primary C3D network is utilised to create rougher low-level feature maps. It introduces a new convolution block that focuses on the structural aspects of the magnetic resonance imaging image and another block that extracts attention weights unique to certain pixel positions in the feature map and multiplies them with the feature map output. Then, several fully connected layers are used to achieve multi-task learning, generating three outputs, including the primary classification task. The other two outputs employ backpropagation during training to improve the primary classification task. Experimental findings show that the authors’ proposed method outperforms current approaches for classifying AD, achieving enhanced classification accuracy and other indicators on the Alzheimer's Disease Neuroimaging Initiative dataset. The authors demonstrate promise for future disease classification studies.
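One common way to train a network with one primary and two auxiliary outputs is to minimise a weighted sum of the per-head losses, so that gradients from the auxiliary heads also shape the shared layers. The sketch below shows that idea only. It is an assumption-laden illustration, not the paper's loss: the abstract does not name the auxiliary tasks or their weighting, and `multitask_loss`, `aux_weight`, and the example probabilities are hypothetical.

```python
from math import log


def cross_entropy(probs, label):
    """Negative log-likelihood of the true class under predicted probabilities."""
    return -log(probs[label])


def multitask_loss(head_probs, labels, aux_weight=0.5):
    """Primary-head loss plus down-weighted auxiliary-head losses.

    head_probs[0] / labels[0] belong to the primary classification head;
    the remaining entries belong to auxiliary heads.
    ASSUMPTION: a single shared aux_weight; the source does not specify one.
    """
    primary = cross_entropy(head_probs[0], labels[0])
    auxiliary = sum(
        cross_entropy(p, y) for p, y in zip(head_probs[1:], labels[1:])
    )
    return primary + aux_weight * auxiliary


# Hypothetical softmax outputs of the three heads for one training sample.
head_probs = [
    [0.7, 0.2, 0.1],  # primary head (three classes)
    [0.6, 0.4],       # auxiliary head 1 (two classes)
    [0.5, 0.5],       # auxiliary head 2 (two classes)
]
labels = [0, 1, 0]
loss = multitask_loss(head_probs, labels)
print(round(loss, 4))
```

Setting `aux_weight=0.0` reduces the objective to the primary classification loss alone, which gives a simple baseline for checking whether the auxiliary heads help.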
Affiliation(s)
- Gaurav Kumar Ameta
- Department of Computer Science & Engineering, Parul Institute of Technology, Parul University, Vadodara, Gujarat, India
- Renato R. Maaliw
- College of Engineering, Southern Luzon State University, Lucban, Quezon, Philippines
- Natrayan Lakshmaiya
- Department of Mechanical Engineering, Saveetha School of Engineering, SIMATS, Chennai, Tamil Nadu, India
- Mohammad Shabaz
- Model Institute of Engineering and Technology, Jammu, J&K, India
- Muhammad Attique Khan
- Department of Computer Science, HITEC University, Taxila, Pakistan
- Department of Computer Science and Mathematics, Lebanese American University, Beirut, Lebanon
- Hany S. Hussein
- Electrical Engineering Department, College of Engineering, King Khalid University, Abha, Saudi Arabia
- Electrical Engineering Department, Aswan University, Aswan, Egypt
- Hammam Alshazly
- Faculty of Computers and Information, South Valley University, Qena, Egypt
7
Sakharova T, Mao S, Osadchuk M. Updated Models of Alzheimer's Disease with Deep Neural Networks. J Alzheimers Dis 2024; 100:685-697. [PMID: 38905045 DOI: 10.3233/jad-240183] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/23/2024]
Abstract
Background: In recent years, researchers have focused on developing precise models for the progression of Alzheimer's disease (AD) using deep neural networks. Forecasting the progression of AD through the analysis of time series data represents a promising approach. Objective: The primary objective of this research is to formulate an effective methodology for forecasting the progression of AD through the integration of multi-task learning techniques and the analysis of pertinent medical data. Methods: This study primarily utilized volumetric measurements obtained through magnetic resonance imaging (MRI), trajectories of cognitive assessments, and clinical status indicators. The research encompassed 150 patients diagnosed with AD who underwent examination between 2020 and 2022 in Beijing, China. A multi-task learning approach was employed to train forecasting models using MRI data, trajectories of cognitive assessments, and clinical status. Correlation analysis was conducted at various time points. Results: At baseline, a robust correlation was observed among the forecasting tasks: 0.75 for volumetric MRI measurements, 0.62 for trajectories of cognitive assessment, and 0.48 for clinical status. The implementation of a multi-task learning framework enhanced performance by 12.7% for imputing missing values and 14.8% for prediction accuracy. Conclusions: The findings of our study indicate that multi-task learning can effectively predict the progression of AD. However, it is important to note that the study's generalizability may be limited due to the restricted dataset and the specific population under examination. These conclusions represent a significant stride toward more precise diagnosis and treatment of this neurological disorder.
Affiliation(s)
- Tatyana Sakharova
- Department of Biology and General Genetics, I.M. Sechenov First Moscow State Medical University, Moscow, Russia
- Siqi Mao
- Department of Statistics, Florida State University, Tallahassee, FL, USA
- Mikhail Osadchuk
- Department of Polyclinic Therapy, I.M. Sechenov First Moscow State Medical University, Moscow, Russia
8
Liu M, Li S, Yuan H, Ong MEH, Ning Y, Xie F, Saffari SE, Shang Y, Volovici V, Chakraborty B, Liu N. Handling missing values in healthcare data: A systematic review of deep learning-based imputation techniques. Artif Intell Med 2023; 142:102587. [PMID: 37316097 DOI: 10.1016/j.artmed.2023.102587] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2022] [Revised: 04/08/2023] [Accepted: 05/16/2023] [Indexed: 06/16/2023]
Abstract
OBJECTIVE The proper handling of missing values is critical to delivering reliable estimates and decisions, especially in high-stakes fields such as clinical research. In response to the increasing diversity and complexity of data, many researchers have developed deep learning (DL)-based imputation techniques. We conducted a systematic review to evaluate the use of these techniques, with a particular focus on the types of data, intending to assist healthcare researchers from various disciplines in dealing with missing data. MATERIALS AND METHODS We searched five databases (MEDLINE, Web of Science, Embase, CINAHL, and Scopus) for articles published prior to February 8, 2023 that described the use of DL-based models for imputation. We examined selected articles from four perspectives: data types, model backbones (i.e., main architectures), imputation strategies, and comparisons with non-DL-based methods. Based on data types, we created an evidence map to illustrate the adoption of DL models. RESULTS Out of 1822 articles, a total of 111 were included, of which tabular static data (29%, 32/111) and temporal data (40%, 44/111) were the most frequently investigated. Our findings revealed a discernible pattern in the choice of model backbones and data types, for example, the dominance of autoencoder and recurrent neural networks for tabular temporal data. The discrepancy in imputation strategy usage among data types was also observed. The "integrated" imputation strategy, which solves the imputation task simultaneously with downstream tasks, was most popular for tabular temporal data (52%, 23/44) and multi-modal data (56%, 5/9). Moreover, DL-based imputation methods yielded a higher level of imputation accuracy than non-DL methods in most studies. CONCLUSION The DL-based imputation models are a family of techniques, with diverse network structures. Their designation in healthcare is usually tailored to data types with different characteristics. 
Although DL-based imputation models may not be superior to conventional approaches across all datasets, it is highly possible for them to achieve satisfactory results for a particular data type or dataset. There are, however, still issues with regard to portability, interpretability, and fairness associated with current DL-based imputation models.
Affiliation(s)
- Mingxuan Liu
- Centre for Quantitative Medicine, Duke-NUS Medical School, Singapore
- Siqi Li
- Centre for Quantitative Medicine, Duke-NUS Medical School, Singapore
- Han Yuan
- Centre for Quantitative Medicine, Duke-NUS Medical School, Singapore
- Marcus Eng Hock Ong
- Programme in Health Services and Systems Research, Duke-NUS Medical School, Singapore; Department of Emergency Medicine, Singapore General Hospital, Singapore
- Yilin Ning
- Centre for Quantitative Medicine, Duke-NUS Medical School, Singapore
- Feng Xie
- Centre for Quantitative Medicine, Duke-NUS Medical School, Singapore; Programme in Health Services and Systems Research, Duke-NUS Medical School, Singapore
- Seyed Ehsan Saffari
- Centre for Quantitative Medicine, Duke-NUS Medical School, Singapore; Programme in Health Services and Systems Research, Duke-NUS Medical School, Singapore
- Yuqing Shang
- Centre for Quantitative Medicine, Duke-NUS Medical School, Singapore
- Victor Volovici
- Department of Neurosurgery, Erasmus MC University Medical Center, Rotterdam, the Netherlands
- Bibhas Chakraborty
- Centre for Quantitative Medicine, Duke-NUS Medical School, Singapore; Programme in Health Services and Systems Research, Duke-NUS Medical School, Singapore; Department of Statistics and Data Science, National University of Singapore, Singapore; Department of Biostatistics and Bioinformatics, Duke University, Durham, NC, USA
- Nan Liu
- Centre for Quantitative Medicine, Duke-NUS Medical School, Singapore; Programme in Health Services and Systems Research, Duke-NUS Medical School, Singapore; SingHealth AI Office, Singapore Health Services, Singapore; Institute of Data Science, National University of Singapore, Singapore.
9
Yi F, Yang H, Chen D, Qin Y, Han H, Cui J, Bai W, Ma Y, Zhang R, Yu H. XGBoost-SHAP-based interpretable diagnostic framework for Alzheimer's disease. BMC Med Inform Decis Mak 2023; 23:137. [PMID: 37491248 PMCID: PMC10369804 DOI: 10.1186/s12911-023-02238-9] [Citation(s) in RCA: 30] [Impact Index Per Article: 15.0] [Received: 10/09/2022] [Accepted: 07/13/2023] [Indexed: 07/27/2023] Open
Abstract
BACKGROUND Due to the class imbalance issue faced when Alzheimer's disease (AD) develops from normal cognition (NC) to mild cognitive impairment (MCI), present clinical practice is met with challenges regarding the auxiliary diagnosis of AD using machine learning (ML). This leads to low diagnosis performance. We aimed to construct an interpretable framework, extreme gradient boosting-Shapley additive explanations (XGBoost-SHAP), to handle the imbalance among different AD progression statuses at the algorithmic level. We also sought to achieve multiclassification of NC, MCI, and AD. METHODS We obtained patient data from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database, including clinical information, neuropsychological test results, neuroimaging-derived biomarkers, and APOE-ε4 gene statuses. First, three feature selection algorithms were applied, and they were then included in the XGBoost algorithm. Due to the imbalance among the three classes, we changed the sample weight distribution to achieve multiclassification of NC, MCI, and AD. Then, the SHAP method was linked to XGBoost to form an interpretable framework. This framework utilized attribution ideas that quantified the impacts of model predictions into numerical values and analysed them based on their directions and sizes. Subsequently, the top 10 features (optimal subset) were used to simplify the clinical decision-making process, and their performance was compared with that of a random forest (RF), Bagging, AdaBoost, and a naive Bayes (NB) classifier. Finally, the National Alzheimer's Coordinating Center (NACC) dataset was employed to assess the impact path consistency of the features within the optimal subset. RESULTS Compared to the RF, Bagging, AdaBoost, NB and XGBoost (unweighted), the interpretable framework had higher classification performance with accuracy improvements of 0.74%, 0.74%, 1.46%, 13.18%, and 0.83%, respectively. 
The framework achieved high sensitivity (81.21%/74.85%), specificity (92.18%/89.86%), accuracy (87.57%/80.52%), area under the receiver operating characteristic curve (AUC) (0.91/0.88), positive clinical utility index (0.71/0.56), and negative clinical utility index (0.75/0.68) on the ADNI and NACC datasets, respectively. In the ADNI dataset, the top 10 features were found to have varying associations with the risk of AD onset based on their SHAP values. Specifically, the higher SHAP values of CDRSB, ADAS13, ADAS11, ventricle volume, ADASQ4, and FAQ were associated with higher risks of AD onset. Conversely, the higher SHAP values of LDELTOTAL, mPACCdigit, RAVLT_immediate, and MMSE were associated with lower risks of AD onset. Similar results were found for the NACC dataset. CONCLUSIONS The proposed interpretable framework contributes to achieving excellent performance in imbalanced AD multiclassification tasks and provides scientific guidance (optimal subset) for clinical decision-making, thereby facilitating disease management and offering new research ideas for optimizing AD prevention and treatment programs.
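The abstract says the sample weight distribution was changed to handle class imbalance but does not give the weighting scheme. A common choice is inverse class-frequency weighting, sketched below. This is an assumed stand-in rather than the authors' exact scheme, and the function name `balanced_sample_weights` and the class counts are illustrative, not taken from ADNI.

```python
from collections import Counter


def balanced_sample_weights(labels):
    """Weight each sample by n_samples / (n_classes * count_of_its_class).

    ASSUMPTION: inverse-frequency ("balanced") weighting; the source only
    states that sample weights were adjusted.
    """
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return [n_samples / (n_classes * counts[y]) for y in labels]


# Illustrative imbalanced label set: MCI is the majority class.
labels = ["NC"] * 2 + ["MCI"] * 6 + ["AD"] * 2
weights = balanced_sample_weights(labels)

# Total weight per class after reweighting.
class_totals = {}
for y, w in zip(labels, weights):
    class_totals[y] = class_totals.get(y, 0.0) + w
print(class_totals)
```

After reweighting, every class contributes the same total weight, so the minority classes are no longer dominated by the majority class during training. Weights of this kind can typically be supplied to a gradient-boosting classifier through a per-sample weight argument at fit time.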
Affiliation(s)
- Fuliang Yi
- Department of Health Statistics, School of Public Health, Shanxi Medical University, 56 South XinJian Road, Taiyuan, 030001 P.R. China
- Hui Yang
- Department of Health Statistics, School of Public Health, Shanxi Medical University, 56 South XinJian Road, Taiyuan, 030001 P.R. China
- Durong Chen
- Department of Health Statistics, School of Public Health, Shanxi Medical University, 56 South XinJian Road, Taiyuan, 030001 P.R. China
- Yao Qin
- Department of Health Statistics, School of Public Health, Shanxi Medical University, 56 South XinJian Road, Taiyuan, 030001 P.R. China
- Hongjuan Han
- Department of Health Statistics, School of Public Health, Shanxi Medical University, 56 South XinJian Road, Taiyuan, 030001 P.R. China
- Jing Cui
- Department of Health Statistics, School of Public Health, Shanxi Medical University, 56 South XinJian Road, Taiyuan, 030001 P.R. China
- Wenlin Bai
- Department of Health Statistics, School of Public Health, Shanxi Medical University, 56 South XinJian Road, Taiyuan, 030001 P.R. China
- Yifei Ma
- Department of Health Statistics, School of Public Health, Shanxi Medical University, 56 South XinJian Road, Taiyuan, 030001 P.R. China
- Rong Zhang
- Department of Health Statistics, School of Public Health, Shanxi Medical University, 56 South XinJian Road, Taiyuan, 030001 P.R. China
- Hongmei Yu
- Department of Health Statistics, School of Public Health, Shanxi Medical University, 56 South XinJian Road, Taiyuan, 030001 P.R. China
- Shanxi Provincial Key Laboratory of Major Diseases Risk Assessment, Taiyuan, China
|
10
|
Bhachawat S, Shriram E, Srinivasan K, Hu YC. Leveraging Computational Intelligence Techniques for Diagnosing Degenerative Nerve Diseases: A Comprehensive Review, Open Challenges, and Future Research Directions. Diagnostics (Basel) 2023; 13:288. [PMID: 36673100 PMCID: PMC9858227 DOI: 10.3390/diagnostics13020288] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/04/2022] [Revised: 12/28/2022] [Accepted: 01/10/2023] [Indexed: 01/13/2023] Open
Abstract
Degenerative nerve diseases such as Alzheimer's and Parkinson's diseases have always been a global issue of concern. Approximately 1/6th of the world's population suffers from these disorders, yet there are no definitive solutions to cure these diseases after the symptoms set in. The best way to treat these disorders is to detect them at an earlier stage. Many of these diseases are genetic; this enables machine learning algorithms to give inferences based on the patient's medical records and history. Machine learning algorithms such as deep neural networks are also critical for the early identification of degenerative nerve diseases. The significant applications of machine learning and deep learning in early diagnosis and establishing potential therapies for degenerative nerve diseases have motivated us to work on this review paper. Through this review, we covered various machine learning and deep learning algorithms and their application in the diagnosis of degenerative nerve diseases, such as Alzheimer's disease and Parkinson's disease. Furthermore, we also included the recent advancements in each of these models, which improved their capabilities for classifying degenerative nerve diseases. The limitations of each of these methods are also discussed. In the conclusion, we mention open research challenges and various alternative technologies, such as virtual reality and Big data analytics, which can be useful for the diagnosis of degenerative nerve diseases.
Affiliation(s)
- Saransh Bhachawat
- School of Computer Science and Engineering, Vellore Institute of Technology, Vellore 632014, India
- Eashwar Shriram
- School of Information Technology and Engineering, Vellore Institute of Technology, Vellore 632014, India
- Kathiravan Srinivasan
- School of Computer Science and Engineering, Vellore Institute of Technology, Vellore 632014, India
- Yuh-Chung Hu
- Department of Mechanical and Electromechanical Engineering, National Ilan University, Yilan 26047, Taiwan
|
11
|
Fathi S, Ahmadi M, Dehnad A. Early diagnosis of Alzheimer's disease based on deep learning: A systematic review. Comput Biol Med 2022; 146:105634. [DOI: 10.1016/j.compbiomed.2022.105634] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/02/2022] [Revised: 04/25/2022] [Accepted: 04/25/2022] [Indexed: 11/03/2022]
|
12
|
Xia W, Zheng L, Fang J, Li F, Zhou Y, Zeng Z, Zhang B, Li Z, Li H, Zhu F. PFmulDL: a novel strategy enabling multi-class and multi-label protein function annotation by integrating diverse deep learning methods. Comput Biol Med 2022; 145:105465. [PMID: 35366467 DOI: 10.1016/j.compbiomed.2022.105465] [Citation(s) in RCA: 38] [Impact Index Per Article: 12.7] [Received: 03/04/2022] [Revised: 03/22/2022] [Accepted: 03/25/2022] [Indexed: 02/06/2023]
Abstract
Bioinformatic annotation of protein function is essential but highly complex, and developing effective prediction methods requires extensive effort. However, existing methods tend to amplify the representativeness of families with a large number of proteins by misclassifying proteins from families with a small number of proteins. That is, the ability of existing methods to annotate proteins in the 'rare classes' remains limited. Herein, a new protein function annotation strategy, PFmulDL, which integrates multiple deep learning methods, was constructed. First, a recurrent neural network was integrated, for the first time, with a convolutional neural network to facilitate function annotation. Second, a transfer learning method was introduced into model construction to further improve prediction performance. Third, based on the latest Gene Ontology data, the newly constructed model could annotate the largest number of protein families compared with existing methods. Finally, the model was found capable of significantly improving prediction performance for the 'rare classes' without sacrificing that for the 'major classes'. Given the emerging need to improve prediction performance for proteins in 'rare classes', this new strategy would be an essential complement to existing methods for protein function prediction. All models and source code are freely available at: https://github.com/idrblab/PFmulDL.
Affiliation(s)
- Weiqi Xia
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou, 310058, China
- Lingyan Zheng
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou, 310058, China; Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou, 330110, China
- Jiebin Fang
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou, 310058, China
- Fengcheng Li
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou, 310058, China
- Ying Zhou
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou, 310058, China
- Zhenyu Zeng
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou, 330110, China
- Bing Zhang
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou, 330110, China
- Zhaorong Li
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou, 330110, China
- Honglin Li
- School of Pharmacy, East China University of Science and Technology, Shanghai, 200237, China.
- Feng Zhu
- College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou, 310058, China; Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou, 330110, China.
|