1
Wang G, Fan F, Shi S, An S, Cao X, Ge W, Yu F, Wang Q, Han X, Tan S, Tan Y, Wang Z. Multi modality fusion transformer with spatio-temporal feature aggregation module for psychiatric disorder diagnosis. Comput Med Imaging Graph 2024; 114:102368. [PMID: 38518412 DOI: 10.1016/j.compmedimag.2024.102368] [Received: 10/11/2023] [Revised: 03/02/2024] [Accepted: 03/13/2024]
Abstract
Bipolar disorder (BD) is characterized by recurrent episodes of depression and mild mania. In this paper, to address the common issue of insufficient accuracy in existing methods and to meet the requirements of clinical diagnosis, we propose a framework called the Spatio-temporal Feature Fusion Transformer (STF2Former). It improves on our previous work, MFFormer, by introducing a Spatio-temporal Feature Aggregation Module (STFAM) to learn the temporal and spatial features of resting-state fMRI (rs-fMRI) data, promoting intra-modality attention and information fusion across different modalities. Specifically, the method decouples the temporal and spatial dimensions and designs two feature extraction modules to extract temporal and spatial information separately. Extensive experiments demonstrate the effectiveness of the proposed STFAM in extracting features from rs-fMRI, and show that STF2Former significantly outperforms MFFormer and achieves better results than other state-of-the-art methods.
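The decoupling idea in this abstract — attending over the spatial and temporal dimensions of an rs-fMRI matrix separately, then aggregating — can be sketched as follows. This is an illustrative stand-in only (plain single-head self-attention in NumPy, with hypothetical region and timepoint counts), not the authors' STFAM implementation:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x):
    # x: (tokens, dim) -> single-head scaled dot-product self-attention
    scores = x @ x.T / np.sqrt(x.shape[-1])
    return softmax(scores, axis=-1) @ x

def decoupled_st_features(bold):
    # bold: (regions, timepoints) rs-fMRI signal matrix
    spatial = self_attention(bold)        # attend across brain regions
    temporal = self_attention(bold.T).T   # attend across timepoints
    return spatial + temporal             # crude stand-in for STFAM aggregation

rng = np.random.default_rng(0)
feats = decoupled_st_features(rng.standard_normal((90, 120)))
print(feats.shape)  # (90, 120)
```

Treating rows and then columns as attention tokens is what keeps the two branches' costs quadratic in regions and timepoints respectively, rather than in their product.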
Affiliation(s)
- Guoxin Wang
- College of Biomedical Engineering & Instrument Science, Zhejiang University, Hangzhou 310027, China
- Fengmei Fan
- Beijing Huilongguan Hospital, Peking University Huilongguan Clinical Medical School, Beijing 100096, China
- Sheng Shi
- College of Sciences, Northeastern University, Shenyang 110819, China
- Shan An
- JD Health International Inc., Beijing 100176, China
- Xuyang Cao
- JD Health International Inc., Beijing 100176, China
- Wenshu Ge
- JD Health International Inc., Beijing 100176, China
- Feng Yu
- College of Biomedical Engineering & Instrument Science, Zhejiang University, Hangzhou 310027, China
- Qi Wang
- College of Sciences, Northeastern University, Shenyang 110819, China
- Xiaole Han
- Beijing Huilongguan Hospital, Peking University Huilongguan Clinical Medical School, Beijing 100096, China
- Shuping Tan
- Beijing Huilongguan Hospital, Peking University Huilongguan Clinical Medical School, Beijing 100096, China
- Yunlong Tan
- Beijing Huilongguan Hospital, Peking University Huilongguan Clinical Medical School, Beijing 100096, China
- Zhiren Wang
- Beijing Huilongguan Hospital, Peking University Huilongguan Clinical Medical School, Beijing 100096, China
2
Shen C, Cai G, Tian J, Wu X, Ding M, Wang B, Liu D. Characterization of lamb shashliks with different roasting methods by intelligent sensory technologies and GC-MS to simulate human muti-sensation: Based on multimodal deep learning. Food Chem 2024; 440:138265. [PMID: 38154281 DOI: 10.1016/j.foodchem.2023.138265] [Received: 08/17/2023] [Revised: 11/28/2023] [Accepted: 12/21/2023]
Abstract
To simulate the functions of olfaction, gustation, vision, and oral touch, intelligent sensory technologies have been developed. Headspace solid-phase microextraction gas chromatography-mass spectrometry (HS-SPME-GC/MS), together with electronic noses (E-noses), electronic tongues (E-tongues), computer vision (CV), and texture analyzers (TAs), was applied to the sensory characterization of lamb shashliks (LSs) prepared with various roasting methods. A total of 56 volatile organic compounds (VOCs) in lamb shashliks prepared with five roasting methods were identified by HS-SPME-GC/MS, and 21 VOCs were identified as key compounds based on their odor activity values (OAV > 1). A cross-channel sensory Transformer (CCST) was also proposed and used to predict 19 sensory attributes and the overall scores of lamb shashliks prepared with different roasting methods. The model achieved satisfactory results on the prediction set (R2 = 0.964). This study shows that a multimodal deep learning model can be used to simulate human assessors, and that it is feasible to use such a model to guide and correct sensory evaluation.
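The OAV screening mentioned above divides each compound's measured concentration by its odor threshold and keeps compounds with OAV > 1 as key aroma contributors. A minimal sketch, using hypothetical concentrations and thresholds (not measurements from the paper):

```python
def key_compounds(concentrations, thresholds):
    """Return compounds whose odor activity value (OAV = conc / threshold) exceeds 1."""
    oav = {c: concentrations[c] / thresholds[c] for c in concentrations}
    return {c: round(v, 2) for c, v in oav.items() if v > 1}

# Hypothetical concentrations and odor thresholds (ug/kg) -- illustrative only.
conc = {"hexanal": 120.0, "nonanal": 8.0, "2-pentylfuran": 0.5}
thr = {"hexanal": 4.5, "nonanal": 1.0, "2-pentylfuran": 6.0}
print(key_compounds(conc, thr))  # hexanal and nonanal pass; 2-pentylfuran (OAV < 1) is dropped
```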
Affiliation(s)
- Che Shen
- College of Food Science and Technology, Bohai University, Jinzhou 121013, China; Engineering Research Center of Bio process, Ministry of Education, Hefei University of Technology, Hefei 230009, China
- Guanhua Cai
- College of Food Science and Technology, Bohai University, Jinzhou 121013, China
- Jiaqi Tian
- College of Food Science and Technology, Henan Agricultural University, Zhengzhou 450002, China
- Xinnan Wu
- College of Food Science and Technology, Bohai University, Jinzhou 121013, China
- Meiqi Ding
- College of Food Science and Technology, Bohai University, Jinzhou 121013, China
- Bo Wang
- College of Food Science and Technology, Bohai University, Jinzhou 121013, China; Key Laboratory of Meat Processing and Quality Control, MOE, Key Laboratory of Meat Processing, MARA, College of Food Science and Technology, Nanjing Agricultural University, Nanjing 210095, China; Institute of Ocean Research, Bohai University, Jinzhou 121013, Liaoning, China
- Dengyong Liu
- College of Food Science and Technology, Bohai University, Jinzhou 121013, China
3
Karampuri A, Kundur S, Perugu S. Exploratory drug discovery in breast cancer patients: A multimodal deep learning approach to identify novel drug candidates targeting RTK signaling. Comput Biol Med 2024; 174:108433. [PMID: 38642491 DOI: 10.1016/j.compbiomed.2024.108433] [Received: 02/01/2024] [Revised: 04/04/2024] [Accepted: 04/07/2024]
Abstract
Breast cancer, a highly formidable and diverse malignancy predominantly affecting women globally, poses a significant threat due to its intricate genetic variability, which renders it challenging to diagnose accurately. Various therapies, such as immunotherapy, radiotherapy, and diverse chemotherapy approaches like drug repurposing and combination therapy, are widely used depending on the cancer subtype and the severity of metastasis. Our study revolves around an innovative drug discovery strategy targeting potential drug candidates specific to RTK signaling, a prominently targeted receptor class in cancer. To accomplish this, we developed a multimodal deep neural network (MM-DNN)-based QSAR model that integrates omics datasets covering genomic and proteomic expression data together with drug responses, and validated it rigorously. The results show an R2 value of 0.917 and an RMSE value of 0.312, affirming the model's commendable predictive capabilities. Structural analogs of drug molecules specific to RTK signaling were sourced from the PubChem database, followed by meticulous screening to eliminate dissimilar compounds. Leveraging the MM-DNN-based QSAR model, we predicted the biological activity of these molecules, subsequently clustered them into three distinct groups, and performed a feature importance analysis. Consequently, we successfully identified prime drug candidates tailored to each potential downstream regulatory protein within the RTK signaling pathway. This method accelerates the early stages of drug development by removing inactive compounds, providing a hopeful path in combating breast cancer.
Affiliation(s)
- Anush Karampuri
- Department of Biotechnology, National Institute of Technology, Warangal, 500604, India
- Sunitha Kundur
- Department of Biotechnology, National Institute of Technology, Warangal, 500604, India
- Shyam Perugu
- Department of Biotechnology, National Institute of Technology, Warangal, 500604, India
4
Wang Y, Yin C, Zhang P. Multimodal risk prediction with physiological signals, medical images and clinical notes. Heliyon 2024; 10:e26772. [PMID: 38455585 PMCID: PMC10918115 DOI: 10.1016/j.heliyon.2024.e26772] [Received: 10/06/2023] [Revised: 02/17/2024] [Accepted: 02/20/2024]
Abstract
The broad adoption of electronic health record (EHR) systems brings a tremendous amount of clinical data and thus provides opportunities to conduct data-driven healthcare research to solve various clinical problems in the medical domain. Machine learning and deep learning methods are widely used in medical informatics and healthcare due to their power to mine insights from raw data. When adapting deep learning models to EHR data, it is essential to consider its heterogeneous nature: EHR contains patient records from various sources, including medical tests (e.g. blood tests, microbiology tests), medical imaging, diagnoses, medications, procedures, clinical notes, etc. These modalities together provide a holistic view of patient health status and complement each other. Therefore, combining data from multiple modalities that are intrinsically different is challenging but intuitively promising for deep learning on EHR. To assess the potential of multimodal data, we introduce a comprehensive fusion framework designed to integrate temporal variables, medical images, and clinical notes in EHR for enhanced performance in clinical risk prediction. Early, joint, and late fusion strategies are employed to combine data from the various modalities effectively. We test the model on three predictive tasks: in-hospital mortality, long length of stay, and 30-day readmission. Experimental results show that multimodal models outperform uni-modal models on the tasks involved. Additionally, by training models with different input modality combinations, we calculate the Shapley value for each modality to quantify its contribution to multimodal performance. Temporal variables tend to be more helpful than CXR images and clinical notes in the three explored predictive tasks.
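Per-modality Shapley values of the kind described above can be computed exactly once a model has been trained on every modality subset: each modality's value is its average marginal gain over all orderings. A minimal sketch in pure Python, with hypothetical AUROC numbers (not results from the paper):

```python
from itertools import permutations

def modality_shapley(perf):
    # perf maps frozenset-of-modalities -> metric (e.g. AUROC), incl. the empty set.
    # Exact Shapley value: average marginal gain of each modality over all orderings.
    modalities = sorted(set().union(*perf))
    shapley = {m: 0.0 for m in modalities}
    orders = list(permutations(modalities))
    for order in orders:
        seen = frozenset()
        for m in order:
            shapley[m] += perf[seen | {m}] - perf[seen]
            seen = seen | {m}
    return {m: v / len(orders) for m, v in shapley.items()}

# Hypothetical AUROCs per input combination (illustrative, not the paper's results).
perf = {
    frozenset(): 0.50,
    frozenset({"ts"}): 0.80, frozenset({"cxr"}): 0.70, frozenset({"notes"}): 0.72,
    frozenset({"ts", "cxr"}): 0.83, frozenset({"ts", "notes"}): 0.84,
    frozenset({"cxr", "notes"}): 0.75, frozenset({"ts", "cxr", "notes"}): 0.86,
}
contrib = modality_shapley(perf)
print(contrib)  # contributions sum to 0.86 - 0.50; "ts" (temporal variables) contributes most
```

With only a handful of modalities the exact computation is cheap (k! orderings for k modalities); sampling is only needed when k grows.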
Affiliation(s)
- Yuanlong Wang
- Department of Computer Science and Engineering, The Ohio State University, Columbus, OH 43210, USA
- Changchang Yin
- Department of Computer Science and Engineering, The Ohio State University, Columbus, OH 43210, USA
- Department of Biomedical Informatics, The Ohio State University, Columbus, OH 43210, USA
- Ping Zhang
- Department of Computer Science and Engineering, The Ohio State University, Columbus, OH 43210, USA
- Department of Biomedical Informatics, The Ohio State University, Columbus, OH 43210, USA
5
Gravina M, García-Pedrero A, Gonzalo-Martín C, Sansone C, Soda P. Multi input-Multi output 3D CNN for dementia severity assessment with incomplete multimodal data. Artif Intell Med 2024; 149:102774. [PMID: 38462278 DOI: 10.1016/j.artmed.2024.102774] [Received: 09/05/2022] [Revised: 12/08/2023] [Accepted: 01/14/2024]
Abstract
Alzheimer's Disease is the most common cause of dementia, whose progression spans different stages, from very mild cognitive impairment to mild and severe conditions. In clinical trials, Magnetic Resonance Imaging (MRI) and Positron Emission Tomography (PET) are mostly used for the early diagnosis of neurodegenerative disorders, since they provide volumetric and metabolic information about the brain, respectively. In recent years, Deep Learning (DL) has been employed in medical imaging with promising results. Moreover, the use of deep neural networks, especially Convolutional Neural Networks (CNNs), has enabled DL-based solutions in domains that need to leverage information from multiple data sources, giving rise to Multimodal Deep Learning (MDL). In this paper, we conduct a systematic analysis of MDL approaches for dementia severity assessment exploiting MRI and PET scans. We propose a Multi Input-Multi Output 3D CNN whose training iterations change according to the characteristics of the input, as it is able to handle incomplete acquisitions in which one image modality is missing. Experiments performed on the OASIS-3 dataset show the satisfactory results of the implemented network, which outperforms approaches exploiting both a single image modality and different MDL fusion techniques.
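Handling an incomplete acquisition amounts to routing the forward pass through whichever branches have data. A toy sketch, with linear maps standing in for the MRI and PET encoders (names, shapes, and the mean-fusion rule are illustrative assumptions, not the paper's architecture):

```python
import numpy as np

def fused_prediction(mri=None, pet=None, *, w_mri, w_pet):
    # Linear layers stand in for the per-modality 3D CNN encoders; the fusion
    # averages whichever branches have data, so an incomplete acquisition
    # (one modality missing) still produces a prediction.
    parts = []
    if mri is not None:
        parts.append(mri @ w_mri)
    if pet is not None:
        parts.append(pet @ w_pet)
    if not parts:
        raise ValueError("at least one modality is required")
    return np.mean(parts, axis=0)

rng = np.random.default_rng(1)
w_mri, w_pet = rng.standard_normal((16, 3)), rng.standard_normal((16, 3))
mri, pet = rng.standard_normal(16), rng.standard_normal(16)
both = fused_prediction(mri=mri, pet=pet, w_mri=w_mri, w_pet=w_pet)
mri_only = fused_prediction(mri=mri, w_mri=w_mri, w_pet=w_pet)
print(both.shape, mri_only.shape)  # (3,) (3,)
```

Because each branch is updated only when its modality is present, subjects with a single scan still contribute gradient signal during training rather than being discarded.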
Affiliation(s)
- Michela Gravina
- Department of Electrical Engineering and Information Technology, University of Naples Federico II, Napoli, 80125, Italy
- Angel García-Pedrero
- Department of Computer Architecture and Technology, Universidad Politécnica de Madrid, Boadilla del Monte, 28660, Madrid, Spain; Center for Biomedical Technology, Campus de Montegancedo, Universidad Politécnica de Madrid, Pozuelo de Alarcón, 28233, Madrid, Spain
- Consuelo Gonzalo-Martín
- Department of Computer Architecture and Technology, Universidad Politécnica de Madrid, Boadilla del Monte, 28660, Madrid, Spain; Center for Biomedical Technology, Campus de Montegancedo, Universidad Politécnica de Madrid, Pozuelo de Alarcón, 28233, Madrid, Spain
- Carlo Sansone
- Department of Electrical Engineering and Information Technology, University of Naples Federico II, Napoli, 80125, Italy
- Paolo Soda
- Department of Engineering, Unit of Computer Systems and Bioinformatics, University of Rome Campus Bio-Medico, Roma, 00128, Italy; Department of Diagnostics and Intervention, Radiation Physics, Biomedical Engineering, Umeå University, 90187, Umeå, Sweden
6
Dehghan A, Abbasi K, Razzaghi P, Banadkuki H, Gharaghani S. CCL-DTI: contributing the contrastive loss in drug-target interaction prediction. BMC Bioinformatics 2024; 25:48. [PMID: 38291364 DOI: 10.1186/s12859-024-05671-3] [Received: 09/13/2023] [Accepted: 01/22/2024]
Abstract
BACKGROUND Drug-Target Interaction (DTI) prediction takes a drug molecule and a protein sequence as inputs and predicts the binding affinity value. In recent years, deep learning-based models have received more attention. These methods have two modules: a feature extraction module and a task prediction module. In most deep learning-based approaches, a simple task prediction loss (i.e., categorical cross-entropy for the classification task and mean squared error for the regression task) is used to learn the model. In machine learning, contrastive loss functions were developed to learn a more discriminative feature space, and in a deep learning-based model, a more discriminative feature space improves the performance of the task prediction module. RESULTS In this paper, we use multimodal knowledge as input and propose an attention-based fusion technique to combine this knowledge. We also investigate how utilizing a contrastive loss function alongside the task prediction loss helps the approach learn a more powerful model. Four contrastive loss functions are considered: (1) the max-margin contrastive loss, (2) the triplet loss, (3) the multi-class N-pair loss objective, and (4) the NT-Xent loss. The proposed model is evaluated on four well-known datasets: the Wang et al. dataset, Luo's dataset, and the Davis and KIBA datasets. CONCLUSIONS After reviewing the state-of-the-art methods, we developed a multimodal feature extraction network that combines protein sequences and drug molecules with protein-protein interaction and drug-drug interaction networks. The results show it performs significantly better than comparable state-of-the-art approaches.
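Of the four losses listed, the max-margin contrastive loss is the simplest: positive pairs are penalized by their squared distance, negative pairs by how far they fall inside the margin. A minimal NumPy sketch with toy embeddings (illustrative, not the paper's model):

```python
import numpy as np

def max_margin_contrastive(z1, z2, same, margin=1.0):
    # z1, z2: (n, d) paired embeddings; same[i] = 1 if pair i shares a label, else 0.
    # Positive pairs are pulled together; negative pairs are pushed past the margin.
    d = np.linalg.norm(z1 - z2, axis=1)
    pos = same * d ** 2
    neg = (1 - same) * np.maximum(0.0, margin - d) ** 2
    return float(np.mean(pos + neg))

z1 = np.array([[0.0, 0.0], [1.0, 0.0]])
z2 = np.array([[0.0, 0.0], [1.0, 0.0]])
same = np.array([1, 0])
loss = max_margin_contrastive(z1, z2, same)
print(loss)  # 0.5: the positive pair costs 0, the coincident negative pair costs margin**2 / 2 pairs
```

Adding such a term to the task loss shapes the shared feature space directly, which is the effect the abstract attributes to all four contrastive variants.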
Affiliation(s)
- Alireza Dehghan
- Department of Bioinformatics, Kish International Campus, University of Tehran, Kish, 1417614411, Iran
- Karim Abbasi
- Laboratory of System Biology, Bioinformatics and Artificial Intelligence in Medicine (LBB&AI), Faculty of Mathematics and Computer Science, Kharazmi University, Tehran, 1417614411, Iran
- Parvin Razzaghi
- Department of Computer Science and Information Technology, Institute for Advanced Studies in Basic Sciences (IASBS), Zanjan, 4513766731, Iran
- Hossein Banadkuki
- Laboratory of Bioinformatics and Drug Design (LBD), Institute of Biochemistry and Biophysics, University of Tehran, Tehran, 1417614411, Iran
- Sajjad Gharaghani
- Laboratory of Bioinformatics and Drug Design (LBD), Institute of Biochemistry and Biophysics, University of Tehran, Tehran, 1417614411, Iran
7
Zhang K, Lincoln JA, Jiang X, Bernstam EV, Shams S. Predicting multiple sclerosis severity with multimodal deep neural networks. BMC Med Inform Decis Mak 2023; 23:255. [PMID: 37946182 PMCID: PMC10634041 DOI: 10.1186/s12911-023-02354-6] [Received: 07/17/2022] [Accepted: 10/25/2023]
Abstract
Multiple Sclerosis (MS) is a chronic disease of the human brain and spinal cord that can cause permanent damage to or deterioration of the nerves. The severity of MS is monitored with the Expanded Disability Status Scale, composed of several functional sub-scores. Early and accurate classification of MS severity is critical for slowing down or preventing disease progression through early therapeutic intervention. Recent advances in deep learning and the wide use of Electronic Health Records (EHR) create opportunities to apply data-driven predictive modeling tools to this goal. Previous studies using single-modal machine learning and deep learning algorithms were limited in prediction accuracy due to data insufficiency or model simplicity. In this paper, we propose using patients' multimodal longitudinal EHR data to predict future multiple sclerosis disease severity. Our contribution has two main facets. First, we describe a pioneering effort to integrate structured EHR data, neuroimaging data, and clinical notes to build a multimodal deep learning framework to predict a patient's MS severity. The proposed pipeline demonstrates up to a 19% increase in the Area Under the Receiver Operating Characteristic curve (AUROC) compared to models using single-modal data. Second, the study provides valuable insights into the amount of useful signal embedded in each data modality with respect to MS disease prediction, which may improve data collection processes.
Affiliation(s)
- Kai Zhang
- Department of Health Data Science and Artificial Intelligence, McWilliams School of Biomedical Informatics, University of Texas Health Sciences Center at Houston, Houston, TX, USA
- John A Lincoln
- Department of Neurology, University of Texas Health Sciences Center, McGovern Medical School, Houston, TX, USA
- Xiaoqian Jiang
- Department of Health Data Science and Artificial Intelligence, McWilliams School of Biomedical Informatics, University of Texas Health Sciences Center at Houston, Houston, TX, USA
- Elmer V Bernstam
- Department of Health Data Science and Artificial Intelligence, McWilliams School of Biomedical Informatics, University of Texas Health Sciences Center at Houston, Houston, TX, USA
- Division of General Internal Medicine, Department of Internal Medicine, University of Texas Health Sciences Center, McGovern Medical School, Houston, TX, USA
- Shayan Shams
- Department of Health Data Science and Artificial Intelligence, McWilliams School of Biomedical Informatics, University of Texas Health Sciences Center at Houston, Houston, TX, USA
- Department of Applied Data Science, San Jose State University, San Jose, CA, USA
8
Cantarini M, Gabrielli L, Mancini A, Squartini S, Longo R. A3CarScene: An audio-visual dataset for driving scene understanding. Data Brief 2023; 48:109146. [PMID: 37128585 PMCID: PMC10148019 DOI: 10.1016/j.dib.2023.109146] [Received: 03/27/2023] [Revised: 04/03/2023] [Accepted: 04/05/2023]
Abstract
Accurate perception and awareness of the environment surrounding the automobile is a challenge in automotive research. This article presents A3CarScene, a dataset recorded while driving a research vehicle equipped with audio and video sensors on public roads in the Marche Region, Italy. The sensor suite includes eight microphones installed inside and outside the passenger compartment and two dashcams mounted on the front and rear windows. Approximately 31 h of data per device were collected during October and November 2022 by driving about 1500 km along diverse roads and landscapes, in variable weather conditions, during daytime and nighttime hours. All key information for the scene understanding process of automated vehicles has been accurately annotated. For each route, annotations with beginning and end timestamps report the type of road traveled (motorway, trunk, primary, secondary, tertiary, residential, and service roads), the degree of urbanization of the area (city, town, suburban area, village, exurban and rural areas), the weather conditions (clear, cloudy, overcast, and rainy), the level of lighting (daytime, evening, night, and tunnel), the type (asphalt or cobblestones) and moisture status (dry or wet) of the road pavement, and the state of the windows (open or closed). This large-scale dataset is valuable for developing new driving assistance technologies based on audio or video data alone or in a multimodal manner, and for improving the performance of systems currently in use. Acquiring data with sensors in multiple locations also allows assessment of the best installation placement for a given task. Deep learning engineers can use this dataset to build new baselines, as a comparative benchmark, and to extend existing databases for autonomous driving.
Affiliation(s)
- Michela Cantarini
- Department of Information Engineering, Università Politecnica delle Marche, via Brecce Bianche 12, 60131 Ancona, Italy
- Corresponding author.
- Leonardo Gabrielli
- Department of Information Engineering, Università Politecnica delle Marche, via Brecce Bianche 12, 60131 Ancona, Italy
- Adriano Mancini
- Department of Information Engineering, Università Politecnica delle Marche, via Brecce Bianche 12, 60131 Ancona, Italy
- Stefano Squartini
- Department of Information Engineering, Università Politecnica delle Marche, via Brecce Bianche 12, 60131 Ancona, Italy
- Roberto Longo
- Groupe Signal Image et Instrumentation (GSII), École Supérieure d’Électronique de l'Ouest (ESEO), 10 Bd Jeanneteau, 49107 Angers, France
- Laboratoire d'Acoustique de l'Université du Mans (LAUM), UMR 6613, Institut d'Acoustique - Graduate School (IA-GS), CNRS, Le Mans Université, Av. Olivier Messiaen, 72085 Le Mans, France
9
Li K, Chen C, Cao W, Wang H, Han S, Wang R, Ye Z, Wu Z, Wang W, Cai L, Ding D, Yuan Z. DeAF: A multimodal deep learning framework for disease prediction. Comput Biol Med 2023; 156:106715. [PMID: 36867898 DOI: 10.1016/j.compbiomed.2023.106715] [Received: 08/24/2022] [Revised: 02/05/2023] [Accepted: 02/26/2023]
Abstract
Multimodal deep learning models have been applied to disease prediction tasks, but difficulties exist in training due to conflicts between sub-models and fusion modules. To alleviate this issue, we propose a framework for decoupling feature alignment and fusion (DeAF), which separates multimodal model training into two stages. In the first stage, unsupervised representation learning is conducted, and a modality adaptation (MA) module is used to align the features from the various modalities. In the second stage, a self-attention fusion (SAF) module combines the medical image features and clinical data using supervised learning. Moreover, we apply the DeAF framework to predict the postoperative efficacy of CRS for colorectal cancer and whether patients with mild cognitive impairment (MCI) progress to Alzheimer's disease. The DeAF framework achieves a significant improvement in comparison to previous methods. Furthermore, extensive ablation experiments demonstrate the rationality and effectiveness of our framework. In conclusion, our framework enhances the interaction between local medical image features and clinical data, and derives more discriminative multimodal features for disease prediction. The framework implementation is available at https://github.com/cchencan/DeAF.
Affiliation(s)
- Kangshun Li
- College of Mathematics and Informatics, South China Agricultural University, Guangzhou, 510000, China
- Can Chen
- College of Mathematics and Informatics, South China Agricultural University, Guangzhou, 510000, China
- Wuteng Cao
- Department of Radiology, The Sixth Affiliated Hospital, Sun Yat-Sen University, Guangzhou, 510000, China
- Hui Wang
- Department of Colorectal Surgery, Department of General Surgery, Guangdong Provincial Key Laboratory of Colorectal and Pelvic Floor Diseases, The Sixth Affiliated Hospital, Sun Yat-sen University, Guangzhou, 510000, China
- Shuai Han
- General Surgery Center, Department of Gastrointestinal Surgery, Zhujiang Hospital, Southern Medical University, Guangzhou, 510000, China
- Renjie Wang
- Department of Colorectal Surgery, Fudan University Shanghai Cancer Center, Shanghai, 200000, China
- Zaisheng Ye
- Department of Gastrointestinal Surgical Oncology, Fujian Cancer Hospital and Fujian Medical University Cancer Hospital, Fuzhou, 350000, China
- Zhijie Wu
- Department of Colorectal Surgery, Department of General Surgery, Guangdong Provincial Key Laboratory of Colorectal and Pelvic Floor Diseases, The Sixth Affiliated Hospital, Sun Yat-sen University, Guangzhou, 510000, China
- Wenxiang Wang
- College of Mathematics and Informatics, South China Agricultural University, Guangzhou, 510000, China
- Leng Cai
- College of Mathematics and Informatics, South China Agricultural University, Guangzhou, 510000, China
- Deyu Ding
- Department of Economics, University of Konstanz, Konstanz, 350000, Germany
- Zixu Yuan
- Department of Colorectal Surgery, Department of General Surgery, Guangdong Provincial Key Laboratory of Colorectal and Pelvic Floor Diseases, The Sixth Affiliated Hospital, Sun Yat-sen University, Guangzhou, 510000, China
10
Afshar S, Braun PR, Han S, Lin Y. A multimodal deep learning model to infer cell-type-specific functional gene networks. BMC Bioinformatics 2023; 24:47. [PMID: 36788477 PMCID: PMC9926713 DOI: 10.1186/s12859-023-05146-x] [Received: 08/18/2022] [Accepted: 01/11/2023]
Abstract
BACKGROUND Functional gene networks (FGNs) capture functional relationships among genes that vary across tissues and cell types. Constructing cell-type-specific FGNs enables the understanding of cell-type-specific functional gene relationships and provides insights into the genetic mechanisms of human diseases in disease-relevant cell types. However, most existing FGNs were developed without consideration of the specific cell types within tissues. RESULTS In this study, we created a multimodal deep learning model (MDLCN) to predict cell-type-specific FGNs in the human brain by integrating single-nuclei gene expression data with global protein interaction networks. We systematically evaluated the prediction performance of the MDLCN and showed its superior performance compared to two baseline models (a boosting tree and a convolutional neural network). Based on the predicted cell-type-specific FGNs, we observed that cell-type marker genes had a higher level of hubness than non-marker genes in their corresponding cell type. Furthermore, we showed that risk genes underlying autism and Alzheimer's disease were more strongly connected in disease-relevant cell types, supporting the cellular context of the predicted cell-type-specific FGNs. CONCLUSIONS Our study proposes a powerful deep learning approach (MDLCN) to predict FGNs underlying a diverse set of cell types in the human brain. The MDLCN model enhances the prediction accuracy of cell-type-specific FGNs compared to single-modality convolutional neural network (CNN) and boosting tree models, as shown by higher areas under both receiver operating characteristic (ROC) and precision-recall curves for different levels of independent test datasets. The predicted FGNs also show evidence for the cellular context and distinct topological features (i.e., higher hubness and topological score) of cell-type marker genes. Moreover, we observed stronger modularity among disease-associated risk genes in the FGNs of disease-relevant cell types. For example, connectivity among autism risk genes was stronger in neurons, whereas risk genes underlying Alzheimer's disease were more connected in microglia.
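Hubness in a functional gene network is commonly quantified by node degree. A toy sketch with a hypothetical astrocyte sub-network (gene names and edges are illustrative only, not results from the paper):

```python
def degrees(edges):
    # Node degree in an undirected functional gene network given as an edge list.
    deg = {}
    for a, b in edges:
        deg[a] = deg.get(a, 0) + 1
        deg[b] = deg.get(b, 0) + 1
    return deg

# Toy sub-network in which the marker gene GFAP is the hub.
edges = [("GFAP", "AQP4"), ("GFAP", "S100B"), ("GFAP", "ALDH1L1"), ("AQP4", "S100B")]
deg = degrees(edges)
print(deg)  # GFAP has degree 3; the other genes have degree 1 or 2
```

Comparing the degree distribution of marker genes against non-marker genes in the predicted network is the kind of topological check the abstract describes.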
Affiliation(s)
- Shiva Afshar
- Department of Industrial Engineering, University of Houston, Houston, TX 77204, USA
- Patricia R. Braun
- Department of Psychiatry and Behavioral Sciences, Johns Hopkins School of Medicine, Baltimore, MD 21287, USA
- Shizhong Han
- Department of Psychiatry and Behavioral Sciences, Johns Hopkins School of Medicine, Baltimore, MD 21287, USA; Lieber Institute for Brain Development, Baltimore, MD 21205, USA
- Ying Lin
- Department of Industrial Engineering, University of Houston, Houston, TX, 77204, USA
11
Sheng J, Wang B, Zhang Q, Yu M. Connectivity and variability of related cognitive subregions lead to different stages of progression toward Alzheimer's disease. Heliyon 2022; 8:e08827. [PMID: 35128111 PMCID: PMC8803587 DOI: 10.1016/j.heliyon.2022.e08827] [Received: 07/08/2020] [Revised: 04/29/2021] [Accepted: 01/19/2022]
Abstract
Single-modality MRI data are not sufficient to depict and discern the cause of the underlying brain pathology of Alzheimer's disease (AD), and most existing studies do not perform well with multi-group classification. To reveal the structural connectivity, functional connectivity, and functional topological relationships among different stages of mild cognitive impairment (MCI) and AD, this paper proposes a novel method for analyzing regional importance with an improved deep learning model. An obvious drift of related cognitive regions can be observed in the prefrontal lobe and around the cingulate area in the right hemisphere when comparing AD and healthy controls (HC), based on absolute weights in the classification model. Alterations of these regions have previously been reported to be responsible for cognitive impairment. Different parcellation atlases of the human cerebral cortex were compared, and the fine-grained multimodal parcellation HCPMMP, with 180 cortical areas per hemisphere, performed the best. In multi-group classification, the highest accuracy achieved was 96.86%, obtained with structural and functional topological modalities as input to the training model. Weights in the trained model with perfect discriminating ability quantify the importance of each cortical region. To the best of our knowledge, this is the first time such a phenomenon has been observed and weights in cortical areas have been precisely described in AD and its prodromal stages. Our findings can inform other study models for differentiating patterns in diseases with cognitive impairment and help identify the underlying pathology.
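The weight-based regional-importance idea above can be sketched in a few lines. The weight matrix here is a random stand-in (in the study it would come from the trained deep model); only the four-group setting and the 360 HCPMMP cortical areas (180 per hemisphere) are taken from the abstract:

```python
import numpy as np

# Minimal sketch: rank cortical regions by absolute weight magnitude.
# W is hypothetical random data standing in for learned model weights.
rng = np.random.default_rng(0)
n_groups, n_regions = 4, 360          # 4 diagnostic groups; 180 HCPMMP areas x 2 hemispheres
W = rng.normal(size=(n_groups, n_regions))

importance = np.abs(W).mean(axis=0)   # per-region importance from absolute weights
top5 = np.argsort(importance)[::-1][:5]  # indices of the five highest-weighted regions
```

The abstract's "drift of related cognitive regions" corresponds to how such a ranking shifts between group comparisons (e.g., AD vs. HC); with real trained weights the top-ranked indices would map to named HCPMMP areas.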
Collapse
Affiliation(s)
- Jinhua Sheng
- School of Computer Science, Hangzhou Dianzi University, Hangzhou, Zhejiang, 310018, China
- Key Laboratory of Intelligent Image Analysis for Sensory and Cognitive Health, Ministry of Industry and Information Technology of China, Hangzhou, Zhejiang, 310018, China
| | - Bocheng Wang
- School of Computer Science, Hangzhou Dianzi University, Hangzhou, Zhejiang, 310018, China
- Key Laboratory of Intelligent Image Analysis for Sensory and Cognitive Health, Ministry of Industry and Information Technology of China, Hangzhou, Zhejiang, 310018, China
- Communication University of Zhejiang, Hangzhou, Zhejiang, 310018, China
| | - Qiao Zhang
- Beijing Hospital, Beijing, 100730, China
- Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing, 100730, China
| | - Margaret Yu
- Department of Neurology, Northwestern University Feinberg School of Medicine, Chicago, IL, 60611, USA
| |
Collapse
|
12
|
Menegotto AB, Becker CDL, Cazella SC. Computer-aided diagnosis of hepatocellular carcinoma fusing imaging and structured health data. Health Inf Sci Syst 2021; 9:20. [PMID: 33968399 PMCID: PMC8096870 DOI: 10.1007/s13755-021-00151-x] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/06/2020] [Accepted: 04/20/2021] [Indexed: 12/21/2022] Open
Abstract
INTRODUCTION Hepatocellular carcinoma is the most prevalent primary liver cancer, a silent disease that killed 782,000 people worldwide in 2018. Multimodal deep learning is the application of deep learning techniques fusing more than one data modality as the model's input. PURPOSE A computer-aided diagnosis system for hepatocellular carcinoma developed with multimodal deep learning approaches could use multiple data modalities as recommended by clinical guidelines, and enhance the robustness and value of the second opinion given to physicians. This article describes the creation and evaluation of an algorithm for computer-aided diagnosis of hepatocellular carcinoma developed with multimodal deep learning techniques, fusing preprocessed computed tomography images with structured data from patient Electronic Health Records. RESULTS The classification performance achieved by the proposed algorithm on the test dataset was: accuracy = 86.9%, precision = 89.6%, recall = 86.9%, and F-score = 86.7%. These metrics are close to the state of the art in this area and were achieved with data modalities that are cheaper than traditional magnetic resonance imaging approaches, enabling use of the proposed algorithm by small and mid-sized healthcare institutions. CONCLUSION The classification performance achieved with the multimodal deep learning algorithm is higher than the diagnostic performance of human specialists using only CT. Even though the results are promising, the multimodal deep learning architecture used for hepatocellular carcinoma prediction needs further training and testing on different datasets before physicians use the proposed algorithm in real healthcare routines. The additional training aims to confirm the classification performance achieved and enhance the model's robustness.
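The fusion of imaging features with structured EHR data described above can be sketched as a simple late-fusion step: concatenate the two modality vectors and score the result. All feature values and weights below are hypothetical placeholders, not the authors' architecture or data:

```python
import numpy as np

# Minimal late-fusion sketch: concatenate image-derived features with
# structured EHR features, then apply one logistic unit. Values are
# hypothetical; the study uses a full multimodal deep network.
def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

img_feat = np.array([0.8, 0.1, 0.3])           # e.g. pooled CT-image features (made up)
ehr_feat = np.array([1.0, 0.0, 0.62])          # e.g. cirrhosis flag, sex, scaled age (made up)
fused = np.concatenate([img_feat, ehr_feat])   # joint representation of both modalities

w = np.array([0.5, -0.2, 0.1, 0.9, 0.0, 0.4])  # hypothetical learned weights
b = -0.7
p = sigmoid(w @ fused + b)                     # probability-like diagnosis score
```

The design point this illustrates is that fusion happens at the feature level, so the classifier can weigh cheap CT-derived evidence jointly with clinical variables instead of relying on either modality alone.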
Collapse
Affiliation(s)
- Alan Baronio Menegotto
- Universidade Federal de Ciências da Saúde de Porto Alegre, Rua Sarmento Leite, 245-Porto Alegre, Rio Grande do Sul, Brazil
| | - Carla Diniz Lopes Becker
- Universidade Federal de Ciências da Saúde de Porto Alegre, Rua Sarmento Leite, 245-Porto Alegre, Rio Grande do Sul, Brazil
| | - Silvio Cesar Cazella
- Universidade Federal de Ciências da Saúde de Porto Alegre, Rua Sarmento Leite, 245-Porto Alegre, Rio Grande do Sul, Brazil
| |
Collapse
|
13
|
Pan X, Shen HB. RNA-protein binding motifs mining with a new hybrid deep learning based cross-domain knowledge integration approach. BMC Bioinformatics 2017; 18:136. [PMID: 28245811 PMCID: PMC5331642 DOI: 10.1186/s12859-017-1561-8] [Citation(s) in RCA: 108] [Impact Index Per Article: 15.4] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/15/2016] [Accepted: 02/23/2017] [Indexed: 01/08/2023] Open
Abstract
Background RNAs play key roles in cells through interactions with proteins known as RNA-binding proteins (RBPs), and their binding motifs enable crucial understanding of the post-transcriptional regulation of RNAs. How RBPs correctly recognize their target RNAs and why they bind specific positions is still far from clear. Machine learning-based algorithms are widely acknowledged to be capable of speeding up this process. Although many automatic tools have been developed to predict RNA-protein binding sites from the rapidly growing multi-resource data, e.g., sequence and structure, their domain-specific features and formats have posed significant computational challenges. One current difficulty is that the cross-source shared common knowledge is at a higher abstraction level beyond the observed data, resulting in a low efficiency of direct integration of observed data across domains. The other difficulty is how to interpret the prediction results: existing approaches tend to terminate after outputting the potential discrete binding sites on the sequences, but how to assemble them into meaningful binding motifs is a topic worth further investigation. Results In view of these challenges, we propose a deep learning-based framework (iDeep) using a novel hybrid convolutional neural network and deep belief network to predict RBP interaction sites and motifs on RNAs. This new protocol transforms the original observed data into a high-level abstraction feature space using multiple layers of learning blocks, where the shared representations across different domains are integrated. To validate our iDeep method, we performed experiments on 31 large-scale CLIP-seq datasets. Our results show that by integrating multiple sources of data, the average AUC can be improved by 8% compared to the best single-source-based predictor, and that through cross-domain knowledge integration at an abstraction level, iDeep outperforms the state-of-the-art predictors by 6%. Besides the overall enhanced prediction performance, the convolutional neural network module embedded in iDeep is also able to automatically capture interpretable binding motifs for RBPs. Large-scale experiments demonstrate that these mined binding motifs agree well with experimentally verified results, suggesting iDeep is a promising approach for real-world applications. Conclusion The iDeep framework not only achieves more promising performance than the state-of-the-art predictors, but also easily captures interpretable binding motifs. iDeep is available at http://www.csbio.sjtu.edu.cn/bioinf/iDeep Electronic supplementary material The online version of this article (doi:10.1186/s12859-017-1561-8) contains supplementary material, which is available to authorized users.
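The mechanism by which a convolutional module surfaces binding motifs, as described above, can be sketched minimally: one-hot encode the sequence and slide a filter over it, reading off the position of maximal activation. The sequence and the 3-mer filter below are hypothetical, far smaller than iDeep's learned filters:

```python
import numpy as np

# Minimal sketch of convolutional motif scanning on an RNA sequence.
# The sequence and the "ACU" filter are hypothetical toy examples.
ALPHABET = "ACGU"

def one_hot(seq):
    """Encode an RNA string as a (length, 4) one-hot matrix."""
    x = np.zeros((len(seq), 4))
    for i, base in enumerate(seq):
        x[i, ALPHABET.index(base)] = 1.0
    return x

seq = "GGGACUAGCACU"
motif = one_hot("ACU")          # a 3-mer filter that "detects" ACU
x = one_hot(seq)

# Convolution as a sliding dot product; each score is the filter's
# activation at that sequence position.
scores = np.array([(x[i:i + 3] * motif).sum() for i in range(len(seq) - 2)])
best = int(scores.argmax())     # position of the strongest motif hit
```

In iDeep the filters are learned rather than fixed, and high-activation subsequences aggregated across many inputs form the reported motifs; this sketch only shows why a convolutional filter's activations localize motif instances.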
Collapse
Affiliation(s)
- Xiaoyong Pan
- Department of Veterinary Clinical and Animal Sciences, University of Copenhagen, Copenhagen, Denmark.
| | - Hong-Bin Shen
- Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai, China.
| |
Collapse
|