1
Wang J, Wang L, Liu Y, Li X, Ma J, Li M, Zhu Y. Comprehensive Evaluation of Multi-Omics Clustering Algorithms for Cancer Molecular Subtyping. Int J Mol Sci 2025; 26:963. [PMID: 39940732 PMCID: PMC11816650 DOI: 10.3390/ijms26030963] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/14/2024] [Revised: 01/15/2025] [Accepted: 01/21/2025] [Indexed: 02/16/2025]
Abstract
Cancer is a highly heterogeneous and complex disease, and identifying its molecular subtypes is crucial for accurate diagnosis and personalized treatment. The integration of multi-omics data enables a comprehensive interpretation of the molecular characteristics of cancer at various biological levels. In recent years, an increasing number of multi-omics clustering algorithms for cancer molecular subtyping have been proposed. However, the absence of a definitive gold standard makes it challenging to evaluate and compare these methods effectively. In this study, we developed a general framework for the comprehensive evaluation of multi-omics clustering algorithms and introduced an innovative metric, the accuracy-weighted average index, which simultaneously considers both clustering performance and clinical relevance. Using this framework, we performed a thorough evaluation and comparison of 11 state-of-the-art multi-omics clustering algorithms, including deep learning-based methods. By integrating the accuracy-weighted average index with computational efficiency, our analysis reveals that PIntMF demonstrates the best overall performance, making it a promising tool for molecular subtyping across a wide range of cancers.
Affiliation(s)
- Juan Wang
  - School of Basic Medical Sciences, Anhui Medical University, Hefei 230032, China;
  - State Key Laboratory of Medical Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing 102206, China; (L.W.); (Y.L.); (X.L.); (J.M.)
- Lingxiao Wang
  - State Key Laboratory of Medical Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing 102206, China; (L.W.); (Y.L.); (X.L.); (J.M.)
- Yi Liu
  - State Key Laboratory of Medical Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing 102206, China; (L.W.); (Y.L.); (X.L.); (J.M.)
- Xiao Li
  - State Key Laboratory of Medical Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing 102206, China; (L.W.); (Y.L.); (X.L.); (J.M.)
- Jie Ma
  - State Key Laboratory of Medical Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing 102206, China; (L.W.); (Y.L.); (X.L.); (J.M.)
- Mansheng Li
  - State Key Laboratory of Medical Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing 102206, China; (L.W.); (Y.L.); (X.L.); (J.M.)
- Yunping Zhu
  - School of Basic Medical Sciences, Anhui Medical University, Hefei 230032, China;
  - State Key Laboratory of Medical Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing 102206, China; (L.W.); (Y.L.); (X.L.); (J.M.)
2
Chen F, Peng W, Dai W, Wei S, Fu X, Liu L, Liu L. Supervised graph contrastive learning for cancer subtype identification through multi-omics data integration. Health Inf Sci Syst 2024; 12:12. [PMID: 38404715 PMCID: PMC10891026 DOI: 10.1007/s13755-024-00274-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/28/2023] [Accepted: 01/09/2024] [Indexed: 02/27/2024]
Abstract
Cancer is one of the deadliest diseases in the world. Accurate cancer subtype classification is critical for patient diagnosis, treatment, and prognosis. Ever-increasing multi-omics data describe the characteristics of patients from different views and serve as complementary information to promote cancer subtype identification. However, omics data generally have different distributions and high dimensions. How to effectively integrate multiple omics data to classify cancer subtypes accurately is a challenge for researchers. This work proposes a method integrating multi-omics data based on supervised graph contrastive learning (MCRGCN) to classify cancer subtypes. The method considers the unique feature distribution of each omics data type and the interaction of features across omics data types to improve the accuracy of cancer subtype classification. To achieve this, MCRGCN first constructs different sample networks based on the multi-omics data of the samples. Then, it feeds the omics data and the adjacency matrix of the samples into different residual graph convolution models to obtain multi-omics features of the samples, which are trained with a supervised contrastive loss so that the sample features from each omics type are as consistent as possible. Finally, we input the sample features combining multi-omics features into a classifier to obtain the cancer subtypes. We applied MCRGCN to the invasive breast carcinoma (BRCA) and glioblastoma multiforme (GBM) datasets, integrating gene expression, miRNA expression, and DNA methylation data. The results demonstrate that our model is superior to other methods in integrating multi-omics data. Moreover, the results of survival analysis experiments demonstrate that the cancer subtypes identified by our model have significant clinical features. Furthermore, our model can help to identify potential biomarkers and pathways associated with cancer subtypes.
Affiliation(s)
- Fangxu Chen
  - Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, 650500 Yunnan China
  - Computer Technology Application Key Lab of Yunnan Province, Kunming University of Science and Technology, Kunming, 650050 China
- Wei Peng
  - Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, 650500 Yunnan China
  - Computer Technology Application Key Lab of Yunnan Province, Kunming University of Science and Technology, Kunming, 650050 China
- Wei Dai
  - Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, 650500 Yunnan China
  - Computer Technology Application Key Lab of Yunnan Province, Kunming University of Science and Technology, Kunming, 650050 China
- Shoulin Wei
  - Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, 650500 Yunnan China
  - Computer Technology Application Key Lab of Yunnan Province, Kunming University of Science and Technology, Kunming, 650050 China
- Xiaodong Fu
  - Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, 650500 Yunnan China
  - Computer Technology Application Key Lab of Yunnan Province, Kunming University of Science and Technology, Kunming, 650050 China
- Li Liu
  - Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, 650500 Yunnan China
  - Computer Technology Application Key Lab of Yunnan Province, Kunming University of Science and Technology, Kunming, 650050 China
- Lijun Liu
  - Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, 650500 Yunnan China
  - Computer Technology Application Key Lab of Yunnan Province, Kunming University of Science and Technology, Kunming, 650050 China
3
Kuang Y, Xie M, Zhao Z, Deng D, Bao E. Multi-view contrastive clustering for cancer subtyping using fully and weakly paired multi-omics data. Methods 2024; 232:1-8. [PMID: 39423914 DOI: 10.1016/j.ymeth.2024.09.016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/05/2024] [Revised: 09/22/2024] [Accepted: 09/26/2024] [Indexed: 10/21/2024]
Abstract
The identification of cancer subtypes is crucial for advancing precision medicine, as it facilitates the development of more effective and personalized treatment and prevention strategies. With the development of high-throughput sequencing technologies, researchers now have access to a wealth of multi-omics data from cancer patients, making computational cancer subtyping increasingly feasible. One of the main challenges in integrating multi-omics data is handling missing data, since not all biomolecules are consistently measured across all samples. Current computational models based on multi-omics data for cancer subtyping often struggle with the challenge of weakly paired omics data. To address this challenge, we propose a novel unsupervised cancer subtyping model named Subtype-MVCC. This model leverages graph convolutional networks to extract and represent low-dimensional features from each omics data type, using intra-view and inter-view contrastive learning approaches. By incorporating a weighted average fusion strategy to unify the dimension of each sample, Subtype-MVCC effectively handles weakly paired multi-omics datasets. Comprehensive evaluations on established benchmark datasets demonstrate that Subtype-MVCC outperforms nine leading models in this domain. Additionally, simulations with varying levels of missing data highlight the model's robust performance in handling weakly paired omics data. The clinical relevance and survival outcomes associated with the identified subtypes further validate the interpretability and reliability of the clustering results produced by Subtype-MVCC.
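The weighted average fusion mentioned in this abstract can be illustrated with a minimal Python sketch. It assumes each omics view has already been embedded into vectors of the same length and that a missing view is marked with `None`; the function name `fuse_views`, the view names, and the fixed weights are hypothetical and are not taken from Subtype-MVCC itself.

```python
from typing import Dict, List, Optional


def fuse_views(
    embeddings: Dict[str, Optional[List[float]]],
    weights: Dict[str, float],
) -> List[float]:
    """Weighted average of the embeddings of the views measured for one sample.

    Views whose embedding is None (not measured) are skipped, and the weights
    of the remaining views are renormalized so that they sum to 1.
    """
    available = {v: e for v, e in embeddings.items() if e is not None}
    if not available:
        raise ValueError("sample has no measured omics view")

    total = sum(weights[v] for v in available)
    if total <= 0:
        raise ValueError("weights of the available views must sum to a positive value")

    dim = len(next(iter(available.values())))
    fused = [0.0] * dim
    for view, emb in available.items():
        if len(emb) != dim:
            raise ValueError("all view embeddings must share one dimension")
        w = weights[view] / total  # renormalized weight of this view
        for k, x in enumerate(emb):
            fused[k] += w * x
    return fused


# Hypothetical sample with a missing methylation view.
sample = {"mrna": [1.0, 3.0], "methylation": None, "mirna": [3.0, 1.0]}
print(fuse_views(sample, {"mrna": 1.0, "methylation": 1.0, "mirna": 1.0}))
```

Because the weights are renormalized over the available views, every sample ends up with a vector of the same dimension regardless of how many views were measured, which is the property a downstream clustering step needs.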
Affiliation(s)
- Yabin Kuang
  - College of Information Science and Engineering, Hunan Normal University, China.
- Minzhu Xie
  - College of Information Science and Engineering, Hunan Normal University, China.
- Zhanhong Zhao
  - College of Information Science and Engineering, Hunan Normal University, China.
- Dongze Deng
  - College of Information Science and Engineering, Hunan Normal University, China.
- Ergude Bao
  - School of Software Engineering, Beijing Jiaotong University, China.
4
Shen J, Guo X, Bai H, Luo J. CAEM-GBDT: a cancer subtype identifying method using multi-omics data and convolutional autoencoder network. Front Bioinform 2024; 4:1403826. [PMID: 39077754 PMCID: PMC11284046 DOI: 10.3389/fbinf.2024.1403826] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/20/2024] [Accepted: 06/13/2024] [Indexed: 07/31/2024]
Abstract
The identification of cancer subtypes plays a very important role in the field of medicine. Accurate identification of cancer subtypes is helpful for both cancer treatment and prognosis. Currently, most methods for cancer subtype identification are based on single-omics data, such as gene expression data. However, multi-omics data can reveal various characteristics of cancer, which can also improve the accuracy of cancer subtype identification. Therefore, how to extract features from multi-omics data for cancer subtype identification is the main challenge currently faced by researchers. In this paper, we propose a cancer subtype identification method named CAEM-GBDT, which takes gene expression data, miRNA expression data, and DNA methylation data as input, and adopts a convolutional autoencoder network to identify cancer subtypes. Through a convolutional encoder layer, the method performs feature extraction on the input data. Within the convolutional encoder layer, a convolutional self-attention module is embedded to recognize higher-level representations of the multi-omics data. The extracted high-level representations from the convolutional encoder are then concatenated with the input to the decoder. A Gradient Boosting Decision Tree (GBDT) is utilized for cancer subtype identification. In the experiments, we compare CAEM-GBDT with existing cancer subtype identification methods. Experimental results demonstrate that the proposed CAEM-GBDT outperforms other methods. The source code is available from GitHub at https://github.com/gxh-1/CAEM-GBDT.git.
Affiliation(s)
- Junwei Luo
  - School of Software, Henan Polytechnic University, Jiaozuo, China
5
Yang H, Zhao L, Li D, An C, Fang X, Chen Y, Liu J, Xiao T, Wang Z. Subtype-WGME enables whole-genome-wide multi-omics cancer subtyping. Cell Rep Methods 2024; 4:100781. [PMID: 38761803 PMCID: PMC11228280 DOI: 10.1016/j.crmeth.2024.100781] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/10/2023] [Revised: 01/05/2024] [Accepted: 04/26/2024] [Indexed: 05/20/2024]
Abstract
We present an innovative strategy for integrating whole-genome-wide multi-omics data, which facilitates adaptive amalgamation by leveraging hidden layer features derived from high-dimensional omics data through a multi-task encoder. Empirical evaluations on eight benchmark cancer datasets substantiated that our proposed framework outstripped the comparative algorithms in cancer subtyping, delivering superior subtyping outcomes. Building upon these subtyping results, we establish a robust pipeline for identifying whole-genome-wide biomarkers, unearthing 195 significant biomarkers. Furthermore, we conduct an exhaustive analysis to assess the importance of each omic and non-coding region features at the whole-genome-wide level during cancer subtyping. Our investigation shows that both omics and non-coding region features substantially impact cancer development and survival prognosis. This study emphasizes the potential and practical implications of integrating genome-wide data in cancer research, demonstrating the potency of comprehensive genomic characterization. Additionally, our findings offer insightful perspectives for multi-omics analysis employing deep learning methodologies.
Affiliation(s)
- Hai Yang
  - Department of Computer Science and Engineering, East China University of Science and Technology, Shanghai 200237, China
- Liang Zhao
  - Department of Computer Science and Engineering, East China University of Science and Technology, Shanghai 200237, China
- Dongdong Li
  - Department of Computer Science and Engineering, East China University of Science and Technology, Shanghai 200237, China
- Congcong An
  - Department of Computer Science and Engineering, East China University of Science and Technology, Shanghai 200237, China
- Xiaoyang Fang
  - Cornell Tech, Cornell University, New York, NY 14853, USA
- Yiwen Chen
  - Center for Continuing and Lifelong Education, National University of Singapore, Singapore 119077, Singapore
- Jingping Liu
  - Department of Computer Science and Engineering, East China University of Science and Technology, Shanghai 200237, China
- Ting Xiao
  - Department of Computer Science and Engineering, East China University of Science and Technology, Shanghai 200237, China
- Zhe Wang
  - Department of Computer Science and Engineering, East China University of Science and Technology, Shanghai 200237, China.
6
Cai Y, Wang S. Deeply integrating latent consistent representations in high-noise multi-omics data for cancer subtyping. Brief Bioinform 2024; 25:bbae061. [PMID: 38426322 PMCID: PMC10939425 DOI: 10.1093/bib/bbae061] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2023] [Revised: 01/13/2024] [Accepted: 01/29/2024] [Indexed: 03/02/2024]
Abstract
Cancer is a complex and high-mortality disease regulated by multiple factors. Accurate cancer subtyping is crucial for formulating personalized treatment plans and improving patient survival rates. The underlying mechanisms that drive cancer progression can be comprehensively understood by analyzing multi-omics data. However, the high noise levels in omics data often pose challenges in capturing consistent representations and adequately integrating their information. This paper proposes a novel variational autoencoder-based deep learning model, named Deeply Integrating Latent Consistent Representations (DILCR). First, multiple independent variational autoencoders and contrastive loss functions were designed to separate noise from omics data and capture latent consistent representations. Subsequently, an Attention Deep Integration Network was proposed to effectively integrate consistent representations across different omics levels. Additionally, we introduced the Improved Deep Embedded Clustering algorithm to make the integrated variables clustering-friendly. The effectiveness of DILCR was evaluated using 10 typical cancer datasets from The Cancer Genome Atlas and compared with 14 state-of-the-art integration methods. The results demonstrated that DILCR effectively captures the consistent representations in omics data and outperforms other integration methods in cancer subtyping. In the Kidney Renal Clear Cell Carcinoma case study, the cancer subtypes identified by DILCR showed clear biological significance and interpretability.
Affiliation(s)
- Yueyi Cai
  - Department of Computer Science and Engineering, School of Information Science and Engineering, Yunnan University, Kunming, 650504, Yunnan, China
- Shunfang Wang
  - Department of Computer Science and Engineering, School of Information Science and Engineering, Yunnan University, Kunming, 650504, Yunnan, China
7
Kang DW, Park GH, Ryu WS, Schellingerhout D, Kim M, Kim YS, Park CY, Lee KJ, Han MK, Jeong HG, Kim DE. Strengthening deep-learning models for intracranial hemorrhage detection: strongly annotated computed tomography images and model ensembles. Front Neurol 2023; 14:1321964. [PMID: 38221995 PMCID: PMC10784380 DOI: 10.3389/fneur.2023.1321964] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/15/2023] [Accepted: 12/11/2023] [Indexed: 01/16/2024]
Abstract
Background and purpose: Multiple attempts at intracranial hemorrhage (ICH) detection using deep-learning techniques have been plagued by clinical failures. We aimed to compare the performance of a deep-learning algorithm for ICH detection trained on strongly and weakly annotated datasets, and to assess whether a weighted ensemble model that integrates separate models trained on datasets of different ICH types improves performance.
Methods: We used brain CT scans from the Radiological Society of North America (27,861 CT scans, 3,528 ICHs) and AI-Hub (53,045 CT scans, 7,013 ICHs) for training. DenseNet121, InceptionResNetV2, MobileNetV2, and VGG19 were trained on strongly and weakly annotated datasets and compared using independent external test datasets. We then developed a weighted ensemble model combining separate models trained on all ICH, subdural hemorrhage (SDH), subarachnoid hemorrhage (SAH), and small-lesion ICH cases. The final weighted ensemble model was compared to four well-known deep-learning models. After external testing, six neurologists reviewed 91 ICH cases that were difficult for both AI and humans.
Results: The InceptionResNetV2, MobileNetV2, and VGG19 models performed better when trained on strongly annotated datasets. A weighted ensemble model combining models trained on SDH, SAH, and small-lesion ICH had a higher AUC than a model trained on all ICH cases only. This model outperformed four deep-learning models (AUC [95% C.I.]: Ensemble model, 0.953 [0.938-0.965]; InceptionResNetV2, 0.852 [0.828-0.873]; DenseNet121, 0.875 [0.852-0.895]; VGG19, 0.796 [0.770-0.821]; MobileNetV2, 0.650 [0.620-0.680]; p < 0.0001). In addition, the case review showed that a better understanding and management of difficult cases may facilitate clinical use of ICH detection algorithms.
Conclusion: We propose a weighted ensemble model for ICH detection, trained on large-scale, strongly annotated CT scans, as no single model can capture all aspects of complex tasks.
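A weighted ensemble of the kind described in this abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: each sub-model is assumed to output one ICH probability per scan, the ensemble is a normalized weighted mean of those probabilities, and AUC is computed by pairwise comparison. The function names, model names, weights, and scores are all hypothetical.

```python
from typing import Dict, List


def weighted_ensemble(
    probs: Dict[str, List[float]],
    weights: Dict[str, float],
) -> List[float]:
    """Combine per-model probabilities into one score per case.

    The weights are normalized to sum to 1, so the result stays in [0, 1].
    """
    total = sum(weights[name] for name in probs)
    if total <= 0:
        raise ValueError("model weights must sum to a positive value")

    n_cases = len(next(iter(probs.values())))
    combined = [0.0] * n_cases
    for name, model_probs in probs.items():
        if len(model_probs) != n_cases:
            raise ValueError("every model must score the same cases")
        w = weights[name] / total
        for i, p in enumerate(model_probs):
            combined[i] += w * p
    return combined


def auc(labels: List[int], scores: List[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Tied pairs count as half, which matches the Mann-Whitney U formulation.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical scores from three specialised sub-models on four scans.
scores = {
    "sdh_model": [0.10, 0.70, 0.20, 0.90],
    "sah_model": [0.20, 0.30, 0.60, 0.80],
    "small_lesion_model": [0.05, 0.40, 0.50, 0.70],
}
model_weights = {"sdh_model": 2.0, "sah_model": 1.0, "small_lesion_model": 1.0}
labels = [0, 1, 0, 1]
print(auc(labels, weighted_ensemble(scores, model_weights)))
```

How the weights are chosen is the crux of such a model; the abstract does not state the procedure, so the fixed weights above are placeholders.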
Affiliation(s)
- Dong-Wan Kang
  - Department of Public Health, Seoul National University Bundang Hospital, Seongnam, Republic of Korea
  - Department of Neurology, Gyeonggi Provincial Medical Center, Icheon Hospital, Icheon, Republic of Korea
  - Department of Neurology, Seoul National University Bundang Hospital, Seoul National University College of Medicine, Seongnam, Republic of Korea
- Gi-Hun Park
  - JLK Inc., Artificial Intelligence Research Center, Seoul, Republic of Korea
- Wi-Sun Ryu
  - JLK Inc., Artificial Intelligence Research Center, Seoul, Republic of Korea
- Dawid Schellingerhout
  - Department of Neuroradiology and Imaging Physics, The University of Texas M.D. Anderson Cancer Center, Houston, TX, United States
- Museong Kim
  - Department of Neurosurgery, Seoul National University Bundang Hospital, Seoul National University College of Medicine, Seongnam, Republic of Korea
  - Hospital Medicine Center, Seoul National University Bundang Hospital, Seoul National University College of Medicine, Seongnam, Republic of Korea
- Yong Soo Kim
  - Department of Neurology, Nowon Eulji Medical Center, Eulji University School of Medicine, Seoul, Republic of Korea
- Chan-Young Park
  - Department of Neurology, Chung-Ang University Hospital, Seoul, Republic of Korea
- Keon-Joo Lee
  - Department of Neurology, Korea University Guro Hospital, Seoul, Republic of Korea
- Moon-Ku Han
  - Department of Neurology, Seoul National University Bundang Hospital, Seoul National University College of Medicine, Seongnam, Republic of Korea
- Han-Gil Jeong
  - Department of Neurology, Seoul National University Bundang Hospital, Seoul National University College of Medicine, Seongnam, Republic of Korea
  - Department of Neurosurgery, Seoul National University Bundang Hospital, Seoul National University College of Medicine, Seongnam, Republic of Korea
- Dong-Eog Kim
  - Department of Neurology, Dongguk University Ilsan Hospital, Goyang, Republic of Korea
  - National Priority Research Center for Stroke, Goyang, Republic of Korea
8
Chen W, Wang H, Liang C. Deep multi-view contrastive learning for cancer subtype identification. Brief Bioinform 2023; 24:bbad282. [PMID: 37539822 DOI: 10.1093/bib/bbad282] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 03/22/2023] [Revised: 05/29/2023] [Accepted: 07/19/2023] [Indexed: 08/05/2023]
Abstract
Cancer heterogeneity has posed great challenges in exploring precise therapeutic strategies for cancer treatment. The identification of cancer subtypes aims to detect patients with distinct molecular profiles and thus could provide new clues on effective clinical therapies. While great efforts have been made, it remains challenging to develop powerful computational methods that can efficiently integrate multi-omics datasets for the task. In this paper, we propose a novel self-supervised learning model called Deep Multi-view Contrastive Learning (DMCL) for cancer subtype identification. Specifically, by incorporating the reconstruction loss, contrastive loss and clustering loss into a unified framework, our model simultaneously encodes the sample discriminative information into the extracted feature representations and well preserves the sample cluster structures in the embedded space. Moreover, DMCL is an end-to-end framework where the cancer subtypes could be directly obtained from the model outputs. We compare DMCL with eight alternatives ranging from classic cancer subtype identification methods to recently developed state-of-the-art systems on 10 widely used cancer multi-omics datasets as well as an integrated dataset, and the experimental results validate the superior performance of our method. We further conduct a case study on liver cancer and the analysis results indicate that different subtypes might have different responses to the selected chemotherapeutic drugs.
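The contrastive term mentioned in this abstract can be illustrated with a generic cross-view InfoNCE loss. The abstract does not give DMCL's exact formulation, so the following Python sketch is an assumption-laden stand-in: the embedding of a sample in one omics view is pulled toward the embedding of the same sample in another view and pushed away from the other samples. The function names and the temperature value are hypothetical.

```python
import math
from typing import List


def cosine(u: List[float], v: List[float]) -> float:
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cross_view_info_nce(
    view_a: List[List[float]],
    view_b: List[List[float]],
    temperature: float = 0.5,
) -> float:
    """Mean InfoNCE loss with view_a as anchors and view_b as candidates.

    Row i of both views is assumed to describe the same sample, so the pair
    (view_a[i], view_b[i]) is the positive and all other rows are negatives.
    """
    n = len(view_a)
    if n == 0 or n != len(view_b):
        raise ValueError("both views must contain the same non-zero number of samples")

    loss = 0.0
    for i in range(n):
        logits = [cosine(view_a[i], view_b[j]) / temperature for j in range(n)]
        # Log-sum-exp with the maximum subtracted for numerical stability.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
        loss += log_denominator - logits[i]
    return loss / n


# Two samples whose views agree give a lower loss than mismatched views.
aligned = cross_view_info_nce([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
swapped = cross_view_info_nce([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
print(aligned < swapped)
```

In a unified framework of the kind the abstract describes, a term like this would be added to the reconstruction and clustering losses, with the relative weighting being a tuning choice that the abstract does not specify.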
Affiliation(s)
- Wenlan Chen
  - School of Information Science and Engineering, Shandong Normal University, Jinan, 250358, China
- Hong Wang
  - School of Information Science and Engineering, Shandong Normal University, Jinan, 250358, China
- Cheng Liang
  - School of Information Science and Engineering, Shandong Normal University, Jinan, 250358, China
9
Zhang Z, Xu J, Wu Y, Liu N, Wang Y, Liang Y. CapsNet-LDA: predicting lncRNA-disease associations using attention mechanism and capsule network based on multi-view data. Brief Bioinform 2023; 24:6889447. [PMID: 36511221 DOI: 10.1093/bib/bbac531] [Citation(s) in RCA: 20] [Impact Index Per Article: 10.0] [Received: 07/24/2022] [Revised: 10/25/2022] [Accepted: 11/07/2022] [Indexed: 12/15/2022]
Abstract
Cumulative studies have shown that many long non-coding RNAs (lncRNAs) are crucial in a number of diseases. Predicting potential lncRNA-disease associations (LDAs) can facilitate disease prevention, diagnosis and treatment. Therefore, it is vital to develop practical computational methods for LDA prediction. In this study, we propose a novel predictor named capsule network (CapsNet)-LDA for LDA prediction. CapsNet-LDA first uses a stacked autoencoder to acquire informative low-dimensional representations of the lncRNA-disease pairs under multiple views; an attention mechanism then adaptively allocates importance weights to them, and they are subsequently processed by a CapsNet-based architecture to predict LDAs. Unlike conventional convolutional neural networks (CNNs), which are restricted by their use of scalar neurons and pooling operations, CapsNets use vector neurons, which are more robust to complex combinations of features, and update parameters through dynamic routing. CapsNet-LDA is superior to five other state-of-the-art models on four benchmark datasets, four perturbed datasets and an independent test set in the comparison experiments, demonstrating that CapsNet-LDA has excellent performance and robustness against perturbation, as well as good generalization ability. The ablation studies verify the effectiveness of some modules of CapsNet-LDA. Moreover, multi-view data are shown to improve performance. Case studies further indicate that CapsNet-LDA can accurately predict novel LDAs for specific diseases.
Affiliation(s)
- Zequn Zhang
  - College of Computer and Information Engineering, Jiangxi Agricultural University, Nanchang, 310045 Jiangxi, China
- Junlin Xu
  - College of Information Science and Engineering, Hunan University, Changsha 410082, Hunan, China
- Yanan Wu
  - College of Computer and Information Engineering, Jiangxi Agricultural University, Nanchang, 310045 Jiangxi, China
- Niannian Liu
  - College of Computer and Information Engineering, Jiangxi Agricultural University, Nanchang, 310045 Jiangxi, China
- Yinglong Wang
  - College of Computer and Information Engineering, Jiangxi Agricultural University, Nanchang, 310045 Jiangxi, China
- Ying Liang
  - College of Computer and Information Engineering, Jiangxi Agricultural University, Nanchang, 310045 Jiangxi, China
10
Li L, Wei Y, Shi G, Yang H, Li Z, Fang R, Cao H, Cui Y. Multi-omics data integration for subtype identification of Chinese lower-grade gliomas: a joint similarity network fusion approach. Comput Struct Biotechnol J 2022; 20:3482-3492. [PMID: 35860412 PMCID: PMC9284445 DOI: 10.1016/j.csbj.2022.06.065] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/19/2022] [Revised: 06/30/2022] [Accepted: 06/30/2022] [Indexed: 12/28/2022]
Abstract
Lower-grade gliomas (LGG), characterized by heterogeneity and invasiveness, originate from the central nervous system. Although studies focusing on molecular subtyping and molecular characteristics have provided novel insights into improving the diagnosis and therapy of LGG, there is an urgent need to identify new molecular subtypes and biomarkers that promise to improve patient survival outcomes. Here, we propose a joint similarity network fusion (Joint-SNF) method that integrates different omics data types to construct a fused network using the Joint and Individual Variation Explained (JIVE) technique under the SNF framework. Focusing on the joint network structure, a spectral clustering method was employed to obtain subtypes of patients. Simulation studies show that the proposed Joint-SNF method outperforms the original SNF approach under various simulation scenarios. We further applied the method to a Chinese LGG data set including mRNA expression, DNA methylation and microRNA (miRNA). Three molecular subtypes were identified and showed statistically significant differences in patient survival outcomes. The five-year mortality rates of the three subtypes are 80.8%, 32.1%, and 34.4%, respectively. After adjusting for clinically relevant covariates, the death risk of patients in Cluster 1 was 5.06 times higher than that of patients in other clusters. The fused network attained by the proposed Joint-SNF method enhances strong similarities, thus greatly improving subtyping performance compared to the original SNF method. The findings in the real application may provide important clues for improving patient survival outcomes and for precision treatment for Chinese LGG patients. An R package implementing the method is available on GitHub at https://github.com/Sameerer/Joint-SNF.
Affiliation(s)
- Lingmei Li
- Division of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, Shanxi 030001, PR China
| | - Yifang Wei
- Division of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, Shanxi 030001, PR China
| | - Guojing Shi
- Division of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, Shanxi 030001, PR China
| | - Haitao Yang
- Division of Health Statistics, School of Public Health, Hebei Medical University, Shijiazhuang, Hebei 050017, PR China
| | - Zhi Li
- Department of Hematology, Taiyuan Central Hospital of Shanxi Medical University, Taiyuan, Shanxi 030001, PR China
| | - Ruiling Fang
- Division of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, Shanxi 030001, PR China
| | - Hongyan Cao
- Division of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, Shanxi 030001, PR China
- Shanxi Medical University-Yidu Cloud Institute of Medical Data Science, Taiyuan, Shanxi 030001, PR China
- Corresponding authors at: Division of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, Shanxi, PR China.
| | - Yuehua Cui
- Department of Statistics and Probability, Michigan State University, East Lansing, MI 48824, USA
- Corresponding authors at: Division of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, Shanxi, PR China.
|
11
|
Famitha S, Moorthi M. Intelligent and novel multi-type cancer prediction model using optimized ensemble learning. Comput Methods Biomech Biomed Engin 2022; 25:1879-1903. [PMID: 35695463 DOI: 10.1080/10255842.2022.2081504] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/03/2022]
Abstract
Cancer is a highly severe disease that can become incurable even when treatment starts at the time of diagnosis, owing to the occurrence of cancer cells. Diverse machine learning approaches have been implemented for predicting cancer recurrence, and they need to be evaluated to identify the appropriate approach for cancer prediction. This paper presents an intelligent optimized ensemble learning model for predicting multiple types of cancer. First, data on the different cancer types are collected and cleansed. Then, feature extraction is performed using statistical features, Linear Discriminant Analysis (LDA), and Principal Component Analysis (PCA). From these features, a new Adaptive Condition Searched-Harris hawks Whale Optimization (ACS-HWO) algorithm selects the optimal features and transforms them into weighted features with a meta-heuristic update. Prediction is carried out by Optimized Ensemble-based Multi-disease Detection (OEMD), which combines Support Vector Machine (SVM), Autoencoder, AdaBoost, Deep Neural Network (DNN), and Recurrent Neural Network (RNN) classifiers under a high-ranking strategy. The same ACS-HWO is used to improve both the weighted feature selection and the optimized ensemble learning. A comparative analysis against existing models shows that the suggested method can be highly applicable in healthcare systems for ensuring consistent prediction across multiple cancer types.
Affiliation(s)
- S Famitha
- Associate Professor, Computer Science and Engineering, Prathyusha Engineering College, Anna University, Tiruvallur, India
- M Moorthi
- Professor & HOD, BME & Medical Electronics, Saveetha Engineering College, Anna University, Chennai, India
|
12
|
Yan K, Lv H, Guo Y, Chen Y, Wu H, Liu B. TPpred-ATMV: therapeutic peptide prediction by adaptive multi-view tensor learning model. Bioinformatics 2022; 38:2712-2718. [PMID: 35561206 DOI: 10.1093/bioinformatics/btac200] [Citation(s) in RCA: 33] [Impact Index Per Article: 11.0] [Received: 01/15/2022] [Revised: 03/17/2022] [Accepted: 04/06/2022] [Indexed: 11/12/2022] Open
Abstract
MOTIVATION: Therapeutic peptide prediction is important for the discovery of efficient therapeutic peptides and for drug development. Researchers have developed several computational methods to identify different therapeutic peptide types. However, these methods focus on identifying specific types of therapeutic peptides and fail to predict the full range of therapeutic peptide types. Moreover, it remains challenging to utilize different properties to predict therapeutic peptides. RESULTS: In this study, an adaptive multi-view tensor learning framework, TPpred-ATMV, is proposed for predicting different types of therapeutic peptides. TPpred-ATMV constructs class and probability information based on various sequence features. We constructed a latent subspace among the multi-view features and an auto-weighted multi-view tensor learning model to utilize the high correlation among the multi-view features. Experimental results showed that TPpred-ATMV is better than or highly comparable with other state-of-the-art methods for predicting eight types of therapeutic peptides. AVAILABILITY AND IMPLEMENTATION: The code of TPpred-ATMV is available at https://github.com/cokeyk/TPpred-ATMV. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Ke Yan
- School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China
- Hongwu Lv
- School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China
- Yichen Guo
- School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China
- Yongyong Chen
- Bio-Computing Research Center, Harbin Institute of Technology, Shenzhen 518055, China
- Hao Wu
- School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China
- Bin Liu
- School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China
- Advanced Research Institute of Multidisciplinary Science, Beijing Institute of Technology, Beijing 100081, China
|
13
|
Zhang G, Peng Z, Yan C, Wang J, Luo J, Luo H. MultiGATAE: A Novel Cancer Subtype Identification Method Based on Multi-Omics and Attention Mechanism. Front Genet 2022; 13:855629. [PMID: 35391797 PMCID: PMC8979770 DOI: 10.3389/fgene.2022.855629] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 01/15/2022] [Accepted: 02/14/2022] [Indexed: 11/13/2022] Open
Abstract
Cancer is one of the leading causes of death worldwide, which creates an urgent need for effective treatment. However, cancer is highly heterogeneous, meaning that one cancer can be divided into several subtypes with distinct pathogenesis and outcomes. This heterogeneity is considered the main problem limiting the precision treatment of cancer. Thus, cancer subtype identification is of great importance for cancer diagnosis and treatment. In this work, we propose a deep learning method based on multi-omics data and an attention mechanism to effectively identify cancer subtypes. We first use similarity network fusion to integrate multi-omics data and construct a similarity graph. Then, the similarity graph and the patient feature matrix are input into a graph autoencoder, composed of a graph attention network and an omics-level attention mechanism, to learn an embedding representation. The K-means clustering method is applied to the embedding representation to identify cancer subtypes. Experiments on eight TCGA datasets confirmed that our proposed method performs better for cancer subtype identification than other state-of-the-art methods. The source code of our method is available at https://github.com/kataomoi7/multiGATAE.
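To make the graph attention step in this abstract concrete, the following Python sketch runs the forward pass of a single, single-head graph attention layer over a small patient similarity graph. It is a generic illustration with random, untrained weights, not the MultiGATAE code: the decoder, the omics-level attention, and training are omitted, and the adjacency matrix, dimensions, and names are all assumptions.

```python
import numpy as np

def leaky_relu(x, slope=0.2):
    return np.where(x > 0, x, slope * x)

def gat_layer(H, A, W, a_src, a_dst):
    """Forward pass of one single-head graph attention layer.

    H: (n, f_in) patient features; A: (n, n) 0/1 adjacency including self-loops;
    W: (f_in, f_out) projection; a_src, a_dst: (f_out,) halves of the attention vector.
    """
    Z = H @ W                                            # projected features
    # e[i, j] = LeakyReLU(a_src . z_i + a_dst . z_j), i.e. a^T [z_i || z_j]
    e = leaky_relu((Z @ a_src)[:, None] + (Z @ a_dst)[None, :])
    e = np.where(A > 0, e, -np.inf)                      # attend only to graph neighbors
    e = e - e.max(axis=1, keepdims=True)                 # stabilize the softmax
    alpha = np.exp(e)
    alpha = alpha / alpha.sum(axis=1, keepdims=True)     # each row sums to 1
    return alpha @ Z, alpha                              # embeddings, attention weights

# Hypothetical 5-patient similarity graph (e.g. a thresholded fused network).
A = np.array([
    [1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
])
rng = np.random.default_rng(0)
H = rng.normal(size=(5, 4))        # toy patient feature matrix
W = rng.normal(size=(4, 2))        # untrained projection weights
a_src, a_dst = rng.normal(size=2), rng.normal(size=2)

embedding, alpha = gat_layer(H, A, W, a_src, a_dst)
```

In a full pipeline of the kind the abstract describes, the learned embedding (here `embedding`) would then be passed to K-means to assign subtypes.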
Affiliation(s)
- Ge Zhang
- School of Computer and Information Engineering, Henan University, Kaifeng, China
- Zhen Peng
- School of Computer and Information Engineering, Henan University, Kaifeng, China
- Chaokun Yan
- School of Computer and Information Engineering, Henan University, Kaifeng, China
- Jianlin Wang
- School of Computer and Information Engineering, Henan University, Kaifeng, China
- Junwei Luo
- College of Computer Science and Technology, Henan Polytechnic University, Jiaozuo, China
- Huimin Luo
- School of Computer and Information Engineering, Henan University, Kaifeng, China
|
14
|
Automatic Control of Mobile Industrial Robot Based on Multiobjective Optimization Strategy. Journal of Electrical and Computer Engineering 2022. [DOI: 10.1155/2022/7825906] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2022]
Abstract
In order to solve the optimal cascade mobile path selection problem when mobile industrial robots repair network coverage holes, a cascade mobile path selection optimization method considering the number and energy of intermediate cascade nodes is proposed. By calculating the energy availability of intermediate cascade nodes, this method further obtains the energy availability and decisive energy of each path, selects the optimal cascade mobile path from the perspective of multiobjective optimization, effectively balances the energy consumption of each mobile industrial robot, makes full use of the energy of the whole network, and prolongs the survival time of the network. Simulation results show that the optimization method has higher network energy efficiency than the standard cascaded mobile method.
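The abstract does not define its energy measures, so the Python sketch below only illustrates the general idea of choosing a cascade path by trading off the number of intermediate nodes against their energy. Everything in it is an assumption made for illustration: "decisive energy" is approximated by the lowest remaining energy along a path, the two objectives are fewer hops and higher bottleneck energy, and the final pick uses a simple weighted score. It should not be read as the paper's algorithm.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Path:
    name: str
    node_energies: tuple  # remaining energy of each intermediate cascade node (hypothetical units)

    @property
    def hops(self):
        return len(self.node_energies)

    @property
    def bottleneck_energy(self):
        # Assumed stand-in for "decisive energy": the weakest node limits the path.
        return min(self.node_energies)

def dominates(p, q):
    """p dominates q if it is no worse on both objectives and strictly better on one."""
    no_worse = p.hops <= q.hops and p.bottleneck_energy >= q.bottleneck_energy
    better = p.hops < q.hops or p.bottleneck_energy > q.bottleneck_energy
    return no_worse and better

def pareto_front(paths):
    """Keep the paths that no other candidate dominates."""
    return [p for p in paths if not any(dominates(q, p) for q in paths)]

def select_path(paths, hop_weight=1.0):
    """Pick one path from the Pareto front with an assumed weighted score."""
    return max(pareto_front(paths),
               key=lambda p: p.bottleneck_energy - hop_weight * p.hops)

candidates = [
    Path("A", (9.0, 8.0)),        # 2 hops, bottleneck 8.0
    Path("B", (9.5, 9.0, 8.5)),   # 3 hops, bottleneck 8.5
    Path("C", (7.0, 6.0, 5.0)),   # 3 hops, bottleneck 5.0 (dominated by A and B)
    Path("D", (4.0,)),            # 1 hop,  bottleneck 4.0
]

front = pareto_front(candidates)
best = select_path(candidates)
print([p.name for p in front], best.name)
```

Raising `hop_weight` favors shorter cascades, while lowering it favors paths whose weakest node has more energy left; how the paper balances these objectives is not stated in the abstract.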
|