1
|
Ibrahim TS, Saraya MS, Saleh AI, Rabie AH. An efficient graph attention framework enhances bladder cancer prediction. Sci Rep 2025; 15:11127. [PMID: 40169776 PMCID: PMC11961686 DOI: 10.1038/s41598-025-93059-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/18/2024] [Accepted: 03/04/2025] [Indexed: 04/03/2025]
Abstract
Bladder (BL) cancer is the 10th most common cancer worldwide and ranks 9th in males and 13th in females in the United States. BL cancer is a fast-growing tumor relative to other cancer forms. Given the tumor's high malignancy, rapid prediction of metastasis and accurate treatment are critical. The most significant drivers of cancer's intricate genesis are complex genetic alterations, including deoxyribonucleic acid (DNA) insertions and deletions, structural abnormalities, copy number variations (CNVs), and single nucleotide variations (SNVs). The proposed method enhances the identification of driver genes at the individual patient level by employing attention mechanisms to extract features of both coding and non-coding genes, and predicts BL cancer based on personalized driver gene (PDG) detection. The embedded vectors are propagated through three dense blocks for the binary classification of PDGs. The novel graph neural network (GNN) architecture with an attention mechanism, called Multi Stacked-Layered GAT (MSL-GAT), leverages graph attention mechanisms (GAT) to identify and predict critical driver genes associated with BL cancer progression, extracting essential features from both coding and non-coding genes, including long non-coding RNAs (lncRNAs), which are known to be crucial to the advancement of BL cancer. The approach analyzes key genetic changes (such as SNVs, CNVs, and structural abnormalities) that lead to tumorigenesis and metastasis by concentrating on PDGs. The discovery of genes crucial for the survival and proliferation of cancer cells is made possible by the model's precise classification of PDGs. MSL-GAT draws attention to certain lncRNAs and other non-coding elements that control carcinogenic pathways by utilizing the attention mechanism.
Tumor development, metastasis, and medication resistance are all facilitated by these lncRNAs, which are frequently overexpressed or dysregulated in BL cancer. To reduce the survival of cancer cells, the model's predictions can direct specific treatment approaches, such as RNA interference (RNAi), to silence or suppress the expression of these important genes. MSL-GAT is followed by three dense blocks that propagate the embedded vectors to classify PDGs, making it possible to determine which genes are more likely to cause BL cancer in a given patient. The model facilitates the identification of new treatment targets by offering a thorough understanding of the molecular landscape of BL cancer through the integration of multi-omics data, encompassing genomic, transcriptomic, and epigenomic metadata. We compared the novel approach with classical machine learning methods and other deep learning-based methods on the TCGA-BLCA benchmark, and the leave-one-out experimental results showed that MSL-GAT achieved better performance than competing methods. This approach achieves an accuracy of 97.72% and improves specificity and sensitivity. It can potentially aid physicians in the early prediction of BL cancer.
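The graph attention step that the abstract relies on can be illustrated with a minimal sketch of one generic single-head graph attention layer in plain Python. This is not the authors' MSL-GAT: the toy gene graph, the weight matrix `weights`, the attention vector `attn_vec`, and all function names are hypothetical values chosen for illustration, and the stacked layers and dense blocks described above are omitted.

```python
import math

def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x

def softmax(scores):
    # Subtract the maximum for numerical stability.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def linear(weights, x):
    # weights has one row per output dimension.
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

def gat_layer(features, neighbors, weights, attn_vec):
    """One single-head graph attention layer (generic sketch).

    features:  list of node feature vectors
    neighbors: dict mapping node index -> neighbor indices (self-loop included)
    weights:   shared linear projection applied to every node
    attn_vec:  attention vector of length 2 * out_dim
    """
    h = [linear(weights, x) for x in features]
    out, attention = [], []
    for i in range(len(features)):
        nbrs = neighbors[i]
        # Score each neighbor from the concatenated projected features.
        scores = [
            leaky_relu(sum(a * v for a, v in zip(attn_vec, h[i] + h[j])))
            for j in nbrs
        ]
        alpha = softmax(scores)
        attention.append(dict(zip(nbrs, alpha)))
        # New embedding: attention-weighted sum of neighbor projections.
        out.append([
            sum(al * h[j][k] for al, j in zip(alpha, nbrs))
            for k in range(len(h[i]))
        ])
    return out, attention

# Hypothetical toy graph of three genes with two features each.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
neighbors = {0: [0, 1], 1: [0, 1, 2], 2: [1, 2]}
weights = [[0.5, -0.2], [0.1, 0.3]]
attn_vec = [0.4, -0.1, 0.2, 0.3]

embeddings, attention = gat_layer(features, neighbors, weights, attn_vec)
```

Each entry of `attention` maps a node's neighbors to weights that sum to one, which is the quantity an attention-based model can inspect to see which neighboring genes contribute most to a node's embedding.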
Affiliation(s)
- Taghreed S Ibrahim: Computers and Control Dept., Faculty of Engineering, Mansoura University, Mansoura, Egypt
- M S Saraya: Computers and Control Dept., Faculty of Engineering, Mansoura University, Mansoura, Egypt
- Ahmed I Saleh: Computers and Control Dept., Faculty of Engineering, Mansoura University, Mansoura, Egypt
- Asmaa H Rabie: Computers and Control Dept., Faculty of Engineering, Mansoura University, Mansoura, Egypt
|
2
|
Li B, Nabavi S. A multimodal graph neural network framework for cancer molecular subtype classification. BMC Bioinformatics 2024; 25:27. [PMID: 38225583 PMCID: PMC10789042 DOI: 10.1186/s12859-023-05622-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/04/2023] [Accepted: 12/15/2023] [Indexed: 01/17/2024]
Abstract
BACKGROUND The recent development of high-throughput sequencing has created a large collection of multi-omics data, which enables researchers to better investigate cancer molecular profiles and cancer taxonomy based on molecular subtypes. Integrating multi-omics data has been proven to be effective for building more precise classification models. Most current multi-omics integrative models use either early fusion in the form of concatenation or late fusion with a separate feature extractor for each omic, and are mainly based on deep neural networks. Due to the nature of biological systems, graphs are a better structural representation of biomedical data. Although a few graph neural network (GNN) based multi-omics integrative methods have been proposed, they suffer from three common disadvantages. First, most of them use only one type of connection, either inter-omics or intra-omic; second, they consider only one kind of GNN layer, either graph convolution network (GCN) or graph attention network (GAT); and third, most of these methods have not been tested on a more complex classification task, such as cancer molecular subtypes. RESULTS In this study, we propose a novel end-to-end multi-omics GNN framework for accurate and robust cancer subtype classification. The proposed model utilizes multi-omics data in the form of heterogeneous multi-layer graphs, which combine both inter-omics and intra-omic connections from established biological knowledge. The proposed model incorporates learned graph features and global genome features for accurate classification. We tested the proposed model on the Cancer Genome Atlas (TCGA) Pan-cancer dataset and TCGA breast invasive carcinoma (BRCA) dataset for molecular subtype and cancer subtype classification, respectively. The proposed model shows superior performance compared to four current state-of-the-art baseline models in terms of accuracy, F1 score, precision, and recall.
The comparative analysis of GAT-based models and GCN-based models reveals that GAT-based models are preferred for smaller graphs with less information and GCN-based models are preferred for larger graphs with extra information.
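The distinction between early and late fusion mentioned in the background can be made concrete with a small sketch. The omic names (`mrna`, `mirna`), the toy values, and the hand-written `summarize` extractor are assumptions for illustration only; in practice the per-omic extractors would be learned networks, and this is not the framework proposed in the paper.

```python
def early_fusion(omics):
    """Concatenate the raw per-omic feature vectors into one input vector."""
    fused = []
    for name in sorted(omics):
        fused.extend(omics[name])
    return fused

def late_fusion(omics, extractors):
    """Apply a separate extractor to each omic, then concatenate the results."""
    fused = []
    for name in sorted(omics):
        fused.extend(extractors[name](omics[name]))
    return fused

def summarize(values):
    # Hypothetical stand-in for a learned feature extractor.
    return [sum(values) / len(values), max(values)]

# Hypothetical sample with two omic layers of different widths.
sample = {"mrna": [0.5, 1.5, 2.0], "mirna": [0.1, 0.3]}
extractors = {"mrna": summarize, "mirna": summarize}

early = early_fusion(sample)            # one vector of 5 raw features
late = late_fusion(sample, extractors)  # one vector of 4 extracted features
```

Early fusion hands a single classifier every raw feature at once, while late fusion lets each omic be reduced independently before the results are combined.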
Affiliation(s)
- Bingjun Li: Department of Computer Science and Engineering, University of Connecticut, Storrs, USA
- Sheida Nabavi: Department of Computer Science and Engineering, University of Connecticut, Storrs, USA
|
3
|
Xu P, Joshi CK, Bresson X. Multigraph Transformer for Free-Hand Sketch Recognition. IEEE Transactions on Neural Networks and Learning Systems 2022; 33:5150-5161. [PMID: 33826519 DOI: 10.1109/tnnls.2021.3069230] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Indexed: 06/12/2023]
Abstract
Learning meaningful representations of free-hand sketches remains a challenging task given the signal sparsity and the high-level abstraction of sketches. Existing techniques have focused on exploiting either the static nature of sketches with convolutional neural networks (CNNs) or the temporal sequential property with recurrent neural networks (RNNs). In this work, we propose a new representation of sketches as multiple sparsely connected graphs. We design a novel graph neural network (GNN), the multigraph transformer (MGT), for learning representations of sketches from multiple graphs, which simultaneously capture global and local geometric stroke structures as well as temporal information. We report extensive numerical experiments on a sketch recognition task to demonstrate the performance of the proposed approach. In particular, MGT applied to 414k sketches from Google QuickDraw: 1) achieves a small recognition gap to the CNN-based performance upper bound (72.80% versus 74.22%) and infers faster than the CNN competitors and 2) outperforms all RNN-based models by a significant margin. To the best of our knowledge, this is the first work proposing to represent sketches as graphs and apply GNNs for sketch recognition. Code and trained models are available at https://github.com/PengBoXiangShang/multigraph_transformer.
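One way to picture a sketch represented as multiple sparsely connected graphs is to build separate edge sets over the same sequence of pen points. The construction below is an assumption made for illustration (a temporal neighborhood graph plus a same-stroke graph); the abstract does not specify how the graphs are defined, and the function names and toy stroke labels are hypothetical.

```python
def temporal_edges(num_points, k):
    """Connect each point to points at most k steps away in drawing order."""
    edges = set()
    for i in range(num_points):
        for j in range(max(0, i - k), min(num_points, i + k + 1)):
            if i != j:
                edges.add((i, j))
    return edges

def same_stroke_edges(stroke_ids):
    """Connect every pair of distinct points that belong to the same stroke."""
    edges = set()
    for i, si in enumerate(stroke_ids):
        for j, sj in enumerate(stroke_ids):
            if i != j and si == sj:
                edges.add((i, j))
    return edges

# Hypothetical sketch: five pen points, the first three drawn in one stroke.
stroke_ids = [0, 0, 0, 1, 1]

local_graph = temporal_edges(len(stroke_ids), k=1)
stroke_graph = same_stroke_edges(stroke_ids)
```

The two edge sets share nodes but encode different structure: `local_graph` follows drawing order across stroke boundaries, while `stroke_graph` only links points within a stroke.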
|
4
|
Chen Y, Hu Y, Li K, Yeo CK, Li K. Approximate personalized propagation for unsupervised embedding in heterogeneous graphs. Inf Sci (N Y) 2022. [DOI: 10.1016/j.ins.2022.04.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/05/2022]
|
5
|
Abstract
In industrial production, accidents caused by the unsafe behavior of operators often bring serious economic losses. Therefore, how to use artificial intelligence technology to monitor the unsafe behavior of operators in a production area in real time has become a research topic of great concern. Based on the YOLOv5 framework, this paper proposes an improved YOLO network to detect unsafe behaviors such as not wearing safety helmets and smoking in industrial places. First, the proposed network uses a novel adaptive self-attention embedding (ASAE) model to improve the backbone network and reduce the loss of context information in the high-level feature map by reducing the number of feature channels. Second, a new weighted feature pyramid network (WFPN) module is used to replace the original enhanced feature-extraction network PANet to alleviate the loss of feature information caused by too many network layers. Finally, the experimental results on the self-constructed behavior dataset show that the proposed framework has higher detection accuracy than traditional methods. The average detection accuracy of smoking increased by 3.3%, and the average detection accuracy of not wearing a helmet increased by 3.1%.
|
6
|
ADSF: Node Classification via Adaptive Structural Fingerprints. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2022.05.073] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
|
7
|
Song H, Dai Z, Xu P, Ren L. Interactive Visual Pattern Search on Graph Data via Graph Representation Learning. IEEE Transactions on Visualization and Computer Graphics 2022; 28:335-345. [PMID: 34587078 DOI: 10.1109/tvcg.2021.3114857] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/13/2023]
Abstract
Graphs are a ubiquitous data structure to model processes and relations in a wide range of domains. Examples include control-flow graphs in programs and semantic scene graphs in images. Identifying subgraph patterns in graphs is an important approach to understand their structural properties. We propose a visual analytics system GraphQ to support human-in-the-loop, example-based, subgraph pattern search in a database containing many individual graphs. To support fast, interactive queries, we use graph neural networks (GNNs) to encode a graph as fixed-length latent vector representation, and perform subgraph matching in the latent space. Due to the complexity of the problem, it is still difficult to obtain accurate one-to-one node correspondences in the matching results that are crucial for visualization and interpretation. We, therefore, propose a novel GNN for node-alignment called NeuroAlign, to facilitate easy validation and interpretation of the query results. GraphQ provides a visual query interface with a query editor and a multi-scale visualization of the results, as well as a user feedback mechanism for refining the results with additional constraints. We demonstrate GraphQ through two example usage scenarios: analyzing reusable subroutines in program workflows and semantic scene graph search in images. Quantitative experiments show that NeuroAlign achieves 19%-29% improvement in node-alignment accuracy compared to baseline GNN and provides up to 100× speedup compared to combinatorial algorithms. Our qualitative study with domain experts confirms the effectiveness for both usage scenarios.
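The idea of matching in latent space, once each graph is encoded as a fixed-length vector, can be sketched as a nearest-neighbor ranking. Cosine similarity is used here as an assumed stand-in for the learned matching criterion, and the hand-written vectors replace the output of a GNN encoder; the identifiers `g1` to `g3` and the function names are hypothetical and are not part of GraphQ.

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def rank_matches(query_vec, database, top_k=2):
    """Return the ids of the top_k database graphs most similar to the query."""
    scored = [(cosine(query_vec, vec), gid) for gid, vec in database.items()]
    scored.sort(reverse=True)
    return [gid for _, gid in scored[:top_k]]

# Hypothetical latent vectors standing in for GNN-encoded graphs.
database = {
    "g1": [1.0, 0.0, 0.0],
    "g2": [0.9, 0.1, 0.0],
    "g3": [0.0, 1.0, 0.0],
}
query = [1.0, 0.0, 0.0]

matches = rank_matches(query, database)
```

Because every comparison is a vector operation on precomputed embeddings, a query avoids running a combinatorial subgraph search against each stored graph, which is what makes interactive response times plausible.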
|
8
|
Xiang Y, Du J, Fujimoto K, Li F, Schneider J, Tao C. Application of artificial intelligence and machine learning for HIV prevention interventions. Lancet HIV 2022; 9:e54-e62. [PMID: 34762838 PMCID: PMC9840899 DOI: 10.1016/s2352-3018(21)00247-2] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 03/06/2021] [Revised: 08/11/2021] [Accepted: 09/02/2021] [Indexed: 01/17/2023]
Abstract
In 2019, the US Government announced its goal to end the HIV epidemic within 10 years, mirroring the initiatives set forth by UNAIDS. Public health prevention interventions are a crucial part of this ambitious goal. However, numerous challenges to this goal exist, including improving HIV awareness, increasing early HIV infection detection, ensuring rapid treatment, optimising resource distribution, and providing efficient prevention services for vulnerable populations. Artificial intelligence has had a pivotal role in revolutionising health care and has shown great potential in developing effective HIV prevention intervention strategies. Although artificial intelligence has been used in a few HIV prevention intervention areas, there are challenges to address and opportunities to explore.
Affiliation(s)
- Yang Xiang: School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, USA
- Jingcheng Du: School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, USA
- Kayo Fujimoto: Department of Health Promotion and Behavioral Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, USA
- Fang Li: School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, USA
- John Schneider: The Chicago Center for HIV Elimination and Department of Medicine and Department of Public Health Sciences, University of Chicago, Chicago, IL, USA
- Cui Tao: School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, USA
|
9
|
Xiang Y, Fujimoto K, Li F, Wang Q, Del Vecchio N, Schneider J, Zhi D, Tao C. Identifying influential neighbors in social networks and venue affiliations among young MSM: a data science approach to predict HIV infection. AIDS 2021; 35:S65-S73. [PMID: 33306549 PMCID: PMC8058230 DOI: 10.1097/qad.0000000000002784] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 02/04/2023]
Abstract
OBJECTIVE Young MSM (YMSM) bear a disproportionate burden of HIV infection in the United States, and their risks of acquiring HIV may be shaped by complex multilayer social networks. These networks are formed through not only direct contact with social/sex partners but also indirect anonymous contacts encountered when attending social venues. We introduced a new application of a state-of-the-art graph-based deep learning method to predict HIV infection that can identify influential neighbors within these multiple network contexts. DESIGN AND METHODS We used empirical network data among YMSM aged 16-29 years collected from Houston and Chicago in the United States between 2014 and 2016. A computational framework GAT-HIV (Graph Attention Networks for HIV) was proposed to predict HIV infections by identifying influential neighbors within social networks. These networks were formed from multiple relations, comprising social/sex partners and shared venue attendance, together with individual-level variables. Further, GAT-HIV was extended to combine multiple social networks using multigraph GAT methods. A visualization tool was also developed to highlight influential network members for each individual within the multiple social networks. RESULTS The multigraph GAT-HIV models obtained average AUC values of 0.776 and 0.824 for Chicago and Houston, respectively, performing better than empirical predictive models (e.g. AUCs of random forest: 0.758 and 0.798). GAT-HIV on single networks also delivered promising prediction performances. CONCLUSION The proposed methods provide a comprehensive and interpretable framework for graph-based modeling that may inform effective HIV prevention intervention strategies among populations most vulnerable to HIV.
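The AUC values reported in the results can be read as the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case. The sketch below computes that quantity by direct pairwise comparison; the labels and scores are made-up toy values, not data from the study, and the function name is hypothetical.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count as half)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Hypothetical predictions for five individuals (1 = positive outcome).
labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.4, 0.35, 0.2, 0.1]

value = auc(labels, scores)  # 5 of 6 positive-negative pairs are ordered correctly
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives a scale for interpreting differences such as 0.776 versus 0.758.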
Affiliation(s)
- Yang Xiang: School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, Texas, USA
- Kayo Fujimoto: Department of Health Promotion & Behavioral Sciences, School of Public Health, University of Texas Health Science Center at Houston, Houston, Texas, USA
- Fang Li: School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, Texas, USA
- Qing Wang: School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, Texas, USA
- Natascha Del Vecchio: Chicago Center for HIV Elimination, University of Chicago, Chicago, Illinois, USA
- John Schneider: Chicago Center for HIV Elimination, University of Chicago, Chicago, Illinois, USA; Departments of Medicine and Public Health Sciences, University of Chicago, Chicago, Illinois, USA
- Degui Zhi: School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, Texas, USA
- Cui Tao: School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, Texas, USA
|