1
Chen X, Cai R, Huang Z, Li Z, Zheng J, Wu M. Interpretable high-order knowledge graph neural network for predicting synthetic lethality in human cancers. Brief Bioinform 2025; 26:bbaf142. [PMID: 40194555 PMCID: PMC11975366 DOI: 10.1093/bib/bbaf142] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2024] [Revised: 02/21/2025] [Accepted: 03/07/2025] [Indexed: 04/09/2025] [Open Access]
Abstract
Synthetic lethality (SL) is a promising gene interaction for cancer therapy. Recent SL prediction methods integrate knowledge graphs (KGs) into graph neural networks (GNNs) and employ attention mechanisms to extract local subgraphs as explanations for target gene pairs. However, attention mechanisms often lack fidelity, typically generate a single explanation per gene pair, and fail to ensure trustworthy high-order structures in their explanations. To overcome these limitations, we propose Diverse Graph Information Bottleneck for Synthetic Lethality (DGIB4SL), a KG-based GNN that generates multiple faithful explanations for the same gene pair and effectively encodes high-order structures. Specifically, we introduce a novel DGIB objective, integrating a determinantal point process constraint into the standard information bottleneck objective, and employ 13 motif-based adjacency matrices to capture high-order structures in gene representations. Experimental results show that DGIB4SL outperforms state-of-the-art baselines and provides multiple explanations for SL prediction, revealing diverse biological mechanisms underlying SL inference.
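As a rough illustration of how a determinantal point process (DPP) style term can reward diversity among several explanations, the sketch below scores a set of embedding vectors by the log-determinant of their cosine-similarity kernel. The function name, the toy vectors, and the jitter value are assumptions made for this example; it is not the DGIB objective from the paper, only the general idea that near-duplicate items shrink the determinant.

```python
import numpy as np

def dpp_diversity(embeddings, jitter=1e-6):
    """Log-determinant of the cosine-similarity kernel of the rows.

    Higher values mean the rows (hypothetical explanation embeddings)
    point in more different directions. Rows must be non-zero.
    """
    x = np.asarray(embeddings, dtype=float)
    x = x / np.linalg.norm(x, axis=1, keepdims=True)   # unit-length rows
    kernel = x @ x.T + jitter * np.eye(len(x))          # jitter keeps it invertible
    _, logdet = np.linalg.slogdet(kernel)
    return logdet

# Hypothetical embeddings: three orthogonal vectors vs. three near-duplicates.
diverse = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
redundant = [[1.0, 0.0, 0.0], [0.99, 0.1, 0.0], [0.98, 0.0, 0.2]]
print(dpp_diversity(diverse) > dpp_diversity(redundant))
```

A term of this kind could be added to a training loss so that a model producing several explanations is penalized when they collapse onto one another.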
Affiliation(s)
- Xuexin Chen
  - School of Computer Science, Guangdong University of Technology, No. 100 Waihuan Xi Road, Panyu, Guangdong, Guangzhou, 510006, China
- Ruichu Cai
  - School of Computer Science, Guangdong University of Technology, No. 100 Waihuan Xi Road, Panyu, Guangdong, Guangzhou, 510006, China
  - Pazhou Laboratory (Huangpu), No. 248 Pazhou Qiaotou Street, Haizhu, Guangdong Province, Guangzhou, 510335, China
- Zhengting Huang
  - School of Computer Science, Guangdong University of Technology, No. 100 Waihuan Xi Road, Panyu, Guangdong, Guangzhou, 510006, China
- Zijian Li
  - Machine Learning Department, Mohamed bin Zayed University of Artificial Intelligence, Masdar, Abu Dhabi, United Arab Emirates
- Jie Zheng
  - School of Information Science and Technology, ShanghaiTech University, No. 393 Huaxia Middle Road, Pudong, Shanghai, 201210, China
  - School of Information Science and Technology, Shanghai Engineering Research Center of Intelligent Vision and Imaging, ShanghaiTech University, No. 393 Huaxia Middle Road, Pudong, Shanghai, 201210, China
- Min Wu
  - Institute for Infocomm Research (I²R), A*STAR, No. 2 Fusionopolis Way, Queenstown Planning, Singapore 138632, Singapore
2
Li J, Lu X, Jiang K, Tang D, Ning B, Sun F. TARSL: Triple-Attention Cross-Network Representation Learning to Predict Synthetic Lethality for Anti-Cancer Drug Discovery. IEEE J Biomed Health Inform 2025; 29:1680-1691. [PMID: 37603479 DOI: 10.1109/jbhi.2023.3306768] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/23/2023]
Abstract
Cancer is a multifaceted disease that results from co-mutations of multiple biological molecules. A promising strategy for cancer therapy involves exploiting the phenomenon of synthetic lethality (SL) by targeting the SL partner of a cancer gene. Since traditional methods for SL prediction are costly, time-consuming, and prone to off-target effects, computational approaches have become an efficient complement to them. Most existing approaches treat SL associations as independent of other biological interaction networks and fail to consider information from those networks. Although some approaches have integrated different networks to capture multi-modal features of genes for SL prediction, these methods implicitly assume that all sources and levels of information contribute equally to the SL associations. As such, a comprehensive and flexible framework for learning gene cross-network representations for SL prediction is still lacking. In this work, we present a novel Triple-Attention cross-network Representation learning method for SL prediction (TARSL) that captures molecular features from heterogeneous sources. We employ three-level attention modules to account for the different contributions of multi-level information. In particular, feature-level attention captures the correlations between molecular features and network links, node-level attention differentiates the importance of various neighbors, and network-level attention concentrates on important networks and reduces the effects of irrelevant networks. Comprehensive experiments on human SL datasets show that our model is consistently superior to baseline methods and that the predicted SL associations could aid in designing anti-cancer drugs.
3
Jiang Y, Wang J, Zhang Y, Cao Z, Zhang Q, Su J, He S, Bo X. Graph based recurrent network for context specific synthetic lethality prediction. Sci China Life Sci 2025; 68:527-540. [PMID: 39422810 DOI: 10.1007/s11427-023-2618-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/23/2024] [Accepted: 05/10/2024] [Indexed: 10/19/2024]
Abstract
The concept of synthetic lethality (SL) has been successfully used for targeted therapies. To further explore SL for cancer therapy, identifying more SL interactions with therapeutic potential is essential. Recently, graph neural network-based deep learning methods have been proposed for SL prediction, which reduce the SL search space of wet-lab based methods. However, these methods ignore that most SL interactions depend strongly on genetic context, which limits the application of the predicted results. In this study, we proposed a graph recurrent network-based model for context-dependent SL prediction (SLGRN). In particular, we introduced a Graph Recurrent Network-based encoder to acquire a context-specific, low-dimensional feature representation for each node, facilitating the prediction of novel SL. SLGRN leveraged the gated recurrent unit (GRU) and incorporated a context-dependent-level state to effectively integrate information from all nodes. As a result, SLGRN outperforms state-of-the-art models for SL prediction. We subsequently validate novel SL interactions under different contexts based on combination therapy or patient survival analysis. Through in vitro experiments and retrospective clinical analysis, we emphasize the potential clinical significance of this context-specific SL prediction model.
Affiliation(s)
- Yuyang Jiang
  - Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, 300072, China
  - Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing, 100850, China
- Jing Wang
  - School of Medicine, Tsinghua University, Beijing, 100084, China
- Yixin Zhang
  - Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing, 100850, China
- ZhiWei Cao
  - School of Informatics, Xiamen University, Xiamen, 361005, China
- Qinglong Zhang
  - Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing, 100850, China
- Jinsong Su
  - School of Informatics, Xiamen University, Xiamen, 361005, China
- Song He
  - Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing, 100850, China
- Xiaochen Bo
  - Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing, 100850, China
4
Zhang X, Liu Q. A graph neural network approach for hierarchical mapping of breast cancer protein communities. BMC Bioinformatics 2025; 26:23. [PMID: 39838298 PMCID: PMC11749236 DOI: 10.1186/s12859-024-06015-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/26/2024] [Accepted: 12/16/2024] [Indexed: 01/23/2025] [Open Access]
Abstract
BACKGROUND Comprehensively mapping the hierarchical structure of breast cancer protein communities and identifying potential biomarkers from them is a promising direction for breast cancer research. Existing approaches are subjective and fail to take information from protein sequences into consideration. Deep learning can automatically learn features from protein sequences and protein-protein interactions for hierarchical clustering. RESULTS Using a large amount of publicly available proteomics data, we created a hierarchical tree for breast cancer protein communities using a novel hierarchical graph neural network, with the supervision of gene ontology terms and assistance of a pre-trained deep contextual language model. Then, a group-lasso algorithm was applied to identify protein communities that are under both mutation burden and survival burden, undergo significant alterations when targeted by specific drug molecules, and show cancer-dependent perturbations. The resulting hierarchical map of protein communities shows how gene-level mutations and survival information converge on protein communities at different scales. Internal validity of the model was established through the convergence on BRCA2 as a breast cancer hotspot. Further overlaps with breast cancer cell dependencies revealed SUPT6H and RAD21, along with their respective protein systems, HOST:37 and HOST:861, as potential biomarkers. Using gene-level perturbation data of the HOST:37 and HOST:861 gene sets, three FDA-approved drugs with high therapeutic value were selected as potential treatments to be further evaluated. These drugs include mercaptopurine, pioglitazone, and colchicine. CONCLUSION The proposed graph neural network approach to analyzing breast cancer protein communities in a hierarchical structure provides a novel perspective on breast cancer prognosis and treatment. By targeting entire gene sets, we were able to evaluate the prognostic and therapeutic value of genes (or gene sets) at different levels, from gene-level to system-level biology. Cancer-specific gene dependencies provide additional context for pinpointing cancer-related systems and drug-induced alterations can highlight potential therapeutic targets. These identified protein communities, in conjunction with other protein communities under strong mutation and survival burdens, can potentially be used as clinical biomarkers for breast cancer.
Affiliation(s)
- Xiao Zhang
  - Department of Applied Computer Science, University of Winnipeg, Winnipeg, MB, R3B 2E9, Canada
  - Department of Biochemistry and Medical Genetics, University of Manitoba, Winnipeg, MB, R3E 0W2, Canada
- Qian Liu
  - Department of Applied Computer Science, University of Winnipeg, Winnipeg, MB, R3B 2E9, Canada
  - Department of Biochemistry and Medical Genetics, University of Manitoba, Winnipeg, MB, R3E 0W2, Canada
5
Wang J, Wen Y, Zhang Y, Wang Z, Jiang Y, Dai C, Wu L, Leng D, He S, Bo X. An interpretable artificial intelligence framework for designing synthetic lethality-based anti-cancer combination therapies. J Adv Res 2024; 65:329-343. [PMID: 38043609 PMCID: PMC11519055 DOI: 10.1016/j.jare.2023.11.035] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/05/2023] [Revised: 11/27/2023] [Accepted: 11/29/2023] [Indexed: 12/05/2023] [Open Access]
Abstract
INTRODUCTION Synthetic lethality (SL) provides an opportunity to leverage different genetic interactions when designing synergistic combination therapies. To further explore SL-based combination therapies for cancer treatment, it is important to identify and mechanistically characterize more SL interactions. Artificial intelligence (AI) methods have recently been proposed for SL prediction, but the results of these models are often not interpretable such that deriving the underlying mechanism can be challenging. OBJECTIVES This study aims to develop an interpretable AI framework for SL prediction and subsequently utilize it to design SL-based synergistic combination therapies. METHODS We propose a knowledge and data dual-driven AI framework for SL prediction (KDDSL). Specifically, we use gene knowledge related to the SL mechanism to guide the construction of the model and develop a method to identify the most relevant gene knowledge for the predicted results. RESULTS Experimental and literature-based validation confirmed a good balance between predictive and interpretable ability when using KDDSL. Moreover, we demonstrated that KDDSL could help to discover promising drug combinations and clarify associated biological processes, such as the combination of MDM2 and CDK9 inhibitors, which exhibited significant anti-cancer effects in vitro and in vivo. CONCLUSION These data underscore the potential of KDDSL to guide SL-based combination therapy design. There is a need for biomedicine-focused AI strategies to combine rational biological knowledge with developed models.
Affiliation(s)
- Jing Wang
  - School of Medicine, Tsinghua University, Beijing, 100084, China
- Yuqi Wen
  - Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing, 100850, China
- Yixin Zhang
  - Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing, 100850, China
- Zhongming Wang
  - Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, 300072, China
- Yuyang Jiang
  - Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, 300072, China
- Chong Dai
  - College of Life Science and Technology, Beijing University of Chemical Technology, Beijing, 100029, China
- Lianlian Wu
  - Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, 300072, China
- Dongjin Leng
  - Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing, 100850, China
- Song He
  - Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing, 100850, China
- Xiaochen Bo
  - Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing, 100850, China
6
Feng Y, Long Y, Wang H, Ouyang Y, Li Q, Wu M, Zheng J. Benchmarking machine learning methods for synthetic lethality prediction in cancer. Nat Commun 2024; 15:9058. [PMID: 39428397 PMCID: PMC11491473 DOI: 10.1038/s41467-024-52900-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/27/2023] [Accepted: 09/23/2024] [Indexed: 10/22/2024] [Open Access]
Abstract
Synthetic lethality (SL) is a gold mine of anticancer drug targets, exposing cancer-specific dependencies of cellular survival. To complement resource-intensive experimental screening, many machine learning methods for SL prediction have emerged recently. However, a comprehensive benchmarking is lacking. This study systematically benchmarks 12 recent machine learning methods for SL prediction, assessing their performance across diverse data splitting scenarios, negative sample ratios, and negative sampling techniques, on both classification and ranking tasks. We observe that all the methods can perform significantly better by improving data quality, e.g., excluding computationally derived SLs from training and sampling negative labels based on gene expression. Among the methods, SLMGAE performs the best. Furthermore, the methods have limitations in realistic scenarios such as cold-start independent tests and context-specific SLs. These results, together with source code and datasets made freely available, provide guidance for selecting suitable methods and developing more powerful techniques for SL virtual screening.
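The negative sample ratio mentioned above can be made concrete with a small sketch: given known SL pairs, draw a fixed multiple of gene pairs that are not known to be SL and treat them as negatives. The function name, gene labels, and uniform random sampling are assumptions for illustration only; the benchmark also considers other strategies (such as expression-based sampling) that are not shown here.

```python
import random
from itertools import combinations

def sample_negatives(genes, positives, ratio=1, seed=0):
    """Uniformly sample `ratio` unlabeled gene pairs per known SL pair."""
    known = {frozenset(pair) for pair in positives}
    # Every unordered gene pair that is not a known SL pair is a candidate.
    candidates = [pair for pair in combinations(sorted(genes), 2)
                  if frozenset(pair) not in known]
    rng = random.Random(seed)
    count = min(len(candidates), ratio * len(known))
    return rng.sample(candidates, count)

# Placeholder gene labels, not real gene symbols.
genes = ["G1", "G2", "G3", "G4", "G5", "G6"]
positives = [("G1", "G2"), ("G3", "G4")]
negatives = sample_negatives(genes, positives, ratio=3)
print(len(negatives))
```

Unlabeled pairs are not guaranteed to be true negatives, which is one reason the choice of sampling strategy can change reported performance.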
Affiliation(s)
- Yimiao Feng
  - School of Information Science and Technology, ShanghaiTech University, Shanghai, China
  - Lingang Laboratory, Shanghai, China
- Yahui Long
  - Bioinformatics Institute (BII), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
- He Wang
  - School of Information Science and Technology, ShanghaiTech University, Shanghai, China
- Yang Ouyang
  - School of Information Science and Technology, ShanghaiTech University, Shanghai, China
- Quan Li
  - School of Information Science and Technology, ShanghaiTech University, Shanghai, China
- Min Wu
  - Institute for Infocomm Research, Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
- Jie Zheng
  - School of Information Science and Technology, ShanghaiTech University, Shanghai, China
  - Shanghai Engineering Research Center of Intelligent Vision and Imaging, Shanghai, China
7
Hu X, Yi H, Cheng H, Zhao Y, Zhang D, Li J, Ruan J, Zhang J, Lu X. Multiple Heterogeneous Networks Representation With Latent Space for Synthetic Lethality Prediction. IEEE Trans Nanobioscience 2024; 23:564-571. [PMID: 39150817 DOI: 10.1109/tnb.2024.3444922] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/18/2024]
Abstract
Computational synthetic lethality (SL) prediction has become a promising strategy to identify SL gene pairs for targeted cancer therapy and cancer medicine development. Feature representation for integrating various biological networks is crucial to improving identification performance. However, previous feature representation methods, such as matrix factorization and graph neural networks, project gene features onto latent variables while keeping a specific geometric metric. Models of the gene representational latent space that consider correlations across multiple dimensionalities and preserve latent geometric structures in both sample and feature spaces are lacking. Therefore, we propose a novel method to model the gene Latent Space using matrix Tri-Factorization (LSTF) to obtain gene representations with embedding variables resulting from the potential interpretation of synthetic lethality. Meanwhile, manifold subspace regularization is applied to the tri-factorization to capture the geometrical manifold structure in the latent space with gene PPI functional and GO semantic embeddings. Then, SL gene pairs are identified by reconstructing the associations from the gene representations in the latent space. The experimental results illustrate that LSTF is superior to other state-of-the-art methods. A case study demonstrates the effectiveness of the predicted SL associations.
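To make the idea of reconstructing associations from a tri-factorized latent space more tangible, here is a minimal sketch that factors a symmetric association matrix as F S Fᵀ and reads candidate pairs off the reconstruction. It uses a truncated eigendecomposition as a stand-in; this is an assumption chosen for simplicity and is not the regularized LSTF optimization described in the abstract. The toy matrix and the function name are likewise hypothetical.

```python
import numpy as np

def tri_factorize(assoc, rank):
    """Return F (n x rank) and S (rank x rank) with assoc ~= F @ S @ F.T."""
    a = np.asarray(assoc, dtype=float)
    eigvals, eigvecs = np.linalg.eigh(a)      # eigenvalues in ascending order
    top = np.argsort(eigvals)[-rank:]         # keep the `rank` largest ones
    return eigvecs[:, top], np.diag(eigvals[top])

# Toy symmetric association matrix: two groups of four genes with the
# diagonal set to 1 for simplicity; the pair (0, 1) inside the first
# group is deliberately left unobserved.
assoc = np.zeros((8, 8))
assoc[:4, :4] = 1.0
assoc[4:, 4:] = 1.0
assoc[0, 1] = assoc[1, 0] = 0.0

F, S = tri_factorize(assoc, rank=2)
scores = F @ S @ F.T                          # reconstructed associations
print(scores[0, 1] > scores[0, 4])
```

In this toy case the held-out within-group pair receives a clearly higher reconstructed score than a cross-group pair, which is the behaviour a latent-space model relies on when ranking unobserved pairs.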
8
Fan K, Gökbağ B, Tang S, Li S, Huang Y, Wang L, Cheng L, Li L. Synthetic lethal connectivity and graph transformer improve synthetic lethality prediction. Brief Bioinform 2024; 25:bbae425. [PMID: 39210507 PMCID: PMC11361842 DOI: 10.1093/bib/bbae425] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/29/2024] [Revised: 06/14/2024] [Accepted: 08/16/2024] [Indexed: 09/04/2024] [Open Access]
Abstract
Synthetic lethality (SL) has shown great promise for the discovery of novel targets in cancer. CRISPR double-knockout (CDKO) technologies can only screen several hundred genes and their combinations, but not genome-wide. Therefore, good SL prediction models are highly needed for genes and gene pairs selection in CDKO experiments. However, lack of scalable SL properties prevents generalizability of SL interactions to out-of-sample data, thereby hindering modeling efforts. In this paper, we recognize that SL connectivity is a scalable and generalizable SL property. We develop a novel two-step multilayer encoder for individual sample-specific SL prediction model (MLEC-iSL), which predicts SL connectivity first and SL interactions subsequently. MLEC-iSL has three encoders, namely, gene, graph, and transformer encoders. MLEC-iSL achieves high SL prediction performance in K562 (AUPR, 0.73; AUC, 0.72) and Jurkat (AUPR, 0.73; AUC, 0.71) cells, while no existing methods exceed 0.62 AUPR and AUC. The prediction performance of MLEC-iSL is validated in a CDKO experiment in 22Rv1 cells, yielding a 46.8% SL rate among 987 selected gene pairs. The screen also reveals SL dependency between apoptosis and mitosis cell death pathways.
Affiliation(s)
- Kunjie Fan
  - Department of Biomedical Informatics, College of Medicine, The Ohio State University, 1800 Cannon Drive, Columbus, OH 43210, United States
- Birkan Gökbağ
  - Department of Biomedical Informatics, College of Medicine, The Ohio State University, 1800 Cannon Drive, Columbus, OH 43210, United States
- Shan Tang
  - Department of Biomedical Informatics, College of Pharmacy, The Ohio State University, 500 W. 12 ave, Columbus, OH 43210, United States
- Shangjia Li
  - Department of Biomedical Informatics, College of Medicine, The Ohio State University, 1800 Cannon Drive, Columbus, OH 43210, United States
- Yirui Huang
  - Department of Biomedical Informatics, College of Pharmacy, The Ohio State University, 500 W. 12 ave, Columbus, OH 43210, United States
- Lingling Wang
  - Department of Biomedical Informatics, College of Medicine, The Ohio State University, 1800 Cannon Drive, Columbus, OH 43210, United States
- Lijun Cheng
  - Department of Biomedical Informatics, College of Medicine, The Ohio State University, 1800 Cannon Drive, Columbus, OH 43210, United States
- Lang Li
  - Department of Biomedical Informatics, College of Medicine, The Ohio State University, 1800 Cannon Drive, Columbus, OH 43210, United States
  - Department of Biomedical Informatics, College of Pharmacy, The Ohio State University, 500 W. 12 ave, Columbus, OH 43210, United States
9
Zhang G, Chen Y, Yan C, Wang J, Liang W, Luo J, Luo H. MPASL: multi-perspective learning knowledge graph attention network for synthetic lethality prediction in human cancer. Front Pharmacol 2024; 15:1398231. [PMID: 38835667 PMCID: PMC11148462 DOI: 10.3389/fphar.2024.1398231] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/09/2024] [Accepted: 04/26/2024] [Indexed: 06/06/2024] [Open Access]
Abstract
Synthetic lethality (SL) is widely used to discover anti-cancer drug targets. However, the identification of SL interactions through wet-lab experiments is costly and inefficient. Hence, the development of efficient and highly accurate computational methods for SL interaction prediction is of great significance. In this study, we propose MPASL, a multi-perspective learning knowledge graph attention network to enhance synthetic lethality prediction. MPASL utilizes knowledge graph hierarchy propagation to explore multi-source neighbor nodes related to genes. The knowledge graph ripple propagation expands gene representations through existing gene SL preference sets. MPASL can learn the gene representations from both the gene-entity and entity-entity perspectives. Specifically, using an aggregation method, we learn gene-oriented entity embeddings. Then, the gene representations are refined by comparing the various layer-wise neighborhood features of entities using the discrepancy contrastive technique. Finally, the learned gene representation is applied in SL prediction. Experimental results demonstrated that MPASL outperforms several state-of-the-art methods. Additionally, case studies have validated the effectiveness of MPASL in identifying SL interactions between genes.
Affiliation(s)
- Ge Zhang
  - School of Computer and Information Engineering, Henan University, Kaifeng, Henan, China
  - Henan Key Laboratory of Big Data Analysis and Processing, Henan University, Kaifeng, Henan, China
- Yitong Chen
  - School of Computer and Information Engineering, Henan University, Kaifeng, Henan, China
  - Henan Key Laboratory of Big Data Analysis and Processing, Henan University, Kaifeng, Henan, China
- Chaokun Yan
  - School of Computer and Information Engineering, Henan University, Kaifeng, Henan, China
  - Henan Key Laboratory of Big Data Analysis and Processing, Henan University, Kaifeng, Henan, China
- Jianlin Wang
  - School of Computer and Information Engineering, Henan University, Kaifeng, Henan, China
  - Henan Key Laboratory of Big Data Analysis and Processing, Henan University, Kaifeng, Henan, China
- Wenjuan Liang
  - School of Computer and Information Engineering, Henan University, Kaifeng, Henan, China
  - Henan Key Laboratory of Big Data Analysis and Processing, Henan University, Kaifeng, Henan, China
- Junwei Luo
  - College of Computer Science and Technology, Henan Polytechnic University, Jiaozuo, Henan, China
- Huimin Luo
  - School of Computer and Information Engineering, Henan University, Kaifeng, Henan, China
  - Henan Key Laboratory of Big Data Analysis and Processing, Henan University, Kaifeng, Henan, China
10
Liu X, Hu J, Zheng J. SL-Miner: a web server for mining evidence and prioritization of cancer-specific synthetic lethality. Bioinformatics 2024; 40:btae016. [PMID: 38244572 PMCID: PMC10868331 DOI: 10.1093/bioinformatics/btae016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/18/2023] [Revised: 12/10/2023] [Accepted: 01/16/2024] [Indexed: 01/22/2024] [Open Access]
Abstract
SUMMARY Synthetic lethality (SL) refers to a type of genetic interaction in which the simultaneous inactivation of two genes leads to cell death, while the inactivation of a single gene does not affect cell viability. It significantly expands the range of potential therapeutic targets for anti-cancer treatments. SL interactions are primarily identified through experimental screening and computational prediction. Although various computational methods have been proposed, they tend to ignore providing evidence to support their predictions of SL. Besides, they are rarely user-friendly for biologists who likely have limited programming skills. Moreover, the genetic context specificity of SL interactions is often not taken into consideration. Here, we introduce a web server called SL-Miner, which is designed to mine the evidence of SL relationships between a primary gene and a few candidate SL partner genes in a specific type of cancer, and to prioritize these candidate genes by integrating various types of evidence. For intuitive data visualization, SL-Miner provides a range of charts (e.g. volcano plot and box plot) to help users get insights from the data. AVAILABILITY AND IMPLEMENTATION SL-Miner is available at https://slminer.sist.shanghaitech.edu.cn.
Affiliation(s)
- Xin Liu
  - School of Information Science and Technology, ShanghaiTech University, Shanghai 201210, China
- Jieni Hu
  - School of Life Science and Technology, ShanghaiTech University, Shanghai 201210, China
- Jie Zheng
  - School of Information Science and Technology, ShanghaiTech University, Shanghai 201210, China
  - Shanghai Engineering Research Center of Intelligent Vision and Imaging, Shanghai 201210, China
11
Tepeli YI, Seale C, Gonçalves JP. ELISL: early-late integrated synthetic lethality prediction in cancer. Bioinformatics 2024; 40:btad764. [PMID: 38113447 PMCID: PMC11616771 DOI: 10.1093/bioinformatics/btad764] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/18/2023] [Revised: 11/06/2023] [Accepted: 12/18/2023] [Indexed: 12/21/2023] [Open Access]
Abstract
MOTIVATION Anti-cancer therapies based on synthetic lethality (SL) exploit tumour vulnerabilities for treatment with reduced side effects, by targeting a gene that is jointly essential with another whose function is lost. Computational prediction is key to expedite SL screening, yet existing methods are vulnerable to prevalent selection bias in SL data and reliant on cancer or tissue type-specific omics, which can be scarce. Notably, sequence similarity remains underexplored as a proxy for related gene function and joint essentiality. RESULTS We propose ELISL, Early-Late Integrated SL prediction with forest ensembles, using context-free protein sequence embeddings and context-specific omics from cell lines and tissue. Across eight cancer types, ELISL showed superior robustness to selection bias and recovery of known SL genes, as well as promising cross-cancer predictions. Co-occurring mutations in a BRCA gene and ELISL-predicted pairs from the HH, FGF, WNT, or NEIL gene families were associated with longer patient survival times, revealing therapeutic potential. AVAILABILITY AND IMPLEMENTATION Data: 10.6084/m9.figshare.23607558 & Code: github.com/joanagoncalveslab/ELISL.
Affiliation(s)
- Yasin I Tepeli
  - Pattern Recognition & Bioinformatics, Department of Intelligent Systems, Faculty EEMCS, Delft University of Technology, Delft, The Netherlands
- Colm Seale
  - Pattern Recognition & Bioinformatics, Department of Intelligent Systems, Faculty EEMCS, Delft University of Technology, Delft, The Netherlands
  - Holland Proton Therapy Center (HollandPTC), Delft, The Netherlands
- Joana P Gonçalves
  - Pattern Recognition & Bioinformatics, Department of Intelligent Systems, Faculty EEMCS, Delft University of Technology, Delft, The Netherlands
12
Son J, Kim D. Applying network link prediction in drug discovery: an overview of the literature. Expert Opin Drug Discov 2024; 19:43-56. [PMID: 37794688 DOI: 10.1080/17460441.2023.2267020] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/08/2023] [Accepted: 10/02/2023] [Indexed: 10/06/2023]
Abstract
INTRODUCTION Network representation can give a holistic view of relationships among biomedical entities through network topology. Link prediction estimates the probability of link formation between a pair of unconnected nodes. In the drug discovery process, the link prediction method not only enables the detection of connectivity patterns but also predicts the effects of one biomedical entity on multiple entities simultaneously and vice versa, which is useful for many applications. AREAS COVERED The authors provide a comprehensive overview of network link prediction in drug discovery. Link prediction methodologies such as similarity-based approaches, embedding-based approaches, probabilistic model-based approaches, and preprocessing methods are summarized with examples. In addition to describing their properties and limitations, the authors discuss the applications of link prediction in drug discovery based on the relationship between biomedical concepts. EXPERT OPINION Link prediction is a powerful method to infer the existence of novel relationships in drug discovery. However, link prediction has been hampered by the sparsity of data and the lack of negative links in biomedical networks. With preprocessing to balance positive and negative samples and the collection of more data, the authors believe it is possible to develop more reliable link prediction methods that can become invaluable tools for successful drug discovery.
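A similarity-based approach of the kind listed above can be sketched in a few lines: score each unconnected node pair by the Jaccard overlap of their neighbour sets and rank the pairs. The toy network, node labels, and function name are assumptions for illustration; the review covers many other scoring schemes.

```python
from itertools import combinations

def jaccard_link_scores(edges):
    """Score every unconnected node pair by the Jaccard overlap of neighbours."""
    neighbours = {}
    for u, v in edges:
        neighbours.setdefault(u, set()).add(v)
        neighbours.setdefault(v, set()).add(u)
    scores = {}
    for u, v in combinations(sorted(neighbours), 2):
        if v in neighbours[u]:
            continue  # already linked, nothing to predict
        shared = neighbours[u] & neighbours[v]
        union = neighbours[u] | neighbours[v]
        scores[(u, v)] = len(shared) / len(union)
    return scores

# Hypothetical toy network; node names are placeholders, not real entities.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("B", "D"), ("C", "D"), ("D", "E")]
scores = jaccard_link_scores(edges)
ranked = sorted(scores.items(), key=lambda item: -item[1])
print(ranked[0][0])  # the pair (A, D) shares two of three neighbours and ranks first
```

Scores like these depend only on local topology, which is why sparse networks and missing negative links, both noted in the abstract, limit how far such methods can go.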
Affiliation(s)
- Jeongtae Son
  - Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology, Daejeon, Republic of Korea
- Dongsup Kim
  - Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology, Daejeon, Republic of Korea
13
Li J, Lu X, Jiang K, Tang D, Sun F, Ruan J. Latent space feature representation on multiple biological network for synthetic lethality interaction prediction. 2023 IEEE International Conference on Bioinformatics and Biomedicine (BIBM) 2023:1236-1241. [DOI: 10.1109/bibm58861.2023.10385727] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/03/2025]
Affiliation(s)
- Jinxin Li
  - Hunan University, College of Computer Science and Electronic Engineering, Changsha
- Xinguo Lu
  - Hunan University, College of Computer Science and Electronic Engineering, Changsha
- Kaibao Jiang
  - Hunan University, College of Computer Science and Electronic Engineering, Changsha
- Daoxu Tang
  - Hunan University, College of Computer Science and Electronic Engineering, Changsha
- Fengxu Sun
  - Hunan University, College of Computer Science and Electronic Engineering, Changsha
- Jingjing Ruan
  - Hunan University, College of Computer Science and Electronic Engineering, Changsha
14
Pu M, Cheng K, Li X, Xin Y, Wei L, Jin S, Zheng W, Peng G, Tang Q, Zhou J, Zhang Y. Using graph-based model to identify cell specific synthetic lethal effects. Comput Struct Biotechnol J 2023; 21:5099-5110. [PMID: 37920819 PMCID: PMC10618116 DOI: 10.1016/j.csbj.2023.10.011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/05/2023] [Revised: 10/05/2023] [Accepted: 10/06/2023] [Indexed: 11/04/2023] Open
Abstract
Synthetic lethal (SL) pairs are pairs of genes whose simultaneous loss-of-function results in cell death, while a damaging mutation of either gene alone does not affect the cell's survival. This makes SL pairs attractive targets for precision cancer therapies, as targeting the unimpaired gene of the SL pair can selectively kill cancer cells that already harbor the impaired gene. Limited by the difficulty of finding true SL pairs, especially on specific cell types, current computational approaches provide only limited insights because of overlooking the crucial aspects of cellular context dependency and mechanistic understanding of SL pairs. As a result, the identification of SL targets still relies on expensive, time-consuming experimental approaches. In this work, we applied cell-line specific multi-omics data to a specially designed deep learning model to predict cell-line specific SL pairs. Through incorporating multiple types of cell-specific omics data with a self-attention module, we represent gene relationships as graphs. Our approach achieves the prediction of SL pairs in a cell-specific manner and demonstrates the potential to facilitate the discovery of cell-specific SL targets for cancer therapeutics, providing a tool to unearth mechanisms underlying the origin of SL in cancer biology. The code and data of our approach can be found at https://github.com/promethiume/SLwise.
Affiliation(s)
- Kaiyang Cheng
  - StoneWise, AI, Ltd., Beijing, China
  - Nanjing University of Chinese Medicine, Shanghai, China
- Xiaorong Li
  - StoneWise, AI, Ltd., Beijing, China
  - Minzu University of China, Beijing, China
- Sutong Jin
  - StoneWise, AI, Ltd., Beijing, China
  - Harbin Institute of Technology, Weihai, China
- Qihong Tang
  - StoneWise, AI, Ltd., Beijing, China
  - Guilin University of Electronic Science and Technology, Guangxi, China
15
Liang J, Li ZW, Sun ZN, Bi Y, Cheng H, Zeng T, Guo WF. Latent space search based multimodal optimization with personalized edge-network biomarker for multi-purpose early disease prediction. Brief Bioinform 2023; 24:bbad364. [PMID: 37833844 DOI: 10.1093/bib/bbad364] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/29/2023] [Revised: 09/06/2023] [Accepted: 09/20/2023] [Indexed: 10/15/2023] Open
Abstract
Considering that cancer is resulting from the comutation of several essential genes of individual patients, researchers have begun to focus on identifying personalized edge-network biomarkers (PEBs) using personalized edge-network analysis for clinical practice. However, most of existing methods ignored the optimization of PEBs when multimodal biomarkers exist in multi-purpose early disease prediction (MPEDP). To solve this problem, this study proposes a novel model (MMPDENB-RBM) that combines personalized dynamic edge-network biomarkers (PDENB) theory, multimodal optimization strategy and latent space search scheme to identify biomarkers with different configurations of PDENB modules (i.e. to effectively identify multimodal PDENBs). The application to the three largest cancer omics datasets from The Cancer Genome Atlas database (i.e. breast invasive carcinoma, lung squamous cell carcinoma and lung adenocarcinoma) showed that the MMPDENB-RBM model could more effectively predict critical cancer state compared with other advanced methods. And, our model had better convergence, diversity and multimodal property as well as effective optimization ability compared with the other state-of-art methods. Particularly, multimodal PDENBs identified were more enriched with different functional biomarkers simultaneously, such as tissue-specific synthetic lethality edge-biomarkers including cancer driver genes and disease marker genes. Importantly, as our aim, these multimodal biomarkers can perform diverse biological and biomedical significances for drug target screen, survival risk assessment and novel biomedical sight as the expected multi-purpose of personalized early disease prediction. In summary, the present study provides multimodal property of PDENBs, especially the therapeutic biomarkers with more biological significances, which can help with MPEDP of individual cancer patients.
Affiliation(s)
- Jing Liang
  - School of Electrical and Information Engineering, Zhengzhou University, Zhengzhou 450001, China
  - State Key Laboratory of Intelligent Agricultural Power Equipment, Zhengzhou University, Luoyang 471000, China
- Zong-Wei Li
  - School of Electrical and Information Engineering, Zhengzhou University, Zhengzhou 450001, China
- Ze-Ning Sun
  - School of Electrical and Information Engineering, Zhengzhou University, Zhengzhou 450001, China
- Ying Bi
  - School of Electrical and Information Engineering, Zhengzhou University, Zhengzhou 450001, China
- Han Cheng
  - School of Life Sciences, Zhengzhou University, Zhengzhou 450001, China
- Tao Zeng
  - Guangzhou National Laboratory, Guangzhou 510005, China
  - GMU-GIBH Joint School of Life Sciences, The Guangdong-Hong Kong-Macau Joint Laboratory for Cell Fate Regulation and Diseases, Guangzhou Laboratory, 510005, Guangzhou Medical University
- Wei-Feng Guo
  - School of Electrical and Information Engineering, Zhengzhou University, Zhengzhou 450001, China
  - State Key Laboratory of Intelligent Agricultural Power Equipment, Zhengzhou University, Luoyang 471000, China
  - State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Sun Yat-sen University Cancer Center, Guangzhou 510060, China
16
Zhang K, Wu M, Liu Y, Feng Y, Zheng J. KR4SL: knowledge graph reasoning for explainable prediction of synthetic lethality. Bioinformatics 2023; 39:i158-i167. [PMID: 37387166 PMCID: PMC10311291 DOI: 10.1093/bioinformatics/btad261] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 07/01/2023] Open
Abstract
MOTIVATION: Synthetic lethality (SL) is a promising strategy for anticancer therapy, as inhibiting SL partners of genes with cancer-specific mutations can selectively kill the cancer cells without harming the normal cells. Wet-lab techniques for SL screening have issues like high cost and off-target effects. Computational methods can help address these issues. Previous machine learning methods leverage known SL pairs, and the use of knowledge graphs (KGs) can significantly enhance the prediction performance. However, the subgraph structures of KG have not been fully explored. Besides, most machine learning methods lack interpretability, which is an obstacle for wide applications of machine learning to SL identification.
RESULTS: We present a model named KR4SL to predict SL partners for a given primary gene. It captures the structural semantics of a KG by efficiently constructing and learning from relational digraphs in the KG. To encode the semantic information of the relational digraphs, we fuse textual semantics of entities into propagated messages and enhance the sequential semantics of paths using a recurrent neural network. Moreover, we design an attentive aggregator to identify critical subgraph structures that contribute the most to the SL prediction as explanations. Extensive experiments under different settings show that KR4SL significantly outperforms all the baselines. The explanatory subgraphs for the predicted gene pairs can unveil prediction process and mechanisms underlying synthetic lethality. The improved predictive power and interpretability indicate that deep learning is practically useful for SL-based cancer drug target discovery.
AVAILABILITY AND IMPLEMENTATION: The source code is freely available at https://github.com/JieZheng-ShanghaiTech/KR4SL.
Affiliation(s)
- Ke Zhang
  - School of Information Science and Technology, ShanghaiTech University, Shanghai 201210, China
  - Shanghai Institute of Microsystem and Information Technology, Chinese Academy of Sciences, Shanghai 200050, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Min Wu
  - Institute for Infocomm Research, Agency for Science, Technology and Research (A*STAR), Singapore 138632, Singapore
- Yong Liu
  - Nanyang Technological University, Singapore 639798, Singapore
- Yimiao Feng
  - School of Information Science and Technology, ShanghaiTech University, Shanghai 201210, China
  - Lingang Laboratory, Shanghai 201602, China
- Jie Zheng
  - School of Information Science and Technology, ShanghaiTech University, Shanghai 201210, China
  - Shanghai Engineering Research Center of Intelligent Vision and Imaging, ShanghaiTech University, Shanghai 201210, China
17
Zhu Y, Zhou Y, Liu Y, Wang X, Li J. SLGNN: synthetic lethality prediction in human cancers based on factor-aware knowledge graph neural network. Bioinformatics 2023; 39:6988048. [PMID: 36645245 PMCID: PMC9907046 DOI: 10.1093/bioinformatics/btad015] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/09/2022] [Revised: 11/29/2022] [Accepted: 01/13/2023] [Indexed: 01/17/2023] Open
Abstract
MOTIVATION: Synthetic lethality (SL) is a form of genetic interaction that can selectively kill cancer cells without damaging normal cells. Exploiting this mechanism is gaining popularity in the field of targeted cancer therapy and anticancer drug development. Due to the limitations of identifying SL interactions from laboratory experiments, an increasing number of research groups are devising computational prediction methods to guide the discovery of potential SL pairs. Although existing methods have attempted to capture the underlying mechanisms of SL interactions, methods that have a deeper understanding of and attempt to explain SL mechanisms still need to be developed.
RESULTS: In this work, we propose a novel SL prediction method, SLGNN. This method is based on the following assumption: SL interactions are caused by different molecular events or biological processes, which we define as SL-related factors that lead to SL interactions. SLGNN, apart from identifying SL interaction pairs, also models the preferences of genes for different SL-related factors, making the results more interpretable for biologists and clinicians. SLGNN consists of three steps: first, we model the combinations of relationships in the gene-related knowledge graph as the SL-related factors. Next, we derive initial embeddings of genes through an explicit message aggregation process of the knowledge graph. Finally, we derive the final gene embeddings through an SL graph, constructed using known SL gene pairs, utilizing factor-based message aggregation. At this stage, a supervised end-to-end training model is used for SL interaction prediction. Based on experimental results, the proposed SLGNN model outperforms all current state-of-the-art SL prediction methods and provides better interpretability.
AVAILABILITY AND IMPLEMENTATION: SLGNN is freely available at https://github.com/zy972014452/SLGNN.
Affiliation(s)
- Yan Zhu
  - School of Computer Science and Technology, Harbin Institute of Technology (Shenzhen), Shenzhen 518055, China
- Yuhuan Zhou
  - School of Computer Science and Technology, Harbin Institute of Technology (Shenzhen), Shenzhen 518055, China
- Yang Liu
  - School of Computer Science and Technology, Harbin Institute of Technology (Shenzhen), Shenzhen 518055, China
  - Guangdong Provincial Key Laboratory of Novel Security Intelligence Technologies, Harbin Institute of Technology (Shenzhen), Shenzhen 518055, China
- Xuan Wang
  - School of Computer Science and Technology, Harbin Institute of Technology (Shenzhen), Shenzhen 518055, China
  - Guangdong Provincial Key Laboratory of Novel Security Intelligence Technologies, Harbin Institute of Technology (Shenzhen), Shenzhen 518055, China
- Junyi Li
  - School of Computer Science and Technology, Harbin Institute of Technology (Shenzhen), Shenzhen 518055, China
  - Guangdong Provincial Key Laboratory of Novel Security Intelligence Technologies, Harbin Institute of Technology (Shenzhen), Shenzhen 518055, China
18
Fan K, Tang S, Gökbağ B, Cheng L, Li L. Multi-view graph convolutional network for cancer cell-specific synthetic lethality prediction. Front Genet 2023; 13:1103092. [PMID: 36699450 PMCID: PMC9868610 DOI: 10.3389/fgene.2022.1103092] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 11/19/2022] [Accepted: 12/22/2022] [Indexed: 01/11/2023] Open
Abstract
Synthetic lethal (SL) genetic interactions have been regarded as a promising focus for investigating potential targeted therapeutics to tackle cancer. However, the costly investment of time and labor associated with wet-lab experimental screenings to discover potential SL relationships motivates the development of computational methods. Although graph neural network (GNN) models have performed well in the prediction of SL gene pairs, existing GNN-based models are not designed for predicting cancer cell-specific SL interactions that are more relevant to experimental validation in vitro. Besides, neither have existing methods fully utilized diverse graph representations of biological features to improve prediction performance. In this work, we propose MVGCN-iSL, a novel multi-view graph convolutional network (GCN) model to predict cancer cell-specific SL gene pairs, by incorporating five biological graph features and multi-omics data. Max pooling operation is applied to integrate five graph-specific representations obtained from GCN models. Afterwards, a deep neural network (DNN) model serves as the prediction module to predict the SL interactions in individual cancer cells (iSL). Extensive experiments have validated the model's successful integration of the multiple graph features and state-of-the-art performance in the prediction of potential SL gene pairs as well as generalization ability to novel genes.
Affiliation(s)
- Kunjie Fan
  - Department of Biomedical Informatics, The Ohio State University, Columbus, OH, United States
- Shan Tang
  - College of Pharmacy, The Ohio State University, Columbus, OH, United States
- Birkan Gökbağ
  - Department of Biomedical Informatics, The Ohio State University, Columbus, OH, United States
- Lijun Cheng
  - Department of Biomedical Informatics, The Ohio State University, Columbus, OH, United States
- Lang Li
  - Department of Biomedical Informatics, The Ohio State University, Columbus, OH, United States
  - College of Pharmacy, The Ohio State University, Columbus, OH, United States
  - *Correspondence: Lang Li,
19
Tang S, Gökbağ B, Fan K, Shao S, Huo Y, Wu X, Cheng L, Li L. Synthetic lethal gene pairs: Experimental approaches and predictive models. Front Genet 2022; 13:961611. [PMID: 36531238 PMCID: PMC9751344 DOI: 10.3389/fgene.2022.961611] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 06/05/2022] [Accepted: 11/07/2022] [Indexed: 03/27/2024] Open
Abstract
Synthetic lethality (SL) refers to a genetic interaction in which the simultaneous perturbation of two genes leads to cell or organism death, whereas viability is maintained when only one of the pair is altered. The experimental exploration of these pairs and predictive modeling in computational biology contribute to our understanding of cancer biology and the development of cancer therapies. We extensively reviewed experimental technologies, public data sources, and predictive models in the study of synthetic lethal gene pairs and herein detail biological assumptions, experimental data, statistical models, and computational schemes of various predictive models, speculate regarding their influence on individual sample- and population-based synthetic lethal interactions, discuss the pros and cons of existing SL data and models, and highlight potential research directions in SL discovery.
Affiliation(s)
- Shan Tang
  - College of Pharmacy, The Ohio State University, Columbus, OH, United States
- Birkan Gökbağ
  - Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH, United States
- Kunjie Fan
  - Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH, United States
- Shuai Shao
  - College of Pharmacy, The Ohio State University, Columbus, OH, United States
- Yang Huo
  - Indiana University, Bloomington, IN, United States
- Xue Wu
  - Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH, United States
- Lijun Cheng
  - Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH, United States
- Lang Li
  - Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH, United States
20
Wang S, Feng Y, Liu X, Liu Y, Wu M, Zheng J. NSF4SL: negative-sample-free contrastive learning for ranking synthetic lethal partner genes in human cancers. Bioinformatics 2022; 38:ii13-ii19. [PMID: 36124790 DOI: 10.1093/bioinformatics/btac462] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 12/25/2022] Open
Abstract
MOTIVATION: Detecting synthetic lethality (SL) is a promising strategy for identifying anti-cancer drug targets. Targeting SL partners of a primary gene mutated in cancer is selectively lethal to cancer cells. Due to high cost of wet-lab experiments and availability of gold standard SL data, supervised machine learning for SL prediction has been popular. However, most of the methods are based on binary classification and thus limited by the lack of reliable negative data. Contrastive learning can train models without any negative sample and is thus promising for finding novel SLs.
RESULTS: We propose NSF4SL, a negative-sample-free SL prediction model based on a contrastive learning framework. It captures the characteristics of positive SL samples by using two branches of neural networks that interact with each other to learn SL-related gene representations. Moreover, a feature-wise data augmentation strategy is used to mitigate the sparsity of SL data. NSF4SL significantly outperforms all baselines which require negative samples, even in challenging experimental settings. To the best of our knowledge, this is the first time that SL prediction is formulated as a gene ranking problem, which is more practical than the current formulation as binary classification. NSF4SL is the first contrastive learning method for SL prediction and its success points to a new direction of machine-learning methods for identifying novel SLs.
AVAILABILITY AND IMPLEMENTATION: Our source code is available at https://github.com/JieZheng-ShanghaiTech/NSF4SL.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Shike Wang
  - School of Information Science and Technology, ShanghaiTech University, Shanghai 201210, China
- Yimiao Feng
  - School of Information Science and Technology, ShanghaiTech University, Shanghai 201210, China
- Xin Liu
  - School of Information Science and Technology, ShanghaiTech University, Shanghai 201210, China
- Yong Liu
  - Joint NTU-UBC Research Centre of Excellence in Active Living for the Elderly, Nanyang Technological University, Singapore 639798, Singapore
- Min Wu
  - Institute for Infocomm Research, Agency for Science, Technology and Research (A*STAR), Singapore 138632, Singapore
- Jie Zheng
  - School of Information Science and Technology, ShanghaiTech University, Shanghai 201210, China
  - Shanghai Engineering Research Center of Intelligent Vision and Imaging, Shanghai 201210, China
21
Liu X, Yu J, Tao S, Yang B, Wang S, Wang L, Bai F, Zheng J. PiLSL: pairwise interaction learning-based graph neural network for synthetic lethality prediction in human cancers. Bioinformatics 2022; 38:ii106-ii112. [PMID: 36124788 DOI: 10.1093/bioinformatics/btac476] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Indexed: 12/25/2022] Open
Abstract
MOTIVATION: Synthetic lethality (SL) is a type of genetic interaction in which the simultaneous inactivation of two genes leads to cell death, while the inactivation of a single gene does not affect the cell viability. It can effectively expand the range of anti-cancer therapeutic targets. SL interactions are identified mainly by experimental screening and computational prediction. Recent machine-learning methods mostly learn the representation of each gene individually, ignoring the representation of the pairwise interaction between two genes. In addition, the mechanisms of SL, the key to translating SL into cancer therapeutics, are often unclear.
RESULTS: To fill the gaps, we propose a pairwise interaction learning-based graph neural network (GNN) named PiLSL to learn the representation of pairwise interaction between two genes for SL prediction. First, we construct an enclosing graph for each pair of genes from a knowledge graph. Secondly, we design an attentive embedding propagation layer in a GNN to discriminate the importance among the edges in the enclosing graph and to learn the latent features of the pairwise interaction from the weighted enclosing graph. Finally, we further fuse the latent features with explicit features extracted from multi-omics data to obtain powerful gene representations for SL prediction. Extensive experimental results demonstrate that PiLSL outperforms the best baseline by a large margin and generalizes well under three realistic scenarios. Besides, PiLSL provides an explanation of SL mechanisms via the weighted paths in the enclosing graphs by attention mechanism.
AVAILABILITY AND IMPLEMENTATION: Our source code is available at https://github.com/JieZheng-ShanghaiTech/PiLSL.
Affiliation(s)
- Xin Liu
  - School of Information Science and Technology, Shanghai Tech University, Shanghai 201210, China
- Jiale Yu
  - School of Information Science and Technology, Shanghai Tech University, Shanghai 201210, China
- Siyu Tao
  - School of Information Science and Technology, Shanghai Tech University, Shanghai 201210, China
- Beiyuan Yang
  - School of Information Science and Technology, Shanghai Tech University, Shanghai 201210, China
- Shike Wang
  - School of Information Science and Technology, Shanghai Tech University, Shanghai 201210, China
- Lin Wang
  - School of Life Science and Technology, Shanghai Tech University, Shanghai 201210, China
  - Shanghai Institute for Advanced Immunochemical Studies, Shanghai Tech University, Shanghai 201210, China
- Fang Bai
  - School of Information Science and Technology, Shanghai Tech University, Shanghai 201210, China
  - School of Life Science and Technology, Shanghai Tech University, Shanghai 201210, China
  - Shanghai Institute for Advanced Immunochemical Studies, Shanghai Tech University, Shanghai 201210, China
- Jie Zheng
  - School of Information Science and Technology, Shanghai Tech University, Shanghai 201210, China
  - Shanghai Engineering Research Center of Intelligent Vision and Imaging, Shanghai 201210, China
22
Seale C, Tepeli Y, Gonçalves JP. Overcoming selection bias in synthetic lethality prediction. Bioinformatics 2022; 38:4360-4368. [PMID: 35876858 PMCID: PMC9477536 DOI: 10.1093/bioinformatics/btac523] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 11/19/2021] [Revised: 07/13/2022] [Accepted: 07/22/2022] [Indexed: 12/24/2022] Open
Abstract
MOTIVATION: Synthetic lethality (SL) between two genes occurs when simultaneous loss of function leads to cell death. This holds great promise for developing anti-cancer therapeutics that target synthetic lethal pairs of endogenously disrupted genes. Identifying novel SL relationships through exhaustive experimental screens is challenging, due to the vast number of candidate pairs. Computational SL prediction is therefore sought to identify promising SL gene pairs for further experimentation. However, current SL prediction methods lack consideration for generalizability in the presence of selection bias in SL data.
RESULTS: We show that SL data exhibit considerable gene selection bias. Our experiments designed to assess the robustness of SL prediction reveal that models driven by the topology of known SL interactions (e.g. graph, matrix factorization) are especially sensitive to selection bias. We introduce selection bias-resilient synthetic lethality (SBSL) prediction using regularized logistic regression or random forests. Each gene pair is described by 27 molecular features derived from cancer cell line, cancer patient tissue and healthy donor tissue samples. SBSL models are built and tested using approximately 8000 experimentally derived SL pairs across breast, colon, lung and ovarian cancers. Compared to other SL prediction methods, SBSL showed higher predictive performance, better generalizability and robustness to selection bias. Gene dependency, quantifying the essentiality of a gene for cell survival, contributed most to SBSL predictions. Random forests were superior to linear models in the absence of dependency features, highlighting the relevance of mutual exclusivity of somatic mutations, co-expression in healthy tissue and differential expression in tumour samples.
AVAILABILITY AND IMPLEMENTATION: https://github.com/joanagoncalveslab/sbsl.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Colm Seale
  - Pattern Recognition & Bioinformatics, Department of Intelligent Systems, Faculty EEMCS, Delft University of Technology, Delft 2628 XE, The Netherlands
  - Holland Proton Therapy Center (HollandPTC), Delft 2600 AC, The Netherlands
- Yasin Tepeli
  - Pattern Recognition & Bioinformatics, Department of Intelligent Systems, Faculty EEMCS, Delft University of Technology, Delft 2628 XE, The Netherlands
- Joana P Gonçalves
  - Pattern Recognition & Bioinformatics, Department of Intelligent Systems, Faculty EEMCS, Delft University of Technology, Delft 2628 XE, The Netherlands
23
Guo L, Dou Y, Xia D, Yin Z, Xiang Y, Luo L, Zhang Y, Wang J, Liang T. SLOAD: a comprehensive database of cancer-specific synthetic lethal interactions for precision cancer therapy via multi-omics analysis. Database (Oxford) 2022; 2022:6677988. [PMID: 36029479 PMCID: PMC9419874 DOI: 10.1093/database/baac075] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 06/15/2022] [Revised: 07/27/2022] [Accepted: 08/20/2022] [Indexed: 11/14/2022]
Abstract
Synthetic lethality has been widely concerned because of its potential role in cancer treatment, which can be harnessed to selectively kill cancer cells via identifying inactive genes in a specific cancer type and further targeting the corresponding synthetic lethal partners. Herein, to obtain cancer-specific synthetic lethal interactions, we aimed to predict genetic interactions via a pan-cancer analysis from multiple molecular levels using random forest and then develop a user-friendly database. First, based on collected public gene pairs with synthetic lethal interactions, candidate gene pairs were analyzed via integrating multi-omics data, mainly including DNA mutation, copy number variation, methylation and mRNA expression data. Then, integrated features were used to predict cancer-specific synthetic lethal interactions using random forest. Finally, SLOAD (http://www.tmliang.cn/SLOAD) was constructed via integrating these findings, which was a user-friendly database for data searching, browsing, downloading and analyzing. These results can provide candidate cancer-specific synthetic lethal interactions, which will contribute to drug designing in cancer treatment that can promote therapy strategies based on the principle of synthetic lethality.
Database URL http://www.tmliang.cn/SLOAD/
Affiliation(s)
- Li Guo
  - Department of Bioinformatics, Smart Health Big Data Analysis and Location Services Engineering Lab of Jiangsu Province, School of Geographic and Biologic Information, Nanjing University of Posts and Telecommunications, No. 9, Wenyuan Road, Qixia District, Nanjing, Jiangsu 210023, China
- Yuyang Dou
  - Department of Bioinformatics, Smart Health Big Data Analysis and Location Services Engineering Lab of Jiangsu Province, School of Geographic and Biologic Information, Nanjing University of Posts and Telecommunications, No. 9, Wenyuan Road, Qixia District, Nanjing, Jiangsu 210023, China
- Daoliang Xia
  - Department of Bioinformatics, Smart Health Big Data Analysis and Location Services Engineering Lab of Jiangsu Province, School of Geographic and Biologic Information, Nanjing University of Posts and Telecommunications, No. 9, Wenyuan Road, Qixia District, Nanjing, Jiangsu 210023, China
- Zibo Yin
  - Department of Bioinformatics, Smart Health Big Data Analysis and Location Services Engineering Lab of Jiangsu Province, School of Geographic and Biologic Information, Nanjing University of Posts and Telecommunications, No. 9, Wenyuan Road, Qixia District, Nanjing, Jiangsu 210023, China
- Yangyang Xiang
  - Department of Bioinformatics, Smart Health Big Data Analysis and Location Services Engineering Lab of Jiangsu Province, School of Geographic and Biologic Information, Nanjing University of Posts and Telecommunications, No. 9, Wenyuan Road, Qixia District, Nanjing, Jiangsu 210023, China
- Lulu Luo
  - Jiangsu Key Laboratory for Molecular and Medical Biotechnology, School of Life Science, Nanjing Normal University, No. 1, Wenyuan Road, Qixia District, Nanjing, Jiangsu 210023, China
- Yuting Zhang
  - Department of Bioinformatics, Smart Health Big Data Analysis and Location Services Engineering Lab of Jiangsu Province, School of Geographic and Biologic Information, Nanjing University of Posts and Telecommunications, No. 9, Wenyuan Road, Qixia District, Nanjing, Jiangsu 210023, China
- Jun Wang
  - Department of Bioinformatics, Smart Health Big Data Analysis and Location Services Engineering Lab of Jiangsu Province, School of Geographic and Biologic Information, Nanjing University of Posts and Telecommunications, No. 9, Wenyuan Road, Qixia District, Nanjing, Jiangsu 210023, China
- Tingming Liang
  - Jiangsu Key Laboratory for Molecular and Medical Biotechnology, School of Life Science, Nanjing Normal University, No. 1, Wenyuan Road, Qixia District, Nanjing, Jiangsu 210023, China
24
Long Y, Wu M, Liu Y, Fang Y, Kwoh CK, Chen J, Luo J, Li X. Pre-training graph neural networks for link prediction in biomedical networks. Bioinformatics 2022; 38:2254-2262. [PMID: 35171981 DOI: 10.1093/bioinformatics/btac100] [Citation(s) in RCA: 26] [Impact Index Per Article: 8.7] [Received: 10/20/2021] [Revised: 01/15/2022] [Accepted: 02/14/2022] [Indexed: 02/03/2023]
Abstract
MOTIVATION: Graphs or networks are widely utilized to model the interactions between different entities (e.g. proteins, drugs, etc.) for biomedical applications. Predicting potential interactions/links in biomedical networks is important for understanding the pathological mechanisms of various complex human diseases, as well as screening compound targets for drug discovery. Graph neural networks (GNNs) have been utilized for link prediction in various biomedical networks, which rely on the node features extracted from different data sources, e.g. sequence, structure and network data. However, it is challenging to effectively integrate these data sources and automatically extract features for different link prediction tasks.
RESULTS: In this article, we propose a novel Pre-Training Graph Neural Networks-based framework named PT-GNN to integrate different data sources for link prediction in biomedical networks. First, we design expressive deep learning methods [e.g. convolutional neural network and graph convolutional network (GCN)] to learn features for individual nodes from sequence and structure data. Second, we further propose a GCN-based encoder to effectively refine the node features by modelling the dependencies among nodes in the network. Third, the node features are pre-trained based on graph reconstruction tasks. The pre-trained features can be used for model initialization in downstream tasks. Extensive experiments have been conducted on two critical link prediction tasks, i.e. synthetic lethality (SL) prediction and drug-target interaction (DTI) prediction. Experimental results demonstrate that PT-GNN outperforms the state-of-the-art methods for SL prediction and DTI prediction. In addition, the pre-trained features improve the performance and reduce the training time of existing models.
AVAILABILITY AND IMPLEMENTATION: Python code and datasets are available at: https://github.com/longyahui/PT-GNN.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Yahui Long
- Singapore Immunology Network (SIgN), Agency for Science, Technology and Research, Singapore, Singapore
- Min Wu
- Institute for Infocomm Research, Agency for Science, Technology and Research, Singapore, Singapore
- Yong Liu
- Joint NTU-UBC Research Centre of Excellence in Active Living for the Elderly, Singapore, Singapore
- Yuan Fang
- School of Information Systems, Singapore Management University, 178902 Singapore, Singapore
- Chee Keong Kwoh
- School of Computer Science and Engineering, Nanyang Technological University, Singapore, Singapore
- Jinmiao Chen
- Singapore Immunology Network (SIgN), Agency for Science, Technology and Research, Singapore, Singapore
- Jiawei Luo
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
- Xiaoli Li
- Institute for Infocomm Research, Agency for Science, Technology and Research, Singapore, Singapore
|
25
|
Wang J, Zhang Q, Han J, Zhao Y, Zhao C, Yan B, Dai C, Wu L, Wen Y, Zhang Y, Leng D, Wang Z, Yang X, He S, Bo X. Computational methods, databases and tools for synthetic lethality prediction. Brief Bioinform 2022; 23:6555403. [PMID: 35352098 PMCID: PMC9116379 DOI: 10.1093/bib/bbac106] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 12/09/2021] [Revised: 02/15/2022] [Accepted: 03/02/2022] [Indexed: 12/17/2022]
Abstract
Synthetic lethality (SL) occurs between two genes when the inactivation of either gene alone has no effect on cell survival but the inactivation of both genes results in cell death. SL-based therapy has become one of the most promising targeted cancer therapies in the last decade, as PARP inhibitors have achieved great success in the clinic. The key to exploiting SL-based cancer therapy is the identification of robust SL pairs. Although many wet-lab-based methods have been developed to screen SL pairs, known SL pairs make up less than 0.1% of all potential pairs because of the large number of human gene combinations. Computational prediction methods complement wet-lab-based methods by effectively reducing the search space of SL pairs. In this paper, we review the recent applications of computational methods and commonly used databases for SL prediction. First, we introduce the concept of SL and its screening methods. Second, various SL-related data resources are summarized. Then, computational methods including statistical-based methods, network-based methods, classical machine learning methods and deep learning methods for SL prediction are summarized. In particular, we elaborate on the negative sampling methods applied in these models. Next, representative tools for SL prediction are introduced. Finally, the challenges and future work for SL prediction are discussed.
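To give a rough sense of the search space this abstract refers to, the short Python sketch below counts unordered gene pairs and the size of a 0.1% slice of them. The gene count of 20,000 is an assumption chosen only for illustration (it is not a figure from the abstract), and the variable names are hypothetical.

```python
from math import comb

# Assumption for illustration only: roughly 20,000 human genes.
# This number does not come from the abstract.
N_GENES = 20_000

# Number of unordered gene pairs {A, B} with A != B.
total_pairs = comb(N_GENES, 2)

# The abstract states that known SL pairs are below 0.1% of all pairs.
# Integer division avoids floating-point rounding in the bound.
max_known_pairs = total_pairs // 1000

print(f"candidate gene pairs: {total_pairs:,}")
print(f"0.1% of candidate pairs: {max_known_pairs:,}")
```

Under this assumed gene count, exhaustive wet-lab screening would have to cover on the order of hundreds of millions of pairs, which is the motivation the abstract gives for computational pre-filtering.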
Affiliation(s)
- Jing Wang
- Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
- Qinglong Zhang
- Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
- Junshan Han
- Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
- Yanpeng Zhao
- Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
- Caiyun Zhao
- Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
- Bowei Yan
- Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
- Chong Dai
- Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
- Lianlian Wu
- Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
- Yuqi Wen
- Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
- Yixin Zhang
- Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
- Dongjin Leng
- Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
- Zhongming Wang
- Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
- Xiaoxi Yang
- Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
- Song He
- Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
- Xiaochen Bo
- Department of Bioinformatics, Institute of Health Service and Transfusion Medicine, Beijing 100850, China
|
26
|
DA-IMRN: Dual-Attention-Guided Interactive Multi-Scale Residual Network for Hyperspectral Image Classification. REMOTE SENSING 2022. [DOI: 10.3390/rs14030530] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 01/30/2023]
Abstract
Deep learning-based fusion of spectral-spatial information is increasingly dominant for hyperspectral image (HSI) classification. However, due to insufficient samples, current feature fusion methods often neglect joint interactions. In this paper, to further improve the classification accuracy, we propose a dual-attention-guided interactive multi-scale residual network (DA-IMRN) to explore the joint spectral-spatial information and assign pixel-wise labels for HSIs without information leakage. In DA-IMRN, two branches focusing on spatial and spectral information separately are employed for feature extraction. A bidirectional-attention mechanism is employed to guide the interactive feature learning between two branches and promote refined feature maps. In addition, we extract deep multi-scale features corresponding to multiple receptive fields from limited samples via a multi-scale spectral/spatial residual block, to improve classification performance. Experimental results on three benchmark datasets (i.e., Salinas Valley, Pavia University, and Indian Pines) support that attention-guided multi-scale feature learning can effectively explore the joint spectral-spatial information. The proposed method outperforms state-of-the-art methods with the overall accuracy of 91.26%, 93.33%, and 82.38%, and the average accuracy of 94.22%, 89.61%, and 80.35%, respectively.
|
27
|
Sun F, Lu X, Chen G, Zhang X, Jiang K, Li J. A Novel Synthetic Lethality Prediction Method Based on Bidirectional Attention Learning. LECTURE NOTES IN COMPUTER SCIENCE 2022:356-363. [DOI: 10.1007/978-3-031-13829-4_30] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/03/2025]
|
28
|
Ou-Yang L, Lu F, Zhang ZC, Wu M. Matrix factorization for biomedical link prediction and scRNA-seq data imputation: an empirical survey. Brief Bioinform 2021; 23:6447434. [PMID: 34864871 DOI: 10.1093/bib/bbab479] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 08/09/2021] [Revised: 09/25/2021] [Accepted: 10/18/2021] [Indexed: 02/02/2023]
Abstract
Advances in high-throughput experimental technologies promote the accumulation of vast amounts of biomedical data. Biomedical link prediction and single-cell RNA-sequencing (scRNA-seq) data imputation are two essential tasks in biomedical data analyses, which can facilitate various downstream studies and provide insights into the mechanisms of complex diseases. Both tasks can be transformed into matrix completion problems. For a variety of matrix completion tasks, matrix factorization has shown promising performance. However, the sparseness and high dimensionality of biomedical networks and scRNA-seq data have raised new challenges. To resolve these issues, various matrix factorization methods have emerged recently. In this paper, we present a comprehensive review of such matrix factorization methods and their usage in biomedical link prediction and scRNA-seq data imputation. Moreover, we select representative matrix factorization methods and conduct a systematic empirical comparison on 15 real data sets to evaluate their performance under different scenarios. By summarizing the experimental results, we provide general guidelines for selecting matrix factorization methods for different biomedical matrix completion tasks and point out some future directions to further improve the performance for biomedical link prediction and scRNA-seq data imputation.
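The idea of treating link prediction as matrix completion can be sketched in a few lines of plain Python. This is a minimal, generic illustration (stochastic gradient descent on squared error with L2 regularization), not any of the specific methods compared in the survey. The toy interaction matrix, the function names `factorize`, `predict` and `squared_error`, and all hyperparameters are assumptions made for this example.

```python
import random

def predict(rows, cols, i, j):
    """Score for entry (i, j): dot product of row and column factors."""
    return sum(p * q for p, q in zip(rows[i], cols[j]))

def squared_error(rows, cols, observed):
    """Sum of squared errors over the observed entries only."""
    return sum((predict(rows, cols, i, j) - y) ** 2
               for (i, j), y in observed.items())

def factorize(observed, n_rows, n_cols, rank=2, lr=0.05, reg=0.01,
              epochs=300, seed=0):
    """Fit low-rank factors to observed entries with plain SGD."""
    rng = random.Random(seed)
    rows = [[rng.uniform(-0.5, 0.5) for _ in range(rank)] for _ in range(n_rows)]
    cols = [[rng.uniform(-0.5, 0.5) for _ in range(rank)] for _ in range(n_cols)]
    for _ in range(epochs):
        for (i, j), y in observed.items():
            err = predict(rows, cols, i, j) - y
            for f in range(rank):
                p, q = rows[i][f], cols[j][f]
                # Gradient step on squared error plus L2 penalty.
                rows[i][f] -= lr * (err * q + reg * p)
                cols[j][f] -= lr * (err * p + reg * q)
    return rows, cols

# Toy 4x4 interaction matrix (hypothetical data): 1 = known link, 0 = no link.
# Entries (1, 1) and (2, 3) are left unobserved, to be completed by the model.
observed = {
    (0, 0): 1, (0, 1): 1, (0, 2): 0, (0, 3): 0,
    (1, 0): 1, (1, 2): 0, (1, 3): 0,
    (2, 0): 0, (2, 1): 0, (2, 2): 1,
    (3, 0): 0, (3, 1): 0, (3, 2): 1, (3, 3): 1,
}

# epochs=0 returns the random initial factors for the same seed.
init_rows, init_cols = factorize(observed, 4, 4, epochs=0)
rows, cols = factorize(observed, 4, 4)

init_err = squared_error(init_rows, init_cols, observed)
fit_err = squared_error(rows, cols, observed)
print(f"error on observed entries: {init_err:.3f} -> {fit_err:.3f}")
print(f"score for unobserved (1, 1): {predict(rows, cols, 1, 1):.3f}")
print(f"score for unobserved (2, 3): {predict(rows, cols, 2, 3):.3f}")
```

The scores printed for the two unobserved entries are the model's completions. Real biomedical matrices are far sparser and higher-dimensional than this toy case, which is the difficulty the abstract highlights.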
Affiliation(s)
- Le Ou-Yang
- Guangdong Key Laboratory of Intelligent Information Processing, Shenzhen Key Laboratory of Media Security, and Guangdong Laboratory of Artificial Intelligence and Digital Economy (SZ), College of Electronics and Information Engineering, Shenzhen University, Shenzhen, 518060, China; Shenzhen Institute of Artificial Intelligence and Robotics for Society, Shenzhen, 518172, China
- Fan Lu
- Guangdong Key Laboratory of Intelligent Information Processing, Shenzhen Key Laboratory of Media Security, and Guangdong Laboratory of Artificial Intelligence and Digital Economy (SZ), College of Electronics and Information Engineering, Shenzhen University, Shenzhen, 518060, China
- Zi-Chao Zhang
- Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai, 200433, China
- Min Wu
- Institute for Infocomm Research (I2R), A*STAR, 138632, Singapore
|