1
Chen X, Cai R, Huang Z, Li Z, Zheng J, Wu M. Interpretable high-order knowledge graph neural network for predicting synthetic lethality in human cancers. Brief Bioinform 2025; 26:bbaf142. [PMID: 40194555 PMCID: PMC11975366 DOI: 10.1093/bib/bbaf142] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2024] [Revised: 02/21/2025] [Accepted: 03/07/2025] [Indexed: 04/09/2025] Open
Abstract
Synthetic lethality (SL) is a promising gene interaction for cancer therapy. Recent SL prediction methods integrate knowledge graphs (KGs) into graph neural networks (GNNs) and employ attention mechanisms to extract local subgraphs as explanations for target gene pairs. However, attention mechanisms often lack fidelity, typically generate a single explanation per gene pair, and fail to ensure trustworthy high-order structures in their explanations. To overcome these limitations, we propose Diverse Graph Information Bottleneck for Synthetic Lethality (DGIB4SL), a KG-based GNN that generates multiple faithful explanations for the same gene pair and effectively encodes high-order structures. Specifically, we introduce a novel DGIB objective, integrating a determinantal point process constraint into the standard information bottleneck objective, and employ 13 motif-based adjacency matrices to capture high-order structures in gene representations. Experimental results show that DGIB4SL outperforms state-of-the-art baselines and provides multiple explanations for SL prediction, revealing diverse biological mechanisms underlying SL inference.
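As a rough illustration of how a determinantal point process (DPP) term can reward diversity among several explanations, the sketch below scores a set of explanation embeddings by the log-determinant of their cosine-similarity kernel. This is a generic DPP diversity score, not the DGIB objective from the paper; the function name `dpp_log_det`, the toy embedding matrices, and the `eps` jitter are assumptions made for the example.

```python
import numpy as np

def dpp_log_det(Z, eps=1e-6):
    """Log-determinant of the cosine-similarity kernel of the rows of Z.

    Z is a (K, d) array with one embedding per candidate explanation
    (hypothetical input). Higher values mean the rows are more diverse.
    """
    Z = np.asarray(Z, dtype=float)
    Z = Z / np.linalg.norm(Z, axis=1, keepdims=True)   # unit-norm rows
    L = Z @ Z.T                                        # K x K similarity kernel
    # Small jitter keeps the kernel positive definite when rows nearly coincide.
    _, logdet = np.linalg.slogdet(L + eps * np.eye(len(Z)))
    return logdet

# Three mutually orthogonal embeddings versus three near-duplicates.
diverse = np.eye(3)
redundant = np.array([[1.0, 0.00, 0.00],
                      [1.0, 0.01, 0.00],
                      [1.0, 0.00, 0.01]])

print(dpp_log_det(diverse), dpp_log_det(redundant))
```

Near-duplicate embeddings make the kernel almost singular, so the score drops sharply, while mutually orthogonal embeddings score close to zero, which is the upper bound for unit-norm rows (up to the jitter).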
Affiliation(s)
- Xuexin Chen
  - School of Computer Science, Guangdong University of Technology, No. 100 Waihuan Xi Road, Panyu, Guangdong, Guangzhou, 510006, China
- Ruichu Cai
  - School of Computer Science, Guangdong University of Technology, No. 100 Waihuan Xi Road, Panyu, Guangdong, Guangzhou, 510006, China
  - Pazhou Laboratory (Huangpu), No. 248 Pazhou Qiaotou Street, Haizhu, Guangdong Province, Guangzhou, 510335, China
- Zhengting Huang
  - School of Computer Science, Guangdong University of Technology, No. 100 Waihuan Xi Road, Panyu, Guangdong, Guangzhou, 510006, China
- Zijian Li
  - Machine Learning Department, Mohamed bin Zayed University of Artificial Intelligence, Masdar, Abu Dhabi, United Arab Emirates
- Jie Zheng
  - School of Information Science and Technology, ShanghaiTech University, No. 393 Huaxia Middle Road, Pudong, Shanghai, 201210, China
  - School of Information Science and Technology, Shanghai Engineering Research Center of Intelligent Vision and Imaging, ShanghaiTech University, No. 393 Huaxia Middle Road, Pudong, Shanghai, 201210, China
- Min Wu
  - Institute for Infocomm Research (I²R), A*STAR, No. 2 Fusionopolis Way, Queenstown Planning, Singapore 138632, Singapore
2
Cantore T, Gasperini P, Bevilacqua R, Ciani Y, Sinha S, Ruppin E, Demichelis F. PRODE recovers essential and context-essential genes through neighborhood-informed scores. Genome Biol 2025; 26:42. [PMID: 40022167 PMCID: PMC11869679 DOI: 10.1186/s13059-025-03501-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/19/2024] [Accepted: 02/05/2025] [Indexed: 03/03/2025] Open
Abstract
Gene context-essentiality assessment supports precision oncology opportunities. The variability of gene effects inference from loss-of-function screenings across models and technologies limits identifying robust hits. We propose a computational framework named PRODE that integrates gene effects with protein-protein interactions to generate neighborhood-informed essential (NIE) and neighborhood-informed context essential (NICE) scores. It outperforms the canonical gene effect approach in recovering missed essential genes in shRNA screens and prioritizing context-essential hits from CRISPR-KO screens, as supported by in vitro validations. Applied to HER2+ breast cancer tumor samples, PRODE identifies oxidative phosphorylation genes as vulnerabilities with prognostic value, highlighting new therapeutic opportunities.
Affiliation(s)
- Thomas Cantore
  - Laboratory of Computational and Functional Oncology, Department of Cellular, Computational, and Integrative Biology, University of Trento, Via Sommarive 9, Trento, 38123, Italy
- Paola Gasperini
  - Laboratory of Computational and Functional Oncology, Department of Cellular, Computational, and Integrative Biology, University of Trento, Via Sommarive 9, Trento, 38123, Italy
- Riccardo Bevilacqua
  - Laboratory of Computational and Functional Oncology, Department of Cellular, Computational, and Integrative Biology, University of Trento, Via Sommarive 9, Trento, 38123, Italy
- Yari Ciani
  - Laboratory of Computational and Functional Oncology, Department of Cellular, Computational, and Integrative Biology, University of Trento, Via Sommarive 9, Trento, 38123, Italy
- Sanju Sinha
  - Cancer Data Science Laboratory (CDSL), Center for Cancer Research (CCR), National Cancer Institute (NCI), National Institutes of Health (NIH), Bethesda, MD, USA
  - Currently at Sanford Burnham Prebys Medical Discovery Institute, San Diego, CA, USA
- Eytan Ruppin
  - Cancer Data Science Laboratory (CDSL), Center for Cancer Research (CCR), National Cancer Institute (NCI), National Institutes of Health (NIH), Bethesda, MD, USA
- Francesca Demichelis
  - Laboratory of Computational and Functional Oncology, Department of Cellular, Computational, and Integrative Biology, University of Trento, Via Sommarive 9, Trento, 38123, Italy
3
Feng Y, Zhou L, Ma C, Zheng Y, He R, Li Y. Knowledge graph-based thought: a knowledge graph-enhanced LLM framework for pan-cancer question answering. Gigascience 2025; 14:giae082. [PMID: 39775838 PMCID: PMC11702363 DOI: 10.1093/gigascience/giae082] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/27/2024] [Revised: 08/14/2024] [Accepted: 10/02/2024] [Indexed: 01/11/2025] Open
Abstract
BACKGROUND In recent years, large language models (LLMs) have shown promise in various domains, notably in biomedical sciences. However, their real-world application is often limited by issues like erroneous outputs and hallucinatory responses. RESULTS We developed the knowledge graph-based thought (KGT) framework, an innovative solution that integrates LLMs with knowledge graphs (KGs) to improve their initial responses by utilizing verifiable information from KGs, thus significantly reducing factual errors in reasoning. The KGT framework demonstrates strong adaptability and performs well across various open-source LLMs. Notably, KGT can facilitate the discovery of new uses for existing drugs through potential drug-cancer associations and can assist in predicting resistance by analyzing relevant biomarkers and genetic mechanisms. To evaluate the knowledge graph question answering task within biomedicine, we utilize a pan-cancer knowledge graph to develop a pan-cancer question answering benchmark, named pan-cancer question answering. CONCLUSIONS The KGT framework substantially improves the accuracy and utility of LLMs in the biomedical field. This study serves as a proof of concept, demonstrating its exceptional performance in biomedical question answering.
Affiliation(s)
- Yichun Feng
  - Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, 310024 Hangzhou, China
  - Guangzhou National Laboratory, Guangzhou International Bio Island, 510005 Guangzhou, China
- Lu Zhou
  - Guangzhou National Laboratory, Guangzhou International Bio Island, 510005 Guangzhou, China
- Chao Ma
  - Smartquerier Gene Technology (Shanghai) Co., Ltd., 200100 Shanghai, China
- Yikai Zheng
  - Guangzhou National Laboratory, Guangzhou International Bio Island, 510005 Guangzhou, China
- Ruikun He
  - BYHEALTH Institute of Nutrition & Health, 510663 Guangzhou, China
  - Shanghai Institute of Nutrition and Health, Chinese Academy of Sciences Shanghai, 200030 Shanghai, China
- Yixue Li
  - Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, 310024 Hangzhou, China
  - Guangzhou National Laboratory, Guangzhou International Bio Island, 510005 Guangzhou, China
4
Jiang J, Ma Y, Yang L, Ma S, Yu Z, Ren X, Kong X, Zhang X, Li D, Liu Z. CTR-DB 2.0: an updated cancer clinical transcriptome resource, expanding primary drug resistance and newly adding acquired resistance datasets and enhancing the discovery and validation of predictive biomarkers. Nucleic Acids Res 2025; 53:D1335-D1347. [PMID: 39494527 PMCID: PMC11701710 DOI: 10.1093/nar/gkae993] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/14/2024] [Revised: 10/02/2024] [Accepted: 10/23/2024] [Indexed: 11/05/2024] Open
Abstract
Drug resistance is a principal limiting factor in cancer treatment. CTR-DB, the Cancer Treatment Response gene signature DataBase, is the first data resource for clinical transcriptomes with cancer treatment response, and meanwhile supports various data analysis functions, providing insights into the molecular determinants of drug resistance. Here we proposed an upgraded version, CTR-DB 2.0 (http://ctrdb.ncpsb.org.cn). Around 190 up-to-date source datasets with primary resistance information (129% increase compared to version 1.0) and 13 acquired-resistant datasets (a new dataset type), covering 10 856 patient samples (111% increase), 39 cancer types (39% increase) and 346 therapeutic regimens (26% increase), have been collected. In terms of function, for the single dataset analysis and multiple-dataset comparison modules, CTR-DB 2.0 added new gene set enrichment, tumor microenvironment (TME) and signature connectivity analysis functions to help elucidate drug resistance mechanisms and their homogeneity/heterogeneity and discover candidate combinational therapies. Furthermore, biomarker-related functions were greatly extended. CTR-DB 2.0 newly supported the validation of cell types in the TME as predictive biomarkers of treatment response, especially the validation of a combinational biomarker panel and even the direct discovery of the optimal biomarker panel using user-customized CTR-DB patient samples. In addition, the analysis of users' own datasets, application programming interface and data crowdfunding were also added.
Affiliation(s)
- Jianzhou Jiang
  - College of Life Sciences, Hebei University, Baoding 071002, China
  - State Key Laboratory of Medical Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing 102206, China
- Yajie Ma
  - State Key Laboratory of Medical Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing 102206, China
  - College of Chemistry and Materials Science, Key Laboratory of Medicinal Chemistry and Molecular Diagnosis (Hebei University), Hebei University, Baoding 071002, China
- Lele Yang
  - State Key Laboratory of Medical Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing 102206, China
  - College of Chemistry and Materials Science, Key Laboratory of Medicinal Chemistry and Molecular Diagnosis (Hebei University), Hebei University, Baoding 071002, China
- Shurui Ma
  - State Key Laboratory of Medical Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing 102206, China
  - School of Basic Medicine, Anhui Medical University, Hefei 230032, China
- Zixuan Yu
  - College of Life Sciences, Hebei University, Baoding 071002, China
  - State Key Laboratory of Medical Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing 102206, China
- Xinyi Ren
  - State Key Laboratory of Medical Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing 102206, China
  - School of Basic Medicine, Anhui Medical University, Hefei 230032, China
- Xiangya Kong
  - Beijing Cloudna Technology Company, Limited, Beijing 100029, China
- Xinlei Zhang
  - Beijing Cloudna Technology Company, Limited, Beijing 100029, China
- Dong Li
  - State Key Laboratory of Medical Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing 102206, China
  - College of Chemistry and Materials Science, Key Laboratory of Medicinal Chemistry and Molecular Diagnosis (Hebei University), Hebei University, Baoding 071002, China
- Zhongyang Liu
  - College of Life Sciences, Hebei University, Baoding 071002, China
  - State Key Laboratory of Medical Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing 102206, China
  - College of Chemistry and Materials Science, Key Laboratory of Medicinal Chemistry and Molecular Diagnosis (Hebei University), Hebei University, Baoding 071002, China
  - School of Basic Medicine, Anhui Medical University, Hefei 230032, China
5
Wang Z, Yuan Y, Wang Z, Zhang W, Chen C, Duan Z, Peng S, Zheng J, He Y, Yang X. CancerPro: deciphering the pan-cancer prognostic landscape through combinatorial enrichment analysis and knowledge network insights. NAR Genom Bioinform 2024; 6:lqae157. [PMID: 39633722 PMCID: PMC11616677 DOI: 10.1093/nargab/lqae157] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/25/2024] [Revised: 08/26/2024] [Accepted: 10/30/2024] [Indexed: 12/07/2024] Open
Abstract
Gene expression levels serve as valuable markers for assessing prognosis in cancer patients. To understand the mechanisms underlying prognosis and explore potential therapeutics across diverse cancers, we developed CancerPro (https://medcode.link/cancerpro). This knowledge network platform integrates comprehensive biomedical data on genes, drugs, diseases and pathways, along with their interactions. By integrating ontology and knowledge graph technologies, CancerPro offers a user-friendly interface for analyzing pan-cancer prognostic markers and exploring genes or drugs of interest. CancerPro implements three core functions: gene set enrichment analysis based on multiple annotations; in-depth drug analysis; and in-depth gene list analysis. Using CancerPro, we categorized genes and cancers into distinct groups and utilized network analysis to identify key biological pathways associated with unfavorable prognostic genes. The platform further pinpoints potential drug targets and explores potential links between prognostic markers and patient characteristics such as glutathione levels and obesity. For renal and prostate cancer, CancerPro identified risk genes linked to immune deficiency pathways and alternative splicing abnormalities. This research highlights CancerPro's potential as a valuable tool for researchers to explore pan-cancer prognostic markers and uncover novel therapeutic avenues. Its flexible tools support a wide range of biological investigations, making it a versatile asset in cancer research and beyond.
Affiliation(s)
- Zhigang Wang
  - Department of Biomedical Engineering, Institute of Basic Medical Sciences Chinese Academy of Medical Sciences, School of Basic Medicine Peking Union Medical College, Beijing 100005, China
- Yize Yuan
  - Department of Biomedical Engineering, Institute of Basic Medical Sciences Chinese Academy of Medical Sciences, School of Basic Medicine Peking Union Medical College, Beijing 100005, China
- Zhe Wang
  - Department of Biomedical Engineering, Institute of Basic Medical Sciences Chinese Academy of Medical Sciences, School of Basic Medicine Peking Union Medical College, Beijing 100005, China
- Wenjia Zhang
  - Department of Biomedical Engineering, Institute of Basic Medical Sciences Chinese Academy of Medical Sciences, School of Basic Medicine Peking Union Medical College, Beijing 100005, China
- Chong Chen
  - Department of Immunology, Institute of Basic Medical Sciences Chinese Academy of Medical Sciences, School of Basic Medicine Peking Union Medical College, Beijing 100005, China
- Zhaojun Duan
  - Department of Immunology, Institute of Basic Medical Sciences Chinese Academy of Medical Sciences, School of Basic Medicine Peking Union Medical College, Beijing 100005, China
- Suyuan Peng
  - Institute of Information on Traditional Chinese Medicine, China Academy of Chinese Medical Sciences, Beijing, China
- Jie Zheng
  - Unit for Laboratory Animal Medicine, Department of Microbiology and Immunology, Center for Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, MI 48109, USA
- Yongqun He
  - Unit for Laboratory Animal Medicine, Department of Microbiology and Immunology, Center for Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, MI 48109, USA
- Xiaolin Yang
  - Department of Biomedical Engineering, Institute of Basic Medical Sciences Chinese Academy of Medical Sciences, School of Basic Medicine Peking Union Medical College, Beijing 100005, China
6
Geraghty S, Boyer JA, Fazel-Zarandi M, Arzouni N, Ryseck RP, McBride MJ, Parsons LR, Rabinowitz JD, Singh M. Integrative Computational Framework, Dyscovr, Links Mutated Driver Genes to Expression Dysregulation Across 19 Cancer Types. bioRxiv 2024:2024.11.20.624509. [PMID: 39605479 PMCID: PMC11601522 DOI: 10.1101/2024.11.20.624509] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2024]
Abstract
Though somatic mutations play a critical role in driving cancer initiation and progression, the systems-level functional impacts of these mutations (particularly, how they alter expression across the genome and give rise to cancer hallmarks) are not yet well understood, even for well-studied cancer driver genes. To address this, we designed an integrative machine learning model, Dyscovr, that leverages mutation, gene expression, copy number alteration (CNA), methylation, and clinical data to uncover putative relationships between nonsynonymous mutations in key cancer driver genes and transcriptional changes across the genome. We applied Dyscovr pan-cancer and within 19 individual cancer types, finding both broadly relevant and cancer type-specific links between driver genes and putative targets, including a subset we further identify as exhibiting negative genetic relationships. Our work newly implicates, and validates in cell lines, KBTBD2 and mutant PIK3CA as putative synthetic lethals in breast cancer, suggesting a novel combinatorial treatment approach.
Affiliation(s)
- Sara Geraghty
  - Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544
- Jacob A. Boyer
  - Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544
  - Ludwig Cancer Institute, Princeton Branch, Princeton University, Princeton, NJ 08554
- Mahya Fazel-Zarandi
  - Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544
- Nibal Arzouni
  - Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544
- Rolf-Peter Ryseck
  - Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544
- Matthew J. McBride
  - Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544
  - Department of Chemical Biology, Ernest Mario School of Pharmacy, Rutgers University, Piscataway, NJ 08854
- Lance R. Parsons
  - Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544
- Joshua D. Rabinowitz
  - Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544
  - Ludwig Cancer Institute, Princeton Branch, Princeton University, Princeton, NJ 08554
  - Department of Chemistry, Princeton University, Princeton, NJ 08544
- Mona Singh
  - Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544
  - Department of Computer Science, Princeton University, Princeton, NJ 08544
  - Lead Contact
7
Feng Y, Long Y, Wang H, Ouyang Y, Li Q, Wu M, Zheng J. Benchmarking machine learning methods for synthetic lethality prediction in cancer. Nat Commun 2024; 15:9058. [PMID: 39428397 PMCID: PMC11491473 DOI: 10.1038/s41467-024-52900-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/27/2023] [Accepted: 09/23/2024] [Indexed: 10/22/2024] Open
Abstract
Synthetic lethality (SL) is a gold mine of anticancer drug targets, exposing cancer-specific dependencies of cellular survival. To complement resource-intensive experimental screening, many machine learning methods for SL prediction have emerged recently. However, a comprehensive benchmarking is lacking. This study systematically benchmarks 12 recent machine learning methods for SL prediction, assessing their performance across diverse data splitting scenarios, negative sample ratios, and negative sampling techniques, on both classification and ranking tasks. We observe that all the methods can perform significantly better by improving data quality, e.g., excluding computationally derived SLs from training and sampling negative labels based on gene expression. Among the methods, SLMGAE performs the best. Furthermore, the methods have limitations in realistic scenarios such as cold-start independent tests and context-specific SLs. These results, together with source code and datasets made freely available, provide guidance for selecting suitable methods and developing more powerful techniques for SL virtual screening.
Affiliation(s)
- Yimiao Feng
  - School of Information Science and Technology, ShanghaiTech University, Shanghai, China
  - Lingang Laboratory, Shanghai, China
- Yahui Long
  - Bioinformatics Institute (BII), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
- He Wang
  - School of Information Science and Technology, ShanghaiTech University, Shanghai, China
- Yang Ouyang
  - School of Information Science and Technology, ShanghaiTech University, Shanghai, China
- Quan Li
  - School of Information Science and Technology, ShanghaiTech University, Shanghai, China
- Min Wu
  - Institute for Infocomm Research, Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
- Jie Zheng
  - School of Information Science and Technology, ShanghaiTech University, Shanghai, China
  - Shanghai Engineering Research Center of Intelligent Vision and Imaging, Shanghai, China
8
Hu Y, Oleshko S, Firmani S, Zhu Z, Cheng H, Ulmer M, Arnold M, Colomé-Tatché M, Tang J, Xhonneux S, Marsico A. BioPathNet: Enhancing Link Prediction in Biomedical Knowledge Graphs through Path Representation Learning. Research Square 2024:rs.3.rs-5057842. [PMID: 39372928 PMCID: PMC11451641 DOI: 10.21203/rs.3.rs-5057842/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/08/2024]
Abstract
Understanding complex interactions in biomedical networks is crucial for advancements in biomedicine, but traditional link prediction (LP) methods are limited in capturing this complexity. Representation-based learning techniques improve prediction accuracy by mapping nodes to low-dimensional embeddings, yet they often struggle with interpretability and scalability. We present BioPathNet, a novel graph neural network framework based on the Neural Bellman-Ford Network (NBFNet), addressing these limitations through path-based reasoning for LP in biomedical knowledge graphs. Unlike node-embedding frameworks, BioPathNet learns representations between node pairs by considering all relations along paths, enhancing prediction accuracy and interpretability. This allows visualization of influential paths and facilitates biological validation. BioPathNet leverages a background regulatory graph (BRG) for enhanced message passing and uses stringent negative sampling to improve precision. In evaluations across various LP tasks, such as gene function annotation, drug-disease indication, synthetic lethality, and lncRNA-mRNA interaction prediction, BioPathNet consistently outperformed shallow node embedding methods, relational graph neural networks and task-specific state-of-the-art methods, demonstrating robust performance and versatility. Our study predicts novel drug indications for diseases like acute lymphoblastic leukemia (ALL) and Alzheimer's, validated by medical experts and clinical trials. We also identified new synthetic lethality gene pairs and regulatory interactions involving lncRNAs and target genes, confirmed through literature reviews. BioPathNet's interpretability will enable researchers to trace prediction paths and gain molecular insights, making it a valuable tool for drug discovery, personalized medicine and biology in general.
Affiliation(s)
- Yue Hu
  - Computational Health Center, Helmholtz Center Munich, Ingolstaedter Landstrasse 1, Neuherberg, 85764, Bavaria, Germany
  - School of Life Sciences, Technical University of Munich, Alte Akademie 8, Freising, 85354, Bavaria, Germany
- Svitlana Oleshko
  - Computational Health Center, Helmholtz Center Munich, Ingolstaedter Landstrasse 1, Neuherberg, 85764, Bavaria, Germany
  - School of Computation, Information and Technology, Technical University of Munich, Arcisstrasse 21, Munich, 80333, Bavaria, Germany
- Samuele Firmani
  - Computational Health Center, Helmholtz Center Munich, Ingolstaedter Landstrasse 1, Neuherberg, 85764, Bavaria, Germany
  - School of Computation, Information and Technology, Technical University of Munich, Arcisstrasse 21, Munich, 80333, Bavaria, Germany
- Zhaocheng Zhu
  - Department, Mila - Québec AI Institute, 6666 St-Urbain, Montréal, QC H2S 3H1, Quebec, Canada
  - Department, Université de Montréal, 2900, boul. Édouard-Montpetit, Montréal, QC H3T 1J4, Quebec, Canada
- Hui Cheng
  - School of Computation, Information and Technology, Technical University of Munich, Arcisstrasse 21, Munich, 80333, Bavaria, Germany
- Maria Ulmer
  - Computational Health Center, Helmholtz Center Munich, Ingolstaedter Landstrasse 1, Neuherberg, 85764, Bavaria, Germany
  - School of Life Sciences, Technical University of Munich, Alte Akademie 8, Freising, 85354, Bavaria, Germany
- Matthias Arnold
  - Computational Health Center, Helmholtz Center Munich, Ingolstaedter Landstrasse 1, Neuherberg, 85764, Bavaria, Germany
  - Department of Psychiatry and Behavioural Sciences, Duke University, 905 W Main St., Durham, NC 27701, North Carolina, United States
- Maria Colomé-Tatché
  - Computational Health Center, Helmholtz Center Munich, Ingolstaedter Landstrasse 1, Neuherberg, 85764, Bavaria, Germany
  - School of Life Sciences, Technical University of Munich, Alte Akademie 8, Freising, 85354, Bavaria, Germany
  - Faculty of Biology, Ludwig-Maximilian University of Munich, Grosshaderner Str. 2, Planegg-Martinsried, 82152, Bavaria, Germany
- Jian Tang
  - Department, Mila - Québec AI Institute, 6666 St-Urbain, Montréal, QC H2S 3H1, Quebec, Canada
  - Department, CIFAR AI Chair, 661 University Ave, Toronto, ON M5G 1M1, Ontario, Canada
  - Department, HEC Montréal, 3000 Chem. de la Côte-Sainte-Catherine, Montréal, QC H3T 2A7, Quebec, Canada
- Sophie Xhonneux
  - Department, Mila - Québec AI Institute, 6666 St-Urbain, Montréal, QC H2S 3H1, Quebec, Canada
  - Department, Université de Montréal, 2900, boul. Édouard-Montpetit, Montréal, QC H3T 1J4, Quebec, Canada
- Annalisa Marsico
  - Computational Health Center, Helmholtz Center Munich, Ingolstaedter Landstrasse 1, Neuherberg, 85764, Bavaria, Germany
9
Messa L, Testa C, Carelli S, Rey F, Jacchetti E, Cereda C, Raimondi MT, Ceri S, Pinoli P. Non-Negative Matrix Tri-Factorization for Representation Learning in Multi-Omics Datasets with Applications to Drug Repurposing and Selection. Int J Mol Sci 2024; 25:9576. [PMID: 39273521 PMCID: PMC11394968 DOI: 10.3390/ijms25179576] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/12/2024] [Revised: 08/18/2024] [Accepted: 08/20/2024] [Indexed: 09/15/2024] Open
Abstract
The vast corpus of heterogeneous biomedical data stored in databases, ontologies, and terminologies presents a unique opportunity for drug design. Integrating and fusing these sources is essential to develop data representations that can be analyzed using artificial intelligence methods to generate novel drug candidates or hypotheses. Here, we propose Non-Negative Matrix Tri-Factorization as an invaluable tool for integrating and fusing data, as well as for representation learning. Additionally, we demonstrate how representations learned by Non-Negative Matrix Tri-Factorization can effectively be utilized by traditional artificial intelligence methods. While this approach is domain-agnostic and applicable to any field with vast amounts of structured and semi-structured data, we apply it specifically to computational pharmacology and drug repurposing. This field is poised to benefit significantly from artificial intelligence, particularly in personalized medicine. We conducted extensive experiments to evaluate the performance of the proposed method, yielding exciting results, particularly compared to traditional methods. Novel drug-target predictions have also been validated in the literature, further confirming their validity. Additionally, we tested our method to predict drug synergism, where constructing a classical matrix dataset is challenging. The method demonstrated great flexibility, suggesting its applicability to a wide range of tasks in drug design and discovery.
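For readers unfamiliar with the technique named in this abstract, non-negative matrix tri-factorization approximates a non-negative matrix X by a product F S Gᵀ of three non-negative factors. The sketch below applies the standard multiplicative updates for the Frobenius loss to a toy random matrix. It is not the authors' implementation; the function name `nmtf`, the ranks, the iteration count, and the toy data are assumptions chosen for illustration.

```python
import numpy as np

def nmtf(X, k1, k2, n_iter=200, eps=1e-9, seed=0):
    """Factor a non-negative (n, m) matrix X as F @ S @ G.T.

    F is (n, k1), S is (k1, k2), G is (m, k2); all stay non-negative
    because each update multiplies by a ratio of non-negative terms.
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    F = rng.random((n, k1)) + eps
    S = rng.random((k1, k2)) + eps
    G = rng.random((m, k2)) + eps
    for _ in range(n_iter):
        # Each ratio is (negative part of gradient) / (positive part).
        F *= (X @ G @ S.T) / (F @ S @ G.T @ G @ S.T + eps)
        G *= (X.T @ F @ S) / (G @ S.T @ F.T @ F @ S + eps)
        S *= (F.T @ X @ G) / (F.T @ F @ S @ G.T @ G + eps)
    return F, S, G

# Toy non-negative matrix of rank 3 (for example, rows as drugs, columns as targets).
rng = np.random.default_rng(1)
X = rng.random((8, 3)) @ rng.random((3, 6))

F, S, G = nmtf(X, k1=3, k2=3)
err = np.linalg.norm(X - F @ S @ G.T)
print(f"reconstruction error: {err:.4f}")
```

The rows of F and G can then serve as low-dimensional representations of the row and column entities, while S captures how the two sets of latent groups interact.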
Affiliation(s)
- Letizia Messa
- Department of Electronics, Information and Bioengineering (DEIB), Politecnico di Milano, 20133 Milan, Italy
| | - Carolina Testa
- Department of Electronics, Information and Bioengineering (DEIB), Politecnico di Milano, 20133 Milan, Italy
| | - Stephana Carelli
- Center of Functional Genomics and Rare Diseases, Buzzi Children's Hospital, 20154 Milan, Italy
- Pediatric Clinical Research Center "Fondazione Romeo ed Enrica Invernizzi", Department of Biomedical and Clinical Sciences, Università degli Studi di Milano, 20157 Milan, Italy
| | - Federica Rey
- Pediatric Clinical Research Center "Fondazione Romeo ed Enrica Invernizzi", Department of Biomedical and Clinical Sciences, Università degli Studi di Milano, 20157 Milan, Italy
| | - Emanuela Jacchetti
- Department of Chemistry, Materials and Chemical Engineering "Giulio Natta", Politecnico di Milano, 20133 Milan, Italy
| | - Cristina Cereda
- Center of Functional Genomics and Rare Diseases, Buzzi Children's Hospital, 20154 Milan, Italy
- Manuela Teresa Raimondi
- Department of Chemistry, Materials and Chemical Engineering "Giulio Natta", Politecnico di Milano, 20133 Milan, Italy
- Stefano Ceri
- Department of Electronics, Information and Bioengineering (DEIB), Politecnico di Milano, 20133 Milan, Italy
- Pietro Pinoli
- Department of Electronics, Information and Bioengineering (DEIB), Politecnico di Milano, 20133 Milan, Italy
|
10
|
Buccioli G, Testa C, Jacchetti E, Pinoli P, Carelli S, Ceri S, Raimondi MT. The molecular basis of the anticancer effect of statins. Sci Rep 2024; 14:20298. [PMID: 39217242 PMCID: PMC11365972 DOI: 10.1038/s41598-024-71240-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/03/2024] [Accepted: 08/26/2024] [Indexed: 09/04/2024] Open
Abstract
Statins, widely used cardiovascular drugs that lower cholesterol by inhibiting HMG-CoA reductase, have been increasingly recognized for their potential anticancer properties. This study elucidates the underlying mechanism, revealing that statins exploit Synthetic Lethality, a principle where the co-occurrence of two non-lethal events leads to cell death. Our computational analysis of approximately 37,000 SL pairs identified statins as potential drugs targeting genes involved in SL pairs with metastatic genes. In vitro validation on various cancer cell lines confirmed the anticancer efficacy of statins. This data-driven drug repurposing strategy provides a molecular basis for the anticancer effects of statins, offering translational opportunities in oncology.
Affiliation(s)
- Giovanni Buccioli
- Department of Chemistry, Materials and Chemical Engineering "Giulio Natta", Politecnico di Milano, Milan, Italy
- Carolina Testa
- Department of Chemistry, Materials and Chemical Engineering "Giulio Natta", Politecnico di Milano, Milan, Italy
- Emanuela Jacchetti
- Department of Chemistry, Materials and Chemical Engineering "Giulio Natta", Politecnico di Milano, Milan, Italy
- Pietro Pinoli
- Department of Electronics, Information and Bioengineering, Politecnico di Milano, Milan, Italy
- Stephana Carelli
- Center of Functional Genomics and Rare Diseases, Buzzi Children's Hospital, Milan, Italy
- Stefano Ceri
- Department of Electronics, Information and Bioengineering, Politecnico di Milano, Milan, Italy.
- Manuela T Raimondi
- Department of Chemistry, Materials and Chemical Engineering "Giulio Natta", Politecnico di Milano, Milan, Italy.
|
11
|
Hu Y, Oleshko S, Firmani S, Zhu Z, Cheng H, Ulmer M, Arnold M, Colomé-Tatché M, Tang J, Xhonneux S, Marsico A. Path-based reasoning for biomedical knowledge graphs with BioPathNet. bioRxiv 2024:2024.06.17.599219. [PMID: 39149355 PMCID: PMC11326122 DOI: 10.1101/2024.06.17.599219] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/17/2024]
Abstract
Understanding complex interactions in biomedical networks is crucial for advancements in biomedicine, but traditional link prediction (LP) methods are limited in capturing this complexity. Representation-based learning techniques improve prediction accuracy by mapping nodes to low-dimensional embeddings, yet they often struggle with interpretability and scalability. We present BioPathNet, a novel graph neural network framework based on the Neural Bellman-Ford Network (NBFNet), addressing these limitations through path-based reasoning for LP in biomedical knowledge graphs. Unlike node-embedding frameworks, BioPathNet learns representations between node pairs by considering all relations along paths, enhancing prediction accuracy and interpretability. This allows visualization of influential paths and facilitates biological validation. BioPathNet leverages a background regulatory graph (BRG) for enhanced message passing and uses stringent negative sampling to improve precision. In evaluations across various LP tasks, such as gene function annotation, drug-disease indication, synthetic lethality, and lncRNA-mRNA interaction prediction, BioPathNet consistently outperformed shallow node embedding methods, relational graph neural networks and task-specific state-of-the-art methods, demonstrating robust performance and versatility. Our study predicts novel drug indications for diseases like acute lymphoblastic leukemia (ALL) and Alzheimer's, validated by medical experts and clinical trials. We also identified new synthetic lethality gene pairs and regulatory interactions involving lncRNAs and target genes, confirmed through literature reviews. BioPathNet's interpretability will enable researchers to trace prediction paths and gain molecular insights, making it a valuable tool for drug discovery, personalized medicine and biology in general.
Affiliation(s)
- Yue Hu
- Computational Health Center, Helmholtz Center Munich, Ingolstaedter Landstrasse 1, Neuherberg, 85764, Bavaria, Germany
- School of Life Sciences, Technical University of Munich, Alte Akademie 8, Freising, 85354, Bavaria, Germany
- Svitlana Oleshko
- Computational Health Center, Helmholtz Center Munich, Ingolstaedter Landstrasse 1, Neuherberg, 85764, Bavaria, Germany
- School of Computation, Information and Technology, Technical University of Munich, Arcisstrasse 21, Munich, 80333, Bavaria, Germany
- Samuele Firmani
- Computational Health Center, Helmholtz Center Munich, Ingolstaedter Landstrasse 1, Neuherberg, 85764, Bavaria, Germany
- Zhaocheng Zhu
- Department, Mila - Québec AI Institute, 6666 St-Urbain, Montréal, QC H2S 3H1, Quebec, Canada
- Department, Université de Montréal, 2900, boul. Édouard-Montpetit, Montréal, QC H3T 1J4, Quebec, Canada
- Hui Cheng
- School of Computation, Information and Technology, Technical University of Munich, Arcisstrasse 21, Munich, 80333, Bavaria, Germany
- Maria Ulmer
- Computational Health Center, Helmholtz Center Munich, Ingolstaedter Landstrasse 1, Neuherberg, 85764, Bavaria, Germany
- School of Life Sciences, Technical University of Munich, Alte Akademie 8, Freising, 85354, Bavaria, Germany
- Matthias Arnold
- Computational Health Center, Helmholtz Center Munich, Ingolstaedter Landstrasse 1, Neuherberg, 85764, Bavaria, Germany
- Department of Psychiatry and Behavioural Sciences, Duke University, 905 W Main St., Durham, NC 27701, North Carolina, United States
- Maria Colomé-Tatché
- Computational Health Center, Helmholtz Center Munich, Ingolstaedter Landstrasse 1, Neuherberg, 85764, Bavaria, Germany
- School of Life Sciences, Technical University of Munich, Alte Akademie 8, Freising, 85354, Bavaria, Germany
- Faculty of Biology, Ludwig-Maximilian University of Munich, Grosshaderner Str. 2, Planegg-Martinsried, 82152, Bavaria, Germany
- Jian Tang
- Department, Mila - Québec AI Institute, 6666 St-Urbain, Montréal, QC H2S 3H1, Quebec, Canada
- Department, CIFAR AI Chair, 661 University Ave, Toronto, ON M5G 1M1, Ontario, Canada
- Department, HEC Montréal, 3000 Chem. de la Côte-Sainte-Catherine, Montréal, QC H3T 2A7, Quebec, Canada
- Sophie Xhonneux
- Department, Mila - Québec AI Institute, 6666 St-Urbain, Montréal, QC H2S 3H1, Quebec, Canada
- Department, Université de Montréal, 2900, boul. Édouard-Montpetit, Montréal, QC H3T 1J4, Quebec, Canada
- Annalisa Marsico
- Computational Health Center, Helmholtz Center Munich, Ingolstaedter Landstrasse 1, Neuherberg, 85764, Bavaria, Germany
|
12
|
Wooller SK, Pearl LH, Pearl FMG. Identifying actionable synthetically lethal cancer gene pairs using mutual exclusivity. FEBS Lett 2024; 598:2028-2039. [PMID: 38977941 DOI: 10.1002/1873-3468.14950] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/03/2023] [Revised: 04/25/2024] [Accepted: 05/09/2024] [Indexed: 07/10/2024]
Abstract
Mutually exclusive loss-of-function alterations in gene pairs are those that occur together less frequently than may be expected and may denote a synthetically lethal relationship (SSL) between the genes. SSLs can be exploited therapeutically to selectively kill cancer cells. Here, we analysed mutation, copy number variation, and methylation levels in samples from The Cancer Genome Atlas, using the hypergeometric and the Poisson binomial tests to identify mutually exclusive inactivated genes. We focused on gene pairs where one is an inactivated tumour suppressor and the other a gene whose protein product can be inhibited by known drugs. This provided an abundance of potential targeted therapeutics and repositioning opportunities for several cancers. These data are available on the MexDrugs website, https://bioinformaticslab.sussex.ac.uk/mexdrugs.
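As a rough illustration of the hypergeometric test mentioned in this abstract, the sketch below computes a left-tail p-value for how rarely two genes are inactivated in the same sample. The function names and the toy counts are hypothetical and are not taken from the paper; the authors' actual pipeline (including the Poisson binomial test) may differ.

```python
from math import comb

def hypergeom_cdf(k, N, K, n):
    """P(X <= k) for X ~ Hypergeometric(population N, K marked items, n draws)."""
    lowest = max(0, n + K - N)  # smallest overlap that is possible at all
    total = comb(N, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(lowest, k + 1))
    return favourable / total

def mutual_exclusivity_pvalue(n_samples, n_a, n_b, n_both):
    """Chance of seeing n_both or fewer co-inactivated samples if A and B were independent."""
    return hypergeom_cdf(n_both, n_samples, n_a, n_b)

# Toy counts (hypothetical): 100 tumours, gene A inactivated in 30,
# gene B in 20, both inactivated in only 1.
p = mutual_exclusivity_pvalue(n_samples=100, n_a=30, n_b=20, n_both=1)
print(f"left-tail p-value: {p:.4g}")
```

A small p-value here only says the pair co-occurs less often than chance would predict; it does not by itself establish synthetic lethality.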
Affiliation(s)
- Sarah K Wooller
- Bioinformatics Lab, School of Life Sciences, University of Sussex, Brighton, UK
- Laurence H Pearl
- Genome Damage Stability Centre, School of Life Sciences, University of Sussex, Brighton, UK
- Frances M G Pearl
- Bioinformatics Lab, School of Life Sciences, University of Sussex, Brighton, UK
|
13
|
Dey A, Mudunuri S, Kiran M. MAGICAL: A multi-class classifier to predict synthetic lethal and viable interactions using protein-protein interaction network. PLoS Comput Biol 2024; 20:e1012336. [PMID: 39186799 DOI: 10.1371/journal.pcbi.1012336] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/27/2024] [Revised: 09/06/2024] [Accepted: 07/17/2024] [Indexed: 08/28/2024] Open
Abstract
Synthetic lethality (SL) and synthetic viability (SV) are commonly studied genetic interactions in the targeted therapy approach in cancer. In SL, inhibiting either of the genes does not affect the cancer cell survival, but inhibiting both leads to a lethal phenotype. In SV, inhibiting the vulnerable gene makes the cancer cell sick; inhibiting the partner gene rescues and promotes cell viability. Many low and high-throughput experimental approaches have been employed to identify SLs and SVs, but they are time-consuming and expensive. The computational tools for SL prediction involve statistical and machine-learning approaches. Almost all machine learning tools are binary classifiers and involve only identifying SL pairs. Most importantly, there are limited properties known that best describe and discriminate SL from SV. We developed MAGICAL (Multi-class Approach for Genetic Interaction in Cancer via Algorithm Learning), a multi-class random forest based machine learning model for genetic interaction prediction. Network properties of protein derived from physical protein-protein interactions are used as features to classify SL and SV. The model results in an accuracy of ~80% for the training dataset (CGIdb, BioGRID, and SynLethDB) and performs well on DepMap and other experimentally derived reported datasets. Amongst all the network properties, the shortest path, average neighbor2, average betweenness, average triangle, and adhesion have significant discriminatory power. MAGICAL is the first multi-class model to identify discriminatory features of synthetic lethal and viable interactions. MAGICAL can predict SL and SV interactions with better accuracy and precision than any existing binary classifier.
Affiliation(s)
- Anubha Dey
- Department of Systems and Computational Biology, School of Life Sciences, University of Hyderabad, Hyderabad, India
- Suresh Mudunuri
- Centre for Bioinformatics Research, SRKR Engineering College, Andhra Pradesh, India
- Manjari Kiran
- Department of Systems and Computational Biology, School of Life Sciences, University of Hyderabad, Hyderabad, India
|
14
|
Fan K, Gökbağ B, Tang S, Li S, Huang Y, Wang L, Cheng L, Li L. Synthetic lethal connectivity and graph transformer improve synthetic lethality prediction. Brief Bioinform 2024; 25:bbae425. [PMID: 39210507 PMCID: PMC11361842 DOI: 10.1093/bib/bbae425] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/29/2024] [Revised: 06/14/2024] [Accepted: 08/16/2024] [Indexed: 09/04/2024] Open
Abstract
Synthetic lethality (SL) has shown great promise for the discovery of novel targets in cancer. CRISPR double-knockout (CDKO) technologies can only screen several hundred genes and their combinations, but not genome-wide. Therefore, good SL prediction models are highly needed for genes and gene pairs selection in CDKO experiments. However, lack of scalable SL properties prevents generalizability of SL interactions to out-of-sample data, thereby hindering modeling efforts. In this paper, we recognize that SL connectivity is a scalable and generalizable SL property. We develop a novel two-step multilayer encoder for individual sample-specific SL prediction model (MLEC-iSL), which predicts SL connectivity first and SL interactions subsequently. MLEC-iSL has three encoders, namely, gene, graph, and transformer encoders. MLEC-iSL achieves high SL prediction performance in K562 (AUPR, 0.73; AUC, 0.72) and Jurkat (AUPR, 0.73; AUC, 0.71) cells, while no existing methods exceed 0.62 AUPR and AUC. The prediction performance of MLEC-iSL is validated in a CDKO experiment in 22Rv1 cells, yielding a 46.8% SL rate among 987 selected gene pairs. The screen also reveals SL dependency between apoptosis and mitosis cell death pathways.
Affiliation(s)
- Kunjie Fan
- Department of Biomedical Informatics, College of Medicine, The Ohio State University, 1800 Cannon Drive, Columbus, OH 43210, United States
- Birkan Gökbağ
- Department of Biomedical Informatics, College of Medicine, The Ohio State University, 1800 Cannon Drive, Columbus, OH 43210, United States
- Shan Tang
- Department of Biomedical Informatics, College of Pharmacy, The Ohio State University, 500 W. 12 ave, Columbus, OH 43210, United States
- Shangjia Li
- Department of Biomedical Informatics, College of Medicine, The Ohio State University, 1800 Cannon Drive, Columbus, OH 43210, United States
- Yirui Huang
- Department of Biomedical Informatics, College of Pharmacy, The Ohio State University, 500 W. 12 ave, Columbus, OH 43210, United States
- Lingling Wang
- Department of Biomedical Informatics, College of Medicine, The Ohio State University, 1800 Cannon Drive, Columbus, OH 43210, United States
- Lijun Cheng
- Department of Biomedical Informatics, College of Medicine, The Ohio State University, 1800 Cannon Drive, Columbus, OH 43210, United States
- Lang Li
- Department of Biomedical Informatics, College of Medicine, The Ohio State University, 1800 Cannon Drive, Columbus, OH 43210, United States
- Department of Biomedical Informatics, College of Pharmacy, The Ohio State University, 500 W. 12 ave, Columbus, OH 43210, United States
|
15
|
Zhang G, Chen Y, Yan C, Wang J, Liang W, Luo J, Luo H. MPASL: multi-perspective learning knowledge graph attention network for synthetic lethality prediction in human cancer. Front Pharmacol 2024; 15:1398231. [PMID: 38835667 PMCID: PMC11148462 DOI: 10.3389/fphar.2024.1398231] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/09/2024] [Accepted: 04/26/2024] [Indexed: 06/06/2024] Open
Abstract
Synthetic lethality (SL) is widely used to discover the anti-cancer drug targets. However, the identification of SL interactions through wet experiments is costly and inefficient. Hence, the development of efficient and high-accuracy computational methods for SL interactions prediction is of great significance. In this study, we propose MPASL, a multi-perspective learning knowledge graph attention network to enhance synthetic lethality prediction. MPASL utilizes knowledge graph hierarchy propagation to explore multi-source neighbor nodes related to genes. The knowledge graph ripple propagation expands gene representations through existing gene SL preference sets. MPASL can learn the gene representations from both gene-entity perspective and entity-entity perspective. Specifically, based on the aggregation method, we learn to obtain gene-oriented entity embeddings. Then, the gene representations are refined by comparing the various layer-wise neighborhood features of entities using the discrepancy contrastive technique. Finally, the learned gene representation is applied in SL prediction. Experimental results demonstrated that MPASL outperforms several state-of-the-art methods. Additionally, case studies have validated the effectiveness of MPASL in identifying SL interactions between genes.
Affiliation(s)
- Ge Zhang
- School of Computer and Information Engineering, Henan University, Kaifeng, Henan, China
- Henan Key Laboratory of Big Data Analysis and Processing, Henan University, Kaifeng, Henan, China
- Yitong Chen
- School of Computer and Information Engineering, Henan University, Kaifeng, Henan, China
- Henan Key Laboratory of Big Data Analysis and Processing, Henan University, Kaifeng, Henan, China
- Chaokun Yan
- School of Computer and Information Engineering, Henan University, Kaifeng, Henan, China
- Henan Key Laboratory of Big Data Analysis and Processing, Henan University, Kaifeng, Henan, China
- Jianlin Wang
- School of Computer and Information Engineering, Henan University, Kaifeng, Henan, China
- Henan Key Laboratory of Big Data Analysis and Processing, Henan University, Kaifeng, Henan, China
- Wenjuan Liang
- School of Computer and Information Engineering, Henan University, Kaifeng, Henan, China
- Henan Key Laboratory of Big Data Analysis and Processing, Henan University, Kaifeng, Henan, China
- Junwei Luo
- College of Computer Science and Technology, Henan Polytechnic University, Jiaozuo, Henan, China
- Huimin Luo
- School of Computer and Information Engineering, Henan University, Kaifeng, Henan, China
- Henan Key Laboratory of Big Data Analysis and Processing, Henan University, Kaifeng, Henan, China
|
16
|
Bashi AC, Coker EA, Bulusu KC, Jaaks P, Crafter C, Lightfoot H, Milo M, McCarten K, Jenkins DF, van der Meer D, Lynch JT, Barthorpe S, Andersen CL, Barry ST, Beck A, Cidado J, Gordon JA, Hall C, Hall J, Mali I, Mironenko T, Mongeon K, Morris J, Richardson L, Smith PD, Tavana O, Tolley C, Thomas F, Willis BS, Yang W, O'Connor MJ, McDermott U, Critchlow SE, Drew L, Fawell SE, Mettetal JT, Garnett MJ. Large-scale Pan-cancer Cell Line Screening Identifies Actionable and Effective Drug Combinations. Cancer Discov 2024; 14:846-865. [PMID: 38456804 PMCID: PMC11061612 DOI: 10.1158/2159-8290.cd-23-0388] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/04/2023] [Revised: 11/01/2023] [Accepted: 02/02/2024] [Indexed: 03/09/2024]
Abstract
Oncology drug combinations can improve therapeutic responses and increase treatment options for patients. The number of possible combinations is vast and responses can be context-specific. Systematic screens can identify clinically relevant, actionable combinations in defined patient subtypes. We present data for 109 anticancer drug combinations from AstraZeneca's oncology small molecule portfolio screened in 755 pan-cancer cell lines. Combinations were screened in a 7 × 7 concentration matrix, with more than 4 million measurements of sensitivity, producing an exceptionally data-rich resource. We implement a new approach using combination Emax (viability effect) and highest single agent (HSA) to assess combination benefit. We designed a clinical translatability workflow to identify combinations with clearly defined patient populations, rationale for tolerability based on tumor type and combination-specific "emergent" biomarkers, and exposures relevant to clinical doses. We describe three actionable combinations in defined cancer types, confirmed in vitro and in vivo, with a focus on hematologic cancers and apoptotic targets. SIGNIFICANCE: We present the largest cancer drug combination screen published to date with 7 × 7 concentration response matrices for 109 combinations in more than 750 cell lines, complemented by multi-omics predictors of response and identification of "emergent" combination biomarkers. We prioritize hits to optimize clinical translatability, and experimentally validate novel combination hypotheses. This article is featured in Selected Articles from This Issue, p. 695.
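To make the two summary quantities named in this abstract concrete, the following sketch computes a combination Emax and an excess over the highest single agent (HSA) from a small dose matrix. It assumes effects are expressed as fractional inhibition (0 = no effect, 1 = complete kill) and uses a hypothetical 2 × 2 matrix rather than the 7 × 7 design; the paper's exact definitions and thresholds are not reproduced here.

```python
def summarize_combination(single_a, single_b, combo):
    """Return (combination Emax, largest excess over the highest single agent).

    single_a[i] : effect of drug A alone at dose i
    single_b[j] : effect of drug B alone at dose j
    combo[i][j] : effect of A at dose i combined with B at dose j
    """
    excess = [
        [combo[i][j] - max(single_a[i], single_b[j]) for j in range(len(single_b))]
        for i in range(len(single_a))
    ]
    emax = max(max(row) for row in combo)          # strongest combination effect
    best_excess = max(max(row) for row in excess)  # largest gain over the better monotherapy
    return emax, best_excess

# Hypothetical toy data (fractional inhibition), not values from the screen.
single_a = [0.1, 0.4]
single_b = [0.2, 0.5]
combo = [[0.3, 0.6],
         [0.5, 0.9]]

combo_emax, max_excess = summarize_combination(single_a, single_b, combo)
print(combo_emax, round(max_excess, 3))
```

Reporting both numbers separates combinations that merely add a little to an already potent drug from those that are both strongly active and clearly better than either agent alone.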
Affiliation(s)
- Marta Milo
- Oncology R&D, AstraZeneca, Cambridge, United Kingdom
- Syd Barthorpe
- Wellcome Sanger Institute, Cambridge, United Kingdom
- Caitlin Hall
- Wellcome Sanger Institute, Cambridge, United Kingdom
- James Hall
- Wellcome Sanger Institute, Cambridge, United Kingdom
- Iman Mali
- Wellcome Sanger Institute, Cambridge, United Kingdom
- James Morris
- Wellcome Sanger Institute, Cambridge, United Kingdom
- Paul D. Smith
- Oncology R&D, AstraZeneca, Cambridge, United Kingdom
- Omid Tavana
- Oncology R&D, AstraZeneca, Waltham, Massachusetts
- Wanjuan Yang
- Wellcome Sanger Institute, Cambridge, United Kingdom
- Lisa Drew
- Oncology R&D, AstraZeneca, Waltham, Massachusetts
|
17
|
Liu X, Hu J, Zheng J. SL-Miner: a web server for mining evidence and prioritization of cancer-specific synthetic lethality. Bioinformatics 2024; 40:btae016. [PMID: 38244572 PMCID: PMC10868331 DOI: 10.1093/bioinformatics/btae016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/18/2023] [Revised: 12/10/2023] [Accepted: 01/16/2024] [Indexed: 01/22/2024] Open
Abstract
SUMMARY: Synthetic lethality (SL) refers to a type of genetic interaction in which the simultaneous inactivation of two genes leads to cell death, while the inactivation of a single gene does not affect cell viability. It significantly expands the range of potential therapeutic targets for anti-cancer treatments. SL interactions are primarily identified through experimental screening and computational prediction. Although various computational methods have been proposed, they tend to ignore providing evidence to support their predictions of SL. Besides, they are rarely user-friendly for biologists who likely have limited programming skills. Moreover, the genetic context specificity of SL interactions is often not taken into consideration. Here, we introduce a web server called SL-Miner, which is designed to mine the evidence of SL relationships between a primary gene and a few candidate SL partner genes in a specific type of cancer, and to prioritize these candidate genes by integrating various types of evidence. For intuitive data visualization, SL-Miner provides a range of charts (e.g. volcano plot and box plot) to help users get insights from the data. AVAILABILITY AND IMPLEMENTATION: SL-Miner is available at https://slminer.sist.shanghaitech.edu.cn.
Affiliation(s)
- Xin Liu
- School of Information Science and Technology, ShanghaiTech University, Shanghai 201210, China
- Jieni Hu
- School of Life Science and Technology, ShanghaiTech University, Shanghai 201210, China
- Jie Zheng
- School of Information Science and Technology, ShanghaiTech University, Shanghai 201210, China
- Shanghai Engineering Research Center of Intelligent Vision and Imaging, Shanghai 201210, China
|
18
|
Gökbağ B, Tang S, Fan K, Cheng L, Yu L, Zhao Y, Li L. SLKB: synthetic lethality knowledge base. Nucleic Acids Res 2024; 52:D1418-D1428. [PMID: 37889037 PMCID: PMC10767912 DOI: 10.1093/nar/gkad806] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/04/2023] [Revised: 08/16/2023] [Accepted: 09/27/2023] [Indexed: 10/28/2023] Open
Abstract
Emerging CRISPR-Cas9 technology permits synthetic lethality (SL) screening of large number of gene pairs from gene combination double knockout (CDKO) experiments. However, the poor integration and annotation of CDKO SL data in current SL databases limit their utility, and diverse methods of calculating SL scores prohibit their comparison. To overcome these shortcomings, we have developed SL knowledge base (SLKB) that incorporates data of 11 CDKO experiments in 22 cell lines, 16,059 SL gene pairs and 264,424 non-SL gene pairs. Additionally, within SLKB, we have implemented five SL calculation methods: median score with and without background control normalization (Median-B/NB), sgRNA-derived score (sgRNA-B/NB), Horlbeck score, GEMINI score and MAGeCK score. The five scores have demonstrated a mere 1.21% overlap among their top 10% SL gene pairs, reflecting high diversity. Users can browse SL networks and assess the impact of scoring methods using Venn diagrams. The SL network generated from all data in SLKB shows a greater likelihood of SL gene pair connectivity with other SL gene pairs than non-SL pairs. Comparison of SL networks between two cell lines demonstrated greater likelihood to share SL hub genes than SL gene pairs. SLKB website and pipeline can be freely accessed at https://slkb.osubmi.org and https://slkb.docs.osubmi.org/, respectively.
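The overlap statistic quoted in this abstract can be pictured with a short sketch: take the top-scoring fraction of gene pairs under each scoring method and measure how many pairs all methods agree on. The helper names and toy scores are hypothetical, and dividing the shared pairs by the union of the top sets is an assumption; the abstract does not state which denominator SLKB uses.

```python
def top_fraction(scores, fraction=0.10):
    """Return the set of gene pairs in the top `fraction` of a {pair: score} table."""
    k = max(1, int(len(scores) * fraction))
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:k])

def top_overlap(score_tables, fraction=0.10):
    """Share of top-ranked pairs common to every method (intersection over union)."""
    tops = [top_fraction(table, fraction) for table in score_tables]
    shared = set.intersection(*tops)
    union = set.union(*tops)
    return len(shared) / len(union)

# Hypothetical toy scores for four gene pairs under two scoring methods.
method_1 = {"A|B": 4.0, "A|C": 3.0, "B|C": 2.0, "C|D": 1.0}
method_2 = {"A|B": 9.0, "A|C": 1.0, "B|C": 8.0, "C|D": 2.0}

overlap = top_overlap([method_1, method_2], fraction=0.5)
print(f"overlap of top halves: {overlap:.2%}")
```

With more methods the intersection can only shrink, which is one reason agreement across five scores may be far lower than agreement between any two of them.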
Affiliation(s)
- Birkan Gökbağ
- Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH 43210, USA
- Shan Tang
- College of Pharmacy, The Ohio State University, Columbus, OH 43210, USA
- Kunjie Fan
- Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH 43210, USA
- Lijun Cheng
- Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH 43210, USA
- Lianbo Yu
- Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH 43210, USA
- Yue Zhao
- Department of Computational Medicine and Bioinformatics, College of Medicine, University of Michigan, Ann Arbor, MI 48104, USA
- Lang Li
- Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH 43210, USA
|
19
|
Staheli JP, Neal ML, Navare A, Mast FD, Aitchison JD. Predicting host-based, synthetic lethal antiviral targets from omics data. NAR Mol Med 2024; 1:ugad001. [PMID: 38994440 PMCID: PMC11233254 DOI: 10.1093/narmme/ugad001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/17/2023] [Revised: 12/08/2023] [Accepted: 01/03/2024] [Indexed: 07/13/2024]
Abstract
Traditional antiviral therapies often have limited effectiveness due to toxicity and the emergence of drug resistance. Host-based antivirals are an alternative, but can cause nonspecific effects. Recent evidence shows that virus-infected cells can be selectively eliminated by targeting synthetic lethal (SL) partners of proteins disrupted by viral infection. Thus, we hypothesized that genes depleted in CRISPR knockout (KO) screens of virus-infected cells may be enriched in SL partners of proteins altered by infection. To investigate this, we established a computational pipeline predicting antiviral SL drug targets. First, we identified SARS-CoV-2-induced changes in gene products via a large compendium of omics data. Second, we identified SL partners for each altered gene product. Last, we screened CRISPR KO data for SL partners required for cell viability in infected cells. Despite differences in virus-induced alterations detected by various omics data, they share many predicted SL targets, with significant enrichment in CRISPR KO-depleted datasets. Our comparison of SARS-CoV-2 and influenza infection data revealed potential broad-spectrum, host-based antiviral SL targets. This suggests that CRISPR KO data are replete with common antiviral targets due to their SL relationship with virus-altered states and that such targets can be revealed from analysis of omics datasets and SL predictions.
Affiliation(s)
- Jeannette P Staheli
- Center for Global Infectious Disease Research, Seattle Children’s Research Institute, Seattle, WA 98101, USA
- Maxwell L Neal
- Center for Global Infectious Disease Research, Seattle Children’s Research Institute, Seattle, WA 98101, USA
- Arti Navare
- Center for Global Infectious Disease Research, Seattle Children’s Research Institute, Seattle, WA 98101, USA
- Fred D Mast
- Center for Global Infectious Disease Research, Seattle Children’s Research Institute, Seattle, WA 98101, USA
- John D Aitchison
- Center for Global Infectious Disease Research, Seattle Children’s Research Institute, Seattle, WA 98101, USA
|
20
|
Abu-Salih B, AL-Qurishi M, Alweshah M, AL-Smadi M, Alfayez R, Saadeh H. Healthcare knowledge graph construction: A systematic review of the state-of-the-art, open issues, and opportunities. J Big Data 2023; 10:81. [PMID: 37274445 PMCID: PMC10225120 DOI: 10.1186/s40537-023-00774-9] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 06/28/2022] [Accepted: 05/17/2023] [Indexed: 06/06/2023]
Abstract
The incorporation of data analytics in the healthcare industry has made significant progress, driven by the demand for efficient and effective big data analytics solutions. Knowledge graphs (KGs) have proven utility in this arena and are rooted in a number of healthcare applications to furnish better data representation and knowledge inference. However, in conjunction with a lack of a representative KG construction taxonomy, several existing approaches in this designated domain are inadequate and inferior. This paper is the first to provide a comprehensive taxonomy and a bird's eye view of healthcare KG construction. Additionally, a thorough examination of the current state-of-the-art techniques drawn from academic works relevant to various healthcare contexts is carried out. These techniques are critically evaluated in terms of methods used for knowledge extraction, types of the knowledge base and sources, and the incorporated evaluation protocols. Finally, several research findings and existing issues in the literature are reported and discussed, opening horizons for future research in this vibrant area.
Affiliation(s)
- Mohammad AL-Smadi
- Jordan University of Science and Technology, Irbid, Jordan
- Qatar University, Doha, Qatar
|
21
|
Jing Y, Feng B, Gao J, Li J, Zhou G, Sun Z, Wang Y. BLAB2CancerKD: a knowledge graph database focusing on the association between lactic acid bacteria and cancer, but beyond. Database (Oxford) 2023; 2023:7176387. [PMID: 37221044 DOI: 10.1093/database/baad036] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/01/2023] [Revised: 04/23/2023] [Accepted: 04/28/2023] [Indexed: 05/25/2023]
Abstract
In a broad sense, lactic acid bacteria (LAB) is a general term for Gram-positive bacteria that can produce lactic acid by utilizing fermentable carbohydrates. It is widely used in essential fields such as industry, agriculture, animal husbandry and medicine. At the same time, LAB are closely related to human health. They can regulate human intestinal flora and improve gastrointestinal function and body immunity. Cancer, a disease in which some cells grow out of control and spread to other body parts, is one of the leading causes of human death worldwide. In recent years, the potential of LAB in cancer treatment has attracted attention. Mining knowledge from the scientific literature significantly accelerates its application in cancer treatment. Using 7794 literature studies of LAB cancer as source data, we have processed 16 543 biomedical concepts and 23 091 associations by using automatic text mining tools combined with manual curation of domain experts. An ontology containing 31 434 pieces of structured data is constructed. Finally, based on ontology, a knowledge graph (KG) database, which is called Beyond 'Lactic acid bacteria to Cancer Knowledge graph Database' (BLAB2CancerKD), is constructed by using KG and web technology. BLAB2CancerKD presents all the relevant knowledge intuitively and clearly in various data presentation forms, and the interactive system function also makes it more efficient. BLAB2CancerKD will be continuously updated to advance the research and application of LAB in cancer therapy. Researchers can visit BLAB2CancerKD at the database URL: http://110.40.139.2:18095/.
Affiliation(s)
- Yi Jing
  - Faculty of Science, The University of New South Wales, High Street, Sydney, New South Wales 2052, Australia
  - Inner Mongolia Autonomous Region Key Laboratory of Big Data Research and Application for Agriculture and Animal Husbandry, Zhaowuda Road No. 306, Hohhot 010018, China
- Baiyang Feng
  - Inner Mongolia Autonomous Region Key Laboratory of Big Data Research and Application for Agriculture and Animal Husbandry, Zhaowuda Road No. 306, Hohhot 010018, China
  - College of Computer and Information Engineering, Inner Mongolia Agricultural University, Erdos East Street No. 29, Hohhot 010011, China
- Jing Gao
  - Inner Mongolia Autonomous Region Key Laboratory of Big Data Research and Application for Agriculture and Animal Husbandry, Zhaowuda Road No. 306, Hohhot 010018, China
  - College of Computer and Information Engineering, Inner Mongolia Agricultural University, Erdos East Street No. 29, Hohhot 010011, China
  - Inner Mongolia Autonomous Region Big Data Center, Chilechuan Street No. 1, Hohhot 010091, China
- Jin Li
  - Inner Mongolia Autonomous Region Key Laboratory of Big Data Research and Application for Agriculture and Animal Husbandry, Zhaowuda Road No. 306, Hohhot 010018, China
  - College of Computer and Information Engineering, Inner Mongolia Agricultural University, Erdos East Street No. 29, Hohhot 010011, China
- Ganghui Zhou
  - Inner Mongolia Autonomous Region Key Laboratory of Big Data Research and Application for Agriculture and Animal Husbandry, Zhaowuda Road No. 306, Hohhot 010018, China
  - College of Computer and Information Engineering, Inner Mongolia Agricultural University, Erdos East Street No. 29, Hohhot 010011, China
- Zhihong Sun
  - College of Food Science and Engineering, Inner Mongolia Agricultural University, Zhaowuda Road No. 306, Hohhot 010018, China
- Yufei Wang
  - The Affiliated Hospital of Inner Mongolia Medical University, Tongdao North Road No. 1, Hohhot 010050, China
|
22
|
Markowska M, Budzinska MA, Coenen-Stass A, Kang S, Kizling E, Kolmus K, Koras K, Staub E, Szczurek E. Synthetic lethality prediction in DNA damage repair, chromatin remodeling and the cell cycle using multi-omics data from cell lines and patients. Sci Rep 2023; 13:7049. [PMID: 37120674 PMCID: PMC10148866 DOI: 10.1038/s41598-023-34161-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/22/2022] [Accepted: 04/25/2023] [Indexed: 05/01/2023] Open
Abstract
Discovering synthetic lethal (SL) gene partners of cancer genes is an important step in developing cancer therapies. However, identification of SL interactions is challenging, due to a large number of possible gene pairs, inherent noise and confounding factors in the observed signal. To discover robust SL interactions, we devised SLIDE-VIP, a novel framework combining eight statistical tests, including a new patient data-based test iSurvLRT. SLIDE-VIP leverages multi-omics data from four different sources: gene inactivation cell line screens, cancer patient data, drug screens and gene pathways. We applied SLIDE-VIP to discover SL interactions between genes involved in DNA damage repair, chromatin remodeling and cell cycle, and their potentially druggable partners. The top 883 ranking SL candidates had strong evidence in cell line and patient data, 250-fold reducing the initial space of 200K pairs. Drug screen and pathway tests provided additional corroboration and insights into these interactions. We rediscovered well-known SL pairs such as RB1 and E2F3 or PRKDC and ATM, and in addition, proposed strong novel SL candidates such as PTEN and PIK3CB. In summary, SLIDE-VIP opens the door to the discovery of SL interactions with clinical potential. All analysis and visualizations are available via the online SLIDE-VIP WebApp.
Affiliation(s)
- Magda Markowska
  - Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Stefana Banacha 2, 02-097, Warsaw, Poland
  - Postgraduate School of Molecular Medicine, Medical University of Warsaw, Zwirki i Wigury 61, 02-091, Warsaw, Poland
- Magdalena A Budzinska
  - Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Stefana Banacha 2, 02-097, Warsaw, Poland
  - Ardigen S.A., Podole 76, 30-394, Cracow, Poland
- Anna Coenen-Stass
  - Translational Medicine, Oncology Bioinformatics, Merck Healthcare KGaA, Frankfurt Strasse 250, 64293, Darmstadt, Germany
- Senbai Kang
  - Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Stefana Banacha 2, 02-097, Warsaw, Poland
- Ewa Kizling
  - Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Stefana Banacha 2, 02-097, Warsaw, Poland
- Krzysztof Koras
  - Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Stefana Banacha 2, 02-097, Warsaw, Poland
- Eike Staub
  - Translational Medicine, Oncology Bioinformatics, Merck Healthcare KGaA, Frankfurt Strasse 250, 64293, Darmstadt, Germany
- Ewa Szczurek
  - Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Stefana Banacha 2, 02-097, Warsaw, Poland
|
23
|
Tang S, Gökbağ B, Fan K, Shao S, Huo Y, Wu X, Cheng L, Li L. Synthetic lethal gene pairs: Experimental approaches and predictive models. Front Genet 2022; 13:961611. [PMID: 36531238 PMCID: PMC9751344 DOI: 10.3389/fgene.2022.961611] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 06/05/2022] [Accepted: 11/07/2022] [Indexed: 03/27/2024] Open
Abstract
Synthetic lethality (SL) refers to a genetic interaction in which the simultaneous perturbation of two genes leads to cell or organism death, whereas viability is maintained when only one of the pair is altered. The experimental exploration of these pairs and predictive modeling in computational biology contribute to our understanding of cancer biology and the development of cancer therapies. We extensively reviewed experimental technologies, public data sources, and predictive models in the study of synthetic lethal gene pairs and herein detail biological assumptions, experimental data, statistical models, and computational schemes of various predictive models, speculate regarding their influence on individual sample- and population-based synthetic lethal interactions, discuss the pros and cons of existing SL data and models, and highlight potential research directions in SL discovery.
Affiliation(s)
- Shan Tang
  - College of Pharmacy, The Ohio State University, Columbus, OH, United States
- Birkan Gökbağ
  - Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH, United States
- Kunjie Fan
  - Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH, United States
- Shuai Shao
  - College of Pharmacy, The Ohio State University, Columbus, OH, United States
- Yang Huo
  - Indiana University, Bloomington, IN, United States
- Xue Wu
  - Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH, United States
- Lijun Cheng
  - Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH, United States
- Lang Li
  - Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH, United States
|
24
|
Wang S, Feng Y, Liu X, Liu Y, Wu M, Zheng J. NSF4SL: negative-sample-free contrastive learning for ranking synthetic lethal partner genes in human cancers. Bioinformatics 2022; 38:ii13-ii19. [PMID: 36124790 DOI: 10.1093/bioinformatics/btac462] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 12/25/2022] Open
Abstract
MOTIVATION: Detecting synthetic lethality (SL) is a promising strategy for identifying anti-cancer drug targets. Targeting SL partners of a primary gene mutated in cancer is selectively lethal to cancer cells. Due to high cost of wet-lab experiments and availability of gold standard SL data, supervised machine learning for SL prediction has been popular. However, most of the methods are based on binary classification and thus limited by the lack of reliable negative data. Contrastive learning can train models without any negative sample and is thus promising for finding novel SLs.
RESULTS: We propose NSF4SL, a negative-sample-free SL prediction model based on a contrastive learning framework. It captures the characteristics of positive SL samples by using two branches of neural networks that interact with each other to learn SL-related gene representations. Moreover, a feature-wise data augmentation strategy is used to mitigate the sparsity of SL data. NSF4SL significantly outperforms all baselines which require negative samples, even in challenging experimental settings. To the best of our knowledge, this is the first time that SL prediction is formulated as a gene ranking problem, which is more practical than the current formulation as binary classification. NSF4SL is the first contrastive learning method for SL prediction and its success points to a new direction of machine-learning methods for identifying novel SLs.
AVAILABILITY AND IMPLEMENTATION: Our source code is available at https://github.com/JieZheng-ShanghaiTech/NSF4SL.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Shike Wang
  - School of Information Science and Technology, ShanghaiTech University, Shanghai 201210, China
- Yimiao Feng
  - School of Information Science and Technology, ShanghaiTech University, Shanghai 201210, China
- Xin Liu
  - School of Information Science and Technology, ShanghaiTech University, Shanghai 201210, China
- Yong Liu
  - Joint NTU-UBC Research Centre of Excellence in Active Living for the Elderly, Nanyang Technological University, Singapore 639798, Singapore
- Min Wu
  - Institute for Infocomm Research, Agency for Science, Technology and Research (A*STAR), Singapore 138632, Singapore
- Jie Zheng
  - School of Information Science and Technology, ShanghaiTech University, Shanghai 201210, China
  - Shanghai Engineering Research Center of Intelligent Vision and Imaging, Shanghai 201210, China
|
25
|
Liu X, Yu J, Tao S, Yang B, Wang S, Wang L, Bai F, Zheng J. PiLSL: pairwise interaction learning-based graph neural network for synthetic lethality prediction in human cancers. Bioinformatics 2022; 38:ii106-ii112. [PMID: 36124788 DOI: 10.1093/bioinformatics/btac476] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 12/25/2022] Open
Abstract
MOTIVATION: Synthetic lethality (SL) is a type of genetic interaction in which the simultaneous inactivation of two genes leads to cell death, while the inactivation of a single gene does not affect the cell viability. It can effectively expand the range of anti-cancer therapeutic targets. SL interactions are identified mainly by experimental screening and computational prediction. Recent machine-learning methods mostly learn the representation of each gene individually, ignoring the representation of the pairwise interaction between two genes. In addition, the mechanisms of SL, the key to translating SL into cancer therapeutics, are often unclear.
RESULTS: To fill the gaps, we propose a pairwise interaction learning-based graph neural network (GNN) named PiLSL to learn the representation of pairwise interaction between two genes for SL prediction. First, we construct an enclosing graph for each pair of genes from a knowledge graph. Secondly, we design an attentive embedding propagation layer in a GNN to discriminate the importance among the edges in the enclosing graph and to learn the latent features of the pairwise interaction from the weighted enclosing graph. Finally, we further fuse the latent features with explicit features extracted from multi-omics data to obtain powerful gene representations for SL prediction. Extensive experimental results demonstrate that PiLSL outperforms the best baseline by a large margin and generalizes well under three realistic scenarios. Besides, PiLSL provides an explanation of SL mechanisms via the weighted paths in the enclosing graphs by attention mechanism.
AVAILABILITY AND IMPLEMENTATION: Our source code is available at https://github.com/JieZheng-ShanghaiTech/PiLSL.
Affiliation(s)
- Xin Liu
  - School of Information Science and Technology, Shanghai Tech University, Shanghai 201210, China
- Jiale Yu
  - School of Information Science and Technology, Shanghai Tech University, Shanghai 201210, China
- Siyu Tao
  - School of Information Science and Technology, Shanghai Tech University, Shanghai 201210, China
- Beiyuan Yang
  - School of Information Science and Technology, Shanghai Tech University, Shanghai 201210, China
- Shike Wang
  - School of Information Science and Technology, Shanghai Tech University, Shanghai 201210, China
- Lin Wang
  - School of Life Science and Technology, Shanghai Tech University, Shanghai 201210, China
  - Shanghai Institute for Advanced Immunochemical Studies, Shanghai Tech University, Shanghai 201210, China
- Fang Bai
  - School of Information Science and Technology, Shanghai Tech University, Shanghai 201210, China
  - School of Life Science and Technology, Shanghai Tech University, Shanghai 201210, China
  - Shanghai Institute for Advanced Immunochemical Studies, Shanghai Tech University, Shanghai 201210, China
- Jie Zheng
  - School of Information Science and Technology, Shanghai Tech University, Shanghai 201210, China
  - Shanghai Engineering Research Center of Intelligent Vision and Imaging, Shanghai 201210, China
|
26
|
Guo L, Dou Y, Xia D, Yin Z, Xiang Y, Luo L, Zhang Y, Wang J, Liang T. SLOAD: a comprehensive database of cancer-specific synthetic lethal interactions for precision cancer therapy via multi-omics analysis. Database (Oxford) 2022; 2022:6677988. [PMID: 36029479 PMCID: PMC9419874 DOI: 10.1093/database/baac075] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 06/15/2022] [Revised: 07/27/2022] [Accepted: 08/20/2022] [Indexed: 11/14/2022]
Abstract
Synthetic lethality has been widely concerned because of its potential role in cancer treatment, which can be harnessed to selectively kill cancer cells via identifying inactive genes in a specific cancer type and further targeting the corresponding synthetic lethal partners. Herein, to obtain cancer-specific synthetic lethal interactions, we aimed to predict genetic interactions via a pan-cancer analysis from multiple molecular levels using random forest and then develop a user-friendly database. First, based on collected public gene pairs with synthetic lethal interactions, candidate gene pairs were analyzed via integrating multi-omics data, mainly including DNA mutation, copy number variation, methylation and mRNA expression data. Then, integrated features were used to predict cancer-specific synthetic lethal interactions using random forest. Finally, SLOAD (http://www.tmliang.cn/SLOAD) was constructed via integrating these findings, which was a user-friendly database for data searching, browsing, downloading and analyzing. These results can provide candidate cancer-specific synthetic lethal interactions, which will contribute to drug designing in cancer treatment that can promote therapy strategies based on the principle of synthetic lethality.
Database URL http://www.tmliang.cn/SLOAD/
Affiliation(s)
- Li Guo
  - Department of Bioinformatics, Smart Health Big Data Analysis and Location Services Engineering Lab of Jiangsu Province, School of Geographic and Biologic Information, Nanjing University of Posts and Telecommunications, No. 9, Wenyuan Road, Qixia District, Nanjing, Jiangsu 210023, China
- Yuyang Dou
  - Department of Bioinformatics, Smart Health Big Data Analysis and Location Services Engineering Lab of Jiangsu Province, School of Geographic and Biologic Information, Nanjing University of Posts and Telecommunications, No. 9, Wenyuan Road, Qixia District, Nanjing, Jiangsu 210023, China
- Daoliang Xia
  - Department of Bioinformatics, Smart Health Big Data Analysis and Location Services Engineering Lab of Jiangsu Province, School of Geographic and Biologic Information, Nanjing University of Posts and Telecommunications, No. 9, Wenyuan Road, Qixia District, Nanjing, Jiangsu 210023, China
- Zibo Yin
  - Department of Bioinformatics, Smart Health Big Data Analysis and Location Services Engineering Lab of Jiangsu Province, School of Geographic and Biologic Information, Nanjing University of Posts and Telecommunications, No. 9, Wenyuan Road, Qixia District, Nanjing, Jiangsu 210023, China
- Yangyang Xiang
  - Department of Bioinformatics, Smart Health Big Data Analysis and Location Services Engineering Lab of Jiangsu Province, School of Geographic and Biologic Information, Nanjing University of Posts and Telecommunications, No. 9, Wenyuan Road, Qixia District, Nanjing, Jiangsu 210023, China
- Lulu Luo
  - Jiangsu Key Laboratory for Molecular and Medical Biotechnology, School of Life Science, Nanjing Normal University, No. 1, Wenyuan Road, Qixia District, Nanjing, Jiangsu 210023, China
- Yuting Zhang
  - Department of Bioinformatics, Smart Health Big Data Analysis and Location Services Engineering Lab of Jiangsu Province, School of Geographic and Biologic Information, Nanjing University of Posts and Telecommunications, No. 9, Wenyuan Road, Qixia District, Nanjing, Jiangsu 210023, China
- Jun Wang
  - Department of Bioinformatics, Smart Health Big Data Analysis and Location Services Engineering Lab of Jiangsu Province, School of Geographic and Biologic Information, Nanjing University of Posts and Telecommunications, No. 9, Wenyuan Road, Qixia District, Nanjing, Jiangsu 210023, China
- Tingming Liang
  - Jiangsu Key Laboratory for Molecular and Medical Biotechnology, School of Life Science, Nanjing Normal University, No. 1, Wenyuan Road, Qixia District, Nanjing, Jiangsu 210023, China
|