1
Liang H, Berger B, Singh R. Tracing the Shared Foundations of Gene Expression and Chromatin Structure. bioRxiv: the preprint server for biology 2025:2025.03.31.646349. [PMID: 40235997 PMCID: PMC11996408 DOI: 10.1101/2025.03.31.646349] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/17/2025]
Abstract
The three-dimensional organization of chromatin into topologically associating domains (TADs) may impact gene regulation by bringing distant genes into contact. However, many questions about TADs' function and their influence on transcription remain unresolved due to technical limitations in defining TAD boundaries and measuring the direct effect that TADs have on gene expression. Here, we develop consensus TAD maps for human and mouse with a novel "bag-of-genes" approach for defining the gene composition within TADs. This approach enables new functional interpretations of TADs by providing a way to capture species-level differences in chromatin organization. We also leverage a generative AI foundation model computed from 33 million transcriptomes to define contextual similarity, an embedding-based metric that is more powerful than co-expression at representing functional gene relationships. Our analytical framework directly leads to testable hypotheses about chromatin organization across cellular states. We find that TADs play an active role in facilitating gene co-regulation, possibly through a mechanism involving transcriptional condensates. We also discover that the TAD-linked enhancement of transcriptional context is strongest in early developmental stages and systematically declines with aging. Investigations of cancer cells show distinct patterns of TAD usage that shift with chemotherapy treatment, suggesting specific roles for TAD-mediated regulation in cellular development and plasticity. Finally, we develop "TAD signatures" to improve statistical analysis of single-cell transcriptomic data sets in predicting cancer cell-line drug response. These findings reshape our understanding of cellular plasticity in development and disease, indicating that chromatin organization acts through probabilistic mechanisms rather than deterministic rules. Software availability: https://singhlab.net/tadmap.
2
Zitnik M, Li MM, Wells A, Glass K, Morselli Gysi D, Krishnan A, Murali TM, Radivojac P, Roy S, Baudot A, Bozdag S, Chen DZ, Cowen L, Devkota K, Gitter A, Gosline SJC, Gu P, Guzzi PH, Huang H, Jiang M, Kesimoglu ZN, Koyuturk M, Ma J, Pico AR, Pržulj N, Przytycka TM, Raphael BJ, Ritz A, Sharan R, Shen Y, Singh M, Slonim DK, Tong H, Yang XH, Yoon BJ, Yu H, Milenković T. Current and future directions in network biology. Bioinformatics Advances 2024; 4:vbae099. [PMID: 39143982 PMCID: PMC11321866 DOI: 10.1093/bioadv/vbae099] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/27/2023] [Revised: 05/31/2024] [Accepted: 07/08/2024] [Indexed: 08/16/2024]
Abstract
Summary: Network biology is an interdisciplinary field bridging computational and biological sciences that has proved pivotal in advancing the understanding of cellular functions and diseases across biological systems and scales. Although the field has been around for two decades, it remains nascent. It has witnessed rapid evolution, accompanied by emerging challenges. These stem from various factors, notably the growing complexity and volume of data together with the increased diversity of data types describing different tiers of biological organization. We discuss prevailing research directions in network biology, focusing on molecular/cellular networks but also on other biological network types such as biomedical knowledge graphs, patient similarity networks, brain networks, and social/contact networks relevant to disease spread. In more detail, we highlight areas of inference and comparison of biological networks, multimodal data integration and heterogeneous networks, higher-order network analysis, machine learning on networks, and network-based personalized medicine. Following the overview of recent breakthroughs across these five areas, we offer a perspective on future directions of network biology. Additionally, we discuss scientific communities, educational initiatives, and the importance of fostering diversity within the field. This article establishes a roadmap for an immediate and long-term vision for network biology. Availability and implementation: Not applicable.
Affiliation(s)
- Marinka Zitnik
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115, United States
- Michelle M Li
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115, United States
- Aydin Wells
  - Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN 46556, United States
  - Lucy Family Institute for Data and Society, University of Notre Dame, Notre Dame, IN 46556, United States
  - Eck Institute for Global Health, University of Notre Dame, Notre Dame, IN 46556, United States
- Kimberly Glass
  - Channing Division of Network Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA 02115, United States
- Deisy Morselli Gysi
  - Channing Division of Network Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA 02115, United States
  - Department of Statistics, Federal University of Paraná, Curitiba, Paraná 81530-015, Brazil
  - Department of Physics, Northeastern University, Boston, MA 02115, United States
- Arjun Krishnan
  - Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, United States
- T M Murali
  - Department of Computer Science, Virginia Tech, Blacksburg, VA 24061, United States
- Predrag Radivojac
  - Khoury College of Computer Sciences, Northeastern University, Boston, MA 02115, United States
- Sushmita Roy
  - Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI 53715, United States
  - Wisconsin Institute for Discovery, Madison, WI 53715, United States
- Anaïs Baudot
  - Aix Marseille Université, INSERM, MMG, Marseille, France
- Serdar Bozdag
  - Department of Computer Science and Engineering, University of North Texas, Denton, TX 76203, United States
  - Department of Mathematics, University of North Texas, Denton, TX 76203, United States
- Danny Z Chen
  - Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN 46556, United States
- Lenore Cowen
  - Department of Computer Science, Tufts University, Medford, MA 02155, United States
- Kapil Devkota
  - Department of Computer Science, Tufts University, Medford, MA 02155, United States
- Anthony Gitter
  - Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI 53715, United States
  - Morgridge Institute for Research, Madison, WI 53715, United States
- Sara J C Gosline
  - Biological Sciences Division, Pacific Northwest National Laboratory, Seattle, WA 98109, United States
- Pengfei Gu
  - Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN 46556, United States
- Pietro H Guzzi
  - Department of Medical and Surgical Sciences, University Magna Graecia of Catanzaro, Catanzaro, 88100, Italy
- Heng Huang
  - Department of Computer Science, University of Maryland College Park, College Park, MD 20742, United States
- Meng Jiang
  - Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN 46556, United States
- Ziynet Nesibe Kesimoglu
  - Department of Computer Science and Engineering, University of North Texas, Denton, TX 76203, United States
  - National Center of Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20814, United States
- Mehmet Koyuturk
  - Department of Computer and Data Sciences, Case Western Reserve University, Cleveland, OH 44106, United States
- Jian Ma
  - Ray and Stephanie Lane Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, United States
- Alexander R Pico
  - Institute of Data Science and Biotechnology, Gladstone Institutes, San Francisco, CA 94158, United States
- Nataša Pržulj
  - Department of Computer Science, University College London, London, WC1E 6BT, England
  - ICREA, Catalan Institution for Research and Advanced Studies, Barcelona, 08010, Spain
  - Barcelona Supercomputing Center (BSC), Barcelona, 08034, Spain
- Teresa M Przytycka
  - National Center of Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20814, United States
- Benjamin J Raphael
  - Department of Computer Science, Princeton University, Princeton, NJ 08544, United States
- Anna Ritz
  - Department of Biology, Reed College, Portland, OR 97202, United States
- Roded Sharan
  - School of Computer Science, Tel Aviv University, Tel Aviv, 69978, Israel
- Yang Shen
  - Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX 77843, United States
- Mona Singh
  - Department of Computer Science, Princeton University, Princeton, NJ 08544, United States
  - Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, United States
- Donna K Slonim
  - Department of Computer Science, Tufts University, Medford, MA 02155, United States
- Hanghang Tong
  - Department of Computer Science, University of Illinois Urbana-Champaign, Urbana, IL 61801, United States
- Xinan Holly Yang
  - Department of Pediatrics, University of Chicago, Chicago, IL 60637, United States
- Byung-Jun Yoon
  - Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX 77843, United States
  - Computational Science Initiative, Brookhaven National Laboratory, Upton, NY 11973, United States
- Haiyuan Yu
  - Department of Computational Biology, Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY 14853, United States
- Tijana Milenković
  - Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN 46556, United States
  - Lucy Family Institute for Data and Society, University of Notre Dame, Notre Dame, IN 46556, United States
  - Eck Institute for Global Health, University of Notre Dame, Notre Dame, IN 46556, United States
3
Qin G, Dai J, Chien S, Martins TJ, Loera B, Nguyen QH, Oakes ML, Tercan B, Aguilar B, Hagen L, McCune J, Gelinas R, Monnat RJ, Shmulevich I, Becker PS. Mutation Patterns Predict Drug Sensitivity in Acute Myeloid Leukemia. Clin Cancer Res 2024; 30:2659-2671. [PMID: 38619278 PMCID: PMC11176916 DOI: 10.1158/1078-0432.ccr-23-1674] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2023] [Revised: 08/15/2023] [Accepted: 12/08/2023] [Indexed: 04/16/2024]
Abstract
PURPOSE: The inherent genetic heterogeneity of acute myeloid leukemia (AML) has challenged the development of precise and effective therapies. The objective of this study was to elucidate the genomic basis of drug resistance or sensitivity, identify signatures for drug response prediction, and provide resources to the research community. EXPERIMENTAL DESIGN: We performed targeted sequencing, high-throughput drug screening, and single-cell genomic profiling on leukemia cell samples derived from patients with AML. Statistical approaches and machine learning models were applied to identify signatures for drug response prediction. We also integrated large public datasets to understand the co-occurring mutation patterns and further investigated the mutation profiles in the single cells. The features revealed in the co-occurrence or mutual exclusivity patterns were further subjected to machine learning models. RESULTS: We detected genetic signatures associated with sensitivity or resistance to specific agents, and identified five co-occurring mutation groups. The application of single-cell genomic sequencing unveiled the co-occurrence of variants at the individual cell level, highlighting the presence of distinct subclones within patients with AML. Drug response prediction based on the mutation pattern achieves high accuracy in predicting sensitivity to some drug classes, such as MEK inhibitors for RAS-mutated leukemia. CONCLUSIONS: Our study highlights the importance of considering the gene mutation patterns for the prediction of drug response in AML. It provides a framework for categorizing patients with AML by mutations that enable drug sensitivity prediction.
Affiliation(s)
- Jin Dai
  - Division of Hematology, University of Washington, Seattle, Washington
  - Institute of Stem Cell and Regenerative Medicine, University of Washington, Seattle, Washington
- Sylvia Chien
  - Division of Hematology, University of Washington, Seattle, Washington
  - Institute of Stem Cell and Regenerative Medicine, University of Washington, Seattle, Washington
- Timothy J. Martins
  - Institute of Stem Cell and Regenerative Medicine, University of Washington, Seattle, Washington
- Brenda Loera
  - City of Hope National Medical Center, Duarte, California
- Quy H. Nguyen
  - University of California, Irvine, Irvine, California
- Bahar Tercan
  - Institute for Systems Biology, Seattle, Washington
- Lauren Hagen
  - Institute for Systems Biology, Seattle, Washington
- Raymond J. Monnat
  - Lab Medicine|Pathology and Genome Sciences, University of Washington, Seattle, Washington
- Pamela S. Becker
  - Division of Hematology, University of Washington, Seattle, Washington
  - Institute of Stem Cell and Regenerative Medicine, University of Washington, Seattle, Washington
  - City of Hope National Medical Center, Duarte, California
4
Sinkala M. Mutational landscape of cancer-driver genes across human cancers. Sci Rep 2023; 13:12742. [PMID: 37550388 PMCID: PMC10406856 DOI: 10.1038/s41598-023-39608-2] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Received: 09/26/2022] [Accepted: 07/27/2023] [Indexed: 08/09/2023] [Open access]
Abstract
The genetic mutations that contribute to the transformation of healthy cells into cancerous cells have been the subject of extensive research. The molecular aberrations that lead to cancer development are often characterised by gain-of-function or loss-of-function mutations in a variety of oncogenes and tumour suppressor genes. In this study, we investigate the genomic sequences of 20,331 primary tumours representing 41 distinct human cancer types to identify and catalogue the driver mutations present in 727 known cancer genes. Our findings reveal significant variations in the frequency of cancer gene mutations across different cancer types and highlight the frequent involvement of tumour suppressor genes (94%), oncogenes (93%), transcription factors (72%), kinases (64%), cell surface receptors (63%), and phosphatases (22%), in cancer. Additionally, our analysis reveals that cancer gene mutations are predominantly co-occurring rather than exclusive in all types of cancer. Notably, we discover that patients with tumours displaying different combinations of gene mutation patterns tend to exhibit variable survival outcomes. These findings provide new insights into the genetic landscape of cancer and bring us closer to a comprehensive understanding of the underlying mechanisms driving the development of various forms of cancer.
Affiliation(s)
- Musalula Sinkala
  - Department of Biomedical Sciences, School of Health Sciences, University of Zambia, Lusaka, Zambia
  - Computational Biology Division, Faculty of Health Sciences, Institute of Infectious Disease and Molecular Medicine, University of Cape Town, Cape Town, South Africa
5
Ivanovic S, El-Kebir M. Modeling and predicting cancer clonal evolution with reinforcement learning. Genome Res 2023; 33:1078-1088. [PMID: 37344104 PMCID: PMC10538496 DOI: 10.1101/gr.277672.123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/06/2023] [Accepted: 06/09/2023] [Indexed: 06/23/2023]
Abstract
Cancer results from an evolutionary process that typically yields multiple clones with varying sets of mutations within the same tumor. Accurately modeling this process is key to understanding and predicting cancer evolution. Here, we introduce clone to mutation (CloMu), a flexible and low-parameter tree generative model of cancer evolution. CloMu uses a two-layer neural network trained via reinforcement learning to determine the probability of new mutations based on the existing mutations on a clone. CloMu supports several prediction tasks, including the determination of evolutionary trajectories, tree selection, causality and interchangeability between mutations, and mutation fitness. Importantly, previous methods support only some of these tasks, and many suffer from overfitting on data sets with a large number of mutations. Using simulations, we show that CloMu either matches or outperforms current methods on a wide variety of prediction tasks. In particular, for simulated data with interchangeable mutations, current methods are unable to uncover causal relationships as effectively as CloMu. On breast cancer and leukemia cohorts, we show that CloMu determines similarities and causal relationships between mutations as well as the fitness of mutations. We validate CloMu's inferred mutation fitness values for the leukemia cohort by comparing them to clonal proportion data not used during training, showing high concordance. In summary, CloMu's low-parameter model facilitates a wide range of prediction tasks regarding cancer evolution on increasingly available cohort-level data sets.
Affiliation(s)
- Stefan Ivanovic
  - Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA
- Mohammed El-Kebir
  - Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA
  - Cancer Center at Illinois, University of Illinois Urbana-Champaign, Urbana, Illinois 61801, USA
6
Zhang W, Xiang X, Zhao B, Huang J, Yang L, Zeng Y. Identifying Cancer Driver Pathways Based on the Mouth Brooding Fish Algorithm. Entropy (Basel, Switzerland) 2023; 25:841. [PMID: 37372185 DOI: 10.3390/e25060841] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2022] [Revised: 05/05/2023] [Accepted: 05/23/2023] [Indexed: 06/29/2023]
Abstract
Identifying the driver genes of cancer progression is of great significance in improving our understanding of the causes of cancer and promoting the development of personalized treatment. In this paper, we identify the driver genes at the pathway level via an existing intelligent optimization algorithm, named the Mouth Brooding Fish (MBF) algorithm. Many methods based on the maximum weight submatrix model to identify driver pathways attach equal importance to coverage and exclusivity and assign them equal weight, but those methods ignore the impact of mutational heterogeneity. Here, we use principal component analysis (PCA) to incorporate covariate data to reduce the complexity of the algorithm and construct a maximum weight submatrix model considering different weights of coverage and exclusivity. Using this strategy, the unfavorable effect of mutational heterogeneity is overcome to some extent. Data involving lung adenocarcinoma and glioblastoma multiforme were tested with this method and the results compared with the MDPFinder, Dendrix, and Mutex methods. When the driver pathway size was 10, the recognition accuracy of the MBF method reached 80% in both datasets, and the weight values of the submatrix were 1.7 and 1.89, respectively, which are better than those of the compared methods. At the same time, in the signal pathway enrichment analysis, the important role of the driver genes identified by our MBF method in the cancer signaling pathway is revealed, and the validity of these driver genes is demonstrated from the perspective of their biological effects.
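To make the maximum weight submatrix model mentioned in this abstract concrete, here is a minimal Python sketch. It is not the MBF method. It scores a gene set with the classic Dendrix-style weight (coverage minus coverage overlap); the parameters `alpha` and `beta` are hypothetical knobs added here only to illustrate how coverage and exclusivity could be weighted unequally, and the brute-force search stands in for the metaheuristic and is practical only for small inputs.

```python
from itertools import combinations

import numpy as np


def submatrix_weight(mutations, genes, alpha=1.0, beta=1.0):
    """Score a gene set on a binary samples x genes mutation matrix.

    coverage = number of samples mutated in at least one selected gene
    overlap  = number of "extra" mutations beyond the first per covered sample
    alpha and beta are illustrative weights (assumptions, not from the paper);
    alpha = beta = 1 gives the classic weight 2*coverage - total mutations.
    """
    sub = np.asarray(mutations, dtype=bool)[:, list(genes)]
    coverage = int(sub.any(axis=1).sum())
    overlap = int(sub.sum()) - coverage
    return alpha * coverage - beta * overlap


def best_gene_set(mutations, k, alpha=1.0, beta=1.0):
    """Exhaustively search all gene sets of size k (small inputs only)."""
    n_genes = np.asarray(mutations).shape[1]
    best = max(
        combinations(range(n_genes), k),
        key=lambda genes: submatrix_weight(mutations, genes, alpha, beta),
    )
    return best, submatrix_weight(mutations, best, alpha, beta)
```

Raising `alpha` relative to `beta` favors gene sets that cover more samples even if some samples carry several mutations; raising `beta` favors strictly exclusive sets.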
Affiliation(s)
- Wei Zhang
  - College of Computer Science and Engineering, Changsha University, Changsha 410022, China
  - Hunan Province Key Laboratory of Industrial Internet Technology and Security, Changsha University, Changsha 410022, China
- Xiaowen Xiang
  - College of Computer Science and Engineering, Changsha University, Changsha 410022, China
- Bihai Zhao
  - College of Computer Science and Engineering, Changsha University, Changsha 410022, China
  - Hunan Province Key Laboratory of Industrial Internet Technology and Security, Changsha University, Changsha 410022, China
- Jianlin Huang
  - College of Computer Science and Engineering, Changsha University, Changsha 410022, China
- Lan Yang
  - College of Computer Science and Engineering, Changsha University, Changsha 410022, China
- Yifu Zeng
  - College of Computer Science and Engineering, Changsha University, Changsha 410022, China
  - Hunan Province Key Laboratory of Industrial Internet Technology and Security, Changsha University, Changsha 410022, China
7
Ostroverkhova D, Przytycka TM, Panchenko AR. Cancer driver mutations: predictions and reality. Trends Mol Med 2023:S1471-4914(23)00067-9. [PMID: 37076339 DOI: 10.1016/j.molmed.2023.03.007] [Citation(s) in RCA: 60] [Impact Index Per Article: 30.0] [Received: 02/03/2023] [Revised: 03/17/2023] [Accepted: 03/23/2023] [Indexed: 04/21/2023]
Abstract
Cancer cells accumulate many genetic alterations throughout their lifetime, but only a few of them drive cancer progression, termed driver mutations. Driver mutations may vary between cancer types and patients, can remain latent for a long time and become drivers at particular cancer stages, or may drive oncogenesis only in conjunction with other mutations. The high mutational, biochemical, and histological tumor heterogeneity makes driver mutation identification very challenging. In this review we summarize recent efforts to identify driver mutations in cancer and annotate their effects. We underline the success of computational methods to predict driver mutations in finding novel cancer biomarkers, including in circulating tumor DNA (ctDNA). We also report on the boundaries of their applicability in clinical research.
Affiliation(s)
- Daria Ostroverkhova
  - Department of Pathology and Molecular Medicine, Queen's University, Kingston, ON, Canada
- Teresa M Przytycka
  - National Library of Medicine, National Institutes of Health (NIH), Bethesda, MD, USA
- Anna R Panchenko
  - Department of Pathology and Molecular Medicine, Queen's University, Kingston, ON, Canada
  - Department of Biology and Molecular Sciences, Queen's University, Kingston, ON, Canada
  - School of Computing, Queen's University, Kingston, ON, Canada
  - Ontario Institute of Cancer Research, Toronto, ON, Canada
8
Mina M, Iyer A, Ciriello G. Epistasis and evolutionary dependencies in human cancers. Curr Opin Genet Dev 2022; 77:101989. [PMID: 36182742 DOI: 10.1016/j.gde.2022.101989] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/21/2022] [Revised: 08/29/2022] [Accepted: 08/31/2022] [Indexed: 01/27/2023]
Abstract
Cancer evolution is driven by the concerted action of multiple molecular alterations, which emerge and are selected during tumor progression. An alteration is selected when it provides an advantage to the tumor cell. However, the advantage provided by a specific alteration depends on the tumor lineage, cell epigenetic state, and presence of additional alterations. In this case, we say that an evolutionary dependency exists between an alteration and what influences its selection. Epistatic interactions between altered genes lead to evolutionary dependencies (EDs), by favoring or vetoing specific combinations of events. Large-scale cancer genomics studies have discovered examples of such dependencies, and showed that they influence tumor progression, disease phenotypes, and therapeutic response. In the past decade, several algorithmic approaches have been proposed to infer EDs from large-scale genomics datasets. These methods adopt diverse strategies to address common challenges and shed new light on cancer evolutionary trajectories. Here, we review these efforts starting from a simple conceptualization of the problem, presenting the tackled and still unmet needs in the field, and discussing the implications of EDs in cancer biology and precision oncology.
Affiliation(s)
- Marco Mina
  - Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
  - Swiss Cancer Center Leman, Lausanne, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Arvind Iyer
  - Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
  - Swiss Cancer Center Leman, Lausanne, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Giovanni Ciriello
  - Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
  - Swiss Cancer Center Leman, Lausanne, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
9
Liany H, Lin Y, Jeyasekharan A, Rajan V. An Algorithm to Mine Therapeutic Motifs for Cancer from Networks of Genetic Interactions. IEEE J Biomed Health Inform 2022; 26:2830-2838. [PMID: 34990373 DOI: 10.1109/jbhi.2022.3141076] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
Abstract
The study of pairwise genetic interactions, such as mutually exclusive mutations, has led to an understanding of underlying mechanisms in cancer. Investigation of various combinatorial motifs within networks of such interactions can lead to deeper insights into cancer's mutational landscape and inform therapy development. One such motif, called the Between-Pathway Model (BPM), represents redundant or compensatory pathways that can be therapeutically exploited. Finding such BPM motifs is challenging since most formulations require solving variants of the NP-complete maximum weight bipartite subgraph problem. In this paper we design an algorithm based on Integer Linear Programming (ILP) to solve this problem. In our experiments, our approach outperforms the best previous method to mine BPM motifs. Further, our ILP-based approach allows us to easily model additional application-specific constraints. We illustrate this advantage through a new application of BPM motifs that can potentially aid in finding combination therapies to combat cancer.
10
Ahmed R, Erten C, Houdjedj A, Kazan H, Yalcin C. A Network-Centric Framework for the Evaluation of Mutual Exclusivity Tests on Cancer Drivers. Front Genet 2021; 12:746495. [PMID: 34899838 PMCID: PMC8664367 DOI: 10.3389/fgene.2021.746495] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/23/2021] [Accepted: 10/27/2021] [Indexed: 12/03/2022] [Open access]
Abstract
One of the key concepts employed in cancer driver gene identification is that of mutual exclusivity (ME): a driver mutation is less likely to occur in case of an earlier mutation that has common functionality in the same molecular pathway. Several ME tests have been proposed recently; however, the current protocols to evaluate ME tests have two main limitations. Firstly, the evaluations are mostly with respect to simulated data, and secondly, the evaluation metrics lack a network-centric view. The latter is especially crucial as the notion of common functionality can be captured by searching for interaction patterns in relevant networks. We propose a network-centric framework to evaluate the pairwise significances found by statistical ME tests. It has three main components. The first component consists of metrics employed in the network-centric ME evaluations. Such metrics are designed so that network knowledge and the reference set of known cancer genes are incorporated in ME evaluations under a careful definition of proper control groups. The other two components are designed as further mechanisms to avoid confounders inherent in ME detection on top of the network-centric view. To this end, our second objective is to dissect the side effects caused by mutation load artifacts, where mutations driving tumor subtypes with low mutation load might be incorrectly diagnosed as mutually exclusive. Finally, as part of the third main component, the confounding issue stemming from the use of nonspecific interaction networks generated as combinations of interactions from different tissues is resolved through the creation and use of tissue-specific networks in the proposed framework. The data, the source code and useful scripts are available at: https://github.com/abu-compbio/NetCentric.
Affiliation(s)
- Rafsan Ahmed
  - Electrical and Computer Engineering Graduate Program, Antalya Bilim University, Antalya, Turkey
- Cesim Erten
  - Department of Computer Engineering, Antalya Bilim University, Antalya, Turkey
- Aissa Houdjedj
  - Department of Computer Engineering, Antalya Bilim University, Antalya, Turkey
- Hilal Kazan
  - Department of Computer Engineering, Antalya Bilim University, Antalya, Turkey
- Cansu Yalcin
  - Department of Computer Engineering, Antalya Bilim University, Antalya, Turkey
11
Fedrizzi T, Ciani Y, Lorenzin F, Cantore T, Gasperini P, Demichelis F. Fast mutual exclusivity algorithm nominates potential synthetic lethal gene pairs through brute force matrix product computations. Comput Struct Biotechnol J 2021; 19:4394-4403. [PMID: 34429855 PMCID: PMC8369001 DOI: 10.1016/j.csbj.2021.08.001] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 03/12/2021] [Revised: 08/02/2021] [Accepted: 08/03/2021] [Indexed: 12/12/2022] [Open access]
Abstract
Mutual exclusivity analysis of genomic aberrations contributes to the exploration of potential synthetic lethal (SL) relationships, thus guiding the nomination of specific cancer cell vulnerabilities. When multiple classes of genomic aberrations and large cohorts of patients are interrogated, exhaustive genome-wide analyses are not computationally feasible with commonly used approaches. Here we present Fast Mutual Exclusivity (FaME), an algorithm based on matrix multiplication that employs a logarithm-based implementation of Fisher's exact test to achieve fast computation of genome-wide mutual exclusivity tests; we show that brute force testing for mutual exclusivity of hundreds of millions of aberration combinations can be performed in a few minutes. We applied FaME to allele-specific data from whole exome experiments of 27 TCGA study cohorts, detecting both mutual exclusivity of point mutations and allele-specific copy number signals that span sets of contiguous cytobands. We next focused on a case study involving the loss of tumor suppressors and druggable genes while exploiting an integrated analysis of both public cell line loss-of-function screen data and patients' transcriptomic profiles. The FaME algorithm implementation as well as allele-specific analysis output are publicly available at https://github.com/demichelislab/FaME.
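The two ingredients named in this abstract, pairwise counts from a matrix product and a logarithm-based Fisher's exact test, can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the FaME implementation: the function names are hypothetical, the input is assumed to be a binary genes x samples matrix, and the p-value loop is left unvectorized for readability.

```python
import math

import numpy as np


def log_comb(n, k):
    """log of the binomial coefficient C(n, k), computed with log-gamma."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)


def me_pvalue(k, n_a, n_b, n):
    """One-sided Fisher p-value for mutual exclusivity, P(X <= k).

    X is hypergeometric: the number of samples altered in both genes when
    n_a and n_b of n samples are altered in gene A and gene B respectively.
    """
    lowest = max(0, n_a + n_b - n)
    total = sum(
        math.exp(log_comb(n_a, i) + log_comb(n - n_a, n_b - i) - log_comb(n, n_b))
        for i in range(lowest, k + 1)
    )
    return min(1.0, total)


def pairwise_me(alterations):
    """Test every gene pair of a binary genes x samples matrix."""
    a = np.asarray(alterations, dtype=np.int64)
    n_samples = a.shape[1]
    both = a @ a.T            # both[i, j] = samples altered in genes i and j
    totals = np.diag(both)    # totals[i] = samples altered in gene i
    pvals = {}
    for i in range(a.shape[0]):
        for j in range(i + 1, a.shape[0]):
            pvals[(i, j)] = me_pvalue(
                int(both[i, j]), int(totals[i]), int(totals[j]), n_samples
            )
    return both, pvals
```

A single matrix product yields the co-alteration counts for all pairs at once, which is what makes exhaustive testing tractable; working in log space keeps the binomial coefficients from overflowing on large cohorts.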
Affiliation(s)
- Tarcisio Fedrizzi
  - Department of Cellular, Computational and Integrative Biology, University of Trento, 38123 Trento, Italy
- Yari Ciani
  - Department of Cellular, Computational and Integrative Biology, University of Trento, 38123 Trento, Italy
- Francesca Lorenzin
  - Department of Cellular, Computational and Integrative Biology, University of Trento, 38123 Trento, Italy
- Thomas Cantore
  - Department of Cellular, Computational and Integrative Biology, University of Trento, 38123 Trento, Italy
- Paola Gasperini
  - Department of Cellular, Computational and Integrative Biology, University of Trento, 38123 Trento, Italy
- Francesca Demichelis
  - Department of Cellular, Computational and Integrative Biology, University of Trento, 38123 Trento, Italy
  - Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY, USA
  - The HRH Prince Alwaleed Bin Talal Bin Abdulaziz Al-Saud Institute for Computational Biomedicine, Weill Cornell Medical College, New York, NY 10021, USA
  - The Caryl and Israel Englander Institute for Precision Medicine, Weill Cornell Medicine, New York, NY 10021, USA
|
12
|
Yang Z, Yu G, Guo M, Yu J, Zhang X, Wang J. CDPath: Cooperative Driver Pathways Discovery Using Integer Linear Programming and Markov Clustering. IEEE/ACM Trans Comput Biol Bioinform 2021; 18:1384-1395. [PMID: 31581094 DOI: 10.1109/tcbb.2019.2945029] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 06/10/2023]
Abstract
Discovering driver pathways is an essential task for understanding the pathogenesis of cancer and designing precise treatments for cancer patients. Increasing evidence indicates that multiple pathways often function cooperatively in carcinogenesis. In this study, we propose an approach called CDPath to discover cooperative driver pathways. CDPath first uses Integer Linear Programming to explore driver core modules from mutation profiles by enforcing co-occurrence and functional interaction relations between modules, and by maximizing the mutual exclusivity and coverage within modules. Next, to enforce cooperation of pathways and support the subsequent exact discovery of cooperative driver pathways, it performs Markov clustering on a pathway-pathway interaction network to cluster pathways. After that, it identifies pathways in different modules but in the same clusters as cooperative driver pathways. We apply CDPath to two TCGA datasets: breast cancer (BRCA) and endometrial cancer (UCEC). The results show that CDPath can identify known (e.g., TP53) and potential (e.g., SPTBN2) driver genes. In addition, the identified cooperative driver pathways are related to the target cancer, and they are involved in carcinogenesis and several key biological processes. CDPath can uncover more potential biological associations between pathways (over 100 percent) and more cooperative driver pathways (over 200 percent) than competing approaches. The demo code of CDPath is available at http://mlda.swu.edu.cn/codes.php?name=CDPath.
|
13
|
Baali I, Erten C, Kazan H. DriveWays: a method for identifying possibly overlapping driver pathways in cancer. Sci Rep 2020; 10:21971. [PMID: 33319839 PMCID: PMC7738685 DOI: 10.1038/s41598-020-78852-8] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 04/14/2020] [Accepted: 11/19/2020] [Indexed: 11/22/2022] Open
Abstract
Most previous methods for identifying cancer driver modules output nonoverlapping modules. This assumption is biologically inaccurate, as genes can participate in multiple molecular pathways. This is particularly true for cancer-associated genes, as many of them are network hubs connecting functionally distinct sets of genes. It is important to provide combinatorial optimization problem definitions that model this biological phenomenon and to suggest efficient algorithms for their solution. We provide a formal definition of the Overlapping Driver Module Identification in Cancer (ODMIC) problem and show that the problem is NP-hard. We propose a seed-and-extend heuristic named DriveWays that identifies overlapping cancer driver modules from the graph built from the IntAct PPI network. DriveWays incorporates mutual exclusivity, coverage, and the network connectivity information of the genes. We show that DriveWays outperforms state-of-the-art methods in recovering well-known cancer driver genes on TCGA pan-cancer data. Additionally, the output modules of DriveWays show a stronger enrichment for the reference pathways in almost all cases. Overall, we show that enabling modules to overlap improves the recovery of functional pathways filtered with known cancer drivers, which essentially constitute the reference set of cancer-related pathways.
Affiliation(s)
- Ilyes Baali
  - Electrical and Computer Engineering Graduate Program, Antalya Bilim University, 07190, Antalya, Turkey
- Cesim Erten
  - Department of Computer Engineering, Antalya Bilim University, 07190, Antalya, Turkey
- Hilal Kazan
  - Department of Computer Engineering, Antalya Bilim University, 07190, Antalya, Turkey
|
14
|
A forward selection algorithm to identify mutually exclusive alterations in cancer studies. J Hum Genet 2020; 66:509-518. [PMID: 33177701 DOI: 10.1038/s10038-020-00870-1] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 04/21/2020] [Revised: 08/11/2020] [Accepted: 10/23/2020] [Indexed: 01/18/2023]
Abstract
Mutual exclusivity analyses provide an effective tool for distinguishing driver genes from passenger genes in cancer studies. Various algorithms have been developed for the detection of mutual exclusivity, but controlling false positives and improving accuracy remain challenging. We propose a forward selection algorithm for the identification of mutually exclusive gene sets (FSME). The method first searches for a seed pair of mutually exclusive (ME) genes and then adds further genes to the current ME set. Simulations demonstrated that, compared to recently published approaches (i.e., CoMEt, WExT, and MEGSA), FSME could provide higher precision or recall in identifying ME gene sets and had superior control of false positive rates. Applying FSME to real TCGA data sets for AML, BRCA, and GBM, we confirmed that it can be used to discover cancer driver genes.
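The seed-then-extend procedure described in this abstract can be sketched as a greedy loop. The abstract does not give the FSME test statistic, so the sketch below swaps in a simple stand-in score (twice the number of covered samples minus the total number of alterations, which rewards coverage and penalizes overlap). The function names, the `max_size` parameter, and the stopping rule are all hypothetical.

```python
from itertools import combinations

def weight(genes, altered):
    """Stand-in exclusivity score (NOT the FSME statistic):
    2 * |samples covered by the set| - sum of per-gene alteration counts."""
    covered = set().union(*(altered[g] for g in genes))
    return 2 * len(covered) - sum(len(altered[g]) for g in genes)

def forward_select(altered, max_size=5):
    """Greedy forward selection over a dict gene -> set of altered samples."""
    genes = sorted(altered)
    # Step 1: choose the best-scoring seed pair.
    seed = max(combinations(genes, 2), key=lambda pair: weight(pair, altered))
    current = list(seed)
    # Step 2: repeatedly add the gene that improves the score the most.
    while len(current) < max_size:
        candidates = [g for g in genes if g not in current]
        if not candidates:
            break
        best = max(candidates, key=lambda g: weight(current + [g], altered))
        if weight(current + [best], altered) <= weight(current, altered):
            break  # no candidate improves the set, so stop
        current.append(best)
    return current
```

In a statistical version of this loop, the score comparison would be replaced by a significance test at each step, which is where control of false positives would enter.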
|
15
|
Inferring tumor progression in large datasets. PLoS Comput Biol 2020; 16:e1008183. [PMID: 33035204 PMCID: PMC7577444 DOI: 10.1371/journal.pcbi.1008183] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 04/29/2020] [Revised: 10/21/2020] [Accepted: 07/22/2020] [Indexed: 12/31/2022] Open
Abstract
Identification of mutations of the genes that give cancer a selective advantage is an important step towards research and clinical objectives. As such, there has been a growing interest in developing methods for identification of driver genes and their temporal order within a single patient (intra-tumor) as well as across a cohort of patients (inter-tumor). In this paper, we develop a probabilistic model for tumor progression, in which the driver genes are clustered into several ordered driver pathways. We develop an efficient inference algorithm that scales more favorably with the number of genes and samples than a previously introduced ILP-based method. Adopting a probabilistic approach also allows principled approaches to model selection and uncertainty quantification. Using a large set of experiments on synthetic datasets, we demonstrate superior performance compared to the ILP-based method. We also analyze two biological datasets of colorectal and glioblastoma cancers. We emphasize that while the ILP-based method puts many seemingly passenger genes in the driver pathways, our algorithm stays focused on true driver genes and outputs more accurate models of cancer progression.

Cancer is a disease caused by the accumulation of somatic mutations in the genome. This process is mainly driven by mutations in certain genes that give the harboring cells some selective advantage. The rather few driver genes are usually masked amongst an abundance of so-called passenger mutations. Identification of the driver genes and the temporal order in which the mutations occur is of great importance towards research and clinical objectives. In this paper, we introduce a probabilistic model for cancer progression and devise an efficient inference algorithm to train the model. We show that our method scales favorably to large datasets and provides superior performance compared to an ILP-based counterpart on a wide set of synthetic data simulations. Our Bayesian approach also allows for systematic model selection and confidence quantification procedures, in contrast to the previous non-probabilistic progression models. We also study two large datasets on colorectal and glioblastoma cancers and validate our inferred model in comparison to the ILP-based method.
|
16
|
Kim YA, Sarto Basso R, Wojtowicz D, Liu AS, Hochbaum DS, Vandin F, Przytycka TM. Identifying Drug Sensitivity Subnetworks with NETPHIX. iScience 2020; 23:101619. [PMID: 33089107 PMCID: PMC7566085 DOI: 10.1016/j.isci.2020.101619] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 10/17/2019] [Revised: 09/08/2020] [Accepted: 09/24/2020] [Indexed: 12/29/2022] Open
Abstract
Phenotypic heterogeneity in cancer is often caused by different patterns of genetic alterations. Understanding such phenotype-genotype relationships is fundamental for the advance of personalized medicine. We develop a computational method, named NETPHIX (NETwork-to-PHenotype association with eXclusivity) to identify subnetworks of genes whose genetic alterations are associated with drug response or other continuous cancer phenotypes. Leveraging interaction information among genes and properties of cancer mutations such as mutual exclusivity, we formulate the problem as an integer linear program and solve it optimally to obtain a subnetwork of associated genes. Applied to a large-scale drug screening dataset, NETPHIX uncovered gene modules significantly associated with drug responses. Utilizing interaction information, NETPHIX modules are functionally coherent and can thus provide important insights into drug action. In addition, we show that modules identified by NETPHIX together with their association patterns can be leveraged to suggest drug combinations.
Affiliation(s)
- Yoo-Ah Kim
  - National Center of Biotechnology Information, National Library of Medicine, NIH, Bethesda, MD 20894, USA
- Rebecca Sarto Basso
  - Department of Industrial Engineering and Operations Research, University of California at Berkeley, Berkeley, CA 94709, USA
- Damian Wojtowicz
  - National Center of Biotechnology Information, National Library of Medicine, NIH, Bethesda, MD 20894, USA
- Amanda S Liu
  - Montgomery Blair High School, Silver Spring, MD 20901, USA
- Dorit S Hochbaum
  - Department of Industrial Engineering and Operations Research, University of California at Berkeley, Berkeley, CA 94709, USA
- Fabio Vandin
  - Department of Information Engineering, University of Padova, Padova 35131, Italy
- Teresa M Przytycka
  - National Center of Biotechnology Information, National Library of Medicine, NIH, Bethesda, MD 20894, USA
|
17
|
Ahmed R, Baali I, Erten C, Hoxha E, Kazan H. MEXCOwalk: mutual exclusion and coverage based random walk to identify cancer modules. Bioinformatics 2020; 36:872-879. [PMID: 31432076 DOI: 10.1093/bioinformatics/btz655] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 03/25/2019] [Revised: 07/03/2019] [Accepted: 08/18/2019] [Indexed: 12/25/2022] Open
Abstract
MOTIVATION Genomic analyses from large cancer cohorts have revealed the mutational heterogeneity problem which hinders the identification of driver genes based only on mutation profiles. One way to tackle this problem is to incorporate the fact that genes act together in functional modules. The connectivity knowledge present in existing protein-protein interaction (PPI) networks together with mutation frequencies of genes and the mutual exclusivity of cancer mutations can be utilized to increase the accuracy of identifying cancer driver modules. RESULTS We present a novel edge-weighted random walk-based approach that incorporates connectivity information in the form of protein-protein interactions (PPIs), mutual exclusivity and coverage to identify cancer driver modules. MEXCOwalk outperforms several state-of-the-art computational methods on TCGA pan-cancer data in terms of recovering known cancer genes, providing modules that are capable of classifying normal and tumor samples and that are enriched for mutations in specific cancer types. Furthermore, the risk scores determined with output modules can stratify patients into low-risk and high-risk groups in multiple cancer types. MEXCOwalk identifies modules containing both well-known cancer genes and putative cancer genes that are rarely mutated in the pan-cancer data. The data, the source code and useful scripts are available at: https://github.com/abu-compbio/MEXCOwalk. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
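The edge-weighted random walk mentioned in this abstract can be illustrated with a generic random walk with restart on a weighted, undirected graph. This sketch does not reproduce how MEXCOwalk derives its edge weights from mutual exclusivity and coverage; the weights are taken as given, and the function name, the `restart` value, and the convergence settings are assumptions.

```python
def random_walk_with_restart(edges, seed_heat, restart=0.4, tol=1e-10, max_iter=1000):
    """Iterate p <- restart * p0 + (1 - restart) * W p until convergence.

    edges:     dict mapping an undirected edge (u, v) to a positive weight
    seed_heat: dict mapping every node to its initial heat (e.g. mutation frequency)
    Assumes every node has at least one incident edge, so no probability is lost.
    """
    nodes = sorted(seed_heat)
    # Total weight leaving each node, used to normalize transition probabilities.
    out_weight = {u: 0.0 for u in nodes}
    for (u, v), w in edges.items():
        out_weight[u] += w
        out_weight[v] += w
    total = sum(seed_heat.values())
    p0 = {u: seed_heat[u] / total for u in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {u: restart * p0[u] for u in nodes}
        for (u, v), w in edges.items():
            # Heat flows across each edge in both directions.
            nxt[v] += (1 - restart) * p[u] * w / out_weight[u]
            nxt[u] += (1 - restart) * p[v] * w / out_weight[v]
        delta = sum(abs(nxt[u] - p[u]) for u in nodes)
        p = nxt
        if delta < tol:
            break
    return p
```

Raising the weight of an edge sends more heat across it, so a weighting scheme that favors mutually exclusive, high-coverage gene pairs would concentrate heat in candidate driver modules.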
Affiliation(s)
- Rafsan Ahmed
  - Electrical and Computer Engineering Graduate Program, Department of Computer Engineering, Antalya Bilim University, Antalya 07190, Turkey
- Ilyes Baali
  - Electrical and Computer Engineering Graduate Program, Department of Computer Engineering, Antalya Bilim University, Antalya 07190, Turkey
- Cesim Erten
  - Department of Computer Engineering, Antalya Bilim University, Antalya 07190, Turkey
- Evis Hoxha
  - Department of Computer Engineering, Antalya Bilim University, Antalya 07190, Turkey
- Hilal Kazan
  - Department of Computer Engineering, Antalya Bilim University, Antalya 07190, Turkey
|
18
|
Mateo L, Duran-Frigola M, Gris-Oliver A, Palafox M, Scaltriti M, Razavi P, Chandarlapaty S, Arribas J, Bellet M, Serra V, Aloy P. Personalized cancer therapy prioritization based on driver alteration co-occurrence patterns. Genome Med 2020; 12:78. [PMID: 32907621 PMCID: PMC7488324 DOI: 10.1186/s13073-020-00774-x] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 05/13/2020] [Accepted: 08/11/2020] [Indexed: 12/19/2022] Open
Abstract
Identification of actionable genomic vulnerabilities is key to precision oncology. Utilizing a large-scale drug screening in patient-derived xenografts, we uncover driver gene alteration connections, derive driver co-occurrence (DCO) networks, and relate these to drug sensitivity. Our collection of 53 drug-response predictors attains an average balanced accuracy of 58% in a cross-validation setting, rising to 66% for a subset of high-confidence predictions. We experimentally validated 12 out of 14 predictions in mice and adapted our strategy to obtain drug-response models from patients’ progression-free survival data. Our strategy reveals links between oncogenic alterations, increasing the clinical impact of genomic profiling.
Affiliation(s)
- Lidia Mateo
  - Joint IRB-BSC-CRG Program in Computational Biology, Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain
- Miquel Duran-Frigola
  - Joint IRB-BSC-CRG Program in Computational Biology, Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain
- Albert Gris-Oliver
  - Experimental Therapeutics Group, Vall d'Hebron Institute of Oncology, Barcelona, Catalonia, Spain
- Marta Palafox
  - Experimental Therapeutics Group, Vall d'Hebron Institute of Oncology, Barcelona, Catalonia, Spain
- Maurizio Scaltriti
  - Human Oncology and Pathogenesis Program, Memorial Sloan Kettering Cancer Center (MSKCC), New York, NY, 10065, USA
  - Department of Pathology, MSKCC, New York, NY, 10065, USA
- Pedram Razavi
  - Human Oncology and Pathogenesis Program, Memorial Sloan Kettering Cancer Center (MSKCC), New York, NY, 10065, USA
  - Breast Medicine Service, Department of Medicine, MSKCC and Weill-Cornell Medical College, New York, NY, 10065, USA
- Sarat Chandarlapaty
  - Human Oncology and Pathogenesis Program, Memorial Sloan Kettering Cancer Center (MSKCC), New York, NY, 10065, USA
  - Breast Medicine Service, Department of Medicine, MSKCC and Weill-Cornell Medical College, New York, NY, 10065, USA
- Joaquin Arribas
  - Growth Factors Laboratory, Vall d'Hebron Institute of Oncology, Barcelona, Catalonia, Spain
  - Department of Biochemistry and Molecular Biology, Universitat Autònoma de Barcelona, Bellaterra, Catalonia, Spain
  - Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Catalonia, Spain
  - CIBERONC, Barcelona, Spain
- Meritxell Bellet
  - Breast Cancer Group, Vall d'Hebron Institute of Oncology, Barcelona, Catalonia, Spain
  - Department of Medical Oncology, Hospital Vall d'Hebron, Universitat Autònoma de Barcelona, Barcelona, Catalonia, Spain
- Violeta Serra
  - Experimental Therapeutics Group, Vall d'Hebron Institute of Oncology, Barcelona, Catalonia, Spain
  - CIBERONC, Barcelona, Spain
- Patrick Aloy
  - Joint IRB-BSC-CRG Program in Computational Biology, Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain
  - Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Catalonia, Spain
|
19
|
Kim YA, Wojtowicz D, Sarto Basso R, Sason I, Robinson W, Hochbaum DS, Leiserson MDM, Sharan R, Vadin F, Przytycka TM. Network-based approaches elucidate differences within APOBEC and clock-like signatures in breast cancer. Genome Med 2020; 12:52. [PMID: 32471470 PMCID: PMC7260830 DOI: 10.1186/s13073-020-00745-2] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 03/11/2020] [Accepted: 05/07/2020] [Indexed: 01/27/2023] Open
Abstract
BACKGROUND Studies of cancer mutations have typically focused on identifying cancer-driving mutations that confer a growth advantage to cancer cells. However, cancer genomes accumulate a large number of passenger somatic mutations resulting from various endogenous and exogenous causes, including normal DNA damage and repair processes, cancer-related aberrations of DNA maintenance machinery, and mutations triggered by carcinogenic exposures. Different mutagenic processes often produce characteristic mutational patterns called mutational signatures. Identifying the mutagenic processes underlying the mutational signatures that shape a cancer genome is an important step towards understanding tumorigenesis. METHODS To investigate the genetic aberrations associated with mutational signatures, we took a network-based approach considering mutational signatures as cancer phenotypes. Specifically, our analysis aims to answer the following two complementary questions: (i) what are the functional pathways whose gene expression activities correlate with the strengths of mutational signatures, and (ii) are there pathways whose genetic alterations might have led to specific mutational signatures? To identify mutated pathways, we adopted a recently developed optimization method based on integer linear programming. RESULTS Analyzing a breast cancer dataset, we identified pathways associated with mutational signatures on both expression and mutation levels. Our analysis captured important differences in the etiology of the APOBEC-related signatures and the two clock-like signatures. In particular, it revealed that clustered and dispersed APOBEC mutations may be caused by different mutagenic processes. In addition, our analysis elucidated differences between two age-related signatures: one of the signatures is correlated with the expression of cell cycle genes, while the other has no such correlation but shows patterns consistent with exposure to environmental/external processes.
CONCLUSIONS This work investigated, for the first time, a network-level association of mutational signatures and dysregulated pathways. The identified pathways and subnetworks provide novel insights into mutagenic processes that the cancer genomes might have undergone and important clues for developing personalized drug therapies.
Affiliation(s)
- Yoo-Ah Kim
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, Bethesda, 20894 USA
- Damian Wojtowicz
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, Bethesda, 20894 USA
- Rebecca Sarto Basso
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, Bethesda, 20894 USA
  - Department of Industrial Engineering and Operations Research, University of California, Berkeley, 94720 CA USA
- Itay Sason
  - School of Computer Science, Tel Aviv University, Tel Aviv, 69978 Israel
- Welles Robinson
  - Center for Bioinformatics and Computational Biology, University of Maryland, 8314 Paint Branch Dr, College Park, 20742 USA
- Dorit S. Hochbaum
  - Department of Industrial Engineering and Operations Research, University of California, Berkeley, 94720 CA USA
- Mark D. M. Leiserson
  - Center for Bioinformatics and Computational Biology, University of Maryland, 8314 Paint Branch Dr, College Park, 20742 USA
- Roded Sharan
  - School of Computer Science, Tel Aviv University, Tel Aviv, 69978 Israel
- Fabio Vadin
  - Department of Information Engineering, University of Padova, Via Gradenigo 6/A, Padua, I-35131 Italy
- Teresa M. Przytycka
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, Bethesda, 20894 USA
|
20
|
Zhang W, Zeng Y, Wang L, Liu Y, Cheng YN. An Effective Graph Clustering Method to Identify Cancer Driver Modules. Front Bioeng Biotechnol 2020; 8:271. [PMID: 32318558 PMCID: PMC7154174 DOI: 10.3389/fbioe.2020.00271] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 02/17/2020] [Accepted: 03/16/2020] [Indexed: 12/15/2022] Open
Abstract
Identifying the molecular modules that drive cancer progression can greatly deepen the understanding of cancer mechanisms and provide useful information for targeted therapies. Most methods currently addressing this issue primarily use mutual exclusivity without making full use of the extra layer of module properties. In this paper, we propose MCLCluster to identify cancer driver modules. It uses somatic mutation data, Cancer Cell Fraction (CCF) data, a gene functional interaction network and a protein-protein interaction (PPI) network to derive module properties based on mutual exclusivity, connectivity in the PPI network and functional similarity of genes. We have taken three measures to ensure the effectiveness of our algorithm. First, we use CCF data to choose stronger signals and more confident mutations. Second, the weighted gene functional interaction network is used to quantify gene functional similarity in the PPI network. Third, a Markov-based graph clustering method is used to extract candidate modules. MCLCluster is tested on two TCGA datasets (GBM and BRCA) and identifies several well-known oncogene driver modules and some modules with functionally associated driver genes. In addition, we compare it with Multi-Dendrix, FSME Cluster and RME on simulated datasets with background noise and passenger rate; MCLCluster outperforms all of these methods.
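The Markov-based graph clustering step named in this abstract follows the general pattern of the Markov Cluster (MCL) algorithm, which alternates expansion (squaring the transition matrix) with inflation (raising entries to a power and renormalizing columns). The sketch below is a generic, dense, pure-Python illustration rather than the MCLCluster code; the function names, the default `inflation` of 2.0, the fixed iteration count, and the `1e-6` membership threshold are assumptions.

```python
def normalize_columns(M):
    """Scale each column to sum to 1 so that M is column-stochastic."""
    n = len(M)
    col_sums = [sum(M[i][j] for i in range(n)) for j in range(n)]
    return [[M[i][j] / col_sums[j] for j in range(n)] for i in range(n)]

def matmul(A, B):
    """Dense square matrix product."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def markov_cluster(adj, inflation=2.0, iterations=50):
    """Cluster an undirected graph given as a weighted adjacency matrix."""
    n = len(adj)
    # Self-loops are a common MCL convention; they also keep every column nonzero.
    M = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    M = normalize_columns(M)
    for _ in range(iterations):
        M = matmul(M, M)  # expansion: flow spreads along longer paths
        M = normalize_columns([[x ** inflation for x in row] for row in M])  # inflation
    # Each row that still holds flow defines a cluster of the columns it attracts.
    clusters = set()
    for i in range(n):
        members = frozenset(j for j in range(n) if M[i][j] > 1e-6)
        if members:
            clusters.add(members)
    return sorted(sorted(c) for c in clusters)
```

A larger `inflation` value tends to produce more, smaller clusters. A method that clusters a weighted network in this way would pass its similarity weights in as the adjacency matrix.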
Affiliation(s)
- Wei Zhang
  - College of Computer Engineering and Applied Mathematics, Changsha University, Changsha, China
  - Hunan Province Key Laboratory of Industrial Internet Technology and Security, Changsha University, Changsha, China
- Yifu Zeng
  - College of Computer Engineering and Applied Mathematics, Changsha University, Changsha, China
  - Hunan Province Key Laboratory of Industrial Internet Technology and Security, Changsha University, Changsha, China
- Lei Wang
  - College of Computer Engineering and Applied Mathematics, Changsha University, Changsha, China
  - Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, Xiangtan, China
- Yue Liu
  - College of Computer Science and Electronics Engineering, Hunan University, Changsha, China
- Yi-Nan Cheng
  - College of Science, Southern University of Science and Technology, Shenzhen, China
|
21
|
Wang J, Yang Z, Domeniconi C, Zhang X, Yu G. Cooperative driver pathway discovery via fusion of multi-relational data of genes, miRNAs and pathways. Brief Bioinform 2020; 22:1984-1999. [PMID: 32103253 DOI: 10.1093/bib/bbz167] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 11/04/2019] [Revised: 12/13/2019] [Accepted: 12/29/2019] [Indexed: 12/19/2022] Open
Abstract
Discovering driver pathways is an essential step to uncover the molecular mechanism underlying cancer and to explore precise treatments for cancer patients. However, due to the difficulties of mapping genes to pathways and the limited knowledge about pathway interactions, most previous work focuses on identifying individual pathways. In practice, two (or even more) pathways interplay and often cooperatively trigger cancer. In this study, we propose a new approach called CDPathway to discover cooperative driver pathways. First, CDPathway introduces a driver impact quantification function to quantify the driver weight of each gene. CDPathway assumes that genes with larger weights contribute more to the occurrence of the target disease and identifies them as candidate driver genes. Next, it constructs a heterogeneous network composed of gene, miRNA and pathway nodes based on the known intra(inter)-relations between them and assigns the quantified driver weights to gene-pathway and gene-miRNA relational edges. To transfer driver impacts of genes to pathway interaction pairs, CDPathway collaboratively factorizes the weighted adjacency matrices of the heterogeneous network to explore the latent relations between genes, miRNAs and pathways. After this, it reconstructs the pathway interaction network and identifies the pathway pairs with maximal interactive and driver weights as cooperative driver pathways. Experimental results on the breast, uterine corpus endometrial carcinoma and ovarian cancer data from The Cancer Genome Atlas show that CDPathway can effectively identify candidate driver genes [area under the receiver operating characteristic curve (AUROC) of ≥0.9] and reconstruct the pathway interaction network (AUROC of >0.9), and it uncovers many more known (potential) driver genes than other competitive methods. In addition, CDPathway identifies 150% more driver pathways and 60% more potential cooperative driver pathways than the competing methods. The code of CDPathway is available at http://mlda.swu.edu.cn/codes.php?name=CDPathway.
Affiliation(s)
- Jun Wang
  - Professor of the School of Software, Shandong University
- Ziying Yang
  - Professor of the School of Software, Shandong University
- Xiangliang Zhang
  - Computational Bioscience Research Center (CBRC), Computer Science, Electrical and Mathematical Science and Engineering Division, King Abdullah University of Science and Technology, SA
- Guoxian Yu
  - Computational Bioscience Research Center (CBRC), Computer Science, Electrical and Mathematical Science and Engineering Division, King Abdullah University of Science and Technology, SA
  - Professor of the School of Software, Shandong University and Computational Bioscience Research Center
|
22
|
Reyna MA, Leiserson MDM, Raphael BJ. Hierarchical HotNet: identifying hierarchies of altered subnetworks. Bioinformatics 2019; 34:i972-i980. [PMID: 30423088 PMCID: PMC6129270 DOI: 10.1093/bioinformatics/bty613] [Citation(s) in RCA: 80] [Impact Index Per Article: 13.3] [Indexed: 01/06/2023] Open
Abstract
Motivation The analysis of high-dimensional ‘omics data is often informed by the use of biological interaction networks. For example, protein–protein interaction networks have been used to analyze gene expression data, to prioritize germline variants, and to identify somatic driver mutations in cancer. In these and other applications, the underlying computational problem is to identify altered subnetworks containing genes that are both highly altered in an ‘omics dataset and are topologically close (e.g. connected) on an interaction network. Results We introduce Hierarchical HotNet, an algorithm that finds a hierarchy of altered subnetworks. Hierarchical HotNet assesses the statistical significance of the resulting subnetworks over a range of biological scales and explicitly controls for ascertainment bias in the network. We evaluate the performance of Hierarchical HotNet and several other algorithms that identify altered subnetworks on the problem of predicting cancer genes and significantly mutated subnetworks. On somatic mutation data from The Cancer Genome Atlas, Hierarchical HotNet outperforms other methods and identifies significantly mutated subnetworks containing both well-known cancer genes and candidate cancer genes that are rarely mutated in the cohort. Hierarchical HotNet is a robust algorithm for identifying altered subnetworks across different ‘omics datasets. Availability and implementation http://github.com/raphael-group/hierarchical-hotnet. Supplementary information Supplementary material are available at Bioinformatics online.
Affiliation(s)
- Matthew A Reyna
  - Department of Computer Science, Princeton University, Princeton, NJ, USA
- Mark D M Leiserson
  - Department of Computer Science, University of Maryland, College Park, MD, USA
- Benjamin J Raphael
  - Department of Computer Science, Princeton University, Princeton, NJ, USA
|
23
|
Abstract
Classically, phenotype is what is observed, and genotype is the genetic makeup. Statistical studies aim to project phenotypic likelihoods of genotypic patterns. The traditional genotype-to-phenotype theory embraces the view that the encoded protein shape together with gene expression level largely determines the resulting phenotypic trait. Here, we point out that the molecular biology revolution at the turn of the century explained that the gene encodes not one but ensembles of conformations, which in turn spell all possible gene-associated phenotypes. The significance of a dynamic ensemble view is in understanding the linkage between genetic change and the gained observable physical or biochemical characteristics. Thus, despite the transformative shift in our understanding of the basis of protein structure and function, the literature still commonly relates to the classical genotype-phenotype paradigm. This is important because an ensemble view clarifies how even seemingly small genetic alterations can lead to pleiotropic traits in adaptive evolution and in disease, why cellular pathways can be modified in monogenic and polygenic traits, and how the environment may tweak protein function.
Affiliation(s)
- Ruth Nussinov
  - Cancer and Inflammation Program, Leidos Biomedical Research, Inc., Frederick National Laboratory for Cancer Research, National Cancer Institute at Frederick, Frederick, Maryland, United States of America
  - Sackler Institute of Molecular Medicine, Department of Human Genetics and Molecular Medicine, Sackler School of Medicine, Tel Aviv University, Tel Aviv, Israel
- Chung-Jung Tsai
  - Cancer and Inflammation Program, Leidos Biomedical Research, Inc., Frederick National Laboratory for Cancer Research, National Cancer Institute at Frederick, Frederick, Maryland, United States of America
- Hyunbum Jang
  - Cancer and Inflammation Program, Leidos Biomedical Research, Inc., Frederick National Laboratory for Cancer Research, National Cancer Institute at Frederick, Frederick, Maryland, United States of America
|
24
|
Hodzic E, Shrestha R, Zhu K, Cheng K, Collins CC, Sahinalp SC. Combinatorial Detection of Conserved Alteration Patterns for Identifying Cancer Subnetworks. GigaScience 2019; 8:giz024. [PMID: 30978274 PMCID: PMC6458499 DOI: 10.1093/gigascience/giz024] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 08/23/2018] [Revised: 12/12/2018] [Accepted: 02/21/2019] [Indexed: 01/24/2023]
Abstract
BACKGROUND Advances in large-scale tumor sequencing have led to an understanding that there are combinations of genomic and transcriptomic alterations specific to tumor types, shared across many patients. Unfortunately, computational identification of functionally meaningful and recurrent alteration patterns within gene/protein interaction networks has proven to be challenging. FINDINGS We introduce a novel combinatorial method, cd-CAP (combinatorial detection of conserved alteration patterns), for simultaneous detection of connected subnetworks of an interaction network where genes exhibit conserved alteration patterns across tumor samples. Our method differentiates distinct alteration types associated with each gene (rather than relying on binary information of a gene being altered or not) and simultaneously detects multiple subnetworks with conserved alteration profiles. CONCLUSIONS In a number of The Cancer Genome Atlas datasets, cd-CAP identified large biologically significant subnetworks with conserved alteration patterns, shared across many tumor samples.
Affiliation(s)
- Ermin Hodzic
  - Laboratory for Advanced Genome Analysis, Vancouver Prostate Centre, 2660 Oak St, Vancouver, BC, V6H 3Z6, Canada
  - School of Computing Science, Simon Fraser University, 8888 University Dr, Burnaby, BC, V5A 1S6, Canada
- Raunak Shrestha
  - Laboratory for Advanced Genome Analysis, Vancouver Prostate Centre, 2660 Oak St, Vancouver, BC, V6H 3Z6, Canada
  - Department of Urologic Sciences, University of British Columbia, 2775 Laurel St, Vancouver, BC, V5Z 1M9, Canada
- Kaiyuan Zhu
  - Department of Computer Science, Indiana University Bloomington, 700 N. Woodlawn Ave, Bloomington, IN, 47408, USA
- Kuoyuan Cheng
  - Center for Bioinformatics and Computational Biology, University of Maryland, 8125 Paint Branch Dr, College Park, MD, 20742, USA
- Colin C Collins
  - Laboratory for Advanced Genome Analysis, Vancouver Prostate Centre, 2660 Oak St, Vancouver, BC, V6H 3Z6, Canada
  - Department of Urologic Sciences, University of British Columbia, 2775 Laurel St, Vancouver, BC, V5Z 1M9, Canada
- S Cenk Sahinalp
  - Laboratory for Advanced Genome Analysis, Vancouver Prostate Centre, 2660 Oak St, Vancouver, BC, V6H 3Z6, Canada
  - Department of Computer Science, Indiana University Bloomington, 700 N. Woodlawn Ave, Bloomington, IN, 47408, USA
|
25
|
Giallombardo C, Morfea S, Rombo SE. An Integrative Framework for the Construction of Big Functional Networks. 2018 IEEE International Conference on Bioinformatics and Biomedicine (BIBM) 2018:2088-2093. [DOI: 10.1109/bibm.2018.8621128] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/03/2025]
|
26
|
Enabling Precision Medicine through Integrative Network Models. J Mol Biol 2018; 430:2913-2923. [DOI: 10.1016/j.jmb.2018.07.004] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 04/01/2018] [Revised: 06/15/2018] [Accepted: 07/03/2018] [Indexed: 11/17/2022]
|
27
|
Affiliation(s)
- Trey Ideker
  - Department of Medicine, University of California San Diego, La Jolla, California, United States of America
- Ruth Nussinov
  - Cancer and Inflammation Program, Leidos Biomedical Research, Inc., Frederick National Laboratory for Cancer Research, National Cancer Institute at Frederick, Frederick, Maryland, United States of America
  - Sackler Institute of Molecular Medicine, Department of Human Genetics and Molecular Medicine, Sackler School of Medicine, Tel Aviv University, Tel Aviv, Israel
- * Corresponding authors: TI, RN
|