1
Kanu GA, Mouselly A, Mohamed AA. Foundations and applications of computational genomics. Deep Learning in Genetics and Genomics 2025:59-75. [DOI: 10.1016/b978-0-443-27574-6.00007-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/03/2025]

2
Al-Harazi O, El Allali A, Kaya N, Colak D. Identification of Diagnostic and Prognostic Subnetwork Biomarkers for Women with Breast Cancer Using Integrative Genomic and Network-Based Analysis. Int J Mol Sci 2024; 25:12779. [PMID: 39684488 DOI: 10.3390/ijms252312779] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2024] [Revised: 11/11/2024] [Accepted: 11/14/2024] [Indexed: 12/18/2024] [Open Access]
Abstract
Breast cancer remains a major global health concern and a leading cause of cancer-related deaths among women. Early detection and effective treatment are essential in improving patient survival. Advances in omics technologies have provided deeper insights into the molecular mechanisms underlying breast cancer. This study aimed to identify subnetwork markers with diagnostic and prognostic potential by integrating genome-wide gene expression data with protein-protein interaction networks. We identified four significant subnetworks revealing potentially important hub genes, including VEGFA, KIF4A, ZWINT, PTPRU, IKBKE, STYK1, CENPO, and UBE2C. The diagnostic and prognostic potentials of these subnetworks were validated using independent datasets. Unsupervised principal component analysis demonstrated a clear separation of breast cancer patients from healthy controls across multiple datasets. A KNN classification model, based on these subnetworks, achieved an accuracy of 97%, sensitivity of 98%, specificity of 94%, and area under the curve (AUC) of 96%. Moreover, the prognostic significance of these subnetwork markers was validated using independent transcriptomic datasets comprising over 4000 patients. These findings suggest that subnetwork markers derived from integrated genomic network analyses can enhance our understanding of the molecular landscape of breast cancer, potentially leading to improved diagnostic, prognostic, and therapeutic strategies.
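The accuracy, sensitivity, and specificity quoted in this abstract are standard confusion-matrix ratios. A minimal sketch of how such figures are computed follows; the function name and the toy labels are illustrative assumptions, not code or data from the study.

```python
def diagnostic_metrics(y_true, y_pred):
    """Accuracy, sensitivity, and specificity for binary labels (1 = patient, 0 = control).

    Assumes at least one patient and one control in y_true, otherwise a ratio is undefined.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # patients called patients
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # controls called controls
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # controls called patients
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # patients called controls
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Toy labels, invented for illustration only.
y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 1]
metrics = diagnostic_metrics(y_true, y_pred)
print(metrics)
```

Sensitivity and specificity are reported separately because accuracy alone can look high on an imbalanced cohort even when one class is poorly recognized.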
Affiliation(s)
- Olfat Al-Harazi
  - Molecular Oncology Department, King Faisal Specialist Hospital and Research Centre, Riyadh 11211, Saudi Arabia
- Achraf El Allali
  - Bioinformatics Laboratory, College of Computing, Mohammed VI Polytechnic University, Benguerir 43150, Morocco
- Namik Kaya
  - Translational Genomics Department, Center for Genomic Medicine, King Faisal Specialist Hospital and Research Centre, Riyadh 11211, Saudi Arabia
- Dilek Colak
  - Molecular Oncology Department, King Faisal Specialist Hospital and Research Centre, Riyadh 11211, Saudi Arabia

3
Gong H, Wang H, Wang Y, Zhang S, Liu X, Che J, Wu S, Wu J, Sun X, Zhang S, Yau ST, Wu R. Topological change of soil microbiota networks for forest resilience under global warming. Phys Life Rev 2024; 50:228-251. [PMID: 39178631 DOI: 10.1016/j.plrev.2024.08.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/14/2024] [Revised: 07/31/2024] [Accepted: 08/02/2024] [Indexed: 08/26/2024]
Abstract
Forest management by thinning can mitigate the detrimental impact of increasing drought caused by global warming. Growing evidence shows that the soil microbiota can coordinate the dynamic relationship between forest functions and drought intensity, but how they function as a cohesive whole remains elusive. We outline a statistical topology model to chart the roadmap of how each microbe acts and interacts with every other microbe to shape the dynamic changes of microbial communities under forest management. To demonstrate its utility, we analyze soil microbiota data collected from a two-way longitudinal factorial experiment involving three stand densities and three levels of rainfall over a growing season in artificial plantations of the forest tree larch (Larix kaempferi). We reconstruct the most sophisticated soil microbiota networks that code maximally informative microbial interactions and trace their dynamic trajectories across time, space, and environmental signals. By integrating GLMY homology theory, we dissect the topological architecture of these so-called omnidirectional networks and identify key microbial interaction pathways that play a pivotal role in mediating the structure and function of soil microbial communities. The statistical topological model described provides a systems tool for studying how microbial community assembly alters its structure, function, and evolution under climate change.
Affiliation(s)
- Huiying Gong
  - School of Grassland Science, Beijing Forestry University, Beijing 100083, China; Beijing Institute of Mathematical Sciences and Applications, Beijing 101408, China
- Hongxing Wang
  - State Key Laboratory of Tree Genetics and Breeding, Key Laboratory of Tree Breeding and Cultivation of National Forestry and Grassland Administration, Research Institute of Forestry, Chinese Academy of Forestry, Beijing 100091, China
- Yu Wang
  - Beijing Institute of Mathematical Sciences and Applications, Beijing 101408, China
- Shen Zhang
  - Qiuzhen College, Tsinghua University, Beijing 100084, China
- Xiang Liu
  - Beijing Institute of Mathematical Sciences and Applications, Beijing 101408, China
- Jincan Che
  - School of Grassland Science, Beijing Forestry University, Beijing 100083, China; Beijing Institute of Mathematical Sciences and Applications, Beijing 101408, China
- Shuang Wu
  - Beijing Institute of Mathematical Sciences and Applications, Beijing 101408, China
- Jie Wu
  - Beijing Institute of Mathematical Sciences and Applications, Beijing 101408, China
- Xiaomei Sun
  - State Key Laboratory of Tree Genetics and Breeding, Key Laboratory of Tree Breeding and Cultivation of National Forestry and Grassland Administration, Research Institute of Forestry, Chinese Academy of Forestry, Beijing 100091, China
- Shougong Zhang
  - State Key Laboratory of Tree Genetics and Breeding, Key Laboratory of Tree Breeding and Cultivation of National Forestry and Grassland Administration, Research Institute of Forestry, Chinese Academy of Forestry, Beijing 100091, China
- Shing-Tung Yau
  - Beijing Institute of Mathematical Sciences and Applications, Beijing 101408, China; Qiuzhen College, Tsinghua University, Beijing 100084, China; Yau Mathematical Sciences Center, Tsinghua University, Beijing 100084, China
- Rongling Wu
  - School of Grassland Science, Beijing Forestry University, Beijing 100083, China; Beijing Institute of Mathematical Sciences and Applications, Beijing 101408, China; Qiuzhen College, Tsinghua University, Beijing 100084, China; Yau Mathematical Sciences Center, Tsinghua University, Beijing 100084, China

4
Bucksot J, Ritchie K, Biancalana M, Cole JA, Cook D. Pan-Cancer, Genome-Scale Metabolic Network Analysis of over 10,000 Patients Elucidates Relationship between Metabolism and Survival. Cancers (Basel) 2024; 16:2302. [PMID: 39001365 PMCID: PMC11240338 DOI: 10.3390/cancers16132302] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/23/2024] [Revised: 06/17/2024] [Accepted: 06/20/2024] [Indexed: 07/16/2024] [Open Access]
Abstract
Despite the high variability in cancer biology, cancers nevertheless exhibit cohesive hallmarks across multiple cancer types, notably dysregulated metabolism. Metabolism plays a central role in cancer biology, and shifts in metabolic pathways have been linked to tumor aggressiveness and likelihood of response to therapy. We therefore sought to interrogate metabolism across cancer types and understand how intrinsic modes of metabolism vary within and across indications and how they relate to patient prognosis. We used context specific genome-scale metabolic modeling to simulate metabolism across 10,915 patients from 34 cancer types from The Cancer Genome Atlas and the MMRF-COMMPASS study. We found that cancer metabolism clustered into modes characterized by differential glycolysis, oxidative phosphorylation, and growth rate. We also found that the simulated activities of metabolic pathways are intrinsically prognostic across cancer types, especially tumor growth rate, fatty acid biosynthesis, folate metabolism, oxidative phosphorylation, steroid metabolism, and glutathione metabolism. This work shows the prognostic power of individual patient metabolic modeling across multiple cancer types. Additionally, it shows that analyzing large-scale models of cancer metabolism with survival information provides unique insights into underlying relationships across cancer types and suggests how therapies designed for one cancer type may be repurposed for use in others.

5
Hernández Sánchez LF, Burger B, Castro Campos RA, Johansson S, Njølstad PR, Barsnes H, Vaudel M. Extending protein interaction networks using proteoforms and small molecules. Bioinformatics 2023; 39:btad598. [PMID: 37756698 PMCID: PMC10564616 DOI: 10.1093/bioinformatics/btad598] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/23/2022] [Revised: 08/21/2023] [Accepted: 09/25/2023] [Indexed: 09/29/2023] [Open Access]
Abstract
MOTIVATION Biological network analysis for high-throughput biomedical data interpretation relies heavily on topological characteristics. Networks are commonly composed of nodes representing genes or proteins that are connected by edges when interacting. In this study, we use the rich information available in the Reactome pathway database to build biological networks accounting for small molecules and proteoforms modeled using protein isoforms and post-translational modifications to study the topological changes induced by this refinement of the network representation. RESULTS We find that improving the interactome modeling increases the number of nodes and interactions, but that isoform and post-translational modification annotation is still limited compared to what can be expected biologically. We also note that small molecule information can distort the topology of the network due to the high connectedness of these molecules, which does not necessarily represent the reality of biology. However, by restricting the connections of small molecules to the context of biochemical reactions, we find that these improve the overall connectedness of the network and reduce the prevalence of isolated components and nodes. Overall, changing the representation of the network alters the prevalence of articulation points and bridges globally but also within and across pathways. Hence, some molecules can gain or lose in biological importance depending on the level of detail of the representation of the biological system, which might in turn impact network-based studies of diseases or druggability. AVAILABILITY AND IMPLEMENTATION Networks are constructed based on data publicly available in the Reactome Pathway knowledgebase: reactome.org.
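Articulation points (nodes whose removal disconnects a network) and bridges (edges with the same property) are the two topological features this abstract tracks. The sketch below finds both with a standard depth-first search on a toy graph; the node names, the toy edges, and the function name are hypothetical and are not taken from Reactome or from the paper's code.

```python
from collections import defaultdict


def articulation_points_and_bridges(edges):
    """Return (articulation points, bridges) of an undirected simple graph."""
    graph = defaultdict(set)
    for u, v in edges:
        graph[u].add(v)
        graph[v].add(u)

    disc, low = {}, {}          # discovery order and lowest reachable order
    cut_nodes, bridges = set(), set()
    counter = 0

    def dfs(node, parent):
        nonlocal counter
        disc[node] = low[node] = counter
        counter += 1
        children = 0
        for nxt in graph[node]:
            if nxt == parent:
                continue
            if nxt in disc:
                # Back edge: node can reach an earlier-discovered node.
                low[node] = min(low[node], disc[nxt])
            else:
                children += 1
                dfs(nxt, node)
                low[node] = min(low[node], low[nxt])
                if parent is not None and low[nxt] >= disc[node]:
                    cut_nodes.add(node)
                if low[nxt] > disc[node]:
                    bridges.add(frozenset((node, nxt)))
        # The DFS root is a cut node only if it has several DFS children.
        if parent is None and children > 1:
            cut_nodes.add(node)

    for node in list(graph):
        if node not in disc:
            dfs(node, None)
    return cut_nodes, bridges


# Hypothetical protein-only network: a triangle A-B-C with a tail C-D-E.
protein_edges = [("A", "B"), ("B", "C"), ("C", "A"), ("C", "D"), ("D", "E")]
cuts_before, bridges_before = articulation_points_and_bridges(protein_edges)

# Adding one hypothetical small molecule "M" that links A and E closes a cycle.
extended_edges = protein_edges + [("M", "A"), ("M", "E")]
cuts_after, bridges_after = articulation_points_and_bridges(extended_edges)

print(sorted(cuts_before), len(bridges_before))
print(sorted(cuts_after), len(bridges_after))
```

In this toy case, nodes C and D are articulation points in the protein-only graph and stop being so once the small molecule is added, which is the kind of representation-dependent shift in apparent importance the abstract describes.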
Affiliation(s)
- Luis Francisco Hernández Sánchez
  - Department of Clinical Science, Mohn Center for Diabetes Precision Medicine, University of Bergen, Bergen 5020, Norway
  - Department of Medical Genetics, Center for Medical Genetics and Molecular Medicine, Haukeland University Hospital, Bergen 5020, Norway
- Bram Burger
  - Department of Clinical Science, Mohn Center for Diabetes Precision Medicine, University of Bergen, Bergen 5020, Norway
  - Department of Medical Genetics, Center for Medical Genetics and Molecular Medicine, Haukeland University Hospital, Bergen 5020, Norway
  - Department of Biomedicine, Proteomics Unit, University of Bergen, Bergen 5020, Norway
  - Department of Informatics, Computational Biology Unit, University of Bergen, Bergen 5020, Norway
- Stefan Johansson
  - Department of Clinical Science, Mohn Center for Diabetes Precision Medicine, University of Bergen, Bergen 5020, Norway
  - Department of Medical Genetics, Center for Medical Genetics and Molecular Medicine, Haukeland University Hospital, Bergen 5020, Norway
- Pål Rasmus Njølstad
  - Department of Clinical Science, Mohn Center for Diabetes Precision Medicine, University of Bergen, Bergen 5020, Norway
  - Department of Pediatrics, Haukeland University Hospital, Bergen 5020, Norway
- Harald Barsnes
  - Department of Biomedicine, Proteomics Unit, University of Bergen, Bergen 5020, Norway
  - Department of Informatics, Computational Biology Unit, University of Bergen, Bergen 5020, Norway
- Marc Vaudel
  - Department of Clinical Science, Mohn Center for Diabetes Precision Medicine, University of Bergen, Bergen 5020, Norway
  - Department of Informatics, Computational Biology Unit, University of Bergen, Bergen 5020, Norway
  - Department of Genetics and Bioinformatics, Health Data and Digitalization, Norwegian Institute of Public Health, Oslo 0213, Norway

6
Ahmed F, Samantasinghar A, Manzoor Soomro A, Kim S, Hyun Choi K. A systematic review of computational approaches to understand cancer biology for informed drug repurposing. J Biomed Inform 2023; 142:104373. [PMID: 37120047 DOI: 10.1016/j.jbi.2023.104373] [Citation(s) in RCA: 22] [Impact Index Per Article: 11.0] [Received: 02/14/2023] [Revised: 03/25/2023] [Accepted: 04/23/2023] [Indexed: 05/01/2023]
Abstract
Cancer is the second leading cause of death globally, trailing only heart disease. In the United States alone, 1.9 million new cancer cases and 609,360 deaths were recorded for 2022. Unfortunately, the success rate for new cancer drug development remains less than 10%, making the disease particularly challenging. This low success rate is largely attributed to the complex and poorly understood nature of cancer etiology. Therefore, it is critical to find alternative approaches to understanding cancer biology and developing effective treatments. One such approach is drug repurposing, which offers a shorter drug development timeline and lower costs while increasing the likelihood of success. In this review, we provide a comprehensive analysis of computational approaches for understanding cancer biology, including systems biology, multi-omics, and pathway analysis. Additionally, we examine the use of these methods for drug repurposing in cancer, including the databases and tools that are used for cancer research. Finally, we present case studies of drug repurposing, discussing their limitations and offering recommendations for future research in this area.
Affiliation(s)
- Faheem Ahmed
  - Department of Mechatronics Engineering, Jeju National University, Republic of Korea
- Sejong Kim
  - Department of Internal Medicine, Seoul National University Bundang Hospital, Seongnam, Korea; Department of Internal Medicine, Seoul National University College of Medicine, Seoul, Korea
- Kyung Hyun Choi
  - Department of Mechatronics Engineering, Jeju National University, Republic of Korea

7
Lee S, Jung H, Park J, Ahn J. Accurate Prediction of Cancer Prognosis by Exploiting Patient-Specific Cancer Driver Genes. Int J Mol Sci 2023; 24:6445. [PMID: 37047418 PMCID: PMC10095073 DOI: 10.3390/ijms24076445] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/22/2023] [Revised: 03/17/2023] [Accepted: 03/28/2023] [Indexed: 04/03/2023] [Open Access]
Abstract
Accurate prediction of the prognoses of cancer patients and identification of prognostic biomarkers are both important for the improved treatment of cancer patients, in addition to enhanced anticancer drugs. Many previous bioinformatic studies have been carried out to achieve this goal; however, there remains room for improvement in terms of accuracy. In this study, we demonstrated that patient-specific cancer driver genes could be used to predict cancer prognoses more accurately. To identify patient-specific cancer driver genes, we first generated patient-specific gene networks before using modified PageRank to generate feature vectors that represented the impacts genes had on the patient-specific gene network. Subsequently, the feature vectors of the good and poor prognosis groups were used to train the deep feedforward network. For the 11 cancer types in the TCGA data, the proposed method showed a significantly better prediction performance than the existing state-of-the-art methods for three cancer types (BRCA, CESC and PAAD), better performance for five cancer types (COAD, ESCA, HNSC, KIRC and STAD), and a similar or slightly worse performance for the remaining three cancer types (BLCA, LIHC and LUAD). Furthermore, the case study for the identified breast cancer and cervical squamous cell carcinoma prognostic genes and their subnetworks included several pathways associated with the progression of breast cancer and cervical squamous cell carcinoma. These results suggested that heterogeneous cancer driver information may be associated with cancer prognosis.
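The abstract does not say how its PageRank was modified, so the sketch below shows only standard PageRank with an optional restart (personalization) vector, as one plausible way to turn a gene network into a per-gene score vector. The network, the gene labels g1 to g4, and all parameter values are hypothetical.

```python
def pagerank(adjacency, personalization=None, damping=0.85, iterations=100):
    """Power-iteration PageRank on a directed graph.

    adjacency maps each node to the list of nodes it points to; every target
    must also appear as a key. personalization optionally weights the restart
    distribution (uniform when omitted).
    """
    nodes = list(adjacency)
    if personalization is None:
        personalization = {node: 1.0 for node in nodes}
    total = sum(personalization.get(node, 0.0) for node in nodes)
    restart = {node: personalization.get(node, 0.0) / total for node in nodes}

    scores = dict(restart)
    for _ in range(iterations):
        updated = {node: (1.0 - damping) * restart[node] for node in nodes}
        for node in nodes:
            targets = adjacency[node]
            if targets:
                share = damping * scores[node] / len(targets)
                for target in targets:
                    updated[target] += share
            else:
                # Dangling node: hand its score back via the restart vector.
                for target in nodes:
                    updated[target] += damping * scores[node] * restart[target]
        scores = updated
    return scores


# Hypothetical four-gene network: g2, g3, and g4 all point to g1.
network = {"g1": ["g2", "g3"], "g2": ["g1"], "g3": ["g1"], "g4": ["g1"]}

uniform_scores = pagerank(network)
# Restarting only at g4 mimics scoring the network from one patient-specific seed gene.
seeded_scores = pagerank(network, personalization={"g4": 1.0})

print(max(uniform_scores, key=uniform_scores.get))
```

The score dictionary, read in a fixed gene order, is the kind of feature vector that could then be passed to a classifier; how the study builds its patient-specific networks and seeds is not reproduced here.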
Affiliation(s)
- Suyeon Lee
  - Department of Computer Science and Engineering, Incheon National University, Incheon 22012, Republic of Korea
- Heewon Jung
  - Samsung Electronics Company Ltd., Suwon 16677, Republic of Korea
- Jiwoo Park
  - Department of Computer Science and Engineering, Incheon National University, Incheon 22012, Republic of Korea
- Jaegyoon Ahn
  - Department of Computer Science and Engineering, Incheon National University, Incheon 22012, Republic of Korea

8
Santhanam B, Oikonomou P, Tavazoie S. Systematic assessment of prognostic molecular features across cancers. Cell Genomics 2023; 3:100262. [PMID: 36950380 PMCID: PMC10025453 DOI: 10.1016/j.xgen.2023.100262] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/28/2022] [Revised: 09/29/2022] [Accepted: 01/12/2023] [Indexed: 02/05/2023]
Abstract
Precision oncology promises accurate prediction of disease trajectories by utilizing molecular features of tumors. We present a systematic analysis of the prognostic potential of diverse molecular features across large cancer cohorts. We find that the mRNA expression of biologically coherent sets of genes (modules) is substantially more predictive of patient survival than single-locus genomic and transcriptomic aberrations. Extending our analysis beyond existing curated gene modules, we find a large novel class of highly prognostic DNA/RNA cis-regulatory modules associated with dynamic gene expression within cancers. Remarkably, in more than 82% of cancers, modules substantially improve survival stratification compared with conventional clinical factors and prominent genomic aberrations. The prognostic potential of cancer modules generalizes to external cohorts better than conventionally used single-gene features. Finally, a machine-learning framework demonstrates the combined predictive power of multiple modules, yielding prognostic models that perform substantially better than existing histopathological and clinical factors in common use.
Affiliation(s)
- Balaji Santhanam
  - Department of Biological Sciences, Columbia University, New York, NY 10027, USA
  - Department of Systems Biology, Columbia University, New York, NY 10032, USA
  - Department of Biochemistry and Molecular Biophysics, Columbia University, New York, NY 10032, USA
  - Irving Institute for Cancer Dynamics, Columbia University, New York, NY 10032, USA
- Panos Oikonomou
  - Department of Biological Sciences, Columbia University, New York, NY 10027, USA
  - Department of Systems Biology, Columbia University, New York, NY 10032, USA
  - Department of Biochemistry and Molecular Biophysics, Columbia University, New York, NY 10032, USA
  - Irving Institute for Cancer Dynamics, Columbia University, New York, NY 10032, USA
- Saeed Tavazoie
  - Department of Biological Sciences, Columbia University, New York, NY 10027, USA
  - Department of Systems Biology, Columbia University, New York, NY 10032, USA
  - Department of Biochemistry and Molecular Biophysics, Columbia University, New York, NY 10032, USA
  - Irving Institute for Cancer Dynamics, Columbia University, New York, NY 10032, USA

9
Shang J, Zhu X, Sun Y, Li F, Kong X, Liu JX. DM-MOGA: a multi-objective optimization genetic algorithm for identifying disease modules of non-small cell lung cancer. BMC Bioinformatics 2023; 24:13. [PMID: 36624376 PMCID: PMC9830734 DOI: 10.1186/s12859-023-05136-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/23/2022] [Accepted: 01/04/2023] [Indexed: 01/11/2023] [Open Access]
Abstract
BACKGROUND Constructing molecular interaction networks from microarray data and then identifying disease module biomarkers can provide insight into the underlying pathogenic mechanisms of non-small cell lung cancer. A promising approach for identifying disease modules in the network is community detection. RESULTS In order to identify disease modules from gene co-expression networks, a community detection method is proposed based on multi-objective optimization genetic algorithm with decomposition. The method is named DM-MOGA and possesses two highlights. First, the boundary correction strategy is designed for the modules obtained in the process of local module detection and pre-simplification. Second, during the evolution, we introduce Davies-Bouldin index and clustering coefficient as fitness functions which are improved and migrated to weighted networks. In order to identify modules that are more relevant to diseases, the above strategies are designed to consider the network topology of genes and the strength of connections with other genes at the same time. Experimental results of different gene expression datasets of non-small cell lung cancer demonstrate that the core modules obtained by DM-MOGA are more effective than those obtained by several other advanced module identification methods. CONCLUSIONS The proposed method identifies disease-relevant modules by optimizing two novel fitness functions to simultaneously consider the local topology of each gene and its connection strength with other genes. The association of the identified core modules with lung cancer has been confirmed by pathway and gene ontology enrichment analysis.
Affiliation(s)
- Junliang Shang
  - School of Computer Science, Qufu Normal University, Rizhao 276826, China
- Xuhui Zhu
  - School of Computer Science, Qufu Normal University, Rizhao 276826, China
- Yan Sun
  - School of Computer Science, Qufu Normal University, Rizhao 276826, China
- Feng Li
  - School of Computer Science, Qufu Normal University, Rizhao 276826, China
- Xiangzhen Kong
  - School of Computer Science, Qufu Normal University, Rizhao 276826, China
- Jin-Xing Liu
  - School of Computer Science, Qufu Normal University, Rizhao 276826, China

10
Decision Theory versus Conventional Statistics for Personalized Therapy of Breast Cancer. J Pers Med 2022; 12:570. [PMID: 35455687 PMCID: PMC9028435 DOI: 10.3390/jpm12040570] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/17/2022] [Revised: 03/24/2022] [Accepted: 03/28/2022] [Indexed: 11/17/2022] [Open Access]
Abstract
Estrogen and progesterone receptors being present or not represents one of the most important biomarkers for therapy selection in breast cancer patients. Conventional measurement by immunohistochemistry (IHC) involves errors, and numerous attempts have been made to increase precision by additional information from gene expression. This raises the question of how to fuse information, in particular, if there is disagreement. It is the primary domain of Dempster–Shafer decision theory (DST) to deal with contradicting evidence on the same item (here: receptor status), obtained through different techniques. DST is widely used in technical settings, such as self-driving cars and aviation, and is also promising to deliver significant advantages in medicine. Using data from breast cancer patients already presented in previous work, we focus on comparing DST with classical statistics in this work, to pave the way for its application in medicine. First, we explain how DST not only considers probabilities (a single number per sample), but also incorporates uncertainty in a concept of ‘evidence’ (two numbers per sample). This allows for very powerful displays of patient data in so-called ternary plots, a novel and crucial advantage for medical interpretation. Results are obtained according to conventional statistics (ODDS) and, in parallel, according to DST. Agreement and differences are evaluated, and the particular merits of DST discussed. The presented application demonstrates how decision theory introduces new levels of confidence in diagnoses derived from medical data.
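The fusion of conflicting evidence described in this abstract can be illustrated with Dempster's rule of combination on a two-outcome frame (receptor positive or negative), where each source assigns mass to "pos", to "neg", and leaves the remainder as uncertainty. The sketch below is a generic textbook form of the rule; the mass values and the key names are invented for illustration and are not taken from the paper.

```python
def combine(m1, m2):
    """Dempster's rule for the frame {pos, neg}.

    Each argument maps 'pos', 'neg', and 'unknown' (mass on the whole frame)
    to values that sum to 1.
    """
    # Mass that the two sources assign to contradictory outcomes.
    conflict = m1["pos"] * m2["neg"] + m1["neg"] * m2["pos"]
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; the rule is undefined")
    norm = 1.0 - conflict
    return {
        "pos": (m1["pos"] * m2["pos"] + m1["pos"] * m2["unknown"]
                + m1["unknown"] * m2["pos"]) / norm,
        "neg": (m1["neg"] * m2["neg"] + m1["neg"] * m2["unknown"]
                + m1["unknown"] * m2["neg"]) / norm,
        "unknown": m1["unknown"] * m2["unknown"] / norm,
    }


# Illustrative masses only: one source standing in for IHC, one for gene expression.
ihc = {"pos": 0.7, "neg": 0.1, "unknown": 0.2}
expression = {"pos": 0.6, "neg": 0.2, "unknown": 0.2}

fused = combine(ihc, expression)
print({key: round(value, 3) for key, value in fused.items()})
```

Because the three fused masses sum to 1, each sample can be placed as a single point in a ternary plot of the kind the abstract mentions.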

11
Luo P, Chen B, Liao B, Wu F. Predicting disease-associated genes: Computational methods, databases, and evaluations. WIREs Data Mining and Knowledge Discovery 2021; 11. [DOI: 10.1002/widm.1383] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 07/28/2019] [Accepted: 06/13/2020] [Indexed: 09/09/2024]
Abstract
Complex diseases are associated with a set of genes (called disease genes), the identification of which can help scientists uncover the mechanisms of diseases and develop new drugs and treatment strategies. Due to the huge cost and time of experimental identification techniques, many computational algorithms have been proposed to predict disease genes. Although several review publications in recent years have discussed many computational methods, some of them focus on cancer driver genes while others focus on biomolecular networks, which only cover a specific aspect of existing methods. In this review, we summarize existing methods and classify them into three categories based on their rationales. Then, the algorithms, biological data, and evaluation methods used in the computational prediction are discussed. Finally, we highlight the limitations of existing methods and point out some future directions for improving these algorithms. This review could help investigators understand the principles of existing methods, and thus develop new methods to advance the computational prediction of disease genes. This article is categorized under: Technologies > Machine Learning; Technologies > Prediction; Algorithmic Development > Biological Data Mining.
Affiliation(s)
- Ping Luo
  - Division of Biomedical Engineering, University of Saskatchewan, Saskatoon, Canada
  - Princess Margaret Cancer Centre, University Health Network, Toronto, Canada
- Bolin Chen
  - School of Computer Science and Technology, Northwestern Polytechnical University, China
- Bo Liao
  - School of Mathematics and Statistics, Hainan Normal University, Haikou, China
- Fang-Xiang Wu
  - Department of Mechanical Engineering and Department of Computer Science, University of Saskatchewan, Saskatoon, Canada

12
Kenn M, Cacsire Castillo-Tong D, Singer CF, Karch R, Cibena M, Koelbl H, Schreiner W. Decision theory for precision therapy of breast cancer. Sci Rep 2021; 11:4233. [PMID: 33608588 PMCID: PMC7895957 DOI: 10.1038/s41598-021-82418-7] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 06/26/2020] [Accepted: 01/11/2021] [Indexed: 01/31/2023] [Open Access]
Abstract
Correctly estimating the hormone receptor status for estrogen (ER) and progesterone (PGR) is crucial for precision therapy of breast cancer. It is known that conventional diagnostics (immunohistochemistry, IHC) yields a significant rate of wrongly diagnosed receptor status. Here we demonstrate how Dempster–Shafer decision theory (DST) enhances diagnostic precision by adding information from gene expression. We downloaded data of 3753 breast cancer patients from Gene Expression Omnibus. Information from IHC and gene expression was fused according to DST, and the clinical criterion for receptor positivity was re-modelled along DST. Receptor status predicted according to DST was compared with conventional assessment via IHC and gene expression, and deviations were flagged as questionable. The survival of questionable cases turned out significantly worse (Kaplan–Meier, p < 0.01) than for patients with receptor status confirmed by DST, indicating a substantial enhancement of diagnostic precision via DST. This study is not only relevant for precision medicine but also paves the way for introducing decision theory into OMICS data science.
Affiliation(s)
- Michael Kenn
  - Section of Biosimulation and Bioinformatics, Center for Medical Statistics, Informatics and Intelligent Systems (CeMSIIS), Medical University of Vienna, Spitalgasse 23, 1090, Vienna, Austria
- Dan Cacsire Castillo-Tong
  - Translational Gynecology Group, Department of Obstetrics and Gynecology, Comprehensive Cancer Center, Medical University of Vienna, Waehringer Guertel 18-20, 1090, Vienna, Austria
- Christian F Singer
  - Translational Gynecology Group, Department of Obstetrics and Gynecology, Comprehensive Cancer Center, Medical University of Vienna, Waehringer Guertel 18-20, 1090, Vienna, Austria
- Rudolf Karch
  - Section of Biosimulation and Bioinformatics, Center for Medical Statistics, Informatics and Intelligent Systems (CeMSIIS), Medical University of Vienna, Spitalgasse 23, 1090, Vienna, Austria
- Michael Cibena
  - Section of Biosimulation and Bioinformatics, Center for Medical Statistics, Informatics and Intelligent Systems (CeMSIIS), Medical University of Vienna, Spitalgasse 23, 1090, Vienna, Austria
- Heinz Koelbl
  - Department of General Gynecology and Gynecologic Oncology, Medical University of Vienna, Waehringer Guertel 18-20, 1090, Vienna, Austria
- Wolfgang Schreiner
  - Section of Biosimulation and Bioinformatics, Center for Medical Statistics, Informatics and Intelligent Systems (CeMSIIS), Medical University of Vienna, Spitalgasse 23, 1090, Vienna, Austria

13
Lefebvre M, Gaignard A, Folschette M, Bourdon J, Guziolowski C. Large-scale regulatory and signaling network assembly through linked open data. Database: The Journal of Biological Databases and Curation 2021; 2021:6103765. [PMID: 33459761 PMCID: PMC7812716 DOI: 10.1093/database/baaa113] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/25/2020] [Revised: 11/30/2020] [Accepted: 12/11/2020] [Indexed: 11/16/2022]
Abstract
Huge efforts are currently underway to address the organization of biological knowledge through linked open databases. These databases can be automatically queried to reconstruct regulatory and signaling networks. However, assembling networks implies manual operations due to source-specific identification of biological entities and relationships, multiple life-science databases with redundant information and the difficulty of recovering logical flows in biological pathways. We propose a framework based on Semantic Web technologies to automate the reconstruction of large-scale regulatory and signaling networks in the context of tumor cells modeling and drug screening. The proposed tool is pyBRAvo (python Biological netwoRk Assembly), and here we have applied it to a dataset of 910 gene expression measurements issued from liver cancer patients. The tool is publicly available at https://github.com/pyBRAvo/pyBRAvo.
Affiliation(s)
- M Lefebvre
- UMR 1332 Biologie du Fruit et Pathologie, INRAE, Univ. Bordeaux, 72 Avenue Edouard Bourlaux, CS20032, 33882, Villenave d'Ornon cedex, France
| | - A Gaignard
- Université de Nantes, CNRS, INSERM, l'Institut du Thorax F-44000, Nantes, France
| | - M Folschette
- Univ. Lille, CNRS, Centrale Lille, UMR 9189 - CRIStAL - Centre de Recherche en Informatique Signal et Automatique de Lille, F-59000 Lille, France
| | - J Bourdon
- Université de Nantes, Centrale Nantes, CNRS, UMR 6004 - LS2N - Laboratoire des Sciences du Numérique de Nantes, F-44000 Nantes, France
| | - C Guziolowski
- Université de Nantes, Centrale Nantes, CNRS, UMR 6004 - LS2N - Laboratoire des Sciences du Numérique de Nantes, F-44000 Nantes, France
| |
|
14
|
GVES: machine learning model for identification of prognostic genes with a small dataset. Sci Rep 2021; 11:439. [PMID: 33431999 PMCID: PMC7801384 DOI: 10.1038/s41598-020-79889-5] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 08/11/2020] [Accepted: 12/08/2020] [Indexed: 12/16/2022] Open
Abstract
Machine learning may be a powerful approach to more accurate identification of genes that may serve as prognosticators of cancer outcomes using various types of omics data. However, to date, machine learning approaches have shown limited prediction accuracy for cancer outcomes, primarily owing to small sample numbers and relatively large number of features. In this paper, we provide a description of GVES (Gene Vector for Each Sample), a proposed machine learning model that can be efficiently leveraged even with a small sample size, to increase the accuracy of identification of genes with prognostic value. GVES, an adaptation of the continuous bag of words (CBOW) model, generates vector representations of all genes for all samples by leveraging gene expression and biological network data. GVES clusters samples using their gene vectors, and identifies genes that divide samples into good and poor outcome groups for the prediction of cancer outcomes. Because GVES generates gene vectors for each sample, the sample size effect is reduced. We applied GVES to six cancer types and demonstrated that GVES outperformed existing machine learning methods, particularly for cancer datasets with a small number of samples. Moreover, the genes identified as prognosticators were shown to reside within a number of significant prognostic genetic pathways associated with pancreatic cancer.
|
15
|
CODC: a Copula-based model to identify differential coexpression. NPJ Syst Biol Appl 2020; 6:20. [PMID: 32561750 PMCID: PMC7305108 DOI: 10.1038/s41540-020-0137-9] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 08/29/2019] [Accepted: 03/18/2020] [Indexed: 11/21/2022] Open
Abstract
Differential coexpression has recently emerged as a new way to establish a fundamental difference in expression pattern among a group of genes between two populations. Earlier methods used scoring techniques to detect changes in the correlation patterns of a gene pair between two conditions. However, modeling differential coexpression by finding differences in the dependence structure of the gene pair has hitherto not been carried out. We exploit a copula-based framework to model differential coexpression between gene pairs in two different conditions. The copula is used to model the dependency between the expression profiles of a gene pair. For a gene pair, the distance between the two joint distributions produced by the copula serves as the measure of differential coexpression. We used five pan-cancer TCGA RNA-Seq datasets to evaluate the model, which outperforms the existing state of the art. Moreover, the proposed model can detect a mild change in the coexpression pattern across two conditions. For noisy expression data, the proposed method performs well because of the well-known scale-invariant property of the copula. In addition, we have identified differentially coexpressed modules by applying hierarchical clustering on the distance matrix. The identified modules are analyzed through Gene Ontology terms and KEGG pathway enrichment analysis.
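The copula idea in this abstract can be illustrated with a minimal sketch. It assumes an empirical (rank-based) copula and a maximum-absolute-difference distance evaluated on a grid; the abstract does not say which copula family or distance the authors use, so the function names, the grid size and the distance itself are illustrative assumptions rather than the CODC implementation.

```python
from bisect import bisect_right


def pseudo_obs(values):
    """Rank-transform values into (0, 1] as rank / n (ties share the highest rank)."""
    ordered = sorted(values)
    n = len(values)
    return [bisect_right(ordered, v) / n for v in values]


def empirical_copula(x, y, grid):
    """Evaluate the empirical copula C(a, b) of the pair (x, y) on grid x grid."""
    u, v = pseudo_obs(x), pseudo_obs(y)
    n = len(x)
    return [
        [sum(1 for ui, vi in zip(u, v) if ui <= a and vi <= b) / n for b in grid]
        for a in grid
    ]


def copula_distance(x1, y1, x2, y2, steps=10):
    """Assumed distance: largest absolute gap between the two empirical copulas.

    (x1, y1) are the expression profiles of a gene pair in condition 1,
    (x2, y2) the profiles of the same pair in condition 2.
    """
    grid = [i / steps for i in range(1, steps + 1)]
    c1 = empirical_copula(x1, y1, grid)
    c2 = empirical_copula(x2, y2, grid)
    return max(abs(p - q) for r1, r2 in zip(c1, c2) for p, q in zip(r1, r2))


# Hypothetical toy profiles: the pair is positively coupled in one condition
# and negatively coupled in the other.
gene_a = list(range(1, 11))
gene_b_pos = list(range(1, 11))
gene_b_neg = list(range(10, 0, -1))
score = copula_distance(gene_a, gene_b_pos, gene_a, gene_b_neg)
```

Because the sketch works on ranks only, rescaling or shifting either profile by a monotone increasing transform leaves the distance unchanged, which is the scale-invariance property the abstract refers to.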
|
16
|
Zhang J, Yan S, Jiang C, Ji Z, Wang C, Tian W. Network Properties of Cancer Prognostic Gene Signatures in the Human Protein Interactome. Genes (Basel) 2020; 11:genes11030247. [PMID: 32111006 PMCID: PMC7140842 DOI: 10.3390/genes11030247] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/16/2020] [Revised: 02/24/2020] [Accepted: 02/24/2020] [Indexed: 11/16/2022] Open
Abstract
Prognostic gene signatures are critical in cancer prognosis assessments and their pinpoint treatments. However, their network properties remain unclear. Here, we obtained nine prognostic gene sets including 1439 prognostic genes of different cancers from related publications. Four network centralities were used to examine the network properties of prognostic genes (PG) compared with other gene sets based on the Human Protein Reference Database (HPRD) and String networks. We also proposed three novel network measures for further investigating the network properties of prognostic gene sets (PGS) besides clustering coefficient. The results showed that PG did not occupy key positions in the human protein interaction network and were more similar to essential genes rather than cancer genes. However, PGS had significantly smaller intra-set distance (IAD) and inter-set distance (IED) in comparison with random sets (p-value < 0.001). Moreover, we also found that PGS tended to be distributed within network modules rather than between modules (p-value < 0.01), and the functional intersection of the modules enriched with PGS was closely related to cancer development and progression. Our research reveals the common network properties of cancer prognostic gene signatures in the human protein interactome. We argue that these are biologically meaningful and useful for understanding their molecular mechanism.
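A set-level distance of the kind mentioned in this abstract can be sketched as follows. The abstract does not define IAD, so this example assumes it is the mean shortest-path length between all pairs of genes in the set; the graph, gene labels and function names are hypothetical.

```python
from collections import deque
from itertools import combinations


def bfs_distances(adj, source):
    """Shortest-path lengths (in edges) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adj.get(node, ()):
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def mean_pairwise_distance(adj, gene_set):
    """Assumed intra-set distance: mean shortest path over connected gene pairs.

    Genes missing from the network and unreachable pairs are skipped.
    """
    genes = sorted(g for g in set(gene_set) if g in adj)
    dist_from = {g: bfs_distances(adj, g) for g in genes}
    lengths = [
        dist_from[a][b] for a, b in combinations(genes, 2) if b in dist_from[a]
    ]
    return sum(lengths) / len(lengths) if lengths else float("nan")


# Hypothetical toy interaction network: a simple path A - B - C - D - E.
toy_ppi = {
    "A": {"B"},
    "B": {"A", "C"},
    "C": {"B", "D"},
    "D": {"C", "E"},
    "E": {"D"},
}
compact = mean_pairwise_distance(toy_ppi, {"A", "B", "C"})
spread = mean_pairwise_distance(toy_ppi, {"A", "E"})
```

A comparison against random gene sets, as the abstract describes, would repeat the same calculation on randomly drawn sets of equal size and compare the resulting distribution with the observed value.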
Affiliation(s)
- Jifeng Zhang
- School of Biological Engineering, Huainan Normal University, Huainan 232001, China (C.J.); (C.W.)
- School of Life Science, Institute of Biostatistics, Fudan University, Shanghai 2004333, China
- Correspondence: (J.Z.); (W.T.); Tel.: +86-181-3013-7151 (J.Z.); +86-21-3124-6723 (W.T.)
| | - Shoubao Yan
- School of Biological Engineering, Huainan Normal University, Huainan 232001, China (C.J.); (C.W.)
| | - Cheng Jiang
- School of Biological Engineering, Huainan Normal University, Huainan 232001, China (C.J.); (C.W.)
| | - Zhicheng Ji
- Department of Biostatistics, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, MD 21205, USA
| | - Chenrun Wang
- School of Biological Engineering, Huainan Normal University, Huainan 232001, China (C.J.); (C.W.)
| | - Weidong Tian
- School of Life Science, Institute of Biostatistics, Fudan University, Shanghai 2004333, China
- Correspondence: (J.Z.); (W.T.); Tel.: +86-181-3013-7151 (J.Z.); +86-21-3124-6723 (W.T.)
| |
|
17
|
Zou J, Wang E. Cancer Biomarker Discovery for Precision Medicine: New Progress. Curr Med Chem 2020; 26:7655-7671. [PMID: 30027846 DOI: 10.2174/0929867325666180718164712] [Citation(s) in RCA: 44] [Impact Index Per Article: 8.8] [Received: 04/11/2018] [Revised: 06/26/2018] [Accepted: 07/06/2018] [Indexed: 12/30/2022]
Abstract
BACKGROUND Precision medicine puts forward customized healthcare for cancer patients. An important way to accomplish this task is to stratify patients into those who may respond to a treatment and those who may not. For this purpose, diagnostic and prognostic biomarkers have been pursued. OBJECTIVE This review focuses on novel approaches and concepts in biomarker discovery as technologies develop and data accumulate for precision medicine. RESULTS The traditional mechanism-driven functional biomarkers have the advantage of actionable insights, while data-driven computational biomarkers can fulfill more needs, especially given the vast data on molecules of different layers (e.g. genetic mutation, mRNA, protein etc.) accumulated through a wide range of technologies. In addition, the technology-driven liquid biopsy biomarker is very promising for improving patients' survival. Developments in biomarker discovery on these fronts are promoting the understanding of cancer, helping the stratification of patients and improving patients' survival. CONCLUSION Current developments in mechanism-, data- and technology-driven biomarker discovery are advancing the aim of precision medicine and promoting the clinical application of biomarkers. Meanwhile, the complexity of cancer requires more effective biomarkers, which could be achieved by a comprehensive integration of multiple types of biomarkers together with a deep understanding of cancer.
Affiliation(s)
- Jinfeng Zou
- Princess Margaret Cancer Center, University Health Network, Toronto, Ontario, ON, M5G 23C1, Canada
| | - Edwin Wang
- College of Life Science, Tianjin Normal University, Tianjin, China; Cumming School of Medicine, University of Calgary, Calgary, Alberta AB T2N 1N4, Canada
| |
|
18
|
Folschette M, Legagneux V, Poret A, Chebouba L, Guziolowski C, Théret N. A pipeline to create predictive functional networks: application to the tumor progression of hepatocellular carcinoma. BMC Bioinformatics 2020; 21:18. [PMID: 31937236 PMCID: PMC6958715 DOI: 10.1186/s12859-019-3316-1] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 08/08/2019] [Accepted: 12/12/2019] [Indexed: 12/14/2022] Open
Abstract
BACKGROUND Integrating genome-wide gene expression patient profiles with regulatory knowledge is a challenging task because of the inherent heterogeneity, noise and incompleteness of biological data. From the computational side, several solvers for logic programs are able to perform extremely well in decision problems for combinatorial search domains. The challenge then is how to process the biological knowledge in order to feed these solvers to gain insights in a biological study. It requires formalizing the biological knowledge to give a precise interpretation of this information; currently, very few pathway databases offer this possibility. RESULTS The presented work proposes an automatic pipeline to extract regulatory knowledge from pathway databases and generate novel computational predictions related to the state of expression or activity of biological molecules. We applied it in the context of hepatocellular carcinoma (HCC) progression, and evaluated the precision and the stability of these computational predictions. Our working base is a graph of 3383 nodes and 13,771 edges extracted from the KEGG database, in which we integrate 209 differentially expressed genes between low and high aggressive HCC across 294 patients. Our computational model predicts the shifts of expression of 146 initially non-observed biological components. Our predictions were validated at 88% using a larger experimental dataset and cross-validation techniques. In particular, we focus on the protein complex predictions and show for the first time that NFKB1/BCL-3 complexes are activated in aggressive HCC. In spite of the large dimension of the reconstructed models, our analyses of the computational predictions reveal a well-constrained region where KEGG regulatory knowledge constrains the gene expression of several biomolecules. These regions can offer interesting windows for experimentally perturbing such complex systems.
CONCLUSION This new pipeline allows biologists to develop their own predictive models based on a list of genes. It facilitates the identification of new regulatory biomolecules using knowledge graphs and predictive computational methods. Our workflow is implemented in an automatic python pipeline which is publicly available at https://github.com/LokmaneChebouba/key-pipe and contains as testing data all the data used in this paper.
Affiliation(s)
- Maxime Folschette
- Univ Rennes, Inria, CNRS, IRISA, UMR 6074, Rennes, France
- Univ Rennes, Inserm, EHESP, Irset, UMR S1085, Rennes, France
- IFB-CORE, Institut Français de Bioinformatique, UMS CNRS 3601, Évry, France
- LS2N, Laboratoire des Sciences du Numérique de Nantes, UMR 6004, Nantes, France
- Univ. Lille, CNRS, Centrale Lille, CRIStAL, Centre de Recherche en Informatique Signal et Automatique de Lille, UMR 9189, F-59000, Lille, France
| | | | - Arnaud Poret
- LS2N, Laboratoire des Sciences du Numérique de Nantes, UMR 6004, Nantes, France
| | - Lokmane Chebouba
- LS2N, Laboratoire des Sciences du Numérique de Nantes, UMR 6004, Nantes, France
- École centrale de Nantes, Nantes, France
- Department of Computer Science, LRIA Laboratory, Electrical Engineering and Computer Science Faculty, University of Science and Technology Houari Boumediene (USTHB), Algiers, Algeria
| | - Carito Guziolowski
- LS2N, Laboratoire des Sciences du Numérique de Nantes, UMR 6004, Nantes, France.
- École centrale de Nantes, Nantes, France.
| | - Nathalie Théret
- Univ Rennes, Inria, CNRS, IRISA, UMR 6074, Rennes, France.
- Univ Rennes, Inserm, EHESP, Irset, UMR S1085, Rennes, France.
| |
|
19
|
Ung CY, Ghanat Bari M, Zhang C, Liang J, Correia C, Li H. Regulostat Inferelator: a novel network biology platform to uncover molecular devices that predetermine cellular response phenotypes. Nucleic Acids Res 2019; 47:e82. [PMID: 31114928 PMCID: PMC6698671 DOI: 10.1093/nar/gkz417] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 04/03/2018] [Revised: 04/18/2019] [Accepted: 05/17/2019] [Indexed: 12/24/2022] Open
Abstract
With the emergence of genome editing technologies and synthetic biology, it is now possible to engineer genetic circuits driving a cell's phenotypic response to a stressor. However, capturing a continuous response, rather than simply a binary ‘on’ or ‘off’ response, remains a bioengineering challenge. No tools currently exist to identify gene candidates responsible for predetermining and fine-tuning cell response phenotypes. To address this gap, we devised a novel Regulostat Inferelator (RSI) algorithm to decipher intrinsic molecular devices or networks that predetermine cellular phenotypic responses. The RSI algorithm is designed to extract gene expression patterns from basal transcriptomic data in order to identify ‘regulostat’ constituent gene pairs, which exhibit rheostat-like mode-of-cooperation capable of fine-tuning cellular response. Our proof-of-concept study provides computational evidence for the existence of regulostats and that these networks predetermine cellular response prior to exposure to a stressor or drug. In addition, our work, for the first time, provides evidence of context-specific, drug–regulostat interactions in predetermining drug response phenotypes in cancer cells. Given RSI-inferred regulostat networks offer insights for prioritizing gene candidates capable of rendering a resistant phenotype sensitive to a given drug, we envision that this tool will be of great value in bioengineering and medicine.
Affiliation(s)
- Choong Yong Ung
- Center for Individualized Medicine, Department of Molecular Pharmacology and Experimental Therapeutics, Mayo Clinic College of Medicine, Rochester, MN 55905, USA
| | - Mehrab Ghanat Bari
- Center for Individualized Medicine, Department of Molecular Pharmacology and Experimental Therapeutics, Mayo Clinic College of Medicine, Rochester, MN 55905, USA
| | - Cheng Zhang
- Center for Individualized Medicine, Department of Molecular Pharmacology and Experimental Therapeutics, Mayo Clinic College of Medicine, Rochester, MN 55905, USA
| | - Jingjing Liang
- Department of Population and Quantitative Health Science, Case Western Reserve University, Cleveland, OH, USA
| | - Cristina Correia
- Center for Individualized Medicine, Department of Molecular Pharmacology and Experimental Therapeutics, Mayo Clinic College of Medicine, Rochester, MN 55905, USA
| | - Hu Li
- Center for Individualized Medicine, Department of Molecular Pharmacology and Experimental Therapeutics, Mayo Clinic College of Medicine, Rochester, MN 55905, USA
| |
|
20
|
Muralidhar S, Filia A, Nsengimana J, Poźniak J, O'Shea SJ, Diaz JM, Harland M, Randerson-Moor JA, Reichrath J, Laye JP, van der Weyden L, Adams DJ, Bishop DT, Newton-Bishop J. Vitamin D-VDR Signaling Inhibits Wnt/β-Catenin-Mediated Melanoma Progression and Promotes Antitumor Immunity. Cancer Res 2019; 79:5986-5998. [PMID: 31690667 DOI: 10.1158/0008-5472.can-18-3927] [Citation(s) in RCA: 74] [Impact Index Per Article: 12.3] [Received: 12/17/2018] [Revised: 02/12/2019] [Accepted: 10/01/2019] [Indexed: 11/16/2022]
Abstract
1α,25-Dihydroxyvitamin D3 signals via the vitamin D receptor (VDR). Higher serum vitamin D is associated with thinner primary melanoma and better outcome, although a causal mechanism has not been established. As patients with melanoma commonly avoid sun exposure, and consequent vitamin D deficiency might worsen outcomes, we interrogated 703 primary melanoma transcriptomes to understand the role of vitamin D-VDR signaling and replicated the findings in The Cancer Genome Atlas metastases. VDR expression was independently protective for melanoma-related death in both primary and metastatic disease. High tumor VDR expression was associated with upregulation of pathways mediating antitumor immunity and corresponding with higher imputed immune cell scores and histologically detected tumor-infiltrating lymphocytes. High VDR-expressing tumors had downregulation of proliferative pathways, notably Wnt/β-catenin signaling. Deleterious low VDR levels resulted from promoter methylation and gene deletion in metastases. Vitamin D deficiency (<25 nmol/L ∼ 10 ng/mL) shortened survival in primary melanoma in a VDR-dependent manner. In vitro functional validation studies showed that elevated vitamin D-VDR signaling inhibited Wnt/β-catenin signaling genes. Murine melanoma cells overexpressing VDR produced fewer pulmonary metastases than controls in tail-vein metastasis assays. In summary, vitamin D-VDR signaling contributes to controlling pro-proliferative/immunosuppressive Wnt/β-catenin signaling in melanoma and this is associated with less metastatic disease and stronger host immune responses. This is evidence of a causal relationship between vitamin D-VDR signaling and melanoma survival, which should be explored as a therapeutic target in primary resistance to checkpoint blockade. SIGNIFICANCE: VDR expression could potentially be used as a biomarker to stratify patients with melanoma that may respond better to immunotherapy.
Affiliation(s)
- Sathya Muralidhar
- University of Leeds School of Medicine, Leeds, United Kingdom
- Division of Molecular Pathology, The Institute of Cancer Research, London, United Kingdom
| | - Anastasia Filia
- Centre for Translational Research, Biomedical Research Foundation of the Academy of Athens, Athens, Greece
| | | | - Joanna Poźniak
- University of Leeds School of Medicine, Leeds, United Kingdom
- Laboratory for Molecular Cancer Biology, VIB Center for Cancer Biology, KU Leuven, Leuven, Belgium
- Department of Oncology, KU Leuven, Leuven, Belgium
| | - Sally J O'Shea
- University of Leeds School of Medicine, Leeds, United Kingdom
- Faculty of Medicine and Health, University College Cork, Cork, Ireland
- Mater Private Hospital Cork, Citygate, Mahon, Cork, Ireland
| | - Joey M Diaz
- University of Leeds School of Medicine, Leeds, United Kingdom
| | - Mark Harland
- University of Leeds School of Medicine, Leeds, United Kingdom
| | | | - Jörg Reichrath
- Center for Clinical and Experimental Photodermatology, The Saarland University Hospital, Homburg, Germany
| | - Jonathan P Laye
- University of Leeds School of Medicine, Leeds, United Kingdom
| | - Louise van der Weyden
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, United Kingdom
| | - David J Adams
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, United Kingdom
| | - D T Bishop
- University of Leeds School of Medicine, Leeds, United Kingdom
| | | |
|
21
|
Shao T, Wang G, Chen H, Xie Y, Jin X, Bai J, Xu J, Li X, Huang J, Jin Y, Li Y. Survey of miRNA-miRNA cooperative regulation principles across cancer types. Brief Bioinform 2019; 20:1621-1638. [PMID: 29800060 DOI: 10.1093/bib/bby038] [Citation(s) in RCA: 34] [Impact Index Per Article: 5.7] [Received: 02/09/2018] [Revised: 04/05/2018] [Indexed: 01/03/2025] Open
Abstract
Cooperative regulation among multiple microRNAs (miRNAs) is a complex type of posttranscriptional regulation in human; however, the global view of the system-level regulatory principles across cancers is still unclear. Here, we investigated miRNA-miRNA cooperative regulatory landscape across 18 cancer types and summarized the regulatory principles of miRNAs. The miRNA-miRNA cooperative pan-cancer network exhibited a scale-free and modular architecture. Cancer types with similar tissue origins had high similarity in cooperative network structure and expression of cooperative miRNA pairs. In addition, cooperative miRNAs showed divergent properties, including higher expression, greater expression variation and a stronger regulatory strength towards targets and were likely to regulate cancer hallmark-related functions. We found a marked rewiring of miRNA-miRNA cooperation between various cancers and revealed conserved and rewired network miRNA hubs. We further identified the common hubs, cancer-specific hubs and other hubs, which tend to target known anticancer drug targets. Finally, miRNA cooperative modules were found to be associated with patient survival in several cancer types. Our study highlights the potential of pan-cancer miRNA-miRNA cooperative regulation as a novel paradigm that may aid in the discovery of tumorigenesis mechanisms and development of anticancer drugs.
Affiliation(s)
- Tingting Shao
- College of Bioinformatics Science and Technology and Bio-Pharmaceutical Key Laboratory of Heilongjiang Province, Harbin Medical University, Harbin 150081, China
| | - Guangjuan Wang
- College of Bioinformatics Science and Technology and Bio-Pharmaceutical Key Laboratory of Heilongjiang Province, Harbin Medical University, Harbin 150081, China
| | - Hong Chen
- College of Bioinformatics Science and Technology and Bio-Pharmaceutical Key Laboratory of Heilongjiang Province, Harbin Medical University, Harbin 150081, China
| | - Yunjin Xie
- College of Bioinformatics Science and Technology and Bio-Pharmaceutical Key Laboratory of Heilongjiang Province, Harbin Medical University, Harbin 150081, China
| | - Xiyun Jin
- College of Bioinformatics Science and Technology and Bio-Pharmaceutical Key Laboratory of Heilongjiang Province, Harbin Medical University, Harbin 150081, China
| | - Jing Bai
- College of Bioinformatics Science and Technology and Bio-Pharmaceutical Key Laboratory of Heilongjiang Province, Harbin Medical University, Harbin 150081, China
| | - Juan Xu
- College of Bioinformatics Science and Technology and Bio-Pharmaceutical Key Laboratory of Heilongjiang Province, Harbin Medical University, Harbin 150081, China
| | - Xia Li
- College of Bioinformatics Science and Technology and Bio-Pharmaceutical Key Laboratory of Heilongjiang Province, Harbin Medical University, Harbin 150081, China
| | - Jian Huang
- Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 611731, China
| | - Yan Jin
- Department of Medical Genetics, Harbin Medical University, Harbin 150081, China
| | - Yongsheng Li
- College of Bioinformatics Science and Technology and Bio-Pharmaceutical Key Laboratory of Heilongjiang Province, Harbin Medical University, Harbin 150081, China
| |
|
22
|
Eskandarian Boroujeni M, Aliaghaei A, Maghsoudi N, Gardaneh M. Complementation of dopaminergic signaling by Pitx3-GDNF synergy induces dopamine secretion by multipotent Ntera2 cells. J Cell Biochem 2019; 121:200-212. [PMID: 31310388 DOI: 10.1002/jcb.29109] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 02/22/2019] [Revised: 04/27/2019] [Accepted: 04/30/2019] [Indexed: 11/07/2022]
Abstract
Human teratocarcinoma cell line Ntera2 (NT2) expresses dopamine signals and has shown a safe profile for clinical applications. Attempts to restore a complete dopaminergic (DAergic) phenotype enabling these cells to secrete dopamine have not been fully successful so far. We applied a blend of gene transfer techniques and a defined medium to convert NT2 cells to fully DAergic. The cells were primarily engineered to overexpress the Pitx3 gene product and then cultured in a growth medium supplemented with knockout serum and retinoic acid to form embryoid bodies (EBs). Trypsinization of EB colonies produced single cells ready for differentiation. Neuronal/DAergic induction was promoted by applying conditioned medium taken from engineered human astrocytomas over-secreting glial cell-derived neurotrophic factor (GDNF). Immunocytochemistry, reverse-transcription and real-time polymerase chain reaction analyses confirmed significantly induced expression of molecules involved in dopamine signaling and metabolism including tyrosine hydroxylase, Nurr1, dopamine transporter, and aromatic acid decarboxylase. High-performance liquid chromatography analysis indicated release of dopamine only from a class of fully differentiated cells expressing Pitx3 and exposed to GDNF. In addition, Pitx3 and GDNF additively promoted in vitro neuroprotection against Parkinsonian toxin. One month after transplantation to the striatum of 6-OHDA-lesioned rats, differentiated NT2 cells survived and induced a significant increase in striatal volume. Besides, cell implantation improved motor coordination in Parkinson's disease (PD) rat models. Our findings highlight the importance of Pitx3-GDNF interplay in dopamine signaling and indicate that our strategy might be useful for the restoration of DAergic fate of NT2 cells to make them clinically applicable toward cell replacement therapy of PD.
Affiliation(s)
- Mahdi Eskandarian Boroujeni
- Department of Stem Cells and Regenerative Medicine, Faculty of Medical Biotechnology, National Institute of Genetic Engineering and Biotechnology, Tehran, Iran
| | - Abbas Aliaghaei
- Anatomy and Cell Biology Department, Shahid Beheshti University of Medical Sciences, Tehran, Iran
| | - Nader Maghsoudi
- Neuroscience Research Center, Shahid Beheshti University of Medical Sciences, Tehran, Iran
| | - Mossa Gardaneh
- Department of Stem Cells and Regenerative Medicine, Faculty of Medical Biotechnology, National Institute of Genetic Engineering and Biotechnology, Tehran, Iran
| |
|
23
|
Label propagation defines signaling networks associated with recurrently mutated cancer genes. Sci Rep 2019; 9:9401. [PMID: 31253832 PMCID: PMC6599034 DOI: 10.1038/s41598-019-45603-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 12/03/2018] [Accepted: 06/11/2019] [Indexed: 11/09/2022] Open
Abstract
Human tumors have distinct profiles of genomic alterations, and each of these alterations has the potential to cause unique changes to cellular homeostasis. Detailed analyses of these changes could reveal downstream effects of genomic alterations, contributing to our understanding of their roles in tumor development and progression. Across a range of tumor types, including bladder, lung, and endometrial carcinoma, we determined genes that are frequently altered in The Cancer Genome Atlas patient populations, then examined the effects of these alterations on signaling and regulatory pathways. To achieve this, we used a label propagation-based methodology to generate networks from gene expression signatures associated with defined mutations. Individual networks offered a large-scale view of signaling changes represented by gene signatures, which in turn reflected the scope of molecular events that are perturbed in the presence of a given genomic alteration. Comparing different networks to one another revealed common biological pathways impacted by distinct genomic alterations, highlighting the concept that tumors can dysregulate key pathways through multiple, seemingly unrelated mechanisms. Finally, altered genes inducing common changes to the signaling network were used to search for genomic markers of drug response, connecting shared perturbations to differential drug sensitivity.
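The general label-propagation technique named in this abstract can be sketched as an iterative score diffusion over a network. The abstract gives no algorithmic details, so the update rule below (a random walk with restart, where each node splits its score equally among its neighbours), the `alpha` and `iters` parameters and the toy graph are all assumptions for illustration, not the authors' method.

```python
def propagate(adj, seeds, alpha=0.5, iters=50):
    """Diffuse seed scores over an undirected graph.

    adj   : dict mapping each node to the set of its neighbours (symmetric).
    seeds : dict mapping seed nodes (e.g. signature genes) to initial scores.
    alpha : fraction of score passed along edges at each step; the remaining
            (1 - alpha) is reset to the initial seed scores.
    """
    nodes = sorted(adj)
    initial = {n: float(seeds.get(n, 0.0)) for n in nodes}
    scores = dict(initial)
    for _ in range(iters):
        scores = {
            n: alpha * sum(scores[m] / len(adj[m]) for m in adj[n])
            + (1 - alpha) * initial[n]
            for n in nodes
        }
    return scores


# Hypothetical toy network: a path A - B - C - D with a single seed at A.
toy_net = {
    "A": {"B"},
    "B": {"A", "C"},
    "C": {"B", "D"},
    "D": {"C"},
}
result = propagate(toy_net, {"A": 1.0})
ranked = sorted(result, key=result.get, reverse=True)
```

Nodes closer to the seed end up with higher scores, so thresholding or ranking the propagated scores yields a subnetwork around the seed genes; the total score is conserved across iterations under this update rule.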
|
24
|
Clarke R, Tyson JJ, Tan M, Baumann WT, Jin L, Xuan J, Wang Y. Systems biology: perspectives on multiscale modeling in research on endocrine-related cancers. Endocr Relat Cancer 2019; 26:R345-R368. [PMID: 30965282 PMCID: PMC7045974 DOI: 10.1530/erc-18-0309] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 03/25/2019] [Accepted: 04/08/2019] [Indexed: 12/12/2022]
Abstract
Drawing on concepts from experimental biology, computer science, informatics, mathematics and statistics, systems biologists integrate data across diverse platforms and scales of time and space to create computational and mathematical models of the integrative, holistic functions of living systems. Endocrine-related cancers are well suited to study from a systems perspective because of the signaling complexities arising from the roles of growth factors, hormones and their receptors as critical regulators of cancer cell biology and from the interactions among cancer cells, normal cells and signaling molecules in the tumor microenvironment. Moreover, growth factors, hormones and their receptors are often effective targets for therapeutic intervention, such as estrogen biosynthesis, estrogen receptors or HER2 in breast cancer and androgen receptors in prostate cancer. Given the complexity underlying the molecular control networks in these cancers, a simple, intuitive understanding of how endocrine-related cancers respond to therapeutic protocols has proved incomplete and unsatisfactory. Systems biology offers an alternative paradigm for understanding these cancers and their treatment. To correctly interpret the results of systems-based studies requires some knowledge of how in silico models are built, and how they are used to describe a system and to predict the effects of perturbations on system function. In this review, we provide a general perspective on the field of cancer systems biology, and we explore some of the advantages, limitations and pitfalls associated with using predictive multiscale modeling to study endocrine-related cancers.
Affiliation(s)
- Robert Clarke
- Department of Oncology, Georgetown University Medical Center, Washington, District of Columbia, USA
| | - John J Tyson
- Department of Biological Sciences, Virginia Polytechnic Institute and State University, Blacksburg, Virginia, USA
| | - Ming Tan
- Department of Biostatistics, Bioinformatics & Biomathematics, Georgetown University Medical Center, Washington, District of Columbia, USA
| | - William T Baumann
- Department of Electrical and Computer Engineering, Virginia Polytechnic Institute and State University, Blacksburg, Virginia, USA
| | - Lu Jin
- Department of Oncology, Georgetown University Medical Center, Washington, District of Columbia, USA
| | - Jianhua Xuan
- Department of Electrical and Computer Engineering, Virginia Polytechnic Institute and State University, Arlington, Virginia, USA
| | - Yue Wang
- Department of Electrical and Computer Engineering, Virginia Polytechnic Institute and State University, Arlington, Virginia, USA
| |
|
25
|
Li R, Jiang S, Li W, Hong H, Zhao C, Huang X, Zhang Z, Li H, Chen H, Bo X. Exploration of prognosis-related microRNA and transcription factor co-regulatory networks across cancer types. RNA Biol 2019; 16:1010-1021. [PMID: 31046554 PMCID: PMC6602415 DOI: 10.1080/15476286.2019.1607714] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 02/07/2023] Open
Abstract
The study of cancer prognosis serves as an important part of cancer research. Large-scale cancer studies have identified numerous genes and microRNAs (miRNAs) associated with prognosis. These informative genes and miRNAs represent potential biomarkers to predict survival and to elucidate the molecular mechanism of tumour progression. MiRNAs and transcription factors (TFs) can work cooperatively as essential mediators of gene expression, and their dysregulation affects cancer prognosis. A panoramic view of cancer prognosis at the system level, considering the co-regulation roles of miRNA and TF, remains elusive. Here, we establish 12 prognosis-related miRNA-TF co-regulatory networks. The characteristics of prognostic target genes and their regulators in the network are depicted. Although the target genes and co-regulatory patterns exhibit cancer-specific properties, some miRNAs and TFs are highly conserved across cancers. We illustrate and interpret the roles of these conserved regulators by building a model associated with cancer hallmarks, functional enrichment analysis, network community detection, and exhaustive literature research. The elaborated system-level prognostic miRNA-TF co-regulation landscape, including the highlighted roles of conserved regulators, provides novel and powerful insights for further biological and medical discoveries.
Affiliation(s)
- Ruijiang Li
- Department of Biotechnology, Beijing Institute of Radiation Medicine, Beijing, P.R. China
| | - Shuai Jiang
- Department of Biotechnology, Beijing Institute of Radiation Medicine, Beijing, P.R. China
| | - Wanying Li
- Department of Biotechnology, Beijing Institute of Radiation Medicine, Beijing, P.R. China
| | - Hao Hong
- Department of Biotechnology, Beijing Institute of Radiation Medicine, Beijing, P.R. China
| | - Chenghui Zhao
- Department of Biotechnology, Beijing Institute of Radiation Medicine, Beijing, P.R. China
| | - Xin Huang
- Department of Biotechnology, Beijing Institute of Radiation Medicine, Beijing, P.R. China
| | - Zhuo Zhang
- Department of Biotechnology, Beijing Institute of Radiation Medicine, Beijing, P.R. China
| | - Hao Li
- Department of Biotechnology, Beijing Institute of Radiation Medicine, Beijing, P.R. China
| | - Hebing Chen
- Department of Biotechnology, Beijing Institute of Radiation Medicine, Beijing, P.R. China
| | - Xiaochen Bo
- Department of Biotechnology, Beijing Institute of Radiation Medicine, Beijing, P.R. China
| |
|
26
|
Zhao X, Xu M, Cai Z, Yuan W, Cui W, Li MD. Identification of LIFR, PIK3R1, and MMP12 as Novel Prognostic Signatures in Gallbladder Cancer Using Network-Based Module Analysis. Front Oncol 2019; 9:325. [PMID: 31119098 PMCID: PMC6504688 DOI: 10.3389/fonc.2019.00325] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 02/25/2019] [Accepted: 04/10/2019] [Indexed: 01/17/2023] Open
Abstract
Background: Gallbladder cancer (GBC) is a rare and aggressive malignancy of the biliary tract with a dismal survival rate. Effective biomarkers and therapeutic targets are urgently needed. Methods: We analyzed gene expression profiles of GBC to identify differentially expressed genes (DEGs) and then used these DEGs to identify functional module biomarkers based on protein functional interaction (FI) networks. We further evaluated the module-gene protein expression and clinical significance with immunohistochemistry staining (IHC) in a tissue microarray (TMA) from 80 GBC samples. Results: Five functional modules were identified. Module 0 included classical cancer signaling pathways, such as Ras and PI3K-Akt; and modules 1–4 included genes associated with muscle cells, fibrinogen, extracellular matrix, and integrins, respectively. We validated the expression of LIFR, PIK3R1, and MMP12, which were hubs or functional nodes in modules. Compared with paired peritumoural tissues, we found that the expression of LIFR (P = 0.002) and PIK3R1 (P = 0.046) proteins was significantly downregulated, and that of MMP12 (P = 0.006) was significantly upregulated. Further prognostic analysis showed that patients with low expression of LIFR had shorter overall survival than those with high expression (log-rank test P = 0.028), the same trend as for PIK3R1 (P = 0.053) and MMP12 (P = 0.006). Multivariate analysis indicated that expression of MMP12 protein (hazard ratio [HR] = 0.429; 95% confidence interval [CI] 0.198, 0.930; P = 0.032) was one of the significant independent prognostic factors for overall survival. Conclusions: We found a highly reliable FI network, which revealed LIFR, PIK3R1, and MMP12 as novel prognostic biomarker candidates for GBC. These findings could accelerate biomarker discovery and therapeutic development in this cancer.
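The multivariate result above reports a hazard ratio, its 95% confidence interval, and a P value. As a worked check of how these three numbers relate, the sketch below recovers a z statistic and two-sided P value from the interval. It assumes the interval is a Wald interval that is symmetric on the log-hazard scale, which the abstract does not state; the function name `wald_p_from_hr_ci` is a hypothetical helper, not part of the study.

```python
import math


def wald_p_from_hr_ci(hr, ci_low, ci_high, z_crit=1.959964):
    """Recover the z statistic and two-sided p-value from a hazard ratio
    and its 95% CI, ASSUMING a Wald interval symmetric on the log scale."""
    log_hr = math.log(hr)
    # The CI half-width on the log scale equals z_crit * standard error.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = log_hr / se
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


# Values reported for MMP12 in the abstract above.
z, p = wald_p_from_hr_ci(0.429, 0.198, 0.930)
print(f"z = {z:.2f}, two-sided p = {p:.3f}")
```

Under that assumption the recovered P value agrees with the reported P = 0.032 to three decimals, so the three reported figures are mutually consistent.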
Affiliation(s)
- Xinyi Zhao
- State Key Laboratory for Diagnosis and Treatment of Infectious Diseases, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, The First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, China
| | - Mengxiang Xu
- State Key Laboratory for Diagnosis and Treatment of Infectious Diseases, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, The First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, China
| | - Zhen Cai
- State Key Laboratory for Diagnosis and Treatment of Infectious Diseases, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, The First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, China
| | - Wenji Yuan
- State Key Laboratory for Diagnosis and Treatment of Infectious Diseases, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, The First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, China
| | - Wenyan Cui
- State Key Laboratory for Diagnosis and Treatment of Infectious Diseases, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, The First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, China
| | - Ming D Li
- State Key Laboratory for Diagnosis and Treatment of Infectious Diseases, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, The First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, China.,Research Center for Air Pollution and Health, Zhejiang University, Hangzhou, China.,Institute of Neuroimmune Pharmacology, Seton Hall University, South Orange, NJ, United States
| |
|
27
|
Altieri F, Hansen TV, Vandin F. NoMAS: A Computational Approach to Find Mutated Subnetworks Associated With Survival in Genome-Wide Cancer Studies. Front Genet 2019; 10:265. [PMID: 31024613 PMCID: PMC6468148 DOI: 10.3389/fgene.2019.00265] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 11/15/2018] [Accepted: 03/08/2019] [Indexed: 12/31/2022] Open
Abstract
Next-generation sequencing technologies make it possible to measure somatic mutations in a large number of patients with the same cancer type; one of the main goals in their analysis is the identification of mutations associated with clinical parameters. The identification of such relationships is hindered by extensive genetic heterogeneity in tumors, with different genes mutated in different patients, due in part to the fact that genes and mutations act in the context of pathways. It is therefore crucial to study mutations in the context of interactions among genes. In this work we study the problem of identifying subnetworks of a large gene-gene interaction network with mutations associated with survival time. We formally define the associated computational problem by using a score for subnetworks based on the log-rank statistical test to compare the survival of two given populations. We propose a novel approach, based on a new algorithm called Network of Mutations Associated with Survival (NoMAS), to find subnetworks of a large interaction network whose mutations are associated with survival time. NoMAS is based on the color-coding technique, which has previously been employed in other applications to find the highest scoring subnetwork with high probability when the subnetwork score is additive. In our case the score is not additive, so our algorithm cannot identify the optimal solution with the same guarantees associated with additive scores. Nonetheless, we prove that, under a reasonable model for mutations in cancer, NoMAS identifies the optimal solution with high probability. We also design a holdout approach to identify subnetworks significantly associated with survival time. We test NoMAS on simulated and cancer data, comparing it to approaches based on single gene tests and to various greedy approaches. We show that our method does indeed find the optimal solution and performs better than the other approaches. Moreover, on three cancer datasets our method identifies subnetworks with significant association to survival when none of the genes has significant association with survival when considered in isolation.
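The subnetwork score described above can be illustrated with a small sketch. It assumes the two populations are patients with at least one mutation in the subnetwork versus all other patients, and it uses the standard normalised log-rank statistic; the exact normalisation used by NoMAS may differ. The toy cohort and the names `logrank_z` and `subnetwork_score` are hypothetical, and the color-coding search itself is not shown.

```python
import math


def logrank_z(times, events, in_group):
    """Normalised log-rank statistic comparing patients flagged in
    `in_group` against the rest (positive: more early events in the group)."""
    event_times = sorted({t for t, e in zip(times, events) if e})
    obs_minus_exp = 0.0
    variance = 0.0
    for t in event_times:
        at_risk = [i for i, ti in enumerate(times) if ti >= t]
        n = len(at_risk)
        n1 = sum(1 for i in at_risk if in_group[i])
        d = sum(1 for i in at_risk if times[i] == t and events[i])
        d1 = sum(1 for i in at_risk
                 if times[i] == t and events[i] and in_group[i])
        obs_minus_exp += d1 - d * n1 / n
        if n > 1:  # hypergeometric variance is undefined for n == 1
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    return obs_minus_exp / math.sqrt(variance) if variance > 0 else 0.0


def subnetwork_score(subnetwork, mutations, times, events):
    """Score a gene set: a patient is 'mutated' if any gene in the set is.
    This grouping rule is an assumption made for the illustration."""
    in_group = [bool(mutated & subnetwork) for mutated in mutations]
    return logrank_z(times, events, in_group)


# Hypothetical toy cohort: survival times, event flags, mutated genes.
times = [2, 3, 4, 8, 9, 10]
events = [1, 1, 1, 1, 1, 1]
mutations = [{"A"}, {"B"}, {"A"}, {"C"}, set(), set()]

score_pair = subnetwork_score({"A", "B"}, mutations, times, events)
score_single = subnetwork_score({"A"}, mutations, times, events)
print(f"subnetwork {{A, B}}: {score_pair:.2f}, gene A alone: {score_single:.2f}")
```

Because a patient counts as mutated when any gene in the set is mutated, the score of a set is not the sum of its single-gene scores, which is the non-additivity the abstract refers to. In this toy cohort the two-gene set separates early and late events more cleanly than gene A alone.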
Affiliation(s)
- Federico Altieri
- Department of Information Engineering, University of Padova, Padova, Italy
| | - Tommy V Hansen
- Department of Mathematics and Computer Science, University of Southern Denmark, Odense, Denmark
| | - Fabio Vandin
- Department of Information Engineering, University of Padova, Padova, Italy
| |
|
28
|
Abstract
This chapter exploits network-based representations of proteins, called metagraphs, in protein-protein interaction networks to identify candidate disease-causing proteins. Protein-protein interaction (PPI) networks are effective tools in studying the functional roles of proteins in the development of various diseases. However, they are insufficient without the support of additional biological knowledge for proteins such as their molecular functions and biological processes. To enhance PPI networks, we utilize biological properties of individual proteins as well. More specifically, we integrate keywords from the UniProt database describing protein properties into the PPI network and construct a novel heterogeneous PPI-Keyword (PPIK) network consisting of both proteins and keywords. As proteins with similar functional duties or involved in the same metabolic pathway tend to have similar topological characteristics, we propose to represent them with metagraphs. Compared to the traditional network motif or subgraph, a metagraph can capture the topological arrangements through not only the protein-protein interactions but also protein-keyword associations. We feed those novel metagraph representations into classifiers for disease protein prediction and conduct our experiments on three different PPI databases. They show that the proposed method consistently increases disease protein prediction performance across various classifiers, by 15.3% in AUC on average. It outperforms the diffusion-based (e.g., RWR) and the module-based baselines by 13.8-32.9% in overall disease protein prediction. Breast cancer protein prediction outperforms RWR, PRINCE, and the module-based baselines by 6.6-14.2%. Finally, our predictions also exhibit better correlations with literature findings from the PubMed database.
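The abstract compares the metagraph approach against random walk with restart (RWR), a diffusion-based baseline. For readers unfamiliar with that baseline, the following is a minimal sketch of RWR on an undirected toy graph. It is not the chapter's metagraph method. The protein names, the restart probability of 0.3, and the function name `random_walk_with_restart` are illustrative assumptions.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.3,
                             tol=1e-12, max_iter=10_000):
    """Iterate p <- restart * p0 + (1 - restart) * W p until convergence.

    `adjacency` maps each node to a list of neighbours (undirected graph,
    every node assumed to have at least one neighbour). `seeds` are the
    known disease proteins where the walker restarts.
    """
    nodes = sorted(adjacency)
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            # Each node spreads its mass equally among its neighbours.
            share = (1.0 - restart) * p[u] / len(adjacency[u])
            for v in adjacency[u]:
                nxt[v] += share
        change = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if change < tol:
            break
    return p


# Hypothetical path graph P1 - P2 - P3 - P4 with P1 as the known seed.
toy_ppi = {"P1": ["P2"], "P2": ["P1", "P3"], "P3": ["P2", "P4"], "P4": ["P3"]}
scores = random_walk_with_restart(toy_ppi, seeds={"P1"})
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

Candidate proteins are ranked by their stationary visiting probability, so proteins closer to the seed in the network receive higher scores.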
|
29
|
Allahyar A, Ubels J, de Ridder J. A data-driven interactome of synergistic genes improves network-based cancer outcome prediction. PLoS Comput Biol 2019; 15:e1006657. [PMID: 30726216 PMCID: PMC6380593 DOI: 10.1371/journal.pcbi.1006657] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 05/13/2018] [Revised: 02/19/2019] [Accepted: 11/20/2018] [Indexed: 12/13/2022] Open
Abstract
Robustly predicting outcome for cancer patients from gene expression is an important challenge on the road to better personalized treatment. Network-based outcome predictors (NOPs), which consider the cellular wiring diagram in the classification, hold much promise to improve performance, stability and interpretability of identified marker genes. Problematically, reports on the efficacy of NOPs are conflicting and, for instance, suggest that utilizing random networks performs on par with networks that describe biologically relevant interactions. In this paper we turn the prediction problem around: instead of using a given biological network in the NOP, we aim to identify the network of genes that truly improves outcome prediction. To this end, we propose SyNet, a gene network constructed ab initio from synergistic gene pairs derived from survival-labelled gene expression data. To obtain SyNet, we evaluate synergy for all 69 million pairwise combinations of genes, resulting in a network that is specific to the dataset and phenotype under study and can be used in a NOP model. We evaluated SyNet and 11 other networks on a compendium dataset of >4000 survival-labelled breast cancer samples. For this purpose, we used cross-study validation, which more closely emulates real-world application of these outcome predictors. We find that SyNet is the only network that truly improves performance, stability and interpretability in several existing NOPs. We show that SyNet overlaps significantly with existing gene networks, and can be confidently predicted (~85% AUC) from graph-topological descriptions of these networks, in particular the breast tissue-specific network. Due to its data-driven nature, SyNet is not biased to well-studied genes and thus facilitates post-hoc interpretation. We find that SyNet is highly enriched for known breast cancer genes and genes related to e.g. histological grade and tamoxifen resistance, suggestive of a role in determining breast cancer outcome.
Affiliation(s)
- Amin Allahyar
- Department of Genetics, Center for Molecular Medicine, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
- Delft Bioinformatics Lab, Faculty of Electrical Engineering, Mathematics and Computer Science, Delft University of Technology, Delft, The Netherlands
| | - Joske Ubels
- Department of Genetics, Center for Molecular Medicine, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
- Skyline DX, Rotterdam
- Department of Hematology, Erasmus MC Cancer Institute, Rotterdam
| | - Jeroen de Ridder
- Department of Genetics, Center for Molecular Medicine, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
| |
|
30
|
Shao B, Bjaanæs MM, Helland Å, Schütte C, Conrad T. EMT network-based feature selection improves prognosis prediction in lung adenocarcinoma. PLoS One 2019; 14:e0204186. [PMID: 30703089 PMCID: PMC6354965 DOI: 10.1371/journal.pone.0204186] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 08/30/2018] [Accepted: 12/25/2018] [Indexed: 12/16/2022] Open
Abstract
Various feature selection algorithms have been proposed to identify cancer prognostic biomarkers. In recent years, however, their reproducibility has been criticized. The performance of feature selection algorithms has been shown to be affected by the datasets, underlying networks and evaluation metrics. One of the causes is the curse of dimensionality, which makes it hard to select the features that generalize well on independent data. Even the integration of biological networks does not mitigate this issue because the networks are large and many of their components are not relevant for the phenotype of interest. With the availability of multi-omics data, integrative approaches are being developed to build more robust predictive models. In this scenario, the higher data dimensions create greater challenges. We proposed a phenotype relevant network-based feature selection (PRNFS) framework and demonstrated its advantages in lung cancer prognosis prediction. We constructed cancer prognosis relevant networks based on epithelial mesenchymal transition (EMT) and integrated them with different types of omics data for feature selection. With less than 2.5% of the total dimensionality, we obtained EMT prognostic signatures that achieved remarkable prediction performance (average AUC values >0.8), very significant sample stratifications, and meaningful biological interpretations. In addition to finding EMT signatures from different omics data levels, we combined these single-omics signatures into multi-omics signatures, which improved sample stratifications significantly. Both single- and multi-omics EMT signatures were tested on independent multi-omics lung cancer datasets and significant sample stratifications were obtained.
Affiliation(s)
- Borong Shao
- Zuse Institute Berlin, Berlin, Germany
- Dept of mathematics and computer science, Freie Universität Berlin, Berlin, Germany
| | - Maria Moksnes Bjaanæs
- Dept of Oncology, Oslo University Hospital, Oslo, Norway
- Dept of Cancer Genetics, Oslo University Hospital, Oslo, Norway
- Dept of Clinical Medicine, University of Oslo, Oslo, Norway
| | - Åslaug Helland
- Dept of Oncology, Oslo University Hospital, Oslo, Norway
- Dept of Cancer Genetics, Oslo University Hospital, Oslo, Norway
- Dept of Clinical Medicine, University of Oslo, Oslo, Norway
| | - Christof Schütte
- Zuse Institute Berlin, Berlin, Germany
- Dept of mathematics and computer science, Freie Universität Berlin, Berlin, Germany
| | - Tim Conrad
- Zuse Institute Berlin, Berlin, Germany
- Dept of mathematics and computer science, Freie Universität Berlin, Berlin, Germany
| |
|
31
|
Ata SK, Ou-Yang L, Fang Y, Kwoh CK, Wu M, Li XL. Integrating node embeddings and biological annotations for genes to predict disease-gene associations. BMC SYSTEMS BIOLOGY 2018; 12:138. [PMID: 30598097 PMCID: PMC6311944 DOI: 10.1186/s12918-018-0662-y] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Indexed: 11/10/2022]
Abstract
BACKGROUND Predicting disease causative genes (or simply, disease genes) has played critical roles in understanding the genetic basis of human diseases and further providing disease treatment guidelines. While various computational methods have been proposed for disease gene prediction, the recent increase in available biological information for genes provides strong motivation to leverage these valuable data sources and extract useful information for accurately predicting disease genes. RESULTS We present an integrative framework called N2VKO to predict disease genes. Firstly, we learn the node embeddings from the protein-protein interaction (PPI) network for genes by adapting the well-known representation learning method node2vec. Secondly, we combine the learned node embeddings with various biological annotations as a rich feature representation for genes, and subsequently build binary classification models for disease gene prediction. Finally, as the data for disease gene prediction is usually imbalanced (i.e. the number of causative genes for a specific disease is much smaller than that of its non-causative genes), we further address this serious data imbalance issue by applying oversampling techniques for imbalanced data correction to improve the prediction performance. Comprehensive experiments demonstrate that our proposed N2VKO significantly outperforms four state-of-the-art methods for disease gene prediction across seven diseases. CONCLUSIONS In this study, we show that node embeddings learned from PPI networks work well for disease gene prediction, while integrating node embeddings with other biological annotations further improves the performance of classification models. Moreover, oversampling techniques for imbalance correction further enhance the prediction performance. In addition, the literature search of predicted disease genes also shows the effectiveness of our proposed N2VKO framework for disease gene prediction.
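Two of the steps described above, concatenating embeddings with annotations and correcting class imbalance, can be sketched in a few lines. The abstract does not say which oversampling technique N2VKO uses, so plain random duplication of minority-class rows is swapped in here as the simplest stand-in. The feature values, the function names `combine_features` and `random_oversample`, and the fixed seed are all illustrative assumptions.

```python
import random
from collections import Counter


def combine_features(embedding, annotations):
    """Concatenate a node embedding with annotation features for one gene."""
    return list(embedding) + list(annotations)


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen rows of smaller classes until every class
    has as many rows as the largest one (simple random oversampling)."""
    rng = random.Random(seed)
    rows_by_label = {}
    for features, label in zip(X, y):
        rows_by_label.setdefault(label, []).append(features)
    target = max(len(rows) for rows in rows_by_label.values())
    X_out, y_out = [], []
    for label, rows in rows_by_label.items():
        extra = [rng.choice(rows) for _ in range(target - len(rows))]
        for features in rows + extra:
            X_out.append(features)
            y_out.append(label)
    return X_out, y_out


# Hypothetical data: 2-d embeddings plus one binary annotation per gene;
# label 1 marks a disease gene (minority), label 0 a non-disease gene.
embeddings = [[0.1, 0.9], [0.2, 0.8], [0.7, 0.1], [0.6, 0.2], [0.8, 0.3]]
annotations = [[1], [1], [0], [0], [0]]
labels = [1, 0, 0, 0, 0]

X = [combine_features(e, a) for e, a in zip(embeddings, annotations)]
X_bal, y_bal = random_oversample(X, labels)
print(Counter(y_bal))
```

Oversampling should be applied to the training split only, so that duplicated rows never leak into the data used for evaluation.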
Affiliation(s)
- Sezin Kircali Ata
- Department of Computer Science and Engineering, Nanyang Technological University, Singapore, Singapore
| | - Le Ou-Yang
- Department of Electronic Engineering, College of Information Engineering, Shenzhen University, China, Singapore, Singapore
| | - Yuan Fang
- School of Information Systems, Singapore Management University, Singapore, Singapore
| | - Chee-Keong Kwoh
- Department of Computer Science and Engineering, Nanyang Technological University, Singapore, Singapore
| | - Min Wu
- Data Analytics Department, Institute for Infocomm Research, Singapore, Singapore.
| | - Xiao-Li Li
- Data Analytics Department, Institute for Infocomm Research, Singapore, Singapore
| |
|
32
|
Ayaub EA, Tandon K, Padwal M, Imani J, Patel H, Dubey A, Mekhael O, Upagupta C, Ayoub A, Dvorkin-Gheva A, Murphy J, Kolb PS, Lhotak S, Dickhout JG, Austin RC, Kolb MRJ, Richards CD, Ask K. IL-6 mediates ER expansion during hyperpolarization of alternatively activated macrophages. Immunol Cell Biol 2018; 97:203-217. [PMID: 30298952 PMCID: PMC7379543 DOI: 10.1111/imcb.12212] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Received: 11/21/2017] [Revised: 08/29/2018] [Accepted: 10/03/2018] [Indexed: 12/11/2022]
Abstract
Although recent evidence has shown that IL-6 is involved in enhanced alternative activation of macrophages toward a profibrotic phenotype, the mechanisms leading to their increased secretory capacity are not fully understood. Here, we investigated the effect of IL-6 on endoplasmic reticulum (ER) expansion and alternative activation of macrophages in vitro. An essential mediator in this ER expansion process is the IRE1 pathway, which possesses a kinase and endoribonuclease domain to cleave XBP1 into a spliced bioactive molecule. To investigate the IRE1-XBP1 expansion pathway, IL-4/IL-13 and IL-4/IL-13/IL-6-mediated alternative programming of murine bone marrow-derived and human THP1 macrophages were assessed by arginase activity in cell lysates, CD206 and arginase-1 expression by flow cytometry, and secreted CCL18 by ELISA, respectively. Ultrastructural intracellular morphology and ER biogenesis were examined by transmission electron microscopy and immunofluorescence. Transcription profiling of 128 genes was assessed by NanoString, and pharmacological inhibition of the IRE1-XBP1 arm was achieved using STF-083010 and verified by RT-PCR. The addition of IL-6 to the conventional alternative programming cocktail IL-4/IL-13 resulted in increased ER and mitochondrial expansion, profibrotic profiles and unfolded protein response-mediated induction of molecular chaperones. IRE1-XBP1 inhibition substantially reduced the IL-6-mediated hyperpolarization and normalized the above effects. In conclusion, the addition of IL-6 enhances ER expansion and the profibrotic capacity of IL-4/IL-13-mediated activation of macrophages. Therapeutic strategies targeting IL-6 or the IRE1-XBP1 axis may be beneficial to prevent the profibrotic capacity of macrophages.
Affiliation(s)
- Ehab A Ayaub
- Department of Medicine, Firestone Institute for Respiratory Health, McMaster University and The Research Institute of St Joe's Hamilton, Hamilton, ON, Canada.,Department of Pathology and Molecular Medicine, McMaster Immunology Research Centre, McMaster University, Hamilton, ON, Canada
| | - Karun Tandon
- Department of Medicine, Firestone Institute for Respiratory Health, McMaster University and The Research Institute of St Joe's Hamilton, Hamilton, ON, Canada.,Department of Pathology and Molecular Medicine, McMaster Immunology Research Centre, McMaster University, Hamilton, ON, Canada
| | - Manreet Padwal
- Department of Medicine, Firestone Institute for Respiratory Health, McMaster University and The Research Institute of St Joe's Hamilton, Hamilton, ON, Canada.,Department of Pathology and Molecular Medicine, McMaster Immunology Research Centre, McMaster University, Hamilton, ON, Canada
| | - Jewel Imani
- Department of Medicine, Firestone Institute for Respiratory Health, McMaster University and The Research Institute of St Joe's Hamilton, Hamilton, ON, Canada.,Department of Pathology and Molecular Medicine, McMaster Immunology Research Centre, McMaster University, Hamilton, ON, Canada
| | - Hemisha Patel
- Department of Medicine, Firestone Institute for Respiratory Health, McMaster University and The Research Institute of St Joe's Hamilton, Hamilton, ON, Canada.,Department of Pathology and Molecular Medicine, McMaster Immunology Research Centre, McMaster University, Hamilton, ON, Canada
| | - Anisha Dubey
- Department of Pathology and Molecular Medicine, McMaster Immunology Research Centre, McMaster University, Hamilton, ON, Canada
| | - Olivia Mekhael
- Department of Medicine, Firestone Institute for Respiratory Health, McMaster University and The Research Institute of St Joe's Hamilton, Hamilton, ON, Canada.,Department of Pathology and Molecular Medicine, McMaster Immunology Research Centre, McMaster University, Hamilton, ON, Canada
| | - Chandak Upagupta
- Department of Medicine, Firestone Institute for Respiratory Health, McMaster University and The Research Institute of St Joe's Hamilton, Hamilton, ON, Canada.,Department of Pathology and Molecular Medicine, McMaster Immunology Research Centre, McMaster University, Hamilton, ON, Canada
| | - Anmar Ayoub
- Department of Medicine, Firestone Institute for Respiratory Health, McMaster University and The Research Institute of St Joe's Hamilton, Hamilton, ON, Canada.,Department of Pathology and Molecular Medicine, McMaster Immunology Research Centre, McMaster University, Hamilton, ON, Canada
| | - Anna Dvorkin-Gheva
- Department of Pathology and Molecular Medicine, McMaster Immunology Research Centre, McMaster University, Hamilton, ON, Canada
| | - James Murphy
- Department of Medicine, Firestone Institute for Respiratory Health, McMaster University and The Research Institute of St Joe's Hamilton, Hamilton, ON, Canada.,Department of Pathology and Molecular Medicine, McMaster Immunology Research Centre, McMaster University, Hamilton, ON, Canada
| | - Philipp S Kolb
- Department of Medicine, Firestone Institute for Respiratory Health, McMaster University and The Research Institute of St Joe's Hamilton, Hamilton, ON, Canada.,Department of Pathology and Molecular Medicine, McMaster Immunology Research Centre, McMaster University, Hamilton, ON, Canada
| | - Sarka Lhotak
- Department of Medicine, Hamilton Centre for Kidney Research, McMaster University, Hamilton, ON, Canada
| | - Jeffrey G Dickhout
- Department of Medicine, Hamilton Centre for Kidney Research, McMaster University, Hamilton, ON, Canada
| | - Rick C Austin
- Department of Medicine, Hamilton Centre for Kidney Research, McMaster University, Hamilton, ON, Canada
| | - Martin R J Kolb
- Department of Medicine, Firestone Institute for Respiratory Health, McMaster University and The Research Institute of St Joe's Hamilton, Hamilton, ON, Canada
| | - Carl D Richards
- Department of Pathology and Molecular Medicine, McMaster Immunology Research Centre, McMaster University, Hamilton, ON, Canada
| | - Kjetil Ask
- Department of Medicine, Firestone Institute for Respiratory Health, McMaster University and The Research Institute of St Joe's Hamilton, Hamilton, ON, Canada.,Department of Pathology and Molecular Medicine, McMaster Immunology Research Centre, McMaster University, Hamilton, ON, Canada
| |
|
33
|
Haider S, Yao CQ, Sabine VS, Grzadkowski M, Stimper V, Starmans MHW, Wang J, Nguyen F, Moon NC, Lin X, Drake C, Crozier CA, Brookes CL, van de Velde CJH, Hasenburg A, Kieback DG, Markopoulos CJ, Dirix LY, Seynaeve C, Rea DW, Kasprzyk A, Lambin P, Lio' P, Bartlett JMS, Boutros PC. Pathway-based subnetworks enable cross-disease biomarker discovery. Nat Commun 2018; 9:4746. [PMID: 30420699 PMCID: PMC6232113 DOI: 10.1038/s41467-018-07021-3] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.1] [Received: 03/28/2018] [Accepted: 09/29/2018] [Indexed: 11/29/2022] Open
Abstract
Biomarkers lie at the heart of precision medicine. Surprisingly, while rapid genomic profiling is becoming ubiquitous, the development of biomarkers usually involves the application of bespoke techniques that cannot be directly applied to other datasets. There is an urgent need for a systematic methodology to create biologically-interpretable molecular models that robustly predict key phenotypes. Here we present SIMMS (Subnetwork Integration for Multi-Modal Signatures): an algorithm that fragments pathways into functional modules and uses these to predict phenotypes. We apply SIMMS to multiple data types across five diseases, and in each it reproducibly identifies known and novel subtypes, and makes superior predictions to the best bespoke approaches. To demonstrate its ability on a new dataset, we profile 33 genes/nodes of the PI3K pathway in 1734 FFPE breast tumors and create a four-subnetwork prediction model. This model out-performs a clinically-validated molecular test in an independent cohort of 1742 patients. SIMMS is generic and enables systematic data integration for robust biomarker discovery.
Affiliation(s)
- Syed Haider
- Informatics and Biocomputing Program, Ontario Institute for Cancer Research, Toronto, M5G 0A3, Canada.
- Computer Laboratory, University of Cambridge, Cambridge, CB3 0FD, United Kingdom.
| | - Cindy Q Yao
- Informatics and Biocomputing Program, Ontario Institute for Cancer Research, Toronto, M5G 0A3, Canada
- Diagnostic Development Program, Ontario Institute for Cancer Research, Toronto, M5G 0A3, Canada
- Department of Medical Biophysics, University of Toronto, Toronto, M5G 1L7, Canada
| | - Vicky S Sabine
- Diagnostic Development Program, Ontario Institute for Cancer Research, Toronto, M5G 0A3, Canada
| | - Michal Grzadkowski
- Informatics and Biocomputing Program, Ontario Institute for Cancer Research, Toronto, M5G 0A3, Canada
| | - Vincent Stimper
- Informatics and Biocomputing Program, Ontario Institute for Cancer Research, Toronto, M5G 0A3, Canada
| | - Maud H W Starmans
- Informatics and Biocomputing Program, Ontario Institute for Cancer Research, Toronto, M5G 0A3, Canada
- Department of Radiation Oncology (Maastro), GROW-School for Oncology and Developmental Biology, Maastricht University Medical Center, Maastricht, The Netherlands
| | - Jianxin Wang
- Informatics and Biocomputing Program, Ontario Institute for Cancer Research, Toronto, M5G 0A3, Canada
| | - Francis Nguyen
- Informatics and Biocomputing Program, Ontario Institute for Cancer Research, Toronto, M5G 0A3, Canada
- Department of Medical Biophysics, University of Toronto, Toronto, M5G 1L7, Canada
| | - Nathalie C Moon
- Informatics and Biocomputing Program, Ontario Institute for Cancer Research, Toronto, M5G 0A3, Canada
| | - Xihui Lin
- Informatics and Biocomputing Program, Ontario Institute for Cancer Research, Toronto, M5G 0A3, Canada
| | - Camilla Drake
- Diagnostic Development Program, Ontario Institute for Cancer Research, Toronto, M5G 0A3, Canada
| | - Cheryl A Crozier
- Diagnostic Development Program, Ontario Institute for Cancer Research, Toronto, M5G 0A3, Canada
| | - Cassandra L Brookes
- Cancer Research UK Clinical Trials Unit, University of Birmingham, Birmingham, B15 2TT, United Kingdom
| | - Daniel W Rea
- Cancer Research UK Clinical Trials Unit, University of Birmingham, Birmingham, B15 2TT, United Kingdom
| | - Arek Kasprzyk
- Informatics and Biocomputing Program, Ontario Institute for Cancer Research, Toronto, M5G 0A3, Canada
| | - Philippe Lambin
- Department of Radiation Oncology (Maastro), GROW-School for Oncology and Developmental Biology, Maastricht University Medical Center, Maastricht, The Netherlands
| | - Pietro Lio'
- Computer Laboratory, University of Cambridge, Cambridge, CB3 0FD, United Kingdom
| | - John M S Bartlett
- Diagnostic Development Program, Ontario Institute for Cancer Research, Toronto, M5G 0A3, Canada.
| | - Paul C Boutros
- Informatics and Biocomputing Program, Ontario Institute for Cancer Research, Toronto, M5G 0A3, Canada.
- Department of Medical Biophysics, University of Toronto, Toronto, M5G 1L7, Canada.
- Department of Pharmacology and Toxicology, University of Toronto, Toronto, M5S 1A8, Canada.
| |
|
34
|
Manners HN, Roy S, Kalita JK. Intrinsic-overlapping co-expression module detection with application to Alzheimer's Disease. Comput Biol Chem 2018; 77:373-389. [PMID: 30466046 DOI: 10.1016/j.compbiolchem.2018.10.014] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/07/2018] [Revised: 10/28/2018] [Accepted: 10/29/2018] [Indexed: 11/18/2022]
Abstract
Genes interact with each other and may cause perturbation in the molecular pathways leading to complex diseases. Often, instead of any single gene, a subset of genes interact, forming a network, to share common biological functions. Such a subnetwork is called a functional module or motif. Identifying such modules and the central key genes in them that may be responsible for a disease may help design patient-specific drugs. In this study, we consider the neurodegenerative Alzheimer's Disease (AD) and identify potentially responsible genes from functional motif analysis. We start from the hypothesis that central genes in genetic modules are more relevant to the disease under investigation, and we identify hub genes from the modules as potential marker genes. Motifs or modules are often non-exclusive or overlapping in nature. Moreover, they sometimes show intrinsic or hierarchical distributions with overlapping functional roles. To the best of our knowledge, no prior work handles both situations in an integrated way. We propose a non-exclusive clustering approach, CluViaN (Clustering Via Network), that can detect intrinsic as well as overlapping modules from gene co-expression networks constructed using microarray expression profiles. We compare our method with existing methods to evaluate the quality of the modules extracted. CluViaN reports the presence of intrinsic and overlapping motifs in different species not reported by any other research. We further apply our method to extract significant AD-specific modules and rank them based on the number of genes from a module involved in the disease pathways. Finally, top central genes are identified by topological analysis of the modules. We use two different AD phenotype datasets for experimentation. We observe that central genes, namely PSEN1, APP, NDUFB2, NDUFA1, UQCR10, PPP3R1 and a few more, play significant roles in AD.
Interestingly, our experiments also find a hub gene, PML, which has recently been reported to play a role in plasticity, circadian rhythms and the response to proteins which can cause neurodegenerative disorders. MUC4, another hub gene that we find experimentally, is yet to be investigated for its potential role in AD. A software implementation of CluViaN in Java is available for download at https://sites.google.com/site/swarupnehu/publications/resources/CluViaN Software.rar.
Affiliation(s)
- Hazel Nicolette Manners
- Department of Information Technology, North Eastern Hill University, Shillong, Meghalaya, India.
| | - Swarup Roy
- Department of Computer Applications, Sikkim University, Gangtok, Sikkim, India; Department of Information Technology, North Eastern Hill University, Shillong, Meghalaya, India.
| | - Jugal K Kalita
- Department of Computer Science, University of Colorado, Colorado Springs, USA.
| |
|
35
|
Computational Methods for Subtyping of Tumors and Their Applications for Deciphering Tumor Heterogeneity. Methods Mol Biol 2018. [PMID: 30378077 DOI: 10.1007/978-1-4939-8868-6_11] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register]
Abstract
With the rapid development of deep sequencing technologies, many programs are generating multi-platform genomic profiles (e.g., somatic mutation, DNA methylation, and gene expression) for a large number of tumors. This activity has provided unique opportunities and challenges to stratify tumors and decipher tumor heterogeneity. In this chapter, we summarize several computational methods to address the challenge of tumor stratification with different types of genomic data. We further introduce their applications in emerging large-scale genomic data to show their effectiveness in deciphering tumor heterogeneity and clinical relevance.
|
36
|
Typing tumors using pathways selected by somatic evolution. Nat Commun 2018; 9:4159. [PMID: 30297789 PMCID: PMC6175900 DOI: 10.1038/s41467-018-06464-y] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/16/2018] [Accepted: 09/03/2018] [Indexed: 01/01/2023] Open
Abstract
Many recent efforts to analyze cancer genomes involve aggregation of mutations within reference maps of molecular pathways and protein networks. Here, we find these pathway studies are impeded by molecular interactions that are functionally irrelevant to cancer or the patient’s tumor type, as these interactions diminish the contrast of driver pathways relative to individual frequently mutated genes. This problem can be addressed by creating stringent tumor-specific networks of biophysical protein interactions, identified by signatures of epistatic selection during tumor evolution. Using such an evolutionarily selected pathway (ESP) map, we analyze the major cancer genome atlases to derive a hierarchical classification of tumor subtypes linked to characteristic mutated pathways. These pathways are clinically prognostic and predictive, including the TP53-AXIN-ARHGEF17 combination in liver and CYLC2-STK11-STK11IP in lung cancer, which we validate in independent cohorts. This ESP framework substantially improves the definition of cancer pathways and subtypes from tumor genome data. Informative pathways driving cancer pathogenesis and subtypes can be difficult to identify in the presence of many gene interactions irrelevant to cancer. Here, the authors describe an approach for cancer gene pathway analysis based on key molecular interactions that drive cancer in relevant tissue types, and they assemble a focused map of Evolutionarily Selected Pathways (ESP) with interactions supported by both protein–protein binding and genetic epistasis during somatic tumor evolution.
|
37
|
An Improved Method for Prediction of Cancer Prognosis by Network Learning. Genes (Basel) 2018; 9:genes9100478. [PMID: 30279327 PMCID: PMC6210393 DOI: 10.3390/genes9100478] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/28/2018] [Revised: 09/21/2018] [Accepted: 09/27/2018] [Indexed: 01/01/2023] Open
Abstract
Accurate identification of prognostic biomarkers is an important yet challenging goal in bioinformatics. Many bioinformatics approaches have been proposed for this purpose, but there is still room for improvement. In this paper, we propose a novel machine learning-based method for more accurate identification of prognostic biomarker genes and use them for prediction of cancer prognosis. The proposed method specifies the candidate prognostic gene module by graph learning using the generative adversarial networks (GANs) model, and scores genes using a PageRank algorithm. We applied the proposed method to multiple-omics data that included copy number, gene expression, DNA methylation, and somatic mutation data for five cancer types. The proposed method showed better prediction accuracy than did existing methods. We identified many prognostic genes and their roles in their biological pathways. We also showed that the genes identified from different omics data were complementary, which led to improved accuracy in prediction using multi-omics data.
|
38
|
Choi J, Oh I, Seo S, Ahn J. G2Vec: Distributed gene representations for identification of cancer prognostic genes. Sci Rep 2018; 8:13729. [PMID: 30213980 PMCID: PMC6137174 DOI: 10.1038/s41598-018-32180-0] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/11/2018] [Accepted: 09/03/2018] [Indexed: 12/27/2022] Open
Abstract
Identification of cancer prognostic genes is important in that it can lead to accurate outcome prediction and better therapeutic trials for cancer patients. Many computational approaches have been proposed to achieve this goal; however, there is room for improvement. Recent developments in deep learning techniques can aid in the identification of better prognostic genes and more accurate outcome prediction, but one of the main problems in the adoption of deep learning for this purpose is that data from cancer patients have too many dimensions, while the number of samples is relatively small. In this study, we propose a novel network-based deep learning method to identify prognostic gene signatures via distributed gene representations generated by G2Vec, which is a modified Word2Vec model originally used for natural language processing. We applied the proposed method to five cancer types including liver cancer and showed that G2Vec outperformed extant feature selection methods, especially for small numbers of samples. Moreover, biomarkers identified by G2Vec were useful for finding significant prognostic gene modules associated with hepatocellular carcinoma.
Affiliation(s)
- Jonghwan Choi
- Department of Computer Science & Engineering, Incheon National University, Incheon, South Korea
| | - Ilhwan Oh
- Department of Computer Science & Engineering, Incheon National University, Incheon, South Korea
| | - Sangmin Seo
- Department of Computer Science & Engineering, Incheon National University, Incheon, South Korea
| | - Jaegyoon Ahn
- Department of Computer Science & Engineering, Incheon National University, Incheon, South Korea.
| |
|
39
|
Choi J, Park S, Yoon Y, Ahn J. Improved prediction of breast cancer outcome by identifying heterogeneous biomarkers. Bioinformatics 2018; 33:3619-3626. [PMID: 28961949 DOI: 10.1093/bioinformatics/btx487] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/21/2017] [Accepted: 07/27/2017] [Indexed: 02/07/2023] Open
Abstract
Motivation: Identification of genes that can be used to predict prognosis in patients with cancer is important in that it can lead to improved therapy, and can also promote our understanding of tumor progression on the molecular level. One of the common but fundamental problems that render identification of prognostic genes and prediction of cancer outcomes difficult is the heterogeneity of patient samples. Results: To reduce the effect of sample heterogeneity, we clustered data samples using the K-means algorithm and applied modified PageRank to functional interaction (FI) networks weighted using gene expression values of samples in each cluster. Hub genes among the resulting prioritized genes were selected as biomarkers to predict the prognosis of samples. This process outperformed traditional feature selection methods as well as several network-based prognostic gene selection methods when applied to Random Forest. We were able to find many cluster-specific prognostic genes for each dataset. Functional study showed that distinct biological processes were enriched in each cluster, which seems to reflect different aspects of tumor progression or oncogenesis among distinct patient groups. Taken together, these results provide support for the hypothesis that our approach can effectively identify heterogeneous prognostic genes, and these are complementary to each other, improving prediction accuracy. Availability and implementation: https://github.com/mathcom/CPR. Contact: jgahn@inu.ac.kr. Supplementary information: Supplementary data are available at Bioinformatics online.
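The network ranking step mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' CPR code (see the repository URL in the abstract) and does not reproduce their modification of PageRank; it is ordinary weighted PageRank by power iteration on a toy undirected network. The function name weighted_pagerank, the gene labels, and the edge weights are hypothetical.

```python
def weighted_pagerank(edges, damping=0.85, iterations=100):
    """Plain PageRank by power iteration on an undirected weighted graph.

    edges: dict mapping (gene_a, gene_b) -> positive edge weight.
    The weights stand in for expression-derived weights (an assumption).
    """
    # Build a symmetric adjacency map: neighbors[g][h] = weight of edge g-h.
    neighbors = {}
    for (a, b), w in edges.items():
        neighbors.setdefault(a, {})[b] = w
        neighbors.setdefault(b, {})[a] = w

    genes = sorted(neighbors)
    n = len(genes)
    # Total edge weight attached to each gene, used to normalize transitions.
    strength = {g: sum(neighbors[g].values()) for g in genes}

    rank = {g: 1.0 / n for g in genes}
    for _ in range(iterations):
        new_rank = {}
        for g in genes:
            # Rank flowing into g from each neighbor h, proportional to weight.
            incoming = sum(rank[h] * w / strength[h] for h, w in neighbors[g].items())
            new_rank[g] = (1 - damping) / n + damping * incoming
        rank = new_rank
    return rank


# Hypothetical toy network: gene "A" is the hub.
toy_edges = {("A", "B"): 1.0, ("A", "C"): 1.0, ("A", "D"): 1.0, ("B", "C"): 0.5}
toy_rank = weighted_pagerank(toy_edges)
top_gene = max(toy_rank, key=toy_rank.get)
```

In a cluster-specific setting one would presumably call such a function once per sample cluster with that cluster's weights, then compare the top-ranked genes across clusters.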
Affiliation(s)
- Jonghwan Choi
- Department of Computer Science and Engineering, Incheon National University, Incheon, The Republic of Korea
| | - Sanghyun Park
- Department of Computer Science, Yonsei University, Seoul, The Republic of Korea
| | - Youngmi Yoon
- Department of Computer Engineering, Gachon University, Seongnam-si, Gyeonggi-do, The Republic of Korea
| | - Jaegyoon Ahn
- Department of Computer Science and Engineering, Incheon National University, Incheon, The Republic of Korea
| |
|
40
|
Sanati N, Iancu OD, Wu G, Jacobs JE, McWeeney SK. Network-Based Predictors of Progression in Head and Neck Squamous Cell Carcinoma. Front Genet 2018; 9:183. [PMID: 29910823 PMCID: PMC5992410 DOI: 10.3389/fgene.2018.00183] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/05/2018] [Accepted: 05/07/2018] [Indexed: 11/23/2022] Open
Abstract
The heterogeneity in head and neck squamous cell carcinoma (HNSCC) has made reliable stratification extremely challenging. Behavioral risk factors such as smoking and alcohol consumption contribute to this heterogeneity. To help elucidate potential mechanisms of progression in HNSCC, we focused on elucidating patterns of gene interactions associated with tumor progression. We performed de-novo gene co-expression network inference utilizing 229 patient samples from The Cancer Genome Atlas (TCGA) previously annotated by Bornstein et al. (2016). Differential network analysis allowed us to contrast progressor and non-progressor cohorts. Beyond standard differential expression (DE) analysis, this approach evaluates changes in gene expression variance (differential variability DV) and changes in covariance, which we denote as differential wiring (DW). The set of affected genes was overlaid onto the co-expression network, identifying 12 modules significantly enriched in DE, DV, and/or DW genes. Additionally, we identified modules correlated with behavioral measures such as alcohol consumption and smoking status. In the module enriched for differentially wired genes, we identified network hubs including IL10RA, DOK2, APBB1IP, UBASH3A, SASH3, CELF2, TRAF3IP3, GIMAP6, MYO1F, NCKAP1L, WAS, FERMT3, SLA, SELPLG, TNFRSF1B, WIPF1, AMICA1, PTPN22; the network centrality and progression specificity of these genes suggest a potential role in tumor evolution mechanisms related to inflammation and microenvironment. The identification of this network-based gene signature could be further developed to guide progression stratification, highlighting how network approaches may help improve clinical research end points and ultimately aid in clinical utility.
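The three quantities contrasted in this abstract (differential expression, differential variability, and differential wiring) can be sketched on toy data. The statistics below are simple group differences chosen for illustration; the abstract does not specify the tests the authors used, and the names differential_summary, progressors, and non_progressors and all values are hypothetical.

```python
from statistics import mean, variance


def pearson(x, y):
    """Pearson correlation of two equal-length lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def differential_summary(group1, group2):
    """Each group is a pair (gene_a_values, gene_b_values) across samples.

    DE and DV are reported for gene A only; DW is the change in the
    gene A / gene B correlation between the two groups.
    """
    a1, b1 = group1
    a2, b2 = group2
    return {
        "DE": mean(a1) - mean(a2),          # shift in mean expression
        "DV": variance(a1) - variance(a2),  # shift in expression variance
        "DW": pearson(a1, b1) - pearson(a2, b2),  # shift in co-expression
    }


# Toy data: gene A looks identical in both groups, but its relationship
# to gene B flips sign, so only the wiring term is non-zero.
progressors = ([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
non_progressors = ([1.0, 2.0, 3.0, 4.0], [8.0, 6.0, 4.0, 2.0])
summary = differential_summary(progressors, non_progressors)
```

The toy case shows why a wiring term can add information: a gene pair can change its co-expression pattern without any change in the mean or variance of either gene.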
Affiliation(s)
- Nasim Sanati
- Division of Bioinformatics and Computational Biology, Department of Medical Informatics and Clinical Epidemiology, Oregon Health and Science University, Portland, OR, United States
| | - Ovidiu D Iancu
- Department of Behavioral Neuroscience, Oregon Health and Science University, Portland, OR, United States
| | - Guanming Wu
- Division of Bioinformatics and Computational Biology, Department of Medical Informatics and Clinical Epidemiology, Oregon Health and Science University, Portland, OR, United States
| | - James E Jacobs
- Division of Bioinformatics and Computational Biology, Department of Medical Informatics and Clinical Epidemiology, Oregon Health and Science University, Portland, OR, United States; Division of Pediatric Hematology/Oncology, Department of Pediatrics, Oregon Health and Science University, Portland, OR, United States
| | - Shannon K McWeeney
- Division of Bioinformatics and Computational Biology, Department of Medical Informatics and Clinical Epidemiology, Oregon Health and Science University, Portland, OR, United States; OHSU Knight Cancer Institute, Portland, OR, United States
| |
|
41
|
Zhang S. Comparisons of gene coexpression network modules in breast cancer and ovarian cancer. BMC SYSTEMS BIOLOGY 2018; 12:8. [PMID: 29671401 PMCID: PMC5907153 DOI: 10.1186/s12918-018-0530-9] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Indexed: 12/12/2022]
Abstract
Background: Breast cancer and ovarian cancer are hormone driven and are known to have some predisposition genes in common, such as the two well-known cancer genes BRCA1 and BRCA2. The objective of this study is to compare the coexpression network modules of both cancers, so as to infer the potential cancer-related modules. Methods: We applied eigen-decomposition to the matrix that integrates the gene coexpression networks of both breast cancer and ovarian cancer. With hierarchical clustering of the related eigenvectors, we obtained the network modules of both cancers simultaneously. Enrichment analysis on Gene Ontology (GO), KEGG pathway, Disease Ontology (DO), and Gene Set Enrichment Analysis (GSEA) in the identified modules was performed. Results: We identified 43 modules that are enriched by at least one of the four types of enrichments. 31, 25, and 18 modules are enriched by GO terms, KEGG pathways, and DO terms, respectively. The structure of 29 modules differs significantly between the two cancers (p-values less than 0.05), of which 25 modules have larger densities in ovarian cancer. One module was found to be significantly enriched by the terms related to breast cancer from GO, KEGG and DO enrichment. One module was found to be significantly enriched by ovarian cancer related terms. Conclusion: Breast cancer and ovarian cancer share some common properties on the module level. Integration of both cancers helps identify the potential cancer-associated modules. Electronic supplementary material: The online version of this article (10.1186/s12918-018-0530-9) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Shuqin Zhang
- Center for Computational Systems Biology, Shanghai Key Laboratory for Contemporary Applied Mathematics, School of Mathematical Sciences, Fudan University, No.220 Handan Road, Shanghai, 200433, China.
| |
|
42
|
Wang SB, Venkatraman V, Crowgey EL, Liu T, Fu Z, Holewinski R, Ranek M, Kass DA, O'Rourke B, Van Eyk JE. Protein S-Nitrosylation Controls Glycogen Synthase Kinase 3β Function Independent of Its Phosphorylation State. Circ Res 2018; 122:1517-1531. [PMID: 29563102 DOI: 10.1161/circresaha.118.312789] [Citation(s) in RCA: 38] [Impact Index Per Article: 5.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 01/22/2018] [Revised: 03/13/2018] [Accepted: 03/19/2018] [Indexed: 01/11/2023]
Abstract
RATIONALE: GSK-3β (glycogen synthase kinase 3β) is a multifunctional and constitutively active kinase known to regulate a myriad of cellular processes. The primary mechanism to regulate its function is through phosphorylation-dependent inhibition at the serine-9 residue. Emerging evidence indicates that there may be alternative mechanisms that control GSK-3β for certain functions. OBJECTIVES: Here, we sought to understand the role of protein S-nitrosylation (SNO) on the function of GSK-3β. SNO-dependent modulation of the localization of GSK-3β and its ability to phosphorylate downstream targets was investigated in vitro, and the network of proteins differentially impacted by phospho- or SNO-dependent GSK-3β regulation and in vivo SNO modification of key signaling kinases during the development of heart failure were also studied. METHODS AND RESULTS: We found that GSK-3β undergoes site-specific SNO both in vitro, in HEK293 cells, H9C2 myoblasts, and primary neonatal rat ventricular myocytes, as well as in vivo, in hearts from an animal model of heart failure and sudden cardiac death. S-nitrosylation of GSK-3β significantly inhibits its kinase activity independent of the canonical phospho-inhibition pathway. S-nitrosylation of GSK-3β promotes its nuclear translocation and access to novel downstream phosphosubstrates which are enriched for a novel amino acid consensus sequence motif. Quantitative phosphoproteomics pathway analysis reveals that nuclear GSK-3β plays a central role in cell cycle control, RNA splicing, and DNA damage response. CONCLUSIONS: The results indicate that SNO has a differential effect on the location and activity of GSK-3β in the cytoplasm versus the nucleus. SNO modification of GSK-3β occurs in vivo and could contribute to the pathobiology of heart failure and sudden cardiac death.
Affiliation(s)
- Sheng-Bing Wang
- From the Department of Medicine (S.-B.W., V.V., T.L., R.H., M.R., D.A.K., B.O'R., J.E.V.E.)
| | - Vidya Venkatraman
- From the Department of Medicine (S.-B.W., V.V., T.L., R.H., M.R., D.A.K., B.O'R., J.E.V.E.).,Johns Hopkins University, Baltimore, MD; Department of Medicine, Advanced Clinical Biosystems Research Institute, The Heart Institute, Cedars-Sinai Medical Center, Los Angeles, CA (V.V., R.H., J.E.V.E.)
| | - Erin L Crowgey
- Department of Nemours Biomedical Research, Nemours Alfred I. duPont Hospital for Children, Wilmington, DE (E.L.C.)
| | - Ting Liu
- From the Department of Medicine (S.-B.W., V.V., T.L., R.H., M.R., D.A.K., B.O'R., J.E.V.E.)
| | - Ronald Holewinski
- From the Department of Medicine (S.-B.W., V.V., T.L., R.H., M.R., D.A.K., B.O'R., J.E.V.E.).,Johns Hopkins University, Baltimore, MD; Department of Medicine, Advanced Clinical Biosystems Research Institute, The Heart Institute, Cedars-Sinai Medical Center, Los Angeles, CA (V.V., R.H., J.E.V.E.)
| | - Mark Ranek
- From the Department of Medicine (S.-B.W., V.V., T.L., R.H., M.R., D.A.K., B.O'R., J.E.V.E.)
| | - David A Kass
- From the Department of Medicine (S.-B.W., V.V., T.L., R.H., M.R., D.A.K., B.O'R., J.E.V.E.)
| | - Brian O'Rourke
- From the Department of Medicine (S.-B.W., V.V., T.L., R.H., M.R., D.A.K., B.O'R., J.E.V.E.)
| | - Jennifer E Van Eyk
- From the Department of Medicine (S.-B.W., V.V., T.L., R.H., M.R., D.A.K., B.O'R., J.E.V.E.) .,Johns Hopkins University, Baltimore, MD; Department of Medicine, Advanced Clinical Biosystems Research Institute, The Heart Institute, Cedars-Sinai Medical Center, Los Angeles, CA (V.V., R.H., J.E.V.E.)
| |
|
43
|
Zhang Y, Xu Y, Feng L, Li F, Sun Z, Wu T, Shi X, Li J, Li X. Comprehensive characterization of lncRNA-mRNA related ceRNA network across 12 major cancers. Oncotarget 2018; 7:64148-64167. [PMID: 27580177 PMCID: PMC5325432 DOI: 10.18632/oncotarget.11637] [Citation(s) in RCA: 157] [Impact Index Per Article: 22.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/18/2016] [Accepted: 07/28/2016] [Indexed: 12/14/2022] Open
Abstract
Recent studies indicate that long noncoding RNAs (lncRNAs) can act as competing endogenous RNAs (ceRNAs) to indirectly regulate mRNAs through shared microRNAs, which represents a novel layer of RNA crosstalk and plays critical roles in tumor development. However, the global regulatory landscape and characteristics of this lncRNA-related ceRNA crosstalk in cancers are still largely unknown. Here, we systematically characterized the lncRNA-related ceRNA interactions across 12 major cancers and the normal physiological states by integrating multidimensional molecule profiles of more than 5000 samples. Our study suggests large differences in ceRNA regulation between normal and tumor states and higher similarity among tumors of similar tissue origin. The ceRNA-related molecules have more conserved features in tumor networks and they play critical roles in both the normal and tumorigenesis processes. Besides, lncRNAs in the pan-cancer ceRNA network may be potential tumor biomarkers. By exploring hub lncRNAs, we found that these conserved key lncRNAs dominate variable tumor hallmark processes across cancers. Network dynamic analysis highlights the critical roles of ceRNA regulation in tumorigenesis. By analyzing conserved ceRNA interactions, we found that miRNA-mediated ceRNA regulation showed different patterns across cancers, while analysis of the cancer-specific ceRNA interactions revealed that lncRNAs synergistically regulated tumor driver genes of cancer hallmarks. Finally, we found that ceRNA modules have the potential to predict patient survival. Overall, our study systematically dissected the lncRNA-related ceRNA networks in pan-cancer, shedding new light on the molecular mechanism of tumorigenesis.
Affiliation(s)
- Yunpeng Zhang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
| | - Yanjun Xu
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
| | - Li Feng
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
| | - Feng Li
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
| | - Zeguo Sun
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
| | - Tan Wu
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
| | - Xinrui Shi
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
| | - Jing Li
- Department of Ultrasonic Medicine, The 1st Affiliated Hospital of Heilongjiang University of Chinese Medicine, Harbin 150040, China
| | - Xia Li
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
| |
|
44
|
Cao Y, Wang P, Ning S, Xiao W, Xiao B, Li X. Identification of prognostic biomarkers in glioblastoma using a long non-coding RNA-mediated, competitive endogenous RNA network. Oncotarget 2018; 7:41737-41747. [PMID: 27229531 PMCID: PMC5173092 DOI: 10.18632/oncotarget.9569] [Citation(s) in RCA: 42] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/18/2016] [Accepted: 05/10/2016] [Indexed: 02/01/2023] Open
Abstract
Glioblastoma multiforme (GBM) is a highly malignant brain tumor associated with a poor prognosis. Cross-talk between competitive endogenous RNAs (ceRNAs) plays a critical role in tumor development and physiology. In this study, we present a multi-step computational approach to construct a functional GBM long non-coding RNA (lncRNA)-mediated ceRNA network (LMCN) by integrating genome-wide lncRNA and mRNA expression profiles, miRNA-target interactions, functional analyses, and clinical survival analyses. LncRNAs in the LMCN exhibited specific topological features consistent with a regulatory association with coding mRNAs across GBM pathology. We determined that the lncRNA MCM3AP-AS was involved in RNA processing and cell cycle-related functions, and was correlated with patient survival. MCM3AP-AS and MIR17HG acted synergistically to regulate mRNAs in a network module of the competitive LMCN. By integrating the expression profile of this module into a risk model, we stratified GBM patients in both The Cancer Genome Atlas and an independent GBM dataset into distinct risk groups. Finally, survival analyses demonstrated that the lncRNAs and network module are potential prognostic biomarkers for GBM. Thus, ceRNAs could accelerate biomarker discovery and therapeutic development in GBM.
Affiliation(s)
- Yuze Cao
- Department of Neurology, Xiangya Hospital, Central South University, Changsha, 410008, Hunan Province, China
| | - Peng Wang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, Heilongjiang Province, China
| | - Shangwei Ning
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, Heilongjiang Province, China
| | - Wenbiao Xiao
- Department of Neurology, Xiangya Hospital, Central South University, Changsha, 410008, Hunan Province, China
| | - Bo Xiao
- Department of Neurology, Xiangya Hospital, Central South University, Changsha, 410008, Hunan Province, China
| | - Xia Li
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, Heilongjiang Province, China
| |
|
45
|
Abstract
BACKGROUND Identifying prognostic biomarkers for cancer patients is essential for cancer research. DNA methylation has been shown to be associated with cancer prognosis. However, few methods systematically identify prognostic markers from DNA methylation data, especially methods that consider the interactions among DNA methylation sites.
METHODS We first evaluated the stability of microRNA, mRNA, and DNA methylation data for cancer prognosis. A rank-based method was then applied to construct a DNA methylation interaction network. In this network, the nodes with the largest degrees (the top 10% of all nodes) were selected as hubs. Cox regression was applied to select hubs for the prognostic signature, in which the DNA methylation level of each site is correlated with patient outcome. After obtaining these prognostic genes, we performed survival analysis in the training group and the test group to verify their reliability.
RESULTS We applied our method to three cancers (ovarian cancer, breast cancer, and glioblastoma multiforme). In all three cancers, the prognostic genes selected from different samples overlapped more in DNA methylation data than in gene expression data or miRNA expression data, which indicates that DNA methylation data may be more stable for cancer prognosis. Power-law distribution fitting suggests that the DNA methylation interaction networks are scale-free, and the hubs selected from the three networks are all enriched in cancer-related pathways. Gene signatures were obtained for each of the three cancers, and survival analysis shows that they can distinguish the outcomes of tumor patients in both the training and test data sets, outperforming the control signatures.
CONCLUSIONS A computational method was proposed to construct a DNA methylation interaction network, and this network could be used to select prognostic signatures in cancer.
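The hub-selection step in the abstract above (keep the 10% of nodes with the largest degree) can be illustrated with a minimal sketch. This is not the authors' code: the function name `select_hubs`, the edge-list input format, the alphabetical tie-breaking rule, and the site identifiers are all assumptions made for illustration. Rank-based network construction and Cox regression are not shown.

```python
import math
from collections import defaultdict


def select_hubs(edges, fraction=0.10):
    """Return the nodes whose degree ranks in the top `fraction` of all nodes.

    `edges` is an iterable of undirected (node_a, node_b) pairs.
    Ties are broken alphabetically, which is an arbitrary choice.
    """
    degree = defaultdict(int)
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    # Always keep at least one hub, and round the hub count up.
    n_hubs = max(1, math.ceil(fraction * len(degree)))
    ranked = sorted(degree, key=lambda node: (-degree[node], node))
    return ranked[:n_hubs]


# Toy network with hypothetical site IDs: a star around "cg01" plus one extra edge.
toy_edges = [("cg01", f"cg{i:02d}") for i in range(2, 10)] + [("cg09", "cg10")]
hubs = select_hubs(toy_edges)
print(hubs)
```

In a real analysis the selected hubs would then be screened against survival data, for example with Cox regression as the abstract describes.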
Affiliation(s)
- Wei-Lin Hu
- College of Science, Huazhong Agricultural University, Wuhan, 430070, People's Republic of China
- Xiong-Hui Zhou
- College of Informatics, Huazhong Agricultural University, Wuhan, 430070, People's Republic of China.
|
46
|
Sim W, Lee J, Choi C. Robust method for identification of prognostic gene signatures from gene expression profiles. Sci Rep 2017; 7:16926. [PMID: 29208919 PMCID: PMC5717170 DOI: 10.1038/s41598-017-17213-4] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 07/11/2017] [Accepted: 11/22/2017] [Indexed: 12/11/2022] Open
Abstract
In the last decade, many attempts have been made to use gene expression profiles to identify prognostic genes for various types of cancer. Previous studies evaluating the prognostic value of genes failed to solve the critical problem of classifying patients into different risk groups based on specific gene expression threshold levels. Here, we present a novel method, called iterative patient partitioning (IPP), which was inspired by the receiver operating characteristic (ROC) curve, is based on the log-rank test, and overcomes the threshold decision problem. We applied IPP to analyze datasets pertaining to various subtypes of breast cancer. Using IPP, we discovered both novel and well-studied prognostic genes related to cell cycle/proliferation or the immune response. The novel genes were further analyzed using copy-number alteration and mutation data, and these results supported their relationship with prognosis.
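The abstract does not spell out the IPP procedure, so the following is only a sketch of the building block it names: a two-group log-rank statistic, evaluated at every candidate expression cut-off. The function names, the toy data, and the "pick the largest statistic" rule are assumptions made for illustration, not the published algorithm. Choosing the best cut-off this way inflates significance unless multiple testing is accounted for.

```python
def logrank_chi2(times, events, groups):
    """Two-group log-rank chi-square statistic (1 degree of freedom).

    times: follow-up times; events: 1 = event, 0 = censored; groups: 0 or 1.
    """
    event_times = sorted({t for t, e in zip(times, events) if e})
    obs_minus_exp = 0.0
    variance = 0.0
    for t in event_times:
        # Patients still under observation just before time t.
        at_risk = [(g, tt, e) for tt, e, g in zip(times, events, groups) if tt >= t]
        n = len(at_risk)
        n1 = sum(1 for g, _, _ in at_risk if g == 1)
        d = sum(1 for _, tt, e in at_risk if tt == t and e)
        d1 = sum(1 for g, tt, e in at_risk if tt == t and e and g == 1)
        obs_minus_exp += d1 - d * n1 / n
        if n > 1:
            # Hypergeometric variance of the group-1 event count at time t.
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    return obs_minus_exp ** 2 / variance if variance > 0 else 0.0


def best_threshold(expression, times, events):
    """Scan every observed expression value as a cut-off; return (statistic, cut-off)."""
    best = (0.0, None)
    for cut in sorted(set(expression))[:-1]:
        groups = [1 if x > cut else 0 for x in expression]
        stat = logrank_chi2(times, events, groups)
        if stat > best[0]:
            best = (stat, cut)
    return best


# Hypothetical toy cohort: higher expression goes with shorter survival.
expression = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
times = [10, 9, 8, 3, 2, 1]
events = [1, 1, 1, 1, 1, 1]
stat, cut = best_threshold(expression, times, events)
print(stat, cut)
```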
Affiliation(s)
- Woogwang Sim
- Department of Bio and Brain Engineering, KAIST, Daejeon, 34141, Republic of Korea
- Jungsul Lee
- Cellex Life Sciences Incorporated, Daejeon, 34051, Republic of Korea
- Chulhee Choi
- Department of Bio and Brain Engineering, KAIST, Daejeon, 34141, Republic of Korea; Cellex Life Sciences Incorporated, Daejeon, 34051, Republic of Korea
|
47
|
Disease gene classification with metagraph representations. Methods 2017; 131:83-92. [DOI: 10.1016/j.ymeth.2017.06.036] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Received: 04/15/2017] [Revised: 06/23/2017] [Accepted: 06/30/2017] [Indexed: 12/28/2022] Open
|
48
|
Kenn M, Schlangen K, Castillo-Tong DC, Singer CF, Cibena M, Koelbl H, Schreiner W. Gene expression information improves reliability of receptor status in breast cancer patients. Oncotarget 2017; 8:77341-77359. [PMID: 29100391 PMCID: PMC5652334 DOI: 10.18632/oncotarget.20474] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 04/12/2017] [Accepted: 07/06/2017] [Indexed: 12/28/2022] Open
Abstract
Immunohistochemical (IHC) determination of receptor status in breast cancer patients is frequently inaccurate. Since it directs the choice of systemic therapy, it is essential to increase its reliability. We increase the validity of IHC receptor expression by additionally considering gene expression (GE) measurements. Crisp therapeutic decisions are based on IHC estimates, even if they are borderline reliable. We further improve decision quality by a responsibility function, defining a critical domain for gene expression. Refined normalization is devised to file any newly diagnosed patient into existing databases. Our approach renders receptor estimates more reliable by identifying patients with questionable receptor status. The approach is also more efficient since the rate of conclusive samples is increased. We have curated and evaluated gene expression data, together with clinical information, from 2880 breast cancer patients. Combining IHC with gene expression information yields a method that is more reliable and more efficient than common practice to date. Several types of possibly suboptimal treatment allocations, based on IHC receptor status alone, are enumerated. A ‘therapy allocation check’ identifies patients who are possibly misclassified. Estrogen: false negative 8%, false positive 6%. Progesterone: false negative 14%, false positive 11%. HER2: false negative 2%, false positive 50%. Possible implications are discussed. We propose an ‘expression look-up-plot’, allowing for a significant potential to improve the quality of precision medicine. Methods are developed and exemplified here for breast cancer patients, but they may readily be transferred to diagnostic data relevant for therapeutic decisions in other fields of oncology.
Affiliation(s)
- Michael Kenn
- Section of Biosimulation and Bioinformatics, Center for Medical Statistics, Informatics and Intelligent Systems (CeMSIIS), Medical University of Vienna, A-1090 Vienna, Austria
- Karin Schlangen
- Section of Biosimulation and Bioinformatics, Center for Medical Statistics, Informatics and Intelligent Systems (CeMSIIS), Medical University of Vienna, A-1090 Vienna, Austria
- Dan Cacsire Castillo-Tong
- Translational Gynecology Group, Department of Obstetrics and Gynecology, Comprehensive Cancer Center, Medical University of Vienna, A-1090 Vienna, Austria
- Christian F Singer
- Translational Gynecology Group, Department of Obstetrics and Gynecology, Comprehensive Cancer Center, Medical University of Vienna, A-1090 Vienna, Austria
- Michael Cibena
- Section of Biosimulation and Bioinformatics, Center for Medical Statistics, Informatics and Intelligent Systems (CeMSIIS), Medical University of Vienna, A-1090 Vienna, Austria
- Heinz Koelbl
- Department of General Gynecology and Gynecologic Oncology, Medical University of Vienna, A-1090 Vienna, Austria
- Wolfgang Schreiner
- Section of Biosimulation and Bioinformatics, Center for Medical Statistics, Informatics and Intelligent Systems (CeMSIIS), Medical University of Vienna, A-1090 Vienna, Austria
|
49
|
Zhang C, Wang X, Li X, Zhao N, Wang Y, Han X, Ci C, Zhang J, Li M, Zhang Y. The landscape of DNA methylation-mediated regulation of long non-coding RNAs in breast cancer. Oncotarget 2017; 8:51134-51150. [PMID: 28881636 PMCID: PMC5584237 DOI: 10.18632/oncotarget.17705] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 02/24/2017] [Accepted: 04/24/2017] [Indexed: 12/22/2022] Open
Abstract
Although systematic studies have identified a host of long non-coding RNAs (lncRNAs) which are involved in breast cancer, the knowledge about the methylation-mediated dysregulation of those lncRNAs remains limited. Here, we integrated multi-omics data to analyze the methylated alteration of lncRNAs in breast invasive carcinoma (BRCA). We found that lncRNAs showed diverse methylation patterns on promoter regions in BRCA. LncRNAs were divided into two categories and four subcategories based on their promoter methylation patterns and expression levels between tumor and normal samples. Through cis-regulatory analysis and gene ontology network, abnormally methylated lncRNAs were identified to be associated with cancer regulation, proliferation or expression of transcription factors. Competing endogenous RNA network and functional enrichment analysis of abnormally methylated lncRNAs showed that lncRNAs with different methylation patterns were involved in several hallmarks and KEGG pathways of cancers significantly. Finally, survival analysis based on mRNA modules in networks revealed that lncRNAs silenced by high methylation were associated with prognosis significantly in BRCA. This study enhances the understanding of aberrantly methylated patterns of lncRNAs and provides a novel insight for identifying cancer biomarkers and potential therapeutic targets in breast cancer.
Affiliation(s)
- Chunlong Zhang
- Department of Medical Informatics, Daqing Campus, Harbin Medical University, Daqing, 163000, China
- Xinyu Wang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, China
- Xuecang Li
- Department of Medical Informatics, Daqing Campus, Harbin Medical University, Daqing, 163000, China
- Ning Zhao
- School of Life Science and Technology, Harbin Institute of Technology, Harbin, 150081, China
- Yihan Wang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, China
- Xiaole Han
- Department of Medical Informatics, Daqing Campus, Harbin Medical University, Daqing, 163000, China
- Ce Ci
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, China
- Jian Zhang
- Department of Medical Informatics, Daqing Campus, Harbin Medical University, Daqing, 163000, China
- Meng Li
- Department of Medical Informatics, Daqing Campus, Harbin Medical University, Daqing, 163000, China
- Yan Zhang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, China
|
50
|
Wang JY, Chen LL, Zhou XH. Identifying prognostic signature in ovarian cancer using DirGenerank. Oncotarget 2017; 8:46398-46413. [PMID: 28615526 PMCID: PMC5542276 DOI: 10.18632/oncotarget.18189] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 03/17/2017] [Accepted: 04/26/2017] [Indexed: 12/27/2022] Open
Abstract
Identifying the prognostic genes in cancer is essential not only for the treatment of cancer patients, but also for drug discovery. However, it is still a big challenge to select prognostic genes that can distinguish the risk of cancer patients across various data sets because of tumor heterogeneity. In this situation, the selected genes whose expression levels are statistically related to prognostic risks may be passengers. In this paper, based on gene expression data and prognostic data of ovarian cancer patients, we used conditional mutual information to construct a gene dependency network in which the nodes (genes) with more out-degrees have more chances to be the modulators of cancer prognosis. After that, we proposed the DirGenerank (Generank in direct network) algorithm, which considers both the gene dependency network and the genes' correlations to prognostic risks, to identify the gene signature that can predict the prognostic risks of ovarian cancer patients. Using the ovarian cancer data set from TCGA (The Cancer Genome Atlas) as the training data set, the 40 genes with the highest importance were selected as the prognostic signature. Survival analysis of the patients divided by the prognostic signature in the testing data set and four independent data sets showed that the signature can distinguish the prognostic risks of cancer patients significantly. Enrichment analysis of the signature with curated cancer genes and the drugs selected by CMAP showed that the genes in the signature may be drug targets for therapy. In summary, we have proposed a useful pipeline to identify prognostic genes of cancer patients.
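The abstract does not give the DirGenerank update rule, so the sketch below shows only the general idea it points to: a personalized PageRank-style iteration that combines a directed network with a per-gene prior such as a correlation with prognostic risk. The function name, the damping value, the direction in which score flows (along each edge, from source to target), and the toy gene names are all assumptions, not the published algorithm.

```python
def personalized_rank(edges, prior, damping=0.5, n_iter=100):
    """Power iteration mixing a per-gene prior with scores propagated along directed edges.

    edges: iterable of (source, target) pairs; prior: dict gene -> non-negative weight.
    """
    nodes = sorted(prior)
    total = sum(prior.values())
    p = {g: prior[g] / total for g in nodes}  # normalized prior
    out = {g: [] for g in nodes}
    for src, dst in edges:
        out[src].append(dst)
    score = dict(p)
    for _ in range(n_iter):
        nxt = {g: (1 - damping) * p[g] for g in nodes}
        for g in nodes:
            if out[g]:
                # Split this gene's score evenly among its targets.
                share = damping * score[g] / len(out[g])
                for dst in out[g]:
                    nxt[dst] += share
            else:
                # A gene without outgoing edges returns its score to the prior.
                for h in nodes:
                    nxt[h] += damping * score[g] * p[h]
        score = nxt
    return score


# Hypothetical three-gene network with a uniform prior.
toy_edges = [("A", "C"), ("B", "C"), ("C", "A")]
scores = personalized_rank(toy_edges, {"A": 1.0, "B": 1.0, "C": 1.0})
top_gene = max(scores, key=scores.get)
print(top_gene, scores)
```

Genes would then be ranked by score and the top-ranked ones taken as a candidate signature, in the spirit of the 40-gene selection the abstract reports.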
Affiliation(s)
- Jian-Yong Wang
- College of Informatics, Huazhong Agricultural University, Wuhan 430070, P.R. China
- Ling-Ling Chen
- College of Informatics, Huazhong Agricultural University, Wuhan 430070, P.R. China
- Xiong-Hui Zhou
- College of Informatics, Huazhong Agricultural University, Wuhan 430070, P.R. China
|