51
Liñares-Blanco J, Pazos A, Fernandez-Lozano C. Machine learning analysis of TCGA cancer data. PeerJ Comput Sci 2021; 7:e584. [PMID: 34322589 PMCID: PMC8293929 DOI: 10.7717/peerj-cs.584] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 03/03/2021] [Accepted: 05/17/2021] [Indexed: 06/13/2023]
Abstract
In recent years, machine learning (ML) researchers have shifted their focus towards biological problems that are difficult to analyse with standard approaches. Large initiatives such as The Cancer Genome Atlas (TCGA) have made omic data available for training these algorithms. To survey the state of the art, this review covers the main works that have used ML with TCGA data. First, the principal discoveries made by the TCGA consortium are presented. With this groundwork established, we turn to the main objective of this study: the identification and discussion of works that have used TCGA data to train different ML approaches. After reviewing more than 100 papers, we classified them according to the following three pillars: the type of tumour, the type of algorithm and the predicted biological problem. One conclusion of this work is the high density of studies based on two major algorithms: Random Forest and Support Vector Machines. We also observe a rise in the use of deep artificial neural networks. The increase in integrative models for multi-omic data analysis is also worth emphasizing. Different biological conditions are a consequence of molecular homeostasis, driven by protein-coding regions, regulatory elements and the surrounding environment. Notably, a large number of works use gene expression data, which researchers have preferred when training the different models. The biological problems addressed have been classified into five types: prognosis prediction, tumour subtypes, microsatellite instability (MSI), immunological aspects and certain pathways of interest. A clear trend was detected in the prediction of these conditions according to the type of tumour.
This is why a greater number of works have focused on the BRCA cohort, whereas works on survival, for example, centred on the GBM cohort because of its large number of events. This review examines in depth the works and methodologies used to study TCGA cancer data. Finally, we intend this work to serve as a basis for future research in this field.
Affiliation(s)
- Jose Liñares-Blanco
  - CITIC-Research Center of Information and Communication Technologies, University of A Coruna, A Coruña, Spain
  - Department of Computer Science and Information Technologies, Faculty of Computer Science, University of A Coruna, A Coruña, Spain
- Alejandro Pazos
  - CITIC-Research Center of Information and Communication Technologies, University of A Coruna, A Coruña, Spain
  - Department of Computer Science and Information Technologies, Faculty of Computer Science, University of A Coruna, A Coruña, Spain
  - Grupo de Redes de Neuronas Artificiales y Sistemas Adaptativos. Imagen Médica y Diagnóstico Radiológico (RNASA-IMEDIR). Complexo Hospitalario Universitario de A Coruña (CHUAC), SERGAS, Universidade da Coruña, Instituto de Investigación Biomédica de A Coruña (INIBIC), A Coruña, Spain
- Carlos Fernandez-Lozano
  - CITIC-Research Center of Information and Communication Technologies, University of A Coruna, A Coruña, Spain
  - Department of Computer Science and Information Technologies, Faculty of Computer Science, University of A Coruna, A Coruña, Spain
  - Grupo de Redes de Neuronas Artificiales y Sistemas Adaptativos. Imagen Médica y Diagnóstico Radiológico (RNASA-IMEDIR). Complexo Hospitalario Universitario de A Coruña (CHUAC), SERGAS, Universidade da Coruña, Instituto de Investigación Biomédica de A Coruña (INIBIC), A Coruña, Spain
52
Baptiste M, Moinuddeen SS, Soliz CL, Ehsan H, Kaneko G. Making Sense of Genetic Information: The Promising Evolution of Clinical Stratification and Precision Oncology Using Machine Learning. Genes (Basel) 2021; 12:722. [PMID: 34065872 PMCID: PMC8151328 DOI: 10.3390/genes12050722] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 03/16/2021] [Revised: 05/07/2021] [Accepted: 05/08/2021] [Indexed: 12/16/2022]
Abstract
Precision medicine is a medical approach that gives patients a tailored dose of treatment by taking into account each person's variability in genes, environment, and lifestyle. The accumulation of big omics sequence data has led to the development of various genetic databases on which clinical stratification of high-risk populations may be conducted. In addition, because cancers are generally caused by tumor-specific mutations, large-scale systematic identification of single nucleotide polymorphisms (SNPs) in various tumors has propelled significant progress in tailored treatments of tumors (i.e., precision oncology). Machine learning (ML), a subfield of artificial intelligence in which computers learn through experience, has great potential in precision oncology, chiefly to help physicians make diagnostic decisions based on tumor images. A promising avenue for ML in precision oncology is the integration of all available data, from images to multi-omics big data, for the holistic care of patients and high-risk healthy subjects. In this review, we provide a focused overview of precision oncology and ML with attention to breast cancer and glioma, as well as to Bayesian networks, which have the flexibility and the ability to work with incomplete information. We also introduce some state-of-the-art attempts to use and incorporate ML and genetic information in precision oncology.
Affiliation(s)
- Gen Kaneko
  - School of Arts & Sciences, University of Houston-Victoria, Victoria, TX 77901, USA; (M.B.); (S.S.M.); (C.L.S.); (H.E.)
53
Grzadkowski MR, Holly HD, Somers J, Demir E. Systematic interrogation of mutation groupings reveals divergent downstream expression programs within key cancer genes. BMC Bioinformatics 2021; 22:233. [PMID: 33957863 PMCID: PMC8101181 DOI: 10.1186/s12859-021-04147-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/21/2021] [Accepted: 04/22/2021] [Indexed: 12/15/2022]
Abstract
BACKGROUND Genes implicated in tumorigenesis often exhibit diverse sets of genomic variants in the tumor cohorts within which they are frequently mutated. For many genes, neither the transcriptomic effects of these variants nor their relationship to one another in cancer processes have been well-characterized. We sought to identify the downstream expression effects of these mutations and to determine whether this heterogeneity at the genomic level is reflected in a corresponding heterogeneity at the transcriptomic level. RESULTS By applying a novel hierarchical framework for organizing the mutations present in a cohort along with machine learning pipelines trained on samples' expression profiles we systematically interrogated the signatures associated with combinations of mutations recurrent in cancer. This allowed us to catalogue the mutations with discernible downstream expression effects across a number of tumor cohorts as well as to uncover and characterize over a hundred cases where subsets of a gene's mutations are clearly divergent in their function from the remaining mutations of the gene. These findings successfully replicated across a number of disease contexts and were found to have clear implications for the delineation of cancer processes and for clinical decisions. CONCLUSIONS The results of cataloguing the downstream effects of mutation subgroupings across cancer cohorts underline the importance of incorporating the diversity present within oncogenes in models designed to capture the downstream effects of their mutations.
Affiliation(s)
- Michal R Grzadkowski
  - Department of Molecular and Medical Genetics, Oregon Health & Science University, Portland, OR, USA
- Hannah D Holly
  - Department of Molecular and Medical Genetics, Oregon Health & Science University, Portland, OR, USA
- Julia Somers
  - Department of Molecular and Medical Genetics, Oregon Health & Science University, Portland, OR, USA
- Emek Demir
  - Department of Molecular and Medical Genetics, Oregon Health & Science University, Portland, OR, USA
54
Roux B, Vaganay C, Vargas JD, Alexe G, Benaksas C, Pardieu B, Fenouille N, Ellegast JM, Malolepsza E, Ling F, Sodaro G, Ross L, Pikman Y, Conway AS, Tang Y, Wu T, Anderson DJ, Le Moigne R, Zhou HJ, Luciano F, Hartigan CR, Galinsky I, DeAngelo DJ, Stone RM, Auberger P, Schenone M, Carr SA, Guirouilh-Barbat J, Lopez B, Khaled M, Lage K, Hermine O, Hemann MT, Puissant A, Stegmaier K, Benajiba L. Targeting acute myeloid leukemia dependency on VCP-mediated DNA repair through a selective second-generation small-molecule inhibitor. Sci Transl Med 2021; 13:eabg1168. [PMID: 33790022 PMCID: PMC8672851 DOI: 10.1126/scitranslmed.abg1168] [Citation(s) in RCA: 28] [Impact Index Per Article: 7.0] [Received: 12/12/2020] [Accepted: 03/12/2021] [Indexed: 12/13/2022]
Abstract
The development and survival of cancer cells require adaptive mechanisms to stress. Such adaptations can confer intrinsic vulnerabilities, enabling the selective targeting of cancer cells. Through a pooled in vivo short hairpin RNA (shRNA) screen, we identified the adenosine triphosphatase associated with diverse cellular activities (AAA-ATPase) valosin-containing protein (VCP) as a top stress-related vulnerability in acute myeloid leukemia (AML). We established that AML was the most responsive disease to chemical inhibition of VCP across a panel of 16 cancer types. The sensitivity to VCP inhibition of human AML cell lines, primary patient samples, and syngeneic and xenograft mouse models of AML was validated using VCP-directed shRNAs, overexpression of a dominant-negative VCP mutant, and chemical inhibition. By combining mass spectrometry-based analysis of the VCP interactome and phospho-signaling studies, we determined that VCP is important for ataxia telangiectasia mutated (ATM) kinase activation and subsequent DNA repair through homologous recombination in AML. A second-generation VCP inhibitor, CB-5339, was then developed and characterized. Efficacy and safety of CB-5339 were validated in multiple AML models, including syngeneic and patient-derived xenograft murine models. We further demonstrated that combining DNA-damaging agents, such as anthracyclines, with CB-5339 treatment synergizes to impair leukemic growth in an MLL-AF9-driven AML murine model. These studies support the clinical testing of CB-5339 as a single agent or in combination with standard-of-care DNA-damaging chemotherapy for the treatment of AML.
Affiliation(s)
- Blandine Roux
  - Université de Paris, INSERM U944 and CNRS UMR 7212, Institut de Recherche Saint Louis, Hôpital Saint Louis, APHP, 75010 Paris, France
- Camille Vaganay
  - Université de Paris, INSERM U944 and CNRS UMR 7212, Institut de Recherche Saint Louis, Hôpital Saint Louis, APHP, 75010 Paris, France
- Gabriela Alexe
  - Department of Pediatric Oncology, Dana-Farber Cancer Institute and Boston Children's Hospital, Harvard Medical School, Boston, MA 02215, USA
  - Broad Institute of Harvard University and Massachusetts Institute of Technology, Cambridge, MA 02142, USA
- Chaima Benaksas
  - Université de Paris, INSERM U944 and CNRS UMR 7212, Institut de Recherche Saint Louis, Hôpital Saint Louis, APHP, 75010 Paris, France
- Bryann Pardieu
  - Université de Paris, INSERM U944 and CNRS UMR 7212, Institut de Recherche Saint Louis, Hôpital Saint Louis, APHP, 75010 Paris, France
- Nina Fenouille
  - Université de Paris, INSERM U944 and CNRS UMR 7212, Institut de Recherche Saint Louis, Hôpital Saint Louis, APHP, 75010 Paris, France
- Jana M Ellegast
  - Department of Pediatric Oncology, Dana-Farber Cancer Institute and Boston Children's Hospital, Harvard Medical School, Boston, MA 02215, USA
  - Broad Institute of Harvard University and Massachusetts Institute of Technology, Cambridge, MA 02142, USA
- Edyta Malolepsza
  - Broad Institute of Harvard University and Massachusetts Institute of Technology, Cambridge, MA 02142, USA
- Frank Ling
  - Université de Paris, INSERM U944 and CNRS UMR 7212, Institut de Recherche Saint Louis, Hôpital Saint Louis, APHP, 75010 Paris, France
- Gaetano Sodaro
  - Université de Paris, INSERM U944 and CNRS UMR 7212, Institut de Recherche Saint Louis, Hôpital Saint Louis, APHP, 75010 Paris, France
- Linda Ross
  - Department of Pediatric Oncology, Dana-Farber Cancer Institute and Boston Children's Hospital, Harvard Medical School, Boston, MA 02215, USA
  - Broad Institute of Harvard University and Massachusetts Institute of Technology, Cambridge, MA 02142, USA
- Yana Pikman
  - Department of Pediatric Oncology, Dana-Farber Cancer Institute and Boston Children's Hospital, Harvard Medical School, Boston, MA 02215, USA
  - Broad Institute of Harvard University and Massachusetts Institute of Technology, Cambridge, MA 02142, USA
- Amy S Conway
  - Department of Pediatric Oncology, Dana-Farber Cancer Institute and Boston Children's Hospital, Harvard Medical School, Boston, MA 02215, USA
  - Broad Institute of Harvard University and Massachusetts Institute of Technology, Cambridge, MA 02142, USA
- Tony Wu
  - Cleave Therapeutics Inc., San Francisco, CA 94105, USA
- Han-Jie Zhou
  - Cleave Therapeutics Inc., San Francisco, CA 94105, USA
- Christina R Hartigan
  - Broad Institute of Harvard University and Massachusetts Institute of Technology, Cambridge, MA 02142, USA
- Ilene Galinsky
  - Department of Medical Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA 02215, USA
- Daniel J DeAngelo
  - Department of Medical Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA 02215, USA
- Richard M Stone
  - Department of Medical Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA 02215, USA
- Patrick Auberger
  - C3M, INSERM U1065, Team Cell Death, Differentiation, Inflammation and Cancer, 06204 Nice, France
- Monica Schenone
  - Broad Institute of Harvard University and Massachusetts Institute of Technology, Cambridge, MA 02142, USA
- Steven A Carr
  - Broad Institute of Harvard University and Massachusetts Institute of Technology, Cambridge, MA 02142, USA
- Josée Guirouilh-Barbat
  - Université de Paris, INSERM U1016 and CNRS UMR 8104, Institut Cochin, 75014 Paris, France
- Bernard Lopez
  - Université de Paris, INSERM U1016 and CNRS UMR 8104, Institut Cochin, 75014 Paris, France
- Mehdi Khaled
  - INSERM U1186, Gustave-Roussy Cancer Center, Université Paris-Saclay, 94805 Villejuif, France
- Kasper Lage
  - Broad Institute of Harvard University and Massachusetts Institute of Technology, Cambridge, MA 02142, USA
- Olivier Hermine
  - Université de Paris, INSERM U1163 and CNRS 8254, Institut Imagine, Hôpital Necker, APHP, 75015 Paris, France
- Michael T Hemann
  - Koch Institute for Integrative Cancer Research at Massachusetts Institute of Technology, Massachusetts Institute of Technology, Cambridge, MA 02142, USA
- Alexandre Puissant
  - Université de Paris, INSERM U944 and CNRS UMR 7212, Institut de Recherche Saint Louis, Hôpital Saint Louis, APHP, 75010 Paris, France
- Kimberly Stegmaier
  - Department of Pediatric Oncology, Dana-Farber Cancer Institute and Boston Children's Hospital, Harvard Medical School, Boston, MA 02215, USA
  - Broad Institute of Harvard University and Massachusetts Institute of Technology, Cambridge, MA 02142, USA
- Lina Benajiba
  - Université de Paris, INSERM U944 and CNRS UMR 7212, Institut de Recherche Saint Louis, Hôpital Saint Louis, APHP, 75010 Paris, France
55
Ebata K, Yamashiro S, Iida K, Okada M. Building patient-specific models for receptor tyrosine kinase signaling networks. FEBS J 2021; 289:90-101. [PMID: 33755310 DOI: 10.1111/febs.15831] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 12/07/2020] [Revised: 02/26/2021] [Accepted: 03/19/2021] [Indexed: 12/16/2022]
Abstract
Cancer progresses due to changes in the dynamic interactions of multidimensional factors associated with gene mutations. Cancer research has actively adopted computational methods, including data-driven and mathematical model-driven approaches, to identify causative factors and regulatory rules that can explain the complexity and diversity of cancers. A data-driven, statistics-based approach revealed correlations between gene alterations and clinical outcomes in many types of cancers. A model-driven mathematical approach has elucidated the dynamic features of cancer networks and identified the mechanisms of drug efficacy and resistance. More recently, machine learning methods have emerged that can be used for mining omics data and classifying patients. However, as the strengths and weaknesses of each method become apparent, new analytical tools are emerging to combine and improve the methodologies and maximize their predictive power for classifying cancer subtypes and prognosis. Here, we introduce recent advances in cancer systems biology aimed at personalized medicine, with a focus on the receptor tyrosine kinase signaling network.
Affiliation(s)
- Kyoichi Ebata
  - Institute for Protein Research, Osaka University, Suita, Japan
- Sawa Yamashiro
  - Institute for Protein Research, Osaka University, Suita, Japan
- Keita Iida
  - Institute for Protein Research, Osaka University, Suita, Japan
- Mariko Okada
  - Institute for Protein Research, Osaka University, Suita, Japan
  - Center for Drug Design and Research, National Institutes of Biomedical Innovation, Health and Nutrition, Ibaraki, Japan
  - Institute for Chemical Research, Kyoto University, Japan
56
Auslander N, Gussow AB, Koonin EV. Incorporating Machine Learning into Established Bioinformatics Frameworks. Int J Mol Sci 2021; 22:2903. [PMID: 33809353 PMCID: PMC8000113 DOI: 10.3390/ijms22062903] [Citation(s) in RCA: 44] [Impact Index Per Article: 11.0] [Received: 02/15/2021] [Revised: 03/08/2021] [Accepted: 03/10/2021] [Indexed: 12/23/2022]
Abstract
The exponential growth of biomedical data in recent years has urged the application of numerous machine learning techniques to address emerging problems in biology and clinical research. By enabling the automatic feature extraction, selection, and generation of predictive models, these methods can be used to efficiently study complex biological systems. Machine learning techniques are frequently integrated with bioinformatic methods, as well as curated databases and biological networks, to enhance training and validation, identify the best interpretable features, and enable feature and model investigation. Here, we review recently developed methods that incorporate machine learning within the same framework with techniques from molecular evolution, protein structure analysis, systems biology, and disease genomics. We outline the challenges posed for machine learning, and, in particular, deep learning in biomedicine, and suggest unique opportunities for machine learning techniques integrated with established bioinformatics approaches to overcome some of these challenges.
Affiliation(s)
- Eugene V. Koonin
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
57
Gartlgruber M, Sharma AK, Quintero A, Dreidax D, Jansky S, Park YG, Kreth S, Meder J, Doncevic D, Saary P, Toprak UH, Ishaque N, Afanasyeva E, Wecht E, Koster J, Versteeg R, Grünewald TGP, Jones DTW, Pfister SM, Henrich KO, van Nes J, Herrmann C, Westermann F. Super enhancers define regulatory subtypes and cell identity in neuroblastoma. NATURE CANCER 2021; 2:114-128. [PMID: 35121888 DOI: 10.1038/s43018-020-00145-w] [Citation(s) in RCA: 86] [Impact Index Per Article: 21.5] [Received: 09/11/2019] [Accepted: 10/19/2020] [Indexed: 02/07/2023]
Abstract
Half of the children diagnosed with neuroblastoma (NB) have high-risk disease, disproportionately contributing to overall childhood cancer-related deaths. In addition to recurrent gene mutations, there is increasing evidence supporting the role of epigenetic deregulation in disease pathogenesis. Yet, comprehensive cis-regulatory network descriptions from NB are lacking. Here, using genome-wide H3K27ac profiles across 60 NBs, covering the different clinical and molecular subtypes, we identified four major super-enhancer-driven epigenetic subtypes and their underlying master regulatory networks. Three of these subtypes recapitulated known clinical groups; namely, MYCN-amplified, MYCN non-amplified high-risk and MYCN non-amplified low-risk NBs. The fourth subtype, exhibiting mesenchymal characteristics, shared cellular identity with multipotent Schwann cell precursors, was induced by RAS activation and was enriched in relapsed disease. Notably, CCND1, an essential gene in NB, was regulated by both mesenchymal and adrenergic regulatory networks converging on distinct super-enhancer modules. Overall, this study reveals subtype-specific super-enhancer regulation in NBs.
Affiliation(s)
- Moritz Gartlgruber
  - Hopp Children's Cancer Center Heidelberg (KiTZ), Heidelberg, Germany
  - Division of Neuroblastoma Genomics, German Cancer Research Center, Heidelberg, Germany
- Ashwini Kumar Sharma
  - Hopp Children's Cancer Center Heidelberg (KiTZ), Heidelberg, Germany
  - Division of Neuroblastoma Genomics, German Cancer Research Center, Heidelberg, Germany
  - Health Data Science Unit, Medical Faculty Heidelberg and BioQuant, Heidelberg, Germany
- Andrés Quintero
  - Hopp Children's Cancer Center Heidelberg (KiTZ), Heidelberg, Germany
  - Division of Neuroblastoma Genomics, German Cancer Research Center, Heidelberg, Germany
  - Health Data Science Unit, Medical Faculty Heidelberg and BioQuant, Heidelberg, Germany
- Daniel Dreidax
  - Hopp Children's Cancer Center Heidelberg (KiTZ), Heidelberg, Germany
  - Division of Neuroblastoma Genomics, German Cancer Research Center, Heidelberg, Germany
- Selina Jansky
  - Hopp Children's Cancer Center Heidelberg (KiTZ), Heidelberg, Germany
  - Division of Neuroblastoma Genomics, German Cancer Research Center, Heidelberg, Germany
- Young-Gyu Park
  - Hopp Children's Cancer Center Heidelberg (KiTZ), Heidelberg, Germany
  - Division of Neuroblastoma Genomics, German Cancer Research Center, Heidelberg, Germany
- Sina Kreth
  - Hopp Children's Cancer Center Heidelberg (KiTZ), Heidelberg, Germany
  - Division of Neuroblastoma Genomics, German Cancer Research Center, Heidelberg, Germany
- Johanna Meder
  - Hopp Children's Cancer Center Heidelberg (KiTZ), Heidelberg, Germany
  - Division of Neuroblastoma Genomics, German Cancer Research Center, Heidelberg, Germany
- Daria Doncevic
  - Health Data Science Unit, Medical Faculty Heidelberg and BioQuant, Heidelberg, Germany
- Paul Saary
  - Health Data Science Unit, Medical Faculty Heidelberg and BioQuant, Heidelberg, Germany
- Umut H Toprak
  - Hopp Children's Cancer Center Heidelberg (KiTZ), Heidelberg, Germany
  - Division of Neuroblastoma Genomics, German Cancer Research Center, Heidelberg, Germany
- Naveed Ishaque
  - Center for Digital Health, Berlin Institute of Health and Charité Universitätsmedizin Berlin, Berlin, Germany
- Elena Afanasyeva
  - Hopp Children's Cancer Center Heidelberg (KiTZ), Heidelberg, Germany
  - Division of Neuroblastoma Genomics, German Cancer Research Center, Heidelberg, Germany
- Elisa Wecht
  - Hopp Children's Cancer Center Heidelberg (KiTZ), Heidelberg, Germany
  - Division of Neuroblastoma Genomics, German Cancer Research Center, Heidelberg, Germany
- Jan Koster
  - Department of Oncogenomics, Amsterdam UMC, University of Amsterdam, Amsterdam, the Netherlands
- Rogier Versteeg
  - Department of Oncogenomics, Amsterdam UMC, University of Amsterdam, Amsterdam, the Netherlands
- Thomas G P Grünewald
  - Hopp Children's Cancer Center Heidelberg (KiTZ), Heidelberg, Germany
  - Division of Translational Pediatric Sarcoma Research, German Cancer Consortium (DKTK), German Cancer Research Center (DKFZ), Heidelberg, Germany
  - Institute of Pathology, Heidelberg University Hospital, Heidelberg, Germany
- David T W Jones
  - Hopp Children's Cancer Center Heidelberg (KiTZ), Heidelberg, Germany
  - Pediatric Glioma Research Group, German Cancer Consortium (DKTK), German Cancer Research Center (DKFZ), Heidelberg, Germany
- Stefan M Pfister
  - Hopp Children's Cancer Center Heidelberg (KiTZ), Heidelberg, Germany
  - Division of Pediatric Neurooncology, German Cancer Consortium (DKTK), German Cancer Research Center (DKFZ), Heidelberg, Germany
  - Department of Pediatric Hematology and Oncology, Heidelberg University Hospital and Hopp Children's Cancer Center Heidelberg (KiTZ), Heidelberg, Germany
- Kai-Oliver Henrich
  - Hopp Children's Cancer Center Heidelberg (KiTZ), Heidelberg, Germany
  - Division of Neuroblastoma Genomics, German Cancer Research Center, Heidelberg, Germany
- Johan van Nes
  - Department of Oncogenomics, Amsterdam UMC, University of Amsterdam, Amsterdam, the Netherlands
- Carl Herrmann
  - Health Data Science Unit, Medical Faculty Heidelberg and BioQuant, Heidelberg, Germany
- Frank Westermann
  - Hopp Children's Cancer Center Heidelberg (KiTZ), Heidelberg, Germany
  - Division of Neuroblastoma Genomics, German Cancer Research Center, Heidelberg, Germany
58
Chebanov DK, Mikhaylova IN. On the Methods of Artificial Intelligence for Analysis of Oncological Data. AUTOMATIC DOCUMENTATION AND MATHEMATICAL LINGUISTICS 2020. [DOI: 10.3103/s0005105520050027] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/30/2022]
59
Kang J, Lee A, Lee YS. Prediction of PIK3CA mutations from cancer gene expression data. PLoS One 2020; 15:e0241514. [PMID: 33166334 PMCID: PMC7652327 DOI: 10.1371/journal.pone.0241514] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 07/21/2020] [Accepted: 09/22/2020] [Indexed: 11/20/2022]
Abstract
Breast cancers with PIK3CA mutations can be treated with PIK3CA inhibitors in hormone receptor-positive, HER2-negative subtypes. We applied a supervised elastic net penalized logistic regression model to predict PIK3CA mutations from gene expression data. This regression approach was applied to predictive modeling using the TCGA pan-cancer dataset. Approximately 10,000 cases were available with both PIK3CA mutation and mRNA expression data. In 10-fold cross-validation, the model with λ = 0.01 and α = 1.0 (ridge regression) showed the best performance in terms of area under the receiver operating characteristic curve (AUROC). The final model was developed with the selected hyper-parameters using the entire training set. The training set AUROC was 0.93, and the test set AUROC was 0.84. The area under the precision-recall curve (AUPR) of the training set was 0.66, and the test set AUPR was 0.39. Cancer types were the most important predictors. Insulin-like growth factor 1 receptor (IGF1R) and phosphatase and tensin homolog (PTEN) were the most significant genes among the gene expression predictors. Our study suggests that predicting genomic alterations using gene expression data is possible, with good outcomes.
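The abstract above reports both AUROC and AUPR, and the two can diverge sharply when positives are rare. The following is a minimal sketch, not the authors' code, of how the two metrics might be computed from predicted scores. The function names `auroc` and `average_precision` are hypothetical, the toy labels and scores are invented for illustration, and average precision is used here as one common estimator of AUPR (the abstract does not say which estimator was used).

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC via the rank formulation: the probability that a random
    positive scores higher than a random negative (ties count one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Average precision: mean of the precision values measured at the rank
    of each positive example. Assumes scores contain no ties."""
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive label")
    # Rank examples from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda t: t[0], reverse=True)
    hits = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank  # precision at this positive's rank
    return total / n_pos


# Toy, invented data: 2 positives among 10 samples (imbalanced classes).
toy_labels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
toy_scores = [0.1, 0.2, 0.3, 0.35, 0.4, 0.5, 0.6, 0.9, 0.7, 0.8]

print(auroc(toy_labels, toy_scores))              # 14 of 16 pairs ranked correctly
print(average_precision(toy_labels, toy_scores))  # mean of 1/2 and 2/3
```

In this toy case a single high-scoring negative costs only 2 of 16 pairwise comparisons for AUROC but cuts precision at every positive, which is one plausible reason an AUPR can sit well below the corresponding AUROC on imbalanced data.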
Affiliation(s)
- Jun Kang
  - Department of Hospital Pathology, Seoul St. Mary’s Hospital, College of Medicine, The Catholic University of Korea, Seoul, South Korea
- Ahwon Lee
  - Department of Hospital Pathology, Seoul St. Mary’s Hospital, College of Medicine, The Catholic University of Korea, Seoul, South Korea
- Youn Soo Lee
  - Department of Hospital Pathology, Seoul St. Mary’s Hospital, College of Medicine, The Catholic University of Korea, Seoul, South Korea
60
Li X, Li S, Wang Y, Zhang S, Wong KC. Identification of pan-cancer Ras pathway activation with deep learning. Brief Bioinform 2020; 22:5943785. [PMID: 33126245 DOI: 10.1093/bib/bbaa258] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 06/23/2020] [Revised: 08/27/2020] [Accepted: 09/11/2020] [Indexed: 01/06/2023]
Abstract
The identification of hidden responders is often an essential challenge in precision oncology. A recent attempt based on machine learning has been proposed for classifying aberrant pathway activity from multiomic cancer data. However, we note several critical limitations there, such as high dimensionality, data sparsity and model performance. Given the central importance and broad impact of precision oncology, we propose nature-inspired deep Ras activation pan-cancer (NatDRAP), a deep neural network (DNN) model, to address those restrictions for the identification of hidden responders. In this study, we develop a nature-inspired deep learning model that integrates bulk RNA sequencing, copy number and mutation data from PanCanAtlas to detect pan-cancer Ras pathway activation. In NatDRAP, we propose to synergize the nature-inspired artificial bee colony algorithm with different gradient-based optimizers in one framework for optimizing DNNs in a collaborative manner. Multiple experiments were conducted on 33 different cancer types across PanCanAtlas. The experimental results demonstrate that the proposed NatDRAP can provide superior performance over other benchmark methods, with strong robustness in diagnosing aberrant RAS pathway activity across different cancer types. In addition, gene ontology enrichment and pathological analysis are conducted to reveal novel insights into the identification and characterization of aberrant RAS pathway activity. NatDRAP is written in Python and available at https://github.com/lixt314/NatDRAP1.
Affiliation(s)
- Xiangtao Li
  - School of Artificial Intelligence, Jilin University
- Shaochuan Li
  - School of Computer Science, Northeast Normal University
- Yunhe Wang
  - School of Computer Science, Northeast Normal University
- Shixiong Zhang
  - Department of Computer Science, City University of Hong Kong, Hong Kong SAR
- Ka-Chun Wong
  - Department of Computer Science, City University of Hong Kong, Hong Kong SAR
61
Bellazzo A, Collavin L. Cutting the Brakes on Ras-Cytoplasmic GAPs as Targets of Inactivation in Cancer. Cancers (Basel) 2020; 12:3066. [PMID: 33096593 PMCID: PMC7588890 DOI: 10.3390/cancers12103066] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 09/28/2020] [Revised: 10/11/2020] [Accepted: 10/15/2020] [Indexed: 12/16/2022]
Simple Summary
GTPase-Activating Proteins (RasGAPs) are a group of structurally related proteins with a fundamental role in controlling the activity of Ras in normal and cancer cells. In particular, loss of function of RasGAPs may contribute to aberrant Ras activation in cancer. Here we review the multiple molecular mechanisms and factors that are involved in downregulating RasGAPs expression and functions in cancer. Additionally, we discuss how extracellular stimuli from the tumor microenvironment can control RasGAPs expression and activity in cancer cells and stromal cells, indirectly affecting Ras activation, with implications for cancer development and progression.
Abstract
The Ras pathway is frequently deregulated in cancer, actively contributing to tumor development and progression. Oncogenic activation of the Ras pathway is commonly due to point mutation of one of the three Ras genes, which occurs in almost one third of human cancers. In the absence of Ras mutation, the pathway is frequently activated by alternative means, including the loss of function of Ras inhibitors. Among Ras inhibitors, the GTPase-Activating Proteins (RasGAPs) are major players, given their ability to modulate multiple cancer-related pathways. In fact, most RasGAPs also have a multi-domain structure that allows them to act as scaffold or adaptor proteins, affecting additional oncogenic cascades. In cancer cells, various mechanisms can cause the loss of function of Ras inhibitors; here, we review the available evidence of RasGAP inactivation in cancer, with a specific focus on the mechanisms. We also consider extracellular inputs that can affect RasGAP levels and functions, implying that specific conditions in the tumor microenvironment can foster or counteract Ras signaling through negative or positive modulation of RasGAPs. A better understanding of these conditions might have relevant clinical repercussions, since treatments to restore or enhance the function of RasGAPs in cancer would help circumvent the intrinsic difficulty of directly targeting the Ras protein.
|
62
|
Deep transfer learning for reducing health care disparities arising from biomedical data inequality. Nat Commun 2020; 11:5131. [PMID: 33046699 PMCID: PMC7552387 DOI: 10.1038/s41467-020-18918-3] [Citation(s) in RCA: 43] [Impact Index Per Article: 8.6] [Received: 04/22/2020] [Accepted: 09/16/2020] [Indexed: 12/20/2022] Open
Abstract
As artificial intelligence (AI) is increasingly applied to biomedical research and clinical decisions, developing unbiased AI models that work equally well for all ethnic groups is of crucial importance to health disparity prevention and reduction. However, the biomedical data inequality between different ethnic groups is set to generate new health care disparities through data-driven, algorithm-based biomedical research and clinical decisions. Using an extensive set of machine learning experiments on cancer omics data, we find that current prevalent schemes of multiethnic machine learning are prone to generating significant model performance disparities between ethnic groups. We show that these performance disparities are caused by data inequality and data distribution discrepancies between ethnic groups. We also find that transfer learning can improve machine learning model performance for data-disadvantaged ethnic groups, and thus provides an effective approach to reduce health care disparities arising from data inequality among ethnic groups.
|
63
|
Baltanás FC, Zarich N, Rojas-Cabañeros JM, Santos E. SOS GEFs in health and disease. Biochim Biophys Acta Rev Cancer 2020; 1874:188445. [PMID: 33035641 DOI: 10.1016/j.bbcan.2020.188445] [Citation(s) in RCA: 49] [Impact Index Per Article: 9.8] [Received: 07/30/2020] [Revised: 10/01/2020] [Accepted: 10/01/2020] [Indexed: 12/11/2022]
Abstract
SOS1 and SOS2 are the most universal and widely expressed family of guanine exchange factors (GEFs) capable of activating RAS or RAC1 proteins in metazoan cells. SOS proteins contain a sequence of modular domains that are responsible for different intramolecular and intermolecular interactions modulating mechanisms of self-inhibition, allosteric activation and intracellular homeostasis. Despite their homology, analyses of SOS1/2-KO mice demonstrate functional prevalence of SOS1 over SOS2 in cellular processes including proliferation, migration, inflammation or maintenance of intracellular redox homeostasis, although some functional redundancy cannot be excluded, particularly at the organismal level. Specific SOS1 gain-of-function mutations have been identified in inherited RASopathies and various sporadic human cancers. SOS1 depletion reduces tumorigenesis mediated by RAS or RAC1 in mouse models and is associated with increased intracellular oxidative stress and mitochondrial dysfunction. Since WT RAS is essential for development of RAS-mutant tumors, the SOS GEFs may be considered as relevant biomarkers or therapy targets in RAS-dependent cancers. Inhibitors blocking SOS expression, intrinsic GEF activity, or productive SOS protein-protein interactions with cellular regulators and/or RAS/RAC targets have been recently developed and have shown preclinical and clinical effectiveness in blocking aberrant RAS signaling in RAS-driven and RTK-driven tumors.
Affiliation(s)
- Fernando C Baltanás
  - Centro de Investigación del Cáncer - IBMCC (CSIC-USAL) and CIBERONC, Universidad de Salamanca, 37007 Salamanca, Spain
- Natasha Zarich
  - Unidad Funcional de Investigación de Enfermedades Crónicas (UFIEC) and CIBERONC, Instituto de Salud Carlos III, 28220, Majadahonda, Madrid, Spain
- Jose M Rojas-Cabañeros
  - Unidad Funcional de Investigación de Enfermedades Crónicas (UFIEC) and CIBERONC, Instituto de Salud Carlos III, 28220, Majadahonda, Madrid, Spain
- Eugenio Santos
  - Centro de Investigación del Cáncer - IBMCC (CSIC-USAL) and CIBERONC, Universidad de Salamanca, 37007 Salamanca, Spain
|
64
|
Kumar S, Patnaik S, Dixit A. Predictive models for stage and risk classification in head and neck squamous cell carcinoma (HNSCC). PeerJ 2020; 8:e9656. [PMID: 33024622 PMCID: PMC7518185 DOI: 10.7717/peerj.9656] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 03/12/2020] [Accepted: 07/14/2020] [Indexed: 01/02/2023] Open
Abstract
Machine learning techniques are increasingly used in the analysis of high throughput genome sequencing data to better understand the disease process and design of therapeutic modalities. In the current study, we have applied state of the art machine learning (ML) algorithms (Random Forest (RF), Support Vector Machine Radial Kernel (svmR), Adaptive Boost (AdaBoost), averaged Neural Network (avNNet), and Gradient Boosting Machine (GBM)) to stratify the HNSCC patients into early and late clinical stages (TNM) and to predict the risk using miRNA expression profiles. A six miRNA signature was identified that can stratify patients into the early and late stages. The mean accuracy, sensitivity, specificity, and area under the curve (AUC) were found to be 0.84, 0.87, 0.78, and 0.82, respectively, indicating the robust performance of the generated model. The prognostic signature of eight miRNAs was identified using LASSO (least absolute shrinkage and selection operator) penalized regression. These miRNAs were found to be significantly associated with overall survival of the patients. The pathway and functional enrichment analysis of the identified biomarkers revealed their involvement in important cancer pathways such as GP6 signalling, Wnt signalling, p53 signalling, granulocyte adhesion, and diapedesis. To the best of our knowledge, this is the first such study and we hope that these signature miRNAs will be useful for the risk stratification of patients and the design of therapeutic modalities.
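For readers unfamiliar with the metrics quoted in this abstract, the following is a minimal sketch of how accuracy, sensitivity and specificity follow from a binary confusion matrix. It is not the authors' code: the function name `binary_metrics`, the coding of late stage as 1 and early stage as 0, and the toy labels are all assumptions made for illustration.

```python
def binary_metrics(y_true, y_pred):
    """Return accuracy, sensitivity and specificity for 0/1 labels.

    Assumption for illustration: 1 = late stage (positive), 0 = early stage.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative label")
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # fraction of positives recovered
        "specificity": tn / (tn + fp),  # fraction of negatives recovered
    }


# Made-up labels, not data from the study.
y_true = [1, 1, 1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 1, 1, 0, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Reporting sensitivity and specificity next to accuracy matters when the two classes are unbalanced, because accuracy alone can hide poor performance on the smaller class.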
Affiliation(s)
- Sugandh Kumar
  - Computational Biology and Bioinformatics Laboratory, Institute of Life Science, Bhubaneswar, Odisha, India
  - School of Biotechnology, Kalinga Institute of Industrial Technology (KIIT) University, Bhubaneswar, Odisha, India
- Srinivas Patnaik
  - School of Biotechnology, Kalinga Institute of Industrial Technology (KIIT) University, Bhubaneswar, Odisha, India
- Anshuman Dixit
  - Computational Biology and Bioinformatics Laboratory, Institute of Life Science, Bhubaneswar, Odisha, India
|
65
|
Integrated phosphoproteomics and transcriptional classifiers reveal hidden RAS signaling dynamics in multiple myeloma. Blood Adv 2020; 3:3214-3227. [PMID: 31698452 DOI: 10.1182/bloodadvances.2019000303] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 04/15/2019] [Accepted: 08/23/2019] [Indexed: 02/06/2023] Open
Abstract
A major driver of multiple myeloma (MM) is thought to be aberrant signaling, yet no kinase inhibitors have proven successful in the clinic. Here, we employed an integrated, systems approach combining phosphoproteomic and transcriptome analysis to dissect cellular signaling in MM to inform precision medicine strategies. Unbiased phosphoproteomics initially revealed differential activation of kinases across MM cell lines and that sensitivity to mammalian target of rapamycin (mTOR) inhibition may be particularly dependent on mTOR kinase baseline activity. We further noted differential activity of immediate downstream effectors of Ras as a function of cell line genotype. We extended these observations to patient transcriptome data in the Multiple Myeloma Research Foundation CoMMpass study. A machine-learning-based classifier identified surprisingly divergent transcriptional outputs between NRAS- and KRAS-mutated tumors. Genetic dependency and gene expression analysis revealed mutated Ras as a selective vulnerability, but not other MAPK pathway genes. Transcriptional analysis further suggested that aberrant MAPK pathway activation is only present in a fraction of RAS-mutated vs wild-type RAS patients. These high-MAPK patients, enriched for NRAS Q61 mutations, have inferior outcomes, whereas RAS mutations overall carry no survival impact. We further developed an interactive software tool to relate pharmacologic and genetic kinase dependencies in myeloma. Collectively, these predictive models identify vulnerable signaling signatures and highlight surprising differences in functional signaling patterns between NRAS and KRAS mutants invisible to the genomic landscape. These results will lead to improved stratification of MM patients in precision medicine trials while also revealing unexplored modes of Ras biology in MM.
|
66
|
Schperberg AV, Boichard A, Tsigelny IF, Richard SB, Kurzrock R. Machine learning model to predict oncologic outcomes for drugs in randomized clinical trials. Int J Cancer 2020; 147:2537-2549. [PMID: 32745254 DOI: 10.1002/ijc.33240] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 03/31/2020] [Revised: 07/15/2020] [Accepted: 07/17/2020] [Indexed: 11/12/2022]
Abstract
Predicting oncologic outcome is challenging due to the diversity of cancer histologies and the complex network of underlying biological factors. In this study, we determine whether machine learning (ML) can extract meaningful associations between oncologic outcome and clinical trial, drug-related biomarker and molecular profile information. We analyzed therapeutic clinical trials corresponding to 1102 oncologic outcomes from 104 758 cancer patients with advanced colorectal adenocarcinoma, pancreatic adenocarcinoma, melanoma and non-small-cell lung cancer. For each intervention arm, a dataset with the following attributes was curated: line of treatment, the number of cytotoxic chemotherapies, small-molecule inhibitors, or monoclonal antibody agents, drug class, molecular alteration status of the clinical arm's population, cancer type, probability of drug sensitivity (PDS) (integrating the status of genomic, transcriptomic and proteomic biomarkers in the population of interest) and outcome. A total of 467 progression-free survival (PFS) and 369 overall survival (OS) data points were used as training sets to build our ML (random forest) model. Cross-validation sets were used for PFS and OS, obtaining correlation coefficients (r) of 0.82 and 0.70, respectively (outcome vs model's parameters). A total of 156 PFS and 110 OS data points were used as test sets. The Spearman correlation (rs) between predicted and actual outcomes was statistically significant (PFS: rs = 0.879, OS: rs = 0.878, P < .0001). The better outcome arm was predicted in 81% (PFS: N = 59/73, z = 5.24, P < .0001) and 71% (OS: N = 37/52, z = 2.91, P = .004) of randomized trials. The success of our algorithm to predict clinical outcome may be exploitable as a model to optimize clinical trial design with pharmaceutical agents.
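The Spearman correlation reported in this abstract compares the rank order of predicted and observed outcomes. A minimal sketch of that statistic is given below, assuming it is computed as the Pearson correlation of average ranks; the function names and the five toy values are hypothetical and are not taken from the study.

```python
from math import sqrt


def average_ranks(values):
    """Assign 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Made-up predicted vs observed survival values (months), for illustration.
predicted = [3.1, 5.0, 7.2, 9.8, 12.4]
observed = [2.9, 6.1, 5.8, 10.2, 11.0]
rs = spearman(predicted, observed)
print(round(rs, 3))
```

Because only the ordering enters the calculation, the statistic rewards a model that ranks trial arms correctly even when the predicted magnitudes are off.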
Affiliation(s)
- Alexander V Schperberg
  - CureMatch, Inc., San Diego, California, USA
  - Department of Mechanical and Aerospace Engineering, University of California Los Angeles, Los Angeles, California, USA
- Amélie Boichard
  - Center for Personalized Cancer Therapy and Division of Hematology and Oncology, University of California San Diego Moores Cancer Center, La Jolla, California, USA
- Igor F Tsigelny
  - CureMatch, Inc., San Diego, California, USA
  - San Diego Supercomputer Center, University of California San Diego, La Jolla, California, USA
  - Department of Neurosciences, University of California San Diego, La Jolla, California, USA
- Stéphane B Richard
  - CureMatch, Inc., San Diego, California, USA
  - Oncodesign, Inc., New York, New York, USA
- Razelle Kurzrock
  - Center for Personalized Cancer Therapy and Division of Hematology and Oncology, University of California San Diego Moores Cancer Center, La Jolla, California, USA
|
67
|
Tian D, Tang J, Geng X, Li Q, Wang F, Zhao H, Narla G, Yao X, Zhang Y. Targeting UHRF1-dependent DNA repair selectively sensitizes KRAS mutant lung cancer to chemotherapy. Cancer Lett 2020; 493:80-90. [PMID: 32814087 DOI: 10.1016/j.canlet.2020.08.008] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 04/09/2020] [Revised: 07/16/2020] [Accepted: 08/01/2020] [Indexed: 12/18/2022]
Abstract
Kirsten rat sarcoma virus oncogene homolog (KRAS) mutant lung cancer remains a challenge to cure and chemotherapy is the current standard treatment in the clinic. Hence, understanding molecular mechanisms underlying the sensitivity of KRAS mutant lung cancer to chemotherapy could help uncover unique strategies to treat this disease. Here we report a compound library screen and identification of cardiac glycosides as agents that selectively enhance the in vitro and in vivo effects of chemotherapy on KRAS mutant lung cancer. Quantitative mass spectrometry reveals that cardiac glycosides inhibit DNA double strand break (DSB) repair through suppressing the expression of UHRF1, an important DSB repair factor. Inhibition of UHRF1 by cardiac glycosides was mediated by specific suppression of the oncogenic KRAS pathway. Overexpression of UHRF1 rescued DSB repair inhibited by cardiac glycosides and depletion of UHRF1 mitigated cardiac glycoside-enhanced chemotherapeutic drug sensitivity in KRAS mutant lung cancer cells. Our study reveals a targetable dependency on UHRF1-stimulated DSB repair in KRAS mutant lung cancer in response to chemotherapy.
Affiliation(s)
- Danmei Tian
  - Institute of Traditional Chinese Medicine and Natural Products, College of Pharmacy, Jinan University, Guangzhou, 510632, People's Republic of China
- Jinshan Tang
  - Institute of Traditional Chinese Medicine and Natural Products, College of Pharmacy, Jinan University, Guangzhou, 510632, People's Republic of China
- Xinran Geng
  - Department of Pharmacology, Case Comprehensive Cancer Center, Case Western Reserve University School of Medicine, Cleveland, OH, 44106, USA
- Qingwen Li
  - Institute of Traditional Chinese Medicine and Natural Products, College of Pharmacy, Jinan University, Guangzhou, 510632, People's Republic of China
- Fangfang Wang
  - Institute of Traditional Chinese Medicine and Natural Products, College of Pharmacy, Jinan University, Guangzhou, 510632, People's Republic of China
- Huadong Zhao
  - Institute of Traditional Chinese Medicine and Natural Products, College of Pharmacy, Jinan University, Guangzhou, 510632, People's Republic of China
- Goutham Narla
  - Division of Genetic Medicine, Department of Internal Medicine, The University of Michigan, Ann Arbor, MI, 48109, USA
- Xinsheng Yao
  - Institute of Traditional Chinese Medicine and Natural Products, College of Pharmacy, Jinan University, Guangzhou, 510632, People's Republic of China
- Youwei Zhang
  - Department of Pharmacology, Case Comprehensive Cancer Center, Case Western Reserve University School of Medicine, Cleveland, OH, 44106, USA
|
68
|
Mochida K, Nishii R, Hirayama T. Decoding Plant-Environment Interactions That Influence Crop Agronomic Traits. Plant & Cell Physiology 2020; 61:1408-1418. [PMID: 32392328 PMCID: PMC7434589 DOI: 10.1093/pcp/pcaa064] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 01/31/2020] [Accepted: 04/26/2020] [Indexed: 05/16/2023]
Abstract
To ensure food security in the face of increasing global demand due to population growth and progressive urbanization, it will be crucial to integrate emerging technologies in multiple disciplines to accelerate overall throughput of gene discovery and crop breeding. Plant agronomic traits often appear during the plants' later growth stages due to the cumulative effects of their lifetime interactions with the environment. Therefore, decoding plant-environment interactions by elucidating plants' temporal physiological responses to environmental changes throughout their lifespans will facilitate the identification of genetic and environmental factors, timing and pathways that influence complex end-point agronomic traits, such as yield. Here, we discuss the expected role of the life-course approach to monitoring plant and crop health status in improving crop productivity by enhancing the understanding of plant-environment interactions. We review recent advances in analytical technologies for monitoring health status in plants based on multi-omics analyses and strategies for integrating heterogeneous datasets from multiple omics areas to identify informative factors associated with traits of interest. In addition, we showcase emerging phenomics techniques that enable the noninvasive and continuous monitoring of plant growth by various means, including three-dimensional phenotyping, plant root phenotyping, implantable/injectable sensors and affordable phenotyping devices. Finally, we present an integrated review of analytical technologies and applications for monitoring plant growth, developed across disciplines, such as plant science, data science and sensors and Internet-of-things technologies, to improve plant productivity.
Affiliation(s)
- Keiichi Mochida
  - RIKEN Center for Sustainable Resource Science, Tsurumi-ku, Yokohama, Japan
  - Kihara Institute for Biological Research, Yokohama City University, Totsuka-ku, Yokohama, Japan
  - Graduate School of Nanobioscience, Yokohama City University, Kanazawa-ku, Yokohama, Japan
  - Institute of Plant Science and Resources, Okayama University, Kurashiki, Japan
  - Corresponding author: E-mail, ; Fax, +81-45-503-9609
- Ryuei Nishii
  - School of Information and Data Sciences, Nagasaki University, Nagasaki, Japan
- Takashi Hirayama
  - Institute of Plant Science and Resources, Okayama University, Kurashiki, Japan
|
69
|
Liñares-Blanco J, Munteanu CR, Pazos A, Fernandez-Lozano C. Molecular docking and machine learning analysis of Abemaciclib in colon cancer. BMC Mol Cell Biol 2020; 21:52. [PMID: 32640984 PMCID: PMC7346626 DOI: 10.1186/s12860-020-00295-w] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 02/26/2020] [Accepted: 06/24/2020] [Indexed: 02/07/2023] Open
Abstract
BACKGROUND The main challenge in cancer research is the identification of different omic variables that present a prognostic value and personalised diagnosis for each tumour. The fact that the diagnosis is personalised opens the doors to the design and discovery of new specific treatments for each patient. In this context, this work offers new ways to reuse existing databases and work to create added value in research. Three published signatures with significant prognostic value in Colon Adenocarcinoma (COAD) were identified. These signatures were combined in a new meta-signature and validated with the main Machine Learning (ML) and conventional statistical techniques. In addition, a drug repurposing experiment was carried out through Molecular Docking (MD) methodology in order to identify new potential treatments in COAD.
RESULTS The prognostic potential of the signature was validated by means of ML algorithms and differential gene expression analysis. The results obtained supported the possibility that this meta-signature could harbor genes of interest for the prognosis and treatment of COAD. We studied drug repurposing following a molecular docking (MD) analysis, where the different protein data bank (PDB) structures of the genes of the meta-signature (in total 155) were confronted with 81 anti-cancer drugs approved by the FDA. We observed four interactions of interest: GLTP - Nilotinib, PTPRN - Venetoclax, VEGFA - Venetoclax and FABP6 - Abemaciclib. The FABP6 gene and its role within different metabolic pathways were studied in tumour and normal tissue and we observed the capability of the FABP6 gene to be a therapeutic target. Our in silico results showed a significant specificity of the union of the protein products of the FABP6 gene as well as the known action of Abemaciclib as an inhibitor of the CDK4/6 protein and therefore, of the cell cycle.
CONCLUSIONS The results of our ML and differential expression experiments have first shown the FABP6 gene as a possible new cancer biomarker due to its specificity in colonic tumour tissue and no expression in healthy adjacent tissue. Next, the MD analysis showed the characteristic affinity of the drug Abemaciclib for the different protein structures of the FABP6 gene. Therefore, in silico experiments have shown a new opportunity that should be validated experimentally, thus helping to reduce the cost and time of drug screening. For these reasons, we propose the validation of the drug Abemaciclib for the treatment of colon cancer.
Affiliation(s)
- Jose Liñares-Blanco
  - Department of Computer Science and Information Technologies, Faculty of Computer Science, University of A Coruña, CITIC, Campus Elviña s/n, A Coruña, 15071, Spain
- Cristian R Munteanu
  - Department of Computer Science and Information Technologies, Faculty of Computer Science, University of A Coruña, CITIC, Campus Elviña s/n, A Coruña, 15071, Spain
  - Grupo de Redes de Neuronas Artificiales y Sistemas Adaptativos. Imagen Médica y Diagnóstico Radiológico (RNASA-IMEDIR). Instituto de Investigación Biomédica de A Coruña (INIBIC). Complexo Hospitalario Universitario de A Coruña (CHUAC), Sergas. Universidade da Coruña (UDC), Xubias de arriba, 84, A Coruña, 15006, Spain
- Alejandro Pazos
  - Department of Computer Science and Information Technologies, Faculty of Computer Science, University of A Coruña, CITIC, Campus Elviña s/n, A Coruña, 15071, Spain
  - Grupo de Redes de Neuronas Artificiales y Sistemas Adaptativos. Imagen Médica y Diagnóstico Radiológico (RNASA-IMEDIR). Instituto de Investigación Biomédica de A Coruña (INIBIC). Complexo Hospitalario Universitario de A Coruña (CHUAC), Sergas. Universidade da Coruña (UDC), Xubias de arriba, 84, A Coruña, 15006, Spain
- Carlos Fernandez-Lozano
  - Department of Computer Science and Information Technologies, Faculty of Computer Science, University of A Coruña, CITIC, Campus Elviña s/n, A Coruña, 15071, Spain
  - Grupo de Redes de Neuronas Artificiales y Sistemas Adaptativos. Imagen Médica y Diagnóstico Radiológico (RNASA-IMEDIR). Instituto de Investigación Biomédica de A Coruña (INIBIC). Complexo Hospitalario Universitario de A Coruña (CHUAC), Sergas. Universidade da Coruña (UDC), Xubias de arriba, 84, A Coruña, 15006, Spain
|
70
|
Coulouarn C. Artificial intelligence and omics in cancer. Artif Intell Cancer 2020; 1:1-7. [DOI: 10.35713/aic.v1.i1.1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/20/2020] [Revised: 06/09/2020] [Accepted: 06/12/2020] [Indexed: 02/06/2023] Open
Affiliation(s)
- Cédric Coulouarn
  - Institut National de la Santé et de la Recherche Médicale (Inserm), Université de Rennes 1, Rennes F-35000, France
|
71
|
Spontaneous Tumor Regression in Tasmanian Devils Associated with RASL11A Activation. Genetics 2020; 215:1143-1152. [PMID: 32554701 DOI: 10.1534/genetics.120.303428] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Received: 05/05/2020] [Accepted: 06/12/2020] [Indexed: 12/30/2022] Open
Abstract
Spontaneous tumor regression has been documented in a small proportion of human cancer patients, but the specific mechanisms underlying tumor regression without treatment are not well understood. Tasmanian devils are threatened with extinction from a transmissible cancer due to universal susceptibility and a near 100% case fatality rate. In over 10,000 cases, <20 instances of natural tumor regression have been detected. Previous work in this system has focused on Tasmanian devil genetic variation associated with the regression phenotype. Here, we used comparative and functional genomics to identify tumor genetic variation associated with tumor regression. We show that a single point mutation in the 5' untranslated region of the putative tumor suppressor RASL11A significantly contributes to tumor regression. RASL11A was expressed in regressed tumors but silenced in wild-type, nonregressed tumors, consistent with RASL11A downregulation in human cancers. Induced RASL11A expression significantly reduced tumor cell proliferation in vitro. The RAS pathway is frequently altered in human cancers, and RASL11A activation may provide a therapeutic treatment option for Tasmanian devils as well as a general mechanism for tumor inhibition.
|
72
|
Xie G, Wang X, Wei R, Wang J, Zhao A, Chen T, Wang Y, Zhang H, Xiao Z, Liu X, Deng Y, Wong L, Rajani C, Kwee S, Bian H, Gao X, Liu P, Jia W. Serum metabolite profiles are associated with the presence of advanced liver fibrosis in Chinese patients with chronic hepatitis B viral infection. BMC Med 2020; 18:144. [PMID: 32498677 PMCID: PMC7273661 DOI: 10.1186/s12916-020-01595-w] [Citation(s) in RCA: 33] [Impact Index Per Article: 6.6] [Received: 02/16/2020] [Accepted: 04/16/2020] [Indexed: 12/16/2022] Open
Abstract
BACKGROUND Accurate and noninvasive diagnosis and staging of liver fibrosis are essential for effective clinical management of chronic liver disease (CLD). We aimed to identify serum metabolite markers that reliably predict the stage of fibrosis in CLD patients. METHODS We quantitatively profiled serum metabolites of participants in 2 independent cohorts. Based on the metabolomics data from cohort 1 (504 HBV associated liver fibrosis patients and 502 normal controls, NC), we selected a panel of 4 predictive metabolite markers. Consequently, we constructed 3 machine learning models with the 4 metabolite markers using random forest (RF), to differentiate CLD patients from normal controls (NC), to differentiate cirrhosis patients from fibrosis patients, and to differentiate advanced fibrosis from early fibrosis, respectively. RESULTS The panel of 4 metabolite markers consisted of taurocholate, tyrosine, valine, and linoelaidic acid. The RF models of the metabolite panel demonstrated the strongest stratification ability in cohort 1 to diagnose CLD patients from NC (area under the receiver operating characteristic curve (AUROC) = 0.997 and the precision-recall curve (AUPR) = 0.994), to differentiate fibrosis from cirrhosis (0.941, 0.870), and to stage liver fibrosis (0.918, 0.892). The diagnostic accuracy of the models was further validated in an independent cohort 2 consisting of 300 CLD patients with chronic HBV infection and 90 NC. The AUCs of the models were consistently higher than APRI, FIB-4, and AST/ALT ratio, with both greater sensitivity and specificity. CONCLUSIONS Our study showed that this 4-metabolite panel has potential usefulness in clinical assessments of CLD progression in patients with chronic hepatitis B virus infection.
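The AUROC values quoted in this abstract can be read as the probability that a randomly chosen case receives a higher model score than a randomly chosen control. The sketch below computes that quantity directly by pairwise comparison, counting ties as half. It is an illustration under stated assumptions, not the authors' pipeline: the function name `auroc` and the six toy scores are hypothetical, and a real analysis would take its scores from a fitted classifier such as the random forest the abstract describes.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: 1 for cases (e.g. CLD patients), 0 for controls (assumed coding).
    scores: model outputs where higher means more likely to be a case.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # case correctly ranked above control
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Made-up scores, not data from the study.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
auc = auroc(labels, scores)
print(round(auc, 3))
```

The pairwise form is quadratic in the number of samples, which is acceptable for a small illustration; rank-based formulas give the same value more efficiently on large cohorts.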
Affiliation(s)
- Guoxiang Xie
  - E-Institute of Shanghai Municipal Education Committee, Institute of Interdisciplinary Integrative Medicine Research, Shanghai University of Traditional Chinese Medicine, Shanghai, 201203, China
  - Human Metabolomics Institute, Inc., Shenzhen, 518109, Guangdong, China
- Xiaoning Wang
  - E-Institute of Shanghai Municipal Education Committee, Institute of Interdisciplinary Integrative Medicine Research, Shanghai University of Traditional Chinese Medicine, Shanghai, 201203, China
  - Key Laboratory of Liver and Kidney Diseases (Ministry of Education), Shuguang Hospital, Shanghai University of Traditional Chinese Medicine, Shanghai, 201203, China
- Runmin Wei
  - University of Hawaii Cancer Center, Honolulu, HI, 96813, USA
- Jingye Wang
  - University of Hawaii Cancer Center, Honolulu, HI, 96813, USA
- Aihua Zhao
  - Shanghai Key Laboratory of Diabetes Mellitus and Center for Translational Medicine, Shanghai Jiao Tong University Affiliated Sixth People's Hospital, Shanghai, 200233, China
- Tianlu Chen
  - Shanghai Key Laboratory of Diabetes Mellitus and Center for Translational Medicine, Shanghai Jiao Tong University Affiliated Sixth People's Hospital, Shanghai, 200233, China
- Yixing Wang
  - Key Laboratory of Liver and Kidney Diseases (Ministry of Education), Shuguang Hospital, Shanghai University of Traditional Chinese Medicine, Shanghai, 201203, China
- Hua Zhang
  - E-Institute of Shanghai Municipal Education Committee, Institute of Interdisciplinary Integrative Medicine Research, Shanghai University of Traditional Chinese Medicine, Shanghai, 201203, China
  - Key Laboratory of Liver and Kidney Diseases (Ministry of Education), Shuguang Hospital, Shanghai University of Traditional Chinese Medicine, Shanghai, 201203, China
- Zhun Xiao
  - E-Institute of Shanghai Municipal Education Committee, Institute of Interdisciplinary Integrative Medicine Research, Shanghai University of Traditional Chinese Medicine, Shanghai, 201203, China
  - Key Laboratory of Liver and Kidney Diseases (Ministry of Education), Shuguang Hospital, Shanghai University of Traditional Chinese Medicine, Shanghai, 201203, China
- Xinzhu Liu
  - E-Institute of Shanghai Municipal Education Committee, Institute of Interdisciplinary Integrative Medicine Research, Shanghai University of Traditional Chinese Medicine, Shanghai, 201203, China
  - Key Laboratory of Liver and Kidney Diseases (Ministry of Education), Shuguang Hospital, Shanghai University of Traditional Chinese Medicine, Shanghai, 201203, China
- Youping Deng
  - University of Hawaii Cancer Center, Honolulu, HI, 96813, USA
- Linda Wong
  - University of Hawaii Cancer Center, Honolulu, HI, 96813, USA
- Cynthia Rajani
  - University of Hawaii Cancer Center, Honolulu, HI, 96813, USA
- Sandi Kwee
  - University of Hawaii Cancer Center, Honolulu, HI, 96813, USA
- Hua Bian
  - Department of Endocrinology and Metabolism, Zhongshan Hospital, Fudan University, Shanghai, 200032, China
- Xin Gao
  - Department of Endocrinology and Metabolism, Zhongshan Hospital, Fudan University, Shanghai, 200032, China
- Ping Liu
  - E-Institute of Shanghai Municipal Education Committee, Institute of Interdisciplinary Integrative Medicine Research, Shanghai University of Traditional Chinese Medicine, Shanghai, 201203, China
  - Key Laboratory of Liver and Kidney Diseases (Ministry of Education), Shuguang Hospital, Shanghai University of Traditional Chinese Medicine, Shanghai, 201203, China
  - Institute of Liver Diseases, Shuguang Hospital, Shanghai University of Traditional Chinese Medicine, 528 Zhangheng Road, Shanghai, 201203, China
- Wei Jia
  - E-Institute of Shanghai Municipal Education Committee, Institute of Interdisciplinary Integrative Medicine Research, Shanghai University of Traditional Chinese Medicine, Shanghai, 201203, China
  - University of Hawaii Cancer Center, Honolulu, HI, 96813, USA
  - School of Chinese Medicine, Hong Kong Baptist University, Kowloon Tong, Hong Kong, China
|
73
|
Way GP, Zietz M, Rubinetti V, Himmelstein DS, Greene CS. Compressing gene expression data using multiple latent space dimensionalities learns complementary biological representations. Genome Biol 2020; 21:109. [PMID: 32393369 PMCID: PMC7212571 DOI: 10.1186/s13059-020-02021-3] [Citation(s) in RCA: 37] [Impact Index Per Article: 7.4] [Received: 05/07/2019] [Accepted: 04/16/2020] [Indexed: 12/27/2022] Open
Abstract
BACKGROUND Unsupervised compression algorithms applied to gene expression data extract latent or hidden signals representing technical and biological sources of variation. However, these algorithms require a user to select a biologically appropriate latent space dimensionality. In practice, most researchers fit a single algorithm and latent dimensionality. We sought to determine the extent to which selecting only one fit limits the biological features captured in the latent representations and, consequently, limits what can be discovered with subsequent analyses.
RESULTS We compress gene expression data from three large datasets consisting of adult normal tissue, adult cancer tissue, and pediatric cancer tissue. We train many different models across a large range of latent space dimensionalities and observe various performance differences. We identify more curated pathway gene sets significantly associated with individual dimensions in denoising autoencoder and variational autoencoder models trained using an intermediate number of latent dimensionalities. Combining compressed features across algorithms and dimensionalities captures the most pathway-associated representations. When trained with different latent dimensionalities, models learn strongly associated and generalizable biological representations including sex, neuroblastoma MYCN amplification, and cell types. Stronger signals, such as tumor type, are best captured in models trained at lower dimensionalities, while more subtle signals such as pathway activity are best identified in models trained with more latent dimensionalities.
CONCLUSIONS There is no single best latent dimensionality or compression algorithm for analyzing gene expression data. Instead, using features derived from different compression models across multiple latent space dimensionalities enhances biological representations.
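The idea of combining compressed features across several latent dimensionalities can be illustrated with a minimal sketch. It is not the authors' pipeline: a seeded Gaussian random projection stands in for the compression algorithms (the abstract names denoising and variational autoencoders), and the function names, the toy expression matrix, and the chosen dimensionalities are all hypothetical.

```python
import random


def random_projection(samples, k, seed=0):
    """Compress each sample to k features with a seeded Gaussian projection.

    Stand-in for a real compression model (assumption, not the paper's method).
    """
    rng = random.Random(seed)
    n_genes = len(samples[0])
    weights = [[rng.gauss(0.0, 1.0) for _ in range(n_genes)] for _ in range(k)]
    return [[sum(w * v for w, v in zip(row, sample)) for row in weights]
            for sample in samples]


def combine_across_dimensionalities(samples, dims, seed=0):
    """Fit one compression per latent dimensionality and concatenate the features."""
    combined = [[] for _ in samples]
    names = []
    for k in dims:
        compressed = random_projection(samples, k, seed=seed + k)
        for row, latent in zip(combined, compressed):
            row.extend(latent)
        # Record which model (k) and which latent dimension each column came from.
        names.extend(f"k{k}_z{i}" for i in range(k))
    return combined, names


# Toy expression matrix: 3 samples x 4 genes (made-up values).
expression = [
    [5.1, 0.2, 3.3, 1.0],
    [4.9, 0.1, 3.0, 1.2],
    [0.3, 6.0, 0.5, 2.2],
]
features, names = combine_across_dimensionalities(expression, dims=[1, 2, 3])
```

Each sample ends up with one feature vector that pools the latent dimensions of every fitted model, so a downstream analysis can draw on both low- and high-dimensional representations rather than a single fit.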
Affiliation(s)
- Gregory P Way
- Genomics and Computational Biology Graduate Group, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Department of Systems Pharmacology and Translational Therapeutics, University of Pennsylvania, 10-131 SCTR 34th and Civic Center Blvd, Philadelphia, PA, 19104, USA
- Imaging Platform, Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA
| | - Michael Zietz
- Department of Systems Pharmacology and Translational Therapeutics, University of Pennsylvania, 10-131 SCTR 34th and Civic Center Blvd, Philadelphia, PA, 19104, USA
| | - Vincent Rubinetti
- Department of Systems Pharmacology and Translational Therapeutics, University of Pennsylvania, 10-131 SCTR 34th and Civic Center Blvd, Philadelphia, PA, 19104, USA
| | - Daniel S Himmelstein
- Department of Systems Pharmacology and Translational Therapeutics, University of Pennsylvania, 10-131 SCTR 34th and Civic Center Blvd, Philadelphia, PA, 19104, USA
| | - Casey S Greene
- Department of Systems Pharmacology and Translational Therapeutics, University of Pennsylvania, 10-131 SCTR 34th and Civic Center Blvd, Philadelphia, PA, 19104, USA.
- Childhood Cancer Data Lab, Alex's Lemonade Stand Foundation, Philadelphia, PA, 19102, USA.
|
74
|
Goecks J, Jalili V, Heiser LM, Gray JW. How Machine Learning Will Transform Biomedicine. Cell 2020; 181:92-101. [PMID: 32243801 PMCID: PMC7141410 DOI: 10.1016/j.cell.2020.03.022] [Citation(s) in RCA: 268] [Impact Index Per Article: 53.6] [Received: 02/24/2020] [Revised: 03/07/2020] [Accepted: 03/09/2020] [Indexed: 12/15/2022]
Abstract
This Perspective explores the application of machine learning toward improved diagnosis and treatment. We outline a vision for how machine learning can transform three broad areas of biomedicine: clinical diagnostics, precision treatments, and health monitoring, where the goal is to maintain health through a range of diseases and the normal aging process. For each area, early instances of successful machine learning applications are discussed, as well as opportunities and challenges for machine learning. When these challenges are met, machine learning promises a future of rigorous, outcomes-based medicine with detection, diagnosis, and treatment strategies that are continuously adapted to individual and environmental differences.
Affiliation(s)
- Jeremy Goecks
- Department of Biomedical Engineering, Oregon Health & Science University, Portland, OR, USA.
| | - Vahid Jalili
- Department of Biomedical Engineering, Oregon Health & Science University, Portland, OR, USA
| | - Laura M Heiser
- Department of Biomedical Engineering, Oregon Health & Science University, Portland, OR, USA
| | - Joe W Gray
- Department of Biomedical Engineering, Oregon Health & Science University, Portland, OR, USA
|
75
|
Banerjee J, Allaway RJ, Taroni JN, Baker A, Zhang X, Moon CI, Pratilas CA, Blakeley JO, Guinney J, Hirbe A, Greene CS, Gosline SJC. Integrative Analysis Identifies Candidate Tumor Microenvironment and Intracellular Signaling Pathways that Define Tumor Heterogeneity in NF1. Genes (Basel) 2020; 11:E226. [PMID: 32098059 PMCID: PMC7073563 DOI: 10.3390/genes11020226] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 01/15/2020] [Revised: 02/15/2020] [Accepted: 02/19/2020] [Indexed: 12/12/2022] Open
Abstract
Neurofibromatosis type 1 (NF1) is a monogenic syndrome that gives rise to numerous symptoms including cognitive impairment, skeletal abnormalities, and growth of benign nerve sheath tumors. Nearly all NF1 patients develop cutaneous neurofibromas (cNFs), which occur on the skin surface, whereas 40-60% of patients develop plexiform neurofibromas (pNFs), which are deeply embedded in the peripheral nerves. Patients with pNFs have a ~10% lifetime chance of these tumors becoming malignant peripheral nerve sheath tumors (MPNSTs). These tumors have a severe prognosis and few treatment options other than surgery. Given the lack of therapeutic options available to patients with these tumors, identification of druggable pathways or other key molecular features could aid ongoing therapeutic discovery studies. In this work, we used statistical and machine learning methods to analyze 77 NF1 tumors with genomic data to characterize key signaling pathways that distinguish these tumors and identify candidates for drug development. We identified subsets of latent gene expression variables that may be important in the identification and etiology of cNFs, pNFs, other neurofibromas, and MPNSTs. Furthermore, we characterized the association between these latent variables and genetic variants, immune deconvolution predictions, and protein activity predictions.
Affiliation(s)
- Jineta Banerjee
- Computational Oncology, Sage Bionetworks, Seattle, WA 98121, USA
| | - Robert J Allaway
- Computational Oncology, Sage Bionetworks, Seattle, WA 98121, USA
| | - Jaclyn N Taroni
- Childhood Cancer Data Lab, Alex’s Lemonade Stand Foundation, Philadelphia, PA 19102, USA
| | - Aaron Baker
- Computational Oncology, Sage Bionetworks, Seattle, WA 98121, USA
- Department of Computer Sciences, University of Wisconsin-Madison, Madison, WI 53715, USA
- Morgridge Institute for Research, Madison, WI 53715, USA
| | - Xiaochun Zhang
- Division of Oncology, Washington University Medical School, St. Louis, MO 63110, USA
| | - Chang In Moon
- Division of Oncology, Washington University Medical School, St. Louis, MO 63110, USA
| | - Christine A Pratilas
- Sidney Kimmel Comprehensive Cancer Center and Department of Oncology, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
| | - Jaishri O Blakeley
- Sidney Kimmel Comprehensive Cancer Center and Department of Oncology, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
- Neurology, Neurosurgery and Oncology, Johns Hopkins University, Baltimore, MD 21287, USA
| | - Justin Guinney
- Computational Oncology, Sage Bionetworks, Seattle, WA 98121, USA
| | - Angela Hirbe
- Division of Oncology, Washington University Medical School, St. Louis, MO 63110, USA
| | - Casey S Greene
- Childhood Cancer Data Lab, Alex’s Lemonade Stand Foundation, Philadelphia, PA 19102, USA
- Department of Systems Pharmacology and Translational Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Sara JC Gosline
- Computational Oncology, Sage Bionetworks, Seattle, WA 98121, USA
|
76
|
Haan D, Tao R, Friedl V, Anastopoulos IN, Wong CK, Weinstein AS, Stuart JM. Using Transcriptional Signatures to Find Cancer Drivers with LURE. Pacific Symposium on Biocomputing 2020; 25:343-354. [PMID: 31797609 PMCID: PMC6924983] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/03/2022]
Abstract
Cancer genome projects have produced multidimensional datasets on thousands of samples. Yet, depending on the tumor type, 5-50% of samples have no known driving event. We introduce a semi-supervised method called Learning UnRealized Events (LURE) that uses a progressive label learning framework and minimum spanning analysis to predict cancer drivers based on their altered samples sharing a gene expression signature with the samples of a known event. We demonstrate the utility of the method on the TCGA Pan-Cancer Atlas dataset for which it produced a high-confidence result relating 59 new connections to 18 known mutation events including alterations in the same gene, family, and pathway. We give examples of predicted drivers involved in TP53, telomere maintenance, and MAPK/RTK signaling pathways. LURE identifies connections between genes with no known prior relationship, some of which may offer clues for targeting specific forms of cancer. Code and Supplemental Material are available on the LURE website: https://sysbiowiki.soe.ucsc.edu/lure.
Affiliation(s)
- David Haan
- Dept. of Biomolecular Engineering and UC Santa Cruz Genomics Institute University Of California, Santa Cruz, Santa Cruz, CA 95064, USA
| | - Ruikang Tao
- Dept. of Biomolecular Engineering and UC Santa Cruz Genomics Institute University Of California, Santa Cruz, Santa Cruz, CA 95064, USA
| | - Verena Friedl
- Dept. of Biomolecular Engineering and UC Santa Cruz Genomics Institute University Of California, Santa Cruz, Santa Cruz, CA 95064, USA
| | - Ioannis N Anastopoulos
- Dept. of Biomolecular Engineering and UC Santa Cruz Genomics Institute University Of California, Santa Cruz, Santa Cruz, CA 95064, USA
| | - Christopher K Wong
- Dept. of Biomolecular Engineering and UC Santa Cruz Genomics Institute University Of California, Santa Cruz, Santa Cruz, CA 95064, USA
| | - Alana S Weinstein
- Dept. of Biomolecular Engineering and UC Santa Cruz Genomics Institute University Of California, Santa Cruz, Santa Cruz, CA 95064, USA
| | - Joshua M Stuart
- Dept. of Biomolecular Engineering and UC Santa Cruz Genomics Institute University Of California, Santa Cruz, Santa Cruz, CA 95064, USA
|
77
|
Tranchevent LC, Azuaje F, Rajapakse JC. A deep neural network approach to predicting clinical outcomes of neuroblastoma patients. BMC Med Genomics 2019; 12:178. [PMID: 31856829 PMCID: PMC6923884 DOI: 10.1186/s12920-019-0628-y] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 11/08/2019] [Accepted: 11/15/2019] [Indexed: 01/16/2023] Open
Abstract
BACKGROUND The availability of high-throughput omics datasets from large patient cohorts has allowed the development of methods that aim at predicting patient clinical outcomes, such as survival and disease recurrence. Such methods are also important to better understand the biological mechanisms underlying disease etiology and development, as well as treatment responses. Recently, different predictive models, relying on distinct algorithms (including Support Vector Machines and Random Forests), have been investigated. In this context, deep learning strategies are of special interest due to their demonstrated superior performance over a wide range of problems and datasets. One of the main challenges of such strategies is the "small n large p" problem. Indeed, omics datasets typically consist of small numbers of samples and large numbers of features relative to typical deep learning datasets. Neural networks usually tackle this problem through feature selection or by including additional constraints during the learning process.
METHODS We propose to tackle this problem with a novel strategy that relies on a graph-based method for feature extraction, coupled with a deep neural network for clinical outcome prediction. The omics data are first represented as graphs whose nodes represent patients, and edges represent correlations between the patients' omics profiles. Topological features, such as centralities, are then extracted from these graphs for every node. Lastly, these features are used as input to train and test various classifiers.
RESULTS We apply this strategy to four neuroblastoma datasets and observe that models based on neural networks are more accurate than state-of-the-art models (DNN: 85%-87%, SVM/RF: 75%-82%). We explore how different parameters and configurations are selected in order to overcome the effects of the small data problem as well as the curse of dimensionality.
CONCLUSIONS Our results indicate that the deep neural networks capture complex features in the data that help predict patient clinical outcomes.
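The graph-based feature extraction described in the METHODS can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: patients are linked when the Pearson correlation of their profiles reaches a hypothetical threshold of 0.8, degree centrality stands in for the unspecified "centralities", and the patient identifiers and values are made up.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y) if sd_x and sd_y else 0.0


def patient_graph(profiles, threshold=0.8):
    """Nodes are patients; an edge links two patients whose profiles correlate."""
    adjacency = {patient: set() for patient in profiles}
    for a, b in combinations(profiles, 2):
        if pearson(profiles[a], profiles[b]) >= threshold:
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency


def degree_centrality(adjacency):
    """Fraction of the other patients each patient is connected to."""
    n = len(adjacency)
    return {p: (len(nbrs) / (n - 1) if n > 1 else 0.0)
            for p, nbrs in adjacency.items()}


# Toy omics profiles (hypothetical): P1, P2, P4 are similar; P3 is reversed.
profiles = {
    "P1": [1.0, 2.0, 3.0, 4.0],
    "P2": [2.0, 4.0, 6.0, 8.1],
    "P3": [4.0, 3.0, 2.0, 1.0],
    "P4": [1.0, 2.1, 2.9, 4.2],
}
graph = patient_graph(profiles)
features = degree_centrality(graph)
```

The resulting per-patient centrality values are the kind of low-dimensional topological features that would then be passed to a classifier in place of the raw high-dimensional omics profile.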
Affiliation(s)
- Léon-Charles Tranchevent
- Proteome and Genome Research Unit, Department of Oncology, Luxembourg Institute of Health, 1A-B, rue Thomas Edison, Strassen, L-1445 Luxembourg
- Current affiliation: Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, 7, avenue des Hauts Fourneaux, Esch-sur-Alzette, L-4362 Luxembourg
| | - Francisco Azuaje
- Proteome and Genome Research Unit, Department of Oncology, Luxembourg Institute of Health, 1A-B, rue Thomas Edison, Strassen, L-1445 Luxembourg
- Current affiliation: Data and Translational Sciences, UCB Celltech, 208 Bath Road, Slough, SL1 3WE UK
| | - Jagath C. Rajapakse
- Bioinformatics Research Center, School of Computer Science and Engineering, Nanyang Technological University, 50, Nanyang Avenue, Singapore, 639798 Singapore
|
78
|
Ma T, Zhang A. Integrate multi-omics data with biological interaction networks using Multi-view Factorization AutoEncoder (MAE). BMC Genomics 2019; 20:944. [PMID: 31856727 PMCID: PMC6923820 DOI: 10.1186/s12864-019-6285-x] [Citation(s) in RCA: 34] [Impact Index Per Article: 5.7] [Indexed: 12/03/2022] Open
Abstract
BACKGROUND Comprehensive molecular profiling of various cancers and other diseases has generated vast amounts of multi-omics data. Each type of -omics data corresponds to one feature space, such as gene expression, miRNA expression, DNA methylation, etc. Integrating multi-omics data can link different layers of molecular feature spaces and is crucial to elucidate molecular pathways underlying various diseases. Machine learning approaches to mining multi-omics data hold great promise for uncovering intricate relationships among molecular features. However, due to the "big p, small n" problem (i.e., small sample sizes with high-dimensional features), training a large-scale generalizable deep learning model with multi-omics data alone is very challenging.
RESULTS We developed a method called Multi-view Factorization AutoEncoder (MAE) with network constraints that can seamlessly integrate multi-omics data and domain knowledge such as molecular interaction networks. Our method learns feature and patient embeddings simultaneously with deep representation learning. Both feature representations and patient representations are subject to certain constraints specified as regularization terms in the training objective. By incorporating domain knowledge into the training objective, we implicitly introduced a good inductive bias into the machine learning model, which helps improve model generalizability. We performed extensive experiments on the TCGA datasets and demonstrated the power of integrating multi-omics data and biological interaction networks using our proposed method for predicting target clinical variables.
CONCLUSIONS To alleviate the overfitting problem in deep learning on multi-omics data with the "big p, small n" problem, it is helpful to incorporate biological domain knowledge into the model as inductive biases. It is very promising to design machine learning models that facilitate the seamless integration of large-scale multi-omics data and biomedical domain knowledge for uncovering intricate relationships among molecular features and clinical features.
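How a network constraint can enter a training objective as a regularization term is sketched below. The abstract does not give the exact form of the MAE regularizers, so this assumes a common choice: a squared reconstruction error plus a graph-smoothness penalty that pulls together the embeddings of features linked in an interaction network. The function names, the weight, and the toy gene embeddings are hypothetical.

```python
def reconstruction_error(x, x_hat):
    """Sum of squared differences between the data and its reconstruction."""
    return sum((a - b) ** 2
               for row, row_hat in zip(x, x_hat)
               for a, b in zip(row, row_hat))


def network_penalty(embeddings, edges):
    """Sum of squared distances between embeddings of connected features.

    For an unweighted graph this equals the Laplacian quadratic form
    tr(Z^T L Z); using it here is an assumption for illustration.
    """
    return sum(
        sum((a - b) ** 2 for a, b in zip(embeddings[i], embeddings[j]))
        for i, j in edges
    )


def objective(x, x_hat, embeddings, edges, weight=0.1):
    """Reconstruction loss plus a weighted network regularization term."""
    return reconstruction_error(x, x_hat) + weight * network_penalty(embeddings, edges)


# Toy data: 2 patients x 2 features and its (imperfect) reconstruction.
x = [[1.0, 2.0], [3.0, 4.0]]
x_hat = [[1.0, 2.0], [3.0, 3.0]]

# Illustrative feature embeddings and interaction edges (made-up values).
embeddings = {"TP53": [0.0, 1.0], "MDM2": [0.0, 1.0], "KRAS": [1.0, 1.0]}
edges = [("TP53", "MDM2"), ("TP53", "KRAS")]

loss = objective(x, x_hat, embeddings, edges, weight=0.5)
```

Minimizing such a loss rewards embeddings that both reconstruct the data and agree with the prior network, which is one way domain knowledge can act as an inductive bias when samples are scarce.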
Affiliation(s)
- Tianle Ma
- Department of Computer Science and Engineering, University at Buffalo, 338 Davis Hall, Buffalo, 14260 NY USA
| | - Aidong Zhang
- Department of Computer Science, University of Virginia, 509 Rice Hall, Charlottesville, 22904 VA USA
|
79
|
Rokita JL, Rathi KS, Cardenas MF, Upton KA, Jayaseelan J, Cross KL, Pfeil J, Egolf LE, Way GP, Farrel A, Kendsersky NM, Patel K, Gaonkar KS, Modi A, Berko ER, Lopez G, Vaksman Z, Mayoh C, Nance J, McCoy K, Haber M, Evans K, McCalmont H, Bendak K, Böhm JW, Marshall GM, Tyrrell V, Kalletla K, Braun FK, Qi L, Du Y, Zhang H, Lindsay HB, Zhao S, Shu J, Baxter P, Morton C, Kurmashev D, Zheng S, Chen Y, Bowen J, Bryan AC, Leraas KM, Coppens SE, Doddapaneni H, Momin Z, Zhang W, Sacks GI, Hart LS, Krytska K, Mosse YP, Gatto GJ, Sanchez Y, Greene CS, Diskin SJ, Vaske OM, Haussler D, Gastier-Foster JM, Kolb EA, Gorlick R, Li XN, Reynolds CP, Kurmasheva RT, Houghton PJ, Smith MA, Lock RB, Raman P, Wheeler DA, Maris JM. Genomic Profiling of Childhood Tumor Patient-Derived Xenograft Models to Enable Rational Clinical Trial Design. Cell Rep 2019; 29:1675-1689.e9. [PMID: 31693904 PMCID: PMC6880934 DOI: 10.1016/j.celrep.2019.09.071] [Citation(s) in RCA: 114] [Impact Index Per Article: 19.0] [Received: 03/31/2019] [Revised: 07/10/2019] [Accepted: 09/24/2019] [Indexed: 02/08/2023] Open
Abstract
Accelerating cures for children with cancer remains an immediate challenge as a result of extensive oncogenic heterogeneity between and within histologies, distinct molecular mechanisms evolving between diagnosis and relapsed disease, and limited therapeutic options. To systematically prioritize and rationally test novel agents in preclinical murine models, researchers within the Pediatric Preclinical Testing Consortium are continuously developing patient-derived xenografts (PDXs), many of which are refractory to current standard-of-care treatments, from high-risk childhood cancers. Here, we genomically characterize 261 PDX models from 37 unique pediatric cancers; demonstrate faithful recapitulation of histologies and subtypes; and refine our understanding of relapsed disease. In addition, we use expression signatures to classify tumors for TP53 and NF1 pathway inactivation. We anticipate that these data will serve as a resource for pediatric oncology drug development and will guide rational clinical trial design for children with cancer.
Affiliation(s)
- Jo Lynne Rokita
- Division of Oncology, Children's Hospital of Philadelphia, and Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA 19104-4318, USA; Department of Bioinformatics and Health Informatics, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA; Center for Data-Driven Discovery in Biomedicine, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
| | - Komal S Rathi
- Department of Bioinformatics and Health Informatics, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA; Center for Data-Driven Discovery in Biomedicine, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
| | - Maria F Cardenas
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX 77030, USA
| | - Kristen A Upton
- Division of Oncology, Children's Hospital of Philadelphia, and Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA 19104-4318, USA
| | - Joy Jayaseelan
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX 77030, USA
| | | | - Jacob Pfeil
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA 95064, USA
| | - Laura E Egolf
- Division of Oncology, Children's Hospital of Philadelphia, and Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA 19104-4318, USA; Cell and Molecular Biology Graduate Group, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Gregory P Way
- Genomics and Computational Biology Graduate Group, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Alvin Farrel
- Department of Bioinformatics and Health Informatics, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
| | - Nathan M Kendsersky
- Division of Oncology, Children's Hospital of Philadelphia, and Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA 19104-4318, USA; Department of Systems Pharmacology and Translational Therapeutics, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Khushbu Patel
- Department of Bioinformatics and Health Informatics, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
| | - Krutika S Gaonkar
- Department of Bioinformatics and Health Informatics, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA; Center for Data-Driven Discovery in Biomedicine, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
| | - Apexa Modi
- Division of Oncology, Children's Hospital of Philadelphia, and Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA 19104-4318, USA; Genomics and Computational Biology Graduate Group, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Esther R Berko
- Division of Oncology, Children's Hospital of Philadelphia, and Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA 19104-4318, USA
| | - Gonzalo Lopez
- Division of Oncology, Children's Hospital of Philadelphia, and Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA 19104-4318, USA; Department of Bioinformatics and Health Informatics, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
| | - Zalman Vaksman
- Department of Bioinformatics and Health Informatics, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
| | - Chelsea Mayoh
- Children's Cancer Institute, School of Women's and Children's Health, UNSW Sydney, Sydney, NSW, Australia
| | - Jonas Nance
- Cancer Center, Texas Tech University Health Sciences Center School of Medicine, Lubbock, TX 79430, USA
| | - Kristyn McCoy
- Cancer Center, Texas Tech University Health Sciences Center School of Medicine, Lubbock, TX 79430, USA
| | - Michelle Haber
- Children's Cancer Institute, School of Women's and Children's Health, UNSW Sydney, Sydney, NSW, Australia
| | - Kathryn Evans
- Children's Cancer Institute, School of Women's and Children's Health, UNSW Sydney, Sydney, NSW, Australia
| | - Hannah McCalmont
- Children's Cancer Institute, School of Women's and Children's Health, UNSW Sydney, Sydney, NSW, Australia
| | - Katerina Bendak
- Children's Cancer Institute, School of Women's and Children's Health, UNSW Sydney, Sydney, NSW, Australia
| | - Julia W Böhm
- Children's Cancer Institute, School of Women's and Children's Health, UNSW Sydney, Sydney, NSW, Australia
| | - Glenn M Marshall
- Children's Cancer Institute, School of Women's and Children's Health, UNSW Sydney, Sydney, NSW, Australia; Sydney Children's Hospital, Sydney, NSW, Australia
| | | | - Karthik Kalletla
- Department of Bioinformatics and Health Informatics, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA; Center for Data-Driven Discovery in Biomedicine, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
| | - Frank K Braun
- Texas Children's Cancer and Hematology Center, Department of Pediatrics, Baylor College of Medicine, Houston, TX 77030, USA
| | - Lin Qi
- Preclinical Neurooncology Research Program, Texas Children's Cancer Research Center, Texas Children's Hospital, Houston, TX 77030, USA; Department of Pediatrics, Baylor College of Medicine, Houston, TX 77030, USA
| | - Yunchen Du
- Preclinical Neurooncology Research Program, Texas Children's Cancer Research Center, Texas Children's Hospital, Houston, TX 77030, USA; Department of Pediatrics, Baylor College of Medicine, Houston, TX 77030, USA
| | - Huiyuan Zhang
- Preclinical Neurooncology Research Program, Texas Children's Cancer Research Center, Texas Children's Hospital, Houston, TX 77030, USA; Department of Pediatrics, Baylor College of Medicine, Houston, TX 77030, USA
| | - Holly B Lindsay
- Preclinical Neurooncology Research Program, Texas Children's Cancer Research Center, Texas Children's Hospital, Houston, TX 77030, USA; Department of Pediatrics, Baylor College of Medicine, Houston, TX 77030, USA
| | - Sibo Zhao
- Preclinical Neurooncology Research Program, Texas Children's Cancer Research Center, Texas Children's Hospital, Houston, TX 77030, USA; Department of Pediatrics, Baylor College of Medicine, Houston, TX 77030, USA
| | - Jack Shu
- Preclinical Neurooncology Research Program, Texas Children's Cancer Research Center, Texas Children's Hospital, Houston, TX 77030, USA; Department of Pediatrics, Baylor College of Medicine, Houston, TX 77030, USA
| | - Patricia Baxter
- Preclinical Neurooncology Research Program, Texas Children's Cancer Research Center, Texas Children's Hospital, Houston, TX 77030, USA; Department of Pediatrics, Baylor College of Medicine, Houston, TX 77030, USA
| | - Christopher Morton
- Department of Surgery, St. Jude Children's Research Hospital, Memphis, TN 38105, USA
| | - Dias Kurmashev
- Greehey Children's Cancer Research Institute, University of Texas Health Science Center, San Antonio, TX 78229, USA
| | - Siyuan Zheng
- Greehey Children's Cancer Research Institute, University of Texas Health Science Center, San Antonio, TX 78229, USA
| | - Yidong Chen
- Greehey Children's Cancer Research Institute, University of Texas Health Science Center, San Antonio, TX 78229, USA
| | - Jay Bowen
- The Research Institute at Nationwide Children's Hospital, Columbus, OH 43205, USA
| | - Anthony C Bryan
- The Research Institute at Nationwide Children's Hospital, Columbus, OH 43205, USA
| | - Kristen M Leraas
- The Research Institute at Nationwide Children's Hospital, Columbus, OH 43205, USA
| | - Sara E Coppens
- The Research Institute at Nationwide Children's Hospital, Columbus, OH 43205, USA
| | | | - Zeineen Momin
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX 77030, USA
| | - Wendong Zhang
- Division of Pediatrics, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Gregory I Sacks
- Division of Oncology, Children's Hospital of Philadelphia, and Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA 19104-4318, USA
| | - Lori S Hart
- Division of Oncology, Children's Hospital of Philadelphia, and Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA 19104-4318, USA
| | - Kateryna Krytska
- Division of Oncology, Children's Hospital of Philadelphia, and Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA 19104-4318, USA
| | - Yael P Mosse
- Division of Oncology, Children's Hospital of Philadelphia, and Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA 19104-4318, USA
| | - Gregory J Gatto
- Department of Global Health Technologies, RTI International, Research Triangle Park, NC 27709, USA
| | - Yolanda Sanchez
- Department of Molecular and Systems Biology, Geisel School of Medicine at Dartmouth, Hanover, NH 03755, USA; Norris Cotton Cancer Center, Lebanon, NH 03766, USA
| | - Casey S Greene
- Childhood Cancer Data Lab, Alex's Lemonade Stand Foundation, Philadelphia, PA 19102, USA; Department of Systems Pharmacology and Translational Therapeutics, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Sharon J Diskin
- Division of Oncology, Children's Hospital of Philadelphia, and Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA 19104-4318, USA; Department of Bioinformatics and Health Informatics, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
| | - Olena Morozova Vaske
- Department of Molecular, Cell and Developmental Biology, University of California, Santa Cruz, Santa Cruz, CA 95064, USA; UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA 95064, USA
| | - David Haussler
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA 95064, USA; Howard Hughes Medical Institute, University of California, Santa Cruz, Santa Cruz, CA 95064, USA
| | - Julie M Gastier-Foster
- The Research Institute at Nationwide Children's Hospital, Columbus, OH 43205, USA; The Ohio State University College of Medicine, Departments of Pathology and Pediatrics, Columbus, OH 43210, USA
| | - E Anders Kolb
- Department of Pediatrics, Sidney Kimmel Medical College at Thomas Jefferson University, Philadelphia, PA 19107, USA; Nemours Center for Cancer and Blood Disorders, Nemours/Alfred I. duPont Hospital for Children, Wilmington, DE 19803, USA
| | - Richard Gorlick
- Division of Pediatrics, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Xiao-Nan Li
- Preclinical Neurooncology Research Program, Texas Children's Cancer Research Center, Texas Children's Hospital, Houston, TX 77030, USA; Department of Pediatrics, Baylor College of Medicine, Houston, TX 77030, USA; Division of Hematology, Oncology, Neuro-oncology and Stem Cell Transplant, Ann & Robert H. Lurie Children's Hospital of Chicago, Chicago, IL 60611, USA; Department of Pediatrics, Northwestern University Feinberg School of Medicine, Chicago, IL 60611, USA
| | - C Patrick Reynolds
- Cancer Center, Texas Tech University Health Sciences Center School of Medicine, Lubbock, TX 79430, USA
| | - Raushan T Kurmasheva
- Greehey Children's Cancer Research Institute, University of Texas Health Science Center, San Antonio, TX 78229, USA
| | - Peter J Houghton
- Greehey Children's Cancer Research Institute, University of Texas Health Science Center, San Antonio, TX 78229, USA
| | | | | | - Pichai Raman
- Department of Bioinformatics and Health Informatics, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA; Center for Data-Driven Discovery in Biomedicine, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
| | - David A Wheeler
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX 77030, USA
| | - John M Maris
- Division of Oncology, Children's Hospital of Philadelphia, and Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA 19104-4318, USA.
|
80
|
MLW-gcForest: A Multi-Weighted gcForest Model for Cancer Subtype Classification by Methylation Data. APPLIED SCIENCES-BASEL 2019. [DOI: 10.3390/app9173589] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
Abstract
Effective cancer treatment requires a clearly identified subtype. Because cancer gene data have small sample sizes, high dimensionality, and class imbalances, classifying cancer subtypes with traditional machine learning methods remains challenging. The gcForest algorithm is a combination of machine learning methods and a deep neural network and has been reported to achieve better classification of small samples of data. However, the gcForest algorithm still faces many challenges when it is applied to the classification of cancer subtypes. In this paper, we propose an improved gcForest algorithm (MLW-gcForest) to study the applicability of this method to the small sample sizes, high dimensionality, and class imbalances of genetic data. The main contributions of this algorithm are as follows: (1) Different weights are assigned to different random forests according to the classification ability of the forests. (2) We propose a sorting optimization algorithm that assigns different weights to the feature vectors generated under different sliding windows. The MLW-gcForest model is trained on the methylation data of five data sets from The Cancer Genome Atlas (TCGA). The experimental results show that the MLW-gcForest algorithm achieves high accuracy and area under the curve (AUC) values for the classification of cancer subtypes compared with those of traditional machine learning methods and state-of-the-art methods. The results also show that methylation data can be effectively used to diagnose cancer.
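Contribution (1) above, weighting each random forest by its classification ability, can be illustrated with a minimal sketch. The abstract does not give the actual weighting formula, so the version below simply normalizes a per-forest validation score (for example accuracy) into weights; the function name `weighted_forest_vote` and its parameters are hypothetical and not taken from the paper.

```python
from typing import List, Sequence


def weighted_forest_vote(
    forest_probs: Sequence[Sequence[float]],
    forest_scores: Sequence[float],
) -> List[float]:
    """Combine per-forest class-probability vectors for one sample.

    forest_probs[i] is the class-probability vector from forest i.
    forest_scores[i] is an assumed measure of that forest's
    classification ability (e.g. validation accuracy).
    """
    if len(forest_probs) != len(forest_scores):
        raise ValueError("need exactly one score per forest")
    total = sum(forest_scores)
    if total <= 0:
        raise ValueError("scores must sum to a positive value")

    # Assumption: weights are the scores normalized to sum to 1.
    weights = [score / total for score in forest_scores]

    n_classes = len(forest_probs[0])
    combined = [0.0] * n_classes
    for weight, probs in zip(weights, forest_probs):
        for k, p in enumerate(probs):
            combined[k] += weight * p
    return combined


# Illustrative values only: the stronger forest (score 0.9) dominates the vote.
example = weighted_forest_vote([[0.9, 0.1], [0.2, 0.8]], [0.9, 0.3])
```

With equal scores this reduces to the plain averaging of forest outputs; unequal scores shift the combined vector toward the better-performing forests.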
|
81
|
Way GP, Greene CS. Discovering Pathway and Cell Type Signatures in Transcriptomic Compendia with Machine Learning. Annu Rev Biomed Data Sci 2019. [DOI: 10.1146/annurev-biodatasci-072018-021348] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]
Abstract
Pathway and cell type signatures are patterns present in transcriptome data that are associated with biological processes or phenotypic consequences. These signatures result from specific cell type and pathway expression but can require large transcriptomic compendia to detect. Machine learning techniques can be powerful tools for signature discovery through their ability to provide accurate and interpretable results. In this review, we discuss various machine learning applications to extract pathway and cell type signatures from transcriptomic compendia. We focus on the biological motivations and interpretation for both supervised and unsupervised learning approaches in this setting. We consider recent advances, including deep learning, and their applications to expanding bulk and single-cell RNA data. As data and computational resources increase, there will be more opportunities for machine learning to aid in revealing biological signatures.
Affiliation(s)
- Gregory P. Way
- Genomics and Computational Biology Graduate Group, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
- Department of Systems Pharmacology and Translational Therapeutics, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
| | - Casey S. Greene
- Department of Systems Pharmacology and Translational Therapeutics, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
|
82
|
Esteban-Medina M, Peña-Chilet M, Loucera C, Dopazo J. Exploring the druggable space around the Fanconi anemia pathway using machine learning and mechanistic models. BMC Bioinformatics 2019; 20:370. [PMID: 31266445 PMCID: PMC6604281 DOI: 10.1186/s12859-019-2969-0] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/06/2019] [Accepted: 06/25/2019] [Indexed: 02/07/2023] Open
Abstract
BACKGROUND In spite of the abundance of genomic data, predictive models that describe phenotypes as a function of gene expression or mutations are difficult to obtain because they are affected by the curse of dimensionality, given the imbalance between samples and candidate genes. This is especially problematic in scenarios in which samples are difficult to obtain, such as rare diseases. RESULTS The application of multi-output regression machine learning methodologies to predict the potential effect of external proteins over the signaling circuits that trigger Fanconi anemia-related cell functionalities, inferred with a mechanistic model, allowed us to detect over 20 potential therapeutic targets. CONCLUSIONS The use of artificial intelligence methods for the prediction of potentially causal relationships between proteins of interest and cell activities related to disease phenotypes opens promising avenues for the systematic search for new targets in rare diseases.
Affiliation(s)
- Marina Esteban-Medina
- Clinical Bioinformatics Area. Fundación Progreso y Salud (FPS). CDCA, Hospital Virgen del Rocio, 41013 Sevilla, Spain
| | - María Peña-Chilet
- Clinical Bioinformatics Area. Fundación Progreso y Salud (FPS). CDCA, Hospital Virgen del Rocio, 41013 Sevilla, Spain
- Bioinformatics in Rare Diseases (BiER). Centro de Investigación Biomédica en Red de Enfermedades Raras (CIBERER), FPS, Hospital Virgen del Rocío, 41013 Sevilla, Spain
| | - Carlos Loucera
- Clinical Bioinformatics Area. Fundación Progreso y Salud (FPS). CDCA, Hospital Virgen del Rocio, 41013 Sevilla, Spain
| | - Joaquín Dopazo
- Clinical Bioinformatics Area. Fundación Progreso y Salud (FPS). CDCA, Hospital Virgen del Rocio, 41013 Sevilla, Spain
- Bioinformatics in Rare Diseases (BiER). Centro de Investigación Biomédica en Red de Enfermedades Raras (CIBERER), FPS, Hospital Virgen del Rocío, 41013 Sevilla, Spain
- INB-ELIXIR-es, FPS, Hospital Virgen del Rocío, 42013 Sevilla, Spain
|
83
|
Li F, Wu T, Xu Y, Dong Q, Xiao J, Xu Y, Li Q, Zhang C, Gao J, Liu L, Hu X, Huang J, Li X, Zhang Y. A comprehensive overview of oncogenic pathways in human cancer. Brief Bioinform 2019; 21:957-969. [DOI: 10.1093/bib/bbz046] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/10/2018] [Revised: 03/02/2019] [Accepted: 03/31/2019] [Indexed: 12/22/2022] Open
Abstract
Alterations of biological pathways can lead to oncogenesis. An overview of these oncogenic pathways would be highly valuable for researchers to reveal the pathogenic mechanism and develop novel therapeutic approaches for cancers. Here, we reviewed approximately 8500 publications and documented experimentally validated cancer-pathway associations as a benchmarking data set. This data resource includes 4709 manually curated relationships between 1557 paths and 49 cancers with 2427 upstream regulators in 7 species. Based on this resource, we first summarized the cancer-pathway associations and revealed some commonly deregulated pathways across tumor types. Then, we systematically analyzed these oncogenic pathways by integrating TCGA pan-cancer data sets. Multi-omics analysis showed that oncogenic pathways may play different roles across tumor types under different omics contexts. We also charted the survival relevance landscape of oncogenic pathways in 26 tumor types, identified dominant omics features and found that the survival relevance of oncogenic pathways varied across tumor types and omics levels. Moreover, we predicted upstream regulators and constructed a hierarchical network model to understand the pathogenic mechanism of human cancers in the context of oncogenic pathways. Finally, we developed 'CPAD' (freely available at http://bio-bigdata.hrbmu.edu.cn/CPAD/), an online resource for exploring oncogenic pathways in human cancers that integrates manually curated cancer-pathway associations, TCGA pan-cancer multi-omics data sets, drug–target data, drug sensitivity and multi-omics data for cancer cell lines. In summary, our study provides a comprehensive characterization of oncogenic pathways and also presents a valuable resource for investigating the pathogenesis of human cancer.
Affiliation(s)
- Feng Li
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
| | - Tan Wu
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
| | - Yanjun Xu
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
| | - Qun Dong
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
| | - Jing Xiao
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
| | - Yingqi Xu
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
| | - Qian Li
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
| | - Chunlong Zhang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
| | - Jianxia Gao
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
| | - Liqiu Liu
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
| | - Xiaoxu Hu
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
| | - Jian Huang
- Center for Informational Biology, University of Electronic Science and Technology of China
| | - Xia Li
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
| | - Yunpeng Zhang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
|
84
|
Goldie SJ, Chincarini G, Darido C. Targeted Therapy Against the Cell of Origin in Cutaneous Squamous Cell Carcinoma. Int J Mol Sci 2019; 20:ijms20092201. [PMID: 31060263 PMCID: PMC6539622 DOI: 10.3390/ijms20092201] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/17/2019] [Revised: 04/29/2019] [Accepted: 04/29/2019] [Indexed: 01/03/2023] Open
Abstract
Squamous cell carcinomas (SCC), including cutaneous SCCs, are by far the most frequent cancers in humans, accounting for 80% of all newly diagnosed malignancies worldwide. The old dogma that SCC develops exclusively from stem cells (SC) has now changed to include progenitors, transit-amplifying and differentiated short-lived cells. Accumulation of specific oncogenic mutations is required to induce SCC from each cell population. Whilst as few as one genetic hit is sufficient to induce SCC from a SC, multiple additional events are required in more differentiated cells. Interestingly, the level of differentiation correlates with the number of transforming events required to induce a stem-like phenotype, a long-lived potential and a tumourigenic capacity in a progenitor, a transit-amplifying or even a terminally differentiated cell. Furthermore, it is well described that SCCs originating from different cells of origin differ not only in their squamous differentiation status but also in their malignant characteristics. This review summarises recent findings in cutaneous SCC and highlights transforming oncogenic events in specific cell populations. It underlines oncogenes that are restricted either to stem or differentiated cells, which could provide therapeutic target selectivity against heterogeneous SCC. This strategy may be applicable to SCC from different body locations, such as head and neck SCCs, which are currently still associated with poor survival outcomes.
Affiliation(s)
- Stephen J Goldie
- College of Medicine and Public Health, Flinders University, Adelaide, SA 5001, Australia.
| | - Ginevra Chincarini
- Peter MacCallum Cancer Centre, 305 Grattan St, Melbourne, VIC 3000, Australia.
| | - Charbel Darido
- Peter MacCallum Cancer Centre, 305 Grattan St, Melbourne, VIC 3000, Australia.
- Sir Peter MacCallum Department of Oncology, The University of Melbourne, Parkville, VIC 3010, Australia.
|
85
|
Jeyaraj PR, Samuel Nadar ER. Computer-assisted medical image classification for early diagnosis of oral cancer employing deep learning algorithm. J Cancer Res Clin Oncol 2019; 145:829-837. [PMID: 30603908 DOI: 10.1007/s00432-018-02834-7] [Citation(s) in RCA: 112] [Impact Index Per Article: 18.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/04/2018] [Accepted: 12/24/2018] [Indexed: 02/07/2023]
Abstract
PURPOSE Oral cancer is a complex, widespread cancer of high severity. Advanced technology and deep learning algorithms make early detection and classification possible. Medical imaging techniques and computer-aided diagnosis and detection can bring meaningful changes to cancer treatment. In this research work, we have developed a deep learning algorithm for an automated, computer-aided oral cancer detection system by investigating patient hyperspectral images. METHODS To validate the proposed regression-based partitioned deep learning algorithm, we compare its performance with other techniques in terms of classification accuracy, specificity, and sensitivity. For accurate medical image classification, we demonstrate a new structure of partitioned deep Convolutional Neural Network (CNN) with two partitioned layers for labeling and classifying regions of interest in multidimensional hyperspectral images. RESULTS The performance of the partitioned deep CNN was verified by classification accuracy. For the task of classifying cancerous versus benign tumors, we obtained a classification accuracy of 91.4%, with a sensitivity of 0.94 and a specificity of 0.91, using 100 training image data sets; for the task of classifying cancerous tumors versus normal tissue, we obtained an accuracy of 94.5% using 500 training patterns. CONCLUSIONS We compared the obtained results with those of another traditional medical image classification algorithm. The results indicate that the proposed regression-based partitioned CNN learning algorithm improves the quality of diagnosis for complex medical images of oral cancer.
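The three evaluation measures named in the abstract (classification accuracy, sensitivity, specificity) follow directly from the counts of a binary confusion matrix. The sketch below shows those standard definitions; the function name `classification_metrics` is hypothetical and the example counts are invented for illustration, not data from the study.

```python
from typing import Dict


def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Compute accuracy, sensitivity and specificity from confusion counts.

    tp/fn count the positive (e.g. cancerous) cases that were detected/missed;
    tn/fp count the negative cases that were correctly/incorrectly labeled.
    """
    total = tp + fp + tn + fn
    if total == 0 or (tp + fn) == 0 or (tn + fp) == 0:
        raise ValueError("need at least one positive and one negative case")
    return {
        "accuracy": (tp + tn) / total,      # fraction of all cases labeled correctly
        "sensitivity": tp / (tp + fn),      # true positive rate
        "specificity": tn / (tn + fp),      # true negative rate
    }


# Invented counts for illustration: 50 positive and 50 negative cases.
metrics = classification_metrics(tp=47, fp=5, tn=45, fn=3)
```

Reporting all three together matters for diagnostic tasks, because accuracy alone can look high on imbalanced data while sensitivity stays poor.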
Affiliation(s)
- Pandia Rajan Jeyaraj
- Department of Electrical and Electronics Engineering, Mepco Schlenk Engineering College (Autonomous), Sivakasi, Tamil Nadu, India.
| | - Edward Rajan Samuel Nadar
- Department of Electrical and Electronics Engineering, Mepco Schlenk Engineering College (Autonomous), Sivakasi, Tamil Nadu, India
|
86
|
Cava C, Castiglioni I. In silico perturbation of drug targets in pan-cancer analysis combining multiple networks and pathways. Gene 2019; 698:100-106. [PMID: 30840853 DOI: 10.1016/j.gene.2019.02.064] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/26/2018] [Revised: 02/13/2019] [Accepted: 02/23/2019] [Indexed: 12/13/2022]
Abstract
Knowledge of how cancer cells respond to conventional therapies is crucial for choosing the correct therapy for patients affected by cancer. The major problem is generally attributed to the lack of specific biological processes able to predict therapy efficacy. Here, we optimized a computational method for the analysis of gene networks able to detect and quantify the effects of a drug in a pan-cancer study. Overall, our method, using several network topological measures, has identified a cancer gene network with a key role in biological processes. The gene network, able to classify cancer vs normal samples with good performance, was modulated in silico to evaluate the effects of new or approved drugs. This computational model could offer an interesting hint for deciphering molecular mechanisms contributing to resistance or inefficacy of drugs.
Affiliation(s)
- Claudia Cava
- Institute of Molecular Bioimaging and Physiology, National Research Council (IBFM-CNR), Via F.Cervi 93, 20090 Segrate, Milan, Italy.
| | - Isabella Castiglioni
- Institute of Molecular Bioimaging and Physiology, National Research Council (IBFM-CNR), Via F.Cervi 93, 20090 Segrate, Milan, Italy.
|
87
|
van de Stolpe A. Quantitative Measurement of Functional Activity of the PI3K Signaling Pathway in Cancer. Cancers (Basel) 2019; 11:E293. [PMID: 30832253 PMCID: PMC6468721 DOI: 10.3390/cancers11030293] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/16/2019] [Revised: 02/14/2019] [Accepted: 02/14/2019] [Indexed: 12/12/2022] Open
Abstract
The phosphoinositide 3-kinase (PI3K) growth factor signaling pathway plays an important role in embryonic development and in many physiological processes, for example the generation of an immune response. The pathway is frequently activated in cancer, driving cell division and influencing the activity of other signaling pathways, such as the MAPK, JAK-STAT and TGFβ pathways, to enhance tumor growth, metastasis, and therapy resistance. Drugs that inhibit the pathway at various locations, e.g., receptor tyrosine kinase (RTK), PI3K, AKT and mTOR inhibitors, are clinically available. To predict drug response versus resistance, tests that measure PI3K pathway activity in a patient sample, preferably in combination with measuring the activity of other signaling pathways to identify potential resistance pathways, are needed. However, tests for signaling pathway activity are lacking, hampering optimal clinical application of these drugs. We recently reported the development and biological validation of a test that provides a quantitative PI3K pathway activity score for individual cell and tissue samples across cancer types, based on measuring Forkhead Box O (FOXO) transcription factor target gene mRNA levels in combination with a Bayesian computational interpretation model. A similar approach has been used to develop tests for other signaling pathways (e.g., estrogen and androgen receptor, Hedgehog, TGFβ, Wnt and NFκB pathways). The potential utility of the test is discussed, e.g., to predict response and resistance to targeted drugs, immunotherapy, radiation and chemotherapy, as well as (pre-) clinical research and drug development.
Affiliation(s)
- Anja van de Stolpe
- Precision Diagnostics, Philips Research, High Tech Campus, 5656AE Eindhoven, The Netherlands.
|
88
|
The next generation personalized models to screen hidden layers of breast cancer tumorigenicity. Breast Cancer Res Treat 2019; 175:277-286. [PMID: 30810866 DOI: 10.1007/s10549-019-05159-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/05/2018] [Accepted: 02/05/2019] [Indexed: 10/27/2022]
Abstract
BACKGROUND Breast cancer (BC) is a challenging disease and a major cause of death amongst women worldwide, who die due to tumor relapse or sidelong diseases. The main complexity of BC comes from the heterogeneous nature of breast tumors, which demands customized treatments in the form of personalized medicine. REVIEW OF THE LITERATURE AND DISCUSSION The spatiotemporally dynamic and heterogeneous nature of BC tumors is shaped by their clonal evolution and sub-clonal selections, and it shapes resistance to collective or group therapies, which drives cancer recurrence and tumor metastasis. Personalized intervention promises to administer medications that selectively target each individual patient's tumor and, further, each colonized secondary tumor. Such personalized regimens will require the creation of in vitro and in vivo models that genuinely recapitulate the characteristics of each tumor type as initiating platforms for two main purposes: to closely monitor the tumorigenic processes that shape tumor heterogeneity and evolution as the main driving forces behind tumor chemo-resistance and relapse, and subsequently to establish patient-specific preventive and therapeutic measures. While the application of tumor modeling to personalized drug screening and design requires a separate review, here we discuss the personalized utilities of xenograft modeling in investigating BC tumor formation and progression toward metastasis. We will further elaborate on the impact of innovative technologies on personalized modeling of BC tumorigenicity at improved resolution. CONCLUSION The heterogeneous nature of each BC tumor requires personalized intervention, implying that modeling breast tumors is inevitable for better disease understanding, detection and cure. Patient-derived xenografts are just the initiating piece of the puzzle for ideal management of breast cancer. Emerging technologies promise to model BC in a more personalized way than before.
|
89
|
Precision medicine review: rare driver mutations and their biophysical classification. Biophys Rev 2019; 11:5-19. [PMID: 30610579 PMCID: PMC6381362 DOI: 10.1007/s12551-018-0496-2] [Citation(s) in RCA: 34] [Impact Index Per Article: 5.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/04/2018] [Accepted: 12/18/2018] [Indexed: 02/07/2023] Open
Abstract
How can biophysical principles help precision medicine identify rare driver mutations? A major tenet of pragmatic approaches to precision oncology and pharmacology is that driver mutations are very frequent. However, frequency is a statistical attribute, not a mechanistic one. Rare mutations can also act through the same mechanism, and as we discuss below, “latent driver” mutations may also follow the same route, with “helper” mutations. Here, we review how biophysics provides mechanistic guidelines that extend precision medicine. We outline principles and strategies, especially focusing on mutations that drive cancer. Biophysics has contributed profoundly to deciphering biological processes. However, driven by data science, precision medicine has skirted some of its major tenets. Data science embodies genomics, tissue- and cell-specific expression levels, making it capable of defining genome- and systems-wide molecular disease signatures. It classifies cancer driver genes/mutations and affected pathways, and its associated protein structural data guide drug discovery. Biophysics complements data science. It considers structures and their heterogeneous ensembles, explains how mutational variants can signal through distinct pathways, and how allo-network drugs can be harnessed. Biophysics clarifies how one mutation, frequent or rare, can affect multiple phenotypic traits by populating conformations that favor interactions with other network modules. It also suggests how to identify such mutations and their signaling consequences. Biophysics offers principles and strategies that can help precision medicine push the boundaries to transform our insight into biological processes and the practice of personalized medicine. By contrast, “phenotypic drug discovery,” which capitalizes on physiological cellular conditions and first-in-class drug discovery, may not capture the proper molecular variant. This is because variants of the same protein can express more than one phenotype, and a phenotype can be encoded by several variants.
|
90
|
Pudlarz T, Naoun N, Beinse G, Grazziotin-Soares D, Lotz JP. AACR 2019 — Congrès de l’association américaine de recherche contre le cancer. ONCOLOGIE 2019. [DOI: 10.3166/onco-2019-0036] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/20/2022]
Abstract
This special issue of the journal Oncologie reports the main points discussed at the congress of the American Association for Cancer Research (AACR). The aim here is to present concisely the talks that deserve particular attention. The program of this year's AACR meeting, held in Atlanta, covered the latest discoveries across the whole spectrum of cancer research (from population science to prevention; cancer biology, translational and clinical studies; survivorship and advocacy) and highlighted the work of the best minds in research and medicine from institutions around the world. The five-day congress offered a multidisciplinary program covering all aspects of cancer research, from its fundamental bases to its translational and clinical applications. Thanks to our improved understanding of the molecular basis of cancer, many new targeted therapies have emerged. Likewise, our understanding of how tumors evade attack by the immune system has led to the development of new therapies. Given the growing importance of immunotherapy in cancer treatment, we present here the latest advances in this field. Finally, other approaches, such as the study of the microbiome, epigenetics and artificial intelligence as a tool in cancer research, were also discussed at the AACR 2019 congress.
|
91
|
Association of specific gene mutations derived from machine learning with survival in lung adenocarcinoma. PLoS One 2018; 13:e0207204. [PMID: 30419062 PMCID: PMC6231670 DOI: 10.1371/journal.pone.0207204] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/16/2018] [Accepted: 10/27/2018] [Indexed: 12/20/2022] Open
Abstract
Lung cancer is the second most common cancer in the United States and the leading cause of mortality in cancer patients. Biomarkers predicting survival of patients with lung cancer have a profound effect on patient prognosis and treatment. However, predictive biomarkers for survival and their relevance for lung cancer are not yet well known. The objective of this study was to perform machine learning with data from The Cancer Genome Atlas of patients with lung adenocarcinoma (LUAD) to find survival-specific gene mutations that could be used as survival-predicting biomarkers. To identify survival-specific mutations according to various clinical factors, four feature selection methods (information gain, chi-squared test, minimum redundancy maximum relevance, and correlation) were used. Extracted survival-specific mutations of LUAD were applied individually or as a group for Kaplan-Meier survival analysis. Mutations in MMRN2 and GMPPA were significantly associated with patient mortality while those in ZNF560 and SETX were associated with patient survival. Mutations in DNAJC2 and MMRN2 showed significant negative association with overall survival while mutations in ZNF560 showed significant positive association with overall survival. Mutations in MMRN2 showed significant negative association with disease-free survival while mutations in DRD3 and ZNF560 showed positive association with disease-free survival. Mutations in DRD3, SETX, and ZNF560 showed significant positive association with survival in patients with LUAD while the opposite was true for mutations in DNAJC2, GMPPA, and MMRN2. These gene mutations were also found in other cohorts of LUAD, lung squamous cell carcinoma, and small cell lung cancer. In the LUAD subset of the Pan-Lung Cancer cohort, mutations in GMPPA, DNAJC2, and MMRN2 showed significant negative associations with patient survival while mutations in DRD3 and SETX showed significant positive association with survival. In this study, machine learning was conducted to obtain information necessary to discover specific gene mutations associated with the survival of patients with LUAD. Mutations in the above six genes could predict survival rate and disease-free survival rate in patients with LUAD. Thus, they are important biomarker candidates for prognosis.
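The Kaplan-Meier survival analysis mentioned in the abstract can be sketched in a few lines. This is a generic product-limit estimator, not the authors' pipeline; the function name `kaplan_meier` and the toy follow-up data are assumptions made for illustration. In practice one curve would be computed for patients carrying a given mutation and another for those without it, and the two compared.

```python
from typing import List, Sequence, Tuple


def kaplan_meier(
    times: Sequence[float], events: Sequence[int]
) -> List[Tuple[float, float]]:
    """Product-limit survival estimate.

    times[i] is the follow-up time of patient i; events[i] is 1 if the
    event (death) was observed at that time and 0 if the patient was censored.
    Returns (time, survival probability) at each distinct event time.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve: List[Tuple[float, float]] = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += 1 if data[i][1] else 0
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        # Both deaths and censored patients leave the risk set after time t.
        n_at_risk -= leaving
    return curve


# Toy data (invented): five patients, two of them censored.
curve = kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])
```

Censored patients contribute to the risk set up to their last follow-up but never cause a drop in the curve, which is why the estimator is preferred over a naive fraction of survivors.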
|
92
|
Boosting support vector machines for cancer discrimination tasks. Comput Biol Med 2018; 101:236-249. [DOI: 10.1016/j.compbiomed.2018.08.006] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/27/2018] [Revised: 07/31/2018] [Accepted: 08/04/2018] [Indexed: 01/17/2023]
|
93
|
Grandori C, Kemp CJ. Personalized Cancer Models for Target Discovery and Precision Medicine. Trends Cancer 2018; 4:634-642. [PMID: 30149881 PMCID: PMC6242713 DOI: 10.1016/j.trecan.2018.07.005] [Citation(s) in RCA: 37] [Impact Index Per Article: 5.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/05/2018] [Revised: 07/10/2018] [Accepted: 07/12/2018] [Indexed: 12/13/2022]
Abstract
Although cancer research is progressing at an exponential rate, translating this knowledge to develop better cancer drugs and more effectively match drugs to patients is lagging. Genome profiling of tumors provides a snapshot of the genetic complexity of individual tumors, yet this knowledge is insufficient to guide therapy for most patients. Model systems, usually cancer cell lines or mice, have been instrumental in cancer research and drug development, but translation of results to the clinic is inefficient, in part, because these models do not sufficiently reflect the complexity and heterogeneity of human cancer. Here, we discuss the potential of combining genomics with high-throughput functional testing of patient-derived tumor cells to overcome key roadblocks in both drug target discovery and precision medicine.
|
94
|
De Landtsheer S, Lucarelli P, Sauter T. Using Regularization to Infer Cell Line Specificity in Logical Network Models of Signaling Pathways. Front Physiol 2018; 9:550. [PMID: 29872402 PMCID: PMC5972629 DOI: 10.3389/fphys.2018.00550] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 02/23/2018] [Accepted: 04/30/2018] [Indexed: 11/13/2022]
Abstract
Understanding the functional properties of cells of different origins is a long-standing challenge of personalized medicine. Especially in cancer, the high heterogeneity observed in patients slows down the development of effective cures. The molecular differences between cell types or between healthy and diseased cellular states are usually determined by the wiring of regulatory networks. Understanding these molecular and cellular differences at the systems level would improve patient stratification and facilitate the design of rational intervention strategies. Models of cellular regulatory networks frequently make weak assumptions about the distribution of model parameters across cell types or patients. These assumptions are usually expressed in the form of regularization of the objective function of the optimization problem. We propose a new method of regularization for network models of signaling pathways based on the local density of the inferred parameter values within the parameter space. Our method reduces the complexity of models by creating groups of cell line-specific parameters which can then be optimized together. We demonstrate the use of our method by recovering the correct topology and inferring accurate values of the parameters of a small synthetic model. To show the value of our method in a realistic setting, we re-analyze a recently published phosphoproteomic dataset from a panel of 14 colon cancer cell lines. We conclude that our method efficiently reduces model complexity and helps recover context-specific regulatory information.
Affiliation(s)
- Sébastien De Landtsheer
- Systems Biology Group, Life Sciences Research Unit, University of Luxembourg, Belvaux, Luxembourg
| | - Philippe Lucarelli
- Systems Biology Group, Life Sciences Research Unit, University of Luxembourg, Belvaux, Luxembourg
| | - Thomas Sauter
- Systems Biology Group, Life Sciences Research Unit, University of Luxembourg, Belvaux, Luxembourg
|
95
|
Sanchez-Vega F, Mina M, Armenia J, Chatila WK, Luna A, La KC, Dimitriadoy S, Liu DL, Kantheti HS, Saghafinia S, Chakravarty D, Daian F, Gao Q, Bailey MH, Liang WW, Foltz SM, Shmulevich I, Ding L, Heins Z, Ochoa A, Gross B, Gao J, Zhang H, Kundra R, Kandoth C, Bahceci I, Dervishi L, Dogrusoz U, Zhou W, Shen H, Laird PW, Way GP, Greene CS, Liang H, Xiao Y, Wang C, Iavarone A, Berger AH, Bivona TG, Lazar AJ, Hammer GD, Giordano T, Kwong LN, McArthur G, Huang C, Tward AD, Frederick MJ, McCormick F, Meyerson M, Van Allen EM, Cherniack AD, Ciriello G, Sander C, Schultz N. Oncogenic Signaling Pathways in The Cancer Genome Atlas. Cell 2018; 173:321-337.e10. [PMID: 29625050 PMCID: PMC6070353 DOI: 10.1016/j.cell.2018.03.035] [Citation(s) in RCA: 2069] [Impact Index Per Article: 295.6] [Received: 11/17/2017] [Revised: 02/28/2018] [Accepted: 03/15/2018] [Indexed: 02/08/2023]
Abstract
Genetic alterations in signaling pathways that control cell-cycle progression, apoptosis, and cell growth are common hallmarks of cancer, but the extent, mechanisms, and co-occurrence of alterations in these pathways differ between individual tumors and tumor types. Using mutations, copy-number changes, mRNA expression, gene fusions and DNA methylation in 9,125 tumors profiled by The Cancer Genome Atlas (TCGA), we analyzed the mechanisms and patterns of somatic alterations in ten canonical pathways: cell cycle, Hippo, Myc, Notch, Nrf2, PI-3-Kinase/Akt, RTK-RAS, TGFβ signaling, p53 and β-catenin/Wnt. We charted the detailed landscape of pathway alterations in 33 cancer types, stratified into 64 subtypes, and identified patterns of co-occurrence and mutual exclusivity. Eighty-nine percent of tumors had at least one driver alteration in these pathways, and 57% of tumors had at least one alteration potentially targetable by currently available drugs. Thirty percent of tumors had multiple targetable alterations, indicating opportunities for combination therapy.
Affiliation(s)
- Francisco Sanchez-Vega
- Marie-Josée and Henry R. Kravis Center for Molecular Oncology, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA; Human Oncology and Pathogenesis Program, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
| | - Marco Mina
- Department of Computational Biology, University of Lausanne (UNIL), 1011 Lausanne, Vaud, Switzerland and Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland
| | - Joshua Armenia
- Marie-Josée and Henry R. Kravis Center for Molecular Oncology, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA; Human Oncology and Pathogenesis Program, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
| | - Walid K Chatila
- Marie-Josée and Henry R. Kravis Center for Molecular Oncology, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
| | - Augustin Luna
- cBio Center, Dana-Farber Cancer Institute, Boston, MA; Department of Cell Biology, Harvard Medical School, Boston, MA
| | - Konnor C La
- Marie-Josée and Henry R. Kravis Center for Molecular Oncology, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
| | | | - David L Liu
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA; Cancer Program, Broad Institute of MIT and Harvard, Cambridge, MA, US
| | | | - Sadegh Saghafinia
- Department of Computational Biology, University of Lausanne (UNIL), 1011 Lausanne, Vaud, Switzerland and Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland
| | - Debyani Chakravarty
- Marie-Josée and Henry R. Kravis Center for Molecular Oncology, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
| | - Foysal Daian
- Marie-Josée and Henry R. Kravis Center for Molecular Oncology, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
| | - Qingsong Gao
- Department of Medicine and McDonnell Genome Institute, Washington University in St. Louis, St. Louis, Missouri, 63110, USA
| | - Matthew H Bailey
- Department of Medicine and McDonnell Genome Institute, Washington University in St. Louis, St. Louis, Missouri, 63110, USA
| | - Wen-Wei Liang
- Department of Medicine and McDonnell Genome Institute, Washington University in St. Louis, St. Louis, Missouri, 63110, USA
| | - Steven M Foltz
- Department of Medicine and McDonnell Genome Institute, Washington University in St. Louis, St. Louis, Missouri, 63110, USA
| | | | - Li Ding
- Department of Medicine and McDonnell Genome Institute, Washington University in St. Louis, St. Louis, Missouri, 63110, USA; Siteman Cancer Center, Washington University School of Medicine, St. Louis, MO 63110, USA
| | - Zachary Heins
- Marie-Josée and Henry R. Kravis Center for Molecular Oncology, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
| | - Angelica Ochoa
- Marie-Josée and Henry R. Kravis Center for Molecular Oncology, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
| | - Benjamin Gross
- Marie-Josée and Henry R. Kravis Center for Molecular Oncology, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
| | - Jianjiong Gao
- Marie-Josée and Henry R. Kravis Center for Molecular Oncology, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
| | - Hongxin Zhang
- Marie-Josée and Henry R. Kravis Center for Molecular Oncology, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
| | - Ritika Kundra
- Marie-Josée and Henry R. Kravis Center for Molecular Oncology, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
| | - Cyriac Kandoth
- Marie-Josée and Henry R. Kravis Center for Molecular Oncology, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
| | - Istemi Bahceci
- Computer Engineering Department, Bilkent University, Ankara 06800, Turkey
| | - Leonard Dervishi
- Computer Engineering Department, Bilkent University, Ankara 06800, Turkey
| | - Ugur Dogrusoz
- Computer Engineering Department, Bilkent University, Ankara 06800, Turkey
| | - Wanding Zhou
- Van Andel Research Institute, 333 Bostwick Ave NE, Grand Rapids Michigan, 49503, USA
| | - Hui Shen
- Van Andel Research Institute, 333 Bostwick Ave NE, Grand Rapids Michigan, 49503, USA
| | - Peter W Laird
- Van Andel Research Institute, 333 Bostwick Ave NE, Grand Rapids Michigan, 49503, USA
| | - Gregory P Way
- Department of Systems Pharmacology and Translational Therapeutics, University of Pennsylvania, Philadelphia, PA, 19104, USA
| | - Casey S Greene
- Department of Systems Pharmacology and Translational Therapeutics, University of Pennsylvania, Philadelphia, PA, 19104, USA
| | - Han Liang
- Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX, 77030, USA
| | | | - Chen Wang
- Department of Health Sciences Research and Department of Obstetrics and Gynecology, Mayo Clinic College of Medicine, 200 First Street SW, Rochester, MN, 55905, USA
| | - Antonio Iavarone
- Institute for Cancer Genetics, Department of Neurology and Department of Pathology and Cell Biology, Columbia University Medical Center, New York, NY, 10032, USA
| | - Alice H Berger
- Human Biology Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA
| | - Trever G Bivona
- UCSF Helen Diller Family Comprehensive Cancer Center, University of California San Francisco, 1450 3rd Street, San Francisco, California 94143, USA
| | - Alexander J Lazar
- Departments of Pathology, Genomic Medicine & Translational Molecular Pathology, The University of Texas MD Anderson Cancer Center, 1515 Holcombe Blvd-Unit 85, Houston, Texas 77030, USA
| | - Gary D Hammer
- Department of Internal Medicine, Division of Metabolism, Endocrinology and Diabetes, Endocrine Oncology Program, University of Michigan, Ann Arbor, Michigan, MI 48105, USA
| | - Thomas Giordano
- Department of Pathology, University of Michigan Medical School, Ann Arbor, MI; Department of Internal Medicine, Division of Metabolism, Endocrinology & Diabetes, University of Michigan Medical School, Ann Arbor, MI; Comprehensive Cancer Center, Michigan Medicine, Ann Arbor, MI, USA
| | - Lawrence N Kwong
- Department of Translational Molecular Pathology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
| | - Grant McArthur
- Peter MacCallum Cancer Centre, Melbourne, VIC, Australia; University of Melbourne, Melbourne, VIC, Australia
| | - Chenfei Huang
- Dept. of Otolaryngology, Baylor College of Medicine, USA
| | - Aaron D Tward
- University of California, San Francisco Department of Otolaryngology-Head and Neck Surgery. 2233 Post Street, San Francisco, CA, 94143, USA
| | | | - Frank McCormick
- UCSF Helen Diller Family Comprehensive Cancer Center, University of California San Francisco, 1450 3rd Street, San Francisco, CA 94143, USA
| | - Matthew Meyerson
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA; Cancer Program, Broad Institute of MIT and Harvard, Cambridge, MA, US
| | - Eliezer M Van Allen
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA; Cancer Program, Broad Institute of MIT and Harvard, Cambridge, MA, US
| | - Andrew D Cherniack
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA; Cancer Program, Broad Institute of MIT and Harvard, Cambridge, MA, US
| | - Giovanni Ciriello
- Department of Computational Biology, University of Lausanne (UNIL), 1011 Lausanne, Vaud, Switzerland and Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland.
| | - Chris Sander
- cBio Center, Dana-Farber Cancer Institute, Boston, MA; Department of Cell Biology, Harvard Medical School, Boston, MA.
| | - Nikolaus Schultz
- Marie-Josée and Henry R. Kravis Center for Molecular Oncology, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA; Human Oncology and Pathogenesis Program, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA; Departments of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA.
|
96
|
Knijnenburg TA, Wang L, Zimmermann MT, Chambwe N, Gao GF, Cherniack AD, Fan H, Shen H, Way GP, Greene CS, Liu Y, Akbani R, Feng B, Donehower LA, Miller C, Shen Y, Karimi M, Chen H, Kim P, Jia P, Shinbrot E, Zhang S, Liu J, Hu H, Bailey MH, Yau C, Wolf D, Zhao Z, Weinstein JN, Li L, Ding L, Mills GB, Laird PW, Wheeler DA, Shmulevich I, Monnat RJ, Xiao Y, Wang C. Genomic and Molecular Landscape of DNA Damage Repair Deficiency across The Cancer Genome Atlas. Cell Rep 2018; 23:239-254.e6. [PMID: 29617664 PMCID: PMC5961503 DOI: 10.1016/j.celrep.2018.03.076] [Citation(s) in RCA: 760] [Impact Index Per Article: 108.6] [Received: 08/01/2017] [Revised: 03/07/2018] [Accepted: 03/19/2018] [Indexed: 12/20/2022]
Abstract
DNA damage repair (DDR) pathways modulate cancer risk, progression, and therapeutic response. We systematically analyzed somatic alterations to provide a comprehensive view of DDR deficiency across 33 cancer types. Mutations with accompanying loss of heterozygosity were observed in over 1/3 of DDR genes, including TP53 and BRCA1/2. Other prevalent alterations included epigenetic silencing of the direct repair genes EXO5, MGMT, and ALKBH3 in ∼20% of samples. Homologous recombination deficiency (HRD) was present at varying frequency in many cancer types, most notably ovarian cancer. However, in contrast to ovarian cancer, HRD was associated with worse outcomes in several other cancers. Protein structure-based analyses allowed us to predict functional consequences of rare, recurrent DDR mutations. A new machine-learning-based classifier developed from gene expression data allowed us to identify alterations that phenocopy deleterious TP53 mutations. These frequent DDR gene alterations in many human cancers have functional consequences that may determine cancer progression and guide therapy.
Affiliation(s)
| | - Linghua Wang
- Department of Genomic Medicine, Division of Cancer Medicine, University of Texas MD Anderson Cancer Center, Houston, TX 77054, USA; Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX 77030, USA
| | - Michael T Zimmermann
- Genomic Sciences and Precision Medicine Center, Medical College of Wisconsin, 8701 Watertown Plank Road, Milwaukee, WI 53226-0509, USA; Department of Health Sciences Research, Mayo Clinic College of Medicine, 200 First Street SW, Rochester, MN 55905, USA
| | | | - Galen F Gao
- The Eli and Edythe L. Broad Institute of Massachusetts Institute of Technology and Harvard University, Cambridge, MA 02142, USA
| | - Andrew D Cherniack
- The Eli and Edythe L. Broad Institute of Massachusetts Institute of Technology and Harvard University, Cambridge, MA 02142, USA
| | - Huihui Fan
- Center for Epigenetics, Van Andel Research Institute, Grand Rapids, MI 49503, USA
| | - Hui Shen
- Center for Epigenetics, Van Andel Research Institute, Grand Rapids, MI 49503, USA
| | - Gregory P Way
- Department of Systems Pharmacology and Translational Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19103, USA
| | - Casey S Greene
- Department of Systems Pharmacology and Translational Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19103, USA
| | - Yuexin Liu
- Department of Bioinformatics and Computational Biology, University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Rehan Akbani
- Department of Bioinformatics and Computational Biology, University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Bin Feng
- TESARO Inc., Waltham, MA 02451, USA
| | - Lawrence A Donehower
- Department of Molecular Virology and Microbiology, Baylor College of Medicine, Houston, TX 77030, USA
| | - Chase Miller
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX 77030, USA
| | - Yang Shen
- Department of Electrical and Computer Engineering, 3128 TAMU, Texas A&M University, College Station, TX 77843, USA
| | - Mostafa Karimi
- Department of Electrical and Computer Engineering, 3128 TAMU, Texas A&M University, College Station, TX 77843, USA
| | - Haoran Chen
- Department of Electrical and Computer Engineering, 3128 TAMU, Texas A&M University, College Station, TX 77843, USA
| | - Pora Kim
- Center for Precision Health, School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX 77030, USA
| | - Peilin Jia
- Center for Precision Health, School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX 77030, USA
| | - Eve Shinbrot
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX 77030, USA
| | - Shaojun Zhang
- Department of Genomic Medicine, Division of Cancer Medicine, University of Texas MD Anderson Cancer Center, Houston, TX 77054, USA
| | - Jianfang Liu
- Chan Soon-Shiong Institute of Molecular Medicine at Windber, Windber, PA 15963, USA
| | - Hai Hu
- Chan Soon-Shiong Institute of Molecular Medicine at Windber, Windber, PA 15963, USA
| | - Matthew H Bailey
- Division of Oncology, Department of Medicine, Washington University, St. Louis, MO 63110, USA; McDonnell Genome Institute, Washington University, St. Louis, MO 63110, USA
| | - Christina Yau
- University of California, San Francisco, San Francisco, CA 94115, USA; Buck Institute for Research on Aging, Novato, CA 94945, USA
| | - Denise Wolf
- University of California, San Francisco, San Francisco, CA 94115, USA
| | - Zhongming Zhao
- Center for Precision Health, School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX 77030, USA
| | - John N Weinstein
- Department of Bioinformatics and Computational Biology, University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Lei Li
- Department of Experimental Radiation Oncology, University of Texas MD Anderson Cancer, Houston, TX 77030, USA
| | - Li Ding
- Division of Oncology, Department of Medicine, Washington University, St. Louis, MO 63110, USA; McDonnell Genome Institute, Washington University, St. Louis, MO 63110, USA; Department of Genetics, Washington University, St. Louis, MO 63110, USA; Siteman Cancer Center, Washington University, St. Louis, MO 63110, USA
| | - Gordon B Mills
- Department of Systems Biology, University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Peter W Laird
- Center for Epigenetics, Van Andel Research Institute, Grand Rapids, MI 49503, USA
| | - David A Wheeler
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX 77030, USA
| | | | - Raymond J Monnat
- Departments of Pathology & Genome Sciences, University of Washington, Seattle, WA 98195-7705, USA.
| | | | - Chen Wang
- Department of Health Sciences Research, Mayo Clinic College of Medicine, 200 First Street SW, Rochester, MN 55905, USA; Department of Obstetrics and Gynecology, Mayo Clinic College of Medicine, 200 First Street SW, Rochester, MN 55905, USA.
|