1. Brativnyk A, Ankill J, Helland Å, Fleischer T. Multi-omics analysis reveals epigenetically regulated processes and patient classification in lung adenocarcinoma. Int J Cancer 2024. [PMID: 38489486 DOI: 10.1002/ijc.34915] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/01/2023] [Revised: 12/27/2023] [Accepted: 01/24/2024] [Indexed: 03/17/2024]
Abstract
Aberrant DNA methylation is a hallmark of many cancer types. Despite our knowledge of epigenetic and transcriptomic alterations in lung adenocarcinoma (LUAD), we lack robust multi-modal molecular classifications for patient stratification. This is partly because the impact of epigenetic alterations on lung cancer development and progression is still not fully understood. To that end, we identified disease-associated processes under epigenetic regulation in LUAD. We performed a genome-wide expression-methylation Quantitative Trait Loci (emQTL) analysis by integrating DNA methylation and gene expression data from 453 patients in the TCGA cohort. Using a community detection algorithm, we identified distinct communities of CpG-gene associations with diverse biological processes. Interestingly, we identified a community linked to hormone response and lipid metabolism; the identified CpGs in this community were enriched in enhancer regions and binding regions of transcription factors such as FOXA1/2, GRHL2, HNF1B, AR, and ESR1. Furthermore, the CpGs were connected to their associated genes through chromatin interaction loops. These findings suggest that the expression of genes involved in hormone response and lipid metabolism in LUAD is epigenetically regulated through DNA methylation and enhancer-promoter interactions. By applying consensus clustering on the integrated expression-methylation pattern of the emQTL-genes and CpGs linked to hormone response and lipid metabolism, we further identified subclasses of patients with distinct prognoses. This novel patient stratification was validated in an independent patient cohort of 135 patients and showed increased prognostic significance compared to previously defined molecular subtypes.
Affiliation(s)
- Anastasia Brativnyk
- Department of Cancer Genetics, Institute for Cancer Research, Oslo University Hospital, Oslo, Norway
- Jørgen Ankill
- Department of Cancer Genetics, Institute for Cancer Research, Oslo University Hospital, Oslo, Norway
- Institute of Clinical Medicine, Faculty of Medicine, University of Oslo, Oslo, Norway
- Åslaug Helland
- Department of Cancer Genetics, Institute for Cancer Research, Oslo University Hospital, Oslo, Norway
- Department of Clinical Medicine, University of Oslo, Oslo, Norway
- Department of Oncology, Oslo University Hospital, Oslo, Norway
- Thomas Fleischer
- Department of Cancer Genetics, Institute for Cancer Research, Oslo University Hospital, Oslo, Norway
2. Cai Y, Wang S. Deeply integrating latent consistent representations in high-noise multi-omics data for cancer subtyping. Brief Bioinform 2024; 25:bbae061. [PMID: 38426322 PMCID: PMC10939425 DOI: 10.1093/bib/bbae061] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2023] [Revised: 01/13/2024] [Accepted: 01/29/2024] [Indexed: 03/02/2024] Open
Abstract
Cancer is a complex and high-mortality disease regulated by multiple factors. Accurate cancer subtyping is crucial for formulating personalized treatment plans and improving patient survival rates. The underlying mechanisms that drive cancer progression can be comprehensively understood by analyzing multi-omics data. However, the high noise levels in omics data often pose challenges in capturing consistent representations and adequately integrating their information. This paper proposed a novel variational autoencoder-based deep learning model, named Deeply Integrating Latent Consistent Representations (DILCR). Firstly, multiple independent variational autoencoders and contrastive loss functions were designed to separate noise from omics data and capture latent consistent representations. Subsequently, an Attention Deep Integration Network was proposed to integrate consistent representations across different omics levels effectively. Additionally, we introduced the Improved Deep Embedded Clustering algorithm to make integrated variable clustering friendly. The effectiveness of DILCR was evaluated using 10 typical cancer datasets from The Cancer Genome Atlas and compared with 14 state-of-the-art integration methods. The results demonstrated that DILCR effectively captures the consistent representations in omics data and outperforms other integration methods in cancer subtyping. In the Kidney Renal Clear Cell Carcinoma case study, cancer subtypes were identified by DILCR with significant biological significance and interpretability.
Affiliation(s)
- Yueyi Cai
- Department of Computer Science and Engineering, School of Information Science and Engineering, Yunnan University, Kunming, 650504, Yunnan, China
- Shunfang Wang
- Department of Computer Science and Engineering, School of Information Science and Engineering, Yunnan University, Kunming, 650504, Yunnan, China
3. Alkhateeb A, Alshomali L, Dalain M. Prostate cancer bioinformatics analysis: emerging genomic profiling techniques. Transl Cancer Res 2023; 12:4-7. [PMID: 36760375 PMCID: PMC9906049 DOI: 10.21037/tcr-22-2423] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/18/2022] [Accepted: 11/29/2022] [Indexed: 01/11/2023]
Affiliation(s)
- Abedalrhman Alkhateeb
- Software Engineering Department, Princess Sumaya University for Technology, Amman, Jordan
- Lujain Alshomali
- Software Engineering Department, Princess Sumaya University for Technology, Amman, Jordan
- Mutaz Dalain
- Field Epidemiology Training Program, Ministry of Health, Amman, Jordan
4. Wei Y, Li L, Zhao X, Yang H, Sa J, Cao H, Cui Y. Cancer subtyping with heterogeneous multi-omics data via hierarchical multi-kernel learning. Brief Bioinform 2023; 24:6847203. [PMID: 36433785 DOI: 10.1093/bib/bbac488] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/28/2022] [Revised: 09/14/2022] [Accepted: 10/15/2022] [Indexed: 11/27/2022] Open
Abstract
Differentiating cancer subtypes is crucial to guide personalized treatment and improve the prognosis for patients. Integrating multi-omics data can offer a comprehensive landscape of cancer biological process and provide promising ways for cancer diagnosis and treatment. Taking the heterogeneity of different omics data types into account, we propose a hierarchical multi-kernel learning (hMKL) approach, a novel cancer molecular subtyping method to identify cancer subtypes by adopting a two-stage kernel learning strategy. In stage 1, we obtain a composite kernel borrowing the cancer integration via multi-kernel learning (CIMLR) idea by optimizing the kernel parameters for individual omics data type. In stage 2, we obtain a final fused kernel through a weighted linear combination of individual kernels learned from stage 1 using an unsupervised multiple kernel learning method. Based on the final fusion kernel, k-means clustering is applied to identify cancer subtypes. Simulation studies show that hMKL outperforms the one-stage CIMLR method when there is data heterogeneity. hMKL can estimate the number of clusters correctly, which is the key challenge in subtyping. Application to two real data sets shows that hMKL identified meaningful subtypes and key cancer-associated biomarkers. The proposed method provides a novel toolkit for heterogeneous multi-omics data integration and cancer subtypes identification.
Affiliation(s)
- Yifang Wei
- Division of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, Shanxi 030001, PR China
- Lingmei Li
- Division of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, Shanxi 030001, PR China
- Xin Zhao
- Division of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, Shanxi 030001, PR China
- Haitao Yang
- Division of Health Statistics, School of Public Health, Hebei Medical University, Shijiazhuang, Hebei 050017, PR China
- Jian Sa
- Department of Science and Technology, Shanxi Provincial Key Laboratory of Major Disease Risk Assessment, Shanxi Medical University, Taiyuan, Shanxi 030001, PR China
- Hongyan Cao
- Division of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, Shanxi 030001, PR China
- Department of Mathematics, Shanxi Medical University, Taiyuan, Shanxi 030001, PR China
- Yuehua Cui
- Department of Statistics and Probability, Michigan State University, East Lansing, MI 48824, USA
5. Lavrekha VV, Levitsky VG, Tsukanov AV, Bogomolov AG, Grigorovich DA, Omelyanchuk N, Ubogoeva EV, Zemlyanskaya EV, Mironova V. CisCross: A gene list enrichment analysis to predict upstream regulators in Arabidopsis thaliana. Front Plant Sci 2022; 13:942710. [PMID: 36061801 PMCID: PMC9434332 DOI: 10.3389/fpls.2022.942710] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/12/2022] [Accepted: 07/26/2022] [Indexed: 06/15/2023]
Abstract
Having DNA-binding profiles for a sufficient number of genome-encoded transcription factors (TFs) opens up the perspectives for systematic evaluation of the upstream regulators for the gene lists. Plant Cistrome database, a large collection of TF binding profiles detected using the DAP-seq method, made it possible for Arabidopsis. Here we re-processed raw DAP-seq data with MACS2, the most popular peak caller that leads among other ones according to quality metrics. In the benchmarking study, we confirmed that the improved collection of TF binding profiles supported a more precise gene list enrichment procedure, and resulted in a more relevant ranking of potential upstream regulators. Moreover, we consistently recovered the TF binding profiles that were missing in the previous collection of DAP-seq peak sets. We developed the CisCross web service (https://plamorph.sysbio.ru/ciscross/) that gives more flexibility in the analysis of potential upstream TF regulators for Arabidopsis thaliana genes.
Affiliation(s)
- Viktoriya V. Lavrekha
- Department of Systems Biology, Institute of Cytology and Genetics SB RAS, Novosibirsk, Russia
- Department of Natural Sciences, Novosibirsk State University, Novosibirsk, Russia
- Victor G. Levitsky
- Department of Systems Biology, Institute of Cytology and Genetics SB RAS, Novosibirsk, Russia
- Department of Natural Sciences, Novosibirsk State University, Novosibirsk, Russia
- Anton V. Tsukanov
- Department of Systems Biology, Institute of Cytology and Genetics SB RAS, Novosibirsk, Russia
- Anton G. Bogomolov
- Department of Cell Biology, Institute of Cytology and Genetics SB RAS, Novosibirsk, Russia
- Dmitry A. Grigorovich
- Service of Information Technologies, Institute of Cytology and Genetics SB RAS, Novosibirsk, Russia
- Nadya Omelyanchuk
- Department of Systems Biology, Institute of Cytology and Genetics SB RAS, Novosibirsk, Russia
- Elena V. Ubogoeva
- Department of Systems Biology, Institute of Cytology and Genetics SB RAS, Novosibirsk, Russia
- Elena V. Zemlyanskaya
- Department of Systems Biology, Institute of Cytology and Genetics SB RAS, Novosibirsk, Russia
- Department of Natural Sciences, Novosibirsk State University, Novosibirsk, Russia
- Victoria Mironova
- Department of Systems Biology, Institute of Cytology and Genetics SB RAS, Novosibirsk, Russia
- Department of Plant Systems Physiology, RIBES, Radboud University, Nijmegen, Netherlands
6. Yang Y, Tian S, Qiu Y, Zhao P, Zou Q. MDICC: novel method for multi-omics data integration and cancer subtype identification. Brief Bioinform 2022; 23:6569541. [PMID: 35437603 DOI: 10.1093/bib/bbac132] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 01/11/2022] [Revised: 03/11/2022] [Accepted: 03/19/2022] [Indexed: 12/12/2022] Open
Abstract
Each type of cancer usually has several subtypes with distinct clinical implications, and therefore the discovery of cancer subtypes is an important and urgent task in disease diagnosis and therapy. Using single-omics data to predict cancer subtypes is difficult because genomes are dysregulated and complicated by multiple molecular mechanisms, and therefore linking cancer genomes to cancer phenotypes is not an easy task. Using multi-omics data to effectively predict cancer subtypes is an area of much interest; however, integrating multi-omics data is challenging. Here, we propose a novel method of multi-omics data integration for clustering to identify cancer subtypes (MDICC) that integrates new affinity matrix and network fusion methods. Our experimental results show the effectiveness and generalization of the proposed MDICC model in identifying cancer subtypes, and its performance was better than those of currently available state-of-the-art clustering methods. Furthermore, the survival analysis demonstrates that MDICC delivered comparable or even better results than many typical integrative methods.
Affiliation(s)
- Ying Yang
- College of Mathematics and Statistics, Shenzhen University, 518000, China
- Sha Tian
- College of Mathematics and Statistics, Shenzhen University, 518000, China
- Yushan Qiu
- College of Mathematics and Statistics, Shenzhen University, 518000, China
- Pu Zhao
- College of Life and Health Sciences, Northeastern University, Shenyang, 110169, China
- Quan Zou
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, 610056, China
7. Jayavelu AK, Wolf S, Buettner F, Alexe G, Häupl B, Comoglio F, Schneider C, Doebele C, Fuhrmann DC, Wagner S, Donato E, Andresen C, Wilke AC, Zindel A, Jahn D, Splettstoesser B, Plessmann U, Münch S, Abou-El-Ardat K, Makowka P, Acker F, Enssle JC, Cremer A, Schnütgen F, Kurrle N, Chapuy B, Löber J, Hartmann S, Wild PJ, Wittig I, Hübschmann D, Kaderali L, Cox J, Brüne B, Röllig C, Thiede C, Steffen B, Bornhäuser M, Trumpp A, Urlaub H, Stegmaier K, Serve H, Mann M, Oellerich T. The proteogenomic subtypes of acute myeloid leukemia. Cancer Cell 2022; 40:301-317.e12. [PMID: 35245447 DOI: 10.1016/j.ccell.2022.02.006] [Citation(s) in RCA: 39] [Impact Index Per Article: 19.5] [Received: 04/21/2021] [Revised: 08/30/2021] [Accepted: 02/07/2022] [Indexed: 12/16/2022]
Abstract
Acute myeloid leukemia (AML) is an aggressive blood cancer with a poor prognosis. We report a comprehensive proteogenomic analysis of bone marrow biopsies from 252 uniformly treated AML patients to elucidate the molecular pathophysiology of AML in order to inform future diagnostic and therapeutic approaches. In addition to in-depth quantitative proteomics, our analysis includes cytogenetic profiling and DNA/RNA sequencing. We identify five proteomic AML subtypes, each reflecting specific biological features spanning genomic boundaries. Two of these proteomic subtypes correlate with patient outcome, but none is exclusively associated with specific genomic aberrations. Remarkably, one subtype (Mito-AML), which is captured only in the proteome, is characterized by high expression of mitochondrial proteins and confers poor outcome, with reduced remission rate and shorter overall survival on treatment with intensive induction chemotherapy. Functional analyses reveal that Mito-AML is metabolically wired toward stronger complex I-dependent respiration and is more responsive to treatment with the BCL2 inhibitor venetoclax.
Affiliation(s)
- Ashok Kumar Jayavelu
- Department of Proteomics and Signal Transduction, Max Planck Institute of Biochemistry, Martinsried, Germany; Clinical Cooperation Unit Pediatric Leukemia, DKFZ and Department of Pediatric Oncology, Hematology and Immunology, University of Heidelberg, Germany; Hopp Children's Cancer Center Heidelberg - KiTZ, Heidelberg, Germany
- Sebastian Wolf
- Department of Medicine II, Hematology/Oncology, University Hospital Frankfurt, Goethe University, Theodor-Stern-Kai 7, Frankfurt, Germany; Frankfurt Cancer Institute, Goethe University Frankfurt, Frankfurt, Germany
- Florian Buettner
- German Cancer Consortium (DKTK), Partner Site Frankfurt/Mainz and German Cancer Research Center (DKFZ), Heidelberg, Germany; Department of Medicine, University Hospital Frankfurt, Goethe University, Frankfurt, Germany; Frankfurt Cancer Institute, Goethe University Frankfurt, Frankfurt, Germany
- Gabriela Alexe
- Division of Hematology/Oncology, Department of Pediatric Oncology, Dana-Farber Cancer Institute and Boston Children's Hospital, Boston, MA, USA
- Björn Häupl
- Department of Medicine II, Hematology/Oncology, University Hospital Frankfurt, Goethe University, Theodor-Stern-Kai 7, Frankfurt, Germany; German Cancer Consortium (DKTK), Partner Site Frankfurt/Mainz and German Cancer Research Center (DKFZ), Heidelberg, Germany; Frankfurt Cancer Institute, Goethe University Frankfurt, Frankfurt, Germany
- Constanze Schneider
- Department of Medicine II, Hematology/Oncology, University Hospital Frankfurt, Goethe University, Theodor-Stern-Kai 7, Frankfurt, Germany
- Carmen Doebele
- Department of Medicine II, Hematology/Oncology, University Hospital Frankfurt, Goethe University, Theodor-Stern-Kai 7, Frankfurt, Germany; German Cancer Consortium (DKTK), Partner Site Frankfurt/Mainz and German Cancer Research Center (DKFZ), Heidelberg, Germany
- Sebastian Wagner
- Department of Medicine II, Hematology/Oncology, University Hospital Frankfurt, Goethe University, Theodor-Stern-Kai 7, Frankfurt, Germany
- Elisa Donato
- Division of Stem Cells and Cancer, German Cancer Research Center (DKFZ) and DKFZ-ZMBH Alliance, Heidelberg, Germany; Heidelberg Institute for Stem Cell Technology and Experimental Medicine (HI-STEM gGmbH), Heidelberg, Germany
- Carolin Andresen
- Division of Stem Cells and Cancer, German Cancer Research Center (DKFZ) and DKFZ-ZMBH Alliance, Heidelberg, Germany; Heidelberg Institute for Stem Cell Technology and Experimental Medicine (HI-STEM gGmbH), Heidelberg, Germany
- Anne C Wilke
- Department of Medicine II, Hematology/Oncology, University Hospital Frankfurt, Goethe University, Theodor-Stern-Kai 7, Frankfurt, Germany
- Alena Zindel
- Department of Medicine II, Hematology/Oncology, University Hospital Frankfurt, Goethe University, Theodor-Stern-Kai 7, Frankfurt, Germany; German Cancer Consortium (DKTK), Partner Site Frankfurt/Mainz and German Cancer Research Center (DKFZ), Heidelberg, Germany
- Dominique Jahn
- Department of Medicine II, Hematology/Oncology, University Hospital Frankfurt, Goethe University, Theodor-Stern-Kai 7, Frankfurt, Germany; German Cancer Consortium (DKTK), Partner Site Frankfurt/Mainz and German Cancer Research Center (DKFZ), Heidelberg, Germany
- Bianca Splettstoesser
- Department of Proteomics and Signal Transduction, Max Planck Institute of Biochemistry, Martinsried, Germany
- Uwe Plessmann
- Bioanalytical Mass Spectrometry Group, Max Planck Institute for Biophysical Chemistry, Göttingen, Germany
- Silvia Münch
- Department of Medicine II, Hematology/Oncology, University Hospital Frankfurt, Goethe University, Theodor-Stern-Kai 7, Frankfurt, Germany
- Khali Abou-El-Ardat
- Department of Medicine II, Hematology/Oncology, University Hospital Frankfurt, Goethe University, Theodor-Stern-Kai 7, Frankfurt, Germany
- Philipp Makowka
- Department of Medicine II, Hematology/Oncology, University Hospital Frankfurt, Goethe University, Theodor-Stern-Kai 7, Frankfurt, Germany
- Fabian Acker
- Department of Medicine II, Hematology/Oncology, University Hospital Frankfurt, Goethe University, Theodor-Stern-Kai 7, Frankfurt, Germany
- Julius C Enssle
- Department of Medicine II, Hematology/Oncology, University Hospital Frankfurt, Goethe University, Theodor-Stern-Kai 7, Frankfurt, Germany
- Anjali Cremer
- Department of Medicine II, Hematology/Oncology, University Hospital Frankfurt, Goethe University, Theodor-Stern-Kai 7, Frankfurt, Germany
- Frank Schnütgen
- Department of Medicine II, Hematology/Oncology, University Hospital Frankfurt, Goethe University, Theodor-Stern-Kai 7, Frankfurt, Germany; German Cancer Consortium (DKTK), Partner Site Frankfurt/Mainz and German Cancer Research Center (DKFZ), Heidelberg, Germany; Frankfurt Cancer Institute, Goethe University Frankfurt, Frankfurt, Germany
- Nina Kurrle
- Department of Medicine II, Hematology/Oncology, University Hospital Frankfurt, Goethe University, Theodor-Stern-Kai 7, Frankfurt, Germany; German Cancer Consortium (DKTK), Partner Site Frankfurt/Mainz and German Cancer Research Center (DKFZ), Heidelberg, Germany; Frankfurt Cancer Institute, Goethe University Frankfurt, Frankfurt, Germany
- Björn Chapuy
- Department of Medical Hematology and Oncology, University Medical Center Göttingen, Göttingen, Germany; Department of Hematology, Oncology and Tumor Immunology, Charité, Campus Benjamin Franklin, University Medicine Berlin, Berlin, Germany
- Jens Löber
- Department of Medical Hematology and Oncology, University Medical Center Göttingen, Göttingen, Germany; Department of Hematology, Oncology and Tumor Immunology, Charité, Campus Benjamin Franklin, University Medicine Berlin, Berlin, Germany
- Sylvia Hartmann
- Dr. Senckenberg Institute of Pathology, University Hospital Frankfurt, Frankfurt, Germany
- Peter J Wild
- Dr. Senckenberg Institute of Pathology, University Hospital Frankfurt, Frankfurt, Germany
- Ilka Wittig
- Functional Proteomics, Institute of Cardiovascular Physiology, Goethe University, Frankfurt, Germany
- Daniel Hübschmann
- Division of Stem Cells and Cancer, German Cancer Research Center (DKFZ) and DKFZ-ZMBH Alliance, Heidelberg, Germany; Heidelberg Institute for Stem Cell Technology and Experimental Medicine (HI-STEM gGmbH), Heidelberg, Germany; Pattern Recognition and Digital Medicine, Heidelberg Institute for Stem Cell Technology and Experimental Medicine (HI-STEM), Heidelberg, Germany; Computational Oncology, Molecular Precision Oncology Program, National Center for Tumor Diseases (NCT) Heidelberg and German Cancer Research Center (DKFZ), Heidelberg, Germany
- Lars Kaderali
- Institute of Bioinformatics, University Medicine Greifswald, Greifswald, Germany
- Jürgen Cox
- Computational Systems Biochemistry Research Group, Max Planck Institute of Biochemistry, Martinsried, Germany
- Bernhard Brüne
- Department of Biochemistry I, Goethe University, Frankfurt, Germany
- Christoph Röllig
- Department of Internal Medicine I, University Hospital Carl Gustav Carus TU Dresden, Dresden, Germany
- Christian Thiede
- Department of Internal Medicine I, University Hospital Carl Gustav Carus TU Dresden, Dresden, Germany
- Björn Steffen
- Department of Medicine II, Hematology/Oncology, University Hospital Frankfurt, Goethe University, Theodor-Stern-Kai 7, Frankfurt, Germany
- Martin Bornhäuser
- Department of Internal Medicine I, University Hospital Carl Gustav Carus TU Dresden, Dresden, Germany; National Center for Tumor Diseases, Dresden (NCT/UCC), Dresden, Germany
- Andreas Trumpp
- Division of Stem Cells and Cancer, German Cancer Research Center (DKFZ) and DKFZ-ZMBH Alliance, Heidelberg, Germany; Heidelberg Institute for Stem Cell Technology and Experimental Medicine (HI-STEM gGmbH), Heidelberg, Germany; German Cancer Consortium (DKTK), Heidelberg, Germany
- Henning Urlaub
- Bioanalytical Mass Spectrometry Group, Max Planck Institute for Biophysical Chemistry, Göttingen, Germany; Bioanalytics, Institute for Clinical Chemistry, University Medical Center Göttingen, Göttingen, Germany
- Kimberly Stegmaier
- Division of Hematology/Oncology, Department of Pediatric Oncology, Dana-Farber Cancer Institute and Boston Children's Hospital, Boston, MA, USA; The Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Hubert Serve
- Department of Medicine II, Hematology/Oncology, University Hospital Frankfurt, Goethe University, Theodor-Stern-Kai 7, Frankfurt, Germany; German Cancer Consortium (DKTK), Partner Site Frankfurt/Mainz and German Cancer Research Center (DKFZ), Heidelberg, Germany; Frankfurt Cancer Institute, Goethe University Frankfurt, Frankfurt, Germany.
- Matthias Mann
- Department of Proteomics and Signal Transduction, Max Planck Institute of Biochemistry, Martinsried, Germany.
- Thomas Oellerich
- Department of Medicine II, Hematology/Oncology, University Hospital Frankfurt, Goethe University, Theodor-Stern-Kai 7, Frankfurt, Germany; German Cancer Consortium (DKTK), Partner Site Frankfurt/Mainz and German Cancer Research Center (DKFZ), Heidelberg, Germany; Frankfurt Cancer Institute, Goethe University Frankfurt, Frankfurt, Germany.
8. Sienkiewicz K, Chen J, Chatrath A, Lawson JT, Sheffield NC, Zhang L, Ratan A. Detecting molecular subtypes from multi-omics datasets using SUMO. Cell Rep Methods 2022; 2:100152. [PMID: 35211690 PMCID: PMC8865426 DOI: 10.1016/j.crmeth.2021.100152] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 03/11/2021] [Revised: 08/27/2021] [Accepted: 12/21/2021] [Indexed: 12/31/2022]
Abstract
We present a data integration framework that uses non-negative matrix factorization of patient-similarity networks to integrate continuous multi-omics datasets for molecular subtyping. It is demonstrated to have the capability to handle missing data without using imputation and to be consistently among the best in detecting subtypes with differential prognosis and enrichment of clinical associations in a large number of cancers. When applying the approach to data from individuals with lower-grade gliomas, we identify a subtype with a significantly worse prognosis. Tumors assigned to this subtype are hypomethylated genome wide with a gain of AP-1 occupancy in demethylated distal enhancers. The tumors are also enriched for somatic chromosome 7 (chr7) gain, chr10 loss, and other molecular events that have been suggested as diagnostic markers for "IDH wild type, with molecular features of glioblastoma" by the cIMPACT-NOW consortium but have yet to be included in the World Health Organization (WHO) guidelines.
Affiliation(s)
- Karolina Sienkiewicz
- Center for Public Health Genomics, University of Virginia, Charlottesville, VA 22908, USA
- Jinyu Chen
- Department of Mathematics and Computational Biology Program, National University of Singapore, Singapore 119076, Singapore
- Ajay Chatrath
- Department of Biochemistry and Molecular Genetics, University of Virginia, Charlottesville, VA 22908, USA
- John T. Lawson
- Center for Public Health Genomics, University of Virginia, Charlottesville, VA 22908, USA
- Department of Biomedical Engineering, University of Virginia, Charlottesville, VA 22908, USA
- Nathan C. Sheffield
- Center for Public Health Genomics, University of Virginia, Charlottesville, VA 22908, USA
- Department of Biochemistry and Molecular Genetics, University of Virginia, Charlottesville, VA 22908, USA
- Department of Biomedical Engineering, University of Virginia, Charlottesville, VA 22908, USA
- Department of Public Health Sciences, University of Virginia, Charlottesville, VA 22908, USA
- University of Virginia Cancer Center, Charlottesville, VA 22908, USA
- Louxin Zhang
- Department of Mathematics and Computational Biology Program, National University of Singapore, Singapore 119076, Singapore
- Aakrosh Ratan
- Center for Public Health Genomics, University of Virginia, Charlottesville, VA 22908, USA
- Department of Public Health Sciences, University of Virginia, Charlottesville, VA 22908, USA
- University of Virginia Cancer Center, Charlottesville, VA 22908, USA
9. Kirchhoff KN, Billion A, Voolstra CR, Kremb S, Wilke T, Vilcinskas A. Stingray Venom Proteins: Mechanisms of Action Revealed Using a Novel Network Pharmacology Approach. Mar Drugs 2021; 20:27. [PMID: 35049882 DOI: 10.3390/md20010027] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 12/07/2021] [Revised: 12/20/2021] [Accepted: 12/22/2021] [Indexed: 01/02/2023] Open
Abstract
Animal venoms offer a valuable source of potent new drug leads, but their mechanisms of action are largely unknown. We therefore developed a novel network pharmacology approach based on multi-omics functional data integration to predict how stingray venom disrupts the physiological systems of target animals. We integrated 10 million transcripts from five stingray venom transcriptomes and 848,640 records from three high-content venom bioactivity datasets into a large functional data network. The network featured 216 signaling pathways, 29 of which were shared and targeted by 70 transcripts and 70 bioactivity hits. The network revealed clusters for single envenomation outcomes, such as pain, cardiotoxicity and hemorrhage. We carried out a detailed analysis of the pain cluster representing a primary envenomation symptom, revealing bibrotoxin and cholecystotoxin-like transcripts encoding pain-inducing candidate proteins in stingray venom. The cluster also suggested that such pain-inducing toxins primarily activate the inositol-3-phosphate receptor cascade, inducing intracellular calcium release. We also found strong evidence for synergistic activity among these candidates, with nerve growth factors cooperating with the most abundant translationally-controlled tumor proteins to activate pain signaling pathways. Our network pharmacology approach, here applied to stingray venom, can be used as a template for drug discovery in neglected venomous species.
10. Logotheti M, Agioutantis P, Katsaounou P, Loutrari H. Microbiome Research and Multi-Omics Integration for Personalized Medicine in Asthma. J Pers Med 2021; 11:jpm11121299. [PMID: 34945771 PMCID: PMC8707330 DOI: 10.3390/jpm11121299] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 10/15/2021] [Revised: 11/13/2021] [Accepted: 11/24/2021] [Indexed: 12/12/2022] Open
Abstract
Asthma is a multifactorial inflammatory disorder of the respiratory system characterized by high diversity in clinical manifestations, underlying pathological mechanisms and response to treatment. It is generally established that the human microbiota plays an essential role in shaping a healthy immune response, while its perturbation can cause chronic inflammation related to a wide range of diseases, including asthma. Systems biology approaches encompassing microbiome analysis can offer valuable platforms towards a global understanding of asthma complexity and improving patient classification, status monitoring and therapeutic choices. In the present review, we summarize recent studies exploring the contribution of microbiota dysbiosis to asthma pathogenesis and heterogeneity in the context of asthma phenotypes-endotypes and administered medication. We subsequently focus on emerging efforts to gain deeper insights into microbiota-host interactions driving asthma complexity by integrating microbiome and host multi-omics data. One of the most prominent achievements of these research efforts is the association of refractory neutrophilic asthma with certain microbial signatures, including predominant pathogenic bacterial taxa (such as the Proteobacteria phylum and the Gammaproteobacteria class, especially species from the Haemophilus and Moraxella genera). Overall, despite existing challenges, large-scale multi-omics endeavors may provide promising biomarkers and therapeutic targets for future development of novel microbe-based personalized strategies for diagnosis, prevention and/or treatment of uncontrollable asthma.
Affiliation(s)
- Marianthi Logotheti
- G.P. Livanos and M. Simou Laboratories, 1st Department of Critical Care Medicine & Pulmonary Services, Evangelismos Hospital, Medical School, National Kapodistrian University of Athens, 3 Ploutarchou Str., 10675 Athens, Greece; (M.L.); (P.A.)
- Biotechnology Laboratory, School of Chemical Engineering, National Technical University of Athens, 5 Iroon Polytechniou Str., Zografou Campus, 15780 Athens, Greece
- Panagiotis Agioutantis
- G.P. Livanos and M. Simou Laboratories, 1st Department of Critical Care Medicine & Pulmonary Services, Evangelismos Hospital, Medical School, National Kapodistrian University of Athens, 3 Ploutarchou Str., 10675 Athens, Greece; (M.L.); (P.A.)
- Paraskevi Katsaounou
- Pulmonary Dept First ICU, Evangelismos Hospital, Medical School, National Kapodistrian University of Athens, Ipsilantou 45-7, 10675 Athens, Greece;
- Heleni Loutrari
- G.P. Livanos and M. Simou Laboratories, 1st Department of Critical Care Medicine & Pulmonary Services, Evangelismos Hospital, Medical School, National Kapodistrian University of Athens, 3 Ploutarchou Str., 10675 Athens, Greece; (M.L.); (P.A.)
11
Rautenstrauch P, Vlot AHC, Saran S, Ohler U. Intricacies of single-cell multi-omics data integration. Trends Genet 2021; 38:128-139. [PMID: 34561102 DOI: 10.1016/j.tig.2021.08.012] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 04/06/2021] [Revised: 08/20/2021] [Accepted: 08/23/2021] [Indexed: 02/06/2023]
Abstract
A wealth of single-cell protocols makes it possible to characterize different molecular layers at unprecedented resolution. Integrating the resulting multimodal single-cell data to find cell-to-cell correspondences remains a challenge. We argue that data integration needs to happen at a meaningful biological level of abstraction and that it is necessary to consider the inherent discrepancies between modalities to strike a balance between biological discovery and noise removal. A survey of current methods reveals that a distinction between technical and biological origins of presumed unwanted variation between datasets is not yet commonly considered. The increasing availability of paired multimodal data will aid the development of improved methods by providing a ground truth on cell-to-cell matches.
Affiliation(s)
- Pia Rautenstrauch
- The Berlin Institute for Medical Systems Biology, Max Delbrück Center for Molecular Medicine, 10115 Berlin, Germany; Department of Computer Science, Humboldt Universität zu Berlin, 10117 Berlin, Germany
- Anna Hendrika Cornelia Vlot
- The Berlin Institute for Medical Systems Biology, Max Delbrück Center for Molecular Medicine, 10115 Berlin, Germany; Department of Computer Science, Humboldt Universität zu Berlin, 10117 Berlin, Germany
- Sepideh Saran
- The Berlin Institute for Medical Systems Biology, Max Delbrück Center for Molecular Medicine, 10115 Berlin, Germany
- Uwe Ohler
- The Berlin Institute for Medical Systems Biology, Max Delbrück Center for Molecular Medicine, 10115 Berlin, Germany; Department of Computer Science, Humboldt Universität zu Berlin, 10117 Berlin, Germany; Department of Biology, Humboldt Universität zu Berlin, 10117 Berlin, Germany.
12
Defosset A, Merlat D, Poidevin L, Nevers Y, Kress A, Poch O, Lecompte O. Novel Approach Combining Transcriptional and Evolutionary Signatures to Identify New Multiciliation Genes. Genes (Basel) 2021; 12:1452. [PMID: 34573434 DOI: 10.3390/genes12091452] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/26/2021] [Revised: 09/17/2021] [Accepted: 09/18/2021] [Indexed: 11/19/2022]
Abstract
Multiciliogenesis is a complex process that allows the generation of hundreds of motile cilia on the surface of specialized cells, to create fluid flow across epithelial surfaces. Dysfunction of human multiciliated cells is associated with diseases of the brain, airway and reproductive tracts. Despite recent efforts to characterize the transcriptional events responsible for the differentiation of multiciliated cells, a lot of actors remain to be identified. In this work, we capitalize on the ever-growing quantity of high-throughput data to search for new candidate genes involved in multiciliation. After performing a large-scale screening using 10 transcriptomics datasets dedicated to multiciliation, we established a specific evolutionary signature involving Otomorpha fish to use as a criterion to select the most likely targets. Combining both approaches highlighted a list of 114 potential multiciliated candidates. We characterized these genes first by generating protein interaction networks, which showed various clusters of ciliated and multiciliated genes, and then by computing phylogenetic profiles. In the end, we selected 11 poorly characterized genes that seem like particularly promising multiciliated candidates. By combining functional and comparative genomics methods, we developed a novel type of approach to study biological processes and identify new promising candidates linked to that process.
13
Fiorentino G, Visintainer R, Domenici E, Lauria M, Marchetti L. MOUSSE: Multi-Omics Using Subject-Specific SignaturEs. Cancers (Basel) 2021; 13:3423. [PMID: 34298641 PMCID: PMC8304726 DOI: 10.3390/cancers13143423] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/11/2021] [Revised: 06/29/2021] [Accepted: 06/30/2021] [Indexed: 01/06/2023]
Abstract
Simple Summary: Modern profiling technologies have led to relevant progress toward precision medicine and disease management. A new trend in patient classification is to integrate multiple data types for the same subjects to increase the chance of identifying meaningful phenotype groups. However, these methodologies are still in their infancy, with their performance varying widely depending on the biological conditions analyzed. We developed MOUSSE, a new unsupervised and normalization-free tool for multi-omics integration able to maintain good clustering performance across a wide range of omics data. We verified its efficiency in clustering patients based on survival for ten different cancer types. The results we obtained show a higher average score in classification performance than ten other state-of-the-art algorithms. We have further validated the method by identifying a list of biological features potentially involved in patient survival, finding a high degree of concordance with the literature.
Abstract: High-throughput technologies make it possible to produce a large amount of data representing different biological layers, examples of which are genomics, proteomics, metabolomics and transcriptomics. Omics data have been individually investigated to understand the molecular bases of various diseases, but this may not be sufficient to fully capture the molecular mechanisms and the multilayer regulatory processes underlying complex diseases, especially cancer. To overcome this problem, several multi-omics integration methods have been introduced but a commonly agreed standard of analysis is still lacking. In this paper, we present MOUSSE, a novel normalization-free pipeline for unsupervised multi-omics integration. The main innovations are the use of rank-based subject-specific signatures and the use of such signatures to derive subject similarity networks. A separate similarity network was derived for each omics, and the resulting networks were then carefully merged in a way that considered their informative content. We applied it to analyze survival in ten different types of cancer. We produced a meaningful clusterization of the subjects and obtained a higher average classification score than ten state-of-the-art algorithms tested on the same data. As further validation, we extracted from the subject-specific signatures a list of relevant features used for the clusterization and investigated their biological role in survival. We were able to verify that, according to the literature, these features are highly involved in cancer progression and differential survival.
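The idea of rank-based subject-specific signatures feeding per-omics similarity networks can be illustrated with a minimal sketch. This is not the MOUSSE implementation: the function names, the top-k signature definition, the Jaccard similarity, and the unweighted averaging used to merge layers are all assumptions chosen for illustration (the abstract states only that the merge considers each network's informative content), and the toy values are invented.

```python
from itertools import combinations


def top_k_signature(profile, k):
    """Return the names of the k highest-valued features of one subject.

    Only the within-subject ranking matters, so no normalization is needed.
    """
    ranked = sorted(profile, key=profile.get, reverse=True)
    return frozenset(ranked[:k])


def similarity_network(profiles, k):
    """Jaccard overlap between top-k signatures for every pair of subjects."""
    sigs = {subject: top_k_signature(p, k) for subject, p in profiles.items()}
    net = {}
    for a, b in combinations(sorted(sigs), 2):
        union = sigs[a] | sigs[b]
        net[(a, b)] = len(sigs[a] & sigs[b]) / len(union) if union else 0.0
    return net


def merge_networks(networks):
    """Unweighted mean of edge weights across omics layers (an assumption)."""
    edges = set().union(*(n.keys() for n in networks))
    return {e: sum(n.get(e, 0.0) for n in networks) / len(networks) for e in edges}


# Hypothetical toy data: s1 and s2 are on different scales but rank genes alike.
expression = {
    "s1": {"g1": 9.0, "g2": 7.5, "g3": 1.0, "g4": 0.5},
    "s2": {"g1": 900.0, "g2": 650.0, "g3": 20.0, "g4": 10.0},
    "s3": {"g1": 0.2, "g2": 0.1, "g3": 8.0, "g4": 6.0},
}
methylation = {
    "s1": {"c1": 0.9, "c2": 0.8, "c3": 0.1},
    "s2": {"c1": 0.7, "c2": 0.2, "c3": 0.6},
    "s3": {"c1": 0.1, "c2": 0.3, "c3": 0.9},
}

expr_net = similarity_network(expression, k=2)
meth_net = similarity_network(methylation, k=2)
merged = merge_networks([expr_net, meth_net])
```

In this toy example, s1 and s2 end up with the strongest merged edge even though their expression values differ by two orders of magnitude, which is the practical appeal of a rank-based, normalization-free signature. The merged edge weights could then be passed to any graph clustering method.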
Affiliation(s)
- Giuseppe Fiorentino
- Fondazione The Microsoft Research, University of Trento Centre for Computational and Systems Biology (COSBI), 38068 Rovereto, Italy; (G.F.); (R.V.); (E.D.); (M.L.)
- Department of Cellular, Computational, and Integrative Biology (CiBio), University of Trento, 38123 Povo, Italy
- Roberto Visintainer
- Fondazione The Microsoft Research, University of Trento Centre for Computational and Systems Biology (COSBI), 38068 Rovereto, Italy; (G.F.); (R.V.); (E.D.); (M.L.)
- Enrico Domenici
- Fondazione The Microsoft Research, University of Trento Centre for Computational and Systems Biology (COSBI), 38068 Rovereto, Italy; (G.F.); (R.V.); (E.D.); (M.L.)
- Department of Cellular, Computational, and Integrative Biology (CiBio), University of Trento, 38123 Povo, Italy
- Mario Lauria
- Fondazione The Microsoft Research, University of Trento Centre for Computational and Systems Biology (COSBI), 38068 Rovereto, Italy; (G.F.); (R.V.); (E.D.); (M.L.)
- Department of Mathematics, University of Trento, 38123 Povo, Italy
- Luca Marchetti
- Fondazione The Microsoft Research, University of Trento Centre for Computational and Systems Biology (COSBI), 38068 Rovereto, Italy; (G.F.); (R.V.); (E.D.); (M.L.)
14
Vlachavas EI, Bohn J, Ückert F, Nürnberg S. A Detailed Catalogue of Multi-Omics Methodologies for Identification of Putative Biomarkers and Causal Molecular Networks in Translational Cancer Research. Int J Mol Sci 2021; 22:2822. [PMID: 33802234 PMCID: PMC8000236 DOI: 10.3390/ijms22062822] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 01/31/2021] [Revised: 03/05/2021] [Accepted: 03/05/2021] [Indexed: 02/06/2023]
Abstract
Recent advances in sequencing and biotechnological methodologies have led to the generation of large volumes of molecular data of different omics layers, such as genomics, transcriptomics, proteomics and metabolomics. Integration of these data with clinical information provides new opportunities to discover how perturbations in biological processes lead to disease. Using data-driven approaches for the integration and interpretation of multi-omics data could stably identify links between structural and functional information and propose causal molecular networks with potential impact on cancer pathophysiology. This knowledge can then be used to improve disease diagnosis, prognosis, prevention, and therapy. This review will summarize and categorize the most current computational methodologies and tools for integration of distinct molecular layers in the context of translational cancer research and personalized therapy. Additionally, the bioinformatics tools Multi-Omics Factor Analysis (MOFA) and netDX will be tested using omics data from public cancer resources, to assess their overall robustness, provide reproducible workflows for gaining biological knowledge from multi-omics data, and to comprehensively understand the significantly perturbed biological entities in distinct cancer types. We show that the performed supervised and unsupervised analyses result in meaningful and novel findings.
Affiliation(s)
- Efstathios Iason Vlachavas
- Medical Informatics for Translational Oncology, German Cancer Research Center (DKFZ), 69120 Heidelberg, Germany; (J.B.); (F.Ü.)
- Jonas Bohn
- Medical Informatics for Translational Oncology, German Cancer Research Center (DKFZ), 69120 Heidelberg, Germany; (J.B.); (F.Ü.)
- Frank Ückert
- Medical Informatics for Translational Oncology, German Cancer Research Center (DKFZ), 69120 Heidelberg, Germany; (J.B.); (F.Ü.)
- Applied Medical Informatics, University Hospital Hamburg-Eppendorf, 20251 Hamburg, Germany
- Sylvia Nürnberg
- Medical Informatics for Translational Oncology, German Cancer Research Center (DKFZ), 69120 Heidelberg, Germany; (J.B.); (F.Ü.)
- Applied Medical Informatics, University Hospital Hamburg-Eppendorf, 20251 Hamburg, Germany
15
Dursun C. NECo: A node embedding algorithm for multiplex heterogeneous networks. Proceedings (IEEE Int Conf Bioinformatics Biomed) 2020; 2020:146-149. [PMID: 34584774 PMCID: PMC8466723 DOI: 10.1109/bibm49941.2020.9313595] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/07/2022]
Abstract
Complex diseases such as hypertension, cancer, and diabetes cause nearly 70% of the deaths in the U.S. and involve multiple genes and their interactions with environmental factors. Therefore, identification of genetic factors to understand and decrease the morbidity and mortality from complex diseases is an important and challenging task. With the generation of an unprecedented amount of multi-omics datasets, network-based methods have become popular to represent the multilayered complex molecular interactions. In particular, node embeddings, the low-dimensional representations of nodes in a network, are utilized for gene function prediction. Integrated network analysis of multi-omics data alleviates the issues related to missing data and lack of context-specific datasets. Most of the node embedding methods, however, are unable to integrate multiple types of datasets from genes and phenotypes. To address this limitation, we developed a node embedding algorithm called Node Embeddings of Complex networks (NECo) that can utilize multilayered heterogeneous networks of genes and phenotypes. We evaluated the performance of NECo using genotypic and phenotypic datasets from rat (Rattus norvegicus) disease models to classify hypertension disease-related genes. Our method significantly outperformed the state-of-the-art node embedding methods, with an AUC of 94.97% compared with 85.98% for the second-best performer, and predicted genes not previously implicated in hypertension.
Affiliation(s)
- Cagatay Dursun
- Dept. of Biomedical Engineering, Marquette University – Medical College of Wisconsin, Milwaukee WI USA
16
Griss J, Viteri G, Sidiropoulos K, Nguyen V, Fabregat A, Hermjakob H. ReactomeGSA - Efficient Multi-Omics Comparative Pathway Analysis. Mol Cell Proteomics 2020; 19:2115-2125. [PMID: 32907876 PMCID: PMC7710148 DOI: 10.1074/mcp.tir120.002155] [Citation(s) in RCA: 122] [Impact Index Per Article: 30.5] [Received: 06/02/2020] [Revised: 07/28/2020] [Indexed: 01/27/2023]
Abstract
Pathway analyses are key methods to analyze 'omics experiments. Nevertheless, integrating data from different 'omics technologies and different species still requires considerable bioinformatics knowledge. Here we present the novel ReactomeGSA resource for comparative pathway analyses of multi-omics datasets. ReactomeGSA can be used through Reactome's existing web interface and the novel ReactomeGSA R Bioconductor package with explicit support for scRNA-seq data. Data from different species is automatically mapped to a common pathway space. Public data from ExpressionAtlas and Single Cell ExpressionAtlas can be directly integrated in the analysis. ReactomeGSA greatly reduces the technical barrier for multi-omics, cross-species, comparative pathway analyses. We used ReactomeGSA to characterize the role of B cells in anti-tumor immunity. We compared B cell rich and poor human cancer samples from five of the Cancer Genome Atlas (TCGA) transcriptomics and two of the Clinical Proteomic Tumor Analysis Consortium (CPTAC) proteomics studies. B cell-rich lung adenocarcinoma samples lacked the otherwise present activation through NFkappaB. This may be linked to the presence of a specific subset of tumor associated IgG+ plasma cells that lack NFkappaB activation in scRNA-seq data from human melanoma. This showcases how ReactomeGSA can derive novel biomedical insights by integrating large multi-omics datasets.
Affiliation(s)
- Johannes Griss
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Cambridgeshire, United Kingdom; Department of Dermatology, Medical University of Vienna, Vienna, Austria.
- Guilherme Viteri
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Cambridgeshire, United Kingdom
- Konstantinos Sidiropoulos
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Cambridgeshire, United Kingdom
- Vy Nguyen
- Department of Dermatology, Medical University of Vienna, Vienna, Austria
- Antonio Fabregat
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Cambridgeshire, United Kingdom
- Henning Hermjakob
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Cambridgeshire, United Kingdom.
17
Randhawa V, Pathania S. Advancing from protein interactomes and gene co-expression networks towards multi-omics-based composite networks: approaches for predicting and extracting biological knowledge. Brief Funct Genomics 2020; 19:364-376. [PMID: 32678894 DOI: 10.1093/bfgp/elaa015] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 03/16/2020] [Revised: 05/31/2020] [Accepted: 06/15/2020] [Indexed: 01/17/2023]
Abstract
Prediction of biological interaction networks from single-omics data has been extensively implemented to understand various aspects of biological systems. However, more recently, there is a growing interest in integrating multi-omics datasets for the prediction of interactomes that provide a global view of biological systems with higher descriptive capability, as compared to single omics. In this review, we have discussed various computational approaches implemented to infer and analyze two of the most important and well studied interactomes: protein-protein interaction networks and gene co-expression networks. We have explicitly focused on recent methods and pipelines implemented to infer and extract biologically important information from these interactomes, starting from utilizing single-omics data and then progressing towards multi-omics data. Accordingly, recent examples and case studies are also briefly discussed. Overall, this review will provide a proper understanding of the latest developments in protein and gene network modelling and will also help in extracting practical knowledge from them.
Affiliation(s)
- Vinay Randhawa
- Department of Biochemistry, Panjab University, Chandigarh, 160014, India
- Shivalika Pathania
- Department of Biotechnology, Panjab University, Chandigarh, 160014, India
18
Cavalli M, Diamanti K, Pan G, Spalinskas R, Kumar C, Deshmukh AS, Mann M, Sahlén P, Komorowski J, Wadelius C. A Multi-Omics Approach to Liver Diseases: Integration of Single Nuclei Transcriptomics with Proteomics and HiCap Bulk Data in Human Liver. OMICS 2020; 24:180-194. [PMID: 32181701 PMCID: PMC7185313 DOI: 10.1089/omi.2019.0215] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Indexed: 12/13/2022]
Abstract
The liver is the largest solid organ and a primary metabolic hub. In recent years, intact cell nuclei were used to perform single-nuclei RNA-seq (snRNA-seq) for tissues difficult to dissociate and for flash-frozen archived tissue samples to discover unknown and rare cell subpopulations. In this study, we performed snRNA-seq of a liver sample to identify subpopulations of cells based on nuclear transcriptomics. In 4282 single nuclei, we detected, on average, 1377 active genes and we identified seven major cell types. We integrated data from 94,286 distal interactions (p < 0.05) for 7682 promoters from a targeted chromosome conformation capture technique (HiCap) and mass spectrometry proteomics for the same liver sample. We observed a reasonable correlation between proteomics and in silico bulk snRNA-seq (r = 0.47) using tissue-independent gene-specific protein abundancy estimation factors. We specifically looked at genes of medical importance. The DPYD gene is involved in the pharmacogenetics of fluoropyrimidine toxicity and some of its variants are analyzed for clinical purposes. We identified a new putative polymorphic regulatory element, which may contribute to variation in toxicity. Hepatocellular carcinoma (HCC) is the most common type of primary liver cancer and we investigated all known risk genes. We identified a complex regulatory landscape for the SLC2A2 gene with 16 candidate enhancers. Three of them harbor somatic motif breaking and other mutations in HCC in the Pan Cancer Analysis of Whole Genomes dataset and are candidates to contribute to malignancy. Our results highlight the potential of a multi-omics approach in the study of human diseases.
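The comparison between proteomics and in silico bulk snRNA-seq mentioned above can be sketched as two steps: summing per-nucleus counts into one pseudo-bulk profile, then computing a Pearson correlation against protein abundance over the shared genes. This is only an illustration under stated assumptions: the function names are hypothetical, the counts and protein values are invented toy numbers, ALB and APOA1 are added purely as example genes, and the gene-specific protein abundance estimation factors used in the study are omitted.

```python
from math import sqrt


def pseudo_bulk(nuclei_counts):
    """Sum per-gene counts over all nuclei to get one in silico bulk profile."""
    bulk = {}
    for counts in nuclei_counts:
        for gene, n in counts.items():
            bulk[gene] = bulk.get(gene, 0) + n
    return bulk


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


# Hypothetical toy data: three nuclei and four proteins.
nuclei = [
    {"ALB": 5, "APOA1": 2, "DPYD": 0},
    {"ALB": 3, "APOA1": 1, "DPYD": 1},
    {"ALB": 4, "SLC2A2": 2},
]
protein = {"ALB": 120.0, "APOA1": 40.0, "DPYD": 15.0, "SLC2A2": 30.0}

bulk = pseudo_bulk(nuclei)
shared = sorted(set(bulk) & set(protein))  # compare only genes seen in both layers
r = pearson([bulk[g] for g in shared], [protein[g] for g in shared])
```

The toy values are deliberately well correlated, so `r` here is far higher than the r = 0.47 reported for real data; only the procedure, not the number, is meant to carry over.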
Affiliation(s)
- Marco Cavalli
- Science for Life Laboratory, Department of Immunology, Genetics and Pathology, Uppsala University, Uppsala, Sweden
- Klev Diamanti
- Science for Life Laboratory, Department of Cell and Molecular Biology, Uppsala University, Uppsala, Sweden
- Gang Pan
- Science for Life Laboratory, Department of Immunology, Genetics and Pathology, Uppsala University, Uppsala, Sweden
- Rapolas Spalinskas
- Science for Life Laboratory, Division of Gene Technology, KTH Royal Institute of Technology, Stockholm, Sweden
- Chanchal Kumar
- Translational Science and Experimental Medicine, Early Cardiovascular, Renal and Metabolism, BioPharmaceuticals R&D, AstraZeneca, Gothenburg, Sweden
- Karolinska Institutet/AstraZeneca Integrated CardioMetabolic Center (KI/AZ ICMC), Department of Medicine, Novum, Huddinge, Sweden
- Atul Shahaji Deshmukh
- Novo Nordisk Foundation Center for Protein Research, Proteomics Program, Clinical Proteomics Group, Copenhagen, Denmark
- Matthias Mann
- Novo Nordisk Foundation Center for Protein Research, Proteomics Program, Clinical Proteomics Group, Copenhagen, Denmark
- Pelin Sahlén
- Science for Life Laboratory, Division of Gene Technology, KTH Royal Institute of Technology, Stockholm, Sweden
- Jan Komorowski
- Science for Life Laboratory, Department of Cell and Molecular Biology, Uppsala University, Uppsala, Sweden
- Institute of Computer Science, Polish Academy of Sciences, Warszawa, Poland
- Claes Wadelius
- Science for Life Laboratory, Department of Immunology, Genetics and Pathology, Uppsala University, Uppsala, Sweden
19
Jiang D, Armour CR, Hu C, Mei M, Tian C, Sharpton TJ, Jiang Y. Microbiome Multi-Omics Network Analysis: Statistical Considerations, Limitations, and Opportunities. Front Genet 2019; 10:995. [PMID: 31781153 PMCID: PMC6857202 DOI: 10.3389/fgene.2019.00995] [Citation(s) in RCA: 77] [Impact Index Per Article: 15.4] [Received: 02/13/2019] [Accepted: 09/18/2019] [Indexed: 12/21/2022]
Abstract
The advent of large-scale microbiome studies affords newfound analytical opportunities to understand how these communities of microbes operate and relate to their environment. However, the analytical methodology needed to model microbiome data and integrate them with other data constructs remains nascent. This emergent analytical toolset frequently ports over techniques developed in other multi-omics investigations, especially the growing array of statistical and computational techniques for integrating and representing data through networks. While network analysis has emerged as a powerful approach to modeling microbiome data, oftentimes by integrating these data with other types of omics data to discern their functional linkages, it is not always evident if the statistical details of the approach being applied are consistent with the assumptions of microbiome data or how they impact data interpretation. In this review, we overview some of the most important network methods for integrative analysis, with an emphasis on methods that have been applied or have great potential to be applied to the analysis of multi-omics integration of microbiome data. We compare advantages and disadvantages of various statistical tools, assess their applicability to microbiome data, and discuss their biological interpretability. We also highlight on-going statistical challenges and opportunities for integrative network analysis of microbiome data.
Affiliation(s)
- Duo Jiang
- Department of Statistics, Oregon State University, Corvallis, OR, United States
- Courtney R Armour
- Department of Microbiology, Oregon State University, Corvallis, OR, United States
- Chenxiao Hu
- Department of Statistics, Oregon State University, Corvallis, OR, United States
- Meng Mei
- Department of Statistics, Oregon State University, Corvallis, OR, United States
- Chuan Tian
- Department of Statistics, Oregon State University, Corvallis, OR, United States
- Thomas J Sharpton
- Department of Statistics, Oregon State University, Corvallis, OR, United States
- Department of Microbiology, Oregon State University, Corvallis, OR, United States
- Yuan Jiang
- Department of Statistics, Oregon State University, Corvallis, OR, United States
20
Zhang L, Lv C, Jin Y, Cheng G, Fu Y, Yuan D, Tao Y, Guo Y, Ni X, Shi T. Deep Learning-Based Multi-Omics Data Integration Reveals Two Prognostic Subtypes in High-Risk Neuroblastoma. Front Genet 2018; 9:477. [PMID: 30405689 PMCID: PMC6201709 DOI: 10.3389/fgene.2018.00477] [Citation(s) in RCA: 101] [Impact Index Per Article: 16.8] [Received: 07/28/2018] [Accepted: 09/26/2018] [Indexed: 12/01/2022]
Abstract
High-risk neuroblastoma is a very aggressive disease, with excessive tumor growth and poor outcomes. A proper stratification of high-risk patients by prognostic outcome is important for treatment. However, survival stratification for high-risk neuroblastoma is still lacking. To fill this gap, we adopt a deep learning algorithm, Autoencoder, to integrate multi-omics data, and combine it with K-means clustering to identify two subtypes with significant survival differences. In a comparison with PCA, iCluster, and DGscore on classification based on multi-omics data integration, the Autoencoder-based classification outperforms the alternative approaches. Furthermore, we validated the classification in two independent datasets by training machine-learning classification models, and confirmed its robustness. Functional analysis revealed that MYCN amplification occurred more frequently in the ultra-high-risk subtype, in accordance with the overexpression of MYC/MYCN targets in this subtype. In summary, prognostic subtypes identified by deep learning-based multi-omics integration could not only improve our understanding of the molecular mechanisms, but also help clinicians make decisions.
Affiliation(s)
- Li Zhang
- Center for Bioinformatics and Computational Biology, and the Institute of Biomedical Sciences, School of Life Sciences, East China Normal University, Shanghai, China
- Chenkai Lv
- Center for Bioinformatics and Computational Biology, and the Institute of Biomedical Sciences, School of Life Sciences, East China Normal University, Shanghai, China
- Yaqiong Jin
- Beijing Key Laboratory for Pediatric Diseases of Otolaryngology, Head and Neck Surgery, MOE Key Laboratory of Major Diseases in Children, Beijing Children's Hospital, National Center for Children's Health, Beijing Pediatric Research Institute, Capital Medical University, Beijing, China; Biobank for Clinical Data and Samples in Pediatrics, Beijing Children's Hospital, National Center for Children's Health, Beijing Pediatric Research Institute, Capital Medical University, Beijing, China
- Ganqi Cheng
- Center for Bioinformatics and Computational Biology, and the Institute of Biomedical Sciences, School of Life Sciences, East China Normal University, Shanghai, China
- Yibao Fu
- Center for Bioinformatics and Computational Biology, and the Institute of Biomedical Sciences, School of Life Sciences, East China Normal University, Shanghai, China
- Dongsheng Yuan
- Center for Bioinformatics and Computational Biology, and the Institute of Biomedical Sciences, School of Life Sciences, East China Normal University, Shanghai, China
- Yiran Tao
- Center for Bioinformatics and Computational Biology, and the Institute of Biomedical Sciences, School of Life Sciences, East China Normal University, Shanghai, China
| | - Yongli Guo
- Beijing Key Laboratory for Pediatric Diseases of Otolaryngology, Head and Neck Surgery, MOE Key Laboratory of Major Diseases in Children, Beijing Children's Hospital, National Center for Children's Health, Beijing Pediatric Research Institute, Capital Medical University, Beijing, China.,Biobank for Clinical Data and Samples in Pediatrics, Beijing Children's Hospital, National Center for Children's Health, Beijing Pediatric Research Institute, Capital Medical University, Beijing, China
| | - Xin Ni
- Beijing Key Laboratory for Pediatric Diseases of Otolaryngology, Head and Neck Surgery, MOE Key Laboratory of Major Diseases in Children, Beijing Children's Hospital, National Center for Children's Health, Beijing Pediatric Research Institute, Capital Medical University, Beijing, China.,Biobank for Clinical Data and Samples in Pediatrics, Beijing Children's Hospital, National Center for Children's Health, Beijing Pediatric Research Institute, Capital Medical University, Beijing, China.,Department of Otolaryngology, Head and Neck Surgery, Beijing Children's Hospital, National Center for Children's Health, Capital Medical University, Beijing, China
| | - Tieliu Shi
- Center for Bioinformatics and Computational Biology, and the Institute of Biomedical Sciences, School of Life Sciences, East China Normal University, Shanghai, China.,Department of Otolaryngology, Head and Neck Surgery, Beijing Children's Hospital, National Center for Children's Health, Capital Medical University, Beijing, China
|
21
|
Tasaki S, Gaiteri C, Mostafavi S, Yu L, Wang Y, De Jager PL, Bennett DA. Multi-omic Directed Networks Describe Features of Gene Regulation in Aged Brains and Expand the Set of Genes Driving Cognitive Decline. Front Genet 2018; 9:294. [PMID: 30140277 PMCID: PMC6095043 DOI: 10.3389/fgene.2018.00294] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Received: 04/18/2018] [Accepted: 07/13/2018] [Indexed: 01/10/2023] Open
Abstract
Multiple aspects of molecular regulation, including genetics, epigenetics, and mRNA collectively influence the development of age-related neurologic diseases. Therefore, with the ultimate goal of understanding molecular systems associated with cognitive decline, we infer directed interactions among regulatory elements in the local regulatory vicinity of individual genes based on brain multi-omics data from 413 individuals. These local regulatory networks (LRNs) capture the influences of genetics and epigenetics on gene expression in older adults. LRNs were confirmed through correspondence to known transcription biophysics. To relate LRNs to age-related neurologic diseases, we then incorporate common neuropathologies and measures of cognitive decline into this framework. This step identifies a specific set of largely neuronal genes, such as STAU1 and SEMA3F, predicted to control cognitive decline in older adults. These predictions are validated in separate cohorts by comparison to genetic associations for general cognition. LRNs are shared through www.molecular.network on the Rush Alzheimer’s Disease Center Resource Sharing Hub (www.radc.rush.edu).
Affiliation(s)
- Shinya Tasaki
- Rush Alzheimer's Disease Center, Rush University Medical Center, Chicago, IL, United States
- Chris Gaiteri
- Rush Alzheimer's Disease Center, Rush University Medical Center, Chicago, IL, United States
- Sara Mostafavi
- Department of Statistics, Department of Medical Genetics, University of British Columbia, Vancouver, BC, Canada
- Lei Yu
- Rush Alzheimer's Disease Center, Rush University Medical Center, Chicago, IL, United States
- Yanling Wang
- Rush Alzheimer's Disease Center, Rush University Medical Center, Chicago, IL, United States
- Philip L De Jager
- Center for Translational and Computational Neuroimmunology, Department of Neurology, Columbia University Medical Center, New York, NY, United States; Cell Circuits Program, Broad Institute, Cambridge, MA, United States
- David A Bennett
- Rush Alzheimer's Disease Center, Rush University Medical Center, Chicago, IL, United States
|
22
|
Hočevar K, Maver A, Kunej T, Peterlin B. Sarcoidosis Related Novel Candidate Genes Identified by Multi-Omics Integrative Analyses. OMICS 2018; 22:322-331. [PMID: 29688803 DOI: 10.1089/omi.2018.0027] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Indexed: 12/19/2022]
Abstract
Sarcoidosis is a multifactorial systemic disease characterized by granulomatous inflammation, with a major impact on global public health. The etiology and mechanisms of sarcoidosis are not fully understood. Recent high-throughput biological research has generated vast amounts of multi-omics big data on sarcoidosis, but their significance remains to be determined. We sought to identify novel candidate regions and genes consistently altered in heterogeneous omics studies so as to reveal the underlying molecular mechanisms. We conducted a comprehensive integrative literature analysis on global data on sarcoidosis, including genomic, transcriptomic, proteomic, and phenomic studies. We performed positional integration analysis of 38 eligible datasets originating from 17 different biological layers. Using an integration interval length of 50 kb, we identified 54 regions reaching significance value p ≤ 0.0001 and, when applying more stringent criteria, 15 regions with significance value p ≤ 0.00001. Secondary literature analysis of the top 20 regions, with the most significant accumulation of signals, revealed several novel candidate genes for which associations with sarcoidosis have not yet been established, but which have considerable support for their involvement based on omic data. These new plausible candidate genes include NELFE, CFB, EGFL7, AGPAT2, FKBPL, NRC3, and NEU1. Furthermore, annotated data were prepared to enable custom visualization and browsing of this sarcoidosis-related omics evidence in the University of California Santa Cruz (UCSC) Genome Browser. Further multi-omics approaches are called for to advance sarcoidosis biomarkers and diagnostic and therapeutic innovation. Our approach for harnessing multi-omics data and the findings presented herein reflect important steps toward understanding the etiology and underlying pathological mechanisms of sarcoidosis.
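The idea of positional integration can be sketched as follows: signals from different omics datasets are binned into fixed 50 kb genomic intervals, and each interval is scored by how many independent datasets support it. This is only an illustration under stated assumptions. The `integrate_positions` function, the dataset names, and the coordinates are hypothetical, and the significance testing reported in the abstract is not reproduced.

```python
from collections import defaultdict

INTERVAL = 50_000  # 50 kb integration interval, as stated in the abstract

def integrate_positions(datasets, interval=INTERVAL):
    """Count how many distinct datasets report a signal in each genomic bin."""
    support = defaultdict(set)
    for name, signals in datasets.items():
        for chrom, pos in signals:
            # Integer division maps a base-pair position to its interval index.
            support[(chrom, pos // interval)].add(name)
    return {bin_key: len(names) for bin_key, names in support.items()}

# Hypothetical signals as (chromosome, base-pair position); not real study data.
datasets = {
    "gwas": [("chr6", 31_950_000), ("chr9", 136_700_000)],
    "expression": [("chr6", 31_960_000)],
    "proteome": [("chr6", 31_990_000), ("chr1", 1_000)],
}
counts = integrate_positions(datasets)
top = max(counts, key=counts.get)  # bin supported by the most datasets
```

Counting distinct datasets rather than raw signals keeps a single study with many hits in one region from dominating the ranking.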
Affiliation(s)
- Keli Hočevar
- Clinical Institute of Medical Genetics, University Medical Centre Ljubljana, Ljubljana, Slovenia
- Aleš Maver
- Clinical Institute of Medical Genetics, University Medical Centre Ljubljana, Ljubljana, Slovenia
- Tanja Kunej
- Biotechnical Faculty, Department of Animal Science, University of Ljubljana, Jamnikarjeva 101, Ljubljana, Slovenia
- Borut Peterlin
- Clinical Institute of Medical Genetics, University Medical Centre Ljubljana, Ljubljana, Slovenia
|
23
|
Hu W, Lin D, Cao S, Liu J, Chen J, Calhoun VD, Wang YP. Adaptive Sparse Multiple Canonical Correlation Analysis With Application to Imaging (Epi)Genomics Study of Schizophrenia. IEEE Trans Biomed Eng 2018; 65:390-399. [PMID: 29364120 PMCID: PMC5826588 DOI: 10.1109/tbme.2017.2771483] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Indexed: 01/05/2023]
Abstract
Finding correlations across multiple data sets in imaging and (epi)genomics is a common challenge. Sparse multiple canonical correlation analysis (SMCCA) is a multivariate model widely used to extract contributing features from each data set while maximizing the cross-modality correlation. The model is built by combining the pairwise covariances between any two data sets. However, the scales of different pairwise covariances can differ considerably, so combining them directly in SMCCA is unfair. This problem of "unfair combination of pairwise covariances" restricts the power of SMCCA for feature selection. In this paper, we propose a novel formulation of SMCCA, called adaptive SMCCA, to overcome the problem by introducing adaptive weights when combining pairwise covariances. Both simulation and real-data analysis show that adaptive SMCCA outperforms conventional SMCCA and SMCCA with fixed weights in terms of feature selection. Large-scale numerical experiments show that adaptive SMCCA converges as fast as conventional SMCCA. When applying it to an imaging (epi)genetics study of schizophrenia subjects, we detect significant (epi)genetic variants and brain regions that are consistent with other existing reports. In addition, several significant brain-development-related pathways, e.g., neural tube development, are detected by our model, demonstrating that imaging-epigenetic associations may be overlooked by conventional SMCCA. All these results demonstrate that adaptive SMCCA is well suited for detecting three-way or multiway correlations and thus can find widespread applications in multiple omics and imaging data integration.
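The scale problem described in this abstract can be made concrete with a small sketch. It computes pairwise covariances between three toy one-feature data sets and shows that, in an unweighted sum, the pair with the largest scale dominates. The inverse-magnitude weighting at the end is an assumption chosen for illustration and is not necessarily the adaptive weighting defined in the paper; the data values and variable names are hypothetical.

```python
from itertools import combinations

def covariance(x, y):
    """Sample covariance of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)

# Hypothetical one-feature "data sets" measured on the same five subjects.
data = {
    "imaging": [1.0, 2.0, 3.0, 4.0, 5.0],
    "methylation": [0.01, 0.02, 0.03, 0.04, 0.05],
    "snp": [100.0, 200.0, 300.0, 400.0, 500.0],
}

pairs = list(combinations(data, 2))
cov = {(a, b): covariance(data[a], data[b]) for a, b in pairs}

# Unweighted combination: share of the total contributed by each pair.
total = sum(abs(v) for v in cov.values())
share = {p: abs(c) / total for p, c in cov.items()}

# Assumed weighting (illustrative only): inverse covariance magnitude,
# so that every pair contributes equally to the combined objective.
weights = {p: 1.0 / abs(c) for p, c in cov.items()}
weighted = {p: weights[p] * abs(cov[p]) for p in pairs}
```

Here the imaging-SNP pair accounts for nearly all of the unweighted total, which is the kind of imbalance that weighting the pairwise terms is meant to correct.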
Affiliation(s)
- Wenxing Hu
- Biomedical Engineering Department, Tulane University, New Orleans, LA 70118, USA
- Dongdong Lin
- Mind Research Network and Dept. of ECE, University of New Mexico, Albuquerque, NM, 87106
- Shaolong Cao
- Department of Bioinformatics & Computational Biology, UT MD Anderson Cancer Center, Houston, TX
- Jingyu Liu
- Mind Research Network and Dept. of ECE, University of New Mexico, Albuquerque, NM, 87106
- Jiayu Chen
- Mind Research Network and Dept. of ECE, University of New Mexico, Albuquerque, NM, 87106
- Vince D. Calhoun
- Mind Research Network and Dept. of ECE, University of New Mexico, Albuquerque, NM, 87106
- Yu-Ping Wang
- Biomedical Engineering Department, Tulane University, New Orleans, LA 70118, USA
|
24
|
Abstract
The number of publications on male infertility research is increasing. Technologies used in this research generate complex results and various types of data that need to be appropriately managed, arranged, and made available to other researchers for further use. In our previous study, we collected over 800 candidate loci for male fertility in seven mammalian species. However, continuing this work towards a comprehensive database of candidate genes associated with different types of idiopathic human male infertility is challenging because the information is fragmented and obtained from a variety of technologies and omics approaches. Results are published in different forms and usually need to be excavated from the text, which hinders the gathering of information. Standardized reporting of genetic anomalies as well as causative and risk factors of male infertility is therefore an important issue. The aim of the study was to collect examples of diverse genomic loci published in association with human male infertility and to propose a standardized format for reporting genetic causes of male infertility. From the currently available data we have selected 75 studies reporting 186 representative genomic loci which have been proposed as genetic risk factors for male infertility. Based on the collected and formatted data, we suggested a first step towards unification of reporting the genetics of male infertility in original and review studies. The proposed initiative consists of five relevant data types: 1) genetic locus, 2) race/ethnicity, number of participants (infertile/controls), 3) methodology, 4) phenotype (clinical data, disease ontology, and disease comorbidity), and 5) reference. The proposed form for standardized reporting presents a baseline for further optimization with additional genetic and clinical information. This data standardization initiative will enable faster multi-omics data integration, database development and sharing, establishing more targeted hypotheses, and facilitating biomarker discovery.
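As a rough illustration of what a record in such a standardized format might look like, the five data types listed in the abstract can be mapped to a small data structure. The class name, field names, and placeholder values below are hypothetical; the abstract specifies only the five categories, not a schema.

```python
from dataclasses import dataclass, asdict

@dataclass
class InfertilityLocusReport:
    """One record following the five data types listed in the abstract."""
    genetic_locus: str  # 1) genetic locus
    population: str     # 2) race/ethnicity, number of participants (infertile/controls)
    methodology: str    # 3) methodology
    phenotype: str      # 4) phenotype (clinical data, disease ontology, comorbidity)
    reference: str      # 5) reference

# Placeholder values only; they do not describe a real study.
record = InfertilityLocusReport(
    genetic_locus="GENE_X",
    population="example population, 100 infertile / 100 controls",
    methodology="example genotyping method",
    phenotype="example phenotype",
    reference="example citation",
)
row = asdict(record)  # plain dict, convenient for export to a table or database
```

A fixed set of fields like this is what would allow records from different studies to be merged without manual extraction from the text.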
Affiliation(s)
- Eva Traven
- Department of Animal Science, Biotechnical Faculty, University of Ljubljana, Slovenia
- Ana Ogrinc
- Department of Animal Science, Biotechnical Faculty, University of Ljubljana, Slovenia; Institute for Immunology, LMU Munich, Munich, Germany
- Tanja Kunej
- Department of Animal Science, Biotechnical Faculty, University of Ljubljana, Slovenia
|
25
|
Meng C, Zeleznik OA, Thallinger GG, Kuster B, Gholami AM, Culhane AC. Dimension reduction techniques for the integrative analysis of multi-omics data. Brief Bioinform 2016; 17:628-41. [PMID: 26969681 PMCID: PMC4945831 DOI: 10.1093/bib/bbv108] [Citation(s) in RCA: 190] [Impact Index Per Article: 23.8] [Received: 06/08/2015] [Revised: 10/26/2015] [Indexed: 01/16/2023] Open
Abstract
State-of-the-art next-generation sequencing, transcriptomics, proteomics and other high-throughput 'omics' technologies enable the efficient generation of large experimental data sets. These data may yield unprecedented knowledge about molecular pathways in cells and their role in disease. Dimension reduction approaches have been widely used in exploratory analysis of single omics data sets. This review will focus on dimension reduction approaches for simultaneous exploratory analyses of multiple data sets. These methods extract the linear relationships that best explain the correlated structure across data sets, the variability both within and between variables (or observations) and may highlight data issues such as batch effects or outliers. We explore dimension reduction techniques as one of the emerging approaches for data integration, and how these can be applied to increase our understanding of biological systems in normal physiological function and disease.
|