351
Integrated transcriptomic-genomic tool Texomer profiles cancer tissues. Nat Methods 2019; 16:401-404. [PMID: 30988467 DOI: 10.1038/s41592-019-0388-9] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 08/06/2018] [Accepted: 03/04/2019] [Indexed: 01/08/2023]
Abstract
Profiling of both the genome and the transcriptome promises a comprehensive, functional readout of a tissue sample, yet analytical approaches are required to translate the increased data dimensionality, heterogeneity and complexity into patient benefits. We developed a statistical approach called Texomer (https://github.com/KChen-lab/Texomer) that performs allele-specific, tumor-deconvoluted transcriptome-exome integration of autologous bulk whole-exome and transcriptome sequencing data. Texomer results in substantially improved accuracy in sample categorization and functional variant prioritization.
352
Sonawane AR, Weiss ST, Glass K, Sharma A. Network Medicine in the Age of Biomedical Big Data. Front Genet 2019; 10:294. [PMID: 31031797 PMCID: PMC6470635 DOI: 10.3389/fgene.2019.00294] [Citation(s) in RCA: 124] [Impact Index Per Article: 20.7] [Received: 12/26/2018] [Accepted: 03/19/2019] [Indexed: 12/13/2022] Open
Abstract
Network medicine is an emerging area of research dealing with molecular and genetic interactions, network biomarkers of disease, and therapeutic target discovery. Large-scale biomedical data generation offers a unique opportunity to assess the effect and impact of cellular heterogeneity and environmental perturbations on the observed phenotype. Marrying the two, network medicine with biomedical data provides a framework to build meaningful models and extract impactful results at a network level. In this review, we survey existing network types and biomedical data sources. More importantly, we delve into ways in which the network medicine approach, aided by phenotype-specific biomedical data, can be gainfully applied. We provide three paradigms, mainly dealing with three major biological network archetypes: protein-protein interaction, expression-based, and gene regulatory networks. For each of these paradigms, we discuss a broad overview of philosophies under which various network methods work. We also provide a few examples in each paradigm as a test case of its successful application. Finally, we delineate several opportunities and challenges in the field of network medicine. We hope this review provides a lexicon for researchers from biological sciences and network theory to come on the same page to work on research areas that require interdisciplinary expertise. Taken together, the understanding gained from combining biomedical data with networks can be useful for characterizing disease etiologies and identifying therapeutic targets, which, in turn, will lead to better preventive medicine with translational impact on personalized healthcare.
Affiliation(s)
- Abhijeet R. Sonawane
  - Channing Division of Network Medicine, Brigham and Women’s Hospital, Boston, MA, United States
  - Department of Medicine, Harvard Medical School, Boston, MA, United States
- Scott T. Weiss
  - Channing Division of Network Medicine, Brigham and Women’s Hospital, Boston, MA, United States
  - Department of Medicine, Harvard Medical School, Boston, MA, United States
- Kimberly Glass
  - Channing Division of Network Medicine, Brigham and Women’s Hospital, Boston, MA, United States
  - Department of Medicine, Harvard Medical School, Boston, MA, United States
- Amitabh Sharma
  - Channing Division of Network Medicine, Brigham and Women’s Hospital, Boston, MA, United States
  - Department of Medicine, Harvard Medical School, Boston, MA, United States
  - Center for Interdisciplinary Cardiovascular Sciences, Cardiovascular Division, Brigham and Women’s Hospital, Boston, MA, United States
353
Dai Y, Pei G, Zhao Z, Jia P. A Convergent Study of Genetic Variants Associated With Crohn's Disease: Evidence From GWAS, Gene Expression, Methylation, eQTL and TWAS. Front Genet 2019; 10:318. [PMID: 31024628 PMCID: PMC6467075 DOI: 10.3389/fgene.2019.00318] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 12/22/2018] [Accepted: 03/21/2019] [Indexed: 12/12/2022] Open
Abstract
Crohn’s Disease (CD) is one of the predominant forms of inflammatory bowel disease (IBD). A combination of genetic and non-genetic risk factors has been reported to contribute to the development of CD. Many high-throughput omics studies, such as genome-wide association studies (GWAS) and next-generation sequencing studies, have been conducted to identify disease-associated risk variants that might contribute to CD. A pressing need remains to prioritize and characterize candidate genes that underlie the etiology of CD. In this study, we collected comprehensive multi-dimensional data from GWAS, gene expression, and methylation studies and generated transcriptome-wide association study (TWAS) data to further interpret the GWAS association results. We applied our previously developed method called mega-analysis of Odds Ratio (MegaOR) to prioritize CD candidate genes (CDgenes). As a result, we identified consensus sets of CDgenes (62–235 genes) based on the evidence matrix. We demonstrated that these CDgenes interacted with each other significantly more frequently than randomly expected. Functional annotation of these genes highlighted critical immune-related processes such as immune response, MHC class II receptor activity, and immunological disorders. In particular, the constitutive photomorphogenesis 9 (COP9) signalosome-related genes were found to be significantly enriched in CDgenes, implying a potential role of the COP9 signalosome in the pathogenesis of CD. Finally, we found that some of the CDgenes shared biological functions with known drug targets of CD, such as the regulation of inflammatory response and leukocyte adhesion to vascular endothelial cells. In summary, we identified highly confident CDgenes from multi-dimensional evidence, providing insights for the understanding of CD etiology.
Affiliation(s)
- Yulin Dai
  - Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, United States
- Guangsheng Pei
  - Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, United States
- Zhongming Zhao
  - Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, United States
  - Human Genetics Center, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, United States
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, United States
- Peilin Jia
  - Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, United States
354
Wang W, Kandimalla R, Huang H, Zhu L, Li Y, Gao F, Goel A, Wang X. Molecular subtyping of colorectal cancer: Recent progress, new challenges and emerging opportunities. Semin Cancer Biol 2019; 55:37-52. [PMID: 29775690 PMCID: PMC6240404 DOI: 10.1016/j.semcancer.2018.05.002] [Citation(s) in RCA: 121] [Impact Index Per Article: 20.2] [Received: 11/27/2017] [Revised: 05/13/2018] [Accepted: 05/14/2018] [Indexed: 12/13/2022]
Abstract
Colorectal cancer (CRC) is one of the leading causes of cancer-related deaths worldwide. Similar to many other malignancies, CRC is a heterogeneous disease, which makes it clinically challenging to optimize treatment modalities that reduce the associated morbidity and mortality. A more precise understanding of the biological properties that distinguish patients with colorectal tumors, especially in terms of their clinical features, is a key requirement for more robust targeted-drug design and the implementation of individualized therapies. In recent decades, extensive studies have reported distinct CRC subtypes, with a mutation-centered view of tumor heterogeneity. More recently, however, the paradigm has shifted towards transcriptome-based classifications, represented by six independent CRC taxonomies. In 2015, the colorectal cancer subtyping consortium reported the identification of four consensus molecular subtypes (CMSs), providing thus far the most robust classification system for CRC. In this review, we summarize the historical timeline of CRC classification approaches and discuss their salient features and potential limitations that may require further refinement in the near future. In spite of the recent encouraging progress, several major challenges prevent translation of molecular knowledge gleaned from CMSs into the clinic. Herein, we summarize some of these potential challenges and discuss exciting new opportunities currently emerging in related fields. We believe that close collaborations between basic researchers, bioinformaticians and clinicians are imperative for addressing these challenges and eventually paving the path for CRC subtyping into routine clinical practice as we enter the era of personalized medicine.
Affiliation(s)
- Wei Wang
  - Department of Biomedical Sciences, City University of Hong Kong, Hong Kong
- Raju Kandimalla
  - Center for Gastrointestinal Research, Center for Translational Genomics and Oncology, Baylor Scott & White Research Institute and Charles A Sammons Cancer Center, Baylor Research Institute and Sammons Cancer Center, Baylor University Medical Center, 3410 Worth Street, Suite 610, Dallas, TX 75246, USA
- Hao Huang
  - College of Veterinary Medicine and Life Sciences, City University of Hong Kong, Hong Kong
- Lina Zhu
  - College of Veterinary Medicine and Life Sciences, City University of Hong Kong, Hong Kong
- Ying Li
  - Department of Biomedical Sciences, City University of Hong Kong, Hong Kong
- Feng Gao
  - College of Veterinary Medicine and Life Sciences, City University of Hong Kong, Hong Kong
- Ajay Goel
  - Center for Gastrointestinal Research, Center for Translational Genomics and Oncology, Baylor Scott & White Research Institute and Charles A Sammons Cancer Center, Baylor Research Institute and Sammons Cancer Center, Baylor University Medical Center, 3410 Worth Street, Suite 610, Dallas, TX 75246, USA
- Xin Wang
  - Department of Biomedical Sciences, City University of Hong Kong, Hong Kong
355
Xu A, Chen J, Peng H, Han G, Cai H. Simultaneous Interrogation of Cancer Omics to Identify Subtypes With Significant Clinical Differences. Front Genet 2019; 10:236. [PMID: 30984238 PMCID: PMC6448130 DOI: 10.3389/fgene.2019.00236] [Citation(s) in RCA: 35] [Impact Index Per Article: 5.8] [Received: 11/13/2018] [Accepted: 03/04/2019] [Indexed: 11/21/2022] Open
Abstract
Recent advances in high-throughput sequencing have accelerated the accumulation of omics data on the same tumor tissue from multiple sources. Intensive study of multi-omics integration on tumor samples can stimulate progress in precision medicine and is promising for detecting potential biomarkers. However, current methods are restricted owing to highly unbalanced dimensions of omics data or difficulty in assigning weights between different data sources. Therefore, the appropriate approximation and constraints of integrated targets remain a major challenge. In this paper, we propose an omics data integration method named high-order path elucidated similarity (HOPES). HOPES fuses the similarities derived from various omics data sources to resolve the dimensional discrepancy, and progressively elucidates the similarities from each type of omics data into an integrated similarity with various high-order connected paths. Through a series of incremental constraints for commonality, HOPES takes both the specificity of single data types and the consistency between different data types into consideration. The fused similarity matrix gives global insight into patients' correlation and efficiently distinguishes subgroups. We tested the performance of HOPES on both a simulated dataset and several empirical tumor datasets. The test datasets contain three omics types, including gene expression, DNA methylation, and microRNA data, for five different TCGA cancer projects. Our method was shown to achieve superior accuracy and high robustness compared with several benchmark methods on simulated data. Further experiments on five cancer datasets demonstrated that HOPES achieved superior performance in cancer classification. The stratified subgroups were shown to have statistically significant differences in survival. We further located and identified the key genes, methylation sites, and microRNAs within each subgroup. They were shown to have high potential prognostic value and were enriched in many cancer-related biological processes or pathways.
Affiliation(s)
- Aodan Xu
  - School of Computer Science and Engineering, South China University of Technology, Guangzhou, China
- Jiazhou Chen
  - School of Computer Science and Engineering, South China University of Technology, Guangzhou, China
- Hong Peng
  - School of Computer Science and Engineering, South China University of Technology, Guangzhou, China
- GuoQiang Han
  - School of Computer Science and Engineering, South China University of Technology, Guangzhou, China
- Hongmin Cai
  - School of Computer Science and Engineering, South China University of Technology, Guangzhou, China
356
Denecker T, Durand W, Maupetit J, Hébert C, Camadro JM, Poulain P, Lelandais G. Pixel: a content management platform for quantitative omics data. PeerJ 2019; 7:e6623. [PMID: 30944779 PMCID: PMC6441322 DOI: 10.7717/peerj.6623] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 10/05/2018] [Accepted: 02/14/2019] [Indexed: 12/30/2022] Open
Abstract
BACKGROUND In biology, high-throughput experimental technologies, also referred to as "omics" technologies, are increasingly used in research laboratories. Several thousand gene expression measurements can be obtained in a single experiment. Researchers routinely face the challenge of annotating, storing, exploring and mining all the biological information they have at their disposal. We present here the Pixel web application (Pixel Web App), an original content management platform to help people involved in a multi-omics biological project. METHODS The Pixel Web App is built with open source technologies and hosted on the collaborative development platform GitHub (https://github.com/Candihub/pixel). It is written in Python using the Django framework and stores all the data in a PostgreSQL database. It is developed in the open and licensed under the BSD 3-clause license. The Pixel Web App is also heavily tested with both unit and functional tests, strong code coverage and continuous integration provided by CircleCI. To ease the development and the deployment of the Pixel Web App, Docker and Docker Compose are used to bundle the application as well as its dependencies. RESULTS The Pixel Web App offers researchers an intuitive way to annotate, store, explore and mine their multi-omics results. It can be installed on a personal computer or on a server to fit the needs of many users. In addition, anyone can enhance the application to better suit their needs, either by contributing directly on GitHub (encouraged) or by extending Pixel on their own. The Pixel Web App does not provide any computational programs to analyze the data. Still, it helps to rapidly explore and mine existing results and holds a strategic position in the management of research data.
Affiliation(s)
- Thomas Denecker
  - CEA, CNRS, Univ. Paris-Sud, Institute for Integrative Biology of the Cell (I2BC), Gif-sur-Yvette, France
- Pierre Poulain
  - CNRS, Univ. Paris Diderot, Institut Jacques Monod (IJM), Paris, France
- Gaëlle Lelandais
  - CEA, CNRS, Univ. Paris-Sud, Institute for Integrative Biology of the Cell (I2BC), Gif-sur-Yvette, France
357
Abstract
Traumatic brain and spinal cord injuries cause permanent disability. Although progress has been made in understanding the cellular and molecular mechanisms underlying the pathophysiological changes that affect both structure and function after injury to the brain or spinal cord, there are currently no cures for either condition. This may change with the development and application of multi-layer omics, new sophisticated bioinformatics tools, and cutting-edge imaging techniques. Already, these technical advances, when combined, are revealing an unprecedented number of novel cellular and molecular targets that could be manipulated alone or in combination to repair the injured central nervous system with precision. In this review, we highlight recent advances in applying these new technologies to the study of axon regeneration and rebuilding of injured neural circuitry. We then discuss the challenges ahead to translate results produced by these technologies into clinical application to help improve the lives of individuals who have a brain or spinal cord injury.
Affiliation(s)
- Andrea Tedeschi
  - Department of Neuroscience and Discovery Themes Initiative, College of Medicine, Ohio State University, Columbus, Ohio, 43210, USA
- Phillip G Popovich
  - Center for Brain and Spinal Cord Repair, Institute for Behavioral Medicine Research, Ohio State University, Columbus, Ohio, 43210, USA
358
Yu J, Peng J, Chi H. Systems immunology: Integrating multi-omics data to infer regulatory networks and hidden drivers of immunity. Curr Opin Syst Biol 2019; 15:19-29. [PMID: 32789283 DOI: 10.1016/j.coisb.2019.03.003] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Indexed: 12/26/2022]
Abstract
The immune system is a highly complex and dynamic biological system. It operates through intracellular molecular networks and intercellular (cell-cell) interaction networks. Systems immunology is an emerging discipline that applies systems biology approaches of integrating high-throughput multi-omics measurements with computational network modeling to better understand immunity at various scales. In this review, we summarize key omics technologies and computational approaches used for immunological studies at both population and single-cell levels. We highlight the hidden driver analysis based on data-driven networks and comment on the potential of translating systems immunology discoveries to immunotherapy of cancer and other human diseases.
Affiliation(s)
- Jiyang Yu
  - Department of Computational Biology, St. Jude Children's Research Hospital, Memphis, TN 38105, USA
- Junmin Peng
  - Department of Structural Biology, St. Jude Children's Research Hospital, Memphis, TN 38105, USA
  - Department of Developmental Neurobiology, St. Jude Children's Research Hospital, Memphis, TN 38105, USA
  - Center for Proteomics and Metabolomics, St. Jude Children's Research Hospital, Memphis, TN 38105, USA
- Hongbo Chi
  - Department of Immunology, St. Jude Children's Research Hospital, Memphis, TN 38105, USA
359
Wang S, Ji F, Li Z, Xue M. Fluorescence imaging-based methods for single-cell protein analysis. Anal Bioanal Chem 2019; 411:4339-4347. [PMID: 30854595 DOI: 10.1007/s00216-019-01694-5] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 11/15/2018] [Revised: 01/05/2019] [Accepted: 02/15/2019] [Indexed: 12/17/2022]
Abstract
The quantity and activity of proteins in many biological systems exhibit prominent heterogeneities. Single-cell analytical methods can resolve subpopulations and dissect their unique signatures from heterogeneous samples, enabling a clarifying view of the biological process. Over the last 5 years, technologies for single-cell protein analysis have significantly advanced. In this article, we highlight a branch of those technology developments involving fluorescence-based approaches, with a focus on the methods that increase the ability to multiplex and enable dynamic measurements. We also analyze the limitations of these techniques and discuss current challenges in the field, with the hope that more transformative platforms can soon emerge.
Affiliation(s)
- Siwen Wang
  - Department of Chemistry, University of California, Riverside, Riverside, CA, 92521, USA
- Fei Ji
  - Department of Chemistry, University of California, Riverside, Riverside, CA, 92521, USA
- Zhonghan Li
  - Department of Chemistry, University of California, Riverside, Riverside, CA, 92521, USA
- Min Xue
  - Department of Chemistry, University of California, Riverside, Riverside, CA, 92521, USA
360
Huang Z, Zhan X, Xiang S, Johnson TS, Helm B, Yu CY, Zhang J, Salama P, Rizkalla M, Han Z, Huang K. SALMON: Survival Analysis Learning With Multi-Omics Neural Networks on Breast Cancer. Front Genet 2019; 10:166. [PMID: 30906311 PMCID: PMC6419526 DOI: 10.3389/fgene.2019.00166] [Citation(s) in RCA: 133] [Impact Index Per Article: 22.2] [Received: 12/01/2018] [Accepted: 02/14/2019] [Indexed: 12/22/2022] Open
Abstract
Improved cancer prognosis is a central goal for precision health medicine. Though many models can predict differential survival from data, there is a strong need for sophisticated algorithms that can aggregate and filter relevant predictors from increasingly complex data inputs. In turn, these models should provide deeper insight into which types of data are most relevant to improve prognosis. Deep Learning-based neural networks offer a potential solution for both problems because they are highly flexible and account for data complexity in a non-linear fashion. In this study, we implement Deep Learning-based networks to determine how gene expression data predicts Cox regression survival in breast cancer. We accomplish this through an algorithm called SALMON (Survival Analysis Learning with Multi-Omics Neural Networks), which aggregates and simplifies gene expression data and cancer biomarkers to enable prognosis prediction. The results revealed improved performance when more omics data were used in model construction. Rather than using raw gene expression values as model inputs, we use eigengene modules derived from gene co-expression network analysis. The corresponding high-impact co-expression modules and other omics data are identified by a feature selection technique and then examined through enrichment analysis of biological functions, which elevates the interpretation of input features from the gene level to the co-expression module level. Our study shows the feasibility of discovering breast cancer-related co-expression modules and sketches a blueprint for future endeavors in Deep Learning-based survival analysis. SALMON source code is available at https://github.com/huangzhii/SALMON/.
Affiliation(s)
- Zhi Huang
  - School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN, United States
  - Department of Medicine, Indiana University School of Medicine, Indianapolis, IN, United States
  - Department of Electrical and Computer Engineering, Indiana University-Purdue University Indianapolis, Indianapolis, IN, United States
- Xiaohui Zhan
  - Department of Medicine, Indiana University School of Medicine, Indianapolis, IN, United States
  - National-Regional Key Technology Engineering Laboratory for Medical Ultrasound, Guangdong Key Laboratory for Biomedical Measurements and Ultrasound Imaging, School of Biomedical Engineering, Health Science Center, Shenzhen University, Shenzhen, China
- Shunian Xiang
  - National-Regional Key Technology Engineering Laboratory for Medical Ultrasound, Guangdong Key Laboratory for Biomedical Measurements and Ultrasound Imaging, School of Biomedical Engineering, Health Science Center, Shenzhen University, Shenzhen, China
  - Department of Medical and Molecular Genetics, Indiana University School of Medicine, Indianapolis, IN, United States
- Travis S Johnson
  - Department of Medicine, Indiana University School of Medicine, Indianapolis, IN, United States
  - Department of Biomedical Informatics, The Ohio State University, Columbus, OH, United States
- Bryan Helm
  - Department of Medicine, Indiana University School of Medicine, Indianapolis, IN, United States
- Christina Y Yu
  - Department of Medicine, Indiana University School of Medicine, Indianapolis, IN, United States
  - Department of Biomedical Informatics, The Ohio State University, Columbus, OH, United States
- Jie Zhang
  - Department of Medical and Molecular Genetics, Indiana University School of Medicine, Indianapolis, IN, United States
- Paul Salama
  - Department of Electrical and Computer Engineering, Indiana University-Purdue University Indianapolis, Indianapolis, IN, United States
- Maher Rizkalla
  - Department of Electrical and Computer Engineering, Indiana University-Purdue University Indianapolis, Indianapolis, IN, United States
- Zhi Han
  - Department of Medicine, Indiana University School of Medicine, Indianapolis, IN, United States
  - Regenstrief Institute, Indianapolis, IN, United States
- Kun Huang
  - Department of Medicine, Indiana University School of Medicine, Indianapolis, IN, United States
  - Department of Electrical and Computer Engineering, Indiana University-Purdue University Indianapolis, Indianapolis, IN, United States
  - Regenstrief Institute, Indianapolis, IN, United States
361
Saez-Rodriguez J, Rinschen MM, Floege J, Kramann R. Big science and big data in nephrology. Kidney Int 2019; 95:1326-1337. [PMID: 30982672 DOI: 10.1016/j.kint.2018.11.048] [Citation(s) in RCA: 47] [Impact Index Per Article: 7.8] [Received: 06/01/2018] [Revised: 11/11/2018] [Accepted: 11/20/2018] [Indexed: 12/16/2022]
Abstract
There have been tremendous advances during the last decade in methods for large-scale, high-throughput data generation and in novel computational approaches to analyze these datasets. These advances have had a profound impact on biomedical research and clinical medicine. The field of genomics is rapidly developing toward single-cell analysis, and major advances in proteomics and metabolomics have been made in recent years. Developments in wearables and electronic health records are poised to change clinical trial design. This rise of 'big data' holds the promise to transform not only research progress, but also clinical decision making towards precision medicine. To have a true impact, it requires integrative and multi-disciplinary approaches that blend experimental, clinical and computational expertise across multiple institutions. Cancer research has been at the forefront of the progress in such large-scale initiatives, so-called 'big science,' with an emphasis on precision medicine, and various other areas are quickly catching up. Nephrology is arguably lagging behind, and hence these are exciting times to start (or redirect) a research career to leverage these developments in nephrology. In this review, we summarize advances in big data generation, computational analysis, and big science initiatives, with a special focus on applications to nephrology.
Affiliation(s)
- Julio Saez-Rodriguez
  - RWTH Aachen University, Faculty of Medicine, Joint Research Centre for Computational Biomedicine (JRC-COMBINE), Aachen, Germany
  - Institute for Computational Biomedicine, Heidelberg University, Faculty of Medicine, and Heidelberg University Hospital, Heidelberg, Germany
  - Molecular Medicine Partnership Unit (MMPU), European Molecular Biology Laboratory and Heidelberg University, Heidelberg, Germany
- Markus M Rinschen
  - Department II of Internal Medicine, and Center for Molecular Medicine Cologne, University of Cologne, Cologne, Germany
  - Center for Mass Spectrometry and Metabolomics, The Scripps Research Institute, La Jolla, California, USA
- Jürgen Floege
  - RWTH Aachen, Department of Nephrology and Clinical Immunology, Aachen, Germany
- Rafael Kramann
  - RWTH Aachen, Department of Nephrology and Clinical Immunology, Aachen, Germany
  - Department of Internal Medicine, Nephrology and Transplantation, Erasmus Medical Center, Rotterdam, The Netherlands
362
Wang Q, Peng WX, Wang L, Ye L. Toward multiomics-based next-generation diagnostics for precision medicine. Per Med 2019; 16:157-170. [PMID: 30816060 DOI: 10.2217/pme-2018-0085] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Indexed: 12/12/2022]
Abstract
Our healthcare system is experiencing a paradigm shift to precision medicine, aiming at an early prediction of individual disease risks and targeted interventions. Whole-genome sequencing is currently gaining momentum, as it has the potential to capture all classes of genetic variation, thus providing a more complete picture of the individual's genetic makeup, which could be utilized in genetic testing. However, this will also lead to difficulties in interpreting the test results, necessitating careful integration of genomic data with other layers of information: molecular multiomics measurements of the epigenome, transcriptome, proteome, metabolome and even microbiome, as well as comprehensive information on diet, lifestyle and environment. Overall, the translation of patient-specific data into actionable diagnostic tools will be a challenging task, requiring expertise from multiple disciplines, secure data sharing in large reference databases and a strong computational infrastructure.
Affiliation(s)
- Qi Wang
  - Department of Emergency Medicine, Hangzhou Hospital of Traditional Chinese Medicine, Hangzhou 310007, Zhejiang Province, China
- Wei-Xian Peng
  - Department of Emergency Medicine, Hangzhou Hospital of Traditional Chinese Medicine, Hangzhou 310007, Zhejiang Province, China
- Lu Wang
  - Department of Emergency Medicine, Hangzhou Hospital of Traditional Chinese Medicine, Hangzhou 310007, Zhejiang Province, China
- Li Ye
  - Department of Nursing, Tongde Hospital of Zhejiang Province, Hangzhou 310012, Zhejiang Province, China
363
Mitropoulos K, Katsila T, Patrinos GP, Pampalakis G. Multi-Omics for Biomarker Discovery and Target Validation in Biofluids for Amyotrophic Lateral Sclerosis Diagnosis. OMICS: A Journal of Integrative Biology 2019; 22:52-64. [PMID: 29356625 DOI: 10.1089/omi.2017.0183] [Citation(s) in RCA: 25] [Impact Index Per Article: 4.2] [Indexed: 12/11/2022]
Abstract
Amyotrophic lateral sclerosis (ALS) is a rare but usually fatal neurodegenerative disease characterized by motor neuron degeneration in the brain and the spinal cord. Two forms are recognized, the familial that accounts for 5-10% and the sporadic that accounts for the rest. New studies suggest that ALS is a highly heterogeneous disease, and this diversity is a major reason for the lack of successful therapeutic treatments. Indeed, only two drugs (riluzole and edaravone) have been approved that provide a limited improvement in the quality of life. Presently, the diagnosis of ALS is based on clinical examination and lag period from the onset of symptoms to the final diagnosis is ∼12 months. Therefore, the discovery of robust molecular biomarkers that can assist in the diagnosis is of major importance. DNA sequencing to identify pathogenic gene variants can be applied in the cases of familial ALS. However, it is not a routinely used diagnostic procedure and most importantly, it cannot be applied in the diagnosis of sporadic ALS. In this expert review, the current approaches in identification of new ALS biomarkers are discussed. The advent of various multi-omics biotechnology platforms, including miRNomics, proteomics, metabolomics, metallomics, volatolomics, and viromics, has assisted in the identification of new biomarkers. The biofluids are the most preferable material for the analysis of potential biomarkers (such as proteins and cell-free miRNAs), since they are easily obtained. In the near future, the biofluid-based biomarkers will be indispensable to classify different ALS subtypes and understand the molecular heterogeneity of the disease.
Affiliation(s)
- Konstantinos Mitropoulos
- Department of Histology and Embryology, University of Athens School of Medicine, Athens, Greece
- Theodora Katsila
- Department of Pharmacy, University of Patras School of Health Sciences, Patras, Greece
- George P Patrinos
- Department of Pharmacy, University of Patras School of Health Sciences, Patras, Greece; Department of Pharmacy, College of Medicine and Health Sciences, United Arab Emirates University, Al Ain, UAE
- Georgios Pampalakis
- Department of Pharmacy, University of Patras School of Health Sciences, Patras, Greece
|
364
|
Han Y, Ye X, Cheng J, Zhang S, Feng W, Han Z, Zhang J, Huang K. Integrative analysis based on survival associated co-expression gene modules for predicting Neuroblastoma patients' survival time. Biol Direct 2019; 14:4. [PMID: 30760313 PMCID: PMC6375203 DOI: 10.1186/s13062-018-0229-2] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 11/28/2017] [Accepted: 11/20/2018] [Indexed: 12/03/2022] Open
Abstract
Background: More than 90% of neuroblastoma patients in the low-risk group are cured, whereas fewer than 50% of those with high-risk disease can be cured. Since the high-risk patients still have poor outcomes, we need more accurate stratification to establish an individualized precise treatment plan for the patients to improve the long-term survival rate. Results: We focus on extracting features and providing a workflow to improve survival prediction for neuroblastoma patients. With a workflow for gene co-expression network (GCN) mining in microarray and RNA-Seq datasets, we extracted molecular features from each co-expressed module and summarized them into eigengenes. Then we adopted the lasso-regularized Cox proportional hazards model to select the most informative eigengene features regarding association to the risk of metastasis. Nine eigengenes were selected which show strong association with patient survival prognosis. All of the nine corresponding gene modules also have highly enriched biological functions or cytoband locations. Three of them are unique modules to RNA-Seq data, which complement the modules from microarray data in terms of survival prognosis. We then merged all eigengenes from these unique modules and used an integrative method called Similarity Network Fusion to test the prognostic power of these eigengenes. The prognostic accuracies are significantly improved as compared to using all eigengenes, and a subgroup of patients with very poor survival rate was identified. Conclusions: We first compared GCNs mined from microarray and RNA-seq data. We discovered that each data modality yields unique GCNs, which are enriched with clear biological functions. We then performed a module uniqueness analysis and used a lasso-Cox model to select survival-associated eigengenes. Integration of unique and survival-associated eigengenes from both data types provides complementary information that leads to more accurate survival prognosis.
Reviewers: Reviewed by Susmita Datta, Marco Chierici and Dimitar Vassilev. Electronic supplementary material: The online version of this article (10.1186/s13062-018-0229-2) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Yatong Han
- Department of Automation, Harbin Engineering University, Harbin, China; Department of Neurosurgery, Stanford University, California, USA
- Xiufen Ye
- Department of Automation, Harbin Engineering University, Harbin, China
- Jun Cheng
- Department of Medicine, Indiana University School of Medicine, Indianapolis, USA; School of Biomedical Engineering, Shenzhen University, Shenzhen, China
- Siyuan Zhang
- Department of Automation, Harbin Engineering University, Harbin, China
- Weixing Feng
- Department of Automation, Harbin Engineering University, Harbin, China
- Zhi Han
- Department of Medicine, Indiana University School of Medicine, Indianapolis, USA
- Jie Zhang
- Department of Medical and Molecular Genetics, Indiana University School of Medicine, Indianapolis, USA
- Kun Huang
- Department of Medicine, Indiana University School of Medicine, Indianapolis, USA; Regenstrief Institute, Indianapolis, USA
|
365
|
Rao MS, Van Vleet TR, Ciurlionis R, Buck WR, Mittelstadt SW, Blomme EAG, Liguori MJ. Comparison of RNA-Seq and Microarray Gene Expression Platforms for the Toxicogenomic Evaluation of Liver From Short-Term Rat Toxicity Studies. Front Genet 2019; 9:636. [PMID: 30723492 PMCID: PMC6349826 DOI: 10.3389/fgene.2018.00636] [Citation(s) in RCA: 139] [Impact Index Per Article: 23.2] [Received: 09/15/2018] [Accepted: 11/27/2018] [Indexed: 12/12/2022] Open
Abstract
Gene expression profiling is a useful tool to predict and interrogate mechanisms of toxicity. RNA-Seq technology has emerged as an attractive alternative to traditional microarray platforms for conducting transcriptional profiling. The objective of this work was to compare both transcriptomic platforms to determine whether RNA-Seq offered significant advantages over microarrays for toxicogenomic studies. RNA samples from the livers of rats treated for 5 days with five tool hepatotoxicants (α-naphthylisothiocyanate/ANIT, carbon tetrachloride/CCl4, methylenedianiline/MDA, acetaminophen/APAP, and diclofenac/DCLF) were analyzed with both gene expression platforms (RNA-Seq and microarray). Data were compared to determine any potential added scientific (i.e., better biological or toxicological insight) value offered by RNA-Seq compared to microarrays. RNA-Seq identified more differentially expressed protein-coding genes and provided a wider quantitative range of expression level changes when compared to microarrays. Both platforms identified a larger number of differentially expressed genes (DEGs) in livers of rats treated with ANIT, MDA, and CCl4 compared to APAP and DCLF, in agreement with the severity of histopathological findings. Approximately 78% of DEGs identified with microarrays overlapped with RNA-Seq data, with a Spearman’s correlation of 0.7 to 0.83. Consistent with the mechanisms of toxicity of ANIT, APAP, MDA and CCl4, both platforms identified dysregulation of liver relevant pathways such as Nrf2, cholesterol biosynthesis, eiF2, hepatic cholestasis, glutathione and LPS/IL-1 mediated RXR inhibition. RNA-Seq data showed additional DEGs that not only significantly enriched these pathways, but also suggested modulation of additional liver relevant pathways. In addition, RNA-Seq enabled the identification of non-coding DEGs that offer a potential for improved mechanistic clarity. 
Overall, these results indicate that RNA-Seq is an acceptable alternative platform to microarrays for rat toxicogenomic studies with several advantages. Because of its wider dynamic range as well as its ability to identify a larger number of DEGs, RNA-Seq may generate more insight into mechanisms of toxicity. However, more extensive reference data will be necessary to fully leverage these additional RNA-Seq data, especially for non-coding sequences.
Affiliation(s)
- Mohan S Rao
- Investigative Toxicology and Pathology, Global Preclinical Safety, AbbVie, North Chicago, IL, United States
- Terry R Van Vleet
- Investigative Toxicology and Pathology, Global Preclinical Safety, AbbVie, North Chicago, IL, United States
- Rita Ciurlionis
- Investigative Toxicology and Pathology, Global Preclinical Safety, AbbVie, North Chicago, IL, United States
- Wayne R Buck
- Investigative Toxicology and Pathology, Global Preclinical Safety, AbbVie, North Chicago, IL, United States
- Scott W Mittelstadt
- Investigative Toxicology and Pathology, Global Preclinical Safety, AbbVie, North Chicago, IL, United States
- Eric A G Blomme
- Investigative Toxicology and Pathology, Global Preclinical Safety, AbbVie, North Chicago, IL, United States
- Michael J Liguori
- Investigative Toxicology and Pathology, Global Preclinical Safety, AbbVie, North Chicago, IL, United States
|
366
|
Wu C, Zhou F, Ren J, Li X, Jiang Y, Ma S. A Selective Review of Multi-Level Omics Data Integration Using Variable Selection. High Throughput 2019; 8:E4. [PMID: 30669303 PMCID: PMC6473252 DOI: 10.3390/ht8010004] [Citation(s) in RCA: 122] [Impact Index Per Article: 20.3] [Received: 11/22/2018] [Revised: 12/24/2018] [Accepted: 01/10/2019] [Indexed: 01/02/2023] Open
Abstract
High-throughput technologies have been used to generate a large amount of omics data. In the past, single-level analysis has been extensively conducted where the omics measurements at different levels, including mRNA, microRNA, CNV and DNA methylation, are analyzed separately. As the molecular complexity of disease etiology exists at all different levels, integrative analysis offers an effective way to borrow strength across multi-level omics data and can be more powerful than single level analysis. In this article, we focus on reviewing existing multi-omics integration studies by paying special attention to variable selection methods. We first summarize published reviews on integrating multi-level omics data. Next, after a brief overview on variable selection methods, we review existing supervised, semi-supervised and unsupervised integrative analyses within parallel and hierarchical integration studies, respectively. The strength and limitations of the methods are discussed in detail. No existing integration method can dominate the rest. The computation aspects are also investigated. The review concludes with possible limitations and future directions for multi-level omics data integration.
Affiliation(s)
- Cen Wu
- Department of Statistics, Kansas State University, Manhattan, KS 66506, USA.
- Fei Zhou
- Department of Statistics, Kansas State University, Manhattan, KS 66506, USA.
- Jie Ren
- Department of Statistics, Kansas State University, Manhattan, KS 66506, USA.
- Xiaoxi Li
- Department of Statistics, Kansas State University, Manhattan, KS 66506, USA.
- Yu Jiang
- Division of Epidemiology, Biostatistics and Environmental Health, School of Public Health, University of Memphis, Memphis, TN 38152, USA.
- Shuangge Ma
- Department of Biostatistics, School of Public Health, Yale University, New Haven, CT 06510, USA.
|
367
|
Chaudhary K, Poirion OB, Lu L, Huang S, Ching T, Garmire LX. Multimodal Meta-Analysis of 1,494 Hepatocellular Carcinoma Samples Reveals Significant Impact of Consensus Driver Genes on Phenotypes. Clin Cancer Res 2019; 25:463-472. [PMID: 30242023 PMCID: PMC6542354 DOI: 10.1158/1078-0432.ccr-18-0088] [Citation(s) in RCA: 31] [Impact Index Per Article: 5.2] [Received: 01/10/2018] [Revised: 05/28/2018] [Accepted: 09/17/2018] [Indexed: 01/04/2023]
Abstract
Although driver genes in hepatocellular carcinoma (HCC) have been investigated in various previous genetic studies, the prevalence of key driver genes among heterogeneous populations is unknown. Moreover, the phenotypic associations of these driver genes are poorly understood. This report aims to reveal the phenotypic impacts of a group of consensus driver genes in HCC. We used MutSigCV and OncodriveFM modules implemented in the IntOGen pipeline to identify consensus driver genes across six HCC cohorts comprising 1,494 samples in total. To assess their global impacts, we used The Cancer Genome Atlas (TCGA) mutations and copy-number variations to predict the transcriptomics data, under generalized linear models. We further investigated the associations of the consensus driver genes with patient survival, age, gender, race, and risk factors. We identify 10 consensus driver genes across six HCC cohorts in total. Integrative analysis of driver mutations, copy-number variations, and transcriptomic data reveals that these consensus driver mutations and their copy-number variations are associated with a majority (62.5%) of the mRNA transcriptome but only a small fraction (8.9%) of miRNAs. Genes associated with TP53, CTNNB1, and ARID1A mutations contribute to the tripod of most densely connected pathway clusters. These driver genes are significantly associated with patients' overall survival. Some driver genes are significantly linked to HCC gender (CTNNB1, ALB, TP53, and AXIN1), race (TP53 and CDKN2A), and age (RB1) disparities. This study prioritizes a group of consensus drivers in HCC, which collectively show vast impacts on the phenotypes. These driver genes may be valuable therapeutic targets for HCC.
Affiliation(s)
- Olivier B Poirion
- Epidemiology Program, University of Hawaii Cancer Center, Honolulu, Hawaii
- Liangqun Lu
- Epidemiology Program, University of Hawaii Cancer Center, Honolulu, Hawaii
- Molecular Biosciences and Bioengineering Graduate Program, University of Hawaii at Manoa, Honolulu, Hawaii
- Sijia Huang
- Epidemiology Program, University of Hawaii Cancer Center, Honolulu, Hawaii
- Molecular Biosciences and Bioengineering Graduate Program, University of Hawaii at Manoa, Honolulu, Hawaii
- Travers Ching
- Epidemiology Program, University of Hawaii Cancer Center, Honolulu, Hawaii
- Molecular Biosciences and Bioengineering Graduate Program, University of Hawaii at Manoa, Honolulu, Hawaii
- Lana X Garmire
- Epidemiology Program, University of Hawaii Cancer Center, Honolulu, Hawaii
- Molecular Biosciences and Bioengineering Graduate Program, University of Hawaii at Manoa, Honolulu, Hawaii
|
368
|
Grabowski P, Rappsilber J. A Primer on Data Analytics in Functional Genomics: How to Move from Data to Insight? Trends Biochem Sci 2019; 44:21-32. [PMID: 30522862 PMCID: PMC6318833 DOI: 10.1016/j.tibs.2018.10.010] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 07/02/2018] [Revised: 10/19/2018] [Accepted: 10/25/2018] [Indexed: 02/06/2023]
Abstract
High-throughput methodologies and machine learning have been central in developing systems-level perspectives in molecular biology. Unfortunately, performing such integrative analyses has traditionally been reserved for bioinformaticians. This is now changing with the appearance of resources to help bench-side biologists become skilled at computational data analysis and handling large omics data sets. Here, we show an entry route into the field of omics data analytics. We provide information about easily accessible data sources and suggest some first steps for aspiring computational data analysts. Moreover, we highlight how machine learning is transforming the field and how it can help make sense of biological data. Finally, we suggest good starting points for self-learning and hope to convince readers that computational data analysis and programming are not intimidating.
Affiliation(s)
- Piotr Grabowski
- Bioanalytics, Institute of Biotechnology, Technische Universität Berlin, 13355 Berlin, Germany
- Juri Rappsilber
- Bioanalytics, Institute of Biotechnology, Technische Universität Berlin, 13355 Berlin, Germany; Wellcome Centre for Cell Biology, University of Edinburgh, Edinburgh EH9 3BF, UK.
|
369
|
Diseases and their clinical heterogeneity – Are we ignoring the SNiPers and micRomaNAgers? An illustration using Beta-thalassemia clinical spectrum and fetal hemoglobin levels. Genomics 2019; 111:67-75. [DOI: 10.1016/j.ygeno.2018.01.002] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 10/31/2017] [Revised: 12/18/2017] [Accepted: 01/03/2018] [Indexed: 12/18/2022]
|
370
|
Balluff B, Buck A, Martin‐Lorenzo M, Dewez F, Langer R, McDonnell LA, Walch A, Heeren RM. Integrative Clustering in Mass Spectrometry Imaging for Enhanced Patient Stratification. Proteomics Clin Appl 2019; 13:e1800137. [PMID: 30580496 PMCID: PMC6590511 DOI: 10.1002/prca.201800137] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 08/22/2018] [Revised: 11/28/2018] [Indexed: 12/04/2022]
Abstract
SCOPE: In biomedical research, mass spectrometry imaging (MSI) can obtain spatially-resolved molecular information from tissue sections. Especially matrix-assisted laser desorption/ionization (MALDI) MSI offers, depending on the type of matrix, the detection of a broad variety of molecules ranging from metabolites to proteins, thereby facilitating the collection of multilevel molecular data. Lately, integrative clustering techniques have been developed that make use of the complementary information of multilevel molecular data in order to better stratify patient cohorts, but which have not yet been applied in the field of MSI. MATERIALS AND METHODS: In this study, the potential of integrative clustering is investigated for multilevel molecular MSI data to subdivide cancer patients into different prognostic groups. Metabolomic and peptidomic data are obtained by MALDI-MSI from a tissue microarray containing material of 46 esophageal cancer patients. The integrative clustering methods Similarity Network Fusion, iCluster, and moCluster are applied and compared to non-integrated clustering. CONCLUSION: The results show that the combination of multilevel molecular data increases the capability of integrative algorithms to detect patient subgroups with different clinical outcome, compared to the single level or concatenated data. This underlines the potential of multilevel molecular data from the same subject using MSI for subsequent integrative clustering.
Affiliation(s)
- Benjamin Balluff
- Maastricht MultiModal Molecular Imaging institute (M4I), Maastricht University, 6229 ER Maastricht, The Netherlands
- Achim Buck
- Research Unit Analytical Pathology, Helmholtz Zentrum München, 85764 Oberschleißheim, Germany
- Marta Martin‐Lorenzo
- Maastricht MultiModal Molecular Imaging institute (M4I), Maastricht University, 6229 ER Maastricht, The Netherlands
- Frédéric Dewez
- Maastricht MultiModal Molecular Imaging institute (M4I), Maastricht University, 6229 ER Maastricht, The Netherlands
- Rupert Langer
- Institute of Pathology, University of Bern, CH‐3008 Bern, Switzerland
- Axel Walch
- Research Unit Analytical Pathology, Helmholtz Zentrum München, 85764 Oberschleißheim, Germany
- Ron M.A. Heeren
- Maastricht MultiModal Molecular Imaging institute (M4I), Maastricht University, 6229 ER Maastricht, The Netherlands
|
371
|
Prosperi M, Min JS, Bian J, Modave F. Big data hurdles in precision medicine and precision public health. BMC Med Inform Decis Mak 2018; 18:139. [PMID: 30594159 PMCID: PMC6311005 DOI: 10.1186/s12911-018-0719-2] [Citation(s) in RCA: 87] [Impact Index Per Article: 12.4] [Received: 05/15/2018] [Accepted: 12/04/2018] [Indexed: 12/18/2022] Open
Abstract
BACKGROUND: Nowadays, trendy research in biomedical sciences juxtaposes the term 'precision' to medicine and public health with companion words like big data, data science, and deep learning. Technological advancements permit the collection and merging of large heterogeneous datasets from different sources, from genome sequences to social media posts or from electronic health records to wearables. Additionally, complex algorithms supported by high-performance computing allow one to transform these large datasets into knowledge. Despite such progress, many barriers still exist against achieving precision medicine and precision public health interventions for the benefit of the individual and the population. MAIN BODY: The present work focuses on analyzing both the technical and societal hurdles related to the development of prediction models of health risks, diagnoses and outcomes from integrated biomedical databases. Methodological challenges that need to be addressed include improving semantics of study designs: medical record data are inherently biased, and even the most advanced deep learning's denoising autoencoders cannot overcome the bias if not handled a priori by design. Societal challenges to face include evaluation of ethically actionable risk factors at the individual and population level; for instance, usage of gender, race, or ethnicity as risk modifiers, not as biological variables, could be replaced by modifiable environmental proxies such as lifestyle and dietary habits, household income, or access to educational resources. CONCLUSIONS: Data science for precision medicine and public health warrants an informatics-oriented formalization of the study design and interoperability throughout all levels of the knowledge inference process, from the research semantics, to model development, and ultimately to implementation.
Affiliation(s)
- Mattia Prosperi
- Department of Epidemiology, College of Medicine & College of Public Health and Health Professions, University of Florida, Gainesville, FL, 32610, USA.
- Jae S Min
- Department of Epidemiology, College of Medicine & College of Public Health and Health Professions, University of Florida, Gainesville, FL, 32610, USA
- Jiang Bian
- Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, Gainesville, FL, 32610, USA
- François Modave
- Center for Health Outcomes and Informatics Research, Loyola University Chicago, Maywood, IL, 60153, USA
|
372
|
Zhou G, Xia J. Using OmicsNet for Network Integration and 3D Visualization. Curr Protoc Bioinformatics 2018; 65:e69. [DOI: 10.1002/cpbi.69] [Citation(s) in RCA: 35] [Impact Index Per Article: 5.0] [Indexed: 12/18/2022]
Affiliation(s)
- Guangyan Zhou
- Institute of Parasitology, McGill University, Sainte Anne de Bellevue, Quebec, Canada
- Jianguo Xia
- Institute of Parasitology, McGill University, Sainte Anne de Bellevue, Quebec, Canada
- Department of Animal Sciences, McGill University, Sainte Anne de Bellevue, Quebec, Canada
- Department of Microbiology and Immunology, McGill University, Montreal, Quebec, Canada
|
373
|
"Omics" data integration and functional analyses link Enoyl-CoA hydratase, short chain 1 to drug refractory dilated cardiomyopathy. BMC Med Genomics 2018; 11:110. [PMID: 30541556 PMCID: PMC6292014 DOI: 10.1186/s12920-018-0439-6] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/02/2018] [Accepted: 11/27/2018] [Indexed: 01/06/2023] Open
Abstract
Background: Large-scale “omics” datasets have not been leveraged and integrated with functional analyses to discover potential drivers of cardiomyopathy. This study addresses the knowledge gap. Methods: We coupled RNA sequence (RNA-Seq) variant detection and transcriptome profiling with pathway analysis to model drug refractory dilated cardiomyopathy (drDCM) using the BaseSpace sequencing hub and Ingenuity Pathway Analysis. We used RNA-Seq case-control datasets (n = 6 cases, n = 4 controls), exome sequence familial DCM datasets (n = 3 Italians, n = 5 Italians, n = 5 Chinese), and controls from the HapMap project (n = 5 Caucasians, and n = 5 Asians) for disease modeling and putative mutation discovery. Variant replication datasets: n = 128 cases and n = 15 controls. Source of datasets: NCBI Sequence Read Archive. Statistics: Pairwise differential expression analyses to determine differentially expressed genes and t-tests to calculate p-values. We adjusted for false discovery rates and reported q-values. We used chi-square tests to assess independence among variables, Fisher’s exact tests and overlap p-values for the pathways, and p-scores to rank networks. Results: Data revealed that ECHS1 (enoyl-CoA hydratase, short chain 1; log2(foldchange) = 1.63329) hosts a mirtron, MIR3944, expressed in drDCM (FPKM = 5.2857) and not in controls (FPKM = 0). Hsa-miR3944-3p is a putative target of BAG1 (BCL2 associated athanogene 1; log2(foldchange) = 1.31978) and hsa-miR3944-5p of ITGAV (integrin subunit alpha V; log2(foldchange) = 1.46107) and RHOD (ras homolog family member D; log2(foldchange) = 1.28851). There is an association between ECHS1:11 V/A (rs10466126) and drDCM (p = 0.02496). The interaction (p = 2.82E-07) between ECHS1:75 T/I (rs1049951) and ECHS1:rs10466126 is associated with drDCM (p < 2.2e-16). ECHS1:rs10466126 and ECHS1:rs1049951 are in linkage disequilibrium (D’ = 1).
The interaction (p = 7.84E-08) between ECHS1:rs1049951 and the novel ECHS1:c.41insT variant is associated with drDCM (p < 2.2e-16). The interaction (p = 0.001096) between DBT (Dihydrolipoamide branched chain transacylase E2):384G/S (rs12021720) and ECHS1:rs10466126 is associated with drDCM (p < 2.2e-16). At the mRNA level, there is an association between ECHS1 (log2(foldchange) = 1.63329; q = 0.013927) and DBT (log2(foldchange) = 0.955072; q = 0.0368792) with drDCM. ECHS1 is involved in valine (−log(p) = 3.39E00), isoleucine degradation (p = 0.00457), fatty acid β-oxidation (−log(p) = 2.83E00), and drug metabolism:cytochrome P450 (z-score = 2.07985196) pathways. The mitochondria (−log(p) = 8.73E00), oxidative phosphorylation (−log(p) = 5.35E00) and TCA-cycle II (−log(p) = 2.70E00) are dysfunctional. Conclusions: We introduce an integrative data strategy that considers the interplay between the DNA, mRNA, and associated pathways, which represents a possible diagnostic, prognostic, biomarker, and personalized treatment discovery approach in genomically heterogeneous diseases. Electronic supplementary material: The online version of this article (10.1186/s12920-018-0439-6) contains supplementary material, which is available to authorized users.
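The abstract above reports false-discovery-rate-adjusted q-values but does not name the adjustment procedure. As a minimal sketch, assuming the widely used Benjamini-Hochberg step-up adjustment (an assumption, not something the abstract states), q-values could be derived from raw p-values as follows. The function name `bh_qvalues` and the toy p-values are hypothetical and are not taken from the study.

```python
def bh_qvalues(pvalues):
    """Benjamini-Hochberg adjusted p-values (q-values), returned in input order.

    Assumption: the study's FDR method is not stated; BH is used here
    purely for illustration.
    """
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that q-values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        qvalues[idx] = running_min
    return qvalues


# Hypothetical p-values for four genes (illustrative only).
toy_p = [0.01, 0.04, 0.03, 0.20]
toy_q = bh_qvalues(toy_p)
print(toy_q)
```

Each q-value is the smallest FDR level at which that test would still be called significant, so a q-value is never smaller than its raw p-value.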
|
374
|
Csala A, Hof MH, Zwinderman AH. Multiset sparse redundancy analysis for high-dimensional omics data. Biom J 2018; 61:406-423. [PMID: 30506971 PMCID: PMC6587877 DOI: 10.1002/bimj.201700248] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 11/14/2017] [Revised: 09/28/2018] [Accepted: 10/02/2018] [Indexed: 11/23/2022]
Abstract
Redundancy Analysis (RDA) is a well‐known method used to describe the directional relationship between related data sets. Recently, we proposed sparse Redundancy Analysis (sRDA) for high‐dimensional genomic data analysis to find explanatory variables that explain the most variance of the response variables. As more and more biomolecular data become available from different biological levels, such as genotypic and phenotypic data from different omics domains, a natural research direction is to apply an integrated analysis approach in order to explore the underlying biological mechanism of certain phenotypes of the given organism. We show that the multiset sparse Redundancy Analysis (multi‐sRDA) framework is a prominent candidate for high‐dimensional omics data analysis since it accounts for the directional information transfer between omics sets, and, through its sparse solutions, the interpretability of the result is improved. In this paper, we also describe a software implementation for multi‐sRDA, based on the Partial Least Squares Path Modeling algorithm. We test our method through simulation and real omics data analysis with data sets of 364,134 methylation markers, 18,424 gene expression markers, and 47 cytokine markers measured on 37 patients with Marfan syndrome.
Affiliation(s)
- Attila Csala
- Department of Clinical Epidemiology, Biostatistics and Bioinformatics, Academic Medical Center, Amsterdam, The Netherlands
- Michel H Hof
- Department of Clinical Epidemiology, Biostatistics and Bioinformatics, Academic Medical Center, Amsterdam, The Netherlands
- Aeilko H Zwinderman
- Department of Clinical Epidemiology, Biostatistics and Bioinformatics, Academic Medical Center, Amsterdam, The Netherlands
|
375
|
Yang H, Cao H, He T, Wang T, Cui Y. Multilevel heterogeneous omics data integration with kernel fusion. Brief Bioinform 2018; 21:156-170. [PMID: 30496340 DOI: 10.1093/bib/bby115] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 04/15/2018] [Revised: 10/25/2018] [Accepted: 10/26/2018] [Indexed: 01/26/2023] Open
Abstract
High-throughput omics data are nowadays generated with almost no limit. It becomes increasingly important to integrate different omics data types to disentangle the molecular machinery of complex diseases with the hope for better disease prevention and treatment. Since the relationships among different omics data features are typically unknown, a supervised learning model assuming a particular distribution with a specific structure will not serve the purpose of capturing the underlying complex relationship between multiple features and a disease phenotype. In this work, we briefly reviewed methods for kernel fusion (KF) based on support vector machine and kernel partial least squares (KPLS) algorithms. We then proposed a fused KPLS (fKPLS) model for disease classification and prediction with multilevel omics data. The fused kernel can deal with effect heterogeneity, in which different omics data types may contribute differently to the trait of interest, with the purpose of improving prediction performance. We proposed to optimize the kernel parameters and kernel weights with the genetic algorithm (GA). The proposed GA-fKPLS model can substantially improve disease classification performance by integrating multiple omics data types, as demonstrated via extensive simulations and real data analysis. With properly defined fitness functions during GA optimization, the proposed KF method can be extended to other kernel-based analyses, such as kernel association analysis with common or rare variants.
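To make the idea of a fused kernel with per-layer weights concrete, here is a minimal sketch in which one Gram matrix is built per omics layer and the matrices are combined as a normalised weighted sum. The weighted-sum form, the RBF kernel, the function names `rbf_kernel` and `fuse_kernels`, and the toy data are all assumptions for illustration; the abstract does not specify the exact fusion rule, and the kernel parameters and weights are fixed by hand here rather than optimised with a genetic algorithm as the authors propose.

```python
import math


def rbf_kernel(rows, gamma):
    """Gram matrix K[i][j] = exp(-gamma * ||x_i - x_j||^2) for one data layer."""
    n = len(rows)
    return [
        [
            math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(rows[i], rows[j])))
            for j in range(n)
        ]
        for i in range(n)
    ]


def fuse_kernels(kernels, weights):
    """Weighted sum of Gram matrices; weights are normalised to sum to 1."""
    total = sum(weights)
    if total <= 0 or any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative with a positive sum")
    norm = [w / total for w in weights]
    n = len(kernels[0])
    return [
        [sum(w * k[i][j] for w, k in zip(norm, kernels)) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical toy data: three samples measured on two omics layers.
expression = [[0.0, 1.0], [0.1, 0.9], [2.0, 3.0]]
methylation = [[0.5], [0.4], [0.9]]

fused = fuse_kernels(
    [rbf_kernel(expression, gamma=0.5), rbf_kernel(methylation, gamma=2.0)],
    weights=[0.7, 0.3],  # hand-picked for illustration, not GA-optimised
)
print(fused[0])
```

A non-negative weighted sum of valid kernels is itself a valid kernel, so the fused matrix can be passed to any kernel-based learner; a larger weight lets one data layer dominate the similarity between samples.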
Affiliation(s)
- Haitao Yang
  - Department of Epidemiology and Health Statistics, School of Public Health, and Hebei Province Key Laboratory of Environment and Human Health, Hebei Medical University, Shijiazhuang, PR China
- Hongyan Cao
  - Division of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, PR China
- Tao He
  - Department of Mathematics, San Francisco State University, San Francisco, CA, USA
- Tong Wang
  - Division of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, PR China
- Yuehua Cui
  - Division of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, PR China; Department of Statistics and Probability, Michigan State University, East Lansing, MI, USA
|
376
|
Aguilera J, Aguilera‐Gomez M, Barrucci F, Cocconcelli PS, Davies H, Denslow N, Lou Dorne J, Grohmann L, Herman L, Hogstrand C, Kass GEN, Kille P, Kleter G, Nogué F, Plant NJ, Ramon M, Schoonjans R, Waigmann E, Wright MC. EFSA Scientific Colloquium 24 – 'omics in risk assessment: state of the art and next steps. EFSA Supporting Publications 2018. [DOI: 10.2903/sp.efsa.2018.en-1512] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Indexed: 12/18/2022]
Affiliation(s)
- Lutz Grohmann
  - Federal Office of Consumer Protection and Food Safety
- Fabien Nogué
  - French National Institute for Agricultural Research INRA
|
377
|
Sorrentino A, Federico A, Rienzo M, Gazzerro P, Bifulco M, Ciccodicola A, Casamassimi A, Abbondanza C. PR/SET Domain Family and Cancer: Novel Insights from the Cancer Genome Atlas. Int J Mol Sci 2018; 19:3250. [PMID: 30347759 PMCID: PMC6214140 DOI: 10.3390/ijms19103250] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Received: 09/30/2018] [Revised: 10/12/2018] [Accepted: 10/17/2018] [Indexed: 12/17/2022] Open
Abstract
The PR/SET domain gene family (PRDM) encodes 19 different transcription factors that share a subtype of the SET domain [Su(var)3-9, enhancer-of-zeste and trithorax] known as the PRDF1-RIZ (PR) homology domain. This domain, with its potential methyltransferase activity, is followed by a variable number of zinc-finger motifs, which likely mediate protein-protein, protein-RNA, or protein-DNA interactions. Intriguingly, almost all PRDM family members express different isoforms, which likely play opposite roles in oncogenesis. Remarkably, several studies have described alterations in most of the family members in malignancies. Here, to obtain a pan-cancer overview of the genomic and transcriptomic alterations of PRDM genes, we reanalyzed the Exome- and RNA-Seq public datasets available at The Cancer Genome Atlas portal. Overall, PRDM2, PRDM3/MECOM, PRDM9, PRDM16 and ZFPM2/FOG2 were the most mutated genes with pan-cancer frequencies of protein-affecting mutations higher than 1%. Moreover, we observed heterogeneity in the mutation frequencies of these genes across tumors, with cancer types also reaching a value of about 20% of mutated samples for a specific PRDM gene. Of note, ZFPM1/FOG1 mutations occurred in 50% of adrenocortical carcinoma patients and were localized in a hotspot region. These findings, together with OncodriveCLUST results, suggest it could be putatively considered a cancer driver gene in this malignancy. Finally, transcriptome analysis from RNA-Seq data of paired samples revealed that transcription of PRDMs was significantly altered in several tumors. Specifically, PRDM12 and PRDM13 were largely overexpressed in many cancers whereas PRDM16 and ZFPM2/FOG2 were often downregulated. Some of these findings were also confirmed by real-time-PCR on primary tumors.
Affiliation(s)
- Anna Sorrentino
  - Department of Precision Medicine, University of Campania "Luigi Vanvitelli", Via L. De Crecchio, 80138 Naples, Italy.
  - Department of Science and Technology, University of Naples "Parthenope", 80143 Naples, Italy.
- Antonio Federico
  - Department of Science and Technology, University of Naples "Parthenope", 80143 Naples, Italy.
  - Institute of Genetics and Biophysics "Adriano Buzzati Traverso", CNR, 80131 Naples, Italy.
- Monica Rienzo
  - Department of Environmental, Biological, and Pharmaceutical Sciences and Technologies, University of Campania "Luigi Vanvitelli", 81100 Caserta, Italy.
- Patrizia Gazzerro
  - Department of Pharmacy, University of Salerno, 84084 Salerno, Italy.
- Maurizio Bifulco
  - Department of Molecular Medicine and Medical Biotechnologies, University of Naples "Federico II", 80131 Naples, Italy.
- Alfredo Ciccodicola
  - Department of Science and Technology, University of Naples "Parthenope", 80143 Naples, Italy.
  - Institute of Genetics and Biophysics "Adriano Buzzati Traverso", CNR, 80131 Naples, Italy.
- Amelia Casamassimi
  - Department of Precision Medicine, University of Campania "Luigi Vanvitelli", Via L. De Crecchio, 80138 Naples, Italy.
- Ciro Abbondanza
  - Department of Precision Medicine, University of Campania "Luigi Vanvitelli", Via L. De Crecchio, 80138 Naples, Italy.
|
378
|
Costa RL, Boroni M, Soares MA. Distinct co-expression networks using multi-omic data reveal novel interventional targets in HPV-positive and negative head-and-neck squamous cell cancer. Sci Rep 2018; 8:15254. [PMID: 30323202 PMCID: PMC6189122 DOI: 10.1038/s41598-018-33498-5] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.4] [Received: 04/16/2018] [Accepted: 09/19/2018] [Indexed: 12/25/2022] Open
Abstract
The human papillomavirus (HPV) is present in a significant fraction of head-and-neck squamous cell cancer (HNSCC). The main goal of this study was to identify distinct co-expression patterns between HPV+ and HPV- HNSCC and to provide insights into potential regulatory mechanisms/effects within the analyzed networks. We selected cases deposited in The Cancer Genome Atlas database comprising gene expression data, methylation profiles and mutational patterns, in addition to clinical information. The intersection of differentially expressed and differentially methylated genes showed negative correlations between the levels of methylation and expression, suggesting that the expression of these genes is regulated by altered methylation patterns in their promoters. Weighted correlation network analysis was used to identify co-expression modules, and a systematic approach was applied to refine them and identify key regulatory elements by integrating results from the other omics. Three distinct co-expression modules were associated with HPV status and molecular signatures. Validation using independent studies reporting biological experimental data converged for the most significant genes in all modules. This study provides insights into complex genetic and epigenetic particularities in the development and progression of HNSCC according to HPV status, and contributes to unveiling specific genes/pathways as novel therapeutic targets in HNSCC.
Affiliation(s)
- Raquel L Costa
  - Programa de Oncovirologia, Instituto Nacional de Câncer, Rio de Janeiro, Brazil.
  - Bioinformatics and Computational Biology Lab, Instituto Nacional de Câncer, Rio de Janeiro, Brazil.
- Mariana Boroni
  - Bioinformatics and Computational Biology Lab, Instituto Nacional de Câncer, Rio de Janeiro, Brazil
- Marcelo A Soares
  - Programa de Oncovirologia, Instituto Nacional de Câncer, Rio de Janeiro, Brazil
  - Department of Genetics, Universidade Federal do Rio de Janeiro, Rio de Janeiro, Brazil
|
379
|
Finotello F, Eduati F. Multi-Omics Profiling of the Tumor Microenvironment: Paving the Way to Precision Immuno-Oncology. Front Oncol 2018; 8:430. [PMID: 30345255 PMCID: PMC6182075 DOI: 10.3389/fonc.2018.00430] [Citation(s) in RCA: 50] [Impact Index Per Article: 7.1] [Received: 07/11/2018] [Accepted: 09/13/2018] [Indexed: 12/20/2022] Open
Abstract
The tumor microenvironment (TME) is a multifaceted ecosystem characterized by profound cellular heterogeneity, dynamicity, and complex intercellular cross-talk. The striking responses obtained with immune checkpoint blockers, i.e., antibodies targeting immune-cell regulators to boost antitumor immunity, have demonstrated the enormous potential of anticancer treatments that target TME components other than tumor cells. However, as checkpoint blockade is currently beneficial only to a limited fraction of patients, there is an urgent need to understand the mechanisms orchestrating the immune response in the TME to guide the rational design of more effective anticancer therapies. In this Mini Review, we give an overview of the methodologies that allow studying the heterogeneity of the TME from multi-omics data generated from bulk samples, single cells, or images of tumor-tissue slides. These include approaches for the characterization of the different cell phenotypes and for the reconstruction of their spatial organization and inter-cellular cross-talk. We discuss how this broader vision of the cellular heterogeneity and plasticity of tumors, which is emerging thanks to these methodologies, offers the opportunity to rationally design precision immuno-oncology treatments. These developments are fundamental to overcome the current limitations of targeted agents and checkpoint blockers and to bring long-term clinical benefits to a larger fraction of cancer patients.
Affiliation(s)
- Francesca Finotello
  - Biocenter, Division for Bioinformatics, Medical University of Innsbruck, Innsbruck, Austria
- Federica Eduati
  - Department of Biomedical Engineering, Eindhoven University of Technology, Eindhoven, Netherlands
|
380
|
An Improved Method for Prediction of Cancer Prognosis by Network Learning. Genes (Basel) 2018; 9:478. [PMID: 30279327 PMCID: PMC6210393 DOI: 10.3390/genes9100478] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Received: 08/28/2018] [Revised: 09/21/2018] [Accepted: 09/27/2018] [Indexed: 01/01/2023] Open
Abstract
Accurate identification of prognostic biomarkers is an important yet challenging goal in bioinformatics. Many bioinformatics approaches have been proposed for this purpose, but there is still room for improvement. In this paper, we propose a novel machine learning-based method for more accurate identification of prognostic biomarker genes and use them for prediction of cancer prognosis. The proposed method specifies the candidate prognostic gene module by graph learning using the generative adversarial networks (GANs) model, and scores genes using a PageRank algorithm. We applied the proposed method to multiple-omics data that included copy number, gene expression, DNA methylation, and somatic mutation data for five cancer types. The proposed method showed better prediction accuracy than did existing methods. We identified many prognostic genes and their roles in their biological pathways. We also showed that the genes identified from different omics data were complementary, which led to improved accuracy in prediction using multi-omics data.
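The abstract states that genes are scored with a PageRank algorithm. As a hedged illustration of that single step, the sketch below runs standard power-iteration PageRank on a small directed gene network. The gene names, the edges, and the damping factor of 0.85 are assumptions for illustration; the GAN-based graph learning that produces the network in the paper is not reproduced here.

```python
def pagerank(edges, damping=0.85, tol=1e-10, max_iter=200):
    """Power-iteration PageRank on a directed graph given as (source, target) pairs."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(max_iter):
        # Rank held by nodes without outgoing edges is spread uniformly.
        dangling = sum(rank[u] for u in nodes if not out_links[u])
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for u in nodes:
            if out_links[u]:
                share = damping * rank[u] / len(out_links[u])
                for v in out_links[u]:
                    new_rank[v] += share
        delta = sum(abs(new_rank[v] - rank[v]) for v in nodes)
        rank = new_rank
        if delta < tol:
            break
    return rank

# Hypothetical network: three genes point to GENE_C, which points back to GENE_A.
edges = [
    ("GENE_A", "GENE_C"),
    ("GENE_B", "GENE_C"),
    ("GENE_D", "GENE_C"),
    ("GENE_C", "GENE_A"),
]
scores = pagerank(edges)
top_gene = max(scores, key=scores.get)
```

In this toy network the gene with the most incoming edges receives the highest score, which is the intuition behind using PageRank to prioritize candidate genes inside a learned network.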
|
381
|
Perakakis N, Yazdani A, Karniadakis GE, Mantzoros C. Omics, big data and machine learning as tools to propel understanding of biological mechanisms and to discover novel diagnostics and therapeutics. Metabolism 2018; 87:A1-A9. [PMID: 30098323 PMCID: PMC6325641 DOI: 10.1016/j.metabol.2018.08.002] [Citation(s) in RCA: 66] [Impact Index Per Article: 9.4] [Received: 08/06/2018] [Accepted: 08/07/2018] [Indexed: 12/12/2022]
Affiliation(s)
- Nikolaos Perakakis
  - Department of Endocrinology, VA Boston Healthcare System, Jamaica Plain, Boston, MA 02130, USA; Division of Endocrinology, Diabetes and Metabolism, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA 02215, USA
- Alireza Yazdani
  - Division of Applied Mathematics, Brown University, Providence, RI 02906, USA
- Christos Mantzoros
  - Department of Endocrinology, VA Boston Healthcare System, Jamaica Plain, Boston, MA 02130, USA; Division of Endocrinology, Diabetes and Metabolism, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA 02215, USA.
|
382
|
Kedaigle A, Fraenkel E. Turning omics data into therapeutic insights. Curr Opin Pharmacol 2018; 42:95-101. [PMID: 30149217 PMCID: PMC6204089 DOI: 10.1016/j.coph.2018.08.006] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 04/04/2018] [Revised: 07/18/2018] [Accepted: 08/09/2018] [Indexed: 12/30/2022]
Abstract
Omics technologies have made it easier and cheaper to evaluate thousands of biological molecules at once. These advances have led to novel therapies approved for use in the clinic, elucidated the mechanisms behind disease-associated mutations, led to increased accuracy in disease subtyping and personalized medicine, and revealed novel uses and treatment regimes for existing drugs through drug repurposing and pharmacology studies. In this review, we summarize some of these milestones and discuss the potential of integrative analyses that combine multiple data types for further advances.
Affiliation(s)
- Amanda Kedaigle
  - Computational & Systems Biology Program and the Department of Biological Engineering, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139, USA
- Ernest Fraenkel
  - Computational & Systems Biology Program and the Department of Biological Engineering, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139, USA.
|
383
|
Stein-O'Brien GL, Arora R, Culhane AC, Favorov AV, Garmire LX, Greene CS, Goff LA, Li Y, Ngom A, Ochs MF, Xu Y, Fertig EJ. Enter the Matrix: Factorization Uncovers Knowledge from Omics. Trends Genet 2018; 34:790-805. [PMID: 30143323 PMCID: PMC6309559 DOI: 10.1016/j.tig.2018.07.003] [Citation(s) in RCA: 132] [Impact Index Per Article: 18.9] [Received: 03/23/2018] [Revised: 06/01/2018] [Accepted: 07/16/2018] [Indexed: 12/20/2022]
Abstract
Omics data contain signals from the molecular, physical, and kinetic inter- and intracellular interactions that control biological systems. Matrix factorization (MF) techniques can reveal low-dimensional structure from high-dimensional data that reflect these interactions. These techniques can uncover new biological knowledge from diverse high-throughput omics data in applications ranging from pathway discovery to timecourse analysis. We review exemplary applications of MF for systems-level analyses. We discuss appropriate applications of these methods, their limitations, and focus on the analysis of results to facilitate optimal biological interpretation. The inference of biologically relevant features with MF enables discovery from high-throughput data beyond the limits of current biological knowledge - answering questions from high-dimensional data that we have not yet thought to ask.
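To make the idea of recovering low-dimensional structure concrete, here is a minimal sketch of one matrix factorization technique, non-negative matrix factorization (NMF) with the classic multiplicative updates. NMF is chosen here as a representative example, not because the abstract singles it out. The toy matrix, the rank, the iteration count, and all function names are assumptions for illustration.

```python
import random

def matmul(A, B):
    """Plain-Python matrix product of two list-of-lists matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def frobenius_error(V, W, H):
    """Frobenius norm of V - W H."""
    WH = matmul(W, H)
    return sum((v - r) ** 2 for v_row, r_row in zip(V, WH)
               for v, r in zip(v_row, r_row)) ** 0.5

def nmf(V, rank, n_iter=200, seed=0, eps=1e-9):
    """Factor a non-negative matrix V (genes x samples) as W H via multiplicative updates."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H), element-wise
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[H[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[W[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return W, H

# Hypothetical expression matrix: two gene groups active in two sample groups.
V = [
    [5.0, 5.0, 0.0, 0.0],
    [4.0, 4.0, 0.0, 0.0],
    [0.0, 0.0, 3.0, 3.0],
    [0.0, 0.0, 6.0, 6.0],
]

initial_error = frobenius_error(V, *nmf(V, rank=2, n_iter=0))  # random start, no updates
W, H = nmf(V, rank=2)
final_error = frobenius_error(V, W, H)
```

The columns of `W` can be read as gene-level patterns and the rows of `H` as their activity across samples; comparing `final_error` with `initial_error` gives a quick check that the updates improved the fit.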
Affiliation(s)
- Genevieve L Stein-O'Brien
  - Department of Oncology, Division of Biostatistics and Bioinformatics, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins School of Medicine, Baltimore, MD, USA; Department of Neuroscience, Johns Hopkins School of Medicine, Baltimore, MD, USA; McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins School of Medicine, Baltimore, MD, USA
- Raman Arora
  - Department of Computer Science, Institute for Data Intensive Engineering and Science, Johns Hopkins University, Baltimore, MD, USA
- Aedin C Culhane
  - Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Boston, MA, USA; Department of Biostatistics, Harvard TH Chan School of Public Health, Boston, MA, USA
- Alexander V Favorov
  - Department of Oncology, Division of Biostatistics and Bioinformatics, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins School of Medicine, Baltimore, MD, USA; Vavilov Institute of General Genetics, Moscow, Russia
- Casey S Greene
  - Department of Systems Pharmacology and Translational Therapeutics, Perelman School of Medicine, University of Pennsylvania, PA, USA; Childhood Cancer Data Lab, Alex's Lemonade Stand Foundation, PA, USA
- Loyal A Goff
  - Department of Neuroscience, Johns Hopkins School of Medicine, Baltimore, MD, USA; McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins School of Medicine, Baltimore, MD, USA
- Yifeng Li
  - Digital Technologies Research Centre, National Research Council of Canada, Ottawa, ON, Canada
- Aloune Ngom
  - School of Computer Science, University of Windsor, Windsor, ON, Canada
- Michael F Ochs
  - Department of Mathematics and Statistics, The College of New Jersey, Ewing, NJ, USA
- Yanxun Xu
  - Department of Applied Mathematics and Statistics, Whiting School of Engineering, Johns Hopkins University, Baltimore, MD, USA
- Elana J Fertig
  - Department of Oncology, Division of Biostatistics and Bioinformatics, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins School of Medicine, Baltimore, MD, USA.
|
384
|
Morello G, Spampinato AG, Conforti FL, Cavallaro S. Taxonomy Meets Neurology, the Case of Amyotrophic Lateral Sclerosis. Front Neurosci 2018; 12:673. [PMID: 30319346 PMCID: PMC6168652 DOI: 10.3389/fnins.2018.00673] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 07/27/2018] [Accepted: 09/07/2018] [Indexed: 12/13/2022] Open
Abstract
Recent landmark publications from our research group outline a transformative approach to defining, studying and treating amyotrophic lateral sclerosis (ALS). Rather than approaching ALS as a single entity, we advocate targeting therapies to distinct "clusters" of patients based on their specific genomic and molecular features. Our findings point to the existence of a molecular taxonomy for ALS, bringing us a step closer to the establishment of a precision medicine approach in neurology practice.
Affiliation(s)
- Giovanna Morello
  - Institute of Neurological Sciences, Italian National Research Council, Catania, Italy
- Sebastiano Cavallaro
  - Institute of Neurological Sciences, Italian National Research Council, Catania, Italy
|
385
|
El-Manzalawy Y, Hsieh TY, Shivakumar M, Kim D, Honavar V. Min-redundancy and max-relevance multi-view feature selection for predicting ovarian cancer survival using multi-omics data. BMC Med Genomics 2018; 11:71. [PMID: 30255801 PMCID: PMC6157248 DOI: 10.1186/s12920-018-0388-0] [Citation(s) in RCA: 31] [Impact Index Per Article: 4.4] [Indexed: 12/11/2022] Open
Abstract
BACKGROUND: Large-scale collaborative precision medicine initiatives (e.g., The Cancer Genome Atlas (TCGA)) are yielding rich multi-omics data. Integrative analyses of the resulting multi-omics data, such as somatic mutation, copy number alteration (CNA), DNA methylation, miRNA, gene expression, and protein expression, offer tantalizing possibilities for realizing the promise and potential of precision medicine in cancer prevention, diagnosis, and treatment by substantially improving our understanding of underlying mechanisms as well as the discovery of novel biomarkers for different types of cancers. However, such analyses present a number of challenges, including heterogeneity and high dimensionality of omics data.
METHODS: We propose a novel framework for multi-omics data integration using multi-view feature selection. We introduce a novel multi-view feature selection algorithm, MRMR-mv, an adaptation of the well-known Min-Redundancy and Maximum-Relevance (MRMR) single-view feature selection algorithm to the multi-view setting.
RESULTS: We report results of experiments using an ovarian cancer multi-omics dataset derived from the TCGA database on the task of predicting ovarian cancer survival. Our results suggest that multi-view models outperform both view-specific models (i.e., models trained and tested using a single type of omics data) and models based on two baseline data fusion methods.
CONCLUSIONS: Our results demonstrate the potential of multi-view feature selection in integrative analyses and predictive modeling from multi-omics data.
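For readers unfamiliar with the single-view MRMR idea that MRMR-mv adapts, the sketch below shows a greedy selection that rewards relevance to the target and penalizes redundancy with features already chosen. It is not the authors' MRMR-mv algorithm. Absolute Pearson correlation is used here as a simple stand-in for mutual information, and the feature names and toy values are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equally long sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)

def mrmr_select(features, target, k):
    """Greedily pick k feature names maximizing relevance minus mean redundancy."""
    relevance = {name: abs(pearson(vals, target)) for name, vals in features.items()}
    selected = []
    remaining = sorted(features)  # sorted for deterministic tie-breaking
    while remaining and len(selected) < k:
        scores = {}
        for name in remaining:
            if selected:
                redundancy = sum(abs(pearson(features[name], features[s]))
                                 for s in selected) / len(selected)
            else:
                redundancy = 0.0
            scores[name] = relevance[name] - redundancy
        best = max(remaining, key=scores.get)
        selected.append(best)
        remaining.remove(best)
    return selected

# Hypothetical data: gene_expr_2 is nearly a copy of gene_expr_1.
survival = [0, 0, 0, 0, 1, 1, 1, 1]
features = {
    "gene_expr_1": [0, 0, 0, 1, 1, 1, 1, 1],
    "gene_expr_2": [0, 0, 1, 1, 1, 1, 1, 1],
    "methyl_1":    [1, 0, 0, 0, 1, 1, 1, 0],
}
selected = mrmr_select(features, survival, k=2)
```

On this toy data a ranking by relevance alone would take both expression features, whereas the redundancy penalty makes the second pick `methyl_1`, which carries information the first feature does not.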
Affiliation(s)
- Yasser El-Manzalawy
  - Artificial Intelligence Research Laboratory, College of Information Sciences and Technology, Pennsylvania State University, University Park, PA, 16802, USA; The Center for Big Data Analytics and Discovery Informatics, Pennsylvania State University, University Park, PA, 16802, USA; The Clinical and Translational Sciences Institute, Pennsylvania State University, University Park, PA, 16802, USA
- Tsung-Yu Hsieh
  - Artificial Intelligence Research Laboratory, College of Information Sciences and Technology, Pennsylvania State University, University Park, PA, 16802, USA; School of Electrical Engineering and Computer Science, Pennsylvania State University, University Park, PA, 16802, USA; The Center for Big Data Analytics and Discovery Informatics, Pennsylvania State University, University Park, PA, 16802, USA
- Manu Shivakumar
  - Biomedical and Translational Informatics Institute, Geisinger Health System, Danville, PA, USA
- Dokyoon Kim
  - Biomedical and Translational Informatics Institute, Geisinger Health System, Danville, PA, USA; The Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA, 16802, USA.
- Vasant Honavar
  - Artificial Intelligence Research Laboratory, College of Information Sciences and Technology, Pennsylvania State University, University Park, PA, 16802, USA; The Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA, 16802, USA; School of Electrical Engineering and Computer Science, Pennsylvania State University, University Park, PA, 16802, USA; The Center for Big Data Analytics and Discovery Informatics, Pennsylvania State University, University Park, PA, 16802, USA; The Clinical and Translational Sciences Institute, Pennsylvania State University, University Park, PA, 16802, USA.
|
386
|
Hurgobin B, de Jong E, Bosco A. Insights into respiratory disease through bioinformatics. Respirology 2018; 23:1117-1126. [PMID: 30218470 DOI: 10.1111/resp.13401] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 06/07/2018] [Revised: 08/18/2018] [Accepted: 08/22/2018] [Indexed: 12/21/2022]
Abstract
Respiratory diseases such as asthma, chronic obstructive pulmonary disease and lung cancer represent a critical area for medical research as millions of people are affected globally. The development of new strategies for treatment and/or prevention, and the identification of biomarkers for patient stratification and early detection of disease inception are essential to reducing the impact of lung diseases. The successful translation of research into clinical practice requires a detailed understanding of the underlying biology. In this regard, the advent of next-generation sequencing and mass spectrometry has led to the generation of an unprecedented amount of data spanning multiple layers of biological regulation (genome, epigenome, transcriptome, proteome, metabolome and microbiome). Dealing with this wealth of data requires sophisticated bioinformatics and statistical tools. Here, we review the basic concepts in bioinformatics and genomic data analysis and illustrate the application of these tools to further our understanding of lung diseases. We also highlight the potential for data integration of multi-omic profiles and computational drug repurposing to define disease subphenotypes and match them to targeted therapies, paving the way for personalized medicine.
Affiliation(s)
- Bhavna Hurgobin
  - Telethon Kids Institute, The University of Western Australia, Perth, WA, Australia
- Emma de Jong
  - Telethon Kids Institute, The University of Western Australia, Perth, WA, Australia
- Anthony Bosco
  - Telethon Kids Institute, The University of Western Australia, Perth, WA, Australia
|
387
|
Rendleman J, Choi H, Vogel C. Integration of large-scale multi-omic datasets: a protein-centric view. Curr Opin Syst Biol 2018; 11:74-81. [PMID: 30906903 DOI: 10.1016/j.coisb.2018.09.001] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Indexed: 01/08/2023]
Abstract
Innovative mass spectrometry-based proteomics has enabled routine measurements of protein abundance, localization, interactions, and modifications, covering unique aspects of gene expression regulation and function. It is now time to move from isolated analyses of these datasets toward true integration of proteomics with other data types to gain insights from the interactions and interdependencies of biomolecules. When combined with genomic or transcriptomic data, proteomics expands genome annotation to identify variant or missing genes. Dynamic proteomic measurements can move analysis from a predominantly concentration-based framework to one of protein synthesis and degradation. Proteomic data from thousands of cancer patients can foster identification of novel pathogenic mutations via detection of protein sequence changes that lead to dysregulated pathways in various tumors. Such comprehensive efforts can exploit the synergy arising from large and complex datasets to advance virtually every field of biology.
Affiliation(s)
- Justin Rendleman
  - Center for Genomics and Systems Biology, New York University, Department of Biology, New York, USA
- Hyungwon Choi
  - Department of Medicine, Yong Loo Lin School of Medicine, National University Singapore, Singapore; Institute of Molecular and Cell Biology, Agency for Science, Technology, and Research, Singapore
- Christine Vogel
  - Center for Genomics and Systems Biology, New York University, Department of Biology, New York, USA
|
388
|
Krzyszczyk P, Acevedo A, Davidoff EJ, Timmins LM, Marrero-Berrios I, Patel M, White C, Lowe C, Sherba JJ, Hartmanshenn C, O'Neill KM, Balter ML, Fritz ZR, Androulakis IP, Schloss RS, Yarmush ML. The growing role of precision and personalized medicine for cancer treatment. TECHNOLOGY 2018; 6:79-100. [PMID: 30713991 PMCID: PMC6352312 DOI: 10.1142/s2339547818300020] [Citation(s) in RCA: 261] [Impact Index Per Article: 37.3] [Indexed: 05/05/2023]
Abstract
Cancer is a devastating disease that takes the lives of hundreds of thousands of people every year. Due to disease heterogeneity, standard treatments, such as chemotherapy or radiation, are effective in only a subset of the patient population. Tumors can have different underlying genetic causes and may express different proteins in one patient versus another. This inherent variability of cancer lends itself to the growing field of precision and personalized medicine (PPM). There are many ongoing efforts to acquire PPM data in order to characterize molecular differences between tumors. Some PPM products are already available to link these differences to an effective drug. It is clear that PPM cancer treatments can result in immense patient benefits, and companies and regulatory agencies have begun to recognize this. However, broader changes to the healthcare and insurance systems must be addressed if PPM is to become part of standard cancer care.
Affiliation(s)
- Paulina Krzyszczyk
  - Department of Biomedical Engineering, Rutgers University, 599 Taylor Road, Piscataway, NJ 08854, USA
- Alison Acevedo
  - Department of Biomedical Engineering, Rutgers University, 599 Taylor Road, Piscataway, NJ 08854, USA
- Erika J Davidoff
  - Department of Biomedical Engineering, Rutgers University, 599 Taylor Road, Piscataway, NJ 08854, USA
- Lauren M Timmins
  - Department of Biomedical Engineering, Rutgers University, 599 Taylor Road, Piscataway, NJ 08854, USA
- Ileana Marrero-Berrios
  - Department of Biomedical Engineering, Rutgers University, 599 Taylor Road, Piscataway, NJ 08854, USA
- Misaal Patel
  - Department of Biomedical Engineering, Rutgers University, 599 Taylor Road, Piscataway, NJ 08854, USA
- Corina White
  - Department of Biomedical Engineering, Rutgers University, 599 Taylor Road, Piscataway, NJ 08854, USA
- Christopher Lowe
  - Department of Biomedical Engineering, Rutgers University, 599 Taylor Road, Piscataway, NJ 08854, USA
- Joseph J Sherba
  - Department of Biomedical Engineering, Rutgers University, 599 Taylor Road, Piscataway, NJ 08854, USA
- Clara Hartmanshenn
  - Department of Chemical & Biochemical Engineering, Rutgers University, 98 Brett Road, Piscataway, NJ 08854, USA
- Kate M O'Neill
  - Department of Biomedical Engineering, Rutgers University, 599 Taylor Road, Piscataway, NJ 08854, USA
- Max L Balter
  - Department of Biomedical Engineering, Rutgers University, 599 Taylor Road, Piscataway, NJ 08854, USA
- Zachary R Fritz
  - Department of Biomedical Engineering, Rutgers University, 599 Taylor Road, Piscataway, NJ 08854, USA
- Ioannis P Androulakis
  - Department of Biomedical Engineering, Rutgers University, 599 Taylor Road, Piscataway, NJ 08854, USA
  - Department of Chemical & Biochemical Engineering, Rutgers University, 98 Brett Road, Piscataway, NJ 08854, USA
- Rene S Schloss
  - Department of Biomedical Engineering, Rutgers University, 599 Taylor Road, Piscataway, NJ 08854, USA
- Martin L Yarmush
  - Department of Biomedical Engineering, Rutgers University, 599 Taylor Road, Piscataway, NJ 08854, USA
  - Department of Chemical & Biochemical Engineering, Rutgers University, 98 Brett Road, Piscataway, NJ 08854, USA
|
389
|
Kakouri AC, Christodoulou CC, Zachariou M, Oulas A, Minadakis G, Demetriou CA, Votsi C, Zamba-Papanicolaou E, Christodoulou K, Spyrou GM. Revealing Clusters of Connected Pathways Through Multisource Data Integration in Huntington's Disease and Spastic Ataxia. IEEE J Biomed Health Inform 2018; 23:26-37. [PMID: 30176611 DOI: 10.1109/jbhi.2018.2865569] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Indexed: 02/01/2023]
Abstract
The advancement of scientific and medical research over the past years has generated a wealth of experimental data from multiple technologies, including genomics, transcriptomics, proteomics, and other forms of -omics data, which are available for a number of diseases. The integration of such multisource data is a key component toward the success of precision medicine. In this paper, we investigate a multisource data integration method developed by our group, focusing on its ability to yield clusters of connected pathways under two different approaches: first, a disease-centric approach, where we integrate data around a disease, and second, a gene-centric approach, where we integrate data around a gene. As a paradigm for the first approach we used Huntington's disease (HD), a disease with a plethora of available data, whereas for the second approach we used GBA2, a gene related to spastic ataxia (SA), a phenotype with sparse data availability. Our paper shows that valuable information at the level of disease-related pathway clusters can be obtained for both HD and SA. New pathways that classical pathway analysis methods were unable to reveal emerged as necessary "connectors" to build connected pathway stories formed as pathway clusters. The capability to integrate multisource molecular data, yielding something more than the sum of the existing information, empowers precision and personalized medicine approaches.
|
390
|
Metabolomics in chronic kidney disease: Strategies for extended metabolome coverage. J Pharm Biomed Anal 2018; 161:313-325. [PMID: 30195171 DOI: 10.1016/j.jpba.2018.08.046] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.3] [Received: 06/12/2018] [Revised: 08/22/2018] [Accepted: 08/23/2018] [Indexed: 12/16/2022]
Abstract
Chronic kidney disease (CKD) is becoming a major public health issue as prevalence is increasing worldwide. It also represents a major challenge for the identification of new early biomarkers, understanding of biochemical mechanisms, patient monitoring and prognosis. Each metabolite contained in a biofluid or tissue may play a role as a signal or as a driver in the development or progression of the pathology. Therefore, metabolomics is a highly valuable approach in this clinical context. It aims to provide a representative picture of a biological system, making exhaustive metabolite coverage crucial. Two aspects can be considered: analytical and biological coverage. From an analytical point of view, monitoring all metabolites within one run is currently impossible. Multiple analytical techniques providing orthogonal information should be carried out in parallel for coverage improvement. The biological aspect of metabolome coverage can be enhanced by using multiple biofluids or tissues for in-depth biological investigation, as the analysis of a single sample type is generally insufficient for whole organism extrapolation. Hence, recording of signals from multiple sample types and different analytical platforms generates massive and complex datasets so that chemometric tools, including data fusion approaches and multi-block analysis, are key tools for extracting biological information and for discovery of relevant biomarkers. This review presents the recent developments in the field of metabolomic analysis, from sampling and analytical strategies to chemometric tools, dedicated to the generation and handling of multiple complementary metabolomic datasets enabling extended metabolite coverage to improve our biological knowledge of CKD.
|
391
|
Berlin R, Gruen R, Best J. Systems Medicine Disease: Disease Classification and Scalability Beyond Networks and Boundary Conditions. Front Bioeng Biotechnol 2018; 6:112. [PMID: 30131956 PMCID: PMC6090066 DOI: 10.3389/fbioe.2018.00112] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 04/24/2018] [Accepted: 07/18/2018] [Indexed: 12/26/2022] Open
Abstract
In order to accommodate the forthcoming wealth of health and disease related information, from genome to body sensors to population and the environment, the approach to disease description and definition demands re-examination. Traditional classification methods remain trapped by history; to provide the descriptive features that are required for a comprehensive description of disease, systems science, which realizes dynamic processes, adaptive response, and asynchronous communication channels, must be applied (Wolkenhauer et al., 2013). When Disease is viewed beyond the thresholds of lines and threshold boundaries, disease definition is not only the result of reductionist, mechanistic categories which reluctantly face re-composition. Disease is process and synergy as the characteristics of Systems Biology and Systems Medicine are included. To capture the wealth of information and contribute meaningfully to medical practice and biology research, Disease classification goes beyond a single spatial biologic level or static time assignment to include the interface of Disease process and organism response (Bechtel, 2017a; Green et al., 2017).
Affiliation(s)
- Richard Berlin
  - Department of Computer Science, University of Illinois, Urbana, IL, United States
- Russell Gruen
  - Department of Surgery, Nanyang Institute of Technology in Health and Medicine, Lee Kong Chian School of Medicine, Nanyang Technological University, Singapore, Singapore
- James Best
  - Lee Kong Chian School of Medicine, Nanyang Technological University, Singapore, Singapore
  - Imperial College, London, United Kingdom
|
392
|
Awany D, Allali I, Chimusa ER. Tantalizing dilemma in risk prediction from disease scoring statistics. Brief Funct Genomics 2018; 18:211-219. [PMID: 30605512 PMCID: PMC6609536 DOI: 10.1093/bfgp/ely040] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 08/17/2018] [Revised: 08/17/2018] [Accepted: 11/29/2018] [Indexed: 02/01/2023] Open
Abstract
Over the past decade, human host genome-wide association studies (GWASs) have contributed greatly to our understanding of the impact of host genetics on phenotypes. Recently, the microbiome has been recognized as a complex trait in host genetic variation, leading to microbiome GWASs (mGWASs). For these, many different statistical methods and software tools have been developed for association mapping. Applications of these methods and tools have revealed several important findings; however, establishing causal factors and the direction of causality in the interplay between human genetic polymorphisms, the microbiome and the host phenotypes remains a huge challenge. Here, we review disease scoring approaches in host GWAS and mGWAS and their underlying statistical methods and tools. We highlight the challenges in pinpointing the genetic-associated causal factors in host GWAS and mGWAS and discuss the role of a multi-omic approach in disease scoring statistics, which may provide a better understanding of human phenotypic variation by enabling further systems biology experiments to establish causality.
Affiliation(s)
- Denis Awany
  - Division of Human Genetics, Department of Pathology, Institute of Infectious Disease and Molecular Medicine, Faculty of Health Sciences, University of Cape Town, South Africa
- Imane Allali
  - Computational Biology Division, Department of Integrative Biomedical Sciences, Faculty of Health Sciences, University of Cape Town, South Africa
- Emile R Chimusa
  - Division of Human Genetics, Department of Pathology, Institute of Infectious Disease and Molecular Medicine, Faculty of Health Sciences, University of Cape Town, South Africa
|
393
|
Vilne B, Schunkert H. Integrating Genes Affecting Coronary Artery Disease in Functional Networks by Multi-OMICs Approach. Front Cardiovasc Med 2018; 5:89. [PMID: 30065929 PMCID: PMC6056735 DOI: 10.3389/fcvm.2018.00089] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Received: 03/25/2018] [Accepted: 06/22/2018] [Indexed: 12/26/2022] Open
Abstract
Coronary artery disease (CAD) and myocardial infarction (MI) remain among the leading causes of mortality worldwide, urgently demanding a better understanding of disease etiology, and more efficient therapeutic strategies. Genetic predisposition as well as the environment and lifestyle are thought to contribute to disease risk. It is likely that non-linear and complex interactions occur between these multiple factors, involving simultaneous pathological changes in diverse cell types, tissues, and organs, at multiple molecular levels. Recent technological advances have exponentially expanded the breadth of available -omics data, from genome, epigenome, transcriptome, proteome, metabolome to even the microbiome. Integration of multiple layers of information across several -omics domains, i.e., the so-called multi-omics approach, currently holds the promise as a path toward precision medicine. Indeed, a more meaningful interpretation of genotype-phenotype relationships and the development of successful therapeutics tailored to individual patients are urgently needed. In this review, we will summarize recent findings and applications of integrative multi-omics in elucidating the etiology of CAD/MI; with a special focus on established disease susceptibility loci sequentially identified in genome-wide association studies (GWAS) over the last 10 years. Moreover, in addition to the autosomal genome, we will also consider the genetic variation in our “second genome”—the mitochondrial genome. Finally, we will summarize the current challenges in the field and point to future research directions required in order to successfully and effectively apply these approaches for precision medicine.
Affiliation(s)
- Baiba Vilne
  - Deutsches Herzzentrum München, Klinik für Herz- und Kreislauferkrankungen, Technische Universität München, Munich, Germany; Munich Heart Alliance, German Centre for Cardiovascular Research, Munich, Germany
- Heribert Schunkert
  - Deutsches Herzzentrum München, Klinik für Herz- und Kreislauferkrankungen, Technische Universität München, Munich, Germany; Munich Heart Alliance, German Centre for Cardiovascular Research, Munich, Germany
|
394
|
Misra BB, Langefeld CD, Olivier M, Cox LA. Integrated Omics: Tools, Advances, and Future Approaches. J Mol Endocrinol 2018; 62:JME-18-0055. [PMID: 30006342 DOI: 10.1530/jme-18-0055] [Citation(s) in RCA: 249] [Impact Index Per Article: 35.6] [Received: 02/24/2018] [Revised: 07/02/2018] [Accepted: 07/12/2018] [Indexed: 12/13/2022]
Abstract
With the rapid adoption of high-throughput omic approaches to analyze biological samples such as genomics, transcriptomics, proteomics, and metabolomics, each analysis can generate tera- to peta-byte sized data files on a daily basis. These data file sizes, together with differences in nomenclature among these data types, make the integration of these multi-dimensional omics data into biologically meaningful context challenging. Variously named as integrated omics, multi-omics, poly-omics, trans-omics, pan-omics, or shortened to just 'omics', the challenges include differences in data cleaning, normalization, biomolecule identification, data dimensionality reduction, biological contextualization, statistical validation, data storage and handling, sharing, and data archiving. The ultimate goal is towards the holistic realization of a 'systems biology' understanding of the biological question in hand. Commonly used approaches in these efforts are currently limited by the 3 i's - integration, interpretation, and insights. Post integration, these very large datasets aim to yield unprecedented views of cellular systems at exquisite resolution for transformative insights into processes, events, and diseases through various computational and informatics frameworks. With the continued reduction in costs and processing time for sample analyses, and increasing types of omics datasets generated such as glycomics, lipidomics, microbiomics, and phenomics, an increasing number of scientists in this interdisciplinary domain of bioinformatics face these challenges. We discuss recent approaches, existing tools, and potential caveats in the integration of omics datasets for development of standardized analytical pipelines that could be adopted by the global omics research community.
Affiliation(s)
- Biswapriya B Misra
  - B Misra, Internal Medicine, Wake Forest University School of Medicine, Winston-Salem, United States
- Carl D Langefeld
  - C Langefeld, Biostatistical Sciences, Wake Forest University School of Medicine, Winston-Salem, United States
- Michael Olivier
  - M Olivier, Internal Medicine, Wake Forest University School of Medicine, Winston-Salem, United States
- Laura A Cox
  - L Cox, Internal Medicine, Wake Forest University School of Medicine, Winston-Salem, United States
|
395
|
Li C, Lee J, Ding J, Sun S. Integrative analysis of gene expression and methylation data for breast cancer cell lines. BioData Min 2018; 11:13. [PMID: 29983747 PMCID: PMC6019806 DOI: 10.1186/s13040-018-0174-8] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 01/20/2018] [Accepted: 06/13/2018] [Indexed: 12/11/2022] Open
Abstract
Background: The deadly costs of cancer and necessity for an accurate method of early cancer detection have demanded the identification of genetic and epigenetic factors associated with cancer. DNA methylation, an epigenetic event, plays an important role in cancer susceptibility. In this paper, we use DNA methylation and gene expression data integration and pathway analysis to further explore and understand the complex relationship between methylation and gene expression. Results: Through linear modeling and analysis of variance, we obtain genes that show a significant correlation between methylation and gene expression. We then examine the functions and relationships of these genes using bioinformatic tools and databases. In particular, using ConsensusPathDB, we analyze the networks of statistically significant genes to identify hub genes, genes with a large number of links to other genes. We identify eight major hub genes, all in strong association with cancer susceptibility. Through further analysis of the function, gene expression level, and methylation level of these hub genes, we conclude that they are novel potential biomarkers for breast cancer. Conclusions: Our findings have various implications for cancer screening, early detection methods, and potential novel treatments for cancer. Researchers can also use our results to develop more effective methods for cancer study. Electronic supplementary material: The online version of this article (10.1186/s13040-018-0174-8) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Juyon Lee
  - Korea International School Pangyo Campus, Seongnam, South Korea
- Jessica Ding
  - Liberal Arts and Science Academy, Austin, Texas USA
- Shuying Sun
  - Department of Mathematics, Texas State University, San Marcos, TX USA
|
396
|
Poirion OB, Chaudhary K, Garmire LX. Deep Learning data integration for better risk stratification models of bladder cancer. AMIA Joint Summits on Translational Science Proceedings 2018; 2017:197-206. [PMID: 29888072 PMCID: PMC5961799] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2022]
Abstract
We propose an unsupervised multi-omics integration pipeline, using a deep-learning autoencoder algorithm, to predict the survival subtypes in bladder cancer (BC). We used a TCGA dataset comprising mRNA, miRNA and methylation data to infer two survival subtypes. We then constructed a supervised classification model to predict the survival subgroup of any new individual sample. Our training data gave two subgroups with significant survival differences (p-value=8e-4), where the high-risk survival subgroup was enriched with KRT6/14 overexpression and PI3K-Akt pathways. We tested the robustness of the model by randomly splitting the main dataset into multiple training and test folds, which gave overall significant p-values. Then, we successfully inferred the subtypes for a subset of samples kept as a test dataset (p-value=0.03). We further applied our pipeline to predict the survival subgroups from another validation dataset with miRNA data (p-value=0.02). In conclusion, the present pipeline is an effective approach to infer the survival subtype of a new sample, exemplified by BC.
Affiliation(s)
- Olivier B Poirion
  - Epidemiology Program, University of Hawaii Cancer Center Honolulu, HI 96813, USA
  - These authors contributed equally to the work
- Kumardeep Chaudhary
  - Epidemiology Program, University of Hawaii Cancer Center Honolulu, HI 96813, USA
  - These authors contributed equally to the work
- Lana X Garmire
  - Epidemiology Program, University of Hawaii Cancer Center Honolulu, HI 96813, USA
  - Molecular Biosciences and Bioengineering Graduate Program, University of Hawaii at Manoa, Honolulu, HI 96822, USA
|
397
|
Minnifield BA, Aslibekyan SW. The Interplay Between the Microbiome and Cardiovascular Risk. Current Genetic Medicine Reports 2018. [DOI: 10.1007/s40142-018-0142-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 12/14/2022]
|
398
|
Cambiaghi A, Díaz R, Martinez JB, Odena A, Brunelli L, Caironi P, Masson S, Baselli G, Ristagno G, Gattinoni L, de Oliveira E, Pastorelli R, Ferrario M. An Innovative Approach for The Integration of Proteomics and Metabolomics Data In Severe Septic Shock Patients Stratified for Mortality. Sci Rep 2018; 8:6681. [PMID: 29703925 PMCID: PMC5923340 DOI: 10.1038/s41598-018-25035-1] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.1] [Received: 10/23/2017] [Accepted: 04/09/2018] [Indexed: 12/29/2022] Open
Abstract
In this work, we examined the plasma metabolome, proteome and clinical features in patients with severe septic shock enrolled in the multicenter ALBIOS study. The objective was to identify changes in the levels of metabolites involved in septic shock progression and to integrate this information with the variation occurring in proteins and clinical data. Mass spectrometry-based targeted metabolomics and untargeted proteomics allowed us to quantify absolute metabolite concentrations and relative protein abundances. We computed the ratio D7/D1 to take into account their variation from day 1 (D1) to day 7 (D7) after shock diagnosis. Patients were divided into two groups according to 28-day mortality. Three different elastic net logistic regression models were built: one on metabolites only, one on metabolites and proteins and one to integrate metabolomics and proteomics data with clinical parameters. Linear discriminant analysis and partial least squares discriminant analysis were also implemented. All the obtained models correctly classified the observations in the testing set. By looking at the variable importance (VIP) and the selected features, the integration of metabolomics with proteomics data showed the importance of circulating lipids and the coagulation cascade in septic shock progression, thus capturing a further layer of biological information complementary to metabolomics information.
Affiliation(s)
- Ramón Díaz
  - Proteomics Platform - Parc Científic de Barcelona, Barcelona, Spain
- Antonia Odena
  - Proteomics Platform - Parc Científic de Barcelona, Barcelona, Spain
- Laura Brunelli
  - IRCCS-Istituto di Ricerche Farmacologiche Mario Negri, Milan, Italy
- Pietro Caironi
  - Anestesia e Rianimazione, Azienda Ospedaliero-Universitaria S. Luigi Gonzaga, Orbassano, Italy; Dipartimento di Oncologia, Università degli Studi di Torino, Turin, Italy
- Serge Masson
  - IRCCS-Istituto di Ricerche Farmacologiche Mario Negri, Milan, Italy
- Luciano Gattinoni
  - Department of Anesthesiology, Emergency and Intensive Care Medicine, University of Göttingen, Göttingen, Germany
|
399
|
Downs DM, Bazurto JV, Gupta A, Fonseca LL, Voit EO. The three-legged stool of understanding metabolism: integrating metabolomics with biochemical genetics and computational modeling. AIMS Microbiol 2018; 4:289-303. [PMID: 31294216 PMCID: PMC6604926 DOI: 10.3934/microbiol.2018.2.289] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 02/15/2018] [Accepted: 04/02/2018] [Indexed: 12/23/2022] Open
Abstract
Traditional biochemical research has resulted in a good understanding of many aspects of metabolism. However, this reductionist approach is time consuming and requires substantial resources, thus raising the question whether modern metabolomics and genomics should take over and replace the targeted experiments of old. We proffer that such a replacement is neither feasible nor desirable and propose instead the tight integration of modern, system-wide omics with traditional experimental bench science and dedicated computational approaches. This integration is an important prerequisite toward the optimal acquisition of knowledge regarding metabolism and physiology in health and disease. The commentary describes advantages and drawbacks of current approaches to assessing metabolism and highlights the challenges to be overcome as we strive to achieve a deeper level of metabolic understanding in the future.
Affiliation(s)
- Diana M Downs
  - Department of Microbiology, University of Georgia, Athens, GA, 30602, USA
- Jannell V Bazurto
  - Department of Biological Sciences, University of Idaho, Moscow, ID, 83844, USA
- Anuj Gupta
  - Department of Biomedical Engineering, Georgia Institute of Technology, 950 Atlantic Drive, Suite 2115, Atlanta, GA, 30332-2000, USA
- Luis L Fonseca
  - Department of Biomedical Engineering, Georgia Institute of Technology, 950 Atlantic Drive, Suite 2115, Atlanta, GA, 30332-2000, USA
- Eberhard O Voit
  - Department of Biomedical Engineering, Georgia Institute of Technology, 950 Atlantic Drive, Suite 2115, Atlanta, GA, 30332-2000, USA
|
400
|
Ovejero-Benito MC, Muñoz-Aceituno E, Reolid A, Saiz-Rodríguez M, Abad-Santos F, Daudén E. Pharmacogenetics and Pharmacogenomics in Moderate-to-Severe Psoriasis. Am J Clin Dermatol 2018; 19:209-222. [PMID: 28921458 DOI: 10.1007/s40257-017-0322-9] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.7] [Indexed: 12/13/2022]
Abstract
Pharmacogenetics is the study of variations in DNA sequence related to drug response. Moreover, the evolution of biotechnology and the sequencing of human DNA have allowed the creation of pharmacogenomics, a branch of genetics that analyzes human genes, the RNAs and proteins encoded by them, and the inter-and intra-individual variations in expression and function in relation to drug response. Pharmacogenetics and pharmacogenomics are being used to search for biomarkers that can predict response to systemic treatments, including those for moderate-to-severe psoriasis. Psoriasis is a chronic inflammatory disease with an autoimmune contribution. Although its etiology remains unknown, genetic, epigenetic, and environmental factors play a role in its development. Diverse systemic and biologic therapies are used to treat moderate-to-severe psoriasis. However, these treatments are not curative, and patients exhibit a wide range of responses to them. Moderate-to-severe psoriasis is usually treated with systemic immunomodulators such as acitretin, ciclosporin, and methotrexate. Anti-tumor necrosis factor (TNF) drugs (adalimumab, etanercept, or infliximab) are the first-line treatment for patients resistant to conventional systemic therapies. Although these therapies are very efficient, around 30-50% of patients have inadequate response. Ustekinumab is a monoclonal antibody that targets interleukin (IL)-12 and IL-23 and is used for moderate-to-severe psoriasis. New drugs (apremilast, brodalumab, guselkumab, ixekizumab, and secukinumab) have recently been approved for psoriasis. However, response rates to systemic treatments for moderate-to-severe psoriasis range from 35 to 80%, so it is necessary to identify non-invasive biomarkers that could help predict treatment outcomes of these therapies and individualize care for patients with psoriasis. These biomarkers could improve patient quality of life and reduce health costs and potential side effects. 
Pharmacogenetic studies have identified potential biomarkers for response to biologic treatments for moderate-to-severe psoriasis. These biomarkers need to be validated in clinical trials involving large cohorts of patients before they can be translated to the clinic. We review pharmacogenetics and pharmacogenomics studies for the treatment of moderate-to-severe plaque psoriasis.
|