1
Saqi M, Lysenko A, Guo YK, Tsunoda T, Auffray C. Navigating the disease landscape: knowledge representations for contextualizing molecular signatures. Brief Bioinform 2019; 20:609-623. [PMID: 29684165 PMCID: PMC6556902 DOI: 10.1093/bib/bby025] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 11/09/2017] [Revised: 02/05/2018] [Indexed: 12/14/2022] [Open Access]
Abstract
Large amounts of data emerging from experiments in molecular medicine are leading to the identification of molecular signatures associated with disease subtypes. The contextualization of these patterns is important for obtaining mechanistic insight into the aberrant processes associated with a disease, and this typically involves the integration of multiple heterogeneous types of data. In this review, we discuss knowledge representations that can be useful to explore the biological context of molecular signatures, in particular three main approaches, namely, pathway mapping approaches, molecular network centric approaches and approaches that represent biological statements as knowledge graphs. We discuss the utility of each of these paradigms, illustrate how they can be leveraged with selected practical examples and identify ongoing challenges for this field of research.
Affiliation(s)
- Mansoor Saqi
  - Data Science Institute, Imperial College London, UK
- Artem Lysenko
  - Laboratory for Medical Science Mathematics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Yi-Ke Guo
  - Data Science Institute, Imperial College London, UK
- Tatsuhiko Tsunoda
  - Laboratory for Medical Science Mathematics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
  - CREST, JST, Tokyo, Japan
  - Department of Medical Science Mathematics, Medical Research Institute, Tokyo Medical and Dental University, Tokyo, Japan
- Charles Auffray
  - European Institute for Systems Biology and Medicine, Lyon, France
2
Kawalia SB, Raschka T, Naz M, de Matos Simoes R, Senger P, Hofmann-Apitius M. Analytical Strategy to Prioritize Alzheimer's Disease Candidate Genes in Gene Regulatory Networks Using Public Expression Data. J Alzheimers Dis 2018; 59:1237-1254. [PMID: 28800327 PMCID: PMC5611835 DOI: 10.3233/jad-170011] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Indexed: 12/21/2022]
Abstract
Alzheimer’s disease (AD) progressively destroys cognitive abilities in the aging population with tremendous effects on memory. Despite recent progress in understanding the underlying mechanisms, high drug attrition rates have put a question mark behind our knowledge about its etiology. Re-evaluation of past studies could help us to elucidate molecular-level details of this disease. Several methods to infer such networks exist, but most of them do not elaborate on context specificity and completeness of the generated networks, missing out on lesser-known candidates. In this study, we present a novel strategy that corroborates common mechanistic patterns across large scale AD gene expression studies and further prioritizes potential biomarker candidates. To infer gene regulatory networks (GRNs), we applied an optimized version of the BC3Net algorithm, named BC3Net10, capable of deriving robust and coherent patterns. In principle, this approach initially leverages the power of literature knowledge to extract AD specific genes for generating viable networks. Our findings suggest that AD GRNs show significant enrichment for key signaling mechanisms involved in neurotransmission. Among the prioritized genes, well-known AD genes were prominent in synaptic transmission, implicated in cognitive deficits. Moreover, less intensive studied AD candidates (STX2, HLA-F, HLA-C, RAB11FIP4, ARAP3, AP2A2, ATP2B4, ITPR2, and ATP2A3) are also involved in neurotransmission, providing new insights into the underlying mechanism. To our knowledge, this is the first study to generate knowledge-instructed GRNs that demonstrates an effective way of combining literature-based knowledge and data-driven analysis to identify lesser known candidates embedded in stable and robust functional patterns across disparate datasets.
Affiliation(s)
- Shweta Bagewadi Kawalia
  - Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Schloss Birlinghoven, Sankt Augustin, Germany
  - Rheinische Friedrich-Wilhelms-Universität Bonn, Bonn-Aachen International Center for Information Technology, Bonn, Germany
- Tamara Raschka
  - Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Schloss Birlinghoven, Sankt Augustin, Germany
  - University of Applied Sciences Koblenz, RheinAhrCampus, Remagen, Germany
- Mufassra Naz
  - Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Schloss Birlinghoven, Sankt Augustin, Germany
  - Rheinische Friedrich-Wilhelms-Universität Bonn, Bonn-Aachen International Center for Information Technology, Bonn, Germany
- Philipp Senger
  - Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Schloss Birlinghoven, Sankt Augustin, Germany
- Martin Hofmann-Apitius
  - Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Schloss Birlinghoven, Sankt Augustin, Germany
  - Rheinische Friedrich-Wilhelms-Universität Bonn, Bonn-Aachen International Center for Information Technology, Bonn, Germany
3
Soliman M, Nasraoui O, Cooper NGF. Building a glaucoma interaction network using a text mining approach. BioData Min 2016; 9:17. [PMID: 27152122 PMCID: PMC4857381 DOI: 10.1186/s13040-016-0096-2] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 10/09/2015] [Accepted: 04/23/2016] [Indexed: 11/21/2022] [Open Access]
Abstract
Background: The volume of biomedical literature and its underlying knowledge base is rapidly expanding, making it beyond the ability of a single human being to read through all the literature. Several automated methods have been developed to help make sense of this dilemma. The present study reports on the results of a text mining approach to extract gene interactions from the data warehouse of published experimental results which are then used to benchmark an interaction network associated with glaucoma. To the best of our knowledge, there is, as yet, no glaucoma interaction network derived solely from text mining approaches. The presence of such a network could provide a useful summative knowledge base to complement other forms of clinical information related to this disease.
Results: A glaucoma corpus was constructed from PubMed Central and a text mining approach was applied to extract genes and their relations from this corpus. The extracted relations between genes were checked using reference interaction databases and classified generally as known or new relations. The extracted genes and relations were then used to construct a glaucoma interaction network. Analysis of the resulting network indicated that it bears the characteristics of a small world interaction network. Our analysis showed the presence of seven glaucoma linked genes that defined the network modularity. A web-based system for browsing and visualizing the extracted glaucoma related interaction networks is made available at http://neurogene.spd.louisville.edu/GlaucomaINViewer/Form1.aspx.
Conclusions: This study has reported the first version of a glaucoma interaction network using a text mining approach. The power of such an approach is in its ability to cover a wide range of glaucoma related studies published over many years. Hence, a bigger picture of the disease can be established. To the best of our knowledge, this is the first glaucoma interaction network to summarize the known literature. The major findings were a set of relations that could not be found in existing interaction databases and that were found to be new, in addition to a smaller subnetwork consisting of interconnected clusters of seven glaucoma genes. Future improvements can be applied towards obtaining a better version of this network.
Electronic supplementary material: The online version of this article (doi:10.1186/s13040-016-0096-2) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Maha Soliman
  - Department of Anatomical Sciences and Neurobiology, University of Louisville, School of Medicine, Louisville, KY USA
- Olfa Nasraoui
  - Knowledge Discovery & Web Mining Lab, Department of Computer Engineering & Computer Science, University of Louisville, J.B Speed School of Engineering, Louisville, KY USA
- Nigel G F Cooper
  - Department of Anatomical Sciences and Neurobiology, University of Louisville, School of Medicine, Louisville, KY USA
4
Saqi M, Pellet J, Roznovat I, Mazein A, Ballereau S, De Meulder B, Auffray C. Systems Medicine: The Future of Medical Genomics, Healthcare, and Wellness. Methods Mol Biol 2016; 1386:43-60. [PMID: 26677178 DOI: 10.1007/978-1-4939-3283-2_3] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.0] [Indexed: 01/27/2023]
Abstract
Recent advances in genomics have led to the rapid and relatively inexpensive collection of patient molecular data including multiple types of omics data. The integration of these data with clinical measurements has the potential to impact on our understanding of the molecular basis of disease and on disease management. Systems medicine is an approach to understanding disease through an integration of large patient datasets. It offers the possibility for personalized strategies for healthcare through the development of a new taxonomy of disease. Advanced computing will be an important component in effectively implementing systems medicine. In this chapter we describe three computational challenges associated with systems medicine: disease subtype discovery using integrated datasets, obtaining a mechanistic understanding of disease, and the development of an informatics platform for the mining, analysis, and visualization of data emerging from translational medicine studies.
Affiliation(s)
- Mansoor Saqi
  - European Institute for Systems Biology and Medicine, CNRS-ENS-UCBL, Université de Lyon, 50 Avenue Tony Garnier, Lyon, 69007, France
- Johann Pellet
  - European Institute for Systems Biology and Medicine, CNRS-ENS-UCBL, Université de Lyon, 50 Avenue Tony Garnier, Lyon, 69007, France
- Irina Roznovat
  - European Institute for Systems Biology and Medicine, CNRS-ENS-UCBL, Université de Lyon, 50 Avenue Tony Garnier, Lyon, 69007, France
- Alexander Mazein
  - European Institute for Systems Biology and Medicine, CNRS-ENS-UCBL, Université de Lyon, 50 Avenue Tony Garnier, Lyon, 69007, France
- Stéphane Ballereau
  - European Institute for Systems Biology and Medicine, CNRS-ENS-UCBL, Université de Lyon, 50 Avenue Tony Garnier, Lyon, 69007, France
- Bertrand De Meulder
  - European Institute for Systems Biology and Medicine, CNRS-ENS-UCBL, Université de Lyon, 50 Avenue Tony Garnier, Lyon, 69007, France
- Charles Auffray
  - European Institute for Systems Biology and Medicine, CNRS-ENS-UCBL, Université de Lyon, 50 Avenue Tony Garnier, Lyon, 69007, France
  - Université Claude Bernard, 3e étage plot 2, 50 Avenue Tony Garnier, Lyon, Cedex 07, 69366, France
5
Hofmann-Apitius M, Ball G, Gebel S, Bagewadi S, de Bono B, Schneider R, Page M, Kodamullil AT, Younesi E, Ebeling C, Tegnér J, Canard L. Bioinformatics Mining and Modeling Methods for the Identification of Disease Mechanisms in Neurodegenerative Disorders. Int J Mol Sci 2015; 16:29179-206. [PMID: 26690135 PMCID: PMC4691095 DOI: 10.3390/ijms161226148] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.9] [Received: 10/16/2015] [Revised: 11/10/2015] [Accepted: 11/12/2015] [Indexed: 12/22/2022] [Open Access]
Abstract
Since the decoding of the Human Genome, techniques from bioinformatics, statistics, and machine learning have been instrumental in uncovering patterns in increasing amounts and types of different data produced by technical profiling technologies applied to clinical samples, animal models, and cellular systems. Yet, progress on unravelling biological mechanisms, causally driving diseases, has been limited, in part due to the inherent complexity of biological systems. Whereas we have witnessed progress in the areas of cancer, cardiovascular and metabolic diseases, the area of neurodegenerative diseases has proved to be very challenging. This is in part because the aetiology of neurodegenerative diseases such as Alzheimer's disease or Parkinson's disease is unknown, rendering it very difficult to discern early causal events. Here we describe a panel of bioinformatics and modeling approaches that have recently been developed to identify candidate mechanisms of neurodegenerative diseases based on publicly available data and knowledge. We identify two complementary strategies: data mining techniques using genetic data as a starting point to be further enriched using other data-types, or alternatively to encode prior knowledge about disease mechanisms in a model based framework supporting reasoning and enrichment analysis. Our review illustrates the challenges entailed in integrating heterogeneous, multiscale and multimodal information in the area of neurology in general and neurodegeneration in particular. We conclude that progress would be accelerated by increasing efforts on performing systematic collection of multiple data-types over time from each individual suffering from neurodegenerative disease. The work presented here has been driven by project AETIONOMY; a project funded in the course of the Innovative Medicines Initiative (IMI); which is a public-private partnership of the European Federation of Pharmaceutical Industry Associations (EFPIA) and the European Commission (EC).
Affiliation(s)
- Martin Hofmann-Apitius
  - Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Institutszentrum Birlinghoven, Sankt Augustin D-53754, Germany.
  - Rheinische Friedrich-Wilhelms-Universitaet Bonn, University of Bonn, Bonn 53113, Germany.
- Gordon Ball
  - Unit of Computational Medicine, Center for Molecular Medicine, Department of Medicine, and Unit of Clinical Epidemiology, Karolinska University Hospital, Stockholm SE-171 77, Sweden.
  - Science for Life Laboratories, Karolinska Institutet, Stockholm SE-171 77, Sweden.
- Stephan Gebel
  - Luxembourg Centre for Systems Biomedicine, University of Luxembourg, 7, avenue des Hauts-Fourneaux, Esch-sur-Alzette L-4362, Luxembourg.
- Shweta Bagewadi
  - Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Institutszentrum Birlinghoven, Sankt Augustin D-53754, Germany.
- Bernard de Bono
  - Institute of Health Informatics, University College London, London NW1 2DA, UK.
  - Auckland Bioengineering Institute, University of Auckland, Symmonds Street, Auckland 1142, New Zealand.
- Reinhard Schneider
  - Luxembourg Centre for Systems Biomedicine, University of Luxembourg, 7, avenue des Hauts-Fourneaux, Esch-sur-Alzette L-4362, Luxembourg.
- Matt Page
  - Translational Bioinformatics, UCB Pharma, 216 Bath Rd, Slough SL1 3WE, UK.
- Alpha Tom Kodamullil
  - Rheinische Friedrich-Wilhelms-Universitaet Bonn, University of Bonn, Bonn 53113, Germany.
- Erfan Younesi
  - Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Institutszentrum Birlinghoven, Sankt Augustin D-53754, Germany.
- Christian Ebeling
  - Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Institutszentrum Birlinghoven, Sankt Augustin D-53754, Germany.
- Jesper Tegnér
  - Unit of Computational Medicine, Center for Molecular Medicine, Department of Medicine, and Unit of Clinical Epidemiology, Karolinska University Hospital, Stockholm SE-171 77, Sweden.
  - Science for Life Laboratories, Karolinska Institutet, Stockholm SE-171 77, Sweden.
- Luc Canard
  - Translational Science Unit, SANOFI Recherche & Développement, 1 Avenue Pierre Brossolette, Chilly-Mazarin Cedex 91385, France.
6
Tarnanas I, Tsolaki A, Wiederhold M, Wiederhold B, Tsolaki M. Five-year biomarker progression variability for Alzheimer's disease dementia prediction: Can a complex instrumental activities of daily living marker fill in the gaps? Alzheimer's & Dementia: Diagnosis, Assessment & Disease Monitoring 2015; 1:521-32. [PMID: 27239530 PMCID: PMC4879487 DOI: 10.1016/j.dadm.2015.10.005] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.5] [Indexed: 12/22/2022]
Abstract
Introduction: Biomarker progressions explain higher variability in cognitive decline than baseline values alone. This study examines progressions of established biomarkers along with a novel marker in a longitudinal cognitive decline.
Methods: A total of 215 subjects were used with a diagnosis of normal, mild cognitive impairment (MCI) or Alzheimer's disease (AD) at baseline. We calculated standardized biomarker progression rates and used them as predictors of outcome within 5 years.
Results: Early cognitive declines were more strongly explained by fluorodeoxyglucose-positron emission tomography, precuneus and medial temporal cortical thickness, and the complex instrumental activities of daily living (iADL) marker progressions. Using Cox proportional hazards model, we found that these progressions were a significant risk factor for conversion from both MCI to AD (adjusted hazard ratio 1.45; 95% confidence interval 1.20–1.93; P = 1.23 × 10⁻⁵) and cognitively normal to MCI (adjusted hazard ratio 1.76; 95% confidence interval 1.32–2.34; P = 1.55 × 10⁻⁵).
Discussion: Compared with standard biological biomarkers, complex functional iADL markers could also provide predictive information for cognitive decline during the presymptomatic stage. This has important implications for clinical trials focusing on prevention in asymptomatic individuals.
Affiliation(s)
- Ioannis Tarnanas
  - Health-IS Lab, Department of Management, Technology and Economics, ETH Zurich, Zurich, Switzerland
  - Piaget Research Foundation, Nürnberg, Germany
  - Corresponding author. Tel.: +41-71-224-72-44; Fax: +41-632-97-75.
- Anthoula Tsolaki
  - 3rd Department of Neurology, Medical School, Aristotle University of Thessaloniki, Thessaloniki, Greece
- Mark Wiederhold
  - Division of Cognitive and Restorative Neurology, Virtual Reality Medical Center, San Diego, CA, USA
- Brenda Wiederhold
  - Virtual Reality Medical Institute, Clos Chapelle aux Champs, Brussels, Belgium
- Magda Tsolaki
  - 3rd Department of Neurology, Medical School, Aristotle University of Thessaloniki, Thessaloniki, Greece