1. Danek BP, Makarious MB, Dadu A, Vitale D, Lee PS, Singleton AB, Nalls MA, Sun J, Faghri F. Federated learning for multi-omics: A performance evaluation in Parkinson's disease. Patterns (N Y) 2024; 5:100945. [PMID: 38487808 PMCID: PMC10935499 DOI: 10.1016/j.patter.2024.100945] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/09/2023] [Revised: 01/29/2024] [Accepted: 02/02/2024] [Indexed: 03/17/2024]
Abstract
While machine learning (ML) research has recently grown in popularity, its application in the omics domain is constrained by access to the sufficiently large, high-quality datasets needed to train ML models. Federated learning (FL) represents an opportunity to enable collaborative curation of such datasets among participating institutions. We compare the simulated performance of several models trained using FL against classically trained ML models on the task of multi-omics Parkinson's disease prediction. We find that FL model performance tracks that of centrally trained ML models: the most performant FL model achieves an AUC-PR of 0.876 ± 0.009, which is 0.014 ± 0.003 less than its centrally trained variation. We also determine that the dispersion of samples within a federation plays a meaningful role in model performance. Our study implements several open-source FL frameworks and aims to highlight some of the challenges and opportunities when applying these collaborative methods in multi-omics studies.
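As a rough illustration of how a federation can combine locally trained models without pooling raw data, the sketch below implements federated averaging (FedAvg), one common aggregation rule. The abstract does not say which aggregation strategies the authors evaluated, so this is an assumption chosen for illustration; the function name `federated_average` and its arguments are hypothetical.

```python
from typing import List, Sequence


def federated_average(
    client_weights: Sequence[Sequence[float]],
    client_sizes: Sequence[int],
) -> List[float]:
    """Combine per-site parameter vectors, weighting each site by its sample count.

    Illustrative FedAvg-style aggregation only; not the authors' implementation.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    n_params = len(client_weights[0])
    averaged = [0.0] * n_params
    for weights, size in zip(client_weights, client_sizes):
        if len(weights) != n_params:
            raise ValueError("all clients must share the same parameter shape")
        # Sites with more samples contribute proportionally more.
        for i, w in enumerate(weights):
            averaged[i] += w * size / total
    return averaged
```

Because each site's contribution scales with its sample count, a federation in which a few sites hold most of the samples yields a different global model than one with evenly spread samples, which is one way the distribution of samples across sites can affect the result.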
Affiliation(s)
- Benjamin P. Danek
  - Department of Computer Science, University of Illinois at Urbana-Champaign, Champaign, IL 61820, USA
  - Center for Alzheimer’s and Related Dementias (CARD), National Institute on Aging and National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD 20892, USA
  - DataTecnica, Washington, DC 20037, USA
- Mary B. Makarious
  - Laboratory of Neurogenetics, National Institute on Aging, National Institutes of Health, Bethesda, MD 20892, USA
  - Department of Clinical and Movement Neurosciences, UCL Queen Square Institute of Neurology, London, UK
  - UCL Movement Disorders Centre, University College London, London, UK
- Anant Dadu
  - Center for Alzheimer’s and Related Dementias (CARD), National Institute on Aging and National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD 20892, USA
  - DataTecnica, Washington, DC 20037, USA
- Dan Vitale
  - Center for Alzheimer’s and Related Dementias (CARD), National Institute on Aging and National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD 20892, USA
  - DataTecnica, Washington, DC 20037, USA
- Paul Suhwan Lee
  - Center for Alzheimer’s and Related Dementias (CARD), National Institute on Aging and National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD 20892, USA
- Andrew B. Singleton
  - Center for Alzheimer’s and Related Dementias (CARD), National Institute on Aging and National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD 20892, USA
  - Laboratory of Neurogenetics, National Institute on Aging, National Institutes of Health, Bethesda, MD 20892, USA
- Mike A. Nalls
  - Center for Alzheimer’s and Related Dementias (CARD), National Institute on Aging and National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD 20892, USA
  - DataTecnica, Washington, DC 20037, USA
  - Laboratory of Neurogenetics, National Institute on Aging, National Institutes of Health, Bethesda, MD 20892, USA
- Jimeng Sun
  - Department of Computer Science, University of Illinois at Urbana-Champaign, Champaign, IL 61820, USA
  - Carle Illinois College of Medicine, University of Illinois at Urbana-Champaign, Champaign, IL 61820, USA
- Faraz Faghri
  - Center for Alzheimer’s and Related Dementias (CARD), National Institute on Aging and National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD 20892, USA
  - DataTecnica, Washington, DC 20037, USA
  - Laboratory of Neurogenetics, National Institute on Aging, National Institutes of Health, Bethesda, MD 20892, USA
2. Sheng M, Qi Y, Gao Z, Lin X. Analyzing omics data based on sample network. J Bioinform Comput Biol 2024; 22:2450002. [PMID: 38567387 DOI: 10.1142/s0219720024500021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/04/2024]
Abstract
Identifying valuable features from complex omics data is of great significance for disease diagnosis studies. This paper proposes a new feature selection algorithm based on sample network (FS-SN) to mine important information from omics data. The sample network is constructed according to the sample neighbor relationship at the molecular (feature) expression level, and the distinguishing ability of the feature is evaluated based on the topology of the sample network. The sample network established on a feature with a strong discriminating ability tends to have many edges between samples of the same group and few edges between samples of different groups. At the same time, FS-SN removes redundant features according to the gravitational interaction between features. To validate FS-SN, it was compared on ten public datasets with ERGS, mRMR, ReliefF, ATSD-DN, and INDEED, which are efficient in omics data analysis. Experimental results show that FS-SN performed better than the compared methods in accuracy, sensitivity and specificity in most cases. Hence, FS-SN, which makes use of the topology of the sample network, is effective for analyzing omics data: it can identify key features that reflect the occurrence and development of diseases and reveal the underlying biological mechanism.
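The idea that a discriminating feature produces a sample network with mostly same-group edges can be sketched as follows. This is only an illustrative approximation, not the FS-SN algorithm itself: it assumes a k-nearest-neighbor network built from the values of a single feature and scores the feature by the fraction of edges that join same-group samples. The function name, the choice of k-nearest neighbors, and the scoring rule are all assumptions, and the redundancy-removal step is not covered.

```python
from typing import Sequence


def same_group_edge_fraction(
    values: Sequence[float],
    labels: Sequence[int],
    k: int = 2,
) -> float:
    """Score one feature by the share of k-NN edges linking same-group samples."""
    n = len(values)
    if n != len(labels):
        raise ValueError("values and labels must have the same length")
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < number of samples")

    edges = set()
    for i in range(n):
        # Rank the other samples by distance on this feature (ties broken by index).
        neighbors = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: (abs(values[i] - values[j]), j),
        )[:k]
        for j in neighbors:
            edges.add((min(i, j), max(i, j)))  # undirected edge

    same = sum(1 for i, j in edges if labels[i] == labels[j])
    return same / len(edges)
```

Under these assumptions, a feature whose values cleanly separate the two groups scores 1.0, while a feature whose values interleave the groups scores close to 0.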
Affiliation(s)
- Meizhen Sheng
  - School of Computer Science & Technology, Dalian University of Technology, No. 2 Linggong Road, Dalian, Liaoning Province 116024, P. R. China
- Yanpeng Qi
  - School of Computer Science & Technology, Dalian University of Technology, No. 2 Linggong Road, Dalian, Liaoning Province 116024, P. R. China
- Zhenbo Gao
  - School of Computer Science & Technology, Dalian University of Technology, No. 2 Linggong Road, Dalian, Liaoning Province 116024, P. R. China
- Xiaohui Lin
  - School of Computer Science & Technology, Dalian University of Technology, No. 2 Linggong Road, Dalian, Liaoning Province 116024, P. R. China
3. Kong X, Diao L, Jiang P, Nie S, Guo S, Li D. DDK-Linker: a network-based strategy identifies disease signals by linking high-throughput omics datasets to disease knowledge. Brief Bioinform 2024; 25:bbae111. [PMID: 38517698 PMCID: PMC10959161 DOI: 10.1093/bib/bbae111] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/14/2023] [Revised: 02/26/2024] [Accepted: 02/27/2024] [Indexed: 03/24/2024]
Abstract
High-throughput genomic and proteomic scanning approaches allow investigators to quantify genome-wide genes (or gene products) for certain disease conditions, which plays an essential role in promoting the discovery of disease mechanisms. These approaches often generate a large list of genes of interest (GOIs), such as differentially expressed genes/proteins. However, researchers have to perform manual triage and validation to explore the most promising, biologically plausible linkages between the known disease genes and GOIs (disease signals) for further study. Here, to address this challenge, we proposed a network-based strategy, DDK-Linker, to facilitate the exploration of disease signals hidden in omics data by linking GOIs to known disease genes. Specifically, it reconstructed gene distances in the protein-protein interaction (PPI) network through six network methods (random walk with restart, Deepwalk, Node2Vec, LINE, HOPE, Laplacian) to discover disease signals in omics data that have shorter distances to disease genes. Furthermore, benefiting from the knowledge base we established, abundant bioinformatics annotations were provided for each candidate disease signal. To assist in omics data interpretation and facilitate usage, we have developed this strategy into an application that users can access through a website or download as an R package. We believe DDK-Linker will accelerate the exploration of disease genes and drug targets in a variety of omics data, such as genomics, transcriptomics and proteomics data, and provide clues for complex disease mechanism and pharmacological research. DDK-Linker is freely accessible at http://ddklinker.ncpsb.org.cn/.
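Of the six network methods listed, random walk with restart (RWR) is the simplest to sketch. The example below is a generic RWR on a small undirected graph, seeded at known disease genes, so that higher steady-state scores indicate nodes that are closer to the seeds in the network. It is not DDK-Linker's code: the function name, the default restart probability of 0.5, and the toy graph used in the note after the block are assumptions for illustration.

```python
from typing import Dict, Iterable, List


def random_walk_with_restart(
    adjacency: Dict[str, List[str]],
    seeds: Iterable[str],
    restart: float = 0.5,
    tol: float = 1e-12,
    max_iter: int = 1000,
) -> Dict[str, float]:
    """Return steady-state visiting probabilities of a walker that restarts at the seeds."""
    seed_set = set(seeds)
    nodes = sorted(adjacency)
    if not seed_set or not seed_set.issubset(nodes):
        raise ValueError("seeds must be a non-empty subset of the graph's nodes")

    p0 = {n: (1.0 / len(seed_set) if n in seed_set else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            neighbors = adjacency[u]
            if not neighbors:
                # Isolated node: send its walking mass back to the seeds.
                for n in nodes:
                    nxt[n] += (1.0 - restart) * p[u] * p0[n]
                continue
            share = (1.0 - restart) * p[u] / len(neighbors)
            for v in neighbors:
                nxt[v] += share
        change = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if change < tol:
            break
    return p
```

On a toy path graph A-B-C-D with A as the only seed, the scores decrease with distance from A, which is the behavior a proximity-based ranking of GOIs relies on.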
Affiliation(s)
- Xiangren Kong
  - State Key Laboratory of Medical Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing 102206, China
- Lihong Diao
  - State Key Laboratory of Medical Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing 102206, China
  - School of Traditional Chinese Medicine, Beijing University of Chinese Medicine, Beijing 100029, China
- Peng Jiang
  - State Key Laboratory of Medical Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing 102206, China
- Shiyan Nie
  - State Key Laboratory of Medical Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing 102206, China
- Shuzhen Guo
  - School of Traditional Chinese Medicine, Beijing University of Chinese Medicine, Beijing 100029, China
- Dong Li
  - State Key Laboratory of Medical Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing 102206, China
4. Andalib KMS, Ahmed A, Habib A. Omics data analysis reveals common molecular basis of small cell lung cancer and COVID-19. J Biomol Struct Dyn 2023:1-16. [PMID: 37708006 DOI: 10.1080/07391102.2023.2257803] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/26/2023] [Accepted: 08/23/2023] [Indexed: 09/16/2023]
Abstract
The impact of COVID-19 infection on individuals with small cell lung cancer (SCLC) poses a serious threat. Unfortunately, the molecular basis of this severe comorbidity has yet to be elucidated. The present study addresses this gap utilizing publicly available omics data of COVID-19 and SCLC to explore the key molecules and associated pathways involved in the convergence of these diseases. Findings revealed 402 genes that exhibited differential expression patterns in SCLC patients and also play a pivotal role in COVID-19 pathogenesis. Subsequent functional enrichment analyses identified relevant ontologies and pathways that are significantly associated with these genes, revealing important insights into their potential biological, molecular and cellular functions. The protein-protein interaction network, constructed under four combinatorial topological assessments, highlighted SMAD3, CAV1, PIK3R1, and FN1 as the primary components of this comorbidity. Our results suggest that these components significantly regulate this cross-talk, triggering the PI3K-AKT and TGF-β signaling pathways. Lastly, this study made a multi-step computational attempt and identified corylifol A and ginkgetin from natural sources that can potentially inhibit these components. Therefore, the outcomes of this study offer novel perspectives on the common molecular mechanisms underlying SCLC and COVID-19 and present future opportunities for drug development. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- K M Salim Andalib
  - Biotechnology and Genetic Engineering Discipline, Life Science School, Khulna University, Khulna, Bangladesh
- Asif Ahmed
  - Biotechnology and Genetic Engineering Discipline, Life Science School, Khulna University, Khulna, Bangladesh
- Ahsan Habib
  - Biotechnology and Genetic Engineering Discipline, Life Science School, Khulna University, Khulna, Bangladesh
5. Li S, Hsu C, Zhao T, He L. Editorial: Leveraging machine learning for omics-driven biomarker discovery. Front Mol Biosci 2023; 9:1119644. [PMID: 36699701 PMCID: PMC9868145 DOI: 10.3389/fmolb.2022.1119644] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/09/2022] [Accepted: 12/22/2022] [Indexed: 01/11/2023]
Affiliation(s)
- Sheng Li
  - Zhongnan Hospital, Wuhan University, Wuhan, China
  - Correspondence: Sheng Li; Charles Hsu; Tianyi Zhao; Liangcan He
- Charles Hsu
  - Department of Population Medicine, College of Medicine, Qatar University, Doha, Qatar
- Tianyi Zhao
  - Harbin Institute of Technology, Harbin, China
- Liangcan He
  - Harbin Institute of Technology, Harbin, China
6. Cavallari I, Cerciello F, Giovannetti E, Urso L. Editorial: Moving beyond the molecular mechanisms of malignant pleural mesothelioma: Cues for novel biomarkers and drug targets. Front Oncol 2023; 13:1163144. [PMID: 36950549 PMCID: PMC10025518 DOI: 10.3389/fonc.2023.1163144] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/10/2023] [Accepted: 02/20/2023] [Indexed: 03/08/2023]
Affiliation(s)
- Ilaria Cavallari
  - Immunology and Molecular Oncology Unit, Istituto Oncologico Veneto – Istituto di Ricovero e Cura a Carattere Scientifico (IRCCS), Padua, Italy
  - Correspondence: Ilaria Cavallari; Elisa Giovannetti; Ferdinando Cerciello; Loredana Urso
- Ferdinando Cerciello
  - Department of Medical Oncology, Inselspital, Bern University Hospital, University of Bern, Bern, Switzerland
- Elisa Giovannetti
  - Laboratory Medical Oncology, Department Medical Oncology, Cancer Center Amsterdam, Amsterdam Universitair Medische Centra (UMC), Vrije Universiteit, Amsterdam, Netherlands
  - Cancer Pharmacology Lab, Fondazione Pisana per la Scienza, Pisa, Italy
- Loredana Urso
  - Immunology and Molecular Oncology Unit, Istituto Oncologico Veneto – Istituto di Ricovero e Cura a Carattere Scientifico (IRCCS), Padua, Italy
  - Department of Surgery, Oncology and Gastroenterology, University of Padua, Padua, Italy
7. Fierro-Monti I, Wright JC, Choudhary JS, Vizcaíno JA. Identifying individuals using proteomics: are we there yet? Front Mol Biosci 2022; 9:1062031. [PMID: 36523653 PMCID: PMC9744771 DOI: 10.3389/fmolb.2022.1062031] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 10/05/2022] [Accepted: 11/16/2022] [Indexed: 08/31/2023]
Abstract
Multi-omics approaches including proteomics analyses are becoming an integral component of precision medicine. As clinical proteomics studies gain momentum and their sensitivity increases, research on identifying individuals based on their proteomics data is here examined for risks and ethics-related issues. A great deal of work has already been done on this topic for DNA/RNA sequencing data, but it has yet to be widely studied in other omics fields. The current state-of-the-art for the identification of individuals based solely on proteomics data is explained. Protein sequence variation analysis approaches are covered in more detail, including the available analysis workflows and their limitations. We also outline some previous forensic and omics proteomics studies that are relevant for the identification of individuals. Following that, we discuss the risks of patient reidentification using other proteomics data types such as protein expression abundance and post-translational modification (PTM) profiles. In light of the potential identification of individuals through proteomics data, possible legal and ethical implications are becoming increasingly important in the field.
Affiliation(s)
- Ivo Fierro-Monti
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, United Kingdom
- Juan Antonio Vizcaíno
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, United Kingdom
8. Thomas PD, Ebert D, Muruganujan A, Mushayahama T, Albou L, Mi H. PANTHER: Making genome-scale phylogenetics accessible to all. Protein Sci 2022; 31:8-22. [PMID: 34717010 PMCID: PMC8740835 DOI: 10.1002/pro.4218] [Citation(s) in RCA: 343] [Impact Index Per Article: 171.5] [Received: 08/26/2021] [Revised: 10/24/2021] [Accepted: 10/26/2021] [Indexed: 02/03/2023]
Abstract
Phylogenetics is a powerful tool for analyzing protein sequences, by inferring their evolutionary relationships to other proteins. However, phylogenetics analyses can be challenging: they are computationally expensive and must be performed carefully in order to avoid systematic errors and artifacts. Protein Analysis THrough Evolutionary Relationships (PANTHER; http://pantherdb.org) is a publicly available, user-focused knowledgebase that stores the results of an extensive phylogenetic reconstruction pipeline that includes computational and manual processes and quality control steps. First, fully reconciled phylogenetic trees (including ancestral protein sequences) are reconstructed for a set of "reference" protein sequences obtained from fully sequenced genomes of organisms across the tree of life. Second, the resulting phylogenetic trees are manually reviewed and annotated with function evolution events: inferred gains and losses of protein function along branches of the phylogenetic tree. Here, we describe in detail the current contents of PANTHER, how those contents are generated, and how they can be used in a variety of applications. The PANTHER knowledgebase can be downloaded or accessed via an extensive API. In addition, PANTHER provides software tools to facilitate the application of the knowledgebase to common protein sequence analysis tasks: exploring an annotated genome by gene function; performing "enrichment analysis" of lists of genes; annotating a single sequence or large batch of sequences by homology; and assessing the likelihood that a genetic variant at a particular site in a protein will have deleterious effects.
Affiliation(s)
- Paul D. Thomas
  - Division of Bioinformatics, Department of Population and Public Health Sciences, University of Southern California, Los Angeles, California, USA
- Dustin Ebert
  - Division of Bioinformatics, Department of Population and Public Health Sciences, University of Southern California, Los Angeles, California, USA
- Anushya Muruganujan
  - Division of Bioinformatics, Department of Population and Public Health Sciences, University of Southern California, Los Angeles, California, USA
- Tremayne Mushayahama
  - Division of Bioinformatics, Department of Population and Public Health Sciences, University of Southern California, Los Angeles, California, USA
- Laurent‐Philippe Albou
  - Division of Bioinformatics, Department of Population and Public Health Sciences, University of Southern California, Los Angeles, California, USA
- Huaiyu Mi
  - Division of Bioinformatics, Department of Population and Public Health Sciences, University of Southern California, Los Angeles, California, USA
9. Tayara H, Abdelbaky I, To Chong K. Recent omics-based computational methods for COVID-19 drug discovery and repurposing. Brief Bioinform 2021; 22:6355836. [PMID: 34423353 DOI: 10.1093/bib/bbab339] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 05/22/2021] [Revised: 07/09/2021] [Indexed: 12/22/2022]
Abstract
The coronavirus disease 2019 (COVID-19) pandemic, caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), is the main reason for the increasing number of deaths worldwide. Although strict quarantine measures were followed in many countries, the disease situation is still intractable. Thus, all possible means are needed to confront this pandemic, and researchers are in a race against time to produce potential treatments to cure or reduce the increasing infections of COVID-19. Computational methods are rapidly proving successful in biology-related problems, including diagnosis and treatment of diseases. Many efforts in recent months utilized Artificial Intelligence (AI) techniques in the context of fighting the spread of COVID-19. Providing periodic reviews and discussions of recent efforts saves researchers time and helps to link their endeavors for a faster and more efficient confrontation of the pandemic. In this review, we discuss the recent promising studies that used omics-based data and utilized AI algorithms and other computational tools to achieve this goal. We review the established datasets and the developed methods that were primarily directed at new or repurposed drugs, vaccinations and diagnosis. The tools and methods varied depending on the level of detail in the available information, such as structures, sequences or metabolic data.
Affiliation(s)
- Hilal Tayara
  - School of International Engineering and Science, Jeonbuk National University, Jeonju 54896, Republic of Korea
- Ibrahim Abdelbaky
  - Artificial Intelligence Department, Faculty of Computers and Artificial Intelligence, Benha University, Banha 13518, Egypt
- Kil To Chong
  - Department of Electronics and Information Engineering, Jeonbuk National University, Jeonju, Jeollabukdo 54896, Republic of Korea
  - Advances Electronics and Information Research Center, Jeonbuk National University, Jeonju 54896, Republic of Korea
10. Ostaszewski M, Niarakis A, Mazein A, Kuperstein I, Phair R, Orta‐Resendiz A, Singh V, Aghamiri SS, Acencio ML, Glaab E, Ruepp A, Fobo G, Montrone C, Brauner B, Frishman G, Monraz Gómez LC, Somers J, Hoch M, Kumar Gupta S, Scheel J, Borlinghaus H, Czauderna T, Schreiber F, Montagud A, Ponce de Leon M, Funahashi A, Hiki Y, Hiroi N, Yamada TG, Dräger A, Renz A, Naveez M, Bocskei Z, Messina F, Börnigen D, Fergusson L, Conti M, Rameil M, Nakonecnij V, Vanhoefer J, Schmiester L, Wang M, Ackerman EE, Shoemaker JE, Zucker J, Oxford K, Teuton J, Kocakaya E, Summak GY, Hanspers K, Kutmon M, Coort S, Eijssen L, Ehrhart F, Rex DAB, Slenter D, Martens M, Pham N, Haw R, Jassal B, Matthews L, Orlic‐Milacic M, Senff Ribeiro A, Rothfels K, Shamovsky V, Stephan R, Sevilla C, Varusai T, Ravel J, Fraser R, Ortseifen V, Marchesi S, Gawron P, Smula E, Heirendt L, Satagopam V, Wu G, Riutta A, Golebiewski M, Owen S, Goble C, Hu X, Overall RW, Maier D, Bauch A, Gyori BM, Bachman JA, Vega C, Grouès V, Vazquez M, Porras P, Licata L, Iannuccelli M, Sacco F, Nesterova A, Yuryev A, de Waard A, Turei D, Luna A, Babur O, Soliman S, Valdeolivas A, Esteban‐Medina M, Peña‐Chilet M, Rian K, Helikar T, Puniya BL, Modos D, Treveil A, Olbei M, De Meulder B, Ballereau S, Dugourd A, Naldi A, Noël V, Calzone L, Sander C, Demir E, Korcsmaros T, Freeman TC, Augé F, Beckmann JS, Hasenauer J, Wolkenhauer O, Willighagen EL, Pico AR, Evelo CT, Gillespie ME, Stein LD, Hermjakob H, D'Eustachio P, Saez‐Rodriguez J, Dopazo J, Valencia A, Kitano H, Barillot E, Auffray C, Balling R, Schneider R. COVID19 Disease Map, a computational knowledge repository of virus-host interaction mechanisms. Mol Syst Biol 2021; 17:e10387. [PMID: 34664389 PMCID: PMC8524328 DOI: 10.15252/msb.202110387] [Citation(s) in RCA: 29] [Impact Index Per Article: 9.7] [Received: 04/07/2021] [Revised: 08/25/2021] [Accepted: 08/26/2021] [Indexed: 12/13/2022]
Abstract
We need to effectively combine the knowledge from surging literature with complex datasets to propose mechanistic models of SARS-CoV-2 infection, improving data interpretation and predicting key targets of intervention. Here, we describe a large-scale community effort to build an open access, interoperable and computable repository of COVID-19 molecular mechanisms. The COVID-19 Disease Map (C19DMap) is a graphical, interactive representation of disease-relevant molecular mechanisms linking many knowledge sources. Notably, it is a computational resource for graph-based analyses and disease modelling. To this end, we established a framework of tools, platforms and guidelines necessary for a multifaceted community of biocurators, domain experts, bioinformaticians and computational biologists. The diagrams of the C19DMap, curated from the literature, are integrated with relevant interaction and text mining databases. We demonstrate the application of network analysis and modelling approaches by concrete examples to highlight new testable hypotheses. This framework helps to find signatures of SARS-CoV-2 predisposition, treatment response or prioritisation of drug candidates. Such an approach may help deal with new waves of COVID-19 or similar pandemics in the long-term perspective.
Affiliation(s)
- Marek Ostaszewski
  - Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch‐sur‐Alzette, Luxembourg
- Anna Niarakis
  - Université Paris‐Saclay, Laboratoire Européen de Recherche pour la Polyarthrite rhumatoïde ‐ Genhotel, Univ Evry, Evry, France
  - Lifeware Group, Inria Saclay‐Ile de France, Palaiseau, France
- Alexander Mazein
  - Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch‐sur‐Alzette, Luxembourg
- Inna Kuperstein
  - Institut Curie, PSL Research University, Paris, France
  - INSERM, Paris, France
  - MINES ParisTech, PSL Research University, Paris, France
- Robert Phair
  - Integrative Bioinformatics, Inc., Mountain View, CA, USA
- Aurelio Orta‐Resendiz
  - Institut Pasteur, Université de Paris, Unité HIV, Inflammation et Persistance, Paris, France
  - Bio Sorbonne Paris Cité, Université de Paris, Paris, France
- Vidisha Singh
  - Université Paris‐Saclay, Laboratoire Européen de Recherche pour la Polyarthrite rhumatoïde ‐ Genhotel, Univ Evry, Evry, France
- Sara Sadat Aghamiri
  - Inserm‐ Institut national de la santé et de la recherche médicale, Paris, France
- Marcio Luis Acencio
  - Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch‐sur‐Alzette, Luxembourg
- Enrico Glaab
  - Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch‐sur‐Alzette, Luxembourg
- Andreas Ruepp
  - Institute of Experimental Genetics (IEG), Helmholtz Zentrum München‐German Research Center for Environmental Health (GmbH), Neuherberg, Germany
- Gisela Fobo
  - Institute of Experimental Genetics (IEG), Helmholtz Zentrum München‐German Research Center for Environmental Health (GmbH), Neuherberg, Germany
- Corinna Montrone
  - Institute of Experimental Genetics (IEG), Helmholtz Zentrum München‐German Research Center for Environmental Health (GmbH), Neuherberg, Germany
- Barbara Brauner
  - Institute of Experimental Genetics (IEG), Helmholtz Zentrum München‐German Research Center for Environmental Health (GmbH), Neuherberg, Germany
- Goar Frishman
  - Institute of Experimental Genetics (IEG), Helmholtz Zentrum München‐German Research Center for Environmental Health (GmbH), Neuherberg, Germany
- Luis Cristóbal Monraz Gómez
  - Institut Curie, PSL Research University, Paris, France
  - INSERM, Paris, France
  - MINES ParisTech, PSL Research University, Paris, France
- Julia Somers
  - Department of Molecular and Medical Genetics, Oregon Health & Sciences University, Portland, OR, USA
- Matti Hoch
  - Department of Systems Biology and Bioinformatics, University of Rostock, Rostock, Germany
- Julia Scheel
  - Department of Systems Biology and Bioinformatics, University of Rostock, Rostock, Germany
- Hanna Borlinghaus
  - Department of Computer and Information Science, University of Konstanz, Konstanz, Germany
- Tobias Czauderna
  - Faculty of Information Technology, Department of Human‐Centred Computing, Monash University, Clayton, Vic., Australia
- Falk Schreiber
  - Department of Computer and Information Science, University of Konstanz, Konstanz, Germany
  - Faculty of Information Technology, Department of Human‐Centred Computing, Monash University, Clayton, Vic., Australia
- Akira Funahashi
  - Department of Biosciences and Informatics, Keio University, Yokohama, Japan
- Yusuke Hiki
  - Department of Biosciences and Informatics, Keio University, Yokohama, Japan
- Noriko Hiroi
  - Graduate School of Media and Governance, Research Institute at SFC, Keio University, Kanagawa, Japan
- Takahiro G Yamada
  - Department of Biosciences and Informatics, Keio University, Yokohama, Japan
- Andreas Dräger
  - Computational Systems Biology of Infections and Antimicrobial‐Resistant Pathogens, Institute for Bioinformatics and Medical Informatics (IBMI), University of Tübingen, Tübingen, Germany
  - Department of Computer Science, University of Tübingen, Tübingen, Germany
  - German Center for Infection Research (DZIF), partner site Tübingen, Germany
- Alina Renz
  - Computational Systems Biology of Infections and Antimicrobial‐Resistant Pathogens, Institute for Bioinformatics and Medical Informatics (IBMI), University of Tübingen, Tübingen, Germany
  - Department of Computer Science, University of Tübingen, Tübingen, Germany
- Muhammad Naveez
  - Department of Systems Biology and Bioinformatics, University of Rostock, Rostock, Germany
  - Institute of Applied Computer Systems, Riga Technical University, Riga, Latvia
- Zsolt Bocskei
  - Sanofi R&D, Translational Sciences, Chilly‐Mazarin, France
- Francesco Messina
  - Dipartimento di Epidemiologia Ricerca Pre‐Clinica e Diagnostica Avanzata, National Institute for Infectious Diseases 'Lazzaro Spallanzani' I.R.C.C.S., Rome, Italy
  - COVID‐19 INMI Network Medicine for IDs Study Group, National Institute for Infectious Diseases 'Lazzaro Spallanzani' I.R.C.C.S., Rome, Italy
- Daniela Börnigen
  - Bioinformatics Core Facility, Universitätsklinikum Hamburg‐Eppendorf, Hamburg, Germany
- Liam Fergusson
  - Royal (Dick) School of Veterinary Medicine, The University of Edinburgh, Edinburgh, UK
- Marta Conti
  - Faculty of Mathematics and Natural Sciences, University of Bonn, Bonn, Germany
- Marius Rameil
  - Faculty of Mathematics and Natural Sciences, University of Bonn, Bonn, Germany
- Vanessa Nakonecnij
  - Faculty of Mathematics and Natural Sciences, University of Bonn, Bonn, Germany
- Jakob Vanhoefer
  - Faculty of Mathematics and Natural Sciences, University of Bonn, Bonn, Germany
- Leonard Schmiester
  - Faculty of Mathematics and Natural Sciences, University of Bonn, Bonn, Germany
  - Center for Mathematics, Chair of Mathematical Modeling of Biological Systems, Technische Universität München, Garching, Germany
- Muying Wang
  - Department of Chemical and Petroleum Engineering, University of Pittsburgh, Pittsburgh, PA, USA
- Emily E Ackerman
  - Department of Chemical and Petroleum Engineering, University of Pittsburgh, Pittsburgh, PA, USA
- Jason E Shoemaker
  - Department of Chemical and Petroleum Engineering, University of Pittsburgh, Pittsburgh, PA, USA
  - Department of Computational and Systems Biology, University of Pittsburgh, Pittsburgh, PA, USA
- Kristina Hanspers
- Institute of Data Science and Biotechnology, Gladstone Institutes, San Francisco, CA, USA
- Martina Kutmon
- Department of Bioinformatics ‐ BiGCaT, NUTRIM, Maastricht University, Maastricht, The Netherlands
- Maastricht Centre for Systems Biology (MaCSBio), Maastricht University, Maastricht, The Netherlands
- Susan Coort
- Department of Bioinformatics ‐ BiGCaT, NUTRIM, Maastricht University, Maastricht, The Netherlands
- Lars Eijssen
- Department of Bioinformatics ‐ BiGCaT, NUTRIM, Maastricht University, Maastricht, The Netherlands
- Maastricht University Medical Centre, Maastricht, The Netherlands
- Friederike Ehrhart
- Department of Bioinformatics ‐ BiGCaT, NUTRIM, Maastricht University, Maastricht, The Netherlands
- Maastricht University Medical Centre, Maastricht, The Netherlands
- Denise Slenter
- Department of Bioinformatics ‐ BiGCaT, NUTRIM, Maastricht University, Maastricht, The Netherlands
- Marvin Martens
- Department of Bioinformatics ‐ BiGCaT, NUTRIM, Maastricht University, Maastricht, The Netherlands
- Nhung Pham
- Department of Bioinformatics ‐ BiGCaT, NUTRIM, Maastricht University, Maastricht, The Netherlands
- Robin Haw
- MaRS Centre, Ontario Institute for Cancer Research, Toronto, ON, Canada
- Bijay Jassal
- MaRS Centre, Ontario Institute for Cancer Research, Toronto, ON, Canada
- Andrea Senff Ribeiro
- MaRS Centre, Ontario Institute for Cancer Research, Toronto, ON, Canada
- Universidade Federal do Paraná, Curitiba, Brasil
- Karen Rothfels
- MaRS Centre, Ontario Institute for Cancer Research, Toronto, ON, Canada
- Ralf Stephan
- MaRS Centre, Ontario Institute for Cancer Research, Toronto, ON, Canada
- Cristoffer Sevilla
- European Bioinformatics Institute (EMBL‐EBI), European Molecular Biology Laboratory, Hinxton, Cambridgeshire, UK
- Thawfeek Varusai
- European Bioinformatics Institute (EMBL‐EBI), European Molecular Biology Laboratory, Hinxton, Cambridgeshire, UK
- Jean‐Marie Ravel
- INSERM UMR_S 1256, Nutrition, Genetics, and Environmental Risk Exposure (NGERE), Faculty of Medicine of Nancy, University of Lorraine, Nancy, France
- Laboratoire de génétique médicale, CHRU Nancy, Nancy, France
- Rupsha Fraser
- Queen's Medical Research Institute, The University of Edinburgh, Edinburgh, UK
- Vera Ortseifen
- Senior Research Group in Genome Research of Industrial Microorganisms, Center for Biotechnology, Bielefeld University, Bielefeld, Germany
- Silvia Marchesi
- Department of Surgical Science, Uppsala University, Uppsala, Sweden
- Piotr Gawron
- Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch‐sur‐Alzette, Luxembourg
- Institute of Computing Science, Poznan University of Technology, Poznan, Poland
- Ewa Smula
- Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch‐sur‐Alzette, Luxembourg
- Laurent Heirendt
- Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch‐sur‐Alzette, Luxembourg
- Venkata Satagopam
- Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch‐sur‐Alzette, Luxembourg
- Guanming Wu
- Department of Medical Informatics and Clinical Epidemiology, Oregon Health & Science University, Portland, OR, USA
- Anders Riutta
- Institute of Data Science and Biotechnology, Gladstone Institutes, San Francisco, CA, USA
- Stuart Owen
- Department of Computer Science, The University of Manchester, Manchester, UK
- Carole Goble
- Department of Computer Science, The University of Manchester, Manchester, UK
- Xiaoming Hu
- Heidelberg Institute for Theoretical Studies (HITS), Heidelberg, Germany
- Rupert W Overall
- German Center for Neurodegenerative Diseases (DZNE) Dresden, Dresden, Germany
- Center for Regenerative Therapies Dresden (CRTD), Technische Universität Dresden, Dresden, Germany
- Institute for Biology, Humboldt University of Berlin, Berlin, Germany
- Benjamin M Gyori
- Harvard Medical School, Laboratory of Systems Pharmacology, Boston, MA, USA
- John A Bachman
- Harvard Medical School, Laboratory of Systems Pharmacology, Boston, MA, USA
- Carlos Vega
- Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch‐sur‐Alzette, Luxembourg
- Valentin Grouès
- Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch‐sur‐Alzette, Luxembourg
- Pablo Porras
- European Bioinformatics Institute (EMBL‐EBI), European Molecular Biology Laboratory, Hinxton, Cambridgeshire, UK
- Luana Licata
- Department of Biology, University of Rome Tor Vergata, Rome, Italy
- Francesca Sacco
- Department of Biology, University of Rome Tor Vergata, Rome, Italy
- Denes Turei
- Institute for Computational Biomedicine, Heidelberg University, Heidelberg, Germany
- Augustin Luna
- cBio Center, Divisions of Biostatistics and Computational Biology, Department of Data Sciences, Dana‐Farber Cancer Institute, Boston, MA, USA
- Department of Cell Biology, Harvard Medical School, Boston, MA, USA
- Ozgun Babur
- Computer Science Department, University of Massachusetts Boston, Boston, MA, USA
- Alberto Valdeolivas
- Institute for Computational Biomedicine, Heidelberg University, Heidelberg, Germany
- Marina Esteban‐Medina
- Clinical Bioinformatics Area, Fundación Progreso y Salud (FPS), Hospital Virgen del Rocio, Sevilla, Spain
- Computational Systems Medicine Group, Institute of Biomedicine of Seville (IBIS), Hospital Virgen del Rocio, Sevilla, Spain
- Maria Peña‐Chilet
- Clinical Bioinformatics Area, Fundación Progreso y Salud (FPS), Hospital Virgen del Rocio, Sevilla, Spain
- Computational Systems Medicine Group, Institute of Biomedicine of Seville (IBIS), Hospital Virgen del Rocio, Sevilla, Spain
- Bioinformatics in Rare Diseases (BiER), Centro de Investigación Biomédica en Red de Enfermedades Raras (CIBERER), FPS, Hospital Virgen del Rocío, Sevilla, Spain
- Kinza Rian
- Clinical Bioinformatics Area, Fundación Progreso y Salud (FPS), Hospital Virgen del Rocio, Sevilla, Spain
- Computational Systems Medicine Group, Institute of Biomedicine of Seville (IBIS), Hospital Virgen del Rocio, Sevilla, Spain
- Tomáš Helikar
- Department of Biochemistry, University of Nebraska‐Lincoln, Lincoln, NE, USA
- Dezso Modos
- Quadram Institute Bioscience, Norwich, UK
- Earlham Institute, Norwich, UK
- Agatha Treveil
- Quadram Institute Bioscience, Norwich, UK
- Earlham Institute, Norwich, UK
- Marton Olbei
- Quadram Institute Bioscience, Norwich, UK
- Earlham Institute, Norwich, UK
- Stephane Ballereau
- Cancer Research UK Cambridge Institute, University of Cambridge, Cambridge, UK
- Aurélien Dugourd
- Institute for Computational Biomedicine, Heidelberg University, Heidelberg, Germany
- Institute of Experimental Medicine and Systems Biology, Faculty of Medicine, RWTH Aachen University, Aachen, Germany
- Vincent Noël
- Institut Curie, PSL Research University, Paris, France
- INSERM, Paris, France
- MINES ParisTech, PSL Research University, Paris, France
- Laurence Calzone
- Institut Curie, PSL Research University, Paris, France
- INSERM, Paris, France
- MINES ParisTech, PSL Research University, Paris, France
- Chris Sander
- cBio Center, Divisions of Biostatistics and Computational Biology, Department of Data Sciences, Dana‐Farber Cancer Institute, Boston, MA, USA
- Department of Cell Biology, Harvard Medical School, Boston, MA, USA
- Emek Demir
- Department of Molecular and Medical Genetics, Oregon Health & Sciences University, Portland, OR, USA
- Tom C Freeman
- The Roslin Institute, University of Edinburgh, Edinburgh, UK
- Franck Augé
- Sanofi R&D, Translational Sciences, Chilly‐Mazarin, France
- Jan Hasenauer
- Helmholtz Zentrum München – German Research Center for Environmental Health, Institute of Computational Biology, Neuherberg, Germany
- Interdisciplinary Research Unit Mathematics and Life Sciences, University of Bonn, Bonn, Germany
- Olaf Wolkenhauer
- Department of Systems Biology and Bioinformatics, University of Rostock, Rostock, Germany
- Egon L Willighagen
- Department of Bioinformatics ‐ BiGCaT, NUTRIM, Maastricht University, Maastricht, The Netherlands
- Alexander R Pico
- Institute of Data Science and Biotechnology, Gladstone Institutes, San Francisco, CA, USA
- Chris T Evelo
- Department of Bioinformatics ‐ BiGCaT, NUTRIM, Maastricht University, Maastricht, The Netherlands
- Maastricht Centre for Systems Biology (MaCSBio), Maastricht University, Maastricht, The Netherlands
- Marc E Gillespie
- MaRS Centre, Ontario Institute for Cancer Research, Toronto, ON, Canada
- St. John’s University College of Pharmacy and Health Sciences, Queens, NY, USA
- Lincoln D Stein
- MaRS Centre, Ontario Institute for Cancer Research, Toronto, ON, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
- Henning Hermjakob
- European Bioinformatics Institute (EMBL‐EBI), European Molecular Biology Laboratory, Hinxton, Cambridgeshire, UK
- Joaquin Dopazo
- Clinical Bioinformatics Area, Fundación Progreso y Salud (FPS), Hospital Virgen del Rocio, Sevilla, Spain
- Computational Systems Medicine Group, Institute of Biomedicine of Seville (IBIS), Hospital Virgen del Rocio, Sevilla, Spain
- Bioinformatics in Rare Diseases (BiER), Centro de Investigación Biomédica en Red de Enfermedades Raras (CIBERER), FPS, Hospital Virgen del Rocío, Sevilla, Spain
- FPS/ELIXIR‐es, Hospital Virgen del Rocío, Sevilla, Spain
- Alfonso Valencia
- Barcelona Supercomputing Center (BSC), Barcelona, Spain
- Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain
- Hiroaki Kitano
- Systems Biology Institute, Tokyo, Japan
- Okinawa Institute of Science and Technology Graduate School, Okinawa, Japan
- Emmanuel Barillot
- Institut Curie, PSL Research University, Paris, France
- INSERM, Paris, France
- MINES ParisTech, PSL Research University, Paris, France
- Charles Auffray
- Cancer Research UK Cambridge Institute, University of Cambridge, Cambridge, UK
- Rudi Balling
- Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch‐sur‐Alzette, Luxembourg
- Reinhard Schneider
- Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch‐sur‐Alzette, Luxembourg
11
|
Yu SH, Ferretti D, Schessner JP, Rudolph JD, Borner GHH, Cox J. Expanding the Perseus Software for Omics Data Analysis With Custom Plugins. Curr Protoc Bioinformatics 2021; 71:e105. [PMID: 32931150 DOI: 10.1002/cpbi.105] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Indexed: 12/13/2022]
Abstract
The Perseus software provides a comprehensive framework for the statistical analysis of large-scale quantitative proteomics data, also in combination with other omics dimensions. Rapid developments in proteomics technology and the ever-growing diversity of biological studies increasingly require the flexibility to incorporate computational methods designed by the user. Here, we present the new functionality of Perseus to integrate self-made plugins written in C#, R, or Python. The user-written codes will be fully integrated into the Perseus data analysis workflow as custom activities. This also makes language-specific R and Python libraries from CRAN (cran.r-project.org), Bioconductor (bioconductor.org), PyPI (pypi.org), and Anaconda (anaconda.org) accessible in Perseus. The different available approaches are explained in detail in this article. To facilitate the distribution of user-developed plugins among users, we have created a plugin repository for community sharing and filled it with the examples provided in this article and a collection of already existing and more extensive plugins. © 2020 The Authors.
- Basic Protocol 1: Basic steps for R plugins
- Support Protocol 1: R plugins with additional arguments
- Basic Protocol 2: Basic steps for Python plugins
- Support Protocol 2: Python plugins with additional arguments
- Basic Protocol 3: Basic steps and construction of C# plugins
- Basic Protocol 4: Basic steps of construction and connection for R plugins with C# interface
- Support Protocol 4: Advanced example of R plugin with C# interface: UMAP
- Basic Protocol 5: Basic steps of construction and connection for Python plugins with C# interface
- Support Protocol 5: Advanced example of Python plugin with C# interface: UMAP
- Support Protocol 6: A basic workflow for the analysis of label-free quantification proteomics data using Perseus
Affiliation(s)
- Sung-Huan Yu
- Computational Systems Biochemistry Research Group, Max-Planck Institute of Biochemistry, Martinsried, Germany
- Daniela Ferretti
- Computational Systems Biochemistry Research Group, Max-Planck Institute of Biochemistry, Martinsried, Germany
- Julia P Schessner
- Systems Biology of Membrane Trafficking Research Group, Max-Planck Institute of Biochemistry, Martinsried, Germany
- Jan Daniel Rudolph
- Computational Systems Biochemistry Research Group, Max-Planck Institute of Biochemistry, Martinsried, Germany
- Bosch Center for Artificial Intelligence, Robert-Bosch-Campus 1, Renningen, Germany
- Georg H H Borner
- Systems Biology of Membrane Trafficking Research Group, Max-Planck Institute of Biochemistry, Martinsried, Germany
- Jürgen Cox
- Computational Systems Biochemistry Research Group, Max-Planck Institute of Biochemistry, Martinsried, Germany
- Department of Biological and Medical Psychology, University of Bergen, Bergen, Norway
12
|
Tsuji S, Ihara S, Aburatani H. A simple knowledge-based mining method for exploring hidden key molecules in a human biomolecular network. BMC Syst Biol 2012; 6:124. [PMID: 22979956 PMCID: PMC3740779 DOI: 10.1186/1752-0509-6-124] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 02/08/2012] [Accepted: 07/25/2012] [Indexed: 02/02/2023]
Abstract
BACKGROUND: In the functional genomics analysis domain, various methodologies are available for interpreting the results produced by high-throughput biological experiments. These methods commonly use a list of genes as an analysis input, and most of them produce a more complicated list of genes or pathways as the results of the analysis. Although there are several network-based methods, which detect key nodes in the network, the results tend to include well-studied, major hub genes.
RESULTS: To mine the molecules that have biological meaning but to fewer degrees than major hubs, we propose, in this study, a new network-based method for selecting these hidden key molecules based on virtual information flows circulating among the input list of genes. The human biomolecular network was constructed from the Pathway Commons database, and a calculation method based on betweenness centrality was newly developed. We validated the method with the ErbB pathway and applied it to practical cancer research data. We were able to confirm that the output genes, despite having fewer edges than major hubs, have biological meanings that were able to be invoked by the input list of genes.
CONCLUSIONS: The developed method, named NetHiKe (Network-based Hidden Key molecule miner), was able to detect potential key molecules by utilizing the human biomolecular network as a knowledge base. Thus, it is hoped that this method will enhance the progress of biological data analysis in the whole-genome research era.
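As a rough illustration of the kind of computation this abstract describes (betweenness restricted to paths among an input gene list), the sketch below scores each node by the fraction of shortest paths between pairs of input genes that pass through it. It is a generic subset-betweenness calculation on a hypothetical toy graph, not the NetHiKe algorithm itself; the function names, the graph, and the gene labels are all assumptions made for illustration.

```python
from collections import deque
from itertools import combinations


def bfs_counts(graph, source):
    """Return (dist, sigma): hop distance and number of shortest paths from source."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in graph[u]:
            if w not in dist:  # first time w is reached
                dist[w] = dist[u] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:  # u precedes w on a shortest path
                sigma[w] += sigma[u]
    return dist, sigma


def subset_betweenness(graph, inputs):
    """Score nodes of an undirected graph by shortest paths between input pairs.

    For each unordered pair (s, t) of input nodes, an intermediate node v
    receives sigma_st(v) / sigma_st, the share of shortest s-t paths through v.
    """
    inputs = [g for g in inputs if g in graph]
    info = {s: bfs_counts(graph, s) for s in inputs}
    score = {v: 0.0 for v in graph}
    for s, t in combinations(inputs, 2):
        dist_s, sigma_s = info[s]
        dist_t, sigma_t = info[t]
        if t not in dist_s:  # s and t are disconnected
            continue
        for v in graph:
            if v == s or v == t or v not in dist_s or v not in dist_t:
                continue
            if dist_s[v] + dist_t[v] == dist_s[t]:  # v lies on a shortest path
                score[v] += sigma_s[v] * sigma_t[v] / sigma_s[t]
    return score


if __name__ == "__main__":
    # Hypothetical toy network (adjacency lists, undirected).
    toy = {
        "A": ["X", "Y"],
        "X": ["A", "B"],
        "Y": ["A", "B"],
        "B": ["X", "Y", "C"],
        "C": ["B"],
    }
    input_genes = ["A", "B", "C"]
    scores = subset_betweenness(toy, input_genes)
    # Report only nodes outside the input list, highest score first.
    hidden = sorted(
        ((v, s) for v, s in scores.items() if v not in input_genes and s > 0),
        key=lambda item: -item[1],
    )
    print(hidden)
```

In this toy graph the non-input nodes X and Y each carry half of the shortest paths for the pairs (A, B) and (A, C), so both receive a score of 1.0. How such scores are combined with node degree to favour low-degree molecules over major hubs is specific to the published method and is not reproduced here.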
Affiliation(s)
- Shingo Tsuji
- Genome Science Division, Research Center for Advanced Science and Technology (RCAST), The University of Tokyo, 4-6-1 Komaba, Meguro-ku, Tokyo 153-8904, Japan
- Komaba Open Laboratory, The University of Tokyo, Tokyo, Japan
- Sigeo Ihara
- Genome Science Division, Research Center for Advanced Science and Technology (RCAST), The University of Tokyo, 4-6-1 Komaba, Meguro-ku, Tokyo 153-8904, Japan
- Hiroyuki Aburatani
- Genome Science Division, Research Center for Advanced Science and Technology (RCAST), The University of Tokyo, 4-6-1 Komaba, Meguro-ku, Tokyo 153-8904, Japan