1
Pinto J, Balarezo-Cisneros LN, Delneri D. Exploring adaptation routes to cold temperatures in the Saccharomyces genus. PLoS Genet 2025; 21:e1011199. [PMID: 39970180 PMCID: PMC11875353 DOI: 10.1371/journal.pgen.1011199] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/23/2024] [Revised: 03/03/2025] [Accepted: 02/06/2025] [Indexed: 02/21/2025]
Abstract
The identification of traits that affect adaptation of microbial species to external abiotic factors, such as temperature, is key for our understanding of how biodiversity originates and can be maintained in a constantly changing environment. The Saccharomyces genus, which includes eight species with different thermotolerance profiles, represents an ideal experimental platform to study the impact of adaptive alleles in different genetic backgrounds. Previous studies identified a group of adaptive genes for maintenance of growth at lower temperatures. Here, we carried out a genus-wide assessment, across all eight Saccharomyces species, of the role of six candidate genes partially responsible for cold adaptation. We showed that the cold tolerance trait of S. kudriavzevii and S. eubayanus is likely to have evolved via different routes, involving genes important for the conservation of redox balance and for long-chain fatty acid metabolism, respectively. For several loci, temperature- and species-dependent epistasis was detected, underscoring the plasticity and complexity of the genetic interactions. The natural isolates of S. kudriavzevii, S. jurei and S. mikatae had a significantly higher expression of the genes involved in the redox balance compared to S. cerevisiae, suggesting a role at the transcriptional level. To distinguish the effects of gene expression from allelic variation, we independently replaced either the promoters or the coding sequences (CDS) of two genes in four yeast species with those derived from S. kudriavzevii. Our data consistently showed a significant fitness improvement at cold temperatures in the strains carrying the S. kudriavzevii promoter, while growth was lower upon CDS swapping. These results suggest that transcriptional strength plays a bigger role than the CDS in growth maintenance at cold temperatures and support a model of adaptation centred on stochastic tuning of the expression network.
Affiliation(s)
- Javier Pinto
  - Faculty of Biology Medicine and Health, Manchester Institute of Biotechnology, The University of Manchester, Manchester, United Kingdom
- Laura Natalia Balarezo-Cisneros
  - Faculty of Biology Medicine and Health, Manchester Institute of Biotechnology, The University of Manchester, Manchester, United Kingdom
- Daniela Delneri
  - Faculty of Biology Medicine and Health, Manchester Institute of Biotechnology, The University of Manchester, Manchester, United Kingdom

2
Nguyen QH, Nguyen H, Oh EC, Nguyen T. Current approaches and outstanding challenges of functional annotation of metabolites: a comprehensive review. Brief Bioinform 2024; 25:bbae498. [PMID: 39397425 PMCID: PMC11471905 DOI: 10.1093/bib/bbae498] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/22/2024] [Revised: 09/03/2024] [Accepted: 10/02/2024] [Indexed: 10/15/2024]
Abstract
Metabolite profiling is a powerful approach for the clinical diagnosis of complex diseases, ranging from cardiometabolic diseases, cancer, and cognitive disorders to respiratory pathologies and conditions that involve dysregulated metabolism. Because of the importance of systems-level interpretation, many methods have been developed to identify biologically significant pathways using metabolomics data. In this review, we first describe a complete metabolomics workflow (sample preparation, data acquisition, pre-processing, downstream analysis, etc.). We then comprehensively review 24 approaches capable of performing functional analysis, including those that combine metabolomics data with other types of data to investigate the disease-relevant changes at multiple omics layers. We discuss their availability, implementation, capability for pre-processing and quality control, supported omics types, embedded databases, pathway analysis methodologies, and integration techniques. We also provide a rating and evaluation of each software package, focusing on its key technique, software accessibility, documentation, and user-friendliness. Following our guidelines, life scientists can easily choose a suitable method depending on method rating, available data, input format, and method category. More importantly, we highlight outstanding challenges and potential solutions that need to be addressed by future research. To further assist users in executing the reviewed methods, we provide wrappers of the software packages at https://github.com/tinnlab/metabolite-pathway-review-docker.
Affiliation(s)
- Quang-Huy Nguyen
  - Department of Computer Science and Software Engineering, Auburn University, Auburn, AL 36849, United States
- Ha Nguyen
  - Department of Computer Science and Software Engineering, Auburn University, Auburn, AL 36849, United States
- Edwin C Oh
  - Department of Internal Medicine, UNLV School of Medicine, University of Nevada, Las Vegas, NV 89154, United States
- Tin Nguyen
  - Department of Computer Science and Software Engineering, Auburn University, Auburn, AL 36849, United States

3
Martini L, Baek SH, Lo I, Raby BA, Silverman E, Weiss S, Glass K, Halu A. Detecting and dissecting signaling crosstalk via the multilayer network integration of signaling and regulatory interactions. Nucleic Acids Res 2024; 52:e5. [PMID: 37953325 PMCID: PMC10783515 DOI: 10.1093/nar/gkad1035] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/28/2022] [Revised: 06/27/2023] [Accepted: 10/23/2023] [Indexed: 11/14/2023]
Abstract
The versatility of cellular response arises from the communication, or crosstalk, of signaling pathways in a complex network of signaling and transcriptional regulatory interactions. Understanding the various mechanisms underlying crosstalk on a global scale requires untargeted computational approaches. We present a network-based statistical approach, MuXTalk, that uses high-dimensional edges called multilinks to model the unique ways in which signaling and regulatory interactions can interface. We demonstrate that the signaling-regulatory interface is located primarily in the intermediary region between signaling pathways where crosstalk occurs, and that multilinks can differentiate between distinct signaling-transcriptional mechanisms. Using statistically over-represented multilinks as proxies of crosstalk, we infer crosstalk among 60 signaling pathways, expanding currently available crosstalk databases by more than five-fold. MuXTalk surpasses existing methods in terms of model performance metrics, identifies additions to manual curation efforts, and pinpoints potential mediators of crosstalk. Moreover, it accommodates the inherent context-dependence of crosstalk, allowing future applications to cell type- and disease-specific crosstalk.
Affiliation(s)
- Leonardo Martini
  - Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, 02115, USA
  - Department of Computer, Control, and Management Engineering, Sapienza University of Rome, Rome, 00185, Italy
- Seung Han Baek
  - Division of Pulmonary Medicine, Boston Children’s Hospital, Harvard Medical School, Boston, MA, 02115, USA
- Ian Lo
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, 02115, USA
- Benjamin A Raby
  - Division of Pulmonary Medicine, Boston Children’s Hospital, Harvard Medical School, Boston, MA, 02115, USA
- Edwin K Silverman
  - Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, 02115, USA
- Scott T Weiss
  - Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, 02115, USA
- Kimberly Glass
  - Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, 02115, USA
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, 02115, USA
- Arda Halu
  - Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, 02115, USA

4
Aguilar D, Bosacoma A, Blanco I, Tura-Ceide O, Serrano-Mollar A, Barberà JA, Peinado VI. Differences and Similarities between the Lung Transcriptomic Profiles of COVID-19, COPD, and IPF Patients: A Meta-Analysis Study of Pathophysiological Signaling Pathways. Life (Basel) 2022; 12:887. [PMID: 35743918 PMCID: PMC9227224 DOI: 10.3390/life12060887] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/29/2022] [Revised: 06/02/2022] [Accepted: 06/11/2022] [Indexed: 11/20/2022]
Abstract
Coronavirus disease 2019 (COVID-19) is a pandemic respiratory disease associated with high morbidity and mortality. Although many patients recover, long-term sequelae after infection have become increasingly recognized and concerning. Among other sequelae, the available data indicate that many patients who recover from COVID-19 could develop fibrotic abnormalities over time. To understand the basic pathophysiology underlying the development of long-term pulmonary fibrosis in COVID-19, as well as the higher mortality rates in patients with pre-existing lung diseases, we compared the transcriptomic fingerprints among patients with COVID-19, idiopathic pulmonary fibrosis (IPF), and chronic obstructive pulmonary disease (COPD) using interactomic analysis. Patients who died of COVID-19 shared some of the molecular biological processes triggered in patients with IPF, such as those related to immune response, airway remodeling, and wound healing, which could explain the radiological images seen in some patients after discharge. However, other aspects of this transcriptomic profile did not resemble the profile associated with irreversible fibrotic processes in IPF. Our mathematical approach instead showed that the molecular processes that were altered in COVID-19 patients more closely resembled those observed in COPD. These data indicate that patients with COPD, who have overcome COVID-19, might experience a faster decline in lung function that will undoubtedly affect global health.
Affiliation(s)
- Daniel Aguilar
  - Biomedical Research Networking Center in Hepatic and Digestive Diseases (CIBEREHD), 28005 Madrid, Spain
  - Institut d’Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), 08036 Barcelona, Spain
- Adelaida Bosacoma
  - Institut d’Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), 08036 Barcelona, Spain
  - Biomedical Research Networking Center in Respiratory Diseases (CIBERES), 28029 Madrid, Spain
- Isabel Blanco
  - Institut d’Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), 08036 Barcelona, Spain
  - Biomedical Research Networking Center in Respiratory Diseases (CIBERES), 28029 Madrid, Spain
  - Department of Pulmonary Medicine, Hospital Clínic, University of Barcelona, 08007 Barcelona, Spain
- Olga Tura-Ceide
  - Institut d’Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), 08036 Barcelona, Spain
  - Biomedical Research Networking Center in Respiratory Diseases (CIBERES), 28029 Madrid, Spain
  - Department of Pulmonary Medicine, Hospital Clínic, University of Barcelona, 08007 Barcelona, Spain
  - Girona Biomedical Research Institute (IDIBGI), 17190 Girona, Spain
- Anna Serrano-Mollar
  - Institut d’Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), 08036 Barcelona, Spain
  - Biomedical Research Networking Center in Respiratory Diseases (CIBERES), 28029 Madrid, Spain
  - Department of Experimental Pathology, Institut d’Investigacions Biomèdiques de Barcelona (IIBB), CSIC-IDIBAPS, 08036 Barcelona, Spain
- Joan Albert Barberà
  - Institut d’Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), 08036 Barcelona, Spain
  - Biomedical Research Networking Center in Respiratory Diseases (CIBERES), 28029 Madrid, Spain
  - Department of Pulmonary Medicine, Hospital Clínic, University of Barcelona, 08007 Barcelona, Spain
- Victor Ivo Peinado
  - Institut d’Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), 08036 Barcelona, Spain
  - Biomedical Research Networking Center in Respiratory Diseases (CIBERES), 28029 Madrid, Spain
  - Department of Pulmonary Medicine, Hospital Clínic, University of Barcelona, 08007 Barcelona, Spain
  - Department of Experimental Pathology, Institut d’Investigacions Biomèdiques de Barcelona (IIBB), CSIC-IDIBAPS, 08036 Barcelona, Spain

5
Castresana-Aguirre M, Guala D, Sonnhammer ELL. Benefits and Challenges of Pre-clustered Network-Based Pathway Analysis. Front Genet 2022; 13:855766. [PMID: 35620466 PMCID: PMC9127507 DOI: 10.3389/fgene.2022.855766] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/15/2022] [Accepted: 04/25/2022] [Indexed: 12/13/2022]
Abstract
Functional analysis of gene sets derived from experiments is typically done by pathway annotation. Although many algorithms exist for analyzing the association between a gene set and a pathway, an issue that is generally ignored is that gene sets often represent multiple pathways. In such cases an association to a pathway is weakened by the presence of genes associated with other pathways. A way to counteract this is to cluster the gene set into more homogeneous parts before performing pathway analysis on each module. We explored whether network-based pre-clustering of a query gene set can improve pathway analysis. The methods MCL, Infomap, and MGclus were used to cluster the gene set projected onto the FunCoup network. We characterized how well these methods are able to detect individual pathways in multi-pathway gene sets, and applied each of the clustering methods in combination with four pathway analysis methods: Gene Enrichment Analysis (GEA), BinoX, NEAT, and ANUBIX. Using benchmarks constructed from the KEGG pathway database we found that clustering can be beneficial by increasing the sensitivity of pathway analysis methods and by providing deeper insights into the biological mechanisms related to the phenotype under study. However, keeping a high specificity is a challenge. For ANUBIX, clustering caused a minor loss of specificity, while for BinoX and NEAT it caused an unacceptable loss of specificity. GEA had very low sensitivity both before and after clustering. The choice of clustering method had only a minor effect on the results. We show examples of this approach and conclude that clustering can improve overall pathway annotation performance, but should only be used if the enrichment method used has a low false positive rate.
Affiliation(s)
- Erik L. L. Sonnhammer
  - Department of Biochemistry and Biophysics, Science for Life Laboratory, Stockholm University, Stockholm, Sweden

6
Song C, Zhang J, Liu Y, Hu Y, Feng C, Shi P, Zhang Y, Wang L, Xie Y, Zhang M, Zhao X, Cao Y, Li C, Sun H. Characterization and Validation of ceRNA-Mediated Pathway–Pathway Crosstalk Networks Across Eight Major Cardiovascular Diseases. Front Cell Dev Biol 2022; 10:762129. [PMID: 35433687 PMCID: PMC9010821 DOI: 10.3389/fcell.2022.762129] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/21/2021] [Accepted: 03/01/2022] [Indexed: 01/08/2023]
Abstract
Pathway analysis is considered an important strategy to reveal the underlying mechanisms of diseases. Pathways that are involved in crosstalk can regulate each other and co-regulate downstream biological processes. Furthermore, some genes in the pathways can function with other genes via the competing endogenous RNA (ceRNA) mechanism, which has also been demonstrated to play key roles in cellular biology. However, a comprehensive analysis of ceRNA-mediated pathway crosstalk is lacking. Here, we constructed the landscape of the ceRNA-mediated pathway–pathway crosstalk of eight major cardiovascular diseases (CVDs) based on sequencing data from ∼2,800 samples. Some common features shared by numerous CVDs were uncovered. A fraction of the pathway–pathway crosstalk was conserved in multiple CVDs and a core pathway–pathway crosstalk network was identified, suggesting the similarity of pathway–pathway crosstalk among CVDs. Experimental evidence also demonstrated that the pathway crosstalk was functional in CVDs. We split all hub pathways of each pathway–pathway crosstalk network into three categories, namely, common hubs, differential hubs, and specific hubs, which could highlight the common or specific biological mechanisms. Importantly, after a comparison analysis of the hub pathways of networks, ∼480 hub pathway-induced common modules were identified to exert functions in CVDs broadly. Moreover, we performed a random walk algorithm on the hub pathway-induced sub-network and identified 23 potentially novel CVD-related pathways. In summary, our study revealed the potential molecular regulatory mechanisms of ceRNA crosstalk at the pathway–pathway crosstalk level and provided a novel approach to investigating pathway–pathway crosstalk in cardiology. All CVD pathway–pathway crosstalks are provided at http://www.licpathway.net/cepathway/index.html.
Affiliation(s)
- Chao Song
  - Department of Pharmacology, Harbin Medical University-Daqing, Daqing, China
- Jian Zhang
  - Department of Medical Informatics, Harbin Medical University-Daqing, Daqing, China
- Yongsheng Liu
  - Department of Pharmacology, Harbin Medical University-Daqing, Daqing, China
- Yinling Hu
  - Department of Rehabilitation, Beijing Rehabilitation Hospital of Capital Medical University, Beijing, China
- Chenchen Feng
  - Department of Medical Informatics, Harbin Medical University-Daqing, Daqing, China
- Pilong Shi
  - Department of Pharmacology, Harbin Medical University-Daqing, Daqing, China
- Yuexin Zhang
  - Department of Medical Informatics, Harbin Medical University-Daqing, Daqing, China
- Lixin Wang
  - Department of Pharmacology, Harbin Medical University-Daqing, Daqing, China
- Yawen Xie
  - Department of Pharmacology, Harbin Medical University-Daqing, Daqing, China
- Meitian Zhang
  - Department of Pharmacology, Harbin Medical University-Daqing, Daqing, China
- Xilong Zhao
  - Department of Medical Informatics, Harbin Medical University-Daqing, Daqing, China
- Yonggang Cao
  - Department of Pharmacology, Harbin Medical University-Daqing, Daqing, China
- Chunquan Li
  - Department of Medical Informatics, Harbin Medical University-Daqing, Daqing, China
  - *Correspondence: Hongli Sun; Chunquan Li
- Hongli Sun
  - Department of Pharmacology, Harbin Medical University-Daqing, Daqing, China
  - *Correspondence: Hongli Sun; Chunquan Li

7
Guala D, Sonnhammer ELL. Network Crosstalk as a Basis for Drug Repurposing. Front Genet 2022; 13:792090. [PMID: 35350247 PMCID: PMC8958038 DOI: 10.3389/fgene.2022.792090] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/09/2021] [Accepted: 01/27/2022] [Indexed: 11/23/2022]
Abstract
The need for systematic drug repurposing has seen a steady increase over the past decade and may be particularly valuable to quickly remedy unexpected pandemics. The abundance of functional interaction data has allowed mapping of substantial parts of the human interactome modeled using functional association networks, favoring network-based drug repurposing. Network crosstalk-based approaches have never been tested for drug repurposing despite their success in the related and more mature field of pathway enrichment analysis. We have, therefore, evaluated the top performing crosstalk-based approaches for drug repurposing. Additionally, the volume of new interaction data as well as more sophisticated network integration approaches compelled us to construct a new benchmark for performance assessment of network-based drug repurposing tools, which we used to compare network crosstalk-based methods with a state-of-the-art technique. We find that network crosstalk-based drug repurposing is able to rival the state-of-the-art method and in some cases outperform it.
Affiliation(s)
- Dimitri Guala
  - Science for Life Laboratory, Department of Biochemistry and Biophysics, Stockholm University, Solna, Sweden
  - Merck AB, Solna, Sweden
- Erik L. L. Sonnhammer
  - Science for Life Laboratory, Department of Biochemistry and Biophysics, Stockholm University, Solna, Sweden
  - *Correspondence: Erik L. L. Sonnhammer

8
Yan S, Chi X, Chang X, Tian M. Analysing the meta-interaction between pathways by gene set topological impact analysis. BMC Genomics 2020; 21:748. [PMID: 33109101 PMCID: PMC7592530 DOI: 10.1186/s12864-020-07148-y] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 04/12/2020] [Accepted: 10/13/2020] [Indexed: 11/25/2022]
Abstract
BACKGROUND Pathway analysis is widely applied in transcriptome analysis. Given certain transcriptomic changes, current pathway analysis tools tend to search for the most impacted pathways, which provides insight into underlying biological mechanisms. Further refining of the enriched pathways and extracting functional modules by "crosstalk" analysis have been proposed. However, the upstream/downstream relationships between the modules, which may provide extra biological insights such as the coordination of different functional modules and the signal transduction flow, have been ignored. RESULTS To quantitatively analyse the upstream/downstream relationships between functional modules, we developed a novel GEne Set Topological Impact Analysis (GESTIA), which could be used to assemble the enriched pathways and functional modules into a super-module with a topological structure. We showed the advantages of this analysis in the exploration of extra biological insight in addition to the individual enriched pathways and functional modules. CONCLUSIONS GESTIA can be applied to a broad range of pathway/module analysis results. We hope that GESTIA may help researchers to get one additional step closer to understanding the molecular mechanism from the pathway/module analysis results.
Affiliation(s)
- Shen Yan
  - College of Agronomy, Sichuan Agricultural University, Chengdu, 611130, Sichuan, China
- Xu Chi
  - Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, 101300, China
  - China National Center for Bioinformation, Chaoyang, Beijing, 101300, China
- Xiao Chang
  - Department of Dermatology and Venereal Disease, Xuanwu Hospital, Capital Medical University, Beijing, 100053, China
- Mengliang Tian
  - College of Agronomy, Sichuan Agricultural University, Chengdu, 611130, Sichuan, China

9
Castresana-Aguirre M, Sonnhammer ELL. Pathway-specific model estimation for improved pathway annotation by network crosstalk. Sci Rep 2020; 10:13585. [PMID: 32788619 PMCID: PMC7423893 DOI: 10.1038/s41598-020-70239-z] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 10/08/2019] [Accepted: 07/06/2020] [Indexed: 12/23/2022]
Abstract
Pathway enrichment analysis is the most common approach for understanding which biological processes are affected by altered gene activities under specific conditions. However, it has been challenging to find a method that efficiently avoids false positives while keeping a high sensitivity. Here we present a new network-based method, ANUBIX, based on sampling random gene sets against intact pathways. Benchmarking shows that ANUBIX is considerably more accurate than previous network crosstalk-based methods, which have the drawback of modelling pathways as random gene sets. We demonstrate that ANUBIX does not have a bias for finding certain pathways, which previous methods do, and show that ANUBIX finds biologically relevant pathways that are missed by other methods.
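The sampling idea in this abstract can be illustrated with a minimal Python sketch. It is not the ANUBIX implementation: the toy network, the gene names, and the helpers `crosstalk` and `empirical_p_value` are hypothetical, and a plain empirical p-value stands in for the statistical model of the published method, which the abstract does not describe. The sketch only shows the pathway being kept intact while random gene sets of the query's size are drawn from the network.

```python
import random


def crosstalk(gene_set, pathway, edges):
    """Count network links that join a gene in gene_set to a gene in pathway."""
    gene_set, pathway = set(gene_set), set(pathway)
    return sum(
        1
        for a, b in edges
        if (a in gene_set and b in pathway) or (b in gene_set and a in pathway)
    )


def empirical_p_value(query, pathway, edges, genes, n_samples=1000, seed=0):
    """Compare observed crosstalk with random gene sets; the pathway stays intact."""
    rng = random.Random(seed)
    observed = crosstalk(query, pathway, edges)
    hits = 0
    for _ in range(n_samples):
        # Assumption: random sets are drawn uniformly from all network genes.
        random_set = rng.sample(genes, len(query))
        if crosstalk(random_set, pathway, edges) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (hits + 1) / (n_samples + 1)


# Hypothetical toy network: 20 genes, one 3-gene pathway, one 3-gene query set.
genes = [f"g{i}" for i in range(20)]
pathway = {"g0", "g1", "g2"}
query = {"g3", "g4", "g5"}
edges = [
    ("g3", "g0"), ("g3", "g1"), ("g4", "g1"), ("g4", "g2"),
    ("g5", "g0"), ("g5", "g2"), ("g10", "g0"),
    ("g6", "g7"), ("g8", "g9"), ("g11", "g12"),
]

observed, p_value = empirical_p_value(query, pathway, edges, genes)
print(f"observed links: {observed}, empirical p-value: {p_value:.4f}")
```

Because only the query side is randomized, the pathway's own connectivity is preserved in every comparison, which is the contrast the abstract draws with methods that model pathways as random gene sets.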
Affiliation(s)
- Miguel Castresana-Aguirre
  - Department of Biochemistry and Biophysics, Science for Life Laboratory, Stockholm University, Box 1031, 17121, Solna, Sweden
- Erik L L Sonnhammer
  - Department of Biochemistry and Biophysics, Science for Life Laboratory, Stockholm University, Box 1031, 17121, Solna, Sweden

10
Aguilar D, Lemonnier N, Koppelman GH, Melén E, Oliva B, Pinart M, Guerra S, Bousquet J, Anto JM. Understanding allergic multimorbidity within the non-eosinophilic interactome. PLoS One 2019; 14:e0224448. [PMID: 31693680 PMCID: PMC6834334 DOI: 10.1371/journal.pone.0224448] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 06/28/2019] [Accepted: 10/14/2019] [Indexed: 02/07/2023]
Abstract
BACKGROUND The mechanisms explaining multimorbidity between asthma, dermatitis and rhinitis (allergic multimorbidity) are not well known. We investigated these mechanisms and their specificity in distinct cell types by means of an interactome-based analysis of expression data. METHODS Genes associated with the diseases were identified using data mining approaches, and their multimorbidity mechanisms in distinct cell types were characterized by means of an in silico analysis of the topology of the human interactome. RESULTS We characterized specific pathomechanisms for multimorbidities between asthma, dermatitis and rhinitis for distinct emergent non-eosinophilic cell types. We observed differential roles for cytokine signaling, TLR-mediated signaling and metabolic pathways for multimorbidities across distinct cell types. Furthermore, we also identified individual genes potentially associated with multimorbidity mechanisms. CONCLUSIONS Our results support the existence of differentiated multimorbidity mechanisms between asthma, dermatitis and rhinitis at cell type level, as well as mechanisms common to distinct cell types. These results will help in understanding the biology underlying allergic multimorbidity, assisting in the design of new clinical studies.
MESH Headings
- Asthma/epidemiology
- Asthma/genetics
- Asthma/immunology
- Blood Cells/immunology
- Blood Cells/metabolism
- Cytokines/immunology
- Cytokines/metabolism
- Datasets as Topic
- Dermatitis, Allergic Contact/epidemiology
- Dermatitis, Allergic Contact/genetics
- Dermatitis, Allergic Contact/immunology
- Dermatitis, Atopic/epidemiology
- Dermatitis, Atopic/genetics
- Dermatitis, Atopic/immunology
- Gene Expression Profiling
- Humans
- Immunity, Cellular/genetics
- Multimorbidity
- Protein Interaction Maps/genetics
- Protein Interaction Maps/immunology
- Rhinitis, Allergic/epidemiology
- Rhinitis, Allergic/genetics
- Rhinitis, Allergic/immunology
Affiliation(s)
- Daniel Aguilar
  - Biomedical Research Networking Center in Hepatic and Digestive Diseases (CIBEREHD), Instituto de Salud Carlos III, Barcelona, Spain
  - ISGlobal, Barcelona Institute for Global Health, Barcelona, Spain
  - 6AM Data Mining, Barcelona, Spain
- Nathanael Lemonnier
  - Institute for Advanced Biosciences, Inserm U 1209 CNRS UMR 5309 Université Grenoble Alpes, Site Santé, Allée des Alpes, La Tronche, France
- Gerard H. Koppelman
  - University of Groningen, University Medical Center Groningen, Beatrix Children’s Hospital, Department of Pediatric Pulmonology and Pediatric Allergology, Groningen, Netherlands
  - University of Groningen, University Medical Center Groningen, GRIAC Research Institute
- Erik Melén
  - Institute of Environmental Medicine, Karolinska Institutet, Stockholm, Sweden
- Baldo Oliva
  - Structural Bioinformatics Group, Research Programme on Biomedical Informatics, Department of Experimental and Health Sciences, Universitat Pompeu Fabra, Barcelona, Spain
- Mariona Pinart
  - ISGlobal, Barcelona Institute for Global Health, Barcelona, Spain
- Stefano Guerra
  - ISGlobal, Barcelona Institute for Global Health, Barcelona, Spain
  - Asthma and Airway Disease Research Center, University of Arizona, Tucson, Arizona, United States of America
- Jean Bousquet
  - Hopital Arnaud de Villeneuve University Hospital, Montpellier, France
  - Charité, Universitätsmedizin Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health, Comprehensive Allergy Center, Department of Dermatology and Allergy, Berlin, Germany
- Josep M. Anto
  - ISGlobal, Barcelona Institute for Global Health, Barcelona, Spain

11
Mora A. Gene set analysis methods for the functional interpretation of non-mRNA data—Genomic range and ncRNA data. Brief Bioinform 2019; 21:1495-1508. [DOI: 10.1093/bib/bbz090] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 03/20/2019] [Revised: 05/30/2019] [Accepted: 06/28/2019] [Indexed: 12/31/2022]
Abstract
Gene set analysis (GSA) is one of the methods of choice for analyzing the results of current omics studies; however, it has been mainly developed to analyze mRNA (microarray, RNA-Seq) data. The following review includes an update regarding general methods and resources for GSA and then emphasizes GSA methods and tools for non-mRNA omics datasets, specifically genomic range data (ChIP-Seq, SNP and methylation) and ncRNA data (miRNAs, lncRNAs and others). In the end, the state of the GSA field for non-mRNA datasets is discussed, and some current challenges and trends are highlighted, especially the use of network approaches to face complexity issues.
Affiliation(s)
- Antonio Mora
  - Joint School of Life Sciences, Guangzhou Medical University and Guangzhou Institutes of Biomedicine and Health - Chinese Academy of Sciences

12
Han H, Lee S, Lee I. NGSEA: Network-Based Gene Set Enrichment Analysis for Interpreting Gene Expression Phenotypes with Functional Gene Sets. Mol Cells 2019; 42:579-588. [PMID: 31307154 PMCID: PMC6715341 DOI: 10.14348/molcells.2019.0065] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 04/08/2019] [Revised: 06/28/2019] [Accepted: 06/30/2019] [Indexed: 11/27/2022]
Abstract
Gene set enrichment analysis (GSEA) is a popular tool to identify underlying biological processes in clinical samples using their gene expression phenotypes. GSEA measures the enrichment of annotated gene sets that represent biological processes for differentially expressed genes (DEGs) in clinical samples. GSEA may be suboptimal for functional gene sets, however, because DEGs from the expression dataset may not be functional genes per se but dysregulated genes perturbed by bona fide functional genes. To overcome this shortcoming, we developed network-based GSEA (NGSEA), which measures the enrichment score of functional gene sets using the expression difference of not only individual genes but also their neighbors in the functional network. We found that NGSEA outperformed GSEA in identifying pathway gene sets for matched gene expression phenotypes. We also observed that NGSEA substantially improved the ability to retrieve known anti-cancer drugs from patient-derived gene expression data using drug-target gene sets compared with another method, Connectivity Map. We also repurposed FDA-approved drugs using NGSEA and experimentally validated budesonide as a chemical with anti-cancer effects for colorectal cancer. We, therefore, expect that NGSEA will facilitate both pathway interpretation of gene expression phenotypes and anti-cancer drug repositioning. NGSEA is freely available at www.inetbio.org/ngsea.
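The idea of scoring a gene by its own expression difference together with that of its network neighbors can be illustrated with a small sketch. The abstract does not give the scoring formula, so the combination used here (absolute log ratio of the gene plus the mean absolute log ratio of its neighbors) is an assumption, and the gene names, values and the function `network_smoothed_scores` are hypothetical.

```python
def network_smoothed_scores(log_ratio, neighbors):
    """Add each gene's |log ratio| to the mean |log ratio| of its network neighbors.

    Assumed scoring rule for illustration only; not taken from the NGSEA abstract.
    """
    scores = {}
    for gene, value in log_ratio.items():
        linked = [abs(log_ratio[n]) for n in neighbors.get(gene, ()) if n in log_ratio]
        neighbor_mean = sum(linked) / len(linked) if linked else 0.0
        scores[gene] = abs(value) + neighbor_mean
    return scores


# Hypothetical toy data: gene B barely changes but links two strongly changed genes.
log_ratio = {"A": 2.0, "B": 0.1, "C": -1.8, "D": 0.2}
neighbors = {"A": ["B"], "B": ["A", "C"], "C": ["B"], "D": []}

scores = network_smoothed_scores(log_ratio, neighbors)
ranked = sorted(scores, key=scores.get, reverse=True)
```

In this toy case gene B moves close to the top of the ranking although its own change is small, which is the kind of effect a neighbor-aware score is meant to capture before the ranked list is passed to an enrichment test.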
Affiliation(s)
- Heonjong Han
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul 03722, Korea
| | - Sangyoung Lee
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul 03722, Korea
| | - Insuk Lee
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul 03722, Korea
- Department of Biomedical Systems Informatics, Yonsei University College of Medicine, Seoul 03722, Korea
| |
|
13
|
Zhou XH, Chu XY, Xue G, Xiong JH, Zhang HY. Identifying cancer prognostic modules by module network analysis. BMC Bioinformatics 2019; 20:85. [PMID: 30777030 PMCID: PMC6380061 DOI: 10.1186/s12859-019-2674-z] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 11/09/2017] [Accepted: 02/08/2019] [Indexed: 02/08/2023] Open
Abstract
Background: The identification of prognostic genes that can distinguish the prognostic risks of cancer patients remains a significant challenge. Previous works have proven that functional gene sets were more reliable for this task than the gene signature. However, few works have considered the cross-talk among functional gene sets, which may result in neglecting important prognostic gene sets for cancer. Results: Here, we proposed a new method that considers both the interactions among modules and the prognostic correlation of the modules to identify prognostic modules in cancers. First, dense sub-networks in the gene co-expression network of cancer patients were detected. Second, cross-talk between every two modules was identified by a permutation test, thus generating the module network. Third, the prognostic correlation of each module was evaluated by the resampling method. Then, the GeneRank algorithm, which takes the module network and the prognostic correlations of all the modules as input, was applied to prioritize the prognostic modules. Finally, the selected modules were validated by survival analysis in various data sets. Our method was applied in three kinds of cancers, and the results show that our method succeeded in identifying prognostic modules in all three cancers. In addition, our method outperformed state-of-the-art methods. Furthermore, the selected modules were significantly enriched with known cancer-related genes and drug targets of cancer, which may indicate that the genes involved in the modules may be drug targets for therapy. Conclusions: We proposed a useful method to identify key modules in cancer prognosis and our prognostic genes may be good candidates for drug targets. Electronic supplementary material: The online version of this article (10.1186/s12859-019-2674-z) contains supplementary material, which is available to authorized users.
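The prioritization step can be sketched with the commonly cited GeneRank update, r = (1 - d) * prior + d * W * D^-1 * r, applied to modules instead of genes. This is a minimal illustration and not the authors' implementation: the damping value, the module names, the cross-talk edges and the prior scores are all hypothetical.

```python
def generank(adjacency, prior, damping=0.5, iterations=100):
    """Iterate r = (1 - d) * prior + d * W * D^-1 * r on a small undirected graph.

    adjacency maps each node to its neighbors (symmetric); prior holds the
    per-node starting evidence, here standing in for prognostic correlation.
    """
    nodes = list(prior)
    degree = {n: len(adjacency.get(n, ())) for n in nodes}
    rank = dict(prior)
    for _ in range(iterations):
        new_rank = {}
        for n in nodes:
            # Each neighbor passes on its current rank divided by its degree.
            spread = sum(rank[m] / degree[m] for m in adjacency.get(n, ()) if degree[m] > 0)
            new_rank[n] = (1 - damping) * prior[n] + damping * spread
        rank = new_rank
    return rank


# Hypothetical module network: M2 has weak evidence of its own but cross-talks
# with two strongly prognostic modules; M4 is isolated.
adjacency = {"M1": ["M2"], "M2": ["M1", "M3"], "M3": ["M2"], "M4": []}
prior = {"M1": 0.9, "M2": 0.1, "M3": 0.8, "M4": 0.5}

rank = generank(adjacency, prior)
top_module = max(rank, key=rank.get)
```

With these toy inputs the weakly scored but well-connected module M2 ends up ranked first, which shows how the module network can promote modules that a purely per-module score would overlook.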
Affiliation(s)
- Xiong-Hui Zhou
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, 430070, People's Republic of China
| | - Xin-Yi Chu
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, 430070, People's Republic of China
| | - Gang Xue
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, 430070, People's Republic of China
| | - Jiang-Hui Xiong
- State Key Laboratory of Space Medicine Fundamentals and Application, China Astronaut Research and Training Center, Beijing, People's Republic of China
- Lab of Epigenetics and Health Tracking Technology, Space Institute of Southern China, Shenzhen, People's Republic of China
| | - Hong-Yu Zhang
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, 430070, People's Republic of China
| |
|
14
|
Jeuken GS, Käll L. A simple null model for inferences from network enrichment analysis. PLoS One 2018; 13:e0206864. [PMID: 30412619 PMCID: PMC6226187 DOI: 10.1371/journal.pone.0206864] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 08/02/2018] [Accepted: 10/19/2018] [Indexed: 12/31/2022] Open
Abstract
A prevailing technique to infer function from lists of identifications, from molecular biological high-throughput experiments, is over-representation analysis, where the identifications are compared to predefined sets of related genes often referred to as pathways. As at least some pathways are known to be incomplete in their annotation, algorithmic efforts have been made to complement them with information from functional association networks. While the terminology varies in the literature, we will here refer to such methods as Network Enrichment Analysis (NEA). Traditionally, the significance of inferences from NEA has been assigned using a null model constructed from randomizations of the network. Here we instead argue for a null model that more directly relates to the set of genes being studied, and have designed a dynamic programming algorithm that calculates the distribution of NEA scores, which makes it possible to assign unbiased mid-p values to inferences. We also implemented a random sampling method, carrying out the same task. We demonstrate that our method obtains a superior statistical calibration as compared to the popular NEA inference engine, BinoX, while also providing statistics that are easier to interpret.
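The mid-p value mentioned above is easiest to see on plain over-representation analysis, where the overlap between a gene list and a pathway follows a hypergeometric distribution under the null. The sketch below does not reproduce the authors' dynamic programming algorithm for NEA scores; it only illustrates the mid-p definition, P(X > x) + 0.5 * P(X = x), on hypothetical counts.

```python
from math import comb


def hypergeom_pmf(k, population, successes, draws):
    """Probability of exactly k pathway genes among `draws` genes picked at random."""
    return comb(successes, k) * comb(population - successes, draws - k) / comb(population, draws)


def mid_p_value(overlap, population, pathway_size, list_size):
    """Upper-tail mid-p: the full weight of larger overlaps plus half the observed one."""
    upper = min(pathway_size, list_size)
    tail = sum(
        hypergeom_pmf(k, population, pathway_size, list_size)
        for k in range(overlap + 1, upper + 1)
    )
    return tail + 0.5 * hypergeom_pmf(overlap, population, pathway_size, list_size)


# Hypothetical counts: 20 genes in total, a pathway of 5, a list of 5, overlap of 3.
mid_p = mid_p_value(overlap=3, population=20, pathway_size=5, list_size=5)
ordinary_p = mid_p + 0.5 * hypergeom_pmf(3, 20, 5, 5)
```

Counting only half of the probability of the observed outcome makes the mid-p value less conservative than the ordinary tail probability for discrete scores.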
Affiliation(s)
- Gustavo S. Jeuken
- Science for Life Laboratory, School of Engineering Sciences in Chemistry, Biotechnology and Health, KTH – Royal Institute of Technology, Box 1031, 17121 Solna, Sweden
| | - Lukas Käll
- Science for Life Laboratory, School of Engineering Sciences in Chemistry, Biotechnology and Health, KTH – Royal Institute of Technology, Box 1031, 17121 Solna, Sweden
| |
|
15
|
Pita-Juárez Y, Altschuler G, Kariotis S, Wei W, Koler K, Green C, Tanzi RE, Hide W. The Pathway Coexpression Network: Revealing pathway relationships. PLoS Comput Biol 2018; 14:e1006042. [PMID: 29554099 PMCID: PMC5875878 DOI: 10.1371/journal.pcbi.1006042] [Citation(s) in RCA: 36] [Impact Index Per Article: 5.1] [Received: 07/08/2017] [Revised: 03/29/2018] [Accepted: 02/19/2018] [Indexed: 02/02/2023] Open
Abstract
A goal of genomics is to understand the relationships between biological processes. Pathways contribute to functional interplay within biological processes through complex but poorly understood interactions. However, limited functional references for global pathway relationships exist. Pathways from databases such as KEGG and Reactome provide discrete annotations of biological processes. Their relationships are currently either inferred from gene set enrichment within specific experiments, or by simple overlap, linking pathway annotations that have genes in common. Here, we provide a unifying interpretation of functional interaction between pathways by systematically quantifying coexpression between 1,330 canonical pathways from the Molecular Signatures Database (MSigDB) to establish the Pathway Coexpression Network (PCxN). We estimated the correlation between canonical pathways valid in a broad context using a curated collection of 3,207 microarrays from 72 normal human tissues. PCxN accounts for shared genes between annotations to estimate significant correlations between pathways with related functions rather than with similar annotations. We demonstrate that PCxN provides novel insight into mechanisms of complex diseases using an Alzheimer’s Disease (AD) case study. PCxN retrieved pathways significantly correlated with an expert curated AD gene list. These pathways have known associations with AD and were significantly enriched for genes independently associated with AD. As a further step, we show how PCxN complements the results of gene set enrichment methods by revealing relationships between enriched pathways, and by identifying additional highly correlated pathways. PCxN revealed that correlated pathways from an AD expression profiling study include functional clusters involved in cell adhesion and oxidative stress. PCxN provides expanded connections to pathways from the extracellular matrix. 
PCxN provides a powerful new framework for interrogation of global pathway relationships. Comprehensive exploration of PCxN can be performed at http://pcxn.org/.
Author summary
Genes do not function alone, but interact within pathways to carry out specific biological processes. Pathways, in turn, interact at a higher level to affect major cellular activities such as motility, growth and development. We present a pathway coexpression network (PCxN) that systematically maps and quantifies these high-level interactions and establishes a unifying reference for pathway relationships. The method uses 3,207 human microarrays from 72 normal human tissues and 1,330 of the most well established pathway annotations to describe global relationships between pathways. PCxN accounts for shared genes to estimate correlations between pathways with related functions rather than with redundant pathway definitions. PCxN can be used to discover and explore pathways correlated with a pathway of interest. We applied PCxN to identify key processes related to Alzheimer’s disease (AD), interpreting a mixed genetic association and experimental derived set of disease genes in the context of gene co-expression. We expand the known relationships between pathways identified by gene set enrichment analysis in brain tissues affected with AD. PCxN provides a high-level overview of pathway relationships. PCxN is available as a webtool at http://pcxn.org/, and as a Bioconductor package at http://bioconductor.org/packages/pcxn/.
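Why shared genes matter for pathway correlation can be shown with a deliberately simplified sketch. It summarizes each pathway by the mean expression of its genes and correlates the summaries after dropping the genes the two pathways have in common. This is a stand-in for illustration and not the PCxN estimator; the gene names, expression values and helper functions are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long numeric lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


def pathway_summary(expression, genes):
    """Mean expression per sample over the given genes."""
    rows = [expression[g] for g in genes]
    return [sum(col) / len(col) for col in zip(*rows)]


def disjoint_correlation(expression, pathway_a, pathway_b):
    """Correlate pathway summaries after removing genes annotated to both pathways.

    Assumes each pathway keeps at least one gene that is not shared.
    """
    shared = pathway_a & pathway_b
    return pearson(
        pathway_summary(expression, sorted(pathway_a - shared)),
        pathway_summary(expression, sorted(pathway_b - shared)),
    )


# Hypothetical expression values over four samples; gene "s" belongs to both pathways.
expression = {"g1": [1, 2, 3, 4], "g2": [4, 3, 2, 1], "s": [10, 0, 10, 0]}
pathway_a, pathway_b = {"g1", "s"}, {"g2", "s"}

naive = pearson(
    pathway_summary(expression, sorted(pathway_a)),
    pathway_summary(expression, sorted(pathway_b)),
)
corrected = disjoint_correlation(expression, pathway_a, pathway_b)
```

In this toy case the naive correlation is strongly positive only because both summaries contain the same highly variable gene, while the unshared genes are in fact anti-correlated.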
Affiliation(s)
- Yered Pita-Juárez
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, United States of America
| | - Gabriel Altschuler
- Sheffield Institute for Translational Neuroscience, Department of Neuroscience, University of Sheffield, Sheffield, United Kingdom
| | - Sokratis Kariotis
- Sheffield Institute for Translational Neuroscience, Department of Neuroscience, University of Sheffield, Sheffield, United Kingdom
| | - Wenbin Wei
- Sheffield Institute for Translational Neuroscience, Department of Neuroscience, University of Sheffield, Sheffield, United Kingdom
| | - Katjuša Koler
- Sheffield Institute for Translational Neuroscience, Department of Neuroscience, University of Sheffield, Sheffield, United Kingdom
| | - Claire Green
- Sheffield Institute for Translational Neuroscience, Department of Neuroscience, University of Sheffield, Sheffield, United Kingdom
| | - Rudolph E. Tanzi
- Genetics and Aging Research Unit, MassGeneral Institute for Neurodegenerative Disease, Massachusetts General Hospital and Harvard Medical School, Charlestown, Massachusetts, United States of America
| | - Winston Hide
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, United States of America
- Sheffield Institute for Translational Neuroscience, Department of Neuroscience, University of Sheffield, Sheffield, United Kingdom
- Harvard Stem Cell Institute, Cambridge, Massachusetts, United States of America
- National Institute Health Research, Sheffield Biomedical Research Centre, Sheffield, United Kingdom
| |
|
16
|
Tiys ES, Ivanisenko TV, Demenkov PS, Ivanisenko VA. FunGeneNet: a web tool to estimate enrichment of functional interactions in experimental gene sets. BMC Genomics 2018; 19:76. [PMID: 29504895 PMCID: PMC5836822 DOI: 10.1186/s12864-018-4474-7] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Indexed: 01/01/2023] Open
Abstract
Background: Estimation of functional connectivity in gene sets derived from genome-wide or other biological experiments is one of the essential tasks of bioinformatics. A promising approach for solving this problem is to compare gene networks built using experimental gene sets with random networks. One of the resources that make such an analysis possible is CrossTalkZ, which uses the FunCoup database. However, existing methods, including CrossTalkZ, do not take into account individual types of interactions, such as protein/protein interactions, expression regulation, transport regulation, catalytic reactions, etc., but rather work with generalized types characterizing the existence of any connection between network members. Results: We developed the online tool FunGeneNet, which utilizes the ANDSystem and STRING to reconstruct gene networks using experimental gene sets and to estimate their difference from random networks. To compare the reconstructed networks with random ones, the node permutation algorithm implemented in CrossTalkZ was taken as a basis. To study the applicability of FunGeneNet, a functional connectivity analysis was conducted on networks constructed for gene sets involved in Gene Ontology biological processes. We showed that the sensitivity of the method exceeds 0.8 at a specificity of 0.95. We found that the significance level of the difference between gene networks of biological processes and random networks is determined by the type of connections considered between objects. At the same time, the highest reliability is achieved for the generalized form of connections that takes into account all the individual types of connections. Using the thyroid cancer networks and the apoptosis network as examples, it is demonstrated that key participants in these processes are involved in the interactions of those types by which these networks differ from random ones.
Conclusions: FunGeneNet is a web tool aimed at proving the functionality of networks in a wide range of sizes of experimental gene sets, both for different global networks and for different types of interactions. Using examples of thyroid cancer and apoptosis networks, we have shown that the links over-represented in the analyzed network in comparison with the random ones make possible a biological interpretation of the original gene/protein sets. The FunGeneNet web tool for assessment of the functional enrichment of networks is available at http://www-bionet.sscc.ru/fungenenet/. Electronic supplementary material: The online version of this article (10.1186/s12864-018-4474-7) contains supplementary material, which is available to authorized users.
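Comparing a gene-set network with random ones can be sketched as a permutation test on the number of links inside the set. The version below simply draws random gene sets of the same size and is a simplification: it does not preserve node degree and is not the CrossTalkZ or FunGeneNet algorithm. The network, gene names and function names are hypothetical.

```python
import random


def internal_links(edges, genes):
    """Count edges whose two endpoints both lie in the gene set."""
    return sum(1 for a, b in edges if a in genes and b in genes)


def permutation_p_value(edges, gene_set, all_genes, permutations=1000, seed=0):
    """Empirical p-value: how often a random set of equal size is at least as connected."""
    rng = random.Random(seed)
    observed = internal_links(edges, gene_set)
    hits = sum(
        internal_links(edges, set(rng.sample(all_genes, len(gene_set)))) >= observed
        for _ in range(permutations)
    )
    # Add-one correction keeps the estimate away from an impossible p-value of zero.
    return observed, (hits + 1) / (permutations + 1)


# Hypothetical network of ten genes in which g0..g3 form a fully connected group.
all_genes = [f"g{i}" for i in range(10)]
edges = [
    ("g0", "g1"), ("g0", "g2"), ("g0", "g3"), ("g1", "g2"), ("g1", "g3"), ("g2", "g3"),
    ("g3", "g4"), ("g4", "g5"), ("g6", "g7"), ("g8", "g9"),
]

observed, p_value = permutation_p_value(edges, {"g0", "g1", "g2", "g3"}, all_genes)
```

Restricting `edges` to one interaction type at a time before calling the function would give a per-type comparison in the spirit of the analysis described above.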
Affiliation(s)
- Evgeny S Tiys
- The Institute of Cytology and Genetics, The Siberian Branch of the Russian Academy of Sciences, Prospekt Lavrentyeva 10, 630090, Novosibirsk, Russia
- Laboratory of Computer Genomics, Novosibirsk State University, Pirogova Str. 2, 630090, Novosibirsk, Russia
| | - Timofey V Ivanisenko
- The Institute of Cytology and Genetics, The Siberian Branch of the Russian Academy of Sciences, Prospekt Lavrentyeva 10, 630090, Novosibirsk, Russia
- Laboratory of Computer Genomics, Novosibirsk State University, Pirogova Str. 2, 630090, Novosibirsk, Russia
| | - Pavel S Demenkov
- The Institute of Cytology and Genetics, The Siberian Branch of the Russian Academy of Sciences, Prospekt Lavrentyeva 10, 630090, Novosibirsk, Russia
| | - Vladimir A Ivanisenko
- The Institute of Cytology and Genetics, The Siberian Branch of the Russian Academy of Sciences, Prospekt Lavrentyeva 10, 630090, Novosibirsk, Russia
| |
|
17
|
Detecting pathway relationship in the context of human protein-protein interaction network and its application to Parkinson’s disease. Methods 2017; 131:93-103. [DOI: 10.1016/j.ymeth.2017.08.001] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 04/15/2017] [Revised: 07/31/2017] [Accepted: 08/03/2017] [Indexed: 02/06/2023] Open
|
18
|
Aguilar D, Pinart M, Koppelman GH, Saeys Y, Nawijn MC, Postma DS, Akdis M, Auffray C, Ballereau S, Benet M, García-Aymerich J, González JR, Guerra S, Keil T, Kogevinas M, Lambrecht B, Lemonnier N, Melen E, Sunyer J, Valenta R, Valverde S, Wickman M, Bousquet J, Oliva B, Antó JM. Computational analysis of multimorbidity between asthma, eczema and rhinitis. PLoS One 2017; 12:e0179125. [PMID: 28598986 PMCID: PMC5466323 DOI: 10.1371/journal.pone.0179125] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.6] [Received: 07/08/2016] [Accepted: 05/24/2017] [Indexed: 12/11/2022] Open
Abstract
Background: The mechanisms explaining the co-existence of asthma, eczema and rhinitis (allergic multimorbidity) are largely unknown. We investigated the mechanisms underlying multimorbidity between three main allergic diseases at a molecular level by identifying the proteins and cellular processes that are common to them. Methods: An in silico study based on computational analysis of the topology of the protein interaction network was performed in order to characterize the molecular mechanisms of multimorbidity of asthma, eczema and rhinitis. As a first step, proteins associated with each disease were identified using data mining approaches, and their overlap was calculated. Secondly, a functional interaction network was built, allowing the identification of cellular pathways involved in allergic multimorbidity. Finally, a network-based algorithm generated a ranked list of newly predicted multimorbidity-associated proteins. Results: Asthma, eczema and rhinitis shared a larger number of associated proteins than expected by chance, and their associated proteins exhibited a significant degree of interconnectedness in the interaction network. There were 15 pathways involved in the multimorbidity of asthma, eczema and rhinitis, including IL4 signaling and GATA3-related pathways. A number of proteins potentially associated with these multimorbidity processes were also obtained. Conclusions: These results strongly support the existence of an allergic multimorbidity cluster between asthma, eczema and rhinitis, and suggest that type 2 signaling pathways represent a relevant multimorbidity mechanism of allergic diseases. Furthermore, we identified new candidates contributing to multimorbidity that may assist in identifying new targets for multimorbid allergic diseases.
Affiliation(s)
- Daniel Aguilar
- ISGlobal, Centre for Research in Environmental Epidemiology (CREAL), Barcelona, Spain
- Structural Bioinformatics Group, Departament de Ciencies Experimentals i de la Salut, Universitat Pompeu Fabra, Barcelona, Spain
- CIBER Epidemiologia y Salud Pública (CIBERESP), Barcelona, Spain
| | - Mariona Pinart
- ISGlobal, Centre for Research in Environmental Epidemiology (CREAL), Barcelona, Spain
- CIBER Epidemiologia y Salud Pública (CIBERESP), Barcelona, Spain
- Institut Municipal d'Investigació Mèdica (IMIM), Barcelona, Spain
| | - Gerard H. Koppelman
- University of Groningen, University Medical Center Groningen, Groningen Research Institute for Asthma and COPD, Groningen, The Netherlands
- University of Groningen, University Medical Center Groningen, Beatrix Children's Hospital, Department of Pediatric Pulmonology and Pediatric Allergology, Groningen, The Netherlands
| | - Yvan Saeys
- Inflammation Research Center, VIB, Ghent, Belgium
- Department of Respiratory Medicine, Ghent University Hospital, Ghent, Belgium
| | - Martijn C. Nawijn
- University of Groningen, University Medical Center Groningen, Groningen Research Institute for Asthma and COPD, Groningen, The Netherlands
- University of Groningen, Laboratory of Allergology and Pulmonary Diseases, Department of Pathology and Medical Biology, University Medical Center Groningen, Groningen, The Netherlands
| | - Dirkje S. Postma
- University of Groningen, University Medical Center Groningen, Groningen Research Institute for Asthma and COPD, Groningen, The Netherlands
- University of Groningen, Laboratory of Allergology and Pulmonary Diseases, Department of Pathology and Medical Biology, University Medical Center Groningen, Groningen, The Netherlands
| | - Mübeccel Akdis
- Swiss Institute of Allergy and Asthma Research (SIAF), Davos, Switzerland
- Christine Kühne–Center for Allergy Research and Education, Davos, Switzerland
| | - Charles Auffray
- European Institute for Systems Biology and Medicine (EISBM), CNRS, Lyon, France
| | - Stéphane Ballereau
- European Institute for Systems Biology and Medicine (EISBM), CNRS, Lyon, France
| | - Marta Benet
- ISGlobal, Centre for Research in Environmental Epidemiology (CREAL), Barcelona, Spain
- CIBER Epidemiologia y Salud Pública (CIBERESP), Barcelona, Spain
| | - Judith García-Aymerich
- ISGlobal, Centre for Research in Environmental Epidemiology (CREAL), Barcelona, Spain
- CIBER Epidemiologia y Salud Pública (CIBERESP), Barcelona, Spain
| | - Juan Ramón González
- ISGlobal, Centre for Research in Environmental Epidemiology (CREAL), Barcelona, Spain
- CIBER Epidemiologia y Salud Pública (CIBERESP), Barcelona, Spain
| | - Stefano Guerra
- ISGlobal, Centre for Research in Environmental Epidemiology (CREAL), Barcelona, Spain
- CIBER Epidemiologia y Salud Pública (CIBERESP), Barcelona, Spain
- Arizona Respiratory Center, Tucson, Arizona, United States of America
| | - Thomas Keil
- Institute of Social Medicine, Epidemiology and Health Economics, Charité University Medical Centre, Berlin, Germany
| | - Manolis Kogevinas
- ISGlobal, Centre for Research in Environmental Epidemiology (CREAL), Barcelona, Spain
- CIBER Epidemiologia y Salud Pública (CIBERESP), Barcelona, Spain
- Institut Municipal d'Investigació Mèdica (IMIM), Barcelona, Spain
- National School of Public Health, Athens, Greece
| | - Bart Lambrecht
- University of Groningen, University Medical Center Groningen, Groningen Research Institute for Asthma and COPD, Groningen, The Netherlands
- Department of Pulmonary Medicine, Erasmus MC, Rotterdam, the Netherlands
| | - Nathanael Lemonnier
- European Institute for Systems Biology and Medicine (EISBM), CNRS, Lyon, France
| | - Erik Melen
- Institute of Environmental Medicine, Karolinska Institutet, Stockholm, Sweden
- Sach's Children's Hospital, Stockholm, Sweden
| | - Jordi Sunyer
- ISGlobal, Centre for Research in Environmental Epidemiology (CREAL), Barcelona, Spain
- Structural Bioinformatics Group, Departament de Ciencies Experimentals i de la Salut, Universitat Pompeu Fabra, Barcelona, Spain
- CIBER Epidemiologia y Salud Pública (CIBERESP), Barcelona, Spain
- Institut Municipal d'Investigació Mèdica (IMIM), Barcelona, Spain
| | - Rudolf Valenta
- Division of Immunopathology, Department of Pathophysiology and Allergy Research, Center of Pathophysiology, Infectiology and Immunology, Medical University of Vienna, Vienna, Austria
- Christian Doppler Laboratory for Allergy Research, Medical University of Vienna, Vienna, Austria
| | - Sergi Valverde
- ICREA-Complex Systems Lab, Universitat Pompeu Fabra, Barcelona, Spain
- Institut de Biologia Evolutiva, CSIC-UPF, Barcelona, Spain
| | - Magnus Wickman
- Institute of Environmental Medicine, Karolinska Institutet, Stockholm, Sweden
- Sach's Children's Hospital, Stockholm, Sweden
| | - Jean Bousquet
- Hopital Arnaud de Villeneuve University Hospital and Inserm, Montpellier, France
| | - Baldo Oliva
- Structural Bioinformatics Group, Departament de Ciencies Experimentals i de la Salut, Universitat Pompeu Fabra, Barcelona, Spain
| | - Josep M. Antó
- ISGlobal, Centre for Research in Environmental Epidemiology (CREAL), Barcelona, Spain
- CIBER Epidemiologia y Salud Pública (CIBERESP), Barcelona, Spain
- Institut Municipal d'Investigació Mèdica (IMIM), Barcelona, Spain
| |
|
19
|
Jeggari A, Alexeyenko A. NEArender: an R package for functional interpretation of 'omics' data via network enrichment analysis. BMC Bioinformatics 2017; 18:118. [PMID: 28361684 PMCID: PMC5374688 DOI: 10.1186/s12859-017-1534-y] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Indexed: 01/12/2023] Open
Abstract
Background: The statistical evaluation of pathway enrichment, i.e. of gene profiles' confluence to the pathway level, allows exploring molecular landscapes using functionally annotated gene sets. However, pathway scores can also be used as predictive features in machine learning. That requires, firstly, increasing statistical power and biological relevance via a network enrichment analysis (NEA) and, secondly, a fast and convenient procedure for rendering the original data into a space of pathway scores. However, previous implementations of NEA involved multiple runs of network randomization and were therefore slow. Results: Here, we present a new R package NEArender which can transform raw 'omics' features of experimental or clinical samples into matrices describing the same samples with many fewer NEA-based pathway scores. This is done via a parametric estimation of the null binomial distribution and is thus much faster and less biased than randomization procedures. Further, we compare estimates from these two alternative procedures and demonstrate that the summarization of individual genes to pathways increases the statistical power compared to both the default differential expression analysis on individual genes and the state-of-the-art gene set enrichment analysis. The package also contains functions for preparing input, modeling null distributions, and evaluating alternative versions of the global network. Conclusions: Beyond the state-of-the-art exploration of molecular data through pathway enrichment, score matrices produced by NEArender can be used in larger bioinformatics pipelines as input for phenotype modeling, predicting disease outcomes etc. This approach is often more sensitive and robust than using the original data. The package NEArender is complementary to the online NEA tool EviNet (https://www.evinet.org) and, unlike the latter, enables high performance of computations off-line.
The R package NEArender version 1.4 is available at CRAN repository https://cran.r-project.org/web/packages/NEArender/. Electronic supplementary material: The online version of this article (doi:10.1186/s12859-017-1534-y) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Ashwini Jeggari
- Department of Cell and Molecular Biology, Karolinska Institutet, Stockholm, Sweden
| | - Andrey Alexeyenko
- National Bioinformatics Infrastructure Sweden, Science for Life Laboratory, Stockholm, Sweden
- Department of Microbiology, Tumor and Cell biology, Karolinska Institutet, Stockholm, Sweden
| |
|
20
|
GFD-Net: A novel semantic similarity methodology for the analysis of gene networks. J Biomed Inform 2017; 68:71-82. [PMID: 28274758 DOI: 10.1016/j.jbi.2017.02.013] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 08/08/2016] [Revised: 02/08/2017] [Accepted: 02/22/2017] [Indexed: 02/06/2023]
Abstract
Since the popularization of biological network inference methods, it has become crucial to create methods to validate the resulting models. Here we present GFD-Net, the first methodology that applies the concept of semantic similarity to gene network analysis. GFD-Net combines the concept of semantic similarity with the use of gene network topology to analyze the functional dissimilarity of gene networks based on Gene Ontology (GO). The main innovation of GFD-Net lies in the way that semantic similarity is used to analyze gene networks taking into account the network topology. GFD-Net selects a functionality for each gene (specified by a GO term), weights each edge according to the dissimilarity between the nodes at its ends and calculates a quantitative measure of the network functional dissimilarity, i.e. a quantitative value of the degree of dissimilarity between the connected genes. The robustness of GFD-Net as a gene network validation tool was demonstrated by performing a ROC analysis on several network repositories. Furthermore, a well-known network was analyzed showing that GFD-Net can also be used to infer knowledge. The relevance of GFD-Net becomes more evident in Section "GFD-Net applied to the study of human diseases" where an example of how GFD-Net can be applied to the study of human diseases is presented. GFD-Net is available as an open-source Cytoscape app which offers a user-friendly interface to configure and execute the algorithm as well as the ability to visualize and interact with the results (http://apps.cytoscape.org/apps/gfdnet).
|
21
|
Meng YX, Liu QH, Chen DH, Meng Y. Pathway cross-talk network analysis identifies critical pathways in neonatal sepsis. Comput Biol Chem 2017; 68:101-106. [PMID: 28292731 DOI: 10.1016/j.compbiolchem.2017.02.007] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 12/06/2016] [Revised: 02/07/2017] [Accepted: 02/21/2017] [Indexed: 11/17/2022]
Abstract
BACKGROUND: Despite advances in neonatal care, sepsis remains a major cause of morbidity and mortality in neonates worldwide. Pathway cross-talk analysis might contribute to the inference of the driving forces in bacterial sepsis and facilitate a better understanding of the underlying pathogenesis of neonatal sepsis. OBJECTIVE: This study aimed to explore the critical pathways associated with the progression of neonatal sepsis by pathway cross-talk analysis. METHODS: By integrating neonatal transcriptome data with known pathway data and protein-protein interaction data, we systematically uncovered the disease pathway cross-talks and constructed a disease pathway cross-talk network for neonatal sepsis. Then, the attract method was employed to explore the dysregulated pathways associated with neonatal sepsis. To determine the critical pathways in neonatal sepsis, the rank product (RP) algorithm, centrality analysis and impact factor (IF) were introduced sequentially, which together considered the differential expression of genes and pathways, pathway cross-talks and pathway parameters in the network. The dysregulated pathways with the highest IF values as well as RP<0.01 were defined as critical pathways in neonatal sepsis. RESULTS: By integrating three kinds of data, only 6919 common genes were included to perform the pathway cross-talk analysis. By statistical analysis, a total of 1249 significant pathway cross-talks were selected to construct the pathway cross-talk network. Moreover, 47 dysregulated pathways were identified via the attract method, 20 pathways were identified under RP<0.01, and the top 10 pathways with the highest IF were also screened from the pathway cross-talk network. Among them, we selected 8 common pathways, i.e. critical pathways. CONCLUSIONS: In this study, we systematically tracked 8 critical pathways involved in neonatal sepsis by integrating the attract method and the pathway cross-talk network. These pathways might be responsible for the host response in infection, and of great value for advancing diagnosis and therapy of neonatal sepsis.
Affiliation(s)
- Yu-Xiu Meng
- Department of Neonatology, First People's Hospital of Jining, Jining, Shandong 272011, PR China
- Quan-Hong Liu
- Department of Neonatology, Maternal and Child Health Hospital of Sishui, Jining, Shandong 273200, PR China
- Deng-Hong Chen
- Department of Obstetrics, First People's Hospital of Jining, Jining, Shandong 272011, PR China.
- Ying Meng
- Department of Internal Medicine, Traditional Chinese Medicine Hospital of Yanzhou, Jining, Shandong 272000, PR China
22
Ogris C, Guala D, Helleday T, Sonnhammer ELL. A novel method for crosstalk analysis of biological networks: improving accuracy of pathway annotation. Nucleic Acids Res 2016; 45:e8. [PMID: 27664219 PMCID: PMC5314790 DOI: 10.1093/nar/gkw849] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.1] [Received: 06/03/2016] [Revised: 08/17/2016] [Accepted: 08/23/2016] [Indexed: 12/13/2022] Open
Abstract
Analyzing gene expression patterns is a mainstay to gain functional insights of biological systems. A plethora of tools exist to identify significant enrichment of pathways for a set of differentially expressed genes. Most tools analyze gene overlap between gene sets and are therefore severely hampered by the current state of pathway annotation, yet at the same time they run a high risk of false assignments. A way to improve both true positive and false positive rates (FPRs) is to use a functional association network and instead look for enrichment of network connections between gene sets. We present a new network crosstalk analysis method BinoX that determines the statistical significance of network link enrichment or depletion between gene sets, using the binomial distribution. This is a much more appropriate statistical model than previous methods have employed, and as a result BinoX yields substantially better true positive and FPRs than was possible before. A number of benchmarks were performed to assess the accuracy of BinoX and competing methods. We demonstrate examples of how BinoX finds many biologically meaningful pathway annotations for gene sets from cancer and other diseases, which are not found by other methods. BinoX is available at http://sonnhammer.org/BinoX.
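As a rough illustration of the idea in this abstract, the sketch below counts network links between two gene sets and scores the count with a binomial upper tail. It is not the BinoX implementation: the function names, the toy network, and the background link probability (taken here as the overall density of the toy network) are assumptions made for illustration only. The PathwAX abstract later in this list notes that BinoX itself relies on Monte-Carlo sampling of randomized networks to estimate its binomial distribution.

```python
from itertools import product
from math import comb


def count_crosstalk_links(edges, set_a, set_b):
    """Return (observed links, possible gene pairs) between two gene sets.

    `edges` is an iterable of undirected (gene, gene) pairs.
    """
    edge_set = {frozenset(e) for e in edges}
    # Every distinct unordered pair with one gene from each set.
    pairs = {frozenset((a, b)) for a, b in product(set_a, set_b) if a != b}
    observed = sum(1 for pair in pairs if pair in edge_set)
    return observed, len(pairs)


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p); small values suggest link enrichment."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical toy network; gene names are placeholders.
toy_edges = [("g1", "g4"), ("g2", "g4"), ("g3", "g5"), ("g1", "g2"), ("g5", "g6")]
genes = {g for edge in toy_edges for g in edge}

# Assumption: background link probability = density of the toy network.
background_p = len(toy_edges) / comb(len(genes), 2)

links, n_pairs = count_crosstalk_links(toy_edges, {"g1", "g2", "g3"}, {"g4", "g5"})
p_value = binomial_upper_tail(links, n_pairs, background_p)
print(links, n_pairs, round(p_value, 4))
```

A lower-tail sum over `range(0, k + 1)` would test for depletion in the same way, which is the second direction the abstract mentions.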
Affiliation(s)
- Christoph Ogris
- Stockholm Bioinformatics Center, Department of Biochemistry and Biophysics, Stockholm University, Science for Life Laboratory, Box 1031, 17121 Solna, Sweden
- Dimitri Guala
- Stockholm Bioinformatics Center, Department of Biochemistry and Biophysics, Stockholm University, Science for Life Laboratory, Box 1031, 17121 Solna, Sweden
- Thomas Helleday
- Division of Translational Medicine and Chemical Biology, Department of Medical Biochemistry and Biophysics, Karolinska Institutet, Science for Life Laboratory, Box 1031, 17121 Solna, Sweden
- Erik L L Sonnhammer
- Stockholm Bioinformatics Center, Department of Biochemistry and Biophysics, Stockholm University, Science for Life Laboratory, Box 1031, 17121 Solna, Sweden
23
Signorelli M, Vinciotti V, Wit EC. NEAT: an efficient network enrichment analysis test. BMC Bioinformatics 2016; 17:352. [PMID: 27597310 PMCID: PMC5011912 DOI: 10.1186/s12859-016-1203-6] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.0] [Received: 01/07/2016] [Accepted: 08/24/2016] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Network enrichment analysis is a powerful method that integrates gene enrichment analysis with the information on relationships between genes provided by gene networks. Existing tests for network enrichment analysis deal only with undirected networks, can be computationally slow, and are based on normality assumptions. RESULTS We propose NEAT, a test for network enrichment analysis. The test is based on the hypergeometric distribution, which naturally arises as the null distribution in this context. NEAT can be applied not only to undirected, but also to directed and partially directed networks. Our simulations indicate that NEAT is considerably faster than alternative resampling-based methods, and that its capacity to detect enrichments is at least as good as that of alternative tests. We discuss applications of NEAT to network analyses in yeast by testing for enrichment of the Environmental Stress Response target gene set with GO Slim and KEGG functional gene sets, and also by inspecting associations between functional sets themselves. CONCLUSIONS NEAT is a flexible and efficient test for network enrichment analysis that aims to overcome some limitations of existing resampling-based tests. The method is implemented in the R package neat, which can be freely downloaded from CRAN (https://cran.r-project.org/package=neat).
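To make the hypergeometric null concrete, here is a minimal sketch for a directed network. The parameterization (arrows leaving set A as the draws, arrows entering set B as the marked items, all arrows in the network as the population) is one reading of the general idea and should be treated as an assumption; the R package neat remains the reference implementation. The function names and the toy edges below are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(k, population, marked, draws):
    """P(X >= k) when `draws` items are taken without replacement from a
    population of `population` items, `marked` of which count as successes."""
    upper = min(marked, draws)
    total = comb(population, draws)
    favourable = sum(
        comb(marked, i) * comb(population - marked, draws - i)
        for i in range(k, upper + 1)
    )
    return favourable / total


def directed_enrichment(edges, set_a, set_b):
    """Score the number of arrows going from set_a to set_b.

    `edges` is a list of directed (source, target) pairs.
    Returns (observed arrows from A to B, upper-tail p-value).
    """
    n_ab = sum(1 for u, v in edges if u in set_a and v in set_b)
    out_a = sum(1 for u, _ in edges if u in set_a)  # arrows leaving A
    in_b = sum(1 for _, v in edges if v in set_b)   # arrows entering B
    return n_ab, hypergeom_upper_tail(n_ab, len(edges), in_b, out_a)


# Hypothetical toy network; node names are placeholders.
toy_arrows = [("a", "x"), ("a", "y"), ("b", "x"), ("c", "z"), ("d", "x"), ("d", "z")]
observed, p_value = directed_enrichment(toy_arrows, {"a", "b"}, {"x", "y"})
print(observed, p_value)
```

Because the tail is computed in closed form, no resampling of the network is needed, which is consistent with the speed advantage the abstract reports over resampling-based tests.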
Affiliation(s)
- Mirko Signorelli
- Johann Bernoulli Institute, University of Groningen, Nijenborgh 9, Groningen, 9747 AG, Netherlands; Department of Statistical Sciences, University of Padova, Via C. Battisti 241, Padova, 35121, Italy
- Veronica Vinciotti
- Department of Mathematics, Brunel University London, Uxbridge UB8 3PH, London, UK
- Ernst C Wit
- Johann Bernoulli Institute, University of Groningen, Nijenborgh 9, Groningen, 9747 AG, Netherlands.
24
Mamrut S, Avidan N, Staun-Ram E, Ginzburg E, Truffault F, Berrih-Aknin S, Miller A. Integrative analysis of methylome and transcriptome in human blood identifies extensive sex- and immune cell-specific differentially methylated regions. Epigenetics 2016; 10:943-57. [PMID: 26291385 DOI: 10.1080/15592294.2015.1084462] [Citation(s) in RCA: 36] [Impact Index Per Article: 4.0] [Indexed: 01/01/2023] Open
Abstract
The relationship between DNA methylation and gene expression is complex and elusive. To further elucidate these relations, we performed an integrative analysis of the methylome and transcriptome of 4 circulating immune cell subsets (B cells, monocytes, CD4(+), and CD8(+) T cells) from healthy females. Additionally, in light of the known sex bias in the prevalence of several immune-mediated diseases, the female datasets were compared with similar publicly available male datasets. Immune cell-specific differentially methylated regions (DMRs) were found to be highly similar between sexes, with an average correlation coefficient of 0.82; however, numerous sex-specific DMRs, shared by the cell subsets, were identified, mainly on autosomal chromosomes. This provides a list of highly interesting candidate genes to be studied in disorders with sexual dimorphism, such as autoimmune diseases. Immune cell-specific DMRs were mainly located in the gene body and intergenic region, distant from CpG islands but overlapping with enhancer elements, indicating that distal regulatory elements are important in immune cell specificity. In contrast, sex-specific DMRs were overrepresented in CpG islands, suggesting that the epigenetic regulatory mechanisms of sex and immune cell specificity may differ. Both positive and, more frequently, negative correlations between subset-specific expression and methylation were observed, and cell-specific DMRs of both interactions were associated with similar biological pathways, while sex-specific DMRs were linked to networks of early development or estrogen receptor and immune-related molecules. Our findings of immune cell- and sex-specific methylome and transcriptome profiles provide novel insight into their complex regulatory interactions and may particularly contribute to research on immune-mediated diseases.
Affiliation(s)
- Shimrat Mamrut
- Rappaport Faculty of Medicine, Technion-Israel Institute of Technology, Haifa, Israel
- Nili Avidan
- Rappaport Faculty of Medicine, Technion-Israel Institute of Technology, Haifa, Israel
- Elsebeth Staun-Ram
- Rappaport Faculty of Medicine, Technion-Israel Institute of Technology, Haifa, Israel
- Elizabeta Ginzburg
- Rappaport Faculty of Medicine, Technion-Israel Institute of Technology, Haifa, Israel
- Frederique Truffault
- INSERM - U974/CNRS UMR7215//UPMC UM76/AIM, Institute of Myology Pitie-Salpetriere, Paris, France
- Sonia Berrih-Aknin
- INSERM - U974/CNRS UMR7215//UPMC UM76/AIM, Institute of Myology Pitie-Salpetriere, Paris, France
- Ariel Miller
- Rappaport Faculty of Medicine, Technion-Israel Institute of Technology, Haifa, Israel; Division of Neuroimmunology, Lady Davis Carmel Medical Center, Haifa, Israel
25
Ogris C, Helleday T, Sonnhammer ELL. PathwAX: a web server for network crosstalk based pathway annotation. Nucleic Acids Res 2016; 44:W105-9. [PMID: 27151197 PMCID: PMC4987909 DOI: 10.1093/nar/gkw356] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.8] [Received: 02/10/2016] [Accepted: 04/19/2016] [Indexed: 12/22/2022] Open
Abstract
Pathway annotation of gene lists is often used to functionally analyse biomolecular data such as gene expression in order to establish which processes are activated in a given experiment. Databases such as KEGG or GO represent collections of how genes are known to be organized in pathways, and the challenge is to compare a given gene list with the known pathways such that all true relations are identified. Most tools apply statistical measures to the gene overlap between the gene list and pathway. It is however problematic to avoid false negatives and false positives when only using the gene overlap. The pathwAX web server (http://pathwAX.sbc.su.se/) applies a different approach which is based on network crosstalk. It uses the comprehensive network FunCoup to analyse network crosstalk between a query gene list and KEGG pathways. PathwAX runs the BinoX algorithm, which employs Monte-Carlo sampling of randomized networks and estimates a binomial distribution, for estimating the statistical significance of the crosstalk. This results in substantially higher accuracy than gene overlap methods. The system was optimized for speed and allows interactive web usage. We illustrate the usage and output of pathwAX.
Affiliation(s)
- Christoph Ogris
- Stockholm Bioinformatics Center, Department of Biochemistry and Biophysics, Stockholm University, Science for Life Laboratory, Box 1031, 17121 Solna, Sweden
- Thomas Helleday
- Division of Translational Medicine and Chemical Biology, Department of Medical Biochemistry and Biophysics, Karolinska Institutet, Science for Life Laboratory, Box 1031, 17121 Solna, Sweden
- Erik L L Sonnhammer
- Stockholm Bioinformatics Center, Department of Biochemistry and Biophysics, Stockholm University, Science for Life Laboratory, Box 1031, 17121 Solna, Sweden
26
Tegge AN, Sharp N, Murali TM. Xtalk: a path-based approach for identifying crosstalk between signaling pathways. Bioinformatics 2015; 32:242-51. [PMID: 26400040 DOI: 10.1093/bioinformatics/btv549] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 10/22/2014] [Accepted: 09/04/2015] [Indexed: 12/26/2022] Open
Abstract
MOTIVATION Cells communicate with their environment via signal transduction pathways. On occasion, the activation of one pathway can produce an effect downstream of another pathway, a phenomenon known as crosstalk. Existing computational methods to discover such pathway pairs rely on simple overlap statistics. RESULTS We present Xtalk, a path-based approach for identifying pairs of pathways that may crosstalk. Xtalk computes the statistical significance of the average length of multiple short paths that connect receptors in one pathway to the transcription factors in another. By design, Xtalk reports the precise interactions and mechanisms that support the identified crosstalk. We applied Xtalk to signaling pathways in the KEGG and NCI-PID databases. We manually curated a gold standard set of 132 crosstalking pathway pairs and a set of 140 pairs that did not crosstalk, for which Xtalk achieved an area under the receiver operator characteristic curve of 0.65, a 12% improvement over the closest competing approach. The area under the receiver operator characteristic curve varied with the pathway, suggesting that crosstalk should be evaluated on a pathway-by-pathway level. We also analyzed an extended set of 658 pathway pairs in KEGG and a set of more than 7000 pathway pairs in NCI-PID. For the top-ranking pairs, we found substantial support in the literature (81% for KEGG and 78% for NCI-PID). We provide examples of networks computed by Xtalk that accurately recovered known mechanisms of crosstalk. AVAILABILITY AND IMPLEMENTATION The XTALK software is available at http://bioinformatics.cs.vt.edu/~murali/software. Crosstalk networks are available at http://graphspace.org/graphs?tags=2015-bioinformatics-xtalk. CONTACT ategge@vt.edu, murali@cs.vt.edu SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
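The path-based statistic described above can be approximated with a short sketch. This is a simplification and not the Xtalk software: it uses a single shortest path per receptor and transcription factor pair (the abstract speaks of multiple short paths), it omits the significance calculation entirely, and the function names and toy network are hypothetical.

```python
from collections import deque


def shortest_path_length(graph, source, target):
    """Breadth-first search on a directed graph given as {node: [successors]}.

    Returns the number of edges on a shortest path, or None if unreachable.
    """
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt == target:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


def average_receptor_tf_distance(graph, receptors, tfs):
    """Mean shortest-path length over all connected (receptor, TF) pairs."""
    lengths = []
    for r in receptors:
        for t in tfs:
            d = shortest_path_length(graph, r, t)
            if d is not None:
                lengths.append(d)
    return sum(lengths) / len(lengths) if lengths else None


# Hypothetical toy network: receptors of one pathway, TFs of another.
toy_graph = {"R1": ["K1"], "K1": ["TF1", "K2"], "K2": ["TF2"], "R2": ["K2"]}
score = average_receptor_tf_distance(toy_graph, ["R1", "R2"], ["TF1", "TF2"])
print(score)
```

To turn such a score into a significance value, one would compare it against scores obtained under some null model, for example randomized receptor and transcription factor assignments; the choice of null model is left open here.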
Affiliation(s)
- Allison N Tegge
- Department of Computer Science, Department of Statistics and
- T M Murali
- Department of Computer Science, ICTAS Center for Systems Biology of Engineered Tissues, Virginia Tech, Blacksburg, VA 24061, USA
27
Integrative Analysis with Monte Carlo Cross-Validation Reveals miRNAs Regulating Pathways Cross-Talk in Aggressive Breast Cancer. Biomed Res Int 2015; 2015:831314. [PMID: 26240829 PMCID: PMC4512830 DOI: 10.1155/2015/831314] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.7] [Received: 03/20/2015] [Revised: 05/31/2015] [Accepted: 06/08/2015] [Indexed: 12/11/2022]
Abstract
In this work, an integrated approach was used to identify functional miRNAs regulating gene pathway cross-talk in breast cancer (BC). We first integrated gene expression profiles and biological pathway information to explore the underlying associations between genes differentially expressed between normal and BC samples and the pathways enriched in these genes. For each pair of pathways, a score was derived from the distribution of gene expression levels by quantifying their pathway cross-talk. Random forest classification allowed the identification of pairs of pathways with high cross-talk. We assessed miRNAs regulating the identified gene pathways by a mutual information analysis. A Fisher test was applied to demonstrate their significance in the regulated pathways. Our results suggest interesting networks of pathways that could be key regulators of target genes in BC, including stem cell pluripotency, coagulation, and hypoxia pathways; miRNAs that control these networks could be potential biomarkers for diagnostic, prognostic, and therapeutic development in BC. This work shows that standard methods of predicting normal and tumor classes, such as differentially expressed miRNAs or transcription factors, could lose intrinsic features; instead, our approach revealed the molecules responsible for the disease.
28
Li W, Freudenberg J, Oswald M. Principles for the organization of gene-sets. Comput Biol Chem 2015; 59 Pt B:139-49. [PMID: 26188561 DOI: 10.1016/j.compbiolchem.2015.04.005] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 02/24/2015] [Accepted: 04/08/2015] [Indexed: 12/23/2022]
Abstract
A gene-set, an important concept in microarray expression analysis and systems biology, is a collection of genes and/or their products (i.e. proteins) that have some features in common. There are many different ways to construct gene-sets, but a systematic organization of these ways is lacking. Gene-sets are mainly organized ad hoc in current public-domain databases, with group header names often determined by practical reasons (such as the types of technology used to obtain the gene-sets or a balanced number of gene-sets under a header). Here we aim to provide a gene-set organization principle according to the level at which genes are connected: homology, physical map proximity, chemical interaction, biological, and phenotypic-medical levels. We also distinguish two types of connections between genes: actual connection versus sharing of a label. Actual connections denote direct biological interactions, whereas shared-label connections denote shared membership in a group. Some extensions of the framework are also addressed, such as overlapping of gene-sets, modules, and the incorporation of other non-protein-coding entities such as microRNAs.
Affiliation(s)
- Wentian Li
- The Robert S. Boas Center for Genomics and Human Genetics, The Feinstein Institute for Medical Research, North Shore LIJ Health System, Manhasset, NY, USA.
- Jan Freudenberg
- The Robert S. Boas Center for Genomics and Human Genetics, The Feinstein Institute for Medical Research, North Shore LIJ Health System, Manhasset, NY, USA
- Michaela Oswald
- The Robert S. Boas Center for Genomics and Human Genetics, The Feinstein Institute for Medical Research, North Shore LIJ Health System, Manhasset, NY, USA
29
Dong X, Yambartsev A, Ramsey SA, Thomas LD, Shulzhenko N, Morgun A. Reverse enGENEering of Regulatory Networks from Big Data: A Roadmap for Biologists. Bioinform Biol Insights 2015; 9:61-74. [PMID: 25983554 PMCID: PMC4415676 DOI: 10.4137/bbi.s12467] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.5] [Received: 11/04/2014] [Revised: 02/16/2015] [Accepted: 02/17/2015] [Indexed: 12/29/2022] Open
Abstract
Omics technologies enable unbiased investigation of biological systems through massively parallel sequence acquisition or molecular measurements, bringing the life sciences into the era of Big Data. A central challenge posed by such omics datasets is how to transform these data into biological knowledge, for example, how to use these data to answer questions such as: Which functional pathways are involved in cell differentiation? Which genes should we target to stop cancer? Network analysis is a powerful and general approach to solving this problem, consisting of two fundamental stages: network reconstruction and network interrogation. Here we provide an overview of network analysis, including a step-by-step guide on how to perform and use this approach to investigate a biological question. In this guide, we also include the software packages that we and others employ for each of the steps of a network analysis workflow.
Affiliation(s)
- Xiaoxi Dong
- College of Pharmacy, Oregon State University, Corvallis, OR, USA
- Anatoly Yambartsev
- Department of Statistics, Institute of Mathematics and Statistics, University of Sao Paulo, Sao Paulo, SP, Brazil
- Stephen A Ramsey
- School of Electrical Engineering and Computer Science, Department of Biomedical Sciences, Oregon State University, Corvallis, OR, USA; College of Veterinary Medicine, Department of Biomedical Sciences, Oregon State University, Corvallis, OR, USA
- Lina D Thomas
- Department of Statistics, Institute of Mathematics and Statistics, University of Sao Paulo, Sao Paulo, SP, Brazil
- Natalia Shulzhenko
- College of Veterinary Medicine, Department of Biomedical Sciences, Oregon State University, Corvallis, OR, USA
- Andrey Morgun
- College of Pharmacy, Oregon State University, Corvallis, OR, USA
30
Merid SK, Goranskaya D, Alexeyenko A. Distinguishing between driver and passenger mutations in individual cancer genomes by network enrichment analysis. BMC Bioinformatics 2014; 15:308. [PMID: 25236784 PMCID: PMC4262241 DOI: 10.1186/1471-2105-15-308] [Citation(s) in RCA: 52] [Impact Index Per Article: 4.7] [Received: 05/07/2014] [Accepted: 09/02/2014] [Indexed: 01/09/2023] Open
Abstract
Background In somatic cancer genomes, delineating genuine driver mutations against a background of multiple passenger events is a challenging task. The difficulty of determining function from sequence data and the low frequency of mutations are increasingly hindering the search for novel, less common cancer drivers. The accumulation of extensive amounts of data on somatic point and copy number alterations necessitates the development of systematic methods for driver mutation analysis. Results We introduce a framework for detecting driver mutations via functional network analysis, which is applied to individual genomes and does not require pooling multiple samples. It probabilistically evaluates 1) functional network links between different mutations in the same genome and 2) links between individual mutations and known cancer pathways. In addition, it can employ correlations of mutation patterns in pairs of genes. The method was used to analyze genomic alterations in two TCGA datasets, one for glioblastoma multiforme and another for ovarian carcinoma, which were generated using different approaches to mutation profiling. The proportions of drivers among the reported de novo point mutations in these cancers were estimated to be 57.8% and 16.8%, respectively. Both sets also included extended chromosomal regions with synchronous duplications or losses of multiple genes. We identified putative copy number driver events within many such segments. Finally, we summarized seemingly disparate mutations and discovered a functional network of collagen modifications in the glioblastoma. In order to select the most efficient network for use with this method, we used a novel, ROC curve-based procedure for benchmarking different network versions by their ability to recover pathway membership. Conclusions The results of our network-based procedure were in good agreement with published gold standard sets of cancer genes and were shown to complement and expand frequency-based driver analyses. On the other hand, three sequence-based methods applied to the same data yielded poor agreement with each other and with our results. We review the difference in driver proportions discovered by different sequencing approaches and discuss the functional roles of novel driver mutations. The software used in this work and the global network of functional couplings are publicly available at http://research.scilifelab.se/andrej_alexeyenko/downloads.html. Electronic supplementary material The online version of this article (doi:10.1186/1471-2105-15-308) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Andrey Alexeyenko
- Department of Microbiology, Tumour and Cell biology, Bioinformatics Infrastructure for Life Sciences, Science for Life Laboratory, Karolinska Institutet, 17177 Stockholm, Sweden.
31
Ritz A, Tegge AN, Kim H, Poirel CL, Murali TM. Signaling hypergraphs. Trends Biotechnol 2014; 32:356-62. [PMID: 24857424 PMCID: PMC4299695 DOI: 10.1016/j.tibtech.2014.04.007] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Received: 10/07/2013] [Revised: 04/01/2014] [Accepted: 04/04/2014] [Indexed: 01/10/2023]
Abstract
Signaling pathways function as the information-passing mechanisms of cells. A number of databases with extensive manual curation represent the current knowledge base for signaling pathways. These databases motivate the development of computational approaches for prediction and analysis. Such methods require an accurate and computable representation of signaling pathways. Pathways are often described as sets of proteins or as pairwise interactions between proteins. However, many signaling mechanisms cannot be described using these representations. In this opinion, we highlight a representation of signaling pathways that is underutilized: the hypergraph. We demonstrate the usefulness of hypergraphs in this context and discuss challenges and opportunities for the scientific community.
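One way to see why pairwise interactions can fall short is a reaction that needs several inputs at once, such as complex formation. The sketch below is an illustrative assumption rather than the authors' formalism: it models a directed hyperedge as a set of required inputs (tail) and a set of products (head), and computes which entities become available from a starting set. The class name, function name, and molecule labels are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hyperedge:
    """A directed hyperedge: every tail entity is needed to produce the head."""
    tail: frozenset
    head: frozenset


def reachable(hyperedges, start):
    """Entities obtainable from `start` when a hyperedge fires only if
    all of its tail entities are already present."""
    present = set(start)
    changed = True
    while changed:
        changed = False
        for edge in hyperedges:
            if edge.tail <= present and not edge.head <= present:
                present |= edge.head
                changed = True
    return present


# Hypothetical toy pathway: ligand L and receptor R form the complex "L:R",
# which together with kinase K yields the phosphorylated form "K-P".
toy_edges = [
    Hyperedge(frozenset({"L", "R"}), frozenset({"L:R"})),
    Hyperedge(frozenset({"L:R", "K"}), frozenset({"K-P"})),
]

print(sorted(reachable(toy_edges, {"L", "R", "K"})))
print(sorted(reachable(toy_edges, {"L", "K"})))  # receptor missing, nothing fires
```

In an ordinary graph with edges L to L:R and R to L:R, the complex would look reachable from the ligand alone; the hyperedge keeps the requirement that both inputs are present.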
Affiliation(s)
- Anna Ritz
- Department of Computer Science, Virginia Tech, Blacksburg, VA, USA
- Allison N Tegge
- Department of Computer Science, Virginia Tech, Blacksburg, VA, USA
- Hyunju Kim
- Department of Computer Science, Virginia Tech, Blacksburg, VA, USA
- T M Murali
- Department of Computer Science, Virginia Tech, Blacksburg, VA, USA; ICTAS Center for Systems Biology of Engineered Tissues, Virginia Tech, Blacksburg, VA, USA.