1
Pisarchik AN, Andreev AV, Kurkin SA, Stoyanov D, Badarin AA, Paunova R, Hramov AE. Topology switching during window thresholding fMRI-based functional networks of patients with major depressive disorder: Consensus network approach. Chaos (Woodbury, N.Y.) 2023; 33:093122. [PMID: 37712918 DOI: 10.1063/5.0166148] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/03/2023] [Accepted: 08/29/2023] [Indexed: 09/16/2023]
Abstract
We present a novel method for analyzing brain functional networks using functional magnetic resonance imaging data, which involves utilizing consensus networks. In this study, we compare our approach to a standard group-based method for patients diagnosed with major depressive disorder (MDD) and a healthy control group, taking into account different levels of connectivity. Our findings demonstrate that the consensus network approach uncovers distinct characteristics in network measures and degree distributions when considering connection strengths. In the healthy control group, as connection strengths increase, we observe a transition in the network topology from a combination of scale-free and random topologies to a small-world topology. Conversely, the MDD group exhibits uncertainty in weak connections, while strong connections display small-world properties. In contrast, the group-based approach does not exhibit significant differences in behavior between the two groups. However, it does indicate a transition in topology from a scale-free-like structure to a combination of small-world and scale-free topologies. The use of the consensus network approach also holds immense potential for the classification of MDD patients, as it unveils substantial distinctions between the two groups.
Affiliation(s)
- Alexander N Pisarchik
  - Baltic Center for Neurotechnology and Artificial Intelligence, Immanuel Kant Baltic Federal University, 14, A. Nevskogo Str., Kaliningrad 236016, Russia
  - Center for Biomedical Technology, Universidad Politécnica de Madrid, Campus Montegancedo, Pozuelo de Alarcón 28223, Spain
- Andrey V Andreev
  - Baltic Center for Neurotechnology and Artificial Intelligence, Immanuel Kant Baltic Federal University, 14, A. Nevskogo Str., Kaliningrad 236016, Russia
- Semen A Kurkin
  - Baltic Center for Neurotechnology and Artificial Intelligence, Immanuel Kant Baltic Federal University, 14, A. Nevskogo Str., Kaliningrad 236016, Russia
- Drozdstoy Stoyanov
  - Department of Psychiatry and Medical Psychology, Research Institute, Medical University Plovdiv, 15A Vassil Aprilov Blvd., Plovdiv 4002, Bulgaria
- Artem A Badarin
  - Baltic Center for Neurotechnology and Artificial Intelligence, Immanuel Kant Baltic Federal University, 14, A. Nevskogo Str., Kaliningrad 236016, Russia
- Rossitsa Paunova
  - Department of Psychiatry and Medical Psychology, Research Institute, Medical University Plovdiv, 15A Vassil Aprilov Blvd., Plovdiv 4002, Bulgaria
- Alexander E Hramov
  - Baltic Center for Neurotechnology and Artificial Intelligence, Immanuel Kant Baltic Federal University, 14, A. Nevskogo Str., Kaliningrad 236016, Russia
2
Martins YC, Ziviani A, Cerqueira e Costa MDO, Cavalcanti MCR, Nicolás MF, de Vasconcelos ATR. PPIntegrator: semantic integrative system for protein-protein interaction and application for host-pathogen datasets. Bioinformatics Advances 2023; 3:vbad067. [PMID: 37359724 PMCID: PMC10290227 DOI: 10.1093/bioadv/vbad067] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/08/2022] [Revised: 04/28/2023] [Accepted: 05/30/2023] [Indexed: 06/28/2023]
Abstract
Summary: Semantic web standards have proven important over the last 20 years in promoting data formalization and interlinking between existing knowledge graphs. In this context, several ontologies and data integration initiatives have emerged in recent years for the biological area, such as the broadly used Gene Ontology, which contains metadata to annotate gene function and subcellular location. Another important subject in the biological area is protein-protein interactions (PPIs), which have applications such as protein function inference. Current PPI databases have heterogeneous export methods that complicate their integration and analysis. Presently, several ontologies covering some concepts of the PPI domain are available to promote interoperability across datasets. However, efforts to establish guidelines for automatic semantic data integration and analysis for PPIs in these datasets are limited. Here, we present PPIntegrator, a system that semantically describes data related to protein interactions. We also introduce an enrichment pipeline to generate, predict and validate new potential host-pathogen datasets by transitivity analysis. PPIntegrator contains a data preparation module to organize data from three reference databases and a triplification and data fusion module to describe the provenance information and results. This work provides an overview of the PPIntegrator system applied to integrate and compare host-pathogen PPI datasets from four bacterial species using our proposed transitivity analysis pipeline. We also demonstrate some critical queries to analyze this kind of data and highlight the importance and usage of the semantic data generated by our system. Availability and implementation: https://github.com/YasCoMa/ppintegrator, https://github.com/YasCoMa/ppi_validation_process and https://github.com/YasCoMa/predprin.
Affiliation(s)
- Yasmmin Côrtes Martins
  - Bioinformatics Laboratory, National Laboratory for Scientific Computing, Petrópolis 25651-076, Brazil
- Artur Ziviani
  - Data Extreme Laboratory (DEXL), National Laboratory for Scientific Computing, Petrópolis 25651-076, Brazil
- Marisa Fabiana Nicolás
  - Bioinformatics Laboratory, National Laboratory for Scientific Computing, Petrópolis 25651-076, Brazil
3
Tripp BA, Otu HH. Integration of Multi-Omics Data Using Probabilistic Graph Models and External Knowledge. Curr Bioinform 2022. [DOI: 10.2174/1574893616666210906141545] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2022]
Abstract
Background: High-throughput sequencing technologies have revolutionized the ability to perform systems-level biology and elucidate molecular mechanisms of disease through the comprehensive characterization of different layers of biological information. Integration of these heterogeneous layers can provide insight into the underlying biology but is challenged by modeling complex interactions.
Objective: We introduce OBaNK: omics integration using Bayesian networks and external knowledge, an algorithm to model interactions between heterogeneous high-dimensional biological data to elucidate complex functional clusters and emergent relationships associated with an observed phenotype.
Method: Using Bayesian network learning, we modeled the statistical dependencies and interactions between lipidomics, proteomics, and metabolomics data. The strength of a learned interaction between molecules was altered based on external knowledge.
Results: Networks learned from synthetic datasets based on real pathways achieved an average area under the curve score of ~0.85, an improvement of ~0.23 from baseline methods. When applied to real multi-omics data collected during pregnancy, five distinct functional networks of heterogeneous biological data were identified, and the results were compared to other multi-omics integration approaches.
Conclusion: OBaNK successfully improved the accuracy of learning interaction networks from data integrating external knowledge, identified heterogeneous functional networks from real data, and suggested potential novel interactions associated with the phenotype. These findings can guide future hypothesis generation. OBaNK source code is available at: https://github.com/bridgettripp/OBaNK.git, and a graphical user interface is available at: http://otulab.unl.edu/OBaNK.
Affiliation(s)
- Bridget A. Tripp
  - Department of Electrical and Computer Engineering, University of Nebraska-Lincoln, Lincoln, Nebraska, USA
  - PhD Program of Complex Biosystems, University of Nebraska-Lincoln, Lincoln, Nebraska, USA
- Hasan H. Otu
  - Department of Electrical and Computer Engineering, University of Nebraska-Lincoln, Lincoln, Nebraska, USA
4
Regondi C, Fratelli M, Damia G, Guffanti F, Ganzinelli M, Matteucci M, Masseroli M. Predictive modeling of gene expression regulation. BMC Bioinformatics 2021; 22:571. [PMID: 34837938 PMCID: PMC8626902 DOI: 10.1186/s12859-021-04481-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/12/2021] [Accepted: 11/15/2021] [Indexed: 11/24/2022] Open
Abstract
Background: In-depth analysis of regulation networks of genes aberrantly expressed in cancer is essential for better understanding tumors and identifying key genes that could be therapeutically targeted. Results: We developed a quantitative analysis approach to investigate the main biological relationships among different regulatory elements and target genes; we applied it to Ovarian Serous Cystadenocarcinoma and 177 target genes belonging to three main pathways (DNA REPAIR, STEM CELLS and GLUCOSE METABOLISM) relevant for this tumor. Combining data from ENCODE and TCGA datasets, we built a predictive linear model for the regulation of each target gene, assessing the relationships between its expression, promoter methylation, expression of genes in the same or in the other pathways and of putative transcription factors. We proved the reliability and significance of our approach in a similar tumor type (basal-like breast cancer) and using a different existing algorithm (ARACNe), and we obtained experimental confirmations on potentially interesting results. Conclusions: The analysis of the proposed models allowed disclosing the relations between a gene and its related biological processes, the interconnections between the different gene sets, and the evaluation of the relevant regulatory elements at single gene level. This led to the identification of already known regulators and/or gene correlations and unveiled a set of still unknown and potentially interesting biological relationships for pharmacological and clinical use. Supplementary Information: The online version contains supplementary material available at 10.1186/s12859-021-04481-1.
Affiliation(s)
- Chiara Regondi
  - Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, 20133, Milan, Italy
- Maddalena Fratelli
  - Pharmacogenomics Unit, Istituto di Ricerche Farmacologiche Mario Negri, IRCCS, 20156, Milan, Italy
- Giovanna Damia
  - Laboratory of Molecular Pharmacology, Istituto di Ricerche Farmacologiche Mario Negri, IRCCS, 20156, Milan, Italy
- Federica Guffanti
  - Laboratory of Molecular Pharmacology, Istituto di Ricerche Farmacologiche Mario Negri, IRCCS, 20156, Milan, Italy
- Monica Ganzinelli
  - Laboratory of Molecular Pharmacology, Istituto di Ricerche Farmacologiche Mario Negri, IRCCS, 20156, Milan, Italy
  - Department of Medical Oncology, Fondazione IRCCS Istituto Nazionale dei Tumori, 20133, Milan, Italy
- Matteo Matteucci
  - Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, 20133, Milan, Italy
- Marco Masseroli
  - Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, 20133, Milan, Italy
5
Yi H, Zhang Q, Lin C, Ma S. Information-incorporated Gaussian graphical model for gene expression data. Biometrics 2021; 78:512-523. [PMID: 33527365 DOI: 10.1111/biom.13428] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 04/22/2020] [Revised: 09/19/2020] [Accepted: 01/13/2021] [Indexed: 11/29/2022]
Abstract
In the analysis of gene expression data, network approaches take a system perspective and have played an irreplaceably important role. Gaussian graphical models (GGMs) have been popular in the network analysis of gene expression data. They investigate the conditional dependence between genes and "transform" the problem of estimating network structures into a sparse estimation of precision matrices. When there is a moderate to large number of genes, the number of parameters to be estimated may overwhelm the limited sample size, leading to unreliable estimation and selection. In this article, we propose incorporating information from previous studies (for example, those deposited at PubMed) to assist estimating the network structure in the present data. It is recognized that such information can be partial, biased, or even wrong. A penalization-based estimation approach is developed, shown to have consistency properties, and realized using an effective computational algorithm. Simulation demonstrates its competitive performance under various information accuracy scenarios. The analysis of TCGA lung cancer prognostic genes leads to network structures different from the alternatives.
Affiliation(s)
- Huangdi Yi
  - Department of Biostatistics, Yale School of Public Health, New Haven, Connecticut
- Qingzhao Zhang
  - Department of Statistics, School of Economics; Key Laboratory of Econometrics, Ministry of Education; The Wang Yanan Institute for Studies in Economics, Xiamen University, Xiamen, China
- Cunjie Lin
  - Center for Applied Statistics and School of Statistics, Renmin University of China, Beijing, China
- Shuangge Ma
  - Department of Biostatistics, Yale School of Public Health, New Haven, Connecticut
6
Identification of Factors Influencing Out-of-county Hospitalizations in the New Cooperative Medical Scheme. Curr Med Sci 2019; 39:843-851. [PMID: 31612406 DOI: 10.1007/s11596-019-2115-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/24/2018] [Revised: 03/01/2019] [Indexed: 10/25/2022]
Abstract
Throughout the duration of the New Cooperative Medical Scheme (NCMS), an increasing number of rural patients sought out-of-county medical treatment, which placed a great burden on the NCMS fund. Our study was conducted to examine the prevalence of out-of-county hospitalizations and its related factors, and to provide a scientific basis for follow-up health insurance policies. A total of 215 counties in central and western China from 2008 to 2016 were selected. The total out-of-county hospitalization rate over the nine years was 16.95%; it increased from 12.37% in 2008 to 19.21% in 2016, with an average annual growth rate of 5.66%. The related expenses and compensations increased each year, with those in the central region being higher than those in the western region. Stepwise logistic regression revealed that the increase in the out-of-county hospitalization rate was associated with region (X1), rural population (X2), per capita net income per year (X3), per capita gross domestic product (GDP) (X4), per capita funding amount of NCMS (X5), compensation ratio of out-of-county hospitalization cost (X6), and the average cost per in-county hospitalization (X7) and per out-of-county hospitalization (X8). According to the Bayesian network (BN), the marginal probability of a high out-of-county hospitalization rate was as high as 81.7%. Out-of-county hospitalizations were directly related to X8, X3, X4 and X6. The probability of high out-of-county hospitalization obtained based on hospitalization expense factors, economic factors, regional characteristics and NCMS policy factors was 95.7%, 91.1%, 93.0% and 88.8%, respectively. The analysis also identified how these factors affect out-of-county hospitalization and how they relate to one another.
Our findings suggest that more attention should be paid to the mechanism by which these factors influence out-of-county hospitalizations, and that the increase in hospitalizations outside the county should be reasonably supervised and controlled. Our results will be used to help guide the formulation of proper intervention policies.
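The average annual growth rate quoted in this abstract can be checked with a short calculation. The sketch below assumes the figure is a compound growth rate over the eight yearly intervals between 2008 and 2016; the abstract does not state how it was computed, and the function name is illustrative only.

```python
def average_annual_growth_rate(start: float, end: float, periods: int) -> float:
    """Return the compound growth rate per period between two positive values."""
    if start <= 0 or end <= 0 or periods <= 0:
        raise ValueError("start, end and periods must all be positive")
    return (end / start) ** (1 / periods) - 1


# Out-of-county hospitalization rates (in percent) taken from the abstract.
rate_2008 = 12.37
rate_2016 = 19.21

# Assumption: 2008 -> 2016 spans eight yearly intervals.
growth = average_annual_growth_rate(rate_2008, rate_2016, periods=2016 - 2008)
print(f"{growth:.2%}")
```

Under that assumption the result rounds to the 5.66% reported in the abstract.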
7
Pan J, Rao H, Zhang X, Li W, Wei Z, Zhang Z, Ren H, Song W, Hou Y, Qiu L. Application of a Tabu search-based Bayesian network in identifying factors related to hypertension. Medicine (Baltimore) 2019; 98:e16058. [PMID: 31232943 PMCID: PMC6636952 DOI: 10.1097/md.0000000000016058] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Indexed: 01/13/2023] Open
Abstract
This study aimed to identify factors related to hypertension using multivariate logistic regression analysis and tabu search-based Bayesian networks (BNs). A cluster random sampling method was adopted to obtain samples of the general population aged 15 years or above. Multivariate logistic regression analysis indicated that gender, age, cultural level, body mass index (BMI), central obesity, drinking, diabetes mellitus, myocardial infarction, coronary heart disease, and stroke are associated with hypertension. The BNs showed that the connections between these related factors and hypertension form a complex network structure: age, smoking, occupation, cultural level, BMI, central obesity, drinking, diabetes mellitus, myocardial infarction, coronary heart disease, nephropathy, and stroke were directly connected with hypertension, whereas gender was linked to hypertension indirectly through drinking. The results showed that BNs can not only identify the factors correlated with hypertension but also analyze how these factors affect hypertension and how they relate to one another, which is more consistent with practical theory than logistic regression and has better application prospects.
Affiliation(s)
- Jinhua Pan
  - Department of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, Shanxi
- Huaxiang Rao
  - Department of Public Health and Preventive Medicine, Changzhi Medical University, Shanxi Province
- Xuelei Zhang
  - Third Affiliated Hospital of Chongqing Medical University, Yubei District, Chongqing, China
- Wenhan Li
  - Department of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, Shanxi
- Zhen Wei
  - Department of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, Shanxi
- Zhuang Zhang
  - Department of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, Shanxi
- Hao Ren
  - Department of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, Shanxi
- Weimei Song
  - Department of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, Shanxi
- Yuying Hou
  - Department of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, Shanxi
- Lixia Qiu
  - Department of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, Shanxi
8
Application of tabu search-based Bayesian networks in exploring related factors of liver cirrhosis complicated with hepatic encephalopathy and disease identification. Sci Rep 2019; 9:6251. [PMID: 31000773 PMCID: PMC6472503 DOI: 10.1038/s41598-019-42791-w] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 11/08/2018] [Accepted: 04/08/2019] [Indexed: 02/06/2023] Open
Abstract
This study aimed to explore the related factors and strengths of hepatic cirrhosis complicated with hepatic encephalopathy (HE) by multivariate logistic regression analysis and tabu search-based Bayesian networks (BNs), and to deduce the probability of HE in patients with cirrhosis under different conditions through BN reasoning. Multivariate logistic regression analysis indicated that electrolyte disorders, infections, poor spirits, hepatorenal syndrome, hepatic diabetes, prothrombin time, and total bilirubin are associated with HE. Inferences by BNs found that infection, electrolyte disorder and hepatorenal syndrome are closely related to HE. Those three variables are also related to each other, indicating that the occurrence of any of those three complications may induce the other two complications. When those three complications occur simultaneously, the probability of HE may reach 0.90 or more. The BN constructed by the tabu search algorithm can analyze not only how the correlative factors affect HE but also their interrelationships. Reasoning using BNs can describe how HE is induced on the basis of the order in which doctors acquire patient information, which is consistent with the sequential process of clinical diagnosis and treatment.
9
Zhao H, Duan ZH. Cancer Genetic Network Inference Using Gaussian Graphical Models. Bioinform Biol Insights 2019; 13:1177932219839402. [PMID: 31007526 PMCID: PMC6456846 DOI: 10.1177/1177932219839402] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 02/22/2019] [Accepted: 03/04/2019] [Indexed: 02/06/2023] Open
Abstract
The Cancer Genome Atlas (TCGA) provides a rich resource that can be used to understand how genes interact in cancer cells and has collected RNA-Seq gene expression data for many types of human cancer. However, mining the data to uncover the hidden gene-interaction patterns remains a challenge. The Gaussian graphical model (GGM) is often used to learn genetic networks because it defines an undirected graphical structure, revealing the conditional dependences of genes. In this study, we focus on inferring gene interactions in 15 specific types of human cancer using RNA-Seq expression data and GGM with graphical lasso. We take advantage of the corresponding Kyoto Encyclopedia of Genes and Genomes pathway maps to define the subsets of related genes. RNA-Seq expression levels of the subsets of genes in solid cancerous tumor and normal tissues were extracted from TCGA. The gene expression data sets were cleaned and formatted, and the genetic network corresponding to each cancer type was then inferred using GGM with graphical lasso. The inferred networks reveal stable conditional dependences among the genes at the expression level and confirm the essential roles played by the genes that encode proteins involved in the two key signaling pathways, phosphoinositide 3-kinase (PI3K)/AKT/mTOR and Ras/Raf/MEK/ERK, in human carcinogenesis. These stable dependences elucidate the expression level interactions among the genes that are implicated in many different human cancers. The inferred genetic networks were examined to further identify and characterize a collection of gene interactions that are unique to cancer. The cross-cancer genetic interactions revealed from our study provide another set of knowledge for cancer biologists to propose strong hypotheses, so further biological investigations can be conducted effectively.
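The idea that a Gaussian graphical model reads conditional dependences off the precision (inverse covariance) matrix can be illustrated with a toy example. The sketch below is not the authors' pipeline: it uses a hand-written covariance matrix for three hypothetical genes arranged in a chain (geneA, geneB, geneC), inverts it directly, and omits the L1 penalty that graphical lasso adds to obtain sparse estimates from noisy data. All names and numbers are assumptions for illustration.

```python
import math


def invert_3x3(m):
    """Invert a 3x3 matrix using the adjugate (cofactor) formula."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if abs(det) < 1e-12:
        raise ValueError("matrix is singular")
    adjugate = [
        [e * i - f * h, c * h - b * i, b * f - c * e],
        [f * g - d * i, a * i - c * g, c * d - a * f],
        [d * h - e * g, b * g - a * h, a * e - b * d],
    ]
    return [[value / det for value in row] for row in adjugate]


def partial_correlation(precision, i, j):
    """Partial correlation of variables i and j given all the others."""
    return -precision[i][j] / math.sqrt(precision[i][i] * precision[j][j])


# Hypothetical covariance for a chain geneA -> geneB -> geneC with unit noise:
# geneA and geneC are correlated, but only through geneB.
names = ["geneA", "geneB", "geneC"]
covariance = [
    [1.0, 1.0, 1.0],
    [1.0, 2.0, 2.0],
    [1.0, 2.0, 3.0],
]

precision = invert_3x3(covariance)

# An edge is drawn wherever the partial correlation is non-zero.
edges = [
    (names[i], names[j])
    for i in range(3)
    for j in range(i + 1, 3)
    if abs(partial_correlation(precision, i, j)) > 1e-9
]
print(edges)
```

In this toy case the precision matrix has a zero in the geneA/geneC position, so the graph keeps only the geneA-geneB and geneB-geneC edges even though geneA and geneC are marginally correlated. With real expression data the covariance is estimated from samples, and a penalized estimator such as graphical lasso is used to push weak entries to exactly zero.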
Affiliation(s)
- Haitao Zhao
  - Integrated Bioscience Program, The University of Akron, Akron, OH, USA
  - Department of Computer Science, The University of Akron, Akron, OH, USA
- Zhong-Hui Duan
  - Integrated Bioscience Program, The University of Akron, Akron, OH, USA
  - Department of Computer Science, The University of Akron, Akron, OH, USA
10
Computational methods for Gene Regulatory Networks reconstruction and analysis: A review. Artif Intell Med 2019; 95:133-145. [DOI: 10.1016/j.artmed.2018.10.006] [Citation(s) in RCA: 71] [Impact Index Per Article: 11.8] [Received: 06/21/2018] [Revised: 10/23/2018] [Accepted: 10/23/2018] [Indexed: 01/14/2023]
11
Prevalence of hyperlipidemia in Shanxi Province, China and application of Bayesian networks to analyse its related factors. Sci Rep 2018; 8:3750. [PMID: 29491353 PMCID: PMC5830606 DOI: 10.1038/s41598-018-22167-2] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.7] [Received: 12/11/2017] [Accepted: 02/19/2018] [Indexed: 12/11/2022] Open
Abstract
This study aimed to obtain the prevalence of hyperlipidemia and its related factors in Shanxi Province, China using multivariate logistic regression analysis and tabu search-based Bayesian networks (BNs). A multi-stage stratified random sampling method was adopted to obtain samples among the general population aged 18 years or above. The prevalence of hyperlipidemia in Shanxi Province was 42.6%. Multivariate logistic regression analysis indicated that gender, age, region, occupation, vegetable intake level, physical activity, body mass index, central obesity, hypertension, and diabetes mellitus are associated with hyperlipidemia. BNs were used to find connections between those related factors and hyperlipidemia, which were established by a complex network structure. The results showed that BNs can not only be used to find out the correlative factors of hyperlipidemia but also to analyse how these factors affect hyperlipidemia and their interrelationships, which is consistent with practical theory, is superior to logistic regression and has better application prospects.