1
Carriger JF, Fisher WS. Exploring coral reef communities in Puerto Rico using Bayesian networks. Ecol Inform 2024; 82:102665. [PMID: 39377040 PMCID: PMC11457097 DOI: 10.1016/j.ecoinf.2024.102665] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/09/2024]
Abstract
Most coral reef studies focus on scleractinian (stony) corals to indicate reef condition, but there are other prominent assemblages that play a role in ecosystem structure and function. In Puerto Rico these include fish, gorgonians, and sponges. The U.S. Environmental Protection Agency conducted unique surveys of coral reef communities across the southern coast of Puerto Rico that included simultaneous measurement of all four assemblages. Evaluating the results from a community perspective demands endpoints for all four assemblages, so patterns of community structure were explored by probabilistic clustering of measured variables with Bayesian networks. Most variables were found to have stronger associations within than between taxa, but unsupervised structure learning identified three cross-taxa relationships with potential ecological significance. Clusters for each assemblage were constructed using an expectation-maximization algorithm that created a factor node jointly characterizing the density, size, and diversity of individuals in each taxon. The clusters were characterized by the measured variables, and relationships to variables for other taxa were examined, such as stony coral clusters with fish variables. Each of the factor nodes was then used to create a set of meta-factor clusters that further summarized the aggregate monitoring variables for the four taxa. Once identified, taxon-specific and meta-clusters represent patterns of community structure that can be examined on a regional or site-specific basis to better understand risk assessment, risk management, and delivery of ecosystem services.
Affiliation(s)
- John F. Carriger
  - U.S. Environmental Protection Agency, Office of Research and Development, Center for Environmental Solutions and Emergency Response, Cincinnati, OH 45268, USA
- William S. Fisher
  - U.S. Environmental Protection Agency, Office of Research and Development, Center for Environmental Measurement and Modeling, Gulf Breeze, FL 32561, USA
2
Abdelwahab MM, Al-Karawi KA, Semary HE. Deep Learning-Based Prediction of Alzheimer's Disease Using Microarray Gene Expression Data. Biomedicines 2023; 11:3304. [PMID: 38137524 PMCID: PMC10741889 DOI: 10.3390/biomedicines11123304] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/20/2023] [Revised: 12/02/2023] [Accepted: 12/04/2023] [Indexed: 12/24/2023] Open
Abstract
Alzheimer's disease is a genetically complex disorder, and microarray technology provides valuable insights into it. However, the high dimensionality of microarray datasets and small sample sizes pose challenges. Gene selection techniques have emerged as a promising solution to this challenge, potentially revolutionizing AD diagnosis. The study aims to investigate deep learning techniques, specifically neural networks, in predicting Alzheimer's disease using microarray gene expression data. The goal is to develop a reliable predictive model for early detection and diagnosis, potentially improving patient care and intervention strategies. This study employed gene selection techniques, including Singular Value Decomposition (SVD) and Principal Component Analysis (PCA), to pinpoint pertinent genes within microarray datasets. Leveraging deep learning principles, we harnessed a Convolutional Neural Network (CNN) as our classifier for Alzheimer's disease (AD) prediction. Our approach involved the utilization of a seven-layer CNN with diverse configurations to process the dataset. Empirical outcomes on the AD dataset underscored the effectiveness of the PCA-CNN model, yielding an accuracy of 96.60% and a loss of 0.3503. Likewise, the SVD-CNN model showcased remarkable accuracy, attaining 97.08% and a loss of 0.2466. These results accentuate the potential of our method for gene dimension reduction and classification accuracy enhancement by selecting a subset of pertinent genes. Integrating gene selection methodologies with deep learning architectures presents a promising framework for elevating AD prediction and promoting precision medicine in neurodegenerative disorders. Ongoing research endeavors aim to generalize this approach for diverse applications, explore alternative gene selection techniques, and investigate a variety of deep learning architectures.
Affiliation(s)
- Mahmoud M. Abdelwahab
  - Department of Mathematics and Statistics, College of Science, Imam Mohammad Ibn Saud Islamic University, Riyadh 11564, Saudi Arabia
  - Department of Basic Sciences, Higher Institute of Administrative Sciences, Belbeis 44621, Egypt
- Khamis A. Al-Karawi
  - School of Science, Engineering and Environment, Salford University, Salford M5 4WT, UK
  - College of Veterinary Medicine, Diyala University, Baquba 32001, Iraq
- Hatem E. Semary
  - Department of Mathematics and Statistics, College of Science, Imam Mohammad Ibn Saud Islamic University, Riyadh 11564, Saudi Arabia
  - Department of Statistics and Insurance, Faculty of Commerce, Zagazig University, Zagazig 44519, Egypt
3
Ye Q, Guo NL. Inferencing Bulk Tumor and Single-Cell Multi-Omics Regulatory Networks for Discovery of Biomarkers and Therapeutic Targets. Cells 2022; 12:101. [PMID: 36611894 PMCID: PMC9818242 DOI: 10.3390/cells12010101] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/06/2022] [Revised: 12/22/2022] [Accepted: 12/24/2022] [Indexed: 12/28/2022] Open
Abstract
There are insufficient accurate biomarkers and effective therapeutic targets in current cancer treatment. Multi-omics regulatory networks in patient bulk tumors and single cells can shed light on molecular disease mechanisms. Integration of multi-omics data with large-scale patient electronic medical records (EMRs) can lead to the discovery of biomarkers and therapeutic targets. In this review, multi-omics data harmonization methods were introduced, and common approaches to molecular network inference were summarized. Our Prediction Logic Boolean Implication Networks (PLBINs) have advantages over other methods in constructing genome-scale multi-omics networks in bulk tumors and single cells in terms of computational efficiency, scalability, and accuracy. Based on the constructed multi-modal regulatory networks, graph theory network centrality metrics can be used in the prioritization of candidates for discovering biomarkers and therapeutic targets. Our approach to integrating multi-omics profiles in a patient cohort with large-scale patient EMRs such as the SEER-Medicare cancer registry combined with extensive external validation can identify potential biomarkers applicable in large patient populations. These methodologies form a conceptually innovative framework to analyze various available information from research laboratories and healthcare systems, accelerating the discovery of biomarkers and therapeutic targets to ultimately improve cancer patient survival outcomes.
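The abstract mentions that graph-theoretic centrality metrics can prioritize candidates in a constructed regulatory network. A minimal sketch of that idea, using plain degree centrality on a hypothetical undirected toy network, might look as follows. The gene names, the edges, and the choice of degree centrality are illustrative assumptions, not the authors' PLBIN method or data.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Normalized degree centrality for an undirected edge list."""
    neighbors = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-loops
            neighbors[a].add(b)
            neighbors[b].add(a)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    # Divide by the maximum possible number of neighbors (n - 1).
    return {node: len(adj) / (n - 1) for node, adj in neighbors.items()}


# Hypothetical toy network: G1..G4 are placeholder gene names.
edges = [("G1", "G2"), ("G1", "G3"), ("G1", "G4"), ("G2", "G3")]
centrality = degree_centrality(edges)

# Rank candidates from most to least connected; ties broken by name.
ranked = sorted(centrality, key=lambda g: (-centrality[g], g))
print(ranked)
```

Other centrality measures (betweenness, closeness, eigenvector) would rank nodes differently; degree is used here only because it is the simplest to compute by hand.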
Affiliation(s)
- Qing Ye
  - West Virginia University Cancer Institute, Morgantown, WV 26506, USA
  - Lane Department of Computer Science and Electrical Engineering, West Virginia University, Morgantown, WV 26506, USA
- Nancy Lan Guo
  - West Virginia University Cancer Institute, Morgantown, WV 26506, USA
  - Department of Occupational and Environmental Health Sciences, School of Public Health, West Virginia University, Morgantown, WV 26506, USA
4
Ahmed H, Soliman H, Elmogy M. Early detection of Alzheimer's disease using single nucleotide polymorphisms analysis based on gradient boosting tree. Comput Biol Med 2022; 146:105622. [PMID: 35751201 DOI: 10.1016/j.compbiomed.2022.105622] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 05/10/2021] [Revised: 03/25/2022] [Accepted: 03/29/2022] [Indexed: 11/18/2022]
Abstract
Alzheimer's disease (AD) is a degenerative disorder that attacks nerve cells in the brain. AD leads to memory loss and cognitive and intellectual impairments that can influence social activities and decision-making. The most common type of human genetic variation is the single nucleotide polymorphism (SNP). SNPs are useful markers of complex gene-disease associations. Many common and serious diseases, such as AD, have associated SNPs. Detection of SNP biomarkers linked with AD could help in the early prediction and diagnosis of this disease. The main objective of this paper is to predict and diagnose AD in its early stages, with high classification accuracy, based on SNP biomarkers. One of the most concerning problems is the high number of features. Thus, the paper proposes a comprehensive framework for early AD detection and for identifying the most significant genes based on SNP analysis. Usage of machine learning (ML) techniques to identify new biomarkers of AD is also suggested. In the proposed system, two feature selection techniques are checked separately: the information gain filter and the Boruta wrapper. Both were used to select the most significant genes related to AD. Filter methods measure the relevance of features by their correlation with dependent variables, while wrapper methods measure the usefulness of a subset of features by training a model on it. Gradient boosting tree (GBT) was applied to all genetic data of the Alzheimer's Disease Neuroimaging Initiative phase 1 (ADNI-1) and Whole-Genome Sequencing (WGS) datasets using the two feature selection techniques. In the whole-genome approach ADNI-1, results revealed that the GBT learning algorithm scored an overall accuracy of 99.06% with Boruta feature selection. Using information gain feature selection, the proposed system achieved an average accuracy of 94.87%. The results show that the proposed system is preferable for the early detection of AD. Also, the results revealed that the Boruta wrapper feature selection is superior to the information gain filter technique.
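To make the filter idea concrete, the sketch below scores discrete features by information gain, that is, the reduction in class-label entropy after splitting on the feature. It is a minimal illustration under stated assumptions: the genotype codes, the labels, and the names `snp_a` and `snp_b` are hypothetical toy data, and the code is not the pipeline used in the paper.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(feature, labels):
    """Reduction in label entropy after splitting on a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


# Hypothetical toy data: genotype codes (0, 1, 2) and a case (1) / control (0) label.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
snp_a = [2, 2, 2, 2, 0, 0, 0, 0]  # separates cases from controls perfectly
snp_b = [0, 1, 0, 1, 0, 1, 0, 1]  # independent of the label in this sample

scores = {
    "snp_a": information_gain(snp_a, labels),
    "snp_b": information_gain(snp_b, labels),
}
# A filter keeps the top-scoring features without training any classifier.
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked, scores)
```

A wrapper such as Boruta would instead judge features by how much they help a trained model, which is more expensive but can capture interactions that a per-feature score misses.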
Affiliation(s)
- Hala Ahmed
  - Information Technology Dept., Faculty of Computers and Information, Mansoura University, Mansoura, P.O.35516, Egypt
- Hassan Soliman
  - Information Technology Dept., Faculty of Computers and Information, Mansoura University, Mansoura, P.O.35516, Egypt
- Mohammed Elmogy
  - Information Technology Dept., Faculty of Computers and Information, Mansoura University, Mansoura, P.O.35516, Egypt
5
Rowe TW, Katzourou IK, Stevenson-Hoare JO, Bracher-Smith MR, Ivanov DK, Escott-Price V. Machine learning for the life-time risk prediction of Alzheimer's disease: a systematic review. Brain Commun 2021; 3:fcab246. [PMID: 34805994 PMCID: PMC8598986 DOI: 10.1093/braincomms/fcab246] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 03/11/2021] [Revised: 06/30/2021] [Accepted: 07/19/2021] [Indexed: 12/23/2022] Open
Abstract
Alzheimer’s disease is a neurodegenerative disorder and the most common form of dementia. Early diagnosis may assist interventions to delay onset and reduce the progression rate of the disease. We systematically reviewed the use of machine learning algorithms for predicting Alzheimer’s disease using single nucleotide polymorphisms and instances where these were combined with other types of data. We evaluated the ability of machine learning models to distinguish between controls and cases, while also assessing their implementation and potential biases. Articles published between December 2009 and June 2020 were collected using Scopus, PubMed and Google Scholar. These were systematically screened for inclusion leading to a final set of 12 publications. Eighty-five per cent of the included studies used the Alzheimer's Disease Neuroimaging Initiative dataset. In studies which reported area under the curve, discrimination varied (0.49–0.97). However, more than half of the included manuscripts used other forms of measurement, such as accuracy, sensitivity and specificity. Model calibration statistics were also found to be reported inconsistently across all studies. The most frequent limitation in the assessed studies was sample size, with the total number of participants often numbering less than a thousand, whilst the number of predictors usually ran into the many thousands. In addition, key steps in model implementation and validation were often not performed or unreported, making it difficult to assess the capability of machine learning models.
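The review contrasts area under the curve with threshold-dependent measures such as sensitivity and specificity. The sketch below shows one way to compute both from predicted risk scores. It is an illustration under stated assumptions: the scores, labels, and the 0.5 threshold are hypothetical, and the function names are not taken from any of the reviewed studies.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a random case outscores a random control.

    Ties between a case and a control count as half a win.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def sensitivity_specificity(scores, labels, threshold):
    """Sensitivity and specificity when scores >= threshold are called positive."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical predictions: 1 = case, 0 = control.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]

auc = roc_auc(scores, labels)
sens, spec = sensitivity_specificity(scores, labels, threshold=0.5)
print(auc, sens, spec)
```

An AUC of 0.5 corresponds to chance-level ranking, which puts the lower end of the reported range (0.49) in context. Sensitivity and specificity change with the chosen threshold, whereas AUC summarizes ranking quality across all thresholds.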
Affiliation(s)
- Thomas W Rowe
  - UK Dementia Research Institute, Cardiff University, Cardiff, UK
- Matthew R Bracher-Smith
  - Division of Psychological Medicine and Clinical Neurosciences, School of Medicine, Cardiff University, Cardiff CF24 4HQ, UK
- Dobril K Ivanov
  - UK Dementia Research Institute, Cardiff University, Cardiff, UK
- Valentina Escott-Price
  - UK Dementia Research Institute, Cardiff University, Cardiff, UK
  - Division of Psychological Medicine and Clinical Neurosciences, School of Medicine, Cardiff University, Cardiff CF24 4HQ, UK
6
Gogoshin G, Branciamore S, Rodin AS. Synthetic data generation with probabilistic Bayesian Networks. Mathematical Biosciences and Engineering: MBE 2021; 18:8603-8621. [PMID: 34814315 PMCID: PMC8848551 DOI: 10.3934/mbe.2021426] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 05/13/2023]
Abstract
Bayesian Network (BN) modeling is a prominent and increasingly popular computational systems biology method. It aims to construct network graphs from large heterogeneous biological datasets that reflect the underlying biological relationships. Currently, a variety of strategies exist for evaluating BN methodology performance, ranging from utilizing artificial benchmark datasets and models, to specialized biological benchmark datasets, to simulation studies that generate synthetic data from predefined network models. The last is arguably the most comprehensive approach; however, existing implementations often rely on explicit and implicit assumptions that may be unrealistic in a typical biological data analysis scenario, or are poorly equipped for automated arbitrary model generation. In this study, we develop a purely probabilistic simulation framework that addresses the demands of statistically sound simulation studies in an unbiased fashion. Additionally, we expand on our current understanding of the theoretical notions of causality and dependence / conditional independence in BNs and the Markov Blankets within.
Affiliation(s)
- Grigoriy Gogoshin
  - Department of Computational and Quantitative Medicine, Beckman Research Institute, and Diabetes and Metabolism Research Institute, City of Hope National Medical Center, 1500 East Duarte Road, Duarte, CA 91010 USA
- Sergio Branciamore
  - Department of Computational and Quantitative Medicine, Beckman Research Institute, and Diabetes and Metabolism Research Institute, City of Hope National Medical Center, 1500 East Duarte Road, Duarte, CA 91010 USA
- Andrei S. Rodin
  - Department of Computational and Quantitative Medicine, Beckman Research Institute, and Diabetes and Metabolism Research Institute, City of Hope National Medical Center, 1500 East Duarte Road, Duarte, CA 91010 USA
7
Ahmed H, Alarabi L, El-Sappagh S, Soliman H, Elmogy M. Genetic variations analysis for complex brain disease diagnosis using machine learning techniques: opportunities and hurdles. PeerJ Comput Sci 2021; 7:e697. [PMID: 34616886 PMCID: PMC8459785 DOI: 10.7717/peerj-cs.697] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/12/2021] [Accepted: 08/05/2021] [Indexed: 06/13/2023]
Abstract
BACKGROUND AND OBJECTIVES This paper presents an in-depth review of the state-of-the-art genetic variations analysis to discover complex genes associated with the brain's genetic disorders. We first introduce the genetic analysis of complex brain diseases, genetic variation, and DNA microarrays. Then, the review focuses on available machine learning methods used for complex brain disease classification. Therein, we discuss the various datasets, preprocessing, feature selection and extraction, and classification strategies. In particular, we concentrate on studying single nucleotide polymorphisms (SNP) that support the highest resolution for genomic fingerprinting for tracking disease genes. Subsequently, the study provides an overview of the applications for some specific diseases, including autism spectrum disorder, brain cancer, and Alzheimer's disease (AD). The study argues that despite the significant recent developments in the analysis and treatment of genetic disorders, there are considerable challenges to elucidate causative mutations, especially from the viewpoint of implementing genetic analysis in clinical practice. The review finally provides a critical discussion on the applicability of genetic variations analysis for complex brain disease identification highlighting the future challenges. METHODS We used a methodology for literature surveys to obtain data from academic databases. Criteria were defined for inclusion and exclusion. The selection of articles was followed by three stages. In addition, the principal methods for machine learning to classify the disease were presented in each stage in more detail. RESULTS It was revealed that machine learning based on SNP was widely utilized to solve problems of genetic variation for complex diseases related to genes. 
CONCLUSIONS Despite significant developments in the diagnosis and treatment of genetic diseases in the past two decades, there is still a large percentage of cases in which the causative mutation cannot be determined, and a final genetic diagnosis remains elusive. So, we need to detect the variations of the genes related to brain disorders in the early disease stages.
Affiliation(s)
- Hala Ahmed
  - Information Technology Department, Faculty of Computers and Information, Mansoura University, Mansoura, Egypt
- Louai Alarabi
  - Department of Computer Science, Umm Al-Qura University, Makkah, Saudi Arabia
- Shaker El-Sappagh
  - Centro Singular de Investigación en Tecnoloxías Intelixentes (CiTIUS), Universidade de Santiago de Compostela, Santiago de Compostela, Spain
  - Information Systems Department, Faculty of Computers and Artificial Intelligence, Benha University, Benha, Egypt
- Hassan Soliman
  - Information Technology Department, Faculty of Computers and Information, Mansoura University, Mansoura, Egypt
- Mohammed Elmogy
  - Information Technology Department, Faculty of Computers and Information, Mansoura University, Mansoura, Egypt
8
Abd El Hamid MM, Mabrouk MS, Omar YMK. Developing an early predictive system for identifying genetic biomarkers associated to Alzheimer’s disease using machine learning techniques. Biomedical Engineering: Applications, Basis and Communications 2019. [DOI: 10.4015/s1016237219500406] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Indexed: 11/07/2022]
Abstract
Alzheimer’s disease (AD) is an irreversible, progressive disorder that assaults the nerve cells of the brain. It is the most widely recognized kind of dementia among older adults. Apolipoprotein E (APOE) is one of the most common genetic risk factors for AD, and its significant association with AD is observed in various genome-wide association studies (GWAS). Single nucleotide polymorphisms (SNPs) are the most common type of genetic variation among individuals. SNPs are related to many common diseases like AD. They are recognized as significant biomarkers for this disease and help in understanding and detecting it in its early stages. Detecting SNP biomarkers associated with the disease with high classification accuracy leads to early prediction and diagnosis. Machine learning techniques are utilized to discover new biomarkers of the disease. The sequential minimal optimization (SMO) algorithm with different kernels, Naive Bayes (NB), tree augmented Naive Bayes (TAN) and the K2 learning algorithm have been applied to all genetic data of the Alzheimer’s disease neuroimaging initiative phase 1 (ADNI-1)/Whole genome sequencing (WGS) datasets. The highest classification accuracy was achieved using 500 SNPs based on the [Formula: see text]-value threshold ([Formula: see text]-value [Formula: see text]). In whole genome approach ADNI-1, results revealed that NB and K2 learning algorithms scored an overall accuracy of 98% and 98.40%, respectively. In whole genome approach WGS, NB and K2 learning algorithms scored an overall accuracy of 99.63% and 99.75%, respectively.
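As a rough illustration of how a Naive Bayes classifier treats SNP genotypes, the sketch below fits a categorical Naive Bayes model with Laplace smoothing on a tiny hypothetical dataset. The genotype coding (0, 1, 2), the labels "AD" and "CN", the smoothing constant, and the function names are all assumptions made for this example; it does not reproduce the study's data, software, or reported accuracies.

```python
from collections import Counter, defaultdict
from math import log


def train_naive_bayes(rows, labels, n_values=3, alpha=1.0):
    """Count class frequencies and per-class genotype frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (class, feature index) -> genotype counts
    for row, y in zip(rows, labels):
        for j, v in enumerate(row):
            value_counts[(y, j)][v] += 1
    return class_counts, value_counts, n_values, alpha


def predict(model, row):
    """Return the class with the highest smoothed log-posterior."""
    class_counts, value_counts, n_values, alpha = model
    total = sum(class_counts.values())
    best_label, best_score = None, float("-inf")
    for y, count in class_counts.items():
        score = log(count / total)  # log prior
        for j, v in enumerate(row):
            # Laplace-smoothed likelihood of genotype v at SNP j given class y.
            score += log((value_counts[(y, j)][v] + alpha) / (count + alpha * n_values))
        if score > best_score:
            best_label, best_score = y, score
    return best_label


# Hypothetical toy data: two SNPs per subject, genotypes coded 0/1/2.
rows = [[2, 0], [2, 1], [1, 0], [0, 1], [0, 0], [0, 2]]
labels = ["AD", "AD", "AD", "CN", "CN", "CN"]

model = train_naive_bayes(rows, labels)
print(predict(model, [2, 0]), predict(model, [0, 2]))
```

Naive Bayes assumes the SNPs are conditionally independent given the class. TAN and K2, also named in the abstract, relax that assumption by learning dependencies between features.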
Affiliation(s)
- Mai S. Mabrouk
  - Biomedical Engineering Department, Misr University for Science and Technology (MUST), Egypt
- Yasser M. K. Omar
  - College of Computing and Information Technology AASTMT, Cairo Branch, Egypt
9
Epi-GTBN: an approach of epistasis mining based on genetic Tabu algorithm and Bayesian network. BMC Bioinformatics 2019; 20:444. [PMID: 31455207 PMCID: PMC6712799 DOI: 10.1186/s12859-019-3022-z] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 10/04/2018] [Accepted: 08/07/2019] [Indexed: 12/31/2022] Open
Abstract
Background: Mining epistatic loci that affect specific phenotypic traits is an important research issue in biology. A Bayesian network (BN) is a graphical model that can express the relationship between genetic loci and phenotype, and it has been widely used for epistasis mining. However, this method has two disadvantages: low learning efficiency and a tendency to fall into local optima. Genetic algorithms offer rapid global search and avoid falling into local optima; they are also scalable and easy to integrate with other algorithms. This work proposes an epistasis mining approach based on a genetic tabu algorithm and a Bayesian network (Epi-GTBN). It uses a genetic algorithm in the heuristic search strategy of the Bayesian network. The individual structure evolves through the genetic operations of selection, crossover and mutation, which helps to find the optimal network structure and then to mine the epistatic loci effectively. To enhance the diversity of the population and obtain a more effective global optimal solution, we apply the tabu search strategy within the crossover and mutation operations of the genetic algorithm, which helps to accelerate convergence. Results: We compared Epi-GTBN with other recent algorithms using both simulated and real datasets. The experimental results demonstrate that our method achieves much better epistasis detection accuracy without reducing efficiency across different datasets. Conclusions: The presented methodology (Epi-GTBN) is an effective method for epistasis detection, and it can be seen as an interesting addition to the arsenal used in complex traits analyses.
10
Bottigliengo D, Berchialla P, Lanera C, Azzolina D, Lorenzoni G, Martinato M, Giachino D, Baldi I, Gregori D. The Role of Genetic Factors in Characterizing Extra-Intestinal Manifestations in Crohn's Disease Patients: Are Bayesian Machine Learning Methods Improving Outcome Predictions? J Clin Med 2019; 8:865. [PMID: 31212952 PMCID: PMC6617350 DOI: 10.3390/jcm8060865] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 05/03/2019] [Revised: 06/12/2019] [Accepted: 06/13/2019] [Indexed: 01/01/2023] Open
Abstract
(1) Background: The high heterogeneity of inflammatory bowel disease (IBD) makes the study of this condition challenging. In subjects affected by Crohn’s disease (CD), extra-intestinal manifestations (EIMs) have a remarkable potential impact on health status. Increasing numbers of patient characteristics and the small size of analyzed samples make EIMs prediction very difficult. Under such constraints, Bayesian machine learning techniques (BMLTs) have been proposed as a robust alternative to classical models for outcome prediction. This study aims to determine whether BMLT could improve EIM prediction and statistical support for the decision-making process of clinicians. (2) Methods: Three of the most popular BMLTs were employed in this study: Naïve Bayes (NB), Bayesian Network (BN) and Bayesian Additive Regression Trees (BART). They were applied to a retrospective observational Italian study of IBD genetics. (3) Results: The performance of the model is strongly affected by the features of the dataset, and BMLTs poorly classify EIM appearance. (4) Conclusions: This study shows that BMLTs perform worse than expected in classifying the presence of EIMs compared to classical statistical tools in a context where mixed genetic and clinical data are available but relevant data are also missing, as often occurs in clinical practice.
Affiliation(s)
- Daniele Bottigliengo
  - Unit of Biostatistics, Epidemiology and Public Health, Department of Cardiac, Thoracic, and Vascular Sciences and Public Health, University of Padova, 35131 Padova, Italy
- Paola Berchialla
  - Department of Clinical and Biological Sciences, University of Torino, 10126 Torino, Italy
- Corrado Lanera
  - Unit of Biostatistics, Epidemiology and Public Health, Department of Cardiac, Thoracic, and Vascular Sciences and Public Health, University of Padova, 35131 Padova, Italy
- Danila Azzolina
  - Unit of Biostatistics, Epidemiology and Public Health, Department of Cardiac, Thoracic, and Vascular Sciences and Public Health, University of Padova, 35131 Padova, Italy
- Giulia Lorenzoni
  - Unit of Biostatistics, Epidemiology and Public Health, Department of Cardiac, Thoracic, and Vascular Sciences and Public Health, University of Padova, 35131 Padova, Italy
- Matteo Martinato
  - Unit of Biostatistics, Epidemiology and Public Health, Department of Cardiac, Thoracic, and Vascular Sciences and Public Health, University of Padova, 35131 Padova, Italy
- Daniela Giachino
  - Department of Clinical and Biological Sciences, University of Torino, 10126 Torino, Italy
- Ileana Baldi
  - Unit of Biostatistics, Epidemiology and Public Health, Department of Cardiac, Thoracic, and Vascular Sciences and Public Health, University of Padova, 35131 Padova, Italy
- Dario Gregori
  - Unit of Biostatistics, Epidemiology and Public Health, Department of Cardiac, Thoracic, and Vascular Sciences and Public Health, University of Padova, 35131 Padova, Italy
11
Singh S, Gupta SK, Seth PK. Biomarkers for detection, prognosis and therapeutic assessment of neurological disorders. Rev Neurosci 2018; 29:771-789. [PMID: 29466244 DOI: 10.1515/revneuro-2017-0097] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 11/10/2017] [Accepted: 12/17/2017] [Indexed: 10/24/2023]
Abstract
Neurological disorders have aroused significant concern among health scientists globally, as diseases such as Parkinson's, Alzheimer's and dementia lead to disability and people have to live with them throughout their lives. Recent evidence suggests that a number of environmental chemicals such as pesticides (paraquat) and metals (lead and aluminum) are also a cause of these diseases and other neurological disorders. Biomarkers can help in detecting the disorder at the preclinical stage, tracking the progression of the disease, and revealing key metabolomic alterations, permitting identification of potential targets for intervention. A number of biomarkers have been proposed for some neurological disorders based on laboratory and clinical studies. In silico approaches have also been used by some investigators. Yet the ideal biomarker, which can help in early detection, follow-up on treatment, and identification of susceptible populations, is not available. An attempt has therefore been made to review the recent advancements of in silico approaches for discovery of biomarkers and their validation. In silico techniques implemented with multi-omics approaches have the potential to provide a fast and accurate way to identify novel biomarkers.
Affiliation(s)
- Sarita Singh
  - Distinguished Scientist Laboratory, Biotech Park, Sector-G Jankipram, Kursi Road, Lucknow 226021, Uttar Pradesh, India
- Sunil Kumar Gupta
  - Distinguished Scientist Laboratory, Biotech Park, Lucknow 226021, Uttar Pradesh, India
- Prahlad Kishore Seth
  - Distinguished Scientist Laboratory, Biotech Park, Lucknow 226021, Uttar Pradesh, India
12
Verification of Three-Phase Dependency Analysis Bayesian Network Learning Method for Maize Carotenoid Gene Mining. BioMed Research International 2017; 2017:1813494. [PMID: 28828382 PMCID: PMC5554554 DOI: 10.1155/2017/1813494] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/03/2017] [Accepted: 06/27/2017] [Indexed: 11/17/2022]
Abstract
Background and Objective: Mining the genes related to maize carotenoid components is important for improving the carotenoid content and quality of maize. Methods: Using an entropy estimation method with a Gaussian kernel probability density estimator, we apply the three-phase dependency analysis (TPDA) Bayesian network structure learning method to construct the network of maize genes and carotenoid component traits. Results: Using two discretization methods with different discretization values, we compare the learning effect and efficiency of 10 Bayesian network structure learning methods. The method is verified and analyzed on the maize dataset of a global germplasm collection with 527 elite inbred lines. Conclusions: The results confirmed the effectiveness of the TPDA method, which significantly outperforms the other 9 Bayesian network learning methods. It is an efficient method of mining genes for maize carotenoid component traits. The parameters obtained by experiments will help carry out practical gene mining effectively in the future.
13
Kaewprag P, Newton C, Vermillion B, Hyun S, Huang K, Machiraju R. Predictive models for pressure ulcers from intensive care unit electronic health records using Bayesian networks. BMC Med Inform Decis Mak 2017; 17:65. [PMID: 28699545 PMCID: PMC5506589 DOI: 10.1186/s12911-017-0471-z] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.1] [Indexed: 11/10/2022] Open
Abstract
Background: We develop predictive models that enable clinicians to better understand and explore patient clinical data, along with risk factors for pressure ulcers, in intensive care unit patients from electronic health record data. Identifying accurate risk factors for pressure ulcers is essential to determining appropriate prevention strategies; in this work we examine medication, diagnosis, and traditional Braden pressure ulcer assessment scale measurements as patient features. To predict pressure ulcer incidence and better understand the structure of related risk factors, we construct Bayesian networks from patient features. Bayesian network nodes (features) and edges (conditional dependencies) are simplified with statistical network techniques. Upon reviewing a network visualization of our model, our clinician collaborators were able to identify strong relationships between risk factors widely recognized as associated with pressure ulcers. Methods: We present a three-stage framework for predictive analysis of patient clinical data: 1) developing electronic health record feature extraction functions with the assistance of clinicians, 2) simplifying features, and 3) building Bayesian network predictive models. We evaluate all combinations of Bayesian network models from different search algorithms, scoring functions, prior structure initializations, and sets of features. Results: From the EHRs of 7,717 ICU patients, we construct Bayesian network predictive models from 86 medication, diagnosis, and Braden scale features. Our model not only identifies known and suspected high PU risk factors, but also substantially increases the sensitivity of the prediction, to nearly three times that of logistic regression models, without sacrificing overall accuracy. We visualize a representative model with which our clinician collaborators identify strong relationships between risk factors widely recognized as associated with pressure ulcers.
Conclusions: Given the strong adverse effect of pressure ulcers on patients and the high cost of treating them, our Bayesian network based model provides a novel framework for significantly improving the sensitivity of the prediction model. Thus, when the model is deployed in a clinical setting, caregivers can respond suitably to conditions likely associated with pressure ulcer incidence.
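The claim that sensitivity can rise several-fold without a loss of overall accuracy is easiest to see on an imbalanced confusion matrix. The sketch below uses made-up counts (they are not the study's data, and the model labels are hypothetical) to show two classifiers with the same accuracy but very different sensitivity.

```python
def sensitivity(tp, fn):
    """Fraction of true cases that the model flags (recall)."""
    return tp / (tp + fn)


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical counts for 1,000 patients, 100 of whom develop a pressure ulcer.
# These numbers are illustrative only and do not come from the cited study.
baseline = {"tp": 20, "fn": 80, "fp": 10, "tn": 890}
network = {"tp": 60, "fn": 40, "fp": 50, "tn": 850}

for name, m in (("baseline", baseline), ("network", network)):
    print(
        name,
        "accuracy =", accuracy(**m),
        "sensitivity =", sensitivity(m["tp"], m["fn"]),
    )
```

Because non-cases dominate the cohort, accuracy is driven mostly by true negatives, so it can stay flat while the share of detected cases changes substantially.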
Affiliation(s)
- Pacharmon Kaewprag
  - Department of Computer Science and Engineering, The Ohio State University, Columbus, Ohio, USA
- Cheryl Newton
  - Department of Critical Care Nursing, The Ohio State University Wexner Medical Center, Columbus, Ohio, USA
- Brenda Vermillion
  - College of Nursing, The Ohio State University, Columbus, Ohio, USA; Department of Health Services Nursing Education, The Ohio State University Wexner Medical Center, Columbus, Ohio, USA
- Sookyung Hyun
  - College of Nursing, Pusan National University, Busan, South Korea
- Kun Huang
  - Department of Computer Science and Engineering, The Ohio State University, Columbus, Ohio, USA; Department of Biomedical Informatics, The Ohio State University, Columbus, Ohio, USA
- Raghu Machiraju
  - Department of Computer Science and Engineering, The Ohio State University, Columbus, Ohio, USA; Department of Biomedical Informatics, The Ohio State University, Columbus, Ohio, USA
|
14
|
Geographical distribution of complement receptor type 1 variants and their associated disease risk. PLoS One 2017; 12:e0175973. [PMID: 28520715 PMCID: PMC5435133 DOI: 10.1371/journal.pone.0175973] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 01/09/2017] [Accepted: 04/03/2017] [Indexed: 11/19/2022] Open
Abstract
Background: Pathogens exert selective pressure, which may lead to substantial changes in host immune responses. The human complement receptor type 1 (CR1) is an innate immune recognition glycoprotein that regulates the activation of the complement pathway and removes opsonized immune complexes. CR1 genetic variants in exon 29 have been associated with expression levels, C1q or C3b binding and increased susceptibility to several infectious diseases. Five distinct CR1 nucleotide substitutions determine the Knops blood group phenotypes, namely Kna/b, McCa/b, Sl1/Sl2, Sl4/Sl5 and KCAM+/-. Methods: CR1 variants were genotyped by direct sequencing in a cohort of 441 healthy individuals from Brazil, Vietnam, India, Republic of Congo and Ghana. Results: The distribution of the CR1 alleles, genotypes and haplotypes differed significantly among geographical settings (p≤0.001). CR1 variants rs17047660A/G (McCa/b) and rs17047661A/G (Sl1/Sl2) were polymorphic only in the African populations, not in the groups from Asia and South America, strongly suggesting that these two SNPs may be subject to selection. This is further substantiated by high linkage disequilibrium between the two variants in the Congolese and Ghanaian populations. A total of nine CR1 haplotypes were observed. The CR1*AGAATA haplotype was found more frequently among the Brazilian and Vietnamese study groups; the CR1*AGAATG haplotype was frequent in the Indian and Vietnamese populations, while the CR1*AGAGTG haplotype was frequent among Congolese and Ghanaian individuals. Conclusion: The African populations included in this study might have a selective advantage conferred on immune genes involved in pathogen recognition and signaling, possibly contributing to disease susceptibility or resistance.
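Linkage disequilibrium between two biallelic variants, as mentioned in the abstract above, is commonly summarized by D and r². The sketch below computes both from phased two-locus haplotype counts; the counts are invented for illustration and the function name is hypothetical, so nothing here reproduces the study's data or software.

```python
def linkage_disequilibrium(n_AB, n_Ab, n_aB, n_ab):
    """Return (D, r_squared) from phased haplotype counts at two biallelic loci.

    D = p(AB) - p(A) * p(B); r^2 = D^2 / (p(A) p(a) p(B) p(b)).
    Both loci must be polymorphic, otherwise r^2 is undefined.
    """
    total = n_AB + n_Ab + n_aB + n_ab
    p_AB = n_AB / total
    p_A = (n_AB + n_Ab) / total
    p_B = (n_AB + n_aB) / total
    d = p_AB - p_A * p_B
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denom == 0:
        raise ValueError("r^2 is undefined when a locus is monomorphic")
    return d, d * d / denom


# Made-up haplotype counts (100 chromosomes), for illustration only.
d, r2 = linkage_disequilibrium(n_AB=40, n_Ab=10, n_aB=10, n_ab=40)
print(f"D = {d:.2f}, r^2 = {r2:.2f}")
```

The explicit error for a monomorphic locus matters in this setting: where a variant shows no variation in a population, r² cannot be computed for that population at all.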
|
15
|
Yang CH, Weng ZJ, Chuang LY, Yang CS. Identification of SNP-SNP interaction for chronic dialysis patients. Comput Biol Med 2017; 83:94-101. [DOI: 10.1016/j.compbiomed.2017.02.004] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 10/10/2016] [Revised: 02/14/2017] [Accepted: 02/15/2017] [Indexed: 01/10/2023]
|
16
|
Weiner MW, Veitch DP, Aisen PS, Beckett LA, Cairns NJ, Green RC, Harvey D, Jack CR, Jagust W, Morris JC, Petersen RC, Saykin AJ, Shaw LM, Toga AW, Trojanowski JQ. Recent publications from the Alzheimer's Disease Neuroimaging Initiative: Reviewing progress toward improved AD clinical trials. Alzheimers Dement 2017; 13:e1-e85. [PMID: 28342697 PMCID: PMC6818723 DOI: 10.1016/j.jalz.2016.11.007] [Citation(s) in RCA: 182] [Impact Index Per Article: 22.8] [Received: 09/07/2016] [Revised: 11/21/2016] [Accepted: 11/28/2016] [Indexed: 01/31/2023]
Abstract
INTRODUCTION The Alzheimer's Disease Neuroimaging Initiative (ADNI) has continued development and standardization of methodologies for biomarkers and has provided an increased depth and breadth of data available to qualified researchers. This review summarizes the over 400 publications using ADNI data during 2014 and 2015. METHODS We used standard searches to find publications using ADNI data. RESULTS (1) Structural and functional changes, including subtle changes to hippocampal shape and texture, atrophy in areas outside of hippocampus, and disruption to functional networks, are detectable in presymptomatic subjects before hippocampal atrophy; (2) In subjects with abnormal β-amyloid deposition (Aβ+), biomarkers become abnormal in the order predicted by the amyloid cascade hypothesis; (3) Cognitive decline is more closely linked to tau than Aβ deposition; (4) Cerebrovascular risk factors may interact with Aβ to increase white-matter (WM) abnormalities which may accelerate Alzheimer's disease (AD) progression in conjunction with tau abnormalities; (5) Different patterns of atrophy are associated with impairment of memory and executive function and may underlie psychiatric symptoms; (6) Structural, functional, and metabolic network connectivities are disrupted as AD progresses. 
Models of prion-like spreading of Aβ pathology along WM tracts predict known patterns of cortical Aβ deposition and declines in glucose metabolism; (7) New AD risk and protective gene loci have been identified using biologically informed approaches; (8) Cognitively normal and mild cognitive impairment (MCI) subjects are heterogeneous and include groups typified not only by "classic" AD pathology but also by normal biomarkers, accelerated decline, and suspected non-Alzheimer's pathology; (9) Selection of subjects at risk of imminent decline on the basis of one or more pathologies improves the power of clinical trials; (10) Sensitivity of cognitive outcome measures to early changes in cognition has been improved and surrogate outcome measures using longitudinal structural magnetic resonance imaging may further reduce clinical trial cost and duration; (11) Advances in machine learning techniques such as neural networks have improved diagnostic and prognostic accuracy especially in challenges involving MCI subjects; and (12) Network connectivity measures and genetic variants show promise in multimodal classification and some classifiers using single modalities are rivaling multimodal classifiers. DISCUSSION Taken together, these studies fundamentally deepen our understanding of AD progression and its underlying genetic basis, which in turn informs and improves clinical trial design.
Affiliation(s)
- Michael W Weiner
  - Department of Veterans Affairs Medical Center, Center for Imaging of Neurodegenerative Diseases, San Francisco, CA, USA; Department of Radiology, University of California, San Francisco, CA, USA; Department of Medicine, University of California, San Francisco, CA, USA; Department of Psychiatry, University of California, San Francisco, CA, USA; Department of Neurology, University of California, San Francisco, CA, USA
- Dallas P Veitch
  - Department of Veterans Affairs Medical Center, Center for Imaging of Neurodegenerative Diseases, San Francisco, CA, USA
- Paul S Aisen
  - Alzheimer's Therapeutic Research Institute, University of Southern California, San Diego, CA, USA
- Laurel A Beckett
  - Division of Biostatistics, Department of Public Health Sciences, University of California, Davis, CA, USA
- Nigel J Cairns
  - Knight Alzheimer's Disease Research Center, Washington University School of Medicine, Saint Louis, MO, USA; Department of Neurology, Washington University School of Medicine, Saint Louis, MO, USA
- Robert C Green
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Danielle Harvey
  - Division of Biostatistics, Department of Public Health Sciences, University of California, Davis, CA, USA
- William Jagust
  - Helen Wills Neuroscience Institute, University of California Berkeley, Berkeley, CA, USA
- John C Morris
  - Alzheimer's Therapeutic Research Institute, University of Southern California, San Diego, CA, USA
- Andrew J Saykin
  - Department of Radiology and Imaging Sciences, Indiana University School of Medicine, Indianapolis, IN, USA; Department of Medical and Molecular Genetics, Indiana University School of Medicine, Indianapolis, IN, USA
- Leslie M Shaw
  - Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Arthur W Toga
  - Laboratory of Neuroimaging, Institute of Neuroimaging and Informatics, Keck School of Medicine of University of Southern California, Los Angeles, CA, USA
- John Q Trojanowski
  - Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA; Institute on Aging, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA; Alzheimer's Disease Core Center, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA; Udall Parkinson's Research Center, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
|
17
|
Li WX, Dai SX, Liu JQ, Wang Q, Li GH, Huang JF. Integrated Analysis of Alzheimer's Disease and Schizophrenia Dataset Revealed Different Expression Pattern in Learning and Memory. J Alzheimers Dis 2016; 51:417-25. [PMID: 26890750 DOI: 10.3233/jad-150807] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Indexed: 12/21/2022]
Abstract
Alzheimer's disease (AD) and schizophrenia (SZ) are both accompanied by impaired learning and memory functions. This study aims to compare the expression profiles of learning or memory genes in AD and SZ. We downloaded 10 AD and 10 SZ datasets from GEO-NCBI for integrated analysis. These datasets were processed using the RMA algorithm and a global renormalization across all studies. An empirical Bayes algorithm was then used to find the differentially expressed genes between patients and controls. The results showed that most of the differentially expressed genes were related to AD, whereas the gene expression profile was little affected in SZ. Furthermore, the expression of learning- or memory-related genes differed greatly between AD and SZ in the number of differentially expressed genes, the fold change, and the brain regions affected. In AD, CALB1, GABRA5, and TAC1 were significantly downregulated in whole brain, frontal lobe, temporal lobe, and hippocampus. In SZ, however, only two genes, CRHBP and CX3CR1, were downregulated in hippocampus, and other brain regions were not affected. The effect of these genes on learning or memory impairment has been widely studied, and they may play a crucial role in AD or SZ pathogenesis. The different gene expression patterns for learning and memory functions across brain regions revealed in our study may help to explain the different mechanisms of the two diseases.
Affiliation(s)
- Wen-Xing Li
  - Institute of Health Sciences, Anhui University, Hefei, Anhui, China; State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan, China
- Shao-Xing Dai
  - State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan, China
- Jia-Qian Liu
  - State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan, China
- Qian Wang
  - State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan, China
- Gong-Hua Li
  - State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan, China
- Jing-Fei Huang
  - State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan, China; KIZ-SU Joint Laboratory of Animal Models and Drug Development, College of Pharmaceutical Sciences, Soochow University, Kunming, Yunnan, China; Collaborative Innovation Center for Natural Products and Biological Drugs of Yunnan, Kunming, Yunnan, China
|
18
|
Mostafa Abd El Hamid M, Omar YM, Mabrouk MS. Identifying genetic biomarkers associated to Alzheimer's disease using Support Vector Machine. 2016 8TH CAIRO INTERNATIONAL BIOMEDICAL ENGINEERING CONFERENCE (CIBEC) 2016. [DOI: 10.1109/cibec.2016.7836087] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 09/01/2023]
|
19
|
Gogoshin G, Boerwinkle E, Rodin AS. New Algorithm and Software (BNOmics) for Inferring and Visualizing Bayesian Networks from Heterogeneous Big Biological and Genetic Data. J Comput Biol 2016; 24:340-356. [PMID: 27681505 PMCID: PMC5372779 DOI: 10.1089/cmb.2016.0100] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.0] [Indexed: 12/13/2022] Open
Abstract
Bayesian network (BN) reconstruction is a prototypical systems biology data analysis approach that has been successfully used to reverse engineer and model networks reflecting different layers of biological organization (ranging from genetic to epigenetic to cellular pathway to metabolomic). It is especially relevant in the context of modern (ongoing and prospective) studies that generate heterogeneous high-throughput omics datasets. However, there are both theoretical and practical obstacles to the seamless application of BN modeling to such big data, including computational inefficiency of optimal BN structure search algorithms, ambiguity in data discretization, mixing data types, imputation and validation, and, in general, limited scalability in both reconstruction and visualization of BNs. To overcome these and other obstacles, we present BNOmics, an improved algorithm and software toolkit for inferring and analyzing BNs from omics datasets. BNOmics aims at comprehensive systems biology-type data exploration, including both generating new biological hypotheses and testing and validating existing ones. Novel aspects of the algorithm center on increasing scalability and applicability to varying data types (with different explicit and implicit distributional assumptions) within the same analysis framework. An output and visualization interface to widely available graph-rendering software is also included. Three diverse applications are detailed. BNOmics was originally developed in the context of genetic epidemiology data and is being continuously optimized to keep pace with the ever-increasing inflow of available large-scale omics datasets. As such, software scalability and usability on less than exotic computer hardware are a priority, as is the applicability of the algorithm and software to heterogeneous datasets containing many data types: single-nucleotide polymorphisms and other genetic/epigenetic/transcriptome variables, metabolite levels, epidemiological variables, endpoints, phenotypes, etc.
Affiliation(s)
- Grigoriy Gogoshin
  - Diabetes and Metabolism Research Institute, City of Hope, Duarte, California
- Eric Boerwinkle
  - Human Genetics Center, School of Public Health, University of Texas Health Science Center, Houston, Texas; Institute of Molecular Medicine, University of Texas Health Science Center, Houston, Texas
- Andrei S Rodin
  - Diabetes and Metabolism Research Institute, City of Hope, Duarte, California
|