1
Ahmed H, Soliman H, Elmogy M. Early detection of Alzheimer's disease using single nucleotide polymorphisms analysis based on gradient boosting tree. Comput Biol Med 2022; 146:105622. [PMID: 35751201 DOI: 10.1016/j.compbiomed.2022.105622] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 05/10/2021] [Revised: 03/25/2022] [Accepted: 03/29/2022] [Indexed: 11/18/2022]
Abstract
Alzheimer's disease (AD) is a degenerative disorder that attacks nerve cells in the brain. AD leads to memory loss and to cognitive and intellectual impairments that can affect social activities and decision-making. Single nucleotide polymorphisms (SNPs) are the most common type of human genetic variation and are useful markers for complex gene-disease associations. Many common and serious diseases, including AD, have associated SNPs, so detecting SNP biomarkers linked with AD could help in the early prediction and diagnosis of the disease. The main objective of this paper is to predict and diagnose AD in its early stages from SNP biomarkers with high classification accuracy. A major obstacle is the high number of features. The paper therefore proposes a comprehensive framework for early AD detection and for identifying the most significant genes based on SNP analysis. It also suggests using machine learning (ML) techniques to identify new AD biomarkers. In the proposed system, two feature selection techniques, the information gain filter and the Boruta wrapper, are evaluated separately to select the most significant AD-related genes. Filter methods measure the relevance of features by their correlation with the dependent variable, while wrapper methods measure the usefulness of a subset of features by training a model on it. A gradient boosting tree (GBT) was applied to all AD genetic data from the Alzheimer's Disease Neuroimaging Initiative phase 1 (ADNI-1) and Whole-Genome Sequencing (WGS) datasets, with each of the two feature selection techniques. In the whole-genome approach on ADNI-1, the GBT learning algorithm achieved an overall accuracy of 99.06% with Boruta feature selection and an average accuracy of 94.87% with information gain feature selection. The results indicate that the proposed system is well suited to the early detection of AD and that the Boruta wrapper feature selection is superior to the information gain filter technique.
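The filter idea described in this abstract can be illustrated with a minimal information gain sketch. This is not the authors' pipeline: the SNP names, the 0/1/2 genotype coding, and the toy labels below are assumptions made up for illustration, and a real study would score many thousands of SNPs.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(feature, labels):
    """Entropy of the labels minus their entropy after grouping by feature value."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_features(columns, labels):
    """Return feature names sorted from most to least informative."""
    scores = {name: information_gain(col, labels) for name, col in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy data: genotypes coded as minor-allele counts (0, 1, 2);
# labels are 1 = case, 0 = control.
labels = [1, 1, 1, 0, 0, 0]
snps = {
    "snp_a": [2, 2, 1, 0, 0, 0],  # tracks the label closely
    "snp_b": [0, 1, 2, 0, 1, 2],  # unrelated to the label
}
ranking = rank_features(snps, labels)
```

A filter of this kind scores each feature independently of any classifier, which is what distinguishes it from a wrapper such as Boruta that judges features by training a model on them.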
Affiliation(s)
- Hala Ahmed: Information Technology Dept., Faculty of Computers and Information, Mansoura University, Mansoura, P.O. 35516, Egypt
- Hassan Soliman: Information Technology Dept., Faculty of Computers and Information, Mansoura University, Mansoura, P.O. 35516, Egypt
- Mohammed Elmogy: Information Technology Dept., Faculty of Computers and Information, Mansoura University, Mansoura, P.O. 35516, Egypt
2
Okada Y, Matsuyama T, Morita S, Ehara N, Miyamae N, Jo T, Sumida Y, Okada N, Watanabe M, Nozawa M, Tsuruoka A, Fujimoto Y, Okumura Y, Kitamura T, Iiduka R, Ohtsuru S. Machine learning-based prediction models for accidental hypothermia patients. J Intensive Care 2021; 9:6. [PMID: 33422146 PMCID: PMC7797142 DOI: 10.1186/s40560-021-00525-z] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 10/30/2020] [Accepted: 01/02/2021] [Indexed: 12/23/2022]
Abstract
Background: Accidental hypothermia is a critical condition with high risks of fatal arrhythmia, multiple organ failure, and mortality; however, there is no established model to predict mortality. The present study aimed to develop and validate machine learning-based models for predicting in-hospital mortality among patients with accidental hypothermia, using data easily available at hospital admission.
Method: This study was a secondary analysis of a multi-center retrospective cohort study (J-point registry) of patients with accidental hypothermia. Adult patients with a body temperature of 35.0 °C or less at the emergency department were included. Prediction models for in-hospital mortality using machine learning (lasso, random forest, and gradient boosting tree) were built in a development cohort from six hospitals, and their predictive performance was assessed in a validation cohort from six other hospitals. As a reference, we compared them with the SOFA score and the 5A score.
Results: We included a total of 532 patients: the development cohort [N = 288, six hospitals, in-hospital mortality 22.0% (64/288)] and the validation cohort [N = 244, six hospitals, in-hospital mortality 27.0% (66/244)]. The C-statistics [95% CI] of the models in the validation cohort were as follows: lasso 0.784 [0.717–0.851], random forest 0.794 [0.735–0.853], gradient boosting tree 0.780 [0.714–0.847], SOFA 0.787 [0.722–0.851], and 5A score 0.750 [0.681–0.820]. The calibration plot showed that these models were well calibrated to observed in-hospital mortality. Decision curve analysis indicated that these models obtained clinical net benefit.
Conclusion: This multi-center retrospective cohort study indicated that machine learning-based prediction models could accurately predict in-hospital mortality among accidental hypothermia patients in the validation cohort. These models might be able to support physicians' and patients' decision-making. However, their applicability to clinical settings and their actual clinical utility are still unclear; thus, a further prospective study is warranted to evaluate their clinical usefulness.
Supplementary Information: The online version contains supplementary material available at 10.1186/s40560-021-00525-z.
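The C-statistic reported above is the probability that a randomly chosen patient who died received a higher predicted risk than a randomly chosen survivor. A minimal sketch of that pairwise definition follows; the risk scores and outcomes are invented for illustration and are not taken from the J-point registry.

```python
def c_statistic(scores, outcomes):
    """Share of (event, non-event) pairs ranked correctly; ties count as half."""
    events = [s for s, y in zip(scores, outcomes) if y == 1]
    non_events = [s for s, y in zip(scores, outcomes) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                concordant += 1.0
            elif e == n:
                concordant += 0.5
    return concordant / (len(events) * len(non_events))


# Hypothetical predicted risks and in-hospital outcomes (1 = died, 0 = survived).
predicted_risk = [0.9, 0.7, 0.4, 0.6, 0.2, 0.1]
died = [1, 1, 1, 0, 0, 0]
auc = c_statistic(predicted_risk, died)
```

The double loop is quadratic in the cohort size, which is acceptable for a few hundred patients; a value of 0.5 corresponds to ranking no better than chance and 1.0 to perfect discrimination.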
Affiliation(s)
- Yohei Okada: Department of Primary Care and Emergency Medicine, Graduate School of Medicine, Kyoto University, Shogoin Kawaramachi 54, Sakyo, Kyoto, 606-8507, Japan; Preventive Services, School of Public Health, Kyoto University, Kyoto, Japan; Department of Emergency and Critical Care Medicine, Japanese Red Cross Society, Kyoto Daini Hospital, Kyoto, Japan
- Tasuku Matsuyama: Department of Emergency Medicine, Kyoto Prefectural University of Medicine, Kyoto, Japan
- Sachiko Morita: Senri Critical Care Medical Center, Saiseikai Senri Hospital, Suita, Japan
- Naoki Ehara: Department of Emergency, Japanese Red Cross Society, Kyoto Daiichi Red Cross Hospital, Kyoto, Japan
- Nobuhiro Miyamae: Department of Emergency Medicine, Rakuwa-kai Otowa Hospital, Kyoto, Japan
- Takaaki Jo: Department of Emergency Medicine, Uji-Tokushukai Medical Center, Uji, Japan
- Yasuyuki Sumida: Department of Emergency Medicine, North Medical Center, Kyoto Prefectural University of Medicine, Kyoto, Japan
- Nobunaga Okada: Department of Emergency Medicine, Kyoto Prefectural University of Medicine, Kyoto, Japan; Department of Emergency and Critical Care Medicine, National Hospital Organization, Kyoto Medical Center, Kyoto, Japan
- Makoto Watanabe: Department of Emergency Medicine, Kyoto Prefectural University of Medicine, Kyoto, Japan
- Masahiro Nozawa: Department of Emergency and Critical Care Medicine, Saiseikai Shiga Hospital, Ritto, Japan
- Ayumu Tsuruoka: Department of Emergency and Critical Care Medicine, Kyoto Min-Iren Chuo Hospital, Kyoto, Japan
- Yoshihiro Fujimoto: Department of Emergency Medicine, Yodogawa Christian Hospital, Osaka, Japan
- Yoshiki Okumura: Department of Emergency Medicine, Fukuchiyama City Hospital, Fukuchiyama, Japan
- Tetsuhisa Kitamura: Division of Environmental Medicine and Population Sciences, Department of Social and Environmental Medicine, Graduate School of Medicine, Osaka University, Osaka, Japan
- Ryoji Iiduka: Department of Emergency and Critical Care Medicine, Japanese Red Cross Society, Kyoto Daini Hospital, Kyoto, Japan
- Shigeru Ohtsuru: Department of Primary Care and Emergency Medicine, Graduate School of Medicine, Kyoto University, Shogoin Kawaramachi 54, Sakyo, Kyoto, 606-8507, Japan
3
Tong Y, Wang J, Zheng T, Zhang X, Xiao X, Zhu X, Lai X, Liu X. SETE: Sequence-based Ensemble learning approach for TCR Epitope binding prediction. Comput Biol Chem 2020; 87:107281. [PMID: 32623023 DOI: 10.1016/j.compbiolchem.2020.107281] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 03/31/2020] [Accepted: 05/09/2020] [Indexed: 11/30/2022]
Abstract
Predicting the binding of T cell receptors (TCRs) to epitopes plays a vital role in immunotherapy because it guides the development of therapeutic vaccines and cancer treatments. Many prediction methods have attempted to explain the relationship between TCR repertoires from different aspects, such as the V(D)J gene locus and the biophysical features of amino acid molecules, but extracting these features is time-consuming and the performance of these models is limited. Few studies have investigated how k-mers formed by adjacent amino acids in TCR sequences direct epitope recognition, and the specific mechanism of TCR epitope binding is still unclear. Motivated by these gaps, we present SETE (Sequence-based Ensemble learning approach for TCR Epitope binding prediction), a novel model for accurately predicting TCR epitope binding. The model decomposes the CDR3β sequence into short amino acid chains as features and learns their patterns across different TCR repertoires with a gradient boosting decision tree algorithm. Experiments demonstrate that SETE is helpful in predicting the epitopes corresponding to TCRs and that it outperforms other state-of-the-art methods in predicting the epitope specificity of TCRs on the VDJdb data set. The source code has been uploaded to https://github.com/wonanut/SETE for academic use only.
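The step of decomposing a CDR3β sequence into short amino acid chains can be sketched as k-mer counting. This is an illustrative sketch, not the code in the SETE repository: the choice of k = 3, the function names, and the two example sequences are assumptions made here for illustration.

```python
from collections import Counter


def kmers(sequence, k=3):
    """All overlapping substrings of length k, in order of appearance."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def kmer_count_matrix(sequences, k=3):
    """Build a sorted k-mer vocabulary and one count vector per sequence."""
    vocabulary = sorted({m for s in sequences for m in kmers(s, k)})
    index = {m: i for i, m in enumerate(vocabulary)}
    rows = []
    for s in sequences:
        row = [0] * len(vocabulary)
        for m, count in Counter(kmers(s, k)).items():
            row[index[m]] = count
        rows.append(row)
    return vocabulary, rows


# Made-up CDR3-like strings, used only to show the shape of the features.
cdr3_examples = ["CASSLG", "CASSPG"]
vocabulary, matrix = kmer_count_matrix(cdr3_examples, k=3)
```

Each row of the resulting matrix is a fixed-length feature vector, which is the form a gradient boosting decision tree classifier expects as input.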
Affiliation(s)
- Yao Tong: School of Computer Science and Technology, Xi'an Jiaotong University, Xi'an, 710049, China; Shaanxi Engineering Research Center of Medical and Health Big Data, Xi'an Jiaotong University, Xi'an, 710049, China
- Jiayin Wang: School of Computer Science and Technology, Xi'an Jiaotong University, Xi'an, 710049, China; Shaanxi Engineering Research Center of Medical and Health Big Data, Xi'an Jiaotong University, Xi'an, 710049, China
- Tian Zheng: School of Computer Science and Technology, Xi'an Jiaotong University, Xi'an, 710049, China; Shaanxi Engineering Research Center of Medical and Health Big Data, Xi'an Jiaotong University, Xi'an, 710049, China
- Xuanping Zhang: School of Computer Science and Technology, Xi'an Jiaotong University, Xi'an, 710049, China; Shaanxi Engineering Research Center of Medical and Health Big Data, Xi'an Jiaotong University, Xi'an, 710049, China
- Xiao Xiao: School of Computer Science and Technology, Xi'an Jiaotong University, Xi'an, 710049, China; Shaanxi Engineering Research Center of Medical and Health Big Data, Xi'an Jiaotong University, Xi'an, 710049, China
- Xiaoyan Zhu: School of Computer Science and Technology, Xi'an Jiaotong University, Xi'an, 710049, China; Shaanxi Engineering Research Center of Medical and Health Big Data, Xi'an Jiaotong University, Xi'an, 710049, China
- Xin Lai: School of Computer Science and Technology, Xi'an Jiaotong University, Xi'an, 710049, China; Shaanxi Engineering Research Center of Medical and Health Big Data, Xi'an Jiaotong University, Xi'an, 710049, China
- Xiang Liu: Department of Cardiothoracic Surgery, The Second Affiliated Hospital, University of South China, Hengyang, 421001, China
4
Abstract
Recently, persistent homology has had tremendous success in biomolecular data analysis. It works by examining the topological relationship or connectivity of a group of atoms in a molecule at a variety of scales, then rendering a family of topological representations of the molecule. However, persistent homology is rarely employed for the analysis of atomic properties, such as biomolecular flexibility analysis or B-factor prediction. This work introduces atom-specific persistent homology to provide a local, atomic-level representation of a molecule via a global topological tool. This is achieved through the construction of a pair of conjugated sets of atoms and the corresponding conjugated simplicial complexes and conjugated topological spaces. The difference between the topological invariants of the pair of conjugated sets is measured by the bottleneck and Wasserstein metrics and leads to an atom-specific topological representation of individual atomic properties in a molecule. Atom-specific topological features are integrated with various machine learning algorithms, including gradient boosting trees and convolutional neural networks, for protein thermal fluctuation analysis and B-factor prediction. Extensive numerical results indicate that the proposed method provides a powerful topological tool for analyzing and predicting localized information in complex macromolecules.
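For reference, the two metrics named in the abstract are commonly defined as follows for a pair of persistence diagrams. The notation here is an assumption, not taken from the paper: D_1 and D_2 are the diagrams (each augmented with the points of the diagonal so that unmatched features can be paired with it), and γ ranges over bijections between them.

```latex
d_B(D_1, D_2) = \inf_{\gamma : D_1 \to D_2} \; \sup_{x \in D_1} \lVert x - \gamma(x) \rVert_\infty ,
\qquad
W_p(D_1, D_2) = \left( \inf_{\gamma : D_1 \to D_2} \; \sum_{x \in D_1} \lVert x - \gamma(x) \rVert_\infty^{\,p} \right)^{1/p}
```

The bottleneck distance records only the worst-matched feature, while the Wasserstein distance accumulates the cost over all matched features, so the two respond differently to many small changes versus one large change.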
Affiliation(s)
- David Bramer: Department of Mathematics, Michigan State University, MI 48824, USA
- Guo-Wei Wei (corresponding author): Department of Mathematics, Michigan State University, MI 48824, USA; Department of Biochemistry and Molecular Biology, Michigan State University, MI 48824, USA; Department of Electrical and Computer Engineering, Michigan State University, MI 48824, USA
5
Park H, Haghani A, Samuel S, Knodler MA. Real-time prediction and avoidance of secondary crashes under unexpected traffic congestion. Accid Anal Prev 2018; 112:39-49. [PMID: 29306687 DOI: 10.1016/j.aap.2017.11.025] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 05/13/2017] [Revised: 10/15/2017] [Accepted: 11/18/2017] [Indexed: 06/07/2023]
Abstract
According to the Federal Highway Administration, nonrecurring congestion contributes to nearly half of overall congestion. Temporal disruptions reduce the effective use of the complete roadway because primary incidents cause speed reduction and rubbernecking, which in turn provoke secondary incidents. A secondary incident further reduces discharge flow, which significantly increases total delay. Therefore, it is important to sequentially predict the probability of secondary incidents and develop appropriate countermeasures to reduce the associated risk. Advanced computing techniques were used to easily understand and reliably predict secondary incident occurrences that have a low sample mean and a small sample size. The likelihood of a secondary incident was sequentially predicted from the point of incident response to the eventual road clearance. The quality of predictions improved with the availability of additional information. The prediction performance of the principled Bayesian learning approach to neural networks (BNN) was compared with that of Stochastic Gradient Boosted Decision Trees (GBDT). A pedagogical rule extraction approach, Trepan, which extracts comprehensible rules from the neural networks, improved the ability to understand secondary incidents in a simplified manner. With an acceptable accuracy, GBDT is a useful tool that presents the relative importance of the predictor variables. Unexpected traffic congestion incurred by an incident is a dominant causative factor for the occurrence of secondary incidents at different stages of incident clearance. This symbolic description represents a series of decisions that may assist emergency operators by improving their decision-making capabilities. Analyzing the causes and effects of traffic incidents helps traffic operators develop incident-specific strategic plans for prompt emergency response and clearance. Application of the model in connected vehicle environments will help drivers receive proactive corrective feedback before a crash. The proposed methodology can be used to alert drivers about potential highway conditions and may increase the drivers' awareness of potential events when no rerouting is possible, optimal or otherwise.
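How boosted trees yield a relative importance for each predictor can be sketched with depth-one trees (stumps) fitted to squared-error residuals. This is a simplified illustration, not the study's model: it omits the random subsampling that makes the method stochastic, treats the outcome as a number instead of a class, and uses invented feature values in which column 0 stands for a congestion measure and column 1 for an unrelated variable.

```python
def best_stump(X, residuals):
    """Single-feature threshold split with the lowest squared error, or None."""
    best = None
    for j in range(len(X[0])):
        values = sorted({row[j] for row in X})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            left_mean = sum(left) / len(left)
            right_mean = sum(right) / len(right)
            sse = (sum((r - left_mean) ** 2 for r in left)
                   + sum((r - right_mean) ** 2 for r in right))
            if best is None or sse < best[0]:
                best = (sse, j, t, left_mean, right_mean)
    return best


def fit_boosted_stumps(X, y, n_rounds=20, learning_rate=0.5):
    """Fit stumps to successive residuals; credit each split's error reduction."""
    base = sum(y) / len(y)
    predictions = [base] * len(y)
    stumps = []
    importance = [0.0] * len(X[0])
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, predictions)]
        stump = best_stump(X, residuals)
        if stump is None:  # no feature has two distinct values
            break
        sse, j, t, left_mean, right_mean = stump
        mean_r = sum(residuals) / len(residuals)
        importance[j] += sum((r - mean_r) ** 2 for r in residuals) - sse
        stumps.append((j, t, left_mean, right_mean))
        predictions = [p + learning_rate * (left_mean if row[j] <= t else right_mean)
                       for p, row in zip(predictions, X)]
    return {"base": base, "stumps": stumps, "importance": importance,
            "learning_rate": learning_rate}


def predict(model, row):
    """Sum the base value and the shrunken contribution of every stump."""
    total = model["base"]
    for j, t, left_mean, right_mean in model["stumps"]:
        total += model["learning_rate"] * (left_mean if row[j] <= t else right_mean)
    return total


# Hypothetical data: column 0 drives the outcome, column 1 does not.
X = [[0.1, 5], [0.2, 1], [0.8, 4], [0.9, 2]]
y = [0, 0, 1, 1]
model = fit_boosted_stumps(X, y)
```

Summing the squared-error reduction of every split made on a feature is one common way to define importance; on this toy data all of the credit goes to column 0 because no split on column 1 separates the outcomes.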
Affiliation(s)
- Hyoshin Park: Department of Computational Science & Engineering, North Carolina Agricultural & Technical State University, United States
- Ali Haghani: Department of Civil & Environmental Engineering, University of Maryland, College Park, United States
- Siby Samuel: Department of Mechanical & Industrial Engineering, University of Massachusetts, Amherst, United States
- Michael A Knodler: Department of Civil & Environmental Engineering, University of Massachusetts, Amherst, United States