1
|
Early Prediction for Prediabetes and Type 2 Diabetes Using the Genetic Risk Score and Oxidative Stress Score. Antioxidants (Basel) 2022; 11:antiox11061196. [PMID: 35740093 PMCID: PMC9231325 DOI: 10.3390/antiox11061196] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/26/2022] [Revised: 06/14/2022] [Accepted: 06/15/2022] [Indexed: 11/17/2022]
Abstract
We aimed to use a genetic risk score (GRS) constructed with prediabetes and type 2 diabetes-related single nucleotide polymorphisms (SNPs) and an oxidative stress score (OSS) to construct an early-prediction model for prediabetes and type 2 diabetes (T2DM) incidence in a Korean population. The study population included 549 prediabetes and T2DM patients and 1036 normal subjects. The GRS was constructed using six prediabetes and T2DM-related SNPs, and the OSS was composed of three recognized oxidative stress biomarkers. Among the nine SNPs, six showed significant associations with the incidence of prediabetes and T2DM. The GRS was profoundly associated with increased prediabetes and T2DM (OR = 1.946) compared with individual SNPs after adjusting for age, sex, and BMI. Each of the three oxidative stress biomarkers was markedly higher in the prediabetes and T2DM group than in the normal group, and the OSS was significantly associated with increased prediabetes and T2DM (OR = 2.270). When BMI was introduced to the model with the OSS and GRS, the area under the ROC curve improved (from 69.3% to 70.5%). We found that the prediction model composed of the OSS, GRS, and BMI showed a significant prediction ability for the incidence of prediabetes and T2DM.
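The two quantities this abstract relies on, a genetic risk score and the area under the ROC curve, can be illustrated with a minimal sketch. It assumes an unweighted GRS (a plain count of risk alleles across the six SNPs); the abstract does not say whether the authors weighted the alleles. The function names and the toy genotypes are hypothetical and are not data from the study.

```python
from typing import Sequence


def genetic_risk_score(genotypes: Sequence[int]) -> int:
    """Unweighted GRS (an assumption): total risk alleles, 0, 1, or 2 per SNP."""
    if any(g not in (0, 1, 2) for g in genotypes):
        raise ValueError("each genotype must be 0, 1, or 2 risk alleles")
    return sum(genotypes)


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the probability that a random case outscores a random control.

    Ties between a case and a control count as half a win.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = sum(
        1.0 if c > k else 0.5 if c == k else 0.0
        for c in cases
        for k in controls
    )
    return wins / (len(cases) * len(controls))


# Hypothetical toy data: six SNP genotypes per person; label 1 = prediabetes/T2DM.
people = [
    ([2, 1, 1, 2, 0, 1], 1),
    ([1, 1, 0, 1, 1, 0], 1),
    ([0, 1, 0, 0, 1, 0], 0),
    ([1, 0, 0, 1, 0, 0], 0),
    ([1, 1, 1, 0, 1, 0], 0),
]
grs = [genetic_risk_score(g) for g, _ in people]
labels = [y for _, y in people]
auc = roc_auc(grs, labels)
print(grs, round(auc, 3))
```

In the study the scores entered an adjusted regression model together with the OSS and BMI; the pairwise AUC above only shows how a single score is turned into a discrimination measure.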
|
2
|
Isik YE, Gormez Y, Aydin Z, Bakir-Gungor B. The Determination of Distinctive Single Nucleotide Polymorphism Sets for the Diagnosis of Behçet's Disease. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:1909-1918. [PMID: 33476272 DOI: 10.1109/tcbb.2021.3053429] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/12/2023]
Abstract
Behçet's Disease (BD) is a multi-system inflammatory disorder in which the etiology remains unclear. The most probable hypothesis is that genetic tendency and environmental factors play roles in the development of BD. In order to find the essential reasons, genetic changes on thousands of genes should be analyzed. Besides, there is a need for extra analysis to find out which genetic factor affects the disease. Machine learning approaches have high potential for extracting the knowledge from genomics and selecting the representative Single Nucleotide Polymorphisms (SNPs) as the most effective features for the clinical diagnosis process. In this study, we have attempted to identify representative SNPs using feature selection methods, incorporating biological information and aimed to develop a machine-learning model for diagnosing Behçet's disease. By combining biological information and machine learning classifiers, up to 99.64 percent accuracy of disease prediction is achieved using only 13,611 out of 311,459 SNPs. In addition, we revealed the SNPs that are most distinctive by performing repeated feature selection in cross-validation experiments.
|
3
|
Hatmal MM, Alshaer W, Mahmoud IS, Al-Hatamleh MAI, Al-Ameer HJ, Abuyaman O, Zihlif M, Mohamud R, Darras M, Al Shhab M, Abu-Raideh R, Ismail H, Al-Hamadi A, Abdelhay A. Investigating the association of CD36 gene polymorphisms (rs1761667 and rs1527483) with T2DM and dyslipidemia: Statistical analysis, machine learning based prediction, and meta-analysis. PLoS One 2021; 16:e0257857. [PMID: 34648514 PMCID: PMC8516279 DOI: 10.1371/journal.pone.0257857] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/14/2021] [Accepted: 09/11/2021] [Indexed: 12/15/2022]
Abstract
CD36 (cluster of differentiation 36) is a membrane protein involved in lipid metabolism and has been linked to pathological conditions associated with metabolic disorders, such as diabetes and dyslipidemia. A case-control study including 177 patients with type 2 diabetes mellitus (T2DM) and 173 control subjects was conducted to study the involvement of the CD36 gene polymorphisms rs1761667 (G>A) and rs1527483 (C>T) in the pathogenesis of T2DM and dyslipidemia in the Jordanian population. Lipid profile, blood sugar, gender, and age were measured and recorded, and genotyping analysis was performed for both polymorphisms. Following statistical analysis, 10 different neural networks and machine learning (ML) tools were used to predict subjects with diabetes or dyslipidemia. To further understand the role of the CD36 protein and gene in T2DM and dyslipidemia, a protein-protein interaction network analysis and a meta-analysis were carried out. For both polymorphisms, the genotypic frequencies were not significantly different between the two groups (p > 0.05). On the other hand, some ML tools, such as the multilayer perceptron, gave high prediction accuracy (≥ 0.75) and Cohen's kappa (κ) (≥ 0.5). Interestingly, with the K-star tool, the accuracy and Cohen's κ values improved when the genotyping results were included as inputs (0.73 and 0.46, respectively, compared with 0.67 and 0.34 without them). This study confirmed, for the first time, that there is no association between CD36 polymorphisms and T2DM or dyslipidemia in the Jordanian population. Prediction of T2DM and dyslipidemia using these extensive ML tools and based on such input data is a promising approach for developing diagnostic and prognostic prediction models for a wide spectrum of diseases, especially based on large medical databases.
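The abstract reports both accuracy and Cohen's κ, which corrects observed agreement for the agreement expected by chance. The following is a minimal sketch of the two metrics on hypothetical labels (not the study's data); the function names are illustrative only.

```python
from collections import Counter
from typing import Hashable, Sequence


def accuracy(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohens_kappa(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Cohen's kappa: (p_o - p_e) / (1 - p_e).

    p_o is the observed agreement (accuracy); p_e is the agreement expected
    by chance from the marginal label frequencies.
    """
    n = len(y_true)
    p_o = accuracy(y_true, y_pred)
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical labels: 1 = T2DM, 0 = control.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
acc = accuracy(y_true, y_pred)
kappa = cohens_kappa(y_true, y_pred)
print(acc, kappa)
```

With balanced classes, as in this toy case, chance agreement is 0.5, so an accuracy of 0.75 corresponds to κ = 0.5, which is why the two thresholds quoted in the abstract sit naturally together.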
Affiliation(s)
- Ma’mon M. Hatmal
  - Department of Medical Laboratory Sciences, Faculty of Applied Medical Sciences, The Hashemite University, Zarqa, Jordan
- Walhan Alshaer
  - Cell Therapy Centre, The University of Jordan, Amman, Jordan
- Ismail S. Mahmoud
  - Department of Medical Laboratory Sciences, Faculty of Applied Medical Sciences, The Hashemite University, Zarqa, Jordan
- Mohammad A. I. Al-Hatamleh
  - Department of Immunology, School of Medical Sciences, Universiti Sains Malaysia, Kubang Kerian, Kelantan, Malaysia
- Hamzeh J. Al-Ameer
  - Department of Biology and Biotechnology, American University of Madaba, Madaba, Jordan
  - Department of Pharmacology, Faculty of Medicine, The University of Jordan, Amman, Jordan
- Omar Abuyaman
  - Department of Medical Laboratory Sciences, Faculty of Applied Medical Sciences, The Hashemite University, Zarqa, Jordan
- Malek Zihlif
  - Department of Pharmacology, Faculty of Medicine, The University of Jordan, Amman, Jordan
- Rohimah Mohamud
  - Department of Immunology, School of Medical Sciences, Universiti Sains Malaysia, Kubang Kerian, Kelantan, Malaysia
- Mais Darras
  - Department of Medical Laboratory Sciences, Faculty of Applied Medical Sciences, The Hashemite University, Zarqa, Jordan
- Mohammad Al Shhab
  - Department of Pharmacology, Faculty of Medicine, The University of Jordan, Amman, Jordan
- Rand Abu-Raideh
  - Department of Medical Laboratory Sciences, Faculty of Applied Medical Sciences, The Hashemite University, Zarqa, Jordan
- Hilweh Ismail
  - Department of Medical Laboratory Sciences, Faculty of Applied Medical Sciences, The Hashemite University, Zarqa, Jordan
- Ali Al-Hamadi
  - Department of Medical Laboratory Sciences, Faculty of Applied Medical Sciences, The Hashemite University, Zarqa, Jordan
- Ali Abdelhay
  - Department of Pharmacology, Faculty of Medicine, The University of Jordan, Amman, Jordan
|
4
|
Gerber JE, Geller G, Boyce A, Maragakis LL, Garibaldi BT. Genomics in Patient Care and Workforce Decisions in High-Level Isolation Units: A Survey of Healthcare Workers. Health Secur 2021; 19:318-326. [PMID: 33826422 DOI: 10.1089/hs.2020.0182] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/13/2022]
Abstract
The impact of host genomics on an individual's susceptibility, immune response, and risk of severe outcomes for a given infectious pathogen is increasingly recognized. As we uncover the links between host genomics and infectious disease, a number of ethical, legal, and social issues need to be considered when using that information in clinical practice or workforce decisions. We conducted a survey of the clinical staff at 10 federally funded Regional Ebola and Other Special Pathogen Treatment Centers to understand their views regarding the ethical, legal, and social issues related to host genomics and the administrative and clinical functions of high-level isolation units. Respondents overwhelmingly agreed that genomics could provide valuable information to identify patients and employees at higher risk for poor outcomes from highly infectious diseases. However, there was considerable disagreement about whether such data should inform the allocation of scarce resources or determine treatment decisions. While most respondents supported a confidential employer-based genomic testing system to inform individual employees about risk, respondents disagreed about whether such information should be used in staffing models. Respondents who thought genomic information would be valuable for patient treatment were more willing to undergo genetic testing for staffing purposes. Most respondents felt they would benefit from additional training to better interpret results from genetic testing. Although this study was completed before the COVID-19 pandemic, the responses provide a baseline assessment of provider attitudes that can inform policy during the current pandemic and in future infectious disease outbreaks.
Affiliation(s)
- Jennifer E Gerber
- Gail Geller
- Angie Boyce
- Lisa L Maragakis
- Brian T Garibaldi

Jennifer E. Gerber, PhD, MSc, was a PhD Student and Graduate Research Assistant at the time of the study, Department of International Health, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD. She is now an Epidemiologist, RTI International, Washington, DC. Gail Geller, ScD, MHS, is a Professor, Department of Health, Behavior, and Society and Department of Health Policy and Management, Johns Hopkins Bloomberg School of Public Health; Professor, Department of Medicine, Johns Hopkins School of Medicine; and Director of Education Initiatives and Core Faculty, Berman Institute of Bioethics, Johns Hopkins University; all in Baltimore, MD. Angie Boyce, PhD, is a Science and Technology Policy Fellow, American Association for the Advancement of Science, Washington, DC. Lisa L. Maragakis, MD, MPH, is an Associate Professor of Medicine and Executive Director, Johns Hopkins Biocontainment Unit; and Brian T. Garibaldi, MD, MEHP, is an Associate Professor of Medicine, Physiology, and Informatics, Division of Pulmonary and Critical Care, and Director, Johns Hopkins Biocontainment Unit; both in the Johns Hopkins School of Medicine, Baltimore, MD. Lisa L. Maragakis is also Senior Director of Infection Prevention, The Johns Hopkins Health System, Baltimore, MD
|
5
|
Zhao W, Lai X, Liu D, Zhang Z, Ma P, Wang Q, Zhang Z, Pan Y. Applications of Support Vector Machine in Genomic Prediction in Pig and Maize Populations. Front Genet 2020; 11:598318. [PMID: 33343636 PMCID: PMC7744740 DOI: 10.3389/fgene.2020.598318] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 08/24/2020] [Accepted: 11/11/2020] [Indexed: 01/01/2023]
Abstract
Genomic prediction (GP) has revolutionized animal and plant breeding. However, better statistical models that can improve the accuracy of GP are required. For this reason, in this study, we explored the genomic-based prediction performance of a popular machine learning method, the Support Vector Machine (SVM) model. We selected the most suitable kernel function and hyperparameters for the SVM model in eight published genomic data sets on pigs and maize. Next, we compared the SVM model with the RBF and linear kernel functions to the two most commonly used genome-enabled prediction models (GBLUP and BayesR) in terms of prediction accuracy, time, and memory used. The results showed that the SVM model had the best prediction performance in two of the eight data sets, but in general, the predictions of both models were similar. In terms of time, the SVM model was better than BayesR but worse than GBLUP. In terms of memory, the SVM model was better than GBLUP and worse than BayesR in the pig data but the same as BayesR in the maize data. According to the results, SVM is a competitive method in animal and plant breeding, and there is no universal prediction model.
Affiliation(s)
- Wei Zhao
  - Department of Animal Science, School of Agriculture and Biology, Shanghai Jiao Tong University, Shanghai, China
- Xueshuang Lai
  - Department of Animal Science, School of Agriculture and Biology, Shanghai Jiao Tong University, Shanghai, China
- Dengying Liu
  - Department of Animal Science, School of Agriculture and Biology, Shanghai Jiao Tong University, Shanghai, China
- Zhenyang Zhang
  - Department of Animal Science, School of Agriculture and Biology, Shanghai Jiao Tong University, Shanghai, China
- Peipei Ma
  - Department of Animal Science, School of Agriculture and Biology, Shanghai Jiao Tong University, Shanghai, China
- Qishan Wang
  - Department of Animal Science, College of Animal Science, Zhejiang University, Hangzhou, China
- Zhe Zhang
  - Department of Animal Science, College of Animal Science, Zhejiang University, Hangzhou, China
- Yuchun Pan
  - Department of Animal Science, College of Animal Science, Zhejiang University, Hangzhou, China
|
6
|
Abstract
Purpose of review: Genetic, socioeconomic and clinical features vary considerably among individuals with type 2 diabetes (T2D), influencing disease development, progression and response to therapy. Although a patient-centred approach to pharmacologic therapy of T2D is widely recommended, patients are often treated similarly, irrespective of the differences that may affect therapeutic response. Addressing the heterogeneity of T2D is a major task of diabetes research to lower the high rate of treatment failure as well as to reduce the risk of long-term complications.
Recent findings: A pathophysiology-based clustering system seems the most promising to help in the stratification of diabetes in terms of complication risk and response to treatment. This calls for clinical studies looking at novel biomarkers that are related to the different metabolic pathways of T2D and can indicate the therapeutic cluster of each patient. Here, we review the main settings of diabetes heterogeneity, to what extent it has already been addressed, and the current gaps in knowledge towards a personalized therapeutic approach that considers the distinctive features of each patient.
Affiliation(s)
- Pieralice Silvia
  - Department of Medicine, Unit of Endocrinology and Diabetes, Campus Bio-Medico University of Rome, Via Alvaro del Portillo 21, 00128, Rome, Italy
- Zampetti Simona
  - Department of Experimental Medicine, Sapienza University, Viale Regina Elena 324, 00161, Rome, Italy
- Maddaloni Ernesto
  - Department of Experimental Medicine, Sapienza University, Viale Regina Elena 324, 00161, Rome, Italy
- Buzzetti Raffaella
  - Department of Experimental Medicine, Sapienza University, Viale Regina Elena 324, 00161, Rome, Italy
|
7
|
The Usage of Lasso, Ridge, and Linear Regression to Explore the Most Influential Metabolic Variables that Affect Fasting Blood Sugar in Type 2 Diabetes Patients. Romanian Journal of Diabetes Nutrition and Metabolic Diseases 2020. [DOI: 10.2478/rjdnmd-2019-0040] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 12/28/2022]
Abstract
Background and aims: To explore the most influential variables for fasting blood sugar (FBS) with three regression methods, to estimate the likelihood of type 2 diabetes from the influential variables with logistic regression (LR), and to compare the three regression methods by their Mean Squared Error (MSE).
Material and Methods: This cross-sectional study included 270 patients who had had type 2 diabetes for at least 6 months and 380 healthy people. Linear regression, Ridge regression, and Least Absolute Shrinkage and Selection Operator (Lasso) regression were used to find influential variables for FBS.
Results: Among 15 variables (8 metabolic, 7 characteristic), Lasso regression selected HbA1c, urea, age, BMI, heredity, and gender; Ridge regression selected HbA1c, heredity, gender, smoking status, and drug use; and Linear regression selected HbA1c as the most effective predictors of FBS.
Conclusion: HbA1c is the most influential predictor of FBS among the 15 variables according to all three regression methods. Controlling the variation of HbA1c leads to a more stable FBS. Besides FBS, which should be checked before breakfast, HbA1c may be helpful in the diagnosis of type 2 diabetes.
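How the three regression methods differ can be seen in a minimal single-predictor sketch. It assumes centered data with no intercept and the penalty conventions stated in the comments; the study itself used 15 predictors, so this is an illustration of the shrinkage behaviour, not the authors' analysis. The variable names and toy values are hypothetical.

```python
from typing import Sequence


def ols_coef(x: Sequence[float], y: Sequence[float]) -> float:
    """Least squares: minimizes 0.5 * sum((y - b*x)^2)."""
    sxx = sum(a * a for a in x)
    sxy = sum(a * b for a, b in zip(x, y))
    return sxy / sxx


def ridge_coef(x: Sequence[float], y: Sequence[float], lam: float) -> float:
    """Ridge: minimizes 0.5 * sum((y - b*x)^2) + 0.5 * lam * b^2."""
    sxx = sum(a * a for a in x)
    sxy = sum(a * b for a, b in zip(x, y))
    return sxy / (sxx + lam)


def lasso_coef(x: Sequence[float], y: Sequence[float], lam: float) -> float:
    """Lasso: minimizes 0.5 * sum((y - b*x)^2) + lam * |b| (soft thresholding)."""
    sxx = sum(a * a for a in x)
    sxy = sum(a * b for a, b in zip(x, y))
    if sxy > lam:
        return (sxy - lam) / sxx
    if sxy < -lam:
        return (sxy + lam) / sxx
    return 0.0  # the penalty removes the predictor entirely


def mse(x: Sequence[float], y: Sequence[float], b: float) -> float:
    """Mean squared error of the fit y ~ b * x."""
    return sum((yi - b * xi) ** 2 for xi, yi in zip(x, y)) / len(x)


# Hypothetical centered toy data (e.g. a standardized predictor vs. outcome).
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
y = [-1.9, -1.2, 0.1, 0.8, 2.2]

b_ols = ols_coef(x, y)
b_ridge = ridge_coef(x, y, lam=2.0)
b_lasso = lasso_coef(x, y, lam=2.0)
b_lasso_strong = lasso_coef(x, y, lam=20.0)
print(b_ols, b_ridge, b_lasso, b_lasso_strong)
```

Ridge shrinks the coefficient toward zero without reaching it, while a sufficiently strong Lasso penalty sets it exactly to zero. That is consistent with Lasso acting as a variable selector, though how the study derived a "selected" subset from Ridge is not stated in the abstract.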
|
8
|
Affiliation(s)
- Ian A Scott
  - Princess Alexandra Hospital, Woolloongabba, QLD, Australia
  - University of Queensland, Brisbane, QLD, Australia
- John Attia
  - University of Newcastle, Callaghan, NSW, Australia
  - John Hunter Hospital, Newcastle, NSW, Australia
- Ray Moynihan
  - Institute for Evidence-Based Healthcare, Bond University, Robina, QLD, Australia
  - Sydney Medical School-Public Health, University of Sydney, Sydney, NSW, Australia
|
9
|
Shigemizu D, Akiyama S, Asanomi Y, Boroevich KA, Sharma A, Tsunoda T, Sakurai T, Ozaki K, Ochiya T, Niida S. A comparison of machine learning classifiers for dementia with Lewy bodies using miRNA expression data. BMC Med Genomics 2019; 12:150. [PMID: 31666070 PMCID: PMC6822471 DOI: 10.1186/s12920-019-0607-3] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 05/06/2019] [Accepted: 10/18/2019] [Indexed: 12/21/2022]
Abstract
Background: Dementia with Lewy bodies (DLB) is the second most common subtype of neurodegenerative dementia in humans following Alzheimer’s disease (AD). Present clinical diagnosis of DLB has high specificity and low sensitivity and finding potential biomarkers of prodromal DLB is still challenging. MicroRNAs (miRNAs) have recently received a lot of attention as a source of novel biomarkers.
Methods: In this study, using serum miRNA expression of 478 Japanese individuals, we investigated potential miRNA biomarkers and constructed an optimal risk prediction model based on several machine learning methods: penalized regression, random forest, support vector machine, and gradient boosting decision tree.
Results: The final risk prediction model, constructed via a gradient boosting decision tree using 180 miRNAs and two clinical features, achieved an accuracy of 0.829 on an independent test set. We further predicted candidate target genes from the miRNAs. Gene set enrichment analysis of the miRNA target genes revealed 6 functional genes included in the DHA signaling pathway associated with DLB pathology. Two of them were further supported by gene-based association studies using a large number of single nucleotide polymorphism markers (BCL2L1: P = 0.012, PIK3R2: P = 0.021).
Conclusions: Our proposed prediction model provides an effective tool for DLB classification. Also, a gene-based association test of rare variants revealed that BCL2L1 and PIK3R2 were statistically significantly associated with DLB.
Affiliation(s)
- Daichi Shigemizu
  - Laboratory Chief, Division of Genomic Medicine, Medical Genome Center, National Center for Geriatrics and Gerontology, 7-430 Morioka-cho, Obu, Aichi, 474-8511, Japan
  - Department of Medical Science Mathematics, Medical Research Institute, Tokyo Medical and Dental University (TMDU), Tokyo, 113-8510, Japan
  - RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa, 230-0045, Japan
  - CREST, JST, Tokyo, 113-8510, Japan
- Shintaro Akiyama
  - Laboratory Chief, Division of Genomic Medicine, Medical Genome Center, National Center for Geriatrics and Gerontology, 7-430 Morioka-cho, Obu, Aichi, 474-8511, Japan
- Yuya Asanomi
  - Laboratory Chief, Division of Genomic Medicine, Medical Genome Center, National Center for Geriatrics and Gerontology, 7-430 Morioka-cho, Obu, Aichi, 474-8511, Japan
- Keith A Boroevich
  - RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa, 230-0045, Japan
- Alok Sharma
  - RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa, 230-0045, Japan
  - CREST, JST, Tokyo, 113-8510, Japan
  - School of Engineering & Physics, University of the South Pacific, Suva, Fiji
  - Institute for Integrated and Intelligent Systems, Griffith University, QLD, Brisbane, 4111, Australia
- Tatsuhiko Tsunoda
  - Department of Medical Science Mathematics, Medical Research Institute, Tokyo Medical and Dental University (TMDU), Tokyo, 113-8510, Japan
  - RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa, 230-0045, Japan
  - CREST, JST, Tokyo, 113-8510, Japan
- Takashi Sakurai
  - The Center for Comprehensive Care and Research on Memory Disorders, National Center for Geriatrics and Gerontology, Obu, Aichi, 474-8511, Japan
  - Department of Cognitive and Behavioral Science, Nagoya University Graduate School of Medicine, Nagoya, Aichi, 466-8550, Japan
- Kouichi Ozaki
  - Laboratory Chief, Division of Genomic Medicine, Medical Genome Center, National Center for Geriatrics and Gerontology, 7-430 Morioka-cho, Obu, Aichi, 474-8511, Japan
  - RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa, 230-0045, Japan
- Takahiro Ochiya
  - Division of Molecular and Cellular Medicine, Fundamental Innovative Oncology Core Center, National Cancer Center Research Institute, Tokyo, 104-0045, Japan
  - Institute of Medical Science, Tokyo Medical University, Tokyo, 160-8402, Japan
- Shumpei Niida
  - Laboratory Chief, Division of Genomic Medicine, Medical Genome Center, National Center for Geriatrics and Gerontology, 7-430 Morioka-cho, Obu, Aichi, 474-8511, Japan
|
10
|
Grinberg NF, Orhobor OI, King RD. An evaluation of machine-learning for predicting phenotype: studies in yeast, rice, and wheat. Mach Learn 2019; 109:251-277. [PMID: 32174648 PMCID: PMC7048706 DOI: 10.1007/s10994-019-05848-5] [Citation(s) in RCA: 47] [Impact Index Per Article: 9.4] [Received: 10/28/2015] [Revised: 09/17/2019] [Accepted: 09/19/2019] [Indexed: 11/01/2022]
Abstract
In phenotype prediction the physical characteristics of an organism are predicted from knowledge of its genotype and environment. Such studies, often called genome-wide association studies, are of the highest societal importance, as they are central to medicine, crop-breeding, etc. We investigated three phenotype prediction problems: one simple and clean (yeast), and the other two complex and real-world (rice and wheat). We compared standard machine learning methods, namely elastic net, ridge regression, lasso regression, random forest, gradient boosting machines (GBM), and support vector machines (SVM), with two state-of-the-art classical statistical genetics methods: genomic BLUP and a two-step sequential method based on linear regression. Additionally, using the clean yeast data, we investigated how performance varied with the complexity of the biological mechanism, the amount of observational noise, the number of examples, the amount of missing data, and the use of different data representations. We found that for almost all the phenotypes considered, standard machine learning methods outperformed the methods from classical statistical genetics. On the yeast problem, the most successful method was GBM, followed by lasso regression, and the two statistical genetics methods; with greater mechanistic complexity GBM was best, while in simpler cases lasso was superior. In the wheat and rice studies the best two methods were SVM and BLUP. The most robust method in the presence of noise, missing data, etc. was random forest. The classical statistical genetics method of genomic BLUP was found to perform well on problems where there was population structure. This suggests that standard machine learning methods need to be refined to include population structure information when this is present.
We conclude that the application of machine learning methods to phenotype prediction problems holds great promise, but that determining which methods are likely to perform well on any given problem is elusive and non-trivial.
Affiliation(s)
- Nastasiya F. Grinberg
  - School of Computer Science, University of Manchester, Oxford Road, Manchester, M13 9PL UK
  - Present Address: Department of Medicine, Cambridge Institute of Therapeutic Immunology & Infectious Disease, Jeffrey Cheah Biomedical Centre, Cambridge Biomedical Campus, University of Cambridge, Cambridge, CB2 0AW UK
- Ross D. King
  - Department of Biology and Biological Engineering, Division of Systems and Synthetic Biology, Chalmers University of Technology, Kemivägen 10, SE-412 96 Gothenburg, Sweden
|
11
|
Ho DSW, Schierding W, Wake M, Saffery R, O’Sullivan J. Machine Learning SNP Based Prediction for Precision Medicine. Front Genet 2019; 10:267. [PMID: 30972108 PMCID: PMC6445847 DOI: 10.3389/fgene.2019.00267] [Citation(s) in RCA: 102] [Impact Index Per Article: 20.4] [Received: 10/15/2018] [Accepted: 03/11/2019] [Indexed: 12/17/2022]
Abstract
In the past decade, precision genomics based medicine has emerged to provide tailored and effective healthcare for patients depending upon their genetic features. Genome Wide Association Studies have also identified population based risk genetic variants for common and complex diseases. In order to meet the full promise of precision medicine, research is attempting to leverage our increasing genomic understanding and further develop personalized medical healthcare through ever more accurate disease risk prediction models. Polygenic risk scoring and machine learning are two primary approaches for disease risk prediction. Despite recent improvements, the results of polygenic risk scoring remain limited due to the approaches that are currently used. By contrast, machine learning algorithms have increased predictive abilities for complex disease risk. This increase in predictive abilities results from the ability of machine learning algorithms to handle multi-dimensional data. Here, we provide an overview of polygenic risk scoring and machine learning in complex disease risk prediction. We highlight recent machine learning application developments and describe how machine learning approaches can lead to improved complex disease prediction, which will help to incorporate genetic features into future personalized healthcare. Finally, we discuss how the future application of machine learning prediction models might help manage complex disease by providing tissue-specific targets for customized, preventive interventions.
Affiliation(s)
- Melissa Wake
- Murdoch Children Research Institute, Melbourne, VIC, Australia
- Richard Saffery
- Murdoch Children Research Institute, Melbourne, VIC, Australia
|
12
|
Shigemizu D, Akiyama S, Asanomi Y, Boroevich KA, Sharma A, Tsunoda T, Matsukuma K, Ichikawa M, Sudo H, Takizawa S, Sakurai T, Ozaki K, Ochiya T, Niida S. Risk prediction models for dementia constructed by supervised principal component analysis using miRNA expression data. Commun Biol 2019; 2:77. [PMID: 30820472 PMCID: PMC6389908 DOI: 10.1038/s42003-019-0324-7] [Citation(s) in RCA: 39] [Impact Index Per Article: 7.8] [Received: 05/21/2018] [Accepted: 01/24/2019] [Indexed: 02/07/2023] Open
Abstract
Alzheimer's disease (AD) is the most common subtype of dementia, followed by Vascular Dementia (VaD), and Dementia with Lewy Bodies (DLB). Recently, microRNAs (miRNAs) have received a lot of attention as the novel biomarkers for dementia. Here, using serum miRNA expression of 1,601 Japanese individuals, we investigated potential miRNA biomarkers and constructed risk prediction models, based on a supervised principal component analysis (PCA) logistic regression method, according to the subtype of dementia. The final risk prediction model achieved a high accuracy of 0.873 on a validation cohort in AD, when using 78 miRNAs: Accuracy = 0.836 with 86 miRNAs in VaD; Accuracy = 0.825 with 110 miRNAs in DLB. To our knowledge, this is the first report applying miRNA-based risk prediction models to a dementia prospective cohort. Our study demonstrates our models to be effective in prospective disease risk prediction, and with further improvement may contribute to practical clinical use in dementia.
Affiliation(s)
- Daichi Shigemizu
- Medical Genome Center, National Center for Geriatrics and Gerontology, Obu, Aichi, 474-8511, Japan; Department of Medical Science Mathematics, Medical Research Institute, Tokyo Medical and Dental University (TMDU), Tokyo, 113-8510, Japan; RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa, 230-0045, Japan; CREST, JST, Tokyo, 102-8666, Japan
- Shintaro Akiyama
- Medical Genome Center, National Center for Geriatrics and Gerontology, Obu, Aichi, 474-8511, Japan
- Yuya Asanomi
- Medical Genome Center, National Center for Geriatrics and Gerontology, Obu, Aichi, 474-8511, Japan
- Keith A Boroevich
- RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa, 230-0045, Japan
- Alok Sharma
- RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa, 230-0045, Japan; CREST, JST, Tokyo, 102-8666, Japan; School of Engineering & Physics, University of the South Pacific, Suva, Fiji; Institute for Integrated and Intelligent Systems, Griffith University, Brisbane, QLD, 4111, Australia
- Tatsuhiko Tsunoda
- Department of Medical Science Mathematics, Medical Research Institute, Tokyo Medical and Dental University (TMDU), Tokyo, 113-8510, Japan; RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa, 230-0045, Japan; CREST, JST, Tokyo, 102-8666, Japan
- Kana Matsukuma
- Toray Industries, Inc., Kamakura, Kanagawa, 248-0036, Japan
- Hiroko Sudo
- Toray Industries, Inc., Kamakura, Kanagawa, 248-0036, Japan
- Takashi Sakurai
- The Center for Comprehensive Care and Research on Memory Disorders, National Center for Geriatrics and Gerontology, Obu, Aichi, 474-8511, Japan; Department of Cognitive and Behavioral Science, Nagoya University Graduate School of Medicine, Nagoya, Aichi, 466-8550, Japan
- Kouichi Ozaki
- Medical Genome Center, National Center for Geriatrics and Gerontology, Obu, Aichi, 474-8511, Japan; RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa, 230-0045, Japan
- Takahiro Ochiya
- Division of Molecular and Cellular Medicine, Fundamental Innovative Oncology Core Center, National Cancer Center Research Institute, Tokyo, 104-0045, Japan; Institute of Medical Science, Tokyo Medical University, Tokyo, 160-8402, Japan
- Shumpei Niida
- Medical Genome Center, National Center for Geriatrics and Gerontology, Obu, Aichi, 474-8511, Japan
|
13
|
Valdés MG, Galván-Femenía I, Ripoll VR, Duran X, Yokota J, Gavaldà R, Rafael-Palou X, de Cid R. Pipeline design to identify key features and classify the chemotherapy response on lung cancer patients using large-scale genetic data. BMC SYSTEMS BIOLOGY 2018; 12:97. [PMID: 30458782 PMCID: PMC6245589 DOI: 10.1186/s12918-018-0615-5] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 01/14/2023]
Abstract
BACKGROUND During the last decade, interest in applying machine learning algorithms to genomic data has increased in many bioinformatics applications. Analyzing this type of data entails difficulties in managing high-dimensional data, handling class imbalance for knowledge extraction, identifying important features and classifying individuals. In this study, we propose a general framework to tackle these challenges with different machine learning algorithms and techniques. We apply the configuration of this framework on lung cancer patients, identifying genetic signatures for classifying response to drug treatment. We intersect these relevant SNPs with the GWAS Catalog of the National Human Genome Research Institute and explore the RegulomeDB and GTEx databases for functional analysis purposes. RESULTS The machine learning based solution proposed in this study is a scalable and flexible alternative to the classical uni-variate regression approach to analyze large-scale data. From 36 experiments executed using the machine learning framework design, we obtain good classification performance from the top 5 models with the highest cross-validation score and the smallest standard deviation. One thousand two hundred twenty four SNPs corresponding to the key features from the top 20 models (cross validation F1 mean >= 0.65) were compared with the GWAS Catalog, finding no intersection with genome-wide significant reported hits. From these, new genetic signatures in MAE, CEP104, PRKCZ and ADRB2 show relevant biological regulatory functionality related to lung physiology. CONCLUSIONS We have defined a machine learning framework using an unbalanced large data-set of SNP-arrays and imputed genotyping data from a pharmacogenomics study in lung cancer patients subjected to first-line platinum-based treatment. This approach found genome signals with no genome-wide significance in the uni-variate regression approach (GWAS Catalog) that are valuable for classifying patients, only a few of them with related biological function. The effects of these variants can be explained by the recently proposed omnigenic model hypothesis, which states that complex traits can be influenced not only by the "core genes", mainly found by the genome-wide significant SNPs, but also by the rest of the genes outside of the "core pathways" with apparently unrelated biological functionality.
Affiliation(s)
- María Gabriela Valdés
- Eurecat. Technology Centre of Catalonia, Av. Diagonal 177, 9th floor, Barcelona, 08018 Spain
- Iván Galván-Femenía
- PMPPC-IGTP. Programa de Medicina Predictiva i Personalitzada del Càncer - Institut Germans Trias i Pujol (IGTP). Genomes for Life - GCAT lab Group, Badalona, Spain
- Vicent Ribas Ripoll
- Eurecat. Technology Centre of Catalonia, Av. Diagonal 177, 9th floor, Barcelona, 08018 Spain
- Xavier Duran
- PMPPC-IGTP. Programa de Medicina Predictiva i Personalitzada del Càncer - Institut Germans Trias i Pujol (IGTP). Genomes for Life - GCAT lab Group, Badalona, Spain
- Jun Yokota
- PMPPC-IGTP. Programa de Medicina Predictiva i Personalitzada del Càncer - Institut Germans Trias i Pujol (IGTP). CancerGenome Biology, Badalona, Spain
- Ricard Gavaldà
- Universitat Politècnica de Catalunya, Barcelona, Spain
- Barcelona Graduate School of Mathematics, BGSMath, Barcelona, Spain
- Xavier Rafael-Palou
- Eurecat. Technology Centre of Catalonia, Av. Diagonal 177, 9th floor, Barcelona, 08018 Spain
- Rafael de Cid
- PMPPC-IGTP. Programa de Medicina Predictiva i Personalitzada del Càncer - Institut Germans Trias i Pujol (IGTP). Genomes for Life - GCAT lab Group, Badalona, Spain
|
14
|
Langenberg C, Lotta LA. Genomic insights into the causes of type 2 diabetes. Lancet 2018; 391:2463-2474. [PMID: 29916387 DOI: 10.1016/s0140-6736(18)31132-2] [Citation(s) in RCA: 98] [Impact Index Per Article: 16.3] [Received: 03/06/2018] [Revised: 04/30/2018] [Accepted: 05/15/2018] [Indexed: 01/05/2023]
Abstract
Genome-wide association studies have implicated around 250 genomic regions in predisposition to type 2 diabetes, with evidence for causal variants and genes emerging for several of these regions. Understanding of the underlying mechanisms, including the interplay between β-cell failure, insulin sensitivity, appetite regulation, and adipose storage has been facilitated by the integration of multidimensional data for diabetes-related intermediate phenotypes, detailed genomic annotations, functional experiments, and now multiomic molecular features. Studies in diverse ethnic groups and examples from population isolates have shown the value and need for a broad genomic approach to this global disease. Transethnic discovery efforts and large-scale biobanks in diverse populations and ancestries could help to address some of the Eurocentric bias. Despite rapid progress in the discovery of the highly polygenic architecture of type 2 diabetes, dominated by common alleles with small, cumulative effects on disease risk, these insights have been of little clinical use in terms of disease prediction or prevention, and have made only small contributions to subtype classification or stratified approaches to treatment. Successful development of academia-industry partnerships for exome or genome sequencing in large biobanks could help to deliver economies of scale, with implications for the future of genomics-focused research.
Affiliation(s)
- Luca A Lotta
- MRC Epidemiology Unit, University of Cambridge, Cambridge, UK
|
15
|
Genetic risk score of common genetic variants for impaired fasting glucose and newly diagnosed type 2 diabetes influences oxidative stress. Sci Rep 2018; 8:7828. [PMID: 29777116 PMCID: PMC5959868 DOI: 10.1038/s41598-018-26106-z] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 01/05/2018] [Accepted: 05/02/2018] [Indexed: 12/11/2022] Open
Abstract
We tested the hypothesis that the cumulative effects of common genetic variants related to elevated fasting glucose are collectively associated with oxidative stress. Using 25 single nucleotide polymorphisms (SNPs), a weighted genetic risk score (wGRS) was constructed by summing nine risk alleles based on nominal significance and a consistent effect direction in 1,395 controls and 718 patients with impaired fasting glucose (IFG) or newly diagnosed type 2 diabetes. All the participants were divided into the following three groups: low-wGRS, middle-wGRS, and high-wGRS groups. Among the nine SNPs, five SNPs were significantly associated with IFG and type 2 diabetes in this Korean population. wGRS was significantly associated with increased IFG and newly diagnosed type 2 diabetes (p = 6.83 × 10^-14, odds ratio = 1.839) after adjusting for confounding factors. Among the IFG and type 2 diabetes patients, the fasting serum glucose and HbA1c levels were significantly higher in the high-wGRS group than in the other groups. The urinary 8-epi-PGF2α and malondialdehyde concentrations were significantly higher in the high-wGRS group than in the other groups. Moreover, general population-level instrumental variable estimation (using wGRS as an instrument) strengthened the causal effect regarding the largely adverse influence of high levels of fasting serum glucose on markers of oxidative stress in the Korean population. Thus, the combination of common genetic variants with small effects on IFG and newly diagnosed type 2 diabetes is significantly associated with oxidative stress.
|
16
|
Lee JS, Cheong HS, Shin HD. Prediction of cholesterol ratios within a Korean population. ROYAL SOCIETY OPEN SCIENCE 2018; 5:171204. [PMID: 29410832 PMCID: PMC5792909 DOI: 10.1098/rsos.171204] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/23/2017] [Accepted: 11/29/2017] [Indexed: 06/08/2023]
Abstract
Cholesterol ratios (total cholesterol (TC)/high-density lipoprotein cholesterol (HDL-c) and triglyceride (TG)/HDL-c) have been suggested as better indicators to predict various clinical features such as insulin resistance and heart disease. Therefore, we aimed to build a single nucleotide polymorphism (SNP) set to predict constitutional lipid metabolism. The genotype data of 7795 samples were obtained from the Korea Association Resource. Among the total of 7795 samples, 7016 subjects were used to perform 10-fold cross-validation. We selected the SNPs that showed significance constantly throughout all 10 cross-validation sets; another 779 samples were used as the final validation set. After performing the 10-fold cross-validation, the six SNPs (rs4420638 (APOC1), rs12421652 (BUD13), rs17411126 (LPL), rs6589566 (ZPR1), rs16940212 (LOC101928635) and rs10852765 (ABCA8)) were finally selected for predicting cholesterol ratios. The weighted genetic risk scores (wGRS) were calculated based on the regression slopes of the six selected SNPs. Our results showed upward trends of wGRS for both the TC/HDL-c and TG/HDL-c ratios within the 10-fold cross-validation. Similarly, the wGRS of the six SNPs also showed upward trends in analyses using the SNP selection set and final validation set. The selected six SNPs can be used to explain both the TC/HDL-c and TG/HDL-c ratios. Our results may be useful for the prospective predictions of cholesterol-related diseases.
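The weighted genetic risk score described in this abstract can be sketched as a weighted sum of risk-allele counts. The following is a minimal illustration only, not the authors' code: the function name `weighted_grs`, the three-SNP genotype, and the weights are hypothetical placeholders, and the weights stand in for regression slopes that the abstract does not report.

```python
from typing import Sequence


def weighted_grs(allele_counts: Sequence[int], weights: Sequence[float]) -> float:
    """Return the sum of risk-allele counts (0, 1 or 2 per SNP), each scaled by its weight."""
    if len(allele_counts) != len(weights):
        raise ValueError("need exactly one weight per SNP")
    if any(count not in (0, 1, 2) for count in allele_counts):
        raise ValueError("risk-allele counts must be 0, 1 or 2")
    return sum(count * weight for count, weight in zip(allele_counts, weights))


# Hypothetical per-SNP weights (stand-ins for regression slopes); not values from the study.
example_weights = [0.5, 0.25, 0.125]
# Hypothetical genotype: two risk alleles at SNP 1, none at SNP 2, one at SNP 3.
example_genotype = [2, 0, 1]

score = weighted_grs(example_genotype, example_weights)
print(score)
```

An unweighted score is the special case in which every weight equals 1, so the result is simply the total number of risk alleles carried.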
Affiliation(s)
- Jin Sol Lee
- Department of Life Science, Sogang University, Baekbumro 35, Mapo-gu, Seoul 04107, Republic of Korea
- Research Institute for Basic Science, Sogang University, Mapo-gu, Seoul, 121-742, Republic of Korea
- Hyun Sub Cheong
- Department of Genetic Epidemiology, SNP Genetics, Inc., Taihard building 1007, Sogang University, Baekbumro 35, Mapo-gu, Seoul, Republic of Korea
- Hyoung Doo Shin
- Department of Life Science, Sogang University, Baekbumro 35, Mapo-gu, Seoul 04107, Republic of Korea
- Research Institute for Basic Science, Sogang University, Mapo-gu, Seoul, 121-742, Republic of Korea
- Department of Genetic Epidemiology, SNP Genetics, Inc., Taihard building 1007, Sogang University, Baekbumro 35, Mapo-gu, Seoul, Republic of Korea
|
17
|
Gupta V, Walia GK, Sachdeva MP. 'Mendelian randomization': an approach for exploring causal relations in epidemiology. Public Health 2017; 145:113-119. [PMID: 28359378 DOI: 10.1016/j.puhe.2016.12.033] [Citation(s) in RCA: 44] [Impact Index Per Article: 6.3] [Received: 07/19/2016] [Revised: 12/01/2016] [Accepted: 12/20/2016] [Indexed: 11/17/2022]
Abstract
OBJECTIVES To assess the current status of Mendelian randomization (MR) approach in effectively influencing the observational epidemiology for examining causal relationships. METHODS Narrative review on studies related to principle, strengths, limitations, and achievements of MR approach. RESULTS Observational epidemiological studies have repeatedly produced several beneficiary associations which were discarded when tested by standard randomized controlled trials (RCTs). The technique which is more feasible, highly similar to RCTs, and has the potential to establish a causal relationship between modifiable exposures and disease outcomes is known as MR. The technique uses genetic variants related to modifiable traits/exposures as instruments for detecting causal and directional associations with outcomes. CONCLUSIONS In the last decade, the approach of MR has methodologically developed and progressed to a stage of high acceptance among the epidemiologists and is gradually expanding the landscape of causal relationships in non-communicable chronic diseases.
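One common way to use a genetic variant as an instrument is the single-variant Wald ratio, which divides the variant-outcome association by the variant-exposure association. The sketch below is an assumption added for illustration, since the abstract does not name a specific estimator; the function name `wald_ratio`, the first-order standard error, and all input values are hypothetical.

```python
from typing import Tuple


def wald_ratio(beta_gx: float, beta_gy: float, se_gy: float) -> Tuple[float, float]:
    """Estimate the effect of an exposure on an outcome from one genetic instrument.

    beta_gx: association of the variant with the exposure
    beta_gy: association of the variant with the outcome
    se_gy:   standard error of beta_gy
    Returns the ratio estimate and a first-order standard error that
    ignores uncertainty in beta_gx.
    """
    if beta_gx == 0:
        raise ValueError("the variant must be associated with the exposure")
    estimate = beta_gy / beta_gx
    standard_error = se_gy / abs(beta_gx)
    return estimate, standard_error


# Hypothetical summary statistics, chosen only to make the arithmetic easy to follow.
estimate, standard_error = wald_ratio(beta_gx=0.5, beta_gy=0.25, se_gy=0.1)
print(estimate, standard_error)
```

The ratio is only interpretable as a causal effect under the usual instrumental-variable assumptions, for example that the variant affects the outcome solely through the exposure.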
Affiliation(s)
- V Gupta
- Department of Anthropology, University of Delhi, Delhi 110007, India
- G K Walia
- Public Health Foundation of India, Gurgaon 122002, India
- M P Sachdeva
- Department of Anthropology, University of Delhi, Delhi 110007, India
|
18
|
McGeachie MJ, Clemmer GL, Croteau-Chonka DC, Castaldi PJ, Cho MH, Sordillo JE, Lasky-Su JA, Raby BA, Tantisira KG, Weiss ST. Whole genome prediction and heritability of childhood asthma phenotypes. IMMUNITY INFLAMMATION AND DISEASE 2016; 4:487-496. [PMID: 27980782 PMCID: PMC5134727 DOI: 10.1002/iid3.133] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 07/13/2016] [Revised: 09/01/2016] [Accepted: 09/04/2016] [Indexed: 01/19/2023]
Abstract
Introduction While whole genome prediction (WGP) methods have recently demonstrated successes in the prediction of complex genetic diseases, they have not yet been applied to asthma and related phenotypes. Longitudinal patterns of lung function differ between asthmatics, but these phenotypes have not been assessed for heritability or predictive ability. Herein, we assess the heritability and genetic predictability of asthma‐related phenotypes. Methods We applied several WGP methods to a well‐phenotyped cohort of 832 children with mild‐to‐moderate asthma from CAMP. We assessed narrow‐sense heritability and predictability for airway hyperresponsiveness, serum immunoglobulin E, blood eosinophil count, pre‐ and post‐bronchodilator forced expiratory volume in 1 sec (FEV1), bronchodilator response, steroid responsiveness, and longitudinal patterns of lung function (normal growth, reduced growth, early decline, and their combinations). Prediction accuracy was evaluated using a training/testing set split of the cohort. Results We found that longitudinal lung function phenotypes demonstrated significant narrow‐sense heritability (reduced growth, 95%; normal growth with early decline, 55%). These same phenotypes also showed significant polygenic prediction (areas under the curve [AUCs] 56% to 62%). Including additional demographic covariates in the models increased prediction 4–8%, with reduced growth increasing from 62% to 66% AUC. We found that prediction with a genomic relatedness matrix was improved by filtering available SNPs based on chromatin evidence, and this result extended across cohorts. Conclusions Longitudinal reduced lung function growth displayed extremely high heritability. All phenotypes with significant heritability showed significant polygenic prediction. Using SNP‐prioritization increased prediction across cohorts. WGP methods show promise in predicting asthma‐related heritable traits.
Affiliation(s)
- Michael J McGeachie
- Channing Division of Network Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts
- George L Clemmer
- Channing Division of Network Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts
- Damien C Croteau-Chonka
- Channing Division of Network Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts
- Peter J Castaldi
- Channing Division of Network Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts
- Michael H Cho
- Channing Division of Network Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts
- Joanne E Sordillo
- Channing Division of Network Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts
- Jessica A Lasky-Su
- Channing Division of Network Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts
- Benjamin A Raby
- Channing Division of Network Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts
- Kelan G Tantisira
- Channing Division of Network Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts
- Scott T Weiss
- Channing Division of Network Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts
|
19
|
Genetic-risk assessment of GWAS-derived susceptibility loci for type 2 diabetes in a 10 year follow-up of a population-based cohort study. J Hum Genet 2016; 61:1009-1012. [DOI: 10.1038/jhg.2016.93] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Received: 03/16/2016] [Revised: 06/02/2016] [Accepted: 06/09/2016] [Indexed: 12/22/2022]
|
20
|
Wang X, Strizich G, Hu Y, Wang T, Kaplan RC, Qi Q. Genetic markers of type 2 diabetes: Progress in genome-wide association studies and clinical application for risk prediction. J Diabetes 2016; 8:24-35. [PMID: 26119161 DOI: 10.1111/1753-0407.12323] [Citation(s) in RCA: 48] [Impact Index Per Article: 6.0] [Received: 04/01/2015] [Revised: 05/22/2015] [Accepted: 06/16/2015] [Indexed: 12/18/2022] Open
Abstract
Type 2 diabetes (T2D) has become a leading public health challenge worldwide. To date, a total of 83 susceptibility loci for T2D have been identified by genome-wide association studies (GWAS). Application of meta-analysis and modern genotype imputation approaches to GWAS data from diverse ethnic populations has been key in the effort to discover T2D loci. Genetic information is expected to play a vital role in the prediction of T2D, and many efforts have been made to develop T2D risk models that include both conventional and genetic risk factors. Yet, because most T2D genetic variants identified have small effect size individually (10%-20% increased risk of T2D per risk allele), their clinical utility remains unclear. Most studies report that a genetic risk score combining multiple T2D genetic variants does not substantially improve T2D risk prediction beyond conventional risk factors. In this article, we summarize the recent progress of T2D GWAS and further review the incremental predictive performance of genetic markers for T2D.
Affiliation(s)
- Xueyin Wang
- Department of Epidemiology and Population Health, Albert Einstein College of Medicine, Bronx, New York, USA
- Department of Epidemiology and Biostatistics, School of Public Health, Peking University Health Science Center, Beijing, China
- Garrett Strizich
- Department of Epidemiology and Population Health, Albert Einstein College of Medicine, Bronx, New York, USA
- Yonghua Hu
- Department of Epidemiology and Biostatistics, School of Public Health, Peking University Health Science Center, Beijing, China
- Tao Wang
- Department of Epidemiology and Population Health, Albert Einstein College of Medicine, Bronx, New York, USA
- Robert C Kaplan
- Department of Epidemiology and Population Health, Albert Einstein College of Medicine, Bronx, New York, USA
- Qibin Qi
- Department of Epidemiology and Population Health, Albert Einstein College of Medicine, Bronx, New York, USA
|
21
|
Abstract
Metabolomics is a promising approach for the identification of chemical compounds that serve for early detection, diagnosis, prediction of therapeutic response and prognosis of disease. Moreover, metabolomics has been shown to increase the diagnostic threshold and prediction of type 2 diabetes. Evidence suggests that branched-chain amino acids, acylcarnitines and aromatic amino acids may play an early role in insulin resistance, exposing defects in amino acid metabolism, β-oxidation, and the tricarboxylic acid cycle. This review aims to provide a panoramic view of the metabolic shifts that antecede or follow type 2 diabetes. Key messages BCAAs, AAAs and acylcarnitines are strongly associated with early insulin resistance. Diabetes risk prediction has been improved when adding metabolomic markers of dysglycemia to standard clinical and biochemical factors.
Affiliation(s)
- Carlos A Aguilar-Salinas
- Instituto Nacional de Ciencias Médicas y Nutrición "Salvador Zubirán", Ciudad de México, D.F.
- Ivette Cruz-Bautista
- Instituto Nacional de Ciencias Médicas y Nutrición "Salvador Zubirán", Ciudad de México, D.F.
|
22
|
Kruzliak P, Haley AP, Starcevic JN, Gaspar L, Petrovic D. Polymorphisms of the peroxisome proliferator-activated receptor-γ (rs1801282) and its coactivator-1 (rs8192673) are associated with obesity indexes in subjects with type 2 diabetes mellitus. Cardiovasc Diabetol 2015; 14:42. [PMID: 25928419 PMCID: PMC4450508 DOI: 10.1186/s12933-015-0197-0] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Received: 01/08/2015] [Accepted: 03/20/2015] [Indexed: 12/19/2022] Open
Abstract
The aim of this study was to clarify whether common single nucleotide polymorphisms (SNPs) of the Peroxisome Proliferator-Activated Receptor-γ (PPAR-γ) gene (rs1801282) and the Peroxisome Proliferator-Activated Receptor-γ Coactivator-1 (PGC-1α) gene (rs8192673) are associated with obesity indexes (BMI, waist circumference) in subjects with type 2 diabetes mellitus (T2DM) in a Caucasian population. The second aim was to find an association of both polymorphisms with T2DM. Methods Two exonic SNPs (rs1801282 of the PPAR-γ gene and rs8192673 of the PGC-1α gene) were genotyped in 881 unrelated Slovene subjects (Caucasians) with T2DM and in 348 subjects without T2DM (control subjects). Results Female homozygotes with the CC genotype of the rs8192673 had higher waist circumference in comparison with subjects with other genotypes. Homozygotes (females, males) with the wild allele (Pro) of the rs1801282 (Pro12Ala polymorphism) had higher waist circumference in comparison with subjects with other genotypes. In the study, there were no differences in the distributions of the rs8192673 and the rs1801282 genotypes between patients with T2DM and controls. Linear regression analyses for both polymorphisms were performed and demonstrated an independent effect of the rs1801282 of the PPAR-γ gene on waist circumference in subjects with T2DM, whereas an independent effect on waist circumference was not demonstrated for the rs8192673 of the PGC-1α gene. Conclusions In a large sample of Caucasians, the rs8192673 of the PGC-1α gene and the rs1801282 of the PPAR-γ gene were associated with waist circumference in subjects with T2DM.
Affiliation(s)
- Peter Kruzliak
- Department of Cardiovascular Diseases, International Clinical Research Center, St Anne's University Hospital and Masaryk University, Brno, Czech Republic
- Andreana P Haley
- Department of Psychology, The University of Texas, Austin, TX, USA; University of Texas Imaging Research Center, Austin, TX, USA
- Ludovit Gaspar
- 2nd Department of Internal Medicine, University Hospital and Comenius University, Bratislava, Slovak Republic
- Daniel Petrovic
- Institute of Histology and Embryology, Faculty of Medicine, University of Ljubljana, Ljubljana, Slovenia
|
23
|
Chang HS, Shin SW, Lee TH, Bae DJ, Park JS, Kim YH, Uh ST, Choi BW, Kim MK, Choi IS, Park BL, Shin HD, Park CS. Development of a genetic marker set to diagnose aspirin-exacerbated respiratory disease in a genome-wide association study. THE PHARMACOGENOMICS JOURNAL 2015; 15:316-21. [PMID: 25707394 DOI: 10.1038/tpj.2014.78] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 05/02/2014] [Revised: 09/28/2014] [Accepted: 11/05/2014] [Indexed: 12/27/2022]
Abstract
We developed a genetic marker set of single nucleotide polymorphisms (SNPs) by summing risk scores of 14 SNPs showing a significant association with aspirin-exacerbated respiratory disease (AERD) from our previous 660W genome-wide association data. The summed scores were higher in the AERD than in the aspirin-tolerant asthma (ATA) group (P = 8.58 × 10^-37), and were correlated with the percent decrease in forced expiratory volume in 1 s after aspirin challenge (r^2 = 0.150, P = 5.84 × 10^-30). The area under the curve of the scores for AERD in the receiver operating characteristic curve was 0.821. The best cutoff value of the summed risk scores was 1.01328 (P = 1.38 × 10^-32). The sensitivity and specificity of the best scores were 64.7% and 85.0%, respectively, with 42.1% positive and 93.4% negative predictive values. The summed risk score may be used as a genetic marker with good discriminative power for distinguishing AERD from ATA.
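Positive and negative predictive values depend on how common the condition is in the tested group, not only on sensitivity and specificity. The sketch below shows that relationship using Bayes' rule. The function name `predictive_values` is hypothetical, and the prevalence of 14.4% is an assumption chosen for illustration; the abstract does not report it.

```python
from typing import Tuple


def predictive_values(sensitivity: float, specificity: float, prevalence: float) -> Tuple[float, float]:
    """Return (PPV, NPV) for a test applied to a group with the given prevalence."""
    for name, value in (("sensitivity", sensitivity), ("specificity", specificity)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie between 0 and 1")
    if not 0.0 < prevalence < 1.0:
        raise ValueError("prevalence must lie strictly between 0 and 1")
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Sensitivity and specificity as quoted in the abstract; the prevalence is an assumed value.
ppv, npv = predictive_values(sensitivity=0.647, specificity=0.850, prevalence=0.144)
print(round(ppv, 3), round(npv, 3))
```

Under that assumed prevalence, the computed values land close to the 42.1% and 93.4% quoted above, which illustrates why a test with high specificity can still have a modest positive predictive value when the condition is uncommon.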
Affiliation(s)
- H S Chang
- Department of Medical Bioscience, Graduate School, Soonchunhyang University, Asan, Republic of Korea
- S W Shin
- Asthma Genome Research Center, Soonchunhyang University Bucheon Hospital, Bucheon, Republic of Korea
- T H Lee
- Department of Medical Bioscience, Graduate School, Soonchunhyang University, Asan, Republic of Korea
- D J Bae
- Department of Medical Bioscience, Graduate School, Soonchunhyang University, Asan, Republic of Korea
- J S Park
- [1] Asthma Genome Research Center, Soonchunhyang University Bucheon Hospital, Bucheon, Republic of Korea [2] Division of Allergy and Respiratory Medicine, Soonchunhyang University Bucheon Hospital, Bucheon, Republic of Korea
- Y H Kim
- Division of Allergy and Respiratory Medicine, Soonchunhyang University Cheonan Hospital, Cheonan, Republic of Korea
- S T Uh
- Division of Allergy and Respiratory Medicine, Soonchunhyang University Seoul Hospital, Seoul, Republic of Korea
- B W Choi
- Department of Internal Medicine, Chung-Ang University Yongsan Hospital, Seoul, Republic of Korea
- M K Kim
- Division of Internal Medicine, Chungbuk National University, Cheongju, Republic of Korea
- I S Choi
- Department of Allergy, Chonnam National University, Gwangju, Republic of Korea
- B L Park
- Department of Genetic Epidemiology, SNP Genetics Incorporation, Seoul, Republic of Korea
- H D Shin
- [1] Department of Genetic Epidemiology, SNP Genetics Incorporation, Seoul, Republic of Korea [2] Department of Life Science, Sogang University, Seoul, Republic of Korea
- C S Park
- [1] Asthma Genome Research Center, Soonchunhyang University Bucheon Hospital, Bucheon, Republic of Korea [2] Division of Allergy and Respiratory Medicine, Soonchunhyang University Bucheon Hospital, Bucheon, Republic of Korea
|
24
|
Johansen Taber KA, Dickinson BD. Genomic-based tools for the risk assessment, management, and prevention of type 2 diabetes. APPLICATION OF CLINICAL GENETICS 2015; 8:1-8. [PMID: 25609992 PMCID: PMC4293919 DOI: 10.2147/tacg.s75583] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Indexed: 01/01/2023]
Abstract
Type 2 diabetes (T2D) is a common and serious disorder and is a significant risk factor for the development of cardiovascular disease, neuropathy, nephropathy, retinopathy, periodontal disease, and foot ulcers and amputations. The burden of disease associated with T2D has led to an emphasis on early identification of the millions of individuals at high risk so that management and intervention strategies can be effectively implemented before disease progression begins. With increasing knowledge about the genetic basis of T2D, several genomic-based strategies have been tested for their ability to improve risk assessment, management and prevention. Genetic risk scores have been developed with the intent to more accurately identify those at risk for T2D and to potentially improve motivation and adherence to lifestyle modification programs. In addition, evidence is building that oral antihyperglycemic medications are subject to pharmacogenomic variation in a substantial number of patients, suggesting genomics may soon play a role in determining the most effective therapies. T2D is a complex disease that affects individuals differently, and risk prediction and treatment may be challenging for health care providers. Genomic approaches hold promise for their potential to improve risk prediction and tailor management for individual patients and to contribute to better health outcomes for those with T2D.
Affiliation(s)
| | - Barry D Dickinson
- Department of Science and Biotechnology, American Medical Association, Chicago, IL, USA
|
25
|
Schrodi SJ, Mukherjee S, Shan Y, Tromp G, Sninsky JJ, Callear AP, Carter TC, Ye Z, Haines JL, Brilliant MH, Crane PK, Smelser DT, Elston RC, Weeks DE. Genetic-based prediction of disease traits: prediction is very difficult, especially about the future. Front Genet 2014; 5:162. [PMID: 24917882 PMCID: PMC4040440 DOI: 10.3389/fgene.2014.00162] [Citation(s) in RCA: 47] [Impact Index Per Article: 4.7] [Received: 02/13/2014] [Accepted: 05/15/2014] [Indexed: 01/08/2023] Open
Abstract
Translation of results from genetic findings to inform medical practice is a highly anticipated goal of human genetics. The aim of this paper is to review and discuss the role of genetics in medically-relevant prediction. Germline genetics presages disease onset and therefore can contribute prognostic signals that augment laboratory tests and clinical features. As such, the impact of genetic-based predictive models on clinical decisions and therapy choice could be profound. However, given that (i) medical traits result from a complex interplay between genetic and environmental factors, (ii) the underlying genetic architectures for susceptibility to common diseases are not well-understood, and (iii) replicable susceptibility alleles, in combination, account for only a moderate amount of disease heritability, there are substantial challenges to constructing and implementing genetic risk prediction models with high utility. In spite of these challenges, concerted progress has continued in this area with an ongoing accumulation of studies that identify disease predisposing genotypes. Several statistical approaches with the aim of predicting disease have been published. Here we summarize the current state of disease susceptibility mapping and pharmacogenetics efforts for risk prediction, describe methods used to construct and evaluate genetic-based predictive models, and discuss applications.
Affiliation(s)
- Steven J Schrodi
- Center for Human Genetics, Marshfield Clinic Research Foundation, Marshfield, WI, USA
| | - Shubhabrata Mukherjee
- Department of Medicine, School of Medicine, University of Washington, Seattle, WA, USA
| | - Ying Shan
- Departments of Human Genetics and Biostatistics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA, USA
| | - Gerard Tromp
- Sigfried and Janet Weis Center for Research, Geisinger Health System, Danville, PA, USA
| | - John J Sninsky
- Subsidiary of Quest Diagnostics, Discovery Research, Celera Corporation, Alameda, CA, USA
| | - Amy P Callear
- Center for Human Genetics, Marshfield Clinic Research Foundation, Marshfield, WI, USA; Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA, USA
| | - Tonia C Carter
- Center for Human Genetics, Marshfield Clinic Research Foundation, Marshfield, WI, USA
| | - Zhan Ye
- Biomedical Informatics Research Center, Marshfield Clinic Research Foundation, Marshfield, WI, USA
| | - Jonathan L Haines
- Department of Epidemiology and Biostatistics, Case Western Reserve School of Medicine, Cleveland, OH, USA
| | - Murray H Brilliant
- Center for Human Genetics, Marshfield Clinic Research Foundation, Marshfield, WI, USA
| | - Paul K Crane
- Department of Medicine, School of Medicine, University of Washington, Seattle, WA, USA
| | - Diane T Smelser
- Sigfried and Janet Weis Center for Research, Geisinger Health System, Danville, PA, USA
| | - Robert C Elston
- Department of Epidemiology and Biostatistics, Case Western Reserve School of Medicine, Cleveland, OH, USA
| | - Daniel E Weeks
- Departments of Human Genetics and Biostatistics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA, USA
|