1
Mohammed A, Alshraideh H, Abu-Helalah M, Shamayleh A. An explainable non-invasive hybrid machine learning framework for accurate prediction of thyroid-stimulating hormone levels. Comput Biol Med 2025; 189:109974. [PMID: 40058078 DOI: 10.1016/j.compbiomed.2025.109974] [Received: 12/08/2024] [Revised: 02/12/2025] [Accepted: 03/02/2025] [Indexed: 04/01/2025]
Abstract
Machine learning (ML) models are increasingly utilized in healthcare for biomarker prediction, including the prediction of thyroid biomarkers. These models offer the potential to enhance disease diagnosis through data-driven approaches relying on non-invasive techniques. However, no studies have explored the application of fully non-invasive methods for predicting thyroid-stimulating hormone (TSH) levels. Consequently, this study introduces a novel, fully non-invasive framework for predicting TSH levels by developing an innovative hybrid machine learning model that balances performance, complexity, and interpretability. Seven ML models were evaluated, and the best-performing models were integrated into the hybrid approach. A dataset of 6190 instances from Jordan was used for model development. Four-dimensional non-invasive factors, including demographics, symptoms, family history, and newly engineered symptom scores, were incorporated into the model. The hybrid model achieved an R² of 94.2% and an RMSE of 0.015, demonstrating superior predictive performance. Model interpretability was ensured using LIME and SHAP explainers, confirming the critical role of aggregated symptom scores in enhancing prediction accuracy. A robust feature selection technique was implemented, reducing model complexity and enhancing performance. Among the top ten features for predicting TSH levels were hypothyroidism and hyperthyroidism symptom scores, family history, cold intolerance, itchy-dry skin, sweating, hand tremors, and palpitations. The model can be employed to develop cost-effective diagnostic tools for thyroid disorders. It also offers a robust framework that can be generalized to predict other biomarkers and applied in diverse contexts.
Affiliation(s)
- Areej Mohammed
- Department of Industrial Engineering, Engineering Systems Management Program, American University of Sharjah, Sharjah, P.O. Box 26666, United Arab Emirates.
- Hussam Alshraideh
- Department of Industrial Engineering, American University of Sharjah, Sharjah, P.O. Box 26666, United Arab Emirates; Industrial Engineering Department, Jordan University of Science and Technology, Irbid, Jordan.
- Munir Abu-Helalah
- Department of Family and Community Medicine, School of Medicine, University of Jordan, Public Health Institute, Amman, Jordan.
- Abdulrahim Shamayleh
- Department of Industrial Engineering, American University of Sharjah, Sharjah, P.O. Box 26666, United Arab Emirates.
2
Targovnik HM, Barh D, Papendieck P, Adrover E, Gallo AM, Chiesa A, Marques da Silva W, Azevedo V, Rivolta CM. A novel pathogenic DICER1 gene variant is associated with hereditary multinodular goiter in an Argentine family as evidenced by clinical, biochemical and molecular genetic analysis. Endocrine 2025; 87:1150-1161. [PMID: 39607641 DOI: 10.1007/s12020-024-04098-3] [Received: 10/01/2024] [Accepted: 11/03/2024] [Indexed: 11/29/2024]
Abstract
DICER1 syndrome is an autosomal-dominant disorder that results in malignant or benign tumors. A number of distinct pathogenic germline and somatic variants have been identified as causing multinodular goiter (MNG). The purpose of the present study was to identify and characterize the genetic cause underlying the familial form of MNG through a whole-exome sequencing (WES) analysis in an Argentine family with three affected siblings. Clinical, biochemical, molecular genetic, and bioinformatic analyses were performed. A novel heterozygous variant in the DICER1 gene was identified in the proband by WES. The variant was a single guanine deletion at nucleotide position 2,042 (NM_177438.3:c.2042del) resulting in a frameshift at amino acid 681 with a putative premature stop codon [NP_803187.1:p.Gly681ValfsTer4]. Family segregation analysis showed that his affected sister and his affected brother were also heterozygous for the same variant, whereas the father was a healthy heterozygous carrier of the variant and the healthy mother harbored only wild-type alleles in the DICER1 gene. We also observed that the frameshift variant does not interfere with the pre-mRNA splicing of exon 13. In addition, two clinically relevant heterozygous variants, not associated with thyroid disease, were identified in the index sibling using the Franklin platform: a frameshift [NP_000234.1:p.Thr55AsnfsTer49] in the MEFV gene (familial Mediterranean fever) and a missense [NP_004530.1:p.Ala422Thr] in the NARS1 gene (neurodevelopmental delay and ataxia). In conclusion, in the present study we have identified a novel frameshift variant corresponding to NP_803187.1:p.Gly681ValfsTer4 in the DUF283 domain of DICER1. The results were in accordance with previous observations confirming the genetic heterogeneity of DICER1 syndrome. Moreover, the identification of this variant in the unaffected father substantiates the hypothesis of incomplete/reduced penetrance.
Affiliation(s)
- Héctor M Targovnik
- Universidad de Buenos Aires. Facultad de Farmacia y Bioquímica, Departamento de Microbiología, Inmunología, Biotecnología y Genética/Cátedra de Genética, Buenos Aires, Argentina.
- Debmalya Barh
- Department of Genetics, Ecology & Evolution, Institute of Biological Sciences, Federal University of Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
- Institute of Integrative Omics & Applied Biotechnology (IIOAB), Nonakuri, Purba, Medinipur, West Bengal, India
- Patricia Papendieck
- Centro de Investigaciones Endocrinológicas, CEDIE-CONICET, División Endocrinología, Hospital de Niños "Ricardo Gutiérrez", Buenos Aires, Argentina
- Ezequiela Adrover
- Universidad de Buenos Aires. Facultad de Farmacia y Bioquímica, Departamento de Microbiología, Inmunología, Biotecnología y Genética/Cátedra de Genética, Buenos Aires, Argentina
- CONICET-Universidad de Buenos Aires. Instituto de Inmunología, Genética y Metabolismo (INIGEM), Buenos Aires, Argentina
- Ariel M Gallo
- Universidad de Buenos Aires. Facultad de Farmacia y Bioquímica, Departamento de Microbiología, Inmunología, Biotecnología y Genética/Cátedra de Genética, Buenos Aires, Argentina
- CONICET-Universidad de Buenos Aires. Instituto de Inmunología, Genética y Metabolismo (INIGEM), Buenos Aires, Argentina
- Ana Chiesa
- Centro de Investigaciones Endocrinológicas, CEDIE-CONICET, División Endocrinología, Hospital de Niños "Ricardo Gutiérrez", Buenos Aires, Argentina
- Vasco Azevedo
- Department of Genetics, Ecology & Evolution, Institute of Biological Sciences, Federal University of Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
- Carina M Rivolta
- Universidad de Buenos Aires. Facultad de Farmacia y Bioquímica, Departamento de Microbiología, Inmunología, Biotecnología y Genética/Cátedra de Genética, Buenos Aires, Argentina
- CONICET-Universidad de Buenos Aires. Instituto de Inmunología, Genética y Metabolismo (INIGEM), Buenos Aires, Argentina
3
Yan C, Ong HH, Grabowska ME, Krantz MS, Su WC, Dickson AL, Peterson JF, Feng Q, Roden DM, Stein CM, Kerchberger VE, Malin BA, Wei WQ. Large language models facilitate the generation of electronic health record phenotyping algorithms. J Am Med Inform Assoc 2024; 31:1994-2001. [PMID: 38613820 PMCID: PMC11339509 DOI: 10.1093/jamia/ocae072] [Received: 12/19/2023] [Revised: 02/21/2024] [Accepted: 03/22/2024] [Indexed: 04/15/2024] Open
Abstract
OBJECTIVES: Phenotyping is a core task in observational health research utilizing electronic health records (EHRs). Developing an accurate algorithm demands substantial input from domain experts, involving extensive literature review and evidence synthesis. This burdensome process limits scalability and delays knowledge discovery. We investigate the potential for leveraging large language models (LLMs) to enhance the efficiency of EHR phenotyping by generating high-quality algorithm drafts. MATERIALS AND METHODS: We prompted four LLMs (GPT-4 and GPT-3.5 of ChatGPT, Claude 2, and Bard) in October 2023, asking them to generate executable phenotyping algorithms in the form of SQL queries adhering to a common data model (CDM) for three phenotypes (ie, type 2 diabetes mellitus, dementia, and hypothyroidism). Three phenotyping experts evaluated the returned algorithms across several critical metrics. We further implemented the top-rated algorithms and compared them against clinician-validated phenotyping algorithms from the Electronic Medical Records and Genomics (eMERGE) network. RESULTS: GPT-4 and GPT-3.5 exhibited significantly higher overall expert evaluation scores in instruction following, algorithmic logic, and SQL executability, when compared to Claude 2 and Bard. Although GPT-4 and GPT-3.5 effectively identified relevant clinical concepts, they exhibited immature capability in organizing phenotyping criteria with the proper logic, leading to phenotyping algorithms that were either excessively restrictive (with low recall) or overly broad (with low positive predictive values). CONCLUSION: GPT versions 3.5 and 4 are capable of drafting phenotyping algorithms by identifying relevant clinical criteria aligned with a CDM. However, expertise in informatics and clinical experience is still required to assess and further refine generated algorithms.
Affiliation(s)
- Chao Yan
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37203, United States
- Henry H Ong
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37203, United States
- Monika E Grabowska
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37203, United States
- Matthew S Krantz
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37203, United States
- Wu-Chen Su
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37203, United States
- Alyson L Dickson
- Department of Medicine, Vanderbilt University Medical Center, Nashville, TN 37203, United States
- Josh F Peterson
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37203, United States
- Department of Medicine, Vanderbilt University Medical Center, Nashville, TN 37203, United States
- QiPing Feng
- Department of Medicine, Vanderbilt University Medical Center, Nashville, TN 37203, United States
- Dan M Roden
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37203, United States
- C Michael Stein
- Department of Medicine, Vanderbilt University Medical Center, Nashville, TN 37203, United States
- V Eric Kerchberger
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37203, United States
- Department of Medicine, Vanderbilt University Medical Center, Nashville, TN 37203, United States
- Bradley A Malin
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37203, United States
- Department of Computer Science, Vanderbilt University, Nashville, TN 37203, United States
- Department of Biostatistics, Vanderbilt University Medical Center, Nashville, TN 37203, United States
- Wei-Qi Wei
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37203, United States
- Department of Computer Science, Vanderbilt University, Nashville, TN 37203, United States
4
Bampa M, Miliou I, Jovanovic B, Papapetrou P. M-ClustEHR: A multimodal clustering approach for electronic health records. Artif Intell Med 2024; 154:102905. [PMID: 38908256 DOI: 10.1016/j.artmed.2024.102905] [Received: 12/22/2023] [Revised: 05/29/2024] [Accepted: 06/03/2024] [Indexed: 06/24/2024]
Abstract
Sepsis refers to a potentially life-threatening situation where the immune system of the human body has an extreme response to an infection. In the presence of underlying comorbidities, the situation can become even worse and result in death. Employing unsupervised machine learning techniques, such as clustering, can assist in providing a better understanding of patient phenotypes by unveiling subgroups characterized by distinct sepsis progression and treatment patterns. More concretely, this study introduces M-ClustEHR, a clustering approach that utilizes medical data of multiple modalities by employing a multimodal autoencoder for learning comprehensive sepsis patient representations. M-ClustEHR consistently outperforms traditional clustering approaches in terms of several internal clustering performance metrics, as well as cluster stability in identifying phenotypes in the sepsis cohort. The unveiled patterns, supported by existing medical literature and clinicians, highlight the importance of multimodal clustering for advancing personalized sepsis care.
Affiliation(s)
- Maria Bampa
- Department of Computer and Systems Sciences, Stockholm University, Stockholm, Sweden.
- Ioanna Miliou
- Department of Computer and Systems Sciences, Stockholm University, Stockholm, Sweden
- Panagiotis Papapetrou
- Department of Computer and Systems Sciences, Stockholm University, Stockholm, Sweden
5
Saevarsdottir S, Bjarnadottir K, Markusson T, Berglund J, Olafsdottir TA, Halldorsson GH, Rutsdottir G, Gunnarsdottir K, Arnthorsson AO, Lund SH, Stefansdottir L, Gudmundsson J, Johannesson AJ, Sturluson A, Oddsson A, Halldorsson B, Ludviksson BR, Ferkingstad E, Ivarsdottir EV, Sveinbjornsson G, Grondal G, Masson G, Eldjarn GH, Thorisson GA, Kristjansdottir K, Knowlton KU, Moore KHS, Gudjonsson SA, Rognvaldsson S, Knight S, Nadauld LD, Holm H, Magnusson OT, Sulem P, Gudbjartsson DF, Rafnar T, Thorleifsson G, Melsted P, Norddahl GL, Jonsdottir I, Stefansson K. Start codon variant in LAG3 is associated with decreased LAG-3 expression and increased risk of autoimmune thyroid disease. Nat Commun 2024; 15:5748. [PMID: 38982041 PMCID: PMC11233504 DOI: 10.1038/s41467-024-50007-7] [Received: 09/15/2023] [Accepted: 06/27/2024] [Indexed: 07/11/2024] Open
Abstract
Autoimmune thyroid disease (AITD) is a common autoimmune disease. In a GWAS meta-analysis of 110,945 cases and 1,084,290 controls, 290 sequence variants at 225 loci are associated with AITD. Of these variants, 115 are previously unreported. Multiomics analysis yields 235 candidate genes outside the MHC-region and the findings highlight the importance of genes involved in T-cell regulation. A rare 5'-UTR variant (rs781745126-T, MAF = 0.13% in Iceland) in LAG3 has the largest effect (OR = 3.42, P = 2.2 × 10⁻¹⁶) and generates a novel start codon for an open reading frame upstream of the canonical protein translation initiation site. rs781745126-T reduces mRNA and surface expression of the inhibitory immune checkpoint LAG-3 co-receptor on activated lymphocyte subsets and halves LAG-3 levels in plasma among heterozygotes. All three homozygous carriers of rs781745126-T have AITD, of whom one also has two other T-cell mediated diseases, that is, vitiligo and type 1 diabetes. rs781745126-T associates nominally with vitiligo (OR = 5.1, P = 6.5 × 10⁻³) but not with type 1 diabetes. Thus, the effect of rs781745126-T is akin to drugs that inhibit LAG-3, which unleash immune responses and can have thyroid dysfunction and vitiligo as adverse events. This illustrates how a multiomics approach can reveal potential drug targets and safety concerns.
Affiliation(s)
- Saedis Saevarsdottir
- deCODE genetics/Amgen, Inc., Reykjavik, Iceland.
- Faculty of Medicine, School of Health Sciences, University of Iceland, Reykjavik, Iceland.
- Department of Medicine, Landspitali, the National University Hospital of Iceland, Reykjavik, Iceland.
- Thorsteinn Markusson
- deCODE genetics/Amgen, Inc., Reykjavik, Iceland
- Faculty of Medicine, School of Health Sciences, University of Iceland, Reykjavik, Iceland
- Thorunn A Olafsdottir
- deCODE genetics/Amgen, Inc., Reykjavik, Iceland
- Faculty of Medicine, School of Health Sciences, University of Iceland, Reykjavik, Iceland
- Gisli H Halldorsson
- deCODE genetics/Amgen, Inc., Reykjavik, Iceland
- School of Engineering and Natural Sciences, University of Iceland, Reykjavik, Iceland
- Gudrun Rutsdottir
- deCODE genetics/Amgen, Inc., Reykjavik, Iceland
- School of Engineering and Natural Sciences, University of Iceland, Reykjavik, Iceland
- Ari J Johannesson
- Department of Medicine, Landspitali, the National University Hospital of Iceland, Reykjavik, Iceland
- Björn R Ludviksson
- Faculty of Medicine, School of Health Sciences, University of Iceland, Reykjavik, Iceland
- Department of Immunology, Landspitali, the National University Hospital of Iceland, Reykjavik, Iceland
- Erna V Ivarsdottir
- deCODE genetics/Amgen, Inc., Reykjavik, Iceland
- School of Engineering and Natural Sciences, University of Iceland, Reykjavik, Iceland
- Gerdur Grondal
- Faculty of Medicine, School of Health Sciences, University of Iceland, Reykjavik, Iceland
- Department of Medicine, Landspitali, the National University Hospital of Iceland, Reykjavik, Iceland
- Kirk U Knowlton
- Intermountain Medical Center, Intermountain Heart Institute, Salt Lake City, UT, USA
- School of Medicine, University of Utah, Salt Lake City, UT, USA
- Stacey Knight
- Intermountain Medical Center, Intermountain Heart Institute, Salt Lake City, UT, USA
- Hilma Holm
- deCODE genetics/Amgen, Inc., Reykjavik, Iceland
- Daniel F Gudbjartsson
- deCODE genetics/Amgen, Inc., Reykjavik, Iceland
- School of Engineering and Natural Sciences, University of Iceland, Reykjavik, Iceland
- Pall Melsted
- deCODE genetics/Amgen, Inc., Reykjavik, Iceland
- School of Engineering and Natural Sciences, University of Iceland, Reykjavik, Iceland
- Ingileif Jonsdottir
- deCODE genetics/Amgen, Inc., Reykjavik, Iceland
- Faculty of Medicine, School of Health Sciences, University of Iceland, Reykjavik, Iceland
- Department of Immunology, Landspitali, the National University Hospital of Iceland, Reykjavik, Iceland
- Kari Stefansson
- deCODE genetics/Amgen, Inc., Reykjavik, Iceland.
- Faculty of Medicine, School of Health Sciences, University of Iceland, Reykjavik, Iceland.
6
Jafari E, Blackman MH, Karnes JH, Van Driest SL, Crawford DC, Choi L, McDonough CW. Using electronic health records for clinical pharmacology research: Challenges and considerations. Clin Transl Sci 2024; 17:e13871. [PMID: 38943244 PMCID: PMC11213823 DOI: 10.1111/cts.13871] [Received: 03/19/2024] [Revised: 05/21/2024] [Accepted: 05/24/2024] [Indexed: 07/01/2024] Open
Abstract
Electronic health records (EHRs) contain a vast array of phenotypic data on large numbers of individuals, often collected over decades. Due to the wealth of information, EHR data have emerged as a powerful resource to make first discoveries and identify disparities in our healthcare system. While the number of EHR-based studies has exploded in recent years, most of these studies are directed at associations with disease rather than pharmacotherapeutic outcomes, such as drug response or adverse drug reactions. This is largely due to challenges specific to deriving drug-related phenotypes from the EHR. There is great potential for EHR-based discovery in clinical pharmacology research, and there is a critical need to address specific challenges related to accurate and reproducible derivation of drug-related phenotypes from the EHR. This review provides a detailed evaluation of challenges and considerations for deriving drug-related data from EHRs. We provide an examination of EHR-based computable phenotypes and discuss cutting-edge approaches to map medication information for clinical pharmacology research, including medication-based computable phenotypes and natural language processing. We also discuss additional considerations such as data structure, heterogeneity and missing data, rare phenotypes, and diversity within the EHR. By further understanding the complexities associated with conducting clinical pharmacology research using EHR-based data, investigators will be better equipped to design thoughtful studies with more reproducible results. Progress in utilizing EHRs for clinical pharmacology research should lead to significant advances in our ability to understand differential drug response and predict adverse drug reactions.
Affiliation(s)
- Eissa Jafari
- Department of Pharmacotherapy and Translational Research, Center for Pharmacogenomics and Precision Medicine, College of Pharmacy, University of Florida, Gainesville, Florida, USA
- Department of Pharmacy Practice, College of Pharmacy, Jazan University, Jazan, Saudi Arabia
- Marisa H. Blackman
- Department of Biostatistics, Vanderbilt University Medical Center, Nashville, Tennessee, USA
- Jason H. Karnes
- Department of Pharmacy Practice and Science, University of Arizona R. Ken Coit College of Pharmacy, Tucson, Arizona, USA
- Sara L. Van Driest
- Department of Pediatrics, Vanderbilt University Medical Center (VUMC), Nashville, Tennessee, USA
- Present address: All of Us Research Program, National Institutes of Health, Bethesda, Maryland, USA
- Dana C. Crawford
- Department of Population and Quantitative Health Sciences, Cleveland Institute for Computational Biology, Case Western Reserve University, Cleveland, Ohio, USA
- Department of Genetics and Genome Sciences, Cleveland Institute for Computational Biology, Case Western Reserve University, Cleveland, Ohio, USA
- Leena Choi
- Department of Biostatistics and Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee, USA
- Caitrin W. McDonough
- Department of Pharmacotherapy and Translational Research, Center for Pharmacogenomics and Precision Medicine, College of Pharmacy, University of Florida, Gainesville, Florida, USA
7
Ruan X, Liu Y, Wu S, Fu G, Tao M, Huang Y, Li D, Wei S, Gao M, Guo S, Ning J, Zheng X. Multidimensional data analysis revealed thyroiditis-associated TCF19 SNP rs2073724 as a highly ranked protective variant in thyroid cancer. Aging (Albany NY) 2024; 16:6488-6509. [PMID: 38579171 PMCID: PMC11042956 DOI: 10.18632/aging.205718] [Received: 11/24/2023] [Accepted: 03/14/2024] [Indexed: 04/07/2024]
Abstract
BACKGROUND Thyroid cancer represents the most prevalent malignant endocrine tumour, with rising incidence worldwide and high mortality rates among patients exhibiting dedifferentiation and metastasis. Effective biomarkers and therapeutic interventions are warranted in aggressive thyroid malignancies. The transcription factor 19 (TCF19) gene has been implicated in conferring a malignant phenotype in cancers. However, its contribution to thyroid neoplasms remains unclear. RESULTS In this study, we performed genome-wide and phenome-wide association studies to identify a potential causal relationship between TCF19 and thyroid cancer. Our analyses revealed significant associations between TCF19 and various autoimmune diseases and human cancers, including cervical cancer and autoimmune thyroiditis, with a particularly robust signal for the deleterious missense variation rs2073724 that is associated with thyroid function, hypothyroidism, and autoimmunity. Furthermore, functional assays and transcriptional profiling in thyroid cancer cells demonstrated that TCF19 regulates important biological processes, especially inflammatory and immune responses. We demonstrated that TCF19 could promote the progression of thyroid cancer in vitro and in vivo and the C>T variant of rs2073724 disrupted TCF19 protein binding to target gene promoters and their expression, thus reversing the effect of TCF19 protein. CONCLUSIONS Taken together, these findings implicate TCF19 as a promising therapeutic target in aggressive thyroid malignancies and designate rs2073724 as a causal biomarker warranting further investigation in thyroid cancer.
Affiliation(s)
- Xianhui Ruan
- Department of Thyroid and Neck Tumor, Tianjin Medical University Cancer Institute and Hospital, National Clinical Research Center for Cancer, Tianjin’s Clinical Research Center for Cancer, Key Laboratory of Cancer Prevention and Therapy, Tianjin 300060, China
- Yu Liu
- Department of Thyroid and Neck Tumor, Tianjin Medical University Cancer Institute and Hospital, National Clinical Research Center for Cancer, Tianjin’s Clinical Research Center for Cancer, Key Laboratory of Cancer Prevention and Therapy, Tianjin 300060, China
- Shuping Wu
- Department of Thyroid and Neck Tumor, Tianjin Medical University Cancer Institute and Hospital, National Clinical Research Center for Cancer, Tianjin’s Clinical Research Center for Cancer, Key Laboratory of Cancer Prevention and Therapy, Tianjin 300060, China
- Department of Head and Neck Surgery, Clinical Oncology School of Fujian Medical University, Fujian Cancer Hospital, Fuzhou 350014, Fujian, China
- Guiming Fu
- Department of Thyroid and Neck Tumor, Tianjin Medical University Cancer Institute and Hospital, National Clinical Research Center for Cancer, Tianjin’s Clinical Research Center for Cancer, Key Laboratory of Cancer Prevention and Therapy, Tianjin 300060, China
- Thyroid-Otolaryngology Department, Sichuan Clinical Research Center for Cancer, Sichuan Cancer Hospital and Institute, Sichuan Cancer Center, Affiliated Cancer Hospital of University of Electronic Science and Technology of China, Chengdu 610000, Sichuan, China
- Mei Tao
- Department of Thyroid and Neck Tumor, Tianjin Medical University Cancer Institute and Hospital, National Clinical Research Center for Cancer, Tianjin’s Clinical Research Center for Cancer, Key Laboratory of Cancer Prevention and Therapy, Tianjin 300060, China
- Yue Huang
- Department of Thyroid and Neck Tumor, Tianjin Medical University Cancer Institute and Hospital, National Clinical Research Center for Cancer, Tianjin’s Clinical Research Center for Cancer, Key Laboratory of Cancer Prevention and Therapy, Tianjin 300060, China
- Dapeng Li
- Department of Thyroid and Neck Tumor, Tianjin Medical University Cancer Institute and Hospital, National Clinical Research Center for Cancer, Tianjin’s Clinical Research Center for Cancer, Key Laboratory of Cancer Prevention and Therapy, Tianjin 300060, China
- Songfeng Wei
- Department of Thyroid and Neck Tumor, Tianjin Medical University Cancer Institute and Hospital, National Clinical Research Center for Cancer, Tianjin’s Clinical Research Center for Cancer, Key Laboratory of Cancer Prevention and Therapy, Tianjin 300060, China
- Ming Gao
- Department of Thyroid and Neck Tumor, Tianjin Medical University Cancer Institute and Hospital, National Clinical Research Center for Cancer, Tianjin’s Clinical Research Center for Cancer, Key Laboratory of Cancer Prevention and Therapy, Tianjin 300060, China
- Department of Thyroid and Breast Surgery, Tianjin Union Medical Center, Tianjin 300121, China
- Tianjin Key Laboratory of General Surgery in Construction, Tianjin Union Medical Center, Tianjin 300121, China
- Shicheng Guo
- Department of Medical Genetics, School of Medicine and Public Health, University of Wisconsin-Madison, Madison, WI 53706, USA
- Junya Ning
- Department of Thyroid and Neck Tumor, Tianjin Medical University Cancer Institute and Hospital, National Clinical Research Center for Cancer, Tianjin’s Clinical Research Center for Cancer, Key Laboratory of Cancer Prevention and Therapy, Tianjin 300060, China
- Department of Thyroid and Breast Surgery, Tianjin Union Medical Center, Tianjin 300121, China
- Xiangqian Zheng
- Department of Thyroid and Neck Tumor, Tianjin Medical University Cancer Institute and Hospital, National Clinical Research Center for Cancer, Tianjin’s Clinical Research Center for Cancer, Key Laboratory of Cancer Prevention and Therapy, Tianjin 300060, China
8
Yan C, Ong HH, Grabowska ME, Krantz MS, Su WC, Dickson AL, Peterson JF, Feng Q, Roden DM, Stein CM, Kerchberger VE, Malin BA, Wei WQ. Large Language Models Facilitate the Generation of Electronic Health Record Phenotyping Algorithms. medRxiv: The Preprint Server for Health Sciences 2024:2023.12.19.23300230. [PMID: 38196578 PMCID: PMC10775330 DOI: 10.1101/2023.12.19.23300230] [Indexed: 01/11/2024]
Abstract
Objectives: Phenotyping is a core task in observational health research utilizing electronic health records (EHRs). Developing an accurate algorithm demands substantial input from domain experts, involving extensive literature review and evidence synthesis. This burdensome process limits scalability and delays knowledge discovery. We investigate the potential for leveraging large language models (LLMs) to enhance the efficiency of EHR phenotyping by generating high-quality algorithm drafts. Materials and Methods: We prompted four LLMs (GPT-4 and GPT-3.5 of ChatGPT, Claude 2, and Bard) in October 2023, asking them to generate executable phenotyping algorithms in the form of SQL queries adhering to a common data model (CDM) for three phenotypes (i.e., type 2 diabetes mellitus, dementia, and hypothyroidism). Three phenotyping experts evaluated the returned algorithms across several critical metrics. We further implemented the top-rated algorithms and compared them against clinician-validated phenotyping algorithms from the Electronic Medical Records and Genomics (eMERGE) network. Results: GPT-4 and GPT-3.5 exhibited significantly higher overall expert evaluation scores in instruction following, algorithmic logic, and SQL executability, when compared to Claude 2 and Bard. Although GPT-4 and GPT-3.5 effectively identified relevant clinical concepts, they exhibited immature capability in organizing phenotyping criteria with the proper logic, leading to phenotyping algorithms that were either excessively restrictive (with low recall) or overly broad (with low positive predictive values). Conclusion: GPT versions 3.5 and 4 are capable of drafting phenotyping algorithms by identifying relevant clinical criteria aligned with a CDM. However, expertise in informatics and clinical experience is still required to assess and further refine generated algorithms.
Affiliation(s)
- Chao Yan
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN
- Henry H. Ong
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN
- Monika E. Grabowska
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN
- Matthew S. Krantz
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN
- Wu-Chen Su
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN
- Alyson L. Dickson
  - Department of Medicine, Vanderbilt University Medical Center, Nashville, TN
- Josh F. Peterson
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN
  - Department of Medicine, Vanderbilt University Medical Center, Nashville, TN
- QiPing Feng
  - Department of Medicine, Vanderbilt University Medical Center, Nashville, TN
- Dan M. Roden
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN
- C. Michael Stein
  - Department of Medicine, Vanderbilt University Medical Center, Nashville, TN
- V. Eric Kerchberger
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN
  - Department of Medicine, Vanderbilt University Medical Center, Nashville, TN
- Bradley A. Malin
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN
  - Department of Computer Science, Vanderbilt University, Nashville, TN
  - Department of Biostatistics, Vanderbilt University Medical Center, Nashville, TN
- Wei-Qi Wei
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN
  - Department of Computer Science, Vanderbilt University, Nashville, TN
9
Zamwar UM, Muneshwar KN. Epidemiology, Types, Causes, Clinical Presentation, Diagnosis, and Treatment of Hypothyroidism. Cureus 2023; 15:e46241. [PMID: 37908940 PMCID: PMC10613832 DOI: 10.7759/cureus.46241] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/21/2023] [Accepted: 09/29/2023] [Indexed: 11/02/2023] Open
Abstract
Hypothyroidism means an underactive thyroid gland. This leads to a decrease in the functioning of the thyroid gland. It is a very common endocrine disorder that causes under-secretion of thyroid hormones, mainly thyroxine (T4) and triiodothyronine (T3). It affects people of every age group but is more commonly found in women and older people. The symptoms of hypothyroidism can go unnoticed, may not be specific, and may overlap with other conditions, which makes it harder to diagnose it in some cases. Common symptoms include fatigue, weight gain, increased sensitivity to cold (cold intolerance), irregular bowel movements (constipation), and dry skin (xeroderma). These conditions are mostly the result of a low metabolic rate in the body. Weight gain occurs due to a decrease in fat-burning rate and cold intolerance due to a decrease in heat production by the body. This condition can be caused by a variety of factors, including autoimmune diseases, radiation therapy, thyroid gland removal surgeries, and certain medications. The diagnosis of hypothyroidism is based on laboratory tests that measure the levels of thyroid hormones (T3 and T4) in the blood. Treatment typically involves lifelong hormone replacement therapy with synthetic thyroid hormone replacement medication, such as levothyroxine, to help regulate hormone levels in the body. People with hypothyroidism may need to have their medication dosage adjusted over time. If hypothyroidism is left untreated, it can lead to severe complications like mental retardation, delayed milestones, etc., in infants and heart failure, infertility, myxedema coma, etc., in adults. With appropriate treatment, the symptoms of hypothyroidism can be effectively managed, and most people with the condition can lead normal, healthy lives. Lifestyle modifications like eating healthy food and exercising regularly can help manage the symptoms and improve the quality of life.
Affiliation(s)
- Udit M Zamwar
  - Community Medicine, Jawaharlal Nehru Medical College, Datta Meghe Institute of Higher Education and Research, Wardha, IND
- Komal N Muneshwar
  - Community Medicine, Jawaharlal Nehru Medical College, Datta Meghe Institute of Higher Education and Research, Wardha, IND
10
Wan NC, Yaqoob AA, Ong HH, Zhao J, Wei WQ. Evaluating resources composing the PheMAP knowledge base to enhance high-throughput phenotyping. J Am Med Inform Assoc 2023; 30:456-465. [PMID: 36451277 PMCID: PMC9933070 DOI: 10.1093/jamia/ocac234] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 07/24/2022] [Revised: 10/28/2022] [Accepted: 11/23/2022] [Indexed: 12/02/2022] Open
Abstract
OBJECTIVE A previous study, PheMAP, combined independent, online resources to enable high-throughput phenotyping (HTP) using electronic health records (EHRs). However, online resources offer distinct quality descriptions of diseases which may affect phenotyping performance. We aimed to evaluate the phenotyping performance of single resource-based PheMAPs and investigate an optimized strategy for HTP. MATERIALS AND METHODS We compared how each resource produced top-ranked concept unique identifiers (CUIs) by term frequency-inverse document frequency with Jaccard matrices comparing single resources and the original PheMAP. We correlated top-ranked concepts from each resource to features used in established Phenotype KnowledgeBase (PheKB) algorithms for hypothyroidism, type II diabetes mellitus (T2DM), and dementias. Using resources separately, we calculated multiple phenotype risk scores for individuals from Vanderbilt University Medical Center's BioVU DNA Biobank and compared phenotyping performance against rule-based eMERGE algorithms. Lastly, we implemented an ensemble strategy which classified patient case/control status based upon PheMAP resource agreement. RESULTS Jaccard similarity matrices indicate that the similarity of CUIs comprising single resource-based PheMAPs varies. Single resource-based PheMAPs generated from MedlinePlus and MedicineNet outperformed others but only encompass 81.6% of overall disease phenotypes. We propose the PheMAP-Ensemble which provides higher average accuracy and precision than the combined average accuracy and precision of single resource-based PheMAPs. While offering complete phenotype coverage, PheMAP-Ensemble significantly increases phenotyping recall compared to the original iteration. CONCLUSIONS Resources comprising the PheMAP produce different phenotyping performance when implemented individually. The ensemble method significantly improves the quality of PheMAP by fully utilizing dissimilar resources to capture accurate phenotyping data from EHRs.
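The Jaccard comparison mentioned in the methods can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the resource names and concept identifiers are made-up placeholders, and the functions `jaccard` and `jaccard_matrix` are hypothetical helpers.

```python
def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| between two collections of concept IDs."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Convention chosen here (an assumption): two empty sets count as identical.
        return 1.0
    return len(a & b) / len(union)


def jaccard_matrix(top_concepts):
    """Pairwise similarity between every pair of resources, as a nested dict."""
    names = sorted(top_concepts)
    return {x: {y: jaccard(top_concepts[x], top_concepts[y]) for y in names}
            for x in names}


# Placeholder concept IDs per resource, for illustration only.
top_concepts = {
    "resource_a": {"CUI_1", "CUI_2", "CUI_3"},
    "resource_b": {"CUI_2", "CUI_3", "CUI_4"},
    "resource_c": {"CUI_5"},
}

for name, row in jaccard_matrix(top_concepts).items():
    print(name, row)
```

A value near 1 means two resources surface largely the same top-ranked concepts; a value near 0 means they describe the disease with different concepts, which is the kind of dissimilarity an ensemble can exploit.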
Affiliation(s)
- Nicholas C Wan
  - Department of Biomedical Engineering, Vanderbilt University, Nashville, Tennessee, USA
- Ali A Yaqoob
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee, USA
- Henry H Ong
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee, USA
- Juan Zhao
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee, USA
- Wei-Qi Wei
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee, USA
11
12
Zamboni M, Strimpakos G, Poggiogalle E, Donini LM, Civitareale D. Adipocyte signaling affects thyroid-specific gene expression via down-regulation of TTF-2/FOXE1. J Mol Endocrinol 2023; 70:e220129. [PMID: 36347053 DOI: 10.1530/jme-22-0129] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/10/2022] [Accepted: 11/08/2022] [Indexed: 11/09/2022]
Abstract
Obesity affects thyroid gland function. Hypothyroidism, thyroid nodules, goiter, and thyroid cancer are more frequent in patients with higher BMI values. Although these data are supported by many clinical and epidemiological studies, our knowledge is very scarce at the molecular level. In this study, we present the first experimental evidence that adipocyte signaling downregulates the expression of thyroid-specific transcription factor 2 (TTF-2/FoxE1). It plays a crucial role in thyroid development and thyroid homeostasis and it is strictly connected to thyroid cancer as well. We provide in vivo and in vitro evidence that inhibition of TTF-2/FoxE1 gene expression is mediated by adipocyte signaling.
Affiliation(s)
- Michela Zamboni
  - Institute of Biochemistry and Cell Biology, National Council of Research, Monterotondo, Rome, Italy
- Georgios Strimpakos
  - Institute of Biochemistry and Cell Biology, National Council of Research, Monterotondo, Rome, Italy
- Eleonora Poggiogalle
  - Department of Experimental Medicine - Medical Pathophysiology, Food Science and Endocrinology Section, Sapienza University of Rome, Rome, Italy
- Lorenzo M Donini
  - Department of Experimental Medicine - Medical Pathophysiology, Food Science and Endocrinology Section, Sapienza University of Rome, Rome, Italy
- Donato Civitareale
  - Institute of Biochemistry and Cell Biology, National Council of Research, Monterotondo, Rome, Italy
13
Shao F, Li R, Guo Q, Qin R, Su W, Yin H, Tian L. Plasma Metabolomics Reveals Systemic Metabolic Alterations of Subclinical and Clinical Hypothyroidism. J Clin Endocrinol Metab 2022; 108:13-25. [PMID: 36181451 PMCID: PMC9759175 DOI: 10.1210/clinem/dgac555] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 05/23/2022] [Revised: 09/01/2022] [Indexed: 02/03/2023]
Abstract
CONTEXT Clinical hypothyroidism (CH) and subclinical hypothyroidism (SCH) have been linked to various metabolic comorbidities but the underlying metabolic alterations remain unclear. Metabolomics may provide metabolic insights into the pathophysiology of hypothyroidism. OBJECTIVE We explored metabolic alterations in SCH and CH and identified potential metabolite biomarkers for the discrimination of SCH and CH from euthyroid individuals. METHODS Plasma samples from a cohort of 126 human subjects, including 45 patients with CH, 41 patients with SCH, and 40 euthyroid controls, were analyzed by high-resolution mass spectrometry-based metabolomics. Data were processed by multivariate principal components analysis and orthogonal partial least squares discriminant analysis. Correlation analysis was performed by a Multivariate Linear Regression analysis. Unbiased Variable selection in R algorithm and 3 machine learning models were utilized to develop prediction models based on potential metabolite biomarkers. RESULTS The plasma metabolomic patterns in SCH and CH groups were significantly different from those of control groups, while metabolite alterations between SCH and CH groups were strikingly similar. Pathway enrichment analysis found that SCH and CH had a significant impact on primary bile acid biosynthesis, steroid hormone biosynthesis, lysine degradation, tryptophan metabolism, and purine metabolism. Significant associations for 65 metabolites were found with levels of thyrotropin, free thyroxine, thyroid peroxidase antibody, or thyroglobulin antibody. We successfully selected and validated 17 metabolic biomarkers to differentiate 3 groups. CONCLUSION SCH and CH have significantly altered metabolic patterns associated with hypothyroidism, and metabolomics coupled with machine learning algorithms can be used to develop diagnostic models based on selected metabolites.
Affiliation(s)
- Qian Guo
  - Department of Endocrinology (Cadre Ward 3), Gansu Provincial Hospital, Lanzhou, Gansu 730099, China
  - Clinical Research Center for Metabolic Disease, Gansu Province. 204 Donggang West Road, Lanzhou, Gansu 730099, China
- Rui Qin
  - Clinical Research Center for Metabolic Disease, Gansu Province. 204 Donggang West Road, Lanzhou, Gansu 730099, China
- Wenxiu Su
  - Clinical Research Center for Metabolic Disease, Gansu Province. 204 Donggang West Road, Lanzhou, Gansu 730099, China
- Huiyong Yin
  - Correspondence: Limin Tian, M.D., The First School of Clinical Medicine, Lanzhou University, Gansu Provincial Hospital, Donggang West Road, 730030, Lanzhou, Gansu, China; Huiyong Yin, Ph.D., Shanghai Institute of Nutrition and Health, Chinese Academy of Sciences, 320 Yueyang Road, Shanghai, China 200031.
- Limin Tian
  - Correspondence: Limin Tian, M.D., The First School of Clinical Medicine, Lanzhou University, Gansu Provincial Hospital, Donggang West Road, 730030, Lanzhou, Gansu, China; Huiyong Yin, Ph.D., Shanghai Institute of Nutrition and Health, Chinese Academy of Sciences, 320 Yueyang Road, Shanghai, China 200031.
14
Liang X, Cao X, Sha Q, Zhang S. HCLC-FC: A novel statistical method for phenome-wide association studies. PLoS One 2022; 17:e0276646. [PMID: 36350801 PMCID: PMC9645610 DOI: 10.1371/journal.pone.0276646] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 03/30/2022] [Accepted: 10/11/2022] [Indexed: 11/11/2022] Open
Abstract
The emergence of genetic data coupled to longitudinal electronic medical records (EMRs) offers the possibility of phenome-wide association studies (PheWAS). In PheWAS, the whole phenome can be divided into numerous phenotypic categories according to the genetic architecture across phenotypes. Currently, statistical analyses for PheWAS are mainly univariate analyses, which test the association between one genetic variant and one phenotype at a time. In this article, we derived a novel and powerful multivariate method for PheWAS. The proposed method involves three steps. In the first step, we apply the bottom-up hierarchical clustering method to partition a large number of phenotypes into disjoint clusters within each phenotypic category. In the second step, the clustering linear combination method is used to combine test statistics within each category based on the phenotypic clusters and obtain p-values from each phenotypic category. In the third step, we propose a new false discovery rate (FDR) control approach. We perform extensive simulation studies to compare the performance of our method with that of other existing methods. The results show that our proposed method controls FDR very well and outperforms the other methods in the comparison. We also apply the proposed approach to a set of EMR-based phenotypes across more than 300,000 samples from the UK Biobank. We find that the proposed approach not only controls FDR well at a nominal level but also successfully identifies 1,244 significant SNPs that are reported to be associated with some phenotypes in the GWAS catalog. Our open-access tools and instructions on how to implement HCLC-FC are available at https://github.com/XiaoyuLiang/HCLCFC.
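For readers unfamiliar with FDR control, the sketch below shows the standard Benjamini-Hochberg step-up procedure. It is not the new FDR approach proposed in HCLC-FC, which the abstract does not describe in detail; it is offered only as a baseline illustration, and the function name and the p-values are hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Standard Benjamini-Hochberg step-up procedure.

    Returns a list of booleans, aligned with p_values, marking rejected hypotheses.
    """
    m = len(p_values)
    # Indices of the p-values in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k with p_(k) <= k * alpha / m.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / m:
            max_k = rank
    # Reject every hypothesis ranked at or below that k.
    rejected = [False] * m
    for idx in order[:max_k]:
        rejected[idx] = True
    return rejected


# Hypothetical per-category p-values, for illustration only.
p_values = [0.01, 0.04, 0.03, 0.5]
print(benjamini_hochberg(p_values, alpha=0.05))
```

Because the procedure is step-up, a small p-value that misses its own threshold can still be rejected when a larger p-value further down the ranking meets its threshold.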
Affiliation(s)
- Xiaoyu Liang
  - Department of Preventive Medicine, Division of Biostatistics, University of Tennessee Health Science Center, Memphis, Tennessee, United States of America
- Xuewei Cao
  - Department of Mathematical Sciences, Michigan Technological University, Houghton, Michigan, United States of America
- Qiuying Sha
  - Department of Mathematical Sciences, Michigan Technological University, Houghton, Michigan, United States of America
- Shuanglin Zhang
  - Department of Mathematical Sciences, Michigan Technological University, Houghton, Michigan, United States of America
15
Tong K, Zhang C, Yang T, Guo R, Wang X, Guan R, Jin T. Suggestive evidence of the genetic association of TMOD1 and PTCSC2 polymorphisms with thyroid carcinoma in the Chinese Han population. BMC Endocr Disord 2022; 22:263. [PMID: 36316666 PMCID: PMC9620653 DOI: 10.1186/s12902-022-01177-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2022] [Accepted: 10/11/2022] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND The purpose of this study was to survey the associations of six single nucleotide polymorphisms (SNPs) in the TMOD1 and PTCSC2 genes with thyroid carcinoma (TC). METHOD Peripheral blood samples were obtained from 510 patients with TC and 509 normal controls. Six SNPs were genotyped by the Agena MassARRAY platform. Logistic regression was used to evaluate the association between SNPs and TC susceptibility by calculating odds ratios (ORs) and 95% confidence intervals (CIs). SNP-SNP interactions were analyzed by multifactor dimensionality reduction (MDR). RESULTS Our study showed that rs925489 (OR = 1.45, p = 0.011) and rs965513 (OR = 1.40, p = 0.021) were significantly associated with an increased risk of TC. Rs10982622 decreased TC risk (OR = 0.74, p = 0.025). Further stratification analysis showed that rs10982622 reduced the susceptibility to TC in patients aged ≤ 45 years (OR = 0.69, p = 0.019) and in females (OR = 0.61, p = 0.014). Rs925489 increased TC risk in people aged > 45 years (OR = 1.54, p = 0.044) and in males (OR = 2.34, p = 0.003). In addition, rs965513 was related to an increased risk of TC in males (OR = 2.14, p = 0.007). Additionally, haplotypes in the block (rs925489|rs965513) significantly increased TC risk (p < 0.05). The best predictive model for TC was the combination of rs1052270, rs10982622, rs1475545, rs16924016, and rs925489. CONCLUSION TMOD1 and PTCSC2 polymorphisms were correlated with a remarkable decrease and increase in TC risk, respectively, based on the analysis.
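The odds ratios and 95% confidence intervals reported here came from logistic regression. As a simpler stand-in, the sketch below computes an unadjusted odds ratio with a Wald (log-scale) interval from a 2x2 table. The counts are invented for illustration and are not the study's genotype counts, and `odds_ratio_ci` is a hypothetical helper.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio and Wald confidence interval from a 2x2 table.

    a: exposed cases      b: exposed controls
    c: unexposed cases    d: unexposed controls
    All four counts must be positive; z=1.96 gives an approximate 95% interval.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR).
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Invented counts: risk-allele carriers vs non-carriers among cases and controls.
or_, lo, hi = odds_ratio_ci(a=20, b=10, c=10, d=20)
print(f"OR = {or_:.2f}, 95% CI = ({lo:.2f}, {hi:.2f})")
```

An interval that excludes 1 corresponds to a nominally significant association; an adjusted logistic regression, as used in the study, can shift both the estimate and the interval.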
Affiliation(s)
- Kaijun Tong
  - Department of Medical Images, People's Hospital of Wanning, Huanshi three eastern Road, Wancheng Town, Wanning City, Hainan Province, China
- Chang Zhang
  - Department of Clinical Laboratory, People's Hospital of Wanning, Hainan Province, Wanning, China
- Tingting Yang
  - Department of Medical Images, People's Hospital of Wanning, Huanshi three eastern Road, Wancheng Town, Wanning City, Hainan Province, China
- Rongbiao Guo
  - Department of Medical Images, People's Hospital of Wanning, Huanshi three eastern Road, Wancheng Town, Wanning City, Hainan Province, China
- Xinyuan Wang
  - Department of Medical Images, People's Hospital of Wanning, Huanshi three eastern Road, Wancheng Town, Wanning City, Hainan Province, China
- Renyang Guan
  - Department of Medical Images, People's Hospital of Wanning, Huanshi three eastern Road, Wancheng Town, Wanning City, Hainan Province, China
- Tianbo Jin
  - Key Laboratory of Resource Biology and Biotechnology in Western China, Ministry of Education, Northwest University, 710069, Xi'an, Shaanxi, China
  - Provincial Key Laboratory of Biotechnology of Shaanxi Province, Northwest University, 229 North Taibai Road, 710069, Xi'an, Shaanxi, China
16
Lim G, Widiapradja A, Levick SP, McKelvey KJ, Liao XH, Refetoff S, Bullock M, Clifton-Bligh RJ. Foxe1 Deletion in the Adult Mouse Is Associated With Increased Thyroidal Mast Cells and Hypothyroidism. Endocrinology 2022; 163:bqac158. [PMID: 36156081 PMCID: PMC9618408 DOI: 10.1210/endocr/bqac158] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/14/2022] [Indexed: 11/29/2022]
Abstract
CONTEXT Foxe1 is a key thyroid developmental transcription factor. Germline deletion results in athyreosis and congenital hypothyroidism. Some data suggest an ongoing role for maintaining thyroid differentiation. OBJECTIVE We created a mouse model to directly examine the role of Foxe1 in the adult thyroid. METHODS A model of tamoxifen-inducible Cre-mediated ubiquitous deletion of Foxe1 was generated in mice of C57BL/6J background (Foxe1flox/flox/Cre-TAM). Tamoxifen or vehicle was administered to Foxe1flox/flox/Cre mice aged 6-8 weeks. Blood was collected at 4, 12, and 20 weeks, and tissues after 12 or 20 weeks for molecular and histological analyses. Plasma total thyroxine (T4), triiodothyronine, and thyrotropin (TSH) were measured. Transcriptomics was performed using microarray or RNA-seq and validated by reverse transcription quantitative polymerase chain reaction. RESULTS Foxe1 was decreased by approximately 80% in Foxe1flox/flox/Cre-TAM mice and confirmed by immunohistochemistry. Foxe1 deletion was associated with abnormal follicular architecture and smaller follicle size at 12 and 20 weeks. Plasma TSH was elevated in Foxe1flox/flox/Cre-TAM mice as early as 4 weeks and T4 was lower in pooled samples from 12 and 20 weeks. Foxe1 deletion was also associated with an increase in thyroidal mast cells. Transcriptomic analyses found decreased Tpo and Tg and upregulated mast cell markers Mcpt4 and Ctsg in Foxe1flox/flox/Cre-TAM mice. CONCLUSION Foxe1 deletion in adult mice was associated with disruption in thyroid follicular architecture accompanied by biochemical hypothyroidism, confirming its role in maintenance of thyroid differentiation. An unanticipated finding was an increase in thyroidal mast cells. These data suggest a possible explanation for previous human genetic studies associating alleles in/near FOXE1 with hypothyroidism and/or autoimmune thyroiditis.
Affiliation(s)
- Grace Lim
  - Cancer Genetics Laboratory, Kolling Institute, Faculty of Medicine and Health, The University of Sydney, St Leonards, NSW 2065, Australia
- Alexander Widiapradja
  - Cardiac Biology and Heart Failure Laboratory, Kolling Institute, Faculty of Medicine and Health, The University of Sydney, St Leonards, NSW 2065, Australia
- Scott P Levick
  - Cardiac Biology and Heart Failure Laboratory, Kolling Institute, Faculty of Medicine and Health, The University of Sydney, St Leonards, NSW 2065, Australia
- Kelly J McKelvey
  - Bill Walsh Translational Cancer Research Laboratory, Kolling Institute, Faculty of Medicine and Health, The University of Sydney, St Leonards, NSW 2065, Australia
- Xiao-Hui Liao
  - Department of Medicine, The University of Chicago, Chicago, Illinois 60637, USA
- Samuel Refetoff
  - Department of Medicine, Pediatrics and Committee on Genetics, The University of Chicago, Chicago, Illinois 60637, USA
- Martyn Bullock
  - Cancer Genetics Laboratory, Kolling Institute, Faculty of Medicine and Health, The University of Sydney, St Leonards, NSW 2065, Australia
- Roderick J Clifton-Bligh
  - Cancer Genetics Laboratory, Kolling Institute, Faculty of Medicine and Health, The University of Sydney, St Leonards, NSW 2065, Australia
  - Department of Endocrinology, Royal North Shore Hospital, St Leonards, NSW 2065, Australia
17
Integration of Omics and Phenotypic Data for Precision Medicine. Methods in Molecular Biology (Clifton, N.J.) 2022; 2486:19-35. [PMID: 35437716 DOI: 10.1007/978-1-0716-2265-0_2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
Abstract
Over the past two decades, biomedical research has been moving toward a big-data-driven approach. The underlying causes of this transition include the ability to gather genetic or molecular profiles of humans faster, the increasing adoption of electronic health record (EHR) systems, and the growing interest in linking omics and phenotypic data for analysis. The integration of an individual's biological data (e.g., genomics, proteomics, metabolomics) and health-care data has created unprecedented opportunities for precision medicine, that is, a medical model that uses a patient's unique information, mainly genetic, to prevent, diagnose, or treat disease. This chapter reviewed the research opportunities and applications of integrating omics and phenotypic data for precision medicine, such as understanding the relationship between genotype and phenotype, disease subtyping, and diagnosis or prediction of adverse outcomes. We reviewed recent advanced methods, particularly the machine learning and deep learning-based approaches used for harnessing and harmonizing multiomics and phenotypic data to address these applications. Finally, we discussed the challenges and future directions.
18
Kerley CI, Chaganti S, Nguyen TQ, Bermudez C, Cutting LE, Beason-Held LL, Lasko T, Landman BA. pyPheWAS: A Phenome-Disease Association Tool for Electronic Medical Record Analysis. Neuroinformatics 2022; 20:483-505. [PMID: 34981404 PMCID: PMC9250547 DOI: 10.1007/s12021-021-09553-4] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Accepted: 10/29/2021] [Indexed: 11/29/2022]
Abstract
Along with the increasing availability of electronic medical record (EMR) data, phenome-wide association studies (PheWAS) and phenome-disease association studies (PheDAS) have become a prominent, first-line method of analysis for uncovering the secrets of EMR. Despite this recent growth, there is a lack of approachable software tools for conducting these analyses on large-scale EMR cohorts. In this article, we introduce pyPheWAS, an open-source python package for conducting PheDAS and related analyses. This toolkit includes 1) data preparation, such as cohort censoring and age-matching; 2) traditional PheDAS analysis of ICD-9 and ICD-10 billing codes; 3) PheDAS analysis applied to a novel EMR phenotype mapping: current procedural terminology (CPT) codes; and 4) novelty analysis of significant disease-phenotype associations found through PheDAS. The pyPheWAS toolkit is approachable and comprehensive, encapsulating data prep through result visualization all within a simple command-line interface. The toolkit is designed for the ever-growing scale of available EMR data, with the ability to analyze cohorts of 100,000+ patients in less than 2 h. Through a case study of Down Syndrome and other intellectual developmental disabilities, we demonstrate the ability of pyPheWAS to discover both known and potentially novel disease-phenotype associations across different experiment designs and disease groups. The software and user documentation are available in open source at https://github.com/MASILab/pyPheWAS.
Affiliation(s)
- Cailey I Kerley
  - Department of Electrical Engineering, Vanderbilt University, Nashville, TN, USA
- Shikha Chaganti
  - Department of Computer Science, Vanderbilt University, Nashville, TN, USA
- Tin Q Nguyen
  - Vanderbilt Brain Institute, Vanderbilt University, Nashville, TN, USA
  - Department of Special Education, Peabody College of Education and Human Development, Nashville, TN, USA
- Camilo Bermudez
  - Department of Biomedical Engineering, Vanderbilt University, Nashville, TN, USA
- Laurie E Cutting
  - Vanderbilt Brain Institute, Vanderbilt University, Nashville, TN, USA
  - Department of Special Education, Peabody College of Education and Human Development, Nashville, TN, USA
  - Vanderbilt Kennedy Center, Vanderbilt University Medical Center, Nashville, TN, USA
  - Vanderbilt University Institute of Imaging Science, Vanderbilt University, Nashville, TN, USA
- Lori L Beason-Held
  - Laboratory of Behavioral Neuroscience, National Institute On Aging, NIH, Baltimore, MD, USA
- Thomas Lasko
  - Department of Computer Science, Vanderbilt University, Nashville, TN, USA
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA
- Bennett A Landman
  - Department of Electrical Engineering, Vanderbilt University, Nashville, TN, USA
  - Department of Computer Science, Vanderbilt University, Nashville, TN, USA
  - Vanderbilt Brain Institute, Vanderbilt University, Nashville, TN, USA
  - Department of Biomedical Engineering, Vanderbilt University, Nashville, TN, USA
  - Vanderbilt Kennedy Center, Vanderbilt University Medical Center, Nashville, TN, USA
  - Vanderbilt University Institute of Imaging Science, Vanderbilt University, Nashville, TN, USA
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA
19
Overway EM, Bosma KJ, Claxton DP, Oeser JK, Singh K, Breidenbach LB, Mchaourab HS, Davis LK, O'Brien RM. Nonsynonymous single-nucleotide polymorphisms in the G6PC2 gene affect protein expression, enzyme activity, and fasting blood glucose. J Biol Chem 2022; 298:101534. [PMID: 34954144 PMCID: PMC8800118 DOI: 10.1016/j.jbc.2021.101534] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 10/27/2021] [Revised: 12/16/2021] [Accepted: 12/17/2021] [Indexed: 12/30/2022] Open
Abstract
G6PC2 encodes a glucose-6-phosphatase (G6Pase) catalytic subunit that modulates the sensitivity of insulin secretion to glucose and thereby regulates fasting blood glucose (FBG). A common single-nucleotide polymorphism (SNP) in G6PC2, rs560887 is an important determinant of human FBG variability. This SNP has a subtle effect on G6PC2 RNA splicing, which raises the question as to whether nonsynonymous SNPs with a major impact on G6PC2 stability or enzyme activity might have a broader disease/metabolic impact. Previous attempts to characterize such SNPs were limited by the very low inherent G6Pase activity and expression of G6PC2 protein in islet-derived cell lines. In this study, we describe the use of a plasmid vector that confers high G6PC2 protein expression in islet cells, allowing for a functional analysis of 22 nonsynonymous G6PC2 SNPs, 19 of which alter amino acids that are conserved in mouse G6PC2 and the human and mouse variants of the related G6PC1 isoform. We show that 16 of these SNPs markedly impair G6PC2 protein expression (>50% decrease). These SNPs have variable effects on the stability of human and mouse G6PC1, despite the high sequence homology between these isoforms. Four of the remaining six SNPs impaired G6PC2 enzyme activity. Electronic health record-derived phenotype analyses showed an association between high-impact SNPs and FBG, but not other diseases/metabolites. While homozygous G6pc2 deletion in mice increases the risk of hypoglycemia, these human data reveal no evidence that the beneficial use of partial G6PC2 inhibitors to lower FBG would be associated with unintended negative consequences.
Affiliation(s)
- Emily M Overway
  - Department of Molecular Physiology and Biophysics, Vanderbilt University School of Medicine, Nashville, Tennessee, USA
- Karin J Bosma
  - Department of Molecular Physiology and Biophysics, Vanderbilt University School of Medicine, Nashville, Tennessee, USA
- Derek P Claxton
  - Department of Molecular Physiology and Biophysics, Vanderbilt University School of Medicine, Nashville, Tennessee, USA
- James K Oeser
  - Department of Molecular Physiology and Biophysics, Vanderbilt University School of Medicine, Nashville, Tennessee, USA
- Kritika Singh
  - Department of Medicine, Vanderbilt University School of Medicine, Nashville, Tennessee, USA
- Lindsay B Breidenbach
  - Department of Medicine, Vanderbilt University School of Medicine, Nashville, Tennessee, USA
- Hassane S Mchaourab
  - Department of Molecular Physiology and Biophysics, Vanderbilt University School of Medicine, Nashville, Tennessee, USA
- Lea K Davis
  - Department of Molecular Physiology and Biophysics, Vanderbilt University School of Medicine, Nashville, Tennessee, USA; Department of Medicine, Vanderbilt University School of Medicine, Nashville, Tennessee, USA
- Richard M O'Brien
  - Department of Molecular Physiology and Biophysics, Vanderbilt University School of Medicine, Nashville, Tennessee, USA
20
Subclinical Hypothyroidism in Families Due to Chronic Consumption of Nitrate-Contaminated Water in Rural Areas with Intensive Livestock and Agricultural Practices in Durango, Mexico. WATER 2022. [DOI: 10.3390/w14030282] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 02/04/2023]
Abstract
Nitrate is a widely disseminated water pollutant and has been linked to health disorders, including hypothyroidism. Here, we evaluated the relationship between thyroid function and chronic exposure to nitrates in families from a rural zone, in addition to genetic and autoimmune factors. Exposure and effect biomarkers, thyroid hormones, and thyroperoxidase autoantibodies were measured, as well as the presence of two FOXE1 polymorphisms (rs965513, rs1867277). Pearson’s correlation, principal component analysis, Kruskal–Wallis, and chi-squared tests were used for statistical analysis. A total of 102 individuals were analyzed; 45% presented subclinical hypothyroidism. A negative correlation was observed between methemoglobin and the total T3 (r = −0.43, p = 0.001) and free T3 levels (r = −0.34, p = 0.001), as well as between TSH and the free T4 (r = −0.41, p = 0.0001) and total T4 (r = −0.36, p = 0.0001). A total of 15.7% had positive antithyroid ab-TPO, while the polymorphic genotype (AA) represented only 3% (rs965513) and 4% (rs1867277) among subjects with subclinical hypothyroidism. The high frequency of subclinical hypothyroidism in the population under study could be related mainly to chronic exposure through the consumption of nitrate-contaminated water.
|
21
|
Voora D, Baye J, McDermaid A, Gowda SN, Wilke RA, Myrmoe AN, Hajek C, Larson EA. SLCO1B1*5 allele is associated with atorvastatin discontinuation and adverse muscle symptoms in the context of routine care. Clin Pharmacol Ther 2022; 111:1075-1083. [PMID: 35034348 PMCID: PMC9303592 DOI: 10.1002/cpt.2527] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 09/29/2021] [Revised: 12/13/2021] [Accepted: 12/29/2021] [Indexed: 11/06/2022]
Abstract
SLCO1B1 genotype is known to influence patient adherence to statin therapy, in part by increasing the risk for statin-associated musculoskeletal symptoms (SAMS). The SLCO1B1*5 allele has previously been associated with simvastatin discontinuation and SAMS. Prior analyses of the relationship between SLCO1B1*5 and atorvastatin muscle side effects have been inconclusive due to insufficient power. We now quantify the impact of SLCO1B1*5 on atorvastatin discontinuation and SAMS in a large observational cohort using electronic medical record (EMR) data from a single health care system. In our study cohort (n = 1,627 patients exposed to atorvastatin during the course of routine clinical care), 56% (n = 912 of 1,627 patients) discontinued atorvastatin and 18% (n = 303 of 1,627 patients) developed SAMS. A univariate model revealed that SLCO1B1*5 increased the likelihood that patients would stop atorvastatin during routine care (Odds Ratio 1.2, 95% confidence interval [C.I.]: 1.1 - 1.5, p = 0.04). A multivariate Cox proportional hazards model further demonstrated that this same variant was associated with time to atorvastatin discontinuation (Hazard Ratio 1.2, C.I. 1.1 - 1.4, p = 0.004). Additional time-to-event analyses also revealed that SLCO1B1*5 was associated with SAMS (Hazard Ratio 1.4, C.I. 1.1 - 1.7, p = 0.02). Atorvastatin discontinuation was associated with SAMS (Odds Ratio 1.67, p = 0.0001) in our cohort.
Affiliation(s)
- Deepak Voora
  - Department of Medicine, Center for Applied Genomics & Precision Medicine, Duke University School of Medicine, Durham, 27710
- Adam McDermaid
  - Sanford Imagenetics, Sioux Falls, 57105; Department of Internal Medicine, University of South Dakota, Sanford School of Medicine, Sioux Falls, 57105
- Smitha Narayana Gowda
  - Department of Internal Medicine, University of South Dakota, Sanford School of Medicine, Sioux Falls, 57105
- Russell A Wilke
  - Department of Internal Medicine, University of South Dakota, Sanford School of Medicine, Sioux Falls, 57105
- Anna Nicole Myrmoe
  - Department of Internal Medicine, University of South Dakota, Sanford School of Medicine, Sioux Falls, 57105
- Catherine Hajek
  - Sanford Imagenetics, Sioux Falls, 57105; Department of Internal Medicine, University of South Dakota, Sanford School of Medicine, Sioux Falls, 57105
- Eric A Larson
  - Sanford Imagenetics, Sioux Falls, 57105; Department of Internal Medicine, University of South Dakota, Sanford School of Medicine, Sioux Falls, 57105
|
22
|
Abstract
Electronic health records (EHRs) are a rich source of data for researchers, but extracting meaningful information out of this highly complex data source is challenging. Phecodes represent one strategy for defining phenotypes for research using EHR data. They are a high-throughput phenotyping tool based on ICD (International Classification of Diseases) codes that can be used to rapidly define the case/control status of thousands of clinically meaningful diseases and conditions. Phecodes were originally developed to conduct phenome-wide association studies to scan for phenotypic associations with common genetic variants. Since then, phecodes have been used to support a wide range of EHR-based phenotyping methods, including the phenotype risk score. This review aims to comprehensively describe the development, validation, and applications of phecodes and suggest some future directions for phecodes and high-throughput phenotyping.
Affiliation(s)
- Lisa Bastarache
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee 37232, USA
|
23
|
Pruett DG, Shaw DM, Chen HH, Petty LE, Polikowsky HG, Kraft SJ, Jones RM, Below JE. Identifying developmental stuttering and associated comorbidities in electronic health records and creating a phenome risk classifier. JOURNAL OF FLUENCY DISORDERS 2021; 68:105847. [PMID: 33894541 PMCID: PMC8188400 DOI: 10.1016/j.jfludis.2021.105847] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 09/19/2020] [Revised: 03/30/2021] [Accepted: 03/31/2021] [Indexed: 05/31/2023]
Abstract
PURPOSE This study aimed to identify cases of developmental stuttering and associated comorbidities in de-identified electronic health records (EHRs) at Vanderbilt University Medical Center, and, in turn, build and test a stuttering prediction model. METHODS A multi-step process including a keyword search of medical notes, a text-mining algorithm, and manual review was employed to identify stuttering cases in the EHR. Confirmed cases were compared to matched controls in a phenotype code (phecode) enrichment analysis to reveal conditions associated with stuttering (i.e., comorbidities). These associated phenotypes were used as proxy variables to phenotypically predict stuttering in subjects within the EHR that were not otherwise identifiable using the multi-step identification process described above. RESULTS The multi-step process resulted in the manually reviewed identification of 1,143 stuttering cases in the EHR. Highly enriched phecodes included codes related to childhood onset fluency disorder, adult-onset fluency disorder, hearing loss, sleep disorders, atopy, a multitude of codes for infections, neurological deficits, and body weight. These phecodes were used as variables to create a phenome risk classifier (PheRC) prediction model to identify additional high likelihood stuttering cases. The PheRC prediction model resulted in a positive predictive value of 83 %. CONCLUSIONS This study demonstrates the feasibility of using EHRs in the study of stuttering and found phenotypic associations. The creation of the PheRC has the potential to enable future studies of stuttering using existing EHR data, including investigations into the genetic etiology.
Affiliation(s)
- Dillon G Pruett
  - Department of Hearing and Speech Sciences, Vanderbilt University, United States
- Douglas M Shaw
  - Vanderbilt Genetics Institute, Vanderbilt University Medical Center, United States
- Hung-Hsin Chen
  - Vanderbilt Genetics Institute, Vanderbilt University Medical Center, United States
- Lauren E Petty
  - Vanderbilt Genetics Institute, Vanderbilt University Medical Center, United States
- Hannah G Polikowsky
  - Vanderbilt Genetics Institute, Vanderbilt University Medical Center, United States
- Shelly Jo Kraft
  - Department of Communication Sciences and Disorders, Wayne State University, United States
- Robin M Jones
  - Department of Hearing and Speech Sciences, Vanderbilt University, United States
- Jennifer E Below
  - Vanderbilt Genetics Institute, Vanderbilt University Medical Center, United States.
|
24
|
Development of an Algorithm to Identify Cases of Nonalcoholic Steatohepatitis Cirrhosis in the Electronic Health Record. Dig Dis Sci 2021; 66:1452-1460. [PMID: 32535780 DOI: 10.1007/s10620-020-06388-y] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 10/11/2019] [Accepted: 06/03/2020] [Indexed: 01/11/2023]
Abstract
BACKGROUND AND AIMS Current genetic research of nonalcoholic steatohepatitis (NASH) cirrhosis is limited by our ability to accurately identify cases on a large scale. Our objective was to develop and validate an electronic health record (EHR) algorithm to accurately identify cases of NASH cirrhosis in the EHR. METHODS We used Clinical Query 2, a search tool at Beth Israel Deaconess Medical Center, to create a pool of potential NASH cirrhosis cases (n = 5415). We created a training set of 300 randomly selected patients for chart review to confirm cases of NASH cirrhosis. Test characteristics of different algorithms, consisting of diagnosis codes, laboratory values, anthropomorphic measurements, and medication records, were calculated. The algorithms with the highest positive predictive value (PPV) and the highest F score with a PPV ≥ 80% were selected for internal validation using a separate random set of 100 patients from the potential NASH cirrhosis pool. These were then externally validated in another random set of 100 individuals using the research patient data registry tool at Massachusetts General Hospital. RESULTS The algorithm with the highest PPV of 100% on internal validation and 92% on external validation consisted of ≥ 3 counts of "cirrhosis, no mention of alcohol" (571.5, K74.6) and ≥ 3 counts of "nonalcoholic fatty liver" (571.8-571.9, K75.81, K76.0) codes in the absence of any diagnosis codes for other common causes of chronic liver disease. CONCLUSIONS We developed and validated an EHR algorithm using diagnosis codes that accurately identifies patients with NASH cirrhosis.
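The code-count rule reported in the RESULTS above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two code groups are copied from the abstract (with "571.8-571.9" read as the two codes 571.8 and 571.9), while the exclusion set, the function name, and the flat list-of-codes input are hypothetical, since the abstract does not list the codes for "other common causes of chronic liver disease" or describe the data layout.

```python
from collections import Counter

# Code groups as quoted in the abstract's RESULTS.
CIRRHOSIS_CODES = {"571.5", "K74.6"}
NAFLD_CODES = {"571.8", "571.9", "K75.81", "K76.0"}

# Hypothetical placeholder: the abstract does not enumerate the exclusion
# codes. Chronic hepatitis C codes are used here purely as an example.
EXCLUSION_CODES = {"070.54", "B18.2"}


def is_nash_cirrhosis_case(diagnosis_codes, min_count=3):
    """Return True if a patient's code history satisfies the sketched rule.

    diagnosis_codes: iterable of diagnosis code strings, one per recorded
    occurrence (an assumed input format).
    """
    counts = Counter(diagnosis_codes)
    cirrhosis_hits = sum(counts[code] for code in CIRRHOSIS_CODES)
    nafld_hits = sum(counts[code] for code in NAFLD_CODES)
    has_exclusion = any(code in EXCLUSION_CODES for code in counts)
    return (
        cirrhosis_hits >= min_count
        and nafld_hits >= min_count
        and not has_exclusion
    )


if __name__ == "__main__":
    history = ["571.5", "K74.6", "K74.6", "K76.0", "K76.0", "571.8"]
    print(is_nash_cirrhosis_case(history))
```

Counting occurrences across each code group, rather than per individual code, is one reading of "≥ 3 counts"; a per-code threshold would be an equally plausible interpretation of the abstract.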
|
25
|
DeLozier S, Bland HT, McPheeters M, Wells Q, Farber-Eger E, Bejan CA, Fabbri D, Rosenbloom T, Roden D, Johnson KB, Wei WQ, Peterson J, Bastarache L. Phenotyping coronavirus disease 2019 during a global health pandemic: Lessons learned from the characterization of an early cohort. J Biomed Inform 2021; 117:103777. [PMID: 33838341 PMCID: PMC8026248 DOI: 10.1016/j.jbi.2021.103777] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 10/16/2020] [Revised: 02/09/2021] [Accepted: 04/03/2021] [Indexed: 01/08/2023]
Abstract
From the start of the coronavirus disease 2019 (COVID-19) pandemic, researchers have looked to electronic health record (EHR) data as a way to study possible risk factors and outcomes. To ensure the validity and accuracy of research using these data, investigators need to be confident that the phenotypes they construct are reliable and accurate, reflecting the healthcare settings from which they are ascertained. We developed a COVID-19 registry at a single academic medical center and used data from March 1 to June 5, 2020 to assess differences in population-level characteristics in pandemic and non-pandemic years respectively. Median EHR length, previously shown to impact phenotype performance in type 2 diabetes, was significantly shorter in the SARS-CoV-2 positive group relative to a 2019 influenza tested group (median 3.1 years vs 8.7; Wilcoxon rank sum P = 1.3e-52). Using three phenotyping methods of increasing complexity (billing codes alone and domain-specific algorithms provided by an EHR vendor and clinical experts), common medical comorbidities were abstracted from COVID-19 EHRs, defined by the presence of a positive laboratory test (positive predictive value 100%, recall 93%). After combining performance data across phenotyping methods, we observed significantly lower false negative rates for those records billed for a comprehensive care visit (p = 4e-11) and those with complete demographics data recorded (p = 7e-5). In an early COVID-19 cohort, we found that phenotyping performance of nine common comorbidities was influenced by median EHR length, consistent with previous studies, as well as by data density, which can be measured using portable metrics including CPT codes. Here we present those challenges and potential solutions to creating deeply phenotyped, acute COVID-19 cohorts.
Affiliation(s)
- Sarah DeLozier
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, West End Ave, Suite 1475, Nashville, TN 37203, USA.
- Harris T Bland
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, West End Ave, Suite 1475, Nashville, TN 37203, USA
- Melissa McPheeters
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, West End Ave, Suite 1475, Nashville, TN 37203, USA
- Quinn Wells
  - Division of Cardiovascular Medicine, Vanderbilt University Medical Center, Pierce Avenue, 383 Preston Research Building, Nashville, TN 37232, USA
- Eric Farber-Eger
  - Division of Cardiovascular Medicine, Vanderbilt University Medical Center, Pierce Avenue, 383 Preston Research Building, Nashville, TN 37232, USA
- Cosmin A Bejan
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, West End Ave, Suite 1475, Nashville, TN 37203, USA
- Daniel Fabbri
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, West End Ave, Suite 1475, Nashville, TN 37203, USA
- Trent Rosenbloom
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, West End Ave, Suite 1475, Nashville, TN 37203, USA
- Dan Roden
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, West End Ave, Suite 1475, Nashville, TN 37203, USA; Division of Cardiovascular Medicine, Vanderbilt University Medical Center, Pierce Avenue, 383 Preston Research Building, Nashville, TN 37232, USA
- Kevin B Johnson
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, West End Ave, Suite 1475, Nashville, TN 37203, USA
- Wei-Qi Wei
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, West End Ave, Suite 1475, Nashville, TN 37203, USA
- Josh Peterson
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, West End Ave, Suite 1475, Nashville, TN 37203, USA
- Lisa Bastarache
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, West End Ave, Suite 1475, Nashville, TN 37203, USA
|
26
|
Zhao Y, Fu S, Bielinski SJ, Decker PA, Chamberlain AM, Roger VL, Liu H, Larson NB. Natural Language Processing and Machine Learning for Identifying Incident Stroke From Electronic Health Records: Algorithm Development and Validation. J Med Internet Res 2021; 23:e22951. [PMID: 33683212 PMCID: PMC7985804 DOI: 10.2196/22951] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 07/27/2020] [Revised: 08/25/2020] [Accepted: 01/20/2021] [Indexed: 11/29/2022]
Abstract
Background Stroke is an important clinical outcome in cardiovascular research. However, the ascertainment of incident stroke is typically accomplished via time-consuming manual chart abstraction. Current phenotyping efforts using electronic health records for stroke focus on case ascertainment rather than incident disease, which requires knowledge of the temporal sequence of events. Objective The aim of this study was to develop a machine learning–based phenotyping algorithm for incident stroke ascertainment based on diagnosis codes, procedure codes, and clinical concepts extracted from clinical notes using natural language processing. Methods The algorithm was trained and validated using an existing epidemiology cohort consisting of 4914 patients with atrial fibrillation (AF) with manually curated incident stroke events. Various combinations of feature sets and machine learning classifiers were compared. Using a heuristic rule based on the composition of concepts and codes, we further detected the stroke subtype (ischemic stroke/transient ischemic attack or hemorrhagic stroke) of each identified stroke. The algorithm was further validated using a cohort (n=150) stratified sampled from a population in Olmsted County, Minnesota (N=74,314). Results Among the 4914 patients with AF, 740 had validated incident stroke events. The best-performing stroke phenotyping algorithm used clinical concepts, diagnosis codes, and procedure codes as features in a random forest classifier. Among patients with stroke codes in the general population sample, the best-performing model achieved a positive predictive value of 86% (43/50; 95% CI 0.74-0.93) and a negative predictive value of 96% (96/100). For subtype identification, we achieved an accuracy of 83% in the AF cohort and 80% in the general population sample. Conclusions We developed and validated a machine learning–based algorithm that performed well for identifying incident stroke and for determining type of stroke. 
The algorithm also performed well on a sample from a general population, further demonstrating its generalizability and potential for adoption by other institutions.
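The predictive values quoted in the Results follow directly from the reported counts, and the arithmetic can be checked with a short Python sketch. The function name and the split into true/false positives and negatives are illustrative assumptions: the abstract gives only the fractions 43/50 and 96/100, from which 7 false positives and 4 false negatives are inferred here.

```python
def predictive_values(tp, fp, tn, fn):
    """Return (PPV, NPV) from confusion-matrix counts."""
    ppv = tp / (tp + fp)  # share of predicted positives that are true cases
    npv = tn / (tn + fn)  # share of predicted negatives that are true non-cases
    return ppv, npv


# Counts inferred from the abstract: PPV 43/50 and NPV 96/100.
ppv, npv = predictive_values(tp=43, fp=7, tn=96, fn=4)
print(f"PPV = {ppv:.0%}, NPV = {npv:.0%}")  # prints "PPV = 86%, NPV = 96%"
```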
Affiliation(s)
- Yiqing Zhao
  - Department of Health Sciences Research, Mayo Clinic, Rochester, MN, United States
- Sunyang Fu
  - Department of Health Sciences Research, Mayo Clinic, Rochester, MN, United States
- Suzette J Bielinski
  - Department of Health Sciences Research, Mayo Clinic, Rochester, MN, United States
- Paul A Decker
  - Department of Health Sciences Research, Mayo Clinic, Rochester, MN, United States
- Alanna M Chamberlain
  - Department of Health Sciences Research, Mayo Clinic, Rochester, MN, United States
- Veronique L Roger
  - Department of Health Sciences Research, Mayo Clinic, Rochester, MN, United States
- Hongfang Liu
  - Department of Health Sciences Research, Mayo Clinic, Rochester, MN, United States
- Nicholas B Larson
  - Department of Health Sciences Research, Mayo Clinic, Rochester, MN, United States
|
27
|
A Phenome-Wide Analysis of Healthcare Costs Associated with Inflammatory Bowel Diseases. Dig Dis Sci 2021; 66:760-767. [PMID: 32436120 DOI: 10.1007/s10620-020-06329-9] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 02/10/2020] [Accepted: 05/08/2020] [Indexed: 02/07/2023]
Abstract
INTRODUCTION Crohn's disease (CD) and ulcerative colitis (UC) are associated with considerable direct healthcare costs. There have been few comprehensive analyses of all IBD- and non-IBD comorbidities that determine direct costs in this population. METHODS We used data from a validated cohort of patients with inflammatory bowel disease (IBD). Total healthcare costs were estimated as a sum of costs associated with IBD-related hospitalizations and surgery, imaging (CT or MR scans), outpatient visits, endoscopic evaluation, and emergency room (ER) care. All ICD-9 codes were extracted for each patient and clustered into 1804 distinct phecode clusters representing individual phenotypes. A phenome-wide association analysis (PheWAS) was performed using logistic regression to identify predictors of being in the top decile of costs. RESULTS Our cohort is comprised of 10,721 patients with IBD among whom 50% had CD. The median age was 46 years. The median total cost per patient is $11,203 (IQR $2396-30,563). The strongest association with total healthcare costs was intestinal obstruction without mention of hernia (p = 5.93 × 10^-156) and other intestinal obstruction (p = 9.24 × 10^-131). In addition, strong associations were observed for symptoms consistent with severity of IBD including the presence of fluid-electrolyte imbalance (p = 1.90 × 10^-130), hypovolemia (p = 1.65 × 10^-114), abdominal pain (p = 7.29 × 10^-60), and anemia (p = 1.90 × 10^-83). Cardiopulmonary diseases and psychological comorbidity also demonstrated significant associations with total costs with the latter being more strongly associated with ER visit-related costs. CONCLUSIONS Surrogate markers suggesting possible irreversible bowel damage and active disease demonstrate the greatest influence on IBD-related healthcare costs.
|
28
|
Zeber-Lubecka N, Hennig EE. Genetic Susceptibility to Joint Occurrence of Polycystic Ovary Syndrome and Hashimoto's Thyroiditis: How Far Is Our Understanding? Front Immunol 2021; 12:606620. [PMID: 33746952 PMCID: PMC7968419 DOI: 10.3389/fimmu.2021.606620] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 09/15/2020] [Accepted: 01/07/2021] [Indexed: 12/15/2022]
Abstract
Polycystic ovary syndrome (PCOS) and Hashimoto’s thyroiditis (HT) are endocrine disorders that commonly occur among young women. A higher prevalence of HT in women with PCOS, relative to healthy individuals, is observed consistently. Combined occurrence of both diseases is associated with a higher risk of severe metabolic and reproductive complications. Genetic factors strongly impact the pathogenesis of both PCOS and HT and several susceptibility loci associated with a higher risk of both disorders have been identified. Furthermore, some candidate gene polymorphisms are thought to be functionally relevant; however, few genetic variants are proposed to be causally associated with the incidence of both disorders together.
Affiliation(s)
- Natalia Zeber-Lubecka
  - Department of Gastroenterology, Hepatology and Clinical Oncology, Centre of Postgraduate Medical Education, Warsaw, Poland
- Ewa E Hennig
  - Department of Gastroenterology, Hepatology and Clinical Oncology, Centre of Postgraduate Medical Education, Warsaw, Poland; Department of Genetics, Maria Skłodowska-Curie National Research Institute of Oncology, Warsaw, Poland
|
29
|
Altaraihi M, Hansen TVO, Santoni-Rugiu E, Rossing M, Rasmussen ÅK, Gerdes AM, Wadt K. Prevalence of Pathogenic Germline DICER1 Variants in Young Individuals Thyroidectomised Due to Goitre - A National Danish Cohort. Front Endocrinol (Lausanne) 2021; 12:727970. [PMID: 34552563 PMCID: PMC8451242 DOI: 10.3389/fendo.2021.727970] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 06/20/2021] [Accepted: 07/26/2021] [Indexed: 11/13/2022]
Abstract
INTRODUCTION DICER1 syndrome encompasses a variety of benign and malignant manifestations including multinodular goitre, which is the most common manifestation among individuals carrying pathogenic DICER1 variants. This is the first study estimating the prevalence of pathogenic DICER1 variants in young individuals with multinodular goitre. METHODS Danish individuals diagnosed with nodular goitre based on thyroidectomy samples in 2001-2016 with the age limit at time of operation being ≤ 25 years were offered germline DICER1 gene testing. RESULTS Six of 46 individuals, 13% (CI [3.3;22.7], p <0.05), diagnosed with nodular goitre on the basis of thyroidectomy samples under the age of 25 years had pathogenic germline variants in DICER1. They were found in different pathoanatomical nodular goitre cohorts i.e. nodular goitre (n=2), colloid nodular goitre (n=3) and hyperplastic nodular goitre (n=1). CONCLUSIONS We recommend referral of patients thyroidectomised due to goitre aged <21 years and patients thyroidectomised due to goitre aged <25 years with a family history of goitre to genetic counselling. Patients of all ages thyroidectomised due to goitre, who are affected by another DICER1 manifestation should be referred to genetic counselling.
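The reported prevalence of 6 of 46 (13%) with an interval of [3.3; 22.7] is close to what a normal-approximation (Wald) interval for a proportion gives. The abstract does not state which interval method was used, so the sketch below is an assumption offered for orientation only; the function name is hypothetical. It yields roughly 3.3% to 22.8%, and the small difference in the upper bound would be consistent with the proportion having been rounded to 13% before the interval was computed.

```python
import math


def wald_interval(successes, n, z=1.96):
    """Normal-approximation (Wald) confidence interval for a proportion.

    Returns (point_estimate, lower_bound, upper_bound) as fractions.
    z=1.96 corresponds to a 95% interval.
    """
    p = successes / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p, p - half_width, p + half_width


# Counts from the abstract: 6 carriers among 46 tested individuals.
p, lower, upper = wald_interval(6, 46)
print(f"{p:.1%} (95% CI {lower:.1%} to {upper:.1%})")
```

With only 46 individuals and a proportion near 13%, a Wald interval is a rough approximation; exact or Wilson intervals are common alternatives for small samples.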
Affiliation(s)
- Mays Altaraihi
  - Department of Clinical Genetics, Copenhagen University Hospital, Rigshospitalet, Copenhagen, Denmark
  - *Correspondence: Mays Altaraihi,
- Thomas van Overeem Hansen
  - Department of Clinical Genetics, Copenhagen University Hospital, Rigshospitalet, Copenhagen, Denmark
- Eric Santoni-Rugiu
  - Department of Pathology, Copenhagen University Hospital, Rigshospitalet, Copenhagen, Denmark
- Maria Rossing
  - Center for Genomic Medicine, Copenhagen University Hospital, Rigshospitalet, Copenhagen, Denmark
- Åse Krogh Rasmussen
  - Department of Endocrinology and Metabolism, Copenhagen University Hospital, Rigshospitalet, Copenhagen, Denmark
- Anne-Marie Gerdes
  - Department of Clinical Genetics, Copenhagen University Hospital, Rigshospitalet, Copenhagen, Denmark
- Karin Wadt
  - Department of Clinical Genetics, Copenhagen University Hospital, Rigshospitalet, Copenhagen, Denmark
|
30
|
Zheng NS, Feng Q, Kerchberger VE, Zhao J, Edwards TL, Cox NJ, Stein CM, Roden DM, Denny JC, Wei WQ. PheMap: a multi-resource knowledge base for high-throughput phenotyping within electronic health records. J Am Med Inform Assoc 2020; 27:1675-1687. [PMID: 32974638 PMCID: PMC7751140 DOI: 10.1093/jamia/ocaa104] [Citation(s) in RCA: 36] [Impact Index Per Article: 7.2] [Received: 01/22/2020] [Revised: 05/06/2020] [Accepted: 05/13/2020] [Indexed: 01/16/2023]
Abstract
OBJECTIVE Developing algorithms to extract phenotypes from electronic health records (EHRs) can be challenging and time-consuming. We developed PheMap, a high-throughput phenotyping approach that leverages multiple independent, online resources to streamline the phenotyping process within EHRs. MATERIALS AND METHODS PheMap is a knowledge base of medical concepts with quantified relationships to phenotypes that have been extracted by natural language processing from publicly available resources. PheMap searches EHRs for each phenotype's quantified concepts and uses them to calculate an individual's probability of having this phenotype. We compared PheMap to clinician-validated phenotyping algorithms from the Electronic Medical Records and Genomics (eMERGE) network for type 2 diabetes mellitus (T2DM), dementia, and hypothyroidism using 84 821 individuals from Vanderbilt University Medical Center's BioVU DNA Biobank. We implemented PheMap-based phenotypes for genome-wide association studies (GWAS) for T2DM, dementia, and hypothyroidism, and phenome-wide association studies (PheWAS) for variants in FTO, HLA-DRB1, and TCF7L2. RESULTS In this initial iteration, the PheMap knowledge base contains quantified concepts for 841 disease phenotypes. For T2DM, dementia, and hypothyroidism, the accuracy of the PheMap phenotypes was >97% using a 50% threshold and eMERGE case-control status as a reference standard. In the GWAS analyses, PheMap-derived phenotype probabilities replicated 43 of 51 previously reported disease-associated variants for the 3 phenotypes. For 9 of the 11 top associations, PheMap provided an equivalent or more significant P value than eMERGE-based phenotypes. The PheMap-based PheWAS showed comparable or better performance to a traditional phecode-based PheWAS. PheMap is publicly available online.
CONCLUSIONS PheMap significantly streamlines the process of extracting research-quality phenotype information from EHRs, with comparable or better performance to current phenotyping approaches.
Affiliation(s)
- Neil S Zheng
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee, USA
- QiPing Feng
  - Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee, USA
  - Division of Clinical Pharmacology, Vanderbilt University Medical Center, Nashville, Tennessee, USA
- V Eric Kerchberger
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee, USA
  - Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee, USA
- Juan Zhao
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee, USA
- Todd L Edwards
  - Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee, USA
  - Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, Tennessee, USA
- Nancy J Cox
  - Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee, USA
  - Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, Tennessee, USA
- C Michael Stein
  - Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee, USA
  - Division of Clinical Pharmacology, Vanderbilt University Medical Center, Nashville, Tennessee, USA
  - Department of Pharmacology, Vanderbilt University, Nashville, Tennessee, USA
- Dan M Roden
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee, USA
  - Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee, USA
  - Division of Clinical Pharmacology, Vanderbilt University Medical Center, Nashville, Tennessee, USA
  - Department of Pharmacology, Vanderbilt University, Nashville, Tennessee, USA
- Joshua C Denny
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee, USA
  - Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee, USA
- Wei-Qi Wei
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee, USA
|
31
|
Syring KE, Bosma KJ, Goleva SB, Singh K, Oeser JK, Lopez CA, Skaar EP, McGuinness OP, Davis LK, Powell DR, O’Brien RM. Potential positive and negative consequences of ZnT8 inhibition. J Endocrinol 2020; 246:189-205. [PMID: 32485672 PMCID: PMC7351606 DOI: 10.1530/joe-20-0138] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 05/04/2020] [Accepted: 06/02/2020] [Indexed: 12/31/2022]
Abstract
SLC30A8 encodes the zinc transporter ZnT8. SLC30A8 haploinsufficiency protects against type 2 diabetes (T2D), suggesting that ZnT8 inhibitors may prevent T2D. We show here that, while adult chow fed Slc30a8 haploinsufficient and knockout (KO) mice have normal glucose tolerance, they are protected against diet-induced obesity (DIO), resulting in improved glucose tolerance. We hypothesize that this protection against DIO may represent one mechanism whereby SLC30A8 haploinsufficiency protects against T2D in humans and that, while SLC30A8 is predominantly expressed in pancreatic islet beta cells, this may involve a role for ZnT8 in extra-pancreatic tissues. Consistent with this latter concept we show in humans, using electronic health record-derived phenotype analyses, that the 'C' allele of the non-synonymous rs13266634 SNP, which confers a gain of ZnT8 function, is associated not only with increased T2D risk and blood glucose, but also with increased risk for hemolytic anemia and decreased mean corpuscular hemoglobin (MCH). In Slc30a8 KO mice, MCH was unchanged but reticulocytes, platelets and lymphocytes were elevated. Both young and adult Slc30a8 KO mice exhibit a delayed rise in insulin after glucose injection, but only the former exhibit increased basal insulin clearance and impaired glucose tolerance. Young Slc30a8 KO mice also exhibit elevated pancreatic G6pc2 gene expression, potentially mediated by decreased islet zinc levels. These data indicate that the absence of ZnT8 results in a transient impairment in some aspects of metabolism during development. These observations in humans and mice suggest the potential for negative effects associated with T2D prevention using ZnT8 inhibitors.
Affiliation(s)
- Kristen E. Syring
- Department of Molecular Physiology and Biophysics, Vanderbilt University School of Medicine
| | - Karin J. Bosma
- Department of Molecular Physiology and Biophysics, Vanderbilt University School of Medicine
| | - Slavina B. Goleva
- Department of Medicine, Vanderbilt University Medical Center, Nashville, TN 37232
| | - Kritika Singh
- Department of Medicine, Vanderbilt University Medical Center, Nashville, TN 37232
| | - James K. Oeser
- Department of Molecular Physiology and Biophysics, Vanderbilt University School of Medicine
| | - Christopher A. Lopez
- Department of Medicine, Vanderbilt University Medical Center, Nashville, TN 37232
| | - Eric P. Skaar
- Department of Medicine, Vanderbilt University Medical Center, Nashville, TN 37232
| | - Owen P. McGuinness
- Department of Molecular Physiology and Biophysics, Vanderbilt University School of Medicine
| | - Lea K. Davis
- Department of Molecular Physiology and Biophysics, Vanderbilt University School of Medicine
- Department of Medicine, Vanderbilt University Medical Center, Nashville, TN 37232
| | - David R. Powell
- Lexicon Pharmaceuticals Incorporated, 8800 Technology Forest Place, The Woodlands, Texas 77381
| | - Richard M. O’Brien
- Department of Molecular Physiology and Biophysics, Vanderbilt University School of Medicine
32
|
FLT3 stop mutation increases FLT3 ligand level and risk of autoimmune thyroid disease. Nature 2020; 584:619-623. [PMID: 32581359 DOI: 10.1038/s41586-020-2436-0] [Citation(s) in RCA: 87] [Impact Index Per Article: 17.4] [Received: 12/03/2019] [Accepted: 04/08/2020] [Indexed: 02/08/2023]
Abstract
Autoimmune thyroid disease is the most common autoimmune disease and is highly heritable [1]. Here, by using a genome-wide association study of 30,234 cases and 725,172 controls from Iceland and the UK Biobank, we find 99 sequence variants at 93 loci, of which 84 variants are previously unreported [2-7]. A low-frequency (1.36%) intronic variant in FLT3 (rs76428106-C) has the largest effect on risk of autoimmune thyroid disease (odds ratio (OR) = 1.46, P = 2.37 × 10^-24). rs76428106-C is also associated with systemic lupus erythematosus (OR = 1.90, P = 6.46 × 10^-4), rheumatoid factor and/or anti-CCP-positive rheumatoid arthritis (OR = 1.41, P = 4.31 × 10^-4) and coeliac disease (OR = 1.62, P = 1.20 × 10^-4). FLT3 encodes fms-related tyrosine kinase 3, a receptor that regulates haematopoietic progenitor and dendritic cells. RNA sequencing revealed that rs76428106-C generates a cryptic splice site, which introduces a stop codon in 30% of transcripts that are predicted to encode a truncated protein, which lacks its tyrosine kinase domains. Each copy of rs76428106-C doubles the plasma levels of the FLT3 ligand. Activating somatic mutations in FLT3 are associated with acute myeloid leukaemia [8] with a poor prognosis and rs76428106-C also predisposes individuals to acute myeloid leukaemia (OR = 1.90, P = 5.40 × 10^-3). Thus, a predicted loss-of-function germline mutation in FLT3 causes a reduction in full-length FLT3, with a compensatory increase in the levels of its ligand and an increased disease risk, similar to that of a gain-of-function mutation.
33
|
Siontis KC, Yao X, Pirruccello JP, Philippakis AA, Noseworthy PA. How Will Machine Learning Inform the Clinical Care of Atrial Fibrillation? Circ Res 2020; 127:155-169. [DOI: 10.1161/circresaha.120.316401] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Indexed: 11/16/2022]
Abstract
Machine learning applications in cardiology have rapidly evolved in the past decade. With the availability of machine learning tools coupled with vast data sources, the management of atrial fibrillation (AF), a common chronic disease with significant associated morbidity and socioeconomic impact, is undergoing a knowledge and practice transformation in the increasingly complex healthcare environment. Among other advances, deep learning methods, including convolutional neural networks, have enabled the development of AF screening pathways using the ubiquitous 12-lead ECG to detect asymptomatic paroxysmal AF in at-risk populations (such as those with cryptogenic stroke), the refinement of AF and stroke prediction schemes through comprehensive digital phenotyping using structured and unstructured data abstraction from the electronic health record or wearable monitoring technologies, and the optimization of treatment strategies, ranging from stroke prophylaxis to monitoring of antiarrhythmic drug (AAD) therapy. Although the clinical and population-wide impact of these tools continues to be elucidated, such transformative progress does not come without challenges, such as concerns about adopting black box technologies, assessing input data quality for training such models, and the risk of perpetuating rather than alleviating health disparities. This review critically appraises the advances of machine learning related to the care of AF thus far, their potential future directions, and their potential limitations and challenges.
Affiliation(s)
| | - Xiaoxi Yao
- Robert D and Patricia E Kern Center for the Science of Health Care Delivery (X.Y.), Mayo Clinic, Rochester, MN
- Division of Health Care Policy and Research, Department of Health Sciences Research (X.Y.), Mayo Clinic, Rochester, MN
| | - James P. Pirruccello
- Broad Institute, Cambridge, MA (J.P.P., A.A.P.)
- Division of Cardiology, Massachusetts General Hospital, Boston (J.P.P.)
| | | | - Peter A. Noseworthy
- From the Department of Cardiovascular Medicine (K.C.S., P.A.N.), Mayo Clinic, Rochester, MN
34
|
Wang L, Olson JE, Bielinski SJ, St Sauver JL, Fu S, He H, Cicek MS, Hathcock MA, Cerhan JR, Liu H. Impact of Diverse Data Sources on Computational Phenotyping. Front Genet 2020; 11:556. [PMID: 32582289 PMCID: PMC7283539 DOI: 10.3389/fgene.2020.00556] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 10/31/2019] [Accepted: 05/07/2020] [Indexed: 11/25/2022]
Abstract
Electronic health records (EHRs) are widely adopted and have great potential to serve as a rich, integrated source of phenotype information. Computational phenotyping, which extracts phenotypes from EHR data automatically, can accelerate the adoption and utilization of phenotype-driven efforts to advance scientific discovery and improve healthcare delivery. A number of computational phenotyping algorithms have been published, but data fragmentation, i.e., incomplete data within one single data source, has been raised as an inherent limitation of computational phenotyping. In this study, we investigated the impact of diverse data sources on two published computational phenotyping algorithms, rheumatoid arthritis (RA) and type 2 diabetes mellitus (T2DM), using Mayo EHRs and the Rochester Epidemiology Project (REP), which links medical records from multiple health care systems. Results showed that case selection for both RA (less prevalent) and T2DM (more prevalent) was markedly affected by data fragmentation: in Mayo data, positive predictive value (PPV) was 91.4% and 92.4% and false-negative rate (FNR) was 26.6% and 14%, respectively, whereas in REP, PPV was 97.2% and 98.3% and FNR was 5.2% and 3.3%. T2DM controls were also biased, with PPV of 91.2% and FNR of 1.2% for Mayo. We further elaborated the underlying reasons affecting performance.
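As a reading aid for the two metrics quoted in this abstract: PPV is the share of algorithm-selected cases that are true cases, and FNR is the share of true cases that the algorithm misses. The following is a minimal Python sketch of those standard definitions. The function names and the counts are hypothetical, chosen only for illustration, and are not taken from the study.

```python
def ppv(true_pos: int, false_pos: int) -> float:
    """Positive predictive value: TP / (TP + FP)."""
    return true_pos / (true_pos + false_pos)


def fnr(true_pos: int, false_neg: int) -> float:
    """False-negative rate: FN / (FN + TP)."""
    return false_neg / (false_neg + true_pos)


# Hypothetical chart-review counts, NOT from the study.
tp, fp, fn = 90, 10, 30

print(f"PPV = {ppv(tp, fp):.3f}")
print(f"FNR = {fnr(tp, fn):.3f}")
```

A fragmented data source tends to raise the false-negative count, because evidence of a diagnosis recorded elsewhere is invisible to the algorithm, which is consistent with the higher FNR reported for the single-institution data.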
Affiliation(s)
- Liwei Wang
- Division of Digital Health Sciences, Department of Health Sciences Research, Mayo Clinic, Rochester, MN, United States
| | - Janet E Olson
- Division of Epidemiology, Department of Health Sciences Research, Mayo Clinic, Rochester, MN, United States; Center for Individualized Medicine, Mayo Clinic, Rochester, MN, United States
| | - Suzette J Bielinski
- Division of Epidemiology, Department of Health Sciences Research, Mayo Clinic, Rochester, MN, United States
| | - Jennifer L St Sauver
- Division of Epidemiology, Department of Health Sciences Research, Mayo Clinic, Rochester, MN, United States
| | - Sunyang Fu
- Division of Digital Health Sciences, Department of Health Sciences Research, Mayo Clinic, Rochester, MN, United States
| | - Huan He
- Division of Digital Health Sciences, Department of Health Sciences Research, Mayo Clinic, Rochester, MN, United States
| | - Mine S Cicek
- Division of Experimental Pathology, Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, MN, United States
| | - Matthew A Hathcock
- Division of Biomedical Statistics and Informatics, Department of Health Sciences Research, Mayo Clinic, Rochester, MN, United States
| | - James R Cerhan
- Division of Epidemiology, Department of Health Sciences Research, Mayo Clinic, Rochester, MN, United States
| | - Hongfang Liu
- Division of Digital Health Sciences, Department of Health Sciences Research, Mayo Clinic, Rochester, MN, United States
35
|
Bosma KJ, Rahim M, Singh K, Goleva SB, Wall ML, Xia J, Syring KE, Oeser JK, Poffenberger G, McGuinness OP, Means AL, Powers AC, Li WH, Davis LK, Young JD, O’Brien RM. Pancreatic islet beta cell-specific deletion of G6pc2 reduces fasting blood glucose. J Mol Endocrinol 2020; 64:235-248. [PMID: 32213654 PMCID: PMC7331801 DOI: 10.1530/jme-20-0031] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 02/26/2020] [Accepted: 03/13/2020] [Indexed: 12/25/2022]
Abstract
The G6PC1, G6PC2 and G6PC3 genes encode distinct glucose-6-phosphatase catalytic subunit (G6PC) isoforms. In mice, germline deletion of G6pc2 lowers fasting blood glucose (FBG) without affecting fasting plasma insulin (FPI) while, in isolated islets, glucose-6-phosphatase activity and glucose cycling are abolished and glucose-stimulated insulin secretion (GSIS) is enhanced at submaximal but not high glucose. These observations are all consistent with a model in which G6PC2 regulates the sensitivity of GSIS to glucose by opposing the action of glucokinase. G6PC2 is highly expressed in human and mouse islet beta cells; however, various studies have shown trace G6PC2 expression in multiple tissues, raising the possibility that G6PC2 also affects FBG through non-islet cell actions. Using real-time PCR we show here that expression of G6pc1 and/or G6pc3 is much greater than that of G6pc2 in peripheral tissues, whereas G6pc2 expression is much higher than G6pc3 in both pancreas and islets, with G6pc1 expression not detected. In adult mice, beta cell-specific deletion of G6pc2 was sufficient to reduce FBG without changing FPI. In addition, electronic health record-derived phenotype analyses showed no association between G6PC2 expression and phenotypes clearly unrelated to islet function in humans. Finally, we show that germline G6pc2 deletion enhances glycolysis in mouse islets and that glucose cycling can also be detected in human islets. These observations are all consistent with a mechanism by which G6PC2 action in islets is sufficient to regulate the sensitivity of GSIS to glucose and hence influence FBG without affecting FPI.
Affiliation(s)
- Karin J. Bosma
- Department of Molecular Physiology and Biophysics, Vanderbilt University School of Medicine, Nashville, TN 37232
| | - Mohsin Rahim
- Department of Chemical and Biomolecular Engineering, Vanderbilt University School of Medicine, Nashville, TN 37232
| | - Kritika Singh
- Department of Medicine, Vanderbilt University School of Medicine, Nashville, TN 37232
| | - Slavina B. Goleva
- Department of Medicine, Vanderbilt University School of Medicine, Nashville, TN 37232
| | - Martha L. Wall
- Department of Chemical and Biomolecular Engineering, Vanderbilt University School of Medicine, Nashville, TN 37232
| | - Jing Xia
- Departments of Cell Biology and of Biochemistry, University of Texas Southwestern Medical Center, Dallas, TX 75390-9039
| | - Kristen E. Syring
- Department of Molecular Physiology and Biophysics, Vanderbilt University School of Medicine, Nashville, TN 37232
| | - James K. Oeser
- Department of Molecular Physiology and Biophysics, Vanderbilt University School of Medicine, Nashville, TN 37232
| | - Greg Poffenberger
- Department of Medicine, Vanderbilt University School of Medicine, Nashville, TN 37232
| | - Owen P. McGuinness
- Department of Molecular Physiology and Biophysics, Vanderbilt University School of Medicine, Nashville, TN 37232
| | - Anna L. Means
- Department of Surgery, Vanderbilt University School of Medicine, Nashville, TN 37232
| | - Alvin C. Powers
- Department of Molecular Physiology and Biophysics, Vanderbilt University School of Medicine, Nashville, TN 37232
- Department of Medicine, Vanderbilt University School of Medicine, Nashville, TN 37232
- VA Tennessee Valley Healthcare System, Nashville, TN 37232
| | - Wen-hong Li
- Departments of Cell Biology and of Biochemistry, University of Texas Southwestern Medical Center, Dallas, TX 75390-9039
| | - Lea K. Davis
- Department of Molecular Physiology and Biophysics, Vanderbilt University School of Medicine, Nashville, TN 37232
- Department of Medicine, Vanderbilt University School of Medicine, Nashville, TN 37232
| | - Jamey D. Young
- Department of Chemical and Biomolecular Engineering, Vanderbilt University School of Medicine, Nashville, TN 37232
| | - Richard M. O’Brien
- Department of Molecular Physiology and Biophysics, Vanderbilt University School of Medicine, Nashville, TN 37232
36
|
Song W, Zheng S, Li M, Zhang X, Cao R, Ye C, Shao R, Li G, Li J, Liu S, Li H, Li L. Linking endotypes to omics profiles in difficult-to-control asthma using the diagnostic Chinese medicine syndrome differentiation algorithm. J Asthma 2020; 57:532-542. [PMID: 30915875 DOI: 10.1080/02770903.2019.1590589] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 09/11/2018] [Revised: 01/27/2019] [Accepted: 02/28/2019] [Indexed: 02/08/2023]
Abstract
Objective: Patients with difficult-to-control asthma have difficulty breathing almost all of the time, which can even lead to life-threatening asthma attacks. However, only a few diagnostic markers for this disease have been identified. We aimed to take advantage of unique Chinese medicine theories for phenotypic classification and to explore molecular signatures in difficult-to-control asthma. Methods: The Chinese medicine syndrome differentiation algorithm (CMSDA) is a syndrome-scoring classification method based on the Chinese medicine overall observation theory. Patients with difficult-to-control asthma were classified into Cold- and Hot-pattern groups according to the CMSDA. DNA methylation and metabolomic profiles were obtained using the Infinium Human Methylation 450 BeadChip and gas chromatography-mass spectrometry. Subsequently, an integrated bioinformatics analysis was performed to compare the two patterns and identify Cold/Hot-associated candidates, followed by functional validation studies. Results: A total of 20 patients with difficult-to-control asthma were enrolled in the study. Ten were grouped as Cold and 10 as Hot according to the CMSDA. We identified distinct whole-genome DNA methylation and metabolomic profiles between Cold- and Hot-pattern groups. The ALDH3A1 gene exhibited variations in the DNA methylation probe cg10791966, while two metabolic pathways were associated with the two patterns. Conclusions: Our study introduced a novel diagnostic classification approach, the CMSDA, for difficult-to-control asthma. This is an alternative way to categorize diverse syndromes and link endotypes with omics profiles of this disease. ALDH3A1 might be a potential biomarker for precision diagnosis of difficult-to-control asthma.
Affiliation(s)
- Wenping Song
- Key Laboratory of Antibiotic Bioengineering of National Health and Family Planning Commission (NHFPC), Institute of Medicinal Biotechnology (IMB), Chinese Academy of Medical Sciences and Peking Union Medical College (CAMS & PUMC), Beijing, China
| | - Si Zheng
- Institute of Medical Information (IMI) and Library, Chinese Academy of Medical Sciences and Peking Union Medical College (CAMS & PUMC), Beijing, China
| | - Meng Li
- Guang'anmen Hospital, China Academy of Chinese Medical Sciences, Beijing, China
| | - Xia Zhang
- Guang'anmen Hospital, China Academy of Chinese Medical Sciences, Beijing, China
| | - Rui Cao
- Key Laboratory of Antibiotic Bioengineering of National Health and Family Planning Commission (NHFPC), Institute of Medicinal Biotechnology (IMB), Chinese Academy of Medical Sciences and Peking Union Medical College (CAMS & PUMC), Beijing, China
| | - Cheng Ye
- Key Laboratory of Antibiotic Bioengineering of National Health and Family Planning Commission (NHFPC), Institute of Medicinal Biotechnology (IMB), Chinese Academy of Medical Sciences and Peking Union Medical College (CAMS & PUMC), Beijing, China
| | - Rongguang Shao
- Key Laboratory of Antibiotic Bioengineering of National Health and Family Planning Commission (NHFPC), Institute of Medicinal Biotechnology (IMB), Chinese Academy of Medical Sciences and Peking Union Medical College (CAMS & PUMC), Beijing, China
| | - Guangxi Li
- Guang'anmen Hospital, China Academy of Chinese Medical Sciences, Beijing, China
| | - Jiao Li
- Institute of Medical Information (IMI) and Library, Chinese Academy of Medical Sciences and Peking Union Medical College (CAMS & PUMC), Beijing, China
| | - Shigang Liu
- Guang'anmen Hospital, China Academy of Chinese Medical Sciences, Beijing, China
| | - Hui Li
- Guang'anmen Hospital, China Academy of Chinese Medical Sciences, Beijing, China
| | - Liang Li
- Guang'anmen Hospital, China Academy of Chinese Medical Sciences, Beijing, China
37
|
Wells QS, Gupta DK, Smith JG, Collins SP, Storrow AB, Ferguson J, Smith ML, Pulley JM, Collier S, Wang X, Roden DM, Gerszten RE, Wang TJ. Accelerating Biomarker Discovery Through Electronic Health Records, Automated Biobanking, and Proteomics. J Am Coll Cardiol 2020; 73:2195-2205. [PMID: 31047008 DOI: 10.1016/j.jacc.2019.01.074] [Citation(s) in RCA: 36] [Impact Index Per Article: 7.2] [Received: 08/28/2018] [Revised: 01/22/2019] [Accepted: 01/23/2019] [Indexed: 01/13/2023]
Abstract
BACKGROUND: Circulating biomarkers can facilitate diagnosis and risk stratification for complex conditions such as heart failure (HF). Newer molecular platforms can accelerate biomarker discovery, but they require significant resources for data and sample acquisition. OBJECTIVES: The purpose of this study was to test a pragmatic biomarker discovery strategy integrating automated clinical biobanking with proteomics. METHODS: Using the electronic health record, the authors identified patients with and without HF, retrieved their discarded plasma samples, and screened these specimens using a DNA aptamer-based proteomic platform (1,129 proteins). Candidate biomarkers were validated in 3 different prospective cohorts. RESULTS: In an automated manner, plasma samples from 1,315 patients (31% with HF) were collected. Proteomic analysis of a 96-patient subset identified 9 candidate biomarkers (p < 4.42 × 10^-5). Two proteins, angiopoietin-2 and thrombospondin-2, were associated with HF in 3 separate validation cohorts. In an emergency department-based registry of 852 dyspneic patients, the 2 biomarkers improved discrimination of acute HF compared with a clinical score (p < 0.0001) or clinical score plus B-type natriuretic peptide (p = 0.02). In a community-based cohort (n = 768), both biomarkers predicted incident HF independent of traditional risk factors and N-terminal pro-B-type natriuretic peptide (hazard ratio per SD increment: 1.35 [95% confidence interval: 1.14 to 1.61; p = 0.0007] for angiopoietin-2, and 1.37 [95% confidence interval: 1.06 to 1.79; p = 0.02] for thrombospondin-2). Among 30 advanced HF patients, concentrations of both biomarkers declined (80% to 84%) following cardiac transplant (p < 0.001 for both). CONCLUSIONS: A novel strategy integrating electronic health records, discarded clinical specimens, and proteomics identified 2 biomarkers that robustly predict HF across diverse clinical settings. This approach could accelerate biomarker discovery for many diseases.
Affiliation(s)
- Quinn S Wells
- Vanderbilt Translational and Clinical Cardiovascular Research Center, Vanderbilt University Medical Center, Nashville, Tennessee; Division of Cardiovascular Medicine, Vanderbilt University Medical Center, Nashville, Tennessee
| | - Deepak K Gupta
- Vanderbilt Translational and Clinical Cardiovascular Research Center, Vanderbilt University Medical Center, Nashville, Tennessee; Division of Cardiovascular Medicine, Vanderbilt University Medical Center, Nashville, Tennessee
| | - J Gustav Smith
- Department of Cardiology, Clinical Sciences, Lund University and Skane University Hospital, Lund, Sweden
| | - Sean P Collins
- Department of Emergency Medicine, Vanderbilt University Medical Center, Nashville, Tennessee
| | - Alan B Storrow
- Department of Emergency Medicine, Vanderbilt University Medical Center, Nashville, Tennessee
| | - Jane Ferguson
- Vanderbilt Translational and Clinical Cardiovascular Research Center, Vanderbilt University Medical Center, Nashville, Tennessee; Division of Cardiovascular Medicine, Vanderbilt University Medical Center, Nashville, Tennessee
| | - Maya Landenhed Smith
- Department of Cardiothoracic Surgery, Clinical Sciences, Lund University and Skane University Hospital, Lund, Sweden
| | - Jill M Pulley
- Vanderbilt Institute for Clinical and Translational Research, Vanderbilt University Medical Center, Nashville, Tennessee
| | - Sarah Collier
- Vanderbilt Institute for Clinical and Translational Research, Vanderbilt University Medical Center, Nashville, Tennessee
| | - Xiaoming Wang
- Vanderbilt Institute for Clinical and Translational Research, Vanderbilt University Medical Center, Nashville, Tennessee
| | - Dan M Roden
- Division of Cardiovascular Medicine, Vanderbilt University Medical Center, Nashville, Tennessee; Departments of Medicine, Pharmacology, and Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee
| | - Robert E Gerszten
- Division of Cardiovascular Medicine, Beth Israel Deaconess Medical Center, Boston, Massachusetts
| | - Thomas J Wang
- Vanderbilt Translational and Clinical Cardiovascular Research Center, Vanderbilt University Medical Center, Nashville, Tennessee; Division of Cardiovascular Medicine, Vanderbilt University Medical Center, Nashville, Tennessee
38
|
Abstract
Background: Uncertainty in the mechanism and directionality of observational associations between thyroid function and kidney function may be addressed by genetic analysis with an instrumental variable method termed bidirectional Mendelian randomization (MR). Methods: In the Women's Genome Health Study (WGHS), observational associations between thyroid measures and kidney function were evaluated. Genetic instruments for MR were from recent genome-wide association studies (GWAS) of hypothyroidism, thyrotropin (TSH), and free thyroxine (fT4) concentrations within the reference range, thyroid peroxidase antibodies (TPOAb), estimated glomerular filtration rate from creatinine (eGFRcrea), eGFR from cystatin C (eGFRcys), and chronic kidney disease (CKD). In WGHS individual-level data, these instruments were used for bidirectional MR between thyroid (N = 3336) and kidney (N = 23,186) functions. To increase power, MR was also performed using GWAS summary statistics from the Chronic Kidney Disease Genetics Consortium (CKDGen) for eGFRcrea (N = 567,460), eGFRcys (N = 24,063), CKD [N(total) = 480,698, N(cases) = 41,395], and urinary albumin/creatinine ratio (UACR; N = 54,450). Results: In the WGHS, hypothyroidism was observationally associated with decreased eGFRcrea [beta (standard error, SE): -0.024 (0.009) ln(mL/min/1.73 m^2), p = 0.01]. By MR, hypothyroidism was associated with decreased eGFRcrea in the WGHS [beta (SE): -0.007 (0.002) per doubled odds hypothyroidism, p = 1.7 × 10^-3] and in CKDGen [beta (SE): -0.004 (0.0005), p = 2.0 × 10^-22], and the association was robust to sensitivity analysis. Hypothyroidism was also associated by MR with increased CKD in CKDGen (odds ratio, OR [confidence interval, CI]: 1.05 [1.03-1.08], p = 3.3 × 10^-5), but not in the WGHS (OR [CI]: 1.02 [0.95-1.10], p = 0.57). Increased TSH within the reference range had an MR association with decreased eGFRcrea in the WGHS [beta (SE): -0.018 (0.007) ln(mL/min/1.73 m^2)/standard deviation, SD, p = 6.5 × 10^-3] and CKDGen [beta (SE): -0.008 (0.001) ln(mL/min/1.73 m^2)/SD, p = 6.8 × 10^-17], and with CKD in CKDGen (OR [CI]: 1.10 [1.04-1.15], p = 3.1 × 10^-4). There were no MR associations of hypothyroidism or TSH with eGFRcys or UACR, and MR associations of fT4 in the reference range with kidney function were inconsistent in both the WGHS and CKDGen. However, by MR in CKDGen, TPOAb were robustly associated with decreased eGFRcrea [beta (SE): -0.041 (0.009), p = 6.2 × 10^-6] and decreased eGFRcys [beta (SE): -0.294 (0.065), p = 6.2 × 10^-6]. TPOAb were less robustly associated with CKD but not associated with UACR. In reverse MR in the WGHS, kidney function was not consistently associated with thyroid function. Conclusions: Bidirectional MR supports a directional association from hypothyroidism, increased TSH, and TPOAb, but not fT4, to decreased eGFRcrea and increased CKD.
Affiliation(s)
- Christina Ellervik
- Department of Laboratory Medicine, Boston Children's Hospital, Boston, Massachusetts
- Department of Pathology, Harvard Medical School, Boston, Massachusetts
- Department of Medicine, Harvard Medical School, Boston, Massachusetts
- Christina Ellervik, MD, PhD, DMSci, Department of Laboratory Medicine, Boston Children's Hospital, 300 Longwood Avenue, Boston, MA 02115
| | - Samia Mora
- Department of Clinical Medicine, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Cardiovascular Division, Brigham and Women's Hospital, Boston, Massachusetts
- Division of Preventive Medicine, Brigham and Women's Hospital, Boston, Massachusetts
- Center for Lipid Metabolomics, Division of Preventive Medicine, Brigham and Women's Hospital, Boston, Massachusetts
| | - Paul M. Ridker
- Department of Clinical Medicine, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Cardiovascular Division, Brigham and Women's Hospital, Boston, Massachusetts
- Division of Preventive Medicine, Brigham and Women's Hospital, Boston, Massachusetts
- Center for Lipid Metabolomics, Division of Preventive Medicine, Brigham and Women's Hospital, Boston, Massachusetts
- Department of Epidemiology, T.H. Chan School of Public Health, Boston, Massachusetts
| | - Daniel I. Chasman
- Cardiovascular Division, Brigham and Women's Hospital, Boston, Massachusetts
- Center for Lipid Metabolomics, Division of Preventive Medicine, Brigham and Women's Hospital, Boston, Massachusetts
- Address correspondence to: Daniel I. Chasman, PhD, Division of Preventive Medicine, Brigham and Women's Hospital, 900 Commonwealth Avenue, Boston, MA 02215
39
|
Salem JE, Shoemaker MB, Bastarache L, Shaffer CM, Glazer AM, Kroncke B, Wells QS, Shi M, Straub P, Jarvik GP, Larson EB, Velez Edwards DR, Edwards TL, Davis LK, Hakonarson H, Weng C, Fasel D, Knollmann BC, Wang TJ, Denny JC, Ellinor PT, Roden DM, Mosley JD. Association of Thyroid Function Genetic Predictors With Atrial Fibrillation: A Phenome-Wide Association Study and Inverse-Variance Weighted Average Meta-analysis. JAMA Cardiol 2020; 4:136-143. [PMID: 30673079 DOI: 10.1001/jamacardio.2018.4615] [Citation(s) in RCA: 29] [Impact Index Per Article: 5.8] [Indexed: 12/20/2022]
Abstract
Importance: Thyroid hormone levels are tightly regulated through feedback inhibition by thyrotropin, produced by the pituitary gland. Hyperthyroidism is overwhelmingly due to thyroid disorders and is well recognized to contribute to a wide spectrum of cardiovascular morbidity, particularly the increasingly common arrhythmia atrial fibrillation (AF). Objective: To determine the association between genetically determined thyrotropin levels and AF. Design, Setting, and Participants: This phenome-wide association study scanned 1318 phenotypes associated with a polygenic predictor of thyrotropin levels identified by a previously published genome-wide association study that included participants of European ancestry. North American individuals of European ancestry with longitudinal electronic health records were analyzed from May 2008 to November 2016. Analysis began March 2018. Main Outcomes and Measures: Clinical diagnoses associated with a polygenic predictor of thyrotropin levels. Exposures: Genetically determined thyrotropin levels. Results: Of 37 154 individuals, 19 330 (52%) were men. The thyrotropin polygenic predictor was positively associated with hypothyroidism (odds ratio [OR], 1.10; 95% CI, 1.07-1.14; P = 5 × 10⁻¹¹) and inversely associated with diagnoses related to hyperthyroidism (OR, 0.64; 95% CI, 0.54-0.74; P = 2 × 10⁻⁸ for toxic multinodular goiter). Among nonthyroid associations, the top association was AF/flutter (OR, 0.93; 95% CI, 0.9-0.95; P = 9 × 10⁻⁷). When the analyses were repeated excluding 9801 individuals with any diagnoses of a thyroid-related disease, the AF association persisted (OR, 0.91; 95% CI, 0.88-0.95; P = 2.9 × 10⁻⁶). To replicate this association, we conducted an inverse-variance weighted average meta-analysis using AF single-nucleotide variant weights from a genome-wide association study of 17 931 AF cases and 115 142 controls. As in the discovery analyses, each SD increase in predicted thyrotropin was associated with a decreased risk of AF (OR, 0.86; 95% CI, 0.79-0.93; P = 4.7 × 10⁻⁴). In a set of AF cases (n = 745) and controls (n = 1680) older than 55 years, directly measured thyrotropin levels that fell within the normal range were inversely associated with AF risk (OR, 0.91; 95% CI, 0.83-0.99; P = .04). Conclusions and Relevance: This study suggests a role for genetically determined variation in thyroid function within a physiologically accepted normal range as a risk factor for AF. The clinical decision to treat subclinical thyroid disease should incorporate the risk for AF as antithyroid medications to treat hyperthyroidism may reduce AF risk and thyroid hormone replacement for hypothyroidism may increase AF risk.
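The inverse-variance weighted average mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code: it assumes the common textbook form in which each variant contributes a ratio estimate (outcome effect divided by exposure effect) weighted by the inverse of its approximate variance. The function name `ivw_estimate` and all numbers are hypothetical.

```python
import math
from typing import Sequence, Tuple


def ivw_estimate(
    beta_exposure: Sequence[float],
    beta_outcome: Sequence[float],
    se_outcome: Sequence[float],
) -> Tuple[float, float]:
    """Inverse-variance weighted average of per-variant ratio estimates.

    Assumption (not from the abstract): the variance of each ratio
    beta_outcome / beta_exposure is approximated by
    se_outcome**2 / beta_exposure**2, so its weight is the inverse of that.
    """
    weights = [bx ** 2 / se ** 2 for bx, se in zip(beta_exposure, se_outcome)]
    ratios = [by / bx for bx, by in zip(beta_exposure, beta_outcome)]
    total_weight = sum(weights)
    estimate = sum(w * r for w, r in zip(weights, ratios)) / total_weight
    std_error = math.sqrt(1.0 / total_weight)
    return estimate, std_error


# Hypothetical per-variant effects (SD units for the exposure, log-odds for the outcome).
bx = [0.10, 0.20, 0.05]
by = [-0.010, -0.020, -0.005]
se = [0.01, 0.02, 0.01]

log_or, log_or_se = ivw_estimate(bx, by, se)
odds_ratio = math.exp(log_or)  # odds ratio per SD increase in the exposure
```

Exponentiating the pooled log-odds gives an odds ratio per SD increase in the exposure, which is the scale the abstract reports.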
Affiliation(s)
- Joe-Elie Salem
- Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee.,Sorbonne Université, Institut National de la Santé et de la Recherche Médicale (INSERM) CIC Paris-Est, AP-HP, Institute of Cardio metabolism and Nutrition (ICAN), Pitié-Salpêtrière Hospital, Department of Pharmacology, Paris, France.,Department of Pharmacology, Vanderbilt University Medical Center, Nashville, Tennessee
| | - M Benjamin Shoemaker
- Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee
| | - Lisa Bastarache
- Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee
| | - Christian M Shaffer
- Department of Pharmacology, Vanderbilt University Medical Center, Nashville, Tennessee
| | - Andrew M Glazer
- Department of Pharmacology, Vanderbilt University Medical Center, Nashville, Tennessee
| | - Brett Kroncke
- Department of Pharmacology, Vanderbilt University Medical Center, Nashville, Tennessee
| | - Quinn S Wells
- Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee
| | - Mingjian Shi
- Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee
| | - Peter Straub
- Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee
| | - Gail P Jarvik
- Department of Medicine (Medical Genetics), University of Washington, Seattle.,Department Genome Sciences, University of Washington, Seattle
| | - Eric B Larson
- Department of Medicine (Medical Genetics), University of Washington, Seattle.,Kaiser Permanente Washington Health Research Institute, Seattle
| | - Digna R Velez Edwards
- Department of Obstetrics and Gynecology, Vanderbilt University Medical Center, Nashville, Tennessee.,Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee
| | - Todd L Edwards
- Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee
| | - Lea K Davis
- Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee
| | - Hakon Hakonarson
- Divisions of Human Genetics and Pulmonary Medicine, The Children's Hospital of Philadelphia, Philadelphia, Pennsylvania
| | - Chunhua Weng
- Department of Biomedical Informatics, Columbia University, New York
| | - David Fasel
- Department of Biomedical Informatics, Columbia University, New York
| | - Bjorn C Knollmann
- Department of Pharmacology, Vanderbilt University Medical Center, Nashville, Tennessee
| | - Thomas J Wang
- Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee
| | - Joshua C Denny
- Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee
| | - Patrick T Ellinor
- Cardiovascular Research Center, Massachusetts General Hospital, Boston.,The Broad Institute of Harvard and MIT, Cambridge, Massachusetts
| | - Dan M Roden
- Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee.,Department of Pharmacology, Vanderbilt University Medical Center, Nashville, Tennessee.,Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee
| | - Jonathan D Mosley
- Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee.,Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee
|
40
|
Ellervik C, Roselli C, Christophersen IE, Alonso A, Pietzner M, Sitlani CM, Trompet S, Arking DE, Geelhoed B, Guo X, Kleber ME, Lin HJ, Lin H, MacFarlane P, Selvin E, Shaffer C, Smith AV, Verweij N, Weiss S, Cappola AR, Dörr M, Gudnason V, Heckbert S, Mooijaart S, März W, Psaty BM, Ridker PM, Roden D, Stott DJ, Völzke H, Benjamin EJ, Delgado G, Ellinor P, Homuth G, Köttgen A, Jukema JW, Lubitz SA, Mora S, Rienstra M, Rotter JI, Shoemaker MB, Sotoodehnia N, Taylor KD, van der Harst P, Albert CM, Chasman DI. Assessment of the Relationship Between Genetic Determinants of Thyroid Function and Atrial Fibrillation: A Mendelian Randomization Study. JAMA Cardiol 2020; 4:144-152. [PMID: 30673084 DOI: 10.1001/jamacardio.2018.4635] [Citation(s) in RCA: 72] [Impact Index Per Article: 14.4] [Indexed: 02/06/2023]
Abstract
Importance: Increased free thyroxine (FT4) and decreased thyrotropin are associated with increased risk of atrial fibrillation (AF) in observational studies, but direct involvement is unclear. Objective: To evaluate the potential direct involvement of thyroid traits on AF. Design, Setting, and Participants: Study-level mendelian randomization (MR) included 11 studies, and summary-level MR included 55 114 AF cases and 482 295 referents, all of European ancestry. Exposures: Genomewide significant variants were used as instruments for standardized FT4 and thyrotropin levels within the reference range, standardized triiodothyronine (FT3):FT4 ratio, hypothyroidism, standardized thyroid peroxidase antibody levels, and hyperthyroidism. Mendelian randomization used genetic risk scores in study-level analysis or individual single-nucleotide polymorphisms in 2-sample MR for the summary-level data. Main Outcomes and Measures: Prevalent and incident AF. Results: The study-level analysis included 7679 individuals with AF and 49 233 referents (mean age [standard error], 62 [3] years; 15 859 men [29.7%]). In study-level random-effects meta-analysis, the pooled hazard ratio of FT4 levels (nanograms per deciliter) for incident AF was 1.55 (95% CI, 1.09-2.20; P = .02; I² = 76%) and the pooled odds ratio (OR) for prevalent AF was 2.80 (95% CI, 1.41-5.54; P = .003; I² = 64%) in multivariable-adjusted analyses. The FT4 genetic risk score was associated with an increase in FT4 by 0.082 SD (standard error, 0.007; P < .001) but not with incident AF (risk ratio, 0.84; 95% CI, 0.62-1.14; P = .27) or prevalent AF (OR, 1.32; 95% CI, 0.64-2.73; P = .46). Similarly, in summary-level inverse-variance weighted random-effects MR, gene-based FT4 within the reference range was not associated with AF (OR, 1.01; 95% CI, 0.89-1.14; P = .88). However, gene-based increased FT3:FT4 ratio, increased thyrotropin within the reference range, and hypothyroidism were associated with AF with inverse-variance weighted random-effects OR of 1.33 (95% CI, 1.08-1.63; P = .006), 0.88 (95% CI, 0.84-0.92; P < .001), and 0.94 (95% CI, 0.90-0.99; P = .009), respectively, and robust to tests of horizontal pleiotropy. However, the subset of hypothyroidism single-nucleotide polymorphisms involved in autoimmunity and thyroid peroxidase antibodies levels were not associated with AF. Gene-based hyperthyroidism was associated with AF with MR-Egger OR of 1.31 (95% CI, 1.05-1.63; P = .02) with evidence of horizontal pleiotropy (P = .045). Conclusions and Relevance: Genetically increased FT3:FT4 ratio and hyperthyroidism, but not FT4 within the reference range, were associated with increased AF, and increased thyrotropin within the reference range and hypothyroidism were associated with decreased AF, supporting a pathway involving the pituitary-thyroid-cardiac axis.
Affiliation(s)
- Christina Ellervik
- Department of Laboratory Medicine, Boston Children's Hospital, Boston, Massachusetts.,Harvard Medical School, Boston, Massachusetts.,Division of Clinical Medicine, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
| | - Carolina Roselli
- Program in Medical and Population Genetics, The Broad Institute of MIT and Harvard, Cambridge, Massachusetts
| | - Ingrid E Christophersen
- Department of Medical Genetics, Oslo University Hospital, Oslo, Norway.,Department of Medical Research, Bærum Hospital, Vestre Viken Hospital Trust, Gjettum, Norway
| | - Alvaro Alonso
- Department of Epidemiology, Rollins School of Public Health, Emory University, Atlanta, Georgia
| | - Maik Pietzner
- Institute of Clinical Chemistry and Laboratory Medicine, University Medicine Greifswald, Greifswald, Germany.,DZHK (German Center for Cardiovascular Research), Partner Site Greifswald, Greifswald, Germany
| | - Collen M Sitlani
- Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle
| | - Stella Trompet
- Department of Cardiology, Leiden University Medical Center, Leiden, the Netherlands.,Section of Gerontology and Geriatrics, Department of Internal Medicine, Leiden University Medical Center, Leiden, the Netherlands
| | - Dan E Arking
- McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, Maryland
| | - Bastiaan Geelhoed
- University Medical Center Groningen, University of Groningen, Groningen, the Netherlands
| | - Xiuqing Guo
- Division of Genomic Outcomes, Institute for Translational Genomics and Population Sciences, Torrance, California.,Department of Pediatrics, Los Angeles Biomedical Research Institute, Harbor-University of California, Los Angeles Medical Center, Torrance.,Department of Pediatrics, David Geffen School of Medicine, University of California, Los Angeles
| | - Marcus E Kleber
- Vth Department of Medicine (Nephrology, Hypertensiology, Endocrinology, Diabetology, Rheumatology), Medical Faculty of Mannheim, University of Heidelberg, Mannheim, Germany
| | - Henry J Lin
- Division of Genomic Outcomes, Institute for Translational Genomics and Population Sciences, Torrance, California.,Department of Pediatrics, Los Angeles Biomedical Research Institute, Harbor-University of California, Los Angeles Medical Center, Torrance.,Department of Pediatrics, David Geffen School of Medicine, University of California, Los Angeles
| | - Honghuang Lin
- Department of Medicine, Boston University School of Medicine, Boston, Massachusetts.,National Heart Lung and Blood Institute's and Boston University's Framingham Heart Study, Framingham, Massachusetts
| | - Peter MacFarlane
- Institute of Health and Wellbeing, College of Medical, Veterinary and Life Sciences, University of Glasgow, Glasgow, United Kingdom
| | - Elizabeth Selvin
- Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland
| | - Christian Shaffer
- Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee
| | - Albert V Smith
- School of Public Health, Department of Biostatistics, University of Michigan, Ann Arbor.,Icelandic Heart Association, Kopavogur, Iceland
| | - Niek Verweij
- University Medical Center Groningen, University of Groningen, Groningen, the Netherlands
| | - Stefan Weiss
- DZHK (German Center for Cardiovascular Research), Partner Site Greifswald, Greifswald, Germany.,Interfaculty Institute for Genetics and Functional Genomics, University Medicine and University Greifswald, Greifswald, Germany
| | - Anne R Cappola
- Smilow Center for Translational Research, Department of Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia
| | - Marcus Dörr
- DZHK (German Center for Cardiovascular Research), Partner Site Greifswald, Greifswald, Germany.,Department of Internal Medicine B, University Medicine Greifswald, Greifswald, Germany
| | - Vilmundur Gudnason
- Icelandic Heart Association, Kopavogur, Iceland.,Faculty of Medicine, University of Iceland, Reykjavik, Iceland
| | - Susan Heckbert
- Department of Epidemiology, University of Washington, Seattle
| | - Simon Mooijaart
- Section of Gerontology and Geriatrics, Department of Internal Medicine, Leiden University Medical Center, Leiden, the Netherlands.,Institute for Evidence-Based Medicine in Old Age, Leiden, the Netherlands
| | - Winfried März
- Vth Department of Medicine (Nephrology, Hypertensiology, Endocrinology, Diabetology, Rheumatology), Medical Faculty of Mannheim, University of Heidelberg, Mannheim, Germany.,Synlab Academy, Synlab Holding Deutschland GmbH, Mannheim, Germany
| | - Bruce M Psaty
- Cardiovascular Health Research Unit, Department of Medicine, Epidemiology, and Health Services, University of Washington, Seattle.,Kaiser Permanente Washington Health Research Institute, Seattle
| | - Paul M Ridker
- Harvard Medical School, Boston, Massachusetts.,Division of Cardiovascular, Brigham and Women's Hospital, Boston, Massachusetts.,Division of Preventive Medicine, Brigham and Women's Hospital, Boston, Massachusetts
| | - Dan Roden
- Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee
| | - David J Stott
- Institute of Cardiovascular and Medical Sciences, College of Medical, Veterinary and Life Sciences, University of Glasgow, Glasgow, United Kingdom
| | - Henry Völzke
- DZHK (German Center for Cardiovascular Research), Partner Site Greifswald, Greifswald, Germany.,Institute for Community Medicine, University Medicine Greifswald, Greifswald, Germany
| | - Emelia J Benjamin
- Department of Medicine, Boston University School of Medicine, Boston, Massachusetts.,National Heart Lung and Blood Institute's and Boston University's Framingham Heart Study, Framingham, Massachusetts.,Department of Epidemiology, Boston University School of Public Health, Boston, Massachusetts
| | - Graciela Delgado
- Vth Department of Medicine (Nephrology, Hypertensiology, Endocrinology, Diabetology, Rheumatology), Medical Faculty of Mannheim, University of Heidelberg, Mannheim, Germany
| | - Patrick Ellinor
- Harvard Medical School, Boston, Massachusetts.,Cardiovascular Research Center, Massachusetts General Hospital, Charlestown, Massachusetts.,Cardiac Arrhythmia Service, Massachusetts General Hospital, Boston, Massachusetts
| | - Georg Homuth
- University Medicine Greifswald, Interfaculty Institute for Genetics and Functional Genomics, Greifswald, Germany
| | - Anna Köttgen
- Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland.,Institute of Genetic Epidemiology, Faculty of Medicine and Medical Center, University of Freiburg, Freiburg, Germany
| | - Johan W Jukema
- Department of Cardiology, Leiden University Medical Center, Leiden, the Netherlands.,Einthoven Laboratory for Experimental Vascular Medicine, LUMC, Leiden, the Netherlands.,Interuniversity Cardiology Institute of the Netherlands, Utrecht, the Netherlands
| | - Steven A Lubitz
- Cardiovascular Research Center, Cardiac Arrhythmia Service, Massachusetts General Hospital, Boston
| | - Samia Mora
- Harvard Medical School, Boston, Massachusetts.,Division of Cardiovascular, Brigham and Women's Hospital, Boston, Massachusetts.,Division of Preventive Medicine, Brigham and Women's Hospital, Boston, Massachusetts
| | - Michiel Rienstra
- University Medical Center Groningen, University of Groningen, Groningen, the Netherlands
| | - Jerome I Rotter
- Division of Genomic Outcomes, Institute for Translational Genomics and Population Sciences, Torrance, California.,Department of Pediatrics, Los Angeles Biomedical Research Institute, Harbor-University of California, Los Angeles Medical Center, Torrance.,Department of Pediatrics, David Geffen School of Medicine, University of California, Los Angeles
| | - M Benjamin Shoemaker
- Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee
| | - Nona Sotoodehnia
- Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle
| | - Kent D Taylor
- Division of Genomic Outcomes, Institute for Translational Genomics and Population Sciences, Torrance, California.,Department of Pediatrics, Los Angeles Biomedical Research Institute, Harbor-University of California, Los Angeles Medical Center, Torrance.,Department of Pediatrics, David Geffen School of Medicine, University of California, Los Angeles
| | - Pim van der Harst
- University Medical Center Groningen, University of Groningen, Groningen, the Netherlands
| | - Christine M Albert
- Harvard Medical School, Boston, Massachusetts.,Division of Cardiovascular, Brigham and Women's Hospital, Boston, Massachusetts.,Division of Preventive Medicine, Brigham and Women's Hospital, Boston, Massachusetts
| | - Daniel I Chasman
- Harvard Medical School, Boston, Massachusetts.,Program in Medical and Population Genetics, The Broad Institute of MIT and Harvard, Cambridge, Massachusetts.,Division of Preventive Medicine, Brigham and Women's Hospital, Boston, Massachusetts
|
41
|
Choi L, Carroll RJ, Beck C, Mosley JD, Roden DM, Denny JC, Van Driest SL. Evaluating statistical approaches to leverage large clinical datasets for uncovering therapeutic and adverse medication effects. Bioinformatics 2019; 34:2988-2996. [PMID: 29912272 DOI: 10.1093/bioinformatics/bty306] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 11/12/2017] [Accepted: 04/16/2018] [Indexed: 12/31/2022] Open
Abstract
Motivation: Phenome-wide association studies (PheWAS) have been used to discover many genotype-phenotype relationships and have the potential to identify therapeutic and adverse drug outcomes using longitudinal data within electronic health records (EHRs). However, the statistical methods for PheWAS applied to longitudinal EHR medication data have not been established. Results: In this study, we developed methods to address two challenges faced with reuse of EHR for this purpose: confounding by indication, and low exposure and event rates. We used Monte Carlo simulation to assess propensity score (PS) methods, focusing on two of the most commonly used methods, PS matching and PS adjustment, to address confounding by indication. We also compared two logistic regression approaches (the default of Wald versus Firth's penalized maximum likelihood, PML) to address complete separation due to sparse data with low exposure and event rates. PS adjustment resulted in greater power than PS matching, while controlling Type I error at 0.05. The PML method provided reasonable P-values, even in cases with complete separation, with well controlled Type I error rates. Using PS adjustment and the PML method, we identify novel latent drug effects in pediatric patients exposed to two common antibiotic drugs, ampicillin and gentamicin. Availability and implementation: R packages PheWAS and EHR are available at https://github.com/PheWAS/PheWAS and at CRAN (https://www.r-project.org/), respectively. The R script for data processing and the main analysis is available at https://github.com/choileena/EHR. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Leena Choi
- Department of Biostatistics, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Robert J Carroll
- Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Cole Beck
- Department of Biostatistics, Vanderbilt University Medical Center, Nashville, TN, USA
| | | | - Dan M Roden
- Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA.,Medicine, Vanderbilt University Medical Center, Nashville, TN, USA.,Pharmacology, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Joshua C Denny
- Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA.,Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Sara L Van Driest
- Medicine, Vanderbilt University Medical Center, Nashville, TN, USA.,Pediatrics, Vanderbilt University Medical Center, Nashville, TN, USA
|
42
|
Making work visible for electronic phenotype implementation: Lessons learned from the eMERGE network. J Biomed Inform 2019; 99:103293. [PMID: 31542521 DOI: 10.1016/j.jbi.2019.103293] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Received: 06/14/2019] [Revised: 08/26/2019] [Accepted: 09/19/2019] [Indexed: 11/21/2022]
Abstract
BACKGROUND: Implementation of phenotype algorithms requires phenotype engineers to interpret human-readable algorithms and translate the description (text and flowcharts) into computable phenotypes - a process that can be labor intensive and error prone. To address the critical need for reducing the implementation efforts, it is important to develop portable algorithms. METHODS: We conducted a retrospective analysis of phenotype algorithms developed in the Electronic Medical Records and Genomics (eMERGE) network and identified common customization tasks required for implementation. A novel scoring system was developed to quantify portability from three aspects: Knowledge conversion, clause Interpretation, and Programming (KIP). Tasks were grouped into twenty representative categories. Experienced phenotype engineers were asked to estimate the average time spent on each category and evaluate time saving enabled by a common data model (CDM), specifically the Observational Medical Outcomes Partnership (OMOP) model, for each category. RESULTS: A total of 485 distinct clauses (phenotype criteria) were identified from 55 phenotype algorithms, corresponding to 1153 customization tasks. In addition to 25 non-phenotype-specific tasks, 46 tasks are related to interpretation, 613 tasks are related to knowledge conversion, and 469 tasks are related to programming. A score between 0 and 2 (0 for easy, 1 for moderate, and 2 for difficult portability) is assigned for each aspect, yielding a total KIP score range of 0 to 6. The average clause-wise KIP score to reflect portability is 1.37 ± 1.38. Specifically, the average knowledge (K) score is 0.64 ± 0.66, interpretation (I) score is 0.33 ± 0.55, and programming (P) score is 0.40 ± 0.64. 5% of the categories can be completed within one hour (median). 70% of the categories take from days to months to complete. The OMOP model can assist with vocabulary mapping tasks. CONCLUSION: This study presents firsthand knowledge of the substantial implementation efforts in phenotyping and introduces a novel metric (KIP) to measure portability of phenotype algorithms for quantifying such efforts across the eMERGE Network. Phenotype developers are encouraged to analyze and optimize the portability in regards to knowledge, interpretation and programming. CDMs can be used to improve the portability for some 'knowledge-oriented' tasks.
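The KIP arithmetic described in the abstract (three aspects scored 0 to 2, summed to a total of 0 to 6, then averaged per clause) can be sketched as follows. The class and function names are hypothetical and the example scores are invented; only the scoring ranges come from the abstract.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Iterable


@dataclass(frozen=True)
class ClauseScore:
    """Portability scores for one phenotype clause (0 easy, 1 moderate, 2 difficult)."""

    knowledge: int
    interpretation: int
    programming: int

    def __post_init__(self) -> None:
        # Each aspect must be one of the three levels named in the abstract.
        for name in ("knowledge", "interpretation", "programming"):
            if getattr(self, name) not in (0, 1, 2):
                raise ValueError(f"{name} score must be 0, 1, or 2")

    @property
    def total(self) -> int:
        """Total KIP score for the clause, in the range 0 to 6."""
        return self.knowledge + self.interpretation + self.programming


def average_kip(clauses: Iterable[ClauseScore]) -> float:
    """Clause-wise average of the total KIP score."""
    return mean(clause.total for clause in clauses)


# Invented scores for three clauses of a hypothetical algorithm.
example = [ClauseScore(0, 0, 0), ClauseScore(1, 0, 1), ClauseScore(2, 1, 1)]
example_average = average_kip(example)
```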
|
43
|
Barnado A, Carroll RJ, Casey C, Wheless L, Denny JC, Crofford LJ. Phenome-Wide Association Studies Uncover a Novel Association of Increased Atrial Fibrillation in Male Patients With Systemic Lupus Erythematosus. Arthritis Care Res (Hoboken) 2019; 70:1630-1636. [PMID: 29481723 DOI: 10.1002/acr.23553] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 07/24/2017] [Accepted: 02/20/2018] [Indexed: 12/15/2022]
Abstract
OBJECTIVE: Phenome-wide association studies (PheWAS) scan across billing codes in the electronic health record (EHR) and re-purpose clinical EHR data for research. In this study, we examined whether PheWAS could function as an EHR-based discovery tool for systemic lupus erythematosus (SLE) and identified novel clinical associations in male versus female patients with SLE. METHODS: We used a de-identified version of the Vanderbilt University Medical Center EHR, which includes more than 2.8 million subjects. We performed EHR-based PheWAS to compare SLE patients with age-, sex-, and race-matched control subjects and to compare male SLE patients with female SLE patients, controlling for multiple testing using a false discovery rate (FDR) P value of 0.05. RESULTS: We identified 1,097 patients with SLE and 5,735 matched control subjects. In a comparison of patients with SLE and matched controls, SLE patients were shown to be more likely to have International Classification of Diseases, Ninth Revision codes related to the SLE disease criteria. In the PheWAS of male versus female SLE patients, with adjustment for age and race, male patients were shown to be more likely to have atrial fibrillation (odds ratio 4.50, false discovery rate P = 3.23 × 10⁻³). Chart review confirmed atrial fibrillation, with the majority of patients developing atrial fibrillation after the SLE diagnosis and having multiple risk factors for atrial fibrillation. After adjustment for age, sex, race, and coronary artery disease, SLE disease status was shown to be significantly associated with atrial fibrillation (P = 0.002). CONCLUSION: Using PheWAS to compare male and female patients with SLE, we identified a novel association of an increased incidence of atrial fibrillation in male patients. SLE disease status was shown to be independently associated with atrial fibrillation, even after adjustment for age, sex, race, and coronary artery disease. These results demonstrate the utility of PheWAS as an EHR-based discovery tool for SLE.
Affiliation(s)
- April Barnado
- Vanderbilt University Medical Center, Nashville, Tennessee
| | | | - Carolyn Casey
- Lehigh Valley Health Network, Allentown, Pennsylvania
| | - Lee Wheless
- Vanderbilt University Medical Center, Nashville, Tennessee
| | - Joshua C Denny
- Vanderbilt University Medical Center, Nashville, Tennessee
|
44
|
Boland MR, Alur-Gupta S, Levine L, Gabriel P, Gonzalez-Hernandez G. Disease associations depend on visit type: results from a visit-wide association study. BioData Min 2019; 12:15. [PMID: 31338127 PMCID: PMC6625053 DOI: 10.1186/s13040-019-0203-2] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 04/05/2019] [Accepted: 07/03/2019] [Indexed: 12/13/2022] Open
Abstract
INTRODUCTION: Widespread adoption of Electronic Health Records (EHR) increased the number of reported disease association studies, or Phenome-Wide Association Studies (PheWAS). Traditional PheWAS studies ignore visit type (i.e., department/service conducting the visit). In this study, we investigate the role of visit type on disease association results in the first Visit-Wide Association Study or 'VisitWAS'. RESULTS: We studied this visit type effect on association results using EHR data from the University of Pennsylvania. Penn EHR data comes from 1,048 different departments and clinics. We analyzed differences between cancer and obstetrics/gynecologist (Ob/Gyn) visits. Some findings were expected (i.e., increase of neoplasm diagnoses among cancer visits), but others were surprising, including an increase in infectious disease conditions among those visiting the Ob/Gyn. CONCLUSION: We conclude that assessing visit type is important for EHR studies because different medical centers have different visit type distributions. To increase reproducibility among EHR data mining algorithms, we recommend that researchers report visit type in studies.
Affiliation(s)
- Mary Regina Boland
- Department of Biostatistics, Epidemiology & Informatics, University of Pennsylvania, Philadelphia, PA 19104 USA
- Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, PA 19104 USA
- Center for Excellence in Environmental Toxicology, University of Pennsylvania, Philadelphia, PA 19104 USA
- Department of Biomedical and Health Informatics, Children’s Hospital of Philadelphia, 421 Blockley Hall, Philadelphia, PA 19104 USA
| | - Snigdha Alur-Gupta
- Department of Obstetrics & Gynecology, University of Pennsylvania, 421 Blockley Hall, Philadelphia, PA 19104 USA
| | - Lisa Levine
- Department of Obstetrics & Gynecology, University of Pennsylvania, 421 Blockley Hall, Philadelphia, PA 19104 USA
| | - Peter Gabriel
- Department of Radiology, University of Pennsylvania, Philadelphia, PA 19104 USA
| | - Graciela Gonzalez-Hernandez
- Department of Biostatistics, Epidemiology & Informatics, University of Pennsylvania, Philadelphia, PA 19104 USA
- Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, PA 19104 USA
|
45
|
Zhang XA, Yates A, Vasilevsky N, Gourdine JP, Callahan TJ, Carmody LC, Danis D, Joachimiak MP, Ravanmehr V, Pfaff ER, Champion J, Robasky K, Xu H, Fecho K, Walton NA, Zhu RL, Ramsdill J, Mungall CJ, Köhler S, Haendel MA, McDonald CJ, Vreeman DJ, Peden DB, Bennett TD, Feinstein JA, Martin B, Stefanski AL, Hunter LE, Chute CG, Robinson PN. Semantic integration of clinical laboratory tests from electronic health records for deep phenotyping and biomarker discovery. NPJ Digit Med 2019; 2:32. [PMID: 31119199 PMCID: PMC6527418 DOI: 10.1038/s41746-019-0110-4] [Citation(s) in RCA: 31] [Impact Index Per Article: 5.2] [Received: 11/26/2018] [Accepted: 04/18/2019] [Indexed: 12/22/2022] Open
Abstract
Electronic Health Record (EHR) systems typically define laboratory test results using the Laboratory Observation Identifier Names and Codes (LOINC) and can transmit them using Fast Healthcare Interoperability Resource (FHIR) standards. LOINC has not yet been semantically integrated with computational resources for phenotype analysis. Here, we provide a method for mapping LOINC-encoded laboratory test results transmitted in FHIR standards to Human Phenotype Ontology (HPO) terms. We annotated the medical implications of 2923 commonly used laboratory tests with HPO terms. Using these annotations, our software assesses laboratory test results and converts each result into an HPO term. We validated our approach with EHR data from 15,681 patients with respiratory complaints and identified known biomarkers for asthma. Finally, we provide a freely available SMART on FHIR application that can be used within EHR systems. Our approach allows readily available laboratory tests in EHR to be reused for deep phenotyping and exploits the hierarchical structure of HPO to integrate distinct tests that have comparable medical interpretations for association studies.
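The conversion step described in the abstract (a LOINC-coded result is interpreted as low, normal, or high, then looked up in an annotation table to obtain an HPO term) can be sketched roughly as below. This is not the authors' software. The function names, the reference range, and the two-row annotation table are assumptions for illustration, and the code-to-term pairs should be checked against the published annotations before any reuse.

```python
from typing import Dict, Optional, Tuple

# Illustrative annotation table: (LOINC code, interpretation) -> HPO term ID.
# Assumed entries for a serum/plasma glucose test; verify before relying on them.
ANNOTATIONS: Dict[Tuple[str, str], str] = {
    ("2345-7", "H"): "HP:0003074",  # high glucose -> Hyperglycemia
    ("2345-7", "L"): "HP:0001943",  # low glucose -> Hypoglycemia
}


def interpret(value: float, low: float, high: float) -> str:
    """Classify a numeric result against a reference range as L, N, or H."""
    if value < low:
        return "L"
    if value > high:
        return "H"
    return "N"


def to_hpo(loinc_code: str, value: float, low: float, high: float) -> Optional[str]:
    """Return the HPO term for an abnormal result, or None if none is annotated.

    Simplification: normal results and unannotated tests both yield None here.
    """
    return ANNOTATIONS.get((loinc_code, interpret(value, low, high)))


# Hypothetical result: glucose of 180 against an assumed 70-99 reference range.
term = to_hpo("2345-7", 180.0, 70.0, 99.0)
```

Keying the table on the pair of test code and interpretation is one simple way to let different tests with comparable medical meaning resolve to the same HPO term.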
Affiliation(s)
- Amy Yates
- Oregon Clinical and Translational Research Institute, Oregon Health and Science University, Portland, OR 97239 USA
- Nicole Vasilevsky
- Oregon Clinical and Translational Research Institute, Oregon Health and Science University, Portland, OR 97239 USA
- Department of Medical Informatics and Clinical Epidemiology, Oregon Health and Science University, Portland, OR 97239 USA
- J. P. Gourdine
- Oregon Clinical and Translational Research Institute, Oregon Health and Science University, Portland, OR 97239 USA
- Library, Oregon Health and Science University, Portland, OR 97239 USA
- Tiffany J. Callahan
- Computational Bioscience Program, Department of Pharmacology, University of Colorado Anschutz School of Medicine, Aurora, CO 80045 USA
- Leigh C. Carmody
- The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032 USA
- Daniel Danis
- The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032 USA
- Marcin P. Joachimiak
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720 USA
- Vida Ravanmehr
- The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032 USA
- Emily R. Pfaff
- North Carolina Translational and Clinical Sciences Institute (NC TraCS), University of North Carolina at Chapel Hill, Chapel Hill, NC 27599 USA
- James Champion
- North Carolina Translational and Clinical Sciences Institute (NC TraCS), University of North Carolina at Chapel Hill, Chapel Hill, NC 27599 USA
- Kimberly Robasky
- North Carolina Translational and Clinical Sciences Institute (NC TraCS), University of North Carolina at Chapel Hill, Chapel Hill, NC 27599 USA
- Genetics Department, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599 USA
- School of Information and Library Sciences, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599 USA
- Hao Xu
- Renaissance Computing Institute, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599 USA
- Karamarie Fecho
- Renaissance Computing Institute, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599 USA
- Nephi A. Walton
- Genomic Medicine Institute, Geisinger Health System, Danville, PA 17822 USA
- Richard L. Zhu
- Institute for Clinical and Translational Research, Johns Hopkins University, Baltimore, MD 21202 USA
- Justin Ramsdill
- Oregon Clinical and Translational Research Institute, Oregon Health and Science University, Portland, OR 97239 USA
- Christopher J. Mungall
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720 USA
- Sebastian Köhler
- Charité Centrum für Therapieforschung, Charité - Universitätsmedizin Berlin Corporate Member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health, Berlin, 10117 Germany
- Einstein Center Digital Future, Berlin, 10117 Germany
- Melissa A. Haendel
- Oregon Clinical and Translational Research Institute, Oregon Health and Science University, Portland, OR 97239 USA
- Department of Medical Informatics and Clinical Epidemiology, Oregon Health and Science University, Portland, OR 97239 USA
- Linus Pauling Institute and Center for Genome Research and Biocomputing, Oregon State University, Corvallis, OR 97331 USA
- Clement J. McDonald
- Lister Hill National Center for Biomedical Communications, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894 USA
- Daniel J. Vreeman
- Department of Medicine, Indiana University School of Medicine, Indianapolis, IN 46202 USA
- Center for Biomedical Informatics, Regenstrief Institute, Inc., Indianapolis, IN 46202 USA
- David B. Peden
- North Carolina Translational and Clinical Sciences Institute (NC TraCS), University of North Carolina at Chapel Hill, Chapel Hill, NC 27599 USA
- Division of Allergy, Immunology and Rheumatology, Department of Pediatrics, University of North Carolina, Chapel Hill, NC 27599 USA
- University of North Carolina Center for Environmental Medicine, Asthma and Lung Biology, University of North Carolina, Chapel Hill, NC 27599 USA
- Tellen D. Bennett
- Department of Pediatrics, Section of Pediatric Critical Care, University of Colorado School of Medicine, Aurora, CO 80045 USA
- James A. Feinstein
- Adult and Child Consortium for Health Outcomes Research and Delivery Science (ACCORDS), University of Colorado School of Medicine, Aurora, CO 80045 USA
- Blake Martin
- Department of Pediatrics, Section of Pediatric Critical Care, University of Colorado School of Medicine, Aurora, CO 80045 USA
- Adrianne L. Stefanski
- Computational Bioscience Program, Department of Pharmacology, University of Colorado Anschutz School of Medicine, Aurora, CO 80045 USA
- Lawrence E. Hunter
- Computational Bioscience Program, Department of Pharmacology, University of Colorado Anschutz School of Medicine, Aurora, CO 80045 USA
- Christopher G. Chute
- Institute for Clinical and Translational Research, Johns Hopkins University, Baltimore, MD 21202 USA
- Peter N. Robinson
- The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032 USA
- Institute for Systems Genomics, University of Connecticut, Farmington, CT 06032 USA
|
46
|
Brčić L, Barić A, Gračan S, Brekalo M, Kaličanin D, Gunjača I, Torlak Lovrić V, Tokić S, Radman M, Škrabić V, Miljković A, Kolčić I, Štefanić M, Glavaš-Obrovac L, Lessel D, Polašek O, Zemunik T, Barbalić M, Punda A, Boraska Perica V. Genome-wide association analysis suggests novel loci for Hashimoto's thyroiditis. J Endocrinol Invest 2019; 42:567-576. [PMID: 30284222 DOI: 10.1007/s40618-018-0955-4] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 07/10/2018] [Accepted: 09/18/2018] [Indexed: 12/20/2022]
Abstract
PURPOSE Hashimoto's thyroiditis (HT) is the most common form of autoimmune thyroid disease. Current knowledge of HT genetics is limited, and not a single genome-wide association study (GWAS) focusing exclusively on HT has been performed to date. In order to decipher genetic determinants of HT, we performed the first GWAS followed by replication in a total of 1443 individuals from Croatia. METHODS We performed association analysis in a discovery cohort comprising 405 cases and 433 controls. We followed up 13 independent signals (P < 10⁻⁵) in 303 cases and 302 controls from two replication cohorts and then meta-analyzed results across discovery and replication datasets. RESULTS We identified three variants suggestively associated with HT: rs12944194 located 206 kb from SDK2 (P = 1.8 × 10⁻⁶), rs75201096 inside GNA14 (P = 2.41 × 10⁻⁵) and rs791903 inside IP6K3 (P = 3.16 × 10⁻⁵). Genetic risk score (GRS), calculated using risk alleles of these loci, accounted for 4.82% of the total HT variance, and individuals from the top GRS quartile had 2.76 times higher odds for HT than individuals from the lowest GRS quartile. CONCLUSIONS Although the discovered loci are implicated in susceptibility to HT for the first time, the genomic regions harboring these loci exhibit good biological candidacy due to their involvement in the regulation of thyroid function and autoimmunity. Additionally, we observe genetic overlap between HT and several related traits, such as hypothyroidism, Graves' disease and TPOAb. Our study adds new knowledge of underlying HT genetics and sets a firm basis for further research.
Affiliation(s)
- L Brčić
- Department of Medical Biology, School of Medicine, University of Split, Šoltanska 2, 21000, Split, Croatia
- A Barić
- Department of Nuclear Medicine, University Hospital Split, Split, Croatia
- S Gračan
- Department of Nuclear Medicine, University Hospital Split, Split, Croatia
- M Brekalo
- Department of Nuclear Medicine, University Hospital Split, Split, Croatia
- D Kaličanin
- Department of Medical Biology, School of Medicine, University of Split, Šoltanska 2, 21000, Split, Croatia
- I Gunjača
- Department of Medical Biology, School of Medicine, University of Split, Šoltanska 2, 21000, Split, Croatia
- V Torlak Lovrić
- Department of Nuclear Medicine, University Hospital Split, Split, Croatia
- S Tokić
- Department of Medical Chemistry, Biochemistry and Clinical Chemistry, Faculty of Medicine, University of Osijek, Osijek, Croatia
- M Radman
- Department of Nuclear Medicine, University Hospital Split, Split, Croatia
- V Škrabić
- Department of Pediatrics, University Hospital Split, Split, Croatia
- A Miljković
- Department of Public Health, School of Medicine, University of Split, Split, Croatia
- I Kolčić
- Department of Public Health, School of Medicine, University of Split, Split, Croatia
- M Štefanić
- Department of Nuclear Medicine and Oncology, Faculty of Medicine, University of Osijek, Osijek, Croatia
- L Glavaš-Obrovac
- Department of Medical Chemistry, Biochemistry and Clinical Chemistry, Faculty of Medicine, University of Osijek, Osijek, Croatia
- D Lessel
- Institute of Human Genetics, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
- O Polašek
- Department of Public Health, School of Medicine, University of Split, Split, Croatia
- T Zemunik
- Department of Medical Biology, School of Medicine, University of Split, Šoltanska 2, 21000, Split, Croatia
- M Barbalić
- Department of Medical Biology, School of Medicine, University of Split, Šoltanska 2, 21000, Split, Croatia
- A Punda
- Department of Nuclear Medicine, University Hospital Split, Split, Croatia
- V Boraska Perica
- Department of Medical Biology, School of Medicine, University of Split, Šoltanska 2, 21000, Split, Croatia
|
47
|
Unlu G, Gamazon ER, Qi X, Levic DS, Bastarache L, Denny JC, Roden DM, Mayzus I, Breyer M, Zhong X, Konkashbaev AI, Rzhetsky A, Knapik EW, Cox NJ. GRIK5 Genetically Regulated Expression Associated with Eye and Vascular Phenomes: Discovery through Iteration among Biobanks, Electronic Health Records, and Zebrafish. Am J Hum Genet 2019; 104:503-519. [PMID: 30827500 PMCID: PMC6407495 DOI: 10.1016/j.ajhg.2019.01.017] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 10/30/2018] [Accepted: 01/29/2019] [Indexed: 12/15/2022] Open
Abstract
Although the use of model systems for studying the mechanism of mutations that have a large effect is common, we highlight here the ways that zebrafish-model-system studies of a gene, GRIK5, that contributes to the polygenic liability to develop eye diseases have helped to illuminate a mechanism that implicates vascular biology in eye disease. A gene-expression prediction derived from a reference transcriptome panel applied to BioVU, a large electronic health record (EHR)-linked biobank at Vanderbilt University Medical Center, implicated reduced GRIK5 expression in diverse eye diseases. We tested the function of GRIK5 by depletion of its ortholog in zebrafish, and we observed reduced blood vessel numbers and integrity in the eye and increased vascular permeability. Analyses of EHRs in >2.6 million Vanderbilt subjects revealed significant comorbidity of eye and vascular diseases (relative risks 2-15); this comorbidity was confirmed in 150 million individuals from a large insurance claims dataset. Subsequent studies in >60,000 genotyped BioVU participants confirmed the association of reduced genetically predicted expression of GRIK5 with comorbid vascular and eye diseases. Our studies pioneer an approach that allows a rapid iteration of the discovery of gene-phenotype relationships to the primary genetic mechanism contributing to the pathophysiology of human disease. Our findings also add dimension to the understanding of the biology driven by glutamate receptors such as GRIK5 (also referred to as GLUK5 in protein form) and to mechanisms contributing to human eye diseases.
Affiliation(s)
- Gokhan Unlu
- Department of Medicine, Division of Genetic Medicine, Vanderbilt University Medical Center, Nashville, TN 37232, USA; Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN 37232, USA; Department of Cell and Developmental Biology, Vanderbilt University, Nashville, TN 37232, USA
- Eric R Gamazon
- Department of Medicine, Division of Genetic Medicine, Vanderbilt University Medical Center, Nashville, TN 37232, USA; Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN 37232, USA; Data Science Institute, Vanderbilt University, Nashville, TN 37232, USA; Clare Hall, University of Cambridge, Cambridge CB3 9AL, UK
- Xinzi Qi
- Department of Medicine, Division of Genetic Medicine, Vanderbilt University Medical Center, Nashville, TN 37232, USA; Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN 37232, USA
- Daniel S Levic
- Department of Medicine, Division of Genetic Medicine, Vanderbilt University Medical Center, Nashville, TN 37232, USA; Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN 37232, USA; Department of Cell and Developmental Biology, Vanderbilt University, Nashville, TN 37232, USA
- Lisa Bastarache
- Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN 37232, USA; Departments of Medicine and Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37232, USA
- Joshua C Denny
- Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN 37232, USA; Departments of Medicine and Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37232, USA
- Dan M Roden
- Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN 37232, USA; Departments of Medicine and Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37232, USA; Department of Pharmacology, Vanderbilt University, Nashville, TN 37232, USA
- Ilya Mayzus
- Departments of Medicine and Human Genetics, the University of Chicago, Chicago, IL 60637, USA
- Max Breyer
- Department of Medicine, Division of Genetic Medicine, Vanderbilt University Medical Center, Nashville, TN 37232, USA; Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN 37232, USA
- Xue Zhong
- Department of Medicine, Division of Genetic Medicine, Vanderbilt University Medical Center, Nashville, TN 37232, USA; Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN 37232, USA
- Anuar I Konkashbaev
- Department of Medicine, Division of Genetic Medicine, Vanderbilt University Medical Center, Nashville, TN 37232, USA; Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN 37232, USA
- Andrey Rzhetsky
- Departments of Medicine and Human Genetics, the University of Chicago, Chicago, IL 60637, USA
- Ela W Knapik
- Department of Medicine, Division of Genetic Medicine, Vanderbilt University Medical Center, Nashville, TN 37232, USA; Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN 37232, USA; Department of Cell and Developmental Biology, Vanderbilt University, Nashville, TN 37232, USA
- Nancy J Cox
- Department of Medicine, Division of Genetic Medicine, Vanderbilt University Medical Center, Nashville, TN 37232, USA; Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN 37232, USA; Data Science Institute, Vanderbilt University, Nashville, TN 37232, USA
|
48
|
Ning W, Chan S, Beam A, Yu M, Geva A, Liao K, Mullen M, Mandl KD, Kohane I, Cai T, Yu S. Feature extraction for phenotyping from semantic and knowledge resources. J Biomed Inform 2019; 91:103122. [PMID: 30738949 PMCID: PMC6424621 DOI: 10.1016/j.jbi.2019.103122] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Indexed: 01/10/2023]
Abstract
OBJECTIVE Phenotyping algorithms can efficiently and accurately identify patients with a specific disease phenotype and construct electronic health records (EHR)-based cohorts for subsequent clinical or genomic studies. Previous studies have introduced unsupervised EHR-based feature selection methods that yielded algorithms with high accuracy. However, those selection methods still require expert intervention to tweak the parameter settings according to the EHR data distribution for each phenotype. To further accelerate the development of phenotyping algorithms, we propose a fully automated and robust unsupervised feature selection method that leverages only publicly available medical knowledge sources, instead of EHR data. METHODS SEmantics-Driven Feature Extraction (SEDFE) collects medical concepts from online knowledge sources as candidate features and gives them vector-form distributional semantic representations derived with neural word embedding and the Unified Medical Language System Metathesaurus. A number of features that are semantically closest and that sufficiently characterize the target phenotype are determined by a linear decomposition criterion and are selected for the final classification algorithm. RESULTS SEDFE was compared with the EHR-based SAFE algorithm and domain experts on feature selection for the classification of five phenotypes including coronary artery disease, rheumatoid arthritis, Crohn's disease, ulcerative colitis, and pediatric pulmonary arterial hypertension using both supervised and unsupervised approaches. Algorithms yielded by SEDFE achieved comparable accuracy to those yielded by SAFE and expert-curated features. SEDFE is also robust to the input semantic vectors. CONCLUSION SEDFE attains satisfying performance in unsupervised feature selection for EHR phenotyping. Both fully automated and EHR-independent, this method promises efficiency and accuracy in developing algorithms for high-throughput phenotyping.
Affiliation(s)
- Wenxin Ning
- Department of Industrial Engineering, Tsinghua University, Beijing, China
- Stephanie Chan
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Andrew Beam
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Ming Yu
- Department of Industrial Engineering, Tsinghua University, Beijing, China
- Alon Geva
- Computational Health Informatics Program, Boston Children's Hospital, Boston, MA, USA; Department of Anesthesiology, Critical Care, and Pain Medicine, Boston Children's Hospital, Boston, MA, USA; Department of Anesthesia, Harvard Medical School, Boston, MA, USA
- Katherine Liao
- Department of Medicine, Division of Rheumatology, Immunology and Allergy, Brigham and Women's Hospital, Boston, MA, USA; Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Mary Mullen
- Department of Cardiology, Boston Children's Hospital, Boston, MA, USA; Department of Pediatrics, Harvard Medical School, Boston, MA, USA
- Kenneth D Mandl
- Computational Health Informatics Program, Boston Children's Hospital, Boston, MA, USA; Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Isaac Kohane
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Tianxi Cai
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA; Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Sheng Yu
- Center for Statistical Science, Tsinghua University, Beijing, China; Department of Industrial Engineering, Tsinghua University, Beijing, China; Institute for Data Science, Tsinghua University, Beijing, China
|
49
|
Zhang X, Veturi Y, Verma S, Bone W, Verma A, Lucas A, Hebbring S, Denny JC, Stanaway IB, Jarvik GP, Crosslin D, Larson EB, Rasmussen-Torvik L, Pendergrass SA, Smoller JW, Hakonarson H, Sleiman P, Weng C, Fasel D, Wei WQ, Kullo I, Schaid D, Chung WK, Ritchie MD. Detecting potential pleiotropy across cardiovascular and neurological diseases using univariate, bivariate, and multivariate methods on 43,870 individuals from the eMERGE network. Pacific Symposium on Biocomputing 2019; 24:272-283. [PMID: 30864329 PMCID: PMC6457436] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/05/2022]
Abstract
The link between cardiovascular diseases and neurological disorders has been widely observed in the aging population. Disease prevention and treatment rely on understanding the potential genetic nexus of multiple diseases in these categories. In this study, we were interested in detecting pleiotropy, or the phenomenon in which a genetic variant influences more than one phenotype. Marker-phenotype association approaches can be grouped into univariate, bivariate, and multivariate categories based on the number of phenotypes considered at one time. Here we applied one statistical method per category followed by an eQTL colocalization analysis to identify potential pleiotropic variants that contribute to the link between cardiovascular and neurological diseases. We performed our analyses on ~530,000 common SNPs coupled with 65 electronic health record (EHR)-based phenotypes in 43,870 unrelated European adults from the Electronic Medical Records and Genomics (eMERGE) network. There were 31 variants identified by all three methods that showed significant associations across late-onset cardiac and neurologic diseases. We further investigated functional implications of gene expression on the detected "lead SNPs" via colocalization analysis, providing a deeper understanding of the discovered associations. In summary, we present the framework and landscape for detecting potential pleiotropy using univariate, bivariate, multivariate, and colocalization methods. Further exploration of these potentially pleiotropic genetic variants will work toward understanding disease-causing mechanisms across cardiovascular and neurological diseases and may assist in considering disease prevention as well as drug repositioning in future research.
Affiliation(s)
- Xinyuan Zhang
- Genomics and Computational Biology Graduate Group, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
*Authors contributed equally to this work
|
50
|
Barnado A, Carroll RJ, Casey C, Wheless L, Denny JC, Crofford LJ. Phenome-wide association study identifies dsDNA as a driver of major organ involvement in systemic lupus erythematosus. Lupus 2018; 28:66-76. [PMID: 30477398 DOI: 10.1177/0961203318815577] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Indexed: 12/20/2022]
Abstract
In systemic lupus erythematosus (SLE), dsDNA antibodies are associated with renal disease. Less is known about comorbidities in patients without dsDNA or other autoantibodies. Using an electronic health record (EHR) SLE cohort, we employed a phenome-wide association study (PheWAS) that scans across billing codes to compare comorbidities in SLE patients with and without autoantibodies. We used our validated algorithm to identify SLE subjects. Autoantibody status was defined as ever positive for dsDNA, RNP, Smith, SSA and SSB. PheWAS was performed in antibody positive vs. negative SLE patients adjusting for age and race and using a false discovery rate of 0.05. We identified 1097 SLE subjects. In the PheWAS of dsDNA positive vs. negative subjects, dsDNA positive subjects were more likely to have nephritis (p = 2.33 × 10⁻⁹) and renal failure (p = 1.85 × 10⁻⁵). After adjusting for sex, race, age and other autoantibodies, dsDNA was independently associated with nephritis and chronic kidney disease. Subjects negative for dsDNA, RNP, SSA and SSB were all more likely to have billing codes for sleep, pain and mood disorders. PheWAS uncovered a hierarchy within SLE-specific autoantibodies with dsDNA having the greatest impact on major organ involvement.
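The abstract reports a false discovery rate of 0.05 but does not name the correction procedure. As an illustration only, the sketch below assumes the widely used Benjamini-Hochberg step-up procedure; the function name and the example p-values beyond the two quoted in the abstract are hypothetical.

```python
from typing import List


def benjamini_hochberg(p_values: List[float], fdr: float = 0.05) -> List[bool]:
    """Return, for each p-value, whether it is rejected at the given FDR."""
    m = len(p_values)
    # Indices of the p-values in ascending order of p.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank k such that p_(k) <= (k / m) * fdr.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            cutoff_rank = rank

    # Step-up rule: reject every hypothesis ranked at or below the cutoff.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[idx] = True
    return rejected


# The first two p-values are those quoted in the abstract; the rest are made up.
example = [2.33e-9, 1.85e-5, 0.04, 0.2, 0.8]
flags = benjamini_hochberg(example)
```

In a PheWAS the list would hold one p-value per phenotype code tested, so the threshold adapts to the number of codes scanned rather than applying a fixed per-test cutoff.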
Affiliation(s)
- A Barnado
- Department of Medicine, Vanderbilt University Medical Center, Nashville, USA
- R J Carroll
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, USA
- C Casey
- Department of Medicine, Lehigh Valley Health Network, Allentown, USA
- L Wheless
- Department of Dermatology, Vanderbilt University Medical Center, Nashville, USA
- J C Denny
- Department of Medicine, Vanderbilt University Medical Center, Nashville, USA; Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, USA
- L J Crofford
- Department of Medicine, Vanderbilt University Medical Center, Nashville, USA
|