1. Szczerbinski L, Mandla R, Schroeder P, Porneala BC, Li JH, Florez JC, Mercader JM, Udler MS, Manning AK. Algorithms for the identification of prevalent diabetes in the All of Us Research Program validated using polygenic scores. Sci Rep 2024; 14:26895. [PMID: 39505999; PMCID: PMC11542015; DOI: 10.1038/s41598-024-74730-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/20/2024] [Accepted: 09/29/2024] [Indexed: 11/08/2024] Open
Abstract
The All of Us Research Program (AoU) is an initiative designed to gather a comprehensive and diverse dataset from at least one million individuals across the USA. This longitudinal cohort study aims to advance research by providing a rich resource of genetic and phenotypic information, enabling powerful studies on the epidemiology and genetics of human diseases. One critical challenge to maximizing its use is the development of accurate algorithms that can efficiently identify well-defined disease and disease-free participants for case-control studies. This study aimed to develop and validate type 1 diabetes (T1D) and type 2 diabetes (T2D) algorithms in the AoU cohort, using electronic health record (EHR) and survey data. Building on existing algorithms and using diagnosis codes, medications, laboratory results, and survey data, we developed and implemented algorithms for identifying prevalent cases of T1D and T2D. The first set of algorithms used only EHR data (EHR-only), and the second set used a combination of EHR and survey data (EHR+). A universal algorithm was also developed to identify individuals without diabetes. The performance of each algorithm was evaluated by testing its association with polygenic scores (PSs) for T1D and T2D. We demonstrated the feasibility and utility of using AoU EHR and survey data to employ diabetes algorithms. For T1D, the EHR-only algorithm showed a stronger association with T1D-PS than the EHR+ algorithm (DeLong p-value = 3 × 10⁻⁵). For T2D, the EHR+ algorithm outperformed both the EHR-only algorithm and the existing T2D definition provided in the AoU Phenotyping Library (DeLong p-values = 0.03 and 1 × 10⁻⁴, respectively), identifying 25.79% and 22.57% more cases, respectively, and providing an improved association with T2D-PS. We provide a new validated T1D definition and an improved T2D definition in AoU, which are freely available for diabetes research in the AoU.
These algorithms ensure consistency of diabetes definitions in the cohort, facilitating high-quality diabetes research.
Affiliation(s)
- Lukasz Szczerbinski
  - Department of Endocrinology, Diabetology and Internal Medicine, Medical University of Bialystok, 15-276, Bialystok, Poland
  - Clinical Research Centre, Medical University of Bialystok, 15-276, Bialystok, Poland
  - Programs in Metabolism and Medical & Population Genetics, Broad Institute of Harvard and MIT, 415 Main St., Cambridge, MA, 02142, USA
  - Center for Genomic Medicine, Massachusetts General Hospital, Boston, USA
  - Diabetes Unit, Department of Medicine, Massachusetts General Hospital, Boston, USA
- Ravi Mandla
  - Programs in Metabolism and Medical & Population Genetics, Broad Institute of Harvard and MIT, 415 Main St., Cambridge, MA, 02142, USA
  - Center for Genomic Medicine, Massachusetts General Hospital, Boston, USA
  - Diabetes Unit, Department of Medicine, Massachusetts General Hospital, Boston, USA
  - Cardiology Division, Department of Medicine and Cardiovascular Research Institute, University of California, San Francisco, USA
- Philip Schroeder
  - Programs in Metabolism and Medical & Population Genetics, Broad Institute of Harvard and MIT, 415 Main St., Cambridge, MA, 02142, USA
  - Center for Genomic Medicine, Massachusetts General Hospital, Boston, USA
  - Diabetes Unit, Department of Medicine, Massachusetts General Hospital, Boston, USA
- Bianca C Porneala
  - Division of General Internal Medicine, Department of Medicine, Massachusetts General Hospital, Boston, USA
- Josephine H Li
  - Programs in Metabolism and Medical & Population Genetics, Broad Institute of Harvard and MIT, 415 Main St., Cambridge, MA, 02142, USA
  - Center for Genomic Medicine, Massachusetts General Hospital, Boston, USA
  - Diabetes Unit, Department of Medicine, Massachusetts General Hospital, Boston, USA
  - Department of Medicine, Harvard Medical School, Boston, MA, USA
- Jose C Florez
  - Programs in Metabolism and Medical & Population Genetics, Broad Institute of Harvard and MIT, 415 Main St., Cambridge, MA, 02142, USA
  - Center for Genomic Medicine, Massachusetts General Hospital, Boston, USA
  - Diabetes Unit, Department of Medicine, Massachusetts General Hospital, Boston, USA
  - Department of Medicine, Harvard Medical School, Boston, MA, USA
- Josep M Mercader
  - Programs in Metabolism and Medical & Population Genetics, Broad Institute of Harvard and MIT, 415 Main St., Cambridge, MA, 02142, USA
  - Center for Genomic Medicine, Massachusetts General Hospital, Boston, USA
  - Diabetes Unit, Department of Medicine, Massachusetts General Hospital, Boston, USA
  - Department of Medicine, Harvard Medical School, Boston, MA, USA
- Miriam S Udler
  - Programs in Metabolism and Medical & Population Genetics, Broad Institute of Harvard and MIT, 415 Main St., Cambridge, MA, 02142, USA
  - Center for Genomic Medicine, Massachusetts General Hospital, Boston, USA
  - Diabetes Unit, Department of Medicine, Massachusetts General Hospital, Boston, USA
  - Department of Medicine, Harvard Medical School, Boston, MA, USA
- Alisa K Manning
  - Programs in Metabolism and Medical & Population Genetics, Broad Institute of Harvard and MIT, 415 Main St., Cambridge, MA, 02142, USA
  - Center for Genomic Medicine, Massachusetts General Hospital, Boston, USA
  - Department of Medicine, Harvard Medical School, Boston, MA, USA
  - Clinical and Translational Epidemiology Unit, Department of Medicine, Massachusetts General Hospital, Boston, USA
2. Gao J, Bonzel CL, Hong C, Varghese P, Zakir K, Gronsbell J. Semi-supervised ROC analysis for reliable and streamlined evaluation of phenotyping algorithms. J Am Med Inform Assoc 2024; 31:640-650. [PMID: 38128118; PMCID: PMC10873838; DOI: 10.1093/jamia/ocad226] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/03/2023] [Revised: 09/22/2023] [Accepted: 11/20/2023] [Indexed: 12/23/2023] Open
Abstract
OBJECTIVE High-throughput phenotyping will accelerate the use of electronic health records (EHRs) for translational research. A critical roadblock is the extensive medical supervision required for phenotyping algorithm (PA) estimation and evaluation. To address this challenge, numerous weakly-supervised learning methods have been proposed. However, there is a paucity of methods for reliably evaluating the predictive performance of PAs when a very small proportion of the data is labeled. To fill this gap, we introduce a semi-supervised approach (ssROC) for estimation of the receiver operating characteristic (ROC) parameters of PAs (eg, sensitivity, specificity). MATERIALS AND METHODS ssROC uses a small labeled dataset to nonparametrically impute missing labels. The imputations are then used for ROC parameter estimation to yield more precise estimates of PA performance relative to classical supervised ROC analysis (supROC) using only labeled data. We evaluated ssROC with synthetic, semi-synthetic, and EHR data from Mass General Brigham (MGB). RESULTS ssROC produced ROC parameter estimates with minimal bias and significantly lower variance than supROC in the simulated and semi-synthetic data. For the 5 PAs from MGB, the estimates from ssROC are 30% to 60% less variable than supROC on average. DISCUSSION ssROC enables precise evaluation of PA performance without demanding large volumes of labeled data. ssROC is also easily implementable in open-source R software. CONCLUSION When used in conjunction with weakly-supervised PAs, ssROC facilitates the reliable and streamlined phenotyping necessary for EHR-based research.
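For orientation, the supervised baseline that the abstract calls supROC can be sketched in a few lines: sensitivity and specificity are estimated directly from the labeled records alone. This is a minimal illustration, not the authors' ssROC method or their R software; the function name `sup_roc_point`, the toy scores, and the 0.5 threshold are all assumptions made for the example.

```python
from typing import Sequence, Tuple


def sup_roc_point(
    scores: Sequence[float], labels: Sequence[int], threshold: float
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) of the rule `score >= threshold`,
    estimated from labeled data only (hypothetical helper)."""
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if y == 1 and s >= threshold)
    fn = sum(1 for s, y in pairs if y == 1 and s < threshold)
    tn = sum(1 for s, y in pairs if y == 0 and s < threshold)
    fp = sum(1 for s, y in pairs if y == 0 and s >= threshold)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one labeled case and one labeled control")
    return tp / (tp + fn), tn / (tn + fp)


# Toy labeled set: phenotyping-algorithm scores with chart-review labels (made up).
scores = [0.9, 0.8, 0.4, 0.7, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]
sens, spec = sup_roc_point(scores, labels, threshold=0.5)
print(f"sensitivity={sens:.3f}, specificity={spec:.3f}")
```

With only a handful of labels, estimates like these are highly variable, which is the gap the abstract says ssROC addresses by imputing the missing labels.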
Affiliation(s)
- Jianhui Gao
  - Department of Statistical Sciences, University of Toronto, Toronto, ON, Canada
- Clara-Lea Bonzel
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, United States
- Chuan Hong
  - Department of Biostatistics and Bioinformatics, Duke University, Durham, NC, United States
- Paul Varghese
  - Health Informatics, Verily Life Sciences, Cambridge, MA, United States
- Karim Zakir
  - Department of Statistical Sciences, University of Toronto, Toronto, ON, Canada
- Jessica Gronsbell
  - Department of Statistical Sciences, University of Toronto, Toronto, ON, Canada
  - Department of Family and Community Medicine, University of Toronto, Toronto, ON, Canada
  - Department of Computer Science, University of Toronto, Toronto, ON, Canada
3. Walsh CG, Ripperger MA, Hu Y, Sheu YH, Lee H, Wilimitis D, Zheutlin AB, Rocha D, Choi KW, Castro VM, Kirchner HL, Chabris CF, Davis LK, Smoller JW. Development and multi-site external validation of a generalizable risk prediction model for bipolar disorder. Transl Psychiatry 2024; 14:58. [PMID: 38272862; PMCID: PMC10810911; DOI: 10.1038/s41398-023-02720-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/21/2023] [Revised: 11/29/2023] [Accepted: 12/15/2023] [Indexed: 01/27/2024] Open
Abstract
Bipolar disorder is a leading contributor to disability, premature mortality, and suicide. Early identification of risk for bipolar disorder using generalizable predictive models trained on diverse cohorts around the United States could improve targeted assessment of high-risk individuals, reduce misdiagnosis, and improve the allocation of limited mental health resources. This observational case-control study intended to develop and validate generalizable predictive models of bipolar disorder as part of the multisite, multinational PsycheMERGE Network across diverse and large biobanks with linked electronic health records (EHRs) from three academic medical centers: in the Northeast (Massachusetts General Brigham), the Mid-Atlantic (Geisinger) and the Mid-South (Vanderbilt University Medical Center). Predictive models were developed and validated with multiple algorithms at each study site: random forests, gradient boosting machines, penalized regression, including stacked ensemble learning algorithms combining them. Predictors were limited to widely available EHR-based features agnostic to a common data model including demographics, diagnostic codes, and medications. The main study outcome was bipolar disorder diagnosis as defined by the International Cohort Collection for Bipolar Disorder, 2015. In total, the study included records for 3,529,569 patients including 12,533 cases (0.3%) of bipolar disorder. After internal and external validation, algorithms demonstrated optimal performance in their respective development sites. The stacked ensemble achieved the best combination of overall discrimination (AUC = 0.82-0.87) and calibration performance with positive predictive values above 5% in the highest risk quantiles at all three study sites. In conclusion, generalizable predictive models of risk for bipolar disorder can be feasibly developed across diverse sites to enable precision medicine.
Comparison of a range of machine learning methods indicated that an ensemble approach provides the best performance overall but required local retraining. These models will be disseminated via the PsycheMERGE Network website.
Affiliation(s)
- Colin G Walsh
  - Vanderbilt University Medical Center Health System, Nashville, TN, USA
- Yirui Hu
  - Geisinger Health System, Danville, PA, USA
- Yi-Han Sheu
  - Massachusetts General-Brigham Health System, Boston, MA, USA
  - Center for Precision Psychiatry, Department of Psychiatry, Massachusetts General Hospital, Boston, MA, USA
  - Psychiatric and Neurodevelopmental Genetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
- Hyunjoon Lee
  - Vanderbilt University Medical Center Health System, Nashville, TN, USA
- Drew Wilimitis
  - Vanderbilt University Medical Center Health System, Nashville, TN, USA
- Karmel W Choi
  - Massachusetts General-Brigham Health System, Boston, MA, USA
- Victor M Castro
  - Massachusetts General-Brigham Health System, Boston, MA, USA
- Lea K Davis
  - Vanderbilt University Medical Center Health System, Nashville, TN, USA
- Jordan W Smoller
  - Massachusetts General-Brigham Health System, Boston, MA, USA
  - Center for Precision Psychiatry, Department of Psychiatry, Massachusetts General Hospital, Boston, MA, USA
  - Psychiatric and Neurodevelopmental Genetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
4. He T, Belouali A, Patricoski J, Lehmann H, Ball R, Anagnostou V, Kreimeyer K, Botsis T. Trends and opportunities in computable clinical phenotyping: A scoping review. J Biomed Inform 2023; 140:104335. [PMID: 36933631; DOI: 10.1016/j.jbi.2023.104335] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 11/09/2022] [Revised: 03/07/2023] [Accepted: 03/09/2023] [Indexed: 03/18/2023]
Abstract
Identifying patient cohorts meeting the criteria of specific phenotypes is essential in biomedicine and particularly timely in precision medicine. Many research groups deliver pipelines that automatically retrieve and analyze data elements from one or more sources to automate this task and deliver high-performing computable phenotypes. We applied a systematic approach based on the Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines to conduct a thorough scoping review on computable clinical phenotyping. Five databases were searched using a query that combined the concepts of automation, clinical context, and phenotyping. Subsequently, four reviewers screened 7960 records (after removing over 4000 duplicates) and selected 139 that satisfied the inclusion criteria. This dataset was analyzed to extract information on target use cases, data-related topics, phenotyping methodologies, evaluation strategies, and portability of developed solutions. Most studies supported patient cohort selection without discussing the application to specific use cases, such as precision medicine. Electronic Health Records were the primary source in 87.1% (N = 121) of all studies, and International Classification of Diseases codes were heavily used in 55.4% (N = 77) of all studies; however, only 25.9% (N = 36) of the records described compliance with a common data model. In terms of the presented methods, traditional Machine Learning (ML) was the dominant method, often combined with natural language processing and other approaches, while external validation and portability of computable phenotypes were pursued in many cases. These findings revealed that defining target use cases precisely, moving away from sole ML strategies, and evaluating the proposed solutions in the real setting are essential opportunities for future work.
There is also momentum and an emerging need for computable phenotyping to support clinical and epidemiological research and precision medicine.
Affiliation(s)
- Ting He
  - Department of Oncology, The Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD, USA
  - Biomedical Informatics and Data Science Section, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Anas Belouali
  - Biomedical Informatics and Data Science Section, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Jessica Patricoski
  - Biomedical Informatics and Data Science Section, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Harold Lehmann
  - Biomedical Informatics and Data Science Section, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Robert Ball
  - Office of Surveillance and Epidemiology, Center for Drug Evaluation and Research, US FDA, Silver Spring, MD, USA
- Valsamo Anagnostou
  - Department of Oncology, The Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Kory Kreimeyer
  - Department of Oncology, The Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD, USA
  - Biomedical Informatics and Data Science Section, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Taxiarchis Botsis
  - Department of Oncology, The Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD, USA
  - Biomedical Informatics and Data Science Section, Johns Hopkins University School of Medicine, Baltimore, MD, USA
5. Ryu E, Jenkins GD, Wang Y, Olfson M, Talati A, Lepow L, Coombes BJ, Charney AW, Glicksberg BS, Mann JJ, Weissman MM, Wickramaratne P, Pathak J, Biernacka JM. The importance of social activity to risk of major depression in older adults. Psychol Med 2023; 53:2634-2642. [PMID: 34763736; PMCID: PMC9095757; DOI: 10.1017/s0033291721004566] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 06/14/2021] [Revised: 10/04/2021] [Accepted: 10/20/2021] [Indexed: 11/07/2022]
Abstract
BACKGROUND Several social determinants of health (SDoH) have been associated with the onset of major depressive disorder (MDD). However, prior studies largely focused on individual SDoH and thus less is known about the relative importance (RI) of SDoH variables, especially in older adults. Given that risk factors for MDD may differ across the lifespan, we aimed to identify the SDoH that was most strongly related to newly diagnosed MDD in a cohort of older adults. METHODS We used self-reported health-related survey data from 41 174 older adults (50-89 years, median age = 67 years) who participated in the Mayo Clinic Biobank, and linked ICD codes for MDD in the participants' electronic health records. Participants with a history of clinically documented or self-reported MDD prior to survey completion were excluded from analysis (N = 10 938, 27%). We used Cox proportional hazards models with a gradient boosting machine approach to quantify the RI of 30 pre-selected SDoH variables on the risk of future MDD diagnosis. RESULTS Following biobank enrollment, 2073 older participants were diagnosed with MDD during the follow-up period (median duration = 6.7 years). The most influential SDoH was perceived level of social activity (RI = 0.17). Lower level of social activity was associated with a higher risk of MDD [hazard ratio = 2.27 (95% CI 2.00-2.50) for highest v. lowest level]. CONCLUSION Across a range of SDoH variables, perceived level of social activity is most strongly related to MDD in older adults. Monitoring changes in the level of social activity may help identify older adults at an increased risk of MDD.
Affiliation(s)
- Euijung Ryu
  - Department of Quantitative Health Sciences, Mayo Clinic, Rochester, MN, USA
- Gregory D. Jenkins
  - Department of Quantitative Health Sciences, Mayo Clinic, Rochester, MN, USA
- Yanshan Wang
  - Department of AI and Informatics, Mayo Clinic, Rochester, USA
- Mark Olfson
  - Department of Psychiatry, Columbia University and New York State Psychiatric Institute, New York, USA
- Ardesheer Talati
  - Department of Psychiatry, Columbia University and New York State Psychiatric Institute, New York, USA
- Lauren Lepow
  - Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, USA
- Brandon J. Coombes
  - Department of Quantitative Health Sciences, Mayo Clinic, Rochester, MN, USA
- Alexander W. Charney
  - Mount Sinai Clinical Intelligence Center, Icahn School of Medicine at Mount Sinai, New York, USA
- Benjamin S. Glicksberg
  - The Hasso Plattner Institute for Digital Health at Mount Sinai, Icahn School of Medicine at Mount Sinai, New York, USA
- J. John Mann
  - Department of Psychiatry, Columbia University and New York State Psychiatric Institute, New York, USA
- Myrna M. Weissman
  - Department of Psychiatry, Columbia University and New York State Psychiatric Institute, New York, USA
- Priya Wickramaratne
  - Department of Psychiatry, Columbia University and New York State Psychiatric Institute, New York, USA
- Joanna M. Biernacka
  - Department of Quantitative Health Sciences, Mayo Clinic, Rochester, MN, USA
  - Department of Psychiatry & Psychology, Mayo Clinic, Rochester, USA
6. Walsh CG, Ripperger MA, Hu Y, Sheu YH, Wilimitis D, Zheutlin AB, Rocha D, Choi KW, Castro VM, Kirchner HL, Chabris CF, Davis LK, Smoller JW. Development and Multi-Site External Validation of a Generalizable Risk Prediction Model for Bipolar Disorder. medRxiv 2023:2023.02.21.23286251. [PMID: 36865341; PMCID: PMC9980254; DOI: 10.1101/2023.02.21.23286251] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/27/2023]
Abstract
Bipolar disorder is a leading contributor to disability, premature mortality, and suicide. Early identification of risk for bipolar disorder using generalizable predictive models trained on diverse cohorts around the United States could improve targeted assessment of high risk individuals, reduce misdiagnosis, and improve the allocation of limited mental health resources. This observational case-control study intended to develop and validate generalizable predictive models of bipolar disorder as part of the multisite, multinational PsycheMERGE Consortium across diverse and large biobanks with linked electronic health records (EHRs) from three academic medical centers: in the Northeast (Massachusetts General Brigham), the Mid-Atlantic (Geisinger) and the Mid-South (Vanderbilt University Medical Center). Predictive models were developed and validated with multiple algorithms at each study site: random forests, gradient boosting machines, penalized regression, including stacked ensemble learning algorithms combining them. Predictors were limited to widely available EHR-based features agnostic to a common data model including demographics, diagnostic codes, and medications. The main study outcome was bipolar disorder diagnosis as defined by the International Cohort Collection for Bipolar Disorder, 2015. In total, the study included records for 3,529,569 patients including 12,533 cases (0.3%) of bipolar disorder. After internal and external validation, algorithms demonstrated optimal performance in their respective development sites. The stacked ensemble achieved the best combination of overall discrimination (AUC = 0.82 - 0.87) and calibration performance with positive predictive values above 5% in the highest risk quantiles at all three study sites. In conclusion, generalizable predictive models of risk for bipolar disorder can be feasibly developed across diverse sites to enable precision medicine. 
Comparison of a range of machine learning methods indicated that an ensemble approach provides the best performance overall but required local retraining. These models will be disseminated via the PsycheMERGE Consortium website.
7. Exome sequencing in bipolar disorder identifies AKAP11 as a risk gene shared with schizophrenia. Nat Genet 2022; 54:541-547. [DOI: 10.1038/s41588-022-01034-x] [Citation(s) in RCA: 95] [Impact Index Per Article: 31.7] [Received: 03/03/2021] [Accepted: 02/15/2022] [Indexed: 12/30/2022]
8. Multi-omics research in sarcopenia: Current progress and future prospects. Ageing Res Rev 2022; 76:101576. [PMID: 35104630; DOI: 10.1016/j.arr.2022.101576] [Citation(s) in RCA: 41] [Impact Index Per Article: 13.7] [Received: 06/13/2021] [Revised: 12/13/2021] [Accepted: 01/26/2022] [Indexed: 12/17/2022]
Abstract
Sarcopenia is a systemic disease with progressive and generalized skeletal muscle dysfunction defined by age-related low muscle mass, high content of muscle slow fibers, and low muscle function. Muscle phenotypes and sarcopenia risk are heritable; however, the genetic architecture and molecular mechanisms underlying sarcopenia remain largely unclear. In recent years, significant progress has been made in determining susceptibility loci using genome-wide association studies. In addition, recent advances in omics techniques, including genomics, epigenomics, transcriptomics, proteomics, and metabolomics, offer new opportunities to identify novel targets to help us understand the pathophysiology of sarcopenia. However, each individual technology cannot capture the entire view of the biological complexity of this disorder, while integrative multi-omics analyses may be able to reveal new insights. Here, we review the latest findings of multi-omics studies for sarcopenia and provide an in-depth summary of our current understanding of sarcopenia pathogenesis. Leveraging multi-omics data could give us a holistic understanding of sarcopenia etiology that may lead to new clinical applications. This review offers guidance and recommendations for fundamental research, innovative perspectives, and preventative and therapeutic interventions for sarcopenia.
9. Loebel A, Koblan KS, Tsai J, Deng L, Fava M, Kent J, Hopkins SC. A Randomized, Double-blind, Placebo-controlled Proof-of-Concept Trial to Evaluate the Efficacy and Safety of Non-racemic Amisulpride (SEP-4199) for the Treatment of Bipolar I Depression. J Affect Disord 2022; 296:549-558. [PMID: 34614447; DOI: 10.1016/j.jad.2021.09.109] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 08/24/2021] [Revised: 09/16/2021] [Accepted: 09/29/2021] [Indexed: 12/11/2022]
Abstract
BACKGROUND Non-racemic amisulpride (SEP-4199) is an 85:15 ratio of aramisulpride:esamisulpride with a 5-HT7 and D2 receptor binding profile optimized for the treatment of bipolar depression. The aim of this study was to evaluate the efficacy and safety of SEP-4199 for the treatment of bipolar depression. METHODS Patients meeting DSM-5 criteria for bipolar I depression were randomized to 6 weeks of double-blind, placebo-controlled treatment with SEP-4199 200 mg/d or 400 mg/d. The primary endpoint was change in the Montgomery-Asberg Depression Rating Scale (MADRS) at Week 6. The primary efficacy analysis population consisted of patients in Europe and US (n = 289); the secondary efficacy analysis population (ITT; n = 337) included patients in Japan. RESULTS Endpoint improvement in MADRS total score was observed on both the primary analysis for SEP-4199 200 mg/d (P = 0.054; effect size [ES], 0.31) and 400 mg/d (P = 0.054; ES, 0.29), and on the secondary (full ITT) analysis for SEP-4199 200 mg/d (P = 0.016; ES, 0.34) and 400 mg/d (P = 0.024; ES, 0.31). Study completion rates were 81% on SEP-4199 200 mg/d, 88% on 400 mg/d, and 86% on placebo. SEP-4199 had low rates of individual adverse events (<8%) and minimal effects on weight and lipids; median increases in prolactin were +83.6 μg/L on 200 mg/d, +95.2 μg/L on 400 mg/d compared with 0.0 μg/L on placebo. LIMITATIONS The study excluded patients with bipolar II depression and serious psychiatric or medical comorbidity. CONCLUSION Study results provide preliminary proof of concept, needing confirmation in subsequent randomized trials, for the efficacy of non-racemic amisulpride in bipolar depression.
Affiliation(s)
- Antony Loebel
  - Sunovion Pharmaceuticals Inc., Marlborough, MA, United States of America
- Kenneth S Koblan
  - Sunovion Pharmaceuticals Inc., Marlborough, MA, United States of America
- Joyce Tsai
  - Sunovion Pharmaceuticals Inc., Marlborough, MA, United States of America
- Ling Deng
  - Sunovion Pharmaceuticals Inc., Marlborough, MA, United States of America
- Maurizio Fava
  - Department of Psychiatry, Massachusetts General Hospital, and Harvard Medical School, Boston, MA, United States of America
- Justine Kent
  - Sunovion Pharmaceuticals Inc., Marlborough, MA, United States of America
- Seth C Hopkins
  - Sunovion Pharmaceuticals Inc., Marlborough, MA, United States of America
10.
Abstract
Electronic health records (EHRs) are a rich source of data for researchers, but extracting meaningful information out of this highly complex data source is challenging. Phecodes represent one strategy for defining phenotypes for research using EHR data. They are a high-throughput phenotyping tool based on ICD (International Classification of Diseases) codes that can be used to rapidly define the case/control status of thousands of clinically meaningful diseases and conditions. Phecodes were originally developed to conduct phenome-wide association studies to scan for phenotypic associations with common genetic variants. Since then, phecodes have been used to support a wide range of EHR-based phenotyping methods, including the phenotype risk score. This review aims to comprehensively describe the development, validation, and applications of phecodes and suggest some future directions for phecodes and high-throughput phenotyping.
Affiliation(s)
- Lisa Bastarache
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee 37232, USA
11.
Abstract
Bipolar disorder (BP) is a highly heritable disease, with heritability estimated between 60 and 85% by twin studies. The underlying genetic architecture was poorly understood for years because the available technology was limited to the candidate gene approach, which did not allow exploration of the contribution of multiple loci throughout the genome. BP is a complex disorder whose pathogenesis is influenced by a number of genetic variants, each with small effect size, and by environmental exposures. Genome-wide association studies (GWAS) provided meaningful insights into the genetics of BP, including replicated genetic variants, and allowed the development of novel multi-marker methods for gene/pathway analysis and for estimating the genetic overlap between BP and other traits. However, the existing GWAS also had relevant limitations, notably insufficient statistical power and lack of consideration of rare variants, which may be responsible for the relatively low heritability explained (~20% in the largest GWAS) compared to twin studies. The availability of data from large biobanks and automated phenotyping from electronic health records or digital phenotyping represent key steps for providing samples with adequate power for genetic analysis. Next-generation sequencing is becoming more and more feasible in terms of costs, leading to rapid growth in the number of samples with whole-genome or whole-exome sequence data. These recent and unprecedented resources are of key importance for a more comprehensive understanding of the specific genetic factors involved in BP and their mechanistic action in determining disease onset and prognosis.
Affiliation(s)
- Chiara Fabbri
  - Social, Genetic and Developmental Psychiatry Centre, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, UK
12
Walters CE, Nitin R, Margulis K, Boorom O, Gustavson DE, Bush CT, Davis LK, Below JE, Cox NJ, Camarata SM, Gordon RL. Automated Phenotyping Tool for Identifying Developmental Language Disorder Cases in Health Systems Data (APT-DLD): A New Research Algorithm for Deployment in Large-Scale Electronic Health Record Systems. Journal of Speech, Language, and Hearing Research 2020; 63:3019-3035. [PMID: 32791019 PMCID: PMC7890229 DOI: 10.1044/2020_jslhr-19-00397] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 12/13/2019] [Revised: 04/23/2020] [Accepted: 05/19/2020] [Indexed: 05/13/2023]
Abstract
Purpose: Data mining algorithms using electronic health records (EHRs) are useful in large-scale population-wide studies to classify etiology and comorbidities (Casey et al., 2016). Here, we apply this approach to developmental language disorder (DLD), a prevalent communication disorder whose risk factors and epidemiology remain largely undiscovered. Method: We first created a reliable system for manually identifying DLD in EHRs based on speech-language pathologist (SLP) diagnostic expertise. We then developed and validated an automated algorithmic procedure, called the Automated Phenotyping Tool for identifying DLD cases in health systems data (APT-DLD), that classifies DLD status for patients within EHRs on the basis of ICD (International Statistical Classification of Diseases and Related Health Problems) codes. APT-DLD was validated in a discovery sample (N = 973) using expert SLP manual phenotype coding as a gold-standard comparison and then applied and further validated in a replication sample of N = 13,652 EHRs. Results: In the discovery sample, the APT-DLD algorithm classified 98% of DLD cases in concordance with manually coded records in the training set, indicating that APT-DLD successfully mimics a comprehensive chart review. The output of APT-DLD was also validated against independently conducted SLP clinician coding in a subset of records, with a positive predictive value of 95%. We also applied APT-DLD to the replication sample, where it achieved a positive predictive value of 90% in relation to SLP clinician classification of DLD. Conclusions: APT-DLD is a reliable, valid, and scalable tool for identifying DLD cohorts in EHRs. This new method has promising public health implications for future large-scale epidemiological investigations of DLD and may inform EHR data mining algorithms for other communication disorders. Supplemental Material: https://doi.org/10.23641/asha.12753578.
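For readers unfamiliar with the two validation metrics reported in this abstract, the following is a minimal sketch of how concordance and positive predictive value are computed from paired algorithm and clinician labels. The function names and the toy labels are hypothetical and are not taken from the APT-DLD study.

```python
def positive_predictive_value(predicted, reference):
    """Share of algorithm-flagged cases that the reference coding also marks as cases."""
    if len(predicted) != len(reference):
        raise ValueError("label lists must have the same length")
    true_pos = sum(1 for p, r in zip(predicted, reference) if p and r)
    false_pos = sum(1 for p, r in zip(predicted, reference) if p and not r)
    if true_pos + false_pos == 0:
        raise ValueError("the algorithm flagged no cases")
    return true_pos / (true_pos + false_pos)


def concordance(predicted, reference):
    """Share of records on which the algorithm and the reference coding agree."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("label lists must be non-empty and of equal length")
    agreements = sum(1 for p, r in zip(predicted, reference) if p == r)
    return agreements / len(predicted)


# Toy labels (hypothetical): True means "classified as a case".
algorithm_labels = [True, True, True, True, False, False]
clinician_labels = [True, True, True, False, False, True]

print(positive_predictive_value(algorithm_labels, clinician_labels))
print(concordance(algorithm_labels, clinician_labels))
```

Note that positive predictive value only looks at records the algorithm flagged, so it says nothing about cases the algorithm missed; concordance counts agreement over all records.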
Affiliation(s)
- Courtney E. Walters
  - Department of Otolaryngology, Vanderbilt University Medical Center, Nashville, TN
  - Neuroscience Program, College of Arts and Science, Vanderbilt University, Nashville, TN
- Rachana Nitin
  - Department of Otolaryngology, Vanderbilt University Medical Center, Nashville, TN
  - Vanderbilt Brain Institute, Vanderbilt University, Nashville, TN
- Katherine Margulis
  - Department of Hearing and Speech Sciences, Vanderbilt University Medical Center, Nashville, TN
  - Kennedy Krieger Institute, Baltimore, MD
- Olivia Boorom
  - Department of Hearing and Speech Sciences, Vanderbilt University Medical Center, Nashville, TN
- Daniel E. Gustavson
  - Department of Otolaryngology, Vanderbilt University Medical Center, Nashville, TN
  - Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN
- Catherine T. Bush
  - Department of Hearing and Speech Sciences, Vanderbilt University Medical Center, Nashville, TN
- Lea K. Davis
  - Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN
  - Department of Medicine, Vanderbilt University Medical Center, Nashville, TN
- Jennifer E. Below
  - Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN
  - Department of Medicine, Vanderbilt University Medical Center, Nashville, TN
- Nancy J. Cox
  - Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN
  - Department of Medicine, Vanderbilt University Medical Center, Nashville, TN
- Stephen M. Camarata
  - Department of Hearing and Speech Sciences, Vanderbilt University Medical Center, Nashville, TN
- Reyna L. Gordon
  - Department of Otolaryngology, Vanderbilt University Medical Center, Nashville, TN
  - Vanderbilt Brain Institute, Vanderbilt University, Nashville, TN
  - Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN
13
Milaneschi Y, Lamers F, Berk M, Penninx BWJH. Depression Heterogeneity and Its Biological Underpinnings: Toward Immunometabolic Depression. Biol Psychiatry 2020; 88:369-380. [PMID: 32247527 DOI: 10.1016/j.biopsych.2020.01.014] [Citation(s) in RCA: 245] [Impact Index Per Article: 49.0] [Received: 07/01/2019] [Revised: 12/03/2019] [Accepted: 01/18/2020] [Indexed: 12/14/2022]
Abstract
Epidemiological evidence indicates the presence of dysregulated homeostatic biological pathways in depressed patients, such as increased inflammation and disrupted energy-regulating neuroendocrine signaling (e.g., leptin, insulin). Alterations in these biological pathways may explain the considerable comorbidity between depression and cardiometabolic conditions (e.g., obesity, metabolic syndrome, diabetes) and represent a promising target for intervention. This review describes how immunometabolic dysregulations vary as a function of depression heterogeneity by illustrating that such biological dysregulations map more consistently to atypical behavioral symptoms reflecting altered energy intake/expenditure balance (hyperphagia, weight gain, hypersomnia, fatigue, and leaden paralysis) and may moderate the antidepressant effects of standard or novel (e.g., anti-inflammatory) therapeutic approaches. These lines of evidence are integrated in a conceptual model of immunometabolic depression emerging from the clustering of immunometabolic biological dysregulations and specific behavioral symptoms. The review finally elicits questions to be answered by future research and describes how the immunometabolic depression dimension could be used to dissect the heterogeneity of depression and potentially to match subgroups of patients to specific treatments with higher likelihood of clinical success.
Affiliation(s)
- Yuri Milaneschi
  - Department of Psychiatry, Amsterdam Public Health and Amsterdam Neuroscience, Amsterdam University Medical Center/Vrije Universiteit & GGZinGeest, Amsterdam, The Netherlands
- Femke Lamers
  - Department of Psychiatry, Amsterdam Public Health and Amsterdam Neuroscience, Amsterdam University Medical Center/Vrije Universiteit & GGZinGeest, Amsterdam, The Netherlands
- Michael Berk
  - Institute for Mental and Physical Health and Clinical Treatment, School of Medicine, Deakin University and Barwon Health, Geelong, Victoria, Australia
  - Orygen, The National Centre of Excellence in Youth Mental Health, Department of Psychiatry, University of Melbourne, Melbourne, Victoria, Australia
  - Florey Institute of Neuroscience and Mental Health, University of Melbourne, Melbourne, Victoria, Australia
- Brenda W J H Penninx
  - Department of Psychiatry, Amsterdam Public Health and Amsterdam Neuroscience, Amsterdam University Medical Center/Vrije Universiteit & GGZinGeest, Amsterdam, The Netherlands
14
Defining Major Depressive Disorder Cohorts Using the EHR: Multiple Phenotypes Based on ICD-9 Codes and Medication Orders. Neurol Psychiatry Brain Res 2020; 36:18-26. [PMID: 32218644 DOI: 10.1016/j.npbr.2020.02.002] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Indexed: 01/26/2023]
Abstract
Background: Major Depressive Disorder (MDD) is one of the most common mental illnesses and a leading cause of disability worldwide. Electronic Health Records (EHR) allow researchers to conduct unprecedented large-scale observational studies investigating MDD, its disease development, and its interaction with other health outcomes. While there exist methods to classify patients as clear cases or controls, given specific data requirements, there are presently no simple, generalizable, and validated methods to classify an entire patient population into varying groups of depression likelihood and severity. Methods: We have tested a simple, pragmatic electronic phenotype algorithm that classifies patients into one of five mutually exclusive, ordinal groups, varying in depression phenotype. Using data from an integrated health system on 278,026 patients from a 10-year study period, we have tested the convergent validity of these constructs using measures of external validation, including patterns of psychiatric prescriptions, symptom severity, indicators of suicidality, comorbidity, mortality, health care utilization, and polygenic risk scores for MDD. Results: We found consistent patterns of increasing morbidity and/or adverse outcomes across the five groups, providing evidence for convergent validity. Limitations: The study population is from a single rural integrated health system which is predominantly white, possibly limiting its generalizability. Conclusion: Our study provides initial evidence that a simple algorithm, generalizable to most EHR data sets, provides categories with meaningful face and convergent validity that can be used for stratification of an entire patient population.
15
Dennis J, Yengo-Kahn AM, Kirby P, Solomon GS, Cox NJ, Zuckerman SL. Diagnostic Algorithms to Study Post-Concussion Syndrome Using Electronic Health Records: Validating a Method to Capture an Important Patient Population. J Neurotrauma 2019; 36:2167-2177. [PMID: 30773988 DOI: 10.1089/neu.2018.5916] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 02/04/2023] Open
Abstract
Post-concussion syndrome (PCS) is characterized by persistent cognitive, somatic, and emotional symptoms after a mild traumatic brain injury (mTBI). Genetic and other biological variables may contribute to PCS etiology, and the emergence of biobanks linked to electronic health records (EHRs) offers new opportunities for research on PCS. We sought to validate the EHR data of PCS patients by comparing two diagnostic algorithms deployed in the Vanderbilt University Medical Center de-identified database of 2.8 million patient EHRs. The algorithms identified individuals with PCS by: 1) natural language processing (NLP) of narrative text in the EHR combined with structured demographic, diagnostic, and encounter data; or 2) coded billing and procedure data. The predictive value of each algorithm was assessed, and cases and controls identified by each approach were compared on demographic and medical characteristics. The NLP algorithm identified 507 cases and 10,857 controls. The negative predictive value in controls was 78% and the positive predictive value (PPV) in cases was 82%. By comparison, the coded algorithm identified 1142 patients with two or more PCS billing codes and had a PPV of 76%. Comparisons of PCS controls to both case groups recovered known epidemiology of PCS: cases were more likely than controls to be female and to have pre-morbid diagnoses of anxiety, migraine, and post-traumatic stress disorder. In contrast, controls and cases were equally likely to have attention-deficit/hyperactivity disorder and learning disabilities, in accordance with the findings of recent systematic reviews of PCS risk factors. We conclude that EHRs are a valuable research tool for PCS. Ascertainment based on coded data alone had a predictive value comparable to an NLP algorithm, recovered known PCS risk factors, and maximized the number of included patients.
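The coded approach in this abstract rests on a simple rule: a patient counts as a case after two or more PCS billing codes. A minimal sketch of such a rule follows. The abstract does not list the codes the authors used, so the `PCS_CODES` set (the ICD-9 and ICD-10-CM postconcussional syndrome codes), the function name `flag_coded_cases`, and the toy billing rows are assumptions made for illustration.

```python
from collections import Counter

# Assumed code set: ICD-9 310.2 and ICD-10-CM F07.81 (postconcussional syndrome).
# The study's actual code list is not given in the abstract.
PCS_CODES = {"310.2", "F07.81"}


def flag_coded_cases(billing_rows, min_codes=2):
    """Return the set of patient IDs with at least `min_codes` PCS billing codes.

    billing_rows: iterable of (patient_id, billing_code) pairs, one per billed code.
    """
    counts = Counter(
        patient_id for patient_id, code in billing_rows if code in PCS_CODES
    )
    return {patient_id for patient_id, n in counts.items() if n >= min_codes}


# Toy billing data (hypothetical).
rows = [
    ("a", "310.2"),
    ("a", "F07.81"),
    ("b", "310.2"),
    ("c", "S06.0X0A"),
    ("c", "S06.0X0A"),
]
print(flag_coded_cases(rows))
```

A rule of this kind needs only structured billing data, which is consistent with the abstract's observation that coded ascertainment maximized the number of included patients.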
Affiliation(s)
- Jessica Dennis
  - Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee
  - Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, Tennessee
- Aaron M Yengo-Kahn
  - Vanderbilt Sports Concussion Center, Vanderbilt University School of Medicine, Nashville, Tennessee
  - Department of Neurological Surgery, Vanderbilt University School of Medicine, Nashville, Tennessee
- Paul Kirby
  - Vanderbilt Sports Concussion Center, Vanderbilt University School of Medicine, Nashville, Tennessee
- Gary S Solomon
  - Vanderbilt Sports Concussion Center, Vanderbilt University School of Medicine, Nashville, Tennessee
  - Department of Neurological Surgery, Vanderbilt University School of Medicine, Nashville, Tennessee
- Nancy J Cox
  - Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee
  - Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, Tennessee
- Scott L Zuckerman
  - Vanderbilt Sports Concussion Center, Vanderbilt University School of Medicine, Nashville, Tennessee
  - Department of Neurological Surgery, Vanderbilt University School of Medicine, Nashville, Tennessee
16
Electronic Health Records Are the Next Frontier for the Genetics of Substance Use Disorders. Trends Genet 2019; 35:317-318. [PMID: 30797598 DOI: 10.1016/j.tig.2019.01.007] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 01/10/2019] [Accepted: 01/30/2019] [Indexed: 11/23/2022]
Abstract
Compared with other psychiatric disorders of similar heritabilities, the progress of substance use disorders (SUD) genetics has been slow. With the growing availability of large-scale biobanks with extensive phenotypes from electronic health records (EHR) and genotypes across millions of individuals, this platform is the next tool to accelerate SUD genetics research.
17
Song W, Huang H, Zhang CZ, Bates DW, Wright A. Using whole genome scores to compare three clinical phenotyping methods in complex diseases. Sci Rep 2018; 8:11360. [PMID: 30054501 PMCID: PMC6063939 DOI: 10.1038/s41598-018-29634-w] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 03/20/2018] [Accepted: 07/16/2018] [Indexed: 12/19/2022] Open
Abstract
Genome-wide association studies depend on accurate ascertainment of patient phenotype. However, phenotyping is difficult, and it is often treated as an afterthought in these studies because of the expense involved. Electronic health records (EHRs) may provide higher fidelity phenotypes for genomic research than other sources such as administrative data. We used whole genome association models to evaluate different EHR and administrative data-based phenotyping methods in a cohort of 16,858 Caucasian subjects for type 1 diabetes mellitus, type 2 diabetes mellitus, coronary artery disease and breast cancer. For each disease, we trained and evaluated polygenic models using three different phenotype definitions: phenotypes derived from billing data, the clinical problem list, or a curated phenotyping algorithm. We observed that for these diseases, the curated phenotype outperformed the problem list, and the problem list outperformed administrative billing data. This suggests that using advanced EHR-derived phenotypes can further increase the power of genome-wide association studies.
Affiliation(s)
- Wenyu Song
  - Division of General Internal Medicine and Primary Care, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts, 02120, USA
  - Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, 02115, USA
- Hailiang Huang
  - Analytic and Translational Genetics Unit, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts, 02114, USA
  - Broad Institute of MIT and Harvard, Cambridge, Massachusetts, 02142, USA
- Cheng-Zhong Zhang
  - Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, 02115, USA
  - Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute and Harvard Medical School, Boston, Massachusetts, 02215, USA
  - Broad Institute of MIT and Harvard, Cambridge, Massachusetts, 02142, USA
- David W Bates
  - Division of General Internal Medicine and Primary Care, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts, 02120, USA
  - Information Systems Department, Partners HealthCare, Somerville, Massachusetts, 02145, USA
- Adam Wright
  - Division of General Internal Medicine and Primary Care, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts, 02120, USA
  - Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, 02115, USA
  - Information Systems Department, Partners HealthCare, Somerville, Massachusetts, 02145, USA
18
Comparing Deep Learning and Classical Machine Learning Approaches for Predicting Inpatient Violence Incidents from Clinical Text. Applied Sciences (Basel) 2018. [DOI: 10.3390/app8060981] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.4] [Indexed: 11/16/2022]