1
Dai L, Liu Z, Guo C, Fan H, Zhang C, Huang J, Zhang X, Zhao S, Wang H, Zhang T. Proteomic insights into metabolic dysfunction-associated steatotic disease: Identifying therapeutic targets and assessing on-target side effects. Life Sci 2025; 373:123665. [PMID: 40287056 DOI: 10.1016/j.lfs.2025.123665] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/05/2024] [Revised: 03/24/2025] [Accepted: 04/21/2025] [Indexed: 04/29/2025]
Abstract
AIMS The prevalence of metabolic dysfunction-associated steatotic liver disease (MASLD) is rising sharply, yet treatment options remain inadequate. To uncover new therapeutic targets for MASLD, we conducted a comprehensive proteome-wide Mendelian randomization (MR) and phenome-wide association study (PheWAS). MATERIALS AND METHODS Discovery MR utilized protein quantitative trait loci (pQTL) data on 4907 plasma protein levels from 35,559 individuals, alongside genome-wide association study (GWAS) on MASLD from the Million Veteran Program (68,725 cases/95,482 controls). Validation comprised five pairwise combinations of these discovery datasets with three additional datasets: pQTL data for 2923 proteins from the UK Biobank, and liver biopsy-confirmed MASLD GWAS (1483 cases/17,781 controls) and MRI-liver fat GWAS (31,377 subjects) (excluding discovery pair). Candidate proteins underwent druggability assessment and on-target side effect evaluation via PheWAS. KEY FINDINGS We identified 26 proteins associated with MASLD after Bonferroni correction (P < 1.16 × 10⁻⁵), with 19 of them showing no significant reverse association. Interleukin-6 (IL-6), alpha-1-antitrypsin (α1-antitrypsin), 5-hydroxytryptamine receptor 7 (5-HT7R), ephrin-B1 (EFNB1), and protein MENT (CA056) were replicated. Notably, IL-6 (OR = 2.02; 95% CI 1.54-2.64), 5-HT7R (OR = 2.73; 95% CI 1.96-3.80), and EFNB1 (OR = 1.82; 95% CI 1.59-2.08) were positively associated with MASLD risk, whereas α1-antitrypsin (OR = 0.84; 95% CI 0.78-0.90) and CA056 (OR = 0.90; 95% CI 0.86-0.94) appeared protective. Among these, IL-6, 5-HT7R, and α1-antitrypsin were druggable. PheWAS identified potential cardiovascular side effects for 5-HT7R and α1-antitrypsin. SIGNIFICANCE The integrative study identified several plasma proteins associated with MASLD. IL-6, α1-antitrypsin, 5-HT7R, EFNB1 and CA056 deserve further investigation as potential drug targets for MASLD.
Affiliation(s)
- Luojia Dai
- Department of Epidemiology, School of Public Health, Key Laboratory of Public Health Safety (Fudan University), Ministry of Education, Fudan University, Shanghai 200032, China
- Zhenqiu Liu
- Human Phenome Institute, Research and Innovation Center, Shanghai Pudong Hospital, Fudan University, Shanghai, China; Fudan University Taizhou Institute of Health Sciences, Taizhou, China
- Chengnan Guo
- Department of Epidemiology, School of Public Health, Key Laboratory of Public Health Safety (Fudan University), Ministry of Education, Fudan University, Shanghai 200032, China; Shanghai Institute of Infectious Disease and Biosecurity, School of Public Health, Fudan University, Shanghai, China
- Hong Fan
- Department of Epidemiology, School of Public Health, Key Laboratory of Public Health Safety (Fudan University), Ministry of Education, Fudan University, Shanghai 200032, China
- Chengjun Zhang
- Department of Epidemiology, School of Public Health, Key Laboratory of Public Health Safety (Fudan University), Ministry of Education, Fudan University, Shanghai 200032, China
- Jiayi Huang
- Department of Epidemiology, School of Public Health, Key Laboratory of Public Health Safety (Fudan University), Ministry of Education, Fudan University, Shanghai 200032, China
- Xin Zhang
- Department of Epidemiology, School of Public Health, Key Laboratory of Public Health Safety (Fudan University), Ministry of Education, Fudan University, Shanghai 200032, China
- Shuzhen Zhao
- Department of Epidemiology, School of Public Health, Key Laboratory of Public Health Safety (Fudan University), Ministry of Education, Fudan University, Shanghai 200032, China
- Haili Wang
- Department of Epidemiology, School of Public Health, Key Laboratory of Public Health Safety (Fudan University), Ministry of Education, Fudan University, Shanghai 200032, China
- Tiejun Zhang
- Department of Epidemiology, School of Public Health, Key Laboratory of Public Health Safety (Fudan University), Ministry of Education, Fudan University, Shanghai 200032, China; Shanghai Institute of Infectious Disease and Biosecurity, School of Public Health, Fudan University, Shanghai, China.
2
Yan C, Grabowska ME, Thakkar R, Dickson AL, Embí PJ, Feng Q, Denny JC, Kerchberger VE, Malin BA, Wei WQ. Beyond Phecodes: leveraging PheMAP to identify patients lacking diagnosis codes in electronic health records. J Am Med Inform Assoc 2025; 32:1007-1014. [PMID: 40156924 DOI: 10.1093/jamia/ocaf055] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/22/2024] [Revised: 02/16/2025] [Accepted: 03/18/2025] [Indexed: 04/01/2025] Open
Abstract
OBJECTIVE Diagnosis codes documented in electronic health records (EHR) are often relied upon to clinically phenotype patients for biomedical research. However, these diagnoses can be incomplete and inaccurate, leading to false negatives when searching for patients with phenotypes of interest. This study aims to determine whether PheMAP, a comprehensive knowledgebase integrating multiple clinical terminologies beyond diagnosis to capture phenotypes, can effectively identify patients lacking relevant EHR diagnosis codes. MATERIALS AND METHODS We investigated a collection of 3.5 million patient records from Vanderbilt University Medical Center's EHR and focused on 4 well-studied phenotypes: (1) type 2 diabetes mellitus (T2DM), (2) dementia, (3) prostate cancer, and (4) sensorineural hearing loss. We applied PheMAP to match structured concepts in patient records and calculated a phenotype risk score (PheScore) to indicate patient-phenotype similarity. Patients meeting predefined PheScore criteria but lacking diagnosis codes were identified. Clinically knowledgeable experts adjudicated randomly selected patients per phenotype as Positive, Possibly Positive, or Negative. RESULTS Our approach indicated that 5.3% of patients lacked a diagnosis for T2DM, 4.5% for dementia, 2.2% for prostate cancer, and 0.2% for sensorineural hearing loss. The expert review indicated 100% precision (for Possibly Positive or Positive cases) for dementia and sensorineural hearing loss, and 90.0% and 85.0% precision for T2DM and prostate cancer, respectively. Excluding Possibly Positive cases, the precision for T2DM and prostate cancer was 88.9% and 81.3%, respectively. CONCLUSIONS Leveraging clinical terminologies incorporated by PheMAP can effectively identify patients with phenotypes who lack EHR diagnosis codes, thereby enhancing phenotyping quality and related research reliability.
Affiliation(s)
- Chao Yan
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37203, United States
- Monika E Grabowska
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37203, United States
- Rut Thakkar
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37203, United States
- Alyson L Dickson
- Department of Medicine, Vanderbilt University Medical Center, Nashville, TN 37232, United States
- Peter J Embí
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37203, United States
- Department of Medicine, Vanderbilt University Medical Center, Nashville, TN 37232, United States
- QiPing Feng
- Department of Medicine, Vanderbilt University Medical Center, Nashville, TN 37232, United States
- Joshua C Denny
- All of Us Research Program, National Institutes of Health, Bethesda, MD 20892, United States
- Vern Eric Kerchberger
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37203, United States
- Department of Medicine, Vanderbilt University Medical Center, Nashville, TN 37232, United States
- Bradley A Malin
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37203, United States
- Department of Computer Science, Vanderbilt University, Nashville, TN 37203, United States
- Department of Biostatistics, Vanderbilt University Medical Center, Nashville, TN 37203, United States
- Wei-Qi Wei
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37203, United States
- Department of Computer Science, Vanderbilt University, Nashville, TN 37203, United States
3
Zhang R, Hou Y, Cui E, Lim K, Chow L, Howell M, Ikramuddin S. Association of Physical Activity from Wearable Devices and Chronic Disease Risk: Insights from the All of Us Research Program. RESEARCH SQUARE 2025:rs.3.rs-6263507. [PMID: 40386402 PMCID: PMC12083687 DOI: 10.21203/rs.3.rs-6263507/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/25/2025]
Abstract
Physical activity is a modifiable factor influencing chronic disease risk. Previous studies often relied on self-reported activity measures or short-term assessments, limiting their accuracy. Leveraging Fitbit-derived data from the All of Us Research Program, we investigated associations between long-term physical activity patterns and chronic disease incidence in a diverse cohort. The study included 22,019 participants with at least six months of Fitbit monitoring and linked electronic health records. Key activity metrics included daily step count, activity calories, elevation gain, and activity duration at different intensities. Higher physical activity levels were associated with a lower risk of multiple chronic diseases. A 2,000-step increase in daily step count was linked to a reduced risk of obesity (hazard ratio [HR] = 0.85, 95% confidence interval [CI]: 0.80-0.90), type 2 diabetes (HR = 0.78, CI: 0.72-0.84), and major depressive disorder (HR = 0.83, CI: 0.77-0.90). Elevation gain was inversely associated with obesity (HR = 0.86, CI: 0.78-0.95) and type 2 diabetes (HR = 0.65, CI: 0.53-0.80). Increased time spent in very active intensity correlated with a lower risk of multiple conditions, including obstructive sleep apnea and morbid obesity. Conversely, prolonged sedentary time was associated with an increased risk of cardiometabolic diseases, including obesity (HR = 1.08, CI: 1.06-1.10) and essential hypertension (HR = 1.05, CI: 1.04-1.07). A sensitivity analysis using BMI-defined obesity instead of EHR-based diagnoses confirmed the robustness of these associations. These findings underscore the protective role of increased physical activity and reduced sedentary time in mitigating chronic disease risk. They support the development of personalized physical activity recommendations and targeted public health interventions aimed at improving long-term health outcomes. 
Future research integrating machine learning approaches could further refine activity-based disease prevention strategies.
Affiliation(s)
- Erjia Cui
- University of Minnesota, Division of Biostatistics
4
Paz V, Wilcox H, Goodman M, Wang H, Garfield V, Saxena R, Dashti HS. Associations of a multidimensional polygenic sleep health score and a sleep lifestyle index with disease outcomes and their interaction in a clinical biobank. Sleep Health 2025:S2352-7218(25)00041-5. [PMID: 40222844 DOI: 10.1016/j.sleh.2025.02.009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/17/2024] [Revised: 02/10/2025] [Accepted: 02/24/2025] [Indexed: 04/15/2025]
Abstract
OBJECTIVES Sleep is a complex behavior regulated by genetic and environmental factors impacting disease outcomes. However, the effect of multidimensional sleep encompassing several sleep dimensions on common diseases, specifically mental health disorders, has yet to be fully elucidated. Using the Mass General Brigham Biobank, we examined the association of multidimensional sleep with disease outcomes and investigated whether sleep behaviors modulate genetic predisposition to unfavorable sleep on mental health diseases. METHODS We generated a Polygenic Sleep Health Score using previously identified single nucleotide polymorphisms and constructed a Sleep Lifestyle Index based on self-reported questions and electronic health records; tested their association; performed phenome-wide association analyses between these indexes and clinical phenotypes; and analyzed their interaction on prevalent mental health diseases. A total of 15,884 participants were included in the analysis (mean age 54.4; 58.6% female). RESULTS The Polygenic Sleep Health Score was associated with the Sleep Lifestyle Index (β=0.050, 95% CI=0.032, 0.068) and with 114 disease outcomes spanning 12 disease groups, including obesity, sleep, and substance use disease outcomes (p<3.3×10⁻⁵). The Sleep Lifestyle Index was associated with 458 disease outcomes spanning 17 groups, including sleep, mood, and anxiety disease outcomes (p<5.1×10⁻⁵). A total of 108 disease outcomes were associated with both indexes, spanning 12 disease groups. No interactions were found between the indexes on mental health diseases. CONCLUSIONS Favorable sleep behaviors and genetic predisposition to healthy sleep may independently protect against disease, underscoring the impact of multidimensional sleep on population health and the need for prevention strategies focused on healthy sleep habits.
Affiliation(s)
- Valentina Paz
- Instituto de Psicología Clínica, Facultad de Psicología, Universidad de la República, Montevideo, Uruguay; Grupo Cronobiología, Universidad de la República, Montevideo, Uruguay; MRC Unit for Lifelong Health and Ageing at UCL, Institute of Cardiovascular Sciences, Faculty of Population Health Sciences, University College London, London, United Kingdom; Pharmacology and Therapeutics, Systems Molecular and Integrative Biology, Health & Life Sciences, University of Liverpool, Liverpool, United Kingdom; Center for Genomic Medicine, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts, United States.
- Hannah Wilcox
- Center for Genomic Medicine, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts, United States; Department of Anesthesia, Critical Care and Pain Medicine, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts, United States
- Matthew Goodman
- Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, United States; Medical and Population Genetics, Broad Institute, Cambridge, Massachusetts, United States
- Heming Wang
- Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, United States; Medical and Population Genetics, Broad Institute, Cambridge, Massachusetts, United States
- Victoria Garfield
- MRC Unit for Lifelong Health and Ageing at UCL, Institute of Cardiovascular Sciences, Faculty of Population Health Sciences, University College London, London, United Kingdom; Pharmacology and Therapeutics, Systems Molecular and Integrative Biology, Health & Life Sciences, University of Liverpool, Liverpool, United Kingdom
- Richa Saxena
- Center for Genomic Medicine, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts, United States; Department of Anesthesia, Critical Care and Pain Medicine, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts, United States; Medical and Population Genetics, Broad Institute, Cambridge, Massachusetts, United States; Division of Sleep Medicine, Harvard Medical School, Boston, Massachusetts, United States
- Hassan S Dashti
- Center for Genomic Medicine, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts, United States; Department of Anesthesia, Critical Care and Pain Medicine, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts, United States; Medical and Population Genetics, Broad Institute, Cambridge, Massachusetts, United States; Division of Sleep Medicine, Harvard Medical School, Boston, Massachusetts, United States; Division of Nutrition, Harvard Medical School, Boston, Massachusetts, United States.
5
Allaire P, Fox J, Kitchner T, Gabor R, Folz C, Bettadahalli S, Hebbring S. Familial Renal Glucosuria and Potential Pharmacogenetic Impact on Sodium-Glucose Cotransporter-2 Inhibitors. KIDNEY360 2025; 6:521-530. [PMID: 39412882 PMCID: PMC12045503 DOI: 10.34067/kid.0000000621] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/17/2024] [Accepted: 10/09/2024] [Indexed: 10/18/2024]
Abstract
Key Points A significant knowledge gap exists in SLC5A2's role in familial renal glycosuria and sodium-glucose cotransporter-2 inhibitors' efficacy. Two percent of individuals in the All-of-Us cohort harbor rare genetic variants in SLC5A2, potentially increasing the risk of familial renal glycosuria. Our trial suggests differential responses to sodium-glucose cotransporter-2 inhibitors in individuals with rare SLC5A2 alleles compared with wild types. Background Renal glucosuria is a rare inheritable trait caused by loss-of-function variants in the gene that encodes sodium-glucose cotransporter-2 (SGLT2) (i.e., SLC5A2). The genetics of renal glucosuria is poorly understood, and even less is known on how loss-of-function variants in SLC5A2 may affect response to SGLT2 inhibitors, a new class of medication gaining popularity to treat diabetes by artificially inducing glucosuria. Methods We used two biobanks that link genomic with electronic health record data to study the genetics of renal glucosuria. This included 245,394 participants enrolled in the All of Us Research Program and 11,011 enrolled in Marshfield Clinic's Personalized Medicine Research Project (PMRP). Association studies in All of Us and PMRP identified ten variants that reached an experiment-wise Bonferroni threshold in either cohort, of which nine were novel. PMRP was further used as a recruitment source for a prospective SGLT2 pharmacogenetic trial. During a glucose tolerance test, the trial measured urine glucose concentrations in 15 SLC5A2 variant–positive individuals and 15 matched wild types with and without an SGLT2 inhibitor. Results This trial demonstrated that carriers of SLC5A2 risk variants may be more sensitive to SGLT2 inhibitors compared with wild types (P = 0.075). On the basis of population data, 2% of an ethnically diverse population carried rare variants in SLC5A2 and are at risk of renal glucosuria.
Conclusions As a result, 2% of individuals being treated with SGLT2 inhibitors may respond differently to this new class of medication compared with the general population, suggesting that a larger investigation into SLC5A2 variants and SGLT2 inhibitors is needed.
Affiliation(s)
- Patrick Allaire
- Center for Precision Medicine Research, Marshfield Clinic Health System, Marshfield, Wisconsin
6
Neylan CJ, Levin MG, Hartmann K, Beigel K, Khodursky S, DePaolo JS, Abramowitz S, Furth EE, Heuckeroth RO, Damrauer SM, Maguire LH. Genome-wide association meta-analysis identifies 126 novel loci for diverticular disease and implicates connective tissue and colonic motility. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2025:2025.03.27.25324777. [PMID: 40196262 PMCID: PMC11974943 DOI: 10.1101/2025.03.27.25324777] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/09/2025]
Abstract
Diverticular disease is a common and morbid complex phenotype influenced by both innate and environmental risk factors. We performed the largest genome-wide association study meta-analysis for diverticular disease, identifying 126 novel loci. Employing multiple downstream analytic strategies, including tissue and pathway enrichment, statistical fine-mapping, allele-specific expression, protein quantitative trait loci and drug-target investigations, and linkage disequilibrium score regression, we prioritized causal genes and produced several lines of evidence linking diverticular disease to connective tissue biology and colonic motility. We substantiated these findings by integrating single-cell RNA sequencing data, showing that prioritized diverticular disease-associated genes are enriched for expression in colonic smooth muscle, fibroblasts, and interstitial cells of Cajal. In quantitative analysis of surgical specimens, we found a substantial reduction in the density of elastin present in the sigmoid colon in severe diverticulitis.
Affiliation(s)
- Christopher J. Neylan
- Department of Surgery, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104
- Michael G. Levin
- Cardiovascular Institute, Department of Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104
- Corporal Michael Crescenz VA Medical Center, Philadelphia, PA 19104
- Katherine Hartmann
- Department of Radiology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104
- Katherine Beigel
- Department of Biomedical and Health Informatics (DBHi), Children’s Hospital of Philadelphia, Philadelphia, PA 19104
- Sam Khodursky
- Department of Surgery, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104
- John S. DePaolo
- Department of Surgery, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104
- Sarah Abramowitz
- Department of Surgery, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104
- The Donald and Barbara Zucker School of Medicine at Hofstra/Northwell, Hempstead, NY 11549
- Emma E. Furth
- Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104
- Robert O. Heuckeroth
- Division of Gastroenterology, Hepatology and Nutrition, Children’s Hospital of Philadelphia, 3401 Civic Center Blvd, Philadelphia, PA 19104
- The Children’s Hospital of Philadelphia Research Institute and Abramson Research Center, 3615 Civic Center Blvd, Philadelphia, PA 19104, USA
- Perelman School of Medicine at the University of Pennsylvania, 3400 Civic Center Boulevard, Philadelphia, PA 19104
- Scott M. Damrauer
- Department of Surgery, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104
- Cardiovascular Institute, Department of Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104
- Corporal Michael Crescenz VA Medical Center, Philadelphia, PA 19104
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104
- Lillias H. Maguire
- Corporal Michael Crescenz VA Medical Center, Philadelphia, PA 19104
- Division of Colon and Rectal Surgery, Department of Surgery, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104
7
Tozzo V, Zhang LH, Ranganath R, Higgins JM. Transformer-based artificial intelligence on single-cell clinical data for homeostatic mechanism inference and rational biomarker discovery. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2025:2025.03.24.25324556. [PMID: 40196278 PMCID: PMC11974774 DOI: 10.1101/2025.03.24.25324556] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/09/2025]
Abstract
Artificial intelligence (AI) applied to single-cell data has the potential to transform our understanding of biological systems by revealing patterns and mechanisms that simpler traditional methods miss. Here, we develop a general-purpose, interpretable AI pipeline consisting of two deep learning models: the Multi-Input Set Transformer++ (MIST) model for prediction and the single-cell FastShap model for interpretability. We apply this pipeline to a large set of routine clinical data containing single-cell measurements of circulating red blood cells (RBC), white blood cells (WBC), and platelets (PLT) to study population fluxes and homeostatic hematological mechanisms. We find that MIST can use these single-cell measurements to explain 70-82% of the variation in blood cell population sizes among patients (RBC count, PLT count, WBC count), compared to 5-20% explained with current approaches. MIST's accuracy implies that substantial information on cellular production and clearance is present in the single-cell measurements. MIST identified substantial crosstalk among RBC, WBC, and PLT populations, suggesting co-regulatory relationships that we validated and investigated using interpretability maps generated by single-cell FastShap. The maps identify granular single-cell subgroups most important for each population's size, enabling generation of evidence-based hypotheses for co-regulatory mechanisms. The interpretability maps also enable rational discovery of a single-WBC biomarker, "Down Shift", that complements an existing marker of inflammation and strengthens diagnostic associations with diseases including sepsis, heart disease, and diabetes. This study illustrates how single-cell data can be leveraged for mechanistic inference with potential clinical relevance and how this AI pipeline can be applied to power scientific discovery.
8
Zhong X, Jia G, Yin Z, Cheng K, Rzhetsky A, Li B, Cox NJ. Longitudinal Analysis of Electronic Health Records Reveals Medical Conditions Associated with Subsequent Alzheimer's Disease Development. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2025:2025.03.22.25324197. [PMID: 40196258 PMCID: PMC11974777 DOI: 10.1101/2025.03.22.25324197] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/09/2025]
Abstract
Background Several health conditions are known to increase the risk of Alzheimer's disease (AD). We aim to systematically identify medical conditions that are associated with subsequent development of AD by leveraging the growing resources of electronic health records (EHRs). Methods This retrospective cohort study used de-identified EHRs from two independent databases (MarketScan and VUMC) with 153 million individuals to identify AD cases and age- and gender-matched controls. By tracking their EHRs over a 10-year window before AD diagnosis and comparing the EHRs between AD cases and controls, we identified medical conditions that are more likely to occur in those who later develop AD. We further assessed the genetic underpinnings of these conditions in relation to AD genetics using data from two large-scale biobanks (BioVU and UK Biobank, total N=450,000). Results We identified 43,508 AD cases and 419,455 matched controls in MarketScan, and 1,320 AD cases and 12,720 matched controls in VUMC. We detected 406 and 102 medical phenotypes that are significantly enriched among the future AD cases in MarketScan and VUMC databases, respectively. In both EHR databases, mental disorders and neurological disorders emerged as the top two most enriched clinical categories. More than 70 medical phenotypes are replicated in both EHR databases, which are dominated by mental disorders (e.g., depression), neurological disorders (e.g., sleep disorders), circulatory system disorders (e.g., cerebral atherosclerosis) and endocrine/metabolic disorders (e.g., type 2 diabetes). We identified 19 phenotypes that are either associated with individual risk variants of AD or a polygenic risk score of AD. Conclusions In this study, analysis of longitudinal EHRs from independent large-scale databases enables robust identification of health conditions associated with subsequent development of AD, highlighting potential opportunities for therapeutics and interventions to reduce AD risk.
Affiliation(s)
- Xue Zhong
- Department of Medicine, Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN
- Gengjie Jia
- Department of Medicine, Institute of Genomics and Systems Biology, University of Chicago, Chicago, IL
- Zhijun Yin
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN
- Kerou Cheng
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN
- Andrey Rzhetsky
- Department of Human Genetics, Department of Medicine, University of Chicago, Chicago, IL
- Bingshan Li
- Department of Molecular Physiology and Biophysics, Vanderbilt Genetics Institute, Vanderbilt University, Nashville, TN
- Nancy J. Cox
- Department of Medicine, Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN
9
Yang L, Sadler MC, Altman RB. Genetic association studies using disease liabilities from deep neural networks. Am J Hum Genet 2025; 112:675-692. [PMID: 39986278 PMCID: PMC11948217 DOI: 10.1016/j.ajhg.2025.01.019] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/05/2024] [Revised: 01/23/2025] [Accepted: 01/24/2025] [Indexed: 02/24/2025] Open
Abstract
The case-control study is a widely used method for investigating the genetic underpinnings of binary traits. However, long-term, prospective cohort studies often grapple with absent or evolving health-related outcomes. Here, we propose two methods, liability and meta, for conducting genome-wide association studies (GWASs) that leverage disease liabilities calculated from deep patient phenotyping. Analyzing 38 common traits in ∼300,000 UK Biobank participants, we identified an increased number of loci in comparison to the number identified by the conventional case-control approach, and there were high replication rates in larger external GWASs. Further analyses confirmed the disease specificity of the genetic architecture; the meta method demonstrated higher robustness when phenotypes were imputed with low accuracy. Additionally, polygenic risk scores based on disease liabilities more effectively predicted newly diagnosed cases in the 2022 dataset, which were controls in the earlier 2019 dataset. Our findings demonstrate that integrating high-dimensional phenotypic data into deep neural networks enhances genetic association studies while capturing disease-relevant genetic architecture.
Affiliation(s)
- Lu Yang
- Department of Bioengineering, Stanford University, Stanford, CA 94305, USA; Department of Computer Science, Stanford University, Stanford, CA 94305, USA.
- Marie C Sadler
- Department of Bioengineering, Stanford University, Stanford, CA 94305, USA; University Center for Primary Care and Public Health, 1010 Lausanne, Switzerland; Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
- Russ B Altman
- Department of Bioengineering, Stanford University, Stanford, CA 94305, USA; Department of Genetics, Stanford University, Stanford, CA 94305, USA; Department of Medicine, Stanford University, Stanford, CA 94305, USA; Department of Computer Science, Stanford University, Stanford, CA 94305, USA
|
10
|
Zheng SL, Jurgens SJ, McGurk KA, Xu X, Grace C, Theotokis PI, Buchan RJ, Francis C, de Marvao A, Curran L, Bai W, Pua CJ, Tang HC, Jorda P, van Slegtenhorst MA, Verhagen JMA, Harper AR, Ormondroyd E, Chin CWL, Pantazis A, Baksi J, Halliday BP, Matthews P, Pinto YM, Walsh R, Amin AS, Wilde AAM, Cook SA, Prasad SK, Barton PJR, O'Regan DP, Lumbers RT, Goel A, Tadros R, Michels M, Watkins H, Bezzina CR, Ware JS. Evaluation of polygenic scores for hypertrophic cardiomyopathy in the general population and across clinical settings. Nat Genet 2025; 57:563-571. [PMID: 39966645 PMCID: PMC11906360 DOI: 10.1038/s41588-025-02094-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/30/2023] [Accepted: 01/21/2025] [Indexed: 02/20/2025]
Abstract
Hypertrophic cardiomyopathy (HCM) is an important cause of morbidity and mortality, with pathogenic variants found in about a third of cases. Large-scale genome-wide association studies (GWAS) demonstrate that common genetic variation contributes to HCM risk. Here we derive polygenic scores (PGS) from HCM GWAS and genetically correlated traits and test their performance in the UK Biobank, 100,000 Genomes Project, and clinical cohorts. We show that higher PGS significantly increases the risk of HCM in the general population, particularly among pathogenic variant carriers, where HCM penetrance differs 10-fold between those in the highest and lowest PGS quintiles. Among relatives of HCM probands, PGS stratifies risks of developing HCM and adverse outcomes. Finally, among HCM cases, PGS strongly predicts the risk of adverse outcomes and death. These findings support the broad utility of PGS across clinical settings, enabling tailored screening and surveillance and stratification of risk of adverse outcomes.
Affiliation(s)
- Sean L Zheng
- National Heart Lung Institute, Imperial College London, London, UK
- Medical Research Council Laboratory of Medical Sciences, Imperial College London, London, UK
- Royal Brompton & Harefield Hospitals, Guy's and St. Thomas' NHS Foundation Trust, London, UK
| | - Sean J Jurgens
- Department of Experimental Cardiology, Amsterdam Cardiovascular Sciences, University of Amsterdam, Amsterdam UMC, Amsterdam, the Netherlands
- Cardiovascular Disease Initiative, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Kathryn A McGurk
- National Heart Lung Institute, Imperial College London, London, UK
- Medical Research Council Laboratory of Medical Sciences, Imperial College London, London, UK
| | - Xiao Xu
- National Heart Lung Institute, Imperial College London, London, UK
- Medical Research Council Laboratory of Medical Sciences, Imperial College London, London, UK
| | - Chris Grace
- Radcliffe Department of Medicine, University of Oxford, Division of Cardiovascular Medicine, John Radcliffe Hospital, Oxford, UK
- Wellcome Centre for Human Genetics, University of Oxford, Oxford, UK
| | - Pantazis I Theotokis
- National Heart Lung Institute, Imperial College London, London, UK
- Medical Research Council Laboratory of Medical Sciences, Imperial College London, London, UK
- Royal Brompton & Harefield Hospitals, Guy's and St. Thomas' NHS Foundation Trust, London, UK
| | - Rachel J Buchan
- National Heart Lung Institute, Imperial College London, London, UK
- Medical Research Council Laboratory of Medical Sciences, Imperial College London, London, UK
- Royal Brompton & Harefield Hospitals, Guy's and St. Thomas' NHS Foundation Trust, London, UK
| | - Catherine Francis
- National Heart Lung Institute, Imperial College London, London, UK
- Royal Brompton & Harefield Hospitals, Guy's and St. Thomas' NHS Foundation Trust, London, UK
| | - Antonio de Marvao
- National Heart Lung Institute, Imperial College London, London, UK
- Medical Research Council Laboratory of Medical Sciences, Imperial College London, London, UK
- Department of Women and Children's Health, King's College London, London, UK
- School of Cardiovascular and Metabolic Medicine and Sciences, King's College London, London, UK
| | - Lara Curran
- National Heart Lung Institute, Imperial College London, London, UK
- Medical Research Council Laboratory of Medical Sciences, Imperial College London, London, UK
- Royal Brompton & Harefield Hospitals, Guy's and St. Thomas' NHS Foundation Trust, London, UK
| | - Wenjia Bai
- Biomedical Image Analysis Group, Department of Computing, Imperial College London, London, UK
- Department of Brain Sciences, Imperial College London, London, UK
| | - Chee Jian Pua
- National Heart Research Institute Singapore, National Heart Center, Singapore, Singapore
| | - Hak Chiaw Tang
- Department of Cardiology, National Heart Centre, Singapore, Singapore
| | - Paloma Jorda
- Cardiovascular Genetics Centre, Montreal Heart Institute, Montreal, Quebec, Canada
- Faculty of Medicine, Université de Montréal, Montreal, Quebec, Canada
| | - Marjon A van Slegtenhorst
- Department of Clinical Genetics, Erasmus MC, University Medical Center Rotterdam, Rotterdam, the Netherlands
| | - Judith M A Verhagen
- Department of Clinical Genetics, Erasmus MC, University Medical Center Rotterdam, Rotterdam, the Netherlands
| | - Andrew R Harper
- Radcliffe Department of Medicine, University of Oxford, Division of Cardiovascular Medicine, John Radcliffe Hospital, Oxford, UK
- Wellcome Centre for Human Genetics, University of Oxford, Oxford, UK
| | - Elizabeth Ormondroyd
- Radcliffe Department of Medicine, University of Oxford, Division of Cardiovascular Medicine, John Radcliffe Hospital, Oxford, UK
- Wellcome Centre for Human Genetics, University of Oxford, Oxford, UK
| | - Calvin W L Chin
- Department of Cardiology, National Heart Centre, Singapore, Singapore
| | - Antonis Pantazis
- Royal Brompton & Harefield Hospitals, Guy's and St. Thomas' NHS Foundation Trust, London, UK
| | - John Baksi
- National Heart Lung Institute, Imperial College London, London, UK
- Royal Brompton & Harefield Hospitals, Guy's and St. Thomas' NHS Foundation Trust, London, UK
| | - Brian P Halliday
- National Heart Lung Institute, Imperial College London, London, UK
- Royal Brompton & Harefield Hospitals, Guy's and St. Thomas' NHS Foundation Trust, London, UK
| | - Paul Matthews
- Department of Brain Sciences, Imperial College London, London, UK
| | - Yigal M Pinto
- Department of Experimental Cardiology, Amsterdam Cardiovascular Sciences, University of Amsterdam, Amsterdam UMC, Amsterdam, the Netherlands
- Department of Clinical Cardiology, Amsterdam Cardiovascular Sciences, University of Amsterdam, Amsterdam UMC, Amsterdam, the Netherlands
- European Reference Network for Rare and Low Prevalence Complex Diseases of the Heart, Paris, France
| | - Roddy Walsh
- Department of Experimental Cardiology, Amsterdam Cardiovascular Sciences, University of Amsterdam, Amsterdam UMC, Amsterdam, the Netherlands
| | - Ahmad S Amin
- Department of Experimental Cardiology, Amsterdam Cardiovascular Sciences, University of Amsterdam, Amsterdam UMC, Amsterdam, the Netherlands
- Department of Clinical Cardiology, Amsterdam Cardiovascular Sciences, University of Amsterdam, Amsterdam UMC, Amsterdam, the Netherlands
- European Reference Network for Rare and Low Prevalence Complex Diseases of the Heart, Paris, France
| | - Arthur A M Wilde
- Department of Experimental Cardiology, Amsterdam Cardiovascular Sciences, University of Amsterdam, Amsterdam UMC, Amsterdam, the Netherlands
- Department of Clinical Cardiology, Amsterdam Cardiovascular Sciences, University of Amsterdam, Amsterdam UMC, Amsterdam, the Netherlands
- European Reference Network for Rare and Low Prevalence Complex Diseases of the Heart, Paris, France
| | - Stuart A Cook
- Medical Research Council Laboratory of Medical Sciences, Imperial College London, London, UK
- Department of Cardiology, National Heart Centre, Singapore, Singapore
- Duke-National University of Singapore Medical School, Singapore, Singapore
| | - Sanjay K Prasad
- National Heart Lung Institute, Imperial College London, London, UK
- Royal Brompton & Harefield Hospitals, Guy's and St. Thomas' NHS Foundation Trust, London, UK
| | - Paul J R Barton
- National Heart Lung Institute, Imperial College London, London, UK
- Medical Research Council Laboratory of Medical Sciences, Imperial College London, London, UK
- Royal Brompton & Harefield Hospitals, Guy's and St. Thomas' NHS Foundation Trust, London, UK
| | - Declan P O'Regan
- Medical Research Council Laboratory of Medical Sciences, Imperial College London, London, UK
| | - R T Lumbers
- Institute of Health Informatics, University College London, London, UK
- Health Data Research UK London, University College London, London, UK
- British Heart Foundation Research Accelerator, University College London, London, UK
| | - Anuj Goel
- Radcliffe Department of Medicine, University of Oxford, Division of Cardiovascular Medicine, John Radcliffe Hospital, Oxford, UK
- Wellcome Centre for Human Genetics, University of Oxford, Oxford, UK
| | - Rafik Tadros
- Cardiovascular Genetics Centre, Montreal Heart Institute, Montreal, Quebec, Canada
- Faculty of Medicine, Université de Montréal, Montreal, Quebec, Canada
| | - Michelle Michels
- European Reference Network for Rare and Low Prevalence Complex Diseases of the Heart, Paris, France
- Department of Cardiology, Thorax Center, Cardiovascular Institute, Erasmus University Medical Center, Rotterdam, the Netherlands
| | - Hugh Watkins
- Radcliffe Department of Medicine, University of Oxford, Division of Cardiovascular Medicine, John Radcliffe Hospital, Oxford, UK
- Wellcome Centre for Human Genetics, University of Oxford, Oxford, UK
| | - Connie R Bezzina
- Department of Experimental Cardiology, Amsterdam Cardiovascular Sciences, University of Amsterdam, Amsterdam UMC, Amsterdam, the Netherlands
- European Reference Network for Rare and Low Prevalence Complex Diseases of the Heart, Paris, France
| | - James S Ware
- National Heart Lung Institute, Imperial College London, London, UK.
- Medical Research Council Laboratory of Medical Sciences, Imperial College London, London, UK.
- Royal Brompton & Harefield Hospitals, Guy's and St. Thomas' NHS Foundation Trust, London, UK.
- Imperial College Healthcare NHS Trust, London, UK.
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA.
|
11
|
Zhou Y, Xiang B, Yang X, Ren Y, Gu X, Zhou X. Unsupervised Learning-Derived Complex Metabolic Signatures Refine Cardiometabolic Risk. JACC Adv 2025; 4:101620. [PMID: 39983615 PMCID: PMC11891690 DOI: 10.1016/j.jacadv.2025.101620] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/26/2024] [Revised: 01/14/2025] [Accepted: 01/17/2025] [Indexed: 02/23/2025]
Abstract
BACKGROUND Cardiometabolic diseases have become a leading cause of morbidity and mortality globally. Nuclear magnetic resonance metabolomics represents a precise tool for assessing metabolic individuality. OBJECTIVES This study aimed to use unsupervised learning to decode plasma metabolomic profiles, providing new insights into the etiology of cardiometabolic diseases. METHODS We applied unsupervised learning to generate robust metabolic signatures from the plasma profiles of 118,001 UK Biobank participants. Phenome-wide and genome-wide association studies were conducted to reveal their phenomic and genetic architectures. Integrated prospective cohort analyses and Mendelian randomization clarified their roles in cardiometabolic risks. RESULTS Eleven metabolic clusters were identified, linked to 101 loci and 445 phenotypes, mostly cardiometabolic diseases. These novel signatures partially outperformed traditional lipids in cardiometabolic risk prediction. Triglyceride-rich lipoproteins demonstrated superior predictive power for ischemic heart disease, type 2 diabetes, and hypertension, compared with apolipoprotein B and lipoprotein(a). Non-high-density lipoprotein cholesterol was found to increase the risk of hyperlipidemia and ischemic heart disease while offering a protective effect against type 2 diabetes. Furthermore, different high-density lipoprotein clusters showed heterogeneous associations with cardiometabolic diseases, with high-density lipoprotein subpopulations enriched in free cholesterol or triglyceride increasing risk, and those enriched in cholesterol esters providing protection. CONCLUSIONS These metabolic signatures extract comprehensive information from the metabolomic profile while maintaining clarity and interpretability, facilitating clinical translation. 
The findings emphasize the crucial roles of lipid subpopulations in cardiometabolic risks, encouraging clinicians to take a more nuanced approach to managing blood lipids and balancing different disease risks.
Affiliation(s)
- Yujia Zhou
- Department of Cardiology, The Second Affiliated Hospital of Soochow University, Suzhou, China
| | - Boyang Xiang
- Department of Cardiology, The Second Affiliated Hospital of Nanjing Medical University, Nanjing, China
| | - Xiaoqin Yang
- Department of Bioinformatics, School of Biology and Basic Medical Sciences, Soochow University, Suzhou, China
| | - Yuxin Ren
- Department of Cardiology, The Second Affiliated Hospital of Soochow University, Suzhou, China
| | - Xiaosong Gu
- Department of Cardiology, The Second Affiliated Hospital of Soochow University, Suzhou, China
| | - Xiang Zhou
- Department of Cardiology, The Second Affiliated Hospital of Nanjing Medical University, Nanjing, China.
|
12
|
Xin Y, Grabowska ME, Gangireddy S, Krantz MS, Kerchberger VE, Dickson AL, Feng Q, Yin Z, Wei WQ. Improving topic modeling performance on social media through semantic relationships within biomedical terminology. PLoS One 2025; 20:e0318702. [PMID: 39982945 PMCID: PMC11845042 DOI: 10.1371/journal.pone.0318702] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/18/2024] [Accepted: 01/20/2025] [Indexed: 02/23/2025] Open
Abstract
Topic modeling utilizes unsupervised machine learning to detect underlying themes within texts and has been deployed routinely to analyze social media for insights into healthcare issues. However, the inherent messiness of social media hinders the full realization of this technique's potential. As such, we hypothesized that restricting medical concepts in social media texts to specific related semantic types and applying topic modeling to these concepts could be a feasible approach to overcome the challenge of traditional topic modeling for social media texts. Therefore, we developed a semantic-type-based topic modeling pipeline to discover self-reported health-related topics. This pipeline integrated semantic type information and Systematized Nomenclature of Medicine (SNOMED) precoordinated expressions into a traditional topic modeling approach to enhance effectiveness in clustering meaningful, distinct topics. Using social media texts regarding statins for illustration, we evaluated the efficacy of this new approach and validated a newly identified topic using real-world clinical data. Based on expert evaluations, this approach resulted in more novel, distinguishable, and meaningful health-related topics compared to traditional topic modeling. In addition, our electronic health record validation for a newly identified topic in two real-world clinical databases indicated that statin users had a higher prevalence of depression or anxiety compared to matched non-users. Our results indicate that this new topic modeling pipeline can improve the extraction of themes from noisy online discussions, thereby contributing to deeper insights for healthcare research.
Affiliation(s)
- Yi Xin
- Department of Computer Science, Vanderbilt University, Nashville, Tennessee, United States of America
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America
| | - Monika E. Grabowska
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America
| | - Srushti Gangireddy
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America
| | - Matthew S. Krantz
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America
- Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America
| | - V. Eric Kerchberger
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America
- Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America
| | - Alyson L. Dickson
- Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America
| | - Qiping Feng
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America
- Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America
| | - Zhijun Yin
- Department of Computer Science, Vanderbilt University, Nashville, Tennessee, United States of America
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America
| | - Wei-Qi Wei
- Department of Computer Science, Vanderbilt University, Nashville, Tennessee, United States of America
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America
|
13
|
Thapa R, Kjær MR, He B, Covert I, Moore H, Hanif U, Ganjoo G, Westover MB, Jennum P, Brink-Kjær A, Mignot E, Zou J. A Multimodal Sleep Foundation Model Developed with 500K Hours of Sleep Recordings for Disease Predictions. medRxiv 2025:2025.02.04.25321675. [PMID: 39974074 PMCID: PMC11838666 DOI: 10.1101/2025.02.04.25321675] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/21/2025]
Abstract
Sleep is a fundamental biological process with profound implications for physical and mental health, yet our understanding of its complex patterns and their relationships to a broad spectrum of diseases remains limited. While polysomnography (PSG), the gold standard for sleep analysis, captures rich multimodal physiological data, analyzing these measurements has been challenging due to limited flexibility across recording environments, poor generalizability across cohorts, and difficulty in leveraging information from multiple signals simultaneously. To address this gap, we curated over 585,000 hours of high-quality sleep recordings from approximately 65,000 participants across multiple cohorts and developed SleepFM, a multimodal sleep foundation model trained with a novel contrastive learning approach, designed to accommodate any PSG montage. SleepFM produces informative sleep embeddings that enable predictions of future diseases. We systematically demonstrate that SleepFM embeddings can predict 130 future diseases, as modeled by Phecodes, with C-Index and AUROC of at least 0.75 on held-out participants (Bonferroni-corrected p < 0.01). This includes accurate predictions for death (C-Index: 0.84 [95% CI: 0.81-0.87]), heart failure (C-Index: 0.80 [95% CI: 0.77-0.83]), chronic kidney disease (C-Index: 0.79 [95% CI: 0.77-0.81]), dementia (C-Index: 0.85 [95% CI: 0.82-0.87]), stroke (C-Index: 0.78 [95% CI: 0.76-0.81]), atrial fibrillation (C-Index: 0.78 [95% CI: 0.75-0.81]), and myocardial infarction (C-Index: 0.81 [95% CI: 0.78-0.84]). The model's generalizability was further validated through strong performance on the Sleep Heart Health Study (SHHS), a dataset unseen during pre-training. Additionally, SleepFM demonstrates strong performance on traditional sleep analysis tasks, achieving competitive results in both sleep staging (mean F1 scores: 0.70-0.78) and sleep apnea diagnosis (AUROC: 0.90-0.94). 
Beyond these standard applications, our analysis reveals that specific sleep stages and physiological signals carry distinct predictive power for different diseases. This work demonstrates how foundation models can leverage sleep polysomnography data to uncover the extensive relationship between sleep physiology and future disease risk.
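For readers unfamiliar with the C-index values quoted in this abstract, the following minimal Python sketch computes Harrell's concordance index on made-up data. It is not the SleepFM evaluation code: the function name `concordance_index` and the toy follow-up times, event flags, and risk scores are assumptions chosen only for illustration.

```python
def concordance_index(times, events, scores):
    """Harrell's C: share of comparable pairs in which the subject with the
    earlier observed event also has the higher predicted risk score."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when subject i has an observed event
            # strictly before subject j's event or censoring time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: follow-up time, event flag (1 = event, 0 = censored),
# and a model risk score for four subjects.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
scores = [0.9, 0.4, 0.5, 0.1]

c_index = concordance_index(times, events, scores)
print(c_index)  # 0.8 (4 of 5 comparable pairs are ranked correctly)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the reported values of 0.78 to 0.85 should be read.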
Affiliation(s)
- Rahul Thapa
- Department of Biomedical Data Science, Stanford University, Stanford, CA, USA
- Department of Computer Science, Stanford University, Stanford, CA, USA
| | - Magnus Ruud Kjær
- Department of Psychiatry and Behavioral Sciences, Stanford University, Stanford, CA, USA
- Department of Health Technology, Technical University of Denmark, Kongens Lyngby, Denmark
- Department of Clinical Neurophysiology, Danish Center for Sleep Medicine, Copenhagen University Hospital - Rigshospitalet, Copenhagen, Denmark
| | - Bryan He
- Department of Computer Science, Stanford University, Stanford, CA, USA
| | - Ian Covert
- Department of Computer Science, Stanford University, Stanford, CA, USA
| | - Hyatt Moore
- Department of Psychiatry and Behavioral Sciences, Stanford University, Stanford, CA, USA
| | - Gauri Ganjoo
- Department of Psychiatry and Behavioral Sciences, Stanford University, Stanford, CA, USA
| | - M Brandon Westover
- Department of Neurology, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA, USA
| | - Poul Jennum
- Department of Clinical Neurophysiology, Danish Center for Sleep Medicine, Copenhagen University Hospital - Rigshospitalet, Copenhagen, Denmark
- Department of Clinical Medicine, University of Copenhagen, Copenhagen, Denmark
| | - Andreas Brink-Kjær
- Department of Health Technology, Technical University of Denmark, Kongens Lyngby, Denmark
| | - Emmanuel Mignot
- Department of Psychiatry and Behavioral Sciences, Stanford University, Stanford, CA, USA
| | - James Zou
- Department of Biomedical Data Science, Stanford University, Stanford, CA, USA
- Department of Computer Science, Stanford University, Stanford, CA, USA
|
14
|
Wan NC, Grabowska ME, Kerchberger VE, Wei WQ. Exploring beyond diagnoses in electronic health records to improve discovery: a review of the phenome-wide association study. JAMIA Open 2025; 8:ooaf006. [PMID: 40041255 PMCID: PMC11879097 DOI: 10.1093/jamiaopen/ooaf006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/11/2024] [Revised: 12/30/2024] [Accepted: 01/24/2025] [Indexed: 03/06/2025] Open
Abstract
OBJECTIVE The phenome-wide association study (PheWAS) systematically examines the phenotypic spectrum extracted from electronic health records (EHRs) to uncover correlations between phenotypes and exposures. This review explores methodologies, highlights challenges, and outlines future directions for EHR-driven PheWAS. MATERIALS AND METHODS We searched the PubMed database for articles published from 2010 to 2023, and we collected data regarding exposures, phenotypes, cohorts, terminologies, replication, and ancestry. RESULTS Our search yielded 690 articles. After applying exclusion criteria, we identified 291 articles published between January 1, 2010, and December 31, 2023. A total of 162 (55.6%) articles defined phenomes using phecodes, indicating that research is reliant on the organization of billing codes. Moreover, 72.8% of articles utilized exposures consisting of genetic data, and the majority (69.4%) of PheWAS lacked replication analyses. DISCUSSION Existing literature underscores the need for deeper phenotyping, the variability in PheWAS exposure variables, and the absence of replication in PheWAS. Current applications of PheWAS mainly focus on cardiovascular, metabolic, and endocrine phenotypes; thus, applications of PheWAS in uncommon diseases, which may lack structured data, remain largely understudied. CONCLUSIONS With modern EHRs, future PheWAS should extend beyond diagnosis codes and consider additional data like clinical notes or medications to create comprehensive phenotype profiles that consider severity, temporality, risk, and ancestry. Furthermore, data interoperability initiatives may help mitigate the paucity of PheWAS replication analyses. With the growing availability of data in EHRs, PheWAS will remain a powerful tool in precision medicine.
Affiliation(s)
- Nicholas C Wan
- Department of Biomedical Engineering, Vanderbilt University, Nashville, TN 37240, United States
| | - Monika E Grabowska
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37302, United States
| | - Vern Eric Kerchberger
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37302, United States
- Department of Medicine, Vanderbilt University Medical Center, Nashville, TN 37232, United States
| | - Wei-Qi Wei
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37302, United States
|
15
|
Wheless L, Mosley D, Dochtermann D, Pyarajan S, Gonzalez K, Weiss R, Maas K, Zhang S, Yao L, Xu Y, Madden C, Ike J, Smith IT, Grossarth S, Wilson O, Hung A, Fillmore NR, Brown K, Landi MT, Hartman RI. The impact of imprecise case definitions in electronic health record research: a melanoma case-study from the Million Veteran Program. Arch Dermatol Res 2025; 317:308. [PMID: 39853431 DOI: 10.1007/s00403-024-03780-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/27/2024] [Revised: 12/26/2024] [Accepted: 12/29/2024] [Indexed: 01/26/2025]
Abstract
Cases for a disease can be defined broadly using diagnostic codes, or narrowly using gold-standard confirmation that often is not available in large administrative datasets. These different definitions can have significant impacts on the results and conclusions of studies. We conducted this study to assess how using melanoma phecodes versus histologic confirmation for invasive or in situ melanoma impacts the results of a genome-wide association study (GWAS) using the Million Veteran Program. Melanoma status was determined three ways: (1) by the presence of two or more phecodes, (2) histologically-confirmed invasive melanoma, and (3) histologically-confirmed melanoma in situ. We conducted a GWAS for variants with minor allele frequencies of 1% or greater. There were 45,665 cases in the phecode cohort, 5364 cases in the confirmed invasive melanoma cohort, and 4792 cases in the confirmed melanoma in situ cohort. There were 20,457 variants significant at the genome-wide level in the phecode cohort, 2582 in the invasive melanoma cohort, and 1989 in the melanoma in situ cohort. Most of the variants identified in the phecode cohort did not replicate in the histologically-confirmed cohorts. The different case definitions led to large differences in sample size and variants associated at the genome-wide level. Unvalidated and imprecise case definitions can lead to less accurate results. Investigators should use validated phenotypes when gold-standard definitions are not available.
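The broad case definition described in this abstract (two or more phecodes) can be sketched in a few lines of Python. This is an illustrative assumption rather than the study's pipeline: the function name `phecode_cases`, the record layout, and the placeholder code labels are all hypothetical, and the sketch simply counts records per patient without any date or encounter logic.

```python
from collections import Counter


def phecode_cases(records, phecode, min_count=2):
    """Return the IDs of patients with at least `min_count` records of `phecode`."""
    counts = Counter(pid for pid, code in records if code == phecode)
    return {pid for pid, n in counts.items() if n >= min_count}


# Hypothetical (patient_id, phecode) records; "melanoma" and "other" are
# placeholder labels, not real phecode identifiers.
records = [
    ("p1", "melanoma"), ("p1", "melanoma"),
    ("p2", "melanoma"),
    ("p3", "melanoma"), ("p3", "other"), ("p3", "melanoma"),
    ("p4", "other"), ("p4", "other"),
]

cases = phecode_cases(records, "melanoma")
print(sorted(cases))  # ['p1', 'p3']
```

Lowering `min_count` to 1 would also admit `p2`, which shows how a looser threshold enlarges the case cohort, the kind of sample-size difference the abstract reports between code-based and histologically confirmed definitions.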
Affiliation(s)
- Lee Wheless
- Tennessee Valley Healthcare System VA Medical Center, 719 Thompson Lane, Suite 26300, Nashville, TN, 37215, USA.
- Division of Epidemiology, Vanderbilt University Medical Center Department of Medicine, Vanderbilt University Medical Center, Nashville, USA.
- Department of Dermatology, Nashville, TN, USA.
| | - Katlyn Gonzalez
- Vanderbilt University School of Medicine, Nashville, TN, USA
| | - Kyle Maas
- Vanderbilt University School of Medicine, Nashville, TN, USA
| | - Siwei Zhang
- Department of Biostatistics, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Lydia Yao
- Department of Biostatistics, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Yaomin Xu
- Department of Biostatistics, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Christopher Madden
- State University of New York Downstate College of Medicine, Brooklyn, NY, USA
| | - Sarah Grossarth
- Quillen College of Medicine, East Tennessee State University, Johnson City, TN, USA
| | - Otis Wilson
- Tennessee Valley Healthcare System VA Medical Center, 719 Thompson Lane, Suite 26300, Nashville, TN, 37215, USA
- Department of Medicine, Division of Nephrology and Hypertension, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Adriana Hung
- Tennessee Valley Healthcare System VA Medical Center, 719 Thompson Lane, Suite 26300, Nashville, TN, 37215, USA
- Department of Medicine, Division of Nephrology and Hypertension, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Nathanael R Fillmore
- VA Boston Healthcare System, Boston, MA, USA
- Department of Medicine, Veterans Affairs Boston Healthcare System, Boston, MA, USA
- Dana Farber Cancer Institute, Boston, MA, USA
- Massachusetts Veterans Epidemiology Research and Information Center, Veterans Affairs Boston Healthcare System, Boston, MA, USA
| | - Kevin Brown
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
| | - Maria Teresa Landi
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
| | - Rebecca I Hartman
- VA Boston Healthcare System, Boston, MA, USA
- Brigham and Women's Hospital Department of Dermatology, Boston, MA, USA
|
16
|
Steinfeldt J, Wild B, Buergel T, Pietzner M, Upmeier Zu Belzen J, Vauvelle A, Hegselmann S, Denaxas S, Hemingway H, Langenberg C, Landmesser U, Deanfield J, Eils R. Medical history predicts phenome-wide disease onset and enables the rapid response to emerging health threats. Nat Commun 2025; 16:585. [PMID: 39794311 PMCID: PMC11724087 DOI: 10.1038/s41467-025-55879-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2024] [Accepted: 01/02/2025] [Indexed: 01/13/2025] Open
Abstract
The COVID-19 pandemic exposed a global deficiency of systematic, data-driven guidance to identify high-risk individuals. Here, we illustrate the utility of routinely recorded medical history to predict the risk for 1741 diseases across clinical specialties and support the rapid response to emerging health threats such as COVID-19. We developed a neural network to learn from health records of 502,489 UK Biobank participants. Importantly, we observed discriminative improvements over basic demographic predictors for 1546 (88.8%) endpoints. After transferring the unmodified risk models to the All of Us cohort, we replicated these improvements for 1115 (78.9%) of 1414 investigated endpoints, demonstrating generalizability across healthcare systems and historically underrepresented groups. Ultimately, we showed how this approach could have been used to identify individuals vulnerable to severe COVID-19. Our study demonstrates the potential of medical history to support guidance for emerging pandemics by systematically estimating risk for thousands of diseases at once at minimal cost.
Affiliation(s)
- Jakob Steinfeldt
- Department of Cardiology, Angiology and Intensive Care Medicine, Deutsches Herzzentrum der Charité (DHZC), Berlin, Germany
- Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt Universität zu Berlin, Klinik/Centrum, Berlin, Germany
- Computational Medicine, Berlin Institute of Health (BIH), Charite - University Medicine Berlin, Berlin, Germany
- Friede Springer Cardiovascular Prevention Center@Charite, Charite - University Medicine Berlin, Berlin, Germany
- Institute of Cardiovascular Sciences, University College London, London, UK
| | - Benjamin Wild
- Institute of Cardiovascular Sciences, University College London, London, UK
- Center for Digital Health, Berlin Institute of Health (BIH), Charite - University Medicine Berlin, Berlin, Germany
| | - Thore Buergel
- Institute of Cardiovascular Sciences, University College London, London, UK
- Center for Digital Health, Berlin Institute of Health (BIH), Charite - University Medicine Berlin, Berlin, Germany
| | - Maik Pietzner
- Computational Medicine, Berlin Institute of Health (BIH), Charite - University Medicine Berlin, Berlin, Germany
- MRC Epidemiology Unit, Institute of Metabolic Science, University of Cambridge, Cambridge, UK
- Precision Health University Research Institute, Queen Mary University of London and Barts NHS Trust, London, UK
| | - Julius Upmeier Zu Belzen
- Center for Digital Health, Berlin Institute of Health (BIH), Charite - University Medicine Berlin, Berlin, Germany
| | - Andre Vauvelle
- Institute of Health Informatics, University College London, London, UK
| | - Stefan Hegselmann
- Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Massachusetts, USA
- Pattern Recognition and Image Analysis Lab, University of Münster, Münster, Germany
| | - Spiros Denaxas
- Institute of Health Informatics, University College London, London, UK
- British Heart Foundation Data Science Centre, London, UK
- Health Data Research UK, London, UK
- National Institute for Health Research, Biomedical Research Centre at University College London Hospitals, London, UK
| | - Harry Hemingway
- Institute of Health Informatics, University College London, London, UK
- Health Data Research UK, London, UK
- National Institute for Health Research, Biomedical Research Centre at University College London Hospitals, London, UK
| | - Claudia Langenberg
- Computational Medicine, Berlin Institute of Health (BIH), Charite - University Medicine Berlin, Berlin, Germany
- MRC Epidemiology Unit, Institute of Metabolic Science, University of Cambridge, Cambridge, UK
- Precision Health University Research Institute, Queen Mary University of London and Barts NHS Trust, London, UK
| | - Ulf Landmesser
- Department of Cardiology, Angiology and Intensive Care Medicine, Deutsches Herzzentrum der Charité (DHZC), Berlin, Germany
- Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt Universität zu Berlin, Klinik/Centrum, Berlin, Germany
- Friede Springer Cardiovascular Prevention Center@Charite, Charite - University Medicine Berlin, Berlin, Germany
- Berlin Institute of Health (BIH), Charite - University Medicine Berlin, Berlin, Germany
- DZHK (German Centre for Cardiovascular Research), Partner Site Berlin, Berlin, Germany
| | - John Deanfield
- Institute of Cardiovascular Sciences, University College London, London, UK
| | - Roland Eils
- Center for Digital Health, Berlin Institute of Health (BIH), Charite - University Medicine Berlin, Berlin, Germany.
- Health Data Science Unit, Heidelberg University Hospital and BioQuant, Heidelberg, Germany.
| |
|
17
|
Gunn S, Wang X, Posner DC, Cho K, Huffman JE, Gaziano M, Wilson PW, Sun YV, Peloso G, Lunetta KL. Comparison of methods for building polygenic scores for diverse populations. HGG ADVANCES 2025; 6:100355. [PMID: 39323095 PMCID: PMC11532986 DOI: 10.1016/j.xhgg.2024.100355] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/13/2024] [Revised: 09/22/2024] [Accepted: 09/22/2024] [Indexed: 09/27/2024] Open
Abstract
Polygenic scores (PGSs) are a promising tool for estimating individual-level genetic risk of disease based on the results of genome-wide association studies (GWASs). However, their promise has yet to be fully realized because most currently available PGSs were built with genetic data from predominantly European-ancestry populations, and PGS performance declines when scores are applied to target populations different from the populations from which they were derived. Thus, there is a great need to improve PGS performance in currently under-studied populations. In this work we leverage data from two large and diverse cohorts, the Million Veteran Program (MVP) and All of Us (AoU), providing us the unique opportunity to compare methods for building PGSs for multi-ancestry populations across multiple traits. We build PGSs for five continuous traits and five binary traits using both multi-ancestry and single-ancestry approaches with popular Bayesian PGS methods and both MVP META GWAS results and population-specific GWAS results from the respective African, European, and Hispanic MVP populations. We evaluate these scores in three AoU populations genetically similar to the respective African, Admixed American, and European 1000 Genomes Project superpopulations. Using correlation-based tests, we make formal comparisons of the PGS performance across the multiple AoU populations. We conclude that approaches that combine GWAS data from multiple populations produce PGSs that perform better than approaches that utilize smaller single-population GWAS results matched to the target population, and specifically that multi-ancestry scores built with PRS-CSx outperform the other approaches in the three AoU populations.
Affiliation(s)
- Sophia Gunn
- Biostatistics, Boston University School of Public Health, Boston, MA, USA; VA Boston Healthcare System, Boston, MA, USA.
| | - Xin Wang
- Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA, USA; Cardiovascular Disease Initiative, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Daniel C Posner
- Massachusetts Veterans Epidemiology Research and Information Center (MAVERIC), Boston, MA, USA
| | - Kelly Cho
- Department of Medicine, Harvard Medical School, Boston, MA, USA; MVP Boston Coordinating Center, VA Boston Healthcare System, Boston, MA, USA; Department of Medicine, Division of Aging, Brigham and Women's Hospital, Boston, MA 02115, USA
| | - Jennifer E Huffman
- Massachusetts Veterans Epidemiology Research and Information Center (MAVERIC), Boston, MA, USA; Department of Medicine, Harvard Medical School, Boston, MA, USA; Palo Alto Veterans Institute for Research (PAVIR), Palo Alto Health Care System, Palo Alto, CA, USA
| | - Michael Gaziano
- Department of Medicine, Harvard Medical School, Boston, MA, USA; MVP Boston Coordinating Center, VA Boston Healthcare System, Boston, MA, USA; Department of Medicine, Division of Aging, Brigham and Women's Hospital, Boston, MA 02115, USA
| | - Peter W Wilson
- VA Atlanta Healthcare System, Decatur, GA, USA; Division of Cardiology, Department of Medicine, Emory University School of Medicine, Atlanta, GA, USA; Department of Epidemiology, Rollins School of Public Health, Emory University, Atlanta, GA, USA
| | - Yan V Sun
- VA Atlanta Healthcare System, Decatur, GA, USA; Department of Epidemiology, Rollins School of Public Health, Emory University, Atlanta, GA, USA
| | - Gina Peloso
- Biostatistics, Boston University School of Public Health, Boston, MA, USA; VA Boston Healthcare System, Boston, MA, USA
| | - Kathryn L Lunetta
- Biostatistics, Boston University School of Public Health, Boston, MA, USA
| |
|
18
|
Justice A, Kelly MA, Bellus G, Green JD, Zaidi R, Kerrins T, Josyula N, Luperchio TR, Kozel BA, Williams MS. Phenotypic findings associated with variation in elastin. HGG ADVANCES 2025; 6:100388. [PMID: 39604264 PMCID: PMC11730535 DOI: 10.1016/j.xhgg.2024.100388] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/07/2024] [Revised: 11/25/2024] [Accepted: 11/25/2024] [Indexed: 11/29/2024] Open
Abstract
Variation in the elastin gene (ELN) may contribute to connective tissue disease beyond the known disease associations of supravalvar aortic stenosis and cutis laxa. Exome data from MyCode Community Health Initiative participants were analyzed for ELN rare variants (minor allele frequency <1%, not currently annotated as benign). Participants with variants of interest underwent phenotyping by dual chart review using a standardized abstraction tool. Additionally, all rare variants that met inclusion criteria were collapsed into an ELN gene burden score to perform a phenome-wide association study (PheWAS). Two hundred and ninety-six eligible participants with relevant ELN variants were identified from 184,293 MyCode participants. One hundred and three of 254 living participants (41%) met phenotypic criteria, most commonly aortic hypoplasia, arterial dilation, aneurysm, and dissection, and connective tissue abnormalities. ELN variation was significantly (p < 2.8 × 10⁻⁵) associated with "arterial dissection" in the PheWAS, and two connective tissue Phecodes approached significance. Variation in ELN is associated with connective tissue pathology beyond classic phenotypes.
Affiliation(s)
- Anne Justice
- Department of Population Health, Geisinger, Danville, PA, USA
| | | | - Gary Bellus
- Department of Pediatrics, Geisinger, Danville, PA, USA
| | - Joshua D Green
- Department of Genomic Health, Geisinger, Rockville, MD, USA
| | - Raza Zaidi
- Department of Internal Medicine, Geisinger, Danville, PA, USA
| | | | - Navya Josyula
- Department of Population Health, Geisinger, Danville, PA, USA
| | | | - Beth A Kozel
- National Heart, Lung, and Blood Institute, Bethesda, MD, USA
| | | |
|
19
|
Weng LC, Rämö JT, Jurgens SJ, Khurshid S, Chaffin M, Hall AW, Morrill VN, Wang X, Nauffal V, Sun YV, Beer D, Lee S, Nadkarni GN, Duong T, Wang B, Czuba T, Austin TR, Yoneda ZT, Friedman DJ, Clayton A, Hyman MC, Judy RL, Skanes AC, Orland KM, Treu TM, Oetjens MT, Alonso A, Soliman EZ, Lin H, Lunetta KL, van der Pals J, Issa TZ, Nafissi NA, May HT, Leong-Sit P, Roselli C, Choi SH, Khan HR, Knight S, Karlsson Linnér R, Bezzina CR, Ripatti S, Heckbert SR, Gaziano JM, Loos RJF, Psaty BM, Smith JG, Benjamin EJ, Arking DE, Rader DJ, Shah SH, Roden DM, Damrauer SM, Eckhardt LL, Roberts JD, Cutler MJ, Shoemaker MB, Haggerty CM, Cho K, Palotie A, Wilson PWF, Ellinor PT, Lubitz SA. The impact of common and rare genetic variants on bradyarrhythmia development. Nat Genet 2025; 57:53-64. [PMID: 39747593 PMCID: PMC11735381 DOI: 10.1038/s41588-024-01978-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/24/2023] [Accepted: 10/09/2024] [Indexed: 01/04/2025]
Abstract
To broaden our understanding of bradyarrhythmias and conduction disease, we performed common variant genome-wide association analyses in up to 1.3 million individuals and rare variant burden testing in 460,000 individuals for sinus node dysfunction (SND), distal conduction disease (DCD) and pacemaker (PM) implantation. We identified 13, 31 and 21 common variant loci for SND, DCD and PM, respectively. Four well-known loci (SCN5A/SCN10A, CCDC141, TBX20 and CAMK2D) were shared for SND and DCD, while others were more specific for SND or DCD. SND and DCD showed a moderate genetic correlation (rg = 0.63). Cardiomyocyte-expressed genes were enriched for contributions to DCD heritability. Rare-variant analyses implicated LMNA for all bradyarrhythmia phenotypes, SMAD6 and SCN5A for DCD and TTN, MYBPC3 and SCN5A for PM. These results show that variation in multiple genetic pathways (for example, ion channel function, cardiac developmental programs, sarcomeric structure and cellular homeostasis) appears critical to the development of bradyarrhythmias.
Grants
- R01 HL141901 NHLBI NIH HHS
- R01 HL139738 NHLBI NIH HHS
- 18SFRN34250007 American Heart Association (American Heart Association, Inc.)
- TNE FANTASY 19CV03 Fondation Leducq
- R01 HL092577 NHLBI NIH HHS
- R01 HL105756 NHLBI NIH HHS
- R01 HL157635 NHLBI NIH HHS
- R01 HL139731 NHLBI NIH HHS
- U01 AG068221 NIA NIH HHS
- 23CDA1050571 American Heart Association (American Heart Association, Inc.)
- T32 HL007101 NHLBI NIH HHS
- R01 AG083735 NIA NIH HHS
- 18SFRN34110082 American Heart Association (American Heart Association, Inc.)
- 18SFRN34230127 American Heart Association (American Heart Association, Inc.)
- IK2 CX001780 CSRD VA
- 75N92019D00031 NHLBI NIH HHS
- K23 HL169839 NHLBI NIH HHS
- 03-007-2022-0035 Hartstichting (Dutch Heart Foundation)
- R21 HL175584 NHLBI NIH HHS
- R01 HL163987 NHLBI NIH HHS
- National Institutes of Health: R01HL139731 & R01HL157635
- Sigrid Juséliuksen Säätiö (Sigrid Jusélius Foundation)
- National Institutes of Health: K23HL169839
- National Institutes of Health: R01HL092577
- National Institutes of Health: T32HL007101
- Swedish Heart-Lung Foundation (2022-0344, 2022-0345), the Swedish Research Council (2021-02273), the European Research Council (ERC-STG-2015-679242), Gothenburg University, Skane University Hospital, the Scania county, governmental funding of clinical research within the Swedish National Health Service, a generous donation from the Knut and Alice Wallenberg foundation to the Wallenberg Center for Molecular Medicine in Lund, and funding from the Swedish Research Council (Linnaeus grant Dnr 349-2006-237, Strategic Research Area Exodiab Dnr 2009-1039) and Swedish Foundation for Strategic Research (Dnr IRC15-0067) to the Lund University Diabetes Center.
- US Department of Veterans Affairs Clinical Research and Development award IK2-CX001780
- National Institutes of Health: R01HL163987-01 and R01HL139738-01
- Academy of Finland Centre of Excellence in Complex Disease Genetics (grant no. 312074 and 336824)
- National Institutes of Health: R01HL139731, R01HL157635, and R01HL092577; European Union: MAESTRIA 965286
Affiliation(s)
- Lu-Chen Weng
- Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA, USA
- Cardiovascular Disease Initiative, The Broad Institute of MIT and Harvard, Cambridge, MA, USA
- VA Boston Healthcare System, Boston, MA, USA
| | - Joel T Rämö
- Institute for Molecular Medicine Finland (FIMM), Helsinki Institute of Life Science (HiLIFE), University of Helsinki, Helsinki, Finland
- The Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Sean J Jurgens
- Cardiovascular Disease Initiative, The Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Experimental Cardiology, Amsterdam UMC, University of Amsterdam, Amsterdam, The Netherlands
| | - Shaan Khurshid
- Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA, USA
- Cardiovascular Disease Initiative, The Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Telemachus and Irene Demoulas Family Foundation Center for Cardiac Arrhythmias, Massachusetts General Hospital, Boston, MA, USA
| | - Mark Chaffin
- Cardiovascular Disease Initiative, The Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Amelia Weber Hall
- Gene Regulation Observatory, The Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Valerie N Morrill
- Cardiovascular Disease Initiative, The Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Xin Wang
- Cardiovascular Disease Initiative, The Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Victor Nauffal
- Cardiovascular Disease Initiative, The Broad Institute of MIT and Harvard, Cambridge, MA, USA
- VA Boston Healthcare System, Boston, MA, USA
- Cardiovascular Medicine Division, Brigham and Women's Hospital, Boston, MA, USA
| | - Yan V Sun
- VA Atlanta Healthcare System, Decatur, GA, USA
- Department of Epidemiology, Emory University Rollins School of Public Health, Atlanta, GA, USA
| | | | - Simon Lee
- Icahn School of Medicine at Mount Sinai, New York City, NY, USA
| | | | - ThuyVy Duong
- McKusick-Nathans Institute, Department of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
| | - Biqi Wang
- Department of Medicine, University of Massachusetts Chan Medical School, Worcester, MA, USA
| | - Tomasz Czuba
- The Wallenberg Laboratory/Department of Molecular and Clinical Medicine, Institute of Medicine, Gothenburg University, Gothenburg, Sweden
- Department of Cardiology, Sahlgrenska University Hospital, Gothenburg, Sweden
| | - Thomas R Austin
- Cardiovascular Health Research Unit, University of Washington, Seattle, WA, USA
- Department of Epidemiology, University of Washington, Seattle, WA, USA
| | - Zachary T Yoneda
- Department of Medicine, Division of Cardiovascular Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Daniel J Friedman
- Division of Cardiology, Department of Medicine, Duke University School of Medicine, Durham, NC, USA
| | - Anne Clayton
- Intermountain Heart Institute, Intermountain Medical Center, Murray, UT, USA
| | - Matthew C Hyman
- Division of Cardiac Electrophysiology, Hospital of the University of Pennsylvania, Philadelphia, PA, USA
| | - Renae L Judy
- Department of Surgery, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
| | - Allan C Skanes
- Section of Cardiac Electrophysiology, Division of Cardiology, Department of Medicine, Western University, London, Ontario, Canada
| | - Kate M Orland
- Department of Medicine, Division of Cardiovascular Medicine, University of Wisconsin-Madison, Madison, WI, USA
| | | | - Matthew T Oetjens
- Autism and Developmental Medicine Institute, Geisinger, Lewisburg, PA, USA
| | - Alvaro Alonso
- Department of Epidemiology, Rollins School of Public Health, Emory University, Atlanta, GA, USA
| | - Elsayed Z Soliman
- Epidemiological Cardiology Research Center, Section on Cardiovascular Medicine, Department of Medicine, Wake Forest School of Medicine, Winston-Salem, NC, USA
| | - Honghuang Lin
- Department of Medicine, University of Massachusetts Chan Medical School, Worcester, MA, USA
| | - Kathryn L Lunetta
- Department of Biostatistics, Boston University School of Public Health, Boston, MA, USA
| | - Jesper van der Pals
- Department of Cardiology, Clinical Sciences, Lund University and Skane University Hospital, Lund, Sweden
| | - Tariq Z Issa
- Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
| | - Navid A Nafissi
- Division of Cardiology, Department of Medicine, Duke University School of Medicine, Durham, NC, USA
| | - Heidi T May
- Intermountain Heart Institute, Intermountain Medical Center, Murray, UT, USA
| | - Peter Leong-Sit
- Section of Cardiac Electrophysiology, Division of Cardiology, Department of Medicine, Western University, London, Ontario, Canada
| | - Carolina Roselli
- Cardiovascular Disease Initiative, The Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Cardiology, University of Groningen, University Medical Center Groningen, Groningen, The Netherlands
| | - Seung Hoan Choi
- Cardiovascular Disease Initiative, The Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Habib R Khan
- Section of Cardiac Electrophysiology, Western University, London, Ontario, Canada
| | - Stacey Knight
- Intermountain Heart Institute, Intermountain Medical Center, Murray, UT, USA
- Department of Medicine, University of Utah, Salt Lake City, UT, USA
| | - Richard Karlsson Linnér
- Autism and Developmental Medicine Institute, Geisinger, Lewisburg, PA, USA
- Department of Economics, Leiden Law School, Leiden University, Leiden, The Netherlands
| | - Connie R Bezzina
- Department of Experimental Cardiology, Amsterdam Cardiovascular Sciences, Heart Center, Amsterdam University Medical Centre, University of Amsterdam, Amsterdam, The Netherlands
| | - Samuli Ripatti
- Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Helsinki, Finland
| | - Susan R Heckbert
- Cardiovascular Health Research Unit, University of Washington, Seattle, WA, USA
- Department of Epidemiology, University of Washington, Seattle, WA, USA
| | - J Michael Gaziano
- VA Boston Healthcare System, Boston, MA, USA
- Harvard Medical School, Boston, MA, USA
- Brigham and Women's Hospital, Boston, MA, USA
| | - Ruth J F Loos
- Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York City, NY, USA
- Novo Nordisk Foundation Center for Basic Metabolic Research, Department of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
| | - Bruce M Psaty
- Cardiovascular Health Research Unit, University of Washington, Seattle, WA, USA
- Departments of Medicine, Epidemiology, and Health Services, University of Washington, Seattle, WA, USA
| | - J Gustav Smith
- The Wallenberg Laboratory/Department of Molecular and Clinical Medicine, Institute of Medicine, Gothenburg University, Gothenburg, Sweden
- Department of Cardiology, Sahlgrenska University Hospital, Gothenburg, Sweden
- Department of Cardiology, Clinical Sciences, Lund University and Skane University Hospital, Lund, Sweden
- Wallenberg Center for Molecular Medicine and Lund University Diabetes Center, Lund University, Lund, Sweden
| | - Emelia J Benjamin
- Department of Medicine, Boston University Chobanian & Avedisian School of Medicine, Boston, MA, USA
- Department of Epidemiology, Boston University School of Public Health, Boston, MA, USA
- NHLBI and BU's Framingham Heart Study, Framingham, MA, USA
| | - Dan E Arking
- McKusick-Nathans Institute, Department of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
| | - Daniel J Rader
- Departments of Medicine and Genetics, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA, USA
| | - Svati H Shah
- Division of Cardiology, Department of Medicine, Duke University School of Medicine, Durham, NC, USA
- Duke Molecular Physiology Institute, Duke University School of Medicine, Durham, NC, USA
| | - Dan M Roden
- Vanderbilt University Medical Center, Nashville, TN, USA
| | - Scott M Damrauer
- Department of Surgery, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Department of Genetics and Cardiovascular Institute, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
| | - Lee L Eckhardt
- Department of Medicine, Division of Cardiovascular Medicine, University of Wisconsin-Madison, Madison, WI, USA
| | - Jason D Roberts
- Section of Cardiac Electrophysiology, Division of Cardiology, Department of Medicine, Western University, London, Ontario, Canada
| | - Michael J Cutler
- Intermountain Heart Institute, Intermountain Medical Center, Murray, UT, USA
| | - M Benjamin Shoemaker
- Department of Medicine, Division of Cardiovascular Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Christopher M Haggerty
- Heart Institute, Geisinger, Danville, PA, USA
- Department of Translational Data Science and Informatics, Geisinger, Danville, PA, USA
| | - Kelly Cho
- VA Boston Healthcare System, Boston, MA, USA
- Harvard Medical School, Boston, MA, USA
- Brigham and Women's Hospital, Boston, MA, USA
| | - Aarno Palotie
- Institute for Molecular Medicine Finland (FIMM), Helsinki Institute of Life Science (HiLIFE), University of Helsinki, Helsinki, Finland
- The Stanley Center for Psychiatric Research and Program in Medical and Population Genetics, The Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Analytic and Translational Genetics Unit, Department of Medicine, Department of Neurology and Department of Psychiatry, Massachusetts General Hospital, Boston, MA, USA
| | - Peter W F Wilson
- VA Atlanta Healthcare System, Decatur, GA, USA
- Department of Medicine, Emory University School of Medicine, Atlanta, GA, USA
| | - Patrick T Ellinor
- Cardiovascular Disease Initiative, The Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Telemachus and Irene Demoulas Family Foundation Center for Cardiac Arrhythmias, Massachusetts General Hospital, Boston, MA, USA
| | - Steven A Lubitz
- Telemachus and Irene Demoulas Family Foundation Center for Cardiac Arrhythmias, Massachusetts General Hospital, Boston, MA, USA.
| |
|
20
|
Wilcox H, Saxena R, Winkelman JW, Dashti HS. Clinical and genetic associations for night eating syndrome in a patient biobank. J Eat Disord 2024; 12:211. [PMID: 39716312 DOI: 10.1186/s40337-024-01180-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/03/2024] [Accepted: 12/12/2024] [Indexed: 12/25/2024] Open
Abstract
OBJECTIVE Night eating syndrome (NES) is an eating disorder characterized by evening hyperphagia. Despite having a prevalence comparable to some other eating disorders, NES remains sparsely investigated and poorly characterized. The present study examined the phenotypic and genetic associations for NES in the clinical Mass General Brigham Biobank. METHOD Cases of NES were identified through relevant billing codes for eating disorders (F50.89/F50.9) and subsequent chart review; patients likely without NES were set as controls. Other diagnoses were determined from billing codes and collapsed into one of 1,857 distinct phenotypes based on clinical similarity. NES associations with diagnoses were systematically tested in phenome-wide association scans using logistic regression models with adjustments for age, sex, race, and ethnicity. Polygenic scores for six related traits, namely anorexia nervosa, depression, insomnia, sleep apnea, obesity, and type 2 diabetes, were tested for associations with NES among participants of European ancestry using adjusted logistic regression models. RESULTS Phenome-wide scans comparing patients with NES against controls (cases n = 88; controls n = 64,539) identified associations with 159 clinical diagnoses spanning 13 broad disease groups including endocrine/metabolic and digestive diseases. Notable associations were evident for bariatric surgery, vitamin D deficiency, sleep disorders (sleep apnea, insomnia, and restless legs syndrome), and attention deficit hyperactivity disorder. The polygenic scores for insomnia and obesity were associated with higher odds of NES (insomnia: odds ratio [OR], 1.24; 95% CI, 1.07, 1.43; obesity: OR, 1.98; 95% CI, 1.71, 2.28). DISCUSSION Complementary phenome-wide and genetic exploratory analyses provided information on unique and shared features of NES, offering insights that may facilitate its precise definition, diagnosis, and the development of targeted therapeutic interventions.
Affiliation(s)
- Hannah Wilcox
- Department of Anesthesia, Critical Care and Pain Medicine, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
| | - Richa Saxena
- Department of Anesthesia, Critical Care and Pain Medicine, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
- Programs in Metabolism and Medical & Population Genetics, The Broad Institute of M.I.T and Harvard, Cambridge, MA, USA
- Division of Sleep Medicine, Harvard Medical School, Boston, MA, USA
| | - John W Winkelman
- Division of Sleep Medicine, Harvard Medical School, Boston, MA, USA
- Sleep Disorders Clinical Research Program, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
| | - Hassan S Dashti
- Department of Anesthesia, Critical Care and Pain Medicine, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA.
- Programs in Metabolism and Medical & Population Genetics, The Broad Institute of M.I.T and Harvard, Cambridge, MA, USA.
- Division of Sleep Medicine, Harvard Medical School, Boston, MA, USA.
- Division of Nutrition, Harvard Medical School, Boston, MA, USA.
- 55 Fruit Street, Edwards 410C, Boston, MA, 02114, USA.
| |
|
21
|
Barron DS, Saltoun K, Kiesow H, Fu M, Cohen-Tanugi J, Geha P, Scheinost D, Isaac Z, Silbersweig D, Bzdok D. Pain can't be carved at the joints: defining function-based pain profiles and their relevance to chronic disease management in healthcare delivery design. BMC Med 2024; 22:594. [PMID: 39696368 PMCID: PMC11656997 DOI: 10.1186/s12916-024-03807-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/10/2024] [Accepted: 12/02/2024] [Indexed: 12/20/2024] Open
Abstract
BACKGROUND Pain is a complex problem that is triaged, diagnosed, treated, and billed based on which body part is painful, almost without exception. While the "body part framework" guides the organization and treatment of individual patients' pain conditions, it remains unclear how to best conceptualize, study, and treat pain conditions at the population level. Here, we investigate (1) how the body part framework agrees with population-level, biologically derived pain profiles; (2) how data-derived pain profiles interface with other symptom domains from a whole-body perspective; and (3) whether biologically derived pain profiles capture clinically salient differences in medical history. METHODS To understand how pain conditions might be best organized, we applied a carefully designed multi-variate pattern-learning approach to a subset of the UK Biobank (n = 34,337), the largest publicly available set of real-world pain experience data, to define common population-level profiles. We performed a series of post hoc analyses to validate that each pain profile reflects real-world, clinically relevant differences in patient function by probing associations of each profile across 137 medication categories, 1425 clinician-assigned ICD codes, and 757 expert-curated phenotypes. RESULTS We report four unique, biologically based pain profiles that cut across medical specialties: pain interference, depression, medical pain, and anxiety, each representing different facets of functional impairment. Importantly, these profiles do not specifically align with variables believed to be important to the standard pain evaluation, namely painful body part, pain intensity, sex, or BMI. Correlations with individual-level clinical histories reveal that our pain profiles are largely associated with clinical variables and treatments of modifiable, chronic diseases, rather than with specific body parts. Across profiles, notable differences include opioids being associated only with the pain interference profile, while antidepressants were linked to the three complementary profiles. We further provide evidence that our pain profiles offer valuable, additional insights into patients' wellbeing that are not captured by the body-part framework and make recommendations for how our pain profiles might sculpt the future design of healthcare delivery systems. CONCLUSION Overall, we provide evidence for a shift in pain medicine delivery systems from the conventional, body-part-based approach to one anchored in the pain experience and holistic profiles of patient function. This transition facilitates a more comprehensive management of chronic diseases, wherein pain treatment is integrated into broader health strategies. By focusing on holistic patient profiles, our approach not only addresses pain symptoms but also supports the management of underlying chronic conditions, thereby enhancing patient outcomes and improving quality of life. This model advocates for a seamless integration of pain management within the continuum of care for chronic diseases, emphasizing the importance of understanding and treating the interdependencies between chronic conditions and pain.
Affiliation(s)
- Daniel S Barron
- Department of Psychiatry, Brigham & Women's Hospital, Mass General Brigham, Boston, USA.
- Department of Physical Medicine and Rehabilitation, Spaulding Rehabilitation Hospital, Mass General Brigham, Boston, USA.
| | - Karin Saltoun
- Department of Biomedical Engineering, Montreal Neurological Institute, McGill University and Mila - Quebec AI Institute, Montreal, Canada
| | - Hannah Kiesow
- Department of Biomedical Engineering, Montreal Neurological Institute, McGill University and Mila - Quebec AI Institute, Montreal, Canada
| | - Melanie Fu
- Department of Psychiatry, Brigham & Women's Hospital, Mass General Brigham, Boston, USA
| | | | - Paul Geha
- Departments of Neuroscience, Psychiatry, Dentistry and Neurology, University of Rochester, Rochester, USA
| | | | - Zacharia Isaac
- Department of Physical Medicine and Rehabilitation, Spaulding Rehabilitation Hospital, Mass General Brigham, Boston, USA
| | - David Silbersweig
- Department of Psychiatry, Brigham & Women's Hospital, Mass General Brigham, Boston, USA
| | - Danilo Bzdok
- Department of Biomedical Engineering, Montreal Neurological Institute, McGill University and Mila - Quebec AI Institute, Montreal, Canada
|
22
|
Barr PB, Neale ZE, Bigdeli TB, Chatzinakos C, Harvey PD, Peterson RE, Meyers JL. Social and Polygenic Risk Factors for Time to Comorbid Diagnoses in Individuals with Substance Use Disorders: A Phenome-Wide Survival Analysis. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2024:2024.12.13.24319000. [PMID: 39711727 PMCID: PMC11661425 DOI: 10.1101/2024.12.13.24319000] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/24/2024]
Abstract
Importance Persons with substance use disorders (SUD) often suffer from additional comorbidities, including psychiatric conditions and physical health problems. Researchers have explored this overlap in electronic health records (EHR) using phenome-wide association studies (PheWAS) to characterize how different indicators are related to all conditions in an individual's EHR. However, analyses have been largely cross-sectional in nature. Objective To characterize whether various social and genetic risk factors are associated with time to comorbid diagnoses in electronic health records (EHR) after the first diagnosis of SUD. Design Leveraging those with EHR and whole-genome sequencing data in All of Us (N = 287,012), we explored whether social determinants of health are associated with lifetime risk of SUD. Next, within those with a diagnosed SUD (N = 17,460), we examined whether polygenic scores (PGS) were associated with time to comorbid diagnoses by performing a phenome-wide survival analysis. Setting Participating health care organizations across the United States. Participants Participants in the All of Us Research Program with available EHR and genomic data. Exposures Social determinants of health and polygenic scores (PGS) for psychiatric and substance use disorders. Main Outcomes and Measures Phecodes for diagnoses derived from International Statistical Classification of Diseases, Ninth and Tenth Revisions, Clinical Modification, codes from EHR. Results Multiple social and demographic risk factors were associated with lifetime SUD diagnosis. Most strikingly, those reporting an annual income <$10K had 4.5 times the odds of having an SUD diagnosis compared to those reporting $100-$150K annually (OR = 4.48, 95% CI = 4.01, 5.01). 
PGSs for alcohol use disorders, schizophrenia, and post-traumatic stress disorder were associated with time to their respective diagnoses (HR_AUD = 1.10, 95% CI = 1.06, 1.14; HR_SCZ = 1.13, 95% CI = 1.06, 1.20; HR_PTSD = 1.15, 95% CI = 1.08, 1.22). A PGS for ever-smoking was associated with time to subsequent smoking-related comorbidities and additional SUD diagnoses (HR_SMOK = 1.6 to 1.16). Conclusions and Relevance Social determinants, especially those related to income, have profound associations with lifetime SUD risk. Additionally, PGS may include information related to outcomes above and beyond lifetime risk, including timing and severity.
Affiliation(s)
- Peter B. Barr
- SUNY Downstate Health Sciences University, Department of Psychiatry and Behavioral Sciences
- SUNY Downstate Health Sciences University, Institute for Genomics in Health
- SUNY Downstate Health Sciences University, Department of Community Health Sciences
- VA New York Harbor Healthcare System
| | - Zoe E. Neale
- SUNY Downstate Health Sciences University, Department of Psychiatry and Behavioral Sciences
- SUNY Downstate Health Sciences University, Institute for Genomics in Health
- VA New York Harbor Healthcare System
| | - Tim B. Bigdeli
- SUNY Downstate Health Sciences University, Department of Psychiatry and Behavioral Sciences
- SUNY Downstate Health Sciences University, Institute for Genomics in Health
- VA New York Harbor Healthcare System
- SUNY Downstate Health Sciences University, Department of Epidemiology and Biostatistics
| | - Chris Chatzinakos
- SUNY Downstate Health Sciences University, Department of Psychiatry and Behavioral Sciences
- SUNY Downstate Health Sciences University, Institute for Genomics in Health
| | - Philip D. Harvey
- University of Miami Miller School of Medicine
- Research Service, Bruce W. Carter Miami Veterans Affairs (VA) Medical Center
| | - Roseann E. Peterson
- SUNY Downstate Health Sciences University, Department of Psychiatry and Behavioral Sciences
- SUNY Downstate Health Sciences University, Institute for Genomics in Health
| | - Jacquelyn L. Meyers
- SUNY Downstate Health Sciences University, Department of Psychiatry and Behavioral Sciences
- SUNY Downstate Health Sciences University, Institute for Genomics in Health
- SUNY Downstate Health Sciences University, Department of Epidemiology and Biostatistics
|
23
|
Johnson R, Gottlieb U, Shaham G, Eisen L, Waxman J, Devons-Sberro S, Ginder CR, Hong P, Sayeed R, Reis BY, Balicer RD, Dagan N, Zitnik M. Unified Clinical Vocabulary Embeddings for Advancing Precision Medicine. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2024:2024.12.03.24318322. [PMID: 39677476 PMCID: PMC11643188 DOI: 10.1101/2024.12.03.24318322] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/17/2024]
Abstract
Integrating clinical knowledge into AI remains challenging despite numerous medical guidelines and vocabularies. Medical codes, central to healthcare systems, often reflect operational patterns shaped by geographic factors, national policies, insurance frameworks, and physician practices rather than the precise representation of clinical knowledge. This disconnect hampers AI in representing clinical relationships, raising concerns about bias, transparency, and generalizability. Here, we developed a resource of 67,124 clinical vocabulary embeddings derived from a clinical knowledge graph tailored to electronic health record vocabularies, spanning over 1.3 million edges. Using graph transformer neural networks, we generated clinical vocabulary embeddings that provide a new representation of clinical knowledge by unifying seven medical vocabularies. These embeddings were validated through a phenotype risk score analysis involving 4.57 million patients from Clalit Healthcare Services, effectively stratifying individuals based on survival outcomes. Inter-institutional panels of clinicians evaluated the embeddings for alignment with clinical knowledge across 90 diseases and 3,000 clinical codes, confirming their robustness and transferability. This resource addresses gaps in integrating clinical vocabularies into AI models and training datasets, paving the way for knowledge-grounded population and patient-level models.
Affiliation(s)
- Ruth Johnson
- The Ivan and Francesca Berkowitz Family Living Laboratory Collaboration at Harvard Medical School and Clalit Research Institute, Boston, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
| | - Uri Gottlieb
- Clalit Research Institute, Innovation Division, Clalit Health Services, Ramat-Gan, Israel
| | - Galit Shaham
- Clalit Research Institute, Innovation Division, Clalit Health Services, Ramat-Gan, Israel
| | - Lihi Eisen
- Clalit Research Institute, Innovation Division, Clalit Health Services, Ramat-Gan, Israel
| | - Jacob Waxman
- Clalit Research Institute, Innovation Division, Clalit Health Services, Ramat-Gan, Israel
| | - Stav Devons-Sberro
- Clalit Research Institute, Innovation Division, Clalit Health Services, Ramat-Gan, Israel
| | - Curtis R. Ginder
- Cardiovascular Division, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, USA
| | - Peter Hong
- Division of General Pediatrics, Department of Pediatrics, Boston Children’s Hospital, Boston, MA, USA
- Information Technology, Enterprise Data Analytics and Reporting, Boston Children’s Hospital, Boston, MA, USA
- Department of Pediatrics, Harvard Medical School, Boston, MA, USA
| | - Raheel Sayeed
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
| | - Ben Y. Reis
- The Ivan and Francesca Berkowitz Family Living Laboratory Collaboration at Harvard Medical School and Clalit Research Institute, Boston, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Predictive Medicine Group, Computational Health Informatics Program, Boston Children’s Hospital, Boston, MA, USA
- Harvard Data Science Initiative, Cambridge, MA, USA
| | - Ran D. Balicer
- The Ivan and Francesca Berkowitz Family Living Laboratory Collaboration at Harvard Medical School and Clalit Research Institute, Boston, MA, USA
- Clalit Research Institute, Innovation Division, Clalit Health Services, Ramat-Gan, Israel
- Faculty of Health Sciences, School of Public Health, Ben Gurion University of the Negev, Be’er Sheva, Israel
| | - Noa Dagan
- The Ivan and Francesca Berkowitz Family Living Laboratory Collaboration at Harvard Medical School and Clalit Research Institute, Boston, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Clalit Research Institute, Innovation Division, Clalit Health Services, Ramat-Gan, Israel
- Software and Information Systems Engineering, Ben Gurion University, Be’er Sheva, Israel
| | - Marinka Zitnik
- The Ivan and Francesca Berkowitz Family Living Laboratory Collaboration at Harvard Medical School and Clalit Research Institute, Boston, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Harvard Data Science Initiative, Cambridge, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Kempner Institute for the Study of Natural and Artificial Intelligence, Harvard University, Allston, MA, USA
|
24
|
Wang W, Feng Y, Zhao H, Wang X, Cai R, Cai W, Zhang X. Mdpg: a novel multi-disease diagnosis prediction method based on patient knowledge graphs. Health Inf Sci Syst 2024; 12:15. [PMID: 38440103 PMCID: PMC10908733 DOI: 10.1007/s13755-024-00278-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/19/2023] [Accepted: 01/23/2024] [Indexed: 03/06/2024]
Abstract
Diagnosis prediction, a key factor in enhancing healthcare efficiency, remains a focal point in clinical decision support research. However, the time-series, sparse and multi-noise characteristics of electronic health record (EHR) data make it a great challenge. Existing methods commonly address these issues using RNNs and incorporating medical prior knowledge from medical knowledge bases, but they neglect the local spatial characteristics and spatial-temporal correlation of the data. Consequently, we propose MDPG, a diagnosis prediction model based on patient knowledge graphs. Initially, we represent the electronic visit records of patients as a patient-centered temporal knowledge graph, capturing the local spatial structure and temporal characteristics of the visit information. Subsequently, we design the spatial graph convolution block, temporal self-attention block, and spatial-temporal synchronous graph convolution block to capture the spatial, temporal, and spatial-temporal correlations embedded in them, respectively. Ultimately, we accomplish the prediction of patients' future states through multi-label classification. We conduct comprehensive experiments on two real-world datasets independently and evaluate the results using visit-level precision@k and code-level accuracy@k metrics. The experimental results demonstrate that MDPG outperforms all baseline models, yielding the best performance.
Affiliation(s)
- Weiguang Wang
- School of Computer Science and Engineering, Northeastern University, Shenyang, 110819 Liaoning China
- Neusoft Research of Intelligent Healthcare Technology, Co. Ltd, Shenyang, 110167 Liaoning China
| | - Yingying Feng
- School of Computer Science and Engineering, Northeastern University, Shenyang, 110819 Liaoning China
| | - Haiyan Zhao
- School of Computer Science, Peking University, Beijing, 100871 China
- Key Laboratory of High Confidence Software Technologies (PKU), Ministry of Education, Beijing, 100871 China
| | - Xin Wang
- College of Intelligence and Computing, Tianjin University, Tianjin, 300354 China
| | - Ruikai Cai
- Department of Neurosurgery, Shengjing Hospital of China Medical University, Shenyang, 110004 Liaoning China
| | - Wei Cai
- Neusoft Research of Intelligent Healthcare Technology, Co. Ltd, Shenyang, 110167 Liaoning China
| | - Xia Zhang
- School of Computer Science and Engineering, Northeastern University, Shenyang, 110819 Liaoning China
- Neusoft Research of Intelligent Healthcare Technology, Co. Ltd, Shenyang, 110167 Liaoning China
|
25
|
Zheng SL, Henry A, Cannie D, Lee M, Miller D, McGurk KA, Bond I, Xu X, Issa H, Francis C, De Marvao A, Theotokis PI, Buchan RJ, Speed D, Abner E, Adams L, Aragam KG, Ärnlöv J, Raja AA, Backman JD, Baksi J, Barton PJR, Biddinger KJ, Boersma E, Brandimarto J, Brunak S, Bundgaard H, Carey DJ, Charron P, Cook JP, Cook SA, Denaxas S, Deleuze JF, Doney AS, Elliott P, Erikstrup C, Esko T, Farber-Eger EH, Finan C, Garnier S, Ghouse J, Giedraitis V, Guðbjartsson DF, Haggerty CM, Halliday BP, Helgadottir A, Hemingway H, Hillege HL, Kardys I, Lind L, Lindgren CM, Lowery BD, Manisty C, Margulies KB, Moon JC, Mordi IR, Morley MP, Morris AD, Morris AP, Morton L, Noursadeghi M, Ostrowski SR, Owens AT, Palmer CNA, Pantazis A, Pedersen OBV, Prasad SK, Shekhar A, Smelser DT, Srinivasan S, Stefansson K, Sveinbjörnsson G, Syrris P, Tammesoo ML, Tayal U, Teder-Laving M, Thorgeirsson G, Thorsteinsdottir U, Tragante V, Trégouët DA, Treibel TA, Ullum H, Valdes AM, van Setten J, van Vugt M, Veluchamy A, Verschuren WMM, Villard E, Yang Y, Asselbergs FW, Cappola TP, Dube MP, Dunn ME, Ellinor PT, Hingorani AD, Lang CC, Samani NJ, Shah SH, Smith JG, Vasan RS, O'Regan DP, Holm H, Noseda M, Wells Q, Ware JS, Lumbers RT. Genome-wide association analysis provides insights into the molecular etiology of dilated cardiomyopathy. Nat Genet 2024; 56:2646-2658. [PMID: 39572783 PMCID: PMC11631752 DOI: 10.1038/s41588-024-01952-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 09/13/2023] [Accepted: 09/18/2024] [Indexed: 12/12/2024]
Abstract
Dilated cardiomyopathy (DCM) is a leading cause of heart failure and cardiac transplantation. We report a genome-wide association study and multi-trait analysis of DCM (14,256 cases) and three left ventricular traits (36,203 UK Biobank participants). We identified 80 genomic risk loci and prioritized 62 putative effector genes, including several with rare variant DCM associations (MAP3K7, NEDD4L and SSPN). Using single-nucleus transcriptomics, we identify cellular states, biological pathways, and intracellular communications that drive pathogenesis. We demonstrate that polygenic scores predict DCM in the general population and modify penetrance in carriers of rare DCM variants. Our findings may inform the design of genetic testing strategies that incorporate polygenic background. They also provide insights into the molecular etiology of DCM that may facilitate the development of targeted therapeutics.
Affiliation(s)
- Sean L Zheng
- National Heart and Lung Institute, Imperial College London, London, UK
- MRC Laboratory of Medical Sciences, London, UK
- Royal Brompton & Harefield Hospitals, Guy's and St. Thomas' NHS Foundation Trust, London, UK
| | - Albert Henry
- Institute of Cardiovascular Science, University College London, London, UK
- Institute of Health Informatics, University College London, London, UK
| | - Douglas Cannie
- Institute of Cardiovascular Science, University College London, London, UK
- Barts Heart Centre, St Bartholomew's Hospital, London, UK
| | - Michael Lee
- National Heart and Lung Institute, Imperial College London, London, UK
| | - David Miller
- Division of Biosciences, University College London, London, UK
| | - Kathryn A McGurk
- National Heart and Lung Institute, Imperial College London, London, UK
- MRC Laboratory of Medical Sciences, London, UK
- Program in Medical and Population Genetics, The Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Isabelle Bond
- Institute of Cardiovascular Science, University College London, London, UK
| | - Xiao Xu
- National Heart and Lung Institute, Imperial College London, London, UK
- MRC Laboratory of Medical Sciences, London, UK
| | - Hanane Issa
- Institute of Health Informatics, University College London, London, UK
| | - Catherine Francis
- National Heart and Lung Institute, Imperial College London, London, UK
- Royal Brompton & Harefield Hospitals, Guy's and St. Thomas' NHS Foundation Trust, London, UK
| | - Antonio De Marvao
- National Heart and Lung Institute, Imperial College London, London, UK
- MRC Laboratory of Medical Sciences, London, UK
- Royal Brompton & Harefield Hospitals, Guy's and St. Thomas' NHS Foundation Trust, London, UK
| | - Pantazis I Theotokis
- National Heart and Lung Institute, Imperial College London, London, UK
- MRC Laboratory of Medical Sciences, London, UK
- Royal Brompton & Harefield Hospitals, Guy's and St. Thomas' NHS Foundation Trust, London, UK
| | - Rachel J Buchan
- National Heart and Lung Institute, Imperial College London, London, UK
- MRC Laboratory of Medical Sciences, London, UK
- Royal Brompton & Harefield Hospitals, Guy's and St. Thomas' NHS Foundation Trust, London, UK
| | - Doug Speed
- Quantitative Genetics and Genomics, Aarhus University, Aarhus, Denmark
| | - Erik Abner
- Estonian Genome Center, Institute of Genomics, University of Tartu, Tartu, Estonia
| | | | - Krishna G Aragam
- Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA, USA
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
- Program in Medical and Population Genetics, The Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Johan Ärnlöv
- Department of Neurobiology, Care Sciences and Society/Section of Family Medicine and Primary Care, Karolinska Institutet, Stockholm, Sweden
- School of Health and Social Sciences, Dalarna University, Falun, Sweden
| | - Anna Axelsson Raja
- Department of Cardiology, The Heart Centre, Copenhagen University Hospital, Rigshospitalet, Copenhagen, Denmark
| | - Joshua D Backman
- Analytical Genetics, Regeneron Genetics Center, Tarrytown, NY, USA
| | - John Baksi
- Royal Brompton & Harefield Hospitals, Guy's and St. Thomas' NHS Foundation Trust, London, UK
| | - Paul J R Barton
- National Heart and Lung Institute, Imperial College London, London, UK
- MRC Laboratory of Medical Sciences, London, UK
- Royal Brompton & Harefield Hospitals, Guy's and St. Thomas' NHS Foundation Trust, London, UK
| | - Kiran J Biddinger
- Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA, USA
- Program in Medical and Population Genetics, The Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Eric Boersma
- Erasmus MC, Cardiovascular Institute, Thorax Center, Department of Cardiology, Utrecht, the Netherlands
| | - Jeffrey Brandimarto
- Penn Cardiovascular Institute, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
| | - Søren Brunak
- Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
| | - Henning Bundgaard
- Department of Cardiology, The Heart Centre, Copenhagen University Hospital, Rigshospitalet, Copenhagen, Denmark
| | - David J Carey
- Department of Molecular and Functional Genomics, Geisinger, Danville, PA, USA
| | - Philippe Charron
- Sorbonne Research Unit on Cardiovascular Disorders, Metabolism and Nutrition, Team Genomics & Pathophysiology of Cardiovascular Diseases, ICAN Institute for Cardiometabolism and Nutrition, Paris, France
- APHP, Department of Genetics, Pitié-Salpêtrière Hospital, Paris, France
| | - James P Cook
- Department of Biostatistics, University of Liverpool, Liverpool, UK
| | - Stuart A Cook
- National Heart and Lung Institute, Imperial College London, London, UK
- MRC Laboratory of Medical Sciences, London, UK
- Royal Brompton & Harefield Hospitals, Guy's and St. Thomas' NHS Foundation Trust, London, UK
| | - Spiros Denaxas
- Institute of Health Informatics, University College London, London, UK
- Health Data Research UK, University College London, London, UK
- British Heart Foundation Data Science Centre, London, UK
- The National Institute for Health Research University College London Hospitals Biomedical Research Centre, University College London, London, UK
| | - Jean-François Deleuze
- Centre National de Recherche en Génomique Humaine (CNRGH), Institut de Biologie François Jacob, CEA, Université Paris-Saclay, Evry, France
- Laboratory of Excellence GENMED (Medical Genomics), Paris, France
- Centre d'Etude du Polymorphisme Humain, Fondation Jean Dausset, Paris, France
| | - Alexander S Doney
- Division of Molecular & Clinical Medicine, University of Dundee, Ninewells Hospital and Medical School, Dundee, UK
| | - Perry Elliott
- Institute of Cardiovascular Science, University College London, London, UK
- Barts Heart Centre, St Bartholomew's Hospital, London, UK
| | - Christian Erikstrup
- Department of Clinical Immunology, Aarhus University Hospital, Aarhus, Denmark
- Department of Clinical Medicine, Aarhus University Hospital, Aarhus, Denmark
| | - Tõnu Esko
- Estonian Genome Center, Institute of Genomics, University of Tartu, Tartu, Estonia
- Program in Medical and Population Genetics, The Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Eric H Farber-Eger
- Vanderbilt Institute for Clinical and Translational Research, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Chris Finan
- Institute of Cardiovascular Science, University College London, London, UK
| | - Sophie Garnier
- Sorbonne Research Unit on Cardiovascular Disorders, Metabolism and Nutrition, Team Genomics & Pathophysiology of Cardiovascular Diseases, ICAN Institute for Cardiometabolism and Nutrition, Paris, France
| | - Jonas Ghouse
- Department of Cardiology, The Heart Centre, Copenhagen University Hospital, Rigshospitalet, Copenhagen, Denmark
| | | | - Daniel F Guðbjartsson
- deCODE genetics/Amgen Inc., Reykjavik, Iceland
- School of Engineering and Natural Sciences, University of Iceland, Reykjavik, Iceland
| | | | - Brian P Halliday
- National Heart and Lung Institute, Imperial College London, London, UK
- Royal Brompton & Harefield Hospitals, Guy's and St. Thomas' NHS Foundation Trust, London, UK
| | | | - Harry Hemingway
- Institute of Health Informatics, University College London, London, UK
- Health Data Research UK, University College London, London, UK
| | - Hans L Hillege
- Department of Cardiology, University Medical Center Groningen, University of Groningen, Groningen, the Netherlands
| | - Isabella Kardys
- Erasmus MC, Cardiovascular Institute, Thorax Center, Department of Cardiology, Utrecht, the Netherlands
| | - Lars Lind
- Department of Medical Sciences, Uppsala University, Uppsala, Sweden
| | - Cecilia M Lindgren
- Program in Medical and Population Genetics, The Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Big Data Institute at the Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford, UK
- Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, UK
| | - Brandon D Lowery
- Vanderbilt Institute for Clinical and Translational Research, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Charlotte Manisty
- Institute of Cardiovascular Science, University College London, London, UK
- Barts Heart Centre, St Bartholomew's Hospital, London, UK
| | - Kenneth B Margulies
- Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
| | - James C Moon
- Institute of Cardiovascular Science, University College London, London, UK
- Barts Heart Centre, St Bartholomew's Hospital, London, UK
| | - Ify R Mordi
- Division of Molecular & Clinical Medicine, University of Dundee, Ninewells Hospital and Medical School, Dundee, UK
| | - Michael P Morley
- Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
| | - Andrew D Morris
- Usher Institute of Population Health Sciences and Informatics, University of Edinburgh, Edinburgh, UK
| | - Andrew P Morris
- Centre for Genetics and Genomics Versus Arthritis, Centre for Musculoskeletal Research, University of Manchester, Manchester, UK
| | - Lori Morton
- Cardiovascular Research, Regeneron Pharmaceuticals, Tarrytown, NY, USA
| | - Mahdad Noursadeghi
- Research Department of Infection, Division of Infection and Immunity, University College London, London, UK
| | - Sisse R Ostrowski
- Department of Clinical Immunology, Copenhagen University Hospital, Rigshospitalet, Copenhagen, Denmark
- Department of Clinical Medicine, Faculty of Health and Medical Sciences, University of Copenhagen University Hospital, Copenhagen, Denmark
| | - Anjali T Owens
- Penn Cardiovascular Institute, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
| | - Colin N A Palmer
- Division of Population Health and Genomics, University of Dundee, Ninewells Hospital and Medical School, Dundee, UK
| | - Antonis Pantazis
- Royal Brompton & Harefield Hospitals, Guy's and St. Thomas' NHS Foundation Trust, London, UK
| | - Ole B V Pedersen
- Department of Clinical Medicine, Faculty of Health and Medical Sciences, University of Copenhagen University Hospital, Copenhagen, Denmark
- Department of Clinical Immunology, Zealand University Hospital, Køge, Denmark
| | - Sanjay K Prasad
- National Heart and Lung Institute, Imperial College London, London, UK
- Royal Brompton & Harefield Hospitals, Guy's and St. Thomas' NHS Foundation Trust, London, UK
| | - Akshay Shekhar
- Cardiovascular Research, Regeneron Pharmaceuticals, Tarrytown, NY, USA
| | - Diane T Smelser
- Department of Molecular and Functional Genomics, Geisinger, Danville, PA, USA
| | - Sundararajan Srinivasan
- Division of Population Health and Genomics, University of Dundee, Ninewells Hospital and Medical School, Dundee, UK
| | - Kari Stefansson
- deCODE genetics/Amgen Inc., Reykjavik, Iceland
- Department of Medicine, University of Iceland, Reykjavik, Iceland
| | | | - Petros Syrris
- Institute of Cardiovascular Science, University College London, London, UK
| | - Mari-Liis Tammesoo
- Estonian Genome Center, Institute of Genomics, University of Tartu, Tartu, Estonia
| | - Upasana Tayal
- National Heart and Lung Institute, Imperial College London, London, UK
- Royal Brompton & Harefield Hospitals, Guy's and St. Thomas' NHS Foundation Trust, London, UK
| | - Maris Teder-Laving
- Estonian Genome Center, Institute of Genomics, University of Tartu, Tartu, Estonia
| | - Guðmundur Thorgeirsson
- deCODE genetics/Amgen Inc., Reykjavik, Iceland
- Department of Medicine, University of Iceland, Reykjavik, Iceland
| | - Unnur Thorsteinsdottir
- deCODE genetics/Amgen Inc., Reykjavik, Iceland
- Department of Medicine, University of Iceland, Reykjavik, Iceland
| | | | - David-Alexandre Trégouët
- Laboratory of Excellence GENMED (Medical Genomics), Paris, France
- Univ. Bordeaux, INSERM, BPH, Bordeaux, France
| | - Thomas A Treibel
- Institute of Cardiovascular Science, University College London, London, UK
- Barts Heart Centre, St Bartholomew's Hospital, London, UK
| | | | - Ana M Valdes
- Injury, Recovery and Inflammation Sciences, School of Medicine, University of Nottingham, Nottingham, UK
| | - Jessica van Setten
- Department of Cardiology, University Medical Center Utrecht, Utrecht, the Netherlands
| | - Marion van Vugt
- Department of Cardiology, University Medical Center Utrecht, Utrecht, the Netherlands
| | - Abirami Veluchamy
- Division of Population Health and Genomics, University of Dundee, Ninewells Hospital and Medical School, Dundee, UK
| | - W M Monique Verschuren
- Department Life Course, Lifestyle and Health, Centre for Prevention, Lifestyle and Health, National Institute for Public Health and the Environment, Bilthoven, the Netherlands
- Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht, the Netherlands
| | - Eric Villard
- Sorbonne Research Unit on Cardiovascular Disorders, Metabolism and Nutrition, Team Genomics & Pathophysiology of Cardiovascular Diseases, ICAN Institute for Cardiometabolism and Nutrition, Paris, France
| | - Yifan Yang
- Penn Cardiovascular Institute, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
| | - Folkert W Asselbergs
- Institute of Cardiovascular Science, University College London, London, UK
- The National Institute for Health Research University College London Hospitals Biomedical Research Centre, University College London, London, UK
- Department of Cardiology, Amsterdam University Medical Centers, Amsterdam, the Netherlands
| | - Thomas P Cappola
- Penn Cardiovascular Institute, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
| | - Marie-Pierre Dube
- Montreal Heart Institute, Montreal, Quebec, Canada
- Faculty of Medicine, Université de Montréal, Montreal, Quebec, Canada
| | - Michael E Dunn
- Cardiovascular Research, Regeneron Pharmaceuticals, Tarrytown, NY, USA
| | - Patrick T Ellinor
- Program in Medical and Population Genetics, The Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Cardiac Arrhythmia Service and Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA, USA
| | - Aroon D Hingorani
- Institute of Cardiovascular Science, University College London, London, UK
| | - Chim C Lang
- Division of Molecular & Clinical Medicine, University of Dundee, Ninewells Hospital and Medical School, Dundee, UK
- Tuanku Muhriz Chair, Universiti Kebangsaan Malaysia, Kuala Lumpur, Malaysia
| | - Nilesh J Samani
- Department of Cardiovascular Sciences, University of Leicester and NIHR Leicester Biomedical Research Centre, Glenfield Hospital, Leicester, UK
| | - Svati H Shah
- Department of Medicine, Division of Cardiology, Duke University Medical Center, Durham, NC, USA
- Duke Clinical Research Institute, Durham, NC, USA
- Duke Molecular Physiology Institute, Durham, NC, USA
| | - J Gustav Smith
- Department of Cardiology, Clinical Sciences, Lund University and Skåne University Hospital, Lund, Sweden
- Department of Molecular and Clinical Medicine, Institute of Medicine, Gothenburg University and Sahlgrenska University Hospital, Gothenburg, Sweden
- Wallenberg Center for Molecular Medicine and Lund University Diabetes Center, Lund University, Lund, Sweden
| | - Ramachandran S Vasan
- National Heart, Lung, and Blood Institute's and Boston University's Framingham Heart Study, Framingham, MA, USA
- Sections of Cardiology, Preventive Medicine and Epidemiology, Department of Medicine, Boston University Schools of Medicine and Public Health, Boston, MA, USA
| | | | - Hilma Holm
- deCODE genetics/Amgen Inc., Reykjavik, Iceland
| | - Michela Noseda
- National Heart and Lung Institute, Imperial College London, London, UK
| | - Quinn Wells
- Division of Cardiovascular Medicine, Vanderbilt University, Nashville, TN, USA
| | - James S Ware
- National Heart and Lung Institute, Imperial College London, London, UK.
- MRC Laboratory of Medical Sciences, London, UK.
- Royal Brompton & Harefield Hospitals, Guy's and St. Thomas' NHS Foundation Trust, London, UK.
- Program in Medical and Population Genetics, The Broad Institute of MIT and Harvard, Cambridge, MA, USA.
| | - R Thomas Lumbers
- Institute of Health Informatics, University College London, London, UK.
- Health Data Research UK, University College London, London, UK.
- British Heart Foundation Data Science Centre, London, UK.
|
26
|
Coombes BJ, Sanchez-Ruiz JA, Fennessy B, Pazdernik VK, Adekkanattu P, Nuñez NA, Lepow L, Melhuish Beaupre LM, Ryu E, Talati A, Mann JJ, Weissman MM, Olfson M, Pathak J, Charney AW, Biernacka JM. Clinical associations with treatment resistance in depression: An electronic health record study. Psychiatry Res 2024; 342:116203. [PMID: 39321638 PMCID: PMC11617277 DOI: 10.1016/j.psychres.2024.116203] [Received: 06/25/2024] [Revised: 09/03/2024] [Accepted: 09/15/2024] [Indexed: 09/27/2024]
Abstract
Treatment resistance is common in major depressive disorder (MDD), yet clinical risk factors are not well understood. Using a discovery-replication design, we conducted phenome-wide association studies (PheWASs) of MDD treatment resistance in two electronic health record (EHR)-linked biobanks. The PheWASs included participants with an MDD diagnosis in the EHR and at least one antidepressant (AD) prescription. Participant lifetime diagnoses were mapped to phecodes. PheWASs were conducted for three treatment resistance outcomes based on AD prescription data: number of unique ADs prescribed, ≥1 CE switch, and ≥2 CE switches. Of the 180 phecodes significantly associated with these outcomes in the discovery cohort (n = 12,558), 71 replicated (n = 8,206). In addition to identifying known clinical factors for treatment resistance in MDD, the total number of unique AD prescriptions was associated with additional clinical variables, including irritable bowel syndrome, gastroesophageal reflux disease, symptomatic menopause, and spondylosis. We calculated polygenic risk for specific associated conditions and tested its association with AD outcomes, revealing that genetic risk for many of these conditions is also associated with the total number of unique AD prescriptions. The number of unique ADs prescribed, which is easily assessed in EHRs, provides a more nuanced measure of treatment resistance and may facilitate future research and clinical application in this area.
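The first outcome named in this abstract, the number of unique ADs prescribed, can be derived directly from prescription rows. The following is a minimal sketch and not the authors' pipeline: the record layout (pairs of patient ID and drug name), the function name `unique_ad_counts`, and the example rows are all hypothetical.

```python
from collections import defaultdict


def unique_ad_counts(prescriptions):
    """Count distinct antidepressants per patient.

    `prescriptions` is an iterable of (patient_id, drug_name) pairs.
    This layout is an assumption made for illustration only.
    """
    drugs_by_patient = defaultdict(set)
    for patient_id, drug_name in prescriptions:
        # Normalize case and whitespace so repeat prescriptions of the
        # same drug are counted once.
        drugs_by_patient[patient_id].add(drug_name.strip().lower())
    return {pid: len(drugs) for pid, drugs in drugs_by_patient.items()}


# Hypothetical example rows (not study data).
rows = [
    ("p1", "Sertraline"),
    ("p1", "sertraline "),
    ("p1", "Bupropion"),
    ("p2", "Fluoxetine"),
]
counts = unique_ad_counts(rows)
```

In practice, brand and generic names would also need mapping to a common ingredient before counting; that step is omitted here.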
Affiliation(s)
- Brandon J Coombes
- Department of Quantitative Health Sciences, Mayo Clinic, Rochester, MN, USA.
| | | | - Brian Fennessy
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | | | - Prakash Adekkanattu
- Department of Population Health Sciences, Weill Cornell Medicine, New York, NY, USA; Clinical and Translational Science Center, Weill Cornell Medicine, New York, NY, USA
| | - Nicolas A Nuñez
- Department of Psychiatry & Psychology, Mayo Clinic, Rochester, MN, USA
| | - Lauren Lepow
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | | | - Euijung Ryu
- Department of Quantitative Health Sciences, Mayo Clinic, Rochester, MN, USA
| | - Ardesheer Talati
- Department of Psychiatry, Vagelos College of Physicians and Surgeons Columbia University & NY State Psychiatric Institute, New York, NY, USA
| | - J John Mann
- Department of Psychiatry, Vagelos College of Physicians and Surgeons Columbia University & NY State Psychiatric Institute, New York, NY, USA
| | - Myrna M Weissman
- Department of Psychiatry, Vagelos College of Physicians and Surgeons Columbia University & NY State Psychiatric Institute, New York, NY, USA
| | - Mark Olfson
- Department of Psychiatry, Vagelos College of Physicians and Surgeons Columbia University & NY State Psychiatric Institute, New York, NY, USA
| | - Jyotishman Pathak
- Department of Population Health Sciences, Weill Cornell Medicine, New York, NY, USA; Department of Psychiatry, Weill Cornell Medicine, New York, NY, USA
| | - Alexander W Charney
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA; Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA; Mount Sinai Clinical Intelligence Center, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Joanna M Biernacka
- Department of Quantitative Health Sciences, Mayo Clinic, Rochester, MN, USA; Department of Psychiatry & Psychology, Mayo Clinic, Rochester, MN, USA.
|
27
|
Goldstein JA, Gernand AD, Gallagher K, Shanes ED, Bebell LM, Yee LM. Defining Appropriate Comparator Populations for Placental Pathology for Pregnant People With HIV. Int J Surg Pathol 2024:10668969241295351. [PMID: 39552457 DOI: 10.1177/10668969241295351] [Indexed: 11/19/2024]
Abstract
Background. Widespread adoption of antiretroviral therapy has reduced perinatal transmission of HIV; however, people living with HIV (PWH) have higher rates of preterm birth and hypertensive disorders of pregnancy. The placenta is the critical fetal support organ in pregnancy, and multiple investigations have sought associations in PWH between HIV and placental pathology. However, results have been inconclusive. We posit that selection of control group populations influences the apparent anomalies in placentas from PWH and examined the differences seen between these placentas and those of four comparator populations. Methods. Placentas from PWH were compared with those from all patients without HIV, controls from a recent study of severe acute respiratory syndrome coronavirus 2 in pregnancy, patients with a history of melanoma (an indication for examination relatively orthogonal to other problems in pregnancy), and patients paired with PWH using propensity score matching. Results. People living with HIV differ in demographics and comorbidities from comparator groups other than propensity score-matched patients. Placentas from PWH had higher rates of acute placental inflammation, including maternal inflammatory response and fetal inflammatory response, than multiple comparator groups. Placentas from PWH had lower rates of chronic placental inflammation than three of four comparator groups, including the largest comparator group and the group matched to PWH using propensity scores. Conclusion. Differences in placental pathology in PWH depend on the comparator group. Commonly used comparator groups have significantly different demographic and comorbidity profiles, suggesting they are inappropriate comparators for PWH. Propensity score matching may be useful in identifying comparator populations.
Affiliation(s)
- Jeffery A Goldstein
- Department of Pathology, Northwestern University, Feinberg School of Medicine, Chicago, IL, USA
| | - Alison D Gernand
- Department of Nutritional Sciences, Penn State College of Health and Human Development, University Park, PA, USA
| | - Kelly Gallagher
- Ross and Carol Nese College of Nursing, Penn State College of Health and Human Development, University Park, PA, USA
| | - Elisheva D Shanes
- Department of Pathology, Northwestern University, Feinberg School of Medicine, Chicago, IL, USA
| | | | - Lynn M Yee
- Department of Obstetrics and Gynecology, Northwestern University, Feinberg School of Medicine, Chicago, IL, USA
|
28
|
Hou Y, Cui E, Ikramuddin S, Zhang R. Association of Physical Activity from Wearable Devices and Chronic Disease Risk: Insights from the All of Us Research Program. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2024:2024.11.11.24317124. [PMID: 39606327 PMCID: PMC11601689 DOI: 10.1101/2024.11.11.24317124] [Indexed: 11/29/2024]
Abstract
Background. Physical activity is widely recognized as a key modifiable factor for reducing the risk of chronic diseases. Wearable devices such as Fitbit offer a unique opportunity to objectively measure physical activity metrics, providing insights into the association between different types of physical activity and chronic disease risk. Objective. This study aims to examine the association between physical activity metrics derived from Fitbit devices and the incidence of various chronic diseases among participants from the All of Us Research Program. Methods. Physical activity metrics included daily steps, elevation gain, and activity durations at different intensities (e.g., very active, lightly active, and sedentary). Cox proportional hazards models and multiple regression models were used to assess the relationship between these metrics and the incidence of chronic diseases represented by phecodes. Age, sex, and body mass index (BMI) were included as covariates. Results. A total of 15,538 participants provided Fitbit activity data, of whom 9,320 also had electronic health records (EHR). Increased daily step count, elevation gain, and very active minutes were significantly associated with a reduced risk of several chronic conditions, including obesity, type 2 diabetes, and major depressive disorder. Conversely, increased sedentary time was linked to higher risks for conditions such as obesity, type 2 diabetes, and essential hypertension. Multiple regression analyses confirmed these associations. Conclusion. Our findings highlight the beneficial effects of increased physical activity, particularly daily steps and elevation gain, on reducing the risk of chronic diseases. Conversely, sedentary behavior remains a significant risk factor for the development of several conditions. These insights may inform personalized activity recommendations aimed at reducing disease burden and improving population health outcomes.
|
29
|
Breeyear JH, Mitchell SL, Nealon CL, Hellwege JN, Charest B, Khakharia A, Halladay CW, Yang J, Garriga GA, Wilson OD, Basnet TB, Hung AM, Reaven PD, Meigs JB, Rhee MK, Sun Y, Lynch MG, Sobrin L, Brantley MA, Sun YV, Wilson PW, Iyengar SK, Peachey NS, Phillips LS, Edwards TL, Giri A. Development of electronic health record based algorithms to identify individuals with diabetic retinopathy. J Am Med Inform Assoc 2024; 31:2560-2570. [PMID: 39158361 PMCID: PMC11491608 DOI: 10.1093/jamia/ocae213] [Received: 01/08/2024] [Revised: 07/17/2024] [Accepted: 07/30/2024] [Indexed: 08/20/2024] Open
Abstract
OBJECTIVES To develop, validate, and implement algorithms to identify diabetic retinopathy (DR) cases and controls from electronic health records (EHRs). MATERIALS AND METHODS We developed and validated EHR-based algorithms to identify DR cases and individuals with type I or II diabetes without DR (controls) in 3 independent EHR systems: Vanderbilt University Medical Center Synthetic Derivative (VUMC), the VA Northeast Ohio Healthcare System (VANEOHS), and Massachusetts General Brigham (MGB). Cases were required to meet 1 of the following 3 criteria: (1) 2 or more dates with any DR ICD-9/10 code documented in the EHR, (2) at least one affirmative health-factor or EPIC code for DR along with an ICD-9/10 code for DR on a different day, or (3) at least one ICD-9/10 code for any DR occurring within 24 hours of an ophthalmology examination. Criteria for controls included affirmative evidence for diabetes as well as an ophthalmology examination. RESULTS The algorithms, developed and evaluated in VUMC through manual chart review, resulted in a positive predictive value (PPV) of 0.93 for cases and negative predictive value (NPV) of 0.91 for controls. Implementation of the algorithms yielded similar metrics in VANEOHS (PPV = 0.94; NPV = 0.86) and lower metrics in MGB (PPV = 0.84; NPV = 0.76). In comparison, the algorithm for DR implemented in phenome-wide association study (PheWAS) in VUMC yielded a similar PPV (0.92) but a substantially lower NPV (0.48). Implementation of the algorithms in the Million Veteran Program identified over 62 000 DR cases with genetic data, including 14 549 African Americans and 6209 Hispanics with DR. CONCLUSIONS/DISCUSSION We demonstrate the robustness of the algorithms at 3 separate healthcare centers, with a minimum PPV of 0.84 and a substantially better NPV than existing automated methods. We strongly encourage independent validation and incorporation of features unique to each EHR to enhance algorithm performance for DR cases and controls.
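The three case criteria listed in this abstract can be read as a simple rule set. The sketch below is an illustration under stated assumptions, not the published algorithm: the `PatientRecord` layout, its field names, and the helper names are hypothetical, and real EHR data would need site-specific code lists that the abstract does not give.

```python
from dataclasses import dataclass, field
from datetime import date, datetime, timedelta


@dataclass
class PatientRecord:
    # Hypothetical layout: timestamps of any DR ICD-9/10 code.
    dr_icd_times: list = field(default_factory=list)
    # Hypothetical layout: dates of affirmative health-factor or EPIC codes for DR.
    dr_flag_dates: list = field(default_factory=list)
    # Hypothetical layout: timestamps of ophthalmology examinations.
    eye_exam_times: list = field(default_factory=list)


def is_dr_case(rec: PatientRecord) -> bool:
    """Return True if the record meets at least 1 of the 3 case criteria."""
    icd_dates = {t.date() for t in rec.dr_icd_times}
    # (1) DR ICD-9/10 codes on 2 or more distinct dates.
    two_dates = len(icd_dates) >= 2
    # (2) An affirmative DR flag plus a DR ICD code on a different day.
    flag_plus_icd = any(f != d for f in rec.dr_flag_dates for d in icd_dates)
    # (3) A DR ICD code within 24 hours of an ophthalmology examination.
    near_exam = any(
        abs(t - e) <= timedelta(hours=24)
        for t in rec.dr_icd_times
        for e in rec.eye_exam_times
    )
    return two_dates or flag_plus_icd or near_exam


def ppv(true_pos: int, false_pos: int) -> float:
    """Positive predictive value: confirmed cases / all flagged cases."""
    return true_pos / (true_pos + false_pos)


def npv(true_neg: int, false_neg: int) -> float:
    """Negative predictive value: confirmed controls / all flagged controls."""
    return true_neg / (true_neg + false_neg)
```

For example, a record with a single DR code and an eye examination six hours later would satisfy criterion (3) only, while a record with a single code and no other evidence would not qualify as a case.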
Affiliation(s)
- Joseph H Breeyear
- Division of Epidemiology, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN 37232, United States
- VA Tennessee Valley Healthcare System (626), Nashville, TN 37212, United States
| | - Sabrina L Mitchell
- VA Tennessee Valley Healthcare System (626), Nashville, TN 37212, United States
- Department of Ophthalmology and Visual Sciences, Vanderbilt University Medical Center, Nashville, TN 37232, United States
| | - Cari L Nealon
- Eye Clinic, VA Northeast Ohio Healthcare System, Cleveland, OH 44106, United States
| | - Jacklyn N Hellwege
- VA Tennessee Valley Healthcare System (626), Nashville, TN 37212, United States
- Vanderbilt Genetics Institute, Vanderbilt University, Nashville, TN 37232, United States
- Division of Genetic Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN 37232, United States
| | - Brian Charest
- Massachusetts Veterans Epidemiology Research and Information Center (MAVERIC), VA Boston Healthcare System, Boston, MA 02111, United States
| | - Anjali Khakharia
- VA Atlanta Healthcare System, Decatur, GA 30033, United States
- Department of Medicine and Geriatrics, Emory University School of Medicine, Atlanta, GA 30307, United States
| | | | - Janine Yang
- Department of Ophthalmology, Mass Eye and Ear Infirmary, Harvard Medical School, Boston, MA 02114, United States
| | - Gustavo A Garriga
- Division of Quantitative and Clinical Sciences, Department of Obstetrics and Gynecology, Vanderbilt University Medical Center, Nashville, TN 37232, United States
| | - Otis D Wilson
- VA Tennessee Valley Healthcare System (626), Nashville, TN 37212, United States
- Division of Nephrology and Hypertension, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN 37232, United States
| | - Til B Basnet
- VA Tennessee Valley Healthcare System (626), Nashville, TN 37212, United States
- Vanderbilt Genetics Institute, Vanderbilt University, Nashville, TN 37232, United States
- Division of Quantitative and Clinical Sciences, Department of Obstetrics and Gynecology, Vanderbilt University Medical Center, Nashville, TN 37232, United States
| | - Adriana M Hung
- VA Tennessee Valley Healthcare System (626), Nashville, TN 37212, United States
- Division of Nephrology and Hypertension, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN 37232, United States
| | - Peter D Reaven
- Phoenix VA Health Care System, Phoenix, AZ 85012, United States
- College of Medicine, University of Arizona, Phoenix, AZ 85721, United States
| | - James B Meigs
- Program in Medical and Population Genetics, Broad Institute, Cambridge, MA 02142, United States
- Department of Medicine, Harvard Medical School, Boston, MA 02115, United States
- Division of General Internal Medicine, Massachusetts General Hospital, Boston, MA 02114, United States
| | - Mary K Rhee
- VA Atlanta Healthcare System, Decatur, GA 30033, United States
- Division of Endocrinology, Metabolism, and Lipids, Department of Medicine, Emory University School of Medicine, Atlanta, GA 30307, United States
| | - Yang Sun
- Department of Ophthalmology, Stanford University School of Medicine, Palo Alto, CA 94305, United States
| | - Mary G Lynch
- VA Atlanta Healthcare System, Decatur, GA 30033, United States
| | - Lucia Sobrin
- Department of Ophthalmology, Mass Eye and Ear Infirmary, Harvard Medical School, Boston, MA 02114, United States
| | - Milam A Brantley
- VA Tennessee Valley Healthcare System (626), Nashville, TN 37212, United States
- Department of Ophthalmology and Visual Sciences, Vanderbilt University Medical Center, Nashville, TN 37232, United States
- Vanderbilt Genetics Institute, Vanderbilt University, Nashville, TN 37232, United States
| | - Yan V Sun
- VA Atlanta Healthcare System, Decatur, GA 30033, United States
- Department of Epidemiology, Emory University Rollins School of Public Health, Atlanta, GA 30307, United States
- Department of Biomedical Informatics, Emory University School of Medicine, Atlanta, GA 30307, United States
| | - Peter W Wilson
- VA Atlanta Healthcare System, Decatur, GA 30033, United States
- Division of Cardiology, Department of Medicine, Emory University School of Medicine, Atlanta, GA 30307, United States
| | - Sudha K Iyengar
- Research Service, VA Northeast Ohio Healthcare System, Cleveland, OH 44106, United States
- Department of Population and Quantitative Health Sciences, Case Western Reserve University School of Medicine, Cleveland, OH 44106, United States
| | - Neal S Peachey
- Research Service, VA Northeast Ohio Healthcare System, Cleveland, OH 44106, United States
- Cole Eye Institute, Cleveland Clinic, Cleveland, OH 44106, United States
- Department of Ophthalmology, Cleveland Clinic Lerner College of Medicine of Case Western Reserve University, Cleveland, OH 44195, United States
| | - Lawrence S Phillips
- VA Atlanta Healthcare System, Decatur, GA 30033, United States
- Division of Endocrinology, Metabolism, and Lipids, Department of Medicine, Emory University School of Medicine, Atlanta, GA 30307, United States
| | - Todd L Edwards
- Division of Epidemiology, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN 37232, United States
- VA Tennessee Valley Healthcare System (626), Nashville, TN 37212, United States
| | - Ayush Giri
- Division of Epidemiology, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN 37232, United States
- VA Tennessee Valley Healthcare System (626), Nashville, TN 37212, United States
- Vanderbilt Genetics Institute, Vanderbilt University, Nashville, TN 37232, United States
- Division of Quantitative and Clinical Sciences, Department of Obstetrics and Gynecology, Vanderbilt University Medical Center, Nashville, TN 37232, United States
|
30
|
Gaitanidis A, Christensen MA, Breen KA, Kambadakone AR, Joshipura ND, Fernandez-Del Castillo C, Hernandez-Barco YG, Kaafarani HMA, Velmahos GC, Farhat MR, Fagenholz PJ. A Genome-Wide Association Study Reveals a Novel Susceptibility Locus for Pancreas Divisum at 3q29. J Surg Res 2024; 303:287-294. [PMID: 39393116 DOI: 10.1016/j.jss.2024.09.028] [Received: 08/13/2023] [Revised: 05/17/2024] [Accepted: 09/12/2024] [Indexed: 10/13/2024]
Abstract
INTRODUCTION Pancreas divisum (PD) is a common congenital anomaly of the pancreas, but its genetic basis remains unknown. The purpose of this genome-wide association study was to identify genetic loci associated with PD. METHODS Using the Mass General Brigham Biobank, patients diagnosed with PD were identified. Quality control and imputation were performed using standard approaches. Single nucleotide polymorphisms (SNPs) with minor allele frequency (MAF) ≥ 5% were tested for association with PD using mixed linear model-based association analysis. The significance threshold was set at 5 × 10⁻⁸. RESULTS A total of 13,940 subjects were included, of whom 251 (1.8%) were diagnosed with PD. A genetic locus in chromosome 3q29 was found to be associated with PD (lead SNP rs3850646, MAF in PD = 34.6% vs. MAF in controls = 26.4%, beta = 0.0106, P = 1.47 × 10⁻⁸). The identified locus is located in the phosphatidylinositol glycan anchor biosynthesis class X and p21 activated kinase 2 genes. The heritability of PD was estimated at 27.5%. Expression quantitative trait loci and chromatin interaction analyses found 12 genes whose expression may be regulated by SNPs in this genomic locus. CONCLUSIONS The results of this study suggest that a genetic locus at 3q29 is associated with PD. This locus is in the phosphatidylinositol glycan anchor biosynthesis class X and p21 activated kinase 2 genes. Twelve candidate genes were identified whose expression may be regulated by this locus. These findings may help us understand both normal and aberrant pancreatic development and may aid in clinical evaluation and genetic counseling of patients with PD and associated diseases, such as acute pancreatitis.
Affiliation(s)
- Apostolos Gaitanidis
- Division of Trauma, Emergency Surgery and Surgical Critical Care, Massachusetts General Hospital, Boston, Massachusetts
| | - Mathias A Christensen
- Division of Trauma, Emergency Surgery and Surgical Critical Care, Massachusetts General Hospital, Boston, Massachusetts; Department of Anesthesia, Center of Head and Orthopedics, Copenhagen University Hospital, Rigshospitalet, Copenhagen, Denmark
| | - Kerry A Breen
- Division of Trauma, Emergency Surgery and Surgical Critical Care, Massachusetts General Hospital, Boston, Massachusetts
| | - Avinash R Kambadakone
- Department of Radiology, Abdominal Imaging, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts
| | - Nencyben D Joshipura
- Department of Radiology, Abdominal Imaging, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts
| | | | - Yasmin G Hernandez-Barco
- Division of Gastroenterology, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts
| | - Haytham M A Kaafarani
- Division of Trauma, Emergency Surgery and Surgical Critical Care, Massachusetts General Hospital, Boston, Massachusetts
| | - George C Velmahos
- Division of Trauma, Emergency Surgery and Surgical Critical Care, Massachusetts General Hospital, Boston, Massachusetts
| | - Maha R Farhat
- Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts; Pulmonary and Critical Care Medicine, Massachusetts General Hospital, Boston, Massachusetts
| | - Peter J Fagenholz
- Division of Trauma, Emergency Surgery and Surgical Critical Care, Massachusetts General Hospital, Boston, Massachusetts.
|
31
|
Mak JKL, Chau YP, Tan KCB, Kung AWC, Cheung CL. Phenome-Wide Analysis of Coffee Intake on Health over 20 Years of Follow-Up Among Adults in Hong Kong Osteoporosis Study. Nutrients 2024; 16:3536. [PMID: 39458530 PMCID: PMC11509949 DOI: 10.3390/nu16203536] [Received: 09/25/2024] [Revised: 10/11/2024] [Accepted: 10/16/2024] [Indexed: 10/28/2024] Open
Abstract
BACKGROUND/OBJECTIVES There has been limited evidence on the long-term impacts of coffee intake on health. We aimed to investigate the association between coffee intake and the incidence of diseases and mortality risk over 20 years among community-dwelling Chinese adults. METHODS Participants were from the Hong Kong Osteoporosis Study who attended baseline assessments during 1995-2010. Coffee intake was self-reported through a previously validated food frequency questionnaire. Disease diagnoses, which were mapped into 1795 distinct phecodes, and mortality data were obtained from linkage with territory-wide electronic health records. Cox models were used to estimate the association between coffee intake and the incidence of each disease outcome and mortality among individuals without a history of the respective medical condition at baseline. All models were adjusted for age, sex, body mass index, smoking, alcohol drinking, and education. RESULTS Among the 7420 included participants (mean age 53.2 years, 72.2% women), 54.0% were non-coffee drinkers, and only 2.7% consumed more than one cup of coffee per day. Over a median follow-up of 20.0 years, any coffee intake was associated with a reduced risk of dementia, atrial fibrillation, painful respirations, infections, atopic dermatitis, and dizziness at a false discovery rate (FDR) of <0.05. Furthermore, any coffee intake was associated with an 18% reduced risk of all-cause mortality (95% confidence interval for the hazard ratio = 0.73-0.93). CONCLUSION In a population with relatively low coffee consumption, any coffee intake is linked to a lower risk of several neurological, circulatory, and respiratory diseases and symptoms, as well as mortality.
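Screening 1795 phecodes at a false discovery rate below 0.05 requires a multiple-testing adjustment. The abstract does not say which FDR procedure was used; the sketch below assumes the Benjamini-Hochberg step-up procedure, a common choice, purely for illustration. The function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(pvalues, q=0.05):
    """Return a list of booleans marking which hypotheses are rejected
    under the Benjamini-Hochberg step-up procedure at FDR level q.

    Assumption: this is one standard FDR method, not necessarily the
    one used in the study summarized above.
    """
    m = len(pvalues)
    # Indices of the p-values in ascending order.
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Find the largest rank k such that p_(k) <= (k / m) * q.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank / m * q:
            cutoff_rank = rank
    # Reject every hypothesis ranked at or below the cutoff.
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values (not study data).
flags = benjamini_hochberg([0.001, 0.02, 0.03, 0.5], q=0.05)
```

With four tests, the per-rank thresholds are 0.0125, 0.025, 0.0375, and 0.05, so the first three hypothetical p-values are rejected and the fourth is not.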
Affiliation(s)
- Jonathan K. L. Mak
- Department of Pharmacology and Pharmacy, The University of Hong Kong, Hong Kong, China; (J.K.L.M.); (Y.-P.C.)
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, 17177 Stockholm, Sweden
| | - Yin-Pan Chau
- Department of Pharmacology and Pharmacy, The University of Hong Kong, Hong Kong, China; (J.K.L.M.); (Y.-P.C.)
| | - Kathryn Choon-Beng Tan
- Department of Medicine, School of Clinical Medicine, The University of Hong Kong, Hong Kong, China; (K.C.-B.T.); (A.W.-C.K.)
| | - Annie Wai-Chee Kung
- Department of Medicine, School of Clinical Medicine, The University of Hong Kong, Hong Kong, China; (K.C.-B.T.); (A.W.-C.K.)
| | - Ching-Lung Cheung
- Department of Pharmacology and Pharmacy, The University of Hong Kong, Hong Kong, China; (J.K.L.M.); (Y.-P.C.)
- Laboratory of Data Discovery for Health (D4H), Hong Kong Science Park, Pak Shek Kok, Hong Kong, China
|
32
|
Friedman SF, Moran GE, Rakic M, Phillipakis A. Genetic Architectures of Medical Images Revealed by Registration of Multiple Modalities. Bioinform Biol Insights 2024; 18:11779322241282489. [PMID: 39372505 PMCID: PMC11450573 DOI: 10.1177/11779322241282489] [Received: 01/16/2024] [Accepted: 08/16/2024] [Indexed: 10/08/2024] Open
Abstract
The advent of biobanks with vast quantities of medical imaging and paired genetic measurements creates huge opportunities for a new generation of genotype-phenotype association studies. However, disentangling biological signals from the many sources of bias and artifacts remains difficult. Using diverse medical images and time-series (ie, magnetic resonance imagings [MRIs], electrocardiograms [ECGs], and dual-energy X-ray absorptiometries [DXAs]), we show how registration, both spatial and temporal, guided by domain knowledge or learned de novo, helps uncover biological information. A multimodal autoencoder comparison framework quantifies and characterizes how registration affects the representations that unsupervised and self-supervised encoders learn. In this study we (1) train autoencoders before and after registration with nine diverse types of medical image, (2) demonstrate how neural network-based methods (VoxelMorph, DeepCycle, and DropFuse) can effectively learn registrations, allowing for more flexible and efficient processing than is possible with hand-crafted registration techniques, and (3) conduct exhaustive phenotypic screening, comprising millions of statistical tests, to quantify how registration affects the generalizability of learned representations. Genome- and phenome-wide association studies (GWAS and PheWAS) uncover significantly more associations with registered modality representations than with equivalently trained and sized representations learned from native coordinate spaces. Specifically, registered PheWAS yielded 61 more disease associations for ECGs, 53 more disease associations for cardiac MRIs, and 10 more disease associations for brain MRIs. Registration also yields significant increases in the coefficient of determination when regressing continuous phenotypes (eg, 0.36 ± 0.01 with ECGs and 0.11 ± 0.02 for DXA scans). Our findings reveal the crucial role registration plays in enhancing the characterization of physiological states across a broad range of medical imaging data types. Importantly, this finding extends to more flexible types of registration, such as the cross-modal and the circular mapping methods presented here.
Affiliation(s)
| | | | - Marianne Rakic
- Computer Science and Artificial Intelligence Laboratory, MIT, Cambridge, MA, USA
| | - Anthony Phillipakis
- Data Sciences Platform, Broad Institute of MIT and Harvard, Cambridge, MA, USA
|
33
|
Tahir UA, Barber JL, Cruz DE, Kars ME, Deng S, Tuftin B, Gillman MG, Benson MD, Robbins JM, Chen ZZ, Rao P, Katz DH, Farrell L, Sofer T, Hall ME, Ekunwe L, Tracy RP, Durda P, Taylor KD, Liu Y, Johnson WC, Guo X, Chen YDI, Manichaikul AW, Jain D, NHLBI Trans-Omics for Precision Medicine Consortium, Wang TJ, Reiner AP, Natarajan P, Itan Y, Rich SS, Rotter JI, Wilson JG, Raffield LM, Gerszten RE. Proteogenomic analysis integrated with electronic health records data reveals disease-associated variants in Black Americans. J Clin Invest 2024; 134:e181802. [PMID: 39316441 PMCID: PMC11527441 DOI: 10.1172/jci181802] [Received: 04/12/2024] [Accepted: 09/11/2024] [Indexed: 09/26/2024] Open
Abstract
BACKGROUND Most GWAS of plasma proteomics have focused on White individuals of European ancestry, limiting biological insight from other ancestry-enriched protein quantitative trait loci (pQTLs). METHODS We conducted a discovery GWAS of approximately 3,000 plasma proteins measured by the antibody-based Olink platform in 1,054 Black adults from the Jackson Heart Study (JHS) and validated our findings in the Multi-Ethnic Study of Atherosclerosis (MESA). The genetic architecture of identified pQTLs was further explored through fine mapping and admixture association analysis. Finally, using our pQTL findings, we performed a phenome-wide association study (PheWAS) across 2 large multiethnic electronic health record (EHR) systems in All of Us and BioMe. RESULTS We identified 1,002 pQTLs for 925 protein assays. Fine mapping and admixture analyses suggested allelic heterogeneity of the plasma proteome across diverse populations. We identified associations for variants enriched in African ancestry, many in diseases that lack precise biomarkers, including cis-pQTLs for cathepsin L (CTSL) and Siglec-9, which were linked with sarcoidosis and non-Hodgkin's lymphoma, respectively. We found concordant associations across clinical diagnoses and laboratory measurements, elucidating disease pathways, including a cis-pQTL associated with circulating CD58, WBC count, and multiple sclerosis. CONCLUSIONS Our findings emphasize the value of leveraging diverse populations to enhance biological insights from proteomics GWAS, and we have made this resource readily available as an interactive web portal. FUNDING NIH K08 HL161445-01A1; 5T32HL160522-03; HHSN268201600034I; HL133870.
Affiliation(s)
- Usman A. Tahir
- Division of Cardiovascular Medicine, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, Massachusetts, USA
- Jacob L. Barber
- Division of Cardiovascular Medicine, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, Massachusetts, USA
- Daniel E. Cruz
- Division of Cardiovascular Medicine, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, Massachusetts, USA
- Shuliang Deng
- Division of Cardiovascular Medicine, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, Massachusetts, USA
- Madeline G. Gillman
- University of North Carolina School of Medicine, Raleigh, North Carolina, USA
- Mark D. Benson
- Division of Cardiovascular Medicine, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, Massachusetts, USA
- Jeremy M. Robbins
- Division of Cardiovascular Medicine, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, Massachusetts, USA
- Zsu-Zsu Chen
- Division of Cardiovascular Medicine, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, Massachusetts, USA
- Prashant Rao
- Division of Cardiovascular Medicine, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, Massachusetts, USA
- Laurie Farrell
- Division of Cardiovascular Medicine, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, Massachusetts, USA
- Tamar Sofer
- Division of Cardiovascular Medicine, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, Massachusetts, USA
- Michael E. Hall
- University of Mississippi Medical Center, Jackson, Mississippi, USA
- Lynette Ekunwe
- University of Mississippi Medical Center, Jackson, Mississippi, USA
- Russell P. Tracy
- Department of Pathology Laboratory Medicine, Larner College of Medicine, University of Vermont, Burlington, Vermont, USA
- Peter Durda
- Department of Pathology Laboratory Medicine, Larner College of Medicine, University of Vermont, Burlington, Vermont, USA
- Kent D. Taylor
- The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation, Torrance, California, USA
- Yongmei Liu
- Department of Medicine, Division of Cardiology, Duke Molecular Physiology Institute, Duke University Medical Center, Durham, North Carolina, USA
- W. Craig Johnson
- Department of Biostatistics, University of Washington, Seattle, Washington, USA
- Xiuqing Guo
- The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation, Torrance, California, USA
- Yii-Der Ida Chen
- The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation, Torrance, California, USA
- Ani W. Manichaikul
- Center for Public Health Genomics and Division of Biostatistics and Epidemiology, Department of Public Health Sciences, University of Virginia, Charlottesville, Virginia, USA
- Deepti Jain
- University of Washington, Seattle, Washington
- Thomas J. Wang
- Department of Medicine, UT Southwestern Medical Center, Dallas, Texas, USA
- Pradeep Natarajan
- Cardiovascular Research Center, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts, USA
- Broad Institute of Harvard and MIT, Cambridge, Massachusetts, USA
- Yuval Itan
- University of North Carolina School of Medicine, Raleigh, North Carolina, USA
- Jerome I. Rotter
- The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation, Torrance, California, USA
- James G. Wilson
- Division of Cardiovascular Medicine, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, Massachusetts, USA
- Laura M. Raffield
- University of North Carolina School of Medicine, Raleigh, North Carolina, USA
- Robert E. Gerszten
- Division of Cardiovascular Medicine, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, Massachusetts, USA
- Broad Institute of Harvard and MIT, Cambridge, Massachusetts, USA
34
Xian S, Grabowska ME, Kullo IJ, Luo Y, Smoller JW, Wei WQ, Jarvik G, Mooney S, Crosslin D. Language-model-based patient embedding using electronic health records facilitates phenotyping, disease forecasting, and progression analysis. RESEARCH SQUARE 2024:rs.3.rs-4708839. [PMID: 39399661 PMCID: PMC11469380 DOI: 10.21203/rs.3.rs-4708839/v1] [Indexed: 10/15/2024]
Abstract
Current studies regarding the secondary use of electronic health records (EHR) predominantly rely on domain expertise and existing medical knowledge. Though significant efforts have been devoted to investigating the application of machine learning algorithms in the EHR, efficient and powerful representation of patients is needed to unleash the potential of discovering new medical patterns underlying the EHR. Here, we present an unsupervised method for embedding high-dimensional EHR data at the patient level, aimed at characterizing patient heterogeneity in complex diseases and identifying new disease patterns associated with clinical outcome disparities. Inspired by the architecture of modern language models, specifically transformers with attention mechanisms, we use patient diagnosis and procedure codes as vocabularies and treat each patient as a sentence to perform the patient embedding. We applied this approach to 34,851 unique medical codes across 1,046,649 longitudinal patient events, including 102,739 patients from the electronic Medical Records and GEnomics (eMERGE) Network. The resulting patient vectors demonstrated excellent performance in predicting future disease events (median AUROC = 0.87 within one year) and bulk phenotyping (median AUROC = 0.84). We then illustrated the utility of these patient vectors in revealing heterogeneous comorbidity patterns, exemplified by disease subtypes in colorectal cancer and systemic lupus erythematosus, and capturing distinct longitudinal disease trajectories. External validation using EHR data from the University of Washington confirmed robust model performance, with median AUROCs of 0.83 and 0.84 for bulk phenotyping tasks and disease onset prediction, respectively. Importantly, the model reproduced the clustering results of disease subtypes identified in the eMERGE cohort and uncovered variations in overall mortality among these subtypes.
Together, these results underscore the potential of representation learning in EHRs to enhance patient characterization and associated clinical outcomes, thereby advancing disease forecasting and facilitating personalized medicine.
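The "codes as vocabulary, patient as sentence" idea in the abstract above can be illustrated with a minimal Python sketch. It covers only the tokenization step that would precede any transformer; the toy records, the function names `build_vocab` and `encode`, and the padding scheme are all assumptions made for illustration, not the authors' pipeline.

```python
from collections import Counter

# Hypothetical toy records: each patient is an ordered list of diagnosis/procedure codes.
patients = {
    "p1": ["E11.9", "I10", "E11.9"],
    "p2": ["I10", "M32.9"],
}

def build_vocab(records):
    """Map each distinct code to an integer id; id 0 is reserved for padding."""
    counts = Counter(code for codes in records.values() for code in codes)
    # Order by descending frequency, then alphabetically, so ids are stable.
    ordered = sorted(counts, key=lambda c: (-counts[c], c))
    return {code: i + 1 for i, code in enumerate(ordered)}

def encode(codes, vocab, max_len):
    """Turn one patient's code sequence into a fixed-length list of token ids."""
    ids = [vocab[c] for c in codes][:max_len]
    return ids + [0] * (max_len - len(ids))

vocab = build_vocab(patients)
encoded = {pid: encode(codes, vocab, max_len=4) for pid, codes in patients.items()}
```

Each encoded list plays the role of a tokenized sentence; an embedding model would consume these ids in place of word tokens.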
Affiliation(s)
- Su Xian
- Department of Biomedical Informatics and Medical Education, University of Washington, Seattle, WA
- Monika E Grabowska
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN
- Iftikhar J Kullo
- Department of Cardiovascular Medicine and the Gonda Vascular Center, Mayo Clinic Rochester Minnesota
- Yuan Luo
- Department of Preventive Medicine, Northwestern University Feinberg School of Medicine
- Jordan W Smoller
- Psychiatric and Neurodevelopmental Genetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA
- Center for Precision Psychiatry, Department of Psychiatry, Massachusetts General Hospital, Boston, MA
- Wei-Qi Wei
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN
- Gail Jarvik
- Department of Medicine, Division of Medical Genetics, University of Washington, Seattle, WA
- Sean Mooney
- Center for Information Technology, National Institutes of Health
- David Crosslin
- Department of Medicine, Division of Biomedical Informatics and Genomics, Tulane University, New Orleans, LA
35
Justice A, Kelly MA, Bellus G, Green JD, Zaidi R, Kerrins T, Josyula N, Luperchio TR, Kozel BA, Williams MS. Phenotypic Findings Associated with Variation in Elastin. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2024:2024.09.10.24313340. [PMID: 39314928 PMCID: PMC11419209 DOI: 10.1101/2024.09.10.24313340] [Indexed: 09/25/2024]
Abstract
Variation in the elastin gene (ELN) may contribute to connective tissue disease beyond the known disease associations of Supravalvar Aortic Stenosis and Cutis Laxa. Exome data from MyCode Community Health Initiative participants were analyzed for ELN rare variants (mean allele frequency <1%, not currently annotated as benign). Participants with variants of interest underwent phenotyping by dual chart review using a standardized abstraction tool. Additionally, all rare variants that met inclusion criteria were collapsed into an ELN gene burden score to perform a Phenome-wide Association Study (PheWAS). Two hundred and ninety-six eligible participants with relevant ELN variants were identified from 184,293 MyCode participants. One hundred and three of 254 living participants (41%) met phenotypic criteria, most commonly aortic hypoplasia, arterial dilation, aneurysm, and dissection, and connective tissue abnormalities. ELN variation was significantly (P < 2.8 × 10⁻⁵) associated with "arterial dissection" in the PheWAS, and two connective tissue Phecodes approached significance. Variation in ELN is associated with connective tissue pathology beyond classic phenotypes. eTOC Blurb: Carriers of variants of interest in the elastin gene (ELN) were evaluated for presence of findings that could be associated with the variation. Chart review and Phenome-wide Association Studies were used. Results are consistent with variation in ELN being associated with findings affecting elastic tissues beyond classic phenotypes.
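Collapsing rare variants into a gene burden score, as described in the abstract above, can be sketched as follows. The abstract does not say how the score was coded, so the binary carrier coding here is an assumption, as are the toy data and the helper names `qualifying_variants` and `burden_scores`.

```python
# Hypothetical genotype calls: participant -> {variant_id: alternate-allele count}.
genotypes = {
    "s1": {"v1": 0, "v2": 1, "v3": 0},
    "s2": {"v1": 0, "v2": 0, "v3": 0},
    "s3": {"v1": 1, "v2": 0, "v3": 2},
}
# Hypothetical allele frequencies and benign annotations for each variant.
allele_freq = {"v1": 0.002, "v2": 0.0005, "v3": 0.2}
benign = {"v1": True, "v2": False, "v3": False}

def qualifying_variants(freqs, benign_flags, max_freq=0.01):
    """Keep variants below the frequency cutoff that are not annotated as benign."""
    return {v for v, f in freqs.items() if f < max_freq and not benign_flags[v]}

def burden_scores(calls, variants):
    """Assumed binary burden: 1 if a participant carries any qualifying variant, else 0."""
    return {s: int(any(g.get(v, 0) > 0 for v in variants)) for s, g in calls.items()}

rare = qualifying_variants(allele_freq, benign)
scores = burden_scores(genotypes, rare)
```

In a PheWAS, a score like this would be the single predictor tested against each phenotype code in turn.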
36
Yang L, Sadler MC, Altman RB. Genetic association studies using disease liabilities from deep neural networks. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2024:2023.01.18.23284383. [PMID: 36712099 PMCID: PMC9882423 DOI: 10.1101/2023.01.18.23284383] [Indexed: 01/21/2023]
Abstract
The case-control study is a widely used method for investigating the genetic underpinnings of binary traits. However, long-term, prospective cohort studies often grapple with absent or evolving health-related outcomes. Here, we propose two methods, liability and meta, for conducting genome-wide association study (GWAS) that leverage disease liabilities calculated from deep patient phenotyping. Analyzing 38 common traits in ~300,000 UK Biobank participants, we identified an increased number of loci compared to the conventional case-control approach, with high replication rates in larger external GWAS. Further analyses confirmed the disease-specificity of the genetic architecture with the meta method demonstrating higher robustness when phenotypes were imputed with low accuracy. Additionally, polygenic risk scores based on disease liabilities more effectively predicted newly diagnosed cases in the 2022 dataset, which were controls in the earlier 2019 dataset. Our findings demonstrate that integrating high-dimensional phenotypic data into deep neural networks enhances genetic association studies while capturing disease-relevant genetic architecture.
Affiliation(s)
- Lu Yang
- Department of Bioengineering, Stanford University, Stanford, CA, 94305, USA
- Department of Computer Science, Stanford University, Stanford, CA, 94305, USA
- Marie C. Sadler
- Department of Bioengineering, Stanford University, Stanford, CA, 94305, USA
- University Center for Primary Care and Public Health, Lausanne, 1010, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, 1015, Switzerland
- Russ B. Altman
- Department of Bioengineering, Stanford University, Stanford, CA, 94305, USA
- Department of Genetics, Stanford University, Stanford, CA, 94305, USA
- Department of Medicine, Stanford University, Stanford, CA, 94305, USA
- Department of Computer Science, Stanford University, Stanford, CA, 94305, USA
37
Sanchez-Ruiz JA, Coombes BJ, Pazdernik VM, Melhuish Beaupre LM, Jenkins GD, Pendegraft RS, Batzler A, Ozerdem A, McElroy SL, Gardea-Resendez MA, Cuellar-Barboza AB, Prieto ML, Frye MA, Biernacka JM. Clinical and genetic contributions to medical comorbidity in bipolar disorder: a study using electronic health records-linked biobank data. Mol Psychiatry 2024; 29:2701-2713. [PMID: 38548982 PMCID: PMC11544602 DOI: 10.1038/s41380-024-02530-8] [Received: 07/12/2023] [Revised: 02/21/2024] [Accepted: 03/13/2024] [Indexed: 06/14/2024]
Abstract
Bipolar disorder is a chronic and complex polygenic disease with high rates of comorbidity. However, the independent contribution of either diagnosis or genetic risk of bipolar disorder to the medical comorbidity profile of individuals with the disease remains unresolved. Here, we conducted a multi-step phenome-wide association study (PheWAS) of bipolar disorder using phenomes derived from the electronic health records of participants enrolled in the Mayo Clinic Biobank and the Mayo Clinic Bipolar Disorder Biobank. First, we explored the conditions associated with a diagnosis of bipolar disorder by conducting a phenotype-based PheWAS followed by LASSO-penalized regression to account for correlations within the phenome. Then, we explored the conditions associated with bipolar disorder polygenic risk score (BD-PRS) using a PRS-based PheWAS with a sequential exclusion approach to account for the possibility that diagnosis, instead of genetic risk, may drive such associations. 53,386 participants (58.7% women) with a mean age at analysis of 67.8 years (SD = 15.6) were included. A bipolar disorder diagnosis (n = 1479) was associated with higher rates of psychiatric conditions, injuries and poisonings, endocrine/metabolic and neurological conditions, viral hepatitis C, and asthma. BD-PRS was associated with psychiatric comorbidities but, in contrast, had no positive associations with general medical conditions. While our findings warrant confirmation with longitudinal-prospective studies, the limited associations between bipolar disorder genetics and medical conditions suggest that shared environmental effects or environmental consequences of diagnosis may have a greater impact on the general medical comorbidity profile of individuals with bipolar disorder than its genetic risk.
Affiliation(s)
- Brandon J Coombes
- Department of Quantitative Health Sciences, Mayo Clinic, Rochester, MN, USA
- Greg D Jenkins
- Department of Quantitative Health Sciences, Mayo Clinic, Rochester, MN, USA
- Anthony Batzler
- Department of Quantitative Health Sciences, Mayo Clinic, Rochester, MN, USA
- Aysegul Ozerdem
- Department of Psychiatry & Psychology, Mayo Clinic, Rochester, MN, USA
- Susan L McElroy
- Lindner Center of HOPE/University of Cincinnati, Cincinnati, OH, USA
- Manuel A Gardea-Resendez
- Department of Psychiatry & Psychology, Mayo Clinic, Rochester, MN, USA
- Department of Psychiatry, Universidad Autónoma de Nuevo León, Monterrey, Mexico
- Alfredo B Cuellar-Barboza
- Department of Psychiatry & Psychology, Mayo Clinic, Rochester, MN, USA
- Department of Psychiatry, Universidad Autónoma de Nuevo León, Monterrey, Mexico
- Miguel L Prieto
- Department of Psychiatry & Psychology, Mayo Clinic, Rochester, MN, USA
- Department of Psychiatry, Faculty of Medicine, Universidad de Los Andes, Santiago, Chile
- Mental Health Service, Clínica Universidad de los Andes, Santiago, Chile
- Mark A Frye
- Department of Psychiatry & Psychology, Mayo Clinic, Rochester, MN, USA
- Joanna M Biernacka
- Department of Psychiatry & Psychology, Mayo Clinic, Rochester, MN, USA.
- Department of Quantitative Health Sciences, Mayo Clinic, Rochester, MN, USA.
38
Cao T, Brady V, Whisenant M, Wang X, Gu Y, Wu H. Toward Reliable Symptom Coding in Electronic Health Records for Symptom Assessment and Research: Identification and Categorization of International Classification of Diseases, Ninth Revision, Clinical Modification Symptom Codes. Comput Inform Nurs 2024; 42:636-647. [PMID: 38968447 PMCID: PMC11377150 DOI: 10.1097/cin.0000000000001146] [Indexed: 07/07/2024]
Abstract
To date, symptom documentation has mostly relied on clinical notes in electronic health records or patient-reported outcomes using disease-specific symptom inventories. To provide a common and precise language for symptom recording, assessment, and research, a comprehensive list of symptom codes is needed. The International Classification of Diseases, Ninth Revision or its clinical modification (International Classification of Diseases, Ninth Revision, Clinical Modification) has a range of codes designated for symptoms, but it does not contain codes for all possible symptoms, and not all codes in that range are symptom related. This study aimed to identify and categorize the first list of International Classification of Diseases, Ninth Revision, Clinical Modification symptom codes for a general population and demonstrate their use to characterize symptoms of patients with type 2 diabetes mellitus in the Cerner database. A list of potential symptom codes was automatically extracted from the Unified Medical Language System Metathesaurus. Two clinical experts in symptom science and diabetes manually reviewed this list to identify and categorize codes as symptoms. A total of 1888 International Classification of Diseases, Ninth Revision, Clinical Modification symptom codes were identified and categorized into 65 categories. The symptom characterization using the newly obtained symptom codes and categories was found to be more reasonable than that using the previous symptom codes and categories on the same Cerner diabetes cohort.
Affiliation(s)
- Tru Cao
- Author Affiliations: UTHealth Houston School of Public Health (Drs Cao, Wang, and Wu and Mr Gu), UTHealth Houston Cizik School of Nursing (Dr Brady), and The University of Texas MD Anderson Cancer Center (Dr Whisenant)
39
Lin Z, Xiong J, Yang J, Huang Y, Li J, Zhao G, Li B. A comprehensive analysis of the health effects associated with smoking in the largest population using UK Biobank genotypic and phenotypic data. Heliyon 2024; 10:e35649. [PMID: 39220930 PMCID: PMC11365339 DOI: 10.1016/j.heliyon.2024.e35649] [Received: 03/10/2024] [Revised: 07/29/2024] [Accepted: 08/01/2024] [Indexed: 09/04/2024]
Abstract
Background Smoking is a widespread behavior, yet the relationship between smoking and various diseases remains a topic of debate. Objective We conducted analyses to further examine the identified associations and assess potential causal relationships. Methods We used seven single nucleotide polymorphisms (SNPs) known to be linked to smoking, extracting genotype data from the UK Biobank, a large-scale biomedical repository of comprehensive health-related and genetic information on participants of European descent. A phenome-wide association study (PheWAS) was conducted to map the association of genetically predicted smoking status with 1,549 phenotypes. The associations identified in the PheWAS were then examined through two-sample Mendelian randomization (MR) analysis, using data from the UK Biobank (n = 487,365) and the GWAS and Sequencing Consortium of Alcohol and Nicotine Use (GSCAN) (n = 337,334). This approach allowed us to comprehensively characterize the links between smoking and disease patterns. Results The PheWAS analysis produced 34 phenotypes that demonstrated significant associations with smoking (P < 0.05/1460). Sickle cell anemia and type 2 diabetes had the highest proportion of significant SNPs (85.71% each). Furthermore, the MR analyses provided compelling evidence supporting causal associations between smoking and the risk of the following diseases: obstructive chronic bronchitis (IVW: Beta = 0.48, 95% confidence interval (CI) 0.36-0.61, P = 1.62 × 10⁻¹³), cancer of the bronchus (IVW: Beta = 0.92, 95% CI 0.68-1.17, P = 2.02 × 10⁻¹³), peripheral vascular disease (IVW: Beta = 1.09, 95% CI 0.71-1.46, P = 1.63 × 10⁻⁸), emphysema (IVW: Beta = 1.63, 95% CI 0.90-2.36, P = 1.29 × 10⁻⁵), pneumococcal pneumonia (IVW: Beta = 0.30, 95% CI 0.11-0.49, P = 1.60 × 10⁻³), chronic airway obstruction (IVW: Beta = 0.83, 95% CI 0.30-1.36, P = 2.00 × 10⁻³) and type 2 diabetes (IVW: Beta = 0.53, 95% CI 0.16-0.90, P = 5.08 × 10⁻³).
Conclusion This study affirms causal relationships between smoking and obstructive chronic bronchitis, cancer of the bronchus, peripheral vascular disease, emphysema, pneumococcal pneumonia, chronic airway obstruction, and type 2 diabetes in the European population. These findings highlight the broad health impacts of smoking and support smoking cessation efforts.
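The two statistical steps named in this abstract, a Bonferroni-style threshold of 0.05/1460 and an inverse-variance weighted (IVW) MR estimate, can be sketched in Python. This is the textbook fixed-effect IVW estimator, not necessarily the exact software or settings the authors used; the per-SNP effect sizes below are made-up toy values and the function names are hypothetical.

```python
import math

def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests

def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW: weighted regression through the origin of
    SNP-outcome effects on SNP-exposure effects, weights 1/se_outcome^2."""
    weights = [1.0 / se ** 2 for se in se_outcome]
    num = sum(w * bx * by for w, bx, by in zip(weights, beta_exposure, beta_outcome))
    den = sum(w * bx ** 2 for w, bx in zip(weights, beta_exposure))
    return num / den, math.sqrt(1.0 / den)

threshold = bonferroni_threshold(0.05, 1460)

# Toy per-SNP summary statistics (illustrative values only, not from the study).
bx = [0.1, 0.2]      # SNP-exposure effects
by = [0.05, 0.10]    # SNP-outcome effects
se = [0.01, 0.01]    # standard errors of the SNP-outcome effects
beta_ivw, se_ivw = ivw_estimate(bx, by, se)
odds_ratio = math.exp(beta_ivw)  # a log-odds Beta converts to an odds ratio via exp
```

Because the reported Beta values are on the log-odds scale for binary outcomes, exponentiating them gives odds ratios, which is often the easier scale for comparing effect sizes across diseases.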
Affiliation(s)
- Zixun Lin
- The Joint Institute of Smoking and Health & Bioinformatics Centre, National Clinical Research Centre for Geriatric Disorders, Department of Geriatrics, Xiangya Hospital, Central South University, Changsha, Hunan, 410008, China
- Xiangya School of Medicine, Central South University, Changsha, Hunan, 410013, China
- Jiayi Xiong
- The Joint Institute of Smoking and Health & Bioinformatics Centre, National Clinical Research Centre for Geriatric Disorders, Department of Geriatrics, Xiangya Hospital, Central South University, Changsha, Hunan, 410008, China
- Jiaqi Yang
- The Joint Institute of Smoking and Health & Bioinformatics Centre, National Clinical Research Centre for Geriatric Disorders, Department of Geriatrics, Xiangya Hospital, Central South University, Changsha, Hunan, 410008, China
- Yuanfeng Huang
- The Joint Institute of Smoking and Health & Bioinformatics Centre, National Clinical Research Centre for Geriatric Disorders, Department of Geriatrics, Xiangya Hospital, Central South University, Changsha, Hunan, 410008, China
- Jinchen Li
- The Joint Institute of Smoking and Health & Bioinformatics Centre, National Clinical Research Centre for Geriatric Disorders, Department of Geriatrics, Xiangya Hospital, Central South University, Changsha, Hunan, 410008, China
- Centre for Medical Genetics & Hunan Key Laboratory, School of Life Sciences, Central South University, Changsha, Hunan, 410008, China
- Department of Neurology, Xiangya Hospital, Central South University, Changsha, Hunan, 410008, China
- Bioinformatics Centre, Furong Laboratory, Changsha, Hunan, 410008, China
- Guihu Zhao
- The Joint Institute of Smoking and Health & Bioinformatics Centre, National Clinical Research Centre for Geriatric Disorders, Department of Geriatrics, Xiangya Hospital, Central South University, Changsha, Hunan, 410008, China
- Bin Li
- The Joint Institute of Smoking and Health & Bioinformatics Centre, National Clinical Research Centre for Geriatric Disorders, Department of Geriatrics, Xiangya Hospital, Central South University, Changsha, Hunan, 410008, China
40
Carey CE, Shafee R, Wedow R, Elliott A, Palmer DS, Compitello J, Kanai M, Abbott L, Schultz P, Karczewski KJ, Bryant SC, Cusick CM, Churchhouse C, Howrigan DP, King D, Davey Smith G, Neale BM, Walters RK, Robinson EB. Principled distillation of UK Biobank phenotype data reveals underlying structure in human variation. Nat Hum Behav 2024; 8:1599-1615. [PMID: 38965376 PMCID: PMC11343713 DOI: 10.1038/s41562-024-01909-5] [Received: 10/26/2022] [Accepted: 05/14/2024] [Indexed: 07/06/2024]
Abstract
Data within biobanks capture broad yet detailed indices of human variation, but biobank-wide insights can be difficult to extract due to complexity and scale. Here, using large-scale factor analysis, we distill hundreds of variables (diagnoses, assessments and survey items) into 35 latent constructs, using data from unrelated individuals with predominantly estimated European genetic ancestry in UK Biobank. These factors recapitulate known disease classifications, disentangle elements of socioeconomic status, highlight the relevance of psychiatric constructs to health and improve measurement of pro-health behaviours. We go on to demonstrate the power of this approach to clarify genetic signal, enhance discovery and identify associations between underlying phenotypic structure and health outcomes. In building a deeper understanding of ways in which constructs such as socioeconomic status, trauma, or physical activity are structured in the dataset, we emphasize the importance of considering the interwoven nature of the human phenome when evaluating public health patterns.
Affiliation(s)
- Caitlin E Carey
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA.
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA.
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA.
- Rebecca Shafee
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Genetics, Harvard Medical School, Boston, MA, USA
- Section on Developmental Neurogenomics, National Institute of Mental Health, Bethesda, MD, USA
- Robbee Wedow
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Department of Sociology, Purdue University, West Lafayette, IN, USA
- Department of Medical and Molecular Genetics, Indiana University School of Medicine, Indianapolis, IN, USA
- AnalytiXIN, Indianapolis, IN, USA
- Center on Aging and the Life Course, Purdue University, West Lafayette, IN, USA
- Department of Statistics, Purdue University, West Lafayette, IN, USA
- Amanda Elliott
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Department of Medicine, Harvard Medical School, Boston, MA, USA
- Duncan S Palmer
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Nuffield Department of Population Health, Medical Sciences Division University of Oxford, Oxford, UK
- Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford, UK
- John Compitello
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Masahiro Kanai
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Liam Abbott
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Patrick Schultz
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Konrad J Karczewski
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Samuel C Bryant
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Caroline M Cusick
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Claire Churchhouse
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Daniel P Howrigan
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Daniel King
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - George Davey Smith
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- MRC Integrative Epidemiology Unit, University of Bristol, Oakfield House, Bristol, UK
- Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, UK
| | - Benjamin M Neale
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Raymond K Walters
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA.
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA.
- Department of Medicine, Harvard Medical School, Boston, MA, USA.
| | - Elise B Robinson
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
| |
|
41
|
Pietzner M, Denaxas S, Yasmeen S, Ulmer MA, Nakanishi T, Arnold M, Kastenmüller G, Hemingway H, Langenberg C. Complex patterns of multimorbidity associated with severe COVID-19 and long COVID. COMMUNICATIONS MEDICINE 2024; 4:94. [PMID: 38977844 PMCID: PMC11231221 DOI: 10.1038/s43856-024-00506-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/09/2023] [Accepted: 04/19/2024] [Indexed: 07/10/2024] Open
Abstract
BACKGROUND Early evidence that patients with (multiple) pre-existing diseases are at highest risk for severe COVID-19 has been instrumental in the pandemic to allocate critical care resources and later vaccination schemes. However, systematic studies exploring the breadth of medical diagnoses are scarce but may help to understand severe COVID-19 among patients at supposedly low risk. METHODS We systematically harmonized >12 million primary care and hospitalisation health records from ~500,000 UK Biobank participants into 1448 collated disease terms to systematically identify diseases predisposing to severe COVID-19 (requiring hospitalisation or death) and its post-acute sequelae, Long COVID. RESULTS Here we identify 679 diseases associated with an increased risk for severe COVID-19 (n = 672) and/or Long COVID (n = 72) that span almost all clinical specialties and are strongly enriched in clusters of cardio-respiratory and endocrine-renal diseases. For 57 diseases, we establish consistent evidence to predispose to severe COVID-19 based on survival and genetic susceptibility analyses. This includes a possible role of symptoms of malaise and fatigue as a so far largely overlooked risk factor for severe COVID-19. We finally observe partially opposing risk estimates at known risk loci for severe COVID-19 for etiologically related diseases, such as post-inflammatory pulmonary fibrosis or rheumatoid arthritis, possibly indicating a segregation of disease mechanisms. CONCLUSIONS Our results provide a unique reference that demonstrates how 1) complex co-occurrence of multiple - including non-fatal - conditions predispose to increased COVID-19 severity and 2) how incorporating the whole breadth of medical diagnosis can guide the interpretation of genetic risk loci.
Affiliation(s)
- Maik Pietzner
- Computational Medicine, Berlin Institute of Health at Charité - Universitätsmedizin Berlin, Berlin, Germany.
- Precision Healthcare University Research Institute, Queen Mary University of London, London, UK.
- MRC Epidemiology Unit, University of Cambridge, Cambridge, UK.
| | - Spiros Denaxas
- Institute of Health Informatics, University College London, London, UK
- Health Data Research UK, London, UK
- British Heart Foundation Data Science Centre, London, UK
- National Institute of Health Research University College London Hospitals Biomedical Research Centre, London, UK
| | - Summaira Yasmeen
- Computational Medicine, Berlin Institute of Health at Charité - Universitätsmedizin Berlin, Berlin, Germany
| | - Maria A Ulmer
- Institute of Computational Biology, Helmholtz Zentrum München - German Research Center for Environmental Health, Neuherberg, Germany
| | - Tomoko Nakanishi
- Precision Healthcare University Research Institute, Queen Mary University of London, London, UK
| | - Matthias Arnold
- Institute of Computational Biology, Helmholtz Zentrum München - German Research Center for Environmental Health, Neuherberg, Germany
- Department of Psychiatry and Behavioral Sciences, Duke University, Durham, NC, USA
| | - Gabi Kastenmüller
- Institute of Computational Biology, Helmholtz Zentrum München - German Research Center for Environmental Health, Neuherberg, Germany
| | - Harry Hemingway
- Institute of Health Informatics, University College London, London, UK.
- Health Data Research UK, London, UK.
- National Institute of Health Research University College London Hospitals Biomedical Research Centre, London, UK.
| | - Claudia Langenberg
- Computational Medicine, Berlin Institute of Health at Charité - Universitätsmedizin Berlin, Berlin, Germany.
- Precision Healthcare University Research Institute, Queen Mary University of London, London, UK.
- MRC Epidemiology Unit, University of Cambridge, Cambridge, UK.
| |
|
42
|
Allaire P, Elsayed NS, Berg RL, Rose W, Shukla SK. Phenome-wide association study identifies new clinical phenotypes associated with Staphylococcus aureus infections. PLoS One 2024; 19:e0303395. [PMID: 38968223 PMCID: PMC11226111 DOI: 10.1371/journal.pone.0303395] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/05/2023] [Accepted: 04/23/2024] [Indexed: 07/07/2024] Open
Abstract
BACKGROUND Phenome-Wide Association study (PheWAS) is a powerful tool designed to systematically screen clinical observations derived from medical records (phenotypes) for association with a variable of interest. Despite their usefulness, no systematic screening of phenotypes associated with Staphylococcus aureus infections (SAIs) has been done leaving potential novel risk factors or complications undiscovered. METHOD AND COHORTS We tailored the PheWAS approach into a two-stage screening procedure to identify novel phenotypes correlating with SAIs. The first stage screened for co-occurrence of SAIs with other phenotypes within medical records. In the second stage, significant findings were examined for the correlations between their age of onset with that of SAIs. The PheWAS was implemented using the medical records of 754,401 patients from the Marshfield Clinic Health System. Any novel associations discovered were subsequently validated using datasets from TriNetX and All of Us, encompassing 109,884,571 and 118,538 patients respectively. RESULTS Forty-one phenotypes met the significance criteria of a p-value < 3.64e-5 and odds ratios of > 5. Out of these, we classified 23 associations either as risk factors or as complications of SAIs. Three novel associations were discovered and classified either as a risk (long-term use of aspirin) or complications (iron deficiency anemia and anemia of chronic disease). All novel associations were replicated in the TriNetX cohort. In the All of Us cohort, anemia of chronic disease was replicated according to our significance criteria. CONCLUSIONS The PheWAS of SAIs expands our understanding of SAIs interacting phenotypes. Additionally, the novel two-stage PheWAS approach developed in this study can be applied to examine other disease-disease interactions of interest. Due to the possibility of bias inherent in observational data, the findings of this study require further investigation.
Affiliation(s)
- Patrick Allaire
- Center for Precision Medicine Research, Marshfield Clinic Research Institute, Marshfield, Wisconsin, United States of America
| | - Noha S. Elsayed
- Center for Precision Medicine Research, Marshfield Clinic Research Institute, Marshfield, Wisconsin, United States of America
| | - Richard L. Berg
- Research Computing and Analytics, Marshfield Clinic Research Institute, Marshfield, Wisconsin, United States of America
| | - Warren Rose
- School of Pharmacy, University of Wisconsin, Madison, Wisconsin, United States of America
| | - Sanjay K. Shukla
- Center for Precision Medicine Research, Marshfield Clinic Research Institute, Marshfield, Wisconsin, United States of America
- Computational and Informatics in Biology and Medicine Program, University of Wisconsin-Madison, Madison, Wisconsin, United States of America
| |
|
43
|
Gilliland T, Dron JS, Selvaraj MS, Trinder M, Paruchuri K, Urbut SM, Haidermota S, Bernardo R, Uddin MM, Honigberg MC, Peloso GM, Natarajan P. Genetic Architecture and Clinical Outcomes of Combined Lipid Disturbances. Circ Res 2024; 135:265-276. [PMID: 38828614 PMCID: PMC11223949 DOI: 10.1161/circresaha.123.323973] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/10/2023] [Accepted: 05/20/2024] [Indexed: 06/05/2024]
Abstract
BACKGROUND Dyslipoproteinemia often involves simultaneous derangements of multiple lipid traits. We aimed to evaluate the phenotypic and genetic characteristics of combined lipid disturbances in a general population-based cohort. METHODS Among UK Biobank participants without prevalent coronary artery disease, we used blood lipid and apolipoprotein B concentrations to ascribe individuals into 1 of 6 reproducible and mutually exclusive dyslipoproteinemia subtypes. Incident coronary artery disease risk was estimated for each subtype using Cox proportional hazards models. Phenome-wide analyses and genome-wide association studies were performed for each subtype, followed by in silico causal gene prioritization and heritability analyses. Additionally, the prevalence of disruptive variants in causal genes for Mendelian lipid disorders was assessed using whole-exome sequence data. RESULTS Among 450 636 UK Biobank participants: 63 (0.01%) had chylomicronemia; 40 005 (8.9%) had hypercholesterolemia; 94 785 (21.0%) had combined hyperlipidemia; 13 998 (3.1%) had remnant hypercholesterolemia; 110 389 (24.5%) had hypertriglyceridemia; and 49 (0.01%) had mixed hypertriglyceridemia and hypercholesterolemia. Over a median (interquartile range) follow-up of 11.1 (10.4-11.8) years, incident coronary artery disease risk varied across subtypes, with combined hyperlipidemia exhibiting the largest hazard (hazard ratio, 1.92 [95% CI, 1.84-2.01]; P=2×10⁻¹⁶), even when accounting for non-HDL-C (hazard ratio, 1.45 [95% CI, 1.30-1.60]; P=2.6×10⁻¹²). Genome-wide association studies revealed 250 loci significantly associated with dyslipoproteinemia subtypes, of which 72 (28.8%) were not detected in prior single lipid trait genome-wide association studies. Mendelian lipid variant carriers were rare (2.0%) among individuals with dyslipoproteinemia, but polygenic heritability was high, ranging from 23% for remnant hypercholesterolemia to 54% for combined hyperlipidemia.
CONCLUSIONS Simultaneous assessment of multiple lipid derangements revealed nuanced differences in coronary artery disease risk and genetic architectures across dyslipoproteinemia subtypes. These findings highlight the importance of looking beyond single lipid traits to better understand combined lipid and lipoprotein phenotypes and implications for disease risk.
Affiliation(s)
- Thomas Gilliland
- Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA
- Program in Medical and Population Genetics and the Cardiovascular Disease Initiative, Broad Institute of Harvard and MIT, Cambridge, MA
- Department of Medicine, Harvard Medical School, Boston, MA
| | - Jacqueline S. Dron
- Program in Medical and Population Genetics and the Cardiovascular Disease Initiative, Broad Institute of Harvard and MIT, Cambridge, MA
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA
| | - Margaret Sunitha Selvaraj
- Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA
- Program in Medical and Population Genetics and the Cardiovascular Disease Initiative, Broad Institute of Harvard and MIT, Cambridge, MA
- Department of Medicine, Harvard Medical School, Boston, MA
| | - Mark Trinder
- Program in Medical and Population Genetics and the Cardiovascular Disease Initiative, Broad Institute of Harvard and MIT, Cambridge, MA
- Centre for Heart Lung Innovation, University of British Columbia, Vancouver, BC
| | - Kaavya Paruchuri
- Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA
- Program in Medical and Population Genetics and the Cardiovascular Disease Initiative, Broad Institute of Harvard and MIT, Cambridge, MA
- Department of Medicine, Harvard Medical School, Boston, MA
| | - Sarah M. Urbut
- Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA
- Program in Medical and Population Genetics and the Cardiovascular Disease Initiative, Broad Institute of Harvard and MIT, Cambridge, MA
- Department of Medicine, Harvard Medical School, Boston, MA
| | - Sara Haidermota
- Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA
- Program in Medical and Population Genetics and the Cardiovascular Disease Initiative, Broad Institute of Harvard and MIT, Cambridge, MA
| | - Rachel Bernardo
- Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA
- Program in Medical and Population Genetics and the Cardiovascular Disease Initiative, Broad Institute of Harvard and MIT, Cambridge, MA
| | - Md Mesbah Uddin
- Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA
- Program in Medical and Population Genetics and the Cardiovascular Disease Initiative, Broad Institute of Harvard and MIT, Cambridge, MA
| | - Michael C. Honigberg
- Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA
- Program in Medical and Population Genetics and the Cardiovascular Disease Initiative, Broad Institute of Harvard and MIT, Cambridge, MA
- Department of Medicine, Harvard Medical School, Boston, MA
| | - Gina M. Peloso
- Department of Biostatistics, Boston University School of Public Health, Boston, MA
| | - Pradeep Natarajan
- Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA
- Program in Medical and Population Genetics and the Cardiovascular Disease Initiative, Broad Institute of Harvard and MIT, Cambridge, MA
- Department of Medicine, Harvard Medical School, Boston, MA
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA
| |
|
44
|
Pietzner M, Denaxas S, Yasmeen S, Ulmer MA, Nakanishi T, Arnold M, Kastenmüller G, Hemingway H, Langenberg C. Complex patterns of multimorbidity associated with severe COVID-19 and Long COVID. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2024:2023.05.23.23290408. [PMID: 39006431 PMCID: PMC11245059 DOI: 10.1101/2023.05.23.23290408] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 07/16/2024]
Abstract
Early evidence that patients with (multiple) pre-existing diseases are at highest risk for severe COVID-19 has been instrumental in the pandemic to allocate critical care resources and later vaccination schemes. However, systematic studies exploring the breadth of medical diagnoses, including common, but non-fatal diseases are scarce, but may help to understand severe COVID-19 among patients at supposedly low risk. Here, we systematically harmonized >12 million primary care and hospitalisation health records from ~500,000 UK Biobank participants into 1448 collated disease terms to systematically identify diseases predisposing to severe COVID-19 (requiring hospitalisation or death) and its post-acute sequelae, Long COVID. We identified a total of 679 diseases associated with an increased risk for severe COVID-19 (n=672) and/or Long COVID (n=72) that spanned almost all clinical specialties and were strongly enriched in clusters of cardio-respiratory and endocrine-renal diseases. For 57 diseases, we established consistent evidence to predispose to severe COVID-19 based on survival and genetic susceptibility analyses. This included a possible role of symptoms of malaise and fatigue as a so far largely overlooked risk factor for severe COVID-19. We finally observed partially opposing risk estimates at known risk loci for severe COVID-19 for etiologically related diseases, such as post-inflammatory pulmonary fibrosis (e.g., MUC5B, NPNT, and PSMD3) or rheumatoid arthritis (e.g., TYK2), possibly indicating a segregation of disease mechanisms. Our results provide a unique reference that demonstrates how 1) complex co-occurrence of multiple - including non-fatal - conditions predispose to increased COVID-19 severity and 2) how incorporating the whole breadth of medical diagnosis can guide the interpretation of genetic risk loci.
Affiliation(s)
- Maik Pietzner
- Computational Medicine, Berlin Institute of Health at Charité - Universitätsmedizin Berlin, Berlin, Germany
- Precision Healthcare University Research Institute, Queen Mary University of London, London, UK
- MRC Epidemiology Unit, University of Cambridge, Cambridge, UK
| | - Spiros Denaxas
- Institute of Health Informatics, University College London, London, UK
- Health Data Research UK, London, UK
- British Heart Foundation Data Science Centre, London, UK
- National Institute of Health Research University College London Hospitals Biomedical Research Centre, London, UK
| | - Summaira Yasmeen
- Computational Medicine, Berlin Institute of Health at Charité - Universitätsmedizin Berlin, Berlin, Germany
| | - Maria A. Ulmer
- Institute of Computational Biology, Helmholtz Zentrum München - German Research Center for Environmental Health, Neuherberg, Germany
| | - Tomoko Nakanishi
- Precision Healthcare University Research Institute, Queen Mary University of London, London, UK
| | - Matthias Arnold
- Institute of Computational Biology, Helmholtz Zentrum München - German Research Center for Environmental Health, Neuherberg, Germany
- Department of Psychiatry and Behavioral Sciences, Duke University, Durham, NC, USA
| | - Gabi Kastenmüller
- Institute of Computational Biology, Helmholtz Zentrum München - German Research Center for Environmental Health, Neuherberg, Germany
| | - Harry Hemingway
- Institute of Health Informatics, University College London, London, UK
- Health Data Research UK, London, UK
- National Institute of Health Research University College London Hospitals Biomedical Research Centre, London, UK
| | - Claudia Langenberg
- Computational Medicine, Berlin Institute of Health at Charité - Universitätsmedizin Berlin, Berlin, Germany
- Precision Healthcare University Research Institute, Queen Mary University of London, London, UK
- MRC Epidemiology Unit, University of Cambridge, Cambridge, UK
| |
|
45
|
Løkhammer S, Koller D, Wendt FR, Choi KW, He J, Friligkou E, Overstreet C, Gelernter J, Hellard SL, Polimanti R. Distinguishing vulnerability and resilience to posttraumatic stress disorder evaluating traumatic experiences, genetic risk and electronic health records. Psychiatry Res 2024; 337:115950. [PMID: 38744179 PMCID: PMC11156529 DOI: 10.1016/j.psychres.2024.115950] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/11/2024] [Revised: 04/29/2024] [Accepted: 05/04/2024] [Indexed: 05/16/2024]
Abstract
What distinguishes vulnerability and resilience to posttraumatic stress disorder (PTSD) remains unclear. Leveraging traumatic experiences reporting, genetic data, and electronic health records (EHR), we investigated and predicted the clinical comorbidities (co-phenome) of PTSD vulnerability and resilience in the UK Biobank (UKB) and All of Us Research Program (AoU), respectively. In 60,354 trauma-exposed UKB participants, we defined PTSD vulnerability and resilience considering PTSD symptoms, trauma burden, and polygenic risk scores. EHR-based phenome-wide association studies (PheWAS) were conducted to dissect the co-phenomes of PTSD vulnerability and resilience. Significant diagnostic endpoints were applied as weights, yielding a phenotypic risk score (PheRS) to conduct PheWAS of PTSD vulnerability and resilience PheRS in up to 95,761 AoU participants. EHR-based PheWAS revealed three significant phenotypes positively associated with PTSD vulnerability (top association "Sleep disorders") and five outcomes inversely associated with PTSD resilience (top association "Irritable Bowel Syndrome"). In the AoU cohort, PheRS analysis showed a partial inverse relationship between vulnerability and resilience with distinct comorbid associations. While PheRS_vulnerability associations were linked to multiple phenotypes, PheRS_resilience showed inverse relationships with eye conditions. Our study unveils phenotypic differences in PTSD vulnerability and resilience, highlighting that these concepts are not simply the absence and presence of PTSD.
Affiliation(s)
- Solveig Løkhammer
- Department of Psychiatry, Yale School of Medicine, New Haven, Connecticut, USA
- Department of Clinical Science, University of Bergen, Bergen, Norway
- Dr. Einar Martens Research Group for Biological Psychiatry, Center for Medical Genetics and Molecular Medicine, Haukeland University Hospital, Bergen, Norway
| | - Dora Koller
- Department of Psychiatry, Yale School of Medicine, New Haven, Connecticut, USA
- Department of Genetics, Microbiology, and Statistics, Faculty of Biology, University of Barcelona, Catalonia, Spain
| | - Frank R. Wendt
- Department of Anthropology, University of Toronto, Mississauga, Canada
- Biostatistics Division, Dalla Lana School of Public Health, University of Toronto, Toronto, Canada
| | - Karmel W. Choi
- Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA
- Department of Psychiatry, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts, USA
| | - Jun He
- Department of Psychiatry, Yale School of Medicine, New Haven, Connecticut, USA
- Veterans Affairs Connecticut Healthcare Center, West Haven, Connecticut, USA
| | - Eleni Friligkou
- Department of Psychiatry, Yale School of Medicine, New Haven, Connecticut, USA
- Veterans Affairs Connecticut Healthcare Center, West Haven, Connecticut, USA
| | - Cassie Overstreet
- Department of Psychiatry, Yale School of Medicine, New Haven, Connecticut, USA
- Veterans Affairs Connecticut Healthcare Center, West Haven, Connecticut, USA
| | - Joel Gelernter
- Department of Psychiatry, Yale School of Medicine, New Haven, Connecticut, USA
- Veterans Affairs Connecticut Healthcare Center, West Haven, Connecticut, USA
- Department of Genetics, Yale School of Medicine, New Haven, Connecticut, USA
- Department of Neuroscience, Yale School of Medicine, New Haven, Connecticut, USA
- Wu Tsai Institute, Yale University, New Haven, Connecticut, USA
| | - Stéphanie Le Hellard
- Department of Clinical Science, University of Bergen, Bergen, Norway
- Dr. Einar Martens Research Group for Biological Psychiatry, Center for Medical Genetics and Molecular Medicine, Haukeland University Hospital, Bergen, Norway
- Bergen Center of Brain Plasticity, Haukeland University Hospital, Bergen, Norway
| | - Renato Polimanti
- Department of Psychiatry, Yale School of Medicine, New Haven, Connecticut, USA
- Veterans Affairs Connecticut Healthcare Center, West Haven, Connecticut, USA
- Wu Tsai Institute, Yale University, New Haven, Connecticut, USA
- Department of Chronic Disease Epidemiology, Yale School of Public Health, New Haven, Connecticut, USA
| |
|
46
|
Steinfeldt J, Wild B, Buergel T, Pietzner M, Upmeier Zu Belzen J, Vauvelle A, Hegselmann S, Denaxas S, Hemingway H, Langenberg C, Landmesser U, Deanfield J, Eils R. Medical history predicts phenome-wide disease onset and enables the rapid response to emerging health threats. Nat Commun 2024; 15:4257. [PMID: 38763986 PMCID: PMC11102902 DOI: 10.1038/s41467-024-48568-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/17/2023] [Accepted: 05/03/2024] [Indexed: 05/21/2024] Open
Abstract
The COVID-19 pandemic exposed a global deficiency of systematic, data-driven guidance to identify high-risk individuals. Here, we illustrate the utility of routinely recorded medical history to predict the risk for 1883 diseases across clinical specialties and support the rapid response to emerging health threats such as COVID-19. We developed a neural network to learn from health records of 502,460 UK Biobank participants. Importantly, we observed discriminative improvements over basic demographic predictors for 1774 (94.3%) endpoints. After transferring the unmodified risk models to the All of Us cohort, we replicated these improvements for 1347 (89.8%) of 1500 investigated endpoints, demonstrating generalizability across healthcare systems and historically underrepresented groups. Ultimately, we showed how this approach could have been used to identify individuals vulnerable to severe COVID-19. Our study demonstrates the potential of medical history to support guidance for emerging pandemics by systematically estimating risk for thousands of diseases at once at minimal cost.
Affiliation(s)
- Jakob Steinfeldt
- Department of Cardiology, Angiology and Intensive Care Medicine, Deutsches Herzzentrum der Charité (DHZC), Berlin, Germany
- Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt Universität zu Berlin, Klinik/Centrum, Charitéplatz 1, 10117, Berlin, Germany
- Computational Medicine, Berlin Institute of Health (BIH), Charite - University Medicine Berlin, Berlin, Germany
- Friede Springer Cardiovascular Prevention Center@Charite, Charite - University Medicine Berlin, Berlin, Germany
- Institute of Cardiovascular Sciences, University College London, London, UK
| | - Benjamin Wild
- Center for Digital Health, Berlin Institute of Health (BIH), Charite - University Medicine Berlin, Berlin, Germany
| | - Thore Buergel
- Institute of Cardiovascular Sciences, University College London, London, UK
- Center for Digital Health, Berlin Institute of Health (BIH), Charite - University Medicine Berlin, Berlin, Germany
| | - Maik Pietzner
- Computational Medicine, Berlin Institute of Health (BIH), Charite - University Medicine Berlin, Berlin, Germany
- MRC Epidemiology Unit, Institute of Metabolic Science, University of Cambridge, Cambridge, UK
- Precision Health University Research Institute, Queen Mary University of London and Barts NHS Trust, London, UK
| | - Julius Upmeier Zu Belzen
- Center for Digital Health, Berlin Institute of Health (BIH), Charite - University Medicine Berlin, Berlin, Germany
| | - Andre Vauvelle
- Institute of Health Informatics, University College London, London, UK
| | - Stefan Hegselmann
- Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Massachusetts, USA
- Pattern Recognition and Image Analysis Lab, University of Münster, Münster, Germany
| | - Spiros Denaxas
- Institute of Health Informatics, University College London, London, UK
- British Heart Foundation Data Science Centre, London, UK
- Health Data Research UK, London, UK
- National Institute for Health Research, Biomedical Research Centre at University College London Hospitals National Institute for Health Research, Biomedical Research Centre, London, UK
| | - Harry Hemingway
- Institute of Health Informatics, University College London, London, UK
- Health Data Research UK, London, UK
- National Institute for Health Research, Biomedical Research Centre at University College London Hospitals National Institute for Health Research, Biomedical Research Centre, London, UK
| | - Claudia Langenberg
- Computational Medicine, Berlin Institute of Health (BIH), Charite - University Medicine Berlin, Berlin, Germany
- MRC Epidemiology Unit, Institute of Metabolic Science, University of Cambridge, Cambridge, UK
- Precision Health University Research Institute, Queen Mary University of London and Barts NHS Trust, London, UK
| | - Ulf Landmesser
- Department of Cardiology, Angiology and Intensive Care Medicine, Deutsches Herzzentrum der Charité (DHZC), Berlin, Germany
- Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt Universität zu Berlin, Klinik/Centrum, Charitéplatz 1, 10117, Berlin, Germany
- Friede Springer Cardiovascular Prevention Center@Charite, Charite - University Medicine Berlin, Berlin, Germany
- Berlin Institute of Health (BIH), Charite - University Medicine Berlin, Berlin, Germany
- DZHK (German Centre for Cardiovascular Research), Partner Site Berlin, Berlin, Berlin, Germany
| | - John Deanfield
- Institute of Cardiovascular Sciences, University College London, London, UK
| | - Roland Eils
- Center for Digital Health, Berlin Institute of Health (BIH), Charite - University Medicine Berlin, Berlin, Germany.
- Health Data Science Unit, Heidelberg University Hospital and BioQuant, Heidelberg, Germany.
| |
|
47
|
Kopell BH, Kaji DA, Liharska LE, Vornholt E, Valentine A, Lund A, Hashemi A, Thompson RC, Lohrenz T, Johnson JS, Bussola N, Cheng E, Park YJ, Shah P, Ma W, Searfoss R, Qasim S, Miller GM, Chand NM, Aristel A, Humphrey J, Wilkins L, Ziafat K, Silk H, Linares LM, Sullivan B, Feng C, Batten SR, Bang D, Barbosa LS, Twomey T, White JP, Vannucci M, Hadj-Amar B, Cohen V, Kota P, Moya E, Rieder MK, Figee M, Nadkarni GN, Breen MS, Kishida KT, Scarpa J, Ruderfer DM, Narain NR, Wang P, Kiebish MA, Schadt EE, Saez I, Montague PR, Beckmann ND, Charney AW. Multiomic foundations of human prefrontal cortex tissue function. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2024:2024.05.17.24307537. [PMID: 38798344 PMCID: PMC11118644 DOI: 10.1101/2024.05.17.24307537] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/29/2024]
Abstract
The prefrontal cortex (PFC) is a region of the brain that in humans is involved in the production of higher-order functions such as cognition, emotion, perception, and behavior. Neurotransmission in the PFC produces higher-order functions by integrating information from other areas of the brain. At the foundation of neurotransmission, and by extension at the foundation of higher-order brain functions, are an untold number of coordinated molecular processes involving the DNA sequence variants in the genome, RNA transcripts in the transcriptome, and proteins in the proteome. These "multiomic" foundations are poorly understood in humans, perhaps in part because most modern studies that characterize the molecular state of the human PFC use tissue obtained when neurotransmission and higher-order brain functions have ceased (i.e., the postmortem state). Here, analyses are presented on data generated for the Living Brain Project (LBP) to investigate whether PFC tissue from individuals with intact higher-order brain function has characteristic multiomic foundations. Two complementary strategies were employed towards this end. The first strategy was to identify in PFC samples obtained from living study participants a signature of RNA transcript expression associated with neurotransmission measured intracranially at the time of PFC sampling, in some cases while participants performed a task engaging higher-order brain functions. The second strategy was to perform multiomic comparisons between PFC samples obtained from individuals with intact higher-order brain function at the time of sampling (i.e., living study participants) and PFC samples obtained in the postmortem state. RNA transcript expression within multiple PFC cell types was associated with fluctuations of dopaminergic, serotonergic, and/or noradrenergic neurotransmission in the substantia nigra measured while participants played a computer game that engaged higher-order brain functions. 
A subset of these associations - termed the "transcriptional program associated with neurotransmission" (TPAWN) - were reproduced in analyses of brain RNA transcript expression and intracranial neurotransmission data obtained from a second LBP cohort and from a cohort in an independent study. RNA transcripts involved in TPAWN were found to be (1) enriched for RNA transcripts associated with measures of neurotransmission in rodent and cell models, (2) enriched for RNA transcripts encoded by evolutionarily constrained genes, (3) depleted of RNA transcripts regulated by common DNA sequence variants, and (4) enriched for RNA transcripts implicated in higher-order brain functions by human population genetic studies. In PFC excitatory neurons of living study participants, higher expression of the genes in TPAWN tracked with higher expression of RNA transcripts that in rodent PFC samples are markers of a class of excitatory neurons that connect the PFC to deep brain structures. TPAWN was further reproduced by RNA transcript expression patterns differentiating living PFC samples from postmortem PFC samples, and significant differences between living and postmortem PFC samples were additionally observed with respect to (1) the expression of most primary RNA transcripts, mature RNA transcripts, and proteins, (2) the splicing of most primary RNA transcripts into mature RNA transcripts, (3) the patterns of co-expression between RNA transcripts and proteins, and (4) the effects of some DNA sequence variants on RNA transcript and protein expression. Taken together, this report highlights that studies of brain tissue obtained in a safe and ethical manner from large cohorts of living individuals can help advance understanding of the multiomic foundations of brain function.
|
48
|
Hamilton F, Mitchell R, Ghazal P, Timpson N. Phenotypic Associations With the HMOX1 GT(n) Repeat in European Populations. Am J Epidemiol 2024; 193:718-726. [PMID: 37414746 PMCID: PMC11074708 DOI: 10.1093/aje/kwad154] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/29/2022] [Revised: 12/21/2023] [Accepted: 07/03/2023] [Indexed: 07/08/2023] Open
Abstract
Heme oxygenase 1 is a key enzyme in the management of heme in humans. A GT(n) repeat length in the heme oxygenase 1 gene (HMOX1) has been widely associated with a variety of phenotypes, including susceptibility to and outcomes in diabetes, cancer, infections, and neonatal jaundice. However, studies have generally been small and results inconsistent. In this study, we imputed the GT(n) repeat length in participants from 2 UK cohort studies (the UK Biobank study (n = 463,005; recruited in 2006-2010) and the Avon Longitudinal Study of Parents and Children (ALSPAC; n = 937; recruited in 1990-1991)), with the reliability of imputation tested in other cohorts (1000 Genomes Project, Human Genome Diversity Project, and Personal Genome Project UK). Subsequently, we measured the relationship between repeat length and previously identified associations (diabetes, chronic obstructive pulmonary disease, pneumonia, and infection-related mortality in the UK Biobank; neonatal jaundice in ALSPAC) and performed a phenomewide association study in the UK Biobank. Despite high-quality imputation (correlation between true repeat length and imputed repeat length > 0.9 in test cohorts), clinical associations were not identified in either the phenomewide association study or specific association studies. These findings were robust to definitions of repeat length and sensitivity analyses. Despite multiple smaller studies identifying associations across a variety of clinical settings, we could not replicate or identify any relevant phenotypic associations with the HMOX1 GT(n) repeat.
Affiliation(s)
- Fergus Hamilton
- Correspondence to Dr. Fergus Hamilton, MRC Integrative Epidemiology Unit, University of Bristol, Oakfield House, Oakfield Grove, Bristol BS8 2BN, United Kingdom.
|
49
|
Li Y, Yang AY, Marelli A, Li Y. MixEHR-SurG: A joint proportional hazard and guided topic model for inferring mortality-associated topics from electronic health records. J Biomed Inform 2024; 153:104638. [PMID: 38631461 DOI: 10.1016/j.jbi.2024.104638] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/20/2023] [Revised: 03/07/2024] [Accepted: 04/03/2024] [Indexed: 04/19/2024]
Abstract
Survival models can help medical practitioners to evaluate the prognostic importance of clinical variables to patient outcomes such as mortality or hospital readmission and subsequently design personalized treatment regimes. Electronic Health Records (EHRs) hold the promise for large-scale survival analysis based on systematically recorded clinical features for each patient. However, existing survival models either do not scale to high dimensional and multi-modal EHR data or are difficult to interpret. In this study, we present a supervised topic model called MixEHR-SurG to simultaneously integrate heterogeneous EHR data and model survival hazard. Our contributions are threefold: (1) integrating EHR topic inference with Cox proportional hazards likelihood; (2) integrating patient-specific topic hyperparameters using the PheCode concepts such that each topic can be identified with exactly one PheCode-associated phenotype; (3) multi-modal survival topic inference. This leads to a highly interpretable survival topic model that can infer PheCode-specific phenotype topics associated with patient mortality. We evaluated MixEHR-SurG using a simulated dataset and two real-world EHR datasets: the Quebec Congenital Heart Disease (CHD) data, consisting of 8211 subjects with 75,187 outpatient claim records of 1767 unique ICD codes, and the MIMIC-III data, consisting of 1458 subjects with multi-modal EHR records. Compared to the baselines, MixEHR-SurG achieved a superior dynamic AUROC for mortality prediction, with a mean AUROC score of 0.89 in the simulation dataset and a mean AUROC of 0.645 on the CHD dataset. Qualitatively, MixEHR-SurG associates severe cardiac conditions with high mortality risk among the CHD patients after the first heart failure hospitalization and critical brain injuries with increased mortality among the MIMIC-III patients after their ICU discharge. 
Together, the integration of the Cox proportional hazards model and EHR topic inference in MixEHR-SurG not only leads to competitive mortality prediction but also meaningful phenotype topics for in-depth survival analysis. The software is available at GitHub: https://github.com/li-lab-mcgill/MixEHR-SurG.
Affiliation(s)
- Yixuan Li
- Department of Mathematics and Statistics, McGill University, Montreal, Canada; Mila - Quebec AI Institute, Montreal, Canada.
- Archer Y Yang
- Department of Mathematics and Statistics, McGill University, Montreal, Canada; Mila - Quebec AI Institute, Montreal, Canada; School of Computer Science, McGill University, Montreal, Canada.
- Ariane Marelli
- McGill Adult Unit for Congenital Heart Disease (MAUDE Unit), McGill University Health Centre, Montreal, Canada.
- Yue Li
- Mila - Quebec AI Institute, Montreal, Canada; School of Computer Science, McGill University, Montreal, Canada.
|
50
|
Mosley JD, Shelley JP, Dickson AL, Zanussi J, Daniel LL, Zheng NS, Bastarache L, Wei WQ, Shi M, Jarvik GP, Rosenthal EA, Khan A, Sherafati A, Kullo IJ, Walunas TL, Glessner J, Hakonarson H, Cox NJ, Roden DM, Frangakis SG, Vanderwerff B, Stein CM, Van Driest SL, Borinstein SC, Shu XO, Zawistowski M, Chung CP, Kawai VK. Clinical associations with a polygenic predisposition to benign lower white blood cell counts. Nat Commun 2024; 15:3384. [PMID: 38649760 PMCID: PMC11035609 DOI: 10.1038/s41467-024-47804-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/20/2023] [Accepted: 04/10/2024] [Indexed: 04/25/2024] Open
Abstract
Polygenic variation unrelated to disease contributes to interindividual variation in baseline white blood cell (WBC) counts, but its clinical significance is uncharacterized. We investigated the clinical consequences of a genetic predisposition toward lower WBC counts among 89,559 biobank participants from tertiary care centers using a polygenic score for WBC count (PGSWBC) comprising single nucleotide polymorphisms not associated with disease. A predisposition to lower WBC counts was associated with a decreased risk of identifying pathology on a bone marrow biopsy performed for a low WBC count (odds ratio = 0.55 per standard deviation increase in PGSWBC [95% CI, 0.30-0.94], p = 0.04) and an increased risk of leukopenia (a low WBC count) when treated with a chemotherapeutic (n = 1,724, hazard ratio [HR] = 0.78 [0.69-0.88], p = 4.0 × 10⁻⁵) or immunosuppressant (n = 354, HR = 0.61 [0.38-0.99], p = 0.04). A predisposition to benign lower WBC counts was associated with an increased risk of discontinuing azathioprine treatment (n = 1,466, HR = 0.62 [0.44-0.87], p = 0.006). Collectively, these findings suggest that there are genetically predisposed individuals who are susceptible to escalations or alterations in clinical care that may be harmful or of little benefit.
Affiliation(s)
- Jonathan D Mosley
- Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA.
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA.
- John P Shelley
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA
- Alyson L Dickson
- Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
- Jacy Zanussi
- Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
- Laura L Daniel
- Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
- Neil S Zheng
- Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
- Yale School of Medicine, New Haven, CT, USA
- Lisa Bastarache
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA
- Wei-Qi Wei
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA
- Mingjian Shi
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA
- Gail P Jarvik
- Department of Genome Sciences, University of Washington Medical Center, Seattle, WA, USA
- Department of Medicine (Medical Genetics), University of Washington Medical Center, Seattle, WA, USA
- Elisabeth A Rosenthal
- Department of Medicine (Medical Genetics), University of Washington Medical Center, Seattle, WA, USA
- Atlas Khan
- Division of Nephrology, Dept of Medicine, Vagelos College of Physicians & Surgeons, Columbia University, New York, NY, USA
- Alborz Sherafati
- Department of Cardiovascular Medicine, Mayo Clinic, Rochester, MN, USA
- Iftikhar J Kullo
- Department of Cardiovascular Medicine, Mayo Clinic, Rochester, MN, USA
- Theresa L Walunas
- Department of Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL, USA
- Joseph Glessner
- Department of Pediatrics, Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Hakon Hakonarson
- Department of Pediatrics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Nancy J Cox
- Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
- Dan M Roden
- Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA
- Department of Pharmacology, Vanderbilt University Medical Center, Nashville, TN, USA
- Stephan G Frangakis
- Department of Anesthesiology, University of Michigan Medical School, Ann Arbor, MI, USA
- Brett Vanderwerff
- Department of Biostatistics, Center for Statistical Genetics, University of Michigan, Ann Arbor, MI, USA
- C Michael Stein
- Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
- Department of Pharmacology, Vanderbilt University Medical Center, Nashville, TN, USA
- Sara L Van Driest
- Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
- Department of Pediatrics, Vanderbilt University Medical Center, Nashville, TN, USA
- Scott C Borinstein
- Department of Pediatrics, Vanderbilt University Medical Center, Nashville, TN, USA
- Xiao-Ou Shu
- Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
- Vanderbilt-Ingram Cancer Center, Vanderbilt University School of Medicine, Nashville, TN, USA
- Matthew Zawistowski
- Department of Biostatistics, Center for Statistical Genetics, University of Michigan, Ann Arbor, MI, USA
- Vivian K Kawai
- Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
|