1
Azhir A, Hügel J, Tian J, Cheng J, Bassett IV, Bell DS, Bernstam EV, Farhat MR, Henderson DW, Lau ES, Morris M, Semenov YR, Triant VA, Visweswaran S, Strasser ZH, Klann JG, Murphy SN, Estiri H. Precision Phenotyping for Curating Research Cohorts of Patients with Post-Acute Sequelae of COVID-19 (PASC) as a Diagnosis of Exclusion. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2024:2024.04.13.24305771. [PMID: 38699316 PMCID: PMC11065031 DOI: 10.1101/2024.04.13.24305771] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/05/2024]
Abstract
Scalable identification of patients with the post-acute sequelae of COVID-19 (PASC) is challenging due to a lack of reproducible precision phenotyping algorithms and the suboptimal accuracy, demographic biases, and underestimation of the PASC diagnosis code (ICD-10 U09.9). In a retrospective case-control study, we developed a precision phenotyping algorithm for identifying research cohorts of PASC patients, defined as a diagnosis of exclusion. We used longitudinal electronic health records (EHR) data from over 295 thousand patients from 14 hospitals and 20 community health centers in Massachusetts. The algorithm employs an attention mechanism to exclude sequelae that prior conditions can explain. We performed independent chart reviews to tune and validate our precision phenotyping algorithm. Our PASC phenotyping algorithm improves precision and prevalence estimation and reduces bias in identifying Long COVID patients compared to the U09.9 diagnosis code. Our algorithm identified a PASC research cohort of over 24 thousand patients (compared to about 6 thousand when using the U09.9 diagnosis code), with a 79.9 percent precision (compared to 77.8 percent from the U09.9 diagnosis code). Our estimated prevalence of PASC was 22.8 percent, which is close to the national estimates for the region. We also provide an in-depth analysis outlining the clinical attributes, encompassing identified lingering effects by organ, comorbidity profiles, and temporal differences in the risk of PASC. The PASC phenotyping method presented in this study boasts superior precision, accurately gauges the prevalence of PASC without underestimating it, and exhibits less bias in pinpointing Long COVID patients. The PASC cohort derived from our algorithm will serve as a springboard for delving into Long COVID's genetic, metabolomic, and clinical intricacies, surmounting the constraints of recent PASC cohort studies, which were hampered by their limited size and available outcome data.
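The "diagnosis of exclusion" idea in this abstract can be illustrated with a deliberately simple rule. The sketch below is not the authors' attention-based algorithm; it only assumes that a condition counts as a candidate sequela when it is recorded after the infection date and has no record before it. The function name, the record layout, and the condition labels are all hypothetical.

```python
from datetime import date

def candidate_sequelae(records, infection_date):
    """Return conditions recorded after infection that have no earlier record.

    `records` is a list of (condition, date) pairs for one patient.
    This is a rule-based stand-in, not the attention mechanism from the paper.
    """
    before = {cond for cond, when in records if when < infection_date}
    after = {cond for cond, when in records if when >= infection_date}
    # Anything already present before infection can be explained by a prior condition.
    return sorted(after - before)

# Hypothetical patient history: fatigue predates the infection, dyspnea does not.
records = [
    ("fatigue", date(2020, 1, 10)),
    ("fatigue", date(2021, 3, 1)),
    ("dyspnea", date(2021, 3, 15)),
]
print(candidate_sequelae(records, date(2021, 1, 5)))
```

A real phenotyping pipeline would also need a post-acute time window and chart-review validation, both of which the abstract mentions but this sketch omits.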
2
Jiang S, Gai X, Treggiari MM, Stead WW, Zhao Y, Page CD, Zhang AR. Soft phenotyping for sepsis via EHR time-aware soft clustering. J Biomed Inform 2024; 152:104615. [PMID: 38423266 PMCID: PMC11073833 DOI: 10.1016/j.jbi.2024.104615] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/01/2023] [Revised: 01/25/2024] [Accepted: 02/20/2024] [Indexed: 03/02/2024]
Abstract
OBJECTIVE Sepsis is one of the most serious hospital conditions associated with high mortality. Sepsis is the result of a dysregulated immune response to infection that can lead to multiple organ dysfunction and death. Due to the wide variability in the causes of sepsis, clinical presentation, and the recovery trajectories, identifying sepsis sub-phenotypes is crucial to advance our understanding of sepsis characterization, to choose targeted treatments and optimal timing of interventions, and to improve prognostication. Prior studies have described different sub-phenotypes of sepsis using organ-specific characteristics. These studies applied clustering algorithms to electronic health records (EHRs) to identify disease sub-phenotypes. However, prior approaches did not capture temporal information and made uncertain assumptions about the relationships among the sub-phenotypes for clustering procedures. METHODS We developed a time-aware soft clustering algorithm guided by clinical variables to identify sepsis sub-phenotypes using data available in the EHR. RESULTS We identified six novel sepsis hybrid sub-phenotypes and evaluated them for medical plausibility. In addition, we built an early-warning sepsis prediction model using logistic regression. CONCLUSION Our results suggest that these novel sepsis hybrid sub-phenotypes are promising to provide more accurate information on sepsis-related organ dysfunction and sepsis recovery trajectories which can be important to inform management decisions and sepsis prognosis.
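To make "soft clustering" concrete: instead of assigning a patient to exactly one sub-phenotype, each patient receives a membership weight per cluster. The sketch below uses the standard fuzzy c-means membership formula as an assumed stand-in; it is not the time-aware algorithm developed in the paper, and the function name, feature vectors, and fuzzifier `m` are illustrative.

```python
import math

def soft_memberships(point, centers, m=2.0):
    """Fuzzy c-means style membership of `point` in each cluster center.

    Memberships are in [0, 1] and sum to 1. `m` > 1 is the fuzzifier;
    larger values give softer (more evenly spread) memberships.
    """
    dists = [math.dist(point, c) for c in centers]
    zeros = [d == 0.0 for d in dists]
    if any(zeros):
        # The point coincides with one or more centers: split membership among them.
        share = 1.0 / sum(zeros)
        return [share if z else 0.0 for z in zeros]
    exponent = 2.0 / (m - 1.0)
    return [
        1.0 / sum((d_i / d_j) ** exponent for d_j in dists)
        for d_i in dists
    ]

# Hypothetical 2-D feature vector and two cluster centers.
weights = soft_memberships((1.0, 0.0), [(0.0, 0.0), (3.0, 0.0)])
print(weights)
```

A hybrid sub-phenotype, in this reading, is a patient whose weight is split across clusters rather than concentrated in one.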
Affiliation(s)
- Shiyi Jiang
- Department of Electrical & Computer Engineering, Duke University, Durham, 27708, NC, USA
- Xin Gai
- Department of Statistical Science, Duke University, Durham, 27708, NC, USA
- William W Stead
- Department of Biomedical Informatics, Vanderbilt University, Nashville, 37235, TN, USA
- Yuankang Zhao
- Department of Biostatistics & Bioinformatics, Duke University, Durham, 27708, NC, USA
- C David Page
- Department of Biostatistics & Bioinformatics, Duke University, Durham, 27708, NC, USA
- Anru R Zhang
- Department of Biostatistics & Bioinformatics, Duke University, Durham, 27708, NC, USA; Department of Computer Science, Duke University, Durham, 27708, NC, USA.
3
Oh W, Jayaraman P, Tandon P, Chaddha US, Kovatch P, Charney AW, Glicksberg BS, Nadkarni GN. A novel method leveraging time series data to improve subphenotyping and application in critically ill patients with COVID-19. Artif Intell Med 2024; 148:102750. [PMID: 38325922 PMCID: PMC10864255 DOI: 10.1016/j.artmed.2023.102750] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/06/2023] [Revised: 12/12/2023] [Accepted: 12/14/2023] [Indexed: 02/09/2024]
Abstract
Computational subphenotyping, a data-driven approach to understanding disease subtypes, is a prominent topic in medical research. Numerous ongoing studies are dedicated to developing advanced computational subphenotyping methods for cross-sectional data. However, the potential of time-series data has been underexplored until now. Here, we propose a Multivariate Levenshtein Distance (MLD) that can account for correlation among multiple discrete features over time-series data. Our algorithm has two distinct components: the MLD itself, and an optimal threshold score that enhances sensitivity in discriminating between pairs of instances. We applied the proposed distance metric with the k-means clustering algorithm to derive temporal subphenotypes from time-series data of biomarkers and treatment administrations from 1039 critically ill patients with COVID-19 and compared its effectiveness to standard methods. In conclusion, the Multivariate Levenshtein Distance metric is a novel method to quantify the distance from multiple discrete features over time-series data and demonstrates superior clustering performance among competing time-series distance metrics.
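For readers unfamiliar with the underlying metric, the sketch below shows the classic Levenshtein (edit) distance applied to sequences whose elements are tuples of discrete features, one tuple per time step. It is an assumed baseline only: the MLD described in the abstract adds a threshold score and handles feature correlation, neither of which is reproduced here. The feature labels are hypothetical.

```python
def edit_distance(seq_a, seq_b):
    """Classic Levenshtein distance between two sequences of hashable items.

    Each item may be a tuple of discrete features observed at one time step,
    so two time steps match only if every feature agrees.
    """
    prev = list(range(len(seq_b) + 1))  # distance from empty prefix of seq_a
    for i, a in enumerate(seq_a, start=1):
        curr = [i]
        for j, b in enumerate(seq_b, start=1):
            cost = 0 if a == b else 1
            curr.append(min(
                prev[j] + 1,         # delete a
                curr[j - 1] + 1,     # insert b
                prev[j - 1] + cost,  # substitute (or match)
            ))
        prev = curr
    return prev[-1]

# Hypothetical patients: (biomarker level, treatment status) per time step.
patient_a = [("high", "on"), ("high", "off"), ("low", "off")]
patient_b = [("high", "on"), ("low", "off")]
print(edit_distance(patient_a, patient_b))
```

Treating a whole tuple as one symbol is the crudest multivariate extension; a partial-match cost between tuples would be one way to move toward what the abstract describes.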
Affiliation(s)
- Wonsuk Oh
- Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA; Hasso Plattner Institute for Digital Health at Mount Sinai, Icahn School of Medicine at Mount Sinai, New York, NY, USA; Division of Data-Driven and Digital Medicine, Department of Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
- Pushkala Jayaraman
- Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA; Division of Data-Driven and Digital Medicine, Department of Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Pranai Tandon
- Division of Pulmonary, Critical Care and Sleep Medicine, Department of Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Udit S Chaddha
- Division of Pulmonary, Critical Care and Sleep Medicine, Department of Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Patricia Kovatch
- Department of Scientific Computing, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Alexander W Charney
- Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA; Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA; Department of Neuroscience, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Benjamin S Glicksberg
- Hasso Plattner Institute for Digital Health at Mount Sinai, Icahn School of Medicine at Mount Sinai, New York, NY, USA; Character Biosciences, New York, NY, USA
- Girish N Nadkarni
- Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA; Division of Data-Driven and Digital Medicine, Department of Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA; Division of Nephrology, Department of Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
4
Wang Y, Stroh JN, Hripcsak G, Low Wang CC, Bennett TD, Wrobel J, Der Nigoghossian C, Mueller SW, Claassen J, Albers DJ. A methodology of phenotyping ICU patients from EHR data: High-fidelity, personalized, and interpretable phenotypes estimation. J Biomed Inform 2023; 148:104547. [PMID: 37984547 PMCID: PMC10802138 DOI: 10.1016/j.jbi.2023.104547] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2023] [Revised: 11/13/2023] [Accepted: 11/16/2023] [Indexed: 11/22/2023]
Abstract
OBJECTIVE Computing phenotypes that provide high-fidelity, time-dependent characterizations and yield personalized interpretations is challenging, especially given the complexity of physiological and healthcare systems and clinical data quality. This paper develops a methodological pipeline to estimate unmeasured physiological parameters and produce high-fidelity, personalized phenotypes anchored to physiological mechanics from electronic health record (EHR). METHODS A methodological phenotyping pipeline is developed that computes new phenotypes defined with unmeasurable computational biomarkers quantifying specific physiological properties in real time. Working within the inverse problem framework, this pipeline is applied to the glucose-insulin system for ICU patients using data assimilation to estimate an established mathematical physiological model with stochastic optimization. This produces physiological model parameter vectors of clinically unmeasured endocrine properties, here insulin secretion, clearance, and resistance, estimated for individual patient. These physiological parameter vectors are used as inputs to unsupervised machine learning methods to produce phenotypic labels and discrete physiological phenotypes. These phenotypes are inherently interpretable because they are based on parametric physiological descriptors. To establish potential clinical utility, the computed phenotypes are evaluated with external EHR data for consistency and reliability and with clinician face validation. RESULTS The phenotype computation was performed on a cohort of 109 ICU patients who received no or short-acting insulin therapy, rendering continuous and discrete physiological phenotypes as specific computational biomarkers of unmeasured insulin secretion, clearance, and resistance on time windows of three days. Six, six, and five discrete phenotypes were found in the first, middle, and last three-day periods of ICU stays, respectively. 
Computed phenotypic labels were predictive with an average accuracy of 89%. External validation of discrete phenotypes showed coherence and consistency in clinically observable differences based on laboratory measurements and ICD 9/10 codes and clinical concordance from face validity. A particularly clinically impactful parameter, insulin secretion, had a concordance accuracy of 83%±27%. CONCLUSION The new physiological phenotypes computed with individual patient ICU data and defined by estimates of mechanistic model parameters have high physiological fidelity, are continuous, time-specific, personalized, interpretable, and predictive. This methodology is generalizable to other clinical and physiological settings and opens the door for discovering deeper physiological information to personalize medical care.
Affiliation(s)
- Yanran Wang
- Department of Biostatistics and Informatics, Colorado School of Public Health, University of Colorado Anschutz Medical Campus, 13001 East 17th Place, 3rd Floor, Mail Stop B119, Aurora, CO 80045, United States of America; Department of Biomedical Informatics, University of Colorado School of Medicine, Anschutz Health Sciences Building, 1890 N. Revere Court, Mailstop F600, Aurora, CO 80045, United States of America.
- J N Stroh
- Department of Biomedical Informatics, University of Colorado School of Medicine, Anschutz Health Sciences Building, 1890 N. Revere Court, Mailstop F600, Aurora, CO 80045, United States of America; Department of Biomedical Engineering, University of Colorado, 12705 East Montview Boulevard, Suite 100, Aurora, CO 80045, United States of America
- George Hripcsak
- Biomedical Informatics, Columbia University, 622 W. 168th Street, PH20, New York, NY 10032, United States of America
- Cecilia C Low Wang
- Division of Endocrinology, Metabolism and Diabetes, Department of Medicine, University of Colorado School of Medicine, 12801 East 17th Avenue, 7103, Aurora, CO 80045, United States of America
- Tellen D Bennett
- Department of Biomedical Informatics, University of Colorado School of Medicine, Anschutz Health Sciences Building, 1890 N. Revere Court, Mailstop F600, Aurora, CO 80045, United States of America
- Julia Wrobel
- Department of Biostatistics and Bioinformatics, Rollins School of Public Health, Emory University, 1518 Clifton Rd, NE Atlanta, GA 30322, United States of America
- Caroline Der Nigoghossian
- Columbia University School of Nursing, 560 West 168th Street, New York, NY 10032, United States of America
- Scott W Mueller
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of Colorado Anschutz Medical Campus, 12850 East Montview Boulevard, Aurora, CO 80045, United States of America
- Jan Claassen
- The Neurological Institute of New York, Columbia University Irving Medical Center, 710 West 168th Street, New York NY 10032, United States of America
- D J Albers
- Department of Biostatistics and Informatics, Colorado School of Public Health, University of Colorado Anschutz Medical Campus, 13001 East 17th Place, 3rd Floor, Mail Stop B119, Aurora, CO 80045, United States of America; Department of Biomedical Informatics, University of Colorado School of Medicine, Anschutz Health Sciences Building, 1890 N. Revere Court, Mailstop F600, Aurora, CO 80045, United States of America; Department of Biomedical Engineering, University of Colorado, 12705 East Montview Boulevard, Suite 100, Aurora, CO 80045, United States of America; Biomedical Informatics, Columbia University, 622 W. 168th Street, PH20, New York, NY 10032, United States of America
5
Dagliati A, Strasser ZH, Hossein Abad ZS, Klann JG, Wagholikar KB, Mesa R, Visweswaran S, Morris M, Luo Y, Henderson DW, Samayamuthu MJ, Tan BW, Verdy G, Omenn GS, Xia Z, Bellazzi R, Murphy SN, Holmes JH, Estiri H. Characterization of long COVID temporal sub-phenotypes by distributed representation learning from electronic health record data: a cohort study. EClinicalMedicine 2023; 64:102210. [PMID: 37745021 PMCID: PMC10511779 DOI: 10.1016/j.eclinm.2023.102210] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2023] [Revised: 08/29/2023] [Accepted: 08/29/2023] [Indexed: 09/26/2023]
Abstract
Background Characterizing Post-Acute Sequelae of COVID (SARS-CoV-2 Infection), or PASC has been challenging due to the multitude of sub-phenotypes, temporal attributes, and definitions. Scalable characterization of PASC sub-phenotypes can enhance screening capacities, disease management, and treatment planning. Methods We conducted a retrospective multi-centre observational cohort study, leveraging longitudinal electronic health record (EHR) data of 30,422 patients from three healthcare systems in the Consortium for the Clinical Characterization of COVID-19 by EHR (4CE). From the total cohort, we applied a deductive approach on 12,424 individuals with follow-up data and developed a distributed representation learning process for providing augmented definitions for PASC sub-phenotypes. Findings Our framework characterized seven PASC sub-phenotypes. We estimated that on average 15.7% of the hospitalized COVID-19 patients were likely to suffer from at least one PASC symptom and almost 5.98%, on average, had multiple symptoms. Joint pain and dyspnea had the highest prevalence, with an average prevalence of 5.45% and 4.53%, respectively. Interpretation We provided a scalable framework to every participating healthcare system for estimating PASC sub-phenotypes prevalence and temporal attributes, thus developing a unified model that characterizes augmented sub-phenotypes across the different systems. Funding Authors are supported by National Institute of Allergy and Infectious Diseases, National Institute on Aging, National Center for Advancing Translational Sciences, National Medical Research Council, National Institute of Neurological Disorders and Stroke, European Union, National Institutes of Health, National Center for Advancing Translational Sciences.
Affiliation(s)
- Arianna Dagliati
- Department of Electrical Computer and Biomedical Engineering, University of Pavia, Pavia, Italy
- Zachary H. Strasser
- Department of Medicine, Massachusetts General Hospital, Boston, United States
- Jeffrey G. Klann
- Department of Medicine, Massachusetts General Hospital, Boston, United States
- Rebecca Mesa
- Department of Electrical Computer and Biomedical Engineering, University of Pavia, Pavia, Italy
- Shyam Visweswaran
- Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, United States
- Michele Morris
- Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, United States
- Yuan Luo
- Department of Preventive Medicine, Northwestern University, Chicago, United States
- Darren W. Henderson
- University of Kentucky, Center for Clinical and Translational Science, Lexington, United States
- Bryce W.Q. Tan
- National University Hospital, Singapore Department of Medicine, Singapore
- Guillame Verdy
- Bordeaux University Hospital, IAM Unit, Bordeaux, France
- Gilbert S. Omenn
- University of Michigan, Department of Computational Medicine and Bioinformatics, Internal Medicine, Human Genetics, and School of Public Health, Ann Arbor, United States
- Zongqi Xia
- University of Pittsburgh Department of Neurology, Pittsburgh, United States
- Riccardo Bellazzi
- Department of Electrical Computer and Biomedical Engineering, University of Pavia, Pavia, Italy
- Shawn N. Murphy
- Department of Neurology, Massachusetts General Hospital, Boston, United States
- John H. Holmes
- University of Pennsylvania Perelman School of Medicine, Department of Biostatistics, Epidemiology, and Informatics, Institute for Biomedical Informatics, Philadelphia, United States
- Hossein Estiri
- Department of Medicine, Massachusetts General Hospital, Boston, United States
6
Flothow A, Novelli A, Sundmacher L. Analytical methods for identifying sequences of utilization in health data: a scoping review. BMC Med Res Methodol 2023; 23:212. [PMID: 37759162 PMCID: PMC10523647 DOI: 10.1186/s12874-023-02019-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2022] [Accepted: 08/08/2023] [Indexed: 09/29/2023]
Abstract
BACKGROUND Healthcare, as with other sectors, has undergone progressive digitalization, generating an ever-increasing wealth of data that enables research and the analysis of patient movement. This can help to evaluate treatment processes and outcomes, and in turn improve the quality of care. This scoping review provides an overview of the algorithms and methods that have been used to identify care pathways from healthcare utilization data. METHOD This review was conducted according to the methodology of the Joanna Briggs Institute and the Preferred Reporting Items for Systematic Reviews Extension for Scoping Reviews (PRISMA-ScR) Checklist. The PubMed, Web of Science, Scopus, and EconLit databases were searched and studies published in English between 2000 and 2021 considered. The search strategy used keywords divided into three categories: the method of data analysis, the requirement profile for the data, and the intended presentation of results. Criteria for inclusion were that health data were analyzed, the methodology used was described and that the chronology of care events was considered. In a two-stage review process, records were reviewed by two researchers independently for inclusion. Results were synthesized narratively. RESULTS The literature search yielded 2,865 entries; 51 studies met the inclusion criteria. Health data from different countries ([Formula: see text]) and of different types of disease ([Formula: see text]) were analyzed with respect to different care events. Applied methods can be divided into those identifying subsequences of care and those describing full care trajectories. Variants of pattern mining or Markov models were mostly used to extract subsequences, with clustering often applied to find care trajectories. Statistical algorithms such as rule mining, probability-based machine learning algorithms or a combination of methods were also applied. Clustering methods were sometimes used for data preparation or result compression. 
Further characteristics of the included studies are presented. CONCLUSION Various data mining methods are already being applied to gain insight from health data. The great heterogeneity of the methods used shows the need for a scoping review. We performed a narrative review and found that clustering methods currently dominate the literature for identifying complete care trajectories, while variants of pattern mining dominate for identifying subsequences of limited length.
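One of the method families named in this review, Markov models over care events, can be illustrated by estimating first-order transition probabilities from observed care sequences. This is a generic sketch under assumed inputs, not a method taken from any of the reviewed studies; the function name and the care-event labels are hypothetical.

```python
from collections import Counter, defaultdict

def transition_probabilities(sequences):
    """Estimate first-order Markov transition probabilities from care sequences.

    `sequences` is a list of per-patient lists of care events in time order.
    Returns {state: {next_state: probability}}.
    """
    counts = defaultdict(Counter)
    for seq in sequences:
        # Count each adjacent (current, next) pair of care events.
        for current, nxt in zip(seq, seq[1:]):
            counts[current][nxt] += 1
    return {
        state: {nxt: n / sum(followers.values()) for nxt, n in followers.items()}
        for state, followers in counts.items()
    }

# Hypothetical utilization sequences for three patients.
pathways = [
    ["GP", "specialist", "hospital"],
    ["GP", "hospital"],
    ["GP", "specialist"],
]
probs = transition_probabilities(pathways)
print(probs["GP"])
```

Such a model captures short subsequences of care; describing full trajectories, as the review notes, is more often done with clustering.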
Affiliation(s)
- Amelie Flothow
- Chair of Health Economics, Technical University of Munich, Georg-Brauchle-Ring, Munich, Bavaria, 80992, Germany.
- Anna Novelli
- Chair of Health Economics, Technical University of Munich, Georg-Brauchle-Ring, Munich, Bavaria, 80992, Germany
- Leonie Sundmacher
- Chair of Health Economics, Technical University of Munich, Georg-Brauchle-Ring, Munich, Bavaria, 80992, Germany
7
Wang Y, Stroh JN, Hripcsak G, Low Wang CC, Bennett TD, Wrobel J, Der Nigoghossian C, Mueller S, Claassen J, Albers DJ. A methodology of phenotyping ICU patients from EHR data: high-fidelity, personalized, and interpretable phenotypes estimation. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2023:2023.03.15.23287315. [PMID: 37662404 PMCID: PMC10473766 DOI: 10.1101/2023.03.15.23287315] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 09/05/2023]
Abstract
Objective Computing phenotypes that provide high-fidelity, time-dependent characterizations and yield personalized interpretations is challenging, especially given the complexity of physiological and healthcare systems and clinical data quality. This paper develops a methodological pipeline to estimate unmeasured physiological parameters and produce high-fidelity, personalized phenotypes anchored to physiological mechanics from electronic health record (EHR). Methods A methodological phenotyping pipeline is developed that computes new phenotypes defined with unmeasurable computational biomarkers quantifying specific physiological properties in real time. Working within the inverse problem framework, this pipeline is applied to the glucose-insulin system for ICU patients using data assimilation to estimate an established mathematical physiological model with stochastic optimization. This produces physiological model parameter vectors of clinically unmeasured endocrine properties, here insulin secretion, clearance, and resistance, estimated for individual patient. These physiological parameter vectors are used as inputs to unsupervised machine learning methods to produce phenotypic labels and discrete physiological phenotypes. These phenotypes are inherently interpretable because they are based on parametric physiological descriptors. To establish potential clinical utility, the computed phenotypes are evaluated with external EHR data for consistency and reliability and with clinician face validation. Results The phenotype computation was performed on a cohort of 109 ICU patients who received no or short-acting insulin therapy, rendering continuous and discrete physiological phenotypes as specific computational biomarkers of unmeasured insulin secretion, clearance, and resistance on time windows of three days. Six, six, and five discrete phenotypes were found in the first, middle, and last three-day periods of ICU stays, respectively. 
Computed phenotypic labels were predictive with an average accuracy of 89%. External validation of discrete phenotypes showed coherence and consistency in clinically observable differences based on laboratory measurements and ICD 9/10 codes and clinical concordance from face validity. A particularly clinically impactful parameter, insulin secretion, had a concordance accuracy of 83% ± 27%. Conclusion The new physiological phenotypes computed with individual patient ICU data and defined by estimates of mechanistic model parameters have high physiological fidelity, are continuous, time-specific, personalized, interpretable, and predictive. This methodology is generalizable to other clinical and physiological settings and opens the door for discovering deeper physiological information to personalize medical care.
8
Zhou Y, Shi J, Stein R, Liu X, Baldassano RN, Forrest CB, Chen Y, Huang J. Missing data matter: an empirical evaluation of the impacts of missing EHR data in comparative effectiveness research. J Am Med Inform Assoc 2023; 30:1246-1256. [PMID: 37337922 PMCID: PMC10280351 DOI: 10.1093/jamia/ocad066] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/01/2022] [Revised: 03/20/2023] [Accepted: 04/08/2023] [Indexed: 06/21/2023]
Abstract
OBJECTIVES The impacts of missing data in comparative effectiveness research (CER) using electronic health records (EHRs) may vary depending on the type and pattern of missing data. In this study, we aimed to quantify these impacts and compare the performance of different imputation methods. MATERIALS AND METHODS We conducted an empirical (simulation) study to quantify the bias and power loss in estimating treatment effects in CER using EHR data. We considered various missing scenarios and used the propensity scores to control for confounding. We compared the performance of the multiple imputation and spline smoothing methods to handle missing data. RESULTS When missing data depended on the stochastic progression of disease and medical practice patterns, the spline smoothing method produced results that were close to those obtained when there were no missing data. Compared to multiple imputation, the spline smoothing generally performed similarly or better, with smaller estimation bias and less power loss. The multiple imputation can still reduce study bias and power loss in some restrictive scenarios, eg, when missing data did not depend on the stochastic process of disease progression. DISCUSSION AND CONCLUSION Missing data in EHRs could lead to biased estimates of treatment effects and false negative findings in CER even after missing data were imputed. It is important to leverage the temporal information of disease trajectory to impute missing values when using EHRs as a data resource for CER and to consider the missing rate and the effect size when choosing an imputation method.
Affiliation(s)
- Yizhao Zhou
- Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Department of Pediatrics, Children’s Hospital of Philadelphia, Philadelphia, Pennsylvania, USA
- Jiasheng Shi
- Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Department of Pediatrics, Children’s Hospital of Philadelphia, Philadelphia, Pennsylvania, USA
- Ronen Stein
- Department of Pediatrics, Children’s Hospital of Philadelphia, Philadelphia, Pennsylvania, USA
- Department of Pediatrics, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Xiaokang Liu
- Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Robert N Baldassano
- Department of Pediatrics, Children’s Hospital of Philadelphia, Philadelphia, Pennsylvania, USA
- Department of Pediatrics, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Christopher B Forrest
- Department of Pediatrics, Children’s Hospital of Philadelphia, Philadelphia, Pennsylvania, USA
- Department of Pediatrics, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Yong Chen
- Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Jing Huang
- Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Department of Pediatrics, Children’s Hospital of Philadelphia, Philadelphia, Pennsylvania, USA
9
Estiri H, Azhir A, Blacker DL, Ritchie CS, Patel CJ, Murphy SN. Temporal characterization of Alzheimer's Disease with sequences of clinical records. EBioMedicine 2023; 92:104629. [PMID: 37247495 DOI: 10.1016/j.ebiom.2023.104629] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/24/2023] [Revised: 05/05/2023] [Accepted: 05/10/2023] [Indexed: 05/31/2023]
Abstract
BACKGROUND Alzheimer's Disease (AD) is a complex clinical phenotype with unprecedented social and economic tolls on an ageing global population. Real-world data (RWD) from electronic health records (EHRs) offer opportunities to accelerate precision drug development and scale epidemiological research on AD. A precise characterization of AD cohorts is needed to address the noise abundant in RWD. METHODS We conducted a retrospective cohort study to develop and test computational models for AD cohort identification using clinical data from 8 Massachusetts healthcare systems. We mined temporal representations from EHR data using the transitive sequential pattern mining algorithm (tSPM) to train and validate our models. We then tested our models against a held-out test set from a review of medical records to adjudicate the presence of AD. We trained two classes of Machine Learning models, using Gradient Boosting Machine (GBM), to compare the utility of AD diagnosis records versus the tSPM temporal representations (comprising sequences of diagnosis and medication observations) from electronic medical records for characterizing AD cohorts. FINDINGS In a group of 4985 patients, we identified 219 tSPM temporal representations (i.e., transitive sequences) of medical records for constructing the best classification models. The models with sequential features improved AD classification by a magnitude of 3-16 percent over the use of AD diagnosis codes alone. The computed cohort included 663 patients, 35 of whom had no record of AD. Six groups of tSPM sequences were identified for characterizing the AD cohorts. INTERPRETATION We present sequential patterns of diagnosis and medication codes from electronic medical records, as digital markers of Alzheimer's Disease. Classification algorithms developed on sequential patterns can replace standard features from EHRs to enrich phenotype modelling. 
FUNDING National Institutes of Health: the National Institute on Aging (RF1AG074372) and the National Institute of Allergy and Infectious Diseases (R01AI165535).
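The temporal representations described in this abstract can be illustrated with a minimal sketch. It is not the authors' tSPM implementation; it assumes that a transitive sequence is an ordered pair of codes in which the first record of one code precedes the first record of another, with any number of records in between. The function name `transitive_pairs`, the `(day, code)` record layout, and the example codes are hypothetical.

```python
from collections import Counter
from itertools import combinations


def transitive_pairs(records):
    """Return ordered code pairs (a, b) where a is first recorded before b.

    records: iterable of (day, code) tuples for one patient.
    Only the first occurrence of each code is used (an assumption of this sketch).
    """
    first_seen = {}
    for day, code in sorted(records):
        first_seen.setdefault(code, day)  # keep the earliest day per code
    ordered = sorted(first_seen.items(), key=lambda item: item[1])
    pairs = set()
    for (code_a, day_a), (code_b, day_b) in combinations(ordered, 2):
        if day_a < day_b:  # same-day records have no defined order
            pairs.add((code_a, code_b))
    return pairs


# Hypothetical toy data, for illustration only.
patients = {
    "p1": [(0, "dx:memory_loss"), (30, "rx:donepezil"),
           (90, "dx:alzheimers"), (95, "rx:donepezil")],
    "p2": [(0, "dx:memory_loss"), (10, "dx:alzheimers")],
}

# Count how many patients exhibit each sequence.
pair_counts = Counter(
    pair for recs in patients.values() for pair in transitive_pairs(recs)
)
```

Sequences that recur across patients could then be encoded as binary features for a classifier such as a gradient boosting machine, which is the general role the abstract assigns to them.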
Affiliation(s)
- Hossein Estiri
  - Department of Medicine, Massachusetts General Hospital, Boston, MA, USA
- Alaleh Azhir
  - Department of Medicine, Massachusetts General Hospital, Boston, MA, USA; Harvard Medical School, Harvard-MIT Program in Health Sciences and Technology, USA
- Deborah L Blacker
  - Department of Psychiatry, Massachusetts General Hospital, Boston, MA, USA
- Chirag J Patel
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Shawn N Murphy
  - Department of Neurology, Massachusetts General Hospital, Boston, MA, USA
|
10
|
Kim DH, Jensen A, Jones K, Raghavan S, Phillips LS, Hung A, Sun YV, Li G, Reaven P, Zhou H, Zhou JJ. A platform for phenotyping disease progression and associated longitudinal risk factors in large-scale EHRs, with application to incident diabetes complications in the UK Biobank. JAMIA Open 2023; 6:ooad006. [PMID: 36789288 PMCID: PMC9912368 DOI: 10.1093/jamiaopen/ooad006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/06/2022] [Revised: 01/19/2023] [Accepted: 01/31/2023] [Indexed: 02/12/2023]
Abstract
Objective Modern healthcare data reflect massive multi-level and multi-scale information collected over many years. The majority of the existing phenotyping algorithms use case-control definitions of disease. This paper aims to study the time to disease onset and progression and identify the time-varying risk factors that drive them. Materials and Methods We developed an algorithmic approach to phenotyping the incidence of diseases by consolidating data sources from the UK Biobank (UKB), including primary care electronic health records (EHRs). We focused on defining events, event dates, and their censoring time, including relevant terms and existing phenotypes, excluding generic, rare, or semantically distant terms, forward-mapping terminology terms, and expert review. We applied our approach to phenotyping diabetes complications, including a composite cardiovascular disease (CVD) outcome, diabetic kidney disease (DKD), and diabetic retinopathy (DR), in the UKB study. Results We identified 49 049 participants with diabetes. Among them, 1023 had type 1 diabetes (T1D), and 40 193 had type 2 diabetes (T2D). A total of 23 833 diabetes subjects had linked primary care records. There were 3237, 3113, and 4922 patients with CVD, DKD, and DR events, respectively. The risk prediction performance for each outcome was assessed, and our results are consistent with the prediction area under the ROC (receiver operating characteristic) curve (AUC) of standard risk prediction models using cohort studies. Discussion and Conclusion Our publicly available pipeline and platform enable streamlined curation of incidence events, identification of time-varying risk factors underlying disease progression, and the definition of a relevant cohort for time-to-event analyses. These important steps need to be considered simultaneously to study disease progression.
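The step of defining events, event dates, and censoring times can be sketched as follows. This is an illustrative outline, not the authors' published pipeline: the function name `time_to_event`, its arguments, and the rule that only events strictly after the index date count as incident are all assumptions made for the example.

```python
from datetime import date


def time_to_event(index_date, event_dates, censor_date):
    """Return (days_of_follow_up, event_observed) for one participant.

    index_date:  start of follow-up (e.g. date of diabetes diagnosis).
    event_dates: dates on which the outcome was recorded, in any order.
    censor_date: last date the participant is known to be under observation.
    Events on or before the index date are treated as prevalent and ignored
    (an assumption of this sketch).
    """
    incident = [d for d in event_dates if index_date < d <= censor_date]
    if incident:
        return (min(incident) - index_date).days, True
    return (censor_date - index_date).days, False


# Hypothetical participants, for illustration only.
with_event = time_to_event(
    date(2010, 1, 1), [date(2012, 1, 1), date(2013, 6, 1)], date(2015, 1, 1)
)
censored = time_to_event(date(2010, 1, 1), [], date(2015, 1, 1))
```

The resulting pairs of follow-up time and event indicator are the usual input to time-to-event models, which is why the abstract stresses defining them together with the cohort.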
Affiliation(s)
- Do Hyun Kim
  - Department of Biostatistics, University of California, Los Angeles, California, USA
- Aubrey Jensen
  - Department of Biostatistics, University of California, Los Angeles, California, USA
- Kelly Jones
  - Department of Computer Science, Columbia University, New York, New York, USA
- Sridharan Raghavan
  - Division of Hospital Medicine, University of Colorado School of Medicine, Aurora, Colorado, USA
  - Rocky Mountain Regional VA Medical Center, Aurora, Colorado, USA
- Lawrence S Phillips
  - Division of Endocrinology, Emory University School of Medicine, Atlanta, Georgia, USA
  - Atlanta VA Medical Center, Decatur, Georgia, USA
- Adriana Hung
  - VA Tennessee Valley Healthcare System, Nashville, Tennessee, USA
  - Vanderbilt University, Nashville, Tennessee, USA
- Yan V Sun
  - Department of Epidemiology, Emory University, Atlanta, Georgia, USA
- Gang Li
  - Department of Biostatistics, University of California, Los Angeles, California, USA
  - Department of Computational Medicine, David Geffen School of Medicine, University of California, Los Angeles, California, USA
- Peter Reaven
  - Phoenix VA Health Care System, Phoenix, Arizona, USA
- Hua Zhou
  - Department of Biostatistics, University of California, Los Angeles, California, USA
  - Department of Computational Medicine, David Geffen School of Medicine, University of California, Los Angeles, California, USA
- Jin J Zhou
  - Department of Biostatistics, University of California, Los Angeles, California, USA
  - Phoenix VA Health Care System, Phoenix, Arizona, USA
  - Department of Medicine, David Geffen School of Medicine, University of California, Los Angeles, California, USA
|
11
|
Yang S, Varghese P, Stephenson E, Tu K, Gronsbell J. Machine learning approaches for electronic health records phenotyping: a methodical review. J Am Med Inform Assoc 2023; 30:367-381. [PMID: 36413056 PMCID: PMC9846699 DOI: 10.1093/jamia/ocac216] [Citation(s) in RCA: 16] [Impact Index Per Article: 16.0] [Received: 07/28/2022] [Revised: 09/27/2022] [Accepted: 10/27/2022] [Indexed: 11/23/2022]
Abstract
OBJECTIVE Accurate and rapid phenotyping is a prerequisite to leveraging electronic health records for biomedical research. While early phenotyping relied on rule-based algorithms curated by experts, machine learning (ML) approaches have emerged as an alternative to improve scalability across phenotypes and healthcare settings. This study evaluates ML-based phenotyping with respect to (1) the data sources used, (2) the phenotypes considered, (3) the methods applied, and (4) the reporting and evaluation methods used. MATERIALS AND METHODS We searched PubMed and Web of Science for articles published between 2018 and 2022. After screening 850 articles, we recorded 37 variables on 100 studies. RESULTS Most studies utilized data from a single institution and included information in clinical notes. Although chronic conditions were most commonly considered, ML also enabled the characterization of nuanced phenotypes such as social determinants of health. Supervised deep learning was the most popular ML paradigm, while semi-supervised and weakly supervised learning were applied to expedite algorithm development and unsupervised learning to facilitate phenotype discovery. ML approaches did not uniformly outperform rule-based algorithms, but deep learning offered a marginal improvement over traditional ML for many conditions. DISCUSSION Despite the progress in ML-based phenotyping, most articles focused on binary phenotypes and few articles evaluated external validity or used multi-institution data. Study settings were infrequently reported and analytic code was rarely released. CONCLUSION Continued research in ML-based phenotyping is warranted, with emphasis on characterizing nuanced phenotypes, establishing reporting and evaluation standards, and developing methods to accommodate misclassified phenotypes due to algorithm errors in downstream applications.
Affiliation(s)
- Siyue Yang
  - Department of Statistical Sciences, University of Toronto, Toronto, Ontario, Canada
- Ellen Stephenson
  - Department of Family & Community Medicine, University of Toronto, Toronto, Ontario, Canada
- Karen Tu
  - Department of Family & Community Medicine, University of Toronto, Toronto, Ontario, Canada
- Jessica Gronsbell
  - Department of Statistical Sciences, University of Toronto, Toronto, Ontario, Canada
  - Department of Family & Community Medicine, University of Toronto, Toronto, Ontario, Canada
  - Department of Computer Science, University of Toronto, Toronto, Ontario, Canada
|
12
|
Integration of Omics and Phenotypic Data for Precision Medicine. METHODS IN MOLECULAR BIOLOGY (CLIFTON, N.J.) 2022; 2486:19-35. [PMID: 35437716 DOI: 10.1007/978-1-0716-2265-0_2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
Abstract
Over the past two decades, biomedical research has been moving toward a big-data-driven approach. The underlying causes of this transition include the ability to gather genetic or molecular profiles of humans faster, the increasing adoption of electronic health record (EHR) systems, and the growing interest in linking omics and phenotypic data for analysis. The integration of individuals' biological data (e.g., genomics, proteomics, metabolomics) and health-care data has created unprecedented opportunities for precision medicine, that is, a medical model that uses a patient's unique information, mainly genetic, to prevent, diagnose, or treat disease. This chapter reviews the research opportunities and applications of integrating omics and phenotypic data for precision medicine, such as understanding the relationship between genotype and phenotype, disease subtyping, and diagnosis or prediction of adverse outcomes. We review recent advanced methods, particularly the machine learning and deep learning-based approaches used for harnessing and harmonizing multiomics and phenotypic data to address these applications. Finally, we discuss the challenges and future directions.
|
13
|
Estiri H, Strasser ZH, Brat GA, Semenov YR, Patel CJ, Murphy SN. Evolving phenotypes of non-hospitalized patients that indicate long COVID. BMC Med 2021; 19:249. [PMID: 34565368 PMCID: PMC8474909 DOI: 10.1186/s12916-021-02115-0] [Citation(s) in RCA: 57] [Impact Index Per Article: 19.0] [Received: 04/28/2021] [Accepted: 09/01/2021] [Indexed: 01/28/2023]
Abstract
BACKGROUND For some SARS-CoV-2 survivors, recovery from the acute phase of the infection has been grueling with lingering effects. Many of the symptoms characterized as the post-acute sequelae of COVID-19 (PASC) could have multiple causes or are similarly seen in non-COVID patients. Accurate identification of PASC phenotypes will be important to guide future research and help the healthcare system focus its efforts and resources on adequately controlled age- and gender-specific sequelae of a COVID-19 infection. METHODS In this retrospective electronic health record (EHR) cohort study, we applied a computational framework for knowledge discovery from clinical data, MLHO, to identify phenotypes that positively associate with a past positive reverse transcription-polymerase chain reaction (RT-PCR) test for COVID-19. We evaluated the post-test phenotypes in two temporal windows at 3-6 and 6-9 months after the test and by age and gender. Data from longitudinal diagnosis records stored in EHRs from Mass General Brigham in the Boston Metropolitan Area was used for the analyses. Statistical analyses were performed on data from March 2020 to June 2021. Study participants included over 96 thousand patients who had tested positive or negative for COVID-19 and were not hospitalized. RESULTS We identified 33 phenotypes among different age/gender cohorts or time windows that were positively associated with past SARS-CoV-2 infection. All identified phenotypes were newly recorded in patients' medical records 2 months or longer after a COVID-19 RT-PCR test in non-hospitalized patients regardless of the test result. 
Among these phenotypes, new diagnosis records for anosmia and dysgeusia (OR 2.60, 95% CI [1.94-3.46]), alopecia (OR 3.09, 95% CI [2.53-3.76]), chest pain (OR 1.27, 95% CI [1.09-1.48]), chronic fatigue syndrome (OR 2.60, 95% CI [1.22-2.10]), shortness of breath (OR 1.41, 95% CI [1.22-1.64]), pneumonia (OR 1.66, 95% CI [1.28-2.16]), and type 2 diabetes mellitus (OR 1.41, 95% CI [1.22-1.64]) were among the most significant indicators of a past COVID-19 infection. Additionally, more new phenotypes were found with increased confidence among the cohorts who were younger than 65. CONCLUSIONS The findings of this study confirm many of the post-COVID-19 symptoms and suggest that a variety of new diagnoses, including new diabetes mellitus and neurological disorder diagnoses, are more common among those with a history of COVID-19 than those without the infection. Additionally, more than 63% of PASC phenotypes were observed in patients under 65 years of age, pointing out the importance of vaccination to minimize the risk of debilitating post-acute sequelae of COVID-19 among younger adults.
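For readers unfamiliar with the "OR, 95% CI" notation used in this abstract, the following sketch shows the unadjusted textbook calculation from a 2x2 table, with a Wald interval on the log scale. It is an assumption-laden illustration only: the study derived its estimates with the MLHO framework, which may adjust for covariates, and the function name `odds_ratio_ci` and all counts below are invented.

```python
import math


def odds_ratio_ci(exposed_cases, exposed_noncases,
                  unexposed_cases, unexposed_noncases, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval.

    z=1.96 corresponds to an approximate 95% interval.
    All four cell counts must be positive.
    """
    odds_ratio = (exposed_cases * unexposed_noncases) / (
        exposed_noncases * unexposed_cases
    )
    # Standard error of log(OR) from the four cell counts.
    se_log_or = math.sqrt(
        1 / exposed_cases + 1 / exposed_noncases
        + 1 / unexposed_cases + 1 / unexposed_noncases
    )
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Invented counts: 20 of 100 test-positive and 10 of 100 test-negative
# patients receive a new diagnosis during the follow-up window.
estimate, ci_lower, ci_upper = odds_ratio_ci(20, 80, 10, 90)
```

Because the interval is symmetric on the log scale, the point estimate always lies inside it, and an interval that excludes 1 indicates an association at the chosen confidence level.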
Affiliation(s)
- Hossein Estiri
  - Laboratory of Computer Science, Massachusetts General Hospital, Boston, MA, 02114, USA
  - Department of Medicine, Massachusetts General Hospital, Boston, MA, 02114, USA
- Zachary H Strasser
  - Laboratory of Computer Science, Massachusetts General Hospital, Boston, MA, 02114, USA
  - Department of Medicine, Massachusetts General Hospital, Boston, MA, 02114, USA
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Gabriel A Brat
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Yevgeniy R Semenov
  - Department of Dermatology, Massachusetts General Hospital, Boston, MA, 02114, USA
- Chirag J Patel
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Shawn N Murphy
  - Laboratory of Computer Science, Massachusetts General Hospital, Boston, MA, 02114, USA
  - Department of Neurology, Massachusetts General Hospital, Boston, MA, 02114, USA
  - Research Information Science and Computing, Mass General Brigham, Boston, MA, USA
|
14
|
Daniel C, Bellamine A, Kalra D. Key Contributions in Clinical Research Informatics. Yearb Med Inform 2021; 30:233-238. [PMID: 34479395 PMCID: PMC8416193 DOI: 10.1055/s-0041-1726514] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
Abstract
Objectives:
To summarize key contributions to current research in the field of Clinical Research Informatics (CRI) and to select best papers published in 2020.
Method:
A bibliographic search using a combination of Medical Subject Headings (MeSH) descriptors and free-text terms on CRI was performed using PubMed, followed by a double-blind review to select a list of candidate best papers, which were then peer-reviewed by external reviewers. After peer-review ranking, a consensus meeting between two section editors and the editorial team was organized to finalize the selection of the four best papers.
Results:
Among the 877 papers published in 2020 and returned by the search, four were selected as best papers. The first best paper describes a method for mining temporal sequences from clinical documents to infer disease trajectories and enhance high-throughput phenotyping. The authors of the second best paper demonstrate that the generation of synthetic Electronic Health Record (EHR) data through Generative Adversarial Networks (GANs) could be substantially improved by more appropriate training and evaluation criteria. The third best paper offers an efficient advance on methods to detect adverse drug events by computer-assisting expert reviewers with annotated candidate mentions in clinical documents. The large-scale data quality assessment study reported by the fourth best paper has clinical research informatics implications, in terms of the trustworthiness of inferences made from analysing electronic health records.
Conclusions:
The most significant research efforts in the CRI field are currently focusing on data science with active research in the development and evaluation of Artificial Intelligence/Machine Learning (AI/ML) algorithms based on ever more intensive use of real-world data and especially EHR real or synthetic data. A major lesson that the coronavirus disease 2019 (COVID-19) pandemic has already taught the scientific CRI community is that timely international high-quality data-sharing and collaborative data analysis is absolutely vital to inform policy decisions.
Affiliation(s)
- Christel Daniel
  - Information Technology Department, AP-HP, F-75012 Paris, France
  - Sorbonne University, University Paris 13, Sorbonne Paris Cité, INSERM UMR_S 1142, LIMICS, F-75006 Paris, France
- Ali Bellamine
  - Information Technology Department, AP-HP, F-75012 Paris, France
|
15
|
Estiri H, Strasser ZH, Brat GA, Semenov YR, Patel CJ, Murphy SN. Evolving Phenotypes of non-hospitalized Patients that Indicate Long Covid. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2021. [PMID: 33948602 DOI: 10.1101/2021.04.25.21255923] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Indexed: 11/24/2022]
Abstract
For some SARS-CoV-2 survivors, recovery from the acute phase of the infection has been grueling with lingering effects. Many of the symptoms characterized as the post-acute sequelae of COVID-19 (PASC) could have multiple causes or are similarly seen in non-COVID patients. Accurate identification of phenotypes will be important to guide future research and help the healthcare system focus its efforts and resources on adequately controlled age- and gender-specific sequelae of a COVID-19 infection. In this retrospective electronic health records (EHR) cohort study, we applied a computational framework for knowledge discovery from clinical data, MLHO, to identify phenotypes that positively associate with a past positive reverse transcription-polymerase chain reaction (RT-PCR) test for COVID-19. We evaluated the post-test phenotypes in two temporal windows at 3-6 and 6-9 months after the test and by age and gender. Data from longitudinal diagnosis records stored in EHRs from Mass General Brigham in the Boston metropolitan area was used for the analyses. Statistical analyses were performed on data from March 2020 to June 2021. Study participants included over 96 thousand patients who had tested positive or negative for COVID-19 and were not hospitalized. We identified 33 phenotypes among different age/gender cohorts or time windows that were positively associated with past SARS-CoV-2 infection. All identified phenotypes were newly recorded in patients’ medical records two months or longer after a COVID-19 RT-PCR test in non-hospitalized patients regardless of the test result. 
Among these phenotypes, a new diagnosis record for anosmia and dysgeusia (OR: 2.60, 95% CI [1.94 - 3.46]), alopecia (OR: 3.09, 95% CI [2.53 - 3.76]), chest pain (OR: 1.27, 95% CI [1.09 - 1.48]), chronic fatigue syndrome (OR 2.60, 95% CI [1.22-2.10]), shortness of breath (OR 1.41, 95% CI [1.22 - 1.64]), pneumonia (OR 1.66, 95% CI [1.28 - 2.16]), and type 2 diabetes mellitus (OR 1.41, 95% CI [1.22 - 1.64]) are some of the most significant indicators of a past COVID-19 infection. Additionally, more new phenotypes were found with increased confidence among the cohorts who were younger than 65. Our approach avoids a flood of false positive discoveries while offering a more robust probabilistic approach compared to the standard linear phenome-wide association study (PheWAS). The findings of this study confirm many of the post-COVID symptoms and suggest that a variety of new diagnoses, including new diabetes mellitus and neurological disorder diagnoses, are more common among those with a history of COVID-19 than those without the infection. Additionally, more than 63 percent of PASC phenotypes were observed in patients under 65 years of age, pointing out the importance of vaccination to minimize the risk of debilitating post-acute sequelae of COVID-19 among younger adults.
|