1
Maiorino E, De Marzio M, Xu Z, Yun JH, Chase RP, Hersh CP, Weiss ST, Silverman EK, Castaldi PJ, Glass K. Joint clinical and molecular subtyping of COPD with variational autoencoders. medRxiv 2024:2023.08.19.23294298. [PMID: 38260473 PMCID: PMC10802661 DOI: 10.1101/2023.08.19.23294298] [Indexed: 01/24/2024]
Abstract
Chronic Obstructive Pulmonary Disease (COPD) is a complex, heterogeneous disease. Traditional subtyping methods generally focus on either the clinical manifestations or the molecular endotypes of the disease, resulting in classifications that do not fully capture the disease's complexity. Here, we bridge this gap by introducing a subtyping pipeline that integrates clinical and gene expression data with variational autoencoders. We apply this methodology to the COPDGene study, a large study of current and former smoking individuals with and without COPD. Our approach generates a set of vector embeddings, called Personalized Integrated Profiles (PIPs), that recapitulate the joint clinical and molecular state of the subjects in the study. Prediction experiments show that the PIPs have a predictive accuracy comparable to or better than other embedding approaches. Using trajectory learning approaches, we analyze the main trajectories of variation in the PIP space and identify five well-separated subtypes with distinct clinical phenotypes, expression signatures, and disease outcomes. Notably, these subtypes are more robust to data resampling compared to those identified using traditional clustering approaches. Overall, our findings provide new avenues to establish fine-grained associations between the clinical characteristics, molecular processes, and disease outcomes of COPD.
Affiliation(s)
- Enrico Maiorino
- Channing Division of Network Medicine, Brigham and Women’s Hospital, Harvard Medical School
- Margherita De Marzio
- Channing Division of Network Medicine, Brigham and Women’s Hospital, Harvard Medical School
- Zhonghui Xu
- Channing Division of Network Medicine, Brigham and Women’s Hospital, Harvard Medical School
- Jeong H. Yun
- Channing Division of Network Medicine, Brigham and Women’s Hospital, Harvard Medical School
- Robert P. Chase
- Channing Division of Network Medicine, Brigham and Women’s Hospital, Harvard Medical School
- Craig P. Hersh
- Channing Division of Network Medicine, Brigham and Women’s Hospital, Harvard Medical School
- Scott T. Weiss
- Channing Division of Network Medicine, Brigham and Women’s Hospital, Harvard Medical School
- Edwin K. Silverman
- Channing Division of Network Medicine, Brigham and Women’s Hospital, Harvard Medical School
2
Yuan NF, Hasenstab K, Retson T, Conrad DJ, Lynch DA, Hsiao A. Unsupervised Learning Identifies Computed Tomographic Measurements as Primary Drivers of Progression, Exacerbation, and Mortality in Chronic Obstructive Pulmonary Disease. Ann Am Thorac Soc 2022; 19:1993-2002. [PMID: 35830591 DOI: 10.1513/AnnalsATS.202110-1127OC] [Indexed: 12/15/2022]
Abstract
Rationale: Chronic obstructive pulmonary disease (COPD) is a heterogeneous syndrome with phenotypic manifestations that tend to be distributed along a continuum. Unsupervised machine learning based on a broad selection of imaging and clinical phenotypes may be used to identify primary variables that define disease axes and stratify patients with COPD. Objectives: To identify primary variables driving COPD heterogeneity using principal component analysis and to define disease axes and assess the prognostic value of these axes across three outcomes: progression, exacerbation, and mortality. Methods: We included 7,331 patients between 39 and 85 years old, of whom 40.3% were Black and 45.8% were female smokers with a mean of 44.6 pack-years, from the COPDGene (Genetic Epidemiology of COPD) phase I cohort (2008-2011) in our analysis. Out of a total of 916 phenotypes, 147 continuous clinical, spirometric, and computed tomography (CT) features were selected. For each principal component (PC), we computed a PC score based on feature weights. We used PC score distributions to define disease axes along which we divided the patients into quartiles. To assess the prognostic value of these axes, we applied logistic regression analyses to estimate 5-year (n = 4,159) and 10-year (n = 1,487) odds of progression. Cox regression and Kaplan-Meier analyses were performed to estimate 5-year and 10-year risk of exacerbation (n = 6,532) and all-cause mortality (n = 7,331). Results: The first PC, accounting for 43.7% of variance, was defined by CT measures of air trapping and emphysema. The second PC, accounting for 13.7% of variance, was defined by spirometric and CT measures of vital capacity and lung volume. The third PC, accounting for 7.9% of the variance, was defined by CT measures of lung mass, airway thickening, and body habitus. Stratification of patients across each disease axis revealed up to 3.2-fold (95% confidence interval [CI] 2.4, 4.3) greater odds of 5-year progression, 5.4-fold (95% CI 4.6, 6.3) greater risk of 5-year exacerbation, and 5.0-fold (95% CI 4.2, 6.0) greater risk of 10-year mortality between the highest and lowest quartiles. Conclusions: Unsupervised learning analysis of the COPDGene cohort reveals that CT measurements may bolster patient stratification along the continuum of COPD phenotypes. Each of the disease axes also individually demonstrates prognostic potential, predictive of future forced expiratory volume in 1 second decline, exacerbation, and mortality.
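The Methods in the abstract above (a PC score per subject, quartiles along each axis, then a comparison of the highest and lowest quartiles) can be illustrated with a minimal sketch. This is not the authors' code: the data are synthetic, the function names `pc_scores`, `quartile_labels`, and `odds_ratio` are hypothetical, and the unadjusted odds ratio stands in for the logistic and Cox regression models the abstract reports.

```python
import numpy as np

def pc_scores(features, n_components=3):
    """Standardize each column, then project subjects onto the leading PCs."""
    z = (features - features.mean(axis=0)) / features.std(axis=0)
    # SVD of the standardized matrix: rows of vt are the PC weight vectors.
    _, singular_values, vt = np.linalg.svd(z, full_matrices=False)
    explained = singular_values**2 / np.sum(singular_values**2)
    return z @ vt[:n_components].T, explained[:n_components]

def quartile_labels(scores):
    """Label each subject with a quartile of one PC score (0 = lowest, 3 = highest)."""
    cuts = np.quantile(scores, [0.25, 0.5, 0.75])
    return np.searchsorted(cuts, scores, side="right")

def odds_ratio(outcome, quartiles, high=3, low=0):
    """Unadjusted odds ratio of a binary outcome, `high` quartile vs `low` quartile."""
    def odds(q):
        in_group = quartiles == q
        events = outcome[in_group].sum()
        return events / (in_group.sum() - events)
    return odds(high) / odds(low)

# Synthetic cohort (assumption): six correlated features driven by one latent axis.
rng = np.random.default_rng(0)
n_subjects, n_features = 400, 6
latent = rng.normal(size=n_subjects)
loadings = rng.uniform(0.5, 1.5, size=n_features)
features = latent[:, None] * loadings + rng.normal(scale=0.5, size=(n_subjects, n_features))

scores, explained = pc_scores(features)
pc1 = scores[:, 0]
quartiles = quartile_labels(pc1)

# Synthetic binary outcome whose probability rises with the PC1 score.
outcome = (rng.random(n_subjects) < 1.0 / (1.0 + np.exp(-0.5 * pc1))).astype(int)
ratio = odds_ratio(outcome, quartiles)
print(f"PC1 explains {explained[0]:.1%} of variance; Q4 vs Q1 odds ratio = {ratio:.1f}")
```

The sign of a principal component is arbitrary, so in practice each axis would need to be oriented against a clinical reference before "highest" and "lowest" quartiles are interpreted. Here the synthetic outcome is generated from the PC1 score itself, which sidesteps that step.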
3
Burke H, Wilkinson TMA. Unravelling the mechanisms driving multimorbidity in COPD to develop holistic approaches to patient-centred care. Eur Respir Rev 2021; 30:210041. [PMID: 34415848 DOI: 10.1183/16000617.0041-2021] [Received: 02/16/2021] [Accepted: 04/06/2021] [Indexed: 01/04/2023]
Abstract
COPD is a major cause of morbidity and mortality worldwide. Multimorbidity is common in COPD patients and a key modifiable factor, which requires timely identification and targeted holistic management strategies to improve outcomes and reduce the burden of disease. We discuss the use of integrative approaches, such as cluster analysis and network-based theory, to understand the common and novel pathobiological mechanisms underlying COPD and comorbid disease, which are likely to be key to informing new management strategies. Furthermore, we discuss the current understanding of mechanistic drivers to multimorbidity in COPD, including hypotheses such as multimorbidity as a result of shared common exposure to noxious stimuli (e.g. tobacco smoke), or as a consequence of loss of function following the development of pulmonary disease. In addition, we explore the links to pulmonary disease processes such as systemic overspill of pulmonary inflammation, immune cell priming within the inflamed COPD lung and targeted messengers such as extracellular vesicles as a result of local damage as a cause for multimorbidity in COPD. Finally, we focus on current and new management strategies which may target these underlying mechanisms, with the aim of holistic, patient-centred treatment rather than single disease management.
Affiliation(s)
- H Burke
- School of Clinical and Experimental Sciences, Faculty of Medicine, University of Southampton, Southampton, UK; University Hospitals Southampton NHS Foundation Trust, Southampton, UK
- T M A Wilkinson
- School of Clinical and Experimental Sciences, Faculty of Medicine, University of Southampton, Southampton, UK; University Hospitals Southampton NHS Foundation Trust, Southampton, UK; NIHR Southampton Biomedical Research Centre, University Hospital Southampton, Southampton, UK
4
Castaldi PJ, Boueiz A, Yun J, Estepar RSJ, Ross JC, Washko G, Cho MH, Hersh CP, Kinney GL, Young KA, Regan EA, Lynch DA, Criner GJ, Dy JG, Rennard SI, Casaburi R, Make BJ, Crapo J, Silverman EK, Hokanson JE. Machine Learning Characterization of COPD Subtypes: Insights From the COPDGene Study. Chest 2019; 157:1147-1157. [PMID: 31887283 DOI: 10.1016/j.chest.2019.11.039] [Received: 08/20/2019] [Revised: 10/18/2019] [Accepted: 11/29/2019] [Indexed: 12/17/2022]
Abstract
COPD is a heterogeneous syndrome. Many COPD subtypes have been proposed, but there is not yet consensus on how many COPD subtypes there are and how they should be defined. The COPD Genetic Epidemiology Study (COPDGene), which has generated 10-year longitudinal chest imaging, spirometry, and molecular data, is a rich resource for relating COPD phenotypes to underlying genetic and molecular mechanisms. In this article, we place COPDGene clustering studies in context with other highly cited COPD clustering studies, and summarize the main COPD subtype findings from COPDGene. First, most manifestations of COPD occur along a continuum, which explains why continuous aspects of COPD or disease axes may be more accurate and reproducible than subtypes identified through clustering methods. Second, continuous COPD-related measures can be used to create subgroups through the use of predictive models to define cut-points, and we review COPDGene research on blood eosinophil count thresholds as a specific example. Third, COPD phenotypes identified or prioritized through machine learning methods have led to novel biological discoveries, including novel emphysema genetic risk variants and systemic inflammatory subtypes of COPD. Fourth, trajectory-based COPD subtyping captures differences in the longitudinal evolution of COPD, addressing a major limitation of clustering analyses that are confounded by disease severity. Ongoing longitudinal characterization of subjects in COPDGene will provide useful insights about the relationship between lung imaging parameters, molecular markers, and COPD progression that will enable the identification of subtypes based on underlying disease processes and distinct patterns of disease progression, with the potential to improve the clinical relevance and reproducibility of COPD subtypes.
Affiliation(s)
- Peter J Castaldi
- Channing Division of Network Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA; General Medicine and Primary Care, Brigham and Women's Hospital, Harvard Medical School, Boston, MA
- Adel Boueiz
- Channing Division of Network Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA; Pulmonary and Critical Care Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA
- Jeong Yun
- Channing Division of Network Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA; Pulmonary and Critical Care Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA
- Raul San Jose Estepar
- Applied Chest Imaging Laboratory, Brigham and Women's Hospital, Harvard Medical School, Boston, MA
- James C Ross
- Applied Chest Imaging Laboratory, Brigham and Women's Hospital, Harvard Medical School, Boston, MA
- George Washko
- Pulmonary and Critical Care Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA; Applied Chest Imaging Laboratory, Brigham and Women's Hospital, Harvard Medical School, Boston, MA
- Michael H Cho
- Channing Division of Network Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA; Pulmonary and Critical Care Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA
- Craig P Hersh
- Channing Division of Network Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA; Pulmonary and Critical Care Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA
- Gregory L Kinney
- Department of Epidemiology, University of Colorado, Denver, Aurora, CO
- Kendra A Young
- Department of Epidemiology, University of Colorado, Denver, Aurora, CO
- David A Lynch
- Department of Radiology, National Jewish Health, Denver, CO
- Gerald J Criner
- Department of Thoracic Medicine and Surgery, Lewis Katz School of Medicine at Temple University, Philadelphia, PA
- Jennifer G Dy
- Department of Electrical and Computer Engineering, Northeastern University, Boston, MA
- Stephen I Rennard
- Pulmonary and Critical Care Medicine, University of Nebraska Medical Center, Omaha, NE
- Richard Casaburi
- Rehabilitation Clinical Trials Center, Los Angeles Biomedical Research Institute at Harbor-UCLA Medical Center, Torrance, CA
- Edwin K Silverman
- Channing Division of Network Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA; Pulmonary and Critical Care Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA
- John E Hokanson
- Department of Epidemiology, University of Colorado, Denver, Aurora, CO