1
Nikolaou V, Massaro S, Fakhimi M, Stergioulas L, Price D. COPD phenotypes and machine learning cluster analysis: A systematic review and future research agenda. Respir Med 2020; 171:106093. [PMID: 32745966 DOI: 10.1016/j.rmed.2020.106093] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 04/25/2020] [Revised: 07/19/2020] [Accepted: 07/21/2020] [Indexed: 12/21/2022]
Abstract
Chronic Obstructive Pulmonary Disease (COPD) is a highly heterogeneous condition projected to become the third leading cause of death worldwide by 2030. To better characterize this condition, clinicians have classified patients sharing certain symptomatic characteristics, such as symptom intensity and history of exacerbations, into distinct phenotypes. In recent years, the growing use of machine learning algorithms, and cluster analysis in particular, has promised to advance this classification through the integration of additional patient characteristics, including comorbidities, biomarkers, and genomic information. This combination would allow researchers to more reliably identify new COPD phenotypes, as well as better characterize existing ones, with the aim of improving diagnosis and developing novel treatments. Here, we systematically review the last decade of research that uses cluster analysis to identify COPD phenotypes. We provide a systematized account of the extant evidence, describe the strengths and weaknesses of the main methods used, identify gaps in the literature, and suggest recommendations for future research.
Affiliation(s)
- Vasilis Nikolaou
- Surrey Business School, University of Surrey, Guildford, GU2 7HX, UK
- Sebastiano Massaro
- Surrey Business School, University of Surrey, Guildford, GU2 7HX, UK; The Organizational Neuroscience Laboratory, London, WC1N 3AX, UK
- Masoud Fakhimi
- Surrey Business School, University of Surrey, Guildford, GU2 7HX, UK
- David Price
- Observational and Pragmatic Research Institute, Singapore, Singapore; Centre of Academic Primary Care, Division of Applied Health Sciences, University of Aberdeen, Aberdeen, UK
2
Ragland MF, Benway CJ, Lutz SM, Bowler RP, Hecker J, Hokanson JE, Crapo JD, Castaldi PJ, DeMeo DL, Hersh CP, Hobbs BD, Lange C, Beaty TH, Cho MH, Silverman EK. Genetic Advances in Chronic Obstructive Pulmonary Disease. Insights from COPDGene. Am J Respir Crit Care Med 2020; 200:677-690. [PMID: 30908940 DOI: 10.1164/rccm.201808-1455so] [Citation(s) in RCA: 58] [Impact Index Per Article: 11.6] [Indexed: 01/06/2023]
Abstract
Chronic obstructive pulmonary disease (COPD) is a common and progressive disease that is influenced by both genetic and environmental factors. For many years, knowledge of the genetic basis of COPD was limited to Mendelian syndromes, such as alpha-1 antitrypsin deficiency and cutis laxa, caused by rare genetic variants. Over the past decade, the proliferation of genome-wide association studies, the accessibility of whole-genome sequencing, and the development of novel methods for analyzing genetic variation data have led to a substantial increase in the understanding of genetic variants that play a role in COPD susceptibility and COPD-related phenotypes. COPDGene (Genetic Epidemiology of COPD), a multicenter, longitudinal study of over 10,000 current and former cigarette smokers, has been pivotal to these breakthroughs in understanding the genetic basis of COPD. To date, over 20 genetic loci have been convincingly associated with COPD affection status, with additional loci demonstrating association with COPD-related phenotypes such as emphysema, chronic bronchitis, and hypoxemia. In this review, we discuss the contributions of the COPDGene study to the discovery of these genetic associations as well as the ongoing genetic investigations of COPD subtypes, protein biomarkers, and post-genome-wide association study analysis.
Affiliation(s)
- Margaret F Ragland
- Division of Pulmonary Sciences and Critical Care Medicine, School of Medicine, and
- Julian Hecker
- Harvard T. H. Chan School of Public Health, Boston, Massachusetts
- John E Hokanson
- Department of Epidemiology, Colorado School of Public Health, University of Colorado Denver, Aurora, Colorado
- Dawn L DeMeo
- Channing Division of Network Medicine and Division of Pulmonary and Critical Care Medicine, Brigham and Women's Hospital, Boston, Massachusetts
- Craig P Hersh
- Channing Division of Network Medicine and Division of Pulmonary and Critical Care Medicine, Brigham and Women's Hospital, Boston, Massachusetts
- Brian D Hobbs
- Channing Division of Network Medicine and Division of Pulmonary and Critical Care Medicine, Brigham and Women's Hospital, Boston, Massachusetts
- Christoph Lange
- Harvard T. H. Chan School of Public Health, Boston, Massachusetts
- Terri H Beaty
- Johns Hopkins University Bloomberg School of Public Health, Baltimore, Maryland
- Michael H Cho
- Channing Division of Network Medicine and Division of Pulmonary and Critical Care Medicine, Brigham and Women's Hospital, Boston, Massachusetts
- Edwin K Silverman
- Channing Division of Network Medicine and Division of Pulmonary and Critical Care Medicine, Brigham and Women's Hospital, Boston, Massachusetts
3
Chen D, Liu C, Xie J. Multi-locus Test and Correction for Confounding Effects in Genome-Wide Association Studies. Int J Biostat 2016; 12:/j/ijb.ahead-of-print/ijb-2015-0091/ijb-2015-0091.xml. [PMID: 27232635 DOI: 10.1515/ijb-2015-0091] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 12/18/2022]
Abstract
Genome-wide association studies (GWAS) examine a large number of genetic variants, e.g., single nucleotide polymorphisms (SNPs), and associate them with a disease of interest. Traditional statistical methods for GWASs can produce spurious associations, due to limited information from individual SNPs and confounding effects. This paper develops two statistical methods to enhance data analysis of GWASs. The first is a multiple-SNP association test, which is a weighted chi-square test derived for big contingency tables. The test assesses combinatorial effects of multiple SNPs and improves conventional methods of single SNP analysis. The second is a method that corrects for confounding effects, which may come from population stratification as well as other ambiguous (unknown) factors. The proposed method identifies a latent confounding factor, using a profile of whole genome SNPs, and eliminates confounding effects through matching or stratified statistical analysis. Simulations and a GWAS of rheumatoid arthritis demonstrate that the proposed methods dramatically reduce the number of significant tests, or false positives, and outperform other available methods.
4
O'Brien JA, Vega A, Bouguyon E, Krouk G, Gojon A, Coruzzi G, Gutiérrez RA. Nitrate Transport, Sensing, and Responses in Plants. MOLECULAR PLANT 2016; 9:837-56. [PMID: 27212387 DOI: 10.1016/j.molp.2016.05.004] [Citation(s) in RCA: 298] [Impact Index Per Article: 33.1] [Received: 04/01/2016] [Revised: 05/16/2016] [Accepted: 05/16/2016] [Indexed: 05/20/2023]
Abstract
Nitrogen (N) is an essential macronutrient that affects plant growth and development. N is an important component of chlorophyll, amino acids, nucleic acids, and secondary metabolites. Nitrate is one of the most abundant N sources in the soil. Because nitrate and other N nutrients are often limiting, plants have developed sophisticated mechanisms to ensure adequate supply of nutrients in a variable environment. Nitrate is absorbed in the root and mobilized to other organs by nitrate transporters. Nitrate sensing activates signaling pathways that impinge upon molecular, metabolic, physiological, and developmental responses locally and at the whole plant level. With the advent of genomics technologies and genetic tools, important advances in our understanding of nitrate and other N nutrient responses have been achieved in the past decade. Furthermore, techniques that take advantage of natural polymorphisms present in divergent individuals from a single species have been essential in uncovering new components. However, there are still gaps in our understanding of how nitrate signaling affects biological processes in plants. Moreover, we still lack an integrated view of how all the regulatory factors identified interact or crosstalk to orchestrate the myriad N responses plants typically exhibit. In this review, we provide an updated overview of mechanisms by which nitrate is sensed and transported throughout the plant. We discuss signaling components and how nitrate sensing crosstalks with hormonal pathways for developmental responses locally and globally in the plant. Understanding how nitrate affects plant metabolism, physiology, growth, and development is key to improving crops for sustainable agriculture.
Affiliation(s)
- José A O'Brien
- Departamento de Genética Molecular y Microbiología, FONDAP Center for Genome Regulation, Millennium Nucleus Center for Plant Systems and Synthetic Biology, Pontificia Universidad Católica de Chile, 8331150, Chile; Departamento de Fruticultura y Enología, Pontificia Universidad Católica de Chile, Santiago, 7820436, Chile
- Andrea Vega
- Departamento de Ciencias Vegetales, Pontificia Universidad Católica de Chile, Santiago, 7820436, Chile
- Eléonore Bouguyon
- Department of Biology, Center for Genomics and Systems Biology, New York University, New York, NY 10003, USA; Laboratoire de Biochimie et Physiologie Moléculaire des Plantes, Institut de Biologie Intégrative des Plantes 'Claude Grignon', UMR CNRS, INRA, SupAgro, UM, 2 Place Viala, 34060 Montpellier Cedex, France
- Gabriel Krouk
- Laboratoire de Biochimie et Physiologie Moléculaire des Plantes, Institut de Biologie Intégrative des Plantes 'Claude Grignon', UMR CNRS, INRA, SupAgro, UM, 2 Place Viala, 34060 Montpellier Cedex, France
- Alain Gojon
- Laboratoire de Biochimie et Physiologie Moléculaire des Plantes, Institut de Biologie Intégrative des Plantes 'Claude Grignon', UMR CNRS, INRA, SupAgro, UM, 2 Place Viala, 34060 Montpellier Cedex, France
- Gloria Coruzzi
- Department of Biology, Center for Genomics and Systems Biology, New York University, New York, NY 10003, USA
- Rodrigo A Gutiérrez
- Departamento de Genética Molecular y Microbiología, FONDAP Center for Genome Regulation, Millennium Nucleus Center for Plant Systems and Synthetic Biology, Pontificia Universidad Católica de Chile, 8331150, Chile
5
Nikolskiy I, Siuzdak G, Patti GJ. Discriminating precursors of common fragments for large-scale metabolite profiling by triple quadrupole mass spectrometry. Bioinformatics 2015; 31:2017-23. [PMID: 25691443 DOI: 10.1093/bioinformatics/btv085] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.6] [Received: 06/19/2014] [Accepted: 02/05/2015] [Indexed: 11/14/2022]
Abstract
MOTIVATION: The goal of large-scale metabolite profiling is to compare the relative concentrations of as many metabolites extracted from biological samples as possible. This is typically accomplished by measuring the abundances of thousands of ions with high-resolution and high mass accuracy mass spectrometers. Although the data from these instruments provide a comprehensive fingerprint of each sample, identifying the structures of the thousands of detected ions is still challenging and time intensive. An alternative, less-comprehensive approach is to use triple quadrupole (QqQ) mass spectrometry to analyze predetermined sets of metabolites (typically fewer than several hundred). This is done using authentic standards to develop QqQ experiments that specifically detect only the targeted metabolites, with the advantage that the need for ion identification after profiling is eliminated.
RESULTS: Here, we propose a framework to extend the application of QqQ mass spectrometers to large-scale metabolite profiling. We aim to provide a foundation for designing QqQ multiple reaction monitoring (MRM) experiments for each of the 82 696 metabolites in the METLIN metabolite database. First, we identify common fragmentation products from the experimental fragmentation data in METLIN. Then, we model the likelihoods of each precursor structure in METLIN producing each common fragmentation product. With these likelihood estimates, we select ensembles of common fragmentation products that minimize our uncertainty about metabolite identities. We demonstrate encouraging performance and, based on our results, we suggest how our method can be integrated with future work to develop large-scale MRM experiments.
AVAILABILITY AND IMPLEMENTATION: Our predictions, Supplementary results, and the code for estimating likelihoods and selecting ensembles of fragmentation reactions are made available on the lab website at http://pattilab.wustl.edu/FragPred.
Affiliation(s)
- Igor Nikolskiy
- Department of Genetics, Department of Medicine, Washington University School of Medicine, St. Louis, MO 63110, USA, Scripps Center for Metabolomics and Mass Spectrometry, Departments of Chemistry, Molecular and Computational Biology, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, California 92037, USA and Department of Chemistry, Washington University, St. Louis, MO 63130, USA
- Gary Siuzdak
- Department of Genetics, Department of Medicine, Washington University School of Medicine, St. Louis, MO 63110, USA, Scripps Center for Metabolomics and Mass Spectrometry, Departments of Chemistry, Molecular and Computational Biology, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, California 92037, USA and Department of Chemistry, Washington University, St. Louis, MO 63130, USA
- Gary J Patti
- Department of Genetics, Department of Medicine, Washington University School of Medicine, St. Louis, MO 63110, USA, Scripps Center for Metabolomics and Mass Spectrometry, Departments of Chemistry, Molecular and Computational Biology, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, California 92037, USA and Department of Chemistry, Washington University, St. Louis, MO 63130, USA