1
Yeo J, Morales DA, Chen T, Crawford EL, Zhang X, Blomquist TM, Levin AM, Massion PP, Arenberg DA, Midthun DE, Mazzone PJ, Nathan SD, Wainz RJ, Nana-Sinkam P, Willey PFS, Arend TJ, Padda K, Qiu S, Federov A, Hernandez DAR, Hammersley JR, Yoon Y, Safi F, Khuder SA, Willey JC. RNAseq analysis of bronchial epithelial cells to identify COPD-associated genes and SNPs. BMC Pulm Med 2018; 18:42. [PMID: 29506519 PMCID: PMC5838965 DOI: 10.1186/s12890-018-0603-y] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 11/08/2017] [Accepted: 02/23/2018] [Indexed: 01/09/2023]
Abstract
Background: There is a need for more powerful methods to identify low-effect SNPs that contribute to hereditary COPD pathogenesis. We hypothesized that SNPs contributing to COPD risk through cis-regulatory effects are enriched in genes that make up bronchial epithelial cell (BEC) expression patterns associated with COPD.
Methods: To test this hypothesis, normal BEC specimens were obtained by bronchoscopy from 60 subjects: 30 subjects with COPD defined by spirometry (FEV1/FVC < 0.7, FEV1% < 80%), and 30 non-COPD controls. Targeted next generation sequencing was used to measure total and allele-specific expression of 35 genes in genome maintenance (GM) pathways linked to COPD pathogenesis, including seven TP53 and CEBP transcription factor family members. Shrinkage linear discriminant analysis (SLDA) was used to identify COPD-classification models. COPD GWAS were queried for putative cis-regulatory SNPs in the targeted genes.
Results: On a network basis, TP53 and CEBP transcription factor pathway gene pair network connections, including key DNA repair gene ERCC5, were significantly different in COPD subjects (e.g., Wilcoxon rank sum test for closeness, p-value = 5.0E-11). An association of ERCC5 SNP rs4150275 with chronic bronchitis was identified in a set of Lung Health Study (LHS) COPD GWAS SNPs restricted to those in putative regulatory regions within the targeted genes, and this association was validated in the COPDGene non-Hispanic white (NHW) GWAS. ERCC5 SNP rs4150275 is linked (D’ = 1) to ERCC5 SNP rs17655, which displayed differential allelic expression (DAE) in BEC and is an expression quantitative trait locus (eQTL) in lung tissue (p = 3.2E-7). SNPs in linkage (D’ = 1) with rs17655 were predicted to alter miRNA binding (rs873601). A classifier model that comprised gene features CAT, CEBPG, GPX1, KEAP1, TP73, and XPA had a pooled 10-fold cross-validation receiver operator characteristic area under the curve of 75.4% (95% CI: 66.3%–89.3%). The prevalence of DAE was higher than expected (p = 0.0023) in the classifier genes.
Conclusions: GM genes that make up COPD-associated BEC expression patterns were enriched for SNPs with cis-regulatory function, including a putative cis-rSNP in ERCC5 that was associated with COPD risk. These findings support additional total and allele-specific expression analysis of gene pathways with high prior likelihood for involvement in COPD pathogenesis.
Electronic supplementary material: The online version of this article (10.1186/s12890-018-0603-y) contains supplementary material, which is available to authorized users.
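The spirometric case definition quoted in the Methods (FEV1/FVC < 0.7 together with FEV1% < 80%) can be expressed as a simple predicate. The following is only a minimal sketch: the function name `meets_copd_criteria` and its parameter names are hypothetical and do not come from the paper, and it assumes that "FEV1%" refers to FEV1 as a percentage of the predicted value.

```python
def meets_copd_criteria(fev1_fvc_ratio: float, fev1_pct_predicted: float) -> bool:
    """Return True when both thresholds quoted in the abstract are met.

    fev1_fvc_ratio: FEV1/FVC as a fraction (e.g. 0.65).
    fev1_pct_predicted: FEV1 as percent of predicted (assumed meaning of "FEV1%").
    """
    # Both conditions must hold; the comparisons are strict, as in the abstract.
    return fev1_fvc_ratio < 0.7 and fev1_pct_predicted < 80.0


if __name__ == "__main__":
    # Illustrative values only, not data from the study.
    print(meets_copd_criteria(0.65, 70.0))
    print(meets_copd_criteria(0.75, 70.0))
```

Subjects failing either threshold would fall outside this case definition; how the study handled borderline values is not stated in the abstract.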
Affiliation(s)
- Jiyoun Yeo
- Department of Pathology, The University of Toledo College of Medicine, 3000 Arlington Avenue, HEB 219, Toledo, OH, 43614, USA
- Diego A Morales
- Division of Pulmonary and Critical Care Medicine, Department of Medicine, The University of Toledo College of Medicine, 3000 Arlington Avenue, HEB 219, Toledo, OH, 43614, USA
- Tian Chen
- Department of Mathematics and Statistics, The University of Toledo, 2801 W. Bancroft Street, Toledo, OH, 43606, USA
- Erin L Crawford
- Division of Pulmonary and Critical Care Medicine, Department of Medicine, The University of Toledo College of Medicine, 3000 Arlington Avenue, HEB 219, Toledo, OH, 43614, USA
- Xiaolu Zhang
- Department of Medicine, The University of Toledo College of Medicine, 3000 Arlington Avenue, Toledo, OH, 43614, USA
- Thomas M Blomquist
- Department of Pathology, The University of Toledo College of Medicine, 3000 Arlington Avenue, HEB 219, Toledo, OH, 43614, USA
- Albert M Levin
- Department of Biostatistics, Henry Ford Health System, 1 Ford Place, Detroit, MI, 48202, USA
- Pierre P Massion
- Thoracic Program, Vanderbilt Ingram Cancer Center, Nashville, TN, 37232, USA
- David E Midthun
- Department of Pulmonary and Critical Care Medicine, Mayo Clinic, 200 1st St SW, Rochester, MN, 55905, USA
- Peter J Mazzone
- Department of Pulmonary Medicine, Cleveland Clinic, 9500 Euclid Ave, Cleveland, OH, 44195, USA
- Steven D Nathan
- Department of Pulmonary Medicine, Inova Fairfax Hospital, 3300 Gallows Road, Falls Church, VA, 22042-3300, USA
- Ronald J Wainz
- The Toledo Hospital, 2142 N Cove Blvd, Toledo, OH, 43606, USA
- Patrick Nana-Sinkam
- Division of Pulmonary Diseases and Critical Care Medicine, Virginia Commonwealth University, Richmond, VA, 23284-2512, USA; Ohio State University James Comprehensive Cancer Center and Solove Research Institute, Columbus, OH, USA
- Paige F S Willey
- American Enterprise Institute, 1789 Massachusetts Ave NW, Washington, DC, 20036, USA
- Taylor J Arend
- The University of Toledo College of Medicine, 3000 Arlington Avenue, Toledo, OH, 43614, USA
- Karanbir Padda
- Emory University School of Medicine, 1648 Pierce Dr NE, Atlanta, GA, 30307, USA
- Shuhao Qiu
- Department of Medicine, The University of Toledo Medical Center, 3000 Arlington Avenue, Toledo, OH, 43614, USA
- Alexei Federov
- Department of Mathematics and Statistics, The University of Toledo, 2801 W. Bancroft Street, Toledo, OH, 43606, USA; Department of Medicine, The University of Toledo College of Medicine, 3000 Arlington Avenue, Toledo, OH, 43614, USA
- Dawn-Alita R Hernandez
- Division of Pulmonary and Critical Care Medicine, Department of Medicine, The University of Toledo College of Medicine, 3000 Arlington Avenue, RHC 0012, Toledo, OH, 43614, USA
- Jeffrey R Hammersley
- Division of Pulmonary and Critical Care Medicine, Department of Medicine, The University of Toledo College of Medicine, 3000 Arlington Avenue, RHC 0012, Toledo, OH, 43614, USA
- Youngsook Yoon
- Division of Pulmonary and Critical Care Medicine, Department of Medicine, The University of Toledo College of Medicine, 3000 Arlington Avenue, RHC 0012, Toledo, OH, 43614, USA
- Fadi Safi
- Division of Pulmonary and Critical Care Medicine, Department of Medicine, The University of Toledo College of Medicine, 3000 Arlington Avenue, RHC 0012, Toledo, OH, 43614, USA
- Sadik A Khuder
- Division of Pulmonary and Critical Care Medicine, Department of Medicine, The University of Toledo College of Medicine, 3000 Arlington Avenue, RHC 0012, Toledo, OH, 43614, USA
- James C Willey
- Division of Pulmonary and Critical Care Medicine, Department of Medicine, The University of Toledo College of Medicine, 3000 Arlington Avenue, Toledo, OH, 43614, USA
2
Yeo J, Crawford EL, Zhang X, Khuder S, Chen T, Levin A, Blomquist TM, Willey JC. A lung cancer risk classifier comprising genome maintenance genes measured in normal bronchial epithelial cells. BMC Cancer 2017; 17:301. [PMID: 28464886 PMCID: PMC5412061 DOI: 10.1186/s12885-017-3287-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 08/26/2016] [Accepted: 04/20/2017] [Indexed: 12/14/2022]
Abstract
Background: Annual low-dose CT (LDCT) screening of individuals at high demographic risk reduces lung cancer mortality by more than 20%. However, subjects selected for screening based on demographic criteria typically have less than a 10% lifetime risk for lung cancer. Thus, there is a need for a biomarker that better stratifies subjects for LDCT screening. Toward this goal, we previously reported a lung cancer risk test (LCRT) biomarker comprising 14 genome-maintenance (GM) pathway genes measured in normal bronchial epithelial cells (NBEC) that accurately distinguished cancer (CA) from non-cancer (NC) subjects. The primary goal of the studies reported here was to optimize the LCRT biomarker for high specificity and ease of clinical implementation.
Methods: Targeted competitive multiplex PCR amplicon libraries were prepared for next generation sequencing (NGS) analysis of transcript abundance at 68 sites among 33 GM target genes in NBEC specimens collected from a retrospective cohort of 120 subjects, including 61 CA cases and 59 NC controls. Genes were selected for analysis based on contribution to the previously reported LCRT biomarker and/or prior evidence for association with lung cancer risk. Linear discriminant analysis was used to identify the most accurate classifier suitable to stratify subjects for screening.
Results: After cross-validation, a model comprising expression values from 12 genes (CDKN1A, E2F1, ERCC1, ERCC4, ERCC5, GPX1, GSTP1, KEAP1, RB1, TP53, TP63, and XRCC1) and the demographic factors age, gender, and pack-years smoking had a receiver operator characteristic area under the curve (ROC AUC) of 0.975 (95% CI: 0.96–0.99). The overall classification accuracy was 93% (95% CI: 88%–98%), with sensitivity 93.1%, specificity 92.9%, positive predictive value 93.1%, and negative predictive value 93%. The ROC AUC for this classifier was significantly better (p < 0.0001) than that of the best model comprising demographic features alone.
Conclusions: The LCRT biomarker reported here displayed high accuracy and ease of implementation on a high-throughput, quality-controlled targeted NGS platform. As such, it is optimized for clinical validation in specimens from the ongoing LCRT blinded prospective cohort study. Following validation, the biomarker is expected to have clinical utility by better stratifying subjects for annual lung cancer screening compared to current demographic criteria alone.
Electronic supplementary material: The online version of this article (doi:10.1186/s12885-017-3287-4) contains supplementary material, which is available to authorized users.
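The accuracy figures reported in the Results (sensitivity, specificity, positive and negative predictive value, overall accuracy) all derive from the four cells of a binary confusion matrix. The sketch below shows the standard definitions; the function name `classification_metrics` is hypothetical, and the counts in the usage example are made up for illustration and are not the study's data.

```python
def classification_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Compute standard binary-classifier metrics from confusion-matrix counts.

    tp/fn: cancer (CA) subjects classified correctly / incorrectly.
    tn/fp: non-cancer (NC) subjects classified correctly / incorrectly.
    """
    total = tp + fn + tn + fp
    return {
        "sensitivity": tp / (tp + fn),   # fraction of CA subjects detected
        "specificity": tn / (tn + fp),   # fraction of NC subjects correctly cleared
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
        "accuracy": (tp + tn) / total,   # overall fraction classified correctly
    }


if __name__ == "__main__":
    # Hypothetical counts, chosen only to illustrate the arithmetic.
    metrics = classification_metrics(tp=9, fn=1, tn=8, fp=2)
    for name, value in metrics.items():
        print(f"{name}: {value:.3f}")
```

Note that the predictive values depend on the case/control mix of the cohort, so values obtained in a roughly balanced retrospective cohort would not carry over directly to a screening population with lower cancer prevalence.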
Affiliation(s)
- Jiyoun Yeo
- Division of Pulmonary and Critical Care Medicine, Department of Medicine, The University of Toledo College of Medicine, 3000 Arlington Avenue, HEB 219, Toledo, OH, 43614, USA
- Erin L Crawford
- Division of Pulmonary and Critical Care Medicine, Department of Medicine, The University of Toledo College of Medicine, 3000 Arlington Avenue, HEB 219, Toledo, OH, 43614, USA
- Xiaolu Zhang
- Cancer Genetics and Comparative Genomics Branch (CGCGB), National Human Genome Research Institute (NHGRI), National Institutes of Health (NIH), Bldg 50, Rm 5341, 50 South Dr., Bethesda, MD, 20892, USA
- Sadik Khuder
- Division of Pulmonary and Critical Care Medicine, Department of Medicine, The University of Toledo College of Medicine, 3000 Arlington Avenue, RHC 0012, Toledo, OH, 43614, USA
- Tian Chen
- Department of Mathematics and Statistics, The University of Toledo, 2801 W. Bancroft Street, Toledo, OH, 43606, USA
- Albert Levin
- Department of Biostatistics, Henry Ford Health System, 1 Ford Place, Detroit, MI, 48202, USA
- Thomas M Blomquist
- Department of Pathology, The University of Toledo College of Medicine, 3000 Arlington Avenue, Toledo, OH, 43614, USA
- James C Willey
- Ruppert 0012, Division of Pulmonary and Critical Care Medicine, Department of Medicine, The University of Toledo College of Medicine, 3000 Arlington Avenue, Toledo, OH, 43614, USA