1. Williamson BD, Huang Y. Flexible variable selection in the presence of missing data. Int J Biostat 2024; 0:ijb-2023-0059. [PMID: 38348882 DOI: 10.1515/ijb-2023-0059] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/02/2023] [Accepted: 11/21/2023] [Indexed: 05/22/2024]
Abstract
In many applications, it is of interest to identify a parsimonious set of features, or panel, from multiple candidates that achieves a desired level of performance in predicting a response. This task is often complicated in practice by missing data arising from the sampling design or other random mechanisms. Most recent work on variable selection in missing data contexts relies in some part on a finite-dimensional statistical model, e.g., a generalized or penalized linear model. In cases where this model is misspecified, the selected variables may not all be truly scientifically relevant and can result in panels with suboptimal classification performance. To address this limitation, we propose a nonparametric variable selection algorithm combined with multiple imputation to develop flexible panels in the presence of missing-at-random data. We outline strategies based on the proposed algorithm that achieve control of commonly used error rates. Through simulations, we show that our proposal has good operating characteristics and results in panels with higher classification and variable selection performance compared to several existing penalized regression approaches in cases where a generalized linear model is misspecified. Finally, we use the proposed method to develop biomarker panels for separating pancreatic cysts with differing malignancy potential in a setting where complicated missingness in the biomarkers arose due to limited specimen volumes.
Affiliation(s)
- Brian D Williamson
- Biostatistics Division, Kaiser Permanente Washington Health Research Institute, Seattle, USA
- Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Center, Seattle, USA
- Department of Biostatistics, University of Washington, Seattle, USA
- Ying Huang
- Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Center, Seattle, USA
- Department of Biostatistics, University of Washington, Seattle, USA
2. Halbrook CJ, Lyssiotis CA, Pasca di Magliano M, Maitra A. Pancreatic cancer: Advances and challenges. Cell 2023; 186:1729-1754. [PMID: 37059070 PMCID: PMC10182830 DOI: 10.1016/j.cell.2023.02.014] [Citation(s) in RCA: 157] [Impact Index Per Article: 157.0] [Received: 11/13/2022] [Revised: 01/17/2023] [Accepted: 02/08/2023] [Indexed: 04/16/2023]
Abstract
Pancreatic ductal adenocarcinoma (PDAC) remains one of the deadliest cancers. Significant efforts have largely defined major genetic factors driving PDAC pathogenesis and progression. Pancreatic tumors are characterized by a complex microenvironment that orchestrates metabolic alterations and supports a milieu of interactions among various cell types within this niche. In this review, we highlight the foundational studies that have driven our understanding of these processes. We further discuss the recent technological advances that continue to expand our understanding of PDAC complexity. We posit that the clinical translation of these research endeavors will enhance the currently dismal survival rate of this recalcitrant disease.
Affiliation(s)
- Christopher J Halbrook
- Department of Molecular Biology and Biochemistry, University of California, Irvine, Irvine, CA 92697, USA; Institute for Immunology, University of California, Irvine, Irvine, CA 92697, USA; Chao Family Comprehensive Cancer Center, University of California, Irvine, Orange, CA 92868, USA.
- Costas A Lyssiotis
- Department of Molecular & Integrative Physiology, University of Michigan, Ann Arbor, MI 48109, USA; Department of Internal Medicine, Division of Gastroenterology and Hepatology, University of Michigan, Ann Arbor, MI 48109, USA; Rogel Cancer Center, University of Michigan, Ann Arbor, MI 48109, USA.
- Marina Pasca di Magliano
- Rogel Cancer Center, University of Michigan, Ann Arbor, MI 48109, USA; Department of Surgery, University of Michigan, Ann Arbor, MI 48109, USA; Department of Cell and Developmental Biology, University of Michigan, Ann Arbor, MI 48109, USA.
- Anirban Maitra
- Department of Translational Molecular Pathology, Sheikh Ahmed Center for Pancreatic Cancer Research, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA.
3. Wang CY, Feng Z. A Flexible Method for Diagnostic Accuracy with Biomarker Measurement Error. Mathematics (Basel, Switzerland) 2023; 11:549. [PMID: 37251695 PMCID: PMC10210524 DOI: 10.3390/math11030549] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/31/2023]
Abstract
Diagnostic biomarkers are often measured with errors due to imperfect lab conditions or analytic variability of the assay. The ability of a diagnostic biomarker to discriminate between cases and controls is often measured by the area under the receiver operating characteristic curve (AUC), sensitivity, specificity, among others. Ignoring measurement error can cause biased estimation of a diagnostic accuracy measure, which results in misleading interpretation of the efficacy of a diagnostic biomarker. Existing assays available are either research grade or clinical grade. Research assays are cost effective, often multiplex, but they may be associated with moderate measurement errors leading to poorer diagnostic performance. In comparison, clinical assays may provide better diagnostic ability, but with higher cost since they are usually developed by industry. Correction for attenuation methods are often valid when biomarkers are from a normal distribution, but may be biased with skewed biomarkers. In this paper, we develop a flexible method based on skew-normal biomarker distributions to correct for bias in estimating diagnostic performance measures including AUC, sensitivity, and specificity. Finite sample performance of the proposed method is examined via extensive simulation studies. The methods are applied to a pancreatic cancer biomarker study.
Affiliation(s)
- Ching-Yun Wang
- Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, P.O. Box 19024, Seattle, WA 98109-1024, USA
- Ziding Feng
- Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, P.O. Box 19024, Seattle, WA 98109-1024, USA
4. Permuth JB, Mesa T, Williams SL, Cardentey Y, Zhang D, Pawlak EA, Li J, Cameron ME, Ali KN, Jeong D, Yoder SJ, Chen DT, Trevino JG, Merchant N, Malafa M. A pilot study to troubleshoot quality control metrics when assessing circulating miRNA expression data reproducibility across study sites. Cancer Biomark 2022; 33:467-478. [PMID: 35491771 PMCID: PMC9428925 DOI: 10.3233/cbm-210255] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
Abstract
BACKGROUND: Given the growing interest in using microRNAs (miRNAs) as biomarkers of early disease, establishment of robust protocols and platforms for miRNA quantification in biological fluids is critical. OBJECTIVE: The goal of this multi-center pilot study was to evaluate the reproducibility of NanoString nCounter™ technology when analyzing the abundance of miRNAs in plasma and cystic fluid from patients with pancreatic lesions. METHODS: Using sample triplicates analyzed across three study sites, we assessed potential sources of variability (RNA isolation, sample processing/ligation, hybridization, and lot-to-lot variability) that may contribute to suboptimal reproducibility of miRNA abundance when using nCounter™, and evaluated expression of positive and negative controls, housekeeping genes, spike-in genes, and miRNAs. RESULTS: Positive controls showed a high correlation across samples from each site (median correlation coefficient, r > 0.9). Most negative control probes had expression levels below background. Housekeeping and spike-in genes each showed a similar distribution of expression and comparable pairwise correlation coefficients of replicate samples across sites. A total of 804 miRNAs showed a similar distribution of pairwise correlation coefficients between replicate samples (p = 0.93). After normalization and selecting miRNAs with expression levels above zero in 80% of samples, 55 miRNAs were identified; heatmap and principal component analysis revealed similar expression patterns and clustering in replicate samples. CONCLUSIONS: Findings from this pilot investigation suggest the nCounter™ platform can yield reproducible results across study sites. This study underscores the importance of implementing quality control procedures when designing multi-center evaluations of miRNA abundance.
Affiliation(s)
- Jennifer B. Permuth
- Department of Cancer Epidemiology, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, USA
- Department of Gastrointestinal Oncology, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, USA
- Tania Mesa
- Molecular Genomics Core Facility, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, USA
- Sion L. Williams
- Oncogenomics Shared Resource, Sylvester Comprehensive Cancer Center, Miami, FL, USA
- Yoslayma Cardentey
- Oncogenomics Shared Resource, Sylvester Comprehensive Cancer Center, Miami, FL, USA
- Dongyu Zhang
- Department of Cancer Epidemiology, University of Florida, Gainesville, FL, USA
- Jiannong Li
- Department of Biostatistics and Bioinformatics, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, USA
- Miles E. Cameron
- College of Medicine, University of Florida, Gainesville, FL, USA
- Karla N. Ali
- Department of Cancer Epidemiology, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, USA
- Daniel Jeong
- Department of Cancer Epidemiology, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, USA
- Department of Diagnostic Imaging and Interventional Radiology, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, USA
- Sean J. Yoder
- Molecular Genomics Core Facility, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, USA
- Dung-Tsa Chen
- Department of Biostatistics and Bioinformatics, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, USA
- Jose G. Trevino
- Department of Surgery, University of Florida, Gainesville, FL, USA
- Department of Surgery, Division of Surgical Oncology, Virginia Commonwealth University School of Medicine, Richmond, VA, USA
- Nipun Merchant
- Department of Surgery, Sylvester Comprehensive Cancer Center, Miami, FL, USA
- Mokenge Malafa
- Department of Gastrointestinal Oncology, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, USA
5. Dudley B, Brand RE. Pancreatic Cancer Surveillance and Novel Strategies for Screening. Gastrointest Endosc Clin N Am 2022; 32:13-25. [PMID: 34798981 DOI: 10.1016/j.giec.2021.08.008] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 02/04/2023]
Abstract
Individuals with a genetic susceptibility to pancreatic ductal adenocarcinoma (PDAC) may benefit from surveillance to increase the likelihood of early detection. Currently, candidates for surveillance are identified based on genetic test results and family history of PDAC, and surveillance is accomplished through imaging of the pancreas (endoscopic ultrasound or MRI). Novel methods that incorporate personalized risk, biomarkers, and radiomics are being investigated in an attempt to improve identification of at-risk individuals and to increase detection of precursor and early-stage lesions.
Affiliation(s)
- Beth Dudley
- Division of Gastroenterology, Hepatology, and Nutrition, Department of Medicine, University of Pittsburgh, 5200 Centre Avenue, Suite 409, Pittsburgh, PA 15232, USA
- Randall E Brand
- Division of Gastroenterology, Hepatology, and Nutrition, Department of Medicine, University of Pittsburgh, 5200 Centre Avenue, Suite 409, Pittsburgh, PA 15232, USA.
6. Yip-Schneider MT, Wu H, Allison HR, Easler JJ, Sherman S, Al-Haddad MA, Dewitt JM, Schmidt CM. Biomarker Risk Score Algorithm and Preoperative Stratification of Patients with Pancreatic Cystic Lesions. J Am Coll Surg 2021; 233:426-434.e4. [PMID: 34166836 DOI: 10.1016/j.jamcollsurg.2021.05.030] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/30/2021] [Revised: 05/28/2021] [Accepted: 05/28/2021] [Indexed: 12/25/2022]
Abstract
BACKGROUND: Pancreatic cysts are incidentally detected in up to 13% of patients undergoing radiographic imaging. Of the most frequently encountered types, mucin-producing (mucinous) pancreatic cystic lesions may develop into pancreatic cancer, while nonmucinous ones have little or no malignant potential. Accurate preoperative diagnosis is critical for optimal management, but has been difficult to achieve, resulting in unnecessary major surgery. Here, we aim to develop an algorithm based on biomarker risk scores to improve risk stratification. STUDY DESIGN: Patients undergoing surgery and/or surveillance for a pancreatic cystic lesion, with diagnostic imaging and banked pancreatic cyst fluid, were enrolled in the study after informed consent (n = 163 surgical, 67 surveillance). Cyst fluid biomarkers with high specificity for distinguishing nonmucinous from mucinous pancreatic cysts (vascular endothelial growth factor [VEGF], glucose, carcinoembryonic antigen [CEA], amylase, cytology, and DNA mutation) were selected. Biomarker risk scores were used to design an algorithm to predict preoperative diagnosis. Performance was tested using surgical (retrospective) and surveillance (prospective) cohorts. RESULTS: In the surgical cohort, the biomarker algorithm outperformed the preoperative clinical diagnosis in correctly predicting the final pathologic diagnosis (91% vs 73%; p < 0.000001). Specifically, nonmucinous serous cystic neoplasms (SCN) and mucinous cystic neoplasms (MCN) were correctly classified more frequently by the algorithm than clinical diagnosis (96% vs 30%; p < 0.000008 and 92% vs 69%; p = 0.04, respectively). In the surveillance cohort, the algorithm predicted a preoperative diagnosis with high confidence based on a high biomarker score and/or consistency with imaging from ≥1 follow-up visits. CONCLUSIONS: A biomarker risk score-based algorithm was able to correctly classify pancreatic cysts preoperatively. Importantly, this tool may improve initial and dynamic risk stratification, reducing overdiagnosis and underdiagnosis.
Affiliation(s)
- Michele T Yip-Schneider
- Department of Surgery, Indiana University School of Medicine, Indianapolis, IN; Walther Oncology Center, Indianapolis, IN; Indiana University Simon Cancer Center, Indianapolis, IN; Indiana University Health Pancreatic Cyst and Cancer Early Detection Center, Indianapolis, IN
- Huangbing Wu
- Department of Surgery, Indiana University School of Medicine, Indianapolis, IN; Indiana University Health Pancreatic Cyst and Cancer Early Detection Center, Indianapolis, IN
- Jeffrey J Easler
- Department of Medicine, Division of Gastroenterology, Indianapolis, IN
- Stuart Sherman
- Department of Medicine, Division of Gastroenterology, Indianapolis, IN
- Mohammad A Al-Haddad
- Department of Medicine, Division of Gastroenterology, Indianapolis, IN; Indiana University Health Pancreatic Cyst and Cancer Early Detection Center, Indianapolis, IN
- John M Dewitt
- Department of Medicine, Division of Gastroenterology, Indianapolis, IN
- C Max Schmidt
- Department of Surgery, Indiana University School of Medicine, Indianapolis, IN; Biochemistry/Molecular Biology, Indianapolis, IN; Walther Oncology Center, Indianapolis, IN; Indiana University Simon Cancer Center, Indianapolis, IN; Indiana University Health Pancreatic Cyst and Cancer Early Detection Center, Indianapolis, IN.
7. Kenner B, Chari ST, Kelsen D, Klimstra DS, Pandol SJ, Rosenthal M, Rustgi AK, Taylor JA, Yala A, Abul-Husn N, Andersen DK, Bernstein D, Brunak S, Canto MI, Eldar YC, Fishman EK, Fleshman J, Go VLW, Holt JM, Field B, Goldberg A, Hoos W, Iacobuzio-Donahue C, Li D, Lidgard G, Maitra A, Matrisian LM, Poblete S, Rothschild L, Sander C, Schwartz LH, Shalit U, Srivastava S, Wolpin B. Artificial Intelligence and Early Detection of Pancreatic Cancer: 2020 Summative Review. Pancreas 2021; 50:251-279. [PMID: 33835956 PMCID: PMC8041569 DOI: 10.1097/mpa.0000000000001762] [Citation(s) in RCA: 49] [Impact Index Per Article: 16.3] [Indexed: 02/07/2023]
Abstract
Despite considerable research efforts, pancreatic cancer is associated with a dire prognosis and a 5-year survival rate of only 10%. Early symptoms of the disease are mostly nonspecific. The premise of improved survival through early detection is that more individuals will benefit from potentially curative treatment. Artificial intelligence (AI) methodology has emerged as a successful tool for risk stratification and identification in general health care. In response to the maturity of AI, Kenner Family Research Fund conducted the 2020 AI and Early Detection of Pancreatic Cancer Virtual Summit (www.pdac-virtualsummit.org) in conjunction with the American Pancreatic Association, with a focus on the potential of AI to advance early detection efforts in this disease. This comprehensive presummit article was prepared based on information provided by each of the interdisciplinary participants on one of the 5 following topics: Progress, Problems, and Prospects for Early Detection; AI and Machine Learning; AI and Pancreatic Cancer-Current Efforts; Collaborative Opportunities; and Moving Forward-Reflections from Government, Industry, and Advocacy. The outcome from the robust Summit conversations, to be presented in a future white paper, indicates that significant progress must be the result of strategic collaboration among investigators and institutions from multidisciplinary backgrounds, supported by committed funders.
Affiliation(s)
- Suresh T. Chari
- Department of Gastroenterology, Hepatology and Nutrition, The University of Texas MD Anderson Cancer Center, Houston, TX
- David S. Klimstra
- Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, NY
- Stephen J. Pandol
- Basic and Translational Pancreas Research Program, Department of Medicine, Cedars-Sinai Medical Center, Los Angeles, CA
- Anil K. Rustgi
- Division of Digestive and Liver Diseases, Department of Medicine, NewYork-Presbyterian/Columbia University Irving Medical Center, New York, NY
- Adam Yala
- Department of Electrical Engineering and Computer Science
- Jameel Clinic, Massachusetts Institute of Technology, Cambridge, MA
- Noura Abul-Husn
- Division of Genomic Medicine, Department of Medicine, Icahn School of Medicine, Mount Sinai, New York, NY
- Dana K. Andersen
- Division of Digestive Diseases and Nutrition, National Institute of Diabetes and Digestive and Kidney Diseases, Bethesda, MD
- Søren Brunak
- Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Copenhagen, Denmark
- Marcia Irene Canto
- Division of Gastroenterology, Johns Hopkins University School of Medicine, Baltimore, MD
- Yonina C. Eldar
- Department of Math and Computer Science, Weizmann Institute of Science, Rehovot, Israel
- Elliot K. Fishman
- Department of Radiology and Radiological Science, Johns Hopkins Medicine, Baltimore, MD
- Vay Liang W. Go
- UCLA Center for Excellence in Pancreatic Diseases, University of California, Los Angeles, Los Angeles, CA
- Bruce Field
- Kenner Family Research Fund, New York, NY
- Ann Goldberg
- Kenner Family Research Fund, New York, NY
- Christine Iacobuzio-Donahue
- David M. Rubenstein Center for Pancreatic Cancer Research, Memorial Sloan Kettering Cancer Center, New York, NY
- Debiao Li
- Biomedical Research Institute, Cedars-Sinai Medical Center, Los Angeles, CA
- Anirban Maitra
- Department of Pathology, The University of Texas MD Anderson Cancer Center, Houston, TX
- Lawrence H. Schwartz
- Department of Radiology, NewYork-Presbyterian Hospital/Columbia University Irving Medical Center, New York, NY
- Uri Shalit
- Faculty of Industrial Engineering and Management, Technion—Israel Institute of Technology, Haifa, Israel
- Sudhir Srivastava
- Division of Cancer Prevention, National Cancer Institute, Bethesda, MD
- Brian Wolpin
- Gastrointestinal Cancer Center, Dana-Farber Cancer Institute, Boston, MA
8. Bast RC, Srivastava S. The National Cancer Institute Early Detection Research Network: Two Decades of Progress. Cancer Epidemiol Biomarkers Prev 2020; 29:2396-2400. [PMID: 33262198 DOI: 10.1158/1055-9965.epi-20-1158] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 08/03/2020] [Revised: 08/29/2020] [Accepted: 08/31/2020] [Indexed: 12/25/2022]
Affiliation(s)
- Robert C Bast
- Department of Experimental Therapeutics, The University of Texas MD Anderson Cancer Center, Houston, Texas.
- Sudhir Srivastava
- Division of Cancer Prevention, National Cancer Institute, National Institutes of Health, Bethesda, Maryland