1
Lundine JP, Huling JD, Adelson PD, Burd RS, Fuentes M, Haarbauer-Krupa J, Hagen K, Iske C, Koterba C, Kurowski BG, Petrucci S, Rose SC, Sadowsky CL, Westendorf J, Truelove A, Leonard JC. Using Billing Codes to Create a Pediatric Functional Status e-Score for Children Receiving Inpatient Rehabilitation. Arch Phys Med Rehabil 2023; 104:1882-1891. [PMID: 37075966 PMCID: PMC10579455 DOI: 10.1016/j.apmr.2023.03.025] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/26/2022] [Revised: 03/15/2023] [Accepted: 03/24/2023] [Indexed: 04/21/2023]
Abstract
OBJECTIVE: Provide proof-of-concept for development of a Pediatric Functional Status eScore (PFSeS). Demonstrate that expert clinicians rank billing codes as relevant to patient functional status and identify the domains that codes inform in a way that reliably matches analytical modeling.
DESIGN: Retrospective chart review, modified Delphi, and nominal group techniques.
SETTING: Large, urban, quaternary care children's hospital in the Midwestern United States.
PARTICIPANTS: Data from 1955 unique patients and 2029 hospital admissions (2000-2020); 12 expert consultants representing the continuum of rehabilitation care reviewed 2893 codes (procedural, diagnostic, pharmaceutical, durable medical equipment).
MAIN OUTCOME MEASURES: Consensus voting to determine whether codes were associated with functional status at discharge and, if so, what domains they informed (self-care, mobility, cognition/communication).
RESULTS: The top 250 and 500 codes identified by statistical modeling were mostly composed of codes selected by the consultant panel (78%-80% of the top 250 and 71%-78% of the top 500). The results provide evidence that clinical experts' selection of functionally meaningful codes corresponds with codes selected by statistical modeling as most strongly associated with WeeFIM domain scores. The top 5 codes most strongly related to functional independence ratings from a domain-specific assessment indicate clinically sensible relationships, further supporting the use of billing data in modeling to create a PFSeS.
CONCLUSIONS: Development of a PFSeS that is predicated on billing data would improve researchers' ability to assess the functional status of children who receive inpatient rehabilitation care for a neurologic injury or illness. An expert clinician panel, representing the spectrum of medical and rehabilitative care, indicated that proposed statistical modeling identifies relevant codes mapped to 3 important domains: self-care, mobility, and cognition/communication.
Affiliation(s)
- Jennifer P Lundine
- Department of Speech & Hearing Science, The Ohio State University, Columbus, OH; Division of Clinical Therapies & Inpatient Rehabilitation Program, Nationwide Children's Hospital, Columbus, OH.
- Jared D Huling
- Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN
- P David Adelson
- Rockefeller Neuroscience Institute and West Virginia University Medicine Children's Neuroscience Center, Morgantown, WV
- Randall S Burd
- Division of Trauma and Burn Surgery, Children's National Hospital, Washington, DC
- Molly Fuentes
- Department of Rehabilitation Medicine, University of Washington, Seattle, WA
- Kaitlin Hagen
- International Center for Spinal Cord Injury, Kennedy Krieger Institute and Johns Hopkins School of Medicine, Baltimore, MD
- Cynthia Iske
- Inpatient Rehabilitation Program, Nationwide Children's Hospital, Columbus, OH
- Christine Koterba
- Division of Pediatric Psychology and Neuropsychology, Department of Pediatrics, The Ohio State University College of Medicine, and Nationwide Children's Hospital, Columbus, OH
- Brad G Kurowski
- Division of Pediatric Rehabilitation Medicine, Cincinnati Children's Hospital Medical Center, Cincinnati, OH; Departments of Pediatrics and Neurology and Rehabilitation Medicine, University of Cincinnati College of Medicine, Cincinnati, OH
- Stephanie Petrucci
- Inpatient Rehabilitation Program, Nationwide Children's Hospital, Columbus, OH
- Sean C Rose
- Division of Neurology, Department of Pediatrics, Nationwide Children's Hospital and The Ohio State University College of Medicine, Columbus, OH
- Cristina L Sadowsky
- International Center for Spinal Cord Injury, Kennedy Krieger Institute and Johns Hopkins School of Medicine, Baltimore, MD
- Jennifer Westendorf
- Division of Occupational Therapy and Physical Therapy, Cincinnati Children's Hospital Medical Center, Cincinnati, OH
- Annie Truelove
- Abigail Wexner Research Institute at Nationwide Children's Hospital, Columbus, OH
- Julie C Leonard
- Abigail Wexner Research Institute at Nationwide Children's Hospital, Columbus, OH; Division of Emergency Medicine, Department of Pediatrics, The Ohio State University College of Medicine, and Nationwide Children's Hospital, Columbus, OH
2
Parker M, Zheng Z, Lasarev M, Alexandridis RA, Newton MA, Shelef MA, McCoy SS. Novel autoantibodies help diagnose anti-SSA antibody negative Sjögren's disease and predict abnormal labial salivary gland pathology. medRxiv: The Preprint Server for Health Sciences 2023:2023.08.29.23294775. [PMID: 37693588 PMCID: PMC10491389 DOI: 10.1101/2023.08.29.23294775] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/12/2023]
Abstract
Objectives: Sjögren's disease (SjD) diagnosis requires either positive anti-SSA antibodies or a labial salivary gland biopsy with a positive focus score (FS). One-third of SjD patients lack anti-SSA antibodies (SSA-), requiring a positive FS for diagnosis. Our objective was to identify novel autoantibodies to diagnose 'seronegative' SjD.
Methods: IgG binding to a high-density whole human peptidome array was quantified using sera from SSA- SjD cases and matched non-autoimmune controls. We identified the highest bound peptides using empirical Bayesian statistical filters, which we confirmed in an independent cohort comprising SSA- SjD (n=76), sicca controls without autoimmunity (n=75), and autoimmune controls (SjD features but not meeting SjD criteria; n=41). In this external validation, we used non-parametric methods for peptide abundance and controlled false discovery rate in group comparisons. For predictive modeling, we used logistic regression, model selection methods, and cross-validation to identify clinical and peptide variables that predict SSA- SjD and FS positivity.
Results: IgG against a peptide from D-aminoacyl-tRNA deacylase (DTD2) was bound more in SSA- SjD than sicca controls (p=0.004) and more than combined controls (sicca and autoimmune controls combined; p=0.003). IgG against peptides from retroelement silencing factor-1 (RESF1) and DTD2 were bound more in FS-positive than FS-negative participants (p=0.010; p=0.012). A predictive model incorporating clinical variables showed good discrimination between SjD versus control (AUC 74%) and between FS-positive versus FS-negative (AUC 72%).
Conclusion: We present novel autoantibodies in SSA- SjD that have good predictive value for SSA- SjD and FS-positivity.
KEY MESSAGES
What is already known on this topic - Seronegative (anti-SSA antibody negative [SSA-]) Sjögren's disease (SjD) requires a labial salivary gland biopsy for diagnosis, which is challenging to obtain and interpret.
What this study adds - We identified novel autoantibodies in SSA- SjD that, when combined with readily available clinical variables, provide good predictive ability to discriminate 1) SSA- SjD from control participants and 2) abnormal salivary gland biopsies from normal salivary gland biopsies.
How this study might affect research, practice or policy - This study provides novel diagnostic antibodies addressing the critical need for improvement of SSA- SjD diagnostic tools.
3
Bérubé S, Kobayashi T, Wesolowski A, Norris DE, Ruczinski I, Moss WJ, Louis TA. A Bayesian hierarchical model for signal extraction from protein microarrays. Stat Med 2023; 42:1445-1460. [PMID: 36872556 DOI: 10.1002/sim.9680] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/03/2022] [Revised: 11/09/2022] [Accepted: 01/30/2023] [Indexed: 03/07/2023]
Abstract
Protein microarrays are a promising technology that measures protein levels in serum or plasma samples. Due to their high technical variability and high variation in protein levels across serum samples in any population, directly answering biological questions of interest using protein microarray measurements is challenging. Analyzing preprocessed data and within-sample ranks of protein levels can mitigate the impact of between-sample variation. As for any analysis, ranks are sensitive to preprocessing, but loss-function-based ranks that accommodate major structural relations and components of uncertainty are very effective. Bayesian modeling with full posterior distributions for quantities of interest produces the most effective ranks. Such Bayesian models have been developed for other assays, for example, DNA microarrays, but modeling assumptions for these assays are not appropriate for protein microarrays. Consequently, we develop and evaluate a Bayesian model to extract the full posterior distribution of normalized protein levels and associated ranks for protein microarrays, and show that it fits well to data from two studies that use protein microarrays produced by different manufacturing processes. We validate the model via simulation and demonstrate the downstream impact of using estimates from this model to obtain optimal ranks.
Affiliation(s)
- Sophie Bérubé
- Department of Biostatistics, Johns Hopkins University Bloomberg School of Public Health, Baltimore, Maryland, USA
- Tamaki Kobayashi
- Department of Epidemiology, Johns Hopkins University Bloomberg School of Public Health, Baltimore, Maryland, USA
- Amy Wesolowski
- Department of Epidemiology, Johns Hopkins University Bloomberg School of Public Health, Baltimore, Maryland, USA
- Douglas E Norris
- Department of Molecular Microbiology and Immunology, Johns Hopkins University Bloomberg School of Public Health, Baltimore, Maryland, USA
- Ingo Ruczinski
- Department of Biostatistics, Johns Hopkins University Bloomberg School of Public Health, Baltimore, Maryland, USA
- William J Moss
- Department of Epidemiology, Johns Hopkins University Bloomberg School of Public Health, Baltimore, Maryland, USA
- Department of Molecular Microbiology and Immunology, Johns Hopkins University Bloomberg School of Public Health, Baltimore, Maryland, USA
- Thomas A Louis
- Department of Biostatistics, Johns Hopkins University Bloomberg School of Public Health, Baltimore, Maryland, USA
4
James GM, Radchenko P, Rava B. Irrational Exuberance: Correcting Bias in Probability Estimates. J Am Stat Assoc 2022. [DOI: 10.1080/01621459.2020.1787175] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/23/2022]
Affiliation(s)
- Gareth M. James
- Department of Data Sciences and Operations, University of Southern California, Los Angeles, CA
- Bradley Rava
- Department of Data Sciences and Operations, University of Southern California, Los Angeles, CA
5
Al Mohamad D, van Zwet E, Solari A, Goeman J. Simultaneous confidence intervals for ranks using the partitioning principle. Electron J Stat 2021. [DOI: 10.1214/21-ejs1847] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/19/2022]
Affiliation(s)
- Diaa Al Mohamad
- Leiden University Medical Center, Einthovenweg 20, 2333 ZC Leiden, The Netherlands
- Erik van Zwet
- Leiden University Medical Center, Einthovenweg 20, 2333 ZC Leiden, The Netherlands
- Aldo Solari
- University of Milano-Bicocca, 1 Piazza dell’Ateneo Nuovo, 20126 Milano, Italy
- Jelle Goeman
- Leiden University Medical Center, Einthovenweg 20, 2333 ZC Leiden, The Netherlands
6
Ferguson J, Chang J. An empirical Bayesian ranking method, with applications to high throughput biology. Bioinformatics 2020; 36:177-185. [PMID: 31197345 DOI: 10.1093/bioinformatics/btz471] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/04/2018] [Revised: 04/30/2019] [Accepted: 06/05/2019] [Indexed: 11/13/2022]
Abstract
MOTIVATION: In bioinformatics, genome-wide experiments look for important biological differences between two groups at a large number of locations in the genome. Often, the final analysis focuses on a P-value-based ranking of locations which might then be investigated further in follow-up experiments. However, this strategy may result in small effect sizes, with low P-values, being ranked more favorably than larger more scientifically important effects. Bayesian ranking techniques may offer a solution to this problem provided a good prior distribution for the collective distribution of effect sizes is available.
RESULTS: We develop an Empirical Bayes ranking algorithm, using the marginal distribution of the data over all locations to estimate an appropriate prior. In simulations and analysis using real datasets, we demonstrate favorable performance compared to ordering P-values and a number of other competing ranking methods. The algorithm is computationally efficient and can be used to rank the entirety of genomic locations or to rank a subset of locations, pre-selected via traditional FWER/FDR methods in a 2-stage analysis.
AVAILABILITY AND IMPLEMENTATION: An R-package, EBrank, implementing the ranking algorithm is available on CRAN.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
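The contrast drawn in this abstract, between ordering by P-value and ordering by a shrunken effect size, can be illustrated with a toy normal-normal model. The sketch below is not the EBrank algorithm: it assumes a zero-mean Gaussian prior whose variance is estimated from all locations by a simple method-of-moments step, and every function name and number in it is hypothetical.

```python
def estimate_prior_variance(estimates, std_errors):
    """Method-of-moments estimate of tau^2, assuming
    x_i ~ N(theta_i, s_i^2) and theta_i ~ N(0, tau^2),
    so that E[x_i^2] = tau^2 + s_i^2 marginally."""
    excess = [x * x - s * s for x, s in zip(estimates, std_errors)]
    return max(sum(excess) / len(excess), 0.0)


def posterior_means(estimates, std_errors):
    """Shrink each estimate toward zero by the factor tau^2 / (tau^2 + s_i^2)."""
    tau2 = estimate_prior_variance(estimates, std_errors)
    return [x * tau2 / (tau2 + s * s) for x, s in zip(estimates, std_errors)]


def order_by(scores):
    """Indices sorted from largest to smallest absolute score."""
    return sorted(range(len(scores)), key=lambda i: -abs(scores[i]))


# Hypothetical data: location 0 has a tiny effect measured very precisely.
estimates = [0.2, 2.0, -1.5, 0.1]
std_errors = [0.02, 0.8, 0.5, 0.5]

# Ordering by |z| is the same as ordering by two-sided normal P-value.
z_scores = [x / s for x, s in zip(estimates, std_errors)]
by_z = order_by(z_scores)
by_eb = order_by(posterior_means(estimates, std_errors))
print("P-value ordering:", by_z)
print("Shrinkage ordering:", by_eb)
```

In this toy input the P-value ordering places the precisely measured small effect (index 0) first, while the shrinkage ordering places the larger effects (indices 1 and 2) ahead of it.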
Affiliation(s)
- John Ferguson
- Biostatistics Division, HRB Clinical Research Facility, National University of Ireland Galway, Galway, Ireland
- Joseph Chang
- Department of Statistics and Data Science, Yale University, New Haven, CT, USA
7
Juul M, Madsen T, Guo Q, Bertl J, Hobolth A, Kellis M, Pedersen JS. ncdDetect2: improved models of the site-specific mutation rate in cancer and driver detection with robust significance evaluation. Bioinformatics 2019; 35:189-199. [PMID: 29945188 PMCID: PMC6330011 DOI: 10.1093/bioinformatics/bty511] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 01/09/2018] [Accepted: 06/24/2018] [Indexed: 01/22/2023]
Abstract
Motivation: Understanding the mutational processes that act during cancer development is a key topic of cancer biology. Nevertheless, much remains to be learned, as a complex interplay of processes with dependencies on a range of genomic features creates highly heterogeneous cancer genomes. Accurate driver detection relies on unbiased models of the mutation rate that also capture rate variation from uncharacterized sources.
Results: Here, we analyse patterns of observed-to-expected mutation counts across 505 whole cancer genomes, and find that genomic features missing from our mutation-rate model likely operate on a megabase length scale. We extend our site-specific model of the mutation rate to include the additional variance from these sources, which leads to robust significance evaluation of candidate cancer drivers. We thus present ncdDetect v.2, with greatly improved cancer driver detection specificity. Finally, we show that ranking candidates by their posterior mean value of their effect sizes offers an equivalent and more computationally efficient alternative to ranking by their P-values.
Availability and implementation: ncdDetect v.2 is implemented as an R-package and is freely available at http://github.com/TobiasMadsen/ncdDetect2
Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Malene Juul
- Department of Molecular Medicine, Aarhus University, Palle Juul-Jensens Boulevard 99, DK-8200 Aarhus N, Denmark; Bioinformatics Research Centre, Aarhus University, C.F. Mollers Alle 8, DK-8000 Aarhus C, Denmark
- Tobias Madsen
- Department of Molecular Medicine, Aarhus University, Palle Juul-Jensens Boulevard 99, DK-8200 Aarhus N, Denmark; Bioinformatics Research Centre, Aarhus University, C.F. Mollers Alle 8, DK-8000 Aarhus C, Denmark
- Qianyun Guo
- Bioinformatics Research Centre, Aarhus University, C.F. Mollers Alle 8, DK-8000 Aarhus C, Denmark
- Johanna Bertl
- Department of Molecular Medicine, Aarhus University, Palle Juul-Jensens Boulevard 99, DK-8200 Aarhus N, Denmark
- Asger Hobolth
- Bioinformatics Research Centre, Aarhus University, C.F. Mollers Alle 8, DK-8000 Aarhus C, Denmark
- Manolis Kellis
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology (MIT), Cambridge, MA, USA
- Jakob Skou Pedersen
- Department of Molecular Medicine, Aarhus University, Palle Juul-Jensens Boulevard 99, DK-8200 Aarhus N, Denmark; Bioinformatics Research Centre, Aarhus University, C.F. Mollers Alle 8, DK-8000 Aarhus C, Denmark
8
Guan L, Chen X, Wong WH. Detecting strong signals in gene perturbation experiments: An adaptive approach with power guarantee and FDR control. J Am Stat Assoc 2019; 2019. [PMID: 33311819 DOI: 10.1080/01621459.2019.1635484] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/26/2022]
Abstract
The perturbation of a transcription factor should affect the expression levels of its direct targets. However, not all genes showing changes in expression are direct targets. To increase the chance of detecting direct targets, we propose a modified two-group model where the null group corresponds to genes which are not direct targets, but can have small non-zero effects. We model the behavior of genes from the null set by a Gaussian distribution with unknown variance τ². To estimate τ², we focus on a simple estimation approach, the iterated empirical Bayes estimation. We conduct a detailed analysis of the properties of the iterated EB estimate and provide a theoretical guarantee of its good performance under mild conditions. We provide simulations comparing the new modeling approach with existing methods, and the new approach shows more stable and better performance under different situations. We also apply it to a real data set from gene knock-down experiments and obtain better results compared with the original two-group model testing for non-zero effects.
Affiliation(s)
- Leying Guan
- Department of Statistics, Stanford University
- Xi Chen
- Department of Statistics, Stanford University
- Wing Hung Wong
- Department of Statistics, Stanford University; Biomedical Data Sciences, Stanford University
9
Jewett PI, Zhu L, Huang B, Feuer EJ, Gangnon RE. Optimal Bayesian point estimates and credible intervals for ranking with application to county health indices. Stat Methods Med Res 2018; 28:2876-2891. [PMID: 30062909 DOI: 10.1177/0962280218790104] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 11/15/2022]
Abstract
It is fairly common to rank different geographic units, e.g. counties in the USA, based on health indices. In a typical application, point estimates of the health indices are obtained for each county, and the indices are then simply ranked as if they were known constants. Several authors have considered optimal rank estimators under squared error loss on the rank scale as a default method for general purpose ranking, e.g. situations where ranking units across the full spectrum of performance (low, medium, high) is important. While computationally convenient, squared error loss on the rank scale may not represent the true inferential goals of rank consumers. We construct alternative loss functions based on three components: (1) the inferential goal (rank position or pairwise comparisons), (2) the scale (original, log-transformed or rank) and (3) the (positional or pairwise) loss function (0/1, squared error or absolute error). We can obtain optimal ranks for loss functions based on rank positions and nearly optimal ranks for loss functions based on pairwise comparisons paired with highest posterior density (HPD) credible intervals. We compare inferences produced by the various ranking methods, both optimal and heuristic, using low birth weight data for counties in the Midwestern United States, from 2006 to 2012.
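The default estimator mentioned in this abstract, the optimal rank under squared error loss on the rank scale, is the posterior mean of each unit's rank. A minimal sketch follows, assuming independent Gaussian posteriors as a stand-in for real model output; the helper names and the numbers are hypothetical and are not the low birth weight data.

```python
import random


def ranks_within_draw(values):
    """Rank the units in one posterior draw; rank 1 is the smallest value."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for position, idx in enumerate(order, start=1):
        ranks[idx] = position
    return ranks


def posterior_mean_ranks(draws):
    """Average each unit's rank over all posterior draws."""
    n_units = len(draws[0])
    totals = [0.0] * n_units
    for draw in draws:
        for i, r in enumerate(ranks_within_draw(draw)):
            totals[i] += r
    return [t / len(draws) for t in totals]


def simulate_draws(means, sds, n_draws=4000, seed=0):
    """Assumed stand-in for MCMC output: independent Gaussian posteriors."""
    rng = random.Random(seed)
    return [[rng.gauss(m, s) for m, s in zip(means, sds)]
            for _ in range(n_draws)]


# Hypothetical rates for three units; unit 1 has the largest point
# estimate but a far wider posterior than the other two.
means = [0.08, 0.10, 0.09]
sds = [0.001, 0.05, 0.001]

draws = simulate_draws(means, sds)
mean_ranks = posterior_mean_ranks(draws)
print([round(r, 2) for r in mean_ranks])
```

Ranking the point estimates as if they were known constants would put unit 1 at rank 3. Its posterior mean rank is instead pulled toward the middle, because a wide posterior leaves a substantial chance that its true value lies below both other units.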
Affiliation(s)
- Patricia I Jewett
- Department of Population Health Sciences, University of Wisconsin-Madison, Madison, WI, USA
- Li Zhu
- Division of Cancer Control and Population Sciences, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
- Bin Huang
- Department of Biostatistics, University of Kentucky, Lexington, KY, USA
- Eric J Feuer
- Division of Cancer Control and Population Sciences, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
- Ronald E Gangnon
- Department of Population Health Sciences, University of Wisconsin-Madison, Madison, WI, USA
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI, USA