1
Schaekermann M, Spitz T, Pyles M, Cole-Lewis H, Wulczyn E, Pfohl SR, Martin D, Jaroensri R, Keeling G, Liu Y, Farquhar S, Xue Q, Lester J, Hughes C, Strachan P, Tan F, Bui P, Mermel CH, Peng LH, Matias Y, Corrado GS, Webster DR, Virmani S, Semturs C, Liu Y, Horn I, Cameron Chen PH. Health equity assessment of machine learning performance (HEAL): a framework and dermatology AI model case study. EClinicalMedicine 2024;70:102479. [PMID: 38685924; PMCID: PMC11056401; DOI: 10.1016/j.eclinm.2024.102479]
Abstract
Background Artificial intelligence (AI) has repeatedly been shown to encode historical inequities in healthcare. We aimed to develop a framework to quantitatively assess the performance equity of health AI technologies and to illustrate its utility via a case study. Methods Here, we propose a methodology, complementary to existing fairness metrics, to assess whether health AI technologies prioritise performance for patient populations experiencing worse outcomes. We developed the Health Equity Assessment of machine Learning performance (HEAL) framework, designed to quantitatively assess the performance equity of health AI technologies via a four-step interdisciplinary process to understand and quantify domain-specific criteria, and the resulting HEAL metric. As an illustrative case study (analysis conducted between October 2022 and January 2023), we applied the HEAL framework to a dermatology AI model. A set of 5420 teledermatology cases (store-and-forward cases from patients aged 20 years or older, submitted by primary care providers in the USA and skin cancer clinics in Australia), enriched for diversity in age, sex and race/ethnicity, was used to retrospectively evaluate the AI model's HEAL metric, defined as the likelihood that the AI model performs better for subpopulations with worse average health outcomes than for others. The likelihood that AI performance was anticorrelated with pre-existing health outcomes was estimated using bootstrap methods as the probability that the negated Spearman's rank correlation coefficient ("R") was greater than zero. Positive values of R suggest that subpopulations with poorer health outcomes have better AI model performance. Thus, the HEAL metric, defined as p(R > 0), measures how likely the AI technology is to prioritise performance for subpopulations with worse average health outcomes (presented as a percentage below).
Health outcomes were quantified as disability-adjusted life years (DALYs) when grouping by sex and age, and years of life lost (YLLs) when grouping by race/ethnicity. AI performance was measured as top-3 agreement with the reference diagnosis from a panel of three dermatologists per case. Findings Across all dermatologic conditions, the HEAL metric was 80.5% for prioritising AI performance of racial/ethnic subpopulations based on YLLs, and 92.1% and 0.0%, respectively, for prioritising AI performance of sex and age subpopulations based on DALYs. Certain dermatologic conditions were significantly associated with greater AI model performance compared with a reference category of less common conditions. For skin cancer conditions, the HEAL metric was 73.8% for prioritising AI performance of age subpopulations based on DALYs. Interpretation Analysis using the proposed HEAL framework showed that the dermatology AI model prioritised performance for race/ethnicity, sex (all conditions) and age (cancer conditions) subpopulations with respect to pre-existing health disparities. More work is needed to investigate ways of promoting equitable AI performance across age for non-cancer conditions and to better understand how AI models can contribute towards improving equity in health outcomes. Funding Google LLC.
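The bootstrap estimate of p(R > 0) described in the abstract can be sketched as follows. All subgroup labels, burden values, case counts and agreement rates below are invented for illustration, and the health-outcome axis is oriented so that higher values mean better health (the negative of DALY/YLL burden):

```python
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(0)

# Hypothetical subgroups: DALY/YLL burden (higher = worse pre-existing
# outcomes) and simulated per-case top-3 agreement indicators.
burden = {"A": 1200.0, "B": 950.0, "C": 700.0, "D": 400.0}
top3_rate = {"A": 0.85, "B": 0.78, "C": 0.72, "D": 0.65}
cases = {g: rng.binomial(1, top3_rate[g], size=300) for g in burden}

def heal_metric(burden, cases, n_boot=2000):
    """Estimate p(R > 0), where R is the negated Spearman correlation between
    health outcomes (oriented so higher = better, i.e. -burden) and per-group
    AI performance; equivalently R = +Spearman(burden, performance)."""
    groups = list(burden)
    b = np.array([burden[g] for g in groups])
    hits = 0
    for _ in range(n_boot):
        # Resample cases with replacement within each subgroup, then
        # recompute that subgroup's top-3 agreement.
        perf = np.array([rng.choice(cases[g], size=len(cases[g])).mean()
                         for g in groups])
        rho, _ = spearmanr(-b, perf)  # correlation with health outcomes
        if -rho > 0:
            hits += 1
    return hits / n_boot

print(f"HEAL metric ~ {heal_metric(burden, cases):.1%}")
```

Because the invented agreement rates rise with burden, this toy HEAL metric comes out high; the paper's case study computes the analogous quantity over real teledermatology subgroups.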
Affiliation(s)
- Malcolm Pyles
- Advanced Clinical, Deerfield, IL, USA
- Department of Dermatology, Cleveland Clinic, Cleveland, OH, USA
- Yuan Liu
- Google Health, Mountain View, CA, USA
- Jenna Lester
- Advanced Clinical, Deerfield, IL, USA
- Department of Dermatology, University of California, San Francisco, CA, USA
- Peggy Bui
- Google Health, Mountain View, CA, USA
- Yun Liu
- Google Health, Mountain View, CA, USA
- Ivor Horn
- Google Health, Mountain View, CA, USA
2
Lemmon J, Guo LL, Steinberg E, Morse KE, Fleming SL, Aftandilian C, Pfohl SR, Posada JD, Shah N, Fries J, Sung L. Self-supervised machine learning using adult inpatient data produces effective models for pediatric clinical prediction tasks. J Am Med Inform Assoc 2023;30:2004-2011. [PMID: 37639620; PMCID: PMC10654865; DOI: 10.1093/jamia/ocad175]
Abstract
OBJECTIVE Development of electronic health records (EHR)-based machine learning models for pediatric inpatients is challenged by limited training data. Self-supervised learning using adult data may be a promising approach to creating robust pediatric prediction models. The primary objective was to determine whether a self-supervised model trained in adult inpatients was noninferior to logistic regression models trained in pediatric inpatients for pediatric inpatient clinical prediction tasks. MATERIALS AND METHODS This retrospective cohort study used EHR data and included patients with at least one admission to an inpatient unit. One admission per patient was randomly selected. Adult inpatients were 18 years or older, while pediatric inpatients were older than 28 days and younger than 18 years. Admissions were temporally split into training (January 1, 2008 to December 31, 2019), validation (January 1, 2020 to December 31, 2020), and test (January 1, 2021 to August 1, 2022) sets. The primary comparison was a self-supervised model trained in adult inpatients versus count-based logistic regression models trained in pediatric inpatients. The primary outcome was mean area under the receiver operating characteristic curve (AUROC) across 11 distinct clinical outcomes. Models were evaluated in pediatric inpatients. RESULTS When evaluated in pediatric inpatients, the mean AUROC of the self-supervised model trained in adult inpatients (0.902) was noninferior to that of count-based logistic regression models trained in pediatric inpatients (0.868) (mean difference = 0.034; 95% CI, 0.014-0.057; P < .001 for noninferiority and P = .006 for superiority). CONCLUSIONS Self-supervised learning in adult inpatients was noninferior to logistic regression models trained in pediatric inpatients. This finding suggests transferability of self-supervised models trained in adult patients to pediatric patients, without requiring costly model retraining.
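A bootstrap noninferiority comparison of paired AUROCs like the one reported above can be sketched as follows. The labels, score distributions and the 0.05 margin are invented stand-ins, not the study's data or its prespecified margin:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-in for a pediatric test set scored by two models.
n = 2000
y = rng.binomial(1, 0.2, size=n)
s_adult_ssl = y * 0.8 + rng.normal(0, 0.3, n)  # adult-pretrained model scores
s_ped_lr = y * 0.5 + rng.normal(0, 0.3, n)     # pediatric count-based LR scores

def auroc(y, s):
    """Rank-based AUROC (normalized Mann-Whitney U statistic)."""
    ranks = np.empty(len(s))
    ranks[np.argsort(s)] = np.arange(1, len(s) + 1)
    n1 = y.sum()
    n0 = len(y) - n1
    return (ranks[y == 1].sum() - n1 * (n1 + 1) / 2) / (n0 * n1)

def noninferior(y, s_new, s_base, margin=0.05, n_boot=1000):
    """Bootstrap the paired AUROC difference (new - baseline); declare
    noninferiority if the lower 2.5% bound exceeds -margin."""
    diffs = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(y), size=len(y))
        if y[idx].sum() in (0, len(y)):  # AUROC needs both classes present
            continue
        diffs.append(auroc(y[idx], s_new[idx]) - auroc(y[idx], s_base[idx]))
    lower = np.quantile(diffs, 0.025)
    return lower > -margin, lower

ok, lower = noninferior(y, s_adult_ssl, s_ped_lr)
print(ok, round(lower, 3))
```

Resampling admissions jointly for both models preserves the pairing, which is what makes the difference-of-AUROC interval meaningful.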
Affiliation(s)
- Joshua Lemmon
- Program in Child Health Evaluative Sciences, The Hospital for Sick Children, Toronto, ON M5G1X8, Canada
- Lin Lawrence Guo
- Program in Child Health Evaluative Sciences, The Hospital for Sick Children, Toronto, ON M5G1X8, Canada
- Ethan Steinberg
- Stanford Center for Biomedical Informatics Research, Stanford University, Palo Alto, CA 94305, United States
- Keith E Morse
- Division of Pediatric Hospital Medicine, Department of Pediatrics, Stanford University, Palo Alto, CA 94304, United States
- Scott Lanyon Fleming
- Stanford Center for Biomedical Informatics Research, Stanford University, Palo Alto, CA 94305, United States
- Catherine Aftandilian
- Division of Hematology/Oncology, Department of Pediatrics, Stanford University, Palo Alto, CA 94304, United States
- Jose D Posada
- Universidad del Norte, Barranquilla 081007, Colombia
- Nigam Shah
- Stanford Center for Biomedical Informatics Research, Stanford University, Palo Alto, CA 94305, United States
- Jason Fries
- Stanford Center for Biomedical Informatics Research, Stanford University, Palo Alto, CA 94305, United States
- Lillian Sung
- Program in Child Health Evaluative Sciences, The Hospital for Sick Children, Toronto, ON M5G1X8, Canada
- Division of Haematology/Oncology, The Hospital for Sick Children, Toronto, ON M5G1X8, Canada
3
Arora A, Alderman JE, Palmer J, Ganapathi S, Laws E, McCradden MD, Oakden-Rayner L, Pfohl SR, Ghassemi M, McKay F, Treanor D, Rostamzadeh N, Mateen B, Gath J, Adebajo AO, Kuku S, Matin R, Heller K, Sapey E, Sebire NJ, Cole-Lewis H, Calvert M, Denniston A, Liu X. The value of standards for health datasets in artificial intelligence-based applications. Nat Med 2023;29:2929-2938. [PMID: 37884627; PMCID: PMC10667100; DOI: 10.1038/s41591-023-02608-w]
Abstract
Artificial intelligence as a medical device is increasingly being applied to healthcare for diagnosis, risk stratification and resource allocation. However, a growing body of evidence has highlighted the risk of algorithmic bias, which may perpetuate existing health inequity. This problem arises in part because of systemic inequalities in dataset curation, unequal opportunity to participate in research and inequalities of access. This study aims to explore existing standards, frameworks and best practices for ensuring adequate data diversity in health datasets. Exploring the body of existing literature and expert views is an important step towards the development of consensus-based guidelines. The study comprises two parts: a systematic review of existing standards, frameworks and best practices for healthcare datasets; and a survey and thematic analysis of stakeholder views of bias, health equity and best practices for artificial intelligence as a medical device. We found that the need for dataset diversity was well described in the literature, and experts generally favored the development of a robust set of guidelines, but there were mixed views about how these could be implemented practically. The outputs of this study will be used to inform the development of standards for transparency of data diversity in health datasets (the STANDING Together initiative).
Affiliation(s)
- Anmol Arora
- School of Clinical Medicine, University of Cambridge, Cambridge, UK
- Joseph E Alderman
- Institute of Inflammation and Ageing, College of Medical and Dental Sciences, University of Birmingham, Birmingham, UK
- University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK
- National Institute for Health and Care Research Birmingham Biomedical Research Centre, University of Birmingham, Birmingham, UK
- Joanne Palmer
- University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK
- National Institute for Health and Care Research Birmingham Biomedical Research Centre, University of Birmingham, Birmingham, UK
- Elinor Laws
- Institute of Inflammation and Ageing, College of Medical and Dental Sciences, University of Birmingham, Birmingham, UK
- University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK
- National Institute for Health and Care Research Birmingham Biomedical Research Centre, University of Birmingham, Birmingham, UK
- Melissa D McCradden
- Department of Bioethics, The Hospital for Sick Children, Toronto, Ontario, Canada
- Genetics and Genome Biology, Peter Gilgan Centre for Research and Learning, Toronto, Ontario, Canada
- Dalla Lana School of Public Health, Toronto, Ontario, Canada
- Lauren Oakden-Rayner
- The Australian Institute for Machine Learning, University of Adelaide, Adelaide, South Australia, Australia
- Marzyeh Ghassemi
- Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA, USA
- Institute for Medical Engineering & Science, Massachusetts Institute of Technology, Cambridge, MA, USA
- Vector Institute, Toronto, Ontario, Canada
- Francis McKay
- The Ethox Centre and the Wellcome Centre for Ethics and Humanities, Nuffield Department of Population Health, University of Oxford, Oxford, UK
- Darren Treanor
- Leeds Teaching Hospitals NHS Trust, Leeds, UK
- University of Leeds, Leeds, UK
- Department of Clinical Pathology and Department of Clinical and Experimental Medicine, Linköping University, Linköping, Sweden
- Center for Medical Image Science and Visualization, Linköping University, Linköping, Sweden
- Bilal Mateen
- Institute for Health Informatics, University College London, London, UK
- Wellcome Trust, London, UK
- Jacqui Gath
- Patient and Public Involvement and Engagement (PPIE) Group, STANDING Together, Birmingham, UK
- Adewole O Adebajo
- Patient and Public Involvement and Engagement (PPIE) Group, STANDING Together, Birmingham, UK
- Rubeta Matin
- Oxford University Hospitals NHS Foundation Trust, Oxford, UK
- Elizabeth Sapey
- Institute of Inflammation and Ageing, College of Medical and Dental Sciences, University of Birmingham, Birmingham, UK
- University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK
- National Institute for Health and Care Research Birmingham Biomedical Research Centre, University of Birmingham, Birmingham, UK
- PIONEER, HDR UK Hub in Acute Care, Institute of Inflammation and Ageing, University of Birmingham, Birmingham, UK
- Neil J Sebire
- National Institute for Health and Care Research, Great Ormond Street Hospital Biomedical Research Centre, London, UK
- Great Ormond Street Institute of Child Health, University Hospital London, London, UK
- Melanie Calvert
- National Institute for Health and Care Research Birmingham Biomedical Research Centre, University of Birmingham, Birmingham, UK
- Birmingham Health Partners Centre for Regulatory Science and Innovation, University of Birmingham, Birmingham, UK
- Centre for Patient Reported Outcomes Research, Institute of Applied Health Research, College of Medical and Dental Sciences, University of Birmingham, Birmingham, UK
- National Institute for Health and Care Research Applied Research Collaboration West Midlands, University of Birmingham, Birmingham, UK
- National Institute for Health and Care Research Birmingham-Oxford Blood and Transplant Research Unit in Precision Transplant and Cellular Therapeutics, University of Birmingham, Birmingham, UK
- DEMAND Hub, University of Birmingham, Birmingham, UK
- UK SPINE, University of Birmingham, Birmingham, UK
- Alastair Denniston
- Institute of Inflammation and Ageing, College of Medical and Dental Sciences, University of Birmingham, Birmingham, UK
- University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK
- National Institute for Health and Care Research Birmingham Biomedical Research Centre, University of Birmingham, Birmingham, UK
- Birmingham Health Partners Centre for Regulatory Science and Innovation, University of Birmingham, Birmingham, UK
- National Institute for Health and Care Research Biomedical Research Centre, Moorfields Eye Hospital/University College London, London, UK
- Xiaoxuan Liu
- Institute of Inflammation and Ageing, College of Medical and Dental Sciences, University of Birmingham, Birmingham, UK
- University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK
- National Institute for Health and Care Research Birmingham Biomedical Research Centre, University of Birmingham, Birmingham, UK
4
Guo LL, Steinberg E, Fleming SL, Posada J, Lemmon J, Pfohl SR, Shah N, Fries J, Sung L. EHR foundation models improve robustness in the presence of temporal distribution shift. Sci Rep 2023; 13:3767. [PMID: 36882576; PMCID: PMC9992466; DOI: 10.1038/s41598-023-30820-8]
Abstract
Temporal distribution shift negatively impacts the performance of clinical prediction models over time. Pretraining foundation models using self-supervised learning on electronic health records (EHR) may be effective in acquiring informative global patterns that can improve the robustness of task-specific models. The objective was to evaluate the utility of EHR foundation models in improving the in-distribution (ID) and out-of-distribution (OOD) performance of clinical prediction models. Transformer- and gated recurrent unit-based foundation models were pretrained on the EHR of up to 1.8 million patients (382 million coded events) collected within predetermined year groups (e.g., 2009-2012) and were subsequently used to construct patient representations for patients admitted to inpatient units. These representations were used to train logistic regression models to predict hospital mortality, long length of stay, 30-day readmission, and ICU admission. We compared our EHR foundation models with baseline logistic regression models learned on count-based representations (count-LR) in ID and OOD year groups. Performance was measured using area under the receiver operating characteristic curve (AUROC), area under the precision-recall curve, and absolute calibration error. Both transformer- and recurrent unit-based foundation models generally showed better ID and OOD discrimination relative to count-LR and often exhibited less decay in tasks where there was observable degradation of discrimination performance (average AUROC decay of 3% for the transformer-based foundation model vs. 7% for count-LR after 5-9 years). In addition, the performance and robustness of transformer-based foundation models continued to improve as pretraining set size increased. These results suggest that pretraining EHR foundation models at scale is a useful approach for developing clinical prediction models that perform well in the presence of temporal distribution shift.
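The count-based representations used by the count-LR baseline above can be illustrated with a minimal sketch; the clinical codes and patient timelines below are invented:

```python
import numpy as np
from collections import Counter

# Hypothetical coded-event timelines per patient. A count-based
# representation simply tallies how often each code occurs, producing the
# feature matrix consumed by the logistic regression baseline.
patients = [
    ["ICD:E11", "RX:metformin", "ICD:E11"],
    ["ICD:I10", "RX:lisinopril"],
    ["ICD:E11", "ICD:I10", "RX:metformin"],
]
vocab = sorted({code for timeline in patients for code in timeline})
col = {code: j for j, code in enumerate(vocab)}

X = np.zeros((len(patients), len(vocab)))
for i, timeline in enumerate(patients):
    for code, count in Counter(timeline).items():
        X[i, col[code]] = count

print(vocab)
print(X)
```

A foundation-model pipeline would instead pretrain a sequence encoder on such timelines and feed its learned patient representations, rather than raw counts, to the same logistic regression head.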
Affiliation(s)
- Lin Lawrence Guo
- Program in Child Health Evaluative Sciences, The Hospital for Sick Children, Toronto, ON, Canada
- Ethan Steinberg
- Stanford Center for Biomedical Informatics Research, Stanford University, Palo Alto, CA, USA
- Scott Lanyon Fleming
- Stanford Center for Biomedical Informatics Research, Stanford University, Palo Alto, CA, USA
- Jose Posada
- Universidad del Norte, Barranquilla, Colombia
- Joshua Lemmon
- Program in Child Health Evaluative Sciences, The Hospital for Sick Children, Toronto, ON, Canada
- Stephen R Pfohl
- Stanford Center for Biomedical Informatics Research, Stanford University, Palo Alto, CA, USA
- Nigam Shah
- Stanford Center for Biomedical Informatics Research, Stanford University, Palo Alto, CA, USA
- Jason Fries
- Stanford Center for Biomedical Informatics Research, Stanford University, Palo Alto, CA, USA
- Lillian Sung
- Program in Child Health Evaluative Sciences, The Hospital for Sick Children, Toronto, ON, Canada
- Division of Haematology/Oncology, The Hospital for Sick Children, 555 University Avenue, Toronto, ON, M5G1X8, Canada
5
Lemmon J, Guo LL, Posada J, Pfohl SR, Fries J, Fleming SL, Aftandilian C, Shah N, Sung L. Evaluation of Feature Selection Methods for Preserving Machine Learning Performance in the Presence of Temporal Dataset Shift in Clinical Medicine. Methods Inf Med 2023;62:60-70. [PMID: 36812932; DOI: 10.1055/s-0043-1762904]
Abstract
BACKGROUND Temporal dataset shift can cause degradation in model performance as discrepancies between training and deployment data grow over time. The primary objective was to determine whether parsimonious models produced by specific feature selection methods are more robust to temporal dataset shift as measured by out-of-distribution (OOD) performance, while maintaining in-distribution (ID) performance. METHODS Our dataset consisted of intensive care unit patients from MIMIC-IV categorized by year groups (2008-2010, 2011-2013, 2014-2016, and 2017-2019). We trained baseline models using L2-regularized logistic regression on 2008-2010 to predict in-hospital mortality, long length of stay (LOS), sepsis, and invasive ventilation in all year groups. We evaluated three feature selection methods: L1-regularized logistic regression (L1), Remove and Retrain (ROAR), and causal feature selection. We assessed whether a feature selection method could maintain ID performance (2008-2010) and improve OOD performance (2017-2019). We also assessed whether parsimonious models retrained on OOD data performed as well as oracle models trained on all features in the OOD year group. RESULTS The baseline model showed significantly worse OOD performance with the long LOS and sepsis tasks when compared with the ID performance. L1 and ROAR retained 3.7 to 12.6% of all features, whereas causal feature selection generally retained fewer features. Models produced by L1 and ROAR exhibited similar ID and OOD performance as the baseline models. The retraining of these models on 2017-2019 data using features selected from training on 2008-2010 data generally reached parity with oracle models trained directly on 2017-2019 data using all available features. Causal feature selection led to heterogeneous results with the superset maintaining ID performance while improving OOD calibration only on the long LOS task. 
CONCLUSIONS While model retraining can mitigate the impact of temporal dataset shift on parsimonious models produced by L1 and ROAR, new methods are required to proactively improve temporal robustness.
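The L1-based selection-then-retrain workflow evaluated above can be sketched on synthetic data. The feature count, regularization strength and signal structure are invented, and the ROAR and causal feature selection methods are not shown:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic stand-in for an early training window: 20 features, of which
# only the first three actually drive the binary outcome.
X = rng.normal(size=(1000, 20))
logit = X[:, 0] - 1.5 * X[:, 1] + 0.8 * X[:, 2]
y = rng.binomial(1, 1 / (1 + np.exp(-logit)))

# Step 1: an L1-regularized fit acts as the feature selector; features with
# zero coefficients are dropped.
l1 = LogisticRegression(penalty="l1", solver="liblinear", C=0.1).fit(X, y)
selected = np.flatnonzero(l1.coef_[0])

# Step 2: retrain a parsimonious L2 model on the selected features only.
# The same columns would then be reused when retraining on a later window.
l2 = LogisticRegression(penalty="l2", C=1.0).fit(X[:, selected], y)
print("kept", len(selected), "of", X.shape[1], "features:", selected)
```

Freezing the selected columns is what makes the later-window retraining comparable to an oracle model refit on all features.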
Affiliation(s)
- Joshua Lemmon
- Program in Child Health Evaluative Sciences, The Hospital for Sick Children, Toronto, Ontario, Canada
- Lin Lawrence Guo
- Program in Child Health Evaluative Sciences, The Hospital for Sick Children, Toronto, Ontario, Canada
- Jose Posada
- Biomedical Informatics Research, Stanford University, Palo Alto, California, United States
- Department of Systems Engineering, Universidad del Norte, Barranquilla, Atlantico, Colombia
- Stephen R Pfohl
- Biomedical Informatics Research, Stanford University, Palo Alto, California, United States
- Jason Fries
- Biomedical Informatics Research, Stanford University, Palo Alto, California, United States
- Scott Lanyon Fleming
- Biomedical Informatics Research, Stanford University, Palo Alto, California, United States
- Catherine Aftandilian
- Division of Pediatric Hematology/Oncology, Stanford University, Palo Alto, California, United States
- Nigam Shah
- Biomedical Informatics Research, Stanford University, Palo Alto, California, United States
- Lillian Sung
- Program in Child Health Evaluative Sciences, The Hospital for Sick Children, Toronto, Ontario, Canada
- Division of Haematology/Oncology, The Hospital for Sick Children, Toronto, Ontario, Canada
6
Ganapathi S, Palmer J, Alderman JE, Calvert M, Espinoza C, Gath J, Ghassemi M, Heller K, Mckay F, Karthikesalingam A, Kuku S, Mackintosh M, Manohar S, Mateen BA, Matin R, McCradden M, Oakden-Rayner L, Ordish J, Pearson R, Pfohl SR, Rostamzadeh N, Sapey E, Sebire N, Sounderajah V, Summers C, Treanor D, Denniston AK, Liu X. Tackling bias in AI health datasets through the STANDING Together initiative. Nat Med 2022;28:2232-2233. [PMID: 36163296; DOI: 10.1038/s41591-022-01987-w]
Affiliation(s)
- Shaswath Ganapathi
- College of Medical and Dental Sciences, University of Birmingham, Birmingham, UK
- Jo Palmer
- University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK
- Joseph E Alderman
- University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK
- Institute of Inflammation and Ageing, College of Medical and Dental Sciences, University of Birmingham, Birmingham, UK
- Melanie Calvert
- Birmingham Health Partners Centre for Regulatory Science and Innovation, University of Birmingham, Birmingham, UK
- Centre for Patient Reported Outcome Research, Institute of Applied Health Research, University of Birmingham, Birmingham, UK
- NIHR Birmingham Biomedical Research Centre, University of Birmingham, Birmingham, UK
- NIHR Surgical Reconstruction and Microbiology Research Centre, University of Birmingham, Birmingham, UK
- NIHR Applied Research Collaborative West Midlands, University of Birmingham, Birmingham, UK
- Jacqui Gath
- Patient Partner, Birmingham, UK
- Patient Partner, Sheffield, UK
- Marzyeh Ghassemi
- Department of Electrical Engineering and Computer Science; Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, MA, USA
- Francis Mckay
- The Ethox Centre and the Wellcome Centre for Ethics and Humanities, Nuffield Department of Population Health, University of Oxford, Oxford, UK
- Stephanie Kuku
- Institute of Women's Health, University College London, London, UK
- Hardian Health, London, UK
- Bilal A Mateen
- Institute of Health Informatics, University College London, London, UK
- The Wellcome Trust, London, UK
- Rubeta Matin
- Oxford University Hospitals NHS Foundation Trust, Oxford, UK
- Melissa McCradden
- Department of Bioethics, Hospital for Sick Children, Toronto, Ontario, Canada
- Dalla Lana School of Public Health, University of Toronto, Toronto, Ontario, Canada
- Lauren Oakden-Rayner
- Australian Institute for Machine Learning, University of Adelaide, Adelaide, South Australia, Australia
- Johan Ordish
- Medicines and Healthcare Products Regulatory Agency, London, UK
- Russell Pearson
- Medicines and Healthcare Products Regulatory Agency, London, UK
- Elizabeth Sapey
- University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK
- Institute of Inflammation and Ageing, College of Medical and Dental Sciences, University of Birmingham, Birmingham, UK
- Neil Sebire
- Health Data Research, London, UK
- Great Ormond Street Hospital for Children, London, UK
- Viknesh Sounderajah
- Institute of Global Health Innovation, Imperial College London, London, UK
- Department of Surgery and Cancer, Imperial College London, London, UK
- Charlotte Summers
- Wolfson Lung Injury Unit, Heart and Lung Research Institute, University of Cambridge, Cambridge, UK
- Cambridge University Hospitals NHS Foundation Trust, Cambridge, UK
- Darren Treanor
- Leeds Teaching Hospitals NHS Trust, Leeds, UK
- University of Leeds, Leeds, UK
- Department of Clinical Pathology, and Department of Clinical and Experimental Medicine, Linköping University, Linköping, Sweden
- Center for Medical Image Science and Visualization (CMIV), Linköping University, Linköping, Sweden
- Alastair K Denniston
- University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK
- Institute of Inflammation and Ageing, College of Medical and Dental Sciences, University of Birmingham, Birmingham, UK
- Birmingham Health Partners Centre for Regulatory Science and Innovation, University of Birmingham, Birmingham, UK
- NIHR Birmingham Biomedical Research Centre, University of Birmingham, Birmingham, UK
- Health Data Research, London, UK
- Xiaoxuan Liu
- University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK
- Institute of Inflammation and Ageing, College of Medical and Dental Sciences, University of Birmingham, Birmingham, UK
- Birmingham Health Partners Centre for Regulatory Science and Innovation, University of Birmingham, Birmingham, UK
7
Foryciarz A, Pfohl SR, Patel B, Shah N. Evaluating algorithmic fairness in the presence of clinical guidelines: the case of atherosclerotic cardiovascular disease risk estimation. BMJ Health Care Inform 2022;29:bmjhci-2021-100460. [PMID: 35396247; PMCID: PMC8996004; DOI: 10.1136/bmjhci-2021-100460]
Abstract
Objectives The American College of Cardiology and the American Heart Association guidelines on primary prevention of atherosclerotic cardiovascular disease (ASCVD) recommend using 10-year ASCVD risk estimation models to initiate statin treatment. For guideline-concordant decision-making, risk estimates need to be calibrated. However, existing models are often miscalibrated for race, ethnicity and sex based subgroups. This study evaluates two algorithmic fairness approaches to adjust the risk estimators (group recalibration and equalised odds) for their compatibility with the assumptions underpinning the guidelines' decision rules. Methods Using an updated pooled cohorts data set, we derive unconstrained, group-recalibrated and equalised odds-constrained versions of the 10-year ASCVD risk estimators, and compare their calibration at guideline-concordant decision thresholds. Results We find that, compared with the unconstrained model, group recalibration improves calibration at one of the relevant thresholds for each group, but exacerbates differences in false positive and false negative rates between groups. An equalised odds constraint, meant to equalise error rates across groups, does so by miscalibrating the model overall and at relevant decision thresholds. Discussion Hence, because of induced miscalibration, decisions guided by risk estimators learned with an equalised odds fairness constraint are not concordant with existing guidelines. Conversely, recalibrating the model separately for each group can increase guideline compatibility, while increasing intergroup differences in error rates. As such, comparisons of error rates across groups can be misleading when guidelines recommend treating at fixed decision thresholds. Conclusion The illustrated tradeoffs between satisfying a fairness criterion and retaining guideline compatibility underscore the need to evaluate models in the context of downstream interventions.
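Group recalibration, one of the two approaches evaluated above, can be sketched with synthetic data. The groups, bias magnitudes and sample sizes are invented, the paper's equalised-odds constraint is not shown, and isotonic regression is just one common choice of recalibration map:

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

rng = np.random.default_rng(0)

# Synthetic 10-year-risk estimates that are miscalibrated in opposite
# directions in two hypothetical subgroups.
def make_group(n, bias):
    p_true = rng.uniform(0.01, 0.30, size=n)
    y = rng.binomial(1, p_true)              # observed events
    score = np.clip(p_true + bias, 0.0, 1.0)  # systematically biased estimate
    return score, y

for group, bias in [("group_1", +0.05), ("group_2", -0.03)]:
    score, y = make_group(5000, bias)
    # Group recalibration: fit a monotone map from score to observed event
    # rate separately within each group, leaving the risk ranking unchanged.
    iso = IsotonicRegression(out_of_bounds="clip").fit(score, y)
    recal = iso.predict(score)
    print(group, f"raw {score.mean():.3f}  recalibrated {recal.mean():.3f}  "
                 f"observed {y.mean():.3f}")
```

Recalibrating within each group aligns predicted and observed risk at fixed decision thresholds, but, as the abstract notes, it can simultaneously widen between-group differences in error rates at those thresholds.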
Affiliation(s)
- Agata Foryciarz
- Department of Computer Science, Stanford University School of Engineering, Stanford, California, USA
- Stanford Center for Biomedical Informatics Research, Stanford University School of Medicine, Stanford, California, USA
- Stephen R Pfohl
- Stanford Center for Biomedical Informatics Research, Stanford University School of Medicine, Stanford, California, USA
- Birju Patel
- Stanford Center for Biomedical Informatics Research, Stanford University School of Medicine, Stanford, California, USA
- Nigam Shah
- Stanford Center for Biomedical Informatics Research, Stanford University School of Medicine, Stanford, California, USA
8
Luo C, Islam MN, Sheils NE, Buresh J, Reps J, Schuemie MJ, Ryan PB, Edmondson M, Duan R, Tong J, Marks-Anglin A, Bian J, Chen Z, Duarte-Salles T, Fernández-Bertolín S, Falconer T, Kim C, Park RW, Pfohl SR, Shah NH, Williams AE, Xu H, Zhou Y, Lautenbach E, Doshi JA, Werner RM, Asch DA, Chen Y. DLMM as a lossless one-shot algorithm for collaborative multi-site distributed linear mixed models. Nat Commun 2022;13:1678. [PMID: 35354802; PMCID: PMC8967932; DOI: 10.1038/s41467-022-29160-4]
Abstract
Linear mixed models are commonly used in healthcare-based association analyses for analyzing multi-site data with heterogeneous site-specific random effects. Due to regulations for protecting patients’ privacy, sensitive individual patient data (IPD) typically cannot be shared across sites. We propose an algorithm for fitting distributed linear mixed models (DLMMs) without sharing IPD across sites. This algorithm achieves results identical to those achieved using pooled IPD from multiple sites (i.e., the same effect size and standard error estimates), hence demonstrating the lossless property. The algorithm requires each site to contribute minimal aggregated data in only one round of communication. We demonstrate the lossless property of the proposed DLMM algorithm by investigating the associations between demographic and clinical characteristics and length of hospital stay in COVID-19 patients using administrative claims from the UnitedHealth Group Clinical Discovery Database. We extend this association study by incorporating 120,609 COVID-19 patients from 11 collaborative data sources worldwide. In summary, this work presents a lossless, one-shot, and privacy-preserving distributed algorithm for fitting linear mixed models on multi-site data, applied to a study of 120,609 COVID-19 patients using only minimal aggregated data from each of 14 sites.
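For intuition about the lossless one-shot property, here is a minimal sketch for the simpler fixed-effects case (ordinary least squares, not the authors' DLMM, which additionally handles site-specific random effects): each site shares only the aggregates X'X and X'y, and summing them reproduces the pooled-data solution exactly. The function names and toy data are illustrative assumptions.

```python
def site_summaries(X, y):
    # each site shares only aggregates: X'X and X'y, never patient-level rows
    p = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    return xtx, xty

def gauss_solve(A, b):
    # solve A x = b by Gaussian elimination with partial pivoting
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def one_shot_fit(summaries):
    # one round of communication: sum the site aggregates, then solve the
    # normal equations (sum X'X) beta = (sum X'y)
    p = len(summaries[0][1])
    XtX = [[sum(s[0][i][j] for s in summaries) for j in range(p)] for i in range(p)]
    Xty = [sum(s[1][i] for s in summaries) for i in range(p)]
    return gauss_solve(XtX, Xty)

# two "sites" generated from y = 2 + 3x (intercept column plus one covariate)
site_a = ([[1, 1], [1, 2], [1, 3]], [5, 8, 11])
site_b = ([[1, 4], [1, 5]], [14, 17])
beta = one_shot_fit([site_summaries(*site_a), site_summaries(*site_b)])
```

The distributed estimate equals the pooled-IPD estimate to floating-point precision, which is the sense in which such algorithms are "lossless" rather than approximate.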
Affiliation(s)
- Chongliang Luo
- Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania, Philadelphia, PA, USA; Division of Public Health Sciences, Washington University School of Medicine in St. Louis, St. Louis, MO, USA
- Jenna Reps
- Janssen Research and Development LLC, Titusville, NJ, USA
- Patrick B Ryan
- Janssen Research and Development LLC, Titusville, NJ, USA
- Mackenzie Edmondson
- Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania, Philadelphia, PA, USA
- Rui Duan
- Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania, Philadelphia, PA, USA; Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Jiayi Tong
- Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania, Philadelphia, PA, USA
- Arielle Marks-Anglin
- Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania, Philadelphia, PA, USA
- Jiang Bian
- Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, Gainesville, FL, USA
- Zhaoyi Chen
- Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, Gainesville, FL, USA
- Talita Duarte-Salles
- Fundacio Institut Universitari per a la recerca a l'Atencio Primaria de Salut Jordi Gol i Gurina (IDIAPJGol), Barcelona, Spain
- Sergio Fernández-Bertolín
- Fundacio Institut Universitari per a la recerca a l'Atencio Primaria de Salut Jordi Gol i Gurina (IDIAPJGol), Barcelona, Spain
- Thomas Falconer
- Department of Biomedical Informatics, Columbia University, New York, NY, USA
- Chungsoo Kim
- Department of Biomedical Sciences, Ajou University Graduate School of Medicine, Suwon, Republic of Korea
- Rae Woong Park
- Department of Biomedical Sciences, Ajou University Graduate School of Medicine, Suwon, Republic of Korea; Department of Biomedical Informatics, Ajou University School of Medicine, Suwon, Republic of Korea
- Stephen R Pfohl
- Stanford Center for Biomedical Informatics Research, Stanford, CA, USA
- Nigam H Shah
- Stanford Center for Biomedical Informatics Research, Stanford, CA, USA
- Andrew E Williams
- Institute for Clinical Research and Health Policy Studies, Tufts University School of Medicine, Boston, MA, USA
- Hua Xu
- School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, USA
- Yujia Zhou
- School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, USA
- Ebbing Lautenbach
- Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania, Philadelphia, PA, USA; Division of Infectious Diseases, Department of Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA; Center for Clinical Epidemiology and Biostatistics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Jalpa A Doshi
- Division of General Internal Medicine, University of Pennsylvania, Philadelphia, PA, USA; Leonard Davis Institute of Health Economics, Philadelphia, PA, USA
- Rachel M Werner
- Division of General Internal Medicine, University of Pennsylvania, Philadelphia, PA, USA; Leonard Davis Institute of Health Economics, Philadelphia, PA, USA; Cpl Michael J Crescenz VA Medical Center, Philadelphia, PA, USA
- David A Asch
- Division of General Internal Medicine, University of Pennsylvania, Philadelphia, PA, USA; Leonard Davis Institute of Health Economics, Philadelphia, PA, USA
- Yong Chen
- Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania, Philadelphia, PA, USA
9
Guo LL, Pfohl SR, Fries J, Johnson AEW, Posada J, Aftandilian C, Shah N, Sung L. Evaluation of domain generalization and adaptation on improving model robustness to temporal dataset shift in clinical medicine. Sci Rep 2022; 12:2726. [PMID: 35177653 PMCID: PMC8854561 DOI: 10.1038/s41598-022-06484-1] [Received: 08/26/2021] [Accepted: 01/31/2022] [Indexed: 11/24/2022]
Abstract
Temporal dataset shift associated with changes in healthcare over time is a barrier to deploying machine learning-based clinical decision support systems. Algorithms that learn robust models by estimating invariant properties across time periods for domain generalization (DG) and unsupervised domain adaptation (UDA) might be suitable to proactively mitigate dataset shift. The objective was to characterize the impact of temporal dataset shift on clinical prediction models and benchmark DG and UDA algorithms on improving model robustness. In this cohort study, intensive care unit patients from the MIMIC-IV database were categorized by year groups (2008–2010, 2011–2013, 2014–2016 and 2017–2019).
Tasks were predicting mortality, long length of stay, sepsis and invasive ventilation. Feedforward neural networks were used as prediction models. The baseline experiment trained models using empirical risk minimization (ERM) on 2008–2010 (ERM[08–10]) and evaluated them on subsequent year groups. The DG experiment trained models using algorithms that estimated invariant properties across 2008–2016 and evaluated them on 2017–2019. The UDA experiment additionally leveraged unlabelled samples from 2017–2019 for unsupervised distribution matching. DG and UDA models were compared to ERM[08–16] models trained on 2008–2016. Main performance measures were the area under the receiver operating characteristic curve (AUROC), the area under the precision-recall curve, and absolute calibration error. Threshold-based metrics, including false positives and false negatives, were used to assess the clinical impact of temporal dataset shift and its mitigation strategies. In the baseline experiments, dataset shift was most evident for sepsis prediction (maximum AUROC drop, 0.090; 95% confidence interval (CI), 0.080–0.101). Considering a scenario of 100 consecutively admitted patients showed that ERM[08–10] applied to 2017–2019 was associated with one additional false negative among 11 patients with sepsis, when compared to the model applied to 2008–2010. When compared with ERM[08–16], DG and UDA experiments failed to produce more robust models (range of AUROC difference, −0.003 to 0.050). In conclusion, DG and UDA failed to produce more robust models compared to ERM in the setting of temporal dataset shift. Alternate approaches are required to preserve model performance over time in clinical medicine.
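The study's headline quantity, an AUROC drop between year groups, can be sketched with the standard rank-based (Mann-Whitney) AUROC estimator; the scores and cohorts below are made-up assumptions, not MIMIC-IV data:

```python
def auroc(scores, labels):
    # Mann-Whitney estimate: probability a positive case outranks a negative,
    # counting ties as half a win
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# hypothetical risk scores from a model trained on the 2008-2010 year group
scores_0810, labels_0810 = [0.9, 0.8, 0.3, 0.2], [1, 1, 0, 0]
# the same model applied to a later cohort, where its ranking has degraded
scores_1719, labels_1719 = [0.9, 0.4, 0.6, 0.2], [1, 1, 0, 0]
drop = auroc(scores_0810, labels_0810) - auroc(scores_1719, labels_1719)
```

A positive `drop` is the discrimination deterioration the baseline ERM[08-10] experiment measures; the DG and UDA comparisons then ask whether that drop shrinks relative to ERM[08-16].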
Affiliation(s)
- Lin Lawrence Guo
- Program in Child Health Evaluative Sciences, The Hospital for Sick Children, Toronto, ON, Canada
- Stephen R Pfohl
- Biomedical Informatics Research, Stanford University, Palo Alto, USA
- Jason Fries
- Biomedical Informatics Research, Stanford University, Palo Alto, USA
- Alistair E W Johnson
- Program in Child Health Evaluative Sciences, The Hospital for Sick Children, Toronto, ON, Canada
- Jose Posada
- Biomedical Informatics Research, Stanford University, Palo Alto, USA
- Nigam Shah
- Biomedical Informatics Research, Stanford University, Palo Alto, USA
- Lillian Sung
- Program in Child Health Evaluative Sciences, The Hospital for Sick Children, Toronto, ON, Canada; Division of Haematology/Oncology, The Hospital for Sick Children, 555 University Avenue, Toronto, ON, M5G1X8, Canada
10
Nestsiarovich A, Reps JM, Matheny ME, DuVall SL, Lynch KE, Beaton M, Jiang X, Spotnitz M, Pfohl SR, Shah NH, Torre CO, Reich CG, Lee DY, Son SJ, You SC, Park RW, Ryan PB, Lambert CG. Predictors of diagnostic transition from major depressive disorder to bipolar disorder: a retrospective observational network study. Transl Psychiatry 2021; 11:642. [PMID: 34930903 PMCID: PMC8688463 DOI: 10.1038/s41398-021-01760-6] [Received: 03/12/2021] [Revised: 11/25/2021] [Accepted: 12/01/2021] [Indexed: 12/02/2022]
Abstract
Many patients with bipolar disorder (BD) are initially misdiagnosed with major depressive disorder (MDD) and are treated with antidepressants, whose potential iatrogenic effects are widely discussed. It is unknown whether MDD is a comorbidity of BD or its earlier stage, and no consensus exists on individual conversion predictors, delaying BD's timely recognition and treatment. We aimed to build a predictive model of MDD-to-BD conversion and to validate it across a multi-national network of patient databases using the standardization afforded by the Observational Medical Outcomes Partnership (OMOP) common data model. Five "training" US databases were retrospectively analyzed: IBM MarketScan CCAE, MDCR, MDCD, Optum EHR, and Optum Claims. Cyclops regularized logistic regression models were developed for one-year MDD-to-BD conversion with all standard covariates from the HADES PatientLevelPrediction package. Time-to-conversion Kaplan-Meier analysis was performed up to a decade after MDD, stratified by model-estimated risk. External validation of the final prediction model was performed across nine patient record databases within the Observational Health Data Sciences and Informatics (OHDSI) network internationally. The model's area under the curve (AUC) ranged from 0.633 to 0.745 (µ = 0.689) across the five US training databases. Nine variables predicted one-year MDD-to-BD transition. Factors that increased risk were: younger age, severe depression, psychosis, anxiety, substance misuse, self-harm thoughts/actions, and prior mental disorder. AUCs of the validation datasets ranged from 0.570 to 0.785 (µ = 0.664). An assessment algorithm for MDD-to-BD conversion was built that distinguishes up to 100-fold risk differences among patients and validates well across multiple international data sources.
Affiliation(s)
- Anastasiya Nestsiarovich
- University of New Mexico Health Sciences Center, Department of Internal Medicine, Center for Global Health, Albuquerque, NM, USA
- Jenna M Reps
- Janssen Research and Development, Raritan, NJ, USA
- Michael E Matheny
- Vanderbilt University, Department of Biomedical Informatics, Department of Medicine, Department of Biostatistics, Nashville, TN, USA
- Tennessee Valley Healthcare System VA, Nashville, TN, USA
- Scott L DuVall
- Veterans Affairs Informatics and Computing Infrastructure, VA Salt Lake City Health Care System, Salt Lake City, UT, USA
- University of Utah, Department of Internal Medicine, Salt Lake City, UT, USA
- Kristine E Lynch
- Veterans Affairs Informatics and Computing Infrastructure, VA Salt Lake City Health Care System, Salt Lake City, UT, USA
- University of Utah, Department of Internal Medicine, Salt Lake City, UT, USA
- Maura Beaton
- Columbia University Irving Medical Center, Department of Biomedical Informatics, New York, NY, USA
- Xinzhuo Jiang
- Columbia University Irving Medical Center, Department of Biomedical Informatics, New York, NY, USA
- Matthew Spotnitz
- Columbia University Irving Medical Center, Department of Biomedical Informatics, New York, NY, USA
- Stephen R Pfohl
- Stanford University, Stanford Center for Biomedical Informatics Research, Stanford, CA, USA
- Nigam H Shah
- Stanford University, Stanford Center for Biomedical Informatics Research, Stanford, CA, USA
- Dong Yun Lee
- Ajou University School of Medicine, Department of Psychiatry, Suwon, Republic of Korea
- Sang Joon Son
- Ajou University School of Medicine, Department of Psychiatry, Suwon, Republic of Korea
- Seng Chan You
- Ajou University School of Medicine, Department of Biomedical Informatics, Suwon, Republic of Korea
- Rae Woong Park
- Ajou University School of Medicine, Department of Biomedical Informatics, Suwon, Republic of Korea
- Patrick B Ryan
- Janssen Research and Development, Raritan, NJ, USA
- Columbia University Irving Medical Center, Department of Biomedical Informatics, New York, NY, USA
- Christophe G Lambert
- University of New Mexico Health Sciences Center, Department of Internal Medicine, Center for Global Health, Albuquerque, NM, USA
- University of New Mexico Health Sciences Center, Department of Internal Medicine, Center for Global Health, Division of Translational Informatics, Albuquerque, NM, USA
11
Patel BS, Steinberg E, Pfohl SR, Shah NH. Learning decision thresholds for risk stratification models from aggregate clinician behavior. J Am Med Inform Assoc 2021; 28:2258-2264. [PMID: 34350942 PMCID: PMC8449610 DOI: 10.1093/jamia/ocab159] [Received: 03/03/2021] [Revised: 06/26/2021] [Accepted: 07/13/2021] [Indexed: 11/22/2022]
Abstract
Using a risk stratification model to guide clinical practice often requires the choice of a cutoff—called the decision threshold—on the model’s output to trigger a subsequent action such as an electronic alert. Choosing this cutoff is not always straightforward. We propose a flexible approach that leverages the collective information in treatment decisions made in real life to learn reference decision thresholds from physician practice. Using the example of prescribing a statin for primary prevention of cardiovascular disease based on 10-year risk calculated by the 2013 pooled cohort equations, we demonstrate the feasibility of using real-world data to learn the implicit decision threshold that reflects existing physician behavior. Learning a decision threshold in this manner allows for evaluation of a proposed operating point against the threshold reflective of the community standard of care. Furthermore, this approach can be used to monitor and audit model-guided clinical decision making following model deployment.
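A heavily simplified sketch of the idea (an assumption for illustration, not necessarily the authors' estimator): given each patient's calculated risk and whether a clinician actually treated them, pick the cutoff that maximises agreement with the observed decisions.

```python
def learn_threshold(risks, treated):
    # predict "treat" whenever risk >= threshold, and choose the threshold
    # that best agrees with what clinicians actually did
    best_t, best_acc = None, -1.0
    for t in sorted(set(risks)):  # candidate cutoffs: the observed risk values
        acc = sum((r >= t) == bool(a) for r, a in zip(risks, treated)) / len(risks)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t, best_acc

# hypothetical 10-year ASCVD risks and observed statin decisions
risks = [0.02, 0.05, 0.06, 0.08, 0.10, 0.20]
treated = [0, 0, 0, 1, 1, 1]
threshold, agreement = learn_threshold(risks, treated)
```

In this toy example the learned implicit threshold is 0.08, close to the guideline's 7.5% cutoff; with real prescribing data, decisions are noisy and the agreement-maximising cutoff summarises the community standard of care rather than reproducing any single clinician.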
Affiliation(s)
- Birju S Patel
- Stanford Center for Biomedical Informatics Research, Stanford University, Stanford, California, USA
- Ethan Steinberg
- Stanford Center for Biomedical Informatics Research, Stanford University, Stanford, California, USA
- Stephen R Pfohl
- Stanford Center for Biomedical Informatics Research, Stanford University, Stanford, California, USA
- Nigam H Shah
- Stanford Center for Biomedical Informatics Research, Stanford University, Stanford, California, USA
- Corresponding Author: Nigam H. Shah, MBBS, PhD, Stanford Center for Biomedical Informatics Research, Stanford University, 1265 Welch Road, Stanford, CA 94305, USA
12
Guo LL, Pfohl SR, Fries J, Posada J, Fleming SL, Aftandilian C, Shah N, Sung L. Systematic Review of Approaches to Preserve Machine Learning Performance in the Presence of Temporal Dataset Shift in Clinical Medicine. Appl Clin Inform 2021; 12:808-815. [PMID: 34470057 PMCID: PMC8410238 DOI: 10.1055/s-0041-1735184] [Received: 04/28/2021] [Accepted: 07/12/2021] [Indexed: 10/20/2022]
Abstract
OBJECTIVE The change in performance of machine learning models over time as a result of temporal dataset shift is a barrier to machine learning-derived models facilitating decision-making in clinical practice. Our aim was to describe technical procedures used to preserve the performance of machine learning models in the presence of temporal dataset shifts. METHODS Studies were included if they were fully published articles that used machine learning and implemented a procedure to mitigate the effects of temporal dataset shift in a clinical setting. We described how dataset shift was measured, the procedures used to preserve model performance, and their effects. RESULTS Of 4,457 potentially relevant publications identified, 15 were included. The impact of temporal dataset shift was primarily quantified using changes, usually deterioration, in calibration or discrimination. Calibration deterioration was more common (n = 11) than discrimination deterioration (n = 3). Mitigation strategies were categorized as model level or feature level. Model-level approaches (n = 15) were more common than feature-level approaches (n = 2), with the most common approaches being model refitting (n = 12), probability calibration (n = 7), model updating (n = 6), and model selection (n = 6). In general, all mitigation strategies were successful at preserving calibration but not uniformly successful in preserving discrimination. CONCLUSION There was limited research on preserving the performance of machine learning models in the presence of temporal dataset shift in clinical medicine. Future research could focus on the impact of dataset shift on clinical decision-making, benchmark the mitigation strategies on a wider range of datasets and tasks, and identify optimal strategies for specific settings.
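Probability calibration, one of the commonly tallied mitigation strategies, can be sketched as Platt-style recalibration: refit a slope and intercept on the logit scale of the old model's outputs using a recent labelled window. This is an assumed minimal sketch (gradient descent on the log loss, synthetic data), not code from any reviewed study.

```python
import math

def expit(z):
    return 1 / (1 + math.exp(-z))

def logit(p):
    return math.log(p / (1 - p))

def platt_update(old_preds, outcomes, lr=1.0, steps=5000):
    # refit p_new = expit(a * logit(p_old) + b) by gradient descent on the
    # log loss over a recent labelled window; a=1, b=0 leaves preds unchanged
    a, b = 1.0, 0.0
    z = [logit(p) for p in old_preds]
    n = len(z)
    for _ in range(steps):
        ga = gb = 0.0
        for zi, yi in zip(z, outcomes):
            err = expit(a * zi + b) - yi
            ga += err * zi / n
            gb += err / n
        a -= lr * ga
        b -= lr * gb
    return a, b

# a drifted model now overpredicts: mean prediction 0.45 vs event rate 0.20
old_preds = [0.6] * 10 + [0.3] * 10
outcomes = [1, 1, 1] + [0] * 7 + [1] + [0] * 9
a, b = platt_update(old_preds, outcomes)
new_preds = [expit(a * logit(p) + b) for p in old_preds]
```

Consistent with the review's findings, an update like this repairs calibration-in-the-large but cannot improve discrimination, since a monotone transform of the scores leaves their ranking unchanged.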
Affiliation(s)
- Lin Lawrence Guo
- Program in Child Health Evaluative Sciences, The Hospital for Sick Children, Toronto, Canada
- Stephen R. Pfohl
- Biomedical Informatics Research, Stanford University, Palo Alto, California, United States
- Jason Fries
- Biomedical Informatics Research, Stanford University, Palo Alto, California, United States
- Jose Posada
- Biomedical Informatics Research, Stanford University, Palo Alto, California, United States
- Scott Lanyon Fleming
- Biomedical Informatics Research, Stanford University, Palo Alto, California, United States
- Catherine Aftandilian
- Division of Pediatric Hematology/Oncology, Stanford University, Palo Alto, United States
- Nigam Shah
- Biomedical Informatics Research, Stanford University, Palo Alto, California, United States
- Lillian Sung
- Program in Child Health Evaluative Sciences, The Hospital for Sick Children, Toronto, Canada
- Division of Haematology/Oncology, The Hospital for Sick Children, Toronto, Canada
13
Steinberg E, Jung K, Fries JA, Corbin CK, Pfohl SR, Shah NH. Language models are an effective representation learning technique for electronic health record data. J Biomed Inform 2021; 113:103637. [PMID: 33290879 PMCID: PMC7863633 DOI: 10.1016/j.jbi.2020.103637] [Received: 03/31/2020] [Revised: 10/10/2020] [Accepted: 11/26/2020] [Indexed: 11/17/2022]
Abstract
Widespread adoption of electronic health records (EHRs) has fueled the use of machine learning to build prediction models for various clinical outcomes. However, this process is often constrained by the relatively small number of patient records available for training a given model. We demonstrate that patient representation schemes inspired by techniques in natural language processing can increase the accuracy of clinical prediction models by transferring information learned from the entire patient population to the task of training a specific model, for which only a subset of the population is relevant. Such patient representation schemes enable a 3.5% mean improvement in AUROC on five prediction tasks compared to standard baselines, with the average improvement rising to 19% when only a small number of patient records are available for training the clinical prediction model.
Affiliation(s)
- Ethan Steinberg
- Stanford University, 450 Serra Mall, Stanford, CA 94305, USA.
- Ken Jung
- Stanford University, 450 Serra Mall, Stanford, CA 94305, USA
- Jason A Fries
- Stanford University, 450 Serra Mall, Stanford, CA 94305, USA
- Conor K Corbin
- Stanford University, 450 Serra Mall, Stanford, CA 94305, USA
- Stephen R Pfohl
- Stanford University, 450 Serra Mall, Stanford, CA 94305, USA
- Nigam H Shah
- Stanford University, 450 Serra Mall, Stanford, CA 94305, USA
14
Pfohl SR, Foryciarz A, Shah NH. An empirical characterization of fair machine learning for clinical risk prediction. J Biomed Inform 2020; 113:103621. [PMID: 33220494 DOI: 10.1016/j.jbi.2020.103621] [Received: 07/21/2020] [Revised: 10/06/2020] [Accepted: 11/05/2020] [Indexed: 11/19/2022]
Abstract
The use of machine learning to guide clinical decision making has the potential to worsen existing health disparities. Several recent works frame the problem as that of algorithmic fairness, a framework that has attracted considerable attention and criticism. However, the appropriateness of this framework is unclear due to both ethical as well as technical considerations, the latter of which include trade-offs between measures of fairness and model performance that are not well-understood for predictive models of clinical outcomes. To inform the ongoing debate, we conduct an empirical study to characterize the impact of penalizing group fairness violations on an array of measures of model performance and group fairness. We repeat the analysis across multiple observational healthcare databases, clinical outcomes, and sensitive attributes. We find that procedures that penalize differences between the distributions of predictions across groups induce nearly-universal degradation of multiple performance metrics within groups. On examining the secondary impact of these procedures, we observe heterogeneity of the effect of these procedures on measures of fairness in calibration and ranking across experimental conditions. Beyond the reported trade-offs, we emphasize that analyses of algorithmic fairness in healthcare lack the contextual grounding and causal awareness necessary to reason about the mechanisms that lead to health disparities, as well as about the potential of algorithmic fairness methods to counteract those mechanisms. In light of these limitations, we encourage researchers building predictive models for clinical use to step outside the algorithmic fairness frame and engage critically with the broader sociotechnical context surrounding the use of machine learning in healthcare.
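To make the penalization procedure concrete, here is a toy sketch (an assumption for illustration; the paper's experiments use real clinical databases and a broader family of penalties): logistic regression trained with an added term lam * (gap in mean predictions between groups)^2, so that increasing lam shrinks the between-group prediction gap, typically at the cost of within-group fit.

```python
import math

def expit(z):
    return 1 / (1 + math.exp(-z))

def fit(xs, ys, groups, lam, lr=0.05, steps=8000):
    # gradient descent on log loss + lam * (mean_pred_g0 - mean_pred_g1)^2
    w = b = 0.0
    n = len(xs)
    idx0 = [i for i, g in enumerate(groups) if g == 0]
    idx1 = [i for i, g in enumerate(groups) if g == 1]
    for _ in range(steps):
        p = [expit(w * x + b) for x in xs]
        gw = sum((p[i] - ys[i]) * xs[i] for i in range(n)) / n
        gb = sum(p[i] - ys[i] for i in range(n)) / n
        gap = sum(p[i] for i in idx0) / len(idx0) - sum(p[i] for i in idx1) / len(idx1)
        # chain rule through the sigmoid: d mean_pred_g / dw = mean_g p(1-p)x
        d0w = sum(p[i] * (1 - p[i]) * xs[i] for i in idx0) / len(idx0)
        d1w = sum(p[i] * (1 - p[i]) * xs[i] for i in idx1) / len(idx1)
        d0b = sum(p[i] * (1 - p[i]) for i in idx0) / len(idx0)
        d1b = sum(p[i] * (1 - p[i]) for i in idx1) / len(idx1)
        gw += 2 * lam * gap * (d0w - d1w)
        gb += 2 * lam * gap * (d0b - d1b)
        w -= lr * gw
        b -= lr * gb
    return w, b

def mean_gap(w, b, xs, groups):
    p = [expit(w * x + b) for x in xs]
    m0 = sum(pi for pi, g in zip(p, groups) if g == 0) / groups.count(0)
    m1 = sum(pi for pi, g in zip(p, groups) if g == 1) / groups.count(1)
    return abs(m0 - m1)
```

The mechanism behind the abstract's finding is visible even here: the penalty closes the gap mainly by flattening the model, which degrades performance within both groups rather than improving either.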
Affiliation(s)
- Stephen R Pfohl
- Stanford Center for Biomedical Informatics Research, Stanford University, 1265 Welch Road, Stanford, CA 94305, United States of America.
- Agata Foryciarz
- Stanford Center for Biomedical Informatics Research, Stanford University, 1265 Welch Road, Stanford, CA 94305, United States of America; Computer Science Department, Stanford University, 353 Jane Stanford Way, Stanford, CA 94305, United States of America
- Nigam H Shah
- Stanford Center for Biomedical Informatics Research, Stanford University, 1265 Welch Road, Stanford, CA 94305, United States of America
15
Wang Q, Reps JM, Kostka KF, Ryan PB, Zou Y, Voss EA, Rijnbeek PR, Chen R, Rao GA, Morgan Stewart H, Williams AE, Williams RD, Van Zandt M, Falconer T, Fernandez-Chas M, Vashisht R, Pfohl SR, Shah NH, Kasthurirathne SN, You SC, Jiang Q, Reich C, Zhou Y. Development and validation of a prognostic model predicting symptomatic hemorrhagic transformation in acute ischemic stroke at scale in the OHDSI network. PLoS One 2020; 15:e0226718. [PMID: 31910437 PMCID: PMC6946584 DOI: 10.1371/journal.pone.0226718] [Received: 09/04/2019] [Accepted: 12/02/2019] [Indexed: 12/26/2022]
Abstract
BACKGROUND AND PURPOSE Hemorrhagic transformation (HT) after cerebral infarction is a complex and multifactorial phenomenon in the acute stage of ischemic stroke, and often results in a poor prognosis. Thus, identifying risk factors and making an early prediction of HT in acute cerebral infarction contributes not only to the selection of a therapeutic regimen but also, more importantly, to the improvement of the prognosis of acute cerebral infarction. The purpose of this study was to develop and validate a model to predict a patient's risk of HT within 30 days of initial ischemic stroke. METHODS We utilized a retrospective multicenter observational cohort study design to develop a Lasso logistic regression prediction model on a large US electronic health record dataset structured according to the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM). To examine clinical transportability, the model was externally validated across 10 additional real-world healthcare datasets, including EHR records for patients from America, Europe and Asia. RESULTS In the database on which the model was developed, the target population cohort contained 621,178 patients with ischemic stroke, of whom 5,624 had HT within 30 days following initial ischemic stroke. 612 risk predictors, including the distance a patient travels in an ambulance to get to care for a HT, were identified. An area under the receiver operating characteristic curve (AUC) of 0.75 was achieved in the internal validation of the risk model. External validation was performed across 10 databases totaling 5,515,508 patients with ischemic stroke, of whom 86,401 had HT within 30 days following initial ischemic stroke. The mean external AUC was 0.71 and ranged between 0.60 and 0.78. CONCLUSIONS An HT prognostic prediction model was developed with Lasso logistic regression based on routinely collected EMR data. This model can identify patients who have a higher risk of HT than the population average with an AUC of 0.78. It shows that the OMOP CDM is an appropriate data standard for EMR secondary use in clinical multicenter research for prognostic prediction model development and validation. In the future, combining this model with clinical information systems will assist clinicians in making the right therapy decisions for patients with acute ischemic stroke.
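The Lasso logistic regression used here can be sketched with proximal gradient descent (ISTA) and soft-thresholding, the mechanism that performs variable selection among many candidate predictors. The three-feature toy data, learning rate, and penalty strength below are assumptions for illustration, not the study's 612-predictor model.

```python
import math

def expit(z):
    return 1 / (1 + math.exp(-z))

def soft_threshold(v, t):
    # proximal operator of the L1 penalty: shrink toward zero, clip at zero
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0

def lasso_logistic(X, y, lam, lr=0.1, steps=4000):
    # ISTA: take a full log-loss gradient step, then soft-threshold each weight
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0  # intercept left unpenalised
    for _ in range(steps):
        preds = [expit(sum(wj * xj for wj, xj in zip(w, row)) + b) for row in X]
        grad = [sum((preds[i] - y[i]) * X[i][j] for i in range(n)) / n for j in range(p)]
        w = [soft_threshold(wj - lr * gj, lr * lam) for wj, gj in zip(w, grad)]
        b -= lr * sum(preds[i] - y[i] for i in range(n)) / n
    return w, b

# feature 0 tracks the outcome; features 1 and 2 are noise the penalty should drop
X = [[-1, 1, -1], [-1, -1, 1], [-1, 1, -1], [-1, -1, 1],
     [1, 1, -1], [1, -1, 1], [1, 1, -1], [1, -1, 1]]
y = [0, 0, 0, 0, 1, 1, 1, 1]
w, b = lasso_logistic(X, y, lam=0.2)
```

The soft-threshold sets uninformative coefficients exactly to zero, which is how a Lasso model whittles hundreds of candidate predictors down to a sparse risk score.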
Affiliation(s)
- Qiong Wang
- Biomedical Engineering School, Sun Yat-Sen University, Guangzhou, China
- The Third Affiliated Hospital of Guangzhou Medical University, Guangzhou, China
- Observational Health Data Sciences and Informatics, New York, New York, United States of America
| | - Jenna M. Reps
- Observational Health Data Sciences and Informatics, New York, New York, United States of America
- Janssen Research and Development, Raritan, New Jersey, United States of America
| | - Kristin Feeney Kostka
- Observational Health Data Sciences and Informatics, New York, New York, United States of America
- IQVIA, Durham, North Carolina, United States of America
| | - Patrick B. Ryan
- Observational Health Data Sciences and Informatics, New York, New York, United States of America
- Janssen Research and Development, Raritan, New Jersey, United States of America
- Department of Biomedical Informatics, Columbia University, New York, New York, United States of America
| | - Yuhui Zou
- Department of Neurosurgery, General Hospital of Southern Theatre Command, Guangzhou, China
| | - Erica A. Voss
- Observational Health Data Sciences and Informatics, New York, New York, United States of America
- Janssen Research and Development, Raritan, New Jersey, United States of America
- Department of Medical Informatics, Erasmus University Medical Center, Rotterdam, The Netherlands
| | - Peter R. Rijnbeek
- Observational Health Data Sciences and Informatics, New York, New York, United States of America
- Department of Medical Informatics, Erasmus University Medical Center, Rotterdam, The Netherlands
- RuiJun Chen
  - Observational Health Data Sciences and Informatics, New York, New York, United States of America
  - Department of Biomedical Informatics, Columbia University, New York, New York, United States of America
  - Department of Medicine, Weill Cornell Medical College, New York, New York, United States of America
- Gowtham A. Rao
  - Observational Health Data Sciences and Informatics, New York, New York, United States of America
  - Janssen Research and Development, Raritan, New Jersey, United States of America
- Henry Morgan Stewart
  - Observational Health Data Sciences and Informatics, New York, New York, United States of America
  - IQVIA, Durham, North Carolina, United States of America
- Andrew E. Williams
  - Observational Health Data Sciences and Informatics, New York, New York, United States of America
  - Tufts Medical Center, Institute for Clinical Research and Health Policy Studies, Boston, Massachusetts, United States of America
- Ross D. Williams
  - Observational Health Data Sciences and Informatics, New York, New York, United States of America
  - Department of Medical Informatics, Erasmus University Medical Center, Rotterdam, The Netherlands
- Mui Van Zandt
  - Observational Health Data Sciences and Informatics, New York, New York, United States of America
  - IQVIA, Durham, North Carolina, United States of America
- Thomas Falconer
  - Observational Health Data Sciences and Informatics, New York, New York, United States of America
  - Department of Biomedical Informatics, Columbia University, New York, New York, United States of America
- Margarita Fernandez-Chas
  - Observational Health Data Sciences and Informatics, New York, New York, United States of America
  - IQVIA, Durham, North Carolina, United States of America
- Rohit Vashisht
  - Observational Health Data Sciences and Informatics, New York, New York, United States of America
  - Stanford Center for Biomedical Informatics Research, Stanford, California, United States of America
- Stephen R. Pfohl
  - Observational Health Data Sciences and Informatics, New York, New York, United States of America
  - Stanford Center for Biomedical Informatics Research, Stanford, California, United States of America
- Nigam H. Shah
  - Observational Health Data Sciences and Informatics, New York, New York, United States of America
  - Stanford Center for Biomedical Informatics Research, Stanford, California, United States of America
- Suranga N. Kasthurirathne
  - Observational Health Data Sciences and Informatics, New York, New York, United States of America
  - Center for Biomedical Informatics, Regenstrief Institute, Indianapolis, Indiana, United States of America
  - Department of Epidemiology, Indiana University Richard M. Fairbanks School of Public Health, Indianapolis, Indiana, United States of America
- Seng Chan You
  - Observational Health Data Sciences and Informatics, New York, New York, United States of America
  - Department of Biomedical Informatics, Ajou University School of Medicine, Suwon, Korea
- Qing Jiang
  - Biomedical Engineering School, Sun Yat-Sen University, Guangzhou, China
- Christian Reich
  - Observational Health Data Sciences and Informatics, New York, New York, United States of America
  - IQVIA, Durham, North Carolina, United States of America
- Yi Zhou
  - Department of Biomedical Engineering, Zhongshan School of Medicine, Sun Yat-Sen University, Guangzhou, China
16
Pfohl SR, Kim RB, Coan GS, Mitchell CS. Unraveling the Complexity of Amyotrophic Lateral Sclerosis Survival Prediction. Front Neuroinform 2018; 12:36. [PMID: 29962944] [PMCID: PMC6010549] [DOI: 10.3389/fninf.2018.00036] [Received: 01/01/2018] [Accepted: 05/28/2018] [Indexed: 12/12/2022] Open
Abstract
Objective: The heterogeneity of amyotrophic lateral sclerosis (ALS) survival duration, which varies from <1 year to >10 years, challenges clinical decisions and trials. Utilizing data from 801 deceased ALS patients, we: (1) assess the underlying complex relationships among common clinical ALS metrics; (2) identify which clinical ALS metrics are the "best" survival predictors and how their predictive ability changes as a function of disease progression. Methods: Analyses included examination of relationships within the raw data as well as the construction of interactive survival regression and classification models (generalized linear model and random forests model). Dimensionality reduction and feature clustering enabled decomposition of clinical variable contributions. Thirty-eight metrics were utilized, including Medical Research Council (MRC) muscle scores; respiratory function, including forced vital capacity (FVC) and FVC % predicted, oxygen saturation, negative inspiratory force (NIF); the Revised ALS Functional Rating Scale (ALSFRS-R) and its activities of daily living (ADL) and respiratory sub-scores; body weight; onset type, onset age, gender, and height. Prognostic random forest models confirm the dominance of decline in age-related patient parameters for classifying survival at thresholds of 30, 60, 90, and 180 days and 1, 2, 3, 4, and 5 years. Results: Collective prognostic insight derived from the overall investigation includes: multi-dimensionality of ALSFRS-R scores suggests cautious usage for survival forecasting; upper and lower extremities independently degenerate and are autonomous from respiratory decline, with the latter associating with nearer-to-death classifications; height and weight-based metrics are auxiliary predictors for farther-from-death classifications; sex and onset site (limb, bulbar) are not independent survival predictors due to age co-correlation.
Conclusion: The dimensionality and fluctuating predictors of ALS survival must be considered when developing predictive models for clinical trial development or in-clinic usage. Additional independent metrics and possible revisions to current metrics, like the ALSFRS-R, are needed to capture the underlying complexity needed for population and personalized forecasting of survival.
Affiliation(s)
- Stephen R Pfohl
  - Department of Biomedical Engineering, Georgia Institute of Technology, Emory University School of Medicine, Atlanta, GA, United States
  - Department of Biomedical Informatics, Stanford University, Stanford, CA, United States
- Renaid B Kim
  - Department of Biomedical Engineering, Georgia Institute of Technology, Emory University School of Medicine, Atlanta, GA, United States
  - Medical Scientist Training Program, University of Michigan Medical School, Ann Arbor, MI, United States
- Grant S Coan
  - Department of Biomedical Engineering, Georgia Institute of Technology, Emory University School of Medicine, Atlanta, GA, United States
  - School of Medicine, University of Texas Health Science Center at San Antonio, San Antonio, TX, United States
- Cassie S Mitchell
  - Department of Biomedical Engineering, Georgia Institute of Technology, Emory University School of Medicine, Atlanta, GA, United States
17
Pfohl SR, Halicek MT, Mitchell CS. Characterization of the Contribution of Genetic Background and Gender to Disease Progression in the SOD1 G93A Mouse Model of Amyotrophic Lateral Sclerosis: A Meta-Analysis. J Neuromuscul Dis 2015; 2:137-150. [PMID: 26594635] [PMCID: PMC4652798] [DOI: 10.3233/jnd-140068] [Indexed: 12/12/2022]
Abstract
Background: The SOD1 G93A mouse model of amyotrophic lateral sclerosis (ALS) is the most frequently used model to examine ALS pathophysiology. There is a lack of homogeneity in usage of the SOD1 G93A mouse, including differences in genetic background and gender, which could confound the field's results. Objective: In an analysis of 97 studies, we characterized the ALS progression for the high transgene copy control SOD1 G93A mouse on the basis of disease onset, overall lifespan, and disease duration for male and female mice on the B6SJL and C57BL/6J genetic backgrounds and quantified magnitudes of differences between groups. Methods: Mean age at onset, onset assessment measure, disease duration, and overall lifespan data from each study were extracted and statistically modeled as the response of a linear regression with sex and genetic background as predictors. Additional examination was performed on differing experimental onset and endpoint assessment measures. Results: C57BL/6 background mice show delayed onset of symptoms, increased lifespan, and an extended disease duration compared with their sex-matched B6SJL counterparts. Female B6SJL mice generally experience extended lifespan and delayed onset compared with their male counterparts, while female mice on the C57BL/6 background show delayed onset but no difference in survival compared with their male counterparts. Finally, different experimental protocols for onset determination (tremor, rotarod, etc.) result in notably different mean onset ages. Conclusions: Overall, the observed effect of sex on disease endpoints was smaller than that attributable to genetic background. The often-reported increase in lifespan for female mice was observed only for mice on the B6SJL background, implicating a strain-dependent effect of sex on disease progression that manifests despite identical mutant SOD1 expression.
Affiliation(s)
- Stephen R Pfohl
  - Georgia Institute of Technology and Emory University School of Medicine, Atlanta, GA, USA
- Martin T Halicek
  - Georgia Institute of Technology and Emory University School of Medicine, Atlanta, GA, USA
- Cassie S Mitchell
  - Georgia Institute of Technology and Emory University School of Medicine, Atlanta, GA, USA