1
Fischer L, Roig MB, Brannath W. An exhaustive ADDIS principle for online FWER control. Biom J 2024; 66:e2300237. PMID: 38637319. DOI: 10.1002/bimj.202300237.
Abstract
In this paper, we consider online multiple testing with familywise error rate (FWER) control, where the probability of committing at least one type I error remains under control while testing a possibly infinite sequence of hypotheses over time. Currently, adaptive-discard (ADDIS) procedures appear to be the most promising online procedures with FWER control in terms of power. Our main contribution is a uniform improvement of the ADDIS principle, and thus of all ADDIS procedures: the methods we propose reject at least as many hypotheses as ADDIS procedures, and in some cases more, while maintaining FWER control. In addition, we show that no other FWER-controlling procedure enlarges the event of rejecting any hypothesis. Finally, we apply the new principle to derive uniform improvements of ADDIS-Spending and the ADDIS-Graph.
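For orientation, the baseline that ADDIS-type procedures improve upon is naive online alpha-spending, which controls the FWER by a union bound. Below is a minimal sketch of that baseline only, not the ADDIS principle or the authors' improvement; the geometric weight sequence gamma_t is an illustrative choice.

```python
# Minimal online alpha-spending for FWER control (baseline, not ADDIS).
# Spend a fraction gamma_t of the total budget alpha on hypothesis t;
# since sum_t gamma_t <= 1, the union bound gives FWER <= alpha.

def alpha_spending(p_values, alpha=0.05, q=0.5):
    """Test a stream of p-values; gamma_t = (1 - q) * q**t is an
    illustrative weight sequence summing to 1."""
    rejections = []
    for t, p in enumerate(p_values):
        gamma_t = (1 - q) * q**t   # spending weights, sum to 1 over t
        alpha_t = alpha * gamma_t  # individual significance level
        rejections.append(p <= alpha_t)
    return rejections

# Example: the small p-values at t=0 and t=2 clear their levels.
print(alpha_spending([0.001, 0.2, 0.004, 0.9]))
```

ADDIS-type procedures gain power over this baseline by discarding hypotheses with large p-values and adapting to the fraction of likely nulls, which is what the paper's principle uniformly improves.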
Affiliation(s)
- Lasse Fischer
- Competence Center for Clinical Trials Bremen, University of Bremen, Bremen, Germany
- Marta Bofill Roig
- Center for Medical Data Science, Medical University of Vienna, Vienna, Austria
- Werner Brannath
- Competence Center for Clinical Trials Bremen, University of Bremen, Bremen, Germany
2
De A. Statistical Considerations and Challenges for Pivotal Clinical Studies of Artificial Intelligence Medical Tests for Widespread Use: Opportunities for Inter-Disciplinary Collaboration. Stat Biopharm Res 2023. DOI: 10.1080/19466315.2023.2169752.
Affiliation(s)
- Arkendra De
- Agilent Technologies, 1005 Mark Avenue, Carpinteria, CA 93013, USA
3
Barrios JP, Tison GH. Advancing cardiovascular medicine with machine learning: Progress, potential, and perspective. Cell Rep Med 2022; 3:100869. PMID: 36543095. PMCID: PMC9798021. DOI: 10.1016/j.xcrm.2022.100869.
Abstract
Recent advances in machine learning (ML) have made it possible to analyze high-dimensional and complex data, such as free text, images, waveforms, videos, and sound, in an automated manner by successfully learning complex associations within these data. Cardiovascular medicine is particularly well poised to take advantage of these ML advances, due to the widespread digitization of medical data and the large number of diagnostic tests used to evaluate cardiovascular disease. Various ML approaches have successfully been applied to cardiovascular tests and diseases to automate interpretation, accurately perform measurements, and, in some cases, predict novel diagnoses from less invasive tests, effectively expanding the utility of more widely accessible diagnostic tests. Here, we present examples of some impactful advances in cardiovascular medicine using ML across a variety of modalities, with a focus on deep learning applications.
Affiliation(s)
- Joshua P. Barrios
- Department of Medicine, Division of Cardiology, University of California, San Francisco, 555 Mission Bay Blvd South Box 3120, San Francisco, CA 94158, USA
- Geoffrey H. Tison
- Department of Medicine, Division of Cardiology, University of California, San Francisco, 555 Mission Bay Blvd South Box 3120, San Francisco, CA 94158, USA
- Bakar Computational Health Sciences Institute, University of California, San Francisco, 555 Mission Bay Blvd South Box 3120, San Francisco, CA 94158, USA
- Corresponding author
4
Clinical artificial intelligence quality improvement: towards continual monitoring and updating of AI algorithms in healthcare. NPJ Digit Med 2022; 5:66. PMID: 35641814. PMCID: PMC9156743. DOI: 10.1038/s41746-022-00611-y.
Abstract
Machine learning (ML) and artificial intelligence (AI) algorithms have the potential to derive insights from clinical data and improve patient outcomes. However, these highly complex systems are sensitive to changes in the environment and liable to performance decay. Even after their successful integration into clinical practice, ML/AI algorithms should be continuously monitored and updated to ensure their long-term safety and effectiveness. To bring AI to maturity in clinical care, we advocate for the creation of hospital units responsible for quality assurance and improvement of these algorithms, which we refer to as “AI-QI” units. We discuss how tools that have long been used in hospital quality assurance and quality improvement can be adapted to monitor static ML algorithms. Procedures for continual model updating, by contrast, are still nascent. We highlight key considerations when choosing between existing methods and opportunities for methodological innovation.
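Control charts are one of the long-used QI tools of the kind the authors describe adapting. Below is a minimal sketch of a one-sided CUSUM alarm on a monitored model statistic; the choice of statistic and the target, slack, and threshold values are illustrative assumptions, not details from the paper.

```python
# One-sided CUSUM chart for detecting upward drift in a monitored
# statistic, e.g. a deployed model's calibration error per batch.

def cusum_alarm(stats, target, slack=0.01, threshold=0.1):
    """Alarm when the cumulative excess of `stats` over
    target + slack crosses `threshold`; return the alarm index."""
    s = 0.0
    for i, x in enumerate(stats):
        s = max(0.0, s + (x - target - slack))  # reset at zero
        if s > threshold:
            return i
    return None

# Example: calibration error drifts upward after the third batch,
# and the accumulated excess eventually triggers the alarm.
batches = [0.02, 0.03, 0.02, 0.08, 0.09, 0.10]
print(cusum_alarm(batches, target=0.03))  # -> 5
```

The same chart can be pointed at any performance summary an AI-QI unit tracks; the design question is choosing the reference value and alarm threshold from an acceptable-performance specification.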
5
Feng J, Gossmann A, Sahiner B, Pirracchio R. Bayesian logistic regression for online recalibration and revision of risk prediction models with performance guarantees. J Am Med Inform Assoc 2022; 29:841-852. PMID: 35022756. DOI: 10.1093/jamia/ocab280.
Abstract
OBJECTIVE: After deploying a clinical prediction model, subsequently collected data can be used to fine-tune its predictions and adapt to temporal shifts. Because model updating carries risks of over-updating and overfitting, we study online methods with performance guarantees.
MATERIALS AND METHODS: We introduce 2 procedures for continual recalibration or revision of an underlying prediction model: Bayesian logistic regression (BLR) and a Markov variant that explicitly models distribution shifts (MarBLR). We perform empirical evaluation via simulations and a real-world study predicting chronic obstructive pulmonary disease (COPD) risk. We derive "Type I and II" regret bounds, which guarantee that the procedures are noninferior to a static model and competitive with an oracle logistic reviser in terms of average loss.
RESULTS: Both procedures consistently outperformed the static model and other online logistic revision methods. In simulations, the average estimated calibration index (aECI) of the original model was 0.828 (95% CI, 0.818-0.938). Online recalibration using BLR and MarBLR improved the aECI towards the ideal value of zero, attaining 0.265 (95% CI, 0.230-0.300) and 0.241 (95% CI, 0.216-0.266), respectively. When performing more extensive logistic model revisions, BLR and MarBLR increased the average area under the receiver-operating characteristic curve (aAUC) from 0.767 (95% CI, 0.765-0.769) to 0.800 (95% CI, 0.798-0.802) and 0.799 (95% CI, 0.797-0.801), respectively, in stationary settings and protected against substantial model decay. In the COPD study, BLR and MarBLR dynamically combined the original model with a continually refitted gradient-boosted tree to achieve aAUCs of 0.924 (95% CI, 0.913-0.935) and 0.925 (95% CI, 0.914-0.935), compared to the static model's aAUC of 0.904 (95% CI, 0.892-0.916).
DISCUSSION: Despite its simplicity, BLR is highly competitive with MarBLR. MarBLR outperforms BLR when its prior better reflects the data.
CONCLUSIONS: BLR and MarBLR can improve the transportability of clinical prediction models and maintain their performance over time.
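For context, the simplest version of the recalibration task studied here is logistic recalibration of a model's predicted probabilities, p' = sigmoid(a + b * logit(p)). Below is a minimal sketch of plain online recalibration by stochastic gradient descent; it is a baseline illustration, not the authors' BLR or MarBLR procedures, and the learning rate is an illustrative assumption.

```python
import math

def logit(p):
    return math.log(p / (1 - p))

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

class OnlineRecalibrator:
    """Recalibrate p' = sigmoid(a + b * logit(p)), one SGD step
    on the log loss per observed outcome."""

    def __init__(self, lr=0.1):
        self.a, self.b, self.lr = 0.0, 1.0, lr  # start at identity

    def predict(self, p):
        return sigmoid(self.a + self.b * logit(p))

    def update(self, p, y):
        z = logit(p)
        err = self.predict(p) - y  # gradient of log loss wrt the logit
        self.a -= self.lr * err
        self.b -= self.lr * err * z

# Example: the underlying model is overconfident, so repeated
# negative outcomes pull the recalibrated probability down.
recal = OnlineRecalibrator()
for p, y in [(0.9, 0), (0.8, 0), (0.7, 0)]:
    recal.update(p, y)
print(round(recal.predict(0.9), 3))  # shrunk below the raw 0.9
```

BLR and MarBLR can be read as Bayesian counterparts of this update: posteriors over (a, b) and richer revision coefficients, with priors and a Markov dynamic that yield the paper's regret guarantees.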
Affiliation(s)
- Jean Feng
- Department of Epidemiology and Biostatistics, University of California, San Francisco, San Francisco, California, USA
- Alexej Gossmann
- Center for Devices and Radiological Health (CDRH), Food and Drug Administration, Silver Spring, Maryland, USA
- Berkman Sahiner
- Center for Devices and Radiological Health (CDRH), Food and Drug Administration, Silver Spring, Maryland, USA
- Romain Pirracchio
- Department of Anesthesia and Perioperative Care, University of California, San Francisco, San Francisco, California, USA
6
Harris S, Bonnici T, Keen T, Lilaonitkul W, White MJ, Swanepoel N. Clinical deployment environments: Five pillars of translational machine learning for health. Front Digit Health 2022; 4:939292. PMID: 36060542. PMCID: PMC9437594. DOI: 10.3389/fdgth.2022.939292.
Abstract
Machine Learning for Health (ML4H) has demonstrated efficacy in computer imaging and other self-contained digital workflows, but has failed to substantially impact routine clinical care. This is no longer because of poor adoption of Electronic Health Record Systems (EHRS), but because ML4H needs an infrastructure for development, deployment, and evaluation within the healthcare institution. In this paper, we propose a design pattern called a Clinical Deployment Environment (CDE). We sketch the five pillars of the CDE: (1) real-world development supported by live data, where ML4H teams can iteratively build and test at the bedside; (2) an ML-Ops platform that brings the rigour and standards of continuous deployment to ML4H; (3) design and supervision by those with expertise in AI safety; (4) the methods of implementation science, which enable algorithmic insights to influence the behaviour of clinicians and patients; and (5) continuous evaluation that uses randomisation to avoid bias, but in an agile manner. The CDE is intended to answer the same requirements that biomedicine articulated in establishing the translational medicine domain. It envisions a transition from "real-world" data to "real-world" development.
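As one concrete reading of pillar (5), randomisation can be made reproducible by hashing a stable identifier to an arm, so each patient keeps the same allocation for the life of an evaluation while allocation remains effectively random across patients. Below is a minimal sketch of such hash-based allocation; the arm names and salt are illustrative assumptions, not details from the paper.

```python
import hashlib

def assign_arm(patient_id, arms=("usual-care", "ml4h-alert"),
               salt="cde-eval-1"):
    """Deterministically map a patient ID to an evaluation arm;
    changing the salt re-randomises the whole cohort."""
    digest = hashlib.sha256(f"{salt}:{patient_id}".encode()).hexdigest()
    return arms[int(digest, 16) % len(arms)]

# Example: assignment is stable per patient and roughly balanced
# across a cohort.
counts = {}
for i in range(1000):
    arm = assign_arm(f"patient-{i}")
    counts[arm] = counts.get(arm, 0) + 1
print(counts)  # approximately 500/500
```

Determinism matters operationally: the same allocation can be recomputed by the ML-Ops platform, the EHR integration, and the analysis code without sharing a randomisation table.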
Affiliation(s)
- Steve Harris
- Institute of Health Informatics, University College London, London, United Kingdom
- Department of Critical Care, University College London Hospital, London, United Kingdom
- Correspondence: Steve Harris
- Tim Bonnici
- Institute of Health Informatics, University College London, London, United Kingdom
- Department of Critical Care, University College London Hospital, London, United Kingdom
- Thomas Keen
- Institute of Health Informatics, University College London, London, United Kingdom
- Watjana Lilaonitkul
- Institute of Health Informatics, University College London, London, United Kingdom
- Mark J. White
- Digital Healthcare, University College London Hospital, London, United Kingdom
- Nel Swanepoel
- Centre for Advanced Research Computing, University College London, London, United Kingdom
7
Dudgeon SN, Wen S, Hanna MG, Gupta R, Amgad M, Sheth M, Marble H, Huang R, Herrmann MD, Szu CH, Tong D, Werness B, Szu E, Larsimont D, Madabhushi A, Hytopoulos E, Chen W, Singh R, Hart SN, Sharma A, Saltz J, Salgado R, Gallas BD. A Pathologist-Annotated Dataset for Validating Artificial Intelligence: A Project Description and Pilot Study. J Pathol Inform 2021; 12:45. PMID: 34881099. PMCID: PMC8609287. DOI: 10.4103/jpi.jpi_83_20.
Abstract
Purpose: Validating artificial intelligence algorithms for clinical use in medical images is a challenging endeavor due to a lack of standard reference data (ground truth). This topic typically occupies a small portion of the discussion in research papers, since most of the effort is focused on developing novel algorithms. In this work, we present a collaboration to create a validation dataset of pathologist annotations for algorithms that process whole slide images. We focus on data collection and evaluation of algorithm performance in the context of estimating the density of stromal tumor-infiltrating lymphocytes (sTILs) in breast cancer.
Methods: We digitized 64 glass slides of hematoxylin- and eosin-stained invasive ductal carcinoma core biopsies prepared at a single clinical site. A collaborating pathologist selected 10 regions of interest (ROIs) per slide for evaluation. We created training materials and workflows to crowdsource pathologist image annotations in two modes: an optical microscope and two digital platforms. The microscope platform allows the same ROIs to be evaluated in both modes. The workflows collect the ROI type, a decision on whether the ROI is appropriate for estimating the density of sTILs, and, if appropriate, the sTIL density value for that ROI.
Results: In total, 19 pathologists made 1645 ROI evaluations during a data collection event and the following 2 weeks. The pilot study yielded an abundance of cases with nominal sTIL infiltration. Furthermore, we found that sTIL densities are correlated within a case and that there is notable pathologist variability. Consequently, we outline plans to improve our ROI and case sampling methods. We also outline statistical methods to account for ROI correlations within a case and pathologist variability when validating an algorithm.
Conclusion: We have built workflows for efficient data collection and tested them in a pilot study. As we prepare for pivotal studies, we will investigate methods to use the dataset as an external validation tool for algorithms. We will also consider what it will take for the dataset to be fit for a regulatory purpose: study size, patient population, and pathologist training and qualifications. To this end, we will elicit feedback from the Food and Drug Administration via the Medical Device Development Tool program and from the broader digital pathology and AI community. Ultimately, we intend to share the dataset, statistical methods, and lessons learned.
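The within-case correlation of sTIL densities noted in the Results can be summarised by an intraclass correlation coefficient. Below is a minimal sketch of the one-way ANOVA ICC for a balanced design (k ROIs per case); the toy density values are illustrative, not pilot-study data.

```python
# One-way ANOVA intraclass correlation for sTIL densities grouped
# by case: ICC = (MSB - MSW) / (MSB + (k - 1) * MSW).

def icc_oneway(groups):
    k = len(groups[0])  # ROIs per case (balanced design assumed)
    n = len(groups)     # number of cases
    grand = sum(sum(g) for g in groups) / (n * k)
    means = [sum(g) / k for g in groups]
    msb = k * sum((m - grand) ** 2 for m in means) / (n - 1)
    msw = sum((x - m) ** 2
              for g, m in zip(groups, means)
              for x in g) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)

# Toy example: densities vary far more across cases than within
# them, so the ICC is close to 1.
cases = [[5, 7, 6], [20, 22, 21], [40, 38, 41]]
print(round(icc_oneway(cases), 2))
```

A high ICC means ROIs from the same case carry less independent information, which is why the study-size and variance-component planning the authors describe must model the case level explicitly.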
Affiliation(s)
- Sarah N Dudgeon
- Division of Imaging Diagnostics and Software Reliability, Office of Science and Engineering Laboratories, Center for Devices and Radiological Health, United States Food and Drug Administration, White Oak, MD, USA
- Si Wen
- Division of Imaging Diagnostics and Software Reliability, Office of Science and Engineering Laboratories, Center for Devices and Radiological Health, United States Food and Drug Administration, White Oak, MD, USA
- Rajarsi Gupta
- Department of Biomedical Informatics, Stony Brook University, Stony Brook, NY, USA
- Mohamed Amgad
- Department of Pathology, Northwestern University, Chicago, IL, USA
- Manasi Sheth
- Division of Biostatistics, Center for Devices and Radiological Health, United States Food and Drug Administration, White Oak, MD, USA
- Hetal Marble
- Department of Pathology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Richard Huang
- Department of Pathology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Markus D Herrmann
- Department of Pathology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Evan Szu
- Arrive Bio, San Francisco, CA, USA
- Denis Larsimont
- Department of Pathology, Institute Jules Bordet, Brussels, Belgium
- Anant Madabhushi
- Louis Stokes Cleveland Veterans Administration Medical Center, Cleveland, OH, USA
- Weijie Chen
- Division of Imaging Diagnostics and Software Reliability, Office of Science and Engineering Laboratories, Center for Devices and Radiological Health, United States Food and Drug Administration, White Oak, MD, USA
- Rajendra Singh
- Northwell Health and Zucker School of Medicine, New York, NY, USA
- Steven N Hart
- Department of Pathology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Ashish Sharma
- Department of Biomedical Informatics, Emory University, Atlanta, GA, USA
- Joel Saltz
- Department of Biomedical Informatics, Stony Brook University, Stony Brook, NY, USA
- Roberto Salgado
- Division of Research, Peter MacCallum Cancer Centre, Melbourne, Australia
- Department of Pathology, GZA-ZNA Hospitals, Antwerp, Belgium
- Brandon D Gallas
- Division of Imaging Diagnostics and Software Reliability, Office of Science and Engineering Laboratories, Center for Devices and Radiological Health, United States Food and Drug Administration, White Oak, MD, USA
8
Rose S. Discussion on "Approval policies for modifications to machine learning-based software as a medical device: A study of biocreep" by Jean Feng, Scott Emerson, and Noah Simon. Biometrics 2021; 77:49-51. PMID: 33040334. PMCID: PMC8386180. DOI: 10.1111/biom.13378.
Abstract
I applaud the authors of Feng et al. (2020) for tackling a challenging statistical problem on approval policies for software as a medical device (SaMD). Their work exploring methodology that could autonomously build algorithmic change protocols soundly extends and leverages related literatures in multiple testing and online learning, among others. While their paper appears in the Biometric Methodology section of the journal, I choose to focus on important practical considerations in this invited discussion, given that algorithms optimized and deployed in health care can directly impact human health. Thus, although not a Biometrics Practice paper, I aim to make the case that several broad issues are relevant for much of the algorithmic work of statisticians who are driven by health applications: the data and setting, whether the reference algorithm is an acceptable baseline, and metrics.
Affiliation(s)
- Sherri Rose
- Center for Health Policy and Center for Primary Care and Outcomes Research, Stanford University, Stanford, California
9
El Naqa I, Li H, Fuhrman J, Hu Q, Gorre N, Chen W, Giger ML. Lessons learned in transitioning to AI in the medical imaging of COVID-19. J Med Imaging (Bellingham) 2021; 8(S1):010902. PMID: 34646912. PMCID: PMC8488974. DOI: 10.1117/1.jmi.8.s1.010902.
Abstract
The coronavirus disease 2019 (COVID-19) pandemic has wreaked havoc across the world. It also created a need for the urgent development of efficacious predictive diagnostics, specifically artificial intelligence (AI) methods applied to medical imaging. The pandemic has brought together experts from multiple disciplines, including clinicians, medical physicists, imaging scientists, computer scientists, and informatics experts, to bring the best of these fields to bear on its challenges. However, such a convergence over a very brief period of time has had unintended consequences and created its own challenges. As part of the Medical Imaging Data and Resource Center initiative, we discuss the lessons learned from career transitions across the three disciplines involved (radiology, medical imaging physics, and computer science) and draw recommendations from these experiences by analyzing the challenges associated with each of three transition types: (1) AI of non-imaging data to AI of medical imaging data, (2) medical imaging clinician to AI of medical imaging, and (3) AI of medical imaging to AI of COVID-19 imaging. The diffusion of knowledge across these transitions can be accomplished more effectively by recognizing their intricacies. The lessons learned in transitioning to AI in the medical imaging of COVID-19 can inform and enhance future AI applications, making the whole of the transitions more than the sum of each discipline, whether confronting an emergency like the COVID-19 pandemic or solving emerging problems in biomedicine.
Affiliation(s)
- Issam El Naqa
- Moffitt Cancer Center, Department of Machine Learning, Tampa, Florida, United States
- The University of Chicago, Medical Imaging Data and Resource Center, Chicago, Illinois, United States
- Hui Li
- The University of Chicago, Medical Imaging Data and Resource Center, Chicago, Illinois, United States
- The University of Chicago, Department of Radiology, Chicago, Illinois, United States
- Jordan Fuhrman
- The University of Chicago, Medical Imaging Data and Resource Center, Chicago, Illinois, United States
- The University of Chicago, Department of Radiology, Chicago, Illinois, United States
- Qiyuan Hu
- The University of Chicago, Medical Imaging Data and Resource Center, Chicago, Illinois, United States
- The University of Chicago, Department of Radiology, Chicago, Illinois, United States
- Naveena Gorre
- Moffitt Cancer Center, Department of Machine Learning, Tampa, Florida, United States
- The University of Chicago, Medical Imaging Data and Resource Center, Chicago, Illinois, United States
- Weijie Chen
- The University of Chicago, Medical Imaging Data and Resource Center, Chicago, Illinois, United States
- US FDA, CDRH, Office of Science and Engineering Laboratories, Division of Imaging, Diagnosis, and Software Reliability, Silver Spring, Maryland, United States
- Maryellen L. Giger
- The University of Chicago, Medical Imaging Data and Resource Center, Chicago, Illinois, United States
- The University of Chicago, Department of Radiology, Chicago, Illinois, United States