1. Ly A, Garcia V, Blenman KRM, Ehinger A, Elfer K, Hanna MG, Li X, Peeters DJE, Birmingham R, Dudgeon S, Gardecki E, Gupta R, Lennerz J, Pan T, Saltz J, Wharton KA, Ehinger D, Acs B, Dequeker EMC, Salgado R, Gallas BD. Training pathologists to assess stromal tumour-infiltrating lymphocytes in breast cancer synergises efforts in clinical care and scientific research. Histopathology 2024; 84:915-923. [PMID: 38433289; PMCID: PMC10990791; DOI: 10.1111/his.15140] [Received: 09/14/2023; Revised: 12/15/2023; Accepted: 12/31/2023]
Abstract
A growing body of research supports stromal tumour-infiltrating lymphocyte (TIL) density in breast cancer as a robust prognostic and predictive biomarker. The gold standard for stromal TIL density quantitation in breast cancer is pathologist visual assessment using haematoxylin and eosin-stained slides. Artificial intelligence/machine-learning algorithms are in development to automate the stromal TIL scoring process and must be validated against a reference standard such as pathologist visual assessment. Visual TIL assessment, however, may suffer from significant interobserver variability. To improve interobserver agreement, regulatory science experts at the US Food and Drug Administration partnered with academic pathologists internationally to create a freely available online continuing medical education (CME) course to train pathologists in assessing breast cancer stromal TILs using an interactive format with expert commentary. Here we describe and provide a user guide to this CME course, whose content was designed to improve pathologist accuracy in scoring breast cancer TILs. We also suggest subsequent steps to translate this knowledge into clinical practice with proficiency testing.
Affiliation(s)
- Amy Ly
- Department of Pathology, Massachusetts General Hospital, Boston, MA, USA
- Victor Garcia
- Center for Devices and Radiological Health, Office of Science and Engineering Laboratories, Division of Imaging, Diagnostics, and Software Reliability, U.S. Food and Drug Administration, Silver Spring, MD, USA
- Kim RM Blenman
- Department of Internal Medicine, Section of Medical Oncology and Yale Cancer Center, Yale School of Medicine, New Haven, CT, USA
- Department of Computer Science, Yale School of Engineering and Applied Science, New Haven, CT, USA
- Anna Ehinger
- Department of Genetics, Pathology and Molecular Diagnostics, Laboratory Medicine, Region Skåne, Lund University, Lund, Sweden
- Katherine Elfer
- Center for Devices and Radiological Health, Office of Science and Engineering Laboratories, Division of Imaging, Diagnostics, and Software Reliability, U.S. Food and Drug Administration, Silver Spring, MD, USA
- Matthew G Hanna
- Department of Pathology and Laboratory Medicine, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Xiaoxian Li
- Department of Pathology and Laboratory Medicine, Emory University, Atlanta, GA, USA
- Dieter JE Peeters
- Department of Pathology, University Hospital Antwerp, Edegem, Belgium
- Department of Pathology, Algemeen Ziekenhuis (AZ) Sint-Maarten, Mechelen, Belgium
- Ryan Birmingham
- Center for Devices and Radiological Health, Office of Science and Engineering Laboratories, Division of Imaging, Diagnostics, and Software Reliability, U.S. Food and Drug Administration, Silver Spring, MD, USA
- Department of Biomedical Informatics, Emory University, Atlanta, GA, USA
- Sarah Dudgeon
- Center for Computational Health, Yale School of Medicine, New Haven, CT, USA
- Emma Gardecki
- Center for Devices and Radiological Health, Office of Science and Engineering Laboratories, Division of Imaging, Diagnostics, and Software Reliability, U.S. Food and Drug Administration, Silver Spring, MD, USA
- Rajarsi Gupta
- Department of Biomedical Informatics, Stony Brook University, Stony Brook, NY, USA
- Jochen Lennerz
- Department of Pathology, Center for Integrated Diagnostics, Massachusetts General Hospital, Boston, MA, USA; currently at BostonGene, Boston, MA
- Tony Pan
- Department of Biomedical Informatics, Emory University, Atlanta, GA, USA
- Joel Saltz
- Department of Biomedical Informatics, Stony Brook University, Stony Brook, NY, USA
- Daniel Ehinger
- Department of Clinical Sciences, Division of Oncology, Lund University, Lund, Sweden
- Department of Genetics, Pathology, and Molecular Diagnostics, Skåne University Hospital, Lund, Sweden
- Balazs Acs
- Department of Oncology and Pathology, Cancer Centre Karolinska, Karolinska Institutet, Stockholm, Sweden
- Department of Clinical Pathology and Cancer Diagnostics, Karolinska University Hospital, Stockholm, Sweden
- Elisabeth MC Dequeker
- Department of Public Health and Primary Care, Biomedical Quality Assurance Research Unit, University of Leuven, Leuven, Belgium
- Roberto Salgado
- Department of Pathology, Gasthuiszusters Antwerpen-Ziekenhuis Netwerk Antwerpen (GZA-ZNA) Hospitals, Antwerp, Belgium
- Division of Research, Peter MacCallum Cancer Centre, Melbourne, Australia
- Brandon D Gallas
- Center for Devices and Radiological Health, Office of Science and Engineering Laboratories, Division of Imaging, Diagnostics, and Software Reliability, U.S. Food and Drug Administration, Silver Spring, MD, USA
2. Elfer K, Gardecki E, Garcia V, Ly A, Hytopoulos E, Wen S, Hanna MG, Peeters DJE, Saltz J, Ehinger A, Dudgeon SN, Li X, Blenman KRM, Chen W, Green U, Birmingham R, Pan T, Lennerz JK, Salgado R, Gallas BD. Reproducible Reporting of the Collection and Evaluation of Annotations for Artificial Intelligence Models. Mod Pathol 2024; 37:100439. [PMID: 38286221; DOI: 10.1016/j.modpat.2024.100439] [Received: 07/18/2023; Revised: 12/14/2023; Accepted: 01/21/2024]
Abstract
This work puts forth and demonstrates the utility of a reporting framework for collecting and evaluating annotations of medical images used for training and testing artificial intelligence (AI) models in assisting detection and diagnosis. AI has unique reporting requirements, as shown by the AI extensions to the Consolidated Standards of Reporting Trials (CONSORT) and Standard Protocol Items: Recommendations for Interventional Trials (SPIRIT) checklists and the proposed AI extensions to the Standards for Reporting Diagnostic Accuracy (STARD) and Transparent Reporting of a Multivariable Prediction model for Individual Prognosis or Diagnosis (TRIPOD) checklists. AI for detection and/or diagnostic image analysis requires complete, reproducible, and transparent reporting of the annotations and metadata used in training and testing data sets. In an earlier work by other researchers, an annotation workflow and quality checklist for computational pathology annotations were proposed. In this manuscript, we operationalize this workflow into an evaluable quality checklist that applies to any reader-interpreted medical images, and we demonstrate its use for an annotation effort in digital pathology. We refer to this quality framework as the Collection and Evaluation of Annotations for Reproducible Reporting of Artificial Intelligence (CLEARR-AI).
Affiliation(s)
- Katherine Elfer
- United States Food and Drug Administration, Center for Devices and Radiological Health, Office of Science and Engineering Laboratories, Division of Imaging Diagnostics and Software Reliability, Silver Spring, Maryland; National Institutes of Health, National Cancer Institute, Division of Cancer Prevention, Cancer Prevention Fellowship Program, Bethesda, Maryland
- Emma Gardecki
- United States Food and Drug Administration, Center for Devices and Radiological Health, Office of Science and Engineering Laboratories, Division of Imaging Diagnostics and Software Reliability, Silver Spring, Maryland
- Victor Garcia
- United States Food and Drug Administration, Center for Devices and Radiological Health, Office of Science and Engineering Laboratories, Division of Imaging Diagnostics and Software Reliability, Silver Spring, Maryland
- Amy Ly
- Department of Pathology, Massachusetts General Hospital, Boston, Massachusetts
- Si Wen
- United States Food and Drug Administration, Center for Devices and Radiological Health, Office of Science and Engineering Laboratories, Division of Imaging Diagnostics and Software Reliability, Silver Spring, Maryland
- Matthew G Hanna
- Department of Pathology and Laboratory Medicine, Memorial Sloan Kettering Cancer Center, New York, New York
- Dieter J E Peeters
- Department of Pathology, University Hospital Antwerp/University of Antwerp, Antwerp, Belgium; Department of Pathology, Sint-Maarten Hospital, Mechelen, Belgium
- Joel Saltz
- Department of Biomedical Informatics, Stony Brook University, Stony Brook, New York
- Anna Ehinger
- Department of Clinical Genetics, Pathology and Molecular Diagnostics, Laboratory Medicine, Lund University, Lund, Sweden
- Sarah N Dudgeon
- Department of Laboratory Medicine, Yale School of Medicine, New Haven, Connecticut
- Xiaoxian Li
- Department of Pathology and Laboratory Medicine, Emory University School of Medicine, Atlanta, Georgia
- Kim R M Blenman
- Department of Internal Medicine, Section of Medical Oncology, Yale School of Medicine and Yale Cancer Center, Yale University, New Haven, Connecticut; Department of Computer Science, School of Engineering and Applied Science, Yale University, New Haven, Connecticut
- Weijie Chen
- United States Food and Drug Administration, Center for Devices and Radiological Health, Office of Science and Engineering Laboratories, Division of Imaging Diagnostics and Software Reliability, Silver Spring, Maryland
- Ursula Green
- Department of Biomedical Informatics, Emory University School of Medicine, Atlanta, Georgia
- Ryan Birmingham
- United States Food and Drug Administration, Center for Devices and Radiological Health, Office of Science and Engineering Laboratories, Division of Imaging Diagnostics and Software Reliability, Silver Spring, Maryland; Department of Biomedical Informatics, Emory University School of Medicine, Atlanta, Georgia
- Tony Pan
- Department of Biomedical Informatics, Emory University School of Medicine, Atlanta, Georgia
- Jochen K Lennerz
- Department of Pathology, Center for Integrated Diagnostics, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts
- Roberto Salgado
- Division of Research, Peter MacCallum Cancer Centre, Melbourne, Australia; Department of Pathology, GZA-ZNA Hospitals, Antwerp, Belgium
- Brandon D Gallas
- United States Food and Drug Administration, Center for Devices and Radiological Health, Office of Science and Engineering Laboratories, Division of Imaging Diagnostics and Software Reliability, Silver Spring, Maryland
3. Hart S, Garcia V, Dudgeon SN, Hanna MG, Li X, Blenman KRM, Elfer K, Ly A, Salgado R, Saltz J, Gupta R, Hytopoulos E, Larsimont D, Lennerz J, Gallas BD. Initial interactions with the FDA on developing a validation dataset as a medical device development tool. J Pathol 2023; 261:378-384. [PMID: 37794720; PMCID: PMC10841854; DOI: 10.1002/path.6208] [Received: 05/09/2023; Revised: 07/14/2023; Accepted: 08/24/2023]
Abstract
Quantifying tumor-infiltrating lymphocytes (TILs) in breast cancer tumors is a challenging task for pathologists. With the advent of whole slide imaging that digitizes glass slides, it is possible to apply computational models to quantify TILs for pathologists. Development of computational models requires significant time, expertise, consensus, and investment. To reduce this burden, we are preparing a dataset for developers to validate their models and a proposal to the Medical Device Development Tool (MDDT) program in the Center for Devices and Radiological Health of the U.S. Food and Drug Administration (FDA). If the FDA qualifies the dataset for its submitted context of use, model developers can use it in a regulatory submission within the qualified context of use without additional documentation. Our dataset aims to reduce the regulatory burden placed on developers of models that estimate the density of TILs and will allow head-to-head comparison of multiple computational models on the same data. In this paper, we discuss the MDDT preparation and submission process, including the feedback we received from our initial interactions with the FDA, and propose how a qualified MDDT validation dataset could be a mechanism for open, fair, and consistent measures of computational model performance. Our experiences will help the community understand what the FDA considers relevant and appropriate, from the perspective of a submitter at the early stages of the MDDT submission process, for validating stromal TIL density estimation models and other potential computational models. © 2023 The Authors. The Journal of Pathology published by John Wiley & Sons Ltd on behalf of The Pathological Society of Great Britain and Ireland. This article has been contributed to by U.S. Government employees and their work is in the public domain in the USA.
Affiliation(s)
- Steven Hart
- Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, MN, USA
- Victor Garcia
- Division of Imaging, Diagnostics, and Software Reliability, Office of Science and Engineering Laboratories, Center for Devices and Radiological Health, US Food and Drug Administration, Silver Spring, MD, USA
- Sarah N. Dudgeon
- Computational Biology and Bioinformatics Program, Yale University, New Haven, CT, USA
- Xiaoxian Li
- Department of Pathology and Laboratory Medicine, Emory University, Atlanta, GA, USA
- Kim RM Blenman
- Department of Internal Medicine, Section of Medical Oncology, School of Medicine, Yale University, New Haven, CT, USA
- Department of Computer Science, School of Engineering and Applied Science, Yale University, New Haven, CT, USA
- Katherine Elfer
- Division of Imaging, Diagnostics, and Software Reliability, Office of Science and Engineering Laboratories, Center for Devices and Radiological Health, US Food and Drug Administration, Silver Spring, MD, USA
- Amy Ly
- Department of Pathology, Massachusetts General Hospital, Boston, MA, USA
- Roberto Salgado
- Department of Pathology, GZA-ZNA Hospitals, Antwerp, Belgium
- Division of Research, Peter MacCallum Cancer Centre, Melbourne, Australia
- Joel Saltz
- Department of Biomedical Informatics, Stony Brook School of Medicine, Stony Brook, NY, USA
- Rajarsi Gupta
- Department of Biomedical Informatics, Stony Brook School of Medicine, Stony Brook, NY, USA
- Denis Larsimont
- Department of Pathology, Institut Jules Bordet, Université Libre de Bruxelles, Brussels, Belgium
- Jochen Lennerz
- Massachusetts General Hospital, Center for Integrated Diagnostics, Boston, MA, USA
- Brandon D. Gallas
- Division of Imaging, Diagnostics, and Software Reliability, Office of Science and Engineering Laboratories, Center for Devices and Radiological Health, US Food and Drug Administration, Silver Spring, MD, USA
4. Petrick N, Chen W, Delfino JG, Gallas BD, Kang Y, Krainak D, Sahiner B, Samala RK. Regulatory considerations for medical imaging AI/ML devices in the United States: concepts and challenges. J Med Imaging (Bellingham) 2023; 10:051804. [PMID: 37361549; PMCID: PMC10289177; DOI: 10.1117/1.jmi.10.5.051804] [Received: 02/02/2023; Revised: 05/22/2023; Accepted: 05/30/2023]
Abstract
Purpose: To introduce developers to medical device regulatory processes and data considerations in artificial intelligence and machine learning (AI/ML) device submissions and to discuss ongoing AI/ML-related regulatory challenges and activities.
Approach: AI/ML technologies are being used in an increasing number of medical imaging devices, and the fast evolution of these technologies presents novel regulatory challenges. We provide AI/ML developers with an introduction to U.S. Food and Drug Administration (FDA) regulatory concepts, processes, and fundamental assessments for a wide range of medical imaging AI/ML device types.
Results: The device type and appropriate premarket regulatory pathway for an AI/ML device are based on the level of risk associated with the device and informed by both its technological characteristics and intended use. AI/ML device submissions contain a wide array of information and testing to facilitate the review process; the model description, data, nonclinical testing, and multi-reader multi-case testing are critical aspects of the review for many AI/ML device submissions. The agency is also involved in AI/ML-related activities that support guidance document development, good machine learning practice development, AI/ML transparency, AI/ML regulatory research, and real-world performance assessment.
Conclusion: FDA's AI/ML regulatory and scientific efforts support the joint goals of ensuring that patients have access to safe and effective AI/ML devices over the entire device lifecycle and stimulating medical AI/ML innovation.
Affiliation(s)
- Nicholas Petrick
- U.S. Food and Drug Administration, Center for Devices and Radiological Health, Office of Science and Engineering Labs, Silver Spring, Maryland, United States
- Weijie Chen
- U.S. Food and Drug Administration, Center for Devices and Radiological Health, Office of Science and Engineering Labs, Silver Spring, Maryland, United States
- Jana G. Delfino
- U.S. Food and Drug Administration, Center for Devices and Radiological Health, Office of Science and Engineering Labs, Silver Spring, Maryland, United States
- Brandon D. Gallas
- U.S. Food and Drug Administration, Center for Devices and Radiological Health, Office of Science and Engineering Labs, Silver Spring, Maryland, United States
- Yanna Kang
- U.S. Food and Drug Administration, Center for Devices and Radiological Health, Office of Product Evaluation and Quality, Silver Spring, Maryland, United States
- Daniel Krainak
- U.S. Food and Drug Administration, Center for Devices and Radiological Health, Office of Product Evaluation and Quality, Silver Spring, Maryland, United States
- Berkman Sahiner
- U.S. Food and Drug Administration, Center for Devices and Radiological Health, Office of Science and Engineering Labs, Silver Spring, Maryland, United States
- Ravi K. Samala
- U.S. Food and Drug Administration, Center for Devices and Radiological Health, Office of Science and Engineering Labs, Silver Spring, Maryland, United States
5. Garcia V, Ly A, Salgado R, Gallas BD. Abstract P1-08-05: Educating pathologists in quantitating stromal tumor-infiltrating lymphocytes in breast cancer for artificial intelligence applications. Cancer Res 2023. [DOI: 10.1158/1538-7445.sabcs22-p1-08-05]
Abstract
Background: Immune cells in the tumor microenvironment play an important role in cancer development [1]. In triple-negative breast cancer (TNBC), stromal tumor-infiltrating lymphocytes (sTILs) have been identified as a biomarker with both predictive and prognostic clinical value [2]. The High Throughput Truthing project collected sTILs density estimates in hematoxylin and eosin-stained invasive breast cancer biopsy specimens. The goal of the project is to produce a dataset to validate artificial intelligence and machine learning models [3]. After collecting annotations from pathologists for a pilot study, we observed a high level of interobserver variability in sTILs density estimates. To improve pathologist accuracy in breast cancer sTILs assessment, we created educational materials using an expert panel and pilot study data.
Methods: The pilot study data consisted of 640 unique regions of interest (ROIs) derived from 64 digital whole slide images. We categorized ROIs based on their mean sTILs density as "10% or less", "11% to 40%", or "greater than 40%", and selected 72 unique ROIs from those with the highest and lowest pathologist variability in each density bin. In a series of eight one-hour sessions, each ROI was reviewed in a group setting by at least three members of our expert panel, which consisted of one clinical scientist and seven board-certified pathologists trained in breast cancer sTILs assessment. Experts provided estimates of the percent of tumor-associated stroma and sTILs density, and commentary on features that confound sTILs assessment, for each ROI.
Results: We created a set of educational materials to teach the sTILs assessment methodology in breast cancer. These materials include an introduction to the clinical relevance of tumor-infiltrating lymphocytes in the breast cancer microenvironment, a tutorial for assessing sTILs according to published guidelines [4], and a discussion of specific pitfalls that may be encountered. Expert panel annotations, comments, and pitfalls were used to generate a reference document and interactive tests: one with expert feedback on each ROI and one to determine proficiency.
Conclusions: Educational materials designed by an expert panel will serve as reference materials for learning sTILs assessment in breast cancer. Our work provides valuable education for pathologists and directly supports their ability to provide up-to-date diagnostic information used in caring for breast cancer patients.
References: 1. Zitvogel L, Tesniere A, Kroemer G. Cancer despite immunosurveillance: immunoselection and immunosubversion. Nature Reviews Immunology (2006). https://www.nature.com/articles/nri1936. 2. Loi S, et al. Tumor-Infiltrating Lymphocytes and Prognosis: A Pooled Individual Patient Analysis of Early-Stage Triple-Negative Breast Cancers. J Clin Oncol 37, 559–569 (2019). 3. Dudgeon SN, et al. A pathologist-annotated dataset for validating artificial intelligence: A project description and pilot study. Journal of Pathology Informatics 12, 45 (2021). 4. Salgado R, et al. The evaluation of tumor-infiltrating lymphocytes (TILs) in breast cancer: recommendations by an International TILs Working Group 2014. Annals of Oncology 26, 259–271 (2015).
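The binning and selection procedure described in the abstract above (grouping ROIs by mean sTILs density, then taking the highest- and lowest-variability ROIs per bin) can be sketched in a few lines. This is an illustrative sketch only: the mapping layout, the `n_per_group` parameter, and the use of the sample standard deviation as the variability measure are assumptions, not the study's exact selection criteria.

```python
from statistics import mean, stdev

def density_bin(d):
    """Bin a mean sTILs density (percent) per the study's categories."""
    if d <= 10:
        return "10% or less"
    if d <= 40:
        return "11% to 40%"
    return "greater than 40%"

def select_rois(estimates_by_roi, n_per_group=12):
    """estimates_by_roi: {roi_id: [pathologist density estimates]}.
    Returns, per density bin, the ROIs with the lowest and highest
    interobserver variability (here: sample standard deviation)."""
    bins = {}
    for roi, ests in estimates_by_roi.items():
        bins.setdefault(density_bin(mean(ests)), []).append((stdev(ests), roi))
    selected = {}
    for b, scored in bins.items():
        scored.sort()  # ascending by variability
        selected[b] = {
            "lowest_variability": [roi for _, roi in scored[:n_per_group]],
            "highest_variability": [roi for _, roi in scored[-n_per_group:]],
        }
    return selected
```

With per-bin selection of the extremes of variability, the review sessions concentrate expert commentary on exactly the ROIs where pathologists disagree most (and, for contrast, least).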
Citation Format: Victor Garcia, Amy Ly, Roberto Salgado, Brandon D. Gallas. Educating pathologists in quantitating stromal tumor-infiltrating lymphocytes in breast cancer for artificial intelligence applications [abstract]. In: Proceedings of the 2022 San Antonio Breast Cancer Symposium; 2022 Dec 6-10; San Antonio, TX. Philadelphia (PA): AACR; Cancer Res 2023;83(5 Suppl):Abstract nr P1-08-05.
Affiliation(s)
- Amy Ly
- Massachusetts General Hospital
- Roberto Salgado
- GZA-ZNA Hospitals, Antwerp, Belgium; Peter MacCallum Cancer Centre, Melbourne, Australia
- Brandon D. Gallas
- U.S. FDA/CDRH/OSEL Division of Imaging, Diagnostics, and Software Reliability, Silver Spring, Maryland
6. Du H, Wen S, Guo Y, Jin F, Gallas BD. Single reader between-cases AUC estimator with nested data. Stat Methods Med Res 2022; 31:2069-2086. [PMID: 35790462; DOI: 10.1177/09622802221111539]
Abstract
The area under the receiver operating characteristic curve (AUC) is widely used in evaluating diagnostic performance for many clinical tasks. It is still challenging to evaluate the reading performance of distinguishing between positive and negative regions of interest (ROIs) in the nested-data problem, where multiple ROIs are nested within the cases. To address this issue, we identify two kinds of AUC estimators, within-cases AUC and between-cases AUC. We focus on the between-cases AUC estimator, since our main research interest is in patient-level diagnostic performance rather than location-level performance (the ability to separate ROIs with and without disease within each patient). Another reason is that as the case number increases, the number of between-cases paired ROIs is much larger than the number of within-cases ROIs. We provide estimators for the variance of the between-cases AUC and for the covariance when there are two readers. We derive and prove the above estimators' theoretical values based on a simulation model and characterize their behavior using Monte Carlo simulation results. We also provide a real-data example. Moreover, we connect the distribution-based simulation model with the simulation model based on the linear mixed-effect model, which helps better understand the sources of variation in the simulated dataset.
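The within-cases/between-cases distinction above can be made concrete with a small sketch: a generic empirical (Wilcoxon-Mann-Whitney) AUC restricted to positive/negative ROI pairs drawn from different cases. This is an illustrative implementation of the between-cases idea, not the paper's code, and its variance and covariance estimators are not reproduced here.

```python
def between_cases_auc(pos, neg):
    """pos, neg: lists of (case_id, score) for diseased and non-diseased
    ROIs. The empirical AUC is the fraction of positive/negative pairs
    ordered correctly (ties count 1/2); the between-cases estimator
    restricts the average to pairs from different cases."""
    num, n_pairs = 0.0, 0
    for case_p, score_p in pos:
        for case_n, score_n in neg:
            if case_p == case_n:      # skip within-case pairs
                continue
            n_pairs += 1
            if score_p > score_n:
                num += 1.0
            elif score_p == score_n:
                num += 0.5
    return num / n_pairs if n_pairs else float("nan")
```

Because every positive ROI is paired with every negative ROI from every *other* case, the number of contributing pairs grows roughly quadratically with the case count, which is one reason the abstract favors the between-cases estimator for patient-level performance.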
Affiliation(s)
- Hongfei Du
- Statistics Department, The George Washington University, Washington, DC, USA
- Si Wen
- U.S. Food and Drug Administration, CDRH, OSEL, DIDSR, Silver Spring, MD, USA
- Yufei Guo
- Statistics Department, The George Washington University, Washington, DC, USA
- Fang Jin
- Statistics Department, The George Washington University, Washington, DC, USA
- Brandon D Gallas
- U.S. Food and Drug Administration, CDRH, OSEL, DIDSR, Silver Spring, MD, USA
7. Elfer K, Dudgeon S, Garcia V, Blenman K, Hytopoulos E, Wen S, Li X, Ly A, Werness B, Sheth MS, Amgad M, Gupta R, Saltz J, Hanna MG, Ehinger A, Peeters D, Salgado R, Gallas BD. Pilot study to evaluate tools to collect pathologist annotations for validating machine learning algorithms. J Med Imaging (Bellingham) 2022; 9:047501. [PMID: 35911208; PMCID: PMC9326105; DOI: 10.1117/1.jmi.9.4.047501] [Received: 04/14/2022; Accepted: 06/28/2022]
Abstract
Purpose: Validation of artificial intelligence (AI) algorithms in digital pathology with a reference standard is necessary before widespread clinical use, but few examples focus on creating a reference standard based on pathologist annotations. This work assesses the results of a pilot study that collects density estimates of stromal tumor-infiltrating lymphocytes (sTILs) in breast cancer biopsy specimens. This work will inform the creation of a validation dataset for the evaluation of AI algorithms fit for a regulatory purpose.
Approach: Collaborators and crowdsourced pathologists contributed glass slides, digital images, and annotations. Here, "annotations" refer to any marks, segmentations, measurements, or labels a pathologist adds to a report, image, region of interest (ROI), or biological feature. Pathologists estimated sTILs density in 640 ROIs from hematoxylin and eosin stained slides of 64 patients via two modalities: an optical light microscope and two digital image viewing platforms.
Results: The pilot study generated 7373 sTILs density estimates from 29 pathologists. Analysis of annotations found the variability of density estimates per ROI increases with the mean; the root mean square differences were 4.46, 14.25, and 26.25 as the mean density ranged from 0% to 10%, 11% to 40%, and 41% to 100%, respectively. The pilot study informs three areas of improvement for future work: technical workflows, annotation platforms, and agreement analysis methods. Upgrades to the workflows and platforms will improve operability and increase annotation speed and consistency.
Conclusions: Exploratory data analysis demonstrates the need to develop new statistical approaches for agreement. The pilot study dataset and analysis methods are publicly available to allow community feedback. The development and results of the validation dataset will be publicly available to serve as an instructive tool that can be replicated by developers and researchers.
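A hedged sketch of the kind of per-bin agreement summary the abstract reports: pool the squared deviations of each pathologist's estimate from its ROI mean within bins of the mean density, and take one root mean square per bin. The deviation-from-ROI-mean definition, the bin labels, and the data layout are assumptions for illustration; the study's exact RMS statistic may differ.

```python
from math import sqrt
from statistics import mean

def rms_by_bin(estimates_by_roi, edges=(10, 40)):
    """estimates_by_roi: {roi_id: [pathologist density estimates (%)]}.
    Pools squared deviations from each ROI's mean within bins of that
    mean, returning one RMS value per density bin."""
    lo, hi = edges
    sq = {"0-10%": [], "11-40%": [], "41-100%": []}
    for ests in estimates_by_roi.values():
        m = mean(ests)
        key = "0-10%" if m <= lo else "11-40%" if m <= hi else "41-100%"
        sq[key].extend((e - m) ** 2 for e in ests)
    return {k: sqrt(mean(v)) if v else float("nan") for k, v in sq.items()}
```

Under this reading, an RMS that grows across the bins (as in the reported 4.46, 14.25, 26.25) reflects interobserver spread increasing with the mean density, which is why a single pooled agreement statistic would be misleading.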
Affiliation(s)
- Katherine Elfer
- United States Food and Drug Administration, Center for Devices and Radiological Health, Office of Science and Engineering Laboratories, Division of Imaging Diagnostics & Software Reliability, Silver Spring, Maryland, United States
- National Institutes of Health, National Cancer Institute, Division of Cancer Prevention, Cancer Prevention Fellowship Program, Bethesda, Maryland, United States
| | - Sarah Dudgeon
- Yale University Computational Biology and Bioinformatics, New Haven, Connecticut, United States
- Yale New Haven Hospital, Center for Outcomes Research and Evaluation, New Haven, Connecticut, United States
| | - Victor Garcia
- United States Food and Drug Administration, Center for Devices and Radiological Health, Office of Science and Engineering Laboratories, Division of Imaging Diagnostics & Software Reliability, Silver Spring, Maryland, United States
- Kim Blenman
- School of Medicine, Yale Cancer Center, Department of Internal Medicine, Section of Medical Oncology, New Haven, Connecticut, United States
- Yale University, School of Engineering and Applied Science, Department of Computer Science, New Haven, Connecticut, United States
- Si Wen
- United States Food and Drug Administration, Center for Devices and Radiological Health, Office of Science and Engineering Laboratories, Division of Imaging Diagnostics & Software Reliability, Silver Spring, Maryland, United States
- Xiaoxian Li
- Emory University School of Medicine, Department of Pathology and Laboratory Medicine, Atlanta, Georgia, United States
- Amy Ly
- Massachusetts General Hospital, Boston, Massachusetts, United States
- Bruce Werness
- Inova Health System Department of Pathology, Falls Church, Virginia, United States
- Arrive Bio LLC, San Francisco, California, United States
- Manasi S. Sheth
- United States Food and Drug Administration (FDA), Center for Devices and Radiological Health, Office of Product Evaluation and Quality, Office of Clinical Evidence and Analysis, Division of Biostatistics, White Oak, Maryland, United States
- Mohamed Amgad
- Northwestern University Feinberg School of Medicine, Department of Pathology, Chicago, Illinois, United States
- Rajarsi Gupta
- SUNY Stony Brook Medicine, Department of Biomedical Informatics, Stony Brook, New York, United States
- Joel Saltz
- SUNY Stony Brook Medicine, Department of Biomedical Informatics, Stony Brook, New York, United States
- SUNY Stony Brook Medicine, Department of Pathology, Stony Brook, New York, United States
- Matthew G. Hanna
- Memorial Sloan Kettering Cancer Center, New York, New York, United States
- Anna Ehinger
- Lund University, Laboratory Medicine, Region Skåne, Department of Genetics and Pathology, Lund, Sweden
- Dieter Peeters
- Sint-Maarten Hospital, Department of Pathology, Mechelen, Belgium
- University of Antwerp, Department of Biomedical Sciences, Antwerp, Belgium
- Roberto Salgado
- Peter MacCallum Cancer Centre, Division of Research, Melbourne, Australia
- GZA-ZNA Hospitals, Department of Pathology, Antwerp, Belgium
- Brandon D. Gallas
- United States Food and Drug Administration, Center for Devices and Radiological Health, Office of Science and Engineering Laboratories, Division of Imaging Diagnostics & Software Reliability, Silver Spring, Maryland, United States
- Address all correspondence to Brandon D. Gallas,
8
Wen S, Gallas BD. Three-Way Mixed Effect ANOVA to Estimate MRMC Limits of Agreement. Stat Biopharm Res 2022. [DOI: 10.1080/19466315.2022.2063169]
Affiliation(s)
- Si Wen
- CDRH/OSEL Division of Imaging, Diagnostics, and Software Reliability, U.S. FDA, Silver Spring, MD
- Brandon D. Gallas
- CDRH/OSEL Division of Imaging, Diagnostics, and Software Reliability, U.S. FDA, Silver Spring, MD
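The limits-of-agreement concept this paper builds on can be illustrated with the classic two-reader Bland-Altman computation. The sketch below is a deliberately simplified illustration on invented toy scores, not the paper's three-way mixed-effect MRMC model, which generalises this idea across multiple readers and cases:

```python
import statistics

def limits_of_agreement(scores_a, scores_b):
    """Basic Bland-Altman 95% limits of agreement between two readers.

    Simplified illustration only: the cited paper extends this idea with a
    three-way mixed-effect ANOVA over multiple readers and cases (MRMC).
    """
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    mean_diff = statistics.mean(diffs)   # systematic bias between readers
    sd_diff = statistics.stdev(diffs)    # spread of the disagreement
    return mean_diff - 1.96 * sd_diff, mean_diff + 1.96 * sd_diff

# Toy sTIL density scores (percent) from two hypothetical readers on the same ROIs
reader_1 = [10, 25, 40, 5, 60, 30]
reader_2 = [12, 20, 45, 5, 55, 35]
lo, hi = limits_of_agreement(reader_1, reader_2)
```

With these toy scores, roughly 95% of future reader disagreements would be expected to fall between `lo` and `hi`.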
9
Gallas BD, Badano A, Dudgeon S, Elfer K, Garcia V, Lennerz JK, Myers K, Petrick N, Margerrison E. FDA Fosters Innovative Approaches in Research, Resources, and Collaboration. Nat Mach Intell 2022; 4:97-98. [PMID: 38410812] [PMCID: PMC10895477] [DOI: 10.1038/s42256-022-00450-2]
Affiliation(s)
- Brandon D Gallas
- FDA Center for Devices and Radiological Health, Office of Science and Engineering Laboratories, Division of Imaging, Diagnostics, and Software Reliability
- Aldo Badano
- FDA Center for Devices and Radiological Health, Office of Science and Engineering Laboratories, Division of Imaging, Diagnostics, and Software Reliability
- Sarah Dudgeon
- Yale New Haven Hospital, Center for Outcomes Research and Evaluation
- Yale University, Biological and Biomedical Sciences
- Katherine Elfer
- FDA Center for Devices and Radiological Health, Office of Science and Engineering Laboratories, Division of Imaging, Diagnostics, and Software Reliability
- Victor Garcia
- FDA Center for Devices and Radiological Health, Office of Science and Engineering Laboratories, Division of Imaging, Diagnostics, and Software Reliability
- Jochen K Lennerz
- Massachusetts General Hospital/Harvard Medical School, Department of Pathology, Center for Integrated Diagnostics, Boston, MA
- Nicholas Petrick
- FDA Center for Devices and Radiological Health, Office of Science and Engineering Laboratories, Division of Imaging, Diagnostics, and Software Reliability
- Ed Margerrison
- FDA Center for Devices and Radiological Health, Office of Science and Engineering Laboratories
10
Treviño M, Birdsong G, Carrigan A, Choyke P, Drew T, Eckstein M, Fernandez A, Gallas BD, Giger M, Hewitt SM, Horowitz TS, Jiang YV, Kudrick B, Martinez-Conde S, Mitroff S, Nebeling L, Saltz J, Samuelson F, Seltzer SE, Shabestari B, Shankar L, Siegel E, Tilkin M, Trueblood JS, Van Dyke AL, Venkatesan AM, Whitney D, Wolfe JM. Advancing Research on Medical Image Perception by Strengthening Multidisciplinary Collaboration. JNCI Cancer Spectr 2021; 6:6491257. [DOI: 10.1093/jncics/pkab099]
Abstract
Medical image interpretation is central to detecting, diagnosing, and staging cancer and many other disorders. At a time when medical imaging is being transformed by digital technologies and artificial intelligence, understanding the basic perceptual and cognitive processes underlying medical image interpretation is vital for increasing diagnosticians’ accuracy and performance, improving patient outcomes, and reducing diagnostician burn-out. Medical image perception remains substantially understudied. In September of 2019, the National Cancer Institute convened a multidisciplinary panel of radiologists and pathologists together with researchers working in medical image perception and adjacent fields of cognition and perception for the “Cognition and Medical Image Perception Think Tank.” The Think Tank’s key objectives were: to identify critical unsolved problems related to visual perception in pathology and radiology from the perspective of diagnosticians; to discuss how these clinically relevant questions could be addressed through cognitive and perception research; to identify barriers and solutions for transdisciplinary collaborations; to define ways to elevate the profile of cognition and perception research within the medical image community; to determine the greatest needs to advance medical image perception; and to outline future goals and strategies to evaluate progress. The Think Tank emphasized diagnosticians’ perspectives as the crucial starting point for medical image perception research, with diagnosticians describing their interpretation process and identifying perceptual and cognitive problems that arise. This paper reports the deliberations of the Think Tank participants to address these objectives and highlight opportunities to expand research on medical image perception.
Affiliation(s)
- Melissa Treviño
- National Cancer Institute, United States of America
- National Center for Complementary and Integrative Health, United States of America
- George Birdsong
- Emory University School of Medicine, United States of America
- Peter Choyke
- National Cancer Institute, United States of America
- Miguel Eckstein
- University of California, Santa Barbara, United States of America
- Anna Fernandez
- National Cancer Institute, United States of America
- Booz Allen Hamilton, United States of America
- Bonnie Kudrick
- Transportation Security Administration, United States of America
- Joseph Saltz
- Stony Brook University, United States of America
- Steven E Seltzer
- Brigham and Women’s Hospital, United States of America
- Harvard Medical School, United States of America
- Behrouz Shabestari
- National Institute of Biomedical Imaging and Bioengineering, United States of America
- Eliot Siegel
- University of Maryland School of Medicine, United States of America
- Mike Tilkin
- American College of Radiology, United States of America
- David Whitney
- University of California, Berkeley, United States of America
- Jeremy M Wolfe
- Brigham and Women’s Hospital, United States of America
- Harvard Medical School, United States of America
11
Dudgeon SN, Wen S, Hanna MG, Gupta R, Amgad M, Sheth M, Marble H, Huang R, Herrmann MD, Szu CH, Tong D, Werness B, Szu E, Larsimont D, Madabhushi A, Hytopoulos E, Chen W, Singh R, Hart SN, Sharma A, Saltz J, Salgado R, Gallas BD. A Pathologist-Annotated Dataset for Validating Artificial Intelligence: A Project Description and Pilot Study. J Pathol Inform 2021; 12:45. [PMID: 34881099] [PMCID: PMC8609287] [DOI: 10.4103/jpi.jpi_83_20]
Abstract
Purpose: Validating artificial intelligence algorithms for clinical use in medical images is a challenging endeavor due to a lack of standard reference data (ground truth). This topic typically occupies a small portion of the discussion in research papers, since most of the effort is focused on developing novel algorithms. In this work, we present a collaboration to create a validation dataset of pathologist annotations for algorithms that process whole slide images. We focus on data collection and evaluation of algorithm performance in the context of estimating the density of stromal tumor-infiltrating lymphocytes (sTILs) in breast cancer.
Methods: We digitized 64 glass slides of hematoxylin- and eosin-stained invasive ductal carcinoma core biopsies prepared at a single clinical site. A collaborating pathologist selected 10 regions of interest (ROIs) per slide for evaluation. We created training materials and workflows to crowdsource pathologist image annotations in two modes: an optical microscope and two digital platforms. The microscope platform allows the same ROIs to be evaluated in both modes. The workflows collect the ROI type, a decision on whether the ROI is appropriate for estimating the density of sTILs, and, if appropriate, the sTIL density value for that ROI.
Results: In total, 19 pathologists made 1645 ROI evaluations during a data collection event and the following 2 weeks. The pilot study yielded an abundant number of cases with nominal sTIL infiltration. Furthermore, we found that sTIL densities are correlated within a case and that there is notable pathologist variability. Consequently, we outline plans to improve our ROI and case sampling methods. We also outline statistical methods to account for ROI correlations within a case and pathologist variability when validating an algorithm.
Conclusion: We have built workflows for efficient data collection and tested them in a pilot study. As we prepare for pivotal studies, we will investigate methods to use the dataset as an external validation tool for algorithms. We will also consider what it will take for the dataset to be fit for a regulatory purpose: study size, patient population, and pathologist training and qualifications. To this end, we will elicit feedback from the Food and Drug Administration via the Medical Device Development Tool program and from the broader digital pathology and AI community. Ultimately, we intend to share the dataset, statistical methods, and lessons learned.
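The within-case correlation of ROI scores that the pilot study reports can be quantified with, for example, a one-way intraclass correlation. The following is a minimal sketch on invented toy data, not the study's actual statistical analysis:

```python
import statistics

def icc_oneway(groups):
    """One-way random-effects intraclass correlation, ICC(1).

    Quantifies how strongly scores cluster within a group (here, ROIs within
    a case), from standard one-way ANOVA mean squares between and within
    cases. Toy illustration only; assumes a balanced design.
    """
    k = len(groups[0])                          # ROIs per case
    grand = statistics.mean(x for g in groups for x in g)
    ms_between = k * sum((statistics.mean(g) - grand) ** 2
                         for g in groups) / (len(groups) - 1)
    ms_within = sum((x - statistics.mean(g)) ** 2
                    for g in groups for x in g) / (len(groups) * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

# Toy sTIL densities: 3 cases x 4 ROIs; scores cluster tightly within each case
cases = [[5, 8, 6, 7], [40, 45, 42, 38], [15, 18, 20, 16]]
icc = icc_oneway(cases)
```

A high ICC on data like this is exactly the within-case correlation that a validation study must account for, since correlated ROIs contribute less independent information than the raw ROI count suggests.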
Affiliation(s)
- Sarah N Dudgeon
- Division of Imaging Diagnostics and Software Reliability, Office of Science and Engineering Laboratories, Center for Devices and Radiological Health, United States Food and Drug Administration, White Oak, MD, USA
- Si Wen
- Division of Imaging Diagnostics and Software Reliability, Office of Science and Engineering Laboratories, Center for Devices and Radiological Health, United States Food and Drug Administration, White Oak, MD, USA
- Rajarsi Gupta
- Department of Biomedical Informatics, Stony Brook University, Stony Brook, NY, USA
- Mohamed Amgad
- Department of Pathology, Northwestern University, Chicago, IL, USA
- Manasi Sheth
- Division of Biostatistics, Center for Devices and Radiological Health, United States Food and Drug Administration, White Oak, MD, USA
- Hetal Marble
- Department of Pathology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Richard Huang
- Department of Pathology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Markus D Herrmann
- Department of Pathology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Evan Szu
- Arrive Bio, San Francisco, CA, USA
- Denis Larsimont
- Department of Pathology, Institut Jules Bordet, Brussels, Belgium
- Anant Madabhushi
- Louis Stokes Cleveland Veterans Administration Medical Center, Cleveland, OH, USA
- Weijie Chen
- Division of Imaging Diagnostics and Software Reliability, Office of Science and Engineering Laboratories, Center for Devices and Radiological Health, United States Food and Drug Administration, White Oak, MD, USA
- Rajendra Singh
- Northwell Health and Zucker School of Medicine, New York, NY, USA
- Steven N Hart
- Department of Pathology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Ashish Sharma
- Department of Biomedical Informatics, Emory University, Atlanta, GA, USA
- Joel Saltz
- Department of Biomedical Informatics, Stony Brook University, Stony Brook, NY, USA
- Roberto Salgado
- Division of Research, Peter MacCallum Cancer Centre, Melbourne, Australia
- Department of Pathology, GZA-ZNA Hospitals, Antwerp, Belgium
- Brandon D Gallas
- Division of Imaging Diagnostics and Software Reliability, Office of Science and Engineering Laboratories, Center for Devices and Radiological Health, United States Food and Drug Administration, White Oak, MD, USA
12
Marble HD, Huang R, Dudgeon SN, Lowe A, Herrmann MD, Blakely S, Leavitt MO, Isaacs M, Hanna MG, Sharma A, Veetil J, Goldberg P, Schmid JH, Lasiter L, Gallas BD, Abels E, Lennerz JK. A Regulatory Science Initiative to Harmonize and Standardize Digital Pathology and Machine Learning Processes to Speed up Clinical Innovation to Patients. J Pathol Inform 2020; 11:22. [PMID: 33042601] [PMCID: PMC7518200] [DOI: 10.4103/jpi.jpi_27_20]
Abstract
Unlocking the full potential of pathology data by gaining computational access to histological pixel data and metadata (digital pathology) is one of the key promises of computational pathology. Despite scientific progress and several regulatory approvals for primary diagnosis using whole-slide imaging, true clinical adoption at scale is slower than anticipated. In the U.S., advances in digital pathology are often siloed pursuits by individual stakeholders, and to our knowledge, there has not been a systematic approach to advance the field through a regulatory science initiative. The Alliance for Digital Pathology (the Alliance) is a recently established, volunteer, collaborative, regulatory science initiative to standardize digital pathology processes to speed up innovation to patients. The purpose is: (1) to account for the patient perspective by including patient advocacy; (2) to investigate and develop methods and tools for the evaluation of effectiveness, safety, and quality to specify risks and benefits in the precompetitive phase; (3) to help strategize the sequence of clinically meaningful deliverables; (4) to encourage and streamline the development of ground-truth data sets for machine learning model development and validation; and (5) to clarify regulatory pathways by investigating relevant regulatory science questions. The Alliance accepts participation from all stakeholders, and we solicit clinically relevant proposals that will benefit the field at large. The initiative will dissolve once a clinical, interoperable, modularized, integrated solution (from tissue acquisition to diagnostic algorithm) has been implemented. In times of rapidly evolving discoveries, scientific input from subject-matter experts is one essential element to inform regulatory guidance and decision-making. 
The Alliance aims to establish and promote synergistic regulatory science efforts that will leverage diverse inputs to move digital pathology forward and ultimately improve patient care.
Affiliation(s)
- Hetal Desai Marble
- Department of Pathology, Center for Integrated Diagnostics, Harvard Medical School, Massachusetts General Hospital, Boston, MA, USA
- Richard Huang
- Department of Pathology, Center for Integrated Diagnostics, Harvard Medical School, Massachusetts General Hospital, Boston, MA, USA
- Sarah Nixon Dudgeon
- Division of Imaging, Diagnostics, and Software Reliability, Center for Devices and Radiological Health, Food and Drug Administration, Office of Science and Engineering Laboratories, Silver Spring, MD, USA
- Markus D Herrmann
- Department of Pathology, Harvard Medical School, Massachusetts General Hospital, Boston, MA, USA
- Mike Isaacs
- Department of Pathology and Immunology, Washington University School of Medicine, St. Louis, MO, USA
- Matthew G Hanna
- Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Ashish Sharma
- Department of Biomedical Informatics, Emory University School of Medicine, Atlanta, GA, USA
- Jithesh Veetil
- Medical Device Innovation Consortium, Arlington, VA, USA
- Brandon D Gallas
- Division of Imaging, Diagnostics, and Software Reliability, Center for Devices and Radiological Health, Food and Drug Administration, Office of Science and Engineering Laboratories, Silver Spring, MD, USA
- Jochen K Lennerz
- Department of Pathology, Center for Integrated Diagnostics, Harvard Medical School, Massachusetts General Hospital, Boston, MA, USA
13
Kos Z, Roblin E, Kim RS, Michiels S, Gallas BD, Chen W, van de Vijver KK, Goel S, Adams S, Demaria S, Viale G, Nielsen TO, Badve SS, Symmans WF, Sotiriou C, Rimm DL, Hewitt S, Denkert C, Loibl S, Luen SJ, Bartlett JMS, Savas P, Pruneri G, Dillon DA, Cheang MCU, Tutt A, Hall JA, Kok M, Horlings HM, Madabhushi A, van der Laak J, Ciompi F, Laenkholm AV, Bellolio E, Gruosso T, Fox SB, Araya JC, Floris G, Hudeček J, Voorwerk L, Beck AH, Kerner J, Larsimont D, Declercq S, Van den Eynden G, Pusztai L, Ehinger A, Yang W, AbdulJabbar K, Yuan Y, Singh R, Hiley C, Bakir MA, Lazar AJ, Naber S, Wienert S, Castillo M, Curigliano G, Dieci MV, André F, Swanton C, Reis-Filho J, Sparano J, Balslev E, Chen IC, Stovgaard EIS, Pogue-Geile K, Blenman KRM, Penault-Llorca F, Schnitt S, Lakhani SR, Vincent-Salomon A, Rojo F, Braybrooke JP, Hanna MG, Soler-Monsó MT, Bethmann D, Castaneda CA, Willard-Gallo K, Sharma A, Lien HC, Fineberg S, Thagaard J, Comerma L, Gonzalez-Ericsson P, Brogi E, Loi S, Saltz J, Klaushen F, Cooper L, Amgad M, Moore DA, Salgado R. Pitfalls in assessing stromal tumor infiltrating lymphocytes (sTILs) in breast cancer. NPJ Breast Cancer 2020; 6:17. [PMID: 32411819] [PMCID: PMC7217863] [DOI: 10.1038/s41523-020-0156-0]
Abstract
Stromal tumor-infiltrating lymphocytes (sTILs) are important prognostic and predictive biomarkers in triple-negative (TNBC) and HER2-positive breast cancer. Incorporating sTILs into clinical practice necessitates reproducible assessment. Previously developed standardized scoring guidelines have been widely embraced by the clinical and research communities. We evaluated sources of variability in sTIL assessment by pathologists in three previous sTIL ring studies. We identify common challenges and evaluate the impact of discrepancies on outcome estimates in early TNBC using a newly developed prognostic tool. Discordant sTIL assessment is driven by heterogeneity in lymphocyte distribution. Additional factors include: technical slide-related issues; scoring outside the tumor boundary; tumors with minimal assessable stroma; including lymphocytes associated with other structures; and including other inflammatory cells. Small variations in sTIL assessment modestly alter risk estimation in early TNBC but have the potential to affect treatment selection if cutpoints are employed. Scoring and averaging multiple areas, as well as use of reference images, improve consistency of sTIL evaluation. Moreover, to assist in avoiding the pitfalls identified in this analysis, we developed an educational resource available at www.tilsinbreastcancer.org/pitfalls.
Affiliation(s)
- Zuzana Kos
- Department of Pathology, BC Cancer - Vancouver, Vancouver, BC Canada
- Elvire Roblin
- Department of Biostatistics and Epidemiology, Gustave Roussy, University Paris-Saclay, Villejuif, France
- Oncostat U1018, Inserm, University Paris-Saclay, labeled Ligue Contre le Cancer, Villejuif, France
- Rim S. Kim
- National Surgical Adjuvant Breast and Bowel Project (NSABP)/NRG Oncology, Pittsburgh, PA USA
- Stefan Michiels
- Department of Biostatistics and Epidemiology, Gustave Roussy, University Paris-Saclay, Villejuif, France
- Oncostat U1018, Inserm, University Paris-Saclay, labeled Ligue Contre le Cancer, Villejuif, France
- Brandon D. Gallas
- Division of Imaging, Diagnostics, and Software Reliability (DIDSR); Office of Science and Engineering Laboratories (OSEL); Center for Devices and Radiological Health (CDRH), US Food and Drug Administration (US FDA), Silver Spring, MD USA
- Weijie Chen
- Division of Imaging, Diagnostics, and Software Reliability (DIDSR); Office of Science and Engineering Laboratories (OSEL); Center for Devices and Radiological Health (CDRH), US Food and Drug Administration (US FDA), Silver Spring, MD USA
- Koen K. van de Vijver
- Department of Pathology, University Hospital Antwerp, Antwerp, Belgium
- Department of Pathology, Ghent University Hospital, Cancer Research Institute Ghent (CRIG), Ghent, Belgium
- Shom Goel
- The Sir Peter MacCallum Cancer Centre, Melbourne, VIC Australia
- Peter MacCallum Department of Oncology, University of Melbourne, Melbourne, Victoria Australia
- Sylvia Adams
- Perlmutter Cancer Center, New York University Medical School, New York, NY USA
- Sandra Demaria
- Departments of Radiation Oncology and Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, NY USA
- Giuseppe Viale
- Department of Pathology, Istituto Europeo di Oncologia, University of Milan, Milan, Italy
- Torsten O. Nielsen
- Department of Pathology and Laboratory Medicine, University of British Columbia, Vancouver, Canada
- Sunil S. Badve
- Department of Pathology and Laboratory Medicine, Indiana University, Indianapolis, USA
- W. Fraser Symmans
- Department of Pathology, The University of Texas M.D. Anderson Cancer Center, Houston, TX USA
- Christos Sotiriou
- Department of Medical Oncology, Institut Jules Bordet, Université Libre de Bruxelles, Brussels, Belgium
- David L. Rimm
- Department of Pathology, Yale School of Medicine, New Haven, CT USA
- Stephen Hewitt
- Laboratory of Pathology, National Cancer Institute, NIH, Bethesda, MD USA
- Carsten Denkert
- Institute of Pathology, Universitätsklinikum Gießen und Marburg GmbH, Standort Marburg and Philipps-Universität Marburg, Marburg, Germany
- Stephen J. Luen
- Peter MacCallum Department of Oncology, University of Melbourne, Melbourne, Victoria Australia
- Division of Research and Cancer Medicine, Peter MacCallum Cancer Centre, University of Melbourne, Melbourne, VIC Australia
- John M. S. Bartlett
- Ontario Institute for Cancer Research, Toronto, ON Canada
- University of Edinburgh Cancer Research Centre, Edinburgh, UK
- Peter Savas
- Peter MacCallum Department of Oncology, University of Melbourne, Melbourne, Victoria Australia
- Division of Research and Cancer Medicine, Peter MacCallum Cancer Centre, University of Melbourne, Melbourne, VIC Australia
- Giancarlo Pruneri
- Department of Pathology, IRCCS Fondazione Istituto Nazionale Tumori and University of Milan, School of Medicine, Milan, Italy
- Deborah A. Dillon
- Department of Pathology, Brigham and Women’s Hospital, Boston, MA USA
- Department of Pathology, Dana Farber Cancer Institute, Boston, MA USA
- Maggie Chon U. Cheang
- Institute of Cancer Research Clinical Trials and Statistics Unit, The Institute of Cancer Research, Surrey, UK
- Andrew Tutt
- Breast Cancer Now Toby Robins Research Centre, The Institute of Cancer Research, London, UK
- Marleen Kok
- Department of Medical Oncology and Division of Tumor Biology & Immunology, The Netherlands Cancer Institute, Amsterdam, The Netherlands
- Hugo M. Horlings
- Department of Pathology, University Hospital Antwerp, Antwerp, Belgium
- Division of Molecular Pathology, The Netherlands Cancer Institute, Amsterdam, The Netherlands
- Anant Madabhushi
- Department of Biomedical Engineering, Case Western Reserve University, Cleveland, OH USA
- Louis Stokes Cleveland Veterans Affairs Medical Center, Cleveland, OH USA
- Jeroen van der Laak
- Computational Pathology Group, Department of Pathology, Radboud University Medical Center, Nijmegen, Netherlands
- Francesco Ciompi
- Computational Pathology Group, Department of Pathology, Radboud University Medical Center, Nijmegen, Netherlands
- Enrique Bellolio
- Departamento de Anatomía Patológica, Universidad de La Frontera, Temuco, Chile
- Stephen B. Fox
- The Sir Peter MacCallum Cancer Centre, Melbourne, VIC Australia
- Department of Pathology, Peter MacCallum Cancer Centre, Melbourne, VIC Australia
- Giuseppe Floris
- KU Leuven - University of Leuven, Department of Imaging and Pathology, Laboratory of Translational Cell & Tissue Research and KU Leuven - University Hospitals Leuven, Department of Pathology, Leuven, Belgium
- Jan Hudeček
- Department of Research IT, The Netherlands Cancer Institute, Amsterdam, The Netherlands
- Leonie Voorwerk
- Division of Tumor Biology & Immunology, The Netherlands Cancer Institute, Amsterdam, The Netherlands
- Denis Larsimont
- Department of Pathology, Jules Bordet Institute, Brussels, Belgium
- Lajos Pusztai
- Department of Internal Medicine, Section of Medical Oncology, Yale Cancer Center, Yale School of Medicine, New Haven, CT USA
- Anna Ehinger
- Department of Clinical Genetics and Pathology, Skåne University Hospital, Lund University, Lund, Sweden
- Wentao Yang
- Department of Pathology, Fudan University Shanghai Cancer Centre, Shanghai, China
- Khalid AbdulJabbar
- Centre for Evolution and Cancer; Division of Molecular Pathology, The Institute of Cancer Research, London, UK
- Yinyin Yuan
- Centre for Evolution and Cancer; Division of Molecular Pathology, The Institute of Cancer Research, London, UK
- Rajendra Singh
- Icahn School of Medicine at Mt. Sinai, New York, NY 10029 USA
- Crispin Hiley
- Cancer Research UK Lung Cancer Centre of Excellence, University College London Cancer Institute, University College London, London, UK
- Maise al Bakir
- Cancer Research UK Lung Cancer Centre of Excellence, University College London Cancer Institute, University College London, London, UK
- Alexander J. Lazar
- Departments of Pathology, Genomic Medicine, Dermatology, and Translational Molecular Pathology, The University of Texas MD Anderson Cancer Center, Houston, TX USA
- Stephen Naber
- Department of Pathology and Laboratory Medicine, Tufts Medical Center, Boston, USA
- Stephan Wienert
- Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health, Institute of Pathology, Charitéplatz 1, 10117 Berlin, Germany
- Miluska Castillo
- Department of Medical Oncology and Research, Instituto Nacional de Enfermedades Neoplasicas, Lima, 15038 Peru
- Maria-Vittoria Dieci
- Medical Oncology 2, Istituto Oncologico Veneto IOV - IRCCS, Padova, Italy
- Department of Surgery, Oncology and Gastroenterology, University of Padova, Padova, Italy
- Fabrice André
- Department of Medical Oncology, Institut Gustave Roussy, Villejuif, France
- Charles Swanton
- Cancer Research UK Lung Cancer Centre of Excellence, University College London Cancer Institute, University College London, London, UK
- Francis Crick Institute, Midland Road, London, UK
- Jorge Reis-Filho
- Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, NY USA
- Human Oncology and Pathogenesis Program, Memorial Sloan Kettering Cancer Center, New York, NY USA
- Joseph Sparano
- Montefiore Medical Center, Albert Einstein College of Medicine, Bronx, NY USA
- Eva Balslev
- Department of Pathology, Herlev and Gentofte Hospital, Herlev, Denmark
- I-Chun Chen
- Department of Oncology, National Taiwan University Cancer Center, Taipei, Taiwan
- Department of Oncology, National Taiwan University Hospital, Taipei, Taiwan
- Graduate Institute of Oncology, College of Medicine, National Taiwan University, Taipei, Taiwan
- Katherine Pogue-Geile
- National Surgical Adjuvant Breast and Bowel Project (NSABP)/NRG Oncology, Pittsburgh, PA USA
- Kim R. M. Blenman
- Department of Internal Medicine, Section of Medical Oncology, Yale Cancer Center, Yale School of Medicine, New Haven, CT USA
- Stuart Schnitt
- Department of Pathology, Brigham and Women’s Hospital, Boston, MA USA
- Sunil R. Lakhani
- The University of Queensland Centre for Clinical Research and Pathology Queensland, Brisbane, QLD Australia
- Anne Vincent-Salomon
- Institut Curie, Paris Sciences Lettres Université, Inserm U934, Department of Pathology, Paris, France
- Federico Rojo
- Pathology Department, Instituto de Investigación Sanitaria Fundación Jiménez Díaz (IIS-FJD) - CIBERONC, Madrid, Spain
- GEICAM-Spanish Breast Cancer Research Group, Madrid, Spain
- Jeremy P. Braybrooke
- Nuffield Department of Population Health, University of Oxford, Oxford and Department of Medical Oncology, University Hospitals Bristol NHS Foundation Trust, Bristol, UK
- Matthew G. Hanna
- Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, NY USA
- M. Teresa Soler-Monsó
- Department of Pathology, Bellvitge University Hospital, IDIBELL, Breast Unit, Catalan Institute of Oncology, L'Hospitalet de Llobregat, Barcelona, 08908 Catalonia Spain
- Daniel Bethmann
- University Hospital Halle (Saale), Institute of Pathology, Halle (Saale), Germany
- Carlos A. Castaneda
- Department of Medical Oncology and Research, Instituto Nacional de Enfermedades Neoplasicas, Lima, 15038 Peru
- Karen Willard-Gallo
- Molecular Immunology Unit, Institut Jules Bordet, Université Libre de Bruxelles, Brussels, Belgium
- Ashish Sharma
- Department of Biomedical Informatics, Emory University, Atlanta, GA USA
- Huang-Chun Lien
- Department of Pathology, National Taiwan University Hospital, Taipei, Taiwan
- Susan Fineberg
- Department of Pathology, Montefiore Medical Center and the Albert Einstein College of Medicine, Bronx, NY USA
- Jeppe Thagaard
- DTU Compute, Department of Applied Mathematics, Technical University of Denmark; Visiopharm A/S, Hørsholm, Denmark
- Laura Comerma
- GEICAM-Spanish Breast Cancer Research Group, Madrid, Spain
- Pathology Department, Hospital del Mar, Parc de Salut Mar, Barcelona, Spain
- Paula Gonzalez-Ericsson
- Breast Cancer Program, Vanderbilt-Ingram Cancer Center, Vanderbilt University Medical Center, Nashville, TN USA
- Edi Brogi
- Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, NY USA
- Sherene Loi
- Peter MacCallum Department of Oncology, University of Melbourne, Melbourne, Victoria Australia
- Division of Research and Cancer Medicine, Peter MacCallum Cancer Centre, University of Melbourne, Melbourne, VIC Australia
- Joel Saltz
- Biomedical Informatics Department, Stony Brook University, Stony Brook, NY USA
- Frederick Klaushen
- Institute of Pathology, Charité Universitätsmedizin Berlin, Berlin, Germany
- Lee Cooper
- Department of Pathology, Northwestern University Feinberg School of Medicine, Chicago, IL USA
- Mohamed Amgad
- Department of Biomedical Informatics, Emory University School of Medicine, Atlanta, GA USA
- David A. Moore
- Department of Pathology, UCL Cancer Institute, UCL, London, UK
- University College Hospitals NHS Trust, London, UK
- Roberto Salgado
- Division of Research and Cancer Medicine, Peter MacCallum Cancer Centre, University of Melbourne, Melbourne, VIC Australia
- Department of Pathology, GZA-ZNA, Antwerp, Belgium
14
Amgad M, Stovgaard ES, Balslev E, Thagaard J, Chen W, Dudgeon S, Sharma A, Kerner JK, Denkert C, Yuan Y, AbdulJabbar K, Wienert S, Savas P, Voorwerk L, Beck AH, Madabhushi A, Hartman J, Sebastian MM, Horlings HM, Hudeček J, Ciompi F, Moore DA, Singh R, Roblin E, Balancin ML, Mathieu MC, Lennerz JK, Kirtani P, Chen IC, Braybrooke JP, Pruneri G, Demaria S, Adams S, Schnitt SJ, Lakhani SR, Rojo F, Comerma L, Badve SS, Khojasteh M, Symmans WF, Sotiriou C, Gonzalez-Ericsson P, Pogue-Geile KL, Kim RS, Rimm DL, Viale G, Hewitt SM, Bartlett JMS, Penault-Llorca F, Goel S, Lien HC, Loibl S, Kos Z, Loi S, Hanna MG, Michiels S, Kok M, Nielsen TO, Lazar AJ, Bago-Horvath Z, Kooreman LFS, van der Laak JAWM, Saltz J, Gallas BD, Kurkure U, Barnes M, Salgado R, Cooper LAD. Report on computational assessment of Tumor Infiltrating Lymphocytes from the International Immuno-Oncology Biomarker Working Group. NPJ Breast Cancer 2020; 6:16. [PMID: 32411818; PMCID: PMC7217824; DOI: 10.1038/s41523-020-0154-2]
Abstract
Assessment of tumor-infiltrating lymphocytes (TILs) is increasingly recognized as an integral part of the prognostic workflow in triple-negative (TNBC) and HER2-positive breast cancer, as well as many other solid tumors. This recognition has come about thanks to standardized visual reporting guidelines, which helped to reduce inter-reader variability. Now, there are ripe opportunities to employ computational methods that extract spatio-morphologic predictive features, enabling computer-aided diagnostics. We detail the benefits of computational TILs assessment, the readiness of TILs scoring for computational assessment, and outline considerations for overcoming key barriers to clinical translation in this arena. Specifically, we discuss: 1. ensuring that computational workflows closely capture visual guidelines and standards; 2. challenges and thoughts on standards for assessment of algorithms, including training, preanalytical, analytical, and clinical validation; 3. perspectives on how to realize the potential of machine learning models and to overcome the perceptual and practical limits of visual scoring.
Affiliation(s)
- Mohamed Amgad
- Department of Biomedical Informatics, Emory University School of Medicine, Atlanta, GA, USA
- Eva Balslev
- Department of Pathology, Herlev and Gentofte Hospital, University of Copenhagen, Herlev, Denmark
- Jeppe Thagaard
- DTU Compute, Department of Applied Mathematics, Technical University of Denmark, Lyngby, Denmark
- Visiopharm A/S, Hørsholm, Denmark
- Weijie Chen
- FDA/CDRH/OSEL/Division of Imaging, Diagnostics, and Software Reliability, Silver Spring, MD, USA
- Sarah Dudgeon
- FDA/CDRH/OSEL/Division of Imaging, Diagnostics, and Software Reliability, Silver Spring, MD, USA
- Ashish Sharma
- Department of Biomedical Informatics, Emory University School of Medicine, Atlanta, GA, USA
- Carsten Denkert
- Institut für Pathologie, Universitätsklinikum Gießen und Marburg GmbH, Standort Marburg, Philipps-Universität Marburg, Marburg, Germany
- Institute of Pathology, Philipps-University Marburg, Marburg, Germany
- German Cancer Consortium (DKTK), Partner Site Charité, Berlin, Germany
- Yinyin Yuan
- Centre for Evolution and Cancer, The Institute of Cancer Research, London, UK
- Division of Molecular Pathology, The Institute of Cancer Research, London, UK
- Khalid AbdulJabbar
- Centre for Evolution and Cancer, The Institute of Cancer Research, London, UK
- Division of Molecular Pathology, The Institute of Cancer Research, London, UK
- Stephan Wienert
- Institut für Pathologie, Universitätsklinikum Gießen und Marburg GmbH, Standort Marburg, Philipps-Universität Marburg, Marburg, Germany
- Peter Savas
- Division of Research and Cancer Medicine, Peter MacCallum Cancer Centre, University of Melbourne, Victoria, Australia
- Sir Peter MacCallum Department of Oncology, University of Melbourne, Parkville, Australia
- Leonie Voorwerk
- Department of Tumor Biology & Immunology, The Netherlands Cancer Institute, Amsterdam, The Netherlands
- Anant Madabhushi
- Case Western Reserve University, Department of Biomedical Engineering, Cleveland, OH, USA
- Louis Stokes Cleveland Veterans Administration Medical Center, Cleveland, OH, USA
- Johan Hartman
- Department of Oncology and Pathology, Karolinska Institutet and University Hospital, Solna, Sweden
- Manu M. Sebastian
- Departments of Epigenetics and Molecular Carcinogenesis, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Hugo M. Horlings
- Division of Molecular Pathology, The Netherlands Cancer Institute, Amsterdam, The Netherlands
- Jan Hudeček
- Department of Research IT, The Netherlands Cancer Institute, Amsterdam, The Netherlands
- Francesco Ciompi
- Department of Pathology, Radboud University Medical Center, Nijmegen, The Netherlands
- David A. Moore
- Department of Pathology, UCL Cancer Institute, London, UK
- Rajendra Singh
- Department of Pathology and Laboratory Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Elvire Roblin
- Université Paris-Saclay, Univ. Paris-Sud, Villejuif, France
- Marcelo Luiz Balancin
- Department of Pathology, Faculty of Medicine, University of São Paulo, São Paulo, Brazil
- Marie-Christine Mathieu
- Department of Medical Biology and Pathology, Gustave Roussy Cancer Campus, Villejuif, France
- Jochen K. Lennerz
- Department of Pathology, Massachusetts General Hospital, Boston, MA, USA
- Pawan Kirtani
- Department of Histopathology, Manipal Hospitals Dwarka, New Delhi, India
- I-Chun Chen
- Department of Oncology, National Taiwan University Cancer Center, Taipei, Taiwan
- Jeremy P. Braybrooke
- Nuffield Department of Population Health, University of Oxford, Oxford, UK
- Department of Medical Oncology, University Hospitals Bristol NHS Foundation Trust, Bristol, UK
- Giancarlo Pruneri
- Pathology Department, Fondazione IRCCS Istituto Nazionale Tumori and University of Milan, School of Medicine, Milan, Italy
- Sylvia Adams
- Laura and Isaac Perlmutter Cancer Center, NYU Langone Medical Center, New York, NY, USA
- Stuart J. Schnitt
- Department of Pathology, Brigham and Women's Hospital, Boston, MA, USA
- Sunil R. Lakhani
- The University of Queensland Centre for Clinical Research and Pathology Queensland, Brisbane, Australia
- Federico Rojo
- Pathology Department, CIBERONC-Instituto de Investigación Sanitaria Fundación Jiménez Díaz (IIS-FJD), Madrid, Spain
- GEICAM-Spanish Breast Cancer Research Group, Madrid, Spain
- Laura Comerma
- Pathology Department, CIBERONC-Instituto de Investigación Sanitaria Fundación Jiménez Díaz (IIS-FJD), Madrid, Spain
- GEICAM-Spanish Breast Cancer Research Group, Madrid, Spain
- Sunil S. Badve
- Department of Pathology and Laboratory Medicine, Indiana University School of Medicine, Indianapolis, IN, USA
- W. Fraser Symmans
- Department of Pathology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Christos Sotiriou
- Breast Cancer Translational Research Laboratory, Institut Jules Bordet, Université Libre de Bruxelles (ULB), Brussels, Belgium
- ULB-Cancer Research Center (U-CRC), Université Libre de Bruxelles, Brussels, Belgium
- Paula Gonzalez-Ericsson
- Breast Cancer Program, Vanderbilt-Ingram Cancer Center, Vanderbilt University Medical Center, Nashville, TN, USA
- David L. Rimm
- Department of Pathology, Yale University School of Medicine, New Haven, CT, USA
- Giuseppe Viale
- Department of Pathology, IEO, European Institute of Oncology IRCCS & State University of Milan, Milan, Italy
- Stephen M. Hewitt
- Laboratory of Pathology, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
- John M. S. Bartlett
- Ontario Institute for Cancer Research, Toronto, ON, Canada
- Edinburgh Cancer Research Centre, Western General Hospital, Edinburgh, UK
- Frédérique Penault-Llorca
- Department of Pathology and Molecular Pathology, Centre Jean Perrin, Clermont-Ferrand, France
- UMR INSERM 1240, Université Clermont Auvergne, Clermont-Ferrand, France
- Shom Goel
- Victorian Comprehensive Cancer Centre building, Peter MacCallum Cancer Centre, Melbourne, VIC, Australia
- Huang-Chun Lien
- Department of Pathology, National Taiwan University Hospital, Taipei, Taiwan
- Sibylle Loibl
- German Breast Group, c/o GBG-Forschungs GmbH, Neu-Isenburg, Germany
- Zuzana Kos
- Department of Pathology, BC Cancer, Vancouver, BC, Canada
- Sherene Loi
- Sir Peter MacCallum Department of Oncology, University of Melbourne, Parkville, Australia
- Peter MacCallum Cancer Centre, Melbourne, Australia
- Matthew G. Hanna
- Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Stefan Michiels
- Gustave Roussy, Université Paris-Saclay, Villejuif, France
- Université Paris-Sud, Institut National de la Santé et de la Recherche Médicale, Villejuif, France
- Marleen Kok
- Division of Molecular Oncology & Immunology, The Netherlands Cancer Institute, Amsterdam, The Netherlands
- Department of Medical Oncology, The Netherlands Cancer Institute, Amsterdam, The Netherlands
- Alexander J. Lazar
- Department of Pathology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Department of Genomic Medicine, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Department of Translational Molecular Pathology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Department of Dermatology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Loes F. S. Kooreman
- GROW School for Oncology and Developmental Biology, Maastricht University Medical Centre, Maastricht, The Netherlands
- Department of Pathology, Maastricht University Medical Centre, Maastricht, The Netherlands
- Jeroen A. W. M. van der Laak
- Department of Pathology, Radboud University Medical Center, Nijmegen, The Netherlands
- Center for Medical Image Science and Visualization, Linköping University, Linköping, Sweden
- Joel Saltz
- Department of Biomedical Informatics, Stony Brook University, Stony Brook, NY, USA
- Brandon D. Gallas
- FDA/CDRH/OSEL/Division of Imaging, Diagnostics, and Software Reliability, Silver Spring, MD, USA
- Uday Kurkure
- Roche Tissue Diagnostics, Digital Pathology, Santa Clara, CA, USA
- Michael Barnes
- Roche Diagnostics Information Solutions, Belmont, CA, USA
- Roberto Salgado
- Division of Research and Cancer Medicine, Peter MacCallum Cancer Centre, University of Melbourne, Victoria, Australia
- Department of Pathology, GZA-ZNA Ziekenhuizen, Antwerp, Belgium
- Lee A. D. Cooper
- Department of Pathology, Northwestern University Feinberg School of Medicine, Chicago, IL, USA
15
Wei BR, Halsey CH, Hoover SB, Puri M, Yang HH, Gallas BD, Lee MP, Chen W, Durham AC, Dwyer JE, Sánchez MD, Traslavina RP, Frank C, Bradley C, McGill LD, Esplin DG, Schaffer PA, Cramer SD, Lyle LT, Beck J, Buza E, Gong Q, Hewitt SM, Simpson RM. Agreement in Histological Assessment of Mitotic Activity Between Microscopy and Digital Whole Slide Images Informs Conversion for Clinical Diagnosis. Acad Pathol 2019; 6:2374289519859841. [PMID: 31321298; PMCID: PMC6628521; DOI: 10.1177/2374289519859841]
Abstract
Validating digital pathology as a substitute for conventional microscopy in diagnosis remains a priority to assure effectiveness. Intermodality concordance studies typically focus on achieving the same diagnosis by digital display of whole slide images and by conventional microscopy. Assessment of discrete histological features in whole slide images, such as mitotic figures, has not been thoroughly evaluated in diagnostic practice. To further gauge the interchangeability of conventional microscopy with digital display for primary diagnosis, 12 pathologists examined 113 naturally occurring canine mucosal melanomas exhibiting a wide range of mitotic activity. The design reflected diverse diagnostic settings and investigated independent location, interpretation, and enumeration of mitotic figures. Intermodality agreement was assessed employing conventional microscopy (CM40×) and whole slide image specimens scanned at 20× (WSI20×) and 40× (WSI40×) objective magnifications. An aggregate 1647 mitotic figure count observations were available from conventional microscopy and whole slide images for comparison. The intraobserver concordance rate of paired observations was 0.785 to 0.801; the interobserver rate was 0.784 to 0.794. Correlation coefficients between the 2 digital modes, and as compared to conventional microscopy, were similar and suggest noninferiority among modalities, including whole slide images acquired at the lower 20× resolution. As mitotic figure counts serve for prognostic grading of several tumor types, including melanoma, 6 of 8 pathologists retrospectively predicted survival prognosis using whole slide images, compared to 9 of 10 by conventional microscopy, a first evaluation of whole slide images for mitotic figure prognostic grading. This study demonstrated agreement of replicate reads obtained across conventional microscopy and whole slide images. Hence, quantifying mitotic figures served as a surrogate histological feature with which to further credential the interchangeability of whole slide images for primary diagnosis.
Affiliation(s)
- Bih-Rong Wei
- Laboratory of Cancer Biology and Genetics, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
- Frederick National Laboratory for Cancer Research, Leidos Biomedical Research, Inc., Frederick, MD, USA
- Charles H Halsey
- Laboratory of Cancer Biology and Genetics, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
- Shelley B Hoover
- Laboratory of Cancer Biology and Genetics, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
- Munish Puri
- Laboratory of Cancer Biology and Genetics, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
- Howard H Yang
- Laboratory of Cancer Biology and Genetics, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
- Brandon D Gallas
- Division of Imaging, Diagnostics, and Software Reliability, Office of Science and Engineering Laboratories, Center for Devices and Radiological Health, US Food and Drug Administration, Silver Spring, MD, USA
- Maxwell P Lee
- Laboratory of Cancer Biology and Genetics, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
- Weijie Chen
- Division of Imaging, Diagnostics, and Software Reliability, Office of Science and Engineering Laboratories, Center for Devices and Radiological Health, US Food and Drug Administration, Silver Spring, MD, USA
- Amy C Durham
- Department of Pathobiology, University of Pennsylvania, Philadelphia, PA, USA
- Jennifer E Dwyer
- Laboratory of Cancer Biology and Genetics, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
- Melissa D Sánchez
- Department of Pathobiology, University of Pennsylvania, Philadelphia, PA, USA
- Ryan P Traslavina
- Section of Infections of the Nervous System, National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD, USA
- Chad Frank
- Department of Microbiology, Immunology, and Pathology, Colorado State University, Fort Collins, CO, USA
- Charles Bradley
- Department of Pathobiology, University of Pennsylvania, Philadelphia, PA, USA
- Paula A Schaffer
- Department of Microbiology, Immunology, and Pathology, Colorado State University, Fort Collins, CO, USA
- Sarah D Cramer
- Cancer and Inflammation Program, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
- L Tiffany Lyle
- Women's Malignancies Branch, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
- Jessica Beck
- Laboratory of Human Carcinogenesis, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
- Elizabeth Buza
- Department of Pathobiology, University of Pennsylvania, Philadelphia, PA, USA
- Qi Gong
- Division of Imaging, Diagnostics, and Software Reliability, Office of Science and Engineering Laboratories, Center for Devices and Radiological Health, US Food and Drug Administration, Silver Spring, MD, USA
- Stephen M Hewitt
- Laboratory of Pathology, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
- R Mark Simpson
- Laboratory of Cancer Biology and Genetics, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
16
Tabata K, Uraoka N, Benhamida J, Hanna MG, Sirintrapun SJ, Gallas BD, Gong Q, Aly RG, Emoto K, Matsuda KM, Hameed MR, Klimstra DS, Yagi Y. Validation of mitotic cell quantification via microscopy and multiple whole-slide scanners. Diagn Pathol 2019; 14:65. [PMID: 31238983; PMCID: PMC6593538; DOI: 10.1186/s13000-019-0839-8]
Abstract
BACKGROUND The establishment of whole-slide imaging (WSI) as a medical diagnostic device means that pathologists may evaluate mitotic activity with this new technology. Furthermore, image digitization provides an opportunity to develop algorithms for automatic quantification, ideally leading to improved reproducibility compared to naked-eye examination by pathologists. To implement them effectively, the accuracy of mitotic figure detection using WSI should be investigated. In this study, we aimed to measure pathologist performance in detecting mitotic figures (MFs) using multiple platforms (multiple scanners) and compare the results with those obtained using a brightfield microscope. METHODS Four slides of canine oral melanoma were prepared and digitized using 4 WSI scanners. In these slides, 40 regions of interest (ROIs) were demarcated, and five observers identified the MFs using different viewing modes: microscopy and WSI. We evaluated the inter- and intra-observer agreement between modes with Cohen's kappa and determined "true" MFs with a consensus panel. We then assessed accuracy (agreement with truth) using the average of sensitivity and specificity. RESULTS In the 40 ROIs, 155 candidate MFs were detected by five pathologists; 74 of them were determined to be true MFs. Inter- and intra-observer agreement was mostly "substantial" or greater (kappa = 0.594-0.939). Accuracy was between 0.632 and 0.843 across all readers and modes. After averaging over readers for each modality, we found that mitosis detection accuracy for 3 of the 4 WSI scanners was significantly less than that of the microscope (p = 0.002, 0.012, and 0.001). CONCLUSIONS This study is the first to compare WSI and microscopy in detecting MFs at the level of individual cells. Our results suggest that WSI can be used for mitotic cell detection and offers similar reproducibility to the microscope, with slightly less accuracy.
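The two statistics named in this abstract are simple to compute. A minimal sketch, not the study's actual code (the function names and the tie-free binary setting are our assumptions), of Cohen's kappa for two raters' binary mitotic-figure calls and of accuracy taken as the average of sensitivity and specificity against consensus truth:

```python
def cohens_kappa(a, b):
    """Cohen's kappa for two raters' binary labels (1 = mitotic figure)."""
    n = len(a)
    po = sum(x == y for x, y in zip(a, b)) / n        # observed agreement
    p_yes = (sum(a) / n) * (sum(b) / n)               # chance both say "yes"
    p_no = (1 - sum(a) / n) * (1 - sum(b) / n)        # chance both say "no"
    pe = p_yes + p_no                                 # chance agreement
    return (po - pe) / (1 - pe)

def balanced_accuracy(truth, calls):
    """Average of sensitivity and specificity against consensus truth."""
    tp = sum(t == 1 and c == 1 for t, c in zip(truth, calls))
    tn = sum(t == 0 and c == 0 for t, c in zip(truth, calls))
    sensitivity = tp / sum(truth)
    specificity = tn / (len(truth) - sum(truth))
    return (sensitivity + specificity) / 2
```

For example, raters [1, 1, 0, 0] and [1, 0, 0, 0] agree on 3 of 4 objects with chance agreement 0.5, giving kappa = 0.5.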
Affiliation(s)
- Kazuhiro Tabata
- Department of Pathology, Memorial Sloan Kettering Cancer Center, 1275 York Avenue, New York, NY 10065, USA
- Department of Pathology, Nagasaki University Hospital, 1-7-1 Sakamoto, Nagasaki, Nagasaki 8528501, Japan
- Naohiro Uraoka
- Department of Pathology, Memorial Sloan Kettering Cancer Center, 1275 York Avenue, New York, NY 10065, USA
- Jamal Benhamida
- Department of Pathology, Memorial Sloan Kettering Cancer Center, 1275 York Avenue, New York, NY 10065, USA
- Matthew G. Hanna
- Department of Pathology, Memorial Sloan Kettering Cancer Center, 1275 York Avenue, New York, NY 10065, USA
- Brandon D. Gallas
- Center for Devices and Radiological Health, Office of Science and Engineering Laboratories, U.S. Food and Drug Administration, 10903 New Hampshire Avenue, Silver Spring, MD 20993, USA
- Qi Gong
- Center for Devices and Radiological Health, Office of Science and Engineering Laboratories, U.S. Food and Drug Administration, 10903 New Hampshire Avenue, Silver Spring, MD 20993, USA
- Rania G. Aly
- Department of Pathology, Memorial Sloan Kettering Cancer Center, 1275 York Avenue, New York, NY 10065, USA
- Department of Pathology, Faculty of Medicine, Alexandria University, 22 El-Guish Road, El-Shatby, Alexandria, 21526, Egypt
- Katsura Emoto
- Department of Pathology, Memorial Sloan Kettering Cancer Center, 1275 York Avenue, New York, NY 10065, USA
- Thoracic Service, Department of Surgery, Memorial Sloan Kettering Cancer Center, 1275 York Avenue, New York, NY 10065, USA
- Kant M. Matsuda
- Department of Pathology, Memorial Sloan Kettering Cancer Center, 1275 York Avenue, New York, NY 10065, USA
- Meera R. Hameed
- Department of Pathology, Memorial Sloan Kettering Cancer Center, 1275 York Avenue, New York, NY 10065, USA
- David S. Klimstra
- Department of Pathology, Memorial Sloan Kettering Cancer Center, 1275 York Avenue, New York, NY 10065, USA
- Yukako Yagi
- Department of Pathology, Memorial Sloan Kettering Cancer Center, 1275 York Avenue, New York, NY 10065, USA
17
Gallas BD, Chen W, Cole E, Ochs R, Petrick N, Pisano ED, Sahiner B, Samuelson FW, Myers KJ. Impact of prevalence and case distribution in lab-based diagnostic imaging studies. J Med Imaging (Bellingham) 2019; 6:015501. [PMID: 30713851; PMCID: PMC6340399; DOI: 10.1117/1.jmi.6.1.015501]
Abstract
We investigated effects of prevalence and case distribution on radiologist diagnostic performance as measured by area under the receiver operating characteristic curve (AUC) and sensitivity-specificity in lab-based reader studies evaluating imaging devices. Our retrospective reader studies compared full-field digital mammography (FFDM) to screen-film mammography (SFM) for women with dense breasts. Mammograms were acquired from the prospective Digital Mammographic Imaging Screening Trial. We performed five reader studies that differed in terms of cancer prevalence and the distribution of noncancers. Twenty radiologists participated in each reader study. Using split-plot study designs, we collected recall decisions and multilevel scores from the radiologists for calculating sensitivity, specificity, and AUC. Differences in reader-averaged AUCs slightly favored SFM over FFDM (largest AUC difference: 0.047, SE = 0.023, p = 0.047), where the standard error accounts for reader and case variability. The differences were not significant at a level of 0.01 (0.05/5 reader studies). The differences in sensitivities and specificities were also indeterminate. Prevalence had little effect on AUC (largest difference: 0.02), whereas sensitivity increased and specificity decreased as prevalence increased. We found that AUC is robust to changes in prevalence, while radiologists were more aggressive with recall decisions as prevalence increased.
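The robustness of AUC to prevalence reported above follows from how the empirical AUC is computed: the Mann-Whitney estimate averages over (cancer, noncancer) case pairs, so replicating cases of one class to change prevalence cannot move it. A minimal illustrative sketch (the function name is ours, not from the paper):

```python
def empirical_auc(pos_scores, neg_scores):
    """Mann-Whitney AUC: the fraction of (positive, negative) score pairs
    ranked correctly, counting ties as one half."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

# Doubling the noncancer cases (halving prevalence) leaves the pairwise
# comparisons, and hence the AUC, unchanged; sensitivity and specificity
# at a fixed recall threshold carry no such invariance.
```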
Affiliation(s)
- Brandon D. Gallas
- FDA/CDRH/OSEL/Division of Imaging, Diagnostics, and Software Reliability, Silver Spring, Maryland, United States
- Weijie Chen
- FDA/CDRH/OSEL/Division of Imaging, Diagnostics, and Software Reliability, Silver Spring, Maryland, United States
- Elodia Cole
- Medical University of South Carolina, Charleston, South Carolina, United States
- Robert Ochs
- FDA/CDRH/OIR/Division of Radiological Health, Silver Spring, Maryland, United States
- Nicholas Petrick
- FDA/CDRH/OSEL/Division of Imaging, Diagnostics, and Software Reliability, Silver Spring, Maryland, United States
- Etta D. Pisano
- Medical University of South Carolina, Charleston, South Carolina, United States
- Berkman Sahiner
- FDA/CDRH/OSEL/Division of Imaging, Diagnostics, and Software Reliability, Silver Spring, Maryland, United States
- Frank W. Samuelson
- FDA/CDRH/OSEL/Division of Imaging, Diagnostics, and Software Reliability, Silver Spring, Maryland, United States
- Kyle J. Myers
- FDA/CDRH/OSEL/Division of Imaging, Diagnostics, and Software Reliability, Silver Spring, Maryland, United States
18
Chen W, Gong Q, Gallas BD. Paired split-plot designs of multireader multicase studies. J Med Imaging (Bellingham) 2018; 5:031410. [PMID: 29795776; DOI: 10.1117/1.jmi.5.3.031410]
Abstract
The widely used multireader multicase ROC study design for comparing imaging modalities is the fully crossed (FC) design: every reader reads every case of both modalities. We investigate paired split-plot (PSP) designs that may allow for reduced cost and increased flexibility compared with the FC design. In the PSP design, case images from two modalities are read by the same readers, so the readings are paired across modalities. However, within each modality, not every reader reads every case. Instead, both the readers and the cases are partitioned into a fixed number of groups, and each group of readers reads its own group of cases (a split-plot design). Using a U-statistic-based variance analysis for AUC (i.e., area under the ROC curve), we show analytically that precision can be gained by the PSP design as compared with the FC design with the same number of readers and readings. Equivalently, we show that the PSP design can achieve the same statistical power as the FC design with a reduced number of readings. The trade-off for the increased precision in the PSP design is the cost of collecting a larger number of truth-verified patient cases than the FC design. This means that one can trade off between different sources of cost and choose a least burdensome design. We provide a validation study to show that the iMRMC software can be reliably used for analyzing data from both FC and PSP designs. Finally, we demonstrate the advantages of the PSP design with a reader study comparing full-field digital mammography with screen-film mammography.
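The partitioning described above can be sketched in a few lines. This is a hypothetical illustration, not the iMRMC implementation: readers and cases are split into g groups, reader group i reads only case group i, and reusing the same assignment for both modalities keeps the readings paired across modalities while cutting the number of readings by a factor of g relative to the fully crossed design:

```python
def psp_readings(readers, cases, n_groups):
    """Enumerate (reader, case) readings for a paired split-plot design:
    reader group i reads only case group i within each modality; pairing
    across modalities comes from reusing this same assignment for both."""
    reader_groups = [readers[i::n_groups] for i in range(n_groups)]
    case_groups = [cases[i::n_groups] for i in range(n_groups)]
    return [(r, c)
            for rg, cg in zip(reader_groups, case_groups)
            for r in rg
            for c in cg]
```

With 4 readers, 6 cases, and 2 groups, this yields 12 readings per modality instead of the 24 a fully crossed design would require (n_groups=1 recovers the FC design).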
Affiliation(s)
- Weijie Chen
- Food and Drug Administration, Center for Devices and Radiological Health, Office of Science and Engineering Laboratories, Division of Imaging, Diagnostics, and Software Reliability, Silver Spring, Maryland, United States
- Qi Gong
- Food and Drug Administration, Center for Devices and Radiological Health, Office of Science and Engineering Laboratories, Division of Imaging, Diagnostics, and Software Reliability, Silver Spring, Maryland, United States
- Brandon D Gallas
- Food and Drug Administration, Center for Devices and Radiological Health, Office of Science and Engineering Laboratories, Division of Imaging, Diagnostics, and Software Reliability, Silver Spring, Maryland, United States
19
Gallas BD, Pisano E, Cole E, Myers K. Impact of Different Study Populations on Reader Behavior and Performance Metrics: Initial Results. Proc SPIE Int Soc Opt Eng 2017; 10136. [PMID: 28845078; DOI: 10.1117/12.2255977]
Abstract
The FDA recently completed a study on reader-study design methodologies, the Validation of Imaging Premarket Evaluation and Regulation (VIPER) study. VIPER consisted of five large reader sub-studies to compare the impact of different study populations on reader behavior as seen by sensitivity, specificity, and AUC, the area under the ROC (receiver operating characteristic) curve. The study investigated different prevalence levels and two kinds of sampling of non-cancer patients: a screening population and a challenge population. The VIPER study compared full-field digital mammography (FFDM) to screen-film mammography (SFM) for women with heterogeneously dense or extremely dense breasts. All cases and corresponding images were sampled from Digital Mammographic Imaging Screening Trial (DMIST) archives. There were 20 readers (American Board-certified radiologists) for each sub-study, and instead of every reader reading every case (a fully crossed study), readers and cases were split into groups to reduce reader workload and the total number of observations (a split-plot study). For data collection, readers first decided whether or not they would recall a patient. Following that decision, they provided an ROC score for how close or far that patient was from the recall decision threshold. Performance results for FFDM show that as prevalence increases to 50%, there is a moderate increase in sensitivity and a decrease in specificity, whereas AUC is mainly flat. Regarding precision, the statistical efficiency (ratio of variances) of sensitivity and specificity relative to AUC is 0.66 at best and decreases with prevalence. Analyses comparing modalities and the study populations (screening vs. challenge) are still ongoing.
Affiliation(s)
- Etta Pisano
- Beth Israel Deaconess Medical Center, Boston, MA, USA
- Harvard Medical School, Harvard University, Boston, MA, USA
- Elodia Cole
- Beth Israel Deaconess Medical Center, Boston, MA, USA
20
Gallas BD, Anam A, Chen W, Wunderlich A, Zhang Z. MRMC analysis of agreement studies. Proc SPIE Int Soc Opt Eng 2016; 9787:97870F. [PMID: 28794577; PMCID: PMC5546377; DOI: 10.1117/12.2217074]
Abstract
The purpose of this work is to present and evaluate methods based on U-statistics to compare intra- or inter-reader agreement across different imaging modalities. We apply these methods to multi-reader multi-case (MRMC) studies. We measure reader-averaged agreement and estimate its variance accounting for the variability from readers and cases (an MRMC analysis). In our application, pathologists (readers) evaluate patient tissue mounted on glass slides (cases) in two ways. They evaluate the slides on a microscope (reference modality) and they evaluate digital scans of the slides on a computer display (new modality). In the current work, we consider concordance as the agreement measure, but many of the concepts outlined here apply to other agreement measures. Concordance is the probability that two readers rank two cases in the same order. Concordance can be estimated with a U-statistic and thus it has some nice properties: it is unbiased, asymptotically normal, and its variance is given by an explicit formula. Another property of a U-statistic is that it is symmetric in its inputs; it doesn't matter which reader is listed first or which case is listed first, the result is the same. Using this property and a few tricks while building the U-statistic kernel for concordance, we get a mathematically tractable problem and efficient software. Simulations show that our variance and covariance estimates are unbiased.
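Concordance as defined above, the probability that two readers rank two cases in the same order, can be estimated by averaging a pairwise kernel over all case pairs, which is exactly a U-statistic. A minimal sketch (the half-credit treatment of ties is our assumption, not necessarily the paper's kernel):

```python
from itertools import combinations

def concordance(scores_a, scores_b):
    """U-statistic estimate of concordance between two readers' scores on
    the same cases: the average, over all case pairs, of whether both
    readers order the pair the same way (ties credited one half here)."""
    pairs = list(combinations(range(len(scores_a)), 2))
    total = 0.0
    for i, j in pairs:
        da = scores_a[i] - scores_a[j]
        db = scores_b[i] - scores_b[j]
        if da * db > 0:
            total += 1.0      # both readers order the pair the same way
        elif da * db == 0:
            total += 0.5      # at least one reader tied the pair
    return total / len(pairs)
```

Note the kernel is symmetric in its inputs, the property the abstract highlights: swapping the two readers, or the two cases in a pair, leaves the estimate unchanged.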
Affiliation(s)
- Brandon D Gallas
- CDRH/OSEL Division of Imaging, Diagnostics, and Software Reliability, 10903 New Hampshire Ave, Silver Spring, MD 20993
- Amrita Anam
- CDRH/OSEL Division of Imaging, Diagnostics, and Software Reliability, 10903 New Hampshire Ave, Silver Spring, MD 20993
- UMBC, Department of Information Systems, 1000 Hilltop Cir, Baltimore, MD 21250
- Weijie Chen
- CDRH/OSEL Division of Imaging, Diagnostics, and Software Reliability, 10903 New Hampshire Ave, Silver Spring, MD 20993
- Adam Wunderlich
- CDRH/OSEL Division of Imaging, Diagnostics, and Software Reliability, 10903 New Hampshire Ave, Silver Spring, MD 20993
- Zhiwei Zhang
- CDRH/OSB Division of Biostatistics, 10903 New Hampshire Ave, Silver Spring, MD 20993
21
Treanor D, Gallas BD, Gavrielides MA, Hewitt SM. Evaluating whole slide imaging: A working group opportunity. J Pathol Inform 2015; 6:4. [PMID: 25774315] [PMCID: PMC4355829]
Affiliation(s)
- Darren Treanor
- Leeds Teaching Hospitals NHS Trust and University of Leeds, Leeds, England
- Brandon D. Gallas
- Division of Imaging, Diagnostics, and Software Reliability, Office of Science and Engineering Laboratories, Center for Devices and Radiological Health, U.S. Food and Drug Administration, Silver Spring, MD, USA
- Marios A. Gavrielides
- Division of Imaging, Diagnostics, and Software Reliability, Office of Science and Engineering Laboratories, Center for Devices and Radiological Health, U.S. Food and Drug Administration, Silver Spring, MD, USA
- Stephen M. Hewitt
- Laboratory of Pathology, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, USA
22
Wunderlich A, Noo F, Gallas BD, Heilbrun ME. Exact confidence intervals for channelized Hotelling observer performance in image quality studies. IEEE Trans Med Imaging 2015; 34:453-64. [PMID: 25265629] [PMCID: PMC5542023] [DOI: 10.1109/tmi.2014.2360496]
Abstract
Task-based assessments of image quality constitute a rigorous, principled approach to the evaluation of imaging system performance. To conduct such assessments, it has been recognized that mathematical model observers are very useful, particularly for purposes of imaging system development and optimization. One type of model observer that has been widely applied in the medical imaging community is the channelized Hotelling observer (CHO), which is well-suited to known-location discrimination tasks. In the present work, we address the need for reliable confidence interval estimators of CHO performance. Specifically, we show that the bias associated with point estimates of CHO performance can be overcome by using confidence intervals proposed by Reiser for the Mahalanobis distance. In addition, we find that these intervals are well-defined with theoretically-exact coverage probabilities, which is a new result not proved by Reiser. The confidence intervals are tested with Monte Carlo simulation and demonstrated with two examples comparing X-ray CT reconstruction strategies. Moreover, commonly-used training/testing approaches are discussed and compared to the exact confidence intervals. MATLAB software implementing the estimators described in this work is publicly available at http://code.google.com/p/iqmodelo/.
23
24
Chen W, Wunderlich A, Petrick N, Gallas BD. Multireader multicase reader studies with binary agreement data: simulation, analysis, validation, and sizing. J Med Imaging (Bellingham) 2014; 1:031011. [PMID: 26158051] [DOI: 10.1117/1.jmi.1.3.031011]
Abstract
We treat multireader multicase (MRMC) reader studies for which a reader's diagnostic assessment is converted to binary agreement (1: agree with the truth state, 0: disagree with the truth state). We present a mathematical model for simulating binary MRMC data with a desired correlation structure across readers, cases, and two modalities, assuming the expected probability of agreement is equal for the two modalities ([Formula: see text]). This model can be used to validate the coverage probabilities of 95% confidence intervals (of [Formula: see text], [Formula: see text], or [Formula: see text] when [Formula: see text]), validate the type I error of a superiority hypothesis test, and size a noninferiority hypothesis test (which assumes [Formula: see text]). To illustrate the utility of our simulation model, we adapt the Obuchowski-Rockette-Hillis (ORH) method for the analysis of MRMC binary agreement data. Moreover, we use our simulation model to validate the ORH method for binary data and to illustrate sizing in a noninferiority setting. Our software package is publicly available on the Google code project hosting site for use in simulation, analysis, validation, and sizing of MRMC reader studies with binary agreement data.
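A loose sketch of this kind of simulation (the variance components below are invented for illustration, and a simple latent-Gaussian threshold stands in for the paper's exact correlation model):

```python
import numpy as np

def simulate_binary_mrmc(n_readers, n_cases, mu=0.5, var_r=0.1,
                         var_c=0.2, var_e=0.7, seed=0):
    """Draw a readers-by-cases matrix of binary agreement scores.

    A latent Gaussian score is the sum of a reader effect, a case
    effect, and independent noise; agreement = 1 when it exceeds 0.
    The shared reader and case effects induce correlation across the
    rows and columns of the matrix.
    """
    rng = np.random.default_rng(seed)
    r = rng.normal(0.0, np.sqrt(var_r), size=(n_readers, 1))
    c = rng.normal(0.0, np.sqrt(var_c), size=(1, n_cases))
    e = rng.normal(0.0, np.sqrt(var_e), size=(n_readers, n_cases))
    return ((mu + r + c + e) > 0).astype(int)

scores = simulate_binary_mrmc(n_readers=5, n_cases=200)
print(scores.shape, scores.mean())  # matrix shape; mean approximates P(agree)
```

With the variances summing to 1, the expected probability of agreement is the standard normal CDF evaluated at mu.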
Affiliation(s)
- Weijie Chen
- Food and Drug Administration, Center for Devices and Radiological Health, Office of Science and Engineering Laboratories, Division of Imaging, Diagnostics, and Software Reliability, 10903 New Hampshire Avenue, Silver Spring, Maryland 20993, United States
- Adam Wunderlich
- Food and Drug Administration, Center for Devices and Radiological Health, Office of Science and Engineering Laboratories, Division of Imaging, Diagnostics, and Software Reliability, 10903 New Hampshire Avenue, Silver Spring, Maryland 20993, United States
- Nicholas Petrick
- Food and Drug Administration, Center for Devices and Radiological Health, Office of Science and Engineering Laboratories, Division of Imaging, Diagnostics, and Software Reliability, 10903 New Hampshire Avenue, Silver Spring, Maryland 20993, United States
- Brandon D Gallas
- Food and Drug Administration, Center for Devices and Radiological Health, Office of Science and Engineering Laboratories, Division of Imaging, Diagnostics, and Software Reliability, 10903 New Hampshire Avenue, Silver Spring, Maryland 20993, United States
25
Kang L, Chen W, Petrick NA, Gallas BD. Comparing two correlated C indices with right-censored survival outcome: a one-shot nonparametric approach. Stat Med 2014; 34:685-703. [PMID: 25399736] [DOI: 10.1002/sim.6370]
Abstract
The area under the receiver operating characteristic curve is often used as a summary index of the diagnostic ability in evaluating biomarkers when the clinical outcome (truth) is binary. When the clinical outcome is right-censored survival time, the C index, motivated as an extension of area under the receiver operating characteristic curve, has been proposed by Harrell as a measure of concordance between a predictive biomarker and the right-censored survival outcome. In this work, we investigate methods for statistical comparison of two diagnostic or predictive systems, which may be either two biomarkers or two fixed algorithms, in terms of their C indices. We adopt a U-statistics-based C estimator that is asymptotically normal and develop a nonparametric analytical approach to estimate the variance of the C estimator and the covariance of two C estimators. A z-score test is then constructed to compare the two C indices. We validate our one-shot nonparametric method via simulation studies in terms of the type I error rate and power. We also compare our one-shot method with resampling methods including the jackknife and the bootstrap. Simulation results show that the proposed one-shot method provides almost unbiased variance estimations and has satisfactory type I error control and power. Finally, we illustrate the use of the proposed method with an example from the Framingham Heart Study.
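Harrell's C itself can be estimated in a few lines; a hedged sketch with invented toy data (the brute-force point estimator only; the paper's contribution, the analytic variance and covariance of two such estimators, is not reproduced):

```python
def harrell_c(marker, time, event):
    """Harrell's C index for a right-censored outcome.

    A pair (i, j) is usable when subject i has an observed event
    (event[i] == 1) strictly before time[j]. Among usable pairs, count
    the fraction where the higher marker value goes with the shorter
    survival time; marker ties count as 1/2.
    """
    num = den = 0.0
    n = len(time)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            if event[i] == 1 and time[i] < time[j]:
                den += 1
                if marker[i] > marker[j]:
                    num += 1
                elif marker[i] == marker[j]:
                    num += 0.5
    return num / den

# Toy data: higher marker values go with earlier observed deaths
time = [2, 4, 6, 8]
event = [1, 1, 0, 1]       # subject 3 is censored
marker = [0.9, 0.7, 0.4, 0.1]
print(harrell_c(marker, time, event))  # 1.0: perfectly concordant
```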
Affiliation(s)
- Le Kang
- Department of Biostatistics, School of Medicine, Virginia Commonwealth University, Richmond, VA, U.S.A.; Division of Imaging, Diagnostics, and Software Reliability, Office of Science and Engineering Laboratories, Center for Devices and Radiological Health, US Food and Drug Administration, Silver Spring, MD, U.S.A
26
Gallas BD, Gavrielides MA, Conway CM, Ivansky A, Keay TC, Cheng WC, Hipp J, Hewitt SM. Evaluation environment for digital and analog pathology: a platform for validation studies. J Med Imaging (Bellingham) 2014; 1:037501. [PMID: 26158076] [DOI: 10.1117/1.jmi.1.3.037501]
Abstract
We present a platform for designing and executing studies that compare pathologists interpreting histopathology of whole slide images (WSIs) on a computer display to pathologists interpreting glass slides on an optical microscope. eeDAP is an evaluation environment for digital and analog pathology. The key element in eeDAP is the registration of the WSI to the glass slide. Registration is accomplished through computer control of the microscope stage and a camera mounted on the microscope that acquires real-time images of the microscope field of view (FOV). Registration allows for the evaluation of the same regions of interest (ROIs) in both domains. This can reduce or eliminate disagreements that arise from pathologists interpreting different areas and focuses on the comparison of image quality. We reduced the pathologist interpretation area from an entire glass slide (10 to [Formula: see text]) to small ROIs ([Formula: see text]). We also made possible the evaluation of individual cells. We summarize eeDAP's software and hardware and provide calculations and corresponding images of the microscope FOV and the ROIs extracted from the WSIs. The eeDAP software can be downloaded from the Google code website (project: eeDAP) as a MATLAB source or as a precompiled stand-alone license-free application.
Affiliation(s)
- Brandon D Gallas
- FDA/CDRH/OSEL, Division of Imaging, Diagnostics, and Software Reliability, 10903 New Hampshire Avenue, Building 62, Room 3124, Silver Spring, Maryland 20993-0002, United States
- Marios A Gavrielides
- FDA/CDRH/OSEL, Division of Imaging, Diagnostics, and Software Reliability, 10903 New Hampshire Avenue, Building 62, Room 3124, Silver Spring, Maryland 20993-0002, United States
- Catherine M Conway
- National Cancer Institute, National Institutes of Health, Center for Cancer Research, Laboratory of Pathology, 10 Center Drive, MSC 1500, Bethesda, Maryland 20892, United States
- Adam Ivansky
- FDA/CDRH/OSEL, Division of Imaging, Diagnostics, and Software Reliability, 10903 New Hampshire Avenue, Building 62, Room 3124, Silver Spring, Maryland 20993-0002, United States
- Tyler C Keay
- FDA/CDRH/OSEL, Division of Imaging, Diagnostics, and Software Reliability, 10903 New Hampshire Avenue, Building 62, Room 3124, Silver Spring, Maryland 20993-0002, United States
- Wei-Chung Cheng
- FDA/CDRH/OSEL, Division of Imaging, Diagnostics, and Software Reliability, 10903 New Hampshire Avenue, Building 62, Room 3124, Silver Spring, Maryland 20993-0002, United States
- Jason Hipp
- National Cancer Institute, National Institutes of Health, Center for Cancer Research, Laboratory of Pathology, 10 Center Drive, MSC 1500, Bethesda, Maryland 20892, United States
- Stephen M Hewitt
- National Cancer Institute, National Institutes of Health, Center for Cancer Research, Laboratory of Pathology, 10 Center Drive, MSC 1500, Bethesda, Maryland 20892, United States
27
Gallas BD, Hillis SL. Generalized Roe and Metz receiver operating characteristic model: analytic link between simulated decision scores and empirical AUC variances and covariances. J Med Imaging (Bellingham) 2014; 1:031006. [PMID: 26158048] [PMCID: PMC4478859] [DOI: 10.1117/1.jmi.1.3.031006]
Abstract
Modeling and simulation are often used to understand and investigate random quantities and estimators. In 1997, Roe and Metz introduced a simulation model to validate analysis methods for the popular endpoint in reader studies to evaluate medical imaging devices, the reader-averaged area under the receiver operating characteristic (ROC) curve. Here, we generalize the notation of the model to allow more flexibility in recognition that variances of ROC ratings depend on modality and truth state. We also derive and validate equations for computing population variances and covariances for reader-averaged empirical AUC estimates under the generalized model. The equations are one-dimensional integrals that can be calculated using standard numerical integration techniques. This work provides the theoretical foundation and validation for a Java application called iRoeMetz that can simulate multireader multicase ROC studies and numerically calculate the corresponding variances and covariances of the empirical AUC. The iRoeMetz application and source code can be found at the "iMRMC" project on the google code project hosting site. These results and the application can be used by investigators to investigate ROC endpoints, validate analysis methods, and plan future studies.
Affiliation(s)
- Brandon D. Gallas
- CDRH/FDA, Division of Imaging and Applied Mathematics, Bldg. 62, Rm 3124, 10903 New Hampshire Avenue, Silver Spring, Maryland 20993-0002, United States
- Stephen L. Hillis
- University of Iowa, Departments of Radiology and Biostatistics, 3170 Medical Laboratories, 200 Hawkins Drive, Iowa City, Iowa 52242-1077, United States
- Comprehensive Access and Delivery Research and Evaluation Center, VA Health Care System, Iowa City, Iowa 52242-1077, United States
28
Abbey CK, Gallas BD, Boone JM, Niklason LT, Hadjiiski LM, Sahiner B, Samuelson FW. Comparative statistical properties of expected utility and area under the ROC curve for laboratory studies of observer performance in screening mammography. Acad Radiol 2014; 21:481-90. [PMID: 24594418] [DOI: 10.1016/j.acra.2013.12.011]
Abstract
RATIONALE AND OBJECTIVES Our objective is to determine whether expected utility (EU) and the area under the receiver operator characteristic (AUC) are consistent with one another as endpoints of observer performance studies in mammography. These two measures characterize receiver operator characteristic performance somewhat differently. We compare these two study endpoints at the level of individual reader effects, statistical inference, and components of variance across readers and cases. MATERIALS AND METHODS We reanalyze three previously published laboratory observer performance studies that investigate various x-ray breast imaging modalities using EU and AUC. The EU measure is based on recent estimates of relative utility for screening mammography. RESULTS The AUC and EU measures are correlated across readers for individual modalities (r = 0.93) and differences in modalities (r = 0.94 to 0.98). Statistical inference for modality effects based on multi-reader multi-case analysis is very similar, with significant results (P < .05) in exactly the same conditions. Power analyses show mixed results across studies, with a small increase in power on average for EU that corresponds to approximately a 7% reduction in the number of readers. Despite a large number of crossing receiver operator characteristic curves (59% of readers), modality effects only rarely have opposite signs for EU and AUC (6%). CONCLUSIONS We do not find any evidence of systematic differences between EU and AUC in screening mammography observer studies. Thus, when utility approaches are viable (i.e., an appropriate value of relative utility exists), practical effects such as statistical efficiency may be used to choose study endpoints.
29
Gallas BD, Cheng WC, Gavrielides MA, Ivansky A, Keay T, Wunderlich A, Hipp J, Hewitt SM. eeDAP: An Evaluation Environment for Digital and Analog Pathology. Proc SPIE Int Soc Opt Eng 2014; 9037:903709. [PMID: 28845079] [PMCID: PMC5568810] [DOI: 10.1117/12.2044443]
Abstract
PURPOSE The purpose of this work is to present a platform for designing and executing studies that compare pathologists interpreting histopathology of whole slide images (WSI) on a computer display to pathologists interpreting glass slides on an optical microscope. METHODS Here we present eeDAP, an evaluation environment for digital and analog pathology. The key element in eeDAP is the registration of the WSI to the glass slide. Registration is accomplished through computer control of the microscope stage and a camera mounted on the microscope that acquires real-time images of the microscope view. Registration allows for the evaluation of the same regions of interest (ROIs) in both domains. This can reduce or eliminate disagreements that arise from pathologists interpreting different areas and focuses the comparison on image quality. RESULTS We reduced the pathologist interpretation area from an entire glass slide (≈10-30 mm)² to small ROIs <(50 µm)². We also made possible the evaluation of individual cells. CONCLUSIONS We summarize eeDAP's software and hardware and provide calculations and corresponding images of the microscope field of view and the ROIs extracted from the WSIs. These calculations help provide a sense of eeDAP's functionality and operating principles, while the images provide a sense of the look and feel of studies that can be conducted in the digital and analog domains. The eeDAP software can be downloaded from code.google.com (project: eeDAP) as Matlab source or as a precompiled stand-alone license-free application.
Affiliation(s)
- Brandon D Gallas
- Division of Imaging and Applied Mathematics, OSEL/CDRH/FDA, Silver Spring, MD
- Wei-Chung Cheng
- Division of Imaging and Applied Mathematics, OSEL/CDRH/FDA, Silver Spring, MD
- Adam Ivansky
- Division of Imaging and Applied Mathematics, OSEL/CDRH/FDA, Silver Spring, MD
- Tyler Keay
- Division of Imaging and Applied Mathematics, OSEL/CDRH/FDA, Silver Spring, MD
- Adam Wunderlich
- Division of Imaging and Applied Mathematics, OSEL/CDRH/FDA, Silver Spring, MD
- Jason Hipp
- Laboratory of Pathology, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, Maryland
- Stephen M Hewitt
- Laboratory of Pathology, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, Maryland
30
Abstract
The goal of this work is to design computerized image analysis techniques for automatically characterizing lung nodule subtlety in CT images. Automated subtlety estimation methods may help in computer-aided detection (CAD) assessment by quantifying dataset difficulty and facilitating comparisons among different CAD algorithms. A dataset containing 813 nodules from 499 patients was obtained from the Lung Image Database Consortium. Each nodule was evaluated by four radiologists regarding nodule subtlety using a 5-point rating scale (1: most subtle). We developed a 3D technique for segmenting lung nodules using a prespecified initial ROI. Texture and morphological features were automatically extracted from the segmented nodules and their margins. The dataset was partitioned into trainers and testers using a 1:1 ratio. An artificial neural network (ANN) was trained with average reader subtlety scores as the reference. Effective features for characterizing nodule subtlety were selected based on the training set using the ANN and a stepwise feature selection method. The performance of the classifier was evaluated using prediction probability (PK) as an agreement measure, which is considered a generalization of the area under the receiver operating characteristic curve when the reference standard is multi-level. Using an ANN classifier trained with a set of 2 features (selected from a total of 30 features), including compactness and average gray value, the test concordance between computer scores and the average reader scores was 0.789 ± 0.014. Our results show that the proposed method had strong agreement with the average of subtlety scores provided by radiologists.
Affiliation(s)
- Xin He
- US Food and Drug Administration, Center for Devices and Radiological Health, Office of Science and Engineering Laboratories, Division of Imaging and Applied Mathematics, 10903 New Hampshire Avenue Silver Spring, MD 20993, USA
31
Chen W, Samuelson FW, Gallas BD, Kang L, Sahiner B, Petrick N. On the assessment of the added value of new predictive biomarkers. BMC Med Res Methodol 2013; 13:98. [PMID: 23895587] [PMCID: PMC3733611] [DOI: 10.1186/1471-2288-13-98]
Abstract
BACKGROUND The surge in biomarker development calls for research on statistical evaluation methodology to rigorously assess emerging biomarkers and classification models. Recently, several authors reported the puzzling observation that, in assessing the added value of new biomarkers to existing ones in a logistic regression model, statistical significance of new predictor variables does not necessarily translate into a statistically significant increase in the area under the ROC curve (AUC). Vickers et al. concluded that this inconsistency is because AUC "has vastly inferior statistical properties," i.e., it is extremely conservative. This statement is based on simulations that misuse the DeLong et al. method. Our purpose is to provide a fair comparison of the likelihood ratio (LR) test and the Wald test versus diagnostic accuracy (AUC) tests. DISCUSSION We present a test to compare ideal AUCs of nested linear discriminant functions via an F test. We compare it with the LR test and the Wald test for the logistic regression model. The null hypotheses of these three tests are equivalent; however, the F test is an exact test whereas the LR test and the Wald test are asymptotic tests. Our simulation shows that the F test has the nominal type I error even with a small sample size. Our results also indicate that the LR test and the Wald test have inflated type I errors when the sample size is small, while the type I error converges to the nominal value asymptotically with increasing sample size as expected. We further show that the DeLong et al. method tests a different hypothesis and has the nominal type I error when it is used within its designed scope. Finally, we summarize the pros and cons of all four methods we consider in this paper. SUMMARY We show that there is nothing inherently less powerful or disagreeable about ROC analysis for showing the usefulness of new biomarkers or characterizing the performance of classification models. Each statistical method for assessing biomarkers and classification models has its own strengths and weaknesses. Investigators need to choose methods based on the assessment purpose, the biomarker development phase at which the assessment is being performed, the available patient data, and the validity of assumptions behind the methodologies.
Affiliation(s)
- Weijie Chen
- Division of Imaging and Applied Mathematics, Office of Science and Engineering Laboratories, Center for Devices and Radiological Health, Food and Drug Administration, 10903 New Hampshire Avenue, Silver Spring, MD 20993, USA.
32
Abbey CK, Samuelson FW, Gallas BD. Statistical power considerations for a utility endpoint in observer performance studies. Acad Radiol 2013; 20:798-806. [PMID: 23611439] [DOI: 10.1016/j.acra.2013.02.008]
Abstract
RATIONALE AND OBJECTIVES The purpose of this investigation is to compare the statistical power of the most common measure of performance for observer performance studies, area under the ROC curve (AUC), to an expected utility (EU) endpoint. MATERIALS AND METHODS We have modified a well-known simulation procedure developed by Roe and Metz for statistical power analysis in receiver operating characteristic (ROC) studies. Starting from a set of baseline simulations, we investigate the effects of three parameters that describe properties of the observers (iso-utility slope, unequal variance, and tendency to favor more aggressive or conservative actions) and three parameters that affect experimental design (number of readers, number of cases, and fraction of positive cases). RESULTS The EU endpoint generally has good statistical power relative to AUC in our simulations. Of 396 total conditions simulated, EU had higher statistical power in 377 cases (95%). In 246 of these cases, EU power was 5 percentage points or more higher than AUC. In simulation runs evaluating the effect of the number of readers and cases on the baseline simulations, EU measure had equivalent power to AUC with fewer readers (9% to 28%) or fewer cases (18% to 41%). CONCLUSION These simulation studies provide further motivation for considering EU in studies of screening mammography technology and they motivate investigations of utility in other diagnostic tasks.
33
Obuchowski NA, Gallas BD, Hillis SL. Multi-reader ROC studies with split-plot designs: a comparison of statistical methods. Acad Radiol 2012; 19:1508-17. [PMID: 23122570] [DOI: 10.1016/j.acra.2012.09.012]
Abstract
RATIONALE AND OBJECTIVES Multireader imaging trials often use a factorial design, in which study patients undergo testing with all imaging modalities and readers interpret the results of all tests for all patients. A drawback of this design is the large number of interpretations required of each reader. Split-plot designs have been proposed as an alternative, in which one or a subset of readers interprets all images of a sample of patients, while other readers interpret the images of other samples of patients. In this paper, the authors compare three methods of analysis for the split-plot design. MATERIALS AND METHODS Three statistical methods are presented: the Obuchowski-Rockette method modified for the split-plot design, a newly proposed marginal-mean analysis-of-variance approach, and an extension of the three-sample U-statistic method. A simulation study using the Roe-Metz model was performed to compare the type I error rate, power, and confidence interval coverage of the three test statistics. RESULTS The type I error rates for all three methods are close to the nominal level but tend to be slightly conservative. The statistical power is nearly identical for the three methods. The coverage of 95% confidence intervals falls close to the nominal coverage for small and large sample sizes. CONCLUSIONS The split-plot multireader, multicase study design can be statistically efficient compared to the factorial design, reducing the number of interpretations required per reader. Three methods of analysis, shown to have nominal type I error rates, similar power, and nominal confidence interval coverage, are available for this study design.
34
Platiša L, Goossens B, Vansteenkiste E, Park S, Gallas BD, Badano A, Philips W. Channelized Hotelling observers for the assessment of volumetric imaging data sets. J Opt Soc Am A Opt Image Sci Vis 2011; 28:1145-1163. [PMID: 21643400] [DOI: 10.1364/josaa.28.001145]
Abstract
Current clinical practice is rapidly moving in the direction of volumetric imaging. For two-dimensional (2D) images, task-based medical image quality is often assessed using numerical model observers. For three-dimensional (3D) images, however, these models have been little explored so far. In this work, first, two novel designs of a multislice channelized Hotelling observer (CHO) are proposed for the task of detecting 3D signals in 3D images. The novel designs are then compared and evaluated in a simulation study with five different CHO designs: a single-slice model, three multislice models, and a volumetric model. Four different random background statistics are considered, both gaussian (noncorrelated and correlated gaussian noise) and non-gaussian (lumpy and clustered lumpy backgrounds). Overall, the results show that the volumetric model outperforms the others, while the disparity between the models decreases for greater complexity of the detection task. Among the multislice models, the second proposed CHO could most closely approach the volumetric model, whereas the first new CHO seems to be least affected by the number of training samples.
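In the single-image case, a CHO reduces to a Hotelling template applied to a vector of channel outputs; a minimal sketch on synthetic channel data (loosely, the multislice and volumetric variants studied in the paper differ in how the image data are channelized into that vector, a step omitted here):

```python
import numpy as np

rng = np.random.default_rng(7)

def cho_snr(v_absent, v_present):
    """Channelized Hotelling observer detectability on channel outputs.

    Template w = S^{-1} (mean difference), with S the average of the
    two class covariance matrices; returns the SNR of the test
    statistic t = w'v over the two classes.
    """
    d = v_present.mean(axis=0) - v_absent.mean(axis=0)
    s = 0.5 * (np.cov(v_absent.T) + np.cov(v_present.T))
    w = np.linalg.solve(s, d)
    t0, t1 = v_absent @ w, v_present @ w
    return (t1.mean() - t0.mean()) / np.sqrt(
        0.5 * (t0.var(ddof=1) + t1.var(ddof=1)))

# Synthetic outputs of 5 channels; the signal shifts each channel mean
n_images, n_channels = 500, 5
absent = rng.normal(0.0, 1.0, (n_images, n_channels))
present = rng.normal(0.3, 1.0, (n_images, n_channels))
print(cho_snr(absent, present))  # positive detectability index
```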
Affiliation(s)
- Ljiljana Platiša
- TELIN-IPI-IBBT, Ghent University, St-Pietersnieuwstraat 41, B-9000 Ghent, Belgium.
35
Gavrielides MA, Gallas BD, Lenz P, Badano A, Hewitt SM. Observer variability in the interpretation of HER2/neu immunohistochemical expression with unaided and computer-aided digital microscopy. Arch Pathol Lab Med 2011. [PMID: 21284444] [DOI: 10.1043/1543-2165-135.2.233]
Abstract
CONTEXT Observer variability in digital microscopy and the effect of computer-aided digital microscopy are underexamined areas in need of further research, considering the increasing use and future role of digital imaging in pathology. A reduction in observer variability using computer aids could enhance the statistical power of studies designed to determine the utility of new biomarkers and accelerate their incorporation in clinical practice. OBJECTIVES To quantify interobserver and intraobserver variability in immunohistochemical analysis of HER2/neu with digital microscopy and computer-aided digital microscopy, and to test the hypothesis that observer agreement in the quantitative assessment of HER2/neu immunohistochemical expression is increased with the use of computer-aided microscopy. DESIGN A set of 335 digital microscopy images, extracted from 64 breast cancer tissue slides stained with a HER2 antibody, was read by 14 observers in 2 reading modes: the unaided mode and the computer-aided mode. In the unaided mode, HER2 images were displayed on a calibrated color monitor with no other information, whereas in the computer-aided mode, observers were shown a HER2 image along with a corresponding feature plot showing computer-extracted values of membrane staining intensity and membrane completeness for the particular image under examination and, at the same time, mean feature values of the different HER2 categories. In both modes, observers were asked to provide a continuous score of HER2 expression. RESULTS Agreement analysis performed on the output of the study showed significant improvement in both interobserver and intraobserver agreement when the computer-aided reading mode was used to evaluate preselected image fields. CONCLUSION The role of computer-aided digital microscopy in reducing observer variability in immunohistochemistry is promising.
Collapse
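Interobserver agreement of the kind analyzed in this study is commonly summarized with chance-corrected statistics such as Cohen's kappa. A minimal sketch (the toy category scores below are invented for illustration, not study data):

```python
import numpy as np

def cohens_kappa(ratings_a, ratings_b, categories):
    """Chance-corrected agreement between two observers."""
    a, b = np.asarray(ratings_a), np.asarray(ratings_b)
    p_obs = np.mean(a == b)                              # observed agreement
    # expected agreement under independent marginal rating distributions
    p_exp = sum(np.mean(a == c) * np.mean(b == c) for c in categories)
    return (p_obs - p_exp) / (1 - p_exp)

# two observers scoring 8 images into HER2-like categories 0-3 (toy data)
obs1 = [0, 1, 1, 2, 3, 3, 2, 0]
obs2 = [0, 1, 2, 2, 3, 3, 2, 1]
print(round(cohens_kappa(obs1, obs2, range(4)), 3))   # → 0.667
```

The study itself scored HER2 on a continuous scale; kappa-style statistics apply after binning into clinical categories.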
Affiliation(s)
- Marios A Gavrielides
- Division of Imaging and Applied Mathematics, Office of Science and Engineering Laboratories, Center for Devices and Radiological Health, US Food and Drug Administration, Silver Spring, Maryland 20993, USA.
Collapse
|
37
|
Samuelson F, Gallas BD, Myers KJ, Petrick N, Pinsky P, Sahiner B, Campbell G, Pennello GA. The importance of ROC data. Acad Radiol 2011; 18:257-8; author reply 259-61. [PMID: 21232688 DOI: 10.1016/j.acra.2010.10.016] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [What about the content of this article? (0)] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/18/2010] [Revised: 10/18/2010] [Accepted: 10/20/2010] [Indexed: 11/19/2022]
|
38
|
Abstract
Multiclass receiver operating characteristic (ROC) analysis has remained an open theoretical problem since the introduction of binary ROC analysis in the 1950s. We previously developed a paradigm for three-class ROC analysis that extends and unifies the decision theoretic, linear discriminant analysis, and probabilistic foundations of binary ROC analysis in a three-class setting. One critical element in this paradigm is the equal error utility (EEU) assumption. This assumption allows us to reduce the intrinsic space of three-class ROC analysis (a 5-D hypersurface in a 6-D hyperspace) to a 2-D surface in the 3-D space of true positive fractions (sensitivity space). In this work, we show that this 2-D ROC surface fully and uniquely provides a complete descriptor of the optimal performance of a system on a three-class classification task, i.e., the triplet of likelihood ratio distributions, assuming such a triplet exists. To be specific, we consider two classifiers that utilize likelihood ratios, and we assume each classifier has a continuous and differentiable 2-D sensitivity-space ROC surface. Under these conditions, we prove that the classifiers have the same triplet of likelihood ratio distributions if and only if they have the same 2-D sensitivity-space ROC surfaces. As a result, the 2-D sensitivity surface contains complete information on the optimal three-class task performance for the corresponding likelihood ratio classifier.
Collapse
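A point on the sensitivity-space surface discussed in this abstract can be illustrated with a Monte Carlo toy model: three 1-D Gaussian classes and a maximum-likelihood decision rule, which is the likelihood-based rule for equal priors under equal error utilities. The class means, variance, and sample size below are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(1)
means, sigma = (-2.0, 0.0, 2.0), 1.0    # three 1-D Gaussian classes (toy model)

def classify(x):
    """Maximum-likelihood decision: pick the class with the highest p(x|k)."""
    lik = np.array([np.exp(-(x - m) ** 2 / (2 * sigma**2)) for m in means])
    return np.argmax(lik, axis=0)

# one operating point in the 3-D sensitivity space: (TPF1, TPF2, TPF3)
tpf = []
for k, m in enumerate(means):
    x = rng.normal(m, sigma, 100_000)   # cases truly from class k
    tpf.append(np.mean(classify(x) == k))
print([round(t, 2) for t in tpf])
```

For this configuration the decision boundaries sit at ±1, so the analytic triplet is (Φ(1), Φ(1) − Φ(−1), Φ(1)) ≈ (0.84, 0.68, 0.84); the Monte Carlo estimate should land close to that.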
Affiliation(s)
- Xin He
- Department of Radiology, Johns Hopkins School of Medicine, Baltimore, MD 21287, USA
- Brandon D. Gallas
- DIAM/OSEL/CDRH, Food and Drug Administration, Silver Spring, MD 20993, USA
- Eric C. Frey
- Department of Radiology, Johns Hopkins School of Medicine, Baltimore, MD 21287, USA
Collapse
|
39
|
Gallas BD, Bandos A, Samuelson FW, Wagner RF. A Framework for Random-Effects ROC Analysis: Biases with the Bootstrap and Other Variance Estimators. COMMUN STAT-THEOR M 2009. [DOI: 10.1080/03610920802610084] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/20/2022]
|
40
|
Abstract
Contrast sensitivity of the human visual system is a characteristic that can adversely affect human performance in detection tasks. In this paper, we propose a method for incorporating human contrast sensitivity in anthropomorphic model observers. In our method, we model human contrast sensitivity using the Barten model with the mean luminance of a region of interest centered at the signal location. In addition, one free parameter is varied to control the effect of the contrast sensitivity on the model observer's performance. We investigate our model of human contrast sensitivity in a channelized-Hotelling observer (CHO) with difference-of-Gaussian channels. We call the CHO incorporating the contrast sensitivity a contrast-sensitive CHO (CS-CHO). The human data from a psychophysical study by Park et al. [1] are used for comparing the performance of the CS-CHO to human performance. That study used Gaussian signals with six different signal intensities in non-Gaussian lumpy backgrounds. A value of the free parameter is chosen to match the performance of the CS-CHO to the mean human performance only at the strongest signal. Results show that the CS-CHO with the chosen value of the free parameter predicts the mean human performance at the five lower signal intensities. Our results show that the CS-CHO predicts human performance well as a function of signal intensity.
Collapse
Affiliation(s)
- Subok Park
- NIBIB/CDRH Laboratory for the Assessment of Medical Imaging Systems, Division of Imaging and Applied Mathematics, Center for Devices and Radiological Health, Food and Drug Administration, White Oak, MD 20993, USA.
Collapse
|
41
|
Kyprianou IS, Badano A, Gallas BD, Myers KJ. Singular value description of a digital radiographic detector: theory and measurements. Med Phys 2008; 35:4744-56. [PMID: 18975719 DOI: 10.1118/1.2975222] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/07/2022] Open
Abstract
The H operator represents the deterministic performance of any imaging system. For a linear, digital imaging system, this system operator can be written in terms of a matrix, H, that describes the deterministic response of the system to a set of point objects. A singular value decomposition of this matrix results in a set of orthogonal functions (singular vectors) that form the system basis. A linear combination of these vectors completely describes the transfer of objects through the linear system, where the respective singular values associated with each singular vector describe the magnitude with which that contribution to the object is transferred through the system. This paper is focused on the measurement, analysis, and interpretation of the H matrix for digital x-ray detectors. A key ingredient in the measurement of the H matrix is the detector response to a single x ray (or infinitesimal x-ray beam). The authors have developed a method to estimate the 2D detector shift-variant, asymmetric ray response function (RRF) from multiple measured line response functions (LRFs) using a modified edge technique. The RRF measurements cover a range of x-ray incident angles from 0 degrees (equivalent location at the detector center) to 30 degrees (equivalent location at the detector edge) for a standard radiographic or cone-beam CT geometric setup. To demonstrate the method, three beam qualities were tested using the inherent, Lu/Er, and Yb beam filtration. The authors show that measures using the LRF, derived from an edge measurement, underestimate the system's performance when compared with the H matrix derived using the RRF. Furthermore, the authors show that edge measurements must be performed in multiple directions in order to capture rotational asymmetries of the RRF. The authors interpret the results of the H matrix SVD, provide correlations with the familiar MTF methodology, and discuss the benefits of the H matrix technique with regard to signal detection theory and the characterization of shift-variant imaging systems.
Collapse
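The correlation this abstract draws between the H matrix SVD and the familiar MTF can be seen directly in the shift-invariant limit: for a circulant H, the singular values equal the magnitude of the DFT of the point response. A 1-D toy sketch (the blur kernel is an arbitrary illustrative choice):

```python
import numpy as np

n = 64
psf = np.zeros(n)
psf[:3] = [0.25, 0.5, 0.25]       # simple 1-D blur kernel (toy point response)
psf = np.roll(psf, -1)            # center the kernel circularly at index 0

# H: circulant system matrix whose columns are shifted copies of the response
H = np.column_stack([np.roll(psf, k) for k in range(n)])

sv = np.linalg.svd(H, compute_uv=False)       # singular value spectrum of H
mtf = np.abs(np.fft.fft(psf))                 # |DFT of point response| ~ MTF

# for a shift-invariant (circulant) system the two spectra agree
print(np.allclose(np.sort(sv), np.sort(mtf)))   # → True
```

For a shift-variant detector, as in the paper, H is no longer circulant and the SVD carries information the MTF cannot.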
Affiliation(s)
- Iacovos S Kyprianou
- NIBIB/CDRH Laboratory for the Assessment of Medical Imaging Systems, US Food and Drug Administration, New Hampshire Avenue, Silver Spring, Maryland 20993, USA.
Collapse
|
42
|
Liang H, Park S, Gallas BD, Myers KJ, Badano A. Image browsing in slow medical liquid crystal displays. Acad Radiol 2008; 15:370-82. [PMID: 18280935 DOI: 10.1016/j.acra.2007.10.017] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/09/2007] [Revised: 10/18/2007] [Accepted: 10/20/2007] [Indexed: 10/22/2022]
Abstract
RATIONALE AND OBJECTIVES Statistics show that radiologists are reading more studies than ever before, creating the challenge of interpreting an increasing number of images without compromising diagnostic performance. Stack-mode image display has the potential to allow radiologists to browse large three-dimensional (3D) datasets at refresh rates as high as 30 images/second. In this framework, the slow temporal response of liquid crystal displays (LCDs) can compromise the image quality when the images are browsed in a fast sequence. MATERIALS AND METHODS In this article, we report on the effect of the LCD response time at different image browsing speeds based on the performance of a contrast-sensitive channelized Hotelling observer. A stack of simulated 3D clustered lumpy background images with a designer nodule to be detected is used. The effect of different browsing speeds is calculated with LCD temporal response measurements from our previous work. The image set is then analyzed by the model observer, which has been shown to predict human detection performance in Gaussian and non-Gaussian lumpy backgrounds. This methodology allows us to quantify the effect of slow temporal response of medical liquid crystal displays on the performance of the anthropomorphic observers. RESULTS We find that the slow temporal response of the display device greatly affects lesion contrast and observer performance. A detectability decrease of more than 40% could be caused by the slow response of the display. CONCLUSIONS After validation with human observers, this methodology can be applied to more realistic background data with the goal of providing recommendations for the browsing speed of large volumetric image datasets (from computed tomography, magnetic resonance, or tomosynthesis) when read in stack-mode.
Collapse
|
43
|
Abstract
Evaluation of computational intelligence (CI) systems designed to improve the performance of a human operator is complicated by the need to include the effect of human variability. In this paper we consider human (reader) variability in the context of medical imaging computer-assisted diagnosis (CAD) systems, and we outline how to compare the detection performance of readers with and without the CAD. An effective and statistically powerful comparison can be accomplished with a receiver operating characteristic (ROC) experiment, summarized by the reader-averaged area under the ROC curve (AUC). The comparison requires sophisticated yet well-developed methods for multi-reader multi-case (MRMC) variance analysis. MRMC variance analysis accounts for random readers, random cases, and correlations in the experiment. In this paper, we extend the methods available for estimating this variability. Specifically, we present a method that can treat arbitrary study designs. Most methods treat only the fully-crossed study design, where every reader reads every case in two experimental conditions. We demonstrate our method with a computer simulation, and we assess the statistical power of a variety of study designs.
Collapse
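The reader-averaged AUC underlying MRMC analysis reduces, per reader, to the Mann–Whitney statistic, and a study design need not be fully crossed: each reader's AUC uses only the cases that reader actually read. A minimal sketch with invented scores (this is only the point estimate, not the paper's variance analysis, which additionally estimates reader and case components):

```python
import numpy as np

def auc(neg, pos):
    """Mann-Whitney AUC: P(pos > neg) plus half-credit for ties."""
    neg, pos = np.asarray(neg, float), np.asarray(pos, float)
    diff = pos[:, None] - neg[None, :]          # all pos/neg score pairs
    return float((diff > 0).mean() + 0.5 * (diff == 0).mean())

# arbitrary (not fully crossed) design: each reader scores a subset of cases
scores = {                  # reader -> (scores on non-diseased, on diseased)
    "r1": ([0.1, 0.3, 0.2], [0.7, 0.9]),
    "r2": ([0.4, 0.2],      [0.6, 0.8, 0.3]),
}
reader_aucs = [auc(n, p) for n, p in scores.values()]
print(round(float(np.mean(reader_aucs)), 3))    # → 0.917
```

The reader names, scores, and case counts are hypothetical; a real MRMC study would track which truth-known cases each reader interpreted in each modality.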
Affiliation(s)
- Brandon D Gallas
- NIBIB/CDRH Laboratory for the Assessment of Medical Imaging Systems, FDA, Silver Spring, MD 20993-0002, United States.
Collapse
|
44
|
Abstract
Multireader multicase (MRMC) variance analysis has become widely utilized to analyze observer studies for which the summary measure is the area under the receiver operating characteristic (ROC) curve. We extend MRMC variance analysis to binary data and also to generic study designs in which every reader may not interpret every case. A subset of the fundamental moments central to MRMC variance analysis of the area under the ROC curve (AUC) is found to be required. Through multiple simulation configurations, we compare our unbiased variance estimates to naïve estimates across a range of study designs, average percent correct, and numbers of readers and cases.
Collapse
Affiliation(s)
- Brandon D Gallas
- National Institute of Biomedical Imaging and Bioengineering/Center for Devices and Radiological Health, US Food and Drug Administration, Silver Spring, Maryland 20993, USA.
Collapse
|
45
|
Abstract
RATIONALE AND OBJECTIVES One popular study design for estimating the area under the receiver operating characteristic curve (AUC) is the one in which a set of readers reads a set of cases: a fully crossed design in which every reader reads every case. The variability of the subsequent reader-averaged AUC has two sources: the multiple readers and the multiple cases (MRMC). In this article, we present a nonparametric estimate for the variance of the reader-averaged AUC that is unbiased and does not use resampling tools. MATERIALS AND METHODS The one-shot estimate is based on the MRMC variance derived by the mechanistic approach of Barrett et al. (2005), as well as the nonparametric variance of a single-reader AUC derived in the literature on U statistics. We investigate the bias and variance properties of the one-shot estimate through a set of Monte Carlo simulations with simulated model observers and images. The different simulation configurations vary numbers of readers and cases, amounts of image noise and internal noise, as well as how the readers are constructed. We compare the one-shot estimate to a method that uses the jackknife resampling technique with an analysis of variance model at its foundation (Dorfman et al. 1992). The name one-shot highlights that resampling is not used. RESULTS The one-shot and jackknife estimators behave similarly, with the one-shot being marginally more efficient when the number of cases is small. CONCLUSIONS We have derived a one-shot estimate of the MRMC variance of AUC that is based on a probabilistic foundation with limited assumptions, is unbiased, and compares favorably to an established estimate.
Collapse
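The U-statistic machinery this one-shot estimator builds on can be sketched for a single reader via the classic moment decomposition of the AUC variance (shown here in the Hanley–McNeil form with plug-in moments, purely for illustration; the paper's full MRMC estimator additionally accounts for reader variability and correlations):

```python
import numpy as np

def auc_and_var(neg, pos):
    """Mann-Whitney AUC plus a moment-based variance estimate
    (Hanley-McNeil form, plug-in moments; illustrative, not the one-shot MRMC)."""
    neg, pos = np.asarray(neg, float), np.asarray(pos, float)
    n0, n1 = len(neg), len(pos)
    s = (pos[:, None] > neg[None, :]).astype(float)   # s[i, j] = pos_i beats neg_j
    A = s.mean()                                      # AUC point estimate
    q1 = (s.mean(axis=0) ** 2).mean()   # two positives vs a common negative
    q2 = (s.mean(axis=1) ** 2).mean()   # one positive vs two negatives
    var = (A * (1 - A) + (n1 - 1) * (q1 - A**2)
           + (n0 - 1) * (q2 - A**2)) / (n0 * n1)
    return float(A), float(var)

A, var = auc_and_var([1, 2, 3], [2.5, 3.5])   # toy scores, no ties
print(round(A, 3), round(var, 4))             # → 0.833 0.0417
```

No resampling appears anywhere above, which is the spirit of the "one-shot" name: the variance comes from moments of the score comparisons, not from bootstrap or jackknife replicates.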
Affiliation(s)
- Brandon D Gallas
- NIBIB/CDRH Laboratory for the Assessment of Medical Imaging Systems, US FDA/CDRH, Bldg 1, HFZ-140, 12720 Twinbrook Parkway (Rm 158), Rockville MD 20852-1720, USA.
Collapse
|
46
|
Badano A, Gallas BD. Detectability decreases with off-normal viewing in medical liquid crystal displays. Acad Radiol 2006; 13:210-8. [PMID: 16428057 DOI: 10.1016/j.acra.2005.08.015] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/22/2005] [Revised: 08/08/2005] [Accepted: 08/09/2005] [Indexed: 11/15/2022]
Abstract
RATIONALE AND OBJECTIVES To quantify the reduction in detection performance of subtle signals at off-normal viewing directions in medical active-matrix liquid crystal displays (AMLCDs). MATERIALS AND METHODS Fifty synthetic image pairs per viewing condition (a total of 350) were used in a two-alternative forced-choice experiment in which 11 trained observers viewed images at 0, 30, and 45 degrees from the display normal, along the diagonal axis of a 5 million pixel in-plane switching monochrome AMLCD. The images were generated using white-noise backgrounds. A Gaussian signal was added to the signal-present set with three different signal amplitudes (4, 8, and 12 gray levels in a 10-bit scale). RESULTS The average percent correct achieved for a signal of 4 gray levels was 79.6 (95% confidence intervals based on reader and case variability: 71.6-86.9), 63.4 (CI 56.0-71.3), and 55.3 (CI 48.4-62.0), for 0, 30 and 45 degrees from the display normal, respectively. When the signal amplitude was increased by a factor of two, the performance was 76.9 and 57.0 for 30 and 45 degrees, respectively, and 95.3 and 85.3 when the amplitude was increased by a factor of three. The observers took on average about twice as long and as much as seven times as long to reach decisions in off-normal viewing. CONCLUSIONS Off-normal viewing of diagnostic images in AMLCDs significantly reduces the detection of low-contrast abnormalities. Increased off-normal signal amplitudes were required to regain the detection performance measured for normal viewing. We observed this decrease in detection performance for off-normal viewing even when measured decision times were about twice as long as for normal viewing.
Collapse
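In a two-alternative forced-choice experiment, percent correct maps to a detectability index through PC = Φ(d′/√2). A stdlib-only sketch that inverts this relation for the 79.6% reported above (the bisection-based inverse normal CDF is an implementation convenience, not a method from the paper):

```python
import math

def pc_to_dprime(pc):
    """2AFC relation: PC = Phi(d'/sqrt(2)); invert Phi by bisection."""
    lo, hi = -10.0, 10.0
    for _ in range(100):                       # bisect for z = Phi^-1(pc)
        mid = (lo + hi) / 2
        if 0.5 * (1 + math.erf(mid / math.sqrt(2))) < pc:
            lo = mid
        else:
            hi = mid
    return math.sqrt(2) * (lo + hi) / 2        # d' = sqrt(2) * z

print(round(pc_to_dprime(0.796), 2))   # PC observed at normal viewing
```

Applying the same conversion to the off-normal percent-correct values quantifies how much detectability is lost at 30 and 45 degrees.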
Affiliation(s)
- Aldo Badano
- NIBIB/CDRH Laboratory for the Assessment of Medical Imaging, Center for Devices and Radiological Health, Food and Drug Administration, Rockville, MD 20857, USA.
Collapse
|
47
|
Abstract
The use of imaging phantoms is a common method of evaluating image quality in the clinical setting. These evaluations rely on a subjective decision by a human observer with respect to the faintest detectable signal(s) in the image. Because of the variable and subjective nature of the human-observer scores, the evaluations manifest a lack of precision and a potential for bias. The advent of digital imaging systems with their inherent digital data provides the opportunity to use techniques that do not rely on human-observer decisions and thresholds. Using the digital data, signal-detection theory (SDT) provides the basis for more objective and quantitative evaluations which are independent of a human-observer decision threshold. In an SDT framework, the evaluation of imaging phantoms represents a "signal-known-exactly/background-known-exactly" ("SKE/BKE") detection task. In this study, we compute the performance of prewhitening and nonprewhitening model observers in terms of the observer signal-to-noise ratio (SNR) for these "SKE/BKE" tasks. We apply the evaluation methods to a number of imaging systems. For example, we use data from a laboratory implementation of digital radiography and from a full-field digital mammography system in a clinical setting. In addition, we make a comparison of our methods to human-observer scoring of a set of digital images of the CDMAM phantom available from the internet (EUREF, the European Reference Organization). In the latter case, we show a significant increase in the precision of the quantitative methods versus the variability in the scores from human observers on the same set of images. As regards bias, the performance of a model observer estimated from a finite data set is known to be biased. In this study, we minimize the bias and estimate the variance of the observer SNR using statistical resampling techniques, namely "bootstrapping" and "shuffling" of the data sets. Our methods provide objective and quantitative evaluation of imaging systems with increased precision and reduced bias.
Collapse
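For an SKE/BKE task the two observer SNRs used in this study have closed forms: SNR² = sᵀK⁻¹s for the prewhitening observer and SNR² = (sᵀs)²/(sᵀKs) for the nonprewhitening observer, where s is the signal and K the background covariance. A toy sketch (signal and covariance invented) showing that the two coincide for white noise:

```python
import numpy as np

def pw_snr2(signal, K):
    """Prewhitening ideal-observer SNR^2 = s^T K^-1 s (SKE/BKE task)."""
    return float(signal @ np.linalg.solve(K, signal))

def npw_snr2(signal, K):
    """Nonprewhitening observer SNR^2 = (s^T s)^2 / (s^T K s)."""
    return float((signal @ signal) ** 2 / (signal @ K @ signal))

n = 16
s = np.exp(-((np.arange(n) - n / 2) ** 2) / 8.0)   # small 1-D Gaussian signal
K = 0.5 * np.eye(n)                                # white-noise covariance
# with white noise the prewhitening and nonprewhitening observers coincide
print(np.isclose(pw_snr2(s, K), npw_snr2(s, K)))   # → True
```

With correlated noise the prewhitening SNR is at least as large, which is why the distinction matters for real detector noise.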
Affiliation(s)
- Robert M Gagne
- Center for Devices and Radiological Health, FDA, 12720 Twinbrook Parkway, Rockville, Maryland 20857, USA.
Collapse
|
48
|
Kyprianou IS, Ganguly A, Rudin S, Bednarek DR, Gallas BD, Myers KJ. Efficiency of the Human Observer Compared to an Ideal Observer Based on a Generalized NEQ Which Incorporates Scatter and Geometric Unsharpness: Evaluation with a 2AFC Experiment. Proc SPIE Int Soc Opt Eng 2005; 5749:251-262. [PMID: 21311735 DOI: 10.1117/12.595870] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/14/2022]
Abstract
Under certain assumptions, the detectability of the ideal observer can be defined as the integral of the system noise-equivalent quanta multiplied by the squared object spatial frequency distribution. Using the detector noise-equivalent quanta (NEQ(D)) for the calculation of detectability inadequately describes the performance of an x-ray imaging system because it does not take into account the effects of patient scatter and geometric unsharpness. As a result, the ideal detectability index is overestimated, and hence the efficiency of the human observer in detecting objects is underestimated. We define a generalized NEQ (GNEQ) for an x-ray system, referenced at the object plane, that incorporates the scatter fraction, the spatial distributions of scatter and focal spot, the detector MTF(D), and the detector normalized noise power spectrum (NNPS(D)). This GNEQ was used in the definition of the ideal detectability for the evaluation of human observer efficiency in a two-alternative forced-choice (2-AFC) experiment, and was compared with the case where only the NEQ(D) was used in the detectability calculations. The 2-AFC experiment involved the detection of images of polyethylene tubes (diameters between 100 and 300 μm) filled with iodine contrast (concentrations between 0 and 120 mg/cm³) placed onto a uniform head-equivalent phantom near the surface of a microangiographic detector (43 μm pixel size). The resulting efficiency of the human observer without regard to the effects of scatter and geometric unsharpness was 30%; when these effects were considered, the efficiency increased to 70%. The ideal observer with the GNEQ can thus serve as a simple method for optimizing a complete imaging system.
Collapse
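The detectability integral named in the abstract's first sentence, d′² = ∫ NEQ(f)·|ΔW(f)|² df, is straightforward to evaluate numerically. A 1-D toy sketch (the NEQ shape, the scatter degradation factor, and the object spectrum are all invented; a real calculation would use measured GNEQ components and the proper 2-D frequency weighting):

```python
import numpy as np

def detectability2(neq, obj_spectrum, df):
    """Ideal-observer d'^2 = sum of NEQ(f) * |object spectrum(f)|^2 * df."""
    return float(np.sum(neq * np.abs(obj_spectrum) ** 2) * df)

f = np.linspace(0, 5, 501)             # spatial frequency grid (toy, cycles/mm)
df = f[1] - f[0]
neq_d = 1e4 * np.exp(-f / 2)           # detector-only NEQ (illustrative shape)
gneq = 0.6 * neq_d                     # GNEQ degraded by scatter/unsharpness (toy)
obj = np.exp(-(np.pi * 0.1 * f) ** 2)  # Gaussian object transform (0.1 mm scale)

d2_det = detectability2(neq_d, obj, df)
d2_gen = detectability2(gneq, obj, df)
# detector-only NEQ overestimates detectability, as the abstract notes
print(d2_det > d2_gen)                 # → True
```

Since human efficiency is the ratio of human to ideal d′², dividing by the smaller GNEQ-based d′² is exactly what raised the reported efficiency from 30% to 70%.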
Affiliation(s)
- Iacovos S Kyprianou
- Laboratory for the Assessment of Medical Imaging Systems, NIBIB/CDRH, US FDA
Collapse
|
49
|
Abstract
Noise transfer in granular x-ray imaging phosphor screens is not proportional to the square of the magnitude of the signal transfer when the transfer properties are considered for the entire screen thickness, unless appropriately weighted at each depth of interaction. This property, known as the Lubberts effect, has not yet been studied in columnar structured screens because of the lack of a generalized description of the depth-dependent light transport. In this paper, we investigate the signal and noise transfer characteristics of columnar phosphors used in digital mammography detectors using DETECT-II, an optical Monte Carlo light transport simulation code. We first validate our choice of optical parameters for the description of granular and columnar screens using published normalized modulation transfer function (MTF) experimental data. Our calculations of the MTF match empirically measured MTFs for a granular film/screen analog system, and for an indirect x-ray digital imaging system with a CsI:Tl screen representative of digital mammography systems. Using the depth-dependent spread functions and collection efficiencies, we calculate the signal and noise transfer functions and the Lubberts fraction, which is the ratio of the signal transfer function to the noise transfer function, for different screen thicknesses of granular and columnar phosphors. We find that the Lubberts fraction of an 85 μm granular screen model corresponding to a Gd2O2S:Tb screen is similar to the fraction for a 100 μm columnar CsI:Tl screen.
Collapse
Affiliation(s)
- Aldo Badano
- Laboratory for the Assessment of Medical Imaging Systems, Center for Devices and Radiological Health (FDA) and National Institute of Biomedical Imaging and Bioengineering (NIH), Rockville, Maryland 20857, USA.
Collapse
|
50
|
Abstract
In this paper, we model an x-ray imaging system, paying special attention to the energy- and depth-dependent characteristics of the inputs and interactions: x rays are polychromatic; the interaction depth and the conversion to optical photons are energy-dependent; and optical scattering and the collection efficiency depend on the depth of interaction. The model we construct is a random function of the point process that begins with the distribution of x rays incident on the phosphor and ends with optical photons being detected by the active area of detector pixels to form an image. We show how the point-process representation can be used to calculate the characteristic statistics of the model. We then simulate a Gd2O2S:Tb phosphor, estimate its characteristic statistics, and proceed with a signal-detection experiment to investigate the impact of the pixel fill factor on detecting spherical calcifications (the signal). The two extremes possible from this experiment are that SNR² does not change with fill factor or changes in proportion to fill factor. In our results, the impact of fill factor is between these extremes, and depends on the diameter of the signal.
Collapse
Affiliation(s)
- Brandon D Gallas
- NIBIB/CDRH Laboratory for the Assessment of Medical Imaging Systems, Rockville, Maryland 20857, USA
Collapse
|