1. Coroller T, Sahiner B, Amatya A, Gossmann A, Karagiannis K, Moloney C, Samala RK, Santana-Quintero L, Solovieff N, Wang C, Amiri-Kordestani L, Cao Q, Cha KH, Charlab R, Cross FH, Hu T, Huang R, Kraft J, Krusche P, Li Y, Li Z, Mazo I, Paul R, Schnakenberg S, Serra P, Smith S, Song C, Su F, Tiwari M, Vechery C, Xiong X, Zarate JP, Zhu H, Chakravartty A, Liu Q, Ohlssen D, Petrick N, Schneider JA, Walderhaug M, Zuber E. Methodology for Good Machine Learning with Multi-Omics Data. Clin Pharmacol Ther 2024; 115:745-757. PMID: 37965805. DOI: 10.1002/cpt.3105. Received August 18, 2023; accepted October 20, 2023.
Abstract
In 2020, Novartis Pharmaceuticals Corporation and the U.S. Food and Drug Administration (FDA) began a 4-year scientific collaboration on complex new data modalities and advanced analytics. The scientific aim, pursued under a Research Collaboration Agreement, was to identify novel radio-genomics-based prognostic and predictive factors for HR+/HER2- metastatic breast cancer. The collaboration has provided valuable insights for successfully implementing future scientific projects, particularly those using artificial intelligence and machine learning. This tutorial offers tangible guidelines for a multi-omics project involving multidisciplinary expert teams spanning different institutions. We cover key ideas, such as "maintaining effective communication" and "following good data science practices," followed by the four steps of exploratory projects: (1) plan, (2) design, (3) develop, and (4) disseminate. We break each step into smaller concepts with strategies for implementation and provide illustrations from our collaboration to give readers actionable guidance.
Affiliation(s)
- Berkman Sahiner, Center for Devices and Radiological Health, U.S. Food and Drug Administration, Silver Spring, Maryland, USA
- Anup Amatya, Center for Drug Evaluation and Research, U.S. Food and Drug Administration, Silver Spring, Maryland, USA
- Alexej Gossmann, Center for Devices and Radiological Health, U.S. Food and Drug Administration, Silver Spring, Maryland, USA
- Konstantinos Karagiannis, Center for Biologics Evaluation and Research, U.S. Food and Drug Administration, Silver Spring, Maryland, USA
- Ravi K Samala, Center for Devices and Radiological Health, U.S. Food and Drug Administration, Silver Spring, Maryland, USA
- Luis Santana-Quintero, Center for Biologics Evaluation and Research, U.S. Food and Drug Administration, Silver Spring, Maryland, USA
- Nadia Solovieff, Novartis Pharmaceutical Company, East Hanover, New Jersey, USA
- Craig Wang, Novartis Pharma AG, Rotkreuz, Switzerland
- Laleh Amiri-Kordestani, Center for Drug Evaluation and Research, U.S. Food and Drug Administration, Silver Spring, Maryland, USA
- Qian Cao, Center for Devices and Radiological Health, U.S. Food and Drug Administration, Silver Spring, Maryland, USA
- Kenny H Cha, Center for Devices and Radiological Health, U.S. Food and Drug Administration, Silver Spring, Maryland, USA
- Rosane Charlab, Center for Drug Evaluation and Research, U.S. Food and Drug Administration, Silver Spring, Maryland, USA
- Frank H Cross, Center for Drug Evaluation and Research, U.S. Food and Drug Administration, Silver Spring, Maryland, USA
- Tingting Hu, Center for Devices and Radiological Health, U.S. Food and Drug Administration, Silver Spring, Maryland, USA
- Ruihao Huang, Center for Drug Evaluation and Research, U.S. Food and Drug Administration, Silver Spring, Maryland, USA
- Jeffrey Kraft, Center for Drug Evaluation and Research, U.S. Food and Drug Administration, Silver Spring, Maryland, USA
- Yutong Li, Novartis Pharmaceutical Company, East Hanover, New Jersey, USA
- Zheng Li, Novartis Pharmaceutical Company, East Hanover, New Jersey, USA
- Ilya Mazo, Center for Biologics Evaluation and Research, U.S. Food and Drug Administration, Silver Spring, Maryland, USA
- Rahul Paul, Center for Biologics Evaluation and Research, U.S. Food and Drug Administration, Silver Spring, Maryland, USA
- Paolo Serra, Novartis Pharmaceutical Company, East Hanover, New Jersey, USA
- Sean Smith, Center for Biologics Evaluation and Research, U.S. Food and Drug Administration, Silver Spring, Maryland, USA
- Chi Song, Center for Drug Evaluation and Research, U.S. Food and Drug Administration, Silver Spring, Maryland, USA
- Fei Su, Novartis Pharmaceutical Company, East Hanover, New Jersey, USA
- Mohit Tiwari, Center for Drug Evaluation and Research, U.S. Food and Drug Administration, Silver Spring, Maryland, USA
- Colin Vechery, Novartis Pharmaceutical Company, East Hanover, New Jersey, USA
- Xin Xiong, Center for Devices and Radiological Health, U.S. Food and Drug Administration, Silver Spring, Maryland, USA
- Hao Zhu, Center for Drug Evaluation and Research, U.S. Food and Drug Administration, Silver Spring, Maryland, USA
- Qi Liu, Center for Drug Evaluation and Research, U.S. Food and Drug Administration, Silver Spring, Maryland, USA
- David Ohlssen, Novartis Pharmaceutical Company, East Hanover, New Jersey, USA
- Nicholas Petrick, Center for Devices and Radiological Health, U.S. Food and Drug Administration, Silver Spring, Maryland, USA
- Julie A Schneider, Oncology Center of Excellence, U.S. Food and Drug Administration, Silver Spring, Maryland, USA
- Mark Walderhaug, Center for Biologics Evaluation and Research, U.S. Food and Drug Administration, Silver Spring, Maryland, USA
2. Drukker K, Sahiner B, Hu T, Kim GH, Whitney HM, Baughan N, Myers KJ, Giger ML, McNitt-Gray M. MIDRC-MetricTree: a decision tree-based tool for recommending performance metrics in artificial intelligence-assisted medical image analysis. J Med Imaging (Bellingham) 2024; 11:024504. PMID: 38576536. PMCID: PMC10990563. DOI: 10.1117/1.jmi.11.2.024504. Received August 30, 2023; revised February 16, 2024; accepted March 18, 2024.
Abstract
Purpose The Medical Imaging and Data Resource Center (MIDRC) was created to facilitate medical imaging machine learning (ML) research for tasks including early detection, diagnosis, prognosis, and assessment of treatment response related to the coronavirus disease 2019 pandemic and beyond. The purpose of this work was to create a publicly available metrology resource to assist researchers in evaluating the performance of their medical image analysis ML algorithms. Approach An interactive decision tree, called MIDRC-MetricTree, was developed, organized by the type of task that the ML algorithm was trained to perform. The tree was designed so that (1) users select information such as the type of task, the nature of the reference standard, and the type of algorithm output, and (2) based on this input, it recommends appropriate performance evaluation approaches and metrics, including literature references and, when possible, links to publicly available software/code as well as short tutorial videos. Results Five types of tasks were identified for the decision tree: (a) classification, (b) detection/localization, (c) segmentation, (d) time-to-event (TTE) analysis, and (e) estimation. As an example, the classification branch covers two-class (binary) and multiclass classification tasks and suggests methods, metrics, software/code, and literature references for algorithms that produce either binary or non-binary (e.g., continuous) output and for reference standards with negligible or non-negligible variability and unreliability. Conclusions The publicly available decision tree is a resource to assist researchers in conducting task-specific performance evaluations for classification, detection/localization, segmentation, TTE, and estimation tasks.
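The branch-and-recommend logic described above can be pictured as a lookup keyed on the user's selections. The following is a minimal sketch only: the task/output combinations and metric lists are illustrative placeholders, not the actual MIDRC-MetricTree content.

```python
# Hypothetical MetricTree-style lookup: (task, output type) -> suggested metrics.
METRIC_TREE = {
    ("classification", "binary"): ["sensitivity/specificity", "ROC AUC"],
    ("classification", "continuous"): ["ROC AUC", "calibration"],
    ("segmentation", "mask"): ["Dice coefficient", "Hausdorff distance"],
    ("time-to-event", "risk score"): ["concordance index (c-index)"],
    ("estimation", "continuous"): ["bias", "root-mean-square error"],
}

def recommend(task: str, output_type: str) -> list[str]:
    """Return suggested evaluation metrics for a task/output combination."""
    return METRIC_TREE.get((task, output_type), ["no recommendation found"])

print(recommend("segmentation", "mask"))
```

In the real tool, each recommendation would also carry literature references and software links; here a flat dictionary stands in for the interactive tree.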
Affiliation(s)
- Karen Drukker, University of Chicago, Department of Radiology, Chicago, Illinois, United States
- Berkman Sahiner, U.S. Food and Drug Administration, Bethesda, Maryland, United States
- Tingting Hu, U.S. Food and Drug Administration, Bethesda, Maryland, United States
- Grace Hyun Kim, University of California Los Angeles, Los Angeles, California, United States
- Heather M. Whitney, University of Chicago, Department of Radiology, Chicago, Illinois, United States
- Natalie Baughan, University of Chicago, Department of Radiology, Chicago, Illinois, United States
- Maryellen L. Giger, University of Chicago, Department of Radiology, Chicago, Illinois, United States
- Michael McNitt-Gray, University of California Los Angeles, Los Angeles, California, United States
3. Mahmood U, Shukla-Dave A, Chan HP, Drukker K, Samala RK, Chen Q, Vergara D, Greenspan H, Petrick N, Sahiner B, Huo Z, Summers RM, Cha KH, Tourassi G, Deserno TM, Grizzard KT, Näppi JJ, Yoshida H, Regge D, Mazurchuk R, Suzuki K, Morra L, Huisman H, Armato SG, Hadjiiski L. Artificial intelligence in medicine: mitigating risks and maximizing benefits via quality assurance, quality control, and acceptance testing. BJR Artif Intell 2024; 1:ubae003. PMID: 38476957. PMCID: PMC10928809. DOI: 10.1093/bjrai/ubae003. Received October 17, 2023; revised January 8, 2024; accepted January 12, 2024.
Abstract
The adoption of artificial intelligence (AI) tools in medicine poses challenges to existing clinical workflows. This commentary discusses the necessity of context-specific quality assurance (QA), emphasizing the need for robust QA measures with quality control (QC) procedures that encompass (1) acceptance testing (AT) before clinical use, (2) continuous QC monitoring, and (3) adequate user training. The discussion also covers essential components of AT and QA, illustrated with real-world examples. We also highlight what we see as the shared responsibility of manufacturers or vendors, regulators, healthcare systems, medical physicists, and clinicians to enact appropriate testing and oversight to ensure a safe and equitable transformation of medicine through AI.
Affiliation(s)
- Usman Mahmood, Department of Medical Physics, Memorial Sloan-Kettering Cancer Center, New York, NY, 10065, United States
- Amita Shukla-Dave, Departments of Medical Physics and Radiology, Memorial Sloan-Kettering Cancer Center, New York, NY, 10065, United States
- Heang-Ping Chan, Department of Radiology, University of Michigan, Ann Arbor, MI, 48109, United States
- Karen Drukker, Department of Radiology, University of Chicago, Chicago, IL, 60637, United States
- Ravi K Samala, Office of Science and Engineering Laboratories, Center for Devices and Radiological Health, U.S. Food and Drug Administration, Silver Spring, MD, 20993, United States
- Quan Chen, Department of Radiation Oncology, Mayo Clinic Arizona, Phoenix, AZ, 85054, United States
- Daniel Vergara, Department of Radiology, University of Washington, Seattle, WA, 98195, United States
- Hayit Greenspan, Biomedical Engineering and Imaging Institute, Department of Radiology, Icahn School of Medicine at Mt Sinai, New York, NY, 10029, United States
- Nicholas Petrick, Office of Science and Engineering Laboratories, Center for Devices and Radiological Health, U.S. Food and Drug Administration, Silver Spring, MD, 20993, United States
- Berkman Sahiner, Office of Science and Engineering Laboratories, Center for Devices and Radiological Health, U.S. Food and Drug Administration, Silver Spring, MD, 20993, United States
- Zhimin Huo, Tencent America, Palo Alto, CA, 94306, United States
- Ronald M Summers, Radiology and Imaging Sciences, National Institutes of Health Clinical Center, Bethesda, MD, 20892, United States
- Kenny H Cha, Office of Science and Engineering Laboratories, Center for Devices and Radiological Health, U.S. Food and Drug Administration, Silver Spring, MD, 20993, United States
- Georgia Tourassi, Computing and Computational Sciences Directorate, Oak Ridge National Laboratory, Oak Ridge, TN, 37830, United States
- Thomas M Deserno, Peter L. Reichertz Institute for Medical Informatics, TU Braunschweig and Hannover Medical School, Braunschweig, Niedersachsen, 38106, Germany
- Kevin T Grizzard, Department of Radiology and Biomedical Imaging, Yale University School of Medicine, New Haven, CT, 06510, United States
- Janne J Näppi, 3D Imaging Research, Department of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, 02114, United States
- Hiroyuki Yoshida, 3D Imaging Research, Department of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, 02114, United States
- Daniele Regge, Radiology Unit, Candiolo Cancer Institute, FPO-IRCCS, Candiolo, 10060, Italy; Department of Translational Research and of New Surgical and Medical Technologies, University of Pisa, Pisa, 56126, Italy
- Richard Mazurchuk, Division of Cancer Prevention, National Cancer Institute, National Institutes of Health, Bethesda, MD, 20892, United States
- Kenji Suzuki, Institute of Innovative Research, Tokyo Institute of Technology, Midori-ku, Yokohama, Kanagawa, 226-8503, Japan
- Lia Morra, Department of Control and Computer Engineering, Politecnico di Torino, Torino, Piemonte, 10129, Italy
- Henkjan Huisman, Radboud Institute for Health Sciences, Radboud University Medical Center, Nijmegen, Gelderland, 6525 GA, Netherlands
- Samuel G Armato, Department of Radiology, University of Chicago, Chicago, IL, 60637, United States
- Lubomir Hadjiiski, Department of Radiology, University of Michigan, Ann Arbor, MI, 48109, United States
4. Burgon A, Sahiner B, Petrick N, Pennello G, Cha KH, Samala RK. Decision region analysis for generalizability of artificial intelligence models: estimating model generalizability in the case of cross-reactivity and population shift. J Med Imaging (Bellingham) 2024; 11:014501. PMID: 38283653. PMCID: PMC10810180. DOI: 10.1117/1.jmi.11.1.014501. Received June 30, 2023; revised December 14, 2023; accepted December 28, 2023.
Abstract
Purpose Understanding an artificial intelligence (AI) model's ability to generalize to its target population is critical to ensuring the safe and effective usage of AI in medical devices. A traditional generalizability assessment relies on the availability of large, diverse datasets, which are difficult to obtain in many medical imaging applications. We present an approach for enhanced generalizability assessment by examining the decision space beyond the available testing data distribution. Approach Vicinal distributions of virtual samples are generated by interpolating between triplets of test images. The generated virtual samples leverage the characteristics already in the test set, increasing the sample diversity while remaining close to the AI model's data manifold. We demonstrate the generalizability assessment approach on the non-clinical tasks of classifying patient sex, race, COVID status, and age group from chest x-rays. Results Decision region composition analysis for generalizability indicated that a disproportionately large portion of the decision space belonged to a single "preferred" class for each task, despite comparable performance on the evaluation dataset. Evaluation using cross-reactivity and population shift strategies indicated a tendency to overpredict samples as belonging to the preferred class (e.g., COVID negative) for patients whose subgroup was not represented in the model development data. Conclusions An analysis of an AI model's decision space has the potential to provide insight into model generalizability. Our approach uses the analysis of composition of the decision space to obtain an improved assessment of model generalizability in the case of limited test data.
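The vicinal-distribution idea above, interpolating between triplets of test images to create virtual samples near the data manifold, can be sketched with random convex combinations. This is a simplified illustration: the Dirichlet weighting below is an assumed, illustrative choice, not necessarily the exact interpolation scheme used in the paper.

```python
import numpy as np

def virtual_samples(x1, x2, x3, n=5, seed=None):
    """Generate n virtual images inside the triangle spanned by a triplet of
    test images, using random convex combinations of the three images."""
    rng = np.random.default_rng(seed)
    w = rng.dirichlet(np.ones(3), size=n)          # (n, 3); each row sums to 1
    stack = np.stack([x1, x2, x3]).astype(float)   # (3, H, W)
    # Weighted sum over the triplet axis -> (n, H, W)
    return np.tensordot(w, stack, axes=(1, 0))
```

Because each virtual sample is a convex combination, its pixel values stay within the range spanned by the three source images, which keeps the probes close to the characteristics already present in the test set.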
Affiliation(s)
- Alexis Burgon, U.S. Food and Drug Administration, Center for Devices and Radiological Health, Office of Science and Engineering Laboratories, Division of Imaging, Diagnostics, and Software Reliability, Silver Spring, Maryland, United States
- Berkman Sahiner, U.S. Food and Drug Administration, Center for Devices and Radiological Health, Office of Science and Engineering Laboratories, Division of Imaging, Diagnostics, and Software Reliability, Silver Spring, Maryland, United States
- Nicholas Petrick, U.S. Food and Drug Administration, Center for Devices and Radiological Health, Office of Science and Engineering Laboratories, Division of Imaging, Diagnostics, and Software Reliability, Silver Spring, Maryland, United States
- Gene Pennello, U.S. Food and Drug Administration, Center for Devices and Radiological Health, Office of Science and Engineering Laboratories, Division of Imaging, Diagnostics, and Software Reliability, Silver Spring, Maryland, United States
- Kenny H. Cha, U.S. Food and Drug Administration, Center for Devices and Radiological Health, Office of Science and Engineering Laboratories, Division of Imaging, Diagnostics, and Software Reliability, Silver Spring, Maryland, United States
- Ravi K. Samala, U.S. Food and Drug Administration, Center for Devices and Radiological Health, Office of Science and Engineering Laboratories, Division of Imaging, Diagnostics, and Software Reliability, Silver Spring, Maryland, United States
5. Whitney HM, Baughan N, Myers KJ, Drukker K, Gichoya J, Bower B, Chen W, Gruszauskas N, Kalpathy-Cramer J, Koyejo S, Sá RC, Sahiner B, Zhang Z, Giger ML. Longitudinal assessment of demographic representativeness in the Medical Imaging and Data Resource Center open data commons. J Med Imaging (Bellingham) 2023; 10:061105. PMID: 37469387. PMCID: PMC10353566. DOI: 10.1117/1.jmi.10.6.061105. Received January 31, 2023; revised June 21, 2023; accepted June 23, 2023.
Abstract
Purpose The Medical Imaging and Data Resource Center (MIDRC) open data commons was launched to accelerate the development of artificial intelligence (AI) algorithms to help address the COVID-19 pandemic. The purpose of this study was to quantify the longitudinal representativeness of the demographic characteristics of the primary MIDRC dataset compared with the United States general population (US Census) and with COVID-19 positive case counts from the Centers for Disease Control and Prevention (CDC). Approach The Jensen-Shannon distance (JSD), a measure of the similarity of two distributions, was used to longitudinally measure the representativeness of the distribution of (1) all unique patients in the MIDRC data relative to the 2020 US Census and (2) all unique COVID-19 positive patients in the MIDRC data relative to the case counts reported by the CDC. The distributions were evaluated in the demographic categories of age at index, sex, race, ethnicity, and the combination of race and ethnicity. Results Representativeness of the MIDRC data by ethnicity and by the combination of race and ethnicity was impacted by the percentage of CDC case counts for which this information was not reported. The distributions by sex and race have retained their level of representativeness over time. Conclusion The representativeness of the open medical imaging datasets in the curated public data commons at MIDRC has evolved over time as the number of contributing institutions and the overall number of subjects have grown. The use of metrics such as the JSD to measure representativeness is one step needed for fair and generalizable AI algorithm development.
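The Jensen-Shannon distance used in this study can be computed directly from two discrete distributions (it is the square root of the Jensen-Shannon divergence). A minimal sketch follows; the demographic proportions are hypothetical placeholders, whereas the actual study compared MIDRC distributions to Census and CDC data.

```python
from math import log2

def jsd(p, q):
    """Jensen-Shannon distance between two discrete distributions.
    Uses base-2 logarithms, so the result lies in [0, 1]."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]  # pointwise mixture distribution

    def kl(a, b):
        # Kullback-Leibler divergence, skipping zero-probability bins
        return sum(ai * log2(ai / bi) for ai, bi in zip(a, b) if ai > 0)

    return ((kl(p, m) + kl(q, m)) / 2) ** 0.5

# Hypothetical example: dataset demographic proportions vs. a reference population
census = [0.60, 0.18, 0.13, 0.09]   # illustrative reference proportions
midrc = [0.55, 0.20, 0.15, 0.10]    # illustrative dataset proportions
print(round(jsd(census, midrc), 4))
```

A JSD of 0 indicates identical distributions and 1 indicates maximal dissimilarity, which is what makes it convenient for tracking representativeness over time.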
Affiliation(s)
- Heather M. Whitney, University of Chicago, Chicago, Illinois, United States; The Medical Imaging and Data Resource Center (midrc.org)
- Natalie Baughan, University of Chicago, Chicago, Illinois, United States; The Medical Imaging and Data Resource Center (midrc.org)
- Kyle J. Myers, The Medical Imaging and Data Resource Center (midrc.org); Puente Solutions LLC, Phoenix, Arizona, United States
- Karen Drukker, University of Chicago, Chicago, Illinois, United States; The Medical Imaging and Data Resource Center (midrc.org)
- Judy Gichoya, The Medical Imaging and Data Resource Center (midrc.org); Emory University, Atlanta, Georgia, United States
- Brad Bower, The Medical Imaging and Data Resource Center (midrc.org); National Institutes of Health, Bethesda, Maryland, United States
- Weijie Chen, The Medical Imaging and Data Resource Center (midrc.org); United States Food and Drug Administration, Silver Spring, Maryland, United States
- Nicholas Gruszauskas, University of Chicago, Chicago, Illinois, United States; The Medical Imaging and Data Resource Center (midrc.org)
- Jayashree Kalpathy-Cramer, The Medical Imaging and Data Resource Center (midrc.org); University of Colorado Anschutz Medical Campus, Aurora, Colorado, United States
- Sanmi Koyejo, The Medical Imaging and Data Resource Center (midrc.org); Stanford University, Stanford, California, United States
- Rui C. Sá, The Medical Imaging and Data Resource Center (midrc.org); National Institutes of Health, Bethesda, Maryland, United States; University of California, San Diego, La Jolla, California, United States
- Berkman Sahiner, The Medical Imaging and Data Resource Center (midrc.org); United States Food and Drug Administration, Silver Spring, Maryland, United States
- Zi Zhang, The Medical Imaging and Data Resource Center (midrc.org); Jefferson Health, Philadelphia, Pennsylvania, United States
- Maryellen L. Giger, University of Chicago, Chicago, Illinois, United States; The Medical Imaging and Data Resource Center (midrc.org)
6. Drukker K, Chen W, Gichoya J, Gruszauskas N, Kalpathy-Cramer J, Koyejo S, Myers K, Sá RC, Sahiner B, Whitney H, Zhang Z, Giger M. Toward fairness in artificial intelligence for medical image analysis: identification and mitigation of potential biases in the roadmap from data collection to model deployment. J Med Imaging (Bellingham) 2023; 10:061104. PMID: 37125409. PMCID: PMC10129875. DOI: 10.1117/1.jmi.10.6.061104. Received January 30, 2023; accepted April 3, 2023.
Abstract
Purpose There is increasing interest in developing medical imaging-based machine learning methods, also known as medical imaging artificial intelligence (AI), for the detection, diagnosis, prognosis, and risk assessment of disease, with the goal of clinical implementation. These tools are intended to improve on traditional human decision-making in medical imaging. However, biases introduced in the steps toward clinical deployment may impede their intended function, potentially exacerbating inequities: medical imaging AI can propagate or amplify biases introduced in the many steps from model inception to deployment, resulting in systematic differences in the treatment of different groups. Recognizing and addressing these sources of bias is essential for algorithmic fairness and trustworthiness and contributes to a just and equitable deployment of AI in medical imaging. Approach Our multi-institutional team included medical physicists, medical imaging artificial intelligence/machine learning (AI/ML) researchers, experts in AI/ML bias, statisticians, physicians, and scientists from regulatory bodies. We identified sources of bias in AI/ML and mitigation strategies for these biases, and we developed recommendations for best practices in medical imaging AI/ML development. Results Five main steps along the roadmap of medical imaging AI/ML were identified: (1) data collection, (2) data preparation and annotation, (3) model development, (4) model evaluation, and (5) model deployment. Within these steps, or bias categories, we identified 29 sources of potential bias, many of which can impact multiple steps, as well as mitigation strategies. Conclusions Our findings provide a valuable resource to researchers, clinicians, and the public at large.
Affiliation(s)
- Karen Drukker, The University of Chicago, Department of Radiology, Chicago, Illinois, United States
- Weijie Chen, US Food and Drug Administration, Division of Imaging, Diagnostics, and Software Reliability, Office of Science and Engineering Laboratories, Center for Devices and Radiological Health, Silver Spring, Maryland, United States
- Judy Gichoya, Emory University, Department of Radiology, Atlanta, Georgia, United States
- Nicholas Gruszauskas, The University of Chicago, Department of Radiology, Chicago, Illinois, United States
- Sanmi Koyejo, Stanford University, Department of Computer Science, Stanford, California, United States
- Kyle Myers, Puente Solutions LLC, Phoenix, Arizona, United States
- Rui C. Sá, National Institutes of Health, Bethesda, Maryland, United States; University of California, San Diego, La Jolla, California, United States
- Berkman Sahiner, US Food and Drug Administration, Division of Imaging, Diagnostics, and Software Reliability, Office of Science and Engineering Laboratories, Center for Devices and Radiological Health, Silver Spring, Maryland, United States
- Heather Whitney, The University of Chicago, Department of Radiology, Chicago, Illinois, United States
- Zi Zhang, Jefferson Health, Philadelphia, Pennsylvania, United States
- Maryellen Giger, The University of Chicago, Department of Radiology, Chicago, Illinois, United States
7. Baughan N, Whitney HM, Drukker K, Sahiner B, Hu T, Kim GH, McNitt-Gray M, Myers KJ, Giger ML. Sequestration of imaging studies in MIDRC: stratified sampling to balance demographic characteristics of patients in a multi-institutional data commons. J Med Imaging (Bellingham) 2023; 10:064501. PMID: 38074627. PMCID: PMC10704184. DOI: 10.1117/1.jmi.10.6.064501. Received January 25, 2023; revised October 23, 2023; accepted October 25, 2023.
Abstract
Purpose The Medical Imaging and Data Resource Center (MIDRC) is a multi-institutional effort to accelerate medical imaging machine intelligence research and create a publicly available image repository/commons as well as a sequestered commons for performance evaluation and benchmarking of algorithms. After de-identification, approximately 80% of the medical images and associated metadata become part of the open commons and 20% are sequestered from the open commons. To ensure that both commons are representative of the population available, we introduced a stratified sampling method to balance the demographic characteristics across the two datasets. Approach Our method uses multi-dimensional stratified sampling where several demographic variables of interest are sequentially used to separate the data into individual strata, each representing a unique combination of variables. Within each resulting stratum, patients are assigned to the open or sequestered commons. This algorithm was used on an example dataset containing 5000 patients using the variables of race, age, sex at birth, ethnicity, COVID-19 status, and image modality and compared resulting demographic distributions to naïve random sampling of the dataset over 2000 independent trials. Results Resulting prevalence of each demographic variable matched the prevalence from the input dataset within one standard deviation. Mann-Whitney U test results supported the hypothesis that sequestration by stratified sampling provided more balanced subsets than naïve randomization, except for demographic subcategories with very low prevalence. Conclusions The developed multi-dimensional stratified sampling algorithm can partition a large dataset while maintaining balance across several variables, superior to the balance achieved from naïve randomization.
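The multi-dimensional stratification described above can be sketched as: form one stratum per unique combination of the demographic variables, then split each stratum 80/20 between the open and sequestered commons. The following is a simplified sketch, not the MIDRC production algorithm, and the field names are hypothetical.

```python
import random
from collections import defaultdict

def stratified_split(patients, keys, open_frac=0.8, seed=0):
    """Partition a list of patient records (dicts) into open and sequestered
    sets, stratified on the demographic variables named in `keys`."""
    # Group patients into strata: one per unique combination of key values
    strata = defaultdict(list)
    for p in patients:
        strata[tuple(p[k] for k in keys)].append(p)

    rng = random.Random(seed)
    open_set, sequestered = [], []
    for group in strata.values():
        rng.shuffle(group)                    # randomize within the stratum
        cut = round(len(group) * open_frac)   # ~80% of each stratum goes open
        open_set.extend(group[:cut])
        sequestered.extend(group[cut:])
    return open_set, sequestered
```

Because the split is applied within every stratum, the prevalence of each variable combination is preserved in both subsets, which is the property the paper compares against naïve random sampling.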
Affiliation(s)
- Natalie Baughan, University of Chicago, Department of Radiology, Chicago, Illinois, United States
- Heather M. Whitney, University of Chicago, Department of Radiology, Chicago, Illinois, United States
- Karen Drukker, University of Chicago, Department of Radiology, Chicago, Illinois, United States
- Berkman Sahiner, US Food and Drug Administration, Bethesda, Maryland, United States
- Tingting Hu, US Food and Drug Administration, Bethesda, Maryland, United States
- Grace Hyun Kim, University of California, Los Angeles, Los Angeles, California, United States
- Michael McNitt-Gray, University of California, Los Angeles, Los Angeles, California, United States
- Maryellen L. Giger, University of Chicago, Department of Radiology, Chicago, Illinois, United States
8. Sahiner B, Chen W, Samala RK, Petrick N. Data drift in medical machine learning: implications and potential remedies. Br J Radiol 2023; 96:20220878. PMID: 36971405. PMCID: PMC10546450. DOI: 10.1259/bjr.20220878. Received September 14, 2022; revised February 16, 2023; accepted February 20, 2023.
Abstract
Data drift refers to differences between the data used in training a machine learning (ML) model and that applied to the model in real-world operation. Medical ML systems can be exposed to various forms of data drift, including differences between the data sampled for training and used in clinical operation, differences between medical practices or context of use between training and clinical use, and time-related changes in patient populations, disease patterns, and data acquisition, to name a few. In this article, we first review the terminology used in ML literature related to data drift, define distinct types of drift, and discuss in detail potential causes within the context of medical applications with an emphasis on medical imaging. We then review the recent literature regarding the effects of data drift on medical ML systems, which overwhelmingly show that data drift can be a major cause for performance deterioration. We then discuss methods for monitoring data drift and mitigating its effects with an emphasis on pre- and post-deployment techniques. Some of the potential methods for drift detection and issues around model retraining when drift is detected are included. Based on our review, we find that data drift is a major concern in medical ML deployment and that more research is needed so that ML models can identify drift early, incorporate effective mitigation strategies and resist performance decay.
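One common post-deployment monitoring technique of the kind this review surveys is to compare binned input distributions between training and clinical operation. The Population Stability Index sketch below is an illustrative choice of drift statistic, and the 0.2 alert threshold is a conventional rule of thumb, not a recommendation drawn from the article.

```python
from math import log

def population_stability_index(expected, observed, eps=1e-6):
    """Population Stability Index between two binned distributions
    (bin fractions summing to ~1). Larger values indicate greater drift
    of the observed distribution from the expected one."""
    psi = 0.0
    for e, o in zip(expected, observed):
        e, o = max(e, eps), max(o, eps)  # guard against empty bins
        psi += (o - e) * log(o / e)
    return psi

baseline = [0.25, 0.25, 0.25, 0.25]  # feature bin fractions at training time
current = [0.10, 0.20, 0.30, 0.40]   # bin fractions observed in deployment
psi = population_stability_index(baseline, current)
print("drift suspected" if psi > 0.2 else "stable")
```

In practice such a statistic would be computed per input feature (or per model-output bin) on a rolling window, with alerts feeding the retraining decisions the article discusses.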
Affiliation(s)
- Berkman Sahiner: Center for Devices and Radiological Health, U.S. Food and Drug Administration, 10903 New Hampshire Avenue, Silver Spring, MD 20993-0002
- Weijie Chen: Center for Devices and Radiological Health, U.S. Food and Drug Administration, 10903 New Hampshire Avenue, Silver Spring, MD 20993-0002
- Ravi K. Samala: Center for Devices and Radiological Health, U.S. Food and Drug Administration, 10903 New Hampshire Avenue, Silver Spring, MD 20993-0002
- Nicholas Petrick: Center for Devices and Radiological Health, U.S. Food and Drug Administration, 10903 New Hampshire Avenue, Silver Spring, MD 20993-0002

9
Petrick N, Chen W, Delfino JG, Gallas BD, Kang Y, Krainak D, Sahiner B, Samala RK. Regulatory considerations for medical imaging AI/ML devices in the United States: concepts and challenges. J Med Imaging (Bellingham) 2023; 10:051804. [PMID: 37361549] [PMCID: PMC10289177] [DOI: 10.1117/1.jmi.10.5.051804]
Abstract
Purpose To introduce developers to medical device regulatory processes and data considerations in artificial intelligence and machine learning (AI/ML) device submissions and to discuss ongoing AI/ML-related regulatory challenges and activities. Approach AI/ML technologies are being used in an increasing number of medical imaging devices, and the fast evolution of these technologies presents novel regulatory challenges. We provide AI/ML developers with an introduction to U.S. Food and Drug Administration (FDA) regulatory concepts, processes, and fundamental assessments for a wide range of medical imaging AI/ML device types. Results The device type and the appropriate premarket regulatory pathway for an AI/ML device are based on the level of risk associated with the device and are informed by both its technological characteristics and its intended use. AI/ML device submissions contain a wide array of information and testing to facilitate the review process; the model description, data, nonclinical testing, and multi-reader multi-case testing are critical aspects of review for many submissions. The agency is also involved in AI/ML-related activities that support guidance document development, good machine learning practice development, AI/ML transparency, AI/ML regulatory research, and real-world performance assessment. Conclusion FDA's AI/ML regulatory and scientific efforts support the joint goals of ensuring that patients have access to safe and effective AI/ML devices over the entire device lifecycle and stimulating medical AI/ML innovation.
Affiliation(s)
- Nicholas Petrick: U.S. Food and Drug Administration, Center for Devices and Radiological Health, Office of Science and Engineering Labs, Silver Spring, Maryland, United States
- Weijie Chen: U.S. Food and Drug Administration, Center for Devices and Radiological Health, Office of Science and Engineering Labs, Silver Spring, Maryland, United States
- Jana G. Delfino: U.S. Food and Drug Administration, Center for Devices and Radiological Health, Office of Science and Engineering Labs, Silver Spring, Maryland, United States
- Brandon D. Gallas: U.S. Food and Drug Administration, Center for Devices and Radiological Health, Office of Science and Engineering Labs, Silver Spring, Maryland, United States
- Yanna Kang: U.S. Food and Drug Administration, Center for Devices and Radiological Health, Office of Product Evaluation and Quality, Silver Spring, Maryland, United States
- Daniel Krainak: U.S. Food and Drug Administration, Center for Devices and Radiological Health, Office of Product Evaluation and Quality, Silver Spring, Maryland, United States
- Berkman Sahiner: U.S. Food and Drug Administration, Center for Devices and Radiological Health, Office of Science and Engineering Labs, Silver Spring, Maryland, United States
- Ravi K. Samala: U.S. Food and Drug Administration, Center for Devices and Radiological Health, Office of Science and Engineering Labs, Silver Spring, Maryland, United States

10
Wang X, Sahiner B, Scully CG, Cha KH. AFE-GAN: Synthesizing Electrocardiograms with Atrial Fibrillation Characteristics Using Generative Adversarial Networks. Annu Int Conf IEEE Eng Med Biol Soc 2023; 2023:1-5. [PMID: 38083445] [DOI: 10.1109/embc40787.2023.10340565]
Abstract
Labeled ECG data in diseased states are relatively scarce due to various concerns, including patient privacy and low prevalence. We propose the first study of its kind that synthesizes atrial fibrillation (AF)-like ECG signals from normal ECG signals using the AFE-GAN, a generative adversarial network. Our AFE-GAN adjusts both beat morphology and rhythm variability when generating the AF-like ECG signals. Two publicly available arrhythmia detectors classified 72.4% and 77.2% of our generated signals as AF in a four-class (normal, AF, other abnormal, noisy) classification. This work shows the feasibility of synthesizing abnormal ECG signals from normal ECG signals. Clinical significance: The AF ECG signals generated with our AFE-GAN have the potential to be used as training material for health practitioners or as class-balance supplements for training automatic AF detectors.
11
Hadjiiski L, Cha K, Chan HP, Drukker K, Morra L, Näppi JJ, Sahiner B, Yoshida H, Chen Q, Deserno TM, Greenspan H, Huisman H, Huo Z, Mazurchuk R, Petrick N, Regge D, Samala R, Summers RM, Suzuki K, Tourassi G, Vergara D, Armato SG. AAPM task group report 273: Recommendations on best practices for AI and machine learning for computer-aided diagnosis in medical imaging. Med Phys 2023; 50:e1-e24. [PMID: 36565447] [DOI: 10.1002/mp.16188]
Abstract
Rapid advances in artificial intelligence (AI) and machine learning, and specifically in deep learning (DL) techniques, have enabled broad application of these methods in health care. The promise of the DL approach has spurred further interest in computer-aided diagnosis (CAD) development and applications using both "traditional" machine learning methods and newer DL-based methods. We use the term CAD-AI to refer to this expanded clinical decision support environment that uses traditional and DL-based AI methods. Numerous studies have been published to date on the development of machine learning tools for computer-aided, or AI-assisted, clinical tasks. However, most of these machine learning models are not ready for clinical deployment. It is of paramount importance to ensure that a clinical decision support tool undergoes proper training and rigorous validation of its generalizability and robustness before adoption for patient care in the clinic. To address these important issues, the American Association of Physicists in Medicine (AAPM) Computer-Aided Image Analysis Subcommittee (CADSC) is charged, in part, to develop recommendations on practices and standards for the development and performance assessment of computer-aided decision support systems. The committee has previously published two opinion papers on the evaluation of CAD systems and issues associated with user training and quality assurance of these systems in the clinic. With machine learning techniques continuing to evolve and CAD applications expanding to new stages of the patient care process, the current task group report considers the broader issues common to the development of most, if not all, CAD-AI applications and their translation from the bench to the clinic. The goal is to bring attention to the proper training and validation of machine learning algorithms that may improve their generalizability and reliability and accelerate the adoption of CAD-AI systems for clinical decision support.
Affiliation(s)
- Lubomir Hadjiiski: Department of Radiology, University of Michigan, Ann Arbor, Michigan, USA
- Kenny Cha: U.S. Food and Drug Administration, Silver Spring, Maryland, USA
- Heang-Ping Chan: Department of Radiology, University of Michigan, Ann Arbor, Michigan, USA
- Karen Drukker: Department of Radiology, University of Chicago, Chicago, Illinois, USA
- Lia Morra: Department of Control and Computer Engineering, Politecnico di Torino, Torino, Italy
- Janne J Näppi: 3D Imaging Research, Department of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts, USA
- Berkman Sahiner: U.S. Food and Drug Administration, Silver Spring, Maryland, USA
- Hiroyuki Yoshida: 3D Imaging Research, Department of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts, USA
- Quan Chen: Department of Radiation Medicine, University of Kentucky, Lexington, Kentucky, USA
- Thomas M Deserno: Peter L. Reichertz Institute for Medical Informatics of TU Braunschweig and Hannover Medical School, Braunschweig, Germany
- Hayit Greenspan: Department of Biomedical Engineering, Faculty of Engineering, Tel Aviv University, Tel Aviv, Israel; Department of Radiology, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Henkjan Huisman: Radboud Institute for Health Sciences, Radboud University Medical Center, Nijmegen, The Netherlands
- Zhimin Huo: Tencent America, Palo Alto, California, USA
- Richard Mazurchuk: Division of Cancer Prevention, National Cancer Institute, National Institutes of Health, Bethesda, Maryland, USA
- Daniele Regge: Radiology Unit, Candiolo Cancer Institute, FPO-IRCCS, Candiolo, Italy; Department of Surgical Sciences, University of Turin, Turin, Italy
- Ravi Samala: U.S. Food and Drug Administration, Silver Spring, Maryland, USA
- Ronald M Summers: Radiology and Imaging Sciences, National Institutes of Health Clinical Center, Bethesda, Maryland, USA
- Kenji Suzuki: Institute of Innovative Research, Tokyo Institute of Technology, Tokyo, Japan
- Daniel Vergara: Department of Radiology, Yale New Haven Hospital, New Haven, Connecticut, USA
- Samuel G Armato: Department of Radiology, University of Chicago, Chicago, Illinois, USA

12
Maynord M, Farhangi MM, Fermüller C, Aloimonos Y, Levine G, Petrick N, Sahiner B, Pezeshk A. Semi-supervised training using cooperative labeling of weakly annotated data for nodule detection in chest CT. Med Phys 2023. [PMID: 36630691] [DOI: 10.1002/mp.16219]
Abstract
PURPOSE Machine learning algorithms are best trained with large quantities of accurately annotated samples. While natural scene images can often be labeled relatively cheaply and at large scale, obtaining accurate annotations for medical images is both time consuming and expensive. In this study, we propose a cooperative labeling method that allows us to make use of weakly annotated medical imaging data for the training of a machine learning algorithm. As most clinically produced data are weakly-annotated - produced for use by humans rather than machines and lacking information machine learning depends upon - this approach allows us to incorporate a wider range of clinical data and thereby increase the training set size. METHODS Our pseudo-labeling method consists of multiple stages. In the first stage, a previously established network is trained using a limited number of samples with high-quality expert-produced annotations. This network is used to generate annotations for a separate larger dataset that contains only weakly annotated scans. In the second stage, by cross-checking the two types of annotations against each other, we obtain higher-fidelity annotations. In the third stage, we extract training data from the weakly annotated scans, and combine it with the fully annotated data, producing a larger training dataset. We use this larger dataset to develop a computer-aided detection (CADe) system for nodule detection in chest CT. RESULTS We evaluated the proposed approach by presenting the network with different numbers of expert-annotated scans in training and then testing the CADe using an independent expert-annotated dataset. We demonstrate that when availability of expert annotations is severely limited, the inclusion of weakly-labeled data leads to a 5% improvement in the competitive performance metric (CPM), defined as the average of sensitivities at different false-positive rates. 
CONCLUSIONS Our proposed approach can effectively merge a weakly-annotated dataset with a small, well-annotated dataset for algorithm training. This approach can help enlarge limited training data by leveraging the large amount of weakly labeled data typically generated in clinical image interpretation.
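The competitive performance metric (CPM) reported above is simply the mean sensitivity over a set of false-positives-per-scan operating points on the FROC curve. A minimal sketch (the seven operating points and the sensitivity values are illustrative assumptions, not numbers from the study):

```python
def cpm(sensitivity_at_fp, fp_rates=(0.125, 0.25, 0.5, 1, 2, 4, 8)):
    """Competitive performance metric: average sensitivity over the
    chosen false-positives-per-scan operating points of a FROC curve."""
    return sum(sensitivity_at_fp[r] for r in fp_rates) / len(fp_rates)


# Hypothetical FROC readout for a nodule CADe system.
froc = {0.125: 0.60, 0.25: 0.70, 0.5: 0.78, 1: 0.84, 2: 0.88, 4: 0.91, 8: 0.93}
print(round(cpm(froc), 3))  # 0.806
```

A "5% improvement in CPM" as in the abstract then means this average sensitivity rose by five percentage points once the weakly labeled data were added.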
Affiliation(s)
- Michael Maynord: University of Maryland, Computer Science Department, Iribe Center for Computer Science and Engineering, College Park, Maryland, USA; Division of Imaging, Diagnostics, and Software Reliability (DIDSR), OSEL, CDRH, FDA, Silver Spring, Maryland, USA
- M Mehdi Farhangi: Division of Imaging, Diagnostics, and Software Reliability (DIDSR), OSEL, CDRH, FDA, Silver Spring, Maryland, USA
- Cornelia Fermüller: University of Maryland, Institute for Advanced Computer Studies, Iribe Center for Computer Science and Engineering, College Park, Maryland, USA
- Yiannis Aloimonos: University of Maryland, Computer Science Department, Iribe Center for Computer Science and Engineering, College Park, Maryland, USA
- Gary Levine: Division of Radiological Imaging Devices and Electronic Products, CDRH, FDA, Silver Spring, Maryland, USA
- Nicholas Petrick: Division of Imaging, Diagnostics, and Software Reliability (DIDSR), OSEL, CDRH, FDA, Silver Spring, Maryland, USA
- Berkman Sahiner: Division of Imaging, Diagnostics, and Software Reliability (DIDSR), OSEL, CDRH, FDA, Silver Spring, Maryland, USA
- Aria Pezeshk: Division of Imaging, Diagnostics, and Software Reliability (DIDSR), OSEL, CDRH, FDA, Silver Spring, Maryland, USA

13
Feng J, Gossmann A, Sahiner B, Pirracchio R. Bayesian logistic regression for online recalibration and revision of risk prediction models with performance guarantees. J Am Med Inform Assoc 2022; 29:841-852. [PMID: 35022756] [DOI: 10.1093/jamia/ocab280]
Abstract
OBJECTIVE After deploying a clinical prediction model, subsequently collected data can be used to fine-tune its predictions and adapt to temporal shifts. Because model updating carries risks of over-updating/fitting, we study online methods with performance guarantees. MATERIALS AND METHODS We introduce 2 procedures for continual recalibration or revision of an underlying prediction model: Bayesian logistic regression (BLR) and a Markov variant that explicitly models distribution shifts (MarBLR). We perform empirical evaluation via simulations and a real-world study predicting Chronic Obstructive Pulmonary Disease (COPD) risk. We derive "Type I and II" regret bounds, which guarantee the procedures are noninferior to a static model and competitive with an oracle logistic reviser in terms of the average loss. RESULTS Both procedures consistently outperformed the static model and other online logistic revision methods. In simulations, the average estimated calibration index (aECI) of the original model was 0.828 (95%CI, 0.818-0.938). Online recalibration using BLR and MarBLR improved the aECI towards the ideal value of zero, attaining 0.265 (95%CI, 0.230-0.300) and 0.241 (95%CI, 0.216-0.266), respectively. When performing more extensive logistic model revisions, BLR and MarBLR increased the average area under the receiver-operating characteristic curve (aAUC) from 0.767 (95%CI, 0.765-0.769) to 0.800 (95%CI, 0.798-0.802) and 0.799 (95%CI, 0.797-0.801), respectively, in stationary settings and protected against substantial model decay. In the COPD study, BLR and MarBLR dynamically combined the original model with a continually refitted gradient boosted tree to achieve aAUCs of 0.924 (95%CI, 0.913-0.935) and 0.925 (95%CI, 0.914-0.935), compared to the static model's aAUC of 0.904 (95%CI, 0.892-0.916). DISCUSSION Despite its simplicity, BLR is highly competitive with MarBLR. MarBLR outperforms BLR when its prior better reflects the data. 
CONCLUSIONS BLR and MarBLR can improve the transportability of clinical prediction models and maintain their performance over time.
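BLR maintains a Bayesian posterior over the recalibration parameters; the sketch below shows only the simpler logistic recalibration layer it builds on, updated by plain online gradient descent (the paper's prior and regret-bound machinery are omitted, and the data stream is an illustrative assumption):

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class OnlineRecalibrator:
    """Recalibrates a model's probability p via sigmoid(a*logit(p) + b),
    updating (a, b) by online gradient descent on the log loss as labeled
    outcomes arrive. (BLR, as in the paper, additionally places a prior
    on a and b; that is omitted here for brevity.)"""

    def __init__(self, lr=0.1):
        self.a, self.b, self.lr = 1.0, 0.0, lr

    def predict(self, p):
        logit = math.log(p / (1.0 - p))
        return sigmoid(self.a * logit + self.b)

    def update(self, p, y):
        err = self.predict(p) - y  # gradient of log loss w.r.t. the logit
        logit = math.log(p / (1.0 - p))
        self.a -= self.lr * err * logit
        self.b -= self.lr * err


# An overconfident model: it outputs 0.9 but the event occurs half the time.
recal = OnlineRecalibrator()
for p, y in [(0.9, 1), (0.9, 0)] * 100:
    recal.update(p, y)
print(recal.predict(0.9))  # pulled down toward the observed event rate
```

Starting the recalibrator at the identity (a=1, b=0) mirrors the paper's noninferiority goal: before any updates it reproduces the static model exactly.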
Affiliation(s)
- Jean Feng: Department of Epidemiology and Biostatistics, University of California, San Francisco, San Francisco, California, USA
- Alexej Gossmann: Center for Devices and Radiological Health (CDRH), Food and Drug Administration, Silver Spring, Maryland, USA
- Berkman Sahiner: Center for Devices and Radiological Health (CDRH), Food and Drug Administration, Silver Spring, Maryland, USA
- Romain Pirracchio: Department of Anesthesia and Perioperative Care, University of California, San Francisco, San Francisco, California, USA

14
El Naqa I, Boone JM, Benedict SH, Goodsitt MM, Chan HP, Drukker K, Hadjiiski L, Ruan D, Sahiner B. AI in medical physics: guidelines for publication. Med Phys 2021; 48:4711-4714. [PMID: 34545957] [DOI: 10.1002/mp.15170]
Abstract
The Abstract is intended to provide a concise summary of the study and its scientific findings. For AI/ML applications in medical physics, a problem statement and rationale for utilizing these algorithms are necessary while highlighting the novelty of the approach. A brief numerical description of how the data are partitioned into subsets for training of the AI/ML algorithm, validation (including tuning of parameters), and independent testing of algorithm performance is required. This is to be followed by a summary of the results and statistical metrics that quantify the performance of the AI/ML algorithm.
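The numerical partitioning that the guideline asks authors to report can be sketched as a case-level split (the 60/20/20 fractions and fixed seed are an illustrative choice, not a recommendation from the editorial):

```python
import random


def partition(cases, train_frac=0.6, val_frac=0.2, seed=0):
    """Case-level split into training, validation (parameter tuning),
    and a sequestered test set for independent performance testing."""
    cases = list(cases)
    random.Random(seed).shuffle(cases)  # fixed seed for reproducibility
    n_train = int(train_frac * len(cases))
    n_val = int(val_frac * len(cases))
    return (cases[:n_train],
            cases[n_train:n_train + n_val],
            cases[n_train + n_val:])


train, val, test = partition(range(100))
print(len(train), len(val), len(test))  # 60 20 20
```

Splitting at the case (patient) level rather than the image level keeps correlated samples from the same patient out of both training and test sets, which is the kind of detail these guidelines ask authors to state explicitly.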
Affiliation(s)
- Issam El Naqa: Machine Learning & Radiation Oncology, Moffitt Cancer Center, 12902 Magnolia Drive, Tampa, FL 33612, USA
- John M Boone: Department of Radiology, University of California Davis Health, Sacramento, CA 95817, USA
- Stanley H Benedict: Radiation Oncology, University of California Davis Health, Sacramento, CA 95817, USA
- Mitchell M Goodsitt: Department of Radiology, University of Michigan, 1500 E Medical Center Dr, Ann Arbor, MI 48109, USA
- Heang-Ping Chan: Department of Radiology, University of Michigan, 1500 E Medical Center Dr, Ann Arbor, MI 48109, USA
- Karen Drukker: Department of Radiology, University of Chicago, 5841 S. Maryland Ave, Chicago, IL 60637, USA
- Lubomir Hadjiiski: Department of Radiology, University of Michigan, 1500 E Medical Center Dr, Ann Arbor, MI 48109, USA
- Dan Ruan: Radiation Oncology, University of California Los Angeles School of Medicine, 200 UCLA Medical Plaza, Los Angeles, CA 90095, USA
- Berkman Sahiner: Food and Drug Administration, 10903 New Hampshire Ave., Silver Spring, MD 20993, USA

15
Farhangi MM, Sahiner B, Petrick N, Pezeshk A. Automatic lung nodule detection in thoracic CT scans using dilated slice-wise convolutions. Med Phys 2021; 48:3741-3751. [PMID: 33932241] [DOI: 10.1002/mp.14915]
Abstract
PURPOSE Most state-of-the-art automated medical image analysis methods for volumetric data rely on adaptations of two-dimensional (2D) and three-dimensional (3D) convolutional neural networks (CNNs). In this paper, we develop a novel unified CNN-based model that combines the benefits of 2D and 3D networks for analyzing volumetric medical images. METHODS In our proposed framework, multiscale contextual information is first extracted from 2D slices inside a volume of interest (VOI). This is followed by dilated 1D convolutions across slices to aggregate in-plane features in a slice-wise manner and encode the information in the entire volume. Moreover, we formalize a curriculum learning strategy for a two-stage system (i.e., a system that consists of screening and false positive reduction), where the training samples are presented to the network in a meaningful order to further improve the performance. RESULTS We evaluated the proposed approach by developing a computer-aided detection (CADe) system for lung nodules. Our results on 888 CT exams demonstrate that the proposed approach can effectively analyze volumetric data by achieving a sensitivity of > 0.99 in the screening stage and a sensitivity of > 0.96 at eight false positives per case in the false positive reduction stage. CONCLUSION Our experimental results show that the proposed method provides competitive results compared to state-of-the-art 3D frameworks. In addition, we illustrate the benefits of curriculum learning strategies in two-stage systems that are of common use in medical imaging applications.
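The cross-slice aggregation step rests on dilated 1D convolutions, whose taps skip over intermediate slices to enlarge the receptive field without adding weights. A pure-Python sketch of a single valid-mode dilated convolution (the kernel weights and dilation factor are illustrative assumptions):

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """Valid-mode 1D convolution whose taps are `dilation` samples apart,
    enlarging the receptive field across slices without extra weights."""
    span = (len(kernel) - 1) * dilation
    return [sum(kernel[k] * signal[i + k * dilation]
                for k in range(len(kernel)))
            for i in range(len(signal) - span)]


# One pooled feature per CT slice; each output mixes slices i, i+2, i+4.
slice_features = [1, 2, 3, 4, 5, 6, 7, 8]
print(dilated_conv1d(slice_features, [1, 1, 1], dilation=2))  # [9, 12, 15, 18]
```

Stacking such layers with increasing dilation lets in-plane features from a whole volume of interest be encoded with far fewer parameters than a full 3D convolution.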
Affiliation(s)
- M Mehdi Farhangi: Division of Imaging, Diagnostics, and Software Reliability, CDRH, U.S. Food and Drug Administration, Silver Spring, MD 20993, USA
- Berkman Sahiner: Division of Imaging, Diagnostics, and Software Reliability, CDRH, U.S. Food and Drug Administration, Silver Spring, MD 20993, USA
- Nicholas Petrick: Division of Imaging, Diagnostics, and Software Reliability, CDRH, U.S. Food and Drug Administration, Silver Spring, MD 20993, USA
- Aria Pezeshk: Division of Imaging, Diagnostics, and Software Reliability, CDRH, U.S. Food and Drug Administration, Silver Spring, MD 20993, USA

16
Petrick N, Akbar S, Cha KH, Nofech-Mozes S, Sahiner B, Gavrielides MA, Kalpathy-Cramer J, Drukker K, Martel AL. SPIE-AAPM-NCI BreastPathQ challenge: an image analysis challenge for quantitative tumor cellularity assessment in breast cancer histology images following neoadjuvant treatment. J Med Imaging (Bellingham) 2021; 8:034501. [PMID: 33987451] [PMCID: PMC8107263] [DOI: 10.1117/1.jmi.8.3.034501]
Abstract
Purpose: The breast pathology quantitative biomarkers (BreastPathQ) challenge was a grand challenge organized jointly by the International Society for Optics and Photonics (SPIE), the American Association of Physicists in Medicine (AAPM), the U.S. National Cancer Institute (NCI), and the U.S. Food and Drug Administration (FDA). The task of the BreastPathQ challenge was computerized estimation of tumor cellularity (TC) in breast cancer histology images following neoadjuvant treatment. Approach: A total of 39 teams developed, validated, and tested their TC estimation algorithms during the challenge. The training, validation, and testing sets consisted of 2394, 185, and 1119 image patches originating from 63, 6, and 27 scanned pathology slides from 33, 4, and 18 patients, respectively. The summary performance metric used for comparing and ranking algorithms was the average prediction probability concordance (PK) using scores from two pathologists as the TC reference standard. Results: Test PK performance ranged from 0.497 to 0.941 across the 100 submitted algorithms. The submitted algorithms generally performed well in estimating TC, with high-performing algorithms obtaining comparable results to the average interrater PK of 0.927 from the two pathologists providing the reference TC scores. Conclusions: The SPIE-AAPM-NCI BreastPathQ challenge was a success, indicating that artificial intelligence/machine learning algorithms may be able to approach human performance for cellularity assessment and may have some utility in clinical practice for improving efficiency and reducing reader variability. The BreastPathQ challenge can be accessed on the Grand Challenge website.
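The concordance used for ranking can be sketched as a pairwise agreement probability: for every pair of patches with different reference tumor cellularity (TC) scores, check whether the algorithm orders them the same way. A minimal version (tie handling in the challenge's exact PK definition is omitted, and the scores are illustrative assumptions):

```python
from itertools import combinations


def concordance(reference, prediction):
    """Fraction of patch pairs, distinct in reference TC, that the
    prediction ranks in the same order as the reference."""
    pairs = [(i, j) for i, j in combinations(range(len(reference)), 2)
             if reference[i] != reference[j]]
    agree = sum(1 for i, j in pairs
                if (prediction[i] - prediction[j]) * (reference[i] - reference[j]) > 0)
    return agree / len(pairs)


ref_tc = [0.1, 0.3, 0.5, 0.9]          # pathologist reference scores
algo_good = [0.2, 0.35, 0.55, 0.80]    # preserves the reference ordering
algo_poor = [0.9, 0.1, 0.7, 0.2]       # frequently inverts pairs

print(concordance(ref_tc, algo_good))  # 1.0
print(concordance(ref_tc, algo_poor))
```

The challenge's interrater PK of 0.927 was computed the same way between the two pathologists, which is why it serves as a human-performance ceiling for the submitted algorithms.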
Affiliation(s)
- Nicholas Petrick: U.S. Food and Drug Administration, Center for Devices and Radiological Health, Silver Spring, Maryland, United States
- Shazia Akbar: University of Toronto, Medical Biophysics, Toronto, Ontario, Canada; Sunnybrook Health Sciences Centre, Toronto, Ontario, Canada
- Kenny H. Cha: U.S. Food and Drug Administration, Center for Devices and Radiological Health, Silver Spring, Maryland, United States
- Sharon Nofech-Mozes: Sunnybrook Health Sciences Centre, Toronto, Ontario, Canada; University of Toronto, Department of Laboratory Medicine and Pathobiology, Toronto, Ontario, Canada
- Berkman Sahiner: U.S. Food and Drug Administration, Center for Devices and Radiological Health, Silver Spring, Maryland, United States
- Marios A. Gavrielides: U.S. Food and Drug Administration, Center for Devices and Radiological Health, Silver Spring, Maryland, United States
- Karen Drukker: University of Chicago, Department of Radiology, Chicago, Illinois, United States
- Anne L. Martel: University of Toronto, Medical Biophysics, Toronto, Ontario, Canada; Sunnybrook Health Sciences Centre, Toronto, Ontario, Canada

17
Pennello G, Sahiner B, Gossmann A, Petrick N. Discussion on "Approval policies for modifications to machine learning-based software as a medical device: A study of bio-creep" by Jean Feng, Scott Emerson, and Noah Simon. Biometrics 2020; 77:45-48. [PMID: 33040332] [DOI: 10.1111/biom.13381]
Affiliation(s)
- Gene Pennello: Division of Imaging, Diagnostics, and Software Reliability, Office of Science and Engineering Laboratories, Center for Devices and Radiological Health, Food and Drug Administration, Silver Spring, Maryland, USA
- Berkman Sahiner: Division of Imaging, Diagnostics, and Software Reliability, Office of Science and Engineering Laboratories, Center for Devices and Radiological Health, Food and Drug Administration, Silver Spring, Maryland, USA
- Alexej Gossmann: Division of Imaging, Diagnostics, and Software Reliability, Office of Science and Engineering Laboratories, Center for Devices and Radiological Health, Food and Drug Administration, Silver Spring, Maryland, USA
- Nicholas Petrick: Division of Imaging, Diagnostics, and Software Reliability, Office of Science and Engineering Laboratories, Center for Devices and Radiological Health, Food and Drug Administration, Silver Spring, Maryland, USA

18
Farhangi MM, Petrick N, Sahiner B, Frigui H, Amini AA, Pezeshk A. Recurrent attention network for false positive reduction in the detection of pulmonary nodules in thoracic CT scans. Med Phys 2020; 47:2150-2160. [DOI: 10.1002/mp.14076]
Affiliation(s)
- M. Mehdi Farhangi: Division of Imaging, Diagnostics, and Software Reliability (DIDSR), OSEL, CDRH, FDA, Silver Spring, MD 20993, USA
- Nicholas Petrick: Division of Imaging, Diagnostics, and Software Reliability (DIDSR), OSEL, CDRH, FDA, Silver Spring, MD 20993, USA
- Berkman Sahiner: Division of Imaging, Diagnostics, and Software Reliability (DIDSR), OSEL, CDRH, FDA, Silver Spring, MD 20993, USA
- Hichem Frigui: Multimedia Laboratory, University of Louisville, Louisville, KY 40292, USA
- Amir A. Amini: Medical Imaging Laboratory, University of Louisville, Louisville, KY 40292, USA
- Aria Pezeshk: Division of Imaging, Diagnostics, and Software Reliability (DIDSR), OSEL, CDRH, FDA, Silver Spring, MD 20993, USA

19
Schaffter T, Buist DSM, Lee CI, Nikulin Y, Ribli D, Guan Y, Lotter W, Jie Z, Du H, Wang S, Feng J, Feng M, Kim HE, Albiol F, Albiol A, Morrell S, Wojna Z, Ahsen ME, Asif U, Jimeno Yepes A, Yohanandan S, Rabinovici-Cohen S, Yi D, Hoff B, Yu T, Chaibub Neto E, Rubin DL, Lindholm P, Margolies LR, McBride RB, Rothstein JH, Sieh W, Ben-Ari R, Harrer S, Trister A, Friend S, Norman T, Sahiner B, Strand F, Guinney J, Stolovitzky G. Evaluation of Combined Artificial Intelligence and Radiologist Assessment to Interpret Screening Mammograms. JAMA Netw Open 2020; 3:e200265. [PMID: 32119094] [PMCID: PMC7052735] [DOI: 10.1001/jamanetworkopen.2020.0265]
Abstract
Importance: Mammography screening currently relies on subjective human interpretation. Artificial intelligence (AI) advances could be used to increase mammography screening accuracy by reducing missed cancers and false positives.
Objective: To evaluate whether AI can overcome human mammography interpretation limitations with a rigorous, unbiased evaluation of machine learning algorithms.
Design, Setting, and Participants: In this diagnostic accuracy study conducted between September 2016 and November 2017, an international, crowdsourced challenge was hosted to foster AI algorithm development focused on interpreting screening mammography. More than 1100 participants comprising 126 teams from 44 countries participated. Analysis began November 18, 2016.
Main Outcomes and Measures: Algorithms used images alone (challenge 1) or combined images, previous examinations (if available), and clinical and demographic risk factor data (challenge 2), and output a score that translated to a yes/no prediction of cancer within 12 months. Algorithm accuracy for breast cancer detection was evaluated using the area under the curve, and algorithm specificity was compared with radiologists' specificity at the radiologists' sensitivity, set at 85.9% (United States) and 83.9% (Sweden). An ensemble method aggregating top-performing AI algorithms and radiologists' recall assessments was developed and evaluated.
Results: Overall, 144 231 screening mammograms from 85 580 US women (952 cancer positive ≤12 months from screening) were used for algorithm training and validation. A second, independent validation cohort included 166 578 examinations from 68 008 Swedish women (780 cancer positive). The top-performing algorithm achieved an area under the curve of 0.858 (United States) and 0.903 (Sweden) and specificity of 66.2% (United States) and 81.2% (Sweden) at the radiologists' sensitivity, lower than community-practice radiologists' specificity of 90.5% (United States) and 98.5% (Sweden). Combining top-performing algorithms and US radiologist assessments resulted in a higher area under the curve of 0.942 and achieved a significantly improved specificity (92.0%) at the same sensitivity.
Conclusions and Relevance: While no single AI algorithm outperformed radiologists, an ensemble of AI algorithms combined with radiologist assessment in a single-reader screening environment improved overall accuracy. This study underscores the potential of using machine learning methods to enhance mammography screening interpretation.
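The evaluation above pins each algorithm to the radiologists' sensitivity and compares specificities at that operating point, then averages algorithm scores with reader recalls to form an ensemble. The following is a minimal illustrative sketch of both ideas, not the challenge's actual evaluation code; the 50/50 ensemble weighting and the toy scores are assumptions.

```python
# Sketch: compare an algorithm's specificity to radiologists' at a matched
# sensitivity, and ensemble-average an algorithm score with a reader recall.
# Illustrative only -- not the challenge's actual evaluation pipeline.

def specificity_at_sensitivity(scores_pos, scores_neg, target_sens):
    """Lower the threshold until sensitivity on positives reaches
    target_sens, then report specificity on negatives there."""
    for thr in sorted(set(scores_pos + scores_neg), reverse=True):
        sens = sum(s >= thr for s in scores_pos) / len(scores_pos)
        if sens >= target_sens:
            spec = sum(s < thr for s in scores_neg) / len(scores_neg)
            return thr, sens, spec
    return None

# Toy scores for cancer-positive and cancer-negative exams
pos = [0.9, 0.8, 0.75, 0.6, 0.3]
neg = [0.7, 0.4, 0.35, 0.2, 0.1, 0.05]
thr, sens, spec = specificity_at_sensitivity(pos, neg, target_sens=0.8)

def ensemble(alg_score, radiologist_recall, w=0.5):
    """Blend an algorithm score with a binary recall decision
    mapped onto the same 0-1 scale (weight w is a stand-in)."""
    return w * alg_score + (1 - w) * (1.0 if radiologist_recall else 0.0)
```

In this toy example the matched operating point lands at threshold 0.6 with sensitivity 0.8; the ensemble simply shifts a case's score toward 1 when the radiologist recalls it.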
Affiliation(s)
- Diana S. M. Buist, Kaiser Permanente Washington Health Research Institute, Seattle, Washington
- Dezső Ribli, Department of Physics of Complex Systems, ELTE Eötvös Loránd University, Budapest, Hungary
- Yuanfang Guan, Department of Computational Medicine and Bioinformatics, Michigan Medicine, University of Michigan, Ann Arbor
- Hao Du, National University of Singapore, Singapore
- Sijia Wang, Integrated Health Information Systems Pte Ltd, Singapore
- Jiashi Feng, Department of Electrical and Computer Engineering, National University of Singapore, Singapore
- Francisco Albiol, Instituto de Física Corpuscular (IFIC), CSIC–Universitat de València, Valencia, Spain
- Alberto Albiol, Universitat Politecnica de Valencia, Valencia, Valenciana, Spain
- Stephen Morrell, Centre for Medical Image Computing, University College London, Bloomsbury, London, United Kingdom
- Umar Asif, IBM Research Australia, Melbourne, Australia
- Darvin Yi, Stanford University, Stanford, California
- Bruce Hoff, Computational Oncology, Sage Bionetworks, Seattle, Washington
- Thomas Yu, Computational Oncology, Sage Bionetworks, Seattle, Washington
- Daniel L. Rubin, Department of Biomedical Data Science, Radiology, and Medicine (Biomedical Informatics), Stanford University, Stanford, California
- Peter Lindholm, Department of Physiology and Pharmacology, Karolinska Institutet, Stockholm, Sweden
- Laurie R. Margolies, Department of Diagnostic, Molecular and Interventional Radiology, Icahn School of Medicine at Mount Sinai, New York, New York
- Russell Bailey McBride, Department of Pathology, Molecular and Cell-Based Medicine, Icahn School of Medicine at Mount Sinai, New York, New York
- Joseph H. Rothstein, Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York
- Weiva Sieh, Department of Population Health Science and Policy, Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York
- Rami Ben-Ari, IBM Research Haifa, Haifa University Campus, Mount Carmel, Haifa, Israel
- Andrew Trister, Fred Hutchinson Cancer Research Center, Seattle, Washington
- Stephen Friend, Computational Oncology, Sage Bionetworks, Seattle, Washington
- Thea Norman, Bill and Melinda Gates Foundation, Seattle, Washington
- Berkman Sahiner, Center for Devices and Radiological Health, Food and Drug Administration, Silver Spring, Maryland
- Fredrik Strand, Department of Oncology-Pathology, Karolinska Institutet, and Breast Radiology, Karolinska University Hospital, Stockholm, Sweden
- Justin Guinney, Computational Oncology, Sage Bionetworks, Seattle, Washington
- Gustavo Stolovitzky, IBM Research, Translational Systems Biology and Nanobiotechnology, Thomas J. Watson Research Center, Yorktown Heights, New York
20
Cha KH, Petrick N, Pezeshk A, Graff CG, Sharma D, Badal A, Sahiner B. Evaluation of data augmentation via synthetic images for improved breast mass detection on mammograms using deep learning. J Med Imaging (Bellingham) 2020; 7:012703. [PMID: 31763356 PMCID: PMC6872953 DOI: 10.1117/1.jmi.7.1.012703] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 06/26/2019] [Accepted: 09/04/2019] [Indexed: 01/18/2023]
Abstract
We evaluated whether using synthetic mammograms for training data augmentation may reduce the effects of overfitting and increase the performance of a deep learning algorithm for breast mass detection. Synthetic mammograms were generated using in silico procedural analytic breast and breast mass modeling algorithms, followed by simulated x-ray projections of the breast models into mammographic images. In silico breast phantoms containing masses were modeled across the four BI-RADS breast density categories, and the masses were modeled with different sizes, shapes, and margins. A Monte Carlo-based x-ray transport simulation code, MC-GPU, was used to project the three-dimensional phantoms into realistic synthetic mammograms. In total, 2000 synthetic mammograms containing 2522 masses were generated to augment the real data set during training. From the Curated Breast Imaging Subset of the Digital Database for Screening Mammography (CBIS-DDSM) data set, we used 1111 mammograms (1198 masses) for training, 120 mammograms (120 masses) for validation, and 361 mammograms (378 masses) for testing. We used a Faster R-CNN deep learning network with an ImageNet-pretrained ResNet-101 backbone. We compared the detection performance when the network was trained using different percentages of the real CBIS-DDSM training set (100%, 50%, and 25%), and when these subsets of the training set were augmented with 250, 500, 1000, and 2000 synthetic mammograms. Free-response receiver operating characteristic (FROC) analysis was performed to compare performance with and without the synthetic mammograms. We generally observed an improved test FROC curve when training with the synthetic images compared to training without them, and the amount of improvement depended on the number of real and synthetic images used in training. Our study shows that enlarging the training data with synthetic samples can increase the performance of deep learning systems.
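The experimental grid above crosses fractions of the real training set with increasing numbers of synthetic images. A minimal sketch of assembling those training mixtures follows; the image lists are string stand-ins for the actual CBIS-DDSM and synthetic mammograms, and all names are illustrative.

```python
# Sketch: build the real/synthetic training mixtures described above
# (100%/50%/25% of the real set, each augmented with 0-2000 synthetic
# images). Dataset entries are string stand-ins, not real images.
import random

def make_training_sets(real_images, synthetic_images,
                       real_fractions=(1.0, 0.5, 0.25),
                       synthetic_counts=(0, 250, 500, 1000, 2000),
                       seed=0):
    rng = random.Random(seed)
    mixtures = {}
    for frac in real_fractions:
        n_real = int(len(real_images) * frac)
        real_subset = rng.sample(real_images, n_real)
        for n_syn in synthetic_counts:
            # Prepend a fixed slice of the synthetic pool to the subset
            mixtures[(frac, n_syn)] = real_subset + synthetic_images[:n_syn]
    return mixtures

# Stand-ins for the 1111 real and 2000 synthetic training mammograms
real = [f"real_{i}" for i in range(1111)]
synthetic = [f"syn_{i}" for i in range(2000)]
mixes = make_training_sets(real, synthetic)
```

Each of the 15 resulting mixtures would then be fed to the same detector training recipe so that FROC differences can be attributed to the training-set composition.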
Affiliation(s)
- Kenny H. Cha, U.S. Food and Drug Administration, Silver Spring, Maryland, United States
- Nicholas Petrick, U.S. Food and Drug Administration, Silver Spring, Maryland, United States
- Aria Pezeshk, U.S. Food and Drug Administration, Silver Spring, Maryland, United States
- Christian G. Graff, U.S. Food and Drug Administration, Silver Spring, Maryland, United States
- Diksha Sharma, U.S. Food and Drug Administration, Silver Spring, Maryland, United States
- Andreu Badal, U.S. Food and Drug Administration, Silver Spring, Maryland, United States
- Berkman Sahiner, U.S. Food and Drug Administration, Silver Spring, Maryland, United States
21
Pezeshk A, Hamidian S, Petrick N, Sahiner B. 3-D Convolutional Neural Networks for Automatic Detection of Pulmonary Nodules in Chest CT. IEEE J Biomed Health Inform 2019; 23:2080-2090. [DOI: 10.1109/jbhi.2018.2879449] [Citation(s) in RCA: 54] [Impact Index Per Article: 10.8] [Indexed: 11/08/2022]
22
Gavrielides MA, Li Q, Zeng R, Berman BP, Sahiner B, Gong Q, Myers KJ, DeFilippo G, Petrick N. Discrimination of Pulmonary Nodule Volume Change for Low- and High-contrast Tasks in a Phantom CT Study with Low-dose Protocols. Acad Radiol 2019; 26:937-948. [PMID: 30292564 DOI: 10.1016/j.acra.2018.09.006] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 08/01/2018] [Revised: 08/30/2018] [Accepted: 09/09/2018] [Indexed: 12/20/2022]
Abstract
RATIONALE AND OBJECTIVES: The quantitative assessment of volumetric CT for discriminating small changes in nodule size has been underexamined. This phantom study examined the effect of imaging protocol, nodule size, and measurement method on volume-based change discrimination across low- and high-contrast tasks.
MATERIALS AND METHODS: Eight spherical objects ranging in diameter from 5.0 mm to 5.75 mm and 8.0 mm to 8.75 mm in 0.25 mm increments were scanned within an anthropomorphic phantom with either a foam background (high-contrast task, ∼1000 HU object-to-background difference) or a gelatin background (low-contrast task, ∼50 to 100 HU difference). Ten repeat acquisitions were collected for each protocol with varying exposures, reconstructed slice thicknesses, and reconstruction kernels. Volume measurements were obtained using a matched-filter approach (MF) and a publicly available 3D segmentation-based tool (SB). Discrimination of nodule sizes was assessed using the area under the ROC curve (AUC).
RESULTS: Using a low-dose (1.3 mGy), thin-slice (≤1.5 mm) protocol, changes of 0.25 mm in diameter were detected with AUC = 1.0 for all baseline sizes for the high-contrast task, regardless of measurement method. For the more challenging low-contrast task and the same protocol, MF detected changes of 0.25 mm from baseline sizes ≥5.25 mm and volume changes ≥9.4% with AUC ≥ 0.81, whereas corresponding results for SB were poor (AUC within 0.49-0.60). Performance for SB improved, but remained inconsistent, when exposure was increased to 4.4 mGy.
CONCLUSION: The reliable discrimination of small changes in pulmonary nodule size with low-dose, thin-slice CT protocols suitable for lung cancer screening depended on the interrelated effects of nodule-to-background contrast and measurement method.
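The AUC used above to score size-change discrimination equals the Mann-Whitney probability that a measurement of the larger nodule exceeds a measurement of the baseline nodule (ties counted as one half). A short sketch with toy repeat measurements, not the study's data:

```python
# Sketch: AUC for size-change discrimination from repeat volume
# measurements, via the Mann-Whitney / Wilcoxon rank statistic.
# AUC = P(measurement_of_larger > measurement_of_baseline),
# with ties counted as 1/2. Toy numbers only.

def auc_mann_whitney(baseline, larger):
    wins = 0.0
    for b in baseline:
        for l in larger:
            if l > b:
                wins += 1.0
            elif l == b:
                wins += 0.5
    return wins / (len(baseline) * len(larger))

# Ten repeat volume measurements (mm^3) for a baseline nodule and a
# slightly larger one -- perfectly separated, so AUC = 1.0
base = [65.1, 64.8, 65.6, 65.0, 64.9, 65.3, 65.2, 64.7, 65.4, 65.0]
bigger = [75.9, 76.3, 75.7, 76.1, 76.0, 75.8, 76.2, 76.4, 75.6, 76.0]
print(auc_mann_whitney(base, bigger))  # 1.0
```

When the repeat-measurement distributions overlap (as in the low-contrast SB results), the pairwise wins drop and the AUC falls toward 0.5.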
23
Robins M, Kalpathy-Cramer J, Obuchowski NA, Buckler A, Athelogou M, Jarecha R, Petrick N, Pezeshk A, Sahiner B, Samei E. Evaluation of Simulated Lesions as Surrogates to Clinical Lesions for Thoracic CT Volumetry: The Results of an International Challenge. Acad Radiol 2019; 26:e161-e173. [PMID: 30219290 DOI: 10.1016/j.acra.2018.07.022] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 05/03/2018] [Revised: 07/29/2018] [Accepted: 07/30/2018] [Indexed: 10/28/2022]
Abstract
RATIONALE AND OBJECTIVES: To evaluate a new approach to establishing compliance of segmentation tools with the computed tomography (CT) volumetry profile of the Quantitative Imaging Biomarker Alliance (QIBA), and to determine the statistical exchangeability between real and simulated lesions through an international challenge.
MATERIALS AND METHODS: The study used an anthropomorphic phantom with 16 embedded physical lesions and 30 patient cases from the Reference Image Database to Evaluate Therapy Response with pathologically confirmed malignancies. Hybrid datasets were generated by virtually inserting simulated lesions corresponding to the physical lesions into the phantom datasets using one projection-domain-based method (Method 1) and two image-domain insertion methods (Methods 2 and 3), and by inserting simulated lesions corresponding to real lesions into the Reference Image Database to Evaluate Therapy Response dataset (using Method 2). The volumes of the real and simulated lesions were compared based on bias (measured mean volume differences between physical and virtually inserted lesions in phantoms, as quantified by segmentation algorithms), repeatability, reproducibility, equivalence (phantom phase), and overall QIBA compliance (phantom and clinical phases).
RESULTS: For the phantom phase, three of eight groups were fully QIBA compliant, and one was marginally compliant. For the compliant groups, the estimated biases were -1.8 ± 1.4%, -2.5 ± 1.1%, -3 ± 1%, and -1.8 ± 1.5% (±95% confidence interval). No virtual insertion method showed statistical equivalence to physical insertion in bias equivalence testing using Schuirmann's two one-sided tests (±5% equivalence margin). Differences in repeatability and reproducibility across physical and simulated lesions were largely comparable (0.1%-16% and 7%-18% differences, respectively). For the clinical phase, 7 of 16 groups were QIBA compliant.
CONCLUSION: Hybrid datasets yielded conclusions similar to real CT datasets: groups that were QIBA compliant on the physical phantom data were also compliant on the hybrid datasets. Some groups were deemed compliant for the simulated insertion methods but not for the physical lesion measurements; the magnitude of this difference was small (<5.4%). While technical performance was not strictly equivalent, the two correlate well enough that volumetrically simulated lesions could potentially serve as practical proxies for clinical lesions.
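Schuirmann's two one-sided tests (TOST), used above for bias equivalence, declare equivalence only when both one-sided nulls (mean bias below -margin, mean bias above +margin) are rejected. The sketch below is a normal-approximation version using the standard library; the study's exact (t-based) test statistics and data differ, and the toy bias values are invented.

```python
# Sketch: Schuirmann's two one-sided tests (TOST) for equivalence of
# mean percent volume bias, with a +/-5% margin. Normal approximation
# via the stdlib; a t-based TOST would be used for small samples.
from math import sqrt
from statistics import NormalDist, mean, stdev

def tost_equivalent(diffs, margin=5.0, alpha=0.05):
    """diffs: per-lesion percent differences (virtual minus physical).
    Equivalent if we reject both H0: mu <= -margin and H0: mu >= +margin."""
    n = len(diffs)
    m = mean(diffs)
    se = stdev(diffs) / sqrt(n)
    z_lower = (m + margin) / se      # test against the lower bound
    z_upper = (margin - m) / se      # test against the upper bound
    p_lower = 1 - NormalDist().cdf(z_lower)
    p_upper = 1 - NormalDist().cdf(z_upper)
    return p_lower < alpha and p_upper < alpha

# Biases tightly clustered inside +/-5%: equivalence is declared
tight = [-1.2, -0.8, -1.5, -1.0, -0.9, -1.3, -1.1, -0.7, -1.4, -1.0]
# Biases centered near -6%: equivalence is not declared
wide = [-6.5, -5.8, -6.2, -7.0, -5.9, -6.4, -6.1, -6.6, -5.7, -6.3]
```

Note that TOST can fail to declare equivalence either because the mean bias is genuinely outside the margin or simply because the measurement spread is too large, which is why the abstract reports bias and precision separately.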
24
Keenan KE, Biller JR, Delfino JG, Boss MA, Does MD, Evelhoch JL, Griswold MA, Gunter JL, Hinks RS, Hoffman SW, Kim G, Lattanzi R, Li X, Marinelli L, Metzger GJ, Mukherjee P, Nordstrom RJ, Peskin AP, Perez E, Russek SE, Sahiner B, Serkova N, Shukla-Dave A, Steckner M, Stupic KF, Wilmes LJ, Wu HH, Zhang H, Jackson EF, Sullivan DC. Recommendations towards standards for quantitative MRI (qMRI) and outstanding needs. J Magn Reson Imaging 2019; 49:e26-e39. [PMID: 30680836 PMCID: PMC6663309 DOI: 10.1002/jmri.26598] [Citation(s) in RCA: 55] [Impact Index Per Article: 11.0] [Received: 09/14/2018] [Revised: 11/16/2018] [Accepted: 11/16/2018] [Indexed: 12/12/2022]
Abstract
Level of Evidence: 5. Technical Efficacy: Stage 5. J Magn Reson Imaging 2019.
Affiliation(s)
- Kathryn E Keenan, Physical Measurement Laboratory, National Institute of Standards and Technology, Boulder, Colorado, USA
- Joshua R Biller, Physical Measurement Laboratory, National Institute of Standards and Technology, Boulder, Colorado, USA
- Jana G Delfino, Center for Devices and Radiological Health, Food and Drug Administration, Silver Spring, Maryland, USA
- Michael A Boss, Physical Measurement Laboratory, National Institute of Standards and Technology, and Department of Physics, University of Colorado, Boulder, Colorado, USA
- Mark D Does, Vanderbilt University Institute of Imaging Science, Vanderbilt University, Nashville, Tennessee, USA
- Mark A Griswold, Department of Radiology, Case Western Reserve University, Cleveland, Ohio, USA
- Jeffrey L Gunter, Departments of Radiology and Information Technology, Mayo Clinic, Rochester, Minnesota, USA
- Stuart W Hoffman, Rehabilitation Research and Development Service, Department of Veterans Affairs, Washington, DC, USA
- Geena Kim, College of Computer & Information Sciences, Regis University, Denver, Colorado, USA
- Riccardo Lattanzi, Department of Radiology, New York University School of Medicine, New York, New York, USA
- Xiaojuan Li, Program of Advanced Musculoskeletal Imaging (PAMI), Cleveland Clinic, Cleveland, Ohio, USA
- Gregory J Metzger, Department of Radiology, University of Minnesota, Minneapolis, Minnesota, USA
- Pratik Mukherjee, Department of Radiology, University of California San Francisco, San Francisco, California, USA
- Adele P Peskin, Information Technology Laboratory, National Institute of Standards and Technology, Boulder, Colorado, USA
- Stephen E Russek, Physical Measurement Laboratory, National Institute of Standards and Technology, Boulder, Colorado, USA
- Berkman Sahiner, Center for Devices and Radiological Health, Food and Drug Administration, Silver Spring, Maryland, USA
- Natalie Serkova, Department of Radiology, Anschutz Medical Center, Aurora, Colorado, USA
- Amita Shukla-Dave, Departments of Medical Physics and Radiology, Memorial Sloan Kettering Cancer Center, New York, New York, USA
- Karl F Stupic, Physical Measurement Laboratory, National Institute of Standards and Technology, Boulder, Colorado, USA
- Lisa J Wilmes, Department of Radiology and Biomedical Imaging, University of California San Francisco, San Francisco, California, USA
- Holden H Wu, Department of Radiological Sciences, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, California, USA
- Edward F Jackson, Department of Medical Physics, University of Wisconsin School of Medicine and Public Health, Madison, Wisconsin, USA
- Daniel C Sullivan, Department of Radiology, Duke University Medical Center, Durham, North Carolina, USA
25
Gallas BD, Chen W, Cole E, Ochs R, Petrick N, Pisano ED, Sahiner B, Samuelson FW, Myers KJ. Impact of prevalence and case distribution in lab-based diagnostic imaging studies. J Med Imaging (Bellingham) 2019; 6:015501. [PMID: 30713851 PMCID: PMC6340399 DOI: 10.1117/1.jmi.6.1.015501] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 09/10/2018] [Accepted: 12/17/2018] [Indexed: 11/14/2022]
Abstract
We investigated the effects of prevalence and case distribution on radiologist diagnostic performance, as measured by the area under the receiver operating characteristic curve (AUC) and by sensitivity and specificity, in lab-based reader studies evaluating imaging devices. Our retrospective reader studies compared full-field digital mammography (FFDM) to screen-film mammography (SFM) for women with dense breasts. Mammograms were acquired from the prospective Digital Mammographic Imaging Screening Trial. We performed five reader studies that differed in cancer prevalence and in the distribution of noncancers. Twenty radiologists participated in each reader study. Using split-plot study designs, we collected recall decisions and multilevel scores from the radiologists for calculating sensitivity, specificity, and AUC. Differences in reader-averaged AUCs slightly favored SFM over FFDM (biggest AUC difference: 0.047, SE = 0.023, p = 0.047), where the standard error accounts for reader and case variability. The differences were not significant at a level of 0.01 (0.05/5 reader studies). The differences in sensitivities and specificities were also indeterminate. Prevalence had little effect on AUC (largest difference: 0.02), whereas sensitivity increased and specificity decreased as prevalence increased. We found that AUC is robust to changes in prevalence, while radiologists were more aggressive with recall decisions as prevalence increased.
Affiliation(s)
- Brandon D. Gallas, FDA/CDRH/OSEL/Division of Imaging, Diagnostics, and Software Reliability, Silver Spring, Maryland, United States
- Weijie Chen, FDA/CDRH/OSEL/Division of Imaging, Diagnostics, and Software Reliability, Silver Spring, Maryland, United States
- Elodia Cole, Medical University of South Carolina, Charleston, South Carolina, United States
- Robert Ochs, FDA/CDRH/OIR/Division of Radiological Health, Silver Spring, Maryland, United States
- Nicholas Petrick, FDA/CDRH/OSEL/Division of Imaging, Diagnostics, and Software Reliability, Silver Spring, Maryland, United States
- Etta D. Pisano, Medical University of South Carolina, Charleston, South Carolina, United States
- Berkman Sahiner, FDA/CDRH/OSEL/Division of Imaging, Diagnostics, and Software Reliability, Silver Spring, Maryland, United States
- Frank W. Samuelson, FDA/CDRH/OSEL/Division of Imaging, Diagnostics, and Software Reliability, Silver Spring, Maryland, United States
- Kyle J. Myers, FDA/CDRH/OSEL/Division of Imaging, Diagnostics, and Software Reliability, Silver Spring, Maryland, United States
26
Sahiner B, Pezeshk A, Hadjiiski LM, Wang X, Drukker K, Cha KH, Summers RM, Giger ML. Deep learning in medical imaging and radiation therapy. Med Phys 2018; 46:e1-e36. [PMID: 30367497 DOI: 10.1002/mp.13264] [Citation(s) in RCA: 354] [Impact Index Per Article: 59.0] [Received: 01/04/2018] [Revised: 09/18/2018] [Accepted: 10/09/2018] [Indexed: 12/15/2022]
Abstract
The goals of this review paper on deep learning (DL) in medical imaging and radiation therapy are to (a) summarize what has been achieved to date; (b) identify common and unique challenges, and strategies that researchers have taken to address these challenges; and (c) identify some of the promising avenues for the future both in terms of applications as well as technical innovations. We introduce the general principles of DL and convolutional neural networks, survey five major areas of application of DL in medical imaging and radiation therapy, identify common themes, discuss methods for dataset expansion, and conclude by summarizing lessons learned, remaining challenges, and future directions.
Affiliation(s)
- Berkman Sahiner, DIDSR/OSEL/CDRH, U.S. Food and Drug Administration, Silver Spring, MD, 20993, USA
- Aria Pezeshk, DIDSR/OSEL/CDRH, U.S. Food and Drug Administration, Silver Spring, MD, 20993, USA
- Xiaosong Wang, Imaging Biomarkers and Computer-aided Diagnosis Lab, Radiology and Imaging Sciences, NIH Clinical Center, Bethesda, MD, 20892-1182, USA
- Karen Drukker, Department of Radiology, University of Chicago, Chicago, IL, 60637, USA
- Kenny H Cha, DIDSR/OSEL/CDRH, U.S. Food and Drug Administration, Silver Spring, MD, 20993, USA
- Ronald M Summers, Imaging Biomarkers and Computer-aided Diagnosis Lab, Radiology and Imaging Sciences, NIH Clinical Center, Bethesda, MD, 20892-1182, USA
- Maryellen L Giger, Department of Radiology, University of Chicago, Chicago, IL, 60637, USA
27
Ghanian Z, Pezeshk A, Petrick N, Sahiner B. Computational insertion of microcalcification clusters on mammograms: reader differentiation from native clusters and computer-aided detection comparison. J Med Imaging (Bellingham) 2018; 5:044502. [PMID: 30840741 DOI: 10.1117/1.jmi.5.4.044502] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 05/25/2018] [Accepted: 10/10/2018] [Indexed: 11/14/2022]
Abstract
Mammographic computer-aided detection (CADe) devices are typically first developed and assessed for a specific "original" acquisition system. When developers are ready to apply their CADe device to a new mammographic acquisition system, they typically assess the device with images acquired using that system. Collecting large repositories of clinical images containing verified lesion locations acquired by a system is costly and time consuming. We previously developed an image blending technique that allows users to seamlessly insert regions of interest (ROIs) from one medical image into another image. Our goal is to assess the performance of this technique for inserting microcalcification clusters from one mammogram into another, with the idea that when fully developed, our technique may be useful for reducing the clinical data burden in the assessment of a CADe device for use with an image acquisition system. We first perform a reader study to assess whether experienced observers can distinguish between computationally inserted and native clusters. For this purpose, we apply our insertion technique to 55 clinical cases. ROIs containing microcalcification clusters from one breast of a patient are inserted into the contralateral breast of the same patient. The analysis of the reader ratings using receiver operating characteristic (ROC) methodology indicates that inserted clusters cannot be reliably distinguished from native clusters (area under the ROC curve = 0.58 ± 0.04). Furthermore, CADe sensitivity is evaluated on mammograms of 68 clinical cases with native and inserted microcalcification clusters using a commercial CADe system. The average by-case sensitivities for native and inserted clusters are equal, at 85.3% (58/68). The average by-image sensitivities for native and inserted clusters are 72.3% and 67.6%, respectively, with a difference of 4.7% and a 95% confidence interval of [-2.1%, 11.6%]. These results demonstrate the potential for using inserted microcalcification clusters to assess mammographic CADe devices.
Affiliation(s)
- Zahra Ghanian, U.S. Food and Drug Administration, Center for Devices and Radiological Health, Silver Spring, Maryland, United States
- Aria Pezeshk, U.S. Food and Drug Administration, Center for Devices and Radiological Health, Silver Spring, Maryland, United States
- Nicholas Petrick, U.S. Food and Drug Administration, Center for Devices and Radiological Health, Silver Spring, Maryland, United States
- Berkman Sahiner, U.S. Food and Drug Administration, Center for Devices and Radiological Health, Silver Spring, Maryland, United States
28
Li Q, Berman BP, Hagio T, Gavrielides MA, Zeng R, Sahiner B, Gong Q, Fang Y, Liu S, Petrick N. Coronary artery calcium quantification using contrast-enhanced dual-energy computed tomography scans in comparison with unenhanced single-energy scans. Phys Med Biol 2018; 63:175006. [PMID: 30101756 PMCID: PMC6183065 DOI: 10.1088/1361-6560/aad9be] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 11/12/2022]
Abstract
Extracting coronary artery calcium (CAC) scores from contrast-enhanced computed tomography (CT) images using dual-energy (DE) based material decomposition has been shown to be feasible, mainly through patient studies. However, the quantitative performance of such DE-based CAC scores, particularly per stenosis, is underexamined due to the lack of a reference standard and repeated scans. In this work we conducted a comprehensive quantitative analysis of CAC scores obtained with DE and compared them to conventional unenhanced single-energy (SE) CT scans through phantom studies. Synthetic vessels filled with iodinated blood-mimicking material and containing calcium stenoses of different sizes and densities were scanned with a third-generation dual-source CT scanner in a chest phantom using a DE coronary CT angiography protocol at three exposure/CTDIvol settings: auto-mAs/8 mGy (automatic exposure control), 160 mAs/20 mGy, and 260 mAs/34 mGy, with 10 repeats each. As a control, a set of vessel phantoms without iodine was scanned using a standard SE CAC score protocol (3 mGy). Calcium volume, mass, and Agatston scores were estimated for each stenosis. For the DE dataset, image-based three-material decomposition was applied to remove iodine before scoring. Performance of the DE-based calcium scores was analyzed on a per-stenosis level and compared to the SE-based scores. There was excellent correlation between the DE- and SE-based scores (correlation coefficient r: 0.92-0.98). Percent bias for the calcium volume and mass scores varied as a function of stenosis size and density for both modalities. Precision (coefficient of variation) improved with larger and denser stenoses for both DE- and SE-based calcium scores. DE-based scores (20 mGy and 34 mGy) provided per-stenosis precision comparable to SE-based scores (3 mGy). Our findings suggest that on a per-stenosis level, DE-based CAC scores from contrast-enhanced CT images can achieve quantification performance comparable to conventional SE-based scores. However, DE-based CAC scoring required a higher dose than SE-based scoring to achieve high per-stenosis precision, so some caution is warranted with clinical DE-based CAC scoring.
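The Agatston score referenced above follows a standard recipe: for each axial slice of a lesion, the calcified area (voxels at or above 130 HU) is multiplied by a weight set by that slice's peak attenuation (130-199 HU → 1, 200-299 → 2, 300-399 → 3, ≥400 → 4), and the products are summed. A minimal sketch with toy numbers, not the study's scoring implementation:

```python
# Sketch: the standard Agatston score for one calcified lesion.
# Per slice: lesion area (mm^2, for voxels >= 130 HU) times a weight
# determined by the slice's peak attenuation, summed over slices.

def density_weight(peak_hu):
    """Standard Agatston density weighting by peak HU in the slice."""
    if peak_hu >= 400:
        return 4
    if peak_hu >= 300:
        return 3
    if peak_hu >= 200:
        return 2
    if peak_hu >= 130:
        return 1
    return 0  # below the 130 HU calcium threshold: no contribution

def agatston_score(slices):
    """slices: list of (area_mm2, peak_hu) per axial slice of the lesion."""
    return sum(area * density_weight(peak) for area, peak in slices)

# A small stenosis spanning three slices
lesion = [(4.0, 220), (6.5, 410), (3.0, 180)]
print(agatston_score(lesion))  # 4*2 + 6.5*4 + 3*1 = 37.0
```

In the DE workflow described above, the same scoring would be applied only after iodine has been removed by material decomposition, since enhanced blood can otherwise cross the 130 HU threshold and mimic calcium.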
Affiliation(s)
- Qin Li, US Food and Drug Administration, CDRH/OSEL/DIDSR, Silver Spring, MD, United States of America
29
Senaras C, Niazi MKK, Sahiner B, Pennell MP, Tozbikian G, Lozanski G, Gurcan MN. Optimized generation of high-resolution phantom images using cGAN: Application to quantification of Ki67 breast cancer images. PLoS One 2018; 13:e0196846. [PMID: 29742125 PMCID: PMC5942823 DOI: 10.1371/journal.pone.0196846] [Citation(s) in RCA: 31] [Impact Index Per Article: 5.2] [Received: 11/16/2017] [Accepted: 04/20/2018] [Indexed: 11/29/2022]
Abstract
In pathology, immunohistochemical (IHC) staining of tissue sections is regularly used to diagnose and grade malignant tumors. Typically, IHC stain interpretation is rendered by a trained pathologist using a manual method that consists of counting each positively and negatively stained cell under a microscope. Manual enumeration suffers from poor reproducibility even in the hands of expert pathologists. To facilitate this process, we propose a novel method to create artificial datasets with known ground truth, which allows us to analyze recall, precision, accuracy, and intra- and inter-observer variability in a systematic manner, enabling comparison of different computer analysis approaches. Our method employs a conditional generative adversarial network that uses a database of Ki67-stained tissues of breast cancer patients to generate synthetic digital slides. Our experiments show that the synthetic images are indistinguishable from real images. Six readers (three pathologists and three image analysts) attempted to differentiate 15 real from 15 synthetic images; the probability that the average reader could correctly classify an image as synthetic or real more than 50% of the time was only 44.7%.
Affiliation(s)
- Caglar Senaras, Center for Biomedical Informatics, Wake Forest School of Medicine, Winston-Salem, North Carolina, United States of America
- Muhammad Khalid Khan Niazi, Center for Biomedical Informatics, Wake Forest School of Medicine, Winston-Salem, North Carolina, United States of America
- Berkman Sahiner, Office of Science and Engineering Laboratories, Center for Devices and Radiological Health, Food and Drug Administration, Silver Spring, Maryland, United States of America
- Michael P. Pennell, Division of Biostatistics, College of Public Health, The Ohio State University, Columbus, Ohio, United States of America
- Gary Tozbikian, Department of Pathology, The Ohio State University, Columbus, Ohio, United States of America
- Gerard Lozanski, Department of Pathology, The Ohio State University, Columbus, Ohio, United States of America
- Metin N. Gurcan, Center for Biomedical Informatics, Wake Forest School of Medicine, Winston-Salem, North Carolina, United States of America
30
Abstract
Scores produced by statistical classifiers in many clinical decision support systems and other medical diagnostic devices are generally on an arbitrary scale, so the clinical meaning of these scores is unclear. Calibration of classifier scores to a meaningful scale, such as the probability of disease, is potentially useful when such scores are used by a physician. In this work, we investigated three methods (parametric, semi-parametric, and non-parametric) for calibrating classifier scores to the probability-of-disease scale and developed uncertainty estimation techniques for these methods. We showed that classifier scores on arbitrary scales can be calibrated to the probability-of-disease scale without affecting their discrimination performance. With a finite dataset to train the calibration function, it is important to accompany the probability estimate with its confidence interval. Our simulations indicate that, when the dataset used for finding the calibration transformation is also used for estimating calibration performance, resubstitution bias exists for performance metrics that involve the truth states. However, the bias is small for the parametric and semi-parametric methods when the sample size is moderate to large (>100 per class).
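A key point in the abstract is that calibration can change the scale of a score without changing its discrimination: any strictly monotone map preserves the ranking of cases, and hence the AUC. One standard parametric choice is a Platt-style logistic map; the sketch below fits such a map by plain gradient descent on toy data and is illustrative only, not necessarily the paper's exact parametric model.

```python
# Sketch: parametric (Platt-style) calibration of classifier scores to
# probability of disease via a logistic map p = sigmoid(a*score + b),
# fit by gradient descent on mean log-loss. Because the fitted map is
# monotone (a > 0 here), case ranking -- and hence AUC -- is unchanged.
from math import exp

def fit_logistic(scores, labels, lr=0.1, iters=5000):
    a, b = 0.0, 0.0
    n = len(scores)
    for _ in range(iters):
        ga = gb = 0.0
        for s, y in zip(scores, labels):
            p = 1 / (1 + exp(-(a * s + b)))
            ga += (p - y) * s / n  # gradient of mean log-loss w.r.t. a
            gb += (p - y) / n      # gradient of mean log-loss w.r.t. b
        a -= lr * ga
        b -= lr * gb
    return a, b

def calibrate(score, a, b):
    return 1 / (1 + exp(-(a * score + b)))

# Toy arbitrary-scale scores and truth states (1 = diseased)
scores = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
labels = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]
a, b = fit_logistic(scores, labels)
probs = [calibrate(s, a, b) for s in scores]
# Monotone map: the ordering of cases is preserved
assert probs == sorted(probs)
```

With a finite calibration set, the fitted (a, b) are themselves uncertain, which is why the paper pairs each calibrated probability with a confidence interval.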
Affiliation(s)
- Weijie Chen
- Office of Science and Engineering Laboratories, Center for Devices and Radiological Health, Food and Drug Administration, Silver Spring, USA
- Berkman Sahiner
- Office of Science and Engineering Laboratories, Center for Devices and Radiological Health, Food and Drug Administration, Silver Spring, USA
- Frank Samuelson
- Office of Science and Engineering Laboratories, Center for Devices and Radiological Health, Food and Drug Administration, Silver Spring, USA
- Aria Pezeshk
- Office of Science and Engineering Laboratories, Center for Devices and Radiological Health, Food and Drug Administration, Silver Spring, USA
- Nicholas Petrick
- Office of Science and Engineering Laboratories, Center for Devices and Radiological Health, Food and Drug Administration, Silver Spring, USA
31
Schoener B, Baird P, Dorn L, Giuliano KK, Ho M, Jump M, Sahiner B, Zink R. Using Data-Based Decisions to Transform Health Technology and Improve Patient Care. Biomed Instrum Technol 2018; 52:7-16. PMID: 29775385. DOI: 10.2345/0899-8205-52.s2.7.
32
Robins M, Solomon J, Sahbaee P, Sedlmair M, Roy Choudhury K, Pezeshk A, Sahiner B, Samei E. Techniques for virtual lung nodule insertion: volumetric and morphometric comparison of projection-based and image-based methods for quantitative CT. Phys Med Biol 2017; 62:7280-7299. PMID: 28786399. DOI: 10.1088/1361-6560/aa83f8.
Abstract
Virtual nodule insertion paves the way towards the development of standardized databases of hybrid CT images with known lesions. The purpose of this study was to assess three methods (one established and two newly developed techniques) for inserting virtual lung nodules into CT images. Assessment was done by comparing virtual nodule volume and shape to the CT-derived volume and shape of synthetic nodules. 24 synthetic nodules (three sizes, four morphologies, two repeats) were physically inserted into the lung cavity of an anthropomorphic chest phantom (KYOTO KAGAKU). The phantom was imaged with and without nodules on a commercial CT scanner (SOMATOM Definition Flash, Siemens) using a standard thoracic CT protocol at two dose levels (1.4 and 22 mGy CTDIvol). Raw projection data were saved and reconstructed with filtered back-projection and sinogram affirmed iterative reconstruction (SAFIRE, strength 5) at 0.6 mm slice thickness. Corresponding 3D idealized, virtual nodule models were co-registered with the CT images to determine each nodule's location and orientation. Virtual nodules were voxelized, partial-volume corrected, and inserted into nodule-free CT data (accounting for system imaging physics) using two methods: a projection-based Technique A and an image-based Technique B. A third method, Technique C, based on cropping a region of interest from the acquired image of the real nodule and blending it into the nodule-free image, was also tested. Nodule volumes were measured using a commercial segmentation tool (iNtuition, TeraRecon, Inc.) and deformation was assessed using the Hausdorff distance. Nodule volumes and deformations were compared between the idealized, CT-derived, and virtual nodules using a linear mixed effects regression model that utilized the mean, standard deviation, and coefficient of variation of the regional Hausdorff distance.
Overall, there was close concordance between the volumes of the CT-derived and virtual nodules. Percent differences between them were less than 3% for all insertion techniques and were not statistically significant in most cases. Correlation coefficient values were greater than 0.97. The deformation according to the Hausdorff distance was also similar between the CT-derived and virtual nodules, with minimal statistical significance for Techniques A, B, and C. This study shows that both projection-based and image-based nodule insertion techniques yield realistic nodule renderings with statistical similarity to the synthetic nodules with respect to nodule volume and deformation. These techniques could be used to create a database of hybrid CT images containing nodules of known size, location, and morphology.
Affiliation(s)
- Marthony Robins
- Carl E. Ravin Advanced Imaging Laboratories, Department of Radiology, Medical Physics Graduate Program, Duke University Medical Center, Durham, NC 27705, United States of America
33
Liu J, Wang D, Lu L, Wei Z, Kim L, Turkbey EB, Sahiner B, Petrick NA, Summers RM. Detection and diagnosis of colitis on computed tomography using deep convolutional neural networks. Med Phys 2017; 44:4630-4642. PMID: 28594460. DOI: 10.1002/mp.12399.
Abstract
PURPOSE: Colitis refers to inflammation of the inner lining of the colon and is frequently associated with infection and allergic reactions. In this paper, we propose deep convolutional neural network methods for lesion-level colitis detection and a support vector machine (SVM) classifier for patient-level colitis diagnosis on routine abdominal CT scans. METHODS: The recently developed Faster Region-based Convolutional Neural Network (Faster RCNN) is utilized for lesion-level colitis detection. For each 2D slice, rectangular region proposals are generated by region proposal networks (RPN). Then, each region proposal is jointly classified and refined by a softmax classifier and a bounding-box regressor. Two convolutional neural networks, the eight-layer ZF net and the 16-layer VGG net, are compared for colitis detection. Finally, for each patient, the detections on all 2D slices are collected and an SVM classifier is applied to develop a patient-level diagnosis. We trained and evaluated our method with 80 colitis patients and 80 normal cases using 4 × 4-fold cross-validation. RESULTS: For lesion-level colitis detection with ZF net, the mean of average precisions (mAP) was 48.7% and 50.9% for RCNN and Faster RCNN, respectively. The detection system achieved sensitivities of 51.4% and 54.0% at two false positives per patient for RCNN and Faster RCNN, respectively. With VGG net, Faster RCNN increased the mAP to 56.9% and the sensitivity to 58.4% at two false positives per patient. For patient-level colitis diagnosis with ZF net, the average areas under the ROC curve (AUC) were 0.978 ± 0.009 and 0.984 ± 0.008 for the RCNN and Faster RCNN methods, respectively. The difference was not statistically significant (P = 0.18). At the optimal operating point, the RCNN method correctly identified 90.4% (72.3/80) of the colitis patients and 94.0% (75.2/80) of the normal cases.
The sensitivity improved to 91.6% (73.3/80) and the specificity to 95.0% (76.0/80) for the Faster RCNN method. With VGG net, Faster RCNN increased the AUC to 0.986 ± 0.007 and the diagnosis sensitivity to 93.7% (75.0/80), while specificity was unchanged at 95.0% (76.0/80). CONCLUSION: Colitis detection and diagnosis by deep convolutional neural networks is accurate and promising for future clinical application.
Affiliation(s)
- Jiamin Liu
- Imaging Biomarkers and Computer-aided Diagnosis Laboratory and Clinical Image Processing Service, Radiology and Imaging Sciences, National Institutes of Health Clinical Center, Bethesda, MD, 20892-1182, USA
- David Wang
- Imaging Biomarkers and Computer-aided Diagnosis Laboratory and Clinical Image Processing Service, Radiology and Imaging Sciences, National Institutes of Health Clinical Center, Bethesda, MD, 20892-1182, USA
- Le Lu
- Imaging Biomarkers and Computer-aided Diagnosis Laboratory and Clinical Image Processing Service, Radiology and Imaging Sciences, National Institutes of Health Clinical Center, Bethesda, MD, 20892-1182, USA
- Zhuoshi Wei
- Imaging Biomarkers and Computer-aided Diagnosis Laboratory and Clinical Image Processing Service, Radiology and Imaging Sciences, National Institutes of Health Clinical Center, Bethesda, MD, 20892-1182, USA
- Lauren Kim
- Imaging Biomarkers and Computer-aided Diagnosis Laboratory and Clinical Image Processing Service, Radiology and Imaging Sciences, National Institutes of Health Clinical Center, Bethesda, MD, 20892-1182, USA
- Evrim B Turkbey
- Imaging Biomarkers and Computer-aided Diagnosis Laboratory and Clinical Image Processing Service, Radiology and Imaging Sciences, National Institutes of Health Clinical Center, Bethesda, MD, 20892-1182, USA
- Ronald M Summers
- Imaging Biomarkers and Computer-aided Diagnosis Laboratory and Clinical Image Processing Service, Radiology and Imaging Sciences, National Institutes of Health Clinical Center, Bethesda, MD, 20892-1182, USA
34
Abas FS, Shana’ah A, Christian B, Hasserjian R, Louissaint A, Pennell M, Sahiner B, Chen W, Niazi MKK, Lozanski G, Gurcan M. Computer-assisted quantification of CD3+ T cells in follicular lymphoma. Cytometry A 2017; 91:609-621. PMID: 28110507. PMCID: PMC10680104. DOI: 10.1002/cyto.a.23049.
Abstract
The advance of high-resolution digital scans of pathology slides has allowed the development of computer-based image analysis algorithms that may help pathologists quantify IHC stains. While very promising, these methods require further refinement before they are implemented in routine clinical settings. It is particularly critical to evaluate algorithm performance in a setting similar to current clinical practice. In this article, we present a pilot study that evaluates the use of a computerized cell quantification method in the clinical estimation of CD3-positive (CD3+) T cells in follicular lymphoma (FL). Our goal is to demonstrate the degree to which computerized quantification is comparable to the practice of estimation by a panel of expert pathologists. The computerized quantification method uses entropy-based histogram thresholding to separate brown (CD3+) and blue (CD3-) regions after a color space transformation. A panel of four board-certified hematopathologists evaluated a database of 20 FL images using two different reading methods: visual estimation and manual marking of each CD3+ cell in the images. These image data and the readings provided a reference standard and the range of variability among readers. Sensitivity and specificity of the computer's segmentation of CD3+ and CD3- cells were recorded. Across the four pathologists, mean sensitivity and specificity were 90.97% and 88.38%, respectively. The computerized quantification method agrees more with the manual cell marking than with the visual estimations. Statistical comparison between the computerized quantification method and the pathologist readings demonstrated good agreement, with correlation coefficient values of 0.81 and 0.96 in terms of Lin's concordance correlation and Spearman's correlation coefficient, respectively. These values are higher than most of those calculated among the pathologists.
In the future, the computerized quantification method may be used to investigate the relationship between the overall architectural pattern (i.e., interfollicular vs. follicular) and outcome measures (e.g., overall survival, and time to treatment). © 2017 International Society for Advancement of Cytometry.
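The entropy-based histogram thresholding step can be sketched with Kapur's classic maximum-entropy criterion applied to a single stain channel. The channel values below are synthetic, and the paper's exact color-space transformation and threshold variant may differ; this is only an illustration of the family of methods named in the abstract.

```python
import numpy as np

def kapur_threshold(channel, nbins=256):
    """Pick the threshold maximizing the summed entropies of the two classes
    (Kapur's maximum-entropy histogram thresholding)."""
    hist, edges = np.histogram(channel, bins=nbins)
    p = hist / hist.sum()
    best_t, best_h = 0, -np.inf
    for t in range(1, nbins):
        w0, w1 = p[:t].sum(), p[t:].sum()
        if w0 == 0 or w1 == 0:
            continue
        p0, p1 = p[:t] / w0, p[t:] / w1          # class-conditional histograms
        h0 = -np.sum(p0[p0 > 0] * np.log(p0[p0 > 0]))
        h1 = -np.sum(p1[p1 > 0] * np.log(p1[p1 > 0]))
        if h0 + h1 > best_h:
            best_h, best_t = h0 + h1, t
    return edges[best_t]

# Toy bimodal "stain" channel: two intensity populations (e.g. blue vs brown).
rng = np.random.default_rng(1)
channel = np.concatenate([rng.normal(60, 10, 5000), rng.normal(180, 10, 5000)])
t = kapur_threshold(channel)
assert 80 < t < 160  # the threshold falls in the valley between the two modes
```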
Affiliation(s)
- Fazly S. Abas
- Center for e-Health, Faculty of Engineering and Technology, Multimedia University, Melaka, Malaysia
- Arwa Shana’ah
- Department of Pathology, The Ohio State University, Columbus, Ohio
- Beth Christian
- Department of Internal Medicine, The Ohio State University, Columbus, Ohio
- Robert Hasserjian
- Department of Pathology, Massachusetts General Hospital, Boston, Massachusetts
- Abner Louissaint
- Department of Pathology, Massachusetts General Hospital, Boston, Massachusetts
- Michael Pennell
- Division of Biostatistics, College of Public Health, The Ohio State University, Columbus, Ohio
- Berkman Sahiner
- Center for Devices and Radiological Health, Food and Drug Administration, Silver Spring, Maryland
- Weijie Chen
- Center for Devices and Radiological Health, Food and Drug Administration, Silver Spring, Maryland
- Gerard Lozanski
- Department of Pathology, The Ohio State University, Columbus, Ohio
- Metin Gurcan
- Department of Biomedical Informatics, The Ohio State University, Columbus, Ohio
35
Abstract
The performance of a classifier is largely dependent on the size and representativeness of data used for its training. In circumstances where accumulation and/or labeling of training samples is difficult or expensive, such as medical applications, data augmentation can potentially be used to alleviate the limitations of small datasets. We have previously developed an image blending tool that allows users to modify or supplement an existing CT or mammography dataset by seamlessly inserting a lesion extracted from a source image into a target image. This tool also provides the option to apply various types of transformations to different properties of the lesion prior to its insertion into a new location. In this study, we used this tool to create synthetic samples that appear realistic in chest CT. We then augmented different size training sets with these artificial samples, and investigated the effect of the augmentation on training various classifiers for the detection of lung nodules. Our results indicate that the proposed lesion insertion method can improve classifier performance for small training datasets, and thereby help reduce the need to acquire and label actual patient data.
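As a rough illustration of lesion-insertion augmentation, the sketch below blends a lesion patch into a target image with a smooth radial alpha mask. This is a deliberately simplified stand-in for the seamless blending the abstract's tool performs; all arrays, sizes, and HU values here are invented for illustration.

```python
import numpy as np

def insert_lesion(target, lesion, center, sigma=3.0):
    """Blend a lesion patch into a target image with a smooth radial alpha
    mask (a simplified stand-in for seamless insertion)."""
    h, w = lesion.shape
    yy, xx = np.mgrid[:h, :w]
    r = np.hypot(yy - (h - 1) / 2, xx - (w - 1) / 2)
    alpha = np.exp(-(r / sigma) ** 2)          # 1 at the centre, -> 0 at edges
    out = target.copy()
    y0, x0 = center[0] - h // 2, center[1] - w // 2
    roi = out[y0:y0 + h, x0:x0 + w]
    out[y0:y0 + h, x0:x0 + w] = alpha * lesion + (1 - alpha) * roi
    return out

# Toy example: a bright "lesion" patch inserted into a noisy lung-like background.
rng = np.random.default_rng(2)
background = rng.normal(-800, 20, (64, 64))    # lung-like HU values
lesion = np.full((9, 9), 50.0)                 # soft-tissue-like HU patch
augmented = insert_lesion(background, lesion, center=(32, 32))
assert augmented[32, 32] > background[32, 32]  # centre got brighter
```

Augmentation then consists of inserting such patches at many locations, with transformed sizes and contrasts, to enlarge a small training set.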
36
Hamidian S, Sahiner B, Petrick N, Pezeshk A. 3D Convolutional Neural Network for Automatic Detection of Lung Nodules in Chest CT. Proc SPIE Int Soc Opt Eng 2017; 10134:1013409. PMID: 28845077. PMCID: PMC5568782. DOI: 10.1117/12.2255795.
Abstract
Deep convolutional neural networks (CNNs) form the backbone of many state-of-the-art computer vision systems for classification and segmentation of 2D images. The same principles and architectures can be extended to three dimensions to obtain 3D CNNs that are suitable for volumetric data such as CT scans. In this work, we train a 3D CNN for automatic detection of pulmonary nodules in chest CT images using volumes of interest extracted from the LIDC dataset. We then convert the 3D CNN which has a fixed field of view to a 3D fully convolutional network (FCN) which can generate the score map for the entire volume efficiently in a single pass. Compared to the sliding window approach for applying a CNN across the entire input volume, the FCN leads to a nearly 800-fold speed-up, and thereby fast generation of output scores for a single case. This screening FCN is used to generate difficult negative examples that are used to train a new discriminant CNN. The overall system consists of the screening FCN for fast generation of candidate regions of interest, followed by the discrimination CNN.
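The CNN-to-FCN conversion described above works because a dense layer slid across an input is mathematically a convolution, so the whole volume can be scored in a single pass instead of window by window. A minimal 1-D numpy illustration of that equivalence (not the paper's 3D network):

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.normal(size=100)        # input "volume" (1-D for simplicity)
w = rng.normal(size=7)          # weights of a dense layer over a 7-sample window

# Sliding-window application of the dense layer: one dot product per position.
sliding = np.array([x[i:i + 7] @ w for i in range(len(x) - 6)])

# The same result in a single pass, as a correlation
# (convolution with the flipped kernel).
fcn = np.convolve(x, w[::-1], mode="valid")

assert np.allclose(sliding, fcn)
```

In 3D the single-pass version additionally reuses overlapping intermediate feature maps across windows, which is where the reported near-800-fold speed-up comes from.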
Affiliation(s)
- Sardar Hamidian
- Department of Computer Science, George Washington University, Washington, DC
- Berkman Sahiner
- Center for Devices and Radiological Health, U.S. Food and Drug Administration, Silver Spring, MD
- Nicholas Petrick
- Center for Devices and Radiological Health, U.S. Food and Drug Administration, Silver Spring, MD
- Aria Pezeshk
- Center for Devices and Radiological Health, U.S. Food and Drug Administration, Silver Spring, MD
37
Senaras C, Pennell M, Chen W, Sahiner B, Shana'ah A, Louissaint A, Hasserjian RP, Lozanski G, Gurcan MN. FOXP3-stained image analysis for follicular lymphoma: Optimal adaptive thresholding with maximal nucleus coverage. Proc SPIE Int Soc Opt Eng 2017; 10140. PMID: 28579665. DOI: 10.1117/12.2255671.
Abstract
Immunohistochemical detection of the FOXP3 antigen is a usable marker for detection of regulatory T lymphocytes (TR) in formalin-fixed, paraffin-embedded sections of different types of tumor tissue. TR play a major role in the homeostasis of normal immune systems, where they prevent autoreactivity of the immune system towards the host. This beneficial effect of TR is frequently "hijacked" by malignant cells, which recruit tumor-infiltrating regulatory T cells to inhibit the beneficial immune response of the host against the tumor cells. In the majority of human solid tumors, an increased number of tumor-infiltrating FOXP3-positive TR is associated with worse outcome. However, in follicular lymphoma (FL) the impact of the number and distribution of TR on the outcome remains controversial. In this study, we present a novel method to detect and enumerate nuclei in FOXP3-stained images of FL biopsies. The proposed method defines a new adaptive thresholding procedure, the optimal adaptive thresholding (OAT) method, which aims to minimize under-segmented and over-segmented nuclei during coarse segmentation. Next, we integrate a parameter-free elliptical arc and line segment detector (ELSD) as additional information to refine the segmentation results and to split most of the merged nuclei. Finally, we utilize a state-of-the-art superpixel method, Simple Linear Iterative Clustering (SLIC), to split the remaining merged nuclei. Our dataset consists of 13 region-of-interest images containing 769 negative and 88 positive nuclei. Three expert pathologists evaluated the method and reported sensitivity values in detecting negative and positive nuclei ranging from 83-100% and 90-95%, and precision values of 98-100% and 99-100%, respectively. The proposed solution can be used to investigate the impact of FOXP3-positive nuclei on the outcome and prognosis in FL.
Affiliation(s)
- C Senaras
- Department of Biomedical Informatics, The Ohio State University
- M Pennell
- Division of Biostatistics, College of Public Health, The Ohio State University
- W Chen
- Office of Science and Engineering Laboratories, Center for Devices and Radiological Health, Food and Drug Administration, S
- B Sahiner
- Office of Science and Engineering Laboratories, Center for Devices and Radiological Health, Food and Drug Administration, S
- A Shana'ah
- Department of Pathology, The Ohio State University
- A Louissaint
- Department of Pathology, Massachusetts General Hospital
- G Lozanski
- Department of Pathology, The Ohio State University
- M N Gurcan
- Department of Biomedical Informatics, The Ohio State University
38
Li Q, Liu S, Myers KJ, Gavrielides MA, Zeng R, Sahiner B, Petrick N. Impact of Reconstruction Algorithms and Gender-Associated Anatomy on Coronary Calcium Scoring with CT: An Anthropomorphic Phantom Study. Acad Radiol 2016; 23:1470-1479. PMID: 27665673. PMCID: PMC5567798. DOI: 10.1016/j.acra.2016.08.014.
Abstract
RATIONALE AND OBJECTIVES: Different computed tomography imaging protocols and patient characteristics can impact the accuracy and precision of the calcium score and may lead to inconsistent patient treatment recommendations. The aim of this work was to determine the impact of reconstruction algorithm and gender characteristics on coronary artery calcium scoring based on a phantom study using computed tomography. MATERIALS AND METHODS: Four synthetic heart vessels with diameters corresponding to female and male left main and left circumflex arteries, containing calcification-mimicking materials (200-1000 HU), were inserted into a thorax phantom and scanned with and without female breast plates (female and male phantoms, respectively). Ten scans were acquired and reconstructed at 3-mm slices using filtered back-projection (FBP) and iterative reconstruction with medium and strong denoising (IR3 and IR5). Agatston and calcium volume scores were estimated for each vessel. Calcium scores for each vessel and the total calcium score (summation over all four vessels) were compared between the two phantoms to quantify the impact of the breast plates and reconstruction parameters. Calcium scores were also compared among vessels of different diameters to investigate the impact of vessel size. RESULTS: The calcium scores were significantly larger for FBP reconstruction (FBP > IR3 > IR5). Agatston scores (calcium volume scores) for vessels in the male phantom scans were on average 4.8% (2.9%), 8.2% (7.1%), and 10.5% (9.4%) higher than those in the female phantom with FBP, IR3, and IR5, respectively, when exposure was conserved across phantoms. The total calcium scores from the male phantom were significantly larger than those from the female phantom (P < 0.05). In general, calcium volume scores were underestimated (by up to about 50%) for smaller vessels, especially when scanned in the female phantom.
CONCLUSIONS: Calcium scores significantly decreased with iterative reconstruction and tended to be underestimated for female anatomy (smaller vessels and presence of breast plates).
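For reference, the Agatston score discussed above weights each calcified lesion's area by a factor set by its peak attenuation (standard published rules: 130 HU detection threshold; weights 1-4 for peaks of 130-199, 200-299, 300-399, and ≥400 HU). A per-slice sketch with hypothetical lesion data:

```python
def agatston_weight(peak_hu):
    """Density weighting factor from the plaque's maximum attenuation (HU)."""
    if peak_hu < 130:
        return 0            # below the calcification threshold
    if peak_hu < 200:
        return 1
    if peak_hu < 300:
        return 2
    if peak_hu < 400:
        return 3
    return 4

def agatston_score(lesions, pixel_area_mm2):
    """Per-slice score: sum over lesions of (area in mm^2) * density weight.
    Each lesion is (n_pixels_over_130HU, peak_HU)."""
    return sum(n * pixel_area_mm2 * agatston_weight(peak)
               for n, peak in lesions)

# Hypothetical plaques on one slice: (pixel count, peak HU).
lesions = [(10, 250), (4, 450)]
score = agatston_score(lesions, pixel_area_mm2=0.5)
# 10*0.5*2 + 4*0.5*4 = 18
assert score == 18.0
```

Because the weight jumps at hard HU cutoffs, reconstruction choices that shift peak attenuation (such as the iterative denoising compared in this study) can move a plaque across a weight boundary and change the score.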
Affiliation(s)
- Qin Li
- U.S. Food and Drug Administration, CDRH/OSEL/DIDSR, 10903 New Hampshire Ave., Bldg. 62 Rm. 4110, Silver Spring, MD 20993
- Songtao Liu
- U.S. Food and Drug Administration, CDRH/OIR/DRH, Silver Spring, Maryland
- Kyle J Myers
- U.S. Food and Drug Administration, CDRH/OSEL/DIDSR, 10903 New Hampshire Ave., Bldg. 62 Rm. 4110, Silver Spring, MD 20993
- Marios A Gavrielides
- U.S. Food and Drug Administration, CDRH/OSEL/DIDSR, 10903 New Hampshire Ave., Bldg. 62 Rm. 4110, Silver Spring, MD 20993
- Rongping Zeng
- U.S. Food and Drug Administration, CDRH/OSEL/DIDSR, 10903 New Hampshire Ave., Bldg. 62 Rm. 4110, Silver Spring, MD 20993
- Berkman Sahiner
- U.S. Food and Drug Administration, CDRH/OSEL/DIDSR, 10903 New Hampshire Ave., Bldg. 62 Rm. 4110, Silver Spring, MD 20993
- Nicholas Petrick
- U.S. Food and Drug Administration, CDRH/OSEL/DIDSR, 10903 New Hampshire Ave., Bldg. 62 Rm. 4110, Silver Spring, MD 20993
39
Li Q, Gavrielides MA, Sahiner B, Myers KJ, Zeng R, Petrick N. Statistical analysis of lung nodule volume measurements with CT in a large-scale phantom study. Med Phys 2016; 42:3932-47. PMID: 26133594. DOI: 10.1118/1.4921734.
Abstract
PURPOSE: To determine the inter-related factors that contribute substantially to the measurement error of pulmonary nodule measurements with CT by assessing a large-scale dataset of phantom scans, and to quantitatively validate the repeatability and reproducibility of a subset containing nodules and CT acquisitions consistent with the Quantitative Imaging Biomarker Alliance (QIBA) metrology recommendations. METHODS: The dataset has about 40 000 volume measurements of 48 nodules (5-20 mm; four shapes; three radiodensities) estimated by a matched-filter estimator from CT images involving 72 imaging protocols. Technical assessment was performed under a framework suggested by QIBA, which aimed to minimize the inconsistency of terminologies and techniques used in the literature. Accuracy and precision of lung nodule volume measurements were examined by analyzing the linearity, bias, variance, root mean square error (RMSE), repeatability, reproducibility, and the significant and substantial factors that contribute to the measurement error. Statistical methodologies including linear regression, analysis of variance, and restricted maximum likelihood were applied to estimate the aforementioned metrics. The analysis was performed on both the whole dataset and a subset meeting the criteria proposed in the QIBA Profile document. RESULTS: Strong linearity was observed for all data. Size, slice thickness × collimation, and randomness in attachment to vessels or the chest wall were the main sources of measurement error. Grouping the data by nodule size and slice thickness × collimation, the standard deviation (3.9%-28%) and RMSE (4.4%-68%) tended to increase with smaller nodule size and larger slice thickness. For 5, 8, 10, and 20 mm nodules with reconstruction slice thicknesses ≤0.8, 3, 3, and 5 mm, respectively, the measurements were almost unbiased (-3.0% to 3.0%). Repeatability coefficients (RCs) ranged from 6.2% to 40%.
A pitch of 0.9, a detail kernel, and smaller slice thicknesses yielded better (smaller) RCs than a pitch of 1.2, a medium kernel, and larger slice thicknesses. Exposure showed no impact on the RC. The overall reproducibility coefficient (RDC) was 45%, and it reduced to about 20%-30% when the slice thickness and collimation were fixed. For nodules and CT imaging complying with the QIBA Profile (the QIBA Profile subset), the measurements were highly repeatable and reproducible in spite of variations in nodule characteristics and imaging protocols. The overall measurement error was small and mostly due to the randomness in attachment. The bias, standard deviation, and RMSE grouped by nodule size and slice thickness × collimation in the QIBA Profile subset were within ±3%, 4%, and 5%, respectively. RCs were within 11%, and the overall RDC was equal to 11%. CONCLUSIONS: The authors have performed a comprehensive technical assessment of lung nodule volumetry with a matched-filter estimator from CT scans of synthetic nodules and identified the main sources of measurement error among various nodule characteristics and imaging parameters. The results confirm that the QIBA Profile subset is highly repeatable and reproducible. These phantom study results can serve as a bound on the clinical performance achievable with volumetric CT measurements of pulmonary nodules.
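The repeatability coefficient (RC) reported in such QIBA-style assessments is conventionally RC = 1.96 · √2 · wSD ≈ 2.77 × the within-subject standard deviation: the smallest difference between two repeat measurements unlikely to be measurement noise at the 95% level. A sketch on invented repeat measurements (not this study's data):

```python
import numpy as np

def repeatability_coefficient(measurements):
    """RC = 1.96 * sqrt(2) * within-subject SD, where the within-subject
    variance is the mean of the per-subject sample variances.
    `measurements` is shaped (n_subjects, n_replicates)."""
    wsv = np.mean(np.var(measurements, axis=1, ddof=1))
    return 1.96 * np.sqrt(2.0) * np.sqrt(wsv)

# Hypothetical repeat volume measurements for 5 nodules, 3 replicates each.
m = np.array([
    [100.0, 102.0, 101.0],
    [ 98.0,  99.0,  97.0],
    [105.0, 104.0, 106.0],
    [ 95.0,  96.0,  94.0],
    [101.0, 100.0, 102.0],
])
rc = repeatability_coefficient(m)
# Every row has a sample variance of 1, so wSD = 1 and RC = 2.77...
assert 2.7 < rc < 2.8
```

A reproducibility coefficient (RDC) is computed analogously but with the variance taken across different measurement conditions (scanners, protocols) rather than identical repeats.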
Affiliation(s)
- Qin Li
- Division of Imaging, Diagnostics and Software Reliability, Office of Science and Engineering Laboratories, Center for Devices and Radiological Health, US Food and Drug Administration, Silver Spring, Maryland 20993
- Marios A Gavrielides
- Division of Imaging, Diagnostics and Software Reliability, Office of Science and Engineering Laboratories, Center for Devices and Radiological Health, US Food and Drug Administration, Silver Spring, Maryland 20993
- Berkman Sahiner
- Division of Imaging, Diagnostics and Software Reliability, Office of Science and Engineering Laboratories, Center for Devices and Radiological Health, US Food and Drug Administration, Silver Spring, Maryland 20993
- Kyle J Myers
- Division of Imaging, Diagnostics and Software Reliability, Office of Science and Engineering Laboratories, Center for Devices and Radiological Health, US Food and Drug Administration, Silver Spring, Maryland 20993
- Rongping Zeng
- Division of Imaging, Diagnostics and Software Reliability, Office of Science and Engineering Laboratories, Center for Devices and Radiological Health, US Food and Drug Administration, Silver Spring, Maryland 20993
- Nicholas Petrick
- Division of Imaging, Diagnostics and Software Reliability, Office of Science and Engineering Laboratories, Center for Devices and Radiological Health, US Food and Drug Administration, Silver Spring, Maryland 20993
40
Gavrielides MA, Li Q, Zeng R, Myers KJ, Sahiner B, Petrick N. Volume estimation of multidensity nodules with thoracic computed tomography. J Med Imaging (Bellingham) 2016; 3:013504. PMID: 26844235. DOI: 10.1117/1.jmi.3.1.013504.
Abstract
This work focuses on volume estimation of "multidensity" lung nodules in a phantom computed tomography study. Eight objects were manufactured by enclosing spherical cores within larger spheres of double the diameter but with a different density. Different combinations of outer-shell/inner-core diameters and densities were created. The nodules were placed within an anthropomorphic phantom and scanned with various acquisition and reconstruction parameters. The volumes of the entire multidensity object as well as the inner core of the object were estimated using a model-based volume estimator. Results showed percent volume bias across all nodules and imaging protocols with slice thicknesses [Formula: see text] ranging from [Formula: see text] to 6.6% for the entire object (standard deviation ranged from 1.5% to 7.6%), and within [Formula: see text] to 5.7% for the inner-core measurement (standard deviation ranged from 2.0% to 17.7%). Overall, the estimation error was larger for the inner-core measurements, which was expected due to the smaller size of the core. Reconstructed slice thickness was found to substantially affect volumetric error for both tasks; exposure and reconstruction kernel were not. These findings provide information for understanding uncertainty in volumetry of nodules that include multiple densities such as ground glass opacities with a solid component.
Affiliation(s)
- Marios A Gavrielides
- U.S. Food and Drug Administration, Division of Imaging, Diagnostics, and Software Reliability (DIDSR), Office of Science and Engineering Laboratories, Center for Devices and Radiological Health, 10903 New Hampshire Avenue, Building 62, Room 4126, Silver Spring, Maryland 20993, United States
- Qin Li
- U.S. Food and Drug Administration, Division of Imaging, Diagnostics, and Software Reliability (DIDSR), Office of Science and Engineering Laboratories, Center for Devices and Radiological Health, 10903 New Hampshire Avenue, Building 62, Room 4126, Silver Spring, Maryland 20993, United States
- Rongping Zeng
- U.S. Food and Drug Administration, Division of Imaging, Diagnostics, and Software Reliability (DIDSR), Office of Science and Engineering Laboratories, Center for Devices and Radiological Health, 10903 New Hampshire Avenue, Building 62, Room 4126, Silver Spring, Maryland 20993, United States
- Kyle J Myers
- U.S. Food and Drug Administration, Division of Imaging, Diagnostics, and Software Reliability (DIDSR), Office of Science and Engineering Laboratories, Center for Devices and Radiological Health, 10903 New Hampshire Avenue, Building 62, Room 4126, Silver Spring, Maryland 20993, United States
- Berkman Sahiner
- U.S. Food and Drug Administration, Division of Imaging, Diagnostics, and Software Reliability (DIDSR), Office of Science and Engineering Laboratories, Center for Devices and Radiological Health, 10903 New Hampshire Avenue, Building 62, Room 4126, Silver Spring, Maryland 20993, United States
- Nicholas Petrick
- U.S. Food and Drug Administration, Division of Imaging, Diagnostics, and Software Reliability (DIDSR), Office of Science and Engineering Laboratories, Center for Devices and Radiological Health, 10903 New Hampshire Avenue, Building 62, Room 4126, Silver Spring, Maryland 20993, United States
41
Zeng R, Gavrielides MA, Petrick N, Sahiner B, Li Q, Myers KJ. Estimating local noise power spectrum from a few FBP-reconstructed CT scans. Med Phys 2016; 43:568. [DOI: 10.1118/1.4939061] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/07/2023] Open
42
Fauzi MFA, Pennell M, Sahiner B, Chen W, Shana'ah A, Hemminger J, Gru A, Kurt H, Losos M, Joehlin-Price A, Kavran C, Smith SM, Nowacki N, Mansor S, Lozanski G, Gurcan MN. Classification of follicular lymphoma: the effect of computer aid on pathologists grading. BMC Med Inform Decis Mak 2015; 15:115. [PMID: 26715518 PMCID: PMC4696238 DOI: 10.1186/s12911-015-0235-6] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/31/2015] [Accepted: 12/15/2015] [Indexed: 11/28/2022] Open
Abstract
Background: Follicular lymphoma (FL) is one of the most common lymphoid malignancies in the western world. FL cases are stratified into three histological grades based on the average centroblast count per high power field (HPF). The centroblast count is performed manually by the pathologist using an optical microscope and a hematoxylin and eosin (H&E) stained tissue section. Although this is the current clinical practice, it suffers from high inter- and intra-observer variability and is vulnerable to sampling bias.
Methods: In this paper, we present a system, called the Follicular Lymphoma Grading System (FLAGS), to assist the pathologist in grading FL cases. We also assess the effect of FLAGS on the accuracy of expert and inexperienced readers. FLAGS automatically identifies possible HPFs for examination by analyzing H&E and CD20 stains, before classifying them into low- or high-risk categories. The pathologist is first asked to review the slides according to the current routine clinical practice, before being presented with the FLAGS classification via a color-coded map. The accuracy of the readers with and without FLAGS assistance is measured.
Results: FLAGS was used by four experts (board-certified hematopathologists) and seven pathology residents on 20 FL slides. Access to FLAGS improved overall reader accuracy, with the biggest improvement seen among residents. An average AUC value of 0.75 was observed, which generally indicates "acceptable" diagnostic performance.
Conclusions: The results of this study show that FLAGS can be useful in increasing pathologists' accuracy in grading the tissue. To the best of our knowledge, this study measures, for the first time, the effect of computerized image analysis on pathologists' grading of follicular lymphoma. When fully developed, such systems have the potential to reduce sampling bias by examining an increased proportion of HPFs within follicle regions, as well as to reduce inter- and intra-reader variability.
Electronic supplementary material The online version of this article (doi:10.1186/s12911-015-0235-6) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Michael Pennell: Division of Biostatistics, College of Public Health, The Ohio State University, Columbus, OH, USA
- Berkman Sahiner, Weijie Chen: Center for Devices and Radiological Health, US Food and Drug Administration, Silver Spring, MD, USA
- Arwa Shana'ah, Jessica Hemminger, Alejandro Gru, Habibe Kurt, Michael Losos, Amy Joehlin-Price, Christina Kavran, Stephen M Smith, Nicholas Nowacki, Sharmeen Mansor, Gerard Lozanski: Department of Pathology, The Ohio State University, Columbus, OH, USA
- Metin N Gurcan: Department of Biomedical Informatics, The Ohio State University, 250 Lincoln Tower, 1800 Cannon Drive, Columbus, OH, 43210, USA
43
Abstract
The availability of large medical image datasets is critical in many applications, such as training and testing of computer-aided diagnosis systems, evaluation of segmentation algorithms, and conducting perceptual studies. However, collection of data and establishment of ground truth for medical images are both costly and difficult. To address this problem, we are developing an image blending tool that allows users to modify or supplement existing datasets by seamlessly inserting a lesion extracted from a source image into a target image. In this study, we focus on the application of this tool to pulmonary nodules in chest CT exams. We minimize the impact of user skill on the perceived quality of the composite image by limiting user involvement to two simple steps: the user first draws a casual boundary around a nodule in the source, and then selects the center of the desired insertion area in the target. We demonstrate the performance of our system on clinical samples, and report the results of a reader study evaluating the realism of inserted nodules compared to clinical nodules. We further evaluate our image blending techniques using phantoms simulated under different noise levels and reconstruction filters. Specifically, we compute the area under the ROC curve of the Hotelling observer (HO) and the noise power spectrum of regions of interest enclosing native and inserted nodules, and compare the detectability, noise texture, and noise magnitude of inserted and native nodules. Our results indicate the viability of our approach for insertion of pulmonary nodules in clinical CT images.
Affiliation(s)
- Aria Pezeshk: Division of Imaging, Diagnostics, and Software Reliability, Office of Science and Engineering Laboratories, Center for Devices and Radiological Health, U.S. Food and Drug Administration, Silver Spring, MD, USA
44
Li Q, Gavrielides MA, Zeng R, Myers KJ, Sahiner B, Petrick N. Volume estimation of low-contrast lesions with CT: a comparison of performances from a phantom study, simulations and theoretical analysis. Phys Med Biol 2015; 60:671-88. [PMID: 25555240 DOI: 10.1088/0031-9155/60/2/671] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022]
Abstract
Measurements of lung nodule volume with multi-detector computed tomography (MDCT) have been shown to be more accurate and precise compared to conventional lower-dimensional measurements. Quantifying the size of lesions is potentially more difficult when the object-to-background contrast is low, as with lesions in the liver. Physical phantom and simulation studies are often utilized to analyze the bias and variance of lesion size estimates because a ground truth or reference standard can be established. In addition, it may also be useful to derive theoretical bounds as another way of characterizing lesion sizing methods. The goal of this work was to study the performance of a MDCT system for a lesion volume estimation task with object-to-background contrast less than 50 HU, and to understand the relation among performances obtained from a phantom study, simulations, and theoretical analysis. We performed both phantom and simulation studies, and analyzed the bias and variance of volume measurements estimated by a matched-filter-based estimator. We further corroborated results with a theoretical analysis to estimate the achievable performance bound, namely the Cramér-Rao lower bound (CRLB) on the minimum variance of the size estimates. Results showed that estimates of non-attached solid small lesion volumes with object-to-background contrast of 31-46 HU can be accurate and precise, with less than 10.8% in percent bias and 4.8% in standard deviation of percent error (SPE), in standard dose scans. These results are consistent with the theoretical (CRLB), computational (simulation), and empirical phantom bounds. The difference between the bounds is rather small (less than 1.9% for SPE), indicating that the theoretical- and simulation-based performance bounds can be good surrogates for physical phantom studies.
Affiliation(s)
- Qin Li: Division of Imaging, Diagnostics, and Software Reliability, Office of Science and Engineering Laboratories, Center for Devices and Radiological Health, US Food and Drug Administration, Silver Spring, MD 20993, USA
45
Abstract
Receiver operating characteristic (ROC) analysis is a standard methodology to evaluate the performance of a binary classification system. The area under the ROC curve (AUC) is a performance metric that summarizes how well a classifier separates two classes. Traditional AUC optimization techniques are supervised learning methods that utilize only labeled data (i.e., the true class is known for all data) to train the classifiers. In this work, inspired by semi-supervised and transductive learning, we propose two new AUC optimization algorithms, hereafter referred to as semi-supervised learning receiver operating characteristic (SSLROC) algorithms, which utilize unlabeled test samples in classifier training to maximize AUC. Unlabeled samples are incorporated into the AUC optimization process, and their ranking relationships to labeled positive and negative training samples are considered as optimization constraints. The introduced test samples will cause the learned decision boundary in a multidimensional feature space to adapt not only to the distribution of labeled training data, but also to the distribution of unlabeled test data. We formulate the semi-supervised AUC optimization problem as a semi-definite programming problem based on the margin maximization theory. The proposed methods SSLROC1 (1-norm) and SSLROC2 (2-norm) were evaluated using 34 randomly selected datasets (the number determined by power analysis) from the University of California, Irvine machine learning repository. Wilcoxon signed rank tests showed that the proposed methods achieved significant improvement compared with state-of-the-art methods. The proposed methods were also applied to a CT colonography dataset for colonic polyp classification and showed promising results.
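As background for the pairwise-ranking view used here, the empirical AUC equals the Wilcoxon-Mann-Whitney statistic: the fraction of (positive, negative) score pairs ranked correctly, with ties counted as half. The sketch below is illustrative only (not the SSLROC implementation, which optimizes a margin-based surrogate of this same objective via semi-definite programming).

```python
def empirical_auc(pos_scores, neg_scores):
    """Empirical AUC as the fraction of correctly ordered
    (positive, negative) score pairs; ties count as half."""
    pairs = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                pairs += 1.0
            elif p == n:
                pairs += 0.5
    return pairs / (len(pos_scores) * len(neg_scores))

# Toy scores: one positive (0.4) is outranked by one negative (0.7).
auc = empirical_auc([0.9, 0.8, 0.4], [0.7, 0.3, 0.1])
```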
Affiliation(s)
- Shijun Wang, Diana Li: Imaging Biomarkers and Computer-Aided Diagnosis Lab, Radiology and Imaging Sciences, National Institutes of Health Clinical Center, Bethesda, MD 20892-1182, United States
- Nicholas Petrick, Berkman Sahiner: Center for Devices and Radiological Health, U.S. Food and Drug Administration, Silver Spring, MD 20993, United States
- Marius George Linguraru: Sheikh Zayed Institute for Pediatric Surgical Innovation, Children’s National Health System, Washington, DC 20010, United States; School of Medicine and Health Sciences, George Washington University, Washington, DC 20037, United States
- Ronald M. Summers: Imaging Biomarkers and Computer-Aided Diagnosis Lab, Radiology and Imaging Sciences, National Institutes of Health Clinical Center, Bethesda, MD 20892-1182, United States
46
He X, Samuelson F, Zeng R, Sahiner B. Discovering intrinsic properties of human observers' visual search and mathematical observers' scanning. J Opt Soc Am A Opt Image Sci Vis 2014; 31:2495-2510. [PMID: 25401363 DOI: 10.1364/josaa.31.002495] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/04/2023]
Abstract
There is a lack of consensus in measuring observer performance in search tasks. To pursue a consensus, we set our goal to obtain metrics that are practical, meaningful, and predictive. We consider a metric practical if it can be implemented to measure human and computer observers' performance. To be meaningful, we propose to discover intrinsic properties of search observers and formulate the metrics to characterize these properties. If the discovered properties allow verifiable predictions, we consider them predictive. We propose a theory and a conjecture toward two intrinsic properties of search observers: rationality in classification as measured by the location-known-exactly (LKE) receiver operating characteristic (ROC) curve and location uncertainty as measured by the effective set size (M*). These two properties are used to develop search models in both single-response and free-response search tasks. To confirm whether these properties are "intrinsic," we investigate their ability in predicting search performance of both human and scanning channelized Hotelling observers. In particular, for each observer, we designed experiments to measure the LKE-ROC curve and M*, which were then used to predict the same observer's performance in other search tasks. The predictions were then compared to the experimentally measured observer performance. Our results indicate that modeling the search performance using the LKE-ROC curve and M* leads to successful predictions in most cases.
47
Zeng R, Gavrielides M, Li Q, Petrick N, Sahiner B, Myers K. WE-D-18A-06: Estimating Local Noise Power Spectrum From a Few CT Scans. Med Phys 2014. [DOI: 10.1118/1.4889415] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/07/2022] Open
48
Abbey CK, Gallas BD, Boone JM, Niklason LT, Hadjiiski LM, Sahiner B, Samuelson FW. Comparative statistical properties of expected utility and area under the ROC curve for laboratory studies of observer performance in screening mammography. Acad Radiol 2014; 21:481-90. [PMID: 24594418 DOI: 10.1016/j.acra.2013.12.011] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/07/2013] [Revised: 12/11/2013] [Accepted: 12/11/2013] [Indexed: 11/25/2022]
Abstract
RATIONALE AND OBJECTIVES: Our objective is to determine whether expected utility (EU) and the area under the receiver operating characteristic curve (AUC) are consistent with one another as endpoints of observer performance studies in mammography. These two measures characterize receiver operating characteristic performance somewhat differently. We compare these two study endpoints at the level of individual reader effects, statistical inference, and components of variance across readers and cases.
MATERIALS AND METHODS: We reanalyze three previously published laboratory observer performance studies that investigate various x-ray breast imaging modalities using EU and AUC. The EU measure is based on recent estimates of relative utility for screening mammography.
RESULTS: The AUC and EU measures are correlated across readers for individual modalities (r = 0.93) and differences in modalities (r = 0.94 to 0.98). Statistical inference for modality effects based on multi-reader multi-case analysis is very similar, with significant results (P < .05) in exactly the same conditions. Power analyses show mixed results across studies, with a small increase in power on average for EU that corresponds to approximately a 7% reduction in the number of readers. Despite a large number of crossing receiver operating characteristic curves (59% of readers), modality effects only rarely have opposite signs for EU and AUC (6%).
CONCLUSIONS: We do not find any evidence of systematic differences between EU and AUC in screening mammography observer studies. Thus, when utility approaches are viable (i.e., an appropriate value of relative utility exists), practical effects such as statistical efficiency may be used to choose study endpoints.
49
Petrick N, Sahiner B, Armato SG, Bert A, Correale L, Delsanto S, Freedman MT, Fryd D, Gur D, Hadjiiski L, Huo Z, Jiang Y, Morra L, Paquerault S, Raykar V, Samuelson F, Summers RM, Tourassi G, Yoshida H, Zheng B, Zhou C, Chan HP. Evaluation of computer-aided detection and diagnosis systems. Med Phys 2014; 40:087001. [PMID: 23927365 DOI: 10.1118/1.4816310] [Citation(s) in RCA: 60] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/09/2023] Open
Abstract
Computer-aided detection and diagnosis (CAD) systems are increasingly being used as an aid by clinicians for detection and interpretation of diseases. Computer-aided detection systems mark regions of an image that may reveal specific abnormalities and are used to alert clinicians to these regions during image interpretation. Computer-aided diagnosis systems provide an assessment of a disease using image-based information alone or in combination with other relevant diagnostic data and are used by clinicians as a decision support in developing their diagnoses. While CAD systems are commercially available, standardized approaches for evaluating and reporting their performance have not yet been fully formalized in the literature or in a standardization effort. This deficiency has led to difficulty in the comparison of CAD devices and in understanding how the reported performance might translate into clinical practice. To address these important issues, the American Association of Physicists in Medicine (AAPM) formed the Computer Aided Detection in Diagnostic Imaging Subcommittee (CADSC), in part, to develop recommendations on approaches for assessing CAD system performance. The purpose of this paper is to convey the opinions of the AAPM CADSC members and to stimulate the development of consensus approaches and "best practices" for evaluating CAD systems. Both the assessment of a standalone CAD system and the evaluation of the impact of CAD on end-users are discussed. It is hoped that awareness of these important evaluation elements and the CADSC recommendations will lead to further development of structured guidelines for CAD performance assessment. 
Proper assessment of CAD system performance is expected to increase the understanding of a CAD system's effectiveness and limitations, which is expected to stimulate further research and development efforts on CAD technologies, reduce problems due to improper use, and eventually improve the utility and efficacy of CAD in clinical practice.
Affiliation(s)
- Nicholas Petrick: Center for Devices and Radiological Health, U.S. Food and Drug Administration, 10903 New Hampshire Avenue, Silver Spring, Maryland 20993, USA
50
Abstract
The goal of this work is to design computerized image analysis techniques for automatically characterizing lung nodule subtlety in CT images. Automated subtlety estimation methods may help in computer-aided detection (CAD) assessment by quantifying dataset difficulty and facilitating comparisons among different CAD algorithms. A dataset containing 813 nodules from 499 patients was obtained from the Lung Image Database Consortium. Each nodule was evaluated by four radiologists regarding nodule subtlety using a 5-point rating scale (1: most subtle). We developed a 3D technique for segmenting lung nodules using a prespecified initial ROI. Texture and morphological features were automatically extracted from the segmented nodules and their margins. The dataset was partitioned into training and test sets using a 1:1 ratio. An artificial neural network (ANN) was trained with average reader subtlety scores as the reference. Effective features for characterizing nodule subtlety were selected based on the training set using the ANN and a stepwise feature selection method. The performance of the classifier was evaluated using prediction probability (PK) as an agreement measure, which is considered a generalization of the area under the receiver operating characteristic curve when the reference standard is multi-level. Using an ANN classifier trained with a set of 2 features (selected from a total of 30), namely compactness and average gray value, the test concordance between computer scores and the average reader scores was 0.789 ± 0.014. Our results show that the proposed method had strong agreement with the average of subtlety scores provided by radiologists.
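The abstract does not define the prediction probability statistic. The sketch below uses an assumed, standard pairwise-concordance formulation (not necessarily the exact statistic used in the study): among case pairs with different reference levels, count the fraction whose scores are ordered the same way as the reference, with score ties counted as half. With a binary reference this reduces to the empirical AUC.

```python
def prediction_probability(scores, reference):
    """Concordance between continuous scores and an ordinal reference.
    Only pairs with different reference levels are informative;
    ties in score count as half a concordant pair."""
    concordant, pairs = 0.0, 0
    n = len(scores)
    for i in range(n):
        for j in range(i + 1, n):
            if reference[i] == reference[j]:
                continue  # same reference level: pair carries no ranking info
            pairs += 1
            # positive product => scores ordered like the reference levels
            s = (scores[i] - scores[j]) * (reference[i] - reference[j])
            if s > 0:
                concordant += 1.0
            elif s == 0:
                concordant += 0.5
    return concordant / pairs

# Toy data: 4 nodules, subtlety reference levels 1-3 (hypothetical values).
pk = prediction_probability([0.2, 0.7, 0.6, 0.5], [1, 2, 3, 2])
```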
Affiliation(s)
- Xin He: US Food and Drug Administration, Center for Devices and Radiological Health, Office of Science and Engineering Laboratories, Division of Imaging and Applied Mathematics, 10903 New Hampshire Avenue, Silver Spring, MD 20993, USA