51. Kim YS, Lee SE, Chang JM, Kim SY, Bae YK. Ultrasonographic morphological characteristics determined using a deep learning-based computer-aided diagnostic system of breast cancer. Medicine (Baltimore) 2022; 101:e28621. [PMID: 35060538] [PMCID: PMC8772632] [DOI: 10.1097/md.0000000000028621]
Abstract
To investigate the correlations between ultrasonographic morphological characteristics quantitatively assessed using a deep learning-based computer-aided diagnostic system (DL-CAD) and the histopathologic features of breast cancer. This retrospective study included 282 women with invasive breast cancer (<5 cm; mean age, 54.4 [range, 29-85] years) who underwent surgery between February 2016 and April 2017. The morphological characteristics of breast cancer on B-mode ultrasonography were analyzed using DL-CAD, and quantitative scores (0-1) were obtained. Associations between the quantitative scores and tumor histologic type, grade, size, subtype, and lymph node status were assessed. Two hundred thirty-six (83.7%) tumors were invasive ductal carcinoma, 18 (6.4%) were invasive lobular carcinoma, and 28 (9.9%) were of other types (micropapillary, apocrine, and mucinous). The mean tumor size was 1.8 ± 1.0 (standard deviation) cm, and 108 (38.3%) cases were node positive. The irregular-shape score was associated with tumor size (P < .001), lymph node status (P = .001), and estrogen receptor status (P = .016). The not-circumscribed-margin (P < .001) and hypoechogenicity (P = .003) scores correlated with tumor size, and the non-parallel-orientation score correlated with histologic grade (P = .024). Luminal A tumors exhibited more irregular (P = .048) and non-parallel (P = .002) features, whereas triple-negative breast cancers were rounder/more oval with a parallel orientation. Quantitative morphological characteristics of breast cancers determined using DL-CAD correlated with histopathologic features and could provide useful information about breast cancer phenotypes.
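As a side note on the statistics behind this abstract: the association between a continuous DL-CAD score (e.g., irregular shape, 0-1) and a binary histopathologic feature (e.g., node status) can be summarized by a point-biserial correlation. A minimal sketch in Python; the function name and the scores/labels below are illustrative assumptions, not the study's data or code:

```python
import math

def point_biserial(scores, labels):
    """Point-biserial correlation between a continuous score (e.g., an
    irregular-shape score in [0, 1]) and a binary label (e.g., 1 = node
    positive, 0 = node negative)."""
    n = len(scores)
    g1 = [s for s, lab in zip(scores, labels) if lab == 1]
    g0 = [s for s, lab in zip(scores, labels) if lab == 0]
    mean = sum(scores) / n
    sd = math.sqrt(sum((s - mean) ** 2 for s in scores) / n)  # population SD
    m1, m0 = sum(g1) / len(g1), sum(g0) / len(g0)
    return (m1 - m0) / sd * math.sqrt(len(g1) * len(g0) / n ** 2)

# Illustrative scores and labels only (not the study's data):
r = point_biserial([0.9, 0.8, 0.7, 0.3, 0.2, 0.1], [1, 1, 1, 0, 0, 0])
```

A value of `r` near 1 indicates that node-positive tumors systematically receive higher scores, which is the kind of association the study tests.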
Affiliation(s)
- Young Seon Kim: Department of Radiology, Yeungnam University Hospital, Yeungnam University College of Medicine, Daegu, South Korea
- Seung Eun Lee: Department of Radiology, Yeungnam University Hospital, Yeungnam University College of Medicine, Daegu, South Korea
- Jung Min Chang: Department of Radiology, Seoul National University Hospital, Seoul, South Korea
- Soo-Yeon Kim: Department of Radiology, Seoul National University Hospital, Seoul, South Korea
- Young Kyung Bae: Department of Pathology, Yeungnam University Hospital, Yeungnam University College of Medicine, Daegu, South Korea
52. O'Connell AM, Bartolotta TV, Orlando A, Jung SH, Baek J, Parker KJ. Diagnostic Performance of an Artificial Intelligence System in Breast Ultrasound. J Ultrasound Med 2022; 41:97-105. [PMID: 33665833] [DOI: 10.1002/jum.15684]
Abstract
OBJECTIVES: To study the performance of an artificial intelligence (AI) program designed to assist radiologists in the diagnosis of breast cancer, relative to conventional readings by radiologists. METHODS: A total of 10 radiologists read a curated, anonymized set of 299 breast ultrasound images, each containing at least one suspicious lesion and with an independently determined final diagnosis. Separately, the AI program was initialized by a lead radiologist, and its computed results were compared against those of the radiologists. RESULTS: The AI program's diagnoses of breast lesions were concordant with the 10 radiologists' readings across a number of BI-RADS descriptors. The sensitivity, specificity, and accuracy of the AI program's benign-versus-malignant diagnoses were above 0.8, in agreement with the highest-performing radiologists and commensurate with recent studies. CONCLUSION: The trained AI program can contribute to the accuracy of breast cancer diagnosis with ultrasound.
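For readers reproducing the kind of headline metrics reported here: sensitivity, specificity, and accuracy all follow directly from a confusion matrix of binary (benign/malignant) calls. A minimal sketch; the labels below are invented for illustration, not the study's data:

```python
def binary_metrics(y_true, y_pred):
    """Sensitivity, specificity, and accuracy from paired binary labels
    (1 = malignant, 0 = benign)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "sensitivity": tp / (tp + fn),   # true-positive rate
        "specificity": tn / (tn + fp),   # true-negative rate
        "accuracy": (tp + tn) / len(y_true),
    }

# Illustrative ground truth and AI calls only (not the study's data):
truth = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
preds = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
m = binary_metrics(truth, preds)  # all three metrics are 0.8 here
```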
Affiliation(s)
- Avice M O'Connell: Department of Imaging Sciences, University of Rochester Medical Center, Rochester, New York, USA
- Tommaso V Bartolotta: Department of Radiology, University Hospital, Palermo, Italy; Fondazione Istituto G. Giglio Hospital, Cefalù, Italy
- Alessia Orlando: Department of Radiology, University Hospital, Palermo, Italy
- Sin-Ho Jung: Department of Biostatistics and Bioinformatics, Duke University School of Medicine, Durham, North Carolina, USA
- Jihye Baek: Department of Electrical and Computer Engineering, University of Rochester, Rochester, New York, USA
- Kevin J Parker: Department of Electrical and Computer Engineering, University of Rochester, Rochester, New York, USA
53. Ying Z, Xiaohong J, Yijie D, Juan L, Yilai C, Congcong Y, Weiwei Z, Jianqiao Z. Using S-Detect to Improve Breast Ultrasound: The Different Combined Strategies Based on Radiologist Experience. Advanced Ultrasound in Diagnosis and Therapy 2022. [DOI: 10.37015/audt.2022.220007]
54. Pfob A, Sidey-Gibbons C, Barr RG, Duda V, Alwafai Z, Balleyguier C, Clevert DA, Fastner S, Gomez C, Goncalo M, Gruber I, Hahn M, Hennigs A, Kapetas P, Lu SC, Nees J, Ohlinger R, Riedel F, Rutten M, Schaefgen B, Schuessler M, Stieber A, Togawa R, Tozaki M, Wojcinski S, Xu C, Rauch G, Heil J, Golatta M. The importance of multi-modal imaging and clinical information for humans and AI-based algorithms to classify breast masses (INSPiRED 003): an international, multicenter analysis. Eur Radiol 2022; 32:4101-4115. [PMID: 35175381] [PMCID: PMC9123064] [DOI: 10.1007/s00330-021-08519-z]
Abstract
OBJECTIVES: AI-based algorithms for medical image analysis have shown performance comparable to human image readers. In practice, however, diagnoses are made using multiple imaging modalities alongside other data sources. We determined the importance of this multi-modal information and compared the diagnostic performance of routine breast cancer diagnosis to breast ultrasound interpretations by humans or AI-based algorithms. METHODS: Patients were recruited as part of a multicenter trial (NCT02638935) that enrolled 1288 women undergoing routine breast cancer diagnosis (multi-modal imaging plus demographic and clinical information). Three physicians specialized in ultrasound diagnosis performed a second read of all ultrasound images. We used data from 11 of 12 study sites to develop two machine learning (ML) algorithms that classify breast masses using unimodal information (ultrasound features generated by the ultrasound experts), and validated them on the remaining study site. The same ML algorithms were subsequently developed and validated on multi-modal information (clinical and demographic information plus ultrasound features). We assessed performance using the area under the curve (AUC). RESULTS: Of 1288 breast masses, 368 (28.6%) were histopathologically malignant. In the external validation set (n = 373), the performance of the two unimodal ultrasound ML algorithms (AUC 0.83 and 0.82) was commensurate with that of the human ultrasound experts (AUC 0.82 to 0.84; p > 0.05 for all comparisons). The multi-modal ultrasound ML algorithms performed significantly better (AUC 0.90 and 0.89) but remained statistically inferior to routine breast cancer diagnosis (AUC 0.95; p ≤ 0.05 for all comparisons). CONCLUSIONS: The performance of humans and AI-based algorithms improves with multi-modal information. KEY POINTS:
- The performance of humans and AI-based algorithms improves with multi-modal information.
- Multimodal AI-based algorithms do not necessarily outperform expert humans.
- Unimodal AI-based algorithms do not represent optimal performance to classify breast masses.
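The AUC comparisons above rest on the rank interpretation of AUC: the probability that a randomly chosen malignant case receives a higher score than a randomly chosen benign one (the Mann-Whitney statistic). A hedged sketch of that computation; the scores are invented to illustrate a multimodal model separating classes better than a unimodal one, and do not come from the trial:

```python
def auc(scores_pos, scores_neg):
    """AUC as the Mann-Whitney probability that a positive (malignant)
    case scores above a negative (benign) one; ties count 0.5."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# Illustrative scores only (not trial data):
uni_pos, uni_neg = [0.7, 0.6, 0.5], [0.55, 0.4, 0.3]      # unimodal model
multi_pos, multi_neg = [0.9, 0.8, 0.7], [0.4, 0.3, 0.2]   # multimodal model
auc_uni = auc(uni_pos, uni_neg)       # 8/9: one benign case outranks a cancer
auc_multi = auc(multi_pos, multi_neg) # 1.0: perfect separation in this toy set
```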
Affiliation(s)
- André Pfob: University Breast Unit, Department of Obstetrics and Gynecology, Heidelberg University Hospital, Heidelberg, Germany; MD Anderson Center for INSPiRED Cancer Care (Integrated Systems for Patient-Reported Data), The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Chris Sidey-Gibbons: MD Anderson Center for INSPiRED Cancer Care (Integrated Systems for Patient-Reported Data) and Department of Symptom Research, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Richard G. Barr: Department of Radiology, Northeast Ohio Medical University, Ravenna, OH, USA
- Volker Duda: Department of Gynecology and Obstetrics, University of Marburg, Marburg, Germany
- Zaher Alwafai: Department of Gynecology and Obstetrics, University of Greifswald, Greifswald, Germany
- Corinne Balleyguier: Department of Radiology, Institut Gustave Roussy, Villejuif Cedex, France
- Dirk-André Clevert: Department of Radiology, University Hospital Munich-Grosshadern, Munich, Germany
- Sarah Fastner: University Breast Unit, Department of Obstetrics and Gynecology, Heidelberg University Hospital, Heidelberg, Germany
- Christina Gomez: University Breast Unit, Department of Obstetrics and Gynecology, Heidelberg University Hospital, Heidelberg, Germany
- Manuela Goncalo: Department of Radiology, University of Coimbra, Coimbra, Portugal
- Ines Gruber: Department of Gynecology and Obstetrics, University of Tuebingen, Tuebingen, Germany
- Markus Hahn: Department of Gynecology and Obstetrics, University of Tuebingen, Tuebingen, Germany
- André Hennigs: University Breast Unit, Department of Obstetrics and Gynecology, Heidelberg University Hospital, Heidelberg, Germany
- Panagiotis Kapetas: Department of Biomedical Imaging and Image-Guided Therapy, Medical University of Vienna, Vienna, Austria
- Sheng-Chieh Lu: MD Anderson Center for INSPiRED Cancer Care (Integrated Systems for Patient-Reported Data) and Department of Symptom Research, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Juliane Nees: University Breast Unit, Department of Obstetrics and Gynecology, Heidelberg University Hospital, Heidelberg, Germany
- Ralf Ohlinger: Department of Gynecology and Obstetrics, University of Greifswald, Greifswald, Germany
- Fabian Riedel: University Breast Unit, Department of Obstetrics and Gynecology, Heidelberg University Hospital, Heidelberg, Germany
- Matthieu Rutten: Department of Radiology, Jeroen Bosch Hospital, 's-Hertogenbosch, The Netherlands; Radboud University Medical Center, Nijmegen, The Netherlands
- Benedikt Schaefgen: University Breast Unit, Department of Obstetrics and Gynecology, Heidelberg University Hospital, Heidelberg, Germany
- Maximilian Schuessler: National Center for Tumor Diseases, Heidelberg University Hospital, Heidelberg, Germany
- Anne Stieber: University Breast Unit, Department of Obstetrics and Gynecology, Heidelberg University Hospital, Heidelberg, Germany
- Riku Togawa: University Breast Unit, Department of Obstetrics and Gynecology, Heidelberg University Hospital, Heidelberg, Germany
- Sebastian Wojcinski: Department of Gynecology and Obstetrics, Breast Cancer Center, Klinikum Bielefeld Mitte GmbH, Bielefeld, Germany
- Cai Xu: MD Anderson Center for INSPiRED Cancer Care (Integrated Systems for Patient-Reported Data) and Department of Symptom Research, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Geraldine Rauch: Institute of Biometry and Clinical Epidemiology, Charité – Universitätsmedizin Berlin, Freie Universität Berlin, Humboldt-Universität zu Berlin, Berlin, Germany
- Joerg Heil: University Breast Unit, Department of Obstetrics and Gynecology, Heidelberg University Hospital, Heidelberg, Germany
- Michael Golatta: University Breast Unit, Department of Obstetrics and Gynecology, Heidelberg University Hospital, Heidelberg, Germany
55. Hsieh YH, Hsu FR, Dai ST, Huang HY, Chen DR, Shia WC. Incorporating the Breast Imaging Reporting and Data System Lexicon with a Fully Convolutional Network for Malignancy Detection on Breast Ultrasound. Diagnostics (Basel) 2021; 12:66. [PMID: 35054233] [PMCID: PMC8774546] [DOI: 10.3390/diagnostics12010066]
Abstract
In this study, we applied semantic segmentation with a fully convolutional deep learning network to identify characteristics of the Breast Imaging Reporting and Data System (BI-RADS) lexicon in breast ultrasound images, to facilitate clinical classification of malignant tumors. Among 378 images (204 benign and 174 malignant) from 189 patients (102 with benign and 87 with malignant breast tumors), we identified seven malignant characteristics related to the BI-RADS lexicon in breast ultrasound. The mean accuracy and mean intersection over union (IoU) of the semantic segmentation were 32.82% and 28.88, respectively. The weighted IoU was 85.35%, and the area under the curve was 89.47%, outperforming the comparable semantic segmentation networks SegNet and U-Net on the same dataset. Our results suggest that a deep learning network combined with the BI-RADS lexicon can be an important supplemental tool when using ultrasound to diagnose breast malignancy.
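Intersection over union (IoU), the segmentation metric reported above, compares predicted and reference label maps class by class: overlap divided by combined extent. A minimal illustration, assuming flattened 1-D label maps with classes 0 (background) and 1 (lesion); the toy arrays are not the paper's data:

```python
def iou(pred, target, cls):
    """Intersection over union for one class over flattened label maps."""
    inter = sum(1 for p, t in zip(pred, target) if p == cls and t == cls)
    union = sum(1 for p, t in zip(pred, target) if p == cls or t == cls)
    return inter / union if union else 0.0

# Tiny illustrative "segmentation maps" (not the paper's data):
target = [0, 0, 1, 1, 1, 0]   # reference annotation
pred   = [0, 1, 1, 1, 0, 0]   # network output
lesion_iou = iou(pred, target, 1)                  # 2 overlap / 4 union
mean_iou = (iou(pred, target, 0) + lesion_iou) / 2 # unweighted mean over classes
```

A class-frequency-weighted average of per-class IoU gives the "weighted IoU" figure the abstract reports.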
Affiliation(s)
- Yung-Hsien Hsieh: Department of Information Engineering and Computer Science, Feng Chia University, Taichung 40724, Taiwan
- Fang-Rong Hsu: Department of Information Engineering and Computer Science, Feng Chia University, Taichung 40724, Taiwan
- Seng-Tong Dai: Department of Information Engineering and Computer Science, Feng Chia University, Taichung 40724, Taiwan
- Hsin-Ya Huang: Comprehensive Breast Cancer Center, Changhua Christian Hospital, Changhua 500, Taiwan
- Dar-Ren Chen: Comprehensive Breast Cancer Center, Changhua Christian Hospital, Changhua 500, Taiwan; School of Medicine, Chung Shan Medical University, Taichung 40201, Taiwan
- Wei-Chung Shia: Department of Information Engineering and Computer Science, Feng Chia University, Taichung 40724, Taiwan; Molecular Medicine Laboratory, Department of Research, Changhua Christian Hospital, Changhua 500, Taiwan
56. Preliminary Study on the Diagnostic Performance of a Deep Learning System for Submandibular Gland Inflammation Using Ultrasonography Images. J Clin Med 2021; 10:4508. [PMID: 34640523] [PMCID: PMC8509623] [DOI: 10.3390/jcm10194508]
Abstract
This study was performed to evaluate the diagnostic performance of deep learning systems using ultrasonography (USG) images of the submandibular glands (SMGs) in three conditions: obstructive sialoadenitis, Sjögren's syndrome (SjS), and normal glands. Fifty USG images with a confirmed diagnosis of obstructive sialoadenitis, 50 with a confirmed diagnosis of SjS, and 50 with no SMG abnormalities were included. For the deep learning analysis, the training group comprised 40 obstructive sialoadenitis, 40 SjS, and 40 control images, and the test group comprised 10 images of each. The performance of the deep learning system was calculated and compared with that of two experienced radiologists. The sensitivity of the deep learning system in the obstructive sialoadenitis, SjS, and control groups was 55.0%, 83.0%, and 73.0%, respectively, with a total accuracy of 70.3%. The corresponding sensitivity of the two radiologists was 64.0%, 72.0%, and 86.0%, with a total accuracy of 74.0%. Thus, the deep learning system was more sensitive than the experienced radiologists in diagnosing SjS on USG images of the SMGs.
57. Zhang D, Jiang F, Yin R, Wu GG, Wei Q, Cui XW, Zeng SE, Ni XJ, Dietrich CF. A Review of the Role of the S-Detect Computer-Aided Diagnostic Ultrasound System in the Evaluation of Benign and Malignant Breast and Thyroid Masses. Med Sci Monit 2021; 27:e931957. [PMID: 34552043] [PMCID: PMC8477643] [DOI: 10.12659/msm.931957]
Abstract
Computer-aided diagnosis (CAD) systems have attracted extensive attention owing to their performance in image diagnosis and are rapidly becoming a promising auxiliary tool in medical imaging tasks. These systems can quantitatively evaluate complex medical imaging features and achieve efficient, highly accurate diagnoses. Deep learning is a representation learning method; as a major branch of artificial intelligence technology, it can directly process original image data by simulating the structure of the human brain's neural networks, thus independently completing the task of image recognition. S-Detect is a novel, interactive CAD system based on a deep learning algorithm that has been integrated into ultrasound equipment; it can help radiologists identify benign and malignant nodules, reduce physician workload, and optimize the ultrasound clinical workflow. S-Detect is becoming one of the most commonly used CAD systems for ultrasound evaluation of breast and thyroid nodules. In this review, we describe the S-Detect workflow and outline its application in breast and thyroid nodule detection. Finally, we discuss the difficulties and challenges S-Detect faces as a precision medical tool in clinical practice, and its prospects.
Affiliation(s)
- Di Zhang: Department of Medical Ultrasound, Affiliated Hospital of Nantong University, Nantong, Jiangsu, PR China
- Fan Jiang: Department of Medical Ultrasound, The Second Affiliated Hospital of Anhui Medical University, Hefei, Anhui, PR China
- Rui Yin: Department of Ultrasound, Affiliated Renhe Hospital of China Three Gorges University, Yichang, Hubei, PR China
- Ge-Ge Wu: Department of Medical Ultrasound, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, PR China
- Qi Wei: Department of Medical Ultrasound, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, PR China
- Xin-Wu Cui: Department of Medical Ultrasound, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, PR China
- Shu-E Zeng: Department of Ultrasound, Hubei Cancer Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, PR China
- Xue-Jun Ni: Department of Medical Ultrasound, Affiliated Hospital of Nantong University, Nantong, Jiangsu, PR China
58
|
Shen YT, Chen L, Yue WW, Xu HX. Artificial intelligence in ultrasound. Eur J Radiol 2021; 139:109717. [PMID: 33962110 DOI: 10.1016/j.ejrad.2021.109717] [Citation(s) in RCA: 95] [Impact Index Per Article: 23.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/27/2021] [Revised: 03/28/2021] [Accepted: 04/11/2021] [Indexed: 12/13/2022]
Abstract
Ultrasound (US), a flexible, green imaging modality, is expanding globally as a first-line imaging technique in various clinical fields, following the continual emergence of advanced ultrasonic technologies and well-established US-based digital health systems. In US practice, qualified physicians must manually collect and visually evaluate images for the detection, identification, and monitoring of diseases; diagnostic performance is inevitably reduced by the high operator-dependence intrinsic to US. In contrast, artificial intelligence (AI) excels at automatically recognizing complex patterns and providing quantitative assessments of imaging data, showing high potential to assist physicians in acquiring more accurate and reproducible results. In this article, we provide a general understanding of AI, machine learning (ML), and deep learning (DL) technologies. We then review the rapidly growing applications of AI, especially DL, in US across anatomical regions (thyroid, breast, abdomen and pelvis, obstetrics, heart and blood vessels, the musculoskeletal system, and other organs), covering image quality control, anatomy localization, object detection, lesion segmentation, and computer-aided diagnosis and prognosis evaluation. Finally, we offer our perspective on the challenges and opportunities for the clinical practice of biomedical AI systems in US.
Affiliation(s)
- Yu-Ting Shen: Department of Medical Ultrasound, Shanghai Tenth People's Hospital, Ultrasound Research and Education Institute, Tongji University School of Medicine, Tongji University Cancer Center, Shanghai Engineering Research Center of Ultrasound Diagnosis and Treatment, National Clinical Research Center of Interventional Medicine, Shanghai, 200072, PR China
- Liang Chen: Department of Gastroenterology, Shanghai Tenth People's Hospital, Tongji University School of Medicine, Shanghai, 200072, PR China
- Wen-Wen Yue: Department of Medical Ultrasound, Shanghai Tenth People's Hospital, Ultrasound Research and Education Institute, Tongji University School of Medicine, Tongji University Cancer Center, Shanghai Engineering Research Center of Ultrasound Diagnosis and Treatment, National Clinical Research Center of Interventional Medicine, Shanghai, 200072, PR China
- Hui-Xiong Xu: Department of Medical Ultrasound, Shanghai Tenth People's Hospital, Ultrasound Research and Education Institute, Tongji University School of Medicine, Tongji University Cancer Center, Shanghai Engineering Research Center of Ultrasound Diagnosis and Treatment, National Clinical Research Center of Interventional Medicine, Shanghai, 200072, PR China
59. Berg WA, Gur D, Bandos AI, Nair B, Gizienski TA, Tyma CS, Abrams G, Davis KM, Mehta AS, Rathfon G, Waheed UX, Hakim CM. Impact of Original and Artificially Improved Artificial Intelligence-based Computer-aided Diagnosis on Breast US Interpretation. J Breast Imaging 2021; 3:301-311. [PMID: 38424776] [DOI: 10.1093/jbi/wbab013]
Abstract
OBJECTIVE: To assess, for breast US interpretation, the impact of computer-aided diagnosis (CADx) in its original mode or with artificially improved sensitivity or specificity. METHODS: In this IRB-approved protocol, orthogonal-paired US images of 319 lesions identified on screening, including 88 (27.6%) cancers (median 7 mm, range 1-34 mm), were reviewed by 9 breast imaging radiologists. Each observer provided BI-RADS assessments (2, 3, 4A, 4B, 4C, 5) before and after CADx in a mode-balanced design: mode 1, original CADx (outputs benign, probably benign, suspicious, or malignant); mode 2, artificially-high-sensitivity CADx (benign or malignant); and mode 3, artificially-high-specificity CADx (benign or malignant). The area under the receiver operating characteristic curve (AUC) was estimated for each modality and for the standalone CADx outputs. Multi-reader analysis accounted for inter-reader variability and correlation between same-lesion assessments. RESULTS: The AUC of standalone CADx was 0.77 (95% CI: 0.72-0.83). For mode 1, the average reader AUC was 0.82 (range 0.76-0.84) without CADx and did not change significantly with CADx. In high-sensitivity mode, all observers' AUCs increased: the average AUC of 0.83 (range 0.78-0.86) before CADx rose to 0.88 (range 0.84-0.90), P < 0.001. In high-specificity mode, all observers' AUCs increased: the average AUC of 0.82 (range 0.76-0.84) before CADx rose to 0.89 (range 0.87-0.92), P < 0.0001. Radiologists responded more frequently to malignant CADx cues in high-specificity mode (42.7% vs 23.2% in mode 1 and 27.0% in mode 2, P = 0.008). CONCLUSION: Original CADx did not substantially impact radiologists' interpretations. Radiologists showed improved performance and were more responsive when CADx produced fewer false-positive malignant cues.
Affiliation(s)
- Wendie A Berg: University of Pittsburgh School of Medicine, Department of Radiology, Pittsburgh, PA, USA; Magee-Womens Hospital of UPMC, Pittsburgh, PA, USA
- David Gur: University of Pittsburgh School of Medicine, Department of Radiology, Pittsburgh, PA, USA
- Andriy I Bandos: University of Pittsburgh Graduate School of Public Health, Department of Biostatistics, Pittsburgh, PA, USA
- Bronwyn Nair: University of Pittsburgh School of Medicine, Department of Radiology, Pittsburgh, PA, USA; Magee-Womens Hospital of UPMC, Pittsburgh, PA, USA
- Terri-Ann Gizienski: University of Pittsburgh School of Medicine, Department of Radiology, Pittsburgh, PA, USA; Magee-Womens Hospital of UPMC, Pittsburgh, PA, USA
- Cathy S Tyma: University of Pittsburgh School of Medicine, Department of Radiology, Pittsburgh, PA, USA; Magee-Womens Hospital of UPMC, Pittsburgh, PA, USA; New York University Langone Medical Center, Department of Radiology, New York, NY, USA
- Gordon Abrams: University of Pittsburgh School of Medicine, Department of Radiology, Pittsburgh, PA, USA; Magee-Womens Hospital of UPMC, Pittsburgh, PA, USA
- Katie M Davis: Magee-Womens Hospital of UPMC, Pittsburgh, PA, USA; Vanderbilt University Medical Center, Department of Radiology, Nashville, TN, USA
- Amar S Mehta: Magee-Womens Hospital of UPMC, Pittsburgh, PA, USA; DuPage Medical Group, Department of Radiology, Downers Grove, IL, USA
- Grace Rathfon: University of Pittsburgh School of Medicine, Department of Radiology, Pittsburgh, PA, USA; Magee-Womens Hospital of UPMC, Pittsburgh, PA, USA; Steuben Radiology Associates, Steubenville, OH, USA
- Uzma X Waheed: University of Pittsburgh School of Medicine, Department of Radiology, Pittsburgh, PA, USA; Magee-Womens Hospital of UPMC, Pittsburgh, PA, USA
- Christiane M Hakim: University of Pittsburgh School of Medicine, Department of Radiology, Pittsburgh, PA, USA; Magee-Womens Hospital of UPMC, Pittsburgh, PA, USA
60. Bahl M. Artificial Intelligence for Breast Ultrasound: Will It Impact Radiologists' Accuracy? J Breast Imaging 2021; 3:312-314. [PMID: 34056592] [PMCID: PMC8139610] [DOI: 10.1093/jbi/wbab022]
Affiliation(s)
- Manisha Bahl: Massachusetts General Hospital, Department of Radiology, Boston, MA, USA
61. Park SH, Choi J, Byeon JS. Key Principles of Clinical Validation, Device Approval, and Insurance Coverage Decisions of Artificial Intelligence. Korean J Radiol 2021; 22:442-453. [PMID: 33629545] [PMCID: PMC7909857] [DOI: 10.3348/kjr.2021.0048]
Abstract
Artificial intelligence (AI) will likely affect various fields of medicine. This article explains the fundamental principles of clinical validation, device approval, and insurance coverage decisions for AI algorithms used in medical diagnosis and prediction. The discrimination accuracy of AI algorithms is often evaluated with the Dice similarity coefficient, sensitivity, specificity, and traditional or free-response receiver operating characteristic curves. Calibration accuracy should also be assessed, especially for algorithms that provide probabilities to users. As current AI algorithms have limited generalizability to real-world practice, clinical validation of AI should subject it to proper external testing in assisting roles. External testing can adopt a diagnostic case-control or a diagnostic cohort design: a diagnostic case-control study evaluates the technical validity/accuracy of AI, while a diagnostic cohort study tests its clinical validity/accuracy in samples representing target patients in real-world clinical scenarios. Ultimate clinical validation of AI requires evaluation of its impact on patient outcomes, referred to as clinical utility, for which randomized clinical trials are ideal. Device approval of AI is typically granted on proof of technical validity/accuracy; it is not intended to directly indicate whether AI is beneficial for patient care or improves patient outcomes, nor can it categorically address the issue of limited generalizability. After device approval, it is up to medical professionals to determine whether the approved AI algorithms are beneficial for real-world patient care. Insurance coverage decisions generally require a demonstration of clinical utility, that is, that the use of AI has improved patient outcomes.
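The Dice similarity coefficient mentioned above quantifies overlap between a predicted and a reference segmentation: twice the intersection divided by the total size of both masks. A minimal sketch, assuming flattened binary masks; the toy masks are illustrative, not from any cited study:

```python
def dice(a, b):
    """Dice similarity coefficient between two binary masks
    (flattened sequences of 0/1); 1.0 means perfect overlap."""
    inter = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    total = sum(a) + sum(b)
    return 2 * inter / total if total else 1.0

# Illustrative masks only:
mask_pred = [0, 1, 1, 1, 0]
mask_true = [0, 0, 1, 1, 1]
d = dice(mask_pred, mask_true)  # 2*2 / (3+3) = 2/3
```

Dice and IoU are monotonically related (Dice = 2·IoU / (1 + IoU)), which is why either may be reported for the same segmentation task.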
Affiliation(s)
- Seong Ho Park: Department of Radiology and Research Institute of Radiology, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Korea
- Jaesoon Choi: Department of Biomedical Engineering, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Korea
- Jeong Sik Byeon: Department of Gastroenterology, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Korea
62
Aggarwal R, Sounderajah V, Martin G, Ting DSW, Karthikesalingam A, King D, Ashrafian H, Darzi A. Diagnostic accuracy of deep learning in medical imaging: a systematic review and meta-analysis. NPJ Digit Med 2021;4:65. PMID: 33828217; PMCID: PMC8027892; DOI: 10.1038/s41746-021-00438-z.
Abstract
Deep learning (DL) has the potential to transform medical diagnostics, but its diagnostic accuracy remains uncertain. Our aim was to evaluate the diagnostic accuracy of DL algorithms in identifying pathology in medical imaging. Searches were conducted in Medline and EMBASE up to January 2020. Peer-reviewed studies reporting the diagnostic accuracy of DL algorithms for identifying pathology on medical imaging were included. We identified 11,921 studies, of which 503 were included in the systematic review. Eighty-two studies in ophthalmology, 82 in breast disease, and 115 in respiratory disease were included for meta-analysis; 224 studies in other specialities were included for qualitative review. Primary outcomes were measures of diagnostic accuracy, study design, and reporting standards in the literature. Estimates were pooled using random-effects meta-analysis. In ophthalmology, AUCs ranged between 0.933 and 1 for diagnosing diabetic retinopathy, age-related macular degeneration, and glaucoma on retinal fundus photographs and optical coherence tomography. In respiratory imaging, AUCs ranged between 0.864 and 0.937 for diagnosing lung nodules or lung cancer on chest X-ray or CT. For breast imaging, AUCs ranged between 0.868 and 0.909 for diagnosing breast cancer on mammography, ultrasound, MRI, and digital breast tomosynthesis. Heterogeneity between studies was high, with extensive variation in methodology, terminology, and outcome measures; this can lead to overestimation of the diagnostic accuracy of DL algorithms on medical imaging. There is an immediate need for artificial intelligence-specific EQUATOR guidelines, particularly STARD, to provide guidance on key issues in this field.
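Random-effects pooling of the kind used in this meta-analysis is commonly done with the DerSimonian-Laird estimator; a pure-Python sketch of that estimator follows (the study itself will have used dedicated meta-analysis software, and the inputs below are hypothetical study estimates, not figures from the review):

```python
def dersimonian_laird(effects, variances):
    """Random-effects pooled estimate via the DerSimonian-Laird method.

    effects: per-study effect estimates; variances: their within-study variances.
    Returns (pooled effect, between-study variance tau^2, variance of the pooled effect).
    """
    k = len(effects)
    w = [1.0 / v for v in variances]           # fixed-effect (inverse-variance) weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))  # Cochran's Q
    c = sw - sum(wi * wi for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0           # between-study variance
    wstar = [1.0 / (v + tau2) for v in variances]                  # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(wstar, effects)) / sum(wstar)
    return pooled, tau2, 1.0 / sum(wstar)

# Three hypothetical study estimates with their variances:
print(dersimonian_laird([0.90, 0.93, 0.87], [0.002, 0.001, 0.004]))
```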
Affiliation(s)
- Ravi Aggarwal: Institute of Global Health Innovation, Imperial College London, London, UK
- Guy Martin: Institute of Global Health Innovation, Imperial College London, London, UK
- Daniel S W Ting: Singapore Eye Research Institute, Singapore National Eye Center, Singapore, Singapore
- Dominic King: Institute of Global Health Innovation, Imperial College London, London, UK
- Hutan Ashrafian: Institute of Global Health Innovation, Imperial College London, London, UK
- Ara Darzi: Institute of Global Health Innovation, Imperial College London, London, UK
63
Wang XY, Cui LG, Feng J, Chen W. Artificial intelligence for breast ultrasound: an adjunct tool to reduce excessive lesion biopsy. Eur J Radiol 2021;138:109624. PMID: 33706046; DOI: 10.1016/j.ejrad.2021.109624.
Abstract
PURPOSE To determine whether adding an artificial intelligence (AI) system to breast ultrasound (US) can reduce unnecessary biopsies. METHODS Conventional US and AI analyses were prospectively performed on 173 suspicious breast lesions before US-guided core needle biopsy or vacuum-assisted excision. Conventional US images were retrospectively reviewed according to the BI-RADS 2013 lexicon and categories. Two downgrading stratifications based on the AI assessments were used to manually downgrade BI-RADS category 4A lesions to category 3: stratification A downgraded a lesion only if the AI assessed both orthogonal sections as possibly benign, whereas stratification B downgraded a lesion if the AI assessed either orthogonal section as possibly benign. The effect of AI-based diagnosis on reducing unnecessary biopsies was analyzed using histopathological results as the reference standard. RESULTS Forty-three lesions diagnosed as BI-RADS category 4A on conventional US underwent AI-based hypothetical downgrading. With stratification A, 14 biopsies were correctly avoided, and the biopsy rate for BI-RADS category 4A lesions decreased from 100% to 67.4% (P < 0.001). With stratification B, 27 biopsies could be avoided at the cost of two missed malignancies, and the biopsy rate decreased to 37.2% (P < 0.05, compared with both conventional US and stratification A). CONCLUSION Adding an AI system to breast US could reduce unnecessary lesion biopsies. Downgrading stratification A is recommended for its lower misdiagnosis rate.
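The reported biopsy-rate reductions follow from simple arithmetic on the 43 category 4A lesions; this sketch (the function name is ours, not the authors') reproduces the published percentages:

```python
def biopsy_rate(total, avoided):
    """Fraction of lesions still biopsied after hypothetically downgrading `avoided` of them."""
    return (total - avoided) / total

# The 43 BI-RADS category 4A lesions reported in the study:
print(round(100 * biopsy_rate(43, 14), 1))  # 67.4 (stratification A)
print(round(100 * biopsy_rate(43, 27), 1))  # 37.2 (stratification B)
```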
Affiliation(s)
- Xin-Yi Wang: Department of Ultrasound, Peking University Third Hospital, Beijing 10091, China; Key Laboratory of Carcinogenesis and Translational Research (Ministry of Education/Beijing), Breast Center, Peking University Cancer Hospital & Institute, Beijing 100142, China
- Li-Gang Cui: Department of Ultrasound, Peking University Third Hospital, Beijing 10091, China
- Jie Feng: Department of Ultrasound, Peking University Third Hospital, Beijing 10091, China; Department of Ultrasound, Haicang Hospital, Xiamen 361000, China
- Wen Chen: Department of Ultrasound, Peking University Third Hospital, Beijing 10091, China
64
Xiang H, Huang YS, Lee CH, Chang Chien TY, Lee CK, Liu L, Li A, Lin X, Chang RF. 3-D Res-CapsNet convolutional neural network on automated breast ultrasound tumor diagnosis. Eur J Radiol 2021;138:109608. PMID: 33711572; DOI: 10.1016/j.ejrad.2021.109608.
Abstract
PURPOSE We propose a 3-D tumor computer-aided diagnosis (CADx) system with a U-net and a residual-capsule neural network (Res-CapsNet) for automated breast ultrasound (ABUS) images, to provide a reference for early tumor diagnosis, especially of non-mass lesions. METHODS A total of 396 patients with 444 tumors (226 malignant and 218 benign) were retrospectively enrolled from Sun Yat-sen University Cancer Center. In our CADx, preprocessing was performed first to crop and resize the tumor volumes of interest (VOIs). A 3-D U-net and postprocessing were then applied to the VOIs to obtain tumor masks. Finally, a 3-D Res-CapsNet classification model was run on the VOIs and the corresponding masks to diagnose the tumors. Diagnostic performance, including accuracy, sensitivity, specificity, and area under the curve (AUC), was compared with other classification models and among three readers with different years of experience in ABUS review. RESULTS For all tumors, the accuracy, sensitivity, specificity, and AUC of the proposed CADx were 84.9%, 87.2%, 82.6%, and 0.9122, respectively, outperforming the other models and the junior reader. The tumors were then subdivided into mass and non-mass tumors to validate the system performance. For mass tumors, our CADx achieved an accuracy, sensitivity, specificity, and AUC of 85.2%, 88.2%, 82.3%, and 0.9147, respectively, higher than the other models and the junior reader. For non-mass tumors, our CADx achieved an accuracy, sensitivity, specificity, and AUC of 81.6%, 78.3%, 86.7%, and 0.8654, respectively, outperforming the two readers. CONCLUSION The proposed CADx with 3-D U-net and 3-D Res-CapsNet models has the potential to reduce misdiagnosis, especially for non-mass lesions.
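The AUC compared across models and readers above has a direct empirical form: the probability that a randomly chosen malignant case scores higher than a randomly chosen benign one. A minimal sketch (the toy scores are illustrative, not the study's outputs):

```python
def auc(pos_scores, neg_scores):
    """Empirical AUC: fraction of (positive, negative) pairs ranked correctly, ties counting half."""
    wins = sum((p > n) + 0.5 * (p == n) for p in pos_scores for n in neg_scores)
    return wins / (len(pos_scores) * len(neg_scores))

# Toy classifier outputs for 3 malignant and 3 benign tumors:
print(auc([0.9, 0.8, 0.4], [0.5, 0.3, 0.2]))  # 8/9 ≈ 0.889
```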
Affiliation(s)
- Huiling Xiang: Department of Ultrasound, State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Sun Yat-sen University Cancer Center, Guangzhou, China
- Yao-Sian Huang: Department of Computer Science and Information Engineering, National Taiwan University, Taipei, Taiwan
- Chu-Hsuan Lee: Department of Computer Science and Information Engineering, National Taiwan University, Taipei, Taiwan
- Ting-Yin Chang Chien: Department of Computer Science and Information Engineering, National Taiwan University, Taipei, Taiwan
- Lixian Liu: Department of Ultrasound, State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Sun Yat-sen University Cancer Center, Guangzhou, China
- Anhua Li: Department of Ultrasound, State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Sun Yat-sen University Cancer Center, Guangzhou, China
- Xi Lin: Department of Ultrasound, State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Sun Yat-sen University Cancer Center, Guangzhou, China
- Ruey-Feng Chang: Department of Computer Science and Information Engineering, National Taiwan University, Taipei, Taiwan; Graduate Institute of Biomedical Electronics and Bioinformatics, National Taiwan University, Taipei, Taiwan
65
Vasey B, Ursprung S, Beddoe B, Taylor EH, Marlow N, Bilbro N, Watkinson P, McCulloch P. Association of Clinician Diagnostic Performance With Machine Learning-Based Decision Support Systems: A Systematic Review. JAMA Netw Open 2021;4:e211276. PMID: 33704476; PMCID: PMC7953308; DOI: 10.1001/jamanetworkopen.2021.1276.
Abstract
IMPORTANCE An increasing number of machine learning (ML)-based clinical decision support systems (CDSSs) are described in the medical literature, but this research focuses almost entirely on comparing CDSSs directly with clinicians (human vs computer). Little is known about the outcomes of these systems when used as adjuncts to human decision-making (human vs human with computer). OBJECTIVES To conduct a systematic review investigating the association between the interactive use of ML-based diagnostic CDSSs and clinician performance, and to examine the extent of the CDSSs' human factors evaluation. EVIDENCE REVIEW A search of MEDLINE, Embase, PsycINFO, and grey literature was conducted for the period between January 1, 2010, and May 31, 2019. Peer-reviewed studies published in English comparing human clinician performance with and without interactive use of an ML-based diagnostic CDSS were included. All metrics used to assess human performance were considered as outcomes. Risk of bias was assessed using the Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2) and Risk of Bias in Non-Randomised Studies-Intervention (ROBINS-I) tools. Narrative summaries were produced for the main outcomes. Given the heterogeneity of medical conditions, outcomes of interest, and evaluation metrics, no meta-analysis was performed. FINDINGS A total of 8112 studies were initially retrieved and 5154 abstracts were screened; of these, 37 studies met the inclusion criteria. The median number of participating clinicians was 4 (interquartile range, 3-8). Of the 107 results that reported statistical significance, 54 (50%) showed performance increased by the use of CDSSs, 4 (4%) showed performance decreased, and 49 (46%) showed no change or an unclear change. In the subgroup of studies carried out in representative clinical settings, no association between the use of ML-based diagnostic CDSSs and improved clinician performance could be observed. Interobserver agreement was the commonly reported outcome whose change was most strongly associated with CDSS use. Four studies (11%) reported on user feedback, and in all but one case, clinicians decided to override at least some of the algorithms' recommendations. Twenty-eight studies (76%) were rated as having a high risk of bias in at least one of the four QUADAS-2 core domains, and six studies (16%) were considered to be at serious or critical risk of bias using ROBINS-I. CONCLUSIONS AND RELEVANCE This systematic review found only sparse evidence that the use of ML-based CDSSs is associated with improved clinician diagnostic performance. Most studies had few participants, were at high or unclear risk of bias, and showed little or no consideration for human factors. Caution should be exercised when estimating the current potential of ML to improve human diagnostic performance, and more comprehensive evaluation should be conducted before deploying ML-based CDSSs in clinical settings. The results highlight the importance of considering supported human decisions as end points rather than merely the stand-alone CDSS outputs.
Affiliation(s)
- Baptiste Vasey: Nuffield Department of Surgical Sciences, University of Oxford, Oxford, United Kingdom
- Stephan Ursprung: Department of Radiology, University of Cambridge, Cambridge, United Kingdom
- Benjamin Beddoe: Faculty of Medicine, Imperial College London, London, United Kingdom
- Elliott H. Taylor: Nuffield Department of Surgical Sciences, University of Oxford, Oxford, United Kingdom
- Neale Marlow: Nuffield Department of Surgical Sciences, University of Oxford, Oxford, United Kingdom; Oxford University Hospitals NHS Foundation Trust, Oxford, United Kingdom
- Nicole Bilbro: Department of Surgery, Maimonides Medical Center, Brooklyn, New York
- Peter Watkinson: Critical Care Research Group, Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford, United Kingdom
- Peter McCulloch: Nuffield Department of Surgical Sciences, University of Oxford, Oxford, United Kingdom
66
Shen L. Data mining artificial intelligence technology for college English test framework and performance analysis system. Journal of Intelligent & Fuzzy Systems 2021. DOI: 10.3233/jifs-189386.
Abstract
This article first studies and designs a college English test framework and performance analysis system. The large volume of data collected by the system is analyzed along three dimensions: a data-mining model of associations between question types, a machine-learning model for predicting college English scores, and a diagnostic evaluation model. On this basis, the author designed and implemented a test-paper generation algorithm based on question-type association rules and validated it with respect to test-paper assembly time, test-question recommendation, and score improvement. Finally, guided by a needs analysis, the diagnostic evaluation model and the associated test-paper algorithm were integrated into a college English diagnostic practice system. Comparative experiments show that the proposed algorithm can provide learners with better practice guidance and question recommendations targeted at their learning status and knowledge-point obstacles, and can effectively improve learning. The approach has broad application prospects and research value.
Affiliation(s)
- Lin Shen: College of Foreign Languages, Guizhou University, Guiyang, China
67
Shia WC, Lin LS, Chen DR. Classification of malignant tumours in breast ultrasound using unsupervised machine learning approaches. Sci Rep 2021;11:1418. PMID: 33446841; PMCID: PMC7809485; DOI: 10.1038/s41598-021-81008-x.
Abstract
Traditional computer-aided diagnosis (CAD) processes include feature extraction, selection, and classification; effective feature extraction is important for improving classification performance. We introduce a machine-learning method and analysis procedure for classifying benign and malignant breast tumours in ultrasound (US) images without a priori tumour region-selection processing, thereby decreasing clinical diagnosis effort while maintaining high classification performance. Our dataset comprised 677 US images (benign: 312, malignant: 365). From the two-dimensional US images, a pyramid histogram of oriented gradient descriptors was extracted to obtain feature vectors. The correlation-based feature selection method was used to evaluate and select significant feature sets for classification. Sequential minimal optimisation, combined with local weight learning, was used for classification and performance enhancement. Classification of the image dataset showed 81.64% sensitivity and 87.76% specificity for malignant images (area under the curve = 0.847); the positive and negative predictive values were 84.1% and 85.8%, respectively. A new workflow using machine learning to recognise malignant US images was thus proposed. Physician diagnoses and the automatic classifications made using machine learning yielded similar outcomes, indicating the potential applicability of machine learning in clinical diagnosis.
Affiliation(s)
- Wei-Chung Shia: Molecular Medicine Laboratory, Department of Research, Changhua Christian Hospital, Changhua, Taiwan
- Li-Sheng Lin: Department of Breast Surgery, The Affiliated Hospital (Group) of Putian University, Putian, Fujian, China
- Dar-Ren Chen: Comprehensive Breast Cancer Center, Changhua Christian Hospital, Changhua, Taiwan
68
Kim SY, Choi Y, Kim EK, Han BK, Yoon JH, Choi JS, Chang JM. Deep learning-based computer-aided diagnosis in screening breast ultrasound to reduce false-positive diagnoses. Sci Rep 2021;11:395. PMID: 33432076; PMCID: PMC7801712; DOI: 10.1038/s41598-020-79880-0.
Abstract
A major limitation of screening breast ultrasound (US) is a substantial number of false-positive biopsies. This study aimed to develop a deep learning-based computer-aided diagnosis (DL-CAD)-based diagnostic model to improve the differential diagnosis of breast masses detected on screening US and reduce false-positive diagnoses. In this multicenter retrospective study, a diagnostic model was developed based on US images combined with information obtained from the DL-CAD software for patients with breast masses detected using screening US; the data were obtained from two hospitals (development set: 299 imaging studies in 2015). Quantitative morphologic features were obtained from the DL-CAD software, and clinical findings were collected. Multivariable logistic regression analysis was performed to establish a DL-CAD-based nomogram, and the model was externally validated using data from 164 imaging studies conducted between 2018 and 2019 at another hospital. Among the quantitative morphologic features extracted from DL-CAD, a higher irregular shape score (P = .018) and a lower parallel orientation score (P = .007) were associated with malignancy. The nomogram incorporating the DL-CAD-based quantitative features, the radiologists' Breast Imaging Reporting and Data System (BI-RADS) final assessment (P = .014), and patient age (P < .001) exhibited good discrimination in both the development and validation cohorts (area under the receiver operating characteristic curve, 0.89 and 0.87). Compared with the radiologists' BI-RADS final assessment, the DL-CAD-based nomogram lowered the false-positive rate (68% vs. 31%, P < .001 in the development cohort; 97% vs. 45%, P < .001 in the validation cohort) without affecting the sensitivity (98% vs. 93%, P = .317 in the development cohort; 100% in both arms of the validation cohort). In conclusion, the proposed model showed good performance for differentiating breast masses detected on screening US, demonstrating a potential to reduce unnecessary biopsies.
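A nomogram built from multivariable logistic regression is essentially a visual encoding of the fitted linear predictor; the scoring step can be sketched as below. The intercept, coefficients, and predictor values here are hypothetical illustrations, not the study's fitted model:

```python
import math

def logistic_risk(intercept, coefs, features):
    """Predicted probability from a fitted logistic model: 1 / (1 + exp(-(b0 + b·x)))."""
    z = intercept + sum(c * x for c, x in zip(coefs, features))
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical predictors: [irregular-shape score, parallel-orientation score, age / 10]
risk = logistic_risk(-4.0, [3.0, -2.0, 0.5], [0.8, 0.2, 5.5])
print(round(risk, 3))  # ≈ 0.679
```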
Affiliation(s)
- Soo-Yeon Kim: Department of Radiology, Seoul National University Hospital, Seoul National University College of Medicine, 101 Daehak-ro, Jongno-gu, Seoul, 03080, Republic of Korea
- Yunhee Choi: Medical Research Collaborating Center, Seoul National University Hospital, Seoul, Republic of Korea
- Eun-Kyung Kim: Department of Radiology and Research Institute of Radiological Science, Severance Hospital, Yonsei University College of Medicine, Seoul, Republic of Korea
- Boo-Kyung Han: Department of Radiology and Center for Imaging Science, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Republic of Korea
- Jung Hyun Yoon: Department of Radiology and Research Institute of Radiological Science, Severance Hospital, Yonsei University College of Medicine, Seoul, Republic of Korea
- Ji Soo Choi: Department of Radiology and Center for Imaging Science, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Republic of Korea
- Jung Min Chang: Department of Radiology, Seoul National University Hospital, Seoul National University College of Medicine, 101 Daehak-ro, Jongno-gu, Seoul, 03080, Republic of Korea
69
Park SH. Artificial intelligence for ultrasonography: unique opportunities and challenges. Ultrasonography 2021;40:3-6. PMID: 33227844; PMCID: PMC7758099; DOI: 10.14366/usg.20078.
70
Heller SL, Wegener M, Babb JS, Gao Y. Can an Artificial Intelligence Decision Aid Decrease False-Positive Breast Biopsies? Ultrasound Q 2020;37:10-15. PMID: 33394994; DOI: 10.1097/ruq.0000000000000550.
Abstract
This study aimed to evaluate the effect of an artificial intelligence (AI) support system on breast ultrasound diagnostic accuracy. In this Health Insurance Portability and Accountability Act-compliant, institutional review board-approved retrospective study, 200 lesions (155 benign, 45 malignant) were randomly selected from consecutive ultrasound-guided biopsies (June 2017-January 2019). Two readers, blinded to clinical history and pathology, evaluated lesions with and without a Food and Drug Administration-approved AI software. Lesion features, Breast Imaging Reporting and Data System (BI-RADS) rating (1-5), reader confidence level (1-5), and AI BI-RADS equivalent (1-5) were recorded. Statistical analysis was performed for diagnostic accuracy, negative predictive value (NPV), positive predictive value (PPV), sensitivity, and specificity of reader versus AI BI-RADS. Generalized estimating equation analysis was used for reader versus AI accuracy regarding lesion features and for the AI impact on low-confidence score lesions. The AI effect on the false-positive biopsy rate was determined. Statistical tests were conducted at a 2-sided 5% significance level. There was no significant difference in accuracy (73% vs 69.8%), NPV (100% vs 98.5%), PPV (45.5% vs 42.4%), sensitivity (100% vs 96.7%), or specificity (65.2% vs 61.9%; P = 0.118-0.409) for AI versus pooled reader assessment. AI was more accurate than readers for irregular shape (74.1% vs 57.4%, P = 0.002) and less accurate for round shape (26.5% vs 50.0%, P = 0.049). AI improved diagnostic accuracy for reader-rated low-confidence lesions, with increased PPV (24.7% AI vs 19.3%, P = 0.004) and specificity (57.8% vs 44.6%, P = 0.008). An AI decision support aid may help improve sonographic diagnostic accuracy, particularly in cases with low reader confidence, thereby decreasing false positives.
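The PPV and NPV compared above come straight from confusion-matrix counts; a minimal sketch (the function name and counts are illustrative, not the study's data):

```python
def ppv_npv(tp, fp, tn, fn):
    """Positive and negative predictive values from confusion-matrix counts."""
    return tp / (tp + fp), tn / (tn + fn)

# Illustrative counts: 45 true positives, 54 false positives,
# 101 true negatives, no false negatives:
print(ppv_npv(45, 54, 101, 0))  # PPV 45/99 ≈ 0.455, NPV = 1.0
```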
Affiliation(s)
- Samantha L Heller: Department of Radiology, New York University Grossman School of Medicine, New York, NY
71
Shia WC, Chen DR. Classification of malignant tumors in breast ultrasound using a pretrained deep residual network model and support vector machine. Comput Med Imaging Graph 2020;87:101829. PMID: 33302247; DOI: 10.1016/j.compmedimag.2020.101829.
Abstract
In this study, a transfer learning method was used to recognize and classify benign and malignant breast tumors in two-dimensional breast ultrasound (US) images, to decrease the effort expended by physicians and improve the quality of clinical diagnosis. A pretrained deep residual network was used to extract image features from the convolutional layers of the trained network, and a linear support vector machine (SVM) with a sequential minimal optimization solver was used to classify the extracted features. We used an image dataset of 2099 unlabeled two-dimensional breast US images collected from 543 patients (benign: 302, malignant: 241). The classification performance yielded a sensitivity of 94.34% and a specificity of 93.22% for malignant images (area under the curve = 0.938); the positive and negative predictive values were 92.6% and 94.8%, respectively. A comparison between the diagnoses made by physicians and the automated classification by the trained classifier showed that the latter had significantly better outcomes. This indicates the potential applicability of the proposed approach, which incorporates both a pretrained deep learning network and a well-trained classifier, for improving the quality and efficacy of clinical diagnosis.
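The second stage of this pipeline (a linear SVM over features extracted by a pretrained network) can be sketched in miniature. The sketch below swaps the SMO solver for plain stochastic subgradient descent on the regularized hinge loss, and the "features" are a hand-made toy set; the function names and hyperparameters are ours, not the authors':

```python
import random

def train_linear_svm(X, y, epochs=200, lr=0.01, lam=0.01):
    """SGD on the regularized hinge loss; y in {-1, +1}, rows of X are feature vectors."""
    w = [0.0] * len(X[0])
    b = 0.0
    rng = random.Random(0)
    idx = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(idx)
        for i in idx:
            margin = y[i] * (sum(wj * xj for wj, xj in zip(w, X[i])) + b)
            if margin < 1:  # point inside the margin: take a hinge-loss step
                w = [wj + lr * (y[i] * xj - lam * wj) for wj, xj in zip(w, X[i])]
                b += lr * y[i]
            else:           # otherwise only the regularizer contributes
                w = [wj - lr * lam * wj for wj in w]
    return w, b

def predict(w, b, x):
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1

# Toy "extracted features": two linearly separable clusters (benign = -1, malignant = +1).
X = [[1.0, 2.0], [1.5, 1.8], [5.0, 8.0], [6.0, 9.0]]
y = [-1, -1, 1, 1]
w, b = train_linear_svm(X, y)
print([predict(w, b, x) for x in X])
```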
Affiliation(s)
- Wei-Chung Shia: Molecular Medicine Laboratory, Department of Research, Changhua Christian Hospital, 8F., No. 235, XuGuang Road, Changhua, Taiwan
- Dar-Ren Chen: Comprehensive Breast Cancer Center, Changhua Christian Hospital, No. 135, NanXiao Street, Changhua, Taiwan
72
Zhou QQ, Wang J, Tang W, Hu ZC, Xia ZY, Li XS, Zhang R, Yin X, Zhang B, Zhang H. Automatic Detection and Classification of Rib Fractures on Thoracic CT Using Convolutional Neural Network: Accuracy and Feasibility. Korean J Radiol 2020;21:869-879. PMID: 32524787; PMCID: PMC7289688; DOI: 10.3348/kjr.2019.0651.
Abstract
Objective To evaluate the performance of a convolutional neural network (CNN) model that can automatically detect and classify rib fractures and output structured reports from computed tomography (CT) images. Materials and Methods This study included 1079 patients (median age, 55 years; men, 718) from three hospitals between January 2011 and January 2019, divided into a monocentric training set (n = 876; median age, 55 years; men, 582), five multicenter/multiparameter validation sets (n = 173; median age, 59 years; men, 118) with different slice thicknesses and image pixels, and a normal control set (n = 30; median age, 53 years; men, 18). Three classifications (fresh, healing, and old fracture) combined with fracture location (corresponding CT layers) were detected automatically and delivered in a structured report. Precision, recall, and F1-score were selected as the metrics for choosing the optimum CNN model. Detection/diagnosis time, precision, and sensitivity were used to compare the diagnostic efficiency of the structured report with that of experienced radiologists. Results A total of 25,054 annotations (fresh fracture, 10,089; healing fracture, 10,922; old fracture, 4,043) were labelled for training (18,584) and validation (6,470). The detection efficiency was higher for fresh and healing fractures than for old fractures (F1-scores, 0.849, 0.856, and 0.770, respectively; p = 0.023 for each), and the robustness of the model was good in the five multicenter/multiparameter validation sets (all mean F1-scores > 0.8 except validation set 5 [512 × 512 pixels; F1-score = 0.757]). With artificial intelligence-assisted diagnosis, the precision of the five radiologists improved from 80.3% to 91.1%, and their sensitivity increased from 62.4% to 86.3%. On average, the radiologists' diagnosis time was reduced by 73.9 seconds. Conclusion Our CNN model for automatic rib fracture detection could assist radiologists in improving diagnostic efficiency, reducing diagnosis time and radiologists' workload.
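The F1-score used above for model selection is simply the harmonic mean of precision and recall; a quick illustrative sketch (the example inputs are ours, not figures from the study):

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; defined as 0 when both are 0."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

print(round(f1_score(0.8, 0.7), 3))  # 0.747
```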
Affiliation(s)
- Qing Qing Zhou: Department of Radiology, The Affiliated Jiangning Hospital of Nanjing Medical University, Nanjing, China
- Jiashuo Wang: Research Center of Biostatistics and Computational Pharmacy, China Pharmaceutical University, Nanjing, China
- Wen Tang: FL 8, Ocean International Center E, Beijing, China
- Zhang Chun Hu: Department of Radiology, The Affiliated Jiangning Hospital of Nanjing Medical University, Nanjing, China
- Zi Yi Xia: Department of Radiology, The Affiliated Jiangning Hospital of Nanjing Medical University, Nanjing, China
- Xue Song Li: Department of Radiology, The Affiliated Jiangning Hospital of Nanjing Medical University, Nanjing, China
- Xindao Yin: Department of Radiology, Nanjing First Hospital, Nanjing Medical University, Nanjing, China
- Bing Zhang: Department of Radiology, The Affiliated Nanjing Drum Tower Hospital of Nanjing University Medical School, Nanjing, China
- Hong Zhang: Department of Radiology, The Affiliated Jiangning Hospital of Nanjing Medical University, Nanjing, China
73
Kim J, Kim HJ, Kim C, Kim WH. Artificial intelligence in breast ultrasonography. Ultrasonography 2020;40:183-190. PMID: 33430577; PMCID: PMC7994743; DOI: 10.14366/usg.20117.
Abstract
Although breast ultrasonography is the mainstay modality for differentiating between benign and malignant breast masses, it has intrinsic problems with false positives and substantial interobserver variability. Artificial intelligence (AI), particularly with deep learning models, is expected to improve workflow efficiency and serve as a second opinion. AI is highly useful for performing three main clinical tasks in breast ultrasonography: detection (localization/segmentation), differential diagnosis (classification), and prognostication (prediction). This article provides a current overview of AI applications in breast ultrasonography, with a discussion of methodological considerations in the development of AI models and an up-to-date literature review of potential clinical applications.
Affiliation(s)
- Jaeil Kim
- School of Computer Science and Engineering, Kyungpook National University, Daegu, Korea
- Hye Jung Kim
- Department of Radiology, School of Medicine, Kyungpook National University, Kyungpook National University Chilgok Hospital, Daegu, Korea
- Chanho Kim
- School of Computer Science and Engineering, Kyungpook National University, Daegu, Korea
- Won Hwa Kim
- Department of Radiology, School of Medicine, Kyungpook National University, Kyungpook National University Chilgok Hospital, Daegu, Korea
74
Park SH, Choi J, Byeon JS. Key principles of clinical validation, device approval, and insurance coverage decisions of artificial intelligence. Journal of the Korean Medical Association 2020. [DOI: 10.5124/jkma.2020.63.11.696]
Abstract
Artificial intelligence (AI) will likely affect various fields of medicine. This article aims to explain the fundamental principles of clinical validation, device approval, and insurance coverage decisions for AI algorithms used in medical diagnosis and prediction. Discrimination accuracy of AI algorithms is often evaluated with the Dice similarity coefficient, sensitivity, specificity, and traditional or free-response receiver operating characteristic curves. Calibration accuracy should also be assessed, especially for algorithms that provide probabilities to users. As current AI algorithms have limited generalizability to real-world practice, clinical validation of AI should subject it to proper external testing in its intended assistive role. External testing can adopt a diagnostic case-control or a diagnostic cohort design: a diagnostic case-control study evaluates the technical validity/accuracy of AI, whereas a diagnostic cohort study tests its clinical validity/accuracy in samples representing the target patients in real-world clinical scenarios. Ultimate clinical validation of AI requires evaluation of its impact on patient outcomes, referred to as clinical utility, for which randomized clinical trials are ideal. Device approval of AI is typically granted on proof of technical validity/accuracy and thus is not intended to directly indicate whether AI is beneficial for patient care or improves patient outcomes. Neither can it categorically address the issue of limited generalizability of AI. After device approval is achieved, it is up to medical professionals to determine whether the approved AI algorithms are beneficial for real-world patient care. Insurance coverage decisions generally require a demonstration of clinical utility, that is, evidence that the use of AI has improved patient outcomes.
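As a concrete illustration of the discrimination metrics mentioned above, the Dice similarity coefficient between a predicted and a reference segmentation can be sketched as follows. The masks here are toy sets of pixel coordinates; real evaluations operate on image arrays, but the formula is the same:

```python
# Dice similarity coefficient: 2|A∩B| / (|A| + |B|), where A is the
# predicted segmentation mask and B is the reference (ground-truth) mask.
def dice(pred: set, truth: set) -> float:
    if not pred and not truth:
        return 1.0  # both masks empty: treat as perfect agreement
    return 2 * len(pred & truth) / (len(pred) + len(truth))

# Toy masks as sets of (row, col) pixel coordinates:
pred = {(0, 0), (0, 1), (1, 0)}
truth = {(0, 0), (0, 1), (1, 1)}
print(round(dice(pred, truth), 3))  # 2*2 / (3+3)
```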
75
Wei M, Du Y, Wu X, Su Q, Zhu J, Zheng L, Lv G, Zhuang J. A Benign and Malignant Breast Tumor Classification Method via Efficiently Combining Texture and Morphological Features on Ultrasound Images. Computational and Mathematical Methods in Medicine 2020;2020:5894010. [PMID: 33062038] [PMCID: PMC7547332] [DOI: 10.1155/2020/5894010]
Abstract
The classification of breast tumors as benign or malignant from ultrasound images is of great value because breast cancer is an enormous threat to women's health worldwide. Although texture and morphological features are both crucial representations of ultrasound breast tumor images, their straightforward combination does little to improve benign/malignant classification, because the high-dimensional texture features are so dominant that they drown out the effect of the low-dimensional morphological features. An efficient method for combining texture and morphological features is therefore proposed to improve benign/malignant classification. First, texture features (local binary patterns (LBP), histogram of oriented gradients (HOG), and gray-level co-occurrence matrices (GLCM)) and morphological features (shape complexities) are extracted from breast ultrasound images. Second, a support vector machine (SVM) classifier is trained on the texture features and a naive Bayes (NB) classifier is designed for the morphological features, so that the discriminative power of each feature type is exerted separately. Third, the classification scores of the two classifiers (SVM and NB) are fused with a weighted sum to obtain the final classification result. Combined with the high-dimensional parametric SVM classifier, the low-dimensional nonparametric NB classifier effectively controls the parameter complexity of the entire classification system, so the texture and morphological features are efficiently combined. In comprehensive experiments, the proposed method achieves 91.11% accuracy, 94.34% sensitivity, and 86.49% specificity, outperforming many related benign and malignant breast tumor classification methods.
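The score-level fusion described above can be sketched as a weighted sum of the two classifiers' outputs. The probabilities, the weight w, and the 0.5 decision threshold below are illustrative assumptions; the abstract does not specify the paper's actual weighting scheme:

```python
# Score-level fusion of two classifiers: an SVM score from texture
# features and a naive Bayes score from morphological features are
# combined by a weighted sum, then thresholded for the final decision.
def fuse(svm_prob: float, nb_prob: float, w: float = 0.6) -> str:
    combined = w * svm_prob + (1 - w) * nb_prob  # weighted fusion
    return "malignant" if combined >= 0.5 else "benign"

# Hypothetical malignancy probabilities from the two classifiers:
print(fuse(svm_prob=0.8, nb_prob=0.2))  # 0.6*0.8 + 0.4*0.2 = 0.56
```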
Affiliation(s)
- Mengwan Wei
- College of Engineering, Huaqiao University, Quanzhou 362021, China
- Yongzhao Du
- College of Engineering, Huaqiao University, Quanzhou 362021, China
- School of Medicine, Huaqiao University, Quanzhou 362021, China
- Collaborative Innovation Center for Maternal and Infant Health Service Application Technology, Quanzhou Medical College, Quanzhou, China
- Xiuming Wu
- The First Hospital of Quanzhou, Fujian Medical University, Quanzhou 350005, China
- Qichen Su
- Collaborative Innovation Center for Maternal and Infant Health Service Application Technology, Quanzhou Medical College, Quanzhou, China
- Department of Medical Ultrasonics, The Second Affiliated Hospital of Fujian Medical University, Quanzhou 362000, China
- Jianqing Zhu
- College of Engineering, Huaqiao University, Quanzhou 362021, China
- Lixin Zheng
- College of Engineering, Huaqiao University, Quanzhou 362021, China
- Guorong Lv
- Collaborative Innovation Center for Maternal and Infant Health Service Application Technology, Quanzhou Medical College, Quanzhou, China
- Department of Medical Ultrasonics, The Second Affiliated Hospital of Fujian Medical University, Quanzhou 362000, China
- Jiafu Zhuang
- Quanzhou Institute of Equipment Manufacturing, Haixi Institutes, Chinese Academy of Sciences, 362216 Quanzhou, China
76
Xiao M, Zhao C, Li J, Zhang J, Liu H, Wang M, Ouyang Y, Zhang Y, Jiang Y, Zhu Q. Diagnostic Value of Breast Lesions Between Deep Learning-Based Computer-Aided Diagnosis System and Experienced Radiologists: Comparison the Performance Between Symptomatic and Asymptomatic Patients. Front Oncol 2020;10:1070. [PMID: 32733799] [PMCID: PMC7358588] [DOI: 10.3389/fonc.2020.01070]
Abstract
Purpose: The purpose of this study was to compare the diagnostic performance for breast lesions between a deep learning-based computer-aided diagnosis (deep learning-based CAD) system and experienced radiologists, and to compare performance between symptomatic and asymptomatic patients. Methods: From January to December 2018, a total of 451 breast lesions in 389 consecutive patients (mean age 46.86 ± 13.03 years, range 19-84 years) were examined by both ultrasound and the deep learning-based CAD system; all lesions were biopsied and the pathological results obtained. The lesions were diagnosed by two experienced radiologists according to the fifth edition of the Breast Imaging Reporting and Data System (BI-RADS). The final deep learning-based CAD assessments were dichotomized as possibly benign or possibly malignant. The diagnostic performances of the radiologists and the deep learning-based CAD were calculated and compared for asymptomatic and symptomatic patients. Results: There were 206 asymptomatic screening patients with 235 lesions (mean age 45.06 ± 10.90 years, range 21-73 years) and 183 symptomatic patients with 216 lesions (mean age 50.03 ± 14.97 years, range 19-84 years). The sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and accuracy of the deep learning-based CAD in asymptomatic patients were 93.8%, 83.9%, 75.0%, 96.3%, and 87.2%, respectively, with an area under the receiver operating characteristic curve (AUC) of 0.89. In asymptomatic patients, the specificity (83.9 vs. 66.5%, p < 0.001), PPV (75.0 vs. 59.4%, p = 0.013), accuracy (87.2 vs. 76.2%, p = 0.002) and AUC (0.89 vs. 0.81, p = 0.0013) of CAD were all significantly higher than those of the experienced radiologists. The sensitivity (93.8 vs. 80.0%), specificity (83.9 vs. 61.8%), accuracy (87.2 vs. 73.6%) and AUC (0.89 vs. 0.71) of CAD were all higher for asymptomatic patients than for symptomatic patients.
If the BI-RADS 4a lesions diagnosed by the radiologists in asymptomatic patients had been downgraded to BI-RADS 3 according to the CAD, then 54.8% (23/42) of those lesions would have avoided biopsy without any malignancy being missed. Conclusion: The deep learning-based CAD system performed better in asymptomatic patients than in symptomatic patients and could be a promising complementary tool to ultrasound, increasing diagnostic specificity and avoiding unnecessary biopsies in asymptomatic screening patients.
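The statistics reported above (sensitivity, specificity, PPV, NPV, accuracy) all follow from a 2x2 contingency table of test result against biopsy outcome. A sketch with invented counts, not the study's data:

```python
# Diagnostic metrics from a 2x2 contingency table:
# tp/fn = malignant lesions called positive/negative,
# tn/fp = benign lesions called negative/positive.
def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }

# Hypothetical counts for an asymptomatic screening group:
m = diagnostic_metrics(tp=90, fp=26, fn=6, tn=113)
print({k: round(v, 3) for k, v in m.items()})
```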
Affiliation(s)
- Mengsu Xiao
- Department of Ultrasound, Chinese Academy of Medical Sciences and Peking Union Medical College Hospital, Beijing, China
- Chenyang Zhao
- Department of Ultrasound, Chinese Academy of Medical Sciences and Peking Union Medical College Hospital, Beijing, China
- Jianchu Li
- Department of Ultrasound, Chinese Academy of Medical Sciences and Peking Union Medical College Hospital, Beijing, China
- Jing Zhang
- Department of Ultrasound, Chinese Academy of Medical Sciences and Peking Union Medical College Hospital, Beijing, China
- He Liu
- Department of Ultrasound, Chinese Academy of Medical Sciences and Peking Union Medical College Hospital, Beijing, China
- Ming Wang
- Department of Ultrasound, Chinese Academy of Medical Sciences and Peking Union Medical College Hospital, Beijing, China
- Yunshu Ouyang
- Department of Ultrasound, Chinese Academy of Medical Sciences and Peking Union Medical College Hospital, Beijing, China
- Yixiu Zhang
- Department of Ultrasound, Chinese Academy of Medical Sciences and Peking Union Medical College Hospital, Beijing, China
- Yuxin Jiang
- Department of Ultrasound, Chinese Academy of Medical Sciences and Peking Union Medical College Hospital, Beijing, China
- Qingli Zhu
- Department of Ultrasound, Chinese Academy of Medical Sciences and Peking Union Medical College Hospital, Beijing, China
77
Kise Y, Shimizu M, Ikeda H, Fujii T, Kuwada C, Nishiyama M, Funakoshi T, Ariji Y, Fujita H, Katsumata A, Yoshiura K, Ariji E. Usefulness of a deep learning system for diagnosing Sjögren's syndrome using ultrasonography images. Dentomaxillofac Radiol 2020;49:20190348. [PMID: 31804146] [PMCID: PMC7068075] [DOI: 10.1259/dmfr.20190348]
Abstract
OBJECTIVES We evaluated the diagnostic performance of a deep learning system for the detection of Sjögren's syndrome (SjS) in ultrasonography (US) images, and compared it with the performance of inexperienced radiologists. METHODS 100 patients with a confirmed diagnosis of SjS according to both the Japanese criteria and American-European Consensus Group criteria and 100 non-SjS patients that had a dry mouth and suspected SjS but were definitively diagnosed as non-SjS were enrolled in this study. All the patients underwent US scans of both the parotid glands (PG) and submandibular glands (SMG). The training group consisted of 80 SjS patients and 80 non-SjS patients, whereas the test group consisted of 20 SjS patients and 20 non-SjS patients for deep learning analysis. The performance of the deep learning system for diagnosing SjS from the US images was compared with the diagnoses made by three inexperienced radiologists. RESULTS The accuracy, sensitivity and specificity of the deep learning system for the PG were 89.5, 90.0 and 89.0%, respectively, and those for the inexperienced radiologists were 76.7, 67.0 and 86.3%, respectively. The deep learning system results for the SMG were 84.0, 81.0 and 87.0%, respectively, and those for the inexperienced radiologists were 72.0, 78.0 and 66.0%, respectively. The AUC for the inexperienced radiologists was significantly different from that of the deep learning system. CONCLUSIONS The deep learning system had a high diagnostic ability for SjS. This suggests that deep learning could be used for diagnostic support when interpreting US images.
Affiliation(s)
- Yoshitaka Kise
- Department of Oral and Maxillofacial Radiology, Aichi Gakuin University, Nagoya, Japan
- Mayumi Shimizu
- Department of Oral and Maxillofacial Radiology, Kyushu University Hospital, Fukuoka, Japan
- Haruka Ikeda
- Department of Oral and Maxillofacial Radiology, Aichi Gakuin University, Nagoya, Japan
- Takeshi Fujii
- Department of Oral and Maxillofacial Radiology, Aichi Gakuin University, Nagoya, Japan
- Chiaki Kuwada
- Department of Oral and Maxillofacial Radiology, Aichi Gakuin University, Nagoya, Japan
- Masako Nishiyama
- Department of Oral and Maxillofacial Radiology, Aichi Gakuin University, Nagoya, Japan
- Takuma Funakoshi
- Department of Oral and Maxillofacial Radiology, Aichi Gakuin University, Nagoya, Japan
- Yoshiko Ariji
- Department of Oral and Maxillofacial Radiology, Aichi Gakuin University, Nagoya, Japan
- Hiroshi Fujita
- Department of Electrical, Electronic and Computer Engineering, Faculty of Engineering, Gifu University, Gifu, Japan
- Akitoshi Katsumata
- Department of Oral Radiology, Asahi University School of Dentistry, Mizuho, Japan
- Kazunori Yoshiura
- Department of Oral and Maxillofacial Radiology, Faculty of Dental Science, Kyushu University, Fukuoka, Japan
- Eiichiro Ariji
- Department of Oral and Maxillofacial Radiology, Aichi Gakuin University, Nagoya, Japan
78
Choe YH. Characteristics of Recent Articles Published in the Korean Journal of Radiology Based on the Citation Frequency. Korean J Radiol 2020;21:1284. [PMID: 33236548] [PMCID: PMC7689137] [DOI: 10.3348/kjr.2020.1322]
Affiliation(s)
- Yeon Hyeon Choe
- Department of Radiology, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Korea
- HVSI Imaging Center, Heart Vascular Stroke Institute, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Korea
79
Berg WA, Vourtsis A. Screening Breast Ultrasound Using Handheld or Automated Technique in Women with Dense Breasts. Journal of Breast Imaging 2019;1:283-296. [PMID: 38424808] [DOI: 10.1093/jbi/wbz055]
Abstract
In women with dense breasts (heterogeneously or extremely dense), adding screening ultrasound to mammography increases detection of node-negative invasive breast cancer. Similar incremental cancer detection rates averaging 2.1-2.7 per 1000 have been observed for physician- and technologist-performed handheld ultrasound (HHUS) and automated ultrasound (AUS). Adding screening ultrasound (US) for women with dense breasts significantly reduces interval cancer rates. Training is critical before interpreting examinations for both modalities, and a learning curve to achieve optimal performance has been observed. On average, about 3% of women will be recommended for biopsy on the prevalence round because of screening US, with a wide range of 2%-30% malignancy rates for suspicious findings seen only on US. Breast Imaging Reporting and Data System 3 lesions identified only on screening HHUS can be safely followed at 1 year rather than 6 months. Computer-aided detection and diagnosis software can augment performance of AUS and HHUS; ongoing research on machine learning and deep learning algorithms will likely improve outcomes and workflow with screening US.
Affiliation(s)
- Wendie A Berg
- University of Pittsburgh School of Medicine, Magee-Womens Hospital of the University of Pittsburgh School of Medicine, Department of Radiology, Pittsburgh, PA
- Athina Vourtsis
- Diagnostic Mammography Medical Diagnostic Imaging Unit, Athens, Greece
80
Xiao M, Zhao C, Zhu Q, Zhang J, Liu H, Li J, Jiang Y. An investigation of the classification accuracy of a deep learning framework-based computer-aided diagnosis system in different pathological types of breast lesions. J Thorac Dis 2019;11:5023-5031. [PMID: 32030218] [DOI: 10.21037/jtd.2019.12.10]
Abstract
Background Deep learning-based computer-aided diagnosis (CAD) is an important method of aiding radiologists' diagnoses. We investigated the accuracy of a deep learning-based CAD in classifying breast lesions of different histological types. Methods A total of 448 breast lesions were detected on ultrasound (US) and classified by an experienced radiologist, a resident, and deep learning-based CAD. The pathological results of the lesions were taken as the gold standard, and the diagnostic performances of the three raters were analyzed across pathological types. Results For overall diagnostic performance, deep learning-based CAD showed a significantly higher specificity (76.96%) than the two radiologists. The area under the ROC curve of CAD was equal to that of the experienced radiologist (0.81 vs. 0.81) and significantly higher than that of the resident (0.81 vs. 0.70, P<0.0001). Among benign lesions, deep learning-based CAD was more accurate than both radiologists, correctly classifying as benign 119/135 fibroadenomas (88.1%), 25/35 cases of adenosis (71.4%), 14/27 intraductal papillary tumors (51.9%), 5/10 cases of inflammation (50%), and 4/8 cases of sclerosing adenosis (50%). However, only the differences between CAD and the two radiologists for fibroadenomas (P=0.0011 and P=0.0313) and between CAD and the resident for adenosis (P=0.012) were statistically significant. Among malignant lesions, 151/168 invasive ductal carcinomas (89.9%), 21/29 ductal carcinomas in situ (DCIS) (72.4%), and 6/7 invasive lobular carcinomas (85.7%) were diagnosed as malignancies by deep learning-based CAD, with no significant differences between CAD and the two radiologists. Conclusions In the diagnosis of these common types of breast lesions, deep learning-based CAD performed satisfactorily.
It performed particularly well in benign breast lesions, especially fibroadenomas and adenosis. Deep learning-based CAD is therefore a promising supplemental tool to US for increasing specificity and avoiding unnecessary benign biopsies.
Affiliation(s)
- Mengsu Xiao
- Department of Ultrasound, Chinese Academy of Medical Sciences and Peking Union Medical College Hospital, Beijing 100730, China
- Chenyang Zhao
- Department of Ultrasound, Chinese Academy of Medical Sciences and Peking Union Medical College Hospital, Beijing 100730, China
- Qingli Zhu
- Department of Ultrasound, Chinese Academy of Medical Sciences and Peking Union Medical College Hospital, Beijing 100730, China
- Jing Zhang
- Department of Ultrasound, Chinese Academy of Medical Sciences and Peking Union Medical College Hospital, Beijing 100730, China
- He Liu
- Department of Ultrasound, Chinese Academy of Medical Sciences and Peking Union Medical College Hospital, Beijing 100730, China
- Jianchu Li
- Department of Ultrasound, Chinese Academy of Medical Sciences and Peking Union Medical College Hospital, Beijing 100730, China
- Yuxin Jiang
- Department of Ultrasound, Chinese Academy of Medical Sciences and Peking Union Medical College Hospital, Beijing 100730, China
81
Liu X, Faes L, Kale AU, Wagner SK, Fu DJ, Bruynseels A, Mahendiran T, Moraes G, Shamdas M, Kern C, Ledsam JR, Schmid MK, Balaskas K, Topol EJ, Bachmann LM, Keane PA, Denniston AK. A comparison of deep learning performance against health-care professionals in detecting diseases from medical imaging: a systematic review and meta-analysis. Lancet Digit Health 2019;1:e271-e297. [PMID: 33323251] [DOI: 10.1016/s2589-7500(19)30123-2]
Abstract
BACKGROUND Deep learning offers considerable promise for medical diagnostics. We aimed to evaluate the diagnostic accuracy of deep learning algorithms versus health-care professionals in classifying diseases using medical imaging. METHODS In this systematic review and meta-analysis, we searched Ovid-MEDLINE, Embase, Science Citation Index, and Conference Proceedings Citation Index for studies published from Jan 1, 2012, to June 6, 2019. Studies comparing the diagnostic performance of deep learning models and health-care professionals based on medical imaging, for any disease, were included. We excluded studies that used medical waveform data graphics material or investigated the accuracy of image segmentation rather than disease classification. We extracted binary diagnostic accuracy data and constructed contingency tables to derive the outcomes of interest: sensitivity and specificity. Studies undertaking an out-of-sample external validation were included in a meta-analysis, using a unified hierarchical model. This study is registered with PROSPERO, CRD42018091176. FINDINGS Our search identified 31 587 studies, of which 82 (describing 147 patient cohorts) were included. 69 studies provided enough data to construct contingency tables, enabling calculation of test accuracy, with sensitivity ranging from 9·7% to 100·0% (mean 79·1%, SD 0·2) and specificity ranging from 38·9% to 100·0% (mean 88·3%, SD 0·1). An out-of-sample external validation was done in 25 studies, of which 14 made the comparison between deep learning models and health-care professionals in the same sample. 
Comparison of the performance of deep learning models and health-care professionals in these 14 studies, restricting the analysis to the contingency table for each study reporting the highest accuracy, found a pooled sensitivity of 87·0% (95% CI 83·0-90·2) for deep learning models and 86·4% (79·9-91·0) for health-care professionals, and a pooled specificity of 92·5% (95% CI 85·1-96·4) for deep learning models and 90·5% (80·6-95·7) for health-care professionals. INTERPRETATION Our review found the diagnostic performance of deep learning models to be equivalent to that of health-care professionals. However, a major finding of the review is that few studies presented externally validated results or compared the performance of deep learning models and health-care professionals using the same sample. Additionally, poor reporting is prevalent in deep learning studies, which limits reliable interpretation of the reported diagnostic accuracy. New reporting standards that address specific challenges of deep learning could improve future studies, enabling greater confidence in the results of future evaluations of this promising technology. FUNDING None.
Affiliation(s)
- Xiaoxuan Liu
- Department of Ophthalmology, University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK; Academic Unit of Ophthalmology, Institute of Inflammation & Ageing, College of Medical and Dental Sciences, University of Birmingham, Birmingham, UK; Medical Retina Department, Moorfields Eye Hospital NHS Foundation Trust, London, UK; Health Data Research UK, London, UK
- Livia Faes
- Medical Retina Department, Moorfields Eye Hospital NHS Foundation Trust, London, UK; Eye Clinic, Cantonal Hospital of Lucerne, Lucerne, Switzerland
- Aditya U Kale
- Department of Ophthalmology, University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK
- Siegfried K Wagner
- NIHR Biomedical Research Centre for Ophthalmology, Moorfields Eye Hospital NHS Foundation Trust and UCL Institute of Ophthalmology, London, UK
- Dun Jack Fu
- Medical Retina Department, Moorfields Eye Hospital NHS Foundation Trust, London, UK
- Alice Bruynseels
- Department of Ophthalmology, University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK
- Thushika Mahendiran
- Department of Ophthalmology, University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK
- Gabriella Moraes
- Medical Retina Department, Moorfields Eye Hospital NHS Foundation Trust, London, UK
- Mohith Shamdas
- Academic Unit of Ophthalmology, Institute of Inflammation & Ageing, College of Medical and Dental Sciences, University of Birmingham, Birmingham, UK
- Christoph Kern
- Medical Retina Department, Moorfields Eye Hospital NHS Foundation Trust, London, UK; University Eye Hospital, Ludwig Maximilian University of Munich, Munich, Germany
- Martin K Schmid
- Eye Clinic, Cantonal Hospital of Lucerne, Lucerne, Switzerland
- Konstantinos Balaskas
- Medical Retina Department, Moorfields Eye Hospital NHS Foundation Trust, London, UK; NIHR Biomedical Research Centre for Ophthalmology, Moorfields Eye Hospital NHS Foundation Trust and UCL Institute of Ophthalmology, London, UK
- Eric J Topol
- Scripps Research Translational Institute, La Jolla, California
- Pearse A Keane
- NIHR Biomedical Research Centre for Ophthalmology, Moorfields Eye Hospital NHS Foundation Trust and UCL Institute of Ophthalmology, London, UK; Health Data Research UK, London, UK
- Alastair K Denniston
- Department of Ophthalmology, University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK; Academic Unit of Ophthalmology, Institute of Inflammation & Ageing, College of Medical and Dental Sciences, University of Birmingham, Birmingham, UK; Centre for Patient Reported Outcome Research, Institute of Applied Health Research, University of Birmingham, Birmingham, UK; NIHR Biomedical Research Centre for Ophthalmology, Moorfields Eye Hospital NHS Foundation Trust and UCL Institute of Ophthalmology, London, UK; Health Data Research UK, London, UK.
82
Choe YH. A Glimpse on Trends and Characteristics of Recent Articles Published in the Korean Journal of Radiology. Korean J Radiol 2019;20:1555-1561. [PMID: 31854145] [PMCID: PMC6923209] [DOI: 10.3348/kjr.2019.0928]
Affiliation(s)
- Yeon Hyeon Choe
- Department of Radiology, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Korea; Heart Vascular Stroke Institute, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Korea