1
de Vries CF, Staff RT, Dymiter JA, Boyle M, Anderson LA, Lip G. Service and clinical impacts of reader bias in breast cancer screening: a retrospective study. Br J Radiol 2024; 97:120-125. [PMID: 38263824 PMCID: PMC11027282 DOI: 10.1093/bjr/tqad024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/11/2023] [Revised: 09/27/2023] [Accepted: 09/29/2023] [Indexed: 01/25/2024] Open
Abstract
OBJECTIVES To determine factors influencing reader agreement in breast screening and investigate the relationship between agreement level and patient outcomes. METHODS Reader pair agreement for 83 265 sets of mammograms from the Scottish Breast Screening service (2015-2020) was evaluated using Cohen's kappa statistic. Each mammography examination was read by two readers, per routine screening practice, with the second initially blinded but able to choose to view the first reader's opinion. If the two readers disagreed, a third reader arbitrated. Variation in reader agreement was examined by: whether the reader acted as the first or second reader, reader experience, and recall, cancer detection and arbitration recall rate. RESULTS Readers' opinions varied by whether they acted as the first or second reader. Furthermore, reader 2 was more likely to agree with reader 1 if reader 1 was more experienced than they were, and less likely to agree if they themselves were more experienced than reader 1 (P < .001). Agreement was not significantly associated with cancer detection rate, overall recall rate or arbitration recall rates (P > .05). Lower agreement between readers led to a higher arbiter workload (P < .001). CONCLUSIONS In mammography screening, the second reader's opinion is influenced by the first reader's opinion, with the degree of influence dependent on the readers' relative experience levels. ADVANCES IN KNOWLEDGE While less-experienced readers relied on their more experienced reading partner, no adverse impact on service outcomes was observed. Allowing access to the first reader's opinion may benefit newly qualified readers, but reduces independent evaluation, which may lower cancer detection rates.
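The agreement statistic named in the METHODS above, Cohen's kappa, can be illustrated with a minimal sketch. The function name `cohens_kappa` and the ten reader opinions are hypothetical, made up for illustration only; they are not code or data from the study.

```python
from collections import Counter


def cohens_kappa(reader1, reader2):
    """Cohen's kappa for two equal-length sequences of categorical opinions."""
    if len(reader1) != len(reader2) or not reader1:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(reader1)
    # Observed agreement: fraction of examinations with identical opinions.
    observed = sum(a == b for a, b in zip(reader1, reader2)) / n
    # Chance agreement: product of each reader's marginal rates, summed over categories.
    c1, c2 = Counter(reader1), Counter(reader2)
    expected = sum(c1[k] * c2[k] for k in set(c1) | set(c2)) / (n * n)
    if expected == 1.0:
        return 1.0  # both readers used one identical category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical opinions for ten examinations (illustrative only).
r1 = ["recall", "normal", "normal", "recall", "normal",
      "normal", "normal", "recall", "normal", "normal"]
r2 = ["recall", "normal", "normal", "normal", "normal",
      "normal", "recall", "recall", "normal", "normal"]
kappa = cohens_kappa(r1, r2)
print(f"kappa = {kappa:.3f}")
```

Here the readers agree on 8 of 10 examinations, but because both call "normal" most of the time, a large share of that agreement is expected by chance, which is what kappa corrects for.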
Affiliation(s)
- Clarisse F de Vries
  - Aberdeen Centre for Health Data Science, University of Aberdeen, Aberdeen AB25 2ZD, United Kingdom
  - Aberdeen Biomedical Imaging Centre, University of Aberdeen, Aberdeen AB25 2ZN, United Kingdom
- Roger T Staff
  - National Health Service Grampian (NHSG), Aberdeen Royal Infirmary, Aberdeen AB25 2ZN, United Kingdom
- Jaroslaw A Dymiter
  - Grampian Data Safe Haven (DaSH), University of Aberdeen, Aberdeen AB25 2ZD, United Kingdom
- Moragh Boyle
  - Aberdeen Centre for Health Data Science, University of Aberdeen, Aberdeen AB25 2ZD, United Kingdom
- Lesley A Anderson
  - Aberdeen Centre for Health Data Science, University of Aberdeen, Aberdeen AB25 2ZD, United Kingdom
- Gerald Lip
  - National Health Service Grampian (NHSG), Aberdeen Royal Infirmary, Aberdeen AB25 2ZN, United Kingdom
  - North East Scotland Breast Screening Centre, Aberdeen AB25 2XF, United Kingdom
2
Kim H, Choi JS, Kim K, Ko ES, Ko EY, Han BK. Effect of artificial intelligence-based computer-aided diagnosis on the screening outcomes of digital mammography: a matched cohort study. Eur Radiol 2023; 33:7186-7198. [PMID: 37188881 DOI: 10.1007/s00330-023-09692-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/21/2022] [Revised: 02/21/2023] [Accepted: 03/09/2023] [Indexed: 05/17/2023]
Abstract
OBJECTIVE To investigate whether artificial intelligence-based computer-aided diagnosis (AI-CAD) can improve radiologists' performance when used to support radiologists' interpretation of digital mammography (DM) in breast cancer screening. METHODS A retrospective database search identified 3158 asymptomatic Korean women who consecutively underwent screening DM between January and December 2019 without AI-CAD support, and screening DM between February and July 2020 with image interpretation aided by AI-CAD in a tertiary referral hospital using single reading. Propensity score matching was used to match the DM with AI-CAD group in a 1:1 ratio with the DM without AI-CAD group according to age, breast density, experience level of the interpreting radiologist, and screening round. Performance measures were compared with the McNemar test and generalized estimating equations. RESULTS A total of 1579 women who underwent DM with AI-CAD were matched with 1579 women who underwent DM without AI-CAD. Radiologists showed higher specificity (96% [1500 of 1563] vs 91.6% [1430 of 1561]; p < 0.001) and lower abnormal interpretation rates (AIR) (4.9% [77 of 1579] vs 9.2% [145 of 1579]; p < 0.001) with AI-CAD than without. There was no significant difference in the cancer detection rate (CDR) (AI-CAD vs no AI-CAD, 8.9 vs 8.9 per 1000 examinations; p = 0.999), sensitivity (87.5% vs 77.8%; p = 0.999), and positive predictive value for biopsy (PPV3) (35.0% vs 35.0%; p = 0.999) according to AI-CAD support. CONCLUSIONS AI-CAD increases the specificity for radiologists without decreasing sensitivity as a supportive tool in the single reading of DM for breast cancer screening. CLINICAL RELEVANCE STATEMENT This study shows that AI-CAD could improve the specificity of radiologists' DM interpretation in the single reading system without decreasing sensitivity, suggesting that it can benefit patients by reducing false positive and recall rates. 
KEY POINTS • In this retrospective-matched cohort study (DM without AI-CAD vs DM with AI-CAD), radiologists showed higher specificity and lower AIR when AI-CAD was used to support decision-making in DM screening. • CDR, sensitivity, and PPV for biopsy did not differ with and without AI-CAD support.
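The specificity and abnormal interpretation rate (AIR) percentages quoted in the RESULTS above follow directly from the bracketed counts. A short sketch that recomputes them; the helper `pct` and the dictionary layout are illustrative assumptions, while the counts are copied from the abstract.

```python
def pct(numerator, denominator):
    """Return numerator/denominator as a percentage."""
    return 100.0 * numerator / denominator


# Counts as quoted in the abstract: (numerator, denominator).
counts = {
    "specificity, with AI-CAD": (1500, 1563),
    "specificity, without AI-CAD": (1430, 1561),
    "AIR, with AI-CAD": (77, 1579),
    "AIR, without AI-CAD": (145, 1579),
}

# Round to one decimal place, matching the precision used in the abstract.
rates = {label: round(pct(num, den), 1) for label, (num, den) in counts.items()}

for label, value in rates.items():
    print(f"{label}: {value}%")
```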
Affiliation(s)
- Haejung Kim
  - Department of Radiology and Center for Imaging Science, Samsung Medical Center, Sungkyunkwan University School of Medicine, 81 Irwon-Ro, Gangnam-Gu, Seoul, 06351, Korea
- Ji Soo Choi
  - Department of Radiology and Center for Imaging Science, Samsung Medical Center, Sungkyunkwan University School of Medicine, 81 Irwon-Ro, Gangnam-Gu, Seoul, 06351, Korea
  - Department of Digital Health, SAIHST, Sungkyunkwan University, Seoul, Korea
- Kyunga Kim
  - Department of Digital Health, SAIHST, Sungkyunkwan University, Seoul, Korea
  - Biomedical Statistics Center, Research Institute for Future Medicine, Samsung Medical Center, Seoul, Korea
  - Department of Data Convergence & Future Medicine, Sungkyunkwan University School of Medicine, Seoul, Korea
- Eun Sook Ko
  - Department of Radiology and Center for Imaging Science, Samsung Medical Center, Sungkyunkwan University School of Medicine, 81 Irwon-Ro, Gangnam-Gu, Seoul, 06351, Korea
- Eun Young Ko
  - Department of Radiology and Center for Imaging Science, Samsung Medical Center, Sungkyunkwan University School of Medicine, 81 Irwon-Ro, Gangnam-Gu, Seoul, 06351, Korea
- Boo-Kyung Han
  - Department of Radiology and Center for Imaging Science, Samsung Medical Center, Sungkyunkwan University School of Medicine, 81 Irwon-Ro, Gangnam-Gu, Seoul, 06351, Korea
3
Vargas-Palacios A, Sharma N, Sagoo GS. Cost-effectiveness requirements for implementing artificial intelligence technology in the Women's UK Breast Cancer Screening service. Nat Commun 2023; 14:6110. [PMID: 37777510 PMCID: PMC10542368 DOI: 10.1038/s41467-023-41754-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/16/2023] [Accepted: 09/17/2023] [Indexed: 10/02/2023] Open
Abstract
The UK NHS Women's National Breast Screening programme aims to detect breast cancer early. The reference standard approach requires mammograms to be independently double-read by qualified radiology staff. If two readers disagree, arbitration by an independent reader is undertaken. Whilst this process maximises accuracy and minimises recall rates, the procedure is labour-intensive, adding pressure to a system currently facing a workforce crisis. Artificial intelligence technology offers an alternative to human readers. While artificial intelligence has been shown to be non-inferior versus human second readers, the minimum requirements needed (effectiveness, set-up costs, maintenance, etc) for such technology to be cost-effective in the NHS have not been evaluated. We developed a simulation model replicating NHS screening services to evaluate the potential value of the technology. Our results indicate that if non-inferiority is maintained, the use of artificial intelligence technology as a second reader is a viable and potentially cost-effective use of NHS resources.
Affiliation(s)
- Armando Vargas-Palacios
  - Academic Unit of Health Economics, University of Leeds, Leeds, UK
  - Centro de Investigación en Ciencias de la Salud, Universidad Anáhuac, Mexico, México
- Gurdeep S Sagoo
  - Academic Unit of Health Economics, University of Leeds, Leeds, UK
  - Population Health Sciences Institute, Faculty of Medical Sciences, Newcastle University, Newcastle upon Tyne, UK
4
Ponsiglione AM, Angelone F, Amato F, Sansone M. A Statistical Approach to Assess the Robustness of Radiomics Features in the Discrimination of Mammographic Lesions. J Pers Med 2023; 13:1104. [PMID: 37511717 PMCID: PMC10381882 DOI: 10.3390/jpm13071104] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/30/2023] [Revised: 07/01/2023] [Accepted: 07/05/2023] [Indexed: 07/30/2023] Open
Abstract
Despite mammography (MG) being among the most widespread techniques in breast cancer screening, tumour detection and classification remain challenging tasks due to the high morphological variability of the lesions. The extraction of radiomics features has proved to be a promising approach in MG. However, radiomics features can depend on factors such as acquisition protocol, segmentation accuracy, and feature extraction and engineering methods, which prevents the implementation of a robust and clinically reliable radiomics workflow in MG. In this study, the variability and robustness of radiomics features are investigated as a function of lesion segmentation in MG images from a public database. A statistical analysis is carried out to assess feature variability, and a radiomics robustness score is introduced based on the significance of the statistical tests performed. The results indicate that variability is observable not only as a function of the abnormality type (calcifications and masses), but also among feature categories (first-order and second-order), image views (craniocaudal and mediolateral oblique), and lesion types (benign and malignant). Furthermore, the proposed approach makes it possible to identify the radiomics features with higher discriminative power between benign and malignant lesions and lower dependency on segmentation, thus suggesting the most appropriate robust features to use as inputs to automated classification algorithms.
Affiliation(s)
- Alfonso Maria Ponsiglione
  - Department of Information Technology and Electrical Engineering, University of Naples Federico II, 80125 Naples, Italy
- Francesca Angelone
  - Department of Information Technology and Electrical Engineering, University of Naples Federico II, 80125 Naples, Italy
- Francesco Amato
  - Department of Information Technology and Electrical Engineering, University of Naples Federico II, 80125 Naples, Italy
- Mario Sansone
  - Department of Information Technology and Electrical Engineering, University of Naples Federico II, 80125 Naples, Italy
5
Hirai Y, Fujimoto A, Matsutani N, Murakami S, Nakajima Y, Miyanaga R, Nakazato Y, Watanabe K, Kikuchi M, Yahagi N. Evaluation of the visibility of bleeding points using red dichromatic imaging in endoscopic hemostasis for acute GI bleeding (with video). Gastrointest Endosc 2022; 95:692-700.e3. [PMID: 34762920 DOI: 10.1016/j.gie.2021.10.031] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 08/10/2021] [Accepted: 10/24/2021] [Indexed: 02/08/2023]
Abstract
BACKGROUND AND AIMS We aimed to clarify whether red dichromatic imaging (RDI), a new type of image-enhanced endoscopy, improves the visibility of bleeding points in acute GI bleeding (AGIB) compared with white-light imaging (WLI). METHODS Images and videos of bleeding points acquired with WLI and RDI during endoscopic hemostasis for AGIB were retrospectively compared. In images, the color difference between bleeding points and surrounding blood was analyzed. In videos, 4 expert and 4 trainee endoscopists evaluated the visibility on a scale of 1 (undetectable) to 4 (easily detectable). Furthermore, the correlation between the color difference and visibility score was evaluated. RESULTS We analyzed 64 lesions. The color difference was significantly higher in RDI (13.11 ± 4.02) than in WLI (7.38 ± 3.68, P < .001). The mean visibility score for all endoscopists was significantly higher in RDI (3.12 ± .51) compared with WLI (2.72 ± .50, P < .001); this was also observed in experts (3.18 ± .51 vs 2.79 ± .54, P < .001) and trainees (3.05 ± .54 vs 2.64 ± .47, P < .001). The color difference and visibility score were moderately correlated for all endoscopists (γ = .56, P < .001) and for experts (γ = .53, P < .001) and trainees (γ = .57, P < .001). CONCLUSIONS RDI improves the visibility of bleeding points in AGIB compared with WLI. RDI can help endoscopists at all levels of experience to recognize bleeding points by enhancing the color contrast relative to surrounding blood.
Affiliation(s)
- Yuichiro Hirai
  - Department of Gastroenterology, National Hospital Organization Tokyo Medical Center, Tokyo, Japan
  - Division of Research and Development for Minimally Invasive Treatment, Cancer Center, Keio University School of Medicine, Tokyo, Japan
- Ai Fujimoto
  - Department of Gastroenterology, National Hospital Organization Tokyo Medical Center, Tokyo, Japan
  - Division of Research and Development for Minimally Invasive Treatment, Cancer Center, Keio University School of Medicine, Tokyo, Japan
- Naomi Matsutani
  - Department of Gastroenterology, National Hospital Organization Tokyo Medical Center, Tokyo, Japan
- Soichiro Murakami
  - Department of Gastroenterology, National Hospital Organization Tokyo Medical Center, Tokyo, Japan
- Yuki Nakajima
  - Department of Gastroenterology, National Hospital Organization Tokyo Medical Center, Tokyo, Japan
- Ryoichi Miyanaga
  - Department of Gastroenterology, National Hospital Organization Tokyo Medical Center, Tokyo, Japan
- Yoshihiro Nakazato
  - Department of Gastroenterology, National Hospital Organization Tokyo Medical Center, Tokyo, Japan
- Kazuyo Watanabe
  - Department of Gastroenterology, National Hospital Organization Tokyo Medical Center, Tokyo, Japan
- Masahiro Kikuchi
  - Department of Gastroenterology, National Hospital Organization Tokyo Medical Center, Tokyo, Japan
- Naohisa Yahagi
  - Division of Research and Development for Minimally Invasive Treatment, Cancer Center, Keio University School of Medicine, Tokyo, Japan
6
Cheng Q, Ye S, Fu C, Zhou J, He X, Miao H, Xu N, Wang M. Quantitative evaluation of computed and voxelwise computed diffusion-weighted imaging in breast cancer. Br J Radiol 2019; 92:20180978. [PMID: 31291125 DOI: 10.1259/bjr.20180978] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 12/17/2022] Open
Abstract
OBJECTIVES To assess the value of computed diffusion-weighted imaging (cDWI) and voxelwise computed diffusion-weighted imaging (vcDWI) in breast cancer. METHODS This retrospective study involved 130 patients (age range, 25-70 years; mean age ± standard deviation, 48.6 ± 10.5 years) with 130 malignant lesions, who underwent MRI examinations, including a DWI sequence, prior to needle biopsy or surgery. cDWIs with higher b-values of 1500, 2000, 2500, 3000, 3500, and 4000 s/mm², and vcDWI, were generated from measured (m) DWI with two lower b-values of 0/600, 0/800, or 0/1000 s/mm². The signal-to-noise ratio (SNR) and contrast ratio (CR) of all image sets were computed and compared among different DWIs by two experienced radiologists independently. To better compare the CR with the SNR, the CR value was multiplied by 100 (CR100). RESULTS The CR of vcDWI and of all cDWIs except cDWI1000 differed significantly from that of measured diffusion-weighted imaging (mDWI) (cDWI1000: CR = 0.4904, p = 0.394; cDWI1500: CR = 0.5503, p = 0.006; cDWI2000: CR = 0.5889, p < 0.001; cDWI2500: CR = 0.6109, p < 0.001; cDWI3000: CR = 0.6214, p < 0.001; cDWI3500: CR = 0.6245, p < 0.001; cDWI4000: CR = 0.6228, p < 0.001). The vcDWI provided the highest CR, while the CRs of all cDWI image sets improved with increased b-values. The SNR of neither cDWI1000 nor vcDWI differed significantly from that of mDWI, but the mean SNRs of the remaining cDWIs were significantly lower than that of mDWI. The SNRs of cDWIs declined with increasing b-values, and the initial decrease at low b-values was steeper than the gradual attenuation at higher b-values; the CR100 rose gradually, and the two converged in the b-value interval of 1500-2000 s/mm². CONCLUSIONS The highest CR was achieved with vcDWI; this could be a promising approach for easier detection of breast cancer.
ADVANCES IN KNOWLEDGE This study comprehensively compared and evaluated the value of the emerging post-processing DWI techniques (including a set of cDWIs and vcDWI) in breast cancer.
Affiliation(s)
- Qingyuan Cheng
  - Department of Radiology, First Affiliated Hospital of Wenzhou Medical University, Wenzhou, China
- Shuxin Ye
  - Department of Radiology, First Affiliated Hospital of Wenzhou Medical University, Wenzhou, China
- Chuqi Fu
  - Department of Radiology, First Affiliated Hospital of Wenzhou Medical University, Wenzhou, China
- Jiejie Zhou
  - Department of Radiology, First Affiliated Hospital of Wenzhou Medical University, Wenzhou, China
- Xiaxia He
  - Department of Radiology, First Affiliated Hospital of Wenzhou Medical University, Wenzhou, China
- Haiwei Miao
  - Department of Radiology, First Affiliated Hospital of Wenzhou Medical University, Wenzhou, China
- Nina Xu
  - Department of Radiology, First Affiliated Hospital of Wenzhou Medical University, Wenzhou, China
- Meihao Wang
  - Department of Radiology, First Affiliated Hospital of Wenzhou Medical University, Wenzhou, China
7
Nakamura S, Hayashi K, Imaoka Y, Kitamura Y, Akazawa Y, Tabata K, Groen R, Tsuchiya T, Yamasaki N, Nagayasu T, Fukuoka J. Intratumoral heterogeneity of programmed cell death ligand-1 expression is common in lung cancer. PLoS One 2017; 12:e0186192. [PMID: 29049375 PMCID: PMC5648155 DOI: 10.1371/journal.pone.0186192] [Citation(s) in RCA: 39] [Impact Index Per Article: 5.6] [Received: 07/21/2017] [Accepted: 09/27/2017] [Indexed: 12/20/2022] Open
Abstract
Programmed cell death ligand-1 (PD-L1) expression may predict the response to both programmed cell death-1 and PD-L1 inhibitors in lung cancer. However, the extent of intratumoral heterogeneity of PD-L1 expression, which may cause false negative results, is largely unexplored. We aimed to assess the intratumoral heterogeneity of PD-L1 expression in surgically resected lung cancer specimens by applying a novel method of tissue microarray, namely Spiral Arrays, which enables us to observe the heterogeneity in spiral-shaped tissue cores. Adenocarcinoma and squamous cell carcinoma specimens were obtained from consecutive patients with lung cancer who had undergone surgical resection at Nagasaki University Hospital (Nagasaki, Japan) since 2009. Small cell lung cancer and large cell carcinoma specimens were selected from patients in the same archive who had undergone resection since 1998. Spiral Arrays were constructed of spiral-shaped cores, prepared from representative blocks of each case, which were subjected to immunohistochemistry using an anti-PD-L1 antibody. Each core was divided into 8 segments and each segment was classified as either PD-L1-positive or PD-L1-negative using thresholds of 1.0%, 5.0%, 10.0%, and 50.0%. In total, 138 specimens were selected, including 60 adenocarcinomas, 59 squamous cell carcinomas, 12 small cell lung cancers, and 7 large cell carcinomas. The majority of specimens with PD-L1-positive segments exhibited heterogeneous expression (i.e., had a mixture of PD-L1-positive and PD-L1-negative segments within a core) irrespective of the threshold (1.0%, 66.7%; 5.0%, 74.4%; 10.0%, 75.8%; and 50.0%, 85.7%). Large variations in the ratios of PD-L1-positive segments were observed. At least 50.0% of the segments within a core were negative in no fewer than 50.0% (range, 50.0–76.0%) of cases with heterogeneous PD-L1 expression.
In conclusion, intratumoral heterogeneity of PD-L1 expression was frequently observed in cases of lung cancer. Thus, multiple tumor biopsy specimens may be needed to accurately determine the PD-L1 expression status.
Affiliation(s)
- Sayuri Nakamura
  - Department of Pathology, Nagasaki University Graduate School of Biomedical Sciences, Nagasaki, Japan
- Kentaro Hayashi
  - Department of Pathology, Nagasaki University Graduate School of Biomedical Sciences, Nagasaki, Japan
- Yuki Imaoka
  - Department of Pathology, Nagasaki University Graduate School of Biomedical Sciences, Nagasaki, Japan
- Yuka Kitamura
  - Department of Pathology, Nagasaki University Graduate School of Biomedical Sciences, Nagasaki, Japan
- Yuko Akazawa
  - Department of Pathology, Nagasaki University Graduate School of Biomedical Sciences, Nagasaki, Japan
- Kazuhiro Tabata
  - Department of Pathology, Nagasaki University Graduate School of Biomedical Sciences, Nagasaki, Japan
- Ruben Groen
  - Department of Pathology, Nagasaki University Graduate School of Biomedical Sciences, Nagasaki, Japan
- Tomoshi Tsuchiya
  - Department of Surgical Oncology, Nagasaki University Graduate School of Biomedical Sciences, Nagasaki, Japan
- Naoya Yamasaki
  - Department of Surgical Oncology, Nagasaki University Graduate School of Biomedical Sciences, Nagasaki, Japan
- Takeshi Nagayasu
  - Department of Surgical Oncology, Nagasaki University Graduate School of Biomedical Sciences, Nagasaki, Japan
- Junya Fukuoka
  - Department of Pathology, Nagasaki University Graduate School of Biomedical Sciences, Nagasaki, Japan
8
Posso M, Carles M, Rué M, Puig T, Bonfill X. Cost-Effectiveness of Double Reading versus Single Reading of Mammograms in a Breast Cancer Screening Programme. PLoS One 2016; 11:e0159806. [PMID: 27459663 PMCID: PMC4961365 DOI: 10.1371/journal.pone.0159806] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Received: 03/31/2016] [Accepted: 07/10/2016] [Indexed: 11/18/2022] Open
Abstract
OBJECTIVES The usual practice in breast cancer screening programmes for mammogram interpretation is to perform double reading. However, little is known about its cost-effectiveness in the context of digital mammography. Our purpose was to evaluate the cost-effectiveness of double reading versus single reading of digital mammograms in a population-based breast cancer screening programme. METHODS Data from 28,636 screened women were used to establish a decision-tree model and to compare three strategies: 1) double reading; 2) double reading for women in their first participation and single reading for women in their subsequent participations; and 3) single reading. We calculated the incremental cost-effectiveness ratio (ICER), which was defined as the expected cost per additional cancer detected. We performed a deterministic sensitivity analysis to test the robustness of the ICER. RESULTS The detection rate of double reading (5.17‰) was similar to that of single reading (4.78‰; P = .768). The mean cost of each detected cancer was €8,912 for double reading and €8,287 for single reading. The ICER of double reading versus single reading was €16,684. The sensitivity analysis showed variations in the ICER according to the sensitivity of reading strategies. The strategy that combines double reading in first participation with single reading in subsequent participations was ruled out due to extended dominance. CONCLUSIONS From our results, double reading appears not to be a cost-effective strategy in the context of digital mammography. Double reading would eventually be challenged in screening programmes, as single reading might entail important net savings without significantly changing the cancer detection rate. These results are not conclusive and should be confirmed in prospective studies that investigate long-term outcomes like quality-adjusted life years (QALYs).
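The ICER defined in the METHODS above (extra cost per additional cancer detected) can be sketched from the rounded figures in the RESULTS. This is a rough reconstruction, not the authors' decision-tree model: it assumes that cost per woman screened equals the mean cost per detected cancer multiplied by the detection rate, and the function name `icer` is hypothetical.

```python
def icer(cost_a, effect_a, cost_b, effect_b):
    """Incremental cost-effectiveness ratio of strategy A versus strategy B."""
    delta_effect = effect_a - effect_b
    if delta_effect == 0:
        raise ValueError("identical effects; the ICER is undefined")
    return (cost_a - cost_b) / delta_effect


# Figures quoted in the abstract (rounded as published).
rate_double = 5.17 / 1000          # cancers detected per woman, double reading
rate_single = 4.78 / 1000          # cancers detected per woman, single reading
cost_per_cancer_double = 8912.0    # euros per detected cancer
cost_per_cancer_single = 8287.0

# Assumption (not stated in the abstract): cost per woman screened is the
# mean cost per detected cancer times the detection rate.
cost_double = cost_per_cancer_double * rate_double
cost_single = cost_per_cancer_single * rate_single

approx_icer = icer(cost_double, rate_double, cost_single, rate_single)
print(f"approximate ICER: {approx_icer:.0f} euros per additional cancer detected")
```

Under that assumption the result comes out within about 1% of the reported €16,684; the remaining gap is plausibly due to rounding in the published inputs.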
Affiliation(s)
- Margarita Posso
  - Service of Clinical Epidemiology and Public Health, Biomedical Research Institute Sant Pau (IIB Sant Pau), Barcelona, Spain
- Montserrat Rué
  - Basic Medical Sciences Department, Biomedical Research Institut of Lleida (IRBLLEIDA), Universitat de Lleida, Lleida, Spain
- Teresa Puig
  - Service of Clinical Epidemiology and Public Health, Biomedical Research Institute Sant Pau (IIB Sant Pau), Barcelona, Spain
  - Universitat Autònoma de Barcelona (UAB), Barcelona, Spain
- Xavier Bonfill
  - Service of Clinical Epidemiology and Public Health, Biomedical Research Institute Sant Pau (IIB Sant Pau), Barcelona, Spain
  - Universitat Autònoma de Barcelona (UAB), Barcelona, Spain
  - CIBER of Epidemiology and Public Health (CIBERESP), Barcelona, Spain
9
Hands JR, Clemens G, Stables R, Ashton K, Brodbelt A, Davis C, Dawson TP, Jenkinson MD, Lea RW, Walker C, Baker MJ. Brain tumour differentiation: rapid stratified serum diagnostics via attenuated total reflection Fourier-transform infrared spectroscopy. J Neurooncol 2016; 127:463-72. [PMID: 26874961 PMCID: PMC4835510 DOI: 10.1007/s11060-016-2060-x] [Citation(s) in RCA: 90] [Impact Index Per Article: 11.3] [Received: 11/15/2015] [Accepted: 01/22/2016] [Indexed: 01/07/2023]
Abstract
The ability to diagnose cancer rapidly with high sensitivity and specificity is essential to exploit advances in new treatments and achieve significant reductions in mortality and morbidity. Current cancer diagnostic tests, which observe tissue architecture and specific protein expression for specific cancers, suffer from inter-observer variability and poor detection rates, and are performed only once the patient is symptomatic. A new method for the detection of cancer using 1 μl of human serum, attenuated total reflection-Fourier transform infrared spectroscopy and pattern recognition algorithms is reported using a 433-patient dataset (3897 spectra). To the best of our knowledge, we present the largest study on serum mid-infrared spectroscopy for cancer research. Using a Radial Basis Function Support Vector Machine, we achieve optimum sensitivities and specificities of between 80.0% and 100% for all strata and identify the major spectral features, and hence biochemical components, responsible for the discrimination within each stratum. We assess feature-fed SVM analysis for our cancer versus non-cancer model and achieve 91.5% sensitivity and 83.0% specificity. We demonstrate the use of infrared light to provide a spectral signature from human serum to detect, for the first time, cancer versus non-cancer, metastatic cancer versus organ-confined cancer, brain cancer severity and the organ of origin of metastatic disease from the same sample, enabling stratified diagnostics depending upon the clinical question asked.
Affiliation(s)
- James R Hands
  - WestCHEM, Department of Pure and Applied Chemistry, Technology and Innovation Centre, University of Strathclyde, 99 George Street, Glasgow, G1 1RD, UK
- Graeme Clemens
  - WestCHEM, Department of Pure and Applied Chemistry, Technology and Innovation Centre, University of Strathclyde, 99 George Street, Glasgow, G1 1RD, UK
  - Centre for Materials Science, Division of Chemistry, University of Central Lancashire, Preston, PR1 2HE, UK
- Ryan Stables
  - Digital Media Technology Laboratory, Millennium Point, City Centre Campus Birmingham City University, West Midlands, B4 7XG, UK
- Katherine Ashton
  - Neuropathology, Lancashire Teaching Hospitals NHS Trust, Royal Preston Hospital, Sharoe Green Lane North, Preston, PR2 9HT, UK
- Andrew Brodbelt
  - The Walton Centre for Neurology and Neurosurgery, The Walton Centre NHS Trust, Lower Lane, Liverpool, L9 7LJ, UK
- Charles Davis
  - Neuropathology, Lancashire Teaching Hospitals NHS Trust, Royal Preston Hospital, Sharoe Green Lane North, Preston, PR2 9HT, UK
- Timothy P Dawson
  - Neuropathology, Lancashire Teaching Hospitals NHS Trust, Royal Preston Hospital, Sharoe Green Lane North, Preston, PR2 9HT, UK
- Michael D Jenkinson
  - The Walton Centre for Neurology and Neurosurgery, The Walton Centre NHS Trust, Lower Lane, Liverpool, L9 7LJ, UK
- Robert W Lea
  - School of Pharmacy and Biomedical Sciences, Maudland Building, University of Central Lancashire, Preston, PR1 2HE, UK
- Carol Walker
  - The Walton Centre for Neurology and Neurosurgery, The Walton Centre NHS Trust, Lower Lane, Liverpool, L9 7LJ, UK
- Matthew J Baker
  - WestCHEM, Department of Pure and Applied Chemistry, Technology and Innovation Centre, University of Strathclyde, 99 George Street, Glasgow, G1 1RD, UK
10
Posso MC, Puig T, Quintana MJ, Solà-Roca J, Bonfill X. Double versus single reading of mammograms in a breast cancer screening programme: a cost-consequence analysis. Eur Radiol 2016; 26:3262-71. [PMID: 26747264 DOI: 10.1007/s00330-015-4175-4] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 07/27/2015] [Revised: 11/30/2015] [Accepted: 12/15/2015] [Indexed: 12/29/2022]
Abstract
OBJECTIVES To assess the costs and health-related outcomes of double versus single reading of digital mammograms in a breast cancer screening programme. METHODS Based on data from 57,157 digital screening mammograms from women aged 50-69 years, we compared costs, false-positive results, positive predictive value and cancer detection rate using four reading strategies: double reading with and without consensus and arbitration, and single reading with first reader only and second reader only. Four highly trained radiologists read the mammograms. RESULTS Double reading with consensus and arbitration was 15% (€334,341) more expensive than single reading with first reader only. False-positive results were more frequent at double reading with consensus and arbitration than at single reading with first reader only (4.5% and 4.2%, respectively; p < 0.001). The positive predictive value (9.3% and 9.1%; p = 0.812) and cancer detection rate (4.6 and 4.2 per 1000 screens; p = 0.283) were similar for both reading strategies. CONCLUSIONS Our results suggest that changing to single reading of mammograms could produce savings in breast cancer screening. Single reading could reduce the frequency of false-positive results without changing the cancer detection rate. These results are not conclusive and cannot be generalized to other contexts with less trained radiologists. KEY POINTS • Double reading of digital mammograms is more expensive than single reading. • Compared to single reading, double reading yields a higher proportion of false-positive results. • The cancer detection rate was similar for double and single readings. • Single reading may be a cost-effective strategy in breast cancer screening programmes.
Affiliation(s)
- Margarita C Posso
- Epidemiology Department, Hospital de la Santa Creu i Sant Pau, Biomedical Research Institute Sant Pau (IIB Sant Pau), Barcelona, Spain; Iberoamerican Cochrane Centre, Biomedical Research Institute Sant Pau (IIB Sant Pau), Hospital de la Santa Creu i Sant Pau, C/ Sant Antoni Maria Claret, 167, Pavelló 18, planta 0, CP: 08025, Barcelona, Spain
| | - Teresa Puig
- Epidemiology Department, Hospital de la Santa Creu i Sant Pau, Biomedical Research Institute Sant Pau (IIB Sant Pau), Barcelona, Spain; Universitat Autònoma de Barcelona (UAB), Barcelona, Spain
| | - Ma Jesus Quintana
- Epidemiology Department, Hospital de la Santa Creu i Sant Pau, Biomedical Research Institute Sant Pau (IIB Sant Pau), Barcelona, Spain; CIBER of Epidemiology and Public Health (CIBERESP), Barcelona, Spain
| | - Judit Solà-Roca
- Epidemiology Department, Hospital de la Santa Creu i Sant Pau, Biomedical Research Institute Sant Pau (IIB Sant Pau), Barcelona, Spain
| | - Xavier Bonfill
- Epidemiology Department, Hospital de la Santa Creu i Sant Pau, Biomedical Research Institute Sant Pau (IIB Sant Pau), Barcelona, Spain; Universitat Autònoma de Barcelona (UAB), Barcelona, Spain; CIBER of Epidemiology and Public Health (CIBERESP), Barcelona, Spain
| |
|
11
|
Ichimasa K, Kudo SE, Mori Y, Wakamura K, Ikehara N, Kutsukawa M, Takeda K, Misawa M, Kudo T, Miyachi H, Yamamura F, Ohkoshi S, Hamatani S, Inoue H. Double staining with crystal violet and methylene blue is appropriate for colonic endocytoscopy: an in vivo prospective pilot study. Dig Endosc 2014; 26:403-8. [PMID: 24016362 PMCID: PMC4232925 DOI: 10.1111/den.12164] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 07/02/2013] [Accepted: 08/02/2013] [Indexed: 01/20/2023]
Abstract
BACKGROUND AND AIM Endocytoscopy (EC) at ultra-high magnification enables in vivo visualization of cellular atypia of gastrointestinal mucosae. Clear images are essential for precise diagnosis by EC. The aim of the present study was to evaluate the optimal staining method for EC in the colon. METHODS Thirty prospectively enrolled patients were allocated 1:1:1 to three distinct staining methods: 0.05% crystal violet (CV) alone, 1% methylene blue (MB) alone, or CV+MB (CM double). Normal rectal mucosae were stained with each dye and videos of EC images were recorded. Visibility of nuclei and gland formation after staining were evaluated as 'recognizable' or 'not recognizable'. Time for each parameter to become 'recognizable' was measured, and the average times for the three staining regimens were compared. RESULTS Nuclei became 'recognizable' within comparable periods of time after MB alone and after CM double staining (102 ± 27 vs 89 ± 22 s, P=0.263), whereas nuclei could not be identified with CV alone. Gland formation became 'recognizable' sooner after CM double staining than after MB alone (61 ± 16 vs 108 ± 24 s, P<0.001). CONCLUSIONS Double staining with CV and MB, which rapidly provided recognizable images of both nuclei and gland formation, is an appropriate staining regimen for colonic EC.
Affiliation(s)
- Katsuro Ichimasa
- Digestive Disease Center, Showa University Northern Yokohama Hospital, Yokohama, Japan
| | - Shin-ei Kudo
- Digestive Disease Center, Showa University Northern Yokohama Hospital, Yokohama, Japan
| | - Yuichi Mori
- Digestive Disease Center, Showa University Northern Yokohama Hospital, Yokohama, Japan
| | - Kunihiko Wakamura
- Digestive Disease Center, Showa University Northern Yokohama Hospital, Yokohama, Japan
| | - Nobunao Ikehara
- Digestive Disease Center, Showa University Northern Yokohama Hospital, Yokohama, Japan
| | - Makoto Kutsukawa
- Digestive Disease Center, Showa University Northern Yokohama Hospital, Yokohama, Japan
| | - Kenichi Takeda
- Digestive Disease Center, Showa University Northern Yokohama Hospital, Yokohama, Japan
| | - Masashi Misawa
- Digestive Disease Center, Showa University Northern Yokohama Hospital, Yokohama, Japan
| | - Toyoki Kudo
- Digestive Disease Center, Showa University Northern Yokohama Hospital, Yokohama, Japan
| | - Hideyuki Miyachi
- Digestive Disease Center, Showa University Northern Yokohama Hospital, Yokohama, Japan
| | - Fuyuhiko Yamamura
- Digestive Disease Center, Showa University Northern Yokohama Hospital, Yokohama, Japan
| | - Shogo Ohkoshi
- Digestive Disease Center, Showa University Northern Yokohama Hospital, Yokohama, Japan
| | - Shigeharu Hamatani
- Digestive Disease Center, Showa University Northern Yokohama Hospital, Yokohama, Japan
| | - Haruhiro Inoue
- Digestive Disease Center, Showa University Northern Yokohama Hospital, Yokohama, Japan
| |
|
12
|
Daido S, Nakai A, Kido A, Okada T, Kamae T, Fujimoto K, Ito I, Togashi K. Anticholinergic agents result in weaker and shorter suppression of uterine contractility compared with intestinal motion: time course observation with cine MRI. J Magn Reson Imaging 2013; 38:1196-202. [DOI: 10.1002/jmri.24072] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/19/2012] [Accepted: 01/14/2013] [Indexed: 11/11/2022] Open
Affiliation(s)
- Sayaka Daido
- Department of Diagnostic Imaging and Nuclear Medicine; Graduate School of Medicine; Kyoto University; Kyoto, Japan
| | - Asako Nakai
- Department of Diagnostic Imaging and Nuclear Medicine; Graduate School of Medicine; Kyoto University; Kyoto, Japan
| | - Aki Kido
- Department of Diagnostic Imaging and Nuclear Medicine; Graduate School of Medicine; Kyoto University; Kyoto, Japan
| | - Tomohisa Okada
- Department of Diagnostic Imaging and Nuclear Medicine; Graduate School of Medicine; Kyoto University; Kyoto, Japan
| | - Toshikazu Kamae
- Department of Diagnostic Imaging and Nuclear Medicine; Graduate School of Medicine; Kyoto University; Kyoto, Japan
| | - Koji Fujimoto
- Department of Diagnostic Imaging and Nuclear Medicine; Graduate School of Medicine; Kyoto University; Kyoto, Japan
| | - Isao Ito
- Department of Respiratory Medicine; Kyoto University Hospital; Kyoto, Japan
| | - Kaori Togashi
- Department of Diagnostic Imaging and Nuclear Medicine; Graduate School of Medicine; Kyoto University; Kyoto, Japan
| |
|
13
|
Sakurada S, Hang NTL, Ishizuka N, Toyota E, Hung LD, Chuc PT, Lien LT, Thuong PH, Bich PTN, Keicho N, Kobayashi N. Inter-rater agreement in the assessment of abnormal chest X-ray findings for tuberculosis between two Asian countries. BMC Infect Dis 2012; 12:31. [PMID: 22296612 PMCID: PMC3311558 DOI: 10.1186/1471-2334-12-31] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/06/2011] [Accepted: 02/01/2012] [Indexed: 11/20/2022] Open
Abstract
Background Inter-rater agreement in the interpretation of chest X-ray (CXR) films is crucial for clinical and epidemiological studies of tuberculosis. We compared the readings of CXR films used for a survey of tuberculosis between raters from two Asian countries. Methods Of the 11,624 people enrolled in a prevalence survey in Hanoi, Viet Nam, in 2003, we studied 258 individuals whose CXR films did not exclude the possibility of active tuberculosis. Follow-up films obtained from accessible individuals in 2006 were also analyzed. Two Japanese and two Vietnamese raters read the CXR films based on a coding system proposed by Den Boon et al. and another system newly developed in this study. Inter-rater agreement was evaluated by kappa statistics. Marginal homogeneity was evaluated by the generalized estimating equation (GEE). Results CXR findings suspected of tuberculosis differed between the four raters. The frequencies of infiltrates and fibrosis/scarring detected on the films significantly differed between the raters from the two countries (P < 0.0001 and P = 0.0082, respectively, by GEE). The definition of findings such as primary cavity, as used in the coding systems, also affected the degree of agreement. Conclusions CXR findings were inconsistent between the raters with different backgrounds. High inter-rater agreement is a necessary component of an optimal CXR coding system, particularly in international studies. An analysis of reading results and a thorough discussion to achieve a consensus would be necessary to achieve further consistency and high quality of reading.
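As an illustration of the kappa statistic mentioned in this abstract, the following is a minimal sketch of unweighted Cohen's kappa for two raters. The function name and the ten example readings are hypothetical and are not taken from the study, which also involved four raters and more detailed coding systems.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: share of items given the same label by both raters.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_chance = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if p_chance == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (p_observed - p_chance) / (1.0 - p_chance)


# Hypothetical readings of ten films (1 = finding present, 0 = absent).
reader_1 = [1, 1, 0, 0, 1, 0, 1, 0, 0, 1]
reader_2 = [1, 0, 0, 0, 1, 0, 1, 1, 0, 1]
kappa = cohens_kappa(reader_1, reader_2)
print(f"kappa = {kappa:.2f}")
```

In this made-up example the readers agree on 8 of 10 films, chance agreement is 0.5, and kappa is therefore (0.8 - 0.5) / (1 - 0.5) = 0.6.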
|
14
|
Généreux P, Palmerini T, Caixeta A, Cristea E, Mehran R, Sanchez R, Lazar D, Jankovic I, Corral MD, Dressler O, Fahy MP, Parise H, Lansky AJ, Stone GW. SYNTAX Score Reproducibility and Variability Between Interventional Cardiologists, Core Laboratory Technicians, and Quantitative Coronary Measurements. Circ Cardiovasc Interv 2011; 4:553-61. [DOI: 10.1161/circinterventions.111.961862] [Citation(s) in RCA: 110] [Impact Index Per Article: 8.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
Affiliation(s)
- Philippe Généreux
- From Columbia University Medical Center and the Cardiovascular Research Foundation; New York, NY
| | - Tullio Palmerini
- From Columbia University Medical Center and the Cardiovascular Research Foundation; New York, NY
| | - Adriano Caixeta
- From Columbia University Medical Center and the Cardiovascular Research Foundation; New York, NY
| | - Ecaterina Cristea
- From Columbia University Medical Center and the Cardiovascular Research Foundation; New York, NY
| | - Roxana Mehran
- From Columbia University Medical Center and the Cardiovascular Research Foundation; New York, NY
| | - Raquel Sanchez
- From Columbia University Medical Center and the Cardiovascular Research Foundation; New York, NY
| | - Dana Lazar
- From Columbia University Medical Center and the Cardiovascular Research Foundation; New York, NY
| | - Ivana Jankovic
- From Columbia University Medical Center and the Cardiovascular Research Foundation; New York, NY
| | - Maria D. Corral
- From Columbia University Medical Center and the Cardiovascular Research Foundation; New York, NY
| | - Ovidiu Dressler
- From Columbia University Medical Center and the Cardiovascular Research Foundation; New York, NY
| | - Martin P. Fahy
- From Columbia University Medical Center and the Cardiovascular Research Foundation; New York, NY
| | - Helen Parise
- From Columbia University Medical Center and the Cardiovascular Research Foundation; New York, NY
| | - Alexandra J. Lansky
- From Columbia University Medical Center and the Cardiovascular Research Foundation; New York, NY
| | - Gregg W. Stone
- From Columbia University Medical Center and the Cardiovascular Research Foundation; New York, NY
| |
|
15
|
Wang Y, van Klaveren RJ, de Bock GH, Zhao Y, Vernhout R, Leusveld A, Scholten E, Verschakelen J, Mali W, de Koning H, Oudkerk M. No benefit for consensus double reading at baseline screening for lung cancer with the use of semiautomated volumetry software. Radiology 2011; 262:320-6. [PMID: 22106357 DOI: 10.1148/radiol.11102289] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/11/2022]
Abstract
PURPOSE To retrospectively evaluate the performance of consensus double reading compared with single reading at baseline screening of a lung cancer computed tomography (CT) screening trial. MATERIALS AND METHODS The study was approved by the Dutch Minister of Health and ethical committees. Written informed consent was obtained from all participants. The benefit of consensus double reading was expressed by the percentage change in cancer detection rate, recall rate, number of additional nodules detected, and change in sensitivity and specificity in 7557 participants. The reference standard was a retrospective analysis of the serial CT scans performed in participants diagnosed with lung cancer during a 2-year period after baseline. Semiautomated volumetric software was used for nodule evaluation. McNemar tests were performed to test statistical significance. In addition, sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) were calculated and 95% confidence intervals (CIs) constructed. RESULTS Seventy-four cases of lung cancer were qualified as detectable at baseline. Compared with single reading, consensus double reading did not significantly increase the cancer detection rate (2.7%; 95% CI: -1.0%, 6.4%; P = .50) or change the recall rate (20.6% vs 20.8%, P = .28), but led to the detection of 19.0% (1635 of 8623; 95% CI: 18.0%, 19.9%, P < .01) more nodules. The sensitivity, specificity, PPV, and NPV were 95.9% (71 of 74), 80.2% (6001 of 7483), 4.6% (71 of 1553) and 99.9% (6001 of 6004) for single reading and 98.6% (73 of 74), 80.0% (5986 of 7483), 4.6% (73 of 1570), and 99.9% (5986 of 5987) for consensus double reading, respectively. CONCLUSION There is no statistically significant benefit for consensus double reading at baseline screening for lung cancer with the use of a nodule management strategy based solely on semiautomated volumetry.
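The four accuracy measures reported in this abstract follow directly from a 2x2 table of counts. The sketch below recomputes them in Python. The function name is illustrative, and the false-positive counts are an assumption derived from the abstract's own denominators (7483 participants without detectable cancer minus the true negatives given in the NPV fractions), not figures quoted by the authors.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return sensitivity, specificity, PPV and NPV as fractions."""
    return {
        "sensitivity": tp / (tp + fn),  # detected cancers / all cancers
        "specificity": tn / (tn + fp),  # true negatives / all non-cancers
        "ppv": tp / (tp + fp),          # cancers / all positive readings
        "npv": tn / (tn + fn),          # non-cancers / all negative readings
    }


# 74 detectable cancers and 7483 participants without detectable cancer.
# False positives are derived as 7483 minus the true negatives (assumption).
single = diagnostic_metrics(tp=71, fn=3, tn=6001, fp=7483 - 6001)
double = diagnostic_metrics(tp=73, fn=1, tn=5986, fp=7483 - 5986)

for name, metrics in (("single", single), ("consensus double", double)):
    summary = ", ".join(f"{k} {100 * v:.2f}%" for k, v in metrics.items())
    print(f"{name}: {summary}")
```

The derived counts are consistent with the PPV denominators in the abstract (71 + 1482 = 1553 and 73 + 1497 = 1570). Both NPV values lie between 99.9% and 100%, so they round to 100.0% at one decimal place; the abstract's 99.9% appears to be truncated rather than rounded.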
Affiliation(s)
- Ying Wang
- Department of Radiology, University Medical Center Groningen, Hanzeplein 1, 9700 RB Groningen, the Netherlands
| | | | | | | | | | | | | | | | | | | | | |
|
16
|
Wei J, Chan HP, Zhou C, Wu YT, Sahiner B, Hadjiiski LM, Roubidoux MA, Helvie MA. Computer-aided detection of breast masses: four-view strategy for screening mammography. Med Phys 2011; 38:1867-76. [PMID: 21626920 DOI: 10.1118/1.3560462] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/07/2022] Open
Abstract
PURPOSE To improve the performance of a computer-aided detection (CAD) system for mass detection by using four-view information in screening mammography. METHODS The authors developed a four-view CAD system that emulates radiologists' reading by using the craniocaudal and mediolateral oblique views of the ipsilateral breast to reduce false positives (FPs) and the corresponding views of the contralateral breast to detect asymmetry. The CAD system consists of four major components: (1) Initial detection of breast masses on individual views, (2) information fusion of the ipsilateral views of the breast (referred to as two-view analysis), (3) information fusion of the corresponding views of the contralateral breast (referred to as bilateral analysis), and (4) fusion of the four-view information with a decision tree. The authors collected two data sets for training and testing of the CAD system: A mass set containing 389 patients with 389 biopsy-proven masses and a normal set containing 200 normal subjects. All cases had four-view mammograms. The true locations of the masses on the mammograms were identified by an experienced MQSA radiologist. The authors randomly divided the mass set into two independent sets for cross validation training and testing. The overall test performance was assessed by averaging the free response receiver operating characteristic (FROC) curves of the two test subsets. The FP rates during the FROC analysis were estimated by using the normal set only. The jackknife free-response ROC (JAFROC) method was used to estimate the statistical significance of the difference between the test FROC curves obtained with the single-view and the four-view CAD systems. RESULTS Using the single-view CAD system, the breast-based test sensitivities were 58% and 77% at the FP rates of 0.5 and 1.0 per image, respectively. With the four-view CAD system, the breast-based test sensitivities were improved to 76% and 87% at the corresponding FP rates, respectively. The improvement was found to be statistically significant (p < 0.0001) by JAFROC analysis. CONCLUSIONS The four-view information fusion approach that emulates radiologists' reading strategy significantly improves the performance of breast mass detection of the CAD system in comparison with the single-view approach.
Affiliation(s)
- Jun Wei
- Department of Radiology, University of Michigan, 1500 East Medical Center Drive, C478 Med-Inn Building, Ann Arbor, Michigan 48109-5842, USA.
| | | | | | | | | | | | | | | |
|
17
|
Hofvind S, Yankaskas BC, Bulliard JL, Klabunde CN, Fracheboud J. Comparing Interval Breast Cancer Rates in Norway and North Carolina: Results and Challenges. J Med Screen 2009; 16:131-9. [DOI: 10.1258/jms.2009.009012] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022]
Abstract
Objective To compare interval breast cancer rates (ICR) between a biennial organized screening programme in Norway and annual opportunistic screening in North Carolina (NC) for different conceptualizations of interval cancer. Setting Two regions with different screening practices and performance. Methods 620,145 subsequent screens (1996–2002) performed in women aged 50–69 and 1280 interval cancers were analysed. Various definitions and quantification methods for interval cancers were compared. Results ICR for one year follow-up were lower in Norway compared with NC both when the rate was based on all screens (0.54 versus 1.29 per 1000 screens), negative final assessments (0.54 versus 1.29 per 1000 screens), and negative screening assessments (0.53 versus 1.28 per 1000 screens). The rate of ductal carcinoma in situ was significantly lower in Norway than in NC for cases diagnosed in both the first and second year after screening. The distributions of histopathological tumour size and lymph node involvement in invasive cases did not differ between the two regions for interval cancers diagnosed during the first year after screening. In contrast, in the second year after screening, tumour characteristics remained stable in Norway but became prognostically more favorable in NC. Conclusion Even when applying a common set of definitions of interval cancer, the ICR was lower in Norway than in NC. Different definitions of interval cancer did not influence the ICR within Norway or NC. Organization of screening and screening performance might be major contributors to the differences in ICR between Norway and NC.
Affiliation(s)
- Solveig Hofvind
- Department of Screening-Based Research, The Cancer Registry of Norway, 0304 Oslo, Norway
| | - Bonnie C Yankaskas
- Department of Radiology, University of North Carolina at Chapel Hill, 27599, USA
| | - Jean-Luc Bulliard
- Cancer Epidemiology Unit, University Institute of Social and Preventive Medicine, Lausanne, Switzerland
| | - Carrie N Klabunde
- Health Services and Economics Branch, Applied Research Program, Division of Cancer Control and Population Sciences, National Cancer Institute, Bethesda, Maryland 20892–7344, USA
| | - Jacques Fracheboud
- Department of Public Health, NETB, Erasmus MC, University Medical Center Rotterdam, Rotterdam, The Netherlands
| |
|
18
|
Hofvind S, Geller BM, Rosenberg RD, Skaane P. Screening-detected breast cancers: discordant independent double reading in a population-based screening program. Radiology 2009; 253:652-60. [PMID: 19789229 DOI: 10.1148/radiol.2533090210] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/11/2022]
Abstract
PURPOSE To analyze discordant and concordant screening-detected breast cancers in a nationwide population-based screening program by using independent double reading with consensus. MATERIALS AND METHODS The study is a part of the evaluation of the Norwegian Breast Cancer Screening Program and is covered by the Cancer Registry regulation. Analyses were based on prospective initial interpretation scores of 1 033 870 screenings that included 5611 breast cancers. A five-point scale for probability of cancer was used in the initial interpretation. Screening mammograms with a score of 2 or higher by either radiologist were discussed at consensus meetings where the decision whether to recall was made. A score of 1 by one reader and 2 or higher by the other was defined as a discordant interpretation and discordant cancer, whereas a score of 2 or higher by both readers was defined as a concordant recall and cancer. RESULTS Discordant interpretation was present in 5.3% (54 447 of 1 033 870) of the screenings, whereas 2.1% (21 928 of 1 033 870) were concordant positive interpretations. Of the screening-detected cancers, 23.6% (1326 of 5611) were diagnosed in women who were recalled because of screenings with discordant interpretation. One hundred seventeen interval breast cancers were diagnosed among the 40 312 screenings that were dismissed at consensus; these were 6.5% of all interval cancers. A significantly higher proportion of microcalcifications alone was present in discordant cancers (24.9% [304 of 1219]) compared with concordant cancers (17.7% [704 of 3972]) (P < .001). CONCLUSION Independent double reading with consensus at mammography screening has the potential to increase the cancer detection rate compared with single reading. Mammograms with microcalcifications alone are significantly more common among discordant cancers.
Affiliation(s)
- Solveig Hofvind
- Department of Screening-based-Research, Cancer Registry of Norway, Montebello, 0310 Oslo, Norway.
| | | | | | | |
|
19
|
Barceló J, Vilanova JC, Albanell J, Ferrer J, Castañer F, Viejo N, Argelaguet M. [Breast MRI: the usefulness of diffusion-weighted sequences for differentiating between benign and malignant lesions]. RADIOLOGIA 2009; 51:469-76. [PMID: 19647840 DOI: 10.1016/j.rx.2009.01.015] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/03/2008] [Revised: 01/12/2009] [Accepted: 01/22/2009] [Indexed: 12/16/2022]
Abstract
OBJECTIVE To evaluate the usefulness of diffusion-weighted MRI sequences and of the apparent diffusion coefficient (ADC) to differentiate between benign and malignant breast lesions. MATERIAL AND METHODS We prospectively studied 88 patients (aged 31 to 79 years) with 94 lesions (80 malignant and 14 benign) who were referred for preoperative local staging. All patients underwent dynamic MRI examination after intravenous contrast administration and a diffusion-weighted sequence with ADC calculation. The results obtained at diffusion-weighted imaging were correlated with those obtained at histological examination. RESULTS The mean value of the ADC for malignant lesions (1.12 ± 0.25 × 10⁻³ mm²/s) was significantly lower (p < 0.001) than for benign lesions (1.61 ± 0.52 × 10⁻³ mm²/s). No significant differences in ADC values were found between the different subtypes of invasive carcinomas or between intraductal carcinoma and invasive carcinoma (p > 0.05). Using an ADC lower than 0.95 × 10⁻³ mm²/s as a threshold for malignancy, the sensitivity is 52% and the specificity is 100%. CONCLUSION Diffusion-weighted sequences provide additional information in breast MRI that is useful for differentiating between benign and malignant lesions, thus improving the specificity of the technique.
Affiliation(s)
- J Barceló
- Ressonància Girona, Clínica Girona, Girona, España.
| | | | | | | | | | | | | |
|
20
|
Inter-observer variability in mammography screening and effect of type and number of readers on screening outcome. Br J Cancer 2009; 100:901-7. [PMID: 19259088 PMCID: PMC2661777 DOI: 10.1038/sj.bjc.6604954] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.9] [Reference Citation Analysis] [Abstract] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/31/2022] Open
Abstract
We prospectively determined the variability in radiologists' interpretation of screening mammograms and assessed the influence of type and number of readers on screening outcome. Twenty-one screening mammography radiographers and eight screening radiologists participated. A total of 106 093 screening mammograms were double-read by two radiographers and, in turn, by two radiologists. Initially, radiologists were blinded to the referral opinion of the radiographers. A woman was referred if she was considered positive at radiologist double-reading with consensus interpretation or referred after radiologist review of positive cases at radiographer double-reading. During 2-year follow-up, clinical data, breast imaging reports, biopsy results and breast surgery reports were collected of all women with a positive screening result from any reader. Single radiologist reading (I) resulted in a mean cancer detection rate of 4.64 per 1000 screens (95% confidence intervals (CI)=4.23–5.05) with individual variations from 3.44 (95% CI=2.30–4.58) to 5.04 (95% CI=3.81–6.27), and a sensitivity of 63.9% (95% CI=60.5–67.3), ranging from 51.5% (95% CI=39.6–63.3) to 75.0% (95% CI=65.3–84.7). Sensitivity at non-blinded, radiologist double-reading (II), radiologist double-reading followed by radiologist review of positive cases at radiographer double-reading (III), triple reading by one radiologist and two radiographers with referral of all positive readings (IV) and quadruple reading by two radiologists and two radiographers with referral of all positive readings (V) were as follows: 68.6% (95% CI=65.3–71.9) (II); 73.2% (95% CI=70.1–76.4) (III); 75.2% (95% CI=72.1–78.2) (IV), and 76.9% (95% CI=73.9–79.9) (V). We conclude that screener performance significantly varied at single-reading. Double-reading increased sensitivity by a relative 7.3%. When there is a shortage of screening radiologists, triple reading by one radiologist and two radiographers may replace radiologist double-reading.
|
21
|
Abstract
Five procedures to calculate the probability of weighted kappa with multiple raters under the null hypothesis of independence are described and compared in terms of accuracy, ease of use, generality, and limitations. The five procedures are (1) exact variance, (2) resampling contingency, (3) intraclass correlation, (4) randomized block, and (5) resampling block. While each procedure possesses strengths and limitations, the resampling contingency procedure is shown to be the most versatile and accurate of the five procedures, provided the number of raters is not too large. The resampling contingency procedure permits any weighting scheme, accommodates both symmetrical and asymmetrical weights, is suitable for both weighted and unweighted kappa, and makes no assumptions about either the data distribution or the probability distribution.
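To make the resampling idea in this abstract concrete, here is a deliberately simplified Python sketch. It is not the authors' multi-rater resampling contingency procedure: it assumes only two raters, ordinal categories coded 0 to k-1, linear disagreement weights by default, and a Monte Carlo shuffle of one rater's labels to approximate the null hypothesis of independence. All function names and ratings are hypothetical.

```python
import random


def weighted_kappa(a, b, n_categories, power=1):
    """Weighted kappa for two raters with ordinal labels 0..n_categories-1.

    Disagreement weights are |i - j| ** power (1 = linear, 2 = quadratic).
    """
    n = len(a)
    # Observed mean disagreement across items.
    observed = sum(abs(x - y) ** power for x, y in zip(a, b)) / n
    # Expected mean disagreement if the two raters were independent.
    count_a = [a.count(c) for c in range(n_categories)]
    count_b = [b.count(c) for c in range(n_categories)]
    expected = sum(
        count_a[i] * count_b[j] * abs(i - j) ** power
        for i in range(n_categories)
        for j in range(n_categories)
    ) / (n * n)
    if expected == 0:
        raise ValueError("kappa is undefined when no disagreement is possible")
    return 1.0 - observed / expected


def resampling_p_value(a, b, n_categories, n_resamples=2000, seed=0):
    """Upper-tail Monte Carlo p-value under the null of independence."""
    rng = random.Random(seed)
    observed = weighted_kappa(a, b, n_categories)
    shuffled = list(b)
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(shuffled)  # breaks any association, keeps the marginals
        if weighted_kappa(a, shuffled, n_categories) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (hits + 1) / (n_resamples + 1)


# Hypothetical ratings of 12 items on a three-point ordinal scale.
rater_1 = [0, 0, 1, 1, 2, 2, 0, 1, 2, 2, 1, 0]
rater_2 = [0, 1, 1, 1, 2, 2, 0, 1, 2, 1, 1, 0]
kappa_w, p_value = resampling_p_value(rater_1, rater_2, n_categories=3)
print(f"weighted kappa = {kappa_w:.2f}, resampling p = {p_value:.4f}")
```

Because the shuffle preserves each rater's marginal distribution, no distributional assumption is needed, which mirrors the property the abstract attributes to resampling. Extending the sketch to more than two raters would require a multi-rater definition of weighted kappa, which is not attempted here.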
Affiliation(s)
| | - Janis E. Johnston
- AAAS Science & Technology Policy Fellow, U.S. EPA National Homeland Security Research Center
| | | |
|
22
|
Interval breast cancers in a community screening programme: frequency, radiological classification and prognostic factors. Eur J Cancer Prev 2008; 17:414-21. [PMID: 18714182 DOI: 10.1097/cej.0b013e3282f75ef5] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/26/2022]
Abstract
The frequency of interval cancers (IC) can be an indicator inversely related to the quality of a breast screening programme. The objectives were to estimate the frequency of IC, to classify IC by posterior radiological review, and to describe the prognostic factors of these IC. The setting was the Sabadell-Cerdanyola Breast Cancer Screening Programme, in Northeast Spain. We developed a population-based study of the IC occurring in the first three rounds (1995-2001). The indicators used were the incidence rate of invasive IC per 10 000 women screened and the proportional incidence, stratified by age group, type of screening and the round, and the time elapsed since the last screening mammogram. A radiological informed consensus review was used to classify the IC. No specific pattern of incidence rates was evident with respect to age, type of screening, or round, although screening was generally more sensitive in women aged 60-69 years. The proportional incidence for the period 0-11 months was always under 30%. Twenty-one percent of 38 IC evaluated (95% CI: 8.0-34.0) were attributed to errors in the screening process (false negatives). No major differences in the prognostic factors of the 57 IC were identified on examining the radiological type or the time since the last screening mammogram. We observed a high frequency of IC from 12 months after screening. It is necessary to reach a consensus regarding the definition and the analysis of IC and to establish mechanisms that would allow all the malignant tumours diagnosed in the target population to be identified.
|
23
|
Mielke PW, Berry KJ, Johnston JE. Resampling Probability Values for Weighted Kappa with Multiple Raters. Psychol Rep 2008; 102:606-13. [DOI: 10.2466/pr0.102.2.606-613] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/15/2022]
Abstract
A new procedure to compute weighted kappa with multiple raters is described. A resampling procedure to compute approximate probability values for weighted kappa with multiple raters is presented. Applications of weighted kappa are illustrated with an example analysis of classifications by three independent raters.
Affiliation(s)
| | | | - Janis E. Johnston
- AAAS Science and Technology Policy Fellow at U.S. EPA Homeland Security Research Center
| |
|
24
|
|
25
|
Blinded comparison of computer-aided detection with human second reading in screening mammography. AJR Am J Roentgenol 2007; 189:1135-41. [PMID: 17954651 DOI: 10.2214/ajr.07.2393] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022]
Abstract
OBJECTIVE The purpose of this study was to compare a human second reader with computer-aided detection (CAD) for the reduction of false-negative cases by a primary radiologist. We retrospectively reviewed our clinical practice. MATERIALS AND METHODS We found that 6,381 consecutive screening mammograms were interpreted by a primary reader. This radiologist then reinterpreted the studies using CAD ("CAD reader"). A second human reader who was blinded to the CAD results but aware of the primary reader's findings reviewed the studies, looking for abnormalities not seen by the first reader. RESULTS Two cancers were called back by the second human reader that were not called back by the CAD reader; however, the CAD system had marked the findings, but they were dismissed by the primary reader. Because of the small numbers, the difference between the CAD and second human reader was not statistically significant. The CAD and human second readers increased the recall rates by 6.4% and 7.2% (p = 0.70), respectively, and the biopsy rates by 10% and 14.7%. The positive predictive value was 0% (0/3) for the CAD reader and 40% (2/5) for the human second reader. The relative increases in the cancer detection rate compared with the primary reader's detection rate were 0% for the CAD reader and 15.4% (2/13) for the human second reader (p = 0.50). CONCLUSION A human second reader or the use of a CAD system can increase the cancer detection rate, but we found no statistical difference between the two because of the small sample size. A possible benefit from a human second reader is that CAD systems can only point to possible abnormalities, whereas a human must determine the significance of the finding. Having two humans review a study may increase detection rates due to interpreter (hence, perceptual) variability and not just increased detection.
|
26
|
Helvie M. Improving mammographic interpretation: double reading and computer-aided diagnosis. Radiol Clin North Am 2007; 45:801-11, vi. [PMID: 17888770 DOI: 10.1016/j.rcl.2007.06.004] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.1] [Indexed: 11/23/2022]
Abstract
This article discusses two commonly used techniques advocated to improve screening mammography performance: double reading (DR) and computer-aided detection (CAD). Analysis of these methods is incomplete because no randomized controlled trials have been performed to assess changes in survival. Although DR and CAD have shown improvement in sensitivity, specificity often has decreased. Balancing which parameter is more important involves health care policy, costs, cultural factors, legal risk, and patient preference.
Affiliation(s)
- Mark Helvie
- Department of Radiology, University of Michigan Health System, 1500 East Medical Center Drive, TC 2910N, Ann Arbor, MI 48109-0326, USA.
|
27
|
Abstract
Achieving and delivering optimal quality of care in radiology requires continual self-examination by the profession, particularly with regard to technical, interpretive, and communication skills. The importance of empirical data pertaining to quality and variability in radiology, the underlying causes of error, and the sources of variability are discussed. Key measures (e.g., receiver operating characteristics, kappa) and approaches (professional audits and peer reviews, surveys, inspections, and risk management programs) used in improvement efforts are reviewed, and data from key studies are highlighted. Diagnostic errors are important because of their connection to outcomes and the wide variability observed with modalities such as chest radiography and mammography.
|
28
|
Elmore JG, Brenner RJ. The More Eyes, the Better to See? From Double to Quadruple Reading of Screening Mammograms. J Natl Cancer Inst 2007; 99:1141-3. [PMID: 17652275 DOI: 10.1093/jnci/djm079] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Indexed: 11/13/2022] Open
|
29
|
Marini C, Iacconi C, Giannelli M, Cilotti A, Moretti M, Bartolozzi C. Quantitative diffusion-weighted MR imaging in the differential diagnosis of breast lesion. Eur Radiol 2007; 17:2646-55. [PMID: 17356840 DOI: 10.1007/s00330-007-0621-2] [Citation(s) in RCA: 185] [Impact Index Per Article: 10.9] [Received: 05/22/2006] [Revised: 01/26/2007] [Accepted: 02/13/2007] [Indexed: 12/14/2022]
Abstract
The role of diffusion-weighted magnetic resonance imaging (DWI) to differentiate breast lesions in vivo was evaluated. Sixty women (mean age, 53 years) with 81 breast lesions were enrolled. A coronal echo planar imaging (EPI) sequence sensitised to diffusion (b value = 1,000 s/mm²) was added to standard MR. The mean diffusivity (MD) was calculated. Differences in MD among cysts, benign lesions and malignant lesions were evaluated, and the sensitivity and specificity of DWI to diagnose malignant and benign lesions were calculated. The diagnosis was 18 cysts, 21 benign and 42 malignant nodules. MD values (mean ± SD, × 10⁻³ mm²/s) were 1.48 ± 0.37 for benign lesions, 0.95 ± 0.18 for malignant lesions and 2.25 ± 0.26 for cysts. Different MD values characterized different malignant breast lesion types. An MD threshold value of 1.1 × 10⁻³ mm²/s discriminated malignant breast lesions from benign lesions with a specificity of 81% and sensitivity of 80%. Choosing a cut-off of 1.31 × 10⁻³ mm²/s (MD of malignant lesions + 2 SD), the specificity would be 67% with a sensitivity of 100%. Thus, MD values, related to tumor cellularity, provide reliable information to differentiate malignant breast lesions from benign ones. Quantitative DWI is not time-consuming and can be easily inserted into standard clinical breast MR imaging protocols.
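The second cut-off quoted in this abstract matches the reported malignant-lesion mean plus two standard deviations (0.95 + 2 × 0.18 = 1.31, in units of 10⁻³ mm²/s). The following is a minimal Python sketch of that arithmetic and of a threshold rule, using only the figures quoted above. The function names are hypothetical, and the choice to call a lesion malignant when its MD is at or below the threshold is an assumption; the abstract does not state how ties were handled.

```python
# Values quoted in the abstract, in units of 10^-3 mm^2/s.
MALIGNANT_MEAN = 0.95
MALIGNANT_SD = 0.18


def cutoff_mean_plus_k_sd(mean, sd, k=2.0):
    """Return a cut-off placed k standard deviations above the mean."""
    return mean + k * sd


def called_malignant(md, threshold):
    """Assumed rule: low diffusivity (MD <= threshold) is called malignant."""
    return md <= threshold


cutoff = cutoff_mean_plus_k_sd(MALIGNANT_MEAN, MALIGNANT_SD)
print(round(cutoff, 2))

# The reported benign mean (1.48) lies above both thresholds,
# the reported malignant mean (0.95) lies below both.
print(called_malignant(1.48, 1.1), called_malignant(0.95, 1.1))
```

Raising the threshold from 1.1 to 1.31 captures more malignant lesions at the cost of more benign lesions being called malignant, which is the direction of the sensitivity and specificity change the abstract reports.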
Affiliation(s)
- C Marini
- Department of Diagnostic and Interventional Radiology, University of Pisa, Via Roma 67, 56100, Pisa (PI), Italy
|
30
|
Jiang Y, Nishikawa RM, Schmidt RA, Metz CE. Comparison of independent double readings and computer-aided diagnosis (CAD) for the diagnosis of breast calcifications. Acad Radiol 2006; 13:84-94. [PMID: 16399036 DOI: 10.1016/j.acra.2005.09.086] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.5] [Received: 08/04/2005] [Revised: 09/20/2005] [Accepted: 09/20/2005] [Indexed: 11/29/2022]
Abstract
RATIONALE AND OBJECTIVES The aim of the study is to compare independent double readings by radiologists and computer-aided diagnosis (CAD) in diagnostic interpretation of mammographic calcifications. MATERIALS AND METHODS Ten radiologists independently interpreted 104 mammograms containing clustered microcalcifications. Forty-six of these were malignant and 58 were benign at biopsy. Radiologists read the images with and without a computer aid by using a counterbalanced study design. Sensitivity and specificity were calculated from observer biopsy recommendations, and receiver operating characteristic (ROC) curves were computed from their diagnostic confidence ratings. Unaided double-reading sensitivity and specificity values were derived post hoc by using three different objective rules and an additional rule of simulated-optimal double reading that assumed that consultations for resolving two radiologists' different independent diagnoses always produce the correct clinical recommendation. ROC curves of unaided double readings were obtained according to the literature. RESULTS Single reading without computer aid yielded 74% sensitivity and 32% specificity, whereas CAD reading yielded 87% sensitivity and 42% specificity and appeared on a higher ROC curve (P < .0001). Three methods of formulating independent double readings generated sensitivities between 59% and 89%, specificities between 50% and 13%, and operating points that moved essentially along the average unaided single-reading ROC curve. ROC curves of unaided independent double readings showed small, statistically insignificant improvement over those of unaided single readings. Results of the simulated-optimal double reading were similar to CAD: 89% sensitivity and 50% specificity. CONCLUSION Independent double readings of mammographic calcifications may not improve diagnostic performance. CAD reading improves diagnostic performance to an extent approaching the maximum possible performance.
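The abstract does not name its three objective rules for forming a double reading from two independent readings. As an illustration only, the sketch below assumes two simple rules (recommend biopsy if either reader does, or only if both do) and uses invented toy decisions. It shows why such rules tend to trade sensitivity against specificity, moving along a single-reader ROC curve rather than above it.

```python
def sens_spec(decisions, truth):
    """Sensitivity and specificity of binary decisions against ground truth."""
    tp = sum(1 for d, t in zip(decisions, truth) if d and t)
    tn = sum(1 for d, t in zip(decisions, truth) if not d and not t)
    positives = sum(1 for t in truth if t)
    negatives = len(truth) - positives
    return tp / positives, tn / negatives


def combine(reader_a, reader_b, rule):
    """Combine two readers' decisions; both rule names are hypothetical."""
    if rule == "either":
        return [a or b for a, b in zip(reader_a, reader_b)]
    if rule == "both":
        return [a and b for a, b in zip(reader_a, reader_b)]
    raise ValueError(f"unknown rule: {rule}")


# Toy data (not from the study): 4 malignant cases followed by 4 benign cases.
truth = [True, True, True, True, False, False, False, False]
reader_a = [True, True, False, True, True, False, False, False]
reader_b = [True, False, True, True, False, True, False, False]

for name, decisions in [
    ("reader A", reader_a),
    ("reader B", reader_b),
    ("either", combine(reader_a, reader_b, "either")),
    ("both", combine(reader_a, reader_b, "both")),
]:
    print(name, sens_spec(decisions, truth))
```

In this toy set each reader alone has sensitivity and specificity of 0.75; the "either" rule raises sensitivity while lowering specificity, and the "both" rule does the reverse.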
Affiliation(s)
- Yulei Jiang
- Department of Radiology, The University of Chicago, 5841 South Maryland Avenue, Chicago, IL 60637
|
31
|
Hendrick RE, Cutter GR, Berns EA, Nakano C, Egger J, Carney PA, Abraham L, Taplin SH, D'Orsi CJ, Barlow W, Elmore JG. Community-based mammography practice: services, charges, and interpretation methods. AJR Am J Roentgenol 2005; 184:433-8. [PMID: 15671359 PMCID: PMC3142997 DOI: 10.2214/ajr.184.2.01840433] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.1] [Indexed: 11/18/2022]
Abstract
OBJECTIVE The purpose of our study was to accurately describe facility characteristics among community-based screening and diagnostic mammography practices in the United States. MATERIALS AND METHODS A survey was developed and applied to community-based facilities providing screening mammography in three geographically distinct locations in the states of Washington, Colorado, and New Hampshire. The facility survey was conducted between December 2001 and September 2002. Characteristics surveyed included facility type, services offered, charges for screening and diagnostic mammography, information systems, and interpretation methods, including the frequency of double interpretation. RESULTS Among 45 responding facilities, services offered included screening mammography at all facilities, diagnostic mammography at 34 facilities (76%), breast sonography at 30 (67%), breast MRI at seven (16%), and nuclear medicine breast scanning at seven (16%). Most facilities surveyed were radiology practices in nonhospital settings. Eight facilities (18%) reported performing clinical breast examinations routinely along with screening mammography. Only five screening sites (11%) used computer-aided detection (CAD) and only two (5%) used digital mammography. Nearly two thirds of facilities interpreted screening mammography examinations on-site, whereas 91% of facilities interpreted diagnostic examinations on-site. Only three facilities (7%) interpreted screening examinations on line as they were performed. Approximately half of facilities reported using some type of double interpretation, although the methods of double interpretation and the fraction of cases double-interpreted varied widely across facilities. On average, approximately 15% of screening examinations and 10% of diagnostic examinations were reported as being double-interpreted. 
CONCLUSION Comparison of this survey's results with those collected a decade earlier indicates dramatic changes in the practice of mammography, including a clear distinction between screening and diagnostic mammography, batch interpretation of screening mammograms, and improved quality assurance and medical audit tools. Diffusion of new technologies such as CAD and digital mammography was not widespread. The methods of double-interpretation and the fraction of cases double-interpreted varied widely across study sites.
Affiliation(s)
- R Edward Hendrick
- Department of Radiology, Lynn Sage Comprehensive Breast Center, Northwestern University Feinberg School of Medicine, Galter Pavilion, 13th Floor, 251 E Huron St., Chicago, IL 60611, USA.
|
32
|
Abstract
Statistical measures are described that are used in diagnostic imaging for expressing observer agreement in regard to categorical data. The measures are used to characterize the reliability of imaging methods and the reproducibility of disease classifications and, occasionally with great care, as the surrogate for accuracy. The review concentrates on the chance-corrected indices, kappa and weighted kappa. Examples from the imaging literature illustrate the method of calculation and the effects of both disease prevalence and the number of rating categories. Other measures of agreement that are used less frequently, including multiple-rater kappa, are referenced and described briefly.
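The chance-corrected index described here can be sketched as follows. This is a minimal Python illustration of unweighted Cohen's kappa for two raters, kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance from the raters' marginal rates. The two tables are invented for illustration and are not taken from the review; they show the prevalence effect the abstract mentions, since both have 80% observed agreement but very different kappa values.

```python
def cohens_kappa(table):
    """Unweighted Cohen's kappa.

    table[i][j] is the number of cases that rater A placed in category i
    and rater B placed in category j. Undefined (division by zero) when
    chance agreement equals 1.
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    p_observed = sum(table[i][i] for i in range(k)) / n
    row_rates = [sum(table[i]) / n for i in range(k)]
    col_rates = [sum(table[i][j] for i in range(k)) / n for j in range(k)]
    p_chance = sum(row_rates[i] * col_rates[i] for i in range(k))
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical 2x2 tables (rows: rater A, columns: rater B).
balanced = [[40, 10], [10, 40]]  # categories used equally often
skewed = [[80, 10], [10, 0]]     # one category dominates

print(cohens_kappa(balanced))
print(cohens_kappa(skewed))
```

With the balanced table chance agreement is 0.5 and kappa is 0.6; with the skewed table chance agreement is 0.82, so the same 80% observed agreement yields a slightly negative kappa.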
Affiliation(s)
- Harold L Kundel
- Department of Radiology and MCP Hahnemann School of Public Health, University of Pennsylvania Medical Center, 3600 Market St, Suite 370, Philadelphia, PA 19104, USA.
|
33
|
Harvey SC, Geller B, Oppenheimer RG, Pinet M, Riddell L, Garra B. Increase in cancer detection and recall rates with independent double interpretation of screening mammography. AJR Am J Roentgenol 2003; 180:1461-7. [PMID: 12704069 DOI: 10.2214/ajr.180.5.1801461] [Citation(s) in RCA: 68] [Impact Index Per Article: 3.2] [Indexed: 11/18/2022]
Abstract
OBJECTIVE This study reports the increase in cancer detection that resulted from independent double interpretation of screening mammography. Although screening mammography is used to detect occult breast cancer, its sensitivity and specificity are limited. Double interpretation of screening mammograms is one proven method used to improve detection, with studies reporting a 5-15% increase in cancer detection. MATERIALS AND METHODS Two radiologists independently double-interpreted 25,369 screening mammograms performed from November 1998 to April 2000. The second reviewer could add but could not delete recalls. The subsequent additional diagnostic imaging was performed in the same way whether generated from the first or the second reviewer. The outcome of each case was determined. The cancer detection rate and sensitivity are reported. RESULTS Double interpretation of screening mammograms detected 143 breast malignancies. The second reviewer found nine (6.3%) of 143 cancers and all except one were stage 0 or I. The sensitivity increased from 74.4% to 79.4% with double interpretation. The second reviewer contributed 371 of the 3591 total recalls, increasing the absolute rate of recalls by 1.5% (371/25,369) and the relative rate by 11.5% (371/3220). Six hundred seventy-two total biopsies were performed; 38 were generated by the second interpretation. CONCLUSION The relative increase in cancer detection as a result of the second reviewer is 6.3%, similar to the 5-15% reported in the literature. All but one of the nine additional cancers detected were in the early stages.
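The percentages in this abstract can be reproduced from the counts it gives. The short Python sketch below does only that arithmetic; the variable names are illustrative. The count of cancers found by the first reader alone (134) is derived here as 143 minus 9 and is not stated in the abstract.

```python
# Counts quoted in the abstract.
screened = 25_369
total_recalls = 3_591
second_reader_recalls = 371
total_cancers = 143
second_reader_cancers = 9

# Derived counts (not stated directly in the abstract).
first_reader_recalls = total_recalls - second_reader_recalls
first_reader_cancers = total_cancers - second_reader_cancers

# Recall rates attributable to the second reader.
absolute_recall_increase = second_reader_recalls / screened
relative_recall_increase = second_reader_recalls / first_reader_recalls

# The abstract's 6.3% is the second reader's share of all detected cancers.
share_of_cancers = second_reader_cancers / total_cancers
# Measured against the first reader's cancers alone, the figure is slightly higher.
increase_over_first_reader = second_reader_cancers / first_reader_cancers

print(f"{absolute_recall_increase:.1%}")
print(f"{relative_recall_increase:.1%}")
print(f"{share_of_cancers:.1%}")
print(f"{increase_over_first_reader:.1%}")
```

The choice of denominator matters: the same nine cancers are 6.3% of all 143 detected cancers but about 6.7% relative to the 134 found by the first reader.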
Affiliation(s)
- Susan C Harvey
- Department of Radiology, Fletcher Allen Health Care, University of Vermont College of Medicine, UHC Campus, Burlington, VT 05401, USA
|
34
|
Karssemeijer N, Otten JDM, Verbeek ALM, Groenewoud JH, de Koning HJ, Hendriks JHCL, Holland R. Computer-aided detection versus independent double reading of masses on mammograms. Radiology 2003; 227:192-200. [PMID: 12616008 DOI: 10.1148/radiol.2271011962] [Citation(s) in RCA: 98] [Impact Index Per Article: 4.7] [Indexed: 11/11/2022]
Abstract
PURPOSE To evaluate the use of a computer-aided detection (CAD) system (designed for mammographic mass detection) to help improve mass interpretation and to compare CAD results with independent double-reading results. MATERIALS AND METHODS Screening mammograms from 500 cases were collected; 125 of these cases were screening-detected cancers, and 125 were interval cancers. Previously obtained screening mammograms (ie, prior mammograms) were available in all cases. All mammograms were analyzed by a CAD system, which detected mass regions and assigned a level of (cancer) suspicion to each mass. Ten experienced screening radiologists read the prior mammograms. For independent interpretation with CAD, the suspicion rating assigned to each finding by the radiologist was weighted with the CAD output at the area of the finding. CAD markers on areas that were not reported by the radiologist were not used. Independent double reading was implemented by using a rule to combine the levels of suspicion assigned to findings by two radiologists. Results were evaluated by using localized-response receiver operating characteristic analysis. RESULTS In a total of 141 cases, there was a visible abnormality at the location of the cancer on the prior mammogram, and 115 of these were classified as mass cases. For prior mammograms that depicted masses, the mean sensitivity of the radiologists, as averaged among the false-positive rates lower than 10%, was 39.4%; this increased by 7.0% with CAD and by 10.5% with double reading. Differences among single, double, and CAD readings were statistically significant (P <.001). CONCLUSION Although independent double reading yields the best detection performance, the presence and probability of CAD mass markers can improve mammogram interpretation.
Affiliation(s)
- Nico Karssemeijer
- Department of Radiology, University Medical Center Nijmegen, Geert Grooteplein 18, 6525 GA Nijmegen, The Netherlands.
|
35
|
Cole EB, Pisano ED, Kistner EO, Muller KE, Brown ME, Feig SA, Jong RA, Maidment ADA, Staiger MJ, Kuzmiak CM, Freimanis RI, Lesko N, Rosen EL, Walsh R, Williford M, Braeuning MP. Diagnostic accuracy of digital mammography in patients with dense breasts who underwent problem-solving mammography: effects of image processing and lesion type. Radiology 2003; 226:153-60. [PMID: 12511684 DOI: 10.1148/radiol.2261012024] [Citation(s) in RCA: 48] [Impact Index Per Article: 2.3] [Indexed: 11/11/2022]
Abstract
PURPOSE To determine effects of lesion type (calcification vs mass) and image processing on radiologist's performance for area under the receiver operating characteristic curve (AUC), sensitivity, and specificity for detection of masses and calcifications with digital mammography in women with mammographically dense breasts. MATERIALS AND METHODS This study included 201 women who underwent digital mammography at seven U.S. and Canadian medical centers. Three image-processing algorithms were applied to the digital images, which were acquired with Fischer, General Electric, and Lorad digital mammography units. Eighteen readers participated in the reader study (six readers per algorithm). Baseline values for reader performance with screen-film mammograms were obtained through the additional interpretation of 179 screen-film mammograms. A repeated-measures analysis of covariance allowing unequal slopes was used in each of the nine analyses (AUC, sensitivity, and specificity for each of three machines). Bonferroni correction was used. RESULTS Although lesion type did not affect the AUC or sensitivity for Fischer digital images, it did affect specificity (P =.0004). For the General Electric digital images, AUC, sensitivity, and specificity were not affected by lesion type. For Lorad digital images, the results strongly suggested that lesion type affected AUC and sensitivity (P <.0001). None of the three image-processing methods tested affected the AUC, sensitivity, or specificity for the Fischer, General Electric, or Lorad digital images. CONCLUSION Findings in this study indicate that radiologist's interpretation accuracy in interpreting digital mammograms depends on lesion type. Interpretation accuracy was not influenced by the image-processing method.
Affiliation(s)
- Elodia B Cole
- Department of Radiology, Lineberger Comprehensive Cancer Center, University of North Carolina School of Medicine, Chapel Hill, NC 27599-7510, USA
|
36
|
Mello-Thoms C, Dunn S, Nodine CF, Kundel HL, Weinstein SP. The perception of breast cancer: what differentiates missed from reported cancers in mammography? Acad Radiol 2002; 9:1004-12. [PMID: 12238541 DOI: 10.1016/s1076-6332(03)80475-0] [Citation(s) in RCA: 39] [Impact Index Per Article: 1.8] [Indexed: 11/17/2022]
Abstract
RATIONALE AND OBJECTIVES Mammographers map endogenous and exogenous factors into decisions whether to report the presence of a malignant finding in a mammogram case. Thus, to understand how image-based elements are translated into observer-based decisions, the authors used spatial frequency analysis to model the areas on mammograms that attracted visual attention, in addition to the areas localized as abnormal. MATERIALS AND METHODS Four mammographers read 40 two-view mammogram cases, of which 30 contained at least one malignant lesion visible on one or two views. Their eye positions were recorded during visual search. Once the mammographer felt confident enough to provide an initial impression of the case ("normal" or "abnormal"), the eye position monitoring was turned off and the mammographer indicated, with a mouse-controlled cursor, the location and nature of any malignant findings. Regions that elicited an overt or a covert response by the mammographers were extracted for processing by means of wavelet packets and artificial neural networks. RESULTS Different decision outcomes yielded different energy representations, in the spatial frequency domain. These energy representations were used by an artificial neural network to predict decision outcome in areas of interest, derived from eye position analysis, on mammograms from new cases. Individual trends were observed for each mammographer. CONCLUSION Spatial frequency representation of regions that attracted a given mammographer's visual attention may be useful for characterizing how that mammographer will respond to the visually selected areas.
Affiliation(s)
- Claudia Mello-Thoms
- Department of Radiology, University of Pittsburgh, Magee-Women's Hospital, PA 15213-3180, USA
|
37
|
Abstract
In summary, it is an exciting time in breast imaging with many tools being brought to bear on an ever more common problem. The challenge for this decade will be to develop optimal cost-effective strategies to use all the tools now available with minimal discomfort and disfigurement to the patient.
Affiliation(s)
- W A Berg
- Department of Radiology and Greenebaum Cancer Center, University of Maryland, 419 W Redwood St, Suite 110, Baltimore, MD 21201, USA
|