1
Olaisen S, Smistad E, Espeland T, Hu J, Pasdeloup D, Østvik A, Aakhus S, Rösner A, Malm S, Stylidis M, Holte E, Grenne B, Løvstakken L, Dalen H. Automatic measurements of left ventricular volumes and ejection fraction by artificial intelligence: clinical validation in real time and large databases. Eur Heart J Cardiovasc Imaging 2024; 25:383-395. [PMID: 37883712 PMCID: PMC11024810 DOI: 10.1093/ehjci/jead280] [Received: 07/19/2023] [Revised: 10/11/2023] [Accepted: 10/15/2023] [Indexed: 10/28/2023]
Abstract
AIMS Echocardiography is a cornerstone in cardiac imaging, and left ventricular (LV) ejection fraction (EF) is a key parameter for patient management. Recent advances in artificial intelligence (AI) have enabled fully automatic measurements of LV volumes and EF both during scanning and in stored recordings. The aim of this study was to evaluate the impact of implementing AI measurements on acquisition and processing time and test-retest reproducibility compared with standard clinical workflow, as well as to study the agreement with reference in large internal and external databases. METHODS AND RESULTS Fully automatic measurements of LV volumes and EF by a novel AI software were compared with manual measurements in the following clinical scenarios: (i) in real time use during scanning of 50 consecutive patients, (ii) in 40 subjects with repeated echocardiographic examinations and manual measurements by 4 readers, and (iii) in large internal and external research databases of 1881 and 849 subjects, respectively. Real-time AI measurements significantly reduced the total acquisition and processing time by 77% (median 5.3 min, P < 0.001) compared with standard clinical workflow. Test-retest reproducibility of AI measurements was superior in inter-observer scenarios and non-inferior in intra-observer scenarios. AI measurements showed good agreement with reference measurements both in real time and in large research databases. CONCLUSION The software reduced the time taken to perform and volumetrically analyse routine echocardiograms without a decrease in accuracy compared with experts.
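For reference, ejection fraction is derived from the end-diastolic volume (EDV) and end-systolic volume (ESV). The sketch below illustrates only that arithmetic; the function name and the example volumes are hypothetical and are not taken from the study or its software.

```python
def ejection_fraction(edv_ml: float, esv_ml: float) -> float:
    """Return LV ejection fraction in percent: (EDV - ESV) / EDV * 100."""
    if edv_ml <= 0:
        raise ValueError("end-diastolic volume must be positive")
    if not 0 <= esv_ml <= edv_ml:
        raise ValueError("end-systolic volume must lie between 0 and EDV")
    return (edv_ml - esv_ml) / edv_ml * 100.0


# Hypothetical volumes in millilitres, for illustration only.
ef = ejection_fraction(edv_ml=120.0, esv_ml=50.0)
print(f"EF = {ef:.1f}%")
```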
Affiliation(s)
- Sindre Olaisen
- Centre for Innovative Ultrasound Solutions, Department of Circulation and Medical Imaging, Norwegian University of Science and Technology, Prinsesse Kristinas Gate 3, 7030 Trondheim, Norway
- Erik Smistad
- Centre for Innovative Ultrasound Solutions, Department of Circulation and Medical Imaging, Norwegian University of Science and Technology, Prinsesse Kristinas Gate 3, 7030 Trondheim, Norway
- Medical Image Analysis, Health Research, SINTEF Digital, Trondheim, Norway
- Torvald Espeland
- Centre for Innovative Ultrasound Solutions, Department of Circulation and Medical Imaging, Norwegian University of Science and Technology, Prinsesse Kristinas Gate 3, 7030 Trondheim, Norway
- Clinic of Cardiology, St. Olavs Hospital, Trondheim University Hospital, Prinsesse Kristinas Gate 3, 7030 Trondheim, Norway
- Jieyu Hu
- Centre for Innovative Ultrasound Solutions, Department of Circulation and Medical Imaging, Norwegian University of Science and Technology, Prinsesse Kristinas Gate 3, 7030 Trondheim, Norway
- David Pasdeloup
- Centre for Innovative Ultrasound Solutions, Department of Circulation and Medical Imaging, Norwegian University of Science and Technology, Prinsesse Kristinas Gate 3, 7030 Trondheim, Norway
- Andreas Østvik
- Centre for Innovative Ultrasound Solutions, Department of Circulation and Medical Imaging, Norwegian University of Science and Technology, Prinsesse Kristinas Gate 3, 7030 Trondheim, Norway
- Medical Image Analysis, Health Research, SINTEF Digital, Trondheim, Norway
- Svend Aakhus
- Centre for Innovative Ultrasound Solutions, Department of Circulation and Medical Imaging, Norwegian University of Science and Technology, Prinsesse Kristinas Gate 3, 7030 Trondheim, Norway
- Clinic of Cardiology, St. Olavs Hospital, Trondheim University Hospital, Prinsesse Kristinas Gate 3, 7030 Trondheim, Norway
- Assami Rösner
- Department of Cardiology, University Hospital of North Norway, Tromsø, Norway
- Institute for Clinical Medicine, UiT, The Arctic University of Norway, Tromsø, Norway
- Siri Malm
- Institute for Clinical Medicine, UiT, The Arctic University of Norway, Tromsø, Norway
- Department of Cardiology, University Hospital of North Norway, UNN Harstad, Tromsø, Norway
- Michael Stylidis
- Department of Cardiology, University Hospital of North Norway, Tromsø, Norway
- Department of Community Medicine, UiT, The Arctic University of Norway, Tromsø, Norway
- Espen Holte
- Centre for Innovative Ultrasound Solutions, Department of Circulation and Medical Imaging, Norwegian University of Science and Technology, Prinsesse Kristinas Gate 3, 7030 Trondheim, Norway
- Clinic of Cardiology, St. Olavs Hospital, Trondheim University Hospital, Prinsesse Kristinas Gate 3, 7030 Trondheim, Norway
- Bjørnar Grenne
- Centre for Innovative Ultrasound Solutions, Department of Circulation and Medical Imaging, Norwegian University of Science and Technology, Prinsesse Kristinas Gate 3, 7030 Trondheim, Norway
- Clinic of Cardiology, St. Olavs Hospital, Trondheim University Hospital, Prinsesse Kristinas Gate 3, 7030 Trondheim, Norway
- Lasse Løvstakken
- Centre for Innovative Ultrasound Solutions, Department of Circulation and Medical Imaging, Norwegian University of Science and Technology, Prinsesse Kristinas Gate 3, 7030 Trondheim, Norway
- Havard Dalen
- Centre for Innovative Ultrasound Solutions, Department of Circulation and Medical Imaging, Norwegian University of Science and Technology, Prinsesse Kristinas Gate 3, 7030 Trondheim, Norway
- Clinic of Cardiology, St. Olavs Hospital, Trondheim University Hospital, Prinsesse Kristinas Gate 3, 7030 Trondheim, Norway
- Department of Medicine, Levanger Hospital, Nord-Trøndelag Hospital Trust, Kirkegata 2, 7600 Levanger, Norway
2
Moulton N, Abbasi M, Ahmad D, Burks A, Chenna P, Haas K, Loiselle A, Mekhaiel E, Pilli S, Sadoughi A, Lydon B, Patel T, Chen AC. Inter- and intra-observer variability of radial-endobronchial ultrasound image interpretation for peripheral pulmonary lesions. J Thorac Dis 2024; 16:450-456. [PMID: 38410559 PMCID: PMC10894385 DOI: 10.21037/jtd-23-998] [Received: 07/05/2023] [Accepted: 11/24/2023] [Indexed: 02/28/2024]
Abstract
Background Radial probe endobronchial ultrasound (R-EBUS) is often utilized in guided bronchoscopy for the diagnosis of peripheral pulmonary lesions. R-EBUS probe positioning has been shown to correlate with diagnostic yield, but overall diagnostic yield with this technology has been inconsistent across the published literature. Currently there is no standardization for R-EBUS image interpretation, which may result in variability in grading concentricity of lesions and subsequently procedure performance. This was a survey-based study evaluating variability among practicing pulmonologists in R-EBUS image interpretation. Methods R-EBUS images from peripheral bronchoscopy cases were sent to 10 practicing Interventional Pulmonologists at two different time points (baseline and 3 months). Participants were asked to grade the images as concentric, eccentric, or no image. Cohen's Kappa-coefficient was calculated for inter- and intra-observer variability. Results A total of 100 R-EBUS images were included in the survey. There was 100% participation with complete survey responses from all 10 participants. Overall kappa-statistic for inter-observer variability for Survey 1 and 2 was 0.496 and 0.477 respectively. Overall kappa-statistic for intra-observer variability between the two surveys was 0.803. Conclusions There is significant variability between pulmonologists when characterizing R-EBUS images. However, there is strong intra-rater agreement from each participant between surveys. A standardized approach and grading system for radial EBUS patterns may improve inter-observer variability in order to optimize our clinical use and research efforts in the field.
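As an illustration of the statistic named above, the following minimal sketch computes Cohen's kappa for two sets of categorical gradings, for example one reader's gradings at baseline and at 3 months. The function name and the example gradings are hypothetical; how the study pooled kappa across its 10 readers is not stated in the abstract and is not reproduced here.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two equally long sequences of categorical labels."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed agreement: fraction of items given the same label.
    p_observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement from the marginal label frequencies of each rating set.
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    labels = counts_a.keys() | counts_b.keys()
    p_expected = sum(counts_a[k] * counts_b[k] for k in labels) / n**2
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when only one label is ever used")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical gradings of eight images by one reader at two time points.
survey_1 = ["concentric", "concentric", "eccentric", "eccentric",
            "no image", "concentric", "eccentric", "no image"]
survey_2 = ["concentric", "eccentric", "eccentric", "eccentric",
            "no image", "concentric", "concentric", "no image"]
print(round(cohens_kappa(survey_1, survey_2), 3))
```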
Affiliation(s)
- Allen Burks
- University of North Carolina, Chapel Hill, NC, USA
- Praveen Chenna
- Washington University School of Medicine, St. Louis, MO, USA
- Kevin Haas
- University of Illinois at Chicago, Chicago, IL, USA
- Andrea Loiselle
- Washington University School of Medicine, St. Louis, MO, USA
- Brandt Lydon
- Washington University School of Medicine, St. Louis, MO, USA
- Tej Patel
- Washington University School of Medicine, St. Louis, MO, USA
3
Trumpy G, Andersen CF, Farup I, Elezabi O. Mapping Quantitative Observer Metamerism of Displays. J Imaging 2023; 9:227. [PMID: 37888334 PMCID: PMC10607170 DOI: 10.3390/jimaging9100227] [Received: 09/16/2023] [Revised: 10/15/2023] [Accepted: 10/16/2023] [Indexed: 10/28/2023]
Abstract
Observer metamerism (OM) is the name given to the variability between the color matches that individual observers consider accurate. The standard color imaging approach, which uses color-matching functions of a single representative observer, does not accurately represent every individual observer's perceptual properties. This paper investigates OM in color displays and proposes a quantitative assessment of the OM distribution across the chromaticity diagram. An OM metric is calculated from a database of individual LMS cone fundamentals and the spectral power distributions of the display's primaries. Additionally, a visualization method is suggested to map the distribution of OM across the display's color gamut. Through numerical assessment of OM using two distinct publicly available sets of individual observers' functions, the influence of the selected dataset on the intensity and distribution of OM has been underscored. The case study of digital cinema has been investigated, specifically the transition from xenon-arc to laser projectors. The resulting heatmaps represent the "topography" of OM for both types of projectors. The paper also presents color difference values, showing that achromatic highlights could be particularly prone to disagreements between observers in laser-based cinema theaters. Overall, this study provides valuable resources for display manufacturers and researchers, offering insights into observer metamerism and facilitating the development of improved display technologies.
Affiliation(s)
- Giorgio Trumpy
- Department of Computer Science, Norwegian University of Science and Technology, 2815 Gjøvik, Norway; (C.F.A.); (I.F.); (O.E.)
4
Jensen AL, Krogh AKH, Nielsen LN. Comparison of visual assessments of anisocytosis in canine blood smears and analyzer-calculated red blood cell distribution width. Front Vet Sci 2023; 10:1258857. [PMID: 37808118 PMCID: PMC10551143 DOI: 10.3389/fvets.2023.1258857] [Received: 07/14/2023] [Accepted: 08/28/2023] [Indexed: 10/10/2023]
Abstract
Red blood cell distribution width (RDW) and visual assessments of anisocytosis assess variability in erythrocyte size. Veterinary studies on the correlation between the two methods and on observer agreement are scarce. The objectives were to assess the correlation between RDW and the grading of anisocytosis by conventional microscopy of canine blood smears, and to assess intra- and inter-observer variation in assessing the degree of anisocytosis. The study included 100 canine blood samples on which blood smear examination and RDW measurement were performed. RDW was measured on the Advia 2120i analyzer. The degree of anisocytosis was based on a human grading scheme assessing the ratio between the size of the representative largest red blood cell and that of the representative smallest red blood cell (1+ if <2x, 2+ if 2-3x, 3+ if 3-4x, and 4+ if >4x). Three observers each assessed the blood smears by conventional microscopy twice, 3 weeks apart. The correlation was assessed for each observer on each occasion using Kendall's tau-b analysis. Intra-observer agreement was assessed using quadratically weighted kappa. Inter-observer agreement was assessed using free-marginal multi-rater kappa. Anisocytosis graded on blood smears correlated significantly with RDW values, with Kendall's tau-b ranging between 0.37 and 0.51 (p < 0.0001). Intra-observer agreement ranged from weak to moderate, with kappa coefficients of 0.58, 0.68, and 0.75 for the three observers. Inter-observer agreement was weak (kappa value 0.44). The weak to moderate observer agreement in the visual assessment of anisocytosis indicates that the more precise and more repeatable RDW measurement should be used for clinical decision-making.
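The quadratically weighted kappa used for intra-observer agreement penalises disagreements by the squared distance between ordinal grades. The sketch below shows one way to compute it, assuming the grades 1+ to 4+ are coded as integers 1 to 4; the function name and the example gradings are hypothetical and do not come from the study.

```python
def quadratic_weighted_kappa(first, second, n_categories):
    """Quadratically weighted kappa for ordinal grades coded 1..n_categories."""
    if len(first) != len(second) or not first:
        raise ValueError("gradings must be non-empty and of equal length")
    n = len(first)
    # Cross-tabulate the two gradings.
    observed = [[0] * n_categories for _ in range(n_categories)]
    for a, b in zip(first, second):
        observed[a - 1][b - 1] += 1
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(observed[i][j] for i in range(n_categories))
                  for j in range(n_categories)]
    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(n_categories):
        for j in range(n_categories):
            # Disagreement weight grows with the squared grade difference.
            weight = (i - j) ** 2 / (n_categories - 1) ** 2
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * row_totals[i] * col_totals[j] / n
    if weighted_expected == 0:
        raise ValueError("kappa is undefined when all grades are identical")
    return 1.0 - weighted_observed / weighted_expected


# Hypothetical grades for eight smears read twice by the same observer.
reading_1 = [1, 1, 2, 2, 3, 3, 4, 4]
reading_2 = [1, 2, 2, 2, 3, 4, 4, 3]
print(round(quadratic_weighted_kappa(reading_1, reading_2, 4), 3))
```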
Affiliation(s)
- Asger L. Jensen
- Department of Veterinary Clinical Sciences, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
5
Gomes Ataide EJ, Jabaraj MS, Schenke S, Petersen M, Haghghi S, Wuestemann J, Illanes A, Friebe M, Kreissl MC. Thyroid Nodule Detection and Region Estimation in Ultrasound Images: A Comparison between Physicians and an Automated Decision Support System Approach. Diagnostics (Basel) 2023; 13:2873. [PMID: 37761240 PMCID: PMC10529523 DOI: 10.3390/diagnostics13182873] [Received: 07/31/2023] [Revised: 08/27/2023] [Accepted: 09/05/2023] [Indexed: 09/29/2023]
Abstract
BACKGROUND Thyroid nodules are very common. In most cases, they are benign, but they can be malignant in a low percentage of cases. The accurate assessment of these nodules is critical to choosing the next diagnostic steps and potential treatment. Ultrasound (US) imaging, the primary modality for assessing these nodules, can lack objectivity due to varying expertise among physicians. This leads to observer variability, potentially affecting patient outcomes. PURPOSE This study aims to assess the potential of a Decision Support System (DSS) in reducing these variabilities for thyroid nodule detection and region estimation using US images, particularly in less experienced physicians. METHODS Three physicians with varying levels of experience evaluated thyroid nodules on US images, focusing on nodule detection and estimating cystic and solid regions. The outcomes were compared with those obtained from a DSS. Metrics such as classification match percentage and variance percentage were used to quantify differences. RESULTS Notable disparities exist between physician evaluations and the DSS assessments: the overall classification match percentage was just 19.2%. Individually, Physicians 1, 2, and 3 had match percentages of 57.6%, 42.3%, and 46.1% with the DSS, respectively. Variances in assessments highlight the subjectivity and observer variability based on physician experience levels. CONCLUSIONS The evident variability among physician evaluations underscores the need for supplementary decision-making tools. Given its consistency, the CAD offers potential as a reliable "second opinion" tool, minimizing human-induced variabilities in the critical diagnostic process of thyroid nodules using US images. Future integration of such systems could bolster diagnostic precision and improve patient outcomes.
Affiliation(s)
- Elmer Jeto Gomes Ataide
- Division of Nuclear Medicine, Department of Radiology and Nuclear Medicine, University Hospital Magdeburg, 39120 Magdeburg, Germany; (S.S.); (M.C.K.)
- Simone Schenke
- Division of Nuclear Medicine, Department of Radiology and Nuclear Medicine, University Hospital Magdeburg, 39120 Magdeburg, Germany; (S.S.); (M.C.K.)
- Department of Nuclear Medicine, Klinikum Bayreuth, 95445 Bayreuth, Germany
- Manuela Petersen
- Department of General, Visceral, Vascular and Transplant Surgery, University Hospital Magdeburg, 39120 Magdeburg, Germany
- Sarvar Haghghi
- Division of Nuclear Medicine, Department of Radiology and Nuclear Medicine, University Hospital Magdeburg, 39120 Magdeburg, Germany; (S.S.); (M.C.K.)
- Department of Nuclear Medicine, University Hospital Frankfurt, 60590 Frankfurt, Germany
- Jan Wuestemann
- Division of Nuclear Medicine, Department of Radiology and Nuclear Medicine, University Hospital Magdeburg, 39120 Magdeburg, Germany; (S.S.); (M.C.K.)
- Michael Friebe
- Surag Medical GmbH, 39118 Magdeburg, Germany
- Department of Biocybernetics and Biomedical Engineering, AGH University of Science and Technology, 30-059 Krakow, Poland
- Center for Innovation, Business Development and Entrepreneurship (CIBE), FOM University of Applied Science, 45127 Essen, Germany
- Michael C. Kreissl
- Division of Nuclear Medicine, Department of Radiology and Nuclear Medicine, University Hospital Magdeburg, 39120 Magdeburg, Germany; (S.S.); (M.C.K.)
- STIMULATE Research Campus, 39106 Magdeburg, Germany
- Center for Advanced Medical Engineering (CAME), Otto-von-Guericke University Magdeburg, 39106 Magdeburg, Germany
6
Boylan K, Kanth P, Delker D, Hazel MW, Boucher KM, Affolter K, Clayton F, Evason K, Jedrzkiewicz J, Pletneva M, Samowitz W, Swanson E, Bronner MP. Three Pathologic Criteria for Reproducible Diagnosis of Colonic Sessile Serrated Lesion Versus Hyperplastic Polyp. Hum Pathol 2023; 137:25-35. [PMID: 37044202 DOI: 10.1016/j.humpath.2023.04.002] [Received: 02/15/2023] [Revised: 04/05/2023] [Accepted: 04/06/2023] [Indexed: 04/14/2023]
Abstract
BACKGROUND Colonic sessile serrated lesions are thought to predispose to ∼30% of colonic adenocarcinomas. This increased risk, compared to benign hyperplastic polyps, makes their distinction vitally important. However, no gold standard exists to differentiate them, and wide observer variability is reported. METHODS To better distinguish these polyps, we investigated 94 serrated polyps (53 sessile serrated lesions and 41 hyperplastic polyps), using an easy-to-apply pathologic scoring system that combines, for the first time, three established distinguishing features: polyp morphology, location, and size. As an additional novel approach, polyp size was assessed by serrated biopsy number compared to endoscopic size. RNA expression profiling served as an additional biomarker. The considerable morphologic overlap across serrated polyps was quantitated for the first time. Interobserver variability was assessed by eight expert gastrointestinal pathologists. RESULTS By ROC analysis, polyp size by biopsy number performed best, followed by polyp location and morphology (areas under the curves [AUC] 85.9%, 81.2%, 65.9%, respectively). Optimal discrimination combined all three features (AUC 92.9%). For polyp size, biopsy number proved superior to endoscopic size (AUC 85.9% versus 55.2%, p=0.001). Interobserver variability analysis yielded the highest reported Fleiss and Kappa statistics (0.879) and percent agreement (96.8%), showing great promise toward improved diagnosis. CONCLUSIONS The proposed three-criteria pathologic system, combining size by biopsy number, location, and morphology, yields an improved, easy to use, and highly reproducible diagnostic approach for differentiating sessile serrated lesions and hyperplastic polyps.
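The area under the ROC curve reported for each criterion can be read as the probability that a randomly chosen sessile serrated lesion scores higher than a randomly chosen hyperplastic polyp, with ties counted as one half. The sketch below computes the AUC that way; the function name and the biopsy counts are hypothetical and are not data from the study.

```python
def roc_auc(positive_scores, negative_scores):
    """AUC as the fraction of positive/negative pairs ranked correctly."""
    if not positive_scores or not negative_scores:
        raise ValueError("both groups must contain at least one score")
    wins = 0.0
    for pos in positive_scores:
        for neg in negative_scores:
            if pos > neg:
                wins += 1.0      # correctly ordered pair
            elif pos == neg:
                wins += 0.5      # tie counts as half
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical serrated biopsy numbers per polyp, for illustration only.
sessile_serrated = [3, 4, 2, 5]
hyperplastic = [1, 2, 1, 3]
print(roc_auc(sessile_serrated, hyperplastic))
```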
7
Eagleson R, Joskowicz L. Verification, Evaluation, and Validation: Which, How & Why, in Medical Augmented Reality System Design. J Imaging 2023; 9:20. [PMID: 36826939 PMCID: PMC9965271 DOI: 10.3390/jimaging9020020] [Received: 11/15/2022] [Revised: 12/24/2022] [Accepted: 01/11/2023] [Indexed: 01/19/2023]
Abstract
This paper presents a discussion about the fundamental principles of Analysis of Augmented and Virtual Reality (AR/VR) Systems for Medical Imaging and Computer-Assisted Interventions. The three key concepts of Analysis (Verification, Evaluation, and Validation) are introduced, illustrated with examples of systems using AR/VR, and defined. The concepts of system specifications, measurement accuracy, uncertainty, and observer variability are defined and related to the analysis principles. The concepts are illustrated with examples of AR/VR working systems.
Affiliation(s)
- Roy Eagleson
- AI and Software Engineering Program, The University of Western Ontario, London, ON N6A 5B9, Canada
- Leo Joskowicz
- School of Computer Science and Engineering, Edmond J. Safra Campus, The Hebrew University of Jerusalem, Givat Ram, Jerusalem 9190401, Israel
8
Hillaert A, Stock E, Favril S, Duchateau L, Saunders JH, Vanderperren K. Intra- and Inter-Observer Variability of Quantitative Parameters Used in Contrast-Enhanced Ultrasound of Kidneys of Healthy Cats. Animals (Basel) 2022; 12. [PMID: 36552476 DOI: 10.3390/ani12243557] [Received: 10/29/2022] [Revised: 12/05/2022] [Accepted: 12/13/2022] [Indexed: 12/23/2022]
Abstract
Contrast-enhanced ultrasound (CEUS) is a non-invasive imaging technique which allows qualitative and quantitative assessment of tissue perfusion. Although CEUS offers numerous advantages, a major challenge remains the variability in tissue perfusion quantification. This study aimed to assess intra- and inter-observer variability for quantification of renal perfusion. Two observers with different levels of expertise performed a quantitative analysis of 36 renal CEUS studies, twice. The CEUS data were collected from 12 healthy cats at 3 different time points with a 7-day interval. The inter- and intra-observer agreement was assessed by the intraclass correlation coefficient. Within and between observers, a good agreement was demonstrated for intensity-related parameters in the cortex, medulla, and interlobular artery. For some parameters, ICCinter was considerably lower than ICCintra, mostly when the ROI encompassed the entire kidney or medulla. With the exception of time to peak (TTP) and mean transit time (mTTI), time-related and slope-related parameters showed poor agreement among observers. In conclusion, it may be advised against having the quantitative assessment of renal perfusion performed by different observers, especially if their experience levels differ. The cortical mTTI seemed to be the most appropriate parameter as it showed a favorable inter-observer agreement and inter-period agreement.
9
Hollestelle RVA, Hansen D, Hoeks SE, van Meeteren NLU, Stolker RJ, Maissan IM. Observer Variability as a Determinant of Measurement Error of Ultrasonographic Measurements of the Optic Nerve Sheath Diameter: A Systematic Review. J Emerg Med 2022; 63:200-211. [PMID: 36038435 DOI: 10.1016/j.jemermed.2022.04.014] [Received: 12/01/2021] [Revised: 03/13/2022] [Accepted: 04/23/2022] [Indexed: 06/15/2023]
Abstract
BACKGROUND Ultrasonographic measurements of the diameter of the sheath of the optic nerve can be used to assess intracranial pressure indirectly. These measurements come with measurement error. OBJECTIVE Our aim was to estimate observer's measurement error as a determinant of ultrasonographic measurement variability of the optic nerve sheath diameter. METHODS A systematic search of the literature was conducted in Embase, Medline, Web of Science, the Cochrane Central Register of Trials, and the first 200 articles of Google Scholar up to April 19, 2021. Inclusion criteria were the following: healthy adults, B-mode ultrasonography, and measurements 3 mm behind the retina. Studies were excluded if standard error of measurement could not be calculated. Nine studies featuring 389 participants (median 40; range 15-100) and 22 observers (median 2; range 1-4) were included. Standard error of measurement and minimal detectable differences were calculated to quantify observer variability. Quality and risk of bias were assessed with the Guidelines for Reporting Reliability and Agreement Studies. RESULTS The standard error of measurement of the intra- and interobserver variability had a range of 0.10-0.41 mm and 0.14-0.42 mm, respectively. Minimal detectable difference of a single observer was 0.28-1.1 mm. Minimal detectable difference of multiple observers (range 2-4) was 0.40-1.1 mm. Quality assessment showed room for methodological improvement of included studies. CONCLUSIONS The standard errors of measurement and minimal detectable differences of ultrasonographic measurements of the optic nerve sheath diameter found in this review with healthy participants indicate caution should be urged when interpreting results acquired with this measurement method in clinical context.
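The abstract does not state how the minimal detectable difference (MDD) was derived from the standard error of measurement (SEM). A commonly used 95% convention is MDD = 1.96 × √2 × SEM, and the sketch below assumes that convention; the function name is hypothetical. Under this assumption, the intra-observer SEM range of 0.10-0.41 mm maps to roughly 0.28-1.1 mm, which is close to the single-observer MDD range reported above.

```python
import math


def minimal_detectable_difference(sem_mm: float, z: float = 1.96) -> float:
    """MDD at the confidence level implied by z, assuming MDD = z * sqrt(2) * SEM."""
    if sem_mm < 0:
        raise ValueError("SEM must be non-negative")
    # sqrt(2) accounts for measurement error in both of the two measurements compared.
    return z * math.sqrt(2.0) * sem_mm


# Intra-observer SEM range quoted in the abstract (mm).
for sem in (0.10, 0.41):
    print(f"SEM {sem:.2f} mm -> MDD {minimal_detectable_difference(sem):.2f} mm")
```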
Affiliation(s)
- Daniel Hansen
- Department of Anesthesiology, Erasmus Medical Centre, Rotterdam, The Netherlands
- Sanne E Hoeks
- Department of Anesthesiology, Erasmus Medical Centre, Rotterdam, The Netherlands
- Robert J Stolker
- Department of Anesthesiology, Erasmus Medical Centre, Rotterdam, The Netherlands
- Iscander M Maissan
- Department of Anesthesiology, Erasmus Medical Centre, Rotterdam, The Netherlands
10
Couzins M, Forbes S, Vigneswaran G, Mitra I, Rutherford EE. Ultrasound grading of thyroid nodules using the BTA U-scoring guidelines - Is there evidence of intra- and inter-observer variability? Ultrasound 2020; 29:100-105. [PMID: 33995556 DOI: 10.1177/1742271x20971323] [Received: 08/06/2020] [Accepted: 10/05/2020] [Indexed: 11/15/2022]
Abstract
Introduction U-score ultrasound classification (graded U1-U5) is widely used to grade thyroid nodules based on benign and malignant sonographic features. It is well established that ultrasound is an operator-dependent imaging modality and thus more susceptible to subjective variances between operators when using imaging-based scoring systems. We aimed to assess whether there is any intra- or interobserver variability when U-scoring thyroid nodules and whether previous thyroid ultrasound experience has an effect on this variability. Methods A total of 14 ultrasound operators were identified (five experienced thyroid operators, five with intermediate experience and four with no experience) and were asked to U-score images from 20 thyroid cases shown as a single projection, with and without Doppler flow. The cases were subsequently rescored by the 14 operators after six weeks. The first and second round U-scores for the three operator groups were then analysed using Fleiss' kappa to assess interobserver variability and Cochran's Q test to determine any intraobserver variability. Results We found no significant interobserver variability on combined assessment of all operators with fair agreement in round 1 (Fleiss' kappa = 0.30, p <0.0001) and slight agreement in round 2 (Fleiss' kappa = 0.19, p < 0.0001). Cochran's Q test revealed no significant intraobserver variability in all 14 operators between round 1 and round 2 (all p>0.05). Conclusions We found no statistically significant inter- or intraobserver variability in the U-scoring of thyroid nodules between all participants reinforcing the validity of this scoring method in clinical practice, allaying concerns regarding potential subjective biases in reporting.
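Fleiss' kappa generalises chance-corrected agreement to more than two raters. The minimal sketch below computes it from a table in which each row is a case and each column counts how many operators assigned that category; the function name and the example table are hypothetical and are not the study's data.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa; counts[i][j] = raters assigning subject i to category j."""
    n_subjects = len(counts)
    if n_subjects == 0:
        raise ValueError("at least one subject is required")
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("every subject needs the same number (>= 2) of ratings")
    n_categories = len(counts[0])
    total_ratings = n_subjects * n_raters
    # Overall proportion of ratings falling in each category.
    category_share = [sum(row[j] for row in counts) / total_ratings
                      for j in range(n_categories)]
    # Per-subject agreement: share of rater pairs that agree.
    per_subject = [(sum(c * c for c in row) - n_raters)
                   / (n_raters * (n_raters - 1)) for row in counts]
    mean_agreement = sum(per_subject) / n_subjects
    chance_agreement = sum(p * p for p in category_share)
    if chance_agreement == 1.0:
        raise ValueError("kappa is undefined when only one category is used")
    return (mean_agreement - chance_agreement) / (1.0 - chance_agreement)


# Hypothetical table: 4 cases, 4 operators, 3 score categories.
table = [[4, 0, 0], [2, 2, 0], [0, 3, 1], [1, 1, 2]]
print(round(fleiss_kappa(table), 3))
```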
Affiliation(s)
- Michael Couzins
- University Hospital Southampton NHS Foundation Trust, Southampton, UK
- Stuart Forbes
- University Hospital Southampton NHS Foundation Trust, Southampton, UK
- Indu Mitra
- Chelsea and Westminster NHS Hospital, London, UK
11
Vernooij J, de Munck F, van Nieuwenhuizen E, Webb E, Jonker H, Vos P, Holm D. Reliability of pelvimetry is affected by observer experience but not by breed and sex: A cross-sectional study in beef cattle. Reprod Domest Anim 2020; 55:1592-1598. [PMID: 32885509 PMCID: PMC7756854 DOI: 10.1111/rda.13814] [Received: 05/26/2020] [Accepted: 08/27/2020] [Indexed: 11/29/2022]
Abstract
Pelvis size plays an important role in preventing dystocia in cattle caused by foeto-maternal disproportion, most commonly in primiparous females. Reproducibility and repeatability are two important aspects of the reliability of the measurements used in selecting cattle for culling. Pelvic measures were taken with a Rice pelvimeter from 224 young cattle (180 females and 44 males) of four beef breeds in South Africa. One experienced and two inexperienced observers each measured pelvic height and width twice. The proportion of measurements within 0.5 cm of the experienced observer's first measurement on the same animal was around 80% for the experienced observer and, for the inexperienced observers, around 50% for pelvic height and around 60% for pelvic width. Breed and sex do not affect the reliability of pelvimetry by an experienced observer. Under- and overestimation of pelvis size were observed in inexperienced observers and seem to be unrelated to breed and sex.
Affiliation(s)
- Johannes Vernooij
- Department of Population Health Sciences, Section Farm Animal Health, Faculty of Veterinary Medicine, Utrecht University, Utrecht, The Netherlands
- Florine de Munck
- Department of Population Health Sciences, Section Farm Animal Health, Faculty of Veterinary Medicine, Utrecht University, Utrecht, The Netherlands
- Evelien van Nieuwenhuizen
- Department of Population Health Sciences, Section Farm Animal Health, Faculty of Veterinary Medicine, Utrecht University, Utrecht, The Netherlands
| | - Edward Webb
- Department of Animal and Wildlife Sciences, Faculty of Natural and Agricultural Sciences, University of Pretoria, Pretoria, South Africa
| | - Herman Jonker
- Department of Population Health Sciences, Section Farm Animal Health, Faculty of Veterinary Medicine, Utrecht University, Utrecht, The Netherlands
| | - Peter Vos
- Department of Population Health Sciences, Section Farm Animal Health, Faculty of Veterinary Medicine, Utrecht University, Utrecht, The Netherlands
| | - Dietmar Holm
- Department of Production Animal Studies, Faculty of Veterinary Science, University of Pretoria, Onderstepoort, South Africa
| |
Collapse
|
12
|
Rasmussen CK, Van den Bosch T, Exacoustos C, Manegold-Brauer G, Benacerraf BR, Froyman W, Landolfo C, Condorelli M, Egekvist AG, Josefsson H, Leone FPG, Jokubkiene L, Zannoni L, Epstein E, Installé A, Dueholm M. Intra- and Inter-Rater Agreement Describing Myometrial Lesions Using Morphologic Uterus Sonographic Assessment: A Pilot Study. J Ultrasound Med 2019; 38:2673-2683. [PMID: 30801764 DOI: 10.1002/jum.14971] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 09/03/2018] [Revised: 01/20/2019] [Accepted: 01/27/2019] [Indexed: 05/14/2023]
Abstract
OBJECTIVES To evaluate the intra- and inter-rater agreement for myometrial lesions using Morphologic Uterus Sonographic Assessment terminology. METHODS Thirteen raters with high (n = 6) or medium experience (n = 7) assessed 30 3-dimensional ultrasound clips with (n = 20) and without (n = 10) benign myometrial lesions. Myometrial lesions were reported as poorly or well defined and then systematically evaluated for the presence of individual features. The clips were blindly assessed twice (at a 2-month interval). Intra- and inter-rater agreements were calculated with κ statistics. RESULTS The reporting of poorly defined lesions reached moderate intra-rater agreement (κ = 0.49 [high experience] and 0.47 [medium experience]) and poor inter-rater agreement (κ = 0.39 [high experience] and 0.25 [medium experience]). The reporting of well-defined lesions reached good to very good intra-rater agreement (κ = 0.73 [high experience] and 0.82 [medium experience]) and good inter-rater agreement (κ = 0.75 [high experience] and 0.63 [medium experience]). Most individual features associated with ill-defined lesions reached moderate intra- and inter-rater agreement among highly experienced raters (κ = 0.41-0.60). The least reproducible features were myometrial cysts, hyperechoic islands, subendometrial lines and buds, and translesional flow (κ = 0.11-0.34). Most individual features associated with well-defined lesions reached moderate to good intra- and inter-rater agreement among all observers (κ = 0.41-0.80). The least reproducible features were a serosal contour, asymmetry, a hyperechoic rim, and fan-shaped shadows (κ = 0.00-0.35). CONCLUSIONS The reporting of well-defined lesions showed excellent agreement, whereas the agreement for poorly defined lesions was low, even among highly experienced raters. The agreement on identifying individual features varied, especially for features associated with ill-defined lesions. 
Guidelines on minimum requirements for features associated with ill-defined lesions to be interpreted as poorly defined lesions may improve agreement.
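The abstract above reports agreement as κ statistics without naming the variant. As a minimal sketch, assuming Cohen's kappa for one rater classifying the same clips in two sessions, chance-corrected agreement could be computed as follows. The function name, category labels, and ratings are hypothetical and are not taken from the study.

```python
from collections import Counter


def cohen_kappa(first, second):
    """Cohen's kappa for two equally long sequences of categorical ratings."""
    if len(first) != len(second) or not first:
        raise ValueError("need two non-empty rating lists of equal length")
    n = len(first)
    # Observed agreement: fraction of items given the same category twice.
    observed = sum(a == b for a, b in zip(first, second)) / n
    # Chance agreement: product of the marginal category frequencies.
    counts_a, counts_b = Counter(first), Counter(second)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both sessions used one identical category throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical ratings: one rater, ten clips, two sessions.
session_1 = ["well", "well", "poor", "none", "well",
             "poor", "none", "none", "well", "poor"]
session_2 = ["well", "well", "poor", "none", "poor",
             "poor", "none", "well", "well", "poor"]
kappa = cohen_kappa(session_1, session_2)
print(f"kappa = {kappa:.2f}")
```

With thirteen raters, a multi-rater statistic such as Fleiss' kappa may be more appropriate for the inter-rater figures; the pairwise form here is only meant to illustrate the chance correction.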
Affiliation(s)
- Thierry Van den Bosch
  - Department of Obstetrics and Gynecology, University Hospital Leuven, Leuven, Belgium
- Caterina Exacoustos
  - Department of Biomedicine and Prevention, Obstetrics and Gynecology Clinic, Università Degli Studi di Roma Tor Vergata, Rome, Italy
- Gwendolin Manegold-Brauer
  - Division of Gynecologic and Prenatal Ultrasound, Department of Obstetrics and Gynecology, University of Basel, Basel, Switzerland
- Beryl R Benacerraf
  - Departments of Radiology and Obstetrics and Gynecology, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, USA
- Wouter Froyman
  - Department of Obstetrics and Gynecology, University Hospital Leuven, Leuven, Belgium
  - Department of Development and Regeneration, Katholieke Universiteit Leuven, Leuven, Belgium
- Chiara Landolfo
  - Department of Development and Regeneration, Katholieke Universiteit Leuven, Leuven, Belgium
  - Queen Charlotte's and Chelsea Hospital, Imperial College, London, England
- Anne G Egekvist
  - Department of Obstetrics and Gynecology, Aarhus University Hospital, Aarhus, Denmark
- Hampus Josefsson
  - Department of Clinical Science and Education, Södersjukhuset, and Department of Women's and Children's Health, Karolinska Institutet, Stockholm, Sweden
- Ligita Jokubkiene
  - Department of Obstetrics and Gynecology, Skaane University Hospital, Malmo, Sweden
- Letizia Zannoni
  - Department of Obstetrics Gynecology, Sant'Orsola Malpighi Hospital, Bologna, Italy
- Elisabeth Epstein
  - Department of Clinical Science and Education, Södersjukhuset, and Department of Women's and Children's Health, Karolinska Institutet, Stockholm, Sweden
- Arnaud Installé
  - Department of Obstetrics and Gynecology, University Hospital Leuven, Leuven, Belgium
- Margit Dueholm
  - Department of Obstetrics and Gynecology, Aarhus University Hospital, Aarhus, Denmark

13
Cabitza F, Locoro A, Alderighi C, Rasoini R, Compagnone D, Berjano P. The elephant in the record: On the multiplicity of data recording work. Health Informatics J 2019; 25:475-490. [PMID: 30666882 DOI: 10.1177/1460458218824705] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Indexed: 11/15/2022]
Abstract
This article focuses on the production side of clinical data work, or data recording work, and in particular on its multiplicity in terms of data variability. We report the findings from two case studies aimed at assessing the multiplicity that can be observed when the same medical phenomenon is recorded by multiple competent experts, yet the recorded data enable the knowledgeable management of illness trajectories. This multiplicity is often framed in terms of the latent unreliability of medical data and then treated as a problem to solve. We argue instead that practitioners in the health informatics field must gain a greater awareness of the natural variability of data inscribing work, assess it, and design solutions that allow actors on both sides of clinical data work (that is, production and care, as well as the primary and secondary uses of data) to aptly inform each other's practices.
Affiliation(s)
- Federico Cabitza
  - IRCCS Istituto Ortopedico Galeazzi, Italy; University of Milano-Bicocca, Italy

14
Demchig D, Mello-Thoms C, Lee W, Khurelsukh K, Ramish A, Brennan P. Observer Variability in Breast Cancer Diagnosis between Countries with and without Breast Screening. Acad Radiol 2019; 26:62-68. [PMID: 29580792 DOI: 10.1016/j.acra.2018.03.003] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 01/16/2018] [Revised: 02/20/2018] [Accepted: 03/01/2018] [Indexed: 11/25/2022]
Abstract
RATIONALE AND OBJECTIVES Image reporting is a vital component of patient management and depends on individual radiologists' performance. Our objective was to explore mammographic diagnostic efficacy in a country where breast cancer screening does not exist. MATERIALS AND METHODS Two mammographic test sets were used: a typical screening (TS) and a high-difficulty (HD) test set. Nonscreening (NS) radiologists (n = 11) read both test sets, while 52 and 49 screening radiologists read the TS and HD test sets, respectively. The screening radiologists were classified into two groups: a less experienced (LE) group with ≤5 years' experience and a more experienced (ME) group with ≥5 years' experience. A Kruskal-Wallis test and a Tukey-Kramer post hoc test were used to compare reading performance among reader groups, and the Wilcoxon matched pairs test was used to compare the TS and HD test sets for the NS radiologists. RESULTS Across the three reader groups, there were significant differences in case sensitivity (χ2 [2] = 9.4, P = .008), specificity (χ2 [2] = 10.3, P = .006), location sensitivity (χ2 [2] = 19.8, P < .001), receiver operating characteristics, area under the curve (χ2 [2] = 19.7, P < .001) and jack-knife free-response receiver operating characteristics (JAFROCs) (χ2 [2] = 18.1, P < .001). NS performance for all measured scores was significantly lower than that of the ME readers (P < .006), while only location sensitivity was lower (χ2 [2] = 17.5, P = .026) for the NS compared to the LE group. No other significant differences were observed. CONCLUSION Large variations in mammographic performance exist between radiologists from screening and nonscreening countries.

15
van Dijk LJD, van der Wel T, van Noord D, Moelker A, Verhagen HJM, Nieboer D, Kuipers EJ, Bruno MJ. Intraobserver and interobserver reliability of visible light spectroscopy during upper gastrointestinal endoscopy. Expert Rev Med Devices 2018; 15:605-610. [PMID: 29973094 DOI: 10.1080/17434440.2018.1496818] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 10/28/2022]
Abstract
BACKGROUND Visible light spectroscopy (VLS) performed during upper gastrointestinal endoscopy allows measuring mucosal oxygen saturation levels to determine gastrointestinal ischemia. We aimed to determine the observer variability of VLS. METHODS This is a single-center prospective study of 24 patients planned for usual care upper endoscopy. To test intraobserver variability, VLS measurements were performed in duplicate by a single endoscopist in 12 patients. For interobserver variability analysis, in another 12 patients VLS measurements were repeatedly and independently performed by two endoscopists in the same patient during the same endoscopy session. Observer variability was assessed with intraclass correlation coefficient (ICC) and clinical disagreement defined as >5% difference between first and second set of VLS measurements. RESULTS The intraobserver reliability was excellent (ICC antrum 0.77, duodenal bulb 0.81 and duodenum 0.84) with clinical disagreement only in antrum (3% of all intraobserver measurements). The interobserver reliability was good for the duodenal bulb (ICC 0.70) without clinical disagreement; however, interobserver reliability was fair for duodenum (ICC 0.49) and antrum (ICC 0.56) with clinical disagreement occurring in 11% of all interobserver measurements. CONCLUSIONS The observer reliability of VLS is fair to good with intraobserver reliability being better than interobserver reliability. This supports the use of VLS for detection of gastrointestinal ischemia.
Affiliation(s)
- Louisa J D van Dijk
  - Department of Gastroenterology and Hepatology, Erasmus MC University Medical Center, Rotterdam, The Netherlands; Department of Radiology, Erasmus MC University Medical Center, Rotterdam, The Netherlands
- Twan van der Wel
  - Department of Gastroenterology and Hepatology, Erasmus MC University Medical Center, Rotterdam, The Netherlands
- Desirée van Noord
  - Department of Gastroenterology and Hepatology, Erasmus MC University Medical Center, Rotterdam, The Netherlands; Department of Gastroenterology and Hepatology, Franciscus Gasthuis & Vlietland, Rotterdam, The Netherlands
- Adriaan Moelker
  - Department of Radiology, Erasmus MC University Medical Center, Rotterdam, The Netherlands
- Hence J M Verhagen
  - Department of Vascular Surgery, Erasmus MC University Medical Center, Rotterdam, The Netherlands
- Daan Nieboer
  - Department of Public Health, Erasmus MC University Medical Center, Rotterdam, The Netherlands
- Ernst J Kuipers
  - Department of Gastroenterology and Hepatology, Erasmus MC University Medical Center, Rotterdam, The Netherlands
- Marco J Bruno
  - Department of Gastroenterology and Hepatology, Erasmus MC University Medical Center, Rotterdam, The Netherlands

16
Elder DE, Piepkorn MW, Barnhill RL, Longton GM, Nelson HD, Knezevich SR, Pepe MS, Carney PA, Titus LJ, Onega T, Tosteson ANA, Weinstock MA, Elmore JG. Pathologist characteristics associated with accuracy and reproducibility of melanocytic skin lesion interpretation. J Am Acad Dermatol 2018; 79:52-59.e5. [PMID: 29524584 PMCID: PMC6016831 DOI: 10.1016/j.jaad.2018.02.070] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Received: 09/18/2017] [Revised: 01/31/2018] [Accepted: 02/21/2018] [Indexed: 10/17/2022]
Abstract
BACKGROUND Diagnostic interpretations of melanocytic skin lesions vary widely among pathologists, yet the underlying reasons remain unclear. OBJECTIVE Identify pathologist characteristics associated with rates of accuracy and reproducibility. METHODS Pathologists independently interpreted the same set of biopsy specimens from melanocytic lesions on 2 occasions. Diagnoses were categorized into 1 of 5 classes according to the Melanocytic Pathology Assessment Tool and Hierarchy for Diagnosis system. Reproducibility was determined by pathologists' concordance of diagnoses across 2 occasions. Accuracy was defined by concordance with a consensus reference standard. Associations of pathologist characteristics with reproducibility and accuracy were assessed individually and in multivariable logistic regression models. RESULTS Rates of diagnostic reproducibility and accuracy were highest among pathologists with board certification and/or fellowship training in dermatopathology and in those with 5 or more years of experience. In addition, accuracy was high among pathologists with a higher proportion of melanocytic lesions in their caseload composition and higher volume of melanocytic lesions. LIMITATIONS Data gathered in a test set situation by using a classification tool not currently in clinical use. CONCLUSION Diagnoses are more accurate among pathologists with specialty training and those with more experience interpreting melanocytic lesions. These findings support the practice of referring difficult cases to more experienced pathologists to improve diagnostic accuracy, although the impact of these referrals on patient outcomes requires additional research.
Affiliation(s)
- David E Elder
  - Department of Pathology and Laboratory Medicine, Hospital of the University of Pennsylvania, Philadelphia, Pennsylvania
- Michael W Piepkorn
  - Division of Dermatology, Department of Medicine, University of Washington School of Medicine, Seattle, Washington; Dermatopathology Northwest, Bellevue, Washington
- Raymond L Barnhill
  - Department of Pathology, Institut Curie, Paris, France; University of Paris Descartes, Paris, France
- Gary M Longton
  - Program in Biostatistics and Biomathematics, Fred Hutchinson Cancer Research Center, Seattle, Washington
- Heidi D Nelson
  - Department of Medical Informatics and Clinical Epidemiology, Oregon Health and Science University, Portland, Oregon; Department of Medicine, Oregon Health and Science University, Portland, Oregon
- Margaret S Pepe
  - Program in Biostatistics and Biomathematics, Fred Hutchinson Cancer Research Center, Seattle, Washington
- Patricia A Carney
  - Department of Family Medicine, Oregon Health and Science University, Portland, Oregon
- Linda J Titus
  - Department of Epidemiology, Geisel School of Medicine at Dartmouth, Norris Cotton Cancer Center, Lebanon, New Hampshire; Department of Pediatrics, Geisel School of Medicine at Dartmouth, Norris Cotton Cancer Center, Lebanon, New Hampshire
- Tracy Onega
  - Department of Biomedical Data Science, Department of Epidemiology, Norris Cotton Cancer Center, Lebanon, New Hampshire; Geisel School of Medicine at Dartmouth, The Dartmouth Institute for Health Policy and Clinical Practice, Lebanon, New Hampshire
- Anna N A Tosteson
  - Department of Medicine, The Dartmouth Institute for Health Policy and Clinical Practice, Geisel School of Medicine at Dartmouth, Norris Cotton Cancer Center, Lebanon, New Hampshire; Department of Community and Family Medicine, The Dartmouth Institute for Health Policy and Clinical Practice, Geisel School of Medicine at Dartmouth, Norris Cotton Cancer Center, Lebanon, New Hampshire
- Martin A Weinstock
  - Center for Dermatoepidemiology, VA Medical Center, Providence Department of Dermatology, Rhode Island Hospital, Providence, Rhode Island; Department of Dermatology, Brown University, Providence, Rhode Island; Department of Epidemiology, Brown University, Providence, Rhode Island
- Joann G Elmore
  - Department of Medicine, University of Washington School of Medicine, Seattle, Washington

17
Ng CS, Wei W, Ghosh P, Anderson E, Herron DH, Chandler AG. Observer Variability in CT Perfusion Parameters in Primary and Metastatic Tumors in the Lung. Technol Cancer Res Treat 2018; 17:1533034618769767. [PMID: 29681221 PMCID: PMC5949952 DOI: 10.1177/1533034618769767] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 11/16/2022] Open
Abstract
PURPOSE Evaluate observer variability in computed tomography perfusion measurements in lung tumors and assess the relative contributions of individual factors to overall variability. MATERIALS AND METHODS Four observers independently delineated tumor regions of interest and defined arterial input function regions of interest on each of 4 contiguous slice levels of computed tomography perfusion images (arterial input function level), in 12 computed tomography perfusion data sets containing lung tumors (>2.5 cm size), on 2 separate occasions. Computed tomography perfusion parameters (blood flow, blood volume, mean transit time, and permeability surface area product) for tumor volumes of interest were computed for all combinations of these factors, totaling up to 1024 combinations per patient. Overall, interobserver, and intraobserver variability were assessed by within-patient coefficient of variation (wCV), variance components analyses, and intraclass correlation. RESULTS Overall observer wCVs for tumor blood flow, blood volume, mean transit time, and permeability surface area product were 20.3%, 11.9%, 6.3%, and 31.7%, and intraclass correlations were 0.94, 0.91, 0.82, and 0.72, respectively. Interobserver tumor volume of interest and arterial input function level were the highest contributors to overall variance for blood flow, blood volume, and mean transit time. Overall intraobserver wCVs for blood flow, blood volume, mean transit time, and permeability surface area product (4.3%, 2.4%, 0.9%, and 3.1%) were smaller than interobserver wCVs (9.5%, 5.6%, 1.6%, and 7.0%), respectively. CONCLUSION The largest contributors to observer variability were interobserver tumor volume of interest and arterial input function level.
Overall variability in computed tomography perfusion studies can potentially be minimized by using a single observer and a consistent level for arterial input function, which would be important considerations in longitudinal and multicenter studies. Methods to reliably define arterial input function and delineate tumor volumes would help to reduce variability in estimations of computed tomography perfusion parameter values.
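The within-patient coefficient of variation used above has more than one definition in the literature, and the abstract does not say which was applied. The sketch below assumes one common form, the root mean square of per-patient coefficients of variation (sample SD divided by mean). The function name and the blood-flow values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def within_patient_cv(measurements_by_patient):
    """Root-mean-square of per-patient CVs (sample SD / mean), as a fraction."""
    squared_cvs = []
    for values in measurements_by_patient:
        if len(values) < 2:
            raise ValueError("each patient needs at least two measurements")
        # Squared CV for this patient: sample variance over squared mean.
        squared_cvs.append(variance(values) / mean(values) ** 2)
    return sqrt(mean(squared_cvs))


# Hypothetical repeated blood-flow estimates for two patients.
blood_flow = [[50.0, 55.0, 45.0], [80.0, 88.0, 72.0]]
wcv = within_patient_cv(blood_flow)
print(f"wCV = {100 * wcv:.1f}%")
```

Separating interobserver from intraobserver contributions, as the study does, would additionally require a variance components model, which this sketch does not attempt.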
Affiliation(s)
- Chaan S Ng
  - Department of Radiology, University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Wei Wei
  - Department of Biostatistics, University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Payel Ghosh
  - Department of Radiology, University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Ella Anderson
  - Department of Radiology, University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Delise H Herron
  - Department of Radiology, University of Texas MD Anderson Cancer Center, Houston, TX, USA

18
Chaturvedi A, Whitnah J, Maki JH, Baran T, Mitsumori LM. Horizontal Long Axis Imaging Plane for Evaluation of Right Ventricular Function on Cardiac Magnetic Resonance Imaging. J Clin Imaging Sci 2017; 6:52. [PMID: 28123842 PMCID: PMC5209858 DOI: 10.4103/2156-7514.197076] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 10/14/2016] [Accepted: 11/14/2016] [Indexed: 11/10/2022] Open
Abstract
Purpose: The purpose of this study was to evaluate a horizontal long axis (HLA) magnetic resonance imaging (MRI) plane aligned to the long axis of the right ventricular (RV) cavity for functional analysis by comparing the measurement variability and time required for the analysis with that using a short-axis (SAX) image orientation. Materials and Methods: Thirty-four cardiac MRI exams with cine balanced steady-state free precession image stacks in both the SAX and the HLA of the RV (RHLA) were evaluated. Two reviewers independently traced RV endocardial borders on each image of the cine stacks. The time required to complete each set of traces was recorded, and the RV end-diastolic volume, end-systolic volume, and ejection fraction were calculated. Analysis times and RV measurements were compared between the two orientations. Results: Analysis time for each reviewer was significantly shorter for the RHLA stack (reviewer 1 = 6.4 ± 1.8 min, reviewer 2 = 6.0 ± 3.3 min) than for the SAX stack (7.5 ± 2.1 and 6.9 ± 3.6 min, respectively; P < 0.002). Bland–Altman analysis revealed lower mean differences, limits of agreement, and coefficients of variation for RV measurements obtained with the RHLA stack. Conclusions: RV functional analysis using a RHLA stack resulted in shorter analysis times and lower measurement variability than for a SAX stack orientation.
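The Bland–Altman comparison mentioned above can be sketched in a few lines. This is a minimal illustration, assuming the usual definition of the 95% limits of agreement as the mean difference plus or minus 1.96 sample standard deviations of the paired differences. The function name and the ejection fraction values are hypothetical and not taken from the study.

```python
from statistics import mean, stdev


def bland_altman(reader_1, reader_2):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    if len(reader_1) != len(reader_2) or len(reader_1) < 2:
        raise ValueError("need at least two paired measurements")
    diffs = [a - b for a, b in zip(reader_1, reader_2)]
    bias = mean(diffs)                # mean difference between readers
    half_width = 1.96 * stdev(diffs)  # assumes roughly normal differences
    return bias, bias - half_width, bias + half_width


# Hypothetical RV ejection fractions (%) traced by two reviewers.
reviewer_1 = [52.0, 48.0, 60.0, 55.0]
reviewer_2 = [50.0, 49.0, 57.0, 56.0]
bias, lower, upper = bland_altman(reviewer_1, reviewer_2)
print(f"bias = {bias:.2f}, limits of agreement = [{lower:.2f}, {upper:.2f}]")
```

Narrower limits for one image orientation than for the other would correspond to the lower measurement variability the abstract describes.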
Affiliation(s)
- Abhishek Chaturvedi
  - Department of Radiology, University of Washington School of Medicine, 1959 Pacific Street, Seattle, WA, USA; Department of Imaging Sciences, University of Rochester, 601 Elmwood Avenue, Rochester, NY, USA
- Joseph Whitnah
  - Department of Radiology, University of Washington School of Medicine, 1959 Pacific Street, Seattle, WA, USA
- Jeffrey H Maki
  - Department of Radiology, University of Washington School of Medicine, 1959 Pacific Street, Seattle, WA, USA
- Timothy Baran
  - Department of Imaging Sciences, University of Rochester, 601 Elmwood Avenue, Rochester, NY, USA
- Lee M Mitsumori
  - Department of Radiology, University of Washington School of Medicine, 1959 Pacific Street, Seattle, WA, USA; Department of Radiology, Straub Clinic and Hospital, Honolulu, HI, USA

19
Erdoğan Z, Abdülrezzak U, Silov G, Ozdal A, Turhal O. Evaluation of inter observer variability of parenchymal phase of Tc-99m mercaptoacetyltriglycine and Tc-99m dimercaptosuccinic acid renal scintigraphy. Indian J Nucl Med 2014; 29:87-91. [PMID: 24761059 PMCID: PMC3996777 DOI: 10.4103/0972-3919.130288] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/04/2022] Open
Abstract
Objective: The aim of this study was to investigate the variability in the interpretation of parenchymal abnormalities and to assess differences in the interpretation of routine renal scintigraphic findings on posterior view technetium-99m dimercaptosuccinic acid (pvDMSA) scans and parenchymal phase technetium-99m mercaptoacetyltriglycine (ppMAG3) scans, using standard criteria to allow standardization, semiquantitative evaluation, and more accurate correlation. Materials and Methods: Two experienced nuclear medicine physicians independently and retrospectively interpreted pvDMSA scans of 204 and ppMAG3 scans of 102 pediatric patients. Comparisons were made by visual inspection of pvDMSA and ppMAG3 scans using a grading system modified from Itoh et al., in which anatomical damage of the renal parenchyma is classified into six types: Grade 0-V. Agreement rates were calculated with Kendall correlation (tau-b) analysis. Results: Excellent agreement was found for DMSA grade readings (DMSA-GR) (tau-b = 0.827) and good agreement for MAG3 grade readings (MAG3-GR) (tau-b = 0.790) between the two observers. Most clear parenchymal lesions on pvDMSA and ppMAG3 scans were identified equally by both observers. Studies with negative or minimal lesions reduced the degree of correlation for both DMSA-GR and MAG3-GR. Conclusion: Our grading system can be used to standardize reports. We conclude that standardizing criteria and terminology in interpretations may yield higher interobserver consistency and improve the otherwise low interobserver reproducibility and objectivity of renal scintigraphy reports.
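Kendall's tau-b, the agreement measure named above, handles the ties that are unavoidable when two observers assign one of six ordinal grades. A minimal pure-Python sketch follows, assuming grades are coded as integers 0 to 5. The function name and the example readings are hypothetical.

```python
from itertools import combinations
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall's tau-b for two equally long sequences of ordinal scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    concordant = discordant = only_x_tied = only_y_tied = 0
    for (x1, y1), (x2, y2) in combinations(list(zip(x, y)), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue              # tied on both: excluded from every term
        if dx == 0:
            only_x_tied += 1
        elif dy == 0:
            only_y_tied += 1
        elif dx * dy > 0:
            concordant += 1       # both observers order the pair the same way
        else:
            discordant += 1
    untied = concordant + discordant
    denominator = sqrt((untied + only_x_tied) * (untied + only_y_tied))
    if denominator == 0:
        raise ValueError("tau-b is undefined when a sequence is constant")
    return (concordant - discordant) / denominator


# Hypothetical grades (0 = Grade 0 ... 5 = Grade V) for six scans.
observer_a = [0, 1, 2, 3, 4, 5]
observer_b = [0, 1, 3, 2, 4, 5]
tau = kendall_tau_b(observer_a, observer_b)
print(f"tau-b = {tau:.3f}")
```

The pairwise loop is quadratic in the number of scans, which is unproblematic for a few hundred studies; a statistics library would normally be used for larger data sets.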
Affiliation(s)
- Zeynep Erdoğan
  - Department of Nuclear Medicine, Kayseri Training and Research Hospital, Kayseri, Turkey
- Güler Silov
  - Department of Nuclear Medicine, Kayseri Training and Research Hospital, Kayseri, Turkey
- Ayşegül Ozdal
  - Department of Nuclear Medicine, Kayseri Training and Research Hospital, Kayseri, Turkey
- Ozgül Turhal
  - Department of Nuclear Medicine, Kayseri Training and Research Hospital, Kayseri, Turkey

20
Tourassi G, Yoon HJ, Xu S, Morin-Ducote G, Hudson K. Comparative analysis of data collection methods for individualized modeling of radiologists' visual similarity judgments in mammograms. Acad Radiol 2013; 20:1371-80. [PMID: 24119349 DOI: 10.1016/j.acra.2013.08.002] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 03/27/2013] [Revised: 08/06/2013] [Accepted: 08/06/2013] [Indexed: 11/20/2022]
Abstract
RATIONALE AND OBJECTIVES We conducted an observer study to investigate how the data collection method affects the efficacy of modeling individual radiologists' judgments regarding the perceptual similarity of breast masses on mammograms. MATERIALS AND METHODS Six observers of varying experience levels in breast imaging were recruited to assess the perceptual similarity of mammographic masses. The observers' subjective judgments were collected using (i) a rating method, (ii) a preference method, and (iii) a hybrid method combining rating and ranking. Personalized user models were developed with the collected data to predict observers' opinions. The relative efficacy of each data collection method was assessed based on the classification accuracy of the resulting user models. RESULTS The average accuracy of the user models derived from data collected with the hybrid method was 55.5 ± 1.5%. The models were significantly more accurate (P < .0005) than those derived from the rating (45.3 ± 3.5%) and the preference (40.8 ± 5%) methods. On average, the rating data collection method was significantly faster than the other two methods (P < .0001). No time advantage was observed between the preference and the hybrid methods. CONCLUSIONS A hybrid method combining rating and ranking is an intuitive and efficient way for collecting subjective similarity judgments to model human perceptual opinions with a higher accuracy than other, more commonly used data collection methods.

21
Piepkorn MW, Barnhill RL, Elder DE, Knezevich SR, Carney PA, Reisch LM, Elmore JG. The MPATH-Dx reporting schema for melanocytic proliferations and melanoma. J Am Acad Dermatol 2013; 70:131-41. [PMID: 24176521 DOI: 10.1016/j.jaad.2013.07.027] [Citation(s) in RCA: 64] [Impact Index Per Article: 5.8] [Received: 03/28/2013] [Revised: 07/12/2013] [Accepted: 07/18/2013] [Indexed: 02/06/2023]
Abstract
BACKGROUND The histologic diagnosis of melanoma and nevi can be subject to discordance and errors, potentially leading to inappropriate treatment and harm. Diagnostic terminology is not standardized, creating confusion for providers and patients and challenges for investigators. OBJECTIVE We sought to describe the development of a pathology reporting form for more precise research on melanoma and a diagnostic-treatment mapping tool for improved patient care and consistency in treatment. METHODS Three dermatopathologists independently reviewed melanocytic lesions randomly selected from a dermatopathology database. Melanocytic Pathology Assessment Tool and Hierarchy for Diagnosis (MPATH-Dx) reporting schema evolved from iterative case review and form revision. RESULTS Differences in diagnostic thresholds, interpretation, and nomenclature contributed to development of the MPATH-Dx histology reporting form, which groups lesions by similarities in histogenesis and degrees of atypia. Because preliminary results indicate greater agreement regarding suggested treatments than for specific diagnoses, the diverse terminologies of the MPATH-Dx histology reporting form were stratified by commonalities of treatments in the MPATH-Dx diagnostic-treatment mapping scheme. LIMITATIONS Without transformative advances in diagnostic paradigms, the interpretation of melanocytic lesions frequently remains subjective. CONCLUSIONS The MPATH-Dx diagnostic-treatment mapping scheme could diminish confusion for those receiving reports by categorizing diverse nomenclature into a hierarchy stratified by suggested management interventions.
Affiliation(s)
- Michael W Piepkorn
  - Division of Dermatology, University of Washington School of Medicine, Seattle, Washington; Department of Medicine, University of Washington School of Medicine, Seattle, Washington; Dermatopathology Northwest, Bellevue, Washington
- Raymond L Barnhill
  - Department of Pathology and Laboratory Medicine, University of California at Los Angeles, Los Angeles, California
- David E Elder
  - Department of Pathology and Laboratory Medicine, Hospital of the University of Pennsylvania, Philadelphia, Pennsylvania
- Patricia A Carney
  - Department of Family Medicine, Oregon Health Sciences University, Portland, Oregon
- Lisa M Reisch
  - Department of Medicine, University of Washington School of Medicine, Seattle, Washington
- Joann G Elmore
  - Department of Medicine, University of Washington School of Medicine, Seattle, Washington

22
Deegan T, Owen R, Holt T, Roberts L, Biggs J, McCarthy A, Parfitt M, Fielding A. Inter observer variability of radiation therapists aligning to fiducial markers for prostate radiation therapy. J Med Imaging Radiat Oncol 2013; 57:519-23; quiz 524-5. [PMID: 23870354 DOI: 10.1111/1754-9485.12055] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Received: 09/19/2012] [Accepted: 02/15/2013] [Indexed: 11/27/2022]
Abstract
INTRODUCTION As the use of fiducial markers (FMs) for the localisation of the prostate during external beam radiation therapy (EBRT) has become part of routine practice, radiation therapists (RTs) have become increasingly responsible for online image interpretation. The aim of this investigation was to quantify the limits of agreement (LoA) between RTs when localising to FMs with orthogonal kilovoltage (kV) imaging. METHODS Six patients receiving prostate EBRT utilising FMs were included in this study. Treatment localisation was performed using kV imaging prior to each fraction. Online stereoscopic assessment of FMs, performed by the treating RTs, was compared with the offline assessment by three RTs. Observer agreement was determined by pairwise Bland-Altman analysis. RESULTS Stereoscopic analysis of 225 image pairs was performed online at the time of treatment, and offline by three RT observers. Eighteen pairwise Bland-Altman analyses were completed to assess the level of agreement between observers. Localisation by RTs was found to be within clinically acceptable 95% LoAs. CONCLUSIONS Small differences between RTs, in both the online and offline setting, were found to be within clinically acceptable limits. RTs were able to make consistent and reliable judgements when matching FMs on planar kV imaging.
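The pairwise Bland-Altman analysis named in this abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the observer labels, and the shift values are all hypothetical, and the 1.96 multiplier assumes approximately normally distributed differences. With four observers (one online, three offline) there are six observer pairs per measured quantity.

```python
from itertools import combinations
from statistics import mean, stdev


def bland_altman(a, b):
    """Return (bias, lower LoA, upper LoA) for paired measurements a and b."""
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the paired differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


def pairwise_loa(observers):
    """Compute Bland-Altman limits for every pair of observers.

    `observers` maps an observer label to a list of readings taken on the
    same images in the same order.
    """
    return {
        (p, q): bland_altman(observers[p], observers[q])
        for p, q in combinations(sorted(observers), 2)
    }


# Made-up couch shifts in mm for a single axis; NOT data from the study.
shifts = {
    "online": [1.0, 2.0, 0.5, 1.5],
    "rt_a": [1.2, 1.8, 0.4, 1.7],
    "rt_b": [0.9, 2.1, 0.6, 1.4],
    "rt_c": [1.1, 2.0, 0.5, 1.6],
}
results = pairwise_loa(shifts)
for pair, (bias, lower, upper) in results.items():
    print(pair, round(bias, 3), round(lower, 3), round(upper, 3))
```

Whether the resulting limits are acceptable is a clinical judgement made against a predefined tolerance; the sketch only computes the limits.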
Affiliation(s)
- Timothy Deegan
- Radiation Oncology Mater Centre, Princess Alexandra Hospital, South Brisbane.
|
23
|
Nicolaas L, Tigchelaar S, Koëter S. Patellofemoral evaluation with magnetic resonance imaging in 51 knees of asymptomatic subjects. Knee Surg Sports Traumatol Arthrosc 2011; 19:1735-9. [PMID: 21533540 PMCID: PMC3176398 DOI: 10.1007/s00167-011-1508-y] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.2] [Received: 01/24/2010] [Accepted: 03/31/2011] [Indexed: 11/30/2022]
Abstract
PURPOSE The objective of this study is to evaluate patellofemoral joint imaging on magnetic resonance imaging (MRI) in asymptomatic subjects to assess normal values and to test the statistical correlation and reliability of MRI measurements. METHODS An analysis of 51 standard MRI examinations was performed. Sulcus angle (SA), patellar axis (PA), lateral patellofemoral angle (LPFA), and lateral patellofemoral length (LPL) were measured. None of the patients suffered from patellofemoral complaints. Patients with patella alta and significant hydrops were excluded. The measurements were assessed with a 2-week interval by two raters under blinded conditions. Statistical analysis was applied by an independent analyst. RESULTS The mean SA was 142.4 ± 6.9°, PA 5.3 ± 3.8°, LPFA 13 ± 4.4°, and LPL 0.8 ± 2.9 mm. Inter-observer variability showed high correlation for LPL and PA, as the repeatability coefficient was high (LPL: 1.49 (LN), 5.7 (ST); PA: 4.1 (LN), 5.8 (ST)). Intra-observer variability also showed good correlation for LPL and PA. CONCLUSION The results represent patellofemoral values in the normal population. They indicate that MRI is a reliable imaging technique to determine lateral patellofemoral length and patellar axis. Lateral patellofemoral angle and sulcus angle showed a poor correlation and should not be used for decision making. LEVEL OF EVIDENCE Development of diagnostic criteria in a consecutive series of patients and a universally applied "gold" standard, Level II.
Affiliation(s)
- L. Nicolaas
- Department of Orthopaedic Surgery, Radboud University Nijmegen Medical Centre, Nijmegen, The Netherlands
- S. Tigchelaar
- Department of Orthopaedic Surgery, Radboud University Nijmegen Medical Centre, Nijmegen, The Netherlands
- S. Koëter
- Department of Orthopaedic Surgery, Canisius Wilhelmina Hospital, PO box 9015, 6500 GS Nijmegen, The Netherlands
|
24
|
Jernberg T, Cronblad J, Lindahl B, Wallentin L. Observer variability and optimal criteria of transient ischemia during ST monitoring with continuous 12-lead ECG. Ann Noninvasive Electrocardiol 2002; 7:181-90. [PMID: 12167177 PMCID: PMC7027604 DOI: 10.1111/j.1542-474x.2002.tb00161.x] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/27/2022] Open
Abstract
BACKGROUND ST monitoring with continuous 12-lead ECG is a well-established method in patients with unstable coronary artery disease (CAD). However, the method lacks documentation on optimal criteria for episodes of transient ischemia and on observer variability. METHODS Observer variability was evaluated in 24-hour recordings from 100 patients with unstable CAD who were monitored in the coronary care unit. The influence of variations in body position on ST changes was evaluated by monitoring 50 patients in different body positions. Different criteria of transient ischemia and their predictive importance were evaluated in 630 patients with unstable CAD who underwent 12 hours of monitoring and thereafter were followed for 1 to 13 months. Two sets of criteria were tested: (1) ST deviation ≥ 0.1 mV for at least 1 minute, and (2) ST depression ≥ 0.05 mV or elevation ≥ 0.1 mV for at least 1 minute. RESULTS When the first set of criteria was used, the interobserver agreement was good (kappa = 0.72) and 8 (16%) had significant ST changes in at least one body position. Out of 100 patients with symptoms suggestive of unstable CAD and such ischemia, 24 (24%) had a cardiac event during follow-up. When the second set of criteria was used, the interobserver agreement was poor (kappa = 0.32) and 21 (42%) had significant ST changes in at least one body position. Patients fulfilling the second but not the first set of criteria did not have a higher risk of cardiac event than those without transient ischemia (5.3 vs 4.3%). CONCLUSIONS During 12-lead ECG monitoring, transient ischemic episodes should be defined as ST deviations ≥ 0.1 mV for at least 1 minute, based on a low observer variability, minor problems with postural ST changes and an important predictive value.
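The kappa values quoted in this abstract measure agreement beyond chance between two observers. A minimal sketch of Cohen's kappa for two raters is given below; it assumes each observer labels every recording as ischemic (1) or not (0), which the abstract does not state explicitly. The function name and the ratings are hypothetical and are not data from the study.

```python
from collections import Counter


def cohen_kappa(rater1, rater2):
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater1) != len(rater2) or not rater1:
        raise ValueError("both raters must label the same non-empty item set")
    n = len(rater1)
    # Proportion of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater1, rater2)) / n
    # Agreement expected by chance from each rater's marginal label counts.
    counts1, counts2 = Counter(rater1), Counter(rater2)
    labels = set(counts1) | set(counts2)
    expected = sum(counts1[k] * counts2[k] for k in labels) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


# Made-up labels for ten recordings (1 = transient ischemia detected).
observer_a = [1, 1, 0, 0, 1, 0, 0, 0, 1, 0]
observer_b = [1, 0, 0, 0, 1, 0, 0, 1, 1, 0]
kappa = cohen_kappa(observer_a, observer_b)
print(round(kappa, 3))
```

In this toy example the observers agree on 8 of 10 recordings, while chance agreement from the marginals is 0.52, so kappa is (0.80 - 0.52) / (1 - 0.52).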
Affiliation(s)
- Tomas Jernberg
- Department of Cardiology, Cardiothoracic Center, University Hospital, 751 85 Uppsala, Sweden.
|
25
|
Haritoglou C, Neubauer AS, Herzum H, Freeman WR, Mueller AJ. Interobserver and intra observer variability of measurements of uveal melanomas using standardised echography. Br J Ophthalmol 2002; 86:1390-4. [PMID: 12446372 PMCID: PMC1771401 DOI: 10.1136/bjo.86.12.1390] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.7] [Indexed: 11/04/2022]
Abstract
AIMS To report on the intraindividual and interindividual variability of tumour size (height and base diameter) measurements using standardised echography in a masked prospective study. METHODS Twenty consecutive eyes of 20 patients were examined on four different visits by three experienced examiners using standardised echography. As is common in standardised echography, tumour height was evaluated with A-scan technique, while transverse and longitudinal base diameters were calculated with B-scan. RESULTS Tumour height measurements using A-scan were more accurate than base diameter measurements using B-scan. The standard deviation over all visits/measurements was 0.18 mm for tumour height (A-scan), 0.79 mm for transverse, and 0.69 mm for longitudinal base diameters (B-scan). The intraclass correlation coefficient (ICC) was much higher for tumour height measurements with A-scan (0.7735 for three examiners on one visit) than for transverse (0.6563) or longitudinal (0.4522) base diameter measurements with B-scan techniques. CONCLUSIONS A-scan techniques for tumour height measurements provide very reproducible results with little intraindividual and interobserver variability. As B-scan techniques for tumour base evaluation are less accurate, they should be used for topographic and morphological examinations.
Affiliation(s)
- C Haritoglou
- Department of Ophthalmology, Ludwig-Maximilians-University, Munich, Germany.
|