1
Zaboski BA, Bednarek L. Precision Psychiatry for Obsessive-Compulsive Disorder: Clinical Applications of Deep Learning Architectures. J Clin Med 2025; 14:2442. PMID: 40217892; PMCID: PMC11989962; DOI: 10.3390/jcm14072442.
Abstract
Obsessive-compulsive disorder (OCD) is a complex psychiatric condition characterized by significant heterogeneity in symptomatology and treatment response. Advances in neuroimaging, EEG, and other multimodal datasets have created opportunities to identify biomarkers and predict outcomes, yet traditional statistical methods often fall short in analyzing such high-dimensional data. Deep learning (DL) offers powerful tools for addressing these challenges by leveraging architectures capable of classification, prediction, and data generation. This brief review provides an overview of five key DL architectures (feedforward neural networks, convolutional neural networks, recurrent neural networks, generative adversarial networks, and transformers) and their applications in OCD research and clinical practice. We highlight how these models have been used to identify the neural predictors of treatment response, diagnose and classify OCD, and advance precision psychiatry. We conclude by discussing the clinical implementation of DL, summarizing its advances and promises in OCD, and underscoring key challenges for the field.
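Editor's note: as a purely illustrative sketch (not taken from the reviewed paper), the simplest of the five architectures named above, a feedforward network, can be written in a few lines; the feature dimension, hidden width, and the binary treatment-response outcome are hypothetical placeholders.

```python
# Illustrative sketch only: a minimal feedforward classifier of the kind the
# review describes. Feature dimension, hidden size, and labels are hypothetical.
import torch
import torch.nn as nn

class FeedforwardClassifier(nn.Module):
    def __init__(self, n_features: int = 64, n_hidden: int = 32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, n_hidden),
            nn.ReLU(),
            nn.Linear(n_hidden, 1),  # single logit for a binary outcome
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

model = FeedforwardClassifier()
x = torch.randn(8, 64)          # 8 synthetic subjects, 64 synthetic features
probs = torch.sigmoid(model(x)) # predicted probability of, e.g., treatment response
print(probs.shape)              # torch.Size([8, 1])
```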
Affiliation(s)
- Brian A. Zaboski
- Yale School of Medicine, Department of Psychiatry, Yale University, New Haven, CT 06510, USA
- Lora Bednarek
- Department of Psychology, University of California, San Diego, CA 92093, USA
2
Park J, Kim J, Ahn S, Cho Y, Yoon YE. AI-ECG Supported Decision-Making for Coronary Angiography in Acute Chest Pain: The QCG-AID Study. J Korean Med Sci 2025; 40:e105. PMID: 40165577; PMCID: PMC11964906; DOI: 10.3346/jkms.2025.40.e105.
Abstract
This pilot study evaluates an artificial intelligence (AI)-assisted electrocardiography (ECG) analysis system, QCG, to enhance urgent coronary angiography (CAG) decision-making for acute chest pain in the emergency department (ED). We retrospectively analyzed 300 ED cases, categorized as non-coronary chest pain (Group 1), acute coronary syndrome (ACS) without occlusive coronary artery disease (CAD) (Group 2), and ACS with occlusive CAD (Group 3). Six clinicians made urgent CAG decisions using a conventional approach (clinical data and ECG) and a QCG-assisted approach (including QCG scores). The QCG-assisted approach improved correct CAG decisions in Group 2 (36.0% vs. 45.3%, P = 0.003) and Group 3 (85.3% vs. 90.0%, P = 0.017), with minimal impact in Group 1 (92.7% vs. 95.0%, P = 0.125). Diagnostic accuracy for ACS improved from 77% to 81% with QCG assistance and reached 82% with QCG alone, supporting AI's potential to enhance urgent CAG decision-making for ED chest pain cases.
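Editor's note: the abstract does not state which statistical test produced the reported P values; a McNemar-style comparison of paired correct/incorrect decisions is one plausible analysis for this design. The 2x2 counts below are invented for illustration and do not come from the study.

```python
# Hypothetical sketch: comparing correct urgent-CAG decisions made by the same
# clinicians with and without AI assistance, using McNemar's test for paired data.
# The 2x2 counts are invented and are not the study's results.
from statsmodels.stats.contingency_tables import mcnemar

#                         AI-assisted correct   AI-assisted incorrect
# conventional correct            120                     8
# conventional incorrect           22                    30
table = [[120, 8],
         [22, 30]]

result = mcnemar(table, exact=True)  # exact binomial test on the discordant pairs
print(f"statistic={result.statistic}, p-value={result.pvalue:.4f}")
```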
Affiliation(s)
- Jiesuck Park
- Department of Cardiology, Seoul National University Bundang Hospital, Seongnam, Korea
- Department of Internal Medicine, Seoul National University College of Medicine, Seoul, Korea
- Joonghee Kim
- Department of Emergency Medicine, Seoul National University Bundang Hospital, Seongnam, Korea
- ARPI Inc., Seongnam, Korea
- Soyeon Ahn
- Medical Research Collaborating Center, Seoul National University Bundang Hospital, Seongnam, Korea
- Youngjin Cho
- Department of Cardiology, Seoul National University Bundang Hospital, Seongnam, Korea
- Department of Internal Medicine, Seoul National University College of Medicine, Seoul, Korea
- ARPI Inc., Seongnam, Korea
- Yeonyee E Yoon
- Department of Cardiology, Seoul National University Bundang Hospital, Seongnam, Korea
- Department of Internal Medicine, Seoul National University College of Medicine, Seoul, Korea
3
de Camargo TFO, Ribeiro GAS, da Silva MCB, da Silva LO, Torres PPTES, Rodrigues DDSDS, de Santos MON, Filho WS, Rosa MEE, Novaes MDA, Massarutto TA, Junior OL, Yanata E, Reis MRDC, Szarf G, Netto PVS, de Paiva JPQ. Clinical validation of an artificial intelligence algorithm for classifying tuberculosis and pulmonary findings in chest radiographs. Front Artif Intell 2025; 8:1512910. PMID: 39991462; PMCID: PMC11843218; DOI: 10.3389/frai.2025.1512910.
Abstract
Background Chest X-ray (CXR) interpretation is critical in diagnosing various lung diseases. However, non-specialist physicians are often the first to read these examinations and frequently face challenges in accurate interpretation. Artificial intelligence (AI) algorithms could be of great help, but using real-world data is crucial to ensure their effectiveness in diverse healthcare settings. This study evaluates a deep learning algorithm designed for CXR interpretation, focusing on its utility for physicians who are not specialists in thoracic radiology. Purpose To assess the performance of a convolutional neural network (CNN)-based AI algorithm in interpreting CXRs and compare it with a team of physicians, including thoracic radiologists, who served as the gold standard. Methods A retrospective study from January 2021 to July 2023 evaluated an algorithm with three independent models for Lung Abnormality, Radiological Findings, and Tuberculosis. The algorithm's performance was measured using accuracy, sensitivity, and specificity. Two groups of physicians validated the model: one with varying specialties and experience levels in interpreting chest radiographs (Group A) and another of board-certified thoracic radiologists (Group B). The study also assessed the agreement between the two groups on the algorithm's heatmap and its influence on their decisions. Results In the internal validation, the Lung Abnormality and Tuberculosis models achieved an AUC of 0.94, while the Radiological Findings model yielded a mean AUC of 0.84. During the external validation, using the ground truth generated by board-certified thoracic radiologists, the algorithm achieved better sensitivity in 6 of 11 classes than physicians with varying experience levels. Furthermore, Group A physicians demonstrated higher agreement with the algorithm in identifying markings in specific lung regions than Group B (37.56% vs. 21.75%). Additionally, physicians declared that the algorithm did not influence their decisions in 93% of cases. Conclusion This retrospective clinical validation study assesses an AI algorithm's effectiveness in interpreting CXRs. The results show that the algorithm's performance is comparable to that of Group A physicians, with the gold-standard analysis (Group B) as the reference. Notably, both groups reported minimal influence of the algorithm on their decisions in most cases.
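Editor's note: for readers less familiar with the reported metrics, the sketch below shows how accuracy, sensitivity, specificity, and AUC are typically computed from binary predictions; the labels and scores are synthetic and are not the study's validation data.

```python
# Illustrative only: computing the metrics named in the abstract from synthetic
# labels and model scores. None of these numbers come from the study.
import numpy as np
from sklearn.metrics import roc_auc_score, confusion_matrix

y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])            # 1 = abnormal CXR (synthetic)
y_score = np.array([0.9, 0.2, 0.7, 0.4, 0.1, 0.35, 0.8, 0.6])
y_pred = (y_score >= 0.5).astype(int)                   # fixed operating point

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
accuracy = (tp + tn) / (tp + tn + fp + fn)
sensitivity = tp / (tp + fn)                            # recall for the abnormal class
specificity = tn / (tn + fp)
auc = roc_auc_score(y_true, y_score)                    # threshold-independent summary

print(f"acc={accuracy:.2f} sens={sensitivity:.2f} spec={specificity:.2f} auc={auc:.2f}")
```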
Affiliation(s)
- Thiago Fellipe Ortiz de Camargo
- Image Research Center, Hospital Israelita Albert Einstein, São Paulo, Brazil
- Electrical, Mechanical and Computer Engineering School, Federal University of Goias, Goias, Brazil
- Guilherme Alberto Sousa Ribeiro
- Image Research Center, Hospital Israelita Albert Einstein, São Paulo, Brazil
- Electrical, Mechanical and Computer Engineering School, Federal University of Goias, Goias, Brazil
- Elaine Yanata
- Image Research Center, Hospital Israelita Albert Einstein, São Paulo, Brazil
- Gilberto Szarf
- Image Research Center, Hospital Israelita Albert Einstein, São Paulo, Brazil
4
Chouvarda I, Colantonio S, Verde ASC, Jimenez-Pastor A, Cerdá-Alberich L, Metz Y, Zacharias L, Nabhani-Gebara S, Bobowicz M, Tsakou G, Lekadir K, Tsiknakis M, Martí-Bonmati L, Papanikolaou N. Differences in technical and clinical perspectives on AI validation in cancer imaging: mind the gap! Eur Radiol Exp 2025; 9:7. PMID: 39812924; PMCID: PMC11735720; DOI: 10.1186/s41747-024-00543-0.
Abstract
Good practices in artificial intelligence (AI) model validation are key for achieving trustworthy AI. Within the cancer imaging domain, attracting the attention of clinical and technical AI enthusiasts, this work discusses current gaps in AI validation strategies, examining existing practices that are common or variable across technical groups (TGs) and clinical groups (CGs). The work is based on a set of structured questions encompassing several AI validation topics, addressed to professionals working in AI for medical imaging. A total of 49 responses were obtained and analysed to identify trends and patterns. While TGs valued transparency and traceability the most, CGs pointed out the importance of explainability. Among the topics where TGs may benefit from further exposure are stability and robustness checks, and mitigation of fairness issues. On the other hand, CGs seemed more reluctant towards synthetic data for validation and would benefit from exposure to cross-validation techniques, or segmentation metrics. Topics emerging from the open questions were utility, capability, adoption and trustworthiness. These findings on current trends in AI validation strategies may guide the creation of guidelines necessary for training the next generation of professionals working with AI in healthcare and contribute to bridging any technical-clinical gap in AI validation. RELEVANCE STATEMENT: This study recognised current gaps in understanding and applying AI validation strategies in cancer imaging and helped promote trust and adoption for interdisciplinary teams of technical and clinical researchers. KEY POINTS: Clinical and technical researchers emphasise interpretability, external validation with diverse data, and bias awareness in AI validation for cancer imaging. In cancer imaging AI research, clinical researchers prioritise explainability, while technical researchers focus on transparency and traceability, and see potential in synthetic datasets. Researchers advocate for greater homogenisation of AI validation practices in cancer imaging.
Affiliation(s)
- Ioanna Chouvarda
- School of Medicine, Aristotle University of Thessaloniki, Thessaloniki, Greece
- Sara Colantonio
- Institute of Information Science and Technologies of the National Research Council of Italy, Pisa, Italy
- Ana S C Verde
- Computational Clinical Imaging Group (CCIG), Champalimaud Research, Champalimaud Foundation, Lisbon, Portugal
- Leonor Cerdá-Alberich
- Biomedical Imaging Research Group (GIBI230), La Fe Health Research Institute, Valencia, Spain
- Yannick Metz
- Data Analysis and Visualization, University of Konstanz, Konstanz, Germany
- Shereen Nabhani-Gebara
- Faculty of Health, Science, Social Care & Education, Kingston University London, London, UK
- Maciej Bobowicz
- 2nd Department of Radiology, Medical University of Gdansk, Gdansk, Poland
- Gianna Tsakou
- Research and Development Lab, Gruppo Maggioli Greek Branch, Maroussi, Greece
- Karim Lekadir
- Departament de Matemàtiques i Informàtica, Artificial Intelligence in Medicine Lab (BCN-AIM), Universitat de Barcelona, Barcelona, Spain
- Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain
- Manolis Tsiknakis
- Computational BioMedicine Laboratory (CBML), Foundation for Research and Technology-Hellas (FORTH), Heraklion, Greece
- Luis Martí-Bonmati
- Biomedical Imaging Research Group (GIBI230), La Fe Health Research Institute, Valencia, Spain
- Radiology Department, La Fe Polytechnic and University Hospital and Health Research Institute, Valencia, Spain
- Nikolaos Papanikolaou
- Computational Clinical Imaging Group (CCIG), Champalimaud Research, Champalimaud Foundation, Lisbon, Portugal
5
De La Hoz-M J, Montes-Escobar K, Pérez-Ortiz V. Research Trends of Artificial Intelligence in Lung Cancer: A Combined Approach of Analysis With Latent Dirichlet Allocation and HJ-Biplot Statistical Methods. Pulm Med 2024; 2024:5911646. PMID: 39664363; PMCID: PMC11634404; DOI: 10.1155/pm/5911646.
Abstract
Lung cancer (LC) remains one of the leading causes of cancer-related mortality worldwide. With recent technological advances, artificial intelligence (AI) has begun to play a crucial role in improving diagnostic and treatment methods. It is crucial to understand how AI has integrated into LC research and to identify the main areas of focus. The aim of the study was to provide an updated insight into the role of AI in LC research, analyzing evolving topics, geographical distribution, and contributions to journals. The study explores research trends in AI applied to LC through a novel approach combining latent Dirichlet allocation (LDA) topic modeling with the HJ-Biplot statistical technique. A growing interest in AI applications in LC oncology was observed, reflected in a significant increase in publications, especially after 2017, coinciding with the availability of computing resources. Frontiers in Oncology leads in publishing AI-related LC research, reflecting rigorous investigation in the field. Geographically, China and the United States lead in contributions, attributed to significant investment in R&D and corporate sector involvement. LDA analysis highlights key research areas such as pulmonary nodule detection, patient prognosis prediction, and clinical decision support systems, demonstrating the impact of AI in improving LC outcomes. DL and AI emerge as prominent trends, focusing on radiomics and feature selection, promising better decision-making in LC care. The increase in AI-driven research covers various topics, including data analysis methodologies, tumor characterization, and predictive methods, indicating a concerted effort to advance LC research. HJ-Biplot visualization reveals thematic clustering, illustrating temporal and geographical associations and highlighting the influence of high-impact journals and countries with advanced research capabilities. This multivariate approach offers insights into global collaboration dynamics and specialization, emphasizing the evolving role of AI in LC research and diagnosis.
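Editor's note: a minimal sketch of the LDA step described above, assuming scikit-learn; the toy corpus, vocabulary settings, and topic count are placeholders unrelated to the study's bibliographic data, and the HJ-Biplot step is omitted.

```python
# Hypothetical sketch of LDA topic modeling on a tiny toy corpus of abstracts.
# Corpus and number of topics are illustrative only.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

corpus = [
    "deep learning for pulmonary nodule detection on low dose CT",
    "radiomics features predict patient prognosis in lung cancer",
    "clinical decision support system for lung cancer screening",
    "feature selection improves nodule classification performance",
]

vectorizer = CountVectorizer(stop_words="english")
dtm = vectorizer.fit_transform(corpus)               # document-term matrix

lda = LatentDirichletAllocation(n_components=2, random_state=0)
doc_topics = lda.fit_transform(dtm)                  # per-document topic weights

terms = vectorizer.get_feature_names_out()
for k, weights in enumerate(lda.components_):
    top = [terms[i] for i in weights.argsort()[-3:][::-1]]
    print(f"topic {k}: {top}")
```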
Affiliation(s)
- Javier De La Hoz-M
- Faculty of Engineering, Universidad del Magdalena, Santa Marta, Colombia
- Karime Montes-Escobar
- Departamento de Matemáticas y Estadística, Facultad de Ciencias Básicas, Universidad Técnica de Manabí, Portoviejo 130105, Ecuador
- Viorkis Pérez-Ortiz
- Facultad Ciencias de la Salud, Carrera de Medicina, Universidad Técnica de Manabí, Portoviejo 130105, Ecuador
6
Walston SL, Seki H, Takita H, Mitsuyama Y, Sato S, Hagiwara A, Ito R, Hanaoka S, Miki Y, Ueda D. Data set terminology of deep learning in medicine: a historical review and recommendation. Jpn J Radiol 2024; 42:1100-1109. PMID: 38856878; DOI: 10.1007/s11604-024-01608-1.
Abstract
Medicine and deep learning-based artificial intelligence (AI) engineering represent two distinct fields each with decades of published history. The current rapid convergence of deep learning and medicine has led to significant advancements, yet it has also introduced ambiguity regarding data set terms common to both fields, potentially leading to miscommunication and methodological discrepancies. This narrative review aims to give historical context for these terms, accentuate the importance of clarity when these terms are used in medical deep learning contexts, and offer solutions to mitigate misunderstandings by readers from either field. Through an examination of historical documents, including articles, writing guidelines, and textbooks, this review traces the divergent evolution of terms for data sets and their impact. Initially, the discordant interpretations of the word 'validation' in medical and AI contexts are explored. We then show that in the medical field as well, terms traditionally used in the deep learning domain are becoming more common, with the data for creating models referred to as the 'training set', the data for tuning of parameters referred to as the 'validation (or tuning) set', and the data for the evaluation of models as the 'test set'. Additionally, the test sets used for model evaluation are classified into internal (random splitting, cross-validation, and leave-one-out) sets and external (temporal and geographic) sets. This review then identifies often misunderstood terms and proposes pragmatic solutions to mitigate terminological confusion in the field of deep learning in medicine. We support the accurate and standardized description of these data sets and the explicit definition of data set splitting terminologies in each publication. These are crucial methods for demonstrating the robustness and generalizability of deep learning applications in medicine. This review aspires to enhance the precision of communication, thereby fostering more effective and transparent research methodologies in this interdisciplinary field.
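Editor's note: the training/validation/test convention recommended above maps directly onto standard tooling; here is a minimal sketch with scikit-learn, where the array sizes and the 60/20/20 ratio are arbitrary choices for illustration.

```python
# Illustrative sketch of the recommended terminology: 'training set' for model
# fitting, 'validation (tuning) set' for hyperparameter choices, 'test set' for
# final evaluation. Sizes and split ratios are arbitrary.
import numpy as np
from sklearn.model_selection import train_test_split

X = np.random.rand(1000, 20)        # synthetic features
y = np.random.randint(0, 2, 1000)   # synthetic labels

# First carve out the held-out test set (internal, random-splitting type).
X_trainval, X_test, y_trainval, y_test = train_test_split(
    X, y, test_size=0.20, random_state=42, stratify=y)

# Then split the remainder into training and validation (tuning) sets.
X_train, X_val, y_train, y_val = train_test_split(
    X_trainval, y_trainval, test_size=0.25, random_state=42, stratify=y_trainval)

print(len(X_train), len(X_val), len(X_test))   # 600 200 200
```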
Affiliation(s)
- Shannon L Walston
- Department of Diagnostic and Interventional Radiology, Graduate School of Medicine, Osaka Metropolitan University, Osaka, Japan
- Hiroshi Seki
- Department of Diagnostic and Interventional Radiology, Graduate School of Medicine, Osaka Metropolitan University, Osaka, Japan
- Hirotaka Takita
- Department of Diagnostic and Interventional Radiology, Graduate School of Medicine, Osaka Metropolitan University, Osaka, Japan
- Yasuhito Mitsuyama
- Department of Diagnostic and Interventional Radiology, Graduate School of Medicine, Osaka Metropolitan University, Osaka, Japan
- Shingo Sato
- Sidney Kimmel Cancer Center, Thomas Jefferson University, Philadelphia, PA, USA
- Akifumi Hagiwara
- Department of Radiology, Juntendo University School of Medicine, Tokyo, Japan
- Rintaro Ito
- Department of Radiology, Nagoya University, Nagoya, Japan
- Shouhei Hanaoka
- Department of Radiology, University of Tokyo Hospital, Tokyo, Japan
- Yukio Miki
- Department of Diagnostic and Interventional Radiology, Graduate School of Medicine, Osaka Metropolitan University, Osaka, Japan
- Daiju Ueda
- Department of Diagnostic and Interventional Radiology, Graduate School of Medicine, Osaka Metropolitan University, Osaka, Japan
- Department of Artificial Intelligence, Graduate School of Medicine, Osaka Metropolitan University, Osaka, Japan
- Center for Health Science Innovation, Osaka Metropolitan University, Osaka, Japan
7
Ruitenbeek HC, Oei EHG, Visser JJ, Kijowski R. Artificial intelligence in musculoskeletal imaging: realistic clinical applications in the next decade. Skeletal Radiol 2024; 53:1849-1868. PMID: 38902420; DOI: 10.1007/s00256-024-04684-6.
Abstract
This article will provide a perspective review of the most extensively investigated deep learning (DL) applications for musculoskeletal disease detection that have the best potential to translate into routine clinical practice over the next decade. Deep learning methods for detecting fractures, estimating pediatric bone age, calculating bone measurements such as lower extremity alignment and Cobb angle, and grading osteoarthritis on radiographs have been shown to have high diagnostic performance with many of these applications now commercially available for use in clinical practice. Many studies have also documented the feasibility of using DL methods for detecting joint pathology and characterizing bone tumors on magnetic resonance imaging (MRI). However, musculoskeletal disease detection on MRI is difficult as it requires multi-task, multi-class detection of complex abnormalities on multiple image slices with different tissue contrasts. The generalizability of DL methods for musculoskeletal disease detection on MRI is also challenging due to fluctuations in image quality caused by the wide variety of scanners and pulse sequences used in routine MRI protocols. The diagnostic performance of current DL methods for musculoskeletal disease detection must be further evaluated in well-designed prospective studies using large image datasets acquired at different institutions with different imaging parameters and imaging hardware before they can be fully implemented in clinical practice. Future studies must also investigate the true clinical benefits of current DL methods and determine whether they could enhance quality, reduce error rates, improve workflow, and decrease radiologist fatigue and burnout with all of this weighed against the costs.
Affiliation(s)
- Huibert C Ruitenbeek
- Department of Radiology and Nuclear Medicine, Erasmus MC, University Medical Center, P.O. Box 2040, 3000 CA, Rotterdam, The Netherlands
- Edwin H G Oei
- Department of Radiology and Nuclear Medicine, Erasmus MC, University Medical Center, P.O. Box 2040, 3000 CA, Rotterdam, The Netherlands
- Jacob J Visser
- Department of Radiology and Nuclear Medicine, Erasmus MC, University Medical Center, P.O. Box 2040, 3000 CA, Rotterdam, The Netherlands
- Richard Kijowski
- Department of Radiology, New York University Grossman School of Medicine, 660 First Avenue, 3rd Floor, New York, NY, 10016, USA
8
Shen H, Jin Z, Chen Q, Zhang L, You J, Zhang S, Zhang B. Image-based artificial intelligence for the prediction of pathological complete response to neoadjuvant chemoradiotherapy in patients with rectal cancer: a systematic review and meta-analysis. La Radiologia Medica 2024; 129:598-614. PMID: 38512622; DOI: 10.1007/s11547-024-01796-w.
Abstract
OBJECTIVE Artificial intelligence (AI) holds enormous potential for noninvasively identifying patients with rectal cancer who could achieve pathological complete response (pCR) following neoadjuvant chemoradiotherapy (nCRT). We aimed to conduct a meta-analysis to summarize the diagnostic performance of image-based AI models for predicting pCR to nCRT in patients with rectal cancer. METHODS This study followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines. A literature search of PubMed, Embase, Cochrane Library, and Web of Science was performed from inception to July 29, 2023. Studies that developed or utilized AI models for predicting pCR to nCRT in rectal cancer from medical images were included. The Quality Assessment of Diagnostic Accuracy Studies-AI was used to appraise the methodological quality of the studies. The bivariate random-effects model was used to summarize the individual sensitivities, specificities, and areas under the curve (AUCs). Subgroup and meta-regression analyses were conducted to identify potential sources of heterogeneity. The protocol for this study was registered with PROSPERO (CRD42022382374). RESULTS Thirty-four studies (9933 patients) were identified. Pooled estimates of sensitivity, specificity, and AUC of AI models for pCR prediction were 82% (95% CI: 76-87%), 84% (95% CI: 79-88%), and 90% (95% CI: 87-92%), respectively. Higher specificity was seen for the Asian population, low risk of bias, and deep learning, compared with the non-Asian population, high risk of bias, and radiomics (all P < 0.05). Single-center studies had higher sensitivity than multi-center studies (P = 0.001). The retrospective design had lower sensitivity (P = 0.012) but higher specificity (P < 0.001) than the prospective design. MRI showed higher sensitivity (P = 0.001) but lower specificity (P = 0.044) than non-MRI. The sensitivity and specificity of internal validation were higher than those of external validation (both P = 0.005). CONCLUSIONS Image-based AI models exhibited favorable performance for predicting pCR to nCRT in rectal cancer. However, further clinical trials are warranted to verify the findings.
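Editor's note: per-study sensitivity and specificity are derived from each study's 2x2 counts before pooling. The sketch below uses invented counts and a crude inverse-variance average on the logit scale; it is only a rough stand-in for the bivariate random-effects model actually used in the meta-analysis.

```python
# Invented 2x2 counts for three hypothetical studies; not data from the review.
# The crude logit-scale average below is NOT the bivariate random-effects model
# used in the meta-analysis; it only illustrates the direction of the computation.
import numpy as np

studies = [  # (TP, FN, TN, FP)
    (45, 10, 60, 12),
    (30,  5, 40,  8),
    (80, 15, 90, 10),
]

def logit(p):
    return np.log(p / (1 - p))

def inv_logit(x):
    return 1 / (1 + np.exp(-x))

sens_logits, sens_weights = [], []
for tp, fn, tn, fp in studies:
    sens = tp / (tp + fn)
    var = 1 / tp + 1 / fn            # approximate variance of the logit
    sens_logits.append(logit(sens))
    sens_weights.append(1 / var)

pooled_sens = inv_logit(np.average(sens_logits, weights=sens_weights))
print(f"crude pooled sensitivity: {pooled_sens:.2f}")
```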
Affiliation(s)
- Hui Shen
- Department of Radiology, The First Affiliated Hospital of Jinan University, No. 613 Huangpu West Road, Tianhe District, Guangzhou, 510627, Guangdong, China
- Zhe Jin
- Department of Radiology, The First Affiliated Hospital of Jinan University, No. 613 Huangpu West Road, Tianhe District, Guangzhou, 510627, Guangdong, China
- Qiuying Chen
- Department of Radiology, The First Affiliated Hospital of Jinan University, No. 613 Huangpu West Road, Tianhe District, Guangzhou, 510627, Guangdong, China
- Lu Zhang
- Department of Radiology, The First Affiliated Hospital of Jinan University, No. 613 Huangpu West Road, Tianhe District, Guangzhou, 510627, Guangdong, China
- Jingjing You
- Department of Radiology, The First Affiliated Hospital of Jinan University, No. 613 Huangpu West Road, Tianhe District, Guangzhou, 510627, Guangdong, China
- Shuixing Zhang
- Department of Radiology, The First Affiliated Hospital of Jinan University, No. 613 Huangpu West Road, Tianhe District, Guangzhou, 510627, Guangdong, China
- Bin Zhang
- Department of Radiology, The First Affiliated Hospital of Jinan University, No. 613 Huangpu West Road, Tianhe District, Guangzhou, 510627, Guangdong, China
9
Lee WF, Day MY, Fang CY, Nataraj V, Wen SC, Chang WJ, Teng NC. Establishing a novel deep learning model for detecting peri-implantitis. J Dent Sci 2024; 19:1165-1173. PMID: 38618118; PMCID: PMC11010782; DOI: 10.1016/j.jds.2023.11.017.
Abstract
BACKGROUND/PURPOSE The diagnosis of peri-implantitis using periapical radiographs is crucial. Recently, artificial intelligence has been applied effectively to radiographic image analysis. The aim of this study was to differentiate the degree of marginal bone loss around an implant and to classify the severity of peri-implantitis using a deep learning model. MATERIALS AND METHODS A dataset of 800 periapical radiographic images containing implants was divided into training (n = 600), validation (n = 100), and test (n = 100) datasets for deep learning. An object detection algorithm (YOLOv7) was used to identify peri-implantitis. The classification performance of this model was evaluated using metrics including specificity, precision, recall, and F1 score. RESULTS Regarding classification performance, the specificity was 100%, precision was 100%, recall was 94.44%, and F1 score was 97.10%. CONCLUSION The results of this study suggest that implants can be identified from periapical radiographic images using deep learning-based object detection. This identification system could help dentists and patients suffering from implant problems. However, more images of other implant systems are needed to increase the learning performance before this system can be applied in clinical practice.
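Editor's note: the reported precision, recall, and F1 score follow directly from detection counts; the sketch below uses made-up true/false positive and negative counts rather than the study's test-set results.

```python
# Illustrative computation of the detection metrics reported above.
# TP/FP/FN/TN counts are invented, not the study's test-set results.
tp, fp, fn, tn = 34, 0, 2, 64

precision   = tp / (tp + fp)           # of detected implants, how many are correct
recall      = tp / (tp + fn)           # of true implants, how many are detected
specificity = tn / (tn + fp)
f1          = 2 * precision * recall / (precision + recall)

print(f"precision={precision:.2%} recall={recall:.2%} "
      f"specificity={specificity:.2%} F1={f1:.2%}")
```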
Affiliation(s)
- Wei-Fang Lee
- School of Dentistry, Taipei Medical University, Taipei, Taiwan
- School of Dental Technology, Taipei Medical University, Taipei, Taiwan
- Min-Yuh Day
- Institute of Information Management, National Taipei University, New Taipei City, Taiwan
- Chih-Yuan Fang
- School of Dentistry, Taipei Medical University, Taipei, Taiwan
- Department of Oral and Maxillofacial Surgery, Wan Fang Hospital, Taipei Medical University, Taipei, Taiwan
- Vidhya Nataraj
- Institute of Information Management, National Taipei University, New Taipei City, Taiwan
- Shih-Cheng Wen
- School of Dentistry, Taipei Medical University, Taipei, Taiwan
- Private Practice, New Taipei City, Taiwan
- Wei-Jen Chang
- School of Dentistry, Taipei Medical University, Taipei, Taiwan
- Dental Department, Taipei Medical University, Shuang Ho Hospital, New Taipei City, Taiwan
- Nai-Chia Teng
- School of Dentistry, Taipei Medical University, Taipei, Taiwan
- Department of Dentistry, Taipei Medical University Hospital, Taipei, Taiwan
10
Bai A, Si M, Xue P, Qu Y, Jiang Y. Artificial intelligence performance in detecting lymphoma from medical imaging: a systematic review and meta-analysis. BMC Med Inform Decis Mak 2024; 24:13. PMID: 38191361; PMCID: PMC10775443; DOI: 10.1186/s12911-023-02397-9.
Abstract
BACKGROUND Accurate diagnosis and early treatment are essential in the fight against lymphatic cancer. The application of artificial intelligence (AI) in medical imaging shows great potential, but its diagnostic accuracy for lymphoma is unclear. This study is the first to systematically review and meta-analyze research on the diagnostic performance of AI in detecting lymphoma from medical imaging. METHODS Searches were conducted in Medline, Embase, IEEE, and Cochrane up to December 2023. Data extraction and assessment of included study quality were conducted independently by two investigators. Studies that reported the diagnostic performance of one or more AI models for the early detection of lymphoma using medical imaging were included in the systematic review. We extracted binary diagnostic accuracy data to obtain the outcomes of interest: sensitivity (SE), specificity (SP), and area under the curve (AUC). The study was registered with PROSPERO (CRD42022383386). RESULTS Thirty studies were included in the systematic review, sixteen of which were meta-analyzed, with a pooled sensitivity of 87% (95%CI 83-91%), specificity of 94% (92-96%), and AUC of 97% (95-98%). Satisfactory diagnostic performance was observed in subgroup analyses based on algorithm type (machine learning versus deep learning, and whether transfer learning was applied), sample size (≤ 200 or > 200), clinicians versus AI models, and geographical distribution of institutions (Asia versus non-Asia). CONCLUSIONS Although performance may be overestimated and further studies with better standards for applying AI algorithms to lymphoma detection are needed, we suggest that AI may be useful in lymphoma diagnosis.
Affiliation(s)
- Anying Bai
- School of Population Medicine and Public Health, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Mingyu Si
- School of Population Medicine and Public Health, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Peng Xue
- School of Population Medicine and Public Health, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Yimin Qu
- School of Population Medicine and Public Health, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Yu Jiang
- School of Population Medicine and Public Health, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- School of Health Policy and Management, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
11
Haidar O, Jaques A, McCaughran PW, Metcalfe MJ. AI-Generated Information for Vascular Patients: Assessing the Standard of Procedure-Specific Information Provided by the ChatGPT AI-Language Model. Cureus 2023; 15:e49764. PMID: 38046759; PMCID: PMC10691169; DOI: 10.7759/cureus.49764.
Abstract
Introduction Ensuring access to high-quality information is paramount to facilitating informed surgical decision-making. Use of the internet to access health-related information is increasing, along with the growing prevalence of AI language models such as ChatGPT. We aim to assess the standard of AI-generated patient-facing information through a qualitative analysis of its readability and quality. Materials and methods We performed a retrospective qualitative analysis of information regarding three common vascular procedures: endovascular aortic repair (EVAR), endovenous laser ablation (EVLA), and femoro-popliteal bypass (FPBP). The ChatGPT responses were compared to patient information leaflets provided by the vascular charity, Circulation Foundation UK. Readability was assessed using four readability scores: the Flesch-Kincaid reading ease (FKRE) score, the Flesch-Kincaid grade level (FKGL), the Gunning fog score (GFS), and the simple measure of gobbledygook (SMOG) index. Quality was assessed using the DISCERN tool by two independent assessors. Results The mean FKRE score was 33.3, compared to 59.1 for the information provided by the Circulation Foundation (SD=14.5, p=0.025), indicating poor readability of AI-generated information. The FKGL indicated that the school grade expected to read and understand ChatGPT responses was consistently higher than for the information leaflets, at 12.7 vs. 9.4 (SD=1.9, p=0.002). Two metrics measure readability in terms of the number of years of education required to understand a piece of writing: the GFS and the SMOG index. Both scores indicated that AI-generated answers were less accessible. The GFS for ChatGPT-provided information was 16.7 years versus 12.8 years for the leaflets (SD=2.2, p=0.002), and the SMOG index scores were 12.2 and 9.4 years for ChatGPT and the patient information leaflets, respectively (SD=1.7, p=0.001). The DISCERN scores were consistently higher for human-generated patient information leaflets than for AI-generated information across all procedures; the mean score for the information provided by ChatGPT was 50.3 vs. 56.0 for the Circulation Foundation information leaflets (SD=3.38, p<0.001). Conclusion We concluded that AI-generated information about vascular surgical procedures is currently poor in both the readability of the text and the quality of information. Patients should be directed to reputable, human-generated information sources from trusted professional bodies to supplement direct education from the clinician during the pre-procedure consultation process.
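Editor's note: the Flesch-Kincaid reading ease score used above is a fixed formula over words, sentences, and syllables. The sketch below implements that formula with a naive syllable counter, so its scores will differ slightly from those of the tools the authors used.

```python
# Sketch of the Flesch-Kincaid reading ease (FKRE) formula with a naive
# vowel-group syllable counter; published tools use more careful heuristics.
import re

def count_syllables(word: str) -> int:
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))

def flesch_reading_ease(text: str) -> float:
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    syllables = sum(count_syllables(w) for w in words)
    return (206.835
            - 1.015 * (len(words) / len(sentences))
            - 84.6 * (syllables / len(words)))

sample = "The operation repairs the artery. Recovery usually takes a few weeks."
print(round(flesch_reading_ease(sample), 1))   # higher scores mean easier reading
```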
Affiliation(s)
- Omar Haidar
- Vascular Surgery, Lister Hospital, Stevenage, GBR
12
Higgins DC, Johner C. Validation of Artificial Intelligence Containing Products Across the Regulated Healthcare Industries. Ther Innov Regul Sci 2023; 57:797-809. PMID: 37202591; DOI: 10.1007/s43441-023-00530-4.
Abstract
PURPOSE The introduction of artificial intelligence / machine learning (AI/ML) products to the regulated fields of pharmaceutical research and development (R&D) and drug manufacture, and medical devices (MD) and in vitro diagnostics (IVD), poses new regulatory problems: a lack of a common terminology and understanding leads to confusion, delays and product failures. Validation as a key step in product development, common to each of these sectors including computerized systems and AI/ML development, offers an opportune point of comparison for aligning people and processes for cross-sectoral product development. METHODS A comparative approach, built upon workshops and a subsequent written sequence of exchanges, is summarized in a look-up table suitable for mixed-teams work. RESULTS 1. A bottom-up, definitions led, approach which leads to a distinction between broad vs narrow validation, and their relationship to regulatory regimes. 2. Common basis introduction to the primary methodologies for software validation, including AI-containing software validation. 3. Pharmaceutical drug development and MD/IVD-specific perspectives on compliant AI software development, as a basis for collaboration. CONCLUSIONS Alignment of the terms and methodologies used in validation of software products containing artificial intelligence/machine learning (AI/ML) components across the regulated industries of human health is a vital first step in streamlining processes and improving workflows.
Affiliation(s)
- David C Higgins
- Berlin Institute of Health, Bertolt-Brecht-Platz 3, 10117, Berlin, Germany
- Christian Johner
- Johner Institut GmbH, Reichenaustr. 39a, 78467, Constance, Germany
13
Farah L, Davaze-Schneider J, Martin T, Nguyen P, Borget I, Martelli N. Are current clinical studies on artificial intelligence-based medical devices comprehensive enough to support a full health technology assessment? A systematic review. Artif Intell Med 2023; 140:102547. PMID: 37210155; DOI: 10.1016/j.artmed.2023.102547.
Abstract
INTRODUCTION Artificial Intelligence-based Medical Devices (AI-based MDs) are experiencing exponential growth in healthcare. This study aimed to investigate whether current studies assessing AI contain the information required for health technology assessment (HTA) by HTA bodies. METHODS We conducted a systematic literature review based on the Preferred Reporting Items for Systematic Reviews and Meta-Analyses methodology to extract articles published between 2016 and 2021 related to the assessment of AI-based MDs. Data extraction focused on study characteristics, technology, algorithms, comparators, and results. AI quality assessment and HTA scores were calculated to evaluate whether the items present in the included studies were concordant with the HTA requirements. We performed a linear regression for the HTA and AI scores with the explanatory variables of the impact factor, publication date, and medical specialty. We conducted a univariate analysis of the HTA score and a multivariate analysis of the AI score with an alpha risk of 5 %. RESULTS Of 5578 retrieved records, 56 were included. The mean AI quality assessment score was 67 %; 32 % of articles had an AI quality score ≥ 70 %, 50 % had a score between 50 % and 70 %, and 18 % had a score under 50 %. The highest quality scores were observed for the study design (82 %) and optimisation (69 %) categories, whereas the scores were lowest in the clinical practice category (23 %). The mean HTA score was 52 % for all seven domains. 100 % of the studies assessed clinical effectiveness, whereas only 9 % evaluated safety, and 20 % evaluated economic issues. There was a statistically significant relationship between the impact factor and the HTA and AI scores (both p = 0.046). DISCUSSION Clinical studies on AI-based MDs have limitations and often lack adapted, robust, and complete evidence. High-quality datasets are also required because the output data can only be trusted if the inputs are reliable. The existing assessment frameworks are not specifically designed to assess AI-based MDs. From the perspective of regulatory authorities, we suggest that these frameworks should be adapted to assess the interpretability, explainability, cybersecurity, and safety of ongoing updates. From the perspective of HTA agencies, we highlight that transparency, professional and patient acceptance, ethical issues, and organizational changes are required for the implementation of these devices. Economic assessments of AI should rely on a robust methodology (business impact or health economic models) to provide decision-makers with more reliable evidence. CONCLUSION Currently, AI studies are insufficient to cover HTA prerequisites. HTA processes also need to be adapted because they do not consider the important specificities of AI-based MDs. Specific HTA workflows and accurate assessment tools should be designed to standardise evaluations, generate reliable evidence, and create confidence.
Affiliation(s)
- Line Farah
- Groupe de Recherche et d'accueil en Droit et Economie de la Santé (GRADES) Department, University Paris-Saclay, Orsay, France; Innovation Center for Medical Devices, Foch Hospital, 40 Rue Worth, 92150 Suresnes, France
- Julie Davaze-Schneider
- Pharmacy Department, Georges Pompidou European Hospital, AP-HP, 20 Rue Leblanc, 75015 Paris, France
- Tess Martin
- Groupe de Recherche et d'accueil en Droit et Economie de la Santé (GRADES) Department, University Paris-Saclay, Orsay, France; Pharmacy Department, Georges Pompidou European Hospital, AP-HP, 20 Rue Leblanc, 75015 Paris, France
- Pierre Nguyen
- Pharmacy Department, Georges Pompidou European Hospital, AP-HP, 20 Rue Leblanc, 75015 Paris, France
- Isabelle Borget
- Groupe de Recherche et d'accueil en Droit et Economie de la Santé (GRADES) Department, University Paris-Saclay, Orsay, France; Department of Biostatistics and Epidemiology, Gustave Roussy, University Paris-Saclay, 94805 Villejuif, France; Oncostat U1018, Inserm, University Paris-Saclay, Équipe Labellisée Ligue Contre le Cancer, Villejuif, France
- Nicolas Martelli
- Groupe de Recherche et d'accueil en Droit et Economie de la Santé (GRADES) Department, University Paris-Saclay, Orsay, France; Pharmacy Department, Georges Pompidou European Hospital, AP-HP, 20 Rue Leblanc, 75015 Paris, France
14
Chen M, Liang X, Xu Y. Construction and Analysis of Emotion Recognition and Psychotherapy System of College Students under Convolutional Neural Network and Interactive Technology. Comput Intell Neurosci 2022; 2022:5993839. PMID: 36164423; PMCID: PMC9509236; DOI: 10.1155/2022/5993839.
Abstract
This study aims to establish an effective psychological intervention and treatment system for college students so that their psychological problems can be discovered and corrected in a timely manner. From the perspectives of pedagogy and psychology, college students majoring in physical education are selected as the research objects, and an interactive college student emotion recognition and psychological intervention system is established based on a convolutional neural network (CNN). The system takes face recognition as the data source, adopts feature recognition algorithms to classify different students, and provides a psychological intervention platform designed around interactive technology; it is compared with existing systems and models to further verify its effectiveness. The results show that the deep learning CNN recognizes student emotions better than a backpropagation neural network (BPNN) or a decision tree (DT) algorithm, with recognition accuracy (ACC) as high as 89.32%. A support vector machine (SVM) is adopted to classify the emotions, and the recognition ACC increases by 20%. When the system's K value is 5 and its d value is 8, the ACC of the model reaches 92.35%. The use of this system for psychotherapy has a significant effect, and 45% of the students are very satisfied with the system's human-computer interaction. This study infers students' psychological state through emotion recognition and reduces human involvement through human-computer interaction, providing a new research direction for college psychotherapy.
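Editor's note: as a generic illustration of the CNN component described (not the authors' architecture), a minimal image classifier might look like the sketch below; the input resolution, channel counts, and the seven-class output are assumptions.

```python
# Generic illustrative CNN for classifying face images into emotion categories.
# Layer sizes, the 48x48 input, and the 7-class output are assumptions, not the
# architecture used in the study.
import torch
import torch.nn as nn

class EmotionCNN(nn.Module):
    def __init__(self, n_classes: int = 7):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 12 * 12, n_classes)  # for 48x48 inputs

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.features(x)
        return self.classifier(x.flatten(1))

model = EmotionCNN()
faces = torch.randn(4, 1, 48, 48)   # 4 synthetic grayscale face crops
print(model(faces).shape)           # torch.Size([4, 7])
```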
Affiliation(s)
- Minwei Chen
- College of Physical Education, Chongqing University, Chongqing 400044, China
- Xiaojun Liang
- College of Humanities, Zhaoqing Medical College, Zhaoqing 526020, China
- Yi Xu
- Ministry of Basic Education, Guangdong Eco-Engineering Polytechnic, Guangzhou 510520, China
15
Zhang X, Yang Y, Shen YW, Zhang KR, Jiang ZK, Ma LT, Ding C, Wang BY, Meng Y, Liu H. Diagnostic accuracy and potential covariates of artificial intelligence for diagnosing orthopedic fractures: a systematic literature review and meta-analysis. Eur Radiol 2022; 32:7196-7216. PMID: 35754091; DOI: 10.1007/s00330-022-08956-4.
Abstract
OBJECTIVES To systematically quantify the diagnostic accuracy and identify potential covariates affecting the performance of artificial intelligence (AI) in diagnosing orthopedic fractures. METHODS PubMed, Embase, Web of Science, and Cochrane Library were systematically searched for studies on AI applications in diagnosing orthopedic fractures from inception to September 29, 2021. Pooled sensitivity and specificity and the area under the receiver operating characteristic curve (AUC) were obtained. This study was registered in the PROSPERO database prior to initiation (CRD 42021254618). RESULTS Thirty-nine studies were eligible for quantitative analysis. The overall pooled AUC, sensitivity, and specificity were 0.96 (95% CI 0.94-0.98), 90% (95% CI 87-92%), and 92% (95% CI 90-94%), respectively. In subgroup analyses, multicenter studies yielded higher sensitivity (92% vs. 88%) and specificity (94% vs. 91%) than single-center studies. AI demonstrated higher sensitivity with transfer learning (92% vs. 87% without) and with data augmentation (92% vs. 87% without). Utilizing plain X-rays as input images for AI achieved results comparable to CT (AUC 0.96 vs. 0.96). Moreover, AI achieved results comparable to humans (AUC 0.97 vs. 0.97) and better results than non-expert human readers (AUC 0.98 vs. 0.96; sensitivity 95% vs. 88%). CONCLUSIONS AI demonstrated high accuracy in diagnosing orthopedic fractures from medical images. Larger-scale studies with higher design quality are needed to validate our findings. KEY POINTS • Multicenter study design, application of transfer learning, and data augmentation are closely related to improving the performance of artificial intelligence models in diagnosing orthopedic fractures. • Utilizing plain X-rays as input images for AI to diagnose fractures achieved results comparable to CT (AUC 0.96 vs. 0.96). • AI achieved results comparable to humans (AUC 0.97 vs. 0.97) but was superior to non-expert human readers (AUC 0.98 vs. 0.96, sensitivity 95% vs. 88%) in diagnosing fractures.
Affiliation(s)
- Xiang Zhang
- Department of Orthopedics, Orthopedic Research Institute, West China Hospital, Sichuan University, No. 37 Guo Xue Rd, Chengdu, 610041, China
- Yi Yang
- Department of Orthopedics, Orthopedic Research Institute, West China Hospital, Sichuan University, No. 37 Guo Xue Rd, Chengdu, 610041, China
- Yi-Wei Shen
- Department of Orthopedics, Orthopedic Research Institute, West China Hospital, Sichuan University, No. 37 Guo Xue Rd, Chengdu, 610041, China
- Ke-Rui Zhang
- Department of Orthopedics, Orthopedic Research Institute, West China Hospital, Sichuan University, No. 37 Guo Xue Rd, Chengdu, 610041, China
- Ze-Kun Jiang
- West China Biomedical Big Data Center, West China Hospital, Sichuan University, Chengdu, 610000, China
- Li-Tai Ma
- Department of Orthopedics, Orthopedic Research Institute, West China Hospital, Sichuan University, No. 37 Guo Xue Rd, Chengdu, 610041, China
- Chen Ding
- Department of Orthopedics, Orthopedic Research Institute, West China Hospital, Sichuan University, No. 37 Guo Xue Rd, Chengdu, 610041, China
- Bei-Yu Wang
- Department of Orthopedics, Orthopedic Research Institute, West China Hospital, Sichuan University, No. 37 Guo Xue Rd, Chengdu, 610041, China
- Yang Meng
- Department of Orthopedics, Orthopedic Research Institute, West China Hospital, Sichuan University, No. 37 Guo Xue Rd, Chengdu, 610041, China
- Hao Liu
- Department of Orthopedics, Orthopedic Research Institute, West China Hospital, Sichuan University, No. 37 Guo Xue Rd, Chengdu, 610041, China
16
Liu R, Wang M, Zheng T, Zhang R, Li N, Chen Z, Yan H, Shi Q. An artificial intelligence-based risk prediction model of myocardial infarction. BMC Bioinformatics 2022; 23:217. PMID: 35672659; PMCID: PMC9175344; DOI: 10.1186/s12859-022-04761-4.
Abstract
BACKGROUND Myocardial infarction can lead to malignant arrhythmia, heart failure, and sudden death. Clinical studies have shown that early identification of and timely intervention for acute MI can significantly reduce mortality. The traditional MI risk assessment models are subjective, and the data that go into them are difficult to obtain. Generally, the assessment is only conducted among high-risk patient groups. OBJECTIVE To construct an artificial intelligence-based risk prediction model of myocardial infarction (MI) for continuous and active monitoring of inpatients, especially those in noncardiovascular departments, and early warning of MI. METHODS The imbalanced data contain 59 features, which were constructed into a specific dataset through proportional division, upsampling, downsampling, easy ensemble, and w-easy ensemble. Then, the dataset was traversed using supervised machine learning, with recursive feature elimination as the top-layer algorithm and random forest, gradient boosting decision tree (GBDT), logistic regression, and support vector machine as the bottom-layer algorithms, to select the best model out of many through a variety of evaluation indices. RESULTS GBDT was the best bottom-layer algorithm, and downsampling was the best dataset construction method. In the validation set, the F1 score and accuracy of the 24-feature downsampling GBDT model were both 0.84. In the test set, the F1 score and accuracy of the 24-feature downsampling GBDT model were both 0.83, and the area under the curve was 0.91. CONCLUSION Compared with traditional models, artificial intelligence-based machine learning models have better accuracy and real-time performance and can reduce the occurrence of in-hospital MI from a data-driven perspective, thereby increasing the cure rate of patients and improving their prognosis.
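Editor's note: a minimal sketch of the winning combination described above (downsampling of the majority class, recursive feature elimination wrapped around a gradient boosting classifier), built on synthetic data; feature counts, class ratio, and hyperparameters are placeholders, and the 24 selected features mirror the abstract only loosely.

```python
# Illustrative pipeline only: random downsampling of the majority class, then
# recursive feature elimination (RFE) around a gradient boosting classifier.
# Data are synthetic and unrelated to the study's inpatient records.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.feature_selection import RFE
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score

X, y = make_classification(n_samples=2000, n_features=59, n_informative=12,
                           weights=[0.95, 0.05], random_state=0)

# Downsample the majority class to match the minority class size.
rng = np.random.default_rng(0)
minority = np.where(y == 1)[0]
majority = rng.choice(np.where(y == 0)[0], size=len(minority), replace=False)
idx = np.concatenate([minority, majority])
X_bal, y_bal = X[idx], y[idx]

X_train, X_test, y_train, y_test = train_test_split(
    X_bal, y_bal, test_size=0.3, random_state=0, stratify=y_bal)

selector = RFE(GradientBoostingClassifier(random_state=0), n_features_to_select=24)
selector.fit(X_train, y_train)
print("F1:", round(f1_score(y_test, selector.predict(X_test)), 2))
```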
Affiliation(s)
- Ran Liu
- MOE Key Lab for Neuroinformation, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 610054 Sichuan China
- Engineering Research Center of Medical Information Technology, Ministry of Education, West China Hospital of Sichuan University, Chengdu, 610041 Sichuan China
- Miye Wang
- Engineering Research Center of Medical Information Technology, Ministry of Education, West China Hospital of Sichuan University, Chengdu, 610041 Sichuan China
- Tao Zheng
- Engineering Research Center of Medical Information Technology, Ministry of Education, West China Hospital of Sichuan University, Chengdu, 610041 Sichuan China
- Rui Zhang
- Engineering Research Center of Medical Information Technology, Ministry of Education, West China Hospital of Sichuan University, Chengdu, 610041 Sichuan China
- Nan Li
- Engineering Research Center of Medical Information Technology, Ministry of Education, West China Hospital of Sichuan University, Chengdu, 610041 Sichuan China
- Zhongxiu Chen
- Department of Cardiology, West China Hospital of Sichuan University, Chengdu, 610041 Sichuan China
- Hongmei Yan
- MOE Key Lab for Neuroinformation, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 610054 Sichuan China
- Qingke Shi
- Engineering Research Center of Medical Information Technology, Ministry of Education, West China Hospital of Sichuan University, Chengdu, 610041 Sichuan China
17
Cho SW, Jung SJ, Shin JH, Won TB, Rhee CS, Kim JW. Evaluating Prediction Models of Sleep Apnea From Smartphone-Recorded Sleep Breathing Sounds. JAMA Otolaryngol Head Neck Surg 2022; 148:515-521. PMID: 35420648; PMCID: PMC9011176; DOI: 10.1001/jamaoto.2022.0244.
Abstract
Importance Breathing sounds during sleep are an important characteristic feature of obstructive sleep apnea (OSA) and have been regarded as a potential biomarker. Breathing sounds during sleep can be easily recorded using a microphone, which is found in most smartphone devices. Therefore, it may be easy to implement an evaluation tool for prescreening purposes. Objective To evaluate OSA prediction models using smartphone-recorded sounds and identify optimal settings with regard to noise processing and sound feature selection. Design, Setting, and Participants A cross-sectional study was performed among patients who visited the sleep center of Seoul National University Bundang Hospital for snoring or sleep apnea from August 2015 to August 2019. Audio recordings during sleep were performed using a smartphone during routine, full-night, in-laboratory polysomnography. Using a random forest algorithm, binary classifications were separately conducted for 3 different threshold criteria according to an apnea hypopnea index (AHI) threshold of 5, 15, or 30 events/h. Four regression models were created according to noise reduction and feature selection from the input sound to predict actual AHI: (1) noise reduction without feature selection, (2) noise reduction with feature selection, (3) neither noise reduction nor feature selection, and (4) feature selection without noise reduction. Clinical and polysomnographic parameters that may have been associated with errors were assessed. Data were analyzed from September 2019 to September 2020. Main Outcomes and Measures Accuracy of OSA prediction models. Results A total of 423 patients (mean [SD] age, 48.1 [12.8] years; 356 [84.1%] male) were analyzed. Data were split into training (n = 256 [60.5%]) and test data sets (n = 167 [39.5%]). Accuracies were 88.2%, 82.3%, and 81.7%, and the areas under curve were 0.90, 0.89, and 0.90 for an AHI threshold of 5, 15, and 30 events/h, respectively. In the regression analysis, using recorded sounds that had not been denoised and had only selected attributes resulted in the highest correlation coefficient (r = 0.78; 95% CI, 0.69-0.88). The AHI (β = 0.33; 95% CI, 0.24-0.42) and sleep efficiency (β = -0.20; 95% CI, -0.35 to -0.05) were found to be associated with estimation error. Conclusions and Relevance In this cross-sectional study, recorded sleep breathing sounds using a smartphone were used to create reasonably accurate OSA prediction models. Future research should focus on real-life recordings using various smartphone devices.
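Editor's note: to make the classification setup concrete, the sketch below trains a random forest on synthetic "sound feature" vectors for one AHI threshold; the feature count, sample size, and 60/40 split mirror the abstract only loosely and are not the study's pipeline or data.

```python
# Hypothetical sketch of one binary classification task (e.g., AHI >= 15 events/h)
# with a random forest on synthetic sound-derived features; not the study's data.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, roc_auc_score

rng = np.random.default_rng(1)
X = rng.normal(size=(423, 40))                   # 423 synthetic recordings, 40 features
ahi = rng.gamma(shape=2.0, scale=10.0, size=423) # synthetic apnea-hypopnea indices
X[:, 0] = ahi / 10 + rng.normal(size=423)        # give one feature a weak signal
y = (ahi >= 15).astype(int)                      # binary label at one AHI threshold

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.395, random_state=1, stratify=y)

clf = RandomForestClassifier(n_estimators=200, random_state=1).fit(X_train, y_train)
print("accuracy:", round(accuracy_score(y_test, clf.predict(X_test)), 2))
print("AUC:", round(roc_auc_score(y_test, clf.predict_proba(X_test)[:, 1]), 2))
```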
Collapse
Affiliation(s)
- Sung-Woo Cho
- Department of Otorhinolaryngology-Head and Neck Surgery, Seoul National University Bundang Hospital, Seoul National University College of Medicine, Seongnam, Korea
| | - Sung Jae Jung
- Big Data Center, Seoul National University Bundang Hospital, Seongnam, Korea
| | - Jin Ho Shin
- Big Data Center, Seoul National University Bundang Hospital, Seongnam, Korea
| | - Tae-Bin Won
- Department of Otorhinolaryngology-Head and Neck Surgery, Seoul National University Bundang Hospital, Seoul National University College of Medicine, Seongnam, Korea; Department of Otorhinolaryngology-Head and Neck Surgery, Seoul National University Hospital, Seoul National University College of Medicine, Seoul, Korea
| | - Chae-Seo Rhee
- Department of Otorhinolaryngology-Head and Neck Surgery, Seoul National University Bundang Hospital, Seoul National University College of Medicine, Seongnam, Korea; Department of Otorhinolaryngology-Head and Neck Surgery, Seoul National University Hospital, Seoul National University College of Medicine, Seoul, Korea; Sensory Organ Research Institute, Seoul National University Medical Research Center, Seoul, Korea
| | - Jeong-Whun Kim
- Department of Otorhinolaryngology-Head and Neck Surgery, Seoul National University Bundang Hospital, Seoul National University College of Medicine, Seongnam, Korea; Sensory Organ Research Institute, Seoul National University Medical Research Center, Seoul, Korea
| |
Collapse
|
18
|
Marti-Bonmati L, Koh DM, Riklund K, Bobowicz M, Roussakis Y, Vilanova JC, Fütterer JJ, Rimola J, Mallol P, Ribas G, Miguel A, Tsiknakis M, Lekadir K, Tsakou G. Considerations for artificial intelligence clinical impact in oncologic imaging: an AI4HI position paper. Insights Imaging 2022; 13:89. [PMID: 35536446 PMCID: PMC9091068 DOI: 10.1186/s13244-022-01220-9] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/22/2021] [Accepted: 04/07/2022] [Indexed: 01/12/2023] Open
Abstract
To achieve clinical impact in daily oncological practice, emerging AI-based cancer imaging research needs to have a clearly defined medical focus, AI methods, and outcomes to be estimated. AI-supported cancer imaging should predict major relevant clinical endpoints, aiming to extract associations and draw inferences in a fair, robust, and trustworthy way. AI-assisted solutions as medical devices, developed using multicenter heterogeneous datasets, should be targeted to have an impact on the clinical care pathway. When designing an AI-based research study in oncologic imaging, ensuring clinical impact requires careful consideration of key aspects, including target population selection, sample size definition, use of standards and common data elements, balanced dataset splitting, appropriate validation methodology, adequate ground truth, and careful selection of clinical endpoints. Endpoints may be pathology hallmarks, disease behavior, treatment response, or patient prognosis. Addressing ethical, safety, and privacy considerations is also mandatory before clinical validation is performed. The Artificial Intelligence for Health Imaging (AI4HI) Clinical Working Group has discussed and presents in this paper some indicative machine learning (ML)-enabled decision-support solutions currently under research in the AI4HI projects, as well as the main considerations and requirements that AI solutions should meet from a clinical perspective so that they can be adopted into clinical practice. If effectively designed, implemented, and validated, cancer imaging AI-supported tools will have the potential to revolutionize the field of precision medicine in oncology.
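One of the design considerations listed above, balanced and leakage-free dataset splitting, can be illustrated with a short sketch; the exam counts, patient identifiers, and labels below are synthetic placeholders, and the group-wise split shown is only one acceptable strategy, not a recommendation taken from the paper.

```python
# Illustrative sketch of patient-level (group-wise) dataset splitting so that
# no patient contributes images to both the training and the test set.
import numpy as np
from sklearn.model_selection import GroupShuffleSplit

rng = np.random.default_rng(1)
patient_id = rng.integers(0, 300, size=1000)   # placeholder: 1000 exams from 300 patients
label = rng.integers(0, 2, size=1000)          # placeholder clinical endpoint

splitter = GroupShuffleSplit(n_splits=1, test_size=0.2, random_state=1)
train_idx, test_idx = next(splitter.split(np.zeros((len(label), 1)), label, groups=patient_id))

# No patient appears in both subsets (prevents data leakage across the split).
assert not set(patient_id[train_idx]) & set(patient_id[test_idx])
print(f"train exams: {len(train_idx)}, test exams: {len(test_idx)}")
print(f"event rate train/test: {label[train_idx].mean():.2f} / {label[test_idx].mean():.2f}")
```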
Collapse
Affiliation(s)
- Luis Marti-Bonmati
- Radiology Department and Biomedical Imaging Research Group (GIBI230), La Fe Polytechnics and University Hospital and Health Research Institute, Valencia, Spain.
| | - Dow-Mu Koh
- Department of Radiology, Royal Marsden Hospital and Division of Radiotherapy and Imaging, Institute of Cancer Research, London, UK; Department of Radiology, The Royal Marsden NHS Trust, London, UK
| | - Katrine Riklund
- Department of Radiation Sciences, Diagnostic Radiology, Umeå University, 901 85, Umeå, Sweden
| | - Maciej Bobowicz
- 2nd Department of Radiology, Medical University of Gdansk, 17 Smoluchowskiego Str, 80-214, Gdansk, Poland
| | - Yiannis Roussakis
- Department of Medical Physics, German Oncology Center, 4108, Limassol, Cyprus
| | - Joan C Vilanova
- Department of Radiology, Clínica Girona, Institute of Diagnostic Imaging (IDI)-Girona, Faculty of Medicine, University of Girona, Girona, Spain
| | - Jurgen J Fütterer
- Department of Radiology and Nuclear Medicine, Radboud University Medical Center, Nijmegen, The Netherlands
| | - Jordi Rimola
- CIBERehd, Barcelona Clinic Liver Cancer (BCLC) Group, Department of Radiology, Hospital Clínic, University of Barcelona, Barcelona, Spain
| | - Pedro Mallol
- Radiology Department and Biomedical Imaging Research Group (GIBI230), La Fe Polytechnics and University Hospital and Health Research Institute, Valencia, Spain
| | - Gloria Ribas
- Radiology Department and Biomedical Imaging Research Group (GIBI230), La Fe Polytechnics and University Hospital and Health Research Institute, Valencia, Spain
| | - Ana Miguel
- Radiology Department and Biomedical Imaging Research Group (GIBI230), La Fe Polytechnics and University Hospital and Health Research Institute, Valencia, Spain
| | - Manolis Tsiknakis
- Foundation for Research and Technology Hellas, Institute of Computer Science, Computational Biomedicine Lab (CBML), FORTH-ICS Heraklion, Crete, Greece
| | - Karim Lekadir
- Departament de Matemàtiques and Informàtica, Artificial Intelligence in Medicine Lab (BCN-AIM), Universitat de Barcelona, Barcelona, Spain
| | - Gianna Tsakou
- Maggioli S.P.A., Research and Development Lab, Athens, Greece
| |
Collapse
|
19
|
Bernstam EV, Shireman PK, Meric‐Bernstam F, N. Zozus M, Jiang X, Brimhall BB, Windham AK, Schmidt S, Visweswaran S, Ye Y, Goodrum H, Ling Y, Barapatre S, Becich MJ. Artificial intelligence in clinical and translational science: Successes, challenges and opportunities. Clin Transl Sci 2022; 15:309-321. [PMID: 34706145 PMCID: PMC8841416 DOI: 10.1111/cts.13175] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/10/2021] [Accepted: 10/01/2021] [Indexed: 01/12/2023] Open
Abstract
Artificial intelligence (AI) is transforming many domains, including finance, agriculture, defense, and biomedicine. In this paper, we focus on the role of AI in clinical and translational research (CTR), including preclinical research (T1), clinical research (T2), clinical implementation (T3), and public (or population) health (T4). Given the rapid evolution of AI in CTR, we present three complementary perspectives: (1) scoping literature review, (2) survey, and (3) analysis of federally funded projects. For each CTR phase, we addressed challenges, successes, failures, and opportunities for AI. We surveyed Clinical and Translational Science Award (CTSA) hubs regarding AI projects at their institutions. Nineteen of 63 CTSA hubs (30%) responded to the survey. The most common funding source (48.5%) was the federal government. The most common translational phase was T2 (clinical research, 40.2%). Clinicians were the intended users in 44.6% of projects and researchers in 32.3% of projects. The most common computational approaches were supervised machine learning (38.6%) and deep learning (34.2%). The number of projects steadily increased from 2012 to 2020. Finally, we analyzed 2604 AI projects at CTSA hubs using the National Institutes of Health Research Portfolio Online Reporting Tools (RePORTER) database for 2011-2019. We mapped available abstracts to medical subject headings and found that nervous system (16.3%) and mental disorders (16.2%) were the most common topics addressed. From a computational perspective, big data (32.3%) and deep learning (30.0%) were most common. This work represents a snapshot in time of the role of AI in the CTSA program.
Collapse
Affiliation(s)
- Elmer V. Bernstam
- School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas, USA
- Division of General Internal Medicine, Department of Internal Medicine, McGovern Medical School, The University of Texas Health Science Center at Houston, Houston, Texas, USA
| | - Paula K. Shireman
- Departments of Surgery and Microbiology, Immunology & Molecular Genetics, University of Texas Health San Antonio, San Antonio, Texas, USA
- University Health, San Antonio, Texas, USA
- South Texas Veterans Health Care System, San Antonio, Texas, USA
| | - Funda Meric‐Bernstam
- Department of Investigational Cancer Therapeutics, The University of Texas MD Anderson Cancer Center, Houston, Texas, USA
| | - Meredith N. Zozus
- Division of Clinical Research Informatics, Department of Population Health Sciences, University of Texas Health San Antonio, San Antonio, Texas, USA
| | - Xiaoqian Jiang
- School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas, USA
| | - Bradley B. Brimhall
- University Health, San Antonio, Texas, USA
- Department of Pathology, University of Texas Health San Antonio, San Antonio, Texas, USA
| | - Ashley K. Windham
- University Health, San Antonio, Texas, USA
- Department of Pathology, University of Texas Health San Antonio, San Antonio, Texas, USA
| | - Susanne Schmidt
- Department of Population Health Sciences, University of Texas Health San Antonio, San Antonio, Texas, USA
| | - Shyam Visweswaran
- Department of Biomedical Informatics, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania, USA
| | - Ye Ye
- Department of Biomedical Informatics, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania, USA
| | - Heath Goodrum
- School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas, USA
| | - Yaobin Ling
- School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas, USA
| | - Seemran Barapatre
- Department of Biomedical Informatics, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania, USA
| | - Michael J. Becich
- Department of Biomedical Informatics, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania, USA
| |
Collapse
|
20
|
Liu M, Wang S, Chen H, Liu Y. A pilot study of a deep learning approach to detect marginal bone loss around implants. BMC Oral Health 2022; 22:11. [PMID: 35034611 PMCID: PMC8762847 DOI: 10.1186/s12903-021-02035-8] [Citation(s) in RCA: 31] [Impact Index Per Article: 10.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/21/2021] [Accepted: 12/28/2021] [Indexed: 01/17/2023] Open
Abstract
Background Recently, there has been considerable innovation in artificial intelligence (AI) for healthcare. Convolutional neural networks (CNNs) show excellent object detection and classification performance. This study assessed the accuracy of an AI application for the detection of marginal bone loss on periapical radiographs. Methods A faster region-based convolutional neural network (Faster R-CNN) was trained. Overall, 1670 periapical radiographic images were divided into training (n = 1370), validation (n = 150), and test (n = 150) datasets. The system was evaluated in terms of sensitivity, specificity, the mistake diagnostic rate, the omission diagnostic rate, and the positive predictive value. Kappa (κ) statistics were compared between the system and dental clinicians. Results The evaluation metrics of the AI system were comparable to those of a resident dentist. The agreement between the AI system and an expert was moderate to substantial (κ = 0.547 and 0.568 for bone loss sites and bone loss implants, respectively) for detecting marginal bone loss around dental implants. Conclusions This AI system based on Faster R-CNN analysis of periapical radiographs is a highly promising auxiliary diagnostic tool for peri-implant bone loss detection.
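As a rough illustration of the kind of detector described here, the sketch below fine-tunes torchvision's generic Faster R-CNN for a two-class (background versus bone-loss) problem; the image, the box annotation, and the class mapping are invented for the example and are not the authors' model, data, or training procedure.

```python
# Minimal Faster R-CNN sketch using torchvision's generic implementation.
# Class 1 is taken to mean "marginal bone loss site"; class 0 is background.
import torch
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

# Load a COCO-pretrained detector and replace its box head for 2 classes
# (downloads pretrained weights on first use).
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes=2)

# One synthetic "periapical radiograph" with one annotated bone-loss box.
image = torch.rand(3, 512, 512)
target = {"boxes": torch.tensor([[100.0, 120.0, 180.0, 220.0]]),
          "labels": torch.tensor([1])}

model.train()
losses = model([image], [target])      # dict of RPN and ROI-head losses
sum(losses.values()).backward()        # one illustrative gradient step (no optimizer shown)

model.eval()
with torch.no_grad():
    detections = model([image])[0]     # boxes, labels, scores for the image
print(detections["boxes"].shape, detections["scores"][:3])
```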
Collapse
Affiliation(s)
- Min Liu
- Department of Prosthodontics, Peking University School and Hospital of Stomatology and National Engineering Laboratory for Digital and Material Technology of Stomatology and Research Center of Engineering and Technology for Digital Dentistry of Ministry of Health and Beijing Key Laboratory of Digital Stomatology and National Clinical Research Center for Oral Diseases, 22 Zhongguancun Nandajie, Haidian District, Beijing, 100081, China
| | - Shimin Wang
- Department of Prosthodontics, Peking University School and Hospital of Stomatology and National Engineering Laboratory for Digital and Material Technology of Stomatology and Research Center of Engineering and Technology for Digital Dentistry of Ministry of Health and Beijing Key Laboratory of Digital Stomatology and National Clinical Research Center for Oral Diseases, 22 Zhongguancun Nandajie, Haidian District, Beijing, 100081, China
| | - Hu Chen
- Department of Prosthodontics, Peking University School and Hospital of Stomatology and National Engineering Laboratory for Digital and Material Technology of Stomatology and Research Center of Engineering and Technology for Digital Dentistry of Ministry of Health and Beijing Key Laboratory of Digital Stomatology and National Clinical Research Center for Oral Diseases, 22 Zhongguancun Nandajie, Haidian District, Beijing, 100081, China.
| | - Yunsong Liu
- Department of Prosthodontics, Peking University School and Hospital of Stomatology and National Engineering Laboratory for Digital and Material Technology of Stomatology and Research Center of Engineering and Technology for Digital Dentistry of Ministry of Health and Beijing Key Laboratory of Digital Stomatology and National Clinical Research Center for Oral Diseases, 22 Zhongguancun Nandajie, Haidian District, Beijing, 100081, China.
| |
Collapse
|
21
|
Fischer UM, Shireman PK, Lin JC. Current applications of artificial intelligence in vascular surgery. Semin Vasc Surg 2021; 34:268-271. [PMID: 34911633 PMCID: PMC9883982 DOI: 10.1053/j.semvascsurg.2021.10.008] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/17/2021] [Revised: 10/17/2021] [Accepted: 10/17/2021] [Indexed: 01/31/2023]
Abstract
Basic foundations of artificial intelligence (AI) include analyzing large amounts of data, recognizing patterns, and predicting outcomes. At the core of AI are well-defined areas, such as machine learning, natural language processing, artificial neural networks, and computer vision. Although research and development of AI in health care is being conducted in many medical subspecialties, only a few applications have been implemented in clinical practice. This is true in vascular surgery, where applications are mostly in the translational research stage. These AI applications are being evaluated in the realms of vascular diagnostics, perioperative medicine, risk stratification, and outcome prediction, among others. Apart from the technical challenges of AI and research outcomes on safe and beneficial use in patient care, ethical issues and policy surrounding AI will present future challenges for its successful implementation. This review will give a brief overview and a basic understanding of AI and summarize the currently available and used clinical AI applications in vascular surgery.
Collapse
Affiliation(s)
| | - Paula K. Shireman
- University of Texas Health San Antonio Long School of Medicine and the South Texas Veterans Health Care System
| | | |
Collapse
|
22
|
Kim C, Lee G, Oh H, Jeong G, Kim SW, Chun EJ, Kim YH, Lee JG, Yang DH. A deep learning-based automatic analysis of cardiovascular borders on chest radiographs of valvular heart disease: development/external validation. Eur Radiol 2021; 32:1558-1569. [PMID: 34647180 DOI: 10.1007/s00330-021-08296-9] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/19/2021] [Revised: 07/19/2021] [Accepted: 08/19/2021] [Indexed: 11/30/2022]
Abstract
OBJECTIVES Cardiovascular border (CB) analysis is the primary method for detecting and quantifying the severity of cardiovascular disease using posterior-anterior chest radiographs (CXRs). This study aimed to develop and validate a deep learning-based automatic CXR CB analysis algorithm (CB_auto) for diagnosing and quantitatively evaluating valvular heart disease (VHD). METHODS We developed CB_auto using 816 normal and 798 VHD CXRs. For validation, 640 normal and 542 VHD CXRs from three different hospitals and 132 CXRs from a public dataset were assigned. The reliability of the CB parameters determined by CB_auto was evaluated. To evaluate the differences between parameters determined by CB_auto and manual CB drawing (CB_hand), the absolute percentage measurement error (APE) was calculated. Pearson correlation coefficients were calculated between CB_hand and echocardiographic measurements. RESULTS CB parameters determined by CB_auto yielded excellent reliability (intraclass correlation coefficient > 0.98). The 95% limits of agreement for the cardiothoracic ratio were 0.00 ± 0.04% without systematic bias. The differences between parameters determined by CB_auto and CB_hand as defined by the APE were < 10% for all parameters except for carinal angle and left atrial appendage. In the public dataset, all CB parameters were successfully drawn in 124 of 132 CXRs (93.9%). All CB parameters were significantly greater in VHD than in normal controls (all p < 0.05). All CB parameters showed significant correlations (p < 0.05) with echocardiographic measurements. CONCLUSIONS The CB_auto system, empowered by a deep learning algorithm, provided highly reliable CB measurements that could be useful not only in daily clinical practice but also for research purposes. KEY POINTS • A deep learning-based automatic CB analysis algorithm for diagnosing and quantitatively evaluating VHD using posterior-anterior chest radiographs was developed and validated. • Our algorithm (CB_auto) yielded comparable reliability to manual CB drawing (CB_hand) in terms of various CB measurement variables, as confirmed by external validation with datasets from three different hospitals and a public dataset. • All CB parameters were significantly different between VHD and normal control measurements, and echocardiographic measurements were significantly correlated with CB parameters measured from normal control and VHD CXRs.
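The agreement statistics named in this abstract (absolute percentage measurement error, 95% limits of agreement, Pearson correlation) can be reproduced on placeholder measurements with a few lines of NumPy; the cardiothoracic-ratio values below are simulated and nothing here comes from the study's data or code.

```python
# Sketch of APE, Bland-Altman 95% limits of agreement, and Pearson r,
# computed on simulated cardiothoracic-ratio measurements.
import numpy as np

rng = np.random.default_rng(2)
ctr_hand = rng.uniform(0.40, 0.65, size=200)            # "manual" measurements
ctr_auto = ctr_hand + rng.normal(0.0, 0.01, size=200)   # "automatic" measurements

ape = np.abs(ctr_auto - ctr_hand) / ctr_hand * 100      # absolute percentage error
diff = ctr_auto - ctr_hand
loa_low = diff.mean() - 1.96 * diff.std(ddof=1)         # lower limit of agreement
loa_high = diff.mean() + 1.96 * diff.std(ddof=1)        # upper limit of agreement
r = np.corrcoef(ctr_hand, ctr_auto)[0, 1]               # Pearson correlation

print(f"median APE: {np.median(ape):.2f}%")
print(f"95% limits of agreement: {loa_low:+.3f} to {loa_high:+.3f}")
print(f"Pearson r: {r:.3f}")
```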
Collapse
Affiliation(s)
- Cherry Kim
- Department of Radiology, Korea University Ansan Hospital, Ansan, Korea
| | - Gaeun Lee
- Biomedical Engineering Research Center, Asan Institute for Life Sciences, University of Ulsan College of Medicine, Seoul, Korea
| | - Hongmin Oh
- Department of Radiology and Research Institute of Radiology, Cardiac Imaging Center, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Korea
| | - Gyujun Jeong
- Biomedical Engineering Research Center, Asan Institute for Life Sciences, University of Ulsan College of Medicine, Seoul, Korea
| | - Sun Won Kim
- Department of Cardiology, Korea University Ansan Hospital, Ansan, Korea
| | - Eun Ju Chun
- Department of Radiology, Seoul National University Bundang Hospital, Seongnam, Korea
| | - Young-Hak Kim
- Department of Cardiology, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Korea
| | - June-Goo Lee
- Biomedical Engineering Research Center, Asan Institute for Life Sciences, University of Ulsan College of Medicine, Seoul, Korea
| | - Dong Hyun Yang
- Department of Radiology and Research Institute of Radiology, Cardiac Imaging Center, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Korea.
| |
Collapse
|
23
|
Tsopra R, Fernandez X, Luchinat C, Alberghina L, Lehrach H, Vanoni M, Dreher F, Sezerman OU, Cuggia M, de Tayrac M, Miklasevics E, Itu LM, Geanta M, Ogilvie L, Godey F, Boldisor CN, Campillo-Gimenez B, Cioroboiu C, Ciusdel CF, Coman S, Hijano Cubelos O, Itu A, Lange B, Le Gallo M, Lespagnol A, Mauri G, Soykam HO, Rance B, Turano P, Tenori L, Vignoli A, Wierling C, Benhabiles N, Burgun A. A framework for validating AI in precision medicine: considerations from the European ITFoC consortium. BMC Med Inform Decis Mak 2021; 21:274. [PMID: 34600518 PMCID: PMC8487519 DOI: 10.1186/s12911-021-01634-3] [Citation(s) in RCA: 39] [Impact Index Per Article: 9.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/18/2020] [Accepted: 09/22/2021] [Indexed: 12/22/2022] Open
Abstract
BACKGROUND Artificial intelligence (AI) has the potential to transform our healthcare systems significantly. New AI technologies based on machine learning approaches should play a key role in clinical decision-making in the future. However, their implementation in health care settings remains limited, mostly due to a lack of robust validation procedures. There is a need to develop reliable assessment frameworks for the clinical validation of AI. We present here an approach for assessing AI for predicting treatment response in triple-negative breast cancer (TNBC), using real-world data and molecular omics data from clinical data warehouses and biobanks. METHODS The European "ITFoC (Information Technology for the Future Of Cancer)" consortium designed a framework for the clinical validation of AI technologies for predicting treatment response in oncology. RESULTS This framework is based on seven key steps specifying: (1) the intended use of AI, (2) the target population, (3) the timing of AI evaluation, (4) the datasets used for evaluation, (5) the procedures used for ensuring data safety (including data quality, privacy and security), (6) the metrics used for measuring performance, and (7) the procedures used to ensure that the AI is explainable. This framework forms the basis of a validation platform that we are building for the "ITFoC Challenge". This community-wide competition will make it possible to assess and compare AI algorithms for predicting the response to TNBC treatments with external real-world datasets. CONCLUSIONS The predictive performance and safety of AI technologies must be assessed in a robust, unbiased and transparent manner before their implementation in healthcare settings. We believe that the considerations of the ITFoC consortium will contribute to the safe transfer and implementation of AI in clinical settings, in the context of precision oncology and personalized care.
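Purely as an illustration of how such a framework might be operationalized, the sketch below encodes the seven steps enumerated in the abstract as a simple checklist a validation pipeline could iterate over; the dataclass, field names, and example note are invented here and are not part of the ITFoC platform.

```python
# Illustrative checklist structure mirroring the seven steps listed in the abstract.
from dataclasses import dataclass

@dataclass
class ValidationStep:
    name: str
    completed: bool = False
    notes: str = ""

ITFOC_STEPS = [
    ValidationStep("Intended use of the AI"),
    ValidationStep("Target population"),
    ValidationStep("Timing of AI evaluation"),
    ValidationStep("Datasets used for evaluation"),
    ValidationStep("Data safety (quality, privacy, security)"),
    ValidationStep("Performance metrics"),
    ValidationStep("Explainability procedures"),
]

def report(steps):
    # Print a simple progress report for the checklist.
    for s in steps:
        suffix = f" - {s.notes}" if s.notes else ""
        print(f"[{'x' if s.completed else ' '}] {s.name}{suffix}")

ITFOC_STEPS[0].completed = True
ITFOC_STEPS[0].notes = "Predict treatment response in TNBC"
report(ITFOC_STEPS)
```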
Collapse
Affiliation(s)
- Rosy Tsopra
- Centre de Recherche Des Cordeliers, Inserm, Université de Paris, Sorbonne Université, 75006, Paris, France; Inria, HeKA, Inria Paris, France; Department of Medical Informatics, Hôpital Européen Georges-Pompidou, AP-HP, Paris, France; Univ Rennes, CHU Rennes, Inserm, LTSI - UMR 1099, 35000, Rennes, France.
| | | | - Claudio Luchinat
- Centro Risonanze Magnetiche - CERM/CIRMMP and Department of Chemistry, University of Florence, 50019, Sesto Fiorentino (Florence), Italy
| | - Lilia Alberghina
- Department of Biotechnology and Biosciences, University of Milano Bicocca and ISBE-Italy/SYSBIO - Candidate National Node of Italy for ISBE, Research Infrastructure for Systems Biology Europe, Milan, Italy
| | - Hans Lehrach
- Max Planck Institute for Molecular Genetics, Berlin, Germany; Alacris Theranostics GmbH, Berlin, Germany
| | - Marco Vanoni
- Department of Biotechnology and Biosciences, University of Milano Bicocca and ISBE-Italy/SYSBIO - Candidate National Node of Italy for ISBE, Research Infrastructure for Systems Biology Europe, Milan, Italy
| | | | - O Ugur Sezerman
- School of Medicine Biostatistics and Medical Informatics Dept., Acibadem University, Istanbul, Turkey
| | - Marc Cuggia
- Univ Rennes, CHU Rennes, Inserm, LTSI - UMR 1099, 35000, Rennes, France
| | - Marie de Tayrac
- Univ Rennes, Department of Molecular Genetics and Genomics, CHU Rennes, IGDR-UMR6290, CNRS, 35000, Rennes, France
| | | | | | - Marius Geanta
- Centre for Innovation in Medicine, Bucharest, Romania
| | - Lesley Ogilvie
- Max Planck Institute for Molecular Genetics, Berlin, Germany; Alacris Theranostics GmbH, Berlin, Germany
| | - Florence Godey
- INSERM U1242 « Chemistry, Oncogenesis Stress Signaling », Université de Rennes, 35042, CEDEX, Rennes, France; Centre de Lutte Contre Le Cancer Eugène Marquis, CRB Santé (BRIF Number: BB-0033-00056), 35042, CEDEX, Rennes, France
| | | | | | | | | | - Simona Coman
- Transilvania University of Brasov, Brasov, Romania
| | | | - Alina Itu
- Transilvania University of Brasov, Brasov, Romania
| | - Bodo Lange
- Alacris Theranostics GmbH, Berlin, Germany
| | - Matthieu Le Gallo
- INSERM U1242 « Chemistry, Oncogenesis Stress Signaling », Université de Rennes, 35042, CEDEX, Rennes, France; Centre de Lutte Contre Le Cancer Eugène Marquis, CRB Santé (BRIF Number: BB-0033-00056), 35042, CEDEX, Rennes, France
| | - Alexandra Lespagnol
- Department of Molecular Genetics and Genomics, CHU Rennes, 35000, Rennes, France
| | - Giancarlo Mauri
- Department of Informatics, Systems and Communication, University of Milano Bicocca and ISBE-Italy/SYSBIO - Candidate National Node of Italy for ISBE, Research Infrastructure for Systems Biology Europe, Milan, Italy
| | | | - Bastien Rance
- Centre de Recherche Des Cordeliers, Inserm, Université de Paris, Sorbonne Université, 75006, Paris, France; Inria, HeKA, Inria Paris, France; Department of Medical Informatics, Hôpital Européen Georges-Pompidou, AP-HP, Paris, France
| | - Paola Turano
- Centro Risonanze Magnetiche - CERM/CIRMMP and Department of Chemistry, University of Florence, 50019, Sesto Fiorentino (Florence), Italy
| | - Leonardo Tenori
- Centro Risonanze Magnetiche - CERM/CIRMMP and Department of Chemistry, University of Florence, 50019, Sesto Fiorentino (Florence), Italy
| | - Alessia Vignoli
- Centro Risonanze Magnetiche - CERM/CIRMMP and Department of Chemistry, University of Florence, 50019, Sesto Fiorentino (Florence), Italy
| | | | - Nora Benhabiles
- Direction de La Recherche Fondamentale (DRF), CEA, Université Paris-Saclay, 91191, Gif-sur-Yvette, France
| | - Anita Burgun
- Centre de Recherche Des Cordeliers, Inserm, Université de Paris, Sorbonne Université, 75006, Paris, France; Inria, HeKA, Inria Paris, France; Department of Medical Informatics, Hôpital Européen Georges-Pompidou, AP-HP, Paris, France; PaRis Artificial Intelligence Research InstitutE (Prairie), Paris, France
| |
Collapse
|
24
|
Sarhan A, Swift A, Gorner A, Rokne J, Alhajj R, Docherty G, Crichton A. Utilizing a responsive web portal for studying disc tracing agreement in retinal images. PLoS One 2021; 16:e0251703. [PMID: 34032798 PMCID: PMC8148353 DOI: 10.1371/journal.pone.0251703] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/23/2020] [Accepted: 05/02/2021] [Indexed: 11/18/2022] Open
Abstract
Glaucoma is a leading cause of blindness worldwide whose detection is based on multiple factors, including measuring the cup-to-disc ratio, retinal nerve fiber layer and visual field defects. Advances in image processing and machine learning have allowed the development of automated approaches for segmenting objects from fundus images. However, to build a robust system, a reliable ground truth dataset is required for proper training and validation of the model. In this study, we investigate the level of agreement in properly detecting the retinal disc in fundus images using an online portal built for such purposes. Two Doctors of Optometry independently traced the discs for 159 fundus images obtained from publicly available datasets using a purpose-built online portal. Additionally, we studied the effectiveness of ellipse fitting in handling misalignments in tracing. We measured tracing precision, interobserver variability, and average boundary distance between the results provided by ophthalmologists and the optometrists' tracings. We also studied whether ellipse fitting has a positive or negative impact on properly detecting disc boundaries. The overall agreement between the optometrists in terms of locating the disc region in these images was 0.87. However, we found only fair agreement on the disc border (kappa = 0.21). Disagreements were mainly in fundus images obtained from glaucomatous patients. The resulting dataset was deemed to be an acceptable ground truth dataset for training and validation of models for automatic detection of objects in fundus images.
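Two of the ingredients described above, fitting an ellipse to a traced disc boundary and quantifying inter-grader agreement with a kappa statistic, are sketched below using OpenCV and scikit-learn; the boundary points and grader labels are synthetic, and this is not the portal's actual processing code.

```python
# Sketch: ellipse fitting on a noisy traced boundary, plus Cohen's kappa
# between two graders' binary calls on synthetic data.
import numpy as np
import cv2
from sklearn.metrics import cohen_kappa_score

rng = np.random.default_rng(3)

# Noisy boundary points around a true ellipse, as a hand tracing might produce.
t = np.linspace(0, 2 * np.pi, 80, endpoint=False)
pts = np.stack([300 + 90 * np.cos(t), 250 + 70 * np.sin(t)], axis=1)
pts += rng.normal(0, 3, size=pts.shape)
(cx, cy), (ax1, ax2), angle = cv2.fitEllipse(pts.astype(np.float32).reshape(-1, 1, 2))
print(f"fitted centre=({cx:.1f},{cy:.1f}), axes=({ax1:.1f},{ax2:.1f}), angle={angle:.1f}")

# Per-image agreement between two graders on a binary call (about 80% concordant here).
grader_a = rng.integers(0, 2, size=159)
grader_b = np.where(rng.random(159) < 0.8, grader_a, 1 - grader_a)
print(f"Cohen's kappa: {cohen_kappa_score(grader_a, grader_b):.2f}")
```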
Collapse
Affiliation(s)
- Abdullah Sarhan
- Department of Computer Science, University of Calgary, Calgary, Canada
| | - Andrew Swift
- Cumming School of Medicine, University of Calgary, Calgary, Canada
| | - Adam Gorner
- Cumming School of Medicine, University of Calgary, Calgary, Canada
| | - Jon Rokne
- Department of Computer Science, University of Calgary, Calgary, Canada
| | - Reda Alhajj
- Department of Computer Science, University of Calgary, Calgary, Canada
- Department of Computer Engineering, Istanbul Medipol University, Istanbul, Turkey
- Department of Health Informatics, University of Southern Denmark, Odense, Denmark
| | - Gavin Docherty
- Department of Ophthalmology and Visual Sciences, University of Calgary, Calgary, Canada
| | - Andrew Crichton
- Department of Ophthalmology and Visual Sciences, University of Calgary, Calgary, Canada
| |
Collapse
|
25
|
Yoon JH, Kim EK. Deep Learning-Based Artificial Intelligence for Mammography. Korean J Radiol 2021; 22:1225-1239. [PMID: 33987993 PMCID: PMC8316774 DOI: 10.3348/kjr.2020.1210] [Citation(s) in RCA: 29] [Impact Index Per Article: 7.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/05/2020] [Revised: 01/11/2021] [Accepted: 01/17/2021] [Indexed: 12/27/2022] Open
Abstract
During the past decade, researchers have investigated the use of computer-aided mammography interpretation. With the application of deep learning technology, artificial intelligence (AI)-based algorithms for mammography have shown promising results in the quantitative assessment of parenchymal density, detection and diagnosis of breast cancer, and prediction of breast cancer risk, enabling more precise patient management. AI-based algorithms may also enhance the efficiency of the interpretation workflow by reducing both the workload and interpretation time. However, more in-depth investigation is required to conclusively prove the effectiveness of AI-based algorithms. This review article discusses how AI algorithms can be applied to mammography interpretation as well as the current challenges in its implementation in real-world practice.
Collapse
Affiliation(s)
- Jung Hyun Yoon
- Department of Radiology, Severance Hospital, Research Institute of Radiological Science, Seoul, Korea
| | - Eun Kyung Kim
- Department of Radiology, Yongin Severance Hospital, Yonsei University, College of Medicine, Yongin, Korea.
| |
Collapse
|
26
|
Cho SJ, Sunwoo L, Baik SH, Bae YJ, Choi BS, Kim JH. Brain metastasis detection using machine learning: a systematic review and meta-analysis. Neuro Oncol 2021; 23:214-225. [PMID: 33075135 PMCID: PMC7906058 DOI: 10.1093/neuonc/noaa232] [Citation(s) in RCA: 61] [Impact Index Per Article: 15.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/13/2022] Open
Abstract
BACKGROUND Accurate detection of brain metastasis (BM) is important for cancer patients. We aimed to systematically review the performance and quality of machine-learning-based BM detection on MRI in the relevant literature. METHODS A systematic literature search was performed for relevant studies reported before April 27, 2020. We assessed the quality of the studies using modified tailored questionnaires of the Quality Assessment of Diagnostic Accuracy Studies 2 (QUADAS-2) criteria and the Checklist for Artificial Intelligence in Medical Imaging (CLAIM). Pooled detectability was calculated using an inverse-variance weighting model. RESULTS A total of 12 studies were included, which showed a clear transition from classical machine learning (cML) to deep learning (DL) after 2018. The studies on DL used a larger sample size than those on cML. The cML and DL groups also differed in the composition of the dataset, and technical details such as data augmentation. The pooled proportions of detectability of BM were 88.7% (95% CI, 84-93%) and 90.1% (95% CI, 84-95%) in the cML and DL groups, respectively. The false-positive rate per person was lower in the DL group than the cML group (10 vs 135, P < 0.001). In the patient selection domain of QUADAS-2, three studies (25%) were designated as high risk due to non-consecutive enrollment and arbitrary exclusion of nodules. CONCLUSION A comparable detectability of BM with a low false-positive rate per person was found in the DL group compared with the cML group. Improvements are required in terms of quality and study design.
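The pooling approach named in the methods, an inverse-variance weighting model, can be sketched for per-study detection proportions as follows; the counts are placeholders, the variance uses a simple normal approximation, and this is not the meta-analysis code used in the review.

```python
# Fixed-effect, inverse-variance weighted pooling of per-study detection proportions.
import numpy as np

detected = np.array([45, 120, 210, 88])     # brain metastases detected per study (placeholder)
total = np.array([50, 140, 230, 100])       # brain metastases present per study (placeholder)

p = detected / total
var = p * (1 - p) / total                   # binomial variance of each proportion
w = 1.0 / var                               # inverse-variance weights

pooled = np.sum(w * p) / np.sum(w)          # weighted pooled proportion
se = np.sqrt(1.0 / np.sum(w))               # standard error of the pooled estimate
ci = (pooled - 1.96 * se, pooled + 1.96 * se)
print(f"pooled detectability: {pooled:.3f} (95% CI {ci[0]:.3f}-{ci[1]:.3f})")
```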
Collapse
Affiliation(s)
- Se Jin Cho
- Department of Radiology, Seoul National University Bundang Hospital, Seoul National University College of Medicine, Seongnam, Gyeonggi, Republic of Korea
| | - Leonard Sunwoo
- Department of Radiology, Seoul National University Bundang Hospital, Seoul National University College of Medicine, Seongnam, Gyeonggi, Republic of Korea
| | - Sung Hyun Baik
- Department of Radiology, Seoul National University Bundang Hospital, Seoul National University College of Medicine, Seongnam, Gyeonggi, Republic of Korea
| | - Yun Jung Bae
- Department of Radiology, Seoul National University Bundang Hospital, Seoul National University College of Medicine, Seongnam, Gyeonggi, Republic of Korea
| | - Byung Se Choi
- Department of Radiology, Seoul National University Bundang Hospital, Seoul National University College of Medicine, Seongnam, Gyeonggi, Republic of Korea
| | - Jae Hyoung Kim
- Department of Radiology, Seoul National University Bundang Hospital, Seoul National University College of Medicine, Seongnam, Gyeonggi, Republic of Korea
| |
Collapse
|
27
|
Ha EJ, Baek JH. Applications of machine learning and deep learning to thyroid imaging: where do we stand? Ultrasonography 2021; 40:23-29. [PMID: 32660203 PMCID: PMC7758100 DOI: 10.14366/usg.20068] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/19/2020] [Revised: 07/01/2020] [Accepted: 07/03/2020] [Indexed: 01/17/2023] Open
Abstract
Ultrasonography (US) is the primary diagnostic tool used to assess the risk of malignancy and to inform decision-making regarding the use of fine-needle aspiration (FNA) and post-FNA management in patients with thyroid nodules. However, since US image interpretation is operator-dependent and interobserver variability is moderate to substantial, unnecessary FNA and/or diagnostic surgery are common in practice. Artificial intelligence (AI)-based computer-aided diagnosis (CAD) systems have been introduced to help with the accurate and consistent interpretation of US features, ultimately leading to a decrease in unnecessary FNA. This review provides a developmental overview of the AI-based CAD systems currently used for thyroid nodules and describes the future developmental directions of these systems for the personalized and optimized management of thyroid nodules.
Collapse
Affiliation(s)
- Eun Ju Ha
- Department of Radiology, Ajou University School of Medicine, Suwon, Korea
| | - Jung Hwan Baek
- Department of Radiology and Research Institute of Radiology, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Korea
| |
Collapse
|
28
|
Lara Hernandez KA, Rienmüller T, Baumgartner D, Baumgartner C. Deep learning in spatiotemporal cardiac imaging: A review of methodologies and clinical usability. Comput Biol Med 2020; 130:104200. [PMID: 33421825 DOI: 10.1016/j.compbiomed.2020.104200] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/12/2020] [Revised: 12/16/2020] [Accepted: 12/21/2020] [Indexed: 12/24/2022]
Abstract
The use of different cardiac imaging modalities such as MRI, CT or ultrasound enables the visualization and interpretation of altered morphological structures and function of the heart. In recent years, there has been an increasing interest in AI and deep learning that take into account spatial and temporal information in medical image analysis. In particular, deep learning tools using temporal information in image processing have not yet found their way into daily clinical practice, despite their presumed high diagnostic and prognostic value. This review aims to synthesize the most relevant deep learning methods and discuss their clinical usability in dynamic cardiac imaging using, for example, the complete spatiotemporal image information of the heart cycle. Selected articles were categorized according to the following indicators: clinical applications, quality of datasets, preprocessing and annotation, learning methods and training strategy, and test performance. Clinical usability was evaluated based on these criteria by classifying the selected papers into (i) clinical level, (ii) robust candidate and (iii) proof of concept applications. Interestingly, not a single one of the reviewed papers was classified as a "clinical level" study. Almost 39% of the articles achieved a "robust candidate" and as many as 61% a "proof of concept" status. In summary, deep learning in spatiotemporal cardiac imaging is still strongly research-oriented and its implementation in clinical practice still requires considerable effort. Challenges that need to be addressed include the quality of datasets, together with clinical verification and validation of the performance achieved by the chosen method.
Collapse
Affiliation(s)
- Karen Andrea Lara Hernandez
- Institute of Health Care Engineering with European Testing Center of Medical Devices, Graz University of Technology, Graz, Austria; Department of Biomedical Engineering, Galileo University, Guatemala City, Guatemala
| | - Theresa Rienmüller
- Institute of Health Care Engineering with European Testing Center of Medical Devices, Graz University of Technology, Graz, Austria
| | | | - Christian Baumgartner
- Institute of Health Care Engineering with European Testing Center of Medical Devices, Graz University of Technology, Graz, Austria.
| |
Collapse
|
29
|
Han M, Ha EJ, Park JH. Computer-Aided Diagnostic System for Thyroid Nodules on Ultrasonography: Diagnostic Performance Based on the Thyroid Imaging Reporting and Data System Classification and Dichotomous Outcomes. AJNR Am J Neuroradiol 2020; 42:559-565. [PMID: 33361374 DOI: 10.3174/ajnr.a6922] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/14/2020] [Accepted: 09/29/2020] [Indexed: 01/19/2023]
Abstract
BACKGROUND AND PURPOSE Artificial intelligence-based computer-aided diagnostic systems have been introduced for thyroid cancer diagnosis. Our aim was to compare the diagnostic performance of a commercially available computer-aided diagnostic system and radiologist-based assessment for the detection of thyroid cancer based on the Thyroid Imaging Reporting and Data System (TIRADS) and dichotomous outcomes. MATERIALS AND METHODS In total, 372 consecutive patients with 454 thyroid nodules were enrolled. The computer-aided diagnostic system was set up to render a possible diagnosis in 2 formats: the Korean Society of Thyroid Radiology (K-TIRADS) and American Thyroid Association (ATA) TIRADS classifications, and dichotomous outcomes (possibly benign or possibly malignant). RESULTS The diagnostic sensitivity, specificity, positive predictive value, negative predictive value, and accuracy of the computer-aided diagnostic system for thyroid cancer were, respectively, 97.6%, 21.6%, 42.0%, 93.9%, and 49.6% for K-TIRADS; 94.6%, 29.6%, 43.9%, 90.4%, and 53.5% for ATA-TIRADS; and 81.4%, 81.9%, 72.3%, 88.3%, and 81.7% for dichotomous outcomes. The sensitivities of the computer-aided diagnostic system did not differ significantly from those of the radiologist (all P > .05); the specificities and accuracies were significantly lower than those of the radiologist (all P < .001). Unnecessary fine-needle aspiration rates were lower for the dichotomous outcome characterizations, particularly for those performed by the radiologist. The interobserver agreement for the description of K-TIRADS and ATA-TIRADS classifications was fair-to-moderate, but the dichotomous outcomes were in substantial agreement. CONCLUSIONS The diagnostic performance of the computer-aided diagnostic system varies in terms of TIRADS classification and dichotomous outcomes and relative to radiologist-based assessments. Clinicians should be aware of the strengths and weaknesses associated with the diagnosis of thyroid cancer using computer-aided diagnostic systems.
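The performance indices reported above follow directly from a 2x2 table of the CAD output (possibly malignant versus possibly benign) against the reference standard; a short sketch with placeholder counts, not the study's data, is given below.

```python
# Diagnostic performance indices from a 2x2 confusion table (placeholder counts).
tp, fp, fn, tn = 140, 35, 32, 158

sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
ppv = tp / (tp + fp)                       # positive predictive value
npv = tn / (tn + fn)                       # negative predictive value
accuracy = (tp + tn) / (tp + fp + fn + tn)

for name, value in [("sensitivity", sensitivity), ("specificity", specificity),
                    ("PPV", ppv), ("NPV", npv), ("accuracy", accuracy)]:
    print(f"{name}: {100 * value:.1f}%")
```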
Collapse
Affiliation(s)
- M Han
- Department of Radiology, Ajou University School of Medicine, Suwon, Korea
| | - E J Ha
- Department of Radiology, Ajou University School of Medicine, Suwon, Korea
| | - J H Park
- Department of Radiology, Ajou University School of Medicine, Suwon, Korea
| |
Collapse
|
30
|
Kim HS. Apprehensions about Excessive Belief in Digital Therapeutics: Points of Concern Excluding Merits. J Korean Med Sci 2020; 35:e373. [PMID: 33230984 PMCID: PMC7683239 DOI: 10.3346/jkms.2020.35.e373] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 07/07/2020] [Accepted: 09/08/2020] [Indexed: 12/26/2022] Open
Abstract
Digital therapeutics (DTx), like drugs or medical devices, 1) must prove their effectiveness and safety through clinical trials; 2) are provided to patients through prescriptions from doctors; and 3) may require the approval of regulatory agencies, though this might not be mandatory. Although DTx will play an important role in the medical field in the near future, some merits of DTx have been exaggerated at this crucial juncture. In the medical field, where safety and effectiveness are important, merely reducing the development time and costs of DTx is not advantageous. The adverse effects of DTx are not yet well known and will only be identified with the passage of time. DTx are beneficial for the collection and analysis of real-world data (RWD); however, they require new and distinct work to collect and analyze high-quality RWD. Naturally, whether this is possible must be independently ascertained through scientific methods. Depending on the type of disease, it is not recommended that DTx be prescribed, even if the patient rejects conventional treatment. Prescription of conventional pharmacotherapy is often necessary, and if the prescription of DTx is inadequate, the critical time for initial treatment may be missed. There is no basis for assuming that patients will continue to use DTx; in fact, the rate of continued DTx use is extremely low. While many conventional pharmacotherapies have undergone numerous verification and safety tests over a long time, barriers to the application of DTx in the medical field are lower than those for conventional pharmacotherapies. For these reasons, except for certain special cases, an approach to DTx is needed that complements the prescription of conventional pharmacotherapy by the medical staff. When DTx are prescribed by doctors who clearly know their advantages and disadvantages, the doctors' expertise may undergo further refinement, and the quality of medical care is expected to improve.
Collapse
Affiliation(s)
- Hun Sung Kim
- Department of Medical Informatics, College of Medicine, The Catholic University of Korea, Seoul, Korea
- Department of Endocrinology and Metabolism, College of Medicine, The Catholic University of Korea, Seoul, Korea.
| |
Collapse
|
31
|
Kim DW, Jang HY, Ko Y, Son JH, Kim PH, Kim SO, Lim JS, Park SH. Inconsistency in the use of the term "validation" in studies reporting the performance of deep learning algorithms in providing diagnosis from medical imaging. PLoS One 2020; 15:e0238908. [PMID: 32915901 PMCID: PMC7485764 DOI: 10.1371/journal.pone.0238908] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/14/2020] [Accepted: 08/26/2020] [Indexed: 12/22/2022] Open
Abstract
BACKGROUND The development of deep learning (DL) algorithms is a three-step process: training, tuning, and testing. Studies are inconsistent in the use of the term "validation", with some using it to refer to tuning and others to testing, which hinders accurate delivery of information and may inadvertently exaggerate the performance of DL algorithms. We investigated the extent of inconsistency in usage of the term "validation" in studies on the accuracy of DL algorithms in providing diagnosis from medical imaging. METHODS AND FINDINGS We analyzed the full texts of research papers cited in two recent systematic reviews. The papers were categorized according to whether the term "validation" was used to refer to tuning alone, both tuning and testing, or testing alone. We analyzed whether paper characteristics (i.e., journal category, field of study, year of print publication, journal impact factor [JIF], and nature of test data) were associated with the usage of the terminology using multivariable logistic regression analysis with generalized estimating equations. Of 201 papers published in 125 journals, 118 (58.7%), 9 (4.5%), and 74 (36.8%) used the term to refer to tuning alone, both tuning and testing, and testing alone, respectively. A weak association was noted between higher JIF and using the term to refer to testing (i.e., testing alone or both tuning and testing) instead of tuning alone (vs. JIF <5; JIF 5 to 10: adjusted odds ratio 2.11, P = 0.042; JIF >10: adjusted odds ratio 2.41, P = 0.089). Journal category, field of study, year of print publication, and nature of test data were not significantly associated with the terminology usage. CONCLUSIONS Existing literature has a significant degree of inconsistency in using the term "validation" when referring to the steps in DL algorithm development. Efforts are needed to improve the accuracy and clarity of terminology usage.
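The statistical model named in the methods, a multivariable logistic regression fitted with generalized estimating equations that clusters papers within journals, can be sketched with statsmodels as below; the data frame, variable names, and JIF bands are synthetic stand-ins, not the study's dataset or analysis code.

```python
# Logistic GEE sketch: terminology usage (testing vs. tuning) regressed on JIF band,
# with papers clustered within journals. All data are randomly generated.
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(4)
n = 201
df = pd.DataFrame({
    "refers_to_testing": rng.integers(0, 2, size=n),   # 1 = "validation" used to mean testing
    "jif_band": rng.choice(["<5", "5-10", ">10"], size=n),
    "journal_id": rng.integers(0, 125, size=n),         # clustering variable
})

model = smf.gee("refers_to_testing ~ C(jif_band, Treatment('<5'))",
                groups="journal_id", data=df,
                family=sm.families.Binomial(),
                cov_struct=sm.cov_struct.Exchangeable())
result = model.fit()
print(np.exp(result.params))    # adjusted odds ratios relative to JIF < 5
```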
Collapse
Affiliation(s)
- Dong Wook Kim
- Department of Radiology and Research Institute of Radiology, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Republic of Korea
| | - Hye Young Jang
- Department of Radiology, National Cancer Center, Goyang, Republic of Korea
| | - Yousun Ko
- Biomedical Research Center, Asan Institute for Life Sciences, Asan Medical Center, Seoul, Republic of Korea
| | - Jung Hee Son
- Department of Radiology and Research Institute of Radiology, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Republic of Korea
| | - Pyeong Hwa Kim
- Department of Radiology and Research Institute of Radiology, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Republic of Korea
| | - Seon-Ok Kim
- Department of Clinical Epidemiology and Biostatistics, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Republic of Korea
| | - Joon Seo Lim
- Scientific Publications Team, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Republic of Korea
| | - Seong Ho Park
- Department of Radiology and Research Institute of Radiology, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Republic of Korea
| |
Collapse
|
32
|
Mongan J, Moy L, Kahn CE. Checklist for Artificial Intelligence in Medical Imaging (CLAIM): A Guide for Authors and Reviewers. Radiol Artif Intell 2020; 2:e200029. [PMID: 33937821 PMCID: PMC8017414 DOI: 10.1148/ryai.2020200029] [Citation(s) in RCA: 652] [Impact Index Per Article: 130.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/04/2020] [Accepted: 03/05/2020] [Indexed: 12/23/2022]
Affiliation(s)
- John Mongan
- From the Department of Radiology and Biomedical Imaging, University of California–San Francisco, San Francisco, Calif (J.M.); Department of Radiology and Center for Advanced Imaging Innovation and Research, New York University School of Medicine, New York, NY (L.M.); and Department of Radiology, University of Pennsylvania, 3400 Spruce St, 1 Silverstein, Philadelphia, PA 19104 (C.E.K.)
| | - Linda Moy
- From the Department of Radiology and Biomedical Imaging, University of California–San Francisco, San Francisco, Calif (J.M.); Department of Radiology and Center for Advanced Imaging Innovation and Research, New York University School of Medicine, New York, NY (L.M.); and Department of Radiology, University of Pennsylvania, 3400 Spruce St, 1 Silverstein, Philadelphia, PA 19104 (C.E.K.)
| | - Charles E. Kahn
- From the Department of Radiology and Biomedical Imaging, University of California–San Francisco, San Francisco, Calif (J.M.); Department of Radiology and Center for Advanced Imaging Innovation and Research, New York University School of Medicine, New York, NY (L.M.); and Department of Radiology, University of Pennsylvania, 3400 Spruce St, 1 Silverstein, Philadelphia, PA 19104 (C.E.K.)
| |
Collapse
|
33
|
Kim DW, Jang HY, Kim KW, Shin Y, Park SH. Design Characteristics of Studies Reporting the Performance of Artificial Intelligence Algorithms for Diagnostic Analysis of Medical Images: Results from Recently Published Papers. Korean J Radiol 2019; 20:405-410. [PMID: 30799571 PMCID: PMC6389801 DOI: 10.3348/kjr.2019.0025] [Citation(s) in RCA: 280] [Impact Index Per Article: 46.7] [Reference Citation Analysis] [Abstract] [Key Words] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/10/2019] [Accepted: 02/04/2019] [Indexed: 01/17/2023] Open
Abstract
Objective To evaluate the design characteristics of studies that evaluated the performance of artificial intelligence (AI) algorithms for the diagnostic analysis of medical images. Materials and Methods PubMed MEDLINE and Embase databases were searched to identify original research articles published between January 1, 2018 and August 17, 2018 that investigated the performance of AI algorithms that analyze medical images to provide diagnostic decisions. Eligible articles were evaluated to determine 1) whether the study used external validation rather than internal validation, and in case of external validation, whether the data for validation were collected, 2) with diagnostic cohort design instead of diagnostic case-control design, 3) from multiple institutions, and 4) in a prospective manner. These are fundamental methodologic features recommended for clinical validation of AI performance in real-world practice. The studies that fulfilled the above criteria were identified. We classified the publishing journals into medical vs. non-medical journal groups. Then, the results were compared between medical and non-medical journals. Results Of 516 eligible published studies, only 6% (31 studies) performed external validation. None of the 31 studies adopted all three design features: diagnostic cohort design, the inclusion of multiple institutions, and prospective data collection for external validation. No significant difference was found between medical and non-medical journals. Conclusion Nearly all of the studies published in the study period that evaluated the performance of AI algorithms for diagnostic analysis of medical images were designed as proof-of-concept technical feasibility studies and did not have the design features that are recommended for robust validation of the real-world clinical performance of AI algorithms.
Collapse
Affiliation(s)
- Dong Wook Kim
- Department of Radiology, Taean-gun Health Center and County Hospital, Taean-gun, Korea
| | - Hye Young Jang
- Department of Radiology and Research Institute of Radiology, University of Ulsan College of Medicine, Asan Medical Center, Seoul, Korea
| | - Kyung Won Kim
- Department of Radiology and Research Institute of Radiology, University of Ulsan College of Medicine, Asan Medical Center, Seoul, Korea
| | - Youngbin Shin
- Department of Radiology and Research Institute of Radiology, University of Ulsan College of Medicine, Asan Medical Center, Seoul, Korea
| | - Seong Ho Park
- Department of Radiology and Research Institute of Radiology, University of Ulsan College of Medicine, Asan Medical Center, Seoul, Korea.
| |
Collapse
|
34
|
Kim HL, Ha EJ, Han M. Real-World Performance of Computer-Aided Diagnosis System for Thyroid Nodules Using Ultrasonography. ULTRASOUND IN MEDICINE & BIOLOGY 2019; 45:2672-2678. [PMID: 31262524 DOI: 10.1016/j.ultrasmedbio.2019.05.032] [Citation(s) in RCA: 32] [Impact Index Per Article: 5.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 02/26/2019] [Revised: 05/29/2019] [Accepted: 05/30/2019] [Indexed: 05/22/2023]
Abstract
This study evaluated the diagnostic performance of a commercially available computer-aided diagnosis (CAD) system (S-Detect 1 and S-Detect 2 for thyroid) for detecting thyroid cancers. Among 218 thyroid nodules in 106 patients, the sensitivity, specificity, positive predictive value, negative predictive value and accuracy of the CAD systems were 80.2%, 82.6%, 75.0%, 86.3% and 81.7%, respectively, for the S-Detect 1 and 81.4%, 68.2%, 62.5%, 84.9% and 73.4%, respectively, for the S-Detect 2. The inter-observer agreement between the CAD system and radiologist for the description of calcifications was fair (kappa = 0.336), while the final diagnosis and each ultrasonographic descriptor showed moderate to substantial agreement for the S-Detect 2. To conclude, the current CAD systems had limited specificity in the diagnosis of thyroid cancer. One of the main limitations of the S-Detect 2 was its inaccuracy in recognizing calcifications, which meant that differentiation had to be undertaken by the radiologist.
Collapse
Affiliation(s)
- Hye Lin Kim
- Department of Radiology, Ajou University School of Medicine, Suwon, South Korea
| | - Eun Ju Ha
- Department of Radiology, Ajou University School of Medicine, Suwon, South Korea.
| | - Miran Han
- Department of Radiology, Ajou University School of Medicine, Suwon, South Korea
| |
Collapse
|
35
|
Becker A. Artificial intelligence in medicine: What is it doing for us today? HEALTH POLICY AND TECHNOLOGY 2019. [DOI: 10.1016/j.hlpt.2019.03.004] [Citation(s) in RCA: 46] [Impact Index Per Article: 7.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/06/2023]
|
36
|
Gasparyan AY, Kitas GD. Steps towards quality of open access publishing. Mediterr J Rheumatol 2018; 29:184-186. [PMID: 32185323 PMCID: PMC7045940 DOI: 10.31138/mjr.29.4.184] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Key Words] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/19/2018] [Accepted: 12/20/2018] [Indexed: 11/17/2022] Open
Affiliation(s)
- Armen Yuri Gasparyan
- Departments of Rheumatology and Research and Development, Dudley Group NHS Foundation Trust (Teaching Trust of the University of Birmingham, UK), Russells Hall Hospital, Dudley, West Midlands, UK
| | - George D Kitas
- Departments of Rheumatology and Research and Development, Dudley Group NHS Foundation Trust (Teaching Trust of the University of Birmingham, UK), Russells Hall Hospital, Dudley, West Midlands, UK.,Arthritis Research UK Epidemiology Unit, University of Manchester, Manchester, UK
| |
Collapse
|
37
|
Kang JH, Kim DH, Park SH, Baek JH. Age of Data in Contemporary Research Articles Published in Representative General Radiology Journals. Korean J Radiol 2018; 19:1172-1178. [PMID: 30386148 PMCID: PMC6201984 DOI: 10.3348/kjr.2018.19.6.1172] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/31/2018] [Accepted: 09/01/2018] [Indexed: 12/13/2022] Open
Abstract
Objective: To analyze and compare the age of data in contemporary research articles published in representative general radiology journals. Materials and Methods: We searched for articles reporting original research studies analyzing patient data that were published in the print issues of the Korean Journal of Radiology (KJR), European Radiology (ER), and Radiology in 2017. Eligible articles were reviewed to extract the data collection period (time from first patient recruitment to last patient follow-up) and the age of data (time between the end of data collection and publication). The journals were compared in terms of the proportion of articles reporting the data collection period to the level of calendar month and in terms of the age of data. Results: There were 50, 492, and 254 eligible articles in KJR, ER, and Radiology, respectively. Of these, 44 (88%; 95% confidence interval [CI]: 75.8-94.8%), 359 (73%; 95% CI: 68.9-76.7%), and 211 (83.1%; 95% CI: 78-87.2%) articles, respectively, provided sufficient detail on the data collection period, revealing a significant difference between ER and Radiology (p = 0.002). The age of data was significantly greater in KJR (median: 826 days; range: 299-2843 days) than in ER (median: 570 days; range: 56-4742 days; p < 0.001) and Radiology (median: 618 days; range: 75-4271 days; p < 0.001). Conclusion: KJR did not fall behind ER or Radiology in reporting the data collection period, but showed a significantly greater age of data than ER and Radiology, suggesting that KJR should take measures to improve the timeliness of its data.
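Two quantities drive the comparison above: the age of data, defined as the days between the end of data collection and publication, and the 95% confidence interval around the proportion of articles reporting the collection period. The minimal sketch below illustrates both with hypothetical dates and counts; the Wilson score interval is used here as one common choice, since the abstract does not state which interval method the authors applied.

```python
# Minimal sketch of "age of data" (days from the end of data collection
# to publication) and a 95% CI for a reporting proportion. Dates and
# counts are hypothetical examples, not values from the study.
from datetime import date
from math import sqrt
from statistics import median

def age_of_data(collection_end, publication):
    """Days between the end of data collection and publication."""
    return (publication - collection_end).days

def wilson_ci_95(successes, n):
    """Wilson score 95% CI for a proportion (one common choice)."""
    z = 1.96
    p = successes / n
    denom = 1 + z ** 2 / n
    center = (p + z ** 2 / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return center - half, center + half

if __name__ == "__main__":
    # Hypothetical collection-end and publication dates for three articles.
    ages = [
        age_of_data(date(2015, 6, 30), date(2017, 10, 1)),
        age_of_data(date(2016, 1, 31), date(2017, 4, 1)),
        age_of_data(date(2014, 12, 31), date(2017, 7, 1)),
    ]
    print("median age of data (days):", median(ages))
    print("95% CI for 40 of 50 articles reporting:", wilson_ci_95(40, 50))
```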
Collapse
Affiliation(s)
- Ji Hun Kang
- Department of Radiology and Research Institute of Radiology, University of Ulsan College of Medicine, Asan Medical Center, Seoul 05505, Korea
| | - Dong Hwan Kim
- Department of Radiology and Research Institute of Radiology, University of Ulsan College of Medicine, Asan Medical Center, Seoul 05505, Korea
| | - Seong Ho Park
- Department of Radiology and Research Institute of Radiology, University of Ulsan College of Medicine, Asan Medical Center, Seoul 05505, Korea
| | - Jung Hwan Baek
- Department of Radiology and Research Institute of Radiology, University of Ulsan College of Medicine, Asan Medical Center, Seoul 05505, Korea
| |
Collapse
|
38
|
Park SJ, Shin JY, Kim S, Son J, Jung KH, Park KH. A Novel Fundus Image Reading Tool for Efficient Generation of a Multi-dimensional Categorical Image Database for Machine Learning Algorithm Training. J Korean Med Sci 2018; 33:e239. [PMID: 30344460 PMCID: PMC6193885 DOI: 10.3346/jkms.2018.33.e239] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 05/04/2018] [Accepted: 07/10/2018] [Indexed: 12/20/2022] Open
Abstract
BACKGROUND: We described a novel multi-step retinal fundus image reading system for providing high-quality, large-scale data for machine learning algorithms, and assessed grader variability in the large-scale dataset generated with this system. METHODS: A 5-step retinal fundus image reading tool was developed that rates image quality, presence of abnormality, findings with location information, diagnoses, and clinical significance. Each image was evaluated by 3 different graders, and agreement among graders was evaluated for each decision. RESULTS: A total of 234,242 readings of 79,458 images were collected from 55 licensed ophthalmologists over 6 months. In all, 34,364 images were graded as abnormal by at least one rater. Of these, all three raters agreed on abnormality in 46.6% of images, while 69.9% were rated as abnormal by two or more raters. The rate of agreement between at least two raters on a given finding was 26.7%-65.2%, and the rate of complete agreement among all three raters was 5.7%-43.3%. For diagnoses, agreement between at least two raters was 35.6%-65.6%, and complete agreement was 11.0%-40.0%. Agreement on findings and diagnoses was higher when restricted to images with prior complete agreement on abnormality. Retinal and glaucoma specialists showed higher agreement on findings and diagnoses within their corresponding subspecialties. CONCLUSION: This novel reading tool for retinal fundus images generated a large-scale dataset with a high level of information, which can be utilized in the future development of machine learning-based algorithms for automated identification of abnormal conditions and clinical decision support systems. These results emphasize the importance of addressing grader variability in algorithm development.
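The multi-grader agreement rates reported above reduce to simple counting over a per-image table of ratings. The sketch below (hypothetical labels, not the study's dataset) computes, among images flagged abnormal by at least one of three graders, the fraction flagged by all three and the fraction flagged by two or more.

```python
# Minimal sketch of multi-grader abnormality agreement: for each image
# rated by 3 graders (1 = abnormal, 0 = normal), compute agreement rates
# among images flagged abnormal by at least one rater. Ratings below are
# hypothetical and do not come from the study.

def abnormality_agreement(ratings):
    """ratings: list of (r1, r2, r3) binary abnormality calls per image.
    Returns (fraction flagged by all three, fraction flagged by >= two),
    restricted to images flagged abnormal by at least one rater."""
    flagged = [r for r in ratings if sum(r) >= 1]
    all_three = sum(1 for r in flagged if sum(r) == 3) / len(flagged)
    two_or_more = sum(1 for r in flagged if sum(r) >= 2) / len(flagged)
    return all_three, two_or_more

if __name__ == "__main__":
    # Hypothetical ratings for 8 images from 3 graders.
    ratings = [
        (1, 1, 1), (1, 1, 0), (1, 0, 0), (0, 0, 0),
        (1, 1, 1), (0, 1, 0), (1, 1, 1), (0, 0, 1),
    ]
    all3, two_plus = abnormality_agreement(ratings)
    print(f"all three agree on abnormality: {all3:.1%}")
    print(f"two or more rate abnormal: {two_plus:.1%}")
```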
Collapse
Affiliation(s)
- Sang Jun Park
- Department of Ophthalmology, Seoul National University Bundang Hospital, Seoul National University College of Medicine, Seongnam, Korea
| | - Joo Young Shin
- Department of Ophthalmology, Dongguk University Ilsan Hospital, Goyang, Korea
| | | | | | | | - Kyu Hyung Park
- Department of Ophthalmology, Seoul National University Bundang Hospital, Seoul National University College of Medicine, Seongnam, Korea
| |
Collapse
|
39
|
Park SH. Regulatory Approval versus Clinical Validation of Artificial Intelligence Diagnostic Tools. Radiology 2018; 288:910-911. [PMID: 30040041 DOI: 10.1148/radiol.2018181310] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/11/2022]
Affiliation(s)
- Seong Ho Park
- Department of Radiology and Research Institute of Radiology, University of Ulsan College of Medicine, Asan Medical Center, 88 Olympic-ro 43-gil, Songpa-gu, Seoul 05505, South Korea
| |
Collapse
|
40
|
Park SH, Do KH, Choi JI, Sim JS, Yang DM, Eo H, Woo H, Lee JM, Jung SE, Oh JH. Principles for evaluating the clinical implementation of novel digital healthcare devices. J Korean Med Assoc 2018. [DOI: 10.5124/jkma.2018.61.12.765] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/22/2022] Open
Affiliation(s)
- Seong Ho Park
- Department of Radiology and Research Institute of Radiology, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Korea
| | - Kyung-Hyun Do
- Department of Radiology and Research Institute of Radiology, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Korea
| | - Joon-Il Choi
- Department of Radiology, Seoul St. Mary's Hospital, The Catholic University of Korea College of Medicine, Seoul, Korea
| | | | - Dal Mo Yang
- Department of Radiology, Kyung Hee University Hospital at Gangdong, Seoul, Korea
| | - Hong Eo
- Department of Radiology and Center for Imaging Science, Samsung Medical Center, Seoul, Korea
| | - Hyunsik Woo
- Department of Radiology, SMG-SNU Boramae Medical Center, Seoul National University College of Medicine, Seoul, Korea
| | - Jeong Min Lee
- Department of Radiology, Seoul National University Hospital, Seoul National University College of Medicine, Seoul, Korea
| | - Seung Eun Jung
- Department of Radiology, Seoul St. Mary's Hospital, The Catholic University of Korea College of Medicine, Seoul, Korea
| | - Joo Hyeong Oh
- Department of Radiology, Kyung Hee University Hospital, Kyung Hee University College of Medicine, Seoul, Korea
| |
Collapse
|