1. Xie J, Zhang Q, Cui Z, Ma C, Zhou Y, Wang W, Shen D. Integrating Eye Tracking With Grouped Fusion Networks for Semantic Segmentation on Mammogram Images. IEEE Transactions on Medical Imaging 2025; 44:868-879. [PMID: 39331544] [DOI: 10.1109/tmi.2024.3468404]
Abstract
Medical image segmentation has seen great progress in recent years, largely due to the development of deep neural networks. However, unlike in computer vision, high-quality clinical data is relatively scarce, and the annotation process is often a burden for clinicians. As a result, the scarcity of medical data limits the performance of existing medical image segmentation models. In this paper, we propose a novel framework that integrates eye tracking information from experienced radiologists during the screening process to improve the performance of deep neural networks with limited data. Our approach, a grouped hierarchical network, guides the network to learn from its faults by using gaze information as weak supervision. We demonstrate the effectiveness of our framework on mammogram images, particularly for handling segmentation classes with large scale differences. We evaluate the impact of gaze information on medical image segmentation tasks and show that our method achieves better segmentation performance compared to state-of-the-art models. A robustness study is conducted to investigate the influence of distraction or inaccuracies in gaze collection. We also develop a convenient system for collecting gaze data without interrupting the normal clinical workflow. Our work offers novel insights into the potential benefits of integrating gaze information into medical image segmentation tasks.
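The paper's grouped hierarchical network is not reproduced here, but the core idea of using gaze as weak supervision can be illustrated with a simple auxiliary loss. The sketch below (PyTorch; all shapes and the alpha weight are illustrative assumptions, not the authors' implementation) adds a term that encourages the predicted foreground distribution to concentrate where the radiologists' fixation density is high.

```python
import torch
import torch.nn.functional as F

def gaze_weak_supervision_loss(logits, gt_mask, gaze_heatmap, alpha=0.1):
    """Combine full supervision (where masks exist) with a weak
    gaze-consistency term: predicted foreground probability should
    concentrate where radiologists actually looked.

    logits:       (B, 1, H, W) raw segmentation scores
    gt_mask:      (B, 1, H, W) binary labels
    gaze_heatmap: (B, 1, H, W) fixation density, normalized to sum to 1
    """
    seg_loss = F.binary_cross_entropy_with_logits(logits, gt_mask)
    prob = torch.sigmoid(logits)
    # Cross-entropy between the gaze density and the normalized
    # predicted foreground distribution over pixels.
    prob_dist = prob / prob.sum(dim=(2, 3), keepdim=True).clamp_min(1e-8)
    gaze_term = -(gaze_heatmap * torch.log(prob_dist.clamp_min(1e-8))).sum(dim=(2, 3)).mean()
    return seg_loss + alpha * gaze_term

# toy shapes
logits = torch.randn(2, 1, 64, 64)
gt = (torch.rand(2, 1, 64, 64) > 0.9).float()
gaze = torch.rand(2, 1, 64, 64)
gaze = gaze / gaze.sum(dim=(2, 3), keepdim=True)
print(gaze_weak_supervision_loss(logits, gt, gaze).item())
```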
2. Zhu H, Liu W, Gao Z, Zhang H. Explainable Classification of Benign-Malignant Pulmonary Nodules With Neural Networks and Information Bottleneck. IEEE Transactions on Neural Networks and Learning Systems 2025; 36:2028-2039. [PMID: 37843998] [DOI: 10.1109/tnnls.2023.3303395]
Abstract
Computed tomography (CT) is the primary clinical technique for differentiating benign from malignant pulmonary nodules in lung cancer diagnosis. Early classification of pulmonary nodules is essential to slow down the degenerative process and reduce mortality. The interactive paradigm assisted by neural networks is considered an effective means for early lung cancer screening in large populations. However, some inherent characteristics of pulmonary nodules in high-resolution CT images, e.g., diverse shapes and sparse distribution over the lung fields, often induce inaccurate results. Moreover, most existing neural-network methods suffer from a lack of transparency. To overcome these obstacles, a united framework is proposed, comprising classification and feature-visualization stages, to learn distinctive features and provide visual results. Specifically, a bilateral scheme is employed to synchronously extract and aggregate global-local features in the classification stage, where the global branch perceives deep-level features and the local branch focuses on refined details. Furthermore, an encoder is built to generate features and a decoder to simulate decision behavior, with the objective optimized from an information bottleneck viewpoint. Extensive experiments evaluate the framework on two publicly available datasets, namely, 1) the Lung Image Database Consortium and Image Database Resource Initiative (LIDC-IDRI) and 2) the Lung and Colon Histopathological Image Dataset (LC25000). For instance, the framework achieves 92.98% accuracy and provides additional visualizations on LIDC-IDRI. The experimental results show that the framework obtains outstanding performance and effectively facilitates explainability, demonstrating a serviceable tool with the scalability to be introduced into clinical research.
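The exact encoder-decoder pairing used by the authors is not specified in the abstract, so the sketch below shows the standard variational information bottleneck objective that this family of methods builds on: compress features into a stochastic code z and trade classification accuracy against a KL compression term. A minimal PyTorch sketch; dimensions and beta are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class VIBClassifier(nn.Module):
    """Standard variational information bottleneck head: compress
    features into z ~ N(mu, sigma^2) and classify from the sample."""
    def __init__(self, in_dim=256, z_dim=32, n_classes=2):
        super().__init__()
        self.enc = nn.Linear(in_dim, 2 * z_dim)
        self.cls = nn.Linear(z_dim, n_classes)

    def forward(self, feats):
        mu, logvar = self.enc(feats).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterize
        return self.cls(z), mu, logvar

def vib_loss(logits, labels, mu, logvar, beta=1e-3):
    ce = F.cross_entropy(logits, labels)
    kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(dim=-1).mean()
    return ce + beta * kl  # beta trades accuracy against compression

feats, labels = torch.randn(8, 256), torch.randint(0, 2, (8,))
model = VIBClassifier()
logits, mu, logvar = model(feats)
print(vib_loss(logits, labels, mu, logvar).item())
```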
3. Ma C, Zhao L, Chen Y, Guo L, Zhang T, Hu X, Shen D, Jiang X, Liu T. Rectify ViT Shortcut Learning by Visual Saliency. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:18013-18025. [PMID: 37703160] [DOI: 10.1109/tnnls.2023.3310531]
Abstract
Shortcut learning in deep learning models occurs when unintended features are prioritized, resulting in degenerated feature representations and reduced generalizability and interpretability. However, shortcut learning in the widely used vision transformer (ViT) framework is largely unexplored. Meanwhile, introducing domain-specific knowledge is a major approach to rectifying shortcuts that are dominated by background-related factors. For example, eye-gaze data from radiologists are effective human visual prior knowledge with great potential to guide deep learning models to focus on meaningful foreground regions. However, obtaining eye-gaze data can be time-consuming, labor-intensive, and even impractical. In this work, we propose a novel and effective saliency-guided ViT (SGT) model to rectify shortcut learning in ViT in the absence of eye-gaze data. Specifically, a computational visual saliency model (either pretrained or fine-tuned) predicts saliency maps for input image samples. The saliency maps are then used to filter the most informative image patches. Considering that this filtering may lead to global information loss, we further introduce a residual connection that calculates self-attention across all image patches. Experimental results on natural and medical image datasets show that our SGT framework can effectively learn and leverage human prior knowledge without eye-gaze data and achieves much better performance than baselines. Meanwhile, it successfully rectifies harmful shortcut learning and significantly improves the interpretability of the ViT model, demonstrating the promise of visual saliency derived from human prior knowledge for rectifying shortcut learning.
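The saliency-driven patch filtering step can be sketched compactly. The following is a minimal illustration, not the authors' code: each ViT patch is scored by its mean saliency, the top fraction is kept as tokens, and (as the abstract notes) a residual self-attention over all patches would compensate for the discarded global context.

```python
import torch

def select_salient_patches(image, saliency, patch=16, keep_ratio=0.5):
    """Score each ViT patch by its mean saliency and keep the top
    fraction; discarded patches could instead be masked, with a
    residual self-attention over all patches restoring global context."""
    B, C, H, W = image.shape
    # (B, 1, H, W) saliency -> per-patch mean via average pooling
    patch_scores = torch.nn.functional.avg_pool2d(saliency, patch).flatten(1)  # (B, N)
    n_keep = int(patch_scores.shape[1] * keep_ratio)
    keep_idx = patch_scores.topk(n_keep, dim=1).indices                        # (B, n_keep)
    # unfold image into patch tokens: (B, N, C*patch*patch)
    tokens = torch.nn.functional.unfold(image, patch, stride=patch).transpose(1, 2)
    batch_idx = torch.arange(B).unsqueeze(1)
    return tokens[batch_idx, keep_idx]  # (B, n_keep, C*patch*patch)

img = torch.randn(2, 3, 224, 224)
sal = torch.rand(2, 1, 224, 224)
print(select_salient_patches(img, sal).shape)  # torch.Size([2, 98, 768])
```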
4. Peng P, Fan W, Shen Y, Liu W, Yang X, Zhang Q, Wei X, Zhou D. Eye Gaze Guided Cross-Modal Alignment Network for Radiology Report Generation. IEEE J Biomed Health Inform 2024; 28:7406-7419. [PMID: 38995704] [DOI: 10.1109/jbhi.2024.3422168]
Abstract
The potential benefits of automatic radiology report generation, such as reducing misdiagnosis rates and enhancing clinical diagnosis efficiency, are significant. However, existing data-driven methods lack essential medical prior knowledge, which hampers their performance. Moreover, establishing global correspondences between radiology images and related reports, while achieving local alignments between prior-knowledge-correlated image regions and text, remains challenging. To address these shortcomings, we introduce a novel Eye Gaze Guided Cross-modal Alignment Network (EGGCA-Net) for generating accurate medical reports. Our approach incorporates prior knowledge from radiologists' Eye Gaze Regions (EGR) to refine the fidelity and comprehensibility of report generation. Specifically, we design a Dual Fine-Grained Branch (DFGB) and a Multi-Task Branch (MTB) to collaboratively ensure the alignment of visual and textual semantics across multiple levels. To establish fine-grained alignment between EGR-related images and sentences, we introduce the Sentence Fine-grained Prototype Module (SFPM) within DFGB to capture cross-modal information at different levels. Additionally, to learn the alignment of EGR-related image topics, we introduce the Multi-task Feature Fusion Module (MFFM) within MTB to refine the encoder output. Finally, a label-matching mechanism is designed to generate reports consistent with the anticipated disease states. Experimental results indicate that the proposed method surpasses previous advanced techniques, yielding enhanced performance on two widely used benchmark datasets: Open-i and MIMIC-CXR.
5. De Luca GR, Diciotti S, Mascalchi M. The Pivotal Role of Baseline LDCT for Lung Cancer Screening in the Era of Artificial Intelligence. Arch Bronconeumol 2024:S0300-2896(24)00439-3. [PMID: 39643515] [DOI: 10.1016/j.arbres.2024.11.001]
Abstract
In this narrative review, we address the ongoing challenges of lung cancer (LC) screening using chest low-dose computed tomography (LDCT) and explore the contributions of artificial intelligence (AI) in overcoming them. We focus on evaluating the initial (baseline) LDCT examination, which provides a wealth of information relevant to the screening participant's health. This includes the detection of large-size prevalent LC and small-size malignant nodules that are typically diagnosed as LCs upon growth in subsequent annual LDCT scans. Additionally, the baseline LDCT examination provides valuable information about smoking-related comorbidities, including cardiovascular disease, chronic obstructive pulmonary disease, and interstitial lung disease (ILD), by identifying relevant markers. Notably, these comorbidities, despite the slow progression of their markers, collectively exceed LC as ultimate causes of death at follow-up in LC screening participants. Computer-assisted diagnosis tools currently improve the reproducibility of radiologic readings and reduce the false negative rate of LDCT. Deep learning (DL) tools that analyze the radiomic features of lung nodules are being developed to distinguish between benign and malignant nodules. Furthermore, AI tools can predict the risk of LC in the years following a baseline LDCT. AI tools that analyze baseline LDCT examinations can also compute the risk of cardiovascular disease or death, paving the way for personalized screening interventions. Additionally, DL tools are available for assessing osteoporosis and ILD, which helps refine the individual's current and future health profile. The primary obstacles to AI integration into the LDCT screening pathway are generalizability of performance and explainability.
Affiliation(s)
- Giulia Raffaella De Luca: Department of Electrical, Electronic, and Information Engineering "Guglielmo Marconi" - DEI, University of Bologna, 47522 Cesena, Italy
- Stefano Diciotti: Department of Electrical, Electronic, and Information Engineering "Guglielmo Marconi" - DEI, University of Bologna, 47522 Cesena, Italy; Alma Mater Research Institute for Human-Centered Artificial Intelligence, University of Bologna, 40121 Bologna, Italy
- Mario Mascalchi: Department of Experimental and Clinical Biomedical Sciences "Mario Serio", University of Florence, 50139 Florence, Italy
6. Mohamed Selim A, Barz M, Bhatti OS, Alam HMT, Sonntag D. A review of machine learning in scanpath analysis for passive gaze-based interaction. Front Artif Intell 2024; 7:1391745. [PMID: 38903158] [PMCID: PMC11188426] [DOI: 10.3389/frai.2024.1391745]
Abstract
The scanpath is an important concept in eye tracking. It refers to a person's eye movements over a period of time, commonly represented as a series of alternating fixations and saccades. Machine learning has been increasingly used for the automatic interpretation of scanpaths over the past few years, particularly in research on passive gaze-based interaction, i.e., interfaces that implicitly observe and interpret human eye movements, with the goal of improving the interaction. This literature review investigates research on machine learning applications in scanpath analysis for passive gaze-based interaction between 2012 and 2022, starting from 2,425 publications and focussing on 77 publications. We provide insights on research domains and common learning tasks in passive gaze-based interaction and present common machine learning practices from data collection and preparation to model selection and evaluation. We discuss commonly followed practices and identify gaps and challenges, especially concerning emerging machine learning topics, to guide future research in the field.
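As a concrete example of the representation described above, a scanpath can be stored as a sequence of fixations (position plus duration), with saccades recovered as displacements between consecutive fixations. The sketch below computes a few aggregate features of the kind commonly fed to ML models; the feature names and units are illustrative assumptions.

```python
import numpy as np

def scanpath_features(fixations):
    """fixations: (N, 3) array of [x, y, duration_ms].
    Returns simple aggregate features often used as ML input:
    fixation count, mean duration, and saccade amplitudes (pixel
    distance between consecutive fixations)."""
    fixations = np.asarray(fixations, dtype=float)
    xy, dur = fixations[:, :2], fixations[:, 2]
    saccades = np.linalg.norm(np.diff(xy, axis=0), axis=1)
    return {
        "n_fixations": len(fixations),
        "mean_fix_duration": dur.mean(),
        "mean_saccade_amp": saccades.mean() if len(saccades) else 0.0,
        "max_saccade_amp": saccades.max() if len(saccades) else 0.0,
    }

demo = [[120, 80, 210], [300, 95, 180], [310, 400, 350]]
print(scanpath_features(demo))
```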
Affiliation(s)
- Abdulrahman Mohamed Selim: German Research Center for Artificial Intelligence (DFKI), Interactive Machine Learning Department, Saarbrücken, Germany
- Michael Barz: German Research Center for Artificial Intelligence (DFKI), Interactive Machine Learning Department, Saarbrücken, Germany; Applied Artificial Intelligence, University of Oldenburg, Oldenburg, Germany
- Omair Shahzad Bhatti: German Research Center for Artificial Intelligence (DFKI), Interactive Machine Learning Department, Saarbrücken, Germany
- Hasan Md Tusfiqur Alam: German Research Center for Artificial Intelligence (DFKI), Interactive Machine Learning Department, Saarbrücken, Germany
- Daniel Sonntag: German Research Center for Artificial Intelligence (DFKI), Interactive Machine Learning Department, Saarbrücken, Germany; Applied Artificial Intelligence, University of Oldenburg, Oldenburg, Germany
7. Ibragimov B, Mello-Thoms C. The Use of Machine Learning in Eye Tracking Studies in Medical Imaging: A Review. IEEE J Biomed Health Inform 2024; 28:3597-3612. [PMID: 38421842] [PMCID: PMC11262011] [DOI: 10.1109/jbhi.2024.3371893]
Abstract
Machine learning (ML) has revolutionized medical image-based diagnostics. In this review, we cover a rapidly emerging field that can potentially be significantly impacted by ML: eye tracking in medical imaging. The review investigates the clinical, algorithmic, and hardware properties of existing studies. In particular, it evaluates 1) the type of eye-tracking equipment used and how the equipment aligns with study aims; 2) the software required to record and process eye-tracking data, which often requires user-interface development, controller commands, and voice recording; 3) the ML methodology utilized, depending on the anatomy of interest, gaze-data representation, and target clinical application. The review concludes with a summary of recommendations for future studies and confirms that the inclusion of gaze data broadens ML applicability in radiology from computer-aided diagnosis (CAD) to gaze-based image annotation, physicians' error detection, fatigue recognition, and other areas of potentially high research and clinical impact.
8. Neves J, Hsieh C, Nobre IB, Sousa SC, Ouyang C, Maciel A, Duchowski A, Jorge J, Moreira C. Shedding light on AI in radiology: A systematic review and taxonomy of eye gaze-driven interpretability in deep learning. Eur J Radiol 2024; 172:111341. [PMID: 38340426] [DOI: 10.1016/j.ejrad.2024.111341]
Abstract
X-ray imaging plays a crucial role in diagnostic medicine. Yet, a significant portion of the global population lacks access to this essential technology due to a shortage of trained radiologists. Eye-tracking data and deep learning models can enhance X-ray analysis by mapping expert focus areas, guiding automated anomaly detection, optimizing workflow efficiency, and bolstering training methods for novice radiologists. However, the literature shows contradictory results regarding the usefulness of eye-tracking data in deep-learning architectures for abnormality detection. We argue that these discrepancies between studies are due to (a) the way eye-tracking data is (or is not) processed, (b) the types of deep learning architectures chosen, and (c) the type of application these architectures will have. We conducted a systematic literature review using PRISMA to address these contradicting results. We analyzed 60 studies that incorporated eye-tracking data in a deep-learning approach for different application goals in radiology. We performed a comparative analysis to understand whether eye-gaze data contains feature maps that can be useful under a deep learning approach and whether they can promote more interpretable predictions. To the best of our knowledge, this is the first survey in the area that thoroughly investigates eye-gaze data processing techniques and their impact on different deep learning architectures for applications such as error detection, classification, object detection, expertise-level analysis, fatigue estimation, and human attention prediction in medical imaging data. Our analysis resulted in two main contributions: (1) a taxonomy that first divides the literature by task, enabling us to analyze the value eye movement can bring to each case and to build guidelines on architectures and gaze-processing techniques adequate for each application, and (2) an overall analysis of how eye-gaze data can promote explainability in radiology.
Affiliation(s)
- José Neves: Instituto Superior Técnico / INESC-ID, University of Lisbon, Portugal
- Chihcheng Hsieh: School of Information Systems, Queensland University of Technology, Australia
- Chun Ouyang: School of Information Systems, Queensland University of Technology, Australia
- Anderson Maciel: Instituto Superior Técnico / INESC-ID, University of Lisbon, Portugal
- Joaquim Jorge: Instituto Superior Técnico / INESC-ID, University of Lisbon, Portugal
- Catarina Moreira: Human Technology Institute, University of Technology Sydney, Australia
9. Futane A, Jadhav P, Mustafa AH, Srinivasan A, Narayanamurthy V. Aptamer-functionalized MOFs and AI-driven strategies for early cancer diagnosis and therapeutics. Biotechnol Lett 2024; 46:1-17. [PMID: 38155321] [DOI: 10.1007/s10529-023-03454-z]
Abstract
Metal-Organic Frameworks (MOFs) have exceptional inherent properties that make them highly suitable for diverse applications, such as catalysis, storage, optics, chemosensing, and biomedical science and technology. Over the past decades, researchers have utilized various techniques, including solvothermal, hydrothermal, mechanochemical, electrochemical, and ultrasonic methods, to synthesize MOFs with tailored properties. Post-synthetic modification of linkers, nodal components, and crystallite domain size and morphology can functionalize MOFs to improve their aptamer-based applications. Advancements in AI and machine learning have led to the development of nonporous MOFs and nanoscale MOFs for medical purposes. MOFs have exhibited promise in cancer therapy, with the successful accumulation of a photosensitizer in cancer cells representing a significant breakthrough. This perspective focuses on MOFs' use as advanced materials and systems for cancer therapy, exploring the challenging aspects and promising features of MOF-based cancer diagnosis and treatment. The paper concludes by emphasizing the potential of MOFs as a transformative technology for cancer treatment and diagnosis.
Affiliation(s)
- Abhishek Futane: Department of Engineering Technology, Faculty of Electronics and Computer Technology & Engineering, Universiti Teknikal Malaysia Melaka, Hang Tuah Jaya, 76100, Durian Tunggal, Melaka, Malaysia
- Pramod Jadhav: Faculty of Civil Engineering Technology, Universiti Malaysia Pahang (UMP), Lebuhraya Tun Razak, 26300, Gambang, Kuantan, Pahang, Malaysia
- Abu Hasnat Mustafa: Faculty of Industrial Science and Technology, Universiti Malaysia Pahang, 26300, Gambang, Pahang, Malaysia
- Arthi Srinivasan: Faculty of Chemical and Process Engineering Technology, Universiti Malaysia Pahang (UMP), Lebuhraya Tun Razak, 26300, Gambang, Kuantan, Pahang, Malaysia
- Vigneswaran Narayanamurthy: Department of Engineering Technology, Faculty of Electronics and Computer Technology & Engineering, Universiti Teknikal Malaysia Melaka, Hang Tuah Jaya, 76100, Durian Tunggal, Melaka, Malaysia; Department of Biotechnology, Saveetha School of Engineering, Saveetha Institute of Medical and Technical Sciences, Chennai, India
10. Ma C, Zhao L, Chen Y, Wang S, Guo L, Zhang T, Shen D, Jiang X, Liu T. Eye-Gaze-Guided Vision Transformer for Rectifying Shortcut Learning. IEEE Transactions on Medical Imaging 2023; 42:3384-3394. [PMID: 37335796] [DOI: 10.1109/tmi.2023.3287572]
Abstract
Learning harmful shortcuts such as spurious correlations and biases prevents deep neural networks from learning meaningful and useful representations, jeopardizing the generalizability and interpretability of the learned representation. The situation is even more serious in medical image analysis, where clinical data are limited and scarce while the reliability, generalizability, and transparency of the learned model are highly required. To rectify harmful shortcuts in medical imaging applications, we propose a novel eye-gaze-guided vision transformer (EG-ViT) model that infuses the visual attention of radiologists to proactively guide the vision transformer (ViT) to focus on regions with potential pathology rather than spurious correlations. To do so, the EG-ViT model takes as input the masked image patches that fall within the radiologists' region of interest, while an additional residual connection to the last encoder layer maintains interactions among all patches. Experiments on two medical imaging datasets demonstrate that EG-ViT can effectively rectify harmful shortcut learning and improve model interpretability. Infusing experts' domain knowledge also improves the large-scale ViT model's performance over all compared baseline methods when limited samples are available. In general, EG-ViT takes advantage of powerful deep neural networks while rectifying harmful shortcut learning with human experts' prior knowledge. This work also opens new avenues for advancing current artificial intelligence paradigms by infusing human intelligence.
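A minimal sketch of the gaze-masking step: fixation points are mapped to the ViT patch grid, and only patches near a fixation are kept as encoder input. Grid size, patch size, and radius are illustrative assumptions, not values from the paper.

```python
import numpy as np

def gaze_patch_mask(gaze_points, img_size=224, patch=16, radius=24):
    """Mark ViT patches that fall within `radius` pixels of any
    radiologist fixation; EG-ViT-style pipelines feed only these
    patches to the encoder, plus a residual connection over all
    patches to keep global interactions."""
    n = img_size // patch
    ys, xs = np.mgrid[0:n, 0:n]
    centers = np.stack([(xs + 0.5) * patch, (ys + 0.5) * patch], axis=-1)  # (n, n, 2)
    mask = np.zeros((n, n), dtype=bool)
    for gx, gy in gaze_points:
        d = np.linalg.norm(centers - np.array([gx, gy]), axis=-1)
        mask |= d <= radius
    return mask  # (n, n) boolean patch mask

print(gaze_patch_mask([(60, 60), (180, 120)]).sum(), "of", (224 // 16) ** 2, "patches kept")
```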
11. Kamalanathan A, Muthu B, Kuniyil Kaleena P. Artificial Intelligence (AI) Game Changer in Cancer Biology. Marvels of Artificial and Computational Intelligence in Life Sciences 2023:62-87. [DOI: 10.2174/9789815136807123010009]
Abstract
Healthcare is one of many industries where the most modern technologies, such as artificial intelligence and machine learning, have shown a wide range of applications. Cancer, one of the most prevalent non-communicable diseases in modern times, accounts for a sizable portion of worldwide mortality. Investigations are continuously being conducted to find ways to reduce cancer mortality and morbidity. Artificial Intelligence (AI) is currently being used in cancer research, with promising results. Two main features play a vital role in improving cancer prognosis: early detection and proper diagnosis using imaging and molecular techniques. AI's use as a tool in these sectors has demonstrated its capacity to precisely detect and diagnose, which is one of AI's many applications in cancer research. The purpose of this chapter is to review the literature and find AI applications in a range of cancers that are commonly seen.
Affiliation(s)
- Ashok Kamalanathan: Department of Microbiology and Biotechnology, Faculty of Arts and Science, Bharath Institute of Higher Education and Research (BIHER), Chennai- 600 073, Tamil Nadu, India
- Babu Muthu: Department of Microbiology and Biotechnology, Faculty of Arts and Science, Bharath Institute of Higher Education and Research (BIHER), Chennai- 600 073, Tamil Nadu, India
12. Chen X, Wang X, Lv J, Qin G, Zhou Z. An integrated network based on 2D/3D feature correlations for benign-malignant tumor classification and uncertainty estimation in digital breast tomosynthesis. Phys Med Biol 2023; 68:175046. [PMID: 37582379] [DOI: 10.1088/1361-6560/acf092]
Abstract
Objective. Classification of benign and malignant tumors is important for the early diagnosis of breast cancer. Over the last decade, digital breast tomosynthesis (DBT) has gradually become an effective imaging modality for breast cancer diagnosis due to its ability to generate three-dimensional (3D) visualizations. However, computer-aided diagnosis (CAD) systems based on 3D images require high computational costs and time, and 3D images contain considerable redundant information. Most CAD systems are instead designed on 2D images, which may lose the spatial-depth information of tumors. In this study, we propose a 2D/3D integrated network for the diagnosis of benign and malignant breast tumors. Approach. We introduce a correlation strategy to describe feature correlations between slices in 3D volumes, corresponding to the tissue relationship and spatial-depth features of tumors. The correlation strategy can extract spatial features at little computational cost. In the prediction stage, both 3D spatial-correlation features and 2D features are used for classification. Main results. Experimental results demonstrate that our proposed framework achieves higher accuracy and reliability than pure 2D or 3D models, with a high area under the curve of 0.88 and an accuracy of 0.82. The parameter size of the feature extractor in our framework is only 35% of that of the 3D models. In reliability evaluations, our proposed model is more reliable than pure 2D or 3D models because of its effective and nonredundant features. Significance. This study successfully combines 3D spatial-correlation features and 2D features for the diagnosis of benign and malignant breast tumors in DBT. In addition to high accuracy and low computational cost, our model is more reliable and can output an uncertainty value, and therefore has the potential to be applied in the clinic.
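The correlation strategy itself is not spelled out in the abstract; as a cheap stand-in, inter-slice relationships can be approximated by the similarity of per-slice feature vectors produced by a shared 2D backbone, as in the sketch below (an assumption-laden illustration, not the authors' method).

```python
import torch
import torch.nn.functional as F

def slice_correlations(slice_feats):
    """slice_feats: (S, D) one feature vector per DBT slice (e.g. from
    a shared 2D backbone). Returns cosine similarity between adjacent
    slices, a cheap proxy for spatial-depth relationships that avoids
    a full 3D CNN."""
    a, b = slice_feats[:-1], slice_feats[1:]
    return F.cosine_similarity(a, b, dim=1)  # (S - 1,)

feats = torch.randn(20, 512)  # 20 slices, 512-d features each
corr = slice_correlations(feats)
print(corr.shape, corr.mean().item())
```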
Affiliation(s)
- Xi Chen: School of Information and Communications Engineering, Xi'an Jiaotong University, Xi'an, 710049, Shaanxi, People's Republic of China
- Xiaoyu Wang: School of Information and Communications Engineering, Xi'an Jiaotong University, Xi'an, 710049, Shaanxi, People's Republic of China
- Jiahuan Lv: School of Information and Communications Engineering, Xi'an Jiaotong University, Xi'an, 710049, Shaanxi, People's Republic of China
- Genggeng Qin: Department of Radiology, Nanfang Hospital, Southern Medical University, Guangzhou, 510515, Guangdong, People's Republic of China
- Zhiguo Zhou: Department of Biostatistics and Data Science, University of Kansas Medical Center, Kansas City, KS 66160, United States of America
13. Jiang H, Hou Y, Miao H, Ye H, Gao M, Li X, Jin R, Liu J. Eye tracking based deep learning analysis for the early detection of diabetic retinopathy: A pilot study. Biomed Signal Process Control 2023. [DOI: 10.1016/j.bspc.2023.104830]
14. Zhao Y, Wang X, Che T, Bao G, Li S. Multi-task deep learning for medical image computing and analysis: A review. Comput Biol Med 2023; 153:106496. [PMID: 36634599] [DOI: 10.1016/j.compbiomed.2022.106496]
Abstract
The renaissance of deep learning has provided promising solutions to various tasks. While conventional deep learning models are constructed for a single specific task, multi-task deep learning (MTDL), capable of simultaneously accomplishing at least two tasks, has attracted research attention. MTDL is a joint learning paradigm that harnesses the inherent correlation of multiple related tasks to achieve reciprocal benefits in improving performance, enhancing generalizability, and reducing the overall computational cost. This review focuses on the advanced applications of MTDL for medical image computing and analysis. We first summarize four popular MTDL network architectures (i.e., cascaded, parallel, interacted, and hybrid). Then, we review the representative MTDL-based networks for eight application areas, including the brain, eye, chest, cardiac, abdomen, musculoskeletal, pathology, and other human body regions. While MTDL-based medical image processing is flourishing and demonstrates outstanding performance in many tasks, performance gaps remain in others, and accordingly we discuss the open challenges and prospective trends. For instance, in the 2018 Ischemic Stroke Lesion Segmentation challenge, the reported top dice score of 0.51 and top recall of 0.55 achieved by the cascaded MTDL model indicate that further research efforts are in high demand to escalate the performance of current models.
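A "parallel" MTDL architecture, the simplest of the four layouts listed above, can be sketched as one shared encoder feeding one head per task. The sketch below is a generic illustration (toy layer sizes, one classification and one segmentation head assumed), not any specific reviewed network.

```python
import torch
import torch.nn as nn

class ParallelMTDL(nn.Module):
    """Minimal 'parallel' multi-task layout: one shared encoder,
    one head per task (here classification + per-pixel segmentation)."""
    def __init__(self, n_classes=2):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
        )
        self.cls_head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                      nn.Linear(32, n_classes))
        self.seg_head = nn.Conv2d(32, 1, 1)

    def forward(self, x):
        h = self.encoder(x)  # shared features benefit both tasks
        return self.cls_head(h), self.seg_head(h)

x = torch.randn(2, 1, 64, 64)
logits, seg = ParallelMTDL()(x)
print(logits.shape, seg.shape)  # (2, 2) and (2, 1, 64, 64)
```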
Affiliation(s)
- Yan Zhao: Beijing Advanced Innovation Center for Biomedical Engineering, School of Biological Science and Medical Engineering, Beihang University, Beijing, 100083, China
- Xiuying Wang: School of Computer Science, The University of Sydney, Sydney, NSW, 2008, Australia
- Tongtong Che: Beijing Advanced Innovation Center for Biomedical Engineering, School of Biological Science and Medical Engineering, Beihang University, Beijing, 100083, China
- Guoqing Bao: School of Computer Science, The University of Sydney, Sydney, NSW, 2008, Australia
- Shuyu Li: State Key Laboratory of Cognitive Neuroscience and Learning, Beijing Normal University, Beijing, 100875, China
15. McGough WC, Sanchez LE, McCague C, Stewart GD, Schönlieb CB, Sala E, Crispin-Ortuzar M. Artificial intelligence for early detection of renal cancer in computed tomography: A review. Cambridge Prisms: Precision Medicine 2022; 1:e4. [PMID: 38550952] [PMCID: PMC10953744] [DOI: 10.1017/pcm.2022.9]
Abstract
Renal cancer is responsible for over 100,000 yearly deaths and is principally discovered in computed tomography (CT) scans of the abdomen. CT screening would likely increase the rate of early renal cancer detection, and improve general survival rates, but it is expected to have a prohibitively high financial cost. Given recent advances in artificial intelligence (AI), it may be possible to reduce the cost of CT analysis and enable CT screening by automating the radiological tasks that constitute the early renal cancer detection pipeline. This review seeks to facilitate further interdisciplinary research in early renal cancer detection by summarising our current knowledge across AI, radiology, and oncology and suggesting useful directions for future novel work. Initially, this review discusses existing approaches in automated renal cancer diagnosis, and methods across broader AI research, to summarise the existing state of AI cancer analysis. Then, this review matches these methods to the unique constraints of early renal cancer detection and proposes promising directions for future research that may enable AI-based early renal cancer detection via CT screening. The primary targets of this review are clinicians with an interest in AI and data scientists with an interest in the early detection of cancer.
Affiliation(s)
- William C. McGough: Cancer Research UK Cambridge Institute, University of Cambridge, Cambridge, UK; Department of Oncology, University of Cambridge, Cambridge, UK
- Lorena E. Sanchez: Department of Radiology, University of Cambridge, Cambridge, UK; Cancer Research UK Cambridge Centre, Cambridge, UK
- Cathal McCague: Cancer Research UK Cambridge Institute, University of Cambridge, Cambridge, UK; Department of Radiology, University of Cambridge, Cambridge, UK; Cancer Research UK Cambridge Centre, Cambridge, UK
- Grant D. Stewart: Cancer Research UK Cambridge Centre, Cambridge, UK; Department of Surgery, University of Cambridge, Cambridge, UK
- Carola-Bibiane Schönlieb: Department of Applied Mathematics and Theoretical Physics, University of Cambridge, Cambridge, UK
- Evis Sala: Department of Radiology, University of Cambridge, Cambridge, UK; Cancer Research UK Cambridge Centre, Cambridge, UK
- Mireia Crispin-Ortuzar: Cancer Research UK Cambridge Institute, University of Cambridge, Cambridge, UK; Department of Oncology, University of Cambridge, Cambridge, UK
16. Li Y, Wu X, Yang P, Jiang G, Luo Y. Machine Learning for Lung Cancer Diagnosis, Treatment, and Prognosis. Genomics, Proteomics & Bioinformatics 2022; 20:850-866. [PMID: 36462630] [PMCID: PMC10025752] [DOI: 10.1016/j.gpb.2022.11.003]
Abstract
The recent development of imaging and sequencing technologies enables systematic advances in the clinical study of lung cancer. Meanwhile, the human mind is limited in effectively handling and fully utilizing the accumulation of such enormous amounts of data. Machine learning-based approaches play a critical role in integrating and analyzing these large and complex datasets, which have extensively characterized lung cancer through the use of different perspectives from these accrued data. In this review, we provide an overview of machine learning-based approaches that strengthen the varying aspects of lung cancer diagnosis and therapy, including early detection, auxiliary diagnosis, prognosis prediction, and immunotherapy practice. Moreover, we highlight the challenges and opportunities for future applications of machine learning in lung cancer.
Affiliation(s)
- Yawei Li: Department of Preventive Medicine, Feinberg School of Medicine, Northwestern University, Chicago, IL 60611, USA
- Xin Wu: Department of Medicine, University of Illinois at Chicago, Chicago, IL 60612, USA
- Ping Yang: Department of Quantitative Health Sciences, Mayo Clinic, Rochester, MN 55905 / Scottsdale, AZ 85259, USA
- Guoqian Jiang: Department of Artificial Intelligence and Informatics, Mayo Clinic, Rochester, MN 55905, USA
- Yuan Luo: Department of Preventive Medicine, Feinberg School of Medicine, Northwestern University, Chicago, IL 60611, USA
17. Zheng S, Zhu Z, Liu Z, Guo Z, Liu Y, Yang Y, Zhao Y. Multi-Modal Graph Learning for Disease Prediction. IEEE Transactions on Medical Imaging 2022; 41:2207-2216. [PMID: 35286257] [DOI: 10.1109/tmi.2022.3159264]
Abstract
Benefiting from the powerful expressive capability of graphs, graph-based approaches have been widely applied to handle multi-modal medical data and have achieved impressive performance in various biomedical applications. For disease prediction tasks, most existing graph-based methods define the graph manually based on a specified modality (e.g., demographic information) and then integrate other modalities to obtain the patient representation via Graph Representation Learning (GRL). However, constructing an appropriate graph in advance is not simple, and the complex correlation between modalities is ignored. These factors make it difficult to provide sufficient information about a patient's condition for a reliable diagnosis. To this end, we propose an end-to-end Multi-modal Graph Learning framework (MMGL) for disease prediction with multi-modality. To effectively exploit the rich information across modalities associated with a disease, modality-aware representation learning aggregates the features of each modality by leveraging the correlation and complementarity between modalities. Furthermore, instead of defining the graph manually, the latent graph structure is captured through adaptive graph learning, which can be jointly optimized with the prediction model, thus revealing the intrinsic connections among samples. The model is also applicable to inductive learning on unseen data. An extensive group of experiments on two disease prediction tasks demonstrates that the proposed MMGL achieves more favorable performance. The code of MMGL is available at https://github.com/SsGood/MMGL.
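Adaptive graph learning, as opposed to a hand-defined graph, can be sketched as learning an adjacency matrix from patient features. The snippet below is a generic illustration: the projection size, top-k sparsification, and softmax weighting are assumptions, not MMGL's actual formulation (the MMGL repository linked above is authoritative). The hard top-k selection is not itself differentiable, but the retained edge weights are, so the graph can still be tuned jointly with a downstream predictor.

```python
import torch
import torch.nn as nn

class AdaptiveGraph(nn.Module):
    """Learn a latent patient graph from features instead of defining
    it by hand: project features, take scaled dot-product similarity,
    and keep each node's top-k neighbours."""
    def __init__(self, in_dim=64, proj_dim=32, k=5):
        super().__init__()
        self.proj = nn.Linear(in_dim, proj_dim)
        self.k = k

    def forward(self, x):                       # x: (N, in_dim)
        h = self.proj(x)
        sim = h @ h.t() / h.shape[1] ** 0.5     # (N, N) similarity
        topk = sim.topk(self.k, dim=1).indices
        adj = torch.zeros_like(sim).scatter_(1, topk, 1.0)
        return adj * torch.softmax(sim, dim=1)  # sparse, weighted adjacency

x = torch.randn(10, 64)  # 10 patients, 64-d fused multi-modal features
print(AdaptiveGraph()(x).shape)  # torch.Size([10, 10])
```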
18. Bigolin Lanfredi R, Zhang M, Auffermann WF, Chan J, Duong PAT, Srikumar V, Drew T, Schroeder JD, Tasdizen T. REFLACX, a dataset of reports and eye-tracking data for localization of abnormalities in chest x-rays. Sci Data 2022; 9:350. [PMID: 35717401] [PMCID: PMC9206650] [DOI: 10.1038/s41597-022-01441-z]
Abstract
Deep learning has shown recent success in classifying anomalies in chest x-rays, but datasets are still small compared to natural image datasets. Supervision of abnormality localization has been shown to improve trained models, partially compensating for dataset sizes. However, explicitly labeling these anomalies requires an expert and is very time-consuming. We propose a potentially scalable method for collecting implicit localization data using an eye tracker to capture gaze locations and a microphone to capture a dictation of a report, imitating the setup of a reading room. The resulting REFLACX (Reports and Eye-Tracking Data for Localization of Abnormalities in Chest X-rays) dataset was labeled across five radiologists and contains 3,032 synchronized sets of eye-tracking data and timestamped report transcriptions for 2,616 chest x-rays from the MIMIC-CXR dataset. We also provide auxiliary annotations, including bounding boxes around lungs and heart and validation labels consisting of ellipses localizing abnormalities and image-level labels. Furthermore, a small subset of the data contains readings from all radiologists, allowing for the calculation of inter-rater scores.
Affiliation(s)
- Ricardo Bigolin Lanfredi: Scientific Computing and Imaging Institute, University of Utah, 72 S Central Campus Drive, Room 3750, Salt Lake City, UT, 84112, USA
- Mingyuan Zhang: Department of Population Health Sciences, University of Utah, 295 Chipeta Way, Williams Building, Room 1N410, Salt Lake City, UT, 84108, USA
- William F Auffermann: Department of Radiology and Imaging Sciences, University of Utah, 30 North 1900 East #1A071, Salt Lake City, UT, 84132, USA
- Jessica Chan: Department of Radiology and Imaging Sciences, University of Utah, 30 North 1900 East #1A071, Salt Lake City, UT, 84132, USA
- Phuong-Anh T Duong: Department of Radiology and Imaging Sciences, University of Utah, 30 North 1900 East #1A071, Salt Lake City, UT, 84132, USA
- Vivek Srikumar: School of Computing, University of Utah, Room 3190, 50 Central Campus Dr., Salt Lake City, UT, 84112, USA
- Trafton Drew: Department of Psychology, University of Utah, 380 S 1530 E Beh S 502, Salt Lake City, UT, 84112, USA
- Joyce D Schroeder: Department of Radiology and Imaging Sciences, University of Utah, 30 North 1900 East #1A071, Salt Lake City, UT, 84132, USA
- Tolga Tasdizen: Scientific Computing and Imaging Institute, University of Utah, 72 S Central Campus Drive, Room 3750, Salt Lake City, UT, 84112, USA
19. A Deep Learning Approach for Predicting Subject-Specific Human Skull Shape from Head Toward a Decision Support System for Home-Based Facial Rehabilitation. Ing Rech Biomed 2022. [DOI: 10.1016/j.irbm.2022.05.005]
20. Franceschiello B, Noto TD, Bourgeois A, Murray MM, Minier A, Pouget P, Richiardi J, Bartolomeo P, Anselmi F. Machine learning algorithms on eye tracking trajectories to classify patients with spatial neglect. Comput Methods Programs Biomed 2022; 221:106929. [PMID: 35675721] [DOI: 10.1016/j.cmpb.2022.106929]
Abstract
BACKGROUND AND OBJECTIVE: Eye-movement trajectories are rich behavioral data, providing a window on how the brain processes information. We address the challenge of characterizing signs of visuo-spatial neglect from saccadic eye trajectories recorded in brain-damaged patients with spatial neglect as well as in healthy controls during a visual search task.
METHODS: We establish a standardized pre-processing pipeline adaptable to other task-based eye-tracker measurements. We use traditional machine learning algorithms together with deep convolutional networks (both 1D and 2D) to automatically analyze eye trajectories.
RESULTS: Our top-performing machine learning models classified neglect patients vs. healthy individuals with an Area Under the ROC curve (AUC) ranging from 0.83 to 0.86. Moreover, the 1D convolutional neural network scores correlated with the degree of severity of neglect behavior as estimated with standardized paper-and-pencil tests and with the integrity of white matter tracts measured from Diffusion Tensor Imaging (DTI). Interestingly, the latter showed a clear correlation with the third branch of the superior longitudinal fasciculus (SLF), which is especially damaged in neglect.
CONCLUSIONS: The study introduces new methods for both the pre-processing and the classification of eye-movement trajectories in patients with neglect syndrome. The proposed methods can likely be applied to other types of neurological diseases, opening the possibility of new computer-aided, precise, sensitive and non-invasive diagnostic tools.
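A 1D convolutional classifier over raw gaze trajectories, of the kind the results describe, can be sketched as follows; the channel layout, kernel sizes, and sequence length are illustrative assumptions, not the authors' architecture.

```python
import torch
import torch.nn as nn

class Gaze1DCNN(nn.Module):
    """1D CNN over a fixed-length eye trajectory, channels = (x, y),
    single logit output (neglect vs. control)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(2, 16, kernel_size=7, padding=3), nn.ReLU(), nn.MaxPool1d(2),
            nn.Conv1d(16, 32, kernel_size=7, padding=3), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten(), nn.Linear(32, 1),
        )

    def forward(self, traj):  # traj: (B, 2, T) gaze coordinates over time
        return self.net(traj)

traj = torch.randn(4, 2, 1000)  # 4 recordings, 1000 time steps each
print(Gaze1DCNN()(traj).shape)  # torch.Size([4, 1])
```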
Affiliation(s)
- Benedetta Franceschiello: The LINE (Laboratory for Investigative Neurophysiology), Department of Diagnostic and Interventional Radiology, Lausanne University Hospital and University of Lausanne, Lausanne, Switzerland; CIBM Center for Biomedical Imaging, Lausanne, Switzerland; Department of Radiology, Lausanne University Hospital and University of Lausanne, Lausanne, Switzerland; The Sense Innovation and Research Center, Lausanne and Sion, Switzerland; School of Engineering, Institute of Systems Engineering, HES-SO Valais-Wallis, Route de L'industrie 23, Sion, Switzerland
- Tommaso Di Noto: Department of Radiology, Lausanne University Hospital and University of Lausanne, Lausanne, Switzerland
- Alexia Bourgeois: Laboratory of Cognitive Neurorehabilitation, Faculty of Medicine, University of Geneva, Geneva, Switzerland
- Micah M Murray: The LINE (Laboratory for Investigative Neurophysiology), Department of Diagnostic and Interventional Radiology, Lausanne University Hospital and University of Lausanne, Lausanne, Switzerland; Department of Ophthalmology, Fondation Asile des Aveugles and University of Lausanne, Lausanne, Switzerland; CIBM Center for Biomedical Imaging, Lausanne, Switzerland; Department of Hearing and Speech Sciences, Vanderbilt University, Nashville, TN, USA; The Sense Innovation and Research Center, Lausanne and Sion, Switzerland
- Astrid Minier: The LINE (Laboratory for Investigative Neurophysiology), Department of Diagnostic and Interventional Radiology, Lausanne University Hospital and University of Lausanne, Lausanne, Switzerland; Department of Ophthalmology, Fondation Asile des Aveugles and University of Lausanne, Lausanne, Switzerland
- Pierre Pouget: Laboratory of Cognitive Neurorehabilitation, Faculty of Medicine, University of Geneva, Geneva, Switzerland
- Jonas Richiardi: Department of Radiology, Lausanne University Hospital and University of Lausanne, Lausanne, Switzerland; The Sense Innovation and Research Center, Lausanne and Sion, Switzerland
- Paolo Bartolomeo: Sorbonne Université, Inserm, CNRS, Institut du Cerveau - Paris Brain Institute, ICM, Hôpital de la Pitié-Salpêtrière, Paris, France
- Fabio Anselmi: Center for Neuroscience and Artificial Intelligence, Department of Neuroscience, Baylor College of Medicine, Houston, TX, USA; Center for Brains, Minds, and Machines, McGovern Institute for Brain Research at MIT, Cambridge, MA, USA
21. Isaac A, Nehemiah HK, Dunston SD, Elgin Christo V, Kannan A. Feature selection using competitive coevolution of bio-inspired algorithms for the diagnosis of pulmonary emphysema. Biomed Signal Process Control 2022. [DOI: 10.1016/j.bspc.2021.103340]
22. Luca AR, Ursuleanu TF, Gheorghe L, Grigorovici R, Iancu S, Hlusneac M, Grigorovici A. Impact of quality, type and volume of data used by deep learning models in the analysis of medical images. Informatics in Medicine Unlocked 2022. [DOI: 10.1016/j.imu.2022.100911]
23. Zhang Y, Liu M, Hu S, Shen Y, Lan J, Jiang B, de Bock GH, Vliegenthart R, Chen X, Xie X. Development and multicenter validation of chest X-ray radiography interpretations based on natural language processing. Communications Medicine 2021; 1:43. [PMID: 35602222] [PMCID: PMC9053275] [DOI: 10.1038/s43856-021-00043-x]
Abstract
Background: Artificial intelligence can assist in interpreting chest X-ray radiography (CXR) data, but large datasets require efficient image annotation. The purpose of this study is to extract CXR labels from diagnostic reports based on natural language processing, train convolutional neural networks (CNNs), and evaluate the classification performance of the CNNs using CXR data from multiple centers.
Methods: We collected the CXR images and corresponding radiology reports of 74,082 subjects as the training dataset. The linguistic entities and relationships in the unstructured radiology reports were extracted by the bidirectional encoder representations from transformers (BERT) model, and a knowledge graph was constructed to represent the association between image labels of abnormal signs and the CXR report text. A 25-label classification system was then built to train and test the CNN models with weakly supervised labeling.
Results: In three external test cohorts of 5,996 symptomatic patients, 2,130 screening examinees, and 1,804 community clinic patients, the mean AUC for identifying 25 abnormal signs by CNN reached 0.866 ± 0.110, 0.891 ± 0.147, and 0.796 ± 0.157, respectively. In symptomatic patients, the CNN showed no significant difference from local radiologists in identifying 21 signs (p > 0.05) but was poorer for 4 signs (p < 0.05). In screening examinees, the CNN showed no significant difference for 17 signs (p > 0.05) but was poorer at classifying nodules (p = 0.013). In community clinic patients, the CNN showed no significant difference for 12 signs (p > 0.05) and performed better for 6 signs (p < 0.001).
Conclusion: We construct and validate an effective CXR interpretation system based on natural language processing.
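The authors extract entities and relations with BERT and build a knowledge graph; a simpler stand-in that conveys the pipeline shape is a direct multi-label report classifier over the 25 signs, sketched below with the Hugging Face transformers API. The weights here are untrained, and the 25-label head is an assumption standing in for the paper's labeling system.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# 25 abnormal-sign labels as in the paper; a direct multi-label text
# classifier is a simplified stand-in for the authors' BERT entity and
# relation extraction plus knowledge graph.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=25,
    problem_type="multi_label_classification",
)

report = "Patchy consolidation in the right lower lobe. No pleural effusion."
inputs = tokenizer(report, return_tensors="pt", truncation=True)
with torch.no_grad():
    probs = torch.sigmoid(model(**inputs).logits)  # (1, 25) per-label probabilities
print((probs > 0.5).nonzero())  # indices of predicted signs (untrained here)
```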
24. Panetta K, Rajendran R, Ramesh A, Rao S, Agaian S. Tufts Dental Database: A Multimodal Panoramic X-ray Dataset for Benchmarking Diagnostic Systems. IEEE J Biomed Health Inform 2021; 26:1650-1659. [PMID: 34606466] [DOI: 10.1109/jbhi.2021.3117575]
Abstract
The application of Artificial Intelligence in dental healthcare has a very promising role due to the abundance of imagery and non-imagery-based clinical data. Expert analysis of dental radiographs can provide crucial information for clinical diagnosis and treatment. In recent years, Convolutional Neural Networks have achieved the highest accuracy in various benchmarks, including analyzing dental X-ray images to improve clinical care quality. The Tufts Dental Database, a new X-ray panoramic radiography image dataset, is presented in this paper. This dataset consists of 1000 panoramic dental radiography images with expert labeling of abnormalities and teeth. The classification of radiography images was performed on five different levels: anatomical location, peripheral characteristics, radiodensity, effects on the surrounding structure, and the abnormality category. This first-of-its-kind multimodal dataset also includes the radiologist's expertise captured in the form of eye-tracking and think-aloud protocols. The contributions of this work are 1) a publicly available dataset that can help researchers incorporate human expertise into AI and achieve more robust and accurate abnormality detection; 2) a benchmark performance analysis of various state-of-the-art systems for dental radiograph image enhancement and image segmentation using deep learning; 3) an in-depth review of various panoramic dental image datasets, along with segmentation and detection systems. The release of this dataset aims to propel the development of AI-powered automated abnormality detection and classification in dental panoramic radiographs, enhance tooth segmentation algorithms, and distill the radiologist's expertise into AI.
25. Ursuleanu TF, Luca AR, Gheorghe L, Grigorovici R, Iancu S, Hlusneac M, Preda C, Grigorovici A. Deep Learning Application for Analyzing of Constituents and Their Correlations in the Interpretations of Medical Images. Diagnostics (Basel) 2021; 11:1373. [PMID: 34441307] [PMCID: PMC8393354] [DOI: 10.3390/diagnostics11081373]
Abstract
The growing volume of medical data that must be interpreted and filtered for diagnostic and therapeutic purposes competes with the time and attention doctors can give to patients, and has encouraged the development of deep learning models as constructive and effective support. Deep learning (DL) has experienced exponential development in recent years, with a major impact on the interpretation of medical images. This has influenced the development, diversification, and quality of scientific data, the development of knowledge-construction methods, and the improvement of DL models used in medical applications. Most research papers focus on describing, highlighting, or classifying a single constituent element of DL models used in the interpretation of medical images and do not provide a unified picture of the importance and impact of each constituent on model performance. The novelty of our paper consists primarily in a unitary approach to the constituent elements of DL models, namely data, the tools used by DL architectures, and specifically constructed DL architecture combinations, highlighting their "key" features for completing tasks in current applications in the interpretation of medical images. The use of "key" characteristics specific to each constituent of DL models and the correct determination of their correlations may be the subject of future research, with the aim of increasing the performance of DL models in the interpretation of medical images.
Collapse
Affiliation(s)
- Tudor Florin Ursuleanu
  - Faculty of General Medicine, “Grigore T. Popa” University of Medicine and Pharmacy, 700115 Iasi, Romania; (T.F.U.); (R.G.); (S.I.); (M.H.); (C.P.); (A.G.)
  - Department of Surgery VI, “Sf. Spiridon” Hospital, 700111 Iasi, Romania
  - Department of Surgery I, Regional Institute of Oncology, 700483 Iasi, Romania
- Andreea Roxana Luca
  - Faculty of General Medicine, “Grigore T. Popa” University of Medicine and Pharmacy, 700115 Iasi, Romania
  - Department Obstetrics and Gynecology, Integrated Ambulatory of Hospital “Sf. Spiridon”, 700106 Iasi, Romania
- Liliana Gheorghe
  - Faculty of General Medicine, “Grigore T. Popa” University of Medicine and Pharmacy, 700115 Iasi, Romania
  - Department of Radiology, “Sf. Spiridon” Hospital, 700111 Iasi, Romania
- Roxana Grigorovici
  - Faculty of General Medicine, “Grigore T. Popa” University of Medicine and Pharmacy, 700115 Iasi, Romania
- Stefan Iancu
  - Faculty of General Medicine, “Grigore T. Popa” University of Medicine and Pharmacy, 700115 Iasi, Romania
- Maria Hlusneac
  - Faculty of General Medicine, “Grigore T. Popa” University of Medicine and Pharmacy, 700115 Iasi, Romania
- Cristina Preda
  - Faculty of General Medicine, “Grigore T. Popa” University of Medicine and Pharmacy, 700115 Iasi, Romania
  - Department of Endocrinology, “Sf. Spiridon” Hospital, 700111 Iasi, Romania
- Alexandru Grigorovici
  - Faculty of General Medicine, “Grigore T. Popa” University of Medicine and Pharmacy, 700115 Iasi, Romania
  - Department of Surgery VI, “Sf. Spiridon” Hospital, 700111 Iasi, Romania
|
26
|
Vogado L, Veras R, Aires K, Araújo F, Silva R, Ponti M, Tavares JMRS. Diagnosis of Leukaemia in Blood Slides Based on a Fine-Tuned and Highly Generalisable Deep Learning Model. SENSORS (BASEL, SWITZERLAND) 2021; 21:2989. [PMID: 33923209 PMCID: PMC8123151 DOI: 10.3390/s21092989] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 03/21/2021] [Revised: 04/19/2021] [Accepted: 04/21/2021] [Indexed: 02/06/2023]
Abstract
Leukaemia is a dysfunction that affects the production of white blood cells in the bone marrow: immature cells are produced abnormally and replace normal blood cells, so the person has difficulty transporting oxygen and fighting infections. This article proposes a convolutional neural network (CNN) named LeukNet, inspired by the convolutional blocks of VGG-16 but with smaller dense layers. To define the LeukNet parameters, we evaluated different CNN models and fine-tuning methods on 18 image datasets with different resolution, contrast, colour and texture characteristics. We applied data augmentation operations to expand the training dataset, and 5-fold cross-validation led to an accuracy of 98.61%. To evaluate the generalisation ability of the CNN, we applied a cross-dataset validation technique. The accuracies obtained in cross-dataset experiments on three datasets were 97.04%, 82.46% and 70.24%, surpassing those obtained by current state-of-the-art methods. We conclude that using the most common and deepest CNNs may not be the best choice for applications where the images to be classified differ from those used in pre-training. Additionally, the adopted cross-dataset validation approach proved to be an excellent choice for evaluating the generalisation capability of a model, as it considers the model's performance on unseen data, which is paramount for CAD systems.
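The paper's key architectural choice, reusing VGG-16's convolutional blocks while shrinking the dense head, is easy to sketch. Below is a minimal PyTorch illustration; the head sizes and two-class output are our assumptions, not the exact LeukNet configuration:

```python
import torch
import torch.nn as nn
from torchvision import models

# Sketch of a LeukNet-style model: VGG-16 convolutional blocks with a
# much smaller dense head. Layer widths here are illustrative guesses.
class LeukNetSketch(nn.Module):
    def __init__(self, num_classes=2):
        super().__init__()
        vgg = models.vgg16(weights=None)  # load pretrained weights when fine-tuning
        self.features = vgg.features      # reuse the VGG-16 convolutional blocks
        self.pool = nn.AdaptiveAvgPool2d((7, 7))
        self.classifier = nn.Sequential(  # far smaller than VGG-16's 4096-wide head
            nn.Flatten(),
            nn.Linear(512 * 7 * 7, 256),
            nn.ReLU(inplace=True),
            nn.Dropout(0.5),
            nn.Linear(256, num_classes),
        )

    def forward(self, x):
        return self.classifier(self.pool(self.features(x)))

model = LeukNetSketch()
print(model(torch.randn(1, 3, 224, 224)).shape)  # torch.Size([1, 2])
```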
Affiliation(s)
- Luis Vogado
  - Departamento de Computação, Universidade Federal do Piauí, Teresina 64049-550, Brazil
- Rodrigo Veras
  - Departamento de Computação, Universidade Federal do Piauí, Teresina 64049-550, Brazil
- Kelson Aires
  - Departamento de Computação, Universidade Federal do Piauí, Teresina 64049-550, Brazil
- Flávio Araújo
  - Curso de Bacharelado em Sistemas de Informação, Universidade Federal do Piauí, Picos 64607-670, Brazil
- Romuere Silva
  - Curso de Bacharelado em Sistemas de Informação, Universidade Federal do Piauí, Picos 64607-670, Brazil
- Moacir Ponti
  - Instituto de Ciências Matemáticas e de Computação, Universidade de São Paulo, São Carlos 13566-590, Brazil
- João Manuel R. S. Tavares
  - Departamento de Engenharia Mecânica, Faculdade de Engenharia, Instituto de Ciência e Inovação em Engenharia Mecânica e Engenharia Industrial, Universidade do Porto, 4200-465 Porto, Portugal
|
27
|
El-Bouri R, Taylor T, Youssef A, Zhu T, Clifton DA. Machine learning in patient flow: a review. PROGRESS IN BIOMEDICAL ENGINEERING (BRISTOL, ENGLAND) 2021; 3:022002. [PMID: 34738074 PMCID: PMC8559147 DOI: 10.1088/2516-1091/abddc5] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 10/30/2020] [Revised: 01/18/2021] [Accepted: 01/20/2021] [Indexed: 12/13/2022]
Abstract
This work reviews the ways in which machine learning has been used to plan, improve or aid the movement of patients through healthcare services. We decompose the patient flow problem into four subcategories: prediction of demand on a healthcare institution; prediction of the demand and resources required to transfer patients from the emergency department to the hospital; prediction of the resources required for the treatment and movement of inpatients; and prediction of length-of-stay and discharge timing. We argue that there are benefits to considering both the healthcare institution as a whole and the patient-by-patient case, and that ideally a combination of the two would be best for improving patient flow through hospitals. We also argue that a shared dataset is essential, so that researchers can benchmark their algorithms on it and future researchers can build on what has already been done. We conclude that machine learning for the improvement of patient flow is still a young field, with very few papers tailoring machine learning methods to the problem being considered. Future work should consider the need to transfer algorithms trained on one dataset to multiple hospitals, and should allow for dynamic algorithms that enable real-time decision-making to help clinical staff on the shop floor.
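To make the fourth subcategory concrete, a minimal length-of-stay regressor might look like the sketch below; the features and data are synthetic placeholders invented for illustration, not drawn from any reviewed study:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Toy length-of-stay prediction on synthetic admissions data.
rng = np.random.default_rng(0)
n = 1000
X = np.column_stack([
    rng.integers(18, 95, n),   # age at admission
    rng.integers(0, 2, n),     # emergency admission flag
    rng.poisson(2, n),         # number of comorbidities
])
# Synthetic ground truth: stay length in days with some noise.
y = 2 + 0.05 * X[:, 0] + 3 * X[:, 1] + 1.5 * X[:, 2] + rng.normal(0, 1, n)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = GradientBoostingRegressor().fit(X_tr, y_tr)
print("MAE (days):", mean_absolute_error(y_te, model.predict(X_te)))
```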
Affiliation(s)
- Rasheed El-Bouri, Thomas Taylor, Alexey Youssef, Tingting Zhu, David A Clifton
  - Institute of Biomedical Engineering, University of Oxford, Oxford, United Kingdom
|
28
|
Karargyris A, Kashyap S, Lourentzou I, Wu JT, Sharma A, Tong M, Abedin S, Beymer D, Mukherjee V, Krupinski EA, Moradi M. Creation and validation of a chest X-ray dataset with eye-tracking and report dictation for AI development. Sci Data 2021; 8:92. [PMID: 33767191 PMCID: PMC7994908 DOI: 10.1038/s41597-021-00863-5] [Citation(s) in RCA: 28] [Impact Index Per Article: 7.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/30/2020] [Accepted: 02/09/2021] [Indexed: 12/15/2022] Open
Abstract
We developed a rich dataset of chest X-ray (CXR) images to assist investigators in artificial intelligence research. The data were collected using an eye-tracking system while a radiologist reviewed and reported on 1,083 CXR images. The dataset contains the following aligned data: the CXR image, the transcribed radiology report text, the radiologist's dictation audio, and the eye gaze coordinates. We hope this dataset can contribute to various areas of research, particularly explainable and multimodal deep learning/machine learning methods. Furthermore, investigators in disease classification and localization, automated radiology report generation, and human-machine interaction can benefit from these data. We report deep learning experiments that utilize the attention maps produced from the eye gaze data to show the potential utility of this dataset.
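One natural way to use such gaze data is to rasterise the fixation coordinates into a per-image attention heatmap that can supervise or regularise a network. The sketch below shows this idea; the smoothing value and the function name are our own illustrative choices, not details from the paper:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def gaze_to_attention_map(gaze_xy, height, width, sigma=25.0):
    """Rasterise (x, y) gaze fixations into a normalised attention heatmap."""
    heat = np.zeros((height, width), dtype=np.float32)
    for x, y in gaze_xy:
        xi, yi = int(round(x)), int(round(y))
        if 0 <= yi < height and 0 <= xi < width:
            heat[yi, xi] += 1.0          # accumulate fixation counts per pixel
    heat = gaussian_filter(heat, sigma)  # spread each fixation into a soft blob
    return heat / heat.max() if heat.max() > 0 else heat

# Example: three fixations on a 512 x 512 chest X-ray.
attn = gaze_to_attention_map([(100, 120), (105, 118), (300, 260)], 512, 512)
print(attn.shape, float(attn.max()))  # (512, 512) 1.0
```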
Affiliation(s)
- Ismini Lourentzou
  - IBM Research, Almaden Research Center, San Jose, CA, 95120, USA
  - Department of Computer Science, Virginia Tech, Blacksburg, VA, 24061, USA
- Joy T Wu, Arjun Sharma, Matthew Tong, Shafiq Abedin, David Beymer, Mehdi Moradi
  - IBM Research, Almaden Research Center, San Jose, CA, 95120, USA
- Elizabeth A Krupinski
  - Department of Radiology and Imaging Sciences, Emory University, Atlanta, GA, 30322, USA
|
29
|
Nogales A, García-Tejedor ÁJ, Monge D, Vara JS, Antón C. A survey of deep learning models in medical therapeutic areas. Artif Intell Med 2021; 112:102020. [PMID: 33581832 DOI: 10.1016/j.artmed.2021.102020] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/18/2020] [Revised: 12/21/2020] [Accepted: 01/10/2021] [Indexed: 12/18/2022]
Abstract
Artificial intelligence is a broad field that comprises a wide range of techniques, of which deep learning presently has the most impact. The medical field, where the data are both complex and massive and where doctors' decisions carry great weight, is one of the fields in which deep learning techniques can have the greatest impact. A systematic review following the Cochrane recommendations was conducted by a multidisciplinary team of physicians, research methodologists and computer scientists. This survey aims to identify the main therapeutic areas and the deep learning models used for diagnosis and treatment tasks. The most relevant databases included were MedLine, Embase, Cochrane Central, Astrophysics Data System, Europe PubMed Central, Web of Science and Science Direct. Inclusion and exclusion criteria were defined and applied in the first and second peer-review screenings, and a set of quality criteria was developed to select the papers obtained after the second screening. Finally, 126 of the initial 3493 papers were selected, and 64 were described. The results show that the number of publications on deep learning in medicine is increasing every year, that convolutional neural networks are the most widely used models, and that the most developed area is oncology, where they are used mainly for image analysis.
Affiliation(s)
- Alberto Nogales, Álvaro J García-Tejedor, Juan Serrano Vara
  - CEIEC, Research Institute, Universidad Francisco de Vitoria, Ctra. M-515 Pozuelo-Majadahonda km 1800, 28223, Pozuelo de Alarcón, Spain
- Diana Monge, Cristina Antón
  - Faculty of Medicine, Research Institute, Universidad Francisco de Vitoria, Ctra. M-515 Pozuelo-Majadahonda km 1800, 28223, Pozuelo de Alarcón, Spain
|
30
|
Joy Mathew C, David AM, Joy Mathew CM. Artificial Intelligence and its future potential in lung cancer screening. EXCLI JOURNAL 2021; 19:1552-1562. [PMID: 33408594 PMCID: PMC7783473 DOI: 10.17179/excli2020-3095] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Key Words] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 10/27/2020] [Accepted: 12/05/2020] [Indexed: 12/18/2022]
Abstract
Artificial intelligence (AI) simulates intelligent behavior and critical thinking comparable to a human being's and can be used to analyze and interpret complex medical data. The application of AI in imaging diagnostics reduces the burden on radiologists and increases the sensitivity of lung cancer screening, so that the morbidity and mortality associated with lung cancer can be decreased. In this article, we evaluate the role of artificial intelligence in lung cancer screening, as well as its future potential and efficiency in the classification of nodules. Relevant studies from 2010-2020 were selected from the PubMed database, after excluding animal studies, and analyzed for the contribution of AI. Techniques such as deep learning and machine learning allow automatic characterization and classification of nodules with high precision and promise an advanced lung cancer screening method in the future. Even though several high-performing combination models have been proposed, an effectively validated model for routine use has yet to be developed. Combining the performance of artificial intelligence with a radiologist's expertise offers a successful outcome with higher accuracy. Thus, we can conclude that higher sensitivity, specificity, and accuracy in lung cancer screening and nodule classification are possible through the integration of artificial intelligence and radiology. Validation of models and further research are needed to determine the feasibility of this integration.
Affiliation(s)
- Christopher Joy Mathew
  - Acute Medicine Department, Conquest Hospital, East Sussex Healthcare NHS Trust, United Kingdom
- Ashwini Maria David
  - Jubilee Mission Medical College and Research Institute, Kerala University of Health Sciences, Kerala, India
|
31
|
Stember JN, Celik H, Gutman D, Swinburne N, Young R, Eskreis-Winkler S, Holodny A, Jambawalikar S, Wood BJ, Chang PD, Krupinski E, Bagci U. Integrating Eye Tracking and Speech Recognition Accurately Annotates MR Brain Images for Deep Learning: Proof of Principle. Radiol Artif Intell 2021; 3:e200047. [PMID: 33842890 PMCID: PMC7845782 DOI: 10.1148/ryai.2020200047] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/03/2020] [Revised: 07/23/2020] [Accepted: 08/03/2020] [Indexed: 12/19/2022]
Abstract
PURPOSE To generate and assess an algorithm combining eye tracking and speech recognition to extract brain lesion location labels automatically for deep learning (DL). MATERIALS AND METHODS In this retrospective study, 700 two-dimensional brain tumor MRI scans from the Brain Tumor Segmentation database were clinically interpreted. For each image, a single radiologist dictated a standard phrase describing the lesion into a microphone, simulating clinical interpretation, while eye-tracking data were recorded simultaneously. Using speech recognition, the gaze points corresponding to each lesion were obtained. These lesion locations were used to train a keypoint-detection convolutional neural network to find new lesions, and the trained network was evaluated on an independent test set of 85 images. The statistical measure used to evaluate the method was percent accuracy. RESULTS Eye tracking with speech recognition was 92% accurate in labeling lesion locations in the training dataset, demonstrating that fully simulated interpretation can yield reliable tumor location labels; these labels were used to train the DL network. The detection network trained on these labels predicted lesion location on the separate test set with 85% accuracy. CONCLUSION The DL network was able to locate brain tumors on the basis of training data that were labeled automatically from simulated clinical image interpretation. © RSNA, 2020.
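The labeling step hinges on a simple alignment: when speech recognition detects the dictated phrase, the gaze samples recorded around that moment give the lesion's image coordinates. The sketch below illustrates this; the data structures and the 0.5-second window are illustrative assumptions, not the paper's exact parameters:

```python
from dataclasses import dataclass

@dataclass
class GazeSample:
    t: float  # timestamp in seconds
    x: float  # gaze x coordinate in image pixels
    y: float  # gaze y coordinate in image pixels

def lesion_label_from_dictation(gaze_stream, phrase_time, window=0.5):
    """Average the gaze points recorded within `window` seconds of the
    recognised phrase onset to get one (x, y) lesion location label."""
    near = [g for g in gaze_stream if abs(g.t - phrase_time) <= window]
    if not near:
        return None
    return (sum(g.x for g in near) / len(near),
            sum(g.y for g in near) / len(near))

gaze = [GazeSample(1.0, 98, 120), GazeSample(1.2, 102, 118), GazeSample(4.0, 300, 40)]
print(lesion_label_from_dictation(gaze, phrase_time=1.1))  # (100.0, 119.0)
```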
Affiliation(s)
- Joseph N. Stember, Haydar Celik, David Gutman, Nathaniel Swinburne, Robert Young, Sarah Eskreis-Winkler, Andrei Holodny, Sachin Jambawalikar, Bradford J. Wood, Peter D. Chang, Elizabeth Krupinski, Ulas Bagci
  - From the Department of Radiology, Memorial Sloan-Kettering Cancer Center, 1275 York Ave, New York, NY 10065 (J.N.S., D.G., N.S., R.Y., S.E.W., A.H.); The National Institutes of Health Clinical Center, Bethesda, Md (H.C., B.J.W.); Department of Radiology, Columbia University Medical Center, New York, NY (S.J.); Department of Radiology, University of California–Irvine, Irvine, Calif (P.D.C.); Department of Radiology & Imaging Sciences, Emory University, Atlanta, Ga (E.K.); and Center for Research in Computer Vision, University of Central Florida, Orlando, Fla (U.B.)
|
32
|
Hammad M, Iliyasu AM, Subasi A, Ho ESL, El-Latif AAA. A Multitier Deep Learning Model for Arrhythmia Detection. IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT 2021; 70:1-9. [PMID: 0 DOI: 10.1109/tim.2020.3033072] [Citation(s) in RCA: 57] [Impact Index Per Article: 14.3] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/23/2023]
|
33
|
|
34
|
Liu F, Wang K, Liu D, Yang X, Tian J. Deep pyramid local attention neural network for cardiac structure segmentation in two-dimensional echocardiography. Med Image Anal 2020; 67:101873. [PMID: 33129143 DOI: 10.1016/j.media.2020.101873] [Citation(s) in RCA: 41] [Impact Index Per Article: 8.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/28/2020] [Revised: 10/12/2020] [Accepted: 10/13/2020] [Indexed: 02/07/2023]
Abstract
Automatic semantic segmentation in 2D echocardiography is vital in clinical practice for assessing various cardiac functions and improving the diagnosis of cardiac diseases. However, two distinct problems have persisted in automatic segmentation in 2D echocardiography: the lack of an effective feature-enhancement approach for capturing contextual features, and the lack of label coherence in the per-pixel category predictions. Therefore, in this study, we propose a deep learning model, called the deep pyramid local attention neural network (PLANet), to improve the segmentation performance of automatic methods in 2D echocardiography. Specifically, we propose a pyramid local attention module that enhances features by capturing supporting information within compact and sparse neighboring contexts. We also propose a label coherence learning mechanism that promotes prediction consistency between pixels and their neighbors by guiding the learning with explicit supervision signals. The proposed PLANet was extensively evaluated on the dataset of cardiac acquisitions for multi-structure ultrasound segmentation (CAMUS) and on sub-EchoNet-Dynamic, two large-scale public 2D echocardiography datasets. The experimental results show that PLANet performs better than traditional and deep learning-based segmentation methods on geometrical and clinical metrics. Moreover, PLANet can segment heart structures in 2D echocardiography in real time, indicating a potential to assist cardiologists accurately and efficiently.
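The distinguishing idea is that attention is computed only over each position's local neighbourhood rather than over the whole feature map. A single-scale, single-head simplification of such a local attention block might look as follows; PLANet's pyramid of multiple neighbourhood sizes is omitted for brevity, and all layer choices are illustrative:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LocalAttention(nn.Module):
    """Each position attends to its own k x k neighbourhood only."""
    def __init__(self, channels, k=3):
        super().__init__()
        self.k = k
        self.q = nn.Conv2d(channels, channels, 1)
        self.kv = nn.Conv2d(channels, 2 * channels, 1)

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.q(x).reshape(b, c, h * w)                     # (B, C, N)
        k_, v = self.kv(x).chunk(2, dim=1)
        # Gather every position's k*k neighbourhood of keys and values.
        unfold = nn.Unfold(self.k, padding=self.k // 2)
        k_ = unfold(k_).reshape(b, c, self.k * self.k, h * w)  # (B, C, K, N)
        v = unfold(v).reshape(b, c, self.k * self.k, h * w)
        attn = torch.einsum('bcn,bckn->bkn', q, k_) / c ** 0.5
        attn = F.softmax(attn, dim=1)                          # over the K neighbours
        out = torch.einsum('bkn,bckn->bcn', attn, v).reshape(b, c, h, w)
        return x + out                                         # residual enhancement

x = torch.randn(1, 32, 16, 16)
print(LocalAttention(32)(x).shape)  # torch.Size([1, 32, 16, 16])
```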
Affiliation(s)
- Fei Liu
  - CAS Key Laboratory of Molecular Imaging, Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, China; Department of the Artificial Intelligence Technology, University of Chinese Academy of Sciences, Beijing, 100049, China
- Kun Wang
  - CAS Key Laboratory of Molecular Imaging, Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, China; Department of the Artificial Intelligence Technology, University of Chinese Academy of Sciences, Beijing, 100049, China; Zhuhai Precision Medical Center, Zhuhai People's Hospital (affiliated with Jinan University), Zhuhai, 519000, China
- Dan Liu
  - Department of Ultrasound, The Second Affiliated Hospital of Nanchang University, Nanchang, 330008, China
- Xin Yang
  - CAS Key Laboratory of Molecular Imaging, Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, China; Department of the Artificial Intelligence Technology, University of Chinese Academy of Sciences, Beijing, 100049, China
- Jie Tian
  - CAS Key Laboratory of Molecular Imaging, Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, China; Zhuhai Precision Medical Center, Zhuhai People's Hospital (affiliated with Jinan University), Zhuhai, 519000, China; Beijing Advanced Innovation Center for Big Data-Based Precision Medicine, Beihang University, Beijing, 100191, China; Key Laboratory of Big Data-Based Precision Medicine (Beihang University), Ministry of Industry and Information Technology, Beijing, 100191, China
|
35
|
Stember JN, Celik H, Krupinski E, Chang PD, Mutasa S, Wood BJ, Lignelli A, Moonis G, Schwartz LH, Jambawalikar S, Bagci U. Eye Tracking for Deep Learning Segmentation Using Convolutional Neural Networks. J Digit Imaging 2020; 32:597-604. [PMID: 31044392 PMCID: PMC6646645 DOI: 10.1007/s10278-019-00220-4] [Citation(s) in RCA: 28] [Impact Index Per Article: 5.6] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/12/2022] Open
Abstract
Deep learning with convolutional neural networks (CNNs) has experienced tremendous growth in multiple healthcare applications and has been shown to have high accuracy in semantic segmentation of medical (e.g., radiology and pathology) images. However, a key barrier in the required training of CNNs is obtaining large-scale and precisely annotated imaging data. We sought to address the lack of annotated data with eye tracking technology. As a proof of principle, our hypothesis was that segmentation masks generated with the help of eye tracking (ET) would be very similar to those rendered by hand annotation (HA). Additionally, our goal was to show that a CNN trained on ET masks would be equivalent to one trained on HA masks, the latter being the current standard approach. Step 1: Screen captures of 19 publicly available radiologic images of assorted structures within various modalities were analyzed, and ET and HA masks for all regions of interest (ROIs) were generated from these image datasets. Step 2: Utilizing a similar approach, ET and HA masks for 356 publicly available T1-weighted postcontrast meningioma images were generated. Three hundred six of these image + mask pairs were used to train a CNN with U-net-based architecture; the remaining 50 images were used as the independent test set. In Step 1, the ET and HA masks for the nonneurological images had an average Dice similarity coefficient (DSC) of 0.86 between each other. In Step 2, the meningioma ET and HA masks had an average DSC of 0.85 between each other. After separate training using both approaches, the ET approach performed virtually identically to HA on the test set of 50 images: the former had an area under the curve (AUC) of 0.88, while the latter had an AUC of 0.87. Compared with the original HA maps, the ET and HA predictions had trimmed mean DSCs of 0.73 and 0.74, respectively; these trimmed DSCs between ET and HA were found to be statistically equivalent with a p value of 0.015. We have demonstrated that ET can create segmentation masks suitable for deep learning semantic segmentation. Future work will integrate ET to produce masks in a faster, more natural manner that distracts less from the typical clinical radiology workflow.
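The agreement metric used throughout is the Dice similarity coefficient, DSC = 2|A ∩ B| / (|A| + |B|) for binary masks A and B. A worked example on toy masks (not the study's data):

```python
import numpy as np

def dice_coefficient(mask_a, mask_b, eps=1e-8):
    """Dice similarity coefficient: 2|A intersect B| / (|A| + |B|)."""
    a, b = mask_a.astype(bool), mask_b.astype(bool)
    return 2.0 * np.logical_and(a, b).sum() / (a.sum() + b.sum() + eps)

# Toy example: an eye-tracking mask vs. a hand-annotated mask,
# two 4x4 squares offset by one pixel (intersection = 3x3 = 9 pixels).
et = np.zeros((8, 8), dtype=np.uint8); et[2:6, 2:6] = 1
ha = np.zeros((8, 8), dtype=np.uint8); ha[3:7, 3:7] = 1
print(dice_coefficient(et, ha))  # ~0.5625 = 2*9 / (16 + 16)
```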
Affiliation(s)
- J N Stember, S Mutasa, A Lignelli, G Moonis, L H Schwartz, S Jambawalikar
  - Department of Radiology, Columbia University Medical Center - NYPH, New York, NY, 10032, USA
- H Celik, B J Wood
  - The National Institutes of Health, Clinical Center, Bethesda, MD, 20892, USA
- E Krupinski
  - Department of Radiology & Imaging Sciences, Emory University, Atlanta, GA, 30322, USA
- P D Chang
  - Department of Radiology, University of California, Irvine, CA, 92697, USA
- U Bagci
  - Center for Research in Computer Vision, University of Central Florida, 4328 Scorpius St. HEC 221, Orlando, FL, 32816, USA
|
36
|
|
37
|
Farhat H, Sakr GE, Kilany R. Deep learning applications in pulmonary medical imaging: recent updates and insights on COVID-19. MACHINE VISION AND APPLICATIONS 2020; 31:53. [PMID: 32834523 PMCID: PMC7386599 DOI: 10.1007/s00138-020-01101-5] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 02/18/2020] [Revised: 06/21/2020] [Accepted: 07/07/2020] [Indexed: 05/07/2023]
Abstract
Shortly after deep learning algorithms were applied to image analysis, and more importantly to medical imaging, their applications increased significantly to become a trend. Likewise, deep learning (DL) applications on pulmonary medical images emerged, achieving remarkable advances and leading to promising clinical trials. Yet the coronavirus may be the real trigger that opens the route for fast integration of DL into hospitals and medical centers. This paper reviews the development of deep learning applications in medical image analysis targeting pulmonary imaging and gives insights into their contributions to COVID-19. It covers more than 160 contributions and surveys in this field, all issued between February 2017 and May 2020 inclusive, highlighting various deep learning tasks such as classification, segmentation, and detection, as well as different pulmonary pathologies such as airway diseases, lung cancer, COVID-19 and other infections. It summarizes and discusses the current state-of-the-art approaches in this research domain and highlights the challenges, especially in the current situation of the COVID-19 pandemic.
Affiliation(s)
- Hanan Farhat, George E. Sakr, Rima Kilany
  - Saint Joseph University of Beirut, Mar Roukos, Beirut, Lebanon
|
38
|
Wang J, Chen X, Lu H, Zhang L, Pan J, Bao Y, Su J, Qian D. Feature-shared adaptive-boost deep learning for invasiveness classification of pulmonary subsolid nodules in CT images. Med Phys 2020; 47:1738-1749. [PMID: 32020649 DOI: 10.1002/mp.14068] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/09/2019] [Revised: 01/08/2020] [Accepted: 01/22/2020] [Indexed: 12/30/2022] Open
Abstract
PURPOSE In clinical practice, invasiveness is an important reference indicator for differentiating the malignant degree of subsolid pulmonary nodules. These nodules can be classified as atypical adenomatous hyperplasia (AAH), adenocarcinoma in situ (AIS), minimally invasive adenocarcinoma (MIA), or invasive adenocarcinoma (IAC). Automatic determination of a nodule's invasiveness from chest CT scans can guide treatment planning, but it is challenging owing to insufficient training data and the nodules' interclass similarity and intraclass variation. To address these challenges, we propose a two-stage deep learning strategy for this task: prior-feature learning followed by adaptive-boost deep learning. METHODS Adaptive-boost deep learning is proposed to train a strong classifier for invasiveness classification of subsolid nodules in chest CT images from multiple 3D convolutional neural network (CNN)-based weak classifiers. Because an ensemble of multiple deep 3D CNN models has a huge number of parameters and requires large computing resources along with more training and testing time, prior-feature learning is proposed to reduce the computations by sharing the CNN layers between all weak classifiers; with this strategy, all weak classifiers can be integrated into a single network. RESULTS Tenfold cross-validation of binary classification was conducted on a total of 1357 nodules, comprising 765 noninvasive (AAH and AIS) and 592 invasive nodules (MIA and IAC). Ablation experiments indicated that the proposed binary classifier achieved an accuracy of 73.4% ± 1.4 with an AUC of 81.3% ± 2.2, superior to the accuracies of 69.1%, 69.3%, and 67.9% achieved by three experienced chest imaging specialists. An additional 200 nodules, 50 for each category (AAH, AIS, MIA, and IAC), were also collected, and both binary and multiclass classification on these data demonstrated that the proposed method achieves better performance than nonensemble deep learning methods. CONCLUSIONS The proposed adaptive-boost deep learning can significantly improve the performance of invasiveness classification of pulmonary subsolid nodules in CT images, while prior-feature learning significantly reduces the total size of the deep models. The promising results on clinical data show that the trained models can be used as an effective lung cancer screening tool in hospitals. Moreover, the proposed strategy can easily be extended to other similar classification tasks in 3D medical images.
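The prior-feature-learning idea, one shared backbone feeding several boosted weak-classifier heads so the whole ensemble is a single network, can be sketched as below. The architecture sizes and the uniform head weights are illustrative assumptions; in AdaBoost-style training each head's weight would come from its training error:

```python
import torch
import torch.nn as nn

class SharedBoostNet(nn.Module):
    """One shared 3D CNN backbone with K weak-classifier heads whose
    weighted logits are summed, mimicking a boosted ensemble in one network."""
    def __init__(self, num_heads=3, num_classes=2):
        super().__init__()
        self.backbone = nn.Sequential(  # shared feature extractor (prior features)
            nn.Conv3d(1, 16, 3, padding=1), nn.ReLU(),
            nn.MaxPool3d(2),
            nn.Conv3d(16, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1), nn.Flatten(),
        )
        self.heads = nn.ModuleList([nn.Linear(32, num_classes) for _ in range(num_heads)])
        # Boosting coefficients; uniform here, derived from head errors in AdaBoost.
        self.alphas = nn.Parameter(torch.ones(num_heads), requires_grad=False)

    def forward(self, x):
        feats = self.backbone(x)
        logits = torch.stack([head(feats) for head in self.heads])  # (K, B, C)
        return (self.alphas[:, None, None] * logits).sum(0)         # weighted vote

model = SharedBoostNet()
print(model(torch.randn(2, 1, 32, 64, 64)).shape)  # torch.Size([2, 2])
```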
Affiliation(s)
- Jun Wang, Lichi Zhang, Dahong Qian
  - School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, 200240, China
- Xiaorong Chen, Jianfeng Pan, Jiner Su
  - Medical Imaging Department, Jinhua Municipal Central Hospital, Jinhua, 321001, China
- Hongbing Lu
  - College of Computer Science and Technology, Zhejiang University, Hangzhou, 310027, China
- Yong Bao
  - Changzhou Industrial Technology Research Institute of Zhejiang University, Changzhou, 213022, China
|
39
|
Aresta G, Ferreira C, Pedrosa J, Araujo T, Rebelo J, Negrao E, Morgado M, Alves F, Cunha A, Ramos I, Campilho A. Automatic Lung Nodule Detection Combined With Gaze Information Improves Radiologists' Screening Performance. IEEE J Biomed Health Inform 2020; 24:2894-2901. [PMID: 32092022 DOI: 10.1109/jbhi.2020.2976150] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]
Abstract
Early diagnosis of lung cancer via computed tomography can significantly reduce the morbidity and mortality rates associated with the pathology. However, searching for lung nodules is a highly complex task, which affects the success of screening programs. Whilst computer-aided detection systems can be used as second observers, they may bias radiologists and introduce significant time overheads. With this in mind, this study assesses the potential of using gaze information to integrate automatic detection systems into clinical practice. For that purpose, four radiologists were asked to annotate 20 scans from a public dataset while being monitored by an eye-tracker device, and an automatic lung nodule detection system was developed. Our results show that radiologists follow a similar search routine and tend to have shorter fixation periods in regions where detection errors occur. The overall detection sensitivity of the specialists was 0.67±0.07, whereas the system achieved 0.69. Combining the annotations of one radiologist with the automatic system significantly improves detection performance, to levels similar to those of two annotators. Filtering the automatic detection candidates to low-fixation regions still significantly improves the detection sensitivity without increasing the number of false positives.
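The filtering step at the end is simple to express: keep a detector's candidates only where the radiologist's normalised fixation map falls below a threshold, on the reasoning that heavily inspected regions need no second opinion. A sketch, with the threshold and data layout as our own illustrative assumptions:

```python
import numpy as np

def filter_candidates_by_fixation(candidates, fixation_map, threshold=0.2):
    """Keep (x, y, score) candidates that fall in low-fixation regions."""
    return [(x, y, s) for (x, y, s) in candidates
            if fixation_map[int(y), int(x)] < threshold]

fixation = np.zeros((128, 128))
fixation[40:60, 40:60] = 1.0                  # region the radiologist inspected closely
candidates = [(50, 50, 0.9), (100, 20, 0.7)]  # detector output as (x, y, score)
print(filter_candidates_by_fixation(candidates, fixation))  # [(100, 20, 0.7)]
```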
|
40
|
Hybrid fuzzy based spearman rank correlation for cranial nerve palsy detection in MIoT environment. HEALTH AND TECHNOLOGY 2019. [DOI: 10.1007/s12553-019-00294-8] [Citation(s) in RCA: 30] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/25/2022]
|