1. Huhulea EN, Huang L, Eng S, Sumawi B, Huang A, Aifuwa E, Hirani R, Tiwari RK, Etienne M. Artificial Intelligence Advancements in Oncology: A Review of Current Trends and Future Directions. Biomedicines 2025; 13:951. [PMID: 40299653] [DOI: 10.3390/biomedicines13040951]
Abstract
Cancer remains one of the leading causes of mortality worldwide, driving the need for innovative approaches in research and treatment. Artificial intelligence (AI) has emerged as a powerful tool in oncology, with the potential to revolutionize cancer diagnosis, treatment, and management. This paper reviews recent advancements in AI applications within cancer research, focusing on early detection through computer-aided diagnosis, personalized treatment strategies, and drug discovery. We survey AI-enhanced diagnostic applications and explore AI techniques such as deep learning, as well as the integration of AI with nanomedicine and immunotherapy for cancer care. Comparative analyses of AI-based models versus traditional diagnostic methods are presented, highlighting AI's superior potential. Additionally, we discuss the importance of integrating social determinants of health to optimize cancer care. Despite these advancements, challenges such as data quality, algorithmic biases, and clinical validation remain, limiting widespread adoption. The review concludes with a discussion of the future directions of AI in oncology, emphasizing its potential to reshape cancer care by enhancing diagnosis, personalizing treatments and targeted therapies, and ultimately improving patient outcomes.
Affiliation(s)
- Ellen N Huhulea: School of Medicine, New York Medical College, Valhalla, NY 10595, USA
- Lillian Huang: School of Medicine, New York Medical College, Valhalla, NY 10595, USA
- Shirley Eng: School of Medicine, New York Medical College, Valhalla, NY 10595, USA
- Bushra Sumawi: Barshop Institute, The University of Texas Health Science Center, San Antonio, TX 78229, USA
- Audrey Huang: School of Medicine, New York Medical College, Valhalla, NY 10595, USA
- Esewi Aifuwa: School of Medicine, New York Medical College, Valhalla, NY 10595, USA
- Rahim Hirani: School of Medicine, New York Medical College, Valhalla, NY 10595, USA; Graduate School of Biomedical Sciences, New York Medical College, Valhalla, NY 10595, USA
- Raj K Tiwari: School of Medicine, New York Medical College, Valhalla, NY 10595, USA; Graduate School of Biomedical Sciences, New York Medical College, Valhalla, NY 10595, USA
- Mill Etienne: School of Medicine, New York Medical College, Valhalla, NY 10595, USA; Department of Neurology, New York Medical College, Valhalla, NY 10595, USA
2. Lv J, Zeng X, Chen B, Hu M, Yang S, Qiu X, Wang Z. A stochastic structural similarity guided approach for multi-modal medical image fusion. Sci Rep 2025; 15:8792. [PMID: 40082698] [PMCID: PMC11906891] [DOI: 10.1038/s41598-025-93662-6]
Abstract
Multi-modal medical image fusion (MMIF) aims to integrate complementary information from different modalities to obtain a fused image that contains more comprehensive details, providing clinicians with a more thorough reference for diagnosis. However, most existing deep learning-based fusion methods predominantly focus on the local statistical features within images, which limits the ability of the model to capture long-range dependencies and correlations within source images, thus compromising fusion performance. To address this issue, we propose an unsupervised image fusion method guided by stochastic structural similarity (S3IMFusion). This method incorporates a multi-scale fusion network based on CNN and Transformer modules to extract complementary information from the images effectively. During training, a loss function designed to exchange global contextual information was introduced. Specifically, a random sorting index is generated based on the source images, and pixel features are mixed and rearranged between the fused and source images according to this index. The structural similarity loss is then computed by averaging the losses between pixel blocks of the rearranged images. This ensures that the fusion result preserves the globally correlated complementary features from the source images. Experimental results on the Harvard dataset demonstrate that S3IMFusion outperforms existing methods, achieving more accurate fusion of medical images. Additionally, we extend the method to infrared and visible image fusion tasks, with results indicating that S3IMFusion exhibits excellent generalization performance.
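As an illustration of the stochastic structural-similarity idea described above, the minimal sketch below shuffles the fused and source images with one shared random permutation, splits the shuffled vectors into fixed-size blocks, and averages a per-block SSIM loss. It is a simplified reading of the abstract rather than the authors' S3IMFusion code; the function names, block size, and single-modality usage are illustrative assumptions.

```python
import torch

def ssim_1d(x, y, c1=0.01**2, c2=0.03**2):
    # Global SSIM over each block (no sliding window); inputs scaled to [0, 1].
    mx, my = x.mean(dim=-1), y.mean(dim=-1)
    vx, vy = x.var(dim=-1, unbiased=False), y.var(dim=-1, unbiased=False)
    cov = ((x - mx.unsqueeze(-1)) * (y - my.unsqueeze(-1))).mean(dim=-1)
    return ((2 * mx * my + c1) * (2 * cov + c2)) / ((mx**2 + my**2 + c1) * (vx + vy + c2))

def stochastic_ssim_loss(fused, source, block_size=4096):
    """Shuffle both images with one shared random permutation, split the
    shuffled vectors into blocks, and average 1 - SSIM over the blocks."""
    f = fused.flatten(start_dim=1)                        # (B, N)
    s = source.flatten(start_dim=1)
    perm = torch.randperm(f.shape[1], device=f.device)    # shared random sorting index
    f, s = f[:, perm], s[:, perm]
    n_blocks = f.shape[1] // block_size
    f = f[:, :n_blocks * block_size].reshape(f.shape[0], n_blocks, block_size)
    s = s[:, :n_blocks * block_size].reshape(s.shape[0], n_blocks, block_size)
    return (1.0 - ssim_1d(f, s)).mean()

# Toy usage: fused output vs. one source modality, values in [0, 1].
fused = torch.rand(2, 1, 128, 128, requires_grad=True)
pet = torch.rand(2, 1, 128, 128)
loss = stochastic_ssim_loss(fused, pet)
loss.backward()
```

In practice such a term would be applied to each source modality and combined with the usual intensity and gradient fusion losses.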
Affiliation(s)
- Junhui Lv: Department of Neurosurgery, Sir Run Run Shaw Hospital, School of Medicine, Zhejiang University, Hangzhou, 310016, China
- Xiangzhi Zeng: Department of Automation, Zhejiang University of Technology, Hangzhou, 310023, China
- Bo Chen: Department of Automation, Zhejiang University of Technology, Hangzhou, 310023, China
- Mingnan Hu: Department of Automation, Zhejiang University of Technology, Hangzhou, 310023, China
- Shuxu Yang: Department of Neurosurgery, Sir Run Run Shaw Hospital, School of Medicine, Zhejiang University, Hangzhou, 310016, China
- Xiang Qiu: Department of Automation, Zhejiang University of Technology, Hangzhou, 310023, China
- Zheming Wang: Department of Automation, Zhejiang University of Technology, Hangzhou, 310023, China
3. Sait ARW, AlBalawi E, Nagaraj R. Ensemble learning driven Kolmogorov-Arnold Networks-based Lung Cancer classification. PLoS One 2024; 19:e0313386. [PMID: 39739892] [DOI: 10.1371/journal.pone.0313386]
Abstract
Early Lung Cancer (LC) detection is essential for reducing the global mortality rate. The limitations of traditional diagnostic techniques cause challenges in identifying LC using medical imaging data. In this study, we aim to develop a robust LC detection model. Positron Emission Tomography/Computed Tomography (PET/CT) images are utilized to capture metabolic and anatomical data, leading to optimal LC diagnosis. To extract multiple LC features, we enhance the MobileNet V3 and LeViT models. A weighted-sum feature fusion technique is used to generate unique LC features. The extracted features are classified using Kolmogorov-Arnold Networks (KANs) with linear, cubic, and B-spline functions, and the outcomes are ensembled using a soft-voting approach. The generalizability of the model is assessed on the Lung-PET-CT-DX dataset, and five-fold cross-validation is used for evaluation. The proposed LC detection model achieves an impressive accuracy of 99.0% with a minimal loss of 0.07. In addition, only limited computational resources are required to classify PET/CT images. The high performance underscores the potential of the proposed LC detection model in providing valuable and optimal results. The study findings can significantly improve clinical practice by presenting sophisticated and interpretable outcomes. The proposed model can be enhanced by integrating advanced feature fusion techniques.
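A hedged sketch of two generic ingredients named in the abstract, weighted-sum feature fusion and soft voting, is shown below. The KAN spline classifiers and the MobileNet V3/LeViT extractors are not reproduced; all names, weights, and the toy data are illustrative assumptions.

```python
import numpy as np

def weighted_sum_fusion(feat_a, feat_b, w=0.5):
    """Weighted-sum fusion of two feature vectors of equal length (w is a tunable weight)."""
    return w * feat_a + (1.0 - w) * feat_b

def soft_vote(prob_list):
    """Average the class-probability outputs of several classifiers and take the argmax."""
    avg = np.mean(np.stack(prob_list, axis=0), axis=0)   # (n_samples, n_classes)
    return avg.argmax(axis=1), avg

# Toy example: three spline-variant classifiers scoring 4 samples over 3 classes.
rng = np.random.default_rng(0)
probs = [rng.dirichlet(np.ones(3), size=4) for _ in range(3)]
labels, avg_probs = soft_vote(probs)
print(labels)
```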
Affiliation(s)
- Abdul Rahaman Wahab Sait: Department of Archives and Communication, King Faisal University, Hofuf, Kingdom of Saudi Arabia
- Eid AlBalawi: Department of Computer Science, College of Computer Science and Information Technology, King Faisal University, Al Hofuf, Kingdom of Saudi Arabia
- Ramprasad Nagaraj: Department of Biochemistry, S S Hospital, S S Institute of Medical Sciences & Research Centre, Rajiv Gandhi University of Health Sciences, Davangere, Karnataka, India
4. Braveen M, Nachiyappan S, Seetha R, Anusha K, Ahilan A, Prasanth A, Jeyam A. RETRACTED ARTICLE: ALBAE feature extraction based lung pneumonia and cancer classification. Soft Comput 2024; 28:589. [PMID: 37362264] [PMCID: PMC10187954] [DOI: 10.1007/s00500-023-08453-w]
Affiliation(s)
- M. Braveen: Assistant Professor Senior, School of Computer Science and Engineering, Vellore Institute of Technology, Chennai, Tamil Nadu, India
- S. Nachiyappan: Associate Professor, School of Computer Science and Engineering, Vellore Institute of Technology, Chennai, Tamil Nadu, India
- R. Seetha: Associate Professor, School of Information Technology and Engineering, Vellore Institute of Technology, Vellore, Tamil Nadu, India
- K. Anusha: Associate Professor, School of Computer Science and Engineering, Vellore Institute of Technology, Chennai, Tamil Nadu, India
- A. Ahilan: Associate Professor, Department of Electronics and Communication Engineering, PSN College of Engineering and Technology, Tirunelveli, Tamil Nadu, India
- A. Prasanth: Assistant Professor, Department of Electronics and Communication Engineering, Sri Venkateswara College of Engineering, Sriperumbudur, India
- A. Jeyam: Assistant Professor, Computer Science and Engineering, Lord Jegannath College of Engineering and Technology, Kanyakumari, Tamil Nadu 629402, India
5. Zheng H, Zou W, Hu N, Wang J. Joint segmentation of tumors in 3D PET-CT images with a network fusing multi-view and multi-modal information. Phys Med Biol 2024; 69:205009. [PMID: 39317235] [DOI: 10.1088/1361-6560/ad7f1b]
Abstract
Objective. Joint segmentation of tumors in positron emission tomography-computed tomography (PET-CT) images is crucial for precise treatment planning. However, current segmentation methods often use addition or concatenation to fuse PET and CT images, which potentially overlooks the nuanced interplay between these modalities. Additionally, these methods often neglect multi-view information that is helpful for more accurately locating and segmenting the target structure. This study aims to address these disadvantages and develop a deep learning-based algorithm for joint segmentation of tumors in PET-CT images. Approach. To address these limitations, we propose the Multi-view Information Enhancement and Multi-modal Feature Fusion Network (MIEMFF-Net) for joint tumor segmentation in three-dimensional PET-CT images. Our model incorporates a dynamic multi-modal fusion strategy to effectively exploit the metabolic and anatomical information from PET and CT images and a multi-view information enhancement strategy to effectively recover the information lost during upsampling. A Multi-scale Spatial Perception Block is proposed to effectively extract information from different views and reduce redundancy interference in the multi-view feature extraction process. Main results. The proposed MIEMFF-Net achieved a Dice score of 83.93%, a precision of 81.49%, a sensitivity of 87.89% and an IoU of 69.27% on the Soft Tissue Sarcomas dataset, and a Dice score of 76.83%, a precision of 86.21%, a sensitivity of 80.73% and an IoU of 65.15% on the AutoPET dataset. Significance. Experimental results demonstrate that MIEMFF-Net outperforms existing state-of-the-art models, which implies potential applications of the proposed method in clinical practice.
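The Dice, precision, sensitivity, and IoU values reported above follow standard overlap definitions; a small sketch of how such metrics can be computed from binary masks is given below (toy data, not the MIEMFF-Net evaluation code).

```python
import numpy as np

def overlap_metrics(pred, gt, eps=1e-8):
    """Dice, precision, sensitivity and IoU for binary segmentation masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    tp = np.logical_and(pred, gt).sum()
    fp = np.logical_and(pred, ~gt).sum()
    fn = np.logical_and(~pred, gt).sum()
    dice = 2 * tp / (2 * tp + fp + fn + eps)
    precision = tp / (tp + fp + eps)
    sensitivity = tp / (tp + fn + eps)
    iou = tp / (tp + fp + fn + eps)
    return dice, precision, sensitivity, iou

# Toy 3D volumes standing in for predicted and ground-truth tumor masks.
pred = np.zeros((8, 32, 32), dtype=np.uint8); pred[2:6, 10:20, 10:20] = 1
gt = np.zeros((8, 32, 32), dtype=np.uint8); gt[3:6, 12:20, 10:22] = 1
print(overlap_metrics(pred, gt))
```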
Affiliation(s)
- HaoYang Zheng: School of Electronic and Information Engineering, Soochow University, Suzhou 215006, People's Republic of China
- Wei Zou: School of Electronic and Information Engineering, Soochow University, Suzhou 215006, People's Republic of China
- Nan Hu: School of Electronic and Information Engineering, Soochow University, Suzhou 215006, People's Republic of China
- Jiajun Wang: School of Electronic and Information Engineering, Soochow University, Suzhou 215006, People's Republic of China
6. Kang S, Kang Y, Tan S. Exploring and Exploiting Multi-Modality Uncertainty for Tumor Segmentation on PET/CT. IEEE J Biomed Health Inform 2024; 28:5435-5446. [PMID: 38776203] [DOI: 10.1109/jbhi.2024.3397332]
Abstract
Despite the success of deep learning methods in multi-modality segmentation tasks, they typically produce a deterministic output, neglecting the underlying uncertainty. The absence of uncertainty could lead to over-confident predictions with catastrophic consequences, particularly in safety-critical clinical applications. Recently, uncertainty estimation has attracted increasing attention, offering a measure of confidence associated with machine decisions. Nonetheless, existing uncertainty estimation approaches primarily focus on single-modality networks, leaving the uncertainty of multi-modality networks a largely under-explored domain. In this study, we present the first exploration of multi-modality uncertainties in the context of tumor segmentation on PET/CT. Concretely, we assessed four well-established uncertainty estimation approaches across various dimensions, including segmentation performance, uncertainty quality, comparison to single-modality uncertainties, and correlation to the contradictory information between modalities. Through qualitative and quantitative analyses, we gained valuable insights into what benefits multi-modality uncertainties derive, what information multi-modality uncertainties capture, and how multi-modality uncertainties correlate to information from single modalities. Drawing from these insights, we introduced a novel uncertainty-driven loss, which incentivized the network to effectively utilize the complementary information between modalities. The proposed approach outperformed the backbone network by 4.53 and 2.92 percentage points in Dice score on two PET/CT datasets while achieving lower uncertainties. This study not only advanced the comprehension of multi-modality uncertainties but also revealed the potential benefit of incorporating them into the segmentation network.
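The paper's specific uncertainty estimators and uncertainty-driven loss are not described here in enough detail to reproduce; as a generic stand-in, the sketch below computes voxel-wise predictive entropy from a multi-modality segmentation network's softmax output, one common way to expose low-confidence regions such as voxels where PET and CT disagree. Shapes and the two-class setup are assumptions.

```python
import torch

def predictive_entropy(logits):
    """Voxel-wise predictive entropy of a segmentation network's softmax output.
    High entropy marks low-confidence voxels."""
    p = torch.softmax(logits, dim=1)                        # (B, C, D, H, W)
    return -(p * torch.log(p.clamp_min(1e-8))).sum(dim=1)   # (B, D, H, W)

# Toy logits from a two-class (background/tumor) PET/CT network.
logits = torch.randn(1, 2, 16, 64, 64)
unc = predictive_entropy(logits)
print(unc.shape, unc.mean().item())
```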
7. Huynh BN, Groendahl AR, Tomic O, Liland KH, Knudtsen IS, Hoebers F, van Elmpt W, Dale E, Malinen E, Futsaether CM. Deep learning with uncertainty estimation for automatic tumor segmentation in PET/CT of head and neck cancers: impact of model complexity, image processing and augmentation. Biomed Phys Eng Express 2024; 10:055038. [PMID: 39127060] [DOI: 10.1088/2057-1976/ad6dcd]
Abstract
Objective. Target volumes for radiotherapy are usually contoured manually, which can be time-consuming and prone to inter- and intra-observer variability. Automatic contouring by convolutional neural networks (CNN) can be fast and consistent but may produce unrealistic contours or miss relevant structures. We evaluate approaches for increasing the quality and assessing the uncertainty of CNN-generated contours of head and neck cancers with PET/CT as input. Approach. Two patient cohorts with head and neck squamous cell carcinoma and baseline 18F-fluorodeoxyglucose positron emission tomography and computed tomography (FDG-PET/CT) images were collected retrospectively from two centers. The union of manual contours of the gross primary tumor and involved nodes was used to train CNN models for generating automatic contours. The impact of image preprocessing, image augmentation, transfer learning and CNN complexity, architecture, and dimension (2D or 3D) on model performance and generalizability across centers was evaluated. A Monte Carlo dropout technique was used to quantify and visualize the uncertainty of the automatic contours. Main results. CNN models provided contours with good overlap with the manually contoured ground truth (median Dice Similarity Coefficient: 0.75-0.77), consistent with reported inter-observer variations and previous auto-contouring studies. Image augmentation and model dimension, rather than model complexity, architecture, or advanced image preprocessing, had the largest impact on model performance and cross-center generalizability. Transfer learning on a limited number of patients from a separate center increased model generalizability without decreasing model performance on the original training cohort. High model uncertainty was associated with false positive and false negative voxels as well as low Dice coefficients. Significance. High quality automatic contours can be obtained using deep learning architectures that are not overly complex. Uncertainty estimation of the predicted contours shows potential for highlighting regions of the contour requiring manual revision or flagging segmentations requiring manual inspection and intervention.
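Below is a minimal sketch of the Monte Carlo dropout idea mentioned in the approach: dropout layers are kept active at inference, several stochastic forward passes are averaged, and the per-voxel standard deviation serves as an uncertainty map. The tiny two-layer model, the sigmoid output, and the sample count are illustrative assumptions, not the authors' architecture.

```python
import torch
import torch.nn as nn

def mc_dropout_predict(model, x, n_samples=20):
    """Monte Carlo dropout: keep dropout active at inference, average repeated
    stochastic forward passes, and use the per-voxel std as the uncertainty map."""
    model.eval()
    for m in model.modules():                         # re-enable dropout layers only
        if isinstance(m, (nn.Dropout, nn.Dropout2d, nn.Dropout3d)):
            m.train()
    with torch.no_grad():
        preds = torch.stack([torch.sigmoid(model(x)) for _ in range(n_samples)])
    return preds.mean(dim=0), preds.std(dim=0)

# Tiny stand-in for a 2D segmentation CNN with dropout.
model = nn.Sequential(
    nn.Conv2d(2, 16, 3, padding=1), nn.ReLU(), nn.Dropout2d(0.5),
    nn.Conv2d(16, 1, 1),
)
x = torch.randn(1, 2, 64, 64)          # PET and CT stacked as two input channels
mean_prob, uncertainty = mc_dropout_predict(model, x)
print(mean_prob.shape, uncertainty.shape)
```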
Affiliation(s)
- Bao Ngoc Huynh: Faculty of Science and Technology, Norwegian University of Life Sciences, Ås, Norway
- Aurora Rosvoll Groendahl: Faculty of Science and Technology, Norwegian University of Life Sciences, Ås, Norway; Section of Oncology, Vestre Viken Hospital Trust, Drammen, Norway
- Oliver Tomic: Faculty of Science and Technology, Norwegian University of Life Sciences, Ås, Norway
- Kristian Hovde Liland: Faculty of Science and Technology, Norwegian University of Life Sciences, Ås, Norway
- Ingerid Skjei Knudtsen: Department of Circulation and Medical Imaging, Norwegian University of Science and Technology, Trondheim, Norway
- Frank Hoebers: Department of Radiation Oncology (MAASTRO), GROW School for Oncology and Reproduction, Maastricht, Netherlands
- Wouter van Elmpt: Department of Radiation Oncology (MAASTRO), GROW School for Oncology and Reproduction, Maastricht, Netherlands
- Einar Dale: Department of Oncology, Oslo University Hospital, Oslo, Norway
- Eirik Malinen: Department of Medical Physics, Oslo University Hospital, Oslo, Norway; Department of Physics, University of Oslo, Oslo, Norway
8. Cho MJ, Hwang D, Yie SY, Lee JS. Multi-modal co-learning with attention mechanism for head and neck tumor segmentation on 18FDG PET-CT. EJNMMI Phys 2024; 11:67. [PMID: 39052194] [PMCID: PMC11272764] [DOI: 10.1186/s40658-024-00670-y]
Abstract
PURPOSE Effective radiation therapy requires accurate segmentation of head and neck cancer, one of the most common types of cancer. With the advancement of deep learning, various methods have been developed that use positron emission tomography-computed tomography (PET-CT) to obtain complementary information. However, these approaches are computationally expensive because of the separation of feature extraction and fusion functions, and they do not make use of the high sensitivity of PET. We propose a new deep learning-based approach to alleviate these challenges. METHODS We proposed a tumor region attention module that fully exploits the high sensitivity of PET and designed a network that learns the correlation between the PET and CT features using squeeze-and-excitation normalization (SE Norm) without separating the feature extraction and fusion functions. In addition, we introduce multi-scale context fusion, which exploits contextual information from different scales. RESULTS The HECKTOR challenge 2021 dataset was used for training and testing. The proposed model outperformed the state-of-the-art models for medical image segmentation; in particular, the Dice similarity coefficient increased by 8.78% compared to U-Net. CONCLUSION The proposed network segmented the complex shape of the tumor better than the state-of-the-art medical image segmentation methods, accurately distinguishing between tumor and non-tumor regions.
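The method builds on squeeze-and-excitation style recalibration (SE Norm); a simplified channel-attention block in that spirit is sketched below for a fused PET/CT feature map. It omits the normalization coupling, the tumor region attention module, and multi-scale context fusion, and all sizes are illustrative.

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Channel attention: squeeze spatial dims, excite channels with a small MLP."""
    def __init__(self, channels, reduction=8):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels), nn.Sigmoid(),
        )

    def forward(self, x):                          # x: (B, C, D, H, W)
        w = x.mean(dim=(2, 3, 4))                  # squeeze: global average pool -> (B, C)
        w = self.fc(w).view(x.shape[0], -1, 1, 1, 1)
        return x * w                               # recalibrate channels

# Toy fused PET/CT feature map.
feat = torch.randn(2, 32, 8, 32, 32)
print(SEBlock(32)(feat).shape)
```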
Affiliation(s)
- Min Jeong Cho: Interdisciplinary Program in Bioengineering, Seoul National University College of Engineering, Seoul, 03080, South Korea; Department of Nuclear Medicine, Seoul National University College of Medicine, 103 Daehak-ro, Jongno-gu, Seoul, 03080, South Korea; Integrated Major in Innovative Medical Science, Seoul National Graduate School, Seoul, South Korea
- Donghwi Hwang: Department of Nuclear Medicine, Seoul National University College of Medicine, 103 Daehak-ro, Jongno-gu, Seoul, 03080, South Korea; Department of Biomedical Sciences, Seoul National University College of Medicine, Seoul, 03080, South Korea
- Si Young Yie: Interdisciplinary Program in Bioengineering, Seoul National University College of Engineering, Seoul, 03080, South Korea; Department of Nuclear Medicine, Seoul National University College of Medicine, 103 Daehak-ro, Jongno-gu, Seoul, 03080, South Korea; Integrated Major in Innovative Medical Science, Seoul National Graduate School, Seoul, South Korea
- Jae Sung Lee: Interdisciplinary Program in Bioengineering, Seoul National University College of Engineering, Seoul, 03080, South Korea; Department of Nuclear Medicine, Seoul National University College of Medicine, 103 Daehak-ro, Jongno-gu, Seoul, 03080, South Korea; Integrated Major in Innovative Medical Science, Seoul National Graduate School, Seoul, South Korea; Department of Biomedical Sciences, Seoul National University College of Medicine, Seoul, 03080, South Korea; Brightonix Imaging Inc, Seoul, 04782, South Korea
9. Pathan S, Ali T, Sudheesh PG, Vasanth Kumar P, Rao D. An optimized convolutional neural network architecture for lung cancer detection. APL Bioeng 2024; 8:026121. [PMID: 38868458] [PMCID: PMC11168751] [DOI: 10.1063/5.0208520]
Abstract
Lung cancer, a malignancy of the respiratory system, has a devastating impact on the health and well-being of affected individuals. Due to the lack of automated and noninvasive diagnostic tools, healthcare professionals rely on biopsy as the gold standard for diagnosis. However, biopsy can be a traumatizing and expensive procedure. Additionally, the limited availability of datasets and inaccuracy in diagnosis are major drawbacks experienced by researchers. The objective of the proposed research is to develop an automated diagnostic tool for lung cancer screening with optimized hyperparameters, such that the convolutional neural network (CNN) model generalizes well to computed tomography (CT) slices of lung pathologies obtained from diverse sources. This objective is achieved in two ways: (i) a preprocessing methodology specific to lung CT scans is formulated to avoid the loss of information due to random image smoothing, and (ii) the sine cosine algorithm (SCA) is integrated into the CNN model to optimally select its tuning parameters, using the error rate as the objective function that the SCA seeks to minimize. The proposed method achieved an average classification accuracy of 99% in classifying lung scans into normal, benign, and malignant classes. Further, the generalization ability of the proposed model is tested on an unseen dataset, achieving promising results. The quantitative results demonstrate the system's suitability for use by radiologists in a clinical scenario.
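For readers unfamiliar with the sine cosine algorithm (SCA), the sketch below implements its standard update rule: candidate hyperparameter vectors move toward the best solution found so far along sine/cosine trajectories whose amplitude decays over iterations. The bounds, agent count, and toy error function are assumptions; the paper tunes actual CNN parameters against a validation error rate.

```python
import numpy as np

def sca_minimize(objective, bounds, n_agents=10, n_iter=30, a=2.0, seed=0):
    """Sine Cosine Algorithm: agents move toward the best solution along
    sine/cosine trajectories whose amplitude r1 decays from `a` to 0."""
    rng = np.random.default_rng(seed)
    lo, hi = np.array(bounds).T
    X = rng.uniform(lo, hi, size=(n_agents, len(lo)))
    best_x, best_f = None, np.inf
    for t in range(n_iter):
        f = np.array([objective(x) for x in X])
        if f.min() < best_f:
            best_f, best_x = f.min(), X[f.argmin()].copy()
        r1 = a - t * a / n_iter
        r2 = rng.uniform(0, 2 * np.pi, X.shape)
        r3 = rng.uniform(0, 2, X.shape)
        r4 = rng.uniform(size=X.shape)
        step = np.where(r4 < 0.5, np.sin(r2), np.cos(r2)) * np.abs(r3 * best_x - X)
        X = np.clip(X + r1 * step, lo, hi)
    return best_x, best_f

# Hypothetical objective: error rate as a function of (log10 learning rate, dropout).
err = lambda x: (x[0] + 3.0) ** 2 + (x[1] - 0.3) ** 2     # stand-in for CNN error rate
best, val = sca_minimize(err, bounds=[(-5, -1), (0.0, 0.6)])
print(best, val)
```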
Affiliation(s)
- Sameena Pathan: Department of Information and Communication Technology, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal 576104, India
- Tanweer Ali: Department of Electronics and Communication Engineering, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal 576104, India
- Sudheesh P G: Department of Electronics and Communication Engineering, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal 576104, India
- Vasanth Kumar P: Department of Electronics and Communication Engineering, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal 576104, India
- Divya Rao: Department of Information and Communication Technology, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal 576104, India
10. Zou Z, Zou B, Kui X, Chen Z, Li Y. DGCBG-Net: A dual-branch network with global cross-modal interaction and boundary guidance for tumor segmentation in PET/CT images. Comput Methods Programs Biomed 2024; 250:108125. [PMID: 38631130] [DOI: 10.1016/j.cmpb.2024.108125]
Abstract
BACKGROUND AND OBJECTIVES Automatic tumor segmentation plays a crucial role in cancer diagnosis and treatment planning. Computed tomography (CT) and positron emission tomography (PET) are extensively employed for their complementary medical information. However, existing methods ignore bilateral cross-modal interaction of global features during feature extraction, and they underutilize multi-stage tumor boundary features. METHODS To address these limitations, we propose a dual-branch tumor segmentation network based on global cross-modal interaction and boundary guidance in PET/CT images (DGCBG-Net). DGCBG-Net consists of 1) a global cross-modal interaction module that extracts global contextual information from PET/CT images and promotes bilateral cross-modal interaction of global features; 2) a shared multi-path downsampling module that learns complementary features from PET/CT modalities to mitigate the impact of misleading features and decrease the loss of discriminative features during downsampling; 3) a boundary prior-guided branch that extracts potential boundary features from CT images at multiple stages, assisting the semantic segmentation branch in improving the accuracy of tumor boundary segmentation. RESULTS Extensive experiments were conducted on the STS and Hecktor 2022 datasets to evaluate the proposed method. The average Dice scores of DGCBG-Net on the two datasets are 80.33% and 79.29%, with average IOU scores of 67.64% and 70.18%. DGCBG-Net outperformed the current state-of-the-art methods with a 1.77% higher Dice score and a 2.12% higher IOU score. CONCLUSIONS Extensive experimental results demonstrate that DGCBG-Net outperforms existing segmentation methods and is competitive with state-of-the-art approaches.
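A minimal sketch of bilateral global cross-modal interaction is given below: PET and CT feature maps are flattened into token sequences and each modality attends to the other with multi-head attention, followed by residual connections. This is a generic interpretation for illustration, not the DGCBG-Net module; dimensions and head counts are assumptions.

```python
import torch
import torch.nn as nn

class BilateralCrossModalInteraction(nn.Module):
    """Each modality queries the other with multi-head attention, so global
    context flows in both directions between PET and CT feature maps."""
    def __init__(self, dim=64, heads=4):
        super().__init__()
        self.pet_from_ct = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.ct_from_pet = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, pet_feat, ct_feat):           # (B, C, H, W) each
        b, c, h, w = pet_feat.shape
        pet = pet_feat.flatten(2).transpose(1, 2)   # (B, HW, C) token sequence
        ct = ct_feat.flatten(2).transpose(1, 2)
        pet2, _ = self.pet_from_ct(pet, ct, ct)     # PET queries CT
        ct2, _ = self.ct_from_pet(ct, pet, pet)     # CT queries PET
        pet2 = (pet + pet2).transpose(1, 2).reshape(b, c, h, w)
        ct2 = (ct + ct2).transpose(1, 2).reshape(b, c, h, w)
        return pet2, ct2

pet = torch.randn(1, 64, 16, 16)
ct = torch.randn(1, 64, 16, 16)
out_pet, out_ct = BilateralCrossModalInteraction()(pet, ct)
print(out_pet.shape, out_ct.shape)
```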
Affiliation(s)
- Ziwei Zou: School of Computer Science and Engineering, Central South University, No. 932, Lushan South Road, ChangSha, 410083, China
- Beiji Zou: School of Computer Science and Engineering, Central South University, No. 932, Lushan South Road, ChangSha, 410083, China
- Xiaoyan Kui: School of Computer Science and Engineering, Central South University, No. 932, Lushan South Road, ChangSha, 410083, China
- Zhi Chen: School of Computer Science and Engineering, Central South University, No. 932, Lushan South Road, ChangSha, 410083, China
- Yang Li: School of Informatics, Hunan University of Chinese Medicine, No. 300, Xueshi Road, ChangSha, 410208, China
11. Zhou Z, Islam MT, Xing L. Multibranch CNN With MLP-Mixer-Based Feature Exploration for High-Performance Disease Diagnosis. IEEE Trans Neural Netw Learn Syst 2024; 35:7351-7362. [PMID: 37028335] [PMCID: PMC11779602] [DOI: 10.1109/tnnls.2023.3250490]
Abstract
Deep learning-based diagnosis is becoming an indispensable part of modern healthcare. For high-performance diagnosis, the optimal design of deep neural networks (DNNs) is a prerequisite. Despite their success in image analysis, existing supervised DNNs based on convolutional layers often suffer from rudimentary feature exploration ability caused by the limited receptive field and biased feature extraction of conventional convolutional neural networks (CNNs), which compromises network performance. Here, we propose a novel feature exploration network named manifold embedded multilayer perceptron (MLP) mixer (ME-Mixer), which utilizes both supervised and unsupervised features for disease diagnosis. In the proposed approach, a manifold embedding network is employed to extract class-discriminative features; then, two MLP-Mixer-based feature projectors are adopted to encode the extracted features with a global receptive field. Our ME-Mixer network is quite general and can be added as a plugin to any existing CNN. Comprehensive evaluations on two medical datasets are performed. The results demonstrate that our approach greatly enhances the classification accuracy in comparison with different configurations of DNNs with acceptable computational complexity.
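The MLP-Mixer component referenced above interleaves token-mixing and channel-mixing MLPs. A standard single Mixer block, of the kind such a feature projector could stack on CNN feature tokens, is sketched below; hidden sizes and token counts are illustrative, and the manifold-embedding branch is not shown.

```python
import torch
import torch.nn as nn

class MixerBlock(nn.Module):
    """One MLP-Mixer block: token-mixing MLP across patches, then
    channel-mixing MLP across features, each with LayerNorm and a residual."""
    def __init__(self, n_tokens, dim, token_hidden=64, channel_hidden=256):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.token_mlp = nn.Sequential(
            nn.Linear(n_tokens, token_hidden), nn.GELU(), nn.Linear(token_hidden, n_tokens))
        self.norm2 = nn.LayerNorm(dim)
        self.channel_mlp = nn.Sequential(
            nn.Linear(dim, channel_hidden), nn.GELU(), nn.Linear(channel_hidden, dim))

    def forward(self, x):                           # x: (B, n_tokens, dim)
        y = self.norm1(x).transpose(1, 2)           # mix across tokens
        x = x + self.token_mlp(y).transpose(1, 2)
        return x + self.channel_mlp(self.norm2(x))  # mix across channels

tokens = torch.randn(2, 49, 128)                    # e.g., 7x7 patches of a CNN feature map
print(MixerBlock(n_tokens=49, dim=128)(tokens).shape)
```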
12. Zhang P, Gao C, Huang Y, Chen X, Pan Z, Wang L, Dong D, Li S, Qi X. Artificial intelligence in liver imaging: methods and applications. Hepatol Int 2024; 18:422-434. [PMID: 38376649] [DOI: 10.1007/s12072-023-10630-w]
Abstract
Liver disease is regarded as one of the major health threats to humans. Radiographic assessments hold promise for addressing the current demands for precisely diagnosing and treating liver diseases, and artificial intelligence (AI), which excels at automatically making quantitative assessments of complex medical image characteristics, has made great strides in supporting the qualitative interpretation of medical imaging by clinicians. Here, we review the current state of medical-imaging-based AI methodologies and their applications concerning the management of liver diseases. We summarize the representative AI methodologies in liver imaging with a focus on deep learning, and illustrate their promising clinical applications across the spectrum of precise liver disease detection, diagnosis and treatment. We also address the current challenges and future perspectives of AI in liver imaging, with an emphasis on feature interpretability, multimodal data integration and multicenter studies. Taken together, AI methodologies, combined with the large volume of available medical image data, might well shape the future of liver disease care.
Affiliation(s)
- Peng Zhang: Institute for TCM-X, MOE Key Laboratory of Bioinformatics, Bioinformatics Division, BNRIST, Department of Automation, Tsinghua University, Beijing, China
- Chaofei Gao: Institute for TCM-X, MOE Key Laboratory of Bioinformatics, Bioinformatics Division, BNRIST, Department of Automation, Tsinghua University, Beijing, China
- Yifei Huang: Department of Gastroenterology, The Third Affiliated Hospital of Sun Yat-sen University, Guangzhou, China
- Xiangyi Chen: Institute for TCM-X, MOE Key Laboratory of Bioinformatics, Bioinformatics Division, BNRIST, Department of Automation, Tsinghua University, Beijing, China
- Zhuoshi Pan: Institute for TCM-X, MOE Key Laboratory of Bioinformatics, Bioinformatics Division, BNRIST, Department of Automation, Tsinghua University, Beijing, China
- Lan Wang: Institute for TCM-X, MOE Key Laboratory of Bioinformatics, Bioinformatics Division, BNRIST, Department of Automation, Tsinghua University, Beijing, China
- Di Dong: CAS Key Laboratory of Molecular Imaging, Beijing Key Laboratory of Molecular Imaging, Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, China
- Shao Li: Institute for TCM-X, MOE Key Laboratory of Bioinformatics, Bioinformatics Division, BNRIST, Department of Automation, Tsinghua University, Beijing, China
- Xiaolong Qi: Center of Portal Hypertension, Department of Radiology, Zhongda Hospital, Medical School, Nurturing Center of Jiangsu Province for State Laboratory of AI Imaging & Interventional Radiology, Southeast University, Nanjing, China
13. Shyamala Bharathi P, Shalini C. Advanced hybrid attention-based deep learning network with heuristic algorithm for adaptive CT and PET image fusion in lung cancer detection. Med Eng Phys 2024; 126:104138. [PMID: 38621836] [DOI: 10.1016/j.medengphy.2024.104138]
Abstract
Lung cancer is one of the deadliest diseases in the world, and early detection can save a patient's life. Although computed tomography (CT) is among the most widely used imaging tools in medicine, clinicians find it challenging to interpret CT scan data and detect cancer from it, while positron emission tomography (PET) imaging is one of the most effective modalities for diagnosing certain malignancies such as lung tumours. Early lung cancer identification is also important for predicting the severity of disease in cancer patients. In this work, an image fusion-based detection model for lung cancer is proposed that combines a deep learning model with an improved heuristic algorithm. First, PET and CT images are gathered from online sources. The two images are then fused using an Adaptive Dilated Convolutional Neural Network (AD-CNN), whose hyperparameters are tuned by the Modified Initial Velocity-based Capuchin Search Algorithm (MIV-CapSA). Subsequently, abnormal regions are segmented using TransUNet3+. Finally, the segmented images are fed into a Hybrid Attention-based Deep Network (HADN) built on MobileNet and ShuffleNet. The effectiveness of the novel detection model is analyzed using various metrics and compared with traditional approaches, and the results indicate that it supports early detection and effective treatment of patients.
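As a rough illustration of dilated-convolution-based CT/PET fusion, the toy block below stacks the two modalities as channels, gathers context at several dilation rates, and predicts a fused image. It is not the paper's AD-CNN and omits the MIV-CapSA hyperparameter tuning; layer widths and dilation rates are assumptions.

```python
import torch
import torch.nn as nn

class DilatedFusionBlock(nn.Module):
    """Toy dilated-convolution fusion: stack CT and PET as channels, gather
    context at several dilation rates, and predict a single fused image."""
    def __init__(self, width=16, dilations=(1, 2, 4)):
        super().__init__()
        self.branches = nn.ModuleList(
            [nn.Conv2d(2, width, kernel_size=3, padding=d, dilation=d) for d in dilations])
        self.head = nn.Conv2d(width * len(dilations), 1, kernel_size=1)

    def forward(self, ct, pet):
        x = torch.cat([ct, pet], dim=1)                      # (B, 2, H, W)
        feats = [torch.relu(b(x)) for b in self.branches]    # same H, W per branch
        return torch.sigmoid(self.head(torch.cat(feats, dim=1)))

ct = torch.rand(1, 1, 128, 128)
pet = torch.rand(1, 1, 128, 128)
print(DilatedFusionBlock()(ct, pet).shape)                   # (1, 1, 128, 128)
```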
Affiliation(s)
- P Shyamala Bharathi: Saveetha School of Engineering, Saveetha Institute of Medical and Technical Sciences, Chennai, India
- C Shalini: Saveetha School of Engineering, Saveetha Institute of Medical and Technical Sciences, Chennai, India
14. Artesani A, Bruno A, Gelardi F, Chiti A. Empowering PET: harnessing deep learning for improved clinical insight. Eur Radiol Exp 2024; 8:17. [PMID: 38321340] [PMCID: PMC10847083] [DOI: 10.1186/s41747-023-00413-1]
Abstract
This review aims to take a journey into the transformative impact of artificial intelligence (AI) on positron emission tomography (PET) imaging. To this scope, a broad overview of AI applications in the field of nuclear medicine and a thorough exploration of deep learning (DL) implementations in cancer diagnosis and therapy through PET imaging will be presented. We first describe the behind-the-scenes use of AI for image generation, including acquisition (event positioning, noise reduction through time-of-flight estimation and scatter correction), reconstruction (data-driven and model-driven approaches), restoration (supervised and unsupervised methods), and motion correction. Thereafter, we outline the integration of AI into clinical practice through applications to segmentation, detection and classification, quantification, treatment planning, dosimetry, and radiomics/radiogenomics combined with tumour biological characteristics. Thus, this review seeks to showcase the overarching transformation of the field, ultimately leading to tangible improvements in patient treatment and response assessment. Finally, limitations and ethical considerations of the AI application to PET imaging and future directions of multimodal data mining in this discipline will be briefly discussed, including pressing challenges to the adoption of AI in molecular imaging such as access to and interoperability of huge amounts of data as well as the "black-box" problem, contributing to the ongoing dialogue on the transformative potential of AI in nuclear medicine. Relevance statement: AI is rapidly revolutionising the world of medicine, including the fields of radiology and nuclear medicine. In the near future, AI will be used to support healthcare professionals. These advances will lead to improvements in diagnosis, in the assessment of response to treatment, in clinical decision making and in patient management. Key points: (1) Applying AI has the potential to enhance the entire PET imaging pipeline. (2) AI may support several clinical tasks in both PET diagnosis and prognosis. (3) Interpreting the relationships between imaging and multiomics data will heavily rely on AI.
Affiliation(s)
- Alessia Artesani: Department of Biomedical Sciences, Humanitas University, Via Rita Levi Montalcini 4, Milan, Pieve Emanuele, 20090, Italy
- Alessandro Bruno: Department of Business, Law, Economics and Consumer Behaviour "Carlo A. Ricciardi", IULM Libera Università Di Lingue E Comunicazione, Via P. Filargo 38, Milan, 20143, Italy
- Fabrizia Gelardi: Department of Biomedical Sciences, Humanitas University, Via Rita Levi Montalcini 4, Milan, Pieve Emanuele, 20090, Italy; Vita-Salute San Raffaele University, Via Olgettina 58, Milan, 20132, Italy
- Arturo Chiti: Vita-Salute San Raffaele University, Via Olgettina 58, Milan, 20132, Italy; Department of Nuclear Medicine, IRCCS Ospedale San Raffaele, Via Olgettina 60, Milan, 20132, Italy
15. Nigam S, Mohapatra J, Makela AV, Hayat H, Rodriguez JM, Sun A, Kenyon E, Redman NA, Spence D, Jabin G, Gu B, Ashry M, Sempere LF, Mitra A, Li J, Chen J, Wei GW, Bolin S, Etchebarne B, Liu JP, Contag CH, Wang P. Shape Anisotropy-Governed High-Performance Nanomagnetosol for In Vivo Magnetic Particle Imaging of Lungs. Small 2024; 20:e2305300. [PMID: 37735143] [PMCID: PMC10842459] [DOI: 10.1002/smll.202305300]
Abstract
Caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), coronavirus disease 2019 (COVID-19) has shown extensive lung manifestations in vulnerable individuals, putting lung imaging and monitoring at the forefront of early detection and treatment. Magnetic particle imaging (MPI) is an imaging modality, which can bring excellent contrast, sensitivity, and signal-to-noise ratios to lung imaging for the development of new theranostic approaches for respiratory diseases. Advances in MPI tracers would offer additional improvements and increase the potential for clinical translation of MPI. Here, a high-performance nanotracer based on shape anisotropy of magnetic nanoparticles is developed and its use in MPI imaging of the lung is demonstrated. Shape anisotropy proves to be a critical parameter for increasing signal intensity and resolution and exceeding those properties of conventional spherical nanoparticles. The 0D nanoparticles exhibit a 2-fold increase, while the 1D nanorods have a > 5-fold increase in signal intensity when compared to VivoTrax. Newly designed 1D nanorods displayed high signal intensities and excellent resolution in lung images. A spatiotemporal lung imaging study in mice revealed that this tracer offers new opportunities for monitoring disease and guiding intervention.
Affiliation(s)
- Saumya Nigam: Precision Health Program, Michigan State University, East Lansing, MI, 48824, USA; Department of Radiology, College of Human Medicine, Michigan State University, East Lansing, MI, 48824, USA
- Jeotikanta Mohapatra: Department of Physics, The University of Texas at Arlington, Arlington, TX, 76019, USA
- Ashley V Makela: Institute for Quantitative Health Science and Engineering (IQ), Michigan State University, East Lansing, MI, 48824, USA; Department of Biomedical Engineering, College of Engineering, Michigan State University, East Lansing, MI, 48824, USA
- Hanaan Hayat: Precision Health Program, Michigan State University, East Lansing, MI, 48824, USA; Department of Radiology, College of Human Medicine, Michigan State University, East Lansing, MI, 48824, USA
- Jessi Mercedes Rodriguez: Precision Health Program, Michigan State University, East Lansing, MI, 48824, USA; Department of Radiology, College of Human Medicine, Michigan State University, East Lansing, MI, 48824, USA; Human Biology Program, College of Natural Science, Michigan State University, East Lansing, MI, 48824, USA
- Aixia Sun: Precision Health Program, Michigan State University, East Lansing, MI, 48824, USA; Department of Radiology, College of Human Medicine, Michigan State University, East Lansing, MI, 48824, USA
- Elizabeth Kenyon: Precision Health Program, Michigan State University, East Lansing, MI, 48824, USA; Department of Radiology, College of Human Medicine, Michigan State University, East Lansing, MI, 48824, USA
- Nathan A Redman: Institute for Quantitative Health Science and Engineering (IQ), Michigan State University, East Lansing, MI, 48824, USA; Department of Biomedical Engineering, College of Engineering, Michigan State University, East Lansing, MI, 48824, USA
- Dana Spence: Institute for Quantitative Health Science and Engineering (IQ), Michigan State University, East Lansing, MI, 48824, USA; Department of Biomedical Engineering, College of Engineering, Michigan State University, East Lansing, MI, 48824, USA
- George Jabin: Department of Physics, The University of Texas at Arlington, Arlington, TX, 76019, USA
- Bin Gu: Department of Obstetrics, Gynecology and Reproductive Sciences, College of Human Medicine, Michigan State University, East Lansing, MI, 48824, USA
- Mohamed Ashry: Precision Health Program, Michigan State University, East Lansing, MI, 48824, USA; Department of Radiology, College of Human Medicine, Michigan State University, East Lansing, MI, 48824, USA
- Lorenzo F Sempere: Precision Health Program, Michigan State University, East Lansing, MI, 48824, USA; Department of Radiology, College of Human Medicine, Michigan State University, East Lansing, MI, 48824, USA
- Arijit Mitra: Department of Materials Science and Engineering, National Cheng Kung University, Tainan City, 701, Taiwan
- Jinxing Li: Institute for Quantitative Health Science and Engineering (IQ), Michigan State University, East Lansing, MI, 48824, USA; Department of Biomedical Engineering, College of Engineering, Michigan State University, East Lansing, MI, 48824, USA
- Jiahui Chen: Department of Mathematics, College of Natural Science, Michigan State University, East Lansing, MI, 48824, USA
- Guo-Wei Wei: Department of Mathematics, College of Natural Science, Michigan State University, East Lansing, MI, 48824, USA; Department of Electrical and Computer Engineering, College of Engineering, Michigan State University, East Lansing, MI, 48824, USA; Department of Biochemistry and Molecular Biology, College of Natural Science, Michigan State University, East Lansing, MI, 48824, USA
- Steven Bolin: Department of Pathobiology and Diagnostic Investigation, College of Veterinary Medicine, Michigan State University, East Lansing, MI, 48824, USA
- Brett Etchebarne: Osteopathic Medical Specialties, College of Osteopathic Medicine, Michigan State University, East Lansing, MI, 48824, USA
- J Ping Liu: Department of Physics, The University of Texas at Arlington, Arlington, TX, 76019, USA
- Christopher H Contag: Institute for Quantitative Health Science and Engineering (IQ), Michigan State University, East Lansing, MI, 48824, USA; Department of Biomedical Engineering, College of Engineering, Michigan State University, East Lansing, MI, 48824, USA; Department of Microbiology and Molecular Genetics, Michigan State University, East Lansing, MI, 48824, USA
- Ping Wang: Precision Health Program, Michigan State University, East Lansing, MI, 48824, USA; Department of Radiology, College of Human Medicine, Michigan State University, East Lansing, MI, 48824, USA
16. Zhang Q, Hu Y, Zhou C, Zhao Y, Zhang N, Zhou Y, Yang Y, Zheng H, Fan W, Liang D, Hu Z. Reducing pediatric total-body PET/CT imaging scan time with multimodal artificial intelligence technology. EJNMMI Phys 2024; 11:1. [PMID: 38165551] [PMCID: PMC10761657] [DOI: 10.1186/s40658-023-00605-z]
Abstract
OBJECTIVES This study aims to decrease the scan time and enhance image quality in pediatric total-body PET imaging by utilizing multimodal artificial intelligence techniques. METHODS A total of 270 pediatric patients who underwent total-body PET/CT scans with a uEXPLORER at the Sun Yat-sen University Cancer Center were retrospectively enrolled. 18F-fluorodeoxyglucose (18F-FDG) was administered at a dose of 3.7 MBq/kg with an acquisition time of 600 s. Short-term scan PET images (acquired within 6, 15, 30, 60 and 150 s) were obtained by truncating the list-mode data. A three-dimensional (3D) neural network was developed with a residual network as the basic structure, fusing low-dose CT images as prior information, which were fed to the network at different scales. The short-term PET images and low-dose CT images were processed by the multimodal 3D network to generate full-length, high-dose PET images. The nonlocal means method and the same 3D network without the fused CT information were used as reference methods. The performance of the network model was evaluated by quantitative and qualitative analyses. RESULTS Multimodal artificial intelligence techniques can significantly improve PET image quality. When fused with prior CT information, the anatomical information of the images was enhanced, and 60 s of scan data produced images of quality comparable to that of the full-time data. CONCLUSION Multimodal artificial intelligence techniques can effectively improve the quality of pediatric total-body PET/CT images acquired using ultrashort scan times. This has the potential to decrease the use of sedation, enhance guardian confidence, and reduce the probability of motion artifacts.
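A minimal sketch of the general idea, feeding a low-dose CT prior alongside the short-scan PET and letting a residual 3D network predict the full-acquisition image, is shown below. The study uses a multi-scale residual network with CT information injected at several scales; this single two-convolution block with one fusion point is purely illustrative, and all sizes are assumptions.

```python
import torch
import torch.nn as nn

class CTGuidedResBlock(nn.Module):
    """Residual 3D block that takes a short-scan PET volume plus a low-dose CT
    prior as a second channel and predicts the full-acquisition PET."""
    def __init__(self, width=16):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv3d(2, width, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv3d(width, 1, 3, padding=1),
        )

    def forward(self, pet_short, ct_prior):
        x = torch.cat([pet_short, ct_prior], dim=1)   # fuse modalities as channels
        return pet_short + self.body(x)               # residual: learn only the correction

pet_short = torch.rand(1, 1, 32, 64, 64)
ct_prior = torch.rand(1, 1, 32, 64, 64)
print(CTGuidedResBlock()(pet_short, ct_prior).shape)
```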
Affiliation(s)
- Qiyang Zhang: Lauterbur Research Center for Biomedical Imaging, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China
- Yingying Hu: Department of Nuclear Medicine, Sun Yat-sen University Cancer Center, Guangzhou, 510060, China
- Chao Zhou: Department of Nuclear Medicine, Sun Yat-sen University Cancer Center, Guangzhou, 510060, China
- Yumo Zhao: Department of Nuclear Medicine, Sun Yat-sen University Cancer Center, Guangzhou, 510060, China
- Na Zhang: Lauterbur Research Center for Biomedical Imaging, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China
- Yun Zhou: United Imaging Healthcare Group, Central Research Institute, Shanghai, 201807, China
- Yongfeng Yang: Lauterbur Research Center for Biomedical Imaging, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China
- Hairong Zheng: Lauterbur Research Center for Biomedical Imaging, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China
- Wei Fan: Department of Nuclear Medicine, Sun Yat-sen University Cancer Center, Guangzhou, 510060, China
- Dong Liang: Lauterbur Research Center for Biomedical Imaging, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China
- Zhanli Hu: Lauterbur Research Center for Biomedical Imaging, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China
17. Fallahpoor M, Chakraborty S, Pradhan B, Faust O, Barua PD, Chegeni H, Acharya R. Deep learning techniques in PET/CT imaging: A comprehensive review from sinogram to image space. Comput Methods Programs Biomed 2024; 243:107880. [PMID: 37924769] [DOI: 10.1016/j.cmpb.2023.107880]
Abstract
Positron emission tomography/computed tomography (PET/CT) is increasingly used in oncology, neurology, cardiology, and emerging medical fields. The success stems from the cohesive information that hybrid PET/CT imaging offers, surpassing the capabilities of the individual modalities used in isolation for different malignancies. However, manual image interpretation requires extensive disease-specific knowledge and is a time-consuming aspect of physicians' daily routines. Deep learning algorithms, akin to a practitioner during training, extract knowledge from images to facilitate the diagnosis process by detecting symptoms and enhancing images. Available review papers on PET/CT imaging have either included additional modalities or examined various types of AI applications, leaving a lack of comprehensive investigation focused specifically on the use of deep learning with PET/CT images. This review aims to fill that gap by investigating the characteristics of approaches used in papers that employed deep learning for PET/CT imaging. Within the review, we identified 99 studies published between 2017 and 2022 that applied deep learning to PET/CT images. We also identified the best pre-processing algorithms and the most effective deep learning models reported for PET/CT while highlighting the current limitations. Our review underscores the potential of deep learning (DL) in PET/CT imaging, with successful applications in lesion detection, tumor segmentation, and disease classification in both sinogram and image spaces. Common and specific pre-processing techniques are also discussed. DL algorithms excel at extracting meaningful features, enhancing accuracy and efficiency in diagnosis. However, limitations arise from the scarcity of annotated datasets and challenges in explainability and uncertainty. Recent DL models, such as attention-based models, generative models, multi-modal models, graph convolutional networks, and transformers, are promising for improving PET/CT studies. Additionally, radiomics has garnered attention for tumor classification and predicting patient outcomes. Ongoing research is crucial to explore new applications and improve the accuracy of DL models in this rapidly evolving field.
Affiliation(s)
- Maryam Fallahpoor: Centre for Advanced Modelling and Geospatial Information Systems (CAMGIS), School of Civil and Environmental Engineering, University of Technology Sydney, Ultimo, NSW 2007, Australia
- Subrata Chakraborty: Centre for Advanced Modelling and Geospatial Information Systems (CAMGIS), School of Civil and Environmental Engineering, University of Technology Sydney, Ultimo, NSW 2007, Australia; School of Science and Technology, Faculty of Science, Agriculture, Business and Law, University of New England, Armidale, NSW 2351, Australia
- Biswajeet Pradhan: Centre for Advanced Modelling and Geospatial Information Systems (CAMGIS), School of Civil and Environmental Engineering, University of Technology Sydney, Ultimo, NSW 2007, Australia; Earth Observation Centre, Institute of Climate Change, Universiti Kebangsaan Malaysia, Bangi 43600, Malaysia
- Oliver Faust: School of Computing and Information Science, Anglia Ruskin University Cambridge Campus, United Kingdom
- Prabal Datta Barua: School of Science and Technology, Faculty of Science, Agriculture, Business and Law, University of New England, Armidale, NSW 2351, Australia; Faculty of Engineering and Information Technology, University of Technology Sydney, Australia; School of Business (Information Systems), Faculty of Business, Education, Law & Arts, University of Southern Queensland, Australia
- Rajendra Acharya: School of Mathematics, Physics and Computing, University of Southern Queensland, Toowoomba, QLD, Australia
18. Shiri I, Amini M, Yousefirizi F, Vafaei Sadr A, Hajianfar G, Salimi Y, Mansouri Z, Jenabi E, Maghsudi M, Mainta I, Becker M, Rahmim A, Zaidi H. Information fusion for fully automated segmentation of head and neck tumors from PET and CT images. Med Phys 2024; 51:319-333. [PMID: 37475591] [DOI: 10.1002/mp.16615]
Abstract
BACKGROUND PET/CT images combining anatomic and metabolic data provide complementary information that can improve clinical task performance. PET image segmentation algorithms exploiting the multi-modal information available are still lacking. PURPOSE Our study aimed to assess the performance of PET and CT image fusion for gross tumor volume (GTV) segmentations of head and neck cancers (HNCs) utilizing conventional, deep learning (DL), and output-level voting-based fusions. METHODS The current study is based on a total of 328 histologically confirmed HNCs from six different centers. The images were automatically cropped to a 200 × 200 head and neck region box, and CT and PET images were normalized for further processing. Eighteen conventional image-level fusions were implemented. In addition, a modified U2-Net architecture was used as the DL fusion model baseline. Three different input-, layer-, and decision-level information fusions were used. Simultaneous truth and performance level estimation (STAPLE) and majority voting were employed to merge different segmentation outputs (from PET and from image-level and network-level fusions), that is, output-level information fusion (voting-based fusion). Different networks were trained in a 2D manner with a batch size of 64. Twenty percent of the dataset, stratified by center (20% in each center), was used for final result reporting. Different standard segmentation metrics and conventional PET metrics, such as SUV, were calculated. RESULTS Among single modalities, PET had a reasonable performance with a Dice score of 0.77 ± 0.09, while CT did not perform acceptably and reached a Dice score of only 0.38 ± 0.22. Conventional fusion algorithms obtained a Dice score range of [0.76-0.81] with guided-filter-based context enhancement (GFCE) at the low end, and anisotropic diffusion and Karhunen-Loeve transform fusion (ADF), multi-resolution singular value decomposition (MSVD), and multi-level image decomposition based on latent low-rank representation (MDLatLRR) at the high end. All DL fusion models achieved Dice scores of 0.80. Output-level voting-based models outperformed all other models, achieving superior results with a Dice score of 0.84 for Majority_ImgFus, Majority_All, and Majority_Fast. A mean error of almost zero was achieved for all fusions using SUVpeak, SUVmean and SUVmedian. CONCLUSION PET/CT information fusion adds significant value to segmentation tasks, considerably outperforming PET-only and CT-only methods. In addition, both conventional image-level and DL fusions achieve competitive results. Meanwhile, output-level voting-based fusion using majority voting of several algorithms results in statistically significant improvements in the segmentation of HNC.
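The output-level fusion that performed best above is plain majority voting over candidate segmentations; a small sketch of that voting step is given below (toy masks, not the study's pipeline; STAPLE is not shown).

```python
import numpy as np

def majority_vote(masks):
    """Output-level fusion: a voxel is labeled tumor if more than half of the
    candidate segmentations (e.g., PET-only, image-level and network-level fusions) agree."""
    stack = np.stack([m.astype(np.uint8) for m in masks], axis=0)
    return (stack.sum(axis=0) > stack.shape[0] / 2).astype(np.uint8)

# Three toy candidate masks for one slice.
rng = np.random.default_rng(1)
masks = [rng.integers(0, 2, size=(64, 64)) for _ in range(3)]
fused = majority_vote(masks)
print(fused.shape, fused.sum())
```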
Collapse
Affiliation(s)
- Isaac Shiri
- Division of Nuclear Medicine and Molecular Imaging, Geneva University Hospital, Geneva, Switzerland
| | - Mehdi Amini
- Division of Nuclear Medicine and Molecular Imaging, Geneva University Hospital, Geneva, Switzerland
| | - Fereshteh Yousefirizi
- Department of Integrative Oncology, BC Cancer Research Institute, Vancouver, British Columbia, Canada
| | - Alireza Vafaei Sadr
- Institute of Pathology, RWTH Aachen University Hospital, Aachen, Germany
- Department of Public Health Sciences, College of Medicine, The Pennsylvania State University, Hershey, USA
| | - Ghasem Hajianfar
- Division of Nuclear Medicine and Molecular Imaging, Geneva University Hospital, Geneva, Switzerland
| | - Yazdan Salimi
- Division of Nuclear Medicine and Molecular Imaging, Geneva University Hospital, Geneva, Switzerland
| | - Zahra Mansouri
- Division of Nuclear Medicine and Molecular Imaging, Geneva University Hospital, Geneva, Switzerland
| | - Elnaz Jenabi
- Research Center for Nuclear Medicine, Shariati Hospital, Tehran University of Medical Sciences, Tehran, Iran
| | - Mehdi Maghsudi
- Rajaie Cardiovascular Medical and Research Center, Iran University of Medical Sciences, Tehran, Iran
| | - Ismini Mainta
- Division of Nuclear Medicine and Molecular Imaging, Geneva University Hospital, Geneva, Switzerland
| | - Minerva Becker
- Service of Radiology, Geneva University Hospital, Geneva, Switzerland
| | - Arman Rahmim
- Department of Integrative Oncology, BC Cancer Research Institute, Vancouver, British Columbia, Canada
- Department of Radiology and Physics, University of British Columbia, Vancouver, Canada
| | - Habib Zaidi
- Division of Nuclear Medicine and Molecular Imaging, Geneva University Hospital, Geneva, Switzerland
- Geneva University Neurocenter, Geneva University, Geneva, Switzerland
- Department of Nuclear Medicine and Molecular Imaging, University of Groningen, University Medical Center Groningen, Groningen, Netherlands
- Department of Nuclear Medicine, University of Southern Denmark, Odense, Denmark
| |
Collapse
|
19
|
Hussain D, Al-Masni MA, Aslam M, Sadeghi-Niaraki A, Hussain J, Gu YH, Naqvi RA. Revolutionizing tumor detection and classification in multimodality imaging based on deep learning approaches: Methods, applications and limitations. JOURNAL OF X-RAY SCIENCE AND TECHNOLOGY 2024; 32:857-911. [PMID: 38701131 DOI: 10.3233/xst-230429] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/05/2024]
Abstract
BACKGROUND The emergence of deep learning (DL) techniques has revolutionized tumor detection and classification in medical imaging, with multimodal medical imaging (MMI) gaining recognition for its precision in diagnosis, treatment, and progression tracking. OBJECTIVE This review comprehensively examines DL methods in transforming tumor detection and classification across MMI modalities, aiming to provide insights into advancements, limitations, and key challenges for further progress. METHODS Systematic literature analysis identifies DL studies for tumor detection and classification, outlining methodologies including convolutional neural networks (CNNs), recurrent neural networks (RNNs), and their variants. Integration of multimodality imaging enhances accuracy and robustness. RESULTS Recent advancements in DL-based MMI evaluation methods are surveyed, focusing on tumor detection and classification tasks. Various DL approaches, including CNNs, YOLO, Siamese Networks, Fusion-Based Models, Attention-Based Models, and Generative Adversarial Networks, are discussed with emphasis on PET-MRI, PET-CT, and SPECT-CT. FUTURE DIRECTIONS The review outlines emerging trends and future directions in DL-based tumor analysis, aiming to guide researchers and clinicians toward more effective diagnosis and prognosis. Continued innovation and collaboration are stressed in this rapidly evolving domain. CONCLUSION Conclusions drawn from literature analysis underscore the efficacy of DL approaches in tumor detection and classification, highlighting their potential to address challenges in MMI analysis and their implications for clinical practice.
Collapse
Affiliation(s)
- Dildar Hussain
- Department of Artificial Intelligence and Data Science, Sejong University, Seoul, Korea
| | - Mohammed A Al-Masni
- Department of Artificial Intelligence and Data Science, Sejong University, Seoul, Korea
| | - Muhammad Aslam
- Department of Artificial Intelligence and Data Science, Sejong University, Seoul, Korea
| | - Abolghasem Sadeghi-Niaraki
- Department of Computer Science & Engineering and Convergence Engineering for Intelligent Drone, XR Research Center, Sejong University, Seoul, Korea
| | - Jamil Hussain
- Department of Artificial Intelligence and Data Science, Sejong University, Seoul, Korea
| | - Yeong Hyeon Gu
- Department of Artificial Intelligence and Data Science, Sejong University, Seoul, Korea
| | - Rizwan Ali Naqvi
- Department of Intelligent Mechatronics Engineering, Sejong University, Seoul, Korea
| |
Collapse
|
20
|
Xing F, Silosky M, Ghosh D, Chin BB. Location-Aware Encoding for Lesion Detection in 68Ga-DOTATATE Positron Emission Tomography Images. IEEE Trans Biomed Eng 2024; 71:247-257. [PMID: 37471190 DOI: 10.1109/tbme.2023.3297249] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 07/22/2023]
Abstract
OBJECTIVE Lesion detection with positron emission tomography (PET) imaging is critical for tumor staging, treatment planning, and advancing novel therapies to improve patient outcomes, especially for neuroendocrine tumors (NETs). Current lesion detection methods often require manual cropping of regions/volumes of interest (ROIs/VOIs) a priori, or rely on multi-stage, cascaded models, or use multi-modality imaging to detect lesions in PET images. This leads to significant inefficiency, high variability and/or potential accumulative errors in lesion quantification. To tackle this issue, we propose a novel single-stage lesion detection method using only PET images. METHODS We design and incorporate a new, plug-and-play codebook learning module into a U-Net-like neural network and promote lesion location-specific feature learning at multiple scales. We explicitly regularize the codebook learning with direct supervision at the network's multi-level hidden layers and enforce the network to learn multi-scale discriminative features with respect to predicting lesion positions. The network automatically combines the predictions from the codebook learning module and other layers via a learnable fusion layer. RESULTS We evaluate the proposed method on a real-world clinical 68Ga-DOTATATE PET image dataset, and our method produces significantly better lesion detection performance than recent state-of-the-art approaches. CONCLUSION We present a novel deep learning method for single-stage lesion detection in PET imaging data, with no ROI/VOI cropping in advance, no multi-stage modeling and no multi-modality data. SIGNIFICANCE This study provides a new perspective for effective and efficient lesion identification in PET, potentially accelerating novel therapeutic regimen development for NETs and ultimately improving patient outcomes including survival.
Collapse
|
21
|
Bi L, Buehner U, Fu X, Williamson T, Choong P, Kim J. Hybrid CNN-transformer network for interactive learning of challenging musculoskeletal images. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2024; 243:107875. [PMID: 37871450 DOI: 10.1016/j.cmpb.2023.107875] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/06/2023] [Revised: 10/16/2023] [Accepted: 10/17/2023] [Indexed: 10/25/2023]
Abstract
BACKGROUND AND OBJECTIVES Segmentation of regions of interest (ROIs) such as tumors and bones plays an essential role in the analysis of musculoskeletal (MSK) images. Segmentation results can help orthopaedic surgeons with surgical outcome assessment and patient gait-cycle simulation. Deep learning-based automatic segmentation methods, particularly those using fully convolutional networks (FCNs), are considered the state of the art. However, in scenarios where the training data are insufficient to account for all the variations in ROIs, these methods struggle to segment challenging ROIs with less common image characteristics. Such characteristics might include low contrast to the background, inhomogeneous textures, and fuzzy boundaries. METHODS We propose a hybrid convolutional neural network-transformer network (HCTN) for semi-automatic segmentation to overcome the limitations of segmenting challenging MSK images. Specifically, we propose to fuse user inputs (manual, e.g., mouse clicks) with high-level semantic image features derived from the neural network (automatic), where the user inputs are used in interactive training for uncommon image characteristics. In addition, we propose to leverage the transformer network (TN) - a deep learning model designed for handling sequence data - together with features derived from FCNs for segmentation; this addresses the limitation that FCNs can only operate on small kernels, which tends to dismiss global context and focus only on local patterns. RESULTS We purposely selected three MSK imaging datasets covering a variety of structures to evaluate the generalizability of the proposed method. Our semi-automatic HCTN method achieved a Dice coefficient score (DSC) of 88.46 ± 9.41 for segmenting soft-tissue sarcoma tumors from magnetic resonance (MR) images, 73.32 ± 11.97 for segmenting osteosarcoma tumors from MR images, and 93.93 ± 1.84 for segmenting clavicle bones from chest radiographs. When compared to the current state-of-the-art automatic segmentation method, our HCTN method is 11.7%, 19.11%, and 7.36% higher in DSC on the three datasets, respectively. CONCLUSION Our experimental results demonstrate that HCTN achieves more generalizable results than current methods, especially on challenging MSK studies.
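One common way to fuse user clicks with image input, sketched below under stated assumptions, is to encode the clicks as a Gaussian heatmap and concatenate it with the image as an extra channel; the coordinates, sigma, and image size here are hypothetical and the paper's own fusion with high-level features may differ.

```python
import torch

def clicks_to_heatmap(clicks, shape, sigma=5.0):
    """Encode user clicks (row, col) as a Gaussian heatmap the network can consume."""
    h, w = shape
    ys = torch.arange(h).view(h, 1).float()
    xs = torch.arange(w).view(1, w).float()
    heat = torch.zeros(h, w)
    for r, c in clicks:
        heat = torch.maximum(
            heat, torch.exp(-((ys - r) ** 2 + (xs - c) ** 2) / (2 * sigma ** 2)))
    return heat

image = torch.rand(1, 1, 128, 128)              # grayscale MR slice (toy data)
heat = clicks_to_heatmap([(40, 60), (80, 70)], (128, 128)).view(1, 1, 128, 128)
net_input = torch.cat([image, heat], dim=1)     # image + interaction channel
print(net_input.shape)                          # torch.Size([1, 2, 128, 128])
```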
Collapse
Affiliation(s)
- Lei Bi
- Institute of Translational Medicine, National Center for Translational Medicine, Shanghai Jiao Tong University, Shanghai, China; School of Computer Science, University of Sydney, NSW, Australia
| | | | - Xiaohang Fu
- School of Computer Science, University of Sydney, NSW, Australia
| | - Tom Williamson
- Stryker Corporation, Kalamazoo, Michigan, USA; Centre for Additive Manufacturing, School of Engineering, RMIT University, VIC, Australia
| | - Peter Choong
- Department of Surgery, University of Melbourne, VIC, Australia
| | - Jinman Kim
- School of Computer Science, University of Sydney, NSW, Australia.
| |
Collapse
|
22
|
Alshmrani GM, Ni Q, Jiang R, Muhammed N. Hyper-Dense_Lung_Seg: Multimodal-Fusion-Based Modified U-Net for Lung Tumour Segmentation Using Multimodality of CT-PET Scans. Diagnostics (Basel) 2023; 13:3481. [PMID: 37998617 PMCID: PMC10670323 DOI: 10.3390/diagnostics13223481] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/06/2023] [Revised: 11/09/2023] [Accepted: 11/15/2023] [Indexed: 11/25/2023] Open
Abstract
Lung cancer is the leading cause of cancer-related deaths globally and among the most commonly diagnosed cancers. The segmentation of lung tumours, treatment evaluation, and tumour stage classification have become significantly more accessible with the advent of PET/CT scans, which make it possible to obtain both functional and anatomic data during a single examination. However, integrating images from different modalities can be time-consuming for medical professionals and remains a challenging task. This challenge arises from several factors, including differences in image acquisition techniques, image resolutions, and the inherent variations in the spectral and temporal data captured by different imaging modalities. Artificial Intelligence (AI) methodologies have shown potential in the automation of image integration and segmentation. To address these challenges, multimodal fusion-based U-Net architectures (early fusion, late fusion, dense fusion, hyper-dense fusion, and hyper-dense VGG16 U-Net) are proposed for lung tumour segmentation. A Dice score of 73% shows that the hyper-dense VGG16 U-Net is superior to the other four proposed models. The proposed method can potentially aid medical professionals in detecting lung cancer at an early stage.
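The early-versus-late fusion distinction mentioned above can be illustrated with a minimal PyTorch sketch: early fusion stacks CT and PET as input channels of one network, while late fusion runs one branch per modality and combines the outputs. The tiny network, sizes, and averaging rule are assumptions for illustration, not the proposed architectures.

```python
import torch
import torch.nn as nn

class TinySegNet(nn.Module):
    """Stand-in for a U-Net branch: small conv stack producing one logit map."""
    def __init__(self, in_ch):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, 1),
        )
    def forward(self, x):
        return self.net(x)

ct, pet = torch.rand(2, 1, 96, 96), torch.rand(2, 1, 96, 96)

# early fusion: stack modalities as channels before a single network
early = TinySegNet(in_ch=2)(torch.cat([ct, pet], dim=1))

# late fusion: one branch per modality, then combine the logits (here: average)
ct_branch, pet_branch = TinySegNet(1), TinySegNet(1)
late = 0.5 * (ct_branch(ct) + pet_branch(pet))
print(early.shape, late.shape)
```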
Collapse
Affiliation(s)
- Goram Mufarah Alshmrani
- School of Computing and Commutations, Lancaster University, Lancaster LA1 4YW, UK; (Q.N.); (R.J.)
- College of Computing and Information Technology, University of Bisha, Bisha 67714, Saudi Arabia
| | - Qiang Ni
- School of Computing and Commutations, Lancaster University, Lancaster LA1 4YW, UK; (Q.N.); (R.J.)
| | - Richard Jiang
- School of Computing and Commutations, Lancaster University, Lancaster LA1 4YW, UK; (Q.N.); (R.J.)
| | - Nada Muhammed
- Computers and Control Engineering Department, Faculty of Engineering, Tanta University, Tanta 31733, Egypt;
| |
Collapse
|
23
|
Zhong Y, Cai C, Chen T, Gui H, Deng J, Yang M, Yu B, Song Y, Wang T, Sun X, Shi J, Chen Y, Xie D, Chen C, She Y. PET/CT based cross-modal deep learning signature to predict occult nodal metastasis in lung cancer. Nat Commun 2023; 14:7513. [PMID: 37980411 PMCID: PMC10657428 DOI: 10.1038/s41467-023-42811-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/25/2023] [Accepted: 10/20/2023] [Indexed: 11/20/2023] Open
Abstract
Occult nodal metastasis (ONM) plays a significant role in comprehensive treatments of non-small cell lung cancer (NSCLC). This study aims to develop a deep learning signature based on positron emission tomography/computed tomography to predict ONM of clinical stage N0 NSCLC. An internal cohort (n = 1911) is included to construct the deep learning nodal metastasis signature (DLNMS). Subsequently, an external cohort (n = 355) and a prospective cohort (n = 999) are utilized to fully validate the predictive performances of the DLNMS. Here, we show areas under the receiver operating characteristic curve of the DLNMS for occult N1 prediction are 0.958, 0.879 and 0.914 in the validation set, external cohort and prospective cohort, respectively, and for occult N2 prediction are 0.942, 0.875 and 0.919, respectively, which are significantly better than the single-modal deep learning models, clinical model and physicians. This study demonstrates that the DLNMS harbors the potential to predict ONM of clinical stage N0 NSCLC.
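The areas under the ROC curve reported above summarize how well predicted probabilities separate metastasis-positive from metastasis-negative cases; a minimal sketch of such an evaluation, on entirely synthetic labels and scores, is shown below.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Synthetic binary labels (occult metastasis yes/no) and model probabilities;
# these numbers are illustrative only, not the study's data.
rng = np.random.default_rng(1)
labels = np.array([0] * 100 + [1] * 100)
probs = np.clip(labels * 0.6 + rng.normal(0.3, 0.2, size=200), 0, 1)
print("AUC:", round(roc_auc_score(labels, probs), 3))
```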
Collapse
Affiliation(s)
- Yifan Zhong
- Department of Thoracic Surgery, Shanghai Pulmonary Hospital, Tongji University School of Medicine, Shanghai, China
| | - Chuang Cai
- School of Computer Science and Communication Engineering, Jiangsu University, Zhenjiang, Jiangsu, China
| | - Tao Chen
- Department of Thoracic Surgery, Shanghai Pulmonary Hospital, Tongji University School of Medicine, Shanghai, China
| | - Hao Gui
- Graduate School at Shenzhen, Tsinghua University, Shenzhen, China
| | - Jiajun Deng
- Department of Thoracic Surgery, Shanghai Pulmonary Hospital, Tongji University School of Medicine, Shanghai, China
| | - Minglei Yang
- Department of Thoracic Surgery, Ningbo HwaMei Hospital, Chinese Academy of Sciences, Zhejiang, China
| | - Bentong Yu
- Department of Thoracic Surgery, The First Affiliated Hospital of Nanchang University, Jiangxi, China
| | - Yongxiang Song
- Department of Thoracic Surgery, Affiliated Hospital of Zunyi Medical University, Guizhou, China
| | - Tingting Wang
- Department of Radiology, Zhongshan Hospital, Fudan University, Shanghai, China
| | - Xiwen Sun
- Department of Radiology, Shanghai Pulmonary Hospital, Tongji University School of Medicine, Shanghai, China
| | - Jingyun Shi
- Department of Radiology, Shanghai Pulmonary Hospital, Tongji University School of Medicine, Shanghai, China
| | - Yangchun Chen
- Department of Nuclear Medicine, Shanghai Pulmonary Hospital, Tongji University School of Medicine, Shanghai, China
| | - Dong Xie
- Department of Thoracic Surgery, Shanghai Pulmonary Hospital, Tongji University School of Medicine, Shanghai, China.
| | - Chang Chen
- Department of Thoracic Surgery, Shanghai Pulmonary Hospital, Tongji University School of Medicine, Shanghai, China.
| | - Yunlang She
- Department of Thoracic Surgery, Shanghai Pulmonary Hospital, Tongji University School of Medicine, Shanghai, China.
| |
Collapse
|
24
|
Li L, Tan J, Yu L, Li C, Nan H, Zheng S. LSAM: L2-norm self-attention and latent space feature interaction for automatic 3D multi-modal head and neck tumor segmentation. Phys Med Biol 2023; 68:225004. [PMID: 37852283 DOI: 10.1088/1361-6560/ad04a8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/11/2023] [Accepted: 10/18/2023] [Indexed: 10/20/2023]
Abstract
Objective. Head and neck (H&N) cancers are prevalent globally, and early and accurate detection is absolutely crucial for timely and effective treatment. However, the segmentation of H&N tumors is challenging due to the similar density of the tumors and surrounding tissues in CT images. Positron emission tomography (PET) images provide information about the metabolic activity of the tissue and can distinguish lesion regions from normal tissue, but they are limited by their low spatial resolution. To fully leverage the complementary information from PET and CT images, we propose a novel multi-modal tumor segmentation method specifically designed for H&N tumor segmentation. Approach. The proposed multi-modal tumor segmentation network (LSAM) consists of two key learning modules, namely L2-norm self-attention and latent space feature interaction, which exploit the high sensitivity of PET images and the anatomical information of CT images. These two modules contribute to a powerful 3D segmentation network based on a U-shaped structure. The segmentation method can integrate complementary features from different modalities at multiple scales, thereby improving the feature interaction between modalities. Main results. We evaluated the proposed method on the public HECKTOR PET-CT dataset, and the experimental results demonstrate that the proposed method convincingly outperforms existing H&N tumor segmentation methods in terms of key evaluation metrics, including DSC (0.8457), Jaccard (0.7756), RVD (0.0938), and HD95 (11.75). Significance. The self-attention mechanism based on the L2-norm offers scalability and is effective in reducing the impact of outliers on model performance, and the multi-scale feature interaction based on the latent space exploits the learning process in the encoder phase to achieve the best complementary effect among modalities.
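One plausible reading of L2-norm self-attention, sketched below as an assumption rather than the paper's exact module, is attention over L2-normalized queries and keys (cosine similarity), which bounds the attention logits and damps the influence of outlier feature vectors.

```python
import torch
import torch.nn.functional as F

def l2norm_self_attention(x, scale=10.0):
    """Self-attention with L2-normalized queries/keys (cosine similarity).
    x: (batch, tokens, dim); here Q = K = V = x for brevity."""
    q = F.normalize(x, p=2, dim=-1)
    k = F.normalize(x, p=2, dim=-1)
    attn = torch.softmax(scale * q @ k.transpose(-2, -1), dim=-1)
    return attn @ x

tokens = torch.rand(2, 196, 64)   # e.g. flattened PET/CT feature patches (toy data)
out = l2norm_self_attention(tokens)
print(out.shape)                  # torch.Size([2, 196, 64])
```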
Collapse
Affiliation(s)
- Laquan Li
- College of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing, People's Republic of China
- School of Science, Chongqing University of Posts and Telecommunications, Chongqing, People's Republic of China
| | - Jiaxin Tan
- College of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing, People's Republic of China
| | - Lei Yu
- Emergency Department, The Second Affiliated Hospital of Chongqing Medical University, Chongqing, People's Republic of China
| | - Chunwen Li
- Emergency Department, The Second Affiliated Hospital of Chongqing Medical University, Chongqing, People's Republic of China
| | - Hai Nan
- College of Computer Science and Engineering, Chongqing University of Technology, Chongqing, People's Republic of China
| | - Shenhai Zheng
- College of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing, People's Republic of China
| |
Collapse
|
25
|
Ye S, Wang T, Ding M, Zhang X. F-DARTS: Foveated Differentiable Architecture Search Based Multimodal Medical Image Fusion. IEEE TRANSACTIONS ON MEDICAL IMAGING 2023; 42:3348-3361. [PMID: 37285248 DOI: 10.1109/tmi.2023.3283517] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/09/2023]
Abstract
Multimodal medical image fusion (MMIF) is highly significant in fields such as disease diagnosis and treatment. Traditional MMIF methods struggle to provide satisfactory fusion accuracy and robustness because of hand-crafted components such as image transforms and fusion strategies. Existing deep learning-based fusion methods also find it difficult to guarantee fusion quality because they adopt human-designed network structures and relatively simple loss functions and neglect human visual characteristics during weight learning. To address these issues, we present an unsupervised MMIF method based on foveated differentiable architecture search (F-DARTS). In this method, a foveation operator is introduced into the weight-learning process to fully exploit human visual characteristics for effective image fusion. Meanwhile, a distinctive unsupervised loss function is designed for network training by integrating mutual information, the sum of the correlations of differences, structural similarity, and an edge preservation value. Based on the presented foveation operator and loss function, an end-to-end encoder-decoder network architecture is searched using F-DARTS to produce the fused image. Experimental results on three multimodal medical image datasets demonstrate that F-DARTS outperforms several traditional and deep learning-based fusion methods, providing visually superior fused results and better objective evaluation metrics.
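To illustrate just one term of such a composite unsupervised fusion loss, the sketch below implements a simple edge-preservation penalty that asks the fused image's gradients to match the stronger of the two source gradients; the finite-difference gradient, the max-of-sources target, and the averaged "fusion" are assumptions, not the F-DARTS loss.

```python
import torch
import torch.nn.functional as F

def gradient_map(img):
    """Finite-difference gradient magnitude, used as a simple edge descriptor."""
    dx = img[..., :, 1:] - img[..., :, :-1]
    dy = img[..., 1:, :] - img[..., :-1, :]
    return F.pad(dx.abs(), (0, 1, 0, 0)) + F.pad(dy.abs(), (0, 0, 0, 1))

def edge_preservation_loss(fused, src_a, src_b):
    """Penalize fused gradients that deviate from the stronger source gradient."""
    target = torch.maximum(gradient_map(src_a), gradient_map(src_b))
    return F.l1_loss(gradient_map(fused), target)

ct, mr = torch.rand(1, 1, 128, 128), torch.rand(1, 1, 128, 128)
fused = 0.5 * (ct + mr)                      # placeholder fusion result
print(edge_preservation_loss(fused, ct, mr).item())
```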
Collapse
|
26
|
He J, Zhang Y, Chung M, Wang M, Wang K, Ma Y, Ding X, Li Q, Pu Y. Whole-body tumor segmentation from PET/CT images using a two-stage cascaded neural network with camouflaged object detection mechanisms. Med Phys 2023; 50:6151-6162. [PMID: 37134002 DOI: 10.1002/mp.16438] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/06/2022] [Revised: 03/25/2023] [Accepted: 04/12/2023] [Indexed: 05/04/2023] Open
Abstract
BACKGROUND Whole-body Metabolic Tumor Volume (MTVwb) is an independent prognostic factor for overall survival in lung cancer patients. Automatic segmentation methods have been proposed for MTV calculation. Nevertheless, most existing methods for patients with lung cancer only segment tumors in the thoracic region. PURPOSE In this paper, we present a Two-Stage cascaded neural network integrated with Camouflaged Object Detection mEchanisms (TS-Code-Net) for automatically segmenting tumors from whole-body PET/CT images. METHODS Firstly, tumors are detected from the Maximum Intensity Projection (MIP) images of PET/CT scans, and tumors' approximate localizations along the z-axis are identified. Secondly, the segmentations are performed on PET/CT slices that contain tumors identified by the first step. Camouflaged object detection mechanisms are utilized to distinguish the tumors from their surrounding regions that have similar Standardized Uptake Values (SUV) and texture appearance. Finally, the TS-Code-Net is trained by minimizing the total loss that incorporates the segmentation accuracy loss and the class imbalance loss. RESULTS The performance of the TS-Code-Net is tested on a whole-body PET/CT image dataset including 480 Non-Small Cell Lung Cancer (NSCLC) patients with five-fold cross-validation using image segmentation metrics. Our method achieves 0.70, 0.76, and 0.70 for Dice, Sensitivity and Precision, respectively, which demonstrates the superiority of the TS-Code-Net over several existing methods related to metastatic lung cancer segmentation from whole-body PET/CT images. CONCLUSIONS The proposed TS-Code-Net is effective for whole-body tumor segmentation of PET/CT images. Codes for TS-Code-Net are available at: https://github.com/zyj19/TS-Code-Net.
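The first-stage use of maximum intensity projections to localize tumors along the z-axis can be sketched in a few lines of NumPy; the toy volume, uptake values, and threshold below are illustrative assumptions, not the paper's pipeline.

```python
import numpy as np

def mip(volume, axis=0):
    """Maximum intensity projection of a PET volume along one axis."""
    return volume.max(axis=axis)

# toy PET volume (depth, height, width) with one bright "lesion"
vol = np.random.rand(64, 128, 128) * 0.2
vol[30:34, 60:70, 60:70] = 5.0                 # hypothetical high-uptake region
coronal_like = mip(vol, axis=1)                # 2D projection along y
per_slice_max = vol.max(axis=(1, 2))           # per-slice maximum along z
z_slices_with_tumor = np.where(per_slice_max > 1.0)[0]
print("MIP shape:", coronal_like.shape)
print("slices flagged for stage-two segmentation:", z_slices_with_tumor)
```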
Collapse
Affiliation(s)
- Jiangping He
- Department of Electronic Engineering, Lanzhou University of Finance and Economics, Lanzhou, Gansu, China
| | - Yangjie Zhang
- Department of Electronic Engineering, Lanzhou University of Finance and Economics, Lanzhou, Gansu, China
| | - Maggie Chung
- Department of Radiology, University of California, San Francisco, California, USA
| | - Michael Wang
- Department of Pathology, University of California, San Francisco, California, USA
| | - Kun Wang
- Department of Electronic Engineering, Lanzhou University of Finance and Economics, Lanzhou, Gansu, China
| | - Yan Ma
- Department of Electronic Engineering, Lanzhou University of Finance and Economics, Lanzhou, Gansu, China
| | - Xiaoyang Ding
- Department of Electronic Engineering, Lanzhou University of Finance and Economics, Lanzhou, Gansu, China
| | - Qiang Li
- Department of Electronic Engineering, Lanzhou University of Finance and Economics, Lanzhou, Gansu, China
| | - Yonglin Pu
- Department of Radiology, University of Chicago, Chicago, Illinois, USA
| |
Collapse
|
27
|
Yu X, He L, Wang Y, Dong Y, Song Y, Yuan Z, Yan Z, Wang W. A deep learning approach for automatic tumor delineation in stereotactic radiotherapy for non-small cell lung cancer using diagnostic PET-CT and planning CT. Front Oncol 2023; 13:1235461. [PMID: 37601687 PMCID: PMC10437048 DOI: 10.3389/fonc.2023.1235461] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/06/2023] [Accepted: 07/10/2023] [Indexed: 08/22/2023] Open
Abstract
Introduction Accurate delineation of tumor targets is crucial for stereotactic body radiation therapy (SBRT) for non-small cell lung cancer (NSCLC). This study aims to develop a deep learning-based segmentation approach to accurately and efficiently delineate NSCLC targets using diagnostic PET-CT and SBRT planning CT (pCT). Methods The diagnostic PET was registered to pCT using the transform matrix from registering diagnostic CT to the pCT. We proposed a 3D-UNet-based segmentation method to segment NSCLC tumor targets on dual-modality PET-pCT images. This network contained squeeze-and-excitation and residual blocks in each convolutional block to perform dynamic channel-wise feature recalibration. Furthermore, up-sampling paths were added to supplement low-resolution features to the model and also to compute the overall loss function. The Dice similarity coefficient (DSC), precision, recall, and the average symmetric surface distance were used to assess the performance of the proposed approach on 86 pairs of diagnostic PET and pCT images. The proposed model using dual-modality images was compared with both conventional 3D-UNet architecture and single-modality image input. Results The average DSC of the proposed model with both PET and pCT images was 0.844, compared to 0.795 and 0.827 when using 3D-UNet and nnU-Net, respectively. It also outperformed using either pCT or PET alone with the same network, which had DSCs of 0.823 and 0.732, respectively. Discussion Our proposed segmentation approach therefore outperforms the current 3D-UNet network when using diagnostic PET and pCT images. The integration of two image modalities helps improve segmentation accuracy.
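The squeeze-and-excitation idea used in each convolutional block for channel-wise recalibration can be shown with a generic 3D SE block; this is the standard formulation, not necessarily the paper's exact configuration, and the reduction ratio and feature sizes are assumptions.

```python
import torch
import torch.nn as nn

class SEBlock3D(nn.Module):
    """Squeeze-and-excitation: global-average-pool each channel, then rescale
    channels with a learned sigmoid gate (channel-wise feature recalibration)."""
    def __init__(self, channels, reduction=4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels), nn.Sigmoid(),
        )
    def forward(self, x):                       # x: (B, C, D, H, W)
        w = x.mean(dim=(2, 3, 4))               # squeeze
        w = self.fc(w).view(x.size(0), x.size(1), 1, 1, 1)
        return x * w                            # excite / recalibrate

feat = torch.rand(2, 32, 16, 32, 32)            # toy PET-pCT feature volume
print(SEBlock3D(32)(feat).shape)
```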
Collapse
Affiliation(s)
- Xuyao Yu
- Department of Radiation Oncology, Tianjin Medical University Cancer Institute and Hospital, National Clinical Research Center for Cancer, Tianjin’s Clinical Research Center for Cancer, Key Laboratory of Cancer Prevention and Therapy, Tianjin, China
- Tianjin Medical University, Tianjin, China
| | - Lian He
- Perception Vision Medical Technologies Co Ltd, Guangzhou, China
| | - Yuwen Wang
- Department of Radiotherapy, Tianjin Cancer Hospital Airport Hospital, Tianjin, China
| | - Yang Dong
- Department of Radiation Oncology, Tianjin Medical University Cancer Institute and Hospital, National Clinical Research Center for Cancer, Tianjin’s Clinical Research Center for Cancer, Key Laboratory of Cancer Prevention and Therapy, Tianjin, China
| | - Yongchun Song
- Department of Radiation Oncology, Tianjin Medical University Cancer Institute and Hospital, National Clinical Research Center for Cancer, Tianjin’s Clinical Research Center for Cancer, Key Laboratory of Cancer Prevention and Therapy, Tianjin, China
| | - Zhiyong Yuan
- Department of Radiation Oncology, Tianjin Medical University Cancer Institute and Hospital, National Clinical Research Center for Cancer, Tianjin’s Clinical Research Center for Cancer, Key Laboratory of Cancer Prevention and Therapy, Tianjin, China
| | - Ziye Yan
- Perception Vision Medical Technologies Co Ltd, Guangzhou, China
| | - Wei Wang
- Department of Radiation Oncology, Tianjin Medical University Cancer Institute and Hospital, National Clinical Research Center for Cancer, Tianjin’s Clinical Research Center for Cancer, Key Laboratory of Cancer Prevention and Therapy, Tianjin, China
| |
Collapse
|
28
|
Wan P, Xue H, Liu C, Chen F, Kong W, Zhang D. Dynamic Perfusion Representation and Aggregation Network for Nodule Segmentation Using Contrast-Enhanced US. IEEE J Biomed Health Inform 2023; 27:3431-3442. [PMID: 37097791 DOI: 10.1109/jbhi.2023.3270307] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 04/26/2023]
Abstract
Dynamic contrast-enhanced ultrasound (CEUS) imaging has been widely applied in lesion detection and characterization because of the real-time observation of microvascular perfusion it offers. Accurate lesion segmentation is of great importance to quantitative and qualitative perfusion analysis. In this paper, we propose a novel dynamic perfusion representation and aggregation network (DpRAN) for the automatic segmentation of lesions using dynamic CEUS imaging. The core challenge of this work lies in modeling the enhancement dynamics of various perfusion areas. Specifically, we divide enhancement features into two scales: short-range enhancement patterns and long-range evolution tendency. To effectively represent real-time enhancement characteristics and aggregate them in a global view, we introduce the perfusion excitation (PE) gate and the cross-attention temporal aggregation (CTA) module, respectively. Unlike common temporal fusion methods, we also introduce an uncertainty estimation strategy to help the model first locate the critical enhancement point, at which a relatively distinctive enhancement pattern is displayed. The segmentation performance of our DpRAN method is validated on our collected CEUS datasets of thyroid nodules. We obtain a mean Dice coefficient (DSC) of 0.794 and an intersection over union (IoU) of 0.676. This superior performance demonstrates its efficacy in capturing distinctive enhancement characteristics for lesion recognition.
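As a hedged illustration of cross-attention temporal aggregation, the sketch below lets a learnable query attend over per-frame CEUS features to produce one global descriptor of the perfusion sequence; the dimensions, head count, and single-token query are assumptions and the CTA module itself may be designed differently.

```python
import torch
import torch.nn as nn

# Aggregate per-frame CEUS features with cross-attention: a learned query attends
# over the temporal sequence, approximating a global view of the perfusion curve.
frames, dim = 24, 64
features = torch.rand(1, frames, dim)           # (batch, time, feature) toy features

attn = nn.MultiheadAttention(embed_dim=dim, num_heads=4, batch_first=True)
query = nn.Parameter(torch.randn(1, 1, dim))    # learnable aggregation token
aggregated, weights = attn(query, features, features)
print(aggregated.shape, weights.shape)          # (1, 1, 64) (1, 1, 24)
```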
Collapse
|
29
|
Zhou T, Cheng Q, Lu H, Li Q, Zhang X, Qiu S. Deep learning methods for medical image fusion: A review. Comput Biol Med 2023; 160:106959. [PMID: 37141652 DOI: 10.1016/j.compbiomed.2023.106959] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/27/2022] [Revised: 04/12/2023] [Accepted: 04/17/2023] [Indexed: 05/06/2023]
Abstract
Image fusion methods based on deep learning have become a research hotspot in computer vision in recent years. This paper reviews these methods from five aspects. First, the principles and advantages of deep learning-based image fusion methods are expounded. Second, the image fusion methods are summarized as end-to-end and non-end-to-end: according to the different tasks of deep learning in the feature-processing stage, non-end-to-end image fusion methods are divided into two categories, deep learning for decision mapping and deep learning for feature extraction; according to the network type, end-to-end image fusion methods are divided into three categories, image fusion methods based on convolutional neural networks, generative adversarial networks, and encoder-decoder networks. Third, the application of deep learning-based image fusion methods in the medical imaging field is summarized in terms of methods and datasets. Fourth, evaluation metrics commonly used in medical image fusion are sorted into 14 aspects. Fifth, the main challenges faced by medical image fusion are discussed with respect to datasets and fusion methods, and future development directions are outlined. This paper systematically summarizes deep learning-based image fusion methods, providing positive guidance for the in-depth study of multimodal medical images.
Collapse
Affiliation(s)
- Tao Zhou
- School of Computer Science and Engineering, North Minzu University, Yinchuan, 750021, China; Key Laboratory of Image and Graphics Intelligent Processing of State Ethnic Affairs Commission, North Minzu University, Yinchuan, 750021, China
| | - QianRu Cheng
- School of Computer Science and Engineering, North Minzu University, Yinchuan, 750021, China; Key Laboratory of Image and Graphics Intelligent Processing of State Ethnic Affairs Commission, North Minzu University, Yinchuan, 750021, China.
| | - HuiLing Lu
- School of Science, Ningxia Medical University, Yinchuan, 750004, China.
| | - Qi Li
- School of Computer Science and Engineering, North Minzu University, Yinchuan, 750021, China; Key Laboratory of Image and Graphics Intelligent Processing of State Ethnic Affairs Commission, North Minzu University, Yinchuan, 750021, China
| | - XiangXiang Zhang
- School of Computer Science and Engineering, North Minzu University, Yinchuan, 750021, China; Key Laboratory of Image and Graphics Intelligent Processing of State Ethnic Affairs Commission, North Minzu University, Yinchuan, 750021, China
| | - Shi Qiu
- Key Laboratory of Spectral Imaging Technology CAS, Xi'an Institute of Optics and Precision Mechanics, Chinese Academy of Sciences, Xi'an, 710119, China
| |
Collapse
|
30
|
Ding Z, Li H, Guo Y, Zhou D, Liu Y, Xie S. M 4FNet: Multimodal medical image fusion network via multi-receptive-field and multi-scale feature integration. Comput Biol Med 2023; 159:106923. [PMID: 37075601 DOI: 10.1016/j.compbiomed.2023.106923] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/05/2022] [Revised: 04/03/2023] [Accepted: 04/13/2023] [Indexed: 04/21/2023]
Abstract
The main purpose of multimodal medical image fusion is to aggregate the significant information from different modalities and obtain an informative image, which provides comprehensive content and may help to boost other image-processing tasks. Many existing methods based on deep learning neglect the extraction and retention of multi-scale features of medical images and the construction of long-distance relationships between depth feature blocks. Therefore, a robust multimodal medical image fusion network via multi-receptive-field and multi-scale features (M4FNet) is proposed to preserve detailed textures and highlight structural characteristics. Specifically, dual-branch dense hybrid dilated convolution blocks (DHDCB) are proposed to extract depth features from the modalities by expanding the receptive field of the convolution kernel as well as reusing features, and to establish long-range dependencies. In order to make full use of the semantic features of the source images, the depth features are decomposed into a multi-scale domain by combining the 2-D scale function and wavelet function. Subsequently, the down-sampled depth features are fused by the proposed attention-aware fusion strategy and inverse-transformed back to a feature space the same size as the source images. Ultimately, the fusion result is reconstructed by a deconvolution block. To force the fusion network to balance information preservation, a local-standard-deviation-driven structural similarity is proposed as the loss function. Extensive experiments prove that the proposed fusion network outperforms six state-of-the-art methods, with gains of about 12.8%, 4.1%, 8.5%, and 9.7% in SD, MI, QABF, and QEP, respectively.
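A dense hybrid dilated convolution block can be sketched as stacked dilated convolutions whose inputs are the concatenation of all earlier features, enlarging the receptive field without extra downsampling; the growth rate, dilation rates, and 2D setting below are assumptions for illustration, not the DHDCB's exact design.

```python
import torch
import torch.nn as nn

class HybridDilatedBlock(nn.Module):
    """Stacked dilated convolutions with dense (concatenation) reuse of features."""
    def __init__(self, in_ch, growth=16, dilations=(1, 2, 4)):
        super().__init__()
        self.layers = nn.ModuleList()
        ch = in_ch
        for d in dilations:
            self.layers.append(nn.Sequential(
                nn.Conv2d(ch, growth, 3, padding=d, dilation=d),
                nn.ReLU(inplace=True)))
            ch += growth                      # dense connectivity: inputs accumulate
        self.out_channels = ch
    def forward(self, x):
        feats = [x]
        for layer in self.layers:
            feats.append(layer(torch.cat(feats, dim=1)))
        return torch.cat(feats, dim=1)

x = torch.rand(1, 2, 64, 64)                  # e.g. stacked MR/CT slices (toy data)
block = HybridDilatedBlock(in_ch=2)
print(block(x).shape, block.out_channels)     # torch.Size([1, 50, 64, 64]) 50
```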
Collapse
Affiliation(s)
- Zhaisheng Ding
- School of Information and Artificial Intelligence, Yunnan University, Kunming, 650504, China
| | - Haiyan Li
- School of Information and Artificial Intelligence, Yunnan University, Kunming, 650504, China.
| | - Yi Guo
- School of Information and Artificial Intelligence, Yunnan University, Kunming, 650504, China
| | - Dongming Zhou
- School of Information and Artificial Intelligence, Yunnan University, Kunming, 650504, China
| | - Yanyu Liu
- School of Information and Artificial Intelligence, Yunnan University, Kunming, 650504, China
| | - Shidong Xie
- School of Information and Artificial Intelligence, Yunnan University, Kunming, 650504, China
| |
Collapse
|
31
|
Wang F, Cheng C, Cao W, Wu Z, Wang H, Wei W, Yan Z, Liu Z. MFCNet: A multi-modal fusion and calibration networks for 3D pancreas tumor segmentation on PET-CT images. Comput Biol Med 2023; 155:106657. [PMID: 36791551 DOI: 10.1016/j.compbiomed.2023.106657] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/24/2022] [Revised: 01/29/2023] [Accepted: 02/09/2023] [Indexed: 02/12/2023]
Abstract
In clinical diagnosis, positron emission tomography and computed tomography (PET-CT) images containing complementary information are fused. Tumor segmentation based on multi-modal PET-CT images is an important part of clinical diagnosis and treatment. However, existing PET-CT tumor segmentation methods mainly focus on positron emission tomography (PET) and computed tomography (CT) feature fusion, which weakens the specificity of each modality. In addition, the information interaction between different modal images is usually completed by simple addition or concatenation operations, which has the disadvantage of introducing irrelevant information during multi-modal semantic feature fusion, so effective features cannot be highlighted. To overcome this problem, this paper proposes a novel Multi-modal Fusion and Calibration Network (MFCNet) for tumor segmentation based on three-dimensional PET-CT images. First, a Multi-modal Fusion Down-sampling Block (MFDB) with a residual structure is developed. The proposed MFDB can fuse complementary features of multi-modal images while retaining the unique features of different modal images. Second, a Multi-modal Mutual Calibration Block (MMCB) based on the inception structure is designed. The MMCB can guide the network to focus on the tumor region by combining different branch decoding features using the attention mechanism and extracting multi-scale pathological features using convolution kernels of different sizes. The proposed MFCNet is verified on both the public dataset (head and neck cancer) and the in-house dataset (pancreatic cancer). The experimental results indicate that on the public and in-house datasets, the average Dice values of the proposed multi-modal segmentation network are 74.14% and 76.20%, while the average Hausdorff distances are 6.41 and 6.84, respectively. In addition, the experimental results show that the proposed MFCNet outperforms the state-of-the-art methods on the two datasets.
Collapse
Affiliation(s)
- Fei Wang
- Institute of Biomedical Engineering, School of Communication and Information Engineering, Shanghai University, Shanghai, 200444, China; Department of Medical Imaging, Suzhou Institute of Biomedical Engineering and Technology, Chinese Academy of Sciences, Suzhou, 215163, China
| | - Chao Cheng
- Department of Nuclear Medicine, The First Affiliated Hospital of Naval Medical University(Changhai Hospital), Shanghai, 200433, China
| | - Weiwei Cao
- Department of Medical Imaging, Suzhou Institute of Biomedical Engineering and Technology, Chinese Academy of Sciences, Suzhou, 215163, China
| | - Zhongyi Wu
- Department of Medical Imaging, Suzhou Institute of Biomedical Engineering and Technology, Chinese Academy of Sciences, Suzhou, 215163, China
| | - Heng Wang
- School of Electronic and Information Engineering, Changchun University of Science and Technology, Changchun, 130022, China
| | - Wenting Wei
- School of Electronic and Information Engineering, Changchun University of Science and Technology, Changchun, 130022, China
| | - Zhuangzhi Yan
- Institute of Biomedical Engineering, School of Communication and Information Engineering, Shanghai University, Shanghai, 200444, China.
| | - Zhaobang Liu
- Department of Medical Imaging, Suzhou Institute of Biomedical Engineering and Technology, Chinese Academy of Sciences, Suzhou, 215163, China.
| |
Collapse
|
32
|
Xu W, Bian Y, Lu Y, Meng Q, Zhu W, Shi F, Chen X, Shao C, Xiang D. Semi-supervised interactive fusion network for MR image segmentation. Med Phys 2023; 50:1586-1600. [PMID: 36345139 DOI: 10.1002/mp.16072] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/13/2022] [Revised: 10/06/2022] [Accepted: 10/15/2022] [Indexed: 11/11/2022] Open
Abstract
BACKGROUND Medical image segmentation is an important task in the diagnosis and treatment of cancers. Low contrast and highly flexible anatomical structures make it challenging to accurately segment the organs or lesions. PURPOSE To improve the segmentation accuracy of the organs or lesions in magnetic resonance (MR) images, which can be useful in the clinical diagnosis and treatment of cancers. METHODS First, a selective feature interaction (SFI) module is designed to selectively extract the similar features of the sequence images based on the similarity interaction. Second, a multi-scale guided feature reconstruction (MGFR) module is designed to reconstruct low-level semantic features and focus on small targets and the edges of the pancreas. Third, to reduce the manual annotation of large amounts of data, a semi-supervised training method is also proposed. Uncertainty estimation is used to further improve the segmentation accuracy. RESULTS Three hundred ninety-five 3D MR images from 395 patients with pancreatic cancer, 259 3D MR images from 259 patients with brain tumors, and a four-fold cross-validation strategy are used to evaluate the proposed method. Compared to state-of-the-art deep learning segmentation networks, the proposed method can achieve better segmentation of the pancreas or tumors in MR images. CONCLUSIONS SFI-Net can fuse dual-sequence MR images for abnormal pancreas or tumor segmentation. The proposed semi-supervised strategy can further improve the performance of SFI-Net.
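The abstract does not specify how uncertainty is estimated; one common option, sketched below purely as an assumption, is Monte Carlo dropout, where the per-voxel variance across stochastic forward passes is used to keep only confident pseudo-labels during semi-supervised training. The tiny model, thresholds, and pass count are illustrative.

```python
import torch
import torch.nn as nn

# Monte Carlo dropout as a simple uncertainty estimate: keep dropout active at
# inference, run several stochastic passes, and use the per-voxel variance to
# down-weight unreliable pseudo-labels on unlabeled scans.
model = nn.Sequential(
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.Dropout2d(0.3),
    nn.Conv2d(16, 1, 1), nn.Sigmoid(),
)

def mc_predict(model, x, passes=8):
    model.train()                              # keep dropout stochastic
    with torch.no_grad():
        preds = torch.stack([model(x) for _ in range(passes)], dim=0)
    return preds.mean(0), preds.var(0)

x = torch.rand(1, 1, 96, 96)                   # unlabeled MR slice (toy data)
mean_prob, uncertainty = mc_predict(model, x)
pseudo_mask = (mean_prob > 0.5) & (uncertainty < 0.02)   # trust only confident voxels
print(pseudo_mask.float().mean().item())
```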
Collapse
Affiliation(s)
- Wenxuan Xu
- School of Electronic and Information Engineering, Soochow University, Jiangsu, China
| | - Yun Bian
- Department of Radiology, Changhai Hospital, The Navy Military Medical University, Shanghai, China
| | - Yuxuan Lu
- School of Electronic and Information Engineering, Soochow University, Jiangsu, China
| | - Qingquan Meng
- School of Electronic and Information Engineering, Soochow University, Jiangsu, China
| | - Weifang Zhu
- School of Electronic and Information Engineering, Soochow University, Jiangsu, China
| | - Fei Shi
- School of Electronic and Information Engineering, Soochow University, Jiangsu, China
| | - Xinjian Chen
- School of Electronic and Information Engineering, Soochow University, Jiangsu, China
| | - Chengwei Shao
- Department of Radiology, Changhai Hospital, The Navy Military Medical University, Shanghai, China
| | - Dehui Xiang
- School of Electronic and Information Engineering, Soochow University, Jiangsu, China
| |
Collapse
|
33
|
Fan C, Lin H, Qiu Y, Yang L. DAGM-fusion: A dual-path CT-MRI image fusion model based multi-axial gated MLP. Comput Biol Med 2023; 155:106620. [PMID: 36774887 DOI: 10.1016/j.compbiomed.2023.106620] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/03/2022] [Revised: 12/28/2022] [Accepted: 01/28/2023] [Indexed: 02/04/2023]
Abstract
Medical imaging technology provides a good understanding of human tissue structure. MRI provides high-resolution soft tissue information, and CT provides high-quality bone density information. By creating CT-MRI fusion images of complex diagnostic situations, experts can develop diagnoses and treatment plans more quickly and precisely. We propose a dual-path CT-MRI image fusion model based on a multi-axial gated MLP to create high-quality CT-MRI fusion images. The model employs the feature fusion module SFT-block to effectively integrate detailed Local-Path information under the guidance of global Global-Path information. The fusion is completed through triple constraints, namely global constraints, local constraints, and overall constraints. We design a multi-axial gated MLP module (Ag-MLP). The multi-axial structure keeps the computational complexity linear and increases the MLP's inductive bias, allowing the MLP to work in shallower or pixel-level small-dataset tasks. Ag-MLP and CNN are combined in the network so that the model captures both global and local information. In addition, we design a loss calculation method based on image patches that adaptively generates weights for each patch based on image pixel intensity. Image details are efficiently enhanced when the patch loss is used. Extensive experiments demonstrate that the results of our model are superior to those of the latest mainstream fusion models and are more in accordance with actual clinical diagnostic standards. The ablation studies successfully validate the performance of the model's constituent parts. It is worth mentioning that the model also generalizes well to other modal image fusion tasks.
Collapse
Affiliation(s)
- Chao Fan
- School of Artificial Intelligence and Big Data, Henan University of Technology. Zhengzhou, Henan, China; Key Laboratory of Grain Information Processing and Control, Ministry of Education, Zhengzhou, Henan, China
| | - Hao Lin
- School of Information Science and Engineering, Henan University of Technology, Zhengzhou, Henan, China.
| | - Yingying Qiu
- School of Information Science and Engineering, Henan University of Technology, Zhengzhou, Henan, China
| | - Litao Yang
- School of Information Science and Engineering, Henan University of Technology, Zhengzhou, Henan, China
| |
Collapse
|
34
|
Chen M, Chen Z, Xi Y, Qiao X, Chen X, Huang Q. Multimodal Fusion Network for Detecting Hyperplastic Parathyroid Glands in SPECT/CT Images. IEEE J Biomed Health Inform 2023; 27:1524-1534. [PMID: 37015701 DOI: 10.1109/jbhi.2022.3228603] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/15/2022]
Abstract
In secondary hyperparathyroidism (SHPT) disease, preoperatively localizing hyperplastic parathyroid glands is crucial in the surgical procedure. These glands can be detected via the dual-modality imaging technique single-photon emission computed tomography/computed tomography (SPECT/CT), since it has high sensitivity and provides an accurate location. However, due to possible low-uptake glands in SPECT images, manually labeling glands is challenging, not to mention automatic labeling methods. In this work, we present a deep learning method with a novel fusion network to detect hyperplastic parathyroid glands in SPECT/CT images. Our proposed fusion network follows the convolutional neural network (CNN) with a three-pathway architecture that extracts modality-specific feature maps. The fusion network, composed of the channel attention module, the feature selection module, and the modality-specific spatial attention module, is designed to integrate complementary anatomical and functional information, especially for low-uptake glands. Experiments with patient data show that our fusion method improves performance in discerning low-uptake glands compared with current fusion strategies, achieving an average sensitivity of 0.822. Our results prove the effectiveness of the three-pathway architecture with our proposed fusion network for solving the gland detection task. To our knowledge, this is the first study to detect abnormal parathyroid glands in SHPT disease using SPECT/CT images, which promotes the application of preoperative gland localization.
Collapse
|
35
|
Zhu X, Jiang H, Diao Z. CGBO-Net: Cruciform structure guided and boundary-optimized lymphoma segmentation network. Comput Biol Med 2023; 153:106534. [PMID: 36608464 DOI: 10.1016/j.compbiomed.2022.106534] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/17/2022] [Revised: 12/27/2022] [Accepted: 12/31/2022] [Indexed: 01/05/2023]
Abstract
Lymphoma segmentation plays an important role in the diagnosis and treatment of lymphocytic tumors. Most existing automatic segmentation methods struggle to give a precise tumor boundary and location. Semi-automatic methods are usually combined with manually added features such as bounding boxes or points to locate the tumor. Inspired by this, we propose a cruciform structure guided and boundary-optimized lymphoma segmentation network (CGBO-Net). The method uses a cruciform structure extracted from PET images as an additional input to the network, while using a boundary gradient loss function to optimize the boundary of the tumor. Our method is divided into two main stages: in the first stage, we use the proposed axial context-based cruciform structure extraction (CCE) method to extract the cruciform structures of all tumor slices; in the second stage, we use PET/CT and the corresponding cruciform structure as input to the designed network (CGBO-Net) to extract tumor structure and boundary information. The Dice, Precision, Recall, IoU and RVD are 90.7%, 89.4%, 92.5%, 83.1% and 4.5%, respectively. Validated on the lymphoma dataset and publicly available head and neck data, our proposed approach outperforms other state-of-the-art semi-automatic segmentation methods and produces promising segmentation results.
Collapse
Affiliation(s)
- Xiaolin Zhu
- Northeastern University, No. 195, Chuangxin Road, Hunnan District, Shenyang, 110169, Liaoning, China
| | - Huiyan Jiang
- Software College, Northeastern University, No. 195, Chuangxin Road, Hunnan District, Shenyang, 110169, Liaoning, China; Key Laboratory of Intelligent Computing in Medical Image, Ministry of Education, Northeastern University, No. 195, Chuangxin Road, Hunnan District, Shenyang, 110169, Liaoning, China.
| | - Zhaoshuo Diao
- Software College, Northeastern University, No. 195, Chuangxin Road, Hunnan District, Shenyang, 110169, Liaoning, China
| |
Collapse
|
36
|
PET-guided attention for prediction of microvascular invasion in preoperative hepatocellular carcinoma on PET/CT. Ann Nucl Med 2023; 37:238-245. [PMID: 36723705 DOI: 10.1007/s12149-023-01822-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/29/2022] [Accepted: 01/23/2023] [Indexed: 02/02/2023]
Abstract
PURPOSE To achieve PET/CT-based preoperative prediction of microvascular invasion in hepatocellular carcinoma by combining the advantages of PET and CT. METHODS This retrospective study included a total of 100 patients from two institutions who underwent PET/CT imaging. These patients were divided into a training cohort (n = 70) and a validation cohort (n = 30). PET/CT images were used to evaluate the likelihood of microvascular invasion (MVI) in each patient. We proposed a two-branch PET-guided attention network to predict MVI, in which the two branches extract image features from PET and CT, respectively. The PET-guided attention module is designed to enable the model to focus on the lesion region and reduce the disturbance of irrelevant and redundant information. Model performance was evaluated by the area under the curve (AUC) of the receiver operating characteristic (ROC) curve. RESULTS The method outperformed single-modality prediction models for preoperative prediction of hepatocellular microvascular invasion, achieving an AUC of 0.907. On the validation set, accuracy reached 0.846, precision 0.881, recall 0.793, and F1-score 0.835. CONCLUSION The model exploits the molecular-metabolic information of PET and the anatomical structure of CT and can substantially improve the accuracy of the clinical diagnosis of MVI.
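A PET-guided attention step can be sketched as a spatial gate derived from PET features that rescales the CT feature maps; the module below is a generic illustration under assumed channel counts and a residual gating choice, not the paper's exact design.

```python
import torch
import torch.nn as nn

class PETGuidedGate(nn.Module):
    """Gate CT features with a spatial attention map derived from PET features,
    so metabolically active regions steer the anatomical branch."""
    def __init__(self, pet_ch, ct_ch):
        super().__init__()
        self.to_attn = nn.Sequential(nn.Conv2d(pet_ch, 1, 1), nn.Sigmoid())
        self.proj = nn.Conv2d(ct_ch, ct_ch, 1)
    def forward(self, pet_feat, ct_feat):
        attn = self.to_attn(pet_feat)                 # (B, 1, H, W) in [0, 1]
        return self.proj(ct_feat) * attn + ct_feat    # residual gating

pet_feat, ct_feat = torch.rand(1, 32, 28, 28), torch.rand(1, 64, 28, 28)
print(PETGuidedGate(32, 64)(pet_feat, ct_feat).shape)
```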
Collapse
|
37
|
Wang S, Mahon R, Weiss E, Jan N, Taylor RJ, McDonagh PR, Quinn B, Yuan L. Automated Lung Cancer Segmentation Using a PET and CT Dual-Modality Deep Learning Neural Network. Int J Radiat Oncol Biol Phys 2023; 115:529-539. [PMID: 35934160 DOI: 10.1016/j.ijrobp.2022.07.2312] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/05/2021] [Revised: 06/16/2022] [Accepted: 07/28/2022] [Indexed: 01/11/2023]
Abstract
PURPOSE To develop an automated lung tumor segmentation method for radiation therapy planning based on deep learning and dual-modality positron emission tomography (PET) and computed tomography (CT) images. METHODS AND MATERIALS A 3-dimensional (3D) convolutional neural network using inputs from diagnostic PETs and simulation CTs was constructed with 2 parallel convolution paths for independent feature extraction at multiple resolution levels and a single deconvolution path. At each resolution level, the extracted features from the convolution arms were concatenated and fed through the skip connections into the deconvolution path that produced the tumor segmentation. Our network was trained/validated/tested by a 3:1:1 split on 290 pairs of PET and CT images from patients with lung cancer treated at our clinic, with manual physician contours as the ground truth. A stratified training strategy based on the magnitude of the gross tumor volume (GTV) was investigated to improve performance, especially for small tumors. Multiple radiation oncologists assessed the clinical acceptability of the network-produced segmentations. RESULTS The mean Dice similarity coefficient, Hausdorff distance, and bidirectional local distance comparing manual versus automated contours were 0.79 ± 0.10, 5.8 ± 3.2 mm, and 2.8 ± 1.5 mm for the unstratified 3D dual-modality model. Stratification delivered the best results when the model for the large GTVs (>25 mL) was trained with all-size GTVs and the model for the small GTVs (<25 mL) was trained with small GTVs only. The best combined Dice similarity coefficient, Hausdorff distance, and bidirectional local distance from the 2 stratified models on their corresponding test data sets were 0.83 ± 0.07, 5.9 ± 2.5 mm, and 2.8 ± 1.4 mm, respectively. In the multiobserver review, 91.25% of manual versus 88.75% of automatic contours were accepted or accepted with modifications. CONCLUSIONS By using an expansive clinical PET and CT image database and a dual-modality architecture, the proposed 3D network with a novel GTV-based stratification strategy generated clinically useful lung cancer contours that were highly acceptable on physician review.
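The volume-based stratification can be sketched as simple routing of each case to a "small-GTV" or "large-GTV" model by thresholding tumor volume at 25 mL; the voxel size, mask, and routing helper below are hypothetical and only illustrate the bookkeeping, not the training strategy itself.

```python
import numpy as np

VOXEL_ML = 0.001          # assumed 1 x 1 x 1 mm voxels, i.e. 1 mm^3 = 0.001 mL

def gtv_volume_ml(mask):
    """Tumor volume in mL from a binary GTV mask."""
    return mask.sum() * VOXEL_ML

def route_case(mask, small_model, large_model, threshold_ml=25.0):
    """Send a case to the small- or large-GTV model based on its volume."""
    return small_model if gtv_volume_ml(mask) < threshold_ml else large_model

mask = np.zeros((64, 64, 64), dtype=np.uint8)
mask[20:40, 20:40, 20:40] = 1                      # toy GTV of 8000 voxels = 8 mL
print(route_case(mask, "small-GTV model", "large-GTV model"))
```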
Collapse
Affiliation(s)
- Siqiu Wang
- Department of Radiation Oncology, Virginia Commonwealth University, Richmond, Virginia
| | - Rebecca Mahon
- Washington University School of Medicine in St Louis, St Louis, Missouri
| | - Elisabeth Weiss
- Department of Radiation Oncology, Virginia Commonwealth University, Richmond, Virginia
| | - Nuzhat Jan
- Department of Radiation Oncology, Virginia Commonwealth University, Richmond, Virginia
| | - Ross James Taylor
- Department of Radiation Oncology, Virginia Commonwealth University, Richmond, Virginia
| | - Philip Reed McDonagh
- Department of Radiation Oncology, Virginia Commonwealth University, Richmond, Virginia
| | - Bridget Quinn
- Department of Radiation Oncology, Virginia Commonwealth University, Richmond, Virginia
| | - Lulin Yuan
- Department of Radiation Oncology, Virginia Commonwealth University, Richmond, Virginia.
| |
Collapse
|
38
|
Zhou Y, Jiang H, Diao Z, Tong G, Luan Q, Li Y, Li X. MRLA-Net: A tumor segmentation network embedded with a multiple receptive-field lesion attention module in PET-CT images. Comput Biol Med 2023; 153:106538. [PMID: 36646023 DOI: 10.1016/j.compbiomed.2023.106538] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/06/2022] [Revised: 12/14/2022] [Accepted: 01/10/2023] [Indexed: 01/13/2023]
Abstract
Tumor image segmentation is an important basis for diagnosis and treatment planning. PET-CT is a key technology for assessing the systemic extent of disease because the two modalities provide complementary information. However, current PET-CT tumor segmentation methods generally focus on fusing PET and CT features, and such fusion can weaken the characteristics of each modality itself. Enhancing the modality-specific features of lesions therefore yields better feature sets and is essential for improving segmentation results. This paper proposes an attention module that integrates the PET-CT diagnostic visual field with the modality characteristics of the lesion: the multiple receptive-field lesion attention module. It makes full use of spatial-domain, frequency-domain, and channel attention and combines a large receptive-field lesion localization module with a small receptive-field lesion enhancement module. In addition, a segmentation network embedded with the multiple receptive-field lesion attention module is proposed. Experiments were conducted on a private liver tumor dataset and two publicly available datasets, the soft tissue sarcoma dataset and the head and neck tumor segmentation dataset. The proposed method achieves excellent performance on all datasets and improves on DenseUNet by 7.25%, 6.5%, and 5.29% in Dice per case on the three PET/CT datasets, respectively. Compared with the latest PET-CT liver tumor segmentation research, the proposed method improves Dice by 8.32%.
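As a loose illustration of combining attention branches with different receptive fields, the sketch below pairs a dilated (large receptive-field) spatial branch with a small-kernel branch and a squeeze-and-excitation style channel branch. It is our own simplified reading of the idea under stated assumptions: the frequency-domain attention mentioned in the abstract is omitted, and all kernel sizes and names are illustrative, not the MRLA-Net implementation.

# Illustrative sketch (not the MRLA-Net code) of a "multiple receptive-field" attention idea.
import torch
import torch.nn as nn

class MultiReceptiveFieldAttention(nn.Module):
    def __init__(self, channels):
        super().__init__()
        # Large receptive field via dilation for coarse lesion localization.
        self.large_rf = nn.Sequential(
            nn.Conv2d(channels, 1, 3, padding=4, dilation=4), nn.Sigmoid()
        )
        # Small receptive field for fine lesion enhancement.
        self.small_rf = nn.Sequential(nn.Conv2d(channels, 1, 3, padding=1), nn.Sigmoid())
        # Channel attention (squeeze-and-excitation style).
        self.channel = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // 4, 1), nn.ReLU(inplace=True),
            nn.Conv2d(channels // 4, channels, 1), nn.Sigmoid(),
        )

    def forward(self, x):
        spatial = self.large_rf(x) * self.small_rf(x)   # combined spatial attention map
        return x * spatial * self.channel(x)            # re-weight lesion-relevant features

if __name__ == "__main__":
    feats = torch.randn(2, 32, 64, 64)   # e.g. PET or CT feature maps
    print(MultiReceptiveFieldAttention(32)(feats).shape)  # torch.Size([2, 32, 64, 64])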
Collapse
Affiliation(s)
- Yang Zhou
- Department of Software College, Northeastern University, Shenyang 110819, China
| | - Huiyan Jiang
- Department of Software College, Northeastern University, Shenyang 110819, China.
| | - Zhaoshuo Diao
- Department of Software College, Northeastern University, Shenyang 110819, China
| | - Guoyu Tong
- Department of Software College, Northeastern University, Shenyang 110819, China
| | - Qiu Luan
- Department of Nuclear Medicine, The First Affiliated Hospital of China Medical University, Shenyang 110001, China
| | - Yaming Li
- Department of Nuclear Medicine, The First Affiliated Hospital of China Medical University, Shenyang 110001, China
| | - Xuena Li
- Department of Nuclear Medicine, The First Affiliated Hospital of China Medical University, Shenyang 110001, China.
| |
Collapse
|
39
|
Zheng S, Tan J, Jiang C, Li L. Automated multi-modal Transformer network (AMTNet) for 3D medical images segmentation. Phys Med Biol 2023; 68:025014. [PMID: 36595252 DOI: 10.1088/1361-6560/aca74c] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/21/2022] [Accepted: 11/29/2022] [Indexed: 12/05/2022]
Abstract
Objective. Over the past years, convolutional neural network-based methods have dominated the field of medical image segmentation, but their main drawback is difficulty in representing long-range dependencies. Recently, the Transformer has demonstrated strong performance in computer vision and has also been successfully applied to medical image segmentation thanks to its self-attention mechanism and long-range dependency encoding on images. To the best of our knowledge, only a few works focus on cross-modality image segmentation using the Transformer. Hence, the main objective of this study was to design, propose, and validate a deep learning method that extends the Transformer to multi-modality medical image segmentation. Approach. This paper proposes a novel automated multi-modal Transformer network termed AMTNet for 3D medical image segmentation. The network is a U-shaped architecture with substantial changes to the feature encoding, fusion, and decoding parts. The encoding part comprises 3D embedding, 3D multi-modal Transformer, and 3D co-learn down-sampling blocks; symmetrically, the decoding part includes 3D Transformer, up-sampling, and 3D expanding blocks. In addition, an adaptive channel-interleaved Transformer feature fusion module is designed to fully fuse features of different modalities. Main results. We provide a comprehensive experimental analysis of the Prostate and BraTS2021 datasets. Our method achieves average DSCs of 0.907 and 0.851 (0.734 for ET, 0.895 for TC, and 0.924 for WT) on these two datasets, respectively, showing that AMTNet yields significant improvements over state-of-the-art segmentation networks. Significance. The proposed 3D segmentation network exploits complementary features of different modalities during multi-scale feature extraction to enrich 3D feature representations and improve segmentation efficiency, and it extends Transformer research to multi-modal medical image segmentation.
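The general mechanism behind Transformer-based multi-modal fusion, each modality attending to the other so that fused tokens carry cross-modal context, can be sketched with standard multi-head attention. The code below is a generic cross-attention fusion block under our own assumptions (token counts, embedding size, and structure are illustrative and do not reproduce AMTNet's channel-interleaved fusion module).

# Rough sketch of Transformer-based fusion of two modality feature sequences.
import torch
import torch.nn as nn

class CrossModalFusion(nn.Module):
    def __init__(self, dim=64, heads=4):
        super().__init__()
        self.attn_ab = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.attn_ba = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)
        self.mix = nn.Linear(2 * dim, dim)

    def forward(self, feats_a, feats_b):
        # Each modality queries the other, so fused tokens carry cross-modal context.
        a2b, _ = self.attn_ab(feats_a, feats_b, feats_b)
        b2a, _ = self.attn_ba(feats_b, feats_a, feats_a)
        fused = torch.cat([self.norm(feats_a + a2b), self.norm(feats_b + b2a)], dim=-1)
        return self.mix(fused)   # (batch, tokens, dim) fused representation

if __name__ == "__main__":
    t1 = torch.randn(2, 128, 64)   # e.g. flattened patch tokens of modality 1
    t2 = torch.randn(2, 128, 64)   # e.g. flattened patch tokens of modality 2
    print(CrossModalFusion()(t1, t2).shape)  # torch.Size([2, 128, 64])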
Collapse
Affiliation(s)
- Shenhai Zheng
- College of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing 400065, People's Republic of China
- Chongqing Key Laboratory of Image Cognition, Chongqing University of Posts and Telecommunications, Chongqing 400065, People's Republic of China
| | - Jiaxin Tan
- College of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing 400065, People's Republic of China
| | - Chuangbo Jiang
- School of Science, Chongqing University of Posts and Telecommunications, Chongqing 400065, People's Republic of China
| | - Laquan Li
- College of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing 400065, People's Republic of China
- School of Science, Chongqing University of Posts and Telecommunications, Chongqing 400065, People's Republic of China
| |
Collapse
|
40
|
Luo S, Jiang H, Wang M. C2BA-UNet: A context-coordination multi-atlas boundary-aware UNet-like method for PET/CT images based tumor segmentation. Comput Med Imaging Graph 2023; 103:102159. [PMID: 36549193 DOI: 10.1016/j.compmedimag.2022.102159] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/10/2022] [Revised: 11/11/2022] [Accepted: 12/05/2022] [Indexed: 12/13/2022]
Abstract
Tumor segmentation is a necessary step in clinical processing that can help doctors diagnose tumors and plan surgical treatments. Because tumors are usually small, their locations and appearances vary substantially across individuals, and the contrast between tumors and adjacent normal tissues is low, tumor segmentation remains a challenging task. Although convolutional neural networks (CNNs) have achieved good results in tumor segmentation, information about tumor boundaries has rarely been explored. To address this, this paper proposes a new method for automatic tumor segmentation in PET/CT images based on context coordination and boundary awareness, termed C2BA-UNet. We employ a UNet-like backbone network and replace the encoder with EfficientNet-B0 for efficiency. To acquire potential tumor boundaries, we propose a new multi-atlas boundary-aware (MABA) module based on a gradient atlas, an uncertainty atlas, and a level-set atlas, which focuses on uncertain regions between tumors and adjacent tissues. Furthermore, we propose a new context coordination module (CCM) that combines multi-scale context information with an attention mechanism to optimize the skip connections in high-level layers. To validate the superiority of our method, we conduct experiments on a publicly available soft tissue sarcoma (STS) dataset and a lymphoma dataset, and the results show that our method is competitive with other comparison methods.
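To give a flavor of the kinds of "atlases" such a boundary-aware module might draw on, the toy NumPy sketch below derives a gradient map (edge strength) and an uncertainty map (high where a predicted tumor probability is near 0.5) from a probability map. This is a conceptual stand-in under our own assumptions and does not reproduce the paper's MABA module or its level-set atlas.

# Toy illustration of gradient and uncertainty "atlases" from a tumor probability map.
import numpy as np

def boundary_atlases(prob_map: np.ndarray):
    # Gradient atlas: finite-difference gradient magnitude of the probability map.
    gy, gx = np.gradient(prob_map)
    gradient_atlas = np.sqrt(gx ** 2 + gy ** 2)
    # Uncertainty atlas: binary entropy of the per-pixel probability,
    # largest where prob is close to 0.5 (ambiguous tumor/tissue boundary).
    p = np.clip(prob_map, 1e-6, 1 - 1e-6)
    uncertainty_atlas = -(p * np.log(p) + (1 - p) * np.log(1 - p))
    return gradient_atlas, uncertainty_atlas

if __name__ == "__main__":
    yy, xx = np.mgrid[0:64, 0:64]
    # Synthetic soft "tumor" probability: a smooth disk of radius ~15 pixels.
    prob = 1.0 / (1.0 + np.exp(((xx - 32) ** 2 + (yy - 32) ** 2) ** 0.5 - 15))
    grad, unc = boundary_atlases(prob)
    print(grad.shape, unc.shape, float(unc.max()))  # uncertainty peaks near ln(2)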
Collapse
Affiliation(s)
- Shijie Luo
- Software College, Northeastern University, Shenyang 110819, China
| | - Huiyan Jiang
- Software College, Northeastern University, Shenyang 110819, China; Key Laboratory of Intelligent Computing in Biomedical Image, Ministry of Education, Northeastern University, Shenyang 110819, China.
| | - Meng Wang
- Software College, Northeastern University, Shenyang 110819, China
| |
Collapse
|
41
|
Zhang S, Zhang J, Tian B, Lukasiewicz T, Xu Z. Multi-modal contrastive mutual learning and pseudo-label re-learning for semi-supervised medical image segmentation. Med Image Anal 2023; 83:102656. [PMID: 36327656 DOI: 10.1016/j.media.2022.102656] [Citation(s) in RCA: 19] [Impact Index Per Article: 9.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/11/2022] [Revised: 10/04/2022] [Accepted: 10/12/2022] [Indexed: 12/12/2022]
Abstract
Semi-supervised learning has great potential in medical image segmentation tasks with few labeled data, but most existing approaches only consider single-modal data. The complementary characteristics of multi-modal data can improve semi-supervised segmentation performance for each image modality. However, a shortcoming of most existing multi-modal solutions is that, because the processing models for the different modalities are highly coupled, multi-modal data are required not only during training but also at inference, which limits their use in clinical practice. Consequently, we propose a semi-supervised contrastive mutual learning (Semi-CML) segmentation framework, in which a novel area-similarity contrastive (ASC) loss leverages cross-modal information and prediction consistency between modalities to conduct contrastive mutual learning. Although Semi-CML can improve the segmentation performance of both modalities simultaneously, a performance gap remains between them, i.e., one modality's segmentation is usually better than the other's. Therefore, we further develop a soft pseudo-label re-learning (PReL) scheme to remedy this gap. We conducted experiments on two public multi-modal datasets. The results show that Semi-CML with PReL greatly outperforms state-of-the-art semi-supervised segmentation methods and achieves similar (and sometimes even better) performance than fully supervised segmentation with 100% labeled data, while reducing the cost of data annotation by 90%. Ablation studies confirm the effectiveness of the ASC loss and the PReL module.
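The core of mutual learning is that two modality-specific branches supervise each other on unlabeled data by encouraging consistent predictions, on top of the usual supervised loss on the few labeled cases. The snippet below sketches that idea with a generic symmetric KL consistency term; it is our own simplified substitute and does not implement the paper's area-similarity contrastive loss or pseudo-label re-learning.

# Minimal sketch of a mutual-learning consistency term between two modality branches.
import torch
import torch.nn.functional as F

def mutual_consistency_loss(logits_mod1, logits_mod2):
    """Symmetric KL divergence between the two branches' soft segmentation predictions."""
    p1 = F.log_softmax(logits_mod1, dim=1)
    p2 = F.log_softmax(logits_mod2, dim=1)
    return 0.5 * (
        F.kl_div(p1, p2.exp(), reduction="batchmean")
        + F.kl_div(p2, p1.exp(), reduction="batchmean")
    )

if __name__ == "__main__":
    a = torch.randn(4, 2, 16, 16, requires_grad=True)  # branch 1 segmentation logits
    b = torch.randn(4, 2, 16, 16)                      # branch 2 segmentation logits
    loss = mutual_consistency_loss(a, b)
    loss.backward()
    print(float(loss))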
Collapse
Affiliation(s)
- Shuo Zhang
- State Key Laboratory of Reliability and Intelligence of Electrical Equipment, School of Health Sciences and Biomedical Engineering, Hebei University of Technology, China; Tianjin Key Laboratory of Bioelectromagnetic Technology and Intelligent Health, School of Health Sciences and Biomedical Engineering, Hebei University of Technology, China
| | - Jiaojiao Zhang
- State Key Laboratory of Reliability and Intelligence of Electrical Equipment, School of Health Sciences and Biomedical Engineering, Hebei University of Technology, China; Tianjin Key Laboratory of Bioelectromagnetic Technology and Intelligent Health, School of Health Sciences and Biomedical Engineering, Hebei University of Technology, China
| | - Biao Tian
- State Key Laboratory of Reliability and Intelligence of Electrical Equipment, School of Health Sciences and Biomedical Engineering, Hebei University of Technology, China; Tianjin Key Laboratory of Bioelectromagnetic Technology and Intelligent Health, School of Health Sciences and Biomedical Engineering, Hebei University of Technology, China
| | | | - Zhenghua Xu
- State Key Laboratory of Reliability and Intelligence of Electrical Equipment, School of Health Sciences and Biomedical Engineering, Hebei University of Technology, China; Tianjin Key Laboratory of Bioelectromagnetic Technology and Intelligent Health, School of Health Sciences and Biomedical Engineering, Hebei University of Technology, China.
| |
Collapse
|
42
|
Zhang X, Zhang B, Deng S, Meng Q, Chen X, Xiang D. Cross modality fusion for modality-specific lung tumor segmentation in PET-CT images. Phys Med Biol 2022; 67. [DOI: 10.1088/1361-6560/ac994e] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/18/2022] [Accepted: 10/11/2022] [Indexed: 11/09/2022]
Abstract
Although positron emission tomography-computed tomography (PET-CT) images have been widely used, accurately segmenting lung tumors remains challenging. Respiration, motion, and the difference in imaging modality lead to a large discrepancy in tumor appearance between PET images and CT images. To overcome these difficulties, a novel network is designed to simultaneously obtain the corresponding lung tumor segmentations in PET images and CT images. The proposed network fuses the complementary information while preserving the modality-specific features of PET and CT. Because the two modalities are complementary, cross-modality decoding blocks are designed to extract modality-specific features of PET images and CT images under the constraints of the other modality. An edge consistency loss is also designed to address the problem of blurred boundaries in PET and CT images. The proposed method is tested on 126 PET-CT scans of non-small cell lung cancer, and the Dice similarity coefficients of lung tumor segmentation reach 75.66 ± 19.42 in CT images and 79.85 ± 16.76 in PET images. Extensive comparisons with state-of-the-art lung tumor segmentation methods demonstrate the superiority of the proposed network.
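An "edge consistency" style loss can be illustrated by comparing edge maps of the predicted and reference masks so that blurred boundaries are penalized. The sketch below uses Sobel filters for this; it is our own simplified reading of the idea, with filter choice and loss form as assumptions rather than the paper's exact formulation.

# Sketch of a Sobel-based edge consistency loss for segmentation probability maps.
import torch
import torch.nn.functional as F

def sobel_edges(x):
    """x: (N, 1, H, W) probability map -> edge magnitude map of the same shape."""
    kx = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]], device=x.device)
    ky = kx.t()
    gx = F.conv2d(x, kx.view(1, 1, 3, 3), padding=1)
    gy = F.conv2d(x, ky.view(1, 1, 3, 3), padding=1)
    return torch.sqrt(gx ** 2 + gy ** 2 + 1e-8)

def edge_consistency_loss(pred_prob, target_mask):
    # Penalize mismatch between predicted and reference boundary maps.
    return F.l1_loss(sobel_edges(pred_prob), sobel_edges(target_mask))

if __name__ == "__main__":
    pred = torch.rand(2, 1, 64, 64, requires_grad=True)
    target = (torch.rand(2, 1, 64, 64) > 0.5).float()
    loss = edge_consistency_loss(pred, target)
    loss.backward()
    print(float(loss))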
Collapse
|
43
|
Wang L. Deep Learning Techniques to Diagnose Lung Cancer. Cancers (Basel) 2022; 14:5569. [PMID: 36428662 PMCID: PMC9688236 DOI: 10.3390/cancers14225569] [Citation(s) in RCA: 34] [Impact Index Per Article: 11.3] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/21/2022] [Revised: 11/11/2022] [Accepted: 11/11/2022] [Indexed: 11/15/2022] Open
Abstract
Medical imaging tools are essential for early-stage lung cancer diagnosis and for monitoring lung cancer during treatment. Various medical imaging modalities, such as chest X-ray, magnetic resonance imaging, positron emission tomography, computed tomography, and molecular imaging techniques, have been extensively studied for lung cancer detection. These techniques have limitations, including the lack of automatic classification of cancer images, which makes them less suitable for patients with other pathologies. It is therefore urgently necessary to develop sensitive and accurate approaches for the early diagnosis of lung cancer. Deep learning is one of the fastest-growing topics in medical imaging, with rapidly emerging applications spanning medical image-based and textual data modalities. With the help of deep learning-based medical imaging tools, clinicians can detect and classify lung nodules more accurately and quickly. This paper presents recent developments in deep learning-based imaging techniques for early lung cancer detection.
Collapse
Affiliation(s)
- Lulu Wang
- Biomedical Device Innovation Center, Shenzhen Technology University, Shenzhen 518118, China
| |
Collapse
|
44
|
Huang Z, Zou S, Wang G, Chen Z, Shen H, Wang H, Zhang N, Zhang L, Yang F, Wang H, Liang D, Niu T, Zhu X, Hu Z. ISA-Net: Improved spatial attention network for PET-CT tumor segmentation. Comput Methods Programs Biomed 2022; 226:107129. [PMID: 36156438 DOI: 10.1016/j.cmpb.2022.107129] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 03/14/2022] [Revised: 07/06/2022] [Accepted: 09/13/2022] [Indexed: 06/16/2023]
Abstract
BACKGROUND AND OBJECTIVE Accurate and automated tumor segmentation plays an important role in both clinical practice and radiomics research. Segmentation in medicine is still often performed manually by experts, which is a laborious, expensive, and error-prone task; manual annotation relies heavily on expert experience and knowledge and is subject to considerable intra- and interobserver variation. It is therefore of great significance to develop methods that can automatically segment tumor target regions. METHODS In this paper, we propose a deep learning segmentation method based on multimodal positron emission tomography-computed tomography (PET-CT), which combines the high sensitivity of PET with the precise anatomical information of CT. We design an improved spatial attention network (ISA-Net) to increase the accuracy of tumor detection in PET and CT; it uses multi-scale convolution to extract feature information, highlighting tumor-region location information and suppressing non-tumor regions. In addition, the network uses dual-channel inputs in the encoding stage and fuses them in the decoding stage, exploiting the differences and complementarities between PET and CT. RESULTS We validated the proposed ISA-Net on two clinical datasets, a soft tissue sarcoma (STS) dataset and a head and neck tumor (HECKTOR) dataset, and compared it with other attention-based tumor segmentation methods. DSC scores of 0.8378 on the STS dataset and 0.8076 on the HECKTOR dataset show that ISA-Net achieves better segmentation performance and better generalization. CONCLUSIONS The proposed multi-modal tumor segmentation method effectively utilizes the differences and complementarity of the modalities and can also be applied to other multi-modal or single-modal data with proper adjustment.
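A multi-scale spatial attention gate of the general kind described here can be sketched with a few parallel convolutions of different kernel sizes whose outputs are fused into a single spatial map. The code below is an illustrative module under our own assumptions; the kernel sizes and channel counts are not the published ISA-Net configuration.

# Rough sketch of a multi-scale spatial attention gate.
import torch
import torch.nn as nn

class MultiScaleSpatialAttention(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.scale3 = nn.Conv2d(channels, 1, kernel_size=3, padding=1)
        self.scale5 = nn.Conv2d(channels, 1, kernel_size=5, padding=2)
        self.scale7 = nn.Conv2d(channels, 1, kernel_size=7, padding=3)
        self.fuse = nn.Sequential(nn.Conv2d(3, 1, kernel_size=1), nn.Sigmoid())

    def forward(self, x):
        maps = torch.cat([self.scale3(x), self.scale5(x), self.scale7(x)], dim=1)
        attention = self.fuse(maps)      # (N, 1, H, W) map in [0, 1]
        return x * attention             # emphasize tumor regions, suppress background

if __name__ == "__main__":
    pet_ct_feats = torch.randn(2, 32, 96, 96)   # e.g. features from a dual-channel encoder
    out = MultiScaleSpatialAttention(32)(pet_ct_feats)
    print(out.shape)  # torch.Size([2, 32, 96, 96])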
Collapse
Affiliation(s)
- Zhengyong Huang
- Lauterbur Research Center for Biomedical Imaging, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China; University of Chinese Academy of Sciences, Beijing, 101408, China
| | - Sijuan Zou
- Department of Nuclear Medicine and PET, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430000, China
| | - Guoshuai Wang
- Lauterbur Research Center for Biomedical Imaging, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China; University of Chinese Academy of Sciences, Beijing, 101408, China
| | - Zixiang Chen
- Lauterbur Research Center for Biomedical Imaging, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China; Chinese Academy of Sciences Key Laboratory of Health Informatics, Shenzhen, 518055, China
| | - Hao Shen
- Lauterbur Research Center for Biomedical Imaging, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China; University of Chinese Academy of Sciences, Beijing, 101408, China
| | - Haiyan Wang
- Lauterbur Research Center for Biomedical Imaging, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China; University of Chinese Academy of Sciences, Beijing, 101408, China
| | - Na Zhang
- Lauterbur Research Center for Biomedical Imaging, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China; Chinese Academy of Sciences Key Laboratory of Health Informatics, Shenzhen, 518055, China
| | - Lu Zhang
- Brain Cognition and Brain Disease Institute (BCBDI), Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China; Shenzhen-Hong Kong Institute of Brain Science-Shenzhen Fundamental Research Institutions,Shenzhen, 518055, China
| | - Fan Yang
- Brain Cognition and Brain Disease Institute (BCBDI), Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China; Shenzhen-Hong Kong Institute of Brain Science-Shenzhen Fundamental Research Institutions,Shenzhen, 518055, China
| | - Haining Wang
- United Imaging Research Institute of Innovative Medical Equipment, Shenzhen, 518045, China
| | - Dong Liang
- Lauterbur Research Center for Biomedical Imaging, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China; Chinese Academy of Sciences Key Laboratory of Health Informatics, Shenzhen, 518055, China
| | - Tianye Niu
- Institute of Biomedical Engineering, Shenzhen Bay Laboratory, Shenzhen, 518118, China
| | - Xiaohua Zhu
- Department of Nuclear Medicine and PET, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430000, China
| | - Zhanli Hu
- Lauterbur Research Center for Biomedical Imaging, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China; Chinese Academy of Sciences Key Laboratory of Health Informatics, Shenzhen, 518055, China.
| |
Collapse
|
45
|
Lipkova J, Chen RJ, Chen B, Lu MY, Barbieri M, Shao D, Vaidya AJ, Chen C, Zhuang L, Williamson DFK, Shaban M, Chen TY, Mahmood F. Artificial intelligence for multimodal data integration in oncology. Cancer Cell 2022; 40:1095-1110. [PMID: 36220072 PMCID: PMC10655164 DOI: 10.1016/j.ccell.2022.09.012] [Citation(s) in RCA: 201] [Impact Index Per Article: 67.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 03/09/2022] [Revised: 07/12/2022] [Accepted: 09/15/2022] [Indexed: 02/07/2023]
Abstract
In oncology, the patient state is characterized by a whole spectrum of modalities, ranging from radiology, histology, and genomics to electronic health records. Current artificial intelligence (AI) models operate mainly in the realm of a single modality, neglecting the broader clinical context, which inevitably diminishes their potential. Integration of different data modalities provides opportunities to increase robustness and accuracy of diagnostic and prognostic models, bringing AI closer to clinical practice. AI models are also capable of discovering novel patterns within and across modalities suitable for explaining differences in patient outcomes or treatment resistance. The insights gleaned from such models can guide exploration studies and contribute to the discovery of novel biomarkers and therapeutic targets. To support these advances, here we present a synopsis of AI methods and strategies for multimodal data fusion and association discovery. We outline approaches for AI interpretability and directions for AI-driven exploration through multimodal data interconnections. We examine challenges in clinical adoption and discuss emerging solutions.
Collapse
Affiliation(s)
- Jana Lipkova
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA; Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA; Data Science Program, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Richard J Chen
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA; Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA; Data Science Program, Dana-Farber Cancer Institute, Boston, MA, USA; Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
| | - Bowen Chen
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA; Department of Computer Science, Harvard University, Cambridge, MA, USA
| | - Ming Y Lu
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA; Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA; Data Science Program, Dana-Farber Cancer Institute, Boston, MA, USA; Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology (MIT), Cambridge, MA, USA
| | - Matteo Barbieri
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
| | - Daniel Shao
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA; Harvard-MIT Health Sciences and Technology (HST), Cambridge, MA, USA
| | - Anurag J Vaidya
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA; Harvard-MIT Health Sciences and Technology (HST), Cambridge, MA, USA
| | - Chengkuan Chen
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA; Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA; Data Science Program, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Luoting Zhuang
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA
| | - Drew F K Williamson
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA; Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA; Data Science Program, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Muhammad Shaban
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA; Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA; Data Science Program, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Tiffany Y Chen
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA; Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA; Data Science Program, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Faisal Mahmood
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA; Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA; Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA; Data Science Program, Dana-Farber Cancer Institute, Boston, MA, USA; Harvard Data Science Initiative, Harvard University, Cambridge, MA, USA.
| |
Collapse
|
46
|
Oh C, Chung JY, Han Y. An End-to-End Recurrent Neural Network for Radial MR Image Reconstruction. Sensors (Basel) 2022; 22:7277. [PMID: 36236376 PMCID: PMC9572393 DOI: 10.3390/s22197277] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 08/25/2022] [Revised: 09/20/2022] [Accepted: 09/23/2022] [Indexed: 06/16/2023]
Abstract
Recent advances in deep learning have contributed greatly to the field of parallel MR imaging, where a reduced amount of k-space data are acquired to accelerate imaging time. In our previous work, we have proposed a deep learning method to reconstruct MR images directly from k-space data acquired with Cartesian trajectories. However, MRI utilizes various non-Cartesian trajectories, such as radial trajectories, with various numbers of multi-channel RF coils according to the purpose of an MRI scan. Thus, it is important for a reconstruction network to efficiently unfold aliasing artifacts due to undersampling and to combine multi-channel k-space data into single-channel data. In this work, a neural network named 'ETER-net' is utilized to reconstruct an MR image directly from k-space data acquired with Cartesian and non-Cartesian trajectories and multi-channel RF coils. In the proposed image reconstruction network, the domain transform network converts k-space data into a rough image, which is then refined in the following network to reconstruct a final image. We also analyze loss functions including adversarial and perceptual losses to improve the network performance. For experiments, we acquired k-space data at a 3T MRI scanner with Cartesian and radial trajectories to show the learning mechanism of the direct mapping relationship between the k-space and the corresponding image by the proposed network and to demonstrate the practical applications. According to our experiments, the proposed method showed satisfactory performance in reconstructing images from undersampled single- or multi-channel k-space data with reduced image artifacts. In conclusion, the proposed method is a deep-learning-based MR reconstruction network, which can be used as a unified solution for parallel MRI, where k-space data are acquired with various scanning trajectories.
Collapse
Affiliation(s)
- Changheun Oh
- Neuroscience Research Institute, Gachon University, Incheon 21565, Korea
| | - Jun-Young Chung
- Department of Neuroscience, College of Medicine, Gachon University, Incheon 21565, Korea
| | - Yeji Han
- Department of Biomedical Engineering, Gachon University, Incheon 21936, Korea
| |
Collapse
|
47
|
Yuan C, Shi Q, Huang X, Wang L, He Y, Li B, Zhao W, Qian D. Multimodal deep learning model on interim [18F]FDG PET/CT for predicting primary treatment failure in diffuse large B-cell lymphoma. Eur Radiol 2022; 33:77-88. [PMID: 36029345 DOI: 10.1007/s00330-022-09031-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/14/2022] [Revised: 05/30/2022] [Accepted: 07/13/2022] [Indexed: 11/28/2022]
Abstract
OBJECTIVES The prediction of primary treatment failure (PTF) is necessary for patients with diffuse large B-cell lymphoma (DLBCL) since it serves as a prominent means for improving front-line outcomes. Using interim 18F-fluoro-2-deoxyglucose ([18F]FDG) positron emission tomography/computed tomography (PET/CT) imaging data, we aimed to construct multimodal deep learning (MDL) models to predict possible PTF in low-risk DLBCL. METHODS Initially, 205 DLBCL patients undergoing interim [18F]FDG PET/CT scans and the front-line standard of care were included in the primary dataset for model development. Then, 44 other patients were included in the external dataset for generalization evaluation. Based on the powerful backbone of the Conv-LSTM network, we incorporated five different multimodal fusion strategies (pixel intermixing, separate channel, separate branch, quantitative weighting, and hybrid learning) to make full use of PET/CT features and built five corresponding MDL models. Moreover, we found the best model, that is, the hybrid learning model, and optimized it by integrating the contrastive training objective to further improve its prediction performance. RESULTS The final model with contrastive objective optimization, named the contrastive hybrid learning model, performed best, with an accuracy of 91.22% and an area under the receiver operating characteristic curve (AUC) of 0.926, in the primary dataset. In the external dataset, its accuracy and AUC remained at 88.64% and 0.925, respectively, indicating its good generalization ability. CONCLUSIONS The proposed model achieved good performance, validated the predictive value of interim PET/CT, and holds promise for directing individualized clinical treatment. KEY POINTS • The proposed multimodal models achieved accurate prediction of primary treatment failure in DLBCL patients. • Using an appropriate feature-level fusion strategy can make the same class close to each other regardless of the modal heterogeneity of the data source domain and positively impact the prediction performance. • Deep learning validated the predictive value of interim PET/CT in a way that exceeded human capabilities.
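Two of the fusion strategies named above, pixel intermixing (early fusion) and a separate-branch design (feature-level fusion), can be contrasted with a toy example. The sketch below is purely illustrative under our own assumptions; the tiny CNNs stand in for the paper's Conv-LSTM backbone only to show where the PET and CT inputs are combined.

# Toy comparison of early (pixel-level) versus separate-branch (feature-level) fusion.
import torch
import torch.nn as nn

def small_cnn(in_ch):
    return nn.Sequential(
        nn.Conv2d(in_ch, 16, 3, padding=1), nn.ReLU(inplace=True),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    )

class EarlyFusion(nn.Module):           # pixel intermixing: PET and CT stacked as channels
    def __init__(self):
        super().__init__()
        self.net = small_cnn(in_ch=2)
        self.cls = nn.Linear(16, 1)

    def forward(self, pet, ct):
        return self.cls(self.net(torch.cat([pet, ct], dim=1)))

class SeparateBranchFusion(nn.Module):  # each modality has its own encoder; fuse features
    def __init__(self):
        super().__init__()
        self.pet_net, self.ct_net = small_cnn(1), small_cnn(1)
        self.cls = nn.Linear(32, 1)

    def forward(self, pet, ct):
        return self.cls(torch.cat([self.pet_net(pet), self.ct_net(ct)], dim=1))

if __name__ == "__main__":
    pet, ct = torch.randn(2, 1, 64, 64), torch.randn(2, 1, 64, 64)
    print(EarlyFusion()(pet, ct).shape, SeparateBranchFusion()(pet, ct).shape)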
Collapse
Affiliation(s)
- Cheng Yuan
- School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, 200040, China
| | - Qing Shi
- Shanghai Institute of Hematology, State Key Laboratory of Medical Genomics, National Research Center for Translational Medicine at Shanghai, Ruijin Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai, 200025, China
| | - Xinyun Huang
- Department of Nuclear Medicine, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, 200025, China
| | - Li Wang
- Shanghai Institute of Hematology, State Key Laboratory of Medical Genomics, National Research Center for Translational Medicine at Shanghai, Ruijin Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai, 200025, China
| | - Yang He
- Shanghai Institute of Hematology, State Key Laboratory of Medical Genomics, National Research Center for Translational Medicine at Shanghai, Ruijin Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai, 200025, China
| | - Biao Li
- Department of Nuclear Medicine, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, 200025, China.
| | - Weili Zhao
- Shanghai Institute of Hematology, State Key Laboratory of Medical Genomics, National Research Center for Translational Medicine at Shanghai, Ruijin Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai, 200025, China.
| | - Dahong Qian
- School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, 200040, China.
| |
Collapse
|
48
|
Tang W, He F, Liu Y, Duan Y. MATR: Multimodal Medical Image Fusion via Multiscale Adaptive Transformer. IEEE Trans Image Process 2022; 31:5134-5149. [PMID: 35901003 DOI: 10.1109/tip.2022.3193288] [Citation(s) in RCA: 44] [Impact Index Per Article: 14.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/15/2023]
Abstract
Owing to the limitations of imaging sensors, it is challenging to obtain a medical image that simultaneously contains functional metabolic information and structural tissue details. Multimodal medical image fusion, an effective way to merge the complementary information in different modalities, has become a significant technique to facilitate clinical diagnosis and surgical navigation. With powerful feature representation ability, deep learning (DL)-based methods have improved such fusion results but still have not achieved satisfactory performance. Specifically, existing DL-based methods generally depend on convolutional operations, which can well extract local patterns but have limited capability in preserving global context information. To compensate for this defect and achieve accurate fusion, we propose a novel unsupervised method to fuse multimodal medical images via a multiscale adaptive Transformer termed MATR. In the proposed method, instead of directly employing vanilla convolution, we introduce an adaptive convolution for adaptively modulating the convolutional kernel based on the global complementary context. To further model long-range dependencies, an adaptive Transformer is employed to enhance the global semantic extraction capability. Our network architecture is designed in a multiscale fashion so that useful multimodal information can be adequately acquired from the perspective of different scales. Moreover, an objective function composed of a structural loss and a region mutual information loss is devised to construct constraints for information preservation at both the structural-level and the feature-level. Extensive experiments on a mainstream database demonstrate that the proposed method outperforms other representative and state-of-the-art methods in terms of both visual quality and quantitative evaluation. We also extend the proposed method to address other biomedical image fusion issues, and the pleasing fusion results illustrate that MATR has good generalization capability. The code of the proposed method is available at https://github.com/tthinking/MATR.
Collapse
|
49
|
Ye S, Chen C, Bai Z, Wang J, Yao X, Nedzvedz O. Intelligent Labeling of Tumor Lesions Based on Positron Emission Tomography/Computed Tomography. Sensors (Basel) 2022; 22:5171. [PMID: 35890851 PMCID: PMC9320307 DOI: 10.3390/s22145171] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 05/29/2022] [Revised: 07/04/2022] [Accepted: 07/08/2022] [Indexed: 06/15/2023]
Abstract
Positron emission tomography/computed tomography (PET/CT) plays a vital role in diagnosing tumors. However, PET/CT interpretation and lesion labeling rely primarily on manual work by medical professionals, and this workload hampers the construction of training samples for deep learning. Labeling tumor lesions in PET/CT images sits at the intersection of computer graphics and medicine, involving registration, fusion of medical images, and delineation of lesions. This paper extends linear interpolation and enhances it in a specific area of the PET image, and uses outer-frame scaling of the PET/CT images together with a least-squares residual affine method for registration. The PET and CT images are subjected to wavelet transformation and then combined proportionally to form a PET/CT fusion image. Guided by the 18F-FDG (fluorodeoxyglucose) standardized uptake values (SUV) in the PET image, a professional selects a point in the focal area of the fusion image, the system automatically selects a seed point in the lesion region, and the tumor focus is delineated with a region-growing method. Finally, the lesion delineated on the PET/CT fusion image is automatically mapped to the CT image as a polygon, from which a rectangular segmentation and label are generated. Taking actual PET/CT scans of patients with lymphoma as an example, the system's semiautomatic labeling was compared and verified against manual labeling by imaging specialists: the recognition rate was 93.35%, and the misjudgment rate was 6.52%.
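Region growing from a seed point is a classical algorithm and easy to sketch: starting at the seed, neighbouring pixels are added as long as their intensity stays within a tolerance of the seed value. The minimal sketch below uses 4-connectivity on a 2D image; the tolerance criterion and connectivity are assumptions for illustration, not the paper's exact delineation rule.

# Minimal region-growing sketch: grow a mask outward from a seed pixel.
import numpy as np
from collections import deque

def region_grow(image: np.ndarray, seed: tuple, tol: float) -> np.ndarray:
    """Return a boolean mask of pixels connected to `seed` within `tol` of its value."""
    mask = np.zeros(image.shape, dtype=bool)
    seed_val = image[seed]
    queue = deque([seed])
    mask[seed] = True
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):      # 4-connectivity
            nr, nc = r + dr, c + dc
            if 0 <= nr < image.shape[0] and 0 <= nc < image.shape[1] \
                    and not mask[nr, nc] and abs(image[nr, nc] - seed_val) <= tol:
                mask[nr, nc] = True
                queue.append((nr, nc))
    return mask

if __name__ == "__main__":
    img = np.zeros((64, 64))
    img[20:40, 20:40] = 5.0          # synthetic high-uptake "lesion"
    lesion = region_grow(img, seed=(30, 30), tol=1.0)
    print(lesion.sum())              # 400 pixels grown from the seed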
Collapse
Affiliation(s)
- Shiping Ye
- School of Information Science and Technology, Zhejiang Shuren University, Hangzhou 310015, China; (S.Y.); (Z.B.); (J.W.)
- International Science and Technology Cooperation Base of Zhejiang Province: Remote Sensing Image Processing and Application, Hangzhou 310015, China;
| | - Chaoxiang Chen
- International Science and Technology Cooperation Base of Zhejiang Province: Remote Sensing Image Processing and Application, Hangzhou 310015, China;
- Shulan International Medical School, Zhejiang Shuren University, Hangzhou 310015, China;
| | - Zhican Bai
- School of Information Science and Technology, Zhejiang Shuren University, Hangzhou 310015, China; (S.Y.); (Z.B.); (J.W.)
- International Science and Technology Cooperation Base of Zhejiang Province: Remote Sensing Image Processing and Application, Hangzhou 310015, China;
| | - Jinming Wang
- School of Information Science and Technology, Zhejiang Shuren University, Hangzhou 310015, China; (S.Y.); (Z.B.); (J.W.)
- International Science and Technology Cooperation Base of Zhejiang Province: Remote Sensing Image Processing and Application, Hangzhou 310015, China;
| | - Xiaoxaio Yao
- International Science and Technology Cooperation Base of Zhejiang Province: Remote Sensing Image Processing and Application, Hangzhou 310015, China;
- Shulan International Medical School, Zhejiang Shuren University, Hangzhou 310015, China;
| | - Olga Nedzvedz
- Shulan International Medical School, Zhejiang Shuren University, Hangzhou 310015, China;
- Faculty of Biology, Belarusian State University, 220030 Minsk, Belarus
| |
Collapse
|
50
|
Nakao T, Hanaoka S, Nomura Y, Hayashi N, Abe O. Anomaly detection in chest 18F-FDG PET/CT by Bayesian deep learning. Jpn J Radiol 2022; 40:730-739. [PMID: 35094221 PMCID: PMC9252947 DOI: 10.1007/s11604-022-01249-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/26/2021] [Accepted: 01/11/2022] [Indexed: 12/25/2022]
Abstract
PURPOSE To develop an anomaly detection system in PET/CT with the tracer 18F-fluorodeoxyglucose (FDG) that requires only normal PET/CT images for training and can detect abnormal FDG uptake at any location in the chest region. MATERIALS AND METHODS We trained our model based on a Bayesian deep learning framework using 1878 PET/CT scans with no abnormal findings. Our model learns the distribution of standard uptake values in these normal training images and detects out-of-normal uptake regions. We evaluated this model using 34 scans showing focal abnormal FDG uptake in the chest region. This evaluation dataset includes 28 pulmonary and 17 extrapulmonary abnormal FDG uptake foci. We performed per-voxel and per-slice receiver operating characteristic (ROC) analyses and per-lesion free-response receiver operating characteristic analysis. RESULTS Our model showed an area under the ROC curve of 0.992 on discriminating abnormal voxels and 0.852 on abnormal slices. Our model detected 41 of 45 (91.1%) of the abnormal FDG uptake foci with 12.8 false positives per scan (FPs/scan), which include 26 of 28 pulmonary and 15 of 17 extrapulmonary abnormalities. The sensitivity at 3.0 FPs/scan was 82.2% (37/45). CONCLUSION Our model trained only with normal PET/CT images successfully detected both pulmonary and extrapulmonary abnormal FDG uptake in the chest region.
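The underlying idea, learn what normal uptake looks like at each location and flag voxels that fall far outside that distribution, can be illustrated with a drastically simplified per-voxel Gaussian model. The NumPy sketch below is only a conceptual stand-in under our own assumptions; the paper itself uses a Bayesian deep network rather than voxel-wise statistics on registered images.

# Highly simplified "normal-uptake model" anomaly detection sketch.
import numpy as np

def fit_normal_model(normal_scans: np.ndarray):
    """normal_scans: (n_scans, H, W) registered normal images -> per-voxel mean/std."""
    mean = normal_scans.mean(axis=0)
    std = normal_scans.std(axis=0) + 1e-6
    return mean, std

def anomaly_map(scan: np.ndarray, mean: np.ndarray, std: np.ndarray, z_thresh: float = 3.0):
    z = (scan - mean) / std                  # how many SDs from "normal" uptake
    return z, z > z_thresh                   # score map and binary detection mask

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    normals = rng.normal(loc=1.0, scale=0.2, size=(50, 64, 64))   # simulated normal uptake
    mean, std = fit_normal_model(normals)
    test = rng.normal(loc=1.0, scale=0.2, size=(64, 64))
    test[30:34, 30:34] += 3.0                                     # simulated focal uptake
    z, detected = anomaly_map(test, mean, std)
    print(detected.sum())   # mostly the 16 inserted voxels, plus a few chance detections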
Collapse
Affiliation(s)
- Takahiro Nakao
- Department of Computational Diagnostic Radiology and Preventive Medicine, The University of Tokyo Hospital, 7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-8655, Japan.
| | - Shouhei Hanaoka
- Department of Radiology, The University of Tokyo Hospital, 7-3-1 Hongo, Bunkyo-ku, Tokyo, Japan
| | - Yukihiro Nomura
- Department of Computational Diagnostic Radiology and Preventive Medicine, The University of Tokyo Hospital, 7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-8655, Japan
- Center for Frontier Medical Engineering, Chiba University, 1-33 Yayoicho, Inage-ku, Chiba, Japan
| | - Naoto Hayashi
- Department of Computational Diagnostic Radiology and Preventive Medicine, The University of Tokyo Hospital, 7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-8655, Japan
| | - Osamu Abe
- Department of Radiology, The University of Tokyo Hospital, 7-3-1 Hongo, Bunkyo-ku, Tokyo, Japan
- Division of Radiology and Biomedical Engineering, Graduate School of Medicine, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo, Japan
| |
Collapse
|