1
Cai Z, Xin J, You C, Shi P, Dong S, Dvornek NC, Zheng N, Duncan JS. Style mixup enhanced disentanglement learning for unsupervised domain adaptation in medical image segmentation. Med Image Anal 2025; 101:103440. [PMID: 39764933] [DOI: 10.1016/j.media.2024.103440]
Abstract
Unsupervised domain adaptation (UDA) has shown impressive performance in improving the generalizability of models to tackle the domain shift problem in cross-modality medical segmentation. However, most existing UDA approaches depend on high-quality image translation with diversity constraints to explicitly augment the potential data diversity, which makes it hard to ensure semantic consistency and to capture domain-invariant representations. In this paper, free of image translation and diversity constraints, we propose a novel Style Mixup Enhanced Disentanglement Learning (SMEDL) method for UDA medical image segmentation that further improves domain generalization and enhances domain-invariant learning ability. Firstly, our method adopts disentangled style mixup to implicitly generate style-mixed domains with diverse styles in the feature space through a convex combination of disentangled style factors, which can effectively improve model generalization. Meanwhile, we further introduce pixel-wise consistency regularization to ensure the effectiveness of the style-mixed domains and to provide domain consistency guidance. Secondly, we introduce dual-level domain-invariant learning, comprising intra-domain contrastive learning and inter-domain adversarial learning, to mine the underlying domain-invariant representation under both intra- and inter-domain variations. We have conducted comprehensive experiments to evaluate our method on two public cardiac datasets and one brain dataset. Experimental results demonstrate that our proposed method achieves superior performance compared to state-of-the-art methods for UDA medical image segmentation.
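The abstract names the style-mixup operation without detail. For illustration, here is a minimal sketch of the underlying idea: a convex combination of per-channel feature statistics from two domains, in the spirit of MixStyle. The function name and the choice of instance-level mean/std as the disentangled "style factors" are assumptions, not the authors' code.

```python
import torch

def style_mixup(f_src, f_tgt, alpha=0.1, eps=1e-6):
    """Mix the style (per-channel mean/std) of two feature batches.

    f_src, f_tgt: feature maps of shape (B, C, H, W) from two domains.
    Returns source content re-rendered with a convex mix of both styles.
    """
    # Instance-level style statistics (the assumed "style factors").
    mu_s = f_src.mean(dim=(2, 3), keepdim=True)
    sig_s = f_src.std(dim=(2, 3), keepdim=True) + eps
    mu_t = f_tgt.mean(dim=(2, 3), keepdim=True)
    sig_t = f_tgt.std(dim=(2, 3), keepdim=True) + eps

    # Convex combination coefficient drawn from a Beta distribution.
    lam = torch.distributions.Beta(alpha, alpha).sample(
        (f_src.size(0), 1, 1, 1)).to(f_src.device)

    mu_mix = lam * mu_s + (1 - lam) * mu_t
    sig_mix = lam * sig_s + (1 - lam) * sig_t

    # Strip the source style, then apply the mixed style.
    return (f_src - mu_s) / sig_s * sig_mix + mu_mix
```

Segmenting both the original and the style-mixed features and penalizing pixel-wise disagreement would then correspond to the consistency regularization the abstract describes.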
Affiliation(s)
- Zhuotong Cai
- National Key Laboratory of Human-Machine Hybrid Augmented Intelligence, National Engineering Research Center for Visual Information and Applications, and Institute of Artificial Intelligence and Robotics, Xi'an Jiaotong University, Xi'an, Shaanxi, China; Department of Biomedical Engineering, Yale University, New Haven, CT, USA
- Jingmin Xin
- National Key Laboratory of Human-Machine Hybrid Augmented Intelligence, National Engineering Research Center for Visual Information and Applications, and Institute of Artificial Intelligence and Robotics, Xi'an Jiaotong University, Xi'an, Shaanxi, China
- Chenyu You
- Department of Electrical Engineering, Yale University, New Haven, CT, USA
- Peiwen Shi
- National Key Laboratory of Human-Machine Hybrid Augmented Intelligence, National Engineering Research Center for Visual Information and Applications, and Institute of Artificial Intelligence and Robotics, Xi'an Jiaotong University, Xi'an, Shaanxi, China
- Siyuan Dong
- Department of Electrical Engineering, Yale University, New Haven, CT, USA
- Nicha C Dvornek
- Department of Biomedical Engineering, Yale University, New Haven, CT, USA
- Nanning Zheng
- National Key Laboratory of Human-Machine Hybrid Augmented Intelligence, National Engineering Research Center for Visual Information and Applications, and Institute of Artificial Intelligence and Robotics, Xi'an Jiaotong University, Xi'an, Shaanxi, China
- James S Duncan
- Department of Electrical Engineering, Yale University, New Haven, CT, USA; Department of Biomedical Engineering, Yale University, New Haven, CT, USA
2
Wang Y, Meng C, Tang Z, Bai X, Ji P, Bai X. Unsupervised Domain Adaptation for Cross-Modality Cerebrovascular Segmentation. IEEE J Biomed Health Inform 2025; 29:2871-2884. [PMID: 40030830] [DOI: 10.1109/jbhi.2024.3523103]
Abstract
Cerebrovascular segmentation from time-of-flight magnetic resonance angiography (TOF-MRA) and computed tomography angiography (CTA) is essential in providing supportive information for the diagnosis and treatment planning of multiple intracranial vascular diseases. Different imaging modalities use distinct principles to visualize the cerebral vasculature, which leads to expensive annotation requirements and to performance degradation when training and deploying deep learning models. In this paper, we propose CereTS, an unsupervised domain adaptation framework for the translation and segmentation of cross-modality unpaired cerebral angiography. Treating vascular structures and stylistic textures as domain-invariant and domain-specific features, respectively, CereTS adopts a multi-level domain alignment pattern that includes an image-level cyclic geometric consistency constraint, a patch-level masked contrastive constraint, and a feature-level semantic perception constraint to shrink the domain discrepancy while preserving the consistency of vascular structures. Experiments conducted on a publicly available TOF-MRA dataset and a private CTA dataset show that CereTS outperforms current state-of-the-art methods by a large margin.
3
Lin J, Yu X, Wang Z, Ma C. Source-free domain transfer algorithm with reduced style sensitivity for medical image segmentation. PLoS One 2024; 19:e0309118. [PMID: 39729484] [DOI: 10.1371/journal.pone.0309118]
Abstract
In unsupervised transfer learning for medical image segmentation, existing algorithms face the challenge of error propagation because the source domain data are inaccessible. In response to this scenario, a source-free domain transfer algorithm with reduced style sensitivity (SFDT-RSS) is designed. SFDT-RSS first pre-trains the source domain model using a generalization strategy and subsequently adapts the pre-trained model to the target domain without accessing the source data. It then conducts an inter-patch style transfer (ISS) strategy, based on self-training with a Transformer architecture, to minimize the pre-trained model's style sensitivity, enhancing its generalization capability and reducing its reliance on a single image style. Simultaneously, the global perception ability of the Transformer architecture enhances semantic representation, improving the effectiveness of style generalization. In the domain transfer phase, the proposed algorithm uses a model-agnostic adaptive confidence regulation (ACR) loss to adjust the source model. Experimental results on five publicly available datasets for unsupervised cross-domain organ segmentation demonstrate that, compared to existing algorithms, SFDT-RSS achieves segmentation accuracy improvements of 2.83%, 2.64%, 3.21%, 3.01%, and 3.32%, respectively.
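The ACR loss is described only by name in the abstract. Purely as an illustration of the general idea behind confidence-regulated self-training on an unlabeled target domain, a generic confidence-weighted pseudo-label loss (the function, threshold, and weighting scheme are all assumptions, not the paper's formulation) might look like this:

```python
import torch.nn.functional as F

def confidence_regulated_loss(logits, threshold=0.9):
    """Confidence-weighted self-training loss on unlabeled target images.

    logits: (B, C, H, W) predictions of the adapting model. Pseudo-labels
    are the argmax classes; each pixel's loss is weighted by its predicted
    confidence and masked out below a confidence threshold.
    """
    probs = F.softmax(logits, dim=1)
    conf, pseudo = probs.max(dim=1)                           # (B, H, W)
    loss = F.cross_entropy(logits, pseudo, reduction="none")  # (B, H, W)
    mask = (conf > threshold).float()
    weighted = conf.detach() * mask * loss
    return weighted.sum() / mask.sum().clamp(min=1.0)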
Affiliation(s)
- Jian Lin
- Sichuan Academy of Medical Science and Sichuan Provincial People's Hospital, Chengdu, China
- Xiaomin Yu
- School of Electronic Engineering, Chengdu University of Information Technology, Chengdu, China
- Zhengxian Wang
- School of Electronic Engineering, Chengdu University of Information Technology, Chengdu, China
- Chaoqiong Ma
- Sichuan Academy of Medical Science and Sichuan Provincial People's Hospital, Chengdu, China
4
Kang B, Nam H, Kang M, Heo KS, Lim M, Oh JH, Kam TE. Target-aware cross-modality unsupervised domain adaptation for vestibular schwannoma and cochlea segmentation. Sci Rep 2024; 14:27883. [PMID: 39537681] [PMCID: PMC11561345] [DOI: 10.1038/s41598-024-77633-x]
Abstract
There is growing interest in research on segmentation of the vestibular schwannoma (VS) and cochlea using high-resolution T2 (hrT2) imaging rather than contrast-enhanced T1 (ceT1) imaging, owing to contrast agent side effects. However, hrT2 imaging suffers from insufficient annotated data, which is a serious obstacle to building robust segmentation models. To address this issue, recent studies have adopted unsupervised domain adaptation approaches that translate ceT1 images to hrT2 images. However, previous studies did not consider the size and visual characteristics of the target objects, such as the VS and cochlea, during image translation. Specifically, those works simply performed normalization on the entire image without considering its significant impact on the quality of the translated images. Such approaches tend to erase small target objects, making it difficult to preserve their structure when generating pseudo-target images, and may also fail to accurately reflect the unique style of the target objects within the images. We therefore propose a target-aware unsupervised domain adaptation framework designed to translate target objects, each tailored to its unique visual characteristics and size, using target-aware normalization. We demonstrate the superiority of the proposed framework on a publicly available challenge dataset. Code is available at https://github.com/Bokyeong-Kang/TANQ.
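The abstract contrasts whole-image normalization with target-aware normalization but does not give its formula. The hypothetical helper below merely illustrates the difference, computing intensity statistics inside a target region so that small structures such as the VS and cochlea are not washed out; the paper's actual normalization is applied per target object inside the translation network.

```python
import numpy as np

def target_aware_normalize(image, target_mask, eps=1e-6):
    """Normalize using statistics from the target region only (a sketch).

    image: 2D/3D intensity array; target_mask: boolean array of the same
    shape marking the small structure of interest (e.g. VS or cochlea).
    """
    roi = image[target_mask]
    mu, sigma = roi.mean(), roi.std() + eps
    # Small targets barely influence whole-image statistics, so centering
    # on the ROI keeps their contrast from being erased during translation.
    return (image - mu) / sigma
```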
Affiliation(s)
- Bogyeong Kang
- Department of Artificial Intelligence, Korea University, Seoul, South Korea
- Hyeonyeong Nam
- Department of Artificial Intelligence, Korea University, Seoul, South Korea
- Myeongkyun Kang
- Department of Robotics and Mechatronics Engineering, Daegu Gyeongbuk Institute of Science and Technology (DGIST), Daegu, South Korea
- Keun-Soo Heo
- Department of Artificial Intelligence, Korea University, Seoul, South Korea
- Minjoo Lim
- Department of Artificial Intelligence, Korea University, Seoul, South Korea
- Ji-Hye Oh
- Department of Artificial Intelligence, Korea University, Seoul, South Korea
- Tae-Eui Kam
- Department of Artificial Intelligence, Korea University, Seoul, South Korea
5
Wang R, Heimann AF, Tannast M, Zheng G. CycleSGAN: A cycle-consistent and semantics-preserving generative adversarial network for unpaired MR-to-CT image synthesis. Comput Med Imaging Graph 2024; 117:102431. [PMID: 39243464] [DOI: 10.1016/j.compmedimag.2024.102431]
Abstract
CycleGAN has been leveraged to synthesize a CT image from an available MR image after being trained on unpaired data. Because there are no direct constraints between the synthetic and input images, CycleGAN cannot guarantee structural consistency and often generates inaccurate mappings that shift the anatomy, which is highly undesirable for downstream clinical applications such as MRI-guided radiotherapy treatment planning and PET/MRI attenuation correction. In this paper, we propose a cycle-consistent and semantics-preserving generative adversarial network, referred to as CycleSGAN, for unpaired MR-to-CT image synthesis. Our design features a novel and generic way to incorporate semantic information into CycleGAN: a pair of three-player games within the CycleGAN framework, where each three-player game consists of one generator and two discriminators that formulate two distinct types of adversarial learning, appearance adversarial learning and structure adversarial learning. These two types of adversarial learning are trained alternately to ensure both realistic image synthesis and semantic structure preservation. Results on unpaired hip MR-to-CT image synthesis show that our method produces synthetic CT images that are better in both accuracy and visual quality than those of other state-of-the-art (SOTA) unpaired MR-to-CT image synthesis methods.
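As a rough illustration of one of the pair of three-player games, the generator-side losses for the MR-to-CT direction might be composed as below. The network handles, the binary cross-entropy adversarial form, and the lambda value are assumptions; the paper's exact objectives may differ.

```python
import torch
import torch.nn.functional as F

def generator_loss_mr2ct(g_mr2ct, g_ct2mr, d_app, d_struct, seg_head,
                         mr, lambda_cyc=10.0):
    """Generator objective for one direction of CycleSGAN-style training.

    d_app judges whether a synthetic CT looks like a real CT (appearance
    adversarial learning); d_struct judges whether its predicted semantic
    map looks like one from a real CT (structure adversarial learning).
    """
    fake_ct = g_mr2ct(mr)

    # Appearance adversarial term: fool the image discriminator.
    app_logits = d_app(fake_ct)
    adv_app = F.binary_cross_entropy_with_logits(
        app_logits, torch.ones_like(app_logits))

    # Structure adversarial term: fool the segmentation-map discriminator.
    struct_logits = d_struct(seg_head(fake_ct))
    adv_struct = F.binary_cross_entropy_with_logits(
        struct_logits, torch.ones_like(struct_logits))

    # Cycle consistency, as in plain CycleGAN.
    cyc = F.l1_loss(g_ct2mr(fake_ct), mr)

    return adv_app + adv_struct + lambda_cyc * cyc
```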
Affiliation(s)
- Runze Wang
- Institute of Medical Robotics, School of Biomedical Engineering, Shanghai Jiao Tong University, No. 800, Dongchuan Road, Shanghai, 200240, China
- Alexander F Heimann
- Department of Orthopaedic Surgery, HFR Cantonal Hospital, University of Fribourg, Fribourg, Switzerland
- Moritz Tannast
- Department of Orthopaedic Surgery, HFR Cantonal Hospital, University of Fribourg, Fribourg, Switzerland
- Guoyan Zheng
- Institute of Medical Robotics, School of Biomedical Engineering, Shanghai Jiao Tong University, No. 800, Dongchuan Road, Shanghai, 200240, China
6
Liu Z, Kainth K, Zhou A, Deyer TW, Fayad ZA, Greenspan H, Mei X. A review of self-supervised, generative, and few-shot deep learning methods for data-limited magnetic resonance imaging segmentation. NMR Biomed 2024; 37:e5143. [PMID: 38523402] [DOI: 10.1002/nbm.5143]
Abstract
Magnetic resonance imaging (MRI) is a ubiquitous medical imaging technology with applications in disease diagnostics, intervention, and treatment planning. Accurate MRI segmentation is critical for diagnosing abnormalities, monitoring diseases, and deciding on a course of treatment. With the advent of advanced deep learning frameworks, fully automated and accurate MRI segmentation is advancing. Traditional supervised deep learning techniques have advanced tremendously, reaching clinical-level accuracy in segmentation. However, these algorithms still require a large amount of annotated data, which is often unavailable or impractical to obtain. One way to circumvent this issue is to use algorithms that exploit a limited amount of labeled data. This paper reviews such state-of-the-art algorithms that use a limited number of annotated samples. We explain the fundamental principles of self-supervised learning, generative models, few-shot learning, and semi-supervised learning and summarize their applications in cardiac, abdominal, and brain MRI segmentation. Throughout this review, we highlight algorithms that can be employed based on the quantity of annotated data available. We also present a comprehensive list of notable publicly available MRI segmentation datasets. To conclude, we discuss possible future directions of the field, including emerging algorithms such as contrastive language-image pretraining and potential combinations across the methods discussed, that can further increase the efficacy of image segmentation with limited labels.
Affiliation(s)
- Zelong Liu
- BioMedical Engineering and Imaging Institute, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Komal Kainth
- BioMedical Engineering and Imaging Institute, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Alexander Zhou
- BioMedical Engineering and Imaging Institute, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Timothy W Deyer
- East River Medical Imaging, New York, New York, USA
- Department of Radiology, Cornell Medicine, New York, New York, USA
- Zahi A Fayad
- BioMedical Engineering and Imaging Institute, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Department of Diagnostic, Molecular, and Interventional Radiology, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Hayit Greenspan
- BioMedical Engineering and Imaging Institute, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Department of Diagnostic, Molecular, and Interventional Radiology, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Xueyan Mei
- BioMedical Engineering and Imaging Institute, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Department of Diagnostic, Molecular, and Interventional Radiology, Icahn School of Medicine at Mount Sinai, New York, New York, USA
7
Kumari S, Singh P. Deep learning for unsupervised domain adaptation in medical imaging: Recent advancements and future perspectives. Comput Biol Med 2024; 170:107912. [PMID: 38219643] [DOI: 10.1016/j.compbiomed.2023.107912]
Abstract
Deep learning has demonstrated remarkable performance across various tasks in medical imaging. However, these approaches primarily focus on supervised learning, assuming that the training and testing data are drawn from the same distribution. Unfortunately, this assumption may not always hold true in practice. To address these issues, unsupervised domain adaptation (UDA) techniques have been developed to transfer knowledge from a labeled domain to a related but unlabeled domain. In recent years, significant advancements have been made in UDA, resulting in a wide range of methodologies, including feature alignment, image translation, self-supervision, and disentangled representation methods, among others. In this paper, we provide a comprehensive literature review of recent deep UDA approaches in medical imaging from a technical perspective. Specifically, we categorize current UDA research in medical imaging into six groups and further divide them into finer subcategories based on the different tasks they perform. We also discuss the respective datasets used in the studies to assess the divergence between the different domains. Finally, we discuss emerging areas and provide insights and discussions on future research directions to conclude this survey.
Affiliation(s)
- Suruchi Kumari
- Department of Computer Science and Engineering, Indian Institute of Technology Roorkee, India
- Pravendra Singh
- Department of Computer Science and Engineering, Indian Institute of Technology Roorkee, India
8
Zhang Y, Wang Y, Xu L, Yao Y, Qian W, Qi L. ST-GAN: A Swin Transformer-Based Generative Adversarial Network for Unsupervised Domain Adaptation of Cross-Modality Cardiac Segmentation. IEEE J Biomed Health Inform 2024; 28:893-904. [PMID: 38019618] [DOI: 10.1109/jbhi.2023.3336965]
Abstract
Unsupervised domain adaptation (UDA) methods have shown great potential in cross-modality medical image segmentation tasks, where target domain labels are unavailable. However, the domain shift among different image modalities remains challenging, because the conventional UDA methods are based on convolutional neural networks (CNNs), which tend to focus on the texture of images and cannot establish the global semantic relevance of features due to the locality of CNNs. This paper proposes a novel end-to-end Swin Transformer-based generative adversarial network (ST-GAN) for cross-modality cardiac segmentation. In the generator of ST-GAN, we utilize the local receptive fields of CNNs to capture spatial information and introduce the Swin Transformer to extract global semantic information, which enables the generator to better extract the domain-invariant features in UDA tasks. In addition, we design a multi-scale feature fuser to sufficiently fuse the features acquired at different stages and improve the robustness of the UDA network. We extensively evaluated our method with two cross-modality cardiac segmentation tasks on the MS-CMR 2019 dataset and the M&Ms dataset. The results of two different tasks show the validity of ST-GAN compared with the state-of-the-art cross-modality cardiac image segmentation methods.
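The abstract names a "multi-scale feature fuser" without detail. One straightforward reading, entirely an assumption about the design, is to resize the stage-wise feature maps (e.g., CNN stages carrying spatial detail and Swin stages carrying global context) to a common resolution, concatenate them, and mix with a 1x1 convolution:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiScaleFuser(nn.Module):
    """Fuse feature maps from different stages into one representation."""

    def __init__(self, in_channels, out_channels):
        super().__init__()
        # in_channels: list of channel counts, one per input stage.
        self.mix = nn.Conv2d(sum(in_channels), out_channels, kernel_size=1)

    def forward(self, feats):
        # Resize every map to the spatial size of the finest (first) map.
        target = feats[0].shape[-2:]
        up = [F.interpolate(f, size=target, mode="bilinear",
                            align_corners=False) for f in feats]
        return self.mix(torch.cat(up, dim=1))

# Hypothetical usage: fuse three stages of 64/128/256 channels into 128.
fuser = MultiScaleFuser([64, 128, 256], 128)
```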
9
Paudyal R, Shah AD, Akin O, Do RKG, Konar AS, Hatzoglou V, Mahmood U, Lee N, Wong RJ, Banerjee S, Shin J, Veeraraghavan H, Shukla-Dave A. Artificial Intelligence in CT and MR Imaging for Oncological Applications. Cancers (Basel) 2023; 15:2573. [PMID: 37174039] [PMCID: PMC10177423] [DOI: 10.3390/cancers15092573]
Abstract
Cancer care increasingly relies on imaging for patient management. The two most common cross-sectional imaging modalities in oncology are computed tomography (CT) and magnetic resonance imaging (MRI), which provide high-resolution anatomic and physiological imaging. Here we summarize recent applications of rapidly advancing artificial intelligence (AI) in CT and MRI oncological imaging, addressing the benefits and challenges of the resulting opportunities with examples. Major challenges remain, such as how best to integrate AI developments into clinical radiology practice, how to rigorously assess the accuracy of quantitative CT and MR imaging data, and how to ensure reliability for clinical utility and research integrity in oncology. Such challenges necessitate evaluating the robustness of imaging biomarkers to be included in AI developments, a culture of data sharing, and cooperation between knowledgeable academics and vendor scientists and companies operating in the radiology and oncology fields. Herein, we illustrate a few of these challenges and their solutions using novel methods for synthesizing different contrast-modality images, auto-segmentation, and image reconstruction, with examples from lung CT as well as abdomen, pelvis, and head and neck MRI. The imaging community must embrace the need for quantitative CT and MRI metrics beyond lesion size measurement. AI methods for extracting and longitudinally tracking imaging metrics from registered lesions and for understanding the tumor environment will be invaluable for interpreting disease status and treatment efficacy. This is an exciting time to work together to move the imaging field forward with narrow AI-specific tasks. New AI developments using CT and MRI datasets will be used to improve the personalized management of cancer patients.
Affiliation(s)
- Ramesh Paudyal
- Department of Medical Physics, Memorial Sloan Kettering Cancer Center, New York City, NY 10065, USA
- Akash D Shah
- Department of Radiology, Memorial Sloan Kettering Cancer Center, New York City, NY 10065, USA
- Oguz Akin
- Department of Radiology, Memorial Sloan Kettering Cancer Center, New York City, NY 10065, USA
- Richard K G Do
- Department of Radiology, Memorial Sloan Kettering Cancer Center, New York City, NY 10065, USA
- Amaresha Shridhar Konar
- Department of Medical Physics, Memorial Sloan Kettering Cancer Center, New York City, NY 10065, USA
- Vaios Hatzoglou
- Department of Radiology, Memorial Sloan Kettering Cancer Center, New York City, NY 10065, USA
- Usman Mahmood
- Department of Medical Physics, Memorial Sloan Kettering Cancer Center, New York City, NY 10065, USA
- Nancy Lee
- Department of Radiation Oncology, Memorial Sloan Kettering Cancer Center, New York City, NY 10065, USA
- Richard J Wong
- Head and Neck Service, Department of Surgery, Memorial Sloan Kettering Cancer Center, New York City, NY 10065, USA
- Harini Veeraraghavan
- Department of Medical Physics, Memorial Sloan Kettering Cancer Center, New York City, NY 10065, USA
- Amita Shukla-Dave
- Department of Medical Physics, Memorial Sloan Kettering Cancer Center, New York City, NY 10065, USA
- Department of Radiology, Memorial Sloan Kettering Cancer Center, New York City, NY 10065, USA
10
Chen J, Chen S, Wee L, Dekker A, Bermejo I. Deep learning based unpaired image-to-image translation applications for medical physics: a systematic review. Phys Med Biol 2023; 68. [PMID: 36753766] [DOI: 10.1088/1361-6560/acba74]
Abstract
Purpose. There is a growing number of publications on the application of unpaired image-to-image (I2I) translation in medical imaging. However, a systematic review covering the current state of this topic for medical physicists is lacking. The aim of this article is to provide a comprehensive review of current challenges and opportunities for medical physicists and engineers to apply I2I translation in practice. Methods and materials. The PubMed electronic database was searched using terms referring to unpaired (unsupervised) I2I translation and medical imaging. This review has been reported in compliance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement. From each full-text article, we extracted information regarding the technical and clinical applications of the methods, the Transparent Reporting for Individual Prognosis Or Diagnosis (TRIPOD) study type, the performance of the algorithms, and the accessibility of source code and pre-trained models. Results. Among 461 unique records, 55 full-text articles were included in the review. The major technical applications described in the selected literature are segmentation (26 studies), unpaired domain adaptation (18 studies), and denoising (8 studies). In terms of clinical applications, unpaired I2I translation has been used for automatic contouring of regions of interest in MRI, CT, x-ray, and ultrasound images, fast MRI or low-dose CT imaging, and CT- or MRI-only radiotherapy planning, among others. Only 5 studies validated their models using an independent test set, and none were externally validated by independent researchers. Finally, 12 articles published their source code, and only one study published its pre-trained models. Conclusion. I2I translation of medical images offers a range of valuable applications for medical physicists. However, the scarcity of external validation studies of I2I models and the shortage of publicly available pre-trained models limit the immediate applicability of the proposed methods in practice.
Affiliation(s)
- Junhua Chen
- Department of Radiation Oncology (MAASTRO), GROW School for Oncology and Developmental Biology, Maastricht University Medical Centre+, Maastricht, 6229 ET, The Netherlands
- Shenlun Chen
- Department of Radiation Oncology (MAASTRO), GROW School for Oncology and Developmental Biology, Maastricht University Medical Centre+, Maastricht, 6229 ET, The Netherlands
- Leonard Wee
- Department of Radiation Oncology (MAASTRO), GROW School for Oncology and Developmental Biology, Maastricht University Medical Centre+, Maastricht, 6229 ET, The Netherlands
- Andre Dekker
- Department of Radiation Oncology (MAASTRO), GROW School for Oncology and Developmental Biology, Maastricht University Medical Centre+, Maastricht, 6229 ET, The Netherlands
- Inigo Bermejo
- Department of Radiation Oncology (MAASTRO), GROW School for Oncology and Developmental Biology, Maastricht University Medical Centre+, Maastricht, 6229 ET, The Netherlands
11
Zhao Y, Wang X, Che T, Bao G, Li S. Multi-task deep learning for medical image computing and analysis: A review. Comput Biol Med 2023; 153:106496. [PMID: 36634599] [DOI: 10.1016/j.compbiomed.2022.106496]
Abstract
The renaissance of deep learning has provided promising solutions to various tasks. While conventional deep learning models are constructed for a single specific task, multi-task deep learning (MTDL), which is capable of simultaneously accomplishing at least two tasks, has attracted research attention. MTDL is a joint learning paradigm that harnesses the inherent correlation of multiple related tasks to achieve reciprocal benefits in improving performance, enhancing generalizability, and reducing the overall computational cost. This review focuses on advanced applications of MTDL for medical image computing and analysis. We first summarize four popular MTDL network architectures (i.e., cascaded, parallel, interacted, and hybrid). Then, we review representative MTDL-based networks for eight application areas, including the brain, eye, chest, cardiac, abdomen, musculoskeletal, pathology, and other human body regions. While MTDL-based medical image processing has flourished and demonstrated outstanding performance in many tasks, performance gaps remain in others, and accordingly we discuss the open challenges and prospective trends. For instance, in the 2018 Ischemic Stroke Lesion Segmentation challenge, the reported top Dice score of 0.51 and top recall of 0.55 achieved by the cascaded MTDL model indicate that further research efforts are in high demand to escalate the performance of current models.
Affiliation(s)
- Yan Zhao
- Beijing Advanced Innovation Center for Biomedical Engineering, School of Biological Science and Medical Engineering, Beihang University, Beijing, 100083, China
- Xiuying Wang
- School of Computer Science, The University of Sydney, Sydney, NSW, 2008, Australia
- Tongtong Che
- Beijing Advanced Innovation Center for Biomedical Engineering, School of Biological Science and Medical Engineering, Beihang University, Beijing, 100083, China
- Guoqing Bao
- School of Computer Science, The University of Sydney, Sydney, NSW, 2008, Australia
- Shuyu Li
- State Key Laboratory of Cognitive Neuroscience and Learning, Beijing Normal University, Beijing, 100875, China
12
Liu H, Zhuang Y, Song E, Xu X, Hung CC. A bidirectional multilayer contrastive adaptation network with anatomical structure preservation for unpaired cross-modality medical image segmentation. Comput Biol Med 2022; 149:105964. [PMID: 36007288] [DOI: 10.1016/j.compbiomed.2022.105964]
Abstract
Multi-modal medical image segmentation has achieved great success through supervised deep learning networks. However, because of domain shift and limited annotation information, unpaired cross-modality segmentation tasks are still challenging. Unsupervised domain adaptation (UDA) methods can alleviate the performance degradation of cross-modality segmentation through knowledge transfer between different domains, but current methods still suffer from model collapse, adversarial training instability, and mismatch of anatomical structures. To tackle these issues, we propose a bidirectional multilayer contrastive adaptation network (BMCAN) for unpaired cross-modality segmentation. A shared encoder is first adopted to learn modality-invariant encoding representations for image synthesis and segmentation simultaneously. Secondly, to retain anatomical structure consistency in cross-modality image synthesis, we present a structure-constrained cross-modality image translation approach for image alignment. Thirdly, we construct a bidirectional multilayer contrastive learning approach to preserve anatomical structures and enhance encoding representations, which uses two groups of domain-specific multilayer perceptron (MLP) networks to learn modality-specific features. Finally, a semantic information adversarial learning approach is designed to learn structural similarities of semantic outputs for output-space alignment. Our proposed method was tested on three different cross-modality segmentation tasks: brain tissue, brain tumor, and cardiac substructure segmentation. Compared with other UDA methods, experimental results show that BMCAN achieves state-of-the-art segmentation performance on these three tasks, with fewer training components and better feature representations for overcoming overfitting and domain shift. Our proposed method can efficiently reduce the annotation burden of radiologists in cross-modality image analysis.
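The multilayer contrastive objective is not spelled out in the abstract. A standard InfoNCE formulation over MLP-projected features of corresponding locations in an image and its cross-modality translation (the positive/negative pairing here is our assumption) would be:

```python
import torch
import torch.nn.functional as F

def info_nce(z_a, z_b, temperature=0.07):
    """InfoNCE loss between two batches of projected features.

    z_a, z_b: (N, D) embeddings of corresponding locations/patches,
    e.g. produced by a domain-specific MLP head from encoder features
    of an image and its cross-modality translation. Matching rows are
    positives; all other rows in the batch serve as negatives.
    """
    z_a = F.normalize(z_a, dim=1)
    z_b = F.normalize(z_b, dim=1)
    logits = z_a @ z_b.t() / temperature        # (N, N) similarities
    targets = torch.arange(z_a.size(0), device=z_a.device)
    return F.cross_entropy(logits, targets)
```

Summing such a term over several encoder layers would give the "multilayer" variant the abstract describes.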
Affiliation(s)
- Hong Liu
- Center for Biomedical Imaging and Bioinformatics, School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, 430074, China
- Yuzhou Zhuang
- Institute of Artificial Intelligence, Huazhong University of Science and Technology, Wuhan, 430074, China
- Enmin Song
- Center for Biomedical Imaging and Bioinformatics, School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, 430074, China
- Xiangyang Xu
- Center for Biomedical Imaging and Bioinformatics, School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, 430074, China
- Chih-Cheng Hung
- Center for Machine Vision and Security Research, Kennesaw State University, Marietta, GA, 30060, USA
13
Jiang J, Rimner A, Deasy JO, Veeraraghavan H. Unpaired Cross-Modality Educed Distillation (CMEDL) for Medical Image Segmentation. IEEE Trans Med Imaging 2022; 41:1057-1068. [PMID: 34855590] [PMCID: PMC9128665] [DOI: 10.1109/tmi.2021.3132291]
Abstract
Accurate and robust segmentation of lung cancers from CT, even those located close to the mediastinum, is needed to more accurately plan and deliver radiotherapy and to measure treatment response. Therefore, we developed a new cross-modality educed distillation (CMEDL) approach, using unpaired CT and MRI scans, whereby an informative teacher MRI network guides a student CT network to extract features that signal the difference between foreground and background. Our contribution eliminates two requirements of distillation methods: (i) the need for paired image sets, by using image-to-image (I2I) translation, and (ii) the need to pre-train the teacher network with a large training set, by concurrently training all networks. Our framework uses an end-to-end trained unpaired I2I translation, teacher, and student segmentation networks. The architectural flexibility of our framework is demonstrated using 3 segmentation and 2 I2I networks. Networks were trained with 377 CT and 82 T2w MRI from different sets of patients, with independent validation (N = 209 tumors) and testing (N = 609 tumors) datasets. Network design, methods to combine MRI with CT information, distillation learning under informative (MRI to CT), weak (CT to MRI), and equal teacher (MRI to MRI) settings, and ablation tests were performed. Accuracy was measured using Dice similarity (DSC), surface Dice (sDSC), and Hausdorff distance at the 95th percentile (HD95). The CMEDL approach was significantly (p < 0.001) more accurate than non-CMEDL methods with an informative teacher for CT lung tumor (DSC of 0.77 vs. 0.73), with a weak teacher for MRI lung tumor (DSC of 0.84 vs. 0.81), and with an equal teacher for MRI multi-organ segmentation (DSC of 0.90 vs. 0.88). CMEDL also reduced inter-rater lung tumor segmentation variabilities.
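As a sketch of the feature-distillation idea (the matching criterion here is our assumption; the paper may use a different statistic), the student CT network can be regularized toward teacher MRI features computed on the translated counterpart of the same image:

```python
import torch.nn.functional as F

def feature_distillation_loss(student_feats, teacher_feats):
    """Match student (CT) features to teacher (MRI) features (a sketch).

    student_feats/teacher_feats: lists of feature maps from corresponding
    layers; the teacher sees the pseudo-MRI produced by I2I translation
    of the same CT, so spatial correspondence holds. The teacher is not
    updated through this loss (hence the detach).
    """
    return sum(
        F.mse_loss(s, t.detach()) for s, t in zip(student_feats, teacher_feats)
    ) / len(student_feats)
```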
14
Jiang J, Veeraraghavan H. One shot PACS: Patient specific Anatomic Context and Shape prior aware recurrent registration-segmentation of longitudinal thoracic cone beam CTs. IEEE Trans Med Imaging 2022. [PMID: 35213307] [PMCID: PMC9642320] [DOI: 10.1109/tmi.2022.3154934]
Abstract
Image-guided adaptive lung radiotherapy requires accurate tumor and organ segmentation from during-treatment cone-beam CT (CBCT) images. Thoracic CBCTs are hard to segment because of low soft-tissue contrast, imaging artifacts, respiratory motion, and large treatment-induced intra-thoracic anatomic changes. Hence, we developed a novel Patient-specific Anatomic Context and Shape prior (PACS)-aware 3D recurrent registration-segmentation network for longitudinal thoracic CBCT segmentation. Segmentation and registration networks were concurrently trained in an end-to-end framework and implemented with convolutional long short-term memory models. The registration network was trained in an unsupervised manner using pairs of planning CT (pCT) and CBCT images and produced a progressively deformed sequence of images. The segmentation network was optimized in a one-shot setting by combining the progressively deformed pCT (anatomic context) and pCT delineations (shape context) with the CBCT images. Our method, one-shot PACS, was significantly more accurate (p < 0.001) than multiple other methods for segmentation of the tumor (DSC of 0.83 ± 0.08, surface DSC [sDSC] of 0.97 ± 0.06, and Hausdorff distance at the 95th percentile [HD95] of 3.97 ± 3.02 mm) and the esophagus (DSC of 0.78 ± 0.13, sDSC of 0.90 ± 0.14, HD95 of 3.22 ± 2.02 mm). Ablation tests and comparative experiments were also performed.
15
Hill CE, Biasiolli L, Robson MD, Grau V, Pavlides M. Emerging artificial intelligence applications in liver magnetic resonance imaging. World J Gastroenterol 2021; 27:6825-6843. [PMID: 34790009] [PMCID: PMC8567471] [DOI: 10.3748/wjg.v27.i40.6825]
Abstract
Chronic liver diseases (CLDs) are becoming increasingly prevalent in modern society. The use of imaging techniques for early detection, such as magnetic resonance imaging (MRI), is crucial in reducing the impact of these diseases on healthcare systems. Artificial intelligence (AI) algorithms have been shown over the past decade to excel at image-based analysis tasks such as detection and segmentation. When applied to liver MRI, they have the potential to improve clinical decision making and increase throughput by automating analyses. With liver diseases becoming more prevalent, the need to implement these techniques and use liver MRI to its full potential is paramount. In this review, we report on the current methods and applications of AI in liver MRI, with a focus on machine learning and deep learning methods. We assess four main themes, segmentation, classification, image synthesis, and artefact detection, and their respective potential in liver MRI and the wider clinic. We provide a brief explanation of some of the algorithms used and explore the current challenges affecting the field. Though there are many hurdles to overcome in implementing AI methods in the clinic, we conclude that AI methods have the potential to positively aid healthcare professionals for years to come.
Affiliation(s)
- Charles E Hill
- Department of Engineering Science, University of Oxford, Oxford OX3 7DQ, United Kingdom
- Luca Biasiolli
- Radcliffe Department of Medicine, University of Oxford, Oxford OX3 9DU, United Kingdom
- Vicente Grau
- Department of Engineering, University of Oxford, Oxford OX3 7DQ, United Kingdom
- Michael Pavlides
- Oxford Centre for Clinical Magnetic Resonance Research, Division of Cardiovascular Medicine, Radcliffe Department of Medicine, University of Oxford, Oxford OX3 9DU, United Kingdom
- Translational Gastroenterology Unit, University of Oxford, Oxford OX3 9DU, United Kingdom
- Oxford NIHR Biomedical Research Centre, University of Oxford, Oxford OX3 9DU, United Kingdom
16
Jiang J, Riyahi Alam S, Chen I, Zhang P, Rimner A, Deasy JO, Veeraraghavan H. Deep cross-modality (MR-CT) educed distillation learning for cone beam CT lung tumor segmentation. Med Phys 2021; 48:3702-3713. [PMID: 33905558] [DOI: 10.1002/mp.14902]
Abstract
PURPOSE Despite the widespread availability of in-treatment-room cone beam computed tomography (CBCT) imaging, due to the lack of reliable segmentation methods, CBCT is only used for gross setup corrections in lung radiotherapy. Accurate and reliable auto-segmentation tools could potentiate volumetric response assessment and geometry-guided adaptive radiation therapy. Therefore, we developed a new deep learning CBCT lung tumor segmentation method. METHODS The key idea of our approach, called cross-modality educed distillation (CMEDL), is to use magnetic resonance imaging (MRI) to guide the training of a CBCT segmentation network so that it extracts more informative features. We accomplish this by training an end-to-end network comprised of unpaired domain adaptation (UDA) and cross-domain segmentation distillation networks (SDNs) using unpaired CBCT and MRI datasets. The UDA approach uses CBCT and MRI that are not aligned and may arise from different sets of patients. The UDA network synthesizes pseudo MRI from CBCT images. The SDN consists of teacher MRI and student CBCT segmentation networks. Feature distillation regularizes the student network to extract CBCT features that match the statistical distribution of MRI features extracted by the teacher network and to obtain better differentiation of tumor from background. The UDA network was implemented with a cycleGAN improved with contextual losses, separately on Unet and dense fully convolutional segmentation networks (DenseFCN). Performance comparisons were done against CBCT-only training using 2D and 3D networks. We also compared against an alternative framework that used UDA with an MR segmentation network, whereby segmentation was done on the synthesized pseudo MRI representation. All networks were trained with 216 weekly CBCTs and 82 T2-weighted turbo spin echo MRIs acquired from different patient cohorts. Validation was done on 20 weekly CBCTs from patients not used in training. Independent testing was done on 38 weekly CBCTs from patients not used in training or validation. Segmentation accuracy was measured using the surface Dice similarity coefficient (SDSC) and Hausdorff distance at the 95th percentile (HD95). RESULTS The CMEDL approach significantly improved (p < 0.001) the accuracy of both Unet (SDSC of 0.83 ± 0.08; HD95 of 7.69 ± 7.86 mm) and DenseFCN (SDSC of 0.75 ± 0.13; HD95 of 11.42 ± 9.87 mm) over the CBCT-only 2D Unet (SDSC of 0.69 ± 0.11; HD95 of 21.70 ± 16.34 mm), 3D Unet (SDSC of 0.72 ± 0.20; HD95 of 15.01 ± 12.98 mm), and DenseFCN (SDSC of 0.66 ± 0.15; HD95 of 22.15 ± 17.19 mm) networks. The alternative framework using UDA with the MRI network was also more accurate than the CBCT-only methods but less accurate than the CMEDL approach. CONCLUSIONS Our results demonstrate the feasibility of the introduced CMEDL approach to produce reasonably accurate lung cancer segmentation from CBCT images. Further validation on larger datasets is necessary for clinical translation.
Affiliation(s)
- Jue Jiang
- Department of Medical Physics, Memorial Sloan Kettering Cancer Center, 1275 York Avenue, New York, NY 10065, USA
- Sadegh Riyahi Alam
- Department of Medical Physics, Memorial Sloan Kettering Cancer Center, 1275 York Avenue, New York, NY 10065, USA
- Ishita Chen
- Department of Radiation Oncology, Memorial Sloan Kettering Cancer Center, 1275 York Avenue, New York, NY 10065, USA
- Perry Zhang
- Department of Medical Physics, Memorial Sloan Kettering Cancer Center, 1275 York Avenue, New York, NY 10065, USA
- Andreas Rimner
- Department of Radiation Oncology, Memorial Sloan Kettering Cancer Center, 1275 York Avenue, New York, NY 10065, USA
- Joseph O Deasy
- Department of Medical Physics, Memorial Sloan Kettering Cancer Center, 1275 York Avenue, New York, NY 10065, USA
- Harini Veeraraghavan
- Department of Medical Physics, Memorial Sloan Kettering Cancer Center, 1275 York Avenue, New York, NY 10065, USA
17
Choi W, Nadeem S, Alam SR, Deasy JO, Tannenbaum A, Lu W. Reproducible and Interpretable Spiculation Quantification for Lung Cancer Screening. Comput Methods Programs Biomed 2021; 200:105839. [PMID: 33221055] [PMCID: PMC7920914] [DOI: 10.1016/j.cmpb.2020.105839]
Abstract
Spiculations, spikes on the surface of pulmonary nodules, are important predictors of lung cancer malignancy. In this study, we proposed an interpretable and parameter-free technique to quantify spiculation using an area distortion metric obtained by conformal (angle-preserving) spherical parameterization. We exploit the insight that, for an angle-preserving spherical mapping of a given nodule, the corresponding negative area distortion precisely characterizes the spiculations on that nodule. We introduced novel spiculation scores based on the area distortion metric and spiculation measures. We also semi-automatically segment the lung nodule (for reproducibility) as well as vessel and wall attachments, to differentiate real spiculations from lobulation and attachment. A simple pathological malignancy prediction model is also introduced. We used the pathologist (strong-label) and radiologist (weak-label) ratings of the publicly available LIDC-IDRI dataset to train and test radiomics models containing this feature, and then externally validated the models. We achieved AUC = 0.80 and 0.76, respectively, with models trained on the 811 weakly labeled LIDC datasets and tested on the 72 strongly labeled LIDC and 73 LUNGx datasets; the previous best model for LUNGx had AUC = 0.68. The number-of-spiculations feature was found to be highly correlated (Spearman's rank correlation coefficient ρ = 0.44) with the radiologists' spiculation score. In summary, we developed a reproducible, interpretable, and parameter-free technique for quantifying spiculations on nodules. The spiculation quantification measures were then applied in a radiomics framework for pathological malignancy prediction with reproducible semi-automatic segmentation of the nodule. Using our interpretable features (size, attachment, spiculation, lobulation), we were able to achieve higher performance than previous models. In the future, we will exhaustively test our model for lung cancer screening in the clinic.
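Given a nodule surface mesh and its conformal spherical parameterization (assumed precomputed by some existing tool), the per-triangle area distortion and a toy spiculation score can be computed as below; the paper's actual spiculation scores are more elaborate, so treat this purely as an illustration of the negative-distortion insight.

```python
import numpy as np

def triangle_areas(vertices, faces):
    """Areas of mesh triangles. vertices: (V, 3); faces: (F, 3) indices."""
    a = vertices[faces[:, 1]] - vertices[faces[:, 0]]
    b = vertices[faces[:, 2]] - vertices[faces[:, 0]]
    return 0.5 * np.linalg.norm(np.cross(a, b), axis=1)

def area_distortion(mesh_v, sphere_v, faces, eps=1e-12):
    """Per-triangle log area distortion of a conformal spherical map.

    mesh_v: nodule surface vertices; sphere_v: their images on the unit
    sphere. Areas are normalized so both surfaces have unit total area.
    """
    a_mesh = triangle_areas(mesh_v, faces)
    a_sph = triangle_areas(sphere_v, faces)
    a_mesh = a_mesh / a_mesh.sum()
    a_sph = a_sph / a_sph.sum()
    return np.log((a_sph + eps) / (a_mesh + eps))

def spiculation_score(distortion):
    """Toy score: total negative distortion. Sharp spikes on the nodule
    get squeezed on the sphere, producing strongly negative values."""
    return -distortion[distortion < 0].sum()
```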
Affiliation(s)
- Wookjin Choi
- Department of Medical Physics, Memorial Sloan Kettering Cancer Center, 1275 York Ave, New York, NY 10065, USA; Department of Engineering and Computer Science, Virginia State University, 1 Hayden St, Petersburg, VA 23806, USA
- Saad Nadeem
- Department of Medical Physics, Memorial Sloan Kettering Cancer Center, 1275 York Ave, New York, NY 10065, USA
- Sadegh R Alam
- Department of Medical Physics, Memorial Sloan Kettering Cancer Center, 1275 York Ave, New York, NY 10065, USA
- Joseph O Deasy
- Department of Medical Physics, Memorial Sloan Kettering Cancer Center, 1275 York Ave, New York, NY 10065, USA
- Allen Tannenbaum
- Departments of Computer Science and Applied Mathematics & Statistics, Stony Brook University, Stony Brook, NY 11790, USA
- Wei Lu
- Department of Medical Physics, Memorial Sloan Kettering Cancer Center, 1275 York Ave, New York, NY 10065, USA
18
Barragán-Montero A, Javaid U, Valdés G, Nguyen D, Desbordes P, Macq B, Willems S, Vandewinckele L, Holmström M, Löfman F, Michiels S, Souris K, Sterpin E, Lee JA. Artificial intelligence and machine learning for medical imaging: A technology review. Phys Med 2021; 83:242-256. [PMID: 33979715] [PMCID: PMC8184621] [DOI: 10.1016/j.ejmp.2021.04.016]
Abstract
Artificial intelligence (AI) has recently become a very popular buzzword as a consequence of disruptive technical advances and impressive experimental results, notably in the field of image analysis and processing. In medicine, specialties where images are central, such as radiology, pathology, and oncology, have seized the opportunity, and considerable research and development efforts have been deployed to transfer the potential of AI to clinical applications. With AI becoming a more mainstream tool for typical medical imaging analysis tasks, such as diagnosis, segmentation, or classification, the key to safe and efficient use of clinical AI applications relies, in part, on informed practitioners. The aim of this review is to present the basic technological pillars of AI, together with state-of-the-art machine learning methods and their application to medical imaging. In addition, we discuss new trends and future research directions. This will help the reader understand how AI methods are becoming a ubiquitous tool in any medical image analysis workflow and pave the way for the clinical implementation of AI-based solutions.
Affiliation(s)
- Ana Barragán-Montero
- Molecular Imaging, Radiation and Oncology (MIRO) Laboratory, UCLouvain, Belgium
- Umair Javaid
- Molecular Imaging, Radiation and Oncology (MIRO) Laboratory, UCLouvain, Belgium
- Gilmer Valdés
- Department of Radiation Oncology, Department of Epidemiology and Biostatistics, University of California, San Francisco, USA
- Dan Nguyen
- Medical Artificial Intelligence and Automation (MAIA) Laboratory, Department of Radiation Oncology, UT Southwestern Medical Center, USA
- Paul Desbordes
- Information and Communication Technologies, Electronics and Applied Mathematics (ICTEAM), UCLouvain, Belgium
- Benoit Macq
- Information and Communication Technologies, Electronics and Applied Mathematics (ICTEAM), UCLouvain, Belgium
- Siri Willems
- ESAT/PSI, KU Leuven Belgium & MIRC, UZ Leuven, Belgium
- Steven Michiels
- Molecular Imaging, Radiation and Oncology (MIRO) Laboratory, UCLouvain, Belgium
- Kevin Souris
- Molecular Imaging, Radiation and Oncology (MIRO) Laboratory, UCLouvain, Belgium
- Edmond Sterpin
- Molecular Imaging, Radiation and Oncology (MIRO) Laboratory, UCLouvain, Belgium; KU Leuven, Department of Oncology, Laboratory of Experimental Radiotherapy, Belgium
- John A Lee
- Molecular Imaging, Radiation and Oncology (MIRO) Laboratory, UCLouvain, Belgium
19
Brion E, Léger J, Barragán-Montero AM, Meert N, Lee JA, Macq B. Domain adversarial networks and intensity-based data augmentation for male pelvic organ segmentation in cone beam CT. Comput Biol Med 2021; 131:104269. [PMID: 33639352] [DOI: 10.1016/j.compbiomed.2021.104269]
Abstract
In radiation therapy, a CT image is used to manually delineate the organs and plan the treatment. During the treatment, a cone beam CT (CBCT) is often acquired to monitor anatomical modifications. For this purpose, automatic organ segmentation on CBCT is a crucial step. However, manual segmentations on CBCT are scarce, and models trained with CT data do not generalize well to CBCT images. We investigate adversarial networks and intensity-based data augmentation, two strategies that leverage large databases of annotated CTs to train neural networks for segmentation on CBCT. The adversarial networks consist of a 3D U-Net segmenter and a domain classifier; the proposed framework is aimed at encouraging the learning of filters that produce more accurate segmentations on CBCT. Intensity-based data augmentation consists in modifying the training CT images to reduce the gap between the CT and CBCT distributions. The proposed adversarial networks reach DSCs of 0.787, 0.447, and 0.660 for the bladder, rectum, and prostate, respectively, an improvement over the DSCs of 0.749, 0.179, and 0.629 for "source only" training. Our brightness-based data augmentation reaches DSCs of 0.837, 0.701, and 0.734, which outperforms the morphons registration algorithm for the bladder (0.813) and rectum (0.653), while performing similarly on the prostate (0.731). The proposed adversarial training framework can be used for any segmentation application where the training and test distributions differ. Our intensity-based data augmentation can be used for CBCT segmentation to help achieve the prescribed dose on the target and lower the dose delivered to healthy organs.
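The abstract specifies a U-Net segmenter plus a domain classifier but not how the adversarial signal is wired. One common realization uses a gradient reversal layer, sketched here for 2D slices together with a crude brightness augmentation; the function names, the GRL choice, and the unit loss weighting are assumptions, not the paper's implementation.

```python
import torch
import torch.nn.functional as F

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; negated, scaled gradient backward."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None

def adversarial_step(encoder, seg_head, domain_clf, ct, labels, cbct, lam=1.0):
    """One step: supervised loss on labeled CT, domain confusion on both.

    domain_clf is assumed to pool its input and emit (N, 2) logits.
    The reversed gradient pushes the encoder toward features on which
    CT and CBCT cannot be told apart, i.e. domain-invariant filters.
    """
    f_ct, f_cbct = encoder(ct), encoder(cbct)
    seg_loss = F.cross_entropy(seg_head(f_ct), labels)
    dom_logits = domain_clf(torch.cat([GradReverse.apply(f_ct, lam),
                                       GradReverse.apply(f_cbct, lam)]))
    dom_labels = torch.cat([torch.zeros(ct.size(0)),
                            torch.ones(cbct.size(0))]).long().to(ct.device)
    return seg_loss + F.cross_entropy(dom_logits, dom_labels)

def brightness_augment(ct, max_shift=0.2):
    """Intensity-based augmentation: a random brightness shift per image
    (assumes (B, C, H, W) slices), a crude stand-in for the intensity
    transforms used to nudge CT intensities toward the CBCT distribution."""
    shift = (torch.rand(ct.size(0), 1, 1, 1, device=ct.device) * 2 - 1) * max_shift
    return ct + shift
```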
Affiliation(s)
- Eliott Brion
- ICTEAM, UCLouvain, Louvain-la-Neuve, 1348, Belgium
- Jean Léger
- ICTEAM, UCLouvain, Louvain-la-Neuve, 1348, Belgium
- Nicolas Meert
- Hôpital André Vésale, Montigny-le-Tilleul, 6110, Belgium
- John A Lee
- ICTEAM, UCLouvain, Louvain-la-Neuve, 1348, Belgium; IREC/MIRO, UCLouvain, Brussels, 1200, Belgium
- Benoit Macq
- ICTEAM, UCLouvain, Louvain-la-Neuve, 1348, Belgium