1
Makkar V, Tewary A, Rathish Kumar BV, Pandey RK. Punctured window based multiscale line detector for efficient segmentation of retinal blood vessels. Comput Biol Med 2025; 191:110155. [PMID: 40245689] [DOI: 10.1016/j.compbiomed.2025.110155]
Abstract
Changes in the retinal vasculature can help diagnose diseases like diabetes, hypertension, and arteriosclerosis. To enable ophthalmologists to provide an efficient diagnosis and further reduce the cost of treatment, we propose an automated algorithm for the segmentation of the retinal vasculature. The line detector is a classic approach for detecting or segmenting vessel-like structures, but it has a fundamental flaw in how it estimates the background intensity around a pixel. In this work, we highlight and rectify that issue in the classic line detector by introducing the idea of punctured windows, which improves the detector's ability to identify minor vessels in low-contrast regions. First, the image is denoised using a fractional filter. Then, the line detector with a punctured window is used to compute line responses at multiple scales. The final response is computed as the arithmetic mean of the responses at the different scales and the underlying image intensity. Finally, hysteresis thresholding is applied to obtain the segmented vessels. Most methods in the literature are evaluated only on the DRIVE and STARE datasets, using performance metrics that are biased by class imbalance, and many are inconsistent either across datasets or across metrics. The proposed algorithm is tested on four publicly available datasets, namely RC-SLO, STARE, CHASE_DB1, and DRIVE, using several performance metrics that are unaffected by the class imbalance prevalent in vessel classification problems. The proposed technique is comparable with state-of-the-art methods and outperforms many of them.
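A minimal NumPy/SciPy sketch of the pipeline this abstract describes is given below. The puncture size, window scales, and hysteresis thresholds are illustrative assumptions (the exact values are not stated in the abstract), the fractional-filter denoising step is omitted, and the input is assumed to be a normalized green-channel image in [0, 1].

```python
import numpy as np
from scipy import ndimage
from skimage.filters import apply_hysteresis_threshold

def line_response(img, length, n_angles=12):
    """Max oriented-line average minus a punctured-window background mean."""
    half = length // 2
    # Punctured window: square window with the central neighbourhood
    # removed when estimating the background intensity around a pixel.
    punct = np.ones((length, length))
    punct[half - 1:half + 2, half - 1:half + 2] = 0.0  # assumed puncture size
    background = ndimage.convolve(img, punct / punct.sum())
    best = np.full(img.shape, -np.inf)
    for theta in np.linspace(0.0, 180.0, n_angles, endpoint=False):
        line = np.zeros((length, length))
        line[half, :] = 1.0
        kernel = ndimage.rotate(line, theta, reshape=False, order=1)
        kernel /= kernel.sum()
        best = np.maximum(best, ndimage.convolve(img, kernel) - background)
    return best

def segment_vessels(green, scales=(7, 11, 15)):
    """Fuse multiscale responses with the image intensity, then threshold."""
    responses = [line_response(green, s) for s in scales]
    fused = (np.sum(responses, axis=0) + green) / (len(responses) + 1)
    return apply_hysteresis_threshold(fused, 0.02, 0.05)  # illustrative thresholds
```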
Affiliation(s)
- Varun Makkar
- Department of Mathematical Sciences, Indian Institute of Technology (BHU) Varanasi, Varanasi 221005, Uttar Pradesh, India
- Arya Tewary
- Department of Mathematical Sciences, Indian Institute of Technology (BHU) Varanasi, Varanasi 221005, Uttar Pradesh, India
- B V Rathish Kumar
- Department of Mathematics and Statistics, Indian Institute of Technology Kanpur, Kalyanpur 208016, Kanpur, India
- Rajesh K Pandey
- Department of Mathematical Sciences, Indian Institute of Technology (BHU) Varanasi, Varanasi 221005, Uttar Pradesh, India.
2
Radha K, Karuna Y. Latent space autoencoder generative adversarial model for retinal image synthesis and vessel segmentation. BMC Med Imaging 2025; 25:149. [PMID: 40325399] [DOI: 10.1186/s12880-025-01694-1]
Abstract
Diabetes is a widespread condition that can lead to serious vision problems over time. Timely identification and treatment of diabetic retinopathy (DR) depend on accurately segmenting retinal vessels, which can be achieved through the non-invasive technique of fundus imaging. This methodology facilitates the systematic monitoring and assessment of DR progression. In recent years, deep learning has made significant strides in various fields, including medical image processing, and numerous algorithms have been developed for segmenting retinal vessels in fundus images with excellent performance. However, it is widely recognized that large datasets are essential for training deep learning models that generalize well, and a major challenge in retinal vessel segmentation is the lack of ground-truth samples to train these models. To overcome this, we aim to generate synthetic data. This work draws inspiration from recent advancements in generative adversarial networks (GANs). Our goal is to generate multiple realistic retinal fundus images from tubular structured annotations while simultaneously creating binary masks from retinal fundus images. We integrate a latent space autoencoder to maintain vessel morphology when generating RGB fundus images and mask images. This approach can synthesize diverse images from a single tubular structured annotation and generate various tubular structures from a single fundus image. To test our method, we utilized three primary datasets, DRIVE, STARE, and CHASE_DB, to generate synthetic data. We then trained and tested a simple UNet model for segmentation using this synthetic data and compared its performance against the standard dataset. The results indicate that the synthetic data offer excellent segmentation performance, a crucial aspect in medical image analysis, where small datasets are common. This demonstrates the potential of synthetic data as a valuable resource for training segmentation and classification models for disease diagnosis. Overall, we used the DRIVE, STARE, and CHASE_DB datasets to synthesize and evaluate the proposed image-to-image translation approach and its segmentation effectiveness.
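A heavily simplified PyTorch sketch of the two translation directions described above: one generator maps a vessel annotation to a fundus image, another maps a fundus image back to a binary mask. The cycle term below stands in for the paper's latent-space autoencoder constraint, and all module names, loss terms, and weights are assumptions, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def generator_step(g_img, g_mask, d_img, vessel_mask, fundus):
    """One simplified generator update for mask->image and image->mask.
    vessel_mask and fundus are assumed to be float tensors in [0, 1]."""
    fake_fundus = g_img(vessel_mask)            # synthesize an RGB fundus image
    logits = d_img(fake_fundus)                 # adversarial realism term
    adv = F.binary_cross_entropy_with_logits(logits, torch.ones_like(logits))
    seg = F.binary_cross_entropy_with_logits(g_mask(fundus), vessel_mask)
    # Morphology consistency: the synthetic image should map back to the
    # annotation it was generated from (proxy for the latent AE constraint).
    cyc = F.binary_cross_entropy_with_logits(g_mask(fake_fundus), vessel_mask)
    return adv + seg + cyc
```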
Affiliation(s)
- K Radha
- School of Electronics Engineering, Vellore Institute of Technology, Vellore, India
- Yepuganti Karuna
- School of Electronics Engineering, VIT-AP University, Amaravati, India.
3
Wang Y, Fan L, Pagnucco M, Song Y. Unsupervised domain adaptation with multi-level distillation boost and adaptive mask for medical image segmentation. Comput Biol Med 2025; 190:110055. [PMID: 40158461] [DOI: 10.1016/j.compbiomed.2025.110055]
Abstract
The mean-teacher (MT) framework has emerged as a commonly used approach in unsupervised domain adaptation (UDA) tasks. Existing methods primarily focus on aligning the outputs of the student and teacher networks using guidance from the teacher network's multi-layer features. To build on the potential of the MT framework, we propose a framework named Multi-Level Distillation Boost (MLDB). It combines Self-Knowledge Distillation and Dual-Directional Knowledge Distillation to align predictions between the intermediate and high-level features of the student and teacher networks. Additionally, considering the complex variability in anatomical structures, foregrounds, and backgrounds across different domains of medical images, we introduce an Adaptive Masked Image Consistency (AMIC) approach. It provides a customized masking strategy to augment images from the source and target domains, using varying mask ratios and sizes to improve the adaptability and efficacy of data augmentation. Our experiments on fundus and polyp datasets indicate that the proposed methods achieve competitive performance: 95.2%/86.1% and 97.3%/89.0% Dice scores for the optic disc/cup on REFUGE→RIM and REFUGE→Drishti-GS, and 78.3% and 86.2% for polyps on Kvasir→ETIS and Kvasir→Endo, respectively. The code is available at https://github.com/Yongze/MLDB_AMIC.
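The adaptive masking idea lends itself to a compact sketch. Below is a hedged PyTorch illustration of a masked-image consistency step inside a mean-teacher loop; the per-domain mask ratio, patch size, and EMA decay are assumptions, and the authors' full MLDB/AMIC implementation is in the linked repository.

```python
import torch
import torch.nn.functional as F

def random_patch_mask(x, mask_ratio=0.5, patch=16):
    """Zero out a random fraction of square patches (H, W divisible by patch)."""
    b, _, h, w = x.shape
    keep = (torch.rand(b, 1, h // patch, w // patch, device=x.device) > mask_ratio)
    keep = keep.float().repeat_interleave(patch, 2).repeat_interleave(patch, 3)
    return x * keep

@torch.no_grad()
def ema_update(teacher, student, decay=0.99):
    """Teacher weights follow an exponential moving average of the student."""
    for tp, sp in zip(teacher.parameters(), student.parameters()):
        tp.mul_(decay).add_(sp, alpha=1.0 - decay)

def masked_consistency_loss(student, teacher, x_target, mask_ratio):
    """Student on the masked image must match the teacher on the clean image."""
    with torch.no_grad():
        pseudo = torch.sigmoid(teacher(x_target))
    pred = torch.sigmoid(student(random_patch_mask(x_target, mask_ratio)))
    return F.mse_loss(pred, pseudo)
```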
Affiliation(s)
- Yongze Wang
- School of Computer Science and Engineering, University of New South Wales, Australia.
- Lei Fan
- School of Computer Science and Engineering, University of New South Wales, Australia.
- Maurice Pagnucco
- School of Computer Science and Engineering, University of New South Wales, Australia.
- Yang Song
- School of Computer Science and Engineering, University of New South Wales, Australia.
4
Yang B, Han H, Zhang W, Li H. General retinal image enhancement via reconstruction: Bridging distribution shifts using latent diffusion adaptors. Med Image Anal 2025; 103:103603. [PMID: 40300379] [DOI: 10.1016/j.media.2025.103603]
Abstract
Deep learning-based fundus image enhancement has attracted extensive research attention recently and has shown remarkable effectiveness in improving the visibility of low-quality images. However, these methods are often constrained to specific datasets and degradations, which leads to poor generalization and makes fine-tuning difficult. Therefore, a general method for fundus image enhancement is proposed for improved generalizability and flexibility; it decomposes the enhancement task into reconstruction and adaptation phases. In the reconstruction phase, self-supervised training with unpaired data is employed, allowing extensive public datasets to be used to improve the generalizability of the model. During the adaptation phase, the model is fine-tuned to the target datasets and their degradations, starting from the pre-trained reconstruction weights. The proposed method improves the feasibility of latent diffusion models for retinal image enhancement. An adaptation loss and an enhancement adaptor are proposed for the autoencoders and diffusion networks, requiring fewer paired training samples, fewer trainable parameters, and less time to converge than training from scratch. The proposed method can be easily fine-tuned, and experiments demonstrate its adaptability to different datasets and degradations. Additionally, the reconstruction-adaptation framework can be applied to different backbones and other modalities, which shows its generality.
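A minimal sketch of the adaptation idea, assuming (as is typical for adaptor-based fine-tuning, though not spelled out in the abstract) that the pretrained reconstruction weights are frozen and only small adaptor layers are updated. The `adaptor` naming convention is hypothetical.

```python
import torch

def mark_adaptors_trainable(model, keyword="adaptor"):
    """Freeze the pretrained backbone; leave only adaptor parameters trainable."""
    for name, param in model.named_parameters():
        param.requires_grad = keyword in name

# Usage: only the (few) adaptor parameters reach the optimizer, which is
# what yields the fewer-trainable-parameters fine-tuning the abstract describes.
# mark_adaptors_trainable(latent_diffusion_model)
# optim = torch.optim.Adam(
#     (p for p in latent_diffusion_model.parameters() if p.requires_grad), lr=1e-4)
```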
Affiliation(s)
- Bingyu Yang
- Beijing Institute of Technology, Beijing, 100081, China
- Haonan Han
- Beijing Institute of Technology, Beijing, 100081, China
- Weihang Zhang
- Beijing Institute of Technology, Beijing, 100081, China
- Huiqi Li
- Beijing Institute of Technology, Beijing, 100081, China.
5
Xing H, Sun R, Ren J, Wei J, Feng CM, Ding X, Guo Z, Wang Y, Hu Y, Wei W, Ban X, Xie C, Tan Y, Liu X, Cui S, Duan X, Li Z. Achieving flexible fairness metrics in federated medical imaging. Nat Commun 2025; 16:3342. [PMID: 40199877] [PMCID: PMC11978761] [DOI: 10.1038/s41467-025-58549-0]
Abstract
The rapid adoption of Artificial Intelligence (AI) in medical imaging raises fairness and privacy concerns across demographic groups, especially in diagnosis and treatment decisions. While federated learning (FL) offers decentralized privacy preservation, current frameworks often prioritize collaboration fairness over group fairness, risking healthcare disparities. Here we present FlexFair, an innovative FL framework designed to address both fairness and privacy challenges. FlexFair incorporates a flexible regularization term to facilitate the integration of multiple fairness criteria, including equal accuracy, demographic parity, and equal opportunity. Evaluated across four clinical applications (polyp segmentation, fundus vascular segmentation, cervical cancer segmentation, and skin disease diagnosis), FlexFair outperforms state-of-the-art methods in both fairness and accuracy. Moreover, we curate a multi-center dataset for cervical cancer segmentation that includes 678 patients from four hospitals. This diverse dataset allows for a more comprehensive analysis of model performance across different population groups, ensuring the findings are applicable to a broader range of patients.
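The "flexible regularization term" admits a compact illustration. Below is a hedged sketch in which the training objective adds a penalty on the dispersion of per-group losses, a proxy for the "equal accuracy" criterion; the paper's exact regularizer and its weighting may differ.

```python
import torch

def fairness_penalty(per_group_losses):
    """Penalize dispersion of per-group losses across demographic groups."""
    losses = torch.stack(per_group_losses)
    return ((losses - losses.mean()) ** 2).mean()

def flexfair_objective(per_group_losses, lam=1.0):
    """Task loss plus a flexible fairness regularizer weighted by lam."""
    task = torch.stack(per_group_losses).mean()
    return task + lam * fairness_penalty(per_group_losses)
```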
Grants
- This work was supported by Shenzhen-Hong Kong Joint Funding No. SGDX20211123112401002, by NSFC with Grant No. 62293482, by the Basic Research Project No. HZQB-KCZYZ-2021067 of Hetao Shenzhen HK S&T Cooperation Zone, by Shenzhen General Program No. JCYJ20220530143600001, by the Shenzhen Outstanding Talents Training Fund 202002, by Guangdong Research Projects No. 2017ZT07X152 and No. 2019CX01X104, by the Guangdong Provincial Key Laboratory of Future Networks of Intelligence (Grant No. 2022B1212010001), by the Guangdong Provincial Key Laboratory of Big Data Computing, CUHK-Shenzhen, by the NSFC 61931024 & 12326610, by the Key Area R&D Program of Guangdong Province with grant No. 2018B030338001, by the Shenzhen Key Laboratory of Big Data and Artificial Intelligence (Grant No. ZDSYS201707251409055), and by Tencent & Huawei Open Fund.
Affiliation(s)
- Huijun Xing
- Shenzhen Future Network of Intelligence Institute and Guangdong Provincial Key Laboratory of Future Networks of Intelligence, The Chinese University of Hong Kong (Shenzhen), Shenzhen, China
- School of Science and Engineering, The Chinese University of Hong Kong (Shenzhen), Shenzhen, China
- Rui Sun
- Shenzhen Future Network of Intelligence Institute and Guangdong Provincial Key Laboratory of Future Networks of Intelligence, The Chinese University of Hong Kong (Shenzhen), Shenzhen, China
- School of Science and Engineering, The Chinese University of Hong Kong (Shenzhen), Shenzhen, China
- Jinke Ren
- Shenzhen Future Network of Intelligence Institute and Guangdong Provincial Key Laboratory of Future Networks of Intelligence, The Chinese University of Hong Kong (Shenzhen), Shenzhen, China
- School of Science and Engineering, The Chinese University of Hong Kong (Shenzhen), Shenzhen, China
- Jun Wei
- Shenzhen Future Network of Intelligence Institute and Guangdong Provincial Key Laboratory of Future Networks of Intelligence, The Chinese University of Hong Kong (Shenzhen), Shenzhen, China
- School of Science and Engineering, The Chinese University of Hong Kong (Shenzhen), Shenzhen, China
- Chun-Mei Feng
- Institute of High Performance Computing, Agency for Science, Technology and Research, Singapore, Singapore
- Xuan Ding
- Department of Statistics, Faculty of Arts and Sciences, Beijing Normal University, Zhuhai, Guangdong, China
- Zilu Guo
- Shenzhen Future Network of Intelligence Institute and Guangdong Provincial Key Laboratory of Future Networks of Intelligence, The Chinese University of Hong Kong (Shenzhen), Shenzhen, China
- School of Science and Engineering, The Chinese University of Hong Kong (Shenzhen), Shenzhen, China
- Yu Wang
- Department of Radiology, Sun Yat-sen Memorial Hospital, Sun Yat-sen University, Guangzhou, Guangdong, China
- Yudong Hu
- Aberdeen Institute of Data Science and Artificial Intelligence, South China Normal University, Foshan, Guangdong, China
- Wei Wei
- Department of Gynecologic Oncology, Sun Yat-sen University Cancer Center, Guangzhou, Guangdong, China
- State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Guangzhou, Guangdong, China
- Xiaohua Ban
- State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Guangzhou, Guangdong, China
- Department of Radiology, Sun Yat-sen University Cancer Center, Guangzhou, Guangdong, China
- Chuanlong Xie
- Department of Statistics, Faculty of Arts and Sciences, Beijing Normal University, Zhuhai, Guangdong, China.
- Yu Tan
- Department of Radiology, Guangdong Women and Children Hospital, Guangzhou, China
- Xian Liu
- Radiology Department, The Second Affiliated Hospital of Guangzhou University of Chinese Medicine, Guangzhou, Guangdong, China
- Shuguang Cui
- Shenzhen Future Network of Intelligence Institute and Guangdong Provincial Key Laboratory of Future Networks of Intelligence, The Chinese University of Hong Kong (Shenzhen), Shenzhen, China
- School of Science and Engineering, The Chinese University of Hong Kong (Shenzhen), Shenzhen, China
- Xiaohui Duan
- Department of Radiology, Sun Yat-sen Memorial Hospital, Sun Yat-sen University, Guangzhou, Guangdong, China.
- Guangdong Provincial Key Laboratory of Malignant Tumor Epigenetics and Gene Regulation, Medical Research Center, Sun Yat-sen Memorial Hospital, Sun Yat-sen University, Guangzhou, Guangdong, China.
- Zhen Li
- Shenzhen Future Network of Intelligence Institute and Guangdong Provincial Key Laboratory of Future Networks of Intelligence, The Chinese University of Hong Kong (Shenzhen), Shenzhen, China.
- School of Science and Engineering, The Chinese University of Hong Kong (Shenzhen), Shenzhen, China.
6
Zhong J, Tian W, Xie Y, Liu Z, Ou J, Tian T, Zhang L. PMFSNet: Polarized multi-scale feature self-attention network for lightweight medical image segmentation. Comput Methods Programs Biomed 2025; 261:108611. [PMID: 39892086] [DOI: 10.1016/j.cmpb.2025.108611]
Abstract
BACKGROUND AND OBJECTIVES Current state-of-the-art medical image segmentation methods prioritize precision, often at the expense of increased computational demands and larger model sizes. Applying these large-scale models to relatively small medical image datasets tends to induce redundant computation without commensurate benefit. Such approaches increase complexity and hinder the integration and deployment of lightweight models on edge devices. For instance, recent transformer-based models have excelled in 2D and 3D medical image segmentation owing to their extensive receptive fields and high parameter counts; however, their effectiveness carries a risk of overfitting on small datasets, and they often neglect the vital inductive biases of Convolutional Neural Networks (CNNs) that are essential for local feature representation. METHODS In this work, we propose PMFSNet, a novel medical image segmentation model that effectively balances global and local feature processing while avoiding the computational redundancy typical of larger models. PMFSNet streamlines the UNet-based hierarchical structure and simplifies the computational complexity of the self-attention mechanism, making it suitable for lightweight applications. It incorporates a plug-and-play PMFS block, a multi-scale feature enhancement module based on attention mechanisms, to capture long-term dependencies. RESULTS Comprehensive results demonstrate that our method achieves superior performance on various segmentation tasks at different data scales, even with fewer than a million parameters. PMFSNet achieves IoU of 84.68%, 82.02%, 78.82%, and 76.48% on the public 3D CBCT Tooth, ovarian tumor ultrasound (MMOTU), skin lesion dermoscopy (ISIC 2018), and gastrointestinal polyp (Kvasir SEG) datasets, and yields DSC of 78.29%, 77.45%, and 78.04% on three retinal vessel segmentation datasets, DRIVE, STARE, and CHASE-DB1, respectively. CONCLUSION Our proposed model exhibits competitive performance across various datasets while requiring significantly fewer parameters and less inference time, demonstrating its value for model integration and deployment. It strikes an optimal compromise between efficiency and performance and can be a highly efficient solution for medical image analysis in resource-constrained clinical environments. The source code is available at https://github.com/yykzjh/PMFSNet.
Affiliation(s)
- Jiahui Zhong
- School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu, 610054, PR China.
- Wenhong Tian
- School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu, 610054, PR China.
- Yuanlun Xie
- School of Electronic Information and Electrical Engineering, Chengdu University, Chengdu 610106, China.
- Zhijia Liu
- School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu, 610054, PR China.
- Jie Ou
- School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu, 610054, PR China.
- Taoran Tian
- State Key Laboratory of Oral Diseases, National Clinical Research Center for Oral Diseases, West China Hospital of Stomatology, Sichuan University, Chengdu, 610041, PR China.
- Lei Zhang
- School of Computer Science, University of Lincoln, LN6 7TS, UK.
7
Bakkouri I, Bakkouri S. UGS-M3F: unified gated swin transformer with multi-feature fully fusion for retinal blood vessel segmentation. BMC Med Imaging 2025; 25:77. [PMID: 40050753] [PMCID: PMC11887399] [DOI: 10.1186/s12880-025-01616-1]
Abstract
Automated segmentation of retinal blood vessels in fundus images plays a key role in providing ophthalmologists with critical insights for the non-invasive diagnosis of common eye diseases. Early and precise detection of these conditions is essential for preserving vision, making vessel segmentation crucial for identifying vascular diseases that threaten vision. However, accurately segmenting blood vessels in fundus images is challenging due to significant variability in vessel scale and appearance, occlusions, complex backgrounds, variations in image quality, and the intricate branching patterns of retinal vessels. To overcome these challenges, the Unified Gated Swin Transformer with Multi-Feature Full Fusion (UGS-M3F) model has been developed as a deep learning framework tailored for retinal vessel segmentation. UGS-M3F leverages its Unified Multi-Context Feature Fusion (UM2F) and Gated Boundary-Aware Swin Transformer (GBS-T) modules to capture contextual information across different levels. The UM2F module enhances the extraction of detailed vessel features, while the GBS-T module emphasizes small vessel detection and ensures extensive coverage of large vessels. Extensive experimental results on publicly available datasets, including FIVES, DRIVE, STARE, and CHASE_DB1, show that UGS-M3F significantly outperforms existing state-of-the-art methods. Specifically, UGS-M3F achieves a Dice Coefficient (DC) improvement of 2.12% on FIVES, 1.94% on DRIVE, 2.52% on STARE, and 2.14% on CHASE_DB1 compared to the best-performing baseline. This improvement in segmentation accuracy has the potential to advance diagnostic techniques, allowing more precise disease identification and management across a range of ocular conditions.
Affiliation(s)
- Ibtissam Bakkouri
- LS2ME Laboratory, Sultan Moulay Slimane University, Beni-Mellal, Morocco.
- Siham Bakkouri
- TIAD Laboratory, Sultan Moulay Slimane University, Beni-Mellal, Morocco
8
Eid P, Bourredjem A, Anwer A, Creuzot-Garcher C, Keane PA, Zhou Y, Wagner S, Meriaudeau F, Arnould L. Retinal Microvascular Biomarker Assessment With Automated Algorithm and Semiautomated Software in the Montrachet Dataset. Transl Vis Sci Technol 2025; 14:13. [PMID: 40072417] [PMCID: PMC11918093] [DOI: 10.1167/tvst.14.3.13]
Abstract
Purpose To compare automated and semiautomated methods for the measurement of retinal microvascular biomarkers: the automated retinal vascular morphology (AutoMorph) algorithm and the Singapore "I" Vessel Assessment (SIVA) software. Methods Analysis of retinal fundus photographs centered on optic discs from the population-based Montrachet Study of adults aged 75 years and older. Agreement between SIVA and AutoMorph measures of the central retinal venular and arteriolar equivalents, the arteriolar-venular ratio, and the fractal dimension was evaluated with intraclass correlation coefficients (ICCs). Results Overall, 1069 fundus photographs were included in this study. The mean age of the patients was 80.04 ± 3.94 years. After the image-quality grading process with an optimal threshold, the lowest rejection rate was 51.17% for the AutoMorph analysis (n = 522). The agreement between SIVA and AutoMorph retinal microvascular biomarkers showed a good correlation for vascular complexity (ICC, 0.77-0.47), a poor correlation for vascular calibers (ICC, 0.36-0.23), and no correlation for vascular tortuosity. Significant associations between retinal biomarkers and systemic variables (age, history of stroke, and systolic blood pressure) were consistent between SIVA and AutoMorph. Conclusions In this dataset, AutoMorph presented a substantial rejection rate. SIVA and AutoMorph provided well-correlated measurements of vascular complexity and caliber with consistent clinical associations. Further comparisons are needed before transitioning from semiautomated to automated algorithms for the analysis of retinal microvascular biomarkers. Translational Relevance Open-source software needs to be compared with established semiautomated software for retinal microvascular biomarker assessment before transitioning to daily clinical use and collaborative research.
Affiliation(s)
- Pétra Eid
- Ophthalmology Department, Dijon University Hospital, Dijon, France
- Centre des Sciences du Goût et de l'Alimentation, AgroSup Dijon, CNRS, INRAE, Université Bourgogne, Dijon, France
- Abderrahmane Bourredjem
- CIC 1432, Epidémiologie Clinique, Centre Hospitalier Universitaire Dijon-Bourgogne, Dijon, France
- Atif Anwer
- Institut de Chimie Moléculaire Université de Bourgogne (ICMUB), Imagerie Fonctionnelle et moléculaire et Traitement des Images Médicales (IFTIM), Burgundy University, Dijon, France
- Catherine Creuzot-Garcher
- Ophthalmology Department, Dijon University Hospital, Dijon, France
- Centre des Sciences du Goût et de l'Alimentation, AgroSup Dijon, CNRS, INRAE, Université Bourgogne, Dijon, France
- Pearse Andrew Keane
- NIHR Biomedical Research Centre at Moorfields Eye Hospital NHS Foundation Trust, London, UK
- Institute of Ophthalmology, University College London, London, UK
- Yukun Zhou
- NIHR Biomedical Research Centre at Moorfields Eye Hospital NHS Foundation Trust, London, UK
- Institute of Ophthalmology, University College London, London, UK
- Siegfried Wagner
- NIHR Biomedical Research Centre at Moorfields Eye Hospital NHS Foundation Trust, London, UK
- Institute of Ophthalmology, University College London, London, UK
- Fabrice Meriaudeau
- Institut de Chimie Moléculaire Université de Bourgogne (ICMUB), Imagerie Fonctionnelle et moléculaire et Traitement des Images Médicales (IFTIM), Burgundy University, Dijon, France
- Louis Arnould
- Ophthalmology Department, Dijon University Hospital, Dijon, France
- Pathophysiology and Epidemiology of Cerebro-Cardiovascular Diseases (PEC2), (EA 7460), Faculty of Health Sciences, Université de Bourgogne, Dijon, France
9
Reddy VVRK, Villordon M, Do QN, Xi Y, Lewis MA, Herrera CL, Owen D, Spong CY, Twickler DM, Fei B. Ensemble of fine-tuned machine learning models for hysterectomy prediction in pregnant women using magnetic resonance images. J Med Imaging (Bellingham) 2025; 12:024502. [PMID: 40109885] [PMCID: PMC11915718] [DOI: 10.1117/1.jmi.12.2.024502]
Abstract
Purpose Identifying pregnant patients at high risk of hysterectomy before they give birth informs clinical management and improves outcomes. We aim to develop machine learning models to predict hysterectomy in pregnant women with placenta accreta spectrum (PAS). Approach We developed five machine learning models using information from magnetic resonance images, combined with topographic maps and radiomic features, to predict hysterectomy. The models were trained, optimized, and evaluated on data from 241 patients, split into 157, 24, and 60 patients for training, validation, and testing, respectively. Results We assessed the models individually as well as in an ensemble. When combined, the ensembled model produced the best performance, achieving an area under the curve of 0.90, a sensitivity of 90.0%, and a specificity of 90.0% for predicting hysterectomy. Conclusions Various machine learning models were developed to predict hysterectomy in pregnant women with PAS, with potential clinical applications to help improve patient management.
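A minimal sketch of the ensembling step, under the common assumption that the five fitted models expose scikit-learn-style `predict_proba` and that their probabilities are simply averaged; the paper's actual combination rule is not detailed in the abstract.

```python
import numpy as np

def ensemble_predict(models, X, threshold=0.5):
    """Soft-voting ensemble: average per-model probabilities, then threshold."""
    probs = np.mean([m.predict_proba(X)[:, 1] for m in models], axis=0)
    return (probs >= threshold).astype(int), probs
```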
Affiliation(s)
- Vishnu Vardhan Reddy Kanamata Reddy
- The University of Texas at Dallas, Department of Bioengineering, Richardson, Texas, United States
- The University of Texas at Dallas, Center for Imaging and Surgical Innovation, Richardson, Texas, United States
- Michael Villordon
- The University of Texas at Dallas, Department of Bioengineering, Richardson, Texas, United States
- The University of Texas at Dallas, Center for Imaging and Surgical Innovation, Richardson, Texas, United States
- Quyen N Do
- The University of Texas Southwestern Medical Center, Department of Radiology, Dallas, Texas, United States
- Yin Xi
- The University of Texas Southwestern Medical Center, Department of Radiology, Dallas, Texas, United States
- Matthew A Lewis
- The University of Texas Southwestern Medical Center, Department of Radiology, Dallas, Texas, United States
- Christina L Herrera
- The University of Texas Southwestern Medical Center, Department of Obstetrics and Gynecology, Dallas, Texas, United States
- Parkland Health, Dallas, Texas, United States
- David Owen
- The University of Texas Southwestern Medical Center, Department of Obstetrics and Gynecology, Dallas, Texas, United States
- Parkland Health, Dallas, Texas, United States
- Catherine Y Spong
- The University of Texas Southwestern Medical Center, Department of Obstetrics and Gynecology, Dallas, Texas, United States
- Parkland Health, Dallas, Texas, United States
- Diane M Twickler
- The University of Texas Southwestern Medical Center, Department of Radiology, Dallas, Texas, United States
- The University of Texas Southwestern Medical Center, Department of Obstetrics and Gynecology, Dallas, Texas, United States
- Parkland Health, Dallas, Texas, United States
- Baowei Fei
- The University of Texas at Dallas, Department of Bioengineering, Richardson, Texas, United States
- The University of Texas at Dallas, Center for Imaging and Surgical Innovation, Richardson, Texas, United States
- The University of Texas Southwestern Medical Center, Department of Radiology, Dallas, Texas, United States
10
Fang L, Sheng H, Li H, Li S, Feng S, Chen M, Li Y, Chen J, Chen F. Unsupervised translation of vascular masks to NIR-II fluorescence images using Attention-Guided generative adversarial networks. Sci Rep 2025; 15:6725. [PMID: 40000690] [PMCID: PMC11861915] [DOI: 10.1038/s41598-025-91416-y]
Abstract
Second near-infrared window (NIR-II) fluorescence imaging is a crucial technology for investigating the structure and function of blood vessels. However, privacy concerns and the significant effort required for data annotation complicate the acquisition of near-infrared vascular imaging datasets. To tackle these issues, deep learning-based data synthesis methods have demonstrated promise in generating high-quality synthetic images. In this paper, we propose an unsupervised generative adversarial network (GAN) approach for translating vascular masks into realistic NIR-II fluorescence vascular images. Leveraging an attention mechanism integrated into the loss function, our model focuses on essential features during the generation process, producing high-quality NIR-II images without the need for supervision. Our method significantly outperforms eight baseline techniques in both visual quality and quantitative metrics, demonstrating its potential to address the challenge of limited datasets in NIR-II medical imaging. This work not only broadens the applications of NIR-II imaging but also facilitates downstream tasks by providing abundant, high-fidelity synthetic data.
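The attention-in-the-loss idea can be illustrated with a small sketch: an attention map re-weights the pixelwise reconstruction error so vessel regions dominate the generator objective. The map's source (e.g., derived from the vascular mask) and the floor weight are assumptions, not the paper's design.

```python
import torch

def attention_weighted_l1(fake, real, attn, floor=0.1):
    """L1 loss re-weighted by an attention map in [0, 1]; `floor` keeps a
    small gradient flowing outside the attended (vessel) regions."""
    return ((attn + floor) * (fake - real).abs()).mean()
```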
Affiliation(s)
- Lu Fang
- Chinese Academy of Sciences, Shanghai Institute of Technical Physics, Shanghai, 200083, China
- University of Chinese Academy of Sciences, Beijing, 100049, China
- Huaixuan Sheng
- Sports Medicine Institute of Fudan University, Department of Sports Medicine, Huashan Hospital, Fudan University, Shanghai, 200040, China
- Huizhu Li
- Sports Medicine Institute of Fudan University, Department of Sports Medicine, Huashan Hospital, Fudan University, Shanghai, 200040, China
- Shunyao Li
- Sports Medicine Institute of Fudan University, Department of Sports Medicine, Huashan Hospital, Fudan University, Shanghai, 200040, China
- Sijia Feng
- Sports Medicine Institute of Fudan University, Department of Sports Medicine, Huashan Hospital, Fudan University, Shanghai, 200040, China
- Mo Chen
- Department of Bone and Joint Surgery, Department of Orthopedics, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, 200001, China
- Yunxia Li
- Sports Medicine Institute of Fudan University, Department of Sports Medicine, Huashan Hospital, Fudan University, Shanghai, 200040, China
- Jun Chen
- Sports Medicine Institute of Fudan University, Department of Sports Medicine, Huashan Hospital, Fudan University, Shanghai, 200040, China
- Fuchun Chen
- Chinese Academy of Sciences, Shanghai Institute of Technical Physics, Shanghai, 200083, China.
11
Ahmed W, Liatsis P. LHU-VT: A Lightweight Hypercomplex U-Net with Vessel Thickness-Guided Dice Loss for retinal vessel segmentation. Comput Biol Med 2025; 185:109470. [PMID: 39667053] [DOI: 10.1016/j.compbiomed.2024.109470]
Abstract
Vision loss is often caused by retinal disorders, such as age-related macular degeneration and diabetic retinopathy, where early indicators like microaneurysms and hemorrhages appear as changes in retinal blood vessels. Accurate segmentation of these vessels in retinal images is essential for early diagnosis. However, retinal vessel segmentation presents challenges due to complex vessel structures, low contrast, and dense branching patterns, which are further complicated in resource-limited settings requiring lightweight solutions. To address these challenges, we propose a novel Lightweight Hypercomplex U-Net (LHUN) with a Vessel Thickness-Guided Dice Loss (VTDL), collectively called LHU-VT. LHUN utilizes hypercomplex octonions to capture intricate patterns and cross-channel relationships in fundus images, reducing the parameter count and enabling edge deployment. The VTDL component applies vessel thickness-guided weights to address class imbalance, thereby enhancing segmentation accuracy. Our experiments show that LHU-VT significantly outperforms current methods, requiring up to 2.4× fewer FLOPs, 4.4× fewer parameters, and a 2.6× smaller model size. The model achieves AUC scores of 0.9938, 0.9879, 0.9988, and 0.9808 on the four benchmark datasets CHASE, DRIVE, STARE, and HRF, respectively.
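A hedged sketch of a thickness-guided Dice loss consistent with the abstract's description: vessel pixels are weighted by estimated local thickness (here via a distance transform, with thin vessels up-weighted). The actual VTDL weighting scheme may differ from this sketch.

```python
import numpy as np
import torch
from scipy.ndimage import distance_transform_edt

def thickness_weights(mask_np, eps=1.0):
    """Weight vessel pixels inversely to local vessel radius (thin = heavy)."""
    radius = distance_transform_edt(mask_np)           # ~ local half-thickness
    weights = np.where(mask_np > 0, 1.0 / (radius + eps), 1.0)
    return torch.from_numpy(weights).float()

def weighted_dice_loss(pred, target, weights, smooth=1.0):
    """Soft Dice with per-pixel weights emphasizing thin vessels."""
    pred, target, weights = pred.flatten(), target.flatten(), weights.flatten()
    inter = (weights * pred * target).sum()
    denom = (weights * (pred + target)).sum()
    return 1.0 - (2.0 * inter + smooth) / (denom + smooth)
```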
Affiliation(s)
- Waqar Ahmed
- Silo AI, Helsinki, Finland; Department of Computer Science, Khalifa University, Abu Dhabi, United Arab Emirates
- Panos Liatsis
- Department of Computer Science, Khalifa University, Abu Dhabi, United Arab Emirates.
12
Kande GB, Nalluri MR, Manikandan R, Cho J, Veerappampalayam Easwaramoorthy S. Multi scale multi attention network for blood vessel segmentation in fundus images. Sci Rep 2025; 15:3438. [PMID: 39870673] [PMCID: PMC11772654] [DOI: 10.1038/s41598-024-84255-w]
Abstract
Precise segmentation of the retinal vasculature is crucial for the early detection, diagnosis, and treatment of vision-threatening ailments. However, this task is challenging due to limited contextual information, variations in vessel thickness, the complexity of vessel structures, and the potential for confusion with lesions. In this paper, we introduce a novel approach, the MSMA Net model, which overcomes these challenges by replacing traditional convolution blocks and skip connections with an improved multi-scale squeeze and excitation block (MSSE Block) and bottleneck residual paths (B-Res paths) with spatial attention blocks (SAB). Our experimental findings on publicly available fundus image datasets, specifically DRIVE, STARE, CHASE_DB1, HRF, and DR HAGIS, consistently demonstrate that our approach outperforms other segmentation techniques, achieving higher accuracy, sensitivity, Dice score, and area under the receiver operating characteristic curve (AUC) in the segmentation of blood vessels of different thicknesses, even in situations involving diverse contextual information, coexisting lesions, and intricate vessel morphologies.
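For orientation, below is a standard squeeze-and-excitation block, the mechanism underlying the abstract's improved MSSE block; the multi-scale and spatial-attention extensions the paper adds are not reproduced here, so this is only the base technique.

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Channel attention: squeeze (global pooling) then excite (gating MLP)."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())

    def forward(self, x):
        b, c, _, _ = x.shape
        gate = self.fc(x.mean(dim=(2, 3))).view(b, c, 1, 1)  # per-channel gate
        return x * gate
```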
Affiliation(s)
- Giri Babu Kande
- Vasireddy Venkatadri Institute of Technology, Nambur, 522508, India
- Madhusudana Rao Nalluri
- School of Computing, Amrita Vishwa Vidyapeetham, Amaravati, 522503, India.
- Department of Computer Science & Engineering, Faculty of Science and Technology (IcfaiTech), ICFAI Foundation for Higher Education, Hyderabad, India.
- R Manikandan
- School of Computing, SASTRA Deemed University, Thanjavur, 613401, India
- Jaehyuk Cho
- Department of Software Engineering & Division of Electronics and Information Engineering, Jeonbuk National University, Jeonju-si, 54896, Republic of Korea.
13
Yang S, Zhang X, He Y, Chen Y, Zhou Y. TBE-Net: A Deep Network Based on Tree-Like Branch Encoder for Medical Image Segmentation. IEEE J Biomed Health Inform 2025; 29:521-534. [PMID: 39374271] [DOI: 10.1109/jbhi.2024.3468904]
Abstract
In recent years, encoder-decoder-based network structures have been widely used in designing medical image segmentation models. However, these methods still face some limitations: 1) The network's feature extraction capability is limited, primarily due to insufficient attention to the encoder, resulting in a failure to extract rich and effective features. 2) Unidirectional stepwise decoding of smaller-sized feature maps restricts segmentation performance. To address the above limitations, we propose an innovative Tree-like Branch Encoder Network (TBE-Net), which adopts a tree-like branch encoder to better perform feature extraction and preserve feature information. Additionally, we introduce the Depth and Width Expansion (DWE) module to expand the network depth and width at low parameter cost, thereby enhancing network performance. Furthermore, we design a Deep Aggregation Module (DAM) to better aggregate and process encoder features. Subsequently, we directly decode the aggregated features to generate the segmentation map. The experimental results show that, compared to other advanced algorithms, our method, with the lowest parameter cost, achieved improvements in the IoU metric on the TNBC, PH2, CHASE-DB1, STARE, and COVID-19-CT-Seg datasets by 1.6%, 0.46%, 0.81%, 1.96%, and 0.86%, respectively.
14
Tong L, Li T, Zhang Q, Zhang Q, Zhu R, Du W, Hu P. LiViT-Net: A U-Net-like, lightweight Transformer network for retinal vessel segmentation. Comput Struct Biotechnol J 2024; 24:213-224. [PMID: 38572168] [PMCID: PMC10987887] [DOI: 10.1016/j.csbj.2024.03.003]
Abstract
The intricate task of precisely segmenting retinal vessels from images, which is critical for diagnosing various eye diseases, presents significant challenges due to factors such as scale variation, complex anatomical patterns, low contrast, and limited training data. To address these challenges, we offer contributions spanning model architecture, loss function design, robustness, and real-time efficiency, presenting a new U-Net-like, lightweight Transformer network for retinal vessel segmentation. By integrating MobileViT+ and a novel local representation in the encoder, our design emphasizes lightweight processing while capturing intricate image structures, enhancing vessel-edge precision. A novel joint loss is designed, leveraging the characteristics of weighted cross-entropy and Dice loss to guide the model through the task's challenges, such as foreground-background imbalance and intricate vascular structures. Exhaustive experiments were performed on three prominent retinal image databases. The results underscore the robustness and generalizability of the proposed LiViT-Net, which outperforms other methods in complex scenarios, especially in intricate environments with fine vessels or vessel edges. Optimized for efficiency, LiViT-Net also excels on devices with constrained computational power, as evidenced by its fast inference. To demonstrate the model, a freely accessible and interactive website was established (https://hz-t3.matpool.com:28765?token=aQjYR4hqMI), offering real-time performance with no login requirements.
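A minimal sketch of a joint weighted cross-entropy + Dice objective of the kind described above; the positive-class weight and the blending coefficient are assumptions, not the paper's values.

```python
import torch
import torch.nn.functional as F

def joint_loss(logits, target, pos_weight=5.0, alpha=0.5, smooth=1.0):
    """Blend weighted BCE (handles class imbalance) with Dice (region overlap)."""
    wce = F.binary_cross_entropy_with_logits(
        logits, target, pos_weight=torch.tensor(pos_weight))
    p, t = torch.sigmoid(logits).flatten(), target.flatten()
    dice = 1.0 - (2.0 * (p * t).sum() + smooth) / (p.sum() + t.sum() + smooth)
    return alpha * wce + (1.0 - alpha) * dice
```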
Affiliation(s)
- Le Tong
- The College of Information, Mechanical and Electrical Engineering, Shanghai Normal University, No. 100 Haisi Road, Shanghai, 201418, China
- Tianjiu Li
- The College of Information, Mechanical and Electrical Engineering, Shanghai Normal University, No. 100 Haisi Road, Shanghai, 201418, China
- Qian Zhang
- The College of Information, Mechanical and Electrical Engineering, Shanghai Normal University, No. 100 Haisi Road, Shanghai, 201418, China
- Qin Zhang
- Ophthalmology Department, Jing'an District Central Hospital, No. 259, Xikang Road, Shanghai, 200040, China
- Renchaoli Zhu
- The College of Information, Mechanical and Electrical Engineering, Shanghai Normal University, No. 100 Haisi Road, Shanghai, 201418, China
- Wei Du
- Laboratory of Smart Manufacturing in Energy Chemical Process, East China University of Science and Technology, No. 130 Meilong Road, Shanghai, 200237, China
- Pengwei Hu
- The Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, 40-1 South Beijing Road, Urumqi, 830011, China
15
Kato S, Hotta K. Expanded tube attention for tubular structure segmentation. Int J Comput Assist Radiol Surg 2024; 19:2187-2193. [PMID: 38112883] [DOI: 10.1007/s11548-023-03038-2]
Abstract
PURPOSE Semantic segmentation of tubular structures, such as blood vessels and cell membranes, is a very difficult task, and predicted regions often break in the middle. This problem arises because tubular ground truth is very thin, so its pixel count is extremely unbalanced relative to the background. METHODS We present a novel training method using pseudo-labels generated by morphological transformation, together with an attention module using thickened pseudo-labels, called the expanded tube attention (ETA) module. With the ETA module, the network first learns the thickened regions from the pseudo-labels and then gradually learns the thinner original regions while transferring information from the thickened regions as an attention map. RESULTS Through experiments conducted on retinal vessel image datasets using various evaluation measures, we confirmed that the proposed method using ETA modules improved clDice accuracy in comparison with conventional methods. CONCLUSIONS We demonstrated that the proposed expanded tube attention module using thickened pseudo-labels enables easy-to-hard learning.
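The pseudo-label thickening step is simple enough to sketch. Below, the thin tubular ground truth is dilated to form an "expanded tube", and a schedule shrinks the dilation radius over training for easy-to-hard learning; the radius and schedule shape are assumptions, not the paper's settings.

```python
import numpy as np
from scipy.ndimage import binary_dilation

def expanded_tube(label, radius):
    """Thicken a thin tubular ground-truth mask by morphological dilation."""
    if radius == 0:
        return label.astype(bool)
    structure = np.ones((2 * radius + 1, 2 * radius + 1), dtype=bool)
    return binary_dilation(label.astype(bool), structure=structure)

def radius_schedule(epoch, total_epochs, max_radius=3):
    """Easy-to-hard: thick pseudo-labels early, the original labels late."""
    return max(0, round(max_radius * (1.0 - epoch / total_epochs)))
```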
Affiliation(s)
- Sota Kato
- Department of Electrical, Information, Materials and Materials Engineering, Meijo University, Tempaku-ku, Nagoya, Aichi, 468-8502, Japan.
- Kazuhiro Hotta
- Department of Electrical and Electronic Engineering, Meijo University, Tempaku-ku, Nagoya, Aichi, Japan
16
Wu X, Xu Z, Tong RKY. Continual learning in medical image analysis: A survey. Comput Biol Med 2024; 182:109206. [PMID: 39332115] [DOI: 10.1016/j.compbiomed.2024.109206]
Abstract
In the dynamic realm of practical clinical scenarios, Continual Learning (CL) has gained increasing interest in medical image analysis due to its potential to address major challenges associated with data privacy, model adaptability, memory inefficiency, prediction robustness and detection accuracy. In general, the primary challenge in adapting and advancing CL remains catastrophic forgetting. Beyond this challenge, recent years have witnessed a growing body of work that expands our comprehension and application of continual learning in the medical domain, highlighting its practical significance and intricacy. In this paper, we present an in-depth and up-to-date review of the application of CL in medical image analysis. Our discussion delves into the strategies employed to address specific tasks within the medical domain, categorizing existing CL methods into three settings: Task-Incremental Learning, Class-Incremental Learning, and Domain-Incremental Learning. These settings are further subdivided based on representative learning strategies, allowing us to assess their strengths and weaknesses in the context of various medical scenarios. By establishing a correlation between each medical challenge and the corresponding insights provided by CL, we provide a comprehensive understanding of the potential impact of these techniques. To enhance the utility of our review, we provide an overview of the commonly used benchmark medical datasets and evaluation metrics in the field. Through a comprehensive comparison, we discuss promising future directions for the application of CL in medical image analysis. A comprehensive list of studies is being continuously updated at https://github.com/xw1519/Continual-Learning-Medical-Adaptation.
Affiliation(s)
- Xinyao Wu
- Department of Biomedical Engineering, The Chinese University of Hong Kong, Shatin, NT, Hong Kong, China.
- Zhe Xu
- Department of Biomedical Engineering, The Chinese University of Hong Kong, Shatin, NT, Hong Kong, China; Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA.
- Raymond Kai-Yu Tong
- Department of Biomedical Engineering, The Chinese University of Hong Kong, Shatin, NT, Hong Kong, China.
17
Wu Q, Chen Y, Liu W, Yue X, Zhuang X. Deep Closing: Enhancing Topological Connectivity in Medical Tubular Segmentation. IEEE Trans Med Imaging 2024; 43:3990-4003. [PMID: 38801688] [DOI: 10.1109/tmi.2024.3405982]
Abstract
Accurately segmenting tubular structures, such as blood vessels or nerves, holds significant clinical implications across various medical applications. However, existing methods often exhibit limitations in achieving satisfactory topological performance, particularly in preserving connectivity. To address this challenge, we propose a novel deep-learning approach, termed Deep Closing, inspired by the well-established classic closing operation. Deep Closing first leverages an AutoEncoder trained in the Masked Image Modeling (MIM) paradigm, enhanced with digital topology knowledge, to learn the inherent shape prior of tubular structures and indicate potentially disconnected regions. Subsequently, a Simple Components Erosion module is employed to generate topology-focused outcomes, refining the preceding segmentation results so that all generated regions are topologically significant. To evaluate the efficacy of Deep Closing, we conduct comprehensive experiments on four datasets: DRIVE, CHASE_DB1, DCA1, and CREMI. The results demonstrate that our approach yields considerable improvements in topological performance compared with existing methods. Furthermore, Deep Closing can generalize and transfer knowledge from external datasets, showcasing its robustness and adaptability. The code for this paper is available at: https://github.com/5k5000/DeepClosing.
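For reference, here is the classic closing operation that inspires the method: dilation followed by erosion, which fills small gaps in a tubular mask. The structuring-element size is illustrative; the paper replaces this fixed operator with a learned shape prior plus topology-aware erosion.

```python
import numpy as np
from scipy.ndimage import binary_closing

def classic_closing(mask, size=5):
    """Morphological closing: dilation then erosion with a square element."""
    return binary_closing(mask.astype(bool), structure=np.ones((size, size)))
```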
18
Lv N, Xu L, Chen Y, Sun W, Tian J, Zhang S. TCDDU-Net: combining transformer and convolutional dual-path decoding U-Net for retinal vessel segmentation. Sci Rep 2024; 14:25978. [PMID: 39472606] [PMCID: PMC11522399] [DOI: 10.1038/s41598-024-77464-w]
Abstract
Accurate segmentation of retinal blood vessels is crucial for enhancing diagnostic efficiency and preventing disease progression. However, the small size and complex structure of retinal blood vessels, coupled with the low contrast of the corresponding fundus images, pose significant challenges for this task. We propose a novel approach for retinal vessel segmentation that combines a transformer with a convolutional dual-path decoding U-Net (TCDDU-Net). We propose the selective dense connection swin transformer block, which converts the input feature map into patches, introduces MLPs to generate probabilities, and performs selective fusion at different stages. This structure forms a dense connection framework, enabling the capture of long-distance dependencies and effective fusion of features across stages. We then design the background decoder, which uses deformable convolution to learn the background information of retinal vessels by treating the background as a segmentation object. This is combined with the foreground decoder to form a dual-path decoding U-Net. Finally, the foreground segmentation results and the processed background segmentation results are fused to obtain the final retinal vessel segmentation map. To evaluate the effectiveness of our method, we performed experiments on the DRIVE, STARE, and CHASE datasets for retinal vessel segmentation. The segmentation accuracies of our algorithm are 96.98%, 97.40%, and 97.23%, and the AUC values are 98.68%, 98.56%, and 98.50%, respectively. In addition, we evaluated our method using the F1 score, specificity, and sensitivity. A comparative analysis shows that the proposed TCDDU-Net effectively improves retinal vessel segmentation performance and achieves impressive results on multiple datasets compared with existing methods.
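A minimal sketch of the dual-path fusion idea: a foreground decoder predicts vessels, a background decoder predicts everything else, and the inverted background map is fused with the foreground prediction. Simple averaging is an assumption; the paper's fusion rule may be learned.

```python
import torch

def fuse_dual_path(fg_logits, bg_logits):
    """Fuse foreground and (inverted) background predictions into one map."""
    fg = torch.sigmoid(fg_logits)
    bg_as_fg = 1.0 - torch.sigmoid(bg_logits)   # background decoder, inverted
    return (fg + bg_as_fg) / 2.0
```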
Affiliation(s)
- Nianzu Lv
- College of Information Engineering, Xinjiang Institute of Technology, No.1 Xuefu West Road, Aksu, 843100, Xinjiang, China
- Li Xu
- College of Information Engineering, Xinjiang Institute of Technology, No.1 Xuefu West Road, Aksu, 843100, Xinjiang, China.
- Yuling Chen
- School of Information Engineering, Mianyang Teachers' College, No. 166 Mianxing West Road, High Tech Zone, Mianyang, 621000, Sichuan, China
- Wei Sun
- CISDI Engineering Co., LTD, Chongqing, 401120, China
- Jiya Tian
- College of Information Engineering, Xinjiang Institute of Technology, No.1 Xuefu West Road, Aksu, 843100, Xinjiang, China
- Shuping Zhang
- College of Information Engineering, Xinjiang Institute of Technology, No.1 Xuefu West Road, Aksu, 843100, Xinjiang, China
19
Liu W, Tian T, Wang L, Xu W, Li L, Li H, Zhao W, Tian S, Pan X, Deng Y, Gao F, Yang H, Wang X, Su R. DIAS: A dataset and benchmark for intracranial artery segmentation in DSA sequences. Med Image Anal 2024; 97:103247. [PMID: 38941857] [DOI: 10.1016/j.media.2024.103247]
Abstract
The automated segmentation of Intracranial Arteries (IA) in Digital Subtraction Angiography (DSA) plays a crucial role in the quantification of vascular morphology, significantly contributing to computer-assisted stroke research and clinical practice. Current research primarily focuses on the segmentation of single-frame DSA using proprietary datasets. However, these methods face challenges due to the inherent limitation of single-frame DSA, which only partially displays vascular contrast, thereby hindering accurate vascular structure representation. In this work, we introduce DIAS, a dataset specifically developed for IA segmentation in DSA sequences. We establish a comprehensive benchmark for evaluating DIAS, covering full, weak, and semi-supervised segmentation methods. Specifically, we propose the vessel sequence segmentation network, in which the sequence feature extraction module effectively captures spatiotemporal representations of intravascular contrast, achieving intracranial artery segmentation in 2D+Time DSA sequences. For weakly-supervised IA segmentation, we propose a novel scribble learning-based image segmentation framework, which, under the guidance of scribble labels, employs cross pseudo-supervision and consistency regularization to improve the performance of the segmentation network. Furthermore, we introduce the random patch-based self-training framework, aimed at alleviating the performance constraints encountered in IA segmentation due to the limited availability of annotated DSA data. Our extensive experiments on the DIAS dataset demonstrate the effectiveness of these methods as potential baselines for future research and clinical applications. The dataset and code are publicly available at https://doi.org/10.5281/zenodo.11401368 and https://github.com/lseventeen/DIAS.
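The scribble-learning component uses cross pseudo-supervision, which admits a compact sketch: two networks exchange hard pseudo-labels on unannotated pixels, while scribbled pixels keep an ordinary supervised loss. The details below (hard argmax labels, plain cross-entropy) are generic cross pseudo-supervision, not the DIAS benchmark implementation.

```python
import torch
import torch.nn.functional as F

def cross_pseudo_supervision(logits_a, logits_b):
    """Each network is supervised by the other's hard pseudo-labels."""
    pseudo_a = logits_a.argmax(dim=1).detach()   # (N, H, W) labels from net A
    pseudo_b = logits_b.argmax(dim=1).detach()   # (N, H, W) labels from net B
    return F.cross_entropy(logits_a, pseudo_b) + F.cross_entropy(logits_b, pseudo_a)
```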
Affiliation(s)
- Wentao Liu
- School of Artificial Intelligence, Beijing University of Posts and Telecommunications, Beijing, China.
- Tong Tian
- State Key Laboratory of Structural Analysis, Optimization and CAE Software for Industrial Equipment, School of Mechanics and Aerospace Engineering, Dalian University of Technology, Dalian, China
| | - Lemeng Wang
- School of Artificial Intelligence, Beijing University of Posts and Telecommunications, Beijing, China
| | - Weijin Xu
- School of Artificial Intelligence, Beijing University of Posts and Telecommunications, Beijing, China
| | - Lei Li
- Department of Interventional Neuroradiology, Beijing Tiantan Hospital, Capital Medical University, Beijing, China
| | - Haoyuan Li
- School of Artificial Intelligence, Beijing University of Posts and Telecommunications, Beijing, China
| | - Wenyi Zhao
- School of Artificial Intelligence, Beijing University of Posts and Telecommunications, Beijing, China
| | - Siyu Tian
- Ultrasonic Department, The Fourth Hospital of Hebei Medical University and Hebei Tumor Hospital, Shijiazhuang, China
| | - Xipeng Pan
- School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin, China
| | - Yiming Deng
- Department of Interventional Neuroradiology, Beijing Tiantan Hospital, Capital Medical University, Beijing, China
| | - Feng Gao
- Department of Interventional Neuroradiology, Beijing Tiantan Hospital, Capital Medical University, Beijing, China.
| | - Huihua Yang
- School of Artificial Intelligence, Beijing University of Posts and Telecommunications, Beijing, China; School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin, China.
| | - Xin Wang
- Department of Radiology, The Netherlands Cancer Institute, Amsterdam, The Netherlands
| | - Ruisheng Su
- Department of Radiology & Nuclear Medicine, Erasmus MC, University Medical Center Rotterdam, The Netherlands; Medical Image Analysis group, Department of Biomedical Engineering, Eindhoven University of Technology, Eindhoven, The Netherlands
| |
|
20
|
Liu L, Jan H, Tang C, He H, Zhang L, Lei Z. Dual-channel lightweight GAN for enhancing color retinal images with noise suppression and structural protection. JOURNAL OF THE OPTICAL SOCIETY OF AMERICA. A, OPTICS, IMAGE SCIENCE, AND VISION 2024; 41:1948-1958. [PMID: 39889019 DOI: 10.1364/josaa.530601] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/20/2024] [Accepted: 09/03/2024] [Indexed: 02/02/2025]
Abstract
Suppressing noise while preserving detailed structure is a long-standing challenge in image enhancement, especially for color retinal images. In this paper, a dual-channel lightweight GAN named dilated shuffle generative adversarial network (DS-GAN) is proposed to solve the above problems. The lightweight generator consists of the RB branch used in the red-blue channels and the GN branch used in the green channel. The branch outputs are then concatenated to generate enhanced images. The RB branch cascades six identical RB-enhanced modules and adds skip connections. The structure of the GN branch is similar to that of the RB branch. The generator simultaneously leverages the local context extraction capability of normal convolution and the global information extraction capability of dilated convolution. In addition, it facilitates the fusion and communication of feature information between channels through channel shuffle. We utilize the lightweight image classification model ShuffleNetV2 as a discriminator to distinguish between enhanced images and the corresponding labels. We also constructed a dataset for color retinal image enhancement using traditional methods and designed a hybrid loss function combining MS-SSIM and perceptual loss for training the generator. With the proposed dataset and loss function, we train DS-GAN successfully. We test our method on four publicly available datasets (Messidor, DIARETDB0, DRIVE, and FIRE) and a clinical dataset from the Tianjin Eye Hospital (China), and compare it with six existing image enhancement methods. The results show that the proposed method can simultaneously suppress noise, preserve structure, and enhance contrast in color retinal image enhancement, outperforming the compared methods in all cases. Furthermore, the model has fewer parameters, which makes real-time image enhancement on portable devices feasible.
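The dual-channel split described here is easy to picture in code (a minimal sketch assuming PyTorch; the tiny two-layer branches merely stand in for the paper's cascaded RB/GN modules):

```python
import torch
import torch.nn as nn

class DualChannelGenerator(nn.Module):
    def __init__(self):
        super().__init__()
        # Stand-ins for the RB branch (red-blue) and GN branch (green).
        self.rb_branch = nn.Sequential(nn.Conv2d(2, 16, 3, padding=1), nn.ReLU(),
                                       nn.Conv2d(16, 2, 3, padding=1))
        self.gn_branch = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
                                       nn.Conv2d(16, 1, 3, padding=1))

    def forward(self, rgb):                   # rgb: (N, 3, H, W)
        rb = self.rb_branch(rgb[:, [0, 2]])   # red and blue channels
        g = self.gn_branch(rgb[:, 1:2])       # green channel
        # Re-assemble in R, G, B order via concatenation.
        return torch.cat([rb[:, 0:1], g, rb[:, 1:2]], dim=1)
```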
|
21
|
Zhang Y, Chung ACS. Retinal Vessel Segmentation by a Transformer-U-Net Hybrid Model With Dual-Path Decoder. IEEE J Biomed Health Inform 2024; 28:5347-5359. [PMID: 38669172 DOI: 10.1109/jbhi.2024.3394151] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 04/28/2024]
Abstract
This paper introduces an effective and efficient framework for retinal vessel segmentation. First, we design a Transformer-CNN hybrid model in which a Transformer module is inserted inside the U-Net to capture long-range interactions. Second, we design a dual-path decoder in the U-Net framework, which contains two decoding paths for multi-task outputs. Specifically, we train the extra decoder to predict vessel skeletons as an auxiliary task, which helps the model learn balanced features. The proposed framework, named TSNet, not only performs well under fully supervised learning but also enables a rough skeleton annotation process: annotators only need to roughly delineate vessel skeletons instead of giving precise pixel-wise vessel annotations. To learn from rough skeleton annotations plus a few precise vessel annotations, we propose a skeleton semi-supervised learning scheme. We adopt a mean teacher model to produce pseudo vessel annotations and conduct annotation correction for roughly labeled skeleton annotations. This learning scheme achieves promising performance with less annotation effort. We have evaluated TSNet through extensive experiments on five benchmark datasets. Experimental results show that TSNet yields state-of-the-art performance on retinal vessel segmentation and provides an efficient training scheme in practice.
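The mean teacher used here to produce pseudo vessel annotations relies on an exponential moving average of the student's weights; the update itself is one line per parameter (a sketch assuming PyTorch; alpha=0.99 is a typical, assumed decay):

```python
import torch

@torch.no_grad()
def ema_update(teacher, student, alpha=0.99):
    # teacher <- alpha * teacher + (1 - alpha) * student, parameter-wise.
    for t, s in zip(teacher.parameters(), student.parameters()):
        t.mul_(alpha).add_(s, alpha=1.0 - alpha)
```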
|
22
|
Alksas A, Sharafeldeen A, Balaha HM, Haq MZ, Mahmoud A, Ghazal M, Alghamdi NS, Alhalabi M, Yousaf J, Sandhu H, El-Baz A. Advanced OCTA imaging segmentation: Unsupervised, non-linear retinal vessel detection using modified self-organizing maps and joint MGRF modeling. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2024; 254:108309. [PMID: 39002431 DOI: 10.1016/j.cmpb.2024.108309] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 04/04/2024] [Revised: 06/06/2024] [Accepted: 06/25/2024] [Indexed: 07/15/2024]
Abstract
BACKGROUND AND OBJECTIVE This paper proposes a fully automated and unsupervised stochastic segmentation approach using a two-level joint Markov-Gibbs Random Field (MGRF) to detect the vascular system from retinal Optical Coherence Tomography Angiography (OCTA) images, which is a critical step in developing Computer-Aided Diagnosis (CAD) systems for detecting retinal diseases. METHODS Using a new probabilistic model based on a Linear Combination of Discrete Gaussians (LCDG), the first level models the appearance of OCTA images and their spatially smoothed images. The parameters of the LCDG model are estimated using a modified Expectation Maximization (EM) algorithm. The second level models the maps of OCTA images, including the vascular system and other retinal tissues, using MGRF with parameters estimated analytically from the input images. The proposed segmentation approach employs modified self-organizing maps as a MAP-based optimizer maximizing the joint likelihood and handles the joint MGRF model in a new, unsupervised way. This approach deviates from traditional stochastic optimization approaches and leverages non-linear optimization to achieve more accurate segmentation results. RESULTS The proposed segmentation framework is evaluated quantitatively on a dataset of 204 subjects. It achieves a Dice similarity coefficient of 0.92 ± 0.03, a 95th-percentile bidirectional Hausdorff distance of 0.69 ± 0.25, and an accuracy of 0.93 ± 0.03, confirming the superior performance of the proposed approach. CONCLUSIONS The conclusions drawn from the study highlight the superior performance of the proposed unsupervised and fully automated segmentation approach in detecting the vascular system from OCTA images. This approach not only deviates from traditional methods but also achieves more accurate segmentation results, demonstrating its potential in aiding the development of CAD systems for detecting retinal diseases.
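As a point of reference for the modified EM mentioned above, the classic EM loop for a 1D Gaussian mixture over pixel intensities looks as follows (an illustrative NumPy sketch only; the paper's LCDG model also admits negative-weight components, which this plain mixture does not):

```python
import numpy as np

def em_gaussian_mixture(x, k=2, iters=50):
    rng = np.random.default_rng(0)
    mu = rng.choice(x, k)
    var = np.full(k, x.var())
    w = np.full(k, 1.0 / k)
    for _ in range(iters):
        # E-step: responsibility of each component for each sample.
        d = x[:, None] - mu[None, :]
        p = w * np.exp(-0.5 * d**2 / var) / np.sqrt(2 * np.pi * var)
        r = p / (p.sum(axis=1, keepdims=True) + 1e-12)
        # M-step: re-estimate weights, means, and variances.
        n = r.sum(axis=0)
        w = n / len(x)
        mu = (r * x[:, None]).sum(axis=0) / n
        var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / n
    return w, mu, var
```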
Affiliation(s)
- Ahmed Alksas
- Bioengineering Department, University of Louisville, Louisville, KY 40292, USA
| | - Ahmed Sharafeldeen
- Bioengineering Department, University of Louisville, Louisville, KY 40292, USA
| | - Hossam Magdy Balaha
- Bioengineering Department, University of Louisville, Louisville, KY 40292, USA
| | - Mohammad Z Haq
- School of Medicine, University of Louisville, Louisville, KY 40292, USA
| | - Ali Mahmoud
- Bioengineering Department, University of Louisville, Louisville, KY 40292, USA
| | - Mohamed Ghazal
- Electrical, Computer, and Biomedical Engineering Department, Abu Dhabi University, Abu Dhabi 59911, United Arab Emirates
| | - Norah Saleh Alghamdi
- Department of Computer Sciences, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia
| | - Marah Alhalabi
- Electrical, Computer, and Biomedical Engineering Department, Abu Dhabi University, Abu Dhabi 59911, United Arab Emirates
| | - Jawad Yousaf
- Electrical, Computer, and Biomedical Engineering Department, Abu Dhabi University, Abu Dhabi 59911, United Arab Emirates
| | - Harpal Sandhu
- Bioengineering Department, University of Louisville, Louisville, KY 40292, USA
| | - Ayman El-Baz
- Bioengineering Department, University of Louisville, Louisville, KY 40292, USA.
| |
|
23
|
Chen H, Wang X, Li H, Wang L. 3D Vessel Segmentation With Limited Guidance of 2D Structure-Agnostic Vessel Annotations. IEEE J Biomed Health Inform 2024; 28:5410-5421. [PMID: 38833403 DOI: 10.1109/jbhi.2024.3409382] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 06/06/2024]
Abstract
Delineating 3D blood vessels of various anatomical structures is essential for clinical diagnosis and treatment but is challenging due to complex structure variations and varied imaging conditions. Although recent supervised deep learning models have demonstrated their superior capacity in automatic 3D vessel segmentation, the reliance on expensive 3D manual annotations and limited capacity for annotation reuse among different vascular structures hinder their clinical applications. To avoid the repetitive and costly annotation process for each vascular structure and make full use of existing annotations, this paper proposes a novel 3D shape-guided local discrimination (3D-SLD) model for 3D vascular segmentation under limited guidance from public 2D vessel annotations. The primary hypothesis is that 3D vessels are composed of semantically similar voxels and often exhibit tree-shaped morphology. Accordingly, the 3D region discrimination loss is first proposed to learn a discriminative representation measuring voxel-wise similarities and to cluster semantically consistent voxels into candidate 3D vascular segmentations in unlabeled images. Second, the shape distribution from existing 2D structure-agnostic vessel annotations is introduced to guide the 3D vessels toward tree-shaped morphology via an adversarial shape constraint loss. Third, to enhance training stability and prediction credibility, the highlighting-reviewing-summarizing (HRS) mechanism is proposed. This mechanism involves summarizing historical models to maintain temporal consistency and identifying credible pseudo labels as reliable supervision signals. Guided only by public 2D coronary artery annotations, our method achieves results comparable to SOTA barely-supervised methods in 3D cerebrovascular segmentation, and the best DSC in 3D hepatic vessel segmentation, demonstrating the effectiveness of our method.
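The "cluster semantically consistent voxels" step can be illustrated with a toy voxel-embedding loss (a sketch assuming PyTorch; the name and the cosine-to-center formulation are a simplification of ours, not the paper's 3D region discrimination loss):

```python
import torch.nn.functional as F

def region_pull_loss(emb, region_ids):
    # emb: (N, C) voxel features; region_ids: (N,) integer cluster labels.
    emb = F.normalize(emb, dim=1)
    regions = region_ids.unique()
    loss = 0.0
    for r in regions:
        members = emb[region_ids == r]
        center = F.normalize(members.mean(dim=0), dim=0)
        loss = loss + (1.0 - members @ center).mean()  # pull toward center
    return loss / regions.numel()
```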
|
24
|
Xu H, Wu Y. G2ViT: Graph Neural Network-Guided Vision Transformer Enhanced Network for retinal vessel and coronary angiograph segmentation. Neural Netw 2024; 176:106356. [PMID: 38723311 DOI: 10.1016/j.neunet.2024.106356] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/11/2023] [Revised: 04/26/2024] [Accepted: 04/29/2024] [Indexed: 06/17/2024]
Abstract
Blood vessel segmentation is a crucial stage in extracting morphological characteristics of vessels for the clinical diagnosis of fundus and coronary artery disease. However, traditional convolutional neural networks (CNNs) are confined to learning local vessel features, making it challenging to capture graph structural information and failing to perceive the global context of vessels. Therefore, we propose a novel graph neural network-guided vision transformer enhanced network (G2ViT) for vessel segmentation. G2ViT skillfully orchestrates the Convolutional Neural Network, Graph Neural Network, and Vision Transformer to enhance comprehension of the entire graphical structure of blood vessels. To achieve deeper insights into the global graph structure and higher-level global context cognizance, we investigate a graph neural network-guided vision transformer module. This module constructs graph-structured representations in an unprecedented manner, using the high-level features extracted by CNNs for graph reasoning. To increase the receptive field while ensuring minimal loss of edge information, G2ViT introduces a multi-scale edge feature attention module (MEFA), leveraging dilated convolutions with different dilation rates and the Sobel edge detection algorithm to obtain multi-scale edge information of vessels. To avoid critical information loss during upsampling and downsampling, we design a multi-level feature fusion module (MLF2) to fuse complementary information between coarse and fine features. Experiments on retinal vessel datasets (DRIVE, STARE, CHASE_DB1, and HRF) and coronary angiography datasets (DCA1 and CHUAC) indicate that G2ViT excels in robustness, generality, and applicability. Furthermore, it has acceptable inference time and computational complexity and presents a new solution for blood vessel segmentation.
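The MEFA recipe (parallel dilated convolutions plus Sobel edges) is straightforward to sketch (assuming PyTorch; layer widths and the set of dilation rates are assumptions, not the paper's configuration):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiScaleEdgeBlock(nn.Module):
    def __init__(self, ch):
        super().__init__()
        # Parallel dilated 3x3 convolutions enlarge the receptive field.
        self.branches = nn.ModuleList(
            [nn.Conv2d(ch, ch, 3, padding=d, dilation=d) for d in (1, 2, 4)])
        # A fixed depthwise Sobel kernel injects edge responses.
        sobel = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]])
        self.register_buffer('sobel', sobel.view(1, 1, 3, 3).repeat(ch, 1, 1, 1))
        self.fuse = nn.Conv2d(4 * ch, ch, 1)

    def forward(self, x):
        edges = F.conv2d(x, self.sobel, padding=1, groups=x.size(1))
        return self.fuse(torch.cat([b(x) for b in self.branches] + [edges], 1))
```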
Affiliation(s)
- Hao Xu
- State Key Laboratory of Public Big Data, Guizhou University, Guiyang 550025, China; College of Computer Science and Technology, Guizhou University, Guiyang 550025, China
| | - Yun Wu
- State Key Laboratory of Public Big Data, Guizhou University, Guiyang 550025, China; College of Computer Science and Technology, Guizhou University, Guiyang 550025, China.
| |
|
25
|
Frisken SF, Haouchine N, Chlorogiannis DD, Gopalakrishnan V, Cafaro A, Wells WT, Golby AJ, Du R. VESCL: an open source 2D vessel contouring library. Int J Comput Assist Radiol Surg 2024; 19:1627-1636. [PMID: 38879659 PMCID: PMC11875012 DOI: 10.1007/s11548-024-03212-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/10/2024] [Accepted: 06/03/2024] [Indexed: 08/17/2024]
Abstract
PURPOSE VESCL (pronounced 'vessel') is a novel vessel contouring library for computer-assisted 2D vessel contouring and segmentation. VESCL facilitates manual vessel segmentation in 2D medical images to generate gold-standard datasets for training, testing, and validating automatic vessel segmentation. METHODS VESCL is an open-source C++ library designed for easy integration into medical image processing systems. VESCL provides an intuitive interface for drawing variable-width parametric curves along vessels in 2D images. It includes highly optimized localized filtering to automatically fit drawn curves to the nearest vessel centerline and automatically determine the varying vessel width along each curve. To support a variety of segmentation paradigms, VESCL can export multiple segmentation representations including binary segmentations, occupancy maps, and distance fields. RESULTS VESCL provides sub-pixel resolution for vessel centerlines and vessel widths. It is optimized to segment small vessels with single- or sub-pixel widths that are visible to the human eye but hard to segment automatically via conventional filters. When tested on neurovascular digital subtraction angiography (DSA), VESCL's intuitive hand-drawn input with automatic curve fitting increased the speed of fully manual segmentation by 22× over conventional methods and by 3× over the best publicly available computer-assisted manual segmentation method. Accuracy was shown to be within the range of inter-operator variability of gold standard manually segmented data from a publicly available dataset of neurovascular DSA images as measured using Dice scores. Preliminary tests showed similar improvements for segmenting DSA of coronary arteries and RGB images of retinal arteries. CONCLUSION VESCL is an open-source C++ library for contouring vessels in 2D images which can be used to reduce the tedious, labor-intensive process of manually generating gold-standard segmentations for training, testing, and comparing automatic segmentation methods.
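The centerline-snapping behavior that makes such hand-drawn input fast can be approximated very simply (a toy NumPy sketch, not VESCL's optimized localized filtering; it assumes vessels are darker than the background):

```python
import numpy as np

def snap_to_centerline(img, pt, radius=5):
    # Move a drawn point to the darkest pixel in a small search window.
    y, x = pt
    y0, y1 = max(0, y - radius), min(img.shape[0], y + radius + 1)
    x0, x1 = max(0, x - radius), min(img.shape[1], x + radius + 1)
    window = img[y0:y1, x0:x1]
    dy, dx = np.unravel_index(np.argmin(window), window.shape)
    return (y0 + dy, x0 + dx)
```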
Affiliation(s)
- S F Frisken
- Brigham and Women's Hospital, Boston, USA.
- Harvard Medical School, Boston, USA.
| | - N Haouchine
- Brigham and Women's Hospital, Boston, USA
- Harvard Medical School, Boston, USA
| | - D D Chlorogiannis
- Brigham and Women's Hospital, Boston, USA
- Aristotle University of Thessaloniki, Thessaloníki, Greece
| | - V Gopalakrishnan
- Harvard-MIT Health Sciences and Technology, Cambridge, USA
- Massachusetts Institute of Technology, Cambridge, USA
| | - A Cafaro
- Brigham and Women's Hospital, Boston, USA
- Université Paris-Saclay, Villejuif, France
| | - W T Wells
- Brigham and Women's Hospital, Boston, USA
- Harvard Medical School, Boston, USA
- Massachusetts Institute of Technology, Cambridge, USA
| | - A J Golby
- Brigham and Women's Hospital, Boston, USA
- Harvard Medical School, Boston, USA
| | - R Du
- Brigham and Women's Hospital, Boston, USA
- Harvard Medical School, Boston, USA
| |
|
26
|
Matloob Abbasi M, Iqbal S, Aurangzeb K, Alhussein M, Khan TM. LMBiS-Net: A lightweight bidirectional skip connection based multipath CNN for retinal blood vessel segmentation. Sci Rep 2024; 14:15219. [PMID: 38956117 PMCID: PMC11219784 DOI: 10.1038/s41598-024-63496-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/31/2023] [Accepted: 05/29/2024] [Indexed: 07/04/2024] Open
Abstract
Blinding eye diseases are often related to changes in retinal structure, which can be detected by analysing retinal blood vessels in fundus images. However, existing techniques struggle to accurately segment these delicate vessels. Although deep learning has shown promise in medical image segmentation, its reliance on specific operations can limit its ability to capture crucial details such as vessel edges. This paper introduces LMBiS-Net, a lightweight convolutional neural network designed for the segmentation of retinal vessels. LMBiS-Net achieves exceptional performance with a remarkably low number of learnable parameters (only 0.172 million). The network uses multipath feature extraction blocks and incorporates bidirectional skip connections for information flow between the encoder and decoder. In addition, we have optimised the efficiency of the model by carefully selecting the number of filters to avoid filter overlap. This optimisation significantly reduces training time and improves computational efficiency. To assess LMBiS-Net's robustness and ability to generalise to unseen data, we conducted comprehensive evaluations on four publicly available datasets: DRIVE, STARE, CHASE_DB1, and HRF. The proposed LMBiS-Net achieves sensitivity values of 83.60%, 84.37%, 86.05%, and 83.48%, specificity values of 98.83%, 98.77%, 98.96%, and 98.77%, accuracy scores of 97.08%, 97.69%, 97.75%, and 96.90%, and AUC values of 98.80%, 98.82%, 98.71%, and 88.77% on the DRIVE, STARE, CHASE_DB1, and HRF datasets, respectively. In addition, it records F1 scores of 83.43%, 84.44%, 83.54%, and 78.73% on the same datasets. Our evaluations demonstrate that LMBiS-Net achieves high segmentation accuracy while exhibiting both robustness and generalisability across various retinal image datasets. This combination of qualities makes LMBiS-Net a promising tool for various clinical applications.
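Parameter counts such as the 0.172 M quoted here are easy to reproduce for any model (a PyTorch sketch; `model` is whatever network is being audited):

```python
def count_parameters(model):
    # Sum only the learnable (gradient-carrying) parameters.
    return sum(p.numel() for p in model.parameters() if p.requires_grad)
```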
Affiliation(s)
- Mufassir Matloob Abbasi
- Department of Electrical Engineering, Abasyn University Islamabad Campus (AUIC), Islamabad, 44000, Pakistan
| | - Shahzaib Iqbal
- Department of Electrical Engineering, Abasyn University Islamabad Campus (AUIC), Islamabad, 44000, Pakistan.
| | - Khursheed Aurangzeb
- Department of Computer Engineering, College of Computer and Information Sciences, King Saud University, Riyadh, P. O. Box 51178, 11543, Saudi Arabia
| | - Musaed Alhussein
- Department of Computer Engineering, College of Computer and Information Sciences, King Saud University, Riyadh, P. O. Box 51178, 11543, Saudi Arabia
| | - Tariq M Khan
- School of Computer Science and Engineering, University of New South Wales, Sydney, NSW, Australia
| |
|
27
|
Qian G, Wang H, Wang Y, Chen X, Yu D, Luo S, Sun Y, Xu P, Ye J. Cascade spatial and channel-wise multifusion network with criss cross augmentation for corneal segmentation and reconstruction. Comput Biol Med 2024; 177:108602. [PMID: 38805809 DOI: 10.1016/j.compbiomed.2024.108602] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/20/2024] [Revised: 04/22/2024] [Accepted: 05/11/2024] [Indexed: 05/30/2024]
Abstract
High-quality 3D corneal reconstruction from AS-OCT images has demonstrated significant potential in computer-aided diagnosis, enabling comprehensive observation of corneal thickness, precise assessment of morphological characteristics, as well as location and quantification of keratitis-affected regions. However, it faces two main challenges: (1) prevalent medical image segmentation networks often struggle to accurately segment low-contrast corneal regions, a vital pre-processing step for 3D corneal reconstruction, and (2) there are no reconstruction methods that can be directly applied to AS-OCT sequences with 180-degree scanning. To address these challenges, we propose CSCM-CCA-Net, a simple yet efficient network for accurate corneal segmentation. This network incorporates two key techniques: cascade spatial and channel-wise multifusion (CSCM), which captures intricate contextual interdependencies and effectively extracts low-contrast and obscure corneal features; and criss cross augmentation (CCA), which enhances shape-preserved feature representation to improve segmentation accuracy. Based on the obtained corneal segmentation results, we reconstruct the 3D volume data and generate a topographic map of corneal thickness through corneal image alignment. Additionally, we design a transfer function based on the analysis of the intensity and gradient histograms to explore more internal cues for better visualization results. Experimental results on the CORNEA benchmark demonstrate the impressive performance of our proposed method in terms of both corneal segmentation and 3D reconstruction. Furthermore, we compare CSCM-CCA-Net with state-of-the-art medical image segmentation approaches on three challenging medical fundus segmentation datasets (DRIVE, CHASEDB1, FIVES), highlighting its superiority in terms of segmentation accuracy. The code and models will be made available at https://github.com/qianguiping/CSCM-CCA-Net.
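The thickness measurement underlying the topographic map can be sketched per B-scan (NumPy; the pixel-to-millimeter scale is an assumed placeholder):

```python
import numpy as np

def thickness_profile(cornea_mask, pixel_size_mm=0.01):
    # Per-column thickness of a binary cornea mask from one AS-OCT B-scan;
    # stacking profiles across the 180-degree sequence yields a thickness map.
    return cornea_mask.astype(bool).sum(axis=0) * pixel_size_mm
```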
Affiliation(s)
- Guiping Qian
- College of Media Engineering, Communication University of Zhejiang, Hangzhou, 310018, China.
| | - Huaqiong Wang
- College of Media Engineering, Communication University of Zhejiang, Hangzhou, 310018, China
| | - Yaqi Wang
- College of Media Engineering, Communication University of Zhejiang, Hangzhou, 310018, China
| | - Xiaodiao Chen
- School of Computer, Hangzhou Dianzi University, Hangzhou, 310018, China
| | - Dingguo Yu
- College of Media Engineering, Communication University of Zhejiang, Hangzhou, 310018, China
| | - Shan Luo
- College of Media Engineering, Communication University of Zhejiang, Hangzhou, 310018, China
| | - Yiming Sun
- Department of Ophthalmology, The Second Affiliated Hospital of Zhejiang University, School of Medicine, Hangzhou, 310005, China
| | - Peifang Xu
- Department of Ophthalmology, The Second Affiliated Hospital of Zhejiang University, School of Medicine, Hangzhou, 310005, China
| | - Juan Ye
- Department of Ophthalmology, The Second Affiliated Hospital of Zhejiang University, School of Medicine, Hangzhou, 310005, China
| |
|
28
|
Qi X, Wu Z, Zou W, Ren M, Gao Y, Sun M, Zhang S, Shan C, Sun Z. Exploring Generalizable Distillation for Efficient Medical Image Segmentation. IEEE J Biomed Health Inform 2024; 28:4170-4183. [PMID: 38954557 DOI: 10.1109/jbhi.2024.3385098] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 07/04/2024]
Abstract
Efficient medical image segmentation aims to provide accurate pixel-wise predictions with a lightweight implementation framework. However, existing lightweight networks generally overlook the generalizability of cross-domain medical segmentation tasks. In this paper, we propose Generalizable Knowledge Distillation (GKD), a novel framework for enhancing the performance of lightweight networks on cross-domain medical segmentation by distilling generalizable knowledge from powerful teacher networks. Considering the domain gaps between different medical datasets, we propose the Model-Specific Alignment Networks (MSAN) to obtain domain-invariant representations. Meanwhile, a customized Alignment Consistency Training (ACT) strategy is designed to promote MSAN training. Based on the domain-invariant vectors in MSAN, we propose two generalizable distillation schemes, Dual Contrastive Graph Distillation (DCGD) and Domain-Invariant Cross Distillation (DICD). In DCGD, two implicit contrastive graphs are designed to model the intra-coupling and inter-coupling semantic correlations. Then, in DICD, the domain-invariant semantic vectors are reconstructed from the two networks (i.e., teacher and student) in a crossover manner to achieve simultaneous, hierarchical generalization of lightweight networks. Moreover, a metric named Fréchet Semantic Distance (FSD) is tailored to verify the effectiveness of the regularized domain-invariant features. Extensive experiments conducted on the Liver, Retinal Vessel and Colonoscopy segmentation datasets demonstrate the superiority of our method in terms of performance and generalization ability on lightweight networks.
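For orientation, the base ingredient that schemes like DCGD and DICD build on is plain logit distillation (a PyTorch sketch of the standard temperature-scaled KL recipe, not the paper's generalizable variants):

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, T=2.0):
    # Soften both distributions with temperature T, then match them.
    p_teacher = F.softmax(teacher_logits / T, dim=1)
    log_p_student = F.log_softmax(student_logits / T, dim=1)
    return F.kl_div(log_p_student, p_teacher, reduction='batchmean') * T * T
```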
|
29
|
He H, Qiu J, Lin L, Cai Z, Cheng P, Tang X. JOINEDTrans: Prior guided multi-task transformer for joint optic disc/cup segmentation and fovea detection. Comput Biol Med 2024; 177:108613. [PMID: 38781644 DOI: 10.1016/j.compbiomed.2024.108613] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/30/2023] [Revised: 01/18/2024] [Accepted: 05/11/2024] [Indexed: 05/25/2024]
Abstract
Deep learning-based image segmentation and detection models have largely improved the efficiency of analyzing retinal landmarks such as the optic disc (OD), optic cup (OC), and fovea. However, factors including ophthalmic disease-related lesions and low image quality may severely complicate automatic OD/OC segmentation and fovea detection. Most existing works treat the identification of each landmark as a single task and take no prior information into account. To address these issues, we propose a prior guided multi-task transformer framework for joint OD/OC segmentation and fovea detection, named JOINEDTrans. JOINEDTrans effectively combines various spatial features of fundus images, relieving the structural distortions induced by lesions and other imaging issues. It contains a segmentation branch and a detection branch. Notably, we employ an encoder pre-trained in a vessel segmentation task to effectively exploit the positional relationship among vessel, OD/OC, and fovea, successfully incorporating spatial priors into the proposed JOINEDTrans framework. JOINEDTrans consists of a coarse stage and a fine stage. In the coarse stage, OD/OC coarse segmentation and fovea heatmap localization are obtained through a joint segmentation and detection module. In the fine stage, we crop regions of interest for subsequent refinement and use predictions obtained in the coarse stage to provide additional information for better performance and faster convergence. Experimental results demonstrate that JOINEDTrans outperforms existing state-of-the-art methods on the publicly available GAMMA, REFUGE, and PALM fundus image datasets. We make our code available at https://github.com/HuaqingHe/JOINEDTrans.
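Fovea heatmap localization of this kind typically regresses a Gaussian target centered on the landmark (a NumPy sketch; sigma is an assumed hyperparameter):

```python
import numpy as np

def gaussian_heatmap(h, w, center, sigma=8.0):
    # 2D Gaussian bump at `center` = (row, col), used as a regression target.
    ys, xs = np.mgrid[0:h, 0:w]
    cy, cx = center
    return np.exp(-((ys - cy) ** 2 + (xs - cx) ** 2) / (2.0 * sigma ** 2))
```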
Affiliation(s)
- Huaqing He
- Department of Electronic and Electrical Engineering, Southern University of Science and Technology, Shenzhen, Guangdong, China; Jiaxing Research Institute, Southern University of Science and Technology, Jiaxing, Zhejiang, China.
| | - Jiaming Qiu
- Department of Electronic and Electrical Engineering, Southern University of Science and Technology, Shenzhen, Guangdong, China.
| | - Li Lin
- Department of Electronic and Electrical Engineering, Southern University of Science and Technology, Shenzhen, Guangdong, China; Jiaxing Research Institute, Southern University of Science and Technology, Jiaxing, Zhejiang, China; Department of Electrical and Electronic Engineering, The University of Hong Kong, Hong Kong, China.
| | - Zhiyuan Cai
- Department of Electronic and Electrical Engineering, Southern University of Science and Technology, Shenzhen, Guangdong, China; Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Hong Kong, China.
| | - Pujin Cheng
- Department of Electronic and Electrical Engineering, Southern University of Science and Technology, Shenzhen, Guangdong, China; Department of Electrical and Electronic Engineering, The University of Hong Kong, Hong Kong, China.
| | - Xiaoying Tang
- Department of Electronic and Electrical Engineering, Southern University of Science and Technology, Shenzhen, Guangdong, China; Jiaxing Research Institute, Southern University of Science and Technology, Jiaxing, Zhejiang, China.
| |
|
30
|
Iqbal S, Khan TM, Naqvi SS, Naveed A, Usman M, Khan HA, Razzak I. LDMRes-Net: A Lightweight Neural Network for Efficient Medical Image Segmentation on IoT and Edge Devices. IEEE J Biomed Health Inform 2024; 28:3860-3871. [PMID: 37938951 DOI: 10.1109/jbhi.2023.3331278] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/10/2023]
Abstract
In this study, we propose LDMRes-Net, a Lightweight Dual Multiscale Residual Block-based Convolutional Neural Network tailored for medical image segmentation on IoT and edge platforms. Conventional U-Net-based models face challenges in meeting the speed and efficiency demands of real-time clinical applications, such as disease monitoring, radiation therapy, and image-guided surgery. LDMRes-Net overcomes these limitations with its remarkably low number of learnable parameters (0.072 M), making it highly suitable for resource-constrained devices. The model's key innovation lies in its dual multiscale residual block architecture, which enables the extraction of refined features on multiple scales, enhancing overall segmentation performance. To further optimize efficiency, the number of filters is carefully selected to prevent overlap, reduce training time, and improve computational efficiency. The study includes comprehensive evaluations, focusing on the segmentation of retinal vessels and hard exudates, both crucial for the diagnosis and treatment of ophthalmic disease. The results demonstrate the robustness, generalizability, and high segmentation accuracy of LDMRes-Net, positioning it as an efficient tool for accurate and rapid medical image segmentation in diverse clinical applications, particularly on IoT and edge platforms. Such advances hold significant promise for improving healthcare outcomes and enabling real-time medical image analysis in resource-limited settings.
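The dual multiscale residual idea can be sketched as parallel kernel sizes fused and added back to the input (PyTorch; the 3x3/5x5 pairing and widths are illustrative, not the paper's exact block):

```python
import torch
import torch.nn as nn

class MultiscaleResidualBlock(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.p3 = nn.Conv2d(ch, ch, 3, padding=1)   # fine-scale path
        self.p5 = nn.Conv2d(ch, ch, 5, padding=2)   # coarse-scale path
        self.fuse = nn.Conv2d(2 * ch, ch, 1)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        y = self.fuse(torch.cat([self.p3(x), self.p5(x)], dim=1))
        return self.act(x + y)                      # residual connection
```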
|
31
|
Liu X, Tan H, Wang W, Chen Z. Deep learning based retinal vessel segmentation and hypertensive retinopathy quantification using heterogeneous features cross-attention neural network. Front Med (Lausanne) 2024; 11:1377479. [PMID: 38841586 PMCID: PMC11150614 DOI: 10.3389/fmed.2024.1377479] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/27/2024] [Accepted: 05/09/2024] [Indexed: 06/07/2024] Open
Abstract
Retinal vessels play a pivotal role as biomarkers in the detection of retinal diseases, including hypertensive retinopathy. The manual identification of these vessels is both resource-intensive and time-consuming, and the fidelity of automated vessel segmentation depends directly on the quality of the fundus images. In instances of sub-optimal image quality, deep learning-based methodologies emerge as a more effective approach for precise segmentation. We propose a heterogeneous neural network that combines the local semantic information extraction of convolutional neural networks with the long-range spatial feature mining of transformer structures. This cross-attention network structure boosts the model's ability to tackle vessel structures in retinal images. Experiments on four publicly available datasets demonstrate our model's superior performance on vessel segmentation and its strong potential for hypertensive retinopathy quantification.
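The CNN-to-transformer cross-attention at the heart of such heterogeneous fusion can be sketched with the stock attention layer (PyTorch; treating both feature maps as token sequences of the same embedding size is a simplifying assumption of ours):

```python
import torch.nn as nn

class CrossAttentionFusion(nn.Module):
    def __init__(self, dim, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, cnn_tokens, vit_tokens):   # both: (N, L, dim)
        # Local CNN tokens query the transformer's global tokens.
        fused, _ = self.attn(cnn_tokens, vit_tokens, vit_tokens)
        return self.norm(cnn_tokens + fused)     # residual + normalization
```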
Affiliation(s)
- Xinghui Liu
- School of Clinical Medicine, Guizhou Medical University, Guiyang, China
- Department of Cardiovascular Medicine, Guizhou Provincial People's Hospital, Guiyang, China
| | - Hongwen Tan
- Department of Cardiovascular Medicine, Guizhou Provincial People's Hospital, Guiyang, China
| | - Wu Wang
- Electrical Engineering College, Guizhou University, Guiyang, China
| | - Zhangrong Chen
- School of Clinical Medicine, Guizhou Medical University, Guiyang, China
- Department of Cardiovascular Medicine, The Affiliated Hospital of Guizhou Medical University, Guiyang, China
| |
|
32
|
Santarossa M, Beyer TT, Scharf ABA, Tatli A, von der Burchard C, Nazarenus J, Roider JB, Koch R. When Two Eyes Don't Suffice-Learning Difficult Hyperfluorescence Segmentations in Retinal Fundus Autofluorescence Images via Ensemble Learning. J Imaging 2024; 10:116. [PMID: 38786570 PMCID: PMC11122615 DOI: 10.3390/jimaging10050116] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/16/2024] [Revised: 05/03/2024] [Accepted: 05/06/2024] [Indexed: 05/25/2024] Open
Abstract
Hyperfluorescence (HF) and reduced autofluorescence (RA) are important biomarkers in fundus autofluorescence images (FAF) for assessing the health of the retinal pigment epithelium (RPE), an important indicator of disease progression in geographic atrophy (GA) or central serous chorioretinopathy (CSCR). Autofluorescence images have been annotated by human raters, but distinguishing biomarkers (whether signals are increased or decreased) from the normal background proves challenging, with borders being particularly open to interpretation. Consequently, significant variations emerge among different graders, and even within the same grader during repeated annotations. Tests on in-house FAF data show that even highly skilled medical experts, despite previously discussing and settling on precise annotation guidelines, reach a pair-wise agreement measured by Dice score of no more than 63-80% for HF segmentations and only 14-52% for RA. The data further show that the agreement of our primary annotation expert with herself is a 72% Dice score for HF and 51% for RA. Given these numbers, the task of automated HF and RA segmentation cannot simply be reduced to improving a segmentation score. Instead, we propose the use of a segmentation ensemble. Learning from images with a single annotation, the ensemble reaches expert-like performance with an agreement of a 64-81% Dice score for HF and 21-41% for RA with all our experts. In addition, utilizing the mean predictions of the ensemble networks and their variance, we devise ternary segmentations in which FAF image areas are labeled as confident background, confident HF, or potential HF, ensuring that predictions are reliable where they are confident (97% precision) while detecting all instances of HF (99% recall) annotated by all experts.
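The ternary labeling step can be sketched from the ensemble's probability maps (NumPy; for brevity this uses only the ensemble mean, while the paper also exploits the variance, and both thresholds are assumed values):

```python
import numpy as np

def ternary_map(probs, t_confident=0.9, t_potential=0.5):
    # probs: (n_models, H, W) per-model HF probabilities.
    mean = probs.mean(axis=0)
    out = np.zeros(mean.shape, dtype=np.uint8)   # 0: confident background
    out[mean >= t_potential] = 1                 # 1: potential HF
    out[mean >= t_confident] = 2                 # 2: confident HF
    return out
```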
Affiliation(s)
- Monty Santarossa
- Department of Computer Science, Kiel University, 24118 Kiel, Germany; (T.T.B.); (J.N.); (R.K.)
| | - Tebbo Tassilo Beyer
- Department of Computer Science, Kiel University, 24118 Kiel, Germany; (T.T.B.); (J.N.); (R.K.)
| | | | - Ayse Tatli
- Department of Ophthalmology, Kiel University, 24118 Kiel, Germany; (A.B.A.S.); (A.T.); (C.v.d.B.); (J.B.R.)
| | - Claus von der Burchard
- Department of Ophthalmology, Kiel University, 24118 Kiel, Germany; (A.B.A.S.); (A.T.); (C.v.d.B.); (J.B.R.)
| | - Jakob Nazarenus
- Department of Computer Science, Kiel University, 24118 Kiel, Germany; (T.T.B.); (J.N.); (R.K.)
| | - Johann Baptist Roider
- Department of Ophthalmology, Kiel University, 24118 Kiel, Germany; (A.B.A.S.); (A.T.); (C.v.d.B.); (J.B.R.)
| | - Reinhard Koch
- Department of Computer Science, Kiel University, 24118 Kiel, Germany; (T.T.B.); (J.N.); (R.K.)
| |
|
33
|
Huang H, Shang Z, Yu C. FRD-Net: a full-resolution dilated convolution network for retinal vessel segmentation. BIOMEDICAL OPTICS EXPRESS 2024; 15:3344-3365. [PMID: 38855685 PMCID: PMC11161363 DOI: 10.1364/boe.522482] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 02/28/2024] [Revised: 04/13/2024] [Accepted: 04/17/2024] [Indexed: 06/11/2024]
Abstract
Accurate and automated retinal vessel segmentation is essential for the diagnosis and surgical planning of retinal diseases. However, conventional U-shaped networks often suffer from segmentation errors when dealing with fine and low-contrast blood vessels, due to the loss of continuous resolution in the encoding stage and the inability to recover the lost information in the decoding stage. To address this issue, this paper introduces an effective full-resolution retinal vessel segmentation network, namely FRD-Net, which consists of two core components: the backbone network and the multi-scale feature fusion module (MFFM). The backbone network achieves horizontal and vertical expansion through the interaction mechanism of multi-resolution dilated convolutions while preserving the complete image resolution. In the backbone network, the effective application of dilated convolutions with varying dilation rates, coupled with dilated residual modules that integrate multi-scale feature maps from adjacent stages, facilitates continuous learning of multi-scale features to enhance high-level contextual information. Moreover, the MFFM further enhances segmentation by fusing deeper multi-scale features with the original image, facilitating edge detail recovery for accurate vessel segmentation. In tests on multiple classical datasets, FRD-Net achieves superior performance and generalization with fewer model parameters than state-of-the-art segmentation algorithms.
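A dilated residual unit of the kind described, which grows the receptive field without ever downsampling, can be sketched as follows (PyTorch; widths and normalization choices are assumptions):

```python
import torch.nn as nn

class DilatedResidual(nn.Module):
    def __init__(self, ch, dilation):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=dilation, dilation=dilation),
            nn.BatchNorm2d(ch), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=dilation, dilation=dilation),
            nn.BatchNorm2d(ch))
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.act(x + self.body(x))   # full resolution is preserved
```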
Affiliation(s)
- Hua Huang
- Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China
| | - Zhenhong Shang
- Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China
- Yunnan Key Laboratory of Artificial Intelligence, Kunming University of Science and Technology, Kunming 650500, China
| | - Chunhui Yu
- Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China
| |
|
34
|
Shi D, Zhou Y, He S, Wagner SK, Huang Y, Keane PA, Ting DS, Zhang L, Zheng Y, He M. Cross-modality Labeling Enables Noninvasive Capillary Quantification as a Sensitive Biomarker for Assessing Cardiovascular Risk. OPHTHALMOLOGY SCIENCE 2024; 4:100441. [PMID: 38420613 PMCID: PMC10899028 DOI: 10.1016/j.xops.2023.100441] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 07/04/2023] [Revised: 11/26/2023] [Accepted: 11/27/2023] [Indexed: 03/02/2024]
Abstract
Purpose We aim to use fundus fluorescein angiography (FFA) to label the capillaries on color fundus (CF) photographs and train a deep learning model to quantify retinal capillaries noninvasively from CF and apply it to cardiovascular disease (CVD) risk assessment. Design Cross-sectional and longitudinal study. Participants A total of 90732 pairs of CF-FFA images from 3893 participants for segmentation model development, and 49229 participants in the UK Biobank for association analysis. Methods We matched the vessels extracted from FFA and CF, and used vessels from FFA as labels to train a deep learning model (RMHAS-FA) to segment retinal capillaries using CF. We tested the model's accuracy on a manually labeled internal test set (FundusCapi). For external validation, we tested the segmentation model on 7 vessel segmentation datasets, and investigated the clinical value of the segmented vessels in predicting CVD events in the UK Biobank. Main Outcome Measures Area under the receiver operating characteristic curve (AUC), accuracy, sensitivity, and specificity for segmentation. Hazard ratio (HR; 95% confidence interval [CI]) for Cox regression analysis. Results On the FundusCapi dataset, the segmentation performance was AUC = 0.95, accuracy = 0.94, sensitivity = 0.90, and specificity = 0.93. Smaller vessel skeleton density had a stronger correlation with CVD risk factors and incidence (P < 0.01). Reduced density of small vessel skeletons was strongly associated with an increased risk of CVD incidence and mortality for women (HR [95% CI] = 0.91 [0.84-0.98] and 0.68 [0.54-0.86], respectively). Conclusions Using paired CF-FFA images, we automated the laborious manual labeling process and enabled noninvasive capillary quantification from CF, supporting its potential as a sensitive screening method for identifying individuals at high risk of future CVD events. Financial Disclosures Proprietary or commercial disclosure may be found in the Footnotes and Disclosures at the end of this article.
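The vessel skeleton density used in the association analysis is a simple morphological statistic (a sketch assuming scikit-image; the authors' exact normalization is not stated in the abstract):

```python
from skimage.morphology import skeletonize

def skeleton_density(vessel_mask):
    # Skeleton pixels per image pixel of a binary vessel map.
    skeleton = skeletonize(vessel_mask.astype(bool))
    return skeleton.sum() / vessel_mask.size
```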
Affiliation(s)
- Danli Shi
- School of Optometry, The Hong Kong Polytechnic University, Kowloon, Hong Kong, China
- Research Centre for SHARP Vision, The Hong Kong Polytechnic University, Kowloon, Hong Kong, China
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China
| | - Yukun Zhou
- Centre for Medical Image Computing, University College London, London, UK
| | - Shuang He
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China
| | - Siegfried K. Wagner
- NIHR Biomedical Research Centre at Moorfields Eye Hospital NHS Foundation Trust and UCL Institute of Ophthalmology, London, UK
| | - Yu Huang
- Department of Ophthalmology, Guangdong Academy of Medical Sciences, Guangdong Provincial People's Hospital, Guangzhou, China
| | - Pearse A. Keane
- NIHR Biomedical Research Centre at Moorfields Eye Hospital NHS Foundation Trust and UCL Institute of Ophthalmology, London, UK
| | - Daniel S.W. Ting
- Singapore National Eye Center, Singapore Eye Research Institute, and Duke-NUS Medical School, National University of Singapore, Singapore, Singapore
| | - Lei Zhang
- Faculty of Medicine, Central Clinical School, Monash University, Melbourne, Victoria, Australia
| | - Yingfeng Zheng
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China
| | - Mingguang He
- School of Optometry, The Hong Kong Polytechnic University, Kowloon, Hong Kong, China
- Research Centre for SHARP Vision, The Hong Kong Polytechnic University, Kowloon, Hong Kong, China
- Department of Ophthalmology, Guangdong Academy of Medical Sciences, Guangdong Provincial People's Hospital, Guangzhou, China
| |
|
35
|
Gao Y, Ma C, Guo L, Liu G, Zhang X, Ji X. Adversarial learning-based domain adaptation algorithm for intracranial artery stenosis detection on multi-source datasets. Comput Biol Med 2024; 170:108001. [PMID: 38280254 DOI: 10.1016/j.compbiomed.2024.108001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/22/2023] [Revised: 12/26/2023] [Accepted: 01/13/2024] [Indexed: 01/29/2024]
Abstract
Intracranial arterial stenosis (ICAS) is characterized by the pathological narrowing or occlusion of the inner lumen of intracranial blood vessels. The retina, however, can indirectly reflect cerebrovascular disease. Therefore, retinal fundus images (RFI) serve as valuable noninvasive and easily accessible screening tools for early detection and diagnosis of ICAS. This paper introduces an adversarial learning-based domain adaptation algorithm (ALDA) specifically designed for ICAS detection in multi-source datasets. The primary objective is to achieve accurate detection and enhanced generalization of ICAS based on RFI. Given the limitations of traditional algorithms in meeting the accuracy and generalization requirements, ALDA overcomes these challenges by leveraging RFI datasets from multiple sources and employing adversarial learning to facilitate feature representation sharing and distinguishability learning. To evaluate the performance of the ALDA algorithm, we conducted experimental validation on multi-source datasets and compared its results with those obtained from other deep learning algorithms on the ICAS detection task. Furthermore, we validated the potential of ALDA for detecting diabetic retinopathy. The experimental results clearly demonstrate the significant improvements achieved by the ALDA algorithm. By leveraging information from diverse datasets, ALDA learns feature representations that exhibit enhanced generalizability and robustness. This makes it a reliable auxiliary diagnostic tool for clinicians, thereby facilitating the prevention and treatment of cerebrovascular diseases.
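A standard building block for this style of adversarial domain adaptation is the gradient reversal layer (a PyTorch sketch of the classic trick; whether ALDA uses exactly this mechanism is not specified in the abstract):

```python
import torch

class GradientReversal(torch.autograd.Function):
    # Identity on the forward pass; negated, scaled gradient on the backward
    # pass, pushing the feature extractor toward domain-invariant features.
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None

# Usage: domain_logits = domain_head(GradientReversal.apply(features, 1.0))
```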
Affiliation(s)
- Yuan Gao
- Department of Biomedical Engineering, School of Biological Science and Medical Engineering, Beihang University, 100191, Beijing, China; Department of Ophthalmology, Xuanwu Hospital, Capital Medical University, 100053, Beijing, China.
| | - Chenbin Ma
- Department of Biomedical Engineering, School of Biological Science and Medical Engineering, Beihang University, 100191, Beijing, China; Shen Yuan Honors College, Beihang University, 100191, Beijing, China.
| | - Lishuang Guo
- Department of Biomedical Engineering, School of Biological Science and Medical Engineering, Beihang University, 100191, Beijing, China.
| | - Guiyou Liu
- Department of Ophthalmology, Beijing Tiantan Hospital, Capital Medical University, 100050, Beijing, China.
| | - Xuxiang Zhang
- Beijing Institute for Brain Disorders, Capital Medical University, 100069, Beijing, China.
| | - Xunming Ji
- Department of Biomedical Engineering, School of Biological Science and Medical Engineering, Beihang University, 100191, Beijing, China.
| |
|
36
|
Yap BP, Ng BK. Coarse-to-fine visual representation learning for medical images via class activation maps. Comput Biol Med 2024; 171:108203. [PMID: 38430741 DOI: 10.1016/j.compbiomed.2024.108203] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/27/2023] [Revised: 01/29/2024] [Accepted: 02/19/2024] [Indexed: 03/05/2024]
Abstract
The value of coarsely labeled datasets in learning transferable representations for medical images is investigated in this work. Compared to fine labels, which require meticulous effort to annotate, coarse labels can be acquired at a significantly lower cost and can provide useful training signals for data-hungry deep neural networks. We consider coarse labels in the form of binary labels differentiating a normal (healthy) image from an abnormal (diseased) image and propose CAMContrast, a two-stage representation learning framework for medical images. Using class activation maps, CAMContrast makes use of the binary labels to generate heatmaps as positive views for contrastive representation learning. Specifically, the learning objective is optimized to maximize the agreement within fixed crops of each image-heatmap pair to learn fine-grained representations that are generalizable to different downstream tasks. We empirically validate the transfer learning performance of CAMContrast on several public datasets, covering classification and segmentation tasks on fundus photographs and chest X-ray images. The experimental results show that our method outperforms other self-supervised and supervised pretraining methods in terms of data efficiency and downstream performance.
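The heatmap-generation step rests on the standard class activation map, which is just a classifier-weighted sum of the last convolutional feature maps (a PyTorch sketch; normalization to [0, 1] is our addition):

```python
import torch

def class_activation_map(features, fc_weight, class_idx):
    # features: (N, C, H, W) last conv maps; fc_weight: (n_classes, C).
    w = fc_weight[class_idx].view(1, -1, 1, 1)
    cam = (features * w).sum(dim=1)                       # (N, H, W)
    cam = cam - cam.amin(dim=(1, 2), keepdim=True)
    return cam / cam.amax(dim=(1, 2), keepdim=True).clamp(min=1e-8)
```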
Affiliation(s)
- Boon Peng Yap
- School of Electrical and Electronic Engineering, Nanyang Technological University, 50 Nanyang Ave, 639798, Singapore; Centre for OptoElectronics and Biophotonics, Nanyang Technological University, 50 Nanyang Ave, 639798, Singapore.
| | - Beng Koon Ng
- School of Electrical and Electronic Engineering, Nanyang Technological University, 50 Nanyang Ave, 639798, Singapore; Centre for OptoElectronics and Biophotonics, Nanyang Technological University, 50 Nanyang Ave, 639798, Singapore.
| |
|
37
|
Fakhouri HN, Alawadi S, Awaysheh FM, Alkhabbas F, Zraqou J. A cognitive deep learning approach for medical image processing. Sci Rep 2024; 14:4539. [PMID: 38402321 PMCID: PMC10894297 DOI: 10.1038/s41598-024-55061-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/05/2023] [Accepted: 02/20/2024] [Indexed: 02/26/2024] Open
Abstract
In ophthalmic diagnostics, achieving precise segmentation of retinal blood vessels is a critical yet challenging task, primarily due to the complex nature of retinal images. The intricacies of these images often hinder the accuracy and efficiency of segmentation processes. To overcome these challenges, we introduce cognitive DL retinal blood vessel segmentation (CoDLRBVS), a novel hybrid model that synergistically combines the deep learning capabilities of the U-Net architecture with a suite of advanced image processing techniques. This model uniquely integrates a preprocessing phase using a matched filter (MF) for feature enhancement and a post-processing phase employing morphological techniques (MT) for refining the segmentation output. The model also incorporates multi-scale line detection and scale space methods to enhance its segmentation capabilities. Hence, CoDLRBVS leverages the strengths of these combined approaches within the cognitive computing framework, endowing the system with human-like adaptability and reasoning. This strategic integration enables the model to emphasize blood vessels, segment them accurately, and proficiently detect vessels of varying sizes. CoDLRBVS achieves a notable mean accuracy of 96.7%, precision of 96.9%, sensitivity of 99.3%, and specificity of 80.4% across all of the studied datasets, including DRIVE, STARE, HRF, Retinal Blood Vessel, and CHASE_DB1. CoDLRBVS has been compared with different models, and its metrics surpass those of the compared models, establishing a new benchmark in retinal vessel segmentation. The success of CoDLRBVS underscores its significant potential in advancing medical image processing, particularly in the realm of retinal blood vessel segmentation.
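The matched-filter preprocessing named here follows the classic recipe of correlating the image with a zero-mean Gaussian vessel profile at many orientations (a NumPy/SciPy sketch with typical, assumed hyperparameters, not the paper's exact settings):

```python
import numpy as np
from scipy.ndimage import convolve, rotate

def matched_filter_response(img, sigma=1.5, length=9, n_angles=12):
    half = length // 2
    xs = np.arange(-half, half + 1)
    profile = -np.exp(-xs**2 / (2.0 * sigma**2))           # dark-vessel cross-section
    base = np.tile(profile - profile.mean(), (length, 1))  # zero-mean kernel
    responses = [convolve(img.astype(float),
                          rotate(base, ang, reshape=False, order=1))
                 for ang in np.linspace(0.0, 180.0, n_angles, endpoint=False)]
    return np.max(responses, axis=0)                       # best orientation per pixel
```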
Affiliation(s)
- Hussam N Fakhouri
- Department of Data Science and Artificial Intelligence, The University of Petra, Amman, Jordan
| | - Sadi Alawadi
- Department of Computer Science, Blekinge Institute of Technology, Karlskrona, Sweden.
- Computer Graphics and Data Engineering (COGRADE) Research Group, University of Santiago de Compostela, Santiago de Compostela, Spain.
| | - Feras M Awaysheh
- Institute of Computer Science, Delta Research Centre, University of Tartu, Tartu, Estonia
| | - Fahed Alkhabbas
- Internet of Things and People Research Center, Malmö University, Malmö, Sweden
- Department of Computer Science and Media Technology, Malmö University, Malmö, Sweden
| | - Jamal Zraqou
- Virtual and Augment Reality Department, Faculty of Information Technology, University of Petra, Amman, Jordan
| |
|
38
|
Jiao R, Zhang Y, Ding L, Xue B, Zhang J, Cai R, Jin C. Learning with limited annotations: A survey on deep semi-supervised learning for medical image segmentation. Comput Biol Med 2024; 169:107840. [PMID: 38157773 DOI: 10.1016/j.compbiomed.2023.107840] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/18/2023] [Revised: 10/30/2023] [Accepted: 12/07/2023] [Indexed: 01/03/2024]
Abstract
Medical image segmentation is a fundamental and critical step in many image-guided clinical approaches. Recent success of deep learning-based segmentation methods usually relies on a large amount of labeled data, which is particularly difficult and costly to obtain, especially in the medical imaging domain where only experts can provide reliable and accurate annotations. Semi-supervised learning has emerged as an appealing strategy and been widely applied to medical image segmentation tasks to train deep models with limited annotations. In this paper, we present a comprehensive review of recently proposed semi-supervised learning methods for medical image segmentation and summarize both the technical novelties and empirical results. Furthermore, we analyze and discuss the limitations and several unsolved problems of existing approaches. We hope this review can inspire the research community to explore solutions to this challenge and further advance the field of medical image segmentation.
Affiliation(s)
- Rushi Jiao
- School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, 200240, China; School of Engineering Medicine, Beihang University, Beijing, 100191, China; Shanghai Artificial Intelligence Laboratory, Shanghai, 200232, China.
| | - Yichi Zhang
- School of Data Science, Fudan University, Shanghai, 200433, China; Artificial Intelligence Innovation and Incubation Institute, Fudan University, Shanghai, 200433, China.
| | - Le Ding
- School of Biological Science and Medical Engineering, Beihang University, Beijing, 100191, China.
| | - Bingsen Xue
- School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, 200240, China; Shanghai Artificial Intelligence Laboratory, Shanghai, 200232, China.
| | - Jicong Zhang
- School of Biological Science and Medical Engineering, Beihang University, Beijing, 100191, China; Hefei Innovation Research Institute, Beihang University, Hefei, 230012, China.
| | - Rong Cai
- School of Engineering Medicine, Beihang University, Beijing, 100191, China; Key Laboratory for Biomechanics and Mechanobiology of Ministry of Education, Beihang University, Beijing, 100191, China.
| | - Cheng Jin
- School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, 200240, China; Shanghai Artificial Intelligence Laboratory, Shanghai, 200232, China; Beijing Anding Hospital, Capital Medical University, Beijing, 100088, China.
| |
Collapse
|
39
|
Peng Y, Tang Y, Luan P, Zhang Z, Tu H. MAFE-Net: retinal vessel segmentation based on a multiple attention-guided fusion mechanism and ensemble learning network. Biomed Opt Express 2024; 15:843-862. [PMID: 38404318 PMCID: PMC10890843 DOI: 10.1364/boe.510251] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 10/25/2023] [Revised: 01/09/2024] [Accepted: 01/10/2024] [Indexed: 02/27/2024]
Abstract
The precise and automatic recognition of retinal vessels is of utmost importance in the prevention, diagnosis, and assessment of certain eye diseases, yet the task remains challenging due to intricate factors such as uneven and indistinct curvilinear shapes, unpredictable pathological deformations, and non-uniform contrast. Therefore, we propose a unique and practical approach based on a multiple attention-guided fusion mechanism and ensemble learning network (MAFE-Net) for retinal vessel segmentation. In conventional UNet-based models, long-distance dependencies are not explicitly modeled, which may cause partial scene information loss. To compensate for this deficiency, various blood vessel features can be extracted from retinal images by using an attention-guided fusion module. In the skip connection part, a dedicated spatial attention module removes redundant and irrelevant information; this structure helps to better integrate low-level and high-level features. The final step involves a DropOut layer that removes some neurons randomly to prevent overfitting and improve generalization. Moreover, an ensemble learning framework is designed to detect retinal vessels by combining different deep learning models. To demonstrate the effectiveness of the proposed model, experiments were conducted on the public datasets STARE, DRIVE, and CHASEDB1, achieving F1 scores of 0.842, 0.825, and 0.814, and accuracy values of 0.975, 0.969, and 0.975, respectively. Compared with eight state-of-the-art models, the designed model produces satisfactory results both visually and quantitatively.
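The skip-connection gating described above can be sketched with a generic CBAM-style spatial attention module: channel-pooled descriptors produce a per-position gate applied to the encoder features before they join the decoder. This is a minimal stand-in under stated assumptions, not the exact MAFE-Net block.

```python
import torch
import torch.nn as nn

class SpatialAttentionSkip(nn.Module):
    def __init__(self, kernel_size=7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, skip_feat):
        avg_map = skip_feat.mean(dim=1, keepdim=True)    # (B, 1, H, W)
        max_map, _ = skip_feat.max(dim=1, keepdim=True)  # (B, 1, H, W)
        attn = torch.sigmoid(self.conv(torch.cat([avg_map, max_map], dim=1)))
        return skip_feat * attn  # suppress redundant spatial positions
```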
Affiliation(s)
- Yuanyuan Peng: School of Electrical and Automation Engineering, East China Jiaotong University, Nanchang 330000, China
- Yingjie Tang: School of Electrical and Automation Engineering, East China Jiaotong University, Nanchang 330000, China
- Pengpeng Luan: School of Electrical and Automation Engineering, East China Jiaotong University, Nanchang 330000, China
- Zixu Zhang: School of Electrical and Automation Engineering, East China Jiaotong University, Nanchang 330000, China
- Hongbin Tu: School of Electrical and Automation Engineering, East China Jiaotong University, Nanchang 330000, China

40
Zheng C, Li H, Ge Y, He Y, Yi Y, Zhu M, Sun H, Kong J. Retinal vessel segmentation based on multi-scale feature and style transfer. Math Biosci Eng 2024; 21:49-74. [PMID: 38303413 DOI: 10.3934/mbe.2024003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 02/03/2024]
Abstract
Retinal vessel segmentation is very important for diagnosing and treating certain eye diseases. Recently, many deep learning-based retinal vessel segmentation methods have been proposed; however, they still have shortcomings, e.g., they cannot obtain satisfactory results when dealing with cross-domain data or when segmenting small blood vessels. To alleviate these problems while avoiding an overly complex model, we propose a novel network based on multi-scale features and style transfer (MSFST-NET) for retinal vessel segmentation. Specifically, we first construct a lightweight segmentation module named MSF-Net, which introduces the selective kernel (SK) module to increase the multi-scale feature extraction ability of the model and thereby improve small blood vessel segmentation. Then, to alleviate the performance degradation that occurs when segmenting cross-domain datasets, we propose a style transfer module and a pseudo-label learning strategy. The style transfer module reduces the style difference between source domain and target domain images to improve segmentation performance on the target domain. The pseudo-label learning strategy is designed to be combined with the style transfer module to further boost the generalization ability of the model. We trained and tested MSFST-NET on the DRIVE and CHASE_DB1 datasets. The experimental results demonstrate that MSFST-NET effectively improves the generalization ability of the model on cross-domain datasets and achieves better retinal vessel segmentation results than other state-of-the-art methods.
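A compact sketch of a selective kernel (SK) unit of the kind MSF-Net builds on: two branches with different receptive fields whose outputs are fused by learned, channel-wise softmax weights. The branch configuration and reduction ratio here are assumptions, not the paper's values.

```python
import torch
import torch.nn as nn

class SKBlock(nn.Module):
    def __init__(self, channels, reduction=4):
        super().__init__()
        self.branch3 = nn.Conv2d(channels, channels, 3, padding=1)
        self.branch5 = nn.Conv2d(channels, channels, 3, padding=2, dilation=2)
        hidden = max(channels // reduction, 8)
        self.fc = nn.Sequential(nn.Linear(channels, hidden), nn.ReLU(inplace=True))
        self.select = nn.Linear(hidden, channels * 2)

    def forward(self, x):
        u3, u5 = self.branch3(x), self.branch5(x)
        s = (u3 + u5).mean(dim=(2, 3))                        # global descriptor (B, C)
        z = self.select(self.fc(s))                           # (B, 2C)
        w = torch.softmax(z.view(-1, 2, u3.size(1)), dim=1)   # per-branch weights
        w3 = w[:, 0].unsqueeze(-1).unsqueeze(-1)
        w5 = w[:, 1].unsqueeze(-1).unsqueeze(-1)
        return w3 * u3 + w5 * u5
```

The softmax over the branch axis lets each channel choose between the small and large receptive field, which is what gives the block its sensitivity to vessels of different calibres.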
Affiliation(s)
- Caixia Zheng: Jilin Animation Institute, Changchun 130013, China; College of Information Science and Technology, Northeast Normal University, Changchun 130117, China
- Huican Li: College of Information Science and Technology, Northeast Normal University, Changchun 130117, China
- Yingying Ge: Jilin Animation Institute, Changchun 130013, China
- Yanlin He: College of Information Science and Technology, Northeast Normal University, Changchun 130117, China
- Yugen Yi: School of Software, Jiangxi Normal University, Nanchang 330022, China
- Meili Zhu: Jilin Animation Institute, Changchun 130013, China
- Hui Sun: School of Science and Technology, Changchun Humanities and Sciences College, Changchun 130117, China
- Jun Kong: College of Information Science and Technology, Northeast Normal University, Changchun 130117, China

41
Ma Z, Li X. An improved supervised and attention mechanism-based U-Net algorithm for retinal vessel segmentation. Comput Biol Med 2024; 168:107770. [PMID: 38056215 DOI: 10.1016/j.compbiomed.2023.107770] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/10/2023] [Revised: 11/08/2023] [Accepted: 11/26/2023] [Indexed: 12/08/2023]
Abstract
The segmentation results of retinal blood vessels are crucial for automatically diagnosing ophthalmic diseases such as diabetic retinopathy, hypertension, and cardiovascular and cerebrovascular diseases. To improve the accuracy of vessel segmentation and better extract information about small vessels and edges, we introduce a U-Net algorithm with a supervised attention mechanism for retinal vessel segmentation. We achieve this by introducing a decoder fusion module (DFM) in the encoding part, effectively combining different convolutional blocks to extract features comprehensively. Additionally, in the decoding part of U-Net, we propose the context squeeze and excitation (CSE) decoding module to enhance important contextual feature information and the detection of tiny blood vessels. For the final output, we introduce the supervised fusion mechanism (SFM), which combines multiple branches from shallow to deep layers, effectively fusing multi-scale features and capturing information from different levels, fully integrating low-level and high-level features to improve segmentation performance. Our experimental results on the public DRIVE, STARE, and CHASE_DB1 datasets demonstrate the excellent performance of our proposed network.
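The CSE decoding module builds on channel recalibration in the squeeze-and-excitation style. Below is the standard SE block as a generic reference implementation; the paper's context-aware variant likely differs, so treat this as a minimal sketch.

```python
import torch.nn as nn

class SEBlock(nn.Module):
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)  # squeeze: one value per channel
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),                    # excitation weights in (0, 1)
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w                         # reweight channels by global context
```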
Affiliation(s)
- Zhendi Ma: School of Computer Science and Technology, Zhejiang Normal University, Jinhua 321004, China
- Xiaobo Li: School of Computer Science and Technology, Zhejiang Normal University, Jinhua 321004, China

42
Li C, Li Z, Liu W. TDCAU-Net: retinal vessel segmentation using transformer dilated convolutional attention-based U-Net method. Phys Med Biol 2023; 69:015003. [PMID: 38052089 DOI: 10.1088/1361-6560/ad1273] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/23/2023] [Accepted: 12/05/2023] [Indexed: 12/07/2023]
Abstract
Retinal vessel segmentation plays a vital role in the medical field, facilitating the identification of numerous chronic conditions based on retinal vessel images, including diabetic retinopathy, hypertensive retinopathy, glaucoma, and others. Although the U-Net model has shown promising results in retinal vessel segmentation, it tends to struggle with fine branching and dense vessel segmentation. To further enhance the precision of retinal vessel segmentation, we propose transformer dilated convolution attention U-Net (TDCAU-Net), which builds upon the U-Net architecture with Transformer-based dilated convolution attention mechanisms. The proposed model retains the three-layer architecture of the U-Net network. The Transformer component enables the learning of contextual information for each pixel in the image, while the dilated convolution attention prevents information loss. The pipeline starts with a five-step preprocessing of the images, followed by chunking them into segments. Subsequently, the retinal images are fed into the modified U-Net network for segmentation. The study employs eye fundus images from the DRIVE and CHASEDB1 databases for both training and testing, and evaluation metrics are used to compare the results with state-of-the-art methods. The experimental analysis on both databases demonstrates that the algorithm achieves high sensitivity, specificity, accuracy, and AUC: 0.8187, 0.9756, 0.9556, and 0.9795, respectively, on DRIVE, and 0.8243, 0.9836, 0.9738, and 0.9878, respectively, on CHASEDB1. These results show that the proposed approach outperforms state-of-the-art methods on both datasets. The TDCAU-Net model exhibits substantial capability in accurately segmenting fine branching and dense vessels, surpassing the U-Net algorithm and several mainstream methods.
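The dilated-convolution attention idea can be illustrated with a small gating branch: dilated convolutions widen the receptive field without downsampling, and their output gates the incoming features so context is retained rather than lost. The dilation rates and residual form below are illustrative assumptions, not the paper's exact block.

```python
import torch
import torch.nn as nn

class DilatedConvAttention(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.context = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=2, dilation=2),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=4, dilation=4),
        )

    def forward(self, x):
        attn = torch.sigmoid(self.context(x))  # wide-context gate per position
        return x + x * attn                    # residual keeps the original signal
```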
Affiliation(s)
- Chunyang Li: School of Electronics and Information Engineering, University of Science and Technology Liaoning, Anshan, People's Republic of China
- Zhigang Li: School of Electronics and Information Engineering, University of Science and Technology Liaoning, Anshan, People's Republic of China
- Weikang Liu: School of Electronics and Information Engineering, University of Science and Technology Liaoning, Anshan, People's Republic of China

43
Zhao T, Guan Y, Tu D, Yuan L, Lu G. Neighbored-attention U-net (NAU-net) for diabetic retinopathy image segmentation. Front Med (Lausanne) 2023; 10:1309795. [PMID: 38131040 PMCID: PMC10733532 DOI: 10.3389/fmed.2023.1309795] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/08/2023] [Accepted: 11/22/2023] [Indexed: 12/23/2023]
Abstract
Background: Diabetic retinopathy-related (DR-related) diseases pose an increasing threat to eye health as the number of young patients with diabetes mellitus grows significantly. The automatic diagnosis of DR-related diseases has benefited from the rapid development of image semantic segmentation and other deep learning technology.
Methods: Inspired by the architecture of the U-Net family, a neighbored attention U-Net (NAU-Net) is designed to balance identification performance and computational cost for DR fundus image segmentation. In the new network, only the neighboring high- and low-dimensional feature maps of the encoder and decoder are fused, by using four attention gates. With this improvement, the common target features in the high-dimensional feature maps of the encoder are enhanced and then fused with the low-dimensional feature maps of the decoder. Moreover, the network fuses only neighboring layers and omits the inner connections commonly used in U-Net++. Consequently, the proposed network achieves better identification performance at a lower computational cost.
Results: Experimental results on three open DR fundus image datasets (DRIVE, HRF, and CHASEDB) indicate that NAU-Net outperforms FCN, SegNet, attention U-Net, and U-Net++ in terms of Dice score, IoU, accuracy, and precision, while its computational cost lies between those of attention U-Net and U-Net++.
Conclusion: The proposed NAU-Net exhibits better performance at a relatively low computational cost and provides an efficient novel approach for DR fundus image segmentation and a new automatic tool for DR-related eye disease diagnosis.
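The attention gates referenced in Methods follow the additive form popularized by the attention U-Net family: a decoder feature g gates the neighboring encoder feature x so that only target-relevant activations pass through the skip connection. A standard gate is sketched below; the channel choices are assumptions.

```python
import torch
import torch.nn as nn

class AttentionGate(nn.Module):
    def __init__(self, enc_channels, dec_channels, inter_channels):
        super().__init__()
        self.wx = nn.Conv2d(enc_channels, inter_channels, 1)
        self.wg = nn.Conv2d(dec_channels, inter_channels, 1)
        self.psi = nn.Conv2d(inter_channels, 1, 1)

    def forward(self, x, g):
        # x: encoder skip feature, g: decoder gating feature (same H, W assumed)
        a = torch.relu(self.wx(x) + self.wg(g))
        alpha = torch.sigmoid(self.psi(a))  # (B, 1, H, W) attention map
        return x * alpha                    # pass only target-relevant activations
```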
Affiliation(s)
- Tingting Zhao: The Second Department of Internal Medicine, Donghu Hospital of Wuhan, Wuhan, China
- Yawen Guan: The Second Department of Internal Medicine, Donghu Hospital of Wuhan, Wuhan, China
- Dan Tu: The Second Department of Internal Medicine, Donghu Hospital of Wuhan, Wuhan, China
- Lixia Yuan: The Department of Ophthalmology, Donghu Hospital of Wuhan, Wuhan, China
- Guangtao Lu: Precision Manufacturing Institute, Wuhan University of Science and Technology, Wuhan, China

44
Wang J, Huang G, Zhong G, Yuan X, Pun CM, Deng J. QGD-Net: A Lightweight Model Utilizing Pixels of Affinity in Feature Layer for Dermoscopic Lesion Segmentation. IEEE J Biomed Health Inform 2023; 27:5982-5993. [PMID: 37773914 DOI: 10.1109/jbhi.2023.3320953] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/01/2023]
Abstract
Pixels with location affinity, also called "pixels of affinity," carry similar semantic information. Group convolution and dilated convolution can exploit such pixels to improve a model's capability. However, group convolution does not utilize pixels of affinity between layers, and with dilated convolution, after multiple convolutions with the same dilation rate, the pixels utilized within each layer do not possess location affinity with each other. To solve the problem with group convolution, our proposed quaternion group convolution uses quaternion convolution, which promotes communication between channels and thus exploits pixels of affinity across channels. In quaternion group convolution, the feature layers are divided into four layers per group, ensuring that the quaternion convolution can be performed. To solve the problem with dilated convolution, we propose the quaternion sawtooth wave-like dilated convolution module (QS module). The QS module uses quaternion convolution with sawtooth wave-like dilation rates to effectively leverage pixels that share location affinity both between and within layers. This allows for an expanded receptive field, ultimately enhancing the performance of the model. In particular, we employ our quaternion group convolution in the QS module to design the quaternion group dilated neural network (QGD-Net). Extensive experiments on dermoscopic lesion segmentation based on ISIC 2016 and ISIC 2017 indicate that our method significantly reduces the model parameters while markedly improving precision. The method also generalizes to retinal vessel segmentation.
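The "sawtooth wave-like" dilation schedule can be sketched without the quaternion machinery: instead of repeating one dilation rate (which makes successive layers keep sampling the same gridded, non-adjacent pixels), the rate cycles, e.g. 1, 2, 3, 1, 2, 3, so nearby pixels with location affinity are revisited while the receptive field still grows. Plain convolutions stand in for the paper's quaternion grouping; layer count and cycle are assumptions.

```python
import torch.nn as nn

def sawtooth_dilated_stack(channels, n_layers=6, cycle=(1, 2, 3)):
    """Stack of 3x3 convolutions whose dilation rate follows a sawtooth cycle."""
    layers = []
    for i in range(n_layers):
        d = cycle[i % len(cycle)]  # sawtooth: 1, 2, 3, 1, 2, 3, ...
        layers += [nn.Conv2d(channels, channels, 3, padding=d, dilation=d),
                   nn.ReLU(inplace=True)]
    return nn.Sequential(*layers)
```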
45
Lin L, Peng L, He H, Cheng P, Wu J, Wong KKY, Tang X. YoloCurvSeg: You only label one noisy skeleton for vessel-style curvilinear structure segmentation. Med Image Anal 2023; 90:102937. [PMID: 37672901 DOI: 10.1016/j.media.2023.102937] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/21/2023] [Revised: 06/30/2023] [Accepted: 08/16/2023] [Indexed: 09/08/2023]
Abstract
Weakly-supervised learning (WSL) has been proposed to alleviate the conflict between data annotation cost and model performance by employing sparsely-grained (i.e., point-, box-, or scribble-wise) supervision, and has shown promising performance, particularly in image segmentation. However, it remains a very challenging task due to the limited supervision, especially when only a small number of labeled samples are available. Additionally, almost all existing WSL segmentation methods are designed for star-convex structures, which are very different from curvilinear structures such as vessels and nerves. In this paper, we propose a novel sparsely annotated segmentation framework for curvilinear structures, named YoloCurvSeg. An essential component of YoloCurvSeg is image synthesis. Specifically, a background generator delivers image backgrounds that closely match the real distributions by inpainting dilated skeletons. The extracted backgrounds are then combined, via a multilayer patch-wise contrastive learning synthesizer, with randomly emulated curves produced by a Space Colonization Algorithm-based foreground generator. In this way, a synthetic dataset with both images and curve segmentation labels is obtained at the cost of only one or a few noisy skeleton annotations. Finally, a segmenter is trained with the generated dataset and possibly an unlabeled dataset. The proposed YoloCurvSeg is evaluated on four publicly available datasets (OCTA500, CORN, DRIVE and CHASEDB1), and the results show that it outperforms state-of-the-art WSL segmentation methods by large margins. With only one noisy skeleton annotation (respectively 0.14%, 0.03%, 1.40%, and 0.65% of the full annotation), YoloCurvSeg achieves more than 97% of the fully-supervised performance on each dataset. Code and datasets will be released at https://github.com/llmir/YoloCurvSeg.
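A toy version of the background step, under stated assumptions: dilate the noisy skeleton annotation and inpaint the covered pixels so a vessel-free background remains. Classical OpenCV inpainting is a stand-in for the paper's learned inpainting; the dilation width and inpainting radius are arbitrary choices.

```python
import cv2
import numpy as np

def vessel_free_background(image_gray, skeleton_mask, dilate_px=7):
    """image_gray: uint8 image; skeleton_mask: binary mask of the noisy skeleton."""
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (dilate_px, dilate_px))
    cover = cv2.dilate(skeleton_mask.astype(np.uint8), kernel)  # widen the skeleton
    # Fill the masked vessel region from the surrounding retinal texture.
    return cv2.inpaint(image_gray, cover, 5, cv2.INPAINT_TELEA)
```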
Affiliation(s)
- Li Lin: Department of Electronic and Electrical Engineering, Southern University of Science and Technology, Shenzhen, China; Department of Electrical and Electronic Engineering, the University of Hong Kong, Hong Kong, China; Jiaxing Research Institute, Southern University of Science and Technology, Jiaxing, China
- Linkai Peng: Department of Electronic and Electrical Engineering, Southern University of Science and Technology, Shenzhen, China
- Huaqing He: Department of Electronic and Electrical Engineering, Southern University of Science and Technology, Shenzhen, China; Jiaxing Research Institute, Southern University of Science and Technology, Jiaxing, China
- Pujin Cheng: Department of Electronic and Electrical Engineering, Southern University of Science and Technology, Shenzhen, China; Jiaxing Research Institute, Southern University of Science and Technology, Jiaxing, China
- Jiewei Wu: Department of Electronic and Electrical Engineering, Southern University of Science and Technology, Shenzhen, China
- Kenneth K Y Wong: Department of Electrical and Electronic Engineering, the University of Hong Kong, Hong Kong, China
- Xiaoying Tang: Department of Electronic and Electrical Engineering, Southern University of Science and Technology, Shenzhen, China; Jiaxing Research Institute, Southern University of Science and Technology, Jiaxing, China

46
Chen H, Wang R, Wang X, Li J, Fang Q, Li H, Bai J, Peng Q, Meng D, Wang L. Unsupervised Local Discrimination for Medical Images. IEEE Trans Pattern Anal Mach Intell 2023; 45:15912-15929. [PMID: 37494162 DOI: 10.1109/tpami.2023.3299038] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 07/28/2023]
Abstract
Contrastive learning, which aims to capture general representations from unlabeled images to initialize medical analysis models, has proven effective in alleviating the high demand for expensive annotations. Current methods, however, mainly focus on instance-wise comparisons to learn globally discriminative features, overlooking the local details needed to distinguish tiny anatomical structures, lesions, and tissues. To address this challenge, we propose a general unsupervised representation learning framework, named local discrimination (LD), to learn local discriminative features for medical images by closely embedding semantically similar pixels and identifying regions of similar structure across different images. Specifically, the model is equipped with an embedding module for pixel-wise embedding and a clustering module for generating segmentation. The two modules are unified by optimizing our novel region discrimination loss function in a mutually beneficial mechanism, which enables our model to reflect structure information as well as measure pixel-wise and region-wise similarity. Furthermore, based on LD, we propose a center-sensitive one-shot landmark localization algorithm and a shape-guided cross-modality segmentation model to foster the generalizability of our model. When transferred to downstream tasks, the representation learned by our method generalizes better, outperforming representations from 18 state-of-the-art (SOTA) methods and winning 9 of the 12 downstream tasks. Especially for the challenging lesion segmentation tasks, the proposed method achieves significantly better performance.
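In the same spirit, a generic dense (pixel-level) contrastive objective pulls together embeddings of the same pixel under two augmented views, with all other sampled pixels acting as negatives. This is a standard InfoNCE formulation offered as a sketch only; it is not the paper's region discrimination loss.

```python
import torch
import torch.nn.functional as F

def dense_info_nce(emb_a, emb_b, temperature=0.1, n_samples=256):
    # emb_*: (B, D, H, W) pixel embeddings from two views, spatially aligned.
    b, d, h, w = emb_a.shape
    za = F.normalize(emb_a.flatten(2).transpose(1, 2).reshape(-1, d), dim=1)
    zb = F.normalize(emb_b.flatten(2).transpose(1, 2).reshape(-1, d), dim=1)
    idx = torch.randperm(za.size(0), device=za.device)[:n_samples]
    za, zb = za[idx], zb[idx]                  # subsample pixels to bound memory
    logits = za @ zb.t() / temperature         # (N, N) pairwise similarities
    target = torch.arange(za.size(0), device=za.device)
    return F.cross_entropy(logits, target)     # pixel i should match pixel i
```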
47
Lewandowska E, Węsierski D, Mazur-Milecka M, Liss J, Jezierska A. Ensembling noisy segmentation masks of blurred sperm images. Comput Biol Med 2023; 166:107520. [PMID: 37804777 DOI: 10.1016/j.compbiomed.2023.107520] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/22/2023] [Revised: 08/11/2023] [Accepted: 09/19/2023] [Indexed: 10/09/2023]
Abstract
Background: Sperm tail morphology and motility have been demonstrated to be important factors in determining sperm quality for in vitro fertilization. However, many existing computer-aided sperm analysis systems leave the sperm tail out of the analysis, as detecting a few tail pixels is challenging. Moreover, some publicly available datasets for classifying morphological defects contain images limited only to the sperm head. This study focuses on the segmentation of full sperm, which consist of head and tail parts and appear alone and in groups.
Methods: We re-purpose the Feature Pyramid Network to ensemble an input image with multiple masks from state-of-the-art segmentation algorithms using a scale-specific cross-attention module. We normalize homogeneous backgrounds for improved training. The low depth of field of microscopes blurs the images, easily confusing human raters in discerning minuscule sperm from large backgrounds. We thus propose evaluation protocols for scoring segmentation models trained on imbalanced data and noisy ground truth.
Results: The neural ensembling of noisy segmentation masks outperforms all single, state-of-the-art segmentation algorithms in full sperm segmentation. Human raters agree more on the head than the tail masks. The algorithms also segment the head better than the tail.
Conclusions: The extensive evaluation of state-of-the-art segmentation algorithms shows that full sperm segmentation is challenging. We release the SegSperm dataset of images from Intracytoplasmic Sperm Injection procedures to spur further progress on full sperm segmentation with noisy and imbalanced ground truth. The dataset is publicly available at https://doi.org/10.34808/6wm7-1159.
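As a point of contrast with the learned cross-attention ensembling above, the simplest way to combine several noisy masks is a soft vote over the per-model probability maps. The sketch below is a deliberately plain baseline; the weights and threshold are illustrative.

```python
import numpy as np

def soft_vote(prob_maps, weights=None, threshold=0.5):
    # prob_maps: list of (H, W) arrays in [0, 1] from different segmenters.
    stack = np.stack(prob_maps, axis=0)
    if weights is None:
        weights = np.ones(len(prob_maps))
    weights = np.asarray(weights, dtype=np.float64)
    fused = np.tensordot(weights / weights.sum(), stack, axes=1)  # (H, W)
    return (fused >= threshold).astype(np.uint8)  # weighted-majority mask
```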
Affiliation(s)
- Daniel Węsierski: Cameras and Algorithms Lab, Gdańsk University of Technology, Poland; Multimedia Systems Department, Faculty of Electronics, Telecommunication, and Informatics, Gdańsk University of Technology, Poland
- Magdalena Mazur-Milecka: Department of Biomedical Engineering, Faculty of Electronics, Telecommunications, and Informatics, Gdańsk University of Technology, Poland
- Joanna Liss: Invicta Research and Development Center, Sopot, Poland; Department of Medical Biology and Genetics, University of Gdańsk, Poland
- Anna Jezierska: Cameras and Algorithms Lab, Gdańsk University of Technology, Poland; Department of Biomedical Engineering, Faculty of Electronics, Telecommunications, and Informatics, Gdańsk University of Technology, Poland; Department of Modelling and Optimization of Dynamical Systems, Systems Research Institute Warsaw, Poland

48
Suman S, Tiwari AK, Singh K. Computer-aided diagnostic system for hypertensive retinopathy: A review. Comput Methods Programs Biomed 2023; 240:107627. [PMID: 37320942 DOI: 10.1016/j.cmpb.2023.107627] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 12/06/2022] [Revised: 05/03/2023] [Accepted: 05/27/2023] [Indexed: 06/17/2023]
Abstract
Hypertensive retinopathy (HR) is a retinal disease caused by elevated blood pressure sustained over a prolonged period. There are no obvious signs in the early stages of high blood pressure, but over time it affects various body parts, including the eyes. HR is a biomarker for several illnesses, including retinal diseases, atherosclerosis, stroke, kidney disease, and cardiovascular risk. Early microcirculation abnormalities in chronic diseases can be diagnosed through retinal examination before the onset of major clinical consequences. Computer-aided diagnosis (CAD) plays a vital role in the early identification of HR with improved diagnostic accuracy, while being time-efficient and demanding fewer resources. Recently, numerous studies have been reported on the automatic identification of HR. This paper provides a comprehensive review of the automated tasks of artery-vein (A/V) classification, arteriovenous ratio (AVR) computation, HR detection (binary classification), and HR severity grading. The review is conducted using the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) protocol. The paper discusses the clinical features of HR, the availability of datasets, existing methods for A/V classification, AVR computation, HR detection, and severity grading, and performance evaluation metrics. The reviewed articles are summarized with classifier details, the methodologies adopted, performance comparisons, dataset details with their pros and cons, and the computational platforms used. For each task, a summary and a critical in-depth analysis are provided, along with common research issues and challenges in the existing studies. Finally, the paper proposes future research directions to overcome the challenges associated with dataset availability, HR detection, and severity grading.
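Of the tasks listed, AVR computation is the most formulaic. A hedged sketch using the revised Knudtson formulas, one of the standard recipes such CAD systems implement: the six widest arteriole and venule calibres are iteratively paired (widest with narrowest) and combined until a single equivalent remains; the constants 0.88 (arterioles) and 0.95 (venules) follow Knudtson et al. (2003), and the vessel-width measurement itself is assumed to come from an upstream segmentation step.

```python
import numpy as np

def knudtson_equivalent(widths, k):
    """Reduce measured calibres to a single central equivalent (CRAE or CRVE)."""
    w = sorted(widths, reverse=True)[:6]  # six largest calibres, descending
    while len(w) > 1:
        combined = [k * np.hypot(w[i], w[-(i + 1)]) for i in range(len(w) // 2)]
        if len(w) % 2:                    # odd count: middle value passes through
            combined.append(w[len(w) // 2])
        w = sorted(combined, reverse=True)
    return w[0]

def avr(arteriole_widths, venule_widths):
    crae = knudtson_equivalent(arteriole_widths, k=0.88)
    crve = knudtson_equivalent(venule_widths, k=0.95)
    return crae / crve  # values well below ~2/3 suggest arteriolar narrowing
```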
Affiliation(s)
- Supriya Suman: Interdisciplinary Research Platform (IDRP): Smart Healthcare, Indian Institute of Technology, N.H. 62, Nagaur Road, Karwar, Jodhpur, Rajasthan 342030, India
- Anil Kumar Tiwari: Department of Electrical Engineering, Indian Institute of Technology, N.H. 62, Nagaur Road, Karwar, Jodhpur, Rajasthan 342030, India
- Kuldeep Singh: Department of Pediatrics, All India Institute of Medical Sciences, Basni Industrial Area Phase-2, Jodhpur, Rajasthan 342005, India

49
Lin J, Huang X, Zhou H, Wang Y, Zhang Q. Stimulus-guided adaptive transformer network for retinal blood vessel segmentation in fundus images. Med Image Anal 2023; 89:102929. [PMID: 37598606 DOI: 10.1016/j.media.2023.102929] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/22/2023] [Revised: 06/15/2023] [Accepted: 08/07/2023] [Indexed: 08/22/2023]
Abstract
Automated retinal blood vessel segmentation in fundus images provides important evidence to ophthalmologists in coping with prevalent ocular diseases in an efficient and non-invasive way. However, segmenting blood vessels in fundus images is a challenging task, due to the high variety in scale and appearance of blood vessels and the high similarity in visual features between lesions and the retinal vasculature. Inspired by the way the visual cortex adaptively responds to the type of stimulus, we propose a Stimulus-Guided Adaptive Transformer Network (SGAT-Net) for accurate retinal blood vessel segmentation. It entails a Stimulus-Guided Adaptive Module (SGA-Module) that can extract local-global compound features based on an inductive bias and a self-attention mechanism. Alongside a lightweight residual encoder (ResEncoder) structure capturing the relevant details of appearance, a Stimulus-Guided Adaptive Pooling Transformer (SGAP-Former) is introduced to reweight the maximum and average pooling, enriching the contextual embedding representation while suppressing redundant information. Moreover, a Stimulus-Guided Adaptive Feature Fusion (SGAFF) module is designed to adaptively emphasize local details and global context and fuse them in the latent space to adjust the receptive field (RF) based on the task. The evaluation is implemented on the largest fundus image dataset (FIVES) and three popular retinal image datasets (DRIVE, STARE, CHASEDB1). Experimental results show that the proposed method achieves competitive performance against existing methods, with a clear advantage in avoiding errors that commonly occur in areas with highly similar visual features. The source code is publicly available at: https://github.com/Gins-07/SGAT.
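The pooling-reweighting idea can be distilled to its simplest form: rather than fixing how max and average pooling contribute, learn a convex combination so the network can emphasize salient responses or smooth context as the task demands. The SGAP-Former computes these weights adaptively inside a transformer; a single learnable scalar, as below, is only the most basic stand-in.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ReweightedPool(nn.Module):
    def __init__(self, kernel_size=2):
        super().__init__()
        self.k = kernel_size
        self.mix = nn.Parameter(torch.zeros(1))  # sigmoid(0) = 0.5: equal mix

    def forward(self, x):
        a = torch.sigmoid(self.mix)              # learned balance in (0, 1)
        return a * F.max_pool2d(x, self.k) + (1 - a) * F.avg_pool2d(x, self.k)
```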
Affiliation(s)
- Ji Lin: School of Electronic Engineering and Computer Science, Queen Mary University of London, Mile End Road, London, E1 4NS, United Kingdom
- Xingru Huang: School of Electronic Engineering and Computer Science, Queen Mary University of London, Mile End Road, London, E1 4NS, United Kingdom
- Huiyu Zhou: School of Informatics, University of Leicester, University Road, Leicester, LE1 7RH, United Kingdom
- Yaqi Wang: College of Media Engineering, Communication University of Zhejiang, Hangzhou, 310018, China
- Qianni Zhang: School of Electronic Engineering and Computer Science, Queen Mary University of London, Mile End Road, London, E1 4NS, United Kingdom

50
Ryu J, Rehman MU, Nizami IF, Chong KT. SegR-Net: A deep learning framework with multi-scale feature fusion for robust retinal vessel segmentation. Comput Biol Med 2023; 163:107132. [PMID: 37343468 DOI: 10.1016/j.compbiomed.2023.107132] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/11/2023] [Revised: 05/12/2023] [Accepted: 06/04/2023] [Indexed: 06/23/2023]
Abstract
Retinal vessel segmentation is an important task in medical image analysis and has a variety of applications in the diagnosis and treatment of retinal diseases. In this paper, we propose SegR-Net, a deep learning framework for robust retinal vessel segmentation. SegR-Net utilizes a combination of feature extraction and embedding, deep feature magnification, feature precision and interference, and dense multiscale feature fusion to generate accurate segmentation masks. The model consists of an encoder module that extracts high-level features from the input images and a decoder module that reconstructs the segmentation masks by combining features from the encoder module. The encoder module comprises a feature extraction and embedding block enhanced by dense multiscale feature fusion, followed by a deep feature magnification (DFM) block that magnifies the retinal vessels. To further improve the quality of the extracted features, we use a group of two convolutional layers after each DFM block. In the decoder module, we utilize a feature precision and interference block and a dense multiscale feature fusion (DMFF) block to combine features from the encoder module and reconstruct the segmentation mask. We also incorporate data augmentation and pre-processing techniques to improve the generalization of the trained model. Experimental results on three publicly available fundus image datasets (CHASE_DB1, STARE, and DRIVE) demonstrate that SegR-Net outperforms state-of-the-art models in terms of accuracy, sensitivity, specificity, and F1 score. The proposed framework can provide more accurate and more efficient segmentation of retinal blood vessels than state-of-the-art techniques, which is essential for clinical decision-making and the diagnosis of various eye diseases.
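A minimal form of the dense multiscale feature fusion mentioned above: feature maps from several encoder depths are resized to a common resolution, concatenated, and mixed with a 1x1 convolution. The channel counts are placeholders, and SegR-Net's DMFF block is more elaborate than this sketch.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DenseMultiScaleFusion(nn.Module):
    def __init__(self, in_channels_list, out_channels):
        super().__init__()
        self.mix = nn.Conv2d(sum(in_channels_list), out_channels, 1)

    def forward(self, feats):
        # feats: list of (B, C_i, H_i, W_i) maps from different scales.
        h, w = feats[0].shape[-2:]  # fuse at the finest resolution
        up = [F.interpolate(f, size=(h, w), mode="bilinear", align_corners=False)
              for f in feats]
        return self.mix(torch.cat(up, dim=1))  # 1x1 conv mixes all scales
```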
Affiliation(s)
- Jihyoung Ryu: Electronics and Telecommunications Research Institute, 176-11 Cheomdan Gwagi-ro, Buk-gu, Gwangju 61012, Republic of Korea
- Mobeen Ur Rehman: Department of Electronics and Information Engineering, Jeonbuk National University, Jeonju 54896, Republic of Korea
- Imran Fareed Nizami: Department of Electrical Engineering, Bahria University, Islamabad, Pakistan
- Kil To Chong: Electronics and Telecommunications Research Institute, 176-11 Cheomdan Gwagi-ro, Buk-gu, Gwangju 61012, Republic of Korea; Advanced Electronics and Information Research Center, Jeonbuk National University, Jeonju 54896, Republic of Korea