1. Remtulla R, Samet A, Kulbay M, Akdag A, Hocini A, Volniansky A, Kahn Ali S, Qian CX. A Future Picture: A Review of Current Generative Adversarial Neural Networks in Vitreoretinal Pathologies and Their Future Potentials. Biomedicines 2025;13:284. [PMID: 40002698; PMCID: PMC11852121; DOI: 10.3390/biomedicines13020284]
Abstract
Machine learning has transformed ophthalmology, particularly through predictive and discriminative models for vitreoretinal pathologies. However, generative modeling, especially with generative adversarial networks (GANs), remains underexplored. GANs consist of two neural networks, a generator and a discriminator, that work in opposition to synthesize highly realistic images. These synthetic images can enhance diagnostic accuracy, expand the capabilities of imaging technologies, and predict treatment responses. GANs have already been applied to fundus imaging, optical coherence tomography (OCT), and fundus autofluorescence (FAF). Despite their potential, GANs face challenges in reliability and accuracy. This review explores GAN architecture, the advantages of GANs over other deep learning models, and their clinical applications in retinal disease diagnosis and treatment monitoring. Furthermore, we discuss the limitations of current GAN models and propose novel applications combining GANs with OCT, OCT angiography, fluorescein angiography, fundus imaging, electroretinograms, visual fields, and indocyanine green angiography.
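As an illustration of the adversarial setup the abstract describes, the sketch below pairs a toy generator and discriminator in the standard two-player training loop. The MLP architectures, hyperparameters, and the random tensors standing in for retinal images are our assumptions for a runnable minimum, not the models used in the reviewed studies.

```python
# Minimal GAN training sketch (illustrative only; toy MLPs, synthetic data).
import torch
import torch.nn as nn

LATENT = 100
IMG = 64 * 64  # flattened 64x64 grayscale image, a stand-in for OCT/fundus data

generator = nn.Sequential(
    nn.Linear(LATENT, 256), nn.ReLU(),
    nn.Linear(256, IMG), nn.Tanh(),           # outputs in [-1, 1]
)
discriminator = nn.Sequential(
    nn.Linear(IMG, 256), nn.LeakyReLU(0.2),
    nn.Linear(256, 1),                        # real/fake logit
)

bce = nn.BCEWithLogitsLoss()
opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4)

for step in range(200):
    real = torch.rand(32, IMG) * 2 - 1        # stand-in for real retinal images
    fake = generator(torch.randn(32, LATENT))

    # Discriminator step: push real toward 1, generated toward 0.
    d_loss = bce(discriminator(real), torch.ones(32, 1)) + \
             bce(discriminator(fake.detach()), torch.zeros(32, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator step: try to make the discriminator call the fakes real.
    g_loss = bce(discriminator(fake), torch.ones(32, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```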
Affiliations
- Raheem Remtulla: Department of Ophthalmology & Visual Sciences, McGill University, Montreal, QC H4A 3S5, Canada
- Adam Samet: Department of Ophthalmology & Visual Sciences, McGill University, Montreal, QC H4A 3S5, Canada
- Merve Kulbay: Department of Ophthalmology & Visual Sciences, McGill University, Montreal, QC H4A 3S5, Canada; Centre de Recherche de l’Hôpital Maisonneuve-Rosemont, Université de Montréal, Montreal, QC H1T 2M4, Canada
- Arjin Akdag: Faculty of Medicine and Health Sciences, McGill University, Montreal, QC H3G 2M1, Canada
- Adam Hocini: Faculty of Medicine, Université de Montréal, Montreal, QC H3T 1J4, Canada
- Anton Volniansky: Department of Psychiatry, Université Laval, Quebec City, QC G1V 0A6, Canada
- Shigufa Kahn Ali: Centre de Recherche de l’Hôpital Maisonneuve-Rosemont, Université de Montréal, Montreal, QC H1T 2M4, Canada; Department of Ophthalmology, Centre Universitaire d’Ophtalmologie (CUO), Hôpital Maisonneuve-Rosemont, Université de Montréal, Montreal, QC H1T 2M4, Canada
- Cynthia X. Qian: Centre de Recherche de l’Hôpital Maisonneuve-Rosemont, Université de Montréal, Montreal, QC H1T 2M4, Canada; Department of Ophthalmology, Centre Universitaire d’Ophtalmologie (CUO), Hôpital Maisonneuve-Rosemont, Université de Montréal, Montreal, QC H1T 2M4, Canada
2. Lin D, Wang Y, Liang L, Li P, Chen CLP. Deep LSAC for Fine-Grained Recognition. IEEE Transactions on Neural Networks and Learning Systems 2022;33:200-214. [PMID: 33048766; DOI: 10.1109/tnnls.2020.3027603]
Abstract
Fine-grained recognition emphasizes the identification of subtle differences among object categories even when objects appear in varying shapes and poses, so these variations must be reduced for reliable recognition. We propose a fine-grained recognition system that incorporates localization, segmentation, alignment, and classification in a unified deep neural network. The input to the classification module includes functions that enable back-propagation (BP) when constructing the solver. Our major contribution is a valve linkage function (VLF) for BP chaining, which forms our deep localization, segmentation, alignment, and classification (LSAC) system. The VLF can adaptively compromise between classification and alignment errors when training the LSAC model, which in turn helps update the localization and segmentation modules. We evaluate our framework on two widely used fine-grained object datasets, and the performance confirms the effectiveness of the LSAC system.
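The abstract does not give the exact form of the VLF, so the sketch below only illustrates the general idea of linking the two error terms during back-propagation: a gating value derived from the classification error scales how strongly the alignment loss contributes. The gating rule, heads, and dimensions are hypothetical.

```python
# Schematic "valve"-style linkage of classification and alignment losses.
# This is our own illustration, not the paper's exact VLF.
import torch
import torch.nn as nn

cls_head = nn.Linear(128, 200)        # 200 fine-grained classes (assumed)
align_head = nn.Linear(128, 4)        # predicts an alignment box (assumed)

features = torch.randn(8, 128)        # stand-in for localized/segmented features
labels = torch.randint(0, 200, (8,))
target_boxes = torch.rand(8, 4)

cls_loss = nn.functional.cross_entropy(cls_head(features), labels)
align_loss = nn.functional.smooth_l1_loss(align_head(features), target_boxes)

# "Valve": when classification error dominates, open the valve so alignment
# gradients flow more strongly; when classification is good, damp them.
valve = torch.sigmoid(cls_loss.detach() - align_loss.detach())
total = cls_loss + valve * align_loss
total.backward()                      # gradients reach both heads in one pass
```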
3. Gu X, Li M. A multi-granularity locally optimal prototype-based approach for classification. Information Sciences 2021. [DOI: 10.1016/j.ins.2021.04.039]
4. Wang DH, Zhou W, Li J, Wu Y, Zhu S. Exploring Misclassification Information for Fine-Grained Image Classification. Sensors 2021;21:4176. [PMID: 34206995; PMCID: PMC8235489; DOI: 10.3390/s21124176]
Abstract
Fine-grained image classification is a hot topic that has been widely studied in recent years. Many fine-grained image classification methods ignore misclassification information, which is important for improving classification accuracy. To make use of this information, we propose a novel fine-grained image classification method that explores the misclassification information (FGMI) of prelearned models. For each class, we harvest the confusion information from several prelearned fine-grained classification models and select a number of classes that are likely to be misclassified as that class. The images of the selected classes are then used to train classifiers, which reduces the influence of irrelevant images to some extent. We use the misclassification information of all classes by training a number of confusion classifiers, whose outputs are combined to represent images and produce classifications. To evaluate the effectiveness of the proposed FGMI method, we conduct fine-grained classification experiments on several public image datasets. The experimental results demonstrate the usefulness of the proposed method.
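A minimal sketch of the harvesting step, under our own simplifications (synthetic features and logistic-regression classifiers in place of fine-grained CNNs): build a confusion matrix from a prelearned model, pick the top-k classes confused with each class, and train a small confusion classifier on just that subset.

```python
# Harvest misclassification information from a prelearned model (toy data).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix

rng = np.random.default_rng(0)
X = rng.normal(size=(600, 32))
y = rng.integers(0, 6, size=600)                  # 6 classes, toy features

prelearned = LogisticRegression(max_iter=1000).fit(X, y)
cm = confusion_matrix(y, prelearned.predict(X))
np.fill_diagonal(cm, 0)                           # keep only off-diagonal confusions

k = 2
confusion_experts = {}
for c in range(6):
    rivals = np.argsort(cm[c])[-k:]               # classes most confused with c
    subset = np.isin(y, np.append(rivals, c))
    confusion_experts[c] = LogisticRegression(max_iter=1000).fit(
        X[subset], y[subset])

# At test time the experts' outputs would be combined into one representation;
# here we only show one expert disambiguating its confusable subset.
print(confusion_experts[0].predict(X[:5]))
```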
Affiliations
- Da-Han Wang (corresponding author): Fujian Key Laboratory of Pattern Recognition and Image Understanding, Xiamen 361024, China; School of Computer and Information Engineering, Xiamen University of Technology, Xiamen 361024, China
- Wei Zhou: Fujian Key Laboratory of Pattern Recognition and Image Understanding, Xiamen 361024, China; School of Computer and Information Engineering, Xiamen University of Technology, Xiamen 361024, China
- Jianmin Li: Fujian Key Laboratory of Pattern Recognition and Image Understanding, Xiamen 361024, China; School of Computer and Information Engineering, Xiamen University of Technology, Xiamen 361024, China
- Yun Wu: Fujian Key Laboratory of Pattern Recognition and Image Understanding, Xiamen 361024, China; School of Computer and Information Engineering, Xiamen University of Technology, Xiamen 361024, China
- Shunzhi Zhu: Fujian Key Laboratory of Pattern Recognition and Image Understanding, Xiamen 361024, China; School of Computer and Information Engineering, Xiamen University of Technology, Xiamen 361024, China
5. Xie GS, Zhang Z, Liu L, Zhu F, Zhang XY, Shao L, Li X. SRSC: Selective, Robust, and Supervised Constrained Feature Representation for Image Classification. IEEE Transactions on Neural Networks and Learning Systems 2020;31:4290-4302. [PMID: 31870993; DOI: 10.1109/tnnls.2019.2953675]
Abstract
Feature representation learning, an emerging topic in recent years, has achieved great progress, and powerful learned features can lead to excellent classification accuracy. In this article, a selective and robust feature representation framework with a supervised constraint (SRSC) is presented. SRSC seeks a selective, robust, and discriminative subspace by transforming the original feature space into the category space. In particular, we add a selective constraint to the transformation matrix (or classifier parameter) that can select discriminative dimensions of the input samples. Moreover, a supervised regularization is tailored to further enhance the discriminability of the subspace. To relax the hard zero-one label matrix in the category space, an additional error term is incorporated into the framework, which leads to a more robust transformation matrix. SRSC is formulated as a constrained least-squares (feature transforming) problem and solved with an inexact augmented Lagrange multiplier (ALM) method. Extensive experiments on several benchmark datasets demonstrate the effectiveness and superiority of the proposed method, which achieves better performance than the compared counterpart methods.
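Reading the abstract, the objective plausibly takes a form like the following, where X is the data matrix, Y the zero-one label matrix, W the transformation, and E the relaxation error term; the l2,1 norm plays the role of the selective (row-sparse) constraint and R_sup stands for the supervised regularizer. This is our reconstruction of the general shape, not the paper's exact formulation:

```latex
\min_{W,\,E}\;
\underbrace{\bigl\| (Y + E) - W^{\top} X \bigr\|_F^2}_{\text{relaxed least squares}}
\;+\; \lambda_1 \underbrace{\|W\|_{2,1}}_{\text{selective constraint}}
\;+\; \lambda_2\, \underbrace{R_{\mathrm{sup}}\!\left(W^{\top} X\right)}_{\text{supervised regularizer}}
\;+\; \lambda_3 \|E\|_1
```

Per the abstract, this constrained least-squares problem is then solved with an inexact ALM scheme, which would alternate updates of W and E.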
6. Zhang C, Cheng J, Tian Q. Multiview, Few-Labeled Object Categorization by Predicting Labels With View Consistency. IEEE Transactions on Cybernetics 2019;49:3834-3843. [PMID: 29994693; DOI: 10.1109/tcyb.2018.2845912]
Abstract
The categorization accuracy of objects has been greatly improved in recent years, but large quantities of labeled images are needed, and many methods fail when only a few labeled images are available. To tackle the few-labeled object categorization problem, objects must be represented and classified from multiple views. In this paper, we propose a novel multiview, few-labeled object categorization algorithm that predicts the labels of images with view consistency (MVFL-VC). We use labeled images along with other unlabeled images in a unified framework: a mapping function models the correlations between images and their labels, and since unlabeled images carry no labeling information, we simultaneously learn the mapping function and the image labels by minimizing the classification error. We exploit multiview information for joint object categorization. Although different views capture different aspects of an image, the predicted categories of its multiple views should be consistent with one another. We learn the mapping function by minimizing the summed classification losses together with the discrepancy between the labels predicted from different views, in an alternating manner. We conduct object categorization experiments on five public image datasets and compare with other semi-supervised methods. The experimental results demonstrate the effectiveness of the proposed MVFL-VC method.
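The interplay of the supervised loss and the view-consistency term can be sketched as follows; the linear per-view predictors, the MSE consistency penalty, and all dimensions are illustrative assumptions rather than the paper's exact formulation.

```python
# Few-labeled training with a view-consistency penalty on unlabeled images.
import torch
import torch.nn as nn

view_a = nn.Linear(64, 10)            # predictor for view 1 (e.g., color)
view_b = nn.Linear(48, 10)            # predictor for view 2 (e.g., texture)
opt = torch.optim.Adam(list(view_a.parameters()) + list(view_b.parameters()))

xa_l, xb_l = torch.randn(20, 64), torch.randn(20, 48)   # few labeled images
y_l = torch.randint(0, 10, (20,))
xa_u, xb_u = torch.randn(200, 64), torch.randn(200, 48) # many unlabeled images

for _ in range(100):
    sup = nn.functional.cross_entropy(view_a(xa_l), y_l) + \
          nn.functional.cross_entropy(view_b(xb_l), y_l)
    # View consistency: both views should predict the same label distribution.
    pa = view_a(xa_u).softmax(dim=1)
    pb = view_b(xb_u).softmax(dim=1)
    consistency = nn.functional.mse_loss(pa, pb)
    loss = sup + 0.5 * consistency
    opt.zero_grad(); loss.backward(); opt.step()
```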
7. Zhang C, Cheng J, Tian Q. Multi-View Image Classification With Visual, Semantic and View Consistency. IEEE Transactions on Image Processing 2019;29:617-627. [PMID: 31425078; DOI: 10.1109/tip.2019.2934576]
Abstract
Multi-view visual classification methods have been widely applied to exploit the discriminative information of different views, a strategy proven very effective by many researchers. However, images are often treated independently, without fully considering their visual and semantic correlations, and view consistency is often ignored. To solve these problems, we propose a novel multi-view image classification method with visual, semantic, and view consistency (VSVC). For each image, we linearly combine multi-view information for classification, with combination parameters determined by both the classification loss and the visual, semantic, and view consistency. Visual consistency is imposed by ensuring that visually similar images of the same view are predicted to have similar values. For semantic consistency, we impose a locality constraint: nearby images should be predicted to have the same class by the multi-view combination. View consistency ensures that similar images have consistent multi-view combination parameters. An alternating optimization strategy is used to learn the combination parameters. To evaluate the effectiveness of VSVC, we perform image classification experiments on several public datasets, and the experimental results show the effectiveness of the proposed method.
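A rough sketch of the combination step under our own simplifications: per-view classifier scores are mixed by weights on the simplex, fitted by projected gradient descent against a least-squares label target plus a locality penalty that pulls neighboring images toward similar combined predictions. The surrogate losses and update rule are assumptions, not the paper's alternating optimization.

```python
# Learn multi-view combination weights with a locality (consistency) penalty.
import numpy as np

rng = np.random.default_rng(1)
V, N, C = 3, 50, 5
scores = rng.random((V, N, C))            # per-view classifier outputs (toy)
y = rng.integers(0, C, size=N)
neighbors = [(i, i + 1) for i in range(N - 1)]   # stand-in for kNN pairs

w = np.ones(V) / V
for _ in range(200):
    combined = np.einsum('v,vnc->nc', w, scores)
    # Least-squares fit: push the true-class score toward 1, others toward 0.
    target = np.zeros((N, C)); target[np.arange(N), y] = 1.0
    g_fit = np.einsum('vnc,nc->v', scores, combined - target) / N
    # Locality penalty: neighbors' combined predictions should agree.
    g_loc = np.zeros(V)
    for i, j in neighbors:
        diff = combined[i] - combined[j]
        g_loc += (scores[:, i] - scores[:, j]) @ diff
    w -= 0.05 * (g_fit + 0.01 * g_loc / len(neighbors))
    w = np.clip(w, 0, None); w /= w.sum()     # project back onto the simplex
print(w)
```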
8. Zhang C, Cheng J, Tian Q. Semantically Modeling of Object and Context for Categorization. IEEE Transactions on Neural Networks and Learning Systems 2019;30:1013-1024. [PMID: 30106698; DOI: 10.1109/tnnls.2018.2856096]
Abstract
Object-centric categorization methods have been proven more effective than hard partitions of images (e.g., spatial pyramid matching). However, how to determine the locations of objects remains an open problem, the modeling of context areas is often mixed with the background, and semantic information is often ignored by methods that use only visual representations for classification. In this paper, we propose an object categorization method that semantically models the object and context information (SOC). We first select a number of candidate regions with high confidence scores and represent these regions semantically by measuring the correlation of each region with prelearned classifiers (e.g., local-feature-based classifiers and deep convolutional-neural-network-based classifiers). These regions are clustered for object selection, the other selected areas are viewed as context areas, and the remaining areas of the image are treated as background. The visually and semantically represented objects and contexts are then used, along with the background area, for object representation and categorization. Experimental results on several public datasets demonstrate the effectiveness of the proposed method.
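The region-selection step might be sketched as follows; the random classifier scores and the rule of taking the dominant cluster as the object are our assumptions for illustration.

```python
# Cluster semantically represented candidate regions into object vs. context.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(2)
region_scores = rng.random((30, 20))   # 30 candidate regions x 20 prelearned
                                       # classifier responses (toy stand-in)

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(region_scores)
sizes = np.bincount(km.labels_, minlength=2)
object_regions = np.flatnonzero(km.labels_ == sizes.argmax())
context_regions = np.flatnonzero(km.labels_ != sizes.argmax())
# Everything outside the object and context regions would be background.
print(len(object_regions), "object regions,", len(context_regions), "context")
```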
9. Zhu G, Wang J, Wang P, Wu Y, Lu H. Feature Distilled Tracking. IEEE Transactions on Cybernetics 2019;49:440-452. [PMID: 29990247; DOI: 10.1109/tcyb.2017.2776977]
Abstract
Feature extraction and representation is one of the most important components of fast, accurate, and robust visual tracking. Very deep convolutional neural networks (CNNs) provide effective tools for feature extraction with good generalization ability. However, extracting features with very deep CNN models requires high-performance hardware due to their large computational complexity, which prohibits their use in real-time applications. To alleviate this problem, we aim to obtain small, fast-to-execute shallow models based on model compression for visual tracking. Specifically, we propose a small feature distilled network (FDN) for tracking that imitates the intermediate representations of a much deeper network. The FDN extracts rich visual features at higher speed than the original deeper network. For further speed-up, we introduce a shift-and-stitch method that reduces the arithmetic operations while keeping the spatial resolution of the distilled feature maps unchanged. Finally, a scale-adaptive discriminative correlation filter is learned on the distilled features to handle scale variation of the target. Comprehensive experimental results on object tracking benchmark datasets show that the proposed approach achieves a 5× speed-up with performance competitive with state-of-the-art deep trackers.
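The imitation objective is straightforward to sketch: a shallow student is trained to match an intermediate feature map of a frozen, deeper teacher with an MSE loss. The toy CNNs and random frames below are illustrative assumptions, not the paper's architectures.

```python
# Feature distillation: a shallow student imitates a deeper teacher's features.
import torch
import torch.nn as nn

teacher = nn.Sequential(                   # "deep" feature extractor (frozen)
    nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 64, 3, padding=1),
).eval()
student = nn.Sequential(                   # small, fast-to-execute network
    nn.Conv2d(3, 64, 3, padding=1),
)
opt = torch.optim.Adam(student.parameters(), lr=1e-3)

for _ in range(50):
    frames = torch.randn(4, 3, 64, 64)     # stand-in for tracking frames
    with torch.no_grad():
        target = teacher(frames)           # intermediate representation to imitate
    loss = nn.functional.mse_loss(student(frames), target)   # imitation loss
    opt.zero_grad(); loss.backward(); opt.step()
# The distilled features would then feed a scale-adaptive correlation filter.
```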