1
Butt AR, Manzoor S, Baig A, Imran A, Ullah I, Syed Muhammad W. On-the-move heterogeneous face recognition in frequency and spatial domain using sparse representation. PLoS One 2024; 19:e0308566. [PMID: 39365809 PMCID: PMC11451977 DOI: 10.1371/journal.pone.0308566]
Abstract
Heterogeneity of a probe image is one of the most complex challenges faced by researchers and implementers of current surveillance systems, because a single surveillance setup typically contains multiple cameras operating in different spectral ranges. This paper proposes two approaches, spatial sparse representation (SSR) and frequency sparse representation (FSR), to recognize on-the-move heterogeneous face images with a database of a single sample per person (SSPP). The SCface database, with five visual and two infrared (IR) cameras, is taken as a benchmark for the experiments, and the results are further confirmed on the CASIA NIR-VIS 2.0 face database with 17,580 visual and IR images. Similarly, comparisons are performed for different scenarios, such as varying distances from the camera, varying face-image sizes, and various visual and infrared (IR) modalities. A least-squares minimization approach is used to match face images, as it keeps the recognition process simple. A side-by-side comparison of both proposed approaches with the state-of-the-art, classical principal component analysis (PCA), kernel Fisher analysis (KFA), and coupled kernel embedding (CKE) methods, along with the modern low-rank preserving projection via graph regularized reconstruction (LRPP-GRR) method, is also presented. Experimental results suggest that the proposed approaches achieve superior performance.
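The least-squares matching idea mentioned in the abstract can be illustrated with a minimal sketch: build a dictionary whose columns are gallery face vectors, solve min_x ||Dx - y||_2 for a probe y, and classify by the smallest per-class residual. All data, dimensions, and variable names below are synthetic illustrations, not the paper's actual setup.

```python
import numpy as np

# Dictionary D: each column is a (flattened, unit-normalized) gallery face.
rng = np.random.default_rng(0)
n_subjects, dim = 5, 64
D = rng.standard_normal((dim, n_subjects))
D /= np.linalg.norm(D, axis=0)

# Probe y: a noisy copy of subject 2's gallery vector.
true_id = 2
y = D[:, true_id] + 0.05 * rng.standard_normal(dim)

# Least-squares solution x minimizing ||D x - y||_2.
x, *_ = np.linalg.lstsq(D, y, rcond=None)

# Classify by the smallest per-class reconstruction residual.
residuals = [np.linalg.norm(y - D[:, i] * x[i]) for i in range(n_subjects)]
predicted = int(np.argmin(residuals))
print(predicted)
```

In a real SSPP setting each subject contributes one column, so the residual test above reduces to comparing how well each single gallery sample alone explains the probe.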
Affiliation(s)
- Asif Raza Butt
- Department of Electrical Engineering, Mirpur University of Science and Technology, Mirpur, AJK, Pakistan
- Sajjad Manzoor
- Department of Electrical Engineering, Mirpur University of Science and Technology, Mirpur, AJK, Pakistan
- Research Institute of Engineering and Technology, Hanyang University (ERICA), Ansan, South Korea
- Asim Baig
- Curious Thing AI, Sydney, New South Wales, Australia
- Abid Imran
- Department of Mechanical Engineering, Ghulam Ishaq Khan Institute of Engineering Sciences and Technology (GIKI), Swabi, KPK, Pakistan
- Ihsan Ullah
- Department of Electrical Engineering, Comsats University Islamabad, Abbottabad Campus, Abbottabad, KPK, Pakistan
- Wasif Syed Muhammad
- Department of Electrical Engineering, University of Gujrat (UoG), Gujrat, Pakistan
2
3
A deep learning method to assist with chronic atrophic gastritis diagnosis using white light images. Dig Liver Dis 2022; 54:1513-1519. [PMID: 35610166 DOI: 10.1016/j.dld.2022.04.025]
Abstract
BACKGROUND Chronic atrophic gastritis is a common preneoplastic condition of the stomach with a low detection rate during endoscopy. AIMS This study aimed to develop two deep learning models to improve the diagnostic rate. METHODS We collected 10,593 images from 4005 patients including 2280 patients with chronic atrophic gastritis and 1725 patients with chronic non-atrophic gastritis from two tertiary hospitals. Two deep learning models were developed to detect chronic atrophic gastritis using ResNet50. The detection ability of the deep learning model was compared with that of three expert endoscopists. RESULTS In the external test set, the diagnostic accuracy of model 1 for detecting gastric antrum atrophy was 0.890. The identification accuracies for the severity of gastric antrum atrophy were 0.773 and 0.590 in the internal and external test sets, respectively. In the other two external sets, the detection accuracies of model 2 for chronic atrophic gastritis were 0.854 and 0.916, respectively. Deep learning model 1's ability to identify gastric antrum atrophy was comparable to that of human experts. CONCLUSION Deep-learning-based models can detect chronic atrophic gastritis with good performance, which may greatly reduce the burden on endoscopists, relieve patient suffering, and improve the disease's detection rate in primary hospitals.
4
SCU-Net: Semantic Segmentation Network for Learning Channel Information on Remote Sensing Images. Comput Intell Neurosci 2022; 2022:8469415. [PMID: 35440946 PMCID: PMC9013575 DOI: 10.1155/2022/8469415]
Abstract
Extracting detailed information from remote sensing images is an important direction in semantic segmentation. Both the number of parameters and the amount of computation in the network model during learning and the prediction quality after learning must be considered. This paper designs a new module, the upsampling convolution-deconvolution module (CDeConv). On the basis of CDeConv, a convolutional neural network (CNN) with a channel attention mechanism, called the channel upsampling network (SCU-Net), is proposed for semantic segmentation. SCU-Net has been verified by experiments. The mean intersection-over-union (MIOU) of the SCU-Net-102-A model reaches 55.84%, the pixel accuracy is 91.53%, and the frequency weighted intersection-over-union (FWIU) is 85.83%. Compared with some of the state-of-the-art methods, SCU-Net learns more detailed channel information and has better generalization capabilities.
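The channel attention mechanism the abstract refers to can be sketched in the common squeeze-and-excitation style. This is an assumption for illustration only; the paper's exact gating may differ, and all sizes and weights below are synthetic.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(3)
x = rng.standard_normal((8, 5, 5))        # (C, H, W) feature map
w1 = rng.standard_normal((2, 8)) * 0.1    # squeeze projection: C -> C/4
w2 = rng.standard_normal((8, 2)) * 0.1    # excite projection: C/4 -> C

s = x.mean(axis=(1, 2))                   # squeeze: global average per channel
a = sigmoid(w2 @ np.maximum(w1 @ s, 0.0)) # per-channel gate in (0, 1)
y = x * a[:, None, None]                  # reweight channels by attention
print(y.shape)  # (8, 5, 5)
```

The key property is that each channel is rescaled by a learned scalar, letting the network emphasize informative channels at negligible parameter cost.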
5
Image Target Recognition via Mixed Feature-Based Joint Sparse Representation. Comput Intell Neurosci 2020; 2020:8887453. [PMID: 32849866 PMCID: PMC7436358 DOI: 10.1155/2020/8887453]
Abstract
An image target recognition approach based on mixed features and adaptive weighted joint sparse representation is proposed in this paper. The method is robust to illumination variation, deformation, and rotation of the target image. It is a data-lightweight classification framework that can recognize targets well with few training samples. First, a Gabor wavelet transform and a convolutional neural network (CNN) are used to extract the Gabor wavelet features and deep features of the training and test samples, respectively. Then, the contribution weights of the Gabor wavelet feature vector and the deep feature vector are calculated. After adaptive weighted reconstruction, the mixed features are formed, yielding the training and test sample feature sets. To address the high dimensionality of the mixed features, principal component analysis (PCA) is used to reduce their dimensions. Lastly, the public and private features of the images are extracted from the training sample feature set to construct the joint feature dictionary. Based on the joint feature dictionary, the sparse representation-based classifier (SRC) is used to recognize the targets. Experiments on different datasets show that this approach is superior to several other advanced methods.
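The PCA dimensionality-reduction step described above can be sketched via the SVD of the centered feature matrix; the sample counts, feature dimension, and target dimension below are illustrative, not the paper's actual features.

```python
import numpy as np

# Synthetic "mixed feature" matrix: 20 samples x 50 feature dimensions.
rng = np.random.default_rng(1)
X = rng.standard_normal((20, 50))

def pca_reduce(X, k):
    """Project rows of X onto the top-k principal components."""
    Xc = X - X.mean(axis=0)                       # center each feature
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T                          # k-dimensional embedding

Z = pca_reduce(X, 5)
print(Z.shape)  # (20, 5)
```

Component columns of Z are ordered by decreasing variance, so truncating to k components keeps the directions that explain the most spread in the mixed-feature set.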
6
Wang W, Hu Y, Zou T, Liu H, Wang J, Wang X. A New Image Classification Approach via Improved MobileNet Models with Local Receptive Field Expansion in Shallow Layers. Comput Intell Neurosci 2020; 2020:8817849. [PMID: 32802028 PMCID: PMC7416240 DOI: 10.1155/2020/8817849]
Abstract
Because deep neural networks (DNNs) are both memory-intensive and computation-intensive, they are difficult to apply to embedded systems with limited hardware resources; DNN models therefore need to be compressed and accelerated. By applying depthwise separable convolutions, MobileNet can decrease the number of parameters and the computational complexity with little loss of classification precision. Based on MobileNet, three improved MobileNet models with local receptive field expansion in shallow layers, also called Dilated-MobileNet (Dilated Convolution MobileNet) models, are proposed, in which dilated convolutions are introduced into a specific convolutional layer of the MobileNet model. Without increasing the number of parameters, dilated convolutions are used to enlarge the receptive field of the convolution filters and obtain better classification accuracy. The experiments were performed on the Caltech-101, Caltech-256, and Tübingen Animals with Attributes datasets, respectively. The results show that Dilated-MobileNets can obtain up to 2% higher classification accuracy than MobileNet.
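The receptive-field arithmetic behind "expansion without extra parameters" is easy to verify: a k×k kernel with dilation d covers an effective extent of k + (k-1)(d-1) while keeping the same k×k weights. A tiny sketch:

```python
def effective_kernel(k, d):
    """Effective spatial extent of a k x k convolution with dilation d."""
    return k + (k - 1) * (d - 1)

# A 3x3 kernel keeps its 9 weights at every dilation, but its coverage grows:
print(effective_kernel(3, 1), effective_kernel(3, 2), effective_kernel(3, 4))
# 3 5 9
```

So inserting dilation 2 into a shallow 3×3 layer gives the spatial coverage of a 5×5 kernel at the parameter cost of a 3×3 one, which is the trade the abstract describes.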
Affiliation(s)
- Wei Wang
- College of Computer and Communication Engineering, Changsha University of Science and Technology, Changsha 410114, China
- Yiyang Hu
- College of Computer and Communication Engineering, Changsha University of Science and Technology, Changsha 410114, China
- Ting Zou
- Yiyang Branch, China Telecom Co., Ltd., Yiyang 413000, China
- Hongmei Liu
- Hunan Railway Professional Technology College, Zhuzhou 410116, China
- Jin Wang
- College of Computer and Communication Engineering, Changsha University of Science and Technology, Changsha 410114, China
- School of Information Science and Engineering, Fujian University of Technology, Fujian 350118, China
- Xin Wang
- College of Computer and Communication Engineering, Changsha University of Science and Technology, Changsha 410114, China
7
Wang W, Tian J, Zhang C, Luo Y, Wang X, Li J. An improved deep learning approach and its applications on colonic polyp images detection. BMC Med Imaging 2020; 20:83. [PMID: 32698839 PMCID: PMC7374886 DOI: 10.1186/s12880-020-00482-3]
Abstract
Background Colonic polyps are prone to becoming cancerous, especially those with a large diameter, large number, and atypical hyperplasia. If colonic polyps are not treated at an early stage, they are likely to develop into colon cancer. Colonoscopy is easily limited by the operator's experience, and factors such as inexperience and visual fatigue directly affect the accuracy of diagnosis. Cooperating with Hunan Children's Hospital, we proposed and improved a deep learning approach with global average pooling (GAP) in colonoscopy for assisted diagnosis. Our approach can prompt endoscopists in real time to pay attention to polyps that might otherwise be ignored, improve the detection rate, reduce missed diagnoses, and improve the efficiency of medical diagnosis. Methods We selected colonoscopy images from the gastrointestinal endoscopy room of Hunan Children's Hospital to form the colonic polyp datasets, and we applied deep-learning-based image classification to colonic polyps. The classic networks we used are VGGNets and ResNets. By using global average pooling, we proposed the improved approaches VGGNets-GAP and ResNets-GAP. Results The accuracies of all models on the datasets exceed 98%. The TPR and TNR are above 96% and 98%, respectively. In addition, the VGGNets-GAP networks not only have high classification accuracies but also have far fewer parameters than the VGGNets. Conclusions The experimental results show that the proposed approach performs well on the automatic detection of colonic polyps. The innovations of our method are twofold: (1) the detection accuracy of colonic polyps is improved, and (2) the approach reduces memory consumption and makes the model lightweight. Compared with the original VGG networks, the parameters of our VGG19-GAP networks are greatly reduced.
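The parameter savings from replacing a flattened fully connected head with global average pooling (GAP) can be seen in a small numeric sketch; the feature-map and class sizes below are illustrative, not the paper's networks.

```python
import numpy as np

# Feature map from a last conv block: (channels, H, W).
fmap = np.arange(2 * 4 * 4, dtype=float).reshape(2, 4, 4)

# Global average pooling: one scalar per channel, no learned weights.
gap = fmap.mean(axis=(1, 2))
print(gap)          # [ 7.5 23.5]

# Weight counts for a 2-class linear classifier on top (biases omitted):
flatten_fc_weights = 2 * 4 * 4 * 2   # flatten (32 values) -> 2 classes
gap_fc_weights = 2 * 2               # GAP (2 values) -> 2 classes
print(flatten_fc_weights, gap_fc_weights)  # 64 4
```

The gap widens dramatically at realistic sizes (e.g. a 512×7×7 VGG feature map), which is why the GAP variants above are reported as much lighter than the originals.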
Affiliation(s)
- Wei Wang
- School of Computer and Communication Engineering, Changsha University of Science and Technology, Changsha, 410114, China.
- Jinge Tian
- School of Computer and Communication Engineering, Changsha University of Science and Technology, Changsha, 410114, China
- Chengwen Zhang
- School of Computer and Communication Engineering, Changsha University of Science and Technology, Changsha, 410114, China
- Yanhong Luo
- Hunan Children's Hospital, Changsha, 410000, China.
- Xin Wang
- School of Computer and Communication Engineering, Changsha University of Science and Technology, Changsha, 410114, China.
- Ji Li
- School of Computer and Communication Engineering, Changsha University of Science and Technology, Changsha, 410114, China
8
High-Resolution Radar Target Recognition via Inception-Based VGG (IVGG) Networks. Comput Intell Neurosci 2020; 2020:8893419. [PMID: 32733549 PMCID: PMC7383303 DOI: 10.1155/2020/8893419]
Abstract
Aiming at high-resolution radar target recognition, new convolutional neural networks, namely Inception-based VGG (IVGG) networks, are proposed to classify and recognize different targets in high range resolution profile (HRRP) and synthetic aperture radar (SAR) signals. The IVGG networks are improved in two aspects. One is to adjust the connection mode of the fully connected layer. The other is to introduce the Inception module into the visual geometry group (VGG) network to make the network structure more suitable for radar target recognition. After the Inception module, we also add a pointwise convolutional layer to strengthen the nonlinearity of the network. Compared with the VGG network, IVGG networks are simpler and have fewer parameters. The experiments compare the IVGG networks with GoogLeNet, ResNet18, DenseNet121, and VGG on four datasets. The experimental results show that the IVGG networks achieve better accuracies than the existing convolutional neural networks.
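Assuming the pointwise convolutional layer refers to a 1×1 convolution, its channel-mixing effect can be sketched as a per-pixel linear map followed by a ReLU; all sizes and weights below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.standard_normal((4, 6, 6))   # input feature map: (C_in, H, W)
w = rng.standard_normal((3, 4))      # 1x1 kernel weights: (C_out, C_in)

# A 1x1 convolution mixes channels at each spatial position independently:
y = np.einsum('oc,chw->ohw', w, x)
y = np.maximum(y, 0.0)               # ReLU supplies the added nonlinearity
print(y.shape)  # (3, 6, 6)
```

Because the kernel has no spatial extent, the layer adds only C_out x C_in weights while letting the network re-combine the Inception module's concatenated channels nonlinearly.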
9
A SAR Image Target Recognition Approach via Novel SSF-Net Models. Comput Intell Neurosci 2020; 2020:8859172. [PMID: 32695155 PMCID: PMC7368189 DOI: 10.1155/2020/8859172]
Abstract
With the wide application of high-resolution radar, Radar Automatic Target Recognition (RATR) increasingly focuses on how to quickly and accurately distinguish high-resolution radar targets, and Synthetic Aperture Radar (SAR) image recognition has become one of the research hotspots in this field. Based on the characteristics of SAR images, a Sparse Data Feature Extraction (SDFE) module is designed, and a new convolutional neural network, SSF-Net, is further proposed based on the SDFE module. Meanwhile, to improve processing efficiency, the network adopts three methods to classify targets: three fully connected (FC) layers, one fully connected (FC) layer, and global average pooling (GAP). Among them, the latter two methods have fewer parameters and lower computational cost, and they offer better real-time performance. The methods were tested on the public datasets SAR-SOC and SAR-EOC-1. The experimental results show that SSF-Net has relatively better robustness and achieves the highest recognition accuracies of 99.55% and 99.50% on SAR-SOC and SAR-EOC-1, respectively, the latter being 1% higher than the comparison methods on SAR-EOC-1.
10
A New Volumetric CNN for 3D Object Classification Based on Joint Multiscale Feature and Subvolume Supervised Learning Approaches. Comput Intell Neurosci 2020. [DOI: 10.1155/2020/5851465]
Abstract
The advancement of low-cost RGB-D and LiDAR three-dimensional (3D) sensors has made it easier to obtain 3D models in real time. However, extracting intricate 3D features is crucial for advancing 3D object classification. Existing volumetric voxel-based CNN approaches have achieved remarkable progress, but they incur huge computational overhead that limits the extraction of global features at higher resolutions of 3D objects. In this paper, a low-cost 3D volumetric deep convolutional neural network is proposed for 3D object classification based on joint multiscale hierarchical and subvolume supervised learning strategies. The proposed network takes 3D data preprocessed into a memory-efficient octree representation, and the full-layer octree depth is limited to a certain level, based on the predefined input volume resolution, to store high-precision contour features. Multiscale features are concatenated from multiple octree depths inside the network, aiming to adaptively generate high-level global features. The subvolume supervision strategy trains the network on subparts of the 3D object in order to learn local features. The framework has been evaluated on two publicly available 3D repositories. Experimental results demonstrate the effectiveness of the proposed method: classification accuracy is improved over existing volumetric approaches, and the memory consumption ratio and run time are significantly reduced.