1
Huang S, Lin J, Huangfu L, Xing Y, Hu J, Zeng DD. Adaptively Weighted k-Tuple Metric Network for Kinship Verification. IEEE TRANSACTIONS ON CYBERNETICS 2023; 53:6173-6186. [PMID: 35439158 DOI: 10.1109/tcyb.2022.3163707] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 06/14/2023]
Abstract
Facial image-based kinship verification is a rapidly growing field in computer vision and biometrics. The key to determining whether a pair of facial images has a kin relation is to train a model that can enlarge the margin between the faces that have no kin relation while reducing the distance between faces that have a kin relation. Most existing approaches primarily exploit duplet (i.e., two input samples without cross pair) or triplet (i.e., single negative pair for each positive pair with low-order cross pair) information, omitting discriminative features from multiple negative pairs. These approaches suffer from weak generalizability, resulting in unsatisfactory performance. Inspired by human visual systems that incorporate both low-order and high-order cross-pair information from local and global perspectives, we propose to leverage high-order cross-pair features and develop a novel end-to-end deep learning model called the adaptively weighted k-tuple metric network (AWk-TMN). Our main contributions are three-fold. First, a novel cross-pair metric learning loss based on k-tuplet loss is introduced. It naturally captures both the low-order and high-order discriminative features from multiple negative pairs. Second, an adaptively weighted scheme is formulated to better highlight hard negative examples among multiple negative pairs, leading to enhanced performance. Third, the model utilizes multiple levels of convolutional features and jointly optimizes feature and metric learning to further exploit the low-order and high-order representational power. Extensive experimental results on three popular kinship verification datasets demonstrate the effectiveness of our proposed AWk-TMN approach compared with several state-of-the-art approaches. The source codes and models are released.
2
Singh S, Kumar M, Kumar A, Verma BK, Shitharth S. Pneumonia detection with QCSA network on chest X-ray. Sci Rep 2023; 13:9025. [PMID: 37270553 DOI: 10.1038/s41598-023-35922-x] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 01/10/2023] [Accepted: 05/25/2023] [Indexed: 06/05/2023]
Abstract
Worldwide, pneumonia is the leading cause of infant mortality. Experienced radiologists use chest X-rays to diagnose pneumonia and other respiratory diseases. The complexity of the diagnostic procedure causes radiologists to disagree on the diagnosis. Early diagnosis is the only feasible strategy for mitigating the disease's impact on the patient. Computer-aided diagnostics improve the accuracy of diagnosis. Recent studies established that quaternion neural networks classify and predict better than real-valued neural networks, especially when dealing with multi-dimensional or multi-channel input. The attention mechanism is derived from the human brain's visual and cognitive ability to focus on some portion of an image and ignore the rest. It maximizes the use of the image's relevant aspects, hence boosting classification accuracy. In the current work, we propose a QCSA network (Quaternion Channel-Spatial Attention Network) that combines the spatial and channel attention mechanisms with a quaternion residual network to classify chest X-ray images for pneumonia detection. We used a Kaggle X-ray dataset. The suggested architecture achieved 94.53% accuracy and 0.89 AUC. We have also shown that performance improves when the attention mechanism is integrated into a QCNN. Our results indicate that our approach to detecting pneumonia is promising.
Affiliation(s)
- Manoj Kumar
- JSS Academy of Technical Education, Noida, India
- Abhay Kumar
- National Institute of Technology Patna, Patna, India
- S Shitharth
- Kebri Dehar University, Kebri Dehar, Ethiopia
3
Ma Z, Lai Y, Xie J, Meng D, Kleijn WB, Guo J, Yu J. Dirichlet Process Mixture of Generalized Inverted Dirichlet Distributions for Positive Vector Data With Extended Variational Inference. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2022; 33:6089-6102. [PMID: 34086578 DOI: 10.1109/tnnls.2021.3072209] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/12/2023]
Abstract
A Bayesian nonparametric approach for estimation of a Dirichlet process (DP) mixture of generalized inverted Dirichlet distributions [i.e., an infinite generalized inverted Dirichlet mixture model (InGIDMM)] has been proposed. The generalized inverted Dirichlet distribution has been proven to be efficient in modeling the vectors that contain only positive elements. Under the classical variational inference (VI) framework, the key challenge in the Bayesian estimation of InGIDMM is that the expectation of the joint distribution of data and variables cannot be explicitly calculated. Therefore, numerical methods are usually applied to simulate the optimal posterior distributions. With the recently proposed extended VI (EVI) framework, we introduce lower bound approximations to the original variational objective function in the VI framework such that an analytically tractable solution can be derived. Hence, the problem in numerical simulation has been overcome. By applying the DP mixture technique, an InGIDMM can automatically determine the number of mixture components from the observed data. Moreover, the DP mixture model with an infinite number of mixture components also avoids the problems of underfitting and overfitting. The performance of the proposed approach is demonstrated with both synthesized data and real-life data applications.
4
Liu L, Chen CLP, Li S. Hallucinating Color Face Image by Learning Graph Representation in Quaternion Space. IEEE TRANSACTIONS ON CYBERNETICS 2022; 52:265-277. [PMID: 32224475 DOI: 10.1109/tcyb.2020.2979320] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 06/10/2023]
Abstract
Recently, learning-based representation techniques have been well exploited for grayscale face image hallucination. For color images, previous methods handle only the luminance component or each color channel individually, without considering the abundant correlations among different channels or the inherent geometrical structure of the data manifold. In this article, we propose a learning-based model in quaternion space with graph representation for color face hallucination. Instead of the spatial domain, the color image is represented in the quaternion domain to preserve correlations among different color channels. Moreover, a quaternion graph is learned to smooth the quaternion feature space, which helps not only to stabilize the linear system but also to capture the inherent topological structure of the quaternion patch manifold. In addition, considering that a single low-resolution (LR) image patch provides only limited information for representation, we propose to simultaneously encode, in the objective, the smaller query LR patch and a larger patch containing the surrounding pixels at the same position. The larger patch, with its rich patterns, is used to compensate for the information lost in the query LR patch, which further strengthens the manifold consistency assumption between the LR and high-resolution (HR) patch spaces. The experimental results demonstrated the efficiency of the proposed method in hallucinating color face images.
5
Singh S, Tripathi BK. Pneumonia classification using quaternion deep learning. MULTIMEDIA TOOLS AND APPLICATIONS 2021; 81:1743-1764. [PMID: 34658656 PMCID: PMC8506489 DOI: 10.1007/s11042-021-11409-7] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 11/01/2020] [Revised: 01/20/2021] [Accepted: 08/02/2021] [Indexed: 06/05/2023]
Abstract
Pneumonia is an infection of one or both lungs caused by viruses or bacteria inhaled with air. It inflames the air sacs in the lungs, which fill with fluid, leading to breathing problems. Radiologists diagnose pneumonia by looking for lung abnormalities caused by this fluid in chest X-rays. Computer-aided detection and diagnosis (CAD) tools can assist radiologists by improving their diagnostic accuracy. Such CAD tools use neural networks trained on a chest X-ray dataset to classify a chest X-ray as normal or infected with pneumonia. Convolutional neural networks have shown remarkable performance in object detection in images. The quaternion convolutional neural network (QCNN) is a generalization of the conventional convolutional neural network. A QCNN treats all three channels (R, G, B) of a color image as a single unit and extracts more representative features, which further improves classification. In this paper, we trained a quaternion residual network on a large, publicly available chest X-ray dataset from the Kaggle repository and obtained a classification accuracy of 93.75% and an F-score of 0.94. We also compared our performance with other CNN architectures. We found that classification accuracy was higher with the quaternion residual network than with a real-valued residual network.
Affiliation(s)
- B. K. Tripathi
- Harcourt Butler Technological University Kanpur, Kanpur, India
6
Ma Z, Xie J, Lai Y, Taghia J, Xue JH, Guo J. Insights Into Multiple/Single Lower Bound Approximation for Extended Variational Inference in Non-Gaussian Structured Data Modeling. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2020; 31:2240-2254. [PMID: 30908264 DOI: 10.1109/tnnls.2019.2899613] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 06/09/2023]
Abstract
For most of the non-Gaussian statistical models, the data being modeled represent strongly structured properties, such as scalar data with bounded support (e.g., beta distribution), vector data with unit length (e.g., Dirichlet distribution), and vector data with positive elements (e.g., generalized inverted Dirichlet distribution). In practical implementations of non-Gaussian statistical models, it is infeasible to find an analytically tractable solution to estimating the posterior distributions of the parameters. Variational inference (VI) is a widely used framework in Bayesian estimation. Recently, an improved framework, namely, the extended VI (EVI), has been introduced and applied successfully to a number of non-Gaussian statistical models. EVI derives analytically tractable solutions by introducing lower bound approximations to the variational objective function. In this paper, we compare two approximation strategies, namely, the multiple lower bounds (MLBs) approximation and the single lower bound (SLB) approximation, which can be applied to carry out the EVI. For implementation, two different conditions, the weak and the strong conditions, are discussed. Convergence of the EVI depends on the selection of the lower bound, regardless of the choice of weak or strong condition. We also discuss the convergence properties to clarify the differences between MLB and SLB. Extensive comparisons are made based on some EVI-based non-Gaussian statistical models. Theoretical analysis is conducted to demonstrate the differences between the weak and strong conditions. Experimental results based on real data show advantages of the SLB approximation over the MLB approximation.
7
Porebski A, Truong Hoang V, Vandenbroucke N, Hamad D. Combination of LBP Bin and Histogram Selections for Color Texture Classification. J Imaging 2020; 6:53. [PMID: 34460599 PMCID: PMC8321149 DOI: 10.3390/jimaging6060053] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 04/21/2020] [Revised: 06/16/2020] [Accepted: 06/19/2020] [Indexed: 11/23/2022]
Abstract
LBP (Local Binary Pattern) is a very popular texture descriptor widely used in computer vision. In most applications, LBP histograms are exploited as texture features, leading to a high-dimensional feature space, especially for color texture classification problems. In the past few years, different solutions have been proposed to reduce the dimension of the feature space based on the LBP histogram. Most of these approaches apply feature selection methods in order to find the most discriminative bins. Recently, another strategy proposed selecting the most discriminant LBP histograms in their entirety. This paper aims to improve on these previous approaches and presents a combination of LBP bin and histogram selections, where a histogram ranking method is applied before a bin selection procedure. The proposed approach is evaluated on five benchmark image databases, and the results show the effectiveness of the combination of LBP bin and histogram selections, which outperforms the LBP bin selection and LBP histogram selection approaches applied independently.
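As a rough illustration of the descriptor this abstract builds on, the sketch below computes a basic 8-neighbour LBP code and the resulting 256-bin histogram for a single grayscale channel. It is not the authors' bin or histogram selection method; the function names `lbp_code` and `lbp_histogram`, the clockwise neighbour ordering, and the `>=` thresholding convention are assumptions chosen for illustration.

```python
# Minimal sketch of a basic 3x3 LBP histogram (illustrative only; the
# neighbour ordering and the >= threshold are assumed conventions).

def lbp_code(img, r, c):
    """Return the 8-bit LBP code of the pixel at (r, c).

    img is a list of rows of integer grey levels; (r, c) must not lie
    on the image border.
    """
    center = img[r][c]
    # Neighbours visited clockwise, starting from the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        # A neighbour at least as bright as the centre sets its bit.
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(img):
    """Count LBP codes over all interior pixels (256 bins)."""
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[lbp_code(img, r, c)] += 1
    return hist
```

Computing one such 256-bin histogram per color channel multiplies the feature dimension, which is the kind of growth that motivates selecting individual bins or whole histograms.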
Affiliation(s)
- Alice Porebski
- LISIC laboratory, Université du Littoral Côte d’Opale, 50 rue Ferdinand Buisson, 62228 Calais CEDEX, France
- Vinh Truong Hoang
- Faculty of Information Technology, Ho Chi Minh City Open University, 97 Vo Van Tan, District 3, 700000 Ho Chi Minh City, Vietnam
- Nicolas Vandenbroucke
- LISIC laboratory, Université du Littoral Côte d’Opale, 50 rue Ferdinand Buisson, 62228 Calais CEDEX, France
- Denis Hamad
- LISIC laboratory, Université du Littoral Côte d’Opale, 50 rue Ferdinand Buisson, 62228 Calais CEDEX, France
8
9
Abstract
Small-scale face detection is a very difficult problem. To achieve higher detection accuracy, we propose a novel method for small-scale face detection, termed SE-IYOLOV3. In SE-IYOLOV3, we first improve YOLOV3: anchor boxes with a higher average intersection ratio are obtained by combining niche technology with the k-means algorithm. An upsampling scale is added to form a face network structure that is suitable for detecting dense small-scale faces. The number of prediction boxes is five times more than in the YOLOV3 network. To further improve the detection performance, we adopt the SENet structure to enhance the global receptive field of the network. The experimental results on the WIDER FACE dataset show that the IYOLOV3 network combined with the SENet structure can significantly improve the detection accuracy for dense small-scale faces.
10
Xiao X, Chen Y, Gong YJ, Zhou Y. Two-Dimensional Quaternion Sparse Discriminant Analysis. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2019; 29:2271-2286. [PMID: 31670667 DOI: 10.1109/tip.2019.2947775] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 06/10/2023]
Abstract
Linear discriminant analysis has been incorporated with various representations and measurements for dimension reduction and feature extraction. In this paper, we propose two-dimensional quaternion sparse discriminant analysis (2D-QSDA), which meets the requirements of representing RGB and RGB-D images. 2D-QSDA advances in three aspects: 1) by including sparse regularization, 2D-QSDA relies only on the important variables and thus generalizes well to out-of-sample data that are unseen during the training phase; 2) benefiting from quaternion representation, 2D-QSDA preserves the high-order correlation among different image channels and provides a unified approach to extracting features from RGB and RGB-D images; 3) the spatial structure of the input images is retained via matrix-based processing. We tackle the constrained trace ratio problem of 2D-QSDA by solving a corresponding constrained trace difference problem, which is then transformed into a quaternion sparse regression (QSR) model. Afterward, we reformulate the QSR model into an equivalent complex form to avoid processing the complicated structure of quaternions. A nested iterative algorithm is designed to learn the solution of 2D-QSDA in the complex space, and we then convert this solution back to the quaternion domain. To improve the separability of 2D-QSDA, we further propose 2D-QSDAw, which uses weighted pairwise between-class distances. Extensive experiments on RGB and RGB-D databases demonstrate the effectiveness of 2D-QSDA and 2D-QSDAw compared with peer competitors.
11
Xu Z, Hu R, Chen J, Chen C, Jiang J, Li J, Li H. Semisupervised Discriminant Multimanifold Analysis for Action Recognition. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2019; 30:2951-2962. [PMID: 30762568 DOI: 10.1109/tnnls.2018.2886008] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 06/09/2023]
Abstract
Although recent semisupervised approaches have proven their effectiveness when there are limited training data, they assume that the samples from different actions lie on a single data manifold in the feature space and try to uncover a common subspace for all samples. However, this assumption ignores the intraclass compactness and the interclass separability simultaneously. We believe that human actions should occupy a multimanifold subspace and, therefore, model the samples of the same action as the same manifold and those of different actions as different manifolds. In obtaining the optimum subspace projection matrix, the current approaches may be mathematically imprecise owing to badly scaled matrices and improper convergence. To address these issues in unconstrained convex optimization, we introduce a nontrivial spectral projected gradient method and Karush-Kuhn-Tucker conditions without matrix inversion. By maximizing the separability between different classes using labeled data points and estimating the intrinsic geometric structure of the data distributions using unlabeled data points, the proposed algorithm can learn global and local consistency and boost the recognition performance. Extensive experiments conducted on realistic video data sets, including JHMDB, HMDB51, UCF50, and UCF101, have demonstrated that our algorithm outperforms the compared algorithms, including a deep learning approach, when there are only a few labeled samples.
12
Chen Y, Xiao X, Zhou Y. Low-rank quaternion approximation for color image processing. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2019; 29:1426-1439. [PMID: 31545725 DOI: 10.1109/tip.2019.2941319] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Indexed: 06/10/2023]
Abstract
Low-rank matrix approximation (LRMA)-based methods have achieved great success in grayscale image processing. When handling color images, LRMA either restores each color channel independently using the monochromatic model or processes the concatenation of three color channels using the concatenation model. However, these two schemes may not make full use of the high correlation among RGB channels. To address this issue, we propose a novel low-rank quaternion approximation (LRQA) model. It contains two major components: first, instead of modeling a color image pixel as a scalar, as in conventional sparse representation and LRMA-based methods, the color image is encoded as a pure quaternion matrix, such that the cross-channel correlation of color channels can be well exploited; second, LRQA imposes the low-rank constraint on the constructed quaternion matrix. To better estimate the singular values of the underlying low-rank quaternion matrix from its noisy observation, a general model for LRQA is proposed based on several nonconvex functions. Extensive evaluations on color image denoising and inpainting tasks verify that LRQA achieves better performance than several state-of-the-art sparse representation and LRMA-based methods in terms of both quantitative metrics and visual quality.
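The pure quaternion encoding mentioned in this abstract can be sketched in a few lines: each RGB pixel becomes a quaternion with zero real part and the three channels as its imaginary parts. The code below is only an illustration of that encoding and of the Hamilton product, not the LRQA model itself; the helper names `rgb_to_pure_quaternion`, `encode_image`, and `hamilton` are hypothetical, and quaternions are stored as plain `(w, x, y, z)` tuples.

```python
# Minimal sketch of pure quaternion pixel encoding (illustrative only;
# all names here are assumptions, not taken from the cited paper).

def rgb_to_pure_quaternion(r, g, b):
    """Encode one RGB pixel as the pure quaternion 0 + r*i + g*j + b*k."""
    return (0.0, float(r), float(g), float(b))


def encode_image(rgb_image):
    """Map a nested list of (r, g, b) pixels to a pure quaternion matrix."""
    return [[rgb_to_pure_quaternion(*px) for px in row] for row in rgb_image]


def hamilton(p, q):
    """Hamilton product of two quaternions given as (w, x, y, z) tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )
```

For two pure quaternions, the product has the negative dot product of the two color vectors as its real part and their cross product as its imaginary part, so every output component mixes all three channels. This is one way to see why quaternion operations couple the channels rather than processing them independently.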
13
Xiao X, Zhou Y. Two-Dimensional Quaternion PCA and Sparse PCA. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2019; 30:2028-2042. [PMID: 30418886 DOI: 10.1109/tnnls.2018.2872541] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Indexed: 06/09/2023]
Abstract
Benefiting from quaternion representation, which is able to encode the cross-channel correlation of color images, quaternion principal component analysis (QPCA) was proposed to extract features from color images while reducing the feature dimension. A quaternion covariance matrix (QCM) of input samples was constructed, and its eigenvectors were derived to find the solution of QPCA. However, eigen-decomposition leads to a fixed solution for the same input. This solution is susceptible to outliers and cannot be further optimized. To solve this problem, this paper proposes a novel quaternion ridge regression (QRR) model for two-dimensional QPCA (2D-QPCA). We mathematically prove that this QRR model is equivalent to the QCM model of 2D-QPCA. The QRR model is a general framework and is flexible enough to combine 2D-QPCA with other technologies or constraints to adapt to different requirements of real-world applications. By including sparsity constraints, we then propose a quaternion sparse regression model for 2D-QSPCA to improve its robustness for classification. An alternating minimization algorithm is developed to iteratively learn the solution of 2D-QSPCA in the equivalent complex domain. In addition, 2D-QPCA and 2D-QSPCA can preserve the spatial structure of color images and have a low computation cost. Experiments on several challenging databases demonstrate that 2D-QPCA and 2D-QSPCA are effective in color face recognition, and 2D-QSPCA outperforms the state of the art.
14
15
An LBP encoding scheme jointly using quaternionic representation and angular information. Neural Comput Appl 2019. [DOI: 10.1007/s00521-018-03968-y] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Indexed: 10/27/2022]
16
Liu L, Li S, Chen CLP. Quaternion Locality-Constrained Coding for Color Face Hallucination. IEEE TRANSACTIONS ON CYBERNETICS 2018; 48:1474-1485. [PMID: 28541233 DOI: 10.1109/tcyb.2017.2703134] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Indexed: 06/07/2023]
Abstract
Recently, locality linear coding (LLC) has attracted increasing attention in the areas of image processing and computer vision. However, conventional LLC with a real setting is designed only for grayscale images. For a color image, it usually treats each color channel individually or encodes a monochrome image formed by concatenating all the color channels, which ignores the correlations among different channels. In this paper, we propose a quaternion-based locality-constrained coding (QLC) model for color face hallucination in the quaternion space. In QLC, the face images are represented as quaternion matrices. By transforming the channel images into an orthogonal feature space and encoding the coefficients in the quaternion domain, the proposed QLC is expected to combine the advantages of both quaternion algebra and the locality coding scheme. Hence, QLC can not only expose the true topology of the image patch manifold but also preserve the inherent correlations among different color channels. Experimental results demonstrated that our proposed QLC method achieved superior performance in color face hallucination compared with other state-of-the-art methods.