1
Mohammadi M, Tino P, Bunte K. Manifold Alignment Aware Ants: A Markovian Process for Manifold Extraction. Neural Comput 2022; 34:595-641. [PMID: 35026002 DOI: 10.1162/neco_a_01478] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/04/2021] [Accepted: 11/04/2021] [Indexed: 11/04/2022]
Abstract
The presence of manifolds is a common assumption in many applications, including astronomy and computer vision. For instance, in astronomy, low-dimensional stellar structures, such as streams, shells, and globular clusters, can be found in the neighborhood of big galaxies such as the Milky Way. Since these structures are often buried in very large data sets, an algorithm that can not only recover the manifold but also remove the background noise (or outliers) is highly desirable. While other works try to recover manifolds either by pushing all points toward manifolds or by downsampling from dense regions, aiming to solve one of the problems, they generally fail to suppress the noise on manifolds and remove background noise simultaneously. Inspired by the collective behavior of biological ants in the food-seeking process, we propose a new algorithm that employs several random walkers equipped with a local alignment measure to detect and denoise manifolds. During the walking process, the agents release pheromone on data points, which reinforces future movements. Over time the pheromone concentrates on the manifolds, while it fades in the background noise due to an evaporation procedure. We use the Markov chain (MC) framework to provide a theoretical analysis of the convergence of the algorithm and its performance. Moreover, an empirical analysis, based on synthetic and real-world data sets, is provided to demonstrate its applicability in different areas, such as improving the performance of t-distributed stochastic neighbor embedding (t-SNE) and spectral clustering using the underlying MC formulas, recovering astronomical low-dimensional structures, and improving the performance of the fast Parzen window density estimator.
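The deposit-and-evaporate mechanism described in this abstract can be illustrated with a toy sketch. It is not the authors' algorithm: it uses a single walker on a brute-force k-nearest-neighbour graph, omits the local alignment measure entirely, and every name and parameter (`pheromone_walk`, `k`, `deposit`, `evaporation`) is an assumption made for illustration.

```python
import math
import random


def pheromone_walk(points, n_steps=200, k=3, deposit=1.0, evaporation=0.01, seed=0):
    """Toy pheromone-reinforced random walk (hypothetical simplification).

    points: list of coordinate tuples. Returns one pheromone value per point.
    """
    rng = random.Random(seed)
    n = len(points)

    # Brute-force k nearest neighbours of every point (fine for small n).
    neighbours = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        others.sort(key=lambda j: math.dist(points[i], points[j]))
        neighbours.append(others[:k])

    pheromone = [1.0] * n          # uniform initial pheromone
    current = rng.randrange(n)     # random starting point
    for _ in range(n_steps):
        nbrs = neighbours[current]
        # Jump to a neighbour with probability proportional to its pheromone.
        weights = [pheromone[j] for j in nbrs]
        current = rng.choices(nbrs, weights=weights, k=1)[0]
        # Reinforce the visited point, then evaporate everywhere.
        pheromone[current] += deposit
        pheromone = [(1.0 - evaporation) * p for p in pheromone]
    return pheromone
```

In this simplified model the total pheromone T obeys T ← (1 − evaporation)(T + deposit) at every step, so it approaches (1 − evaporation) · deposit / evaporation regardless of where the walker goes; only its distribution over points depends on the walk.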
Affiliation(s)
- Mohammad Mohammadi: Faculty of Science and Engineering, University of Groningen, Groningen 9747AA, The Netherlands
- Peter Tino: Department of Computer Science, University of Birmingham, Birmingham B15 2TT
- Kerstin Bunte: Faculty of Science and Engineering, University of Groningen, Groningen 9747 AG, The Netherlands
2
Nielsen F. On a Variational Definition for the Jensen-Shannon Symmetrization of Distances Based on the Information Radius. ENTROPY (BASEL, SWITZERLAND) 2021; 23:464. [PMID: 33919986 PMCID: PMC8071043 DOI: 10.3390/e23040464] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 03/12/2021] [Revised: 04/09/2021] [Accepted: 04/09/2021] [Indexed: 01/21/2023]
Abstract
We generalize the Jensen-Shannon divergence and the Jensen-Shannon diversity index by considering a variational definition with respect to a generic mean, thereby extending the notion of Sibson's information radius. The variational definition applies to any arbitrary distance and yields a new way to define a Jensen-Shannon symmetrization of distances. When the variational optimization is further constrained to belong to prescribed families of probability measures, we get relative Jensen-Shannon divergences and their equivalent Jensen-Shannon symmetrizations of distances that generalize the concept of information projections. Finally, we touch upon applications of these variational Jensen-Shannon divergences and diversity indices to clustering and quantization tasks of probability measures, including statistical mixtures.
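For the classical case that this abstract generalizes, the variational view can be checked numerically: the Jensen-Shannon divergence of p and q equals the minimum over c of the average Kullback-Leibler divergence from p and q to c, and the minimum is attained at the arithmetic mixture (p + q)/2. The sketch below covers only discrete distributions and the arithmetic mean, not the generic means or constrained families of the paper; the function names are illustrative assumptions.

```python
import math


def kl(p, q):
    """Kullback-Leibler divergence KL(p || q) for discrete distributions.

    Assumes q[i] > 0 wherever p[i] > 0.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def average_kl_to(c, p, q):
    """Objective of the variational definition: mean divergence to a centre c."""
    return 0.5 * kl(p, c) + 0.5 * kl(q, c)


def jensen_shannon(p, q):
    """Jensen-Shannon divergence: the objective evaluated at the mixture."""
    m = [0.5 * (pi + qi) for pi, qi in zip(p, q)]
    return average_kl_to(m, p, q)


p = [0.1, 0.4, 0.5]
q = [0.8, 0.1, 0.1]
jsd = jensen_shannon(p, q)

# Any other centre gives a value at least as large as the divergence itself.
for c in ([1 / 3, 1 / 3, 1 / 3], [0.5, 0.25, 0.25], [0.2, 0.3, 0.5]):
    assert average_kl_to(c, p, q) >= jsd
```

The check works because of the identity 0.5·KL(p‖c) + 0.5·KL(q‖c) = JSD(p, q) + KL(m‖c) with m = (p + q)/2, so the excess over the divergence is exactly KL(m‖c), which is non-negative.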
Affiliation(s)
- Frank Nielsen: Sony Computer Science Laboratories, Tokyo 141-0022, Japan
3
Wang G, Teoh JYC, Lu J, Choi KS. Least squares support vector machines with fast leave-one-out AUC optimization on imbalanced prostate cancer data. INT J MACH LEARN CYB 2020. [DOI: 10.1007/s13042-020-01081-y] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 12/12/2022]
4
The Fisher-Rao Distance between Multivariate Normal Distributions: Special Cases, Bounds and Applications. ENTROPY 2020; 22:e22040404. [PMID: 33286178 PMCID: PMC7516881 DOI: 10.3390/e22040404] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 01/26/2020] [Revised: 03/06/2020] [Accepted: 03/11/2020] [Indexed: 11/16/2022]
Abstract
The Fisher–Rao distance is a measure of dissimilarity between probability distributions, which, under certain regularity conditions of the statistical model, is up to a scaling factor the unique Riemannian metric invariant under Markov morphisms. It is related to the Shannon entropy and has been used to enlarge the perspective of analysis in a wide variety of domains such as image processing, radar systems, and morphological classification. Here, we approach this metric considered in the statistical model of normal multivariate probability distributions, for which there is not an explicit expression in general, by gathering known results (closed forms for submanifolds and bounds) and derive expressions for the distance between distributions with the same covariance matrix and between distributions with mirrored covariance matrices. An application of the Fisher–Rao distance to the simplification of Gaussian mixtures using the hierarchical clustering algorithm is also presented.
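A standard example of a case with a closed form is the univariate normal family, where the Fisher metric ds² = (dμ² + 2dσ²)/σ² is a rescaled hyperbolic half-plane metric. The sketch below implements that well-known univariate formula only; it is not claimed to reproduce the multivariate expressions derived in the paper, and the function name is an illustrative assumption.

```python
import math


def fisher_rao_univariate(mu1, sigma1, mu2, sigma2):
    """Fisher-Rao distance between N(mu1, sigma1^2) and N(mu2, sigma2^2).

    Closed form for the univariate normal family (sigma1, sigma2 > 0):
    sqrt(2) * arccosh(1 + ((mu1-mu2)^2 + 2*(sigma1-sigma2)^2) / (4*sigma1*sigma2)).
    """
    numerator = (mu1 - mu2) ** 2 + 2.0 * (sigma1 - sigma2) ** 2
    return math.sqrt(2.0) * math.acosh(1.0 + numerator / (4.0 * sigma1 * sigma2))


# With equal means the formula reduces to sqrt(2) * |ln(sigma2 / sigma1)|.
d_same_mean = fisher_rao_univariate(0.0, 1.0, 0.0, math.e)
d_shifted = fisher_rao_univariate(0.0, 1.0, 3.0, 1.0)
```

Unlike the Euclidean distance between parameter vectors, this distance depends on the scale: shifting the mean by a fixed amount costs less when both standard deviations are large.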
5
Liu Z, Chan SC, Zhang S, Zhang Z, Chen X. Automatic Muscle Fiber Orientation Tracking in Ultrasound Images Using a New Adaptive Fading Bayesian Kalman Smoother. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2019; 28:3714-3727. [PMID: 30794172 DOI: 10.1109/tip.2019.2899941] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Indexed: 06/09/2023]
Abstract
This paper proposes a new algorithm for automatic estimation of muscle fiber orientation (MFO) in musculoskeletal ultrasound images, which is commonly used for both diagnosis and rehabilitation assessment of patients. The algorithm is based on a novel adaptive fading Bayesian Kalman filter (AF-BKF) and an automatic region of interest (ROI) extraction method. The ROI is first enhanced by the Gabor filter (GF) and extracted automatically using the revoting constrained Radon transform (RCRT) approach. The dominant MFO in the ROI is then detected by the RT and tracked by the proposed AF-BKF, which employs simplified Gaussian mixtures to approximate the non-Gaussian state densities and a new adaptive fading method to update the mixture parameters. An AF-BK smoother (AF-BKS) is also proposed by extending the AF-BKF using the concept of Rauch-Tung-Striebel smoother for further smoothing the fascicle orientations. The experimental results and comparisons show that: 1) the maximum segmentation error of the proposed RCRT is below nine pixels, which is sufficiently small for MFO tracking; 2) the accuracy of MFO gauged by RT in the ROI enhanced by the GF is comparable to that of using multiscale vessel enhancement filter-based method and better than those of local RT and revoting Hough transform approaches; and 3) the proposed AF-BKS algorithm outperforms the other tested approaches and achieves a performance close to those obtained by experienced operators (the overall covariance obtained by the AF-BKS is 3.19, which is rather close to that of the operators, 2.86). It, thus, serves as a valuable tool for automatic estimation of fascicle orientations and possibly for other applications in musculoskeletal ultrasound images.
6
Chan AB. Density-Preserving Hierarchical EM Algorithm: Simplifying Gaussian Mixture Models for Approximate Inference. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2019; 41:1323-1337. [PMID: 29994194 DOI: 10.1109/tpami.2018.2845371] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Indexed: 06/08/2023]
Abstract
We propose an algorithm for simplifying a finite mixture model into a reduced mixture model with fewer mixture components. The reduced model is obtained by maximizing a variational lower bound of the expected log-likelihood of a set of virtual samples. We develop three applications for our mixture simplification algorithm: recursive Bayesian filtering using Gaussian mixture model posteriors, KDE mixture reduction, and belief propagation without sampling. For recursive Bayesian filtering, we propose an efficient algorithm for approximating an arbitrary likelihood function as a sum of scaled Gaussians. Experiments on synthetic data, human location modeling, visual tracking, and vehicle self-localization show that our algorithm can be widely used for probabilistic data analysis, and is more accurate than other mixture simplification methods.
7
Hong X, Gao J, Chen S, Zia T. Sparse Density Estimation on the Multinomial Manifold. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2015; 26:2972-2977. [PMID: 25647665 DOI: 10.1109/tnnls.2015.2389273] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 06/04/2023]
Abstract
A new sparse kernel density estimator is introduced based on the minimum integrated square error criterion for the finite mixture model. Since the constraint on the mixing coefficients of the finite mixture model is on the multinomial manifold, we use the well-known Riemannian trust-region (RTR) algorithm for solving this problem. The first- and second-order Riemannian geometry of the multinomial manifold are derived and utilized in the RTR algorithm. Numerical examples are employed to demonstrate that the proposed approach is effective in constructing sparse kernel density estimators with an accuracy competitive with those of existing kernel density estimators.
8
Fan W, Bouguila N, Ziou D. Variational learning of finite Dirichlet mixture models using component splitting. Neurocomputing 2014. [DOI: 10.1016/j.neucom.2013.03.049] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Indexed: 11/15/2022]
9
Kristan M, Leonardis A. Online Discriminative Kernel Density Estimator With Gaussian Kernels. IEEE TRANSACTIONS ON CYBERNETICS 2014; 44:355-365. [PMID: 23757555 DOI: 10.1109/tcyb.2013.2255983] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Indexed: 06/02/2023]
Abstract
We propose a new method for a supervised online estimation of probabilistic discriminative models for classification tasks. The method estimates the class distributions from a stream of data in the form of Gaussian mixture models (GMMs). The reconstructive updates of the distributions are based on the recently proposed online kernel density estimator (oKDE). We maintain the number of components in the model low by compressing the GMMs from time to time. We propose a new cost function that measures loss of interclass discrimination during compression, thus guiding the compression toward simpler models that still retain discriminative properties. The resulting classifier thus independently updates the GMM of each class, but these GMMs interact during their compression through the proposed cost function. We call the proposed method the online discriminative kernel density estimator (odKDE). We compare the odKDE to oKDE, batch state-of-the-art kernel density estimators (KDEs), and batch/incremental support vector machines (SVM) on the publicly available datasets. The odKDE achieves comparable classification performance to that of best batch KDEs and SVM, while allowing online adaptation from large datasets, and produces models of lower complexity than the oKDE.
10
11
Cheng J, Sayeh MR, Zargham MR, Cheng Q. Real-time vector quantization and clustering based on ordinary differential equations. IEEE TRANSACTIONS ON NEURAL NETWORKS 2011; 22:2143-8. [PMID: 22057062 DOI: 10.1109/tnn.2011.2172627] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 11/08/2022]
Abstract
This brief presents a dynamical system approach to vector quantization or clustering based on ordinary differential equations with the potential for real-time implementation. Two examples of different pattern clusters demonstrate that the model can successfully quantize different types of input patterns. Furthermore, we analyze and study the stability of our dynamical system. By discovering the equilibrium points for certain input patterns and analyzing their stability, we have shown the quantizing behavior of the system with respect to its vigilance parameter. The proposed system is applied to two real-world problems, providing comparable results to the best reported findings. This validates the effectiveness of our proposed approach.
Affiliation(s)
- Jie Cheng: Department of Computer Science, University of Hawaii, Hilo, HI 96720, USA.
12
Nie F, Zeng Z, Tsang IW, Xu D, Zhang C. Spectral embedded clustering: a framework for in-sample and out-of-sample spectral clustering. IEEE TRANSACTIONS ON NEURAL NETWORKS 2011; 22:1796-808. [PMID: 21965198 DOI: 10.1109/tnn.2011.2162000] [Citation(s) in RCA: 187] [Impact Index Per Article: 13.4] [Indexed: 11/10/2022]
Abstract
Spectral clustering (SC) methods have been successfully applied to many real-world applications. The success of these SC methods is largely based on the manifold assumption, namely, that two nearby data points in the high-density region of a low-dimensional data manifold have the same cluster label. However, such an assumption might not always hold on high-dimensional data. When the data do not exhibit a clear low-dimensional manifold structure (e.g., high-dimensional and sparse data), the clustering performance of SC will be degraded and become even worse than K-means clustering. In this paper, motivated by the observation that the true cluster assignment matrix for high-dimensional data can always be embedded in a linear space spanned by the data, we propose the spectral embedded clustering (SEC) framework, in which a linearity regularization is explicitly added into the objective function of SC methods. More importantly, the proposed SEC framework can naturally deal with out-of-sample data. We also present a new Laplacian matrix constructed from a local regression of each pattern and incorporate it into our SEC framework to capture both local and global discriminative information for clustering. Comprehensive experiments on eight real-world high-dimensional datasets demonstrate the effectiveness and advantages of our SEC framework over existing SC methods and K-means-based clustering methods. Our SEC framework significantly outperforms SC using the Nyström algorithm on unseen data.
Affiliation(s)
- Feiping Nie: University of Texas, Arlington, TX 76019, USA.
13
Zhuo Z, Cai SM, Fu ZQ, Zhang J. Hierarchical organization of brain functional networks during visual tasks. PHYSICAL REVIEW. E, STATISTICAL, NONLINEAR, AND SOFT MATTER PHYSICS 2011; 84:031923. [PMID: 22060419 DOI: 10.1103/physreve.84.031923] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 01/24/2011] [Revised: 08/14/2011] [Indexed: 05/31/2023]
Abstract
The functional network of the brain is known to demonstrate modular structure over different hierarchical scales. In this paper, we systematically investigated the hierarchical modular organizations of the brain functional networks that are derived from the extent of phase synchronization among high-resolution EEG time series during a visual task. In particular, we compare the modular structure of the functional network from EEG channels with that of the anatomical parcellation of the brain cortex. Our results show that the modular architectures of brain functional networks correspond well to those from the anatomical structures over different levels of hierarchy. Most importantly, we find that the consistency between the modular structures of the functional network and the anatomical network becomes more pronounced in terms of vision, sensory, vision-temporal, motor cortices during the visual task, which implies that the strong modularity in these areas forms the functional basis for the visual task. The structure-function relationship further reveals that the phase synchronization of EEG time series in the same anatomical group is much stronger than that of EEG time series from different anatomical groups during the task and that the hierarchical organization of functional brain network may be a consequence of functional segmentation of the brain cortex.
Affiliation(s)
- Zhao Zhuo: Department of Electronic Science and Technology, University of Science and Technology of China, Hefei, Anhui 230026, People's Republic of China
14
Zhang J, Xu XK, Li P, Zhang K, Small M. Node importance for dynamical process on networks: a multiscale characterization. CHAOS (WOODBURY, N.Y.) 2011; 21:016107. [PMID: 21456849 DOI: 10.1063/1.3553644] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Indexed: 05/30/2023]
Abstract
Defining the importance of nodes in a complex network has been a fundamental problem in analyzing the structural organization of a network, as well as the dynamical processes on it. Traditionally, the measures of node importance usually depend either on the local neighborhood or global properties of a network. Many real-world networks, however, demonstrate finely detailed structure at various organization levels, such as hierarchy and modularity. In this paper, we propose a multiscale node-importance measure that can characterize the importance of the nodes at varying topological scales. This is achieved by introducing a kernel function whose bandwidth dictates the ranges of interaction, and meanwhile, by taking into account the interactions from all the paths in which a node is involved. We demonstrate that the scale here is closely related to the physical parameters of the dynamical processes on networks, and that our node-importance measure can characterize more precisely the node influence under different physical parameters of the dynamical process. We use epidemic spreading on networks as an example to show that our multiscale node-importance measure is more effective than other measures.
Affiliation(s)
- Jie Zhang: Centre for Computational Systems Biology, Fudan University, Shanghai 200433, People's Republic of China
15
Liu Z, Song YQ, Chen JM, Xie CH, Zhu F. Color image segmentation using nonparametric mixture models with multivariate orthogonal polynomials. Neural Comput Appl 2011. [DOI: 10.1007/s00521-011-0538-1] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Indexed: 11/29/2022]
16
Arun Kumar M, Gopal M. Fast Multiclass SVM Classification Using Decision Tree Based One-Against-All Method. Neural Process Lett 2010. [DOI: 10.1007/s11063-010-9160-y] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.1] [Indexed: 11/29/2022]
17
Bouguila N. Count data modeling and classification using finite mixtures of distributions. IEEE TRANSACTIONS ON NEURAL NETWORKS 2010; 22:186-98. [PMID: 21095862 DOI: 10.1109/tnn.2010.2091428] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.6] [Indexed: 11/09/2022]
Abstract
In this paper, we consider the problem of constructing accurate and flexible statistical representations for count data, which we often confront in many areas such as data mining, computer vision, and information retrieval. In particular, we analyze and compare several generative approaches widely used for count data clustering, namely multinomial, multinomial Dirichlet, and multinomial generalized Dirichlet mixture models. Moreover, we propose a clustering approach via a mixture model based on a composition of the Liouville family of distributions, from which we select the Beta-Liouville distribution, and the multinomial. The novel proposed model, which we call multinomial Beta-Liouville mixture, is optimized by deterministic annealing expectation-maximization and minimum description length, and strives to achieve a high accuracy of count data clustering and model selection. An important feature of the multinomial Beta-Liouville mixture is that it has fewer parameters than the recently proposed multinomial generalized Dirichlet mixture. The performance evaluation is conducted through a set of extensive empirical experiments, which concern text and image texture modeling and classification and shape modeling, and highlights the merits of the proposed models and approaches.
Affiliation(s)
- Nizar Bouguila: Concordia Institute for Information Systems Engineering, Concordia University, Montreal, QC H3G 1T7, Canada.
18
Zhang J, Zhou C, Xu X, Small M. Mapping from structure to dynamics: a unified view of dynamical processes on networks. PHYSICAL REVIEW. E, STATISTICAL, NONLINEAR, AND SOFT MATTER PHYSICS 2010; 82:026116. [PMID: 20866885 DOI: 10.1103/physreve.82.026116] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Received: 08/15/2009] [Revised: 07/19/2010] [Indexed: 05/29/2023]
Abstract
Although it is unambiguously agreed that structure plays a fundamental role in shaping the collective dynamics of complex systems, how structure determines dynamics exactly still remains unclear. We investigate a general computational transformation by which we can map the network topology directly to the dynamical patterns emergent on it, independent of the nature of the dynamical processes. Remarkably, we find that many seemingly different dynamical processes on networks, such as coupled oscillators, ensemble neuron firing, epidemic spreading and diffusion can all be understood and unified through this same procedure. Utilizing the inherent multiscale nature of this structure-dynamics transformation, we further define a multiscale complexity measure, which can quantify the functional diversity a general network can support at different organization levels using only its structure. We find that a wide variety of topological features observed in real networks, such as modularity, hierarchy, degree heterogeneity and mixing all result in higher complexity. This result suggests that the demand for functional diversity is driving the structural evolution of physical networks.
Affiliation(s)
- Jie Zhang: Department of Electronic and Information Engineering, Hong Kong Polytechnic University, Hong Kong, People's Republic of China.