1. Ranjbar A, Suratgar AA, Menhaj MB, Abbasi-Asl R. Structurally-constrained encoding framework using a multi-voxel reduced-rank latent model for human natural vision. J Neural Eng 2024; 21:046027. PMID: 38986451. DOI: 10.1088/1741-2552/ad6184.
Abstract
Objective. Voxel-wise visual encoding models based on convolutional neural networks (CNNs) have emerged as prominent tools for predicting human brain activity measured via functional magnetic resonance imaging (fMRI). While CNN-based models imitate the hierarchical structure of the human visual cortex to generate explainable features in response to natural visual stimuli, there is still a need for a brain-inspired model that predicts brain responses accurately from biomedical data. Approach. To bridge this gap, we propose a response prediction module called the Structurally Constrained Multi-Output (SCMO) module, which includes the homologous correlations that arise between a group of voxels in a cortical region in order to predict more accurate responses. Main results. This module employs all the responses across a visual area to predict individual voxel-wise BOLD responses and therefore accounts for the population activity and collective behavior of voxels. Such a module can determine the relationships within each visual region by creating a structure matrix that represents the underlying voxel-to-voxel interactions. Moreover, since each response module in visual encoding tasks relies on the image features, we conducted experiments with two different feature extraction modules to assess the predictive performance of the proposed module: a recurrent CNN that integrates both feedforward and recurrent interactions, and the popular AlexNet model, which uses only feedforward connections. Significance. We demonstrate that the proposed framework provides reliable predictive ability across multiple brain areas, outperforming benchmark models in terms of stability and coherency of features.
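The reduced-rank idea at the core of such a framework can be conveyed with a generic sketch (not the authors' SCMO code): fit a multi-output ridge regression, then project the fit onto a shared low-rank latent space so that all voxels are predicted through a few common components. All sizes and data below are synthetic stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for the setting: n stimuli, d image features, v voxels.
n, d, v, rank = 200, 30, 12, 3

X = rng.standard_normal((n, d))                          # stimulus features
B_true = rng.standard_normal((d, rank)) @ rng.standard_normal((rank, v))
Y = X @ B_true + 0.1 * rng.standard_normal((n, v))       # multi-voxel responses

# Step 1: ordinary multi-output ridge regression.
lam = 1.0
B_ridge = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ Y)

# Step 2: project the fitted responses onto a rank-r latent space, so every
# voxel is predicted through the same few shared components.
U, s, Vt = np.linalg.svd(X @ B_ridge, full_matrices=False)
Y_rr = (U[:, :rank] * s[:rank]) @ Vt[:rank]

# The rank-r fit should retain most of the full ridge fit.
err = np.linalg.norm(Y_rr - X @ B_ridge) / np.linalg.norm(X @ B_ridge)
```

The low-rank projection is what couples the voxels: instead of independent per-voxel regressions, all outputs share the same latent components.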
Affiliation(s)
- Amin Ranjbar
- Department of Electrical Engineering, Amirkabir University of Technology, Tehran, Iran
- Distributed and Intelligence Optimization Research Laboratory (DIOR Lab.), Tehran, Iran
- Amir Abolfazl Suratgar
- Department of Electrical Engineering, Amirkabir University of Technology, Tehran, Iran
- Distributed and Intelligence Optimization Research Laboratory (DIOR Lab.), Tehran, Iran
- Mohammad Bagher Menhaj
- Department of Electrical Engineering, Amirkabir University of Technology, Tehran, Iran
- Distributed and Intelligence Optimization Research Laboratory (DIOR Lab.), Tehran, Iran
- Reza Abbasi-Asl
- Department of Neurology, Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, CA, United States of America
- UCSF Weill Institute for Neurosciences, San Francisco, CA, United States of America
2. Li Z, Li S, Bamasag OO, Alhothali A, Luo X. Diversified Regularization Enhanced Training for Effective Manipulator Calibration. IEEE Transactions on Neural Networks and Learning Systems 2023; 34:8778-8790. PMID: 35263261. DOI: 10.1109/tnnls.2022.3153039.
Abstract
Recently, robot arms have become irreplaceable production tools that play an important role in industrial production. To realize automatic production, the absolute positioning accuracy of the robot must be ensured; however, owing to machining and assembly tolerances, robot positioning accuracy is often poor. Therefore, to enable precise operation, the robot's kinematic parameters must be calibrated. The least-squares method and the Levenberg-Marquardt (LM) algorithm are commonly used to identify robot positioning errors, but they are prone to overfitting caused by improper regularization schemes. To address this problem, this article examines six regularization schemes for the error model, i.e., L1, L2, dropout, elastic, log, and swish, and proposes a scheme that combines all six to obtain a reliable ensemble, which can effectively avoid overfitting. Extensive experiments verify that the positioning accuracy of the robot improves significantly after calibration, demonstrating the feasibility of the proposed method.
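As a hedged illustration of how regularization tames an ill-conditioned identification problem, the sketch below solves an elastic-net-regularized least-squares problem with a plain proximal-gradient (ISTA) loop on a toy linear error model. This is not the paper's kinematic calibration pipeline, and the `l1`/`l2` weights are illustrative; the elastic penalty covers only two of the six schemes discussed.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy linear error model e = J @ w + noise: a stand-in for the kinematic
# error model (the real J would come from the robot's kinematic parameters).
m, p = 40, 25
J = rng.standard_normal((m, p)) @ np.diag(np.linspace(1.0, 1e-3, p))  # ill-conditioned
w_true = rng.standard_normal(p)
e = J @ w_true + 0.01 * rng.standard_normal(m)

def elastic_net(J, e, l1, l2, steps=3000):
    """ISTA for 0.5*||J w - e||^2 + l1*||w||_1 + 0.5*l2*||w||^2."""
    w = np.zeros(J.shape[1])
    L = np.linalg.norm(J, 2) ** 2 + l2          # Lipschitz constant of the smooth part
    for _ in range(steps):
        g = J.T @ (J @ w - e) + l2 * w          # gradient of the smooth part
        z = w - g / L
        w = np.sign(z) * np.maximum(np.abs(z) - l1 / L, 0.0)  # soft threshold
    return w

w_ls = np.linalg.lstsq(J, e, rcond=None)[0]     # unregularized least squares
w_en = elastic_net(J, e, l1=1e-3, l2=1e-2)
```

Starting from zero and taking 1/L steps guarantees the objective decreases monotonically, so the regularized estimate stays well-behaved along the ill-conditioned directions where the plain least-squares solution can blow up.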
3. Zhong P, Xu Y. Subspace screening rule for multi-label estimator with sparsity-inducing regularization. Neurocomputing 2023. DOI: 10.1016/j.neucom.2023.01.030.
4. Zhang YY, Wang H, Lv X, Zhang P. Capturing the grouping and compactness of high-level semantic feature for saliency detection. Neural Netw 2021; 142:351-362. PMID: 34116448. DOI: 10.1016/j.neunet.2021.04.028.
Abstract
Saliency detection is an important and challenging research topic due to the variety and complexity of background and saliency regions. In this paper, we present a novel unsupervised saliency detection approach that exploits the grouping and compactness characteristics of high-level semantic features. First, an elastic-net-based hypergraph model is adopted to discover the group-structure relationships of salient regional points, and a spatial-distribution measure is constructed to detect the compactness of the saliency regions. Next, the grouping-based and compactness-based saliency maps are refined by a propagation algorithm. The propagation process uses an enhanced similarity matrix that fuses the low-level deep feature and the high-level semantic feature through cross diffusion. Results on four benchmark datasets with pixel-wise accurate labeling demonstrate the effectiveness of the proposed method. In particular, the proposed unsupervised method achieves performance competitive with deep-learning-based methods.
Affiliation(s)
- Ying Ying Zhang
- School of Physics Electronic Engineering, Nanyang Normal University, Nanyang 473061, China.
- HongJuan Wang
- School of Mechanical and Electrical Engineering, Nanyang Normal University, Nanyang 473061, China
- XiaoDong Lv
- School of Mechanical and Electrical Engineering, Nanyang Normal University, Nanyang 473061, China
- Ping Zhang
- School of Physics Electronic Engineering, Nanyang Normal University, Nanyang 473061, China
5.

6. Wu G, Zheng R, Tian Y, Liu D. Joint Ranking SVM and Binary Relevance with robust Low-rank learning for multi-label classification. Neural Netw 2019; 122:24-39. PMID: 31675625. DOI: 10.1016/j.neunet.2019.10.002.
Abstract
Multi-label classification studies the task where each example belongs to multiple labels simultaneously. As a representative method, Ranking Support Vector Machine (Rank-SVM) aims to minimize the Ranking Loss and can also mitigate the negative influence of the class-imbalance issue. However, because it handles thresholding in a stacking-style second step, it may suffer error accumulation that reduces the final classification performance. Binary Relevance (BR) is another typical method, which aims to minimize the Hamming Loss and needs only one-step learning. Nevertheless, it can suffer from the class-imbalance issue and does not take label correlations into account. To address these issues, we propose a novel multi-label classification model that joins Ranking SVM and Binary Relevance with robust Low-rank learning (RBRL). RBRL inherits the ranking-loss-minimization advantage of Rank-SVM, and thus overcomes the disadvantages of BR, namely the class-imbalance issue and the neglect of label correlations. Meanwhile, it retains the Hamming-loss-minimization and one-step-learning advantages of BR, and thus avoids the additional thresholding learning step required by Rank-SVM. Besides, a low-rank constraint is utilized to further exploit high-order label correlations under the assumption of a low-dimensional label space. Furthermore, to achieve nonlinear multi-label classifiers, we derive the kernelized RBRL. Two accelerated proximal gradient (APG) methods are used to solve the optimization problems efficiently. Extensive comparative experiments with several state-of-the-art methods illustrate the highly competitive or superior performance of RBRL.
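The two losses RBRL brings together can be computed directly. The sketch below implements the textbook definitions of Hamming Loss and Ranking Loss (with ties counted as mis-ordered) on a tiny hand-made example; it illustrates the objectives, not the RBRL optimization itself.

```python
import numpy as np

def hamming_loss(Y, Y_pred):
    """Fraction of label assignments that disagree (BR's objective)."""
    return np.mean(Y != Y_pred)

def ranking_loss(Y, scores):
    """Average fraction of (relevant, irrelevant) label pairs that the
    scores mis-order (Rank-SVM's objective); ties count as mis-ordered."""
    losses = []
    for y, s in zip(Y, scores):
        pos, neg = s[y == 1], s[y == 0]
        if len(pos) == 0 or len(neg) == 0:
            continue
        mis = np.sum(pos[:, None] <= neg[None, :])   # irrelevant outranks relevant
        losses.append(mis / (len(pos) * len(neg)))
    return float(np.mean(losses))

Y = np.array([[1, 0, 1, 0],
              [0, 1, 0, 0]])
scores = np.array([[0.9, 0.2, 0.8, 0.1],     # perfectly ordered
                   [0.3, 0.4, 0.6, 0.1]])    # label 2 outranks the true label 1
Y_pred = (scores > 0.5).astype(int)
```

Here `hamming_loss(Y, Y_pred)` is 0.25 (two of eight assignments wrong) and `ranking_loss(Y, scores)` is 1/6 (one mis-ordered pair out of three in the second example).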
Affiliation(s)
- Guoqiang Wu
- School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing 100049, China; Research Center on Fictitious Economy and Data Science, Chinese Academy of Sciences, Beijing 100190, China.
- Ruobing Zheng
- Computer Network Information Center, Chinese Academy of Sciences, Beijing 100190, China.
- Yingjie Tian
- Research Center on Fictitious Economy and Data Science, Chinese Academy of Sciences, Beijing 100190, China; School of Economics and Management, University of Chinese Academy of Sciences, Beijing 100190, China; Key Laboratory of Big Data Mining and Knowledge management, Chinese Academy of Sciences, Beijing 100190, China.
- Dalian Liu
- Department of Basic Course Teaching, Beijing Union University, Beijing 100101, China.
7. Nguyen B, De Baets B. Kernel-Based Distance Metric Learning for Supervised k-Means Clustering. IEEE Transactions on Neural Networks and Learning Systems 2019; 30:3084-3095. PMID: 30668483. DOI: 10.1109/tnnls.2018.2890021.
Abstract
Finding an appropriate distance metric that accurately reflects the (dis)similarity between examples is key to the success of k-means clustering. While it is not always easy to specify a good distance metric, we can try to learn one based on prior knowledge from available clustered data sets, an approach referred to as supervised clustering. In this paper, a kernel-based distance metric learning method is developed to improve the practical use of k-means clustering. Given the corresponding optimization problem, we derive a meaningful Lagrange dual formulation and introduce an efficient algorithm to reduce the training complexity. Our formulation is simple to implement, allowing a large-scale distance metric learning problem to be solved in a computationally tractable way. Experimental results show that the proposed method yields more robust and better performance on synthetic as well as real-world data sets compared with other state-of-the-art distance metric learning methods.
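A learned Mahalanobis-style metric d(x, z) = ||L(x - z)|| is equivalent to running ordinary Euclidean k-means on the linearly mapped data Lx. The sketch below demonstrates that equivalence with a hand-picked L on synthetic data; the paper instead learns the metric (in kernel form) from supervision, which this sketch does not attempt.

```python
import numpy as np

rng = np.random.default_rng(2)

def kmeans(X, k, iters=50):
    """Plain Lloyd's k-means with Euclidean distance."""
    C = X[:: max(len(X) // k, 1)][:k].copy()     # spread-out initial centers
    for _ in range(iters):
        d = ((X[:, None, :] - C[None, :, :]) ** 2).sum(-1)
        labels = d.argmin(1)
        for j in range(k):
            if np.any(labels == j):
                C[j] = X[labels == j].mean(0)
    return labels, C

# Two clusters separated along axis 0 but heavily smeared along axis 1.
A = rng.normal([0.0, 0.0], [0.3, 8.0], size=(100, 2))
B = rng.normal([2.0, 0.0], [0.3, 8.0], size=(100, 2))
X = np.vstack([A, B])
y = np.array([0] * 100 + [1] * 100)

# d_L(x, z) = ||L (x - z)|| reduces to Euclidean k-means on L @ x. This L is
# hand-picked to downweight the noisy axis; the paper learns it from supervision.
L = np.diag([1.0, 0.05])
labels, _ = kmeans(X @ L.T, 2)

acc = max(np.mean(labels == y), np.mean(labels != y))   # accuracy up to label swap
```

Without the mapping, the smeared axis dominates the Euclidean distance and k-means mixes the clusters; with it, the informative axis decides the assignment.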
8. Xu M, Yang Y, Han M, Qiu T, Lin H. Spatio-Temporal Interpolated Echo State Network for Meteorological Series Prediction. IEEE Transactions on Neural Networks and Learning Systems 2019; 30:1621-1634. PMID: 30307877. DOI: 10.1109/tnnls.2018.2869131.
Abstract
Spatio-temporal series prediction has attracted increasing attention in the field of meteorology in recent years. The joint spatial and temporal effect makes prediction challenging, and most existing spatio-temporal prediction models are computationally complicated. To develop an accurate but easy-to-implement spatio-temporal prediction model, this paper designs a novel model based on echo state networks. For real-world observed meteorological data with randomness and large changes, we use a cubic spline method to bridge the gaps between neighboring points, which results in a smooth series. The interpolated series is then input into the spatio-temporal echo state network, in which the spatial coefficients are computed by the elastic-net algorithm. This offers automatic selection and continuous shrinkage of the spatial variables. The proposed model provides an intuitive but effective approach to address the interaction of spatial and temporal effects. To demonstrate its practicality, we apply it to two real-world datasets: a monthly precipitation series and a daily air-quality-index series. Experimental results demonstrate that the proposed model achieves a normalized root-mean-square error of approximately 0.250 on both datasets. Similar accuracy is achieved by a long short-term memory model, but the computation time of our model is considerably shorter. It can be inferred that the proposed model has advantages over other models in predicting meteorological series.
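A minimal echo state network conveys the recipe: a fixed random reservoir scaled to spectral radius below 1, plus a linear readout fitted in one shot. This sketch uses ridge regression for the readout where the paper uses elastic net, and a synthetic periodic series in place of meteorological data; reservoir size and scaling are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)

# One-step-ahead prediction of a toy periodic series (noisy sine).
t = np.arange(400)
series = np.sin(2 * np.pi * t / 25) + 0.05 * rng.standard_normal(len(t))

# Random reservoir, rescaled so its spectral radius is below 1
# (a standard sufficient condition for the echo state property).
n_res = 100
W = rng.standard_normal((n_res, n_res))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))
W_in = rng.uniform(-0.5, 0.5, n_res)

states = np.zeros((len(series), n_res))
x = np.zeros(n_res)
for i, u in enumerate(series):
    x = np.tanh(W @ x + W_in * u)   # leaky-free reservoir update
    states[i] = x

# Ridge readout after discarding a washout period (the paper fits its
# spatial coefficients with elastic net; ridge keeps the sketch short).
washout, lam = 50, 1e-6
S, y = states[washout:-1], series[washout + 1:]
w_out = np.linalg.solve(S.T @ S + lam * np.eye(n_res), S.T @ y)
pred = S @ w_out
nrmse = np.sqrt(np.mean((pred - y) ** 2)) / np.std(y)
```

Only `w_out` is trained; the reservoir weights stay random, which is what makes echo state networks so cheap compared with fully trained recurrent models.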
9. Ye H, Li H, Cao F, Zhang L. A Hybrid Truncated Norm Regularization Method for Matrix Completion. IEEE Transactions on Image Processing 2019; 28:5171-5186. PMID: 31170070. DOI: 10.1109/tip.2019.2918733.
Abstract
Matrix completion has been widely used in image processing, where the popular approach is to formulate the problem as general low-rank matrix approximation. This paper proposes a novel regularization method referred to as the truncated Frobenius norm (TFN) and presents a hybrid truncated norm (HTN) model combining the truncated nuclear norm and the truncated Frobenius norm for solving matrix completion problems. To solve this model, a simple and effective two-step iteration algorithm is designed, and an adaptive way to change the penalty parameter is introduced to reduce the computational cost. The convergence of the proposed method is also discussed and proved mathematically. The proposed approach not only effectively improves recovery performance but also greatly promotes the stability of the model, eliminating the large variations that arise when estimating complex models. Experimental results on synthetic data, real-world images, and recommendation systems, supported by a statistical analysis strategy, verify that the proposed method is more stable and effective than other state-of-the-art approaches.
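A simple relative of such truncated-norm methods is hard-impute matrix completion: alternately replace the missing entries with a rank-r SVD approximation. The sketch below is that baseline on synthetic low-rank data, not the paper's two-step HTN algorithm; sizes, rank, and the sampling rate are illustrative.

```python
import numpy as np

rng = np.random.default_rng(4)

# Ground-truth low-rank matrix with roughly half of the entries observed.
m, n, r = 40, 30, 3
M = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))
mask = rng.random((m, n)) < 0.5

def complete(M_obs, mask, rank, iters=200):
    """Hard-impute: alternately fill missing entries with the current
    rank-r SVD approximation while keeping observed entries fixed."""
    X = np.where(mask, M_obs, 0.0)
    for _ in range(iters):
        U, s, Vt = np.linalg.svd(X, full_matrices=False)
        low = (U[:, :rank] * s[:rank]) @ Vt[:rank]   # best rank-r approximation
        X = np.where(mask, M_obs, low)               # impute only the missing part
    return X

X_hat = complete(M, mask, r)
rel_err = np.linalg.norm(X_hat - M) / np.linalg.norm(M)
```

With noiseless rank-3 data and enough observed entries, the iteration drives the relative recovery error close to zero.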
10.

11. Li H, Zhao H, Li H. Neural-Response-Based Extreme Learning Machine for Image Classification. IEEE Transactions on Neural Networks and Learning Systems 2019; 30:539-552. PMID: 29994407. DOI: 10.1109/tnnls.2018.2845857.
Abstract
This paper proposes a novel and simple multilayer feature learning method for image classification employing the extreme learning machine (ELM). The proposed algorithm is composed of two stages: the multilayer ELM (ML-ELM) feature mapping stage and the ELM learning stage. The ML-ELM feature mapping stage is built recursively by alternating between feature map construction and maximum pooling. In particular, the input weights for constructing feature maps are randomly generated and hence need not be trained or tuned, which makes the algorithm highly efficient, while the maximum pooling operation makes it invariant to certain transformations. During the ELM learning stage, elastic-net regularization is used to learn more compact and meaningful output weights. In addition, we preprocess the input data with the dense scale-invariant feature transform (SIFT) operation to improve both the robustness and invariance of the algorithm. To evaluate the effectiveness of the proposed method, experiments are conducted on three challenging databases. Compared with conventional deep learning methods and other related approaches, the proposed method achieves the best classification results with high computational efficiency.
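The ELM recipe itself is compact: a random, untrained hidden layer followed by a single linear solve for the output weights. The sketch below uses ridge regression (the L2 half of elastic net) on a synthetic two-moons problem; the dataset, hidden-layer size, and regularization weight are all illustrative stand-ins, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(5)

# Synthetic two-moons data (a stand-in for image features).
n = 400
theta = rng.uniform(0, np.pi, n)
X0 = np.c_[np.cos(theta), np.sin(theta)] + 0.1 * rng.standard_normal((n, 2))
X1 = np.c_[1 - np.cos(theta), 0.5 - np.sin(theta)] + 0.1 * rng.standard_normal((n, 2))
X = np.vstack([X0, X1])
y = np.array([-1.0] * n + [1.0] * n)

# ELM: a random, untrained hidden layer; only the output weights are learned.
h = 200
W_in = rng.standard_normal((2, h))
b = rng.standard_normal(h)
H = np.tanh(X @ W_in + b)

# One linear solve for the output weights (ridge here; the paper uses elastic net).
lam = 1e-2
beta = np.linalg.solve(H.T @ H + lam * np.eye(h), H.T @ y)
acc = np.mean(np.sign(H @ beta) == y)
```

Because the hidden weights are fixed, training reduces to a regularized least-squares problem, which is where the choice of penalty (ridge vs. elastic net) enters.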
12. Wang X, Zhen X, Li Q, Shen D, Huang H. Cognitive Assessment Prediction in Alzheimer's Disease by Multi-Layer Multi-Target Regression. Neuroinformatics 2018; 16:285-294. PMID: 29802511. PMCID: PMC6378694. DOI: 10.1007/s12021-018-9381-1.
Abstract
Accurate and automatic prediction of cognitive assessment from multiple neuroimaging biomarkers is crucial for early detection of Alzheimer's disease. The major challenges arise from the nonlinear relationship between biomarkers and assessment scores and the inter-correlation among them, which have not yet been well addressed. In this paper, we propose multi-layer multi-target regression (MMR), which enables simultaneously modeling intrinsic inter-target correlations and nonlinear input-output relationships in a general compositional framework. Specifically, through kernelized dictionary learning, the MMR can effectively handle the highly nonlinear relationship between biomarkers and assessment scores; through robust low-rank linear learning via matrix elastic nets, the MMR can explicitly encode inter-correlations among multiple assessment scores; moreover, the MMR is flexible and can work with the non-smooth ℓ2,1-norm loss function, which enables calibration of multiple targets with disparate noise levels for more robust parameter estimation. The MMR can be solved efficiently by an alternating optimization algorithm via gradient descent with guaranteed convergence. The MMR has been evaluated by extensive experiments on the ADNI database with MRI data, and produced high accuracy surpassing previous regression models, which demonstrates its great effectiveness as a new multi-target regression model for clinical multivariate prediction.
Affiliation(s)
- Xiaoqian Wang
- Department of Electrical, Computer Engineering, University of Pittsburgh, Pennsylvania, PA 15263, USA
- Xiantong Zhen
- Department of Electrical, Computer Engineering, University of Pittsburgh, Pennsylvania, PA 15263, USA
- Quanzheng Li
- Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, MA 02114, USA
- Dinggang Shen
- Radiology and BRIC, UNC-CH School of Medicine, 130 Mason Farm Road, Chapel Hill, NC 27599, USA
- Heng Huang
- Department of Electrical, Computer Engineering, University of Pittsburgh, Pennsylvania, PA 15263, USA
13. Zhen X, Yu M, He X, Li S. Multi-Target Regression via Robust Low-Rank Learning. IEEE Transactions on Pattern Analysis and Machine Intelligence 2018; 40:497-504. PMID: 28368816. DOI: 10.1109/tpami.2017.2688363.
Abstract
Multi-target regression has recently regained great popularity due to its capability of simultaneously learning multiple relevant regression tasks and its wide applications in data mining, computer vision and medical image analysis. Great challenges, however, arise from jointly handling inter-target correlations and input-output relationships. In this paper, we propose Multi-layer Multi-target Regression (MMR), which enables simultaneously modeling intrinsic inter-target correlations and nonlinear input-output relationships in a general framework via robust low-rank learning. Specifically, the MMR explicitly encodes inter-target correlations in a structure matrix by matrix elastic nets (MEN); it works in conjunction with the kernel trick to effectively disentangle highly complex nonlinear input-output relationships; and it can be efficiently solved by a new alternating optimization algorithm with guaranteed convergence. The MMR leverages the strength of kernel methods for nonlinear feature learning and the structural advantage of multi-layer learning architectures for inter-target correlation modeling. More importantly, it offers a new multi-layer learning paradigm for multi-target regression endowed with high generality, flexibility and expressive ability. Extensive experimental evaluation on 18 diverse real-world datasets demonstrates that MMR achieves consistently high performance and outperforms representative state-of-the-art algorithms, which shows its great effectiveness and generality for multivariate prediction.
14. Zhao X, Li X, Zhang Z, Shen C, Zhuang Y, Gao L, Li X. Scalable Linear Visual Feature Learning via Online Parallel Nonnegative Matrix Factorization. IEEE Transactions on Neural Networks and Learning Systems 2016; 27:2628-2642. PMID: 26625429. DOI: 10.1109/tnnls.2015.2499273.
Abstract
Visual feature learning, which aims to construct an effective feature representation for visual data, has a wide range of applications in computer vision. It is often posed as a problem of nonnegative matrix factorization (NMF), which constructs a linear representation for the data. Although NMF is typically parallelized for efficiency, traditional parallelization methods suffer from either an expensive computation or a high runtime memory usage. To alleviate this problem, we propose a parallel NMF method called alternating least square block decomposition (ALSD), which efficiently solves a set of conditionally independent optimization subproblems based on a highly parallelized fine-grained grid-based blockwise matrix decomposition. By assigning each block optimization subproblem to an individual computing node, ALSD can be effectively implemented in a MapReduce-based Hadoop framework. In order to cope with dynamically varying visual data, we further present an incremental version of ALSD, which is able to incrementally update the NMF solution with a low computational cost. Experimental results demonstrate the efficiency and scalability of the proposed methods as well as their applications to image clustering and image retrieval.
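For reference, the factorization being parallelized can be sketched with the classic Lee-Seung multiplicative updates, a serial NMF baseline. ALSD's contribution is the blockwise parallel decomposition, which this sketch does not attempt; the data and rank are synthetic stand-ins.

```python
import numpy as np

rng = np.random.default_rng(6)

# Nonnegative data with exact rank-5 structure.
m, n, k = 60, 40, 5
V = rng.random((m, k)) @ rng.random((k, n))

def nmf(V, k, iters=500, eps=1e-9):
    """Lee-Seung multiplicative updates for V ~= W @ H with W, H >= 0.
    A serial baseline; ALSD block-parallelizes the same factorization."""
    r = np.random.default_rng(0)
    W, H = r.random((V.shape[0], k)), r.random((k, V.shape[1]))
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + eps)   # update H, keeping nonnegativity
        W *= (V @ H.T) / (W @ H @ H.T + eps)   # update W, keeping nonnegativity
    return W, H

W, H = nmf(V, k)
rel_err = np.linalg.norm(W @ H - V) / np.linalg.norm(V)
```

The multiplicative form of the updates is what preserves nonnegativity without any explicit projection step.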
15. Li D, Zhu Y, Wang Z, Chong C, Gao D. Regularized Matrix-Pattern-Oriented Classification Machine with Universum. Neural Process Lett 2016. DOI: 10.1007/s11063-016-9567-1.
16. Kim E, Lee M, Oh S. Robust Elastic-Net Subspace Representation. IEEE Transactions on Image Processing 2016; 25:4245-4259. PMID: 27411222. DOI: 10.1109/tip.2016.2588321.
Abstract
Recently, finding the low-dimensional structure of high-dimensional data has gained much attention. Given a set of data points sampled from a single subspace or a union of subspaces, the goal is to learn or capture the underlying subspace structure of the data set. In this paper, we propose elastic-net subspace representation, a new subspace representation framework using elastic-net regularization of singular values. Due to the strong convexity enforced by elastic-net, the proposed method is more stable and robust in the presence of heavy corruptions compared with existing lasso-type rank minimization approaches. For discovering a single low-dimensional subspace, we propose a computationally efficient low-rank factorization algorithm, called FactEN, using a property of the nuclear norm and the augmented Lagrangian method. Then, ClustEN is proposed to handle the general case, in which the data samples are drawn from a union of multiple subspaces, for joint subspace clustering and estimation. The proposed algorithms are applied to a number of subspace representation problems to evaluate the robustness and efficiency under various noisy conditions, and experimental results show the benefits of the proposed method compared with existing methods.
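The core operation, elastic-net regularization of singular values, has a closed-form proximal step: soft-threshold each singular value by λ, then shrink by 1/(1+γ). The sketch below implements just that step on noisy low-rank data; it is one building block, not the full FactEN/ClustEN algorithms, and λ, γ are illustrative.

```python
import numpy as np

rng = np.random.default_rng(7)

def sv_elastic_prox(Y, lam, gamma):
    """Closed-form proximal step for elastic-net-regularized singular values:
    argmin_X 0.5*||X - Y||_F^2 + lam*sum(sigma_i(X)) + 0.5*gamma*sum(sigma_i(X)^2).
    Each singular value of Y is soft-thresholded by lam, then scaled by 1/(1+gamma)."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    s_new = np.maximum(s - lam, 0.0) / (1.0 + gamma)
    return (U * s_new) @ Vt, s_new

# A rank-2 matrix plus small noise: shrinkage suppresses the noisy tail spectrum
# while the strong convexity from gamma stabilizes the surviving components.
A = rng.standard_normal((30, 2)) @ rng.standard_normal((2, 20))
Y = A + 0.05 * rng.standard_normal((30, 20))
X, s_new = sv_elastic_prox(Y, lam=1.0, gamma=0.1)
```

Since the noise singular values fall below λ, the proximal step recovers an exactly rank-2 matrix while leaving the dominant directions almost intact.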
17. Tang Y, Yuan Y. Image Pair Analysis With Matrix-Value Operator. IEEE Transactions on Cybernetics 2015; 45:2042-2050. PMID: 25616089. DOI: 10.1109/tcyb.2014.2363882.
Abstract
Image pair analysis provides a significant image-pair prior that describes the dependency between training image pairs for various learning-based image processing tasks. To avoid the information loss caused by vectorizing training images, a novel matrix-value operator learning method is proposed for image pair analysis. Sample-dependent operators, which we name image pair operators (IPOs), are employed to represent the local image-to-image dependency defined by each training image pair. A linear combination of IPOs is learned via operator regression to represent the global dependency between input and output images defined by all of the training image pairs. The proposed method preserves the image-level information of training image pairs because IPOs allow training images to be used without vectorization during learning and testing. By applying the proposed algorithm to learning-based super-resolution, its efficiency and effectiveness in learning image pair information are verified experimentally.
18.

19. Extreme learning machine for ranking: Generalization analysis and applications. Neural Netw 2014; 53:119-26. DOI: 10.1016/j.neunet.2014.01.015.
20. Kaban A. Fractional norm regularization: learning with very few relevant features. IEEE Transactions on Neural Networks and Learning Systems 2013; 24:953-963. PMID: 24808476. DOI: 10.1109/tnnls.2013.2247417.
Abstract
Learning in the presence of a large number of irrelevant features is an important problem in high-dimensional tasks. Previous studies have shown that L1-norm regularization can be effective in such cases while L2-norm regularization is not. Furthermore, work in compressed sensing suggests that regularization by nonconvex (e.g., fractional) semi-norms may outperform L1-regularization. However, for classification it is largely unclear when this may or may not be the case. In addition, the nonconvex problem is harder to solve than the convex L1 problem. In this paper, we provide a more in-depth analysis to elucidate the potential advantages and pitfalls of nonconvex regularization in the context of logistic regression where the regularization term employs the family of Lq semi-norms. First, using results from the phenomenon of concentration of norms and distances in high dimensions, we gain intuition about the working of sparse estimation when the dimensionality is very high. Second, using the probably approximately correct (PAC)-Bayes methodology, we give a data-dependent bound on the generalization error of Lq-regularized logistic regression, which is applicable to any algorithm that implements this model, and may be used to predict its generalization behavior from the training set alone. Third, we demonstrate the usefulness of our approach by experiments and applications, where the PAC-Bayes bound is used to guide the choice of semi-norm in the regularization term. The results support the conclusion that the optimal choice of regularization depends on the relative fraction of relevant versus irrelevant features, and a fractional norm with a small exponent is most suitable when the fraction of relevant features is very small.
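A quick numeric check makes the intuition concrete: for two weight vectors of equal L2 norm, one sparse and one dense, the L2 penalty cannot tell them apart, L1 mildly prefers the sparse one, and a fractional semi-norm (here q = 0.5) prefers it far more strongly, which is why small exponents help when only a few features are relevant. The vectors are hand-made illustrations.

```python
import numpy as np

def lq_penalty(w, q):
    """sum_j |w_j|^q -- a norm for q >= 1, a semi-norm for 0 < q < 1."""
    return float(np.sum(np.abs(w) ** q))

# Two weight vectors with identical L2 norm: one sparse, one spread out.
sparse = np.array([2.0, 0.0, 0.0, 0.0])
dense = np.array([1.0, 1.0, 1.0, 1.0])

p2_s, p2_d = lq_penalty(sparse, 2), lq_penalty(dense, 2)       # 4.0 vs 4.0
p1_s, p1_d = lq_penalty(sparse, 1), lq_penalty(dense, 1)       # 2.0 vs 4.0
ph_s, ph_d = lq_penalty(sparse, 0.5), lq_penalty(dense, 0.5)   # ~1.41 vs 4.0
```

The sparse-to-dense penalty ratio drops from 1 (q = 2) to 0.5 (q = 1) to about 0.35 (q = 0.5): the smaller the exponent, the harder the regularizer pushes mass onto few coordinates.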
21. Deng Y, Dai Q, Liu R, Zhang Z, Hu S. Low-rank structure learning via nonconvex heuristic recovery. IEEE Transactions on Neural Networks and Learning Systems 2013; 24:383-396. PMID: 24808312. DOI: 10.1109/tnnls.2012.2235082.
Abstract
In this paper, we propose a nonconvex framework to learn the essential low-rank structure from corrupted data. Different from traditional approaches, which directly utilize convex norms to measure sparseness, our method introduces more reasonable nonconvex measurements to enhance sparsity in both the intrinsic low-rank structure and the sparse corruptions. We introduce, respectively, how to combine the widely used ℓp norm (0 < p < 1) and the log-sum term into the framework of low-rank structure learning. Although the proposed optimization is no longer convex, it can still be solved effectively by a majorization-minimization (MM)-type algorithm, in which the nonconvex objective function is iteratively replaced by its convex surrogate, so that the nonconvex problem finally falls into the general framework of reweighted approaches. We prove that the MM-type algorithm converges to a stationary point after successive iterations. The proposed model is applied to two typical problems: robust principal component analysis and low-rank representation. Experimental results on low-rank structure learning demonstrate that our nonconvex heuristic methods, especially the log-sum heuristic recovery algorithm, generally perform much better than convex-norm-based methods for both data with higher rank and data with denser corruptions.
22. Hancock T, Mamitsuka H. Boosted network classifiers for local feature selection. IEEE Transactions on Neural Networks and Learning Systems 2012; 23:1767-1778. PMID: 24808071. DOI: 10.1109/tnnls.2012.2214057.
Abstract
Like all models, network feature selection models require assumptions on the size and structure of the desired features. The most common assumption is sparsity, where only a small section of the entire network is thought to produce a specific phenomenon. The sparsity assumption is enforced through regularized models such as the lasso. However, assuming sparsity may be inappropriate for many real-world networks, which possess highly correlated modules. In this paper, we illustrate two novel optimization strategies, boosted expectation propagation (BEP) and boosted message passing (BMP), which directly use the network structure to estimate the parameters of a network classifier. BEP and BMP are ensemble methods that seek to optimize classification performance by combining individual models built upon local network features. Neither BEP nor BMP assumes a sparse solution; instead they seek a weighted average of all network features, with the weights emphasizing the features that are useful for classification. We compare BEP and BMP with network-regularized logistic regression models on simulated and real biological networks. The results show that, where highly correlated network structure exists, assuming sparsity adversely affects the accuracy and feature selection power of the network classifier.