1
Zhong G, Xiao Y, Liu B, Zhao L, Kong X. Ordinal Regression With Pinball Loss. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024; 35:11246-11260. [PMID: 37030787 DOI: 10.1109/tnnls.2023.3258464] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/19/2023]
Abstract
Ordinal regression (OR) aims to solve multiclass classification problems with ordinal classes. Support vector OR (SVOR) is a typical OR algorithm and has been extensively used in OR problems. In this article, based on the characteristics of OR problems, we propose a novel pinball loss function and present an SVOR method with pinball loss (pin-SVOR). Pin-SVOR is fundamentally different from traditional SVOR with hinge loss. Traditional SVOR employs the hinge loss function, and the classifier is determined by only a few data points near the class boundary, called support vectors, which may lead to a noise sensitive and re-sampling unstable classifier. Distinctively, pin-SVOR employs the pinball loss function. It attaches an extra penalty to correctly classified data that lies inside the class, such that all the training data is involved in deciding the classifier. The data near the middle of each class has a small penalty, and that near the class boundary has a large penalty. Thus, the training data tend to lie near the middle of each class instead of on the class boundary, which leads to scatter minimization in the middle of each class and noise insensitivity. The experimental results show that pin-SVOR has better classification performance than state-of-the-art OR methods.
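As a rough illustration of the distinction the abstract draws between hinge and pinball losses, the following NumPy sketch evaluates a pinball-style margin loss; the slope parameter tau and the threshold at 1 are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def pinball_loss(margin, tau=0.1):
    """Pinball-style loss on the signed margin to a class threshold.

    Unlike the hinge loss, which is exactly zero for all margins >= 1,
    this loss keeps a small slope tau on the correctly classified side,
    so every training point influences the solution (tau is an assumed
    illustrative parameter, not necessarily the paper's notation).
    """
    margin = np.asarray(margin, dtype=float)
    return np.where(margin < 1.0, 1.0 - margin, tau * (margin - 1.0))

# Violated margins get a hinge-like penalty; well-classified points still pay a little.
print(pinball_loss([-0.5, 0.5, 1.0, 3.0]))   # [1.5 0.5 0.  0.2]
```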
2
Alsharari F, Saber Y, Alohali H, Alqahtani MH, Ebodey M, Elmasry T, Alsharif J, Soliman AF, Smarandache F, Sikander F. On stratified single-valued soft topogenous structures. Heliyon 2024; 10:e27926. [PMID: 39670082 PMCID: PMC11636833 DOI: 10.1016/j.heliyon.2024.e27926] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/22/2023] [Revised: 03/03/2024] [Accepted: 03/08/2024] [Indexed: 12/14/2024] Open
Abstract
This paper presents novel concepts including stratified single-valued neutrosophic soft topogenous (stratified svns-topogenous), stratified single-valued neutrosophic soft filter (stratified svns-filter), stratified single-valued neutrosophic soft quasi uniformity (stratified svnsq-uniformity) and stratified single-valued neutrosophic soft quasi proximity (stratified svnsq-proximity). Additionally, we present the idea of single-valued neutrosophic soft topogenous structures, formed by integrating svns-topogenous with svns-filter, and discuss their properties. Furthermore, we explore the connections between these single-valued neutrosophic soft topological structures and their corresponding stratifications.
Affiliation(s)
- Fahad Alsharari
- Department of Mathematics, College of Science, Jouf University, Sakaka 72311, Saudi Arabia
- Yaser Saber
- Department of Mathematics, College of Science Al-Zulfi, Majmaah University, P. O. Box 66, Al-Majmaah 11952, Saudi Arabia
- Department of Mathematics, Faculty of Science, Al-Azhar University, Assiut, 71524, Egypt
- Hanan Alohali
- Department of Mathematics, College of Science, King Saud University, P.O. Box 2455, Riyadh 11451, Saudi Arabia
- Mesfer H. Alqahtani
- Department of Mathematics, University College of Umluj, University of Tabuk, Tabuk 48322, Saudi Arabia
- Mubarak Ebodey
- Department of Business Administration, Faculty of Science and Humanities at Hotat Sudair, Majmaah University, 11952, Riyadh, Saudi Arabia
- Tawfik Elmasry
- Department of Business Administration, Faculty of Science and Humanities at Hotat Sudair, Majmaah University, 11952, Riyadh, Saudi Arabia
- Jafar Alsharif
- Department of Business Administration, Faculty of Science and Humanities at Hotat Sudair, Majmaah University, 11952, Riyadh, Saudi Arabia
- Amal F. Soliman
- Department of Mathematics, College of Science and Humanities in Alkharj, Prince Sattam Bin Abdulaziz University, Alkharj, Saudi Arabia
- Department of Basic Science, Benha Faculty of Engineering, Benha University, Banha, Egypt
- Fahad Sikander
- Department of Basic Sciences, College of Science and Theoretical Studies, Saudi Electronic University, Jeddah 23442, Saudi Arabia
3
Lu B, Wu D, Qin Z, Wang L. Privacy-Preserving Indoor Trajectory Matching with IoT Devices. SENSORS (BASEL, SWITZERLAND) 2023; 23:4029. [PMID: 37112370 PMCID: PMC10146115 DOI: 10.3390/s23084029] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 03/20/2023] [Revised: 04/12/2023] [Accepted: 04/13/2023] [Indexed: 06/19/2023]
Abstract
With the rapid development of the Internet of Things (IoT) technology, Wi-Fi signals have been widely used for trajectory signal acquisition. Indoor trajectory matching aims to monitor encounters between people and to support trajectory analysis in indoor environments. Due to constraints on the computational abilities of IoT devices, the computation of indoor trajectory matching requires the assistance of a cloud platform, which raises privacy concerns. Therefore, this paper proposes a trajectory-matching calculation method that supports ciphertext operations. Hash algorithms and homomorphic encryption are selected to ensure the security of different private data, and the actual trajectory similarity is determined based on correlation coefficients. However, due to obstacles and other interference in indoor environments, the original collected data may be missing in certain stages. Therefore, this paper also completes the missing values on ciphertexts through mean, linear regression, and KNN algorithms. These algorithms can predict the missing parts of the ciphertext dataset, and the accuracy of the completed dataset can reach over 97%. This paper provides original and completed datasets for matching calculations, and demonstrates their high feasibility and effectiveness in practical applications from the perspective of calculation time and accuracy loss.
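A plaintext sketch of two ingredients the abstract combines, KNN-based completion of missing trajectory values and correlation-based similarity, is shown below; the paper performs these steps on ciphertexts under homomorphic encryption, which is not reproduced here, and the toy RSSI values are assumptions.

```python
import numpy as np
from sklearn.impute import KNNImputer

# Two toy Wi-Fi RSSI trajectories with gaps caused by obstacles (NaN = missing value).
traj_a = np.array([-40.0, -42.0, np.nan, -47.0, -50.0, -55.0])
traj_b = np.array([-41.0, np.nan, -45.0, -48.0, -51.0, -54.0])

# Plaintext analogue of the KNN completion step; the paper runs this on ciphertexts.
stacked = np.vstack([traj_a, traj_b]).T          # time steps as rows
filled = KNNImputer(n_neighbors=2).fit_transform(stacked).T

# Trajectory similarity via the Pearson correlation coefficient.
similarity = np.corrcoef(filled[0], filled[1])[0, 1]
print(round(similarity, 3))
```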
4
Li Y, Huang X, Zhao C, Ding P. A novel remaining useful life prediction method based on multi-support vector regression fusion and adaptive weight updating. ISA TRANSACTIONS 2022; 131:444-459. [PMID: 35581022 DOI: 10.1016/j.isatra.2022.04.042] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 11/12/2021] [Revised: 04/23/2022] [Accepted: 04/23/2022] [Indexed: 06/15/2023]
Abstract
Remaining useful life prediction is of great significance in preventing equipment malfunctions and reducing maintenance costs. Currently, machine learning algorithms have become hotspots in remaining useful life prediction due to their high flexibility and convenience. However, machine learning methods require large amounts of data, and their prediction performance depends heavily on the selection of hyper-parameters. To overcome these shortcomings, a novel remaining useful life prediction method for small-sample cases is proposed based on multi-support vector regression fusion. In the offline training phase, the fusion model is established, consisting of multiple support vector regression sub-models. To obtain the optimal sub-model parameters, the Bayesian optimization algorithm is applied and an improved optimization target is formulated with various metrics describing regression and prediction performance. In the online prediction phase, an adaptive weight updating algorithm based on dynamic time warping is developed to measure the fitness of each sub-model and determine the corresponding weight value. The C-MAPSS engine dataset is used to test the performance of the proposed method, along with some existing machine learning methods for comparison. The proposed method requires only 30% of the training data sample to achieve high accuracy, with a root mean square error of 14.98, which is superior to other state-of-the-art methods. The results demonstrate the superiority of the proposed method.
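The following sketch illustrates the general idea of fusing several SVR sub-models on toy degradation data; the fixed hyper-parameters and the inverse-error weighting are stand-ins for the paper's Bayesian optimization and dynamic-time-warping-based weights.

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)
t = np.linspace(0, 1, 120).reshape(-1, 1)
health = 1.0 - t.ravel() ** 2 + 0.02 * rng.standard_normal(120)   # toy degradation signal

# Offline phase: several SVR sub-models with different hyper-parameters
# (the paper tunes them with Bayesian optimization; fixed values here).
subs = [SVR(kernel="rbf", C=c, gamma=g).fit(t, health)
        for c, g in [(1, 1), (10, 5), (100, 20)]]

# Online phase: weight each sub-model by how well it tracks recent data;
# the paper derives these weights from dynamic time warping distances instead.
recent_t, recent_y = t[-20:], health[-20:]
errors = np.array([np.mean(np.abs(m.predict(recent_t) - recent_y)) for m in subs])
weights = (1 / (errors + 1e-9)) / np.sum(1 / (errors + 1e-9))

fused = sum(w * m.predict(t[-1:]) for w, m in zip(weights, subs))
print(fused.item())   # fused prediction at the latest time step
```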
Affiliation(s)
- Yuxiong Li
- School of Mechanical Engineering and Automation, Northeastern University, Shenyang, 110819, PR China
- Xianzhen Huang
- School of Mechanical Engineering and Automation, Northeastern University, Shenyang, 110819, PR China; Key Laboratory of Vibration and Control of Aero-Propulsion Systems Ministry of Education of China, Northeastern University, Shenyang, 110819, PR China
- Chengying Zhao
- School of Mechanical Engineering and Automation, Northeastern University, Shenyang, 110819, PR China
- Pengfei Ding
- School of Mechanical Engineering and Automation, Northeastern University, Shenyang, 110819, PR China
5
Zhao H, Wang H, Fu Y, Wu F, Li X. Memory-Efficient Class-Incremental Learning for Image Classification. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2022; 33:5966-5977. [PMID: 33939615 DOI: 10.1109/tnnls.2021.3072041] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/12/2023]
Abstract
With the memory-resource-limited constraints, class-incremental learning (CIL) usually suffers from the "catastrophic forgetting" problem when updating the joint classification model on the arrival of newly added classes. To cope with the forgetting problem, many CIL methods transfer the knowledge of old classes by preserving some exemplar samples into the size-constrained memory buffer. To utilize the memory buffer more efficiently, we propose to keep more auxiliary low-fidelity exemplar samples, rather than the original real-high-fidelity exemplar samples. Such a memory-efficient exemplar preserving scheme makes the old-class knowledge transfer more effective. However, the low-fidelity exemplar samples are often distributed in a different domain away from that of the original exemplar samples, that is, a domain shift. To alleviate this problem, we propose a duplet learning scheme that seeks to construct domain-compatible feature extractors and classifiers, which greatly narrows down the above domain gap. As a result, these low-fidelity auxiliary exemplar samples have the ability to moderately replace the original exemplar samples with a lower memory cost. In addition, we present a robust classifier adaptation scheme, which further refines the biased classifier (learned with the samples containing distillation label knowledge about old classes) with the help of the samples of pure true class labels. Experimental results demonstrate the effectiveness of this work against the state-of-the-art approaches. We will release the code, baselines, and training statistics for all models to facilitate future research.
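A minimal sketch of the memory trade-off described here, assuming simple spatial downsampling as the low-fidelity representation (the paper's actual low-fidelity scheme, duplet learning, and classifier adaptation are not reproduced):

```python
import numpy as np

def build_exemplar_buffer(images, memory_bytes, downsample=4):
    """Store low-fidelity (spatially downsampled) exemplars so that more of them
    fit into a fixed memory budget than full-resolution copies would."""
    low = images[:, ::downsample, ::downsample, :]   # low-fidelity copies
    capacity = memory_bytes // low[0].nbytes         # how many fit in the budget
    return low[:capacity]

imgs = np.zeros((1000, 32, 32, 3), dtype=np.uint8)   # toy stand-in for old-class images
buf = build_exemplar_buffer(imgs, memory_bytes=64_000)
print(buf.shape)   # ~333 8x8 exemplars fit where only ~20 full 32x32 images would
```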
6
Zhu H, Shan H, Zhang Y, Che L, Xu X, Zhang J, Shi J, Wang FY. Convolutional Ordinal Regression Forest for Image Ordinal Estimation. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2022; 33:4084-4095. [PMID: 33600323 DOI: 10.1109/tnnls.2021.3055816] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/12/2023]
Abstract
Image ordinal estimation is to predict the ordinal label of a given image, which can be categorized as an ordinal regression (OR) problem. Recent methods formulate an OR problem as a series of binary classification problems. Such methods cannot ensure that the global ordinal relationship is preserved since the relationships among different binary classifiers are neglected. We propose a novel OR approach, termed convolutional OR forest (CORF), for image ordinal estimation, which can integrate OR and differentiable decision trees with a convolutional neural network for obtaining precise and stable global ordinal relationships. The advantages of the proposed CORF are twofold. First, instead of learning a series of binary classifiers independently, the proposed method aims at learning an ordinal distribution for OR by optimizing those binary classifiers simultaneously. Second, the differentiable decision trees in the proposed CORF can be trained together with the ordinal distribution in an end-to-end manner. The effectiveness of the proposed CORF is verified on two image ordinal estimation tasks, i.e., facial age estimation and image esthetic assessment, showing significant improvements and better stability over the state-of-the-art OR methods.
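The reduction the abstract refers to, turning an ordinal label into K-1 cumulative binary targets, can be sketched as follows; CORF itself additionally couples these outputs through differentiable decision trees, which is not shown here.

```python
import numpy as np

def to_cumulative_targets(labels, num_classes):
    """Encode ordinal label y as K-1 binary targets: t_k = 1 if y > k."""
    labels = np.asarray(labels)
    return (labels[:, None] > np.arange(num_classes - 1)[None, :]).astype(int)

def from_cumulative_scores(scores):
    """Decode by counting thresholds passed (scores assumed in [0, 1])."""
    return np.sum(np.asarray(scores) > 0.5, axis=1)

y = np.array([0, 2, 3])
T = to_cumulative_targets(y, num_classes=4)
print(T)                          # [[0 0 0] [1 1 0] [1 1 1]]
print(from_cumulative_scores(T))  # recovers [0 2 3]
```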
7
Yu H, Lu J, Zhang G. Continuous Support Vector Regression for Nonstationary Streaming Data. IEEE TRANSACTIONS ON CYBERNETICS 2022; 52:3592-3605. [PMID: 32915757 DOI: 10.1109/tcyb.2020.3015266] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/11/2023]
Abstract
Quadratic programming is the process of solving a special type of mathematical optimization problem. Recent advances in online solutions for quadratic programming problems (QPPs) have created opportunities to widen the scope of applications for support vector regression (SVR). In this vein, efforts to make SVR compatible with streaming data have been met with substantial success. However, streaming data with concept drift remain problematic because the trained prediction function in SVR tends to drift as the data distribution drifts. Aiming to contribute a solution to this aspect of SVR's advancement, we have developed continuous SVR (C-SVR) to solve regression problems with nonstationary streaming data, that is, data where the optimal input-output prediction function can drift over time. The basic idea of C-SVR is to continuously learn a series of input-output functions over a series of time windows to make predictions about different periods. However, strikingly, the learning process in different time windows is not independent. An additional similarity term in the QPP, which is solved incrementally, threads the various input-output functions together by conveying some learned knowledge through consecutive time windows. How much learned knowledge is transferred is determined by the extent of the concept drift. Experimental evaluations with both synthetic and real-world datasets indicate that C-SVR has better performance than most existing methods for nonstationary streaming data regression.
8
Reis MS, Jiang B. Predicting the lifetime of Lithium–Ion batteries: Integrated feature extraction and modeling through sequential Unsupervised-Supervised Projections (USP). Chem Eng Sci 2022. [DOI: 10.1016/j.ces.2022.117510] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/03/2022]
9
Chen H, Jia Y, Ge J, Gu B. Incremental learning algorithm for large-scale semi-supervised ordinal regression. Neural Netw 2022; 149:124-136. [DOI: 10.1016/j.neunet.2022.02.004] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/14/2020] [Revised: 11/24/2021] [Accepted: 02/06/2022] [Indexed: 11/25/2022]
10
Gu B, Geng X, Li X, Shi W, Zheng G, Deng C, Huang H. Scalable Kernel Ordinal Regression via Doubly Stochastic Gradients. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2021; 32:3677-3689. [PMID: 32857699 DOI: 10.1109/tnnls.2020.3015937] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/11/2023]
Abstract
Ordinal regression (OR) is one of the most important machine learning tasks. The kernel method is a major technique to achieve nonlinear OR. However, traditional kernel OR solvers are inefficient due to increased complexity introduced by multiple ordinal thresholds as well as the cost of kernel computation. Doubly stochastic gradient (DSG) is a very efficient and scalable kernel learning algorithm that combines random feature approximation with stochastic functional optimization. However, the theory and algorithm of DSG can only support optimization tasks within the unique reproducing kernel Hilbert space (RKHS), which is not suitable for OR problems where the multiple ordinal thresholds usually lead to multiple RKHSs. To address this problem, we construct a kernel whose RKHS can contain the decision function with multiple thresholds. Based on this new kernel, we further propose a novel DSG-like algorithm, DSGOR. In each iteration of DSGOR, we update the decision functional as well as the function bias with appropriately set learning rates for each. Our theoretic analysis shows that DSGOR can achieve O(1/t) convergence rate, which is as good as DSG, even though dealing with a much harder problem. Extensive experimental results demonstrate that our algorithm is much more efficient than traditional kernel OR solvers, especially on large-scale problems.
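The scalability ingredient behind doubly stochastic kernel methods, random feature approximation combined with stochastic gradient updates, can be illustrated with scikit-learn as below; this is plain binary classification, not the paper's ordinal-threshold formulation or its DSGOR algorithm.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.kernel_approximation import RBFSampler
from sklearn.linear_model import SGDClassifier

# Random Fourier features + mini-batch stochastic gradients on a toy problem.
X, y = make_classification(n_samples=20000, n_features=30, random_state=0)
Z = RBFSampler(gamma=0.1, n_components=300, random_state=0).fit_transform(X)

clf = SGDClassifier(loss="hinge", alpha=1e-4, random_state=0)
classes = np.unique(y)
for start in range(0, len(Z), 1000):                     # streaming-style updates
    clf.partial_fit(Z[start:start + 1000], y[start:start + 1000], classes=classes)
print(clf.score(Z, y))
```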
11
Pascual-Triana JD, Charte D, Andrés Arroyo M, Fernández A, Herrera F. Revisiting data complexity metrics based on morphology for overlap and imbalance: snapshot, new overlap number of balls metrics and singular problems prospect. Knowl Inf Syst 2021. [DOI: 10.1007/s10115-021-01577-1] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/21/2022]
12
Wang L, Zhu D. Tackling Ordinal Regression Problem for Heterogeneous Data: Sparse and Deep Multi-Task Learning Approaches. Data Min Knowl Discov 2021; 35:1134-1161. [PMID: 34054330 PMCID: PMC8153254 DOI: 10.1007/s10618-021-00746-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/09/2019] [Accepted: 03/04/2021] [Indexed: 11/27/2022]
Abstract
Many real-world datasets are labeled with natural orders, i.e., ordinal labels. Ordinal regression is a method to predict ordinal labels that finds a wide range of applications in data-rich domains, such as natural, health and social sciences. Most existing ordinal regression approaches work well for independent and identically distributed (IID) instances via formulating a single ordinal regression task. However, for heterogeneous non-IID instances with well-defined local geometric structures, e.g., subpopulation groups, multi-task learning (MTL) provides a promising framework to encode task (subgroup) relatedness, bridge data from all tasks, and simultaneously learn multiple related tasks in efforts to improve generalization performance. Even though MTL methods have been extensively studied, there is barely existing work investigating MTL for heterogeneous data with ordinal labels. We tackle this important problem via sparse and deep multi-task approaches. Specifically, we develop a regularized multi-task ordinal regression (MTOR) model for smaller datasets and a deep neural networks based MTOR model for large-scale datasets. We evaluate the performance using three real-world healthcare datasets with applications to multi-stage disease progression diagnosis. Our experiments indicate that the proposed MTOR models markedly improve the prediction performance comparing with single-task ordinal regression models.
Affiliation(s)
- Lu Wang
- Dept. of Computer Science, Wayne State University, Detroit, MI 48202
- Dongxiao Zhu
- Dept. of Computer Science, Wayne State University, Detroit, MI 48202
13
Zhu F, Ning Y, Chen X, Zhao Y, Gang Y. On removing potential redundant constraints for SVOR learning. Appl Soft Comput 2021. [DOI: 10.1016/j.asoc.2020.106941] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/22/2022]
14
Ray A, Chaudhuri AK. Smart healthcare disease diagnosis and patient management: Innovation, improvement and skill development. MACHINE LEARNING WITH APPLICATIONS 2021. [DOI: 10.1016/j.mlwa.2020.100011] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/13/2023] Open
15
Wu W, Wu QMJ, Sun W, Yang Y, Yuan X, Zheng WL, Lu BL. A Regression Method With Subnetwork Neurons for Vigilance Estimation Using EOG and EEG. IEEE Trans Cogn Dev Syst 2021. [DOI: 10.1109/tcds.2018.2889223] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/11/2022]
16
17
Bi J, Li S, Yuan H, Zhou M. Integrated deep learning method for workload and resource prediction in cloud systems. Neurocomputing 2021. [DOI: 10.1016/j.neucom.2020.11.011] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/26/2022]
18
Zhang S, Duan Z, Yang W, Qian C, You Y. iDHS-DASTS: identifying DNase I hypersensitive sites based on LASSO and stacking learning. Mol Omics 2021; 17:130-141. [PMID: 33295914 DOI: 10.1039/d0mo00115e] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/21/2022]
Abstract
The DNase I hypersensitive site is an important marker of the DNA regulatory region, and its identification in the DNA sequence is of great significance for biomedical research. However, traditional identification methods are extremely time-consuming and cannot obtain an accurate result. In this paper, we propose a predictor called iDHS-DASTS to identify the DHS based on benchmark datasets. First, we adopt a feature extraction method called PseDNC which can incorporate the original DNA properties and spatial information of the DNA sequence. Then we use a method called LASSO to reduce the dimensions of the original data. Finally, we utilize stacking learning as a classifier, which includes Adaboost, random forest, gradient boosting, extra trees and SVM. Before we train the classifier, we use SMOTE-Tomek to overcome the imbalance of the datasets. In the experiments, iDHS-DASTS achieves remarkable performance on three benchmark datasets, with state-of-the-art results of over 92.06%, 91.06% and 90.72% accuracy for datasets 𝕊1, 𝕊2 and 𝕊3, respectively. To verify the validity and transferability of our model, we establish another independent dataset 𝕊4, for which the accuracy reaches 90.31%. Furthermore, we used the proposed model to construct a user-friendly web server called iDHS-DASTS, which is available at http://www.xdu-duan.cn/.
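A scikit-learn sketch of the same pipeline shape, LASSO-based feature selection followed by a stacking ensemble, is given below on synthetic stand-in features; the PseDNC encoding and the SMOTE-Tomek rebalancing step (which needs the imbalanced-learn package) are omitted.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (AdaBoostClassifier, ExtraTreesClassifier,
                              GradientBoostingClassifier, RandomForestClassifier,
                              StackingClassifier)
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import Lasso, LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Synthetic stand-in for PseDNC features of DNA sequences (not the paper's data).
X, y = make_classification(n_samples=300, n_features=60, n_informative=15, random_state=0)

model = make_pipeline(
    SelectFromModel(Lasso(alpha=0.01)),                  # LASSO-based dimension reduction
    StackingClassifier(                                  # stacking ensemble of base learners
        estimators=[("ada", AdaBoostClassifier()),
                    ("rf", RandomForestClassifier()),
                    ("gb", GradientBoostingClassifier()),
                    ("et", ExtraTreesClassifier()),
                    ("svm", SVC(probability=True))],
        final_estimator=LogisticRegression()),
)
print(model.fit(X[:250], y[:250]).score(X[250:], y[250:]))
```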
Affiliation(s)
- Shengli Zhang
- School of Mathematics and Statistics, Xidian University, Xi'an 710071, P. R. China
- Zhengpeng Duan
- School of Electronic Engineering, Xidian University, Xi'an 710071, P. R. China
- Wenhao Yang
- School of Electronic Engineering, Xidian University, Xi'an 710071, P. R. China
- Chenlai Qian
- School of Electronic Engineering, Xidian University, Xi'an 710071, P. R. China
- Yiwei You
- International Business School, Shanghai University of International Business and Economics, Shanghai, 201620, P. R. China
19
Tian Q, Cao M, Chen S, Yin H. Structure-Exploiting Discriminative Ordinal Multioutput Regression. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2021; 32:266-280. [PMID: 32203034 DOI: 10.1109/tnnls.2020.2978508] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/10/2023]
Abstract
Although the least-squares regression (LSR) has achieved great success in regression tasks, its discriminating ability is limited since the margins between classes are not specially preserved. To mitigate this issue, dragging techniques have been introduced to remodel the regression targets of LSR. Such variants have gained certain performance improvement, but their generalization ability is still unsatisfactory when handling real data. This is because structure-related information, which is typically contained in the data, is not exploited. To overcome this shortcoming, in this article, we construct a multioutput regression model by exploiting the intraclass correlations and input-output relationships via a structure matrix. We also discriminatively enlarge the regression margins by embedding a metric that is guided automatically by the training data. To better handle such structured data with ordinal labels, we encode the model output as cumulative attributes and, hence, obtain our proposed model, termed structure-exploiting discriminative ordinal multioutput regression (SEDOMOR). In addition, to further enhance its distinguishing ability, we extend the SEDOMOR to its nonlinear counterparts with kernel functions and deep architectures. We also derive the corresponding optimization algorithms for solving these models and prove their convergence. Finally, extensive experiments have testified the effectiveness and superiority of the proposed methods.
20
Xu F, Pun CM, Li H, Zhang Y, Song Y, Gao H. Training Feed-Forward Artificial Neural Networks with a modified artificial bee colony algorithm. Neurocomputing 2020. [DOI: 10.1016/j.neucom.2019.04.086] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/25/2022]
21
Alkhazi B, DiStasi A, Aljedaani W, Alrubaye H, Ye X, Mkaouer MW. Learning to rank developers for bug report assignment. Appl Soft Comput 2020. [DOI: 10.1016/j.asoc.2020.106667] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/23/2022]
22
Manwani N, Chandra M. Exact Passive-Aggressive Algorithms for Ordinal Regression Using Interval Labels. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2020; 31:3259-3268. [PMID: 31613784 DOI: 10.1109/tnnls.2019.2939861] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/10/2023]
Abstract
In this article, we propose exact passive-aggressive (PA) online algorithms for ordinal regression. The proposed algorithms can be used even when we have interval labels, instead of actual labels, for the examples. The proposed algorithms solve a convex optimization problem at every trial. We find an exact solution to those optimization problems to determine the updated parameters. We propose a support class algorithm (SCA) that finds the active constraints using the Karush-Kuhn-Tucker (KKT) conditions of the optimization problems. These active constraints form a support set, which determines the set of thresholds that need to be updated. We derive update rules for PA, PA-I, and PA-II. We show that the proposed algorithms maintain the ordering of the thresholds after every trial. We provide the mistake bounds of the proposed algorithms in both ideal and general settings. We also show experimentally that the proposed algorithms successfully learn accurate classifiers using interval labels as well as exact labels. The proposed algorithms also do well compared to other approaches.
23
24
Wang Y, Pan Z, Pan Y. A Training Data Set Cleaning Method by Classification Ability Ranking for the k -Nearest Neighbor Classifier. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2020; 31:1544-1556. [PMID: 31265416 DOI: 10.1109/tnnls.2019.2920864] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/09/2023]
Abstract
The k -nearest neighbor (KNN) rule is a successful technique in pattern classification due to its simplicity and effectiveness. As a supervised classifier, KNN classification performance usually suffers from low-quality samples in the training data set. Thus, training data set cleaning (TDC) methods are needed for enhancing the classification accuracy by cleaning out noisy, or even wrong, samples in the original training data set. In this paper, we propose a classification ability ranking (CAR)-based TDC method to improve the performance of a KNN classifier, namely CAR-based TDC method. The proposed classification ability function ranks a training sample in terms of its contribution to correctly classify other training samples as a KNN through the leave-one-out (LV1) strategy in the cleaning stage. The training sample that likely misclassifies the other samples during the KNN classifications according to the LV1 strategy is considered to have lower classification ability and will be cleaned out from the original training data set. Extensive experiments, based on ten real-world data sets, show that the proposed CAR-based TDC method can significantly reduce the classification error rates of KNN-based classifiers, while reducing computational complexity thanks to a smaller cleaned training data set.
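A simplified proxy for the leave-one-out classification-ability ranking described here can be sketched as follows; the "keep a sample if it helps more than it hurts" rule is an assumption, not the paper's exact criterion.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.neighbors import NearestNeighbors

X, y = load_iris(return_X_y=True)
k = 5

# Leave-one-out neighbour lists: the first neighbour of each sample is itself.
_, idx = NearestNeighbors(n_neighbors=k + 1).fit(X).kneighbors(X)

# Score each training sample by how often it appears as a neighbour whose label
# matches (helps) versus contradicts (hurts) the query sample's label.
helps = np.zeros(len(X))
hurts = np.zeros(len(X))
for i in range(len(X)):
    for j in idx[i, 1:]:
        if y[j] == y[i]:
            helps[j] += 1
        else:
            hurts[j] += 1
ability = helps - hurts

# Cleaning rule: drop samples that hurt more than they help.
keep = ability > 0
print(f"kept {keep.sum()} of {len(X)} training samples")
```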
25
A new global best guided artificial bee colony algorithm with application in robot path planning. Appl Soft Comput 2020. [DOI: 10.1016/j.asoc.2019.106037] [Citation(s) in RCA: 29] [Impact Index Per Article: 5.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/20/2022]
26
Zhang Y, Cheung YM, Tan KC. A Unified Entropy-Based Distance Metric for Ordinal-and-Nominal-Attribute Data Clustering. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2020; 31:39-52. [PMID: 30908240 DOI: 10.1109/tnnls.2019.2899381] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/09/2023]
Abstract
Ordinal data are common in many data mining and machine learning tasks. Compared to nominal data, the possible values (also called categories interchangeably) of an ordinal attribute are naturally ordered. Nevertheless, since the data values are not quantitative, the distance between two categories of an ordinal attribute is generally not well defined, which surely has a serious impact on the result of the quantitative analysis if an inappropriate distance metric is utilized. From the practical perspective, ordinal-and-nominal-attribute categorical data, i.e., categorical data associated with a mixture of nominal and ordinal attributes, is common, but the distance metric for such data has yet to be well explored in the literature. In this paper, within the framework of clustering analysis, we therefore first propose an entropy-based distance metric for ordinal attributes, which exploits the underlying order information among categories of an ordinal attribute for the distance measurement. Then, we generalize this distance metric and propose a unified one accordingly, which is applicable to ordinal-and-nominal-attribute categorical data. Compared with the existing metrics proposed for categorical data, the proposed metric is simple to use and nonparametric. More importantly, it reasonably exploits the underlying order information of ordinal attributes and statistical information of nominal attributes for distance measurement. Extensive experiments show that the proposed metric outperforms the existing counterparts on both the real and benchmark data sets.
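In the spirit of the entropy-based ordinal distance described here (not its exact definition), the sketch below scores the distance between two ordinal categories by the entropy contributed by the categories lying between them:

```python
import numpy as np
from collections import Counter

def ordinal_entropy_distance(values, a, b, categories):
    """Distance between ordinal categories a and b, scored by the entropy
    contributed by the categories lying between them (one endpoint included)."""
    counts = Counter(values)
    n = len(values)
    lo, hi = sorted((categories.index(a), categories.index(b)))
    dist = 0.0
    for c in categories[lo:hi]:
        p = counts.get(c, 0) / n
        if p > 0:
            dist -= p * np.log2(p)
    return dist

ratings = ["low"] * 50 + ["medium"] * 10 + ["high"] * 40
cats = ["low", "medium", "high"]
print(ordinal_entropy_distance(ratings, "low", "high", cats))    # larger gap
print(ordinal_entropy_distance(ratings, "low", "medium", cats))  # smaller gap
```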
27
Ghani NA, Hamid S, Targio Hashem IA, Ahmed E. Social media big data analytics: A survey. COMPUTERS IN HUMAN BEHAVIOR 2019. [DOI: 10.1016/j.chb.2018.08.039] [Citation(s) in RCA: 155] [Impact Index Per Article: 25.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/18/2023]
28
Yu D, Xu Z, Wang X. Bibliometric analysis of support vector machines research trend: a case study in China. INT J MACH LEARN CYB 2019. [DOI: 10.1007/s13042-019-01028-y] [Citation(s) in RCA: 33] [Impact Index Per Article: 5.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/25/2022]
29
Rodríguez Aldana Y, Marañón Reyes EJ, Macias FS, Rodríguez VR, Chacón LM, Van Huffel S, Hunyadi B. Nonconvulsive epileptic seizure monitoring with incremental learning. Comput Biol Med 2019; 114:103434. [PMID: 31561098 DOI: 10.1016/j.compbiomed.2019.103434] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/06/2019] [Revised: 09/02/2019] [Accepted: 09/03/2019] [Indexed: 11/29/2022]
Abstract
Nonconvulsive epileptic seizures (NCSz) and nonconvulsive status epilepticus (NCSE) are two neurological entities associated with increased morbidity and mortality in critically ill patients. In a previous work, we introduced a method which accurately detected NCSz in EEG data (referred to here as the 'Batch method'). However, this approach was less effective when the EEG features identified at the beginning of the recording changed over time. Such pattern drift is an issue that causes failures of automated seizure detection methods. This paper presents a support vector machine (SVM)-based incremental learning method for NCSz detection that, for the first time, addresses the seizure evolution in EEG records from patients with epileptic disorders and from ICU patients with NCSz. To implement the incremental learning SVM, three methodologies are tested. These approaches differ in the way they reduce the set of potentially available support vectors that are used to build the decision function of the classifier. To evaluate the suitability of the three incremental learning approaches for NCSz detection, a comparative study between the three methods is first performed. Then, the incremental learning approach with the best performance is compared with the Batch method and three other batch methods from the literature. From this comparison, the incremental learning method based on maximum relevance minimum redundancy (MRMR_IL) obtained the best results. The MRMR_IL method proved to be an effective tool for NCSz detection in a real-time setting, achieving sensitivity and accuracy values above 99%.
Affiliation(s)
- Yissel Rodríguez Aldana
- Universidad de Oriente, Center of Neuroscience and Signals and Image Processing, Santiago de Cuba, Cuba; KU Leuven, Department of Electrical Engineering (ESAT), Stadius Center for Dynamical Systems, Signal Processing and Data Analytics, Leuven, Belgium
- Enrique J Marañón Reyes
- Universidad de Oriente, Center of Neuroscience and Signals and Image Processing, Santiago de Cuba, Cuba
- Valia Rodríguez Rodríguez
- Aston University, Birmingham, United Kingdom; Cuban Neuroscience Center, Havana, Cuba; Clinical-Surgical Hospital "Hermanos Almeijeiras", Havana, Cuba
- Sabine Van Huffel
- KU Leuven, Department of Electrical Engineering (ESAT), Stadius Center for Dynamical Systems, Signal Processing and Data Analytics, Leuven, Belgium
- Borbála Hunyadi
- KU Leuven, Department of Electrical Engineering (ESAT), Stadius Center for Dynamical Systems, Signal Processing and Data Analytics, Leuven, Belgium; Department of Microelectronics, Delft University of Technology, Delft, Netherlands
30
Li X, Rao Y, Xie H, Liu X, Wong TL, Wang FL. Social emotion classification based on noise-aware training. DATA KNOWL ENG 2019. [DOI: 10.1016/j.datak.2017.07.008] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
31
An Efficient Conjugate Gradient Method for Convex Constrained Monotone Nonlinear Equations with Applications. MATHEMATICS 2019. [DOI: 10.3390/math7090767] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
Abstract
This research paper proposes a derivative-free method for solving systems of nonlinear equations with closed and convex constraints, where the functions under consideration are continuous and monotone. Given an initial iterate, the process first generates a specific direction and then employs a line search strategy along the direction to calculate a new iterate. If the new iterate solves the problem, the process will stop. Otherwise, the projection of the new iterate onto the closed convex set (constraint set) determines the next iterate. In addition, the direction satisfies the sufficient descent condition and the global convergence of the method is established under suitable assumptions. Finally, some numerical experiments were presented to show the performance of the proposed method in solving nonlinear equations and its application in image recovery problems.
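A minimal sketch of a derivative-free projection method of this family is shown below, using a steepest-descent-like direction and a box constraint; the paper's conjugate-gradient-type direction and its specific line search are not reproduced.

```python
import numpy as np

def F(x):
    return x + 0.1 * np.sin(x)          # a simple continuous, monotone mapping

def project(x, lo=-2.0, hi=2.0):
    return np.clip(x, lo, hi)           # projection onto a box (closed convex set)

def projection_method(x, tol=1e-8, sigma=1e-4, beta=0.5, max_iter=500):
    for _ in range(max_iter):
        Fx = F(x)
        if np.linalg.norm(Fx) < tol:
            break
        d = -Fx                         # simplified direction; the paper builds a
                                        # conjugate-gradient-type direction instead
        alpha = 1.0
        while True:                     # backtracking line search
            z = x + alpha * d
            if -F(z) @ d >= sigma * alpha * np.linalg.norm(d) ** 2 or alpha < 1e-12:
                break
            alpha *= beta
        Fz = F(z)
        # hyperplane projection step, then projection back onto the constraint set
        x = project(x - (Fz @ (x - z)) / (Fz @ Fz) * Fz)
    return x

print(projection_method(np.array([1.5, -1.0, 0.7])))   # converges towards the zero of F
```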
32
Extended hesitant fuzzy linguistic term set with fuzzy confidence for solving group decision-making problems. Neural Comput Appl 2019. [DOI: 10.1007/s00521-019-04275-w] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/26/2022]
33
34
Kamarajugadda KK, Polipalli TR. Extract Features from Periocular Region to Identify the Age Using Machine Learning Algorithms. J Med Syst 2019; 43:196. [PMID: 31119384 DOI: 10.1007/s10916-019-1335-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/09/2019] [Accepted: 05/10/2019] [Indexed: 10/26/2022]
Abstract
Recent studies on large datasets of aging features have shown that the performance of facial-image-based age estimation is low and needs to be improved. Human age is one of the significant biometric traits for human recognition or search. Age assessment is more challenging than many other pattern recognition problems since aging differs from person to person. This paper proposes a new framework that uses the periocular region for age feature extraction and applies a hybrid algorithm for age recognition. Firstly, preprocessing and periocular region normalization are performed to acquire age-invariant features. Secondly, the preprocessed periocular region is analyzed using a hybrid approach, a novel machine learning algorithm that combines SVM and kNN. The proposed technique generates the best recognition outputs.
35
Lan L, Wang Z, Zhe S, Cheng W, Wang J, Zhang K. Scaling Up Kernel SVM on Limited Resources: A Low-Rank Linearization Approach. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2019; 30:369-378. [PMID: 29994133 DOI: 10.1109/tnnls.2018.2838140] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/14/2023]
Abstract
Kernel support vector machines (SVMs) deliver state-of-the-art results in many real-world nonlinear classification problems, but the computational cost can be quite demanding in order to maintain a large number of support vectors. Linear SVM, on the other hand, is highly scalable to large data but only suited for linearly separable problems. In this paper, we propose a novel approach called low-rank linearized SVM to scale up kernel SVM on limited resources. Our approach transforms a nonlinear SVM to a linear one via an approximate empirical kernel map computed from efficient kernel low-rank decompositions. We theoretically analyze the gap between the solutions of the approximate and optimal rank- k kernel map, which in turn provides guidance on the sampling scheme of the Nyström approximation. Furthermore, we extend it to a semisupervised metric learning scenario in which partially labeled samples can be exploited to further improve the quality of the low-rank embedding. Our approach inherits rich representability of kernel SVM and high efficiency of linear SVM. Experimental results demonstrate that our approach is more robust and achieves a better tradeoff between model representability and scalability against state-of-the-art algorithms for large-scale SVMs.
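The "linearize the kernel SVM via a low-rank kernel map" idea has a simple scikit-learn analogue, a Nystroem feature map followed by a linear SVM, sketched below; the paper's specific decomposition, sampling analysis, and semisupervised metric-learning extension are not covered.

```python
from sklearn.datasets import make_classification
from sklearn.kernel_approximation import Nystroem
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Toy stand-in for a large nonlinear classification problem.
X, y = make_classification(n_samples=5000, n_features=40, n_informative=10, random_state=0)

# Low-rank kernel map followed by a linear SVM: kernel expressiveness at linear cost.
clf = make_pipeline(
    Nystroem(kernel="rbf", gamma=0.05, n_components=200, random_state=0),
    LinearSVC(C=1.0, max_iter=10000),
)
clf.fit(X[:4000], y[:4000])
print(clf.score(X[4000:], y[4000:]))
```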
36
Identifying intention posts in discussion forums using multi-instance learning and multiple sources transfer learning. Soft comput 2018. [DOI: 10.1007/s00500-017-2755-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/26/2022]
37
LSRR-LA: An Anisotropy-Tolerant Localization Algorithm Based on Least Square Regularized Regression for Multi-Hop Wireless Sensor Networks. SENSORS 2018; 18:s18113974. [PMID: 30445789 PMCID: PMC6263435 DOI: 10.3390/s18113974] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 08/30/2018] [Revised: 11/10/2018] [Accepted: 11/12/2018] [Indexed: 11/16/2022]
Abstract
As is well known, multi-hop range-free localization algorithms demonstrate pretty good performance in isotropic networks in which sensor nodes distribute evenly and densely. However, these algorithms are easily affected by network topology, causing a significant decrease in positioning accuracy. To improve the localization performance in anisotropic networks, this paper presents a multi-hop range-free localization algorithm based on Least Square Regularized Regression (LSRR). By building a mapping relationship between hop counts and real distances, we can regard the process of localization as a regularized regression. Firstly, the proximity information of the given network is measured. Then, a mapping model between the geographical distances and the hop distances is constructed by LSRR. Finally, each sensor node finds its own position via this mapping. The Average Localization Error (ALE) metric is used to evaluate the proposed method in our experiments, and results show that, compared with similar methods, our approach can effectively decrease the effect of anisotropy, thus considerably improving the positioning accuracy.
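A toy sketch of the core regression step, a regularized least-squares map from hop counts to geographic distances, is given below; the anchor layout, hop model, and training split are assumptions rather than the paper's experimental setup.

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(1)
anchors = rng.uniform(0, 100, size=(8, 2))          # anchor nodes with known positions
nodes = rng.uniform(0, 100, size=(60, 2))           # unknown nodes to localize

true_dist = np.linalg.norm(nodes[:, None, :] - anchors[None, :, :], axis=2)
hops = np.ceil(true_dist / 12.0) + rng.integers(0, 2, true_dist.shape)   # noisy hop counts

# Regularized least-squares mapping from hop counts to geographic distances,
# fitted here on a labelled subset of nodes for simplicity.
model = Ridge(alpha=1.0).fit(hops[:30], true_dist[:30])
est_dist = model.predict(hops[30:])
print(np.mean(np.abs(est_dist - true_dist[30:])))   # average distance estimation error
```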
38
Ren F, Cao P, Zhao D, Wan C. Diabetic macular edema grading in retinal images using vector quantization and semi-supervised learning. Technol Health Care 2018; 26:389-397. [PMID: 29689762 PMCID: PMC6004946 DOI: 10.3233/thc-174704] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/15/2022]
Abstract
BACKGROUND Diabetic macular edema (DME) is one of the severe complications of diabetic retinopathy, causing serious vision loss and leading to blindness in severe cases if left untreated. OBJECTIVE To grade the severity of DME in retinal images. METHODS Firstly, the macula is localized using its anatomical features and the information of the macula location with respect to the optic disc. Secondly, a novel method for exudate detection is proposed. The possible exudate regions are segmented using a vector quantization technique and formulated using a set of feature vectors. Semi-supervised learning with a graph-based classifier is employed to identify the true exudates. Thirdly, the disease severity is graded into different stages based on the location of exudates and the macula coordinates. RESULTS The results are obtained with mean values of 0.975 and 0.942 for accuracy and F1-score, respectively. CONCLUSION The present work contributes to macula localization, exudate candidate identification with vector quantization and exudate candidate classification with semi-supervised learning. The proposed method is compared with state-of-the-art approaches in terms of performance, and experimental results show that the proposed system overcomes the challenges of DME grading and demonstrates promising effectiveness.
Affiliation(s)
- Fulong Ren
- School of Computer Science and Engineering, Northeastern University, Shenyang, Liaoning, China; Key Laboratory of Medical Image Computing of Ministry of Education, Northeastern University, Shenyang, Liaoning, China
- Peng Cao
- School of Computer Science and Engineering, Northeastern University, Shenyang, Liaoning, China; Key Laboratory of Medical Image Computing of Ministry of Education, Northeastern University, Shenyang, Liaoning, China
- Dazhe Zhao
- School of Computer Science and Engineering, Northeastern University, Shenyang, Liaoning, China; Key Laboratory of Medical Image Computing of Ministry of Education, Northeastern University, Shenyang, Liaoning, China
- Chao Wan
- Department of Ophthalmology, The First Hospital of China Medical University, Shenyang, Liaoning, China
39
Learning from crowds with active learning and self-healing. Neural Comput Appl 2018. [DOI: 10.1007/s00521-017-2878-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/20/2022]
40
Efficient multi-kernel multi-instance learning using weakly supervised and imbalanced data for diabetic retinopathy diagnosis. Comput Med Imaging Graph 2018; 69:112-124. [DOI: 10.1016/j.compmedimag.2018.08.008] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/01/2018] [Revised: 07/09/2018] [Accepted: 08/22/2018] [Indexed: 11/19/2022]
41
42
Wang J, Li T, Shih FY. Discrimination of Computer Generated and Photographic Images Based on CQWT Quaternion Markov Features. INT J PATTERN RECOGN 2018. [DOI: 10.1142/s0218001419540077] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022]
Abstract
In this paper, an effective method based on the color quaternion wavelet transform (CQWT) for image forensics is proposed. Compared to discrete wavelet transform (DWT), the CQWT provides more information, such as the quaternion’s magnitude and phase measures, to discriminate between computer generated (CG) and photographic (PG) images. Meanwhile, we extend the classic Markov features into the quaternion domain to develop the quaternion Markov statistical features for color images. Experimental results show that the proposed scheme can achieve the classification rate of 92.70%, which is 6.89% higher than the classic Markov features.
Affiliation(s)
- Jinwei Wang
- Jiangsu Engineering Center of Network Monitoring, Nanjing University of Information Science and Technology, Jiangsu, P. R. China
- Ting Li
- Jiangsu Engineering Center of Network Monitoring, Nanjing University of Information Science and Technology, Jiangsu, P. R. China
- Frank Y. Shih
- Department of Computer Science, New Jersey Institute of Technology, Newark, NJ, USA
- Department of Computer Science and Information Engineering, Asia University, Taichung, Taiwan
43
Mu C, Wang D, He H. Data-Driven Finite-Horizon Approximate Optimal Control for Discrete-Time Nonlinear Systems Using Iterative HDP Approach. IEEE TRANSACTIONS ON CYBERNETICS 2018; 48:2948-2961. [PMID: 29028219 DOI: 10.1109/tcyb.2017.2752845] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/07/2023]
Abstract
This paper presents a data-based finite-horizon optimal control approach for discrete-time nonlinear affine systems. Iterative adaptive dynamic programming (ADP) is used to approximately solve the Hamilton-Jacobi-Bellman equation by minimizing the cost function in finite time. The idea is implemented with a heuristic dynamic programming (HDP) structure involving a model network, so that the iterative control at the first step can be obtained without the system function; meanwhile, the action network is used to obtain the approximate optimal control law and the critic network is utilized to approximate the optimal cost function. The convergence of the iterative ADP algorithm and the stability of the weight estimation errors based on the HDP structure are intensively analyzed. Finally, two simulation examples are provided to demonstrate the theoretical results and show the performance of the proposed method.
44
Shen X, Shen F, Liu L, Yuan YH, Liu W, Sun QS. Multiview Discrete Hashing for Scalable Multimedia Search. ACM T INTEL SYST TEC 2018. [DOI: 10.1145/3178119] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/14/2022]
Abstract
Hashing techniques have recently gained increasing research interest in multimedia studies. Most existing hashing methods only employ single features for hash code learning. Multiview data with each view corresponding to a type of feature generally provides more comprehensive information. How to efficiently integrate multiple views for learning compact hash codes still remains challenging. In this article, we propose a novel unsupervised hashing method, dubbed multiview discrete hashing (MvDH), by effectively exploring multiview data. Specifically, MvDH performs matrix factorization to generate the hash codes as the latent representations shared by multiple views, during which spectral clustering is performed simultaneously. The joint learning of hash codes and cluster labels enables that MvDH can generate more discriminative hash codes, which are optimal for classification. An efficient alternating algorithm is developed to solve the proposed optimization problem with guaranteed convergence and low computational complexity. The binary codes are optimized via the discrete cyclic coordinate descent (DCC) method to reduce the quantization errors. Extensive experimental results on three large-scale benchmark datasets demonstrate the superiority of the proposed method over several state-of-the-art methods in terms of both accuracy and scalability.
Affiliation(s)
- Fumin Shen
- University of Electronic Science and Technology of China, Chengdu, China
- Li Liu
- Northumbria University, UK
- Weiwei Liu
- The University of New South Wales, Sydney, NSW, Australia
- Quan-Sen Sun
- Nanjing University of Science and Technology, Nanjing, China
45
46
Wu H, Cao C, Xia X, Lu Q. Unified Deep Learning Architecture for Modeling Biology Sequence. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2018; 15:1445-1452. [PMID: 28991751 DOI: 10.1109/tcbb.2017.2760832] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/07/2023]
Abstract
Prediction of the spatial structure or function of biological macromolecules based on their sequences remains an important challenge in bioinformatics. When modeling biological sequences with traditional sequence models, long-range interactions, the complicated and variable output of labeled structures, and the variable length of biological sequences usually lead to different solutions on a case-by-case basis. This study proposed a unified deep learning architecture based on long short-term memory or a gated recurrent unit to capture long-range interactions. The architecture designs an optional reshape operator to adapt to the diversity of the output labels and implements a training algorithm to support the training of sequence models capable of processing variable-length sequences. The merging and pooling operators enhance the ability to capture short-range interactions between basic units of biological sequences. The proposed deep learning architecture and its training algorithm might be capable of solving current variable biological sequence-modeling problems under a unified framework. We validated the model on one of the most difficult biological sequence-modeling problems, protein residue interaction prediction. The results indicate that the accuracy of the model in obtaining residue interactions exceeded popular approaches by 10 percent on multiple widely used benchmarks.
47
Zeng J, Liu Y, Leng B, Xiong Z, Cheung YM. Dimensionality Reduction in Multiple Ordinal Regression. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2018; 29:4088-4101. [PMID: 29028214 DOI: 10.1109/tnnls.2017.2752003] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/07/2023]
Abstract
Supervised dimensionality reduction (DR) plays an important role in learning systems with high-dimensional data. It projects the data into a low-dimensional subspace and keeps the projected data distinguishable in different classes. In addition to preserving the discriminant information for binary or multiple classes, some real-world applications also require keeping the preference degrees of assigning the data to multiple aspects, e.g., to keep the different intensities for co-occurring facial expressions or the product ratings in different aspects. To address this issue, we propose a novel supervised DR method for DR in multiple ordinal regression (DRMOR), whose projected subspace preserves all the ordinal information in multiple aspects or labels. We formulate this problem as a joint optimization framework to simultaneously perform DR and ordinal regression. In contrast to most existing DR methods, which are conducted independently of the subsequent classification or ordinal regression, the proposed framework fully benefits from both of the procedures. We experimentally demonstrate that the proposed DRMOR method (DRMOR-M) well preserves the ordinal information from all the aspects or labels in the learned subspace. Moreover, DRMOR-M exhibits advantages compared with representative DR or ordinal regression algorithms on three standard data sets.
48
Xiao Y, Liu B, Hao Z. Multiple-Instance Ordinal Regression. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2018; 29:4398-4413. [PMID: 29990132 DOI: 10.1109/tnnls.2017.2766164] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
Ordinal regression (OR) is a paradigm in supervised learning, which aims at learning a prediction model for ordered classes. The existing studies mainly focus on single-instance OR, and the multi-instance OR problem has not been explicitly addressed. In many real-world applications, considering the OR problem from a multiple-instance perspective can yield better classification performance than from a single-instance perspective. For example, in image retrieval, an image may contain multiple and possibly heterogeneous objects, and the user is usually interested in only a small part of them. If we represent the whole image as a global feature vector, the useful information from the targeted objects that interest the user may be overridden by the noisy information from irrelevant objects. However, this problem fits the multiple-instance setting well. Each image is considered as a bag, and each object region is treated as an instance. The image is considered to be of user interest if it contains at least one targeted object region. In this paper, we address multi-instance OR, where the OR classifier is learned on multiple-instance data instead of single-instance data. To solve this problem, we present a novel multiple-instance ordinal regression (MIOR) method. In MIOR, a set of parallel hyperplanes is used to separate the classes, and the label ordering information is incorporated into learning the classifier by imputing the parallel hyperplanes with an order. Moreover, considering that a bag may contain instances not belonging to its class, for each bag, the instance which is nearest to the middle of the corresponding class is selected to learn the classifier. Compared with the existing single-instance OR work, MIOR is able to learn a more accurate OR classifier on multiple-instance data where only the bag label is available and the instance label is unknown. Extensive experiments show that MIOR outperforms the existing single-instance OR methods.
49
Shen X, Liu W, Tsang IW, Sun QS, Ong YS. Multilabel Prediction via Cross-View Search. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2018; 29:4324-4338. [PMID: 29990175 DOI: 10.1109/tnnls.2017.2763967] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
Embedding methods have shown promising performance in multilabel prediction, as they are able to discover the label dependence. However, most methods ignore the correlations between the input and output, such that their learned embeddings are not well aligned, which leads to degradation in prediction performance. This paper presents a formulation for multilabel learning, from the perspective of cross-view learning, that explores the correlations between the input and the output. The proposed method, called Co-Embedding (CoE), jointly learns a semantic common subspace and view-specific mappings within one framework. The semantic similarity structure among the embeddings is further preserved, ensuring that close embeddings share similar labels. Additionally, CoE conducts multilabel prediction through a cross-view k-nearest neighbor (kNN) search among the learned embeddings, which significantly reduces computational costs compared with conventional decoding schemes. A hashing-based model, i.e., Co-Hashing (CoH), is further proposed. CoH is based on CoE and imposes a binary constraint on continuous latent embeddings. CoH aims to generate compact binary representations to improve prediction efficiency by benefiting from the efficient kNN search of multiple labels in the Hamming space. Extensive experiments on various real-world data sets demonstrate the superiority of the proposed methods over the state of the art in terms of both prediction accuracy and efficiency.
50
Li C, de Rijke M. Incremental sparse Bayesian ordinal regression. Neural Netw 2018; 106:294-302. [PMID: 30121479 DOI: 10.1016/j.neunet.2018.07.015] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/08/2018] [Revised: 06/09/2018] [Accepted: 07/25/2018] [Indexed: 10/28/2022]
Abstract
Ordinal regression (OR) aims to model the ordering information between different data categories, which is a crucial topic in multi-label learning. An important class of approaches to OR models the problem as a linear combination of basis functions that map features to a high-dimensional non-linear space. However, most of the basis function-based algorithms are time consuming. We propose an incremental sparse Bayesian approach to OR tasks and introduce an algorithm to sequentially learn the relevant basis functions in the ordinal scenario. Our method, called Incremental Sparse Bayesian Ordinal Regression (ISBOR), automatically optimizes the hyper-parameters via the type-II maximum likelihood method. By exploiting fast marginal likelihood optimization, ISBOR can avoid big matrix inverses, which is the main bottleneck in applying basis function-based algorithms to OR tasks on large-scale datasets. We show that ISBOR can make accurate predictions with parsimonious basis functions while offering automatic estimates of the prediction uncertainty. Extensive experiments on synthetic and real-world datasets demonstrate the efficiency and effectiveness of ISBOR compared to other basis function-based OR approaches.
Affiliation(s)
- Chang Li
- University of Amsterdam, Science Park 904, 1098 XH Amsterdam, The Netherlands
- Maarten de Rijke
- University of Amsterdam, Science Park 904, 1098 XH Amsterdam, The Netherlands